DISCRETE VARIATIONAL OPTIMAL CONTROL

FERNANDO JIMÉNEZ, MARIN KOBILAROV, AND DAVID MARTÍN DE DIEGO

Abstract. This paper develops numerical methods for optimal control of mechanical systems in the Lagrangian setting. It extends the theory of discrete mechanics to enable the solution of optimal control problems through the discretization of variational principles. The key point is to solve the optimal control problem as a variational integrator of a specially constructed higher-dimensional system. The developed framework applies to systems on tangent bundles, Lie groups, underactuated and nonholonomic systems with symmetries, and can approximate either smooth or discontinuous control inputs. The resulting methods inherit the preservation properties of variational integrators and result in numerically robust and easily implementable algorithms. Several theoretical examples and a practical one, the control of an underwater vehicle, illustrate the application of the proposed approach.

1. Introduction

The goal of this paper is to develop, from a geometric point of view, numerical methods for optimal control of Lagrangian mechanical systems. Our approach employs the theory of discrete mechanics and variational integrators [32] to derive both an integrator for the dynamics and an optimal control algorithm in a unified manner. This is accomplished through the discretization of the Lagrange-d'Alembert variational principle on manifolds. An integrator for the mechanics is derived using a standard Lagrangian function and the virtual work done by control forces, while control optimality conditions are derived using a special Lagrangian defined on a higher-dimensional space which encodes the dynamics and a desired cost function. The resulting integration and optimization schemes are symplectic and respect the state space structure and momentum evolution.
These qualities are associated with favorable numerical properties, which motivate the development of practical algorithms that can be applied to robotic or aerospace vehicles.

The proposed framework is general and applies to unconstrained systems, as well as to systems with symmetries, underactuation, and nonholonomic constraints. In particular, our construction is appropriate for controlled Lagrangian systems that evolve on a general tangent bundle $TQ$ with associated discrete state space $Q \times Q$, where $Q$ is a differentiable manifold ([32, 34]). In addition, we focus on systems evolving on a Lie group $G$ ([3, 5, 18, 21]) and also consider the underactuated case ([18]), applicable to rigid body systems. Finally, the theory extends to the more general principal bundle setting with discrete analog $Q \times Q \times G$ (or, more generally, $(Q \times Q)/G$), assuming that the action of a Lie group $G$ of symmetries leaves the control system invariant ([8, 10, 19]).

This work has been partially supported by MEC (Spain) Grants MTM 2010-21186-C02-01, MTM2009-08166-E, and IRSES-project Geomech-246981.

The main idea is the following: we take an approximation of the Lagrange-d'Alembert principle for forced Lagrangian systems, which models control inputs and external forces such as gravity or drag. The formulation permits piecewise continuous control forces, which can be encountered in practical applications. We observe that the discrete equations of motion for this type of system can be interpreted as the discrete Euler-Lagrange equations of a new Lagrangian defined on an augmented discrete phase space. Next, we apply discrete variational calculus techniques to derive the discrete optimality conditions. After this, we recover two sequences of discrete controls modeling a piecewise control trajectory.

Additionally, we show how to derive the equations for various reduced systems. We specifically develop numerical methods for systems on Lie groups that lead to practical algorithm implementation. One such example system, an underactuated underwater vehicle, is used to illustrate the developed methodology. The resulting algorithm is simple to implement and converges quickly to a solution that is close both to the optimal solution and to the true system dynamics. We also extend our techniques to more general reduced systems, such as optimal control problems on trivial principal bundles, and we show how to introduce nonholonomic constraints in our framework.

The contributions of this work are several. First, it formulates and derives numerical methods for dynamics integration and optimal control of mechanical systems in a unified discrete variational setting.
Performing the optimization (via trajectory variations) in an enlarged phase space then naturally enables the treatment of general systems on either vector spaces or principal bundles with Lie group symmetries, subject to underactuation, nonholonomic constraints, and discontinuous control inputs. Second, the paper details a nonlinear root-finding algorithm for the optimal control problem between two given initial and final states. The algorithm is surprisingly easy to construct, since it is implemented similarly to an integrator with the addition of a boundary reconstruction condition. Finally, the geometric preservation properties of the optimal control solutions, such as symplectic-momentum preservation in the standard case or Poisson bracket and momentum preservation for reduced systems, are automatically guaranteed using the results in [32, 24].

The developed optimization methods inherit the backward-error analysis properties of standard variational integrators. Yet, while backward error analysis explains the long-time properties of standard integrators, its significance in the context of optimal control problems with finite horizon and fixed final boundary state requires further study. In addition, while symplecticity is linked to favorable behavior in dynamics time-stepping, the symplecticity of the higher-dimensional optimal control system is likely to have further implications that remain to be studied. Finally, as with any other local optimization method for nonlinear systems, the proposed approach does not have global convergence guarantees.

The paper is organized as follows. Section 2 introduces variational integrators. Section 3 formulates optimal control problems for Lagrangian systems defined on tangent bundles, in the continuous and discrete setting, and for both fully actuated and underactuated systems. A simple control problem for a mechanical Lagrangian on $\mathbb{R}^n$ illustrates these developments. In Section 4, discrete mechanics on Lie groups is introduced. Specifically, the discrete Euler-Poincaré equations and their Hamiltonian version, the discrete Lie-Poisson equations, are obtained. Sections 5 and 6 develop the discretization procedure and the numerical aspects of the proposed approach. The developed algorithm is illustrated with an application to an unmanned underwater vehicle evolving on $SE(3)$. Finally, Section 7 deals with reduced systems on a trivial principal bundle and with nonholonomic mechanics.

2. Discrete Mechanics and Variational Integrators

Let $Q$ be an $n$-dimensional differentiable manifold with local coordinates $(q^i)$, $1 \le i \le n$. Denote by $TQ$ its tangent bundle with induced coordinates $(q^i, \dot q^i)$. Given a Lagrangian function $L : TQ \to \mathbb{R}$, the Euler-Lagrange equations are

\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q^i}\right) - \frac{\partial L}{\partial q^i} = 0, \qquad 1 \le i \le n. \tag{1} \]

These equations are a system of implicit second order differential equations. In the sequel, we will assume that the Lagrangian is regular, that is, the matrix $\left(\frac{\partial^2 L}{\partial \dot q^i \partial \dot q^j}\right)$ is non-singular. It is well known that the origin of these equations is variational (see [1, 31]). Variational integrators retain this variational character and also some of the key geometric properties of the continuous system, such as symplecticity and momentum conservation (see [11] and references therein). In the following we summarize the main features of this type of numerical integrator [32].
A discrete Lagrangian is a map $L_d : Q \times Q \to \mathbb{R}$, which may be considered as an approximation of the action integral defined by a continuous Lagrangian $L : TQ \to \mathbb{R}$:

\[ L_d(q_0, q_1) \approx \int_0^h L(q(t), \dot q(t))\, dt, \]

where $q(t)$ is a solution of the Euler-Lagrange equations for $L$ with $q(0) = q_0$ and $q(h) = q_1$, and $h > 0$ is small enough.

Remark 2.1. The Cartesian product $Q \times Q$ is equipped with an interesting differential structure, termed Lie groupoid, which allows the extension of variational calculus to various settings (see [24] for more details).

Define the action sum $S_d : Q^{N+1} \to \mathbb{R}$ corresponding to the Lagrangian $L_d$ by

\[ S_d = \sum_{k=1}^{N} L_d(q_{k-1}, q_k), \]

where $q_k \in Q$ for $0 \le k \le N$, and $N$ is the number of steps. The discrete variational principle states that the solutions of the discrete system determined by $L_d$ must extremize the action sum given fixed endpoints $q_0$ and $q_N$. By extremizing $S_d$ over $q_k$, $1 \le k \le N-1$, we obtain the system of difference equations

\[ D_1 L_d(q_k, q_{k+1}) + D_2 L_d(q_{k-1}, q_k) = 0, \tag{2} \]

or, in coordinates,

\[ \frac{\partial L_d}{\partial x^i}(q_k, q_{k+1}) + \frac{\partial L_d}{\partial y^i}(q_{k-1}, q_k) = 0, \]

where $1 \le i \le n$, $1 \le k \le N-1$, and $x$, $y$ denote the first $n$ and the second $n$ variables of the function $L_d$, respectively. These equations are usually called the discrete Euler-Lagrange equations. Under some regularity hypotheses (the matrix $D_{12} L_d(q_k, q_{k+1})$ is regular), it is possible to define a (local) discrete flow $\Upsilon_{L_d} : Q \times Q \to Q \times Q$ by $\Upsilon_{L_d}(q_{k-1}, q_k) = (q_k, q_{k+1})$ from (2). Define the discrete Legendre transformations associated to $L_d$ as

\[ \mathbb{F}^- L_d : Q \times Q \to T^*Q, \qquad (q_0, q_1) \mapsto (q_0, -D_1 L_d(q_0, q_1)), \]
\[ \mathbb{F}^+ L_d : Q \times Q \to T^*Q, \qquad (q_0, q_1) \mapsto (q_1, D_2 L_d(q_0, q_1)), \]

and the discrete Poincaré-Cartan 2-form $\omega_d = (\mathbb{F}^+ L_d)^* \omega_Q = (\mathbb{F}^- L_d)^* \omega_Q$, where $\omega_Q$ is the canonical symplectic form on $T^*Q$. The discrete algorithm determined by $\Upsilon_{L_d}$ preserves the symplectic form $\omega_d$, i.e., $\Upsilon_{L_d}^* \omega_d = \omega_d$. Moreover, if the discrete Lagrangian is invariant under the diagonal action of a Lie group $G$, then the discrete momentum map $J_d : Q \times Q \to \mathfrak{g}^*$ defined by

\[ \langle J_d(q_k, q_{k+1}), \xi \rangle = \langle D_2 L_d(q_k, q_{k+1}), \xi_Q(q_{k+1}) \rangle \]

is preserved by the discrete flow. Therefore, these integrators are symplectic-momentum preserving. Here, $\xi_Q$ denotes the fundamental vector field determined by $\xi \in \mathfrak{g}$, where $\mathfrak{g}$ is the Lie algebra of $G$. (See [32] for more details.)

3. Discrete optimal control on tangent bundles

Consider a mechanical system whose configuration space is an $n$-dimensional differentiable manifold $Q$ and whose dynamics is determined by a Lagrangian $L : TQ \to \mathbb{R}$. The control forces are modeled as a mapping $f : TQ \times U \to T^*Q$, where $f(v_q, u) \in T_q^*Q$, $v_q \in T_qQ$ and $u \in U$, with $U$ the control space. Observe that this last definition also covers configuration- and velocity-dependent forces such as dissipation or friction (see [34]). For greater generality we consider control variables that are only piecewise continuous, to account for impulsive controls.
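As a concrete illustration of the discrete Euler-Lagrange equations (2) of Section 2, the following minimal sketch steps a one-dimensional system forward in time. It assumes a trapezoidal discrete Lagrangian $L_d(q_0, q_1) = \frac{m}{2h}(q_1 - q_0)^2 - \frac{h}{2}(V(q_0) + V(q_1))$ and a harmonic potential; the function names, the step size and the initial data are illustrative choices, not taken from the paper. For this $L_d$, equation (2) can be solved explicitly for $q_{k+1}$.

```python
import math

def del_step(q_prev, q_curr, h, m, dV):
    """Solve D1 L_d(q_curr, q_next) + D2 L_d(q_prev, q_curr) = 0 for q_next.

    For the trapezoidal L_d assumed here, the equation is linear in q_next.
    """
    return 2.0 * q_curr - q_prev - (h * h / m) * dV(q_curr)

def legendre_plus(q0, q1, h, m, dV):
    """Momentum at q1 from the discrete Legendre transform F^+: p1 = D2 L_d(q0, q1)."""
    return m * (q1 - q0) / h - 0.5 * h * dV(q1)

def energies(qs, h, m, V, dV):
    """Energy p_k^2 / (2m) + V(q_k) at the nodes k = 1, ..., len(qs) - 1."""
    return [legendre_plus(qs[k - 1], qs[k], h, m, dV) ** 2 / (2.0 * m) + V(qs[k])
            for k in range(1, len(qs))]

# Illustrative problem (an assumption): harmonic oscillator, V(q) = q^2 / 2.
V = lambda q: 0.5 * q * q
dV = lambda q: q
h, m = 0.1, 1.0

qs = [1.0, math.cos(h)]          # two initial points define the discrete state
for _ in range(1000):
    qs.append(del_step(qs[-2], qs[-1], h, m, dV))

E = energies(qs, h, m, V, dV)
energy_spread = max(E) - min(E)  # stays small over the whole run in this example
print(energy_spread)
```

The energy is not conserved exactly, but in this example its oscillation remains bounded over many steps, which is the kind of behavior usually associated with symplectic schemes.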
The motion of the mechanical system is described by applying the Lagrange-d'Alembert principle, which requires that the solutions $q(t) \in Q$ satisfy

\[ \delta \int_0^T L(q(t), \dot q(t))\, dt + \int_0^T f(q(t), \dot q(t), u(t))\, \delta q(t)\, dt = 0, \tag{3} \]

where $(q, \dot q)$ are the local coordinates of $TQ$ and where we consider arbitrary variations $\delta q(t) \in T_{q(t)}Q$ with $\delta q(0) = 0$ and $\delta q(T) = 0$ (since we are prescribing fixed initial and final conditions $(q(0), \dot q(0))$ and $(q(T), \dot q(T))$).

Given that we are considering an optimal control problem, the forces $f$ must be chosen, if they exist, as the ones that extremize the cost functional

\[ \int_0^T C(q(t), \dot q(t), u(t))\, dt, \tag{4} \]

where $C : TQ \times U \to \mathbb{R}$.

The optimal equations of motion can now be derived using the Pontryagin maximum principle. Generally, it is not possible to integrate these equations explicitly and, consequently, it is necessary to apply a numerical method. In this work, using discrete variational techniques, we first discretize the Lagrange-d'Alembert principle and then the cost functional. We obtain a numerical method that preserves some geometric features of the original continuous system, as we will see in the sequel.

To discretize this problem we replace the tangent bundle $TQ$ by the Cartesian product $Q \times Q$ and the continuous curves by sequences $q_0, q_1, \dots, q_N$ (we are using $N$ steps, with fixed time step $h$, in such a way that $t_k = kh$ and $Nh = T$). The discrete Lagrangian $L_d : Q \times Q \to \mathbb{R}$ is constructed as an approximation of the action integral in a single time step (see [32]), that is

\[ L_d(q_k, q_{k+1}) \approx \int_{kh}^{(k+1)h} L(q(t), \dot q(t))\, dt. \]

We choose the following discretization for the external forces: $f_k^{\pm} : Q \times Q \times U \to T^*Q$, where $U \subset \mathbb{R}^m$, $m \le n$, such that

\[ f_k^-(q_k, q_{k+1}, u_k^-) \in T_{q_k}^*Q, \qquad f_k^+(q_k, q_{k+1}, u_k^+) \in T_{q_{k+1}}^*Q. \]

Observe that, as mentioned above, we have introduced the discrete controls as two different sequences $\{u_k^-\}$ and $\{u_k^+\}$. In the notation followed throughout this paper, the time interval $[kh, (k+1)h]$ is referred to as the $k$-th interval, while the controls immediately before and after time $t_{k+1} = (k+1)h$ are denoted by $u_k^+$ and $u_{k+1}^-$, respectively. This choice allows us to model piecewise continuous controls, admitting discrete jumps at every time $t_k$. The notation is also depicted in the following figure:

[Figure: time axis $t_k$ with nodes $hk$, $h(k+1)$, $h(k+2)$, $h(k+3)$; the $k$-th, $(k+1)$-th and $(k+2)$-th intervals carry the control pairs $(u_k^-, u_k^+)$, $(u_{k+1}^-, u_{k+1}^+)$ and $(u_{k+2}^-, u_{k+2}^+)$.]
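The two control sequences can be stored and queried as sketched below. The text only fixes the values of the control immediately after $t_k$ (namely $u_k^-$) and immediately before $t_{k+1}$ (namely $u_k^+$); the linear interpolation inside each interval, the helper names and the sample values are assumptions made purely for illustration.

```python
def control_at(t, u_minus, u_plus, h):
    """Evaluate a piecewise control built from the sequences {u_k^-}, {u_k^+}.

    On the k-th interval [kh, (k+1)h] the control runs from u_minus[k] (just
    after t_k) to u_plus[k] (just before t_{k+1}). Linear interpolation in
    between is an ASSUMPTION of this sketch. At an interior node the right
    limit is returned.
    """
    n_intervals = len(u_minus)
    k = min(int(t // h), n_intervals - 1)   # index of the interval containing t
    s = (t - k * h) / h                     # local coordinate in [0, 1]
    return (1.0 - s) * u_minus[k] + s * u_plus[k]

def jumps(u_minus, u_plus):
    """Discontinuity of the control at each interior node t_{k+1}: u_{k+1}^- - u_k^+."""
    return [u_minus[k + 1] - u_plus[k] for k in range(len(u_minus) - 1)]

# Hypothetical data: N = 3 intervals of length h = 0.5.
h = 0.5
u_minus = [0.0, 2.0, 1.0]
u_plus = [1.0, 2.0, 3.0]

mid_first = control_at(0.25, u_minus, u_plus, h)   # midpoint of interval 0
at_end = control_at(1.5, u_minus, u_plus, h)       # final time, equals u_plus[-1]
node_jumps = jumps(u_minus, u_plus)
print(mid_first, at_end, node_jumps)
```

Keeping both sequences, rather than a single value per node, is what lets the discrete control jump at every $t_k$.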

Moreover, we have that

\[ f_k^-(q_k, q_{k+1}, u_k^-)\, \delta q_k + f_k^+(q_k, q_{k+1}, u_k^+)\, \delta q_{k+1} \approx \int_{kh}^{(k+1)h} f(q(t), \dot q(t), u(t))\, \delta q(t)\, dt, \]

where $(f_k^-(q_k, q_{k+1}, u_k^-), f_k^+(q_k, q_{k+1}, u_k^+)) \in T_{q_k}^*Q \times T_{q_{k+1}}^*Q$ (see [32]). Therefore, we derive a discrete version of the Lagrange-d'Alembert principle given in (3):

\[ \delta \sum_{k=0}^{N-1} L_d(q_k, q_{k+1}) + \sum_{k=0}^{N-1} \left( f_k^-(q_k, q_{k+1}, u_k^-)\, \delta q_k + f_k^+(q_k, q_{k+1}, u_k^+)\, \delta q_{k+1} \right) = 0, \]

for all variations $\{\delta q_k\}_{k=0,\dots,N}$ with $\delta q_k \in T_{q_k}Q$ such that $\delta q_0 = \delta q_N = 0$. From this principle it is easy to derive the system of difference equations

\[ D_2 L_d(q_{k-1}, q_k) + D_1 L_d(q_k, q_{k+1}) + f_{k-1}^+(q_{k-1}, q_k, u_{k-1}^+) + f_k^-(q_k, q_{k+1}, u_k^-) = 0, \tag{5} \]

where $k = 1, \dots, N-1$. Equations (5) are called the forced discrete Euler-Lagrange equations (see [34]).

We can also approximate the cost functional (4) in a single time step $h$ by

\[ C_d(q_k, u_k^-, q_{k+1}, u_k^+) \approx \int_{kh}^{(k+1)h} C(q(t), \dot q(t), u(t))\, dt, \]

yielding the discrete cost functional

\[ \sum_{k=0}^{N-1} C_d(q_k, u_k^-, q_{k+1}, u_k^+). \]

Observe that $C_d : Q \times U \times Q \times U \to \mathbb{R}$.

3.1. Fully-actuated Systems. In this section we assume the following condition.

Definition 3.1. (Fully actuated discrete system) The discrete mechanical control system is fully actuated if the mappings

\[ f_k^-\big|_{(q_k, q_{k+1})} : U \to T_{q_k}^*Q, \qquad f_k^-\big|_{(q_k, q_{k+1})}(u) = f_k^-(q_k, q_{k+1}, u), \]
\[ f_k^+\big|_{(q_k, q_{k+1})} : U \to T_{q_{k+1}}^*Q, \qquad f_k^+\big|_{(q_k, q_{k+1})}(u) = f_k^+(q_k, q_{k+1}, u), \]

are both diffeomorphisms.

Define the momenta

\[ p_k = -D_1 L_d(q_k, q_{k+1}) - f_k^-(q_k, q_{k+1}, u_k^-), \tag{6} \]
\[ p_{k+1} = D_2 L_d(q_k, q_{k+1}) + f_k^+(q_k, q_{k+1}, u_k^+). \tag{7} \]

Since both $f_k^{\pm}\big|_{(q_k, q_{k+1})}$ are diffeomorphisms, we can express $u_k^{\pm}$ in terms of $(q_k, p_k, q_{k+1}, p_{k+1})$ using (6) and (7). Next, we define a new Lagrangian $\tilde L_d : T^*Q \times T^*Q \to \mathbb{R}$ by

\[ \tilde L_d(q_k, p_k, q_{k+1}, p_{k+1}) = C_d\Big( q_k,\ \big(f_k^-\big|_{(q_k, q_{k+1})}\big)^{-1}\big(-D_1 L_d - p_k\big),\ q_{k+1},\ \big(f_k^+\big|_{(q_k, q_{k+1})}\big)^{-1}\big(-D_2 L_d + p_{k+1}\big) \Big). \tag{8} \]

The system is fully actuated; consequently, the Lagrangian $\tilde L_d$ is well defined on the entire discrete space $T^*Q \times T^*Q$. Now the discrete phase space is the Cartesian product $T^*Q \times T^*Q$ of two copies of the cotangent bundle. The definition (6), (7) gives us a matching of momenta (see [32]) which automatically implies

\[ D_2 L_d(q_{k-1}, q_k) + f_{k-1}^+(q_{k-1}, q_k, u_{k-1}^+) = -D_1 L_d(q_k, q_{k+1}) - f_k^-(q_k, q_{k+1}, u_k^-), \qquad k = 1, \dots, N-1, \]

which are the forced discrete Euler-Lagrange equations (5). In other words, the matching condition enforces that the momentum at time $k$ should be the same when evaluated from the lower interval $[k-1, k]$ or the upper interval $[k, k+1]$. Consequently, along a solution curve there is a unique momentum at each time $t_k$, which can be called $p_k$.

The discrete Euler-Lagrange equations of motion for the Lagrangian $\tilde L_d : T^*Q \times T^*Q \to \mathbb{R}$ are

\[ D_3 \tilde L_d(q_{k-1}, p_{k-1}, q_k, p_k) + D_1 \tilde L_d(q_k, p_k, q_{k+1}, p_{k+1}) = 0, \tag{9} \]
\[ D_4 \tilde L_d(q_{k-1}, p_{k-1}, q_k, p_k) + D_2 \tilde L_d(q_k, p_k, q_{k+1}, p_{k+1}) = 0. \tag{10} \]

Assuming the regularity of the matrix

\[ \begin{pmatrix} D_{13} \tilde L_d & D_{14} \tilde L_d \\ D_{23} \tilde L_d & D_{24} \tilde L_d \end{pmatrix}, \]

then, by the implicit function theorem, the two discrete Legendre transformations

\[ \mathbb{F}^- \tilde L_d(q_k, p_k, q_{k+1}, p_{k+1}) = (q_k, p_k, -D_1 \tilde L_d, -D_2 \tilde L_d), \]
\[ \mathbb{F}^+ \tilde L_d(q_k, p_k, q_{k+1}, p_{k+1}) = (q_{k+1}, p_{k+1}, D_3 \tilde L_d, D_4 \tilde L_d), \]

are local diffeomorphisms. In many cases, for instance if $Q$ is a vector space, both discrete Legendre transformations may be global diffeomorphisms. In that case we say that $\tilde L_d$ is hyperregular and we can define the discrete Hamiltonian map

\[ \tilde F_{\tilde L_d} = \mathbb{F}^+ \tilde L_d \circ (\mathbb{F}^- \tilde L_d)^{-1} : T^*(T^*Q) \to T^*(T^*Q). \]
From the standard properties of discrete variational calculus [32], we deduce that the discrete Hamiltonian map preserves the canonical symplectic form on $T^*(T^*Q)$ and, when $\tilde L_d$ is invariant under a Lie group of symmetries, the canonical momentum maps (see the next subsections for further discussion). In summary, we have obtained the discrete equations of motion for a fully actuated mechanical optimal control problem as the discrete Euler-Lagrange equations for a Lagrangian defined on the product of two copies of the cotangent bundle, and we have derived its preservation properties.
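The matching of momenta behind (5), (6) and (7) can be checked numerically. The sketch below assumes a one-dimensional system with a trapezoidal discrete Lagrangian $L_d(q_0, q_1) = \frac{m}{2h}(q_1 - q_0)^2 - \frac{h}{2}(V(q_0) + V(q_1))$ and forces $f_k^{\pm} = u_k^{\pm}$; the potential, the control values and all function names are hypothetical. It advances the forced discrete Euler-Lagrange equations (5) and verifies that the momentum computed from the interval below a node, via (7), agrees with the one computed from the interval above, via (6).

```python
import math

def forced_del_step(q_prev, q_curr, u_plus_prev, u_minus_curr, h, m, dV):
    """Solve the forced discrete Euler-Lagrange equation (5) for q_next."""
    return (2.0 * q_curr - q_prev
            + (h / m) * (u_plus_prev + u_minus_curr - h * dV(q_curr)))

def momentum_from_below(q_prev, q_curr, u_plus_prev, h, m, dV):
    """Equation (7) on the interval [k-1, k]: D2 L_d(q_prev, q_curr) + f^+_{k-1}."""
    return m * (q_curr - q_prev) / h - 0.5 * h * dV(q_curr) + u_plus_prev

def momentum_from_above(q_curr, q_next, u_minus_curr, h, m, dV):
    """Equation (6) on the interval [k, k+1]: -D1 L_d(q_curr, q_next) - f^-_k."""
    return m * (q_next - q_curr) / h + 0.5 * h * dV(q_curr) - u_minus_curr

# Hypothetical setup: pendulum-like potential and arbitrary control samples.
dV = lambda q: 9.81 * math.sin(q)
h, m = 0.05, 1.0
u_minus = [0.3, -0.1, 0.2, 0.0]
u_plus = [0.1, 0.4, -0.2, 0.05]

q = [0.0, 0.01]
for k in range(1, 4):
    q.append(forced_del_step(q[k - 1], q[k], u_plus[k - 1], u_minus[k], h, m, dV))

mismatch = max(
    abs(momentum_from_below(q[k - 1], q[k], u_plus[k - 1], h, m, dV)
        - momentum_from_above(q[k], q[k + 1], u_minus[k], h, m, dV))
    for k in range(1, 4)
)
print(mismatch)  # zero up to rounding error
```

In this setting the two evaluations of $p_k$ differ only by rounding, which is exactly the statement that (5) holds at every interior node.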

3.2. Example: optimal control problem for a mechanical Lagrangian with configuration space $\mathbb{R}^n$. Consider the case $Q = \mathbb{R}^n$ and assume that $M$ is an $n \times n$ constant and symmetric matrix. The mechanical Lagrangian $L : \mathbb{R}^{2n} \to \mathbb{R}$ is defined by

\[ L(x, \dot x) = \frac{1}{2} \dot x^T M \dot x - V(x), \]

where $V : \mathbb{R}^n \to \mathbb{R}$ is the potential function and $x \in \mathbb{R}^n$. The system is fully actuated and there are no velocity constraints. The optimal control problem is typically posed in terms of boundary conditions $(x(0), \dot x(0))$ and $(x(T), \dot x(T))$ for a given final time $T$. Note that in the continuous setting we can define the momentum by the continuous Legendre transformation $\mathbb{F}L : TQ \to T^*Q$, $(q, \dot q) \mapsto (q, p)$:

\[ p = \frac{\partial L}{\partial \dot x}, \quad \text{i.e.} \quad p(t) = \dot x^T(t) M. \]

In consequence, we can also define boundary constraints in the phase space: $(x(0), p(0) = \dot x(0)^T M)$ and $(x(T), p(T) = \dot x(T)^T M)$. We employ a trapezoidal discretization for the Lagrangian (see [11]), that is,

\[ L_d(x_k, x_{k+1}) = \frac{h}{2} L\left(x_k, \frac{x_{k+1} - x_k}{h}\right) + \frac{h}{2} L\left(x_{k+1}, \frac{x_{k+1} - x_k}{h}\right), \]

where, as above, $h$ is the fixed time step and $x_0, x_1, \dots, x_N$ is a sequence of elements of $\mathbb{R}^n$. The discrete Lagrangian then becomes

\[ L_d(x_k, x_{k+1}) = \frac{1}{2h} (x_{k+1} - x_k)^T M (x_{k+1} - x_k) - \frac{h}{2} \left( V(x_k) + V(x_{k+1}) \right). \]

The control forces are $f_k^-(x_k, x_{k+1}, u_k^-) \in T_{x_k}^*\mathbb{R}^n$ and $f_k^+(x_k, x_{k+1}, u_k^+) \in T_{x_{k+1}}^*\mathbb{R}^n$. For the sake of clarity, we fix the control forces in the following manner:

\[ f_k^{\pm}(x_k, x_{k+1}, u_k^{\pm}) = u_k^{\pm}. \]

Using equations (6) and (7) it is straightforward to obtain the associated momenta $p_k$ and $p_{k+1}$, namely

\[ p_k = \frac{1}{h}(x_{k+1} - x_k)^T M + \frac{h}{2} V_x(x_k)^T - u_k^-, \]
\[ p_{k+1} = \frac{1}{h}(x_{k+1} - x_k)^T M - \frac{h}{2} V_x(x_{k+1})^T + u_k^+. \]

Let $C_d = \frac{h}{4}\left[ (u_k^-)^2 + (u_k^+)^2 \right]$ be a discrete approximation of the cost function. Consequently, by (8), the Lagrangian over $T^*\mathbb{R}^n \times T^*\mathbb{R}^n$ is

\[ \tilde L_d(x_k, p_k, x_{k+1}, p_{k+1}) = \frac{h}{4} \left( p_k - \left(\frac{x_{k+1} - x_k}{h}\right)^T M - \frac{h}{2} V_x(x_k)^T \right)^2 + \frac{h}{4} \left( p_{k+1} - \left(\frac{x_{k+1} - x_k}{h}\right)^T M + \frac{h}{2} V_x(x_{k+1})^T \right)^2, \]

where $V_x$ represents the derivative of $V$ with respect to the variable $x$.
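The passage from controls to momenta and back in this example can be sketched as follows. The code assumes $n = 2$, a quadratic potential $V(x) = \frac{1}{2}|x|^2$, and reads $(u_k^{\pm})^2$ as the squared Euclidean norm; these choices, the numerical values and the function names are illustrative assumptions. It computes $p_k$, $p_{k+1}$ from $(x_k, x_{k+1}, u_k^-, u_k^+)$, inverts the relations to recover the controls, and evaluates the new Lagrangian as the cost $C_d$ of the recovered controls, following (8).

```python
def _dx_M(xk, xk1, h, M):
    """Row vector ((x_{k+1} - x_k) / h)^T M for a symmetric matrix M."""
    n = len(xk)
    return [sum((xk1[j] - xk[j]) / h * M[j][i] for j in range(n)) for i in range(n)]

def momenta(xk, xk1, um, up, h, M, grad_V):
    """Momenta p_k and p_{k+1} from equations (6) and (7) with f^± = u^±."""
    d, gk, gk1 = _dx_M(xk, xk1, h, M), grad_V(xk), grad_V(xk1)
    pk = [d[i] + 0.5 * h * gk[i] - um[i] for i in range(len(xk))]
    pk1 = [d[i] - 0.5 * h * gk1[i] + up[i] for i in range(len(xk))]
    return pk, pk1

def controls(xk, pk, xk1, pk1, h, M, grad_V):
    """Invert (6) and (7): recover u_k^- and u_k^+ from (x_k, p_k, x_{k+1}, p_{k+1})."""
    d, gk, gk1 = _dx_M(xk, xk1, h, M), grad_V(xk), grad_V(xk1)
    um = [d[i] + 0.5 * h * gk[i] - pk[i] for i in range(len(xk))]
    up = [pk1[i] - d[i] + 0.5 * h * gk1[i] for i in range(len(xk))]
    return um, up

def L_tilde(xk, pk, xk1, pk1, h, M, grad_V):
    """New Lagrangian: the discrete cost C_d evaluated on the recovered controls."""
    um, up = controls(xk, pk, xk1, pk1, h, M, grad_V)
    return 0.25 * h * (sum(u * u for u in um) + sum(u * u for u in up))

# Hypothetical data for n = 2 with V(x) = |x|^2 / 2, so grad V(x) = x.
grad_V = lambda x: list(x)
M = [[2.0, 0.5], [0.5, 1.0]]
h = 0.1
xk, xk1 = [0.0, 1.0], [0.1, 0.9]
um, up = [0.2, -0.3], [0.5, 0.1]

pk, pk1 = momenta(xk, xk1, um, up, h, M, grad_V)
um_rec, up_rec = controls(xk, pk, xk1, pk1, h, M, grad_V)
cost = L_tilde(xk, pk, xk1, pk1, h, M, grad_V)
print(um_rec, up_rec, cost)
```

Because the forces are the controls themselves, the inversion is explicit here; for a general fully actuated system it would require inverting $f_k^{\pm}\big|_{(q_k, q_{k+1})}$.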
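Applying equations (9) and (10) to $\tilde L_d$ we obtain the following equations:

\[ p_k - \left(\frac{x_{k+1} - x_{k-1}}{2h}\right)^T M = 0, \tag{11} \]

\[ \left( p_k - \left(\frac{x_{k+1} - x_k}{h}\right)^T M - \frac{h}{2} V_x(x_k)^T \right) \left( M - \frac{h^2}{2} V_{xx}(x_k)^T \right) - \left( p_k - \left(\frac{x_k - x_{k-1}}{h}\right)^T M + \frac{h}{2} V_x(x_k)^T \right) \left( M - \frac{h^2}{2} V_{xx}(x_k)^T \right) = 0, \tag{12} \]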

where both sets of equations are defined for $k = 1, \dots, N-1$. Note that it is possible to remove the dependence on $p_k$ in equation (12). However, we prefer to keep it in order to stress that the discrete variational Euler-Lagrange equations (9) and (10) are defined on $T^*Q \times T^*Q$ ($T^*\mathbb{R}^n \times T^*\mathbb{R}^n$ in the particular case considered in this example). Expressions (11) and (12) give $2(N-1)n$ equations for the $2(N+1)n$ unknowns $\{x_k\}_{k=0}^{N}$, $\{p_k\}_{k=0}^{N}$. The boundary conditions

\[ x_0 = x(0), \quad p_0 = p(0), \quad x_N = x(T), \quad p_N = p(T), \]

contribute $4n$ extra equations, which turn (11) and (12) into a nonlinear root-finding problem with $2(N+1)n$ equations and the same number of unknowns.

3.3. Underactuated Systems. In this section, we examine the case of underactuated systems, defined as follows.

Definition 3.2. (Underactuated discrete system) A discrete mechanical control system is underactuated if the mappings

\[ f_k^-\big|_{(q_k, q_{k+1})} : U \to T_{q_k}^*Q, \qquad f_k^-\big|_{(q_k, q_{k+1})}(u) = f_k^-(q_k, q_{k+1}, u), \]
\[ f_k^+\big|_{(q_k, q_{k+1})} : U \to T_{q_{k+1}}^*Q, \qquad f_k^+\big|_{(q_k, q_{k+1})}(u) = f_k^+(q_k, q_{k+1}, u), \]

are both embeddings.

Under this hypothesis we deduce that

\[ \mathcal{M}^-_{(q_k, q_{k+1})} = f_k^-\big|_{(q_k, q_{k+1})}(U), \qquad \mathcal{M}^+_{(q_k, q_{k+1})} = f_k^+\big|_{(q_k, q_{k+1})}(U) \]

are submanifolds of $T_{q_k}^*Q$ and $T_{q_{k+1}}^*Q$, respectively. Therefore, $f_k^{\pm}\big|_{(q_k, q_{k+1})}$ are diffeomorphisms onto their images. Moreover,

\[ \dim \mathcal{M}^-_{(q_k, q_{k+1})} = \dim \mathcal{M}^+_{(q_k, q_{k+1})} = \dim U. \]

The set of admissible forces is restricted to the space $\mathcal{M}^-_{(q_k, q_{k+1})} \times \mathcal{M}^+_{(q_k, q_{k+1})} \subset T_{q_k}^*Q \times T_{q_{k+1}}^*Q$. As a consequence, the admissible momenta defined in (6) and (7) satisfy

\[ (q_k, -D_1 L_d(q_k, q_{k+1}) - p_k) \in \mathcal{M}^-_{(q_k, q_{k+1})} \subset T_{q_k}^*Q, \]
\[ (q_{k+1}, -D_2 L_d(q_k, q_{k+1}) + p_{k+1}) \in \mathcal{M}^+_{(q_k, q_{k+1})} \subset T_{q_{k+1}}^*Q. \]

Thus, the Lagrangian function defined in (8) is restricted to these points only, and it is necessary to apply constrained variational calculus. This is typically performed by means of constraint functions $\Phi_\alpha^-, \Phi_\alpha^+ : T^*Q \times T^*Q \to \mathbb{R}$, $1 \le \alpha \le n - \dim U$.
Therefore, the solutions of the optimal control problem are now viewed as the solutions of the discrete constrained problem determined by the extended Lagrangian $\tilde L_d$ and the constraints $\Phi_\alpha^{\pm}$. Since $f_k^{\pm}\big|_{(q_k, q_{k+1})}$ are embeddings, as established in Definition 3.2, the number of constraints of each type is $n$ minus the dimension of $U$. Note that the total number of constraints $\Phi_\alpha^{\pm}$ is therefore $2(n - \dim U)$.

To solve this problem we introduce Lagrange multipliers $(\lambda_k^-)^\alpha$, $(\lambda_k^+)^\alpha$ and consider discrete variational calculus using the augmented Lagrangian

\[ \bar L_d(q_k, p_k, \lambda_k^-, q_{k+1}, p_{k+1}, \lambda_k^+) = \tilde L_d(q_k, p_k, q_{k+1}, p_{k+1}) + (\lambda_k^-)^\alpha \Phi_\alpha^-(q_k, p_k, q_{k+1}, p_{k+1}) + (\lambda_k^+)^\alpha \Phi_\alpha^+(q_k, p_k, q_{k+1}, p_{k+1}). \]

Observe that, even though the constraints are functions on the Cartesian product of two copies of the cotangent bundle, i.e. $\Phi_\alpha^{\pm} : T^*Q \times T^*Q \to \mathbb{R}$, $\Phi_\alpha^-$ does not depend on $p_{k+1}$, nor $\Phi_\alpha^+$ on $p_k$. The discrete Euler-Lagrange equations of the augmented Lagrangian give the solutions of the underactuated problem.

Typically, the controls of underactuated systems enter in an affine way, that is,

\[ f_k^-(q_k, q_{k+1}, u_k^-) = A_k^-(q_k, q_{k+1}) + B_k^-(q_k, q_{k+1})(u_k^-), \]
\[ f_k^+(q_k, q_{k+1}, u_k^+) = A_k^+(q_k, q_{k+1}) + B_k^+(q_k, q_{k+1})(u_k^+), \]

where $A_k^-(q_k, q_{k+1}) \in T_{q_k}^*Q$ and $A_k^+(q_k, q_{k+1}) \in T_{q_{k+1}}^*Q$. Moreover, $B_k^-(q_k, q_{k+1}) \in \mathrm{Lin}(U, T_{q_k}^*Q)$ and $B_k^+(q_k, q_{k+1}) \in \mathrm{Lin}(U, T_{q_{k+1}}^*Q)$ are linear maps (we assume that $U$ is a vector space and $\mathrm{Lin}(E_1, E_2)$ is the set of all linear maps between $E_1$ and $E_2$). In consequence, $B_k^-(q_k, q_{k+1})(u_k^-) \in T_{q_k}^*Q$ and $B_k^+(q_k, q_{k+1})(u_k^+) \in T_{q_{k+1}}^*Q$. Then the constraints are deduced using the compatibility conditions

\[ \operatorname{rank} B_k^- = \operatorname{rank}\left( B_k^- \,;\, -D_1 L_d(q_k, q_{k+1}) - p_k - A_k^-(q_k, q_{k+1}) \right), \]
\[ \operatorname{rank} B_k^+ = \operatorname{rank}\left( B_k^+ \,;\, -D_2 L_d(q_k, q_{k+1}) + p_{k+1} - A_k^+(q_k, q_{k+1}) \right), \]

which imply constraints on $(q_k, q_{k+1}, p_k)$ and $(q_k, q_{k+1}, p_{k+1})$, respectively. The fact that $f_k^{\pm}\big|_{(q_k, q_{k+1})}$ are both embeddings implies furthermore that $\operatorname{rank} B_k^- = \operatorname{rank} B_k^+ = \dim U$. Since we are dealing with a discrete constrained variational problem, the geometric preservation properties are deduced directly by applying the results in [26].

4. Discrete optimal control on Lie groups

The case when the configuration space is a Lie group $G$ is studied next.
Variational integrators for such systems were developed in [30, 5] and the corresponding discrete variational optimal control problems were studied in [21, 3, 18, 38]. Our approach, employing the developments in [18], is to reduce the second order Euler-Lagrange equations on $G$ to first order equations on the Lie algebra $\mathfrak{g}$ and to perform the optimization in this reduced unconstrained space.

Following the developments in Section 2, assume that the Lagrangian $L_d : G \times G \to \mathbb{R}$ is invariant, so that

\[ L_d(g_k, g_{k+1}) = L_d(\bar g g_k, \bar g g_{k+1}) \]

for any element $\bar g \in G$ and $(g_k, g_{k+1}) \in G \times G$. Accordingly, we can define a reduced Lagrangian $l_d : G \to \mathbb{R}$ by

\[ l_d(W_k) = L_d(e, g_k^{-1} g_{k+1}), \]

where $W_k = g_k^{-1} g_{k+1}$ and $e$ is the identity of the Lie group $G$.

The reduced action sum is given by

\[ S_d : G^N \to \mathbb{R}, \qquad (W_0, \dots, W_{N-1}) \mapsto \sum_{k=0}^{N-1} l_d(W_k). \]

Taking variations of $S_d$ and noting that

\[ \delta W_k = -g_k^{-1} (\delta g_k)\, g_k^{-1} g_{k+1} + g_k^{-1} \delta g_{k+1} = -\eta_k W_k + W_k \eta_{k+1}, \]

where $\eta_k = g_k^{-1} \delta g_k$, we arrive at the discrete Euler-Poincaré equations:

\[ (r_{W_k}^* dl_d)(W_k) - (l_{W_{k-1}}^* dl_d)(W_{k-1}) = 0, \qquad k = 1, \dots, N-1, \]

where $l : G \times G \to G$ and $r : G \times G \to G$ are, respectively, the left and right translations of the group (see also [5]). If we denote $\mu_k = (r_{W_k}^* dl_d)(W_k)$, then the discrete Euler-Poincaré equations are rewritten as

\[ \mu_{k+1} = \mathrm{Ad}_{W_k}^* \mu_k, \tag{13} \]

where $\mathrm{Ad}^* : G \times \mathfrak{g}^* \to \mathfrak{g}^*$ is the coadjoint action of $G$ on $\mathfrak{g}^*$. Typically, these equations are known as the discrete Lie-Poisson equations (see [5, 29, 30]).

Consider a mechanical system determined by a Lagrangian $l : \mathfrak{g} \to \mathbb{R}$, where $\mathfrak{g}$ is the Lie algebra of a Lie group $G$, which is also an $n$-dimensional vector space. The continuous external forces are defined as $f : \mathfrak{g} \times U \to \mathfrak{g}^*$. The motion of the mechanical system is described by applying the following principle:

\[ \delta \int_0^T l(\xi(t))\, dt + \int_0^T \langle f(\xi(t), u(t)), \eta(t) \rangle\, dt = 0, \tag{14} \]

for all variations $\delta \xi(t)$ of the form $\delta \xi(t) = \dot\eta(t) + [\xi(t), \eta(t)]$, where $\eta(t)$ is an arbitrary curve on the Lie algebra with $\eta(0) = 0$ and $\eta(T) = 0$ (see [31]). In addition, $\langle \cdot, \cdot \rangle$ is the natural pairing between $\mathfrak{g}^*$ and $\mathfrak{g}$. These equations give us the controlled Euler-Poincaré equations:

\[ \frac{d}{dt}\left(\frac{\delta l}{\delta \xi}\right) = \mathrm{ad}_\xi^* \left(\frac{\delta l}{\delta \xi}\right) + f, \]

where $\mathrm{ad}_\xi \eta = [\xi, \eta]$. The optimal control problem consists of minimizing a given cost functional

\[ \int_0^T C(\xi(t), u(t))\, dt, \tag{15} \]

where $C : \mathfrak{g} \times U \to \mathbb{R}$.

Now, we consider the associated discrete problem. First we replace the Lie algebra $\mathfrak{g}$ by the Lie group $G$ and the continuous curves by sequences $W_0, W_1, \dots, W_N$ (since the Lie algebra is the infinitesimal version of a Lie group, its proper discretization is consequently that Lie group [30, 32]). The discrete Lagrangian $l_d : G \to \mathbb{R}$ is constructed as an approximation of the action integral, that is

\[ l_d(W_k) \approx \int_{kh}^{(k+1)h} l(\xi(t))\, dt. \]
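The variation formula $\delta W_k = -\eta_k W_k + W_k \eta_{k+1}$ used in the derivation of the discrete Euler-Poincaré equations can be checked numerically. The sketch below assumes $G = SO(3)$ represented by $3 \times 3$ rotation matrices, varies $g_k$ as $g_k \exp(\epsilon \eta_k)$ so that $\eta_k = g_k^{-1} \delta g_k$, and compares a central finite difference of $W_k(\epsilon)$ with the formula. The group, the sample data and the helper names are assumptions for illustration only.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def hat(w):
    """Map a vector in R^3 to the corresponding skew-symmetric matrix in so(3)."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def exp_so3(w):
    """Rodrigues formula for the matrix exponential of hat(w)."""
    th = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    K2 = matmul(K, K)
    if th < 1e-12:
        a, b = 1.0, 0.5
    else:
        a, b = math.sin(th) / th, (1.0 - math.cos(th)) / (th * th)
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]

def W_of_eps(g0, g1, eta0, eta1, eps):
    """W(eps) = (g0 exp(eps eta0))^{-1} (g1 exp(eps eta1)); the inverse is the transpose."""
    g0e = matmul(g0, exp_so3([eps * c for c in eta0]))
    g1e = matmul(g1, exp_so3([eps * c for c in eta1]))
    return matmul(transpose(g0e), g1e)

# Hypothetical group elements and Lie algebra directions.
g0, g1 = exp_so3([0.3, -0.2, 0.5]), exp_so3([-0.1, 0.4, 0.2])
eta0, eta1 = [0.7, 0.1, -0.4], [-0.2, 0.5, 0.3]

eps = 1e-5
Wp, Wm = W_of_eps(g0, g1, eta0, eta1, eps), W_of_eps(g0, g1, eta0, eta1, -eps)
dW_fd = [[(Wp[i][j] - Wm[i][j]) / (2.0 * eps) for j in range(3)] for i in range(3)]

W = matmul(transpose(g0), g1)
left, right = matmul(hat(eta0), W), matmul(W, hat(eta1))
dW_formula = [[-left[i][j] + right[i][j] for j in range(3)] for i in range(3)]

max_err = max(abs(dW_fd[i][j] - dW_formula[i][j]) for i in range(3) for j in range(3))
print(max_err)  # small: finite-difference and formula agree
```

The same check applies to any matrix Lie group once `exp_so3` and the inverse are replaced by the appropriate exponential and inverse.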

Let us define the discrete external forces in the following way: $f_k^\pm : G \times U \to \mathfrak{g}^*$, where $U \subset \mathbb{R}^m$ for $m \le n = \dim \mathfrak{g}$. In consequence,
\[ \langle f_k^-(W_k, u_k^-), \eta_k \rangle + \langle f_k^+(W_k, u_k^+), \eta_{k+1} \rangle \approx \int_{kh}^{(k+1)h} \langle f(\xi(t), u(t)), \eta(t) \rangle\, dt, \]
where $(f_k^-(W_k, u_k^-), f_k^+(W_k, u_k^+)) \in \mathfrak{g}^* \times \mathfrak{g}^*$ and $\eta_k \in \mathfrak{g}$ for all $k$. In addition, $\eta_0 = \eta_N = 0$ and $\langle \cdot, \cdot \rangle$ is the natural pairing between $\mathfrak{g}^*$ and $\mathfrak{g}$. For the sake of simplicity we will sometimes omit the dependence on $G \times U$ of both $f_k^+$ and $f_k^-$. Taking all the previous into account, we derive a discrete version of the Lagrange-d'Alembert principle for Lie groups:
\[ \delta \sum_{k=0}^{N-1} l_d(W_k) + \sum_{k=0}^{N-1} \left( \langle f_k^-, \eta_k \rangle + \langle f_k^+, \eta_{k+1} \rangle \right) = 0, \tag{16} \]
for all variations $\{\delta W_k\}_{k=0,\dots,N-1}$ verifying the relation $\delta W_k = -\eta_k W_k + W_k \eta_{k+1}$, with $\{\eta_k\}_{k=1,\dots,N-1}$ an arbitrary sequence of elements of $\mathfrak{g}$ and $\eta_0 = \eta_N = 0$ (see [21, 18]). From this principle it is easy to derive the system of difference equations
\[ l^*_{W_{k-1}} dl_d(W_{k-1}) - r^*_{W_k} dl_d(W_k) + f^+_{k-1}(W_{k-1}, u^+_{k-1}) + f^-_k(W_k, u^-_k) = 0, \tag{17} \]
for $k = 1, \dots, N-1$, which are called the controlled discrete Euler-Poincaré equations. The cost functional (15) is approximated by
\[ C_d(u_k^-, W_k, u_k^+) \approx \int_{kh}^{(k+1)h} C(\xi(t), u(t))\, dt, \tag{18} \]
yielding the discrete cost functional
\[ J = \sum_{k=0}^{N-1} C_d(u_k^-, W_k, u_k^+). \tag{19} \]
Observe that now $C_d : U \times G \times U \to \mathbb{R}$.

4.1. Fully Actuated Systems.

In the fully actuated case the mappings $f_k^\pm|_W : U \to \mathfrak{g}^*$ defined by $f_k^\pm|_W(u) = f_k^\pm(W, u)$ are diffeomorphisms for all $W \in G$. Therefore, we can construct the Lagrangian $L_d : \mathfrak{g}^* \times G \times \mathfrak{g}^* \to \mathbb{R}$ by
\[ L_d(\nu_k, W_k, \nu_{k+1}) = C_d\Big( (f_k^-|_{W_k})^{-1}\big(r^*_{W_k} dl_d(W_k) - \nu_k\big),\; W_k,\; (f_k^+|_{W_k})^{-1}\big(-l^*_{W_k} dl_d(W_k) + \nu_{k+1}\big) \Big), \tag{20} \]
where the variables $\nu_k, \nu_{k+1} \in \mathfrak{g}^*$ are defined by
\[ \nu_k = r^*_{W_k} dl_d(W_k) - f_k^-(W_k, u_k^-), \qquad \nu_{k+1} = l^*_{W_k} dl_d(W_k) + f_k^+(W_k, u_k^+). \tag{21} \]
The discrete phase space $\mathfrak{g}^* \times G \times \mathfrak{g}^*$ is now a mixture of two copies of the dual Lie algebra $\mathfrak{g}^*$ and a Lie group $G$. This is also an example of a Lie groupoid [24].

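The discrete optimal control problem defined in (16) and (18) has been reduced to a Lagrangian one, with Lagrangian function $L_d : \mathfrak{g}^* \times G \times \mathfrak{g}^* \to \mathbb{R}$. In consequence, we are able to apply discrete variational calculus to obtain the discrete equations of motion in the phase space $\mathfrak{g}^* \times G \times \mathfrak{g}^*$. Let us show how to derive these equations from a variational point of view (see [24] for further details). Define first the discrete action sum
\[ S_d = \sum_{k=0}^{N-1} L_d(\nu_k, W_k, \nu_{k+1}). \]
Consider sequences of the type $\{(\nu_k, W_k, \nu_{k+1})\}_{k=0,\dots,N-1}$ with boundary conditions: $\nu_0$, $\nu_N$ and the composition $W = W_0 W_1 \cdots W_{N-2} W_{N-1}$ fixed. Therefore an arbitrary variation of this sequence has the form
\[ \{(\nu_k(\epsilon),\; h_k^{-1}(\epsilon)\, W_k\, h_{k+1}(\epsilon),\; \nu_{k+1}(\epsilon))\}_{k=0,\dots,N-1}, \]
with $\epsilon \in (-\delta, \delta) \subset \mathbb{R}$ (both $\epsilon$ and $\delta > 0$ are real parameters) and $\nu_0(\epsilon) = \nu_0$, $\nu_k(0) = \nu_k$, $\nu_N(\epsilon) = \nu_N$, $h_k(\epsilon) \in G$ and $h_0(\epsilon) = h_N(\epsilon) = e$ for all $\epsilon$. Additionally, $h_k(0) = e$ for all $k$. The critical points of the discrete action sum subject to the previous boundary conditions are characterized by
\[ 0 = \frac{d}{d\epsilon}\Big|_{\epsilon=0} \sum_{k=0}^{N-1} L_d\big(\nu_k(\epsilon), h_k^{-1}(\epsilon) W_k h_{k+1}(\epsilon), \nu_{k+1}(\epsilon)\big) \]
\[ = \frac{d}{d\epsilon}\Big|_{\epsilon=0} \Big\{ L_d\big(\nu_0, W_0 h_1(\epsilon), \nu_1(\epsilon)\big) + L_d\big(\nu_1(\epsilon), h_1^{-1}(\epsilon) W_1 h_2(\epsilon), \nu_2(\epsilon)\big) + \dots + L_d\big(\nu_{N-2}(\epsilon), h_{N-2}^{-1}(\epsilon) W_{N-2} h_{N-1}(\epsilon), \nu_{N-1}(\epsilon)\big) + L_d\big(\nu_{N-1}(\epsilon), h_{N-1}^{-1}(\epsilon) W_{N-1}, \nu_N\big) \Big\}. \]
Taking derivatives we obtain
\[ 0 = \sum_{k=1}^{N-1} \Big[ l^*_{W_{k-1}} dL_{d(\nu_{k-1},\nu_k)}(W_{k-1}) - r^*_{W_k} dL_{d(\nu_k,\nu_{k+1})}(W_k) \Big] \delta h_k + \sum_{k=1}^{N-1} \Big[ D_2 L_{d(W_{k-1})}(\nu_{k-1}, \nu_k) + D_1 L_{d(W_k)}(\nu_k, \nu_{k+1}) \Big] \delta\nu_k, \]
where $L_{d(W)} : \mathfrak{g}^* \times \mathfrak{g}^* \to \mathbb{R}$ and $L_{d(\nu,\nu')} : G \to \mathbb{R}$ are defined by
\[ L_{d(W)}(\nu, \nu') = L_{d(\nu,\nu')}(W) = L_d(\nu, W, \nu'), \]
with $W \in G$ and $\nu, \nu' \in \mathfrak{g}^*$. Since $\delta h_k$ (defined as $\frac{d h_k}{d\epsilon}\big|_{\epsilon=0}$) and $\delta\nu_k$ (defined as $\frac{d \nu_k}{d\epsilon}\big|_{\epsilon=0}$), $k = 1, \dots, N-1$, are arbitrary, we deduce the following discrete equations of motion:
\[ \begin{aligned} l^*_{W_{k-1}} dL_{d(\nu_{k-1},\nu_k)}(W_{k-1}) - r^*_{W_k} dL_{d(\nu_k,\nu_{k+1})}(W_k) &= 0, \\ D_2 L_{d(W_{k-1})}(\nu_{k-1}, \nu_k) + D_1 L_{d(W_k)}(\nu_k, \nu_{k+1}) &= 0, \end{aligned} \tag{22} \]
for $k = 1, \dots, N-1$. Similarly to §3.1, we obtain the control inputs $u_k^-$ and $u_k^+$ using (21).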

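Define the following discrete Legendre transformations:
\[ \mathbb{F}^- L_d : \mathfrak{g}^* \times G \times \mathfrak{g}^* \to \mathfrak{g}^* \times T^*\mathfrak{g}^*, \qquad (\nu_{k-1}, W_{k-1}, \nu_k) \mapsto \big( r^*_{W_{k-1}} dL_{d(\nu_{k-1},\nu_k)}(W_{k-1}),\; -D_1 L_{d(W_{k-1})}(\nu_{k-1}, \nu_k) \big), \]
and
\[ \mathbb{F}^+ L_d : \mathfrak{g}^* \times G \times \mathfrak{g}^* \to \mathfrak{g}^* \times T^*\mathfrak{g}^*, \qquad (\nu_{k-1}, W_{k-1}, \nu_k) \mapsto \big( l^*_{W_{k-1}} dL_{d(\nu_{k-1},\nu_k)}(W_{k-1}),\; D_2 L_{d(W_{k-1})}(\nu_{k-1}, \nu_k) \big). \]
These relationships implicitly define the discrete Hamiltonian evolution operator
\[ \gamma_{L_d} : \mathfrak{g}^* \times T^*\mathfrak{g}^* \to \mathfrak{g}^* \times T^*\mathfrak{g}^*, \qquad \big( r^*_{W_{k-1}} dL_{d(\nu_{k-1},\nu_k)},\; -D_1 L_{d(W_{k-1})} \big) \mapsto \big( l^*_{W_{k-1}} dL_{d(\nu_{k-1},\nu_k)},\; D_2 L_{d(W_{k-1})} \big), \]
which verifies
\[ \gamma^*_{L_d} \{F, G\}_{\mathfrak{g}^* \times T^*\mathfrak{g}^*} = \{ \gamma^*_{L_d} F, \gamma^*_{L_d} G \}_{\mathfrak{g}^* \times T^*\mathfrak{g}^*}. \]
The Poisson bracket is specified in canonical coordinates $(z_i, y_i, p^i)$ on $\mathfrak{g}^* \times T^*\mathfrak{g}^*$ by
\[ \{z_i, z_j\} = C^k_{ij} z_k, \qquad \{z_i, y_j\} = \{z_i, p^j\} = \{y_i, y_j\} = \{p^i, p^j\} = 0, \qquad \{y_i, p^j\} = \delta_i^j, \]
where all brackets are taken on $\mathfrak{g}^* \times T^*\mathfrak{g}^*$ and $C^k_{ij}$ are the structure constants of the Lie algebra $\mathfrak{g}$ once a basis $\{e_i\}$, $1 \le i \le \dim \mathfrak{g}$, has been fixed. Here, $z_i$ and $y_i$ denote the induced coordinates on $\mathfrak{g}^*$, and $(y_i, p^i)$ the coordinates on $T^*\mathfrak{g}^*$.

4.2. Underactuated Systems.

The underactuated case can now be considered by adding constraints. Similarly to §3.3, underactuation restricts the control forces to lie in a subspace spanned by the vectors $\{e^s\}$ of the basis $\{e^s, e^\sigma\}$ of $\mathfrak{g}^*$, where $\{s, \sigma\} = 1, \dots, n$. Then
\[ f_k^-(W_k, u_k^-) = a_k^-(W_k) + b_k^-(W_k, u_k^-)_s\, e^s, \qquad f_k^+(W_k, u_k^+) = a_k^+(W_k) + b_k^+(W_k, u_k^+)_s\, e^s, \]
where $a_k^-(W_k), a_k^+(W_k) \in \mathfrak{g}^*$ and $b_k^-(W_k, u_k^-)_s, b_k^+(W_k, u_k^+)_s \in \mathbb{R}$ for all $s$. Additionally, the embedding condition implies that $\operatorname{rank} b_k^- = \operatorname{rank} b_k^+ = \dim U$. Then, taking the dual basis $\{e_s, e_\sigma\}$, we induce the following constraints:
\[ \Phi^-_\sigma(\nu_k, W_k, \nu_{k+1}) = \langle r^*_{W_k} dl_d(W_k) - \nu_k - a_k^-(W_k),\; e_\sigma \rangle = 0, \tag{23a} \]
\[ \Phi^+_\sigma(\nu_k, W_k, \nu_{k+1}) = \langle \nu_{k+1} - l^*_{W_k} dl_d(W_k) - a_k^+(W_k),\; e_\sigma \rangle = 0. \tag{23b} \]
Observe in (23) that, even though the constraints are functions $\Phi^\pm_\sigma : \mathfrak{g}^* \times G \times \mathfrak{g}^* \to \mathbb{R}$, $\Phi^-_\sigma$ does not depend on $\nu_{k+1}$ and $\Phi^+_\sigma$ does not depend on $\nu_k$. Once we have defined the constraints, we can apply the Lagrange multiplier rule in order to solve the underactuated problem.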
Namely, we define the extended Lagrangian as
\[ \widetilde{L}_d(\nu_k, \lambda_k^-, W_k, \nu_{k+1}, \lambda_k^+) = L_d(\nu_k, W_k, \nu_{k+1}) + (\lambda_k^-)^\sigma\, \Phi^-_\sigma(\nu_k, W_k, \nu_{k+1}) + (\lambda_k^+)^\sigma\, \Phi^+_\sigma(\nu_k, W_k, \nu_{k+1}). \tag{24} \]
Defining the discrete action sum
\[ S_d^{\mathrm{under}} = \sum_{k=0}^{N-1} \widetilde{L}_d(\nu_k, \lambda_k^-, W_k, \nu_{k+1}, \lambda_k^+), \]
we obtain the underactuated discrete equations of motion
\[ \begin{aligned} & l^*_{W_{k-1}} dL_{d(\nu_{k-1},\nu_k)}(W_{k-1}) - r^*_{W_k} dL_{d(\nu_k,\nu_{k+1})}(W_k) \\ &\quad + l^*_{W_{k-1}} \Big( (\lambda_{k-1}^-)^\sigma\, d\Phi^-_{\sigma(\nu_{k-1},\nu_k)}(W_{k-1}) + (\lambda_{k-1}^+)^\sigma\, d\Phi^+_{\sigma(\nu_{k-1},\nu_k)}(W_{k-1}) \Big) \\ &\quad - r^*_{W_k} \Big( (\lambda_k^-)^\sigma\, d\Phi^-_{\sigma(\nu_k,\nu_{k+1})}(W_k) + (\lambda_k^+)^\sigma\, d\Phi^+_{\sigma(\nu_k,\nu_{k+1})}(W_k) \Big) = 0, \\ & D_2 L_{d(W_{k-1})}(\nu_{k-1}, \nu_k) + D_1 L_{d(W_k)}(\nu_k, \nu_{k+1}) + \big[ (\lambda_{k-1}^+)^\sigma - (\lambda_k^-)^\sigma \big] e_\sigma = 0, \\ & \Phi^-_\sigma(\nu_k, W_k, \nu_{k+1}) = 0, \\ & \Phi^+_\sigma(\nu_k, W_k, \nu_{k+1}) = 0, \end{aligned} \tag{25} \]
where the subscripts $(W_{k-1})$, $(W_k)$, $(\nu_{k-1}, \nu_k)$, $(\nu_k, \nu_{k+1})$ denote variables that are held fixed.

5. Numerical Methods for Systems on Lie Groups

We now put the discrete optimal control equations (22) and (25) into a form suitable for algorithmic implementation. The numerical methods are constructed using the following guidelines: 1) good approximation of the dynamics and optimality, 2) avoidance of issues with local coordinates, 3) guarantees of numerical robustness and convergence, 4) numerical efficiency. The algorithm developed in this section is derived from a trapezoidal quadrature and approximates the dynamics and optimality conditions to at least second order [32] (requirement 1). In addition, we satisfy requirement 2 for systems on Lie groups by lifting the optimization to the Lie algebra through a retraction map that will be defined in this section. The resulting algorithms are numerically robust in the sense that there are no issues with coordinate singularities, and the dynamics and optimality conditions remain close to their continuous counterparts even at large time steps. Yet, as with any other nonlinear optimization scheme, it is difficult to formally claim that the algorithm will always converge (requirement 3).
Nevertheless, in practice there are only isolated cases for underactuated systems that fail to converge. A remedy for such cases has been suggested in [18]. In general, the resulting algorithms require a small number of iterations, e.g. between 10 and 20, to converge (requirement 4).

Note that the discrete mechanics approach provides an accurate approximation of the dynamics, associated with its momentum and symplectic form (or Poisson bracket) preservation and good energy behavior. Long-time stability, though, becomes less important for optimal control problems with a short time horizon $T$. Yet, the notion of symplectic optimality conditions for two-point boundary value problems likely has deeper implications, e.g. related to the region of attraction and numerical stability of the associated root-finding numerical procedure. Such directions are not explored in this work and require further study.

The optimization variables $W_k$ are regarded as small displacements on the Lie group. Thus, it is possible to express each term through a Lie algebra element that can be regarded as the averaged velocity of this displacement. This is accomplished using a retraction map $\tau : \mathfrak{g} \to G$, which is an analytic local diffeomorphism around the identity such that $\tau(\xi)\tau(-\xi) = e$, where $\xi \in \mathfrak{g}$. Two standard choices for $\tau$ are employed in this work: the exponential map and the Cayley map. Regarding $\xi$ as a velocity, we set the discrete Lagrangian $l_d : G \to \mathbb{R}$ to
\[ l_d(W_k) = h\, l(\xi_k), \qquad \text{where } \xi_k = \tau^{-1}(g_k^{-1} g_{k+1})/h = \tau^{-1}(W_k)/h. \]
The difference $g_k^{-1} g_{k+1} \in G$, which is an element of a nonlinear space, can now be represented by the vector $\xi_k$ in order to enable unconstrained optimization in the linear space $\mathfrak{g}$ for optimal control purposes.

The variational principle will now be expressed in terms of the chosen map $\tau$. The resulting discrete mechanics will thus involve the derivatives of the map, which we define next (see also [6, 14, 18]).

Definition 5.1. Given a map $\tau : \mathfrak{g} \to G$, its right trivialized tangent $d\tau_\xi : \mathfrak{g} \to \mathfrak{g}$ and its inverse $d\tau^{-1}_\xi : \mathfrak{g} \to \mathfrak{g}$ are such that, for $g = \tau(\xi) \in G$ and $\eta \in \mathfrak{g}$, the following holds:
\[ \partial_\xi \tau(\xi)\, \eta = d\tau_\xi\, \eta\; \tau(\xi), \qquad \partial_\xi \tau^{-1}(g)\, \eta = d\tau^{-1}_\xi \big( \eta\; \tau(-\xi) \big). \]
Using these definitions, variations $\delta\xi$ and $\delta g$ are constrained by
\[ \delta\xi_k = d\tau^{-1}_{h\xi_k} \big( -\eta_k + \mathrm{Ad}_{\tau(h\xi_k)} \eta_{k+1} \big)/h, \]
where $\eta_k = g_k^{-1}\delta g_k$, which is obtained by straightforward differentiation of $\xi_k = \tau^{-1}(g_k^{-1} g_{k+1})/h$. The retraction map $\tau$ choices are:

a) The exponential map $\exp : \mathfrak{g} \to G$, defined by $\exp(\xi) = \gamma(1)$, where $\gamma : \mathbb{R} \to G$ is the integral curve through the identity of the vector field associated with $\xi \in \mathfrak{g}$ (hence, with $\dot{\gamma}(0) = \xi$). The right trivialized derivative and its inverse are defined by
\[ \mathrm{dexp}_x\, y = \sum_{j=0}^{\infty} \frac{1}{(j+1)!}\, \mathrm{ad}_x^j\, y, \qquad \mathrm{dexp}^{-1}_x\, y = \sum_{j=0}^{\infty} \frac{B_j}{j!}\, \mathrm{ad}_x^j\, y, \]
where $B_j$ are the Bernoulli numbers (see [11]). Typically, these expressions are truncated in order to achieve a desired order of accuracy.

b) The Cayley map $\mathrm{cay} : \mathfrak{g} \to G$ is defined by
\[ \mathrm{cay}(\xi) = \left(e - \frac{\xi}{2}\right)^{-1} \left(e + \frac{\xi}{2}\right) \]
and is valid for a general class of quadratic groups (see [11]) that include the groups of interest in this paper (e.g. $SO(3)$, $SE(2)$ and $SE(3)$). Its right trivialized derivative and inverse are defined by
\[ \mathrm{dcay}_x\, y = \left(e - \frac{x}{2}\right)^{-1} y \left(e + \frac{x}{2}\right)^{-1}, \qquad \mathrm{dcay}^{-1}_x\, y = \left(e - \frac{x}{2}\right) y \left(e + \frac{x}{2}\right). \]

Next, the discrete forces and cost function are defined through a trapezoidal approximation, i.e.
\[ f_k^\pm(\xi_k, u_k^\pm) = \frac{h}{2} f(\xi_k, u_k^\pm) \]
and
\[ C_d(u_k^-, \xi_k, u_k^+) = \frac{h}{2} C(\xi_k, u_k^-) + \frac{h}{2} C(\xi_k, u_k^+), \]
respectively. With the choice of a retraction map and the trapezoidal rule, the equations of motion (13) become
\[ \begin{aligned} \mu_k - \mathrm{Ad}^*_{\tau(h\xi_{k-1})} \mu_{k-1} &= \frac{h}{2} f(\xi_k, u_k^-) + \frac{h}{2} f(\xi_{k-1}, u_{k-1}^+), \\ \mu_k &= (d\tau^{-1}_{h\xi_k})^*\, \partial_\xi l(\xi_k), \\ g_{k+1} &= g_k\, \tau(h\xi_k), \end{aligned} \]
while the momenta defined in (21) take the form
\[ \nu_k = \mu_k - \frac{h}{2} f(\xi_k, u_k^-), \tag{26} \]
\[ \nu_{k+1} = \mathrm{Ad}^*_{\tau(h\xi_k)} \mu_k + \frac{h}{2} f(\xi_k, u_k^+). \tag{27} \]
Finally, define the Lagrangian $\ell_d : \mathfrak{g}^* \times \mathfrak{g} \times \mathfrak{g}^* \to \mathbb{R}$ such that
\[ \ell_d(\nu, \xi, \nu') = L_d(\nu, \tau(h\xi), \nu'). \]
Note that the Lagrangian is well-defined only on $\mathfrak{g}^* \times U_{\mathfrak{g}} \times \mathfrak{g}^*$, where $U_{\mathfrak{g}} \subset \mathfrak{g}$ is an open neighborhood around the identity for which $\tau$ is a diffeomorphism. To keep the notation as simple as possible, we retain the Lagrangian definition on the full space $\mathfrak{g}^* \times \mathfrak{g} \times \mathfrak{g}^*$. The optimality conditions corresponding to (22) become
\[ (d\tau^{-1}_{-h\xi_{k-1}})^*\, d\ell_{d(\nu_{k-1},\nu_k)}(\xi_{k-1}) - (d\tau^{-1}_{h\xi_k})^*\, d\ell_{d(\nu_k,\nu_{k+1})}(\xi_k) = 0, \tag{28} \]
\[ D_2 \ell_{d(\xi_{k-1})}(\nu_{k-1}, \nu_k) + D_1 \ell_{d(\xi_k)}(\nu_k, \nu_{k+1}) = 0, \tag{29} \]
for $k = 1, \dots, N-1$. Here, $\ell_{d(\xi)}(\nu, \nu') = \ell_{d(\nu,\nu')}(\xi) = \ell_d(\nu, \xi, \nu')$. Equations (28) and (29) can also be obtained from (22) by employing Lemma 8.2 and Lemma 8.3 in Appendix A.
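The truncation of the series in item a) can be illustrated on $\mathfrak{so}(3)$. The sketch below is not taken from the paper: it assumes the identification of $\mathfrak{so}(3)$ with $\mathbb{R}^3$, under which $\mathrm{ad}_x y = x \times y$, uses the convention $B_1 = -1/2$ for the Bernoulli numbers, and truncates both series at an illustrative order of at most 6. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

# Bernoulli numbers B_0..B_6 with the convention B_1 = -1/2.
BERNOULLI = [Fraction(1), Fraction(-1, 2), Fraction(1, 6), Fraction(0),
             Fraction(-1, 30), Fraction(0), Fraction(1, 42)]

def cross(a, b):
    """Cross product, i.e. ad_a b under the R^3 identification of so(3)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def ad_power_series(coeffs, x, y):
    """Evaluate sum_j coeffs[j] * ad_x^j y for a finite list of coefficients."""
    total = [0.0, 0.0, 0.0]
    term = list(y)                      # ad_x^0 y = y
    for c in coeffs:
        total = [t + float(c) * v for t, v in zip(total, term)]
        term = cross(x, term)           # next power of ad_x applied to y
    return total

def dexp(x, y, order=6):
    """Truncated right trivialized tangent of exp: sum_j ad_x^j y / (j+1)!."""
    return ad_power_series([Fraction(1, factorial(j + 1)) for j in range(order + 1)], x, y)

def dexp_inv(x, y, order=6):
    """Truncated inverse: sum_j B_j / j! ad_x^j y (order must not exceed 6 here)."""
    return ad_power_series([BERNOULLI[j] / factorial(j) for j in range(order + 1)], x, y)

# Composing the two truncated maps should return y up to the truncation error.
x = [0.1, -0.2, 0.15]
y = [1.0, 0.5, -0.3]
roundtrip = dexp(x, dexp_inv(x, y))
error = max(abs(a - b) for a, b in zip(roundtrip, y))
```

For small $x$ the residual `error` is dominated by the first neglected power of $\mathrm{ad}_x$, which is why a low truncation order is usually sufficient when $x = h\xi_k$ with a moderate time step $h$.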

In the underactuated case we define
\[ \ell_d(\nu, \xi, \nu', \lambda^-, \lambda^+) = L_d(\nu, \tau(h\xi), \nu') + (\lambda^-)^\sigma\, \Phi^-_{\sigma(\nu,\nu')}(\tau(h\xi)) + (\lambda^+)^\sigma\, \Phi^+_{\sigma(\nu,\nu')}(\tau(h\xi)), \tag{30} \]
and from (25) obtain the equations
\[ \begin{aligned} & (d\tau^{-1}_{-h\xi_{k-1}})^*\, d\ell_{d(\nu_{k-1},\nu_k,\lambda^\pm_{k-1})}(\xi_{k-1}) - (d\tau^{-1}_{h\xi_k})^*\, d\ell_{d(\nu_k,\nu_{k+1},\lambda^\pm_k)}(\xi_k) = 0, \\ & D_2 L_{d(\tau(h\xi_{k-1}))}(\nu_{k-1}, \nu_k) + D_1 L_{d(\tau(h\xi_k))}(\nu_k, \nu_{k+1}) + \lambda^+_{k-1} - \lambda^-_k = 0, \\ & \Phi^-_\sigma(\nu_k, \tau(h\xi_k), \nu_{k+1}) = 0, \\ & \Phi^+_\sigma(\nu_k, \tau(h\xi_k), \nu_{k+1}) = 0, \end{aligned} \tag{31} \]
where we employed the notation $\lambda^\pm := (\lambda^\pm)^\sigma e_\sigma$.

Boundary Conditions. Establishing the exact relationship between the discrete and continuous momenta, $\mu_k$ and $\mu(t) = \partial_\xi l(\xi(t))$, respectively, is particularly important for properly enforcing boundary conditions that are given in terms of continuous quantities. The following equations relate the momenta at the initial and final times $t = 0$ and $t = T$ and are used to transform between the continuous and discrete representations:
\[ \begin{aligned} \mu_0 - \partial_\xi l(\xi(0)) &= \frac{h}{2} f(\xi(0), u_0^-), \\ \partial_\xi l(\xi(T)) - \mathrm{Ad}^*_{\tau(h\xi_{N-1})} \mu_{N-1} &= \frac{h}{2} f(\xi(T), u_N^+), \end{aligned} \]
which also correspond to the relations $\nu_0 = \partial_\xi l(\xi(0))$ and $\nu_N = \partial_\xi l(\xi(T))$. These equations can also be regarded as structure-preserving velocity boundary conditions, i.e., for given fixed velocities $\xi(0)$ and $\xi(T)$.

The exact form of the previous equations depends on the choice of $\tau$. This choice will also influence the computational efficiency of the optimization framework when the above equalities are enforced as constraints. The numerical procedure to compute the trajectory is summarized as follows:

Algorithm 5.2. Optimal control

Data: group $G$; mechanical Lagrangian $l$; control functions $a$, $b$; cost function $C$; final time $T$; number of segments $N$.
1) Input: boundary conditions $(g(0), \xi(0))$ and $(g(T), \xi(T))$.
2) Set momenta $\nu_0 = \partial_\xi l(\xi(0))$ and $\nu_N = \partial_\xi l(\xi(T))$.
3) Solve for $(\xi_0, \dots, \xi_{N-1}, \nu_1, \dots, \nu_{N-1}, \lambda^\pm_1, \dots, \lambda^\pm_{N-1})$ the relations:
   equations (31) for all $k = 1, \dots, N-1$,
   $\tau^{-1}\big( \tau(h\xi_{N-1})^{-1} \cdots \tau(h\xi_0)^{-1}\, g(0)^{-1} g(T) \big) = 0$.
4) Output: optimal sequence of velocities $\xi_0, \dots, \xi_{N-1}$.
5) Reconstruct path $g_0, \dots, g_N$ by $g_{k+1} = g_k \tau(h\xi_k)$ for $k = 0, \dots, N-1$.

The solution is computed using a root-finding procedure such as Newton's method. If the initial guess does not satisfy the dynamics, we recommend using a Levenberg-Marquardt algorithm, which has slower but more robust convergence.
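Step 5 of Algorithm 5.2 can be sketched for $G = SO(3)$ with $\tau = \mathrm{cay}$. The code below is only an illustration under stated assumptions: it uses the closed form of the Cayley map on $SO(3)$ quoted in §6.1 (equation (34)), identifies $\mathfrak{so}(3)$ with $\mathbb{R}^3$, starts from the identity, and feeds in an arbitrary velocity sequence rather than one produced by the optimization. The function names are hypothetical.

```python
def hat(w):
    """Map a vector in R^3 to the corresponding skew-symmetric matrix in so(3)."""
    return [[0.0, -w[2], w[1]],
            [w[2], 0.0, -w[0]],
            [-w[1], w[0], 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def cay(w):
    """Cayley map on SO(3): I + 4/(4 + |w|^2) * (hat(w) + hat(w)^2 / 2)."""
    W = hat(w)
    W2 = matmul(W, W)
    s = 4.0 / (4.0 + sum(c * c for c in w))
    return [[(1.0 if i == j else 0.0) + s * (W[i][j] + 0.5 * W2[i][j])
             for j in range(3)] for i in range(3)]

def reconstruct(g0, velocities, h):
    """Rebuild the path g_0, ..., g_N from g_{k+1} = g_k cay(h xi_k)."""
    path = [g0]
    for xi in velocities:
        path.append(matmul(path[-1], cay([h * c for c in xi])))
    return path

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
velocities = [[0.5, -1.0, 2.0], [1.5, 0.3, -0.7], [-0.2, 0.9, 0.4]]  # illustrative xi_k
path = reconstruct(identity, velocities, h=0.1)

# The final configuration stays on the group: g_N^T g_N = I up to rounding.
g_N = path[-1]
g_N_T = [list(row) for row in zip(*g_N)]
gram = matmul(g_N_T, g_N)
orthogonality_error = max(abs(gram[i][j] - identity[i][j])
                          for i in range(3) for j in range(3))
```

Because every update multiplies by a group element, the reconstructed configurations remain rotation matrices for any step size $h$, which is the sense in which the lifting to the Lie algebra avoids local coordinates.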

5.1. Example: optimal control effort.

Consider a Lagrangian consisting of the kinetic energy only,
\[ l(\xi) = \frac{1}{2} \langle I(\xi), \xi \rangle, \]
full unconstrained actuation, no potential or external forces and no velocity constraint. The map $I : \mathfrak{g} \to \mathfrak{g}^*$ is called the inertia tensor and is assumed to be full rank. In the fully actuated case we have $f(\xi_k, u_k^\pm) \equiv u_k^\pm$. We consider a minimum effort control problem, i.e. $C(\xi, u) = \frac{1}{2}\|u\|^2$. The optimal control problem for fixed initial and final states $(g(0), \xi(0))$ and $(g(T), \xi(T))$ can now be summarized as:

Compute: $\xi_{0:N-1}$, $u^\pm_{0:N}$,
minimizing:
\[ \frac{h}{4} \sum_{k=0}^{N-1} \left( \|u_k^-\|^2 + \|u_k^+\|^2 \right), \]
subject to:
\[ \begin{aligned} & \mu_0 - I(\xi(0)) = \frac{h}{2} u_0^-, \\ & \mu_k - \mathrm{Ad}^*_{\tau(h\xi_{k-1})} \mu_{k-1} = \frac{h}{2}\left( u_k^- + u_{k-1}^+ \right), \quad k = 1, \dots, N-1, \\ & I(\xi(T)) - \mathrm{Ad}^*_{\tau(h\xi_{N-1})} \mu_{N-1} = \frac{h}{2} u_N^+, \\ & \mu_k = (d\tau^{-1}_{h\xi_k})^*\, I(\xi_k), \\ & g_{k+1} = g_k\, \tau(h\xi_k), \quad k = 0, \dots, N-1, \\ & \tau^{-1}(g_N^{-1} g(T)) = 0. \end{aligned} \]
The optimality conditions for this problem are derived as follows. The Lagrangian becomes
\[ \ell_d(\nu_k, \xi_k, \nu_{k+1}) = \frac{1}{4h} \left( \left\| \nu_k - (d\tau^{-1}_{h\xi_k})^* I(\xi_k) \right\|^2 + \left\| \nu_{k+1} - (d\tau^{-1}_{-h\xi_k})^* I(\xi_k) \right\|^2 \right), \]
where the momentum has been computed according to
\[ \nu_k = \frac{1}{2} \left( (d\tau^{-1}_{h\xi_k})^* I(\xi_k) + (d\tau^{-1}_{-h\xi_{k-1}})^* I(\xi_{k-1}) \right). \tag{32} \]
Thus the optimality conditions become
\[ \begin{aligned} & (d\tau^{-1}_{h\xi_k})^*\, d\ell_{d(\nu_k,\nu_{k+1})}(\xi_k) - (d\tau^{-1}_{-h\xi_{k-1}})^*\, d\ell_{d(\nu_{k-1},\nu_k)}(\xi_{k-1}) = 0, \quad k = 1, \dots, N-1, \\ & \tau^{-1}\big( \tau(h\xi_{N-1})^{-1} \cdots \tau(h\xi_0)^{-1}\, g_0^{-1} g(T) \big) = 0. \end{aligned} \]
It is important to note that these last two equations define $N \cdot n$ equations in the $N \cdot n$ unknowns $\xi_{0:N-1}$. A solution can be found using nonlinear root finding. Once $\xi_{0:N-1}$ have been computed, it is possible to obtain the final configuration $g_N$ by reconstructing the curve from these velocities. Moreover, the boundary condition $g(T)$ is enforced through the relation $\tau^{-1}(g_N^{-1} g(T)) = 0$ without the need to optimize over any of the configurations $g_k$.

5.2. Extension: the configuration-dependent case.

The developed framework can be extended to a configuration-dependent Lagrangian $L : G \times \mathfrak{g} \to \mathbb{R}$, for instance defined in terms of a kinetic energy $K : \mathfrak{g} \to \mathbb{R}$ and a potential energy $V : G \to \mathbb{R}$ according to
\[ L(g, \xi) = K(\xi) - V(g), \]
where $g \in G$ and $\xi \in \mathfrak{g}$. The controlled Euler-Poincaré equations are in this case
\[ \dot{\mu} - \mathrm{ad}^*_\xi \mu = -g^* \partial_g V(g) + f, \qquad \mu = \partial_\xi K(\xi), \qquad \dot{g} = g\, \xi, \]
where the external forces are defined as $f : G \times \mathfrak{g} \times U \to \mathfrak{g}^*$. Our discretization choice $L_d : G \times G \to \mathbb{R}$ will be (recall that $\xi_k = \tau^{-1}(g_k^{-1} g_{k+1})/h$)
\[ L_d(g_k, g_{k+1}) = \frac{h}{2} L(g_k, \xi_k) + \frac{h}{2} L(g_{k+1}, \xi_k) = h\, K(\xi_k) - h\, \frac{V(g_k) + V(g_{k+1})}{2}, \]
while the $G$-dependent discrete forces now become
\[ f_k^-(g_k, \xi_k, u_k^-) = \frac{h}{2} f(g_k, \xi_k, u_k^-), \qquad f_k^+(g_{k+1}, \xi_k, u_k^+) = \frac{h}{2} f(g_{k+1}, \xi_k, u_k^+). \]
This leads to the discrete equations
\[ \begin{aligned} \mu_k - \mathrm{Ad}^*_{\tau(h\xi_{k-1})} \mu_{k-1} &= -h\, g_k^* \partial_g V(g_k) + \frac{h}{2} f(g_k, \xi_k, u_k^-) + \frac{h}{2} f(g_k, \xi_{k-1}, u_{k-1}^+), \\ \mu_k &= (d\tau^{-1}_{h\xi_k})^*\, \partial_\xi K(\xi_k), \\ g_{k+1} &= g_k\, \tau(h\xi_k). \end{aligned} \]
The momenta become
\[ \begin{aligned} \nu_k &= \mu_k + \frac{h}{2} g_k^* \partial_g V(g_k) - \frac{h}{2} f(g_k, \xi_k, u_k^-), \\ \nu_{k+1} &= \mathrm{Ad}^*_{\tau(h\xi_k)} \mu_k - \frac{h}{2} g_{k+1}^* \partial_g V(g_{k+1}) + \frac{h}{2} f(g_{k+1}, \xi_k, u_k^+). \end{aligned} \]
In consequence, we can define a discrete Lagrangian $L_d : \mathfrak{g}^* \times G \times \mathfrak{g} \times \mathfrak{g}^* \to \mathbb{R}$, depending on the variables $(\nu_k, g_k, \xi_k, \nu_{k+1})$, whose discrete equations of motion will be a mixture of (22) and (28), (29), namely

\[ \begin{aligned} & D_2 L_{d(g_{k-1},\xi_{k-1})}(\nu_{k-1}, \nu_k) + D_1 L_{d(g_k,\xi_k)}(\nu_k, \nu_{k+1}) = 0, \\ & l^*_{g_{k-1}}\, dL_{d(\nu_{k-1},\xi_{k-1},\nu_k)}(g_{k-1}) + r^*_{g_k}\, dL_{d(\nu_k,\xi_k,\nu_{k+1})}(g_k) \\ &\quad + (d\tau^{-1}_{-h\xi_{k-1}})^*\, dL_{d(\nu_{k-1},g_{k-1},\nu_k)}(\xi_{k-1}) - (d\tau^{-1}_{h\xi_k})^*\, dL_{d(\nu_k,g_k,\nu_{k+1})}(\xi_k) = 0. \end{aligned} \]

6. Applications

6.1. Underwater Vehicle.

We illustrate the developed algorithm with an application to a simulated unmanned underwater vehicle. Figure 1 shows the model equipped with five thrusters which produce forces and torques in all directions but the body-fixed $y$-axis. Since the input directions span only a five-dimensional subspace, the problem is solved through the underactuated framework.

Figure 1. An underwater vehicle model (a) and various computed optimal trajectories between chosen states (b): reconfigurations, drag flip, and parallel parking. Only a few frames along the path are shown for clarity. [Panel (a) labels the body axes $x$, $y$, $z$, the propeller thrusts $u_1, \dots, u_5$, and the moment arms $c$, $d$.]

Figure 2. Details of the computed optimal path for the reconfiguration maneuver given in Figure 1. [Panels: angular velocity $\omega_x, \omega_y, \omega_z$ (rad/s); linear velocity $v_x, v_y, v_z$ (m/s); control inputs 1 to 5 (N); each plotted against time (sec.).]

The vehicle configuration space is $G = SE(3)$. We make the identification $SE(3) \approx SO(3) \times \mathbb{R}^3$ using elements $R \in SO(3)$ and $x \in \mathbb{R}^3$ through
\[ g = \begin{pmatrix} R & x \\ 0_{1\times 3} & 1 \end{pmatrix}, \qquad g^{-1} = \begin{pmatrix} R^T & -R^T x \\ 0_{1\times 3} & 1 \end{pmatrix}, \]

where $g \in SE(3)$. Elements of the Lie algebra $\xi \in \mathfrak{se}(3)$ are identified with body-fixed angular and linear velocities, denoted $\omega \in \mathbb{R}^3$ and $v \in \mathbb{R}^3$ respectively, through
\[ \xi = \begin{pmatrix} \hat{\omega} & v \\ 0_{1\times 3} & 0 \end{pmatrix}, \]
where the map $\hat{\cdot} : \mathbb{R}^3 \to \mathfrak{so}(3)$ is defined by
\[ \hat{\omega} = \begin{pmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{pmatrix}. \tag{33} \]
The algorithm is thus implemented in terms of vectors in $\mathbb{R}^6$ rather than matrices in $\mathfrak{se}(3)$.

The map $\tau = \mathrm{cay} : \mathfrak{se}(3) \to SE(3)$ is chosen instead of the exponential, since it results in a more computationally efficient implementation. It is defined by
\[ \mathrm{cay}(\xi) = \begin{pmatrix} \mathrm{cay}(\hat{\omega}) & \mathrm{dcay}_\omega\, v \\ 0 & 1 \end{pmatrix}, \]
where $\mathrm{cay} : \mathfrak{so}(3) \to SO(3)$ is given by
\[ \mathrm{cay}(\hat{\omega}) = I_3 + \frac{4}{4 + \|\omega\|^2} \left( \hat{\omega} + \frac{\hat{\omega}^2}{2} \right), \tag{34} \]
where $I_n$ is the $n \times n$ identity matrix and $\mathrm{dcay} : \mathbb{R}^3 \to \mathbb{R}^3$ is defined by
\[ \mathrm{dcay}_\omega = \frac{2}{4 + \|\omega\|^2} \left( 2 I_3 + \hat{\omega} \right). \tag{35} \]
(Note that $\mathrm{cay}$ denotes a map to either $SO(3)$ or $SE(3)$, which should be clear from its argument.) The matrix representation of the right-trivialized tangent inverse $d\tau^{-1}_{(\omega,v)} : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3 \times \mathbb{R}^3$ becomes
\[ [\mathrm{dcay}^{-1}_{(\omega,v)}] = \begin{bmatrix} I_3 - \frac{1}{2}\hat{\omega} + \frac{1}{4}\omega\omega^T & 0_3 \\ -\frac{1}{2}\left( I_3 - \frac{1}{2}\hat{\omega} \right)\hat{v} & I_3 - \frac{1}{2}\hat{\omega} \end{bmatrix}. \tag{36} \]
The vehicle inertia tensor $I$ is computed assuming a cylindrical mass distribution with mass $m = 3$ kg. The control basis vectors are $\{e^s\}_{s=1}^5 = \{e^1, e^2, e^3, e^4, e^6\}$, while the non-actuated direction is $e^\sigma = e^5$, where $e^i$ is the $i$-th standard basis vector of $\mathbb{R}^6$. The control functions take the form
\[ \begin{aligned} b(W, u)_1 &= d\,(u_5 - u_4), \\ b(W, u)_2 &= c\,\big( (u_1 + u_2)/2 - u_3 \big), \\ b(W, u)_3 &= c \sin(\pi/3)\,(u_2 - u_1), \\ b(W, u)_4 &= u_1 + u_2 + u_3, \\ b(W, u)_5 &= u_4 + u_5, \\ a(W) &= H\, \tau^{-1}(W), \end{aligned} \]
where $H$ is a negative definite viscous drag matrix and the constants $c$, $d$ are the lengths of the thrusting torque moment arms (see Figure 1).

We are interested in computing a minimum control effort trajectory between two given boundary states, i.e. conditions on both the configurations and velocities. Such a cost function is defined in §5.1. The optimal control problem is solved using equations (31). The computation is performed using Algorithm 5.2. Figure 2 shows the computed velocities and controls for the reconfiguration trajectory shown in Figure 1. The algorithm requires between 10 and 20 iterations, depending on the boundary conditions, when applied to $N = 32$ segments.

Figure 3. An optimal trajectory of an underactuated rigid body on $SO(3)$ (a). The body is controlled using two force inputs around the body-fixed $x$ and $y$ axes. A discontinuous optimal trajectory (b) which our algorithm can handle. [Panel (b) plots the forces $u_x$, $u_y$ (N) against time (sec.).]

6.2. Discontinuous Control.

One of the advantages of employing the discrete variational framework is the treatment of discontinuous control inputs, as illustrated in Figure 3. The nature of the control curve depends on the cost function. In the standard squared control effort case (i.e. the $L_2$ control curve norm employed in §6.1) the resulting control is smooth. Another cost function of interest is $\int_0^T \|u(t)\|\, dt$, which is typically imposed along with the constraints $u_{\min} \le u(t) \le u_{\max}$. This case results in a discontinuous optimal control curve. Our formulation can handle such problems easily, since the terms $u_{k-1}^+$ and $u_k^-$ are regarded as the forces before and after time $t_k$, respectively. A computed scenario of a rigid body actuated with two control torques around its principal axes of inertia (Fig. 3) illustrates the discontinuous case.

7. Extensions

The methods developed in the previous sections are easily adapted to other cases which are of interest in practical applications. In particular, this section is devoted to the discussion of two important extensions: the case of optimal control problems for Lagrangians of the type $l : TM \times \mathfrak{g} \to \mathbb{R}$ (that is, reduction by symmetries on a trivial principal fiber bundle) and the case of nonholonomic systems. Here, $M$ denotes a smooth manifold.
Observe that the phase space $TM \times \mathfrak{g}$ unifies the previously studied cases of a tangent bundle and a Lie algebra. The notion of principal fiber bundle is present in many locomotion and robotic systems [7, 4, 35]. When the configuration manifold is $Q = M \times G$, there exists a canonical splitting between variables describing the position and variables describing the orientation of the mechanical system. Then, we distinguish the pose coordinates $g \in G$ (the elements in the Lie algebra will be denoted by $\xi \in \mathfrak{g}$), and the variables describing the internal shape of the system, that is $x \in M$ (in consequence $(x, \dot{x}) \in TM$). Observe that Lagrangians of the type $l : TM \times \mathfrak{g} \to \mathbb{R}$ mainly appear as reductions of Lagrangians of the type $L : T(M \times G) \to \mathbb{R}$ which are invariant under the action of the Lie group $G$. Under the identification $T(M \times G)/G \equiv TM \times \mathfrak{g}$ we obtain the reduced Lagrangian $l$.

We first develop the discrete optimal control problem for systems in an unconstrained principal bundle setting in §7.1. Nonholonomic constraints are then added to treat the more general case of locomotion systems in §7.2.

7.1. Discrete Optimal Control on Principal Bundles.

The discrete case is modeled by a Lagrangian $l_d : M \times M \times G \to \mathbb{R}$ which is an approximation of the action integral in one time step,
\[ l_d(x_k, x_{k+1}, W_k) \approx \int_{hk}^{h(k+1)} l\big(x(t), \dot{x}(t), \xi(t)\big)\, dt, \]
where $(x_k, x_{k+1}) \in M \times M$ and $W_k \in G$. Again, we define the discrete control forces according to $f_k^\pm : M \times M \times G \times U \to T^*M \times \mathfrak{g}^*$, where $U \subset \mathbb{R}^m$:
\[ \begin{aligned} f_k^-(x_k, x_{k+1}, W_k, u_k^-) &= \big( \bar{f}_k^-(x_k, x_{k+1}, W_k, u_k^-),\; \hat{f}_k^-(x_k, x_{k+1}, W_k, u_k^-) \big), \\ f_k^+(x_k, x_{k+1}, W_k, u_k^+) &= \big( \bar{f}_k^+(x_k, x_{k+1}, W_k, u_k^+),\; \hat{f}_k^+(x_k, x_{k+1}, W_k, u_k^+) \big), \end{aligned} \]
where $f_k^- \in T^*_{x_k} M \times \mathfrak{g}^*$ and $f_k^+ \in T^*_{x_{k+1}} M \times \mathfrak{g}^*$ (more concretely, $\bar{f}_k^- \in T^*_{x_k} M$, $\bar{f}_k^+ \in T^*_{x_{k+1}} M$, $\hat{f}_k^- \in \mathfrak{g}^*$, $\hat{f}_k^+ \in \mathfrak{g}^*$). Similarly to the developments in §3 and §4.1, we can formulate the discrete Lagrange-d'Alembert principle:
\[ \delta \sum_{k=0}^{N-1} l_d(x_k, x_{k+1}, W_k) + \sum_{k=0}^{N-1} \Big( \langle f_k^-, (\delta x_k, \eta_k) \rangle + \langle f_k^+, (\delta x_{k+1}, \eta_{k+1}) \rangle \Big) = 0, \]
which can be rewritten as
\[ \delta \sum_{k=0}^{N-1} l_d(x_k, x_{k+1}, W_k) + \sum_{k=0}^{N-1} \Big( \bar{f}_k^-\, \delta x_k + \bar{f}_k^+\, \delta x_{k+1} \Big) + \sum_{k=0}^{N-1} \Big( \langle \hat{f}_k^-, \eta_k \rangle + \langle \hat{f}_k^+, \eta_{k+1} \rangle \Big) = 0, \]
for all variations $\{\delta x_k\}_{k=0}^{N}$ with $\delta x_k \in T_{x_k} M$ and $\delta x_0 = \delta x_N = 0$, and all variations $\{\delta W_k\}_{k=0}^{N}$ with $\delta W_k \in T_{W_k} G$ such that $\delta W_k = -\eta_k W_k + W_k \eta_{k+1}$, where $\{\eta_k\}_{k=0}^{N}$ is a sequence of independent elements of $\mathfrak{g}$ such that $\eta_0 = \eta_N = 0$.