Introduction to Riemannian and Sub-Riemannian geometry

Size: px
Start display at page:

Download "Introduction to Riemannian and Sub-Riemannian geometry"

Transcription

1 Introduction to Riemannian and Sub-Riemannian geometry from Hamiltonian viewpoint andrei agrachev davide barilari ugo boscain This version: November 2, 216 Preprint SISSA 9/212/M

2 2

3 Contents Introduction 1 1 Geometry of surfaces in R Geodesics and optimality Existence and minimizing properties of geodesics Absolutely continuous curves Parallel transport Gauss-Bonnet Theorems Gauss-Bonnet theorem: local version Gauss-Bonnet theorem: global version Consequences of the Gauss-Bonnet Theorems The Gauss map Surfaces in R 3 with the Minkowski inner product Model spaces of constant curvature Zero curvature: the Euclidean plane Positive curvature: spheres Negative curvature: the hyperbolic plane Vector fields Differential equations on smooth manifolds Tangent vectors and vector fields Flow of a vector field Vector fields as operators on functions Nonautonomous vector fields Differential of a smooth map Lie brackets Frobenius theorem Cotangent space Vector bundles Submersions and level sets of smooth maps Sub-Riemannian structures Basic definitions The minimal control and the length of an admissible curve Equivalence of sub-riemannian structures

4 3.1.3 Examples Every sub-riemannian structure is equivalent to a free one Proto sub-riemannian structures Sub-Riemannian distance and Chow-Rashevskii Theorem Proof of Chow-Raschevskii Theorem Existence of length-minimizers Pontryagin extremals The energy functional Proof of Theorem A Measurability of the minimal control A.1 Main lemma A.2 Proof of Lemma B Lipschitz vs Absolutely continuous admissible curves Characterization and local minimality of Pontryagin extremals Geometric characterization of Pontryagin extremals Lifting a vector field from M to T M The Poisson bracket Hamiltonian vector fields The symplectic structure The symplectic form vs the Poisson bracket Characterization of normal and abnormal extremals Normal extremals Abnormal extremals Example: codimension one distribution and contact distributions Examples D Riemannian Geometry Isoperimetric problem Heisenberg group Lie derivative Symplectic geometry Local minimality of normal trajectories The Poincaré-Cartan one form Normal trajectories are geodesics Integrable systems Reduction of Hamiltonian systems with symmetries Example of symplectic reduction: the space of affine lines in R n Riemannian geodesic flow on hypersurfaces Geodesics on hypersurfaces Riemannian geodesic flow and symplectic reduction Sub-Riemannian structures with symmetries Completely integrable systems Arnold-Liouville theorem Geodesic flows on quadrics

5 6 Chronological calculus Motivation Duality On the notation Topology on the set of smooth functions Family of functionals and operators Operator ODE and Volterra expansion Volterra expansion Adjoint representation Variations Formulae A Estimates and Volterra expansion B Remainder term of the Volterra expansion Lie groups and left-invariant sub-riemannian structures Sub-groups of Diff(M) generated by a finite dimensional Lie algebra of vector fields Proof of Proposition Passage to infinite dimension Lie groups and Lie algebras Example: matrix Lie groups Lie groups as group of diffeomorphisms The matrix notation Bi-invariant pseudo-metrics The Levi-Malcev decomposition Product of Lie groups Trivialization of TG and T G Left-invariant sub-riemannian structures Existence of minimizers step Carnot groups Pontryagin extremals for 2-step Carnot groups Left-invariant Hamiltonian systems on Lie groups Vertical coordinates in TG and T G Left-invariant Hamiltonians First integrals for Hamiltonian systems on Lie groups* Integrability of left invariant sub-riemannian structures on 3D Lie groups* Normal Extremals for left-invariant sub-riemannian structures Explicit expression of normal Pontryagin extremals in the d s case Example: The d s problem on SO(3) Explicit expression of normal Pontryagin extremals in the k z case Rolling spheres (3,5) - Rolling sphere with twisting (2,3,5) - Rolling without twisting Euler s cvrvae elasticae

6 8 End-point and Exponential map The end-point map and its differential* Lagrange multipliers rule Pontryagin extremals via Lagrange multipliers Critical points and second order conditions The manifold of Lagrange multipliers Sub-Riemannian case Free initial point problem Exponential map Conjugate points and minimality of extremal trajectories* Local minimality of normal extremal trajectories in the uniform topology Global minimizers* An example: the first conjugate locus on perturbed sphere D-Almost-Riemannian Structures Basic Definitions and properties How big is the singular set? Genuinely 2D-almost-Riemannian structures have always infinite area Normal Pontryagin extremals The Grushin plane Normal Pontryagin extremals of the Grushin plane Riemannian, Grushin and Martinet points Normal forms* Generic 2D-almost-Riemannian structures Proof of the genericity result A Gauss-Bonnet theorem Proof of Theorem 9.44* Construction of trivializable 2-ARSs with no tangency points Nonholonomic tangent space Jet spaces Admissible variations Nilpotent approximation and privileged coordinates Existence of privileged coordinates Geometric meaning Algebraic meaning Normal forms for nilpotent approximation in low dimension* Examples* Regularity of the sub-riemannian distance General properties of the distance function Regularity of the sub-riemannian distance Locally Lipschitz functions and maps Locally Lipschitz map and Lipschitz submanifolds A non-smooth version of Sard Lemma Regularity of sub-riemannian spheres

7 11.5 Geodesic completeness and Hopf-Rinow theorem Abnormal extremals and second variation Second variation Abnormal extremals and regularity of the distance Goh and generalized Legendre conditions Proof of Goh condition - (i) of Theorem Proof of generalized Legendre condition - (ii) of Theorem More on Goh and generalized Legendre conditions Rank 2 distributions and nice abnormal extremals Optimality of nice abnormal in rank 2 structures Conjugate points along abnormals Abnormals in dimension Higher dimension Equivalence of local minimality Some model spaces* Carnot groups of step 2* Multi-dimensional Heisenberg groups* Free Carnot groups of step 2* Some non-equiregular nilpotent structures* Grushin* Martinet* Some non-nilpotent left invariant structures* SU(2) SO(3),SL(2)* SE(2)* Curves in the Lagrange Grassmannian The geometry of the Lagrange Grassmannian The Lagrange Grassmannian Regular curves in Lagrange Grassmannian Curvature of a regular curve Reduction of non-regular curves in Lagrange Grassmannian Ample curves From ample to regular Conjugate points in L(Σ) Comparison theorems for regular curves Jacobi curves From Jacobi fields to Jacobi curves Jacobi curves Conjugate points and optimality Reduction of the Jacobi curves by homogeneity

8 16 Riemannian curvature Ehresmann connection Curvature of an Ehresmann connection Linear Ehresmann connections Covariant derivative and torsion for linear connections Riemannian connection Relation with Hamiltonian curvature Locally flat spaces Example: curvature of the 2D Riemannian case Curvature in 3D contact sub-riemannian geometry D contact sub-riemannian manifolds Canonical frames Curvature of a 3D contact structure Application: classification of 3D left-invariant structures Proof of Theorem Case χ > Case χ = Proof of Theorem Asymptotic expansion of the 3D contact exponential map Nilpotent case General case: second order asymptotic expansion Some more details on the computations* General case: higher order asymptotic expansion Proof of Theorem 18.6: asymptotics of the exponential map Asymptotics of the conjugate locus Asymptotics of the conjugate length Stability of the conjugate locus The volume in sub-riemannian geometry The Popp volume Popp volume for equiregular sub-riemannian manifolds A formula for Popp volume Popp volume and isometries Hausdorff dimension and Hausdorff volume* Density of the Hausdorff volume with respect to a smooth volume* The sub-riemannian heat equation The heat equation The heat equation in the Riemannian context The heat equation in the sub-riemannian context Few properties of the sub-riemannian Laplacian: the Hörmander theorem and the existence of the heat kernel The heat equation in the non-lie-bracket generating case The heat-kernel on the Heisenberg group

9 2.2.1 The Heisenberg group as a group of matrices The heat equation on the Heisenberg group A Hermite polynomials* 445 B Elliptic functions* 447 C Structural equations for curves in Lagrange Grassmannian* 449 Index 449 Bibliography 452 9

10 1

11 Introduction Thisbookconcerns a freshdevelopment of theeternal ideaof thedistance as thelength of a shortest path. In Euclidean geometry, shortest paths are segments of straight lines that satisfy all classical axioms. In the Riemannian world, Euclidean geometry is just one of a huge amount of possibilities. However, each of these possibilities is well approximated by Euclidean geometry at very small scale. In other words, Euclidean geometry is treated as geometry of initial velocities of the paths starting from a fixed point of the Riemannian space rather than the geometry of the space itself. The Riemannian construction was based on the previous study of smooth surfaces in the Euclidean space undertaken by Gauss. The distance between two points on the surface is the length of a shortest path on the surface connecting the points. Initial velocities of smooth curves starting from a fixed point on the surface form a tangent plane to the surface, that is an Euclidean plane. Tangent planes at two different points are isometric, but neighborhoods of the points on the surface are not locally isometric in general; certainly not if the Gaussian curvature of the surface is different at the two points. Riemann generalized Gauss construction to higher dimensions and realized that it can be done in an intrinsic way; you do not need an ambient Euclidean space to measure the length of curves. Indeed, to measure the length of a curve it is sufficient to know the Euclidean length of its velocities. A Riemannian space is a smooth manifold whose tangent spaces are endowed with Euclidean structures; each tangent space is equipped with its own Euclidean structure that smoothly depends on the point where the tangent space is attached. For a habitant sitting at a point of the Riemannian space, tangent vectors give directions where to move or, more generally, to send and receive information. He measures lengths of vectors, and angles between vectors attached at the same point, according to the Euclidean rules, and this is essentially all what he can do. The point is that our habitant can, in principle, completely recover the geometry of the space by performing these simple measurements along different curves. In the sub-riemannian space we cannot move, receive and send information in all directions. There are restictions (imposed by the God, the moral imperative, the government, or simply a physical law). A sub-riemannian space is a smooth manifold with a fixed admissible subspace in any tangent space where admissible subspaces are equipped with Euclidean structures. Admissible paths are those curves whose velocities are admissible. The distance between two points is the infimum of the length of admissible paths connecting the points. It is assumed that any pair of points in the same connected component of the manifold can be connected by at least an admissible path. The last assumption might look strange at a first glance, but it is not. The admissible subspace depends on the point where it is attached, and our assumption is satisfied for a more or less general smooth dependence on the point; better to say that it is not satisfied only for very special families of admissible subspaces. Let us describe a simple model. Let our manifold be R 3 with coordinates x,y,z. We consider 11

12 the differential 1-form ω = dz (xdy ydx). Then dω = dx dy is the pullback on R3 of the area form on the xy-plane. In this model the subspace of admissible velocities at the point (x,y,z) is assumed to be the kernel of the form ω. In other words, a curve t (x(t),y(t),z(t)) is an admissible path if and only if ż(t) = 1 2 (y(t)ẋ(t) x(t)ẏ(t)). The length of an admissible tangent vector (ẋ,ẏ,ż) is defined to be (ẋ 2 +ẏ 2 ) 1 2, that is the length of the projection of the vector to the xy-plane. We see that any smooth planar curve (x(t),y(t)) has a unique admissible lift (x(t),y(t),z(t)) in R 3, where: z(t) = 1 2 t x(s)ẏ(s) ẋ(s)y(s) ds. If x() = y() =, then z(t) is the signed area of the domain boundedby the curve and the segment connecting (, ) with (x(t), y(t)). By construction, the sub-riemannian length of the admissible curve in R 3 is equal to the Euclidean length of its projection to the plane. We see that sub-riemannian shortest paths are lifts to R 3 of the solutions to the classical Dido isoperimetric problem: find a shortest planar curve among those connecting (,) with (x 1,y 1 ) and such that the signed area of the domain bounded by the curve and the segment joining (,) and (x 1,y 1 ) is equal to z 1 (see Figure 1). z (x(t),y(t),z(t)) y (x(t),y(t)) x Figure 1: The Dido problem Solutions of the Dido problem are arcs of circles and their lifts to R 3 are spirals where z(t) is the area of the piece of disc cut by the hord connecting (,) with (x(t),y(t)). A piece of such a spiral is a shortest admissible path between its endpoints while the planar projection of this piece is an arc of the circle. The spiral ceases to be a shortest path when its planar projection starts to run the circle for the second time, i.e. when the spiral starts its second turn. Sub-Riemannian balls centered at the origin for this model look like apples with singularities at the poles (see Figure 3). Singularities are points on the sphere connected with the center by more than one shortest path. The dilation (x,y,z) (rx,ry,r 2 z) transforms the ball of radius 1 into the ball of radius r. In particular, arbitrary small balls have singularities. This is always the case when admissible subspaces are proper subspaces. Another important symmetry connects balls with different centers. Indeed, the product operation (x,y,z) (x,y,z ) =. ( x+x, y +y, z +z + 1 ) 2 (xy x y) 12

13 z y x Figure 2: Solutions to the Dido problem Figure 3: The Heisenberg sub-riemannian sphere turns R 3 into a group, the Heisenberg group. The origin in R 3 is the unit element of this group. It is easy to see that left translations of the group transform admissible curves into admissible ones and preserve the sub-riemannian length. Hence left translations transform balls in balls of the same radius. A detailed description of this example and other models of sub-riemannian spaces is done in Section 1.7 and Chapter 13. Actually, even this simplest model tells us something about life in a sub-riemannian space. Here we deal with planar curves but, in fact, operate in the three-dimensional space. Sub-Riemannian spacesalways haveakindofhiddenextradimension. Agoodandnotyet exploited sourceformystic speculations but also for theoretical physicists who are always searching new crazy formalizations. In mechanics, this is a natural geometry for systems with nonholonomic constraints like skates, wheels, rolling balls, bearings etc. This kind of geometry could also serve to model social behavior that allows to increase the level of freedom without violation of a restrictive legal system. Anyway, in this book we perform a purely mathematical study of sub-riemannian spaces to provide an appropriate formalization ready for all eventual applications. Riemannian spaces appear as a very special case. Of course, we are not the first to study the sub-riemannian stuff. There is a broad literature even if there are few experts who could claim that sub-riemannian geometry is his main field of expertise. Important motivations come from CR geometry, hyperbolic geometry, 13

14 analysis of hypoelliptic operators, and some other domains. Our first motivation was control theory: length minimizing is a nice class of optimal control problems. Indeed, one can find a control theory spirit in our treatment of the subject. First of all, we include admissible paths in admissible flows that are flows generated by vector fields whose values in all points belong to admissible subspaces. The passage from admissible subspaces attached at different points of the manifold to a globally defined space of admissible vector fields makes the structure more flexible and well-adapted to algebraic manipulations. We pick generators f 1,...,f k of the space of admissible fields, and this allows us to describe all admissible paths as solutions to time-varying ordinary differential equations of the form: q(t) = k i=1 u i(t)f i (q(t)). Different admissible paths correspond to the choice of different control functions u i ( ) and initial points q() while the vector fields f i are fixed at the very beginning. We also use a Hamiltonian approach supported by the Pontryagin maximum principle to characterize shortest paths. Few words about the Hamiltonian approach: sub-riemannian geodesics are admissible paths whose sufficiently small pieces are length-minimizers, i. e. the length of such a piece is equal to the distance between its endpoints. In the Riemannian setting, any geodesic is uniquely determined by its velocity at the initial point q. In the general sub-riemannian situation we have much more geodesics based at the the point q than admissible velocities at q. Indeed, every point in a neighborhood of q can be connected with q by a length-minimizer, while the dimension of the admissible velocities subspace at q is usually smaller than the dimension of the manifold. What is a natural parametrization of the space of geodesics? To understand this question, we adapt a classical trajectory wave front duality. Given a length-parameterized geodesic t γ(t), we expect that the values at a fixed time t of geodesics starting at γ() and close to γ fill a piece of a smooth hypersurface (see Figure 4). For small t this hypersurface is a piece of the sphere of radius t, while in general it is only a piece of the wave front. p(t) γ(t) γ() Figure 4: The wave front and the impulse Moreover, we expect that γ(t) is transversal to this hypersurface. It is not always the case but this is true for a generic geodesic. The impulse p(t) Tγ(t) M is the covector orthogonal to the wave front and normalized by the condition p(t), γ(t) = 1. The curve t (p(t),γ(t)) in the cotangent bundle T M satisfies a Hamiltonian system. This is exactly what happens in rational mechanics or geometric optics. The sub-riemannian Hamiltonian H : T M R is defined by the formula H(p,q) = 1 2 p,v 2, where p Tq M, and v T qm is an admissible velocity of length 1 that maximizes the inner product of p with admissible velocities of length 1 at q M. Any smooth function on the cotangent bundle defines a Hamiltonian vector field and such a 14

15 field generates a Hamiltonian flow. The Hamiltonian flow on T M associated to H is the sub- Riemannian geodesic flow. The Riemannian geodesic flow is just a special case. As we mentioned, in general, the construction described above cannot be applied to all geodesics: the so-called abnormal geodesics are missed. An abnormal geodesic γ(t) also possesses its impulse p(t) Tγ(t) M but this impulse belongs to the orthogonal complement to the subspace of admissible velocities and does not satisfy the above Hamiltonian system. Geodesics that are trajectories of the geodesic flow are called normal. Actually, abnormal geodesics belong to the closure of the space of the normal ones, and elementary symplectic geometry provides a uniform characterization of the impulses for both classes of geodesics. Such a characterization is, in fact, a very special case of the Pontryagin maximum principle. Recall that all velocities are admissible in the Riemannian case, and the Euclidean structure on the tangent bundle induces the identification of tangent vectors and covectors, i. e. of the velocities and impulses. We should however remember that this identification depends on the metric. One can think to a sub-riemannian metric as the limit of a family of Riemannian metrics when the length of forbidden velocities tends to infinity, while the length of admissible velocities remains untouched. It is easy to see that the Riemannian Hamiltonians defined by such a family converge with all derivatives to the sub-riemannian Hamiltonian. Hence the Riemannian geodesics with a prescribed initial impulse converge to the sub-riemannian geodesic with the same initial impulse. On the other hand, we cannot expect any reasonable convergence for the family of Riemannian geodesics with a prescribed initial velocity: those with forbidden initial velocities disappear at the limit while geodesics with admissible initial velocities multiply. Outline of the book WestartinChapter1fromsurfacesinR 3 thatisthebeginningofeverythingindifferential geometry and also a starting point of the story told in this book. There are not yet Hamiltonians here, but a control flavor is already present. The presentation is elementary and self-contained. A student in applied mathematics or analysis who missed the geometry of surfaces at the university or simply is not satisfied by his understanding of these classical ideas, might find it useful to read just this chapter even if he does not plan to study the rest of the book. In Chapter 2, we recall some basic properties of vector fields and vector bundles. Sub-Riemannian structures are defined in Chapter 3 where we also prove three fundamental facts: the finiteness and the continuity of the sub-riemannian distance; the existence of length-minimizers; the infinitesimal characterization of geodesics. The first is the classical Chow-Rashevski theorem, the second and the third one are simplified versions of the Filippov existence theorem and the Pontryagin maximum principle. In Chapter 4, we introduce the symplectic language. We define the geodesic Hamiltonian flow, we consider an interesting class of three-dimensional problems and we prove a general sufficient condition for length-minimality of normal trajectories. Chapter 5 is devoted to applications to integrable Hamiltonian systems. We explain the construction of the action-angle coordinates and we describe classical examples of integrable geodesic flows, such as the geodesic flow on ellipsoids. Chapters 1 5 form a first part of the book where we do not use any tool from functional analysis. In fact, even the knowledge of the Lebesgue integration and elementary real analysis are not essential with a unique exception of the existence theorem in Section 3.3. In all other places the reader can substitute terms Lipschitz and absolutely continuous by piecewise C 1 and 15

16 measurable by piecewise continuous without a loss for the understanding. We start to use some basic functional analysis in Chapter 6. In this chapter, we give elements of an operator calculus that simplifies and clarifies calculations with non-stationary flows, their variations and compositions. In Chapter 7, we give a brief introduction to the Lie group theory. Lie groups are introduced as subgroups of the groups of diffeomorphisms of a manifold M induced by a family of vector fields whose Lie algebra is finite dimensional. Then we study left-invariant sub-riemannian structures and their geodesics. In Chapter 8, we interpret the impulses as Lagrange multipliers for constrained optimization problems and apply this point of view to the sub-riemannian case. We also introduce the sub- Riemannian exponential map and we study conjugate points. In Chapter 9, we consider two-dimensional sub-riemannian metrics; such a metric differs from a Riemannian one only along a one-dimensional submanifold. In Chapter 1, we construct the nonholonomic tangent space at a point q of the manifold: a first quasi-homogeneous approximation of the space if you observe and exploit it from q by means of admissible paths. In general, such a tangent space is a homogeneous space of a nilpotent Lie group equipped with an invariant vector distribution; its structure may depend on the point where the tangent space is attached. At generic points, this is a nilpotent Lie group endowed with a left-invariant vector distribution. The construction of the nonholonomic tangent space does not need a metric; if we take into account the metric, we obtain the Gromov Hausdorff tangent to the sub-riemannian metric space. Useful ball-box estimates of small balls follow automatically. In Chapter 11, we study general analytic properties of the sub-riemannian distance as a function of points of the manifold. It is shown that the distance is smooth on an open dense subset and is semi-concave out of the points connected by abnormal length-minimizers. Moreover, generic sphere is a Lipschitz submanifold if we remove these bad points. In Chapter 12, we turn to abnormal geodesics, which provide the deepest singularities of the distance. Abnormal geodesics are critical points of the endpoint map defined on the space of admissible paths, and the main tool for their study is the Hessian of the endpoint map. Chapter 13 is devoted to the explicit calculation of the sub-riemannian distance for model spaces. This is the end of the second part of the book; next few chapters are devoted to the curvature and its applications. Let Φ t : T M T M, for t R, be a sub-riemannian geodesic flow. Submanifolds Φ t (TqM), q M, form a fibration of T M. Given λ T M, let J λ (t) T λ (T M) be the tangent space to the leaf of this fibration. Recall that Φ t is a Hamiltonian flow and Tq M are Lagrangian submanifolds; hence the leaves of our fibrations are Lagrangian submanifolds and J λ (t) is a Lagrangian subspace of the symplectic space T λ (T M). In other words, J λ (t) belongs to the Lagrangian Grassmannian of T λ (T M), and t J λ (t) is a curve in the Lagrangian Grassmannian, a Jacobi curve of the sub-riemannian structure. The curvature of the sub-riemannian space at λ is simply the curvature of this curve in the Lagrangian Grassmannian. Chapter 14 is devoted to the elementary differential geometry of curves in the Lagrangian Grassmannian; in Chapter 15 we apply this geometry to Jacobi curves. The language of Jacobi curves is translated to the traditional language in the Riemannian case in Chapter 16. We recover the Levi Civita connection and the Riemannian curvature and demonstrate their symplectic meaning. In Chapter 17, we explicitly compute the sub-riemannian curvature for contact three-dimensional spaces and we show how the curvature invariants appear 16

17 in the classification of sub-riemannian left-invariant structures on 3D Lie groups. In the next Chapter 18 we study the small distance asymptotics of the expowhree-dimensional contact case and see how the structure of the conjugate locus is encoded in the curvature. In the last Chapter 2 we define the sub-riemannian Laplace operator, the canonical volume form, and compute the density of the sub-riemannian Hausdorff measure. We conclude with a discussion of the sub-riemannian heat equation and an explicit formula for the heat kernel in the three-dimensional Heisenberg case. We finish here this introduction into the Introduction...We hope that the reader won t be bored; comments to the chapters contain suggestions for further reading. 1 1 This research has been supported by the European Research Council, ERC StG 29 GeCoMethods, contract number and by the ANR project SRGI Sub-Riemannian Geometry and Interactions, contract number ANR-15-CE

18 18

19 Chapter 1 Geometry of surfaces in R 3 In this preliminary chapter we study the geometry of smooth two dimensional surfaces in R 3 as a heating problem and we recover some classical results. In the fist part of the chapter we consider surfaces in R 3 endowed with the standard Euclidean product, which we denote by. In the second part we study surfaces in the Minskowski space, that is R 3 endowed with a sign-indefinite inner product, which we denote by h Definition 1.1. A surface of R 3 is a subset M R 3 such that for every q M there exists a neighborhood U R 3 of q and a smooth function a : U R such that U M = a 1 () and a on U M. 1.1 Geodesics and optimality Let M R 3 be a surface and γ : [,T] M be a smooth curve in M. The length of γ is defined as l(γ) := T where v = v v denotes the norm of a vector in R 3. γ(t) dt. (1.1) Remark 1.2. Notice that the definition of length in (1.1) is invariant by reparametrizations of the curve. Indeed let ϕ : [,T ] [,T] be a monotone smooth function. Define γ ϕ : [,T ] M by γ ϕ := γ ϕ. Using the change of variables t = ϕ(s), one gets l(γ ϕ ) = T γ ϕ (s) ds = T γ(ϕ(s)) ϕ(s) ds = T γ(t) dt = l(γ). The definition of length can be extended to piecewise smooth curves on M, by adding the length of every smooth piece of γ. When the curve γ is parametrized in such a way that γ(t) c for some c > we say that γ has constant speed. If moreover c = 1 we say that γ is parametrized by length. The distance between two points p,q M is the infimum of length of curves that join p to q d(p,q) = inf{l(γ), γ : [,T] M piecewise smooth, γ() = p,γ(t) = q}. (1.2) Now we focus on length-minimizers, i.e., piece-wise smooth curves that realize the distance between their endpoints: l(γ) = d(γ(), γ(t)). 19

20 γ(t) γ(t) T γ(t) M γ(t) M Figure 1.1: A smooth minimizer Exercise 1.3. Prove that, if γ : [,T] M is a length-minimizer, then the curve γ [t1,t 2 ] is also a length-minimizer, for all < t 1 < t 2 < T. The following proposition characterizes smooth minimizers. We prove later that all minimizers are smooth (cf. Corollary 1.15). Proposition 1.4. Let γ : [, T] M be a smooth minimizer parametrized by length. Then γ(t) T γ(t) M for all t [,T]. Proof. Consider a smooth non-autonomous vector field (t,q) f t (q) T q M that extends the tangent vector to γ in a neighborhood W of the graph of the curve {(t,γ(t)) R M}, i.e. f t (γ(t)) = γ(t) and f t (q) 1, (t,q) W. Let now (t,q) g t (q) T q M be a smooth non-autonomous vector field such that f t (q) and g t (q) define a local orthonormal frame in the following sense f t (q) g t (q) =, g t (q) 1, (t,q) W. Piecewise smooth curves parametrized by length on M are solutions of the following ordinary differential equation ẋ(t) = cosu(t)f t (x(t))+sinu(t)g t (x(t)), (1.3) for some initial condition x() = q and some piecewise continuous function u(t), which we call control. The curve γ is the solution to (1.3) associated with the control u(t) and initial condition γ(). Let us consider the family of controls {, t < τ u τ,s (t) = τ T, s R (1.4) s, t τ and denote by x τ,s (t) the solution of (1.3) that corresponds to the control u τ,s (t) and with initial condition x τ,s () = γ(). 2

21 Lemma 1.5. For every τ 1,τ 2,t [,T] the following vectors are linearly dependent s x τ1,s(t) s= s x τ2,s(t) (1.5) s= Proof. By Exercice 1.3 is not restrictive to assume t = T. Fix τ 1 τ 2 T and consider the family of curves φ(t;h 1,h 2 ) solutions of (1.3) associated with controls, t [,τ 1 [, v h1,h 2 (t) = h 1, t [τ 1,τ 2 [, h 1 +h 2, t [τ 2,T +ε[, where h 1,h 2 belong to a neighborhood of and ε is small enough (to guarantee the existence of the trajectory). Notice that φ is smooth in a neighborhood of (t,h 1,h 2 ) = (T,,) and φ h i = (h1,h 2 )= s x τi,s(t), i = 1,2. s= By contradiction assume that the vectors in (1.5) are linearly independent. Then φ h is invertible and the classical implicit function theorem applied to the map (t,h 1,h 2 ) φ(t;h 1,h 2 ) at the point (T,,) implies that there exists δ > such that t ]T δ,t +δ[, h 1,h 2, s.t. φ(t;h 1,h 2 ) = γ(t), In particular there exists a curve with unit speed joining γ() and γ(t) in time t < T, which gives a contradiction, since γ is a minimizer. Lemma 1.6. For every τ,t [,T] the following identity holds s x τ,s (t) γ(t) =. (1.6) s= Proof. If t τ, then by construction (cf. (1.4)) the first vector is zero since there is no variation w.r.t. s and the conclusion follows. Let us now assume that t > τ. Again, by Remark 1.3, it is sufficient to prove the statement at t = T. Let us write the Taylor expansion of ψ(t) = s s= x τ,s (t) in a right neighborhood of t = τ. Observe that, for t τ Hence Then, for t τ, we have ψ(τ) = s x τ,s (τ) =, s= ẋ τ,s = cos(s)f t (x τ,s )+sin(s)g t (x τ,s ). ψ(τ) = s ẋ τ,s (τ) = g τ (x τ,s (τ)). s= ψ(t) = (t τ)g τ (x τ,s (τ))+o((t τ) 2 ). (1.7) For τ sufficiently close to T, one can take t = T in (1.7). Passing to the limit for τ T one gets 1 T τ s x τ,s (T) g T (γ(t)). s= τ T Now, by Lemma 1.5 all vectors in left hand side are parallel among them, hence they are parallel to g T (γ(t)). The lemma is proved since γ(t) = f T (γ(t)) and f T and g T are orthogonal. 21

22 Now we end the proposition by showing that γ(t) T γ(t) M. Notice that this is equivalent to show γ(t) f t (γ(t)) = γ(t) g t (γ(t)) =. (1.8) Recall that γ(t) γ(t) = 1. Differentiating this identity one gets = d γ(t) γ(t) = 2 γ(t) γ(t), dt which shows that γ(t) is orthogonal to f t (γ(t)). Next, differentiating (1.6) with respect to t, we have 1 for t τ s ẋ τ,s (t) + γ(t) s= s x τ,s (t) γ(t) =. (1.9) s= Now, from ẋ τ,s (t) ẋ τ,s (t) = 1 one gets sẋτ,s(t) ẋτ,s(t) =, for t τ. Evaluating at s =, using that x τ, (t) = γ(t), one has s ẋ τ,s (t) γ(t) =, for t τ. s= Hence, by (1.9), it follows that s x τ,s (t) γ(t) =, s= which, by continuity, holds for every t [,T]. Using that s s= x τ,s (t) is parallel to g t (γ(t)) (see proof of Lemma 1.6), it follows that g t (γ(t)) γ(t) =. Definition 1.7. Asmooth curveγ : [,T] M parametrized with constant speediscalled geodesic if it satisfies γ(t) T γ(t) M, t [,T]. (1.1) Proposition 1.4 says that a smooth curve that minimizes the length is a geodesic. Now we get an explicit characterization of geodesics when the manifold M is globally defined as the zero level of a smooth function. In other words there exists a smooth function a : R 3 R such that M = a 1 (), and a on M. (1.11) Remark 1.8. Recall that for all q M it holds q a T q M. Indeed, for every q M and v T q M, let γ : [,T] M be a smooth curve on M such that γ() = q and γ() = v. By definition of M one has a(γ(t)) =. Differentiating this identity with respect to t at t = one gets q a v =. Proposition 1.9. A smooth curve γ : [,T] M is a geodesic if and only if it satisfies, in matrix notation: γ(t) = γ(t)t ( 2 γ(t) a) γ(t) γ(t) a 2 γ(t) a, t [,T], (1.12) where 2 γ(t) a is the Hessian matrix of a. 1 notice that x τ,s is smooth on the set [,T]\{τ}. 22

23 Proof. Differentiating the equality γ(t) a γ(t) = we get, in matrix notation: γ(t) T ( 2 γ(t) a) γ(t)+ γ(t)t γ(t) a =. By definition of geodesic there exists a function b(t) such that γ(t) = b(t) γ(t) a. Hence we get from which (1.12) follows. γ(t) T ( 2 γ(t) a) γ(t)+b(t) γ(t)a 2 =, Remark 1.1. Notice that formula (1.12) is always true locally since, by definition of surface, the assumptions (1.11) are always satisfied locally Existence and minimizing properties of geodesics As a direct consequence of Proposition 1.9 one gets the following existence and uniqueness theorem for geodesics. Corollary Let q M and v T q M. There exists a unique geodesic γ : [,ε] M, for ε > small enough, such that γ() = q and γ() = v. Proof. By Proposition 1.9, geodesics satisfy a second order ODE, hence they are smooth curves, characterized by ther initial position and velocity. To end this section we show that small pieces of geodesics are always global minimizers. Theorem Let γ : [,T] M be a geodesic. For every τ [,T[ there exists ε > such that (i) γ [τ,τ+ε] is a minimizer, i.e. d(γ(τ),γ(τ +ε)) = l(γ [τ,τ+ε] ), (ii) γ [τ,τ+ε] is the unique minimizers joining γ(τ) and γ(τ +ε) in the class of piecewise smooth curves, up to reparametrization. Proof. Without loss of generality let us assume that τ = and that γ is length parametrized. Consider a length-parametrized curve α on M such that α() = γ() and α() γ() and denote by (t,s) x s (t) the smooth variation of geodesics such that x (t) = γ(t) and (see also Figure 1.2) x s () = α(s), ẋ s () α(s). (1.13) The map ψ : (t,s) x s (t) is a local diffeomorphism near (,). Indeed the partial derivatives ψ = ψ t t=s= t x (t) = γ(), = t= s t=s= s x s () = α(), s= are linearly independent. Thus ψ maps a neighborhood U of (, ) on a neighborhood W of γ(). We now consider the function φ and the vector field X defined on W φ : x s (t) t, X : x s (t) ẋ s (t). 23

24 x s (t) γ α(s) Figure 1.2: Proof of Theorem 1.12 Lemma q φ = X(q) for every q W. Proof of Lemma We first show that the two vectors are parallel, and then that they actually coincide. To show that they are parallel, first notice that φ is orthogonal to its level set {t = const}, hence xs(t)φ s x s(t) =, (t,s) U. (1.14) Now, let us show that s x s(t) ẋs(t) =, (t,s) U. (1.15) Computing the derivative with respect to t of the left hand side of (1.15) one gets sẋs(t) ẋs(t) + s x s(t) ẍs(t), which is identically zero. Indeed the first term is zero because ẋ s (t) has unit speed and the second one vanishes because of (1.1). Hence, the left hand side of (1.15) is constant and coincides with its value at t =, which is zero by the orthogonality assumption (1.13). By (1.14) and (1.15) one gets that φ is parallel to X. Actually they coincide since φ X = d dt φ(x s(t)) = 1. Now consider ε > small enough such that γ [,ε] is contained in W and take a piecewise smooth and length parametrized curve c : [,ε ] M contained in W and joining γ() to γ(ε). Let us show that γ is shorter than c. First notice that l(γ [,ε] ) = ε = φ(γ(ε)) = φ(c(ε )) 24

25 Using that φ(c()) = φ(γ()) = and that l(c) = ε we have that l(γ [,ε] ) = φ(c(ε )) φ(c()) = ε ε = = d φ(c(t))dt (1.16) dt ε φ(c(t)) ċ(t) dt The last inequality follows from the Cauchy-Schwartz inequality X(c(t)) ċ(t) dt ε = l(c), (1.17) X(c(t)) ċ(t) X(c(t)) ċ(t) = 1 (1.18) which holds at every smooth point of c(t). In addition, equality in (1.18) holds if and only if ċ(t) = X(c(t)) (at the smooth points of c). Hence we get that l(c) = l(γ [,ε] ) if and only if c coincides with γ [,ε]. Now let us show that thereexists ε εsuch that γ [, ε] is a global minimizer among all piecewise smoothcurvesjoiningγ() toγ( ε). Itisenoughtotake ε < dist(γ(), W). Everycurvethatescape from W has length greater than ε. From Theorem 1.12 it follows Corollary Any minimizer of the distance (in the class of piecewise smooth curves) is a geodesic, and hence smooth Absolutely continuous curves Notice that formula (1.1) defines the length of a curve even in the class of absolutely continuous ones, if one understands the integral in the Lebesgue sense. In this setting, in the proof of Theorem 1.12, one can assume that the curve c is actually absolutely continuous. This proves that small pieces of geodesics are minimizers also in the class of absolutely continuous curves on M. Morever, this proves the following. Corollary Any minimizer of the distance (in the class of absolutely continuous curves) is a geodesic, and hence smooth. 1.2 Parallel transport In this section we want to introduce the notion of parallel transport, which let us to define the main geometric invariant of a surface: the Gaussian curvature. Let us consider a curve γ : [,T] M and a vector ξ T γ() M. We want to define the parallel transport of ξ along γ. Heuristically, it is a curve ξ(t) T γ(t) M such that the vectors {ξ(t),t [,T]} are all parallel. Remark If M = R 2 R 3 is the set {z = } we can canonically identify every tangent space T γ(t) M with R 2 so that every tangent vector ξ(t) belong to the same vector space. 2 In this case, parallel simply means ξ(t) = as an element of R 3. This is not the case if M is a manifold because tangent spaces at different points are different. 2 The canonical isomorphism R 2 T xr 2 is written explicitly as follows: y d dt t= x+ty. 25

26 Definition Let γ : [,T] M be a smooth curve. A smooth curve of tangent vectors ξ(t) T γ(t) M is said to be parallel if ξ(t) T γ(t) M. Assume now that M is the zero level of a smooth function a : R 3 R as in (1.11). We have the following description: Proposition A smooth curve of tangent vectors ξ(t) defined along γ : [,T] M is parallel if and only if it satisfies ξ(t) = γ(t)t ( 2 γ(t) a)ξ(t) γ(t) a 2 γ(t) a, t [,T]. (1.19) Proof. As in Remark 1.8, ξ(t) T γ(t) M implies γ(t) a,ξ(t) =. Moreover, by assumption ξ(t) = α(t) γ(t) a for some smooth function α. With analogous computations as in the proof of Proposition 1.9 we get that from which the statement follows. γ(t) T ( 2 γ(t) a)ξ(t)+α(t) γ(t)a 2 =, Remark Notice that, since (1.53) is a first order linear ODE with respect to ξ, for a given curve γ : [,T] M and initial datum v T γ() M, there is a unique parallel curve of tangent vectors ξ(t) T γ(t) M along γ such that ξ() = v. Since (1.53) is a linear ODE, the operator that associates with every initial condition ξ() the final vector ξ(t) is a linear operator, which is called parallel transport. Next we state a key property of the parallel transport. Proposition 1.2. The parallel transport preserves the inner product. In other words, if ξ(t), η(t) are two parallel curves of tangent vectors along γ, then we have d ξ(t) η(t) =, t [,T]. (1.2) dt Proof. From the fact that ξ(t),η(t) T γ(t) M and ξ(t), η(t) T γ(t) M one immediately gets d ξ(t) η(t) = ξ(t) η(t) + ξ(t) η(t) =. dt The notion of parallel transport permits to give a new characterization of geodesics. Indeed, by definition Corollary A smooth curve γ : [,T] M is a geodesic if and only if γ is parallel along γ. In the following we assume that M is oriented. Definition The spherical bundle SM on M is the disjoint union of all unit tangent vectors to M: SM = S q M, S q M = {v T q M, v = 1}. (1.21) q M 26

27 SM is a smooth manifold of dimension 3. Moreover it has the structure of fiber bundle with base manifold M, typical fiber S 1, and canonical projection π : SM M, π(v) = q if v T q M. Remark Since every vector in the fiber S q M has norm one, we can parametrize every v S q M by an angular coordinate θ S 1 through an orthonormal frame {e 1 (q),e 2 (q)} for S q M, i.e. v = cos(θ)e 1 (q)+sin(θ)e 2 (q). The choice of a positively oriented orthonormal frame {e 1 (q),e 2 (q)} corresponds to fix the element in the fiber corresponding to θ =. Hence, the choice of such an orthonormal frame at every point q induces coordinates on SM of the form (q,θ +ϕ(q)), where ϕ C (M). Given an element ξ S q M we can complete it to an orthonormal frame (ξ,η,ν) of R 3 in the following unique way: (i) η T q M is orthogonal to ξ and (ξ,η) is positively oriented (w.r.t. the orientation of M), (ii) ν T q M and (ξ,η,ν) is positively oriented (w.r.t. the orientation of R 3 ). Let t ξ(t) S γ(t) M be a smooth curve of unit tangent vectors along γ : [,T] M. Define η(t),ν(t) T γ(t) M as above. Since t ξ(t) has constant speed, one has ξ(t) ξ(t) and we can write ξ(t) = u ξ (t)η(t)+v ξ (t)ν(t). In particular this shows that every element of T ξ SM, written in the basis (ξ,η,ν), has zero component along ξ. Definition The Levi-Civita connection on M is the 1-form ω Λ 1 (SM) defined by ω ξ : T ξ SM R, ω ξ (z) = u z, (1.22) where z = u z η +v z ν and (ξ,η,ν) is the orthonormal frame defined above. Notice that ω change sign if we change the orientation of M. Lemma A curve of unit tangent vectors ξ(t) is parallel if and only if ω ξ(t) ( ξ(t)) =. Proof. By definition ξ(t) is parallel if and only if ξ(t) is orthogonal to Tγ(t) M, i.e., collinear to ν(t). In particular, a curve parametrized by length γ : [,T] M is a geodesic if and only if ω γ(t) ( γ(t)) =, t [,T]. (1.23) Proposition The Levi-Civita connection ω Λ 1 (SM) satisfies: (i) there exist two smooth functions a 1,a 2 : M R such that where (x 1,x 2,θ) is a system of coordinates on SM. ω = dθ +a 1 (x 1,x 2 )dx 1 +a 2 (x 1,x 2 )dx 2, (1.24) 27

28 (ii) dω = π Ω, where Ω is a 2-form defined on M and π : SM M is the canonical projection. Proof. (i) Fix a system of coordinates (x 1,x 2,θ) on SM and consider the vector field / θ on SM. Let us show that ( ) ω = 1. θ Indeed consider a curve t ξ(t) of unit tangent vector at a fixed point which describes a rotation in a single fibre. As a curve on SM, the velocity of this curve is exactly its orthogonal vector, i.e. ξ(t) = η(t) and the equality above follows from the definition of ω. By construction, ω is invariant by rotations, hence the coefficients a i = ω( / x i ) do not depend on the variable θ. (ii) Follows directly from expression (1.24) noticing that dω depends only on x 1,x 2. Remark Notice that the functions a 1,a 2 in (1.24) are not invariant by change of coordinates on the fiber. Indeed the transformation θ θ+ϕ(x 1,x 2 ) induces dθ dθ+( x1 ϕ)dx 1 +( x2 ϕ)dx 2 which gives a i a i + xi ϕ for i = 1,2. By definition ω is an intrinsic 1-form on SM. Its differential, by property (ii) of Proposition 1.55, is the pull-back of an intrinsic 2-form on M, that in general is not exact. Definition The area form dv on a surface M is the differential two form that on every tangent space to the manifold agrees with the volume induced by the inner product. In other words, for every positively oriented orthonormal frame e 1,e 2 of T q M, one has dv(e 1,e 2 ) = 1. Given a set Γ M its area is the quantity Γ = Γ dv. Since any 2-form on M is proportional to the area form dv, it makes sense to give the following definition: Definition The Gaussian curvature of M is the function κ : M R defined by the equality Ω = κdv. (1.25) Note that κ does not dependon the orientation of M, since both Ω and dv change sign if we reverse the orientation. Moreover the area 2-form dv on the surface depends only on the metric structure on the surface. 1.3 Gauss-Bonnet Theorems In this section we will prove both the local and the global version of the Gauss-Bonnet theorem. A strong consequence of these results is the celebrated Gauss Theorema Egregium which says that the Gaussian curvature of a surface is independent on its embedding in R 3. Definition 1.3. Let γ : [,T] M be a smooth curve parametrized by length. The geodesic curvature of γ is defined as ρ γ (t) = ω γ(t) ( γ(t)). (1.26) Notice that if γ is a geodesic, then ρ γ (t) = for every t [,T]. The geodesic curvature measures how much a curve is far from being a geodesic. Remark The geodesic curvature changes sign if we move along the curve in the opposite direction. Moreover, if M = R 2, it coincides with the usual notion of curvature of a planar curve. 28

29 1.3.1 Gauss-Bonnet theorem: local version Definition A curvilinear polygon Γon an oriented surfacem is theimage of aclosed polygon in R 2 under a diffeomorphism. We assume that Γ is oriented consistently with the orientation of M. In the following we represent Γ = m i=1 γ i(i i ) where γ i : I i M, for i = 1,...,m, are smooth curves parametrized by length, with orientation consistent with Γ. We denote by α i the external angles at the points where Γ is not C 1 (see Figure 1.3). α3 γ 3 α 2 γ 4 Γ γ 2 α 4 γ 5 α 1 γ 1 α 5 Figure 1.3: A curvilinear polygon Notice that a curvilinear polygon is homeomorphic to a disk. Theorem 1.33 (Gauss-Bonnet, local version). Let Γ be a curvilinear polygon on an oriented surface M. Then we have m m κdv + ρ γi (t)dt+ α i = 2π. (1.27) I i Γ i=1 Proof. (i) Case Γ is smooth. In this case Γ is the image of the unit (closed) ball B 1, centered in the origin of R 2, under a diffeomorphism F : B 1 M, Γ = F(B 1 ). In what follows we denote by γ : I M the curve such that γ(i) = Γ. We consider on B 1 the vector field V(x) = x 1 x2 x 2 x1 which has an isolated zero at the origin and whose flow is a rotation around zero. Denote by X := F V the induced vector field on M with critical point q = F(). For ε small enough, we define (cf. Figure 1.4) i=1 Γ ε := Γ\F(B ε ), and A ε := F(B ε ), where B ε is the ball of radius ε centered in zero in R 2. We have Γ ε = A ε Γ. Define the map φ : Γ ε SM, φ(q) = X(q) X(q). 29

30 F A ε γ Γ ε B 1 \B ε M Figure 1.4: The map F First notice that φ(γ ε) dω = φ(γ ε) π Ω = π(φ(γ ε)) Ω = Ω, (1.28) Γ ε where we used the fact that π(φ(γ ε )) = Γ ε. Then let us compute the integral of the curvature κ on Γ ε κdv = Ω = dω, (by (1.28)) Γ ε Γ ε φ(γ ε) = ω, (by Stokes Theorem) = φ(γ ε) φ(a ε) ω φ( Γ) ω, (since φ(γ ε ) = φ(a ε ) φ( Γ)) (1.29) Notice that in the third equality we used the fact that the induced orientation on φ(γ ε ) gives opposite orientation on the two terms. Let us treat separately these two terms. The first one, by Proposition 1.55, can be written as φ(a ε) ω = φ(a ε) dθ+ a 1 (x 1,x 2 )dx 1 +a 2 (x 1,x 2 )dx 2 (1.3) φ(a ε) The first element of (1.3) is equal to 2π since we integrate the 1-form dθ on a closed curve. The second element of (1.3), for ε, satisfies φ(a ε) a 1 (x 1,x 2 )dx 1 +a 2 (x 1,x 2 )dx 2 Cl(φ(A ε )), (1.31) Indeed the functions a i are smooth (hence bounded on compact sets) and the length of φ(a ε ) goes to zero for ε. 3

31 Let us now consider the second term of (1.29). Since φ( Γ) is parametrized by the curve t γ(t) (as a curve on SM), we have ω = ω γ(t) ( γ(t))dt = ρ γ (t)dt. φ( Γ) Concluding we have from (1.29) κdv = lim κdv = 2π Γ ε Γ ε I I I ρ γ (t)dt, that is (1.27) in the smooth case (i.e. when α i = for all i). (ii) Case Γ non smooth. We reduce to the previous case with a sequence of polygons Γ n such that Γ n is smooth and Γ n approximates Γ in a smooth way. In particular, we assume that Γ n coincides with Γ excepts in neighborhoods U i, for i = 1,...,m, of each point q i where Γ is not smooth, in such a way that the curve σ (n) i that parametrize ( Γ n \ Γ) U i satisfies l(σi n) 1/n. If we apply the statement of the Theorem for the smooth case to Γ n we have κdv + ρ γ (n)(t)dt = 2π, Γ n where γ (n) is the curve that parametrizes Γ n. Since Γ n tends to Γ as n, then lim κdv = κdv. n Γ n Γ We are left to prove that lim n ρ γ (n)(t)dt = i=1 m i=1 I i ρ γi (t)dt+ For every n, let us split the curve γ (n) as the union of the smooth curves σ (n) i??. Then m m ρ γ (n)(t)dt = ρ (n) γ (t)dt + ρ (n) i σ (t)dt. i Since the curve γ (n) i tends to γ i for n one has lim ρ (n) n γ (t)dt = i i=1 ρ γi (t)dt. m α i. (1.32) i=1 and γ (n) i Moreover, with analogous computations of part (i) of the proof ρ (n) σ (t)dt = ω = dθ+a 1 (x 1,x 2 )dx 1 +a 2 (x 1,x 2 )dx 2 i φ(σ (n) i ) and one has, using that l(φ(σ (n) i )) dθ α i, n φ(σ (n) i ) Then (1.32) follows. φ(σ (n) i ) φ(σ (n) i ) a 1 (x 1,x 2 )dx 1 +a 2 (x 1,x 2 )dx 2 n. 31 as in Figure

32 An important corollary is obtained by applying the Gauss-Bonnet Theorem to geodesic triangles. A geodesic triangle T is a curvilinear polygon with m = 3 edges and such that every smooth piece of boundary γ i is a geodesic. For a geodesic triangle T we denote by A i := π α i its internal angles. Corollary Let T be a geodesic triangle and A i (T) its internal angles. Then i κ(q) = lim A i(t) π T T Proof. Fix a geodesic triangle T. Using that the geodesic curvature of γ i vanishes, the local version of Gauss-Bonnet Theorem (1.27) can be rewritten as 3 i=1 A i = π + κdv. (1.33) Γ Dividing for T and passing to the limit for T in the class of geodesic triangles containing q one obtains 1 i κ(q) = lim κdv = lim A i(t) π T T T T T Gauss-Bonnet theorem: global version Now we state the global version of the Gauss-Bonnet theorem. In other words we want to generalize (1.27) to thecase whenγis a region of M not necessarily homeomorphictothe disk, seefor instance Figure 1.5. As we will see that the result depends on the Euler characteristic χ(γ) of this region. Inwhatfollows, byatriangulationofm wemeanadecompositionofm intocurvilinearpolygons (see Definition 1.32). Notice that every compact surface admits a triangulation. 3 Definition Let M R 3 be a compact oriented surface with boundary M (possibly with angles). Consider a triangulation of M. We define the Euler characteristic of M as where n i is the number of i-dimensional faces in the triangulation. χ(m) := n 2 n 1 +n, (1.34) The Euler characteristic can be defined for every region Γ of M in the same way. Here, by a region Γ on asurfacem, wemean a closed domainof themanifold withpiecewise smooth boundary. Remark The Euler characteristic is well-defined. Indeed one can show that the quantity (1.34) is invariant for refinement of a triangulation, since every at every step of the refinement the alternating sum does not change. Moreover, given two different triangulations of the same region, there always exists a triangulation that is a refinement of both of them. This shows that the quantity (1.34) is independent on the triangulation. Example For a compact connected orientable surface M g of genus g (i.e., a surface that topologically is a sphere with g handles) one has χ(m g ) = 2 2g. For instance one has χ(s 2 ) = 2, χ(t 2 ) =, where T 2 is the torus. Notice also that χ(b 1 ) = 1, where B 1 is the closed unit disk in R 2. 3 Formally, a triangulation of a topological space M is a simplicial complex K, homeomorphic to M, together with a homeomorphism h : K M. 32

33 Following the notation introduced in the previous section, for a given region Γ, we assume that Γ is oriented consistently with the orientation of M and Γ = m i=1 γ i(i i ) where γ i : I i M, for i = 1,...,m, are smooth curves parametrized by length (with orientation consistent with Γ). We denote by α i the external angles at the points where Γ is not C 1 (see Figure 1.5). Γ 2 Γ 1 Γ 4 Γ 3 M Figure 1.5: Gauss-Bonnet Theorem Theorem 1.38 (Gauss-Bonnet, global version). Let Γ be a region of a surface on a compact oriented surface M. Then Γ κdv + m i=1 I i ρ γi (t)dt+ m α i = 2πχ(Γ). (1.35) Proof. As in the proof of the local version of the Gauss-Bonnet theorem we consider two cases: (i) Case Γ smooth (in particular α i = for all i). Consider a triangulation of Γ and let {Γ j,j = 1,...,n 2 } be the corresponding subdivision of Γ in curvilinear polygons. We denote by {γ (j) k } the smooth curves parametrized by length whose image are the edges of Γ j and by and θ (j) k the external angles of Γ j. We assume that all orientations are chosen accordingly to the orientation of M. Applying Theorem 1.33 to every Γ j and summing w.r.t. j we get We have that n 2 j=1 n 2 j=1 ( κdv = Γ j Γ j κdv + k Γ κdv, i=1 ρ (j) γ (t)dt+ k k j,k θ (j) k ρ (j) γ (t)dt = k ) = 2πn 2. (1.36) m i=1 ρ γi (t)dt. (1.37) The second equality is a consequence of the fact that every edge of the decomposition that does 33

34 not belong to Γ appears twice in the sum, with opposite sign. It remains to check that θ (j) k = 2π(n 1 n ), (1.38) j,k Let us denote by N the total number of angles in the sum of the left hand side of (1.38). After reindexing we have to check that N θ ν = 2π(n 1 n ). (1.39) ν=1 Denote by n the number of vertexes that belong to Γ and with ni := n n. Similarly we define n 1 and ni 1. We have the following relations: (i) N = 2n I 1 +n 1, (ii) n = n 1, Claim (i) follows from the fact that every curvilinear polygon with n edges has n angles, but the internal edges are counted twice since each of them appears in two polygons. Claim (ii) is a consequence of the fact that Γ is the union of closed curves. If we denote by A k := π θ k the internal angles, we have N N θ ν = Nπ A ν. (1.4) ν=1 Moreover the sum of the internal angles is equal to π for a boundary vertex, and to 2π for an internal one. Hence one gets N A ν = 2πn I +πn, (1.41) ν=1 Combining (1.4), (1.41) and (i) one has ν=1 ν θ ν = (2n I 1 +n 1 )π (2nI +n )π i=1 Using (ii) one finally gets (1.39). (ii) Case Γ non-smooth. We consider a decomposition of Γ into curvilinear polygons whose edges intersect the boundary in the smooth part (this is always possible). The proof is identical to the smooth case up to formula (1.37). Now, instead of (1.39), we have to check that N θ ν = ν=1 m α i +2π(n 1 n ), (1.42) i=1 Now (1.42) can be rewritten as θ ν = 2π(n 1 n ), ν/ A where A is the set of indices whose corresponding angles are non smooth points of Γ. 34

35 Consider now a new region Γ, obtained by smoothing the edges of Γ, together with the decomposition induced by Γ (see Figure 1.5). Denote by ñ 1 and ñ the number of edges and vertexes of the decomposition of Γ. Notice that {θ ν,ν / A} is exactly the set of all angles of the decomposition of Γ. Moreover ñ 1 ñ = n 1 n, since n = ñ +m and n 1 = ñ 1 +m, where m is the number of non-smooth points. Hence, by part (i) of the proof: θ ν = 2π(ñ 1 ñ ) = 2π(n 1 n ). ν/ A Corollary Let M be a compact oriented surface without boundary. Then κdv = 2πχ(M). (1.43) Consequences of the Gauss-Bonnet Theorems M Definition 1.4. Let M,M be two surfaces in R 3. A smooth map φ : R 3 R 3 is called an isometry between M and M if φ(m) = M and for every q M it satisfies v w = D q φ(v) D q φ(w), v,w T q M. (1.44) If the property (1.44) is satisfied by a map defined locally in a neighborhood of every point q of M, then it is called a local isometry. Two surfaces M and M are said to be isometric (resp. locally isometric) if there exists an isometry (resp. local isometry) between M and M. Notice that the restriction φ of a global isometry Φ of R 3 to a surface M R 3 always defines an isometry between M and M = φ(m). From (1.44) it follows that an isometry preserves the angles between vectors and, a fortiori, the length of a curve and the distance between two points. Corollary 1.34, and the fact that the angles and the volumes are preserved by isometries, one obtains that the Gaussian curvature is invariant by local isometries, in the following sense. Corollary 1.41 (Gauss s Theorema Egregium). Assume φ is a local isometry between M and M, then for every q M one has κ(q) = κ (φ(q)), where κ (resp. κ ) is the Gaussian curvature of M (resp. M ). This Theorem says that the Gaussian curvature κ depends only on the metric structure on M and not on the specific fact that the surface is embedded in R 3 with the induced inner product. Corollary Let M be surface and q M. If κ(q) then M is not locally isometric to R 2 in a neighborhood of q. Exercise Prove that a surface M is locally isometric to the Euclidean plane R 2 around a point q M if and only if there exists a coordinate system (x 1,x 2 ) in a neighborhood U of q M such that the vectors x1 and x2 have unit length and are everywhere orthonormal. As a converse of Corollary 1.42 we have the following. 35

36 Theorem Assume that κ in a neighborhood of a point q M. Then M is locally Euclidean (i.e., locally isometric to R 2 ) around q. Proof. From our assumptions we have, in a neighborhood U of q: Ω = κdv =. Hence dω = π Ω =. From its explicit expression ω = dθ +a 1 (x 1,x 2 )dx 1 +a 2 (x 1,x 2 )dx 2, it follows that the 1-form a 1 dx 1 +a 2 dx 2 is locally exact, i.e. there exists a neighborhood W of q, W U, and a function φ : W R such that a 1 (x 1,x 2 )dx 1 +a 2 (x 1,x 2 )dx 2 = dφ. Hence ω = d(θ +φ(x 1,x 2 )). Thus we can define a new angular coordinate on SM, which we still denote by θ, in such a way that (see also Remark 1.27) ω = dθ. (1.45) Now, let γ be a length parametrized geodesic, i.e. ω γ(t) ( γ(t)) =. Using the the angular coordinate θ just defined on the fibers of SM, the curve t γ(t) S γ(t) M is written as t θ(t). Using (1.45), we have then = ω γ(t) ( γ(t)) = dθ( γ(t)) = θ(t). In other words the angular coordinate of a geodesic γ is constant. We want to construct Cartesian coordinates in a neighborhood U of q. Consider the two length parametrized geodesics γ 1 and γ 2 starting from q and such that θ 1 () =, θ 2 () = π/2. Define them to be the x 1 -axes and x 2 -axes of our coordinate system, respectively. Then, for each point q U consider the two geodesics starting from q and satisfying θ 1 () = and θ 2 () = π/2. We assign coordinates (x 1,x 2 ) to each point q in U by considering the length parameter of the geodesic projection of q on γ 1 and γ 2 (See Figure 1.6). Notice that the family of geodesics constructed in this way, and parametrized by q U, are mutually orthogonal at every point. By construction, in this coordinate system the vectors x1 and x2 have length one (being the tangent vectors to length parametrized geodesics) and are everywhere mutually orthogonal. Hence the theorem follows from Exercise The Gauss map We end this section with a geometric characterization of the Gaussian curvature of a manifold M, using the Gauss map. Definition Let M be an oriented surface. We define the Gauss map associated to M as N : M S 2, q ν q, (1.46) where ν q S 2 R 3 denotes the external unit normal vector to M at q. 36

37 x 2 γ 2 q q x 1 γ 1 Figure 1.6: Proof of Theorem Let us consider the differential of the Gauss map at the point q D q N : T q M T N(q) S 2 T q M where an element tangent to the sphere S 2 at N(q), being orthogonal to N(q), is identified with a tangent vector to M at q. Theorem We have that κ(q) = det(d q N). Before proving this theorem we prove an important property of the Gauss map. Lemma For every q M, the differential D q N of the Gauss map is a symmetric operator, i.e., D q N(ξ) η = ξ D q N(η), ξ,η T q M. (1.47) Proof. We prove the statement locally, i.e., for a manifold M parametrized by a function φ : R 2 M. In this case T q M = ImD u φ, where φ(u) = q. Let v,w R 2 such that ξ = D u φ(v) and η = D u φ(w). Since N(q) T q M we have N(q) η = N(q) D u φ(w) =. Taking the derivative in the direction of ξ one gets D q N(ξ) η + N(q) D 2 u φ(v,w) =, where D 2 uφ is a bilinear symmetric map. Now (1.47) follows exchanging the role of v and w. Proof of Theorem We will use Cartan s moving frame method. Let ξ SM and denote with (e 1 (ξ),e 2 (ξ),e 3 (ξ)), e i : SM R 3, the orthonormal basis attached at ξ and constructed in Section 1.2. Let us compute the differentials of these vectors in the ambient space R 3 and write them as a linear combination (with 1-form as coefficients) of the vectors e i d ξ e i (η) = 3 (ω ξ ) ij (η)e j (ξ), ω ij Λ 1 SM, η T ξ SM. j=1 37

38 Dropping ξ and η from the notation one gets the relation 3 de i = ω ij e j, j=1 ω ij Λ 1 SM. Since for each ξ the basis (e 1 (ξ),e 2 (ξ),e 3 (ξ)) is orthonormal (hence can be seen as an element of SO(3)) its derivative is expressed through a skew-symmentric matrix (i.e., ω ij = ω ji ) and one gets the equations Let us now prove the following identity de 1 = ω 12 e 2 +ω 13 e 3, de 2 = ω 12 e 1 +ω 23 e 3, (1.48) de 3 = ω 13 e 1 ω 23 e 2. ω 13 ω 23 = dω 12. (1.49) Indeed, differentiating the first equation in (1.48) one gets, using that d 2 =, = d 2 e 1 = dω 12 e 2 +ω 12 de 2 +dω 13 e 3 +ω 13 de 3 = (dω 12 ω 13 ω 23 )e 2 +(dω 13 ω 12 ω 23 )e 3, which implies in particular (1.49). The statement of the theorem can be rewritten as an identity between 2-forms as follows Applying π to both sides one gets det(d q N)dV = κdv. π (det(d q N)dV) = π κdv = dω (1.5) where ω is the Levi-Civita connection. Let us show that (1.5) is equivalent to (1.49). Indeed by construction ω 12 computes the coefficient of the derivative of the first vector of the orthonormal basis along the second one, hence ω 12 = ω (see also Definition 1.54). It remains to show that ω 13 ω 23 = π (det(d q N)dV) = det(d π(ξ) N)π dv Since e 3 = N π, where π : SM M is the canonical projection, one has The proof is completed by the following D q N π = de 3 = ω 13 e 1 ω 23 e 2 Exercise Let V be a 2-dimensional Euclidean vector space and e 1,e 2 an orthonormal basis. Let F : V V a linear map and write F = F 1 e 1 +F 2 e 2, where F i : V R are linear functionals. Prove that F 1 F 2 = (detf)dv, where dv is the area form induced by the inner product. 38

39 Remark Lemma 1.47 allows us to define the principal curvatures of M at the point q as the two real eigenvalues k 1 (q),k 2 (q) of the map D q N. In particular κ(q) = k 1 (q)k 2 (q), q M. The principal curvatures can be geometrically interpreted as the maximum and the minimum of curvature of sections of M with orthogonal planes. Notice moreover that, using the Gauss-Bonnet theorem, one can relate then degree of the map N with the Euler characteristic of M as follows 1 degn := Area(S 2 (detd q N)dV = 1 κdv = 1 ) 4π 2 χ(m). M 1.4 Surfaces in R 3 with the Minkowski inner product The theory and the results obtained in this chapter can be adapted to the case when M R 3 is a surface in the Minkowski 3-space, that is R 3 endowed with the hyperbolic (or Minkowski-type) inner product q 1,q 2 h = x 1 x 2 +y 1 y 2 z 1 z 2. (1.51) Here q i = (x i,y i,z i ) for i = 1,2, are two points in R 3. When q,q h, we denote by q h = q,q 1/2 h the norm induced by the inner product (1.51). For the metric structure to be defined on M, we require that the restriction of the inner product (1.51) to the tangent space to M is positive definite at every point. Indeed, under this assumption, the inner product (1.51) can be used to define the length of a tangent vector to the surface (which is non-negative). Thus one can introduce the length of (piecewise) smooth curves on M and its distance by the same formulas as in Section 1.1. These surfaces are also called space-like surfaces in the Minkovski space. The structure of the inner product impose some condition on the structure of space-like surfaces, as the following exercice shows. Exercise 1.5. Let M be a space-like surface in R 3 endowed with the inner product (1.51). (i) Show that if v T q M is a non zero vector that is orthogonal to T q M, then v,v h <. (ii) Prove that, if M is compact, then M. (iii) Show that restriction to M of the projection π(x,y,z) = (x,y) onto the xy-plane is a local diffeomorphism. (iv) Show that M is locally a graph on the plane {z = }. The results obtained in the previous sections for surfaces embedded in R 3 can be recovered for space-like surfaces by simply adapting all formulas to their hyperbolic counterpart. For instance, geodesics are defined as curves of unit speed whose second derivative is orthogonal, with respect to h, to the tangent space to M. For a smooth function a : R 3 R, its hyperbolic gradient h qa is defined as ( a h q a = x, a y, a z 39 ) M

40 If we assume that M = a 1 () is a regular level set of a smooth function a : R 3 R. If γ(t) is a curve contained in M, i.e. a(γ(t)) =, one has the identity = h γ(t) a γ(t) The same computation shows that h γ(t) a is orthogonal to the level sets of a, where orthogonal always means with respect to h. In particular, if M = a 1 () is space-like, one has q a, q a h <. Exercise Let γ be a geodesic on M = a 1 (). Show that γ satisfies the equation (in matrix notation) γ(t) = γ(t)t ( 2 γ(t) a) γ(t) h h γ(t) a 2 γ(t) a, t [,T]. (1.52) h where 2 γ(t) a is the (classical) matrix of second derivatives of a.4 Given a smooth curve γ : [,T] M on a surface M, a smooth curve of tangent vectors ξ(t) T γ(t) M is said to be parallel if ξ(t) T γ(t) M, with respect to the hyperbolic inner product. It is then straightforward to check that, if M is the zero level of a smooth function a : R 3 R, then ξ(t) is parallel along γ if and only if it satisfies h. ξ(t) = γ(t)t ( 2 γ(t) a)ξ(t) h h γ(t) a 2 γ(t) a, t [,T]. (1.53) h By definition a smooth curve γ : [,T] M is a geodesic if and only if γ is parallel along γ. Remark As for surfaces in the Euclidean space, given curve γ : [,T] M and initial datum v T γ() M, there is a unique parallel curve of tangent vectors ξ(t) T γ(t) M along γ such that ξ() = v. Moreover the operator ξ() ξ(t) is a linear operator, which the parallel transport of v along γ. Exercise Show that if ξ(t),η(t) are two parallel curves of tangent vectors along γ, then we have d dt ξ(t) η(t) h =, t [,T]. (1.54) Assume that M is oriented. Given an element ξ S q M we can complete it to an orthonormal frame (ξ,η,ν) of R 3 in the following unique way: (i) η T q M is orthogonal to ξ with respect to h and (ξ,η) is positively oriented (w.r.t. the orientation of M), (ii) ν T q M with respect to h and (ξ,η,ν) is positively oriented (w.r.t. the orientation of R 3 ). For a smooth curve of unit tangent vectors ξ(t) S γ(t) M along a curve γ : [,T] M we define η(t),ν(t) T γ(t) M and we can write ξ(t) = u ξ (t)η(t)+v ξ (t)ν(t). 4 otherwise one can write the numerator of (1.52) as 2,h γ(t) γ(t) γ(t), where h 2,h γ(t) is the hyperbolic Hessian. 4

41 Definition The hyperbolic Levi-Civita connection on M is the 1-form ω Λ 1 (SM) defined by ω ξ : T ξ SM R, ω ξ (z) = u z, (1.55) where z = u z η +v z ν and (ξ,η,ν) is the orthonormal frame defined above. It is again easy to check that a curve of unit tangent vectors ξ(t) is parallel if and only if ω ξ(t) ( ξ(t)) = and a curve parametrized by length γ : [,T] M is a geodesic if and only if ω γ(t) ( γ(t)) =, t [,T]. (1.56) Exercise Prove that the hyperbolic Levi Civita connection ω Λ 1 (SM) satisfies: (i) there exist two smooth functions a 1,a 2 : M R such that where (x 1,x 2,θ) is a system of coordinates on SM. ω = dθ +a 1 (x 1,x 2 )dx 1 +a 2 (x 1,x 2 )dx 2, (1.57) (ii) dω = π Ω, where Ω is a 2-form defined on M and π : SM M is the canonical projection. Again one can introduce the area form dv on M induced by the inner product and it makes sense to give the following definition: Definition The Gaussian curvature of a surface M in the Minkowski 3-space is the function κ : M R defined by the equality Ω = κdv. (1.58) By reasoning as in the Euclidean case, one can define the geodesic curvature of a curve and prove the analogue of the Gauss-Bonnet theorem in this context. As a consequence one gets that the Gaussian curvature is again invariant under isometries of M and hence is an intrinsic quantity that depends only on the metric properties of the surface and not on the fact that its metric is obtained as the restriction of some metric defined in the ambient space. Finally one can define the hyperbolic Gauss map Definition Let M be an oriented surface. We define the Gauss map N : M H 2, q ν q, (1.59) where ν q H 2 R 3 denotes the external unit normal vector to M at q, with respect to the Minkovsky inner product. Let us now consider the differential of the Gauss map at the point q: D q N : T q M T N(q) H 2 T q M where an element tangent to the hyperbolic plane H 2 at N(q), being orthogonal to N(q), is identified with a tangent vector to M at q. Theorem The differential of the Gauss map D q N is symmetric, and κ(q) = det(d q N). 41

42 1.5 Model spaces of constant curvature In this section we briefly discuss surfaces embedded in R 3 (with Euclidean or Lorentzian inner product) that have constant Gaussian curvature, playing the role of model spaces. For each model we are interested in describing geodesics and, more generally, curves of constant geodesic curvature. These results will be useful in the study of sub-riemannian model spaces in dimension three (cf. Chapter??). Assume that the surface M has constant Gaussian curvature κ R. We already know that κ is a metric invariant of the surface, i.e., it does not depend on the embedding of the surface in R 3. We will distinguish the following three cases: (i) κ = : this is the flat model of the classical Euclidean plane, (ii) κ > : these corresponds to the case of the sphere, (iii) κ < : these corresponds to the hyperbolic plane. We will briefly discuss the cases (i), since it is trivial, and study in some more detail the cases (ii) and (iii) of spherical and hyperbolic geometry Zero curvature: the Euclidean plane The Euclidean plane can be realized as the surface of R 3 defined by the zero level set of the function a : R 3 R, a(x,y,z) = z. It is an easy exercise, applying the results of the previous sections, to show that the curvature of this surface is zero (the Gauss map is constant) and to characterize geodesics and curves with constant curvature. Exercise Prove that geodesics on the Euclidean plane are lines. Moreover, show that curves with constant curvature c are circles of radius 1/c Positive curvature: spheres Let us consider the sphere S 2 r of radius r as the surface of R3 defined as the zero level set of the function S 2 r = a 1 (), a(x,y,z) = x 2 +y 2 +z 2 r 2. (1.6) If we denote, as usual, with the Euclidean inner product in R 3, Sr 2 can be viewed also as the set of points q = (x,y,z) whose Euclidean norm is constant S 2 r = {q R 3 q q = r 2 }. The Gauss map associated with this surface can be easily computed since its is explicitly given by N : Sr 2 S2, N(q) = 1 q, (1.61) r It follows immediately by (1.69) that the Gaussian curvature of the sphere is κ = 1/r 2 at every point q Sr 2. Let us now recover the structure of geodesics and constant geodesic curvature curves on the sphere. 42

43 Proposition 1.6. Let γ : [,T] Sr 2 be a curve with constant geodesic curvature equal to c R. For every vector w R 3 the function α(t) = γ(t) w is a solution of the differential equation α(t) + (c 2 + 1r ) 2 α(t) = Proof. Without loss of generality, we can assume that γ is parametrized by unit speed. Differentiating twice the equality a(γ(t)) =, where a is the function defined in (1.68), we get (in matrix notation): γ(t) T ( 2 γ(t) a) γ(t)+ γ(t)t γ(t) a =. Moreover, since γ(t) is constant and γ has constant geodesic curvature equal to c, there exists a function b(t) such that γ(t) = b(t) γ(t) a+cη(t) (1.62) where c is the geodesic curvature of the curve and η(t) = γ(t) is the vector orthogonal to γ(t) in T γ(t) S 2 r (defined in such a way that γ(t) and η(t) is a positively oriented frame). Reasoning as in the proof of Proposition 1.9 and noticing that γ(t) a is proportional to the vector γ(t), one can compute b(t) and obtains that γ satisfies the differential equation Lemma η(t) = c γ(t) γ(t) = 1 r2γ(t)+cη(t). (1.63) Proof of Lemma The curve η(t) has constant norm, hence η(t) is orthogonal to η(t). Recall that the triple (γ(t), γ(t), η(t)) defines an orthogonal frame at every point. Differentiating the identity η(t) γ(t) = with respect to t one has = η(t) γ(t) + η(t) γ(t) = η(t) γ(t). Hence η(t) has nonvanishing component only along γ(t). Differentiating the identity η(t) γ(t) = one obtains = η(t) γ(t) + η(t) γ(t) = η(t) γ(t) +c where we used (1.63). Hence η(t) = η(t) γ(t) γ(t) = c γ(t). Next we compute the derivatives of the function α as follows Using Lemma 1.61, we have α(t) = γ(t) w = 1 γ(t) w +c η(t) w. (1.64) r2 α(t) = 1 γ(t) w +c η(t) w (1.65) r2 = 1 ( ) 1 r 2 γ(t) w c2 γ(t) w = r 2 +c2 α(t). (1.66) which ends the proof of the Proposition

44 Corollary Constant geodesic curvature curves are contained in the intersection of Sr 2 with an affine plane of R 3. In particular, geodesics are contained in the intersection of Sr 2 with planes passing through the origin, i.e., great circles. Proof. Let us fix a vector w R 3 that is orthogonal to γ() and γ(). Let us then prove that α(t) := γ(t) w = for all t [,T]. By Proposition 1.6, the function α(t) is a solution of the Cauchy problem { α(t)+( 1 +c 2 )α(t) = r 2 (1.67) α() = α() = Since (1.67) admits the unique solution α(t) = for all t. If the curve is a geodesic, then c = and the geodesic equation is written as γ(t) = γ(t). Then consider the function Γ(t) := γ(t) w, where w is chosen as before. Γ(t) is constant since Γ(t) = α(t) =. In fact Γ(t) is identically zero since Γ() = γ() w = γ() w =, by the assumption on w. This proves that the curve γ is contained in a plane passing through the origin. Remark Curves with constant geodesic curvatures on the spheres are circles obtained as the intersection of the sphere with an affine plane. Moreover all these curves can be also characterized in the following two ways: (i) curves that have constant distance from a geodesic (equidistant curves), (ii) boundary of metric balls (spheres) Negative curvature: the hyperbolic plane The negative constant curvature model is the hyperbolic plane H 2 r obtained as the surface of R 3, endowed with the hyperbolic metric, defined as the zero level set of the function a(x,y,z) = x 2 +y 2 z 2 +r 2. (1.68) Indeed this surface is a two-fold hyperboloid, so we restrict our attention to the set of points H 2 r = a 1 () {z > }. In analogy with the positive constant curvature model (which is the set of points in R 3 whose euclidean norm is constant) the negative constant curvature can be seen as the set of points whose hyperbolic norm is constant in R 3. In other words H 2 r = {q = (x,y,z) R 3 q 2 h = r2 } {z > }. The hyperbolic Gauss map associated with this surface can be easily computed since its is explicitly given by N : H 2 r H2, N(q) = 1 r qa, (1.69) Exercise Prove that the Gaussian curvature of H 2 r is κ = 1/r 2 at every point q H 2 r. We can now discuss the structure of geodesics and constant geodesic curvature curves on the hyperbolic space. With start with a result than can be proved in an analogous way to Proposition

45 Proposition Let γ : [,T] Hr 2 be a curve with constant geodesic curvature equal to c R. For every vector w R 3 the function α(t) = γ(t) w h is a solution of the differential equation α(t) + (c 2 1r ) 2 α(t) =. (1.7) As for the sphere, this result implies immediately the following corollary. Corollary Constant geodesic curvature curves on Hr 2 are contained in the intersection of Hr 2 with affine planes of R3. In particular, geodesics are contained in the intersection of Hr 2 with planes passing through the origin. Exercise Prove Proposition 1.65 and Corollary Geodesics on Hr 2 are hyperbolas, obtained as intersections of the hyperboloid with plane passing through the origin. The classification of constant geodesic curvature curves is in fact more rich. The sections of the hyperboloid with affine planes can have different shapes depending on the Euclidean orthogonal vector to the plane: they are circles when it has negative hyperbolic length, hyperbolas when it has positive hyperbolic length or parabolas when it has length zero (that is it belong to the x 2 +y 2 z 2 = ). These distinctions reflects in the value of the geodesic curvature. Indeed, as the form of (1.7) also suggest, the value c = 1 r is a threshold and we have the following situation: (i) if c < 1/r, then the curve is an hyperbola, (ii) if c = 1/r, then the curve is a parabola, (iii) if c > 1/r, then the curve is a circle. This is not the only interesting feature of this classification. Indeed curves of type(i) are equidistant curves while curves of type (iii) are boundary of balls, i.e., spheres, in the hyperbolic plane. Finally, curves of type (ii) are also called horocycles (cf. Remark 1.63 for the difference with respect to the case of the positive constant curvature model). 45

46 46

47 Chapter 2 Vector fields In this chapter we collect some basic definitions of differential geometry, in order to recall some useful results and to fix the notation. We assume the reader to be familiar with the definitions of smooth manifold and smooth map between manifolds. 2.1 Differential equations on smooth manifolds In what follows I denotes an interval of R containing in its interior Tangent vectors and vector fields Let M be a smooth n-dimensional manifold and γ 1,γ 2 : I M two smooth curves based at q = γ 1 () = γ 2 () M. We say that γ 1 and γ 2 are equivalent if they have the same 1-st order Taylor polynomial in some (or, equivalently, in every) coordinate chart. This defines an equivalence relation on the space of smooth curves based at q. Definition 2.1. Let M be a smooth n-dimensional manifold and let γ : I M be a smooth curve such that γ() = q M. Its tangent vector at q = γ(), denoted by d dt γ(t), or γ(), (2.1) t= is the equivalence class in the space of all smooth curves in M such that γ() = q. It is easy to check, usingthe chain rule, that this definition is well-posed (i.e., it does not depend on the representative curve). Definition 2.2. Let M be a smooth n-dimensional manifold. The tangent space to M at a point q M is the set { } d T q M := dt γ(t), γ : I M smooth, γ() = q. t= It is a standard fact that T q M has a natural structure of n-dimensional vector space, where n = dimm. 47

48 Definition 2.3. A smooth vector field on a smooth manifold M is a smooth map X : q X(q) T q M, that associates to every point q in M a tangent vector at q. We denote by Vec(M) the set of smooth vector fields on M. Incoordinates wecan writex = n i=1 Xi (x) x i, andthevector fieldissmoothifits components X i (x) are smooth functions. The value of a vector field X at a point q is denoted in what follows both with X(q) and X q. Definition 2.4. Let M be a smooth manifold and X Vec(M). The equation q = X(q), q M, (2.2) is called an ordinary differential equation (or ODE) on M. A solution of (2.2) is a smooth curve γ : J M, where J R is an open interval, such that We also say that γ is an integral curve of the vector field X. γ(t) = X(γ(t)), t J. (2.3) A standard theorem on ODE ensures that, for every initial condition, there exists a unique integral curve of a smooth vector field, defined on some open interval. Theorem 2.5. Let X Vec(M) and consider the Cauchy problem { q(t) = X(q(t)) q() = q (2.4) For any point q M there exists δ > and a solution γ : ( δ,δ) M of (2.4), denoted by γ(t;q ). Moreover the map (t,q) γ(t;q) is smooth on a neighborhood of (,q ). The solution is unique in the following sense: if there exists two solutions γ 1 : I 1 M and γ 2 : I 2 M of (2.4) defined on two different intervals I 1,I 2 containing zero, then γ 1 (t) = γ 2 (t) for every t I 1 I 2. This permits to introduce the notion of maximal solution of (2.4), that is the unique solution of (2.4) that is not extendable to a larger interval J containing I. If the maximal solution of (2.4) is defined on a bounded interval I = (a,b), then the solution leaves every compact K of M in a finite time t K < b. A vector field X Vec(M) is called complete if, for every q M, the maximal solution γ(t;q ) of the equation (2.2) is defined on I = R. Remark 2.6. The classical theory of ODE ensure completeness of the vector field X Vec(M) in the following cases: (i) M is a compact manifold (or more generally X has compact support in M), (ii) M = R n and X is sub-linear, i.e. there exists C 1,C 2 > such that where denotes the Euclidean norm in R n. X(x) C 1 x +C 2, x R n. 48

49 When we are interested in the behavior of the trajectories of a vector field X Vec(M) in a compact subset K of M, the assumption of completeness is not restrictive. Indeed consider an open neighborhood O K of a compact K with compact closure O K in M. There exists a smooth cut-off function a : M R that is identically 1 on K, and that vanishes out of O K. Then the vector field ax is complete, since it has compact support in M. Moreover, the vector fields X and ax coincide on K, hence their integral curves coincide on K too Flow of a vector field Given a complete vector field X Vec(M) we can consider the family of maps φ t : M M, φ t (q) = γ(t;q), t R. (2.5) where γ(t;q) is the integral curve of X starting at q when t =. By Theorem 2.5 it follows that the map φ : R M M, φ(t,q) = φ t (q), is smooth in both variables and the family {φ t, t R} is a one parametric subgroup of Diff(M), namely, it satisfies the following identities: φ = Id, Moreover, by construction, we have φ t φ s = φ s φ t = φ t+s, t,s R, (2.6) (φ t ) 1 = φ t, t R, φ t (q) t = X(φ t (q)), φ (q) = q, q M. (2.7) The family of maps φ t defined by (2.5) is called the flow generated by X. For the flow φ t of a vector field X it is convenient to use the exponential notation φ t := e tx, for every t R. Using this notation, the group properties (2.6) take the form: e X = Id, e tx e sx = e sx e tx = e (t+s)x, (e tx ) 1 = e tx, (2.8) d dt etx (q) = X(e tx (q)), q M. (2.9) Remark 2.7. When X(x) = Ax is a linear vector field on R n, where A is a n n matrix, the corresponding flow φ t is the matrix exponential φ t (x) = e ta x Vector fields as operators on functions A vector field X Vec(M) induces an action on the algebra C (M) of the smooth functions on M, defined as follows where X : C (M) C (M), a Xa, a C (M), (2.1) (Xa)(q) = d dt a(e tx (q)), q M. (2.11) t= In other words X differentiates the function a along its integral curves. 49

50 Remark 2.8. Let us denote a t := a e tx. The map t a t is smooth and from (2.11) it immediately follows that Xa represents the first order term in the expansion of a t when t : a t = a+txa+o(t 2 ). Exercise 2.9. Let a C (M) and X Vec(M), and denote a t = a e tx. Prove the following formulas d dt a t = Xa t, (2.12) a t = a+txa+ t2 2! X2 a+ t3 3! X3 a+...+ tk k! Xk a+o(t k+1 ). (2.13) It is easy to see also that the following Leibnitz rule is satisfied X(ab) = (Xa)b+a(Xb), a,b C (M), (2.14) that means that X, as an operator on functions, is a derivation of the algebra C (M). Remark 2.1. Notice that, in coordinates, if a C (M) and X = i X i(x) x i then Xa = i X i(x) a x i. In particular, when X is applied to the coordinate functions a i (x) = x i then Xa i = X i, which shows that a vector field is completely characterized by its action on functions. Exercise Let f 1,...,f k C (M) and assume that N = {f 1 =... = f k = } M is a smooth submanifold. Show that X Vec(M) is tangent to N, i.e., X(q) T q N for all q N, if and only if Xf i = for every i = 1,...,k Nonautonomous vector fields Definition A nonautonomous vector field is family of vector fields {X t } t R such that the map X(t,q) = X t (q) satisfies the following properties (C1) X(,q) is measurable for every fixed q M, (C2) X(t, ) is smooth for every fixed t R, (C3) for every system of coordinates defined in an open set Ω M and every compact K Ω and compact interval I R there exists L functions c(t),k(t) such that X(t,x) c(t), X(t,x) X(t,y) k(t) x y, (t,x),(t,y) I K Notice that conditions (C1) and (C2) are equivalent to require that for every smooth function a C (M) the real function (t,q) X t a q defined on R M is measurable in t and smooth in q. Remark In these lecture notes we are mainly interested in nonautonomous vector fields of the following form m X t (q) = u i (t)f i (q) (2.15) i=1 5

51 where u i are L functions and f i are smooth vector fields on M. For this class of nonautonomous vector fields assumptions (C1)-(C2) are trivially satisfied. For what concerns (C3), by the smoothness of f i for every compact set K Ω we can find two positive constants C K,L K such that for all i = 1,...,m and j = 1,...,n we have f i (x) C K, f i (x) x j L K, x K, and one gets for all (t,x),(t,y) I K m m X(t,x) C K u i (t), X(t,x) X(t,y) L K u i (t) x y. (2.16) i=1 The existence and uniqueness of integral curves of a nonautonomous vector field is guaranteed by the following theorem (see [BP7a]). Theorem 2.14 (Carathéodory theorem). Assume that the nonautonomous vector field {X t } t R satisfies (C1)-(C3). Then the Cauchy problem { q(t) = X(t,q(t)) q(t ) = q (2.17) has a unique solution γ(t;t,q ) defined on an open interval I containing t such that (2.17) is satisfied for almost every t I and γ(t ;t,q ) = q. Moreover the map (t,q ) γ(t;t,q ) is Lipschitz with respect to t and smooth with respect to q. Let us assume now that the equation (2.14) is complete, i.e., for all t R and q M the solution γ(t;t,q ) is defined on I = R. Let us denote P t,t(q) = γ(t;t,q). The family of maps {P t,s } t,s R where P t,s : M M is the (nonautonomous) flow generated by X t. It satisfies t i=1 P t,t X (q) = q q (t,p t,t(q ))P t,t(q) Moreover the following algebraic identities are satisfied P t,t = Id, P t2,t 3 P t1,t 2 = P t1,t 3, t 1,t 2,t 3 R, (2.18) (P t1,t 2 ) 1 = P t2,t 1, t 1,t 2 R, Conversely, with every family of smooth diffeomorphism P t,s : M M satisfying the relations (2.18), that is called a flow on M, one can associate its infinitesimal generator X t as follows: X t (q) = d ds P t,t+s (q), q M. (2.19) s= The following lemma characterizes flows whose infinitesimal generator is autonomous. Lemma Let {P t,s } t,s R be a family of smooth diffeomorphisms satisfying (2.18). Its infinitesimal generator is an autonomous vector field if and only if P,t P,s = P,t+s, t,s R. 51

52 2.2 Differential of a smooth map A smooth map between manifolds induces a map between the corresponding tangent spaces. Definition Let ϕ : M N a smooth map between smooth manifolds and q M. The differential of ϕ at the point q is the linear map ϕ,q : T q M T ϕ(q) N, (2.2) defined as follows: ϕ,q (v) = d dt ϕ(γ(t)), if v = d t= dt γ(t), q = γ(). t= It is easily checked that this definition depends only on the equivalence class of γ. ϕ q v γ(t) ϕ,q v ϕ(γ(t)) ϕ(q) M N Figure 2.1: Differential of a map ϕ : M N The differential ϕ,q of a smooth map ϕ : M N, also called its pushforward, is sometimes denoted by the symbols D q ϕ or d q ϕ (see Figure 2.2). Exercise Let ϕ : M N, ψ : N Q be smooth maps between manifolds. Prove that the differential of the composition ψ ϕ : M Q satisfies (ψ ϕ) = ψ ϕ. As we said, a smooth map induces a transformation of tangent vectors. If we deal with diffeomorphisms, we can also pushforward a vector field. Definition Let X Vec(M) and ϕ : M N be a diffeomorphism. The pushforward ϕ X Vec(N) is the vector field on N defined by (ϕ X)(ϕ(q)) := ϕ (X(q)), q M. (2.21) When P Diff(M) is a diffeomorphism on M, we can rewrite the identity (2.21) as (P X)(q) = P (X(P 1 (q))), q M. (2.22) Notice that, in general, if ϕ is a smooth map, the pushforward of a vector field is not well-defined. Remark From this definition it follows the useful formula for X, Y Vec(M) (e tx Y) q = e tx ( Y e d (q)) = tx ds e tx e sy e tx (q). s= 52

53 If P Diff(M) and X Vec(M), then P X is, by construction, the vector field whose integral curves are the image under P of integral curves of X. The following lemma shows how it acts as operator on functions. Lemma 2.2. Let P Diff(M), X Vec(M) and a C (M) then e tp X = P e tx P 1, (2.23) (P X)a = (X(a P)) P 1. (2.24) Proof. From the formula d dt P e tx P 1 (q) = P (X(P 1 (q))) = (P X)(q), t= it follows that t P e tx P 1 (q) is an integral curve of P X, from which (2.23) follows. To prove (2.24) let us compute (P X)a q = d dt a(e tp X (q)). t= Using (2.23) this is equal to d dt a(p(e tx (P 1 (q))) = d t= dt (a P)(e tx (P 1 (q))) = (X(a P)) P 1. t= As a consequence of Lemma 2.2 one gets the following formula: for every X,Y Vec(M) 2.3 Lie brackets (e tx Y)a = Y(a e tx ) e tx. (2.25) In this section we introduce a fundamental notion for sub-riemannian geometry, the Lie bracket of twovector fieldsx andy. Geometrically itisdefinedastheinfinitesimalversion ofthepushforward of the second vector field along the flow of the first one. As expalined below, it measures how much Y is modified by the flow of X. Definition Let X,Y Vec(M). We define their Lie bracket as the vector field [X,Y] := t e tx Y. (2.26) t= Remark The geometric meaning of the Lie bracket can be understood by writing explicitly [X,Y] q = t e tx Y q = t= t e tx (Y e tx (q) ) = t= s t e tx e sy e tx (q). (2.27) t=s= Proposition As derivations on functions, one has the identity [X,Y] = XY YX. (2.28) 53

54 Proof. By definition of Lie bracket we have [X,Y]a = t t= (e tx Y)a. Hence we have to compute the first order term in the expansion, with respect to t, of the map Using formula (2.25) we have t (e tx Y)a. (e tx Y)a = Y(a e tx ) e tx. By Remark 2.8 we have a e tx = a txa+o(t 2 ), hence (e tx Y)a = Y(a txa+o(t 2 )) e tx = (Ya tyxa+o(t 2 )) e tx. Denoting b = Ya tyxa+o(t 2 ), b t = b e tx, and using again the expansion above we get (e tx Y)a = (Ya tyxa+o(t 2 ))+tx(ya tyxa+o(t 2 ))+O(t 2 ) = Ya+t(XY YX)a+O(t 2 ). that proves that the first order term with respect to t in the expansion is (XY YX)a. Proposition 2.23 shows that (Vec(M),[, ]) is a Lie algebra. Exercise Prove the coordinate expression of the Lie bracket: let n n X = X i, Y = Y j, x i x j be two vector fields in R n. Show that i=1 [X,Y] = n i,j=1 j=1 ( ) Y j X j X i Y i. x i x i x j Next we prove that every diffeomorphism induces a Lie algebra homomorphism on Vec(M). Proposition Let P Diff(M). Then P is a Lie algebra homomorphism of Vec(M), i.e., P [X,Y] = [P X,P Y], X,Y Vec(M). Proof. We show that the two terms are equal as derivations on functions. Let a C (M), preliminarly we see, using (2.24), that and using twice this property and (2.28) P X(P Ya) = P X(Y(a P) P 1 ) = X(Y(a P) P 1 P) P 1 = X(Y(a P)) P 1, [P X,P Y]a = P X(P Ya) P Y(P Xa) = XY(a P) P 1 YX(a P) P 1 = (XY YX)(a P) P 1 = P [X,Y]a. 54

55 To end this section, we show that the Lie bracket of two vector fields is zero (i.e., they commute as operator on functions) if and only if their flows commute. Proposition Let X, Y Vec(M). The following properties are equivalent: (i) [X,Y] =, (ii) e tx e sy = e sy e tx, t,s R. Proof. We start the proof with the following claim [X,Y] = = e tx Y = Y, t R. (2.29) To prove (2.29) let us show that [X,Y] = d dt t= e tx Y = implies that d dt e tx Y = for all t R. Indeed we have d dt e tx Y = d dε e (t+ε)x Y = d ε= dε e tx e εx Y ε= = e tx d dε e εx Y = e tx [X,Y] =, ε= which proves (2.29). (i) (ii). Fix t R. Let us show that φ s := e tx e sy e tx is the flow generated by Y. Indeed we have s φ s = ε e tx e (s+ε)y e tx ε= = ε e tx e εy e tx e} tx e {{ sy e tx } ε= = e tx Y φ s = Y φ s. where in the last equality we used (2.29). Using uniqueness of the flow generated by a vector field we get e tx e sy e tx = e sy, t,s R, which is equivalent to (ii). (ii) (i). For every function a C we have Then (i) follows from (2.28). XYa = 2 a e sy e tx = 2 a e tx e sy = YXa. t s t=s= s t t=s= Exercise Let X,Y Vec(M) and q M. Consider the curve on M γ(t) = e ty e tx e ty e tx (q). Prove that the tangent vector to the curve t γ( t) at t = is [X,Y](q). 55 φ s

56 Exercise Let X,Y Vec(M). Using the semigroup property of the flow, prove that Deduce the following expansion e tx Y = n= d dt e tx Y = e tx [X,Y] (2.3) t n n! (adx)n Y (2.31) = Y +t[x,y]+ t2 2 [X,[X,Y]]+ t3 [X,[X,[X,Y ]]] Exercise Let X,Y Vec(M) and a C (M). Prove the following Leibnitz rule for the Lie bracket: [X,aY] = a[x,y]+(xa)y. Exercise 2.3. Let X,Y,Z Vec(M). Prove that the Lie bracket satisfies the Jacobi identity: [X,[Y,Z]]+[Y,[Z,X]] +[Z,[X,Y]] =. (2.32) Hint: Differentiate the identity e tx [Y,Z] = [etx Y,etX Z] with respect to t. Exercise Let M be a smooth dimensional manifold and X 1,...,X n be linearly independent vector fields in a neighborhood of a point q M. Prove that the map ψ : R n M, ψ(t 1,...,t n ) = e t 1X 1... e tnxn (q ) is a local diffeomorphism at. Moreover we have, denoting t = (t 1,...,t n ), ψ (t) = e t 1X 1... e t i+1x i+1 X i (ψ(t)) t i Deduce that, when [X i,x j ] = for every i,j = 1,...,n, one has 2.4 Frobenius theorem ψ t i (t) = X i (ψ(t)). In this section we prove Frobenius theorem about vector distributions. Definition Let M be a smooth manifold. A vector distribution D of rank m on M is a family of vector subspaces D q T q M where dimd q = m for every q. A vector distribution D is said to be smooth if, for every point q M, there exists a neighborhood O q of q and a family of vector fields X 1,...,X m such that D q = span{x 1 (q),...,x m (q)}, q O q. (2.33) Definition A smooth vector distribution D (of rank m) on M is said to be involutive if there exists a local basis of vector fields X 1,...,X m satisfying (2.33) and smooth functions a k ij on M such that m [X k,x i ] = a k ijx j, i,k = 1,...,m. (2.34) j=1 56

57 Exercise Prove that a smooth vector distribution D is involutive if and only if for every local basis of vector fields X 1,...,X m satisfying (2.33) there exist smooth functions a k ij such that (2.34) holds. Definition A smooth vector distribution D on M is said to be flat if for every point q M there exists a diffeomorphism φ : O q R n such that φ,q (D q ) = R m {} for all q O q. Theorem 2.36 (Frobenius Theorem). A smooth distribution is involutive if and only if it is flat. Proof. The statement is local, hence it is sufficient to prove the statement on a neighborhood of every point q M. (i). Assume first that the distribution is flat. Then there exists a diffeomorphism φ : O q R n such that D q = φ 1,q(R m {}). It follows that for all q O q we have and we have for i,k = 1,...,m [ [X k,x i ] = D q = span{x 1 (q),...,x m (q)}, φ 1,q ],φ 1,q x k x i X i (q) := φ 1,q. x i [ = φ 1,q, x k ] =. x i (ii). Let us now prove that if D is involutive then it is flat. As before it is not restrictive to work on a neighborhood where and (2.34) are satisfied. We first need a lemma. D q = span{x 1 (q),...,x m (q)}, q O q. (2.35) Lemma For every k = 1,...,m we have e tx k D = D. Proof of Lemma Let us define the time dependent vector fields Using (2.34) and (2.3) we compute Ẏ k i (t) = e tx k [X k,x i ] = Y k i (t) := e tx k X i m j=1 e tx k ) (a k ijx j = m k=1 a k ij(t)y k j (t) where we set a k ij (t) = ak ij etx k. Denote by A k (t) = (a k ij (t))m i,j=1 Γ k (t) = (γij k(t))m i,j=1 to the matrix Cauchy problem and consider the unique solution Then we have that implies, for every i,k = 1,...,m which proves the claim. Γ(t) = A k (t)γ(t), Γ() = I. (2.36) Y k i (t) = m j=1 e tx k X i = γ k ij(t)y k j () m γij(t)x k j j=1 57

58 We can now end the proof of Theorem Complete the family X 1,...,X m to a basis of the tangent space T q M = span{x 1 (q),...,x m (q),z m+1 (q),...,z n (q)} in a neighborhood of q and set ψ : R n M defined by ψ(t 1,...,t m,s m+1,...,s n ) = e t 1X 1... e tmxm e s m+1z m+1... e snzn (q ) By construction ψ is a local diffeomorphism at (t,s) = (,) and for (t,s) close to (,) we have that (cf. Exercice 2.31) ψ (t,s) = e t 1X 1 t i... e t ix i X i (ψ(t,s)), for every i = 1,...,m. These vectors are linearly independent and, thanks to Lemma 2.37, belong to D. Hence { } D q = ψ span,...,, q = ψ(t,s), t 1 t m and the claim is proved. 2.5 Cotangent space In this section we introduce tangent covectors, that are linear functionals on the tangent space. The space of all covectors at a point q M, called cotangent space is, in algebraic terms, simply the dual space to the tangent space. Definition Let M be a n-dimensional smooth manifold. The cotangent space at a point q M is the set T q M := (T qm) = {λ : T q M R,λ linear}. If λ T q M and v T qm, we will denote by λ,v := λ(v) the action of the covector λ on the vector v. As we have seen, the differential of a smooth map yields a linear map between tangent spaces. The dual of the differential gives a linear map between cotangent spaces. Definition Let ϕ : M N be a smooth map and q M. The pullback of ϕ at point ϕ(q), where q M, is the map ϕ : T ϕ(q) N T q M, λ ϕ λ, defined by duality in the following way ϕ λ,v := λ,ϕ v, v T q M, λ T ϕ(q) M. Example 2.4. Let a : M R be a smooth function and q M. The differential d q a of the function a at the point q M, defined through the formula d q a,v := d dt a(γ(t)), v T q M, (2.37) t= where γ is any smooth curve such that γ() = q and γ() = v, is an element of Tq M, since (2.37) is linear with respect to v. 58

59 Definition A differential 1-form on a smooth manifold M is a smooth map ω : q ω(q) T q M, that associates with every point q in M a cotangent vector at q. We denote by Λ 1 (M) the set of differential forms on M. Since differential forms are dual objects to vector fields, it is well defined the action of ω Λ 1 M on X Vec(M) pointwise, defining a function on M. ω,x : q ω(q),x(q). (2.38) The differential form ω is smooth if and only if, for every smooth vector field X Vec(M), the function ω,x C (M) Definition Let ϕ : M N be a smooth map and a : N R be a smooth function. The pullback ϕ a is the smooth function on M defined by (ϕ a)(q) = a(ϕ(q)), q M. In particular, if π : T M M is the canonical projection and a C (M), then which is constant on fibers. 2.6 Vector bundles (π a)(λ) = a(π(λ)), λ T M, Heuristically, a smooth vector bundle on a manifold M, is a smooth family of vector spaces parametrized by points in M. Definition Let M be a n-dimensional manifold. A smooth vector bundle of rank k over M is a smooth manifold E with a surjective smooth map π : E M such that (i) the set E q := π 1 (q), the fiber of E at q, is a k-dimensional vector space, (ii) for every q M there exist a neighborhood O q of q and a linear-on-fibers diffeomorphism (called local trivialization) ψ : π 1 (O q ) O q R k such that the following diagram commutes π 1 ψ (O q ) O q R k (2.39) π π 1 O q The space E is called total space and M is the base of the vector bundle. We will refer at π as the canonical projection and rank E will denote the rank of the bundle. Remark A vector bundle E, as a smooth manifold, has dimension dime = dimm +rank E = n+k. In the case when there exists a global trivialization map, i.e. one can choose a local trivialization with O q = M for all q M, then E is diffeomorphic to M R k and we say that E is trivializable. 59

60 Example For any smooth n-dimensional manifold M, the tangent bundle TM, defined as the disjoint union of the tangent spaces at all points of M, TM = T q M, q M has a natural structure of 2n-dimensional smooth manifold, equipped with the vector bundle structure (of rank n) induced by the canonical projection map π : TM M, π(v) = q if v T q M. In the same way one can consider the cotangent bundle T M, defined as T M = TqM. q M Again, it is a 2n-dimensional manifold, and the canonical projection map π : T M M, π(λ) = q if λ T qm, endows T M with a structure of rank n vector bundle. Let O M be a coordinate neighborhood and denote by φ : O R n, φ(q) = (x 1,...,x n ), a local coordinate system. The differentials of the coordinate functions dx i q, i = 1,...,n, q O, form a basis of the cotangent space Tq M. The dual basis in the tangent space T qm is defined by the vectors x i T q M, i = 1,...,n, q O, (2.4) q dx i, = δ ij, i,j = 1,...,n. (2.41) x j Thus any tangent vector v T q M and any covector λ Tq M can be decomposed in these basis and the maps v = n v i x i, λ = q i=1 n p i dx q i, ψ : v (x 1,...,x n,v 1,...,v n ), ψ : λ (x1,...,x n,p 1,...,p n ), (2.42) define local coordinates on TM and T M respectively, which we call canonical coordinates induced by the coordinates ψ on M. 6 i=1

61 Definition A morphism f : E E between two vector bundles E,E on the base M (also called a bundle map) is a smooth map such that the following diagram is commutative f E E (2.43) π π M where f is linear on fibers. Here π and π denote the canonical projections. Definition Let π : E M be a smooth vector bundle over M. A local section of E is a smooth map 1 σ : A M E satisfying π σ = Id A, where A is an open set of M. In other words σ(q) belongs to E q for each q A, smoothly with respect to q. If σ is defined on all M it is said to be a global section. Example Let π : E M be a smooth vector bundle over M. The zero section of E is the global section We will denote by M := ζ(m) E. ζ : M E, ζ(q) = E q, q M. Remark Notice that smooth vector fields and smooth differential forms are, by definition, sections of the vector bundles TM and T M respectively. We end this section with some classical construction on vector bundles. Definition 2.5. Let ϕ : M N be a smooth map between smooth manifolds and E be a vector bundle on N, with fibers {E q,q N}. The induced bundle (or pullback bundle) ϕ E is a vector bundle on the base M defined by ϕ E := {(q,v) q M,v E ϕ(q) } M E. Notice that rankϕ E = ranke, hence dimϕ E = dimm +ranke. Example (i). Let M be a smooth manifold and TM its tangent bundle, endowed with an Euclidean structure. The spherical bundle SM is the vector subbundle of T M defined as follows SM = q M S q M, S q M = {v T q M v = 1}. (ii). Let E,E be two vector bundles over a smooth manifold M. The direct sum E E is the vector bundle over M defined by 1 hetre smooth means as a map between manifolds. (E E ) q := E q E q. 61

62 2.7 Submersions and level sets of smooth maps If ϕ : M N is a smooth map, we define the rank of ϕ at q M to be the rank of the linear map ϕ,q : T q M T ϕ(q) N. It is of course just the rank of the matrix of partial derivatives of ϕ in any coordinate chart, or the dimension of Im(ϕ,q ) T ϕ(q) N. If ϕ has the same rank k at every point, we say ϕ has constant rank, and write rankϕ = k. An immersion is a smooth map ϕ : M N with the property that ϕ is injective at each point (or equivalently rankϕ = dimm). Similarly, a submersion is a smooth map ϕ : M N such that ϕ is surjective at each point (equivalently, rankϕ = dimn). Theorem 2.52 (Rank Theorem). Suppose M and N are smooth manifolds of dimensions m and n, respectively, and ϕ : M N is a smooth map with constant rank k in a neighborhood of q M. Then there exist coordinates (x 1,...,x m ) centered at q and (y 1,...,y n ) centered at ϕ(q) in which ϕ has the following coordinate representation: ϕ(x 1,...,x m ) = (x 1,...,x k,,...,). (2.44) Remark The previous theorem can be rephrased in the following way. Let ϕ : M N be a smooth map between two smooth manifolds. Then the following are equivalent: (i) ϕ has constant rank in a neighborhood of q M. (ii) There exist coordinates near q M and ϕ(q) N in which the coordinate representation of ϕ is linear. In the case of a submersion, from Theorem 2.52 one can deduce the following result. Corollary Assume ϕ : M N is a smooth submersion at q. Then ϕ admits a local right inverse at ϕ(q). Moreover ϕ is open at q. More precisely it exist ε > and C > such that B ϕ(q) (C 1 r) ϕ(b q (r)), r [,ε), (2.45) where the balls in (2.45) are considered with respect to some Euclidean norm in a coordinate chart. Remark The constant C appearing in (2.45) is related to the norm of the differential of the local right inverse, computed with respect to the chosen Euclidean norm in the coordinate chart. When ϕ is a diffeomorphism, C is a bound on the norm of the differential of the inverse of ϕ. This recover the classical quantitative statement of the inverse function theorem. Using these results, one can give some general criteria for level sets of smooth maps (or smooth functions) to be submanifolds. Theorem 2.56 (Constant Rank Level Set Theorem). Let M and N be smooth manifolds, and let ϕ : M N be a smooth map with constant rank k. Each level set ϕ 1 (y), for y N is a closed embedded submanifold of codimension k in M. Remark It is worth to specify the following two important sub cases of Theorem 2.56: (a) If ϕ : M N is a submersion at every q ϕ 1 (y) for some y N, then ϕ 1 (y) is a closed embedded submanifold whose codimension is equal to the dimension of N. 62

63 (b) If a : M R is a smooth function such that d q a for every q a 1 (c), where c R, then the level set a 1 (c) is a smooth hypersurface of M Exercise Let a : M R be a smooth function. Assume that c R is a regular value of a, i.e., d q a for every q a 1 (c). Then N c = a 1 (c) = {q M a(q) = c} M is a smooth submanifold. Prove that for every q N c Bibliographical notes T q N c = kerd q a = {v T q M d q a,v = }. The material presented in this chapter is classical and covered by many textbook in differential geometry, as for instance in [Boo86, Lee13, dc92, Spi79]. Theorem 2.14 is a well-known theorem in ODE. The statement presented here can be deduced from [BP7b, Theorem 2.1.1, Exercice 2.4]. The functions c(t), k(t) appearing in (C3) are assumed to be L, that is stronger than L 1 (on compact intervals). This stronger assumptions imply that the solution is not only absolutely continuous with respect to t, but also locally Lipschitz. 63

64 64

65 Chapter 3 Sub-Riemannian structures 3.1 Basic definitions In this section we introduce a definition of sub-riemannian structure which is quite general. Indeed, this definition includes all the classical notions of Riemannian structure, constant-rank sub- Riemannian structure, rank-varying sub-riemannian structure, almost-riemannian structure etc. Definition 3.1. Let M be a smooth manifold and let F Vec(M) be a family of smooth vector fields. The Lie algebra generated by F is the smallest sub-algebra of Vec(M) containing F, namely LieF := span{[x 1,...,[X j 1,X j ]],X i F,j N}. (3.1) We will say that F is bracket-generating (or that satisfies the Hörmander condition) if Moreover, for s N, we define Lie q F := {X(q),X LieF} = T q M, q M. Lie s F := span{[x 1,...,[X j 1,X j ]],X i F,j s}. (3.2) We say that the family F has step s at q if s N is the minimal integer satisfying Lie s qf := {X(q),X Lie s F} = T q M, Notice that, in general, the step may depend on the point on M and s = s(q) can be unbounded on M even for bracket-generating structures. Definition 3.2. Let M be a connected smooth manifold. A sub-riemannian structure on M is a pair (U,f) where: (i) U is an Euclidean bundle with base M and Euclidean fiber U q, i.e., for every q M, U q is a vector space equipped with a scalar product ( ) q, smooth with respect to q. For u U q we denote the norm of u as u 2 = (u u) q. (ii) f : U TM is a smooth map that is a morphism of vector bundles, i.e. the following diagram is commutative (here π U : U M and π : TM M are the canonical projections) U f TM π U π M (3.3) 65

66 and f is linear on fibers. (iii) The set of horizontal vector fields D := {f(σ) σ : M U smooth section}, is a bracketgenerating family of vector fields. We call step of the sub-riemannian structure at q the step of D. Whenthevector bundleuadmitsaglobal trivialization wesaythat (U,f) isafree sub-riemannian structure. A smooth manifold endowed with a sub-riemannian structure (i.e., the triple (M, U, f)) is called a sub-riemannian manifold. When the map f : U TM is fiberwise surjective, (M,U,f) is called a Riemannian manifold (cf. Exercise 3.23). Definition 3.3. Let (M, U, f) be a sub-riemannian manifold. The distribution is the family of subspaces {D q } q M, where D q := f(u q ) T q M. We call k(q) := dimd q the rank of the sub-riemannian structure at q M. We say that the sub-riemannian structure (U, f) on M has constant rank if k(q) is constant. Otherwise we say that the sub-riemannian structure is rank-varying. Theset of horizontal vector fields D Vec(M) has the structureof a finitely generated C (M)- module, whose elements are vector fields tangent to the distribution at each point, i.e. D q = {X(q) X D}. The rank of a sub-riemannian structure (M,U,f) satisfies k(q) m, where m = ranku, (3.4) k(q) n, where n = dimm. (3.5) In what follows we denote points in U as pairs (q,u), where q M is an element of the base and u U q is an element of the fiber. Following this notation we can write the value of f at this point as f(q,u) or f u (q). We prefer the second notation to stress that, for each q M, f u (q) is a vector in T q M. Definition 3.4. A Lipschitz curve γ : [,T] M is said to be admissible (or horizontal) for a sub-riemannian structure if there exists a measurable and essentially bounded function called the control function, such that u : t [,T] u(t) U γ(t), (3.6) γ(t) = f(γ(t),u(t)), for a.e. t [,T]. (3.7) In this case we say that u( ) is a control corresponding to γ. Notice that different controls could correspond to the same trajectory (see Figure 3.1). 66

67 D q Figure 3.1: A horizontal curve Remark 3.5. Once we have chosen a local trivialization O q R m for the vector bundle U, where O q is a neighborhood of a point q M, we can choose a basis in the fibers and the map f is written f(q,u) = m i=1 u if i (q), where m is the rank of U. In this trivialization, a Lipschitz curve γ : [,T] M is admissible if there exists u = (u 1,...,u m ) L ([,T],R m ) such that γ(t) = m u i (t)f i (γ(t)), for a.e. t [,T]. (3.8) i=1 Thanks to this local characterization and Theorem 2.14, for each initial condition q M and u L ([,T],R m ) it follows that there exists an admissible curve γ, defined on a sufficiently small interval, such that u is the control associated with γ and γ() = q. Remark 3.6. Notice that, for a curve to be admissible, it is not sufficient to satisfy γ(t) D γ(t) for almost every t [,T]. Take for instance the two free sub-riemannian structures on R 2 having rank two and defined by f(x,y,u 1,u 2 ) = (x,y,u 1,u 2 x), f (x,y,u 1,u 2 ) = (x,y,u 1,u 2 x 2 ). (3.9) and let D and D the corresponding moduli of horizontal vector fields. It is easily seen that the curve γ : [ 1,1] R 2, γ(t) = (t,t 2 ) satisfies γ(t) D γ(t) and γ(t) D γ(t) for every t [ 1,1]. Moreover, γ is admissible for f, since its corresponding control is (u 1,u 2 ) = (1,2) for a.e. t [ 1,1], but it is not admissible for f, since its corresponding control is uniquely determined as (u 1 (t),u 2 (t)) = (1,2/t) for a.e. t [ 1,1], which is not essentially bounded. This example shows that, for two different sub-riemannian structures (U,f) and (U,f ) on the same manifold M, one can have D q = D q for every q M, but D D. Notice, however, that if the distribution has constant rank one has D q = D q for every q M if and only if D = D The minimal control and the length of an admissible curve We start by defining the sub-riemannian norm for vectors that belong to the distribution. Definition 3.7. Let v D q. We define the sub-riemannian norm of v as follows v := min{ u, u U q s.t. v = f(q,u)}. (3.1) 67

68 Notice that since f is linear with respect to u, the minimum in (3.1) is always attained at a unique point. Indeed the condition f(q, ) = v defines an affine subspace of U q (which is nonempty since v D q ) and the minimum in (3.1) is uniquely attained at the orthogonal projection of the origin onto this subspace (see Figure 3.2). u 2 u 1 +u 2 = v v u 1 Figure 3.2: The norm of a vector v for f(x,u 1,u 2 ) = u 1 +u 2 Exercise 3.8. Show that is a norm in D q. Moreover prove that it satisfies the parallelogram law, i.e., it is induced by a scalar product q on D q, that can be recovered by the polarization identity v w q = 1 4 v +w v w 2, v,w D q. (3.11) Exercise 3.9. Let u 1,...,u m U q be an orthonormal basis for U q. Define v i = f(q,u i ). Show that if f(q, ) is injective then v 1,...,v m is an orthonormal basis for D q. An admissible curve γ : [,T] M is Lipschitz, hence differentiable at almost every point. Hence it is well defined the unique control t u (t) associated with γ and realizing the minimum in (3.1). Definition 3.1. Given an admissible curve γ : [,T] M, we define u (t) := argmin{ u, u U q s.t. γ(t) = f(γ(t),u)}. (3.12) for all differentiability point of γ. We say that the control u is the minimal control associated with γ. We stress that u (t) is pointwise defined for a.e. t [,T]. The proof of the following crucial Lemma is postponed to the Section 3.A. Lemma Let γ : [,T] M be an admissible curve. Then its minimal control u ( ) is measurable and essentially bounded on [, T]. 68

69 Remark If the admissible curve γ : [,T] M is differentiable, its minimal control is defined everywhere on [, T]. Nevertheless, it could be not continuous, in general. Consider, as in Remark 3.6, the free sub-riemannian structure on R 2 f(x,y,u 1,u 2 ) = (x,y,u 1,u 2 x), (3.13) and let γ : [ 1,1] R 2 defined by γ(t) = (t,t 2 ). Its minimal control u (t) satisfies (u 1 (t),u 2 (t)) = (1,2) when t, while (u 1 (),u 2 ()) = (1,), hence is not continuous. Thanks to Lemma 3.11 we are allowed to introduce the following definition. Definition Let γ : [,T] M be an admissible curve. We definethe sub-riemannian length of γ as l(γ) := T γ(t) dt. (3.14) We say that γ is length-parametrized (or arclength parametrized) if γ(t) = 1 for a.e. t [,T]. Notice that for a length-parametrized curve we have that l(γ) = T. Formula (3.14) says that the length of an admissible curve is the integral of the norm of its minimal control. l(γ) = T In particular any admissible curve has finite length. u (t) dt. (3.15) Lemma The length of an admissible curve is invariant by Lipschitz reparametrization. Proof. Let γ : [,T] M be an admissible curve and ϕ : [,T ] [,T] a Lipschitz reparametrization, i.e., a Lipschitz and monotone surjective map. Consider the reparametrized curve γ ϕ : [,T ] M, γ ϕ := γ ϕ. First observe that γ ϕ is a composition of Lipschitz functions, hence Lipschitz. Moreover γ ϕ is admissible since, by the linearity of f, it has minimal control (u ϕ) ϕ L, where u is the minimal control of γ. Using the change of variables t = ϕ(s), one gets l(γ ϕ ) = T γ ϕ (s) ds = T u (ϕ(s)) ϕ(s) ds = T u (t) dt = T γ(t) dt = l(γ). (3.16) Lemma Every admissible curve of positive length is a Lipschitz reparametrization of a lengthparametrized admissible one. Proof. Let ψ : [,T] M be an admissible curve with minimal control u. Consider the Lipschitz monotone function ϕ : [,T] [,l(ψ)] defined by ϕ(t) := t 69 u (τ) dτ.

70 Notice that if ϕ(t 1 ) = ϕ(t 2 ), the monotonicity of ϕ ensures ψ(t 1 ) = ψ(t 2 ). Hence we are allowed to define γ : [,l(ψ)] M by γ(s) := ψ(t), if s = ϕ(t) for some t [,T]. In other words, it holds ψ = γ ϕ. To show that γ is Lipschitz let us first show that there exists a constant C > such that, for every t,t 1 [,T] one has, in some local coordinates (where denotes the Euclidean norm in coordinates) ψ(t 1 ) ψ(t ) C t1 t u (τ) dτ. Indeed fix K M a compact set such that ψ([,t]) K and C := max x K ψ(t 1 ) ψ(t ) C t1 t t1 Hence if s 1 = ϕ(t 1 ) and s = ϕ(t ) one has m u i(t)f i (ψ(t)) dt i=1 m u i (t) 2 m f i (ψ(t)) 2 dt t i=1 t1 t u (t) dt, i=1 ( m ) 1/2. f i (x) 2 Then i=1 γ(s 1 ) γ(s ) = ψ(t 1 ) ψ(t ) C t1 t u (τ) dτ = C s 1 s, which proves that γ is Lipschitz. It particular γ(s) exists for a.e. s [,l(ψ)]. We are going to prove that γ is admissible and its minimal control has norm one. Define for every s such that s = ϕ(t), ϕ(t) exists and ϕ(t), the control v(s) := u (t) ϕ(t) = u (t) u (t). By Exercise 3.16 the control v is defined for a.e. s. Moreover, by construction, v(s) = 1 for a.e. s and v is the minimal control associated with γ. Exercise Show that for a Lipschitz and monotone function ϕ : [,T] R, the Lebesgue measure of the set {s R s = ϕ(t), ϕ(t) exists, ϕ(t) = } is zero. By the previous discussion, in what follows, it will be often convenient to assume that admissible curves are length-parametrized (or parametrized such that γ(t) is constant). 7

71 3.1.2 Equivalence of sub-riemannian structures In this section we introduce the notion of equivalence for sub-riemannian structures on the same base manifold M and the notion of isometry between sub-riemannian manifolds. Definition Let (U,f),(U,f ) be two sub-riemannian structures on a smooth manifold M. They are said to be equivalent if the following conditions are satisfied (i) there exist an Euclidean bundle V and two surjective vector bundle morphisms p : V U and p : V U such that the following diagram is commutative U p f 5 V TM p 2 f U (3.17) (ii) the projections p, p are compatible with the scalar product, i.e., it holds u = min{ v,p(v) = u}, u U, u = min{ v,p (v) = u }, u U, Remark If (U,f) and (U,f ) are equivalent sub-riemannian structures on M, then: (a) the distributions D q and D q defined by f and f coincide, since f(u q ) = f (U q ) for all q M. (b) for each w D q we have w = w, where and are the norms are induced by (U,f) and (U,f ) respectively. In particular the length of an admissible curve for two equivalent sub-riemannian structures is the same. Remark Notice that (i) is satisfied (with the vector bundle V possibly non Euclidean) if and only if the two moduli of horizontal vector fields D and D defined by U and U are equal (cf. Definition 3.2). Definition 3.2. Let M be a sub-riemannian manifold. We define the minimal bundle rank of M as the infimum of rank of bundles that induce equivalent structures on M. Given q M the local minimal bundle rank of M at q is the minimal bundle rank of the structure restricted on a sufficiently small neighborhood O q of q. Exercise Prove that the free sub-riemannian structure on R 2 defined by f : R 2 R 3 TR 2 defined by f(x,y,u 1,u 2,u 3 ) = (x,y,u 1,u 2 x+u 3 y) has non constant local minimal bundle rank. For equivalence classes of sub-riemannian structures we introduce the following definition. 71

72 Definition Two equivalent classes of sub-riemannian manifolds are said to be isometric if there exist two representatives (M,U,f),(M,U,f ), a diffeomorphism φ : M M and an isomorphism 1 of Euclidean bundles ψ : U U such that the following diagram is commutative U f TM (3.18) ψ φ U f TM Examples Our definition of sub-riemannian manifold is quite general. In the following we list some classical geometric structures which are included in our setting. 1. Riemannian structures. Classically a Riemannian manifold is defined as a pair (M, ), where M is a smooth manifold and q is a family of scalar product on T q M, smoothly depending on q M. This definition is included in Definition 3.2 by taking U = TM endowed with the Euclidean structure induced by and f : TM TM the identity map. Exercise Show that every Riemannian manifold in the sense of Definition 3.2 is indeed equivalent to a Riemannian structure in the classical sense above (cf. Exercise 3.8). 2. Constant rank sub-riemannian structures. Classically a constant rank sub-riemannian manifold is a triple (M,D, ), where D is a vector subbundle of TM and q is a family of scalar product on D q, smoothly depending on q M. This definition is included in Definition 3.2 by taking U = D, endowed with its Euclidean structure, and f : D TM the canonical inclusion. 3. Almost-Riemannian structures. An almost-riemannian structure on M is a sub-riemannian structure (U,f) on M such that its local minimal bundle rank is equal to the dimension of the manifold, at every point. 4. Free sub-riemannian structures. Let U = M R m be the trivial Euclidean bundle of rank m on M. A point in U can be written as (q,u), where q M and u = (u 1,...,u m ) R m. If we denote by {e 1,...,e m } an orthonormal basis of R m, then we can define globally m smooth vector fields on M by f i (q) := f(q,e i ) for i = 1,...,m. Then we have f(q,u) = f ( q, ) m u i e i = i=1 m u i f i (q), q M. (3.19) In this case, the problem of finding an admissible curve joining two fixed points q,q 1 M 1 isomorphism of bundles in the broad sense, it is fiberwise but is not obliged to map a fiber in the same fiber. i=1 72

73 and with minimal length is rewritten as the optimal control problem m γ(t) = u i (t)f i (γ(t)) i=1 T u(t) dt min γ() = q, γ(t) = q 1 (3.2) For a free sub-riemannian structure, the set of vector fields f 1,...,f m build as above is called a generating family. Notice that, in general, a generating family is not orthonormal when f is not injective. 5. Surfaces in R 3 as free sub-riemannian structures Due to topological constraints, in general it not possible to regard a surface of R 3 (with the induced metric) as a free sub-riemannian structure of rank 2, i.e., defined by a pair of globally defined orthonormal vector fields. However, it is always possible to regard it as a free sub-riemannian structure of rank 3. Indeed, for an embedded surface M in R 3, consider the trivial Euclidean bundle U = M R 3, where points are denoted as usual (q,u), with u R 3,q M, and the map f : U TM, f(q,u) = π q (u) T qm. (3.21) where π q : R 3 T q M R 3 is the orthogonal projection. Notice that f is a surjective bundle map and the set of vector fields {π q ( x ),π q ( y ),π q ( z )} is a generating family for this structure. Exercise Show that (U, f) defined in (3.21) is equivalent to the Riemannian structure on M induced by the embedding in R Every sub-riemannian structure is equivalent to a free one The purpose of this section is to show that every sub-riemannian structure (U,f) on M is equivalent to a sub-riemannian structure (U,f ) where U is a trivial bundle with sufficiently big rank. Lemma Let M be a n-dimensional smooth manifold and π : E M a smooth vector bundle of rank m. Then, there exists a vector bundle π : E M with ranke 2n + m such that E E is a trivial vector bundle. Proof. Remember that E, as a smooth manifold, has dimension dim E = dim M +rank E = n+m. Consider the map i : M E which embeds M into the vector bundle E as the zero section M = i(m). If we denote with T M E := i (TE) the pullback vector bundle, i.e., the restriction of TE to the section M, we have the isomorphism (as vector bundles on M) T M E E TM. (3.22) 73

74 Eq. (3.22) is a consequence of the fact that the tangent to every fibre E q, being a vector space, is canonically isomorphic to its tangent space T q E q so that T q E = T q E q T q M E q T q M, q M. By Whitney theorem we have a (nonlinear on fibers, in general) immersion Ψ : E R N, Ψ : T M E TE TR N, for N = 2(n+m), and Ψ is injective as bundle map, i.e., T M E is a sub-bundleof TR N R N R N. Thus we can choose as a complement E, the orthogonal bundle (on the base M) with respect to the Euclidean metric in R N, i.e. E = E q, E q = (T qe q T q M), q M and considering E := T M E E we have that E is trivial since its fibers are sum of orthogonal complements and by (3.22) we are done. Corollary Every sub-riemannian structure (U, f) on M is equivalent to a sub-riemannian structure (U, f) where U is a trivial bundle. Proof. By Lemma 3.25 there exists a vector bundle U such that the direct sum U := U U is a trivial bundle. Endow U with any metric structure g. Define a metric on U in such a way that ḡ(u+u,v +v ) = g(u,v) +g (u,v ) on each fiber Ūq = U q U q. Notice that U q and U q are orthogonal subspace of Ūq with respect to ḡ. Let us define the sub-riemannian structure (U, f) on M by f : U TM, f := f p1, where p 1 : U U U denotes the projection on the first factor. By construction, the diagram U Id f U U TM p 1 4 f U (3.23) is commutative. Moreover condition (ii) of Definition 3.17 is satisfied since for every ū = u +u, with u U q and u U q, we have ū 2 = u 2 + u 2, hence u = min{ ū,p 1 (ū) = u}. Since every sub-riemannian structure is equivalent to a free one, in what follows we can assume that there exists a global generating family, i.e., a family of f 1,...,f m of vector fields globally defined on M such that every admissible curve of the sub-riemannian structure satisfies γ(t) = m u i (t)f i (γ(t)), (3.24) i=1 74

75 Moreover, by the classical Gram-Schmidt procedure, we can assume that f i are the image of an orthonormal frame defined on the fiber. (cf. Example 4 of Section 3.1.3) Under these assumptions the length of an admissible curve γ is given by T T l(γ) = u (t) dt = m u i (t)2 dt, where u (t) is the minimal control associated with γ. Notice that Corollary 3.26 implies that the modulus of horizontal vector fields D is globally generated by f 1,...,f m. Remark The integral curve γ(t) = e tf i, defined on [,T], of an element f i of a generating family F = {f 1,...,f m } is admissible and l(γ) T. If F = {f 1,...,f m } are linearly independent then they are an orthonormal frame and l(γ) = T. Exercise Consider a sub-riemannian structure (U, f) over M. Let m = rank(u) and h max = max{h(q) : q M} m where h(q) is the local minimal bundle rank at q. Prove that there exists a sub-riemannian structure (Ū, f) equivalent to (U,f) such that rank(ū) = h max Proto sub-riemannian structures Sometimes can be useful to consider structures that satisfy only property (i) and (ii) of Definition 3.2, but that are not bracket generating. In what follows we call these structures proto sub- Riemannian structures. The typical example is the one of a Riemannian foliation, that is obtained when the family of horizontal vector fields D satisfies (i) [D,D] D, (ii) dimd q does not depend on q M. In this case the manifold M is foliated by integral manifolds of the distribution, and each of them is endowed with a Riemannian structure. i=1 3.2 Sub-Riemannian distance and Chow-Rashevskii Theorem In this section we introduce the sub-riemannian distance between two points as the infimum of the length of admissible curves joining them. Recall that, in the definition of sub-riemannian manifold, M is assumed to be connected. Moreover, thanks to the construction of Section 3.1.4, in what follows we can assume that the sub- Riemannian structure is free, with generating family F = {f 1,...,f m }. Notice that, by definition, F is assumed to be bracket generating. Definition Let M be a sub-riemannian manifold and q,q 1 M. The sub-riemannian distance (or Carnot-Caratheodory distance) between q and q 1 is d(q,q 1 ) = inf{l(γ) γ : [,T] M admissible, γ() = q, γ(t) = q 1 }, (3.25) 75

76 One of the purpose of this section is to show that, thanks to the bracket generating condition, (9.1) is well-defined, namely for every q,q 1 M, there exists an admissible curve that joins q to q 1, hence d(q,q 1 ) < +. Theorem 3.3 (Chow-Raschevskii). Let M be a sub-riemannian manifold. Then (i) (M,d) is a metric space, (ii) the topology induced by (M, d) is equivalent to the manifold topology. In particular, d : M M R is continuous. In what follows B(q,r) (sometimes denoted also B r (q)) is the (open) sub-riemannian ball of radius r and center q B(q,r) := {q M d(q,q ) < r}. The rest of this section is devoted to the proof of Theorem 3.3. To prove it, we have to show that d is actually a distance, i.e., (a) d(q,q 1 ) < + for all q,q 1 M, (b) d(q,q 1 ) = if and only if q = q 1, (c) d(q,q 1 ) = d(q 1,q ) and d(q,q 2 ) d(q,q 1 )+d(q 1,q 2 ) for all q,q 1,q 2 M, and the equivalence between the metric and the manifold topology: for every q M we have (d) for every ε > there exists a neighborhood O q of q such that O q B(q,ε), (e) for every neighborhood O q of q there exists δ > such that B(q,δ) O q Proof of Chow-Raschevskii Theorem The symmetry of d is a direct consequence of the fact that if γ : [,T] M is admissible, then the curve γ : [,T] M defined by γ(t) = γ(t t) is admissible and l( γ) = l(γ). The triangular inequality follows from the fact that, given two admissible curves γ 1 : [,T 1 ] M and γ 2 : [,T 2 ] M such that γ 1 (T 1 ) = γ 2 (), their concatenation { γ 1 (t), t [,T 1 ], γ : [,T 1 +T 2 ] M, γ(t) = (3.26) γ 2 (t T 1 ), t [T 1,T 1 +T 2 ]. is still admissible. These two arguments prove item (c). We divide the rest of the proof of the Theorem in the following steps. S1. We prove that, for every q M, there exists a neighborhood O q of q such that d(q, ) is finite and continuous in O q. This proves (d). S2. We prove that d is finite on M M. This proves (a). S3. We prove (b) and (e). To prove Step 1 we first need the following lemmas: 76

77 Lemma Let N M be a submanifold and F Vec(M) be a family of vector fields tangent to N, i.e., X(q) T q N, for every q N and X F. Then for all q N we have Lie q F T q N. In particular dimlie q F dimn. Proof. Let X F. As a consequence of the local existence and uniqueness of the two Cauchy problems { { q = X(q), q M, q = X and N (q), q N, q() = q, q N. q() = q, q N. it follows that e tx (q) N for every q N and t small enough. This property, together with the definition of Lie bracket (see formula (2.27)) implies that, if X,Y are tangent to N, the vector field [X,Y] is tangent to N as well. Iterating this argument we get that Lie q F T q N for every q N, from which the conclusion follows. Lemma Let M be an n-dimensional sub-riemannian manifold with generating family F = {f 1,...,f m }. For every q M and every neighborhood V of the origin in R n there exist ŝ = (ŝ 1,...,ŝ n ) V, and a choice of n vector fields f i1,...,f in F, such that ŝ is a regular point of the map ψ : R n M, ψ(s 1,...,s n ) = e snf in e s 1f i1(q ). Remark Notice that, if D q T q M, then ŝ = cannot be a regular point of the map ψ. Indeed, for s =, the image of the differential of ψ at is span q {f ij,j = 1,...,n} D q and the differential of ψ cannot be surjective. We stress that, in the choice of f i1,...,f in F, a vector field can appear more than once, as for instance in the case m < n. Proof of Lemma We prove the lemma by steps. 1. There exists a vector field f i1 F such that f i1 (q ), otherwise all vector fields in F vanish at q and dimlie q F =, which contradicts the bracket generating condition. Then, for s small enough, the map φ 1 : s 1 e s 1f i1(q ), is a local diffeomorphism onto its image Σ 1. If dimm = 1 the Lemma is proved. 2. Assume dimm 2. Then there exist t 1 1 R, with t1 1 small enough, and f i 2 F such that, if we denote by q 1 = e t1 1 f i 1 (q ), the vector f i2 (q 1 ) is not tangent to Σ 1. Otherwise, by Lemma 3.31, dim Lie q F = 1, which contradicts the bracket generating condition. Then the map φ 2 : (s 1,s 2 ) e s 2f i2 e s 1 f i1(q ), is a local diffeomorphism near (t 1 1,) onto its image Σ 2. Indeed the vectors φ 2 s 1 T q1 Σ 1, (t 1 1,) φ 2 s 2 = f i2 (q 1 ), (t 1 1,) are linearly independent by construction. If dimm = 2 the Lemma is proved. 77

78 3. Assume dimm 3. Then there exist t 1 2,t2 2, with t1 2 t1 1 and t2 2 small enough, and f i 3 F such that, if q 2 = e t2 2 f i 2 e t1 2 f i 1 (q ) we have that f i3 (q 2 ) is not tangent to Σ 2. Otherwise, by Lemma 3.31, dim Lie q1 D = 2, which contradicts the bracket generating condition. Then the map φ 3 : (s 1,s 2,s 3 ) e s 3f i3 e s 2 f i2 e s 1 f i1(q ), is a local diffeomorphism near (t 1 2,t2 2,). Indeed the vectors φ 3 s 1, φ 3 φ 3 (t 1 2,t 2 2,) s 2 T q2 Σ 2, (t 1 2,t 2 2,) s 3 = f i3 (q 2 ), (t 1 2,t 2 2,) are linearly independent since the last one is transversal to T q2 Σ 2 by construction, while the first two are linearly independent since φ 3 (s 1,s 2,) = φ 2 (s 1,s 2 ) and φ 2 is a local diffeomorphisms at (t 1 2,t2 2 ) which is close to (t1 1,). Repeating the same argument n times (with n = dimm), the lemma is proved. Proof of Step 1. Thanks to Lemma 3.32 there exists a neighborhood V V of ŝ such that ψ is a diffeomorphism from V to ψ( V), see Figure 3.3. We stress that in general q = ψ() does not belong to ψ( V), cf. Remark V V ŝ ψ ψ( V) q Figure 3.3: Proof of Lemma 3.32 To build a local diffeomorphism whose image contains q, we consider the map ψ : R n M, ψ(s1,...,s n ) = e ŝ 1f i1 e ŝ nf in ψ(s 1,...,s n ), which has the following property: ψ is a diffeomorphism from a neighborhood of ŝ V, that we still denote V, to a neighborhood of ψ(ŝ) = q. Fix now ε > and apply the construction above where V is the neighborhood of the origin in R n defined by V = {s R n, n i=1 s i < ε}. Let us show that the claim of Step 1 holds with O q = ψ( V). Indeed, for every q ψ( V), let s = (s 1,...,s n ) such that q = ψ(s), and denote by γ the admissible curve joining q to q, built by 2n-pieces, as in Figure

79 ψ V V s ψ(s) ψ(s) q ψ( V) Figure 3.4: The map ψ In other words γ is the concatenation of integral curves of the vector fields f ij, i.e., admissible curves of the form t e tf i j (q) defined on some interval [,T], whose length is less or equal than T (cf. Remark 3.27). Since s,ŝ V V, it follows that: which ends the proof of Step 1. d(q,q) l(γ) s s n + ŝ ŝ n < 2ε, Proof of Step 2. To prove that d is finite on M M let us consider the equivalence classes of points in M with respect to the relation q 1 q 2 if d(q 1,q 2 ) < +. (3.27) From the triangular inequality and the proof of Step 1, it follows that each equivalence class is open. Moreover, by definition, the equivalence classes are disjoint and nonempty. Since M is connected, it cannot be the union of open disjoint and nonempty subsets. It follows that there exists only one equivalence class. Lemma Let q M and K M a compact set with q intk. Then there exists δ K > such that every admissible curve γ starting from q and with l(γ) δ K is contained in K. Proof. Without loss of generality we can assume that K is contained in a coordinate chart of M, where we denote by the Euclidean norm in the coordinate chart. Let us define C K := max x K ( m ) 1/2 f i (x) 2 i=1 (3.28) and fix δ K > such that dist(q, K) > C K δ K (here dist is the Euclidean distance, in coordinates). Let us show that for any admissible curve γ : [,T] M such that γ() = q and l(γ) δ K we have γ([,t]) K. Indeed, if this is not true, there exists an admissible curve γ : [,T] M 79

80 with l(γ) δ K and t := sup{t [,T],γ([,t]) K}, with t < T. Then γ(t ) γ() t t γ(t) dt t m u i (t)f i(γ(t)) dt (3.29) i=1 m f i (γ(t)) 2 m u i (t)2 dt (3.3) i= i= t C K m u i (t)2 dt C K l(γ) (3.31) i= C K δ K < dist(q, K). (3.32) which contradicts the fact that, at t, the curve γ leaves the compact K. Thus t = T. Proof of Step 3. Let us prove that Lemma 3.34 implies property (b). Indeed the only nontrivial implication is that d(q,q 1 ) > whenever q q 1. To prove this, fix a compact neighborhood K of q such that q 1 / K. By Lemma 3.34, each admissible curve joining q and q 1 has length greater than δ K, hence d(q,q 1 ) δ K >. Let us now prove property (e). Fix ε > and a compact neighborhood K of q. Define C K and δ K as in Lemma 3.34, and set δ := min{δ K,ε/C K }. Let us show that q q < ε whenever d(q,q) < δ, where again is the Euclidean norm in a coordinate chart. Consider a minimizing sequence γ n : [,T] M of admissible trajectories joining q and q such that l(γ n ) d(q,q) for n. Without loss of generality, we can assume that l(γ n ) δ for all n. By Lemma 3.34, γ n ([,T]) K for all n. We can repeat estimates (3.29)-(3.31) proving that q q = γ n (T) γ n () C K l(γ n ) for all n. Passing to the limit for n, one gets q q C K d(q,q) C K δ < ε. (3.33) Corollary The metric space (M,d) is locally compact, i.e., for any q M there exists ε > such that the closed sub-riemannian ball B(q,r) is compact for all r ε. Proof. By the continuity of d, the set B(q,r) = {d(q, ) r} is closed for all q M and r. Moreover the sub-riemannian metric d induces the manifold topology on M. Hence, for radius small enough, the sub-riemannian ball is bounded. Thus small sub-riemannian balls are compact. 3.3 Existence of length-minimizers In this section we want to discuss the existence of length-minimizers. Definition Let γ : [,T] M be an admissible curve. We say that γ is a length-minimizer if it minimizes the length among admissible curves with same endpoints, i.e., l(γ) = d(γ(), γ(t)). 8

81 Remark Notice that the existence length-minimizers between two points is not guaranteed in general, as it happens for two points in M = R 2 \{} (endowed with the Euclidean distance) that are symmetric with respect to the origin. On the other hand, when length-minimizers exist between two fixed points, they may not be unique, as it happens for two antipodal points on the sphere S 2. We now show a general semicontinuity property of the length functional. Theorem Let γ n : [,T] M be a sequence of admissible curves on M such that γ n γ uniformly on [,T]. Then l(γ) liminf l(γ n). (3.34) n If moreover liminf n l(γ n ) < +, then γ is also admissible. Proof. Denote L := liminfl(γ n ) and choose a subsequence, still denoted by the same symbol, such that l(γ n ) L. If L = + the inequality (3.34) is true, thus we can assume L < +. Fix δ >. It is not restrictive to assume that, for n large enough, l(γ n ) L+δ and, by uniform convergence, that the image of γ n are all contained in a common compact set K. Now we divide the proof into two steps (i). We first prove that statement assuming that all γ n are parametrized with constant speed on the interval [,1]. Under this assumption we have that γ n (t) V γn(t) for a.e. t, where V q = {f u (q), u L+δ} T q M, f u (q) = m u i f i (q). i=1 Notice that V q is convex for every q M, thanks to the linearity of f in u. Let us prove that γ is admissible and satisfies l(γ) L+δ. Once this is done, since δ is arbitrary, this implies l(γ) L, that is (3.34). Writing in local coordinates, we have for every ε > 1 ε (γ n(t+ε) γ n (t)) = 1 ε t+ε t f un(τ)(γ n (τ))dτ conv{v γn(τ),τ [t,t+ε]}. (3.35) Next we want to estimate the right hand side of (3.35) uniformly with respect to n. For n n sufficiently large, we have γ n (t) γ(t) < ε (by uniform convergence) and an estimate similar to (3.31) gives for τ [t,t+ε] γ n (t) γ n (τ) τ t γ n (s) ds C K (L+δ)ε. (3.36) where C K is the constant (3.28) defined by the compact K. Hence we deduce for every τ [t,t+ε] and every n n γ n (τ) γ(t) γ n (t) γ n (τ) + γ n (t) γ(t) C ε, (3.37) where C is independent on n and ε. From the estimate (3.37) and the equivalence of the manifold and metric topology we have that, for all τ [t,t+ε] and n n, γ n (τ) B γ(t) (r ε ), with r ε when ε. In particular conv{v γn(τ), τ [t,t+ε]} conv{v q, q B γ(t) (r ε )}. (3.38) 81

82 Plugging (3.38) in (3.35) and passing to the limit for n we get finally to 1 ε (γ(t+ε) γ(t)) conv{v q, q B γ(t) (r ε )}. (3.39) Assume now that t [,1] is a differentiability point of γ. Then the limit of the left hand side in (3.39) for ε exists and gives γ(t) convv γ(t) = V γ(t). For every differentiability point t we can thus define the unique u (t) satisfying γ(t) = f(γ(t),u (t)) and u (t) = γ(t). Using the argument contained in Appendix 3.A it follows that u (t) is measurable in t. Moreover u (t) is essentially bounded since, by construction, u (t) L+δ for a.e. t [,T]. Hence γ is admissible. Moreover l(γ) L+δ since γ is defined on the interval [,1]. When γ n : [,T] M is an arbitrary sequence converging uniformly to γ, let us consider the family γ n : [,1] M such that γ n is parametrized by constant speed on [,1] (cf. Lemma??). In particular γ n = γ n ϕ n, ϕ n (t) = 1 t u n l(γ n ) (s) ds To prove the statement it is enough to prove that γ n γ where γ is some reparametrization of g, since length is invariant by reparametrization. Reasoning as in the proof of we get γ n (s 1 ) γ n (s ) C K (L+δ) s 1 s then we can apply the Ascoli-Arzela theorem on the reparametrized sequence and we get that a subsequence is uniformly convergent to a curve, that is necessarily a curve γ whose γ is a reparametrization. Corollary Let γ n be a sequence of length-minimizers on M such that γ n γ uniformly. Then γ is a length-minimizer. Proof. Since the length is invariant under reparametrization, it is not restrictive to assume that all curves γ n and γ are parametrized on [,1]. Since γ n is a length-minimizer one has l(γ n ) = d(γ n (),γ n (1)). By uniform convergence γ n (t) γ(t) for every t [,1] and, by continuity of the distance and semicontinuity of the length l(γ) liminf n l(γ n) = liminf n d(γ n(),γ n (1)) = d(γ(),γ(1)), that implies that l(γ) = d(γ(), γ(1)), i.e., γ is a length-minimizer. The semicontinuity of the length implies the existence of minimizers, under a natural compactness assumption on the space. Theorem 3.4 (Existence of minimizers). Let M be a sub-riemannian manifold and q M. Assume that the ball B q (r) is compact, for some r >. Then for all q 1 B q (r) there exists a length minimizer joining q and q 1, i.e., we have d(q,q 1 ) = min{l(γ) γ : [,T] M admissible,γ() = q,γ(t) = q 1 }. Proof. Fix q 1 B q (r) and consider a minimizing sequence γ n : [,1] M of admissible trajectories, parametrized with constant speed, joining q and q 1 and such that l(γ n ) d(q,q 1 ). 82

83 Since d(q,q 1 ) < r, we have l(γ n ) r for all n n large enough, hence we can assume without loss of generality that the image of γ n is contained in the common compact K = B q (r) for all n. In particular, the same argument leading to (3.36) shows that for all n n γ n (t) γ n (τ) t τ γ n (s) ds C K r t τ, t,τ [,1]. (3.4) wherec K dependsonly on K. In other words, all trajectories in thesequence {γ n } n N are Lipschitz with the same Lipschitz constant. Thus the sequence is equicontinuous and uniformly bounded. By the classical Ascoli-Arzelà Theorem there exist a subsequence of γ n, which we still denote by the same symbol, and a Lipschitz curve γ : [,T] M such that γ n γ uniformly. By Theorem 3.38, the curve γ satisfies l(γ) liminfl(γ n ) = d(q,q 1 ), that implies l(γ) = d(q,q 1 ). Remark Assume that B(q,r ) is compact for some r >. Then for every < r r we have that B(q,r) is compact also, being a closed subset of a compact set B(q,r ). Corollary Let q M. Under the hypothesis of Corollary 3.4 there exists ε > such that for all r ε and q 1 B q (r) there exists a minimizing curve joining q and q 1. Proof. It is a direct consequence of Theorem 3.4 and Corollary Remark It is well known that a length space is complete if and only if all closed balls are compact, see[bbi1, Ch. 2]. In particular, if(m, d) is complete with respect to the sub-riemannian distance, then for every q,q 1 M there exists a length minimizer joining q and q Pontryagin extremals In this section we want to give necessary conditions to characterize length-minimizer trajectories. To begin with, we would like to motivate our Hamiltonian approach that we develop in the sequel. In classical Riemannian geometry length-minimizer trajectories satisfy a necessary condition given by a second order differential equation in M, which can be reduced to a first-order differential equation in TM. Hence the set of all length-minimizers is contained in the set of extremals, i.e., trajectories that satisfy the necessary condition, that are be parametrized by initial position and velocity. In our setting (which includes Riemannian and sub-riemannian geometry) we cannot use the initial velocity to parametrize length-minimizer trajectories. This can be easily understood by a dimensional argument. If the rank of the sub-riemannian structure is smaller than the dimension of the manifold, the initial velocity γ() of an admissible curve γ(t) starting from q, belongs to the proper subspace D q of the tangent space T q M. Hence the set of admissible velocities form a set whose dimension is smaller than the dimension of M, even if, by the Chow and Filippov theorems, length-minimizer trajectories starting from a point q cover a full neighborhood of q. The right approach is to parametrize length-minimizers by their initial point and an initial covector λ T q M, which can be thought as the linear form annihilating the front, i.e., the set {γ q (ε) γ q is a length-minimizer starting from q } on the corresponding length-minimizer trajectory for ε. The next theorem gives the necessary condition satisfied by length-minimizers in sub-riemannian geometry. Curves satisfying this condition are called Pontryagin extremals. The proof the following theorem is given in the next section. 83

84 Theorem 3.44 (Characterization of Pontryagin extremals). Let γ : [, T] M be an admissible curve which is a length-minimizer, parametrized by constant speed. Let u( ) be the corresponding minimal control, i.e., for a.e. t [,T] m T γ(t) = u i (t)f i (γ(t)), l(γ) = u(t) dt = d(γ(),γ(t)), i=1 with u(t) constant a.e. on [,T]. Denote with P,t the flow 2 of the nonautonomous vector field f u(t) = k i=1 u i(t)f i. Then there exists λ Tγ() M such that defining we have that one of the following conditions is satisfied: (N) u i (t) λ(t),f i (γ(t)), i = 1,...,m, (A) λ(t),f i (γ(t)), i = 1,...,m. Moreover in case (A) one has λ. λ(t) := (P,t 1 ) λ, λ(t) Tγ(t) M, (3.41) Notice that, by definition, the curve λ(t) is Lipschitz continuous. Moreover the conditions (N) and (A) are mutually exclusive, unless u(t) = for a.e. t [,T], i.e., γ is the trivial trajectory. Definition Letγ : [,T] M beanadmissiblecurvewithminimalcontrolu L ([,T],R m ). Fix λ Tγ() M \{}, and define λ(t) by (3.41). - If λ(t) satisfies (N) then it is called normal extremal (and γ(t) a normal extremal trajectory). - If λ(t) satisfies (A) then it is called abnormal extremal (and γ(t) a abnormal extremal trajectory). Remark If the sub-riemannian structure is not Riemannian at q, namely if D q = span q {f 1,...,f m } T q M, then the trivial trajectory, corresponding to u(t), is always normal and abnormal. Notice that even a nontrivial admissible trajectory γ can be both normal and abnormal, since there may exist two different lifts λ(t),λ (t) T γ(t) M, such that λ(t) satisfies (N) and λ (t) satisfies (A). Remark In the Riemannian case there are no abnormal extremals. Indeed, since the map f is fiberwise surjective, we can always find m vector fields f 1,...,f m on M such that span q {f 1,...,f m } = T q M, and (A) would imply that λ,v =, for all v T q M, that gives the contradiction λ =. Exercise Prove that condition (N) of Theorem 3.41 implies that the minimal control u(t) is smooth. In particular normal extremals are smooth. At this level it seems not obvious how to use Theorem 3.44 to find the explicit expression of extremals for a given problem. In the next chapter we provide another formulation of Theorem 3.44 which gives Pontryagin extremals as solutions of a Hamiltonian system. The rest of this section is devoted to the proof of Theorem P,t(x) is defined for t [,T] and x in a neighborhood of γ() 84

85 3.4.1 The energy functional Let γ : [,T] M be an admissible curve. We define the energy functional J on the space of Lipschitz curves on M as follows J(γ) = 1 2 T Notice that J(γ) < + for every admissible curve γ. γ(t) 2 dt. Remark While l is invariant by reparametrization (see Remark 3.14), J is not. Indeed consider, for every α >, the reparametrized curve γ α : [,T/α] M, γ α (t) = γ(αt). Using that γ α (t) = α γ(αt), we have J(γ α ) = 1 2 T/α γ α (t) 2 dt = 1 2 T/α α 2 γ(αt) 2 dt = αj(γ). Thus, if the final time is not fixed, the infimum of J, among admissible curves joining two fixed points, is always zero. The following lemma relates minimizers of J with fixed final time with minimizers of l. Lemma 3.5. Fix T > and let Ω q,q 1 be the set of admissible curves joining q,q 1 M. An admissible curve γ : [,T] M is a minimizer of J on Ω q,q 1 if and only if it is a minimizer of l on Ω q,q 1 and has constant speed. Proof. Applying the Cauchy-Schwarz inequality ( T 2 f(t)g(t)dt) with f(t) = γ(t) and g(t) = 1 we get T T f(t) 2 dt g(t) 2 dt, (3.42) l(γ) 2 2J(γ)T. (3.43) Moreover in (3.42) equality holds if and only if f is proportional to g, i.e., γ(t) = const. in (3.43). Since, by Lemma 3.15, every curve is a Lipschitz reparametrization of a length-parametrized one, the minima of J are attained at admissible curves with constant speed, and the statement follows Proof of Theorem 3.44 By Lemma 3.5 we can assume that γ is a minimizer of the functional J among admissible curves joining q = γ() and q 1 = γ(t) in fixed time T >. In particular, if we define the functional J(u( )) := 1 2 T 85 u(t) 2 dt, (3.44)

86 on the space of controls u( ) L ([,T],R m ), the minimal control u( ) of γ is a minimizer for the energy functional J J(u( )) J(u( )), u L ([,T],R m ), where trajectories corresponding to u( ) join q,q 1 M. In the following we denote the functional J by J. Consider now a variation u( ) = u( )+v( ) of the control u( ), and its associated trajectory q(t), solution of the equation q(t) = f u(t) (q(t)), q() = q, (3.45) Recall that P,t denotes the local flow associated with the optimal control u( ) and that γ(t) = P,t (q ) is the optimal admissible curve. We stress that in general, for q different from q, the curve t P,t (q) is not optimal. Let us introduce the curve x(t) defined by the identity q(t) = P,t (x(t)). (3.46) Inotherwordsx(t) = P,t 1 (q(t)) isobtained byapplyingtheinverseof theflow ofu( ) tothesolution associated with the new control u( ) (see Figure 3.5). Notice that if v( ) =, then x(t) q. q(t) P,t x(t) q Figure 3.5: The trajectories q(t), associated with u( ) = u( ) + v( ), and the corresponding x(t). The next step is to write the ODE satisfied by x(t). Differentiating (3.46) we get q(t) = f u(t) (q(t))+(p,t ) (ẋ(t)) (3.47) = f u(t) (P,t (x(t)))+(p,t ) (ẋ(t)) (3.48) and using that q(t) = f u(t) (q(t)) = f u(t) (P,t (x(t)) we can invert (3.48) with respect to ẋ(t) and rewrite it as follows ẋ(t) = (P,t 1 ) [ (fu(t) f u(t) )(P,t (x(t))) ] [ ] = (P,t 1 ) (f u(t) f u(t) ) (x(t)) [ ] = (P,t 1 ) (f u(t) u(t) ) (x(t)) [ = (P,t 1 ) f v(t) ](x(t)) (3.49) 86

87 If we define the nonautonomous vector field g t v(t) = (P 1,t ) f v(t) we finally obtain by (3.49) the following Cauchy problem for x(t) ẋ(t) = g t v(t) (x(t)), x() = q. (3.5) Notice that the vector field gv t is linear with respect to v, since f u is linear with respect to u. Now we fix the control v(t) and consider the map ( ) J(u+sv) s R R M x(t;u+sv) where x(t;u + sv) denote the solution at time T of (3.5), starting from q, corresponding to control u( )+sv( ), and J(u+sv) is the associated cost. Lemma There exists λ (R T q M), with λ, such that for all v L ([,T],R m ) ( J(u+sv) λ,, x(t;u+sv) ) =. (3.51) s s= s s= Proof of Lemma We argue by contradiction: assume that (3.51) is not true, then there exist v,...,v n L ([,T],R m ) such that the vectors in R T q M J(u+sv ) s x(t;u+sv ) s s= s=,..., are linearly independent. Let us then consider the map Φ : R n+1 R M, Φ(s,...,s n ) = J(u+sv n ) s x(t;u+sv n ) s s= s= ( J(u+ n i= s iv i ) x(t;u+ n i= s iv i ) (3.52) ). (3.53) By differentiability properties of solution of smooth ODEs with respect to parameters, the map (3.53) is smooth in a neighborhood of s =. Moreover, since the vectors (3.52) are the components of the differential of Φ and they are independent, then the inverse function theorem implies that Φ is a local diffeomorphism sending a neighborhood of s = in R n+1 in a neighborhood of (J(u),q ) in R M. As a result we can find v( ) = i s iv i ( ) such that (see also Figure 3.4.2) x(t;u+v) = q, J(u+v) < J(u). In other words the curve t q(t;u+v) joins q(;u+v) = q to q(t;u+v) = P,T (x(t;u+v)) = P,T (q ) = q 1, with a cost smaller that the cost of γ(t) = q(t;u), which is a contradiction Remark Notice that if λ satisfies (3.51), then for every α R, with α, α λ satisfies (3.51) too. Thus we can normalize λ to be ( 1,λ ) or (,λ ), with λ T q M, and λ in the second case (since λ is not zero). 87

88 J J(ū) x(t, ū) x Condition (3.51) implies that there exists λ Tq M such that one of the following identities is satisfied for all v L ([,T],R m ): J(u+sv) = λ, x(t;u+sv), (3.54) s s= s s= = λ, x(t;u+sv). (3.55) s s= with λ in the second case (cf. Remark 3.52). To end the proof we have to show that identities (3.54) and (3.55) are equivalent to conditions (N) and (A) of Theorem Let us show that J(u+sv) s x(t;u+sv) s = s= = s= T T m u i (t)v i (t)dt, (3.56) i=1 g t v(t) (q )dt = The identity (3.56) follows from the definition of J J(u+sv) = 1 2 T T m i=1 ((P 1,t ) f i )(q )v i (t)dt. (3.57) u+sv 2 dt. (3.58) Eq. (3.57) can be proved in coordinates. Indeed by (3.5) and the linearity of g v with respect to v we have T x(t;u+sv) = q +s gv(t) t (x(t;u+sv))dt, and differentiating with respect to s at s = one gets (3.57). Let us show that (3.54) is equivalent to (N) of Theorem Similarly, one gets that (3.55) is equivalent to (A). Using (3.56) and (3.57), equation (3.54) is rewritten as T m u i (t)v i (t)dt = i=1 = T m i=1 T λ,((p,t 1 ) f i )(q ) v i (t)dt m λ(t),f i (γ(t)) v i (t)dt, (3.59) i=1 88

89 where we used, for every i = 1,...,m, the identities λ,((p,t 1 ) f i )(q ) = λ,(p,t 1 ) f i (γ(t)) = (P,t 1 ) λ,f i (γ(t)) = λ(t),f i (γ(t)). Since v i ( ) L ([,T],R m ) are arbitrary, we get u i (t) = λ(t),f i (γ(t)) for a.e. t [,T]. 3.A Measurability of the minimal control In this appendix we prove a technical lemma about measurability of solutions to a class of minimization problems. This lemma when specified to the sub-riemannian context, implies that the minimal control associated with an admissible curve is measurable. 3.A.1 Main lemma Let us fix an interval I = [a,b] R and a compact set U R m. Consider two functions g : I U R n, v : I R n such that (M1) g(,u) is measurable in t for every fixed u U, (M2) g(t, ) is continuous in u for every fixed t I, (M3) v(t) is measurable with respect to t. Moreover we assume that (M4) for every fixed t I, the problem min{ u : g(t,u) = v(t),u U} has a unique solution. Let us denote by u (t) the solution of (M4) for a fixed t I. Lemma Under assumptions (M1)-(M4), the function t u (t) is measurable on I. Proof. Denote ϕ(t) := u (t). To prove the lemma we show that for every fixed r > the set is measurable in R. By our assumptions A = {t I : ϕ(t) r} A = {t I : u U s.t. u r,g(t,u) = v(t)} Let us fix r > and a countable dense set {u i } i N in the ball of radius r in U. Let show that where A = n N A n = n N i N A i,n }{{} :=A n A i,n := {t I : g(t,u i ) v(t) < 1/n} (3.6) Notice that the set A i,n is measurable by construction and if (17.12) is true, A is also measurable. 89

90 inclusion. Let t A. This means that there exists ū U such that ū r and g(t,ū) = v(t). Since g is continuous with respect to u and {u i } i N is a dense, for each n we can find u in such that g(t,u in ) v(t) < 1/n, that is t A n for all n. inclusion. Assume t n N A n. Then for every n there exists i n such that the corresponding u in satisfies g(t,u in ) v(t) < 1/n. From the sequence u in, by compactness, it is possible to extract a convergent susequence u in ū. By continuity of g with respect to u one easily gets that g(t,ū) = v(t). That is t A. Next we exploit the fact that the scalar function ϕ(t) := u (t) is measurable to show that the vector function u (t) is measurable. Lemma Under assumptions (M1)-(M4), the vector function t u (t) is measurable on I. Proof. It is sufficient to prove that, for every closed ball O in R n the set B := {t I : u (t) O} is measurable. Since the minimum in (M4) is uniquely determined, this is equivalent to where B = {t I : u O s.t. u = ϕ(t),g(t,u) = v(t)} Let us fix the ball O and a countable dense set {u i } i N in O. Let show that B = B n = B i,n n N n N i N }{{} :=B n B i,n := {t I : u i < ϕ(t)+1/n, g(t,u i ) v(t) < 1/n;} (3.61) Notice that the set B i,n is measurable by construction and if (3.61) is true, B is also measurable. inclusion. Let t B. This means that there exists ū O such that ū = ϕ(t) and g(t,ū) = v(t). Since g is continuous with respect to u and {u i } i N is a dense in O, for each n we can find u in such that g(t,u in ) v(t) < 1/n and u in < ϕ(t)+1/n, that is t B n for all n. inclusion. Assume t n N B n. Then for every n it is possible to find i n such that the corresponding u in satisfies g(t,u in ) v(t) < 1/n and u in < ϕ(t)+1/n. From the sequence u in, by compactness of the closed ball O, it is possible to extract a convergent susequence u in ū. By continuity of f in u one easily gets that g(t,ū) = v(t). Moreover ū ϕ(t). Hence ū = ϕ(t). That is t B. 3.A.2 Proof of Lemma 3.11 Consider an admissible curve γ : [,T] M. Since measurability is a local property it is not restrictive to assume M = R n. Moreover, by Lemma 3.15, we can assume that γ is lengthparametrized so that its minimal control belong to the compact set U = { u 1}. Define g : [,T] U R n and v : [,T] R n by g(t,u) = f(γ(t),u), v(t) = γ(t). 9

91 Assumptions (M1)-(M4) are satisfied. Indeed (M1)-(M3) follow from the fact that g(t, u) is linear with respect to u and measurable in t. Moreover (M4) is also satisfied by linearity with respect to u of f. Applying Lemma 3.54 one gets that the minimal control u (t) is measurable in t. Exercise Let γ : [,T] M be an admissible curve. For every t [,T] let us define, whenever it exists, the limit d(γ(t+h),γ(t)) v γ (t) := lim (3.62) h h Prove that v γ (t) = γ(t) = u (t) for a.e. t [,T]. Exercise Let γ : [,T] M be an admissible curve. Prove that { n 1 } l(γ) = sup d(γ(t i ),γ(t i+1 )), t 1 <... < t n T. (3.63) i=1 3.B Lipschitz vs Absolutely continuous admissible curves In these lecture notes sub-riemannian geometry is developed in the framework of Lipschitz admissible curves (that correspond to the choice of L controls). However, the theory can be equivalently developed in the framework of H 1 admissible curves (corresponding to L 2 controls) or in the framework of absolutely continuous admissible curves (corresponding to L 1 controls). Definition An absolutely continuous curve γ : [,T] M is said to be AC-admissible if there exists an L 1 function u : t [,T] u(t) U γ(t) such that γ(t) = f(γ(t),u(t)), for a.e. t [,T]. We define H 1 -admissible curves similarly. Being the set of absolutely continuous curve bigger than the set of Lipschitz ones, one could expect that the sub-riemannian distance between two points is smaller when computed among all absolutely continuous admissible curves. However this is not the case thanks to the invariance by reparametrization. Indeed Lemmas 3.14 and 3.15 can be rewritten in the absolutely continuous framework in the following form. Lemma The length of an AC-admissible curve is invariant by AC reparametrization. Lemma Any AC-admissible curve of positive length is a AC reparametrization of a lengthparametrized admissible one. The proof of Lemma 3.58 differs from the one of Lemma 3.14 only by the fact that, if u L 1 is the minimal control of γ then (u ϕ) ϕ is the minimal control associated with γ ϕ. Moreover (u ϕ) ϕ L 1, using the monotonicity of ϕ. Under these assumptions the change of variables formula (3.16) still holds. The proof of Lemma 3.59 is unchanged. Notice that the statement of Exercise 3.16 remains true if we replace Lipschitz with absolutely continuous. We stress that the curve γ built in the proof is Lipschitz (since it is length-parametrized). As a consequence of these results, if we define d AC (q,q 1 ) = inf{l(γ) γ : [,T] M AC-admissible, γ() = q, γ(t) = q 1 }, (3.64) we have the following proposition. 91

92 Proposition 3.6. d AC (q,q 1 ) = d(q,q 1 ) Since L 2 ([,T]) L 1 ([,T]), Lemmas 3.58, 3.59 and Proposition 3.6 are valid also in the framework of admissible curves associated with L 2 controls. Bibliographical notes Sub-Riemannian manifolds have been introduced, even if with different terminology, in several contexts starting from the end of 6s, see for instance [JSC87, Hör67, Fol73, Hul76, Gav77] and [Jur97, Jur16, Pan89, GV88, Bro82, BR96, Bro84, VG87]. However, some pioneering ideas were already present in the work of Carathéodory and Cartan. The name sub-riemannian geometry first appeared in [Str86]. Classical general references for sub-riemannian geometry are [Mon2, Bel96, Mon96, Gro96, Sus96]. Recent monographs [Jea14, Rif14]. The definition of sub-riemannian manifold using the language of bundles dates back to [AG97, Bel96]. For the original proof of the Raschevski-Chow theorem see [Ras38, Cho39]. The proof of existence of sub-riemannian length minimizer presented here is an adaptation of the proof of Filippov theorem in optimal control. The fact that in sub-riemannian geometry there exist abnormal length minimizers is due to Montgomery [Mon94, Mon2]. The fact that the theory can be equivalently developed for Lipschitz or absolutely continuous curves is well known, a discussion can be found in [Bel96]. The definition of the length by using the minimal control is, up to our best knowledge, original. The problem of the measurability of the minimal control can be seen as a problem of differential inclusion [BP7b]. The characterization of Pontryagin extremals given in Theorem 3.44 is a simplified version of the Pontryagin Maximum Priciple (PMP) [PBGM62]. The proof presented here is original and adapted to this setting. For more general versions of PMP see [AS4, BC3]. The fact that every sub-riemannian structure is equivalent to a free one (cf. Section 3.1.4) is a consequence of classical results on fiber bundles. A different proof in the case of classical (constant rank) distribution was also considered in [Rif14, Sus8]. 92

93 Chapter 4 Characterization and local minimality of Pontryagin extremals This chapter is devoted to the study of geometric properties of Pontryagin extremals. To this purpose we first rewrite Theorem 3.44 in a more geometric setting, which permits to write a differential equation in T M satisfied by Pontryagin extremals and to show that they do not depend on the choice of a generating family. Finally we prove that small pieces of normal extremal trajectories are length-minimizers. To this aim, all along this chapter we develop the language of symplectic geometry, starting by the key concept of Poisson bracket. 4.1 Geometric characterization of Pontryagin extremals In the previous chapter we proved that if γ : [,T] M is a length minimizer on a sub-riemannian manifold, associated with a control u( ), then there exists λ Tγ() M such that defining one of the following conditions is satisfied: (N) u i (t) λ(t),f i (γ(t)), i = 1,...,m, (A) λ(t),f i (γ(t)), i = 1,...,m, λ. λ(t) = (P,t 1 ) λ, λ(t) Tγ(t) M, (4.1) Here P,t denotes the flow associated with the nonautonomous vector field f u(t) = m i=1 u i(t)f i and (P,t 1 ) : TqM TP,t (q) M. (4.2) is the induced flow on the cotangent space. The goal of this section is to characterize the curve (4.1) as the integral curve of a suitable (non-autonomous) vector field on T M. To this purpose, we start by showing that a vector field on T M is completely characterized by its action on functions that are affine on fibers. To fix the ideas, we first focus on the case in which P,t : M M is the flow associated with an autonomous vector field X Vec(M), namely P,t = e tx. 93

94 4.1.1 Lifting a vector field from M to T M We start by some preliminary considerations on the algebraic structure of smooth functions on T M. As usual π : T M M denotes the canonical projection. Functions in C (M) are in a one-to-one correspondence with functions in C (T M) that are constant on fibers via the map α π α = α π. In other words we have the isomorphism of algebras C (M) C cst(t M) := {π α α C (M)} C (T M). (4.3) In what follows, with abuse of notation, we often identify the function π α C (T M) with the function α C (M). In a similar way smooth vector fields on M are in a one-to-one correspondence with smooth functions in C (T M) that are linear on fibers via the map Y a Y, where a Y (λ) := λ,y(q) and q = π(λ). Vec(M) C lin (T M) := {a Y Y Vec(M)} C (T M). (4.4) Notice that this is an isomorphism as modules over C (M). Indeed, as Vec(M) is a module over C (M), we have that C lin (T M) is a module over C (M) as well. For any α C (M) and a X C lin (T M) their product is defined as αa X := (π α)a X = a αx C lin (T M). Definition 4.1. Wesay that afunctiona C (T M) isaffine on fibers if thereexist twofunctions α C cst (T M) and a X C lin (T M) such that a = α+a X. In other words a(λ) = α(q)+ λ,x(q), q = π(λ). We denote by C aff (T M) the set of affine function on fibers. Remark 4.2. Linear and affine functions on T M are particularly important since they reflects the linear structure of the cotangent bundle. In particular every vector field on T M, as a derivation of C (T M), is completely characterized by its action on affine functions, Indeed for a vector field V Vec(T M) and f C (T M), one has that (Vf)(λ) = d dt f(e tv (λ)) = d λ f,v(λ), λ T M. (4.5) t= which depends only on the differential of f at the point λ. Hence, for each fixed λ T M, to compute (4.5) one can replace the function f with any affine function whose differential at λ coincide with d λ f. Notice that such a function is not unique. Let us now consider the infinitesimal generator of the flow (P 1,t ) = (e tx ). Since it satisfies the group law (e tx ) (e sx ) = (e (t+s)x ) t,s R, by Lemma 2.15 its infinitesimal generator is an autonomous vector field V X on T M. In other words we have (e tx ) = e tv X for all t. Let us then compute the right hand side of (4.5) when V = V X and f is either a function constant on fibers or a function linear on fibers. 94

95 The action of V X on functions that are constant on fibers, of the form β π with β C (M), coincides with the action of X. Indeed we have for all λ T M d dt β π((e tx ) λ)) = d t= dt β(e tx (q)) = (Xβ)(q), q = π(λ). (4.6) t= For what concerns the action of V X on functions that are linear on fibers, of the form a Y (λ) = λ,y(q), we have for all λ T M d dt a Y ((e tx ) λ) = d t= dt (e tx ) λ,y(e tx (q)) t= = d dt λ,(e tx Y)(q) = λ,[x,y](q) (4.7) t= = a [X,Y] (λ). Hence, by linearity, one gets that the action of V X on functions of C aff (T M) is given by V X (β +a Y ) = Xβ +a [X,Y]. (4.8) As explained in Remark 4.2, formula (4.8) characterizes completely the generator V X of (P 1,t ). To find its explicit form we introduce the notion of Poisson bracket The Poisson bracket The purpose of this section is to introduce an operation {, } on C (T M), called Poisson bracket. First we introduce it in C lin (T M), where it reflects the Lie bracket of vector fields in Vec(M), seen as elements of C lin (T M). Then it is uniquely extended to C aff (T M) and C (T M) by requiring that it is a derivation of the algebra C (T M) in each argument. More precisely we start by the following definition. Definition 4.3. Let a X,a Y C lin (T M) be associated with vector fields X,Y Vec(M). Their Poisson bracket is defined by {a X,a Y } := a [X,Y], (4.9) where a [X,Y] is the function in C lin (T M) associated with the vector field [X,Y]. Remark 4.4. Recall that the Lie bracket is a bilinear, skew-symmetric map defined on Vec(M), that satisfies the Leibnitz rule for X,Y Vec(M): [X,αY] = α[x,y]+(xα)y, α C (M). (4.1) As a consequence, the Poisson bracket is bilinear, skew-symmetric and satisfies the following relation {a X,αa Y } = {a X,a αy } = a [X,αY] = αa [X,Y] +(Xα)a Y, α C (M). (4.11) Noticethatthisrelationmakessensesincetheproductbetweenα C cst (T M)anda X C lin (T M) belong to C lin (T M), namely αa X = a αx. Next, we extend this definition on the whole C (T M). 95

96 Proposition 4.5. There exists a unique bilinear and skew-simmetric map {, } : C (T M) C (T M) C (T M) that extends (4.9) on C (T M), and that is a derivation in each argument, i.e. it satisfies {a,bc} = {a,b}c+{a,c}b, a,b,c C (T M). (4.12) We call this operation the Poisson bracket on C (T M). Proof. We start by proving that, as a consequence of the requirement that {, } is a derivation in each argument, it is uniquely extended to C aff (T M). By linearity and skew-symmetry we are reduced to compute Poisson brackets of kind {a X,α} and {α,β}, where a X C lin (T M) and α,β C cst(t M). Using that a αy = αa Y and (4.12) one gets Comparing (4.11) and (4.13) one gets Next, using (4.12) and (4.14), one has {a X,a αy } = {a X,αa Y } = α{a X,a Y }+{a X,α}a Y. (4.13) {a X,α} = Xα (4.14) {a αy,β} = {αa Y,β} = α{a Y,β}+{α,β}a Y (4.15) = αyβ +{α,β}a Y. (4.16) Using again (4.14) one also has {a αy,β} = αyβ, hence {α,β} =. Combining the previous formulas one obtains the following expression for the Poisson bracket between two affine functions on T M {a X +α,a Y +β} := a [X,Y] +Xβ Yα. (4.17) From the explicit formula (4.17) it is easy to see that the Poisson bracket computed at a fixed λ T M depends only on the differential of the two functions a X +α and a Y +β at λ. Next we extend this definition to C (T M) in such a way that it is still a derivation. For f,g C (T M) we define {f,g} λ := {a f,λ,a g,λ } λ (4.18) where a f,λ and a g,λ are two functions in C aff (T M) such that d λ f = d λ (a f,λ ) and d λ g = d λ (a g,λ ). Remark 4.6. The definition (4.18) is well posed, since if we take two different affine functions a f,λ and a f,λ their difference satisfy d λ(a f,λ a f,λ ) = d λ(a f,λ ) d λ (a f,λ ) =, hence by bilinearity of the Poisson bracket {a f,λ,a g,λ } λ = {a f,λ,a g,λ} λ. Let us now compute the coordinate expression of the Poisson bracket. In canonical coordinates (p,x) in T M, if n X = X i (x) n, Y = Y i (x), x i x i i=1 96 i=1

97 we have n a X (p,x) = p i X i (x), a Y (p,x) = i=1 and, denoting f = a X +α, g = a Y +β we have n p i Y i (x). i=1 {f,g} = a [X,Y] +Xβ Yα n ( ) Y j X j β α = p j X i Y i +X i Y i x i,j=1 i x i p i p i n ( Y j = X i p j + β ( X j ) Y i p j + α ) x i p i x i p i = i,j=1 n i=1 f g f g. p i x i x i p i From these computations we get the formula for Poisson brackets of two functions a,b C (T M) {a,b} = n i=1 a b a b, a,b C (T M). (4.19) p i x i x i p i The explicit formula (4.19) shows that the extension of the Poisson bracket to C (T M) is still a derivation. Remark 4.7. We stress that the value {a,b} λ at a point λ T M depends only on d λ a and d λ b. Hence the Poisson bracket computed at the point λ T M can be seen as a skew-symmetric and nondegenerate bilinear form {, } λ : T λ (T M) T λ (T M) R. Exercise 4.8. Let f = (f 1,...,f k ) : T M R k, g : T M R and ϕ : R k R be smooth functions. Denote by ϕ f := ϕ f. Prove that {ϕ f,g} = k i=1 ϕ f i {f i,g}. (4.2) Hamiltonian vector fields By construction, the linear operator defined by a : C (T M) C (T M) a(b) := {a,b} (4.21) is a derivation of the algebra C (T M), therefore can be identified with an element of Vec(T M). Definition 4.9. The vector field a on T M defined by (4.21) is called the Hamiltonian vector field associated with the smooth function a C (T M). 97

98 From (4.19) we can easily write the coordinate expression of a for any arbitrary function a C (T M) n a a = a. (4.22) p i x i x i p i i=1 The following proposition gives the explicit form of the vector field V on T M generating the flow (P 1,t ). Proposition 4.1. Let X Vec(M) be complete and let P,t = e tx. The flow on T M defined by (P,t 1 ) = (e tx ) is generated by the Hamiltonian vector field a X, where a X (λ) = λ,x(q) and q = π(λ). Proof. To prove that the generator V of (P 1,t ) coincides with the vector field a X it is sufficient to show that their action is the same. Indeed, by definition of Hamiltonian vector field, we have a X (α) = {a X,α} = Xα a X (a Y ) = {a X,a Y } = a [X,Y]. Hence this action coincides with the action of V as in (4.6) and (4.7). Remark In coordinates (p,x) if the vector field X is written X = n n i=1 p ix i and the Hamitonian vector field a X is written as follows i=1 X i x i then a X (p,x) = a X = n i=1 X i x i n i,j=1 p i X i x j p j. (4.23) Notice that the projection of a X onto M coincides with X itself, i.e., π ( a X ) = X. This construction can be extended to the case of nonautonomous vector fields. Proposition Let X t be a nonautonomous vector field and denote by P,t the flow of X t on M. Then the nonautonomous vector field on T M V t := a Xt, a Xt (λ) = λ,x t (q), is the generator of the flow (P 1,t ). 4.2 The symplectic structure In this section we introduce thesymplectic structureof T M following theclassical construction. In subsection we show that the symplectic form can be interpreted as the dual of the Poisson bracket, in a suitable sense. Definition The tautological (or Liouville) 1-form s Λ 1 (T M) is defined as follows: s : λ s λ T λ (T M), s λ,w := λ,π w, λ T M, w T λ (T M), where π : T M M denotes the canonical projection. 98

99 The name tautological comes from its expression in coordinates. Recall that, given a system of coordinates x = (x 1,...,x n ) on M, canonical coordinates (p,x) on T M are coordinates for which every element λ T M is written as follows λ = n p i dx i. i=1 For every w T λ (T M) we have the following w = n i=1 α i p i +β i x i = π w = n i=1 β i x i, hence we get s λ,w = λ,π w = n p i β i = i=1 n n p i dx i,w = p i dx i,w. i=1 i=1 In other words the coordinate expression of the Liouville form s at the point λ coincides with the one of λ itself, namely n s λ = p i dx i. (4.24) Exercise Let s Λ 1 (T M) be the tautological form. Prove that i=1 ω s = ω, ω Λ 1 (M). (Recall that a 1-form ω is a section of T M, i.e. a map ω : M T M such that π ω = id M ). Definition The differential of the tautological 1-form σ := ds Λ 2 (T M) is called the canonical symplectic structure on T M. By construction σ is a closed 2-form on T M. Moreover its expression in canonical coordinates (p, x) shows immediately that is a nondegenerate two form σ = n dp i dx i. (4.25) i=1 Remark 4.16 (Thesymplecticforminnon-canonicalcoordinates). Givenabasisof1-formsω 1,...,ω n in Λ 1 (M), one can build coordinates on the fibers of T M as follows. Every λ T M can be written uniquely as λ = n i=1 h iω i. Thus h i become coordinates on the fibers. Notice that these coordinates are not related to any choice of coordinates on the manifold, as the p were. By definition, in these coordinates, we have s = n h i ω i, σ = ds = i=1 n dh i ω i +h i dω i. (4.26) i=1 Notice that, with respect to (4.25) in the expression of σ an extra term appears since, in general, the 1-forms ω i are not closed. 99

100 4.2.1 The symplectic form vs the Poisson bracket Let V be a finite dimensional vector space and V denotes its dual (i.e. the space of linear forms on V). By classical linear algebra arguments one has the following identifications { } non degenerate bilinear forms on V { } { } linear invertible maps non degenerate V V bilinear forms on V. (4.27) Indeed to every bilinear form B : V V R we can associate a linear map L : V V defined by L(v) = B(v, ). On the other hand, given a linear map L : V V, we can associate with it a bilinear map B : V V R defined by B(v,w) = L(v),w, where, denotes as usual the pairing between a vector space and its dual. Moreover B is non-degenerate if and only if the map B(v, ) is an isomorphism for every v V, that is if and only if L is invertible. The previous argument shows how to identify a bilinear form on B on V with an invertible linear map L from V to V. Applying the same reasoning to the linear map L 1 one obtain a bilinear map on V. Exercise (a). Let h C (T M). Prove that the Hamiltonian vector field h Vec(T M) satisfies the following identity σ(, h(λ)) = d λ h, λ T M. (b). Prove that, for every λ T M the bilinear forms σ λ on T λ (T M) and {, } λ on T λ (T M) (cf. Remark 4.7) are dual under the identification (4.27). In particular show that {a,b} = a(b) = db, a = σ( a, b), a,b C (T M). (4.28) Remark Notice that σ is nondegenerate, which means that the map w σ λ (,w) defines a linear isomorphism between the vector spaces T λ (T M) and T λ (T M). Hence h is the vector field canonically associated by the symplectic structure with the differential dh. For this reason h is also called symplectic gradient of h. From formula (4.25) we have that in canonical coordinates (p, x) the Hamiltonian vector filed associated with h is expressed as follows h = n i=1 h h, p i x i x i p i and the Hamiltonian system λ = h(λ) is rewritten as ẋ i = h p i ṗ i = h, i = 1,...,n. x i We conclude this section with two classical but rather important results: Proposition A function a C (T M) is a constant of the motion of the Hamiltonian system associated with h C (T M) if and only if {h,a} =. 1

101 Proof. Let us consider a solution λ(t) = e t h (λ ) of the Hamiltonian system associated with h, with λ T M. From (4.28), we have the following formula for the derivative of the function a along the solution d a(λ(t)) = {h,a}(λ(t)). (4.29) dt It is then easy to see that {h,a} = if and only if the derivative of the function a along the flow vanishes for all t, that is a is constant. The skew-simmetry of the Poisson brackets immediately implies the following corollary. Corollary 4.2. A function h C (T M) is a constant of the motion of the Hamiltonian system defined by h. 4.3 Characterization of normal and abnormal extremals Now we can rewrite Theorem 3.44 using the symplectic language developed in the last section. Given a sub-riemannian structure on M with generating family {f 1,...,f m }, and define the fiberwise linear functions on T M associated with these vector fields h i : T M R, h i (λ) := λ,f i (q), i = 1,...,m. Theorem 4.21 (Hamiltonian characterization of Pontryagin extremals). Let γ : [, T] M be an admissible curve which is a length-minimizer, parametrized by constant speed. Let u( ) be the corresponding minimal control. Then there exists a Lipschitz curve λ(t) Tγ(t) M such that λ(t) = m u i (t) h i (λ(t)), a.e. t [,T], (4.3) i=1 and one of the following conditions is satisfied: (N) h i (λ(t)) u i (t), i = 1,...,m, t, (A) h i (λ(t)), i = 1,...,m, t. Moreover in case (A) one has λ(t) for all t [,T]. Proof. The statement is a rephrasing of Theorem 3.44, obtained by combining Proposition 4.1 and Exercise Notice that Theorem 4.21 says that normal and abnormal extremals appear as solution of an Hamiltonian system. Nevertheless, this Hamiltonian system is non autonomous and depends on the trajectory itself by the presence of the control u(t) associated with the extremal trajectory. Moreover, the actual formulation of Theorem 4.21 for the necessary condition for optimality still does not clarify if the extremals depend on the generating family {f 1,...,f m } for the sub- Riemannian structure. The rest of the section is devoted to the geometric intrinsic description of normal and abnormal extremals. 11

102 4.3.1 Normal extremals In this section we show that normal extremals are characterized as solutions of a smooth autonomous Hamiltonian system on T M, where the Hamiltonian H is a function that encodes all the informations on the sub-riemannian structure. Definition Let M be a sub-riemannian manifold. The sub-riemannian Hamiltonian is the function on T M defined as follows H : T M R, H(λ) = max ( λ,f u (q) 12 ) u U u 2, q = π(λ). (4.31) q Proposition The sub-riemannian Hamiltonian H is smooth and quadratic on fibers. Moreover, for every generating family {f 1,...,f m } of the sub-riemannian structure, the sub-riemannian Hamiltonian H is written as follows H(λ) = 1 2 m i=1 λ,f i (q) 2, λ Tq M, q = π(λ). (4.32) Proof. In terms of a generating family {f 1,...,f m }, the sub-riemannian Hamiltonian (4.31) is written as follows ( m ) H(λ) = max u i λ,f i (q) 1 m u 2 u R m i. (4.33) 2 i=1 Differentiating (4.33) with respect to u i, one gets that the maximum in the r.h.s. is attained at u i = λ,f i (q), from which formula (4.32) follows. The fact that H is smooth and quadratic on fibers then easily follows from (4.32). Exercise Prove that two equivalent sub-riemannian structures (U,f) and (U,f ) on a manifold M define the same Hamiltonian. Exercise Consider the sub-riemannian Hamiltonian H : T M R. Denote by H q : Tq M R its restriction on fiber and fix λ T q M. The differential d λh q : Tq M R is a linear form, hence it can be canonically identified with an element of T q M. (i) Prove that d λ H q D x for all λ T qm. (ii) Prove that d λ H q 2 = 2H(λ). Hint: use that, if f 1,...,f m is a generating frame, then i=1 d λ H q = m λ,f i (q) f i (q). i=1 Theorem Every normal extremal is a solution of the Hamiltonian system λ(t) = H(λ(t)). In particular, every normal extremal trajectory is smooth. 12

103 Proof. Denoting, as usual, h i (λ) = λ,f i (q) for i = 1,...,m, the functions linear on fibers associated with a generating family and using the identity h 2 i = 2h i h i (see (4.12)), it follows that H = 1 m h 2 i = 2 i=1 m h i hi. In particular, since along a normal extremal h i (λ(t)) = u i (t) by condition (N) of Theorem 4.21, one gets m H(λ(t)) = h i (λ(t)) m h i (λ(t)) = u i (t) h i (λ(t)). i=1 i=1 i=1 Remark In canonical coordinates λ = (p,x) in T M, H is quadratic with respect to p and H(p,x) = 1 2 m p,f i (x) 2. i=1 The Hamiltonian system associated with H, in these coordinates, is written as follows ẋ = H p = m i=1 p,f i(x) f i (x) ṗ = H x = m i=1 p,f i(x) p,d x f i (x) (4.34) From here it is easy to see that if λ(t) = (p(t),x(t)) is a solution of (4.34) then also the rescaled extremal αλ(αt) = (αp(αt),x(αt)) is a solution of the same Hamiltonian system, for every α >. Lemma Let λ(t) be an integral curve of the Hamiltonian vector field H and γ(t) = π(λ(t)) be the corresponding normal extremal trajectory. Then for all t [,T] one has 1 2 γ(t) 2 = H(λ(t)). Proof. Fix a generating frame f 1,...,f m. Since λ(t) is a solution of the Hamiltonian system we have m γ(t) = λ(t),f i (γ(t) f i (γ(t)) i=1 hence u i (t) = λ(t),f i (γ(t) defines a control for the curve γ. This control is indeed the minimal one as it follows from Exercice 4.25 and 1 2 γ(t) 2 = 1 2 m u i (t) 2 = 1 2 i=1 m λ(t),f i (γ(t) 2 = H(λ(t)) (4.35) i=1 Corollary A normal extremal trajectory is parametrized by constant speed. In particular it is length parametrized if and only if its extremal lift is contained in the level set H 1 (1/2). 13

104 Proof. The fact that H is constant along λ(t), easily implies by (4.35) that γ(t) 2 is constant. Moreover one easily gets that γ(t) = 1 if and only if H(λ(t)) = 1/2. Finally, by Remark 4.27, all normal extremal trajectories are reparametrization of length parametrized ones. Let λ(t) beanormal extremal suchthat λ() = λ T q M. Thecorrespondingnormal extremal trajectory γ(t) = π(λ(t)) can be written in the exponential notation γ(t) = π e t H (λ ). By Corollary 4.29, length-parametrized normal extremal trajectories corresponds to the choice of λ H 1 (1/2). We end this section by characterizing normal extremal trajectory as characteristic curves of the canonical symplectic form contained in the level sets of H. Definition 4.3. Let M be a smooth manifold and Ω Λ k M a 2-form. A Lipschitz curve γ : [,T] M is a characteristic curve for Ω if for almost every t [,T] it holds γ(t) KerΩ γ(t), (i.e. Ω γ(t) ( γ(t), ) = ) (4.36) Notice that this notion is independent on the parametrization of the curve. Proposition Let H be the sub-riemannian Hamiltonian and assume that c > is a regular value of H. Then a Lipschitz curve γ is a characteristic curve for σ H 1 (c) if and only if it is the reparametrization of a normal extremal on H 1 (c). Proof. Recall that if c is a regular value of H, then the set H 1 (c) is a smooth (2n 1)-dimensional manifold in T M (notice that by Sard Theorem almost every c > is regular value for H). For every λ H 1 (c) let us denote by E λ = T λ H 1 (c) its tangent space at this point. Notice that, by construction, E λ is an hyperplane (i.e., dime λ = 2n 1) and d λ H Eλ =. The restriction σ H 1 (c) is computed by σ λ Eλ, for each λ H 1 (c). One one hand kerσ λ Eλ is non trivial since the dimension of E λ is odd. On the other hand the symplectic 2-form σ is nondegenerate on T M, hence the dimension of kerσ λ Eλ cannot be greater than one. It follows that dimkerσ λ Eλ = 1. We arelefttoshow that kerσ λ Eλ = H(λ). Assumethat kerσ λ Eλ = Rξ, forsomeξ T λ (T M). By construction, E λ coincides with the skew-orthogonal to ξ, namely E λ = ξ = {w T λ (T M)) σ λ (ξ,w) = }. Since, by skew-symmetry, σ λ (ξ,ξ) =, it follows that ξ E λ. Moreover, by definition of Hamiltonian vector field σ(, H) = dh, hence for the restriction to E λ one has σ λ (, H(λ)) Eλ = d λ H Eλ =. Exercise Prove that if two smooth Hamiltonians h 1,h 2 : T M R define the same level set, i.e. E = {h 1 = c 1 } = {h 2 = c 2 } for some c 1,c 2 R, then their Hamiltonian flow h 1, h 2 coincide on E, up to reparametrization. 14

105 Exercise The sub-riemannian Hamiltonian H encodes all the information about the sub- Riemannian structure. (a) Prove that a vector v T q M is sub-unit, i.e., it satisfies v D q and v 1 if and only if 1 2 λ,v 2 H(λ), λ T qm. (b) Show that this implies the following characterization for the sub-riemannian Hamiltonian H(λ) = 1 2 λ 2, λ = sup λ, v. v D q, v =1 When the structure is Riemannian, H is the inverse norm defined on the cotangent space Abnormal extremals In this section we provide a geometric characterization of abnormal extremals. Even if for abnormal extremals it is not possible to determine a priori their regularity, we show that they can be characterized as characteristic curves of the symplectic form. This gives an unified point of view of both class of extremals. We recall that an abnormal extremal is a non zero solution of the following equations λ(t) = m u i (t) h i (λ(t)), i=1 h i (λ(t)) =, i = 1,...,m. where {f 1,...,f m } is a generating family for the sub-riemannian structure and h 1,...,h m are the corresponding functions on T M linear on fibers. In particular every abnormal extremal is contained in the set H 1 () = {λ T M λ,f i (q) =, i = 1,...,m, q = π(λ)}. (4.37) where H denotes the sub-riemannian Hamiltonian (4.32). Proposition Let H be the sub-riemannian Hamiltonian and assume that H 1 () is a smooth manifold. Then a Lipschitz curve γ is a characteristic curve for σ H 1 () if and only if it is the reparametrization of a abnormal extremal on H 1 (). Proof. In this proof we denote for simplicity N := H 1 () T M. For every λ N we have the identity Kerσ λ N = T λ N = span{ h i (λ) i = 1,...,m}. (4.38) Indeed, from the definition of N, it follows that T λ N = {w T λ (T M) d λ h i, w =, i = 1,...,m} = {w T λ (T M) σ(w, h i (λ)) =, i = 1,...,m} = span{ h i (λ) i = 1,...,m}. 15

106 and (4.38) follows by taking the skew-orthogonal on both sides. Thus w T λ H 1 () if and only if w is a linear combination of the vectors h i (λ). This implies that λ(t) is a characteristic curve for σ H 1 () if and only if there exists controls u i ( ) for i = 1,...,m such that λ(t) = m u i (t) h i (λ(t)). (4.39) i=1 Notice that is never a regular value of H. Nevertheless, the following exercise shows that the assumption of Proposition 4.34 is always satisfied in the case of a regular sub-riemannian structure. Exercise Assume that the sub-riemannian structure is regular, namely the following assumption holds dimd q = dimspan q {f 1,...,f m } = const. (4.4) Then prove that the set H 1 () defined by (4.37) is a smooth submanifold of T M. Remark From Proposition 4.34 it follows that abnormal extremals do not depend on the sub-riemannian metric, but only on the distribution. Indeed the set H 1 () is characterized as the annihilator D of the distribution H 1 () = {λ T M λ,v =, v D π(λ) } = D T M. Here the orthogonal is meant in the duality sense. Under the regularity assumption (4.4) we can select (at least locally) a basis of 1-forms ω 1,...,ω m for the dual of the distribution D q = span{ω i(q) i = 1,...,m}, (4.41) Letuscompletethissetof1-formstoabasisω 1,...,ω n oft M andconsidertheinducedcoordinates h 1,...,h n asdefinedinremark4.16. Inthesecoordinatestherestriction ofthesymplecticstructure D to is expressed as follows σ D = d(s D ) = m dh i ω i +h i dω i, (4.42) i=1 We stress that the restriction σ D can be written only in terms of the elements ω 1,...,ω m (and not of a full basis of 1-forms) since the differential d commutes with the restriction Example: codimension one distribution and contact distributions Let M be a n-dimensional manifold endowed with a constant rank distribution D of codimension one, i.e., dimd q = n 1 for every q M. In this case D and D are sub-bundles of TM and T M respectively and their dimension, as smooth manifolds, are dim D = dimm +rankd = 2n 1, dim D = dimm +rankd = n+1. Since the symplectic form σ is skew-symmetric, a dimensional argument implies that for n even, the restriction σ D has always a nontrivial kernel. Hence there always exist characteristic curves of σ D, that correspond to reparametrized abnormal extremals by Proposition

107 Let us consider in more detail the case n = 3. Assume that there exists a one form ω Λ 1 (M) such that D = kerω (this is not restrictive for a local description). Consider a basis of one forms ω,ω 1,ω 2 such that ω := ω and the coordinates h,h 1,h 2 associated to these forms (see Remark 4.16). By (4.42) and we can easily compute (recall that D is 4-dimensional) σ D = dh ω +h dω, (4.43) σ σ D = 2h dh ω dω. (4.44) Lemma Let N be a smooth 2k-dimensional manifold and Ω Λ 2 M. Then Ω is nondegenerate on N if and only if k Ω. 1 Definition Let M be a three dimensional manifold. We say that a constant rank distribution D = kerω on M of corank one is a contact distribution if ω dω. For a three dimensional manifold M endowed with a distribution D = ker ω we define the Martinet set as M = {q M (ω dω) q = } M. Corollary Under the previous assumptions all nontrivial abnormal extremal trajectories are contained in the Martinet set M. In particular, if the structure is contact, there are no nontrivial abnormal extremal trajectories. Proof. ByProposition4.34 anyabnormalextremal λ(t)isacharacteristic curveofσ D. ByLemma 4.37 σ D is degenerate if and only if σ σ D =, which is in turn equivalent to ω dω = thanks to (4.44) (notice that dh and ω dω are independent since they depend on coordinates on the fibers and on the manifold, respectively). This shows that, if γ(t) is an abnormal trajectory and λ(t) is the associated abnormal extremal, then λ(t) is a characteristic curve of σ D if and only if (ω dω) γ(t) =, that is γ(t) M. By definition of M it follows that, if D is contact, then M is empty. Remark 4.4. Since M is three dimensional, we can write ω dω = adv where a C (M) and dv is some smooth volume form on M, i.e., a never vanishing 3-form on M. In particular the Martinet set is M = a 1 () and the distribution is contact if and only if the function a is never vanishing. When is a regular value of a, the set a 1 () defines a two dimensional surface on M, called the Martinet surface. Notice that this condition is satisfied for a generic choice of the (one form defining the) distribution. Abnormal extremal trajectories are the horizontal curves that are contained in the Martinet surface. When M is smooth, the intersection of the tangent bundle to the surface M and the 2-dimensional distribution of admissible velocities defines, generically, a line field on M. Abnormal extremal trajectories coincide with the integral curves of this line field, up to a reparametrization. 1 Here k Ω = } Ω... Ω {{}. k 17

108 4.4 Examples D Riemannian Geometry Let M be a 2-dimensional manifold and f 1,f 2 Vec(M) a local orthonormal frame for the Riemannian structure. The problem of finding length-minimizers on M could be described as the optimal control problem q(t) = u 1 (t)f 1 (q(t))+u 2 (t)f 2 (q(t)), where length and energy are expressed as l(q( )) = T u1 (t) 2 +u 2 (t) 2 dt, J(q( )) = 1 2 T ( u1 (t) 2 +u 2 (t) 2) dt. Geodesics are projections of integral curves of the sub-riemannian Hamiltonian in T M H(λ) = 1 2 (h 1(λ) 2 +h 2 (λ) 2 ), h i (λ) = λ,f i (q), i = 1,2. Since the vector fields f 1 and f 2 are linearly independent, the functions (h 1,h 2 ) defines a system of coordinates on fibers of T M. In what follows it is convenient to use (q,h 1,h 2 ) as coordinates on T M (even if coordinates on the manifold are not necessarily fixed). Let us start by showing that there are no abnormal extremals. Indeed if λ(t) is an abnormal extremal and γ(t) is the associated abnormal trajectory we have λ(t),f 1 (γ(t)) = λ(t),f 2 (γ(t)) =, t [,T], (4.45) that implies that λ(t) = for all t [,T] since {f 1,f 2 } is a basis of the tangent space at every point. This is a contradiction since λ(t) by Theorem Suppose now that λ(t) is a normal extremal. Then u i (t) = h i (λ(t)) and the equation on the base is q = h 1 f 1 (q)+h 2 f 2 (q). (4.46) For the equation on the fiber we have (remember that along solutions ȧ = {H,a}) {ḣ1 = {H,h 1 } = {h 1,h 2 }h 2 ḣ 2 = {H,h 2 } = {h 1,h 2 }h 1. (4.47) From here one can see directly that H is constant along solutions. Indeed Ḣ = h 1 ḣ 1 +h 2 ḣ 2 =. If we require that extremals are parametrized by arclength u 1 (t) 2 +u 2 (t) 2 = 1 for a.e. t [,T], we have H(λ(t)) = 1 2 h 2 1 (λ(t))+h2 2 (λ(t)) = 1. It is then convenient to restrict to the spherical cotangent bundle S M (see Example 2.51) of coordinates (q, θ), by setting h 1 = cosθ, h 2 = sinθ. 18

109 Let a 1,a 2 C (M) be such that [f 1,f 2 ] = a 1 f 1 +a 2 f 2. (4.48) Since {h 1,h 2 }(λ) = λ,[f 1,f 2 ], we have {h 1,h 2 } = a 1 h 1 + a 2 h 2 and equations (7.26) and (4.55) are rewritten in (θ, q) coordinates { θ = a1 (q)cosθ+a 2 (q)sinθ (4.49) q = cosθf 1 (q)+sinθf 2 (q) In other words we are saying that an arc-length parametrized curve on M (i.e. a curve which satisfies the second equation) is a geodesic if and only if it satisfies the first. Heuristically this suggests that the quantity θ a 1 (q)cosθ a 2 (q)sinθ, has some relation with the geodesic curvature on M. Let µ 1,µ 2 the dual frame of f 1,f 2 (so that dv = µ 1 µ 2 ) and consider the Hamiltonian field in these coordinates H = cosθf 1 +sinθf 2 +(a 1 cosθ +a 2 sinθ) θ. (4.5) The Levi-Civita connection on M is expressed by some coefficients (see Chapter??) ω = dθ +b 1 µ 1 +b 2 µ 2, where b i = b i (q). On the other hand geodesics are projections of integral curves of H so that ω, H = = b 1 = a 1, b 2 = a 2. In particular if we apply ω = dθ a 1 µ 1 a 2 µ 2 to a generic curve (not necessarily a geodesic) which projects on γ we find geodesic curvature λ = cosθf 1 +sinθf 2 + θ θ, κ g (γ) = θ a 1 (q)cosθ a 2 (q)sinθ, as we infer above. To end this section we prove a useful formula for the Gaussian curvature of M Corollary If κ denotes the Gaussian curvature of M we have κ = f 1 (a 2 ) f 2 (a 1 ) a 2 1 a2 2. Proof. From (1.58) we have dω = κdv where dv = µ 1 µ 2 is the Riemannian volume form. On the other hand, using the following identities we can compute dµ i = a i µ 1 µ 2, da i = f 1 (a i )µ 1 +f 2 (a i )µ 2, i = 1,2. dω = da 1 µ 1 da 2 µ 2 a 1 dµ 1 a 2 dµ 2 = (f 1 (a 2 ) f 2 (a 1 ) a 2 1 a2 2 )µ 1 µ 2. 19

110 4.4.2 Isoperimetric problem Let M be a 2-dimensional orientable Riemannian manifold and ν its Riemannian volume form. Fix a smooth one-form A Λ 1 M and c R. Problem 1. Fix c R and q,q 1 M. Find, whenever it exists, the solution to { } min l(γ) : γ() = q,γ(t) = q 1, A = c. (4.51) Remark Minimizers depend only on da, i.e., if we add an exact term to A we will find same minima for the problem (with a different value of c). Problem 1 can be reformulated as a sub-riemannian problem on the extended manifold M = M R, in the sense that solutions of the problem (12.76) turns to be length minimizers for a suitable sub-riemannian structure on M, that we are going to construct. To every curve γ on M satisfying γ() = q and γ(t) = q 1 we can associate the function t z(t) = A = A( γ(s))ds. γ [,t] The curve ξ(t) = (γ(t),z(t)) defined on M satisfies ω( ξ(t)) = where ω = dz A is a one form on M, since ω( ξ(t)) = ż(t) A( γ(t)) =. Equivalently, ξ(t) Dξ(t) where D = kerω. We define a metric on D by defining the norm of a vector v D as the Riemannian norm of its projection π v on M, where π : M M is the canonical projection on the first factor. This endows M with a sub-riemannian structure. If we fix a local orthonormal frame f 1,f 2 for M, the pair (γ(t),z(t)) satisfies ) ( γ ż Hence the two vector fields on M = u 1 ( f1 A,f 1 γ ) ( ) f2 +u 2. (4.52) A,f 2 F 1 = f 1 + A,f 1 z, F 2 = f 2 + A,f 2 z, defines an orthonormal frame for the metric defined above on D = span(f 1,F 2 ). Problem 1 is then equivalent to the following: Problem 2. Fix c R and q,q 1 M. Find, whenever it exists, the solution to { min l(ξ) : ξ() = (q,),ξ(t) = (q 1,c), ξ(t) } D ξ(t). (4.53) Notice that, by construction, D is a distribution of constant rank (equal to 2) but is not necessarily bracket-generating. Let us now compute normal and abnormal extremals associated to the sub-riemannian structure just introduced on M. In what follows we denote with h i (λ) = λ,f i (q) the Hamiltonians linear on fibers of T M. 11

111 Normal extremals Equations of normal extremals are projections of integral curves of the sub-riemannian Hamiltonian in T M H(λ) = 1 2 (h2 1(λ)+h 2 2(λ)), h i (λ) = λ,f i (q), i = 1,2. Let us introduce F = z and h (λ) = λ,f (q). Since F 1,F 2 and F are linearly independent, then (h 1,h 2,h ) defines a system of coordinates on fibers of T M. In what follows it is convenient to use (q,h 1,h 2,h ) as coordinates on T M. For a normal extremal we have u i (t) = h i (λ(t)) for i = 1,2 and the equation on the base is ξ = h 1 F 1 (ξ)+h 2 F 2 (ξ). (4.54) For the equation on the fibers we have (remember that along solutions ȧ = {H,a}) ḣ 1 = {H,h 1 } = {h 1,h 2 }h 2 ḣ 2 = {H,h 2 } = {h 1,h 2 }h 1. ḣ = {H,h } = (4.55) If we require that extremals are parametrized by arclength we can restrict to the cylinder of the cotangent bundle T M defined by h 1 = cosθ, h 2 = sinθ. Let a 1,a 2 C (M) be such that Then [f 1,f 2 ] = a 1 f 1 +a 2 f 2. (4.56) [F 1,F 2 ] = [f 1 + A,f 1 z,f 2 + A,f 2 z ] (by (4.56)) = [f 1,f 2 ]+(f 1 A,f 2 f 2 A,f 1 ) z = a 1 (F 1 A,f 1 )+a 2 (F 2 A,f 2 )+f 1 A,f 2 f 2 A,f 1 ) z = a 1 F 1 +a 2 F 2 +da(f 1,f 2 ) z. where in the last equality we use Cartan formula (cf. (4.75) for a proof). Let µ 1, µ 2 be the dual forms to f 1 and f 2. Then ν = µ 1 µ 2 and we can write da = bµ 1 µ 2, for a suitable function b C (M). In this case [F 1,F 2 ] = a 1 F 1 +a 2 F 2 +b z. and {h 1,h 2 } = λ,[f 1,F 2 ] = a 1 h 1 +a 2 h 2 +bh. (4.57) With computations analogous to the 2D case we obtain the Hamiltonian system associated to H in the (q,θ,h ) coordinates ξ = cosθf 1 (ξ)+sinθf 2 (ξ) θ = a 1 cosθ +a 2 sinθ +bh (4.58) ḣ = 111

112 Inotherwordsifq(t) = π(ξ(t)) istheprojection ofanormalextremal pathonm (here π : M M), its geodesic curvature satisfies κ g (q(t)) = θ(t) a 1 (q(t))cosθ(t) a 2 (q(t))sinθ(t) (4.59) κ g (q(t)) = b(q(t))h. (4.6) Namely, projections on M of normal extremal paths are curves with geodesic curvature proportional to the function b at every point. The case b equal to constant is treated in the example of Section Abnormal extremals We prove the following characterization of abnormal extremal Lemma Abnormal extremal trajectories are contained in the Martinet set M = {b = }. Proof. Assume that λ(t) is an abnormal extremal whose projection is a curve ξ(t) = π(λ(t)) that is not reduced to a point. Then we have h 1 (λ(t)) = λ(t),f 1 (ξ(t)) =, h 2 (λ(t)) = λ(t),f 2 (ξ(t)) =, t [,T], (4.61) We can differentiate the two equalities with respect to t [,T] and we get d dt h 1(λ(t)) = u 2 (t){h 1,h 2 } λ(t) = d dt h 2(λ(t)) = u 1 (t){h 1,h 2 } λ(t) = Since the pair (u 1 (t),u 2 (t)) (,) we have that {h 1,h 2 } λ(t) = that implies = λ(t),[f 1,F 2 ](ξ(t)) = b(ξ(t))h, (4.62) where in the last equality we used (4.57) and the fact that h 1 (λ(t)) = h 2 (λ(t)) =. Recall that h otherwisethecovector isidentically zero(thatisnotpossibleforabnormals), thenb(ξ(t)) = for all t [,T]. The last result shows that abnormal extremal trajectories are forced to live in connected components of b 1 (). Exercise Prove that the set b 1 () is independent on the Riemannian metric chosen on M (and the corresponding sub-riemannian metric defined on D) Heisenberg group The Heisenberg group is a basic example in sub-riemannian geometry. It is the sub-riemannian structure defined by the isoperimetric problem in M = R 2 = {(x,y)} endowed with its Euclidean scalar product and the 1-form (cf. previous section) A = 1 (xdy ydx)

113 Notice that da = dx dy defines the area form on R 2, hence b 1 in this case. On the extended manifold M = R 3 = {(x,y,z)} the one-form ω is written as ω = dz 1 (xdy ydx) 2 Following the notation of the previous paragraph we can choose as an orthonormal frame for R 2 the frame f 1 = x and f 2 = y. This induced the choice F 1 = x y 2 z, F 2 = y + x 2 z. for the orthonormal frame on D = kerω. Notice that [F 1,F 2 ] = z, that implies that D is bracketgenerating at every point. Defining F = z and h i = λ,f i (q) for i =,1,2, the Hamiltonians linear on fibers of T M, we have {h 1,h 2 } = h, hence the equation (4.58) for normal extremals become q = cosθf 1 (q)+sinθf 2 (q) θ = h ḣ = (4.63) It follows that the two last equation can be immediately solved { θ(t) = θ +h t h (t) = h (4.64) Moreover { h 1 (t) = cos(θ +h t) h 2 (t) = sin(θ +h t) (4.65) From these formulas and the explicit expression of F 1 and F 2 it is immediate to recover the normal extremal trajectories starting from the origin (x = y = z = ) in the case h x(t) = 1 h (sin(θ +h t) sin(θ )) y(t) = 1 h (cos(θ +h t) cos(θ )) (4.66) and the vertical coordinate z is computed as the integral z(t) = 1 2 t When h = the curve is simply a straight line x(t)y (t) y(t)x (t)dt = 1 2h 2 (h t sin(h t)) x(t) = sin(θ )t y(t) = cos(θ )t z(t) = (4.67) Notice that, as we know from the results of the previous paragraph, normal extremal trajectories are curves whose projection on R 2 = {(x,y)} has constant geodesic curvature, i.e., straight lines or circles on R 2 (that correspond to horizontal lines and helix on M). There are no non trivial abnormal geodesics since b = 1. Remark This sub-riemannian structure on R 3 is called Heisenberg group since it can be seen as a left-invariant structure on a Lie group, as explained in Section

114 4.5 Lie derivative In this section we extend the notion of Lie derivative, already introduced for vector fields in Section 3.2, to differential forms. Recall that if X,Y Vec(M) are two vector fields we define L X Y = [X,Y] = d dt e tx Y. t= If P : M M is a diffeomorphism we can consider the pullback P : TP(q) M T qm and extend its action to k-forms. Let ω Λ k M, we define P ω Λ k M in the following way: (P ω) q (ξ 1,...,ξ k ) := ω P(q) (P ξ 1,...,P ξ k ), q M, ξ i T q M. (4.68) It is an easy check that this operation is linear and satisfies the two following properties P (ω 1 ω 2 ) = P ω 1 P ω 2, (4.69) P d = d P. (4.7) Definition Let X Vec(M) and ω Λ k M, where k. We define the Lie derivative of ω with respect to X as L X : Λ k M Λ k M, L X ω = d dt (e tx ) ω. (4.71) t= When k = this definition recovers the Lie derivative of smooth functions L X f = Xf, for f C (M). From (4.69) and (4.7), we easily deducethe following propertiesof thelie derivative: (i) L X (ω 1 ω 2 ) = (L X ω 1 ) ω 2 +ω 1 (L X ω 2 ), (ii) L X d = d L X. The first of these properties can be also expressed by saying that L X is a derivation of the exterior algebra of k-forms. TheLiederivativecombinestogether ak-formandavectorfielddefininganewk-form. Asecond way of combining these two object is to define their inner product, by defining a (k 1)-form. Definition Let X Vec(M) and ω Λ k M, with k 1. We define the inner product of ω and X as the operator i X : Λ k M Λ k 1 M, where we set (i X ω)(y 1,...,Y k 1 ) := ω(x,y 1,...,Y k 1 ), Y i Vec(M). (4.72) One can show that the operator i X is an anti-derivation, in the following sense: i X (ω 1 ω 2 ) = (i X ω 1 ) ω 2 +( 1) k 1 ω 1 (i X ω 2 ), ω i Λ k i M, i = 1,2. (4.73) We end this section proving two classical formulas linking together these notions, and usually referred as Cartan s formulas. Proposition 4.48 (Cartan s formula). The following identity holds true L X = i X d+d i X. (4.74) 114

115 Proof. Define D X := i X d+d i X. It is easy to check that D X is a derivation on the algebra of k-forms, since i X and d are anti-derivations. Let us show that D X commutes with d. Indeed, using that d 2 =, one gets d D X = d i X d = D X d. Since any k-form can be expressed in coordinates as ω = ω i1...i k dx i1...dx ik, it is sufficient to prove that L X coincide with D X on functions. This last property is easily checked by D X f = i X (df)+d(i X f) = df,x = Xf = L }{{} X f. = Corollary Let X,Y Vec(M) and ω Λ 1 M, then dω(x,y) = X ω,y Y ω,x ω,[x,y]. (4.75) Proof. On one hand Definition 4.46 implies, by Leibnitz rule L X ω,y q = d dt (e tx ) ω,y q t= = d dt ω,e tx Y e tx (q) t= On the other hand, Cartan s formula (4.74) gives Comparing the two identities one gets (4.75). 4.6 Symplectic geometry = X ω,y ω,[x,y]. L X ω,y = i X (dω),y + d(i X ω),y = dω(x,y)+y ω,x. In this section we generalize some of the constructions we considered on the cotangent bundle T M to the case of a general symplectic manifold. Definition 4.5. A symplectic manifold (N, σ) is a smooth manifold N endowed with a closed, non degenerate 2-form σ Λ 2 (N). A symplectomorphism of N is a diffeomorphism φ : N N such that φ σ = σ. Notice that a symplectic manifold N is necessarily even-dimensional. We stress that, in general, the symplectic form σ is not exact, as in the case of N = T M. The symplectic structure on a symplectic manifold N permits us to define the Hamiltonian vector field h Vec(N) associated with a function h C (N) by the formula i h σ = dh, or equivalently σ(, h) = dh. Proposition A diffeomorphism φ : N N is a symplectomorphism if and only if for every h C (N): (φ 1 ) h = h φ. (4.76) 115

116 Proof. Assume that φ is a symplectomorphism, namely φ σ = σ. More precisely, this means that for every λ N and every v,w T λ N one has σ λ (v,w) = (φ σ) λ (v,w) = σ φ(λ) (φ v,φ w), where the second equality is the definition of φ σ. If we apply the above equality at w = φ 1 h one gets, for every λ N and v T λ N σ λ (v,φ 1 h) = (φ σ) λ (v,φ 1 h) = σ φ(λ) (φ v, h) = d φ(λ) h,φ v = φ d φ(λ) h,v. = d(h φ),v This shows that σ λ (,φ 1 h) = d(h φ), that is (4.76). The converse implication follows analogously. Next we want to characterize those vector fields whose flow generates a one-parametric family of symplectomorphisms. Lemma Let X Vec(N) be a complete vector field on a symplectic manifold (N,σ). The following properties are equivalent (i) (e tx ) σ = σ for every t R, (ii) L X σ =, (iii) i X σ is a closed 1-form on N. Proof. By the group property e (t+s)x = e tx e sx one has the following identity for every t R: d dt (etx ) σ = d ds (e tx ) (e sx ) σ = (e tx ) L X σ. s= This proves the equivalence between (i) and (ii), since the map (e tx ) is invertible for every t R. Recall now that the symplectic form σ is, by definition, a closed form. Then dσ = and Cartan s formula (4.74) reads as follows L X σ = d(i X σ)+i X (dσ) = d(i X σ), which proves the the equivalence between (ii) and (iii). Corollary The flow of a Hamiltonian vector field defines a flow of symplectomorphisms. Proof. This is a direct consequence of the fact that, for an Hamitonian vector field h, one has i h σ = dh. Hence i h σ is a cloded form (actually exact) and property (iii) of Lemma 4.52 holds. Notice that the converse of Corollary 4.53 is true when N is simply connected, since in this case every closed form is exact. Definition Let (N,σ) be a symplectic manifold and a,b C (N). The Poisson bracket between a and b is defined as {a,b} = σ( a, b). 116

117 We end this section by collecting some properties of the Poisson bracket that follow from the previous results. Proposition The Poisson bracket satisfies the identities (i) {a,b} φ = {a φ,b φ}, a,b C (N), φ Sympl(N), (ii) {a,{b,c}} +{c,{a,b}} +{b,{c,a}} =, a,b,c C (N). Proof. Property (i) follows from (4.76). Property (ii) follows by considering φ = e t c in (i), for some c C (N),. and computing the derivative with respect to t at t =. Corollary For every a,b C (N) we have {a,b} = [ a, b]. (4.77) Proof. Property (ii) of Proposition 4.55 can be rewritten, by skew-symmetry of the Poisson bracket, as follows {{a,b},c} = {a,{b,c}} {b,{a,c}}. (4.78) Using that {a,b} = σ( a, b) = ab one rewrite (4.78) as {a,b}c = a( bc) b( ac) = [ a, b]c. Remark Property (ii) of Proposition 4.55 says that {a, } is aderivation of thealgebra C (N). Moreover, the space C (N) endowed with {, } as a product is a Lie algebra isomorphic to a subalgebra of Vec(N). Indeed, by (4.77), the correspondence a a is a Lie algebra homomorphism between C (N) and Vec(N). 4.7 Local minimality of normal trajectories In this section we prove a fundamental result about local optimality of normal trajectories. More precisely we show small pieces of a normal trajectory are length minimizers The Poincaré-Cartan one form Fix a smooth function a C (M) and consider the smooth submanifold of T M defined by the graph of its differential L = {d q a q M} T M. (4.79) Notice that the restriction of thecanonical projection π : T M M tol definesadiffeomorphism between L and M, hence diml = n. Assume that the Hamiltonian flow is complete and consider the image of L under the Hamiltonian flow L t := e t H (L ), t [,T]. (4.8) Define the (n+1)-dimensional manifold with boundary in R T M as follows L = {(t,λ) R T M λ L t, t T} (4.81) = {(t,e t H λ ) R T M λ L, t T}. (4.82) 117

118 Finally, let us introduce the Poincaré-Cartan 1-form on T M R T (M R) defined by s Hdt Λ 1 (T M R) where s Λ 1 (T M) denotes, as usual, the tautological 1-form of T M. We start by proving a preliminary lemma. Lemma s L = d(a π) L Proof. By definition of tautological 1-form s λ (w) = λ,π w, for every w T λ (T M). If λ L then λ = d q a, where q = π(λ). Hence for every w T λ (T M) s λ (w) = λ,π w = d q a,π w = π d q a,w = d q (a π),w. Proposition The 1-form (s Hdt) L is exact. Proof. We divide the proof in two steps: (i) we show that the restriction of the Poincare-Cartan 1-form (s Hdt) L is closed and (ii) that it is exact. (i). To prove that the 1-form is closed we need to show that the differential d(s Hdt) = σ dh dt, (4.83) vanishes when applied to every pair of tangent vectors to L. Since, for each t [,T], the set L t has codimension 1 in L, there are only two possibilities for the choice of the two tangent vectors: (a) both vectors are tangent to L t, for some t [,T]. (b) one vector is tangent to L t while the second one is transversal. Case (a). Since both tangent vectors are tangent to L t, it is enough to show that the restriction of the one form σ dh dt to L t is zero. First let us notice that dt vanishes when applied to tangent vectors to L t, thus σ dh dt Lt = σ Lt. Moreover, since by definition L t = e t H (L ) one has σ Lt = σ e t H (L ) = (e t H ) σ L = σ L = ds L = d 2 (a π) L =. whereinthelastlineweusedlemma4.58andthefactthat(e t H ) σ = σ, sincee t H isanhamiltonian flow and thus preserves the symplectic form. Case (b). The manifold L is, by construction, the image of the smooth mapping Ψ : [,T] L [,T] T M, Ψ(t,λ) (t,e t H λ), Thus a tangent vector to L that is transversal to L t can be obtained by differentiating the map Ψ with respect to t: Ψ t (t,λ) = t + H(λ) T (t,λ) L. (4.84) It is then sufficient to show that the vector (4.84) is in the kernel of the two form σ dh dt. In other words we have to prove i t+ H (σ dh dt) =. (4.85) 118

119 The last equality is a consequence of the following identities i H σ = σ( H, ) = dh, i t σ =, i H (dh dt) = (i H dh) dt dh (i H dt) =, }{{}}{{} = = i t (dh dt) = (i t dh) dt dh (i }{{} t dt) = dh. }{{} = =1 where we used that ih dh = dh( H) = {H,H} =. (ii). Next we show that the form s Hdt L is exact. To this aim we have to prove that, for every closed curve Γ in L one has s Hdt =. (4.86) Every curve Γ in L can be written as follows Γ Γ : [,T] L, Γ(s) = (t(s),e t(s) H λ(s)), where λ(s) L. Moreover, it is easy to see that the continuous map defined by K : [,T] L L, K(τ,(t,e t H λ )) = (t τ,e (t τ) H λ ) defines an homotopy of L such that K(,(t,e t H λ )) = (t,e t H λ ) and K(t,(t,e t H λ )) = (,λ ). Then the curve Γ is homotopic to the curve Γ (s) = (,λ(s)). Since the 1-form s Hdt is closed, the integral is invariant under homotopy, namely s Hdt = s Hdt. Γ Γ Moreover, the integral over Γ is computed as follows (recall that Γ L and dt = on L ): s Hdt = s = d(a π) =, Γ Γ Γ where we used Lemma 4.58 and the fact that the integral of an exact form over a closed curve is zero. Then (4.86) follows Normal trajectories are geodesics Nowwearereadytoproveasufficientconditionthatensurestheoptimality ofsmallpieces ofnormal trajectories. As a corollary we will get that small pieces of normal trajectories are geodesics. Recall that normal trajectories for the problem m q = f u (q) = u i f i (q), (4.87) where f 1,...,f m is a generating family for the sub-riemannian structure are projections of integral curves of the Hamiltonian vector fields associated with the sub-riemannian Hamiltonian i=1 λ(t) = H(λ(t)), (i.e. λ(t) = e t H (λ )), (4.88) γ(t) = π(λ(t)), t [,T]. (4.89) 119

120 where H(λ) = max { λ,f u (q) 12 } u U u 2 = 1 q 2 m λ,f i (q) 2. (4.9) Recall that, given a smooth function a C (M), we can consider the image of its differential L and its evolution L t under the Hamiltonian flow associated to H as is (4.79) and (4.8). Theorem 4.6. Assume that there exists a C (M) such that the restriction of the projection π Lt is a diffeomorphism for every t [,T]. Then for any λ L the normal geodesic i=1 γ(t) = π e t H (λ ), t [,T], (4.91) is a strict length-minimizer among all admissible curves γ with the same boundary conditions. Proof. Let γ(t) be an admissible trajectory, different from γ(t), associated with the control u(t) and such that γ() = γ() and γ(t) = γ(t). We denote by u(t) the control associated with the curve γ(t). By assumption, for every t [,T] the map π Lt : L t M is a local diffeomorphism, thus the trajectory γ(t) can be uniquely lifted to a smooth curve λ(t) L t. Notice that the corresponding curves Γ and Γ in L defined by Γ(t) = (t,λ(t)), Γ(t) = (t,λ(t)) (4.92) have the same boundary conditions, since for t = and t = T they project to the same base point on M and their lift is uniquely determined by the diffeomorphisms π L and π LT, respectively. Recall now that, by definition of the sub-riemannian Hamiltonian, we have H(λ(t)) λ(t),f u(t) (γ(t)) 1 2 u(t) 2, γ(t) = π(λ(t)), (4.93) where λ(t) is a lift of the trajectory γ(t) associated with a control u(t). Moreover, the equality holds in (4.93) if and only if λ(t) is a solution of the Hamiltonian system λ(t) = H(λ(t)). For this reason we have the relations H(λ(t)) > λ(t),f u(t) (γ(t)) 1 2 u(t) 2, (4.94) H(λ(t)) = λ(t),f u(t) (γ(t)) 1 2 u(t) 2. (4.95) since λ(t) is a solution of the Hamiltonian equation by assumptions, while λ(t) is not. Indeed λ(t) and λ(t) have the same initial condition, hence, by uniqueness of the solution of the Cauchy problem, it follows that λ(t) = H(λ(t)) if and only if λ(t) = λ(t), that implies that γ(t) = γ(t). Let us then show that the energy associated with the curve γ is bigger than the one of the curve γ. Actually we prove the following chain of (in)equalities 1 T u(t) 2 dt = s Hdt = < 2 Γ Γs Hdt 1 T u(t) 2 dt, (4.96) 2 where Γ and Γ are the curves in L defined in (4.92). 12

121 By Lemma 4.59, the 1-form s Hdt is exact. Then the integral over the closed curve Γ Γ vanishes, and one gets s Hdt = s Hdt. The last inequality in (4.96) can be proved as follows Γ s Hdt = = < = 1 2 T T T T Γ λ(t), γ(t) H(λ(t))dt λ(t),fu(t) (γ(t)) H(λ(t))dt Γ λ(t),fu(t) (γ(t)) λ(t),fu(t) ( (γ(t)) 1 2 u(t) 2 u(t) 2 dt. where we used (4.94). A similar computation, using (4.95), gives that ends the proof of (4.96). Γs Hdt = 1 2 T ) dt (4.97) u(t) 2 dt, (4.98) As a corollary we state a local version of the same theorem, that can be proved by adapting the above technique. Corollary Assume that there exists a C (M) and neighborhoods Ω t of γ(t), such that π e t H da Ω : Ω Ω t is a diffeomorphism for every t [,T]. Then (4.91) is a strict length-minimizer among all admissible trajectories γ with same boundary conditions and such that γ(t) Ω t for all t [,T]. We are in position to prove that small pieces of normal trajectories are global length minimizers. Theorem Let γ : [,T] M be a sub-riemannian normal trajectory. Then for every τ [,T[ there exists ε > such that (i) γ [τ,τ+ε] is a length minimizer, i.e., d(γ(τ),γ(τ +ε)) = l(γ [τ,τ+ε] ). (ii) γ [τ,τ+ε] is the unique length minimizer joining γ(τ) and γ(τ +ε), up to reparametrization. Proof. Without loss of generality we can assume that the curve is parametrized by length and prove the theorem for τ =. Let γ(t) be a normal extremal trajectory, such that γ(t) = π(e t H (λ )), for t [,T]. Consider a smooth function a C (M) such that d q a = λ and let L t be the family of submanifold of T M associated with this function by (4.79) and (4.8). By construction, for the extremal lift associated with γ one has λ(t) = e t H (λ ) L t for all t. Moreover the projection π L is a diffeomorphism, since L is a section of T M. Fix a compact K M containing the curve γ and consider the restriction π t,k : L t π 1 (K) K of the map π Lt. By continuity there exists t = t (K) such that π t,k is a diffeomorphism, for 121

122 all t < t. Let us now denote δ K > the constant defined in Lemma 3.34 such that every curve starting from γ() and leaving K is necessary longer than δ K. Then, defining ε = ε(k) := min{δ K,t (K)}, we have that the curve γ [,ε] is contained in K and is shorter than any other curve contained in K with the same boundary condition by Corollary 4.61 (applied to Ω t = K for all t [,T]). Moreover l(γ [,ε] ) = ε since γ is length parametrized, hence it is shorter than any admissible curve that is not contained in K. Thus γ [,ε] is a global minimizer. Moreover it is unique up to reparametrization by uniqueness of the solution of the Hamiltonian equation (see proof of Theorem 4.6). Remark When D q = T q M, as it is the case for a Riemannian structure, the level set of the Hamiltonian {H = 1/2} = {λ T q M H(λ) = 1/2}, is diffeomorphic to an ellipsoid, hence compact. Under this assumption, for each λ {H = 1/2}, the corresponding geodesic γ(t) = π(e t H (λ )) is optimal up to a time ε = ε(λ ), with λ belonging to a compact set. It follows that it is possible to find a common ε > (depending only on q ) such that each normal trajectory with base point q is optimal on the interval [,ε]. It can be proved that this is false as soon as D q T q M. Indeed in this case, for every ε > there exists a normal extremal path that lose optimality in time ε, see Theorem Bibliographical notes The Hamiltonian approach to sub-riemannian geometry is nowadays classical. However the construction of the symplectic structure, obtained by extending the Poisson bracket from the space of affine functions, is not standard and is inspired by [?]. Historically, in the setting of PDE, the sub-riemannian distance(also called Carnot-Carathéodory distance) is introduced by means of sub-unit curves, see for instance[dgn7] and references therein. The link between the two definition is clarified in Exercice 4.33 The proof that normal extremal are geodesics is an adaptation of a more general condition for optimality given in [AS4] for a more general class of problems. This is inspired by the classical idea of fields of extremals in classical Calculus of Variation. 122

123 Chapter 5 Integrable systems In this chapter we present some applications of the Hamiltonian formalism developed in the previous chapter. In particular we give a proof the well-known Arnold-Liouville s Theorem and, as an application, we study the complete integrability of the geodesic flow on a special class of Riemannian manifolds. More examples of sub-riemannian completely integrable systems, together with a proof that all left-invariant sub-riemannian geodesic flows on 3D Lie groups are completely integrable, are presented in Chapter Reduction of Hamiltonian systems with symmetries Recall that a symplectic manifold (N,σ) is a smooth manifold wendowed with a closed nondegenerate two-form σ (cf. Section 4.6). Fix a smooth Hamiltonian h : N R. Definition 5.1. A first integral for the Hamiltonian system defined by h is any smooth function g : N R such that {h,g} =. Recall that by definition {h,g} = h(g) = g(h), hence, if g is afirstintegral for the Hamiltonian system defined by h, we have d dt h et g =. (5.1) namely, h is preserved along the flow of g. We want to show that the existence of a first integral for the Hamiltonian flow generated by h permits to define a reduction of the symplectic space and to reduce to 2n 2 dimensions. The construction of the reduction is local, in general. Fix a regular level set N g,c = {x N g(x) = c} of the function g. This means that d x g for every x N g,c. Fix a point x in the level set and a neighborhood U of x such that g(x) for x U. Notice that this is possible since d x g = σ(, g(x )) with d x g and σ non-degenerate. By continuity this holds in a neighborhood U. The set N g,c has the structure of smooth manifold of dimension 2n 1. Being odd dimensional, the restriction of the symplectic form to the tangent space to its tangent space T x N g,c is necessarily degenerate, and its kernel is one dimensional. Indeed, following the same arguments as in the proof of Proposition 4.31, we have that kerσ TxN g,c = g(x) 123

124 and integral curves of g are tangent to the level set N g,c. This is saying that the flow of g is well defined on the level set. Consider then the quotient N/ := {x U N g,c x 1 x 2 if x 2 = e s g (x 1 ),s R, t [,s] e t g (x 1 ) U}. In other words N/ is the set of orbits of the one parametric group {e s g } s R contained in the fixed level set N g,c of g (and not leaving U). Under our assumptions, the quotient has the structure of smooth manifold of dimension 2n 2. To build a chart close to a point [x ] N/ (with x N g,c ) it is enough to find an hypersurface N g,c N g,c passing through x and transversal to the orbit itself, namely T x N g,c = T x N g,c g(x ) Then local coordinates on N g,c, which has dimension 2n 2, induces local coordinates on N/. The construction of the above quotient is classical (see for instance [Arn89]). The restriction of the symplectic structure σ to the quotient N/ is necessarily non-degenerate (since σ is nondegenerate on the whole space N), hence gives to N/ the structure of symplectic space. Coming back to the original Hamiltonian h in involution with g, we have that h is indeed well defined on the quotient. Indeed since {h,g} = we have, for every t,s such that the terms are defined: e s g e t h = e t h e s g and h induces a well defined Hamiltonian flow on N/. In particular every function f on N that commutes with g, thanks to (5.1), is constant along the trajectories of g, hence defines a function on the quotient N/. Exercise 5.2. Prove that given f 1,f 2 C (N) such that {f 1,g} = {f 2,g} =, one has that {{f 1,f 2 },g} =. Deduce that the Poisson bracket defined on N descends to a well-defined Poisson bracket defined on the quotient N/ with C (N/ ) {f C (N) {f,g} = }. We end this section by showing that the construction of the space of orbits of an (Hamiltonian) vector field is in general only local as the following classical example shows. Example 5.3. Considerthetorus 1 T 2 [,1] 2 /, endowedwiththecanonical symplecticstructure σ = dp dx and the Hamitonian g(x,p) = αx+p. The vector field g is written as follows whose trajectories are given by g(x,y) = g p x g x p = x +α p, x(t) = x +t, p(t) = p +αt. It is well known that, for α R\Q, then every trajectory is an immersed one dimensional submanifold of T 2 that is dense in T 2. Hence the space of orbits (quotient with respect to the equivalence relation) has globally even no structure of topological manifold (the quotient topology is not Hausdorff). The next subsection describes an explicit situation where the symplectic reduction is globally defined. 1 with the equivalence relation (x,) (x,1) and (,p) (1,p). 124

125 5.1.1 Example of symplectic reduction: the space of affine lines in R n In this section we consider an important example of symplectic reduction, that is going to be used in what follows. Let us consider the symplectic manifold N = T R n = R n R n with coordinates (p,x) R n R n and canonical symplectic form n σ = dp i dx i. Define the Hamiltonian g : R 2n R given by We want to prove the following result. i=1 g(x,p) = 1 2 p 2. Proposition 5.4. For every c > the level set N g,c of g is globally diffeomorphic to R n S n 1, and its symplectic reduction N/ is a smooth (symplectic) manifold of dimension 2n 2 globally diffeomorphic to the space of affine lines in R n. Proof. For every c > then we have that the level set N g,c = {(x,p) : g(x,p) = c} = {(x,p) : p 2 = 2c}, is a smooth hypersurface of R 2n of dimension 2n 1, indeed globally diffeomorphic to R n S n 1. The Hamiltonian system for g is easily solved for every initial condition (x(),p()) = (x,p ) ẋ = g p (x,p) = p ṗ = g x (x,p) = { x(t) = x +tp p(t) = p, (5.2) and its flow is globally defined, described by a straight line contained in the space N g,c (notice that c > implies p ). Hence it is clear that the quotient N/ of N g,c with respect to orbits of the Hamiltonian vector field g is the space of affine lines of R n and is globally defined. The proof is completed by Proposition 5.5. Proposition 5.5. The set A(n) of affine lines in R n has the structure of smooth (symplectic) manifold of dimension 2n 2. Proof. We first fix some notation: denote by H i := {x i = } R n the i-th coordinate hyperplane and by U + i = S n 1 {x i > } an open subset of the sphere S n 1, for every i = 1,...,n. We define an open cover on A(n) in the following way: consider the open sets W i A(n) of affine lines L of R n that are not parallel to the hyperplane H i. Then for every line L W i there exists a unique x H i and v U + i such that L = { x + t v t R}. Then, for i = 1,...,n, we define the coordinate chart φ i : W i H i U + i, φ i(l) = ( x, v). Using the standard identification H i R n 1 and the stereographic projection W i R n 1, we build coordinate maps φ i : W i R 2n 2 for i = 1,...,n. Exercise 5.6. Check that {W i } i=1,...,n is an open cover of A(n), and that the change of coordinates φ i φ 1 j : R 2n 2 R 2n 2 is smooth for every i,j = 1,...,n. 125

126 5.2 Riemannian geodesic flow on hypersurfaces In this section we want to show that the Riemannian geodesic flow on an hypersurface of R n, that is an Hamiltonian flow on a 2n 2 dimension, can be seen as the restriction of the Hamiltonian flow of R 2n to the (reduced) symplectic space of affine lines in R n (cf. Section 5.1.1) Geodesics on hypersurfaces Let us consider now a smooth function a : R n R and consider the family of hypersurfaces defined by the level sets of a M c := a 1 (c) R n, c is a regular value of a, endowed with the Riemannian structure induced by the ambient space R n. Recall that, by classical Sard s Lemma for almost every c R, c is a regular value for a (in particular, M c is a smooth submanifold of codimension one in R n ). An adaptation of the arguments of Proposition 1.4 in Chapter 1, one can prove the following characterization of geodesics on a hypersurface M c. Proposition 5.7. Let γ : [, T] M be a smooth minimizer parametrized by length. Then γ(t) T γ(t) M. Exercise 5.8. Prove Proposition Riemannian geodesic flow and symplectic reduction For a large class of functions a, we will find an Hamiltonian, defined on the ambient space T R n, whose (reparametrized) flow generates the geodesic flow when restricted to each level set M c. Consider the standard symplectic structure on T R n T R n = R n R n = {(x,p) x,p R n }, σ = n dp i dx i, i=1 For x,p R n we will denote by x+rp the line {x+tp t R} R n. Assumption. We assume that the function a : R n R satisfies the following assumptions: (A1) the restriction of a : R n R to every affine line is strictly convex, (A2) a(x) + when x +. Under assumptions (A1)-(A2), the restriction of the function a to each affine line in R n always attains a minimum and we can define the Hamiltonian h : R n R n R, h(x,p) = min t R a(x+tp). (5.3) By definition, the function h is constant on every affine line in R n. If we define g : R n R n R, g(x,p) = 1 2 p 2. (5.4) this implies the following (cf. proof of Proposition 5.4). 126

127 Lemma 5.9. The Hamiltonian h is constant along the flow of g, i.e., {h,g} =. We can then apply the symplectic reduction technique explained in Section 5.1: the flow of h induced a well defined flow on the reduced symplectic space of dimension 2n 2 of affine lines in R n (cf. Section 5.1.1). We want to interpret this flow of affine lines as a flow on the level set M c and to show that this is actually the Riemannian geodesic flow. For every x,p R n let us define the functions defined as follows s : R n R n R, ξ : R n R n R n, (a) s(x,p) is the point at which the scalar function t a(x+tp) attains its minimum, (b) ξ(x,p) = x+s(x,p)p. Notice that, by construction, we have h(x,p) = a(ξ(x,p)) for every x,p R n. The first observation is that the line x+rp is tangent at ξ(x,p) to the level set a 1 (c), with c := a(ξ(x,p)). Indeed combining (a) and (b) we have ξ a p = d dt a(x+tp) =, (5.5) t=s(x,p) where denotes the scalar product in R n. The following proposition says that if we follow the motion of the affine lines x(t)+rp(t) along the flow (x(t),p(t)) of h, then the family of lines stay tangent to a fixed quadric and the point of tangency describes a geodesic on it. Proposition 5.1. Let (x(t),p(t)), for t [,T], be a trajectory of the Hamiltonian vector field h associated with (5.3). Then the function t ξ(t) := ξ(x(t),p(t)) R n, (5.6) (i) is contained in a fixed level set M c = a 1 (c), for some c R, (ii) is a geodesic on M c. Proof. Property (i) is a simple consequence of Corollary 4.2, since every function is constant along the flow of its Hamiltonian vector field. Indeed by construction h(x, p) = a(ξ(x, p)) and, denoting by (x(t), p(t)) the Hamiltonian flow, one gets a(ξ(t)) = a(ξ(x(t),p(t))) = h(x(t),p(t)) = const, i.e., the curve ξ(t) is contained on a level set of a. Moreover by definition of ξ(t) we have (cf. (5.5)) ξ(t) a p(t) =, t. (5.7) The Hamiltonian system associated with h reads { ẋ(t) = s(t) ξ(t) a ṗ(t) = ξ(t) a (5.8) 127

128 that immediately implies ẋ(t)+s(t)ṗ(t) =. Thus computing the derivative of ξ(t) = x(t)+s(t)p(t) one gets ξ(t) = ṡ(t)p(t), it follows that ξ(t) is parallel to p(t). Notice that s = s(t) is a well defined parameter on the curve ξ(t). Indeed computing the derivative with respect to t in (5.7) we have that ṡ(t) 2 p(t) ξ(t) ap(t) ξ(t) a 2 =. and the strict convexity of a implies 2ξ(t) p(t) ap(t) and ṡ(t) = ξ(t) a 2 2ξ(t) ap(t) p(t). In particular p(t) denotes the velocity of the curve ξ(t), when reparametrized with the parameter s = s(t), since p(t) = 1 implies ξ(t) = ṡ(t). Finally, the second derivative of the reparametrized ξ(s) is ṗ(s) and, since ṗ(s) is parallel to ξ(s) a = by (5.8), the second derivative ξ(s) (i.e., the curve ξ reparametrized by the length) is orthogonal to the level set, i.e., s ξ(s) is a geodesic on the level set. Remark Thus we can visualize the solutions of h as a motion of lines: the lines move in such a way to be tangent to one and the same geodesic. The tangency point x on the line moves perpendicular to this line in this process. We will also refer to this flow as the line flow associated with a. To end this section let us prove the following result, that will be used later in Section 5.6. Considertwo functionsa,b : R n Rsatisfyingourassumptions(A1)-(A2). Following ournotation, we set h(x,p) = a(ξ(x,p)), g(x,p) = b(η(x,p)), ξ(x,p) = x+s(x,p)p η(x,p) = x+τ(x,p)p where s(x,p) and τ(x,p) are defined as above, and ξ, η denote the tangency point of the line x+rp with the level set of a and b respectively. The following proposition computes the Poisson bracket of these Hamiltonian functions Proposition Under the previous assumptions {h,g} = (s τ) ξ a η b. (5.9) Proof. The coordinate expression of the Poisson bracket (4.19) can be rewritten as and using equation (5.8) for both h and g one gets {h,g} = p h x g x h p g, (5.1) {h,g} = (s τ) ξ a η b. (5.11) 128

129 5.3 Sub-Riemannian structures with symmetries Recall that, for a sub-riemannian manifold, we denote by H the sub-riemannian Hamiltonian. Definition We say that a complete smooth vector field X Vec(M) is a Killing vector field if it generates a one parametric flow of isometries, i.e. e tx : M M is an isometry for all t R. For every X Vec(M), we can define the function h X C (T M) linear on fibers associated with X by h X (λ) = λ,x(q), where q = π(λ). The following lemma shows that X is a Killing vector field if and only if h X commutes with the sub-riemannian Hamiltonian H. Lemma Let M be a sub- Riemannian manifold and H the sub-riemannian Hamiltonian. For a vector field X Vec(M) is a Killing vector field if and only if {H,h X } =. Proof. A vector field X generates isometries if and only if, by definition, the differential of its flow e tx : T q M T e tx (q)m preserves the sub-riemannian distribution and the norm on it, i.e. e tx v D e tx (q) for every v D q and e tx v = v. By definition of H, this is equivalent to the identity H((e tx ) λ) = H(λ), λ T M. (5.12) On the other hand Proposition 4.1 implies that (e tx ) = e t h X, where h X is the Hamiltonian linear on fibers related to X. Differentiating (5.12) with respect to t we find the equivalence H e tx = H h X H = {H,h X } =. In other words, with every 1-parametric group of isometries of M we can associate an Hamiltonian in involution with H. Let us show two classical examples where we have a sub-riemannian structure with symmetries. Example 5.15 (Revolution surfaces in R 3 ). Let M be a 2-dimensional revolution surface in R 3. Since the rotation around the revolution axis preserves the Riemannian structure, by definition, we have that the Hamiltonian generated by this flow and the Riemannian Hamiltonian H are in involution. Example 5.16 (Isoperimetric sub-riemannian problem). Let us consider a sub-riemannian structure associated with an isoperimetric problem defined on a 2-dimensional revolution surface M (see Section 4.4.2). The sub-riemannian structure on M R is determined by the function b C (M) satisfying da = bdv, where A Λ 1 (M) is the 1-form defining the isoperimetric problem and dv is the volume form on M. (i) By construction the problem is invariant by translation along the z-axis (ii) If, moreover, both M and b are rotational invariant we find a first integral of the geodesic flow as in the previous example 129

130 5.4 Completely integrable systems Let M be an n-dimensional smooth manifold and assume that there exist n independent Hamiltonians in involution in T M, i.e. a set of n smooth functions h i : T M R, i = 1,...,n, {h i,h j } =, i,j = 1,...,n. (5.13) such that the differentials d λ h 1,...,d λ h n of the functions are independent on an open dense set of point λ T M. Let us consider the vector valued map, called moment map, defined by h : T M R n, h = (h 1,...,h n ). Definition Under the assumptions (5.13), then we say that the map h is completely integrable. The same terminology applies to any of the Hamiltonian system defined by one of the Hamiltonian h i, for i = 1,...,n. Lemma Assume that h is completely integrable and c R n be a regular value of h. Then the set h 1 (c) is a n-dimensional submanifold in T M and we have T λ h 1 (c) = span{ h 1 (λ),..., h n (λ)}, λ h 1 (c). (5.14) Proof. Since c is a regular value of h, by Remark 2.58 the set h 1 (c) is a submanifold of dimension n in T M. In particular dimt λ h 1 (c) = n for every λ h 1 (c). Moreover, by Exercise 2.11, each vector field h i is tangent to h 1 (c), since h i h j = {h i,h j } = by assumption. To prove (5.14) it is then enough to show that these vector fields are linearly independent. Since c is a regular value of h, the differentials of the functions h i are linearly independent on h 1 (c), namely dimspan{d λ h 1,...,d λ h n } = n, λ h 1 (c). (5.15) Moreover the symplectic form σ on T M induces for all λ an isomorphism T λ (T M) T λ (T M) defined by w σ λ (,w). By nondegeneracy of the symplectic form, this implies that hence they form a basis for T λ h 1 (c). dimspan{ h 1 (λ),..., h n (λ)} = n, λ h 1 (c). (5.16) Remark Notice that the symplectic form vanishes on T λ h 1 (c). Indeed this is a consequence of the fact that σ( h i, h j ) = {h i,h j } = for all i,j = 1,...,n. In what follows we denote by N c = h 1 (c) the level set of h. If h 1 (c) is not connected, N c will denote a connected component of h 1 (c). Proposition 5.2. Assume that the vector fields h i are complete and define the map Ψ : R n Diff(N c ), Ψ(s 1,...,s n ) := e s 1 h 1... e sn h n Nc. (5.17) For every λ N c, the map Ψ λ : R n N c defined by Ψ λ (s) := Ψ(s)λ defines a transitive action of R n onto N c. 13

131 Proof. The complete integrability assumption together with Corollary 4.56 implies that the flows of h i and h j commute for every i,j = 1,...,n since By Proposition 2.26, this is equivalent to [ h i, h j ] = {h i,h j } =. e t h i e τ h j = e τ h j e t h i, t,τ R. (5.18) Thus, for every λ, the map Ψ λ is a smooth local diffeomorphism between at each point. Indeed, using (5.18), one has (cf. also Exercice 2.31) Ψ λ s i (Ψ λ (s)) = h i (Ψ λ (s)), i = 1,...,n, and the partial derivatives are linearly independent at each point of N c. Since the vector fields are complete by assumption, we can compute for every s,s R n Ψ(s+s ) = e (s 1+s 1 ) h 1... e (sn+s n ) h n = e s 1 h 1 e s 1 h 1... e sn h n e s n h n = e s 1 h 1... e sn h n e s 1 h 1... e s n h n (by (5.18)) = Ψ(s) Ψ(s ), which proves that Ψ is a group action. Denote, for every point λ N c, its orbit under the group action, namely Ω λ = imψ λ = {Ψ λ (s) s R n }. Exercise Using the fact that N c is connected, prove that Ω λ = N c for every λ N c. Hence the map Ψ λ is surjective, but in general it is not injective (as for instance in the case when M is compact). As a consequence we consider the stabiliser S λ of the point λ, i.e. the set S λ = {s R n Ψ λ (s) = λ}, Exercise Prove that S λ is a discrete 2 subgroup of R n, independent on λ N c. Then the proof of Proposition 5.2 is completed by the next lemma. Lemma Let G be a non trivial discrete subgroup of R n. Then there exist k N with 1 k n and v 1,...,v k R n such that { k } G = m i v i, m i Z. i=1 2 Recall that a subgroup G of R n is discrete if and only if for every g G there exist an open set U R n containing g and such that U G = {g}. 131

132 Proof. We prove the claim by induction on the dimension n of the ambient space R n. (i). Let n = 1. Since G is a discrete subgroup of R, then there exists an element e 1 closest to the origin R. We claim that G = Zv 1 = {mv 1, m Z}. By contradiction assume that there exists an element f G such that mv 1 < f < (m + 1)v 1 for some m Z. Then f := f mv 1 belong to G and is closer to the origin with respect to v 1, that is a contradiction. (ii). Assume the statement is true for n 1 and let us prove it for n. The discreteness of G guarantees the existence of an element v 1 G, closest to the origin. Moreover one can prove that G 1 := G Rv 1 is a subgroup and, as in part (i) of the proof, that G 1 := G Rv 1 = Zv 1. If G = G 1 then the theorem is proved with k = 1. Otherwise one can consider the quotient G/G 1. Exercise (i). Prove that there exists a nonzero element v 2 G/G 1 that minimize the distance to the line l = Rv 1 in R n. (ii). Show that there exists a neighborhood of the line l that does not contain elements of G/G 1. By Exercise 5.24 the quotient group G/G 1 is a discrete subgroup in R n /l R n 1. Hence, by the induction step there exists v 2,...,v k such that { k } G/G 1 = m i v i, m i Z. i=2 Corollary The connected manifold N c is diffeomorphic to T k R n k for some k n, where T k denotes the k-dimensional torus. Fix coordinates θ T k R n k, with (θ 1,...,θ k ) T k and (θ k+1,...,θ n ) R n k, then we have hi = n b ij (c) θj, (5.19) j=1 for some constants b ij (c) independent on λ N c. Proof. Fix c R n and a point λ N c. Let us consider the elements v 1,...,v k R n generators of the stabiliser S λ (independent on λ) given by Lemma 5.23 and complete it to a global basis v 1,...,v n. Denote by e 1,...,e n the canonical basis of R n and by B : R n R n any isomorphism such that Be i = v i for i = 1,...,n. We stress again that B does not depend on λ N c and is thus a function of c only. Then clearly the map B Ψ λ : R n N c is a local diffeomorphism and, due to the fact that S λ is the stabiliser of Ψ λ, descends to a well-defined map on the quotient B Ψ λ : T k R n k N c that is a global diffeomorphism. Introduce the coordinates (θ 1,...,θ n ) in R n induced by the choice of the basis v 1,...,v n. 132

133 Since (θ 1,...,θ n ) are obtained by (s 1,...,s n ) by a linear change of coordinates on each level set, the vector fields h i are constant in the s coordinates (indeed h i = si ) we have and the basis θ1,..., θn can be expressed as follows hi = si = n b ij (c) θj, (5.2) j=1 where b ij are the coefficients of the operator B, depending only on c (i.e., are constant on each level set N c ). Remark In general, due to the fact that the level set N c is not compact, the set (c,θ) do not define local coordinates on T M. If we assume that (c,θ) define a set of local coordinates, then the Hamiltonian system defined by h i takes the form (on the whole space T M) { ċ =, i = 1,...,n. (5.21) θ j = b ij (c) Notice that, assoonas(c,θ)definelocal coordinates, thecoordinateset(θ 1,...,θ n )arenotuniquely defined. In particular, every transformation of the kind θ i θ i + ψ i (c) still defines a set of cylindirical coordinates on each level set. The choice of the functions ψ i (c) corresponds to the choice of the initial value of θ i at a point (for every choice of c). However, the vector fields θi are independent on this choice. 5.5 Arnold-Liouville theorem In this section we consider in detail the case when the level set of a completely integrable system defined by h : T M R n, h = (h 1,...,h n ), are compact. More precisely we assume that for all values of c R the level set h 1 (c) is a smooth compact and connected manifold. From Proposition 5.2 and the fact that T k R n k is compact if and only if k = n we have the following corollary. Corollary If N c is compact, then N c T n. Fix λ N c and introduce the diffeomorphism F c : T n N c, F c (θ 1,...,θ n ) = Ψ λ (θ 1 +2πZ,...,θ n +2πZ). Next we want to analyze the dependence of this construction with respect to c. Fix c R n and consider a neighborhood O of the submanifold N c in the cotangent space T M. Being N c compact, in O we have a foliation of invariant tori N c, for c close to c. In other words (c 1,...,c n,θ 1,...,θ n ) is a well defined coordinate set on O. Theorem 5.28 (Arnold-Liouville). Let us consider a moment map h : T M R n associated with a completely integrable system such that every level set N c is compact and connected. Then for every c R there exists a neighborhood O of N c and a change of coordinates such that (c 1,...,c n,θ 1,...,θ n ) (I 1,...,I n,ϕ 1,...,ϕ n ) (5.22) 133

134 (i) I = Φ h, where Φ : h(o) R n is a diffeomorphism, (ii) σ = n j=1 di j dϕ j. Definition The coordinates (I, ϕ) defined in Theorem 5.28 are called action-angle coordinates. Proof of Theorem In this proof we will use the following notation: for c = (c 1,...,c n ) R n, j = 1,...,n and ε > we set -) c j,ε := (c 1,...,c j +ε,...,c n ), - γ i (c) is the closed curve in the torus N c parametrized by the i-th angular coordinate θ i, namely γ i (c) := {F c (θ 1,...,θ i +τ,...,θ n ) N c τ [,2π]}. - C j,ε i denotes the cylinder defined by the union of curves γ i (c j,τ ), for τ ε. Let us first define the coordinates I i = I i (c 1,...,c n ) by the formula I i (c) = 1 s, 2π γ i (c) where s is the tautological 1-form on T M. Being σ Nc, by Stokes Theorem the variable I i depends only on the homotopy class of γ i. 3 Let us compute the Jacobian of the change of variables. I i (c) = 1 c j 2π = 1 2π = 1 2π = 1 2π = 1 2π ( ) ε s s ε= γ i (c j,ε ) γ i (c) ε s ε= C j,ε i ε σ (where σ = ds) ε= C j,ε i cj +ε ε σ( cj, θi )dθ i dτ ε= c j γ i (c j,τ ) σ( cj, θi )dθ i. γ i (c) Using that θi = n j=1 bij (c) h j (see (5.2)) (where b ij are the entries of the inverse matrix of b ij ) one gets n σ(, θi ) = b ij (c)dh j. (5.23) 3 Hence, in principle, we are free to choose any basis γ 1,...,γ n for the fundamental group of T n. j=1 134

135 Moreover dh i = dc i since they define the same coordinate set. Hence I i (c) = 1 n b ik (c)dc k, ci dθ i c j 2π γ i (c) k=1 = 1 b ij (c)dθ i 2π = b ij (c) γ i (c) Combining the last identity with (5.23) one gets σ(, θi ) = di i In particular this implies that the symplectic form has the following expression in the coordinates (I,θ) n n σ = a ij (I)dI i di j + di i dθ i. (5.24) i,j=1 where the smooth functions a ij depends only on the action variables, since the symplectic form σ and the term n i=1 di i dθ i are closed form. Moreover it is easy to see that the first term of (5.24) can be rewritten as ( n n ) a ij (I)dI i di j = d β i (I) di i, and σ can be rewritten as i,j=1 σ = i=1 i=1 n di i d(θ i β i (I)). i=1 The proof is completed by setting ϕ i := θ i β i (I). Remark 5.3. This proves that there exists a regular foliation of the phase space by invariant manifolds, that are actually tori, such that the Hamiltonian vector fields associated to the invariants of the foliation span the tangent distribution. There then exist, as mentioned above, special sets of canonical coordinates on the phase space such that the invariant tori are the level sets of the action variables, and the angle variables are the natural periodic coordinates on the torus. The motion on the invariant tori, expressed in terms of these canonical coordinates, is linear in the angle variables. Indeed, since the h j are functions on I variables only, we have hj = n i=1 h j I i ϕi. Inotherwords, thehamiltonian systemdefinedbyh j intheangle-action coordinate(i,ϕ) iswritten as follows I i = h j ϕ i =, ϕ i = h j I i (I). (5.25) This explains also why this property is called complete integrability. The Hamitonian equation in these coordinates can indeed be solved explicitly. 135

136 5.6 Geodesic flows on quadrics In this chapter we prove that the geodesic flow on an ellipsoid (and, as a consequence, on quadrics) is completely integrable. More precisely we consider the particular case when the function a is a quadratic polynomial, i.e. every level set of our function is a quadric in R n. The presentation follows the arguments of Moser [Mos8b]. Definition Let A be an n n non degenerate symmetrix matrix. The quadric Q associated to A is the set Q = {x R n, A 1 x,x = 1}. (5.26) For simplicity we deal with the case when A has simple distinct eigenvalues α 1 <... < α n. Define, for every λ that is not an eigenvalue of A, a λ (x) = (A λi) 1 x,x, Q λ = {x R n, a λ (x) = 1}. If A = diag(α 1,...,α n ) is a diagonal matrix then (5.26) reads Q = {x R n, n i=1 x 2 i α i = 1}, and Q λ represents the family quadrics that are confocal to Q { } n Q λ = x R n x 2 i, α i λ = 1, λ R\Λ, i=1 where Λ = {α 1,...,α n } denotes the set of eigenvalues of A. Note that Q λ is an ellipsoid only if λ < α 1, while Q λ = when λ > α n. Note. In what follows by a generic point x for A we mean a point x that does not belong to anyproperinvariant subspaceof A. Inthediagonal caseit isequivalent tosay that x = (x 1,...,x n ), with x i for every i = 1,...,n. Exercise Denote by A λ := (A λi) 1. Prove the two following formulas: (i) d dλ A λ = A 2 λ, (ii) A λ A µ = (µ λ)a λ A µ. Lemma Let x R n be a generic point for A and let {Q λ } λ Λ be the family of confocal quadrics. Then there exists exactly n distinct real numbers λ 1,...,λ n in R\Λ such that x Q λi for every i = 1,...,n,. Moreover the quadrics Q λi are pairwise orthoghonal at the point x. Proof. For a fixed x, the function λ a λ (x) = A λ x,x satisfies in R\Λ a λ λ (x) = A 2 λ x,x = A λ x 2, where A λ := (A λi) 1, as follows from part (i) of Exercise 5.32 and the fact that A (hence A λ ) is self-adjoint. Thus a λ (x) is monotone increasing as a function of λ, and takes values from to + in each interval ]α i,α i+1 [ contained between two eigenvalues of A. Thisimplies that, for a fixed x, there exist exactly n values 136

137 λ 1,...,λ n such that a λi (x) = 1 (that means x Q λi ). Next, using part (ii) of Exercise 5.32 (also known as resolvent formula) we can compute, for two distinct values λ i λ j and x Q λi Q λj : x a λi, x a λj = 4 Aλi x,a λj x = 4 A λi A λj x,x = 4 λ j λ i ( A λi x,x A λj x,x ) =, where again we used the fact that A λ is selfadjoint and A λ x,x = 1 for all λ. Now we define the family of Hamiltonians associated with the family of confocal quadrics h λ (x,p) = min t a λ (x+tp) = a λ (ξ λ (x,p)), (5.27) Remark Notice that the minimum in (5.27) is attained at a unique point, and the function a λ satisfies the assumptions (A1)-(A2) introduced in Section??, only if the corresponding quadric is an ellipsoid. In what follows we generalize the considerations to all quadrics associated to λ R\Λ. Indeed we can still define the hamiltonian h λ as the value of the function a λ at its critical point along an affine line (hence defining h λ as an Hamiltonian on the set of affine lines as well). Now we prove another interesting orthogonality property of the family. We show that if two confocal quadrics are tangent to the same line, then their gradient are orthogonal at the tangency points. Proposition Assume that two confocal quadrics are tangent to a given line, i.e. there exist x,y R n such that a λ (ξ λ ) = a µ (ξ µ ), where ξ λ = x+t λ p, ξ µ = x+t µ p. Then ξλ a λ, ξµ a µ =. In particular {h λ,h µ } =. Proof. The condition that the quadric Q λ is tangent to the line x+ry at ξ λ is expressed by the following two equality A λ ξ λ,y =, A λ ξ λ,ξ λ = 1 (5.28) and an analogue relations is valid for Q µ. Notice than from (5.28) one also gets A λ ξ λ,ξ µ = A µ ξ µ,ξ λ = 1. Then, with the same computation as before using (5.32) ξλ a λ, ξµ a µ = 4 Aλ ξ λ,a µ ξ µ = 4 A λ A µ ξ λ,ξ µ = 4 µ λ ( A λξ λ,ξ µ A µ ξ µ,ξ λ ) =, This implies also {h λ,h µ } =, thanks to Proposition Proposition A generic line in R n is tangent to n 1 quadrics of a confocal family. 137

138 Proof. Write R n = L L where L = x+rp and L is the orthogonal hyperplane (passing through x). Consider the orthogonal projection π : R n L in the direction of L. The following exercise shows that the projection of a confocal family of quadrics in R n is a confocal family of quadrics on L. Exercise (i). Show that the map x a p λ (x) := A λ(x+t λ p),x+t λ p is a quadratic form and that p Kera p λ. In particular this implies that ap λ is well defined on the quotient Rn /Rp. (ii). Prove that {a p λ } λ is a family of confocal quadric on the factor space (in n 1 variables). Applying then Lemma 5.33 to the family {a p λ } λ we get that, for a generic choice of x, there exists n 1 quadrics passing through the point on the plane where the line is projected, i.e. the line x+rp is tangent to n 1 confocal quadrics of the family {a λ } λ. Remark Notice that thisproves that everygenericlineinr n isassociated withanorthonormal frame of R n, being all the normal vectors to the n 1 quadrics given by Proposition 5.36 mutually orthogonal and orthogonal to the line itself. Theorem The geodesic flow on an ellipsoid is completely integrable. In particular, the tangents of any geodesics on an ellipsoid are tangent to the same set of its confocal quadrics, i.e. independently on the point on the geodesic. Proof. We want to show that the functions λ 1 (x,p),...,λ n 1 (x,p) (as functions defined on the set of lines in R n ) that assign to each line x + Rp in R n the n 1 values of λ such that the line is tangent to Q λ are independent and in involution. First notice that each level set λ i (x,p) = c coincide with the level set h c = 1. Hence, by Exercise 4.32, the two functions defines the same Hamiltonian flow on this level set(up to reparametrization). We are then reduced to prove that the functions h c1,...,h cn 1 are independent and in involution, which is a consequence of Proposition Since the lines that are tangent to a geodesic on the ellipsoid Q λ form an integral curve of the Hamiltoian flow of the associated function h λ, and all the Poisson brackets with the other Hamiltonians are zero, it follows that the line remains tangent to the same set of n 1 quadrics. Bibliographical notes The notion of complete integrability introduced here is the classical one given by Liouville and Arnold[Arn89]. Sometimes, complete integrability of a dynamical system is also referred to systems whose solution can be reduced to a sequence of quadratures. This means that, even if the solution is implicitly given by some inverse function or integrals, one does not need to solve any differential equation. Notice that by Theorem 5.28 complete integrability implies integrability by quadratures (see also Remark 5.3). The complete integrability of the geodesic flow on the triaxial ellipsoid was established by Jacobi in Jacobi integrated the geodesic flow by separation of variables, see [Jac39]. The appropriate coordinates are called the elliptic coordinates, and this approach works in any dimension. Here we give a different derivation, essentially due to Moser [Mos8b], as an application of the theory developed in the first sections of the chapter. For further discussions on the geodesic flow on the ellipsoids or quadrics, one can see [Mos8a, Aud94,?]. 138

139 Chapter 6 Chronological calculus In this chapter we develop a language, called chronological calculus, that will allow us to work in an efficient way with flows of nonautonomous vector fields. 6.1 Motivation Classical formulas from calculus that are valid in R n are often no more meaningful on a smooth manifold, unless one consider them as written in coordinates. Let us consider for instance a smooth curve γ : [,T] R n. The fundamental theorem of calculus states that, for every t [,T], one has t γ(t) = γ()+ γ(s) ds. (6.1) Formula (6.1) has no meaning a priori if γ takes values on a smooth manifold M. Indeed, if γ : [,T] M, then γ(s) T γ(s) M and one should integrate a family of tangent vectors belonging to different tangent spaces. Moreover, since M has no affine space structure, one should explain what is the sum of a point on M with a tangent vector. Saying that formula (6.1) is meaningful in coordinates means that, once we identify an open set U on M with R n through a coordinate map φ : U M R n, we reduce (6.1) to n scalar identities. These n identities corresponds to the choice of n (independent) scalar functions φ = (φ 1,...,φ n ). Indeed it is not necessary to choose a specific set of coordinate functions to let (6.1) have a meaning, and the idea behind the formalism we introduce in this chapter is that we can write formula (6.1) along any scalar function, treating this function as the object where the formula is evaluated. More precisely, let us fix a smooth curve γ : [,T] M and a smooth function a : M R and let us apply the fundamental theorem of calculus to the scalar function a γ : [,T] R. We get, for every t [,T] the following identity a(γ(t)) = a(γ())+ t dγ(s) a, γ(s) ds (6.2) Formula (6.2) is meaningful even if we are on a manifold since it is a scalar identity. The integrand is the duality product between d γ(s) a T γ(s) M and γ(s) T γ(s)m. 139

140 If we think to a point on M as acting on a function by evaluating the function at that point, and to a tangent vector as acting on a function by differentiating in the direction of the vector, then we can think to (6.2) as formula (6.1) when evaluated at a, or at (6.2) as the coordinate version of (6.1). If we choose as a the functions φ i for i = 1,...,n we are writing the coordinate version of the identity in the classical sense. In what follows we develop in a formal way this flexible language that has the advantage of computing things as in coordinates keeping track the geometric meaning of the object we are dealing with. 6.2 Duality The basic idea behind this formal construction is to replace nonlinear objects defined on the manifold M with their linear counterpart, when interpreted as maps on the space C (M) of smooth functions on M. WerecallthatthesetC (M)ofsmoothfunctionsonM isanr-algebrawiththeusualoperation of pointwise addition and multiplication (a+b)(q) = a(q)+b(q), (λa)(q) = λa(q), a,b C (M), λ R, (a b)(q) = a(q)b(q). Any point q M can be interpreted as the evaluation linear functional q : C (M) R, q(a) := a(q). For every q M, the functional q is a homomorphism of algebras, i.e., it satisfies q(a b) = q(a) q(b). A diffeomorphism P Diff(M) can be thought as the change of variables linear operator P : C (M) C (M), P(a) := a(p(q)). which is an automorphism of the algebra C (M). Remark 6.1. One can prove that for every nontrivial homomorphism of algebras ϕ : C (M) R thereexistsq M suchthatϕ = q. Analogously, foreveryautomorphismofalgebrasφ : C (M) C (M), there exists a diffeomorphism P Diff(M) such that P = Φ. A proof of these facts is contained in [AS4, Appendix A]. Nextwewanttocharacterize tangentvectors asfunctionalsonc (M). AsexplainedinChapter 2, a tangent vector v T q M defines in a natural way the derivation in the direction of v, i.e. the functional v : C (M) R, v(a) = d q a,v, that satisfies the Leibnitz rule v(a b) = v(a)b(q)+a(q) v(b), a,b C (M). 14

141 If v T q M is the tangent vector of a curve q(t) such that q() = q, it is also natural to check the identity as operators v = d dt : C t= q(t) (M) R. (6.3) Indeed, it is sufficient to differentiate at t = the following identity q(t)(a b) = q(t)a q(t)b. In the same spirit, a vector field X Vec(M) is characterized, as a derivation of C (M) (cf. again the discussion in Chapter 2), as the infinitesimal version of a flow (i.e., family of diffeomorphisms smooth w.r.t t) P t Diff(M). Indeed if we set X = d dt Pt : C (M) C (M), t= we find that X satisfies (see (2.14)) On the notation X(ab) = X(a)b+a X(b), a,b C (M). In the following we will identify any object with its dual interpretation as operator on functions and stop to use a different notation for the same object when acting on the space of smooth functions. If P is a diffeomorphism on M and q is a point on M the point P(q) is simply represented by the usual composition q P of the corresponding linear operator. Thus, when using the operator notation, composition works in the opposite side. To simplify the notation in what follows we will remove the hat identifying an object with its dual, but use the symbol to denote the composition of these object, so that P(q) will be q P. Analogously, the composition X P of a vector field X and a diffeomorphism P will denote the linear operator a X(a P). 6.3 Topology on the set of smooth functions We introduce the standard topology on the space C (M). Denote by X 1,...,X N a family of globally defined vector fields such that span{x 1,...,X N } q = T q M, q M. For α N and K M compact, define the following seminorms of a function f C (M) f α,k = sup{ (X il Xi1 f)(q) : 1 i j N, l α} q K, The family of seminorms s,k induces a topology on C (M) with countable local bases of neighborhood as follows: take an increasing family of compact sets {K n } n N invading M, i.e., K n K n+1 M for every n N and M = n N K n. For every f C (M), a countable local base of neighborhood of f is given by U f,n := { g C (M) : f g n,kn 1 n }, n N. (6.4) 141

142 Exercise 6.2. (i). Prove that (6.4) defines a basis for a topology. (ii) Prove that this topology does not depend neither on the family of vector fields X 1,...,X N generating the tangent space to M nor on the family of compact sets {K n } n N invading M. This topology turns C (M) into a Fréchet space, i.e., a complete, metrizable, locally convex topological vector space, see [Hir76, Chapter 2]. Remark 6.3. In differential topology this is also called weak topology on C (M), in contrast with the strong (or Whitney) topology that can be defined on C (M). The two topology coincide when the manifold M is compact. For more details about different topologies on the spaces C k (M,N) of C k maps among two smooth manifolds M and N one can see, for instance, [Hir76, Chapter 2]. Example 6.4. Prove that, given a diffeomorphism P Diff(M) and α N, there exists a constant C α,p > such that for all f C (M) one has Pf α,k C α,p f α,p(k), K M. In other words the diffeomorphism P, when interpreted as a linear operator on C (M), is continuous in the Whitnhey topology. One can then define its seminorm P α,k := sup{ Pf α,k : f α,p(k) 1}. Similarly, given a smooth vector field X on M, one defines its seminorms by X α,k := sup{ Xf α,k : f α+1,k 1} Family of functionals and operators Once the structure of a Fréchet space on C (M) is given, one can define regularity properties of family of functions in C (M). In particular continuous and differentiable families of functions t a t are defined in a standard way. Moreover, we say that the family t a t C (M) defined on an interval [t,t 1 ] is measurable, if the map q a t (q) is measurable on [t,t 1 ] for every q M locally integrable, if for every α N and K M compact. t1 t a t α,k dt <, absolutely continuous, if there exists a locally integrable family of functions b t such that Lipschitz, if for every α N and K M compact. t a t = a t + b s ds. t a t a s α,k C s,k t s, 142

143 Analogous regularity property for a family of linear functionals (or linear operators) on C (M) are then naturally defined in a weak sense: we say that a family of operators t A t is continuos (differentiable, etc.) if the map t A t a has the same property for every a C (M). We define a non-autonomous vector field as a family of vector fields X t that is locally bounded. A non-autonomous flow is a family of diffeomorphisms P t that is absolutely continuous. Hence, for any non-autonomous vector field X t, the family of functions t X t a is locally integrable for any a C (M). Similarly, for any non-autonomous flow P t the family of functions t a P t is absolutely continuous for any a C (M). Integrals of measurable locally integrable families, and derivative of differentiable families are also defined in the weak sense: for instance, if X t denotes some locally integrable family of vector fields we denote t X s ds : a t X s ads d dt X t : a d dt (X ta) One can show that if A t and B t are continuous families of operators on C (M) wich are differentiable at some t, then the family A t Bt is differentiable at t and satisfies the Leibnitz rule ( ) ( ) d d dt (A t Bt ) = d t=t dt A t Bt +A t t=t dt B t. (6.5) t=t The same result holds true for the composition of functionals with operators. For a proof of the last fact one can see [AS4, Chapter 2 and Appendix A]. 6.4 Operator ODE and Volterra expansion Consider a nonautonomous vector field X t and the corresponding nonautonomous ODE d dt q(t) = X t(q(t)), q M. (6.6) Using the notation introduced in the previous section we can rewrite (6.6) in the following way 1 d dt q(t) = q(t) X t. (6.8) As discussed in Chapter 2, the solution to the nonautonomous ODE (6.6) defines a flow, i.e., family of diffeomorphisms, P s,t. We call P s,t the right chronological exponential and use the notation P s,t := exp t s X τ dτ. (6.9) Sometimes it is useful to set the initial time s =. In this case we use the short notation P t := P,t. 1 Indeed assume that q(t) satisfies (6.6) and let a C (M). Using hat notation of Section 6.2 ( ) d dt q(t) a = dt q(t)a d = d dt a(q(t)) = d q(t) a,x t(q(t)) = ( X ta)(q(t)) = ( q(t) X t)a. (6.7) 143

144 Lemma 6.5. The flow P t defined by (6.9) satisfies the differential equation d dt P t = P t Xt, P = Id. (6.1) Proof. Fix a point q M and denote by q(t) the solution of the Cauchy problem (6.6) with initial condition q() = q. By the very definition of P t we have that q(t) = P t (q ), which easily implies q(t) = q Pt Volterra expansion The operator differential equation { P t = P t Xt P = Id can be rewritten as an integral equation as follows P t = Id+ Replacing into (6.12), and iterating we have t ( P t = Id+ Id+ where = Id+ =... t N 1 = Id+ R N = X s ds+ s1 t (6.11) P s Xs ds (6.12) P s2 Xs2 ds 2 ) s 2 s 1 t k=1 s k... s 1 t s N... s 1 t Xs1 ds 1 P s2 Xs2 Xs1 ds 1 ds 2 X sk Xs1 d k s+r N P sn XsN Xs1 d N s Formally, letting N and assuming that R N, we can write the chronological series Id+ X sk Xs1 d k s (6.13) k=1 k (t) where k (t) = {(s 1,...,s k ) R k s k... s 1 t} denotes the k-dimensional symplex. A detailed discussion about the convergence of the series is contained in Section 6.A. Remark 6.6. If we write expansion (6.13) when X t = X is an autonomous vector field, we find that the chronological exponential coincides with the exponential of the vector field t exp Xds Id+ X} {{ X } d k s k=1 k (t) k vol( k (t))x k t k = k! Xk = e tx, k= 144 k=

145 since vol( k (t)) = t k /k!. In the nonautonomous case for different time X s1 and X s2 might not commute, hence the order in which the vector fields appears in the composition is crucial. The arrow in the notation recalls in which direction the parameters are increasing. Exercise 6.7. Prove that in general, for a nonautonomous vector field X t, one has exp t X s ds e t Xsds. (6.14) Prove that if [X t,x τ ] = for all t,τ R then the equality holds in (6.14) Assumenow that P t satisfies (6.12) and consider the inverse flow Q t := P 1 t. Let us characterize the differential equation satisfied by Q t. First, by differentiating the identity and using the Leibnitz rule one gets to P t Qt = Id, (6.15) P t Qt +P t Q t =. Using (6.11) then we get P t Xt Qt +P t Q t = hence we get, multiplying Q t on the right both sides, that Q t satisfies the equation { Q t = X t Qt, Q = Id. (6.16) The solution to the problem (6.16) will be denoted by the left chronological exponential Q t := exp Repeating analogous reasoning, we find the formal expansion exp t ( X s )ds Id+ t k=1 s k... s 1 t ( X s )ds. (6.17) ( X s1 ) ( X sk )d k s. The difference with respect to the right chronological exponential is in the order of composition. In particular the arrow over the exp says in which direction the time increases. We can summarize properties of the chronological exponential into the following d exp dt t t X s ds = exp t X s ds X t, (6.18) t d exp X s ds = X t exp X s ds, (6.19) dt ( t ) 1 exp X s ds = t exp ( X s )ds. (6.2) 145

146 6.4.2 Adjoint representation Now we can study the action of diffeomorphisms on vectors and vector fields. Let v T q M and P Diff(M). We claim that, as functionals on C (M), we have P v = v P. Indeed consider a curve q(t) such that q() = v and compute (P v)a = d ( ) d dt a(p(q(t))) = t= dt q(t) Pa = v Pa t= Recall that, if X Vec(M) is a vector field we have P X q = P (X P ). In a similar way we 1 (q) will find an expression for P X as derivation of C (M) P X = P 1 X P. (6.21) Remark 6.8. We can reinterpret the pushforward of a vector field in a totally algebraic way in the space of linear operator on C (M). Indeed where P X = (AdP 1 )X, (6.22) AdP : X P X P 1, X Vec(M) is the adjoint action of P on the space of vector fields 2. Assume now that P t = exp t X sds. We try to characterize the flow AdP t by looking for the ODE it satisfies. Applying to a vector field Y we have ( ) d dt AdP t Y = d dt (AdP t)y = d dt (P t Y Pt 1 ) where = P t Xt Y P 1 t = P t (Xt Y Y Xt ) P 1 t = (AdP t )[X t,y] = (AdP t )(adx t )Y adx : Y [X,Y], +P t Y ( Xt ) P 1 t is the adjoint action on the Lie algebra of vector fields. In other words we proved that AdP t is a solution to the differential equation Ȧ t = A t adxt, A = Id. Thus it can be expressed as chronological exponential and we have the identity ( t ) Ad exp X s ds = t exp adx s ds. (6.23) Notice that combining (6.23) with (6.22) in the case of an autonomous vector field one gets 2 this is the differential of the conjugation Q P Q P 1, Q Diff(M) e tx = e tadx (6.24) 146

147 Exercise 6.9. Prove that, if [X t,y] = for all t, then (AdP t )Y = Y. Remark 6.1. More explicitly we can write the following formal expansion (AdP t )Y Y + [X sn,...,[x s2,[x s1,y]]d k s, (6.25) k=1 s k... s 1 t which generalizes the formula (2.31). Indeed if P t = e tx is the flow associated with an autonomous vector field we get (Ade tx )Y = e tx Y = Y + k=1 t k k! [X,...,[X,Y]] Y +t[x,y]+ t2 2 [X,[X,Y]]+o(t2 ) Exercise Prove the following using operator notation: 1. Show that ad is the infinitesimal version of the operator Ad, i.e. if P t is a flow generated by the vector field X Vec(M) then adx = d dt AdP t. t= 2. Show that, if P Diff(M), then P preserves Lie brackets, i.e. P [X,Y] = [P X,P Y]. 3. Show that the Jacobi identity in Vec(M) is the infinitesimal version of the identity proved in 2. (Hint. use P t = e tz ) Exercise Prove the following change of variables formula for a nonautonomous flow: P exp t X s ds P 1 = exp Notice that for an autonomous vector field this identity reduces to (2.23). 6.5 Variations Formulae Consider the following ODE t (AdP)X s ds. (6.26) q = X t (q)+y t (q) (6.27) where Y t is thought as a perturbation term of the equation (6.6). We want to describe the solution to the perturbed equation (6.27) as the perturbation of the solution of the original one. Proposition Let X t,y t be two nonautonomous vector fields. Then t exp (X s +Y s )ds = t ( s ) exp exp adx τ dτ Y s ds exp = exp t where P t = exp t X sds denotes the flow of the original vector field. t X s ds (6.28) (AdP s )Y s ds P t (6.29) 147

148 Proof. Our goal is to find a flow R t such that Q t := exp t By definition of right chronological exponential we have (X s +Y s )ds = R t Pt (6.3) Q t = Q t (Xt +Y t ) (6.31) On the other hand, from (6.3), we also have Comparing (6.31) and (6.32), one gets Q t = Ṙt P t +R t P t = Ṙt P t +R t Pt Xt = Ṙt P t +Q t Xt (6.32) and the ODE satisfied by R t is Q t Yt = Ṙt P t Ṙ t = Q t Yt P 1 t = R t (AdPt )Y t Since R = Id we find that R t is a chronological exponential and exp t (X s +Y s )ds = exp t which is (6.29). Plugging (6.23) in (6.29) one gets (6.28). (AdP s )Y s ds P t Exercise Prove the following versions of the variation formula: (i) For every non autonomous vector fields X t,y t on M exp t (X s +Y s )ds = exp t X s ds exp t ( s exp t ) adx τ dτ Y s ds (6.33) (ii) For every autonomous vector fields X, Y Vec(M) prove that e t(x+y ) = exp t = e tx exp e sadx Yds e tx = exp t t e sx Yds e tx (6.34) e (s t)adx Yds (6.35) 148

149 6.A Estimates and Volterra expansion In this section we discuss the convergence of the Volterra expansion Id+ k=1 k (t) X sk Xs1 d k s (6.36) where k (t) = {(s 1,...,s k ) R k s k... s 1 t} denotes the k-dimensional symplex. Recall that if X s = X is autonomous then the series (6.36) simplifies in k= t k k! Xk (6.37) We prove the following result, saying that in general, if the vector field is not zero, the chronological exponential is never convergent on the whole space C (M). Proposition Let X be a nonzero smooth vector field. Then there exists a C (M) such that the Volterra expansion t k k! Xk a (6.38) is not convergent at some point q M. k= Proof. Fix a point q M such that X(q) and consider a smooth coordinate chart around q such that X is rectified in this chart. We are then reduced to prove the statement in the case when X = x1 in R n. Fix an arbitrary sequence (c n ) n N and let f : I R defined in a neighborhood I of such that f (n) () = c n, for every n N. The existence of such a function is guaranteed by Lemma Then define a(x) = f(x 1 ), where x = (x 1,x ) R n. In this case X k a(q) = k x 1 f() = c k and k= t k k! Xk a q = which is not convergent for a suitable choice of the sequence (c n ) n N. k= t k k! c k (6.39) Lemma 6.16 (Borel lemma). Let (c n ) n N be a real sequence. Then there exist a C function f : I R defined in a neighborhood I of such that f (n) () = c n, for every n N. Proof. Fix a C bump function φ : R R with compact support and such that φ() = 1 and φ (j) () = for every j 1. Then set g k (x) := c ( ) k x k! xk φ (6.4) ε k Notice that g (j) k () = δ jkc k, where δ jk is the Kronecker symbol, and g (j) k (x) C j,kε k j k x R and some constant C j,k >. Then choose ε k > in such a way that for every g (j) k (x) 2 j, j k 1, x R, (6.41) 149

150 and define the function f(x) := g k (x). (6.42) k= The series (6.42) converges uniformly with all the derivatives by (6.41) and, by differentiating under the sum, one obtains f (j) (x) := k= g (j) k (x), f(j) () := k= g (j) k () = a j. Even if in general the Volterra expansion is not convergent, it gives a good approximation of the chronological exponential. More precisely, if we denote by S N (t) := Id+ N 1 k=1 k (t) the N-th partial sum, we have the following estimate. X sk Xs1 d k s Theorem For every t >, α N, K M compact, we have ( t exp ) X s ds S m (t) a C ( t t α,k m! ec Xs α,k ds for some K compact set containing K and some constant C = C α,m,k >. m X s α+m 1,K ds) a α+m,k, (6.43) The proof of this result is postponed to Appendix 6.B. Let us specify this estimate for a non autonomous vector field of the form m X t = u i (t)x i where X 1,...,X m are smooth vector fields on M and u L 2 ([,T],R m ). i=1 Theorem For every t >, α N, K M compact, we have (denoting u 1,t = u L 1 ([,t],r m )) ( exp t ) m u i (t)x i S m (t) a C m! ec u 1,t u m 1,t a α+m,k (6.44) α,k i=1 for some K compact set containing K and some constant C = C α,m,k >. Proof. It follows from the previous theorem and from the fact that for a vector field of the form X t = m i=1 u i(t)x i we have the estimate t X s s,k ds u L 1 ([,t],r m ) (6.45) 15

151 Indeed we have for every f such that f α+1,k 1 that m u i (s)x i f i=1 α,k ( sup X m i l Xi1 u i (s)x i f) x K sup x K i=1 m u i (s) X il Xi1 Xi f i=1 (6.46) m u i (s) (6.47) i=1 To complete the discussion, let us describe a special case when the Volterra expansion is actually convergent. One can prove the following convergence result. Proposition Let X t be a nonautonomous vector field, locally bounded w.r.t. t. Assume that there exists a Banach space (L, ) C (M) such that (a) X t a L for all a L and all t I (b) sup{ X t a : a L, a 1,t I} < Then the Volterra expansion (6.36) converges on L for every t I. Proof. We can bound the general term of the sum with respect to the norm of L k (t) X sk Xs1 a d k s X sk X s1 d k s a (6.48) k (t) = 1 ( t n X s ds) a (6.49) n! then the norm of the n-th term of the Volterra expansion is bounded above by the exponential series, and the Volterra expansion converges on L uniformly. Remark 6.2. The assumption in the theorem is satisfied in particular for a linear vector field X on M = R n and L C (R n ) the set of linear functions. If M, the vector field X t and the function a are real analytic, then it can be proved that the Volterra expansion is convergent for small time. For a precise statement seet [AG78]. 6.B Remainder term of the Volterra expansion In this Appendix we prove Theorem We start with the following key result. Proposition Let X t be a complete non autonomous vector field and P t,s its flow. Then for every t >, α N and K M compact, there exists K compact containing K and C > such that P,t a α,k Ce t Xs α,k ds a α,k (6.5) 151

152 Proof. Denote the compact set K t = {P,s (K) s [,t]} and define the function β(t) = sup { P,t f } α,k f C (M), f α+1,kt f α+1,kt (6.51) Notice that the function β is measurable in t since the supremum in the right hand side can be taken over an arbitrary countable dense subset of C (M). We have the following lemma, whose proof is postponed at the end of the proof of the proposition. Lemma For every t >, α N and K M compact, there exists C > such that Let us now consider the identity which implies P,t f α,k Cβ(t) f α,kt (6.52) P,t a = a+ t P,t a α,k a α,k + appying Lemma 6.22 with f = X s a we get P,t a α,k a α,k +C t P,s Xs ads t P,s Xs a α,k ds β(s) X s a α,kt ds t a α,k +C a α+1,kt β(s) X s α,kt ds where we used that K s K t for s [,t], hence α,ks α,kt. Dividing by a α+1,kt and using a α,kt a α+1,kt we get P,t a α,k a α+1,kt 1+C By definition (6.51) of the function β we have the inequality that by Gronwall inequality implies β(t) 1+C t t β(s) X s α,kt ds β(s) X s α,kt ds (6.53) β(t) e C t Xs α,k t ds (6.54) and (6.5) follows combining the last inequality and (6.52) choosing f equal to a and for every compact set K containing K t. 152

153 Now we complete the proof of the main result, namely Theorem Recall that we can write t exp X s ds S m (t) = P,sm Xsm Xs1 ds s m... s 1 t hence ( t exp ) X s ds S m (t) a α,k s m... s 1 t P,sm Xsm Xs1 a α,k ds Applying Proposition 6.21 to the function X sm Xs1 a one obtains ( t ) t exp X s ds S m (t) a Ce Xs α,kds X sm Xs1 a α,k ds (6.55) α,k s m... s 1 t for some compact K containing K. Now let us estimate the integral X sm Xs1 a α,k ds (6.56) s m... s 1 t s m... s 1 t a α+m,k X sm α,k s m... s 1 t Xsm 1 α+1,k X s1 α+m 1,K a α+m,k ds (6.57) X sm α+m 1,K Xsm 1 α+m 1,K X s1 α+m 1,K ds (6.58) ( 1 t m a α+m,k X s m! α+m 1,K ds) (6.59) and combining this inequality with (6.55) we are done. Proof of Lemma By Whitney theorem it is not restrictive to assume that M is a submanifold of R n for some n. We still denote by {X i } i=1,...,n the vector fields (now defined on R n ) spanning the tangent space to M. Notice that if f α,kt = then also P,t f α,k = and the identity is satisfied, hence we can assume f α,kt. Fix a point q K where the supremum in P,t f α,k = sup{ (X il Xi1 P,t f)(q) : 1 i j N, l α} q K, is attained (the existence guaranteed by compactness of K) and let p f be the polynomial in R n and of degree α that coincides with the Taylor polynomial of degree α of f at q t = P,t (q ). Then by construction we have P,t f α,k P,t p f α,k, p f α,qt f α,kt (6.6) Moreover in thefinite-dimensional spaceofpolynomials inr n ofdegree α all normsareequivalent then there exist C > such that p f α,kt C p f α,qt (6.61) 153

154 Combining (6.6) and (6.61) with p f α,kt = p f α+1,kt (since p f is a polynomial of degree α) and the definition of β, we have P,t f α,k f α,kt P,tp f α,k p f α,qt C P,tp f α,k p f α,kt C P,tp f α,k p f α+1,kt Cβ(t). 154

155 Chapter 7 Lie groups and left-invariant sub-riemannian structures In this chapter we study normal Pontryagin extremals on left-invariant sub-riemannian structures on a Lie groups G. Such a structures provide most of the examples in which normal Pontryagin extremal can be computed explicitly in terms of elementary functions. We introduce a Lie groups as a sub-group of the group of diffeomorphisms of a manifold M induced by a family of vector fields whose Lie algebra is finite dimensional. We then define left-invariant sub-riemannian structures. Such structures are always constant rank and, if they are of rank k, they can be generated by exactly k linearly independent vector fields defined globally. On such a structure we have always global existence of minimizers. We then discuss Hamiltonian systems on Lie groups with left-invariant Hamiltonians. Such Hamiltonian systems are particularly simple since their tangent and cotangent bundles are always trivial. They have always a certain number of constant of the motion that for systems on a Lie group of dimension 3 are sufficient for the complete integrability. We study in details some classes of systems in which one can obtain the explicit expression of normal Pontryagin extremals. 7.1 Sub-groups of Diff(M) generated by a finite dimensional Lie algebra of vector fields Let M beasmooth manifold of dimension n and let L Vec(M) beafinite-dimensional Lie algebra of vector fields of dimension diml = l. Assume that all elements of L are complete vector fields. The set G := {e X 1... e X k k N,X 1,...,X k L} Diff(M), (7.1) that has a natural structure of subgroup of the group of diffeomorphisms of M, where the group law is given by the composition. We want to prove the following result. Theorem 7.1. The group G can be endowed with a structure of connected smooth manifold of dimension l = dim L. Moreover the group multiplication and the inversion are smooth with respect to the differentiable structure. 155

156 To prove this theorem, we build the differentiable structure on G by explicitly defining charts. To this aim, for all P G let us consider the map Φ P : L G, Φ P (X) = P e X. Proposition 7.2. The following properties holds true: (i) there exists U L neighborhood of such that Φ P U is invertible on its image, for all P G, (ii) for all P Φ P (U) there exists V U neighborhood of such that Φ P (V) Φ P (U). Thanks to the previous result, one can introduce the following basis of neighborhoods 1 on G: B = {Φ P (W) P G,W U, W}. (7.2) where U is determined as in (i) of Proposition 7.2. Part (ii) of Proposition 7.2 ensures that (7.2) satisfies the axioms of a basis for generates a unique topology on G. Indeed it is sufficient to apply it twice to show that if Φ P (W) Φ P (W ) then there exists Q Φ P (W) Φ P (W ) and V U with V such that Φ Q (V) Φ P (W) Φ P (W ). Once the topology generated by B is introduced the map Φ P U is automatically an homeomorphism, and this proves that G is a topological group, i.e. a group that is also a topological manifold such that the multiplication and the inversion are continuous with respect to the topological structure. Indeed it can be shown that, if Φ P (W) Φ P (W ), then the change of chart Φ 1 P Φ P : W W W W is smooth with respect to the smooth structure defined on the vector space L (cf. Exercice 7.1). Hence G has the structure of smooth manifold Proof of Proposition 7.2 To prove this theorem we use a reduction to a finite dimensional setting, by evaluating elements of G, that are diffeomorphisms of M, on a special set of l points, where l is the dimension of L. To identify this set of points, we first need a general lemma. Lemma 7.3. For every k N and F 1,...,F k : R m R n family of linearly independent functions, there exist x 1,...,x k R m such that the vectors (F i (x 1 ),F i (x 2 ),...,F i (x k )), i = 1,...,k are linearly independent as elements of (R n ) k = R n... R n. Proof. We prove the statement by induction on k. (i). Since F 1 is not the zero function then there exists x 1 R m such that F 1 (x 1 ). (ii). Assume that the statement is true for every set of k linearly independent functions and consider a family F 1,...,F k+1 of linearly independent functions. Let x 1,...,x k to be the set of 1 Recall that a collection B of subset of a set X is a basis for a (unique) topology on X if and only if (a) B B = X, (b) for all B 1,B 2 B with B 1 B 2 there exists nonempty B 3 B such that B 3 B 1 B

157 points obtained by applying the inductive step to the family F 1,...,F k. If the claim is not true for k+1, it means that for every x R m there exists a non zero vector (c 1 ( x),...,c k+1 ( x)) such that k+1 c i ( x)f i ( x) =, i=1 k+1 c i ( x)f i (x j ) =, j = 1,...,k, (7.3) i=1 By definition of x 1,...,x k we have that c k+1 ( x), otherwise we get a contradiction with the inductive assumption. Hence we can assume c k+1 ( x) = 1 and rewrite equation (7.3) as k c i ( x)f i (x j ) = F k+1 (x j ), j = 1,...,k, (7.4) i=1 k c i ( x)f i ( x) = F k+1 ( x), (7.5) i=1 Treating (7.4) as a linear equation in the variables c 1,...,c k, its matrix of coefficients has rank k by assumption, hence its solution (that exists by assumption) is unique and independent on x. Let us denote it by (c 1,...,c k ). Then (7.5) gives k c i F i ( x) = F k+1 ( x) i=1 for every arbitrary x R m, which is in contradiction with the fact that F 1,...,F k+1 is a linearly independent family. As an immediate consequence of the previous lemma one obtains the following property. Proposition 7.4. Let X 1,...,X l be a basis of L. Then there exists q 1,...,q l M such that the vectors (X i (q 1 ),...,X i (q l )), i = 1,...,l, are linearly independent as elements of T q1 M... T ql M. In the rest of this section, the points q 1,...,q l are determined as in Proposition 7.4. The following proposition defines the neighborhood U that appears in the statement of Proposition 7.2. Proposition 7.5. There exists a neighborhood of the origin U L such that the map is an immersion at the origin. 2 φ : U M l, φ(x) = (e X (q 1 ),...,e X (q l )) M l, Proof. It is enough to show that the rank of φ is equal to l. Computing the partial derivatives at L of φ in the directions X 1,...,X l we have φ () = d X i dt (e tx i (q 1 ),...,e tx i (q l )) = (X i (q 1 ),...,X i (q l )), i = 1,...,l, t= and these are linearly independent as elements of T q1 M... T ql M by Lemma here M l = M... M. }{{} l times 157

158 We are now going to study L seen as a Lie algebra of vector fields on M k. Given k N, we can give Vec(M k ) = Vec(M) k the structure of a Lie algebra as follows: [(X 1,...,X k ),(Y 1,...,Y k )] = ([X 1,Y 1 ],...,[X k,y k ]). Lemma 7.6. For every k N the map i : L Vec(M) k defined by i(x) = (X,...,X) defines an involutive distribution on M k. Proof. It follows from the identity [i(x),i(y )] = i([x,y]), since Lemma 7.7. If P G then P L = L. [(X,...,X),(Y,...,Y)] = ([X,Y],...,[X,Y]). Proof. Let us first prove that P L L for every P G. Since elements in G are written as P = e X 1... e X k, X j L it is enough to show that for every X,Y L we have that e X Y L. By (6.24) we have the identity e X Y = e adx Y, The Volterra exponential series of adx converges, since L is a finite dimensional space. The N-th term of the sum N ( 1) k Y + (adx) k Y, k! k=1 belongs to L for each N N, since L is a Lie algebra. Hence one can pass to the limit for N and e adx Y L. This proves that P L L. Actually P L = L since P L is a Lie algebra and dimp L = diml, since P is a diffeomorphism. For every P G we introduce φ P : U M l, φ P = P φ or, more explicitly φ P (X) = (P e X (q 1 ),...,P e X (q l )), X U. Thanks to Proposition 7.5 it follows that φ P is an immersion at zero for all P G, since it is a composition of an immersion with a diffeomorphism. Proposition 7.8. For all P G we have that φ P (U) belongs to the integral manifold in M l of the foliation defined by L (seen as distribution in Vec(M) l ) passing through the point (P(q 1 ),...,P(q l )) M l. Moreover for every P G, φ P (U) belongs to the same leaf of the foliation. Proof. The Lie algebra L, seen as a distribution in Vec(M) l, is involutive. Thus it generates a foliation by Frobenius theorem. The leaf of the foliation passing through (q 1,...,q l ) (that has dimension l) has the expression N = {(ˆP(q 1 ),..., ˆP(q l )) ˆP = e X 1... X k,k N,X 1,...,X k L}, 158

159 while for each P G, φ P (U) = {(P e X (q 1 ),...,P e X (q l )) P G,X U L}, hence for each P G we have that φ P (U) N. The image φ P (U) is an immersed submanifold of dimension l that is tangent to L thanks to Lemma 7.7, and passes through the point φ P () = (P(q 1 ),...,P(q l )) M l. Remark 7.9. The previous result implies that for every (q 1,...,q l ) φ P(U) φ P (U) there exists uniques X,X U such that (P e X (q 1 ),...,P e X (q l )) = (P e X (q 1 ),...,P e X (q l )) = (q 1,...,q l ). (7.6) In other words we are saying that the two diffeomorphisms P e X and P e X coincides when evaluated on the set of points {q 1,...,q l }. Exercise 7.1. Prove that the maps that associates X X defined in (7.6) is smooth. The argument that is developed in the next section shows that actually, not only one has the identity (7.6), but also P e X = P e X as diffeomorphisms Passage to infinite dimension In what follows, to study elements of G as diffeomorphisms and not only as acting on a finite set of points, we use the following idea: we study diffeomorphisms on a set of l+1 points, where the last one is free. Fix q M. Let us introduce φ : U M l+1, φ(x) = (e X (q),e X (q 1 ),...,e X (q l )) M l+1. Moreover, we define for every P G φ P : U M l+1, φ P (X) = (P e X (q),p e X (q 1 ),...,P e X (q l )) M l+1. The following Proposition can be proved following the same arguments as the one of Proposition 7.8. Proposition Fix q M. For all P G we have that φ P (U) is an integral manifold of dimension l in M l+1 of a foliation defined by L (seen as distribution in Vec(M) l+1 ) and passing through the point (P(q),P(q 1 ),...,P(q l )) M l+1. Moreover, for every P G, φ P (U) belong to the same leaf of the foliation. Notice that if π : M l+1 M l denotes the projection π(q,q 1,...,q l ) = (q 1,...,q l ) that forgets about the first element we have φ = π φ and φ P = π φ P. Notice that by construction π : φ P (U) φ P (U) (7.7) is a diffeomorphism for every choice of P (in particular it is one-to-one). We can now prove the main result. 159

160 Proof of Proposition 7.2. (i). It is enough to show that Φ P is injective on its image. In other words we have to show that, if P e X = P e Y for some X,Y U, then X = Y. The assumption implies that φ P (X) = (P e X (q 1 ),...,P e X (q l )) = (P e Y (q 1 ),...,P e Y (q l )) = φ P (Y) hence by invertibility of φ P on U we have that X = Y. (ii). Recall that, by construction, one has the following relation between Φ P and its finitedimensional representation φ P φ P (W) = {(Q(q 1 ),...,Q(q l )) : Q Φ P (W)}, W U. For every V U, with V, one has that φ P (V) and φ P (U) are integral submanifold of M l belonging to the same leaf of the foliation, thanks to Proposition 7.8. Since by assumption P Φ P (U), it follows that the intersection φ P (V) φ P (U) is open and non empty in M l and contains the point (P (q 1 ),...,P (q l )). We can then choose V small enough such that φ P (V) φ P (U). This inclusion of the finite-dimensional images implies the following: for every X V there exists a unique element X U such that P e X = P e X when evaluated on the special set of points, namely (P e X (q 1 ),...,P e X (q l )) = (P e X (q 1 ),...,P e X (q l )). (7.8) To complete the proof it is enough to show that P e X = P e X at every point. To this aim fix an arbitrary q M and let us consider the extended finite-dimensional maps φ P and φ P. Let us firs prove that, for V defined as before, one has φ P (V) φ P (U) (indepedently on q). Assume that φ P (U)\φ P (V), then we have π(φ P (V)) = π(φ P (V) φ P (U)) π(φ P (U)\φ P (V)) (7.9) = φ P (V) π(φ P (U)\φ P (V)) (7.1) This gives a contradiction since on one hand the left-hand is connected thanks to (7.7) (for P = P ), while on the other hand it is written as a union of nonempty disjoint sets. This implies in particular: for every X V W there exists a unique element X U (a priori dependent on q) such that P e X = P e X when evaluated at {q,q 1,...,q l }, namely (P e X (q),p e X (q 1 ),...,P e X (q l )) = (P e X (q),p e X (q 1 ),...,P e X (q l )). (7.11) Combining (7.8) with (7.11) one obtains φ P ( X) = (P e X(q 1 ),...,P e X(q l )) = (P e X (q 1 ),...,P e X (q l )) = φ P (X). By invertibility of φ P on U, it follows that X = X, independently on q. Thus by (7.11) and the arbitrarity of q we have P e X (q) = P e X (q) for every q, for every fixed X V, as claimed. 16

161 7.2 Lie groups and Lie algebras Definition A Lie group is a group G that has a structure of smooth manifold such that the group multiplication G G G, (g,h) gh and inversion G G, g g 1 are smooth with respect to the differentiable structure of G. We denote by L g : G G and R g : G G the left and right multiplication respectively L g (h) = gh, R g (h) = hg. Notice that L g and R g are diffeomorphisms of G for every g G. Moreover L g R g = R g L g. Definition A vector field X on a Lie group G is said to be left-invariant (resp. rightinvariant) if it satisfies (L g ) X = X (resp. (R g ) X = X) for every g G. Remark Every left-invariant vector field X on a Lie group G its uniquely identified with its value at the origin ½ of the Lie group. Indeed if X is left-invariant, it satisfies the relation X(g) = L g X(½). (7.12) Onthe other handavector fielddefinedby formula X(g) = L g v for somev T ½ G, is left-invariant. Notice that left-invariant vector fields are always complete. Definition The Lie algebra associated with a Lie group G is the Lie algebra g of its leftinvariant vector fields. By Remark 7.14 the Lie algebra g associated with a Lie group G is a finite dimensional Lie algebra, that is isomorphic to T ½ G as vector space. Hence g endows T ½ G with the structure of Lie algebra. In particular dimg = dimg. Given a basis e 1,...,e n of T ½ G we will often consider the induced basis of g given by X i (g) = (L g ) e i, i = 1,...,n. When it is convenient we can identify g with T ½ G and a left invariant vector field X with its value at the origin X(½). Definition Given a Lie group G and its Lie algebra g the group exponential map is the map exp : T ½ G G, exp(x) = e X (½). It is important to remember that in general the exponential map is not surjective. If G 1 and G 2 are Lie groups, then a Lie group homomorphism φ : G 1 G 2 is a smooth map such that f(gh) = f(g)f(h) for every g,h G 1. Two Lie groups are said to be isomorphic if there exist a diffeomorphism φ : G 1 G 2 that is also a Lie group homomorphism. Two Lie groups G 1 and G 2 are said locally isomorphic if there exists neighborhoods U G 1 and V G 2 of the identity element and a diffeomorphism f : U V such that f(gh) = f(g)f(h) for every g,h U such that gh U. 161

162 Exercise 7.17 (Third theorem of Lie). Let G i be a Lie group with Lie algebra L i, for i = 1,2. Prove that an isomorphism between Lie algebras i : L 1 L 2 induces a local isomorphism of groups. (Hint: Prove that the set (X,i(X)) is a subalgebra L of the Lie algebra of the product group product G 1 G 2. Build the group G G 1 G 2 associated with this and then show that the two projections p i : G 1 G 2 G i define p 2 (p 1 G ) 1 : G 1 G 2 a local isomorphism of groups.) Example: matrix Lie groups A very important example of Lie group is the group of all invertible n n real matrices, with respect to the matrix multiplication GL(n) = {M R n n det(m) }. Similarly one define GL(n,C) = {M C n n det(m) }. Exercise Prove that GL(n,C) is connected while GL(n) is not. Prove that the Lie algebra of GL(n) (resp. GL(n,C)) is gl(n) = {M R n n } (resp. gl(n,c) = {M C n n }). Definition A group of matrices is a sub group of GL(n) or of GL(n,C). Remark 7.2. The Lie algebra of a sub-group of GL(n) (resp. GL(n,C)) is a sub-algebra of gl(n) (resp. gl(n,c)). Group of matrices that we are going to meet along the book are The special linear group SL(n) = {M R n n det(m) = 1}, whose Lie algebra is sl(n) = {M R n n trace(m) = }. The orthogonal group and the special orthogonal group O(n) = {M R n n MM T = ½}, SO(n) = {M R n n MM T = ½,det(M) = 1}, (7.13) for both the Lie algebra is so(n) = {M R n n M = M T }. SO(n) is the connected component of O(n) to the identity. The special unitary group SU(n) = {M C n n MM = ½}, where M is the transpose of the complex conjugate of M. The Lie algebra of SU(n) is su(n) = {M C n n M = M }. 162

163 The group of (positively oriented) Euclidean transformations of the plane SE(n) = a 1 R. a n 1 R SO(n), a 1,...,a n R. The name of this group comes from the fact that if we represent a point of R n as a vector (x 1,...,x n,1) then the action of a matrix of SE(n) produces a rotation and a translation. The Lie algebra of SE(n) is se(n) = b 1 M. b n M so(n), b 1,...,b n R. Exercise Prove that o(3) and su(2) are isomorphic as Lie algebras. Lemma On group of matrices a left invariant vector field X = L g A = ga, A T ½ G. Proof. By using the expression in coordinates L g : h k g ikh kj we have that (L g A) ij = l,m,k (g ik h kj ) A lm = ik δ kl δ jm A lm = hlm l,m,kg k g ik A kj Similarly one obtains that for R g A = Ag for every A T ½ G. Remark Notice that the for a left invariant vector field on a group of matrix X(g) = ga, the integral curveofx satisfyingg() = g isg(t) = g e ta wheree ta isthestandardmatrix exponential. Hence the integral curve of a left invariant vector field, at a given t, is a right translation. This is indeed a general fact as explained in the next section. Exercise (i). Let X(g) = ga and Y(g) = gb be two left invariant vector on a group of matrices. Prove that [X,Y](g) = g(ab BA) = g[a,b]. (Hint: use the expression in coordinates X ij = k g ika kj and Y ij = k g ikb kj, [X,Y] ij = ( Yij kl g kl X kl X ij g kl Y kl ).) (ii). Prove that for right invariant vector fields X(g) = Ag and Y(g) = Bg we have [X,Y](g) = [A,B]g. Notation. For a left-invariant vector fields on a group of matrices it is often convenient to use the abuse of notation X(g) = gx. This formula clarify well the identification of g with T ½ G. Here X( ) g and X T ½ G. 163

164 7.2.2 Lie groups as group of diffeomorphisms In Section 7.1 we have proved that given a manifold M and a finite dimensional Lie algebra L of vector fields, the subgroup of Diff(M) generated by these vector fields has a structure of finite dimensional differentiable manifold for which the groups operations are smooth. We call such a subgroup G M,L. By Definition 7.12 we have Proposition G M,L is a Lie group. We now want to prove a converse statement for connected group, i.e., every connetected Lie group is isomorphic to a subgroup of the group of the diffeomorphisms of a manifold generated by a finite dimensional Lie algebra of vector fields. Indeed this is true with M = G and L being the Lie algebra of left invariant vector fields on G. More precisely we have the following. Theorem Let G a connected Lie group and L the Lie algebra of left invariant vector fields. Then G is isomorphic to G G,L. To prove Theorem 7.26, we give first the following definition. Definition Let G be a Lie group and let us define the group of its right translations as G R = {R g g G}. On G R we give consider the group structure given by the operation (notice the inverse order) R g1 R g2 := R g2 R g1. Then we need the following simple facts. Lemma G is isomorphic G R. Proof. Clearly the map φ : g R g is a diffeomorphism. That is a group homomorphism follows from the fact that R g1 g 2 h = h(g 1 g 2 ) = (R g2 R g1 )h. Hence φ(g 1 g 2 ) = R g1 g 2 = R g2 R g1 = R g1 R g2. Similarly one obtains that a Lie group G is isomorphic to the group G L = {L g g G} of left translations on G endowed with the group low given by the standard composition. Lemma The flow of a left invariant vector fields on a Lie group G commutes with left translations. Proof. If φ is a diffeomorphism and X a vector field we have that (see Lemma 2.2) Composing on the right with φ, we have e tφ X = φ e tx φ 1. e tφ X φ = φ e tx. Now taking φ = L g for some g, X a left invariant vector field and using that L g X = X, we have that e tlg X L g = L g e tx = L g e tlg X. The conclusion follows from the arbitrarity of g. 164

165 A similar statement holds for right invariant vector fields. Lemma 7.3. Let G be a Lie group. A diffeomorphism on G is a right translation if and only if it commutes with all left translations. Proof. Let P be the diffeomorphism. If P is a right translation then it commutes with left translation since for every g,h 1,h 2 G, we have L h1 R h2 g = h 1 gh 2 = R h2 L h1 g. To prove the opposite, let us define g = P(½). For every h H, we have hence P = R g. P(h) = P(L h ½) = L h P(½) = L h g = hg Remark By Lemma 7.29 and Lemma 7.3 we have that the flow of a left-invariant vector field is a right translation. Proof of Theorem By Lemma 7.28, it remains to prove that G G,L is isomorphic to G R. Indeed we are going to prove that G G,L = G R. To prove that G G,L G R observe that every element of G G,L is a composition of the flow of left invariant vector fields and hence it is a right translation. To prove that G G,L = G R, observe that by the argument above G G,L is a subgroup of G R. Moreover since dim(g G,L ) = dim(g R ). It follows that G G,L contains an open neighborhood of the identity. The conclusion of the Theorem is then a consequence of the following Lemma. Lemma Let G be a connected Lie group. If H is a subgroup of G containing an open neighborhood of the identity then H = G. Proof. Since by hypothesis H is nonempty and open it remains to prove that H is closed. To this purpose observe that if g G\H, then gh is disjoint from H (otherwise there exists u H such that gu H which implies that guu 1 = g H). Hence G\H = gh. Since each set gh is open, it follows that G\H is open and hence that H is closed The matrix notation g/ H Given a vector field X on a manifold, one can consider its integral curve on M, i.e., the solutions to q = X(q), the equation for the flow of X, i.e., P t = P t X. Let us write these equations for a left invariant vector field X on a Lie group G, ġ = X(g), P t = P t X. These two equations are indeed the same equation because: 165

166 the flow of a left invariant vector field is a right translation (see Remark 7.31); an element g of a Lie group G can be interpreted both as a point on G seen as a manifold or as a diffeomorphism over G, once that G is identified with the group of right translations G R. This fact is particularly evident when written for left invariant vector fields on group of matrices. In this case the two equations take exactly the same form ġ = gx P t = P t X In the following we take advantage of this fact to simplify the notation. We sometimes eliminate the use of the symbols L g and L g : we write a left invariant vector field in the form X(g) = gx, thinking to gx as to the matrix product when we are working with Lie groups of matrices (and in this case we think to X T ½ G), or as the composition of the left translation g with the left invariant vector field X otherwise (and in this case we think to X g) Bi-invariant pseudo-metrics Recall that a pseudo-riemannian metric is a family of non-degenerate, symmetric metric bilinear form on each tangent space smoothly depending on the point. Since a Lie group G is a smooth manifold as well as a group, it is natural to introduce the class of pseudo-riemannian metric that respects the group structure of G. Definition Let be a pseudo-riemannian metric on G. It is said to be left-invariant if Similarly, is a right-invariant metric if v w = L g v L g w, v,w T ½ G,g G. v w = R g v R g w, v,w T ½ G,g G. A bi-invariant metric is a pseudo-riemannian metric that is at the same time left and rightinvariant. Exercise Prove that for a bi-invariant pseudo-metric we have the following [X,Y] Z = X [Y,Z], X,Y,Z g. (7.14) Definition A Lie algebra g is said to be compact if it admits a positive definite bi-invariant pseudo-metric (hence a bi-invariant Riemannian metric). One can prove that the Lie algebra of a compact Lie group is compact in the sense above. See for instance [BR86]. Next we define the natural adjoint action of G onto g. Definition For every g G, the conjugation C g : G G, is the map C g = R g 1 L g, C g (h) = ghg 1. The adjoint action Ad g : g g is defined as Ad g = C g, namely Ad g (X) = R g 1 L g X = R g 1 X, X g. 166

167 In matrix notation Ad g (X) = gxg 1, X T ½ G. Recall that, given x g, its adjoint representation adx : g g is given by adx(y) = [x,y]. Definition The Killing form on a Lie algebra g is the symmetric bilinear form K : g g R, K(x,y) = trace(adx ady) (7.15) Exercise Prove that the Killing form has the associativity property K([x,y],z) = K(x,[y,z]). (7.16) Exercise Prove that the Killing form of a nilpotent Lie algebra is identically zero. Definition 7.4. A Lie algebra is said to be semisimple if the Killing form is non-degenerate. Exercise Prove that for semisimple Lie algebras, the Killing form is a bi-invariant pseudometric. Prove that for compact semisimple Lie algebras the Killing form is negative definite. From the algebraic viewpoint a semisimple Lie algebra can be equivalently defined as a a Lie algebra g satisfying [g,g] = g. See for instance [BR86] The Levi-Malcev decomposition A very important result in the theory of Lie algebras (see for instance [Che55, Ch. 4, Sect. 4, Thm. 4]) states that every Lie algebra can be decomposed as where g = r s, r is the so called radical, i.e., the maximal solvable ideal of g. A solvable Lie algebra is defined in the following way. An ideal of a Lie algebra l is a subspace i such that [l,i] i. Given a Lie algebra l define the sequence of ideals l = l, l (1) = [l (),l () ],..., l (n+1) = [l (n),l (n) ]. The Lie algebra l is said to be solvable if there exists n such that l (n) =. s is a semisimple sub-algebra. The symbol indicates the semidirect sum of two Lie algebras defined in the following way. Let T and M two Lie algebras and D the homomorphism of M into the set of linear operators in the vector space T such that every operator D(X) is a derivation of T. The Lie algebra T M is the vector space T M with a Lie algebra structure given by using the given Lie brackets of T and M in each subspace and for the Lie brackets between the two subspaces we set [X,Y] = D(X)Y, X M,Y T. Exercise Prove that T M is a well defined Lie algebra. 167

168 7.2.6 Product of Lie groups Given two Lie groups G 1 and G 2 their direct product is the Lie groups that one obtains taking as manifold G 1 G 2 with the multiplication rule (g 1,g 2 ),(h 1,h 2 ) G 1 G 2 (g 1 h 1,g 2 h 2 ) G 1 G 2. One immediately verify that if g 1 and g 2 are the Lie algebras of G 1 and G 2, the Lie algebra of G 1 G 2 is g 1 g 2. In g 1 g 2 we have that [g 1,g 2 ] =. 7.3 Trivialization of TG and T G Lemma The tangent bundle TG of a Lie group G is trivializable Proof. Recall that the tangent bundle T M of a smooth manifold M is trivializable if and only if there exists a basis of globally defined independent vector fields. In the case of the tangent bundle TG of a Lie group G we can build a global family of independent vector field by fixing a basis e 1,...,e n of T ½ G and consider the induced left-invariant vector fields given by X i (g) = (L g ) e i, i = 1,...,n, that are linearly independent by construction. We have then an isomorphism between TG and G T ½ G. This isomorphism is is given by L g 1, that is acting in the following way TG (g,v) (g,ν) G T ½ G, where ν = L g 1 v. Notice that given two left invariant vector fields X(g) = L g ν and Y(g) = L g µ where ν,µ T ½ G, we have [X,Y](g) = L g [ν,µ] The isomorphism between TG and G T ½ G extend to the dual. Hence T G is isomorphic to G T ½ G, the isomorphism being given by L g, i.e. T G (g,p) (g,ξ) G T ½G, where ξ = L g p. Notice that without an additional notion of scalar product, the Lie algebra structure on T ½ G induced by g does not induce a Lie algebra structure on T ½ G. In the following it is often convenient to make computations in G T ½ G and G T ½ G instead than TG and T G. It is then useful to recall that if v = L g ν T g G and p = L g 1 ξ T g G, then p,v g = ξ,ν ½. 168

169 7.4 Left-invariant sub-riemannian structures A left-invariant sub-riemannian structure is a constant rank sub-riemannian structure(g, D, ) (cf. Section 3.1.3, Example 2) where G is a Lie group of dimensione n; the distribution is left-invariant, i.e., D(g) = L g d, where d is a subspace of T½ G. Moreover we assume that the distribution is Lie bracket generating or equivalently that the smallest Lie sub-algebra of g containing D is g itself; is a scalar product on D(g) that is left-invariant, i.e., if v = L g ν and w = L g µ with ν,µ d we have v w g = ν µ ½. Remark Left-invariant sub-riemannian structure are by construction free and constant rank. If D has dimension m n then the local minimum bundle rank is constantly equal to m (cf. Definition 3.2). Given a left-invariant sub-riemannian structure we can always find m linearly independent vectors e 1,...,e m in T ½ G such that (i) D(g) = { m i=1 u il g e i u 1,...u m R} (ii) e i e j ½ = δ ij. The problem of finding the shortest curve connecting two points g 1,g 2 G can then be formulated as the optimal control problem γ(t) = m i=1 u i(t)l g e i T m i=1 u i(t) 2 dt min γ() = g 1, γ(t) = g 1, (7.17) Exercise (i). Prove that if g G and γ : [,T] G is an horizontal curve, then the left-translated curve γ g := L g γ is also horizontal and l(γ g ) = l(γ). (ii). Prove thatd(l g h 1,L g h 2 ) = d(h 1,h 2 ) forevery g,h 1,h 2 G. Deducethatforevery g,h G and r > one has L g (B(h,r)) = B(gh,r) Existence of minimizers We start by proving a general result saying that a uniformly locally compact metric space is complete. Proposition Let (M,d) be a metric space such that there exists ε > such that B(x,ε) is compact for every x M. Then M is complete. Proof. Let us prove that every Cauchy sequence {x n } in M is convergent. Fix ε > satisfying the assumption. Since {x n } is Cauchy there exists N N such that one has d(x n,x m ) < ε for all n,m N. 169

170 In particular, by choosing m = N, for all n N one has that x n B(x N,ε), that is compact by assumption. Hence {x n } n N is Cauchy and admits a convergent subsequence, that implies that the whole sequence {x n } in M is convergent. This result immediately implies the following. Corollary Any left-invariant sub-riemannian structure on a Lie group G is complete. Proof. By Proposition 3.35 small balls are compact. Hence there exists ε > such that the ball B(½, ε) is compact, where ½ is the identity of G. By left-invariance (cf. Exercice 7.45) B(g,ε) = L g (B(½,ε)) is compact for every g G, independently on ε. By Proposition 7.46, the sub-riemannian structure is complete step Carnot groups The Heisenberg sub-riemannian structure that we studied in Section as an isoperimetric problem is indeed a left-invariant sub-riemannian structure on the group G = R 3 endowed with the product (x,y,z) (x,y,z ) =. ( x+x, y +y, z +z + 1 ) 2 (xy x y). Such a group is called the Heisenberg group. Exercise Prove that the Lie algebra of the Heisenberg group can be written as g = g 1 g 2 where g 1 = span{ x y 2 z, y + x 2 z}, and g 2 = span{ z }. Notice that we have the commutation relations [g 1,g 1 ] = g 2 and [g 1,g 2 ] =. In this section we focus on Carnot groups of step 2, which are natural generalization of the Heisenberg group, namely Lie groups G on R n such that its Lie algebra g satisfies g = g 1 g 2, [g 1,g 1 ] = g 2, [g 1,g 2 ] = [g 2,g 2 ] =. (7.18) G is endowed by the left-invariant sub-riemannian structure induced by the choice of a scalar product on the distribution g 1, that is bracket-generating of step 2 thanks to (7.18). Notice that g is a nilpotent Lie algebra and that we have the inequality n k(k +1), k = dimg 1, n = dimg. 2 We say that g is a Carnot algebra of step 2. Let us now choose a basis of left-invariant vector fields (on R n ) of g such that g 1 = span{x 1,...,X k }, g 2 = span{y 1,...,Y n k }, where {X 1,...,X k } define an orthonormal frame for on the distribution g 1. Such a basis will be referred also as an adapted basis. We can write the commutation relations: [X i,x j ] = n k h=1 ch ij Y h, i,j = 1,...,k, where c h ij = ch ji, (7.19) [X i,y j ] = [Y j,y h ] =, i = 1,...,k, j,h = 1,...,n k. 17

171 Define the the n k skew-symmetric matrices C h = (c h ij ), for h = 1,...,n k. We stress that since the vector fields are left-invariant, then the structure functions c h ij are constant. Given an adapted basis, we can associate with the family of matrices {C 1,...,C n k } the subspace C = span{c 1,...,C n k } so(g 1 ) (7.2) of skew-symmetric operators on g 1 that are represented by linear combination of this family of matrices. Proposition 7.49 (2-step Carnot algebras and subspaces of so(g 1 )). For a given a 2-step Carnot algebra g, the subspace C so(g 1 ) is independent on the choice of the adapted basis on g Proof. Assume that we fix another adapted basis g 1 = span{x 1,...,X k }, g 2 = span{y 1,...,Y n k }. where{x 1,...,X k } is orthonormal for the inner prodict. Then there exists A = (a ij) an orthogonal matrix and B = (b hl ) an invertible matrix such that k X i = a ij X j, j=1 n k Y h = b hl Y l. l=1 A direct computation shows that, denoting B 1 = (b hl ), we have it follows that k [X i,x j ] = a ih a jl [X h,x l ] = h,l=1 n k n k = s=1 k n k a ih a jl c r hl Y r (7.21) h,l=1 r=1 h,l=1 r=1 k a ih a jl c r hl brs Y s (7.22) n k C s = b hs (AC h A ) (7.23) h=1 Recall that two matrices C and C represents the same element of so(g 1 ) with respect to the two basis if and only if C = ACA. Then formula (7.23) implies that elements of C are written as linear combination of elements of C that represents the same linear operator, as claimed. Remark 7.5. We have the following basis-independent interpretation of Proposition The Lie bracket defines a well-defined skew-symmetric bilinear map [, ] : g 1 g 1 g 2. If we compose this map with an element ξ g 2 we get a skew-symmetric bilinear form ξ [, ] : g 1 g 1 R, that can be identified with an element of so(g 1 ), thanks to the inner product on g 1. Hence with every 2-step Carnot algebra we can associate a well-defined linear map Ψ : g 2 so(g 1 ) The subspace C introduced in (7.2) coincides with imψ so(g 1 ). 171

172 Definition Two Carnot algebras g and g are isomorphic if there exists a Lie algebra isomorphism φ : g g such that φ g1 : g 1 g 1 preserves the scalar products, i.e., φ(v) φ(w) = v w, v,w g. Following the same arguments one can prove the following result Corollary The set of equivalence classes of 2-step Carnot algebras (with respect to isomorphisms) on g = g 1 g 2 is in one-to-one correspondence with the set of subspaces of so(g 1 ) Pontryagin extremals for 2-step Carnot groups Let us fix a 2-step Carnot group G and let g its associated Lie algebra. A basis of a Lie algebra of vector fields on R n = R k R n k (using coordinates g = (x,y) R k R n k ) and satisfying (7.19) is given by X i = x i 1 2 c h ij x j, i = 1,...,k, (7.24) y h j,h Y h = y h, h = 1,...,n k. (7.25) The group G is R n = R k R n k endowed with the group law (x,y) (x,y ) = (x+x,y +y 12 ) Cx x where we denoted for the (n k)-tuple C = (C 1,...,C n k ) of k k matrices, the product Cx x = (C 1 x x,...,c n k x x ) R n k. and x x denotes the Euclidean inner product in R k. Let us introduce the following coordinates on T G h i (λ) = λ,x i (g), ω h (λ) = λ,y h (g) Since the vector fields {X 1,...,X k,y 1,...,Y n k } are linearly independent, the functions (h i,ω h ) defines a system of coordinates on fibers of T G. In what follows it is convenient to use (x,y,h,ω) as coordinates on T G. Geodesics are projections of integral curves of the sub-riemannian Hamiltonian in T G k H = 1 2 i=1 h 2 i Suppose now that λ(t) = (x(t),y(t),h(t),ω(t)) is a normal Pontryagin extremal. Then u i (t) = h i (λ(t)) and the equation on the base is ġ = k h i X i (g). (7.26) i=1 172

173 that rewrites as { ẋ i = h i ẏ h = 1 k 2 i,j=1 cl ij h (7.27) ix j For the equations on the fiber we have (remember that along solutions ȧ = {H,a}) {ḣi = {H,h i } = k j=1 {h i,h j }h j = n k l=1 k j=1 cl ij h jω l ω h = {H,ω h } =. (7.28) H is constant along solutions and if we require that extremals are parametrized by arclength. From (7.28) we easily get that ω h is constant and the vector h = (h 1,...,h k ) R k satisfies the linear equation n k ḣ = C ω h, C ω = ω l C l where we recall that the vector ω = (ω 1,...,ω n k ) is constant. It follows that h(t) = e tcω h() l=1 and x(t) = x()+ t e scω h()ds Proposition The projection x(t) on the layer g 1 R k of a Pontryagin extremal such that x() = is the image of the origin through a one-parametric group of isometries of R k. Proof. The action of a 1-parametric group of isometries can be recovered by exponentiating an element of its Lie algebra. This reduces to compute the solution of the differential equation ẋ = Ax+b where A is skew-symmetric and b R k. Its flow is given by φ t ( x) = e ta x+ t e sa bds and it is easy to see that the projection x(t) on the layer g 1 R k of a Pontryagin extremal satisfies this equation with x =, A = C ω and b = h(). Notice that the vertical coordinates y can be always recovered, once h(t) and x(t) are computed, by a simple integration. Heisenberg group The simplest example of 2-step Carnot group is the Heisenberg group, whose Lie algebra g has dimension 3. It can be realized in R 3 by the left invariant vector fields X 1 = 1 x 1 2 x 2 y, X 2 = + 1 x 2 2 x 1 y, Y = y, 173

174 satisfying the relation [X 1,X 2 ] = Y. In this case the set of matrices representing the Lie bracket is reduced to a single matrix C, namely ( ) 1 C = 1 and the projection x(t) on the layer g 1 R k of a Pontryagin extremal starting from the origin satisfies the equation t ( ) ωs x(t) = exp h()ds ωs Computing t exp ( ) ωs ds = 1 ωs ω ( sin(ωt) ) cos(ωt) 1 cos(ωt) + 1 sin(ωt) and choosing h() = ( sinθ,cosθ) S 1, we get ( )( ) ( ) cos(ωt) sin(ωt) sinθ sin(ωt+θ) h(t) = = sin(ωt) cos(ωt) cosθ cos(ωt+θ) x(t) = 1 ( )( ) sin(ωt) cos(ωt) 1 sinθ = 1 ( ) cos(ωt+θ) cosθ ω cos(ωt) + 1 sin(ωt) cosθ ω sin(ωt+θ) sinθ This recovers the formulas already computed in Section Notice that the y component is recovered simply by integrating the last equation, that in this case gives ẏ = 1 2 ( h 1x 2 +h 2 x 1 ) y(t) = 1 2ω = 1 2ω t t sin(ωs+θ)(sin(ωs+θ) sinθ)+cos(ωs+θ)(cos(ωs+θ) cosθ)ds 1 sin(ωs+θ)sinθ cos(ωs+θ)cosθds = 1 2ω = 1 2ω 2(ωt sin(ωt)). t 1 cos(ωs)ds 7.6 Left-invariant Hamiltonian systems on Lie groups In this section we study Hamiltonian systems non necessarily coming from a sub-riemnnian problem Vertical coordinates in TG and T G Thanks to the isomorphism between TG and G T ½ G, a bases {e 1,...,e n } of T ½ G induces global coordinates on TG. Indeed a base of T g G is L g e 1,...,L g e n and every element (g,v) of TG can be written as n (g,v) = (g, v i L g e i ). i=1 174

175 Figure 7.1: The set of end points of length 1 Pontryagin extremals for the 3D Heisenberg group. Notice the singularities accumulating at the origin. The coordinates v 1,...v n are called the vertical coordinates in TG and they are also coordinates in thevertical partofg T ½ G. Indeedif(g,v) = (g, n i=1 vi L g e i ) TG, thenthecorrespondingpoint in G T ½ G is (g,ξ) = (g, n i=1 vi e i ) hence, in coordinates, both are representedby (g,v 1,...,v n ). If {e 1,...,e n } is the dual base in T ½ G to {e 1,...,e n }, i.e., e i,e j = δ i,j, then every element (g,p) of T G can be written as (g,p) = (g, n h i L g e i). 1 i=1 The coordinates h 1,...h n are called vertical coordinates in T G. For the same reason as above, in vertical coordinates (g,h 1,...,h n ) represents both a point in T G and the corresponding point in G T ½ G. In other words, when using vertical coordinates it is not important to distinguish if we are working in TG or G T ½ G (the same holds for T G or G T ½ G). 175

176 Remark Notice that if X i (g) = L g e i then h i (p,g) = p,x i (g), hence h i are the functions linear on fibers associated with X i. Moreover if make the change of variable (p,g) (ξ,g) where p(ξ,g) = L g 1 ξ where ξ T ½ G, we have that h i becomes independent from g. Indeed we can write h i (p(ξ,g),g) = ξ,e i ½. The vertical coordinates h 1,...,h n are functions on T G hence we can compute their Poisson bracket (cf. Section 4.1.2) {h i,h j } = p,[x i,x j ] g = ξ,[e i,e j ] ½. (7.29) Remark Note that the vertical coordinates h i are not induced by a system of coordinates x 1,...,x n on the base G (we have not fixed coordinates on G). If they were induced by coordinates on G, we would have obtained zero in the right-hand side of (7.29) since [ xi, xj ] = Left-invariant Hamiltonians Consider a Hamiltonian function H : T G R. Thanks to the isomorphism between T G and G T ½ G we can interpret it as a function on G T½ G, i.e., we can define H(g,ξ) = H(g,L g 1 ξ), H : G T ½ G R. We say that H is left-invariant if H(g, ξ) = H(ξ). For a left-invariant hamiltonian we call the corresponding H the trivialized Hamiltonian. Equivalently we can use the following definition Definition A Hamiltonian H : T G R is said to be left-invariant if there exists a function H : T½ G R such that H(g,p) = H(L g p). Hence a left invariant-hamiltonian can be interpreted as a function on T ½ G. Example Given a set of left-invariant vector field f i (g) = L g w i, w i T ½ G, i = 1,...,m, we have that H(g,p) = 1 2 m i=1 p,f i(g) 2 is a left-invariant Hamiltonian. Indeed which is independent on g. H(g,ξ) = 1 2 m L g ξ,l 1 g w i = 1 2 i=1 Remark If we write p = n j=1 h jl g 1 e j then m ξ,w i, i=1 H(g, L g h 1 j e j) = H(Lg hj L g e j) = H( h 1 j e j). In other words in vertical coordinates h 1,...h n, we have for a left-invariant Hamiltonian H(g,h 1,...,h n ) = H(h 1,...,h n ). and we can identify H and H. 176

177 Remark In the context of Lie groups, to write Hamiltonian equations is convenient avoiding fixing coordinates on G and use vertical coordinates on the fiber only. This permits to exploit better the trivialization of T G in G T½ G and the left invariance of H. Since vertical coordinates h i do not come, in general, from coordinates on G, we do not have equations of the form ẋ i = hi H, ḣ i = xi H for a system of coordinates x 1,...,x n on G. Consider a left-invariant Hamiltonian H(h 1,...,h n ). Let us write vertical part of the Hamiltonian equations. We are going to see that this equation is particularly simple. We have Using Exercice 4.8 we have for i = 1,...,n, ḣ i = n j=1 H h j {h j,h i } = ḣ i = {H,h i }, i = 1,...,n. (7.3) n j=1 H h j ξ,[e j,e i ] = n ξ, H e j,e i. (7.31) h j Notice that since H is a function on the linear space T½ G, then dh(h 1,...,h n ) is an element of T½ G = T ½G. If we write an element of T½ G as h 1e h ne n, then an element of its tangent at (h 1,...,h n )iswrittenasv 1 h1,...,v n hn withtheidentification hi = e i duetothelinearstructure. An element of its cotangent space T½ G at (h 1,...,h n ) is then written as ω 1 dh ω n dh n with the identification dh i = (e i ) = e i again due to the linear structure. Then dh(h 1,...,h n ) = n j=1 H h j dh j = n j=1 H h j e j = Hence the vertical part of the Hamiltonian equations can be written as j=1 n j=1 H h j e j. (7.32) ḣ i = ξ,[dh,e i ] or more compactly recalling that ξ = k i=1 h ie i, = ξ,(addh)e i = (addh) ξ,e i (7.33) ξ = (addh) ξ. (7.34) For what concerns the horizontal part, let β C (G), i.e., a function in C (T G) that is constant on fibers. For every curve g( ) solution of the horizontal part of the Hamiltonian system on T G corresponding to H we have d dt β(g(t)) = {H,β} (g(t),p(t)) = n j=1 H h j {h j,β} (g(t),p(t)). Now recalling that (cf. (4.17)) { p,x(g) +α(g), p,y(g) +β(g)} = p,[x,y](g) +Xβ(g) Yα(g) we have {h j,β} = { p,x j,β} = X j β = (L g e j )β. Hence d n dt β(g(t)) = H n (L g e j )β H h j = L g e j β h j = L g dh g(t). g(t) g(t) j=1 177 j=1

178 Since the function β is arbitrary we have We have then proved the following ġ = L g dh. Proposition 7.6. Let H be a left invariant Hamiltonian on a Lie group G, i.e. H(g,p) = H(L g p) where (g,p) T G and H is a smooth function from T½ G to R. Let dh be the differential of H seen as an element of T ½ G. Then the Hamiltonian equations d dt (g,p) = H(g,p) are, { ġ = Lg dh ξ = (addh) (7.35) ξ. Here ξ T ½ G and p(t) = L g 1 ξ(t). Notice that the second equation is decoupled from the first (it does not involve g). When we have available a bi-invariant metric equation (7.34) can be written in a simpler form. Indeed in this case we can identify T ½ G with T ½ G via ξ T ½ G M T ½G M v = ξ,v, v T ½ G. Using (7.34) and (7.14), for every v T ½ G let us compute dm dξ dt v = dt,v = (addh) ξ,v = ξ,(addh)v = ξ,[dh,v] = M [dh,v] = [M,dH] v. Hence the Hamiltonian equations for a left-invariant Hamiltonian, when we have a bi-invariant peseudometric are: { ġ = Lg dh dm (7.36) dt = [M,dH]. 7.7 First integrals for Hamiltonian systems on Lie groups* Integrability of left invariant sub-riemannian structures on 3D Lie groups* 7.8 Normal Extremals for left-invariant sub-riemannian structures Consider a left-invariant sub-riemannian structure of rank m (cf. (7.17)) for which an orthonormal frame is given by a set of left-invariant vector fields X i = L g e i (g), i = 1,...,m. The maximized Hamiltonian is H(g,p) = 1 m p,x i (g) 2 = 1 m p,l g e i 2, 2 2 i=1 hence it is left invariant (cf. Example 7.57). The corresponding trivialized Hamiltonian is H(ξ) = 1 m ξ,e i 2. 2 Now ξ,e i = h i (g,p) hence in vertical coordinates we have H(h 1,...,h m ) = 1 m h 2 2 i. i=1 178 i=1 i=1

179 7.8.1 Explicit expression of normal Pontryagin extremals in the d s case Explicit expressions of normal Pontryagin extremals can be obtained for left-invariant sub-riemannain structures when a bi-invariant pseudo-metric on G is given; T ½ G = d s where d is positive defined and s satisfies the following i) s := d (where the orthogonality is taken with respect to ); ii) s is a sub-algebra; The distribution is d and the metric is d. We say that such a sub-riemannian structure is of type d s. Remark A classical example of such a d s sub-riemannian structure is provided by the group of matrices SO(n) in which the distribution at the identity d is given by any codimension one subspace of T ½ SO(n) and the norm of a vector in d is the square root of the sum square of its matrix elements. Exercise Prove that the distribution defined in Remark 7.61 is Lie bracket generating. Prove that the metric induced by the norm defined above is induced (up to a negative proportionality constant) by the Killing form. Let us write an element of v T ½ G as v = x+y where x d and y s. Let e 1,...e m be an orthonormal frame for the structure. In this case if M = x+y is the element in T ½ G corresponding to ξ T ½ G via we have h i = ξ,e i = M e i = x i. Hence H = 1 2 n h 2 i = 1 n x 2 i 2 = 1 2 x x = 1 2 x 2. (7.37) i=1 i=1 Notice that (cf. (7.32)) dh = n i=1 H h i e i = n i=1 H x i e i = n i=1 x ie i = x. Hence the vertical part of the Hamiltonian equation dm/dt = [M, dh] become ẋ+ẏ = [x+y,x] = [y,x]. (7.38) Now for every v s one has [y,x] v = x [y,v] =, where we have used equation (7.14) and for the last equality that facts that [y,v] s since s is a sub-algebra. d and s are orthogonal for. We then conclude that [y,x] d. Hence (7.38) become ẋ = [y,x] ẏ = 179

180 Hence all y component are constant of the motion and we have y(t) = y ẋ = [y,x] = (ady )x The solution of the last equation is x(t) = e tady x. (7.39) Then for the horizontal part we have ġ = L g dh = L g x(t) = L g e tady() x(). (7.4) Using the variation formula for smooth vector fields (cf. (6.34)), e t(y +X) = exp t e sady Xds e ty, (7.41) we have that the solution of (7.4) starting from g and corresponding to x,y is 3 g(x,y ;t) = g e t(x +y ) e ty (7.42) The parameterization by arclength is obtained requiring H = 1/2. From (7.37) at t = we obtain that the normal Pontryagin extremals (7.42) are parametrized by arclength when x x = x 2 = 1. The controls whose corresponding trajectories starting from g are the normal Pontryagin extremals (7.42) are u i (t) = p(t),x i (g(t)) = h i (g(t),p(t)) = x i (t) = Exercise Study abnormal extremals for this problem Example: The d s problem on SO(3) e tady x ei, i = 1,...,m. The Lie group SO(3) is the group of special orthogonal 3 3 real matrices SO(3) = { g Mat(3,R) gg T = Id,det(g) = 1 }. To compute its Lie algebra, let us compute its tangent space at the identity. Consider a smooth curve g : [,ε] SO(3), such that g() = e. Computing the derivative in zero of both sides of the equation g(t)g T (t) = e, we have ġ()g() +g()g T () = from which we deduce g() = g T (). Hence the Lie algebra of SO(3) is the space of skew symmetric 3 3 real matrices and it is usually denoted by so(3). In other words so(3) = a b a c Mat(3,R). b c 3 For a group of matrices: formula (7.39) reads as e ty x e ty, while (7.4) is ge ty x e ty. 18

181 A basis of so(3) is {e 1,e 2,e 3 } where e 1 = 1, e 2 = 1 1 1, e 3 = 1 1 whose commutation relations are [e 1,e 2 ] = e 3 [e 2,e 3 ] = e 1 [e 3,e 1 ] = e 2. For so(3) the Killing form is K(X,Y) = trace(xy) so, in particular, K(e i,e j ) = 2δ ij. Hence = 1 2 K(, ) is a (positive definite) bi-invariant metric on so(3). If we define d = span{e 1,e 2 }, s = span{e 3 } and we provide d with the metric d we get a sub-riemannian structre of type d s. Expression of normal Pontryagin extremals Let us write an initial covector x +y such that x x = 1 in the following form x +y = cos(θ)e 1 +sin(θ)e }{{ 2 + ce } 3, θ S }{{} 1, c R. x y Using formula (7.42), we have that the normal Pontryagin extremals starting from the origin are = g(θ,c;t) := e (cos(θ)e 1+sin(θ)e 2 +ce 3 )t e ce3t = (7.43) K 1 cos(ct)+k 2 cos(2θ +ct)+k 3 csin(ct) K 1 sin(ct)+k 2 sin(2θ +ct) K 3 ccos(ct) K 4 cos(θ)+k 3 sin(θ) K 1 sin(ct)+k 2 sin(2θ +ct)+k 3 ccos(ct) K 1 cos(ct) K 2 cos(2θ +ct)+k 3 csin(ct) K 3 cos(θ)+k ( 4 sin(θ) ) cos 1+c 2 t +c 2 K 4 cos(θ +ct) K 3 sin(θ +ct) K 3 cos(θ +ct)+k 4 sin(θ +ct) 1+c 2 with K 1 = 1+(1+2c2 )cos( 1+c 2 t), K 2(1+c 2 ) 2 = 1 cos( 1+c 2 t), K 2(1+c 2 ) 3 = sin( 1+c 2 t) 1+c, K 2 4 = c(1 cos( 1+c 2 t)). 1+c 2 The end point of all normal Pontryagin extremals for t = 1 are plotted in Figure Explicit expression of normal Pontryagin extremals in the k z case Another case in which one can get an explicit expression of normal Pontryagin extremals is when G = G k G z where G k has a compact algebra k and G z is abelian. In other words the Lie algebra at the origin of G can be written as T ½ G = k z where k is a compact subalgebra and z is contained in the center of T ½ G, i.e., [v,y] = for every v T ½ G and y z. In the following we write an element of v T ½ G as v = x+y where x k and y z. Moreover we assume that a bi-invariant metric k on k is given (this is always possible by definition of compact Lie algebra); we assume that the distribution (that we assume to be Lie bracket generating) projects well on k, that is if π : T ½ G k is the canonical projection induced by the splitting, we have π D is 1:1 over k. Under this condition, there exists a linear operator A : k z such that d = {x+ax x k} k z = T ½ G. 181

182 Figure 7.2: The set of end points of Pontryagin extremals of length 1 for the d s sub-riemannian problem on SO(3). In the picture the x-axis is the element (g) 23, the z-axis is the element (g) 13, the z-axis is the element (g) 12. Notice the singularities accumulating at the origin. This picture looks very similar to the one of the Heisenberg group (cf. Figure 7.1). Indeed it is possible to prove (cf. Chapter 1) that the two pictures become more and more similar if one consider end points of geodesics of length r and makes r smaller and smaller. For r big the two pictures become very different due to the different topology of R 3 and SO(3). we assume that the metric on d is induced by the projection, i.e., w 1 w 2 d = π(w 1 ) π(w 2 ) k, for every w 1,w 2 d, or equivalently that if v 1,v 2 d, v 1 = (x 1,Ax 1 ), v 2 = (x 2,Ax 2 ) with x 1,x 2 k, then v 1 v 2 d = x 1 x 2 k. See Figure 7.3. Let us fix any scalar product on z on z and define the scalar product on T ½ G by v 1 v 2 = x 1 x 2 k + y 1 y 2 z, where v 1 = x 1 +y 1, v 2 = x 2 +y

183 z d k Figure 7.3: The k z problem z 3 u 3 u 2 u 1 z 2 X SO(3) (z 1,z 2 ) z 1 Figure 7.4: Rolling sphere with twisting. Notice that if x k and y z then x y =. Exercise Prove that is bi-invariant as a consequence of the bi-invariance of k and of the fact that z is in the center of T ½ G. The metric T½ G is used to identify vectors and covectors, to use the simpler form (7.36) of the Hamiltonian equations for normal Pontryagin extremals. The resulting normal Pontryagin extremals will be independent on the choice of the scalar product z. Remark An example of such a structure is provided by the problem of rolling without slipping a sphere of radius 1 in R 3 on a plane. Its state is described by a point in R 2 giving the projection of its center on the plane and by an element of SO(3) describing its orientation. Given an initial and final position in SO(3) R 2 one would like to roll the sphere on the plane in such a way that 3 i=1 u i(t) 2 dt is minimal, where u 1, u 2 the initial and final conditions are the given ones and T and u 3 are the three controls corresponding to the rolling of the sphere along the two axes of the plane and to the twist. See Figure 7.4. Why this problem gives rise to a k z sub-riemannian structure is described in detail in the next section. Let us write the maximized Hamiltonian. Let e 1,...,e m be an orthonormal frame for k. Then 183

184 an orthonormal frame for d is e 1 +Ae 1,...,e m +Ae m. We have H(g,p) = 1 2 The corresponding trivialized Hamiltonian is H(ξ) = 1 2 m p,l g (e i +Ae i ) 2. i=1 m ξ,(e i +Ae i ) 2, ξ T½G. i=1 Now using the metric T½ G we can identify T ½G with T½ G and write ξ = x+y. Then H(x,y) = 1 2 m x+y (e i +Ae i ) 2 T ½ G = 1 2 i=1 m ( x e i + y Ae i ) 2. (7.44) Here we have used the the fact that x,e i k and y,ae i z and we have used the orthogonality of k and z with respect to. Now y Ae i = A y e i = A y e i k, where A is the adjoint of A. Hence H(x,y) = 1 m ( x e i + A y e i 2 k ) 2 = 1 2 x+a y 2 k. (7.45) i=1 The vertical part of the Hamiltonian equations are (cf. the second equation of (7.36) with M replaced by x+y) ẋ+ẏ = [x+y,dh]. (7.46) The let us compute i=1 dh = x+a y+ax+aa y }{{}}{{} k z Now since z is in the center, the second part of dh disappear in the commutator in (7.46) and we get ẋ+ẏ = [x+y,x+a y] = [x,a y], from which we deduce ẋ = [x,a y], ẏ =. Hence all y components are constant of the motion and we have The solution of the last equation is y(t) = y ẋ = [x,a y ] = [A y,x] = (ad(a y ))x For the horizontal part of the Hamiltonian equations we have x(t) = e tad(a y ) x. (7.47) ġ(t) = L g(t) dh(x(t),y(t)) = L g(t) (x(t)+a y }{{ +Ax(t)+AA y } ). (7.48) }{{} k z 184

185 Using the fact that G = G k G z, it is convenient to write an element of G as g = (g 1,g 2 ) where g 1 G k and g 2 G z. Then equation (7.48) splits in the following way ġ 1 = L g1 (x(t)+a y ) (7.49) ġ 2 = Ax(t)+AA y (7.5) In the second equation we have used the fact that L g2 (Ax(t) + AA y ) = Ax(t) + AA y, since we are in an Abelian group. Moreover if g() = (g 1,g 2 ), then for (7.49) and (7.49) we have the initial conditions g 1 () = g 1 and g 2 () = g 2. Let us solve (7.49). Using (7.47) this equation is reduced to ġ 1 = L g1 (e tad(a y ) x +A y ) = L g1 e tad(a y ) (x +A y ), (7.51) where in the last formula we have used the fact that e tad(a y ) A y = A y. Using the variation formula (cf. (6.34)), e t(y +X) = exp with Y A y and X x +A y, we get t e sady Xds e ty, (7.52) g 1 (t) = g 1 e tx e ta y. (7.53) For (7.5), using (7.47) and using the fact that G z is Abelian, we have g 2 (t) = g 2 + t (Ax(s)+AA y ) ds = g 2 + t ) (Ae sad(a y ) x +AA y ds. (7.54) The parameterization by arclength is obtained requiring H = 1 2. From (7.45) we obtain that the normal Pontryagin extremals are parametrized by arclength when x +A y x +A y = x +A y 2 = 1. The controls corresponding to the normal Pontryagin extremals (g 1 (t),g 2 (t) are (cf. Formula 7.44): u i (t) = x(t)+y e i +Ae i = x(t) e i + y Ae i = x(t)+a y e i = e tad(a y ) x +A ei y. Exercise Study abnormal extremals for this problem. 7.9 Rolling spheres (3, 5) - Rolling sphere with twisting Consider a sphere of radius 1 in R 3 rolling on a plane without slipping. At every time the state of the system is described by a point on the plane (the projection of its center) and the orientation of the sphere. We represent a point on the plane as z = (z 1,z 2 ) R 2 and the orientation of the sphere by a point X SO(3) representing the orientation of an orthonormal frame attached to the sphere with respect to the standard orthonormal frame in R

186 Let {e 1,e 2,e 3 } be the following basis of the Lie algebra so(3) of SO(3), e 1 = 1 1, e 2 = 1 1, e 3 = 1 1. (7.55) The condition that the sphere is rolling without slipping can be expressed by saying that the only admissible trajectories in SO(3) R 2 are the horizontal trajectories of the following control system (here u i ( ) L ([,T],R), i = 1,2,3). ż 1 = u 1 (t) ż 2 = u 2 (t) Ẋ = X(u 2 (t)e 1 u 1 (t)e 2 +u 3 (t)e 3 ). (7.56) The controls u 1 ( ) and u 2 ( ) correspond to the two rotations of the sphere that produceamovement in the plane, whilethe control u 3 ( ) correpondsto a twist of the sphere(that producesno movement in the plane). See Figure 7.4. We would like to solve the following problem. P: Given an initial and final position in SO(3) R 2, roll the sphere on the plane in such a way that the initial and final conditions are the given ones and T 3 i=1 u i(t) 2 dt is minimal. We have the following result. Proposition The projection on the plane (z 1,z 2 ) of normal Pontryagin extremals is (up to time reparameterization) the set of sinusoids on the plane: {( z1 z 2 ) ( cos(a ) sin(a + ) sin(a ) cos(a )) )( f(φ,b,r,t) t ) } a,φ S 1, b,r, z 1,z 2 R, where f(φ,b,r,t) = { bsin(rt+φ ) if r > bt if r =. To prove Proposition 7.67 we first prove that the problem define a k z sub-riemannian structure and then we study its normal Pontryagin extremals. Claim. The problem above is a problem of type k z. To prove the claim let us set G = SO(3) R 2. We have T ½ G = so(3) R 2. Now let f 1 = (1,) T and f 2 = (,1) T be the generators of R 2 and define d = span{f 1 e 2,f 2 +e 1,e 3 } so(3) R 2. Givenavectorv = u 1 (f 1 e 2 )+u 2 (f 2 +e 1 )+u 3 e 3 dwedefineitsnormas v = u 2 1 +u2 2 +u2 3. If π : so(3) R 2 R 2 is the canonical projection, this norm coincide with the norm of π(v) so(3), where so(3) is the standard norm for which {e 1,e 2,e 3 } is an orthonormal frame. This norm comes from a bi-invariant metric as explained in Section

187 The corresponding sub-riemannian problem is then ġ = g ( ) u 1 (t)(f 1 e 2 )+u 2 (t)(f 2 +e 1 )+u 3 e 3. (7.57) T 3 u i (t) 2 dt min (7.58) i=1 writing g = (X,z) this problem become exactly (7.56). If we define the linear application A : so(3) R 2 via Ae 1 = f 2, Ae 2 = f 1, Ae 3 =, we can write d = {x+ax x so(3)}. Remark Notice that if we write an element of so(3) as x 1 e 1 +x 2 e 2 +x 3 e 3 and an element of R 2 as y 1 f 1 +y 2 f 2, we can think to A and to its adjoint A as to the rectangular matrices A = ( 1 1 ), A = 1 1 Notice that AA = ½ 2 2 while A A ½ 3 3. From the expression of A we also get. A f 1 = e 2, A f 2 = e 1. (7.59) The problem P is then a k z problem with k = so(3), z = R 2. Moreover d, A and the bi-invariant metric on k, are defined as above. Geodesics Geodesics are parametrized by arclength if we take x so(3) and y R 2 satisfying Now writing y = y 1 f 1 +y 2 f 2 and using (7.59) we have A y = A (y 1 f 1 +y 2 f 2 ) = x +A y = 1. (7.6) y 1 y 2 y 1 y 2 Hence writing x = x 1 e 1 +x 2 e 2 +x 3 e 3, equation (7.6) become (x 1 +y 2 )e 1 +(x 2 y 1 )e 2 +x 3 e 3 = 1. It is then convenient to parametrize normal Pontryagin extremals with y 1 R, y 2 R, θ [,π], ϕ [,2π], (7.61). 187

188 taking x 1 = y 2 +cos(θ)cos(ϕ) (7.62) x 2 = y 1 +cos(θ)sin(ϕ) (7.63) x 3 = sin(θ) (7.64) (7.65) The z part of the geodesics is given by the formula (7.54), with g 2 (z 1,z 2 ) T, i.e., ( z1 (t) z 2 (t) ) = = ( z1 z 2 ( z1 z 2 ) + ) + t t ) (Ae sad(a y ) x +AA y ds. ( ( Ae s(a y ) x e s(a y ) y1 + y 2 )) ds. (7.66) If we fix y 1 = y 2 =, we get z 1 (t) = z 1 tcos(θ)sin(ϕ), z 2 (t) = z 2 +tcos(θ)cos(ϕ). Otherwise if we set y 1 = rcos(a) and y 2 = rsin(a), we obtain for r, z 1 (t) = z 1 1 r (rtcos2 (a)cos(θ)sin(ϕ)+sin(a)cos(a)cos(θ)cos(ϕ)(sin(rt) rt)+ sin(a)(sin(a) cos(θ) sin(ϕ) sin(rt) + sin(θ) + sin(θ)( cos(rt)))), z 2 (t) = z r (cos(θ)( cos(ϕ) ( rtsin 2 (a)+cos 2 (a)sin(rt) ) +sin(a)cos(a)sin(ϕ)(sin(rt) rt) ) cos(a) sin(θ)(cos(rt) 1). that is a combination of sinus and cosinus. See Figure 7.5. Exercise Prove that each trajectory (z 1 (t),z 2 (t)) is a rototranslation of a sinusoid and that ϕ determines its initial direction, r its frequence, θ its amplitude and a its rotation on the plane. The k part of the geodesics can be obtained with the formula X(t) = e tx e ta y (2, 3, 5) - Rolling without twisting We now consider a sphere rolling on a plane without slipping and without twisting. Similarly to what done in Section 7.9, the state space is the group G = SO(3) R 2 whose Lie algebra is T ½ G = so(3) R 2 and the distribution is still defined by equation (7.57) with the difference that now we have u

189 Figure 7.5: A Pontryagin extremals for the rolling sphere with twist More precisely, the condition that the sphere is rolling without slipping and twisting can be expressedbysayingthattheonlyadmissibletrajectories inso(3) R 2 arethehorizontaltrajectories of the following control system ġ = g ( u 1 (t)(f 1 e 2 )+u 2 (t)(f 2 +e 1 ) ). (7.67) Here f 1,f 2 are the generators of R 2 and e 1,e 2,e 3 are given by (7.55). The controls u 1 ( ) and u 2 ( ) belonging to L ([,T],R) correspond to the rotations of the sphere along the z 1 and z 2 axis. The commutators among f 1,f 2,e 1,e 2,e 3 are [f 1,f 2 ] = [f i,e j ] =, i = 1,2, j = 1,2,3, (7.68) [e 1,e 2 ] = e 3, [e 2,e 3 ] = e 1, [e 3,e 1 ] = e 2. We would like to solve the following problem. P: Given an initial and final position in SO(3) R 2, roll the sphere on the plane in such a way that the initial and final conditions are the given ones and T 2 i=1 u i(t) 2 dt is minimal. Remark 7.7. Notice that solving problem P corresponds to find the shortest path on the plane such that the sphere rolling along that path goes from the prescribed initial condition to the prescribed final condition. See Figure (7.6). Contrarily to what happens to the problem of rolling a sphere with twisting (Section 7.9.1), this time the problem is not of the form k + z. Indeed the distribution is two dimensional while and itisnotprojectingwellonthecompactsub-algebraso(3). Wearegoingtousethegeneralequations. 189

190 z u 2 3 u 1 z 2 X SO(3) (z 1,z 2 ) shortest path on the plane z 1 Figure 7.6: The sub-riemannian problem of rolling a sphere without slipping and twisting. Normal extremals are solutions of the Hamiltonian system associated with the following Hamiltonian H(g,p) = 1 ( p,l g (f 1 e 2 ) 2 + p,l g (f 2 +e 1 ) 2). 2 The trivialized Hamiltonian is H(ξ) = 1 2 It is convenient to use the following coordinates, Notice that, using (7.68) we have ( ξ,(f 1 e 2 ) 2 + ξ,(f 2 +e 1 ) 2), ξ T ½G. h f1 = ξ,f i, i = 1,2, h ej = ξ,e j, j = 1,2,3. {h f1,h f2 } = ξ,[f 1,f 2 ] =, {h fi,h ej } = ξ,[f i,e j ] =, i = 1,2, j = 1,2,3, {h e1,h e2 } = ξ,[e 1,e 2 ] = ξ,e 3 = h e3, {h e2,h e3 } = h e1, {h e3,h e1 } = h e2. Then H = 1 2 ( (hf1 h e2 ) 2 +(h f2 +h e1 ) 2). The Hamiltonian equations are Let us start with the first one ḣ fi = {H,h fi }, i = 1,2, ḣ ej = {H,h ej }, j = 1,2,3. (7.69) ḣ f1 = {H,h f1 } = 2 i=1 H h fi {h fi,h f1 }+ 3 i=1 H h ei {h ei,h f1 } =, 19

191 where we have used that h f1 commutes (for the Poisson brackets) with everything. Similarly ḣ f2 =, ḣ e1 = (h f1 h e2 )h e3, ḣ e2 = (h f2 +h e1 )h e3, ḣ e3 = h f1 h e1 h f2 h e2. Now if we consider normal Pontryagin extremals parametrized by length, i.e., if we work on the level {H = 1/2} S 1 R 3, it is convenient to use the coordinates r,α,θ,c defined by h f1 = rcos(α) h f2 = rsin(α) h f1 h e2 = cos(θ +α), h f2 +h e1 = sin(θ +α), h e3 = c. Normal normal Pontryagin extremals starting from a given initial condition, are parametrized by points in {H = 1/2}, i.e., by θ S 1, c R and (r,α ) parametrizing R 2 in polar coordinates (r,α S 1 ). The Hamiltonian equations are then ṙ = r = r, (7.7) α = α = α, (7.71) θ = c, (7.72) ċ = r sin(θ). (7.73) Once that equations (7.72) and (7.73) are solved in function of the initial conditions (r,θ,c ), i.e., once that one gets θ(t;r,θ,c ), the controls are given by u 1 (t;r,θ,c,α ) = ξ,f 1 e 2 = h f1 h e2 = cos(θ(t;r,θ,c )+α ) u 2 (t;r,θ,c,α ) = ξ,f 2 +e 1 = h f2 +h e1 = sin(θ(t;r,θ,c )+α ). (7.74) Once u 1 ( ) and u 2 ( ) are known, one can compute the corresponding trajectory by integrating (7.67). However here we are only interesting to the planar part of the normal Pontryagin extremals starting from z 1 and z 2, that is given by z 1 (t;θ,c,α ) = z 1 + z 2 (t;θ,c,α ) = z 2 + t t u 1 (s)ds = z 1 + u 2 (s)ds = z 2 + In the following we refer to (z 1 ( ),z 2 ( )) as the z-geodesics. t t cos(θ(s;θ,c )+α )ds, (7.75) sin(θ(s;θ,c )+α )ds. (7.76) Qualitative analysis of the trajectoris 191

192 Equations (7.72) and (7.73) are the equation of a planar pendulum of mass 1, length 1, where r represent the gravity. These equations admits an explicit solution in terms of elliptic functions. However their qualitative behaviour can be understood easily. First notice that if we consider only z-geodesics starting from the origin and with z 1 () = 1 and z 2 () =, we can fix z 1 = z 2 =, α = θ. All other z-geodesics can be obtained by rototranslations of these ones. Equation (7.72) and (7.73) admit a constant of the motion that up to a constant is the energy of the pendulum: H p = 1 2 c2 r cos(θ). Fixed (r,c ), one compute H p and the corresponding trajectory in the (θ,c) plane should stay on this set. Now let us compute the curvature of the z-geodesics. We have K = z 1 z 2 z 2 z 1 ((z 1 )2 +(z 2 )2 ) 3/2 = θ (t;r,θ,c ) = c(t;r,θ,c ). Hence c is precisely the curvature of the z-geodesic. Inflection points of z-geodesics corresponds to times in which c changes sign. The case r =. In this case ċ = and θ(t) = θ +c t. The z-geodesic is a circle (if c ) or a straight line if (if c = ). The case r >. The level sets of H p are shown in Figure (7.8). There are several types of trajectories: H p > r. In this case the pendulum is rotating and θ( ) is monotonic increasing (no inflection points). H p = r. We have two cases: If θ ±π. The pendulum is on the separatrix. The z-geodesic has an inflection point at infinity. If θ = ±π. The pendulum stays at the unstable equilibrium (θ,c) = (±π,). The z-geodesic is a straight line. H p ( r,r ). In this case the pendulum is oscillating and θ( ) too. The z-geodesic present inflection points. Such z-geodesics are called inflectional. H p = r. The pendulum stays at the stable equilibrium (θ,c) = (,). The z-geodesic is a straight line. Evaluating when these normal Pontryagin extremals lose optimality is not an easy problem and it is outside the purpose of this book. See the bibliographical note. Exercise Find all abnormal extremals for this problem. 192

193 c c = 2 r H p > r H p = r H p = l = 1 H p r θ π π θ M = 1 g = r Figure 7.7: Level set of the pendulum for r. The vertical line θ = π is identified with the veritical line θ = π. We have also indicated the direction of parameterization that one gets from the equation θ = c. Notice that the only critical points are (θ,c) = (,) (stable equilibrium) and (θ, c) = (π, ) (unstable equilibrium). 193

194 r = H p > H p = H p > r > non inflectional geodesics H p = r > separatrice θ ±π H p ( r,r ) inflectional geodesics unstable critical point (θ = ±π) H p = r stable critical point Figure 7.8: z-geodesics. Notice the presence of a periodic trajectories. 194

195 7.9.3 Euler s cvrvae elasticae The z-geodesics for the rolling ball withouting twisting are called Euler s cvrvae elasticae, since they are obtained via (7.75) and (7.76) from the solution of equations (7.7), (7.71), (7.72), (7.73), that are the same equation that one gets while looking for the configurations of an elastic rod on the plane having a stationary point of elastic energy. See [Eul]. For convenience we re-write the equations here: ż 1 = cos(θ +α ) (7.77) ż 2 = sin(θ +α ) (7.78) θ = c (7.79) ċ = r sin(θ) (7.8) These equations contains several parameters: r >, α, and the initial conditions θ() = θ, c() = c, z 1 () = z 1, z 2 () = z 2, having the following meaning: (z 1,z 2 ) is the starting point of the curba elastica; θ +α is the starting angle of the curba elastica; θ gives the starting point of the solution of the pendulum that it is used in the interval [,T]; r and c establish the gravity of the pendulum and the level of the Hamiltonian H p. This has consequences on the type of curba elastica (inflection, non inflectional etc,...) and on their size on the plane. We have the following interesting characterization of cvrvae elasticae. Proposition The set of cvrvae elasticae coincides with the set of planar curves parametrized by planar arclength for which the curvature is an affine function of the coordinates. Proof. Let us make the following change of coordinates z 1,z 2 x 1,x 2 where ( ) ( )( ) x1 cos(α ) sin(α = ) z1. sin(α ) cos(α ) x 2 Then equations (7.77) (7.8) become ẋ 1 = cos(θ), ẋ 2 = sin(θ), θ = c, ċ = r sin(θ). z 2 Hence Integrating we obtain ċ = r sin(θ) = r ẋ 2. c(t) c = r (x 2 (t) x 2 ()). 195

196 Hence where c(t) = c r ( sin(α )z 1 +cos(α )z 2 )+r ( sin(α )z 1 +cos(α )z 2 ) = a +a 1 z 1 +a 2 z 2. a = c +r ( sin(α )z 1 +cos(α )z 2 ), a 1 = r sin(α ), a 2 = r cos(α ). One immediately verify that the Jacobian of the transformation c,r,α a,a 1,a 2 is equal to r. However this singularity is only due to the choice of polar coordinates. Exercise Consider the Engel sub-riemannian problem, i.e. the sub-riemannian structure on R 4 for which an orthonormal frame is given by the vector fields X 1 = x1, X 2 = x2 x 1 x3 + x2 1 2 x 4. Prove that thelie algebra generated by X 1 and X 2 is finite dimensional. Using Theorem 7.1 deduce that this problem define a sub-riemannian structure on a Lie group. Find the group law. Study its geodesics. Do the same for the Cartan sub-riemannian problem, i.e. the sub-riemannian structure on R 5 for which an orthonormal frame is given by the vector fields Bibliographical notes X 1 = x1, X 2 = x2 x 1 x3 + x2 1 2 x 4 +x 1 x 2 x5. 196

197 Chapter 8 End-point and Exponential map In Chapter 4 we started to study necessary conditions for an horizontal trajectory to be a minimizer of the sub-riemannian length between two fixed points. By applying first order variations we found two different class of candidates, namely normal and abnormal extremals. We also proved that normal extremal trajectories are geodesics, i.e., short arcs realize the sub-riemannian distance. In this chapter we go further and we study second order conditions. To this purpose, we introduce the end-point map E q that associates to a control u the final point E q (u) of the admissible trajectory associated to u and starting from q. Then we treat the problem of minimizing the energy J of curves joining two fixed points q,q 1 M as the problem of minimization with constraint min J E 1 q (q 1 ), q 1 M. (8.1) It is then natural to introduce Lagrange multipliers. First order conditions recover Pontryagin extremals, while second order conditions give new information. This viewpoint permits to interpret abnormal extremals as candidates for optimality that are critical points of the map E q defining the constraint. In this chapter we take advantage of the invariance by reparametrization to assume all the trajectories to be defined on the same interval [,1]. Also, since the energy of a curve coincides with the L 2 -norm of the corresponding control, it is natural to take L 2 ([,1],R m ) as class of admissible controls (cf. Section 3.B). This is useful since L 2 ([,1],R m ) has a natural structure of Hilbert space. 8.1 The end-point map and its differential* Recall that every sub-riemannian manifold (M,U,f) is equivalent to a free one (cf. Chapter 3). In this chapter we always assume that the sub-riemannian structure is free of rank m, i.e., U = M R m. In the following {f 1,...,f m } denotes a generating frame. Fix q M. Recall that, for every control u L 2 ([,1],R m ), the corresponding trajectory γ u is the unique solution of the Cauchy problem γ(t) = m u i (t)f i (γ(t)), γ() = q. (8.2) i=1 197

198 (P u t,1 ) (P u t,1 ) f v(t) γ u (1) γ u (t) T γu (1)M f v(t) q Figure 8.1: Differential of the end-point map Definition 8.1. Let (M,U,f) be a free sub-riemannian manifold of rank m and fix q M. Define U q L 2 ([,1],R m ) the set of controls u such that the corresponding trajectory γ u starting at q is defined on [,1]. The end-point map based at q is the map E q : U q M, E q (u) = γ u (1). (8.3) Exercise 8.2. Prove that U q is an open subset of L 2 ([,1],R m ). In what follows we employ the usual notation f u (q) = m i=1 u if i (q). Remark 8.3. With the notation of Chapter 6 the end-point map is rewritten as the chronological exponential E q (u) = q exp 1 f u(t) dt. (8.4) Now we prove that the end-point map is differentiable and we compute its (Fréchet) differential. Proposition 8.4. The end-point map E q is smooth on U q and for every u U q we have D u E q : L 2 ([,1],R m ) T γu(1)m, D u E q (v) = for every v L 2 ([,1],R m ). Here P u t,s is the flow generated by u. 1 (Pt,1) u f γu(1) v(t) dt. (8.5) From the geometric viewpoint, the differential D u E q (v) computes the integral mean of the vector field f v(t) defined by v along the trajectory γ u defined by u, where all the vectors are pushed forward in the same tangent space T γu(1) with Pt,1 u (see Figure 8.1). We stress that, since U q is an open set of L 2 ([,1],R m ), the differential is defined on the whole tangent space (identified with) L 2 ([,1],R m ). Proof of Proposition 8.4. The end-point map from q is a map E q : U q M. If we want to adopt our view point we will evaluate the end point on a function a : M R and obtain a E q : U q R. Indeed what we will estimate is the map a E : U C (M), u( ) (q a E q (u)) 198

199 The end-point map from q can be rewritten as the chronological exponential (8.4). Let us first show the smoothness of E near the control u. E(v( )) = S m (v)+r m (v) (8.6) where S m (v) = Id+ R m (v) = m 1 k=1 k (1) f v(sk ) f v(s1 )ds P v,s m f v(sm) f v(s1 )ds m(1) By Theorem (6.17) with t = 1 there exists C > such that R m (v)a α,k C m! ec v 2 v m 2 a s+m,k (8.7) that proves that the end-point map is differentiable at u = (indeed m-times differentiable, for every m N) and the previous inequality for m = 1 gives that ( E(v( )) 1 ) f v(t) dt a Ce C v 2 v 2 a α+1,k (8.8) α,k [...] To compute the differential at a point ū U, have to consider the expansion near of the map v( ) F(ū( )+v( )) = q exp 1 f (ū+v)(t) dt. We reproduce the argument used in the proof of Proposition??, i.e. we write where Gū is the map defined as follows E(ū+v) = Pū,1 G ū(v) Gū(v) = exp 1 (Pū,t) 1 f v(t) dt Indeed this is easily seen by using the variation formula (6.28) (compare also with the proof of Proposition 3.44) exp 1 f (u+v)(t) dt = exp = exp = exp f u(t) +f v(t) dt ( t exp ) adf u(s) ds (P u,t) 1 f v(t) dt P u,1 199 f v(t) dt exp 1 f u(t) dt

200 Then, the expansion of v Gū(v) near v = is obtained again by estimate (??) and one obtains from which we get, denoting q 1 = E(u) DūE(v) = (P,1 ) 1 D Gū = 1 (P u,t ) 1 f v(t)dt (8.9) (P u,t ) 1 f v(t)(q )dt = 1 (P u t,1 ) f v(t) (q 1 )dt. 8.2 Lagrange multipliers rule Let U be an open set of an Hilbert space H, and let M be a smooth n-dimensional manifold. Consider two smooth maps ϕ : U R, F : U M. (8.1) In this section we discuss the Lagrange multipliers rule for the minimization of the function ϕ under the constraint defined by F. More precisely, we want to write necessary conditions for the solutions of the problem min ϕ F, q M. (8.11) 1 (q) Theorem 8.5. Assume u U is solution of the minimization problem (8.11). Then there exists a covector (λ,ν) T qm R such that (λ,ν) (,) and Remark 8.6. Formula (8.18) means that for every v T q M one has λd u F +νd u ϕ =. (8.12) λ,d u F(v) +νd u ϕ(v) =. Proof. Let us prove that if u U is solution of the minimization problem (8.11), then u is a critical point for the extended map Ψ : U M R defined by Ψ(v) = (F(v),ϕ(v)). Indeed, if u is not a critical point for Ψ, then D u Ψ is surjective. By implicit function theorem, this implies that Ψ is locally surjective at u. In particular, for every neighborhood V of u it exists v V such that F(v) = F(u) = q and ϕ(v) < ϕ(u), that contradicts that u is a constrained minimum. Hence D u Ψ = (D u F,D u ϕ) is not surjective and there exists a non zero covector (λ,ν) such that λd u F +νd u ϕ =. 8.3 Pontryagin extremals via Lagrange multipliers Applying the previous result to the case when F = E q is the end-point map and ϕ = J is the sub-riemannian energy, one obtains the following result. Corollary 8.7. Assume that a control u U is a solution of the minimization problem (8.1), then there exists (λ,ν) Tq M R such that (λ,ν) (,) and λd u E q +νd u J =. (8.13) 2

201 Let us now prove that these necessary conditions are equivalent to those obtained in Chapter 4. Recall that, since J(u) = 1 2 u 2 L 2, then D u J(v) = (u,v) L 2 and, identifying L 2 ([,1],R m ) with its dual, we have D u J = u. Proposition 8.8. We have the following: (N) (u(t),λ(t)) is a normal extremal if and only if there exists λ 1 T q 1 M, where q 1 = E q (u), such that λ(t) = (P u t,1 ) λ 1 and u satisfies (8.13) with (λ,ν) = (λ 1, 1), namely λ 1 D u E q = u. (8.14) (A) (u(t),λ(t)) is an abnormal extremal if and only if there exists λ 1 T q 1 M, where q 1 = E q (u), such that λ(t) = (P u t,1 ) λ 1 and u satisfies (8.13) with (λ,ν) = (λ 1,), namely λ 1 D u E q =. (8.15) where in (8.14) we identify u L 2 with the element (u, ) L 2 (L 2 ) Proof. Let us prove (N). The proof of (A) is similar. Recall that the pair (u(t),λ(t)) is a normal extremal if the curve λ(t) satisfies λ(t) = (Pt,1 u ) λ(1) (that is equivalent to say that λ(t) is a solution of the Hamiltonian system, cf. Chapter 4) and λ(t),f i (γ(t)) = u i (t) for every i = 1,...,m, where γ(t) = π(λ(t)). Assumethatusatisfies(8.14)forsomeλ 1, letusprovethatthecurvedefinedbyλ(t) := (Pt,1 u ) λ 1 is a normal extremal. Condition (8.14) means that for every v L 2 ([,T],R m ) we have Using (8.5), the left hand side is rewritten as follows λ 1,D u E q (v) = = 1 1 λ 1,D u E q (v) = (u,v) L 2 (8.16) λ1,(p u t,1) f v(t) (q 1 ) dt = λ(t),fv(t) (γ(t)) dt = where we used that γ(t) = P 1 t,1 (q 1). Then (8.16) becomes 1 m λ(t),f i (γ(t)) v i (t)dt = i= (P u t,1 ) λ 1,f v(t) (γ(t)) dt m λ(t),f i (γ(t)) v i (t)dt, i=1 m u i (t)v i (t)dt. (8.17) and since v(t) is arbitrary, this implies λ(t),f i (γ(t)) = u i (t) for every i = 1,...,m. Following the same computations in the oppposite direction we have that if (u(t), λ(t)) is a normal extremal then the identity (8.14) is satisfied. 21 i=1

202 8.4 Critical points and second order conditions In this chapter, we develop second order conditions for constrained critical points in the case in which the constrained is regular. When applied to the sub-riemannian case (cf. Section 8.5), this gives second order conditions for normal extremals. In the following H always denote an Hilbert space. Recall that a smooth submanifold of H is a subset V H such that for every point v V there is an open neighborhood Y of v in H and a smooth diffeomorphism φ : V W to an open subset W H such that φ(v Y) = W U for U a closed linear subspace of H. We now recall the implicit function theorem in our setting. Proposition 8.9 (Implicit function theorem). Let F : H M be a smooth map and fix q M. If F is a submersion at every u F 1 (q), i.e., the Fréchet differential D u F : H T q M is surjective for every u F 1 (q), then F 1 (q) is a smooth submanifold whose codimension is equal to the dimension of M. Moreover T u F 1 (q) = kerd u F. We now define critical points. Definition 8.1. Let ϕ : H R be a smooth function and N H be a smooth submanifold. Then u N is called a critical point of ϕ N if D u ϕ TuN =. We start with a geometric version of the Lagrange multipliers rule, which characterize constrained critical points (not just minima). This construction is then used to develop a second order analysis. Proposition 8.11 (Lagrange multipliers rule). Let U be an open subset of H and assume that u U is a regular point of F : U M. Let q = F(u), then u is a critical point of ϕ F if and 1 (q) only if it exists λ TqM such that λd u F = D u ϕ. (8.18) Proof. Recall that the differential of F is a well-defined map D u F : T u U T q M, q = F(u). Since u is a regular point, D u F is surjective and, by implicit function theorem, the level set V q := F 1 (q) is a smooth submanifold (of codimension n = dimm), with u V q and T u V q = KerD u F. Since u is a critical point of ϕ Vq, by definition D u ϕ TuV q = D u ϕ KerDuF =, i.e., KerD u F KerD u ϕ. (8.19) Now consider the following diagram T u U D uf T q M (8.2)? d uϕ R From (8.19), using Exercice 8.12, it follows that there exists a linear map λ : T q M R (that means λ T qm) that makes the diagram (8.2) commutative. 22

203 Exercise Let V be a separable Hilbert spaces and W be a finite-dimensional vector space. Let G : V W and φ : V R two linear maps such that kerg kerφ. Then show that there exists a linear map λ : W R such that λ G = φ. Now we want to consider second order information at critical points. Recall that, for a function ϕ : U R defined on an open set U of an Hilbert space H, the first and second differential are defined in the following way, D u ϕ(v) = d ds ϕ(u+sv), s= Du 2 d2 ϕ(v) = ds 2 ϕ(u+sv) s= For a function F : U M whose target space is a manifold its first differential D u F : H T F(u) M is still well defined while the second differential DuF 2 is meaningful only if we fix a set of coordinates in the target space. If V is a submanifold in H, the first differential of a smooth function ψ : V R at a point u V is defined as D u ψ : T u V R, D u ψ(v) = d ds ψ(w(s)), s= where w : ( ε,ε) V is a curve that satisfies w() = u, ẇ() = v. If ψ = ϕ V is the restriction of a function ϕ : H R defined globally on H, then D u ψ = D u φ TuV coincides with the restriction of the differential defined on the ambient space H. For the second differential things are more delicate. Indeed the formula v T u V d2 ds 2 ψ(w(s)) (8.21) s= where w : ( ε,ε) V is a curve that satisfies w() = u, ẇ() = v, is a well-defined object (i.e., the right hand side depends only on v) only if u is a critical point of ψ. Indeed, if this is not the case, the quantity (8.21) depends also on the second derivative of w, as it is easily checked. If u is a critical point of ψ : V R (i.e., D u ψ = ) the second order differential (8.21) is a well-defined quadratic form T u V, that is called the Hessian of ψ at u: Hess u ψ : T u V R, v d2 ds 2 ψ(w(s)) (8.22) s= We stress that if ψ = ϕ V is the restriction of a function ϕ : H R defined globally on H, then the Hessian of ψ at a critical point u does not coincide, in general, with the restriction of the second differential of ϕ to the tangent space T u V. Let us compute the Hessian of the restriction in the case when V = F 1 (q) is a smooth submanifold of H, and ψ = ϕ F 1 (q). Using that T uf 1 (q) = KerD u F, the Hessian is a well-defined quadratic form Hess u ϕ F 1 (q) : KerD uf R that is computed in terms of the second differentials of ϕ and F as follows. Proposition For all v KerD u F we have where λ is satisfies the identity λd u F = D u ϕ. Hess u ϕ F 1 (q) (v) = D2 uϕ(v) λd 2 uf(v). (8.23) 23

204 Remark We stress again that in (8.23), while the left hand side is a well defined object, in the right hand side Du 2ϕ is well-defined thanks to the linear structure of H, while D2 uf needs also a choice of coordinates in the manifold M. Proof of Proposition By assumption F 1 (q) U is a smooth submanifold in a Hilbert space. Fix u F 1 (q) and consider a smooth path w(s) in U such that w() = u and w(s) F 1 (q) for all s. Differentiating twice with respect to u, with respect to some local coordinates on M, we have D u F( u) =, D 2 uf( u), u +D u F(ü) =. (8.24) where we denoted by u = u() and ü = ü(). Analogous computations for ϕ gives Hess u ϕ d2 F ( u) = 1 (q) ds 2 ϕ(w(s)) s= = D 2 uϕ( u), u +D u ϕ(ü) = D 2 u ϕ( u), u +λd uf(ü) (by λd u F = D u ϕ) = D 2 uϕ( u), u λ D 2 uf( u), u (by (8.24)) The manifold of Lagrange multipliers As above, let us consider the two smooth maps ϕ : U R and F : U M defined on an open set U of an Hilbert space H. Definition We say that a pair (u,λ), with u U and λ T M, is a Lagrange point for the pair (F,ϕ) if λ T F(u) M and D uϕ = λd u F. We denote the set of all Lagrange points by C F,ϕ. More precisely C F,ϕ = {(u,λ) U T M F(u) = π(λ), D u ϕ = λd u F}. (8.25) The set C F,ϕ is a well-defined subset of the vector bundle F (T M), that we recall is defined as (see also Definition 2.5) F (T M) = {(u,λ) U T M F(u) = π(λ)}. (8.26) We now study the structure of the set C F,ϕ. It turns to be a smooth manifold under some regularity conditions on the maps (F, ϕ). Definition The pair (F,ϕ) is said to be a Morse pair (or a Morse problem) if is not a critical value for the smooth map θ : F (T M) U U, (u,λ) D u ϕ λd u F. (8.27) Remark Notice that, if M is a single point, then F is the trivial map and with this definition we have that (F,ϕ) is a Morse pair if and only if ϕ is a Morse function. Indeedin this case D u F =, and is a critical value for θ if, by definition, the second differential Du 2 ϕ is non-degenerate. Proposition If (F,ϕ) define a Morse problem, then C F,ϕ is a smooth manifold in F (T M). 24

205 Proof. To prove that C F,ϕ is a smooth manifold it is sufficient to notice that C F,ϕ = θ 1 () and, by definition of Morse pair, is a regular value of θ. The result follows from the version of the implicit function theorem stated in Lemma 8.19 Lemma Let N be a smooth manifold and H a Hilbert space. Consider a smooth map f : N H and assume that is a regular value of f. Then f 1 () is a smooth submanifold of N. If the dimension of U, the target space of θ, were finite, a simple dimensional argument would permit to compute the dimension of C F,ϕ = θ 1 () (as in Proposition 8.9). In this case, since the differential of θ is surjective we would have that so we could compute the dimension of C F,ϕ dim F (T M) dim C F,ϕ = dim U dim C F,ϕ = dim F (T M) dim U = (dim U +rankt M) dim U = rankt M = n However, in the case dim U = + the above argument is no more valid, and we need the explicit expression of the differential of θ. Proposition 8.2. Under the assumption of Proposition 8.18, then dimc F,ϕ = dimm = n. Proof. To prove the statement, let us choose a set of coordinates λ = (ξ,x) in T M and describe the set C F,ϕ F (T M) as follows { D u ϕ ξd u F = (8.28) F(u) = x where here ξ is thought as a row vector. To compute dimc F,ϕ, it will be enough to compute the dimension of its tangent space T (u,ξ,x) C F,ϕ at a every (u,ξ,x). The tangent space T (u,ξ,x) C F,ϕ is described in coordinates by the set of points (u,ξ,x ) satisfying the equations 1 { D 2 uϕ(u, ) ξd 2 uf(u, ) ξ D u F( ) = D u F(u ) = x (8.29) Let us denote the linear map Q : U U U defined by Q(u ) = D 2 u ϕ(u, ) ξd 2 u F(u, ). Since Q is defined by second derivatives of the maps F and ϕ, it is a symmetric operator. on the Hilbert space U. The definition of Morse problem is immediately rewritten as follows: the pair (F, ϕ) defines a Morse problem if and only if the following map is surjective. Θ : U R n U U, Θ(u,ξ ) = Q(u ) B(ξ ). (8.3) 1 If a manifold C is described as the set {z : Ψ(z) = }, then its tangent space T zc at a point z C is described by the linear equation {z : D zψ(z ) = }. 25

206 where we denoted with B : R n U U the map B(ξ ) = ξ D u F( ). Indeed the map Θ is exactly the first equation in (8.29). The dimension of C F,ϕ coincides with the dimension of kerθ. Indeed for each element (u,ξ ) KerΘ by setting x = D u F(u ) we find a unique (u,ξ,x ) T (u,ξ,x) C F,ϕ. Since Q is self-adjoint, we have U = KerQ ImQ, dimkerq = codimimq. Using that Θ is surjective and dim(imb) n we get that dimkerq = codimimq dimimb n, is finite dimensional (in particular ImQ is closed and U = KerQ ImQ). If we denote with π Ker : U KerQ and π Im : U ImQ the orthogonal projection onto the two subspaces, it is easy to see that { Θ(u,ξ π Ker Bξ = ) = π Im Bξ = Qu Moreover π Ker B : R n KerQ is a surjective map between finite-dimensional spaces (the surjectivity is a consequence of the fact that Θ is surjective). In particular we have dimker(π Ker B) = n dimkerq. Then we get the identity dimkerθ = dimkerq+dimker(π Ker B) = dimkerq+(n dimkerq) = n since π Ker B : R n KerQ is a surjective map The last characterization of Morse problem leads to a convenient criterion to check whether a pair (F,ϕ) defines a Morse problem. Lemma The pair (F,ϕ) defines a Morse problem if and only if (i) ImQ is closed, (ii) KerQ KerD u F = {}. Proof. Assume that (F, ϕ) is a Morse problem. Then, following the lines of the proof of Proposition 8.2, Im Q has finite codimension, hence is closed, and (i) is proved. Moreover, since the problem is Morse, then the image of the differential of the map (8.27) is surjective, i.e. if there exists w U that is orthogonal to ImΘ, namely Q(u ),w ξ D u F( ),w =, (ξ,u ), then w =. Using that Q is self-adjoint we can rewrite the previous identity as that is equivalent, since ξ,u are arbitrary, to u,q(w) ξ D u F( ),w =, (ξ,u ), Q(w) = and D u F(w) =. This proves (ii). The converse implications are proved in a similar way. 26

207 Definition Let N be a n-dimensional submanifold. An immersion F : N T M is said to be a Lagrange immersion if F σ =, where σ denotes the standard symplectic form on T M. Let us consider now the projection map F c : C F,ϕ T M defined by : F c (u,λ) = λ. Proposition If the pair (F,ϕ) defines a Morse problem, then F c is a Lagrange immersion. Proof. First we prove that F c is an immersion and then that F c σ =. (i). Recall that F c : C F,ϕ T M where C F,ϕ = {(u,ξ,x) equations (8.28) holds} The differential D (u,λ) F c : T (u,λ) C F,ϕ T λ T M is defined by the linearization of equations (8.28) where Now looking at (8.29) it easily seen that T (u,λ) C F,ϕ = {(u,ξ,x ) equations (8.29) holds} D (u,λ) F c (u,ξ,x ) = (ξ,x ) D (u,λ) F c (u,ξ,x ) = iff Q(u ) = D u F(u ) =. Since (F,ϕ) defines a Morse problem we have by Lemma 8.21 that such a u does not exists. Hence the differential is never zero and F c is an immersion. (ii). We now show that Fc σ =. Since σ = ds is the differential of the tautological form s, and Fc σ = df c s since the pullback commutes with the differential, it is sufficient to show that F c s is closed. Let us show the identity Fc s = D(ϕ π U) CF,ϕ. By definition of the map F c, the following diagram is commutative: C F,ϕ F c T M (8.31) π U U F M π M Moreover, notice that if φ : M N is smooth and ω Λ 1 (N), by definition of pull-back we have (φ ω) q = ω φ(q) D q φ. Hence (Fcs) (u,λ) = s λ D (u,λ) F c = λ π M D (u,λ) F c (by definition s λ = λ π M ) = λ D u F π U (by (8.31)) = D u (ϕ π U ) (by (8.18)) 27

208 Definition The set L F,ϕ T M of Lagrange multipliers associated with the pair (F,ϕ) is the image of C F,ϕ under the map F c. From Proposition 8.23 it follows that, if L F,ϕ is a smooth manifold, then it is a Lagrangian submanifold of T M, i.e., σ LF,ϕ =. Collecting the results obtained above, we have the following proposition. Proposition Let (F,ϕ) be a Morse pair and assume (u,λ) is a Lagrange point such that u is a regular point for F, where F(u) = q = π(λ). The following properties are equivalent: (i) Hess u ϕ F 1 (q) is degenerate, (ii) (u,λ) is a critical point for the map π F c = F CF,ϕ : C F,ϕ M, Moreover, if L F,ϕ is a submanifold, then (i) and (ii) are equivalent to (iii) λ is a critical point for the map π LF,ϕ : L F,ϕ M. Proof. In coordinates we have the following expression for the Hessian Hess u ϕ F (v) = Q(v),v, 1 (q) v KerD uf. is degen- and Q is the linear operator associated to the bilinear form. Assume that Hess u ϕ F 1 (q) erate, i.e. there exists u KerD u F such that Qu,v =, v KerD u F. In other words Q(u ) KerD u F that is equivalent to say that Q(u ) is a linear combination of the row of the Jacobian matrix of F ξ such that Q(u ) = ξ D u F( ). From equations (8.29) it follows immediately that (i) is equivalent to (ii). The fact that (ii) is equivalent to (iii) is obvious. 8.5 Sub-Riemannian case In this section we want to specify all the theory that we developed in the previous ones to the case of sub-riemannian normal extremal. Hence, we will consider the functional J defined by J(u) = u(t) 2 dt and we consider its critical points constrained to a regular level set of the end-point map E, that means that we fix the final point of our trajectory (as usual we assume that the starting point q is fixed by the very beginning). We already characterized critical points by means of Lagrange multipliers, now we want to consider second order informations. We start by computing the Hessian of J E 1 (q 1 ). Lemma Let q 1 M and (u,λ) be a critical point of J E 1 (q 1 ). Then for every v KerD uf Hess u J E 1 (q 1 ) (v) = v 2 L 2 λ,d 2 u E(v) (8.32) 28

209 where DuE(v,v) 2 = 2q 1 [(P s,1 ) f v(s),(p t,1 ) f v(t) ]dsdt. (8.33) s t 1 and P t,s is the nonautonomous flow defined by the control u. Proof. By Proposition 8.13 we have Hess u J E 1 (q 1 ) (v) = D2 uj λd 2 ue. It is easy to compute derivatives of J. Indeed we can rewrite it as J(u) = 1 2 (u,u) L2, hence D u J(v) = (u,v) L 2, D 2 u J(v) = (v,v) L 2 = v 2 L 2, v KerD u E It remains to compute the second derivative of the end-point map. From the Volterra expansion (8.9) we get Du 2 E(v,v) = 2q 1 (P s,1 ) f v(s) (P t,1 ) f v(t) dsdt (8.34) s t 1 To end the proof we use the following lemma on chronological calculus, which we will use to symmetrize the second derivative. Lemma Let X t be a nonautonomous vector field on M. Then s t 1 Proof of the Lemma. We have 2 s t 1 X s X t dsdt = X s X t dsdt = 1 2 = = = s t 1 1 X s ds X s X t dsdt+ s t s t 1 X s ds 1 X t dt+ 1 2 s t 1 X t X s dsdt+ X s X t dsdt+ X s X t dsdt+ 1 X t dt+ s t 1 s t 1 s t 1 s t 1 X s X t dsdt s t 1 X t X s dsdt [X s,x t ]dsdt+ [X s,x t ]dsdt [X s,x t ]dsdt. [X s,x t ]dsdt. (8.35) s t 1 X t X s dsdt 29

210 Using Lemma 8.27 we obtain from (8.34) DuE(v,v) 2 = 2q 1 s t 1 where we used that 1 (P t,1) f v(t) dt = since v kerd u E. [(P s,1 ) f v(s),(p t,1 ) f v(t) ]dsdt (8.36) Proposition The sub-riemannian problem (E, J) is a Morse pair. Proof. We use the characterization of Lemma We have to show that Im ( Id λd 2 ue ) is closed, Ker ( Id λd 2 ue ) Ker (D u E) =. (8.37) Using the previous notation and defining g t v := (P t,1) f v, we can write Moreover we have λd 2 u E(v),v = 2 = = s t 1 s t 1 1 t D u E(v) = q 1 1 g t v(t) dt gv(s) s gt v(t) dsdt a (8.38) g s v(s) gt v(t) dsdt a+ g s v(s) gt v(t) dsdt a+ 1 t s 1 1 t gv(t) t gs v(s) dsdt a (8.39) gv(t) t gs v(s) dsdt a (8.4) where a is a smooth function such that d q1 a = λ. The kernel of the bilinear form is the kernel of the symmetric linear operator associated to it, the unique symmetric operator Q satisfying Then it follows λd 2 u E(v),v = (Qv,v) L 2 = 1 ( t (Qv)(t) = gv(s) s ds gt +g t (Av)(t)v(t)dt. 1 t g s v(s) ds ) a (8.41) Since (8.41) is a compact integral operator, then I Q is Fredholm, and the closedness of Im(I Q) follows from the fact that it is of finite codimension. On the other hand, for every control v KerD u E we can compute (see (8.5)) q 1 t g s v(s) ds = q 1 1 t g s v(s) ds Hence we have that v belong to the intersection (8.37) if and only if it satisfies ( I λd 2 u E ) v( )(t) = v(t)+λ t which has trivial kernel as it follows from the next lemma. 21 [ ] gv(s) s,gt v(t) (q 1 )ds

211 Lemma Let us consider the linear operator A : L 2 ([,T],R m ) L 2 ([,T],R m ) defined by (Av)(t) = v(t) t where K(t,s) is a function in L 2 ([,T] 2,R m ). Then (i) A = I Q, where Q is a compact operator, (ii) kera = {}. Moreover, if K(t,s) = K(s,t) for all t,s, then A is a symmetric operator. K(t, s)v(s)ds (8.42) Proof. The fact that the integral operator Q : L 2 ([,T],R m ) L 2 ([,T],R m ) defined by (Qv)(t) = t K(t, s)v(s)ds (8.43) is compact is classical (see for instance [HL99, Chapter 6]). We then prove statement (ii) in two steps. a) we prove it for small T. b) we prove it for arbitrary T. (a). Fix T > and consider a solution in L 2 ([,T],R m ) to the equation v(t) = We multiply (8.44) by v(t) and integrate on [,T], obtaining T t v(t) 2 dt = K(t,s)v(s)ds, t [,T]. (8.44) T t K(t, s)v(s)v(t)dsdt By applying twice the Cauchy-Schwartz identity, one obtains T ( T v(t) 2 dt T ) 1/2 T K(t,s) 2 dtds v(t) 2 dt. or, equivalently v 2 L 2 K L 2 v 2 L 2. Since for T we have K L 2 ([,T] 2,R m ), this implies that v = on [,T]. (b). Consider a solution of the identity (8.44) and define T = sup{τ > v(t) =,t [,τ]}. By part (a) one has T >. Since the set {v L 2 ([,T],R m ) v(t) = a.e. on [,T ]} is preserved by A then again by part (a) one obtains that v indeed vanishes on [,T + ε], for some ε >, contradicting the fact that that T is the supremum. Combining the last result with Proposition 8.23 we obtain the following corollary. Corollary 8.3. The manifold of Lagrange multilpliers of the sub-riemannian problem (E, J) is a smooth n-dimensional submanifold of T M. L (E,J) := {λ 1 T M λ 1 = e H (λ ), λ T q M} 211

212 8.5.1 Free initial point problem Let us consider the free initial point problem, i.e., consider the map E : M U M, (q,u) E q (u), where E q (u) is the end-point map based at q. Notice that E is a submersion and E {q } U = E q, E M {u} = P u,1 where Pt,s u is the nonautonomous flow associated with u. Since the initial point is not fixed, the minimization problem min J (8.45) E 1 (q 1 ) has only the trivial solution. We can try to look for solutions of the problem min J(u)+a(q) (8.46) E 1 (q 1 ) where a C (M) is a suitable smooth function. Critical points of this constrained minimization problem can be found with the Lagrange multiplier rule studied in the previous sections with F = E, ϕ = J +a. Fix a point (q,u) M U. Notice that every level set E 1 (q 1 ) is regular since the map E is a submersion. Then the equation (8.18) is written as λd (q,u)e = D (q,u)(j +a) (8.47) Since D (q,u)e = (D u E q,(p u,1 ) ), D (q,u)(j +a) = (D u J,d q a) the equation (8.47) splits into { λd u F = D u J = u, λ(p u,1 ) = d q a In other words, to every critical point of the problem (8.46) we can associate a normal extremal λ(t) = (P 1,t ) λ, where the initial condition is defined by the function a by λ = d q a. Exercise Fix q M and a C (M). Prove that to every critical point of the free endpoint problem min J(u) a(e q (u)), (8.48) u we can associate a normal extremal satisfying λd u F = u, λ = d F(u) a. In other words now we do not restrict to the sublevel F 1 (q 1 ) (we do not fix the final point of the trajectory) but we consider a penalty in the functional we want to minimize. 212

213 8.6 Exponential map A key object in sub-riemannian geometry is the exponential map, that parametrize normal extremals by their initial covectors. Definition Let q M. The sub-riemannian exponential map (based at q ) is the map E q : D q T q M M, E q (λ ) = π e H (λ ). (8.49) wherethedomain D q isthesetofcovectors suchthatthecorrespondingsolutionofthehamiltonian system is defined on [,1]. When there is no confusion on the point where the exponential map is based at, we omit it in the notation, writing E. The homogeneity of the sub-riemannian Hamiltonian H yields to the following homogeneity property of the flow of H. Lemma Let H be the sub-riemannian Hamiltonian. Then, for every λ T M for every α > and t > such that both sides are defined. e t H (αλ) = αe αt H (λ), (8.5) Proof. By Remark 4.27 we know that if λ(t) = e t H (λ ) is a solution of the Hamiltonian system, then also λ α (t) := αλ(αt) is a solution. The result follows from the uniqueness of the solution and the identity λ α () = αλ(). The homogeneity property (8.5) permits to recover the whole extremal trajectory as the image of the ray that join to λ in the fiber T q M. Corollary Let λ(t), t [,T], be the normal extremal that satisfies the initial condition λ() = λ T q M. Then the normal extremal path γ(t) = π(λ(t)) satisfies Proof. Using (8.5) we get γ(t) = E q (tλ ), t [,T] E q (tλ ) = π(e H (tλ )) = π(e t H (λ )) = π(λ(t)) = γ(t). Remark Due to the homogeneity property we can consider the following map R + C q M, (t,λ ) E q (tλ ) where C q is the hypercylinder of normalized covectors C q = {λ T q M H(λ) = 1/2} With an abuse of notation in what follows we define E q (t,λ ) := E q (tλ ), whenever the right hand side is defined. In other words we restrict to length parametrized extremal paths, considering the time as an extra variable. 213

214 Proposition If (M,d) is complete, then D q = T q M. Moreover, if there are no strictly abnormal minimizers, the exponential map is surjective. Proof. To prove that D q = T q M, it is enough to show that any normal extremal λ(t) starting from λ T q M with H(λ ) = 1/2 is defined for all t R. Assume that the extremal λ(t) is defined on [,T[, and assume that it is extendable to any interval [,T +ε[. The projection γ(t) = π(λ(t)) defined on [,T[ is a curve with unit speed hence for any sequence t j T the sequence γ(t j ) is a Cauchy sequence on M since d(γ(t i ),γ(t j )) t i t j hence convergent to a point q 1 by completeness of M. Let us now consider coordinates around the point q 1 and show that, in coordinates λ(t) = (p(t),x(t)), also p(t) is uniformly bounded. This will give a contradiction to the fact that λ(t) is not extendable. By Hamilton equations (4.34) ṗ(t) = H m x (p(t),x(t)) = p(t),f i (γ(t)) p(t),d x f i (γ(t)) i=1 Since H(λ(t)) = 1 2 m i=1 p(t),f i(γ(t)) 2 = 1/2 then p(t),f i (γ(t)) 1. Moreover by smoothness of f i, the derivatives D x f i C are locally boundedin the neighborhoodand one get the inequality ṗ(t) C p(t) which by Gronwall s lemma implies that p(t) is uniformly bounded. The second part of the statement follows from the existence of minimizers. We end this section by the Hamiltonian version of the Gauss Lemma Proposition 8.37 (Gauss Lemma). Let λ C q and let λ(t) = e t H (λ ) for t [,1] be a normal extremal. Assume that the sub-riemannian front E q (C q ) is smooth at E q (λ()). Then the covector λ(1) annihilates the tangent space to E q (C q ). Proof. It is enough to show that for every smooth variation η s Tq M H 1 (1/2) of initial covectors such that η = λ() we have λ(1), d ds E q (η s ) =. s= Let us consider the family of associated controls u s ( ) defined by the identities u s i (t) = ηs (t),f i (γ s (t)), u s L 2 = 1 where η s (t) is the solution of the Hamiltonian equation with initial value η s and γ s (t) is the corresponding trajectory. For these controls one has E q (η s ) = E q (u s ) hence d ds E q (η s ) = d s= ds E q (u s ) = D u E q (v), v := d s= ds u s (8.51) s= Notice that v is orthogonal to u since u s = const. Thus by the normal equation (8.14) and (8.51) λ(1), d ds E q (η s ) = λ(1),d u E q (v) = (u,v) L 2 =. (8.52) s= 214

215 Proposition The sub-riemannian exponential map E q : T q M M is a local diffemorphism at if and only if D q = T q M. Proof. It follows from D E(λ ) = γ λ () that imd E = D q. 8.7 Conjugate points and minimality of extremal trajectories* Consider now an extremal pair (u(t), λ(t)), t [, 1], such that the corresponding extremal path γ(t) is strictly normal. Recall that by Corollary 4.62, the curve γ is a geodesic. Moreover, γ [,s] is a geodesic too, for every s >. If we define γ s (t) := γ(st), with t [,1], then γ s corresponds to the control u s (t) = su(st). Notice that γ s is the curve γ [,s] reparametrized with constant speed on [,1]. Definition An extremal trajectory γ : [,T] M is said to be strongly normal, if γ [,s] is stricly normal s >. Proposition 8.4. Let γ be a strongly normal extremal trajectory. The following are equivalent: (i) Hess u J E 1 (γ(1)) is positive definite, (ii) Hess us J E 1 (γ s(1)) is non degenerate for all s >. Proof. Recall that every operator A on L 2 ([,T],R m ) can be associated with the quadratic form Q(v) = (Av,v) L 2 and viceversa. The quadratic form Hess us J E 1 (γ s(1)) (v) = v 2 L 2 λ(s),d 2 u s E(v,v), (8.53) defined for v kerd us E, is written as (, ) L 2 Q s, with Q s (v) = λ(s),d 2 u s E(v,v) that is associated with a compact operator, thanks to Lemma Then define the function { α(s) : = inf v 2 L 2 λ(s),du 2 s E(v,v) } v =1 = 1 sup λ(s),d 2 us E(v,v) (8.54) v =1 Let us prove the folllowing properties (a) α() = 1 (b) α( s) = implies that Hess u s J E 1 (γ s(1)) is degenerate (c) α(s) is a continuous and monotone decreasing function To prove (a), let us notice that D us E(v) = s P 1 t f v(t) dt, D 2 u s E(v,v) = 215 τ t s [P 1 τ f v(τ),p 1 t f v(t) ]dτdt. (8.55)

216 A change of variables in the integral gives D us E(v) = s 1 P 1 st f v(st)dt, D 2 u s E(v,v) = s 2 τ t 1 [P 1 sτ f v(sτ),p 1 st f v(st)]dτdt. (8.56) Hence for s = we have D 2 u s E(v,v) = and α() = 1. To prove (b), notice that α( s) = means that the quadratic form (, ) L 2 Q s is nonnegative and has infimum zero. Hence there exists a sequence of v j such that v j 2 = 1 and v j 2 Q(v j ). (8.57) Since the unit ball in L 2 is weakly compact we can extract a convergent subsequence, that we still denote by the same symbol, v j v. By compactness of Q s we have Comparing (8.57) and (8.58) we get v j 2 Q(v j ) = 1 Q(v j ) 1 Q( v). (8.58) Hess u s J E 1 (γ s(1)) ( v) = Thanks to Exercise 8.41, the quadratic form Hess u s J E 1 (γ s(1)) is degenerate. Exercise Let V be a vector space and Q : V V R be a quadratic form on V. Recall that Q is degenerate if there exists v V such that Q( v, ) =. Prove that a non negative quadratic form is degenerate if and only if there exists v such that Q( v, v) =. Finally to prove (c), let us write the second differential applied to functions v L 2 ([,1],R m ) D 2 u s E(v,v) = s 2 τ t 1 consider s s 1 and v KerD us E and define the control ( ) s s v(t) = s v s t, t s s, s, s < t 1. [P 1 sτ f v(τ),p 1 st f v(t)]dτdt. (8.59) Then v = v, v KerD us E and D 2 u s E(v,v) = D 2 u s E( v, v), hence α(s) α(s ). To prove that α is continuous we need that both the integrand in the expression of D us E and the kernel KerD us E of these quadratic form is continuous with respect to s. This follows from our main assumption on γ. Indeed, since every restriction γ [,s] is strictly normal we have that rank of D us E is always equal to n, and the kernel continuously depend on s. Remark Notice that (i) implies only that u is local minimizer in the L 2 -topology. We will discuss more stronger minimality conditions in next sections. 216

217 Definition Let q M and E q be the exponential map based at q. We say that q is conjugate to q along γ(t) = E q (tλ) if q = γ(s) and sλ is a critical point of the exponential map E q. We say that q is the first conjugate point to q along γ(t) = E q (tλ) if q = γ(s) and s = inf{τ > τλ is a critical point of E q }. We denote by Con q the set of all first conjugate points to q along some normal extremal trajectory starting from q. Proposition Let γ : [,T] M be a strongly normal extremal trajectory and s ],T]. Then γ(s) is conjugate to γ() along γ if and only if Hess us J E 1 (γ s) is degenerate. Proof. We apply Proposition Indeed γ(s) is a conjugate point if and only if u s is a critical point of the exponential map, that is equivalent to the fact that Hess us J E 1 (γ s) is degenerate. Corollary Let γ : [,T] M be a strongly normal extremal trajectory and assume that γ(t) is not conjugate to γ() along γ for every t >. Then Hess u J E 1 (q 1 >. In particular γ(t) is a ) local minimizer in the L 2 -topology for controls. Proof. Since γ contains no conjugate points, by Proposition 8.44 it follows that Hess us J E 1 (γ s) is non degenerate for every s [,1], hence Hess u c F 1 (q 1 > by Proposition 8.4. ) Corollary Let γ : [,T] M be a strongly normal extremal trajectory. Then the set is isolated from. {s > γ(s) is conjugate to γ()} Proof. It follows from the fact that small pieces of a normal extremal trajectory are minimizers and Proposition Local minimality of normal extremal trajectories in the uniform topology In the previous section we proved that a normal extremal trajectory that contains no conjugate points, is a local minimizer for the length in the space of admissible trajectories with fixed endpoints, endowed with the H 1 -topology, i.e., the L 2 -topology for controls. In this section we prove that, in absence of conjugate points, a normal extremal trajectory is a local minimizer with respect to the stronger uniform (i.e., C ) topology in the space of admissible trajectories with fixed endpoints. Proposition Let γ : [,T] M be a strongly normal extremal trajectory. If γ(s) is not conjugate to γ() for every s >, then γ is a local miminum for the length in the C -topology in the space of admissible trajectories with the same endpoints. Proof. Assume that γ(t) = π e t H (λ ), λ T qm We want to show that hypothesis of Theorem 4.6 are satisfied. We will use the following lemma, which we prove at the end of the proposition. 217

218 Lemma There exists a C (M) such that λ = d q a, Hess (q,u)j +a >, E 1 (γ s) In this case (E,J +a) is a Morse problem and L (E,J+a) = {e H (d q a), q M} From this Lemma it follows that sλ is a regular point of the map π e H L, where as usual L = {d q a,q M} denotes the graph of the differential. Using the homogeneity property (8.5) we can rewrite this saying that π e s H L is an immersion at λ, s [,1], Inparticular it is alocal diffeomorphism. Hence wecan applythelocal version of Theorem 4.6. Proof of Lemma First we notice that KerD (q,u)e T q M U, U Hilbert In particular KerD (q,u)e ( U) = KerD u E Since there are no conjugate points, it follows that Hess (q,u)j +a = Hess uj > (8.6) KerDuE Then it is sufficient to show that there exists a choice of the function a C (M) such that the Hessian is positive definite also in the complement. We define W s := {ξ v KerD (q,u s)e Hess(J +a)(ξ v, KerD us E) = } Notice from (8.6) that, if there is some ξ v W s, then ξ. Now we prove that there exists a map B s : T q M U, W s = {ξ B s ξ, ξ T q M} Then we will have and we get KerD (q,u s)e = ( KerD us F)+W s Hess(J +a)(ξ B s ξ + v,ξ B s ξ + v) = = HessJ(v,v) +Hess(J +a)(ξ B s ξ,ξ B s ξ) = HessJ(v,v) +d 2 a(ξ,ξ)+q(ξ) where we used that mixed terms give no contribution and denote with Q(ξ) a quadratic form that does not depend on second derivatives of a. In particular, since the first term is positive, we can choose a in such a way that it remains positive. 218

219 Up to now we proved a sufficient condition for a strictly normal extremal trajectory to be a strong minimum of the sub-riemannian distance. Indeed Proposition 8.47 says that, if γ contains no conjugate points, then it is optimal with respect to sufficiently C -closed curves. On the other hand, if we consider a control u such that the corresponding trajectory γ(t) = q exp t f u(s) ds is strictly normal, that means u is not a critical point of the end-point map E, then it is well defined the Hessian of J E 1 (q 1 ), where q 1 = E(u) at the point u. Moreover, if γ is locally optimal, also in a very weak sense, then necessarily we have Hess u J E 1 (q 1 ) Indeed if the Hessian is sign-indefinite, then the map is locally open around the point u and we have that small perturbations give rise to a smaller cost. As in the proof of Proposition 8.4 we consider the family of rescaled controls(and corresponding trajectories) u s (t) = su(st), γ s (t) = γ(st), s,t [,1], and we define the function α(s) = min v =1 Hess u s J E 1 (γ s(1)) that is well defined, continuous and non-increasing, under the assumption that γ s is strictly normal for every s [,1]. Notice that α( s) = if and only if γ( s) is a conjugate point. Since α() = 1 we have only three cases (a) α(1) >. By monotonicity this implies α(s) > for all s and we have no conjugate points. Hence, by Proposition 8.47, γ is a minimum in the strong topology. (b) α(1) <. Then the Hessian at u is sign indefinite and γ is not a minimum, also in the weak topology. (c) α(1) =. In this case the Hessian is semi-definite and we cannot conclude anything on the minimality of γ. Notice that in cases (b) and (c) also a segment of conjugate point can appear. To analyze in details case (c) and to understand better the properties of a segment of conjugate point we introduce the notion of Jacobi curves, which is some sense generalize the notion of Jacobi fields in Riemannanian geometry. (see Chapter 15) 8.8 Global minimizers* Before going to the analysis of global minimality of extremal trajectories, let us resume in the following Theorem our results about local minimality. Theorem Let M be complete and γ(s) with γ [,s] and γ [s,1] strictly normal s 1. (i) if γ has no conjugate point then its a minimizer in the C -topology for the trajectories, 219

220 (ii) if γ has at least a conjugate point then its not minimizer in the L 2 -topology for controls. Remark 8.5. The assumption that the curve γ is strictly normal is essential in what we proved. Indeed if a curve γ is both normal and abnormal we have that there exists two covectors λ 1,ν 1 that satisfy λ 1 D u F = u, ν 1 D u F =, that implies (λ 1 +sν 1 )D u F = u, s R and the whole one parameter family of covectors projects on the same extremal trajectory, and γ would be a critical point of the projection. In this case the definition of conjugate point should be changed. Remark Notice that the hypotheses of the above theorem imply that in the case (ii) it not possible to have ha segment of full conjugate point up to t = 1. Definition We say that a point q is in the cut locus of q if thereexists two length minimizers joining q and q. Our previous analysis of conjugate points let us to state the following result. Theorem Let M be a complete sub-riemannian manifold and γ : [,1] M be a normal extremal path. Then (i) assume that γ [,s] is strictly normal for all s > and that γ is not a minimizer. Then there exists τ ],1] such that γ(τ) is either cut or conjugate to γ(), (ii) assume that γ [s,1] is strictly normal for all s > and that there exists τ ],1] such that γ(s) is either cut or conjugate to γ(). Then γ not a minimizer. In particular if γ is strongly normal then we have that γ is not a minimizer if and only if there exists a cut or a conjugate point along γ. Proof. (i). Let us assume that γ is not a minimizer and that there are no conjugate points along γ. We prove that this implies the presence of a cut point. Define t := sup{t [,1] γ [,t] is minimizing} Let us show that < t < 1. Indeed t > since small pieces of a normal extremal path are minimizers. Moreover, since γ [,1] is not a minimizer, by continuity of the distance also t < 1 and l(γ [,t ]) = d(γ(),γ(t )). Fix now a sequence t n t such that t n > t for all n and denote by γ n ( ) a minimizer joining γ() to γ(t n ) such that l(γ n ) = d(γ(),γ(t n )) (the existence of such a minimizers follows from the completeness assumption). By compactness of minimizers (up to considering a subsequence) there exists a limit minimizer γ n γ joining γ() to γ(t ). In particular l( γ [,t ]) = d(γ(),γ(t )) = l(γ [,t ]). On the other hand, since the segment γ [,t ] contains no conjugate points (by definition of t ), the curve γ [,t ] is a minimizer in the strict C -topology. Thus γ cannot be contained in a neighborhood γ. From this it follows that γ(t ) is a cut point. 22

221 (ii). Assume that there exists a conjugate point γ(τ) in the segment [,1]. Then γ is not a local (hence global) minimizer, as proved in Theorem It remains to show that the same remains true if γ(τ) is a cut point. Indeed in this case we have a minimizer γ such that γ(τ) = γ(τ). From this it follows that the curve built with γ [,τ] and γ [τ,1] is also a minimizer and the piece γ [τ,1], by uniqueness of the covector, would be associated with two different normal covectors, hence abnormal, that contradicts our assumptions. Theorem Let γ : [,1] M be a strictly normal extremal path. Assume that for some s > (i) γ [,s] is a global minimizer, (ii) at each point in a neighborhood of γ(s) there exists a unique minimizer joining γ() to γ(s), that is not abnormal. Then there exists ε > such that γ [,s+ε] is a global minimizer. Proof. Let us consider a neighborhood O of γ(s) and, for each q O, let us denote by u q (resp. γ q ) the minimizing control (resp. trajectory) joining γ() to q. The map q u q is continuous in the L 2 topology. Hence we can consider the family λ q 1 of covectors such that λ q 1 D u qf = uq, q O. By the smoothness of F and the continuity of the map q D u qf we have that the map q λ q 1 is continuous. Indeed since the trajectory associated with u q is not abnormal by assumptions, one has D u qf is onto. Thus its adjoint (D u qf) is injective and satisfies λ q 1 = (D u qf) u q. Thus the map q λ q is continuous too, being the composition of the previous one with (P,1 ) 1. Moreover, the map q λ q is also injective. Indeed it is an inverse of the exponential map. By the invariance of domain theorem we have that O = {λ q,q O} is open in T qm. Thus (1+ε)λ γ(s) O for ε small enough. Since (1+ε)λ γ(s) = λ γ((1+ε)s), this means that γ is minimizer on the interval [,(1+ε)s[. Hence γ(s) is not a conjugate point. Corollary If we assume in Theorem 8.54 that γ is strongly normal, then γ(s) is not a conjugate point. Corollary Assume that the sub-riemannian structures admits no abnormal minimizer. Let γ : [,1] M be a length minimizer such that γ(1) is conjugate to γ(). Then any neighborhood of γ(1) contains a cut point. 8.9 An example: the first conjugate locus on perturbed sphere In this section we prove that a C small perturbation of the standard metric on S 2 has a first conjugate locus with at least 4 cusps. See Figure??. Recall that geodesics for the standard metric on S 2 are great circles, and the first conjugate locus from a point q coincides with its antipodal point q. Indeed all geodesics starting from q meet and lose their local and global optimality at q. Denote H the Hamiltonian associated with the standard metric on the sphere and let H be an Hamiltonian associated with a Riemannian metric on S 2 such that H is sufficiently close to H, with respect to the C topology for smooth functions in T M. 221

222 Fix a point q S 2. Normal extremal trajectories starting from q and parametrized by length (with respect to the Hamiltonian H) can be parametrized by covectors λ T q M such that H(λ) = 1/2. The set H 1 (1/2) is diffeomorphic to a circle S 1 and can be parametrized by an angle θ. For a fixed initial condition λ = (q,θ), where q M and θ S 1 we write λ(t) = e t H (λ ) = (p(t,θ),γ(t,θ)), and we denote by E = E q the exponential map based at q E q (t,λ ) = π e t H (λ ) = γ(t,θ) For every initial condition θ S 1 denote by t c (θ) the first conjugate time along γ(,θ), i.e. t c (θ) = inf{τ > γ(τ,θ) is conjugate to q along γ(,θ)}. Proposition The first conjugate time t c (θ) is characterized as follows { } t c (θ) = inf t > E θ (t,θ) =. (8.61) Proof. Conjugate points correspond to critical points of the exponential map, i.e., points E(t, θ) such that { } E E rank (t,θ), t θ (t,θ) = 1. (8.62) Notice that E E t (t,θ) = γ(t,θ). Let us show that condition (8.62) occurs only if θ (t,θ) =. Indeed, by Proposition 8.37, one has that p, E t (t,θ) = 1, p, E θ (t,θ) =, thus, whenever E θ (t,θ), thetwovectors appearingin(8.62)arealways linearlyindependent. Lemma The function θ t c (θ) is C 1. Proof. By Proposition 8.57, t c (θ) is a solution to the equation (with respect to t) E (t,θ) =. (8.63) θ Let us first remark that, for the exponential map E associated with the Hamitonian H we have E θ (t c(θ),θ) =, 2 E t θ (t c(θ),θ) (8.64) where t c (θ) is the first conjugate time with respect to the metric induced by H, as it is easily checked. Since H is close to H in the C topology, by continuity with respect to the data of solution of ODEs, we have that E is close to E in the C topology too. Moreover the condition (8.64) ensures the existence of a solution t c (θ) of (8.63) that is close to t c(θ). Hence we have that By the implicit function the function θ t c (θ) is C 1. 2 E t θ (t c(θ),θ) (8.65) 222

223 Let us introduce the function β : S 1 M defined by β(θ) = E(t c (θ),θ). The first conjugate locus, by definition, is the image of the map β. The cuspidal point of the conjugate locus are by definition those points where the function θ t c (θ) change sign. By continuity (cf. proof of Lemma 8.58) the map β takes value in a neighborhood of the point q antipodal to q. Let us take stereographic coordinates around this point and consider β as a function from S 1 to R 2. By the chain rule and (8.63), we have β (θ) = t c(θ) E t (t c(θ),θ)+ E θ (t c(θ),θ) }{{} = Let us define g,g : S 1 R 2 by g(θ) := E t (t c(θ),θ) and g (θ) := E t (t c(θ),θ). The set is convex, since C = {ρg (θ) θ S 1,ρ [,1]} g (θ) = ( ) cosθ sinθ By assumption the perturbation of the metric is small in the C -topology, hence remains convex. (8.66) C = {ρg(θ) θ S 1,ρ [,1]}, (8.67) Theorem The conjugate locus of the perturbed sphere has at least 4 cuspidal points. Proof. Notice that the function θ t c(θ) can change sign only an even number of times on S 1 = [,2π]/. Moreover 2π t c(θ)dθ = t c (2π) t c () =. (8.68) A function with zero integral mean on [,2π] which is not identically zero has to change sign at least twice on the interval. Notice also that 2π t c(θ)g(θ)dθ = 2π β (θ)dθ = β(2π) β() =. (8.69) Let us now assume by contradiction that the function θ t c(θ) changes sign exactly twice at θ 1,θ 2 S 1. Then, by convexity of C, there exists a covector η (R 2 ) such that η,g(θ i ) = for i = 1,2 and such that t c(θ) η,g(θ) > if θ θ i for i = 1,2. This implies in particular 2π 2π η, t c (θ)g(θ)dθ = t c (θ) η,g(θ) dθ which contradicts (8.69). Remark 8.6. A careful analysis of the proof shows that the statement remains true if one considers a small perturbation of the Hamiltonian (or equivalently, the metric) in the C 4 topology. Indeed the key point is that g is close to g in the C 2 topology, to preserve the convexity of the set C defined by (8.67). The same argument can beapplied for every arbitrary small C (and actually C 4 ) perturbation H of the Riemannian Hamiltonian H associated with the standard Riemannian structure on S 2, without requiring that H comes from a Riemannian metric. 223

224 224

225 Chapter 9 2D-Almost-Riemannian Structures Almost-Riemannian structures are examples of sub-riemannian strucures such that the local minimum bundle rank (cf. Definition 3.2) is equal to the dimension of the manifold at each point (cf. Section 3.1.3). They are the prototype of rank-varying sub-riemannian structures. In this chapter we study the 2-dimensional case, that is very simple since it is Riemannian almost everywhere (see Theorem 9.19), but presents already some interesting phenomena as for instance the presence of sets of finite diameter but infinite area and the presence of conjugate points even when the curvature is always negative (where it is defined). Also the Gauss-Bonnet theorem has a surprising form in this context. 9.1 Basic Definitions and properties Thanks to Exercise 3.28, given a structure having constant local minimum bundle rank m one can find an equivalent one having bundle rank m. In dimension 2, due to the Lie bracket generating assumption, also the opposite holds true in the following sense: a structure having bundle rank 2 has local minimal bundle rank 2. Hence we can define a 2D-almost-Riemannian structure in the following simpler way. Definition 9.1. Let M be a 2-D connected smooth manifold. A 2D-almost-Riemannian structure on M is a pair (U,f) where U is an Euclidean bundle over M of rank 2. We denote each fiber by U q, the scalar product on U q by ( ) q and the norm of u U q as u = (u u) q. f : U TM is a smooth map that is a morphism of vector bundles i.e. f(u q ) T q M and f is linear on fibers. D = {f(σ) σ : M U smooth section}, is a bracket-generating family of vector fields. As for a general sub-riemannian structure, we define: the distribution as D(q) = {X(q) X D} = f(u q ) T q M, the norm of a vector v D q as v := min{ u, u U q s.t. v = f(q,u)}. 225

226 admissible curve as a Lipschitz curve γ : [,T] M such that there exists a measurable and essentially bounded function u : t [,T] u(t) U γ(t), called control function, such that γ(t) = f(γ(t),u(t)), for a.e. t [,T]. Recall that there may be more than one control corresponding to the same admissible curve. minimal control of an admissible curve γ as u (t) := argmin{ u, u U γ(t) s.t. γ(t) = f(γ(t), u)} (for all differentiability point of γ). Recall that the minimal control is measurable (cf. Section 3.A) (almost-riemannian) length of an admissible curve γ : [,T] M as l(γ) := T γ(t) dt = T u (t) dt. distance between two points q,q 1 M as d(q,q 1 ) = inf{l(γ) γ : [,T] M admissible, γ() = q, γ(t) = q 1 }. (9.1) Recall that thanks to the Lie-bracket generating condition, the Chow-Rashevskii Theorem 3.3 guarantees that (M,d) is a metric space and that the topology induced by (M,d) is equivalent to the manifold topology. Definition 9.2. If (σ 1,σ 2 ) is an orthonormal frame for ( ) q on a local trivialization Ω R 2 of U, an orthonormal frame for the 2D-almost-Riemannian structure on Ω is the pair of vector fields (F 1,F 2 ) := (f σ 1,f σ 2 ). In Ω R 2 the map f can be written as f(q,u) = u 1 F 1 (q) + u 2 F 2 (q). When this can be done globally, we say that the 2D-almost-Riemannian structure is free. In this chapter we do not work with an equivalent structure of higher bundle rank that is free. Technically such a structure fits Definition 3.2 (i.e., that local minimum bundle rank is equal to the dimension of the manifold at each point) but not Definition 9.1. We rather work with local orthonormal frames that, as explained below, are orthonormal in the standard sense out of the singular set. This point of view permits to understand how global properties of U (as its orientability, its topology) are transferred in properties of the almost-riemannian structure. Definition 9.3. A 2D-almost-Riemannian structure (U, f) over a 2D manifold M is said to be orientable if U is orientable. It is said to be fully orientable if both U and M are orientable. Remark 9.4. Free 2D almost-riemannian structures are always orientable. On an orientable 2D almost-riemannian structure if {F 1,F 2 } and {G 1,G 2 } are two positive oriented orthonormal frames defined respectively on two open subsets Ω and Ξ then on Ω Ξ there exists a smooth function θ : M S 1 such that ( G1 (q) G 2 (q) ) = ( cos(θ(q)) sin(θ(q)) sin(θ(q)) cos(θ(q)) )( F1 (q) F 2 (q) As shown by the following examples, one can construct orientable 2D-almost-Riemannian structures on non-orientable manifolds and viceversa. An orientable 2D almost-riemannian structure on the Klein bottle. Let M be the Klein bottle seen as the square [ π,π] [ π,π] with the identifications (x, π) (x,π), ( π,y) (π, y). 226 ).

227 Let U = M R 2 with the standard Euclidean metric and consider the morphism of vector bundles given by f : U TM, f(x 1,x 2,u 1,u 2 ) = (x 1,x 2,u 1,u 2 sin(2x 1 )) This structure is Lie bracket generating and the two vector fields F 1 (x 1,x 2 ) = f(x 1,x 2,1,) = (x 1,x 2,1,), F 2 (x 1,x 2 ) = (x 1,x 2,,sin(2x 1 )), which are well defined on M, provide a global orthonormal frame. This structure is orientable since U is trivial. Exercise 9.5. Construct a non orientable almost-riemannian structure on the 2D torus. We now define Euler number of U that measures how far the vector bundle U is from the trivial one. Definition 9.6. Consider a 2D-almost-Riemannian structure (U, f) on a 2D manifold M. The EulernumberofU, denotedbye(u) istheself-intersection numberofm inu, wherem isidentified with the zero section. To compute e(u), consider a smooth section σ : M U transverse to the zero section. Then, by definition, e(u) = i(p,σ), p σ(p)= where i(p,σ) = 1, respectively 1, if d p σ : T p M T σ(p) U preserves, respectively reverses, the orientation. Notice that if we reverse the orientation on M or on U then e(u) changes sign. Hence, the Euler number of an orientable vector bundle E is defined up to a sign, depending on the orientations of both U and M. Since reversing the orientation on M also reverses the orientation of TM, the Euler number of TM is defined unambiguously and is equal to χ(m), the Euler characteristic of M. Remark 9.7. Assume that σ Γ(E) has only isolated zeros, i.e. the set {p σ(p) = } is finite. Since U is endowed with a smooth scalar product ( ) q we can define σ : M\{p σ(p) = } SU by σ(q) = σ(q) (here SU denotes the spherical bundle of U). Then if σ(p) =, i(p, σ) = i(p,σ) (σ σ)q is equal to the degree of the map B S 1 that associate with each q B the value σ(q), where B is a neighborhood of p diffeomorphic to an open ball in R n that does not contain any other zero of σ. Notice that if i(p,σ), the limit lim q p σ(q) does not exist. Remark 9.8. Notice that U is trivial if and only if e(u) =. Remark 9.9. Consider a 2D-almost-Riemannian structure (U,f) on a 2D manifold M. Let σ be a section of U and z σ the set of its zeros. As in Remark 9.7, define on M\z σ the normalization σ of σ and let σ (still defined on M\z σ ) its orthogonal with respect to ( ) q. Then the original structure is free when restricted to M \z σ and { σ, σ } is a global orthonormal frame for ( ) q. The global orthonormal frame for the corresponding 2D-almost-Riemannian structure is then (f σ,f σ ). Exercise 9.1. Consider a 2D-almost-Riemannian structure (U, f) on a 2D manifold M. Prove that (U,f) is free when restricted to M \{q } where q is any point on M. 227

228 Definition The singular set Z of a 2D-almost-Riemannian structure (U,f) over a 2D manifold M is the set of points q of M such that f is not fiberwise surjective, i.e., such that the rank of the distribution k(q) :=dim(d q ) is less than 2. Notice if q Z then k(q) = 1. Indeed at q we have k(q) = then the structure could not be bracket generated at q. Since outside the singular set Z, f is fiberwise surjective, we have the following Proposition A 2D-almost-Riemannian strucutre is Riemannian strucutre on M \ Z. On Riemannian points, the Riemannian metric g is reconstructed with the polarization identity (see Exercice 3.8). We have that if v = v 1 F 1 (q)+v 2 F 2 (q) T q M and w = w 1 F 1 (q)+w 2 F 2 (q) T q M then g q (v,w) = v 1 w 1 +v 2 w 2. By construction, at Riemannian points, {F 1,F 2 } is an orthonormal frame in the usual sense g q (F i (q),f j (q)) = δ ij, i,j = 1,2. Exercise Assume that in a local system of coordinates an orthonormal frame is given by ( ) ( ) ( F 1 F 1 = 1 F 1 F1 2, F 2 = 2 F2 2 and let F = (F j F i ) 1 i,j=1,2 = 1 F2 1 ) F1 2 F2 2. ProvethatatRiemannianpointstheRiemannianmetricisrepresentedbythematrixg = t (F 1 )F 1. The following Proposition is very useful to study local properties of 2D-almost-Riemannian structures Proposition For every point q of M there exists a neighborhood Ω of this point and a system of coordinates (x 1,x 2 ) in Ω such that an orthonormal frame for the 2D-almost-Riemannian structure can be written in Ω as: F 1 (q) = ( 1 where f : Ω R is a smooth function. Moreover ) (, F 2 = f(x 1,x 2 ) (i) the integral curves of F 1 are normal Pontryagin extremals; ), (9.2) (ii) if the step of the structure at q is equal to s, we have r x 1 f = for r = 1,2,...,s 2 and s 1 x 1 f ; Remark Notice that using the system of coordinates and the orthonormal frame given by Proposition 9.14, we have that Z Ω = {(x 1,x 2 ) Ω f(x 1,x 2 ) = }. Before proving Proposition 9.14, let us prove the following Lemma Lemma Consider a 2D-almost-Riemannian structure and let W be a smooth embedded onedimensional submanifold of M. Assume that W is transversal to the distribution D, i.e., such that D(q)+T q W = T q M for every q W. Then, for every q W there exists an open neighborhood U of q such that for every ε > small enough, the set is a smooth embedded one-dimensional submanifold of U. {q U d(q,w) = ε} (9.3) 228

229 W normal Pontryagin extremals D(q) Figure 9.1: Normal Pontryagin extremals starting from the singular set Proof. Let H(λ) be sub-riemannian Hamiltonian and consider a smooth regular parametrization α w(α) of W. Let α λ (α) T w(α) M be a smooth map satisfying H(λ (α)) = 1/2 and λ (α) T w(α) W. Let E(t,α) be the solution at time t of the Hamiltonian system with Hamiltonian H and with initial condition λ() = λ (α). Fix q W and define ᾱ by q = w(ᾱ). Now let us prove that E(t,α) is a local diffeomorphism around the point (,ᾱ). To do so let us show that the two vectors v 1 = E α (,ᾱ) and v 2 = E (,ᾱ) (9.4) t are not parallel. On one hand, since v 1 is equal to dw dα (ᾱ), then it spans T qw. On the other hand, being H quadratic in λ, λ (ᾱ),v 2 = λ (ᾱ), H λ (λ (ᾱ)) = 2H(λ (ᾱ)) = 1. (9.5) Thus v 2 does not belong to the orthogonal to λ (ᾱ), that is, to T q W. Therefore for a small enough neighborhood U of q, using the fact that small arcs of normal extremal paths are minimizers, we have that for ε > small enough, the set A = {q U d(q,w) = ε} contains the intersection of U with the images of E(ε, ) and E( ε, ). By possibly restricting U, we are in the situation of Figure 9.1 and the set A coincides with the intersection of U with the images of E(ε, ) and E( ε, ). Remark Notice that in this proof we did not make any hypothesis on abnormal extremals. In Section we are going to see that for 2D almost-riemannian structures there are no non trivial abnormal extremals. Proof of Proposition Following the notation of the proof of Lemma 9.16 let us take (t,α) as a system of coordinates on U and define the vector field F 1 by F 1 (t,α) = E(t,α). (9.6) t 229

230 Notice that, by construction, for every q U the vector X(q ) belongs to D(q ) and F 1 (q ) = 1. In the coordinates (t,α) we have F 1 = (1,) and by construction its integral curves are normal Pontryagin extremals. Let F 2 be a vector field on U such that (F 1,F 2 ) is an orthonormal frame for the 2D almost-riemannian structure in U. We claim that the first component of F 2 is identically equal to zero. Indeed, were this not the case, the norm of F 1 would not be equal to one. We are left to prove B. We have ( F 3 := [F 1,F 2 ] = x1 f(x 1,x 2 ) and beside (9.7), the only brackets among F 1,F 2 and F 3 that could be different from zero are of the form ( ) [F 3,...,[F 3,F 1 ],F 1 ] = }{{} r. x 1 f(x 1,x 2 ) r times Hence if the structure has step s at q we have r x 1 f = for r = 1,2,...,s 2 and s 1 x 1 f. The form (9.2) is very useful to express the Riemannian quantities on M \Z. Indeed one has Lemma Assume that on an open set Ω M a system of coordinates (x 1,x 2 ) is fixed and an orthonormal frame for the 2D-almost-Riemannian is given in the form (9.2). Then on Ω (M \Z) the Riemannian metric, the element of Riemannian area and the Gaussian curvatures are given by g (x1,x 2 ) = da (x1,x 2 ) = ( 1 1 f(x 1,x 2 ) 2 ) ) (9.7), (9.8) 1 f(x 1,x 2 ) dx 1dx 2, (9.9) K(x 1,x 2 ) = f(x 1,x 2 ) 2 x 1 f(x 1,x 2 ) 2( x1 f(x 1,x 2 )) 2 f(x 1,x 2 ) 2. (9.1) Proof. Formula (9.8) is a direct consequence of (9.1). Formula (9.9) comes from the definition of the Riemannian area da(f 1,F 2 ) = 1 where {F 1,F 2 } is a local orthonormal frame. Formula (9.1) comes from the formula K(q) = α 2 1 α 2 2 +F 1 α 2 F 2 α 1 where α 1 and α 2 are the two functions defined by [F 1,F 2 ] = α 1 F 1 +α 2 F 2 (see Corollary 4.41). Hence in a 2D-almost-Riemannian structure all Riemannian quantities explodes while approaching to Z How big is the singular set? A natural question is how big could be the singular set. The answer is given by the following Lemma. Theorem Consider a system of coordinates (x 1,x 2 ) defined on an open set Ω and let dx 1 dx 2 be the corresponding Lebesgue measure. Then Z Ω has zero dx 1 dx 2 -measure. 23

231 Proof. Without loss of generality we can assume that Ω has the following properties: it is the product of two non-empty intervals: Ω = (x A 1,x B 1 ) (x A 2,x B 2 ), on Ω we have an orthonormal frame of the form ( ) ( 1 F 1 (q) =, F 2 = on Ω the step of the structure is s N. f(x 1,x 2 ) ), (9.11) If some of the properties above are not satisfied, one can prove the theorem on a countable union of sets where the properties above hold. Let 1 Z : Ω {,1} be the characteristic function of Z. Using Fubini theorem, ( x B 2 x B 1 dx 1 dx 2 = 1 Z (x 1,x 2 )dx 1 dx 2 = 1 Z (x 1,x 2 )dx 1 )dx 2. Z Ω Ω We now prove that for every fixed x 2 (x A 2,xB 2 ), we have x B 1 1 x A Z (x 1, x 2 )dx 1 = from which the 1 conclusion of the theorem follows. Indeed B. of Proposition 9.14 guarantees that there exists r s 1 such that x r 1 f(x 1, x 2 ) for every x 1 (x A 1,xB 1 ). Hence f(, x 2) has only isolated zeros and x B 1 x A 1 x A 2 x A 1 1 Z (x 1, x 2 )dx 1 =. Exercise 9.2. Show that from the proof of Theorem 9.19 it follows that the singular set is locally the countable union of zero- and one-dimensional manifolds and hence that it is rectifiable Genuinely 2D-almost-Riemannian structures have always infinite area Theorem Let Ω be a bounded open set such that Ω Z. Then diam(ω) and da = where diam(ω) is the diameter of Ω computed with respect to the almost-riemannian distance and da is the Riemannian area associated to the almost-riemannian structure on Ω\ Z. Proof. Take a a point q Ω\Z and a system of coordinates (x 1,x 2 ) on a neighborhood Ω Ω of q. Expanding f in Taylor series, we have Ω\Z According to (9.9), the (almost-riemannian) area of Ω is Ω 1 f(x 1,x 2 ) dx 1dx 2. f(x 1,x 2 ) = a 1 x 1 +a 2 x 2 +O(x 2 1 +x 2 2). (9.12) But the inverse of a function of the form (9.12) is never integrable around the origin in the plane. 231

232 9.1.3 Normal Pontryagin extremals Since 2D almost Riemannian structures are particular cases of sub-riemannian structures, there are two kind of candidate optimal trajectories: normal and abnormal extremals. Normal extremals are geodesics while abnormal extremals could or could not be geodesics. An important fact is the following. Theorem For a 2D-almost-Riemannian strucutre, all abnormal extremal are trivial. Moreover a trivial trajectory γ : [a,b] M, γ(t) = q is the projection of an abnormal extremal if and only if q Z. Proof. It is immediate to verify that if γ(t) = q Z for every t [a,b] then γ admits an abnormal lift. Let γ : [a,b] M, (a < b) be the projection of an abnormal extremal and let us prove that γ([a,b]) = q for some q Z. Let us firstprovethat γ([a,b]) Z. By contradiction assumethat thereexists t ]a,b[ suchthat γ( t) / Z. By continuity there exists a non trivial interval [c,d] ]a,b[ such that γ([c,d]) Z =. Then γ [c,d] is a Riemannian geodesic and hence cannot be abnormal. Recall that if an arc of a geodesic is not abnormal, then the geodesic if not abnormal too, hence it follows that γ is not abnormal. This contradicts the hypothesis that γ is the projection of an abnormal extremal. Let us fix a local system of coordinates such that an orthonormal frame is given in the form (9.2). If this is not possible globally on a neighborhood of γ([a,b]), one can repeat the proof on different coordinate charts. Let us write in coordinates γ(t) = (γ 1 (t),γ 2 (t)). We have different cases. If (γ 1 (t),γ 2 (t)) = (c 1,c 2 ) for every t [a,b] we already know that γ admits an abnormal lift. If γ 1 is not constant and γ 2 = c in [a,b], then γ 2 = in [a,b] and Z contains a set of the type Z = {(x 1,c) x 1 [x A 1,x B 1 ]} with x A 1 < x B 1. Hence f = on Z. It follows that r x 1 f = on Z for every r = 1,2,... As in the proof of Theorem 9.19 it follows that all brackets between F 1 and F 2 are zero on Z and that the bracket generating condition is violated. Hence this case is not possible. There exists t ]a,b[ such that γ 2 ( t) is defined and γ 2 ( t). Now since ( ) v γ( t) = 1, v 2 f(γ( t)) for some v 1,v 2 R, we have f(γ( t)) and hence γ( t) / Z violating the condition γ([a,b]) Z for an abnormal extremal. Hence also this case is not possible. Hence all non-trivial geodesics are normal and are projection on M of the solution of the Hamiltonian system whose Hamiltonian is (cf. (4.31)) H : T M R, H(λ) = max ( λ,f(q,u) 12 ) u U u 2, q = π(λ). (9.13) q 232

233 Locally, if an orthonormal frame {F 1,F 2 } is assigned, we have H(λ) = 1 ( λ,f 1 (q) 2 + λ,f 2 (q) 2). 2 For a system of coordinates and a choice of an orthonormal frame as those of Proposition 9.14, we have H(x 1,x 2,p 1,p 2 ) = 1 ( p p 2 2 f(x 1,x 2 ) 2). (9.14) As a consequence of the fact that all geodesics are projections of solutions of a smooth Hamiltonian system and that our structure is Riemannian on M \Z, we have Proposition In 2D almost-riemannian geometry all geodesics are smooth and they coincide with Riemannian geodesics on M \Z. The only particular property of geodesics in almost-riemannian geometry is that on the singular set their velocity is constrained to belong to the distribution (otherwise their length could not be finite). All this is illustrated in the next section for the Grushin plane. 9.2 The Grushin plane The Grushin plane is the simplest example of genuinely almost-riemannian structure. It is the free almost-riemannain structure on R 2 for which a global orthonormal frame is given by ( ) ( ) 1 F 1 =, F 2 = In the sense of Definition 9.1, it can be seen as the pair (U,f) where U = R 2 R 2 and f((x 1,x 2 ),(u 1,u 2 )) = ((x 1,x 2 ),(u 1,u 2 x 1 )). Here the singular set Z is the x 2 -axis and on R 2 \Z the Riemannian metric, the Riemannian area and the Gaussian curvature are given respectively by: g = ( 1 1 x 2 1 ) x 1, da = 1 x 1 dx 1dx 2, K = 2 x 2. (9.15) 1 Notice that the (almost-riemannian) area of an open set intersecting the x 2 -axis is always infinite Normal Pontryagin extremals of the Grushin plane In this section we recall how to compute the normal Pontryagin extremals for the Grushin plane, with the purpose of stressing that they can cross the singular set with no singularities. In this case the Hamiltonian (9.14) is given by and the corresponding Hamiltonian equations are: H(x 1,x 2,p 1,p 2 ) = 1 2 (p2 1 +x 2 1p 2 2) (9.16) ẋ 1 = p 1, ṗ 1 = x 1 p 2 2 ẋ 2 = x 2 1p 2, ṗ 2 = (9.17) 233

234 Figure 9.2: Normal Pontryagin extremals and the front for the Grushin plane, starting from the singular set. Normal Pontryagin extremals parameterized by arclength are projections on the (x 1,x 2 ) plane of solutions of these equations, lying on the level set H = 1/2. We study the normal Pontryagin extremals starting from: i) a point on Z, e.g. (,); ii) an ordinary point, e.g. ( 1,). Case (x 1 (),x 2 ()) = (,) In this case the condition H(x 1 (),x 2 (),p 1 (),p 2 ()) = 1/2 implies that we have two families of normal Pontryagin extremals corresponding respectively to p 1 () = ±1, p 2 () =: a R. Their expression can be easily obtained and it is given by: { x1 (t) = ±t, x 2 (t) = if a = x 1 (t) = ± sin(at) a, x 2 (t) = 2at sin(2at) (9.18) if a 4a 2 Some normal Pontryagin extremals are plotted in Figure 9.2 together with the front, i.e., the end point of all normal Pontryagin extremals at time t = 1. Notice that normal Pontryagin extremals start horizontally. The particular form of the front shows the presence of a conjugate locus accumulating to the origin. Case (x 1 (),x 2 ()) = ( 1,) InthiscasetheconditionH(x 1 (),x 2 (),p 1 (),p 2 ()) = 1/2becomesp 2 1 +p2 2 = 1anditisconvenient to set p 1 = cos(θ),p 2 = sin(θ), θ S 1. The expression of the normal Pontryagin extremals is given by: x 1 (t) = t 1, x 2 (t) =, if θ = x 1 (t) = t 1, x 2 (t) =, if θ = π x 1 (t) = sin(θ tsin(θ)), sin(θ) sin(2θ 2tsin(θ)) 2t 2cos(θ)+ sin(θ) x 2 (t) = 4sin(θ) if θ / {,π} 234

235 Some normal Pontryagin extremals are plotted in Figure 9.3 together with the front at time t = 4.8. Notice that normal Pontryagin extremals pass horizontally through Z, with no singularities. The particular form of the front shows the presence of a conjugate locus. Normal Pontryagin extremals can have conjugate times only after intersecting Z. Before it is impossible since they are Riemannian and the curvature is negative Figure 9.3: Normal Pontryagin extremals and the front for the Grushin plane, starting from a Riemannian point. 9.3 Riemannian, Grushin and Martinet points In 2D almost-riemannian structures there are 3 kind of important points, namely Riemannian, Grushin and Martinet points. As we are going to see in Section 9.4, these points are important in thesensethat if a system hasonly this typeof points then thisis truealso after a small perturbation of the system. Moreover arbitrarily close to any system there is a system where only these points are present. First we study under which conditions Z has the structure of a 1D manifold. To this purpose we are going to study Z as the set of zeros of a function. Definition Let {F 1,F 2 } be a local orthonormal frame on an open set Ω and let ω be a volume form on Ω. On Ω define the function Φ = ω(f 1,F 2 ). Exercise Prove that Φ is invariant by a positive oriented change of orthonormal frame defined on the same open set Ω. Sinceavolume formcan beglobally definedwhenm isorientable wehave that Φcan beglobally defined on fully orientable 2D almost-riemannian structures (cf. Definition 9.3), just defining it as above on positive oriented orthonormal frames. 235

236 For structure that are not fully orientable, Φ can be defined only locally and up to a sign. (notice however that Φ is always well defined). This is what should be taken in mind every time that the function Φ appears in the following. If in a system of coordinates (x 1,x 2 ), we write then F 1 = ( F 1 1 F 2 1 ), F 2 = ( F 1 2 F 2 2 Φ(x 1,x 2 ) = h(x 1,x 2 )det ( F 1 1 F 1 2 F 2 1 F 2 2 ), ω(x 1,x 2 ) = h(x 1,x 2 )dx 1 dx 2 ). (x 1,x 2 ) Remark For a system of coordinates and a choice of an orthonormal frame as those of Proposition 9.14, and taking ω = dx 1 dx 2, we have Φ(x 1,x 2 ) = f(x 1,x 2 ). The function Φ permits to write, Z = {q M Φ(q) = }. We are now going to consider the following assumptions H q If Φ(q ) = then dφ(q ). H The condition H q holds for every q M. Exercise Prove that the conditions above do not depend on the choice of the volume form ω. By definition of submanifold we have Proposition Assume that H holds. Then Z is a one dimensional embedded submanifold of M. As usual define D 1 = D, D i+1 = D i + [D i,d i ], i = 1,2,... We are now ready to define Riemannian, Grushin and Martinet points. 236

237 Z Z D(q) D(q) Grushin points Martinet point Figure 9.4: Grushin and Martinet points Definition Consider a 2D-almost Riemannian structure. Fix q M. If D 1 (q ) = T q M (equivalently if q / Z) we say that q is a Riemannian point. If D 1 (q ) T q M (equivalently if q Z), H q holds then if D 2 (q ) = T q M we say that q is a Grushin point. if D 2 (q ) T q M we say that q is a Martinet point. Remark 9.3. Hence under H every point is either a Riemannian or a Grushin or a Martinet point. Exercise By using the system of coordinate given by Proposition 9.14 prove the following: q is a Grushin point if and only if q Z and L v Φ(q ) for v D(q), v = 1. q is a Martinet point if and only if q Z, dφ(q ), and for v D(q ), v = 1, we have L v Φ(q ) =. The following proposition describes properties of Grushin and Martinet points (see Figure 9.4) Proposition We have the following: (i) Z is an embedded 1D manifold around Grushin or Martinet points; (ii) if q is a Grushin point then D(q ) is transversal to T q Z; (iii) if q is a Martinet point then D(q ) is parallel to T q Z; (iv) Martinet points are isolated. Proof. We use the system of coordinates and an orthonormal frame as the one given by Proposition 9.14, with q = (,), ( ) ( ) 1 F 1 =, F 2 =. f If we take ω = dx dy, we have Φ = f, dφ = ( x1 f, x2 f). To prove (i), it is sufficient to notice that by definition dφ at Grushin and Martinet points. To prove (ii), notice that D(q ) =span(f 1 (q )) = (1,) while T q Z =span{( x2 f(q ), x1 f(q ))} that are not parallel since x1 f(q ). 237

238 To prove (iii), notice that D(q ) =span(f 1 (q )) = (1,) while T q Z =span{( x2 f,)} since the condition D 2 (q ) T q M implies x1 f(q ) =. To prove (iv), simply observe that if Martinet points were accumulating at q then at that point we cold not have s 1 x 1 f, where s is the step of the structure at q. Examples All points on the x 2 -axis for the Grushin plane are Grushin points. The origin the following structure is the simplest example of Martinet point ( ) ( ) 1 F 1 =, F 2 = x 2 x 2. 1 The origin for the following example ( ) ( 1 F 1 = and F 2 = x 2 2 x2 1 is not a Martinet point since the condition dφ(,) is not satisfied. Outside the origin all points are either Riemannian or Grushin points, but at the origin Z is not a manifold. ), The x 2 -axis of the following example ( ) 1 F 1 = ( and F 2 = x 2 1 ), is not made by Grushin points since D 2 ((,x 2 )) T (,x2 )M and it is not made by Martinet points since dφ(,x 2 ) is not satisfied (althugh in this case Z is a manifold). In this case D((,x 2 )) is transversal to Z Normal forms* Proposition Let q be a Riemannian, Grushin or a Martinet point. There exists a neighborhood Ω of q and a system of coordinates (x 1,x 2 ) in Ω such that an orthonormal frame for the 2D-almost-Riemannian structure can be written in Ω as: (NF1) if q is a Riemannian point, then (NF2) if q is a Grushin point, then (NF3) if q is a Martinet point, then F 1 (x 1,x 2 ) = (1,), F 2 (x 1,x 2 ) = (,e φ(x 1,x 2 ) ), F 1 (x 1,x 2 ) = (1,), F 2 (x 1,x 2 ) = (,xe φ(x 1,x 2 ) ) F 1 (x 1,x 2 ) = (1,), F 2 (x 1,x 2 ) = (,(x 2 x s 1 1 ψ(x))e ξ(x 1,x 2 ) ), where φ, ξ and ψ are smooth real-valued functions such that φ(,x 2 ) = and ψ(). Moreover s 2 is an integer, that is the step of the structure at the Martinet point. Proof. To be written. 238

239 9.4 Generic 2D-almost-Riemannian structures Recall hypothesis H q and H: H q If Φ(q ) = then dφ(q ). H The condition H q holds for every q M. Recall the H is independent from the volume form used to define the function Φ. We have seen (cf. Remark 9.3) that under hypothesis H every point is either a Riemannian or a Grushin or a Martinet point. In this section we are going to prove that hypothesis H holds for most of the systems. More precisely we are going to prove that hypothesis H is generic in the following sense. Definition Fix a rank 2 Euclidean bundle U over a 2D compact manifold M. Let F be the set of all morphism of bundle from U to TM such that (U,f), f F is a 2D almost-riemannian structure. Endow F with the C 1 norm. We say that a subset of F is generic if it is open and dense in F. Theorem Under the same hypothesis of Definition 9.34, let F F the subset of morphisms satisfying H. Then F is generic. Remark In Theorem 9.35 we have assumed that M is compact. A similar result holds also in the case in which M is not compact. However, in the non compact case, one gets that F is a countable union of open and dense subsets of F and one should use a suitable topology (the Whitney one). In this book we have decided not to enter inside transversality theory and we have provided a statement that can be proved easily via the Sard lemma Proof of the genericity result Cover M with a finite number of compact coordinate neighborhood U i, i = 1...N, in such a way that an orthonormal frame for the 2-ARS in U i is given by ) (, G i (x i 1,xi 2 ) = F i (x i 1,xi 2 ) = ( 1 Let us consider the following hypothesis H i The condition H q holds for every q U i. f i (x 1,x 2 ) Proposition Let F i be the subset of F satisfying H i. Then F i is generic. ). (9.19) Once Proposition 9.37 is proved, the conclusion of Theorem 9.35 follows immediately. Indeed F i is open and dense in F and the open and dense set F := N i=1 F i is made by systems satisfying H in all M. Proof of Proposition Since the map that to (F i,g i ) associates Φ is continuous in the C 1 topology, a small perturbation of F i and G i will induce a small perturbation of Φ. Fixed q, condition H q is clearly openintheset of mapsfromu i to RfortheC 1 topology. Asaconsequence of the compactness of U i, condition H i is open as well. We are now going to prove that H i is dense. To this purpose we construct an arbitrarily small perturbation in the C 1 norm (F ε i,gε i ) of (F i,g i ) for which H i is satisfied. 239

240 Lemma For every ε R with ε small enough there exists a perturbation (Fi ε,gε i ) of (F i,g i ) such that Fi ε F i C 1 Cε, G ε i G i C 1 Cε (for some C > independent from ε) and on U i we have Φ ε := ω(fi ε,gε i ) = Φ+ε; Once Lemma 9.38 is proved, the density of F i follows easily. Indeed let now apply the Sard Lemma to the C function Φ in U i. We have that the set {c R such that there exists q U i such that Φ(q) = c and dφ(q) = } has measure zero. As a consequence, since Φ ε = Φ+ε, we have that the set {ε R such that there exists q U i such that Φ ε (q) = and dφ ε (q) = } has measure zero. It follows that, for almost every ε, condition H i is realized for (F ε i,gε i ). Proof of Lemma If in U i we write in coordinates ω = h i (x i 1,xi 2 )dxi 1 dxi 2, then Φ = ω(f i,g i ) = h i (x i 1,x i 2)f i (x i 1,x i 2). Consider now a perturbation G ε i of G i of the form G ε i(x i 1,x i 2) = ( f i (x i 1,xi 2 )+ ε h i (x i 1,xi 2 ) ). (9.2) and let us define Fi ε = F i. It follows that in U i, ( ) Φ ε = ω(fi ε,gε i ) = h i(x i 1,xi 2 ) f i (x i 1,xi 2 )+ ε h i (x i 1,xi 2 ) = h i (x i 1,xi 2 )f i(x i 1,xi 2 )+ε = Φ+ε. Notice that by construction G ε i is close to G i in the C 1 norm A Gauss-Bonnet theorem For an compact orientable 2D-Riemannian manifold, the Gauss-Bonnet theorem asserts that the integral of the curvature is a topological invariant that is the Euler characteristic of the manifold (see Section 1.3). This theorem admit an interesting generalization in the context of 2D almost-riemannian structures that are fully orientable. This generalization is not trivial since one needs to integrate the Gaussian curvature (that in general is diverging while approaching to the singular set) on the manifold (that has always infinite volume). This generalization holds under certain natural assumptions on the 2D almost-riemannian structure, namely we will assume HG : The base manifold M is compact. The 2D almost-riemannian structure is fully orientable, H holds and every point of Z is a Grushin point. 24

241 The hypotheses that the structure is fully orientable is crucial and it is the almost-riemannian version of the classical orientability hypothesis that one need in Riemannian geometry. The hypothesis H is the basic hypothesis to have a reasonable description of the asymptotics of K in a neighborhood of Z. The hypotesis that every point is a Grushin point is a technical hypothesis. A version of a Gauss Bonnet Theorem in presence of Martinet points can also be written, but is more technical and outside the purpose of this book. With an argument similar to the one of the beginning of Section 9.4.1, one get Theorem Hypothesis HG is open in the set of smooth map f : U TM endowed with C 1 topology: Clearly hypothesis HG is not dense since Martinet points do not disappear for small C 1 perturbations of the system. It is important to notice that HG is not empty. Indeed we have Lemma 9.4. Every oriented compact surface can be endowed with an oriented almost-riemannian structure satisfying the requirement that there are no Martinet points. We are going to prove Lemma 9.4 in Section Definition Consider a 2D almost-riemannian structure (U, f) over a 2D manifold M and assume that HG holds. Let ν a volume form for the Euclidean structure on U, i.e. a never vanishing 2-form s.t. ν(σ 1,σ 2 ) = 1 on every positive oriented local orthonormal frame for ( ) q. Let Ξ be an orientation on M. We define: The signed area form da s on M as the two-form on M \ Z given by the pushforward of ν along f. Notice that the Riemannian area da on M \ Z is the density associated to the volume form da s. M + = {q M \Z, s.t. the orientation given by da s q and Ξ q are the same }. 1 M = {q M \Z, s.t. the orientation given by da s q and Ξ q are opposite }. Notice that given a measurable function h : Ω M ± \Z R, we have h da s = ± h da (if it exists). (9.21) Ω Definition Under the same hypotheses of Definition 9.41, define Ω M ε = {q M d(q,z) > ε} where d(, ) is the 2D-almost-Riemannian structure on M. M ± ε = M ε M ± Given a measurable function h : M \Z R, we say that it is AR-integrable if lim ε exists and is finite. In this case we denote such a limit by hda s. Remark Notice that (9.22) is equivalent to ( h da lim ε 1 i.e. da s q(f 1,F 2) = αξ(f 1,F 2) with α > M + ε M ε h da s (9.22) M ε ) h da 241

242 Example: the Grushin sphere The Grushin sphere is the free 2D-almost Riemannian structure on the sphere S 2 = {y 2 1 +y2 2 +y2 3 = 1} for which an orthonormal frame is given by two orthogonal rotations for instance Y 1 = Y 2 = y 3 y 2 y 3 y 1 (rotation along the y 1 -axis) (9.23) (rotation along the y 2 -axis) (9.24) In this case Z = {y 3 =, y1 2 +y2 2 = 1}. Passing in spherical coordinates and letting y 1 = cos(x)cos(φ) y 2 = cos(x)sin(φ) y 3 = sin(x) we get that an orthonormal frame is given by X 1 = cos(φ π/2)y 1 +sin(φ π/2)y 2 X 2 = sin(ϕ π/2)y 1 +cos(φ π/2)y 2 ( X 1 = tan(x) ) ( 1, X 2 = Notice that the singularity at x = π/2 is due to the spherical coordinates. Instead Z = {x = }. In this case we have. da = The loci Z, M ±, are illustrated in Figure 9.5. The main result of this section is the following. ). 1 tan(x) dxdφ, da s = 1 2 dx dφ, K = tan(x) sin(x) 2 Theorem Consider a 2D-almost-Riemannian structure satisfying hypothesis HG. Let da s be the signed area form and K be the Riemannian curvature, both defined on M \Z. Then K is AR-integrable and we have KdA s = e(u) where e(u) denotes the Euler number of E. Moreover we have e(u) = χ(m + ) χ(m ) where χ(m ± ) denotes the Euler characteristic of M ±. 242

243 y 3 M + x φ y 2 y 1 M Z Figure 9.5: The Grushin sphere Notice that in the Riemannian case KdA s is the standard integral of the Riemannian curvature and e(u) = χ(m) since U = TM. Hence Theorem 9.44 contains the classical Gauss-Bonnet theorem. In a sense, in Riemannian geometry the topology of the surface gives a constraint on the total curvature, while in 2D almost-riemannian geometry such constraints is determined by the topology of the bundle U. For a free almost-riemannian structure we have that U is a rank 2 trivial bundle over M. As a consequence we get that KdA s =, generalizing what happens on the torus. We could interpret this result in the following way. Take a metric that is determined by a single pair of vector fields. In the Riemannian context we are constrained to be parallelizable (i.e. we are constrained to be on the torus). In the AR context, M could be any compact orientable manifolds, but the metric is constrained to be singular somehwere. In any case, the integral of the curvature will be zero Proof of Theorem 9.44* The proof is divided in two steps. First we prove that KdA s = χ(m + ) χ(m ). Then we prove that e(u) = χ(m + ) χ(m ) Step 1 As a consequence of the compactness of M and of Lemma 9.16 one has: Lemma Assume that HG holds. Then the set Z is the union of finitely many curves diffeomorphic to S 1. Moreover, there exists ε > such that, for every < ε < ε, we have that 243

244 b a ε ε a b M ε is smooth and the set M \M ε is diffeomorphic to Z [,1]. Under HG the almost-riemannian structure can be described, around each point of Z, by a normal form of type (NF2). Take ε as in the statement of Lemma For every ε (,ε ), let M ε ± = M ± M ε. By definition of da s and M ±, KdA s = KdA KdA. M ε Mε M + ε The Gauss-Bonnet formula asserts that for every compact oriented Riemannian manifold (N, g) with smooth boundary N, we have KdA+ k g ds = 2πχ(N), N N where K is the curvature of (N,g), da is the Riemannian density, k g is the geodesic curvature of N (whose orientation is induced by the one of N), and ds is the length element. Applying the Gauss-Bonnet formula to the Riemannian manifolds (M ε +,g) and (Mε,g) (whose boundary smoothness is guaranteed by Lemma 9.45), we have KdA s = 2π(χ(M ε + ) χ(mε )) k g ds+ k g ds. (9.25) M ε Thanks again to Lemma 9.45, χ(m ε ± ) = χ(m ± ). We are left to prove that ( ) k g ds k g ds =. (9.26) lim ε M + ε Fix q Z and a (NF2)-type local system of coordinates (x 1,x 2 ) in a neighborhood U q of q. We can assume that U q is given, in the coordinates (x 1,x 2 ), by a rectangle [ a,a] [ b,b], a,b >. Assume that ε < a. Notice that Z U q = {} [ b,b] and M ε U q = { ε,ε} [ b,b]. We are going to prove that M ε U q k g ds = O(ε). (9.27) 244 M ε M + ε M ε

245 Then (9.26) follows from the compactness of Z. (Indeed, { ε} [ b,b] and {ε} [ b,b], the horizontal edges of U q, arenormal Pontryagin extremals minimizing thelength from Z. Therefore, Z can be covered by a finite number of neighborhoods of type U q whose pairwise intersections have empty interior.) Without loss of generality, wecan assumethatm + U q = (,a] [ b,b]. Therefore, M + ε induces on M + ε = {ε} [ b,b] a downwards orientation (see Figure 9.5.1). Thecurve s c(s) = (ε,x 2 (s)) satisfying ċ(s) = F 2 (c(s)), c() = (ε,), is an oriented parametrization by arclength of M ε +, making a constant angle with F 1. Let (θ 1,θ 2 ) be the dual basis to (F 1,F 2 ) on U q M +, i.e., θ 1 = dx 1 and θ 2 = x 1 1 e φ(x 1,x 2 ) dx 2. According to [?, Corollary 3, p. 389, Vol. III], the geodesics curvature of M ε + at c(s) is equal to λ(ċ(s)), where λ Λ 1 (U q ) is the unique one-form satisfying A trivial computation shows that Thus, dθ 1 = λ θ 2, dθ 2 = λ θ 1. λ = x1 (x 1 1 e φ(x 1,x 2 ) )dx 2. k g (c(s)) = x1 (x 1 1 e φ(c(s)) )(dx 2 (F 2 ))(c(s)) = 1 ε + x 1 φ(ε,x 2 (s)). Denote by L 1 and L 2 the lengths of, respectively, {ε} [,b] and {ε} [ b,]. Then, L2 k g ds = k g (c(s))ds M ε + U q L 1 L2 ( ) 1 = L 1 ε + x 1 φ(ε,(s)) ds b ( ) 1 = ε + 1 x 1 φ(ε,x 2 ) εe φ(ε,x 2) dx 2, b where the last equality is obtained taking x 2 = x 2 ( s) as the new variable of integration. We reason similarly on Mε U q, on which Mε induces the upwards orientation. An orthonormal frame on M U q, oriented consistently with M, is given by (F 1, F 2 ), whose dual basis is (θ 1, θ 2 ). The same computations as above lead to b ( ) 1 k g ds = Mε U q b ε 1 x 1 φ( ε,x 2 ) εe φ( ε,x 2) dx 2. Define Then F(ε,x 2 ) = (1+ε x1 φ(ε,x 2 ))e φ(ε,x 2). (9.28) k g ds k g ds = 1 M ε + U q Mε U q ε 2 By Taylor expansion with respect to ε we get b b (F(ε,x 2 ) F( ε,x 2 )) dx 2. F(ε,x 2 ) F( ε,x 2 ) = 2 ε F(,x 2 )ε+o(ε 3 ) = O(ε 3 ) 245

246 X= zeros of X Y= zeros of Y A B singular locus where the last equality follows from the relation ε F(,x 2 ) = (see equation (9.28)). Therefore, k g ds k g ds = O(ε), M ε + U q Mε U q and (9.27) is proved. Step 2 The idea of the proof is to find a section σ of SE with isolated singularities p 1,...,p m such that m j=1 i(p j,σ) = χ(m + ) χ(m ) + τ(s). In the sequel, we consider Z to be oriented with the orientation induced by M +. To be finished Construction of trivializable 2-ARSs with no tangency points In this section we prove Lemma 9.4, by showing how to construct a trivializable 2-ARS with no tangency points on every compact orientable two-dimensional manifold. Without loss of generality we can assume M connected. For the torus, an example of such structure is provided by the standard Riemannian one. The case of a connected sum of two tori can be treated by gluing together two copies of the pair of vector fields F 1 and F 2 represented in Figure9.5.2A, whicharedefinedon atoruswithahole cutout. Inthefigurethetorusis represented as a square with the standard identifications on the boundary. The vector fields F 1 and F 2 are parallel on the boundary of the disk which has been cut out. Each vector field has exactly two zeros and the distribution spanned by F 1 and F 2 is transversal to the singular locus. Examples on the connected sum of three or more tori can be constructed similarly by induction. The resulting singular locus is represented in Figure 9.5.2B. We are left to check the existence of a trivializable 2-ARS with no tangency points on a sphere. A simple example can be found in the literature and arises from a model of control of quantum systems (see [BCC5, BCG + 2]). Let M be a sphere in R 3 centered at the origin and take 246

247 z Y Integral Curves of X X Y X Y X X Y X Y y Integral Curves of Y x F 1 (x,y,z) = (y, x,), F 2 (x,y,z) = (,z, y) as orthonormal frame. Then F 1 (respectively, F 2 ) is an infinitesimal rotation around the third (respectively, first) axis. The singular locus is therefore given by the intersection of the sphere with the plane {y = } and none of its points exhibit tangency (see Figure 9.5.2). Notice that hypothesis HG is satisfied. 247

248 248

249 Chapter 1 Nonholonomic tangent space In this chapter, for a point q M, the symbol Ω q denotes the set of smooth curves γ on M that are based at q, that is γ() = q. 1.1 Jet spaces Fix q in M and a curve γ Ω q. In every coordinate chart it is meaningful to write the Taylor expansion γ(t) = q + γ()t+o(t 2 ) (1.1) The tangent vector v T q M to γ at t = is by definition the equivalence class of curves in Ω q such that, in some coordinate chart, they have the same 1-st order Taylor polynomial. (This requirement indeed implies that the same is true for every coordinate chart, by the chain rule.) In the same spirit we can consider, given a smooth curve such that γ() = q, its m-th order Taylor polynomial at q γ(t) = q + γ()t+ γ() t γ(m) () tm m! +O(tm+1 ) (1.2) Exercise 1.1. Let γ,γ Ω q. We say that γ is (m-)equivalent to γ at q, and we write γ q,m γ, if their Taylor polynomial at q of order m in some coordinate chart coincide. Prove that q,m is a well-defined equivalence relation on the set of curves based at q. Definition 1.2. Let m > be an integer and q M. We define the set of m-th jets of curves at point q M as the equivalence classes of curves based at q with respect to q,m. We denote with Jq m γ the equivalence class of a curve γ and with J m q := {J m q γ : γ Ω q } Remark 1.3. From coordinates representation (1.2), one can prove that J m q is a smooth manifold and dimj m q = mn. Indeed the m-th order Taylor polynomial is characterized by the n-dimensional vectors γ (i) () for i=1,...,m (cf. (1.2)). In the following we always assume that q M is fixed together with a coordinate chart around q, where q =. The Taylor expansion of a curve γ Ω q is then written as follows J m q γ = m i=1 249 γ (i) () ti i!.

250 To better understand the structure of J m q as a smooth manifold we consider the map which forget about the m-th derivative Π m m 1 : Jm q J m 1 q m γ (i) () ti m 1 i! i=1 Proposition 1.4. Jq m is an affine bundle over Jq m 1 spaces over T q M. i=1 γ (i) () ti i! with projection Π m m 1, whose fibers are affine Proof. Fix an element j Jq m 1, then the fiber (Π m m 1 ) 1 (j) is the set of all m th -jets with fixed (m 1) th jet equal to j. To show that it is an affine space over T q M we should define the sum of a tangent vector and an m th -jet, with (m 1) th -jet fixed, having as a result another m th -jet with the same (m 1) th -jet. Let j = Jq mγ be the mth -jet of a smooth curve in M and let v T q M. Consider a smooth vector field V Vec(M) such that V(q) = v and define the sum J m q γ +v := Jm q (γv ), γ v (t) = e tmv (γ(t)) (1.3) It is easy to see that, due to the presence of the power t m, the (m 1) th Taylor polynomial of γ and γ v coincide. Indeed J m q (e tmv (γ(t))) = J m q γ +t m V(q) Hence the sum (1.3) gives to (Π m m 1 ) 1 (j) the structure of affine space over T q M. Indeed it is enough to check that the definition does not depend on the reoresentative. The geometric meaning of the fact that J m q is an affine bundle (and not an vector bundle) is that we cannot complete in a canonic way a (m 1) th -jet to a m th -jet, i.e. we cannot fix an origin in the fiber. On the other hand there exists a sort of global origin on J m q, that is the jet of the constant curve equal to q. Now we want to define dilations on jet spaces, analogously to homothety in Euclidean spaces. Since we have no vector space structure we have to find an appropriate notion Definition 1.5. Let α R and define γ α (t) := γ(αt) for every t R. Define the dilation of factor α on J m q as δ α : J m q J m q, δ α(j m q γ) = Jm q (γ α) One can check that this definition does not depend on the representative and, in coordinates, it is written as a quasi-homogeneous multiplication ( m ) δ α t i ξ i = i=1 m t i α i ξ i Next we extend the notion of jets also for vector fields. To start with we consider flows on the manifold. Definition 1.6. A flow on M is a smooth family of diffeomorphisms i=1 P = {P t Diff(M), t R}, P = Id 25

251 Notice that we do not reuire the family to be a one parametric group (i.e., the group law P t P s = P t+s is not satisfied) and this in general is carachterized as the flow of the nonautonomous vector field X t := d dε P t+ε Pt 1. ε= The set of all flows on M is a group with the point-wise product, i.e. the product of the flows P = {P t } and Q = {Q t } is given by (P Q) t := P t Q t Clearly we can act with a flow on a smooth curve on M as follows: (Pγ)(t) = P t (γ(t)). Moreover, since P = Id, every flow defines a map on Ω q. This action is well-behaved with respect to equivalence relations m,q, i.e., it defines a map on Jq m. Indeed if γ Ω q, then Pγ Ω q and from the chain rule it follws that Jq m (Pγ) depends only on first m derivatives of γ at q, i.e., on Jq m γ. Definition 1.7. Let P be a smooth flow on M. The action of P on J m q is defined by Pj := J m q (Pγ), if j = J m q γ. It can beeasily checked that the definition is well-posed and (P Q)j = P(Qj) for every j J m q. Jets of vector fields Given a vector field V Vec(M) we want to define its m th -jet Jq m V which should be naturally an element of Vec(Jq m ). LetusdenotewithP V = {e tv }the1-parametric groupdefinedbytheflowofv. Asweexplained we can act on jets P V : j e V (j) To act on a family of curves we need a family of flows, then let us consider the 1-parametric group of flows P s V = {estv } Definition 1.8. For every V Vec(M) we define the vector field Jq mv Vec(Jm q ) is the section Jq m V : Jq m TJq m defined as follows (Jq m V)(Jm q γ) := s PV s (Jm q γ) = s= s Jq m (etsv (γ(t))) (1.4) s= Exercise 1.9. Prove the following formula for every V Vec(M) (J m q V)(J m q γ) = m i=1 t i d i t= i! dt i (tv(γ(t))) where V is identified with a vector function V : R n R n in coordinates. To end this section we study the interplay between dilations and jets of vector fields. Since δ α is a map on Jq m its differential (δ α ) acts on elements of Vec(Jq m ), and in particular on jets of vector fields on M. Surprisingly, its action on these fields is linear with respect to α. 251

252 Proposition 1.1. For every α R and V Vec(M) one has (δ α ) (Jq m V) = Jq m (αv) = αjq m V. Proof. From the very definition of the differential of a map (see also Chapter 2) we have ((δ α ) Jq m V))(Jm q γ) = s Jq m (δ αe tsv δ 1/α (γ(t))) s= = s Jq m (δ α e tsv (γ(t/α))) s= = s Jq m (eαtsv (γ(t))) s= = Jq m (αv) = αjm q V 1.2 Admissible variations In this section we define the appropriate notion of tangent vector to a sub-riemannian manifold. Our goal is to define the tangent structure to a sub-riemannian one. As usual, we assume that the sub-riemannian structure is defined by the generating family {f 1,...,f m }. Admissible curves on M are maps γ : [,T] M such that there exists a control function u L such that γ(t) = f u(t) (γ(t)) = m u i (t)f i (γ(t)). i=1 To have a good definition of tangent vector we could not restrict to family of admissible curves, because in this way we lose all the information about directions that are not in the distribution. Indeed we want the tangent space to be a first order approximation of the structure, containing informations about all directions. We need a proper definition of tangent vector, that means a proper definition of variation of a point, in order to give a precise meaning to its principal term, that is going to be the tangent vector. We now introduce the notion of smooth admissible variation. Definition A curve γ : [,T] M in Ω q is said a smooth admissible variation if there exists a family of controls {u(t,s)} s [,τ] such that (i) u(t, ) is measurable and essentially bounded for all t [,T], (ii) u(,s) is smooth with bounded derivatives, for all s [,τ], (iii) u(,s) = for all s [,τ], (iv) γ(t) = q exp τ f u(t,s) ds 252

253 In other words γ is a smooth admissible variation (or shortly, admissible variation) it can be parametrized as the final point of a smooth family of admissible curves. We stress that an admissible variation is not an admissible curve, in general. Remark Recall that two distributions are said to be equivalent (see also Definition 3.3 and 3.17) if and only if the corresponding modulus of horizontal vector fields are isomorphic D D, where D = span{f(σ), σ smooth section of U}. is finitely generated by a basis f 1,...,f m. Let us show that the definition of admissible variation does not depend on the frame f 1,...,f m. By definition any admissible variation γ(t) is associated with a family q(t,s), for s [,τ] solution of m s q(t,s) = u i (t,s)f i (q(t,s)), (1.5) i=1 such that γ(t) = q(t,τ). Assume that f 1,..., f m is another set of local generators of the modulus. Then there exist functions a ij C (M) such that f i (q) = m a ij (q) f j (q), q M, i = 1,...,m. (1.6) j=1 and assume that γ is an admissible variation with respect to u(t,s), i.e., it satisfies (1.6). Now we prove that there exist a family ũ(t,s) of controls such that γ is an admissible variation in the new frame. From (1.6) we get m u i (t,s)f i (q) = i=1 m u i (t,s)a ij (q) f j (q) i,j=1 Then we could define, using the solution q(t,s) of (1.5), the new family of controls ũ j (t,s) = m u i (t,s)a ij (q(t,s)) i=1 and we see from identities above that m s q(t,s) = ũ j (t,s) f j (q(t,s)), s [,τ] (1.7) i=1 Assumption. From now on, we assume that the sub-riemannian structure is bracket generating at q with step m, i.e. D m q = T q M. Definition Let (M,U,f) be a sub-riemannian structure. The set of admissible jets with respect to the sub-riemannian structure is J f q := {J m q γ, γ Ω q is an admissible variation} 253

254 Example Consider two vector fields X,Y Vec(M) and the curve γ : [,T] M, γ(t) = e ty e tx e ty e tx (q) It is easily seen that γ is an admissible variation if we set where γ(t) = exp 4 f tv(s) (q)ds (1,), if s [,1], (,1), if s [1,2], v(s) = ( 1,), if s [2,3], (, 1), if s [3,4]. In coordinates we have expansion γ(t) = q +t 2 [X,Y](q)+o(t 2 ). Now we want to introduce the nonholonomic tangent space in a coordinate-free way. In the next section we will see how it can be described in some special set of coordinates. Definition The group of flows of admissible variations is { τ } P f := exp f u(t,s) ds, u(t,s) smooth variation Any admissible variation is given by γ(t) = P t (q) for some P P f, where we identify q with the constant curve. Remark P f is a group, indeed the following equality holds where exp τ1 f u(t,s) ds exp w(t,s) = τ2 f v(t,s) ds = exp { u(t,s), s τ 1, is the concatenation of controls. Then we have that τ1 +τ 2 v(t,s τ 1 ), τ 1 s τ 1 +τ 2. J f q = {Jm q (P(q)),P Pf } is exactly the orbit of q under the action of the group P f. f w(t,s) ds Thenonholonomic tangent spaceis thequotient ofp f with respectto theaction of thesubgroup of slow flows. Heuristically, a flow is slow if the first nonzero jet J i qγ of its associated trajectory γ belongs to a subspace D j, with j < i. Definition Let Q P f. Q is said to be a slow flow if it is associated to a smooth variation u(t,s) such that satisfies u(,s) = u (,s) =. t The subgroup of slow flows is the normal subgroup P f of Pf generated by slow flows, i.e. } P f {(P := t ) 1 Q t P t : P P f, Q slow flow 254

255 Remark Notice that, by definition of slow flow andthe linearity of f, a slow flow is associated with a family of control that can be written in the form u(t,s) = tv(t,s), where v(,s) =. Moreover we have P t = exp τ f u(t,s) ds = τ exp f tv(t,s) ds = τ exp tf v(t,s) ds = t τ exp f v(t,s) ds, In other words a slow flow Q P f is of the form Q t = tp t for some P P f. Exercise Let j = J m q γ and j = J m q γ for some γ,γ Ω q. Prove that J m q γ J m q γ, if γ (t) = P t (γ(t)) (1.8) for some P P f is a well defined equivalence relation on Jf q. Definition 1.2. The nonholonomic tangent space T f q is defined as where is the equivalence relation (1.8). T f q := Jf q / Proposition Let X D be an horizontal vector field for the sub-riemannian structure on M. The action of the one parametric group e tx on J f q defined by (1.4) passes to the quotient with respect to the equivalence classes with respect to. Proof. From thevery definitionof Jq f it is easy to seethat if Jq m γ is thejetof anadmissiblevariation then the right hand side of (1.4) is an admissible variation for every s. We are left to show that if γ(t) γ (t) = e tx γ(t) e tx γ (t). From our assumption we get γ (t) = γ(t) Q t for a slow flow Q P f. It follows that γ (t) e tx = γ(t) Q t e tx = γ(t) e tx e tx Q t e tx = (γ(t) e tx ) Q t where Q t := e tx Q t e tx is also a slow flow. This shows that e tx is independent on the representative and its action is well defined on the quotient. 1.3 Nilpotent approximation and privileged coordinates In this section we want to introduce some coordinates in which we have a good description of the nonholonomic tangent space. Consider some non negative integers k 1,...,k m such that n = k k m and the splitting where every x i = (x 1 i,...,xk i i ) Rk i. R n = R k 1... R km, x = (x 1,...,x m ) 255

256 ThespaceDer(R n ) ofall differential operatorsinr n withsmoothcoefficients formanassociative algebra with composition of operators as multiplication. The differential operators with polynomial coefficients form a subalgebra of this algebra with generators 1,x j i,, where i = 1,...,m; j = x j i 1,...,k i. We define weights of generators as ( ) ν(1) =, ν(x j i ) = i, ν = i. Then for any monomial x j i ν (y β ) 1 y α = z 1 z β α β ν(y i ) ν(z j ). i=1 j=1 We say that a polynomial differential operator D is homogeneous if it is a sum of monomial terms all of same weight. We stress that this definition depends on the coordinate set and the choice of the weights. Lemma Let D 1,D 2 be two homogeneous differential operators. Then D 1 D 2 is homogeneous and ν(d 1 D 2 ) = ν(d 1 )+ν(d 2 ) (1.9) Proof. It is sufficent to check formula (1.9) for monomials of kind D 1 = and D x j 1 2 = x j 2 i2. This i1 follows from the identity x j x j2 1 i2 = x j 2 i2 i1 x j + x j 2 i 2 1 i1 x j 1 i1 A special case is when we consider, as differential operators, vector fields. Corollary If V 1,V 2 Vec(R n ) are homogeneous vector fields then [V 1,V 2 ] is homogeneous and ν([v 1,V 2 ]) = ν(v 1 )+ν(v 2 ). With these properties we can define a filtration in the space of all smooth differential operators Indeed we can write (in multiindex notation) D = α ϕ α (x) α x α ConsideringtheTaylorexpansionatofeverycoefficientwecansplitDasasumofitshomogeneous components D D (i) and define the filtration i= D (h) = {D Der(R n ) : D (i) =, i < h}, h Z 256

257 It is easy to see that it is a decreasing filtration, i.e. D (h) D (h 1) for every h, and if we restrict our attention to vector fields we get V Vec(R n ) V (i) =, i < m Indeed every monomial of a N th -order differential operator has weight not smaller than mn). In other words we have (i) Vec(R n ) D ( m), (ii) V Vec(R n ) D () implies V() =. and every vector field that is not zero at the origin is at least in D ( 1). This motivates the following definition Definition A system of coordinates near the point q is said linearly adapted to the flag D 1 q D2 q... Dm q if D i q = R k 1... R k i, i = 1,...,m. (1.1) A system of coordinates near the point q is said privileged if it is linearly adapted to the flag and f D ( 1) for every f D. Notice that condition (i) can always be satisfied after a suitable linear change of coordinates. Condition (ii) says that each horizontal vector field has no homogeneous component of degree less than 1. Example We analyze the meaning of privileged coordinates in the basic cases m = 1,2 and then for m = 3 we show that in general not all system of linearly adapted coordinates are privileged. (1) If m = 1 all sets of coordinates are privileged because Vec(M) D ( 1) since ν( xi ) = 1 for all i. (2) If m = 2 then all systems of coordinates that are linearly adapted to the flag are privileged. Indeed, since ν( x j) = 1 and ν( 1 x j) = 2, a vector field belonging to D ( 2) \D ( 1) must 2 containamonomialvectorfieldofthekind x j, withconstantcoefficients. Ontheotherhanda 2 vector field f D cannot contain such a monomial since, by our assumption f() D 1 = Rk 1. (3) Let us consider the following set of vector fields in R 3 = R R R f 1 = x1 +x 1 x3, f 2 = x 1 x2, f 3 = x 2 x3 and set ν(x i ) = i for i = 1,2,3. The nontrivial commutators between these vector fields are [f 1,f 2 ] = x2, [f 2,f 3 ] = x 1 x3, [[f 1,f 2 ],f 3 ] = x3. Then the flag (computed at x = ) is given by D 1 = span{ x1 }, D 2 = span{ x1, x2 }, D 3 = span{ x1, x2, x3 }. These coordinates are then linearly adapted to the flag but they are not privileged since ν(x 1 x3 ) = 2 and f 1 D ( 2) \D ( 1). 257

258 Theorem Let M be a sub-riemannian manifold and q M. There always exists a system of privileged coordinates around q. We postpone the proof of this theorem to the end of this section, after having analyzed in more detail the structure of privileged coordinates. Theorem Let M be a sub-riemannian manifold and q M. In privileged coordinates we have the following (i) J f q = { m i=1 ti ξ i,ξ i D i q } and dimjf q = mk 1 +(m 1)k k m. (ii) Let j 1,j 2 J f q. Then j 1 j 2 if and only if j 1 j 2 = m i=1 ti η i, where η i D i 1 q. First part of proof of Theorem We start by proving the inclusion Jq f { m i=1 ti ξ i,ξ i Dq i}. For any smooth variation γ(t) we can write Taylor expansion leads to γ(t) = q + i j=1 s j... s 1 s γ(t) = q exp τ f u(t,s) ds q f u(t,s1 )... f u(t,sj ) ds 1...ds j +O(t i+1 ) Indeed using the fact that f is linear in u, we can factor out t from every term since u(,s) =. If we want compute our curve in privileged coordinates (to compute weights) it is sufficient to apply all to the coordinate function. In particular, since by definition of privileged coordinates f u D ( 1) for each u, we have that f u(t,s1 )... f u(t,sj ) D ( j) and applying to a coordinate function x β α, where α = 1,...,m and β = 1,...,k α we have f u(t,s1 )... f u(t,sj )x β α D( j+α) because ν(x β α) = α. Then, if α > i we have that this function has positive weight. Thus, when evaluated at x = it is zero. In other words we proved that, for every i = 1,...,m, up to the i th -term we can find only element in D i q. To prove the converse inclusion we have to show that, given some elements ξ i D i q we can find a smooth variation that has these vectors as elements of its jet. We start with some preliminary lemmas. Lemma Let m,n be two integers. Assume that we have two flows such that P t = Id+Vt n +O(t n+1 ) Q t = Id+Wt m +O(t m+1 ) in the operator sense. Then P t Q t P 1 t Q 1 t = Id+[V,W]t n+m +O(t n+m+1 ). 258

259 Exercise Assume that the flow P t satisfies P t = Id + Vt n + O(t n+1 ). Show that the nonautonomous vector field V t associated to P t satisfies V t = nt n 1 V +O(t n ). Proof. Define R(t,s) := P t Q s Pt 1 Q 1 s. Since P = Q = Id we have that R satisfies R(,s) = R(t,) = Id for every t,s R. Hence the only derivative that enter in our expansion, that coincide with F(t) = R(t,t), are mixed derivatives. This remark let us to expand the product P t Q s Pt 1 Q 1 s and keep only terms with mixed power of t and s in the expansion. Using that for the inverse flow we have the expansions one gets P 1 t = Id t n V +O(t n+1 ), Q 1 t = Id t m W +O(t m+1 ). (Id+t n V +O(t n+1 ))(Id+s m W+O(s m+1 ))(Id t n V +O(t n+1 ))(Id s m W +O(s m+1 )) = and the lemma is proved. = Id+t n s m (VW WV)+O(t n+m+1 ) = Id+t n s m [V,W]+O(t n+m+1 ) Lemma 1.3. For all l h and i 1,...,i h {1,...,k}, there exists an admissible variation u(t,s) such that q τ exp f u(t,s) ds = q +t l [f i1,...,[f ih 1,f ih ]](q)+o(t l+1 ). (1.11) Proof. We prove the lemma by induction on h - l 1 and i = 1,...,k we have to show that there exists an admissible variation u(t,s) such that q τ exp f u(t,s) ds = q +t l f i (q)+o(t l+1 ) To this aim, it is sufficient to consider a control u = (u 1,...,u k ) where u i = t l and u j = for all j i. - l 2 and i,j = 1,...,k, we have to show that there exists an admissible variation u(t,s) such that q τ exp f u(t,s) ds = q +t l [f i,f j ](q)+o(t l+1 ) To this aim, it is sufficient to use the previous lemma where P t and Q t are flows respectively of nonautonomous vector fields V t = t l 1 f i1 and W t = tf i2. With analogous arguments we can prove by induction the lemma. In other words we proved that every bracket monomial of degree i can be presented as the i-th term of a jet of some admissible variation. Now we prove that we can do the same for any linear combination of such monomials (recall that D i is the linear span of all i-th order brackets). Remark The previous construction of u(t, s) does not depend on the sub-riemannian structure but only on the structure of the Lie bracket. 259

260 Lemma Let π = π(f 1,...,f k ) a bracket polynomial of degree degπ l. There exists an admissible variation u(t, s) such that q τ exp f u(t,s) ds = q +t l π(f 1,...,f k )(q)+o(t l+1 ) Proof. Let π(f 1,...,f k ) = N j=1 V j(f 1,...,f k ) where V j are monomials. By our previous argument we can find u j (t,s),s [,τ j ] such that q τ exp f u j (t,s)ds = q +t l V j (f 1,...,f k )(q)+o(t l+1 ) Now consider the concatenation of controls u(t,s), where s [,τ] and τ = N j=1 τ j defined as follows ) j j j+1 u(t,s) = u (t,s j τ i, if τ i s < τ i, 1 j N. i=1 Exercise Complete the previous proof by showing that the flow associated with u has as main term in the Taylor expansion j V j at order l. Then prove, by using a time rescaling argument, that also any monomial of type αv for α R can be presented in this way. i=1 i=1 Second part of Theorem Now we can complete the proof of the first statemet of Theorem 1.27 proving the following inclusion { m i=1 ti ξ i,ξ i Dq} i Jq. f Let us consider a m-th jet j = m i=1 ti ξ i,ξ i Dq i. We prove the statement by steps: at i-th step we built an admissible variation whose i-th Taylor polynomial coincide with the one of j. - From Lemma?? there exists a smooth admissible variation γ(t) such that γ(t) = q τ exp f u(t,s) ds, γ(t) = ξ 1 Then we will have γ(t) = tξ 1 + t 2 η 2 + O(t 3 ) where η 2 D 2 from first part of the proof. In the second step we correct the second order term. - From Lemma?? there exists a smooth admissible variation γ 1 (t) such that γ 1 (t) = q τ exp f v(t,s) ds, γ 1 (t) = t 2 (ξ 2 η 2 )+O(t 3 ) Defining γ 2 (t) := γ 1 (t) γ(t) we have γ 2 (t) tξ 1 +t 2 η 2 +t 2 (ξ 2 η 2 )+t 3 η 3 tξ 1 +t 2 ξ 2 +t 3 η 3 where η 3 D 3. At every step we can correct the right term of the jet and after m steps we have the inclusion. 26

261 (ii) We have to prove that j j j j = m t i η i, i=1 η i D i 1 q. ( ). Assume that j j, where j = J m q γ = t i ξ i and j = J m q γ = t i ξ i. Then γ = γ Q t for some slow flow Q t P f of the form Q i t = Pt i exp Q t = Q 1 t Q h t τ f tv i (t,s)ds (P i t) 1 for some P i P f,i = 1,...,h. For simplicity we prove only the case h = 1. By formula (6.26) we have that Q t = P t τ exp f tv(t,s) ds Pt 1 = τ exp P t f tv(t,s) Pt 1 ds then by linearity of f we have Now recall that P t = exp τ Q t = τ exp tadp t f v(t,s) ds f w(t,θ)dθ for some admissible variation w(t,θ) and from (6.23) we get Q t = exp τ t exp Finally, if γ(t) = q exp τ f u(t,s)ds we can write γ (t) = q exp τ s f u(t,s) ds exp adf w(t,θ) dθ f v(t,s) ds τ t exp s adf w(t,θ) dθ f v(t,s) ds Expanding with respect to t we have Q t (Id +t t i V i ) = Id+ t i+1 V i where V i is a bracket polynomial of degree i. Due to the presence of t it is easy to see that in the expansion of γ we will find the same terms of γ plus something that belong to D i 1. ( ). Assume now that j = J m q γ = t i ξ i and j = J m q γ = t i ξ i, with j j = m t i η i, i=1 η i D i 1 q. We need to find a slow flow Q t such that γ = γ Q t. In other words it is sufficient to prove that we can realize with a slow flow every jet of type m i=1 ti η i, η i Dq i 1. To this purpose we can repeat arguments of proof of part (i), using the following Lemma Let P t,q t be two flows with P t P f and Q t P f (or P t P f and Q t P f ). Then P t Q t Pt 1 Q 1 t P f. Proof. If Q t P f then Q 1 t P f. Moreover from the definition of Pf we have that P tq t Pt 1 P f. Hence also their composition is in P f. 261

262 We have the following corollary of Theorem 1.27, part (i). Corollary In privileged coordinates (x 1,...,x m ) defined by the splitting R n = R k 1... R km we have tx 1 +O(t 2 ) Jq f t 2 x 2 +O(t 3 ) =. : x i R k i,i = 1,...,m t m x m Proof. Indeed we know that D i = R k 1... R k i and writing ξ i = x i, x i,i, x i,j R k j we have, expanding and collecting terms t i ξ i = tξ 1 +t 2 ξ t m ξ m = tx 1,1 +t 2 (x 2,1 +x 2,2 )+...+t m (x m, x m,m ) = (tx 1,1 +t 2 x 2, t m x m,1,t 2 x 2, t m x m,2,t m x m,m ) Corollary The nonholonomic tangent space Tq f is a smooth manifold of dimension dimtq f = m(q) i=1 k i(q). In privileged coordinates we can write tx 1 Tq f t 2 x 2 =. : x i R k i,i = 1,...,m t m x m and dilations δ α acts on T f q in a quasi-homogeneous way as follows δ α (tx 1,...,t m x m ) = (αtx 1,...,α m t m x m ), α >. Proof. It follows directly from the representation of the equivalence relation. Indeed two elements j and j can be written in coordinates as j = (tx 1 +O(t 2 ),t 2 x 2 +O(t 3 ),...,t m x m ) j = (ty 1 +O(t 2 ),t 2 y 2 +O(t 3 ),...,t m y m ) and j j if and only if x i = y i for all i = 1,...,m. Remark Notice that a polynomial differential operator homogeneous with respect to ν (i.e. whose monomials are all of same weight) is homogeneous with respect to dilations δ t : R n R n defined by δ t (x 1,...,x m ) = (tx 1,t 2 x 2,...,t m x m ), t >. (1.12) In particular for a homogeneous vector field X of weight h it holds δ X = t h X. 262

263 Now we can improve Proposition 1.21 and see that actually the jet of a horizontal vector field is a vector field on the tangent space and belongs to D ( 1) (in privileged coordinates). Lemma Fix a set of privileged coordinate. Let V D ( 1), then the jet J m q V is tangent to the submanifold Jq f. Moreover it is well defined as vector field V on the nonhonolomic tangent space. In other words V Vec(Tq f ) and we have v 1 (x) v 1 (x) v 2 (x) V =. = V v 2 (x) = (1.13). v m (x) v m (x) where v i is the term of order i 1 of v i. Proof. Let V D ( 1) and γ(t) be an admissible variation. When expressed in coordinates we have v 1 (x) tx 1 +O(t 2 ) v 2 (x) V =., γ(t) = t 2 x 2 +O(t 3 ). v m (x) t m x m We know that (Jq mv)(jm q γ) is expressed as the m-th jet of tv(γ(t)) by Exercise 1.9. Hence we compute tv 1 (tx 1 +O(t 2 ),...,t m x m ) (Jq m V)(Jm q γ) = tv 2 (tx 1 +O(t 2 ),...,t m x m ) (1.14). tv m (tx 1 +O(t 2 ),...,t m x m ) Notice that V D ( 1) means exactly that V = m i=1 v i (x) = v j i x (x) i x j, ν i ( x j i ) = i and v i is a function of order at least i 1. Let we denote with v i the homogeneous part of v i of order i 1. To compute the value of V then we have to restrict its action on admissible variations from T f q, then evaluate and neglect the higher order part (that corresponds to the projection on the factor space) in order to have v i (tx 1,...,t m x m ) = t i 1 v i (x 1,...,x m )+O(t i ) and using equality we have tv 1 (tx 1,...,t m x m ) (Jq m V) T tv 2 (tx 1,...,t m x m ) = f q. = tv m (tx 1,...,t m x m ) t v 1 +O(t 2 ) t 2 v 2 +O(t 3 ). t m v m +O(t m+1 ) (1.15) Then (1.13) follows. 263

264 Remark Notice that, since v i is a homogeneous function of weight i 1, it depends only on variables x 1,...,x i 1 of weight equal of smaller than its weight. Hence V has the following triangular form v 1 v 2 (x 1 ) V(x) = (1.16). v m (x 1,...,x m 1 ) Moreover theflowof avector fieldofthiskindcanbeeasily computedby astep bystepsubstitution Existence of privileged coordinates Now we prove existence of privileged coordinates Proof of Theorem Consider our sub-riemannian structure on M defined by the orthonormal frame {f 1,...,f k } and its flag D 1 q D 2 q... D m q = T q M, with n j := dimd j q (n j = k k j ) Let we consider a basis {V 1,...,V n } of the tangent space adapted to the flag, i.e. V i = π i (f 1,...,f k ) π i bracket polynomial, degπ i j if i n j D j q = span{v 1 (q),...,v nj (q)}, j = 1,...,m InparticularV 1,...,V n1 areselectedin{f i,i = 1,...,k}, V n1 +1,...,V n2 areselectedfrom{[f i,f j ],i,j = 1,...,k} and so on. Define the map Ψ : (s 1,...,s n ) q e s 1V 1... e snvn (1.17) We want to show that Ψ 1 defines privileged coordinates around q. It is easy to show that (1.17) is a local diffeomorphism since Hence it remains to show that (i) Ψ 1 (Dq) i = span{,..., }, s 1 s ni (ii) Ψ 1 f i D ( 1) for every i = 1,...,k Ψ s= s= = Ψ = V i (q), i = 1,...,n (1.18) s i s i Part (i) easily follows from our choice of adapted frame to the flag and (1.18). On the other hand the second part is not trivial since we need to compute differential of Ψ at every point and not only at s =. 264

265 Remark 1.4. In what follows we consider on T q M the weight defined by coordinates (y 1,...,y n ) induced by the flag. In other words we consider the basis V 1 (q),...,v n (q) in T q M and write v = (y 1,...,y n ) = y i V i (q), where ν(y i ) := w i = j if n j 1 < i n j Moreover we can think at v T q M as the constant vector field on T q M identically equal to v. In this way it makes sense to consider the value of a polynomial bracket at π(f 1,...,f k ) at the point q and consider its weight ν(π). We prove the following auxiliary Lemma Let X = π(f 1,...,f k )(q) Vec(T q M), ν(x) d. Consider now the polynomial vector field on T q M Y(y) = y il y i1 (adv il adv i1 X)(q) (1.19) = p i (y)v i (q) for some polynomial p i. Then p i D (w i d). Proof of Lemma. It easily follows from definition of weights that adv il adv i1 (X) D ( w ij d) hence every summand of (1.19) belong to D ( d). Then if we rewrite the sum in terms of the basis V i (q),i = 1,...,k we have that every coefficient p i (y) must belong to D (w i d), since ν(v i (q)) = w i. Now we prove the following claim: for every bracket polynomial X = π(f 1,...,f k ) we have Ψ 1 X D ( d). In particular part (ii) will follow when d = 1. Clearly we can write in coordinates Ψ 1 X = n i=1 a i (s) s i (1.2) and our claim is equivalent to show that a i D (wi d). First we notice that Ψ = s i ε q e s 1V 1 e (s i+ε)v i e snvn ε= = q e s 1V 1 e s iv i V i e s i+1v i+1 e snvn = q e s 1V 1 e snvn e snvn e s i+1v i+1 V }{{} i e s i+1v i+1 e snvn Ψ(s) In geometric notation we can write Ψ s i = e snvn e s i+1v i+1 V i Ψ(s) (1.21) Remember that, as operator on functions, e ty = e tady. This implies that in (1.21) we have a series of bracket polynomials. Apply Ψ to (1.2) we get X = Ψ(s) n a i (s)e snvn e s i+1v i+1 Ψ(s) V i i=1 265

266 Now we apply e s 1V 1 e snvn e s 1V 1 to both sides to compute the vector field at the point q n X = a i (s)e s 1V 1 q e snvn i=1 e s i 1V i 1 V i q (1.22) Rewriting this identity in coordinates b i (s)v i (q) = a i (s)(ϕ ij (s)v j (q)+v i (q)) (1.23) i i,j where ϕ ij () =. Indeed we split the zero order term since we know that for s = the pushforward of the vector fields is exactly V i. Using Lemma above with X and V i,i = 1,...,n we have b i D w i d, ϕ ij D w j w i On the other hand we can rewrite relation between coefficients as follows B(s) = A(s)(Φ(s)+I) where we denote B(s) = (b 1 (s),...,b n (s)), A(s) = (a 1 (s),...,a n (s)) and Φ(s) = (ϕ ij ) ij Thus we get and we can finish the proof noticing that and so on. Hence we get a i D w i d. A(s) = B(s)(I +Φ(s)) 1 = B(s)(I Φ(s)+Φ(s) 2...) = B(s) (BΦ)(s)+(BΦ 2 )(s)... (B) i = b i D w i d (BΦ) i = b j ϕ ji D w j d+(w i w j ) = D w i d. Remark One can repeat all calculation in chronological notation and recover the proof in a purely algebraic way. In the above computations nothing change if we consider any permutation σ = (i 1,...,i n ) of (1,...,n) and the coordinate map Ψ σ : (s 1,...,s n ) q e s in V in... e s i 1 V i1 In particular we can consider the coordinate map and it is easy to see that it satisfies Φ : (x 1,...,x n ) q e xnvn... e x 1V 1 Φ 1 V 1 = x1 = x 2 x1 = Φ 1 V 2 Φ 1 V i. (1.24) = x i x1 =...=x i 1 = for i = 1,...,n 1, the set of vector fields among f 1,...,f k that generates D q. 266

267 In Riemannian geometry the tangent space depends only on the dimension of the manifold (i.e. all tangent spaces to a n-dimensional manifold are isometric). Now we can prove that in sub-riemannian geometry this is not true. Indeed we see that, even in dimension 3, we can have non isometric tangent space, depending on the growth vector (n 1,...,n m ). In bigger dimension it is also possible to prove that, for a fixed growth vector, we have non isometric tangent space depending on the point on the manifold. Example (Heisenberg) Assume n = 3 and that growth vector is (2,3). Then we consider coordinates (x 1,x 2,x 3 ) and weights (w 1,w 2,w 3 ) = (1,1,2). We can assume that V 1 = f 1, V 2 = f 2, V 3 = [f 1,f 2 ] From last Remark we have that, in privileged coordinates we can assume f 1 = x1, f 2 = x2 +αx 1 x3, α R (1.25) because f i = xi + something that has weight 1 and depend only on xj,j > n 1. On the other hand from (1.24) we have [f 1,f 2 ] = x3 = α = 1 and we get the Heisenberg algebra f 1 = x1, f 2 = x2 +x 1 x3, f 3 = x3 (1.26) Example (Martinet) Assume n = 3 and that growth vector is (2,2,3). Then we consider coordinates (x 1,x 2,x 3 ) and weights (w 1,w 2,w 3 ) = (1,1,3). We can assume, up to change indices, that V 1 = f 1, V 2 = f 2, V 3 = [f 1,[f 1,f 2 ]] From last Remark we have that, in privileged coordinates we can write f 1 = x1, f 2 = x2 +(αx 2 1 +βx 1x 2 ) x3, α,β R (1.27) since we assume f 2 x1 = = x2 that implies f 2 = x2 +x 1 a(x) x3, but ν(f 2 ) = 1 and so (1.27) follows. From V 3 x= = x3 we have [f 1,[f 1,f 2 ]] = 2α x3 = α = 1/2. Moreover, since we are interested to normalize sub-riemannian structure and not only the pair of vector fields, we consider rotations of the orthonormal frame. Remark Notice that f 1 = cosθf 1 sinθf 2 f 2 = sinθf 1 +cosθf 2 = [ f 1, f 2 ] = [f 1,f 2 ]. 267

268 Thus, denoting as usual f u = u 1 f 1 +u 2 f 2 we can consider the linear map ϕ : u [f u,[f 1,f 2 ]]/D which vanish on some line on the plane D = span{f 1,f 2 }. Up to a rotation of the frame we can assume that f 2 kerϕ so that [f 2,[f 1,f 2 ]] =, hence β =. f 1 = x1, f 2 = x x2 1 x3, f 3 = x3 (1.28) 1.4 Geometric meaning In the previous section we very clearly found how V is analitically recovered from V. It is nothing else but the principal part of V in privileged coordinates. But now we want to discuss in which sense V is an approximation of V. It turns out that in this nonholonomic setting it plays the same role that linearization of a vector filed does in the Euclidean case. Lemma Let V a vector field. In privileged coordinates we have equality εδ1 ε V = V +εw ε, where W ε is smooth Proof. Write V = V +W and applying the dilation we find δ1 ε V = δ1 ε V +δ1 ε W Since V ishomogeneousofdegree 1wehaveδ1 V = 1 ε ε V andsettingw ε = εδ1 W wearedone. ε Remark Geometrically this procedure means that we consider a small neighborhood of the point q and we make a dilation. Then we properly rescale in order to catch the principal term. This is a blow-up procedure. Notice that we are blowing-up in a nonisotropic way and it contains information about local structure of the bracket. Now we can give a very precise meaning of the fact that nilpotent approximation is the principal part of the sub-riemannian structure, which knows local geometry near the point q. Let us consider the end point map E : U M, u( ) q exp 1 f u(t) dt where U = L k 2 (,1) = L2 ([,1],R k ) is the set of admissible controls. Let we denote by ρ the sub-riemannian distance from the fixed point From Lemma 1.46 we can write for ε > ρ(x) := d(x,q) = inf{ u,e(u) = x} (1.29) f ε u := εδ1 ε f u = f u +εw ε u 268

269 Denote now with f ε and f respectively the sub-riemannian structures on R n and by d ε and d the associated sub-riemannian distance. Notice that, from the very definition of d ε we have d ε (x,y) = 1 ε d(δ ε(x),δ ε (y)) that says d ε is d when we look infinitesimally near the point q and rescale. Let ρ ε, ρ and E ε,ê have analogous meaning. We start from an auxiliary proposition. Proposition E ε Ê uniformly on balls in Lk 2 (,1) (actually in C sense). Proof. Consider the solution x ε (t) and x(t) of the two systems based atq = x(t) = f u(t) ( x(t)), ẋ ε (t) = f ε u(t) (xε (t)) Using Lemma 1.46 we rewrite the second equation as ẋ ε (t) = f u(t) (x ε (t))+εw ε t (xε (t)) and standard estimates from ODE theory prove that x ε x. Notice that, since nilpotent vector fields are complete, the solution x(t) is defined for all t R. Lemma {ρ ε } ε> is an equicontinuous family. Proof. We will prove the following: for every compact K R n there exists ε,c >, depending on K, such that d ε (x,y) C x y 1/m, ε < ε, x,y K. (1.3) where m is the degree of nonholonomy. Notice that from (1.3) we get, using triangle inequality ρ ε (x) ρ ε (y) = d ε (,x) d ε (,y) d ε (x,y) C x y 1/m whichprovesthelemma. Wearethenreducedtoprove(1.3). Ideaistocoverafixedneighborhood of the origin using controls with bounded norms, uniformly in ε. Let V 1,..., V n an adapted basis of the nilpotent system f, such that V i = π i ( f 1,..., f k ) for some bracket polynomials π i,i = 1,...,n. From the very definition we have V 1 ()... V n () On the other hand, by continuity, this implies that they are linearly independent also in a small neighborhood of the origin and by quasi-homogeneity we get V 1 (x)... V n (x), x R n. Let Vi ε = π i (f1 ε,...,fε k ) denote vector fields defined by the same bracket polynomials but in terms of the vector fields of the approximating system. For every K R n there exists ε = ε (K) such that V1(x)... V ε n(x) ε, x K, ε ε. 269

270 Recall that by Lemma 1.32, given a bracket polynomial π i (g 1,...,g k ),degπ i = w i there exists an admissible variation u i (t,s), depending only on π i, such that exp 1 g ui (t,s)ds = Id+t w i π i (g 1,...,g k )+O(t w i+1 ) If we apply this lemma for g i = f ε i we find u i (t,s) such that exp 1 where w i = deg V i = degvi ε. Now consider the map Φ ε (t 1,...,t n,x) = x exp Remark 1.5. We have the expansion x exp f ε u i (t,s) ds = Id+tw i V ε i +O(t w i+1 ), ε > 1 1 f ε ds... exp u 1 (t 1/w 1 1,s) 1 wi+1 f ε ds = x+t u i (t 1/w i iv ε i,s) i (x)+o(t w i f ε ds (1.31) u n(t 1/wn n,s) i ) In particular this is a C 1 map with respect to t. Notice that it is not C 2 if w i > 1 for some i (i.e. a real subriemannian problem). From this remark it follows that Φ ε C 1 as a function of t, being a composition of C 1 maps. Moreover we get the expansion Φ ε (t 1,...,t n,x) = x+ n i=1 t i V ε i (x)+o( t ) = Φε t i t= = V ε i (x) Hence the map Φ ε is a local diffeomorphism near the origin t = (t 1,...,t n ) = and by Implicit Function Theorem there exists a constant c > such that x+cνb Φ ε (νb,x), B = B(,1) R n, x K, (1.32) where c is independent of ε and ν is small enough. Let us denote now with E x the end-point map based at the point x R n (with analogous meaning for E ε x,êx), and with B L 2 the unit ball in L k 2 [,1]. We claim that (1.32) implies that there exists a constant c such that x+c νb E ε x(ν 1 m BL 2), ν,ε > (1.33) Since t u i (t, ) is a smooth map for every i, and u i (, ) = we have that there exist a constant c i such that for all ν > small enough. t νb u i (t, ) c i νb L 2, (1.34) u i (t 1/w i, ) c i ν 1/w i B L 2, (1.35) 27

271 For such ν we have by inclusion (1.33) that x y cν = d ε (x,y) ν 1/m where we used the fact that d ε is the infimum of norm of u such that Ex(u) ε = y. From this easily follows d ε (x,y) c 1 1 m x y m (1.36) Remark All estimates are valid also for ε, i.e. for the nilpotent approximation. In particular, using homogeneity d(x,y) C x y 1 m, x,y R n (1.37) Indeed from the proof of Lemma 1.49 it follows that the estimate (1.37) holds in a compact K containing the origin. Consider two arbitrary points x,y R n and ε > such that δ ε x,δ ε y K. By the homogeneity of the distance Moreover since the estimate (1.37) holds in K We can state now the main result d(δ ε x,δ ε y) = ε d(x,y). d(δ ε x,δ ε y) C δ ε x δ ε y 1/m Cε x y 1/m Theorem ρ ε ρ uniformly on compacts in R n. Proof. By Lemma 1.49 it is sufficient to prove pointwise convergence. We prove the following inequalities lim supρ ε (x) ρ(x) liminf ε + ε ρε (x) (1.38) + (i) Fix a point x and a control û such that Ê(û) = x, û = ρ(x), i.e. such that the corresponding trajectory is a minimizer for the system f. Now consider x ε := E ε (û). From Proposition 1.48 we get x ε x for ε. Moreover, from the definition of ρ ε we have ρ ε (x ε ) ρ(x). Hence ρ ε (x) = ρ ε (x ε )+ρ ε (x) ρ ε (x ε ) ρ(x)+ ρ ε (x) ρ ε (x ε ) Using that ρ ε is an equicontinuous family and that x ε x we have the left inequality in (1.38). 271

272 (ii) Let now u ε be a control such that E ε (u ε ) = x, u ε = ρ ε (x) and define x ε := Ê(uε ). As before we have ρ(x ε ) ρ ε (x). Then ρ(x) = ρ(x ε )+ ρ(x) ρ(x ε ) ρ ε (x)+ ρ(x) ρ(x ε ) and now it is sufficient to notice that x ε = E ε (u ε ) Ê(uε ) = x since E ε Ê uniformly on balls of L 2 and u ε bounded since ρ ε are equicontinuous.. In privileged coordinates x = (x 1,...,x m ) R k 1... R km = R n we set Π ε = {x R n, x i ε i,i = 1,...,m} Corollary 1.53 (Ball-Box Theorem). There exists constants c 1,c 2 > such that c 1 Π ε B(x,ε) c 2 Π ε where B(x, ε) is the subriemannian ball in privileged coordinates. Notice that this is a weaker statement with respect to Theorem Exercise Prove Corollary Definition Let f and f be two sub-riemannian structures on the same manifold M. We say that the structures are locally Lipschitz equivalent if, for any compact K M there exist c 1,c 2 > such that c 1 d(x,y) d(x,y) c 2 d(x,y) where µ and µ are respectively the sub-riemannian distances induced by f and f. From the Ball-Box Theorem we easily get a characterization of locally Lipschitz equivalent structures in term of the distribution. Corollary Two sub-riemannian structures are locally Lipschitz equivalent if and only if the two flags are equal at al points, i.e. Dq i = D q i, q M, i 1. Corollary Two regular sub-riemannian structures are locally Lipschitz equivalent if and only if their distributions are equal at al points, i.e. D q = D q, q M. In other words, in the regular case, the distribution define the metric up to locally Lipschitz equivalence. Remark In the proof of Theorem 1.52 we showed that, in some coordinates, the sub- Riemannian metric has an holder estimate with respect to the Euclidean one. The fact that the metric is Lipschitz equivalent to the Euclidean one characterize exactly Riemannian structures on M. Moreover we notice that this is only local property since we do not study the behaviour of the constants c 1,c 2 when K become big. 272

273 1.5 Algebraic meaning In the last section we proved in which sense the sub-riemannian tangent space approximate the sub-riemannian structure on the manifold. Now we also show that, at least in the regular case, the nilpotent approximation has a structure of Lie group, endowed with a left-invariant sub-riemannian structure. Recall that given an orthonormal frame {f 1,...,f k } for the sub-riemannian structure, by Proposition 1.21 the vector field J m q f i, jet of a vector field on M, is a well defined vector field on the quotient T f q := J f q/, which we denote f i. Proposition The Lie algebra Lie{ f 1,..., f k } is a nilpotent Lie algebra of step m, where m is the nonholonomic degree of f at q. Proof. Consider privileged coordinates around the point q. Then f i has weight 1 and is homogeneous with respect to the dilation δ λ. Moreover for any bracket monomial we have ν([ f i1,...,[ f ij 1, f ij ]]) = j Since every vector field V, when written in privileged coordinates, satisfies ν(v) m, then every bracket of m vector fileds is necessarily zero. Consider now the group generated by the flows of these vector fields G = Gr{e t f 1,...,e t f k } which acts on T f q on the right, and is by definition a nilpotent Lie group. 1 Moreover in the proof of Theorem 1.27 we showed that this action is also transitive (i.e. we can realize every element of T f q with this action) Collecting together all these results we have Corollary 1.6. The nilpotent approximation T f q is a homogeneous space, diffeomorphic to the quotient G/G, where G is the isotropy group of the trivial element of T f q. Before interpreting this contruction at the level of Lie algebras, we recall some definitions. The free associative algebra on k generators x 1,...,x k is the associative algebra A k of linear combinations of words of its generators, where the product of two element is defined by juxtaposition. The free Lie algebra on k generators, denoted L k, is the algebra of Lie elements of A k where the product of two elements x,y is defined by the commutator [x,y] = xy yx. The nilpotent step m free Lie algebra on k generators x 1,...,x k, is the quotient of the free Lie algebra by the ideal I m+1 generated as follows: I 1 = L, and I j = [I j 1,L]. Let Lie m {X 1,...,X k } be the nilpotent step m free Lie algebra generated by the vector fields X 1,...,X k and consider the subalgebra C := {π Lie m {X 1,...,X k } π( f 1,..., f k )() = } of all polynomial bracket such that if we replace X i with f i are zero when evaluated at zero. Then LieT f q Lie m{x 1,...,X k }/C 1 A Lie group G is nilpotent if its Lie algebra g is nilpotent. The fact that G acts on the right is because right action satisfies R hg = R gr h (i.e. x (hg) = (x h) g). 273

274 Remark To discuss regularity properties of T f q with respect to q, we can restate this characterization in such a way that does not depend on the nilpotent approximation: where C q is the core subalgebra LieT f q Lie m {X 1,...,X k }/C q C q := {π Lie m {X 1,...,X k } π(f 1,...,f k )(q) D degπ 1 q } (1.39) Lemma Assume that the sub-riemannian structure has constant growth vector, i.e. that n i (q) = dimd i q does not depend on q. Then C q is an ideal. In particular T f q is a Lie group. Proof. It is sufficent to prove that X C q = [f i,x] C q, i = 1,...,k Since the structure has constant growth vector, we can consider an adapted basis V 1,...,V n, well defined in a neighborhood O q of q. In particular if X = π(f 1,...,f k ) is a bracket polynomial of degree degπ = d we can write X(q ) = a i (q )V i (q ), q O q i:w i d where a i are suitable smooth functions. From (1.39) we have that X C q if and only if it belongs to Dq d 1, i.e. a i (q) =, i s.t. w i = d. On the other hand [f i,x] = [f i, w j da j V j ] = w j da j [f i,v j ]+f i (a j )V j (1.4) From this equality it is easy to check that every coefficient of degree d+1 in this sum is null at q, since they can appear only in the first summand of (1.4). Corollary Under previuos assumptions f 1,..., f k are a basis of left-invariant vector fields on T f q. Proof. All relies on the fact that if we consider a left invariant vector field X on a Lie group G, and we consider the right action of a normal subgroup H on it, then X is a well defined left-invariant vector field on the quotient G/H, which is still a Lie group. 1.6 Normal forms for nilpotent approximation in low dimension* In this section we provide normal forms for the nilpotent approximation of regular sub-riemannian structures in dimension less or equal than 5. One can easily shows that in this case the only possibilities for growth vectors are: - dim(m) = 3: G(S) = (2,3), 274

275 - dim(m) = 4: G(S) = (2,3,4) or G(S) = (3,4), - dim(m) = 5: G(S) {(2,3,4,5),(2,3,5),(3,5),(3,4,5),(4,5)}. We have the following. Theorem Let S = (M,D,g) be a regular sub-riemannian manifold and Ŝq its nilpotent approximation near q. Up to a change of coordinates and a rotation of the orthonormal frame we have the following expression for the orthonormal frame of Ŝq: Case n = 3 G(S) = (2,3). (Heisenberg) X 1 = 1, X 2 = 2 +x 1 3. Case n = 4 G(S) = (2,3,4). (Engel) X 1 = 1, X 2 = 2 +x 1 3 +x 1 x 2 4. G(S) = (3, 4). (Quasi-Heisenberg) X 1 = 1, X 2 = 2 +x 1 4, X 3 = 3. Case n = 5 G(S) = (2,3,5). (Cartan) X 1 = 1, X 2 = 2 +x x x 1 x 2 5. G(S) = (2,3,4,5). (Goursat rank 2) X 1 = 1, X 2 = 2 +x x x G(S) = (3,5). (Corank 2) X 1 = x 2 4, X 2 = x x 3 5, X 3 = x

276 G(S) = (3,4,5). (Goursat rank 3) X 1 = x x 1x 2 5, X 2 = x x2 1 5, X 3 = 3. G(S) = (4, 5). (Bi-Heisenberg) X 1 = x 2 5, X 2 = x 1 5, X 3 = 3 α 2 x 4 5, α R, (1.41) X 4 = 4 + α 2 x 3 5. Proof. It is sufficient to find, for every such a structure, a basis of the Lie algebra such that the structural constants 2 are uniquely determined by the sub-riemannian structure. We give a sketch of the proof for the (2,3,4,5) and (3,4,5) and (4,5) cases. The other cases can be treated in a similar way. (i). Let Ŝ = (G,D,g) be a nilpotent (3,4,5) sub-riemannian structure. Since we deal with a left-invariant sub-riemannian structure, we can identify the distribution with its value at the identity of the group D id. Let {e 1,e 2,e 3 } be a basis for D id, as a vector subspace of the Lie algebra. By our assumption on the growth vector we know that dim span{[e 1,e 2 ],[e 1,e 3 ],[e 2,e 3 ]}/D id = 1. (1.42) In other words, we can consider the skew-simmetric mapping Φ(, ) := [, ]/D id : D id D id T id G/D id, (1.43) and condition (1.42) implies that there exists a one dimensional subspace in the kernel of this map. Let X 3 be a normalized vector in the kernel and consider its orthogonal subspace D D id with respect to the Euclidean product on D id. Fix an arbitrary orthonormal basis {X 1,X 2 } of D and set X 4 := [X 1,X 2 ]. It is easy to see that X 4 does not change if we rotate the base {X 1,X 2 } and there exists a choice of this frame, denoted { X 1, X 2 }, such that [ X 2, X 4 ] =. Then set X 5 := [ X 1, X 4 ]. Therefore we found a canonical basis for the Lie algebra that satisfies the following commutator relations: [ X 1, X 2 ] = X 4, [ X 1, X 4 ] = X 5, and all other commutators vanish. A standard application of the Campbell-Hausdorff formula gives the coordinate expression above. 2 Let X 1,...,X k be a basis of a Lie algebra g. The coefficients c l ij that satisfy [X i,x j] = l cl ijx l are called structural constant of g. 276

277 (ii). Let us assume now that Ŝ is a nilpotent (2,3,4,5) sub-riemannian structure. As before we identify the distribution with its value at the identity and consider any orthonormal basis {e 1,e 2 } for the 2-dimensional subspace D id. By our assumption on G(S) dim span{e 1,e 2,[e 1,e 2 ]} = 3 dim span{e 1,e 2,[e 1,e 2 ],[e 1,[e 1,e 2 ]],[e 2,[e 1,e 2 ]]} = 4. (1.44) As in (i), it is easy to see that there exists a choice of the orthonormal basis on D id, which we denote { X 1, X 2 }, such that [ X 2,[ X 1, X 2 ]] =. From this property and the Jacobi identity it follows [ X 2,[ X 1,[ X 1, X 2 ]]] =. Then we set X 3 := [ X 1, X 2 ], X4 = [ X 1,[ X 1, X 2 ]] and X 5 := [ X 1,[ X 1,[ X 1, X 2 ]]]. It is easily seen that (1.44) implies that these vectors are linearly independent and give a canonical basis for the Lie algebra, with the only nontrivial commutator relations: [ X 1, X 2 ] = X 3, [ X 1, X 3 ] = X 4, [ X 1, X 4 ] = X 5. (iii). In the case (4,5) since dim T id G/D id = 1, the map (1.43) is represented by a single 4 4 skew-simmetric matrix L. By skew-symmetricity its eigenvalues are purely imaginary ±ib 1,±ib 2, one of which is different from zero. Assuming b 1 we have that α = b 2 /b 1. Notice that the structure is contact if and only if α (see also Section?? for more details on the normal form). Remark Notice that, in the statement of Theorem 1.64, in all other cases the nilpotent approximation does not depend on any parameter, except for the (4,5) case. As a consequence, up to dimension 5, the sub-riemannian structure induced on the tangent space, and hence the Popp measure P, does not depend on the point, except for the (4,5) case. In the (4,5) case we have the following expression for the Popp s measure P = 1 dx b 2 1 +b dx 5, 2 where b 1,b 2 are the eigenvalues of the skew-simmetric matrix that represent the Lie bracket map. SincethenormalformsinTheorem1.64donotdependonthepoint, exceptwheng(s) (4,5), we have the following corollary Corollary Let S = (M,D,g) be a regular sub-riemannian manifold such that dim(m) 5 and G(S) (4,5). Then if q 1,q 2 M we have that Ŝq 1 is isometric to Ŝq 2 as sub-riemannian manifolds. 1.7 Examples* Heisenberg Martinet Grushin 277

278 278

279 Chapter 11 Regularity of the sub-riemannian distance In this chapter we focus our attention on the analytical properties of the sub-riemannian squared distance from a fixed point. In particular we want to answer to the following questions: (i) Which is the (minimal) regularity of d 2 that one can expect? (ii) Is the sub-riemannian distance d 2 smooth? If not, can we characterize smooth points? 11.1 General properties of the distance function In this section we recall and collect some general properties of the sub-riemannian distance and results related to it, some of which we already proved in the previous chapters. Let us consider a free sub-riemannian structure (M,U,f) where the vector fields f 1,...,f m define a generating family, i.e. f : U TM, f(u,q) = m u i f i (q) i=1 Here U is a trivial Euclidean bundle on M of rank m. Definition Fix a point q M. The flag of the sub-riemannian structure at the point q is the sequence of subspaces {D i q} i N defined by D i q := span{[f j1,...,[f jl 1,f jl ]](q), l i} Notice that Dq 1 = D q is the set of admissible directions. Moreover, by construction, Dq i Di+1 q all i 1. for The bracket generating assumptions implies that q M, m(q) > s.t. D m(q) q and m(q) is called the step of the sub-riemannian structure at q. 279 = T q M

280 Exercise Prove that the filtration defined by the subspaces Dq i, for i 1, is independent on the choice of a generating family (i.e., on the trivialization of U). 2. Show that m(q) does not depend on the generating frame. Prove that the map q m(q) is upper semicontinuous. In Chapter 1 we already proved that the sub-riemannian distance is Hölder continuous. For the reader s convenience, we recall here the statement. Proposition For every q M there exists a neighborhood O q such that q,q 1 O q and for every coordinate map φ : O q R n d(q,q 1 ) C φ(q ) φ(q 1 ) 1/m where m = m(q) is the step of the sub-riemannian structure at q. In what follows we fix a point q M to be fixed and r > such that B = B q (r ) is a closed compact ball centered in q. Let us denote by E = E q : U M the end-point map based at q M, i.e., the map that associates to every control u( ) U L 2 the end-point q u (1) of the solution associated to the control u (we recall that U is the open set of L 2 such that the corresponding solution q u ( ) is defined on [,1]). Denote with B the ball of radius r in L 2 (where r is chosen in such a way that the closure of B q (r ) is compact). Notice that since B is compact then B U. Proposition F B : B M is continuous in the weak topology. In other words if u n u in the weak-l 2 topology then F(u n ) F(u). Proof. Consider the solution of the problem γ(t) = f u(t) (γ(t)), γ() = q, u B. Since the ball B is compact, all trajectories are Lipschitzian with the same Lipchitz constant. In particular this set has compact closure in the C topology. Assume now that u n u and consider the family of curves γ n (t) associated to u n, that satisfy γ n (t) = q + t f un(τ)(γ n (τ))dτ. By compactness there exists a subsequence, which we still denote γ n, such that γ n γ uniformly, for some curve γ, in particular their endpoints converge. It remains to show that γ is the trajectory associated to u. Since u n u we have that f un(t)(γ n (t)) f u(t) (γ(t)) being the product between strong and weak convergent sequences. 1 taking the limit we find i.e. γ is the trajectory associated to u. γ(t) = q + t 1 one can write the coordinate expression u i kf i(q k (t)) f u(τ) (γ(τ))dτ, 28

281 Remark Actually we prove that all trajectories converge uniformly and not only their endpoints. The previous proposition given another proof of the existence of minimizers Corollary 11.6 (Existence of minimizers). For any q B q (r) there exists u (with u r) that join q and q and is a minimizer. i.e. u = d(q,q). Proof. Consider a point q in the compact ball B. Then take a minimizing sequence u n such that F(u n ) = q and u n d(q,q). The sequence ( u n ) n is bounded, hence by weak compactness of balls in L 2 there exists a subsequence, that we still call u n such that u n u for some u. By continuity F(u) = q. Moreover the semicontinuity of the L 2 norm proves that u corresponds to a minimizer joining q to q since u liminf n u n = d(q,q). Definition A control u is called a minimizer if it satisfies J(u) = 1 2 d2 (q,f(u)). Notice that in this case we have u = d(q,f(u)). We denote by M L 2 the set of all minimizing controls. Theorem 11.8 (Compactness). Let K M be compact. The set of all minimal controls associated with trajectories reaching K M K = {u M F(u) K}, is compact in the strong L 2 topology. Proof. Consider a sequence u n M K. Since K is compact, the sequence u n is bounded. Since bounded sets in L 2 are weakly compact, we can assume that u n u. Let us show that we also have u n u. From Proposition 11.4 it follows that F(u n ) F(u) in M and the continuity of the distance implies d(q,f(u n )) d(q,f(u)). Moreover since u n M we have that u n = d(q,f(u n )) and by weak semicontinuity of the L 2 norm we get Hence u n u strongly in L 2 and u M. u liminf n u n = liminf n d(q,f(u n )) = d(q,f(u)) Regularity of the sub-riemannian distance In this section we fix once for all a point q M and a closed ball B = B q (r ) such that B is compact. In particular for each q B there exists a minimizer joining q and q (see Corollary 11.6). In what follows we denote by f the squared distance from q The main result of this chapter is the following. f( ) = 1 2 d2 (q, ). (11.1) 281

282 Theorem The function f B : B R is smooth on a open dense subset of B. In the case of complete sub-riemannian structures, since balls are compact for all radii, we have immediately the following corollary Corollary Assume that M is a complete sub-riemannian manifold and q M. Then f is smooth on an open and dense subset of M. We start by looking for necessary conditions for f to be C around a point. Proposition Let q B and assume that f is C at q. Then (i) there exists a unique length minimizer γ joining q with q. Moreover γ is not abnormal and not conjugate. (ii) d q f = λ 1, where λ 1 is the final covector of the normal lift of γ. Proof. Under the above assumptions the functional Ψ : v J(v) f(f(v)), v L ([,T],R k ), (11.2) is smooth and non negative. For every optimal trajectory γ, associated with the control u, that connects q with q in time 1, one has = d u Ψ = d u J d q f D u F. (11.3) Thus, γ is a normal extremal trajectory, with Lagrange multiplier λ 1 = d q f. By Theorem 4.26, we can recover γ by the formula γ(t) = π e (t 1) H (λ 1 ). Then, γ is the unique minimizer of J connecting its endpoints, and is normal. Next we show that γ is not abnormal and not conjugate. For y in a neighbourhood O q of q, let us consider the map Φ : O q T q M, Φ(y) = e H (d y f). (11.4) The map Φ, by construction, is a smooth right inverse for the exponential map, since E(Φ(y)) = π e H (e H (d y f)) = π(d y f) = y. (11.5) This implies that q is a regular value for the exponential map. Since q is a regular value for the exponential map and, a fortiori, u is a regular point for the end-point map. This proves that u corresponds to a trajectory that is at the same time strictly normal and not conjugate. Remark Notice that from the proof it follows that if we only assume that f is differentiable at q, we can still conclude that there exists a unique minimizer γ joining q to q, and it is normal. Moreover leu us notice that to conclude it is enough to assume that f is twice differentiable at q. In particular a posteriori we can prove that whenever f is is twice differentiable at q then it is C. Before going further in the study of the smoothness property of the distance function, we are already able to prove an important corollary of this result. 282

283 Denote, for r >, S r := f 1 ( r2 2 ) the sub-riemannian sphere of radius r centered at q Corollary Assume that D q T q M. For every r r, the sphere S r contains a non smooth point of the function f. Proof. Since r r, the sphere S r is non empty and contained in a compact ball. Assume, by contradiction, that f is smooth at every point of S r. Then S r is a level set defined by f and d q f for every q S r (since d q f is the nonzero covector attached at the final point of a geodesic, see Proposition 11.11). It follows that S r is a smooth submanifold of dimension n 1, without boundary. Moreover, being the level set of a continuous function, S r is closed, hence compact. Let us consider the map Φ : S r T q M, Φ(q) = e H (d q f), By assumption f is smooth, hence Φ is a smooth right inverse of the exponential map (see also (11.5)). In particular the differential of Φ is injective at every point. Moreover H(Φ(q)) = r since f(q) = H(λ) = r for every q S r. It follows that actually Φ defines a smooth immersion Φ : S r H 1 (r) T q M (11.6) of the sphere S r into the set { C r := H 1 (r) Tq M = λ Tq M : 1 2 } k λ,f i (q ) 2 = r. i=1 Notice that C r is a smooth connected and non compact n 1 dimensional submanifold of the fiber T q M, indeed diffeomorphic to the cylinder S k 1 R n k (here k = dimd q < n is the rank of the structure at the point q ). By continuity of Φ, the image Φ(S r ) is closed in C r. Moreover, since every immersion is a local submersion and dims r = dimc r, the set Φ(S r ) is also open in C r. Hence it is connected. Since Φ(S r ) has no boundary, it is a connected component of C r, namely Φ(S r ) = C r. This is a contradiction since, by continuity, Φ(S r ) is compact, while C r is not. Next we go back to the proof of the main result. Recall that q M is fixed and f is the one half of the distance squared from q. After Proposition 11.11, it is natural to introduce the following definition. Definition Fix a point q M. The set of smooth point from q is the set Σ M of q M such that there exists a unique lenght-minimizer γ joining q to q, that it is strictly normal, and not conjugate. From the proof of Proposition (see also Remark 11.12) it follows that if the squared distance f from q, is smooth at q then q Σ. The name smooth point of f is justified by the following theorem. Theorem The set Σ is open and dense in B. Moreover f is smooth at every point of Σ. Proof. We divide the proof into three parts: (a) the set Σ is open, (b) the function f is smooth in a neighborhood of every point of Σ, (c) the set Σ is dense in B. 283

284 (a). To prove that Σ is open we have to show that for every q Σ there exists a neighborhood O q of q such that every q O q is also in Σ. Let us start by proving the following claim: there exists a neighborhood of q in B such that every point in this neighborhood is reached by exactly one minimizer. By contradiction, ifthis propertyisnot true, thereexists asequenceq n of points inb converging to q such that (at least) two minimizers γ n and γ n joining q and q n. Let us denote by u n and v n the corresponding minimizing controls. By Proposition 11.8, the set of controls associated with minimizers whose endpoint is in the compact ball B is compact in L 2 (w.r.t. the strong topology). Then there exist, up to considering a subsequence, two controls u,v such that u n u and v n v. Moreovers the limits u and v are both minimizers and join q with q. Since by assumption there is a unique minimizer γ joining q with q, it follows that u = v is the corresponding control. By smoothness of the end point map both D un F and D vn F tends to D u F, which has has full rank (u is strictly normal, hence is not a critical point for F). Hence, for n big enough, both D un F and D vn F are surjective, i.e., u n and v n are strictly normal, and we can build the sequence λ n 1 and ξ n 1 of corresponding final covectors in T q n M satisfying λ n 1 D u n F = u n, ξ n 1 D v n F = v n. These relations can be rewritten in terms of the adjoint linear maps (D un F) λ n 1 = u n, (D vn F) ξ n 1 = v n. Since both (D un F) and (D vn F) are a family of injective linear maps converging to (D u F) and u n,v n u, it follows that the corresponding (unique) solutions λ n 1 and ξn 1 also converge to the solutionofthelimitproblem(d u F) λ 1 = u, i.e, bothconvergetothefinalcovector λ 1 corresponding to γ. By using the flow defined by the corresponding controls we can deduce the convergence of the sequences λ n and ξn of the initial covectors associated to u n and v n to the unique initial covector λ corresponding to γ. Finally, since λ by assumption is a regular point of the exponential map, i.e., the unique minimizer γ joining q to q is not conjugate, it follows that the exponential map is invertible in a neighborhood V λ of λ onto its image O q := E(V λ ), that is a neighborhood of q. In particular this proves our initial claim. More precisely we have proved that for every point q O q there exists a unique minimizer joining q to q, whose initial covector λ V λ is a regular point of the exponential map. This implies that every q O q is a smooth point, and Σ is open. (b). Now we prove that f is smooth in a neighborhood of each point q Σ. From the part (a) of the proof it follows that if q Σ there exists a neighborhood V λ of λ and O q of q such that E Vλ : V λ O q is a smooth invertible map. Denote by Φ : O q V λ its smooth inverse. Since for every q O q there is only one minimizer joining q to q with initial covector Φ(q ) it follows that, f(q ) = 1 2 d2 (q,q ) = H(Φ(q )), that is a composition of smooth functions, hence smooth. (c). Our next goal is to show that Σ is a dense set in B. We start by a preliminary definition. 284

285 Definition A point q B is said to be (i) a fair point if there exists a unique minimizer joining q to q, that is normal. (ii) a good point if it is a fair point and the unique minimizer joining q to q is strictly normal. We denote by Σ f and Σ g the set of fair and good points, respectively. We stress that a fair point can be reached by a unique minimizer that is both normal and abnormal. From the definition it is immediate that Σ Σ g Σ f. The proof of (c) relies on the following four steps: (c1) Σ f is a dense set in B, (c2) Σ g is a dense set in B, (c3) f is Lipschitz in a neighborhood of every point of Σ g, (c4) Σ is a dense set in B. (c1). Fix an open set O B and let us show that Σ f O. Consider a smooth function a : O R such that a 1 ([s,+ [) is compact for every s R. Then consider the function ψ : O R, ψ(q) = f(q) a(q) The function ψ is continuous on O and, since f is nonnegative, the set ψ 1 (],s[) are compact for every s R due to the assumption on a. It follows that ψ attains its minimum at some point q 1 O. Define a control u 1 associated with a minimizer γ joining q and F(u 1 ) = q 1. Since J(u) f(f(u)) for every u, it is easy to see that the map Φ : U R, Φ(u) = J(u) a(f(u)) attains its minimum at u 1. In particular it holds = D u1 Φ = u 1 (d q1 a)d u1 F. The last identity implies that u 1 is normal and λ 1 = d q1 a is the final covector associated with the trajectory. By Theorem 4.26, the corresponding trajectory γ is uniquely recovered by the formula γ(t) = π e (t 1) H (d q1 a). In particular γ is the uniqueminimizer joining q to q 1 O, and is normal, i.e. q 1 Σ f O. Remark In the Riemannian case Σ f = Σ g since there are no abnormal extremal. (c2). As in the proof of (c1), we shall prove that Σ g O for any open O B. By (c1) the set Σ f O is nonempty. For any q Σ f O we can define rankq := rankd u F, where u is the control associated to the unique minimizer γ joining q to q. To prove (c2) it is sufficient to prove that there exists a point q Σ f O such that rankq = n (i.e., D u F is surjective, where u is the control associated to the unique minimizer joining q and q ). Assume by contradiction that k O := max rankq < n, q Σ f O and consider a point q where the maximum is attained, i.e., such that rank q = k O. 285

286 We claim that all points of Σ f O that are sufficiently close to q have the same rank (we stress that the existence of points in Σ f O arbitrary close to q is also guaranteed by (c1)). Assume that the claim is not true, i.e., there exist a sequence of points q n Σ f O such that q n q and rankq n k O 1. Reasoning as in the proof of (a), using uniquenessand compactness of the minimizers, one can prove that the sequence of controls u n associated to the unique minimizers joining q to q n satisfies u n û strongly in L 2, where û is the control associated to the unique minimizer joining q with q. By smoothness of the end-point map F it follows that D un F DûF which, by semicontinuity of the rank, implies the contradiction rank q = rankdûf liminf n rankd u n F k O 1. Thus, without loss of generality, we can assume that rankq = k O < n for every q Σ f O (maybe by restricting our neighborhood O). We introduce the following set Π q = e H {ξ T q M ξd uf = λ 1 D u F} T q M. The set Π q is the set of initial covector λ T q M whose image via the exponential map is the point q. Lemma Π q is an affine subset of T q M such that dimπ q = n k O. Moreover the map q Π q is continuous. Proof. It is easy to check that the set Π q = {ξ T q M ξd uf = λ 1 D u F} is an affine subspace of T q M. Indeed ξ Π q if and only if (D u F) (ξ λ 1 ) =, that is Π q = {ξ T qm ξd u F = λ 1 D u F} = λ 1 +Ker(D u F), Moreover dimker(d u F) = n dimimd u F = n k O. Since all elements ξ Π q are associated with the same control u, we have that Π q = e H ( Π q ) = P,t ( Π q ), hence Π q is an affine subspace of T q M. Let us now show that the map q Π q is continuous on Σ f O. Consider a sequence of points q n in Σ f O such that q n q Σ f O. Let u n (resp. u) be the unique control associated with the minimizing trajectory joining q and q n (resp. q). By theuniqueness-compactness argument already used in the previous part of the proof we have that u n u strongly and moreover D un F D u F. Since rank D un F is constant, it follows that Ker(D un F) Ker(D u F), as subspaces. Consider now A T q M a k O -dimensional ball that contains λ = e H (λ 1 ) and is transversal to Π q. By continuity A is transversal also to Π q, for q Σ f O close to q. In particular Π q A. Since E(Π q ) = q, this implies that Σ f O E(A). By (c1), Σ f O is a dense set, hence E(A) is also dense in O. On the other hand, since E is a smooth map and A is a compact ball of positive codimension (k O < n), by Sard Lemma it follows that E(A) is a closed dense set of O that has measure zero, that is a contradiction. (c3) The proof of this claim relies on the following result, which is of independent interest. Theorem Let K B a compact in our ball such that any minimizer connecting q to q K is strictly normal. Then f is Lipschitz on K. 286

287 Proof of Theorem Let us first notice that, since K is compact, it is sufficient to show that f is locally Lipschitz on K. Fix a point q K and some control u associated with a minimizer joining q and q (it may be not unique). By our assumptions D u F is surjective, since u is strictly normal. Thus, by inverse function theorem, there exist neighborhoods V of u in U and O q of q in K, together with a smooth map Φ : O q V that is a local right inverse for the end-point map, namey F(Φ(q )) = q for all q O q (see also Theorem 2.54). Fix then local coordinates around q. Since Φ is smooth, there exists R > and C > such that B q (C r) F(B u (r)), r < R, (11.7) where B u (r) is the ball of radius r in L 2 and B q (r) is the ball of radius r in coordinates on M. Let us also observe that, since J is smooth on, there exists C 1 > such that for every u,u B u (R) one has J(u ) J(u) C 1 u u 2 Pick then any point q K such that q q = C r, with r R. By (11.7), there exists u B u (R) with u u 2 r such that F(u ) = q. Using that f(q ) J(u ) and f(q) = J(u), since u is a minimizer, we have f(q ) f(q) J(u ) J(u) C 1 u u 2 C q q, where C = C 1 /C. Notice that the above inequality is true for all q such that q q C R. Since K is compact, and the set of control u associated with minimizers that reach the compact set K is also compact, the constants R > and C,C 1 can be chosen uniformly with respect to q K. Hence we can exchange the role of q and q in the above reasoning and get f(q ) f(q) C q q, for every pair of points q,q such that q q C R. To end the proof of (c3) it is sufficient to show that if q Σ g there exists a (compact) neighborhood O q of q such that every point in O q is reached by only strictly normal minimizers (we stress that no uniqueness is required here). By contradiction, assume that the claim is not true. Then there exists a sequence of points q n converging to q and a choice of controls u n, such that the corresponding minimizers are abnormal. By compacness of minimizers there exists u such that u n u and by uniqueness of the limit u is abnormal for the point q, that is a contradiction. (c4). We have to prove that Σ O is non empty for every open neighborhood O in B. By (c3) we can choose q Σ g O and fix O O neighborhood of q such that f is Lipschitz on O. It is then sufficient to show that Σ O. By Proposition (see also Remark 11.12) every differentiability point of f is reached by a unique minimizer that is normal, hence is a fair point. Since we know that f is Lipschitz on O, it follows by Rademacher Theorem that almost every point of O is fair, namely meas(σ f O ) = meas(o ). Let us also notice that the set Σ f O of fair points of O is also contained in the image of the exponential map. Thanks to the Sard Lemma, the set of regular values of the exponential map in 287

288 O is also a set of full measure in O. Since by definition a point in Σ f that is a regular value for the exponential map is in Σ, this implies that meas(σ O ) = meas(σ f O ) = meas(o ). This in particular proves that Σ O is not empty. As a corollary of this result we can prove that if there are no abnormal minimizers, then the set of smooth points has full measure Corollary Assume that M is a complete sub-riemannian structure and that there are no abnormal minimizers. Then meas(m \Σ) =. This result is not known in general, and it is indeed a main open problem of sub-riemannian geometry to establish whether Corollary 11.2 remains true in presence of abnormal minimizers. We stress that the assumptions of the theorem are satisfied in the case of Riemannian structure. Indeed in this case, following the same arguments of the proof, we have the following result. Proposition Let M be a sub-riemannian structure that is Riemannian at q,i.e., such that dimd q = dimm. Then there exists a neighborhood O q of q such that f is smooth on O q Locally Lipschitz functions and maps If S is a subset of a vector space V, we denote by conv(s) the convex hull of S, that is the smallest convex set containing S. It is characterized as the set of v V such that there exists a finite number of elements v,...,v l S such that v = l λ i v i, λ i, i= n λ i = 1. i= If ϕ : M R is a function defined on a smooth manifold M, we say that ϕ is locally Lipschitz is ϕ is locally Lipschitz in any coordinate chart, as a function defined on R n. The classical Rademacher theorem implies that a locally Lipschitz function ϕ : M R is differentiable almost everywhere. Still we can introduce a weak notion of differential that is defined at every point. If ϕ : M R is locally Lipschitz, any point q M is the limit of differentiability points. In what follows, whenever we write d q ϕ, it is implicitly understood that q M is a differentiability point of ϕ. Definition Let ϕ : M R be a locally Lipschitz function. The (Clarke) generalized differential of ϕ at the point q M is the set q ϕ := conv{ξ T q M ξ = lim q n q d q n ϕ} (11.8) Notice that, by definition, q ϕ is a subset of T qm. It is closed by definition and bounded since the function is locally Lipschitz, hence compact. Exercise (i). Show that the mapping q q ϕ is upper semicontinuous in the following sense: if q n q in M and ξ n ξ in T M where ξ n qn ϕ, then ξ q ϕ. (ii). We say that q is regular for ϕ if / q ϕ. Prove that the set of regular point for ϕ is open in M. 288

289 From the very definition of generalized differential we have the following result. Lemma Let ϕ : M R be a locally Lipschitz function and q M. The following are equivalent: (i) q ϕ = {ξ} is a singleton, (ii) d q ϕ = ξ and the map x d x ϕ is continuous at q, i.e., for every sequence of differentiability point q n q we have d qn ϕ d q ϕ. Remark Let A be a subset of R n of measure zero and consider the set of half-lines L v = {q +tv,t } emanating from q and parametrized by v S n 1. It follows from Fubini s theorem that for almost every v S n 1 the one-dimensional measure of the intersection A L v is zero. If we apply this fact to the case when A is the set at which a locally Lipschitz function ϕ : R n R fails to be differentiable, we deduce that for almost all v S n 1, the function t ϕ(q +tv) is differentiable for a.e. t. Example Let ϕ : R R defined by (i) ϕ(x) = x. Then ϕ = [ 1,1], (ii) ϕ(x) = x, if x < and ϕ(x) = 2x, if x. In this case ϕ = [1,2]. In particular in the first example is a minimum for ϕ and ϕ. In the second case the function is locally invertible near the origin and ϕ is separated from zero. In what follows we will prove that these fact corresponds to general results (cf. Proposition 11.3 and Theorem 11.34). The following is a classical hyperplane separation theorem for closed convex sets in R n. Lemma Let K and C be two disjoint, closed, convex sets in R n, and suppose that K is compact. Then there exists ε > and a vector v S n 1 such that x,v > y,v +ε, x K, y C. (11.9) We also recall here another useful result from convex analysis. Lemma (Carathéodory). Let S R n and x conv(s). Then there exists x,...,x n S such that x conv{x,...,x n }. The notion of generalized gradient permits to extend some classical properties of critical points of smooth functions. Proposition Let ϕ : M R be locally Lipschitz and q be a local minimum for ϕ. Then q ϕ. Proof. Sincetheclaim is alocal propertywecan assumewithoutloss of generality thatm = R n. As usual we will identify vectors and covectors with elements of R n and the duality covectors-vectors is given by the Euclidean scalar product, that we still denote,. Assume by contradiction that / q ϕ and let us show that q cannot be a minimum for ϕ. To this aim, we prove that there exists a direction w in S n 1 such that the scalar map t ϕ(q +tw) has no minimum at t =. 289

290 The set q ϕ is a compact convex set that does not contain the origin, hence by Lemma 11.27, there exist ε > and v S n 1 such that ξ,v < ε, ξ q ϕ. By definition of generalized differential, one can find open neighborhoods O q of q in R n and V v of v in S n 1 such that for all differentiability point q O q of ϕ one has dq ϕ,v ε/2, v V v. Fix q O q where ϕ is differentiable and a vector w V v such that the set of differentiable points with the line {q +tw} has full measure (cf. Remark 11.25). Then we can compute for t > ϕ(q +tw) ϕ(q) = Thus ϕ cannot have a minimum at q. t d q+sw ϕ,w ds εt/2. The following proposition gives an estimate for the generalized differential of some special class of function. Proposition Let ϕ ω : M R be a family of C 1 functions, with ω Ω a compact set. Assume that the following maps are continuous: (ω,q) ϕ ω (q), (ω,q) d q ϕ ω Then the function a(q) := min ω Ω ϕ ω(q) is locally Lipschitz on M and q a conv{d q ϕ ω ω Ω s.t. ϕ ω (q) = a(q)}. (11.1) Proof. As in the proof of Proposition we can assume that M = R n. Notice that, if we denote by Ω q = {ω Ω,ϕ ω (q) = a(q)} we have by compactness of Ω that Ω q is non empy for every q M and we can rewrite the claim as follows q a conv{d q ϕ ω ω Ω q }. (11.11) We divide the proof into two steps. In step (i) we prove that a is locally Lipschitz and then in (ii) we show the estimate (11.11). (i). Fix a compact K M. Since every ϕ ω is Lipschitz on K and Ω is compact, there exists a common Lipschitz constant C K >, i.e. the following inequality holds Clearly we have ϕ ω (q) ϕ ω (q ) C K q q, q,q K, ω Ω, min ϕ ω(q) ϕ ω (q ) C K q q, q,q K, ω Ω, ω Ω and since the last inequality holds for all ω Ω we can pass to the min with respect to ω in the left hand side and a(q) a(q ) C K q q, q,q K. 29

291 SincetheconstantC K dependsonlyonthecompactsetk wecanexchangeinthepreviousreasoning the role of q and q, that gives a(q) a(q ) C K q q, q,q K. (ii). Define D q := conv{d q ϕ ω ω Ω q }. Let us first prove prove that d q a D q for every differentiability point q of a. Fix any ξ / D q. By Lemma applied to the pair D q and {ξ}, there exist ε > and v S n 1 such that d q ϕ ω,v > ξ,v +ε, ω Ω q, By continuity of the map (ω,q) d q ϕ ω, there exists a neighborhood O q of q and V neighborhood of Ω q such that dq ϕ ω,v > ξ,v +ε/2, q O q, ω V, An integration argument let us to prove that there exists δ > such that for ω V 1 t (ϕ ω(q +tv) ϕ ω (q)) > ξ,v +ε/4, < t < δ. Clearly we have 1 t (ϕ ω(q +tv) a(q)) ξ,v +ε/4, < t < δ. and since the minimum in a(q+tv) = min ω Ω ϕ ω (q+tv) is attained for ω in Ω q+tv V for t small enough, we can pass to the minimum w.r.t. ω V in the left hand side, proving that there exists t > such that 1 t (a(q +tv) a(q)) ξ,v +ε/4, < t < t. Passing to the limit for t we get d q a,v ξ,v +ε/4 (11.12) If d q a / D q we can choose ξ = d q a in the above reasoning and (11.12) gives the contradiction d q a,v d q a,v +ε/4. Hence d q a D for every differentiability point q of a. Now suppose that one has a sequence q n q, where q n are differentiability points of a. Then d qn a D qn for all n from the first part of the proof. We want to show that, whenever the limit ξ = lim n d qn a exists, then ξ D q. This is a consequence of the fact that the map (ω,q) d q ϕ ω is continuous (in particular upper semicontinuous in the sense of Exercise 11.23) and the fact that Ω is compact. Exercise Complete the second part of the proof of Proposition Hint: use Carathéodory lemma Locally Lipschitz map and Lipschitz submanifolds Asforscalar functions, amapf : M N betweensmoothmanifoldsissaidtobelocally Lipschitzif for any coordinate chart in M and N the corresponding function from R n to R n is locally Lipschitz. For a locally Lipschitz map between manifolds f : M N the (Clarke) generalized differential is defined as follows q f := conv{l Hom(T q M,T f(q) N) L = lim q n q D q n f, q n diff. point of f}, The following lemma shows how the standard chain rule extends to the Lipschitz case. 291

292 Lemma Let M be a smooth manifold and f : M N be a locally Lipschitz map. (a) If φ : M M is a diffeomorphism and q M we have q (f φ) = ϕ(q) f D q φ. (11.13) (b) If ϕ : N W is a C 1 map, and q M we have q (ϕ f) = D f(q) ϕ q f. (11.14) Moreover the generalized differential, as a set, is upper semicontinuous. More precisely for every neighborhood Ω Hom(T q M,T f(q) N) of q f there exists a neighborhood O q of q such that q f Ω, for every q O q. Sketch of the proof. For a detailed proof of this result see??. Here we only give the main ideas. (a). Since φ is a diffeomorphism, it sends every differentiability point q of f φ to a differentiability point φ(q) for f. Then (11.13) is true at differentiability point and passing to the limit it is also valid for sub-differential (one proves both inclusions using φ and φ 1 ). Part (b) can be proved along the same lines. The semicontinuity can be proved by using the hyperplane separation theorem and the Carathéodory Lemma. Definition Let f : M N be a locally Lipschitz map. A point q M is said critical for f if q f contains a non-surjective map. If q M is not critical it is said regular. Notice that by the semicountinuity property of Lemma 11.32, it follows that the set of regular point of a locally Lipschitz map f is open. Theorem Let f : R n R n be a locally Lipschitz map and q M be a regular point. Then there exists neighborhood O f(q) and a locally Lipschitz map g : O f(q) R n R n such that f g = g f = Id. Remark The classical C 1 version of the inverse function theorem (cf. Theorem??) can be proved from Theorem and the chain rule (Lemma 11.32). Indeed Theorem implies that there exists a locally Lipschitz inverse g and using the chain rule it is easy to show that the sub-differential of g contains only one element (this implies that it is differentiable at that point) and the differential of g is the inverse of the differential of f. Before proving Theorem we need the following technical lemma. Lemma Let f : R n R n be a locally Lipschitz map and q M be a regular point. Then there exists a neighborhood O q of q and ε > such that v S n 1, ξ v S n 1 s.t. ξ v, x f(v) > ε, x O q. (11.15) Moreover f(x) f(y) ε x y, for all x,y O q. We stress that (11.15) means that the inequality ξ v,l(v) > ε holds for every x O q and every element L x f. 292

293 Proof. Notice that, since q is a regular point, the set q f contains only invertible linear maps. For every v S n 1, theset q f(v) is compact and convex, and does not contain thezero linear map. By the hyperplane separation theorem we can find ξ v such that ξ v, q f(v) > ε(v). The map x x f is upper semicontinuous, hence there exists a neighborhood O q of q such that ξ v, x f(v) > ε(v) for all x O q. Since S n 1 is compact, there exists a uniform ε = min{ε(v),v S n 1 } that satisfies (11.15). To prove the second statement of the Lemma, write y = x+sv, where s = x y and v S n 1. Consider a vector v S n 1 close to v such that almost every point in the direction of v is a point of differentiability (cf. Remark 11.25), and set y = x + sv and ξ v the vector associated to v defined by (11.15). Then we can write f(y ) f(x) = s (D x+tv f)v dt. and we have the inequality f(y ) f(x) ξ v,f(y ) f(x) = s ε y x ξv,(d x+tv f)v dt Since ε does not depend on v, we can pass to the limit for v v in the above inequality (in particular y y) and the Lemma is proved. Proof of Theorem The inequality proved in Lemma implies that f is injective in the neighborhood O q of the point q. If we show that f(o q ) covers a neighborhood O f(q) of the point f(q), then the inverse function g : O f(q) R n is well defined and locally Lipschitz. Without loss of generality, up to restricting the neighborhood O q, we can assume that every point in O q is regular for f and moreover that the estimate of the Lemma holds also on the topological boundary O q. Lemma also implies that dist(f(q), f(o q )) εdist(q, O q ) >, where dist(x,a) = inf y A x y denotes the Euclidean distance from x to the set A. Then consider a neighborhood W f(o q ) of f(q) such that y f(q) < dist(y, f(o q )), for every y W. Fix an arbitrary ȳ W and let us show that the equation f(x) = ȳ has a solution. Define the function ψ : O q R, ψ(x) = f(x) ȳ 2 By construction ψ(q) < ψ(z), for all z O q, hence by continuity ψ attains the minimum on some point x O q. By Proposition 11.29, we have x ψ. Moreover, using the chain rule x ψ = (f( x) ȳ) T x f Since x is a regular point of f, the linear map x f is invertible. Thus x ψ implies f( x) = ȳ. We say that c R is a regular value of a locally Lipschitz function ϕ : M R if ϕ 1 (c) and every x ϕ 1 (c) is a regular point. 293

294 Corollary Let ϕ : M R be locally Lipschitz and assume that c R is a regular value for ϕ. Then ϕ 1 (c) is a Lipschitz submanifold of M of codimension 1. Proof. We show that in any small neighborhood O x of every x ϕ 1 (c) the set O x ϕ 1 (c) can be described as the zero locus of a locally Lipschitz function. Since x ϕ does not contain, by the hyperplane separation theorem there exists v 1 S n 1, such that x ϕ,v 1 > for every x in the compact neighborhood O x ϕ 1 (y). Let us complete v 1 to an orthonormal basis {v 1,v 2,...,v n } of R n and consider the map ϕ(x ) c f : O x R n, f(x v 2,x ) =. v n,x By construction f is locally Lipschitz and x is a regular point of f. Hence there exists, by Theorem alipschitz inverseg of f. Inparticular theinversemap isalipschitzfunction that transforms the hyperplane {y 1 = } into ϕ 1 (c). Hence the level set ϕ 1 (c) is a Lipschitz submanifold A non-smooth version of Sard Lemma In this section we prove a Sard-type result for the special class of Lipschitz functions we considered in the previous section. We first recall the statement of the classical Sard lemma. We denote by C f the critical point of a smooth map f : M N, i.e. the set of points x in M at which the differential of f is not surjective. Theorem (Sard lemma). Let f : R n R m be a C k function, with k max{n m+1,1}. Then the set f(c f ) of critical values of f has measure zero in R m. Notice that the classical Sard Lemma does not apply to C 1 functions ϕ : R n R, whenever n 1. Theorem Let M be a smooth manifold and ϕ ω : M R a family of smooth functions, with ω Ω. Assume that (i) Ω = i N N i is the union of smooth submanifold, and is compact, (ii) the maps (ω,q) ϕ ω (q) and (ω,q) d q ϕ ω are continuous on Ω M, (iii) the maps ψ i : N i M R, (ω,q) ϕ ω (q) are smooth. Then the set of critical values of the function a(q) = min ω Ω ϕ ω(q) has measure zero in R. Proof. We aregoing to defineacountable set of smooth functions Φ α indexedby α = (α,...,α n ) N n+1, where n = dimm, such that to every critical point q of a there corresponds a critical point z q of some Φ α. Moreover we have Φ α (z q ) = a(q). 294

295 Denote by Λ n = {(λ,...,λ n ) λ i, λ i = 1}. For every α = (α,...,α n ) N n+1 let us consider the map Φ α : N α... N αn Λ n M R n Φ α (ω,...,ω n,λ,...,λ n,q) = λ i ϕ ωi (q). (11.16) By computing partial derivatives, it is easy to see that a point z = (ω,...,ω n,λ,...,λ n,q) is critical for Φ α id and only if it satisfies the following relations: i= n i= λ αi i ψ ω (ω i,q) =, i =,...,n, n i= λ id q ϕ ωi = i =,...,n, ϕ ω (q) =... = ϕ ωn (q) (11.17) Recall that ψ i is simply the restriction of the map (ω,q) ϕ ω (q) for ω N i. Let us now show that every critical point q of a can be associated to a critical point z q of some Φ α. By Proposition 11.3, the function a is locally Lipschitz. Assume that q is a critical point of a, then we have q a conv{d q ϕ ω ω Ω s.t. ϕ ω (q) = a(q)}. By Carathéodory lemma there exist n+1 element ω,..., ω n and n+1 scalars λ,..., λ n such that λ i, n i= λ i = 1 and = n λ i d q ϕ ωi, i= ϕ ωi (q) = a(q), i =,...,n. Moreover, let us choose for every i =,...,n an index ᾱ i N such that ω i Nᾱi. Since ϕ ωi (q) = a(q) = min Ω ϕ ω (q), ω i is critical for the map ψ αi, namely we have ψ αi ω ( ω i,q) =. This implies that z q = ( ω,..., ω n, λ,..., λ n,q) satisfies the relations (11.17) for the function Φᾱ, with ᾱ = (ᾱ,...,ᾱ n ). Moreover it is easy to check that Φᾱ(z q ) = a(q) since Φᾱ(z q ) = ( n n λ i ϕ ωi (q) = λ i )a(q) = a(q). i= i= Then if C a denotes the set of critical points of a and C α the set of critical point of Φ α we have meas(a(c a )) meas Φ α (C α ) meas(φ α (C α )) =, α N n+1 α N n+1 since meas(φ α (C α )) = for all α by classical Sard lemma. 295

296 We want to apply the previous result in the case of functions that are infimum of smooth functions on level sets of a submersion. Theorem Let F : N M be a smooth map between finite dimensional manifolds and ϕ : N R be a smooth function. Assume that (i) F is a submersion (ii) for all q M the set N q = {x N, ϕ(x) = min ϕ(y)} is a non empty compact set. y F 1 (q) Then the set of critical values of the function a(q) = min ϕ(x) has measure zero in R. x F 1 (q) Proof. Denote by C a the set of critical points of a and a(c a ) is the set of its critical values. Let us first show that for every point q M there exist an open neighborhood O q of q such that meas(a(c a ) O qn ) =. From assumption (i), it follows that for every q M the set F 1 (q) is a smooth submanifold in N. Let us now consider an auxiliary non negative function ψ : N R such that (A) A α := ψ 1 ([,α]) is compact for every α >. and select moreover a constant c > such that the following assumptions are satisfied: (A1) N q inta c, (A2) c is a regular level of ψ F 1 (q). The existence of such a c > is guaranteed by the fact that (A1) is satisfied for all c big enough since N q is compact and A c contains any compact as c +. Moreover, by classical Sard lemma (cf. Theorem 11.38), almost every c is a regular value for the smooth function ψ F 1 (q). By continuity, there exists a neighborhood O q of the point q such that assumptions (A)-(A2) are satisfied for every q O q, for c > and ψ fixed. We observe that (A2) is equivalent to require that level set of F are transversal to level of ψ. We can infer that F 1 (O q ) A c is a smooth manifold with boundary that has the structure of locally trivial bundle. Maybe restricting the neighborhood of q then we can assume F 1 (q) A c = Ω, F 1 (O q ) A c O q Ω, where Ω is a smooth manifold with boundary. In this neighborhood we can split variables in N as follows x = (ω,q) with ω Ω and q M and the restriction a Oq is written as a Oq : O q R, a(q) = min ω Ω ϕ(ω,q). Notice that Ω is compact and is the union of its interior and its boundary, which are smooth by assumptions(a)-(a2). WecanthenapplytheTheorem11.39toa Oq, thatgives meas(a(c a O q ) = for every q M. We have built a covering of M = q M O q. Since M is a smooth manifold, from every covering it is possible to extract a countable covering, i.e. there exists a sequence q n of points in M such that M = n NO qn 296

297 In particular this implies that meas(a(c a )) n Nmeas(a(C a ) O qn ) = since meas(a(c a O q ) = for every q. Remark Notice that we do not assume that N is compact. In that case the proof is easier since every submersion F : N M with N compact automatically endows N with a locally trivial bundle structure Regularity of sub-riemannian spheres We end this chapter by applying the previous theory to get information about the regularity of sub-riemannian spheres. Before proving the main result we need two lemmas. Lemma Fix q M and let K T q M \(H 1 () T q M) be a compact set such that all normal extremals associated with λ K are not abnormal. Then there exists ε = ε(k) such that tλ is a regular point for the E q for all < t ε. Proof. By Corollary 8.46 for every strongly normal extremal γ(t) = E(tλ ), with λ T q M, there exists ε = ε(λ ) > such that γ ],ε] does not contain points conjugate to q, or equivalently, tλ is a regular point for the E q for all < t ε. Since K is compact, it follows that there exists ε = ε(k) such that the above property holds uniformly on K. Lemma Let q M and K M be a compact set such that every point of K is reached from q by only strictly normal minimizers. Define the set Then C is compact. C = {λ T q M λ minimizer, E(λ ) K}. Proof. It is enough to show that C is bounded. Assume by contradiction that there exists a sequence λ n C of covectors (and the associate sequence of minimizing trajectories γ n, associated with controls u n ) such that λ n +, where is some norm in T q M. Since these minimizers are normal they satisfy the relation and dividing by λ n one obtain the identity λ n D un F = u n, n N. (11.18) λ n λ n D u n F = u n, n N. (11.19) λ n Using compactness of minimizers whose endpoints stay in a compact region, we can assume that u n u. Morever the sequence λ n / λ n is bounded and we can assume that λ n / λ n λ for some final covector λ. Using that D un F D u F and the fact that λ n +, passing to the limit for n in (11.19) we obtain λd u F =. This implies in particular that the minimizers γ n converge to a minimizer γ (associated to λ) that is abnormal and reaches a point of K that is a contradiction. 297

298 Theorem Let M be a sub-riemannian manifold, q M and r > such that every point different from q in the compact ball Bq (r ) is not reached by abnormal minimizers. Then the sphere S q (r) is a Lipschitz submanifold of M for almost every r r. Proof. Let us fix δ > and consider the annulus A δ = B r (q )\B δ (q ). Define the set C = {λ T q M λ minimizer, E(λ ) A δ } By Lemma the set C := C is compact. Moreover define C 1 := {λ C H 1 ([,ε ])}, for some ε > that is chosen later. Notice that C 1 is compact. For every λ T M let us consider the control u associated with γ(t) = E(tλ ) and denote by Φ λ := (P 1,t ) : T q M T E q (λ ) M, the pullback of the flow defined by the control u, computed at q. For a fixed λ C, using that C 1 is compact, let us choose ε = ε(λ ) satisfying the following property: for every λ 1 C 1, the covector Φ λ (λ 1 ) T E q (λ ) M, is a regular point of E E q (λ ). Being C also compact, we can define ε = min{ε(λ ),λ C }. Define the map Ψ : C C 1 D δ M, Ψ(λ,λ 1 ) = E Eq (λ )(Φ λ (λ 1 )). By construction Ψ is a submersion. We want to apply Theorem 11.4 to the submersion Ψ and the scalar function H : C C 1 R, H(λ,λ 1 ) = H(λ )+H(λ 1 ). Let us show that the assumption of Theorem 11.4 are satisfied. Indeed we have to show that the set N q = {(λ,λ 1 ) C C 1 H(λ,λ 1 ) = min Ψ(λ,λ 1 )=q H(λ,λ 1 )}, q A δ, is non empty and compact. Let us first notice that Ψ(λ,sλ ) = E q ((1+s)λ ), H(λ,sλ ) = (1+s 2 )H(λ ). By definition of C, for each q A δ there exists λ C such that E q ( λ ) = q and such that the corresponding trajectory is a minimizer. Moreover we can always write this unique minimizer as the union of two minimizers. It follows that min H(λ,λ 1 ) = min H(λ ) = f(q), q A δ. Ψ(λ,λ 1 )=q E q (λ )=q This implies that N q is non empty for every q. Moreover one can show that N q is compact. By applying Theorem 11.4 one gets that the function a(q) = min Ψ(λ,λ 1 )=q H(λ,λ 1 ) = f(q), is locally Lipschitz in A δ and the set of its critical values has measure zero in A δ. Since δ > is arbitrary we let δ and we have that f is locally Lipschitz in B q (r ) \{q } and the set of its critical values has measure zero. In particular almost every r r is a regular value for f. Then, applying Corollary 11.37, the sphere f 1 (r 2 /2) is a Lipschitz submanifold for almost every r r. 298

299 11.5 Geodesic completeness and Hopf-Rinow theorem We start by proving a technical lemma that is needed later. Lemma For every ε > and x M we have B(x,r +ε) = y B(x,r) B(y,ε). (11.2) Proof. The inclusion is a direct consequence of the triangle inequality. Let us prove the converse inclusion. Let y B(x, r+ε)\b(x, ε). Then there exists a length-parameterized curve γ connecting x with y such that l(γ) = t+ε where t < r. Let t ]t,r[; then γ(t ) B(x,r) and y B(γ(t ),ε). Theorem Let M be a sub-riemannian manifold. Then (M, d) is complete if and only if all sub-riemannian closed balls are compact. Proof. (i). Assume that all closed balls are compact and let {x j } be a Cauchy sequence in M. We have to prove that {x j } admits a convergent subsequence. By assumption, if we fix ε > there exists n N such that d(x j,x k ) < ε for all j,k n. Let us definer := max j n d(x j,x n )+ε >. By construction x j B(x n,r) for every j, and B(x n,r) has compact closure by assumption. Hence the sequence admits a converging sub-sequence. (ii). Assume now that (M,d) is complete. Fix x M and define A := {r > B(x,r) is compact}, R := supa. (11.21) Since the topology of (M,d) is locally compact then A and R >. First we prove that A is open and then we prove that R = +. Notice in particular that this proves that A =],+ [ since, by Remark 3.41, r A implies ],r[ A. (ii.a) It is enough to show that, if r A, then there exists δ > such that r +δ A. For each y B(x,r) there exists r(y) < ε small enough such that B(y,r(y)) is compact. We have B(x,r) y B(x,r) B(y,r(y)). By compactness of B(x,r) there exists a finitenumberof points {y i } N i=1 in B(x,r) such that (denote r i := r(y i )) N B(x,r) B(y i,r i ). i=1 Moreover, there exists δ > such that the set of points B(x,r+δ) = {y M dist(y,b(x,r)) δ}, where the equality is given by Lemma 11.45, satisfies B(x,r +δ) N i=1 B(y i,r i ). This proves that r+δ A, since a finite union of compact sets is compact. 299

300 (ii.b) Assume by contradiction that R < + and let us prove that B := B(x,R) is compact. Since B is a closed set, it is enough to show that it is totally bounded, i.e. it admits an ε-net 2 for every ε >. Fix ε > and consider an (ε/3)-net S for the ball B = B(x,R ε/3), that exists by compactness. By Lemma one has for every y B that dist(y,b ) < ε/3. Then it is easy to show that dist(y,s) < dist(y,b )+ε/3 < ε, that is S is an ε-net for B and B is compact. This shows that if R < +, then R A. Hence (ii.a) implies that R+δ A for some δ >, contradicting the fact that R is a sup. Hence R = +. The next result implies that the geodesic completeness of M, i.e. the completeness of H, implies the completeness of M as a metric space. Theorem (sub-riemannian Hopf-Rinow). Let M be a sub-riemannian manifold that does not admit abnormal length minimizers. If there exists a point x M such that the exponential map E x is defined on the whole Tx M, then M is complete with respect to the sub-riemannian distance. Proof. For the fixed x M, let us consider A = {r > B(x,r) is compact}, R := supa. As in the proof of Theorem 11.46, one can show that A and that A is open (by using the local compactness of the topology and repeating the proof of (ii.a)). Assume now that R < + and let us show that R A. By openness of A this will give a contradiction and A =],+ [. We have to show that B(x,R) is compact, i.e., for every sequence y i in B(x,R) we can extract a convergent subsequence. Define r i := d(y i,x). It is not restrictive to assume that r i R (if it is not the case, the sequence stays in a compact ball and the existence of a convergent subsequence is clear). Since the ball B(x,ri ) is compact, by Theorem 3.4 there exists a length minimizing trajectory γ i : [,r i ] M joining x and y i, parametrized by unit speed. Due to the completeness of the vector field H, we can extend each curve γ i, parametrized by length, to the common interval [, R]. By construction this sequence of trajectory is normal γ i (t) = E(tλ i ) = π e t H (λ i ), for some λ i T x M, and is contained in the compact set B(x,R). Since there is no abnormal minimizer, by Lemma the sequence {λ i } is bounded in T xm, thus there exists a subsequence λ in converging to λ. Then r in λ in Rλ and by continuity of E we have that {y i } has a convergent subsequence y in = γ in (r in ) = E(r in λ in ) E(Rλ) =: y To end the proof, one should just notice that an arbitrary Cauchy sequence in M is bounded, hence contained in a suitable ball centered at x, which is compact since R = +. Thus it admits a convergent subsequence. As an immediate corollary we have the following version of geodesic completeness theorem. Corollary Let M be a sub-riemannian manifold that does not admit abnormal length minimizers. If the vector field H is complete on T M, then M is complete with respect to the sub-riemannian distance. 2 an ε-net S for a set B in a metric space is a finite set of points S = {z i} N i=1 such that for every y B one has dist(y,s) < ε (or, equivalently, for every y B there exists i such that d(y,z i) < ε). 3

301 Chapter 12 Abnormal extremals and second variation In this chapter we are going to discuss in more details abnormal extremals and how the regularity of the sub-riemannian distance is affected by the presence of these extremals Second variation We want to introduce the notion of Hessian (and second derivative) for smooth maps between manifolds. We first discuss the case of the second differential of a map between linear spaces. Let F : V M be a smooth map from a linear space V on a smooth manifold M. As we know, the first differential of F at a point x V D x F : V T F(x) M, D x F(v) = d dt F(x+tv), v V, t= and is a well defined linear map independent on the linear structure on V. This is not the case for the second differential. Indeed it is easy to see that the second order derivative DxF(v) 2 = d2 dt 2 F(x+tv) (12.1) t= has not invariant meaning if D x F(v). Indeed in this case the curve γ : t F(x + tv) is a smooth curve in M with nonzero tangent vector. Then there exists some local coordinates on M such that the curve γ is a straight line. Hence the second derivative Dx 2 F(v) vanish in these coordinates. In general, the linear structure on V let us to define the second differential of F as a quadratic map DxF 2 : KerD x F T F(x) M (12.2) On the other hand the map (12.2) is not independent on the choice of the linear structure on V and this construction cannot be used if the source of F is a smooth manifold. Assume now that F : N M is a map between smooth manifolds. The first differential is a linear map between the tangent spaces D x F : T x N T F(x) M, x N. 31

302 and the definition of second order derivative should be modified using smooth curves with fixed tangent vector (that belong to the kernel of D x F): DxF(v) 2 = d2 dt 2 F(γ(t)), γ() = x, γ() = v KerD x F, (12.3) t= Computing in coordinates we find that d 2 dt 2 F(γ(t)) = d2 F df ( γ(), γ())+ γ() (12.4) t= dx2 dx that shows that term (12.4) is defined only up to ImD x F. Thus is intrinsically defined only a certain part of the second differential, which is called the Hessian of F, i.e. the quadratic map Hess x F : KerD x F T F(x) M/ImD x F 12.2 Abnormal extremals and regularity of the distance In the previuos chapter we proved that if we have abnormal minimizer that reach some point q, then the sub-riemannian distance is not smooth at q. If we also have that no normal minimizers reach q we can say that it is not even Lipschitz. Proposition Assume that there are no normal minimizers that join q to q. Then f is not Lipschitz in a neighborhood of q. Moreover lim d q f = +. (12.5) q q q Σ In the previous theorem is an arbitrary norm of the fibers of T M. Proof. Consider a sequence of smooth points q n Σ such that q n q. Since q n are smooth we know that there exists unique controls u n and covectors λ n such that λ n D un F = u n, λ n = d qn f. Assume by contradiction that d qn f M then, using compactness we find that u n u, λ n λ with λd u F = u, that means that the associate geodesic reach q. In other words, there exists a normal minimizer that goes at q, that is a contradiction. Let us now consider the end-point map F : U M. As we explained in the previous section, its Hessian at a point u U is the quadratic vector function Hess u F : KerD u F CokerD u F = T F(u) M/ImD u F. Remark Recall that λd u F = if and only if λ (ImD u F). In other words, for every abnormal extremal there is a well defined scalar quadratic form λhess u F : KerD u F R Notice that the dimension of the space ImD u F of such covectors coincide with dimcokerd u F. 32

303 Definition Let Q : V R be a quadratic form defined on a vector space V. The index of Q is the maximal dimension of a negative subspace of Q: ind Q = sup{dimw Q W\{} < }. (12.6) Recall that in the finite-dimensional case this number coincide with the number of negative eigenvalues in the diagonal form of Q. The following notion of index of the map F will be also useful: Definition Let F : U M and u U be a critical point for F. The index of F at u is Ind u F = min λ ImD uf λ ind (λhess u F) codimimd u F Remark If codimimd u F = 1, then there exists a unique (up to scalar multiplication) non zero λ ImD u F, hence Ind u F = ind (λhess u F) 1. Theorem If Ind u F 1, then u is not a strictly abnormal minimizer. We state without proof the following result (see Lemma 2.8 of [AS4]) Lemma Let Q : R N R n be a vector valued quadratic form. Assume that Ind Q. Then there exists a regular point x R n of Q such that Q(x) =. Definition Let Φ : E R n be a smooth map defined on a linear space E and r >. We say that Φ is r-solid at a point x E if there exists a constant C >, ε > and a neighborhood U of x such that for all ε < ε there exists δ(ε) > satisfying for all maps Φ C (E,R n ) such that Φ Φ C (U,R n ) < δ. B Φ(x) (Cεr ) Φ(B x (ε)), (12.7) Exercise Prove that if x is a regular point of Φ, then Φ is 1-solid at x. (Hint: Use implicit function theorem to prove that Φ satisfies (12.7) and Brower theorem to show that the same holds for some small perturbation) Proposition Assume that Ind x Φ. Then Φ is 2-solid at x. Proof. We can assume that x = and that Φ() =. We divide the proof in two steps: first we prove that there exists a finite dimensional subspace E E such that the restriction Φ E satisfies the assumptions of the theorem. Then we prove the proposition under the assumption that dime < +. (i). Denote k := dimcokerd Φ and consider the Hessian Hess Φ : KerD Φ CokerD Φ We can rewrite the assumption on the index of Φ as follows ind λhess Φ k, λ ImD Φ \{}. (12.8) 33

304 Since property (12.8) is invariant by multiplication of the covector by a positive scalar we are reduced to the sphere λ S k 1 = {λ ImD Φ, λ = 1}. By definition of index, for every λ S k 1, there exists a subspace E λ E, dime λ = k such that λhess u Φ Eλ \{} < By the continuity of the form with respect to λ, there exists a neighborhood O λ of λ such that E λ = E λ for every λ O λ. By compactness we can choose a finite covering of S k 1 made by open subsets S k 1 = O λ1... O λn Then it is sufficient to consider the finitedimensional subspace E = (ii). Assume dime < and split N j=1 E λj E = E 1 E 2 E 2 := KerD Φ The Hessian is a map Hess Φ : E 2 R n /D Φ(E 1 ) According to Lemma 12.7 there exists e 2 E 2, regular point of Hess Φ, such that Hess Φ(e 2 ) = = D 2 Φ(e 2) = D Φ(e 1 ), for some e 1 E 1. Define the map Q : E R n by the formula Q(v 1 +v 2 ) := D Φ(v 1 )+ 1 2 D2 Φ(v 2 ), v = v 1 +v 2 E = E 1 E 2. and the vector e := e 1 /2+e 2. From our assumptions it follows that e is a regular point of Q and Q(e) =. In particular there exists c > such that B (c) Q(B (1)) and the same holds for some perturbation of the map Q (see Exercice 12.9). Consider then the map Φ ε : v 1 +v 2 1 ε 2Φ(ε2 v 1 +εv 2 ) (12.9) Using that v 2 KerD Φ we compute the Taylor expansion with respect to ε Φ ε (v 1 +v 2 ) = Q(v 1 +v 2 )+O(ε) (12.1) hence for small ε the image of Φ ε contain a ball around from which it follows that B φ() (cε 2 ) Φ(B (ε)) (12.11) Moreover as soon as ε is fixed we can perturb the map Φ and still the estimate (12.11) holds. 34

305 Actually we proved the following statement, that is stronger than 2-solideness of Φ: Lemma Under the assumptions of the Theorem 12.1, there exists C > such that for every ε small enough B Φ() (Cε 2 ) Φ(B (ε2 ) B (ε)) (12.12) where B and B denotes the balls in E 1 and E 2 respectively. The key point is that, in the subspace where the differential of Φ vanish, the ball of radius ε is mapped into a ball of radius ε 2, while the restriction on the other subspace preserves the order, as the estimates (12.9) and (12.1) show. 1 Proof of Theorem We prove that if Ind u F 1, where u is a strictly abnormal geodesic, then u cannot be a minimizer. It is sufficient to show that the extended endpoint map ( ) J(u) Φ : U R M, Φ(u) =, F(u) is locally open at u. Recall that d u J = λd u F, for some λ T F(u) M, if and only if d u J KerDuF = (see also Proposition 8.11). Since u is strictly abnormal, it follows that d u J KerDuF. (12.13) Moreover from the definition of Φ and (12.13) one has KerD u Φ = Kerd u J KerD u F, dimimd u J = 1. Moreover, a covector λ = (α,λ) in R T F(u) M annihilates the image of D uφ if and only if α = and λ ImD u F, indeed if = λd u Φ = αd u J +λd u F with α, this would imply that u is also normal. In other words we proved the equality ImD u Φ = {(,λ) R T F(u) M λ ImD uf } (12.14) Combining (12.13) and (12.14) one obtains for every λ = (,λ) ImD u Φ λhess u Φ = λhess u F KerduJ KerD uf (12.15) Moreover codimimd u Φ = codimimd u F since dimimd u Φ = dimimd u F+1 by (12.13) and D u Φ takes values in R T F(u) M. Then for every λ = (,λ) ImD u Φ ind ( λhess u Φ) codimimd u Φ = ind (λhess u F KerduJ KerD uf ) codimimd uf and passing to the infimum with respect to λ we get ind (λhess u F) 1 codimimd u F Ind u Φ Ind u F 1. By Proposition 12.1 this implies that Φ is locally open at u. Hence u cannot be a minimizer. 1 B (c) Φ ε(b(1)) B (cε 2 ) Φ(ε 2 v 1 +εv 2),v i B i (1) B (cε 2 ) Φ(B ε 2 B ε) 35

306 Now we prove that, under the same assumptions on the index of the endpoint map given in Theorem 12.6, the sub-riemannian is Lipschitz even if some abnormal minimizers are present. Theorem Let K B q (r ) be a compact and assume that Ind u F 1 for every abnormal minimizer u such that F(u) K. Then f is Lipschitz on K. Proof. Recall that if there are no abnormal minimizers reaching K, Theorem ensures that f is Lipschitz on K. Then, using compactness of the set of all minimizers, it is sufficient to prove the estimate in neighborhood of a point q = F(u), where u is abnormal. Since Ind u F 1 by assumption, Theorem 12.6 implies that every abnormal minimizer u is not strictly abnormal, i.e., has also a normal lift. We have Hess u F : KerD u F CokerD u F, with Ind u F 1. and, since u is also normal, it follows that d u J = λd u F for some λ T F(u) M, hence KerD uf Kerd u J. The assumption of Lemma are satisfied, hence splitting the the space of controls L 2 k ([,1]) = E 1 E 2, E 2 := KerD u F we have that there exists C > and R > such that for ε < R we have B q (C ε 2 ) F(B ε ), B ε := B u(ε 2 ) B u(ε), q = F(u), (12.16) where B u (r) and B u (r) are the ball of radius r in E 1 and E 2 respectively, and B q (r) is the ball of radius r in coordinates on M. Let us also observe that, since J is smooth on B u (ε2 ) B u (ε), with d uj = on E 2, by Taylor expansion we can find constants C 1,C 2 > such that for every u = (u 1,u 2 ) B ε one has (we write u = (u 1,u 2 )) J(u ) J(u) C 1 u 1 u 1 +C 2 u 2 u 2 2 Pick then any point q K such that q q = C ε 2, with ε < R. Then (12.16) implies that there exists u = (u 1,u 2 ) B ε such that F(u ) = q. Using that f(q ) J(u ) and f(q) = J(u), since u is a minimizer, we have f(q ) f(q) J(u ) J(u) C 1 u 1 u 1 +C 2 u 2 u 2 2 (12.17) Cε 2 = C q q (12.18) where we can choose C = max{c 1,C 2 } and C = C/C. Since K is compact, and the set of control u associated with minimizers that reach the compact set K is also compact, the constants R > and C,C 1,C 2 can be chosen uniformly with respect to q K. Hence we can exchange the role of q and q in the above reasoning and get f(q ) f(q) C q q, for every pair of points q,q such that q q C R 2. 36

307 12.3 Goh and generalized Legendre conditions In this section we present some necessary conditions for the index of the quadratic form along an abnormal extremal to be finite. Theorem Let u be an abnormal minimizer and let λ 1 T F(u) M satisfy λ 1D u F =. Assume that ind λ 1 Hess u F < +. Then the following condition are satisfied : (i) λ(t),[f i,f j ](γ(t)), for a.e. t, i,j = 1,...,k, (ii) λ(t),[[f u(t),f v ],f v ](γ(t)), for a.e. t, v R k, (Goh condition) (Generalized Legendre condition) where λ(t) and γ(t) = π(λ(t)) are respectively the extremal and the trajectory associated to λ 1. Remark Notice that, in the statement of the previous theorem, if λ 1 satisfies the assumption λ 1 D u F =, then also λ 1 satisfies the same assumptions. Since ind ( λ 1 Hess u F) = ind + λ 1 Hess u F this implies that the statement holds under the assumption ind + λ 1 Hess u F < +. Indeed the proof shows that as soon as the Goh condition is not satisfied, both the positive and the negative index of this form are infinity. Notice that these condition are related to the properties of the distribution of the sub-riemannian structure and not to the metric. Indeed recall that the extremal λ(t) is abnormal if and only if it satisfies k λ(t) = u i (t) h i (λ(t)), λ(t),f i (γ(t)) =, i = 1,...,k, i=1 i.e. λ(t) satifies the Hamiltonian equation and belongs to Dγ(t). Goh condition are equivalent to require that λ(t) (Dγ(t) 2 ). Corollary Assume that the sub-riemannian structure is 2-generating, i.e. D 2 q = T qm for all q M. Then there are no strictly abnormal minimizers. In particular f is locally Lipschitz on M. Proof. Since D 2 q = T q M implies (D 2 γ(t) ) = for every q M, no abnormal extremal can satisfy the Goh condition. Hence by Theorem it follows that Ind u F = +, for any abnormal minimizer u. In particular, from Theorem 12.6 it follows that the minimizer cannot be strictly abnormal Hence f is globally Lipschitz by Theorem Remark Notice that f is locally Lipschitz on M if and only if the sub-riemannian structureis 2-generating. Indeed if the structure is not 2-generating at a point q, then from Ball-Box Theorem (Corollary 1.53) it follows that the squared distance f is not Lipschitz at the base point q. On the other hand, on the set where f is positive, we have that f is Lipschitz if and only if the sub-riemannian distance d(q, ) is. Before going into the proof of the Goh conditions (Theorem 12.13) we discuss an important corollary. 37

308 Theorem Assume that D q D 2 q. Then for every ε > there exists a normal extremal path γ starting from q such that l(γ) = ε and γ is not a length-minimizer. Before the proof, this is the idea: fix an element ξ D q \ (D 2 q ) which is non empty by assumptions. We want to build an abnormal minimizing trajectory that has ξ as initial covector and that is the limit of a sequence of stricly normal lenth-minimizers. In this way this abnormal will have finite index (the abnormal quadratic form will be the limit of positive ones) and then by Goh condition ξ D 2 q =, which is a contradiction. Proof. Assume by contradiction that there exists T > such that all normal extremal paths γ λ associated with initial covector λ H 1 (1/2) T q M minimize on the segment [,T]. Since restriction of length-minimizers are still length-minimizers, by suitably reducing T >, we can assume, thanks to Lemma 3.34, that there exists 2 a compact set K such that {γ λ (T) λ H 1 (1/2)} K. Fix an element ξ D q \(D 2 q ), which is non empty by assumptions. Then consider, given any λ H 1 (1/2) T q M, thefamilyofnormalextremalpaths(andcorrespondingnormaltrajectories) λ s (t) = e t H (λ +sξ), γ s (t) = π(λ s (t)), t [,T]. andlet u s bethecontrol associated withγ s, anddefinedon[,t]. DuetoTheorem 11.9, thereexists a positive sequence s n + such that q n := γ sn (T) is a smooth point for the squared distance from q, for every n N. By compactness of minimizers reaching K, there exists a subsequence of s n, that we still denote by the same symbol, and a minimizing control ū such that u sn ū, when n. In particular γ sn is a strictly normal length-minimizer for every n N. Denote Φ n t = P usn,t the non autonomous flow generated by the control u sn. The family λ sn (t) satisfies λ sn (t) = e t H (λ +s n ξ) = (Φ n t ) (λ +s n ξ). Moreover, by continuity of the flow with respect to convergence of controls, we have that Φ n t Φ t for n, where Φ t denotes the flow associated with the control ū. Hence we have that the rescaled family ( ) 1 1 λ sn (t) = (Φ n s t) λ +ξ n s n converges for n to the limit extremal λ(t) = Φ tξ. Notice that λ(t) is, by construction, an abnormal extremal associated to the minimizing control ū, and with initial covector ξ. The fact that u sn is a strictly normal minimizer says that the Hessian of the energy J restricted to the level set F 1 (q n ) is non negative. Recall that Hess u J F 1 (q) = I λ 1 D 2 uf, where λ 1 T F(u) M is the final covector of the extremal lift. In particular we have for every n N and every control v the following inequality v 2 λ sn (T)D 2 u sn F(v,v). This immediately implies 1 s n v 2 1 s n λ sn (T)D 2 u sn F(v,v), 2 indeed it is enough to fix an arbitrary compact K with q int(k) such that the corresponding δ K defined by Lemma 3.34 is smaller than T. 38

309 and passing to the limit for n one gets In particular one has that λ(t)d 2 ūf(v,v). ind + λ(t)hessūf = ind ( λ(t)d 2 ūf) =. Hence the abnormal extremal has finite (positive) index and we can apply Goh conditions (see Theorem and Remark 12.14). Thus ξ is orthogonal to D 2 q, which is a contradiction since ξ D q \(D 2 q ). Remark (About the assumptions of Theorem 12.17). Assume that the sub-riemannian structure is bracket-generating and is not Riemannian in an open set O M, i.e., D q T q M for every q O. Then there exists a dense set D O such that D q D 2 q for every q D. IndeedassumethatD q D 2 q forallq inanopenseta, thenitiseasytoseethatd i q = D q T q M for all q A, since the structure is not Riemannian. Hence the structure is not bracket-generating in A, which gives a contradiction Proof of Goh condition - (i) of Theorem Proof of Theorem Denote by u the abnormal control and by P t = exp t f u(s)ds the nonautonomous flow generated by u. Following the argument used in the proof of Proposition 8.4 we can write the end-point map as the composition E(u+v) = P 1 (G(v)), D u E = P 1 D G, and reduced the problem to the expansion of G, which is easier. Indeed denoting gi t map G can be interpreted as the end-point map for the system := P 1 t f i, the k q(t) = gv(t) t (q(t)) = v i (t)gi t (q(t)) and the Hessian of F can be computed easily starting from the Hessian of G at v = Hess u F = P 1 Hess G from which we get, using that λ = P1 λ 1, λ 1 Hess u F = λ 1 P 1 Hess G = λ Hess G i=1 Moreover computing λ(t),[f i,f j ](γ(t)) = λ,p 1 t [f i,f j ](γ(t)) = λ,[g t i,gt j ](γ()) the Goh and generalized Legendre conditions can also be rewritten as λ,[g t i,g t j]γ(), for a.e. t [,1], i,j = 1,...,k, (G.1) λ,[[g t u(t),gt i],g t i]](γ()), for a.e. t [,1], i = 1,...,k. (L.1) 39

310 Now we want to compute the Hessian of the map G. Using the Volterra expansion computed in Chapter 6 we have 1 G(v( )) q Id+ gv(t) t dt+ gv(τ) τ gt v(t) dτdt +O( v 3 ) τ t 1 where we used that gv t is linear with respect to v to estimate the remainder. This expansion let us to recover immediately the linear part, i.e. the expressions for the first differential, which can be interpreted geometrically as the integral mean D G(v) = 1 g t v(t) (q )dt, On the other hand the expression for the quadratic part, i.e. the second differential DG(v) 2 = 2q gv(τ) τ gt v(t) dτdt. τ t 1 has not an immediate geometrical interpretation. Recall that the second differential D 2 G is defined on the set 1 KerD G = {v L 2 k [,1], gv(t) t (q )dt = } (12.19) and, for such a v, D 2G(v) belong to the tangent space T q M. Indeed, using Lemma 8.27, and that v belong to the set (12.19), we can symmetrize the second derivative, getting the formula D 2 G(v) = [gv(τ) τ,gt v(t) ](q )dτdt, τ t 1 which shows that the second differential is computed by the integral mean of the commutator of the vector field gv(t) t for different times. Now consider an element λ ImD G, i.e. that satisfies λ,gv(q t ) =, for a.e. t [,1], v R k. Then we can compute the Hessian λ Hess G(v) = λ,[gv(τ) τ,gt v(t) ](q ) dτdt (12.2) τ t 1 Remark Denoting by K the bilinear form K(τ,t)(v,w) = λ,[gv τ,gt w ](q ), the Goh and generalized Legendre conditions are rewritten as follows K(t,t)(v,w) =, v,w R k, for a.e. t [,1], (G.2) K τ (τ,t) (v,v), τ=t v R k, for a.e. t [,1]. (L.2) 31

311 Indeed, the first one easily follows from (G.1). Moreover recall that gv t = P 1 t f v, hence the map t gv t is Lipschitz for every fixed v. By definition of P t = exp t f u(t)dt it follows that which shows that (L.2) is equivalent to (L.1). t gt v = [g t u(t),gt v] Finally we want to express the Hessian of G in Hamiltonian terms. To this end, consider the family of functions on T M which are linear on fibers, associated to the vector fields g t v : h t v(λ) := λ,g t v(q), λ T M, q = π(λ). and define, for a fixed element λ ImD G : Using the identities η t v := h t v(λ ) T λ T M (12.21) σ λ ( h t v, h t w) = {h t v,h t w}(λ) = λ,[g t v,g t w](q), q = π(λ) and computing at the point λ T q M we find σ λ (η t v,ηt w ) = λ,[g t v,gt w ](q ) and we get the final expression for the Hessian λ Hess G(v( )) = τ t 1 σ λ (ηv(τ) τ,ηt v(t) )dtdτ. (12.22) where the control v KerD G satisfies the relation (notice that π η t v = g t v(q )) 1 1 π ηv(t) t dt = π ηv(t) t dt = Moreover the Hamiltonian version of Goh and Legendre conditions is expressed as follows: σ λ (η t v,η t w) =, v,w R k, for a.e. t [,1], (G.3) σ λ ( η t v,η t v), v R k, for a.e. t [,1]. (L.3) We are reduced to prove, under the assumption ind λ Hess G < +, that (G.3) and (L.3) hold. Actually we will prove that Goh and generalized Legendre conditions are necessary conditions for the restriction of the quadratic form to the subspace of controls in KerD G that are concentrated on small segments [t,t+s]. In what follows we fix once for all t [,1[. Consider an arbitrary vector control function v : [,1] R k with compact support in [,1] and build, for s > small enough, the control ( ) τ t v s (τ) = v, suppv s [t,t+s]. (12.23) s 311

312 The idea is to apply the Hessian to this particular control functions and then compute the asymptotics for s. indice finito allora e finito anche qui sopra. Actually, since the index of a quadratic form is finite if and only if the same holds for the restriction of the quadratic form to a subspace of finite codimension, it is not restrictive to restrict also to the subspace of zero average controls E s := {v s KerD G, v s defined by (12.23), 1 v(τ)dτ = }. Notice that this space depend on the choice of t, while codime s does not depend on s. Remark We will use the following identity (writing σ for σ λ ), which holds for arbitrary control functions v,w : [,1] R k β t β β σ(ηv(τ) τ,ηt w(t) )dtdτ = σ( ηv(τ) τ dτ,ηt w(t) )dt = σ(ηv(τ) τ, ηw(t) t dt)dτ. (12.24) α τ t β α α For the specific choice w(t) = t v(τ)dτ we have also the integration by parts formula β α η t v(t) dt = ηβ w(β) ηα w(α) β α α τ η w(t) t dt. (12.25) Combining (12.22) and (12.24), we rewrite the Hessian applied to v s as follows λ Hess G(v s ( )) = t+s t σ( τ t ηv θ s(θ) dθ,ητ v s(τ) )dτ. (12.26) Notice that the control v s is concentrated on the segment [t,t + s], thus we have restricted the extrema of the integral. The integration by parts formula (12.25), using our boundary conditions, gives τ τ ηv θ dθ = s(θ) ητ w s(τ) η w θ s(θ) dθ. (12.27) where we defined t w s (θ) = Combining (12.26) and (12.27) one has λ Hess G(v s ( )) = = t+s t t+s t θ t v s (τ)dτ, σ(η τ w s(τ),ητ v s(τ) )dτ t+s t t θ [t,t+s]. τ σ( t t+s t+s σ(ηw τ s(τ),ητ v )dτ s(τ) σ( η w τ, s(τ) t η θ w s(θ) dθ,ητ v s(τ) )dτ τ ηv θ s(θ) dθ)dτ (12.28) where the second equality uses (12.24). Next consider the second term in (12.28) and apply again the integration by part formula (recall that w s (t+s) = ) t+s t t+s t+s σ( η w τ, s(τ) ηv θ dθ)dτ = s(θ) σ( η w τ s(τ),ητ w )dτ s(τ) τ t 312 t+s t t+s σ( η w τ, s(τ) η w θ dθ)dτ. s(θ) τ

313 Collecting together all these results one obtains λ Hess G(v s ( )) = t+s t σ(η τ w s(τ),ητ v s(τ) )dτ + t+s t σ( η τ w s(τ),ητ w s(τ) )dτ + t+s t t+s σ( η w τ, s(τ) η w θ dθ)dτ s(θ) This is indeed a homogeneous decomposition of λ Hess G(v s ( )) with respect to s, in the following sense. Since ( ) θ t w s (θ) = sw, s we can perform the change of variable ζ = τ t, τ [t,t+s], s and obtain the following expression for the Hessian: λ Hess G(v s ( )) = s 2 1 σ(η t+sθ w(θ),ηt+sθ v(θ) )dθ +s 3 1 σ( η t+sθ w(θ),ηt+sθ w(θ) +s 4 1 τ )dθ (12.29) σ( η t+sθ w(θ), 1 θ η t+sζ w(ζ) dζ)dθ We recall that here v s is defined through a control v compactly supported in [,1] by (12.23) and w is the primitive of v, that is also compactly supported on [,1]. In particular we can write 1 λ Hess G(v s ( )) = s 2 σ(ηw(θ) t,ηt v(θ) )dθ +O(s3 ). (12.3) By assumption ind λ Hess G < +. This implies that the quadratic form given by its principal part w( ) 1 σ(ηw(θ) t,ηt ẇ(θ) )dθ, (12.31) has also finite index. Indeed, assume that (12.31) has infinite negative index. Then by continuity every sufficiently small perturbation of (12.31) would have infinite index too. Hence, for s small enough, the quadratic form λ Hess G would also have infinite index, contradicting our assumption on (12.3). To prove Goh condition, it is then sufficient to show that if (12.31) has finite index then the integrand is zero, which is guaranteed by the following Lemma Let A : R k R k R be a skew-symmetric bilinear form and define the qudratic form Q : U R, Q(w( )) = 1 A(w(t),ẇ(t))dt, where U := {w( ) Lip[,1],w() = w(1) = }. Then ind Q < + if and only if A. 313

314 Proof. Clearly if A =, then Q = and ind Q =. Assume then that A and we prove that ind Q = +. We divide the proof into steps (i). The bilinear form B : U U R defined by B(w 1 ( ),w 2 ( )) = 1 A(w 1 (t),ẇ 2 (t))dt is symmetric. Indeed, integrating by parts and using the boundary conditions we get B(w 1,w 2 ) = 1 = = 1 1 A(w 1 (t),ẇ 2 (t))dt A(ẇ 1 (t),w 2 (t))dt A(w 2 (t),ẇ 1 (t))dt = B(w 2,w 1 ) (ii). Q is not identically zero. Since Q is the quadratic form associated to B and from the polarization formula B(w 1,w 2 ) = 1 4 (Q(w 1 +w 2 ) Q(w 1 w 2 )) it easily follows that Q if and only if B. Then it is sufficient to prove that B is not zero. Assume that there exists x,y R k such that A(x,y), and consider a smooth nonconstant function α : R R, s.t. α() = α(1) = α() = α(1) =. Then α(t)z,α(t)z U for every z R k and we can compute B( α( )x,α( )y) = 1 = A(x,y) A( α(t)x, α(t)y)dt 1 α(t) 2 dt. (iii). Q has the same number of positive and negative eigenvalues. Indeed it is easy to see that Q satisfies the identity Q(w(1 )) = Q(w( )) from which (iii) follows. (iv). Q is non zero on a infinite dimensional subspace. Consider some w U such that Q(w) = α. For every x = (x 1,...,x N ) R N one can built the function w x (t) = x i w(nt i), t [ i N, i+1 N ], i = 1,...,N. An easy computations shows that Q(w x ) = α In particular there exists a subspace of arbitrary large dimension where Q is nondegenerate. 314 N i=1 x 2 i

315 Proof of generalized Legendre condition - (ii) of Theorem Applying Lemma for any t we prove that the s 2 order term in (12.29) vanish and we get to λ Hess G(v( )) = s 3 1 = s 3 1 σ( η t+sθ w(θ),ηt+sθ w(θ) )dθ +O(s4 ) σ( η t+sθ w(θ),ηt w(θ) )dθ +O(s4 ) where the last equalily follows from the fact that ηv t is Lipschitz with respect to t (see also (12.21)), i.e. = ηv t +O(s) η t+sθ v On the other hand η v t is only measurable bounded, but the Lebesgue points of u are the same of η. In particular if t is a Lebesgue point of η, the quantity η w( ) t is well defined and we can write 1 λ Hess G(v( )) = s 3 σ( η w(θ) t,ηt w(θ) )dθ s 3 ( 1 σ( η t+sθ w(θ),ηt w(θ) ) σ( ηt w(θ),ηt w(θ) )dθ )+O(s 4 ) Using the linearity of σ and the boundedness of the vector fields we can estimate 1 σ( η t+sθ w(θ),ηt w(θ) ) σ( ηt w(θ),ηt w(θ) )dθ C 1 η t+sθ s C sup v 1 1 s w(θ) ηt w(θ) dθ η t+τ v η t v dτ s where the last term tends to zero by definition of Lebesgue point. Hence we come to 1 λ Hess G(v( )) = s 3 σ( η w(θ) t,ηt w(θ) )dθ +o(s3 ) (12.32) To prove the generalized Legendre condition we have to prove that the integrand is a non negative quadratic form. This follows from the following Lemma, which can be proved similarly to Lemma Lemma Let Q : R k R be a quadratic form on R k and The quadratic form U := {w( ) Lip[,1],w() = w(1) = }. Q : U R, Q(w( )) = has finite index if and only if Q is non negative Q(w(t))dt

316 More on Goh and generalized Legendre conditions If Goh condition is satisfied, the generalized Legendre condition can also be characterized as an intrinsic property of the module. Indeed one can see that the quadratic map U γ(t) R, v λ(t),[[f u(t),f v ],f v ](γ(t)) is well defined and does not depend on the extension of f v to a vector field f v(t) on U. Notice that, using the notation h v (λ) = λ,f v (q) an abnormal extremal satisfies h v (λ t ), v R k Recalling that the Poisson bracket between linear functions on T M is computed by the Lie bracket we can rewrite the Goh condition as follows while strong Legendre conditions reads {h v,h w }(λ) = λ,[f v,f w ](q) Taking derivative of (12.33) with respect to t we find {h v,h w }(λ(t)), v,w R k (12.33) {{h u(t),h v },h v }, v R k (12.34) {h u(t),{h v,h w }}(λ(t)), v,w R k and using Jacobi identity of the Poisson bracket we get that the bilinear form (v,w) {{h u(t),h v },h w }(λ) (12.35) is symmetric. Hence the generalized Legendre condition says that the quadratic form associated to (12.35) is nonnegative. Now we want to characterize the trajectories that satisfy these conditions. Recall that, if λ(t) is an abnormal geodesic, we have λ(t) = h u(t) (λ(t)), h i (λ(t)), t 1. (12.36) where h u(t) = k i=1 u i(t) h i (t). Moreover for any smooth function a : T M R d dt a(λ(t)) = {h u(t),a}(λ(t)) = k u i (t){h i,a}(λ(t)) i=1 Notation. We will denote the iterated Poisson brackets h i1...i k (λ) = {h i1,...,{h ik 1,h ik }}(λ) (12.37) = λ,[f i1,...,[f ik 1,f ik ]](q), q = π(λ) (12.38) 316

317 Differentiating the identities in (12.36), using (12.37), we get h i (λ(t)) = k u j (t)h ji (λ(t)) =, t. (12.39) j=1 If k is odd we always have a nontrivial solution of the system, if k is even is possible only for those λ that satisfy det{h ij (λ)} =. But we want to characterize only those controls that satisfy Goh conditions, i.e. such that h ij (λ(t)). (12.4) Hence you cannot recover the control u from the linear system (12.39). We differentiate again equations (12.4) and we find k u l (t)h lij (λ(t)). (12.41) l=1 For every fixed t, these are k(k 1)/2 equations in k variables u 1,...,u k. Hence (i) If k = 2, we have 1 equation in 2 variables and we can recover the control u 1,u 2 up to a scalar mutilplier, if at least one of the coefficients does not vanish. Since we can always deal with lengh-parametrized curve this uniquely determine the control u. (ii) If k 3, we have that the system is overdetermined. Remark For generic systems it is proved that, when k 3, Goh conditions are not satisfied. On the other hand, in the case of Carnot groups, for big codimension of the distribution, abnormal minimizers always appear Rank 2 distributions and nice abnormal extremals Consider a rank 2 distribution generated by a local frame f 1,f 2 and let h 1,h 2 be the associated linear Hamiltonian. An abnormal extremal λ(t) associated with a control u(t) satisfies the system of equations λ(t) = u 1 (t) h 1 (λ(t))+u 2 (t) h 2 (λ(t)), h 1 (λ(t)) = h 2 (λ(t)) =. (12.42) Define the linear Hamiltonian associated with the h 12 (λ(t)) = λ,[f 1,f 2 ](q). Notice that in this special framework the Goh condition is rewritten as h 12 (λ(t)) = for a.e. t. Equivalently, every abnormal extremal satisfies Goh conditions if and only if λ(t) (D 2 ). Lemma Every nontrivial abnormal extremal on a rank 2 sub-riemannian structure satisfies the Goh condition. Proof. Indeed differentiating the identity (12.42) one gets (we omit t in the notation for simplicity) u 2 {h 2,h 1 } = u 2 h 21 (λ) =, u 1 {h 1,h 2 } = u 1 h 21 (λ) =, Since at least one among u 1 and u 2 is not identically zero, we have that h 12 (λ(t)), that is Goh condition. 317

318 From now on we focus on a special class of abnormal extremals. Definition An abnormal extremal λ(t) is called nice abnormal if, for every t [,1], it satisfies λ(t) (D 2 ) \(D 3 ). Remark Assume that λ(t) is a nice abnormal extremal. The system (12.41) obtained by differentiating twice the equations (12.42) reads u 1 h 112 (λ) = u 2 h 221 (λ). (12.43) Under our assumption, at least one coefficient in (12.43) is nonzero and we can uniquely recover the control u = (u 1,u 2 ) up to a scalar as follows u 1 (t) = h 221 (λ(t)), u 2 (t) = h 112 (λ(t)). (12.44) If we plug this control into the original equation we find that λ(t) is a solution of Let us now introduce the quadratic Hamitonian λ = h 221 (λ) h 1 (λ)+h 112 (λ) h 2 (λ). (12.45) H = h 221 h 1 +h 112 h 2. (12.46) Theorem Any abnormal extremal belong to (D 2 ). Moreover we have that λ(t) (D 2 ) \ (D 3 ) for all t [,1] if and only if λ(t) satisfies with initial condition λ (D 2 q) \(D 3 q). λ(t) = H (λ(t)) (12.47) Remark Notice that, as soon as n > 3, the set (D 2 q) \(D 3 q) is nonempty for an open dense set of q M. Indeed assume that we have D 2 q = D 3 q for any q in a open neighborhood O q of a point q in M. Then it follows that D 2 q = D 3 q = D 4 q =... and the structure cannot be bracket generating, since dimd i q < dimm for every i > 1. The case n = 3 will be treated separately. Proof. Using that any abnormal extremal belong to the subset {h 1 (λ(t)) = h 2 (λ(t)) = }, it is easy to show that an abnormal extremal λ(t) satisfies (12.45) if and only if it is an integral curve of the Hamiltonian vector field H. It remains to prove that a solution of the system λ(t) = H(λ(t)), λ (D 2 ) \(D 3 ), (12.48) satisfies λ(t) (D 2 ) \ (D 3 ) for every t. First notice that the solution cannot intersect the set (D 3 ) sincetheseareequilibriumpointsofthesystem(12.48)(sinceat thesepointsthehamiltonian has a root of order two). 318

319 We are reduced to prove that (D 2 ) is an invariant subset for H. Hence we prove that the functions h 1,h 2,h 12 are constantly zero when computed on the extremal. To do this we find the differential equation satisfied by these Hamiltonians. Recall that, for any smooth function a : T M R and any solution of the Hamiltonian system λ(t) = e t H λ, we have ȧ = {H,a}. Hence we get ḣ 12 = {h 221 h 1 +h 112 h 2,h 12 } = {h 221,h 12 }h 1 +{h 112,h 12 }h 2 +h 112 h 221 +h 212 h }{{ 112 } = c = 1 h 1 +c 2 h 2 for some smooth coefficients c 1 and c 2. We see that there exists smooth functions a 1,a 2,a 12 and b 1,b 2,b 12 such that ḣ 1 = a 1 h 1 +a 2 h 2 +a 12 h 12 ḣ 2 = b 1 h 1 +b 2 h 2 +b 12 h 12 (12.49) ḣ 12 = c 1 h 1 +c 2 h 2 If we plug the solution λ(t) into the equation of (12.48), i.e. if we consider it as a system of differential equations for the scalar functions h i (t) := h i (λ(t)), with variable coefficients a i (λ(t)),b i (λ(t)), c i (λ(t)), we find that h 1 (t),h 2 (t),h 12 (t) satisfy a nonautonomous homogeneous linear system of differential equation with zero initial condition, since λ (D 2 ), i.e. h 1 (λ ) = h 2 (λ ) = h 12 (λ ) =. (12.5) Hence h 1 (λ(t)) = h 2 (λ(t)) = h 12 (λ(t)) =, t. We also can prove easily that nice abnormals satisfy the generalized Legendre condition. Recall that if λ(t) is an abnormal extremal, then λ(t) is also an abnormal extremal. Lemma Let λ(t) be a nice abnormal. Then λ(t) or λ(t) satisfy the generalized Legendre condition. Proof. Let u(t) be the control associated with the extremal λ(t). It is sufficient to prove that the quadratic form Q t : v λ(t),[[f u(t),f v ],f v ], v R 2 (12.51) is non negative definite. We already proved (cf.??) that the bilinear form B t : (v,w) λ(t),[[f u(t),f v ],f w ], v,w R 2 (12.52) is symmetric. From (12.52) it is easy to see that u(t) KerB t for every t. Hence Q t is degenerate for every t. On the other hand if the quadratic form is identically zero we have λ(t) (D 3 ), which is a contradiction. Hence the quadratic form has rank 1 and is semi-definite and we can choose ±λ in such a way that (12.51) is positive at t =. Since the sign of the quadratic form does not change along the curve (it is continuous and it cannot vanish) we have that it is positive for all t. 319

320 12.5 Optimality of nice abnormal in rank 2 structures Up to now we proved that every nice abnormal extremal in a rank 2 sub-riemannian structure automatically satisfies the necessary condition for optimality. Now we prove that actually they are strict local minimizers. Theorem Let λ(t) be a nice abnormal extremal and let γ(t) be corresponding abnormal trajectory. Then there exists s > such that γ [,s] is a strict local length minimizer in the L 2 - topology for the controls (equivalently the H 1 -topology for trajectories). Remark Notice that this property of γ does not depend on the metric but only on the distribution. In particular the value of s will be independent on the metric structure defined on the distribution. It follows that, as soon as the metric is fixed, small pieces of nice abnormal are also global minimizers. Before proving Theorem 12.3 we prove the following technical result. Lemma Let Φ : E R n be a smooth map defined on a Hilbert space E such that Φ() =, where is a critical point for Φ λd Φ =, λ R n, λ. Assume that λhess φ is a positive definite quadratic form. Then for every v such that λ,v <, there exists a neighborhood of zero O E such that Φ(x) / R + v, x O,x, R + = {α R,α > }. In particular the map Φ is not locally open and x = is an isolated point on its level set. Proof. In the first part of the proof we build some particular set of coordinates that simplifies the proof, exploiting the fact that the Hessian is well defined independently on the coordinates. Split the domain and the range of the map as follows E = E 1 E 2, E 2 = KerD Φ, (12.53) R n = R k 1 R k 2, R k 1 = ImD Φ, (12.54) where we select the complement R k 2 in such a way that v R k 2 (notice that by our assumption v / R k 1 ). Accordingly to the notation introduced, let us write Φ(x 1,x 2 ) = (Φ 1 (x 1,x 2 ),Φ 2 (x 1,x 2 )), x i E i, i = 1,2. Since Φ 1 is a submersion by construction, the Implicit function theorem implies that by a smooth change of coordinates we can linearize Φ 1 and assume that Φ has the form Φ(x 1,x 2 ) = (D Φ(x 1 ),Φ 2 (x 1,x 2 )), since x 2 E 2 = KerD Φ. Notice that, by construction of the coordinate set, the function x 2 Φ 2 (,x 2 ) coincides with the restriction of Φ to the kernel of its differential, modulo its image. 32

321 Hence for every scalar function a : R k 2 R such that d a = λ we have the equality λhess Φ = Hess (a Φ 2 (, )) > In particular the function a Φ 2 (,y) is non negative in a neighborhood of. Assume now that Φ(x 1,x 2 ) = sv for some s. Since v R k 2 it follows that D Φ(x 1 ) = = x 1 =, and Φ 2 (,x 2 ) = sv. In particular we have d ds a(φ 2 (,x 2 )) = d s= ds a(sv) = λ,v a(sv) for s s= which is a contradiction. Let λ(t) be an abnormal extremal and let γ(t) be corresponding abnormal trajectory. γ = u 1 f 1 (γ)+u 2 f 2 (γ). (12.55) In what follows we always assume that γ. = {γ(t) : t [,1]} is a smooth one-dimensional submanifold of M, with or without border. Then either the curve γ has no self-intersection or γ is diffeomorfic to S 1. In both cases we can chose a basis f 1,f 2 in a neighborhood of γ in such a way that γ is the integral curve of f 1 γ = f 1 (γ) Then γ is the solution of (12.55) with associated control ũ = (1,). Notice that a change of the frame on M corresponds to a smooth change of coordinates on the end-point map. With analogous reasoning as in the previous section, we describe the end point map as the composition where G is the end point map for the system F : (u 1,u 2 ) γ(1) F = e f 1 G q = (u 1 1)e tf 1 f 1 +u 2 e tf 1 f 2. (12.56) Since e tf 1 f 1 = f 1, denoting g t := e tf 1f 2 and defining the primitives w(t) = t (1 u 1 (τ))dτ, v(t) = we can rewrite the system, whose endpoint map is G, as follows The Hessian of G is computed λ Hess G(u 1, v) = 1 λ,[ t q = ẇf 1 (q)+ vg t (q). t u 2 (τ)dτ, (12.57) ẇ(τ)f 1 + v(τ)g τ dτ, ẇ(t)f 1 + v(t)g t ](q ) dt. (12.58) 321

322 Recall that D G(u 1, v) = 1 ẇ(t)f 1 (q )+ v(t)g t (q )dt = w(1)f 1 (q )+ and the condition λ ImD G is rewritten as 1 v(t)g t (q )dt λ,f 1 (q ) = λ,g t (q ) =, t. (12.59) Notice that since equality (12.59) is valid for all t then we have that λ,ġ t (q ) = λ,[f 1,g t ](q ) =, (12.6) Then we can rewrite our quadratic form only as a function of v, since all terms containing ẇ disappear 1 t λ Hess G( v) = λ,[ v(τ)g τ dτ, v(t)g t ](q ) dt (12.61) with the extra condition 1 v(t)g t (q )dt = w(1)f 1 (q ). (12.62) Now we rearrange these formulas, using integration by parts, rewriting the Hessian as a quadratic form on the space of primitives Using the equality we have t λ Hess G( v) = v(t) = t v(τ)dτ v(τ)g τ dτ = v(t)g t 1 t λ,[v(t)g t, v(t)g t ](q ) dt 1 λ,[ t v(τ)ġ τ dτ (12.63) v(τ)ġ τ dτ, v(t)g t ](q ) dt The first addend is zero since [g t,g t ] =. Exchanging the order of integration in the second term 1 λ,[ t and then integrating by parts v(τ)ġ τ dτ, v(t)g t ](q ) dt = 1 t 1 v(τ)g τ dτ = v(1)g 1 v(t)g t λ,[v(t)ġ t, v(τ)g τ dτ](q ) dt t 1 t v(τ)ġ τ dτ

323 we get to λhess G( v) = 1 which can also be rewritten as follows λhess G( v) = 1 λ,[ġ t,g t ](q ) v(t) 2 dt + 1 λ,[ t λ,[ġ t,g t ](q ) v(t) 2 dt + 1 λ,[ t 1 v(τ)ġ τ,v(t)ġ t v(1)g 1 ](q ) dt (12.64) v(τ)ġ τ dτ +v(1)g 1,v(t)ġ t ](q )dt. (12.65) Moreover, again integrating by parts the extra condition (12.62), we find 1 v(t)ġ t (q )dt = w(1)f 1 (q )+v(1)g 1 (q ) (12.66) Remark Notice that we cannot plug in the expression (12.66) directly into the formula since this equality is valid only at the point q, while in (12.64) we have to compute the bracket. Notice that the vectors f 1 (q 1 ) and f 2 (q 1 ) are linearly independent, then also f 1 (q ) = e f 1 (f 1 (q 1 )), and g 1 (q ) = e f 1 (f 2 (q 1 )), are linearly independent. From (12.66) it follows that for every pair(w, v) in the kernel the following estimates are valid w(1) C v L 2, v(1) C v L 2. (12.67) Theorem Let γ : [,1] M be an abnormal trajectory and assume that the quadratic form (12.64) satisfies λ Hess G( v) α v 2 L 2. (12.68) Then the curve is locally minimizer in the L 2 topology of controls. Remark Notice that the estimate (12.68) depends only on v, while the map G is a smooth map of v and ẇ. Hence Lemma does not apply. Moreover, the statement of Lemma violates for the endpoint map, since it is locally open as soon as the bracket generating condition is satisfied (this is equivalent to the Chow-Rashevsky Theorem). Moreover the final point of the trajectory is never isolated in the level set. What we are going to use is part of the proof of this Lemma, to show that the statements holds for the restriction of the endpoint map to some subset of controls Proof of Theorem Our goal is to prove that there are no curves shorter than γ that join q to q 1 = γ(1). To this extent we consider the restriction of the endpoint map to the set of curves that are shorter or have the same lenght than the original curve. Hence we need to fix some sub-riemannian structure on M. 323

324 We can then assume the orthonormal frame f 1,f 2 to be fixed and that the length of our curve is exactly 1 (we can always dilate all the distances on our manifold and the local optimality of the curve is not affected). The set of curves of length less or equal than 1 can be parametrized, using Lemma 3.15, by the set {(u 1,u 2 ) u 2 1 +u 2 2 1} Following the notation (12.57), notice that {(u 1,u 2 ) u 2 1 +u 2 2 1} {(w,v) ẇ }. We want to show that, for some function a C (M) such that d q a = λ ImD F, we have in the domain a F D (ẇ, v) = λhess F(ẇ, v)+r(w,v), where D = {(ẇ, v) KerD F,ẇ } R(w, v) v 2 (12.69) (ẇ, v) Indeed if we prove (12.69) we have that the point (ẇ, v) = (,) is locally optimal for F. This means that the curve γ, i.e. the curve associated to controls u 1 = 1,u 2 =, is also locally optimal. Using the identity t exp v(τ)f 2 dτ = e v(t)f 2 and applying the variations formula (6.28) to the endpoint map F we get F(ẇ, v) = q exp = q exp 1 1 (1 ẇ(t))f 1 + v(t)f 2 dt (1 ẇ(t))e v(t)f 2 f 1 dt e v(1)f 2 Hence we can express the endpoint map as a smooth function of the pair (ẇ,v). Now, to compute (12.69), we can assume that the function a is constant on the trajectories of f 2 (since we only fix its differential at one point) so that which simplifies our estimates: Writing a F(ẇ, v) = q exp e v(1)f 2 a = a 1 (1 ẇ(t))e v(t)f 2 f 1 dta (1 ẇ(t))e v(t)f 2 f 1 = f 1 +X (v(t))+ẇ(t)x 1 (v(t)) (12.7) and using the variation formula (6.29), setting Yt i = e (t 1)f 1 X i for i =,1, we get (recall that q 1 = q e f 1 (q )) a F(ẇ, v) = q 1 exp 1 Expanding the chronological exponential we find that Yt (v(t))+ẇ(t)y t 1 (v(t))dta, Y t () = Y t 1 () =, 324

325 (a) the zero order term vanish since Yt () = Y t 1 () =, (b) all first order terms vanish since the vector fields f 1 and [f 1,f 2 ] spans the image of the differential (hence are orthogonal to λ = d q a) (c) the second order terms are in the Hessian, since our domain D is contained in the kernel of the differential In other words it remains to show that every term in v,w of order greater or equal than 3 in the expansion can be estimated with o( v 2 ). 3 Let us prove first the claim for monomial of order three: 1 ẇ(t)v 2 (t)dt = o( v 2 ), 1 ẇ(t) t ẇ(τ)v(τ)dτdt = o( v 2 ) 1 ẇ(t) t τ ẇ(τ) ẇ(s)dsdτdt = o( v 2 ) Using that ẇ, which is the key assumption, and the fact that (ẇ, v) KerD F, which gives the estimates (12.67), we compute 1 ẇ(t)v 2 1 (t)dt ẇ(t) v 2 (t)dt = 1 ẇ(t)v 2 (t)dt = w(1)v 2 (1) v 3 +ε v 2, 1 w(t)v(t) v(t)dt where we estimate for the second term follows from 1 1 w(t)v(t) v(t)dt maxw(t) v(t) v(t)dt w(1) v v C v v 2 The second integral can be rewritten 1 ẇ(t) and then we estimate t 1 ẇ(τ)v(τ)dτdt = w(1) ẇ(t) t 3 where o( v 2 ) have the same meaning as in (12.69). 1 ẇ(t)v(t)dt 1 1 ẇ(τ)v(τ)dτdt 2 w(1) v(t)ẇ(t)dt C ẇ v 2 w(t)v(t)ẇ(t)dt 325

326 Finally, the last integral is very easy to estimate using the equality 1 ẇ(t) t τ ẇ(τ) ẇ(s)dsdτdt = 1 1 ẇ(t) 3 dt 6 C ẇ v 2 Starting from these estimate it is easy to show that any mixed monomial of order greater that three satisfies these estimates as well. Applying these results to a small piece of abnormal trajectory we can prove that small pieces of nice abnormals are minimizers Proof of Theorem If we apply the arguments above to a small piece γ s = γ [,s] of the curve γ it is easy to see that the Hessian rescale as follows, λ Hess G s (v) = s λ,[g t,ġ t ](q ) v(t) 2 dt + s λ,[ t v(τ)ġ τ dτ,v(t)ġ t v(s)g s ](q ) dt Since the generalized Legendre condition ensures 4 that (see also Lemma 12.29) then the norm λ,[g t,ġ t ](q ) C > ( s 1/2 v g = λ,[g t,ġ t ](q ) v(t) dt) 2 (12.71) is equivalent to the standard L 2 -norm. Hence the Hessian can be rewritten as where T is a compact operator in L 2 of the form λhess G s (v) = v g + Tv,v (12.72) (Tv)(t) = s K(t,τ)v(τ)dτ Since T 2 = K 2 L 2 for s, it follows that the Hessian is positive definite for small s > Conjugate points along abnormals In this section, we give an effective way to check the inequality (12.68) that implies local minimality of the nice abnormal geodesic according to Theorem it is semidefinite and we already know that f 1 is in the kernel 326

327 We define Q 1 (v) := λhess G( v). Quadratic form Q 1 is continuous in the topology defined by the norm v L2. The closure of the domain of Q 1 in this topology is the space D(Q 1 ) = { v L 2 [,1] : 1 } v(t)ġ t (q )dt span{f 1 (q ),g 1 (q )}. The extension of Q 1 to this closure is denoted by the same symbol Q 1. We set: l(t) = λ,[ġ t,g t ](q ), X t = v 1 g 1 + and we rewrite the form Q 1 in these more compact notations: Q 1 (v) = 1 l(t)v(t) 2 dt+ 1 t 1 v(τ)ġ τ dτ λ,[x t,ẋ t ](q ) dt, Ẋ t = v(t)ġ t, X 1 g 1 =, X (q ) f 1 (q ) =. (1) Moreover, we introduce the family of quadratic forms Q s, for < s 1, as follows Q s (v) := s l(t)v(t) 2 dt+ s λ,[x t,ẋ t ](q ) dt, Ẋ t = v(t)ġ t, X s g s =, X (q ) f 1 (q ) =. (1) Recall that l(t) is a strictly positive continuous function. In particular, 1 l(t)v(t)2 dt is the square of a norm of v that is equivalent to the standard L 2 -norm. Next statement is proved by the same arguments as Proposition 8.4. We leave details to the reader. Proposition The form Q 1 is positive definite if and only if kerq s =, s (,1]. Definition A time moment s (,1] is called conjugate to for the abnormal geodesic γ if kerq s. We are going to characterize conjugate times in terms of an appropriate Jacobi equation. Let ξ 1 T λ (T M) and ζ t T λ (T M) be the values at λ of the Hamiltonian lifts of the vector fields f 1 and g t. Recall that the Hamiltonian lift of a field f VecM is the Hamiltonian vector field associated to the Hamiltonian function λ λ,f(q), λ T qm, q M. We have: Q s (v) = s l(t)v(t) 2 dt+ s σ(x(t),ẋ(t))dt, ẋ(t) = v(t) ζ t, x(s) ζ s =, π x() π ξ 1 =, (2) where σ is the standard symplectic product on T λ (T M) and π : T M M is the standard projection. Moreover l(t) = σ( ζ t,ζ t ), t 1. (12.73) Let E = span{ξ 1,ζ t, t 1}. We use only the restriction of σ to E in the expression of Q s and we are going to get rid of unnecessary variables. Namely, we set: Σ. = E/(kerσ E ). 327

328 Lemma dimσ 2(dimspan{f 1 (q ),g t (q ), t 1} 1). Proof. Dimension of Σ is equal to twice the codimension of a maximal isotropic subspace of σ E. We have: σ(ξ 1,ζ t ) = λ,[f 1,g t ](q )] =, t [,1], hence ξ 1 kerσ E. Moreover, π (E) = span{f 1 (q ),g t (q ), t 1} and E kerπ is an isotropic subspace of σ E. We denote by ζ t Σ the projection of ζ t to Σ and by Π Σ the projection of E kerπ. Note that the projection of ξ 1 to Σ is ; moreover, equality (12.73) implies that ζ t, t [,1]. The final expression of Q s is as follows: Q s (v) = s l(t)v(t) 2 dt+ s σ(x(t),ẋ(t))dt, ẋ(t) = v(t) ζ t, x(s) ζ s =, x() Π. (4) We have: v kerq s if and only if s ( l(t)v(t)+σ(x(t), ζ ) t ) w(t)dt =, for any w( ) such that We obtain that v kerq s if and only if there exists ν Π ζ s s We set y(t) = x(t) ν and obtain the following: ζ t w(t)dt Π+Rζ s. (5) such that l(t)v(t)+σ(x(t), ζ t ) = σ(ν, ζ t ), t s. Theorem A time moment s (,1] is conjugate to if and only if there exists a nontrivial solution of the equation l(t)ẏ = σ( ζ t,y) ζ t (12.74) that satisfy the following boundary conditions: ν Π ζ such that (y(s)+ν) ζ s s =, (y()+ν) Π. (12.75) Remark Notice that identity (12.73) implies that y(t) = ζ t for t [,1] is a solution to the equation (12.74). However this solution may violate the boundary conditions. Let us consider the special case: dimspan{f 1 (q ),g t (q ), t 1} = 2; this is what we automatically have for abnormal geodesics in a 3-dimensional sub-riemannian manifold. In this case, dime = 2, dimπ = 1; hence Π = Π,ζ s = Rζ s and Π ζ =. Then ν in the boundary s conditions (12.75) must be and y(s) = cζ s, where c is a nonzero constant. Hence y(t) = cζ t for t 1 and y() = cζ / Π. We obtain: Corollary If dimspan{f 1 (q ),g t (q ), t 1} = 2, then the segment [,1] does not contain conjugate time moments and assumption of Theorem is satisfied. We can apply this corollary to the isoperimetric problem studied in Section Abnormal geodesics correspond to connected components of the zero locus of the function b (see notations in Sec ). All these abnormal geodesics are nice if and only if zero is a regular value of b. Take a compact connected component of b 1 (); this is a smooth closed curve. Our corollary together with Theorem implies that this closed curve passed once, twice, three times or arbitrary number of times is a locally optimal solution of the isoperimetric problem. Moreover, this is true for any Riemannian metric on the surface M! 328

329 Abnormals in dimension 3 Nice abnormals for the isoperimetric problem on surfaces Recall the isoperimetric problem: given two points x,x 1 on a 2-dimensional Riemannian manifold N, a 1-form ν Λ 1 N and c R, we have to find (if it exists) the minimum: min{l(γ),γ() = x,γ(t) = x 1, ν = c} (12.76) As shown in Section 4.4.2, this problem can be reformulated as a sub-riemannian problem on the extended manifold M = N R = {(x,y),x N,y R}, where the sub-riemannian structure is defined by the contact form D = Ker(dy ν) and the sub-riemannian length of a curve coincides with the Riemannian length of its projection on N. If we write dν = bdv, where b is a smooth function and dv denote the Riemannian volume on N, we have that the Martinet surface is defined by the cilynder M = R b 1 (), where, generically, the set b 1 () is a regular level of b. Since the distribution is well behaved with the projection on N by construction, it follows that the distribution is always transversal to the Martinet surface and all abnormal are nice, since Dq 3 = T qm for all q. Thus the projection of abnormal geodesics on N are the connected components of the set b 1 () and we can recover the whole abnormal extremal integrating the 1-form ν to find the missing component. In other words the abnormal extremals are spirals on M with step equal to Adν, (if dν is the volume form on N, it coincide with the area of the region A inside the curve defined on N by the connected component of b 1 ()). Corollary Let M be a sub-riemannian manifold, dimm = 3, and let γ : [,1] M be a nice abnormal geodesic. Then γ is a strict local minimizer for the L 2 control topology, for any metric. Remark Notice that we do not require that the curve does not self-intersect since in the 3D case this is automatically guaranteed by the fact that nice abnormal are integral curves of a smooth vector fields on M. γ A non nice abnormal extremal In this section we give an example of non nice (and indeed not smooth) abnormal extremal. Consider the isoperimetric problem on R 2 = {(x 1,x 2 ), x i R} defined by the 1-form ν such that dν = x 1 x 2 dx 1 dx

330 Here b(x 1,x 2 ) = x 1 x 2 and the set b 1 () consists of the union of the two axes, with moreover db =. Let us fix x 1, x 2 > and consider the curve joining (, x 2 ) and ( x 1,) that is the union of two segment contained in the coordinate axes { γ : [ x 2, x 1 ] R 2 (, t), t [ x 2,],, γ(t) = (t,), t [, x 1 ]. Proposition The curve γ is a projection of an abnormal extremal that is not a length minimizer. Proof of Proposition Let us built a family of variations γ ε,δ of the curve γ defined as in Figure Namely in γ ε,δ we cut a corner of size ε at the origin and we turn around a small circle of radius δ before reaching the endpoint. Denoting by D ε and D δ the two region enclosed by the curve it is easy to see that the isoperimetric condition rewrites as follows = ν = dν dν γ ε,δ D ε D δ It is then easy using that dν = x 1 x 2 dx 1 dx 2 to show that there exists c 1,c 2 > such that dν = c 1 ε 4, dν = c 2 δ 3 D ε D δ while l(γ ε,δ ) l(γ) = 2πδ (2 2)ε (12.77) Choosing ε in such a way that c 1 ε 4 = c 2 δ 3 it is an easy exercise to show that the quantity (12.77) is negative when δ > is very small. Remark If you consider some plane curve γ that is a projection of a normal extremal having the same endpoint γ and contained in the set {(x 1,x 2 ) R 2,x 1 >,x 2 > }, then γ must have self intersections. Indeed it is easy to see that if it is not the case then the isoperimetric condition ν = cannot be satisfied. γ It is still an open problem to find which is the length minimizer joining these two points. We know that it should be a projection of a normal extremal (hence smooth) but for instance we do not know how many self-intersection it has Higher dimension Now consider another important special case that is typical if dimension of the ambient manifold is greater than 3. Namely, assume that, for some k 2, the vector fields f 1, f 2, (adf 1 )f 2,..., (adf 1 ) k 1 f 2 (12.78) 33

331 x 2 D δ D ε x 1 Figure 12.1: An abnormal extremal that is not length minimizer are linearly independent in any point of a neighborhood of our nice abnormal geodesic γ, while (adf 1 ) k f 2 is a linear combination of the vector fields (12.78) in any point of this neighborhood; in other words, k 1 (adf 1 ) k f 2 = a i (adf 1 ) i f 2 +αf 1, i= where a i,α are smooth functions. In this case, all closed to γ solutions of the equation q = f 1 (q) are abnormal geodesics. A direct calculation basedon thefact that λ t,(adf i 1 )f 2)(γ(t) =, t 1, gives theidentity: ζ (k) k 1 t = a i (γ(t))ζ (i) +α(γ(t))ξ 1. t 1. (12.79) i= Identity (12.79) implies that dime = k and Π =. The boundary conditions (12.75) take the form: y() ζ s, (y(s) y()) ζ =. (12.8) s The caracterization of conjugate points is especially simple and geometrically clear if the ambient manifold has dimension 4. Let be a rank 2 equiregular distribution in a 4-dimensional manifold (the Engel distribution). Then abnormal geodesics form a 1-foliation of the manifold and condition (12.78) is satisfied with k = 2. Moreover, dime = 3, dimσ = 2 and ζ s = Rζ. Recall that s y(t) = ζ t, t s, is a solution to (12.74). Hence boundary conditions (12.8) are equivalent to the condition ζ s ζ =. (12.81) It is easy to re-write relation (12.81) in the intrinsic way without special notations we used to simplify calculations. We have the following characterization of conjugate times. Lemma A time moment t is conjugate to for the abnormal geodesic γ if and only if e tf 1 D γ() = D γ(t). The flow e tf 1 preserves D 2 and f 1 but it does not preserve D. The plane e tf 1 D rotates around the line Rf 1 inside D 2 with a nonvanishing angular velocity. Conjugate moment is a moment when the plane makes a complete revolution. Collecting all the information we obtain: 331

332 Theorem Let D be the Engel distribution, f 1 be a horizontal vector field such that [f 1,D 2 ] = D 2 and γ = f 1 (γ). Then γ is an abnormal geodesic. Moreover (i) if e tf 1 D γ() D γ(t), t (,1], then γ is a local length minimizer for any sub-riemannian structure on D (ii) If e tf 1 D γ() = D γ(t) for some t (,1) and γ is not a normal geodesic, then γ is not a local length minimizer Equivalence of local minimality Now we prove that, under the assumption that our trajectory is smooth, it is equivalent to be locally optimal in the H 1 -topology or in the uniform topology for the trajectories. Recall that a curve γ is called a C -local length-minimizer if l( γ) l(γ) for every curve γ that is C -close to γ satisfying the same boundary conditions, while it is called a H 1 -local lengthminimizer if l( γ) l(γ) for every curve γ such that the control u corresponding to γ is close in the L 2 topology to the control ū associated with γ and γ satisfies the same boundary conditions. Any C -local minimizer is automatically a H 1 -local minimizer. Indeed it is possible to show that for every v,w in a neighborhood of a fixed control u there exists a constant C > such that γ v (t) γ w (t) C u v L 2, t [,T], where γ v and γ w are the trajectories associated to controls v,w respectively. Theorem Let M be a sub-riemannian structure that is the restriction to D of a Riemannian structure (M,g). Assume γ is of class C 1 and has no self intersections. If γ is a (strict) local minimizer in the L 2 topology for the controls then γ is also a (strict) local minimizer in the C topology for the trajectories. Proof. Since γ has no self intersections, we can look for a preferred system of coordinates on an open neighborhood Ω in M of the set V = { γ(t) : t [,1]}. For every ε >, define the cylinder in R n = {(x,y) : x R,y R n 1 } as follows I ε B n 1 ε = {(x,y) R n : x ] ε,1+ε[,y R n 1, y < ε}, (12.82) We need the following technical lemma. Lemma There exists ε > and a coordinate map Φ : I ε Bε n 1 t [,1] (a) Φ(t, ) = γ(t), (b) the Riemannian metric Φ g is the identity matrix at (t,),i.e., along γ. Ω such that for all Proof of the Lemma. As in the proof of Theorem??, for every ε > we can find coordinates in the cylinder I ε Bε n 1 such that, in these coordinates, our curve γ is rectified γ(t) = (t,) and has length one. Our normalization of the curve γ implies that for the matrix representing the Riemannian metric Φ g in these coordinates satisfies ( ) Φ G11 G g = 12, with G G 21 G 11 (x,) =

333 where G ij, for i,j = 1,2, are the blocks of Φ g corresponding to the splitting R n = R R n 1 defined in (12.82). For every point (x,) let us consider the orthogonal complement T(x,) of the tangent vector e 1 = x to γ with respect to G. It can be written as follows (in this proof, is the Euclidean product in R n ) T(x,) = { ( v x,y,y),y R n 1} for some family 5 of vectors v x R n 1, depending smoothly with respect to x. Let us consider now the smooth change of coordinates Ψ : R n R n, Ψ(x,y) = (x v x,y,y) Fix ε > small enough such that the restriction of Ψ to I ε Bε n 1 is possible since vx detdψ(x,y) = 1 x,y. is invertible. Notice that this It is not difficult to check that, in the new variables (that we still denote by the same symbol), one has ( ) 1 G(x,) =, M(x,) where M(x,) is a positive definite matrix for all x I ε. With a linear change of cooordinates in the y space (x,y) (x,m(x,) 1/2 y) we can finally normalize the matrix in such a way that G(x,) = Id for all x I ε. We are now ready to prove the theorem. We check the equivalence between the two notions of local minimality in the coordinate set, denoted (x, y), defined by the previous lemma. Notice that the notion of local minimality is independent on the coordinates. Given an admissible curve γ(t) = (x(t),y(t)) contained in the cylinder I ε Bε n 1 and satisfying γ() = (,) and γ(1) = (1,) and denoting the reference trajectory γ(t) = (t,) we have that γ γ 2 H 1 = = = ẋ(t) ẏ(t) 2 dt ẋ(t) 2 + ẏ(t) 2 dt 2 ẋ(t) 2 + ẏ(t) 2 dt 1 1 ẋ(t)dt+1 where we used that x() = and x(1) = 1 since γ satisfies the boundary conditions. If we denote by J(γ) = 1 G(γ(t)) γ(t), γ(t) dt, J e (γ) = 1 ẋ(t) 2 + ẏ(t) 2 dt (12.83) respectively the energy of γ and the Euclidean energy, we have γ γ 2 H 1 = J e (γ) 1 and the H 1 -local minimality can be rewritten as follows: 5 Indeed it is easily checked that v x = G 1 21(x,), where G 1 21 denotes the first column of the (n 1) (n 1) matrix G

334 ( ) there exists ε > such that for every γ admissible and J e (γ) 1+ε one has J(γ) 1. Next we build the following neighborhood of γ: for every δ > define A δ as the set of admissible curves γ(t) = (x(t),y(t)) in I ε Bε n 1 such that the dilated curve γ δ (t) = (x(t), 1 δy(t)) is still contained in the cylinder. This implies that in particular that γ is contained in I ε B n 1 δε. Notice that A δ A δ whenever δ < δ. Moreover, every curve that is εδ close to γ in the C -topology is contained in A δ. It is then sufficient to prove that, for δ > small enough, for every γ A δ one has l(γ) l( γ). Indeed it is enough to check that J(γ) J( γ). Let us consider two cases (i) γ A δ and J e (γ) 1+ε. In this case ( ) implies that J(γ) 1. (ii) γ A δ and J e (γ) > 1+ε. In this case we have G(x,) = Id and, by smoothness of G, we can write for (x,y) I ε B n 1 δε and δ G(x,y)v,v = (1+O(δ)) v,v, where O(δ) is uniform with respect to (x,y). Since γ A δ implies that γ is contained in I ε B n 1 δε we can deduce for δ J(γ) = J e (γ)(1+o(δ)) (1+ε)(1+O(δ)) and one can choose δ > small enough such that the last quantity is strictly bigger than one. This proves that there exists δ > such every admissible curve γ A δ is longer than γ. Remark Notice that this result implies in particular Theorem 4.6, since normal extremals are always smooth. Nevertheless, the argument of Theorem 4.6 can be adapted for more general coercive functional (see[as4]), while this proof use specific estimates that hold only for our explicit cost (i.e., the distance). We proved in Theorem that nice abnormals are smooth and cannot have self-intersections, being solution of a smooth Hamiltonian system. Thus we can combine Theorem 12.3 and and obtain the following result. Corollary Let γ(t) be a nice abnormal trajectory. Then there exists s > such that γ [,s] is a strict local length minimizer in the C -topology. 334

335 Chapter 13 Some model spaces* 13.1 Carnot groups of step 2* Multi-dimensional Heisenberg groups* We study the optimal synthesis for the multi-dimensional Heisenberg groups of the following type: the symmetric one, where all frequences coincide, and the totally-asymmetric one, where all frequences are different. Intermediate cases left as exercices Free Carnot groups of step 2* We compute the intersection of the cut locus from the origin with the vertical space. In particular we give the explicit formula of the distance from the origin to the vertical space Some non-equiregular nilpotent structures* Grushin* We compute all the optimal synthesis and the cut locus Martinet* We compute synthesis and Maxwell points, i.e., points reached by at least two optimal geodesics. We explain optimality, but we refer to proofs in the references for the details Some non-nilpotent left invariant structures* SU(2) SO(3), SL(2)* Here we consider the left-invariant structures endowed with Killing metric. We compute synthesis and cut locus. 335

336 SE(2)* We compute synthesis and Maxwell points, i.e., points reached by at least two optimal geodesics. We explain optimality, but we refer to proofs in the references for the details. 336

337 Chapter 14 Curves in the Lagrange Grassmannian In this chapter we introduce the manifold of Lagrangian subspaces of a symplectic vector space. After a description of its geometric properties, we discuss how to define the curvature for regular curves in the Lagrange Grassmannian, that are curves with non-degenerate derivative. Then we discuss the non-regular case, where a reduction procedure let us to reduce to a regular curve in a reduced symplectic space The geometry of the Lagrange Grassmannian In this section we recall some basic facts about Grassmanians of k-dimensional subspaces of an n-dimensional vector space and then we consider, for a vector space endowed with a symplectic structure, the submanifold of its Lagrangian subspaces. Definition Let V be an n-dimensional vector space. The Grassmanian of k-planes on V is the set G k (V) := {W W V is a subspace, dim(w) = k}. It is a standard fact that G k (V) is a compact manifold of dimension k(n k). Now we describe the tangent space to this manifold. Proposition Let W G k (V). We have a canonical isomorphism T W G k (V) Hom(W,V/W). Proof. Consider a smooth curve on G k (V) which starts from W, i.e. a smooth family of k- dimensional subspaces defined by a moving frame W(t) = span{e 1 (t),...,e k (t)}, W() = W. We want to associate in a canonical way with the tangent vector Ẇ() a linear operator from W to the quotient V/W. Fix w W and consider any smooth extension w(t) W(t), with w() = w. Then define the map W V/W, w ẇ() (mod W). (14.1) 337

338 We are left to prove that the map (14.1) is well defined, i.e. independent on the choices of representatives. Indeed if we consider another extension w 1 (t) of w satisfying w 1 (t) W(t) we can write k w 1 (t) = w(t)+ α i (t)e i (t), for some smooth coefficients α i (t) such that α i () = for every i. It follows that i=1 ẇ 1 (t) = ẇ(t)+ k i=1 k α i (t)e i (t)+ α i (t)ė i (t), (14.2) i=1 and evaluating (14.2) at t = one has ẇ 1 () = ẇ()+ k i=1 α i ()e i (). This shows that ẇ 1 () = ẇ()(mod W), hence the map (14.1) is well defined. In the same way one can prove that the map does not depend on the moving frame defining W(t). Finally, it is easy to show that the map that associates the tangent vector to the curve W(t) with the linear operator W V/W is surjective, hence it is an isomorphism since the two space have the same dimension. Let us now consider a symplectic vector space (Σ, σ), i.e. a 2n-dimensional vector space Σ endowed with a non degenerate symplectic form σ Λ 2 (Σ). Definition A vector subspace Π Σ of a symplectic space is called (i) symplectic if σ Π is nondegenerate, (ii) isotropic if σ Π, (iii) Lagrangian if σ Π and dimπ = n. Notice that in general for every subspace Π Σ, by nondegeneracy of the symplectic form σ, one has dimπ+dimπ = dimσ. (14.3) where as usual we denote the symplectic orthogonal by Π = {x Σ σ(x,y) =, y Π}. Exercise Prove the following properties for a vector subspace Π Σ: (i) Π is symplectic iff Π Π = {}, (ii) Π is isotropic iff Π Π, (iii) Π is Lagrangian iff Π = Π. Exercise Prove that, given two subspaces A,B Σ, one has the identities (A + B) = A B and (A B) = A +B. 338

339 Example Any symplectic vector space admits Lagrangian subspaces. Indeed fix any nonzero element e 1 := e in Σ. Choose iteratively e i span{e 1,...,e i 1 } \span{e 1,...,e i 1 }, i = 2,...,n. (14.4) Then Π := span{e 1,...,e n } is a Lagrangian subspaceby construction. Notice that the choice (14.4) is possible by (14.3) Lemma Let Π = span{e 1,...,e n } be a Lagrangian subspace of Σ. Then there exists vectors f 1,...,f n Σ such that (i) Σ = Π, := span{f 1,...,f n }, (ii) σ(e i,f j ) = δ ij, σ(e i,e j ) = σ(f i,f j ) =, i,j = 1,...,n. Proof. We prove the lemma by induction. By nondegeneracy of σ there exists a non-zero x Σ such that σ(e n,x). Then we define the vector f n := x σ(e n,x), = σ(e n,f n ) = 1. The last equality implies that σ restricted to span{e n,f n } is nondegerate, hence by (a) of Exercise 14.4 span{e n,f n } span{e n,f n } =, (14.5) And we can apply induction on the 2(n 1) subspace Σ := span{e n,f n }. Notice that (14.5) implies that σ is non degenerate also on Σ. Remark In particular the complementary subspace = span{f 1,...,f n } defined in Lemma 14.7 is Lagrangian and transversal to Π Σ = Π. Considering coordinates induced from the basis chosen for this splitting we can write Σ = R n R n, (denoting R n denotes the set of row vectors). More precisely x = (ζ,z) if n x = ζ i e i +z i f i, ζ = ( z 1 ζ 1 ζ n), z =., i=1 z n and using canonical form of σ on our basis (see Lemma 14.7) we find that in coordinates, if x 1 = (ζ 1,z 1 ),x 2 = (ζ 2,z 2 ) we get σ(x 1,x 2 ) = ζ 1 z 2 ζ 2 z 1, (14.6) where we denote with ζz the standard rows by columns product. Lemma 14.7 shows that the group of symplectomorphisms acts transitively on pairs of transversal Lagrangian subspaces. The next exercise, whose proof is an adaptation of the previous one, describes all the orbits of the action of the group of symplectomorphisms on pairs of subspaces of a symplectic vector spaces. Exercise Let Λ 1,Λ 2 be two subspaces in a symplectic vector space Σ, and assume that dimλ 1 Λ 2 = k. Show that there exists Darboux coordinates (p,q) in Σ such that Λ 1 = {(p,)}, Λ 2 = {((p 1,...,p k,,...,),(,...,,q k+1,...,q n )}. 339

340 The Lagrange Grassmannian Definition The Lagrange Grassmannian L(Σ) of a symplectic vector space Σ is the set of its n-dimensional Lagrangian subspaces. Proposition L(Σ) is a compact submanifold of the Grassmannian G n (Σ) of n-dimensional subspaces. Moreover diml(σ) = n(n+1). (14.7) 2 Proof. Recall that G n (Σ) is an 2 -dimensionalcompact manifold. Clearly L(Σ) G n (Σ) asasubset. Consider the set of all Lagrangian subspaces that are transversal to a given one = {Λ L(Σ) : Λ = }. Clearly L(Σ) is an open subset and since by Lemma 14.7 every Lagrangian subspace admits a Lagrangian complement L(Σ) =. L(Σ) It is then sufficient to find some coordinates on these open subsets. Every n-dimensional subspace Λ Σ which is transversal to is the graph of a linear map from Π to. More precisely there exists a matrix S Λ such that Λ = Λ = {(z T,S Λ z),z R n }. (Here we used the coordinates induced by the splitting Σ = Π.) Moreover it is easily seen that Λ L(Σ) S Λ = (S Λ ) T. Indeed we have that Λ L(Σ) if and only if σ Λ = and using (14.6) this is rewritten as σ((z T 1,S Λ z 1 ),(z T 2,S Λ z 2 )) = z T 1 S Λ z 2 z T 2 S Λ z 1 =, which means exactly S Λ symmetric. Hence the open set of all subspaces that are transversal to Λ is parametrized by the set of symmetric matrices, that gives coordinates in this open set. This also proves that the dimension of L(Σ) coincide with the dimension of the space of symmetric matrices, hence (14.7). Notice also that, being L(Σ) a closed set in a compact manifold, it is compact. Now we describe the tangent space to the Lagrange Grassmannian. Proposition Let Λ L(Σ). Then we have a canonical isomorphism T Λ L(Σ) Q(Λ), where Q(Λ) denote the set of quadratic forms on Λ. Proof. Consider a smooth curve Λ(t) in L(Σ) such that Λ() = Λ and Λ() T Λ L(Σ) its tangent vector. As before consider a point x Λ and a smooth extension x(t) Λ(t) and denote with ẋ := ẋ(). We define the map Λ : x σ(x,ẋ), (14.8) 34

341 that is nothing else but the quadratic map associated to the self adjoint map x ẋ by the symplectic structure. We show that in coordinates Λ is a well defined quadratic map, independent on all choices. Indeed Λ(t) = {(z T,S Λ(t) z),z R n }, and the curve x(t) can be written x(t) = (z(t) T,S Λ(t) z(t)), x = x() = (z T,S Λ z), for some curve z(t) where z = z(). Taking derivative we get ẋ(t) = (ż(t) T,ṠΛ(t)z(t)+S Λ(t) ż(t)), and evaluating at t = (we simply omit t when we evaluate at t = ) we have x = (z T,S Λ z), ẋ = (ż T,ṠΛz +S Λ ż), and finally get, using the simmetry of S Λ, that σ(x,ẋ) = z T (ṠΛz +S Λ ż) ż T S Λ z = z T Ṡ Λ z +z T S Λ ż ż T S Λ z = z T Ṡ Λ z. (14.9) Exercise Let Λ(t) L(Σ) such that Λ = Λ() and σ be the symplectic form. Prove that the map S : Λ Λ R defined by S(x,y) = σ(x,ẏ), where ẏ = ẏ() is the tangent vector to a smooth extension y(t) Λ(t) of y, is a symmetric bilinear map. Remark We have the following natural interpretation of this result: since L(Σ) is a submanifold of the Grassmanian G n (Σ), its tangent space T Λ L(Σ) is naturally identified by the inclusion with a subspace of the Grassmannian i : L(Σ) G n (Σ), i : T Λ L(Σ) T Λ G n (Σ) Hom(Λ,Σ/Λ), where the last isomorphism is Proposition Being Λ a Lagrangian subspace of Σ, the symplectic structure identifies in a canonical way the factor space Σ/Λ with the dual space Λ defining Σ/Λ Λ, [z] Λ,x = σ(z,x). (14.1) Hence the tangent space to the Lagrange Grassmanian consist of those linear maps in the space Hom(Λ,Λ ) that are self-adjoint, which are naturally identified with quadratic forms on Λ itself. 1 Remark Given a curve Λ(t) in L(Σ), the above procedure associates to the tangent vector Λ(t) a family of quadratic forms Λ(t), for every t. We end this section by computing the tangent vector to a special class of curves that will play a major role in the sequel, i.e. the curve on L(Σ) induced by the action on Λ by the flow of the linear Hamiltonian vector field h associated with a quadratic Hamiltonian h C (Σ). (Recall that a Hamiltonian vector field transform Lagrangian subspaces into Lagrangian subspaces.) 1 any quadratic form on a vector space q Q(V) can be identified with a self-adjoint linear map L : V V, L(v) = B(v, ) where B is the symmetric bilinear map such that q(v) = B(v,v). 341

342 Proposition Let Λ L(Σ) and define Λ(t) = e t h (Λ). Then Λ = 2h Λ. Proof. Consider x Λ and the smooth extension x(t) = e t h (x). Then ẋ = h(x) and by definition of Hamiltonian vector field we find σ(x,ẋ) = σ(x, h(x)) = d x h,x = 2h(x), where in the last equality we used that h is quadratic on fibers Regular curves in Lagrange Grassmannian The isomorphism between tangent vector to the Lagrange Grassmannian with quadratic forms makes sense to the following definition (we denote by Λ the tangent vector to the curve at the point Λ as a quadratic map) Definition Let Λ(t) L(Σ) be a smooth curve in the Lagrange Grassmannian. We say that the curve is (i) monotone increasing (descreasing) if Λ(t) ( Λ(t) ). (ii) strictly monotone increasing (decreasing) if the inequality in (i) is strict. (iii) regular if its derivative Λ(t) is a non degenerate quadratic form. Remark Notice that if Λ(t) = {(p,s(t)p),p R n } in some coordinate set, then it follows from the proof of Proposition that the quadratic form Λ(t) is represented by the matrix ṠΛ(t) (see also (14.9)). In particular the curve is regular if and only if detṡλ(t). The main goal of this section is the construction of a canonical Lagrangian complement. (i.e. another curve Λ (t) in thelagrange Grassmannian defined by Λ(t) and such that Σ = Λ(t) Λ (t).) Consider an arbitrary Lagrangian splitting Σ = Λ() defined by a complement to Λ() (see Lemma 14.7) and fix coordinates in such a way that that Σ = {(p,q), p,q R n }, Λ() = {(p,), p R n }, = {(,q), q R n }. In these coordinates our regular curve is described by a one parametric family of symmetric matrices S(t) Λ(t) = {(p,s(t)p), p R n }, such that S() = and Ṡ() is invertible. All Lagrangian complement to Λ() are parametrized by a symmetrix matrix B as follows B = {(Bq,q),q R n }, B = B T. The following lemma shows how the coordinate expression of our curve Λ(t) change in the new coordinate set defined by the splitting Σ = Λ() B. 342

343 Lemma Let S B (t) the one parametric family of symmetric matrices defining Λ(t) in coordinates w.r.t. the splitting Λ() B. Then the following identity holds S B (t) = (S(t) 1 B) 1. (14.11) Proof. It is easy to show that, if (p,q) and (p,q ) denotes coordinates with respect to the splitting defined by the subspaces and B we have { p = p Bq q (14.12) = q The matrix S B (t) by definition is the matrix that satisfies the identity q = S B (t)p. Using that q = S(t)p by definition of Λ(t), from (14.12) we find q = q = S(t)p = S(t)(p +Bq ), and with straightforward computations we finally get S B (t) = (I S(t)B) 1 S(t) = (S(t) 1 B) 1. Since Ṡ(t) represents the tangent vectors to the regular curve Λ(t), its properties are invariant with respect to change of coordinates. Hence it is natural to look for a change of coordinates (i.e. a choice of the matrix B) that simplifies the second derivative our curve. Corollary There exists a unique symmetric matrix B such that S B () =. Proof. Recall that for a one parametric family of matrices X(t) we have d dt X(t) 1 = X(t) 1 Ẋ(t)X(t) 1. Applying twice this identity to (14.11) (we omit t to denote the value at t = ) we get ( ) d d dt S B (t) = (S 1 B) 1 t= dt S 1 (t) (S 1 B) 1 t= = (S 1 B) 1 S 1 ṠS 1 (S 1 B) 1 = (I SB) 1 Ṡ(I BS) 1. Hence for the second derivative evaluated at t = (remember that in our coordinates S() = ) one gets S B = S +2ṠBṠ, and using that Ṡ is non degerate, we can choose B = 1 2Ṡ 1 SṠ 1. We set Λ () := B, where B is determined by (14.13). Notice that by construction Λ () is a Lagrangian subspace and it is transversal to Λ(). The same argument can be applied to define Λ (t) for every t. 343

344 Definition Let Λ(t) be a regular curve, the curve Λ (t) defined by the condition above is called derivative curve of Λ(t). Exercise Prove that, if Λ(t) = {(p,s(t)p), p R n } (without the condition S() = ), then the derivative curve Λ (t) = {(p,s (t)p), p R n }, satisfies S (t) = B(t) 1 +S(t), where B(t) := 1 2Ṡ(t) 1 S(t) Ṡ(t) 1, (14.13) provided Λ (t) is transversal to the subspace = {(,q),q R n }. (Actually this condition is equivalent to the invertibility of B(t).) Notice that if S() = then S () = B() 1. Remark The set Λ tr of all n-dimensional spaces transversal to a fixed subspace Λ is an affine space over Hom(Σ/Λ,Λ). Indeed given two elements 1, 2 Λ tr we can associate with their difference the operator 2 1 A Hom(Σ/Λ,Λ), A([z] Λ ) = z 2 z 1 Λ, (14.14) where z i i [z] Λ are uniquely identified. If Λ is Lagrangian, we have identification Σ/Λ Λ given by the symplectic structure (see (14.1)) that Λ, that coincide by definition with the intersection Λ tr L(Σ) is an affine space over Hom S (Λ,Λ), the space of selfadjoint maps between Λ and Λ, that it isomorphic to Q(Λ ). Notice that if we fix a distinguished complement of Λ, i.e. Σ = Λ, then we have also the identification Σ/Λ and Λ Q(Λ ) Q( ). Exercise Prove that the operator A defined by (14.14), in the case when Λ is Lagrangian, is a self-adjoint operator. Remark Assumethat thesplittingσ = Λ isfixed. ThenourcurveΛ(t) inl(σ), suchthat Λ() = Λ, is characterized by a family of symmetric matrices S(t) satisfying Λ(t) = {(p,s(t)p),p R n }, with S() =. By regularity of the curve, Λ(t) Λ for t > small enough, hence we can consider its coordinate presentation in the affine space on the vector space of quadratic forms defined on (see Remark 14.23) that is given by S 1 (t) and write the Laurent expansion of this curve in the affine space ( ) 1 S(t) 1 = tṡ + t2 2 S +O(t 3 ) = 1 (I tṡ 1 + t ) 1 2 SṠ 1 +O(t 2 ) = 1 tṡ 1 1 2Ṡ 1 SṠ 1 +O(t). }{{} B It is not occasional that the matrix B coincides with the free term of this expansion. Indeed the formula (14.11) for the change of coordinates can be rewritten as follows S B (t) 1 = S 1 (t) B, (14.15) and the choice of B corresponds exactly to the choice of a coordinate set where the curve Λ(t) has no free term in this expansion (i.e. S B (t) 1 has no term of order zero). This is equivalent to say that a regular curve let us to choose a privileged origin in the affine space of Lagrangian subspaces that are transversal to the curve itself. 344

345 14.3 Curvature of a regular curve Now we want to define the curvature of a regular curve in the Lagrange Grassmannian. Let Λ(t) be a regular curve and consider its derivative curve Λ (t). The tangent vectors to Λ(t) and Λ (t), as explained in Section 14.1, can be interpreted in a a canonical way as a quadratic form on the space Λ(t) and Λ (t) respectively Λ(t) Q(Λ(t)), Λ (t) Q(Λ (t)). Being Λ (t) a canonical Lagrangian complement to Λ(t) we have the identifications through the symplectic form 2 Λ(t) Λ (t), Λ (t) Λ(t), and the quadratic forms Λ(t), Λ (t) can be treated as (self-adjoint) mappings: Λ(t) : Λ(t) Λ (t), Λ (t) : Λ (t) Λ(t). (14.16) Definition TheoperatorR Λ (t) := Λ (t) Λ(t) : Λ(t) Λ(t)iscalled thecurvature operator of the regular curve Λ(t). Remark In the monotonic case, when Λ(t) defines a scalar product on Λ(t), the operator R(t) is, by definition, symmetric with respect to this scalar product. Moreover R(t), as quadratic form, has the same signature and rank as Λ (t)sign( Λ (t)). Definition Let Λ 1,Λ 2 be two transversal Lagrangian subspaces of Σ. We denote the projection on Λ 2 parallel to Λ 1, i.e. the linear operator such that π Λ1 Λ 2 : Σ Λ 2, (14.17) π Λ1 Λ 2 Λ1 = π Λ1 Λ 2 Λ2 = Id. Exercise Assume Λ 1 and Λ 2 be two Lagrangian subspaces in Σ and assume that, in some coordinate set, Λ i = {(x,s i x), R n } for i = 1,2. Prove that Σ = Λ 1 Λ 2 if and only if ker(s 1 S 2 ) = {}. In this case show that the following matrix expression for π Λ1 Λ 2 : π Λ1 Λ 2 = ( S 1 12 S 1 S12 1 S 2 S12 1 S 1 S 2 S12 1 ), S 12 := S 1 S 2. (14.18) From the very definition of the derivative of our curve we can get the following geometric characterization of the curvature of a curve. Proposition Let Λ(t) a regular curve in L(Σ) and Λ (t) its derivative curve. Then Λ(t)(x t ) = π Λ(t)Λ (t)(ẋ t ), Λ (t)(xt ) = π Λ (t)λ(t)(ẋ t ). In particular the curvature is the composition R Λ (t) = Λ (t) Λ(t). 2 if Σ = Λ is a splitting of a vector space then Σ/Λ. If moreover the splitting is Lagrangian in a symplectic space, the symplectic form identifies Σ/Λ Λ, hence Λ. 345

346 Proof. Recall that, by definition, the linear operator Λ : Λ Σ/Λ associated with the quadratic form is the map x ẋ (mod Λ). Hence to build the map Λ Λ it is enough to compute the projection of ẋ onto the complement Λ, that is exactly π ΛΛ (ẋ). Notice that the minus sign in equation (14.3) is a consequence of the skew symmetry of the symplectic product. More precisely, the sign in the identification Λ Λ depends on the position of the argument. The curvature R Λ (t) of the curve Λ(t) is a kind of relative velocity between the two curves Λ(t) andλ (t). Inparticular noticethat ifthetwocurvesmoves inthesamedirection wehaver Λ (t) >. Now we compute the expression of the curvature R Λ (t) in coordinates. Proposition Assume that Λ(t) = {(p,s(t)p)} is a regular curve in L(Σ). Then we have the following coordinate expression for the curvature of Λ (we omit t in the formula) R Λ = ((2Ṡ) 1 S) ((2 Ṡ) 1 S) 2 (14.19) = 1 2Ṡ 1... S 3 4 (Ṡ 1 S) 2. (14.2) Proof. Assume that both Λ(t) and Λ (t) are contained in the same coordinate chart with Λ(t) = {(p,s(t)p)}, Λ (t) = {(p,s (t)p)}. We start the proof by computing the expression of the linear operator associated with the derivative Λ : Λ Λ (we omit t when we compute at t = ). For each element (p,sp) Λ and any extension (p(t),s(t)p(t)) one can apply the matrix representing the operator π ΛΛ (see (14.18)) to the derivative at t = and find π ΛΛ (p,sp) = (p,s p ), p = (S S ) 1 Ṡp. Exchanging the role of Λ and Λ, and taking into account of the minus sign one finds that the coordinate representation of R is given by R = (S S) 1 Ṡ (S S) 1 Ṡ. (14.21) We prove formula (14.2) under the extra assumption that S() =. Notice that this is equivalent to the choice of a particular coordinate set in L(Σ) and, being the expression of R coordinate independent by construction, this is not restrictive. Under this extra assumption, it follows from (14.13) that Λ(t) = {(p,s(t)p)}, Λ (t) = {(p,s (t)p)}, where S (t) = B(t) 1 +S(t) and we denote by B(t) := 1 2Ṡ(t) 1 S(t) Ṡ(t) 1. Hence we have, assuming S() = and omitting t when t = R = (S S) 1 Ṡ (S S) 1 Ṡ ( ) d = B dt B(t) 1 +S(t) BṠ t= = (BṠ)2 ḂṠ. Plugging B = 1 2Ṡ 1 SṠ 1 into the last formula, after some computations one gets to (14.2). 346

347 Remark The formula for the curvature R Λ (t) of a curve Λ(t) in L(Σ) takes a very simple form in a particular coordinate set given by the splitting Σ = Λ() Λ (), i.e. such that Λ() = {(p,),p R n }, Λ () = {(,q),q R n }. Indeed using a symplectic change of coordinates in Σ that preserves both Λ and Λ (i.e. of the kind p = Ap, q = (A 1 ) q) we can choose the matrix A in such a way that Ṡ() = I. Moreover we know from Proposition that the fact that Λ = {(,q),q R n } is equivalent to S() =. Hence one finds from (14.2) that R = 1... S 2 When the curve Λ(t) is strictly monotone, the curvature R represents a well defined operator on Λ(), naturally endowed with the sign definite quadratic form Λ(). Hence in these coordinates the eigenvalues of... S (and not only the trace and the determinant) are invariants of the curve. Exercise Let f : R R be a smooth function. The Schwartzian derivative of f is defined as ( ) f ) f 2 Sf := ( 2f 2f (14.22) Prove that Sf = if and only if f(t) = at+b ct+d for some a,b,c,d R. Remark The previous proposition says that the curvature R is the matrix version of the Schwartzian derivative of the matrix S (cfr. (14.19) and (14.22)). Example Let Σ be a 2-dimensional symplectic space. In this case L(Σ) P 1 (R) is the real projective line. Let us compute the curvature of a curve in L(Σ) with constant (angular) velocity α >. We have Λ(t) = {(p,s(t)p),p R}, S(t) = tan(αt) R. From the explicit expression it easy to find the relation Ṡ(t) = α(1+s 2 (t)), S(t) = αs(t), 2Ṡ(t) from which one gets that R(t) = αṡ(t) α2 S 2 (t) = α 2, i.e. the curve has constant curvature. We end this section with a useful formula on the curvature of a reparametrized curve. Proposition Let ϕ : R R a diffeomorphism and define the curve Λ ϕ (t) := Λ(ϕ(t)). Then R Λϕ (t) = ϕ 2 (t)r Λ (ϕ(t))+r ϕ (t)id. (14.23) Proof. It is a simple check that the Schwartzian derivative of the composition of two function f and g satisfies S(f g) = (Sf g)(g ) 2 +Sg. Notice that R ϕ (t) makes senseas thecurvatureof theregular curveϕ : R R P 1 in thelagrange Grassmannian L(R 2 ). 347

348 Exercise (Another formula for the curvature). Let Λ,Λ 1 L(Σ) be such that Σ = Λ Λ 1 andfixtwotangent vectors ξ T Λ L(Σ)andξ 1 T Λ1 L(Σ). Asin(14.16)wecantreat eachtangent vector as a linear operator ξ : Λ Λ 1, ξ 1 : Λ 1 Λ, (14.24) and define the cross-ratio [ξ 1,ξ ] = ξ 1 ξ. If in some coordinates Λ i = {(p,s i p)} for i =,1 we have 3 [ξ 1,ξ ] = (S 1 S ) 1 Ṡ 1 (S 1 S ) 1 Ṡ. Let now Λ(t) a regular curve in L(Σ). By regularity Σ = Λ() Λ(t) for all t > small enough, hence the cross ratio [ Λ(t), Λ()] : Λ() Λ(), is well defined. Prove the following expansion for t [ Λ(t), Λ()] 1 t 2Id+ 1 3 R Λ()+O(t). (14.25) 14.4 Reduction of non-regular curves in Lagrange Grassmannian In this section we want to extend the notion of curvature to non-regular curves. As we will see in the next chapter, it is always possible to associate with an extremal a family of Lagrangian subspaces in a symplectic space, i.e. a curve in a Lagrangian Grassmannian. This curve turns out to be regular if and only if the extremal is an extremal of a Riemannian structure. Hence, if we want to apply this theory for a genuine sub-riemannian case we need some tools to deal with non-regular curves in the Lagrangian Grassmannian. Let (Σ, σ) be a symplectic vector space and L(Σ) denote the Lagrange Grassmannian. We start by describing a natural subspace of L(Σ) associated with an isotropic subspace Γ of Σ. This will allow us to define a reduction procedure for a non regular curve. Let Γ be a k-dimensional isotropic subspace of Σ, i.e. σ Γ =. This means that Γ Γ. In particular Γ /Γ is a 2(n k) dimensional symplectic space with the restriction of σ. Lemma There is a natural identification of L(Γ /Γ) as a subspace of L(Σ): Moroever we have a natural projection where Λ Γ := (Λ Γ )+Γ = (Λ+Γ) Γ. L(Γ /Γ) {Λ L(Σ),Γ Λ} L(Σ). (14.26) π Γ : L(Σ) L(Γ /Γ), Λ Λ Γ, Proof. Assume that Λ L(Σ) and Γ Λ. Then, since Λ is Lagrangian, Λ = Λ Γ, hence the identification (14.26). Assume now that Λ L(Γ /Γ) and let us show that π Γ (Λ) = Λ, i.e. π Γ is a projection. Indeed from the inclusions Γ Λ Γ one has π Γ (Λ) = Λ Γ = (Λ Γ )+Γ = Λ+Γ = Λ. 3 here Ṡi denotes the matrix associated with ξi. 348

349 We are left to check that Λ Γ is Lagrangian, i.e. (Λ Γ ) = Λ Γ. (Λ Γ ) = ((Λ Γ )+Γ) = (Λ Γ ) Γ = (Λ+Γ) Γ = Λ Γ, where we repeatedly used Exercise (The identity (Λ Γ ) + Γ = (Λ + Γ) Γ is also a consequence of the same exercise.) Remark Let Γ = {Λ L(Σ),Λ Γ = {}}. The restriction π Γ Γ is smooth. Indeed it can be shown that π Γ is defined by a rational function, since it is expressed via the solution of a linear system. The following example shows that the projection π Γ is not globally continous on L(Σ). Example Consider the symplectic structure σ on R 4, with Darboux basis {e 1,e 2,f 1,f 2 }, i.e. σ(e i,f j ) = δ ij. Let Γ = span{e 1 } be a one dimensional isotropic subspace and define Λ ε = span{e 1 +εf 2,e 2 +εf 1 }, ε >. It is easy to see that Λ ε is Lagrangian for every ε and that Λ Γ ε = span{e 1,f 2 }, ε >, (14.27) Λ Γ = span{e 1,e 2 }. Indeed f 2 e 1, that implies e 1 +εf 2 Λ ε Γ, therefore f 2 Λ ε Γ. By definition of reduced curve f 2 Λ Γ ε and (14.27) holds. The case ε = is trivial Ample curves In this section we introduce ample curves. Definition Let Λ(t) L(Σ) be a smooth curve in the Lagrange Grassmannian. The curve Λ(t) is ample at t = t if there exists N N such that Σ = span{λ (i) (t ) λ(t) Λ(t),λ(t) smooth, i N}. (14.28) In other words we require that all derivatives up to order N of all smooth sections of our curve in L(Σ) span all the possible directions. As usual, we can choose coordinates in such a way that, for some family of symmetric matrices S(t), one has Σ = {(p,q) p,q R n }, Λ(t) = {(p,s(t)p) p R n }. Exercise Assume that Λ(t) = {(p,s(t)p),p R n } with S() =. Prove that the curve is ample at t = if and only if there exists N N such that all the columns of the derivative of S(t) up to order N (and computed at t = ) span a maximal subspace: rank{ṡ(), S(),...,S (N) ()} = n. (14.29) In particular, a curve Λ(t) is regular at t if and only if is ample at t with N =

350 An important property of ample and monotone curves is described in the following lemma. Lemma Let Λ(t) L(Σ) a monotone, ample curve at t. Then, there exists ε > such that Λ(t) Λ(t ) = {} for < t t < ε. Proof. Without loss of generality, assume t =. Choose a Lagrangian splitting Σ = Λ Π, with Λ = J(). For t < ε, thecurve is contained in the chart definedby such a splitting. In coordinates, Λ(t) = {(p,s(t)p) p R n }, with S(t) symmetric and S() =. The curve is monotone, then Ṡ(t) is a semidefinite symmetric matrix. It follows that S(t) is semidefinite too. Suppose that, for some t, Λ(t) Λ() {} (assume t > ). This means that v R n such that S(t)v =. Indeed also v S(t)v =. The function τ v S(τ)v is monotone, vanishing at τ = and τ = t. Therefore v S(τ)v = for all τ t. Being a semidefinite, symmetric matrix, v S(τ)v = if and only if S(τ)v =. Therefore, we conclude that v kers(τ) for τ t. This implies that, for any i N, v kers (i) (), which is a contradiction, since the curve is ample at. Exercise Provethat amonotonecurveλ(t) isampleat t ifandonlyifoneoftheequivalent conditions is satisfied (i) the family of matrices S(t) S(t ) is nondegenerate for t t close enough, and the same remains true if we replace S(t) by its N-th Taylor polynomial, for some N in N. (ii) the map t det(s(t) S(t )) has a finite order root at t = t. Let us now consider an analytic monotone curve on L(Σ). Without loss of generality we can assume the curve to be non increasing, i.e. Λ(t). By monotonicity Λ() Λ(t) = τ t Λ(τ) =: Υ t Clearly Υ t is a decreasing family of subspaces, i.e. Υ t Υ τ if τ t. Hence the family Υ t for t stabilizes and the limit subspace Υ is well defined Υ := lim t Υ t The symplectic reduction of the curve by the isotropic subspace Υ defines a new curve Λ(t) := Λ(t) Υ L(Υ /Υ). Proposition If Λ(t) is analytic and monotone in L(Σ), then Λ(t) is ample L(Υ /Υ). Proof. By construction, in the reduced space Υ /Υ we removed the intersection of Λ(t) with Λ(). Hence Λ() Λ(t) = {}, in L(Υ /Υ) (14.3) In particular, if S(t) denotes the symmetric matrix representing Λ(t) such that S() = Λ(t ), it follows that S(t) is non degenerate for < t < ε. The analyticity of the curve guarantees that the Taylor polynomial (of a suitable order N) is also non degenerate. 35

351 14.6 From ample to regular In this section we prove the main result of this chapter, i.e. that any ample monotone curve can be reduced to a regular one. Theorem Let Λ(t) be a smooth ample monotone curve and set Γ := Ker Λ(). Then the reduced curve t Λ Γ (t) is a smooth regular curve. In particular Λ Γ () >. Before proving Theorem 14.46, let us discuss two useful lemmas. Lemma Let v 1 (t),...,v k (t) R n and define V(t) as the n k matrix whose columns are the vectors v i (t). Define the matrix S(t) := t V(τ)V(τ) dτ. Then the following are equivalent: (i) S(t) is invertible (and positive definite), (ii) span{v i (τ) i = 1,...,k;τ [,t]} = R n. Proof. Fix t > and let us assume S(t) is not invertible. Since S(t) is non negative then there exists a nonzero x R n such that S(t)x,x =. On the other hand S(t)x,x = t V(τ)V(τ) x,x dτ = t V(τ) x 2 dτ This implies that V(τ) x = (or equivalently x V(τ) = ) for τ [,t], i.e. the nonzero vector x is orthogonal to Im τ [,t] V(τ) = span{v i (τ) i = 1,...,k, τ [,t]} = R n, that is a contradiction. The converse is similar. Lemma Let A,B two positive and symmetric matrices such that < A < B. Then we have also < B 1 < A 1. Proof. Assume first that A and B commute. Then A and B can be simultaneously diagonalized and the statement is trivial for diagonal matrices. In the general case, since A is symmetric and positive, we can consider its square root A 1/2, which is also symmetric and positive. We can write < Av,v < Bv,v By setting w = A 1/2 v in the above inequality and using Av,v = A 1/2 v,a 1/2 v one gets < w,w < A 1/2 BA 1/2 w,w, which is equivalent to I < A 1/2 BA 1/2. Since the identity matrix commutes with every other matrix, we obtain < A 1/2 B 1 A 1/2 = (A 1/2 BA 1/2 ) 1 < I which is equivalent to < B 1 < A 1 reasoning as before. Proof of Theorem By assumption the curve t Λ(t) is ample, hence Λ(t) Γ = {} and t Λ Γ (t) is smooth for t > small enough. We divide the proof into three parts: (i) we compute the coordinate presentation of the reduced curve. (ii) we show that the reduced curve, extended by continuity at t =, is smooth. (iii) we prove that the reduced curve is regular. 351

352 (i). Let us consider Darboux coordinates in the symplectic space Σ such that Σ = {(p,q) : p,q R n }, Λ(t) = {(p,s(t)p) p R n }, S() =. Morover we can assume also R n = R k R n k, where Γ = {} R n k. According to this splitting we have the decomposition p = (p 1,p 2 ) and q = (q 1,q 2 ). The subspaces Γ and Γ are described by the equations Γ = {(p,q) : p 1 =,q = }, Γ = {(p,q) : q 2 = } and (p 1,q 1 ) are natural coordinates for the reduced space Γ /Γ. Up to a symplectic change of coordinates preserving the splitting R n = R k R n k we can assume that ( ) ( ) S11 (t) S S(t) = 12 (t) Ik S12 (t) S, with Ṡ() =. (14.31) 22(t) where I k is the k k identity matrix. Finally, from the fact that S is monotone and ample, that implies S(t) > for each t >, it follows S 11 (t) >, S 22 (t) >, t >. (14.32) Then we can compute the coordinate expression of the reduced curve, i.e. the matrix S Γ (t) such that Λ Γ (t) = {(p 1,S Γ (t)p 1 ),p 1 R k }. From the identity Λ(t) Γ = {(p,s(t)p),s(t)p R k } = {( S 1 (t) ( q1 ) ( )) } q1,,q 1 R k (14.33) one gets the key relation S Γ (t) 1 = (S(t) 1 ) 11. Thus the matrix expression of the reduced curve Λ Γ (t) in L(Γ /Γ) is recovered simply by considering it as a map of (p 1,q 1 ) only, i.e. ( )( ) ( ) S11 S S(t)p = 12 p1 S11 p S12 = 1 +S 12 p 2 S 22 p 2 S12 p 1 +S 22 p 2 from which we get S(t)p R k if and only if S 12 (t)p 1 +S 22 (t)p 2 =. Then that means Λ Γ (t) = {(p 1,S 11 p 1 +S 12 p 2 ) : S 12(t)p 1 +S 22 (t)p 2 = } = {(p 1,(S 11 S 12 S 1 22 S 12)p 1 )} S Γ = S 11 S 12 S 1 22 S 12. (14.34) (ii). By the coordinate presentation of S Γ (t) the only term that can give rise to singularities is the inverse matrix S22 1 (t). In particular, since by assumption t dets 22(t) has a finite order zero at t =, the a priori singularity can be only a finite order pole. To prove that the curve is smooth it is enough the to show that S Γ (t) for t, i.e. the curve remains bounded. This follows from the following Claim I. As quadratic forms on R k, we have the inequality S Γ (t) S 11 (t). 352

353 Indeed S(t) symmetric and positive one has that its inverse S(t) 1 is symmetric and positive also. This implies that S Γ (t) 1 = (S(t) 1 ) 11 > and so is S Γ (t). This proves the left inequality of the Claim I. Moreover using (14.34) and the fact that S 22 is positive definite (and so S 1 22 (S11 S Γ )p 1,p 1 = S12 S 1 22 S 12p 1,p 1 = S 1 22 (S 12p 1 ),(S 12p 1 ). ) one gets Since S(t) for t, clearly S 11 (t) when t, that proves that S Γ (t) also. (iii). We are reduced to show that the derivative of t S Γ (t) at is non degenerate matrix, which is equivalent to show that t S Γ (t) 1 has a simple pole at t =. We need the following lemma, whose proof is postponed at the end of the proof of Theorem Lemma Let A(t) be a smooth family of symmetric nonnegative n n matrices. If the condition rank(a, A,...,A (N) ) t= = n is satisfied for some N, then there exists ε > such that εta() < t A(τ)dτ for all ε < ε and t > small enough. Applying the Lemma to the family A(t) = Ṡ(t) one obtains (see also (14.31)) S(t)p,p > εt p 1 2 for all < ε < ε, any p R n and any small time t >. Now let p 1 R k be arbitrary and extend it to a vector p = (p 1,p 2 ) R n such that (p,s(t)p) Λ(t) Γ (i.e. S(t)p = (q 1 ) T or equivalently S(t) 1 (q 1,) = (p 1,p 2 )). This implies in particular that S Γ (t)p 1 = q 1 and S Γ (t)p 1,p 1 = S(t)p,p εt p1 2, This identity can be rewritten as S Γ (t) > εti k > and implies by Lemma which completes the proof. < S Γ (t) 1 < 1 εt I k Proof of Lemma We reduce the proof of the Lemma to the following statement: Claim II. There exists c, N > such that for any sufficiently small ε,t > ( t ) det A(τ) εa()dτ > ct N. Moreover c, N depends only on the 2N-th Taylor polynomial of A(t). Indeed fix t >. Since A(t) and A(t) is not the zero family, then t A(τ)dτ >. Hence, for a fixed t, there exists ε small enough such that t A(τ) εa()dτ >. Assume now that the matrix S t = t A(τ) εa()dτ > is not strictly positive for some < t < t, then dets(τ) = for some τ [t,t ], that is a contradiction. We now prove Claim II. We may assume that t A(t) is analytic. Indeed, by continuity of the determinant, the statement remains true if we substitute A(t) by its Taylor polynomial of sufficiently big order. 353

354 An analytic one parameter family of symmetric matrices t A(t) can be simultaneously diagonalized (see??), in the sense that there exists an analytic (with respect to t) family of vectors v i (t), with i = 1,...,n, such that n A(t)x,x = v i (t),x 2. i=1 In other words A(t) = V(t)V(t), where V(t) is the n n matrix whose columns are the vectors v i (t). (Notice that some of these vector can vanish at or even vanish identically.) Let us now consider the flag E 1 E 2... E N = R n defined as follows E i = span{v (l) j,1 j n, l i}. Notice that this flag is finite by our assumption on the rank of the consecutive derivatives of A(t) and N is the same as in the statement of the Lemma. We then choose coordinates in R n adapted to this flag (i.e. the spaces E i are coordinate subspaces) and define the following integers (here e 1,...,e n is the standard basis of R n ) m i = min{j : e i E j }, i = 1,...,n. In other words, when written in this new coordinate set, m i is the order of the first nonzero term in the Taylor expansion of the i-th row of the matrix V(t). Then we introduce a quasi-homogeneous family of matrices V(t): the i-th row of V(t) is the m i -homogeneous part of the i-the row of V(t). Then we define Â(t) := V(t) V(t). The columns of the matrix Â(t) satisfies the assumption of Lemma 14.47, then t Â(τ)dτ > for every t >. If we denote the entries A(t) = {a ij (t)} n i,j=1 and Â(t) = {â ij(t)} n i,j=1 we obtain â ij (t) = c ij t m i+m j, a ij (t) = â ij (t)+o(t m i+m j +1 ), for suitable constants c ij (some of them may be zero). Then we let A ε (t) := A(t) εa() = {a ε ij (t)}n i,j=1. Of course aε ij (t) = cε ij tm i+m j +O(t m i+m j +1 ) where { c ε ij = (1 ε)c ij, if m i +m j =, c ij, if m i +m j >. From the equality one gets On the other hand ( t det ( hence det c ε ij m i +m j +1 t ( a ε ij(τ)dτ = t m c ε ) i+m j +1 ij m i +m j +1 +O(t) ( t ) ( ( det A ε (τ)dτ = t n+2 N i=1 m i det ) ( ( Â(τ)dτ = t n+2 N i=1 m i det c ε ij m i +m j +1 c ij m i +m j +1 ) > for small ε. The proof is completed by setting ( ) c ij N c := det, N := n+2 m i +m j +1 ) ) +O(t) ) ) +O(t) > i=1 m i 354

355 14.7 Conjugate points in L(Σ) In this section we introduce the notion of conjugate point for a curve in the Lagrange Grassmannian. In the next chapter we explain why this notion coincide with the one given for extremal paths in sub-riemannian geometry. Definition Let Λ(t) be a monotone curve in L(Σ). We say that Λ(t) is conjugate to Λ() if Λ(t) Λ() {}. As a consequence of Lemma 14.43, we have the following immediate corollary. Corollary Conjugate points on a monotone and ample curve in L(Σ) are isolated. The following two results describe general properties of conjugate points Theorem Let Λ(t), (t) two ample monotone curves in L(Σ) defined on R such that (i) Σ = Λ(t) (t) for every t, (ii) Λ(t), (t), as quadratic forms. Then there exists no τ > such that Λ(τ) is conjugate to Λ(). Moreover lim t + Λ(t) = Λ( ). Proof. Fix coordinates induced by some Lagrangian splitting of Σ in such a way that S Λ() = and S () = I. The monotonicity assumption implies that t S Λ(t) (resp. t S (t) ) is a monotone increasing (resp. decreasing) curve in the space of symmetric matrices. Moreover the tranversality of Λ(t) and (t) implies that S (t) S Λ(t) is a non degenerate matrix for all t. Hence < S Λ(t) < S (t) < I, for all t >. In particular Λ(t) never leaves the coordinate neighborhood under consideration, the subspace Λ(t) is always traversal to Λ() for t > and has a limit Λ( ) whose coordinate representation is S Λ ( ) = lim t + S Λ (t). Theorem Let Λ s (t), for t,s [,1] be an homotopy of curves in L(Σ) such that Λ s () = Λ for s [,1]. Assume that (i) Λ s ( ) is monotone and ample for every s [,1], (ii) Λ ( ),Λ 1 ( ) and Λ s (1), for s [,1], contains no conjugate points to Λ. Then no curve t Λ s (t) contains conjugate points to Λ. Proof. Let us consider the open chart Λ defined by all the Lagrangian subspaces traversal to Λ. The statement is equivalent to prove that Λ s (t) Λ for all t > and s [,1]. Let us fix coordinates induced by some Lagrangian splitting Σ = Λ in such a way that Λ = {(p,)} and Λ s (t) = {(B s (t)q,q)} for all s and t > (at least for t small enough, indeed by ampleness Λ s (t) Λ for t small). Moreover we can assume that B s (t) is a monotone increasing family of symmetric matrices. 355

356 Notice that x T B s (τ)x for every x R n when τ +, due to the fact that Λ s () = Λ is out of the coordinate chart. Moreover, a necessary condition for Λ s (t) to be conjugate to Λ is that there exists a nonzero x such that x T B s (τ)x for τ t. It is then enough to show that, for all x R n the function (t,s) x T B s (t)x is bounded. Indeed by assumptions t x T B (t)x and t x T B 1 (t)x are monotone increasing and bounded up to t = 1. Hence the continuous family of values M s := x T B s (1)x is weel defined and bounded for all s. The monotonicity implies that actually x T B s (t)x < + for all values of t,s [,1]. (See also Figure 14.7). + x T B s (1)x x T B (1)x x T B 1 (1)x x T B s (t)x s Figure 14.1: Proof of Theorem Comparison theorems for regular curves In this last section we prove two comparison theorems for regular monotone curves in the Lagrange Grassmannian. Corollary Let Λ(t) be a monotone and regular curve in the Lagrange Grassmannian such that R Λ (t). Then Λ(t) contains no conjugate points to Λ(). Proof. This is a direct consequence of Theorem Theorem Let Λ(t) be a monotone and regular curve in the Lagrange Grassmannian. Assume that there exists k such that for all t (i) R Λ (t) kid. Then, if Λ(t) is conjugate to Λ(), we have t π k. (ii) 1 n tracer Λ(t) k. Then for every t there exists τ [t,t+ π k ] such that Λ(τ) is conjugate to Λ(). 356

357 We stress that assumption (i) means that all the eigenvalues of R Λ (t) are smaller or equal than k, while (ii) requires only that the average of the eigenvalues is bigger or equal than k. Remark Notice that the estimates of Theorem are sharp, as it is immediately seen by considering the example of a 1-dimensional curve of constant velocity (see Example 14.35). Proof. (i). Consider the real function ϕ : R ], π k [, ϕ(t) = 1 k (arctan kt+ π 2 ) Using that ϕ(t) = (1+kt 2 ) 1 it is easy to show that the Schwarzian derivative of ϕ is k R ϕ (t) = (1+kt 2 ) 2. Thus using ϕ as a reparametrization we find, by Proposition R Λϕ (t) = ϕ 2 R Λ (ϕ(t))+r ϕ (t)id 1 = (1+kt 2 ) 2(R Λ(ϕ(t)) kid). By Corollary the curve Λ ϕ has no conjugate points, i.e. Λ has no conjugate points in the interval ], π k [. (ii). We prove the claim by showing that the curve Λ(t), on every interval of length π/ k has non trivial intersection with every subspace (hence in particular with Λ()). This is equivalent to prove that Λ(t) is not contained in a single coordinate chart for a whole interval of length π/ k. Assume by contradiction that Λ(t) is contained in one coordinate chart. Then there exists coordinates such that Λ(t) = {(p,s(t)p)} and we can write the coordinate expression for the curvature: R Λ (t) = Ḃ(t) B(t)2, where B(t) = (2S(t)) 1 S(t). Let now b(t) := traceb(t). Computing the trace in both sides of equality we get Ḃ(t) = B 2 (t)+r Λ (t), ḃ(t) = trace(b 2 (t))+tracer Λ (t). (14.35) Lemma For every n n symmetric matrix S the following inequality holds true trace(s 2 ) 1 n (traces)2. (14.36) Proof. For every symmetric matrix S there exists a matrix M such that MSM = D is diagonal. Since trace(mam 1 ) = trace(a) for every matrix A, it is enough to prove the inequality (14.36) for a diagonal matrix D = diag(λ 1,...,λ n ). In this case (14.36) reduces to the Cauchy-Schwartz inequality ( n n ) 2 λ 2 i 1 λ i. n i=1 i=1 357

358 Applying Lemma to (14.35) and using the assumption (ii) one gets ḃ(t) 1 n b2 (t)+nk, (14.37) By standardresults in ODE theory wehave b(t) ϕ(t), whereϕ(t) is the solution of the differential equation ϕ(t) = 1 n ϕ2 (t)+nk (14.38) The solution for (14.38), with initial datum ϕ(t ) =, is explicit and given by ϕ(t) = n ktan( k(t t )). This solution is defined on an interval of measure π/ k. Thus the inequality b(t) ϕ(t) completes the proof. 358

359 Chapter 15 Jacobi curves Now we are ready to introduce the main object of this part of the book, i.e. the Jacobi curve associated with a normal extremal. Heuristically, we would like to extract geometric properties of the sub-riemannian structure by studying the symplectic invariants of its geodesic flow, that is the flow of H. The simplest idea is to look for invariants in its linearization. As we explain in the next sections, this object is naturally related to geodesic variations, and generalizes the notion of Jacobi fields in Riemannian geometry to more general geometric structures. In this chapter we consider a sub-riemannian structure (M, U, f) on a smooth n-dimensional manifold M and we denote as usual by H : T M R its sub-riemannian Hamiltonian From Jacobi fields to Jacobi curves Fix a covector λ T M, with π(λ) = q, and consider the normal extremal starting from q and associated with λ, i.e. λ(t) = e t H (λ), γ(t) = π(λ(t)). (i.e. λ(t) T γ(t) M.) For any ξ T λ (T M) we can define a vector field along the extremal λ(t) as follows X(t) := e t H ξ T λ(t) (T M) The set of vector fields obtained in this way is a 2n-dimensional vector space which is the space of Jacobi fields along the extremal. For an Hamiltonian H corresponding to a Riemannian structure, the projection π gives an isomorphisms between the space of Jacobi fields along the extremal and the classical space of Jacobi fields along the geodesic γ(t) = π(λ(t)). Notice that this definition, equivalent to the standard one in Riemannian geometry, does not need curvature or connection, and can be extended naturally for any strongly normal sub- Riemannian geodesic. In Riemannian geometry, the study of one half of this vector space, namely the subspace of classical Jacobi fields vanishing at zero, carries informations about conjugate points along the given geodesic. By the aforementioned isomorphism, this corresponds to the subspace of Jacobi fields along the extremal such that π X() =. This motivates the following construction: For 359

360 any λ T M, we denote V λ := kerπ λ the vertical subspace. We could study the whole family of (classical) Jacobi fields (vanishing at zero) by means of the family of subspaces along the extremal L(t) := e t H V λ T λ(t) (T M). Notice that actually, being e t H a symplectic transformation and V λ a Lagrangian subspace, the subspace L(t) is a Lagrangian subspace of T λ(t) (T M) Jacobi curves The theory of curves in the Lagrange Grassmannian developed in Chapter?? is an efficient tool to study family of Lagrangian subspaces contained in a single symplectic vector space. It is then convenient to modify the construction of the previous section in order to collect the informations about the linearization of the Hamiltonian flow into a family of Lagrangian subspaces at a fixed tangent space. By definition, the pushforward of the flow of H maps the tangent space to T M at the point λ(t) back to the tangent space to T M at λ: e t H : T λ(t) (T M) T λ (T M). If we then restrict the action of the pushforward e t H to the vertical subspace at λ(t), i.e. the tangent space T λ(t) (Tγ(t) M) at the point λ(t) to the fiber T γ(t) M, we define a one parameter family of n-dimensional subspaces in the 2n-dimensional vector space T λ (T M). This family of subspaces is a curve in the Lagrangian Grassmannian L(T λ (T M)). Notation. In the following we use the notation V λ := T λ (T qm) for the vertical subspace at the point λ T M, i.e. the tangent space at λ to the fiber T qm, where q = π(λ). Being the tangent space to a vector space, sometimes it will be useful to identify the vertical space V λ with the vector space itself, namely V λ T qm. Definition Let λ T M. The Jacobi curve at the point λ is defined as follows J λ (t) := e t H V λ(t), (15.1) whereλ(t) := e t H (λ)andγ(t) = π(λ(t)). Notice thatj λ (t) T λ (T M)andJ λ () = V λ = T λ (T qm) is vertical. As discussed in Chapter 14, the tangent vector to a curve in the Lagrange Gassmannian can be interpreted as a quadratic form. In the case of a Jacobi curve J λ (t) its tangent vector is a quadratic form J λ (t) : J λ (t) R. Proposition The Jacobi curve J λ (t) satisfies the following properties: (i) J λ (t+s) = e t H J λ(t) (s), for all t,s, (ii) J λ () = 2H T q M as quadratic forms on V λ T q M. (iii) rank J λ (t) = rankh T γ(t) M 36

361 Proof. Claim (i) is a consequence of the semigroup property of the family {e t H } t. To prove (ii), introduce canonical coordinates (p,x) in the cotangent bundle. Fix ξ V λ. The smooth family of vectors defined by ξ(t) = e t H ξ (considering ξ as a constant vertical vector field) is a smooth extension of ξ, i.e. it satisfies ξ() = ξ and ξ(t) J λ (t). Therefore, by (14.8) ( J λ ()ξ = σ(ξ, ξ) = σ ξ, d ) dt e t H ξ = σ(ξ,[ H,ξ]). (15.2) t= To compute the last quantity we use the following elementary, although very useful, property of the symplectic form σ. Lemma Let ξ V λ a vertical vector. Then, for any η T λ (T M) where we used the canonical identification V λ = T qm. σ(ξ,η) = ξ,π η, (15.3) Proof. In any Darboux basis induced by canonical local coordinates (p,x) on T M, we have σ = n i=1 dp i dx i and ξ = n i=1 ξi pi. The result follows immediately. To complete the proof of point (ii) it is enough to compute in coordinates π [ H,ξ] = π [ H p x H x p,ξ ] = 2 H p p 2 ξ x, Hence by Lemma 15.3 and the fact that H is quadratic on fibers one gets σ(ξ,[ H,ξ]) = ξ, 2 H p 2 ξ = 2H(ξ). (iii). The statement for t = is a direct consequence of (ii). Using property (i) it is easily seen that the quadratic forms associated with the derivatives at different times are related by the formula J λ (t) e t H = J λ(t) (). (15.4) Since e t H is a symplectic transformation, it preserves the sign and the rank of the quadratic form. 1 Remark Notice that claim (iii) of Proposition 15.2 implies that rank of the derivative of the Jacobi curve is equal to the rank of the sub-riemannian structure. Hence the curve is regular if and only if it is associated with a Riemannian structure. In this case of course it is strictly monotone, namely J λ (t) < for all t. Corollary The Jacobi curve J λ (t) associated with a sub-riemannian extremal is monotone nonincreasing for every λ T M. 1 Notice that J λ (t), J λ(t) () are defined on J λ (t), J λ(t) () respectively, and J λ (t) = e t H J λ(t) (). 361

362 15.2 Conjugate points and optimality At this stage we have two possible definition for conjugate points along normal geodesics. On one hand we have singular points of the exponential map along the extremal path, on the other hand we can consider conjugate points of the associated Jacobi curve. The next result show that actually the two definition coincide. Proposition Let γ(t) = E q (tλ) be a normal geodesic starting from q with initial covector λ. Denote by J λ (t) its Jacobi curve. Then for s > γ(s) is conjugate to γ() J λ (s) is conjugate to J λ (). Proof. By Definition 8.43, γ(s) is conjugate to γ() if sλ is a critical point of the exponential map E q. This is equivalent to say that the differential of the map from Tq M to M defined by λ π e s H (λ) is not surjective at the point λ, i.e. the image of the differential e s H has a nontrivial intersection with the kernel of the projection π e s H J λ () T λ(s) Tγ(s) M {}. (15.5) Applying the linear invertible transformation e s H to both subspaces one gets that (15.5) is equivalent to J λ () J λ (s) {} which means by definition that J λ (s) is conjugate to J λ (). The next result shows that, as soon as we have a segment of points that are conjugate to the initial one, the segment is also abnormal. Theorem Let γ : [,1] M be a normal extremal path such that γ [,s] is not abnormal for all < s 1. Assume γ [t,t 1 ] is a curve of conjugate points to γ(). Then the restriction γ [t,t 1 ] is also abnormal. Remark Recall that if a curve γ : [,T] M is a strictly normal trajectory, it can happen that a piece of it is abnormal as well. If the trajectory is strongly normal, then if t,t 1 satisfy the assumptions of Theorem 15.7 necessarily t >. Proof. Let us denote by J λ (t) the Jacobi curve associated with γ(t). From Proposition 15.6 it follows that J λ (t) J λ () {} for each t [t,t 1 ]. We now show that actually this implies J λ () t [t,t 1 ] J λ (t) {}. (15.6) We can assume that the whole piece of the Jacobi curve J λ (t), with t t t 1, is contained in a single coordinate chart. Otherwise we can cover [t,t 1 ] with such intervals and repeat the argument on each of them. Let us fix coordinates given by a Lagrangian splitting in such a way that J λ (t) = {(p,s(t)p),p R n }, J λ () = {(p,),p R n } 362

363 Moreover we can assume that S(t) for every t t t 1, i.e. is non positive definite and monotone decreasing, 2 In particular J λ (t 1 ) J λ () {} if and only if there exists a vector v such that S(t 1 )v =. Since the map t v T S(t)v is nonpositive and decreasing this means that S(t)v = for all t [t,t 1 ], thus J λ () J λ (t 1 ) J λ () J λ (t) (15.7) t [t,t 1 ] that implies that actually we have the equality in (15.7). We are left to show that if a Jacobi curve J λ (t) is such that every t is a conjugate point for τ τ, then the corresponding extremal is also abnormal. Indeed let us fix an element ξ such that ξ J λ (t) t [,τ] which is non-empty by the above discussion. Then we consider the vertical vector field ξ(t) = e t H ξ T λ(t) (Tγ(t) M), t τ. By construction, the vector field ξ is preserved by the Hamiltonian field, i.e. e t H ξ = ξ, that implies [ H,ξ](λ(t)) =. Then the statement is proved by the following Exercise Define η(t) = ξ(λ(t)) T γ(t) M (by canonical identification T λ(t q M) T q M). Show that the identity [ H,ξ](λ(t)) = rewrites in coordinates as follows k h i (η(t)) 2 =, η(t) = i=1 k h i (λ(t)) h i (η(t)). (15.8) i=1 Exercise 15.9 shows that η(t) is a family of covectors associated with the extremal path corresponding to controls u i (t) = h i (λ(t)) and such that h i (η(t)) =, that means that it is abnormal. Corollary Let J λ (t) be the Jacobi curve associated with λ T M and γ(t) = π(λ(t)) the associated sub-riemannian extremal path. Then γ [,τ] is not abnormal for all τ t if and only if J λ (τ) J λ () = {} for all τ t Reduction of the Jacobi curves by homogeneity The Jacobi curve at point λ T M parametrizes all the possible geodesic variations of the geodesic associated with an initial covector λ. Since the variations in the direction of the motion are always trivial, i.e. the trajectory remains the same up to parametrizations, one can reduce the space of variation to an (n 1)-dimensional one. This idea is formalized by considering a reduction of the Jacobi curve in a smaller symplectic space. As we show in the next section, this is a natural consequence of the homogeneity of the sub-riemannian Hamiltonian. 2 Indeed it is proved that the only invariant of a pair of two Lagrangian subspaces in a symplectic space is the dimension of the intersection, i.e. the rank of the difference rank(s(t) S()). Add exercise 363

364 Remark This procedure was already exploited in Section 8.9, obtained by a direct argument via Proposition Indeed one can recognize that the procedure that reduced the equation for conjugate points of one dimension corresponds exactly to the reduction by homogeneity of the Jacobi curve associated to the problem. We start with a technical lemma, whose proof is left as an exercise. Lemma Let Σ = Σ 1 Σ 2 be a splitting of the symplectic space, with σ = σ 1 σ 2. Let Λ i L(Σ i ) and define the curve Λ(t) := Λ 1 (t) Λ 2 (t) L(Σ). Then one has the splittings: Λ(t) = Λ 1 (t) Λ 2 (t), R Λ (t) = R Λ1 (t) R Λ2 (t). Consider now a Jacobi curve associated with λ T M: J λ (t) = e t H V λ(t), V λ = T λ (T π(λ) M). Denote by δ α : T M T M the fiberwise dilation δ α (λ) = αλ, where α >. Definition The Euler vector field E Vec(T M) is the vertical vector field defined by E(λ) = d ds δ s (λ), λ T M. s=1 It is easy to see that in canonical coordinates (x,ξ) it satisfies E = n identity holds e t E λ = e t λ, i.e. e t E (ξ,x) = (e t ξ,x). i=1 ξ i Exercise Prove that the Euler vector field is characterized by the identity Lemma We have the identity e t H i E σ = s, s = Liouville 1-form in T M. E = E t H. In particular [ H, E] = H. ξ i and the following Proof. The homogeneity property (8.5) of the Hamiltonian can be rewritten as follows e t H (δ s λ) = δ s (e st H (λ)), s,t >. Applying δ s to both sides and changing t into t one gets the identity δ s e t H δ s = e st H. (15.9) Computing the 2 nd order mixed partial derivative at (t,s) = (,1) in (15.9) one gets, by (2.27), that [ H, E] = H. Thus, by (2.31) we have e t H E = E t H, since every higher order commutator vanishes. Proposition The subspace Σ = span{ E, H} is invariant under the action of the Hamiltonian flow. Moreover { E, H} is a Darboux basis on Σ H 1 (1/2). 364

365 Proof. The fact that Σ is an invariant subspace is a consequence of the identities e t H E = E t H, e t H H =. Moreover, on the level set H 1 (1/2), we have by homogeneity of H w.r.t. p: σ( E, H) = E(H) = d dt H(e t E (p,x)) = p H = 2H = 1. (15.1) t= p It follows that { E, H} is a Darboux basis for Σ. In particular we can consider the the symplectic splitting Σ = Σ Σ. Exercise Prove the following intrinsic characterization of the skew-orthogonal to Σ: Σ = {ξ Tλ (T M) : d λ H,ξ = s λ,ξ = }. The assumptions of Lemma are satisfied and we could split our Jacobi curve. Definition The reduced Jacobi curve is defined as follows Ĵ λ (t) := J λ (t) Σ. (15.11) Notice that, if we put V λ := V λ T λ H 1 (1/2), we get Ĵ λ () = V λ, Ĵ λ (t) = e t H V λ. Moreover we have the splitting J λ (t) = Ĵλ(t) R( E t H). We stress again that Ĵ λ (t) is a curve of (n 1)-dimensional Lagrangian subspaces in the(2n 2)- dimensional vector space Σ. Exercise With the notation above (i) Show that the curvature of the curve J λ (t) Σ in Σ is always zero. (ii) Prove that J λ () J λ (s) {} if and only if Ĵλ() Ĵλ(s) {}. 365

366 366

367 Chapter 16 Riemannian curvature On a manifold, in general there is no canonical method for identifying tangent spaces at different points, (or more generally fibers of a vector bundle at different points). Thus, we have to expect that a notion of derivative for vector fields (or sections of a vector bundle), has to depend on certain choices. In our presentation we introduce the general notion of Ehresmann connection and we then we discuss how this notion is related with the notion of parallel transport and covariant derivative usually introduced in classical Riemannian geometry Ehresmann connection Given a smooth fiber bundle E, with base M and canonical projection π : E M, we denote by E q = π 1 (q) the fiber at the point q M. The vertical distribution is by definition the collection of subspaces in TE that are tangent to the fibers V = {V z } z E, V z := kerπ,z = T z E π(z) T z E. Definition Let E be a smooth fiber bundle. An Ehresmann connection on E is a smooth vector distribution H in E satisfying H = {H z } z E, T z E = V z H z. Notice that V, being the kernel of the pushforward π, is canonically associated with the fibre bundle. Defining a connection means exactly to define a canonical complement to this distribution. For this reason H is also called horizontal distribution. Definition LetX Vec(M). Thehorizontal lift of X is theuniquevector field X Vec(E) such that X (z) H z, π X = X, z E. (16.1) The uniqueness follows from the fact that π,z : T z E T π(z) M is an isomorphism when restricted to H z. Indeed π,z is a surjective linear map with kerπ,z = V z. Notation. In the following we will refer also at as the connection on E. 367

368 Given a smooth curve γ : [,T] M on the manifold M, the connection let us to define the parallel transport along γ, i.e. a way to identify tangent vectors belonging to tangent spaces at different points of the curve. Let X t be a nonautonomous smooth vector field defined on a neighborhood of γ, that is an extension of the velocity vector field of the curve 1, i.e. such that γ(t) = X t (γ(t)), t [,T]. Then consider the non autonomous vector field Xt Vec(E) obtained by its lift. Definition Let γ : [,T] M be a smooth curve. The parallel transport along γ is the map Φ defined by the flow of Xt Φ t,t 1 := exp t1 t Xs ds : E γ(t ) E γ(t1 ), for < t < t 1 < T. (16.2) In the general case we need some extra assumptions on the vector field to ensure that (16.2) exists (even for small time t > ) since the existence time of a solution also depend on the point on the fiber. For instance if we the fibers are compact, then it is possible to find such t >. Exercise Show that the parallel transport map sends fibers to fibers and does not depend on the extension of the vector field X t. (Hint: consider two extensions and use the existence and uniqueness of the flow.) Curvature of an Ehresmann connection Assume that π : E M is a smooth fiber bundle and let be a connection on E, defining the splitting E = V H. Given an element z E we will also denote by z hor (resp. z ver ) its projection on the horizontal (resp. vertical) subspace at that point. The commutator of two vertical vector field is always vertical. The curvature operator associated with the connection computes if the same holds true for two horizontal vector fields. Definition Let E be a smooth fiber bundle and a connection on E. Let X,Y Vec(M) and define R(X,Y) := [ X, Y ] ver (16.3) The operator R is called the curvature of the connection. Notice that, given a vector field on E, its horizontal part coincide, by definition, with the lift of its projection. In particular [ X, Y ] hor = [X,Y], (i.e. π [ X, Y ] = [X,Y]) Hence R(X,Y) computes the nontrivial part of the bracket between the lift of X and Y and R if and only if the horizontal distribution H is involutive. The curvature R(X,Y) is also rewritten in the following more classical way R(X,Y) = [ X, Y ] [X,Y]. = X Y X Y [X,Y]. Next we show that R is actually a tensor on T q M, i.e. the value of R(X,Y) at q depends only on the value of X and Y at the point q. 1 this is always possible with a (maybe non autonomous) vector field. 368

369 Proposition R is a skew symmetric tensor on M. Proof. The skew-symmetry is immediate. To prove that the value of R(X,Y) at q depends only on the value of X and Y at the point q, it is sufficient to prove that R is linear on functions. By skew-symmetry, we are reduced to prove that R is linear in the first argument, namely R(aX,Y) = ar(x,y), where a C (M). Notice that the symbol a in the right hand side stands for the function π a = a π in C (E), that is constant on fibers. By definition of lift of a vector field it is easy to prove theidentities ax = a X and X a = Xa for every a C (M). Applying the definition of and the Leibnitz rule for the Lie bracket one gets R(aX,Y) = [ ax, Y ] [ax,y] = a[ X, Y ] ( Y a) X a[x,y] (Ya)X = a[ X, Y ] (Ya) X a [X,Y] +(Ya) X = ar(x,y) Linear Ehresmann connections Assume now that E is a vector bundle on M (i.e. each fiber E q = π 1 (q) has a natural structure of vector space). In this case it is natural to introduce a notion of linear Ehresmann connection on E. Given a vector bundle π : E M, we denote by CL (E) the set of smooth functions on E that are linear on fibers. Remark For a vector bundle π : E M, the base manifold M can be considered immersed in E as the zero section (see also Example 2.48). The dual version of this identification is the inclusion i : C (M) C (E). Indeed any function in C (M) can be considered as a functions in C (E) which is constant on fibers, i.e. more precisely a C (M) π a C (E). Exercise Show that a vector field on E is the lift of a vector field on M if and only if, as a differential operator on C (E), it maps the subspace C (M) into itself. After this discussion it is natural to give the following definition. Definition A linear connection on a vector bundle E on the base M is an Ehresmann connection such that the lift X of a vector field X Vec(M) satisfies the following property: for every a C L (E) it holds Xa C L (E). Remark Given a local basis of vector fields X 1,...,X n on M we can build dual coordinates (u 1,...,u n ) on the fibers of E defining the functions u i (z) = z,x i (q) where q = π(z). In this way E = {(u,q),q M,u R n }, 369

370 and the tangent space to E is splitted in T z E T q M T z E q. A connection on E is determined by the lift of the vector fields X i,i = 1,...,n on the base manifold (recall that π Xi = X i ) Xi = X i + n a ij (u,q) uj, i = 1,...,n, (16.4) j=1 where a ij C (E) are suitable smooth functions. Then is linear if and only if for every i,j the function a ij (u,q) = n k=1 Γk ij (q)u k is linear with respect to u. The smooth functions Γ k ij are also called the Christoffel symbols of the linear connection. Exercise Let γ be a smooth curve on the manifold such that γ(t) = n i=1 v i(t)x i (γ(t)). Show that the differential equation ξ(t) = γ(t) ξ(t) for the parallel transport along γ are written as u j = i,k Γk ij v iu k where (u 1,...,u n ) are the vertical coordinates of ξ. Notice that, for a linear connection, the parallel transport is defined by a first order linear (nonautonomous) ODE. The existence of the flow is then guaranteed from stantard results form ODE theory. Moreover, when it exists, the map Φ t,t 1 is a linear transformation between fibers Covariant derivative and torsion for linear connections Once a connection on a linear vector bundle E is given, we have a well defined linear parallel transport map Φ t,t 1 := t1 exp Xs ds : E γ(t ) E γ(t1 ), for < t < t 1 < T. (16.5) t If we consider the dual map of the parallel transport one can naturally introduce a non autonomous linear flow on the dual bundle (notice the exchange of t,t 1 in the integral) Φ t,t 1 := ( t ) exp Xs ds : Eγ(t ) E γ(t 1 ), for < t < t 1 < T. (16.6) t 1 The infinitesimal generator of this adjoint flow defines a linear parallel transport, hence a linear connection, on the dual bundle E. In what follows we will restrict our attention to the case of the vector bundle E = T M and we assume that a linear connection on T M is given. Notice that, by the above discussion, all the constructions can be equivalently performed on the dual bundle E = TM. For every vector field Y Vec(M) let us denote with Y C (T M) the function Y (λ) = λ,y(q), q = π(z), namely the smooth function on E associated with Y that is linear on fibers. This identification betweenvector fieldsonm andlinearfunctionsont M permitsustodefinethecovariant derivative of vector fields. Definition Let X,Y Vec(M). We define X Y = Z if and only if X Y = Z with Z Vec(M). 37

371 Notice that the definition is well-posed since is linear, hence X Y is a linear function and there exists Z Vec(M) such that X Y = Z. 2 Lemma Let {X 1,...,X n } be a local frame on M. Then Xi X j = Γ k ij X k, where Γ k ij are the Christoffel symbols of the connection. Proof. Let us prove this in the coordinates dual to our frame. In these coordinates, the linear connection is specified by the lifts Xi = X i +Γ k ij u k uj, where u j (λ) = λ,x j. Moreover Xj = u j. Hence it is immediate to show Xi Xj = Γk ij X k, and the lemma is proved. We now introduce the torsion tensor of a linear connection on T M. As usual, σ denotes the canonical symplectic structure on T M. Definition Thetorsion of alinear connection is themap T : Vec(M) 2 Vec(M) defined by the identity T(X,Y) := σ( X, Y ), X,Y Vec(M). (16.7) It is easy to check that T is actually a tensor, i.e. the value of T(X,Y) at a point q depends only on the values of X,Y at the point. The torsion computes how much the horizontal distribution H is far from being Lagrangian. In particular H is Lagrangian if and only if T. The classical formula for the torsion tensor, in terms of the covariant derivative, is recovered in the following lemma. Lemma The torsion tensor satisfies the identity T(X,Y) = X Y Y X [X,Y]. (16.8) Proof. We have to prove that T(X,Y) = X Y Y X [X,Y]. Notice that by definition of the Liouville 1-form s Λ 1 (T M), s λ = λ π we have X (λ) = λ,x = s λ, X. Then we have, using that σ = ds and the Cartan formula (4.75) T(X,Y) = ds( X, Y ) = X s, Y Y s, X s,[ X, Y ] = X s, Y Y s, X s, [X,Y] = X Y Y X [X,Y], where in the second equality we used that s,[ X, Y ] = s,[ X, Y ] hor = s, [X,Y] since the Liouville form by definition depends only on the horizontal part of the vector. Exercise Show that a linear connection on a vector bundle E satisfies the following Leibnitz rule X (ay) = a X Y +(Xa)Y, for each a C (M). 2 There is no confusion in the notation above since, by definition, X it is well defined when applied to smooth functions on T M. Whenever it is applied to a vector field we follow the aforementioned convention. 371

372 16.2 Riemannian connection In this section we want to introduce the Levi-Civita connection on a Riemannian manifold M by defining an Ehresmann connection on T M via the Jacobi curve approach. Recall that every Jacobi curve associated with a trajectory on a Riemannian manifold is regular. Moreover, as showed in Chapter 14, every regular curve in the Lagrangian Grassmannian admits a derivative curve, which defines a canonical complement to the curve itself. Hence, following this approach, it is natural to introduce the Riemannian connection at λ T M as the canonical complement to the Jacobi curve defined at λ. Definition The Levi-Civita connection on T M is the Ehresmann connection H is defined by H λ = J λ (), λ T M, where as usual J λ (t) denotes the Jacobi curve defined at the point λ T M and Jλ derivative curve. denotes its The next proposition characterizes the Levi-Civita connection as the unique linear connection on T M that is linear, metric preserving and torsion free. Proposition The Levi-Civita connection satisfies the following properties: (i) is a linear connection, (ii) is torsion free, (iii) is metric preserving, i.e. X H = for each vector field X Vec(M). Proof. (i). It is enough to prove that the connection H λ is 1-homogeneous, i.e. H cλ = δ c H λ, c >. (16.9) Indeed in this case the functions a ij C (T M) defining the connection (see (16.4)) are 1- homogeneous, hence linear as a consequence of Exercise Let us prove (16.9). The differential of the dilation on the fibers δ c : T M T M satisfies the property δ c (T λ (Tq M)) = T cλ(tq M). From this identity and differentiating the identity one easily gets that Indeed one has the following chain of identities e t H δ c = δ c e ct H, c >, (16.1) J cλ (t) = δ c J λ (ct), t,λ T M. (16.11) J cλ (t) = e t H (T cλ (Tq M)) = e t H δ c (T λ (T qm)) (by (16.1)) = δ c e ct H (T λ (Tq M)) = δ c J λ (ct). 372

373 Now we show that the same relation holds true also for the derivative curve, i.e. J cλ (t) = δ c J λ (ct), t, λ T M. (16.12) Indeed one can check in coordinates (we denote as usual J λ (t) = {(p,s λ (t)p),p R n }) that the identity (16.11) is written as S cλ (t) = 1 c S λ(ct) thus S cλ (t) 1 = cs λ (ct) 1. From here 3 one also gets B cλ (t) = cb λ (ct) and (16.12) follows from the identity S (t) = B 1 (t) +S(t). (See also Exercise 14.22). In particular at t = the identity (16.12) says that H cλ = δ c H λ. (ii). It is a direct consequence of the fact that J λ () is a Lagrangian subspace of T λ(t M) for every λ T M, hence the symplectic form vanishes when applied to two horizontal vectors. (iii). Again, for every X Vec(M), both X and H are horizontal vector field. Since the horizontal space is Lagrangian X H = σ( X, H) =. Exercise Let f : R n R be a smooth function that satisfies f(αx) = αf(x) for every x R n and α. Then f is linear. The following theorem says that a connection satisfying the three properties above is unique. Then it characterize the Levi-Civita connection in terms of the structure constants of the Lie algebra defined by an orthonormal frame. Theorem There is a unique Ehresmann connection satisfying the properties (i), (ii), and (iii) of Proposition 16.18, that is the Levi-Civita connection. Its Christoffel symbols are computed by Γ k ij = 1 2 (ck ij c i jk +cj ki ), (16.13) where c k ij are the smooth functions defined by the identity [X i,x j ] = n k=1 ck ij X k. Proof. Let X 1,...,X n be a local orthonormal frame for the Riemannian structure and let us consider coordinates (q,u) in T M, where the fiberwise coordinates u = (u 1,...,u n ) are dual to the orthonormal frame. From the linearity of the connection it follows that there exist smooth functions Γ k ij : M R (depending on q only) such that n Xi = X i + Γ k ij u k uj, j=1 i = 1,...,n. In particular Xi X j = Γ k ij X k. In these coordinates the Hamiltonian vector field associated with the Riemannian Hamiltonian H = 1 n 2 i=1 u2 i reads (see also Exercise??) H = n i,j,k=1 u i X i +c k iju i u k uj, while the symplectic form σ is written (ν 1,...,ν n denotes the dual basis to X 1,...,X n ) σ = n i,j,k=1 du k ν k c k iju k ν i ν k. 3 recall that B is the zero order term of the expansion of S

374 Since the horizontal space is Lagrangian, one has the relations n = σ( Xi, Xj ) = (Γ k ij Γk ji ck ij )u k, i,j = 1,...,n, k=1 hence c k ij = Γk ij Γk ji for all i,j,k. Moreover the connection is metric, i.e. it satisfies n = Xi H = Γ k ij u ku j, j,k=1 i = 1,...,n. The last identity implies that Γ k ij is skew-symmetric with respect to the pair (j,k), i.e. Γk ij = Γj ik. Thus combining the two identities one gets c k ij c i jk +cj ki = (Γk ij Γ k ji) (Γ i jk +Γi kj )+(Γj ki Γj ik ) = Γ k ij Γj ik = 2Γk ij. Remark Notice that in the classical approach one can recover formula (16.13) from the following particular case of the Koszul formula Γ k ij = g( X i X j,x k ) = 1 2 (g([x i,x j ],X k ) g([x j,x k ],X i )+g([x k,x i ],X j )), that holds for every orthonormal basis X 1,...,X n. Notice also that the Hamiltonian vector field is written in coordinates H = n i=1 u i Xi, which gives another proof of the fact that it is horizontal. Let X,Y,Z,W Vec(M). We define R(X,Y)Z = W if R(X,Y)Z = W. Proposition (Bianchi identity). For every X, Y, Z Vec(M) the following identity holds R(X,Y)Z +R(Y,Z)X +R(Z,X)Y =. (16.14) Proof. We will show that (16.14) is a consequence of the Jacobi identity (2.32). Using that is a torsion free connection we can write Then [X,[Y,Z]] = X [Y,Z] [Y,Z] X = X Y Z X Z Y [Y,Z] X, [Z,[X,Y]] = Z X Y Z Y X [X,Y] Z, [Y,[Z,X]] = Y Z X Y X Z [Z,X] Y, = [X,[Y,Z]]+[Y,[Z,X]] +[Z,[X,Y]] = X Y Z X Z Y [Y,Z] X + Z X Y Z Y X [X,Y] Z + Y Z X Y X Z [Z,X] Y = R(X,Y)Z +R(Y,Z)X +R(Z,X)Y. 374

375 Exercise Prove the second Bianchi identity ( X R)(Y,Z,W)+( Y R)(Z,X,W)+( Z R)(X,Y,W) =, X,Y,Z,W Vec(M). (Hint: Expand the identity [X,[Y,Z]]+[Y,[Z,X]]+[Z,[X,Y]] W =.) Let usdenote(x,y,z,w) := R(X,Y)Z,W. Following thisnotation, thefirstbianchi identity can be rewritten as follows: (X,Y,Z,W)+(Z,X,Y,W)+(Y,Z,X,W) =, X,Y,Z,W Vec(M). (16.15) Remark The property of the Riemann tensor can be reformulated as follows (X,Y,Z,W) = (Y,X,Z,W), (X,Y,Z,W) = (X,Y,W,Z). (16.16) Proposition For every X,Y,Z,W Vec(M) we have (X,Y,Z,W) = (Z,W,X,Y). Proof. Using (16.15) four times we can write the identities (X,Y,Z,W) +(Z,X,Y,W) +(Y,Z,X,W) =, (Y,Z,W,X) +(W,Y,Z,X) +(Z,W,Y,X) =, (Z,W,X,Y )+(X,Z,W,Y )+(W,X,Z,Y ) =, (W,X,Y,Z) +(Y,W,X,Z)+(X,Y,W,Z) =. Summing all together and using the skew symmetry (16.16), one gets (X,Z,W,Y ) = (W,Y,X,Z). Proposition Assume that (X,Y,X,W) = for every X,Y,W Vec(M). Then (X,Y,Z,W) = X,Y,Z,W Vec(M). Proof. By assumptions and the skew-simmetry properties (16.16) of the Riemann tensor we have that (X,Y,Z,W) = whenever any two of the vector fields coincide. In particular = (X,Y +W,Z,Y +W) = (X,Y,Z,W)+(X,W,Z,Y ). (16.17) since the two extra terms that should appear in the expansion vanish by assumptions. Then (16.17) can be rewritten as (X,Y,Z,W) = (Z,X,Y,W), i.e. the quantity (X,Y,Z,W) is invariant by ciclic permutations of X,Y,Z. But the cyclic sum of terms is zero by (16.15), hence (X,Y,Z,W) =. We end this section by summarizing the symmetry property of the Riemann curvature as follows Corollary There is a well defined map R : 2 T q M 2 T q M, R(X Y) := R(X,Y). Moreover R is skew-adjoint with respect to the induced scalar product on 2 T q M, that means R(X Y),Z W = X Y,R(Z W). 375

376 16.3 Relation with Hamiltonian curvature In this section we compute the curvature of the Jacobi curve associated with a Riemannian geodesic and we describe the relation with the Riemann curvature discussed in the previous section. As we show, the curvature associated to a geodesic is a kind of sectional curvature operator in the direction of the geodesic itself. Definition The Hamiltonian curvature tensor at λ T M is the operator R λ := R Jλ () : V λ V λ. In other words R λ is the curvature of the Jacobi curve associated with λ at t =. Proposition Let ξ V λ and V be a smooth vertical vector field extending ξ. Then R λ (ξ) = [ H,[ H,V] hor ] ver (λ) Proof. This is a direct consequence of Proposition Indeed recall that the curvature of the Jacobi curve is expressed through the composition R λ = J λ () J λ (). Moreover, being J λ () = V λ and J λ () = H λ we have that π J()J ()(ξ) = ξ hor, π J ()J()(η) = η ver. FInally we can extend vectors in J λ () (resp. Jλ ()) by applying the Hamiltonian vector field since J λ (t) = e t H J λ () (resp. Jλ (t) = et H Jλ ()). From these remarks we obtain the following formulas J λ ()ξ = [ H,V] hor, J λ ()η = [ H,W] ver for some V vertical (resp. W horizontal) extension of the vector ξ V λ (reps. η H λ ). Another immediate property of the curvature tensor is the homogeneity with respect to the rescaling of the covector (that corresponds to reparametrization of the trajectory). Indeed by choosing ϕ(t) = ct, with c >, in Proposition one gets Corollary For every c > we have R cλ = c 2 R λ. If we use the Riemannian product to identify the tangent and the cotangent space at a point, we recognize that R λ is nothing but the sectional curvature operator where one entry is the tangent vector γ of the geodesic. Let us denote by I : TM T M the isomorphism defined by the Riemannian scalar product. In particular I(v) = λ for λ T qm and v T q M if λ,w = v w for all w T q M. Let denote H q = H T q M. Recall that the differential of H q can be interpreted as a linear map DH q : T qm T q M that sends λ T qm into D λ H q seen as a linear functional on T qm, i.e. a tangent vector. This map is actually the inverse of the isomorphism I. Lemma D λ H q = I 1 (λ). Proof. It is a simple consequence of the formula H(λ) = λ,i 1 (λ).

377 Corollary Assume I(v) = λ, then H(λ) = v. Proof. Indeed, since H is an horizontal vector field, it is sufficient to show that π H(λ) = v, which is a consequence of Lemma Indeed for every vertical vector ξ T λ (T qm) one has ξ,v = ξ,i 1 (λ) = D λ H(ξ) = σ(ξ, H(λ)) = ξ,π H(λ). By arbitrary of ξ T λ (Tq M) one has the equality v = π H(λ). Theorem We have the following identity R I(X) (I(Y)) = R(X,Y)X, X,Y T q M. (16.18) Proof. We have to compute the quantity R I(X) (I(Y)) = [ H,[ H,IY] hor ] ver (I(X)) First notice that π [ H,I(Y)] = Y hence [ H,I(Y)] hor = Y. Then [ H,[ H,I(Y)] hor ] ver (I(X)) = [ X, Y ] ver (I(X)) = R(X,Y)(X). Definition The Ricci tensor at λ is defined as the trace of the curvature operator at λ, Ric(λ) := trace R λ. Exercise Prove the following expression for the Ricci tensor, where X 1,...,X n is a local orthonormal frame and γ() = v = I 1 (λ) is the tangent vector to the geodesic: Ric(λ) = = n R(v,X i )v X i i=1 n σ λ ([ H, Xi ], Xi ). i=1 This shows that Ric(λ) = Ric(v) coincide with the classical Riemannian Ricci tensor Locally flat spaces In this section we want to show that the Riemannian curvature is the only obstruction for a Riemannian manifold to be locally Euclidean. Finally we show that the Riemannian curvature is also completely recovered by the Hamiltonian curvature R λ. A Riemannian manifold M is called flat if R(X,Y) = for every X,Y Vec(M). Theorem M is flat if and only if M is locally isometric to R n. 377

378 Proof. If M is locally isometric to R n, then its curvature tensor at every point in a neighborhood is identically zero. Then let us assume that the Riemann tensor R vanishes identically and prove that M is locally Euclidean. We will do that by showing that there exists coordinate such that the Hamiltonian, in these set of coordinates, is written as the Hamiltonian of the Euclidean R n. Since R is identically zero the horizontal distribution (defined by the Levi Civita connection) is involutive. Hence, by Frobenius theorem, there exists a horizontal Lagrangian foliation of T M, i.e. for each λ T M, there exists a leaf L λ of the foliation passing through this point that is tangent to the horizontal space H λ. In particular each leaf is transversal to the fiber Tq M, where q = π(λ). Fix a point q M and a neighborhood O q where R is identically zero. Define the map Ψ : π 1 (O q ) T q M, λ π 1 (O q ) L λ T q M that assigns to each λ the intersection of the leaf passing through this point and T q M. Exercise Show that Ψ is a linear, orthogonal transformation, i.e. H(Ψ(λ)) = H(λ) for all λ π 1 (O q ). (Hint: use the linearity of the connection and the fact that H is horizontal). Fix now a basis {ν 1,...,ν n } in T q M that is orthonormal (with respect to the dual metric). Being Ψ linear on fibers, we can write Ψ(λ) = n ψ i (λ)ν i, where ψ i (λ) = λ,x i (q) i=1 for a suitable basis of vector fields X 1,...,X n in the neighborhood O q. Moreover X 1,...,X n is an orthonormal basis since Ψ is an orthogonal map. We want to show that {X 1,...,X n } is an orthonormal basis of vector fields that commutes everywhere. Let us show that the fact that the foliation is Lagrangian implies [X i,x j ] = for all i,j = 1,...,n. Indeed the tautological 1-form is written in these coordinates as s = n i=1 ψ iν i and σ = ds = n dψ i ν i +ψ i dν i. (16.19) i=1 Since on each leaf the function ψ i is constant by definition (hence dψ i L = ), we have that σ L = i ψ idν i. In particular each leaf is Lagrangian if and only if dν i = for i = 1,...,n. Then, from the Cartan formula, one gets = dν i (X j,x k ) = ν i ([X j,x k ]), i,j,k. This proves that [X i,x j ] = for each i,j = 1,...,n. Hence, in the coordinate set (ψ,q), we have H(ψ,q) = 1 2 ψ 2. The next result shows that the Hamiltonian curvature can detect if a manifold is flat or not. Corollary M is flat if and only if R λ = for every λ T M. 378

379 Proof. Assume that M is flat. Then R is identically zero and a fortiori R λ = from (16.18). Let us prove the converse. Recall that R λ = implies, again by (16.18), that (X,Y,X,W) =, X,Y,W Vec(M). Then the statement is a consequence of Proposition Exercise Prove that actually the Riemann tensor R is completely determined by R Example: curvature of the 2D Riemannian case In this section we apply the definition of curvature discussed in this chapter to a two dimensional Riemannian surface. As we explain, we recover that the Riemannian curvature tensor is determined by the Gauss curvature of the manifold. Let M be a 2-dimensional surface and f 1,f 2 Vec(M) be a local orthonormal frame for the Riemannian metric. The Riemannian Hamiltonian H is written as follows (we use canonical coordinates λ = (p,x) on T M) H(p,x) = 1 2 ( p,f 1(x) 2 + p,f 2 (x) 2 ) (16.2) Here, foracovector λ = (p,x) T M, thesymplecticvector spaceσ λ = T λ (T M) is 4-dimensional. Recall that, being M 2-dimensional, the level set H 1 (1/2) Tq M is a circle. Hence, there is a well defined vector field that produces rotation on the reduced fiber. Let us define the angle θ on the level H 1 (1/2) Tx M by setting p,f 1 (x) = cosθ, p,f 2 (x) = sinθ, in such a way that θ = corresponds to the direction of f 1. Denote by θ the rotation in the fiber of the unit tangent bundle and by E, the Euler vector field. Denote finally by H := [ θ, H]. Notice that Σ λ = V λ H λ where V λ = span{ E, θ } and H λ = span{ H, H }. Lemma The vector fields { E, θ, H, H } at λ form a Darboux basis for Σ λ. Proof. We want to compute the following symplectic products of the vector fields: σ( θ, E) =, σ( θ, H) =, σ( E, H) = 1. (16.21) σ( θ, H ) = 1, σ( E, H ) =, σ( H, H ) =. (16.22) Indeed, let us prove first (16.21). The first equality follows from the fact that both vectors belong to the vertical subspace, that is Lagrangian. The second one is a consequence of the fact that, by construction, θ is tangent to the level set of H, i.e. σ( θ, H) = θ ( H) = dh, θ =. The last identity is (15.1). As a preliminary step for the proof of (16.22) notice that, if s = i E σ denotes the tautological Liouville form, one has s, H = 1, s, H =. (16.23) 379

380 These two identities follows from s, H = σ( E, H) = 1, (16.24) s, H = s,[ θ, H] = ds( θ, H) = σ( θ, H) =, (16.25) where in the second line we used the Cartan formula (4.75) and the fact that θ is vertical. Let us now prove (16.22). Being [ θ, H ] = [ θ,[ θ, H]] = H, we have again by Cartan formula and (16.23) σ( θ, H ) = ds( θ, H ) = s,[ θ, H ] = s, H = σ( E, H) = 1 Moreover by (16.23) The last computation is similar. Let us write σ( E, H ) = s, H =. σ( H, H ) = dh, H = dh,[ θ, H], and apply the Cartan formula to the last term (with dh as 1-form). dh([ θ, H]) = d 2 H( θ, H) θ dh, H + H dh, θ = since the three terms are all equal to zero. Now we compute the curvature via the Jacobi curve, reduced by homogeneity. Notice that by Lemma 16.4 we can remove the symplectic space spanned by { E, H} and, being { E, H} = { θ, H }, we have Ĵ λ (t) = span{e t H θ }. Then we define the generator of the Jacobi curve V t = e t H θ, Vt = e t H [ H, θ ] = e t H H Notice that σ(v t, V t ) = 1, for every t. (16.26) Indeedit istruefort = andtheequality isvalid forall tsincethetransformatione t H is symplectic. To compute the curvature of the Jacobi curve let us write V t = α(t)v β(t) V (16.27) We claim that the matrix S(t) representing the 1-dimensional Jacobi curve (that actually is a scalar), is given in these coordinates by S(t) = β(t) α(t) = σ(v,v t ) σ( V,V t ). Indeed the identity ( V t = α(t)v β(t) V = α(t) V β(t) ) α(t) V, (16.28) 38

381 tells us that the matrix representing the vector space spanned by V t is the graph of the linear map V β(t) α(t) V. Moreover, using that V and V is a Darboux basis, it is easy to compute σ(v,v t ) = α(t)σ(v,v ) β(t)σ(v }{{}, V ) = β(t), (16.29) }{{} = = 1 σ( V,V t ) = α(t)σ( V,V ) β(t)σ( }{{} V, V ) = α(t). (16.3) }{{} =1 = Differentiating the identity (16.26) with respect to t one gets the relations σ(v t, V t ) =, σ(v t,v (3) t ) = σ( V t, V t ) Notice that these quantities are constant with respect to t. Collecting the above results one can compute the asymptotic expansion of S(t) with respect to t t+ t3 S(t) = 6 σ(v, V... )+O(t 5 ) 1+ t2 2 σ( V, V )+O(t 4 ) ( = t+ t3 6 σ(v, V... )( ) )+O(t 5 ) 1 t2 2 σ( V, V )+O(t 4 ) (16.31) (16.32) and one gets for the derivative of S(t) at t = Ṡ() = 1, S() =,... S() = 2σ( V, V ). The formula for the curvature R is finally computed in terms of S(t) as follows: Using that V t = e t H θ we can expand V t as follows hence (16.33) is rewritten as R = 1... S() = σ( V, 2 V ) (16.33) V t = θ +t[ H, θ ]+ t2 2 [ H,[ H, θ ]]+O(t 3 ) R = σ([ H,[ H, θ ]],[ H, θ ]) (16.34) = σ([ H, H ], H ) (16.35) To end this section, we compute the curvature R with respect to the orthonormal frame f 1,f 2. Denote the Hamiltonians h i (p,x) = p,f i (x), i = 1,2. The PMP reads ẋ = h 1 f 1 (x)+h 2 f 2 (x) ḣ 1 = {H,h 1 } = {h 2,h 1 }h 2 (16.36) ḣ 2 = {H,h 2 } = {h 2,h 1 }h 1 381

382 Moreover {h 2,h 1 }(p,x) = p,[f 2,f 1 ](x). Assume that [f 1,f 2 ] = a 1 f 1 +a 2 f 2, a i C (M). Then {h 2,h 1 } = a 1 h 1 a 2 h 2. If we restrict to h 1 = cosθ and h 2 = sinθ equations (16.36) become { ẋ = cosθf 1 +sinθf 2 θ = a 1 cosθ+a 2 sinθ and it is easy to compute the following expression for H and commutators 4 H = h 1 f 1 +h 2 f 2 +(a 1 h 1 +a 2 h 2 ) θ, H = h 2 f 1 +h 1 f 2 +( a 1 h 2 +a 2 h 1 ) θ, [ H, H ] = (f 1 a 2 f 2 a 1 a 2 1 a2 2 ) θ. Recall that κ = f 1 a 2 f 2 a 1 a 2 1 a 2 2, is the Gaussian curvature of the surface M (see also Chapter 4). Since σ( θ, H ) = 1 one gets R = σ([ H, H ], H ) = σ(κ θ, H ) = κ. Exercise In this exercise we recover the previous computations introducing dual coordinates to our frame. Let ν 1,ν 2 be the dual basis to f 1,f 2 and set f θ := h 1 f 1 +h 2 f 2, ν θ := h 1 ν 1 +h 2 ν 2. Define the smooth function b := a 1 h 1 +a 2 h 2 on T M. In these notation H = f θ +b θ, H = f θ +b θ, where denotes the derivative with respect to θ. Then, using that in these coordinates the tautological form is s = ν θ, show that the symplectic form is written as and compute the following expressions σ = ds = dθ ν θ bν 1 ν 2, i H σ = (b b)ν θ dθ, [ H, H ] = (f θ b f θ b b 2 b 2 ) θ, showing that this gives an alternative proof of the above computation of the curvature. 4 here we still use the notation h 1,h 2 as functions of θ satisfying θ h 1 = h 2, θ h 2 = h 1 382

383 Chapter 17 Curvature in 3D contact sub-riemannian geometry The main goal of this chapter is to compute the curvature of the three dimensional contact sub- Riemannian case. Then we will discuss how the invariant contained in the sub-riemannian curvature classify 3D left-invariant structures on Lie groups D contact sub-riemannian manifolds In this section we consider a sub-riemannian manifold M of dimension 3 whose distribution is defined as the kernel of a contact 1-form ω Λ 1 (M), i.e. D q = kerω q for all q M. Let us also fix a local orthonormal frame f 1,f 2 such that D q = kerω q = span{f 1 (q),f 2 (q)} Recall that the 1-form ω Λ 1 (M) defines a contact distribution if and only if ω dω is never vanishing. Exercise Let M be a 3D manifold, ω Λ 1 M and D = kerω. The following are equivalent: (i) ω is a contact 1-form, (ii) dω D, (iii) f 1,f 2 D linearly independent, then [f 1,f 2 ] / D. Remark The contact form ω is defined up to a smooth function, i.e. if ω is a contact form, aω is a contact form for every a C (M). This let us to normalize the contact form by requiring that dω D = ν 1 ν 2, (i.e. dω(f 1,f 2 ) = 1.) where ν 1,ν 2 is the dual basis to f 1,f 2. This is equivalent to say that dω is equal to the area form induced on the distribution by the sub-riemannian scalar product. Definition The Reeb vector field of the contact structure is the unique vector field f Vec(M) that satisfies dω(f, ) =, ω(f ) = 1 383

384 In particular f is transversal to the distribution and the triple {f,f 1,f 2 } defines a basis of T q M at every point q M. Notice that ω,ν 1,ν 2 is the dual basis to this frame. Remark The flow generated by the Reeb vector field e tf : M M is a group of diffeomorphisms that satisfy (e tf ) ω = ω. Indeed L f ω = d(i f ω)+i f dω = since i f ω = ω(f ) = 1 is constant and i f dω = dω(f, ) =. In what follows, to simplify the notation, we will replace the contact form ω by ν, as the dual element to the vector field f. We can write the structure equations of this basis of 1-forms dν = ν 1 ν 2 dν 1 = c 1 1 ν ν 1 +c 1 2 ν ν 2 +c 1 12 ν 1 ν 2 (17.1) dν 2 = c 2 1 ν ν 1 +c 2 2 ν ν 2 +c 2 12 ν 1 ν 2 The structure constants c k ij are smooth functions on the manifold. Recall that the equation 2 2 dν k = c k ij ν i ν j if and only if [f j,f i ] = c k ij f k. i,j= k= Introduce the coordinates (h,h 1,h 2 ) in each fiber of T M induced by the dual frame λ = h ν +h 1 ν 1 +h 2 ν 2 where h i (λ) = λ,f i (q) are the Hamiltonians linear on fibers associated to f i, for i =,1,2. The sub-riemannian Hamiltonian is written as follows H = 1 2 (h2 1 +h2 2 ). We now compute the Poisson bracket {H,h }, denoting with {H,h } q its restriction to the fiber T qm. Proposition The Poisson bracket {H,h } q is a quadratic form. Moreover we have {H,h } = c 1 1h 2 1 +(c 2 1 +c 1 2)h 1 h 2 +c 2 2h 2 2, (17.2) c 1 1 +c 2 2 =. (17.3) Notice that q ker{h,h } q and {H,h } q can be treated as a quadratic form on T q M/ q = q. Proof. Using the equality {h i,h j }(λ) = λ,[f i,f j ](q) we get {H,h } = 1 2 {h2 1 +h 2 2,h } = h 1 {h 1,h }+h 2 {h 2,h } = h 1 (c 1 1h 1 +c 2 1h 2 )+h 2 (c 1 2h 1 +c 2 2h 2 ) = c 1 1 h2 1 +(c2 1 +c1 2 )h 1h 2 +c 2 2 h

385 Differentiating the first equation in (17.1) one gets: which proves (17.3). = d 2 ν = dν 1 ν 2 ν 1 ν 2 = (c 1 1 ν ν 1 ) ν 2 ν 1 (c 2 2 ν ν 2 ) = (c 1 1 +c 2 2)ν ν 1 ν 2 Remark Being {H,h } q a quadratic form on the Euclidean plane D q (using the canonical identification of the vector space D q with its dual Dq given by the scalar product), it can be interpreted as a symmetric operator on the plane itself. In particular its determinant and its trace are well defined. From (17.3) we get trace{h,h } q = c 1 1 +c 2 2 =. This identity is a consequence of the fact that the flow defined by the normalized Reeb f preserves not only the distribution but also the area form on it. It is natural then to define our first invariant as the positive eigenvalue of this operator, namely: χ(q) = det{h,h } q. (17.4) Notice that the function χ measures an intrinsic quantity since both H and h are defined only by the sub-riemannian structure and are independent by the choice of the orthonormal frame. Indeed the quantity {H,h } compute the derivative of H along the flow of h, i.e. the obstruction to the fact that the flow of the Reeb field f (which preserves the distribution and the volume form on it) to preserve the metric. Notice that, by definition χ. Corollary Assume that the vector field f is complete. Then {e tf } t R is a group of sub- Riemannian isometries if and only if χ. In the case when χ one can consider (locally) the quotient of M with respect to the action of this group, i.e. the space of trajectories described by f. The two dimensional surface defined by the quotient strucure is endowed with a well defined Riemannian metric. The sub-riemannian structure on M coincide with the isoperimetric Dido problem constructed on this surface. The Heisenberg case corresponds with the case when the surface has zero Gaussian curvature Canonical frames In this section we want to show that it is always possible to select a canonical orthonormal frame for the sub-riemannian structure. In this way we are able to find missing discrete invariants and to classify sub-riemannian structures simply knowing structure constants c k ij for the canonical frame. We study separately the two cases χ and χ =. We start by rewriting and improving Proposition 17.5 when χ. 385

386 Proposition Let M be a 3D contact sub-riemannian manifold and q M. If χ(q), then there exists a local frame such that {h,h } = 2χh 1 h 2. (17.5) In particular, in the Lie group case with left-invariant stucture, there exists a unique (up to a sign) canonical frame (f,f 1,f 2 ) such that Moreover we have χ = c2 1 +c1 2 2 [f 1,f ] = c 2 1f 2, [f 2,f ] = c 1 2 f 1, (17.6) [f 2,f 1 ] = c 1 12f 1 +c 2 12f 2 +f., κ = (c 1 12 )2 (c 2 12 )2 + c2 1 c1 2. (17.7) 2 Proof. From Proposition 17.5 we know that the Poisson bracket {h,h } q is a non degenerate symmetric operator with zero trace. Hence we have a well defined, up to a sign, orthonormal frame by setting f 1,f 2 as the orthonormal isotropic vectors of this operator (remember that f depends only on the structure and not on the orthonormal frame on the distribution). It is easily seen that in both of these cases we obtain the expression (17.5). Remark Notice that, if we change sign to f 1 or f 2, then c 2 12 or c1 12, respectively, change sign in (17.6), while c 1 2 and c2 1 are unaffected. Hence equalities (17.7) do not depend on the orientation of the sub-riemannian structure. If χ = the above procedure cannot apply. Indeed both trace and determinant of the operator vanish, hence we have {h,h } q =. From (17.2) we get the identities so that commutators (??) simplify in (where c = c 2 1 ) c 1 1 = c2 2 =, c2 1 +c1 2 =. (17.8) [f 1,f ] = cf 2, [f 2,f ] = cf 1, (17.9) [f 2,f 1 ] = c 1 12f 1 +c 2 12f 2 +f. We want to show, with an explicit construction, that also in this case there always exists a rotation of our frame, by an angle that smoothly depends on the point, such that in the new frame κ is the only structure constant which appear in (17.9). Lemma Let f 1,f 2 be an orthonormal frame on M. If we denote with f 1, f 2 the frame obtained from the previous one with a rotation by an angle θ(q) and with ĉ k ij structure constants of rotated frame, we have: ĉ 1 12 = cosθ(c 1 12 f 1 (θ)) sinθ(c 2 12 f 2 (θ)), ĉ 2 12 = sinθ(c 1 12 f 1 (θ))+cosθ(c 2 12 f 2 (θ)). 386

387 Now we can prove the main result of this section. Proposition Let M be a 3D simply connected contact sub-riemannian manifold such that χ =. Then there exists a rotation of the original frame f 1, f 2 such that: [ f 1,f ] = κ f 2, [ f 2,f ] = κ f 1, (17.1) [ f 2, f 1 ] = f. Proof. Using Lemma 17.1 we can rewrite the statement in the following way: there exists a function θ : M R such that f 1 (θ) = c 1 12, f 2(θ) = c (17.11) Indeed, this would imply ĉ 1 12 = ĉ2 12 = and κ = c. Let us introduce simplified notations c 1 12 = α 1, c 2 12 = α 2. Then If (ν,ν 1,ν 2 ) denotes the dual basis to (f,f 1,f 2 ) we have and from (17.9) we get: κ = f 2 (α 1 ) f 1 (α 2 ) (α 1 ) 2 (α 2 ) 2 +c. (17.12) dθ = f (θ)ν +f 1 (θ)ν 1 +f 2 (θ)ν 2. f (θ) = ([f 2,f 1 ] α 1 f 1 α 2 f 2 )(θ) = f 2 (α 1 ) f 1 (α 2 ) α 2 1 α 2 2 = κ c. Suppose now that (17.11) are satisfied, we get dθ = (κ c)ν +α 1 ν 1 +α 2 ν 2 =: η. (17.13) with the r.h.s. independent from θ. To prove the theorem we have to show that η is an exact 1-form. Since the manifold is simply connected, it is sufficient to prove that η is closed. If we denote ν ij := ν i ν j dual equations of (17.9) are: dν = ν 12, dν 1 = cν 2 +α 1 ν 12, dν 2 = cν 1 α 2 ν 12. and differentiating we get two nontrivial relations: f 1 (c)+cα 2 +f (α 1 ) =, (17.14) f 2 (c) cα 1 +f (α 2 ) =. (17.15) 387

388 Recollecting all these computations we prove the closure of η dη = d(κ c) ν +(κ c)dν +dα 1 ν 1 +α 1 dν 1 +dα 2 ν 2 +α 2 dν 2 = dc ν +(κ c)ν f (α 1 )ν 1 f 2 (α 1 )ν 12 +α 1 (α 1 ν 12 cν 2 ) +f (α 2 )ν 2 +f 1 (α 2 )ν 12 +α 2 (cν 1 α 2 ν 12 ) = (f (α 1 )+α 2 c+f 1 (c))ν 1 +(f (α 2 ) α 1 c+f 2 (c))ν 2 =. +(κ c f 2 (α 1 )+f 1 (α 2 )+α 2 1 +α2 2 )ν 12 where in the last equality we use (17.12) and (17.14)-(17.15) Curvature of a 3D contact structure In this section we compute the sub-riemannian curvature of a 3D contact structure with a technique similar to that used in Section 16.5 for the 2D Riemannian case. Let us consider the level set {H = 1/2} = {h 2 1 +h2 2 = 1} and define the coordinate θ in such a way that h 1 = cosθ, h 2 = sinθ. On the bundle T M H 1 (1/2) we introduce coordinates (x,θ,h ). Notice that each fiber is topologically a cylinder S 1 R. The sub-riemannian Hamiltonian equation written in these coordinates are ẋ = h 1 f 1 (x)+h 2 f 2 (x) ḣ 1 = {H,h 1 } = {h 2,h 1 }h 2 (17.16) ḣ 2 = {H,h 2 } = {h 2,h 1 }h 1 ḣ = {H,h } Computing the Poisson bracket {h 2,h 1 } = h + c 1 12 h 1 + c 2 12 h 2 and introducing the two functions a,b : T M R given by a = {H,h } = 2 c j i h ih j, b := c 1 12h 1 +c 2 12h 2. i,j=1 we can rewrite the system, when restricted to H 1 (1/2), as follows ẋ = cosθf 1 +sinθf 2 θ = h b ḣ = a (17.17) Notice that, while a is intrinsic, the function b depends on the choice of the orthonormal frame. 388

389 In particular we have for the Hamiltonian vector field in the coordinates (q,θ,h ) (where we use h 1,h 2 as a shorthand for cosθ and sinθ): H = h 1 f 1 +h 2 f 2 (h +b) θ +a h (17.18) [ θ, H] = H = h 2 f 1 +h 1 f 2 +a h b θ (17.19) where we denoted by the derivative with respect to θ, e.g. h 1 = h 2 and h 2 = h 1. Now consider thesymplectic vector spaceσ λ = T λ (T M). Thevertical subspacev λ is generated by the vectors θ, h, E. Hence the Jacobi curve is J λ (t) = span{e t H θ,e t H h,e t H E} The first reduction, by homogeneity, let us to split the space Σ λ = span{ E, H} span{ E, H} and consider the reduced Jacobi curve Λ(t) := Ĵλ(t) in the 4-dimensional symplectic space Λ(t) := e t H V λ /R H = span{e t H θ,e t H h }/RH Next we describe the second reduction of the Jacobi curve, the one related with the fact that the curve is non-regular. Indeed notice that the rank of Ĵλ(t) is 1. To find the new reduced curve, we need to compute the kernel of the derivative of the curve at t = From the definition of Λ := Λ() it follows that Γ := Ker Λ() Λ( θ ) = π [ H, θ ] = h 2 f 1 h 1 f 2 Λ( h ) = π [ H, h ] = π ( θ ) = Hence Γ = R h and Γ is 3-dimensional in V λ /R H. Proposition We have the following characterizations: (i) Γ = span{ h, θ, H } in V λ /R H, (ii) { θ, H } is a Darboux basis for Γ /Γ. Proof. Since h and θ are vertical to prove (i) it is enough to show that H is skew-orthongonal to h. It is easy to compute, by Cartan formula σ( h, H ) = h s, H H s, h s,[ h, H ] =, since all the three terms vanish. Indeed s, H = σ( E, H ) = and s, h = s,[ h, H ] = since h and [ h, H ] are both vertical, as can be computed from (17.19). To complete the proof of (ii) it is enough to show, using [ θ, H ] = H, that σ( θ, H ) = θ s, H H s, θ s,[ θ, H ] = s, H =

390 Next we compute the curvature in terms of the Hamiltonian vector field and its commutators. For a vector field W we use the notations Ẇ := [ H,W], W := [ θ,w]. Let us consider the vector field V t = e t H h. Notice that V = θ, V = H. The fact that θ and h are vertical implies that σ(v t, V t ) =, t Differentiating the above identity at t = we get (from now on, we omit t when we evaluate at t = ) σ( V, V)+σ(V, V) = = σ(v, V) =. Differentiating once more the last identity and using σ( V, V) = σ( θ, H ) = 1 one gets σ( V, V)+σ(V,V (3) ) = = σ(v,v (3) ) = 1. With similar computations one can show that σ( V,V (3) ) = σ(v,v (4) ) =. Evaluating all derivatives of order 4 one can see that r := σ( V,V (3) ) = σ( V,V (4) ) = σ(v,v (5) ). Proposition The sub-riemannian curvature is R = 1 1 σ([ H, H ], H ) = r 1 Proof. The second equality follows from the definition of r and the fact that V = H and V (3) = [ H, H ]. To prove the first identity we have to compute the Schwartzian derivative of the bi-reduced curve, in the symplectic basis ( V, V) of the space Γ /Γ (notice the minus sign). Recall that Λ(t) = span{v t, V t }. To compute the 1-dimensional reduced curve Λ Γ (t) in the symplectic space Γ /Γ we need to compute the intersection of Λ(t) with Γ (for all t). In other words we look for x(t) such that σ( V t +x(t)v t,v ) = = x(t) = σ( V t,v ) σ(v t,v ). (17.2) Then we write this vector as a linear combination of the Darboux basis (cf. (16.28) for the 2D Riemannian case) V t +x(t)v t = α(t) V β(t) V +ξ(t)v (17.21) To see it as acurve in the space Γ/Γ we simply ignore the coefficient along V. Inthese coordinates the matrix S(t), which is a scalar, representing the curve is S(t) = β(t) α(t) (17.22) 39

391 Notice that this is a one-dimensional non-degenerate curve. These coefficients are computed by the symplectic products Combining (17.23),(17.24) with (17.22) and (17.2) one gets α(t) = σ( V t +x(t)v t, V ) (17.23) β(t) = σ( V t +x(t)v t, V ) (17.24) S(t) = σ( V t, V )σ(v t,v ) σ(v t, V )σ( V t,v ) σ( V t, V )σ( V t, V ) σ( V t, V )σ( V t, V ) (17.25) After some computations, by Taylor expansion one gets S(t) = t 4 t3 12 r+o(t4 ) (17.26) Since S = the curvature is computer by R =... S 2Ṡ = r 1 We end this section by computing the expression of the curvature in terms of the orthonormal frame for the distribution and the Reeb vector filed. As usual we restrict to the level set H 1 (1/2) where h 2 1 +h 2 2 = 1, h 1 = cosθ, h 2 = sinθ. In the following we use the notation f θ = h 1 f 1 +h 2 f 2, ν θ = h 1 ν 1 +h 2 ν 2. If h = (h 1,h 2 ) = (cosθ,sinθ) we denote by h = ( h 2,h 1 ) = ( sinθ,cosθ) its derivative with respect to θ and, more in general, we denote F := θ F for a smooth function F on T M. To express the quantity r = σ([ H, H ], H ) we start by computing the commutator [ H, H ]. From (17.18) and (17.19) one gets [ H, H ] = f +h f θ +(f 2 c 1 12 f 1c 2 12 (h +b)b (b ) 2 +a ) θ. Next we write, following this notation, the symplectic form σ = ds. The Liouville form s is expressed, in the dual basis ν,ν 1,ν 2 to the basis of vector fields f 1,f 2,f as follows hence the symplectic form σ is written as follows s = h ν +ν θ σ = dh ν +h ν θ ν θ +dθ ν θ +dν θ where we used that dν = ν 1 ν 2 = ν θ ν θ. Computing the symplectic product then one finds the value of 1R = h a +κ 391

392 where κ = f 2 c 1 12 f 1 c 2 12 (c 1 12) 2 (c 2 12) 2 + c2 1 c1 2 2 (17.27) By homogeneity, the function R is defined on the whole T M, and not only for λ H 1 (1/2). For every λ = (h,h 1,h 2 ) T xm 1R = h a +κ(h 2 1 +h 2 2) Remark The restriction of R to the 1-dimensional subspace λ D (that corresponds to λ = (h,,)), is a strictly positive quadratic form. Moreover it is equal to 1/1 when evaluated on the Reeb vector field. Hence the curvature R encodes both the contact form ω and its normalization. On the orthogonal complement (with respect to R) {h = } we have that R is treated as a quadratic form R = 3 2 a +κ(h 2 1 +h2 2 ). Remark (i). If a there always exists a frame such that a = 2χh 1 h 2 and in this frame we can express R as a quadratic form on the whole T M R = h 2 +(κ+3χ)h2 1 +(κ 3χ)h2 2. It is easily seen from this formulas that we can recover the two invariants χ,κ considering trace(1r h = ) = 2κ, discr(1r h = ) = 36χ. (ii). When a = the eigenvalues of R coincide and χ =. In this case κ represents the Riemannian curvature of the surface defined by the quotient of M with respect to the flow of the Reeb vector field. Indeed the flow e tf preserves the metric and it is easy to see that the identities e tf f i = f i, i = 1,2. implies [f,f 1 ] = [f,f 2 ] =. Hence c 2 1,c1 2 = and the expression of κ reduces to the Riemannian curvature of a surface whose orthonormal frame is f 1,f 2. Exercise Let f 1,f 2 bean orthonormal frame for M and denote by f 1, f 2 the frame obtained rotating f 1,f 2 by an angle θ = θ(q). Show that the structure constants ĉ k ijof rotated frame satisfies ĉ 1 12 = cosθ(c1 12 f 1(θ)) sinθ(c 2 12 f 2(θ)), ĉ 2 12 = sinθ(c 1 12 f 1 (θ))+cosθ(c 2 12 f 2 (θ)). Exercise Show that the expression (17.27) for κ does not depend on the choice of an orthonormal frame f 1,f 2 for the sub-riemannian structure. 392

393 17.4 Application: classification of 3D left-invariant structures In this paper we exploit these local invariants to provide a complete classification of left-invariant structures on 3D Lie groups. A sub-riemannian structure on a Lie group is said to be left-invariant if its distribution and the inner product are preserved by left translations on the group. A leftinvariant distribution is uniquely determined by a two dimensional subspace of the Lie algebra of the group. The distribution is bracket generating (and contact) if and only if the subspace is not a Lie subalgebra. A standard result on the classification of 3D Lie algebras (see, for instance, [Jac62]) reduce the analysis on the Lie algebras of the following Lie groups: H 3, the Heisenberg group, A + (R) R, where A + (R) is the group of orientation preserving affine maps on R, SOLV +,SOLV are Lie groups whose Lie algebra is solvable and has 2-dim square, SE(2) and SH(2) are the groups of orientation preserving motions of Euclidean and Hyperbolic plane respectively, SL(2) and SU(2) are the three dimensional simple Lie groups. Moreover itiseasytoshowthatineachofthesecasesbutoneallleft-invariantbracket generating distributions are equivalent by automorphisms of the Lie algebra. The only case where there exists two non-equivalent distributions is the Lie algebra sl(2). More precisely a 2-dimensional subspace of sl(2) is called elliptic (hyperbolic) if the restriction of the Killing form on this subspace is signdefinite (sign-indefinite). Accordingly, we use notation SL e (2) and SL h (2) to specify on which subspace the sub-riemannian structure on SL(2) is defined. For a left-invariant structure on a Lie group the invariants χ and κ are constant functions and allow us to distinguish non isometric structures. To complete the classification we can restrict ourselves to normalized sub-riemannian structures, i.e. structures that satisfy χ = κ =, or χ 2 +κ 2 = 1. (17.28) Indeed χ and κ are homogeneous with respect to dilations of the orthonormal frame, that means rescaling of distances on the manifold. Thus we can always rescale our structure in such a way that (17.28) is satisfied. To find missing discrete invariants, i.e. to distinguish between normalized structures with same χ and κ, we then show that it is always possible to select a canonical orthonormal frame for the sub- Riemannian structure such that all structure constants of the Lie algebra of this frame are invariant with respect to local isometries. Then the commutator relations of the Lie algebra generated by the canonical frame determine in a unique way the sub-riemannian structure. Collecting together these results we prove the following Theorem All left-invariant sub-riemannian structures on 3D Lie groups are classified up to local isometries and dilations as in Figure 17.1, where a structure is identified by the point (κ,χ) and two distinct points represent non locally isometric structures. Moreover (i) If χ = κ = then the structure is locally isometric to the Heisenberg group, 393

394 (ii) If χ 2 +κ 2 = 1 then there exist no more than three non isometric normalized sub-riemannian structures with these invariants; in particular there exists a unique normalized structure on a unimodular Lie group (for every choice of χ,κ). (iii) If χ or χ =,κ, then two structures are locally isometric if and only if their Lie algebras are isomorphic. Figure 17.1: Classification In other words every left-invariant sub-riemannian structure is locally isometric to a normalized one that appear in Figure 17.1, where we draw points on different circles since we consider equivalence classes of structures up to dilations. In this way it is easier to understand how many normalized structures there exist for some fixed value of the local invariants. Notice that unimodular Lie groups are those that appear in the middle circle (except for A + (R) R). From the proof of Theorem we get also a uniformization-like theorem for constant curvature manifolds in the sub-riemannian setting: Corollary Let M be a complete simply connected 3D contact sub-riemannian manifold. Assume that χ = and κ is costant on M. Then M is isometric to a left-invariant sub-riemannian structure. More precisely: (i) if κ = it is isometric to the Heisenberg group H 3, (ii) if κ = 1 it is isometric to the group SU(2) with Killing metric, (iii) if κ = 1 it is isometric to the group SL(2) with elliptic type Killing metric, where SL(2) is the universal covering of SL(2). 394

395 Another byproduct of the classification is the fact that there exist non isomorphic Lie groups with locally isometric sub-riemannian structures. Indeed, as a consequence of Theorem 17.18, we get that there exists a unique normalized left-invariant structure defined on A + (R) R having χ =,κ = 1. Thus A + (R) R is locally isometric to the group SL(2) with elliptic type Killing metric by Corollary This fact was already noted in [FG96] as a consequence of the classification. In this paper we explicitly compute the global sub-riemannian isometry between A + (R) R and the universal covering of SL(2) by means of Nagano principle. We then show that this map is well defined on the quotient, giving a global isometry between the group A + (R) S 1 and the group SL(2), endowed with the sub-riemannian structure defined by the restriction of the Killing form on the elliptic distribution. The group A + (R) R can be interpreted as the subgroup of the affine maps on the plane that acts as an orientation preserving affinity on one axis and as translations on the other one 1 a b A + (R) R := 1 c, a >,b,c R. 1 The standard left-invariant sub-riemannian structure on A + (R) R is defined by the orthonormal frame D = span{e 2,e 1 +e 3 }, where 1 1 e 1 =, e 2 =, e 3 = 1, is a basis of the Lie algebra of the group, satisfying [e 1,e 2 ] = e 1. The subgroupa + (R) is topologically homeomorphic to the half-plane {(a,b) R 2,a > } which can be descirbed in standard polar coordinates as {(ρ,θ) ρ >, π/2 < θ < π/2}. Theorem The diffeomorphism Ψ : A + (R) S 1 SL(2) defined by ( ) 1 cosϕ sinϕ Ψ(ρ,θ,ϕ) =, (17.29) ρcosθ ρsin(θ ϕ) ρcos(θ ϕ) where (ρ,θ) A + (R) and ϕ S 1, is a global sub-riemannian isometry. Using this global sub-riemannian isometry as a change of coordinates one can recover the geometry of the sub-riemannian structure on the group A + (R) S 1, starting from the analogous properties of SL(2) (e.g. explicit expression of the sub-riemannian distance, the cut locus). Remark (Comments). χ and κ are functions defined on the manifold; they reflect intrinsic geometric properties of the sub-riemannian structure and are preserved by the sub-riemannian isometries. In particular, χ and κ are constant functions for left-invariant structures on Lie groups (since left translations are isometries). 1 We can recover the action as an affine map identifying (x,y) R 2 with (x,y,1) T and a b x ax+b 1 c y = y +c

396 17.5 Proof of Theorem Now we use the results of the previous sections to prove Theorem In this section G denotes a 3D Lie group, with Lie algebra g, endowed with a left-invariant sub-riemannian structure defined by the orthonormal frame f 1,f 2, i.e. D = span{f 1,f 2 } g, span{f 1,f 2,[f 1,f 2 ]} = g. Recall that for a 3D left-invariant structure to be bracket generating is equivalent to be contact, moreover the Reeb field f is also a left-invariant vector field by construction. From the fact that, for left-invariant structures, local invariants are constant functions (see Remark??) we obtain a necessary condition for two structures to be locally isometric. Proposition Let G, H be 3D Lie groups with locally isometric sub-riemannian structures. Then χ G = χ H and κ G = κ H. Notice that this condition is not sufficient. It turns out that there can be up to three mutually non locally isometric normalized structures with the same invariants χ, κ. Remark It is easy to see that χ and κ are homogeneous of degree 2 with respect to dilations of the frame. Indeed assume that the sub-riemannian structure (M,D,g) is locally defined by the orthonormal frame f 1,f 2, i.e. D = span{f 1,f 2 }, g(f i,f j ) = δ ij. Consider now the dilated structure (M,D, g) defined by the orthonormal frame λf 1,λf 2 D = span{f 1,f 2 }, g(f i,f j ) = 1 λ 2δ ij, λ >. If χ,κ and χ, κ denote the invariants of the two structures respectively, we find χ = λ 2 χ, κ = λ 2 κ, λ >. A dilation of the orthonormal frame corresponds to a multiplication by a factor λ > of all distances in our manifold. Since we are interested in a classification by local isometries, we can always suppose (for a suitable dilation of the orthonormal frame) that the local invariants of our structure satisfy χ = κ =, or χ 2 +κ 2 = 1, and we study equivalence classes with respect to local isometries. Since χ is non negative by definition (see Remark??), we study separately the two cases χ > and χ = Case χ > Let G be a 3D Lie group with a left-invariant sub-riemannian structure such that χ. From Proposition 17.8 we can assume that D = span{f 1,f 2 } where f 1,f 2 is the canonical frame of the structure. From (17.6) we obtain the dual equations dν = ν 1 ν 2, dν 1 = c 1 2ν ν 2 +c 1 12ν 1 ν 2, (17.3) dν 2 = c 2 1 ν ν 1 +c 1 12 ν 1 ν

397 Using d 2 = we obtain structure equations { c 1 2 c 2 12 =, c 2 1 c1 12 =. (17.31) We know that the structure constants of the canonical frame are invariant by local isometries (up to change signs of c 1 12,c2 12, see Remark 17.9). Hence, every different choice of coefficients in (17.6) which satisfy also (17.31) will belong to a different class of non-isometric structures. Taking into account that χ > implies that c 2 1 andc1 2 cannot bebothnon positive (see (17.7)), we have the following cases: (i) c 1 12 = and c2 12 =. In this first case we get and formulas (17.7) imply χ = c2 1 +c1 2 2 [f 1,f ] = c 2 1 f 2, [f 2,f ] = c 1 2f 1, [f 2,f 1 ] = f, >, κ = c2 1 c In addition, we find the relations between the invariants We have the following subcases: χ+κ = c 2 1, χ κ = c1 2. (a) If c 1 2 = we get the Lie algebra se(2) of the group SE(2) of the Euclidean isometries of R 2, and it holds χ = κ. (b) If c 2 1 = we get the Lie algebra sh(2) of the group SH(2) of the Hyperbolic isometries of R 2, and it holds χ = κ. (c) If c 2 1 > and c1 2 < we get the Lie algebra su(2) and χ κ <. (d) If c 2 1 < and c1 2 > we get the Lie algebra sl(2) with χ+κ <. (e) If c 2 1 > and c1 2 > we get the Lie algebra sl(2) with χ+κ >,χ κ >. (ii) c 1 2 = and c1 12 =. In this case we have and necessarily c 2 1. Moreover we get [f 1,f ] = c 2 1f 2, [f 2,f ] =, (17.32) [f 2,f 1 ] = c 2 12f 2 +f, χ = c2 1 2 >, κ = (c2 12 )2 + c2 1 2, 397

398 from which it follows χ κ. The Lie algebra g = span{f 1,f 2,f 3 } defined by (17.32) satisfies dim[g,g] = 2, hence it can be interpreted as the operator A = ad f 1 which acts on the subspace span{f,f 2 }. Moreover, it can be easily computed that and we can find the useful relation (iii) c 2 1 = and c2 12 =. In this last case we get and c 1 2. Moreover we get from which it follows trace A = c 2 12, deta = c2 1 >, 2 trace2 A deta = 1 κ χ. (17.33) [f 1,f ] =, [f 2,f ] = c 1 2 f 1, (17.34) [f 2,f 1 ] = c 1 12 f 1 +f, χ = c1 2 2 >, κ = (c1 12) 2 c1 2 2, χ+κ. As before, the Lie algebra g = span{f 1,f 2,f 3 } defined by (17.34) has two-dimensional square and it can be interpreted as the operator A = ad f 2 which acts on the plane span{f,f 1 }. It can be easily seen that it holds and we have an analogous relation trace A = c 1 12, deta = c 1 2 <, 2 trace2 A deta = 1+ κ χ. (17.35) Remark Lie algebras of cases (ii) and (iii) are solvable algebras and we will denote respectively solv + and solv, where the sign depends on the determinant of the operator it represents. In particular, formulas (17.33) and (17.35) permits to recover the ratio between invariants (hence to determine a unique normalized structure) only from intrinsic properties of the operator. Notice that if c 2 12 = we recover the normalized structure (i)-(a) while if c1 12 = we get the case (i)-(b). Remark The algebra sl(2) is the only case where we can define two nonequivalent distributions which corresponds to the case that Killing form restricted on the distribution is positive definite (case (d)) or indefinite (case (e)). We will refer to the first one as the elliptic structure on sl(2), denoted sl e (2), and with hyperbolic structure in the other case, denoting sl h (2). 398

399 Case χ = A direct consequence of Proposition for left-invariant structures is the following Corollary Let G, H be Lie groups with left-invariant sub-riemannian structures and assume χ G = χ H =. Then G and H are locally isometric if and only if κ G = κ H. Thanks to this result it is very easy to complete our classification. Indeed it is sufficient to find all left-invariant structures such that χ = and to compare their second invariant κ. A straightforward calculation leads to the following list of the left-invariant structures on simply connected three dimensional Lie groups with χ = : - H 3 is the Heisenberg nilpotent group; then κ =. - SU(2) with the Killing inner product; then κ >. - SL(2) with the elliptic distribution and Killing inner product; then κ <. - A + (R) R; then κ <. Remark In particular, we have the following: (i) All left-invariant sub-riemannian structures on H 3 are locally isometric, (ii) There exists on A + (R) R a unique (modulo dilations) left-invariant sub-riemannian structure, which is locally isometric to SL e (2) with the Killing metric. Proof of Theorem is now completed and we can recollect our result as in Figure 17.1, where we associate to every normalized structure a point in the (κ,χ) plane: either χ = κ =, or (κ, χ) belong to the semicircle {(κ,χ) R 2,χ 2 +κ 2 = 1,χ > }. Notice that different points means that sub-riemannian structures are not locally isometric Proof of Theorem 17.2 Inthissection wewanttowriteexplicitly thesub-riemannianisometrybetweensl(2)anda + (R) S 1. Consider the Lie algebra sl(2) = {A M 2 (R), trace(a) = } = span{g 1,g 2,g 3 }, where g 1 = 1 ( ) 1, g = 1 2 ( 1 1 ), g 3 = 1 2 ( 1 1 The sub-riemannian structure on SL(2) defined by the Killing form on the elliptic distribution is given by the orthonormal frame ). sl = span{g 1,g 2 }, and g := g 3, (17.36) is the Reeb vector field. Notice that this frame is already canonical since equations (17.1) are satisfied. Indeed [g 1,g ] = g 2 = κg

400 Recall that the universal covering of SL(2), which we denote SL(2), is a simply connected Lie group with Lie algebra sl(2). Hence (17.36) define a left-invariant structure also on the universal covering. On the other hand we consider the following coordinates on the Lie group A + (R) R, that are well-adapted for our further calculations y x A + (R) R := 1 z, y <,x,z R. (17.37) 1 It is easy to see that, in these coordinates, the group law reads (x,y,z)(x,y,z ) = (x yx, yy,z +z ), and its Lie algebra a(r) R is generated by the vector fields e 1 = y x, e 2 = y y, e 3 = z, with the only nontrivial commutator relation [e 1,e 2 ] = e 1. The left-invariant structure on A + (R) R is defined by the orthonormal frame D a = span{f 1,f 2 }, f 1 := e 2 = y y, (17.38) f 2 := e 1 +e 3 = y x + z. With straightforward calculations we compute the Reeb vector field f = e 3 = z. This frame is not canonical since it does not satisfy equations (17.1). Hence we can apply Proposition to find the canonical frame, that will be no more left-invariant. Following the notation of Proposition we have Lemma The canonical orthonormal frame on A + (R) R has the form: f 1 = ysinz x ycosz y sinz z, f 2 = ycosz x ysinz y +cosz z. (17.39) Proof. It is equivalent to show that the rotation defined in the proof of Proposition is θ(x,y,z) = z. The dual basis to our frame {f 1,f 2,f } is given by ν 1 = 1 y dy, ν 2 = 1 y dx, ν = 1 y dx dz. Moreover we have [f 1,f ] = [f 2,f ] = and [f 2,f 1 ] = f 2 + f so that, in equation (17.13) we get c =,α 1 =,α 2 = 1. Hence dθ = ν +ν 2 = dz. 4

401 Now we have two canonical frames { f 1, f 2,f } and {g 1,g 2,g }, whose Lie algebras satisfy the same commutator relations: Let us consider the two control systems [ f 1,f ] = f 2, [g 1,g ] = g 2, [ f 2,f ] = f 1, [g 2,g ] = g 1, (17.4) [ f 2, f 1 ] = f, [g 2,g 1 ] =. q = u 1 f1 (q)+u 2 f2 (q)+u f (q), q A + (R) R, ẋ = u 1 g 1 (x)+u 2 g 2 (x)+u g (x), x SL(2). and denote with x u (t),q u (t), t [,T] the solutions of the equations relative to the same control u = (u 1,u 2,u ). Nagano Principle (see [?] and also [Nag66, Sus74, Sus83]) ensure that the map Ψ : A + (R) R SL(2), q u (T) x u (T). (17.41) that sends the final point of the first system to the final point of the second one, is well-defined and does not depend on the control u. Thus we can find the endpoint map of both systems relative to constant controls, i.e. considering maps F : R 3 A + (R) R, (t 1,t 2,t ) e t f e t 2 f 2 e t 1 f 1 (1 A ), (17.42) G : R 3 SL(2), (t 1,t 2,t ) e t g e t 2g 2 e t 1g 1 (1 SL ). (17.43) where we denote with 1 A and 1 SL identity element of A + (R) R and SL(2), respectively. The composition of these two maps makes the following diagram commutative A + (R) R Ψ SL(2) Ψ F 1 π R 3 G SL(2) (17.44) where π : SL(2) SL(2) is the canonical projection and we set Ψ := π Ψ. To simplify computation we introduce the rescaled maps F(t) := F(2t), G(t) := G(2t), t = (t 1,t 2,t ), and solving differential equations we get from (17.42) the following expressions F(t 1,t 2,t ) = ( ) 2e 2t tanht tanh 2, e 2t 1 1 tanh2 t 2 t 2 1+tanh 2, 2(arctan(tanht 2 ) t ). (17.45) t 2 The function F is globally invertible on its image and its inverse 41

402 F 1 (x,y,z) = ( 1 2 log x 2 +y 2, arctanh( y + x 2 +y 2 x ) ), arctan( y + x 2 +y 2 ) z. x 2 is defined for every y < and for every x (it is extended by continuity at x = ). On the other hand, the map (17.43) can be expressed by the product of exponential matrices as follows ( )( )( ) e t 1 cosht2 sinht G(t 1,t 2,t ) = 2 cost sint e t. (17.46) 2 sinht 2 cosht 2 sint cost To simplify the computations, we consider standard polar coordinates (ρ, θ) on the half-plane {(x,y),y < }, where π/2 < θ < π/2 is the angle that the point (x,y) defines with y-axis. In particular, it is easy to see that the expression that appear in F 1 is naturally related to these coordinates: ξ = ξ(θ) := tan θ y + x 2 +y 2 2 =, if x, x, if x =. Hence we can rewrite F 1 (ρ,θ,z) = ( 1 2 logρ, arctanhξ, arctanξ z ). 2 and compute the composition Ψ = G F 1 : A + (R) R SL(2). Once we substitute these expressions in (17.46), the third factor is a rotation matrix by an angle arctanξ z/2. Splitting this matrix in two consecutive rotations and using standard trigonometric identities cos(arctan ξ) = 1, sin(arctanξ) = ξ, cosh(arctanhξ) = 1, sinh(arctanhξ) = ξ, for ξ ( 1,1), 1+ξ 2 1+ξ 2 1 ξ 2 we obtain: 1 ξ 2 Ψ(ρ,θ,z) = 1 ( ) ρ 1/2 = 1 ξ 2 ρ 1/2 ξ 1 ξ 2 ξ 1 1 ξ 2 1+ξ 2 1 ξ 1 ξ 2 1+ξ 2 ξ 1+ξ 2 cos z 2 1 sin z 1+ξ 2 2 sin z 2 cos z. 2 42

403 Then using identities: cosθ = 1 ξ2 2ξ 1+ξ2, sinθ = 1+ξ2, we get 1+ξ 2 ( ) ρ 1/2 1 ξ 4 cos z Ψ(ρ,θ,z) = 2 ρ 1/2 2ξ 1 ξ 2 sin z 1 ξ 4 1 ξ 4 2 sin z 2 cos z 2 1+ξ = 2 ( ) ρ 1/2 1 cos z 1 ξ 2 ρ 1/2 2ξ 1 ξ ξ 2 1+ξ 2 sin z 2 sin z 2 cos z 2 = ( )( ) 1 cos z ρcosθ ρ sinθ cosθ sin z 2 sin z 2 cos z 2 1 = cos z sin z 2 2 ρcosθ ρsin(θ z 2 ) ρcos(θ z. 2 ) Lemma The set Ψ 1 (I) is a normal subgroup of A + (R) R. Proof. ItiseasytoshowthatΨ 1 (I) = {F(,,2kπ),k Z}. From(17.45)weseethatF(,,2kπ) = (, 1, 4kπ) and (17.37) implies that this is a normal subgroup. Indeed it is enoough to prove that Ψ 1 (I) is a subgroup of the centre, that follows from the identity 1 1 4kπ 1 y x y 1 z = 1 x 1 z +4kπ 1 y x 1 = 1 z 1 4kπ 1 1. Remark With a standard topological argument it is possible to prove that actually Ψ 1 (A) is a discrete countable set for every A SL(2), and Ψ is a representation of A + (R) R as universal covering of SL(2). By Lemma the map Ψ is well defined isomorphism between the quotient A + (R) R Ψ 1 (I) A + (R) S 1, and the group SL(2), defined by restriction of Ψ on z [ 2π,2π]. If we consider the new variable ϕ = z/2, defined on [ π,π], we can finally write the global isometry as ( ) 1 cosϕ sinϕ Ψ(ρ,θ,ϕ) =, (17.47) ρcosθ ρsin(θ ϕ) ρcos(θ ϕ) where (ρ,θ) A + (R) and ϕ S 1. 43

404 Remark In the coordinate set defined above we have that 1 A = (1,,) and ( ) 1 Ψ(1 A ) = Ψ(1,,) = = 1 1 SL. On the other hand Ψ is not a homomorphism since in A + (R) R it holds ( 2 2, π 2 4,π)( 2, π 4, π) = 1 A, while it can be easily checked from (17.47) that Ψ ( 2 2, π 4,π) Ψ ( ( ) 2 2 2, π 4, π) = 1 1/2 1/2 SL. Bibliographical Notes Left-invariant structures on Lie groups are the basic models of sub-riemannian manifolds and the study of such structures is the starting point to understand the general properties of sub- Riemannian geometry. In particular, thanks to the group structure, in some of these cases it is also possible to compute explicitly the sub-riemannian distance and geodesics (see in particular [?] for the Heisenberg group, [BR8] for semisimple Lie groups with Killing form and [MS1, Sac1] for a detailed study of the sub-riemannian structure on the group of motions of a plane). The problem of equivalence for several geometric structures close to left-invariant sub-riemannian structures on 3D Lie groups were studied in several publications (see [Car32, Car33, FG96,?,?]). In particular in [?] the author provide a first classification of symmetric sub-riemannian structures of dimension 3, while in [FG96] is presented a complete classification of sub-riemannian homogeneous spaces (i.e. sub-riemannian structures which admits a transitive Lie group of isometries acting smoothly on the manifold) by means of an adapted connection. The principal invariants used there, denoted by τ and K, coincide, up to a normalization factor, with our differential invariants χ and κ. 44

405 Chapter 18 Asymptotic expansion of the 3D contact exponential map In this chapter we study the small time asymptotics of the exponential map in the three-dimensional contact caseandseehowthestructureofthecutandtheconjugatelocusisencodedinthecurvature. Let us consider the sub-riemannian Hamiltonian of a 3D contact structure (cf. Section 17.3) H = h 1 f 1 +h 2 f 2 (h +b) θ +a h (18.1) written in the dual coordinates (h,h 1,h 2 ) of a local frame f,f 1,f 2, where ν is the normalized contact form, f is the Reeb vector field and f 1,f 2 is a local orthonormal frame for the sub- Riemannian structure. As usual the coordinate θ on the level set H 1 (1/2) is defined such a way that h 1 = cosθ and h 2 = sinθ. In this chapter it will be convenient to introduce the notation ρ := h for the function linear on fibers of T M associated with the opposite of the Reeb vector field. The Hamiltonian system (18.1) on the level set H 1 (1/2) is rewritten in the following form: q = cosθf 1 +sinθf 2 θ = ρ b ρ = a (18.2) The exponential map starting from the initial point q M is the map that to each time t > and every initial covector (θ,ρ ) Tq M H 1 (1/2) assigns the first component of the solution at time t of the system (18.2), denoted by E q (t,θ,ρ ), or simply E(t,θ,ρ ). Conjugate points are points where the differential of the exponential map is not surjective, i.e. solutions to the equation E E E =. (18.3) θ ρ t The variation of the exponential map along time is always nonzero and independent with respect to variations of the covectors in the set H 1 (1/2) (see also Section 8.9 and Proposition 8.37). This implies that (18.3) is equivalent to E E =. (18.4) θ ρ 45

406 18.1 Nilpotent case The nilpotent case, i.e. the Heisenberg group, corresponds to the case when the functions a and b vanish identically, i.e. the system q = cosθf 1 +sinθf 2 θ = ρ (18.5) ρ = Let us first recover, in this notation, the conjugate locus in the case of the Heisenberg group. Let us denote coordinates on the manifold R 3 as follows q = (x,y), x = (x 1,x 2 ) R 2,y R. (18.6) Notice moreover that in this case the Reeb vector field is proportional to y and its dual coordinate ρ is constant along trajectories. There are two possible cases: (i) ρ =. Then the solution is a straight line contained in the plane y = and is optimal for all time. (ii) ρ. In this case we claim that the equation (18.4) is equivalent to the following x θ x ρ =. (18.7) By thegauss Lemma(Proposition 8.37) thecovector p = (p x,ρ) at thefinal point annihilates the differential of the exponential map restricted to the level set, i.e. p, E = p x, x +ρ y = (18.8) θ θ θ p, E = p x, x +ρ y = (18.9) ρ ρ ρ and since ρ it follows that among the three vectors x 1 x 2 y θ θ θ x 1 x 2 y (18.1) ρ ρ ρ the third one is always a linear combination of the first two. Proposition The first conjugate time is t c (θ,ρ ) = 2π/ ρ. Proof. In the standard coordinates (x 1,x 2,y) the two vector fields f 1 and f 2 defining the orthonormal frame are f 1 = x1 x 2 2 y, f 2 = x2 + x 1 2 y Thus, the first two coordinates of the horizontal part of the Hamiltonian system satisfy { ẋ 1 = cosθ (18.11) ẋ 2 = sinθ 46

407 It is then easy to integrate the x-part of the exponential map being θ(t) = θ + ρt (recall that ρ ρ and, without loss of generality we can assume ρ > ) x(t;θ,ρ ) = t ( ) cos(θ +ρs) ds = sin(θ +ρs) θ +t θ ( ) cosρs ds (18.12) sinρs Due to the symmetry of the Heisenberg group, the determinant of the Jacobian map will not depend on θ. Hence to compute the determinant of the Jacobian it is enough to compute partial derivatives at θ = ( ) x cosρt 1 x ρ = 1 ρ 2 and denoting by τ := ρt one can compute θ = ( sinρt 1 cosρt sinρt ) + t ρ ( ) cosρt sinρt x x = 1 ( ) cosτ 1 τ cosτ sinτ θ ρ ρ 2 det, sinτ 1+τ sinτ +cosτ = 1 ρ2(τ sinτ +2cosτ 2). The fact that t c = 2π/ ρ follows from Exercise Exercise Prove that τ c = 2π is the first positive root of the equation τ sinτ+2cosτ 2 =. Moreover show that τ c is a simple root General case: second order asymptotic expansion Let us consider the Hamiltonian system for the general 3D contact case q = f θ := cosθf 1 +sinθf 2 θ = ρ b ρ = a (18.13) We are going to study the asymptotic expansion for our system for the initial parameter ρ ±. To this aim, it is convenient to introduce the change of variables r := 1/ρ and denote by ν := r() = 1/ρ its initial value. Notice that ρ is no more constant in the general case and ρ implies ν. The main result of this section says that the conjugate time for the perturbed system is a perturbation of the conjugate time of the nilpotent case, where the perturbation has no term of order 2. Proposition The conjugate time t c (θ,ν) is a smooth function of the parameter ν for ν >. Moreover for ν t c (θ,ν) = 2π ν +O( ν 3 ). 47

408 Proof. Let us introduce a new time variable τ such that dt dτ = r. If we now denote by F the derivative of a function F with respect to the new time τ, the system (18.13) is rewritten in the new coordinate system (q,θ,r) (where we recall r = 1/ρ), as follows q = rf θ θ = 1 rb ṙ = r 3 (18.14) a ṫ = r To compute the asymptotics of the conjugate time, it is also convenient to consider a system of coordinates, depending on a parameter ε, corresponding to the quasi-homogeneous blow up of the sub-riemannian structure at q and converging to the nilpotent approximation. In other words we consider the change of coordinates Φ ε such that f θ 1 ε fε θ where f ε θ = f +εf () +ε 2 f (1) +... Accordingly to this change of coordinates we have the equalities f i = 1 ε fε i, f = 1 ε 2fε, b = 1 ε bε, a = 1 ε 2aε where f ε is the Reeb vector field defined by the orthonormal frame fε 1,fε 2 (and analogously for a ε,b ε ). Let us now define, for fixed ε, the variable w such that r = εw. The system (18.14) is finally rewritten in the following form q = wfθ ε θ = 1 wb ε ẇ = εw 3 a ε (18.15) ṫ = εw Notice that the dynamical system is written in a coordinate system that depends on ε. Moreover the initial asymptotic for ρ, corresponding to r, is now reduced to fix an initial value w() = 1 and send ε. Consider some linearly adapted coordinates (x,y), with x R 2 and y R (cf. Definition 1.24). If we denote by q ε = (x ε,y ε ) the solution of the horizontal part of the ε-system (18.15), conjugate points are solutions of the equation q ε qε θ w =. w =1 As in Section 18.1, one can check that this condition is equivalent to x ε xε θ w =. w =1 Notice that the original parameters (t,θ,ρ ) parametrizing the trajectories in the exponential map correspond to a conjugate point if the corresponding parameters (τ,θ,ε) satisfy ϕ(τ,ε,θ ) := xε xε θ w = (18.16) w =1 48

409 For ε =, i.e. the nilpotent approximation, the first conjugate time is τ c = 2π, and moreover it is a simple root. Thus one gets ϕ(2π,,θ ) =, ϕ τ (2π,,θ ). (18.17) Hence the implicit function theorem guarantees that there exists a smooth function τ c (ε,θ ) such that τ c (,θ ) = 2π and ϕ(τ c (ε,θ ),ε,θ ) =. (18.18) Inotherwordsτ c (ε,θ )computestheconjugatetimeτ associated withparametersε,θ. Bysmoothness of τ c one immediately has the expansion for ε τ c (ε,θ ) = 2π +O(ε). Now the statement of the proposition is rewritten in terms of the function τ c as follows Differentiating the identity (18.18) with respect to ε one has τ c (ε,θ ) = 2π +O(ε 2 ). (18.19) ϕ τ c τ ε + ϕ ε =, hence, thanks to (18.17), the expansion (18.19) holds if and only if ϕ ε (2π,,θ ) =. Moreover differentiating the expression (18.16) with respect to ε one has ϕ ε (2π,,θ ) = 2 x ε xε 2 x ε xε ε θ w ε w θ w =1,ε=,τ=2π The second one vanishes since at ε = is the Heisenberg case, whose horizontal part at τ = 2π does not depend on θ. Hence we are reduced to prove that 2 x ε ε θ =. (18.2) ε=,τ=2π which is a consequence of the following lemma. Lemma The quantity xε ε does not depend on θ. ε=,τ=2π Proof of Lemma. To prove the lemma it will be enough to find the first order expansion in ε of the solution of the system (18.15). Recall that when ε = the system corresponds to the Heisenberg case, i.e. we have a ε ε= =,b ε ε= =. This gives the expansion of w (recall that w() = w = 1) w(t) = w()+ t εa ε (τ)w 3 (τ)dτ w = 1+O(ε 2 ) Analogously we have b ε = ε β,u +O(ε 2 ), where β,u = β 1 u 1 +β 2 u 2 and β denotes the (constant) coefficient of weight zero in the expansion of b with respect to ε. 49

410 Denoting u(θ) = (cosθ,sinθ), the equation for θ then is reduced to This equation can be integrated and one gets θ ε = ε= θ = 1 ε β,u(θ) +O(ε 2 ), θ() = θ. t β,u(θ(τ)) dτ = β,u (θ +t) u (θ ) (18.21) where u (θ) = ( sinθ,cosθ). Next we are going to use (18.21) to compute the derivative of x ε wrt ε. The equation for the horizontal part of (18.15) can be expanded in ε as follows ẋ ε = u(θ)+εf () u(θ) (x)+o(ε2 ) where the first term is Heisenberg, and f () u(θ) is the term of weight zero of f u, which is linear with respect to x 1 and x 2 because of the weight. 1 To compute the derivative of the solution with respect to parameter we use the following general fact Lemma Let φ(ε, t) denote the solution of the differential equation ẏ = F(ε, y) with fixed initial condition y() = y. Then the derivative φ ε satisfies the following linear ODE d φ F (ε,t) = dt ε y (ε,φ(ε,t)) φ ε (ε,t)+ F ε (ε,φ(ε,t)) We apply the above lemma when y = (x,θ) and F = (F x,f θ ) and we compute at ε =. In particular we need the solution of the original system at ε = φ(,t) = ( x(t), θ(t)), θ(t) = θ +t, x(t) = u (θ ) u (θ +t). Then by Lemma 18.5 we have d x dt ε = Fx x x ε + Fx θ θ ε + Fx ε Computing the derivatives at ε = gives F x F x x =, ε= θ = u ( θ(t)), ε= F x ε = f () ε= u( θ(t)) ( x(t)) and we obtain the equation for x ε d x dt ε = θ ε= ε u (θ +t)+f () u(θ +t) (u (θ ) u (θ +t)) ε= 1 Recall that this is the zero order part of the vector field f u along x, hence only x variables appear and have order 1. 41

411 If we set s = θ +t we can rewrite this equation d x ds ε = θ ε= ε u (s)+f () u(s) (u (θ ) u (s)) and integrating one has x ε = (2π,) θ +2π θ β,u (s) u (θ ) u (s)ds θ +2π + f () u(s) (u (θ ) u (s))ds θ In the last expression it is easy to see that all terms where θ appears are zero, while the others vanish since we compute integrals of periodic functions over a period (which does not dep on θ ). This finishes the proof of Lemma 18.4, hence the proof of the Proposition Some more details on the computations* Let (f 1,f 2 ) be an orthonormal frame of the distribution and f be the associated Reeb vector field. Let (x 1,x 2,y) be the associated linearly adapted coordinates on R 3. We introduce the Cartesian coordinates on Tq M (h 1,h 2,h ), where h i (λ) = λ,f i (q). The Hamiltonian is given by H = 1 ( 2 h 2 1 +h 2 ) 2. We consider equations on H 1 ({1/2}) so that there exists θ : T M R such that h 1 (λ) = cosθ(λ) and h 2 (λ) = sinθ(λ). Then we have the following equations for (q,θ,h ). q = cosθf 1 +sinθf 2 θ = h b ḣ = a Where a,b : T M R are given by (see Chapter 17) a = {H,h } = 2 c j,i h ih j, b = c 1 12 h 1 +c 2 12 h 2. i,j=1 We introduce the non homogeneous ε-blowup δ ε of the sub-riemannian structure at q that converges to the nilpotent approximation, i.e. the change of coordinates such that f ε 1 = εδ1 ε f 1, f ε 2 = εδ1 ε f 2, f ε = ε2 δ1 ε f. According to this change of coordinates, we introduce the other ε-counterparts : h ε i(λ) = λ,f ε i(q) α ε (λ) = {h ε 1 2 +h ε 2 2,h ε }(λ) = 2 ε c j,i (λ)hε i(λ)h ε j(λ), i,j=1 b ε = ε c 1 12 (λ)hε 1 (λ)+ε c 2 12 (λ)hε 2 (λ). 411

412 where ε c k ij are the structural constants associated with (fε 1,fε 2,fε ). The relations between the original variables and the ε-variables can then be computed h i (εδ 1/ε λ) = εδ 1/ε λ,f i = λ,εδ 1/ε f i = h ε i(λ), i {1,2}, h (εδ 1/ε λ) = εδ 1/ε λ,f = λ,εδ 1/ε f = 1 ε hε (λ), θ(εδ 1/ε λ) = θε (λ). Regarding the structural constants, we have weights w 1 = w 2 = 1 and w = 2, then so that [f ε i,fε j ] = 2 ε w i+w j [δ1 ε f i,δ1 ε f j] = k= 2 k= ε c k ij fε k ε c k ij = ε w i+w j w k c k ij δ ε. ε c k ij εw k δ1 ε f k, We can then compute ε 2dh dt (εδ 1/ε λ) = ε2 {h 1,h }(εδ1/ε λ)h 1(εδ1/ε λ)+ε2 {h 2,h }(εδ1/ε λ)h 2(εδ1/ε λ) 2 2 = ε 2 c k j, (δ ε(λ))h ε k (λ)hε j (λ) = j=1 k=1 2 j=1 k=1 2 ε c k j,(λ)h ε k (λ)hε j(λ) = dhε dt (λ) =: aε (λ). We can infer from this computation that a ε (λ) = ε 2 a(εδ1/ε λ) and in particular from line 2 that a ε = O(ε 2 ). Likewise b ε (λ) = εb(εδ1/ε λ) and bε = O(ε). We now introduce a new time variable τ such that dt dτ = 1 h and a new space variable w = 1 h, ε so that dθ ε dτ = 1+wbε and dw dτ = dhε 1 dτ h ε 2 = 1 h ε 3aε = w 3 a ε. We get the following system of equations (writing d dτ with dots): q = cosθf ε 1 +sinθfε 2 θ = 1+wb ε ẇ = w 3 a ε ṫ = εw and we link ε with h () by setting w() = 1. Since a ε = O(ε 2 ), w(t) = 1+O(ε 2 ). 412

413 18.3 General case: higher order asymptotic expansion Next we continue our analysis about the structure of the conjugate locus for a 3D contact structure by studying the higher order asymptotic. In this section we determine the coefficient of order 3 in the asymptotic expansion of the conjugate locus. Namely we have the following result, whose proof is postponed to Section Theorem In a system of local coordinates around q M one has the expansion Con q (θ,ν) = q ±πf ν 2 ±π(a f θ af θ ) ν 3 +O( ν 4 ), ν ±. (18.22) If we choose coordinates such that a = 2χh 1 h 2 one gets Con q (θ,ν) = q ±πf ν 2 ±2πχ(q )(cos 3 θf 2 sin 3 θf 1 ) ν 3 +O( ν 4 ), ν ±. (18.23) Moreover for the conjugate length we have the expansion l c (θ,ν) = 2π ν πκ ν 3 +O( ν 4 ), ν ±. (18.24) Analogous formulas can be obtained for the asymptotics of the cut locus at a point q where the invariant χ is non vanishing. Theorem Assume χ(q ). In a system of local coordinates around q M such that a = 2χu 1 u 2 one gets Cut q (θ,ν) = q ±πν 2 f (q )±2πχ(q )cosθf 1 (q )ν 3 +O(ν 4 ), ν ± Moreover the cut length satisfies l cut (θ,ν) = 2π ν π(κ+2χsin 2 θ) ν 3 +O(ν 4 ), ν ± (18.25) We can collect the information given by the asymptotics of the conjugate and the cut loci in Figure All geometrical information about the structure of these sets is encoded in a pair of quadratic forms defined on the fiber at the base point q, namely the curvature R and the sub-riemannian Hamiltonian H. Recall that the sub-riemannian Hamiltonian encodes the information about the distribution and about the metric defined on it (see Exercise 4.33). Let us consider the kernel of the sub-riemannian Hamiltonian kerh = {λ T qm : λ,v =, v D q } = D q. (18.26) The restriction of R to the 1-dimensional subspace D q for every q M, is a strictly positive quadratic form. Moreover it is equal to 1/1 when evaluated on the Reeb vector field. Hence the curvature R encodes both the contact form ω and its normalization. If we denote by D q the orthogonal complement of D q in the fiber with respect to R2, we have that R is a quadratic form on D q and, by using the Euclidean metric defined by H on D q, as a symmetric operator. 2 this is indeed isomorphic to the space of linear functionals defined on D q. 413

414 cut f conjugate πν 2 q 2πχ(q )ν 3 f 2 f 1 Figure 18.1: Asymptotic structure of cut and conjugate locus As we explained in the previous chapter, at each q where χ(q ) there always exists a frame such that {H,h } = 2χh 1 h 2 and in this frame we can express the restriction of R to D q (corresponding to the set {h = }) on this subspace as follows (see Section 17.3) 1R = (κ+3χ)h 2 1 +(κ 3χ)h2 2. From this formulae it is easy to recover the two invariants χ,κ considering trace(1r h = ) = 2κ, discr(1r h = ) = 36χ2, where the discriminant of an operator Q, defined on a two-dimensional space, is defined as the square of the difference of its eigenvalues, and can be compute by the formula discr(q) = trace 2 (Q) 4det(Q). The cubic term of the conjugate locus (for a fixed value of ν) parametrizes an astroid. The cuspidal directions of the astroid are given by the eigenvectors of R, and the cut locus intersect the conjugate locus exactly at the cuspidal points in the direction of the eigenvector of R corresponding to the larger eigenvalue. Finally the size of the cut locus increases for bigger values of χ, while κ is involved in the length of curves arriving at cut/conjugate locus Remark The expression of the cut locus given in Theorem 18.7 gives the truncation up to order 3 of the asymptotics of the cut locus of the exponential map. It is possible to show that this 414

Introduction to Riemannian and Sub-Riemannian geometry

Introduction to Riemannian and Sub-Riemannian geometry Introduction to Riemannian and Sub-Riemannian geometry From Hamiltonian viewpoint andrei agrachev davide barilari ugo boscain This version: June 12, 216 Preprint SISSA 9/212/M 2 Contents Introduction 9

More information

Introduction to Riemannian and Sub-Riemannian geometry

Introduction to Riemannian and Sub-Riemannian geometry Introduction to Riemannian and Sub-Riemannian geometry From Hamiltonian viewpoint andrei agrachev davide barilari ugo boscain This version: July 11, 215 Preprint SISSA 9/212/M 2 Contents Introduction 8

More information

Some topics in sub-riemannian geometry

Some topics in sub-riemannian geometry Some topics in sub-riemannian geometry Luca Rizzi CNRS, Institut Fourier Mathematical Colloquium Universität Bern - December 19 2016 Sub-Riemannian geometry Known under many names: Carnot-Carathéodory

More information

DIFFERENTIAL GEOMETRY, LECTURE 16-17, JULY 14-17

DIFFERENTIAL GEOMETRY, LECTURE 16-17, JULY 14-17 DIFFERENTIAL GEOMETRY, LECTURE 16-17, JULY 14-17 6. Geodesics A parametrized line γ : [a, b] R n in R n is straight (and the parametrization is uniform) if the vector γ (t) does not depend on t. Thus,

More information

THEODORE VORONOV DIFFERENTIAL GEOMETRY. Spring 2009

THEODORE VORONOV DIFFERENTIAL GEOMETRY. Spring 2009 [under construction] 8 Parallel transport 8.1 Equation of parallel transport Consider a vector bundle E B. We would like to compare vectors belonging to fibers over different points. Recall that this was

More information

CALCULUS ON MANIFOLDS. 1. Riemannian manifolds Recall that for any smooth manifold M, dim M = n, the union T M =

CALCULUS ON MANIFOLDS. 1. Riemannian manifolds Recall that for any smooth manifold M, dim M = n, the union T M = CALCULUS ON MANIFOLDS 1. Riemannian manifolds Recall that for any smooth manifold M, dim M = n, the union T M = a M T am, called the tangent bundle, is itself a smooth manifold, dim T M = 2n. Example 1.

More information

1. Geometry of the unit tangent bundle

1. Geometry of the unit tangent bundle 1 1. Geometry of the unit tangent bundle The main reference for this section is [8]. In the following, we consider (M, g) an n-dimensional smooth manifold endowed with a Riemannian metric g. 1.1. Notations

More information

Chapter 3. Riemannian Manifolds - I. The subject of this thesis is to extend the combinatorial curve reconstruction approach to curves

Chapter 3. Riemannian Manifolds - I. The subject of this thesis is to extend the combinatorial curve reconstruction approach to curves Chapter 3 Riemannian Manifolds - I The subject of this thesis is to extend the combinatorial curve reconstruction approach to curves embedded in Riemannian manifolds. A Riemannian manifold is an abstraction

More information

Geodesic Equivalence in sub-riemannian Geometry

Geodesic Equivalence in sub-riemannian Geometry 03/27/14 Equivalence in sub-riemannian Geometry Supervisor: Dr.Igor Zelenko Texas A&M University, Mathematics Some Preliminaries: Riemannian Metrics Let M be a n-dimensional surface in R N Some Preliminaries:

More information

GEOMETRIC QUANTIZATION

GEOMETRIC QUANTIZATION GEOMETRIC QUANTIZATION 1. The basic idea The setting of the Hamiltonian version of classical (Newtonian) mechanics is the phase space (position and momentum), which is a symplectic manifold. The typical

More information

Exercises in Geometry II University of Bonn, Summer semester 2015 Professor: Prof. Christian Blohmann Assistant: Saskia Voss Sheet 1

Exercises in Geometry II University of Bonn, Summer semester 2015 Professor: Prof. Christian Blohmann Assistant: Saskia Voss Sheet 1 Assistant: Saskia Voss Sheet 1 1. Conformal change of Riemannian metrics [3 points] Let (M, g) be a Riemannian manifold. A conformal change is a nonnegative function λ : M (0, ). Such a function defines

More information

A Tour of Subriemannian Geometries,Their Geodesies and Applications

A Tour of Subriemannian Geometries,Their Geodesies and Applications Mathematical Surveys and Monographs Volume 91 A Tour of Subriemannian Geometries,Their Geodesies and Applications Richard Montgomery American Mathematical Society Contents Introduction Acknowledgments

More information

Abstract. Jacobi curves are far going generalizations of the spaces of \Jacobi

Abstract. Jacobi curves are far going generalizations of the spaces of \Jacobi Principal Invariants of Jacobi Curves Andrei Agrachev 1 and Igor Zelenko 2 1 S.I.S.S.A., Via Beirut 2-4, 34013 Trieste, Italy and Steklov Mathematical Institute, ul. Gubkina 8, 117966 Moscow, Russia; email:

More information

BACKGROUND IN SYMPLECTIC GEOMETRY

BACKGROUND IN SYMPLECTIC GEOMETRY BACKGROUND IN SYMPLECTIC GEOMETRY NILAY KUMAR Today I want to introduce some of the symplectic structure underlying classical mechanics. The key idea is actually quite old and in its various formulations

More information

Invariant Nonholonomic Riemannian Structures on Three-Dimensional Lie Groups

Invariant Nonholonomic Riemannian Structures on Three-Dimensional Lie Groups Invariant Nonholonomic Riemannian Structures on Three-Dimensional Lie Groups Dennis I. Barrett Geometry, Graphs and Control (GGC) Research Group Department of Mathematics, Rhodes University Grahamstown,

More information

LECTURE 15: COMPLETENESS AND CONVEXITY

LECTURE 15: COMPLETENESS AND CONVEXITY LECTURE 15: COMPLETENESS AND CONVEXITY 1. The Hopf-Rinow Theorem Recall that a Riemannian manifold (M, g) is called geodesically complete if the maximal defining interval of any geodesic is R. On the other

More information

Smooth Dynamics 2. Problem Set Nr. 1. Instructor: Submitted by: Prof. Wilkinson Clark Butler. University of Chicago Winter 2013

Smooth Dynamics 2. Problem Set Nr. 1. Instructor: Submitted by: Prof. Wilkinson Clark Butler. University of Chicago Winter 2013 Smooth Dynamics 2 Problem Set Nr. 1 University of Chicago Winter 2013 Instructor: Submitted by: Prof. Wilkinson Clark Butler Problem 1 Let M be a Riemannian manifold with metric, and Levi-Civita connection.

More information

Modern Geometric Structures and Fields

Modern Geometric Structures and Fields Modern Geometric Structures and Fields S. P. Novikov I.A.TaJmanov Translated by Dmitry Chibisov Graduate Studies in Mathematics Volume 71 American Mathematical Society Providence, Rhode Island Preface

More information

LECTURE 16: CONJUGATE AND CUT POINTS

LECTURE 16: CONJUGATE AND CUT POINTS LECTURE 16: CONJUGATE AND CUT POINTS 1. Conjugate Points Let (M, g) be Riemannian and γ : [a, b] M a geodesic. Then by definition, exp p ((t a) γ(a)) = γ(t). We know that exp p is a diffeomorphism near

More information

Hamiltonian Systems of Negative Curvature are Hyperbolic

Hamiltonian Systems of Negative Curvature are Hyperbolic Hamiltonian Systems of Negative Curvature are Hyperbolic A. A. Agrachev N. N. Chtcherbakova Abstract The curvature and the reduced curvature are basic differential invariants of the pair: Hamiltonian system,

More information

612 CLASS LECTURE: HYPERBOLIC GEOMETRY

612 CLASS LECTURE: HYPERBOLIC GEOMETRY 612 CLASS LECTURE: HYPERBOLIC GEOMETRY JOSHUA P. BOWMAN 1. Conformal metrics As a vector space, C has a canonical norm, the same as the standard R 2 norm. Denote this dz one should think of dz as the identity

More information

An Introduction to Riemann-Finsler Geometry

An Introduction to Riemann-Finsler Geometry D. Bao S.-S. Chern Z. Shen An Introduction to Riemann-Finsler Geometry With 20 Illustrations Springer Contents Preface Acknowledgments vn xiii PART ONE Finsler Manifolds and Their Curvature CHAPTER 1 Finsler

More information

Math 6455 Nov 1, Differential Geometry I Fall 2006, Georgia Tech

Math 6455 Nov 1, Differential Geometry I Fall 2006, Georgia Tech Math 6455 Nov 1, 26 1 Differential Geometry I Fall 26, Georgia Tech Lecture Notes 14 Connections Suppose that we have a vector field X on a Riemannian manifold M. How can we measure how much X is changing

More information

As always, the story begins with Riemann surfaces or just (real) surfaces. (As we have already noted, these are nearly the same thing).

As always, the story begins with Riemann surfaces or just (real) surfaces. (As we have already noted, these are nearly the same thing). An Interlude on Curvature and Hermitian Yang Mills As always, the story begins with Riemann surfaces or just (real) surfaces. (As we have already noted, these are nearly the same thing). Suppose we wanted

More information

High-order angles in almost-riemannian geometry

High-order angles in almost-riemannian geometry High-order angles in almost-riemannian geometry Ugo Boscain SISSA-ISAS, Via Beirut 2-4, 34014 Trieste, Italy and LE2i, CNRS UMR5158, Université de Bourgogne, 9, avenue Alain Savary- BP 47870 21078 DIJON

More information

Lecture 11. Geodesics and completeness

Lecture 11. Geodesics and completeness Lecture 11. Geodesics and completeness In this lecture we will investigate further the metric properties of geodesics of the Levi-Civita connection, and use this to characterise completeness of a Riemannian

More information

CHAPTER 1 PRELIMINARIES

CHAPTER 1 PRELIMINARIES CHAPTER 1 PRELIMINARIES 1.1 Introduction The aim of this chapter is to give basic concepts, preliminary notions and some results which we shall use in the subsequent chapters of the thesis. 1.2 Differentiable

More information

Differential Geometry II Lecture 1: Introduction and Motivation

Differential Geometry II Lecture 1: Introduction and Motivation Differential Geometry II Lecture 1: Introduction and Motivation Robert Haslhofer 1 Content of this lecture This course is on Riemannian geometry and the geometry of submanifol. The goal of this first lecture

More information

Many of the exercises are taken from the books referred at the end of the document.

Many of the exercises are taken from the books referred at the end of the document. Exercises in Geometry I University of Bonn, Winter semester 2014/15 Prof. Christian Blohmann Assistant: Néstor León Delgado The collection of exercises here presented corresponds to the exercises for the

More information

On the Hausdorff volume in sub-riemannian geometry

On the Hausdorff volume in sub-riemannian geometry On the Hausdorff volume in sub-riemannian geometry Andrei Agrachev SISSA, Trieste, Italy and MIAN, Moscow, Russia - agrachev@sissa.it Davide Barilari SISSA, Trieste, Italy - barilari@sissa.it Ugo Boscain

More information

Choice of Riemannian Metrics for Rigid Body Kinematics

Choice of Riemannian Metrics for Rigid Body Kinematics Choice of Riemannian Metrics for Rigid Body Kinematics Miloš Žefran1, Vijay Kumar 1 and Christopher Croke 2 1 General Robotics and Active Sensory Perception (GRASP) Laboratory 2 Department of Mathematics

More information

UNIVERSITY OF DUBLIN

UNIVERSITY OF DUBLIN UNIVERSITY OF DUBLIN TRINITY COLLEGE JS & SS Mathematics SS Theoretical Physics SS TSM Mathematics Faculty of Engineering, Mathematics and Science school of mathematics Trinity Term 2015 Module MA3429

More information

From the Heisenberg group to Carnot groups

From the Heisenberg group to Carnot groups From the Heisenberg group to Carnot groups p. 1/47 From the Heisenberg group to Carnot groups Irina Markina, University of Bergen, Norway Summer school Analysis - with Applications to Mathematical Physics

More information

Solutions to the Hamilton-Jacobi equation as Lagrangian submanifolds

Solutions to the Hamilton-Jacobi equation as Lagrangian submanifolds Solutions to the Hamilton-Jacobi equation as Lagrangian submanifolds Matias Dahl January 2004 1 Introduction In this essay we shall study the following problem: Suppose is a smooth -manifold, is a function,

More information

B 1 = {B(x, r) x = (x 1, x 2 ) H, 0 < r < x 2 }. (a) Show that B = B 1 B 2 is a basis for a topology on X.

B 1 = {B(x, r) x = (x 1, x 2 ) H, 0 < r < x 2 }. (a) Show that B = B 1 B 2 is a basis for a topology on X. Math 6342/7350: Topology and Geometry Sample Preliminary Exam Questions 1. For each of the following topological spaces X i, determine whether X i and X i X i are homeomorphic. (a) X 1 = [0, 1] (b) X 2

More information

William P. Thurston. The Geometry and Topology of Three-Manifolds

William P. Thurston. The Geometry and Topology of Three-Manifolds William P. Thurston The Geometry and Topology of Three-Manifolds Electronic version 1.1 - March 00 http://www.msri.org/publications/books/gt3m/ This is an electronic edition of the 1980 notes distributed

More information

CHAPTER 3. Gauss map. In this chapter we will study the Gauss map of surfaces in R 3.

CHAPTER 3. Gauss map. In this chapter we will study the Gauss map of surfaces in R 3. CHAPTER 3 Gauss map In this chapter we will study the Gauss map of surfaces in R 3. 3.1. Surfaces in R 3 Let S R 3 be a submanifold of dimension 2. Let {U i, ϕ i } be a DS on S. For any p U i we have a

More information

IV Canonical relations for other geometrical constructions

IV Canonical relations for other geometrical constructions IV Canonical relations for other geometrical constructions IV.1. Introduction In this chapter we will further exploit the concept of canonical relation and show how it can be used to create center sets,

More information

Hopf-Rinow and Hadamard Theorems

Hopf-Rinow and Hadamard Theorems Summersemester 2015 University of Heidelberg Riemannian geometry seminar Hopf-Rinow and Hadamard Theorems by Sven Grützmacher supervised by: Dr. Gye-Seon Lee Prof. Dr. Anna Wienhard Contents Introduction..........................................

More information

SOME EXERCISES IN CHARACTERISTIC CLASSES

SOME EXERCISES IN CHARACTERISTIC CLASSES SOME EXERCISES IN CHARACTERISTIC CLASSES 1. GAUSSIAN CURVATURE AND GAUSS-BONNET THEOREM Let S R 3 be a smooth surface with Riemannian metric g induced from R 3. Its Levi-Civita connection can be defined

More information

1 Differentiable manifolds and smooth maps

1 Differentiable manifolds and smooth maps 1 Differentiable manifolds and smooth maps Last updated: April 14, 2011. 1.1 Examples and definitions Roughly, manifolds are sets where one can introduce coordinates. An n-dimensional manifold is a set

More information

MATH 4030 Differential Geometry Lecture Notes Part 4 last revised on December 4, Elementary tensor calculus

MATH 4030 Differential Geometry Lecture Notes Part 4 last revised on December 4, Elementary tensor calculus MATH 4030 Differential Geometry Lecture Notes Part 4 last revised on December 4, 205 Elementary tensor calculus We will study in this section some basic multilinear algebra and operations on tensors. Let

More information

274 Microlocal Geometry, Lecture 2. David Nadler Notes by Qiaochu Yuan

274 Microlocal Geometry, Lecture 2. David Nadler Notes by Qiaochu Yuan 274 Microlocal Geometry, Lecture 2 David Nadler Notes by Qiaochu Yuan Fall 2013 2 Whitney stratifications Yesterday we defined an n-step stratified space. Various exercises could have been but weren t

More information

DIFFERENTIAL GEOMETRY. LECTURE 12-13,

DIFFERENTIAL GEOMETRY. LECTURE 12-13, DIFFERENTIAL GEOMETRY. LECTURE 12-13, 3.07.08 5. Riemannian metrics. Examples. Connections 5.1. Length of a curve. Let γ : [a, b] R n be a parametried curve. Its length can be calculated as the limit of

More information

An introduction to Mathematical Theory of Control

An introduction to Mathematical Theory of Control An introduction to Mathematical Theory of Control Vasile Staicu University of Aveiro UNICA, May 2018 Vasile Staicu (University of Aveiro) An introduction to Mathematical Theory of Control UNICA, May 2018

More information

The Geometrization Theorem

The Geometrization Theorem The Geometrization Theorem Matthew D. Brown Wednesday, December 19, 2012 In this paper, we discuss the Geometrization Theorem, formerly Thurston s Geometrization Conjecture, which is essentially the statement

More information

Terse Notes on Riemannian Geometry

Terse Notes on Riemannian Geometry Terse Notes on Riemannian Geometry Tom Fletcher January 26, 2010 These notes cover the basics of Riemannian geometry, Lie groups, and symmetric spaces. This is just a listing of the basic definitions and

More information

Definition 5.1. A vector field v on a manifold M is map M T M such that for all x M, v(x) T x M.

Definition 5.1. A vector field v on a manifold M is map M T M such that for all x M, v(x) T x M. 5 Vector fields Last updated: March 12, 2012. 5.1 Definition and general properties We first need to define what a vector field is. Definition 5.1. A vector field v on a manifold M is map M T M such that

More information

PICARD S THEOREM STEFAN FRIEDL

PICARD S THEOREM STEFAN FRIEDL PICARD S THEOREM STEFAN FRIEDL Abstract. We give a summary for the proof of Picard s Theorem. The proof is for the most part an excerpt of [F]. 1. Introduction Definition. Let U C be an open subset. A

More information

Hyperbolic Geometry on Geometric Surfaces

Hyperbolic Geometry on Geometric Surfaces Mathematics Seminar, 15 September 2010 Outline Introduction Hyperbolic geometry Abstract surfaces The hemisphere model as a geometric surface The Poincaré disk model as a geometric surface Conclusion Introduction

More information

Riemannian geometry of surfaces

Riemannian geometry of surfaces Riemannian geometry of surfaces In this note, we will learn how to make sense of the concepts of differential geometry on a surface M, which is not necessarily situated in R 3. This intrinsic approach

More information

Some aspects of the Geodesic flow

Some aspects of the Geodesic flow Some aspects of the Geodesic flow Pablo C. Abstract This is a presentation for Yael s course on Symplectic Geometry. We discuss here the context in which the geodesic flow can be understood using techniques

More information

LECTURE 2. (TEXED): IN CLASS: PROBABLY LECTURE 3. MANIFOLDS 1. FALL TANGENT VECTORS.

LECTURE 2. (TEXED): IN CLASS: PROBABLY LECTURE 3. MANIFOLDS 1. FALL TANGENT VECTORS. LECTURE 2. (TEXED): IN CLASS: PROBABLY LECTURE 3. MANIFOLDS 1. FALL 2006. TANGENT VECTORS. Overview: Tangent vectors, spaces and bundles. First: to an embedded manifold of Euclidean space. Then to one

More information

Index. Bertrand mate, 89 bijection, 48 bitangent, 69 Bolyai, 339 Bonnet s Formula, 283 bounded, 48

Index. Bertrand mate, 89 bijection, 48 bitangent, 69 Bolyai, 339 Bonnet s Formula, 283 bounded, 48 Index acceleration, 14, 76, 355 centripetal, 27 tangential, 27 algebraic geometry, vii analytic, 44 angle at a corner, 21 on a regular surface, 170 angle excess, 337 angle of parallelism, 344 angular velocity,

More information

THEODORE VORONOV DIFFERENTIABLE MANIFOLDS. Fall Last updated: November 26, (Under construction.)

THEODORE VORONOV DIFFERENTIABLE MANIFOLDS. Fall Last updated: November 26, (Under construction.) 4 Vector fields Last updated: November 26, 2009. (Under construction.) 4.1 Tangent vectors as derivations After we have introduced topological notions, we can come back to analysis on manifolds. Let M

More information

Gromov s Proof of Mostow Rigidity

Gromov s Proof of Mostow Rigidity Gromov s Proof of Mostow Rigidity Mostow Rigidity is a theorem about invariants. An invariant assigns a label to whatever mathematical object I am studying that is well-defined up to the appropriate equivalence.

More information

Gauge Fixing and Constrained Dynamics in Numerical Relativity

Gauge Fixing and Constrained Dynamics in Numerical Relativity Gauge Fixing and Constrained Dynamics in Numerical Relativity Jon Allen The Dirac formalism for dealing with constraints in a canonical Hamiltonian formulation is reviewed. Gauge freedom is discussed and

More information

The Symmetric Space for SL n (R)

The Symmetric Space for SL n (R) The Symmetric Space for SL n (R) Rich Schwartz November 27, 2013 The purpose of these notes is to discuss the symmetric space X on which SL n (R) acts. Here, as usual, SL n (R) denotes the group of n n

More information

CALCULUS ON MANIFOLDS

CALCULUS ON MANIFOLDS CALCULUS ON MANIFOLDS 1. Manifolds Morally, manifolds are topological spaces which locally look like open balls of the Euclidean space R n. One can construct them by piecing together such balls ( cells

More information

DIFFERENTIAL GEOMETRY HW 12

DIFFERENTIAL GEOMETRY HW 12 DIFFERENTIAL GEOMETRY HW 1 CLAY SHONKWILER 3 Find the Lie algebra so(n) of the special orthogonal group SO(n), and the explicit formula for the Lie bracket there. Proof. Since SO(n) is a subgroup of GL(n),

More information

Math 550 / David Dumas / Fall Problems

Math 550 / David Dumas / Fall Problems Math 550 / David Dumas / Fall 2014 Problems Please note: This list was last updated on November 30, 2014. Problems marked with * are challenge problems. Some problems are adapted from the course texts;

More information

ON GEODESIC FLOWS MODELED BY EXPANSIVE FLOWS UP TO TIME-PRESERVING SEMI-CONJUGACY

ON GEODESIC FLOWS MODELED BY EXPANSIVE FLOWS UP TO TIME-PRESERVING SEMI-CONJUGACY ON GEODESIC FLOWS MODELED BY EXPANSIVE FLOWS UP TO TIME-PRESERVING SEMI-CONJUGACY KATRIN GELFERT AND RAFAEL O. RUGGIERO Abstract. Given a smooth compact surface without focal points and of higher genus,

More information

Theorem 2. Let n 0 3 be a given integer. is rigid in the sense of Guillemin, so are all the spaces ḠR n,n, with n n 0.

Theorem 2. Let n 0 3 be a given integer. is rigid in the sense of Guillemin, so are all the spaces ḠR n,n, with n n 0. This monograph is motivated by a fundamental rigidity problem in Riemannian geometry: determine whether the metric of a given Riemannian symmetric space of compact type can be characterized by means of

More information

arxiv: v1 [math.oc] 2 Jul 2014

arxiv: v1 [math.oc] 2 Jul 2014 Local properties of almost-riemannian structures in dimension 3 U. Boscain, G. Charlot, M. Gaye, and P. Mason arxiv:47.6v [math.oc] Jul 4 CNRS, CMAP École Polytechnique, Palaiseau, France and Team GECO,

More information

Diffraction by Edges. András Vasy (with Richard Melrose and Jared Wunsch)

Diffraction by Edges. András Vasy (with Richard Melrose and Jared Wunsch) Diffraction by Edges András Vasy (with Richard Melrose and Jared Wunsch) Cambridge, July 2006 Consider the wave equation Pu = 0, Pu = D 2 t u gu, on manifolds with corners M; here g 0 the Laplacian, D

More information

A LOWER BOUND ON THE SUBRIEMANNIAN DISTANCE FOR HÖLDER DISTRIBUTIONS

A LOWER BOUND ON THE SUBRIEMANNIAN DISTANCE FOR HÖLDER DISTRIBUTIONS A LOWER BOUND ON THE SUBRIEMANNIAN DISTANCE FOR HÖLDER DISTRIBUTIONS SLOBODAN N. SIMIĆ Abstract. Whereas subriemannian geometry usually deals with smooth horizontal distributions, partially hyperbolic

More information

Invariant Nonholonomic Riemannian Structures on Three-Dimensional Lie Groups

Invariant Nonholonomic Riemannian Structures on Three-Dimensional Lie Groups Invariant Nonholonomic Riemannian Structures on Three-Dimensional Lie Groups Dennis I. Barrett Geometry, Graphs and Control (GGC) Research Group Department of Mathematics (Pure and Applied) Rhodes University,

More information

Summary of Prof. Yau s lecture, Monday, April 2 [with additional references and remarks] (for people who missed the lecture)

Summary of Prof. Yau s lecture, Monday, April 2 [with additional references and remarks] (for people who missed the lecture) Building Geometric Structures: Summary of Prof. Yau s lecture, Monday, April 2 [with additional references and remarks] (for people who missed the lecture) A geometric structure on a manifold is a cover

More information

Comparison for infinitesimal automorphisms. of parabolic geometries

Comparison for infinitesimal automorphisms. of parabolic geometries Comparison techniques for infinitesimal automorphisms of parabolic geometries University of Vienna Faculty of Mathematics Srni, January 2012 This talk reports on joint work in progress with Karin Melnick

More information

Geometric inequalities for black holes

Geometric inequalities for black holes Geometric inequalities for black holes Sergio Dain FaMAF-Universidad Nacional de Córdoba, CONICET, Argentina. 3 August, 2012 Einstein equations (vacuum) The spacetime is a four dimensional manifold M with

More information

1 Euclidean geometry. 1.1 The metric on R n

1 Euclidean geometry. 1.1 The metric on R n 1 Euclidean geometry This chapter discusses the geometry of n-dimensional Euclidean space E n, together with its distance function. The distance gives rise to other notions such as angles and congruent

More information

1. Classifying Spaces. Classifying Spaces

1. Classifying Spaces. Classifying Spaces Classifying Spaces 1. Classifying Spaces. To make our lives much easier, all topological spaces from now on will be homeomorphic to CW complexes. Fact: All smooth manifolds are homeomorphic to CW complexes.

More information

4.7 The Levi-Civita connection and parallel transport

4.7 The Levi-Civita connection and parallel transport Classnotes: Geometry & Control of Dynamical Systems, M. Kawski. April 21, 2009 138 4.7 The Levi-Civita connection and parallel transport In the earlier investigation, characterizing the shortest curves

More information

INTRO TO SUBRIEMANNIAN GEOMETRY

INTRO TO SUBRIEMANNIAN GEOMETRY INTRO TO SUBRIEMANNIAN GEOMETRY 1. Introduction to subriemannian geometry A lot of this tal is inspired by the paper by Ines Kath and Oliver Ungermann on the arxiv, see [3] as well as [1]. Let M be a smooth

More information

LECTURE 10: THE ATIYAH-GUILLEMIN-STERNBERG CONVEXITY THEOREM

LECTURE 10: THE ATIYAH-GUILLEMIN-STERNBERG CONVEXITY THEOREM LECTURE 10: THE ATIYAH-GUILLEMIN-STERNBERG CONVEXITY THEOREM Contents 1. The Atiyah-Guillemin-Sternberg Convexity Theorem 1 2. Proof of the Atiyah-Guillemin-Sternberg Convexity theorem 3 3. Morse theory

More information

Lecture 8. Connections

Lecture 8. Connections Lecture 8. Connections This lecture introduces connections, which are the machinery required to allow differentiation of vector fields. 8.1 Differentiating vector fields. The idea of differentiating vector

More information

LECTURE: KOBORDISMENTHEORIE, WINTER TERM 2011/12; SUMMARY AND LITERATURE

LECTURE: KOBORDISMENTHEORIE, WINTER TERM 2011/12; SUMMARY AND LITERATURE LECTURE: KOBORDISMENTHEORIE, WINTER TERM 2011/12; SUMMARY AND LITERATURE JOHANNES EBERT 1.1. October 11th. 1. Recapitulation from differential topology Definition 1.1. Let M m, N n, be two smooth manifolds

More information

1 v >, which will be G-invariant by construction.

1 v >, which will be G-invariant by construction. 1. Riemannian symmetric spaces Definition 1.1. A (globally, Riemannian) symmetric space is a Riemannian manifold (X, g) such that for all x X, there exists an isometry s x Iso(X, g) such that s x (x) =

More information

Plane hyperbolic geometry

Plane hyperbolic geometry 2 Plane hyperbolic geometry In this chapter we will see that the unit disc D has a natural geometry, known as plane hyperbolic geometry or plane Lobachevski geometry. It is the local model for the hyperbolic

More information

A brief introduction to Semi-Riemannian geometry and general relativity. Hans Ringström

A brief introduction to Semi-Riemannian geometry and general relativity. Hans Ringström A brief introduction to Semi-Riemannian geometry and general relativity Hans Ringström May 5, 2015 2 Contents 1 Scalar product spaces 1 1.1 Scalar products...................................... 1 1.2 Orthonormal

More information

A FRANKS LEMMA FOR CONVEX PLANAR BILLIARDS

A FRANKS LEMMA FOR CONVEX PLANAR BILLIARDS A FRANKS LEMMA FOR CONVEX PLANAR BILLIARDS DANIEL VISSCHER Abstract Let γ be an orbit of the billiard flow on a convex planar billiard table; then the perpendicular part of the derivative of the billiard

More information

Differential Geometry Exercises

Differential Geometry Exercises Differential Geometry Exercises Isaac Chavel Spring 2006 Jordan curve theorem We think of a regular C 2 simply closed path in the plane as a C 2 imbedding of the circle ω : S 1 R 2. Theorem. Given the

More information

Mostow Rigidity. W. Dison June 17, (a) semi-simple Lie groups with trivial centre and no compact factors and

Mostow Rigidity. W. Dison June 17, (a) semi-simple Lie groups with trivial centre and no compact factors and Mostow Rigidity W. Dison June 17, 2005 0 Introduction Lie Groups and Symmetric Spaces We will be concerned with (a) semi-simple Lie groups with trivial centre and no compact factors and (b) simply connected,

More information

Introduction to Algebraic and Geometric Topology Week 14

Introduction to Algebraic and Geometric Topology Week 14 Introduction to Algebraic and Geometric Topology Week 14 Domingo Toledo University of Utah Fall 2016 Computations in coordinates I Recall smooth surface S = {f (x, y, z) =0} R 3, I rf 6= 0 on S, I Chart

More information

MILNOR SEMINAR: DIFFERENTIAL FORMS AND CHERN CLASSES

MILNOR SEMINAR: DIFFERENTIAL FORMS AND CHERN CLASSES MILNOR SEMINAR: DIFFERENTIAL FORMS AND CHERN CLASSES NILAY KUMAR In these lectures I want to introduce the Chern-Weil approach to characteristic classes on manifolds, and in particular, the Chern classes.

More information

A crash course the geometry of hyperbolic surfaces

A crash course the geometry of hyperbolic surfaces Lecture 7 A crash course the geometry of hyperbolic surfaces 7.1 The hyperbolic plane Hyperbolic geometry originally developed in the early 19 th century to prove that the parallel postulate in Euclidean

More information

HADAMARD FOLIATIONS OF H n. I

HADAMARD FOLIATIONS OF H n. I HADAMARD FOLIATIONS OF H n. I MACIEJ CZARNECKI Abstract. We introduce the notion of an Hadamard foliation as a foliation of Hadamard manifold which all leaves are Hadamard. We prove that a foliation of

More information

AREA-STATIONARY SURFACES INSIDE THE SUB-RIEMANNIAN THREE-SPHERE

AREA-STATIONARY SURFACES INSIDE THE SUB-RIEMANNIAN THREE-SPHERE AREA-STATIONARY SURFACES INSIDE THE SUB-RIEMANNIAN THREE-SPHERE ANA HURTADO AND CÉSAR ROSALES ABSTRACT. We consider the sub-riemannian metric g h on S 3 given by the restriction of the Riemannian metric

More information

SYMPLECTIC GEOMETRY: LECTURE 5

SYMPLECTIC GEOMETRY: LECTURE 5 SYMPLECTIC GEOMETRY: LECTURE 5 LIAT KESSLER Let (M, ω) be a connected compact symplectic manifold, T a torus, T M M a Hamiltonian action of T on M, and Φ: M t the assoaciated moment map. Theorem 0.1 (The

More information

Poisson geometry of b-manifolds. Eva Miranda

Poisson geometry of b-manifolds. Eva Miranda Poisson geometry of b-manifolds Eva Miranda UPC-Barcelona Rio de Janeiro, July 26, 2010 Eva Miranda (UPC) Poisson 2010 July 26, 2010 1 / 45 Outline 1 Motivation 2 Classification of b-poisson manifolds

More information

ARITHMETICITY OF TOTALLY GEODESIC LIE FOLIATIONS WITH LOCALLY SYMMETRIC LEAVES

ARITHMETICITY OF TOTALLY GEODESIC LIE FOLIATIONS WITH LOCALLY SYMMETRIC LEAVES ASIAN J. MATH. c 2008 International Press Vol. 12, No. 3, pp. 289 298, September 2008 002 ARITHMETICITY OF TOTALLY GEODESIC LIE FOLIATIONS WITH LOCALLY SYMMETRIC LEAVES RAUL QUIROGA-BARRANCO Abstract.

More information

B.7 Lie Groups and Differential Equations

B.7 Lie Groups and Differential Equations 96 B.7. LIE GROUPS AND DIFFERENTIAL EQUATIONS B.7 Lie Groups and Differential Equations Peter J. Olver in Minneapolis, MN (U.S.A.) mailto:olver@ima.umn.edu The applications of Lie groups to solve differential

More information

Math 302 Outcome Statements Winter 2013

Math 302 Outcome Statements Winter 2013 Math 302 Outcome Statements Winter 2013 1 Rectangular Space Coordinates; Vectors in the Three-Dimensional Space (a) Cartesian coordinates of a point (b) sphere (c) symmetry about a point, a line, and a

More information

LECTURE 5: COMPLEX AND KÄHLER MANIFOLDS

LECTURE 5: COMPLEX AND KÄHLER MANIFOLDS LECTURE 5: COMPLEX AND KÄHLER MANIFOLDS Contents 1. Almost complex manifolds 1. Complex manifolds 5 3. Kähler manifolds 9 4. Dolbeault cohomology 11 1. Almost complex manifolds Almost complex structures.

More information

Bredon, Introduction to compact transformation groups, Academic Press

Bredon, Introduction to compact transformation groups, Academic Press 1 Introduction Outline Section 3: Topology of 2-orbifolds: Compact group actions Compact group actions Orbit spaces. Tubes and slices. Path-lifting, covering homotopy Locally smooth actions Smooth actions

More information

Tangent spaces, normals and extrema

Tangent spaces, normals and extrema Chapter 3 Tangent spaces, normals and extrema If S is a surface in 3-space, with a point a S where S looks smooth, i.e., without any fold or cusp or self-crossing, we can intuitively define the tangent

More information

Differential Geometry, Lie Groups, and Symmetric Spaces

Differential Geometry, Lie Groups, and Symmetric Spaces Differential Geometry, Lie Groups, and Symmetric Spaces Sigurdur Helgason Graduate Studies in Mathematics Volume 34 nsffvjl American Mathematical Society l Providence, Rhode Island PREFACE PREFACE TO THE

More information

Heat kernel asymptotics at the cut locus for Riemannian and sub-riemannian manifolds

Heat kernel asymptotics at the cut locus for Riemannian and sub-riemannian manifolds Heat kernel asymptotics at the cut locus for Riemannian and sub-riemannian manifolds Davide Barilari IMJ, Université Paris Diderot - Paris 7 International Youth Conference Geometry and Control, Moscow,

More information

VOLUME GROWTH AND HOLONOMY IN NONNEGATIVE CURVATURE

VOLUME GROWTH AND HOLONOMY IN NONNEGATIVE CURVATURE VOLUME GROWTH AND HOLONOMY IN NONNEGATIVE CURVATURE KRISTOPHER TAPP Abstract. The volume growth of an open manifold of nonnegative sectional curvature is proven to be bounded above by the difference between

More information

Junior Seminar: Hyperbolic Geometry Lecture Notes

Junior Seminar: Hyperbolic Geometry Lecture Notes Junior Seminar: Hyperbolic Geometry Lecture Notes Tim Campion January 20, 2010 1 Motivation Our first construction is very similar in spirit to an analogous one in Euclidean space. The group of isometries

More information