Introduction to Riemannian and Sub-Riemannian geometry


From Hamiltonian viewpoint

Andrei Agrachev, Davide Barilari, Ugo Boscain

This version: July 11, 2015

Preprint SISSA 09/2012/M


3 Contents Introduction 8 1 Geometry of surfaces in R Geodesics and optimality Existence and minimizing properties of geodesics Absolutely continuous curves Parallel transport Gauss-Bonnet Theorems Gauss-Bonnet theorem: local version Gauss-Bonnet theorem: global version Consequences of the Gauss-Bonnet Theorems The Gauss map Surfaces in R 3 with the Minkowski inner product Model spaces of constant curvature Zero curvature: the Euclidean plane Positive curvature: spheres Negative curvature: the hyperbolic plane Vector fields and vector bundles Differential equations on smooth manifolds Tangent vectors and vector fields Flow of a vector field Vector fields as operators on functions Nonautonomous vector fields Differential of a smooth map Lie brackets Cotangent space Vector bundles Submersions and level sets of smooth maps Sub-Riemannian structures Basic definitions The minimal control and the length of an admissible curve Equivalence of sub-riemannian structures Examples

4 3.1.4 Every sub-riemannian structure is equivalent to a free one Sub-Riemannian distance and Chow-Rashevskii Theorem Proof of Chow-Raschevskii Theorem Existence of length-minimizers Pontryagin extremals The energy functional Proof of Theorem A Measurability of the minimal control A.1 Main lemma A.2 Proof of Lemma B Lipschitz vs Absolutely continuous admissible curves Characterization and local minimality of Pontryagin extremals Geometric characterization of Pontryagin extremals Lifting a vector field from M to T M The Poisson bracket Hamiltonian vector fields The symplectic structure The symplectic form vs the Poisson bracket Characterization of normal and abnormal extremals Normal extremals Abnormal extremals Example: codimension one distribution and contact distributions Examples D Riemannian Geometry Isoperimetric problem Heisenberg group Lie derivative Symplectic geometry Local minimality of normal trajectories The Poincaré-Cartan one form Normal trajectories are geodesics Integrable Systems Completely integrable systems Arnold-Liouville theorem Integrable geodesic flows Geodesic flow Geodesic flow on ellipsoids Chronological calculus Duality Operator ODE and Taylor expansion Variations Formulae

5 7 End-point and Exponential map First order conditions Lagrange points and Lagrange submanifolds Sub-Riemannian case Exponential map Conjugate points and minimality properties of geodesics Application: Conjugate locus on perturbed S Global minimizers Nonholonomic tangent space Jet spaces Admissible variations Nilpotent approximation and privileged coordinates Geometric meaning Algebraic meaning The volume in sub-riemannian geometry The Popp volume Popp volume for equiregular sub-riemannian manifolds A formula for Popp volume Popp volume and isometries Regularity of the sub-riemannian distance General properties of the distance function Regularity of the squared distance Locally Lipschitz functions and maps Locally Lipschitz map and Lipschitz submanifolds A non-smooth version of Sard Lemma Geodesic completeness and Hopf-Rinow theorem Abnormal extremals and second variation Second variation Abnormal extremals and regularity of the distance Goh and generalized Legendre conditions Proof of Goh condition - (i) of Theorem Proof of generalized Legendre condition - (ii) of Theorem More on Goh and generalized Legendre conditions Rank 2 distributions and nice abnormal extremals Optimality of nice abnormal in rank 2 structures Conjugate points along abnormals Abnormals in dimension Higher dimension Equivalence of local minimality

6 12 Curves in the Lagrange Grassmannian The geometry of the Lagrange Grassmannian The Lagrange Grassmannian Regular curves in Lagrange Grassmannian Curvature of a regular curve Reduction of non-regular curves in Lagrange Grassmannian Ample curves From ample to regular Conjugate points in L(Σ) Comparison theorems for regular curves Jacobi curves From Jacobi fields to Jacobi curves Jacobi curves Conjugate points and optimality Reduction of the Jacobi curves by homogeneity Riemannian curvature Ehresmann connection Curvature of an Ehresmann connection Linear Ehresmann connections Covariant derivative and torsion for linear connections Riemannian connection Relation with Hamiltonian curvature Locally flat spaces Example: curvature of the 2D Riemannian case Curvature in 3D contact sub-riemannian geometry D contact sub-riemannian manifolds Curvature of a 3D contact structure Asymptotic expansion of the 3D contact exponential map Nilpotent case General case: second order asymptotic expansion General case: higher order asymptotic expansion Proof of Theorem 16.6: asymptotics of the exponential map Asymptotics of the conjugate locus Asymptotics of the conjugate length Stability of the conjugate locus The sub-riemannian heat equation The heat equation The heat equation in the Riemannian context The heat equation in the sub-riemannian context Few properties of the sub-riemannian Laplacian: the Hörmander theorem and the existence of the heat kernel

7 17.2 The heat-kernel on the Heisenberg group The Heisenberg group as a group of matrices The heat equation on the Heisenberg group


Introduction

This book concerns a fresh development of the eternal idea of the distance as the length of a shortest path. In Euclidean geometry, shortest paths are segments of straight lines that satisfy all classical axioms. In the Riemannian world, Euclidean geometry is just one of a huge number of possibilities. However, each of these possibilities is well approximated by Euclidean geometry at very small scale. In other words, Euclidean geometry is treated as the geometry of initial velocities of the paths starting from a fixed point of the Riemannian space, rather than the geometry of the space itself.

The Riemannian construction was based on the previous study of smooth surfaces in the Euclidean space undertaken by Gauss. The distance between two points on the surface is the length of a shortest path on the surface connecting the points. Initial velocities of smooth curves starting from a fixed point on the surface form a tangent plane to the surface, that is, a Euclidean plane. Tangent planes at two different points are isometric, but neighborhoods of the points on the surface are not locally isometric in general; certainly not if the Gaussian curvature of the surface is different at the two points.

Riemann generalized Gauss' construction to higher dimensions and realized that it can be done in an intrinsic way: you do not need an ambient Euclidean space to measure the length of curves. Indeed, to measure the length of a curve it is sufficient to know the Euclidean length of its velocities. A Riemannian space is a smooth manifold whose tangent spaces are endowed with Euclidean structures; each tangent space is equipped with its own Euclidean structure that smoothly depends on the point where the tangent space is attached.

For an inhabitant sitting at a point of the Riemannian space, tangent vectors give directions where to move or, more generally, to send and receive information.
He measures lengths of vectors, and angles between vectors attached at the same point, according to the Euclidean rules, and this is essentially all that he can do. The point is that our inhabitant can, in principle, completely recover the geometry of the space by performing these simple measurements along different curves.

In the sub-Riemannian space we cannot move, receive and send information in all directions. There are restrictions (imposed by God, the moral imperative, the government, or simply a physical law). A sub-Riemannian space is a smooth manifold with a fixed admissible subspace in any tangent space, where admissible subspaces are equipped with Euclidean structures. Admissible paths are those curves whose velocities are admissible. The distance between two points is the infimum of the length of admissible paths connecting the points. It is assumed that any pair of points in the same connected component of the manifold can be connected by at least one admissible path.

The last assumption might look strange at first glance, but it is not. The admissible subspace depends on the point where it is attached, and our assumption is satisfied for a more or less general smooth dependence on the point; better to say that it is not satisfied only for very special families of admissible subspaces.

Let us describe a simple model. Let our manifold be $\mathbb{R}^3$ with coordinates $x, y, z$. We consider

the differential 1-form $\omega = -dz + \frac{1}{2}(x\,dy - y\,dx)$. Then $d\omega = dx \wedge dy$ is the pullback to $\mathbb{R}^3$ of the area form on the $xy$-plane. In this model the subspace of admissible velocities at the point $(x,y,z)$ is assumed to be the kernel of the form $\omega$. In other words, a curve $t \mapsto (x(t),y(t),z(t))$ is an admissible path if and only if

\[ \dot z(t) = \tfrac{1}{2}\big(x(t)\dot y(t) - y(t)\dot x(t)\big). \]

The length of an admissible tangent vector $(\dot x,\dot y,\dot z)$ is defined to be $(\dot x^2 + \dot y^2)^{1/2}$, that is, the length of the projection of the vector to the $xy$-plane. We see that any smooth planar curve $(x(t),y(t))$ has a unique admissible lift $(x(t),y(t),z(t))$ in $\mathbb{R}^3$ with prescribed initial value $z(0)$; for $z(0) = 0$ it is given by

\[ z(t) = \frac{1}{2}\int_0^t \big(x(s)\dot y(s) - \dot x(s)y(s)\big)\,ds. \]

If $x(0) = y(0) = 0$, then $z(t)$ is the signed area of the domain bounded by the curve and the segment connecting $(0,0)$ with $(x(t),y(t))$. By construction, the sub-Riemannian length of the admissible curve in $\mathbb{R}^3$ is equal to the Euclidean length of its projection to the plane.

We see that sub-Riemannian shortest paths are lifts to $\mathbb{R}^3$ of the solutions to the classical Dido isoperimetric problem: find a shortest planar curve among those connecting $(0,0)$ with $(x_1,y_1)$ and such that the signed area of the domain bounded by the curve and the segment joining $(0,0)$ and $(x_1,y_1)$ is equal to $z_1$ (see Figure 1).

[Figure 1: The Dido problem]

Solutions of the Dido problem are arcs of circles, and their lifts to $\mathbb{R}^3$ are spirals where $z(t)$ is the area of the piece of disc cut by the chord connecting $(0,0)$ with $(x(t),y(t))$. A piece of such a spiral is a shortest admissible path between its endpoints, while the planar projection of this piece is an arc of the circle. The spiral ceases to be a shortest path when its planar projection starts to run the circle for the second time, i.e. when the spiral starts its second turn. Sub-Riemannian balls centered at the origin for this model look like apples with singularities at the poles (see Figure 3).
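The admissible lift and its area interpretation can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `lift_z`, the midpoint quadrature and the test circle (radius `R`, passing through the origin, traversed counterclockwise) are all illustrative assumptions. It compares the lifted coordinate $z$ with the area $\frac{1}{2}R^2(\theta - \sin\theta)$ of the circular segment cut off by the chord.

```python
import math

def lift_z(x, y, dx, dy, t_end, n=2000):
    """Approximate z(t_end) = 1/2 * int_0^t_end (x*dy - dx*y) ds by the midpoint rule."""
    h = t_end / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        total += x(s) * dy(s) - dx(s) * y(s)
    return 0.5 * h * total

# Hypothetical test curve: circle of radius R through the origin, centered at (0, R).
R, theta = 1.5, 2.0
x = lambda t: R * math.sin(t)
y = lambda t: R * (1.0 - math.cos(t))
dx = lambda t: R * math.cos(t)
dy = lambda t: R * math.sin(t)

z_lift = lift_z(x, y, dx, dy, theta)
# Area of the circular segment cut off by the chord from (0,0) to (x(theta), y(theta)).
segment_area = 0.5 * R**2 * (theta - math.sin(theta))
# Euclidean length of the planar arc, equal to the sub-Riemannian length of the lift.
planar_length = R * theta
```

Under these assumptions `z_lift` and `segment_area` agree up to quadrature error, which is the statement that the lift of a circular arc records the area of the corresponding piece of disc.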
Singularities are points on the sphere connected with the center by more than one shortest path. The dilation $(x,y,z) \mapsto (rx, ry, r^2 z)$ transforms the ball of radius 1 into the ball of radius $r$. In particular, arbitrarily small balls have singularities. This is always the case when admissible subspaces are proper subspaces.

Another important symmetry connects balls with different centers. Indeed, the product operation

\[ (x,y,z)\cdot(x',y',z') := \Big(x+x',\; y+y',\; z+z'+\tfrac{1}{2}(xy'-x'y)\Big) \]

turns $\mathbb{R}^3$ into a group, the Heisenberg group. The origin in $\mathbb{R}^3$ is the unit element of this group.

[Figure 2: Solutions to the Dido problem]

[Figure 3: The Heisenberg sub-Riemannian sphere]

It is easy to see that left translations of the group transform admissible curves into admissible ones and preserve the sub-Riemannian length. Hence left translations transform balls into balls of the same radius. A detailed description of this example and of other models of sub-Riemannian spaces is given in Section 8.5 and Chapter ??.

Actually, even this simplest model tells us something about life in a sub-Riemannian space. Here we deal with planar curves but, in fact, operate in the three-dimensional space. Sub-Riemannian spaces always have a kind of hidden extra dimension: a good and not yet exploited source for mystic speculations, but also for theoretical physicists who are always searching for new crazy formalizations. In mechanics, this is a natural geometry for systems with nonholonomic constraints like skates, wheels, rolling balls, bearings etc. This kind of geometry could also serve to model social behavior that allows one to increase the level of freedom without violation of a restrictive legal system.

Anyway, in this book we perform a purely mathematical study of sub-Riemannian spaces to provide an appropriate formalization ready for all eventual applications. Riemannian spaces appear as a very special case. Of course, we are not the first to study the sub-Riemannian stuff. There is a broad literature, even if it is hard to find an expert who could claim that sub-Riemannian geometry is his main field of expertise. Important motivations come from CR geometry, hyperbolic

geometry, analysis of hypoelliptic operators, and some other domains. Our first motivation was control theory: length minimizing is a nice class of optimal control problems.

Indeed, one can find a control theory spirit in our treatment of the subject. First of all, we include admissible paths in admissible flows, that is, flows generated by vector fields whose values at all points belong to admissible subspaces. The passage from admissible subspaces attached at different points of the manifold to a globally defined space of admissible vector fields makes the structure more flexible and well adapted to algebraic manipulations. We pick generators $f_1,\dots,f_k$ of the space of admissible fields, and this allows us to describe all admissible paths as solutions to time-varying ordinary differential equations of the form

\[ \dot q(t) = \sum_{i=1}^{k} u_i(t)\, f_i(q(t)). \]

Different admissible paths correspond to the choice of different control functions $u_i(\cdot)$ and initial points $q(0)$, while the vector fields $f_i$ are fixed at the very beginning.

We also use a Hamiltonian approach supported by the Pontryagin maximum principle to characterize shortest paths. A few words about the Hamiltonian approach: sub-Riemannian geodesics are admissible paths whose sufficiently small pieces are length-minimizers, i.e. the length of such a piece is equal to the distance between its endpoints. In the Riemannian setting, any geodesic is uniquely determined by its velocity at the initial point $q$. In the general sub-Riemannian situation we have many more geodesics based at the point $q$ than admissible velocities at $q$. Indeed, every point in a neighborhood of $q$ can be connected with $q$ by a length-minimizer, while the dimension of the subspace of admissible velocities at $q$ is usually smaller than the dimension of the manifold. What is a natural parametrization of the space of geodesics? To understand this question, we adapt a classical "trajectory, wave front" duality.
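The control-system description of admissible paths can be made concrete in the Heisenberg model discussed earlier. The sketch below is illustrative only: the generators `f1`, `f2` (one possible choice of fields spanning the kernel of the form $\omega$ of that model), the Runge-Kutta integrator `rk4`, the control `u` and the point `g` are assumptions introduced here and do not come from the text. It integrates the control equation with the same control from two initial points and compares the result with the left translation of the trajectory issued from the origin.

```python
import math

def f1(q):
    """Assumed generator: d/dx - (y/2) d/dz."""
    x, y, z = q
    return (1.0, 0.0, -0.5 * y)

def f2(q):
    """Assumed generator: d/dy + (x/2) d/dz."""
    x, y, z = q
    return (0.0, 1.0, 0.5 * x)

def velocity(t, q, u):
    """Right-hand side of q' = u1(t) f1(q) + u2(t) f2(q)."""
    u1, u2 = u(t)
    a, b = f1(q), f2(q)
    return tuple(u1 * a[i] + u2 * b[i] for i in range(3))

def rk4(q0, u, t_end, n=2000):
    """Classical fourth-order Runge-Kutta integration of the control system."""
    h = t_end / n
    q, t = tuple(q0), 0.0
    for _ in range(n):
        k1 = velocity(t, q, u)
        k2 = velocity(t + h / 2, tuple(q[i] + h / 2 * k1[i] for i in range(3)), u)
        k3 = velocity(t + h / 2, tuple(q[i] + h / 2 * k2[i] for i in range(3)), u)
        k4 = velocity(t + h, tuple(q[i] + h * k3[i] for i in range(3)), u)
        q = tuple(q[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(3))
        t += h
    return q

def heis_product(g, q):
    """Heisenberg group law (a,b,c)*(x,y,z)."""
    a, b, c = g
    x, y, z = q
    return (a + x, b + y, c + z + 0.5 * (a * y - b * x))

u = lambda t: (math.cos(t), math.sin(t))  # unit-speed control: the projection is a circle
T = 2.0
q_from_origin = rk4((0.0, 0.0, 0.0), u, T)
g = (0.3, -1.2, 0.7)                      # arbitrary new initial point
q_from_g = rk4(g, u, T)
translated = heis_product(g, q_from_origin)
```

With these choices the trajectory from the origin ends near $(\sin T,\ 1-\cos T,\ \frac{1}{2}(T-\sin T))$, and `q_from_g` coincides with `translated` up to rounding, which illustrates that left translations map admissible curves to admissible curves driven by the same control.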
Given a length-parameterized geodesic $t \mapsto \gamma(t)$, we expect that the values at a fixed time $t$ of geodesics starting at $\gamma(0)$ and close to $\gamma$ fill a piece of a smooth hypersurface (see Figure 4). For small $t$ this hypersurface is a piece of the sphere of radius $t$, while in general it is only a piece of the wave front.

[Figure 4: The wave front and the impulse]

Moreover, we expect that $\dot\gamma(t)$ is transversal to this hypersurface. It is not always the case, but this is true for a generic geodesic. The impulse $p(t) \in T^*_{\gamma(t)}M$ is the covector orthogonal to the wave front and normalized by the condition $\langle p(t), \dot\gamma(t)\rangle = 1$. The curve $t \mapsto (p(t),\gamma(t))$ in the cotangent bundle $T^*M$ satisfies a Hamiltonian system. This is exactly what happens in rational mechanics or geometric optics.

The sub-Riemannian Hamiltonian $H : T^*M \to \mathbb{R}$ is defined by the formula

\[ H(p,q) = \tfrac{1}{2}\langle p, v\rangle^2, \]

where $p \in T^*_qM$, and $v \in T_qM$ is an admissible velocity of length 1 that maximizes the inner product of $p$ with admissible velocities of length 1 at $q \in M$. Any smooth function on the cotangent bundle defines a Hamiltonian vector field and such a

field generates a Hamiltonian flow. The Hamiltonian flow on $T^*M$ associated to $H$ is the sub-Riemannian geodesic flow. The Riemannian geodesic flow is just a special case.

As we mentioned, in general, the construction described above cannot be applied to all geodesics: the so-called abnormal geodesics are missed. An abnormal geodesic $\gamma(t)$ also possesses its impulse $p(t) \in T^*_{\gamma(t)}M$, but this impulse belongs to the orthogonal complement to the subspace of admissible velocities and does not satisfy the above Hamiltonian system. Geodesics that are trajectories of the geodesic flow are called normal. Actually, abnormal geodesics belong to the closure of the space of the normal ones, and elementary symplectic geometry provides a uniform characterization of the impulses for both classes of geodesics. Such a characterization is, in fact, a very special case of the Pontryagin maximum principle.

Recall that all velocities are admissible in the Riemannian case, and the Euclidean structure on the tangent bundle induces the identification of tangent vectors and covectors, i.e. of the velocities and impulses. We should however remember that this identification depends on the metric. One can think of a sub-Riemannian metric as the limit of a family of Riemannian metrics when the length of forbidden velocities tends to infinity, while the length of admissible velocities remains untouched. It is easy to see that the Riemannian Hamiltonians defined by such a family converge with all derivatives to the sub-Riemannian Hamiltonian. Hence the Riemannian geodesics with a prescribed initial impulse converge to the sub-Riemannian geodesic with the same initial impulse. On the other hand, we cannot expect any reasonable convergence for the family of Riemannian geodesics with a prescribed initial velocity: those with forbidden initial velocities disappear in the limit, while geodesics with admissible initial velocities multiply.
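As an illustration of the geodesic flow, here is a minimal numerical sketch for the Heisenberg model of this Introduction. Several ingredients are assumptions made for the example and are not taken from the text: the orthonormal generators $f_1 = \partial_x - \frac{y}{2}\partial_z$, $f_2 = \partial_y + \frac{x}{2}\partial_z$, the resulting expression $H = \frac{1}{2}(h_1^2 + h_2^2)$ with $h_i = \langle p, f_i(q)\rangle$ (the maximum of $\langle p, v\rangle$ over unit admissible $v$ is $\sqrt{h_1^2+h_2^2}$), the initial covector, and the Runge-Kutta integrator.

```python
import math

def hamiltonian(state):
    """H = (h1^2 + h2^2)/2 with h_i = <p, f_i(q)> for the assumed frame f1, f2."""
    x, y, z, px, py, pz = state
    h1 = px - 0.5 * y * pz
    h2 = py + 0.5 * x * pz
    return 0.5 * (h1 * h1 + h2 * h2)

def hamiltonian_field(state):
    """Hamilton's equations q' = dH/dp, p' = -dH/dq for the H above."""
    x, y, z, px, py, pz = state
    h1 = px - 0.5 * y * pz
    h2 = py + 0.5 * x * pz
    return [
        h1,                        # dx/dt  =  dH/dpx
        h2,                        # dy/dt  =  dH/dpy
        0.5 * (x * h2 - y * h1),   # dz/dt  =  dH/dpz
        -0.5 * pz * h2,            # dpx/dt = -dH/dx
        0.5 * pz * h1,             # dpy/dt = -dH/dy
        0.0,                       # dpz/dt = -dH/dz
    ]

def rk4(field, state, t_end, n=2000):
    """Classical fourth-order Runge-Kutta integration of an autonomous system."""
    h = t_end / n
    s = list(state)
    m = len(s)
    for _ in range(n):
        k1 = field(s)
        k2 = field([s[i] + 0.5 * h * k1[i] for i in range(m)])
        k3 = field([s[i] + 0.5 * h * k2[i] for i in range(m)])
        k4 = field([s[i] + h * k3[i] for i in range(m)])
        s = [s[i] + h / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(m)]
    return s

pz0 = 2.0                                    # hypothetical initial impulse component
start = [0.0, 0.0, 0.0, 1.0, 0.0, pz0]       # q(0) = origin, p(0) = (1, 0, pz0)
t_end = 1.0
final = rk4(hamiltonian_field, start, t_end)

radius = 1.0 / pz0                           # expected radius of the projected circle
angle = pz0 * t_end                          # angle swept along that circle
```

In this sketch the pair $(h_1,h_2)$ rotates with angular velocity $p_z$, so the planar projection is an arc of a circle of radius $1/|p_z|$ and the $z$ coordinate is the area of the corresponding circular segment, in agreement with the description of the Dido problem. Varying the initial covector, not the initial velocity, is what selects the geodesic.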
Outline of the book

We start in Chapter 1 from surfaces in $\mathbb{R}^3$, which is the beginning of everything in differential geometry and also a starting point of the story told in this book. There are not yet Hamiltonians here, but a control flavor is already present. The presentation is elementary and self-contained. A student in applied mathematics or analysis who missed the geometry of surfaces at the university, or simply is not satisfied by his understanding of these classical ideas, might find it useful to read just this chapter even if he does not plan to study the rest of the book.

In Chapter 2, we recall some basic properties of vector fields and vector bundles. Sub-Riemannian structures are defined in Chapter 3, where we also prove three fundamental facts: the finiteness and the continuity of the sub-Riemannian distance; the existence of length-minimizers; the infinitesimal characterization of geodesics. The first is the classical Chow-Rashevskii theorem; the second and the third are simplified versions of the Filippov existence theorem and the Pontryagin maximum principle.

In Chapter 4, we introduce the symplectic language. We define the geodesic Hamiltonian flow, we consider an interesting class of three-dimensional problems and we prove a general sufficient condition for length-minimality of normal trajectories. Chapter 5 is devoted to applications to integrable Hamiltonian systems. We explain the construction of the action-angle coordinates and we describe classical examples of integrable geodesic flows, such as the geodesic flow on ellipsoids.

Chapters 1-5 form a first part of the book, where we do not use any tool from functional analysis. In fact, even the knowledge of Lebesgue integration and elementary real analysis is not essential, with the unique exception of the existence theorem in Section 3.3. In all other places the reader can substitute the terms "Lipschitz" and "absolutely continuous" by "piecewise $C^1$", and

"measurable" by "piecewise continuous", without a loss for the understanding.

We start to use some basic functional analysis in Chapter 6. In this chapter, we give elements of an operator calculus that simplifies and clarifies calculations with non-stationary flows, their variations and compositions. In Chapter ??, we use this calculus for a fast introduction to Lie group theory. In Chapter 7, we interpret the impulses as Lagrange multipliers for constrained optimization problems and apply this point of view to the sub-Riemannian case. We also introduce the sub-Riemannian exponential map and we study conjugate points.

In Chapter 8, we construct the nonholonomic tangent space at a point $q$ of the manifold: a first quasi-homogeneous approximation of the space if you observe and exploit it from $q$ by means of admissible paths. In general, such a tangent space is a homogeneous space of a nilpotent Lie group equipped with an invariant vector distribution; its structure may depend on the point where the tangent space is attached. At generic points, this is a nilpotent Lie group endowed with a left-invariant vector distribution. The construction of the nonholonomic tangent space does not need a metric; if we take into account the metric, we obtain the Gromov-Hausdorff tangent to the sub-Riemannian metric space. Useful "ball-box" estimates of small balls follow automatically. Chapter ?? is devoted to the explicit calculation of the sub-Riemannian distance for model spaces.

In Chapter 10, we study general analytic properties of the sub-Riemannian distance as a function of points of the manifold. It is shown that the distance is smooth on an open dense subset and is semi-concave out of the points connected by abnormal length-minimizers. Moreover, a generic sphere is a Lipschitz submanifold if we remove these bad points. In Chapter 11, we turn to abnormal geodesics, which provide the deepest singularities of the distance.
Abnormal geodesics are critical points of the endpoint map defined on the space of admissible paths, and the main tool for their study is the Hessian of the endpoint map.

This is the end of the second part of the book; the next few chapters are devoted to the curvature and its applications. Let $\Phi^t : T^*M \to T^*M$, for $t \in \mathbb{R}$, be a sub-Riemannian geodesic flow. The submanifolds $\Phi^t(T^*_qM)$, $q \in M$, form a fibration of $T^*M$. Given $\lambda \in T^*M$, let $J_\lambda(t) \subset T_\lambda(T^*M)$ be the tangent space to the leaf of this fibration. Recall that $\Phi^t$ is a Hamiltonian flow and the $T^*_qM$ are Lagrangian submanifolds; hence the leaves of our fibrations are Lagrangian submanifolds and $J_\lambda(t)$ is a Lagrangian subspace of the symplectic space $T_\lambda(T^*M)$. In other words, $J_\lambda(t)$ belongs to the Lagrangian Grassmannian of $T_\lambda(T^*M)$, and $t \mapsto J_\lambda(t)$ is a curve in the Lagrangian Grassmannian, a Jacobi curve of the sub-Riemannian structure. The curvature of the sub-Riemannian space at $\lambda$ is simply the curvature of this curve in the Lagrangian Grassmannian.

Chapter 12 is devoted to the elementary differential geometry of curves in the Lagrangian Grassmannian; in Chapter 13 we apply this geometry to Jacobi curves. The language of Jacobi curves is translated to the traditional language in the Riemannian case in Chapter 14. We recover the Levi-Civita connection and the Riemannian curvature and demonstrate their symplectic meaning. In Chapter 15, we explicitly compute the sub-Riemannian curvature for contact three-dimensional spaces. In Chapter 16 we study the small distance asymptotics of the exponential map in the three-dimensional contact case and see how the structure of the conjugate locus is encoded in the curvature. In Chapter ??, we consider two-dimensional sub-Riemannian metrics; such a metric differs from a Riemannian one only along a one-dimensional submanifold. In the last Chapter 17 we define the

sub-Riemannian Laplace operator, the canonical volume form, and compute the density of the sub-Riemannian Hausdorff measure. We conclude with a discussion of the sub-Riemannian heat equation and an explicit formula for the heat kernel in the three-dimensional Heisenberg case.

We finish here this introduction into the Introduction... We hope that the reader won't be bored; comments to the chapters contain suggestions for further reading.


Chapter 1

Geometry of surfaces in $\mathbb{R}^3$

In this preliminary chapter we study the geometry of smooth two-dimensional surfaces in $\mathbb{R}^3$ as a "heating problem" and we recover some classical results. In the first part of the chapter we consider surfaces in $\mathbb{R}^3$ endowed with the standard Euclidean product, which we denote by $\langle \cdot \mid \cdot \rangle$. In the second part we study surfaces in the Minkowski space, that is $\mathbb{R}^3$ endowed with a sign-indefinite inner product, which we denote by $\langle \cdot \mid \cdot \rangle_h$.

Definition 1.1. A surface of $\mathbb{R}^3$ is a subset $M \subset \mathbb{R}^3$ such that for every $q \in M$ there exist a neighborhood $U \subset \mathbb{R}^3$ of $q$ and a smooth function $a : U \to \mathbb{R}$ such that $U \cap M = a^{-1}(0)$ and $\nabla a \neq 0$ on $U \cap M$.

1.1 Geodesics and optimality

Let $M \subset \mathbb{R}^3$ be a surface and $\gamma : [0,T] \to M$ be a smooth curve in $M$. The length of $\gamma$ is defined as

\[ \ell(\gamma) := \int_0^T \|\dot\gamma(t)\|\,dt, \tag{1.1} \]

where $\|v\| = \sqrt{\langle v \mid v\rangle}$ denotes the norm of a vector in $\mathbb{R}^3$.

Remark 1.2. Notice that the definition of length in (1.1) is invariant by reparametrizations of the curve. Indeed, let $\varphi : [0,T'] \to [0,T]$ be a monotone smooth function mapping $[0,T']$ onto $[0,T]$. Define $\gamma_\varphi : [0,T'] \to M$ by $\gamma_\varphi := \gamma \circ \varphi$. Using the change of variables $t = \varphi(s)$, one gets

\[ \ell(\gamma_\varphi) = \int_0^{T'} \|\dot\gamma_\varphi(s)\|\,ds = \int_0^{T'} \|\dot\gamma(\varphi(s))\|\,|\dot\varphi(s)|\,ds = \int_0^{T} \|\dot\gamma(t)\|\,dt = \ell(\gamma). \]

The definition of length can be extended to piecewise smooth curves on $M$, by adding the lengths of the smooth pieces of $\gamma$. When the curve $\gamma$ is parametrized in such a way that $\|\dot\gamma(t)\| \equiv c$ for some $c > 0$ we say that $\gamma$ has constant speed. If moreover $c = 1$ we say that $\gamma$ is parametrized by length.

The distance between two points $p, q \in M$ is the infimum of the length of curves that join $p$ to $q$:

\[ d(p,q) = \inf\{\ell(\gamma) \;:\; \gamma : [0,T] \to M \text{ piecewise smooth},\ \gamma(0) = p,\ \gamma(T) = q\}. \tag{1.2} \]

Now we focus on length-minimizers, i.e., piecewise smooth curves that realize the distance between their endpoints: $\ell(\gamma) = d(\gamma(0), \gamma(T))$.
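The invariance of the length (1.1) under reparametrization can be observed numerically. The sketch below is only an illustration under stated assumptions: the test curve on the unit sphere, the reparametrization $\varphi(s) = T s^2$, and the helper `length` (midpoint rule with a central-difference velocity) are choices made here, not part of the text.

```python
import math

def curve(t):
    """Hypothetical test curve lying on the unit sphere."""
    return (math.sin(t) * math.cos(2 * t), math.sin(t) * math.sin(2 * t), math.cos(t))

def length(c, a, b, n=20000, delta=1e-6):
    """Approximate int_a^b |c'(t)| dt: midpoint rule, velocity by central differences."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        p, q = c(t - delta), c(t + delta)
        total += math.dist(p, q) / (2 * delta)   # numerical speed |c'(t)|
    return h * total

T = 2.0
phi = lambda s: T * s * s                         # monotone map of [0, 1] onto [0, T]
length_original = length(curve, 0.0, T)
length_reparam = length(lambda s: curve(phi(s)), 0.0, 1.0)
```

The two values agree up to discretization error, although the second curve is traversed with a non-constant, initially vanishing speed.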

[Figure 1.1: A smooth minimizer]

Exercise 1.3. Prove that, if $\gamma : [0,T] \to M$ is a length-minimizer, then the curve $\gamma|_{[t_1,t_2]}$ is also a length-minimizer, for all $0 < t_1 < t_2 < T$.

The following proposition characterizes smooth minimizers. We prove later that all minimizers are smooth (cf. Corollary 1.15).

Proposition 1.4. Let $\gamma : [0,T] \to M$ be a smooth minimizer parametrized by length. Then $\ddot\gamma(t) \perp T_{\gamma(t)}M$ for all $t \in [0,T]$.

Proof. Consider a smooth non-autonomous vector field $(t,q) \mapsto f_t(q) \in T_qM$ that extends the tangent vector to $\gamma$ in a neighborhood $W$ of the graph of the curve $\{(t,\gamma(t))\} \subset \mathbb{R} \times M$, i.e.

\[ f_t(\gamma(t)) = \dot\gamma(t) \quad\text{and}\quad \|f_t(q)\| \equiv 1, \qquad \forall\,(t,q) \in W. \]

Let now $(t,q) \mapsto g_t(q) \in T_qM$ be a smooth non-autonomous vector field such that $f_t(q)$ and $g_t(q)$ define a local orthonormal frame in the following sense:

\[ \langle f_t(q) \mid g_t(q)\rangle = 0, \qquad \|g_t(q)\| \equiv 1, \qquad \forall\,(t,q) \in W. \]

Piecewise smooth curves parametrized by length on $M$ are solutions of the following ordinary differential equation

\[ \dot x(t) = \cos u(t)\, f_t(x(t)) + \sin u(t)\, g_t(x(t)), \tag{1.3} \]

for some initial condition $x(0) = q$ and some piecewise continuous function $u(t)$, which we call control. The curve $\gamma$ is the solution to (1.3) associated with the control $u(t) \equiv 0$ and initial condition $\gamma(0)$. Let us consider the family of controls

\[ u_{\tau,s}(t) = \begin{cases} 0, & t < \tau, \\ s, & t \geq \tau, \end{cases} \qquad 0 \leq \tau \leq T,\ s \in \mathbb{R}, \tag{1.4} \]

and denote by $x_{\tau,s}(t)$ the solution of (1.3) that corresponds to the control $u_{\tau,s}(t)$ and with initial condition $x_{\tau,s}(0) = \gamma(0)$.

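Lemma 1.5. For every $\tau_1, \tau_2, t \in [0,T]$ the following vectors are linearly dependent:

\[ \frac{\partial}{\partial s}\Big|_{s=0} x_{\tau_1,s}(t), \qquad \frac{\partial}{\partial s}\Big|_{s=0} x_{\tau_2,s}(t). \tag{1.5} \]

Proof. By Exercise 1.3 it is not restrictive to assume $t = T$. Fix $0 \leq \tau_1 \leq \tau_2 \leq T$ and consider the family of curves $\phi(t;h_1,h_2)$, solutions of (1.3) associated with the controls

\[ v_{h_1,h_2}(t) = \begin{cases} 0, & t \in [0,\tau_1[, \\ h_1, & t \in [\tau_1,\tau_2[, \\ h_1 + h_2, & t \in [\tau_2, T+\varepsilon[, \end{cases} \]

where $h_1, h_2$ belong to a neighborhood of $0$ and $\varepsilon$ is small enough (to guarantee the existence of the trajectory). Notice that $\phi$ is smooth in a neighborhood of $(t,h_1,h_2) = (T,0,0)$ and

\[ \frac{\partial \phi}{\partial h_i}\Big|_{(h_1,h_2)=0} = \frac{\partial}{\partial s}\Big|_{s=0} x_{\tau_i,s}(t), \qquad i = 1,2. \]

By contradiction, assume that the vectors in (1.5) are linearly independent. Then $\frac{\partial \phi}{\partial h}$ is invertible and the classical implicit function theorem applied to the map $(t,h_1,h_2) \mapsto \phi(t;h_1,h_2)$ at the point $(T,0,0)$ implies that there exists $\delta > 0$ such that

\[ \forall\, t \in\, ]T-\delta, T+\delta[, \quad \exists\, h_1, h_2 \ \text{ s.t. } \ \phi(t;h_1,h_2) = \gamma(T). \]

In particular there exists a curve with unit speed joining $\gamma(0)$ and $\gamma(T)$ in time $t < T$, which gives a contradiction, since $\gamma$ is a minimizer.

Lemma 1.6. For every $\tau, t \in [0,T]$ the following identity holds:

\[ \Big\langle \frac{\partial}{\partial s}\Big|_{s=0} x_{\tau,s}(t) \;\Big|\; \dot\gamma(t) \Big\rangle = 0. \tag{1.6} \]

Proof. If $t \leq \tau$, then by construction (cf. (1.4)) the first vector is zero since there is no variation with respect to $s$, and the conclusion follows. Let us now assume that $t > \tau$. Again, by Exercise 1.3, it is sufficient to prove the statement at $t = T$. Let us write the Taylor expansion of $\psi(t) = \frac{\partial}{\partial s}\big|_{s=0} x_{\tau,s}(t)$ in a right neighborhood of $t = \tau$. Observe that, for $t \geq \tau$,

\[ \dot x_{\tau,s} = \cos(s)\, f_t(x_{\tau,s}) + \sin(s)\, g_t(x_{\tau,s}). \]

Hence

\[ \psi(\tau) = \frac{\partial}{\partial s}\Big|_{s=0} x_{\tau,s}(\tau) = 0, \qquad \dot\psi(\tau) = \frac{\partial}{\partial s}\Big|_{s=0} \dot x_{\tau,s}(\tau) = g_\tau(x_{\tau,s}(\tau)), \]

where $x_{\tau,s}(\tau) = \gamma(\tau)$ for every $s$. Then, for $t \geq \tau$, we have

\[ \psi(t) = (t-\tau)\, g_\tau(x_{\tau,s}(\tau)) + O((t-\tau)^2). \tag{1.7} \]

For $\tau$ sufficiently close to $T$, one can take $t = T$ in (1.7). Passing to the limit for $\tau \to T$ one gets

\[ \frac{1}{T-\tau}\, \frac{\partial}{\partial s}\Big|_{s=0} x_{\tau,s}(T) \;\xrightarrow[\tau \to T]{}\; g_T(\gamma(T)). \]

Now, by Lemma 1.5 all vectors on the left hand side are parallel to each other, hence they are parallel to $g_T(\gamma(T))$. The lemma is proved since $\dot\gamma(T) = f_T(\gamma(T))$ and $f_T$ and $g_T$ are orthogonal.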

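Now we end the proof of the proposition by showing that $\ddot\gamma(t) \perp T_{\gamma(t)}M$. Notice that this is equivalent to showing that

\[ \langle \ddot\gamma(t) \mid f_t(\gamma(t))\rangle = \langle \ddot\gamma(t) \mid g_t(\gamma(t))\rangle = 0. \tag{1.8} \]

Recall that $\langle \dot\gamma(t) \mid \dot\gamma(t)\rangle = 1$. Differentiating this identity one gets

\[ 0 = \frac{d}{dt}\langle \dot\gamma(t) \mid \dot\gamma(t)\rangle = 2\,\langle \ddot\gamma(t) \mid \dot\gamma(t)\rangle, \]

which shows that $\ddot\gamma(t)$ is orthogonal to $f_t(\gamma(t))$. Next, differentiating (1.6) with respect to $t$, we have (see footnote 1), for $t \neq \tau$,

\[ \Big\langle \frac{\partial}{\partial s}\Big|_{s=0} \dot x_{\tau,s}(t) \;\Big|\; \dot\gamma(t)\Big\rangle + \Big\langle \frac{\partial}{\partial s}\Big|_{s=0} x_{\tau,s}(t) \;\Big|\; \ddot\gamma(t)\Big\rangle = 0. \tag{1.9} \]

Now, from $\langle \dot x_{\tau,s}(t) \mid \dot x_{\tau,s}(t)\rangle = 1$ one gets

\[ \Big\langle \frac{\partial}{\partial s} \dot x_{\tau,s}(t) \;\Big|\; \dot x_{\tau,s}(t)\Big\rangle = 0, \qquad \text{for } t \neq \tau. \]

Evaluating at $s = 0$, using that $x_{\tau,0}(t) = \gamma(t)$, one has

\[ \Big\langle \frac{\partial}{\partial s}\Big|_{s=0} \dot x_{\tau,s}(t) \;\Big|\; \dot\gamma(t)\Big\rangle = 0, \qquad \text{for } t \neq \tau. \]

Hence, by (1.9), it follows that

\[ \Big\langle \frac{\partial}{\partial s}\Big|_{s=0} x_{\tau,s}(t) \;\Big|\; \ddot\gamma(t)\Big\rangle = 0, \]

which, by continuity, holds for every $t \in [0,T]$. Using that $\frac{\partial}{\partial s}\big|_{s=0} x_{\tau,s}(t)$ is parallel to $g_t(\gamma(t))$ (see the proof of Lemma 1.6), it follows that $\langle g_t(\gamma(t)) \mid \ddot\gamma(t)\rangle = 0$.

Definition 1.7. A smooth curve $\gamma : [0,T] \to M$ parametrized with constant speed is called a geodesic if it satisfies

\[ \ddot\gamma(t) \perp T_{\gamma(t)}M, \qquad \forall\, t \in [0,T]. \tag{1.10} \]

Proposition 1.4 says that a smooth curve that minimizes the length is a geodesic.

Now we get an explicit characterization of geodesics when the manifold $M$ is globally defined as the zero level of a smooth function. In other words, there exists a smooth function $a : \mathbb{R}^3 \to \mathbb{R}$ such that

\[ M = a^{-1}(0), \quad\text{and}\quad \nabla a \neq 0 \text{ on } M. \tag{1.11} \]

Remark 1.8. Recall that for all $q \in M$ it holds $\nabla_q a \perp T_qM$. Indeed, for every $q \in M$ and $v \in T_qM$, let $\gamma : [0,T] \to M$ be a smooth curve on $M$ such that $\gamma(0) = q$ and $\dot\gamma(0) = v$. By definition of $M$ one has $a(\gamma(t)) = 0$. Differentiating this identity with respect to $t$ at $t = 0$ one gets $\langle \nabla_q a \mid v\rangle = 0$.

Proposition 1.9. A smooth curve $\gamma : [0,T] \to M$ is a geodesic if and only if it satisfies, in matrix notation:

\[ \ddot\gamma(t) = -\,\frac{\dot\gamma(t)^T\,(\nabla^2_{\gamma(t)} a)\,\dot\gamma(t)}{\|\nabla_{\gamma(t)} a\|^2}\;\nabla_{\gamma(t)} a, \qquad \forall\, t \in [0,T], \tag{1.12} \]

where $\nabla^2_{\gamma(t)} a$ is the Hessian matrix of $a$.

Footnote 1: notice that $x_{\tau,s}$ is smooth on the set $[0,T] \setminus \{\tau\}$.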
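Equation (1.12) is a second order ODE that can be integrated numerically. The sketch below is illustrative and rests on assumptions not found in the text: the surface is an ellipsoid $a(q) = x^2/A^2 + y^2/B^2 + z^2/C^2 - 1$ (the unit sphere for $A=B=C=1$), the integrator is a classical Runge-Kutta scheme, and all function names are hypothetical. On the sphere the solution should be a great circle; on a general ellipsoid one can at least check that the curve stays on $M$ and keeps constant speed.

```python
import math

def make_ellipsoid(A, B, C):
    """Level function a, its gradient and Hessian for x^2/A^2 + y^2/B^2 + z^2/C^2 = 1."""
    w = (1.0 / A**2, 1.0 / B**2, 1.0 / C**2)
    a = lambda q: sum(w[i] * q[i] ** 2 for i in range(3)) - 1.0
    grad = lambda q: [2.0 * w[i] * q[i] for i in range(3)]
    hess = lambda q: [[2.0 * w[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
    return a, grad, hess

def geodesic_rhs(grad, hess):
    """First-order form of (1.12): state = (q, v), q' = v, v' = -(v^T H v / |g|^2) g."""
    def rhs(state):
        q, v = state[:3], state[3:]
        g, H = grad(q), hess(q)
        vHv = sum(v[i] * H[i][j] * v[j] for i in range(3) for j in range(3))
        g2 = sum(gi * gi for gi in g)
        return list(v) + [-(vHv / g2) * gi for gi in g]
    return rhs

def rk4(rhs, state, t_end, n=4000):
    """Classical fourth-order Runge-Kutta integration."""
    h = t_end / n
    s = list(state)
    for _ in range(n):
        k1 = rhs(s)
        k2 = rhs([s[i] + 0.5 * h * k1[i] for i in range(6)])
        k3 = rhs([s[i] + 0.5 * h * k2[i] for i in range(6)])
        k4 = rhs([s[i] + h * k3[i] for i in range(6)])
        s = [s[i] + h / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(6)]
    return s

# Unit sphere: the geodesic from (1,0,0) with velocity (0,1,0) is the equator.
_, grad_s, hess_s = make_ellipsoid(1.0, 1.0, 1.0)
sphere_end = rk4(geodesic_rhs(grad_s, hess_s), [1, 0, 0, 0, 1, 0], math.pi / 2)

# Ellipsoid with semi-axes 1, 2, 3: start at (1,0,0) with a unit tangent velocity.
a_e, grad_e, hess_e = make_ellipsoid(1.0, 2.0, 3.0)
ell_end = rk4(geodesic_rhs(grad_e, hess_e), [1, 0, 0, 0, 0.6, 0.8], 3.0)
ell_speed = math.sqrt(sum(c * c for c in ell_end[3:]))
```

Here `sphere_end` is close to the point $(0,1,0)$ with velocity $(-1,0,0)$, while `a_e(ell_end[:3])` stays close to zero and `ell_speed` close to one, as expected for a solution of (1.12) with initial velocity tangent to $M$.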

Proof. Differentiating the equality $\langle \nabla_{\gamma(t)} a \mid \dot\gamma(t)\rangle = 0$ we get, in matrix notation:

\[ \dot\gamma(t)^T\,(\nabla^2_{\gamma(t)} a)\,\dot\gamma(t) + \ddot\gamma(t)^T\,\nabla_{\gamma(t)} a = 0. \]

By definition of geodesic there exists a function $b(t)$ such that $\ddot\gamma(t) = b(t)\,\nabla_{\gamma(t)} a$. Hence we get

\[ \dot\gamma(t)^T\,(\nabla^2_{\gamma(t)} a)\,\dot\gamma(t) + b(t)\,\|\nabla_{\gamma(t)} a\|^2 = 0, \]

from which (1.12) follows.

Remark 1.10. Notice that formula (1.12) is always true locally since, by definition of surface, the assumptions (1.11) are always satisfied locally.

1.1.1 Existence and minimizing properties of geodesics

As a direct consequence of Proposition 1.9 one gets the following existence and uniqueness theorem for geodesics.

Corollary 1.11. Let $q \in M$ and $v \in T_qM$. There exists a unique geodesic $\gamma : [0,\varepsilon] \to M$, for $\varepsilon > 0$ small enough, such that $\gamma(0) = q$ and $\dot\gamma(0) = v$.

Proof. By Proposition 1.9, geodesics satisfy a second order ODE, hence they are smooth curves, characterized by their initial position and velocity.

To end this section we show that small pieces of geodesics are always global minimizers.

Theorem 1.12. Let $\gamma : [0,T] \to M$ be a geodesic. For every $\tau \in [0,T[$ there exists $\varepsilon > 0$ such that

(i) $\gamma|_{[\tau,\tau+\varepsilon]}$ is a minimizer, i.e. $d(\gamma(\tau),\gamma(\tau+\varepsilon)) = \ell(\gamma|_{[\tau,\tau+\varepsilon]})$,

(ii) $\gamma|_{[\tau,\tau+\varepsilon]}$ is the unique minimizer joining $\gamma(\tau)$ and $\gamma(\tau+\varepsilon)$ in the class of piecewise smooth curves, up to reparametrization.

Proof. Without loss of generality let us assume that $\tau = 0$ and that $\gamma$ is length parametrized. Consider a length-parametrized curve $\alpha$ on $M$ such that $\alpha(0) = \gamma(0)$ and $\dot\alpha(0) \perp \dot\gamma(0)$, and denote by $(t,s) \mapsto x_s(t)$ the smooth variation of geodesics such that $x_0(t) = \gamma(t)$ and (see also Figure 1.2)

\[ x_s(0) = \alpha(s), \qquad \dot x_s(0) \perp \dot\alpha(s). \tag{1.13} \]

The map $\psi : (t,s) \mapsto x_s(t)$ is a local diffeomorphism near $(0,0)$. Indeed the partial derivatives

\[ \frac{\partial \psi}{\partial t}\Big|_{t=s=0} = \frac{\partial}{\partial t}\Big|_{t=0} x_0(t) = \dot\gamma(0), \qquad \frac{\partial \psi}{\partial s}\Big|_{t=s=0} = \frac{\partial}{\partial s}\Big|_{s=0} x_s(0) = \dot\alpha(0), \]

are linearly independent. Thus $\psi$ maps a neighborhood $U$ of $(0,0)$ onto a neighborhood $W$ of $\gamma(0)$. We now consider the function $\phi$ and the vector field $X$ defined on $W$ by

\[ \phi : x_s(t) \mapsto t, \qquad X : x_s(t) \mapsto \dot x_s(t). \]

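[Figure 1.2: Proof of Theorem 1.12, showing the curves $\gamma$, $\alpha(s)$ and $x_s(t)$.]

Lemma 1.13. $\nabla_q \phi = X(q)$ for every $q \in W$.

Proof of Lemma 1.13. We first show that the two vectors are parallel, and then that they actually coincide. To show that they are parallel, first notice that $\nabla\phi$ is orthogonal to its level sets $\{t = \text{const}\}$, hence
\[ \Big\langle \nabla_{x_s(t)} \phi \,\Big|\, \frac{\partial}{\partial s} x_s(t) \Big\rangle = 0, \qquad \forall\, (t,s) \in U. \tag{1.14} \]
Now, let us show that
\[ \Big\langle \frac{\partial}{\partial s} x_s(t) \,\Big|\, \dot x_s(t) \Big\rangle = 0, \qquad \forall\, (t,s) \in U. \tag{1.15} \]
Computing the derivative with respect to $t$ of the left hand side of (1.15) one gets
\[ \Big\langle \frac{\partial}{\partial s} \dot x_s(t) \,\Big|\, \dot x_s(t) \Big\rangle + \Big\langle \frac{\partial}{\partial s} x_s(t) \,\Big|\, \ddot x_s(t) \Big\rangle, \]
which is identically zero. Indeed the first term is zero because $\dot x_s(t)$ has unit speed, and the second one vanishes because of (1.10). Hence the left hand side of (1.15) is constant and coincides with its value at $t = 0$, which is zero by the orthogonality assumption (1.13).

By (1.14) and (1.15) one gets that $\nabla\phi$ is parallel to $X$. Actually they coincide, since $X$ has unit length and
\[ \langle \nabla\phi \mid X \rangle = \frac{d}{dt} \phi(x_s(t)) = 1. \]

Now consider $\varepsilon > 0$ small enough such that $\gamma|_{[0,\varepsilon]}$ is contained in $W$, and take a piecewise smooth and length-parametrized curve $c : [0,\varepsilon'] \to M$ contained in $W$ and joining $\gamma(0)$ to $\gamma(\varepsilon)$. Let us show that $\gamma$ is shorter than $c$. First notice that
\[ \ell(\gamma|_{[0,\varepsilon]}) = \varepsilon = \phi(\gamma(\varepsilon)) = \phi(c(\varepsilon')). \]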

Using that $\phi(c(0)) = \phi(\gamma(0)) = 0$ and that $\ell(c) = \varepsilon'$ we have that
\begin{align}
\ell(\gamma|_{[0,\varepsilon]}) &= \phi(c(\varepsilon')) - \phi(c(0)) = \int_0^{\varepsilon'} \frac{d}{dt} \phi(c(t))\, dt \tag{1.16} \\
&= \int_0^{\varepsilon'} \langle \nabla\phi(c(t)) \mid \dot c(t) \rangle\, dt = \int_0^{\varepsilon'} \langle X(c(t)) \mid \dot c(t) \rangle\, dt \leq \varepsilon' = \ell(c). \tag{1.17}
\end{align}
The last inequality follows from the Cauchy-Schwarz inequality
\[ \langle X(c(t)) \mid \dot c(t) \rangle \leq \| X(c(t)) \|\, \| \dot c(t) \| = 1, \tag{1.18} \]
which holds at every smooth point of $c(t)$. In addition, equality in (1.18) holds if and only if $\dot c(t) = X(c(t))$ (at the smooth points of $c$). Hence we get that $\ell(c) = \ell(\gamma|_{[0,\varepsilon]})$ if and only if $c$ coincides with $\gamma|_{[0,\varepsilon]}$.

Now let us show that there exists $\bar\varepsilon \leq \varepsilon$ such that $\gamma|_{[0,\bar\varepsilon]}$ is a global minimizer among all piecewise smooth curves joining $\gamma(0)$ to $\gamma(\bar\varepsilon)$. It is enough to take $\bar\varepsilon < \operatorname{dist}(\gamma(0), \partial W)$: every curve that escapes from $W$ has length greater than $\bar\varepsilon$.

From Theorem 1.12 it follows:

Corollary 1.14. Any minimizer of the distance (in the class of piecewise smooth curves) is a geodesic, and hence smooth.

1.1.2 Absolutely continuous curves

Notice that formula (1.1) defines the length of a curve even in the class of absolutely continuous ones, if one understands the integral in the Lebesgue sense. In this setting, in the proof of Theorem 1.12, one can assume that the curve $c$ is actually absolutely continuous. This proves that small pieces of geodesics are minimizers also in the class of absolutely continuous curves on $M$. Moreover, this proves the following.

Corollary 1.15. Any minimizer of the distance (in the class of absolutely continuous curves) is a geodesic, and hence smooth.

1.2 Parallel transport

In this section we introduce the notion of parallel transport, which allows us to define the main geometric invariant of a surface: the Gaussian curvature.

Let us consider a curve $\gamma : [0,T] \to M$ and a vector $\xi \in T_{\gamma(0)}M$. We want to define the parallel transport of $\xi$ along $\gamma$. Heuristically, it is a curve $\xi(t) \in T_{\gamma(t)}M$ such that the vectors $\{\xi(t), t \in [0,T]\}$ are all parallel.

Remark 1.16. If $M = \mathbb{R}^2 \subset \mathbb{R}^3$ is the set $\{z = 0\}$ we can canonically identify every tangent space $T_{\gamma(t)}M$ with $\mathbb{R}^2$, so that every tangent vector $\xi(t)$ belongs to the same vector space (the canonical isomorphism $\mathbb{R}^2 \simeq T_x\mathbb{R}^2$ is written explicitly as $y \mapsto \frac{d}{dt}\big|_{t=0} (x + ty)$). In this case, parallel simply means $\dot\xi(t) = 0$ as an element of $\mathbb{R}^3$. This is not the case if $M$ is a manifold, because tangent spaces at different points are different.

Definition 1.17. Let $\gamma : [0,T] \to M$ be a smooth curve. A smooth curve of tangent vectors $\xi(t) \in T_{\gamma(t)}M$ is said to be parallel if $\dot\xi(t) \perp T_{\gamma(t)}M$.

Assume now that $M$ is the zero level of a smooth function $a : \mathbb{R}^3 \to \mathbb{R}$ as in (1.11). We have the following description:

Proposition 1.18. A smooth curve of tangent vectors $\xi(t)$ defined along $\gamma : [0,T] \to M$ is parallel if and only if it satisfies
\[ \dot\xi(t) = - \frac{\dot\gamma(t)^T (\nabla^2_{\gamma(t)} a)\, \xi(t)}{\| \nabla_{\gamma(t)} a \|^2}\, \nabla_{\gamma(t)} a, \qquad \forall\, t \in [0,T]. \tag{1.19} \]

Proof. As in Remark 1.8, $\xi(t) \in T_{\gamma(t)}M$ implies $\langle \nabla_{\gamma(t)} a \mid \xi(t) \rangle = 0$. Moreover, by assumption $\dot\xi(t) = \alpha(t)\, \nabla_{\gamma(t)} a$ for some smooth function $\alpha$. With computations analogous to those in the proof of Proposition 1.9 we get that
\[ \dot\gamma(t)^T (\nabla^2_{\gamma(t)} a)\, \xi(t) + \alpha(t)\, \| \nabla_{\gamma(t)} a \|^2 = 0, \]
from which the statement follows.

Remark 1.19. Notice that, since (1.19) is a first order linear ODE with respect to $\xi$, for a given curve $\gamma : [0,T] \to M$ and initial datum $v \in T_{\gamma(0)}M$, there is a unique parallel curve of tangent vectors $\xi(t) \in T_{\gamma(t)}M$ along $\gamma$ such that $\xi(0) = v$. Since (1.19) is a linear ODE, the operator that associates with every initial condition $\xi(0)$ the final vector $\xi(t)$ is a linear operator, which is called parallel transport.

Next we state a key property of the parallel transport.

Proposition 1.20. The parallel transport preserves the inner product. In other words, if $\xi(t), \eta(t)$ are two parallel curves of tangent vectors along $\gamma$, then we have
\[ \frac{d}{dt} \langle \xi(t) \mid \eta(t) \rangle = 0, \qquad \forall\, t \in [0,T]. \tag{1.20} \]

Proof. From the fact that $\xi(t), \eta(t) \in T_{\gamma(t)}M$ and $\dot\xi(t), \dot\eta(t) \perp T_{\gamma(t)}M$ one immediately gets
\[ \frac{d}{dt} \langle \xi(t) \mid \eta(t) \rangle = \langle \dot\xi(t) \mid \eta(t) \rangle + \langle \xi(t) \mid \dot\eta(t) \rangle = 0. \]

The notion of parallel transport allows us to give a new characterization of geodesics. Indeed, by definition:

Corollary 1.21. A smooth curve $\gamma : [0,T] \to M$ is a geodesic if and only if $\dot\gamma$ is parallel along $\gamma$.

In the following we assume that $M$ is oriented.

Definition 1.22. The spherical bundle $SM$ on $M$ is the disjoint union of all unit tangent vectors to $M$:
\[ SM = \bigsqcup_{q \in M} S_qM, \qquad S_qM = \{ v \in T_qM,\ \|v\| = 1 \}. \tag{1.21} \]
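Equation (1.19) and the preservation of the inner product can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the authors' construction: it takes the unit sphere $a(q) = |q|^2 - 1$ and the latitude circle at colatitude $\pi/3$ (a curve that is not a geodesic), and integrates (1.19) with a Runge-Kutta scheme. The function name `parallel_transport` is hypothetical. For this particular latitude a direct computation in the frame of meridian and parallel directions shows that a vector comes back reversed after one full loop.

```python
import numpy as np

def parallel_transport(gamma, dgamma, grad_a, hess_a, xi0, T, steps=4000):
    """RK4 integration of the linear ODE (1.19) along t -> gamma(t), t in [0, T]."""
    def rhs(t, xi):
        q, v = gamma(t), dgamma(t)
        g = grad_a(q)
        return -(v @ hess_a(q) @ xi) / (g @ g) * g
    h = T / steps
    xi, t = np.array(xi0, dtype=float), 0.0
    for _ in range(steps):
        k1 = rhs(t, xi)
        k2 = rhs(t + 0.5 * h, xi + 0.5 * h * k1)
        k3 = rhs(t + 0.5 * h, xi + 0.5 * h * k2)
        k4 = rhs(t + h, xi + h * k3)
        xi = xi + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return xi

# Assumed example: unit sphere and the latitude circle at colatitude phi0.
phi0 = np.pi / 3
gamma = lambda t: np.array([np.sin(phi0) * np.cos(t), np.sin(phi0) * np.sin(t), np.cos(phi0)])
dgamma = lambda t: np.array([-np.sin(phi0) * np.sin(t), np.sin(phi0) * np.cos(t), 0.0])
grad_a = lambda q: 2.0 * q
hess_a = lambda q: 2.0 * np.eye(3)

# Two orthonormal tangent vectors at gamma(0): along the meridian and along the parallel.
xi0 = np.array([np.cos(phi0), 0.0, -np.sin(phi0)])
eta0 = np.array([0.0, 1.0, 0.0])

T = 2.0 * np.pi
xiT = parallel_transport(gamma, dgamma, grad_a, hess_a, xi0, T)
etaT = parallel_transport(gamma, dgamma, grad_a, hess_a, eta0, T)
```

The transported pair stays orthonormal up to integration error, while `xiT` differs from `xi0`: the parallel transport along a closed curve is in general a nontrivial rotation of the tangent plane.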

$SM$ is a smooth manifold of dimension 3. Moreover it has the structure of a fiber bundle with base manifold $M$, typical fiber $S^1$, and canonical projection
\[ \pi : SM \to M, \qquad \pi(v) = q \ \text{ if } v \in T_qM. \]

Remark 1.23. Since every vector in the fiber $S_qM$ has norm one, we can parametrize every $v \in S_qM$ by an angular coordinate $\theta \in S^1$ through an orthonormal frame $\{e_1(q), e_2(q)\}$ of $T_qM$, i.e. $v = \cos(\theta)\, e_1(q) + \sin(\theta)\, e_2(q)$.

The choice of a positively oriented orthonormal frame $\{e_1(q), e_2(q)\}$ corresponds to fixing the element in the fiber corresponding to $\theta = 0$. Hence, the choice of such an orthonormal frame at every point $q$ induces coordinates on $SM$ of the form $(q, \theta + \varphi(q))$, where $\varphi \in C^\infty(M)$.

Given an element $\xi \in S_qM$ we can complete it to an orthonormal frame $(\xi, \eta, \nu)$ of $\mathbb{R}^3$ in the following unique way:
(i) $\eta \in T_qM$ is orthogonal to $\xi$ and $(\xi, \eta)$ is positively oriented (w.r.t. the orientation of $M$),
(ii) $\nu \perp T_qM$ and $(\xi, \eta, \nu)$ is positively oriented (w.r.t. the orientation of $\mathbb{R}^3$).

Let $t \mapsto \xi(t) \in S_{\gamma(t)}M$ be a smooth curve of unit tangent vectors along $\gamma : [0,T] \to M$. Define $\eta(t), \nu(t)$ as above. Since $t \mapsto \xi(t)$ has constant (unit) norm, one has $\dot\xi(t) \perp \xi(t)$ and we can write
\[ \dot\xi(t) = u_\xi(t)\, \eta(t) + v_\xi(t)\, \nu(t). \]
In particular this shows that every element of $T_\xi SM$, written in the basis $(\xi, \eta, \nu)$, has zero component along $\xi$.

Definition 1.24. The Levi-Civita connection on $M$ is the 1-form $\omega \in \Lambda^1(SM)$ defined by
\[ \omega_\xi : T_\xi SM \to \mathbb{R}, \qquad \omega_\xi(z) = u_z, \tag{1.22} \]
where $z = u_z \eta + v_z \nu$ and $(\xi, \eta, \nu)$ is the orthonormal frame defined above.

Notice that $\omega$ changes sign if we change the orientation of $M$.

Lemma 1.25. A curve of unit tangent vectors $\xi(t)$ is parallel if and only if $\omega_{\xi(t)}(\dot\xi(t)) = 0$.

Proof. By definition $\xi(t)$ is parallel if and only if $\dot\xi(t)$ is orthogonal to $T_{\gamma(t)}M$, i.e., collinear to $\nu(t)$.

In particular, a curve parametrized by length $\gamma : [0,T] \to M$ is a geodesic if and only if
\[ \omega_{\dot\gamma(t)}(\ddot\gamma(t)) = 0, \qquad \forall\, t \in [0,T]. \tag{1.23} \]

Proposition 1.26. The Levi-Civita connection $\omega \in \Lambda^1(SM)$ satisfies:
(i) there exist two smooth functions $a_1, a_2 : M \to \mathbb{R}$ such that
\[ \omega = d\theta + a_1(x_1,x_2)\, dx_1 + a_2(x_1,x_2)\, dx_2, \tag{1.24} \]
where $(x_1, x_2, \theta)$ is a system of coordinates on $SM$.

(ii) $d\omega = \pi^*\Omega$, where $\Omega$ is a 2-form defined on $M$ and $\pi : SM \to M$ is the canonical projection.

Proof. (i) Fix a system of coordinates $(x_1, x_2, \theta)$ on $SM$ and consider the vector field $\partial/\partial\theta$ on $SM$. Let us show that
\[ \omega\Big( \frac{\partial}{\partial\theta} \Big) = 1. \]
Indeed consider a curve $t \mapsto \xi(t)$ of unit tangent vectors at a fixed point which describes a rotation in a single fibre. As a curve on $SM$, the velocity of this curve is exactly its orthogonal vector, i.e. $\dot\xi(t) = \eta(t)$, and the equality above follows from the definition of $\omega$. By construction, $\omega$ is invariant by rotations, hence the coefficients $a_i = \omega(\partial/\partial x_i)$ do not depend on the variable $\theta$.

(ii) Follows directly from expression (1.24), noticing that $d\omega$ depends only on $x_1, x_2$.

Remark 1.27. Notice that the functions $a_1, a_2$ in (1.24) are not invariant by change of coordinates on the fiber. Indeed the transformation $\theta \to \theta + \varphi(x_1,x_2)$ induces $d\theta \to d\theta + (\partial_{x_1}\varphi)\, dx_1 + (\partial_{x_2}\varphi)\, dx_2$, which gives $a_i \to a_i + \partial_{x_i}\varphi$ for $i = 1,2$.

By definition $\omega$ is an intrinsic 1-form on $SM$. Its differential, by property (ii) of Proposition 1.26, is the pull-back of an intrinsic 2-form on $M$, that in general is not exact.

Definition 1.28. The area form $dV$ on a surface $M$ is the differential two form that on every tangent space to the manifold agrees with the volume induced by the inner product. In other words, for every positively oriented orthonormal frame $e_1, e_2$ of $T_qM$, one has $dV(e_1, e_2) = 1$. Given a set $\Gamma \subset M$ its area is the quantity $|\Gamma| = \int_\Gamma dV$.

Since any 2-form on $M$ is proportional to the area form $dV$, it makes sense to give the following definition:

Definition 1.29. The Gaussian curvature of $M$ is the function $\kappa : M \to \mathbb{R}$ defined by the equality
\[ \Omega = -\kappa\, dV. \tag{1.25} \]

Note that $\kappa$ does not depend on the orientation of $M$, since both $\Omega$ and $dV$ change sign if we reverse the orientation. Moreover the area 2-form $dV$ on the surface depends only on the metric structure on the surface.

1.3 Gauss-Bonnet Theorems

In this section we prove both the local and the global version of the Gauss-Bonnet theorem. A strong consequence of these results is the celebrated Gauss Theorema Egregium, which says that the Gaussian curvature of a surface is independent of its embedding in $\mathbb{R}^3$.

Definition 1.30. Let $\gamma : [0,T] \to M$ be a smooth curve parametrized by length. The geodesic curvature of $\gamma$ is defined as
\[ \rho_\gamma(t) = \omega_{\dot\gamma(t)}(\ddot\gamma(t)). \tag{1.26} \]

Notice that if $\gamma$ is a geodesic, then $\rho_\gamma(t) = 0$ for every $t \in [0,T]$. The geodesic curvature measures how far a curve is from being a geodesic.

Remark 1.31. The geodesic curvature changes sign if we move along the curve in the opposite direction. Moreover, if $M = \mathbb{R}^2$, it coincides with the usual notion of curvature of a planar curve.

1.3.1 Gauss-Bonnet theorem: local version

Definition 1.32. A curvilinear polygon $\Gamma$ on an oriented surface $M$ is the image of a closed polygon in $\mathbb{R}^2$ under a diffeomorphism. We assume that $\partial\Gamma$ is oriented consistently with the orientation of $M$. In the following we represent $\partial\Gamma = \bigcup_{i=1}^m \gamma_i(I_i)$ where $\gamma_i : I_i \to M$, for $i = 1, \ldots, m$, are smooth curves parametrized by length, with orientation consistent with $\partial\Gamma$. We denote by $\alpha_i$ the external angles at the points where $\partial\Gamma$ is not $C^1$ (see Figure 1.3).

[Figure 1.3: A curvilinear polygon $\Gamma$ with edges $\gamma_1, \ldots, \gamma_5$ and external angles $\alpha_1, \ldots, \alpha_5$.]

Notice that a curvilinear polygon is homeomorphic to a disk.

Theorem 1.33 (Gauss-Bonnet, local version). Let $\Gamma$ be a curvilinear polygon on an oriented surface $M$. Then we have
\[ \int_\Gamma \kappa\, dV + \sum_{i=1}^m \int_{I_i} \rho_{\gamma_i}(t)\, dt + \sum_{i=1}^m \alpha_i = 2\pi. \tag{1.27} \]

Proof. (i) Case $\partial\Gamma$ smooth. In this case $\Gamma$ is the image of the unit (closed) ball $B_1$, centered at the origin of $\mathbb{R}^2$, under a diffeomorphism
\[ F : B_1 \to M, \qquad \Gamma = F(B_1). \]
In what follows we denote by $\gamma : I \to M$ the curve such that $\gamma(I) = \partial\Gamma$. We consider on $B_1$ the vector field
\[ V(x) = x_1 \partial_{x_2} - x_2 \partial_{x_1}, \]
which has an isolated zero at the origin and whose flow is a rotation around zero. Denote by $X := F_* V$ the induced vector field on $M$ with critical point $q_0 = F(0)$.

For $\varepsilon > 0$ small enough, we define (cf. Figure 1.4)
\[ \Gamma_\varepsilon := \Gamma \setminus F(B_\varepsilon), \qquad A_\varepsilon := F(B_\varepsilon), \]
where $B_\varepsilon$ is the ball of radius $\varepsilon$ centered at zero in $\mathbb{R}^2$. We have $\partial\Gamma_\varepsilon = \partial A_\varepsilon \cup \partial\Gamma$. Define the map
\[ \phi : \Gamma_\varepsilon \to SM, \qquad \phi(q) = \frac{X(q)}{\|X(q)\|}. \]

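[Figure 1.4: The map $F$ from $B_1 \setminus B_\varepsilon$ to $\Gamma_\varepsilon \subset M$.]

First notice that
\[ \int_{\phi(\Gamma_\varepsilon)} d\omega = \int_{\phi(\Gamma_\varepsilon)} \pi^*\Omega = \int_{\pi(\phi(\Gamma_\varepsilon))} \Omega = \int_{\Gamma_\varepsilon} \Omega, \tag{1.28} \]
where we used the fact that $\pi(\phi(\Gamma_\varepsilon)) = \Gamma_\varepsilon$. Then let us compute the integral of the curvature $\kappa$ on $\Gamma_\varepsilon$:
\begin{align}
\int_{\Gamma_\varepsilon} \kappa\, dV &= - \int_{\Gamma_\varepsilon} \Omega = - \int_{\phi(\Gamma_\varepsilon)} d\omega && \text{(by (1.28))} \notag \\
&= - \int_{\partial\phi(\Gamma_\varepsilon)} \omega && \text{(by Stokes Theorem)} \notag \\
&= \int_{\phi(\partial A_\varepsilon)} \omega - \int_{\phi(\partial\Gamma)} \omega && \text{(since } \partial\phi(\Gamma_\varepsilon) = \phi(\partial A_\varepsilon) \cup \phi(\partial\Gamma)\text{)} \tag{1.29}
\end{align}
Notice that in the third equality we used the fact that the induced orientation on $\partial\phi(\Gamma_\varepsilon)$ gives opposite orientation on the two terms. Let us treat these two terms separately. The first one, by Proposition 1.26, can be written as
\[ \int_{\phi(\partial A_\varepsilon)} \omega = \int_{\phi(\partial A_\varepsilon)} d\theta + \int_{\phi(\partial A_\varepsilon)} a_1(x_1,x_2)\, dx_1 + a_2(x_1,x_2)\, dx_2. \tag{1.30} \]
The first element of (1.30) is equal to $2\pi$, since we integrate the 1-form $d\theta$ on a closed curve. The second element of (1.30) involves only the base coordinates, so it equals the integral of $a_1 dx_1 + a_2 dx_2$ over $\partial A_\varepsilon$ and, for $\varepsilon \to 0$, satisfies
\[ \Big| \int_{\phi(\partial A_\varepsilon)} a_1(x_1,x_2)\, dx_1 + a_2(x_1,x_2)\, dx_2 \Big| \leq C\, \ell(\partial A_\varepsilon) \to 0. \tag{1.31} \]
Indeed the functions $a_i$ are smooth (hence bounded on compact sets) and the length of $\partial A_\varepsilon$ goes to zero for $\varepsilon \to 0$.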

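Let us now consider the second term of (1.29). Since $\phi(\partial\Gamma)$ is parametrized by the curve $t \mapsto \dot\gamma(t)$ (as a curve on $SM$), we have
\[ \int_{\phi(\partial\Gamma)} \omega = \int_I \omega_{\dot\gamma(t)}(\ddot\gamma(t))\, dt = \int_I \rho_\gamma(t)\, dt. \]
Concluding, we have from (1.29)
\[ \int_\Gamma \kappa\, dV = \lim_{\varepsilon \to 0} \int_{\Gamma_\varepsilon} \kappa\, dV = 2\pi - \int_I \rho_\gamma(t)\, dt, \]
that is (1.27) in the smooth case (i.e. when $\alpha_i = 0$ for all $i$).

(ii) Case $\partial\Gamma$ non smooth. We reduce to the previous case with a sequence of polygons $\Gamma_n$ such that $\partial\Gamma_n$ is smooth and $\Gamma_n$ approximates $\Gamma$ in a smooth way. In particular, we assume that $\partial\Gamma_n$ coincides with $\partial\Gamma$ except in neighborhoods $U_i$, for $i = 1, \ldots, m$, of each point $q_i$ where $\partial\Gamma$ is not smooth, in such a way that the curve $\sigma_i^{(n)}$ that parametrizes $(\partial\Gamma_n \setminus \partial\Gamma) \cap U_i$ satisfies $\ell(\sigma_i^{(n)}) \leq 1/n$.

If we apply the statement of the Theorem for the smooth case to $\Gamma_n$ we have
\[ \int_{\Gamma_n} \kappa\, dV + \int \rho_{\gamma^{(n)}}(t)\, dt = 2\pi, \]
where $\gamma^{(n)}$ is the curve that parametrizes $\partial\Gamma_n$. Since $\Gamma_n$ tends to $\Gamma$ as $n \to \infty$, then
\[ \lim_{n \to \infty} \int_{\Gamma_n} \kappa\, dV = \int_\Gamma \kappa\, dV. \]
We are left to prove that
\[ \lim_{n \to \infty} \int \rho_{\gamma^{(n)}}(t)\, dt = \sum_{i=1}^m \int_{I_i} \rho_{\gamma_i}(t)\, dt + \sum_{i=1}^m \alpha_i. \tag{1.32} \]
For every $n$, let us split the curve $\gamma^{(n)}$ as the union of the smooth curves $\sigma_i^{(n)}$ and $\gamma_i^{(n)}$ as in Figure ??. Then
\[ \int \rho_{\gamma^{(n)}}(t)\, dt = \sum_{i=1}^m \int \rho_{\gamma_i^{(n)}}(t)\, dt + \sum_{i=1}^m \int \rho_{\sigma_i^{(n)}}(t)\, dt. \]
Since the curve $\gamma_i^{(n)}$ tends to $\gamma_i$ for $n \to \infty$ one has
\[ \lim_{n \to \infty} \int \rho_{\gamma_i^{(n)}}(t)\, dt = \int_{I_i} \rho_{\gamma_i}(t)\, dt. \]
Moreover, with computations analogous to those of part (i) of the proof,
\[ \int \rho_{\sigma_i^{(n)}}(t)\, dt = \int_{\phi(\sigma_i^{(n)})} \omega = \int_{\phi(\sigma_i^{(n)})} d\theta + \int_{\phi(\sigma_i^{(n)})} a_1(x_1,x_2)\, dx_1 + a_2(x_1,x_2)\, dx_2, \]
and one has, using that $\ell(\sigma_i^{(n)}) \to 0$,
\[ \int_{\phi(\sigma_i^{(n)})} d\theta \xrightarrow[n \to \infty]{} \alpha_i, \qquad \int_{\phi(\sigma_i^{(n)})} a_1(x_1,x_2)\, dx_1 + a_2(x_1,x_2)\, dx_2 \xrightarrow[n \to \infty]{} 0. \]
Then (1.32) follows.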
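As a sanity check of (1.27) in the smooth case, one can evaluate both terms numerically on a spherical cap. The sketch below is only an illustration under stated assumptions: it takes the unit sphere with outward normal, uses the standard fact that its Gaussian curvature is $\kappa = 1$, and considers the cap of colatitude at most `phi0` around the north pole. The geodesic curvature of the boundary circle is computed from the definition as $\langle \ddot\gamma \mid \eta \rangle$, with $\eta = \nu \times \dot\gamma$ so that $(\dot\gamma, \eta, \nu)$ is positively oriented.

```python
import numpy as np

phi0 = 1.0    # colatitude of the boundary circle (assumed example value)
n = 20000     # number of quadrature nodes

# Curvature term: kappa = 1 and dV = sin(phi) dphi dlambda on the unit sphere
# (midpoint rule in phi, exact integration in lambda).
phi = (np.arange(n) + 0.5) * phi0 / n
curvature_term = 2.0 * np.pi * np.sum(np.sin(phi)) * (phi0 / n)

# Boundary circle parametrized by arc length s, counterclockwise seen from outside.
length = 2.0 * np.pi * np.sin(phi0)
s = (np.arange(n) + 0.5) * length / n
u = s / np.sin(phi0)
gamma = np.stack([np.sin(phi0) * np.cos(u),
                  np.sin(phi0) * np.sin(u),
                  np.full(n, np.cos(phi0))], axis=1)
xi = np.stack([-np.sin(u), np.cos(u), np.zeros(n)], axis=1)                    # first derivative
acc = np.stack([-np.cos(u), -np.sin(u), np.zeros(n)], axis=1) / np.sin(phi0)   # second derivative

nu = gamma                       # outward unit normal of the unit sphere
eta = np.cross(nu, xi)           # completes (xi, eta, nu) to a positive frame
rho = np.sum(acc * eta, axis=1)  # geodesic curvature, here constant and equal to cot(phi0)
boundary_term = np.sum(rho) * (length / n)

total = curvature_term + boundary_term
```

With these choices `total` agrees with $2\pi$ up to quadrature error; the two terms are close to $2\pi(1 - \cos\phi_0)$ and $2\pi\cos\phi_0$ respectively.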

An important corollary is obtained by applying the Gauss-Bonnet Theorem to geodesic triangles. A geodesic triangle $T$ is a curvilinear polygon with $m = 3$ edges and such that every smooth piece of boundary $\gamma_i$ is a geodesic. For a geodesic triangle $T$ we denote by $A_i := \pi - \alpha_i$ its internal angles.

Corollary 1.34. Let $T$ be a geodesic triangle and $A_i(T)$ its internal angles. Then
\[ \kappa(q) = \lim_{|T| \to 0} \frac{\sum_i A_i(T) - \pi}{|T|}. \]

Proof. Fix a geodesic triangle $T$. Using that the geodesic curvature of $\gamma_i$ vanishes, the local version of the Gauss-Bonnet Theorem (1.27) can be rewritten as
\[ \sum_{i=1}^3 A_i = \pi + \int_T \kappa\, dV. \tag{1.33} \]
Dividing by $|T|$ and passing to the limit for $|T| \to 0$ in the class of geodesic triangles containing $q$ one obtains
\[ \kappa(q) = \lim_{|T| \to 0} \frac{1}{|T|} \int_T \kappa\, dV = \lim_{|T| \to 0} \frac{\sum_i A_i(T) - \pi}{|T|}. \]

1.3.2 Gauss-Bonnet theorem: global version

Now we state the global version of the Gauss-Bonnet theorem. In other words we want to generalize (1.27) to the case when $\Gamma$ is a region of $M$ not necessarily homeomorphic to the disk, see for instance Figure 1.5. As we will see, the result depends on the Euler characteristic $\chi(\Gamma)$ of this region.

In what follows, by a triangulation of $M$ we mean a decomposition of $M$ into curvilinear polygons (see Definition 1.32). Notice that every compact surface admits a triangulation. (Formally, a triangulation of a topological space $M$ is a simplicial complex $K$, homeomorphic to $M$, together with a homeomorphism $h : K \to M$.)

Definition 1.35. Let $M \subset \mathbb{R}^3$ be a compact oriented surface with boundary $\partial M$ (possibly with angles). Consider a triangulation of $M$. We define the Euler characteristic of $M$ as
\[ \chi(M) := n_2 - n_1 + n_0, \tag{1.34} \]
where $n_i$ is the number of $i$-dimensional faces in the triangulation.

The Euler characteristic can be defined for every region $\Gamma$ of $M$ in the same way. Here, by a region $\Gamma$ on a surface $M$, we mean a closed domain of the manifold with piecewise smooth boundary.

Remark 1.36. The Euler characteristic is well-defined. Indeed one can show that the quantity (1.34) is invariant under refinement of a triangulation, since at every step of the refinement the alternating sum does not change. Moreover, given two different triangulations of the same region, there always exists a triangulation that is a refinement of both of them. This shows that the quantity (1.34) is independent of the triangulation.

Example 1.37. For a compact connected orientable surface $M_g$ of genus $g$ (i.e., a surface that topologically is a sphere with $g$ handles) one has $\chi(M_g) = 2 - 2g$. For instance one has $\chi(S^2) = 2$ and $\chi(T^2) = 0$, where $T^2$ is the torus. Notice also that $\chi(B_1) = 1$, where $B_1$ is the closed unit disk in $\mathbb{R}^2$.
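The values in the example above can be reproduced by counting faces directly. The following sketch is an illustration with assumed inputs: each surface is described by a list of triangles given as vertex labels, and the function `euler_characteristic` (a hypothetical helper) evaluates $n_2 - n_1 + n_0$ as in (1.34). The three triangulations used here (boundary of a tetrahedron, a single triangle, and a periodic $3 \times 3$ grid with each square cut along a diagonal) are standard choices made for this example.

```python
from itertools import combinations

def euler_characteristic(triangles):
    """Return n_2 - n_1 + n_0 for a triangulation given as triples of vertex labels."""
    faces = {frozenset(t) for t in triangles}
    edges = {frozenset(e) for t in faces for e in combinations(sorted(t), 2)}
    vertices = {v for t in faces for v in t}
    return len(faces) - len(edges) + len(vertices)

# Sphere: boundary of a tetrahedron.
sphere = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

# Closed disk: a single triangle.
disk = [(0, 1, 2)]

# Torus: 3 x 3 grid with opposite sides identified, each square split by a diagonal.
n = 3
torus = []
for i in range(n):
    for j in range(n):
        a = (i, j)
        b = ((i + 1) % n, j)
        c = (i, (j + 1) % n)
        d = ((i + 1) % n, (j + 1) % n)
        torus.append((a, b, d))
        torus.append((a, c, d))

chi_sphere = euler_characteristic(sphere)
chi_disk = euler_characteristic(disk)
chi_torus = euler_characteristic(torus)
```

For the torus the count is 18 faces, 27 edges and 9 vertices, which gives the value 0 stated in the example.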

Following the notation introduced in the previous section, for a given region $\Gamma$, we assume that $\partial\Gamma$ is oriented consistently with the orientation of $M$ and $\partial\Gamma = \bigcup_{i=1}^m \gamma_i(I_i)$ where $\gamma_i : I_i \to M$, for $i = 1, \ldots, m$, are smooth curves parametrized by length (with orientation consistent with $\partial\Gamma$). We denote by $\alpha_i$ the external angles at the points where $\partial\Gamma$ is not $C^1$ (see Figure 1.5).

[Figure 1.5: Gauss-Bonnet Theorem. A region of $M$ decomposed into pieces $\Gamma_1, \ldots, \Gamma_4$.]

Theorem 1.38 (Gauss-Bonnet, global version). Let $\Gamma$ be a region of a compact oriented surface $M$. Then
\[ \int_\Gamma \kappa\, dV + \sum_{i=1}^m \int_{I_i} \rho_{\gamma_i}(t)\, dt + \sum_{i=1}^m \alpha_i = 2\pi \chi(\Gamma). \tag{1.35} \]

Proof. As in the proof of the local version of the Gauss-Bonnet theorem we consider two cases:

(i) Case $\partial\Gamma$ smooth (in particular $\alpha_i = 0$ for all $i$). Consider a triangulation of $\Gamma$ and let $\{\Gamma_j, j = 1, \ldots, n_2\}$ be the corresponding subdivision of $\Gamma$ in curvilinear polygons. We denote by $\{\gamma_k^{(j)}\}$ the smooth curves parametrized by length whose images are the edges of $\Gamma_j$, and by $\theta_k^{(j)}$ the external angles of $\Gamma_j$. We assume that all orientations are chosen according to the orientation of $M$. Applying Theorem 1.33 to every $\Gamma_j$ and summing w.r.t. $j$ we get
\[ \sum_{j=1}^{n_2} \Big( \int_{\Gamma_j} \kappa\, dV + \sum_k \int \rho_{\gamma_k^{(j)}}(t)\, dt + \sum_k \theta_k^{(j)} \Big) = 2\pi n_2. \tag{1.36} \]
We have that
\[ \sum_{j=1}^{n_2} \int_{\Gamma_j} \kappa\, dV = \int_\Gamma \kappa\, dV, \qquad \sum_{j,k} \int \rho_{\gamma_k^{(j)}}(t)\, dt = \sum_{i=1}^m \int \rho_{\gamma_i}(t)\, dt. \tag{1.37} \]
The second equality is a consequence of the fact that every edge of the decomposition that does not belong to $\partial\Gamma$ appears twice in the sum, with opposite sign. It remains to check that
\[ \sum_{j,k} \theta_k^{(j)} = 2\pi (n_1 - n_0). \tag{1.38} \]
Let us denote by $N$ the total number of angles in the sum on the left hand side of (1.38). After reindexing we have to check that
\[ \sum_{\nu=1}^N \theta_\nu = 2\pi (n_1 - n_0). \tag{1.39} \]
Denote by $n_0^\partial$ the number of vertices that belong to $\partial\Gamma$ and set $n_0^I := n_0 - n_0^\partial$. Similarly we define $n_1^\partial$ and $n_1^I$. We have the following relations:
(i) $N = 2 n_1^I + n_1^\partial$,
(ii) $n_0^\partial = n_1^\partial$.
Claim (i) follows from the fact that every curvilinear polygon with $n$ edges has $n$ angles, but the internal edges are counted twice since each of them appears in two polygons. Claim (ii) is a consequence of the fact that $\partial\Gamma$ is the union of closed curves. If we denote by $A_\nu := \pi - \theta_\nu$ the internal angles, we have
\[ \sum_{\nu=1}^N \theta_\nu = N\pi - \sum_{\nu=1}^N A_\nu. \tag{1.40} \]
Moreover the sum of the internal angles is equal to $\pi$ for a boundary vertex, and to $2\pi$ for an internal one. Hence one gets
\[ \sum_{\nu=1}^N A_\nu = 2\pi n_0^I + \pi n_0^\partial. \tag{1.41} \]
Combining (1.40), (1.41) and (i) one has
\[ \sum_{\nu=1}^N \theta_\nu = (2 n_1^I + n_1^\partial)\pi - (2 n_0^I + n_0^\partial)\pi. \]
Using (ii) one finally gets (1.39).

(ii) Case $\partial\Gamma$ non-smooth. We consider a decomposition of $\Gamma$ into curvilinear polygons whose edges intersect the boundary in the smooth part (this is always possible). The proof is identical to the smooth case up to formula (1.37). Now, instead of (1.39), we have to check that
\[ \sum_{\nu=1}^N \theta_\nu = \sum_{i=1}^m \alpha_i + 2\pi (n_1 - n_0). \tag{1.42} \]
Now (1.42) can be rewritten as
\[ \sum_{\nu \notin A} \theta_\nu = 2\pi (n_1 - n_0), \]
where $A$ is the set of indices whose corresponding angles are at non-smooth points of $\partial\Gamma$.

Consider now a new region $\widetilde\Gamma$, obtained by smoothing the edges of $\Gamma$, together with the decomposition induced by $\Gamma$ (see Figure 1.5). Denote by $\tilde n_1$ and $\tilde n_0$ the number of edges and vertices of the decomposition of $\widetilde\Gamma$. Notice that $\{\theta_\nu, \nu \notin A\}$ is exactly the set of all angles of the decomposition of $\widetilde\Gamma$. Moreover $\tilde n_1 - \tilde n_0 = n_1 - n_0$, since $n_0 = \tilde n_0 + m$ and $n_1 = \tilde n_1 + m$, where $m$ is the number of non-smooth points. Hence, by part (i) of the proof:
\[ \sum_{\nu \notin A} \theta_\nu = 2\pi (\tilde n_1 - \tilde n_0) = 2\pi (n_1 - n_0). \]

Corollary 1.39. Let $M$ be a compact oriented surface without boundary. Then
\[ \int_M \kappa\, dV = 2\pi \chi(M). \tag{1.43} \]

1.3.3 Consequences of the Gauss-Bonnet Theorems

Definition 1.40. Let $M, M'$ be two surfaces in $\mathbb{R}^3$. A smooth map $\phi : \mathbb{R}^3 \to \mathbb{R}^3$ is called an isometry between $M$ and $M'$ if $\phi(M) = M'$ and for every $q \in M$ it satisfies
\[ \langle v \mid w \rangle = \langle D_q\phi(v) \mid D_q\phi(w) \rangle, \qquad \forall\, v, w \in T_qM. \tag{1.44} \]
If the property (1.44) is satisfied by a map defined locally in a neighborhood of every point $q$ of $M$, then it is called a local isometry. Two surfaces $M$ and $M'$ are said to be isometric (resp. locally isometric) if there exists an isometry (resp. local isometry) between $M$ and $M'$. Notice that the restriction $\phi$ of a global isometry $\Phi$ of $\mathbb{R}^3$ to a surface $M \subset \mathbb{R}^3$ always defines an isometry between $M$ and $M' = \phi(M)$.

From (1.44) it follows that an isometry preserves the angles between vectors and, a fortiori, the length of a curve and the distance between two points.

From Corollary 1.34, and the fact that the angles and the volumes are preserved by isometries, one obtains that the Gaussian curvature is invariant by local isometries, in the following sense.

Corollary 1.41 (Gauss's Theorema Egregium). Assume $\phi$ is a local isometry between $M$ and $M'$. Then for every $q \in M$ one has $\kappa(q) = \kappa'(\phi(q))$, where $\kappa$ (resp. $\kappa'$) is the Gaussian curvature of $M$ (resp. $M'$).

This Theorem says that the Gaussian curvature $\kappa$ depends only on the metric structure on $M$ and not on the specific fact that the surface is embedded in $\mathbb{R}^3$ with the induced inner product.

Corollary 1.42. Let $M$ be a surface and $q \in M$. If $\kappa(q) \neq 0$ then $M$ is not locally isometric to $\mathbb{R}^2$ in a neighborhood of $q$.

Exercise 1.43. Prove that a surface $M$ is locally isometric to the Euclidean plane $\mathbb{R}^2$ around a point $q \in M$ if and only if there exists a coordinate system $(x_1, x_2)$ in a neighborhood $U$ of $q \in M$ such that the vectors $\partial_{x_1}$ and $\partial_{x_2}$ have unit length and are everywhere orthogonal.

As a converse of Corollary 1.42 we have the following.

Theorem 1.44. Assume that $\kappa \equiv 0$ in a neighborhood of a point $q \in M$. Then $M$ is locally Euclidean (i.e., locally isometric to $\mathbb{R}^2$) around $q$.

Proof. From our assumptions we have, in a neighborhood $U$ of $q$:
\[ \Omega = -\kappa\, dV = 0. \]
Hence $d\omega = \pi^*\Omega = 0$. From its explicit expression
\[ \omega = d\theta + a_1(x_1,x_2)\, dx_1 + a_2(x_1,x_2)\, dx_2, \]
it follows that the 1-form $a_1 dx_1 + a_2 dx_2$ is locally exact, i.e. there exists a neighborhood $W$ of $q$, $W \subset U$, and a function $\phi : W \to \mathbb{R}$ such that $a_1(x_1,x_2)\, dx_1 + a_2(x_1,x_2)\, dx_2 = d\phi$. Hence
\[ \omega = d(\theta + \phi(x_1,x_2)). \]
Thus we can define a new angular coordinate on $SM$, which we still denote by $\theta$, in such a way that (see also Remark 1.27)
\[ \omega = d\theta. \tag{1.45} \]
Now, let $\gamma$ be a length-parametrized geodesic, i.e. $\omega_{\dot\gamma(t)}(\ddot\gamma(t)) = 0$. Using the angular coordinate $\theta$ just defined on the fibers of $SM$, the curve $t \mapsto \dot\gamma(t) \in S_{\gamma(t)}M$ is written as $t \mapsto \theta(t)$. Using (1.45), we then have
\[ 0 = \omega_{\dot\gamma(t)}(\ddot\gamma(t)) = d\theta(\ddot\gamma(t)) = \dot\theta(t). \]
In other words the angular coordinate of a geodesic $\gamma$ is constant.

We want to construct Cartesian coordinates in a neighborhood $U$ of $q$. Consider the two length-parametrized geodesics $\gamma_1$ and $\gamma_2$ starting from $q$ and such that $\theta_1(0) = 0$, $\theta_2(0) = \pi/2$. Define them to be the $x_1$-axis and the $x_2$-axis of our coordinate system, respectively.

Then, for each point $q' \in U$ consider the two geodesics starting from $q'$ and satisfying $\theta_1(0) = 0$ and $\theta_2(0) = \pi/2$. We assign coordinates $(x_1, x_2)$ to each point $q'$ in $U$ by considering the length parameter of the geodesic projection of $q'$ on $\gamma_1$ and $\gamma_2$ (see Figure 1.6). Notice that the geodesics of the two families constructed in this way, parametrized by $q' \in U$, are mutually orthogonal at every point.

By construction, in this coordinate system the vectors $\partial_{x_1}$ and $\partial_{x_2}$ have length one (being the tangent vectors to length-parametrized geodesics) and are everywhere mutually orthogonal. Hence the theorem follows from Exercise 1.43.

1.3.4 The Gauss map

We end this section with a geometric characterization of the Gaussian curvature of a manifold $M$, using the Gauss map.

Definition 1.45. Let $M$ be an oriented surface. We define the Gauss map associated to $M$ as
\[ N : M \to S^2, \qquad q \mapsto \nu_q, \tag{1.46} \]
where $\nu_q \in S^2 \subset \mathbb{R}^3$ denotes the external unit normal vector to $M$ at $q$.

[Figure 1.6: Proof of Theorem 1.44. The coordinate axes $\gamma_1$, $\gamma_2$ through $q$ and the geodesic projections of a point $q'$.]

Let us consider the differential of the Gauss map at the point $q$,
\[ D_qN : T_qM \to T_{N(q)}S^2 \simeq T_qM, \]
where an element tangent to the sphere $S^2$ at $N(q)$, being orthogonal to $N(q)$, is identified with a tangent vector to $M$ at $q$.

Theorem 1.46. We have that $\kappa(q) = \det(D_qN)$.

Before proving this theorem we prove an important property of the Gauss map.

Lemma 1.47. For every $q \in M$, the differential $D_qN$ of the Gauss map is a symmetric operator, i.e.,
\[ \langle D_qN(\xi) \mid \eta \rangle = \langle \xi \mid D_qN(\eta) \rangle, \qquad \forall\, \xi, \eta \in T_qM. \tag{1.47} \]

Proof. We prove the statement locally, i.e., for a manifold $M$ parametrized by a function $\phi : \mathbb{R}^2 \to M$. In this case $T_qM = \operatorname{Im} D_u\phi$, where $\phi(u) = q$. Let $v, w \in \mathbb{R}^2$ be such that $\xi = D_u\phi(v)$ and $\eta = D_u\phi(w)$. Since $N(q) \perp T_qM$ we have
\[ \langle N(q) \mid \eta \rangle = \langle N(q) \mid D_u\phi(w) \rangle = 0. \]
Taking the derivative in the direction of $\xi$ one gets
\[ \langle D_qN(\xi) \mid \eta \rangle + \langle N(q) \mid D^2_u\phi(v,w) \rangle = 0, \]
where $D^2_u\phi$ is a bilinear symmetric map. Now (1.47) follows by exchanging the roles of $v$ and $w$.

Proof of Theorem 1.46. We will use Cartan's moving frame method. Let $\xi \in SM$ and denote by $(e_1(\xi), e_2(\xi), e_3(\xi))$, $e_i : SM \to \mathbb{R}^3$, the orthonormal basis attached at $\xi$ and constructed in Section 1.2. Let us compute the differentials of these vectors in the ambient space $\mathbb{R}^3$ and write them as a linear combination (with 1-forms as coefficients) of the vectors $e_i$:
\[ d_\xi e_i(\eta) = \sum_{j=1}^3 (\omega_\xi)_{ij}(\eta)\, e_j(\xi), \qquad \omega_{ij} \in \Lambda^1 SM, \quad \eta \in T_\xi SM. \]

Dropping $\xi$ and $\eta$ from the notation one gets the relation
\[ de_i = \sum_{j=1}^3 \omega_{ij}\, e_j, \qquad \omega_{ij} \in \Lambda^1 SM. \]
Since for each $\xi$ the basis $(e_1(\xi), e_2(\xi), e_3(\xi))$ is orthonormal (hence can be seen as an element of $SO(3)$), its derivative is expressed through a skew-symmetric matrix (i.e., $\omega_{ij} = -\omega_{ji}$) and one gets the equations
\begin{align}
de_1 &= \omega_{12}\, e_2 + \omega_{13}\, e_3, \notag \\
de_2 &= -\omega_{12}\, e_1 + \omega_{23}\, e_3, \tag{1.48} \\
de_3 &= -\omega_{13}\, e_1 - \omega_{23}\, e_2. \notag
\end{align}
Let us now prove the following identity
\[ \omega_{13} \wedge \omega_{23} = -d\omega_{12}. \tag{1.49} \]
Indeed, differentiating the first equation in (1.48) one gets, using that $d^2 = 0$ and the rule $d(\omega\, e) = d\omega\, e - \omega \wedge de$,
\begin{align*}
0 = d^2 e_1 &= d\omega_{12}\, e_2 - \omega_{12} \wedge de_2 + d\omega_{13}\, e_3 - \omega_{13} \wedge de_3 \\
&= (d\omega_{12} + \omega_{13} \wedge \omega_{23})\, e_2 + (d\omega_{13} - \omega_{12} \wedge \omega_{23})\, e_3,
\end{align*}
which implies in particular (1.49).

The statement of the theorem can be rewritten as an identity between 2-forms as follows:
\[ \det(D_qN)\, dV = \kappa\, dV. \]
Applying $\pi^*$ to both sides, and recalling that $d\omega = \pi^*\Omega = -\pi^*(\kappa\, dV)$, one gets
\[ \pi^*(\det(D_qN)\, dV) = \pi^*(\kappa\, dV) = -d\omega, \tag{1.50} \]
where $\omega$ is the Levi-Civita connection. Let us show that (1.50) is equivalent to (1.49). Indeed, by construction $\omega_{12}$ computes the coefficient of the derivative of the first vector of the orthonormal basis along the second one, hence $\omega_{12} = \omega$ (see also Definition 1.24). It remains to show that
\[ \omega_{13} \wedge \omega_{23} = \pi^*(\det(D_qN)\, dV) = \det(D_{\pi(\xi)}N)\, \pi^* dV. \]
Since $e_3 = N \circ \pi$, where $\pi : SM \to M$ is the canonical projection, one has
\[ D_qN \circ \pi_* = de_3 = -\omega_{13}\, e_1 - \omega_{23}\, e_2. \]
The proof is completed by the following exercise.

Exercise 1.48. Let $V$ be a 2-dimensional Euclidean vector space and $e_1, e_2$ an orthonormal basis. Let $F : V \to V$ be a linear map and write $F = F_1 e_1 + F_2 e_2$, where $F_i : V \to \mathbb{R}$ are linear functionals. Prove that $F_1 \wedge F_2 = (\det F)\, dV$, where $dV$ is the area form induced by the inner product.
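The identity $\kappa(q) = \det(D_qN)$ lends itself to a direct numerical check on level sets. The sketch below is an illustration under stated assumptions, not the method of the text: for $M = a^{-1}(0)$ it takes $N = \nabla a / \|\nabla a\|$ as unit normal, for which differentiation gives $D_qN(v) = (I - N N^T)(\nabla^2 a)\, v / \|\nabla a\|$ on tangent vectors, and it evaluates the determinant of this map in an orthonormal basis of $T_qM$. The function `gauss_curvature` and the three test surfaces are hypothetical choices for this example.

```python
import numpy as np

def gauss_curvature(q, grad_a, hess_a):
    """det of the differential of N = grad a / |grad a|, restricted to the tangent plane at q."""
    g = grad_a(q)
    norm_g = np.linalg.norm(g)
    n = g / norm_g
    # Orthonormal basis (e1, e2) of the tangent plane, built from a coordinate axis
    # that is far from the normal direction.
    helper = np.eye(3)[np.argmin(np.abs(n))]
    e1 = np.cross(n, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(n, e1)
    dN = (np.eye(3) - np.outer(n, n)) @ hess_a(q) / norm_g
    E = np.stack([e1, e2], axis=1)          # 3 x 2 matrix with columns e1, e2
    return np.linalg.det(E.T @ dN @ E)

# Sphere of radius 2: a = x^2 + y^2 + z^2 - 4.
k_sphere = gauss_curvature(np.array([2.0, 4.0, 4.0]) / 3.0,
                           lambda q: 2.0 * q,
                           lambda q: 2.0 * np.eye(3))

# Ellipsoid x^2 + y^2/4 + z^2/9 = 1 at the point (1, 0, 0).
D = np.diag([1.0, 1.0 / 4.0, 1.0 / 9.0])
k_ellipsoid = gauss_curvature(np.array([1.0, 0.0, 0.0]),
                              lambda q: 2.0 * D @ q,
                              lambda q: 2.0 * D)

# Cylinder x^2 + y^2 = 1 at the point (1, 0, 0.5).
P = np.diag([1.0, 1.0, 0.0])
k_cylinder = gauss_curvature(np.array([1.0, 0.0, 0.5]),
                             lambda q: 2.0 * P @ q,
                             lambda q: 2.0 * P)
```

The determinant of a linear map of a 2-dimensional space does not change when the normal is reversed, so the result does not depend on which of the two unit normals is called external. The values obtained are $1/4$ for the sphere of radius 2, $1/36$ at the chosen point of the ellipsoid, and $0$ for the cylinder.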

Remark 1.49. Lemma 1.47 allows us to define the principal curvatures of $M$ at the point $q$ as the two real eigenvalues $k_1(q), k_2(q)$ of the map $D_qN$. In particular
\[ \kappa(q) = k_1(q)\, k_2(q), \qquad q \in M. \]
The principal curvatures can be geometrically interpreted as the maximum and the minimum of the curvature of sections of $M$ with orthogonal planes.

Notice moreover that, using the Gauss-Bonnet theorem, one can relate the degree of the map $N$ with the Euler characteristic of $M$ as follows:
\[ \deg N := \frac{1}{\operatorname{Area}(S^2)} \int_M (\det D_qN)\, dV = \frac{1}{4\pi} \int_M \kappa\, dV = \frac{1}{2} \chi(M). \]

1.4 Surfaces in $\mathbb{R}^3$ with the Minkowski inner product

The theory and the results obtained in this chapter can be adapted to the case when $M \subset \mathbb{R}^3$ is a surface in the Minkowski 3-space, that is $\mathbb{R}^3$ endowed with the hyperbolic (or Minkowski-type) inner product
\[ \langle q_1, q_2 \rangle_h = x_1 x_2 + y_1 y_2 - z_1 z_2. \tag{1.51} \]
Here $q_i = (x_i, y_i, z_i)$ for $i = 1,2$, are two points in $\mathbb{R}^3$. We denote by $\|q\|_h = \langle q, q \rangle_h^{1/2}$ the norm induced by the inner product (1.51).

For the metric structure to be defined on $M$, we require that the restriction of the inner product (1.51) to the tangent space to $M$ is positive definite at every point. Indeed, under this assumption, the inner product (1.51) can be used to define the length of a tangent vector to the surface (which is non-negative). Thus one can introduce the length of (piecewise) smooth curves on $M$ and its distance by the same formulas as in Section 1.1. These surfaces are also called space-like surfaces in the Minkowski space.

The structure of the inner product imposes some conditions on the structure of space-like surfaces, as the following exercise shows.

Exercise 1.50. Let $M$ be a space-like surface in $\mathbb{R}^3$ endowed with the inner product (1.51).
(i) Show that if $v$ is a nonzero vector that is orthogonal to $T_qM$, then $\langle v \mid v \rangle_h < 0$.
(ii) Prove that, if $M$ is compact, then $\partial M \neq \emptyset$.
(iii) Show that the restriction to $M$ of the projection $\pi(x,y,z) = (x,y)$ onto the $xy$-plane is a local diffeomorphism.
(iv) Show that $M$ is locally a graph on the plane $\{z = 0\}$.

The results obtained in the previous sections for surfaces embedded in $\mathbb{R}^3$ can be recovered for space-like surfaces by simply adapting all formulas to their hyperbolic counterpart. For instance, geodesics are defined as curves of unit speed whose second derivative is orthogonal, with respect to $\langle \cdot \mid \cdot \rangle_h$, to the tangent space to $M$.

For a smooth function $a : \mathbb{R}^3 \to \mathbb{R}$, its hyperbolic gradient $\nabla^h_q a$ is defined as
\[ \nabla^h_q a = \Big( \frac{\partial a}{\partial x}, \frac{\partial a}{\partial y}, -\frac{\partial a}{\partial z} \Big). \]

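Assume that $M = a^{-1}(0)$ is a regular level set of a smooth function $a : \mathbb{R}^3 \to \mathbb{R}$. If $\gamma(t)$ is a curve contained in $M$, i.e. $a(\gamma(t)) = 0$, one has the identity
\[ 0 = \langle \nabla^h_{\gamma(t)} a \mid \dot\gamma(t) \rangle_h. \]
The same computation shows that $\nabla^h_{\gamma(t)} a$ is orthogonal to the level sets of $a$, where orthogonal always means with respect to $\langle \cdot \mid \cdot \rangle_h$. In particular, if $M = a^{-1}(0)$ is space-like, one has $\langle \nabla^h_q a \mid \nabla^h_q a \rangle_h < 0$.

Exercise 1.51. Let $\gamma$ be a geodesic on $M = a^{-1}(0)$. Show that $\gamma$ satisfies the equation (in matrix notation)
\[ \ddot\gamma(t) = - \frac{\dot\gamma(t)^T (\nabla^2_{\gamma(t)} a)\, \dot\gamma(t)}{\| \nabla^h_{\gamma(t)} a \|_h^2}\, \nabla^h_{\gamma(t)} a, \qquad \forall\, t \in [0,T], \tag{1.52} \]
where $\nabla^2_{\gamma(t)} a$ is the (classical) matrix of second derivatives of $a$ and $\| \nabla^h a \|_h^2$ stands for $\langle \nabla^h a \mid \nabla^h a \rangle_h$, which is negative on a space-like surface. (Otherwise one can write the numerator of (1.52) as $\langle \nabla^{2,h}_{\gamma(t)} \dot\gamma(t) \mid \dot\gamma(t) \rangle_h$, where $\nabla^{2,h}_{\gamma(t)}$ is the hyperbolic Hessian.)

Given a smooth curve $\gamma : [0,T] \to M$ on a surface $M$, a smooth curve of tangent vectors $\xi(t) \in T_{\gamma(t)}M$ is said to be parallel if $\dot\xi(t) \perp T_{\gamma(t)}M$, with respect to the hyperbolic inner product. It is then straightforward to check that, if $M$ is the zero level of a smooth function $a : \mathbb{R}^3 \to \mathbb{R}$, then $\xi(t)$ is parallel along $\gamma$ if and only if it satisfies
\[ \dot\xi(t) = - \frac{\dot\gamma(t)^T (\nabla^2_{\gamma(t)} a)\, \xi(t)}{\| \nabla^h_{\gamma(t)} a \|_h^2}\, \nabla^h_{\gamma(t)} a, \qquad \forall\, t \in [0,T]. \tag{1.53} \]
By definition a smooth curve $\gamma : [0,T] \to M$ is a geodesic if and only if $\dot\gamma$ is parallel along $\gamma$.

Remark 1.52. As for surfaces in the Euclidean space, given a curve $\gamma : [0,T] \to M$ and an initial datum $v \in T_{\gamma(0)}M$, there is a unique parallel curve of tangent vectors $\xi(t) \in T_{\gamma(t)}M$ along $\gamma$ such that $\xi(0) = v$. Moreover the operator $\xi(0) \mapsto \xi(t)$ is a linear operator, which is the parallel transport of $v$ along $\gamma$.

Exercise 1.53. Show that if $\xi(t), \eta(t)$ are two parallel curves of tangent vectors along $\gamma$, then we have
\[ \frac{d}{dt} \langle \xi(t) \mid \eta(t) \rangle_h = 0, \qquad \forall\, t \in [0,T]. \tag{1.54} \]

Assume that $M$ is oriented. Given an element $\xi \in S_qM$ we can complete it to an orthonormal frame $(\xi, \eta, \nu)$ of $\mathbb{R}^3$ in the following unique way:
(i) $\eta \in T_qM$ is orthogonal to $\xi$ with respect to $\langle \cdot \mid \cdot \rangle_h$ and $(\xi, \eta)$ is positively oriented (w.r.t. the orientation of $M$),
(ii) $\nu \perp T_qM$ with respect to $\langle \cdot \mid \cdot \rangle_h$ and $(\xi, \eta, \nu)$ is positively oriented (w.r.t. the orientation of $\mathbb{R}^3$).

For a smooth curve of unit tangent vectors $\xi(t) \in S_{\gamma(t)}M$ along a curve $\gamma : [0,T] \to M$ we define $\eta(t), \nu(t)$ as above and we can write
\[ \dot\xi(t) = u_\xi(t)\, \eta(t) + v_\xi(t)\, \nu(t). \]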

Definition 1.54. The hyperbolic Levi-Civita connection on $M$ is the 1-form $\omega \in \Lambda^1(SM)$ defined by
\[ \omega_\xi : T_\xi SM \to \mathbb{R}, \qquad \omega_\xi(z) = u_z, \tag{1.55} \]
where $z = u_z \eta + v_z \nu$ and $(\xi, \eta, \nu)$ is the orthonormal frame defined above.

It is again easy to check that a curve of unit tangent vectors $\xi(t)$ is parallel if and only if $\omega_{\xi(t)}(\dot\xi(t)) = 0$, and that a curve parametrized by length $\gamma : [0,T] \to M$ is a geodesic if and only if
\[ \omega_{\dot\gamma(t)}(\ddot\gamma(t)) = 0, \qquad \forall\, t \in [0,T]. \tag{1.56} \]

Exercise 1.55. Prove that the hyperbolic Levi-Civita connection $\omega \in \Lambda^1(SM)$ satisfies:
(i) there exist two smooth functions $a_1, a_2 : M \to \mathbb{R}$ such that
\[ \omega = d\theta + a_1(x_1,x_2)\, dx_1 + a_2(x_1,x_2)\, dx_2, \tag{1.57} \]
where $(x_1, x_2, \theta)$ is a system of coordinates on $SM$;
(ii) $d\omega = \pi^*\Omega$, where $\Omega$ is a 2-form defined on $M$ and $\pi : SM \to M$ is the canonical projection.

Again one can introduce the area form $dV$ on $M$ induced by the inner product, and it makes sense to give the following definition:

Definition 1.56. The Gaussian curvature of a surface $M$ in the Minkowski 3-space is the function $\kappa : M \to \mathbb{R}$ defined by the equality
\[ \Omega = -\kappa\, dV. \tag{1.58} \]

By reasoning as in the Euclidean case, one can define the geodesic curvature of a curve and prove the analogue of the Gauss-Bonnet theorem in this context. As a consequence one gets that the Gaussian curvature is again invariant under isometries of $M$, and hence is an intrinsic quantity that depends only on the metric properties of the surface and not on the fact that its metric is obtained as the restriction of some metric defined in the ambient space.

Finally, one can define the hyperbolic Gauss map.

Definition 1.57. Let $M$ be an oriented surface. We define the Gauss map
\[ N : M \to H^2, \qquad q \mapsto \nu_q, \tag{1.59} \]
where $\nu_q \in H^2 \subset \mathbb{R}^3$ denotes the external unit normal vector to $M$ at $q$, with respect to the Minkowski inner product.

Let us now consider the differential of the Gauss map at the point $q$:
\[ D_qN : T_qM \to T_{N(q)}H^2 \simeq T_qM, \]
where an element tangent to the hyperbolic plane $H^2$ at $N(q)$, being orthogonal to $N(q)$, is identified with a tangent vector to $M$ at $q$.

Theorem 1.58. The differential of the Gauss map $D_qN$ is symmetric, and $\kappa(q) = -\det(D_qN)$.

1.5 Model spaces of constant curvature

In this section we briefly discuss surfaces embedded in $\mathbb{R}^3$ (with Euclidean or Lorentzian inner product) that have constant Gaussian curvature, playing the role of model spaces. For each model we are interested in describing geodesics and, more generally, curves of constant geodesic curvature. These results will be useful in the study of sub-Riemannian model spaces in dimension three (cf. Chapter ??).

Assume that the surface $M$ has constant Gaussian curvature $\kappa\in\mathbb{R}$. We already know that $\kappa$ is a metric invariant of the surface, i.e., it does not depend on the embedding of the surface in $\mathbb{R}^3$. We distinguish the following three cases:
(i) $\kappa = 0$: this is the flat model of the classical Euclidean plane,
(ii) $\kappa > 0$: this corresponds to the case of the sphere,
(iii) $\kappa < 0$: this corresponds to the hyperbolic plane.
We briefly discuss case (i), since it is trivial, and study in some more detail cases (ii) and (iii) of spherical and hyperbolic geometry.

Zero curvature: the Euclidean plane

The Euclidean plane can be realized as the surface of $\mathbb{R}^3$ defined by the zero level set of the function
\[ a:\mathbb{R}^3\to\mathbb{R},\qquad a(x,y,z) = z. \]
It is an easy exercise, applying the results of the previous sections, to show that the curvature of this surface is zero (the Gauss map is constant) and to characterize geodesics and curves with constant curvature.

Exercise. Prove that geodesics on the Euclidean plane are lines. Moreover, show that curves with constant curvature $c\neq 0$ are circles of radius $1/|c|$.

Positive curvature: spheres

Let us consider the sphere $S^2_r$ of radius $r$ as the surface of $\mathbb{R}^3$ defined as the zero level set of the function
\[ S^2_r = a^{-1}(0),\qquad a(x,y,z) = x^2+y^2+z^2-r^2. \tag{1.60} \]
If we denote, as usual, by $\langle\cdot\,|\,\cdot\rangle$ the Euclidean inner product in $\mathbb{R}^3$, then $S^2_r$ can also be viewed as the set of points $q=(x,y,z)$ whose Euclidean norm is constant:
\[ S^2_r = \{q\in\mathbb{R}^3 \mid \langle q\,|\,q\rangle = r^2\}. \]
The Gauss map associated with this surface can be easily computed, since it is explicitly given by
\[ N: S^2_r\to S^2,\qquad N(q) = \frac{1}{r}\,q. \tag{1.61} \]
It follows immediately from (1.61), whose differential is $\frac{1}{r}\,\mathrm{Id}$, that the Gaussian curvature of the sphere is $\kappa = 1/r^2$ at every point $q\in S^2_r$. Let us now recover the structure of geodesics and constant geodesic curvature curves on the sphere.

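Proposition 1.60. Let $\gamma:[0,T]\to S^2_r$ be a curve with constant geodesic curvature equal to $c\in\mathbb{R}$. For every vector $w\in\mathbb{R}^3$ the function $\alpha(t) = \langle\dot\gamma(t)\,|\,w\rangle$ is a solution of the differential equation
\[ \ddot\alpha(t) + \left(c^2+\frac{1}{r^2}\right)\alpha(t) = 0. \]

Proof. Without loss of generality, we can assume that $\gamma$ is parametrized by unit speed. Differentiating twice the equality $a(\gamma(t)) = 0$, where $a$ is the function defined in (1.60), we get (in matrix notation):
\[ \dot\gamma(t)^T\bigl(\nabla^2_{\gamma(t)}a\bigr)\dot\gamma(t) + \ddot\gamma(t)^T\,\nabla_{\gamma(t)}a = 0. \]
Moreover, since $\|\dot\gamma(t)\|$ is constant and $\gamma$ has constant geodesic curvature equal to $c$, there exists a function $b(t)$ such that
\[ \ddot\gamma(t) = b(t)\,\nabla_{\gamma(t)}a + c\,\eta(t), \tag{1.62} \]
where $\eta(t) = \dot\gamma(t)^\perp$ is the vector orthogonal to $\dot\gamma(t)$ in $T_{\gamma(t)}S^2_r$ (defined in such a way that $\dot\gamma(t)$ and $\eta(t)$ form a positively oriented frame). Reasoning as in the proof of Proposition 1.9 and noticing that $\nabla_{\gamma(t)}a$ is proportional to the vector $\gamma(t)$, one can compute $b(t)$ and obtain that $\gamma$ satisfies the differential equation
\[ \ddot\gamma(t) = -\frac{1}{r^2}\gamma(t) + c\,\eta(t). \tag{1.63} \]

Lemma 1.61. $\dot\eta(t) = -c\,\dot\gamma(t)$.

Proof of Lemma 1.61. The curve $\eta(t)$ has constant norm, hence $\dot\eta(t)$ is orthogonal to $\eta(t)$. Recall that the triple $(\gamma(t),\dot\gamma(t),\eta(t))$ defines an orthogonal frame at every point. Differentiating the identity $\langle\eta(t)\,|\,\gamma(t)\rangle = 0$ with respect to $t$ one has
\[ 0 = \langle\dot\eta(t)\,|\,\gamma(t)\rangle + \langle\eta(t)\,|\,\dot\gamma(t)\rangle = \langle\dot\eta(t)\,|\,\gamma(t)\rangle. \]
Hence $\dot\eta(t)$ has nonvanishing component only along $\dot\gamma(t)$. Differentiating the identity $\langle\eta(t)\,|\,\dot\gamma(t)\rangle = 0$ one obtains
\[ 0 = \langle\dot\eta(t)\,|\,\dot\gamma(t)\rangle + \langle\eta(t)\,|\,\ddot\gamma(t)\rangle = \langle\dot\eta(t)\,|\,\dot\gamma(t)\rangle + c, \]
where we used (1.63). Hence
\[ \dot\eta(t) = \langle\dot\eta(t)\,|\,\dot\gamma(t)\rangle\,\dot\gamma(t) = -c\,\dot\gamma(t). \]

Next we compute the derivatives of the function $\alpha$ as follows:
\[ \dot\alpha(t) = \langle\ddot\gamma(t)\,|\,w\rangle = -\frac{1}{r^2}\langle\gamma(t)\,|\,w\rangle + c\,\langle\eta(t)\,|\,w\rangle. \tag{1.64} \]
Using Lemma 1.61, we have
\[ \ddot\alpha(t) = -\frac{1}{r^2}\langle\dot\gamma(t)\,|\,w\rangle + c\,\langle\dot\eta(t)\,|\,w\rangle \tag{1.65} \]
\[ \phantom{\ddot\alpha(t)} = -\frac{1}{r^2}\langle\dot\gamma(t)\,|\,w\rangle - c^2\langle\dot\gamma(t)\,|\,w\rangle = -\left(\frac{1}{r^2}+c^2\right)\alpha(t), \tag{1.66} \]
which ends the proof of the Proposition.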
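The differential equation of the proposition can be sanity-checked numerically on a circle of latitude. The following is a minimal sketch, not part of the original text: it assumes the standard fact that the latitude circle at polar angle `theta0` on the sphere of radius `r` has geodesic curvature of absolute value $\cot(\theta_0)/r$, and the names `alpha` and `ode_residual` are hypothetical helpers introduced only for this illustration.

```python
import math

def alpha(t, r, theta0, w):
    """alpha(t) = <gamma'(t) | w> for the unit-speed latitude circle
    gamma(t) = (rho cos(t/rho), rho sin(t/rho), r cos(theta0)), rho = r sin(theta0)."""
    rho = r * math.sin(theta0)
    velocity = (-math.sin(t / rho), math.cos(t / rho), 0.0)
    return sum(v_i * w_i for v_i, w_i in zip(velocity, w))

def ode_residual(t, r, theta0, w, h=1e-3):
    """Central-difference value of alpha'' + (c^2 + 1/r^2) alpha at time t."""
    # Assumed (standard) formula for the geodesic curvature of a latitude circle.
    c = math.cos(theta0) / (r * math.sin(theta0))
    second = (alpha(t + h, r, theta0, w) - 2.0 * alpha(t, r, theta0, w)
              + alpha(t - h, r, theta0, w)) / h**2
    return second + (c**2 + 1.0 / r**2) * alpha(t, r, theta0, w)

# The residual should vanish up to discretization error, for any choice of w.
print(ode_residual(0.7, 2.0, 1.0, (0.3, -1.2, 0.5)))
```

The check works because $c^2 + 1/r^2 = 1/(r\sin\theta_0)^2$, which is exactly the squared angular speed of the unit-speed parametrization.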

Corollary. Constant geodesic curvature curves are contained in the intersection of $S^2_r$ with an affine plane of $\mathbb{R}^3$. In particular, geodesics are contained in the intersection of $S^2_r$ with planes passing through the origin, i.e., great circles.

Proof. Let us fix a vector $w\in\mathbb{R}^3$ that is orthogonal to $\dot\gamma(0)$ and $\ddot\gamma(0)$. Let us then prove that $\alpha(t) := \langle\dot\gamma(t)\,|\,w\rangle = 0$ for all $t\in[0,T]$. By Proposition 1.60, the function $\alpha(t)$ is a solution of the Cauchy problem
\[ \begin{cases} \ddot\alpha(t) + \left(\dfrac{1}{r^2}+c^2\right)\alpha(t) = 0,\\[4pt] \alpha(0) = \dot\alpha(0) = 0. \end{cases} \tag{1.67} \]
Since (1.67) admits the unique solution $\alpha(t) = 0$ for all $t$, the function $t\mapsto\langle\gamma(t)\,|\,w\rangle$ has zero derivative. Hence it is constant, and $\gamma$ is contained in an affine plane orthogonal to $w$.

If the curve is a geodesic, then $c = 0$ and the geodesic equation is written as $\ddot\gamma(t) = -\frac{1}{r^2}\gamma(t)$. Then consider the function $\Gamma(t) := \langle\gamma(t)\,|\,w\rangle$, where $w$ is chosen as before. $\Gamma(t)$ is constant since $\dot\Gamma(t) = \alpha(t) = 0$. In fact $\Gamma(t)$ is identically zero, since $\Gamma(0) = \langle\gamma(0)\,|\,w\rangle = -r^2\langle\ddot\gamma(0)\,|\,w\rangle = 0$ by the assumption on $w$. This proves that the curve $\gamma$ is contained in a plane passing through the origin.

Remark. Curves with constant geodesic curvature on the sphere are circles obtained as the intersection of the sphere with an affine plane. Moreover, all these curves can also be characterized in the following two ways:
(i) curves that have constant distance from a geodesic (equidistant curves),
(ii) boundaries of metric balls (spheres).

Negative curvature: the hyperbolic plane

The negative constant curvature model is the hyperbolic plane $H^2_r$, obtained as the surface of $\mathbb{R}^3$, endowed with the hyperbolic metric, defined as the zero level set of the function
\[ a(x,y,z) = x^2+y^2-z^2+r^2. \tag{1.68} \]
This surface is a two-sheeted hyperboloid, so we restrict our attention to the set of points $H^2_r = a^{-1}(0)\cap\{z>0\}$.

In analogy with the positive constant curvature model (which is the set of points in $\mathbb{R}^3$ whose Euclidean norm is constant), the negative constant curvature model can be seen as the set of points whose hyperbolic norm is constant in $\mathbb{R}^3$. In other words,
\[ H^2_r = \{q=(x,y,z)\in\mathbb{R}^3 \mid \|q\|_h^2 = -r^2\}\cap\{z>0\}. \]
The hyperbolic Gauss map associated with this surface can be easily computed, since it is explicitly given by
\[ N: H^2_r\to H^2,\qquad N(q) = \frac{1}{r}\,q. \tag{1.69} \]

Exercise. Prove that the Gaussian curvature of $H^2_r$ is $\kappa = -1/r^2$ at every point $q\in H^2_r$.

We can now discuss the structure of geodesics and constant geodesic curvature curves on the hyperbolic space. We start with a result that can be proved in an analogous way to Proposition 1.60.

Proposition 1.65. Let $\gamma:[0,T]\to H^2_r$ be a curve with constant geodesic curvature equal to $c\in\mathbb{R}$. For every vector $w\in\mathbb{R}^3$ the function $\alpha(t) = \langle\dot\gamma(t)\,|\,w\rangle_h$ is a solution of the differential equation
\[ \ddot\alpha(t) + \left(c^2-\frac{1}{r^2}\right)\alpha(t) = 0. \tag{1.70} \]

As for the sphere, this result immediately implies the following corollary.

Corollary. Constant geodesic curvature curves on $H^2_r$ are contained in the intersection of $H^2_r$ with affine planes of $\mathbb{R}^3$. In particular, geodesics are contained in the intersection of $H^2_r$ with planes passing through the origin.

Exercise. Prove Proposition 1.65 and the corollary above.

Geodesics on $H^2_r$ are hyperbolas, obtained as intersections of the hyperboloid with planes passing through the origin. The classification of constant geodesic curvature curves is in fact richer. The sections of the hyperboloid with affine planes can have different shapes depending on the Euclidean orthogonal vector to the plane: they are circles when it has negative hyperbolic length, hyperbolas when it has positive hyperbolic length, and parabolas when it has length zero (that is, when it belongs to the cone $x^2+y^2-z^2 = 0$).

These distinctions are reflected in the value of the geodesic curvature. Indeed, as the form of (1.70) also suggests, the value $c = 1/r$ is a threshold, and we have the following situation:
(i) if $c < 1/r$, then the curve is a hyperbola,
(ii) if $c = 1/r$, then the curve is a parabola,
(iii) if $c > 1/r$, then the curve is a circle.

This is not the only interesting feature of this classification. Indeed, curves of type (i) are equidistant curves, while curves of type (iii) are boundaries of balls, i.e., spheres, in the hyperbolic plane. Finally, curves of type (ii) are also called horocycles (cf. Remark ?? for the difference with respect to the case of the positive constant curvature model).
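The threshold $c = 1/r$ can be illustrated with a short sketch. It is not taken from the text: it assumes the standard facts that a hyperbolic circle of radius `R` in $H^2_r$ has geodesic curvature $\coth(R/r)/r$ and Euclidean radius $r\sinh(R/r)$ when centered at $(0,0,r)$, and it compares $|c|$ (rather than $c$) with $1/r$. The function names are hypothetical.

```python
import math

def classify(c, r):
    """Shape of a constant geodesic curvature curve on H^2_r, by |c| versus 1/r."""
    k = abs(c) * r
    if math.isclose(k, 1.0, rel_tol=1e-12):
        return "parabola (horocycle)"
    if k < 1.0:
        return "hyperbola (equidistant curve)"
    return "circle"

def hyperbolic_circle(R, r):
    """Assumed data of the hyperbolic circle of radius R centered at (0, 0, r):
    its geodesic curvature c and its Euclidean radius rho in the plane z = const."""
    c = 1.0 / (r * math.tanh(R / r))
    rho = r * math.sinh(R / r)
    return c, rho

c, rho = hyperbolic_circle(0.8, 2.0)
# For a circle, the coefficient c^2 - 1/r^2 in (1.70) is positive and equals
# the squared angular speed 1/rho^2 of the unit-speed parametrization.
print(classify(c, 2.0), c**2 - 1.0 / 2.0**2, 1.0 / rho**2)
```

When $|c| < 1/r$ the coefficient in the differential equation is negative and the solutions are hyperbolic functions, which matches the unbounded (hyperbola) case.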


Chapter 2

Vector fields and vector bundles

In this chapter we collect some basic definitions of differential geometry, in order to recall some useful results and to fix the notation. We assume the reader to be familiar with the definitions of smooth manifold and smooth map between manifolds.

2.1 Differential equations on smooth manifolds

In what follows $I$ denotes an interval of $\mathbb{R}$ containing $0$ in its interior.

Tangent vectors and vector fields

Let $M$ be a smooth $n$-dimensional manifold and $\gamma_1,\gamma_2:I\to M$ two smooth curves based at $q = \gamma_1(0) = \gamma_2(0)\in M$. We say that $\gamma_1$ and $\gamma_2$ are equivalent if they have the same first-order Taylor polynomial in some (or, equivalently, in every) coordinate chart. This defines an equivalence relation on the space of smooth curves based at $q$.

Definition 2.1. Let $M$ be a smooth $n$-dimensional manifold and let $\gamma:I\to M$ be a smooth curve such that $\gamma(0) = q\in M$. Its tangent vector at $q = \gamma(0)$, denoted by
\[ \frac{d}{dt}\Big|_{t=0}\gamma(t),\qquad\text{or}\qquad \dot\gamma(0), \tag{2.1} \]
is the equivalence class of $\gamma$ in the space of all smooth curves in $M$ based at $q$.

It is easy to check, using the chain rule, that this definition is well-posed (i.e., it does not depend on the representative curve).

Definition 2.2. Let $M$ be a smooth $n$-dimensional manifold. The tangent space to $M$ at a point $q\in M$ is the set
\[ T_qM := \left\{ \frac{d}{dt}\Big|_{t=0}\gamma(t) \;:\; \gamma:I\to M \text{ smooth},\ \gamma(0) = q \right\}. \]
It is a standard fact that $T_qM$ has a natural structure of $n$-dimensional vector space, where $n = \dim M$.

Definition 2.3. A smooth vector field on a smooth manifold $M$ is a smooth map
\[ X: q\mapsto X(q)\in T_qM, \]
that associates to every point $q$ in $M$ a tangent vector at $q$. We denote by $\mathrm{Vec}(M)$ the set of smooth vector fields on $M$.

In coordinates we can write $X = \sum_{i=1}^n X_i(x)\frac{\partial}{\partial x_i}$, and the vector field is smooth if its components $X_i(x)$ are smooth functions. The value of a vector field $X$ at a point $q$ is denoted in what follows by both $X(q)$ and $X\big|_q$.

Definition 2.4. Let $M$ be a smooth manifold and $X\in\mathrm{Vec}(M)$. The equation
\[ \dot q = X(q),\qquad q\in M, \tag{2.2} \]
is called an ordinary differential equation (or ODE) on $M$. A solution of (2.2) is a smooth curve $\gamma:J\to M$, where $J\subset\mathbb{R}$ is an interval, such that
\[ \dot\gamma(t) = X(\gamma(t)),\qquad \forall\, t\in J. \tag{2.3} \]
We also say that $\gamma$ is an integral curve of the vector field $X$.

A standard theorem on ODEs ensures that, for every initial condition, there exists a unique integral curve of a smooth vector field, defined on some interval.

Theorem 2.5. Let $X\in\mathrm{Vec}(M)$ and consider the Cauchy problem
\[ \begin{cases} \dot q(t) = X(q(t)),\\ q(0) = q_0. \end{cases} \tag{2.4} \]
For any point $q_0\in M$ there exist $\delta>0$ and a solution $\gamma:(-\delta,\delta)\to M$ of (2.4), denoted by $\gamma(t;q_0)$. Moreover, the map $(t,q)\mapsto\gamma(t;q)$ is smooth on a neighborhood of $(0,q_0)$.

The solution is unique in the following sense: if there exist two solutions $\gamma_1:I_1\to M$ and $\gamma_2:I_2\to M$ of (2.4) defined on two different intervals $I_1,I_2$ containing zero, then $\gamma_1(t) = \gamma_2(t)$ for every $t\in I_1\cap I_2$. This permits us to introduce the notion of maximal solution of (2.4), that is, the unique solution of (2.4), defined on an interval $I$, that is not extendable to a larger interval $J$ containing $I$.

If the maximal solution of (2.4) is defined on a bounded interval $I = (a,b)$, then the solution leaves every compact $K$ of $M$ in a finite time $t_K<b$.

A vector field $X\in\mathrm{Vec}(M)$ is called complete if, for every $q_0\in M$, the maximal solution $\gamma(t;q_0)$ of the equation (2.2) is defined on $I = \mathbb{R}$.

Remark 2.6. The classical theory of ODEs ensures completeness of the vector field $X\in\mathrm{Vec}(M)$ in the following cases:
(i) $M$ is a compact manifold (or, more generally, $X$ has compact support in $M$),
(ii) $M = \mathbb{R}^n$ and $X$ is sub-linear, i.e., there exist $C_1,C_2>0$ such that
\[ |X(x)|\le C_1|x| + C_2,\qquad \forall\, x\in\mathbb{R}^n, \]
where $|\cdot|$ denotes the Euclidean norm in $\mathbb{R}^n$.

When we are interested in the behavior of the trajectories of a vector field $X\in\mathrm{Vec}(M)$ in a compact subset $K$ of $M$, the assumption of completeness is not restrictive. Indeed, consider an open neighborhood $O_K$ of a compact $K$ with compact closure $\overline{O}_K$ in $M$. There exists a smooth cut-off function $a:M\to\mathbb{R}$ that is identically 1 on $K$ and that vanishes outside $O_K$. Then the vector field $aX$ is complete, since it has compact support in $M$. Moreover, the vector fields $X$ and $aX$ coincide on $K$, hence their integral curves coincide as long as they stay in $K$.

Flow of a vector field

Given a complete vector field $X\in\mathrm{Vec}(M)$ we can consider the family of maps
\[ \phi_t: M\to M,\qquad \phi_t(q) = \gamma(t;q),\qquad t\in\mathbb{R}, \tag{2.5} \]
where $\gamma(t;q)$ is the integral curve of $X$ starting at $q$ when $t = 0$. By Theorem 2.5 it follows that the map
\[ \phi:\mathbb{R}\times M\to M,\qquad \phi(t,q) = \phi_t(q), \]
is smooth in both variables, and the family $\{\phi_t,\ t\in\mathbb{R}\}$ is a one-parameter subgroup of $\mathrm{Diff}(M)$; namely, it satisfies the following identities:
\[ \phi_0 = \mathrm{Id},\qquad \phi_t\circ\phi_s = \phi_s\circ\phi_t = \phi_{t+s},\quad \forall\, t,s\in\mathbb{R},\qquad (\phi_t)^{-1} = \phi_{-t},\quad \forall\, t\in\mathbb{R}. \tag{2.6} \]
Moreover, by construction, we have
\[ \frac{\partial\phi_t(q)}{\partial t} = X(\phi_t(q)),\qquad \phi_0(q) = q,\qquad \forall\, q\in M. \tag{2.7} \]
The family of maps $\phi_t$ defined by (2.5) is called the flow generated by $X$. For the flow $\phi_t$ of a vector field $X$ it is convenient to use the exponential notation $\phi_t := e^{tX}$, for every $t\in\mathbb{R}$. Using this notation, the group properties (2.6) take the form
\[ e^{0X} = \mathrm{Id},\qquad e^{tX}\circ e^{sX} = e^{sX}\circ e^{tX} = e^{(t+s)X},\qquad (e^{tX})^{-1} = e^{-tX}, \tag{2.8} \]
\[ \frac{d}{dt}e^{tX}(q) = X(e^{tX}(q)),\qquad \forall\, q\in M. \tag{2.9} \]

Remark 2.7. When $X(x) = Ax$ is a linear vector field on $\mathbb{R}^n$, where $A$ is an $n\times n$ matrix, the corresponding flow $\phi_t$ is the matrix exponential $\phi_t(x) = e^{tA}x$.

Vector fields as operators on functions

A vector field $X\in\mathrm{Vec}(M)$ induces an action on the algebra $C^\infty(M)$ of the smooth functions on $M$, defined as follows:
\[ X: C^\infty(M)\to C^\infty(M),\qquad a\mapsto Xa,\qquad a\in C^\infty(M), \tag{2.10} \]
where
\[ (Xa)(q) = \frac{d}{dt}\Big|_{t=0}a(e^{tX}(q)),\qquad q\in M. \tag{2.11} \]
In other words, $X$ differentiates the function $a$ along its integral curves.
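For a linear vector field, the group identities of the flow can be checked numerically. The sketch below is only an illustration: it approximates $e^{tA}$ by a truncated Taylor series (an assumption made here for simplicity; it is adequate for small matrices and moderate $t$), and the helper names `mat_mul` and `mat_exp` are hypothetical.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, t, terms=30):
    """Truncated Taylor series of exp(tA): sum over k < terms of (tA)^k / k!."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term (tA)^k / k!
    for k in range(1, terms):
        term = mat_mul(term, A)
        term = [[t * x / k for x in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# X(x) = Ax with A the generator of rotations of the plane.
A = [[0.0, -1.0], [1.0, 0.0]]
composed = mat_mul(mat_exp(A, 0.4), mat_exp(A, 0.9))   # e^{tX} o e^{sX}
direct = mat_exp(A, 1.3)                               # e^{(t+s)X}
print(composed, direct)
```

For this choice of $A$ the flow is the rotation by the angle $t$, so the entries of `direct` can also be compared with $\cos(1.3)$ and $\sin(1.3)$.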

Remark 2.8. Let us denote $a_t := a\circ e^{tX}$. The map $t\mapsto a_t$ is smooth, and from (2.11) it immediately follows that $Xa$ represents the first-order term in the expansion of $a_t$ with respect to $t$:
\[ a_t = a + t\,Xa + O(t^2). \]

Exercise 2.9. Let $a\in C^\infty(M)$ and $X\in\mathrm{Vec}(M)$, and denote $a_t = a\circ e^{tX}$. Prove the following formulas:
\[ \frac{d}{dt}a_t = Xa_t, \tag{2.12} \]
\[ a_t = a + t\,Xa + \frac{t^2}{2!}X^2a + \frac{t^3}{3!}X^3a + \dots + \frac{t^k}{k!}X^ka + O(t^{k+1}). \tag{2.13} \]

It is also easy to see that the following Leibniz rule is satisfied:
\[ X(ab) = (Xa)b + a(Xb),\qquad \forall\, a,b\in C^\infty(M), \tag{2.14} \]
which means that $X$, as an operator on functions, is a derivation of the algebra $C^\infty(M)$.

Remark 2.10. Notice that, in coordinates, if $a\in C^\infty(M)$ and $X = \sum_i X_i(x)\frac{\partial}{\partial x_i}$, then $Xa = \sum_i X_i(x)\frac{\partial a}{\partial x_i}$. In particular, when $X$ is applied to the coordinate functions $a_i(x) = x_i$, then $Xa_i = X_i$, which shows that a vector field is completely characterized by its action on functions.

Exercise 2.11. Let $f_1,\dots,f_k\in C^\infty(M)$ and assume that $N = \{f_1 = \dots = f_k = 0\}\subset M$ is a smooth submanifold. Show that $X\in\mathrm{Vec}(M)$ is tangent to $N$, i.e., $X(q)\in T_qN$ for all $q\in N$, if and only if $Xf_i = 0$ on $N$ for every $i = 1,\dots,k$.

Nonautonomous vector fields

Definition 2.12. A nonautonomous vector field is a family of vector fields $\{X_t\}_{t\in\mathbb{R}}$ such that the map $X(t,q) = X_t(q)$ satisfies the following properties:
(C1) $X(\cdot,q)$ is measurable for every fixed $q\in M$,
(C2) $X(t,\cdot)$ is smooth for every fixed $t\in\mathbb{R}$,
(C3) for every system of coordinates defined in an open set $\Omega\subset M$, every compact $K\subset\Omega$ and every compact interval $I\subset\mathbb{R}$, there exist $L^\infty$ functions $c(t),k(t)$ such that
\[ \|X(t,x)\|\le c(t),\qquad \|X(t,x)-X(t,y)\|\le k(t)\,\|x-y\|,\qquad \forall\,(t,x),(t,y)\in I\times K. \]

Notice that conditions (C1) and (C2) are equivalent to requiring that, for every smooth function $a\in C^\infty(M)$, the real function $X_ta\big|_q$ defined on $\mathbb{R}\times M$ is measurable in $t$ and smooth in $q$.

Remark 2.13. In these lecture notes we are mainly interested in nonautonomous vector fields of the following form
\[ X_t(q) = \sum_{i=1}^m u_i(t)\,f_i(q), \tag{2.15} \]

where $u_i$ are $L^\infty$ functions and $f_i$ are smooth vector fields on $M$. For this class of nonautonomous vector fields, assumptions (C1)-(C2) are trivially satisfied. As far as (C3) is concerned, by the smoothness of the $f_i$, for every compact set $K\subset\Omega$ we can find two positive constants $C_K,L_K$ such that for all $i = 1,\dots,m$ and $j = 1,\dots,n$ we have
\[ \|f_i(x)\|\le C_K,\qquad \left\|\frac{\partial f_i}{\partial x_j}(x)\right\|\le L_K,\qquad \forall\, x\in K, \]
and one gets for all $(t,x),(t,y)\in I\times K$
\[ \|X(t,x)\|\le C_K\sum_{i=1}^m|u_i(t)|,\qquad \|X(t,x)-X(t,y)\|\le L_K\sum_{i=1}^m|u_i(t)|\,\|x-y\|. \tag{2.16} \]

The existence and uniqueness of integral curves of a nonautonomous vector field is guaranteed by the following theorem (see [7]).

Theorem 2.14 (Carathéodory theorem). Assume that the nonautonomous vector field $\{X_t\}_{t\in\mathbb{R}}$ satisfies (C1)-(C3). Then the Cauchy problem
\[ \begin{cases} \dot q(t) = X(t,q(t)),\\ q(t_0) = q_0, \end{cases} \tag{2.17} \]
has a unique solution $\gamma(t;t_0,q_0)$ defined on an open interval $I$ containing $t_0$, such that (2.17) is satisfied for almost every $t\in I$ and $\gamma(t_0;t_0,q_0) = q_0$. Moreover, the map $(t,q_0)\mapsto\gamma(t;t_0,q_0)$ is Lipschitz with respect to $t$ and smooth with respect to $q_0$.

Let us assume now that the equation (2.17) is complete, i.e., for all $t_0\in\mathbb{R}$ and $q_0\in M$ the solution $\gamma(t;t_0,q_0)$ is defined on $I = \mathbb{R}$. Let us denote $P_{t_0,t}(q) = \gamma(t;t_0,q)$. The family of maps $P_{t,s}:M\to M$ is the (nonautonomous) flow generated by $X_t$. It satisfies
\[ \frac{\partial}{\partial t}\frac{\partial P_{t_0,t}}{\partial q}(q) = \frac{\partial X}{\partial q}\bigl(t,P_{t_0,t}(q)\bigr)\,\frac{\partial P_{t_0,t}}{\partial q}(q). \]
Moreover, the following algebraic identities are satisfied:
\[ P_{t,t} = \mathrm{Id},\qquad P_{t_2,t_3}\circ P_{t_1,t_2} = P_{t_1,t_3},\quad \forall\, t_1,t_2,t_3\in\mathbb{R},\qquad (P_{t_1,t_2})^{-1} = P_{t_2,t_1},\quad \forall\, t_1,t_2\in\mathbb{R}. \tag{2.18} \]
Conversely, with every family of smooth diffeomorphisms $P_{t,s}:M\to M$ satisfying the relations (2.18), which is called a flow on $M$, one can associate its infinitesimal generator $X_t$ as follows:
\[ X_t(q) = \frac{d}{ds}\Big|_{s=0}P_{t,t+s}(q),\qquad \forall\, q\in M. \tag{2.19} \]
The following lemma characterizes flows whose infinitesimal generator is autonomous.

Lemma. Let $\{P_{t,s}\}_{t,s\in\mathbb{R}}$ be a family of smooth diffeomorphisms satisfying (2.18). Its infinitesimal generator is an autonomous vector field if and only if
\[ P_{0,t}\circ P_{0,s} = P_{0,t+s},\qquad \forall\, t,s\in\mathbb{R}. \]

2.2 Differential of a smooth map

A smooth map between manifolds induces a map between the corresponding tangent spaces.

Definition. Let $\varphi:M\to N$ be a smooth map between smooth manifolds and $q\in M$. The differential of $\varphi$ at the point $q$ is the linear map
\[ \varphi_{*,q}: T_qM\to T_{\varphi(q)}N, \tag{2.20} \]
defined as follows:
\[ \varphi_{*,q}(v) = \frac{d}{dt}\Big|_{t=0}\varphi(\gamma(t)),\qquad\text{if}\quad v = \frac{d}{dt}\Big|_{t=0}\gamma(t),\quad q = \gamma(0). \]
It is easily checked that this definition depends only on the equivalence class of $\gamma$.

Figure 2.1: Differential of a map $\varphi:M\to N$.

The differential $\varphi_{*,q}$ of a smooth map $\varphi:M\to N$, also called its pushforward, is sometimes denoted by the symbols $D_q\varphi$ or $d_q\varphi$.

Exercise. Let $\varphi:M\to N$, $\psi:N\to Q$ be smooth maps between manifolds. Prove that the differential of the composition $\psi\circ\varphi:M\to Q$ satisfies $(\psi\circ\varphi)_* = \psi_*\circ\varphi_*$.

As we said, a smooth map induces a transformation of tangent vectors. If we deal with diffeomorphisms, we can also push forward a vector field.

Definition. Let $X\in\mathrm{Vec}(M)$ and let $\varphi:M\to N$ be a diffeomorphism. The pushforward $\varphi_*X\in\mathrm{Vec}(N)$ is the vector field on $N$ defined by
\[ (\varphi_*X)(\varphi(q)) := \varphi_*(X(q)),\qquad \forall\, q\in M. \tag{2.21} \]
When $P\in\mathrm{Diff}(M)$ is a diffeomorphism of $M$, we can rewrite the identity (2.21) as
\[ (P_*X)(q) = P_*(X(P^{-1}(q))),\qquad \forall\, q\in M. \tag{2.22} \]
Notice that, in general, if $\varphi$ is a smooth map, the pushforward of a vector field is not defined.

Remark. From this definition the following useful formula follows, for $X,Y\in\mathrm{Vec}(M)$:
\[ (e^{tX}_*Y)\big|_q = e^{tX}_*\bigl(Y\big|_{e^{-tX}(q)}\bigr) = \frac{d}{ds}\Big|_{s=0}e^{tX}\circ e^{sY}\circ e^{-tX}(q). \]

If $P\in\mathrm{Diff}(M)$ and $X\in\mathrm{Vec}(M)$, then $P_*X$ is, by construction, the vector field whose integral curves are the images under $P$ of the integral curves of $X$. The following lemma shows how it acts as an operator on functions.

Lemma 2.20. Let $P\in\mathrm{Diff}(M)$, $X\in\mathrm{Vec}(M)$ and $a\in C^\infty(M)$. Then
\[ e^{tP_*X} = P\circ e^{tX}\circ P^{-1}, \tag{2.23} \]
\[ (P_*X)a = (X(a\circ P))\circ P^{-1}. \tag{2.24} \]

Proof. From the formula
\[ \frac{d}{dt}\Big|_{t=0}P\circ e^{tX}\circ P^{-1}(q) = P_*(X(P^{-1}(q))) = (P_*X)(q), \]
it follows that $t\mapsto P\circ e^{tX}\circ P^{-1}(q)$ is an integral curve of $P_*X$, from which (2.23) follows. To prove (2.24) let us compute
\[ (P_*X)a\big|_q = \frac{d}{dt}\Big|_{t=0}a(e^{tP_*X}(q)). \]
Using (2.23) this is equal to
\[ \frac{d}{dt}\Big|_{t=0}a\bigl(P(e^{tX}(P^{-1}(q)))\bigr) = \frac{d}{dt}\Big|_{t=0}(a\circ P)(e^{tX}(P^{-1}(q))) = (X(a\circ P))\circ P^{-1}(q). \]

As a consequence of Lemma 2.20 one gets the following formula: for every $X,Y\in\mathrm{Vec}(M)$
\[ (e^{tX}_*Y)a = Y(a\circ e^{tX})\circ e^{-tX}. \tag{2.25} \]

2.3 Lie brackets

In this section we introduce a fundamental notion for sub-Riemannian geometry, the Lie bracket of two vector fields $X$ and $Y$. Geometrically it is defined as the infinitesimal version of the pushforward of the second vector field along the flow of the first one. As explained below, it measures how much $Y$ is modified by the flow of $X$.

Definition. Let $X,Y\in\mathrm{Vec}(M)$. We define their Lie bracket as the vector field
\[ [X,Y] := \frac{\partial}{\partial t}\Big|_{t=0}e^{-tX}_*Y. \tag{2.26} \]

Remark. The geometric meaning of the Lie bracket can be understood by writing explicitly
\[ [X,Y]\big|_q = \frac{\partial}{\partial t}\Big|_{t=0}e^{-tX}_*Y\big|_q = \frac{\partial}{\partial t}\Big|_{t=0}e^{-tX}_*\bigl(Y\big|_{e^{tX}(q)}\bigr) = \frac{\partial^2}{\partial s\,\partial t}\Big|_{t=s=0}e^{-tX}\circ e^{sY}\circ e^{tX}(q). \tag{2.27} \]

Proposition 2.23. As derivations on functions, one has the identity
\[ [X,Y] = XY - YX. \tag{2.28} \]

Proof. By definition of Lie bracket we have $[X,Y]a = \frac{\partial}{\partial t}\big|_{t=0}(e^{-tX}_*Y)a$. Hence we have to compute the first-order term in the expansion, with respect to $t$, of the map
\[ t\mapsto(e^{-tX}_*Y)a. \]
Using formula (2.25) we have
\[ (e^{-tX}_*Y)a = Y(a\circ e^{-tX})\circ e^{tX}. \]
By Remark 2.8 we have $a\circ e^{-tX} = a - t\,Xa + O(t^2)$, hence
\[ (e^{-tX}_*Y)a = Y(a - t\,Xa + O(t^2))\circ e^{tX} = (Ya - t\,YXa + O(t^2))\circ e^{tX}. \]
Denoting $b = Ya - t\,YXa + O(t^2)$, $b_t = b\circ e^{tX}$, and using again the expansion above we get
\[ (e^{-tX}_*Y)a = (Ya - t\,YXa + O(t^2)) + t\,X(Ya - t\,YXa + O(t^2)) + O(t^2) = Ya + t(XY-YX)a + O(t^2), \]
which proves that the first-order term with respect to $t$ in the expansion is $(XY-YX)a$.

Proposition 2.23 shows that $(\mathrm{Vec}(M),[\cdot,\cdot])$ is a Lie algebra.

Exercise. Prove the coordinate expression of the Lie bracket: let
\[ X = \sum_{i=1}^n X_i\frac{\partial}{\partial x_i},\qquad Y = \sum_{j=1}^n Y_j\frac{\partial}{\partial x_j}, \]
be two vector fields in $\mathbb{R}^n$. Show that
\[ [X,Y] = \sum_{i,j=1}^n\left(X_i\frac{\partial Y_j}{\partial x_i} - Y_i\frac{\partial X_j}{\partial x_i}\right)\frac{\partial}{\partial x_j}. \]

Next we prove that every diffeomorphism induces a Lie algebra homomorphism on $\mathrm{Vec}(M)$.

Proposition. Let $P\in\mathrm{Diff}(M)$. Then $P_*$ is a Lie algebra homomorphism of $\mathrm{Vec}(M)$, i.e.,
\[ P_*[X,Y] = [P_*X,P_*Y],\qquad \forall\, X,Y\in\mathrm{Vec}(M). \]

Proof. We show that the two terms are equal as derivations on functions. Let $a\in C^\infty(M)$. Preliminarily we see, using (2.24), that
\[ P_*X(P_*Ya) = P_*X\bigl(Y(a\circ P)\circ P^{-1}\bigr) = X\bigl(Y(a\circ P)\circ P^{-1}\circ P\bigr)\circ P^{-1} = X(Y(a\circ P))\circ P^{-1}, \]
and using this property twice, together with (2.28),
\[ [P_*X,P_*Y]a = P_*X(P_*Ya) - P_*Y(P_*Xa) = XY(a\circ P)\circ P^{-1} - YX(a\circ P)\circ P^{-1} = (XY-YX)(a\circ P)\circ P^{-1} = P_*[X,Y]a. \]
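The coordinate expression of the Lie bracket lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions: the partial derivatives are approximated by central differences, and the two fields are the ones usually associated with the Heisenberg group, $X = \partial_x - \frac{y}{2}\partial_z$ and $Y = \partial_y + \frac{x}{2}\partial_z$, chosen here only as an example; all function names are hypothetical.

```python
def jacobian(F, x, h=1e-6):
    """Central-difference Jacobian: J[j][i] approximates dF_j/dx_i at x."""
    n = len(x)
    J = [[0.0] * n for _ in range(len(F(x)))]
    for i in range(n):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        Fp, Fm = F(xp), F(xm)
        for j in range(len(Fp)):
            J[j][i] = (Fp[j] - Fm[j]) / (2.0 * h)
    return J

def lie_bracket(X, Y, x):
    """Components [X,Y]_j = sum_i (X_i dY_j/dx_i - Y_i dX_j/dx_i) at the point x."""
    JX, JY = jacobian(X, x), jacobian(Y, x)
    Xx, Yx = X(x), Y(x)
    n = len(x)
    return [sum(Xx[i] * JY[j][i] - Yx[i] * JX[j][i] for i in range(n))
            for j in range(n)]

def X(p):
    x, y, z = p
    return [1.0, 0.0, -0.5 * y]

def Y(p):
    x, y, z = p
    return [0.0, 1.0, 0.5 * x]

# For these two fields the formula gives [X,Y] = d/dz at every point.
print(lie_bracket(X, Y, [0.3, -1.1, 2.0]))
```

Swapping the arguments changes the sign of every component, in agreement with the antisymmetry of the bracket.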

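To end this section, we show that the Lie bracket of two vector fields is zero (i.e., they commute as operators on functions) if and only if their flows commute.

Proposition. Let $X,Y\in\mathrm{Vec}(M)$. The following properties are equivalent:
(i) $[X,Y] = 0$,
(ii) $e^{tX}\circ e^{sY} = e^{sY}\circ e^{tX}$, for all $t,s\in\mathbb{R}$.

Proof. We start the proof with the following claim:
\[ [X,Y] = 0\quad\Longrightarrow\quad e^{-tX}_*Y = Y,\qquad \forall\, t\in\mathbb{R}. \tag{2.29} \]
To prove (2.29) let us show that $[X,Y] = \frac{d}{dt}\big|_{t=0}e^{-tX}_*Y = 0$ implies that $\frac{d}{dt}e^{-tX}_*Y = 0$ for all $t\in\mathbb{R}$. Indeed we have
\[ \frac{d}{dt}e^{-tX}_*Y = \frac{d}{d\varepsilon}\Big|_{\varepsilon=0}e^{-(t+\varepsilon)X}_*Y = \frac{d}{d\varepsilon}\Big|_{\varepsilon=0}e^{-tX}_*e^{-\varepsilon X}_*Y = e^{-tX}_*\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}e^{-\varepsilon X}_*Y = e^{-tX}_*[X,Y] = 0, \]
which proves (2.29).

(i) $\Rightarrow$ (ii). Fix $t\in\mathbb{R}$. Let us show that $\phi_s := e^{-tX}\circ e^{sY}\circ e^{tX}$ is the flow generated by $Y$. Indeed we have
\[ \frac{\partial}{\partial s}\phi_s = \frac{\partial}{\partial\varepsilon}\Big|_{\varepsilon=0}e^{-tX}\circ e^{(s+\varepsilon)Y}\circ e^{tX} = \frac{\partial}{\partial\varepsilon}\Big|_{\varepsilon=0}e^{-tX}\circ e^{\varepsilon Y}\circ e^{tX}\circ\underbrace{e^{-tX}\circ e^{sY}\circ e^{tX}}_{\phi_s} = (e^{-tX}_*Y)\circ\phi_s = Y\circ\phi_s, \]
where in the last equality we used the claim (2.29). Using uniqueness of the flow generated by a vector field we get
\[ e^{-tX}\circ e^{sY}\circ e^{tX} = e^{sY},\qquad \forall\, t,s\in\mathbb{R}, \]
which is equivalent to (ii).

(ii) $\Rightarrow$ (i). For every function $a\in C^\infty(M)$ we have
\[ XYa = \frac{\partial^2}{\partial t\,\partial s}\Big|_{t=s=0}a\circ e^{sY}\circ e^{tX} = \frac{\partial^2}{\partial s\,\partial t}\Big|_{t=s=0}a\circ e^{tX}\circ e^{sY} = YXa. \]
Then (i) follows from (2.28).

Exercise. Let $X,Y\in\mathrm{Vec}(M)$ and $q\in M$. Consider the curve on $M$
\[ \gamma(t) = e^{-tY}\circ e^{-tX}\circ e^{tY}\circ e^{tX}(q). \]
Prove that the tangent vector to the curve $t\mapsto\gamma(\sqrt{t})$ at $t = 0$ is $[X,Y](q)$.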
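The exercise on the commutator of flows can be explored on a case where both flows are explicit. The sketch below is an illustration, not a proof: it assumes the example fields $X = \partial_x - \frac{y}{2}\partial_z$ and $Y = \partial_y + \frac{x}{2}\partial_z$ on $\mathbb{R}^3$ (for which the coordinate formula gives $[X,Y] = \partial_z$), whose flows can be written in closed form; the function names are hypothetical.

```python
import math

def flow_X(t, p):
    """Flow of X = d/dx - (y/2) d/dz: x grows linearly, y is constant."""
    x, y, z = p
    return (x + t, y, z - 0.5 * y * t)

def flow_Y(t, p):
    """Flow of Y = d/dy + (x/2) d/dz: y grows linearly, x is constant."""
    x, y, z = p
    return (x, y + t, z + 0.5 * x * t)

def commutator_curve(t, q):
    """gamma(t) = e^{-tY} o e^{-tX} o e^{tY} o e^{tX}(q), applied right to left."""
    p = flow_X(t, q)
    p = flow_Y(t, p)
    p = flow_X(-t, p)
    p = flow_Y(-t, p)
    return p

def tangent_of_reparametrized_curve(q, s=1e-4):
    """Difference quotient of s -> gamma(sqrt(s)) at s = 0."""
    end = commutator_curve(math.sqrt(s), q)
    return tuple((e_i - q_i) / s for e_i, q_i in zip(end, q))

# Should be close to (0, 0, 1), the value of [X,Y] = d/dz for these fields.
print(tangent_of_reparametrized_curve((0.3, -1.1, 2.0)))
```

In this example $\gamma(t) = q + (0,0,t^2)$ exactly, which makes clear why the reparametrization by $\sqrt{t}$ is needed to obtain a nonzero tangent vector.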

Exercise. Let $X,Y\in\mathrm{Vec}(M)$. Using the semigroup property of the flow, prove the following expansion, where $(\mathrm{ad}X)Y := [X,Y]$:
\[ e^{-tX}_*Y = \sum_{n=0}^{\infty}\frac{t^n}{n!}(\mathrm{ad}X)^nY = Y + t[X,Y] + \frac{t^2}{2}[X,[X,Y]] + \frac{t^3}{6}[X,[X,[X,Y]]] + \dots \]

Exercise. Let $X,Y\in\mathrm{Vec}(M)$ and $a\in C^\infty(M)$. Prove the following Leibniz rule for the Lie bracket:
\[ [X,aY] = a[X,Y] + (Xa)Y. \]

Exercise 2.30. Let $X,Y,Z\in\mathrm{Vec}(M)$. Prove that the Lie bracket satisfies the Jacobi identity:
\[ [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0. \tag{2.30} \]
Hint: Differentiate the identity $e^{-tX}_*[Y,Z] = [e^{-tX}_*Y,\,e^{-tX}_*Z]$.

2.4 Cotangent space

In this section we introduce tangent covectors, which are linear functionals on the tangent space. The space of all covectors at a point $q\in M$, called the cotangent space, is, in algebraic terms, simply the dual space to the tangent space.

Definition. Let $M$ be an $n$-dimensional smooth manifold. The cotangent space at a point $q\in M$ is the set
\[ T^*_qM := (T_qM)^* = \{\lambda:T_qM\to\mathbb{R},\ \lambda\text{ linear}\}. \]
If $\lambda\in T^*_qM$ and $v\in T_qM$, we will denote by $\langle\lambda,v\rangle := \lambda(v)$ the action of the covector $\lambda$ on the vector $v$.

As we have seen, a smooth map yields a linear map between tangent spaces. Dualizing this map, we get a linear map on cotangent spaces.

Definition. Let $\varphi:M\to N$ be a smooth map and $q\in M$. The pullback of $\varphi$ at the point $\varphi(q)$ is the map
\[ \varphi^*: T^*_{\varphi(q)}N\to T^*_qM,\qquad \lambda\mapsto\varphi^*\lambda, \]
defined by duality in the following way:
\[ \langle\varphi^*\lambda,v\rangle := \langle\lambda,\varphi_*v\rangle,\qquad \forall\, v\in T_qM,\ \forall\,\lambda\in T^*_{\varphi(q)}N. \]

Example. Let $a:M\to\mathbb{R}$ be a smooth function and $q\in M$. The differential $d_qa$ of the function $a$ at the point $q\in M$, defined through the formula
\[ \langle d_qa,v\rangle := \frac{d}{dt}\Big|_{t=0}a(\gamma(t)),\qquad v\in T_qM, \tag{2.31} \]
where $\gamma$ is any smooth curve such that $\gamma(0) = q$ and $\dot\gamma(0) = v$, is an element of $T^*_qM$, since (2.31) is linear with respect to $v$.

Definition. A differential 1-form on a smooth manifold $M$ is a smooth map
\[ \omega: q\mapsto\omega(q)\in T^*_qM, \]
that associates to every point $q$ in $M$ a cotangent vector at $q$. We denote by $\Lambda^1(M)$ the set of differential 1-forms on $M$.

Since differential forms are dual objects to vector fields, the action of $\omega\in\Lambda^1(M)$ on $X\in\mathrm{Vec}(M)$ is well defined pointwise, and defines a function on $M$:
\[ \langle\omega,X\rangle: q\mapsto\langle\omega(q),X(q)\rangle. \tag{2.32} \]
The differential form $\omega$ is smooth if and only if, for every smooth vector field $X\in\mathrm{Vec}(M)$, the function $\langle\omega,X\rangle$ belongs to $C^\infty(M)$.

Definition. Let $\varphi:M\to N$ be a smooth map and $a:N\to\mathbb{R}$ be a smooth function. The pullback $\varphi^*a$ is the smooth function on $M$ defined by
\[ (\varphi^*a)(q) = a(\varphi(q)),\qquad q\in M. \]
In particular, if $\pi:T^*M\to M$ is the canonical projection and $a\in C^\infty(M)$, then
\[ (\pi^*a)(\lambda) = a(\pi(\lambda)),\qquad \lambda\in T^*M, \]
which is constant on fibers.

2.5 Vector bundles

Heuristically, a smooth vector bundle on a manifold $M$ is a smooth family of vector spaces parametrized by points in $M$.

Definition. Let $M$ be an $n$-dimensional manifold. A smooth vector bundle of rank $k$ over $M$ is a smooth manifold $E$ with a surjective smooth map $\pi:E\to M$ such that
(i) the set $E_q := \pi^{-1}(q)$, the fiber of $E$ at $q$, is a $k$-dimensional vector space,
(ii) for every $q\in M$ there exist a neighborhood $O_q$ of $q$ and a linear-on-fibers diffeomorphism (called local trivialization) $\psi:\pi^{-1}(O_q)\to O_q\times\mathbb{R}^k$ such that the following diagram commutes
\[ \begin{array}{ccc} \pi^{-1}(O_q) & \xrightarrow{\ \psi\ } & O_q\times\mathbb{R}^k\\ & {\scriptstyle\pi}\searrow & \downarrow{\scriptstyle\pi_1}\\ & & O_q \end{array} \tag{2.33} \]
The space $E$ is called the total space and $M$ is the base of the vector bundle. We will refer to $\pi$ as the canonical projection, and $\operatorname{rank}E$ will denote the rank of the bundle.

Remark. A vector bundle $E$, as a smooth manifold, has dimension
\[ \dim E = \dim M + \operatorname{rank}E = n + k. \]
In the case when there exists a global trivialization map, i.e., one can choose a local trivialization with $O_q = M$ for all $q\in M$, then $E$ is diffeomorphic to $M\times\mathbb{R}^k$ and we say that $E$ is trivializable.

Example. For any smooth $n$-dimensional manifold $M$, the tangent bundle $TM$, defined as the disjoint union of the tangent spaces at all points of $M$,
\[ TM = \bigsqcup_{q\in M}T_qM, \]
has a natural structure of $2n$-dimensional smooth manifold, equipped with the vector bundle structure (of rank $n$) induced by the canonical projection map
\[ \pi:TM\to M,\qquad \pi(v) = q\quad\text{if } v\in T_qM. \]
In the same way one can consider the cotangent bundle $T^*M$, defined as
\[ T^*M = \bigsqcup_{q\in M}T^*_qM. \]
Again, it is a $2n$-dimensional manifold, and the canonical projection map
\[ \pi:T^*M\to M,\qquad \pi(\lambda) = q\quad\text{if } \lambda\in T^*_qM, \]
endows $T^*M$ with a structure of rank $n$ vector bundle.

Let $O\subset M$ be a coordinate neighborhood and denote by
\[ \phi:O\to\mathbb{R}^n,\qquad \phi(q) = (x_1,\dots,x_n), \]
a local coordinate system. The differentials of the coordinate functions
\[ dx_i\big|_q,\qquad i = 1,\dots,n,\quad q\in O, \]
form a basis of the cotangent space $T^*_qM$. The dual basis in the tangent space $T_qM$ is defined by the vectors
\[ \frac{\partial}{\partial x_i}\Big|_q\in T_qM,\qquad i = 1,\dots,n,\quad q\in O, \tag{2.34} \]
\[ \left\langle dx_i,\frac{\partial}{\partial x_j}\right\rangle = \delta_{ij},\qquad i,j = 1,\dots,n. \tag{2.35} \]
Thus any tangent vector $v\in T_qM$ and any covector $\lambda\in T^*_qM$ can be decomposed in these bases,
\[ v = \sum_{i=1}^n v_i\frac{\partial}{\partial x_i}\Big|_q,\qquad \lambda = \sum_{i=1}^n p_i\,dx_i\big|_q, \]
and the maps
\[ \psi: v\mapsto(x_1,\dots,x_n,v_1,\dots,v_n),\qquad \bar\psi:\lambda\mapsto(x_1,\dots,x_n,p_1,\dots,p_n), \tag{2.36} \]
define local coordinates on $TM$ and $T^*M$ respectively, which we call canonical coordinates induced by the coordinates $\phi$ on $M$.

Definition. A morphism $f:E\to E'$ between two vector bundles $E,E'$ on the base $M$ (also called a bundle map) is a smooth map such that the following diagram is commutative
\[ \begin{array}{ccc} E & \xrightarrow{\ f\ } & E'\\ & {\scriptstyle\pi}\searrow & \downarrow{\scriptstyle\pi'}\\ & & M \end{array} \tag{2.37} \]
where $f$ is linear on fibers. Here $\pi$ and $\pi'$ denote the canonical projections.

Definition 2.40. Let $\pi:E\to M$ be a smooth vector bundle over $M$. A local section of $E$ is a smooth map (footnote 1) $\sigma:A\subset M\to E$ satisfying $\pi\circ\sigma = \mathrm{Id}_A$, where $A$ is an open set of $M$. In other words, $\sigma(q)$ belongs to $E_q$ for each $q\in A$, smoothly with respect to $q$. If $\sigma$ is defined on all of $M$, it is said to be a global section.

Example. Let $\pi:E\to M$ be a smooth vector bundle over $M$. The zero section of $E$ is the global section
\[ \zeta:M\to E,\qquad \zeta(q) = 0\in E_q,\qquad \forall\, q\in M. \]
We will use the notation $\zeta(M)\subset E$ for its image.

Remark. Notice that smooth vector fields and smooth differential forms are, by definition, sections of the vector bundles $TM$ and $T^*M$ respectively.

We end this section with some classical constructions on vector bundles.

Definition. Let $\varphi:M\to N$ be a smooth map between smooth manifolds and let $E$ be a vector bundle on $N$, with fibers $\{E_q,\ q\in N\}$. The induced bundle (or pullback bundle) $\varphi^*E$ is a vector bundle on the base $M$ defined by
\[ \varphi^*E := \{(q,v)\mid q\in M,\ v\in E_{\varphi(q)}\}\subset M\times E. \]
Notice that $\operatorname{rank}\varphi^*E = \operatorname{rank}E$, hence $\dim\varphi^*E = \dim M + \operatorname{rank}E$.

Example. (i) Let $M$ be a smooth manifold and $TM$ its tangent bundle, endowed with a Euclidean structure. The spherical bundle $SM$ is the subbundle of $TM$ (with spherical, rather than linear, fibers) defined as follows:
\[ SM = \bigsqcup_{q\in M}S_qM,\qquad S_qM = \{v\in T_qM\mid |v| = 1\}. \]
(ii) Let $E,E'$ be two vector bundles over a smooth manifold $M$. The direct sum $E\oplus E'$ is the vector bundle over $M$ defined by
\[ (E\oplus E')_q := E_q\oplus E'_q. \]

Footnote 1: here smooth means smooth as a map between manifolds.

2.6 Submersions and level sets of smooth maps

If $\varphi:M\to N$ is a smooth map, we define the rank of $\varphi$ at $q\in M$ to be the rank of the linear map $\varphi_{*,q}:T_qM\to T_{\varphi(q)}N$. It is of course just the rank of the matrix of partial derivatives of $\varphi$ in any coordinate chart, or the dimension of $\mathrm{Im}(\varphi_{*,q})\subset T_{\varphi(q)}N$. If $\varphi$ has the same rank $k$ at every point, we say $\varphi$ has constant rank, and write $\operatorname{rank}\varphi = k$.

An immersion is a smooth map $\varphi:M\to N$ with the property that $\varphi_*$ is injective at each point (or equivalently $\operatorname{rank}\varphi = \dim M$). Similarly, a submersion is a smooth map $\varphi:M\to N$ such that $\varphi_*$ is surjective at each point (equivalently, $\operatorname{rank}\varphi = \dim N$).

Theorem 2.45 (Rank Theorem). Suppose $M$ and $N$ are smooth manifolds of dimensions $m$ and $n$, respectively, and $\varphi:M\to N$ is a smooth map with constant rank $k$ in a neighborhood of $q\in M$. Then there exist coordinates $(x_1,\dots,x_m)$ centered at $q$ and $(y_1,\dots,y_n)$ centered at $\varphi(q)$ in which $\varphi$ has the following coordinate representation:
\[ \varphi(x_1,\dots,x_m) = (x_1,\dots,x_k,0,\dots,0). \tag{2.38} \]

Remark 2.46. The previous theorem can be rephrased in the following more invariant way. Let $\varphi:M\to N$ be a smooth map between two smooth manifolds. Then the following are equivalent:
(i) $\varphi$ has constant rank in a neighborhood of $q\in M$,
(ii) there exist coordinates near $q\in M$ and $\varphi(q)\in N$ in which the coordinate representation of $\varphi$ is linear.

In the case of a submersion, from Theorem 2.45 one can deduce the following result.

Corollary 2.47. Assume $\varphi:M\to N$ is a smooth submersion at $q$. Then $\varphi$ admits a local right inverse at $\varphi(q)$. Moreover, $\varphi$ is open at $q$. More precisely, there exist $\varepsilon>0$ and $C>0$ such that
\[ B_{\varphi(q)}(C^{-1}r)\subset\varphi(B_q(r)),\qquad \forall\, r\in[0,\varepsilon[. \tag{2.39} \]

Remark 2.48. The constant $C$ appearing in (2.39) is the norm of the differential of the local right inverse. When $\varphi$ is a diffeomorphism, $C$ is a bound on the norm of the differential of the inverse of $\varphi$. This recovers the classical quantitative statement of the inverse function theorem.

Using these results, one can give some very general criteria for level sets of smooth maps (or smooth functions) to be submanifolds.

Theorem 2.49 (Constant Rank Level Set Theorem). Let $M$ and $N$ be smooth manifolds, and let $\varphi:M\to N$ be a smooth map with constant rank $k$. Each level set $\varphi^{-1}(y)$, for $y\in N$, is a closed embedded submanifold of codimension $k$ in $M$.

Remark 2.50. It is worth specifying the following two important special cases of Theorem 2.49:
(a) If $\varphi:M\to N$ is a submersion at every $q\in\varphi^{-1}(y)$ for some $y\in N$, then $\varphi^{-1}(y)$ is a closed embedded submanifold whose codimension is equal to the dimension of $N$.
(b) If $a:M\to\mathbb{R}$ is a smooth function such that $d_qa\neq 0$ for every $q\in a^{-1}(c)$, where $c\in\mathbb{R}$, then the level set $a^{-1}(c)$ is a smooth hypersurface of $M$.

Exercise. Let $a:M\to\mathbb{R}$ be a smooth function. Assume that $c\in\mathbb{R}$ is a regular value of $a$, i.e., $d_qa\neq 0$ for every $q\in a^{-1}(c)$. Then $N_c = a^{-1}(c) = \{q\in M\mid a(q) = c\}\subset M$ is a smooth submanifold. Prove that for every $q\in N_c$
\[ T_qN_c = \ker d_qa = \{v\in T_qM\mid\langle d_qa,v\rangle = 0\}. \]

Bibliographical notes

The material presented in this chapter is classical and covered by many textbooks in differential geometry, as for instance [6, 2, 12, 27]. Theorem 2.14 is a well-known theorem in ODE theory. The statement presented here can be deduced from [8, Theorem 2.1.1, Exercise 2.4]. The functions $c(t)$, $k(t)$ appearing in (C3) are assumed to be $L^\infty$, which is stronger than $L^1$ (on compact intervals). This stronger assumption implies that the solution is not only absolutely continuous with respect to $t$, but also locally Lipschitz.


Chapter 3
Sub-Riemannian structures

3.1 Basic definitions

In this section we introduce a definition of sub-Riemannian structure which is quite general. Indeed, this definition includes all the classical notions of Riemannian structure, constant-rank sub-Riemannian structure, rank-varying sub-Riemannian structure, almost-Riemannian structure, etc.

Definition 3.1. Let $M$ be a smooth manifold and let $\mathcal{F} \subset \mathrm{Vec}(M)$ be a family of smooth vector fields. The Lie algebra generated by $\mathcal{F}$ is the smallest sub-algebra of $\mathrm{Vec}(M)$ containing $\mathcal{F}$, namely
$$\mathrm{Lie}\,\mathcal{F} := \mathrm{span}\{[X_1,\dots,[X_{j-1},X_j]],\ X_i \in \mathcal{F},\ j \in \mathbb{N}\}. \tag{3.1}$$
We will say that $\mathcal{F}$ is bracket-generating (or that it satisfies the Hörmander condition) if
$$\mathrm{Lie}_q\mathcal{F} := \{X(q),\ X \in \mathrm{Lie}\,\mathcal{F}\} = T_q M, \qquad \forall\, q \in M.$$

Definition 3.2. Let $M$ be a connected smooth manifold. A sub-Riemannian structure on $M$ is a pair $(\mathbf{U},f)$ where:
(i) $\mathbf{U}$ is a Euclidean bundle with base $M$ and Euclidean fiber $U_q$, i.e., for every $q \in M$, $U_q$ is a vector space equipped with a scalar product $(\cdot\,|\,\cdot)_q$, smooth with respect to $q$. For $u \in U_q$ we denote the norm of $u$ by $|u|^2 = (u\,|\,u)_q$.
(ii) $f : \mathbf{U} \to TM$ is a smooth map that is a morphism of vector bundles, i.e., $f$ is linear on fibers and the following diagram is commutative (here $\pi_{\mathbf{U}} : \mathbf{U} \to M$ and $\pi : TM \to M$ are the canonical projections):
$$\begin{array}{ccc}
\mathbf{U} & \xrightarrow{\ \ f\ \ } & TM \\
 & {\scriptstyle \pi_{\mathbf{U}}}\searrow & \downarrow{\scriptstyle \pi} \\
 & & M
\end{array} \tag{3.2}$$
(iii) The set of horizontal vector fields $\mathcal{D} := \{f(\sigma) \mid \sigma : M \to \mathbf{U} \text{ smooth section}\}$ is a bracket-generating family of vector fields.
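To see the bracket-generating condition of Definition 3.1 at work, here is a short computation for a pair of vector fields on $\mathbb{R}^3$ of Heisenberg type. The fields $X_1, X_2$ and their normalization are our assumption for illustration (they are not defined in this section), and the sign of the bracket depends on the convention adopted for $[\cdot,\cdot]$; the conclusion does not.

```latex
% Illustrative example (fields X_1, X_2 are assumed, not taken from this section).
\begin{align*}
  X_1 &= \partial_x - \tfrac{y}{2}\,\partial_z, \qquad
  X_2 = \partial_y + \tfrac{x}{2}\,\partial_z
      \qquad \text{on } \mathbb{R}^3,\\
  [X_1, X_2] &= \Big( X_1\big(\tfrac{x}{2}\big) - X_2\big(-\tfrac{y}{2}\big) \Big)\partial_z
             = \big(\tfrac12 + \tfrac12\big)\partial_z = \partial_z,\\
  \mathrm{Lie}_q\{X_1,X_2\} &\supset \mathrm{span}\{X_1(q), X_2(q), \partial_z\}
             = T_q\mathbb{R}^3 \qquad \forall\, q \in \mathbb{R}^3 .
\end{align*}
```

So the family $\{X_1,X_2\}$ is bracket-generating although it spans only a two-dimensional subspace of $T_q\mathbb{R}^3$ at each point.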

When the vector bundle $\mathbf{U}$ admits a global trivialization we say that $(\mathbf{U},f)$ is a free sub-Riemannian structure.

A smooth manifold endowed with a sub-Riemannian structure (i.e., the triple $(M,\mathbf{U},f)$) is called a sub-Riemannian manifold. When the map $f : \mathbf{U} \to TM$ is fiberwise surjective, $(M,\mathbf{U},f)$ is called a Riemannian manifold (cf. Exercise 3.23).

Definition 3.3. Let $(M,\mathbf{U},f)$ be a sub-Riemannian manifold. The distribution is the family of subspaces $\{\mathcal{D}_q\}_{q \in M}$, where $\mathcal{D}_q := f(U_q) \subset T_q M$. We call $k(q) := \dim \mathcal{D}_q$ the rank of the sub-Riemannian structure at $q \in M$. We say that the sub-Riemannian structure $(\mathbf{U},f)$ on $M$ has constant rank if $k(q)$ is constant.

The set of horizontal vector fields $\mathcal{D} \subset \mathrm{Vec}(M)$ has the structure of a finitely generated $C^\infty(M)$-module, whose elements are vector fields tangent to the distribution at each point, i.e., $\mathcal{D}_q = \{X(q) \mid X \in \mathcal{D}\}$.

The rank of a sub-Riemannian structure $(M,\mathbf{U},f)$ satisfies
$$k(q) \le m, \quad \text{where } m = \operatorname{rank}\mathbf{U}, \tag{3.3}$$
$$k(q) \le n, \quad \text{where } n = \dim M. \tag{3.4}$$

In what follows we denote points in $\mathbf{U}$ as pairs $(q,u)$, where $q \in M$ is an element of the base and $u \in U_q$ is an element of the fiber. Following this notation we can write the value of $f$ at this point as $f(q,u)$ or $f_u(q)$. We prefer the second notation to stress that, for each $q \in M$, $f_u(q)$ is a vector in $T_q M$.

Definition 3.4. A Lipschitz curve $\gamma : [0,T] \to M$ is said to be admissible (or horizontal) for a sub-Riemannian structure if there exists a measurable and essentially bounded function
$$u : t \in [0,T] \mapsto u(t) \in U_{\gamma(t)}, \tag{3.5}$$
called the control function, such that
$$\dot\gamma(t) = f(\gamma(t),u(t)), \qquad \text{for a.e. } t \in [0,T]. \tag{3.6}$$
In this case we say that $u(\cdot)$ is a control corresponding to $\gamma$. Notice that different controls could correspond to the same trajectory.

Remark 3.5. Once we have chosen a local trivialization $O_q \times \mathbb{R}^m$ for the vector bundle $\mathbf{U}$, where $O_q$ is a neighborhood of a point $q \in M$, we can choose a basis in the fibers and the map $f$ is written $f(q,u) = \sum_{i=1}^m u_i f_i(q)$, where $m$ is the rank of $\mathbf{U}$. In this trivialization, a Lipschitz curve $\gamma : [0,T] \to M$ is admissible if there exists $u = (u_1,\dots,u_m) \in L^\infty([0,T],\mathbb{R}^m)$ such that
$$\dot\gamma(t) = \sum_{i=1}^m u_i(t) f_i(\gamma(t)), \qquad \text{for a.e. } t \in [0,T]. \tag{3.7}$$
Thanks to this local characterization and Theorem 2.14, for each initial condition $q \in M$ and $u \in L^\infty([0,T],\mathbb{R}^m)$ there exists an admissible curve $\gamma$, defined on a sufficiently small interval, such that $u$ is the control associated with $\gamma$ and $\gamma(0) = q$.
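The local formula (3.7) can be made concrete by integrating it for a constant control. The two vector fields, the control, and the initial point below are hypothetical choices of ours, used only to show what "the admissible curve associated with a control" means in coordinates.

```latex
% Illustrative example: f_1, f_2, the control u and the initial point are assumptions.
\begin{align*}
  &f_1 = \partial_x, \qquad f_2 = x\,\partial_y \quad \text{on } \mathbb{R}^2,
   \qquad u(t) \equiv (1,1), \qquad \gamma(0) = (0,0),\\
  &\dot\gamma(t) = u_1 f_1(\gamma(t)) + u_2 f_2(\gamma(t))
   \iff \dot x = 1, \quad \dot y = x,\\
  &\gamma(t) = \big(t, \tfrac{t^2}{2}\big), \qquad t \in [0,T].
\end{align*}
```

Here the control is bounded and the right-hand side is smooth, so the curve exists for every $T > 0$; in general the text only guarantees existence on a sufficiently small interval.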

[Figure 3.1: A horizontal curve.]

Remark 3.6. Notice that, for a curve to be admissible, it is not sufficient to satisfy $\dot\gamma(t) \in \mathcal{D}_{\gamma(t)}$ for almost every $t \in [0,T]$. Take for instance the two free sub-Riemannian structures on $\mathbb{R}^2$ having rank two and defined by
$$f(x,y,u_1,u_2) = (x,y,u_1,u_2 x), \qquad f'(x,y,u_1,u_2) = (x,y,u_1,u_2 x^2), \tag{3.8}$$
and let $\mathcal{D}$ and $\mathcal{D}'$ be the corresponding moduli of horizontal vector fields. It is easily seen that the curve $\gamma : [-1,1] \to \mathbb{R}^2$, $\gamma(t) = (t,t^2)$ satisfies $\dot\gamma(t) \in \mathcal{D}_{\gamma(t)}$ and $\dot\gamma(t) \in \mathcal{D}'_{\gamma(t)}$ for every $t \in [-1,1]$.

Moreover, $\gamma$ is admissible for $f$, since its corresponding control is $(u_1,u_2) = (1,2)$ for a.e. $t \in [-1,1]$, but it is not admissible for $f'$, since its corresponding control is uniquely determined as $(u_1(t),u_2(t)) = (1,2/t)$ for a.e. $t \in [-1,1]$, which is not essentially bounded.

This example shows that, for two different sub-Riemannian structures $(\mathbf{U},f)$ and $(\mathbf{U}',f')$ on the same manifold $M$, one can have $\mathcal{D}_q = \mathcal{D}'_q$ for every $q \in M$, but $\mathcal{D} \neq \mathcal{D}'$. Notice, however, that if the distribution has constant rank one has $\mathcal{D}_q = \mathcal{D}'_q$ for every $q \in M$ if and only if $\mathcal{D} = \mathcal{D}'$.

3.1.1 The minimal control and the length of an admissible curve

We start by defining the sub-Riemannian norm for vectors that belong to the distribution.

Definition 3.7. Let $v \in \mathcal{D}_q$. We define the sub-Riemannian norm of $v$ as follows
$$\|v\| := \min\{|u|,\ u \in U_q \text{ s.t. } v = f(q,u)\}. \tag{3.9}$$
Notice that since $f$ is linear with respect to $u$, the minimum in (3.9) is always attained at a unique point. Indeed the condition $f(q,\cdot) = v$ defines an affine subspace of $U_q$ (which is nonempty since $v \in \mathcal{D}_q$) and the minimum in (3.9) is uniquely attained at the orthogonal projection of the origin onto this subspace (see Figure 3.2).

Exercise 3.8. Show that $\|\cdot\|$ is a norm in $\mathcal{D}_q$. Moreover prove that it satisfies the parallelogram law, i.e., it is induced by a scalar product $\langle\cdot\,|\,\cdot\rangle_q$ on $\mathcal{D}_q$, that can be recovered by the polarization identity
$$\langle v\,|\,w\rangle_q = \tfrac14\|v+w\|^2 - \tfrac14\|v-w\|^2, \qquad v,w \in \mathcal{D}_q. \tag{3.10}$$
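The minimum in (3.9) can be computed by hand for the map $f(x,u_1,u_2) = u_1 + u_2$ that appears in the caption of Figure 3.2. The computation below is a sketch for this single example, with the fiber $\mathbb{R}^2$ carrying the standard Euclidean norm (an assumption on our side); it also checks the polarization identity (3.10) in this case.

```latex
% Worked example for f(x,u_1,u_2) = u_1 + u_2, standard Euclidean norm on the fiber assumed.
\begin{align*}
  \|v\| &= \min\Big\{ \sqrt{u_1^2 + u_2^2} \ \Big|\ u_1 + u_2 = v \Big\},\\
  &\text{the minimizer is the orthogonal projection of } 0 \text{ onto the line } u_1 + u_2 = v:
     \quad u^* = \big(\tfrac{v}{2}, \tfrac{v}{2}\big),\\
  \|v\| &= \sqrt{\tfrac{v^2}{4} + \tfrac{v^2}{4}} = \frac{|v|}{\sqrt{2}},\\
  \langle v \,|\, w \rangle &= \tfrac14\|v+w\|^2 - \tfrac14\|v-w\|^2
     = \tfrac14\cdot\tfrac{(v+w)^2 - (v-w)^2}{2} = \frac{vw}{2}.
\end{align*}
```

In particular the vector $f(x,1,0) = 1$ has norm $1/\sqrt{2}$, strictly smaller than the norm of the control $(1,0)$ that produces it, because $f$ is not injective on the fiber.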

[Figure 3.2: The norm of a vector $v$ for $f(x,u_1,u_2) = u_1 + u_2$.]

Exercise 3.9. Let $u_1,\dots,u_m \in U_q$ be an orthonormal basis for $U_q$. Define $v_i = f(q,u_i)$. Show that if $f(q,\cdot)$ is injective then $v_1,\dots,v_m$ is an orthonormal basis for $\mathcal{D}_q$.

An admissible curve $\gamma : [0,T] \to M$ is Lipschitz, hence differentiable at almost every point. Hence the unique control $t \mapsto u^*(t)$ associated with $\gamma$ and realizing the minimum in (3.9) is well defined.

Definition 3.10. Given an admissible curve $\gamma : [0,T] \to M$, we define
$$u^*(t) := \operatorname{argmin}\{|u|,\ u \in U_{\gamma(t)} \text{ s.t. } \dot\gamma(t) = f(\gamma(t),u)\} \tag{3.11}$$
for every differentiability point of $\gamma$. We say that the control $u^*$ is the minimal control associated with $\gamma$.

We stress that $u^*(t)$ is pointwise defined for a.e. $t \in [0,T]$. The proof of the following crucial lemma is postponed to Section 3.A.

Lemma 3.11. Let $\gamma : [0,T] \to M$ be an admissible curve. Then its minimal control $u^*(\cdot)$ is measurable and essentially bounded on $[0,T]$.

Remark 3.12. If the admissible curve $\gamma : [0,T] \to M$ is differentiable, its minimal control is defined everywhere on $[0,T]$. Nevertheless, it may fail to be continuous, in general. Consider, as in Remark 3.6, the free sub-Riemannian structure on $\mathbb{R}^2$
$$f(x,y,u_1,u_2) = (x,y,u_1,u_2 x), \tag{3.12}$$
and let $\gamma : [-1,1] \to \mathbb{R}^2$ be defined by $\gamma(t) = (t,t^2)$. Its minimal control $u^*(t)$ satisfies $(u_1^*(t),u_2^*(t)) = (1,2)$ when $t \neq 0$, while $(u_1^*(0),u_2^*(0)) = (1,0)$, hence it is not continuous.

Thanks to Lemma 3.11 we are allowed to introduce the following definition.

Definition 3.13. Let $\gamma : [0,T] \to M$ be an admissible curve. We define the sub-Riemannian length of $\gamma$ as
$$\ell(\gamma) := \int_0^T \|\dot\gamma(t)\|\,dt. \tag{3.13}$$

We say that $\gamma$ is length-parametrized (or arclength parametrized) if $\|\dot\gamma(t)\| = 1$ for a.e. $t \in [0,T]$. Notice that for a length-parametrized curve we have that $\ell(\gamma) = T$.

Formula (3.13) says that the length of an admissible curve is the integral of the norm of its minimal control:
$$\ell(\gamma) = \int_0^T |u^*(t)|\,dt. \tag{3.14}$$
In particular any admissible curve has finite length.

Lemma 3.14. The length of an admissible curve is invariant by Lipschitz reparametrization.

Proof. Let $\gamma : [0,T] \to M$ be an admissible curve and $\varphi : [0,T'] \to [0,T]$ a Lipschitz reparametrization, i.e., a Lipschitz and monotone surjective map. Consider the reparametrized curve
$$\gamma_\varphi : [0,T'] \to M, \qquad \gamma_\varphi := \gamma \circ \varphi.$$
First observe that $\gamma_\varphi$ is a composition of Lipschitz functions, hence Lipschitz. Moreover $\gamma_\varphi$ is admissible since, by the linearity of $f$, it has minimal control $(u^* \circ \varphi)\,\dot\varphi \in L^\infty$, where $u^*$ is the minimal control of $\gamma$. Using the change of variables $t = \varphi(s)$, one gets
$$\ell(\gamma_\varphi) = \int_0^{T'} \|\dot\gamma_\varphi(s)\|\,ds = \int_0^{T'} |u^*(\varphi(s))|\,|\dot\varphi(s)|\,ds = \int_0^T |u^*(t)|\,dt = \int_0^T \|\dot\gamma(t)\|\,dt = \ell(\gamma). \tag{3.15}$$

Lemma 3.15. Every admissible curve of positive length is a Lipschitz reparametrization of a length-parametrized admissible one.

Proof. Let $\psi : [0,T] \to M$ be an admissible curve with minimal control $u^*$. Consider the Lipschitz monotone function $\varphi : [0,T] \to [0,\ell(\psi)]$ defined by
$$\varphi(t) := \int_0^t |u^*(\tau)|\,d\tau.$$
Notice that if $\varphi(t_1) = \varphi(t_2)$, the monotonicity of $\varphi$ ensures $\psi(t_1) = \psi(t_2)$ (indeed $u^* = 0$ a.e. between $t_1$ and $t_2$, so $\psi$ is constant there). Hence we are allowed to define $\gamma : [0,\ell(\psi)] \to M$ by
$$\gamma(s) := \psi(t), \qquad \text{if } s = \varphi(t) \text{ for some } t \in [0,T].$$
In other words, it holds $\psi = \gamma \circ \varphi$. To show that $\gamma$ is Lipschitz let us first show that there exists a constant $C > 0$ such that, for every $t_0,t_1 \in [0,T]$ one has, in some local coordinates (where $|\cdot|$ denotes the Euclidean norm in coordinates)
$$|\psi(t_1) - \psi(t_0)| \le C \int_{t_0}^{t_1} |u^*(\tau)|\,d\tau.$$

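Indeed fix a compact set $K \subset M$ such that $\psi([0,T]) \subset K$ and set $C := \max_{x \in K} \big(\sum_{i=1}^m |f_i(x)|^2\big)^{1/2}$. Then
$$|\psi(t_1) - \psi(t_0)| \le \int_{t_0}^{t_1} \Big|\sum_{i=1}^m u_i^*(t) f_i(\psi(t))\Big|\,dt \le \int_{t_0}^{t_1} \sqrt{\sum_{i=1}^m u_i^*(t)^2}\,\sqrt{\sum_{i=1}^m |f_i(\psi(t))|^2}\,dt \le C \int_{t_0}^{t_1} |u^*(t)|\,dt.$$
Hence if $s_1 = \varphi(t_1)$ and $s_0 = \varphi(t_0)$ one has
$$|\gamma(s_1) - \gamma(s_0)| = |\psi(t_1) - \psi(t_0)| \le C \int_{t_0}^{t_1} |u^*(\tau)|\,d\tau = C\,|s_1 - s_0|,$$
which proves that $\gamma$ is Lipschitz. In particular $\dot\gamma(s)$ exists for a.e. $s \in [0,\ell(\psi)]$.

We are going to prove that $\gamma$ is admissible and its minimal control has norm one. Define, for every $s$ such that $s = \varphi(t)$, $\dot\varphi(t)$ exists and $\dot\varphi(t) \neq 0$, the control
$$v(s) := \frac{u^*(t)}{\dot\varphi(t)} = \frac{u^*(t)}{|u^*(t)|}.$$
By Exercise 3.16 the control $v$ is defined for a.e. $s$. Moreover, by construction, $|v(s)| = 1$ for a.e. $s$ and $v$ is the minimal control associated with $\gamma$.

Exercise 3.16. Show that for a Lipschitz and monotone function $\varphi : [0,T] \to \mathbb{R}$, the Lebesgue measure of the set $\{s \in \mathbb{R} \mid s = \varphi(t),\ \dot\varphi(t) \text{ exists},\ \dot\varphi(t) = 0\}$ is zero.

By the previous discussion, in what follows, it will often be convenient to assume that admissible curves are length-parametrized (or parametrized such that $\|\dot\gamma(t)\|$ is constant).

3.1.2 Equivalence of sub-Riemannian structures

In this section we introduce the notion of equivalence for sub-Riemannian structures on the same base manifold $M$ and the notion of isometry between sub-Riemannian manifolds.

Definition 3.17. Let $(\mathbf{U},f)$, $(\mathbf{U}',f')$ be two sub-Riemannian structures on a smooth manifold $M$. They are said to be equivalent if the following conditions are satisfied:
(i) there exist a Euclidean bundle $\mathbf{V}$ and two surjective vector bundle morphisms $p : \mathbf{V} \to \mathbf{U}$ and $p' : \mathbf{V} \to \mathbf{U}'$ such that the following diagram is commutative, i.e., $f \circ p = f' \circ p'$:
$$\begin{array}{ccccc}
 & & \mathbf{U} & & \\
 & {\scriptstyle p}\nearrow & & \searrow{\scriptstyle f} & \\
\mathbf{V} & & & & TM \\
 & {\scriptstyle p'}\searrow & & \nearrow{\scriptstyle f'} & \\
 & & \mathbf{U}' & &
\end{array} \tag{3.16}$$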

(ii) the projections $p$, $p'$ are compatible with the scalar product, i.e., it holds
$$|u| = \min\{|v|,\ p(v) = u\}, \quad \forall\, u \in \mathbf{U}, \qquad |u'| = \min\{|v|,\ p'(v) = u'\}, \quad \forall\, u' \in \mathbf{U}'.$$

Remark 3.18. If $(\mathbf{U},f)$ and $(\mathbf{U}',f')$ are equivalent sub-Riemannian structures on $M$, then:
(a) the distributions $\mathcal{D}_q$ and $\mathcal{D}'_q$ defined by $f$ and $f'$ coincide, since $f(U_q) = f'(U'_q)$ for all $q \in M$.
(b) for each $w \in \mathcal{D}_q$ we have $\|w\| = \|w\|'$, where $\|\cdot\|$ and $\|\cdot\|'$ are the norms induced by $(\mathbf{U},f)$ and $(\mathbf{U}',f')$ respectively.
In particular the length of an admissible curve for two equivalent sub-Riemannian structures is the same.

Remark 3.19. Notice that (i) is satisfied (with the vector bundle $\mathbf{V}$ possibly non-Euclidean) if and only if the two moduli of horizontal vector fields $\mathcal{D}$ and $\mathcal{D}'$ defined by $\mathbf{U}$ and $\mathbf{U}'$ are equal (cf. Definition 3.2).

Definition 3.20. Let $M$ be a sub-Riemannian manifold. We define the minimal bundle rank of $M$ as the infimum of the ranks of bundles that induce equivalent structures on $M$. Given $q \in M$, the local minimal bundle rank of $M$ at $q$ is the minimal bundle rank of the structure restricted to a sufficiently small neighborhood $O_q$ of $q$.

Exercise 3.21. Prove that the free sub-Riemannian structure on $\mathbb{R}^2$ defined by $f : \mathbb{R}^2 \times \mathbb{R}^3 \to T\mathbb{R}^2$, $f(x,y,u_1,u_2,u_3) = (x,y,u_1,u_2 x + u_3 y)$, has non-constant local minimal bundle rank.

For equivalence classes of sub-Riemannian structures we introduce the following definition.

Definition 3.22. Two equivalence classes of sub-Riemannian manifolds are said to be isometric if there exist two representatives $(M,\mathbf{U},f)$, $(M',\mathbf{U}',f')$, a diffeomorphism $\phi : M \to M'$ and an isomorphism [1] of Euclidean bundles $\psi : \mathbf{U} \to \mathbf{U}'$ such that the following diagram is commutative, i.e., $\phi_* \circ f = f' \circ \psi$:
$$\begin{array}{ccc}
\mathbf{U} & \xrightarrow{\ \ f\ \ } & TM \\
{\scriptstyle \psi}\downarrow & & \downarrow{\scriptstyle \phi_*} \\
\mathbf{U}' & \xrightarrow{\ \ f'\ \ } & TM'
\end{array} \tag{3.17}$$

[1] Isomorphism of bundles in the broad sense: it is fiberwise but is not obliged to map a fiber into the same fiber.

3.1.3 Examples

Our definition of sub-Riemannian manifold is quite general. In the following we list some classical geometric structures which are included in our setting.

1. Riemannian structures.
Classically a Riemannian manifold is defined as a pair $(M,\langle\cdot\,|\,\cdot\rangle)$, where $M$ is a smooth manifold and $\langle\cdot\,|\,\cdot\rangle_q$ is a family of scalar products on $T_q M$, smoothly depending on $q \in M$. This definition is included in Definition 3.2 by taking $\mathbf{U} = TM$ endowed with the Euclidean structure induced by $\langle\cdot\,|\,\cdot\rangle$ and $f : TM \to TM$ the identity map.

Exercise 3.23. Show that every Riemannian manifold in the sense of Definition 3.2 is indeed equivalent to a Riemannian structure in the classical sense above (cf. Exercise 3.8).

2. Constant rank sub-Riemannian structures.
Classically a constant rank sub-Riemannian manifold is a triple $(M,D,\langle\cdot\,|\,\cdot\rangle)$, where $D$ is a vector subbundle of $TM$ and $\langle\cdot\,|\,\cdot\rangle_q$ is a family of scalar products on $D_q$, smoothly depending on $q \in M$. This definition is included in Definition 3.2 by taking $\mathbf{U} = D$, endowed with its Euclidean structure, and $f : D \to TM$ the canonical inclusion.

3. Almost-Riemannian structures.
An almost-Riemannian structure on $M$ is a sub-Riemannian structure $(\mathbf{U},f)$ on $M$ such that its local minimal bundle rank is equal to the dimension of the manifold, at every point.

4. Free sub-Riemannian structures.
Let $\mathbf{U} = M \times \mathbb{R}^m$ be the trivial Euclidean bundle of rank $m$ on $M$. A point in $\mathbf{U}$ can be written as $(q,u)$, where $q \in M$ and $u = (u_1,\dots,u_m) \in \mathbb{R}^m$. If we denote by $\{e_1,\dots,e_m\}$ an orthonormal basis of $\mathbb{R}^m$, then we can define globally $m$ smooth vector fields on $M$ by $f_i(q) := f(q,e_i)$ for $i = 1,\dots,m$. Then we have
$$f(q,u) = f\Big(q,\sum_{i=1}^m u_i e_i\Big) = \sum_{i=1}^m u_i f_i(q), \qquad q \in M. \tag{3.18}$$
In this case, the problem of finding an admissible curve joining two fixed points $q_0,q_1 \in M$ and with minimal length is rewritten as the optimal control problem
$$\dot\gamma(t) = \sum_{i=1}^m u_i(t) f_i(\gamma(t)), \qquad \int_0^T \sqrt{\sum_{i=1}^m u_i(t)^2}\;dt \to \min, \qquad \gamma(0) = q_0,\quad \gamma(T) = q_1. \tag{3.19}$$
For a free sub-Riemannian structure, the set of vector fields $f_1,\dots,f_m$ built as above is called a generating family. Notice that, in general, a generating family is not orthonormal when $f$ is not injective.

5. Surfaces in $\mathbb{R}^3$ as free sub-Riemannian structures.
Due to topological constraints, in general it is not possible to regard a surface as a free sub-Riemannian structure of rank 2, i.e., defined by a pair of globally defined orthonormal vector fields. However, it is always possible to regard it as a free sub-Riemannian structure of rank 3.

Indeed, for an embedded surface $M$ in $\mathbb{R}^3$, consider the trivial Euclidean bundle $\mathbf{U} = M \times \mathbb{R}^3$, where points are denoted as usual $(q,u)$, with $u \in \mathbb{R}^3$, $q \in M$, and the map
$$f : \mathbf{U} \to TM, \qquad f(q,u) = \pi_q^\perp(u) \in T_q M, \tag{3.20}$$
where $\pi_q^\perp : \mathbb{R}^3 \to T_q M \subset \mathbb{R}^3$ is the orthogonal projection. Notice that $f$ is a surjective bundle map and the set of vector fields $\{\pi_q^\perp(\partial_x),\pi_q^\perp(\partial_y),\pi_q^\perp(\partial_z)\}$ is a generating family for this structure.

Exercise 3.24. Show that $(\mathbf{U},f)$ defined in (3.20) is equivalent to the Riemannian structure on $M$ induced by the embedding in $\mathbb{R}^3$.

3.1.4 Every sub-Riemannian structure is equivalent to a free one

The purpose of this section is to show that every sub-Riemannian structure $(\mathbf{U},f)$ on $M$ is equivalent to a sub-Riemannian structure $(\mathbf{U}',f')$ where $\mathbf{U}'$ is a trivial bundle with sufficiently big rank.

Lemma 3.25. Let $M$ be an $n$-dimensional smooth manifold and $\pi : E \to M$ a smooth vector bundle of rank $m$. Then there exists a vector bundle $\pi' : E' \to M$ with $\operatorname{rank} E' \le 2n + m$ such that $E \oplus E'$ is a trivial vector bundle.

Proof. Recall that $E$, as a smooth manifold, has dimension
$$\dim E = \dim M + \operatorname{rank} E = n + m.$$
Consider the map $i : M \to E$ which embeds $M$ into the vector bundle $E$ as the zero section $M_0 = i(M)$. If we denote by $T_M E := i^*(TE)$ the pullback vector bundle, i.e., the restriction of $TE$ to the section $M_0$, we have the isomorphism (as vector bundles on $M$)
$$T_M E \simeq E \oplus TM. \tag{3.21}$$
Eq. (3.21) is a consequence of the fact that every fibre $E_q$, being a vector space, is canonically isomorphic to its tangent space $T_q E_q$, so that
$$T_q E = T_q E_q \oplus T_q M \simeq E_q \oplus T_q M, \qquad \forall\, q \in M.$$
By the Whitney theorem we have an immersion (nonlinear on fibers, in general)
$$\Psi : E \to \mathbb{R}^N, \qquad \Psi_* : T_M E \subset TE \to T\mathbb{R}^N,$$
for $N = 2(n+m)$, and $\Psi_*$ is injective as a bundle map, i.e., $T_M E$ is a sub-bundle of $T\mathbb{R}^N \simeq \mathbb{R}^N \times \mathbb{R}^N$. Thus we can choose as a complement $E'$ the orthogonal bundle (on the base $M$) with respect to the Euclidean metric in $\mathbb{R}^N$, i.e.,
$$E' = \bigcup_{q \in M} E'_q, \qquad E'_q = (T_q E_q \oplus T_q M)^\perp,$$
and considering $\bar{E} := T_M E \oplus E'$ we have that $\bar{E}$ is trivial, since its fibers are sums of orthogonal complements. By (3.21), $\bar{E} \simeq E \oplus (TM \oplus E')$, and the bundle $TM \oplus E'$, which has rank $n + (N - n - m) = 2n + m$, is the complement required in the statement.
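A familiar special case may help to visualize Lemma 3.25. The choice $M = S^2 \subset \mathbb{R}^3$ and $E = TS^2$ below is our own illustration and does not follow the Whitney-type construction of the proof; it only exhibits a complement whose sum with $E$ is trivial, with rank well below the bound $2n + m$.

```latex
% Illustrative example (M = S^2 and E = TS^2 are our choice, not from the text).
\begin{align*}
  &M = S^2 \subset \mathbb{R}^3, \qquad E = TS^2, \qquad n = 2, \quad m = 2,\\
  &E'_q := \mathbb{R}\,q = (T_q S^2)^{\perp} \subset \mathbb{R}^3
     \qquad (\text{normal line at } q), \qquad \operatorname{rank} E' = 1,\\
  &E_q \oplus E'_q = T_q S^2 \oplus \mathbb{R}\,q = \mathbb{R}^3
     \ \Longrightarrow\ E \oplus E' \simeq S^2 \times \mathbb{R}^3 \ \text{(trivial)},\\
  &\operatorname{rank} E' = 1 \le 2n + m = 6 .
\end{align*}
```

The bundle $TS^2$ itself is not trivial, so a nonzero complement is really needed here; the lemma only guarantees an upper bound on its rank, which in this example is far from sharp.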

Corollary 3.26. Every sub-Riemannian structure $(\mathbf{U},f)$ on $M$ is equivalent to a sub-Riemannian structure $(\bar{\mathbf{U}},\bar{f})$ where $\bar{\mathbf{U}}$ is a trivial bundle.

Proof. By Lemma 3.25 there exists a vector bundle $\mathbf{U}'$ such that the direct sum $\bar{\mathbf{U}} := \mathbf{U} \oplus \mathbf{U}'$ is a trivial bundle. Endow $\mathbf{U}'$ with any metric structure $g'$. Define a metric on $\bar{\mathbf{U}}$ in such a way that $\bar{g}(u+u',v+v') = g(u,v) + g'(u',v')$ on each fiber $\bar{U}_q = U_q \oplus U'_q$. Notice that $U_q$ and $U'_q$ are orthogonal subspaces of $\bar{U}_q$ with respect to $\bar{g}$.

Let us define the sub-Riemannian structure $(\bar{\mathbf{U}},\bar{f})$ on $M$ by
$$\bar{f} : \bar{\mathbf{U}} \to TM, \qquad \bar{f} := f \circ p_1,$$
where $p_1 : \mathbf{U} \oplus \mathbf{U}' \to \mathbf{U}$ denotes the projection on the first factor. By construction, the diagram
$$\begin{array}{ccccc}
 & & \bar{\mathbf{U}} & & \\
 & {\scriptstyle \mathrm{Id}}\nearrow & & \searrow{\scriptstyle \bar{f}} & \\
\mathbf{U} \oplus \mathbf{U}' & & & & TM \\
 & {\scriptstyle p_1}\searrow & & \nearrow{\scriptstyle f} & \\
 & & \mathbf{U} & &
\end{array} \tag{3.22}$$
is commutative. Moreover condition (ii) of Definition 3.17 is satisfied since for every $\bar{u} = u + u'$, with $u \in U_q$ and $u' \in U'_q$, we have $|\bar{u}|^2 = |u|^2 + |u'|^2$, hence $|u| = \min\{|\bar{u}|,\ p_1(\bar{u}) = u\}$.

Since every sub-Riemannian structure is equivalent to a free one, in what follows we can assume that there exists a global generating family, i.e., a family $f_1,\dots,f_m$ of vector fields globally defined on $M$ such that every admissible curve of the sub-Riemannian structure satisfies
$$\dot\gamma(t) = \sum_{i=1}^m u_i(t) f_i(\gamma(t)). \tag{3.23}$$
Moreover, by the classical Gram-Schmidt procedure, we can assume that the $f_i$ are the image of an orthonormal frame defined on the fiber (cf. Example 4 of Section 3.1.3).

Under these assumptions the length of an admissible curve $\gamma$ is given by
$$\ell(\gamma) = \int_0^T |u^*(t)|\,dt = \int_0^T \sqrt{\sum_{i=1}^m u_i^*(t)^2}\;dt,$$
where $u^*(t)$ is the minimal control associated with $\gamma$.

Notice that Corollary 3.26 implies that the modulus of horizontal vector fields $\mathcal{D}$ is globally generated by $f_1,\dots,f_m$.

Remark 3.27. The integral curve $\gamma(t) = e^{t f_i}(q_0)$, defined on $[0,T]$, of an element $f_i$ of a generating family $\mathcal{F} = \{f_1,\dots,f_m\}$ is admissible and $\ell(\gamma) \le T$. If $f_1,\dots,f_m$ are linearly independent then they are an orthonormal frame and $\ell(\gamma) = T$.
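The inequality $\ell(\gamma) \le T$ in Remark 3.27 can be strict when the generating family is linearly dependent. The one-dimensional structure below is a hypothetical example of ours, chosen because everything can be computed explicitly.

```latex
% Illustrative example (the structure on M = R with f_1 = f_2 is an assumption of ours).
\begin{align*}
  &M = \mathbb{R}, \qquad f_1 = f_2 = \partial_x, \qquad
   f(x,u) = (u_1 + u_2)\,\partial_x,\\
  &\gamma(t) = e^{t f_1}(x_0) = x_0 + t, \qquad \dot\gamma(t) = \partial_x,
   \qquad t \in [0,T],\\
  &\|\partial_x\| = \min\{\,|u| : u_1 + u_2 = 1\,\} = \tfrac{1}{\sqrt 2},
   \qquad u^*(t) = \big(\tfrac12,\tfrac12\big),\\
  &\ell(\gamma) = \int_0^T |u^*(t)|\,dt = \frac{T}{\sqrt 2} < T .
\end{align*}
```

The control $(1,0)$ also generates $\gamma$ but is not minimal; the length is computed with the minimal control, which is why the bound $T$ is not attained.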

3.2 Sub-Riemannian distance and Chow-Rashevskii Theorem

In this section we introduce the sub-Riemannian distance between two points as the infimum of the length of admissible curves joining them.

Recall that, in the definition of sub-Riemannian manifold, $M$ is assumed to be connected. Moreover, thanks to the construction of Section 3.1.4, in what follows we can assume that the sub-Riemannian structure is free, with generating family $\mathcal{F} = \{f_1,\dots,f_m\}$. Notice that, by definition, $\mathcal{F}$ is assumed to be bracket-generating.

Definition 3.28. Let $M$ be a sub-Riemannian manifold and $q_0,q_1 \in M$. The sub-Riemannian distance (or Carnot-Carathéodory distance) between $q_0$ and $q_1$ is
$$d(q_0,q_1) = \inf\{\ell(\gamma) \mid \gamma : [0,T] \to M \text{ admissible},\ \gamma(0) = q_0,\ \gamma(T) = q_1\}. \tag{3.24}$$
One of the purposes of this section is to show that, thanks to the bracket-generating condition, (3.24) is well defined, namely for every $q_0,q_1 \in M$ there exists an admissible curve that joins $q_0$ to $q_1$, hence $d(q_0,q_1) < +\infty$.

Theorem 3.29 (Chow-Rashevskii). Let $M$ be a sub-Riemannian manifold. Then
(i) $(M,d)$ is a metric space,
(ii) the topology induced by $(M,d)$ is equivalent to the manifold topology.
In particular, $d : M \times M \to \mathbb{R}$ is continuous.

In what follows $B(q,r)$ (sometimes denoted also $B_r(q)$) is the (open) sub-Riemannian ball of radius $r$ and center $q$:
$$B(q,r) := \{q' \in M \mid d(q,q') < r\}.$$

The rest of this section is devoted to the proof of Theorem 3.29. To prove it, we have to show that $d$ is actually a distance, i.e.,
(a) $d(q_0,q_1) < +\infty$ for all $q_0,q_1 \in M$,
(b) $d(q_0,q_1) = 0$ if and only if $q_0 = q_1$,
(c) $d(q_0,q_1) = d(q_1,q_0)$ and $d(q_0,q_2) \le d(q_0,q_1) + d(q_1,q_2)$ for all $q_0,q_1,q_2 \in M$,
and the equivalence between the metric and the manifold topology: for every $q_0 \in M$ we have
(d) for every $\varepsilon > 0$ there exists a neighborhood $O_{q_0}$ of $q_0$ such that $O_{q_0} \subset B(q_0,\varepsilon)$,
(e) for every neighborhood $O_{q_0}$ of $q_0$ there exists $\delta > 0$ such that $B(q_0,\delta) \subset O_{q_0}$.

3.2.1 Proof of Chow-Rashevskii Theorem

The symmetry of $d$ is a direct consequence of the fact that if $\gamma : [0,T] \to M$ is admissible, then the curve $\bar\gamma : [0,T] \to M$ defined by $\bar\gamma(t) = \gamma(T-t)$ is admissible and $\ell(\bar\gamma) = \ell(\gamma)$. The triangular inequality follows from the fact that, given two admissible curves $\gamma_1 : [0,T_1] \to M$ and $\gamma_2 : [0,T_2] \to M$ such that $\gamma_1(T_1) = \gamma_2(0)$, their concatenation
$$\gamma : [0,T_1+T_2] \to M, \qquad \gamma(t) = \begin{cases} \gamma_1(t), & t \in [0,T_1], \\ \gamma_2(t-T_1), & t \in [T_1,T_1+T_2], \end{cases} \tag{3.25}$$
is still admissible. These two arguments prove item (c).

We divide the rest of the proof of the Theorem in the following steps.
S1. We prove that, for every $q_0 \in M$, there exists a neighborhood $O_{q_0}$ of $q_0$ such that $d(q_0,\cdot)$ is finite and continuous in $O_{q_0}$. This proves (d).
S2. We prove that $d$ is finite on $M \times M$. This proves (a).
S3. We prove (b) and (e).

To prove Step 1 we first need the following lemmas.

Lemma 3.30. Let $N \subset M$ be a submanifold and $\mathcal{F} \subset \mathrm{Vec}(M)$ be a family of vector fields tangent to $N$, i.e., $X(q) \in T_q N$ for every $q \in N$ and $X \in \mathcal{F}$. Then for all $q \in N$ we have $\mathrm{Lie}_q\mathcal{F} \subset T_q N$. In particular $\dim \mathrm{Lie}_q\mathcal{F} \le \dim N$.

Proof. Let $X \in \mathcal{F}$. As a consequence of the local existence and uniqueness of solutions of the two Cauchy problems
$$\begin{cases} \dot q = X(q), & q \in M, \\ q(0) = q_0, & q_0 \in N, \end{cases} \qquad \text{and} \qquad \begin{cases} \dot q = X|_N(q), & q \in N, \\ q(0) = q_0, & q_0 \in N, \end{cases}$$
it follows that $e^{tX}(q) \in N$ for every $q \in N$ and $t$ small enough. This property, together with the definition of Lie bracket (see formula (2.27)), implies that, if $X,Y$ are tangent to $N$, the vector field $[X,Y]$ is tangent to $N$ as well. Iterating this argument we get that $\mathrm{Lie}_q\mathcal{F} \subset T_q N$ for every $q \in N$, from which the conclusion follows.

Lemma 3.31. Let $M$ be an $n$-dimensional sub-Riemannian manifold with generating family $\mathcal{F} = \{f_1,\dots,f_m\}$. For every $q_0 \in M$ and every neighborhood $V$ of the origin in $\mathbb{R}^n$ there exist $\hat{s} = (\hat{s}_1,\dots,\hat{s}_n) \in V$ and a choice of $n$ vector fields $f_{i_1},\dots,f_{i_n} \in \mathcal{F}$, such that $\hat{s}$ is a regular point of the map
$$\psi : \mathbb{R}^n \to M, \qquad \psi(s_1,\dots,s_n) = e^{s_n f_{i_n}} \circ \cdots \circ e^{s_1 f_{i_1}}(q_0).$$

Remark 3.32. Notice that, if $\mathcal{D}_{q_0} \neq T_{q_0} M$, then $\hat{s} = 0$ cannot be a regular point of the map $\psi$. Indeed, for $s = 0$, the image of the differential of $\psi$ at $0$ is $\mathrm{span}\{f_{i_j}(q_0),\ j = 1,\dots,n\} \subset \mathcal{D}_{q_0}$ and the differential of $\psi$ cannot be surjective.

We stress that, in the choice of $f_{i_1},\dots,f_{i_n} \in \mathcal{F}$, a vector field can appear more than once, as for instance in the case $m < n$.

Proof of Lemma 3.31. We prove the lemma by steps.

1. There exists a vector field $f_{i_1} \in \mathcal{F}$ such that $f_{i_1}(q_0) \neq 0$, otherwise all vector fields in $\mathcal{F}$ vanish at $q_0$ and $\dim \mathrm{Lie}_{q_0}\mathcal{F} = 0$, which contradicts the bracket-generating condition. Then, for $|s_1|$ small enough, the map
$$\phi_1 : s_1 \mapsto e^{s_1 f_{i_1}}(q_0)$$
is a local diffeomorphism onto its image $\Sigma_1$. If $\dim M = 1$ the Lemma is proved.

2. Assume $\dim M \ge 2$. Then there exist $t_1^1 \in \mathbb{R}$, with $|t_1^1|$ small enough, and $f_{i_2} \in \mathcal{F}$ such that, if we denote by $q_1 = e^{t_1^1 f_{i_1}}(q_0)$, the vector $f_{i_2}(q_1)$ is not tangent to $\Sigma_1$. Otherwise, by Lemma 3.30, $\dim \mathrm{Lie}_q\mathcal{F} = 1$, which contradicts the bracket-generating condition. Then the map
$$\phi_2 : (s_1,s_2) \mapsto e^{s_2 f_{i_2}} \circ e^{s_1 f_{i_1}}(q_0)$$
is a local diffeomorphism near $(t_1^1,0)$ onto its image $\Sigma_2$. Indeed the vectors
$$\frac{\partial\phi_2}{\partial s_1}\bigg|_{(t_1^1,0)} \in T_{q_1}\Sigma_1, \qquad \frac{\partial\phi_2}{\partial s_2}\bigg|_{(t_1^1,0)} = f_{i_2}(q_1),$$
are linearly independent by construction. If $\dim M = 2$ the Lemma is proved.

3. Assume $\dim M \ge 3$. Then there exist $t_2^1,t_2^2$, with $|t_2^1 - t_1^1|$ and $|t_2^2|$ small enough, and $f_{i_3} \in \mathcal{F}$ such that, if $q_2 = e^{t_2^2 f_{i_2}} \circ e^{t_2^1 f_{i_1}}(q_0)$, we have that $f_{i_3}(q_2)$ is not tangent to $\Sigma_2$. Otherwise, by Lemma 3.30, $\dim \mathrm{Lie}_{q_1}\mathcal{D} = 2$, which contradicts the bracket-generating condition. Then the map
$$\phi_3 : (s_1,s_2,s_3) \mapsto e^{s_3 f_{i_3}} \circ e^{s_2 f_{i_2}} \circ e^{s_1 f_{i_1}}(q_0)$$
is a local diffeomorphism near $(t_2^1,t_2^2,0)$. Indeed the vectors
$$\frac{\partial\phi_3}{\partial s_1}\bigg|_{(t_2^1,t_2^2,0)},\ \frac{\partial\phi_3}{\partial s_2}\bigg|_{(t_2^1,t_2^2,0)} \in T_{q_2}\Sigma_2, \qquad \frac{\partial\phi_3}{\partial s_3}\bigg|_{(t_2^1,t_2^2,0)} = f_{i_3}(q_2),$$
are linearly independent, since the last one is transversal to $T_{q_2}\Sigma_2$ by construction, while the first two are linearly independent since $\phi_3(s_1,s_2,0) = \phi_2(s_1,s_2)$ and $\phi_2$ is a local diffeomorphism at $(t_2^1,t_2^2)$, which is close to $(t_1^1,0)$.

Repeating the same argument $n$ times (with $n = \dim M$), the lemma is proved.

Proof of Step 1. Thanks to Lemma 3.31 there exists a neighborhood $\hat{V} \subset V$ of $\hat{s}$ such that $\psi$ is a diffeomorphism from $\hat{V}$ to $\psi(\hat{V})$, see Figure 3.3. We stress that in general $q_0 = \psi(0)$ is not contained in $\psi(\hat{V})$, cf. Remark 3.32.

To build a local diffeomorphism whose image contains $q_0$, we consider the map
$$\hat\psi : \mathbb{R}^n \to M, \qquad \hat\psi(s_1,\dots,s_n) = e^{-\hat{s}_1 f_{i_1}} \circ \cdots \circ e^{-\hat{s}_n f_{i_n}} \circ \psi(s_1,\dots,s_n),$$
which has the following property: $\hat\psi$ is a diffeomorphism from a neighborhood of $\hat{s} \in V$, that we still denote $\hat{V}$, to a neighborhood of $\hat\psi(\hat{s}) = q_0$.

Fix now $\varepsilon > 0$ and apply the construction above where $V$ is the neighborhood of the origin in $\mathbb{R}^n$ defined by $V = \{s \in \mathbb{R}^n,\ \sum_{i=1}^n |s_i| < \varepsilon\}$. Let us show that the claim of Step 1 holds with $O_{q_0} = \hat\psi(\hat{V})$. Indeed, for every $q \in \hat\psi(\hat{V})$, let $s = (s_1,\dots,s_n)$ be such that $q = \hat\psi(s)$, and denote by $\gamma$ the admissible curve joining $q_0$ to $q$, built by $2n$ pieces, as in Figure 3.4. In other words $\gamma$ is the concatenation of integral curves of the vector fields $f_{i_j}$, i.e., admissible curves of the form $t \mapsto e^{t f_{i_j}}(q)$ defined on some interval $[0,T]$, whose length is less than or equal to $T$ (cf. Remark 3.27). Since $s,\hat{s} \in \hat{V} \subset V$, it follows that
$$d(q_0,q) \le \ell(\gamma) \le |s_1| + \cdots + |s_n| + |\hat{s}_1| + \cdots + |\hat{s}_n| < 2\varepsilon,$$
which ends the proof of Step 1.
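The statement of Lemma 3.31 and the obstruction in Remark 3.32 can both be seen in a three-dimensional example where the flows are explicit. The fields $X_1, X_2$ on $\mathbb{R}^3$, the base point $q_0 = 0$, and the ordering $(f_{i_1},f_{i_2},f_{i_3}) = (X_1,X_2,X_1)$ are assumptions made for this sketch; note that one field is used twice, as allowed when $m < n$.

```latex
% Illustrative example: Heisenberg-type fields, q_0 = 0 and the ordering (X_1, X_2, X_1) are assumed.
\begin{align*}
  &X_1 = \partial_x - \tfrac{y}{2}\,\partial_z, \qquad X_2 = \partial_y + \tfrac{x}{2}\,\partial_z,\\
  &e^{sX_1}(x,y,z) = \big(x+s,\ y,\ z - \tfrac{ys}{2}\big), \qquad
   e^{sX_2}(x,y,z) = \big(x,\ y+s,\ z + \tfrac{xs}{2}\big),\\
  &\psi(s_1,s_2,s_3) = e^{s_3 X_1}\circ e^{s_2 X_2}\circ e^{s_1 X_1}(0)
   = \Big(s_1 + s_3,\ s_2,\ \tfrac{s_2(s_1 - s_3)}{2}\Big),\\
  &D\psi(s) = \begin{pmatrix}
      1 & 0 & 1\\
      0 & 1 & 0\\
      \tfrac{s_2}{2} & \tfrac{s_1 - s_3}{2} & -\tfrac{s_2}{2}
    \end{pmatrix},
   \qquad \det D\psi(s) = -s_2 .
\end{align*}
```

Hence $s = 0$ is not a regular point, in agreement with Remark 3.32, while every $\hat{s} = (0,\hat{s}_2,0)$ with $\hat{s}_2 \neq 0$ is regular and can be taken arbitrarily close to the origin, as Lemma 3.31 asserts.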

[Figure 3.3: Proof of Lemma 3.31.]

[Figure 3.4: The map $\hat\psi$.]

Proof of Step 2. To prove that $d$ is finite on $M \times M$ let us consider the equivalence classes of points in $M$ with respect to the relation
$$q_1 \sim q_2 \quad \text{if} \quad d(q_1,q_2) < +\infty. \tag{3.26}$$
From the triangular inequality and the proof of Step 1, it follows that each equivalence class is open. Moreover, by definition, the equivalence classes are disjoint and nonempty. Since $M$ is connected, it cannot be the union of more than one disjoint, nonempty open subset. It follows that there exists only one equivalence class.

Lemma 3.33. Let $q_0 \in M$ and let $K \subset M$ be a compact set with $q_0 \in \operatorname{int} K$. Then there exists $\delta_K > 0$ such that every admissible curve $\gamma$ starting from $q_0$ and with $\ell(\gamma) \le \delta_K$ is contained in $K$.

Proof. Without loss of generality we can assume that $K$ is contained in a coordinate chart of $M$, where we denote by $|\cdot|$ the Euclidean norm in the coordinate chart. Let us define
$$C_K := \max_{x \in K} \Big(\sum_{i=1}^m |f_i(x)|^2\Big)^{1/2} \tag{3.27}$$
and fix $\delta_K > 0$ such that $\operatorname{dist}(q_0,\partial K) > C_K \delta_K$ (here $\operatorname{dist}$ is the Euclidean distance, in coordinates).

Let us show that for any admissible curve $\gamma : [0,T] \to M$ such that $\gamma(0) = q_0$ and $\ell(\gamma) \le \delta_K$ we have $\gamma([0,T]) \subset K$. Indeed, if this is not true, there exists an admissible curve $\gamma : [0,T] \to M$ with $\ell(\gamma) \le \delta_K$ and $t^* := \sup\{t \in [0,T],\ \gamma([0,t]) \subset K\}$, with $t^* < T$. Then
$$|\gamma(t^*) - \gamma(0)| \le \int_0^{t^*} |\dot\gamma(t)|\,dt = \int_0^{t^*} \Big|\sum_{i=1}^m u_i^*(t) f_i(\gamma(t))\Big|\,dt \tag{3.28}$$
$$\le \int_0^{t^*} \sqrt{\sum_{i=1}^m |f_i(\gamma(t))|^2}\,\sqrt{\sum_{i=1}^m u_i^*(t)^2}\;dt \tag{3.29}$$
$$\le C_K \int_0^{t^*} \sqrt{\sum_{i=1}^m u_i^*(t)^2}\;dt \le C_K\,\ell(\gamma) \tag{3.30}$$
$$\le C_K \delta_K < \operatorname{dist}(q_0,\partial K), \tag{3.31}$$
which contradicts the fact that, at $t^*$, the curve $\gamma$ leaves the compact set $K$. Thus $t^* = T$.

Proof of Step 3. Let us prove that Lemma 3.33 implies property (b). Indeed the only nontrivial implication is that $d(q_0,q_1) > 0$ whenever $q_0 \neq q_1$. To prove this, fix a compact neighborhood $K$ of $q_0$ such that $q_1 \notin K$. By Lemma 3.33, each admissible curve joining $q_0$ and $q_1$ has length greater than $\delta_K$, hence $d(q_0,q_1) \ge \delta_K > 0$.

Let us now prove property (e). Fix $\varepsilon > 0$ and a compact neighborhood $K$ of $q_0$. Define $C_K$ and $\delta_K$ as in Lemma 3.33, and set $\delta := \min\{\delta_K,\varepsilon/C_K\}$. Let us show that $|q - q_0| < \varepsilon$ whenever $d(q_0,q) < \delta$, where again $|\cdot|$ is the Euclidean norm in a coordinate chart.

Consider a minimizing sequence $\gamma_n : [0,T] \to M$ of admissible trajectories joining $q_0$ and $q$ such that $\ell(\gamma_n) \to d(q_0,q)$ for $n \to \infty$. Without loss of generality, we can assume that $\ell(\gamma_n) \le \delta$ for all $n$. By Lemma 3.33, $\gamma_n([0,T]) \subset K$ for all $n$. We can repeat estimates (3.28)-(3.30), proving that $|q - q_0| = |\gamma_n(T) - \gamma_n(0)| \le C_K\,\ell(\gamma_n)$ for all $n$. Passing to the limit for $n \to \infty$, one gets
$$|q - q_0| \le C_K\,d(q_0,q) < C_K\,\delta \le \varepsilon. \tag{3.32}$$

Corollary 3.34. The metric space $(M,d)$ is locally compact, i.e., for any $q \in M$ there exists $\varepsilon > 0$ such that the closed sub-Riemannian ball $\bar{B}(q,r)$ is compact for all $r \le \varepsilon$.

Proof. By the continuity of $d$, the set $\bar{B}(q,r) = \{d(q,\cdot) \le r\}$ is closed for all $q \in M$ and $r \ge 0$. Moreover the sub-Riemannian metric $d$ induces the manifold topology on $M$. Hence, for radius small enough, the sub-Riemannian ball is bounded. Thus small sub-Riemannian balls are compact.

3.3 Existence of length-minimizers

In this section we discuss the existence of length-minimizers.

Definition 3.35. Let $\gamma : [0,T] \to M$ be an admissible curve. We say that $\gamma$ is a length-minimizer if it minimizes the length among admissible curves with the same endpoints, i.e., $\ell(\gamma) = d(\gamma(0),\gamma(T))$.

Remark 3.36. Notice that the existence of length-minimizers between two points is not guaranteed in general, as happens for two points in $M = \mathbb{R}^2 \setminus \{0\}$ (endowed with the Euclidean distance) that are symmetric with respect to the origin. On the other hand, when length-minimizers exist between two fixed points, they may not be unique, as happens for two antipodal points on the sphere $S^2$.

We now show a general semicontinuity property of the length functional.

Theorem 3.37. Let $\gamma_n : [0,T] \to M$ be a sequence of admissible curves on $M$ such that $\gamma_n \to \gamma$ uniformly on $[0,T]$. Then
$$\ell(\gamma) \le \liminf_{n \to \infty} \ell(\gamma_n). \tag{3.33}$$
If moreover $\liminf_{n \to \infty} \ell(\gamma_n) < +\infty$, then $\gamma$ is also admissible.

Proof. Without loss of generality we assume that $\gamma_n$ and $\gamma$ are parametrized with constant speed on the interval $[0,1]$. Moreover, denote $L := \liminf \ell(\gamma_n)$ and choose a subsequence, which we still denote by the same symbol, such that $\ell(\gamma_n) \to L$. If $L = +\infty$ the inequality (3.33) is clearly true, thus assume $L < +\infty$.

Fix $\delta > 0$. By uniform convergence, it is not restrictive to assume that, for $n$ large enough, $\ell(\gamma_n) \le L + \delta$ and that the images of the $\gamma_n$ are all contained in a common compact set $K$. Since $\gamma_n$ is parametrized with constant speed on $[0,1]$ we have that $\dot\gamma_n(t) \in V_{\gamma_n(t)}$, where
$$V_q = \{f_u(q),\ |u| \le L + \delta\} \subset T_q M, \qquad f_u(q) = \sum_{i=1}^m u_i f_i(q).$$
Notice that $V_q$ is convex for every $q \in M$, thanks to the linearity of $f$ in $u$. Let us prove that $\gamma$ is admissible and satisfies $\ell(\gamma) \le L + \delta$. Since $\delta$ is arbitrary, this implies $\ell(\gamma) \le L$, that is (3.33).

In local coordinates, we have for every $\varepsilon > 0$
$$\frac{1}{\varepsilon}\big(\gamma_n(t+\varepsilon) - \gamma_n(t)\big) = \frac{1}{\varepsilon}\int_t^{t+\varepsilon} f_{u_n(\tau)}(\gamma_n(\tau))\,d\tau \in \operatorname{conv}\{V_{\gamma_n(\tau)},\ \tau \in [t,t+\varepsilon]\}. \tag{3.34}$$
Next we want to estimate the right hand side of (3.34) uniformly. For $n \ge n_0$ sufficiently large, we have $|\gamma_n(t) - \gamma(t)| < \varepsilon$ (by uniform convergence) and an estimate similar to (3.30) gives for $\tau \in [t,t+\varepsilon]$
$$|\gamma_n(t) - \gamma_n(\tau)| \le \int_t^\tau |\dot\gamma_n(s)|\,ds \le C_K (L+\delta)\,\varepsilon, \tag{3.35}$$
where $C_K$ is the constant (3.27) defined by the compact set $K$. Hence we deduce for every $\tau \in [t,t+\varepsilon]$ and every $n \ge n_0$
$$|\gamma_n(\tau) - \gamma(t)| \le |\gamma_n(t) - \gamma_n(\tau)| + |\gamma_n(t) - \gamma(t)| \le C'\varepsilon, \tag{3.36}$$

77 where C is independent on n and ε. From the estimate (3.36) and the equivalence of the manifold and metric topology we have that, for all τ [t,t+ε] and n n, γ n (τ) B γ(t) (r ε ), with r ε when ε. In particular conv{v γn(τ), τ [t,t+ε]} conv{v q, q B γ(t) (r ε )}. (3.37) Plugging (3.37) in (3.34) and passing to the limit for n we get finally to 1 ε (γ(t+ε) γ(t)) conv{v q, q B γ(t) (r ε )}. (3.38) Assume now that t [,1] is a differentiability point of γ. Then the limit of the l.h.s. in (3.38) for ε exists and gives γ(t) convv γ(t) = V γ(t). For every differentiability point t we can thus define the unique u (t) satisfying γ(t) = f(γ(t),u (t)) and u (t) = γ(t). Using the argument contained in Appendix 3.A it follows that u (t) is measurable in t. Moreover u (t) is essentially bounded since, by construction, u (t) L+δ for a.e. t [,T]. Hence γ is admissible. Moreover l(γ) L+δ since γ is length-parametrized on the interval [,1]. Corollary Let γ n be a sequence of length-minimizers on M such that γ n γ uniformly. Then γ is a length-minimizer. Proof. Since the length is invariant under reparametrization, it is not restrictive to assume that all curves γ n and γ are parametrized on [,1]. Since γ n is a length-minimizer one has l(γ n ) = d(γ n (),γ n (1)). By uniform convergence γ n (t) γ(t) for every t [,1] and, by continuity of the distance and semicontinuity of the length l(γ) liminf n l(γ n) = liminf n d(γ n(),γ n (1)) = d(γ(),γ(1)), that implies that l(γ) = d(γ(), γ(1)), i.e., γ is a length-minimizer. The semicontinuity of the length implies the existence of minimizers, under a natural compactness assumption on the space. Theorem 3.39 (Existence of minimizers). Let M be a sub-riemannian manifold and q M. Assume that the ball B q (r) is compact, for some r >. Then for all q 1 B q (r) there exists a length minimizer joining q and q 1, i.e., we have d(q,q 1 ) = min{l(γ) γ : [,T] M admissible,γ() = q,γ(t) = q 1 }. Proof. 
Fix $q_1 \in B_{q_0}(r)$ and consider a minimizing sequence $\gamma_n : [0,1] \to M$ of admissible trajectories, parametrized with constant speed, joining $q_0$ and $q_1$ and such that $\ell(\gamma_n) \to d(q_0, q_1)$. Since $d(q_0, q_1) < r$, we have $\ell(\gamma_n) \leq r$ for all $n \geq n_0$ large enough, hence we can assume without loss of generality that the image of $\gamma_n$ is contained in the common compact $K = \overline{B}_{q_0}(r)$ for all $n$. In particular, the same argument leading to (3.35) shows that for all $n \geq n_0$
\[ |\gamma_n(t) - \gamma_n(\tau)| \leq \int_{\tau}^{t} |\dot\gamma_n(s)|\,ds \leq C_K\, r\, |t - \tau|, \qquad \forall\, t, \tau \in [0,1], \tag{3.39} \]
where $C_K$ depends only on $K$. In other words, all trajectories in the sequence $\{\gamma_n\}_{n \in \mathbb{N}}$ are Lipschitz with the same Lipschitz constant. Thus the sequence is equicontinuous and uniformly bounded. By the classical Ascoli-Arzelà Theorem there exist a subsequence of $\gamma_n$, which we still denote by the same symbol, and a Lipschitz curve $\gamma : [0,1] \to M$ such that $\gamma_n \to \gamma$ uniformly. By Theorem 3.37, the curve $\gamma$ satisfies $\ell(\gamma) \leq \liminf \ell(\gamma_n) = d(q_0, q_1)$, which implies $\ell(\gamma) = d(q_0, q_1)$.

78 Remark 3.4. Assume that B(q,r ) is compact for some r >. Then for every < r r we have that B(q,r) is compact also, being a closed subset of a compact set B(q,r ). Corollary Let q M. Under the hypothesis of Corollary 3.39 there exists ε > such that for all r ε and q 1 B q (r) there exists a minimizing curve joining q and q 1. Proof. It is a direct consequence of Theorem 3.39 and Corollary Remark It is well known that a length space is complete if and only if all closed balls are compact, see [9, Ch. 2]. In particular, if (M,d) is complete with respect to the sub-riemannian distance, then for every q,q 1 M there exists a length minimizer joining q and q Pontryagin extremals In this section we want to give necessary conditions to characterize length-minimizer trajectories. To begin with, we would like to motivate our Hamiltonian approach that we develop in the sequel. In classical Riemannian geometry length-minimizer trajectories satisfy a necessary condition given by a second order differential equation in M, which can be reduced to a first-order differential equation in TM. Hence the set of all length-minimizers is contained in the set of extremals, i.e., trajectories that satisfy the necessary condition, that are be parametrized by initial position and velocity. In our setting (which includes Riemannian and sub-riemannian geometry) we cannot use the initial velocity to parametrize length-minimizer trajectories. This can be easily understood by a dimensional argument. If the rank of the sub-riemannian structure is smaller than the dimension of the manifold, the initial velocity γ() of an admissible curve γ(t) starting from q, belongs to the proper subspace D q of the tangent space T q M. Hence the set of admissible velocities form a set whose dimension is smaller than the dimension of M, even if, by the Chow and Filippov theorems, length-minimizer trajectories starting from a point q cover a full neighborhood of q. 
The right approach is to parametrize length-minimizers by their initial point and an initial covector $\lambda_0 \in T^*_{q_0}M$, which can be thought of as the linear form annihilating the front, i.e., the set $\{\gamma_{q_0}(\varepsilon) \mid \gamma_{q_0} \text{ is a length-minimizer starting from } q_0\}$, on the corresponding length-minimizer trajectory for $\varepsilon \to 0$.

The next theorem gives the necessary condition satisfied by length-minimizers in sub-Riemannian geometry. Curves satisfying this condition are called Pontryagin extremals. The proof of the following theorem is given in the next section.

Theorem 3.43 (Characterization of Pontryagin extremals). Let $\gamma : [0,T] \to M$ be an admissible curve which is a length-minimizer, parametrized by constant speed. Let $\bar u(\cdot)$ be the corresponding minimal control, i.e., for a.e. $t \in [0,T]$
\[ \dot\gamma(t) = \sum_{i=1}^{m} \bar u_i(t) f_i(\gamma(t)), \qquad \ell(\gamma) = \int_0^T |\bar u(t)|\,dt = d(\gamma(0), \gamma(T)), \]
with $|\bar u(t)|$ constant a.e. on $[0,T]$. Denote by $P_{0,t}$ the flow of the nonautonomous vector field $f_{\bar u(t)} = \sum_{i=1}^{m} \bar u_i(t) f_i$ (here $P_{0,t}(x)$ is defined for $t \in [0,T]$ and $x$ in a neighborhood of $\gamma(0)$). Then there exists $\lambda_0 \in T^*_{\gamma(0)}M$ such that, defining
\[ \lambda(t) := (P_{0,t}^{-1})^* \lambda_0, \qquad \lambda(t) \in T^*_{\gamma(t)}M, \tag{3.40} \]

we have that one of the following conditions is satisfied:
(N) $\bar u_i(t) \equiv \langle \lambda(t), f_i(\gamma(t)) \rangle$, $\forall\, i = 1, \ldots, m$,
(A) $0 \equiv \langle \lambda(t), f_i(\gamma(t)) \rangle$, $\forall\, i = 1, \ldots, m$.
Moreover in case (A) one has $\lambda_0 \neq 0$.

Notice that, by definition, the curve $\lambda(t)$ is Lipschitz continuous. Moreover the conditions (N) and (A) are mutually exclusive, unless $\bar u(t) = 0$ for a.e. $t \in [0,T]$, i.e., $\gamma$ is the trivial trajectory.

Definition 3.44. Let $\gamma : [0,T] \to M$ be an admissible curve with minimal control $\bar u \in L^\infty([0,T], \mathbb{R}^m)$. Fix $\lambda_0 \in T^*_{\gamma(0)}M \setminus \{0\}$, and define $\lambda(t)$ by (3.40).
- If $\lambda(t)$ satisfies (N) then it is called a normal extremal (and $\gamma(t)$ a normal extremal trajectory).
- If $\lambda(t)$ satisfies (A) then it is called an abnormal extremal (and $\gamma(t)$ an abnormal extremal trajectory).

Remark 3.45. In the Riemannian case there are no abnormal extremals. Indeed, since the map $f$ is fiberwise surjective, we can always find $m$ vector fields $f_1, \ldots, f_m$ on $M$ such that $\mathrm{span}_q\{f_1, \ldots, f_m\} = T_qM$, and (A) would imply that $\langle \lambda, v \rangle = 0$ for all $v \in T_qM$, which gives the contradiction $\lambda = 0$.

Remark 3.46. If the sub-Riemannian structure is not Riemannian at $q$, namely if $D_q = \mathrm{span}_q\{f_1, \ldots, f_m\} \neq T_qM$, then the trivial trajectory, corresponding to $\bar u(t) \equiv 0$, is always both normal and abnormal. Notice that even a nontrivial admissible trajectory $\gamma$ can be both normal and abnormal, since there may exist two different lifts $\lambda(t), \lambda'(t) \in T^*_{\gamma(t)}M$ such that $\lambda(t)$ satisfies (N) and $\lambda'(t)$ satisfies (A).

Exercise 3.47. Prove that condition (N) of Theorem 3.43 implies that the minimal control $\bar u(t)$ is smooth. In particular normal extremals are smooth.

At this level it is not obvious how to use Theorem 3.43 to find the explicit expression of extremals for a given problem. In the next chapter we provide another formulation of Theorem 3.43 which gives Pontryagin extremals as solutions of a Hamiltonian system. The rest of this section is devoted to the proof of Theorem 3.43.

3.4.1 The energy functional

Let $\gamma : [0,T] \to M$ be an admissible curve.
We define the energy functional $J$ on the space of Lipschitz curves on $M$ as follows:
\[ J(\gamma) = \frac{1}{2} \int_0^T \|\dot\gamma(t)\|^2\,dt. \]
Notice that $J(\gamma) < +\infty$ for every admissible curve $\gamma$.
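As a numerical illustration of the energy functional, the following sketch compares length and energy for curves on the real line, where the norm of the velocity is just the absolute value of the derivative. This is only a toy setting chosen for illustration: the curves $\gamma(t) = t^2$ and $\gamma(t) = t$ on $[0,1]$, the helper name `length_and_energy` and the midpoint quadrature are assumptions of this example, not part of the text. Analytically one has $\ell = 1$, $J = 2/3$ for the first curve and $\ell = 1$, $J = 1/2$ for the second.

```python
def length_and_energy(speed, T, n=100_000):
    """Midpoint-rule approximations of
    l = int_0^T |gamma'(t)| dt  and  J = (1/2) int_0^T |gamma'(t)|^2 dt.
    `speed` is a function t -> |gamma'(t)| (hypothetical helper)."""
    h = T / n
    length = 0.0
    energy = 0.0
    for k in range(n):
        s = speed((k + 0.5) * h)  # speed at the midpoint of the k-th subinterval
        length += s * h
        energy += 0.5 * s * s * h
    return length, energy


T = 1.0


def speed_var(t):
    return 2.0 * t  # gamma(t) = t^2, non-constant speed


def speed_const(t):
    return 1.0  # gamma(t) = t, constant speed


l_var, J_var = length_and_energy(speed_var, T)
l_const, J_const = length_and_energy(speed_const, T)

# Reparametrization gamma_alpha(t) = gamma(alpha * t) on [0, T / alpha]
alpha = 2.0


def speed_rep(t):
    return alpha * speed_var(alpha * t)


l_rep, J_rep = length_and_energy(speed_rep, T / alpha)

print("length is unchanged by reparametrization:", l_var, l_rep)
print("energy scales by alpha:", J_rep, alpha * J_var)
print("l^2 <= 2 J T (non-constant speed):", l_var**2, 2 * J_var * T)
print("l^2 == 2 J T (constant speed):", l_const**2, 2 * J_const * T)
```

In this example the length is insensitive to the change of parametrization while the energy is multiplied by $\alpha$, and the Cauchy-Schwarz bound $\ell(\gamma)^2 \leq 2 J(\gamma) T$ is strict for the non-constant-speed curve and an equality for the constant-speed one.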

Remark 3.48. While $\ell$ is invariant by reparametrization (see Remark 3.14), $J$ is not. Indeed consider, for every $\alpha > 0$, the reparametrized curve
\[ \gamma_\alpha : [0, T/\alpha] \to M, \qquad \gamma_\alpha(t) = \gamma(\alpha t). \]
Using that $\dot\gamma_\alpha(t) = \alpha\, \dot\gamma(\alpha t)$, we have
\[ J(\gamma_\alpha) = \frac{1}{2} \int_0^{T/\alpha} \|\dot\gamma_\alpha(t)\|^2\,dt = \frac{1}{2} \int_0^{T/\alpha} \alpha^2 \|\dot\gamma(\alpha t)\|^2\,dt = \alpha J(\gamma). \]
Thus, if the final time is not fixed, the infimum of $J$, among admissible curves joining two fixed points, is always zero.

The following lemma relates minimizers of $J$ with fixed final time with minimizers of $\ell$.

Lemma 3.49. Fix $T > 0$ and let $\Omega_{q_0,q_1}$ be the set of admissible curves joining $q_0, q_1 \in M$. An admissible curve $\gamma : [0,T] \to M$ is a minimizer of $J$ on $\Omega_{q_0,q_1}$ if and only if it is a minimizer of $\ell$ on $\Omega_{q_0,q_1}$ and has constant speed.

Proof. Applying the Cauchy-Schwarz inequality
\[ \left( \int_0^T f(t) g(t)\,dt \right)^2 \leq \int_0^T f(t)^2\,dt \int_0^T g(t)^2\,dt, \tag{3.41} \]
with $f(t) = \|\dot\gamma(t)\|$ and $g(t) = 1$ we get
\[ \ell(\gamma)^2 \leq 2 J(\gamma)\, T. \tag{3.42} \]
Moreover in (3.41) equality holds if and only if $f$ is proportional to $g$, i.e., $\|\dot\gamma(t)\| = \text{const.}$ in (3.42). Since, by Lemma 3.15, every curve is a Lipschitz reparametrization of a length-parametrized one, the minima of $J$ are attained at admissible curves with constant speed, and the statement follows.

3.4.2 Proof of Theorem 3.43

By Lemma 3.49 we can assume that $\gamma$ is a minimizer of the functional $J$ among admissible curves joining $q_0 = \gamma(0)$ and $q_1 = \gamma(T)$ in fixed time $T > 0$. In particular, if we define the functional
\[ \tilde J(u(\cdot)) := \frac{1}{2} \int_0^T |u(t)|^2\,dt, \tag{3.43} \]
on the space of controls $u(\cdot) \in L^\infty([0,T], \mathbb{R}^m)$, the minimal control $\bar u(\cdot)$ of $\gamma$ is a minimizer for the energy functional $\tilde J$:
\[ \tilde J(\bar u(\cdot)) \leq \tilde J(u(\cdot)), \qquad \forall\, u \in L^\infty([0,T], \mathbb{R}^m), \]
where trajectories corresponding to $u(\cdot)$ join $q_0, q_1 \in M$. In the following we denote the functional $\tilde J$ by $J$.

Consider now a variation $u(\cdot) = \bar u(\cdot) + v(\cdot)$ of the control $\bar u(\cdot)$, and its associated trajectory $q(t)$, solution of the equation
\[ \dot q(t) = f_{u(t)}(q(t)), \qquad q(0) = q_0. \tag{3.44} \]

Recall that $P_{0,t}$ denotes the local flow associated with the optimal control $\bar u(\cdot)$ and that $\gamma(t) = P_{0,t}(q_0)$ is the optimal admissible curve. We stress that in general, for $q$ different from $q_0$, the curve $t \mapsto P_{0,t}(q)$ is not optimal. Let us introduce the curve $x(t)$ defined by the identity
\[ q(t) = P_{0,t}(x(t)). \tag{3.45} \]
In other words $x(t) = P_{0,t}^{-1}(q(t))$ is obtained by applying the inverse of the flow of $\bar u(\cdot)$ to the solution associated with the new control $u(\cdot)$ (see Figure 3.5). Notice that if $v(\cdot) = 0$, then $x(t) \equiv q_0$.

Figure 3.5: The trajectories $q(t)$, associated with $u(\cdot) = \bar u(\cdot) + v(\cdot)$, and the corresponding $x(t)$.

The next step is to write the ODE satisfied by $x(t)$. Differentiating (3.45) we get
\[ \dot q(t) = f_{\bar u(t)}(q(t)) + (P_{0,t})_*(\dot x(t)) \tag{3.46} \]
\[ \phantom{\dot q(t)} = f_{\bar u(t)}(P_{0,t}(x(t))) + (P_{0,t})_*(\dot x(t)), \tag{3.47} \]
and using that $\dot q(t) = f_{u(t)}(q(t)) = f_{u(t)}(P_{0,t}(x(t)))$ we can invert (3.47) with respect to $\dot x(t)$ and rewrite it as follows:
\[
\begin{aligned}
\dot x(t) &= (P_{0,t}^{-1})_* \big[ (f_{u(t)} - f_{\bar u(t)})(P_{0,t}(x(t))) \big] \\
&= \big[ (P_{0,t}^{-1})_* (f_{u(t)} - f_{\bar u(t)}) \big](x(t)) \\
&= \big[ (P_{0,t}^{-1})_* f_{u(t) - \bar u(t)} \big](x(t)) \\
&= \big[ (P_{0,t}^{-1})_* f_{v(t)} \big](x(t)).
\end{aligned}
\tag{3.48}
\]
If we define the nonautonomous vector field $g^t_{v(t)} = (P_{0,t}^{-1})_* f_{v(t)}$ we finally obtain by (3.48) the following Cauchy problem for $x(t)$:
\[ \dot x(t) = g^t_{v(t)}(x(t)), \qquad x(0) = q_0. \tag{3.49} \]
Notice that the vector field $g^t_v$ is linear with respect to $v$, since $f_u$ is linear with respect to $u$. Now we fix the control $v(t)$ and consider the map
\[ s \in \mathbb{R} \mapsto \begin{pmatrix} J(\bar u + sv) \\ x(T; \bar u + sv) \end{pmatrix} \in \mathbb{R} \times M, \]

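where $x(T; \bar u + sv)$ denotes the solution at time $T$ of (3.49), starting from $q_0$, corresponding to the control $\bar u(\cdot) + s v(\cdot)$, and $J(\bar u + sv)$ is the associated cost.

Lemma 3.50. There exists $\bar\lambda \in (\mathbb{R} \oplus T_{q_0}M)^*$, with $\bar\lambda \neq 0$, such that for all $v \in L^\infty([0,T], \mathbb{R}^m)$
\[ \left\langle \bar\lambda, \left( \frac{\partial J(\bar u + sv)}{\partial s}\bigg|_{s=0},\ \frac{\partial x(T; \bar u + sv)}{\partial s}\bigg|_{s=0} \right) \right\rangle = 0. \tag{3.50} \]

Proof of Lemma 3.50. We argue by contradiction: assume that (3.50) is not true, then there exist $v_0, \ldots, v_n \in L^\infty([0,T], \mathbb{R}^m)$ such that the vectors in $\mathbb{R} \oplus T_{q_0}M$
\[ \begin{pmatrix} \dfrac{\partial J(\bar u + s v_0)}{\partial s}\Big|_{s=0} \\[2mm] \dfrac{\partial x(T; \bar u + s v_0)}{\partial s}\Big|_{s=0} \end{pmatrix}, \ \ldots, \ \begin{pmatrix} \dfrac{\partial J(\bar u + s v_n)}{\partial s}\Big|_{s=0} \\[2mm] \dfrac{\partial x(T; \bar u + s v_n)}{\partial s}\Big|_{s=0} \end{pmatrix} \tag{3.51} \]
are linearly independent. Let us then consider the map
\[ \Phi : \mathbb{R}^{n+1} \to \mathbb{R} \times M, \qquad \Phi(s_0, \ldots, s_n) = \begin{pmatrix} J(\bar u + \sum_{i=0}^{n} s_i v_i) \\ x(T; \bar u + \sum_{i=0}^{n} s_i v_i) \end{pmatrix}. \tag{3.52} \]
By differentiability properties of solutions of smooth ODEs with respect to parameters, the map (3.52) is smooth in a neighborhood of $s = 0$. Moreover, since the vectors (3.51) are the components of the differential of $\Phi$ and they are independent, the inverse function theorem implies that $\Phi$ is a local diffeomorphism sending a neighborhood of $s = 0$ in $\mathbb{R}^{n+1}$ to a neighborhood of $(J(\bar u), q_0)$ in $\mathbb{R} \times M$. As a result we can find $v(\cdot) = \sum_i s_i v_i(\cdot)$ such that (see also Figure 3.4.2)
\[ x(T; \bar u + v) = q_0, \qquad J(\bar u + v) < J(\bar u). \]

[Figure: axes $x$ and $J$, with the point $(x(T; \bar u), J(\bar u))$ marked.]

In other words the curve $t \mapsto q(t; \bar u + v)$ joins $q(0; \bar u + v) = q_0$ to
\[ q(T; \bar u + v) = P_{0,T}(x(T; \bar u + v)) = P_{0,T}(q_0) = q_1, \]
with a cost smaller than the cost of $\gamma(t) = q(t; \bar u)$, which is a contradiction.

Remark 3.51. Notice that if $\bar\lambda$ satisfies (3.50), then for every $\alpha \in \mathbb{R}$, with $\alpha \neq 0$, $\alpha \bar\lambda$ satisfies (3.50) too. Thus we can normalize $\bar\lambda$ to be $(-1, \lambda_0)$ or $(0, \lambda_0)$, with $\lambda_0 \in T^*_{q_0}M$, and $\lambda_0 \neq 0$ in the second case (since $\bar\lambda$ is non zero).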

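Condition (3.50) implies that there exists $\lambda_0 \in T^*_{q_0}M$ such that one of the following identities is satisfied for all $v \in L^\infty([0,T], \mathbb{R}^m)$:
\[ \frac{\partial J(\bar u + sv)}{\partial s}\bigg|_{s=0} = \left\langle \lambda_0, \frac{\partial x(T; \bar u + sv)}{\partial s}\bigg|_{s=0} \right\rangle, \tag{3.53} \]
\[ 0 = \left\langle \lambda_0, \frac{\partial x(T; \bar u + sv)}{\partial s}\bigg|_{s=0} \right\rangle, \tag{3.54} \]
with $\lambda_0 \neq 0$ in the second case (cf. Remark 3.51). To end the proof we have to show that identities (3.53) and (3.54) are equivalent to conditions (N) and (A) of Theorem 3.43. Let us show that
\[ \frac{\partial J(\bar u + sv)}{\partial s}\bigg|_{s=0} = \int_0^T \sum_{i=1}^{m} \bar u_i(t) v_i(t)\,dt, \tag{3.55} \]
\[ \frac{\partial x(T; \bar u + sv)}{\partial s}\bigg|_{s=0} = \int_0^T g^t_{v(t)}(q_0)\,dt = \int_0^T \sum_{i=1}^{m} \big( (P_{0,t}^{-1})_* f_i \big)(q_0)\, v_i(t)\,dt. \tag{3.56} \]
The identity (3.55) follows from the definition of $J$:
\[ J(\bar u + sv) = \frac{1}{2} \int_0^T |\bar u + sv|^2\,dt. \tag{3.57} \]
Eq. (3.56) can be proved in coordinates. Indeed by (3.49) and the linearity of $g_v$ with respect to $v$ we have
\[ x(T; \bar u + sv) = q_0 + s \int_0^T g^t_{v(t)}(x(t; \bar u + sv))\,dt, \]
and differentiating with respect to $s$ at $s = 0$ one gets (3.56).

Let us show that (3.53) is equivalent to (N) of Theorem 3.43. Similarly, one gets that (3.54) is equivalent to (A). Using (3.55) and (3.56), equation (3.53) is rewritten as
\[
\begin{aligned}
\int_0^T \sum_{i=1}^{m} \bar u_i(t) v_i(t)\,dt &= \int_0^T \sum_{i=1}^{m} \big\langle \lambda_0, ((P_{0,t}^{-1})_* f_i)(q_0) \big\rangle\, v_i(t)\,dt \\
&= \int_0^T \sum_{i=1}^{m} \langle \lambda(t), f_i(\gamma(t)) \rangle\, v_i(t)\,dt,
\end{aligned}
\tag{3.58}
\]
where we used, for every $i = 1, \ldots, m$, the identities
\[ \big\langle \lambda_0, ((P_{0,t}^{-1})_* f_i)(q_0) \big\rangle = \big\langle \lambda_0, (P_{0,t}^{-1})_* f_i(\gamma(t)) \big\rangle = \big\langle (P_{0,t}^{-1})^* \lambda_0, f_i(\gamma(t)) \big\rangle = \langle \lambda(t), f_i(\gamma(t)) \rangle. \]
Since $v(\cdot) \in L^\infty([0,T], \mathbb{R}^m)$ is arbitrary, we get $\bar u_i(t) = \langle \lambda(t), f_i(\gamma(t)) \rangle$ for a.e. $t \in [0,T]$.

3.A Measurability of the minimal control

In this appendix we prove a technical lemma about measurability of solutions to a class of minimization problems. This lemma, when specialized to the sub-Riemannian context, implies that the minimal control associated with an admissible curve is measurable.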

3.A.1 Main lemma

Let us fix an interval $I = [a,b] \subset \mathbb{R}$ and a compact set $U \subset \mathbb{R}^m$. Consider two functions $g : I \times U \to \mathbb{R}^n$, $v : I \to \mathbb{R}^n$ such that
(M1) $g(\cdot, u)$ is measurable in $t$ for every fixed $u \in U$,
(M2) $g(t, \cdot)$ is continuous in $u$ for every fixed $t \in I$,
(M3) $v(t)$ is measurable with respect to $t$.
Moreover we assume that
(M4) for every fixed $t \in I$, the problem $\min\{|u| : g(t,u) = v(t),\ u \in U\}$ has a unique solution.
Let us denote by $u^*(t)$ the solution of (M4) for a fixed $t \in I$.

Lemma 3.52. Under assumptions (M1)-(M4), the function $t \mapsto |u^*(t)|$ is measurable on $I$.

Proof. Denote $\varphi(t) := |u^*(t)|$. To prove the lemma we show that for every fixed $r > 0$ the set
\[ A = \{t \in I : \varphi(t) \leq r\} \]
is measurable in $\mathbb{R}$. By our assumptions
\[ A = \{t \in I : \exists\, u \in U \text{ s.t. } |u| \leq r,\ g(t,u) = v(t)\}. \]
Let us fix $r > 0$ and a countable dense set $\{u_i\}_{i \in \mathbb{N}}$ in the ball of radius $r$ in $U$. Let us show that
\[ A = \bigcap_{n \in \mathbb{N}} A_n = \bigcap_{n \in \mathbb{N}} \underbrace{\bigcup_{i \in \mathbb{N}} A_{i,n}}_{:= A_n}, \tag{3.59} \]
where
\[ A_{i,n} := \{t \in I : |g(t,u_i) - v(t)| < 1/n\}. \]
Notice that the set $A_{i,n}$ is measurable by construction and if (3.59) is true, $A$ is also measurable.

$\subset$ inclusion. Let $t \in A$. This means that there exists $\bar u \in U$ such that $|\bar u| \leq r$ and $g(t, \bar u) = v(t)$. Since $g$ is continuous with respect to $u$ and $\{u_i\}_{i \in \mathbb{N}}$ is dense, for each $n$ we can find $u_{i_n}$ such that $|g(t, u_{i_n}) - v(t)| < 1/n$, that is $t \in A_n$ for all $n$.

$\supset$ inclusion. Assume $t \in \bigcap_{n \in \mathbb{N}} A_n$. Then for every $n$ there exists $i_n$ such that the corresponding $u_{i_n}$ satisfies $|g(t, u_{i_n}) - v(t)| < 1/n$. From the sequence $u_{i_n}$, by compactness, it is possible to extract a convergent subsequence $u_{i_n} \to \bar u$. By continuity of $g$ with respect to $u$ one easily gets that $g(t, \bar u) = v(t)$. That is $t \in A$.

Next we exploit the fact that the scalar function $\varphi(t) := |u^*(t)|$ is measurable to show that the vector function $u^*(t)$ is measurable.

Lemma 3.53. Under assumptions (M1)-(M4), the vector function $t \mapsto u^*(t)$ is measurable on $I$.

Proof. It is sufficient to prove that, for every closed ball $O$ in $\mathbb{R}^m$, the set
\[ B := \{t \in I : u^*(t) \in O\} \]
is measurable. Since the minimum in (M4) is uniquely determined, this is equivalent to
\[ B = \{t \in I : \exists\, u \in O \text{ s.t. } |u| = \varphi(t),\ g(t,u) = v(t)\}. \]
Let us fix the ball $O$ and a countable dense set $\{u_i\}_{i \in \mathbb{N}}$ in $O$. Let us show that
\[ B = \bigcap_{n \in \mathbb{N}} B_n = \bigcap_{n \in \mathbb{N}} \underbrace{\bigcup_{i \in \mathbb{N}} B_{i,n}}_{:= B_n}, \tag{3.60} \]
where
\[ B_{i,n} := \{t \in I : |u_i| < \varphi(t) + 1/n,\ |g(t,u_i) - v(t)| < 1/n\}. \]
Notice that the set $B_{i,n}$ is measurable by construction and if (3.60) is true, $B$ is also measurable.

$\subset$ inclusion. Let $t \in B$. This means that there exists $\bar u \in O$ such that $|\bar u| = \varphi(t)$ and $g(t, \bar u) = v(t)$. Since $g$ is continuous with respect to $u$ and $\{u_i\}_{i \in \mathbb{N}}$ is dense in $O$, for each $n$ we can find $u_{i_n}$ such that $|g(t, u_{i_n}) - v(t)| < 1/n$ and $|u_{i_n}| < \varphi(t) + 1/n$, that is $t \in B_n$ for all $n$.

$\supset$ inclusion. Assume $t \in \bigcap_{n \in \mathbb{N}} B_n$. Then for every $n$ it is possible to find $i_n$ such that the corresponding $u_{i_n}$ satisfies $|g(t, u_{i_n}) - v(t)| < 1/n$ and $|u_{i_n}| < \varphi(t) + 1/n$. From the sequence $u_{i_n}$, by compactness of the closed ball $O$, it is possible to extract a convergent subsequence $u_{i_n} \to \bar u$. By continuity of $g$ in $u$ one easily gets that $g(t, \bar u) = v(t)$. Moreover $|\bar u| \leq \varphi(t)$. Hence, by minimality of $\varphi(t)$, $|\bar u| = \varphi(t)$. That is $t \in B$.

3.A.2 Proof of Lemma 3.11

Consider an admissible curve $\gamma : [0,T] \to M$. Since measurability is a local property it is not restrictive to assume $M = \mathbb{R}^n$. Moreover, by Lemma 3.15, we can assume that $\gamma$ is length-parametrized, so that its minimal control belongs to the compact set $U = \{|u| \leq 1\}$. Define $g : [0,T] \times U \to \mathbb{R}^n$ and $v : [0,T] \to \mathbb{R}^n$ by
\[ g(t,u) = f(\gamma(t), u), \qquad v(t) = \dot\gamma(t). \]
Assumptions (M1)-(M4) are satisfied. Indeed (M1)-(M3) follow from the fact that $g(t,u)$ is linear with respect to $u$ and measurable in $t$. Moreover (M4) is also satisfied by linearity with respect to $u$ of $f$. Applying Lemma 3.53 one gets that the minimal control $u^*(t)$ is measurable in $t$.
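To make the minimization problem (M4) concrete, here is a minimal sketch of how the minimal control could be computed at a single time $t$ when $g(t,u) = \sum_i u_i f_i$ is linear in $u$. Everything specific in it is an assumption made for illustration: the ambient space is $\mathbb{R}^2$, the frame $f_1 = (1,0)$, $f_2 = (0,1)$, $f_3 = (1,1)$ is redundant ($m = 3 > 2$) and spans $\mathbb{R}^2$, and the constraint $u \in U$ is ignored. Under these assumptions the minimizer is the least-norm solution $u^* = F^T (F F^T)^{-1} v$, where $F$ is the matrix whose columns are the $f_i$; the function name `minimal_control` is hypothetical.

```python
def minimal_control(frame, v):
    """Least-norm u with sum_i u[i] * frame[i] == v, for a frame of vectors
    in R^2 that spans R^2 (illustrative helper, not from the text)."""
    # Gram matrix G = F F^T (2x2), where F has the frame vectors as columns.
    g11 = sum(f[0] * f[0] for f in frame)
    g12 = sum(f[0] * f[1] for f in frame)
    g22 = sum(f[1] * f[1] for f in frame)
    det = g11 * g22 - g12 * g12
    if abs(det) < 1e-14:
        raise ValueError("the frame does not span R^2")
    # Solve G w = v with the explicit inverse of a 2x2 matrix.
    w0 = (g22 * v[0] - g12 * v[1]) / det
    w1 = (-g12 * v[0] + g11 * v[1]) / det
    # u* = F^T w
    return [f[0] * w0 + f[1] * w1 for f in frame]


frame = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # assumed generating family at a point
v = (1.0, 1.0)                                 # assumed velocity to be realized

u_star = minimal_control(frame, v)
realized = (
    sum(u * f[0] for u, f in zip(u_star, frame)),
    sum(u * f[1] for u, f in zip(u_star, frame)),
)
norm_sq = sum(u * u for u in u_star)

print("minimal control:", u_star)
print("realized velocity:", realized)
print("squared norm:", norm_sq)
```

For comparison, the controls $(1,1,0)$ and $(0,0,1)$ realize the same velocity with squared norms $2$ and $1$, both larger than the squared norm $2/3$ of the least-norm solution $(1/3, 1/3, 2/3)$. Uniqueness of the minimizer, which is what (M4) requires, comes from the strict convexity of the norm on the affine set of solutions.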
3.B Lipschitz vs Absolutely continuous admissible curves

In these lecture notes sub-Riemannian geometry is developed in the framework of Lipschitz admissible curves (which correspond to the choice of $L^\infty$ controls). However, the theory can be equivalently developed in the framework of $H^1$ admissible curves (corresponding to $L^2$ controls) or in the framework of absolutely continuous admissible curves (corresponding to $L^1$ controls).

Definition 3.54. An absolutely continuous curve $\gamma : [0,T] \to M$ is said to be AC-admissible if there exists an $L^1$ function $u : t \in [0,T] \mapsto u(t) \in U_{\gamma(t)}$ such that $\dot\gamma(t) = f(\gamma(t), u(t))$, for a.e. $t \in [0,T]$. We define $H^1$-admissible curves similarly.

Since the set of absolutely continuous curves is bigger than the set of Lipschitz ones, one could expect that the sub-Riemannian distance between two points is smaller when computed among all absolutely continuous admissible curves. However this is not the case, thanks to the invariance by reparametrization. Indeed Lemmas 3.14 and 3.15 can be rewritten in the absolutely continuous framework in the following form.

Lemma 3.55. The length of an AC-admissible curve is invariant by AC reparametrization.

Lemma 3.56. Any AC-admissible curve of positive length is an AC reparametrization of a length-parametrized admissible one.

The proof of Lemma 3.55 differs from the one of Lemma 3.14 only by the fact that, if $u^* \in L^1$ is the minimal control of $\gamma$, then $(u^* \circ \varphi)\dot\varphi$ is the minimal control associated with $\gamma \circ \varphi$. Moreover $(u^* \circ \varphi)\dot\varphi \in L^1$, using the monotonicity of $\varphi$. Under these assumptions the change of variables formula (3.15) still holds. The proof of Lemma 3.56 is unchanged. Notice that the statement of Exercise 3.16 remains true if we replace Lipschitz with absolutely continuous. We stress that the curve $\gamma$ built in the proof is Lipschitz (since it is length-parametrized).

As a consequence of these results, if we define
\[ d_{AC}(q_0, q_1) = \inf\{\ell(\gamma) \mid \gamma : [0,T] \to M \text{ AC-admissible},\ \gamma(0) = q_0,\ \gamma(T) = q_1\}, \tag{3.61} \]
we have the following proposition.

Proposition 3.57. $d_{AC}(q_0, q_1) = d(q_0, q_1)$.

Since $L^2([0,T]) \subset L^1([0,T])$, Lemmas 3.55, 3.56 and Proposition 3.57 are valid also in the framework of admissible curves associated with $L^2$ controls.

Bibliographical notes

Sub-Riemannian manifolds have been introduced, even if with different terminology, in several contexts starting from the end of the 60s, see for instance [19, 16, 13, 17, 14].
However, some pioneering ideas were already present in the work of Carathéodory and Cartan. The name sub-Riemannian geometry first appeared in [28]. Classical general references for sub-Riemannian geometry are [23, 4, 22, 15, 29]. Recent monographs are [18, 26]. The definition of sub-Riemannian manifold using the language of bundles dates back to [2, 4]. For the original proof of the Rashevskii-Chow theorem see [25, 1]. The proof of existence of sub-Riemannian length minimizers presented here is an adaptation of the proof of the Filippov theorem in optimal control. The fact that in sub-Riemannian geometry there exist abnormal length minimizers is due to Montgomery [21, 23]. The fact that the theory can be equivalently developed for Lipschitz or absolutely continuous curves is well known; a discussion can be found in [4]. The definition of the length by using the minimal control is, to the best of our knowledge, original. The problem of the measurability of the minimal control can be seen as a problem of differential inclusion [8]. The characterization of Pontryagin extremals given in Theorem 3.43 is a simplified version of the Pontryagin Maximum Principle (PMP) [24]. The proof presented here is original and adapted to this setting. For more general versions of PMP see [3, 5]. The fact that every sub-Riemannian structure is equivalent to a free one (cf. Section 3.1.4) is a consequence of classical results on fiber bundles. A different proof in the case of classical (constant rank) distributions was also considered in [26, 3].


Chapter 4

Characterization and local minimality of Pontryagin extremals

This chapter is devoted to the study of geometric properties of Pontryagin extremals. To this purpose we first rewrite Theorem 3.43 in a more geometric setting, which allows us to write a differential equation in $T^*M$ satisfied by Pontryagin extremals and to show that they do not depend on the choice of a generating family. Finally we prove that small pieces of normal extremal trajectories are length-minimizers. To this aim, all along this chapter we develop the language of symplectic geometry, starting from the key concept of Poisson bracket.

4.1 Geometric characterization of Pontryagin extremals

In the previous chapter we proved that if $\gamma : [0,T] \to M$ is a length minimizer on a sub-Riemannian manifold, associated with a control $u(\cdot)$, then there exists $\lambda_0 \in T^*_{\gamma(0)}M$ such that, defining
\[ \lambda(t) = (P_{0,t}^{-1})^* \lambda_0, \qquad \lambda(t) \in T^*_{\gamma(t)}M, \tag{4.1} \]
one of the following conditions is satisfied:
(N) $u_i(t) \equiv \langle \lambda(t), f_i(\gamma(t)) \rangle$, $\forall\, i = 1, \ldots, m$,
(A) $0 \equiv \langle \lambda(t), f_i(\gamma(t)) \rangle$, $\forall\, i = 1, \ldots, m$, $\lambda_0 \neq 0$.
Here $P_{0,t}$ denotes the flow associated with the nonautonomous vector field $f_{u(t)} = \sum_{i=1}^{m} u_i(t) f_i$ and
\[ (P_{0,t}^{-1})^* : T^*_q M \to T^*_{P_{0,t}(q)} M \tag{4.2} \]
is the induced flow on the cotangent space.

The goal of this section is to characterize the curve (4.1) as the integral curve of a suitable (non-autonomous) vector field on $T^*M$. To this purpose, we start by showing that a vector field on $T^*M$ is completely characterized by its action on functions that are affine on fibers. To fix the ideas, we first focus on the case in which $P_{0,t} : M \to M$ is the flow associated with an autonomous vector field $X \in \mathrm{Vec}(M)$, namely $P_{0,t} = e^{tX}$.

4.1.1 Lifting a vector field from $M$ to $T^*M$

We start with some preliminary considerations on the algebraic structure of smooth functions on $T^*M$. As usual $\pi : T^*M \to M$ denotes the canonical projection. Functions in $C^\infty(M)$ are in a one-to-one correspondence with functions in $C^\infty(T^*M)$ that are constant on fibers via the map $\alpha \mapsto \pi^*\alpha = \alpha \circ \pi$. In other words we have the isomorphism of algebras
\[ C^\infty(M) \simeq C^\infty_{\mathrm{cst}}(T^*M) := \{\pi^*\alpha \mid \alpha \in C^\infty(M)\} \subset C^\infty(T^*M). \tag{4.3} \]
In what follows, with abuse of notation, we often identify the function $\pi^*\alpha \in C^\infty(T^*M)$ with the function $\alpha \in C^\infty(M)$.

In a similar way smooth vector fields on $M$ are in a one-to-one correspondence with functions in $C^\infty(T^*M)$ that are linear on fibers via the map $Y \mapsto a_Y$, where $a_Y(\lambda) := \langle \lambda, Y(q) \rangle$ and $q = \pi(\lambda)$:
\[ \mathrm{Vec}(M) \simeq C^\infty_{\mathrm{lin}}(T^*M) := \{a_Y \mid Y \in \mathrm{Vec}(M)\} \subset C^\infty(T^*M). \tag{4.4} \]
Notice that this is an isomorphism of modules over $C^\infty(M)$. Indeed, as $\mathrm{Vec}(M)$ is a module over $C^\infty(M)$, we have that $C^\infty_{\mathrm{lin}}(T^*M)$ is a module over $C^\infty(M)$ as well. For any $\alpha \in C^\infty(M)$ and $a_X \in C^\infty_{\mathrm{lin}}(T^*M)$ their product is defined as $\alpha a_X := (\pi^*\alpha) a_X = a_{\alpha X} \in C^\infty_{\mathrm{lin}}(T^*M)$.

Definition 4.1. We say that a function $a \in C^\infty(T^*M)$ is affine on fibers if there exist two functions $\alpha \in C^\infty_{\mathrm{cst}}(T^*M)$ and $a_X \in C^\infty_{\mathrm{lin}}(T^*M)$ such that $a = \alpha + a_X$. In other words
\[ a(\lambda) = \alpha(q) + \langle \lambda, X(q) \rangle, \qquad q = \pi(\lambda). \]
We denote by $C^\infty_{\mathrm{aff}}(T^*M)$ the set of functions that are affine on fibers.

Remark 4.2. Linear and affine functions on $T^*M$ are particularly important since they reflect the linear structure of the cotangent bundle. In particular every vector field on $T^*M$, as a derivation of $C^\infty(T^*M)$, is completely characterized by its action on affine functions. Indeed for a vector field $V \in \mathrm{Vec}(T^*M)$ and $f \in C^\infty(T^*M)$, one has that
\[ (Vf)(\lambda) = \frac{d}{dt}\bigg|_{t=0} f(e^{tV}(\lambda)) = \langle d_\lambda f, V(\lambda) \rangle, \qquad \lambda \in T^*M, \tag{4.5} \]
which depends only on the differential of $f$ at the point $\lambda$. Hence, for each fixed $\lambda \in T^*M$, to compute (4.5) one can replace the function $f$ with any affine function whose differential at $\lambda$ coincides with $d_\lambda f$. Notice that such a function is not unique.
Let us now consider the infinitesimal generator of the flow $(P_{0,t}^{-1})^* = (e^{-tX})^*$. Since it satisfies the group law
\[ (e^{-tX})^* \circ (e^{-sX})^* = (e^{-(t+s)X})^*, \qquad \forall\, t, s \in \mathbb{R}, \]
by Lemma 2.15 its infinitesimal generator is an autonomous vector field $V_X$ on $T^*M$. In other words we have $(e^{-tX})^* = e^{tV_X}$ for all $t$. Let us then compute the right hand side of (4.5) when $V = V_X$ and $f$ is either a function constant on fibers or a function linear on fibers.

The action of $V_X$ on functions that are constant on fibers, of the form $\beta \circ \pi$ with $\beta \in C^\infty(M)$, coincides with the action of $X$. Indeed we have for all $\lambda \in T^*M$
\[ \frac{d}{dt}\bigg|_{t=0} \beta \circ \pi\big((e^{-tX})^*\lambda\big) = \frac{d}{dt}\bigg|_{t=0} \beta(e^{tX}(q)) = (X\beta)(q), \qquad q = \pi(\lambda). \tag{4.6} \]

As for the action of $V_X$ on functions that are linear on fibers, of the form $a_Y(\lambda) = \langle \lambda, Y(q) \rangle$, we have for all $\lambda \in T^*M$
\[
\begin{aligned}
\frac{d}{dt}\bigg|_{t=0} a_Y\big((e^{-tX})^*\lambda\big) &= \frac{d}{dt}\bigg|_{t=0} \big\langle (e^{-tX})^*\lambda, Y(e^{tX}(q)) \big\rangle \\
&= \frac{d}{dt}\bigg|_{t=0} \big\langle \lambda, (e^{-tX}_* Y)(q) \big\rangle = \langle \lambda, [X,Y](q) \rangle \\
&= a_{[X,Y]}(\lambda).
\end{aligned}
\tag{4.7}
\]
Hence, by linearity, one gets that the action of $V_X$ on functions of $C^\infty_{\mathrm{aff}}(T^*M)$ is given by
\[ V_X(\beta + a_Y) = X\beta + a_{[X,Y]}. \tag{4.8} \]
As explained in Remark 4.2, formula (4.8) characterizes completely the generator $V_X$ of $(P_{0,t}^{-1})^*$. To find its explicit form we introduce the notion of Poisson bracket.

4.1.2 The Poisson bracket

The purpose of this section is to introduce an operation $\{\cdot, \cdot\}$ on $C^\infty(T^*M)$, called Poisson bracket. First we introduce it in $C^\infty_{\mathrm{lin}}(T^*M)$, where it reflects the Lie bracket of vector fields in $\mathrm{Vec}(M)$, seen as elements of $C^\infty_{\mathrm{lin}}(T^*M)$. Then it is uniquely extended to $C^\infty_{\mathrm{aff}}(T^*M)$ and $C^\infty(T^*M)$ by requiring that it is a derivation of the algebra $C^\infty(T^*M)$ in each argument. More precisely we start with the following definition.

Definition 4.3. Let $a_X, a_Y \in C^\infty_{\mathrm{lin}}(T^*M)$ be associated with vector fields $X, Y \in \mathrm{Vec}(M)$. Their Poisson bracket is defined by
\[ \{a_X, a_Y\} := a_{[X,Y]}, \tag{4.9} \]
where $a_{[X,Y]}$ is the function in $C^\infty_{\mathrm{lin}}(T^*M)$ associated with the vector field $[X,Y]$.

Remark 4.4. Recall that the Lie bracket is a bilinear, skew-symmetric map defined on $\mathrm{Vec}(M)$, that satisfies the Leibniz rule for $X, Y \in \mathrm{Vec}(M)$:
\[ [X, \alpha Y] = \alpha [X,Y] + (X\alpha) Y, \qquad \forall\, \alpha \in C^\infty(M). \tag{4.10} \]
As a consequence, the Poisson bracket is bilinear, skew-symmetric and satisfies the following relation:
\[ \{a_X, \alpha a_Y\} = \{a_X, a_{\alpha Y}\} = a_{[X, \alpha Y]} = \alpha a_{[X,Y]} + (X\alpha) a_Y, \qquad \forall\, \alpha \in C^\infty(M). \tag{4.11} \]
Notice that this relation makes sense since the product between $\alpha \in C^\infty_{\mathrm{cst}}(T^*M)$ and $a_X \in C^\infty_{\mathrm{lin}}(T^*M)$ belongs to $C^\infty_{\mathrm{lin}}(T^*M)$, namely $\alpha a_X = a_{\alpha X}$.

Next, we extend this definition to the whole $C^\infty(T^*M)$.

Proposition 4.5. There exists a unique bilinear and skew-symmetric map
\[ \{\cdot, \cdot\} : C^\infty(T^*M) \times C^\infty(T^*M) \to C^\infty(T^*M) \]
that extends (4.9) to $C^\infty(T^*M)$, and that is a derivation in each argument, i.e. it satisfies
\[ \{a, bc\} = \{a, b\} c + \{a, c\} b, \qquad \forall\, a, b, c \in C^\infty(T^*M). \tag{4.12} \]
We call this operation the Poisson bracket on $C^\infty(T^*M)$.

Proof. We start by proving that, as a consequence of the requirement that $\{\cdot, \cdot\}$ is a derivation in each argument, it is uniquely extended to $C^\infty_{\mathrm{aff}}(T^*M)$. By linearity and skew-symmetry we are reduced to computing Poisson brackets of the kind $\{a_X, \alpha\}$ and $\{\alpha, \beta\}$, where $a_X \in C^\infty_{\mathrm{lin}}(T^*M)$ and $\alpha, \beta \in C^\infty_{\mathrm{cst}}(T^*M)$. Using that $a_{\alpha Y} = \alpha a_Y$ and (4.12) one gets
\[ \{a_X, a_{\alpha Y}\} = \{a_X, \alpha a_Y\} = \alpha \{a_X, a_Y\} + \{a_X, \alpha\} a_Y. \tag{4.13} \]
Comparing (4.11) and (4.13) one gets
\[ \{a_X, \alpha\} = X\alpha. \tag{4.14} \]
Next, using (4.12) and (4.14), one has
\[ \{a_{\alpha Y}, \beta\} = \{\alpha a_Y, \beta\} = \alpha \{a_Y, \beta\} + \{\alpha, \beta\} a_Y \tag{4.15} \]
\[ \phantom{\{a_{\alpha Y}, \beta\}} = \alpha Y\beta + \{\alpha, \beta\} a_Y. \tag{4.16} \]
Using again (4.14) one also has $\{a_{\alpha Y}, \beta\} = \alpha Y\beta$, hence $\{\alpha, \beta\} = 0$. Combining the previous formulas one obtains the following expression for the Poisson bracket between two affine functions on $T^*M$:
\[ \{a_X + \alpha, a_Y + \beta\} := a_{[X,Y]} + X\beta - Y\alpha. \tag{4.17} \]
From the explicit formula (4.17) it is easy to see that the Poisson bracket computed at a fixed $\lambda \in T^*M$ depends only on the differential of the two functions $a_X + \alpha$ and $a_Y + \beta$ at $\lambda$. Next we extend this definition to $C^\infty(T^*M)$ in such a way that it is still a derivation. For $f, g \in C^\infty(T^*M)$ we define
\[ \{f, g\}|_\lambda := \{a_{f,\lambda}, a_{g,\lambda}\}|_\lambda, \tag{4.18} \]
where $a_{f,\lambda}$ and $a_{g,\lambda}$ are two functions in $C^\infty_{\mathrm{aff}}(T^*M)$ such that $d_\lambda f = d_\lambda(a_{f,\lambda})$ and $d_\lambda g = d_\lambda(a_{g,\lambda})$.

Remark 4.6. The definition (4.18) is well posed, since if we take two different affine functions $a_{f,\lambda}$ and $a'_{f,\lambda}$ their difference satisfies $d_\lambda(a_{f,\lambda} - a'_{f,\lambda}) = d_\lambda(a_{f,\lambda}) - d_\lambda(a'_{f,\lambda}) = 0$, hence by bilinearity of the Poisson bracket
\[ \{a_{f,\lambda}, a_{g,\lambda}\}|_\lambda = \{a'_{f,\lambda}, a_{g,\lambda}\}|_\lambda. \]

Let us now compute the coordinate expression of the Poisson bracket. In canonical coordinates $(p,x)$ in $T^*M$, if
\[ X = \sum_{i=1}^{n} X_i(x) \frac{\partial}{\partial x_i}, \qquad Y = \sum_{i=1}^{n} Y_i(x) \frac{\partial}{\partial x_i}, \]
we have
\[ a_X(p,x) = \sum_{i=1}^{n} p_i X_i(x), \qquad a_Y(p,x) = \sum_{i=1}^{n} p_i Y_i(x), \]

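and, denoting $f = a_X + \alpha$, $g = a_Y + \beta$, we have
\[
\begin{aligned}
\{f, g\} &= a_{[X,Y]} + X\beta - Y\alpha \\
&= \sum_{i,j=1}^{n} p_j \left( X_i \frac{\partial Y_j}{\partial x_i} - Y_i \frac{\partial X_j}{\partial x_i} \right) + \sum_{i=1}^{n} \left( X_i \frac{\partial \beta}{\partial x_i} - Y_i \frac{\partial \alpha}{\partial x_i} \right) \\
&= \sum_{i=1}^{n} X_i \left( \sum_{j=1}^{n} p_j \frac{\partial Y_j}{\partial x_i} + \frac{\partial \beta}{\partial x_i} \right) - Y_i \left( \sum_{j=1}^{n} p_j \frac{\partial X_j}{\partial x_i} + \frac{\partial \alpha}{\partial x_i} \right) \\
&= \sum_{i=1}^{n} \frac{\partial f}{\partial p_i} \frac{\partial g}{\partial x_i} - \frac{\partial f}{\partial x_i} \frac{\partial g}{\partial p_i}.
\end{aligned}
\]
From these computations we get the formula for the Poisson bracket of two functions $a, b \in C^\infty(T^*M)$:
\[ \{a, b\} = \sum_{i=1}^{n} \frac{\partial a}{\partial p_i} \frac{\partial b}{\partial x_i} - \frac{\partial a}{\partial x_i} \frac{\partial b}{\partial p_i}, \qquad a, b \in C^\infty(T^*M). \tag{4.19} \]
The explicit formula (4.19) shows that the extension of the Poisson bracket to $C^\infty(T^*M)$ is still a derivation.

Remark 4.7. We stress that the value $\{a, b\}|_\lambda$ at a point $\lambda \in T^*M$ depends only on $d_\lambda a$ and $d_\lambda b$. Hence the Poisson bracket computed at the point $\lambda \in T^*M$ can be seen as a skew-symmetric and nondegenerate bilinear form
\[ \{\cdot, \cdot\}_\lambda : T^*_\lambda(T^*M) \times T^*_\lambda(T^*M) \to \mathbb{R}. \]

4.1.3 Hamiltonian vector fields

By construction, the linear operator defined by
\[ \vec{a} : C^\infty(T^*M) \to C^\infty(T^*M), \qquad \vec{a}(b) := \{a, b\}, \tag{4.20} \]
is a derivation of the algebra $C^\infty(T^*M)$, therefore it can be identified with an element of $\mathrm{Vec}(T^*M)$.

Definition 4.8. The vector field $\vec{a}$ on $T^*M$ defined by (4.20) is called the Hamiltonian vector field associated with the smooth function $a \in C^\infty(T^*M)$.

From (4.19) we can easily write the coordinate expression of $\vec{a}$ for an arbitrary function $a \in C^\infty(T^*M)$:
\[ \vec{a} = \sum_{i=1}^{n} \frac{\partial a}{\partial p_i} \frac{\partial}{\partial x_i} - \frac{\partial a}{\partial x_i} \frac{\partial}{\partial p_i}. \tag{4.21} \]
The following proposition gives the explicit form of the vector field $V$ on $T^*M$ generating the flow $(P_{0,t}^{-1})^*$.

Proposition 4.9. Let $X \in \mathrm{Vec}(M)$ be complete and let $P_{0,t} = e^{tX}$. The flow on $T^*M$ defined by $(P_{0,t}^{-1})^* = (e^{-tX})^*$ is generated by the Hamiltonian vector field $\vec{a}_X$, where $a_X(\lambda) = \langle \lambda, X(q) \rangle$ and $q = \pi(\lambda)$.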

Proof. To prove that the generator $V$ of $(P_{0,t}^{-1})^*$ coincides with the vector field $\vec{a}_X$ it is sufficient to show that their action is the same. Indeed, by definition of Hamiltonian vector field, we have
\[ \vec{a}_X(\alpha) = \{a_X, \alpha\} = X\alpha, \qquad \vec{a}_X(a_Y) = \{a_X, a_Y\} = a_{[X,Y]}. \]
Hence this action coincides with the action of $V$ as in (4.6) and (4.7).

Remark 4.10. In coordinates $(p,x)$, if the vector field $X$ is written $X = \sum_{i=1}^{n} X_i \frac{\partial}{\partial x_i}$, then $a_X(p,x) = \sum_{i=1}^{n} p_i X_i$ and the Hamiltonian vector field $\vec{a}_X$ is written as follows:
\[ \vec{a}_X = \sum_{i=1}^{n} X_i \frac{\partial}{\partial x_i} - \sum_{i,j=1}^{n} p_i \frac{\partial X_i}{\partial x_j} \frac{\partial}{\partial p_j}. \tag{4.22} \]
Notice that the projection of $\vec{a}_X$ onto $M$ coincides with $X$ itself, i.e., $\pi_*(\vec{a}_X) = X$.

This construction can be extended to the case of nonautonomous vector fields.

Proposition 4.11. Let $X_t$ be a nonautonomous vector field and denote by $P_{0,t}$ the flow of $X_t$ on $M$. Then the nonautonomous vector field on $T^*M$
\[ V_t := \vec{a}_{X_t}, \qquad a_{X_t}(\lambda) = \langle \lambda, X_t(q) \rangle, \]
is the generator of the flow $(P_{0,t}^{-1})^*$.

4.2 The symplectic structure

In this section we introduce the symplectic structure of $T^*M$ following the classical construction. In Subsection 4.2.1 we show that the symplectic form can be interpreted as the dual of the Poisson bracket, in a suitable sense.

Definition 4.12. The tautological (or Liouville) 1-form $s \in \Lambda^1(T^*M)$ is defined as follows:
\[ s : \lambda \mapsto s_\lambda \in T^*_\lambda(T^*M), \qquad \langle s_\lambda, w \rangle := \langle \lambda, \pi_* w \rangle, \qquad \forall\, \lambda \in T^*M,\ w \in T_\lambda(T^*M), \]
where $\pi : T^*M \to M$ denotes the canonical projection.

The name tautological comes from its expression in coordinates. Recall that, given a system of coordinates $x = (x_1, \ldots, x_n)$ on $M$, canonical coordinates $(p,x)$ on $T^*M$ are coordinates for which every element $\lambda \in T^*M$ is written as follows:
\[ \lambda = \sum_{i=1}^{n} p_i\, dx_i. \]
For every $w \in T_\lambda(T^*M)$ we have the following:
\[ w = \sum_{i=1}^{n} \alpha_i \frac{\partial}{\partial p_i} + \beta_i \frac{\partial}{\partial x_i} \quad \Longrightarrow \quad \pi_* w = \sum_{i=1}^{n} \beta_i \frac{\partial}{\partial x_i}, \]

hence we get
\[ \langle s_\lambda, w \rangle = \langle \lambda, \pi_* w \rangle = \sum_{i=1}^{n} p_i \beta_i = \sum_{i=1}^{n} p_i \langle dx_i, w \rangle = \left\langle \sum_{i=1}^{n} p_i\, dx_i, w \right\rangle. \]
In other words the coordinate expression of the Liouville form $s$ at the point $\lambda$ coincides with the one of $\lambda$ itself, namely
\[ s_\lambda = \sum_{i=1}^{n} p_i\, dx_i. \tag{4.23} \]

Exercise 4.13. Let $s \in \Lambda^1(T^*M)$ be the tautological form. Prove that
\[ \omega^* s = \omega, \qquad \forall\, \omega \in \Lambda^1(M). \]
(Recall that a 1-form $\omega$ is a section of $T^*M$, i.e. a map $\omega : M \to T^*M$ such that $\pi \circ \omega = \mathrm{id}_M$.)

Definition 4.14. The differential of the tautological 1-form $\sigma := ds \in \Lambda^2(T^*M)$ is called the canonical symplectic structure on $T^*M$.

By construction $\sigma$ is a closed 2-form on $T^*M$. Moreover its expression in canonical coordinates $(p,x)$ shows immediately that it is a nondegenerate two-form:
\[ \sigma = \sum_{i=1}^{n} dp_i \wedge dx_i. \tag{4.24} \]

Remark 4.15 (The symplectic form in non-canonical coordinates). Given a basis of 1-forms $\omega_1, \ldots, \omega_n$ in $\Lambda^1(M)$, one can build coordinates on the fibers of $T^*M$ as follows. Every $\lambda \in T^*M$ can be written uniquely as $\lambda = \sum_{i=1}^{n} h_i \omega_i$. Thus the $h_i$ become coordinates on the fibers. Notice that these coordinates are not related to any choice of coordinates on the manifold, as the $p$ were. By definition, in these coordinates, we have
\[ s = \sum_{i=1}^{n} h_i \omega_i, \qquad \sigma = ds = \sum_{i=1}^{n} dh_i \wedge \omega_i + h_i\, d\omega_i. \tag{4.25} \]
Notice that, with respect to (4.24), an extra term appears in the expression of $\sigma$ since, in general, the 1-forms $\omega_i$ are not closed.

4.2.1 The symplectic form vs the Poisson bracket

Let $V$ be a finite dimensional vector space and let $V^*$ denote its dual (i.e. the space of linear forms on $V$). By classical linear algebra arguments one has the following identifications:
\[ \left\{ \begin{array}{c} \text{non degenerate} \\ \text{bilinear forms on } V \end{array} \right\} \simeq \left\{ \begin{array}{c} \text{linear invertible maps} \\ V \to V^* \end{array} \right\} \simeq \left\{ \begin{array}{c} \text{non degenerate} \\ \text{bilinear forms on } V^* \end{array} \right\}. \tag{4.26} \]
Indeed to every bilinear form $B : V \times V \to \mathbb{R}$ we can associate a linear map $L : V \to V^*$ defined by $L(v) = B(v, \cdot)$. On the other hand, given a linear map $L : V \to V^*$, we can associate with it a bilinear map $B : V \times V \to \mathbb{R}$ defined by $B(v,w) = \langle L(v), w \rangle$, where $\langle \cdot, \cdot \rangle$ denotes as usual the

pairing between a vector space and its dual. Moreover $B$ is non-degenerate if and only if the map $v \mapsto B(v,\cdot)$ is an isomorphism, that is, if and only if $L$ is invertible.

The previous argument shows how to identify a non-degenerate bilinear form $B$ on $V$ with an invertible linear map $L$ from $V$ to $V^*$. Applying the same reasoning to the linear map $L^{-1}$ one obtains a bilinear map on $V^*$.

Exercise 4.16. (a). Let $h \in C^\infty(T^*M)$. Prove that the Hamiltonian vector field $\vec{h} \in \mathrm{Vec}(T^*M)$ satisfies the following identity
\[ \sigma(\cdot, \vec{h}(\lambda)) = d_\lambda h, \qquad \forall\, \lambda \in T^*M. \]
(b). Prove that, for every $\lambda \in T^*M$, the bilinear forms $\sigma_\lambda$ on $T_\lambda(T^*M)$ and $\{\cdot,\cdot\}_\lambda$ on $T^*_\lambda(T^*M)$ (cf. Remark 4.7) are dual under the identification (4.26). In particular show that
\[ \{a,b\} = \vec{a}(b) = \langle db, \vec{a} \rangle = \sigma(\vec{a}, \vec{b}), \qquad \forall\, a,b \in C^\infty(T^*M). \tag{4.27} \]

Remark 4.17. Notice that $\sigma$ is nondegenerate, which means that the map $w \mapsto \sigma_\lambda(\cdot,w)$ defines a linear isomorphism between the vector spaces $T_\lambda(T^*M)$ and $T^*_\lambda(T^*M)$. Hence $\vec{h}$ is the vector field canonically associated by the symplectic structure with the differential $dh$. For this reason $\vec{h}$ is also called the symplectic gradient of $h$.

From formula (4.24) we have that in canonical coordinates $(p,x)$ the Hamiltonian vector field associated with $h$ is expressed as follows
\[ \vec{h} = \sum_{i=1}^n \frac{\partial h}{\partial p_i} \frac{\partial}{\partial x_i} - \frac{\partial h}{\partial x_i} \frac{\partial}{\partial p_i}, \]
and the Hamiltonian system $\dot\lambda = \vec{h}(\lambda)$ is rewritten as
\[ \dot{x}_i = \frac{\partial h}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial h}{\partial x_i}, \qquad i = 1,\dots,n. \]

We conclude this section with two classical but rather important results.

Proposition 4.18. A function $a \in C^\infty(T^*M)$ is a constant of the motion of the Hamiltonian system associated with $h \in C^\infty(T^*M)$ if and only if $\{h,a\} = 0$.

Proof. Let us consider a solution $\lambda(t) = e^{t\vec{h}}(\lambda_0)$ of the Hamiltonian system associated with $h$, with $\lambda_0 \in T^*M$. The derivative of the function $a$ along the solution satisfies
\[ \frac{d}{dt} a(\lambda(t)) = \{h,a\}(\lambda(t)), \tag{4.28} \]
since, by (4.27), $\frac{d}{dt} a(\lambda(t)) = \langle d_{\lambda(t)} a, \vec{h}(\lambda(t)) \rangle = \vec{h}(a)(\lambda(t)) = \{h,a\}(\lambda(t))$. By (4.28), if $\{h,a\} = 0$, then the derivative of the function $a$ along the flow vanishes for all $t$, and $a$ is constant. Conversely, if $a$ is constant along the flow then its derivative vanishes and the Poisson bracket is zero.

The skew-symmetry of the Poisson bracket immediately implies the following corollary.

Corollary 4.19. A function $h \in C^\infty(T^*M)$ is a constant of the motion of the Hamiltonian system defined by $\vec{h}$.
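The criterion $\{h,a\} = 0$ can be illustrated numerically. The following is a minimal sketch, not part of the original text: it works on $T^*\mathbb{R}^2$ with canonical coordinates $(p_1,p_2,x_1,x_2)$, takes as a purely illustrative assumption the Hamiltonian of a particle in a central quartic potential together with the angular momentum $a = x_1p_2 - x_2p_1$, approximates derivatives by finite differences, and approximates the flow with a Runge-Kutta scheme. All function names are hypothetical.

```python
N = 2  # dimension of the base; a point of T*R^2 is stored as z = [p1, p2, x1, x2]

def grad(f, z, eps=1e-6):
    """Central finite-difference gradient of f at the point z (list of floats)."""
    g = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += eps
        zm[i] -= eps
        g.append((f(zp) - f(zm)) / (2.0 * eps))
    return g

def poisson(a, b, z):
    """{a,b} = sum_i (da/dp_i * db/dx_i - da/dx_i * db/dp_i), as in canonical coordinates."""
    da, db = grad(a, z), grad(b, z)
    return sum(da[i] * db[N + i] - da[N + i] * db[i] for i in range(N))

def ham_field(h, z):
    """Hamiltonian vector field: pdot_i = -dh/dx_i, xdot_i = dh/dp_i."""
    dh = grad(h, z)
    return [-dh[N + i] for i in range(N)] + [dh[i] for i in range(N)]

def flow(h, z, t, steps=2000):
    """Fourth-order Runge-Kutta approximation of exp(t * vec h)(z)."""
    dt = t / steps
    z = list(z)
    for _ in range(steps):
        k1 = ham_field(h, z)
        k2 = ham_field(h, [zi + 0.5 * dt * ki for zi, ki in zip(z, k1)])
        k3 = ham_field(h, [zi + 0.5 * dt * ki for zi, ki in zip(z, k2)])
        k4 = ham_field(h, [zi + dt * ki for zi, ki in zip(z, k3)])
        z = [zi + dt * (c1 + 2 * c2 + 2 * c3 + c4) / 6.0
             for zi, c1, c2, c3, c4 in zip(z, k1, k2, k3, k4)]
    return z

def h(z):
    """Illustrative Hamiltonian (an assumption): kinetic energy plus a central quartic potential."""
    p1, p2, x1, x2 = z
    return 0.5 * (p1 * p1 + p2 * p2) + 0.25 * (x1 * x1 + x2 * x2) ** 2

def a(z):
    """Angular momentum; it Poisson-commutes with any rotationally invariant h."""
    p1, p2, x1, x2 = z
    return x1 * p2 - x2 * p1

z0 = [0.3, -0.7, 1.0, 0.5]
bracket_at_z0 = poisson(h, a, z0)             # close to zero
drift = abs(a(flow(h, z0, 2.0)) - a(z0))      # close to zero: a is constant along the flow
```

By contrast, the coordinate function $x_1$ satisfies $\{h,x_1\} = p_1$, which is not identically zero, so $x_1$ is not a constant of the motion for this $h$.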

4.3 Characterization of normal and abnormal extremals

Now we can rewrite Theorem 3.43 using the symplectic language developed in the last section. Given a sub-Riemannian structure on $M$ with generating family $\{f_1,\dots,f_m\}$, define the fiberwise linear functions on $T^*M$ associated with these vector fields
\[ h_i : T^*M \to \mathbb{R}, \qquad h_i(\lambda) := \langle \lambda, f_i(q) \rangle, \qquad i = 1,\dots,m. \]

Theorem 4.20 (PMP). Let $\gamma : [0,T] \to M$ be an admissible curve which is a length-minimizer, parametrized by constant speed. Let $u(\cdot)$ be the corresponding minimal control. Then there exists a Lipschitz curve $\lambda(t) \in T^*_{\gamma(t)}M$ such that
\[ \dot\lambda(t) = \sum_{i=1}^m u_i(t) \, \vec{h}_i(\lambda(t)), \qquad \text{a.e. } t \in [0,T], \tag{4.29} \]
and one of the following conditions is satisfied:
(N) $h_i(\lambda(t)) \equiv u_i(t)$, $i = 1,\dots,m$, $\forall\, t$,
(A) $h_i(\lambda(t)) \equiv 0$, $i = 1,\dots,m$, $\forall\, t$.
Moreover in case (A) one has $\lambda(t) \neq 0$ for all $t \in [0,T]$.

Proof. The statement is a rephrasing of Theorem 3.43, obtained by combining Proposition 4.9 and Exercise ??.

Notice that Theorem 4.20 says that normal and abnormal extremals appear as solutions of a Hamiltonian system. Nevertheless, this Hamiltonian system is nonautonomous and depends on the trajectory itself through the control $u(t)$ associated with the extremal trajectory. Moreover, the formulation of the necessary condition for optimality given in Theorem 4.20 still does not clarify whether the extremals depend on the generating family $\{f_1,\dots,f_m\}$ chosen for the sub-Riemannian structure. The rest of the section is devoted to the intrinsic geometric description of normal and abnormal extremals.

4.3.1 Normal extremals

In this section we show that normal extremals are characterized as solutions of a smooth autonomous Hamiltonian system on $T^*M$, where the Hamiltonian $H$ is a function that encodes all the information on the sub-Riemannian structure.

Definition 4.21. Let $M$ be a sub-Riemannian manifold. The sub-Riemannian Hamiltonian is the function on $T^*M$ defined as follows
\[ H : T^*M \to \mathbb{R}, \qquad H(\lambda) = \max_{u \in U_q} \Big( \langle \lambda, f_u(q) \rangle - \frac{1}{2} |u|^2 \Big), \qquad q = \pi(\lambda). \tag{4.30} \]

Proposition 4.22. The sub-Riemannian Hamiltonian $H$ is smooth and quadratic on fibers. Moreover, for every generating family $\{f_1,\dots,f_m\}$ of the sub-Riemannian structure, the sub-Riemannian Hamiltonian $H$ is written as follows
\[ H(\lambda) = \frac{1}{2} \sum_{i=1}^m \langle \lambda, f_i(q) \rangle^2, \qquad \lambda \in T^*_qM, \quad q = \pi(\lambda). \tag{4.31} \]

Proof. In terms of a generating family $\{f_1,\dots,f_m\}$, the sub-Riemannian Hamiltonian (4.30) is written as follows
\[ H(\lambda) = \max_{u \in \mathbb{R}^m} \Big( \sum_{i=1}^m u_i \langle \lambda, f_i(q) \rangle - \frac{1}{2} \sum_{i=1}^m u_i^2 \Big). \tag{4.32} \]
Differentiating (4.32) with respect to $u_i$, one gets that the maximum in the r.h.s. is attained at $u_i = \langle \lambda, f_i(q) \rangle$, from which formula (4.31) follows. The fact that $H$ is smooth and quadratic on fibers then easily follows from (4.31).

Exercise 4.23. Prove that two equivalent sub-Riemannian structures $(U,f)$ and $(U',f')$ on a manifold $M$ define the same Hamiltonian.

Theorem 4.24. Every normal extremal is a solution of the Hamiltonian system $\dot\lambda(t) = \vec{H}(\lambda(t))$. In particular, every normal extremal trajectory is smooth.

Proof. Denoting, as usual, by $h_i(\lambda) = \langle \lambda, f_i(q) \rangle$ for $i = 1,\dots,m$ the functions linear on fibers associated with a generating family, and using the identity $\overrightarrow{h_i^2} = 2 h_i \vec{h}_i$ (see (4.12)), it follows that
\[ \vec{H} = \frac{1}{2} \overrightarrow{\sum_{i=1}^m h_i^2} = \sum_{i=1}^m h_i \vec{h}_i. \]
In particular, since along a normal extremal $h_i(\lambda(t)) = u_i(t)$ by condition (N) of Theorem 4.20, one gets
\[ \vec{H}(\lambda(t)) = \sum_{i=1}^m h_i(\lambda(t)) \, \vec{h}_i(\lambda(t)) = \sum_{i=1}^m u_i(t) \, \vec{h}_i(\lambda(t)). \]

Remark 4.25. In canonical coordinates $\lambda = (p,x)$ in $T^*M$, $H$ is quadratic with respect to $p$ and
\[ H(p,x) = \frac{1}{2} \sum_{i=1}^m \langle p, f_i(x) \rangle^2. \]
The Hamiltonian system associated with $H$, in these coordinates, is written as follows
\[ \begin{cases} \dot{x} = \dfrac{\partial H}{\partial p} = \displaystyle\sum_{i=1}^m \langle p, f_i(x) \rangle f_i(x), \\[2mm] \dot{p} = -\dfrac{\partial H}{\partial x} = -\displaystyle\sum_{i=1}^m \langle p, f_i(x) \rangle \langle p, D_x f_i(x) \rangle. \end{cases} \tag{4.33} \]
From here it is easy to see that if $\lambda(t) = (p(t),x(t))$ is a solution of (4.33) then also the rescaled extremal $\alpha\lambda(\alpha t) = (\alpha p(\alpha t), x(\alpha t))$ is a solution of the same Hamiltonian system, for every $\alpha > 0$.

Lemma 4.26. Let $\lambda(t)$ be a normal extremal and $\gamma(t) = \pi(\lambda(t))$ be the corresponding normal extremal trajectory. Then for all $t \in [0,T]$ one has
\[ \frac{1}{2} \|\dot\gamma(t)\|^2 = H(\lambda(t)). \]

Proof. For every normal extremal $\lambda(t)$ associated with the (minimal) control $u(\cdot)$ we have
\[ \frac{1}{2} \|\dot\gamma(t)\|^2 = \frac{1}{2} |u(t)|^2 = \frac{1}{2} \sum_{i=1}^m u_i(t)^2 = H(\lambda(t)), \tag{4.34} \]
where we used the fact that, along a normal extremal, for all $t \in [0,T]$ we have the relations
\[ u_i(t) = \langle \lambda(t), f_i(\gamma(t)) \rangle. \]

Corollary 4.27. A normal extremal trajectory is parametrized by constant speed. In particular it is length parametrized if and only if its extremal lift is contained in the level set $H^{-1}(1/2)$.

Proof. The fact that $H$ is constant along $\lambda(t)$ implies, by (4.34), that $\|\dot\gamma(t)\|^2$ is constant. Moreover $\|\dot\gamma(t)\| = 1$ if and only if $H(\lambda(t)) = 1/2$.

Finally, by Remark 4.25, all normal extremal trajectories are reparametrizations of length parametrized ones.

Let $\lambda(t)$ be a normal extremal such that $\lambda(0) = \lambda_0 \in T^*_qM$. The corresponding normal extremal trajectory $\gamma(t) = \pi(\lambda(t))$ can be written in the exponential notation
\[ \gamma(t) = \pi \circ e^{t\vec{H}}(\lambda_0). \]
By Corollary 4.27, length-parametrized normal extremal trajectories correspond to the choice $\lambda_0 \in H^{-1}(1/2)$.

We end this section by characterizing normal extremal trajectories as characteristic curves of the canonical symplectic form contained in the level sets of $H$.

Definition 4.28. Let $M$ be a smooth manifold and $\Omega \in \Lambda^2 M$ a 2-form. A Lipschitz curve $\gamma : [0,T] \to M$ is a characteristic curve for $\Omega$ if for almost every $t \in [0,T]$ it holds
\[ \dot\gamma(t) \in \mathrm{Ker}\, \Omega_{\gamma(t)}, \qquad (\text{i.e. } \Omega_{\gamma(t)}(\dot\gamma(t), \cdot) = 0). \tag{4.35} \]
Notice that this notion is independent of the parametrization of the curve.

Proposition 4.29. Let $H$ be the sub-Riemannian Hamiltonian and assume that $c > 0$ is a regular value of $H$. Then a Lipschitz curve $\gamma$ is a characteristic curve for $\sigma|_{H^{-1}(c)}$ if and only if it is the reparametrization of a normal extremal on $H^{-1}(c)$.

Proof. Recall that if $c$ is a regular value of $H$, then the set $H^{-1}(c)$ is a smooth $(2n-1)$-dimensional manifold in $T^*M$ (notice that by Sard's Theorem almost every $c > 0$ is a regular value for $H$). For every $\lambda \in H^{-1}(c)$ let us denote by $E_\lambda = T_\lambda H^{-1}(c)$ its tangent space at this point. Notice that, by construction, $E_\lambda$ is a hyperplane (i.e., $\dim E_\lambda = 2n-1$) and $d_\lambda H|_{E_\lambda} = 0$. The restriction $\sigma|_{H^{-1}(c)}$ is computed by $\sigma_\lambda|_{E_\lambda}$, for each $\lambda \in H^{-1}(c)$.

On one hand $\ker \sigma_\lambda|_{E_\lambda}$ is nontrivial since the dimension of $E_\lambda$ is odd. On the other hand the symplectic 2-form $\sigma$ is nondegenerate on $T^*M$, hence the dimension of $\ker \sigma_\lambda|_{E_\lambda}$ cannot be greater than one. It follows that $\dim \ker \sigma_\lambda|_{E_\lambda} = 1$.

We are left to show that $\ker \sigma_\lambda|_{E_\lambda}$ is spanned by $\vec{H}(\lambda)$. Assume that $\ker \sigma_\lambda|_{E_\lambda} = \mathbb{R}\xi$, for some $\xi \in T_\lambda(T^*M)$. By construction, $E_\lambda$ coincides with the skew-orthogonal to $\xi$, namely
\[ E_\lambda = \xi^{\angle} = \{ w \in T_\lambda(T^*M) \mid \sigma_\lambda(\xi,w) = 0 \}. \]

Since, by skew-symmetry, $\sigma_\lambda(\xi,\xi) = 0$, it follows that $\xi \in E_\lambda$. Moreover, by definition of Hamiltonian vector field $\sigma(\cdot, \vec{H}) = dH$, hence for the restriction to $E_\lambda$ one has
\[ \sigma_\lambda(\cdot, \vec{H}(\lambda))|_{E_\lambda} = d_\lambda H|_{E_\lambda} = 0. \]
Since $\vec{H}(\lambda)$ belongs to $E_\lambda$ and is nonzero ($c$ being a regular value), it spans the one-dimensional kernel.

Exercise 4.30. Prove that if two smooth Hamiltonians $h_1,h_2 : T^*M \to \mathbb{R}$ define the same level set, i.e. $E = \{h_1 = c_1\} = \{h_2 = c_2\}$ for some $c_1,c_2 \in \mathbb{R}$, then their Hamiltonian flows $\vec{h}_1, \vec{h}_2$ coincide on $E$, up to reparametrization.

Exercise 4.31. The sub-Riemannian Hamiltonian $H$ encodes all the information about the sub-Riemannian structure.
(a) Prove that a vector $v \in T_qM$ is sub-unit, i.e., it satisfies $v \in D_q$ and $\|v\| \leq 1$, if and only if
\[ \frac{1}{2} |\langle \lambda, v \rangle|^2 \leq H(\lambda), \qquad \forall\, \lambda \in T^*_qM. \]
(b) Show that this implies the following characterization for the sub-Riemannian Hamiltonian
\[ H(\lambda) = \frac{1}{2} \|\lambda\|^2, \qquad \|\lambda\| = \sup_{v \in D_q,\ |v| = 1} |\langle \lambda, v \rangle|. \]
When the structure is Riemannian, $H$ is the inverse norm defined on the cotangent space.

4.3.2 Abnormal extremals

In this section we provide a geometric characterization of abnormal extremals. Even if for abnormal extremals it is not possible to determine a priori their regularity, we show that they can be characterized as characteristic curves of the symplectic form. This gives a unified point of view on both classes of extremals.

We recall that an abnormal extremal is a nonzero solution of the following equations
\[ \dot\lambda(t) = \sum_{i=1}^m u_i(t) \, \vec{h}_i(\lambda(t)), \qquad h_i(\lambda(t)) = 0, \quad i = 1,\dots,m, \]
where $\{f_1,\dots,f_m\}$ is a generating family for the sub-Riemannian structure and $h_1,\dots,h_m$ are the corresponding functions on $T^*M$ linear on fibers. In particular every abnormal extremal is contained in the set
\[ H^{-1}(0) = \{ \lambda \in T^*M \mid \langle \lambda, f_i(q) \rangle = 0,\ i = 1,\dots,m,\ q = \pi(\lambda) \}, \tag{4.36} \]
where $H$ denotes the sub-Riemannian Hamiltonian (4.31).

Proposition 4.32. Let $H$ be the sub-Riemannian Hamiltonian and assume that $H^{-1}(0)$ is a smooth manifold. Then a Lipschitz curve $\gamma$ is a characteristic curve for $\sigma|_{H^{-1}(0)}$ if and only if it is the reparametrization of an abnormal extremal on $H^{-1}(0)$.

Proof. In this proof we denote for simplicity $N := H^{-1}(0) \subset T^*M$. For every $\lambda \in N$ we have the identity
\[ \mathrm{Ker}\, \sigma_\lambda|_N = T_\lambda N^{\angle} = \mathrm{span}\{ \vec{h}_i(\lambda) \mid i = 1,\dots,m \}. \tag{4.37} \]
Indeed, from the definition of $N$, it follows that
\[ \begin{aligned} T_\lambda N &= \{ w \in T_\lambda(T^*M) \mid \langle d_\lambda h_i, w \rangle = 0,\ i = 1,\dots,m \} \\ &= \{ w \in T_\lambda(T^*M) \mid \sigma(w, \vec{h}_i(\lambda)) = 0,\ i = 1,\dots,m \} \\ &= \mathrm{span}\{ \vec{h}_i(\lambda) \mid i = 1,\dots,m \}^{\angle}, \end{aligned} \]
and (4.37) follows by taking the skew-orthogonal on both sides. Thus $w \in T_\lambda H^{-1}(0)^{\angle}$ if and only if $w$ is a linear combination of the vectors $\vec{h}_i(\lambda)$. This implies that $\lambda(t)$ is a characteristic curve for $\sigma|_{H^{-1}(0)}$ if and only if there exist controls $u_i(\cdot)$ for $i = 1,\dots,m$ such that
\[ \dot\lambda(t) = \sum_{i=1}^m u_i(t) \, \vec{h}_i(\lambda(t)). \tag{4.38} \]

Notice that $0$ is never a regular value of $H$. Nevertheless, the following exercise shows that the assumption of Proposition 4.32 is always satisfied in the case of a regular sub-Riemannian structure.

Exercise 4.33. Assume that the sub-Riemannian structure is regular, namely the following assumption holds
\[ \dim D_q = \dim \mathrm{span}_q\{f_1,\dots,f_m\} = \text{const}. \tag{4.39} \]
Then prove that the set $H^{-1}(0)$ defined by (4.36) is a smooth submanifold of $T^*M$.

Remark 4.34. From Proposition 4.32 it follows that abnormal extremals do not depend on the sub-Riemannian metric, but only on the distribution. Indeed the set $H^{-1}(0)$ is characterized as the annihilator $D^\perp$ of the distribution
\[ H^{-1}(0) = \{ \lambda \in T^*M \mid \langle \lambda, v \rangle = 0,\ \forall\, v \in D_{\pi(\lambda)} \} = D^\perp \subset T^*M. \]
Here the orthogonal is meant in the duality sense.

Under the regularity assumption (4.39) we can select (at least locally) a basis of 1-forms $\omega_1,\dots,\omega_m$ for the annihilator of the distribution
\[ D_q^\perp = \mathrm{span}\{ \omega_i(q) \mid i = 1,\dots,m \}. \tag{4.40} \]
Let us complete this set of 1-forms to a basis $\omega_1,\dots,\omega_n$ of $T^*M$ and consider the induced coordinates $h_1,\dots,h_n$ as defined in Remark 4.15. In these coordinates the restriction of the symplectic structure to $D^\perp$ is expressed as follows
\[ \sigma|_{D^\perp} = d(s|_{D^\perp}) = \sum_{i=1}^m dh_i \wedge \omega_i + h_i \, d\omega_i. \tag{4.41} \]
We stress that the restriction $\sigma|_{D^\perp}$ can be written only in terms of the elements $\omega_1,\dots,\omega_m$ (and not of a full basis of 1-forms) since the differential $d$ commutes with the restriction.

4.3.3 Example: codimension one distribution and contact distributions

Let $M$ be an $n$-dimensional manifold endowed with a constant rank distribution $D$ of codimension one, i.e., $\dim D_q = n-1$ for every $q \in M$. In this case $D$ and $D^\perp$ are sub-bundles of $TM$ and $T^*M$ respectively and their dimensions, as smooth manifolds, are
\[ \dim D = \dim M + \mathrm{rank}\, D = 2n-1, \qquad \dim D^\perp = \dim M + \mathrm{rank}\, D^\perp = n+1. \]
Since the symplectic form $\sigma$ is skew-symmetric, a dimensional argument implies that for $n$ even, the restriction $\sigma|_{D^\perp}$ always has a nontrivial kernel. Hence there always exist characteristic curves of $\sigma|_{D^\perp}$, which correspond to reparametrized abnormal extremals by Proposition 4.32.

Let us consider in more detail the case $n = 3$. Assume that there exists a one form $\omega \in \Lambda^1(M)$ such that $D = \ker \omega$ (this is not restrictive for a local description). Consider a basis of one forms $\omega_0,\omega_1,\omega_2$ such that $\omega_0 := \omega$ and the coordinates $h_0,h_1,h_2$ associated to these forms (see Remark 4.15). By (4.41)
\[ \sigma|_{D^\perp} = dh_0 \wedge \omega + h_0 \, d\omega, \tag{4.42} \]
and we can easily compute (recall that $D^\perp$ is 4-dimensional)
\[ \sigma \wedge \sigma|_{D^\perp} = 2 h_0 \, dh_0 \wedge \omega \wedge d\omega. \tag{4.43} \]

Lemma 4.35. Let $N$ be a smooth $2k$-dimensional manifold and $\Omega \in \Lambda^2 N$. Then $\Omega$ is nondegenerate on $N$ if and only if $\wedge^k \Omega \neq 0$. (Here $\wedge^k \Omega = \Omega \wedge \dots \wedge \Omega$, with $k$ factors.)

Definition 4.36. Let $M$ be a three dimensional manifold. We say that a constant rank distribution $D = \ker \omega$ on $M$ of corank one is a contact distribution if $\omega \wedge d\omega \neq 0$.

For a three dimensional manifold $M$ endowed with a distribution $D = \ker \omega$ we define the Martinet set as
\[ \mathcal{M} = \{ q \in M \mid (\omega \wedge d\omega)|_q = 0 \} \subset M. \]

Corollary 4.37. Under the previous assumptions all nontrivial abnormal extremal trajectories are contained in the Martinet set $\mathcal{M}$. In particular, if the structure is contact, there are no nontrivial abnormal extremal trajectories.

Proof. By Proposition 4.32 any abnormal extremal $\lambda(t)$ is a characteristic curve of $\sigma|_{D^\perp}$. By Lemma 4.35, $\sigma|_{D^\perp}$ is degenerate if and only if $\sigma \wedge \sigma|_{D^\perp} = 0$, which is in turn equivalent to $\omega \wedge d\omega = 0$ thanks to (4.43) (notice that $dh_0$ and $\omega \wedge d\omega$ are independent since they depend on coordinates on the fibers and on the manifold, respectively). This shows that, if $\gamma(t)$ is an abnormal trajectory and $\lambda(t)$ is the associated abnormal extremal, then $\lambda(t)$ is a characteristic curve of $\sigma|_{D^\perp}$ if and only if $(\omega \wedge d\omega)|_{\gamma(t)} = 0$, that is $\gamma(t) \in \mathcal{M}$. By definition of $\mathcal{M}$ it follows that, if $D$ is contact, then $\mathcal{M}$ is empty.

Remark 4.38. Since $M$ is three dimensional, we can write $\omega \wedge d\omega = a \, dV$ where $a \in C^\infty(M)$ and $dV$ is some smooth volume form on $M$, i.e., a never vanishing 3-form on $M$.

In particular the Martinet set is $\mathcal{M} = a^{-1}(0)$ and the distribution is contact if and only if the function $a$ is never vanishing. When $0$ is a regular value of $a$, the set $a^{-1}(0)$ defines a two dimensional surface in $M$, called the Martinet surface. Notice that this condition is satisfied for a generic choice of the (one form defining the) distribution.

Abnormal extremal trajectories are the horizontal curves that are contained in the Martinet surface. When $\mathcal{M}$ is smooth, the intersection of the tangent bundle to the surface $\mathcal{M}$ and the 2-dimensional distribution of admissible velocities defines, generically, a line field on $\mathcal{M}$. Abnormal extremal trajectories coincide with the integral curves of this line field, up to a reparametrization.

4.4 Examples

4.4.1 2D Riemannian Geometry

Let $M$ be a 2-dimensional manifold and $f_1,f_2 \in \mathrm{Vec}(M)$ a local orthonormal frame for the Riemannian structure. The problem of finding length-minimizers on $M$ can be described as the optimal control problem
\[ \dot{q}(t) = u_1(t) f_1(q(t)) + u_2(t) f_2(q(t)), \]
where length and energy are expressed as
\[ \ell(q(\cdot)) = \int_0^T \sqrt{u_1(t)^2 + u_2(t)^2} \, dt, \qquad J(q(\cdot)) = \frac{1}{2} \int_0^T \big( u_1(t)^2 + u_2(t)^2 \big) \, dt. \]
Geodesics are projections of integral curves of the Hamiltonian vector field associated with the sub-Riemannian Hamiltonian in $T^*M$
\[ H(\lambda) = \frac{1}{2} \big( h_1(\lambda)^2 + h_2(\lambda)^2 \big), \qquad h_i(\lambda) = \langle \lambda, f_i(q) \rangle, \quad i = 1,2. \]
Since the vector fields $f_1$ and $f_2$ are linearly independent, the functions $(h_1,h_2)$ define a system of coordinates on the fibers of $T^*M$. In what follows it is convenient to use $(q,h_1,h_2)$ as coordinates on $T^*M$ (even if coordinates on the manifold are not necessarily fixed).

Let us start by showing that there are no abnormal extremals. Indeed if $\lambda(t)$ is an abnormal extremal and $\gamma(t)$ is the associated abnormal trajectory we have
\[ \langle \lambda(t), f_1(\gamma(t)) \rangle = \langle \lambda(t), f_2(\gamma(t)) \rangle = 0, \qquad \forall\, t \in [0,T], \tag{4.44} \]
which implies that $\lambda(t) = 0$ for all $t \in [0,T]$ since $\{f_1,f_2\}$ is a basis of the tangent space at every point. This is a contradiction since $\lambda(t) \neq 0$ by Theorem 4.20.

Suppose now that $\lambda(t)$ is a normal extremal. Then $u_i(t) = h_i(\lambda(t))$ and the equation on the base is
\[ \dot{q} = h_1 f_1(q) + h_2 f_2(q). \tag{4.45} \]
For the equation on the fiber we have (remember that along solutions $\dot{a} = \{H,a\}$)
\[ \begin{cases} \dot{h}_1 = \{H,h_1\} = -\{h_1,h_2\} h_2, \\ \dot{h}_2 = \{H,h_2\} = \{h_1,h_2\} h_1. \end{cases} \tag{4.46} \]
From here one can see directly that $H$ is constant along solutions. Indeed
\[ \dot{H} = h_1 \dot{h}_1 + h_2 \dot{h}_2 = 0. \]
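As an illustration of (4.45) and (4.46), here is a minimal numerical sketch that is not part of the original text. It assumes, purely as an example, the hyperbolic half-plane $\{(x,y) : y > 0\}$ with orthonormal frame $f_1 = y\,\partial_x$, $f_2 = y\,\partial_y$, for which $[f_1,f_2] = -f_1$ and hence, by the identity $\{a_X,a_Y\} = a_{[X,Y]}$, $\{h_1,h_2\} = -h_1$. The script integrates the system with a Runge-Kutta scheme and monitors two quantities: the Hamiltonian $H$, and the centre and squared radius of the half-circle orthogonal to the $x$-axis through the current point and tangent to the velocity (such half-circles are the classical geodesics of this model). Function names are hypothetical.

```python
import math

def rhs(s):
    """Right-hand side of (4.45)-(4.46) for the frame f1 = y d/dx, f2 = y d/dy."""
    x, y, h1, h2 = s
    bracket = -h1                # {h1,h2} = <lambda,[f1,f2]> = -h1, since [f1,f2] = -f1
    return (y * h1,              # x-component of h1 f1 + h2 f2
            y * h2,              # y-component of h1 f1 + h2 f2
            -bracket * h2,       # h1' = -{h1,h2} h2
            bracket * h1)        # h2' =  {h1,h2} h1

def rk4(s, t, steps=3000):
    """Fourth-order Runge-Kutta integration of s' = rhs(s) over the time interval [0, t]."""
    dt = t / steps
    for _ in range(steps):
        k1 = rhs(s)
        k2 = rhs(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)))
        k3 = rhs(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)))
        k4 = rhs(tuple(si + dt * ki for si, ki in zip(s, k3)))
        s = tuple(si + dt * (a + 2 * b + 2 * c + d) / 6.0
                  for si, a, b, c, d in zip(s, k1, k2, k3, k4))
    return s

def hamiltonian(s):
    """H = (h1^2 + h2^2) / 2."""
    return 0.5 * (s[2] ** 2 + s[3] ** 2)

def circle(s):
    """Centre (c, 0) and squared radius of the circle through (x, y) whose tangent is (h1, h2)."""
    x, y, h1, h2 = s
    c = x + y * h2 / h1          # requires h1 != 0 (the geodesic is not a vertical line)
    return c, (x - c) ** 2 + y ** 2

# Arclength-parametrized initial covector: h1 = cos(theta), h2 = sin(theta) with theta = 0.3.
s0 = (0.0, 1.0, math.cos(0.3), math.sin(0.3))
s1 = rk4(s0, 1.5)
energy_drift = abs(hamiltonian(s1) - hamiltonian(s0))
centre_drift = abs(circle(s1)[0] - circle(s0)[0])
radius_drift = abs(circle(s1)[1] - circle(s0)[1])
```

All three drifts stay at the level of the integration error, consistently with $\dot{H} = 0$ and with the trajectory remaining on a single half-circle centred on the $x$-axis.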

If we require that extremals are parametrized by arclength, $u_1(t)^2 + u_2(t)^2 = 1$ for a.e. $t \in [0,T]$, we have
\[ H(\lambda(t)) = \frac{1}{2} \quad \Longleftrightarrow \quad h_1^2(\lambda(t)) + h_2^2(\lambda(t)) = 1. \]
It is then convenient to restrict to the spherical cotangent bundle $S^*M$ (see Example 2.44) with coordinates $(q,\theta)$, by setting
\[ h_1 = \cos\theta, \qquad h_2 = \sin\theta. \]
Let $a_1,a_2 \in C^\infty(M)$ be such that
\[ [f_1,f_2] = a_1 f_1 + a_2 f_2. \tag{4.47} \]
Since $\{h_1,h_2\}(\lambda) = \langle \lambda, [f_1,f_2] \rangle$, we have $\{h_1,h_2\} = a_1 h_1 + a_2 h_2$ and equations (4.45) and (4.46) are rewritten in $(\theta,q)$ coordinates as
\[ \begin{cases} \dot\theta = a_1(q)\cos\theta + a_2(q)\sin\theta, \\ \dot{q} = \cos\theta \, f_1(q) + \sin\theta \, f_2(q). \end{cases} \tag{4.48} \]
In other words, an arclength parametrized curve on $M$ (i.e. a curve which satisfies the second equation) is a geodesic if and only if it satisfies the first. Heuristically this suggests that the quantity
\[ \dot\theta - a_1(q)\cos\theta - a_2(q)\sin\theta \]
has some relation with the geodesic curvature on $M$.

Let $\mu_1,\mu_2$ be the dual frame of $f_1,f_2$ (so that $dV = \mu_1 \wedge \mu_2$) and consider the Hamiltonian field in these coordinates
\[ \vec{H} = \cos\theta \, f_1 + \sin\theta \, f_2 + (a_1\cos\theta + a_2\sin\theta) \, \partial_\theta. \tag{4.49} \]
The Levi-Civita connection on $M$ is expressed by some coefficients (see Chapter ??)
\[ \omega = d\theta + b_1 \mu_1 + b_2 \mu_2, \]
where $b_i = b_i(q)$. On the other hand geodesics are projections of integral curves of $\vec{H}$, so that
\[ \langle \omega, \vec{H} \rangle = 0 \quad \Longrightarrow \quad b_1 = -a_1, \quad b_2 = -a_2. \]
In particular, if we apply $\omega = d\theta - a_1\mu_1 - a_2\mu_2$ to a generic curve $\lambda(t)$ (not necessarily a geodesic) which projects on $\gamma$, with
\[ \dot\lambda = \cos\theta \, f_1 + \sin\theta \, f_2 + \dot\theta \, \partial_\theta, \]
we find the geodesic curvature
\[ \kappa_g(\gamma) = \dot\theta - a_1(q)\cos\theta - a_2(q)\sin\theta, \]
as we inferred above. To end this section we prove a useful formula for the Gaussian curvature of $M$.

Corollary 4.39. If $\kappa$ denotes the Gaussian curvature of $M$ we have
\[ \kappa = f_1(a_2) - f_2(a_1) - a_1^2 - a_2^2. \]

Proof. From (1.58) we have $d\omega = -\kappa \, dV$ where $dV = \mu_1 \wedge \mu_2$ is the Riemannian volume form. On the other hand, using the following identities
\[ d\mu_i = -a_i \, \mu_1 \wedge \mu_2, \qquad da_i = f_1(a_i)\mu_1 + f_2(a_i)\mu_2, \qquad i = 1,2, \]
we can compute
\[ \begin{aligned} d\omega &= -da_1 \wedge \mu_1 - da_2 \wedge \mu_2 - a_1 \, d\mu_1 - a_2 \, d\mu_2 \\ &= -\big( f_1(a_2) - f_2(a_1) - a_1^2 - a_2^2 \big) \, \mu_1 \wedge \mu_2. \end{aligned} \]

4.4.2 Isoperimetric problem

Let $M$ be a 2-dimensional orientable Riemannian manifold and $\nu$ its Riemannian volume form. Fix a smooth one-form $A \in \Lambda^1 M$ and $c \in \mathbb{R}$.

Problem 1. Fix $c \in \mathbb{R}$ and $q_0,q_1 \in M$. Find, whenever it exists, the solution to
\[ \min \Big\{ \ell(\gamma) : \gamma(0) = q_0,\ \gamma(T) = q_1,\ \int_\gamma A = c \Big\}. \tag{4.50} \]

Remark 4.40. Minimizers depend only on $dA$, i.e., if we add an exact term to $A$ we find the same minima for the problem (with a different value of $c$).

Problem 1 can be reformulated as a sub-Riemannian problem on the extended manifold $\overline{M} = M \times \mathbb{R}$, in the sense that solutions of Problem 1 turn out to be length minimizers for a suitable sub-Riemannian structure on $\overline{M}$, which we are going to construct.

To every curve $\gamma$ on $M$ satisfying $\gamma(0) = q_0$ and $\gamma(T) = q_1$ we can associate the function
\[ z(t) = \int_{\gamma|_{[0,t]}} A = \int_0^t A(\dot\gamma(s)) \, ds. \]
The curve $\xi(t) = (\gamma(t),z(t))$ defined on $\overline{M}$ satisfies $\omega(\dot\xi(t)) = 0$, where $\omega = dz - A$ is a one form on $\overline{M}$, since
\[ \omega(\dot\xi(t)) = \dot{z}(t) - A(\dot\gamma(t)) = 0. \]
Equivalently, $\dot\xi(t) \in D_{\xi(t)}$ where $D = \ker \omega$. We define a metric on $D$ by defining the norm of a vector $v \in D$ as the Riemannian norm of its projection $\pi_* v$ on $M$, where $\pi : \overline{M} \to M$ is the canonical projection on the first factor. This endows $\overline{M}$ with a sub-Riemannian structure.

If we fix a local orthonormal frame $f_1,f_2$ for $M$, the pair $(\gamma(t),z(t))$ satisfies
\[ \begin{pmatrix} \dot\gamma \\ \dot{z} \end{pmatrix} = u_1 \begin{pmatrix} f_1 \\ \langle A,f_1 \rangle \end{pmatrix} + u_2 \begin{pmatrix} f_2 \\ \langle A,f_2 \rangle \end{pmatrix}. \tag{4.51} \]
Hence the two vector fields on $\overline{M}$
\[ F_1 = f_1 + \langle A,f_1 \rangle \partial_z, \qquad F_2 = f_2 + \langle A,f_2 \rangle \partial_z, \]

define an orthonormal frame for the metric defined above on $D = \mathrm{span}(F_1,F_2)$. Problem 1 is then equivalent to the following:

Problem 2. Fix $c \in \mathbb{R}$ and $q_0,q_1 \in M$. Find, whenever it exists, the solution to
\[ \min \big\{ \ell(\xi) : \xi(0) = (q_0,0),\ \xi(T) = (q_1,c),\ \dot\xi(t) \in D_{\xi(t)} \big\}. \tag{4.52} \]

Notice that, by construction, $D$ is a distribution of constant rank (equal to 2) but is not necessarily bracket-generating.

Let us now compute normal and abnormal extremals associated to the sub-Riemannian structure just introduced on $\overline{M}$. In what follows we denote by $h_i(\lambda) = \langle \lambda, F_i(q) \rangle$ the Hamiltonians linear on fibers of $T^*\overline{M}$.

Normal extremals

Normal extremal trajectories are projections of integral curves of the Hamiltonian vector field associated with the sub-Riemannian Hamiltonian in $T^*\overline{M}$
\[ H(\lambda) = \frac{1}{2} \big( h_1^2(\lambda) + h_2^2(\lambda) \big), \qquad h_i(\lambda) = \langle \lambda, F_i(q) \rangle, \quad i = 1,2. \]
Let us introduce $F_0 = \partial_z$ and $h_0(\lambda) = \langle \lambda, F_0(q) \rangle$. Since $F_1,F_2$ and $F_0$ are linearly independent, $(h_1,h_2,h_0)$ defines a system of coordinates on the fibers of $T^*\overline{M}$. In what follows it is convenient to use $(q,h_1,h_2,h_0)$ as coordinates on $T^*\overline{M}$.

For a normal extremal we have $u_i(t) = h_i(\lambda(t))$ for $i = 1,2$ and the equation on the base is
\[ \dot\xi = h_1 F_1(\xi) + h_2 F_2(\xi). \tag{4.53} \]
For the equation on the fibers we have (remember that along solutions $\dot{a} = \{H,a\}$)
\[ \begin{cases} \dot{h}_1 = \{H,h_1\} = -\{h_1,h_2\} h_2, \\ \dot{h}_2 = \{H,h_2\} = \{h_1,h_2\} h_1, \\ \dot{h}_0 = \{H,h_0\} = 0. \end{cases} \tag{4.54} \]
If we require that extremals are parametrized by arclength we can restrict to the cylinder of the cotangent bundle $T^*\overline{M}$ defined by
\[ h_1 = \cos\theta, \qquad h_2 = \sin\theta. \]
Let $a_1,a_2 \in C^\infty(M)$ be such that
\[ [f_1,f_2] = a_1 f_1 + a_2 f_2. \tag{4.55} \]
Then
\[ \begin{aligned} [F_1,F_2] &= [f_1 + \langle A,f_1 \rangle \partial_z,\ f_2 + \langle A,f_2 \rangle \partial_z] \\ &= [f_1,f_2] + \big( f_1\langle A,f_2 \rangle - f_2\langle A,f_1 \rangle \big) \partial_z \\ &= a_1 \big( F_1 - \langle A,f_1 \rangle \partial_z \big) + a_2 \big( F_2 - \langle A,f_2 \rangle \partial_z \big) + \big( f_1\langle A,f_2 \rangle - f_2\langle A,f_1 \rangle \big) \partial_z \qquad \text{(by (4.55))} \\ &= a_1 F_1 + a_2 F_2 + dA(f_1,f_2) \, \partial_z, \end{aligned} \]
where in the last equality we use Cartan's formula (cf. (4.74) for a proof). Let $\mu_1,\mu_2$ be the dual forms to $f_1$ and $f_2$. Then $\nu = \mu_1 \wedge \mu_2$ and we can write $dA = b \, \mu_1 \wedge \mu_2$, for a suitable function $b \in C^\infty(M)$. In this case
\[ [F_1,F_2] = a_1 F_1 + a_2 F_2 + b \, \partial_z, \]

and
\[ \{h_1,h_2\} = \langle \lambda, [F_1,F_2] \rangle = a_1 h_1 + a_2 h_2 + b h_0. \tag{4.56} \]
With computations analogous to the 2D case we obtain the Hamiltonian system associated to $H$ in the $(q,\theta,h_0)$ coordinates
\[ \begin{cases} \dot\xi = \cos\theta \, F_1(\xi) + \sin\theta \, F_2(\xi), \\ \dot\theta = a_1\cos\theta + a_2\sin\theta + b h_0, \\ \dot{h}_0 = 0. \end{cases} \tag{4.57} \]
In other words, if $q(t) = \pi(\xi(t))$ is the projection of a normal extremal path on $M$ (here $\pi : \overline{M} \to M$), its geodesic curvature
\[ \kappa_g(q(t)) = \dot\theta(t) - a_1(q(t))\cos\theta(t) - a_2(q(t))\sin\theta(t) \tag{4.58} \]
satisfies
\[ \kappa_g(q(t)) = b(q(t)) \, h_0. \tag{4.59} \]
Namely, projections on $M$ of normal extremal paths are curves with geodesic curvature proportional to the function $b$ at every point. The case $b$ equal to a constant is treated in the example of Section ??.

Abnormal extremals

We prove the following characterization of abnormal extremals.

Lemma 4.41. Abnormal extremal trajectories are contained in the Martinet set $\mathcal{M} = \{b = 0\}$.

Proof. Assume that $\lambda(t)$ is an abnormal extremal whose projection is a curve $\xi(t) = \pi(\lambda(t))$ that is not reduced to a point. Then we have
\[ h_1(\lambda(t)) = \langle \lambda(t), F_1(\xi(t)) \rangle = 0, \qquad h_2(\lambda(t)) = \langle \lambda(t), F_2(\xi(t)) \rangle = 0, \qquad \forall\, t \in [0,T]. \tag{4.60} \]
We can differentiate the two equalities with respect to $t \in [0,T]$ and we get
\[ \frac{d}{dt} h_1(\lambda(t)) = -u_2(t) \{h_1,h_2\}|_{\lambda(t)} = 0, \qquad \frac{d}{dt} h_2(\lambda(t)) = u_1(t) \{h_1,h_2\}|_{\lambda(t)} = 0. \]
Since the pair $(u_1(t),u_2(t)) \neq (0,0)$ we have that $\{h_1,h_2\}|_{\lambda(t)} = 0$, which implies
\[ 0 = \langle \lambda(t), [F_1,F_2](\xi(t)) \rangle = b(\xi(t)) \, h_0, \tag{4.61} \]
where in the last equality we used (4.56) and the fact that $h_1(\lambda(t)) = h_2(\lambda(t)) = 0$. Recall that $h_0 \neq 0$, otherwise the covector is identically zero (which is not possible for abnormals); hence $b(\xi(t)) = 0$ for all $t \in [0,T]$.

The last result shows that abnormal extremal trajectories are forced to live in connected components of $b^{-1}(0)$.

Exercise 4.42. Prove that the set $b^{-1}(0)$ is independent of the Riemannian metric chosen on $M$ (and of the corresponding sub-Riemannian metric defined on $D$).
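A short worked instance may help to see how the function $b$ and the set $b^{-1}(0)$ are computed in practice. The choice of $M$, of the frame and of the one-form $A$ below is an illustrative assumption and does not come from the text; it is a variant of what is usually called the flat Martinet structure.

```latex
% Assumed data: M = R^2 with the Euclidean metric, f_1 = \partial_x, f_2 = \partial_y,
% so that a_1 = a_2 = 0 and \nu = dx \wedge dy, together with the one-form
\[
  A = -\tfrac{1}{2}\, y^2 \, dx , \qquad
  dA = -y \, dy \wedge dx = y \, dx \wedge dy
  \quad\Longrightarrow\quad b(x,y) = y .
\]
% Lifted frame on \overline{M} = R^3 and its bracket:
\[
  F_1 = \partial_x - \tfrac{1}{2}\, y^2 \, \partial_z , \qquad
  F_2 = \partial_y , \qquad
  [F_1,F_2] = y \, \partial_z = b \, \partial_z .
\]
% Martinet set: b^{-1}(0) = \{ y = 0 \}. A horizontal curve contained in it has
% \dot y = 0 and \dot z = A(\dot\gamma) = -\tfrac{1}{2} y^2 \dot x = 0, hence it is of the form
\[
  t \mapsto (x(t),\, 0,\, z_0) .
\]
```

In this example the lemma above forces any nontrivial abnormal extremal trajectory to be a reparametrization of a line $\{y = 0,\ z = z_0\}$, while by (4.59) the projections of normal extremal paths have geodesic curvature $h_0\, y$.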

4.4.3 Heisenberg group

The Heisenberg group is a basic example in sub-Riemannian geometry. It is the sub-Riemannian structure defined by the isoperimetric problem in $M = \mathbb{R}^2 = \{(x,y)\}$ endowed with its Euclidean scalar product and the 1-form (cf. previous section)
\[ A = \frac{1}{2} (x \, dy - y \, dx). \]
Notice that $dA = dx \wedge dy$ defines the area form on $\mathbb{R}^2$, hence $b \equiv 1$ in this case. On the extended manifold $\overline{M} = \mathbb{R}^3 = \{(x,y,z)\}$ the one-form $\omega$ is written as
\[ \omega = dz - \frac{1}{2} (x \, dy - y \, dx). \]
Following the notation of the previous paragraph we can choose as an orthonormal frame for $\mathbb{R}^2$ the frame $f_1 = \partial_x$ and $f_2 = \partial_y$. This induces the choice
\[ F_1 = \partial_x - \frac{y}{2} \partial_z, \qquad F_2 = \partial_y + \frac{x}{2} \partial_z \]
for the orthonormal frame on $D = \ker \omega$. Notice that $[F_1,F_2] = \partial_z$, which implies that $D$ is bracket-generating at every point. Defining $F_0 = \partial_z$ and $h_i = \langle \lambda, F_i(q) \rangle$ for $i = 0,1,2$, the Hamiltonians linear on fibers of $T^*\overline{M}$, we have
\[ \{h_1,h_2\} = h_0, \]
hence the equations (4.57) for normal extremals become
\[ \begin{cases} \dot{q} = \cos\theta \, F_1(q) + \sin\theta \, F_2(q), \\ \dot\theta = h_0, \\ \dot{h}_0 = 0. \end{cases} \tag{4.62} \]
It follows that the last two equations can be immediately solved
\[ \begin{cases} \theta(t) = \theta_0 + h_0 t, \\ h_0(t) = h_0. \end{cases} \tag{4.63} \]
Moreover
\[ \begin{cases} h_1(t) = \cos(\theta_0 + h_0 t), \\ h_2(t) = \sin(\theta_0 + h_0 t). \end{cases} \tag{4.64} \]
From these formulas and the explicit expression of $F_1$ and $F_2$ it is immediate to recover the normal extremal trajectories starting from the origin ($x_0 = y_0 = z_0 = 0$) in the case $h_0 \neq 0$
\[ \begin{cases} x(t) = \dfrac{1}{h_0} \big( \sin(\theta_0 + h_0 t) - \sin(\theta_0) \big), \\[2mm] y(t) = -\dfrac{1}{h_0} \big( \cos(\theta_0 + h_0 t) - \cos(\theta_0) \big), \end{cases} \tag{4.65} \]
and the vertical coordinate $z$ is computed as the integral
\[ z(t) = \frac{1}{2} \int_0^t \big( x(s) y'(s) - y(s) x'(s) \big) \, ds = \frac{1}{2 h_0^2} \big( h_0 t - \sin(h_0 t) \big). \]
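The closed-form expressions for $x(t)$, $y(t)$ and $z(t)$ with $h_0 \neq 0$ can be cross-checked against a direct numerical integration of (4.62). The following is a minimal sketch under the conventions stated above ($F_1 = \partial_x - \frac{y}{2}\partial_z$, $F_2 = \partial_y + \frac{x}{2}\partial_z$, initial point at the origin); the function names and the choice of a Runge-Kutta integrator are illustrative assumptions.

```python
import math

def rhs(s, h0):
    """Normal extremal equations (4.62) in the coordinates (x, y, z, theta)."""
    x, y, z, th = s
    c, sn = math.cos(th), math.sin(th)
    # q' = cos(th) F1 + sin(th) F2 with F1 = d/dx - (y/2) d/dz, F2 = d/dy + (x/2) d/dz
    return (c, sn, 0.5 * (x * sn - y * c), h0)

def integrate(theta0, h0, t, steps=4000):
    """Fourth-order Runge-Kutta integration starting from the origin; returns (x, y, z)."""
    s = (0.0, 0.0, 0.0, theta0)
    dt = t / steps
    for _ in range(steps):
        k1 = rhs(s, h0)
        k2 = rhs(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)), h0)
        k3 = rhs(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)), h0)
        k4 = rhs(tuple(si + dt * ki for si, ki in zip(s, k3)), h0)
        s = tuple(si + dt * (a + 2 * b + 2 * c + d) / 6.0
                  for si, a, b, c, d in zip(s, k1, k2, k3, k4))
    return s[:3]

def closed_form(theta0, h0, t):
    """Explicit normal extremal trajectory from the origin; requires h0 != 0."""
    x = (math.sin(theta0 + h0 * t) - math.sin(theta0)) / h0
    y = -(math.cos(theta0 + h0 * t) - math.cos(theta0)) / h0
    z = (h0 * t - math.sin(h0 * t)) / (2.0 * h0 * h0)
    return (x, y, z)

# Compare the two descriptions for a couple of sample parameters (both signs of h0).
errors = []
for theta0, h0, t in [(0.4, 1.3, 2.0), (1.0, -0.7, 3.0)]:
    numeric = integrate(theta0, h0, t)
    exact = closed_form(theta0, h0, t)
    errors.append(max(abs(u - v) for u, v in zip(numeric, exact)))
```

The entries of `errors` stay at the level of the integration error. Note that the sign of $h_0$ determines both the orientation of the circle described by $(x(t),y(t))$ and the sign of $z(t)$.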

When $h_0 = 0$ the curve is simply a straight line
\[ \begin{cases} x(t) = \cos(\theta_0) \, t, \\ y(t) = \sin(\theta_0) \, t, \\ z(t) = 0. \end{cases} \tag{4.66} \]
Notice that, as we know from the results of the previous paragraph, normal extremal trajectories are curves whose projection on $\mathbb{R}^2 = \{(x,y)\}$ has constant geodesic curvature, i.e., straight lines or circles in $\mathbb{R}^2$ (which correspond to horizontal lines and helices in $\overline{M}$). There are no nontrivial abnormal geodesics since $b = 1$.

4.5 Lie derivative

In this section we extend the notion of Lie derivative, already introduced for vector fields in Section 3.2, to differential forms. Recall that if $X,Y \in \mathrm{Vec}(M)$ are two vector fields we define
\[ L_X Y = [X,Y] = \frac{d}{dt}\Big|_{t=0} e^{-tX}_* Y. \]
If $P : M \to M$ is a diffeomorphism we can consider the pullback $P^* : T^*_{P(q)}M \to T^*_qM$ and extend its action to $k$-forms. Let $\omega \in \Lambda^k M$; we define $P^*\omega \in \Lambda^k M$ in the following way:
\[ (P^*\omega)_q(\xi_1,\dots,\xi_k) := \omega_{P(q)}(P_*\xi_1,\dots,P_*\xi_k), \qquad q \in M, \quad \xi_i \in T_qM. \tag{4.67} \]
It is an easy check that this operation is linear and satisfies the two following properties
\[ P^*(\omega_1 \wedge \omega_2) = P^*\omega_1 \wedge P^*\omega_2, \tag{4.68} \]
\[ P^* \circ d = d \circ P^*. \tag{4.69} \]

Definition 4.43. Let $X \in \mathrm{Vec}(M)$ and $\omega \in \Lambda^k M$, where $k \geq 0$. We define the Lie derivative of $\omega$ with respect to $X$ as
\[ L_X : \Lambda^k M \to \Lambda^k M, \qquad L_X \omega = \frac{d}{dt}\Big|_{t=0} (e^{tX})^* \omega. \tag{4.70} \]
When $k = 0$ this definition recovers the Lie derivative of smooth functions $L_X f = Xf$, for $f \in C^\infty(M)$. From (4.68) and (4.69), we easily deduce the following properties of the Lie derivative:
(i) $L_X(\omega_1 \wedge \omega_2) = (L_X\omega_1) \wedge \omega_2 + \omega_1 \wedge (L_X\omega_2)$,
(ii) $L_X \circ d = d \circ L_X$.
The first of these properties can also be expressed by saying that $L_X$ is a derivation of the exterior algebra of $k$-forms.

The Lie derivative combines a $k$-form and a vector field, defining a new $k$-form. A second way of combining these two objects is to define their inner product, which is a $(k-1)$-form.

Definition 4.44. Let $X \in \mathrm{Vec}(M)$ and $\omega \in \Lambda^k M$, with $k \geq 1$. We define the inner product of $\omega$ and $X$ as the operator $i_X : \Lambda^k M \to \Lambda^{k-1} M$, where we set
\[ (i_X\omega)(Y_1,\dots,Y_{k-1}) := \omega(X,Y_1,\dots,Y_{k-1}), \qquad Y_i \in \mathrm{Vec}(M). \tag{4.71} \]

One can show that the operator $i_X$ is an anti-derivation, in the following sense:
\[ i_X(\omega_1 \wedge \omega_2) = (i_X\omega_1) \wedge \omega_2 + (-1)^{k_1} \omega_1 \wedge (i_X\omega_2), \qquad \omega_i \in \Lambda^{k_i} M, \quad i = 1,2. \tag{4.72} \]
We end this section by proving two classical formulas linking together these notions, usually referred to as Cartan's formulas.

Proposition 4.45 (Cartan's formula). The following identity holds true
\[ L_X = i_X \circ d + d \circ i_X. \tag{4.73} \]

Proof. Define $D_X := i_X \circ d + d \circ i_X$. It is easy to check that $D_X$ is a derivation on the algebra of $k$-forms, since $i_X$ and $d$ are anti-derivations. Let us show that $D_X$ commutes with $d$. Indeed, using that $d^2 = 0$, one gets
\[ d \circ D_X = d \circ i_X \circ d = D_X \circ d. \]
Since any $k$-form can be expressed in coordinates as $\omega = \sum \omega_{i_1 \dots i_k} \, dx_{i_1} \wedge \dots \wedge dx_{i_k}$, it is sufficient to prove that $L_X$ coincides with $D_X$ on functions. This last property is easily checked by
\[ D_X f = i_X(df) + d(\underbrace{i_X f}_{=0}) = \langle df, X \rangle = Xf = L_X f. \]

Corollary 4.46. Let $X,Y \in \mathrm{Vec}(M)$ and $\omega \in \Lambda^1 M$, then
\[ d\omega(X,Y) = X\langle \omega,Y \rangle - Y\langle \omega,X \rangle - \langle \omega,[X,Y] \rangle. \tag{4.74} \]

Proof. On one hand Definition 4.43 implies, by the Leibniz rule,
\[ \begin{aligned} \langle L_X\omega, Y \rangle_q &= \frac{d}{dt}\Big|_{t=0} \langle (e^{tX})^*\omega, Y \rangle_q \\ &= \frac{d}{dt}\Big|_{t=0} \langle \omega, e^{tX}_* Y \rangle_{e^{tX}(q)} \\ &= X\langle \omega,Y \rangle - \langle \omega,[X,Y] \rangle. \end{aligned} \]
On the other hand, Cartan's formula (4.73) gives
\[ \langle L_X\omega, Y \rangle = \langle i_X(d\omega), Y \rangle + \langle d(i_X\omega), Y \rangle = d\omega(X,Y) + Y\langle \omega,X \rangle. \]
Comparing the two identities one gets (4.74).

4.6 Symplectic geometry

In this section we generalize some of the constructions we considered on the cotangent bundle $T^*M$ to the case of a general symplectic manifold.

Definition 4.47. A symplectic manifold $(N,\sigma)$ is a smooth manifold $N$ endowed with a closed, non degenerate 2-form $\sigma \in \Lambda^2(N)$. A symplectomorphism of $N$ is a diffeomorphism $\phi : N \to N$ such that $\phi^*\sigma = \sigma$.

Notice that a symplectic manifold $N$ is necessarily even-dimensional. We stress that, in general, the symplectic form $\sigma$ is not exact, contrary to the case $N = T^*M$, where $\sigma = ds$.

The symplectic structure on a symplectic manifold $N$ permits us to define the Hamiltonian vector field $\vec{h} \in \mathrm{Vec}(N)$ associated with a function $h \in C^\infty(N)$ by the formula $i_{\vec{h}}\sigma = -dh$, or equivalently $\sigma(\cdot, \vec{h}) = dh$.

Proposition 4.48. A diffeomorphism $\phi : N \to N$ is a symplectomorphism if and only if for every $h \in C^\infty(N)$:
\[ (\phi^{-1})_* \vec{h} = \overrightarrow{h \circ \phi}. \tag{4.75} \]

Proof. Assume that $\phi$ is a symplectomorphism, namely $\phi^*\sigma = \sigma$. More precisely, this means that for every $\lambda \in N$ and every $v,w \in T_\lambda N$ one has
\[ \sigma_\lambda(v,w) = (\phi^*\sigma)_\lambda(v,w) = \sigma_{\phi(\lambda)}(\phi_* v, \phi_* w), \]
where the second equality is the definition of $\phi^*\sigma$. If we apply the above equality at $w = \phi^{-1}_* \vec{h}$ one gets, for every $\lambda \in N$ and $v \in T_\lambda N$,
\[ \begin{aligned} \sigma_\lambda(v, \phi^{-1}_* \vec{h}) &= (\phi^*\sigma)_\lambda(v, \phi^{-1}_* \vec{h}) = \sigma_{\phi(\lambda)}(\phi_* v, \vec{h}) \\ &= \langle d_{\phi(\lambda)} h, \phi_* v \rangle = \langle \phi^* d_{\phi(\lambda)} h, v \rangle \\ &= \langle d(h \circ \phi), v \rangle. \end{aligned} \]
This shows that $\sigma_\lambda(\cdot, \phi^{-1}_* \vec{h}) = d(h \circ \phi)$, that is (4.75). The converse implication follows analogously.

Next we want to characterize those vector fields whose flow generates a one-parameter family of symplectomorphisms.

Lemma 4.49. Let $X \in \mathrm{Vec}(N)$ be a complete vector field on a symplectic manifold $(N,\sigma)$. The following properties are equivalent:
(i) $(e^{tX})^*\sigma = \sigma$ for every $t \in \mathbb{R}$,
(ii) $L_X\sigma = 0$,
(iii) $i_X\sigma$ is a closed 1-form on $N$.

Proof. By the group property $e^{(t+s)X} = e^{tX} \circ e^{sX}$ one has the following identity for every $t \in \mathbb{R}$:
\[ \frac{d}{dt} (e^{tX})^*\sigma = \frac{d}{ds}\Big|_{s=0} (e^{tX})^* (e^{sX})^*\sigma = (e^{tX})^* L_X\sigma. \]
This proves the equivalence between (i) and (ii), since the map $(e^{tX})^*$ is invertible for every $t \in \mathbb{R}$.

Recall now that the symplectic form $\sigma$ is, by definition, a closed form. Then $d\sigma = 0$ and Cartan's formula (4.73) reads as follows
\[ L_X\sigma = d(i_X\sigma) + i_X(d\sigma) = d(i_X\sigma), \]
which proves the equivalence between (ii) and (iii).
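A simple worked instance shows that the three equivalent conditions of the lemma above are weaker than being a Hamiltonian vector field. The manifold and the vector field chosen below are an illustrative assumption, not taken from the text.

```latex
% Assumed data: N = T^*S^1 \simeq S^1 \times \mathbb{R}, with angular coordinate \theta on S^1,
% fiber coordinate p, and canonical form \sigma = dp \wedge d\theta. Take X = \partial_p.
\[
  e^{tX}(p,\theta) = (p+t,\theta), \qquad
  (e^{tX})^*\sigma = d(p+t) \wedge d\theta = \sigma ,
\]
\[
  i_X\sigma = \sigma(\partial_p,\cdot) = d\theta , \qquad
  d(i_X\sigma) = 0 , \qquad
  \int_{S^1 \times \{p\}} d\theta = 2\pi \neq 0 .
\]
% Hence i_X\sigma is closed but not exact. If X were equal to \vec{h} for some h \in C^\infty(N),
% then i_X\sigma = -dh would be exact, so X is not a Hamiltonian vector field.
```

Here $d\theta$ denotes the standard angular 1-form on $S^1$, which is closed but is not the differential of a globally defined function. The obstruction is topological: $S^1 \times \mathbb{R}$ is not simply connected.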

Corollary 4.50. The flow of a Hamiltonian vector field defines a flow of symplectomorphisms.

Proof. This is a direct consequence of the fact that, for a Hamiltonian vector field $\vec{h}$, one has $i_{\vec{h}}\sigma = -dh$. Hence $i_{\vec{h}}\sigma$ is a closed form (actually exact) and property (iii) of Lemma 4.49 holds.

Notice that the converse of Corollary 4.50 is true when $N$ is simply connected, since in this case every closed form is exact.

Definition 4.51. Let $(N,\sigma)$ be a symplectic manifold and $a,b \in C^\infty(N)$. The Poisson bracket between $a$ and $b$ is defined as $\{a,b\} = \sigma(\vec{a},\vec{b})$.

We end this section by collecting some properties of the Poisson bracket that follow from the previous results.

Proposition 4.52. The Poisson bracket satisfies the identities
(i) $\{a,b\} \circ \phi = \{a \circ \phi, b \circ \phi\}$, $\quad \forall\, a,b \in C^\infty(N)$, $\forall\, \phi \in \mathrm{Sympl}(N)$,
(ii) $\{a,\{b,c\}\} + \{c,\{a,b\}\} + \{b,\{c,a\}\} = 0$, $\quad \forall\, a,b,c \in C^\infty(N)$.

Proof. Property (i) follows from (4.75). Property (ii) follows by considering $\phi = e^{t\vec{c}}$ in (i), for some $c \in C^\infty(N)$, and computing the derivative with respect to $t$ at $t = 0$.

Corollary 4.53. For every $a,b \in C^\infty(N)$ we have
\[ \overrightarrow{\{a,b\}} = [\vec{a},\vec{b}]. \tag{4.76} \]

Proof. Property (ii) of Proposition 4.52 can be rewritten, by skew-symmetry of the Poisson bracket, as follows
\[ \{\{a,b\},c\} = \{a,\{b,c\}\} - \{b,\{a,c\}\}. \tag{4.77} \]
Using that $\{a,b\} = \sigma(\vec{a},\vec{b}) = \vec{a}\,b$, one rewrites (4.77) as
\[ \overrightarrow{\{a,b\}}\,c = \vec{a}(\vec{b}\,c) - \vec{b}(\vec{a}\,c) = [\vec{a},\vec{b}]\,c. \]

Remark 4.54. Property (ii) of Proposition 4.52 says that $\{a,\cdot\}$ is a derivation of the algebra $C^\infty(N)$. Moreover, the space $C^\infty(N)$ endowed with $\{\cdot,\cdot\}$ as a product is a Lie algebra isomorphic to a subalgebra of $\mathrm{Vec}(N)$. Indeed, by (4.76), the correspondence $a \mapsto \vec{a}$ is a Lie algebra homomorphism between $C^\infty(N)$ and $\mathrm{Vec}(N)$.

4.7 Local minimality of normal trajectories

In this section we prove a fundamental result about local optimality of normal trajectories. More precisely we show that small pieces of a normal trajectory are length minimizers.

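4.7.1 The Poincaré-Cartan one form

Fix a smooth function $a \in C^\infty(M)$ and consider the smooth submanifold of $T^*M$ defined by the graph of its differential
\[ \mathcal{L}_0 = \{ d_q a \mid q \in M \} \subset T^*M. \tag{4.78} \]
Notice that the restriction of the canonical projection $\pi : T^*M \to M$ to $\mathcal{L}_0$ defines a diffeomorphism between $\mathcal{L}_0$ and $M$, hence $\dim \mathcal{L}_0 = n$. Assume that the Hamiltonian flow is complete and consider the image of $\mathcal{L}_0$ under the Hamiltonian flow
\[ \mathcal{L}_t := e^{t\vec H}(\mathcal{L}_0), \qquad t \in [0,T]. \tag{4.79} \]
Define the $(n+1)$-dimensional manifold with boundary in $\mathbb{R}\times T^*M$ as follows:
\[ \mathcal{L} = \{ (t,\lambda) \in \mathbb{R}\times T^*M \mid \lambda \in \mathcal{L}_t,\ 0 \le t \le T \} \tag{4.80} \]
\[ \phantom{\mathcal{L}} = \{ (t, e^{t\vec H}\lambda_0) \in \mathbb{R}\times T^*M \mid \lambda_0 \in \mathcal{L}_0,\ 0 \le t \le T \}. \tag{4.81} \]
Finally, let us introduce the Poincaré-Cartan 1-form on $T^*M\times\mathbb{R} \subset T^*(M\times\mathbb{R})$ defined by
\[ s - H\,dt \in \Lambda^1(T^*M\times\mathbb{R}), \]
where $s \in \Lambda^1(T^*M)$ denotes, as usual, the tautological 1-form of $T^*M$. We start by proving a preliminary lemma.

Lemma 4.55. $s|_{\mathcal{L}_0} = d(a\circ\pi)|_{\mathcal{L}_0}$.

Proof. By definition of the tautological 1-form, $s_\lambda(w) = \langle\lambda, \pi_* w\rangle$ for every $w \in T_\lambda(T^*M)$. If $\lambda \in \mathcal{L}_0$ then $\lambda = d_q a$, where $q = \pi(\lambda)$. Hence for every $w \in T_\lambda(T^*M)$
\[ s_\lambda(w) = \langle\lambda, \pi_* w\rangle = \langle d_q a, \pi_* w\rangle = \langle \pi^* d_q a, w\rangle = \langle d_\lambda(a\circ\pi), w\rangle. \]

Proposition 4.56. The 1-form $(s - H\,dt)|_{\mathcal{L}}$ is exact.

Proof. We divide the proof into two steps: (i) we show that the restriction of the Poincaré-Cartan 1-form $(s - H\,dt)|_{\mathcal{L}}$ is closed and (ii) that it is exact.

(i). To prove that the 1-form is closed we need to show that the differential
\[ d(s - H\,dt) = \sigma - dH\wedge dt \tag{4.82} \]
vanishes when applied to every pair of tangent vectors to $\mathcal{L}$. Since, for each $t \in [0,T]$, the set $\mathcal{L}_t$ has codimension 1 in $\mathcal{L}$, there are only two possibilities for the choice of the two tangent vectors:
(a) both vectors are tangent to $\mathcal{L}_t$, for some $t \in [0,T]$;
(b) one vector is tangent to $\mathcal{L}_t$ while the second one is transversal.

Case (a). Since both tangent vectors are tangent to $\mathcal{L}_t$, it is enough to show that the restriction of the 2-form $\sigma - dH\wedge dt$ to $\mathcal{L}_t$ is zero. First let us notice that $dt$ vanishes when applied to tangent vectors to $\mathcal{L}_t$, thus
\[ (\sigma - dH\wedge dt)|_{\mathcal{L}_t} = \sigma|_{\mathcal{L}_t}. \]
Moreover, since by definition $\mathcal{L}_t = e^{t\vec H}(\mathcal{L}_0)$, one has
\[ \sigma|_{\mathcal{L}_t} = \sigma|_{e^{t\vec H}(\mathcal{L}_0)} = \big((e^{t\vec H})^*\sigma\big)|_{\mathcal{L}_0} = \sigma|_{\mathcal{L}_0} = ds|_{\mathcal{L}_0} = d^2(a\circ\pi)|_{\mathcal{L}_0} = 0, \]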

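where in the last line we used Lemma 4.55 and the fact that $(e^{t\vec H})^*\sigma = \sigma$, since $e^{t\vec H}$ is a Hamiltonian flow and thus preserves the symplectic form.

Case (b). The manifold $\mathcal{L}$ is, by construction, the image of the smooth mapping
\[ \Psi : [0,T]\times\mathcal{L}_0 \to [0,T]\times T^*M, \qquad \Psi(t,\lambda) = (t, e^{t\vec H}\lambda). \]
Thus a tangent vector to $\mathcal{L}$ that is transversal to $\mathcal{L}_t$ can be obtained by differentiating the map $\Psi$ with respect to $t$:
\[ \frac{\partial\Psi}{\partial t}(t,\lambda) = \frac{\partial}{\partial t} + \vec H(\lambda) \in T_{(t,\lambda)}\mathcal{L}. \tag{4.83} \]
It is then sufficient to show that the vector (4.83) is in the kernel of the 2-form $\sigma - dH\wedge dt$. In other words we have to prove
\[ i_{\partial_t + \vec H}(\sigma - dH\wedge dt) = 0. \tag{4.84} \]
The last equality is a consequence of the following identities:
\[ i_{\vec H}\sigma = \sigma(\vec H,\cdot) = -dH, \qquad i_{\partial_t}\sigma = 0, \]
\[ i_{\vec H}(dH\wedge dt) = \underbrace{(i_{\vec H}dH)}_{=0}\,dt - dH\,\underbrace{(i_{\vec H}dt)}_{=0} = 0, \]
\[ i_{\partial_t}(dH\wedge dt) = \underbrace{(i_{\partial_t}dH)}_{=0}\,dt - dH\,\underbrace{(i_{\partial_t}dt)}_{=1} = -dH, \]
where we used that $i_{\vec H}dH = dH(\vec H) = \{H,H\} = 0$.

(ii). Next we show that the form $(s - H\,dt)|_{\mathcal{L}}$ is exact. To this aim we have to prove that, for every closed curve $\Gamma$ in $\mathcal{L}$, one has
\[ \int_\Gamma s - H\,dt = 0. \tag{4.85} \]
Every curve $\Gamma$ in $\mathcal{L}$ can be written as follows:
\[ \Gamma : [0,T] \to \mathcal{L}, \qquad \Gamma(s) = (t(s), e^{t(s)\vec H}\lambda(s)), \quad \text{where } \lambda(s) \in \mathcal{L}_0. \]
Moreover, it is easy to see that the continuous map defined by
\[ K : [0,T]\times\mathcal{L} \to \mathcal{L}, \qquad K(\tau,(t,e^{t\vec H}\lambda_0)) = (t-\tau, e^{(t-\tau)\vec H}\lambda_0) \]
defines a homotopy of $\mathcal{L}$ such that $K(0,(t,e^{t\vec H}\lambda_0)) = (t,e^{t\vec H}\lambda_0)$ and $K(t,(t,e^{t\vec H}\lambda_0)) = (0,\lambda_0)$. Then the curve $\Gamma$ is homotopic to the curve $\Gamma_0(s) = (0,\lambda(s))$. Since the 1-form $s - H\,dt$ is closed, the integral is invariant under homotopy, namely
\[ \int_\Gamma s - H\,dt = \int_{\Gamma_0} s - H\,dt. \]
Moreover, the integral over $\Gamma_0$ is computed as follows (recall that $\Gamma_0 \subset \mathcal{L}_0$ and $dt = 0$ on $\mathcal{L}_0$):
\[ \int_{\Gamma_0} s - H\,dt = \int_{\Gamma_0} s = \int_{\Gamma_0} d(a\circ\pi) = 0, \]
where we used Lemma 4.55 and the fact that the integral of an exact form over a closed curve is zero. Then (4.85) follows.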

4.7.2 Normal trajectories are geodesics

Now we are ready to prove a sufficient condition that ensures the optimality of small pieces of normal trajectories. As a corollary we will get that small pieces of normal trajectories are geodesics.

Recall that normal trajectories for the problem
\[ \dot q = f_u(q) = \sum_{i=1}^m u_i f_i(q), \tag{4.86} \]
where $f_1,\dots,f_m$ is a generating family for the sub-Riemannian structure, are projections of integral curves of the Hamiltonian vector field associated with the sub-Riemannian Hamiltonian:
\[ \dot\lambda(t) = \vec H(\lambda(t)) \quad (\text{i.e. } \lambda(t) = e^{t\vec H}(\lambda_0)), \tag{4.87} \]
\[ \gamma(t) = \pi(\lambda(t)), \qquad t \in [0,T], \tag{4.88} \]
where
\[ H(\lambda) = \max_{u \in U_q}\Big\{ \langle\lambda, f_u(q)\rangle - \frac12 |u|^2 \Big\} = \frac12\sum_{i=1}^m \langle\lambda, f_i(q)\rangle^2. \tag{4.89} \]
Recall that, given a smooth function $a \in C^\infty(M)$, we can consider the graph $\mathcal{L}_0$ of its differential and its evolution $\mathcal{L}_t$ under the Hamiltonian flow associated with $H$, as in (4.78) and (4.79).

Theorem 4.57. Assume that there exists $a \in C^\infty(M)$ such that the restriction of the projection $\pi|_{\mathcal{L}_t}$ is a diffeomorphism for every $t \in [0,T]$. Then for any $\lambda_0 \in \mathcal{L}_0$ the normal geodesic
\[ \bar\gamma(t) = \pi\circ e^{t\vec H}(\lambda_0), \qquad t \in [0,T], \tag{4.90} \]
is a strict length-minimizer among all admissible curves $\gamma$ with the same boundary conditions.

Proof. Let $\gamma(t)$ be an admissible trajectory, different from $\bar\gamma(t)$, associated with the control $u(t)$ and such that $\gamma(0) = \bar\gamma(0)$ and $\gamma(T) = \bar\gamma(T)$. We denote by $\bar u(t)$ the control associated with the curve $\bar\gamma(t)$, and by $\bar\lambda(t) = e^{t\vec H}(\lambda_0)$ its lift.

By assumption, for every $t \in [0,T]$ the map $\pi|_{\mathcal{L}_t} : \mathcal{L}_t \to M$ is a diffeomorphism, thus the trajectory $\gamma(t)$ can be uniquely lifted to a curve $\lambda(t) \in \mathcal{L}_t$. Notice that the corresponding curves $\Gamma$ and $\bar\Gamma$ in $\mathcal{L}$ defined by
\[ \Gamma(t) = (t,\lambda(t)), \qquad \bar\Gamma(t) = (t,\bar\lambda(t)) \tag{4.91} \]
have the same boundary conditions, since for $t = 0$ and $t = T$ they project to the same base point on $M$ and their lift is uniquely determined by the diffeomorphisms $\pi|_{\mathcal{L}_0}$ and $\pi|_{\mathcal{L}_T}$, respectively.

Recall now that, by definition of the sub-Riemannian Hamiltonian, we have
\[ H(\lambda(t)) \ge \langle\lambda(t), f_{u(t)}(\gamma(t))\rangle - \frac12 |u(t)|^2, \qquad \gamma(t) = \pi(\lambda(t)), \tag{4.92} \]
where $\lambda(t)$ is a lift of the trajectory $\gamma(t)$ associated with a control $u(t)$. Moreover, equality holds in (4.92) if and only if $\lambda(t)$ is a solution of the Hamiltonian system $\dot\lambda(t) = \vec H(\lambda(t))$. For this reason we have the relations
\[ H(\lambda(t)) > \langle\lambda(t), f_{u(t)}(\gamma(t))\rangle - \frac12 |u(t)|^2, \tag{4.93} \]
\[ H(\bar\lambda(t)) = \langle\bar\lambda(t), f_{\bar u(t)}(\bar\gamma(t))\rangle - \frac12 |\bar u(t)|^2, \tag{4.94} \]

since $\bar\lambda(t)$ is a solution of the Hamiltonian equation by assumption, while $\lambda(t)$ is not. Indeed $\lambda(t)$ and $\bar\lambda(t)$ have the same initial condition, hence, by uniqueness of the solution of the Cauchy problem, it follows that $\dot\lambda(t) = \vec H(\lambda(t))$ if and only if $\lambda(t) = \bar\lambda(t)$, which implies $\gamma(t) = \bar\gamma(t)$.

Let us then show that the energy associated with the curve $\gamma$ is bigger than that of the curve $\bar\gamma$. Actually we prove the following chain of (in)equalities:
\[ \frac12\int_0^T |\bar u(t)|^2\,dt = \int_{\bar\Gamma} s - H\,dt = \int_\Gamma s - H\,dt < \frac12\int_0^T |u(t)|^2\,dt, \tag{4.95} \]
where $\Gamma$ and $\bar\Gamma$ are the curves in $\mathcal{L}$ defined in (4.91).

By Proposition 4.56, the 1-form $s - H\,dt$ is exact on $\mathcal{L}$. Then the integral over the closed curve $\Gamma\cup\bar\Gamma$ vanishes, and one gets
\[ \int_\Gamma s - H\,dt = \int_{\bar\Gamma} s - H\,dt. \]
The last inequality in (4.95) can be proved as follows:
\[ \int_\Gamma s - H\,dt = \int_0^T \langle\lambda(t),\dot\gamma(t)\rangle - H(\lambda(t))\,dt \]
\[ = \int_0^T \langle\lambda(t), f_{u(t)}(\gamma(t))\rangle - H(\lambda(t))\,dt \]
\[ < \int_0^T \langle\lambda(t), f_{u(t)}(\gamma(t))\rangle - \Big( \langle\lambda(t), f_{u(t)}(\gamma(t))\rangle - \frac12|u(t)|^2 \Big)\,dt \tag{4.96} \]
\[ = \frac12\int_0^T |u(t)|^2\,dt, \]
where we used (4.93). A similar computation, using (4.94), gives
\[ \int_{\bar\Gamma} s - H\,dt = \frac12\int_0^T |\bar u(t)|^2\,dt, \tag{4.97} \]
which ends the proof of (4.95).

As a corollary we state a local version of the same theorem, which can be proved by adapting the above technique.

Corollary 4.58. Assume that there exist $a \in C^\infty(M)$ and neighborhoods $\Omega_t$ of $\bar\gamma(t)$ such that $\pi\circ e^{t\vec H}\circ da|_{\Omega_0} : \Omega_0 \to \Omega_t$ is a diffeomorphism for every $t \in [0,T]$. Then (4.90) is a strict length-minimizer among all admissible trajectories $\gamma$ with the same boundary conditions and such that $\gamma(t) \in \Omega_t$ for all $t \in [0,T]$.

We are now in a position to prove that small pieces of normal trajectories are global length minimizers.

Theorem 4.59. Let $\gamma : [0,T] \to M$ be a sub-Riemannian normal trajectory. Then for every $\tau \in [0,T[$ there exists $\varepsilon > 0$ such that
(i) $\gamma|_{[\tau,\tau+\varepsilon]}$ is a length minimizer, i.e., $d(\gamma(\tau),\gamma(\tau+\varepsilon)) = \ell(\gamma|_{[\tau,\tau+\varepsilon]})$;
(ii) $\gamma|_{[\tau,\tau+\varepsilon]}$ is the unique length minimizer joining $\gamma(\tau)$ and $\gamma(\tau+\varepsilon)$, up to reparametrization.

Proof. Without loss of generality we can assume that the curve is parametrized by length and prove the theorem for $\tau = 0$. Let $\gamma(t)$ be a normal extremal trajectory such that $\gamma(t) = \pi(e^{t\vec H}(\lambda_0))$, for $t \in [0,T]$. Consider a smooth function $a \in C^\infty(M)$ such that $d_{q_0}a = \lambda_0$, where $q_0 = \gamma(0)$, and let $\mathcal{L}_t$ be the family of submanifolds of $T^*M$ associated with this function by (4.78) and (4.79). By construction, for the extremal lift associated with $\gamma$ one has $\lambda(t) = e^{t\vec H}(\lambda_0) \in \mathcal{L}_t$ for all $t$. Moreover the projection $\pi|_{\mathcal{L}_0}$ is a diffeomorphism, since $\mathcal{L}_0$ is a section of $T^*M$. Hence, for every fixed compact set $K \subset M$ containing the curve $\gamma$, by continuity there exists $t_0 = t_0(K)$ such that the restriction to $K$ of the map $\pi|_{\mathcal{L}_t}$ is also a diffeomorphism, for all $0 \le t < t_0$.

Let us now denote by $\delta_K$ the positive constant defined in Lemma 3.33 such that every curve starting from $\gamma(0)$ and leaving $K$ is necessarily longer than $\delta_K$. Then, defining $\varepsilon = \varepsilon(K) := \min\{\delta_K, t_0(K)\}$, we have that the curve $\gamma|_{[0,\varepsilon]}$ is contained in $K$ and is shorter than any other curve contained in $K$ with the same boundary conditions, by Corollary 4.58 (applied to $\Omega_t = K$ for all $t \in [0,T]$). Moreover $\ell(\gamma|_{[0,\varepsilon]}) = \varepsilon \le \delta_K$ since $\gamma$ is length parametrized, hence it is shorter than any admissible curve that is not contained in $K$. Thus $\gamma|_{[0,\varepsilon]}$ is a global minimizer. Moreover it is unique up to reparametrization by uniqueness of the solution of the Hamiltonian equation (see the proof of Theorem 4.57).

Remark 4.60. When $D_{q_0} = T_{q_0}M$, as is the case for a Riemannian structure, the level set of the Hamiltonian
\[ \{H = 1/2\} = \{\lambda \in T^*_{q_0}M \mid H(\lambda) = 1/2\} \]
is diffeomorphic to an ellipsoid, hence compact. Under this assumption, for each $\lambda_0 \in \{H = 1/2\}$, the corresponding geodesic $\gamma(t) = \pi(e^{t\vec H}(\lambda_0))$ is optimal up to a time $\varepsilon = \varepsilon(\lambda_0)$, with $\lambda_0$ belonging to a compact set. It follows that it is possible to find a common $\varepsilon > 0$ (depending only on $q_0$) such that each normal trajectory with base point $q_0$ is optimal on the interval $[0,\varepsilon]$.

It can be proved that this is false as soon as $D_{q_0} \ne T_{q_0}M$. Indeed, in this case, for every $\varepsilon > 0$ there exists a normal extremal path that loses optimality in time $\varepsilon$, see Theorem [?].

Bibliographical notes

The Hamiltonian approach to sub-Riemannian geometry is nowadays classical. However, the construction of the symplectic structure, obtained by extending the Poisson bracket from the space of affine functions, is not standard and is inspired by [?].

Historically, in the setting of PDE, the sub-Riemannian distance (also called Carnot-Carathéodory distance) is introduced by means of sub-unit curves, see for instance [11] and references therein. The link between the two definitions is clarified in Exercise 4.31.

The proof that normal extremals are geodesics is an adaptation of a more general condition for optimality given in [3] for a more general class of problems. It is inspired by the classical idea of fields of extremals in the Calculus of Variations.


Chapter 5 Integrable Systems

In this chapter we present some applications of the Hamiltonian formalism developed in the previous chapter. In particular we give a proof of the well-known Arnold-Liouville theorem and, as an application, we study the complete integrability of the geodesic flow on a special class of Riemannian manifolds.

5.1 Completely integrable systems

Let $M$ be an $n$-dimensional smooth manifold and assume that there exist $n$ independent Hamiltonians in involution in $T^*M$, i.e. a set of $n$ smooth functions
\[ h_i : T^*M \to \mathbb{R},\quad i = 1,\dots,n, \qquad \{h_i,h_j\} = 0,\quad i,j = 1,\dots,n, \tag{5.1} \]
such that the differentials $d_\lambda h_1,\dots,d_\lambda h_n$ of the functions are independent at every point $\lambda \in T^*M$.

Definition 5.1. Under the assumptions (5.1), the Hamiltonian system defined by any one of the Hamiltonians $h_i$, $i = 1,\dots,n$, is said to be completely integrable.

Let us consider the vector-valued map, called moment map, defined by
\[ h : T^*M \to \mathbb{R}^n, \qquad h = (h_1,\dots,h_n), \]
and let $c = (c_1,\dots,c_n) \in \mathbb{R}^n$ be a regular value of the map $h$.

Lemma 5.2. The set $h^{-1}(c)$ is an $n$-dimensional submanifold of $T^*M$ and we have
\[ T_\lambda h^{-1}(c) = \mathrm{span}\{\vec h_1(\lambda),\dots,\vec h_n(\lambda)\}, \qquad \forall\,\lambda \in h^{-1}(c). \tag{5.2} \]

Proof. Since $c$ is a regular value of $h$, by Remark 2.51 the set $h^{-1}(c)$ is a submanifold of dimension $n$ in $T^*M$. In particular $\dim T_\lambda h^{-1}(c) = n$. Moreover, by Exercise 2.11, each vector field $\vec h_i$ is tangent to $h^{-1}(c)$, since $\vec h_i h_j = \{h_i,h_j\} = 0$ by assumption. To prove (5.2) it is then enough to show that these vector fields are linearly independent.

Recall that the differentials of the functions $h_i$ are linearly independent on $h^{-1}(c)$, namely
\[ d_\lambda h_1\wedge\dots\wedge d_\lambda h_n \ne 0, \qquad \forall\,\lambda \in h^{-1}(c). \tag{5.3} \]

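Moreover the symplectic form $\sigma$ on $T^*M$ induces for all $\lambda$ an isomorphism $T_\lambda(T^*M) \to T^*_\lambda(T^*M)$ defined by $w \mapsto \sigma_\lambda(\cdot,w)$, which maps $\vec h_i(\lambda)$ to $d_\lambda h_i$. By nondegeneracy of the symplectic form, this implies that the vectors $\vec h_1(\lambda),\dots,\vec h_n(\lambda)$ are linearly independent, hence they form a basis for $T_\lambda h^{-1}(c)$.

Remark 5.3. Notice that the symplectic form vanishes on $T_\lambda h^{-1}(c)$. Indeed this is a consequence of the fact that $\sigma(\vec h_i,\vec h_j) = \{h_i,h_j\} = 0$ for all $i,j = 1,\dots,n$.

In what follows we denote by $N_c = h^{-1}(c)$ the level set of $h$. If $h^{-1}(c)$ is not connected, $N_c$ will denote a connected component of $h^{-1}(c)$.

Proposition 5.4. Assume that the vector fields $\vec h_i$ are complete and define the map
\[ \Psi : \mathbb{R}^n \to \mathrm{Diff}(N_c), \qquad \Psi(s_1,\dots,s_n) := e^{s_1\vec h_1}\circ\dots\circ e^{s_n\vec h_n}\big|_{N_c}. \tag{5.4} \]
The map $\Psi$ defines a transitive action of $\mathbb{R}^n$ on $N_c$. In particular $N_c$ is diffeomorphic to $T^k\times\mathbb{R}^{n-k}$ for some $0 \le k \le n$, where $T^k$ denotes the $k$-dimensional torus.

Proof. The complete integrability assumption together with Corollary 4.53 implies that the flows of $\vec h_i$ and $\vec h_j$ commute for every $i,j = 1,\dots,n$, since
\[ [\vec h_i,\vec h_j] = \overrightarrow{\{h_i,h_j\}} = 0. \]
By Proposition 2.26, this is equivalent to
\[ e^{t\vec h_i}\circ e^{\tau\vec h_j} = e^{\tau\vec h_j}\circ e^{t\vec h_i}, \qquad \forall\,t,\tau \in \mathbb{R}. \tag{5.5} \]
Since the vector fields are complete by assumption, we can compute for every $s,s' \in \mathbb{R}^n$
\[ \Psi(s+s') = e^{(s_1+s'_1)\vec h_1}\circ\dots\circ e^{(s_n+s'_n)\vec h_n} \]
\[ = e^{s_1\vec h_1}\circ e^{s'_1\vec h_1}\circ\dots\circ e^{s_n\vec h_n}\circ e^{s'_n\vec h_n} \]
\[ = e^{s_1\vec h_1}\circ\dots\circ e^{s_n\vec h_n}\circ e^{s'_1\vec h_1}\circ\dots\circ e^{s'_n\vec h_n} \qquad (\text{by } (5.5)) \]
\[ = \Psi(s)\circ\Psi(s'), \]
which proves that $\Psi$ is a group action. Moreover, for every point $\lambda \in N_c$, we can consider its orbit under the action of $\Psi$, namely
\[ \Omega_\lambda = \{\Psi(s)\lambda \mid s \in \mathbb{R}^n\}. \]
Notice that, for every $\lambda$, the map $s \mapsto \Psi(s)\lambda$ is a smooth local diffeomorphism between $\mathbb{R}^n$ and $\Omega_\lambda$. Indeed the partial derivatives
\[ \frac{\partial\Psi}{\partial s_i}(\Psi(s)\lambda) = \vec h_i(\Psi(s)\lambda), \qquad i = 1,\dots,n, \]
are linearly independent on the level set $N_c$. In particular every orbit is open in $N_c$; since the orbits partition the connected set $N_c$, there is a single orbit and the action is transitive. As a consequence of the local diffeomorphism property, the stabilizer $S_\lambda$ of the point $\lambda$, i.e. the set $S_\lambda = \{s \in \mathbb{R}^n \mid \Psi(s)\lambda = \lambda\}$, is a discrete subgroup of $\mathbb{R}^n$. Then the proof of Proposition 5.4 is completed by the next lemma.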

Lemma 5.5. Let $G$ be a nontrivial discrete subgroup of $\mathbb{R}^n$. Then there exist $k \in \mathbb{N}$ with $1 \le k \le n$ and $e_1,\dots,e_k \in \mathbb{R}^n$ such that
\[ G = \Big\{ \sum_{i=1}^k m_i e_i,\ m_i \in \mathbb{Z} \Big\}. \]

Proof. We prove the claim by induction on the dimension $n$ of the ambient space $\mathbb{R}^n$.

(i). Let $n = 1$. Since $G$ is a discrete subgroup of $\mathbb{R}$, there exists a nonzero element $e_1 \in G$ closest to the origin $0 \in \mathbb{R}$; we may assume $e_1 > 0$. We claim that $G = \mathbb{Z}e_1 = \{m e_1,\ m \in \mathbb{Z}\}$. By contradiction, assume that there exists an element $f \in G$ such that $m e_1 < f < (m+1)e_1$ for some $m \in \mathbb{Z}$. Then $\bar f := f - m e_1$ belongs to $G$ and is closer to the origin than $e_1$, which is a contradiction.

(ii). Assume the statement is true for $n-1$ and let us prove it for $n$. The discreteness of $G$ guarantees the existence of a nonzero element $e_1 \in G$ closest to the origin. Moreover one can prove that $G_1 := G\cap\mathbb{R}e_1$ is a subgroup and, as in part (i) of the proof, that $G_1 = \mathbb{Z}e_1$. If $G = G_1$ then the lemma is proved with $k = 1$. Otherwise one can consider the quotient $G/G_1$.

Exercise 5.6. (i). Prove that there exists a nonzero element $e_2 \in G/G_1$ that minimizes the distance to the line $\ell = \mathbb{R}e_1$ in $\mathbb{R}^n$. (ii). Show that there exists a neighborhood of the line $\ell$ that does not contain nonzero elements of $G/G_1$.

By Exercise 5.6 the quotient group $G/G_1$ is a discrete subgroup of $\mathbb{R}^n/\ell \simeq \mathbb{R}^{n-1}$. Hence, by the induction step, there exist $e_2,\dots,e_k$ such that
\[ G/G_1 = \Big\{ \sum_{i=2}^k m_i e_i,\ m_i \in \mathbb{Z} \Big\}. \]

From Proposition 5.4 and the fact that $T^k\times\mathbb{R}^{n-k}$ is compact if and only if $k = n$ we have the following corollary.

Corollary 5.7. If $N_c$ is compact, then $N_c \simeq T^n$.

Remark 5.8. For any point $\lambda \in N_c$ the map $\Psi_\lambda : \mathbb{R}^n \to N_c$ defined by $\Psi_\lambda(s) = \Psi(s)\lambda$ defines coordinates $(s_1,\dots,s_n)$ in a neighborhood of the point $\lambda$. In these coordinates (defined on $N_c$) the Hamiltonian vector fields $\vec h_i$ are constant.

5.2 Arnold-Liouville theorem

In this section we consider the moment map of a completely integrable system
\[ h : T^*M \to \mathbb{R}^n, \qquad h = (h_1,\dots,h_n), \]
and we assume that for all values of $c \in \mathbb{R}^n$ the level set $h^{-1}(c)$ is a smooth compact and connected manifold. In particular $N_c \simeq T^n$ for all $c \in \mathbb{R}^n$ by Corollary 5.7.

Fix $c \in \mathbb{R}^n$ and a point $\lambda_c \in N_c$. Let us consider the basis $e_1,\dots,e_n$ of $\mathbb{R}^n$ given by Lemma 5.5 and denote by $(\theta_1,\dots,\theta_n)$ the coordinates defined in $\mathbb{R}^n$ by the choice of this basis.

Since $\theta_1,\dots,\theta_n$ are obtained from $(s_1,\dots,s_n)$ by a linear change of coordinates on each level set, the vector fields $\vec h_i$ are constant in these coordinates (see Remark 5.8) and the basis $\partial_{\theta_1},\dots,\partial_{\theta_n}$ can be expressed as follows:
\[ \partial_{\theta_i} = \sum_{j=1}^n b_{ij}(c)\,\vec h_j, \tag{5.6} \]
where the coefficients $b_{ij}$ depend only on $c$, i.e., they are constant on each level set $N_c$.

Remark 5.9. Notice that the coordinates $(\theta_1,\dots,\theta_n)$ are not uniquely defined. Indeed every transformation of the kind $\theta_i \mapsto \theta_i + \psi_i(c)$ still defines a set of angular coordinates on each level set. The choice of the functions $\psi_i(c)$ corresponds to the choice of the initial value of $\theta_i$ at a point (for every choice of $c$). Notice that the vector fields $\partial_{\theta_i}$ are well defined and independent of this choice.

Let us now introduce the diffeomorphism
\[ F_c : T^n \to N_c, \qquad F_c(\theta_1,\dots,\theta_n) = \Psi(\theta_1 + 2\pi\mathbb{Z},\dots,\theta_n + 2\pi\mathbb{Z})(\lambda_c). \]
Next we want to analyze the dependence of this construction on $c$. Fix $\bar c \in \mathbb{R}^n$ and consider a neighborhood $O$ of the submanifold $N_{\bar c}$ in the cotangent space $T^*M$. Since $N_{\bar c}$ is compact, in $O$ we have a foliation by invariant tori $N_c$, for $c$ close to $\bar c$. In other words we have a well-defined coordinate set $(c_1,\dots,c_n,\theta_1,\dots,\theta_n)$.

Theorem 5.10 (Arnold-Liouville). Let us consider a moment map $h : T^*M \to \mathbb{R}^n$ associated with a completely integrable system such that every level set $N_c$ is compact and connected. Then for every $\bar c \in \mathbb{R}^n$ there exist a neighborhood $O$ of $N_{\bar c}$ and a change of coordinates
\[ (c_1,\dots,c_n,\theta_1,\dots,\theta_n) \mapsto (I_1,\dots,I_n,\varphi_1,\dots,\varphi_n) \tag{5.7} \]
such that
(i) $I = \Phi\circ h$, where $\Phi : h(O) \to \mathbb{R}^n$ is a diffeomorphism onto its image,
(ii) $\sigma = \sum_{j=1}^n dI_j\wedge d\varphi_j$.

Definition 5.11. The coordinates $(I,\varphi)$ defined in Theorem 5.10 are called action-angle coordinates.

Remark 5.12. This proves that there exists a regular foliation of the phase space by invariant manifolds, which are actually tori, such that the Hamiltonian vector fields associated with the invariants of the foliation span the tangent distribution. There then exist, as mentioned above, special sets of canonical coordinates on the phase space such that the invariant tori are the level sets of the action variables, and the angle variables are the natural periodic coordinates on the torus. The motion on the invariant tori, expressed in terms of these canonical coordinates, is linear in the angle variables. Indeed, since the $h_j$ are functions of the $I$ variables only, we have
\[ \vec h_j = \sum_{i=1}^n \frac{\partial h_j}{\partial I_i}\,\partial_{\varphi_i}. \]

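In other words, the Hamiltonian system in the action-angle coordinates $(I,\varphi)$ is written as follows:
\[ \dot I_i = -\frac{\partial h_j}{\partial\varphi_i} = 0, \qquad \dot\varphi_i = \frac{\partial h_j}{\partial I_i}(I). \tag{5.8} \]
This also explains why this property is called complete integrability.

Proof of Theorem 5.10. In this proof we will use the following notation:
- if $c = (c_1,\dots,c_n) \in \mathbb{R}^n$ we set $c^{j,\varepsilon} = (c_1,\dots,c_j+\varepsilon,\dots,c_n)$,
- $\gamma_i(c)$ is the closed curve in the torus $N_c$ parametrized by the $i$-th angular coordinate $\theta_i$, namely
\[ \gamma_i(c) = \{F_c(\theta_1,\dots,\theta_i+\tau,\dots,\theta_n) \in N_c \mid \tau \in [0,2\pi]\}, \]
- $C_i^{j,\varepsilon}$ denotes the cylinder defined by the union of the curves $\gamma_i(c^{j,\tau})$, for $0 \le \tau \le \varepsilon$.

Let us first define the coordinates $I_i = I_i(c_1,\dots,c_n)$ by the formula
\[ I_i(c) = \frac{1}{2\pi}\int_{\gamma_i(c)} s, \]
where $s$ is the tautological 1-form on $T^*M$. Since $\sigma|_{N_c} = 0$, by Stokes' theorem the variable $I_i$ depends only on the homotopy class of $\gamma_i$ (hence, in principle, we are free to choose any basis $\gamma_1,\dots,\gamma_n$ for the first homotopy group of $T^n$).

Let us compute the Jacobian of the change of variables:
\[ \frac{\partial I_i}{\partial c_j}(c) = \frac{1}{2\pi}\frac{\partial}{\partial\varepsilon}\Big|_{\varepsilon=0}\Big( \int_{\gamma_i(c^{j,\varepsilon})} s - \int_{\gamma_i(c)} s \Big) \]
\[ = \frac{1}{2\pi}\frac{\partial}{\partial\varepsilon}\Big|_{\varepsilon=0}\int_{\partial C_i^{j,\varepsilon}} s \]
\[ = \frac{1}{2\pi}\frac{\partial}{\partial\varepsilon}\Big|_{\varepsilon=0}\int_{C_i^{j,\varepsilon}} \sigma \qquad (\text{where } \sigma = ds) \]
\[ = \frac{1}{2\pi}\frac{\partial}{\partial\varepsilon}\Big|_{\varepsilon=0}\int_0^{\varepsilon}\int_{\gamma_i(c^{j,\tau})} \sigma(\partial_{c_j},\partial_{\theta_i})\,d\theta_i\,d\tau \]
\[ = \frac{1}{2\pi}\int_{\gamma_i(c)} \sigma(\partial_{c_j},\partial_{\theta_i})\,d\theta_i. \]
Using that $\partial_{\theta_i} = \sum_{j=1}^n b_{ij}(c)\,\vec h_j$ (see (5.6)) one gets
\[ \sigma(\cdot,\partial_{\theta_i}) = \sum_{j=1}^n b_{ij}(c)\,dh_j. \tag{5.9} \]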

Moreover $dh_i = dc_i$ since they define the same coordinate set. Hence
\[ \frac{\partial I_i}{\partial c_j}(c) = \frac{1}{2\pi}\int_{\gamma_i(c)} \Big\langle \sum_{k=1}^n b_{ik}\,dc_k,\ \partial_{c_j} \Big\rangle\,d\theta_i = \frac{1}{2\pi}\int_{\gamma_i(c)} b_{ij}(c)\,d\theta_i = b_{ij}(c). \]
Combining the last identity with (5.9) one gets
\[ \sigma(\cdot,\partial_{\theta_i}) = dI_i. \]
In particular this implies that the symplectic form has the following expression in the coordinates $(I,\theta)$:
\[ \sigma = \sum_{i,j} a_{ij}(I)\,dI_i\wedge dI_j + \sum_i dI_i\wedge d\theta_i, \tag{5.10} \]
where the smooth functions $a_{ij}$ depend only on the action variables, since the symplectic form $\sigma$ and the term $\sum_i dI_i\wedge d\theta_i$ are closed forms. Moreover it is easy to see that the first term of (5.10) can be rewritten as
\[ \sum_{i,j=1}^n a_{ij}(I)\,dI_i\wedge dI_j = d\Big( \sum_{i=1}^n \beta_i(I)\,dI_i \Big), \]
and $\sigma$ can be rewritten as
\[ \sigma = \sum_{i=1}^n dI_i\wedge d(\theta_i - \beta_i(I)). \]
The proof is completed by defining $\varphi_i := \theta_i - \beta_i(I)$.

Remark 5.13. The notion of complete integrability introduced here is the classical one given by Liouville and Arnold. Sometimes the term complete integrability is also used for dynamical systems whose solution can be reduced to a sequence of quadratures. Notice that by Theorem 5.10 complete integrability implies integrability by quadratures (see also Remark 5.12).

5.3 Integrable geodesic flows

In this section we discuss whether it is possible to apply the Arnold-Liouville theorem to the case of a geodesic flow on a Riemannian (or sub-Riemannian) manifold. Recall that on a sub-Riemannian manifold we denote by $H$ the sub-Riemannian Hamiltonian.

Definition 5.14. We say that a complete smooth vector field $X \in \mathrm{Vec}(M)$ is a Killing vector field if it generates a one-parametric flow of isometries, i.e. $e^{tX} : M \to M$ is an isometry for all $t \in \mathbb{R}$.

Recall that, for every $X \in \mathrm{Vec}(M)$, we can define the function $h_X \in C^\infty(T^*M)$, linear on fibers, associated with $X$ by
\[ h_X(\lambda) = \langle\lambda, X(q)\rangle, \qquad \text{where } q = \pi(\lambda). \]
The following lemma shows that, if $X$ is a Killing vector field, i.e. a vector field on $M$ whose flow generates isometries, then the Hamiltonian associated with it is in involution with the sub-Riemannian Hamiltonian.

Lemma 5.15. Let $M$ be a sub-Riemannian manifold and $H$ the sub-Riemannian Hamiltonian. A vector field $X \in \mathrm{Vec}(M)$ is a Killing vector field if and only if $\{H,h_X\} = 0$.

Proof. A vector field $X$ generates isometries if and only if, by definition, the differential of its flow $e^{tX}_* : T_qM \to T_{e^{tX}(q)}M$ preserves the sub-Riemannian distribution and the norm on it, i.e. $e^{tX}_* v \in D_{e^{tX}(q)}$ for every $v \in D_q$ and $|e^{tX}_* v| = |v|$. By definition of $H$, this is equivalent to the identity
\[ H((e^{tX})^*\lambda) = H(\lambda), \qquad \forall\,\lambda \in T^*M. \]
On the other hand Proposition 4.9 implies that $(e^{-tX})^* = e^{t\vec h_X}$, where $h_X$ is the Hamiltonian linear on fibers related to $X$. Hence, differentiating with respect to $t$, we find the equivalence
\[ H\circ e^{t\vec h_X} = H \ \ \forall\,t \iff \vec h_X H = 0 \iff \{H,h_X\} = 0. \]

In other words, to every one-parametric group of isometries of $M$ we can associate a Hamiltonian in involution with $H$. Let us show the complete integrability of the geodesic flow in some very symmetric cases.

Example 5.16 (Revolution surfaces in $\mathbb{R}^3$). Let $M$ be a 2-dimensional revolution surface in $\mathbb{R}^3$. Since the rotation around the revolution axis preserves the Riemannian structure, by definition, the Hamiltonian generated by this flow and the Riemannian Hamiltonian $H$ are in involution. As a consequence the geodesic flow is completely integrable.

Example 5.17 (Isoperimetric sub-Riemannian problem). Let us consider a sub-Riemannian structure associated with an isoperimetric problem defined on a 2-dimensional revolution surface $M$ (see Section 4.4.2). The sub-Riemannian structure on $M\times\mathbb{R}$ is determined by the function $b \in C^\infty(M)$ satisfying $dA = b\,dV$, where $A \in \Lambda^1(M)$ is the 1-form defining the isoperimetric problem and $dV$ is the volume form on $M$.
(i) If both $M$ and $b$ are rotationally invariant, we find a first integral of the geodesic flow as in the previous example.
(ii) By construction the problem is invariant by translation along the $z$-axis.
Hence there exist three Hamiltonians in involution and the geodesic flow is completely integrable.

5.3.1 Geodesic flow

Let us consider now a smooth function $a : \mathbb{R}^n \to \mathbb{R}$ and the family of hypersurfaces defined by the level sets of $a$,
\[ M_c := a^{-1}(c) \subset \mathbb{R}^n, \qquad c \text{ a regular value of } a, \]
endowed with the Riemannian structure induced by the ambient space $\mathbb{R}^n$. By Sard's lemma, almost every $c \in \mathbb{R}$ is a regular value of $a$ (in particular, $M_c$ is a smooth submanifold of codimension one in $\mathbb{R}^n$).

Adapting the arguments of Proposition 1.4 in Chapter 1, one can prove the following characterization of geodesics on a hypersurface $M$.

Proposition 5.18. Let $\gamma : [0,1] \to M$ be a length-parametrized curve on $M$. Then $\gamma$ is a geodesic if and only if $\ddot\gamma(t) \perp T_{\gamma(t)}M$.

For a large class of functions $a$, we will find a Hamiltonian, defined on the ambient space $T^*\mathbb{R}^n$, whose (reparametrized) flow generates the geodesic flow when restricted to each level set $M_c$. Consider the standard symplectic structure on $T^*\mathbb{R}^n$:
\[ T^*\mathbb{R}^n = \mathbb{R}^n\times\mathbb{R}^n = \{(x,p),\ x,p \in \mathbb{R}^n\}, \qquad \sigma = \sum_{i=1}^n dp_i\wedge dx_i. \]
For $x,p \in \mathbb{R}^n$ we denote by $x + \mathbb{R}p$ the line $\{x + tp,\ t \in \mathbb{R}\}$ of $\mathbb{R}^n$.

Assumptions. In what follows we assume that the function $a : \mathbb{R}^n \to \mathbb{R}$ satisfies the following assumptions:
(i) the restriction of $a$ to every line is strictly convex,
(ii) $a(x) \to +\infty$ when $|x| \to +\infty$.

Under these assumptions the restriction of the function $a$ to each affine line in $\mathbb{R}^n$ always attains a minimum and we can define the function
\[ h(x,p) = \min_{t\in\mathbb{R}} a(x+tp). \tag{5.11} \]

Remark 5.19. Given $x,p \in \mathbb{R}^n$, the line $x + \mathbb{R}p$ is tangent to the level set $a^{-1}(c)$ (with $c = a(x+\bar t p)$) at the point $\xi = x + \bar t p \in \mathbb{R}^n$ at which the minimum in (5.11) is attained. Indeed
\[ 0 = \frac{d}{dt}\Big|_{t=\bar t} a(x+tp) = \langle d_\xi a, p\rangle. \]

It is clear from the definition of $h$ that it is actually a well-defined function on the space of affine lines in $\mathbb{R}^n$. This is formally proved in the following lemma.

Lemma 5.20. The Hamiltonian $b(x,p) = \frac12|p|^2$ satisfies $\{h,b\} = 0$, i.e. $h$ is constant along the flow of $\vec b$.

Proof. The Hamiltonian system for $b$ is easily solved for every initial condition $(x(0),p(0)) = (x_0,p_0)$:
\[ \begin{cases} \dot x = \dfrac{\partial b}{\partial p} = p \\[4pt] \dot p = -\dfrac{\partial b}{\partial x} = 0 \end{cases} \quad\Longrightarrow\quad \begin{cases} x(t) = x_0 + t p_0 \\ p(t) = p_0 \end{cases} \tag{5.12} \]
and it is easy to see that, by its very definition, $h$ is constant under this flow.

Remark 5.21. Notice that restricting to a level set of $b$ is equivalent to restricting the function $h$ to the space of affine lines in $\mathbb{R}^n$, since
\[ \{(x,p) \in T^*\mathbb{R}^n,\ b(x,p) = 1/2\} = \{(x,p) \in T^*\mathbb{R}^n,\ |p| = 1\}. \]
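Lemma 5.20 and the tangency condition of Remark 5.19 can be checked exactly for a quadratic function. The sketch below is a minimal illustration, assuming $a(x) = \sum_i \beta_i x_i^2$ with positive weights $\beta_i$ (a hypothetical choice satisfying assumptions (i) and (ii)), for which the minimizer of $t \mapsto a(x+tp)$ is $\bar t = -\langle Bx,p\rangle / \langle Bp,p\rangle$ with $B = \mathrm{diag}(\beta_i)$. All names are introduced only for this example.

```python
from fractions import Fraction

# Hypothetical positive weights: a(x) = sum_i BETA[i] * x_i^2.
BETA = (Fraction(1), Fraction(2), Fraction(5))

def a(x):
    return sum(b * xi * xi for b, xi in zip(BETA, x))

def s(x, p):
    """Minimiser of t -> a(x + t p) for the quadratic a above."""
    num = sum(b * xi * pi for b, xi, pi in zip(BETA, x, p))
    den = sum(b * pi * pi for b, pi in zip(BETA, p))
    return -num / den

def xi_point(x, p):
    """Tangency point xi = x + s(x, p) p of the line with a level set of a."""
    t = s(x, p)
    return tuple(c + t * d for c, d in zip(x, p))

def h(x, p):
    """h(x, p) = min_t a(x + t p), as in (5.11)."""
    return a(xi_point(x, p))

def b_flow(t, x, p):
    """Flow (5.12) of b = |p|^2 / 2: x moves along the line, p is fixed."""
    return tuple(c + t * d for c, d in zip(x, p)), p

def grad_a_dot(xi, p):
    """<grad a(xi), p>, which vanishes at the tangency point."""
    return sum(2 * b * c * d for b, c, d in zip(BETA, xi, p))

if __name__ == "__main__":
    x, p = (1, 2, 3), (1, -1, 2)
    print(h(x, p), h(*b_flow(Fraction(7, 3), x, p)))
    print(grad_a_dot(xi_point(x, p), p))
```

Exact rational arithmetic is used so that the two printed values of $h$ coincide identically, rather than up to rounding.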

Now we introduce the following function:
\[ \xi : \mathbb{R}^n\times\mathbb{R}^n \to \mathbb{R}^n, \qquad \xi(x,p) = x + s(x,p)\,p, \tag{5.13} \]
where $s(x,p) = \bar t$ is the point at which the function $f(t) = a(x+tp)$ attains its minimum. The following proposition says that if we follow the flow of $\vec h$, as a flow on the space of lines, then the line is always tangent to the same level set of $a$ and actually describes a geodesic on it.

Proposition 5.22. Let $(x(t),p(t))$ be a trajectory of the Hamiltonian vector field $\vec h$ associated with (5.11). Then the curve
\[ t \mapsto \xi(t) := \xi(x(t),p(t)) \in \mathbb{R}^n \tag{5.14} \]
(i) is contained in a level set $M_c = a^{-1}(c)$, for some $c \in \mathbb{R}$,
(ii) is a geodesic on $M_c$.

Proof. Property (i) is a simple consequence of Corollary 4.19, since every function is constant along the flow of its Hamiltonian vector field. Indeed, writing $h(x,p) = a(\xi(x,p))$ and denoting by $(x(t),p(t))$ the Hamiltonian flow, we get
\[ a(\xi(t)) = a(\xi(x(t),p(t))) = h(x(t),p(t)) = \text{const}, \]
i.e. the curve $\xi(t)$ is contained in a level set of $a$. Moreover, by definition, $s(x,p)$ denotes the point on the line $x + \mathbb{R}p$ where $a$ attains its minimum, hence
\[ \langle\nabla_{\xi(t)} a, p(t)\rangle = 0, \qquad \forall\,t. \tag{5.15} \]
The Hamiltonian system associated with $h$ reads
\[ \begin{cases} \dot x = s\,\nabla_\xi a \\ \dot p = -\nabla_\xi a \end{cases} \tag{5.16} \]
which immediately implies $\dot x + s\dot p = 0$. Computing the derivative
\[ \dot\xi = \dot x + \dot s\,p + s\,\dot p = \dot s\,p, \]
it follows that $\dot\xi$ is parallel to $p$, and actually $p(t)$ is the velocity of the curve $\xi(t)$ when reparametrized with the parameter $s$, since $|p| = 1$ implies $|\dot\xi| = |\dot s|$. Finally, the second derivative of the reparametrized curve is $dp/ds = \dot p/\dot s$ and, since $\dot p$ is parallel to $\nabla_\xi a$ by the Hamiltonian system, the second derivative of $\xi$ (when reparametrized by length) is orthogonal to the level set, i.e. $\xi(t)$ is a geodesic.

Notice also that $s$ is a well-defined parameter. Computing the derivative with respect to $t$ in (5.15) we have that
\[ \dot s\,\langle\nabla^2_\xi a\,p, p\rangle - |\nabla_\xi a|^2 = 0, \]
and the strict convexity of $a$ implies $\langle\nabla^2_\xi a\,p, p\rangle \ne 0$.

Remark 5.23. Thus we can visualize the solutions of $\vec h$ as a motion of lines: the lines move in such a way as to be tangent to one and the same geodesic. The point $x$ on the line moves perpendicularly to the line in this process, since $\dot x = s\,\nabla_\xi a$ is orthogonal to $p$ by (5.15). We will also refer to this flow as the line flow associated with $a$.
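As a worked illustration of Proposition 5.22, consider the simplest function satisfying assumptions (i) and (ii), namely $a(x) = |x|^2$ (this choice is ours and is not made in the text). The computation below follows (5.11), (5.13), (5.15) and (5.16) step by step, on the level set $|p| = 1$, and recovers the great circles of the sphere $M_c$ of radius $\sqrt{c}$.

```latex
% Assumption: a(x) = |x|^2, so that \nabla_\xi a = 2\xi and \nabla^2_\xi a = 2\,\mathrm{Id}.
\begin{align*}
  s(x,p) &= -\frac{\langle x,p\rangle}{|p|^2}, &
  \xi(x,p) &= x - \frac{\langle x,p\rangle}{|p|^2}\,p, &
  h(x,p) &= |x|^2 - \frac{\langle x,p\rangle^2}{|p|^2}. \\
\intertext{The Hamiltonian system (5.16) becomes}
  \dot x &= 2 s\,\xi, & \dot p &= -2\xi, & \langle \xi, p\rangle &= 0 .\\
\intertext{On $|p| = 1$ and $a(\xi) = |\xi|^2 = c$, differentiating (5.15) gives}
  2\dot s - 4c &= 0 , & \dot s &= 2c , \\
\intertext{hence, reparametrizing by $s$,}
  \frac{d\xi}{ds} &= p, &
  \frac{d^2\xi}{ds^2} &= \frac{\dot p}{\dot s} = -\frac{\xi}{c},
\end{align*}
% whose solutions with |\xi_0| = \sqrt{c}, |p_0| = 1, \langle \xi_0, p_0\rangle = 0 are
\[
  \xi(s) = \xi_0 \cos\!\big(s/\sqrt{c}\big) + \sqrt{c}\,p_0 \sin\!\big(s/\sqrt{c}\big).
\]
```

The acceleration $-\xi/c$ is normal to the sphere $|\xi|^2 = c$, in agreement with the characterization of geodesics in Proposition 5.18.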

Consider now two functions $a,b : \mathbb{R}^n \to \mathbb{R}$ that satisfy our assumptions (i), (ii). Following our notation, we set
\[ h(x,p) = a(\xi(x,p)), \qquad \xi(x,p) = x + s(x,p)\,p, \]
\[ g(x,p) = b(\eta(x,p)), \qquad \eta(x,p) = x + t(x,p)\,p, \]
where $s(x,p)$ and $t(x,p)$ are defined as above, and $\xi$, $\eta$ denote the tangency points of the line $x + \mathbb{R}p$ with the level sets of $a$ and $b$ respectively. The following proposition computes the Poisson bracket of these Hamiltonian functions.

Proposition 5.24. Under the previous assumptions
\[ \{h,g\} = (s-t)\,\langle\nabla_\xi a, \nabla_\eta b\rangle. \tag{5.17} \]

Proof. From the very definition of the Poisson bracket
\[ \{h,g\} = \langle\nabla_p h, \nabla_x g\rangle - \langle\nabla_x h, \nabla_p g\rangle = (s-t)\,\langle\nabla_\xi a, \nabla_\eta b\rangle, \]
where we used equations (5.16) for both $h$ and $g$, namely $\nabla_p h = s\,\nabla_\xi a$, $\nabla_x h = \nabla_\xi a$ and $\nabla_p g = t\,\nabla_\eta b$, $\nabla_x g = \nabla_\eta b$.

5.4 Geodesic flow on ellipsoids

It was Jacobi who first established that the geodesic flow on an ellipsoid is completely integrable, using the method of separation of variables. Here we give a different derivation, essentially due to Moser, as an application of the theory developed in the previous section. More precisely, we consider the particular case when the function $a$ is a quadratic polynomial, i.e. every level set of our function is a quadric in $\mathbb{R}^n$.

Definition 5.25. Let $A$ be an $n\times n$ nondegenerate symmetric matrix. The quadric $Q$ associated with $A$ is the set
\[ Q = \{x \in \mathbb{R}^n,\ \langle A^{-1}x, x\rangle = 1\}. \tag{5.18} \]

For simplicity we consider the case when $A$ has simple distinct eigenvalues $\alpha_1 < \dots < \alpha_n$. Define, for every $\lambda$ that is not an eigenvalue of $A$,
\[ a_\lambda(x) = \langle (A-\lambda I)^{-1}x, x\rangle, \qquad Q_\lambda = \{x \in \mathbb{R}^n,\ a_\lambda(x) = 1\}. \]
If $A = \mathrm{diag}(\alpha_1,\dots,\alpha_n)$ is a diagonal matrix then (5.18) reads
\[ Q = \Big\{ x \in \mathbb{R}^n,\ \sum_{i=1}^n \frac{x_i^2}{\alpha_i} = 1 \Big\}, \]
and $Q_\lambda$ represents the family of quadrics that are confocal to $Q$:
\[ Q_\lambda = \Big\{ x \in \mathbb{R}^n,\ \sum_{i=1}^n \frac{x_i^2}{\alpha_i - \lambda} = 1 \Big\}, \qquad \lambda \in \mathbb{R}\setminus\Lambda, \]
where $\Lambda = \{\alpha_1,\dots,\alpha_n\}$ denotes the set of eigenvalues of $A$. Note that $Q_\lambda = \emptyset$ when $\lambda > \alpha_n$.

Note. In what follows, by a generic point $x$ for $A$ we mean a point $x$ that does not belong to any proper invariant subspace of $A$. In the diagonal case it is equivalent to say that $x = (x_1,\dots,x_n)$, with $x_i \ne 0$ for every $i$.

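Exercise 5.26. Denote $A_\lambda := (A-\lambda I)^{-1}$. Prove the two following formulas:
(i) $\dfrac{d}{d\lambda}A_\lambda = A_\lambda^2$,
(ii) $A_\lambda - A_\mu = (\lambda-\mu)\,A_\lambda A_\mu$.

Lemma 5.27. Let $x\in\mathbb{R}^n$ be a generic point for $A$ and let $\{Q_\lambda\}_{\lambda\in\mathbb{R}\setminus\Lambda}$ be the family of confocal quadrics. Then there exist exactly $n$ distinct real numbers $\lambda_1,\dots,\lambda_n$ in $\mathbb{R}\setminus\Lambda$ such that $x\in Q_{\lambda_i}$ for every $i=1,\dots,n$, and the quadrics $Q_{\lambda_i}$ are pairwise orthogonal at the point $x$.

Proof. For a fixed $x$, the function $\lambda\mapsto a_\lambda(x) = \langle A_\lambda x,x\rangle$ satisfies in $\mathbb{R}\setminus\Lambda$
\[ \frac{\partial a_\lambda}{\partial\lambda}(x) = \langle A_\lambda^2 x,x\rangle = |A_\lambda x|^2, \]
where $A_\lambda := (A-\lambda I)^{-1}$, as follows from part (i) of Exercise 5.26 and the fact that $A$ (hence $A_\lambda$) is self-adjoint. Thus $a_\lambda(x)$ is monotone increasing as a function of $\lambda$, and takes values from $-\infty$ to $+\infty$ in each interval $]\alpha_i,\alpha_{i+1}[$ contained between two eigenvalues of $A$. Moreover it increases from $0$ to $+\infty$ on $]-\infty,\alpha_1[$ and it is negative on $]\alpha_n,+\infty[$. This implies that, for a fixed $x$, there exist exactly $n$ values $\lambda_1,\dots,\lambda_n$ such that $a_{\lambda_i}(x) = 1$ (that means $x\in Q_{\lambda_i}$): one in $]-\infty,\alpha_1[$ and one in each of the $n-1$ intervals $]\alpha_i,\alpha_{i+1}[$.

Next, using part (ii) of Exercise 5.26 (also known as the resolvent formula) we can compute, for two distinct values $\lambda_i\neq\lambda_j$ and $x\in Q_{\lambda_i}\cap Q_{\lambda_j}$:
\[ \langle\nabla_x a_{\lambda_i},\nabla_x a_{\lambda_j}\rangle = 4\langle A_{\lambda_i}x,A_{\lambda_j}x\rangle = 4\langle A_{\lambda_i}A_{\lambda_j}x,x\rangle = \frac{4}{\lambda_i-\lambda_j}\big(\langle A_{\lambda_i}x,x\rangle - \langle A_{\lambda_j}x,x\rangle\big) = 0, \]
where again we used the fact that $A_\lambda$ is self-adjoint and that $\langle A_{\lambda_i}x,x\rangle = \langle A_{\lambda_j}x,x\rangle = 1$.

Now we define the family of Hamiltonians associated with the family of confocal quadrics
\[ h_\lambda(x,p) = \min_t a_\lambda(x+tp) = a_\lambda(\xi_\lambda(x,p)). \tag{5.19} \]
Next we prove another interesting orthogonality property of the family: if two confocal quadrics are tangent to the same line, then their gradients are orthogonal at the tangency points.

Proposition 5.28. Assume that two confocal quadrics are tangent to a given line, i.e. there exist $x,p\in\mathbb{R}^n$ such that $a_\lambda(\xi_\lambda) = a_\mu(\xi_\mu) = 1$, where $\xi_\lambda = x+t_\lambda p$ and $\xi_\mu = x+t_\mu p$ are the tangency points. Then $\langle\nabla_{\xi_\lambda}a_\lambda,\nabla_{\xi_\mu}a_\mu\rangle = 0$. In particular $\{h_\lambda,h_\mu\} = 0$.

Proof. The condition that the quadric $Q_\lambda$ is tangent to the line $x+rp$ at $\xi_\lambda$ is expressed by the following two equalities
\[ \langle A_\lambda\xi_\lambda,p\rangle = 0, \qquad \langle A_\lambda\xi_\lambda,\xi_\lambda\rangle = 1, \tag{5.20} \]
and analogous relations are valid for $Q_\mu$. Since $\xi_\mu - \xi_\lambda$ is a multiple of $p$, from (5.20) one also gets
\[ \langle A_\lambda\xi_\lambda,\xi_\mu\rangle = \langle A_\mu\xi_\mu,\xi_\lambda\rangle = 1. \]
Then, with the same computation as before, using Exercise 5.26,
\[ \langle\nabla_{\xi_\lambda}a_\lambda,\nabla_{\xi_\mu}a_\mu\rangle = 4\langle A_\lambda\xi_\lambda,A_\mu\xi_\mu\rangle = 4\langle A_\lambda A_\mu\xi_\lambda,\xi_\mu\rangle = \frac{4}{\lambda-\mu}\big(\langle A_\lambda\xi_\lambda,\xi_\mu\rangle - \langle A_\mu\xi_\mu,\xi_\lambda\rangle\big) = 0. \]
The last claim follows from Proposition 5.24.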
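The lemma on confocal quadrics through a generic point can be checked numerically in the diagonal case. The following is only a minimal sketch under stated assumptions: $A = \mathrm{diag}(\alpha_1,\dots,\alpha_n)$, the helper names (`a`, `grad_a`, `bisect`, `confocal_parameters`, `dot`) are hypothetical and not part of the text, and the roots of $a_\lambda(x)=1$ are located by plain bisection, using the monotonicity of $\lambda\mapsto a_\lambda(x)$ on each interval.

```python
# Diagonal case A = diag(alpha_1 < ... < alpha_n):
#   a_lambda(x) = sum_i x_i^2 / (alpha_i - lambda).

def a(alpha, x, lam):
    """Value of a_lambda at the point x."""
    return sum(xi * xi / (ai - lam) for ai, xi in zip(alpha, x))

def grad_a(alpha, x, lam):
    """Gradient of a_lambda at x, i.e. 2 * A_lambda x."""
    return [2.0 * xi / (ai - lam) for ai, xi in zip(alpha, x)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def bisect(f, lo, hi, iters=200):
    """Root of an increasing function f with f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def confocal_parameters(alpha, x, eps=1e-9):
    """The n values of lambda with a_lambda(x) = 1, for x with all x_i != 0."""
    f = lambda lam: a(alpha, x, lam) - 1.0
    norm2 = dot(x, x)
    # On (-inf, alpha_1) the function increases from 0 to +inf;
    # at lambda = alpha_1 - |x|^2 - 1 one has a_lambda(x) < 1.
    roots = [bisect(f, alpha[0] - norm2 - 1.0, alpha[0] - eps)]
    # On each (alpha_i, alpha_{i+1}) it increases from -inf to +inf.
    for left, right in zip(alpha, alpha[1:]):
        roots.append(bisect(f, left + eps, right - eps))
    return roots
```

For instance, with `alpha = [1.0, 2.0, 4.0]` and `x = [0.5, 1.0, 1.5]` one obtains three parameters, one in each of the intervals $]-\infty,1[$, $]1,2[$, $]2,4[$, and the gradients `grad_a(alpha, x, lam)` at two different parameters have vanishing scalar product up to rounding.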

Proposition 5.29. A generic line in $\mathbb{R}^n$ is tangent to $n-1$ quadrics of a confocal family.

Proof. Consider the projection along the fixed line $x+\mathbb{R}p$ of the quadrics of the confocal family onto an orthogonal hyperplane. The following exercise shows that this projection defines a confocal family of quadrics on the reduced space.

Exercise 5.30. (i) Show that the map $x\mapsto a^p_\lambda(x) := \langle A_\lambda(x+t_\lambda p),x+t_\lambda p\rangle$ is a quadratic form and that $p\in\mathrm{Ker}\,a^p_\lambda$. In particular this implies that $a^p_\lambda$ is well defined on the quotient $\mathbb{R}^n/\mathbb{R}p$.
(ii) Prove that $\{a^p_\lambda\}_\lambda$ is a family of confocal quadrics on the quotient space (in $n-1$ variables).

Applying Lemma 5.27 to the family $\{a^p_\lambda\}_\lambda$ we get that, for a generic choice of $x$, there exist $n-1$ quadrics passing through the point of the hyperplane onto which the line is projected, i.e. the line $x+\mathbb{R}p$ is tangent to $n-1$ confocal quadrics of the family $\{a_\lambda\}_\lambda$.

Remark 5.31. Notice that this proves that every generic line in $\mathbb{R}^n$ is associated with an orthonormal frame of $\mathbb{R}^n$, since the normal vectors to the $n-1$ quadrics given by Proposition 5.29 are mutually orthogonal and orthogonal to the line itself.

Theorem 5.32. The geodesic flow on an ellipsoid is completely integrable. In particular, the tangent lines to any geodesic on an ellipsoid are tangent to the same set of confocal quadrics, independently of the point on the geodesic.

Proof. We want to show that the functions $\lambda_1(x,p),\dots,\lambda_{n-1}(x,p)$ (as functions defined on the set of lines in $\mathbb{R}^n$) that assign to each line $x+\mathbb{R}p$ in $\mathbb{R}^n$ the $n-1$ values of $\lambda$ such that the line is tangent to $Q_\lambda$ are independent and in involution. First notice that each level set $\lambda_i(x,p) = c$ coincides with the level set $h_c = 1$. Hence, by Exercise 4.3, the two functions define the same Hamiltonian flow on this level set (up to reparametrization). We are then reduced to proving that the functions $h_{c_1},\dots,h_{c_{n-1}}$ are independent and in involution, which is a consequence of Proposition 5.28. Since the lines that are tangent to a geodesic on the ellipsoid $Q_\lambda$ form an integral curve of the Hamiltonian flow of the associated function $h_\lambda$, and all the Poisson brackets with the other Hamiltonians are zero, it follows that the line remains tangent to the same set of $n-1$ quadrics.

Chapter 6 Chronological calculus

In this chapter we develop some tools from chronological calculus that allow us to handle flows of nonautonomous vector fields in a very efficient way. The main idea is to replace a nonlinear object defined on the manifold $M$ with its linear counterpart, obtained by interpreting it as an operator on the space $C^\infty(M)$ of smooth functions on $M$.

6.1 Duality

We recall that the set $C^\infty(M)$ of smooth functions on $M$ is an $\mathbb{R}$-algebra with the usual operations of pointwise addition and multiplication
\[ (a+b)(q) = a(q)+b(q), \quad (\lambda a)(q) = \lambda a(q), \quad (a\cdot b)(q) = a(q)b(q), \qquad a,b\in C^\infty(M),\ \lambda\in\mathbb{R}. \]
Any point $q\in M$ can be interpreted as the linear functional
\[ \widehat{q}: C^\infty(M)\to\mathbb{R}, \qquad \widehat{q}(a) := a(q). \]
For every $q\in M$, the functional $\widehat{q}$ is a homomorphism of algebras, i.e. it satisfies $\widehat{q}(a\cdot b) = \widehat{q}(a)\,\widehat{q}(b)$. A diffeomorphism $P\in\mathrm{Diff}(M)$ can be thought of as the linear change of variables operator
\[ \widehat{P}: C^\infty(M)\to C^\infty(M), \qquad (\widehat{P}a)(q) := a(P(q)), \]
which is an automorphism of the algebra $C^\infty(M)$.

Remark 6.1. Notice that every nontrivial homomorphism of algebras $\varphi: C^\infty(M)\to\mathbb{R}$ is represented by some point, i.e. $\varphi = \widehat{q}$ for some $q\in M$. Moreover, for every automorphism of algebras $\Phi: C^\infty(M)\to C^\infty(M)$ there exists a diffeomorphism $P\in\mathrm{Diff}(M)$ such that $\widehat{P} = \Phi$.

Now we want to characterize tangent vectors as functionals on $C^\infty(M)$. As remarked in Chapter 2, a tangent vector $v\in T_qM$ defines in a natural way the derivation in the direction of $v$, i.e. the functional
\[ \widehat{v}: C^\infty(M)\to\mathbb{R}, \qquad \widehat{v}(a) = \langle d_qa,v\rangle, \]

that satisfies the Leibniz rule
\[ \widehat{v}(a\cdot b) = \widehat{v}(a)\,b(q)+a(q)\,\widehat{v}(b), \qquad a,b\in C^\infty(M). \]
On the other hand, considering $v\in T_qM$ as the tangent vector of a curve $q(t)$ such that $q(0) = q$, it is also natural to consider the family of functionals $\widehat{q}(t) := \widehat{q(t)}$, and define
\[ \widehat{v} := \frac{d}{dt}\Big|_{t=0}\widehat{q}(t): C^\infty(M)\to\mathbb{R}. \tag{6.1} \]
It is easy to check that (6.1) agrees with our definition of $\widehat{v}$: it is sufficient to differentiate at $t=0$ the identity
\[ \widehat{q}(t)(a\cdot b) = \widehat{q}(t)a\cdot\widehat{q}(t)b. \]
In the same spirit, a vector field $X\in\mathrm{Vec}(M)$ is characterized as a derivation of $C^\infty(M)$ (for vector fields we already discussed this property in Chapter 2), namely as the infinitesimal version of a flow (i.e. a family of diffeomorphisms) $P_t\in\mathrm{Diff}(M)$. Indeed, if we set
\[ \widehat{X} = \frac{d}{dt}\Big|_{t=0}\widehat{P}_t: C^\infty(M)\to C^\infty(M), \]
we find that $\widehat{X}$ satisfies (see (2.14))
\[ \widehat{X}(ab) = \widehat{X}(a)\,b+a\,\widehat{X}(b), \qquad a,b\in C^\infty(M). \]

Remark 6.2. It is possible to define on $C^\infty(M)$ the Whitney topology and to define regularity properties of families of functionals in a weak sense, i.e. we say that $A_t$ is continuous (differentiable, etc.) if the map $t\mapsto A_ta$ has the same property for every $a\in C^\infty(M)$. For instance, if $X_t$ denotes some locally integrable family of vector fields we denote
\[ \int_0^t\widehat{X}_s\,ds:\ a\mapsto\int_0^t\widehat{X}_sa\,ds. \]
For a more detailed presentation see [3].

6.2 Operator ODE and Taylor expansion

Consider a nonautonomous vector field $X_t$ and the corresponding nonautonomous ODE
\[ \frac{d}{dt}q(t) = X_t(q(t)), \qquad q\in M. \tag{6.2} \]
Using the notation introduced in the previous section we can rewrite (6.2) in the following way
\[ \frac{d}{dt}\widehat{q}(t) = \widehat{q}(t)\circ\widehat{X}_t. \tag{6.3} \]

Footnote 1. With this interpretation it makes sense to consider, for instance, the sum of a point $q$ and a vector $v$: $\ \widehat{q}+\widehat{v}: a\mapsto a(q)+\langle d_qa,v\rangle$.

Indeed, assume that $q(t)$ satisfies (6.2) and let $a\in C^\infty(M)$. We compute
\[ \Big(\frac{d}{dt}\widehat{q}(t)\Big)a = \frac{d}{dt}\big(\widehat{q}(t)a\big) = \frac{d}{dt}a(q(t)) = \langle d_{q(t)}a,X_t(q(t))\rangle = (\widehat{X}_ta)(q(t)) = (\widehat{q}(t)\circ\widehat{X}_t)a. \tag{6.4} \]
We discussed in Chapter 2 that, considering the solution to the nonautonomous ODE (6.2), we have a well defined flow, i.e. a family of diffeomorphisms, $P_t: q\mapsto q(t)$. We call $P_t$ the right chronological exponential and use the notation
\[ \widehat{P}_t := \overrightarrow{\exp}\int_0^t\widehat{X}_s\,ds. \tag{6.5} \]

Lemma 6.3. The flow $P_t$ defined by (6.5) satisfies the differential equation
\[ \frac{d}{dt}\widehat{P}_t = \widehat{P}_t\circ\widehat{X}_t, \qquad \widehat{P}_0 = \mathrm{Id}. \tag{6.6} \]

Proof. Fix a point $q_0\in M$ and denote by $q(t)$ the solution of the Cauchy problem (6.2) with initial condition $q(0) = q_0$. By the very definition of $P_t$ we have that $q(t) = P_t(q_0)$, which easily implies $\widehat{q}(t) = \widehat{q}_0\circ\widehat{P}_t$. Substituting into (6.3) gives (6.6).

Notation. In the following we identify any object with its dual interpretation as an operator on functions, and we stop using a different notation for the same object when it acts on the space of smooth functions. The meaning of the notation will be clear from the context. Notice that there is no risk of confusion since, when using operator notation, composition works in the opposite order.

Our differential equation (6.6), namely
\[ \dot{P}_t = P_t\circ X_t, \qquad P_0 = \mathrm{Id}, \tag{6.7} \]
can be rewritten as an integral equation as follows
\[ P_t = \mathrm{Id}+\int_0^tP_s\circ X_s\,ds. \tag{6.8} \]
Substituting (6.8) into itself and iterating, we have
\[ P_t = \mathrm{Id}+\int_0^t\Big(\mathrm{Id}+\int_0^{s_1}P_{s_2}\circ X_{s_2}\,ds_2\Big)\circ X_{s_1}\,ds_1 \]
\[ \phantom{P_t} = \mathrm{Id}+\int_0^tX_s\,ds+\iint_{0\le s_2\le s_1\le t}P_{s_2}\circ X_{s_2}\circ X_{s_1}\,ds_1ds_2 = \dots \]
\[ \phantom{P_t} = \mathrm{Id}+\sum_{k=1}^{N-1}\int\!\!\cdots\!\!\int_{0\le s_k\le\dots\le s_1\le t}X_{s_k}\circ\dots\circ X_{s_1}\,d^ks+R_N, \]

where
\[ R_N = \int\!\!\cdots\!\!\int_{0\le s_N\le\dots\le s_1\le t}P_{s_N}\circ X_{s_N}\circ\dots\circ X_{s_1}\,d^Ns. \]
Formally, letting $N\to\infty$, we can write the chronological series
\[ \overrightarrow{\exp}\int_0^tX_s\,ds \simeq \mathrm{Id}+\sum_{k=1}^\infty\int\!\!\cdots\!\!\int_{S_k(t)}X_{s_k}\circ\dots\circ X_{s_1}\,d^ks, \tag{6.9} \]
where $S_k(t) = \{(s_1,\dots,s_k)\in\mathbb{R}^k \mid 0\le s_k\le\dots\le s_1\le t\}$ denotes the $k$-dimensional simplex.

Remark 6.4. If we write expansion (6.9) when $X_t = X$ is an autonomous vector field, we find
\[ e^{tX} = \overrightarrow{\exp}\int_0^tX\,ds \simeq \mathrm{Id}+\sum_{k=1}^\infty\int\!\!\cdots\!\!\int_{S_k(t)}\underbrace{X\circ\dots\circ X}_{k}\,d^ks = \sum_{k=0}^\infty\frac{t^k}{k!}X^k, \]
since $\mathrm{meas}(S_k(t)) = t^k/k!$. This also shows that in the nonautonomous case the order in which $s_1,\dots,s_k$ appear in the composition is very important. The key point is that for different times $s\neq\tau$ the vector fields $X_s$ and $X_\tau$ might not commute.

Remark 6.5. Notice that, in general, the chronological exponential cannot be written as the flow of an autonomous vector field:
\[ \overrightarrow{\exp}\int_0^tX_s\,ds \neq e^{\int_0^tX_s\,ds}. \]
One can show that the equality holds if $[X_t,X_\tau] = 0$ for all $t,\tau$.

Consider now the inverse flow $Q_t := P_t^{-1}$, where $P_t$ satisfies (6.8), and let us characterize the differential equation satisfied by $Q_t$. First we differentiate the identity
\[ P_t\circ Q_t = \mathrm{Id}, \tag{6.10} \]
and the Leibniz rule gives
\[ \dot{P}_t\circ Q_t+P_t\circ\dot{Q}_t = 0. \]
Using (6.7) we then get
\[ P_t\circ X_t\circ Q_t+P_t\circ\dot{Q}_t = 0, \]
hence, composing with $Q_t = P_t^{-1}$ on the left, we get that $Q_t$ satisfies
\[ \dot{Q}_t = -X_t\circ Q_t, \qquad Q_0 = \mathrm{Id}, \tag{6.11} \]
which is dual to the Cauchy problem (6.7). The solution to the problem (6.11) will be denoted by the left chronological exponential
\[ Q_t := \overleftarrow{\exp}\int_0^t(-X_s)\,ds. \tag{6.12} \]
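The role of non-commutativity in Remark 6.5 can be illustrated on linear vector fields in the plane, where flows are matrices. The following is only a sketch under stated assumptions: the vector field is $X_t(q) = A(t)q$ with $A(t) = A_1$ for $t<1/2$ and $A(t) = A_2$ for $t\ge 1/2$, the matrices and all helper names are hypothetical choices made for illustration, and only the point-level ODE $\dot q = A(t)q$ is used (in the operator notation of the text the order of composition is reversed).

```python
# Flow of q' = A(t) q with A piecewise constant, versus exp of the integral of A.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(a, terms=40):
    """Matrix exponential by its Taylor series sum_k a^k / k!."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))  # term = a^k / k!
        result = add(result, term)
    return result

A1 = [[0.0, 1.0], [0.0, 0.0]]   # active on [0, 1/2)
A2 = [[0.0, 0.0], [1.0, 0.0]]   # active on [1/2, 1]

# Time-ordered flow at t = 1: first follow A1 for time 1/2, then A2.
flow = matmul(expm(scale(0.5, A2)), expm(scale(0.5, A1)))

# Exponential of the integral of A(t) over [0, 1], ignoring the time order.
naive = expm(scale(0.5, add(A1, A2)))
```

Here `flow` equals $(\mathrm{Id}+A_2/2)(\mathrm{Id}+A_1/2)$, since both matrices are nilpotent, while `naive` has entries $\cosh(1/2)$ and $\sinh(1/2)$. The two differ because $A_1A_2\neq A_2A_1$.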

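Repeating analogous reasoning, we find the formal expansion
\[ \overleftarrow{\exp}\int_0^t(-X_s)\,ds \simeq \mathrm{Id}+\sum_{k=1}^\infty\int\!\!\cdots\!\!\int_{0\le s_k\le\dots\le s_1\le t}(-X_{s_1})\circ\dots\circ(-X_{s_k})\,d^ks. \]
The difference with respect to the right chronological exponential is in the order of composition. In particular, the arrow over the exp indicates in which direction the time increases. We can summarize the properties of the chronological exponential as follows:
\[ \frac{d}{dt}\,\overrightarrow{\exp}\int_0^tX_s\,ds = \overrightarrow{\exp}\int_0^tX_s\,ds\circ X_t, \tag{6.13} \]
\[ \frac{d}{dt}\,\overleftarrow{\exp}\int_0^tX_s\,ds = X_t\circ\overleftarrow{\exp}\int_0^tX_s\,ds, \tag{6.14} \]
\[ \Big(\overrightarrow{\exp}\int_0^tX_s\,ds\Big)^{-1} = \overleftarrow{\exp}\int_0^t(-X_s)\,ds. \tag{6.15} \]

Now we can study the action of diffeomorphisms on vectors and vector fields. Let $v\in T_qM$ and $P\in\mathrm{Diff}(M)$. We claim that, as functionals on $C^\infty(M)$, we have
\[ P_*v = v\circ P. \]
Indeed, consider a curve $q(t)$ such that $\dot{q}(0) = v$ and compute
\[ (P_*v)a = \frac{d}{dt}\Big|_{t=0}a(P(q(t))) = \Big(\frac{d}{dt}\Big|_{t=0}q(t)\Big)\circ Pa = v\circ Pa. \]
Recall that, if $X\in\mathrm{Vec}(M)$ is a vector field, we have $(P_*X)|_q = P_*\big(X|_{P^{-1}(q)}\big)$. In a similar way we find an expression for $P_*X$ as a derivation of $C^\infty(M)$:
\[ P_*X = P^{-1}\circ X\circ P. \tag{6.16} \]

Remark 6.6. We can reinterpret the pushforward of a vector field in a totally algebraic way in the space of linear operators on $C^\infty(M)$. Indeed
\[ P_*X = (\mathrm{Ad}\,P^{-1})X, \]
where
\[ \mathrm{Ad}\,P: X\mapsto P\circ X\circ P^{-1}, \qquad X\in\mathrm{Vec}(M), \]
is the adjoint action of $P$ on the space of vector fields (footnote 2).

Assume now that $P_t = \overrightarrow{\exp}\int_0^tX_s\,ds$. We try to characterize the flow $\mathrm{Ad}\,P_t$ by looking for the ODE it satisfies. Applying it to a vector field $Y$ we have
\[ \Big(\frac{d}{dt}\mathrm{Ad}\,P_t\Big)Y = \frac{d}{dt}(\mathrm{Ad}\,P_t)Y = \frac{d}{dt}\big(P_t\circ Y\circ P_t^{-1}\big) \]
\[ = P_t\circ X_t\circ Y\circ P_t^{-1}+P_t\circ Y\circ(-X_t)\circ P_t^{-1} = P_t\circ(X_t\circ Y-Y\circ X_t)\circ P_t^{-1} \]
\[ = (\mathrm{Ad}\,P_t)[X_t,Y] = (\mathrm{Ad}\,P_t)(\mathrm{ad}\,X_t)Y, \]

Footnote 2. It is the differential of the conjugation $Q\mapsto P\circ Q\circ P^{-1}$, $Q\in\mathrm{Diff}(M)$.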

where
\[ \mathrm{ad}\,X: Y\mapsto[X,Y] \]
is the adjoint action on the Lie algebra of vector fields. In other words, we proved that $\mathrm{Ad}\,P_t$ is a solution to the differential equation
\[ \dot{A}_t = A_t\circ\mathrm{ad}\,X_t, \qquad A_0 = \mathrm{Id}. \]
Thus it can be expressed as a chronological exponential and we have the identity
\[ \mathrm{Ad}\Big(\overrightarrow{\exp}\int_0^tX_s\,ds\Big) = \overrightarrow{\exp}\int_0^t\mathrm{ad}\,X_s\,ds. \tag{6.17} \]

Exercise 6.7. Prove that, if $[X_t,Y] = 0$ for all $t$, then $(\mathrm{Ad}\,P_t)Y = Y$.

Remark 6.8. More explicitly, we can write the following formula
\[ (\mathrm{Ad}\,P_t)Y \simeq Y+\sum_{k=1}^\infty\int\!\!\cdots\!\!\int_{0\le s_k\le\dots\le s_1\le t}[X_{s_k},\dots,[X_{s_2},[X_{s_1},Y]]]\,d^ks, \tag{6.18} \]
which generalizes the formula (??). Indeed, if $P_t = e^{tX}$ is the flow associated to an autonomous vector field we get
\[ (\mathrm{Ad}\,e^{tX})Y \simeq e^{t\,\mathrm{ad}X}Y = Y+\sum_{k=1}^\infty\frac{t^k}{k!}[X,\dots,[X,Y]] = Y+t[X,Y]+\frac{t^2}{2}[X,[X,Y]]+\dots \]

Exercise 6.9. Prove the following using operator notation:
1. Show that ad is the infinitesimal version of the operator Ad, i.e. if $P_t$ is a flow generated by the vector field $X\in\mathrm{Vec}(M)$ then $\mathrm{ad}\,X = \frac{d}{dt}\big|_{t=0}\mathrm{Ad}\,P_t$.
2. Show that, if $P\in\mathrm{Diff}(M)$, then $P_*$ preserves Lie brackets, i.e. $P_*[X,Y] = [P_*X,P_*Y]$.
3. Show that the Jacobi identity in $\mathrm{Vec}(M)$ is the infinitesimal version of the identity proved in 2. (Hint: use $P_t = e^{tZ}$.)

Exercise 6.10. Prove the following formula on the change of variables on a nonautonomous flow:
\[ P\circ\overrightarrow{\exp}\int_0^tX_s\,ds\circ P^{-1} = \overrightarrow{\exp}\int_0^t(\mathrm{Ad}\,P)X_s\,ds. \tag{6.19} \]
Notice that for an autonomous vector field it proves (2.23).
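The autonomous formula $(\mathrm{Ad}\,e^{tX})Y = e^{t\,\mathrm{ad}X}Y$ of Remark 6.8 is an identity in any associative algebra, so it can be tested on matrices. The sketch below is an illustration under that assumption only: vector fields are replaced by $2\times 2$ matrices, $\mathrm{ad}\,X(Y) = XY-YX$, and the specific matrices and helper names are hypothetical.

```python
# Check  e^{tX} Y e^{-tX} = sum_k t^k/k! (ad X)^k Y  on 2x2 matrices.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(a, terms=40):
    """Matrix exponential by its Taylor series."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = add(result, term)
    return result

def bracket(x, y):
    """Commutator [x, y] = x y - y x."""
    return add(matmul(x, y), scale(-1.0, matmul(y, x)))

def conjugation(x, y, t):
    """Left-hand side: e^{tX} Y e^{-tX}."""
    return matmul(matmul(expm(scale(t, x)), y), expm(scale(-t, x)))

def ad_series(x, y, t, terms=40):
    """Right-hand side: Y + t [X,Y] + t^2/2 [X,[X,Y]] + ..."""
    result = y
    term = y
    for k in range(1, terms):
        term = scale(t / k, bracket(x, term))  # term = t^k/k! (ad X)^k Y
        result = add(result, term)
    return result

X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]
lhs = conjugation(X, Y, 0.5)
rhs = ad_series(X, Y, 0.5)
```

For this choice the series terminates after the quadratic term, because $[X,[X,[X,Y]]] = 0$, and both sides equal $Y+t[X,Y]-t^2X$.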

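6.3 Variations Formulae

Consider the following ODE
\[ \dot{q} = X_t(q)+Y_t(q), \tag{6.20} \]
where $Y_t$ is some perturbation of our original equation (6.2). We want to describe the solution to the perturbed equation (6.20) as a perturbation of the solution of the unperturbed one.

Proposition 6.11. Let $X_t,Y_t$ be two nonautonomous vector fields. Then
\[ \overrightarrow{\exp}\int_0^t(X_s+Y_s)\,ds = \overrightarrow{\exp}\int_0^t\Big(\overrightarrow{\exp}\int_0^s\mathrm{ad}\,X_\tau\,d\tau\Big)Y_s\,ds\circ\overrightarrow{\exp}\int_0^tX_s\,ds \tag{6.21} \]
\[ \phantom{\overrightarrow{\exp}\int_0^t(X_s+Y_s)\,ds} = \overrightarrow{\exp}\int_0^t(\mathrm{Ad}\,P_s)Y_s\,ds\circ P_t, \tag{6.22} \]
where $P_t = \overrightarrow{\exp}\int_0^tX_s\,ds$ denotes the flow of the original vector field.

Proof. Our goal is to find a flow $R_t$ such that
\[ Q_t := \overrightarrow{\exp}\int_0^t(X_s+Y_s)\,ds = R_t\circ P_t. \tag{6.23} \]
By definition of right chronological exponential we have
\[ \dot{Q}_t = Q_t\circ(X_t+Y_t). \tag{6.24} \]
On the other hand, from (6.23), we also find
\[ \dot{Q}_t = \dot{R}_t\circ P_t+R_t\circ\dot{P}_t = \dot{R}_t\circ P_t+R_t\circ P_t\circ X_t = \dot{R}_t\circ P_t+Q_t\circ X_t. \tag{6.25} \]
Hence, comparing (6.24) and (6.25), we get
\[ Q_t\circ Y_t = \dot{R}_t\circ P_t, \]
and we can write the ODE satisfied by $R_t$:
\[ \dot{R}_t = Q_t\circ Y_t\circ P_t^{-1} = R_t\circ(\mathrm{Ad}\,P_t)Y_t. \]
Since $R_0 = \mathrm{Id}$, we find that $R_t$ is a chronological exponential and
\[ \overrightarrow{\exp}\int_0^t(X_s+Y_s)\,ds = \overrightarrow{\exp}\int_0^t(\mathrm{Ad}\,P_s)Y_s\,ds\circ P_t, \]
which is (6.22). Using (6.17) we get (6.21).

Exercise 6.12. Prove the following: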

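(i) Prove the second form of the variational formula, where the original flow appears on the left:
\[ \overrightarrow{\exp}\int_0^t(X_s+Y_s)\,ds = \overrightarrow{\exp}\int_0^tX_s\,ds\circ\overrightarrow{\exp}\int_0^t\Big(\overrightarrow{\exp}\int_t^s\mathrm{ad}\,X_\tau\,d\tau\Big)Y_s\,ds. \tag{6.26} \]
(ii) For autonomous vector fields $X,Y\in\mathrm{Vec}(M)$ prove that
\[ e^{t(X+Y)} = \overrightarrow{\exp}\int_0^te^{s\,\mathrm{ad}X}Y\,ds\circ e^{tX} = \overrightarrow{\exp}\int_0^te^{-sX}_*Y\,ds\circ e^{tX} \tag{6.27} \]
\[ \phantom{e^{t(X+Y)}} = e^{tX}\circ\overrightarrow{\exp}\int_0^te^{(s-t)\,\mathrm{ad}X}Y\,ds. \tag{6.28} \]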

Chapter 7 End-point and Exponential map

In this chapter we introduce the end-point map (i.e., the map which associates to every control the final point of the associated trajectory) and we interpret the optimality condition obtained in Chapter 4 as a Lagrange multipliers rule.

7.1 First order conditions

We start by defining the end-point map. Consider a smooth $n$-dimensional manifold $M$ and a free sub-Riemannian structure on it defined by
\[ \mathbf{U} = M\times\mathbb{R}^m, \qquad f:\mathbf{U}\to TM, \qquad (q,u)\mapsto f_u(q) = \sum_{i=1}^mu_if_i(q). \]
For every measurable square integrable control function $t\mapsto u(t)\in L^2$ there exists an admissible trajectory $\gamma(t,u(\cdot))$, solution to the Cauchy problem
\[ \dot{\gamma}(t) = f_{u(t)}(\gamma(t)), \qquad \gamma(0,u(\cdot)) = q_0. \]
In the following we fix the initial point $q_0\in M$ and we consider the set of admissible controls
\[ \mathcal{U} = \{u\in L^2([0,1],\mathbb{R}^m),\ \gamma(t,u(\cdot))\text{ is defined on }[0,1]\}\subset L^2([0,1],\mathbb{R}^m), \]
which is an open subset by the theorem on continuous dependence of solutions of ODEs. Notice that the choice of $L^2$ affects the topology on the space of controls, but it changes nothing at the level of the geometry, since we can always restrict to length-parametrized curves.

Definition 7.1. Under the previous hypotheses we define the end-point map
\[ F:\mathcal{U}\to M, \qquad F(u(\cdot)) := \gamma(1,u(\cdot)) = q_0\circ\overrightarrow{\exp}\int_0^1f_{u(t)}\,dt. \tag{7.1} \]
The end-point map is a map from an open set of a Hilbert space into a smooth finite-dimensional manifold. Using the chronological calculus developed in Chapter 6, it is easy to compute its (Fréchet) differential.

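Proposition 7.2. Let $u\in\mathcal{U}$ and $q_1 = F(u(\cdot))$. The end-point map $F$ is smooth and its differential at $u$ is the map
\[ D_uF: L^2\to T_{q_1}M, \qquad D_uF(v) = \int_0^1(P^1_t)_*f_{v(t)}(q_1)\,dt, \tag{7.2} \]
where $P^t_\tau:\gamma_u(\tau)\mapsto\gamma_u(t)$ is the flow generated by $u$.

Proof. The end-point map from $q_0$ can be rewritten as the chronological exponential
\[ F(u(\cdot)) = q_0\circ\overrightarrow{\exp}\int_0^1f_{u(t)}\,dt. \]
Using the Volterra expansion (6.9) we can immediately compute the differential of the end-point map at $u = 0$. Indeed we have that, for any control $v(\cdot)$ sufficiently close to $0$,
\[ F(v(\cdot)) = q_0\circ\Big(\mathrm{Id}+\int_0^1f_{v(t)}\,dt+\iint_{0\le t_2\le t_1\le 1}f_{v(t_2)}\circ f_{v(t_1)}\,dt_1dt_2+\dots\Big). \tag{7.3} \]
From here we find that the linear term with respect to $v$ is exactly the one where $f$ appears only once:
\[ D_0F: L^2\to T_{q_0}M, \qquad D_0F(v) = q_0\circ\int_0^1f_{v(t)}\,dt = \int_0^1f_{v(t)}(q_0)\,dt. \]
To compute the differential at a generic point $u\in\mathcal{U}$ we have to consider the expansion near $0$ of the map
\[ v(\cdot)\mapsto F(u(\cdot)+v(\cdot)) = q_0\circ\overrightarrow{\exp}\int_0^1f_{(u+v)(t)}\,dt. \]
The variation formula (6.21) lets us write (compare also with the proof of Proposition 3.43)
\[ \overrightarrow{\exp}\int_0^1f_{(u+v)(t)}\,dt = \overrightarrow{\exp}\int_0^1\big(f_{u(t)}+f_{v(t)}\big)\,dt \]
\[ = \overrightarrow{\exp}\int_0^1\Big(\overrightarrow{\exp}\int_0^t\mathrm{ad}\,f_{u(s)}\,ds\Big)f_{v(t)}\,dt\circ\overrightarrow{\exp}\int_0^1f_{u(t)}\,dt \]
\[ = \overrightarrow{\exp}\int_0^1(P_t^{-1})_*f_{v(t)}\,dt\circ P_1, \]
where $P^t_\tau:\gamma(\tau,u)\mapsto\gamma(t,u)$ and we write $P_t := P^t_0$. In other words, setting $G(v) := q_0\circ\overrightarrow{\exp}\int_0^1(P_t^{-1})_*f_{v(t)}\,dt$, we rewrite
\[ F(u+v) = P_1(G(v)), \qquad D_uF = P_{1*}\circ D_0G, \]
and we have reduced the problem to the expansion of $G$, which is easier. Indeed we have the Volterra expansion
\[ \overrightarrow{\exp}\int_0^1(P_t^{-1})_*f_{v(t)}\,dt \simeq \mathrm{Id}+\int_0^1(P_t^{-1})_*f_{v(t)}\,dt+\iint_{0\le t_2\le t_1\le 1}(P_{t_2}^{-1})_*f_{v(t_2)}\circ(P_{t_1}^{-1})_*f_{v(t_1)}\,dt_1dt_2+\dots \tag{7.4} \]
from which we get, denoting $q_1 = F(u)$,
\[ D_uF: L^2\to T_{q_1}M, \qquad D_uF(v) = P_{1*}\int_0^1(P_t^{-1})_*f_{v(t)}(q_0)\,dt = \int_0^1(P^1_t)_*f_{v(t)}(q_1)\,dt. \]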

Now we want to characterize sub-Riemannian extremals as critical points of the end-point map.

Proposition 7.3. The following properties hold:
(i) $(u(t),\lambda(t))$ is an abnormal extremal if and only if $u(t)$ is a critical point for $F$. Moreover there exists $\lambda_1\in T^*_{F(u)}M$ such that
\[ \lambda_1D_uF = 0, \qquad \lambda(t) = (P^1_t)^*\lambda_1. \tag{7.5} \]
(ii) $(u(t),\lambda(t))$ is a normal extremal if and only if there exists $\lambda_1\in T^*_{F(u)}M$ such that
\[ \lambda_1D_uF = u, \qquad \lambda(t) = (P^1_t)^*\lambda_1, \tag{7.6} \]
where in the first equality of (7.6) we identify $u\in L^2$ with the element $(u,\cdot)_{L^2}\in(L^2)^*$.

Proof. (i). $u$ is a critical point of $F$ if and only if $D_uF$ is not surjective. In other words, there exists a nonzero covector $\lambda_1: T_{F(u)}M\to\mathbb{R}$ such that
\[ \lambda_1D_uF = 0, \tag{7.7} \]
where $\lambda_1D_uF$ denotes the composition of maps
\[ L^2\xrightarrow{\ D_uF\ }T_{q_1}M\xrightarrow{\ \lambda_1\ }\mathbb{R}. \tag{7.8} \]
Now, if we let $\lambda(t) = (P^1_t)^*\lambda_1$, it remains to prove that the curve $\lambda(t)$ satisfies the relation
\[ h_i(\lambda(t)) = 0, \qquad h_i(\lambda(t)) = \langle\lambda(t),f_i(q(t))\rangle. \]
We have
\[ \lambda_1D_uF = 0 \iff \int_0^1\langle\lambda_1,(P^1_t)_*f_{v(t)}(q_1)\rangle\,dt = 0\quad\forall v \]
\[ \iff \int_0^1\langle\lambda(t),f_{v(t)}(q(t))\rangle\,dt = 0\quad\forall v \]
\[ \iff \int_0^1\sum_iv_i(t)\langle\lambda(t),f_i(q(t))\rangle\,dt = 0\quad\forall v_i \]
\[ \iff \langle\lambda(t),f_i(q(t))\rangle = 0\quad\forall i. \]
(ii). The proof is analogous.

Finally, we can rewrite this result in the following way.

Corollary 7.4. A control $u\in\mathcal{U}$ is an extremal if and only if there exists some Lagrange multiplier $\lambda\in T^*_{F(u)}M$, $\lambda\neq 0$, such that the following equality holds
\[ \lambda D_uF = \nu u, \]
where (i) $\nu = 0$ in the abnormal case, (ii) $\nu\neq 0$ in the normal case.

7.2 Lagrange points and Lagrange submanifolds

In this section we work in the following setting: let $\mathcal{U}$ be an open set in a Hilbert space and let $M$ be a smooth $n$-dimensional manifold. Assume we have a pair of smooth maps
\[ F:\mathcal{U}\to M, \qquad \varphi:\mathcal{U}\to\mathbb{R}. \]
We want to characterize minima of the functional $\varphi$ when restricted to level sets of $F$:
\[ \min\ \varphi\big|_{F^{-1}(q)}, \qquad q\in M. \tag{7.9} \]
As usual we start by considering critical points.

Definition 7.5. Let $a: M\to\mathbb{R}$ be a smooth function and $N\subset M$ be a smooth submanifold. Then $q\in N$ is said to be a critical point of $a|_N$ if $d_qa|_{T_qN} = 0$.

We start with a geometric version of the Lagrange multipliers rule, which characterizes constrained critical points.

Proposition 7.6 (Lagrange multipliers rule). Assume $u\in\mathcal{U}$ is a regular point of $F:\mathcal{U}\to M$ such that $F(u) = q$. Then $u$ is a critical point of $\varphi|_{F^{-1}(q)}$ if and only if
\[ \exists\,\lambda\in T^*_qM,\ \lambda\neq 0,\quad\text{s.t.}\quad d_u\varphi = \lambda D_uF. \tag{7.10} \]

Proof. Recall that the differential of $F$ is a well defined map
\[ D_uF: T_u\mathcal{U}\to T_qM, \qquad q = F(u), \]
and, since $u$ is a regular point, $D_uF$ is surjective and the level set
\[ A_q := F^{-1}(q) = \{u\in\mathcal{U},\ F(u) = q\}\subset\mathcal{U} \]
is a smooth submanifold, with $u\in A_q$. Since $u$ is a critical point of $\varphi|_{A_q}$, by definition $d_u\varphi|_{T_uA_q} = 0$. Moreover $T_uA_q = \mathrm{Ker}\,D_uF$. Thus we have that
\[ \mathrm{Ker}\,D_uF\subset\mathrm{Ker}\,d_u\varphi. \tag{7.11} \]
Now consider the following diagram, where the arrow $T_qM\to\mathbb{R}$ is the map to be found:
\[ T_u\mathcal{U}\xrightarrow{\ D_uF\ }T_qM, \qquad T_u\mathcal{U}\xrightarrow{\ d_u\varphi\ }\mathbb{R}, \qquad T_qM\xrightarrow{\ ?\ }\mathbb{R}. \tag{7.12} \]
From (7.11), using a standard lemma of linear algebra and the fact that $D_uF$ is surjective, it follows that there exists a nontrivial linear map $\lambda: T_qM\to\mathbb{R}$ (that means $\lambda\in T^*_qM\setminus\{0\}$) that makes the diagram (7.12) commutative.
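As an illustration of the Lagrange multipliers rule, here is a finite-dimensional worked example. It is only a sketch under assumptions that are not in the text: we take $\mathcal{U} = \mathbb{R}^2$, $M = \mathbb{R}$, and the specific maps $F$ and $\varphi$ below, chosen so that every step can be checked by hand.

```latex
% Hypothetical toy data: U = R^2, M = R, q \neq 0 fixed.
\[
F(u) = u_1 + u_2, \qquad \varphi(u) = \tfrac12\,(u_1^2 + u_2^2).
\]
% The differential of F is surjective at every point, so every u is regular:
\[
D_uF = (1,\,1), \qquad \mathrm{Ker}\,D_uF = \mathrm{span}\{(1,-1)\}.
\]
% The rule d_u\varphi = \lambda D_uF, together with the constraint F(u) = q, reads
\[
(u_1,\,u_2) = \lambda\,(1,\,1), \qquad u_1 + u_2 = q
\quad\Longrightarrow\quad
u = \Big(\frac q2,\,\frac q2\Big), \qquad \lambda = \frac q2 .
\]
% Direct check: on the line u_1 + u_2 = q, write u = (q/2 + r, q/2 - r); then
\[
\varphi(u) = \frac{q^2}{4} + r^2 ,
\]
% whose only critical point is r = 0, in agreement with the multiplier rule.
```

In this example the multiplier $\lambda = q/2$ is nonzero precisely because $q\neq 0$, and the pair $(u,\lambda)$ is determined by the value $q$ of the constraint.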

Remark 7.7. In the case of sub-Riemannian geometry $\mathcal{U}$ represents the set of controls, $F$ is the end-point map and $\varphi$ is the energy of the curve associated with a control. In this framework, the problem of finding constrained critical points means exactly finding critical points of the energy when both the initial and the final point of the curve are fixed. Hence the solutions of the problem (7.9) represent exactly sub-Riemannian extremal trajectories. In particular, abnormal extremals correspond to critical points of $F$, while normal extremals satisfy the Lagrange multipliers rule.

Now we want to consider second order information about our critical points. Recall that, if $V$ is a submanifold in a Hilbert space $\mathcal{U}$, the first differential of a smooth function $\psi: V\to\mathbb{R}$ at a point $u\in V$ is well defined independently of coordinates:
\[ d_u\psi: T_uV\to\mathbb{R}, \qquad d_u\psi(v) = \frac{d}{dt}\Big|_{t=0}\psi(\gamma(t)), \]
where $\gamma:(-\varepsilon,\varepsilon)\to V$ is a curve that satisfies $\gamma(0) = u$, $\dot{\gamma}(0) = v$. This is not the case for the second differential. Indeed the second order derivative of a function $\psi$ is meaningful only at its critical points (at a regular point, by the implicit function theorem, one can always find coordinates such that $\psi$ is locally linear). Hence, if $u$ is a critical point for $\psi$, the quadratic map
\[ \mathrm{Hess}_u\psi: T_uV\to\mathbb{R}, \qquad v\mapsto\frac{d^2}{dt^2}\Big|_{t=0}\psi(\gamma(t)) \]
is intrinsically defined. In our case $V = F^{-1}(q)$, $\psi = \varphi|_{F^{-1}(q)}$, and since $T_uF^{-1}(q) = \mathrm{Ker}\,D_uF$, we have a well defined quadratic form
\[ \mathrm{Hess}_u\varphi\big|_{F^{-1}(q)}: \mathrm{Ker}\,D_uF\to\mathbb{R}, \]
which is computed as follows.

Proposition 7.8. For all $v\in\mathrm{Ker}\,D_uF$ we have
\[ \mathrm{Hess}_u\varphi\big|_{F^{-1}(q)}(v) = d^2_u\varphi(v)-\lambda D^2_uF(v), \tag{7.13} \]
where $\lambda$ is defined by (7.10).

Proof. Notice that $F^{-1}(q)\subset\mathcal{U}$ is a submanifold in a Hilbert space. Fix a point $q\in M$ and $u\in F^{-1}(q)$. Consider a path $u(s)$ in $\mathcal{U}$ such that $u(0) = u$ and $u(s)\in F^{-1}(q)$ for all $s$. Then in coordinates we have, differentiating twice with respect to $s$,
\[ F(u(s)) = q \ \Longrightarrow\ \frac{dF}{du}\dot{u} = 0 \ \Longrightarrow\ \Big\langle\frac{d^2F}{du^2}\dot{u},\dot{u}\Big\rangle+\frac{dF}{du}\ddot{u} = 0, \tag{7.14} \]

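where we denoted $\dot{u} = \dot{u}(0)$ and $\ddot{u} = \ddot{u}(0)$ (see footnote 1). The same computation for $\varphi$ gives
\[ \frac{d^2}{ds^2}\Big|_{s=0}\varphi(u(s)) = \Big\langle\frac{d^2\varphi}{du^2}\dot{u},\dot{u}\Big\rangle+\frac{d\varphi}{du}\ddot{u} \]
\[ = \Big\langle\frac{d^2\varphi}{du^2}\dot{u},\dot{u}\Big\rangle+\lambda\frac{dF}{du}\ddot{u} \qquad\text{(by (7.10))} \]
\[ = \Big\langle\frac{d^2\varphi}{du^2}\dot{u},\dot{u}\Big\rangle-\lambda\Big\langle\frac{d^2F}{du^2}\dot{u},\dot{u}\Big\rangle \qquad\text{(by (7.14))}. \]

Definition 7.9. In the previous setting let $u\in\mathcal{U}$ and let $\lambda\in T^*_{F(u)}M$ be a covector. We say that $\lambda$ is a Lagrange multiplier for the problem (7.9) associated with $u$ (equivalently, that $(u,\lambda)$ is a Lagrange point) if
\[ d_u\varphi = \lambda D_uF. \tag{7.15} \]
We denote the set of all Lagrange points by $C_{F,\varphi}$. More precisely
\[ C_{F,\varphi} = \{(u,\lambda)\in\mathcal{U}\times T^*M \mid F(u) = \pi(\lambda),\ d_u\varphi = \lambda D_uF\}. \tag{7.16} \]
The set $C_{F,\varphi}$ is a well-defined subset of the vector bundle $F^*(T^*M)$ (see Definition 2.43). Now we give some transversality conditions that ensure $F_c$ is an immersion.

Definition 7.10. The pair $(F,\varphi)$ is said to be a Morse pair (or a Morse problem) if $0$ is not a critical value for the smooth map
\[ \theta: F^*(T^*M)\to\mathcal{U}^*\simeq\mathcal{U}, \qquad (u,\lambda)\mapsto d_u\varphi-\lambda D_uF. \tag{7.17} \]

Remark 7.11. Notice that, if $M = \{0\}$, then $F$ is the trivial map and with this definition we have that $(F,\varphi)$ is a Morse pair if and only if $\varphi$ is a Morse function.

In canonical coordinates $\lambda = (\xi,x)$ in $T^*M$ we can describe the set $C_{F,\varphi}\subset F^*(T^*M)$ as the set of $(u,\xi,x)$ that satisfy
\[ \frac{d\varphi}{du}-\xi\frac{dF}{du} = 0, \qquad F(u) = x. \tag{7.18} \]
The linearization of the system (7.18) at a point $(u,\xi,x)$ is given by the set of points $(u',\xi',x')$ that satisfy
\[ \frac{d^2\varphi}{du^2}u'-\xi\frac{d^2F}{du^2}u'-\xi'\frac{dF}{du} = 0, \qquad \frac{dF}{du}u' = x'. \tag{7.19} \]
Let us denote by $Q:\mathcal{U}\to\mathcal{U}^*\simeq\mathcal{U}$ the linear map defined by
\[ Qu' = \xi\frac{d^2F}{du^2}u'-\frac{d^2\varphi}{du^2}u'. \]

Footnote 1. Recall that the notation $\frac{dF}{du}$ stands for the differential of $F$ in coordinates, while the notation $D_uF$ is intrinsic.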

Since $Q$ is defined by second derivatives of the maps $F$ and $\varphi$, it is a symmetric operator on the Hilbert space $\mathcal{U}$. The definition of Morse problem is immediately rewritten as follows: the pair $(F,\varphi)$ defines a Morse problem if and only if the following map is surjective:
\[ \Theta:\mathcal{U}\times\mathbb{R}^n\to\mathcal{U}, \qquad \Theta(u',\xi') = Qu'-\xi'\frac{dF}{du}. \tag{7.20} \]
Indeed, up to sign conventions, the map $\Theta$ is exactly the coordinate expression of the differential of the first equation in (7.18) (that is the coordinate version of (7.17)).

Lemma 7.12. If $(F,\varphi)$ defines a Morse problem, then $C_{F,\varphi}$ is a smooth $n$-dimensional manifold in $F^*(T^*M)$.

Proof. First notice that, from (7.16) and (7.17), it immediately follows that
\[ C_{F,\varphi} = \theta^{-1}(0). \tag{7.21} \]
Since $0$ is not a critical value for $\theta$, from the implicit function theorem it follows that $C_{F,\varphi}$ is a submanifold. A simple dimension argument lets us conclude under the additional assumption $\dim\mathcal{U}<+\infty$. Indeed in this case, since the differential of the map (7.17) is surjective, we have that
\[ \dim F^*(T^*M)-\dim C_{F,\varphi} = \dim\mathcal{U}, \]
so we can compute the dimension of $C_{F,\varphi}$:
\[ \dim C_{F,\varphi} = \dim F^*(T^*M)-\dim\mathcal{U} = (\dim\mathcal{U}+\mathrm{rank}\,T^*M)-\dim\mathcal{U} = \mathrm{rank}\,T^*M = n. \]
In the general case (when $\dim\mathcal{U} = +\infty$) the above argument is no longer valid and we have to use explicitly that $Q$ is self-adjoint. Let us denote by $B:\mathbb{R}^n\to\mathcal{U}$ the map
\[ B\xi' = \xi'\frac{dF}{du}, \qquad\text{so that}\qquad \Theta:(u',\xi')\mapsto Qu'-B\xi'. \]
Since $\Theta$ is surjective and $\dim(\mathrm{Im}\,B)\le n$ we get
\[ \mathrm{codim}\,\mathrm{Im}\,Q\le\dim\mathrm{Im}\,B\le n. \]
Moreover, since $Q$ is self-adjoint, we have
\[ \mathcal{U} = \mathrm{Ker}\,Q\oplus\mathrm{Im}\,Q, \qquad \dim\mathrm{Ker}\,Q = \dim(\mathrm{Im}\,Q)^\perp\le n. \]
Now, $\Theta$ being the coordinate expression of the differential of $\theta$, the dimension of $C_{F,\varphi}$ coincides with the dimension of the kernel of $\Theta$. In addition, if we denote by $\pi_{\mathrm{Ker}}:\mathcal{U}\to\mathrm{Ker}\,Q$ and $\pi_{\mathrm{Im}}:\mathcal{U}\to\mathrm{Im}\,Q$ the orthogonal projections onto the two subspaces, it is easy to see that
\[ \Theta(u',\xi') = 0 \iff \pi_{\mathrm{Ker}}B\xi' = 0\ \text{ and }\ \pi_{\mathrm{Im}}B\xi' = Qu', \]
from which the identity
\[ \dim\mathrm{Ker}\,\Theta = \dim\mathrm{Ker}\,Q+\dim\mathrm{Ker}(\pi_{\mathrm{Ker}}B) = n \]
immediately follows, since $\pi_{\mathrm{Ker}}B:\mathbb{R}^n\to\mathrm{Ker}\,Q$ is a surjective map between finite-dimensional spaces.

The last characterization of Morse problems leads to a convenient criterion to check, in coordinates, whether a pair $(F,\varphi)$ defines a Morse problem or not.

Lemma 7.13. Assume that $\mathrm{Im}\,Q$ is closed. Then the pair $(F,\varphi)$ defines a Morse problem if and only if
\[ \mathrm{Ker}\,Q\cap\mathrm{Ker}\,D_uF = \{0\}. \tag{7.22} \]

Proof. The problem is not Morse if and only if the differential of the map (7.17) is not surjective, i.e. there exists $w\in\mathcal{U}$, $w\neq 0$, that is orthogonal to $\mathrm{Im}\,\Theta$. Using that $Q$ is self-adjoint we get
\[ \langle Qu',w\rangle-\Big\langle\xi'\frac{dF}{du},w\Big\rangle = \langle u',Qw\rangle-\Big\langle\xi'\frac{dF}{du},w\Big\rangle = 0, \qquad\forall\,\xi',u', \]
that is equivalent, since the variables $u'$ and $\xi'$ are independent, to
\[ Qw = 0 \qquad\text{and}\qquad \frac{dF}{du}w = 0. \]

Let us consider now the projection map
\[ F_c: C_{F,\varphi}\to T^*M, \qquad F_c(u,\lambda) = \lambda. \]

Definition 7.14. Let $N$ be an $n$-dimensional manifold. An immersion $F: N\to T^*M$ is said to be a Lagrange immersion if $F^*\sigma = 0$, where $\sigma$ denotes the standard symplectic form on $T^*M$.

Proposition 7.15. If the pair $(F,\varphi)$ defines a Morse problem, then $F_c$ is a Lagrange immersion.

Proof. First we prove that $F_c$ is an immersion and then that it is Lagrangian.

(i). Recall that $F_c: C_{F,\varphi}\to T^*M$, where
\[ C_{F,\varphi} = \{(u,\xi,x)\mid\text{equations (7.18) hold}\}. \]
The differential $D_{(u,\lambda)}F_c: T_{(u,\lambda)}C_{F,\varphi}\to T_\lambda T^*M$ is defined on the linearization of equations (7.18),
\[ T_{(u,\lambda)}C_{F,\varphi} = \{(u',\xi',x')\mid\text{equations (7.19) hold}\}, \]
and is given by
\[ D_{(u,\lambda)}F_c(u',\xi',x') = (\xi',x'). \]
Now, looking at (7.19), it is easily seen that
\[ D_{(u,\lambda)}F_c(u',\xi',x') = 0 \quad\text{iff}\quad Qu' = 0\ \text{ and }\ \frac{dF}{du}u' = 0. \]
Since $(F,\varphi)$ defines a Morse problem, by (7.22) no such $u'\neq 0$ exists. Hence the differential is injective and $F_c$ is an immersion.

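(ii). We now show that $F_c^*\sigma = 0$. Since $\sigma = ds$ and the pullback commutes with the differential, it is sufficient to show that $F_c^*s$ is closed. In particular we will show that $F_c^*s = d(\varphi\circ\pi_{\mathcal{U}})|_{C_{F,\varphi}}$. By definition of the map $F_c$, the following diagram is commutative:
\[ F_c: C_{F,\varphi}\to T^*M, \quad \pi_{\mathcal{U}}: C_{F,\varphi}\to\mathcal{U}, \quad F:\mathcal{U}\to M, \quad \pi_M: T^*M\to M, \qquad \pi_M\circ F_c = F\circ\pi_{\mathcal{U}}. \tag{7.23} \]
Moreover, notice that if $F: M\to N$ is smooth and $\omega\in\Lambda^1(N)$, by definition of pull-back we have $(F^*\omega)_q = \omega_{F(q)}\circ D_qF$. Hence
\[ (F_c^*s)_{(u,\lambda)} = s_\lambda\circ D_{(u,\lambda)}F_c \]
\[ = \lambda\circ\pi_{M*}\circ D_{(u,\lambda)}F_c \qquad\text{(by definition of } s\text{)} \]
\[ = \lambda\circ D_uF\circ\pi_{\mathcal{U}*} \qquad\text{(by (7.23))} \]
\[ = d_u\varphi\circ\pi_{\mathcal{U}*} = d_{(u,\lambda)}(\varphi\circ\pi_{\mathcal{U}}) \qquad\text{(by (7.10))}. \]

Remark 7.16. Recall that the set $L_{F,\varphi}$ of Lagrange multipliers (see Definition 7.9) is the image of $C_{F,\varphi}$ under the map $F_c$:
\[ F_c: C_{F,\varphi}\to T^*M, \qquad (u,\lambda)\mapsto\lambda, \qquad L_{F,\varphi} := \mathrm{Im}\,F_c. \]
From the last proposition it follows that, if $L_{F,\varphi}$ is a submanifold, then it is a Lagrangian submanifold.

We summarize the results obtained above in the following statement.

Proposition 7.17. Let $(F,\varphi)$ be a Morse problem and assume $(u,\lambda)$ is a Lagrange point such that $u$ is a regular point for $F$, where $F(u) = q$. The following properties are equivalent:
(i) $\mathrm{Hess}_u\varphi|_{F^{-1}(q)}$ is degenerate,
(ii) $(u,\lambda)$ is a critical point for the map $\pi\circ F_c = F|_{C_{F,\varphi}}: C_{F,\varphi}\to M$,
(iii) if $L_{F,\varphi}$ is a submanifold, $\lambda$ is a critical point for the map $\pi|_{L_{F,\varphi}}: L_{F,\varphi}\to M$.

Proof. In coordinates we have the following expression for the Hessian:
\[ \mathrm{Hess}_u\varphi\big|_{F^{-1}(q)}(v) = -\langle Qv,v\rangle, \qquad v\in\mathrm{Ker}\,D_uF, \]
so that, up to sign, $Q$ is the linear operator associated with the bilinear form. Assume that $\mathrm{Hess}_u\varphi|_{F^{-1}(q)}$ is degenerate, i.e. there exists $u'\in\mathrm{Ker}\,D_uF$, $u'\neq 0$, such that
\[ \langle Qu',v\rangle = 0, \qquad\forall\,v\in\mathrm{Ker}\,D_uF. \]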

In other words $Qu'\perp\mathrm{Ker}\,D_uF$, which is equivalent to saying that $Qu'$ is a linear combination of the rows of the Jacobian matrix:
\[ \exists\,\xi'\quad\text{such that}\quad Qu' = \xi'\frac{dF}{du}. \]
From equations (7.19) it follows immediately that (i) is equivalent to (ii). The fact that (ii) is equivalent to (iii) is obvious.

7.3 Sub-Riemannian case

In this section we specialize the theory developed in the previous one to the sub-Riemannian case. As we mentioned, we consider the functional $J$ defined by
\[ J(u) = \frac12\|u\|^2 \]
and we consider its critical points constrained to level sets of the end-point map $F$, which means that we fix the final point of our trajectory (as usual we assume that the starting point $q_0$ is fixed from the very beginning). We already characterized critical points by means of Lagrange multipliers; now we want to consider second order information. We start by computing the Hessian of $J$.

Lemma 7.18. Let $q_1\in M$ and let $(u,\lambda)$ be a critical point of $J|_{F^{-1}(q_1)}$. Then for every $v\in\mathrm{Ker}\,D_uF$
\[ \mathrm{Hess}_uJ\big|_{F^{-1}(q_1)}(v) = \|v\|^2_{L^2}-\Big\langle\lambda,\iint_{0\le\tau\le t\le 1}\big[(P^1_\tau)_*f_{v(\tau)},(P^1_t)_*f_{v(t)}\big](q_1)\,d\tau dt\Big\rangle, \tag{7.24} \]
where $P^t_s:\gamma(s)\mapsto\gamma(t)$ is the flow defined by the control $u$.

Proof. By Proposition 7.8 we have
\[ \mathrm{Hess}_uJ\big|_{F^{-1}(q_1)}(v) = d^2_uJ(v)-\lambda D^2_uF(v). \]
It is easy to compute derivatives of $J$. Indeed we can rewrite it as $J(u) = \frac12(u,u)_{L^2}$, hence
\[ d_uJ(v) = (u,v)_{L^2}, \qquad d^2_uJ(v) = (v,v)_{L^2} = \|v\|^2_{L^2}, \qquad v\in\mathrm{Ker}\,D_uF. \]
It remains to compute the second derivative of the end-point map. From the Volterra expansion (7.4) we get
\[ D^2_uF(v,v) = 2\,q_1\circ\iint_{0\le\tau\le t\le 1}(P^1_\tau)_*f_{v(\tau)}\circ(P^1_t)_*f_{v(t)}\,d\tau dt. \]
To end the proof we use the following lemma on chronological calculus, which we use to symmetrize the second derivative.

Lemma 7.19. Let $X_t$ be a nonautonomous vector field on $M$. Then
\[ \iint_{0\le s\le t\le 1}X_s\circ X_t\,dsdt = \frac12\int_0^1X_s\,ds\circ\int_0^1X_t\,dt+\frac12\iint_{0\le s\le t\le 1}[X_s,X_t]\,dsdt. \tag{7.25} \]

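Proof of the Lemma. It is a simple computation:
\[ 2\iint_{0\le s\le t\le 1}X_s\circ X_t\,dsdt = \iint_{0\le s\le t\le 1}X_s\circ X_t\,dsdt+\iint_{0\le s\le t\le 1}X_t\circ X_s\,dsdt+\iint_{0\le s\le t\le 1}[X_s,X_t]\,dsdt \]
\[ = \iint_{0\le s\le t\le 1}X_s\circ X_t\,dsdt+\iint_{0\le t\le s\le 1}X_s\circ X_t\,dsdt+\iint_{0\le s\le t\le 1}[X_s,X_t]\,dsdt \]
\[ = \int_0^1X_s\,ds\circ\int_0^1X_t\,dt+\iint_{0\le s\le t\le 1}[X_s,X_t]\,dsdt, \]
where in the second line we exchange the role of $s$ and $t$ in the second integral.

Proposition 7.20. The sub-Riemannian problem $(F,J)$ is Morse.

Proof. We use the characterization of Lemma 7.13. We have to show that, in canonical coordinates $\lambda = (\xi,x)$,
\[ \mathrm{Im}\Big(\xi\frac{d^2F}{du^2}-\mathrm{Id}\Big)\ \text{is closed}, \qquad \mathrm{Ker}\Big(\xi\frac{d^2F}{du^2}-\mathrm{Id}\Big)\cap\mathrm{Ker}\Big(\frac{dF}{du}\Big) = \{0\}. \tag{7.26} \]
Using the previous notation and defining $g^t_v := (P^1_t)_*f_v$, we can write
\[ \frac{dF}{du}v(\cdot) = q_1\circ\int_0^1g^t_{v(t)}\,dt. \]
Moreover we have
\[ \xi\Big\langle\frac{d^2F}{du^2}v(\cdot),v(\cdot)\Big\rangle = 2\,\xi\,q_1\circ\iint_{0\le\tau\le t\le 1}g^\tau_{v(\tau)}\circ g^t_{v(t)}\,d\tau dt. \]
Since we want to find the kernel of the bilinear form, we need to recover the linear operator associated with it, i.e. to symmetrize the form:
\[ (Av)(t) := \Big(\xi\frac{d^2F}{du^2}v(\cdot)\Big)(t) = \xi\int_0^tg^\tau_{v(\tau)}\,d\tau\circ g^t_{v(t)}+\xi\,g^t_{v(t)}\circ\int_t^1g^\tau_{v(\tau)}\,d\tau. \tag{7.27} \]
Since (7.27) is a compact integral operator, $A-\mathrm{Id}$ is Fredholm, and the closedness of $\mathrm{Im}(A-\mathrm{Id})$ follows from the fact that it has finite codimension. On the other hand, for every control $v\in\mathrm{Ker}\,D_uF$ we can compute (see (7.2))
\[ q_1\circ\int_0^tg^\tau_{v(\tau)}\,d\tau = -\,q_1\circ\int_t^1g^\tau_{v(\tau)}\,d\tau. \]
Hence $v$ belongs to the intersection (7.26) if and only if it is in the kernel of
\[ \Big(\Big(I-\xi\frac{d^2F}{du^2}\Big)v(\cdot)\Big)(t) = v(t)+\xi\Big[g^t_{v(t)},\int_0^tg^\tau_{v(\tau)}\,d\tau\Big](q_1), \]
which has trivial kernel since it is a Volterra operator, of the form $v(t)+\int_0^tK(t,\tau)v(\tau)\,d\tau$.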

Corollary 7.21. The manifold of Lagrange multipliers of the sub-Riemannian problem $(F,J)$ is a smooth $n$-dimensional submanifold of $T^*M$, namely
\[ L_{(F,J)} := \{\lambda_1\in T^*M \mid \lambda_1 = e^{\vec{H}}(\lambda_0),\ \lambda_0\in T^*_{q_0}M\}, \]
where $H$ is the sub-Riemannian Hamiltonian.

To end this section we consider the free initial point problem, i.e. we consider the free end-point map
\[ \widetilde{F}: M\times\mathcal{U}\to M, \qquad (q_0,u)\mapsto\gamma(1,q_0,u), \]
where
\[ \gamma(t,q_0,u) = q_0\circ\overrightarrow{\exp}\int_0^tf_{u(s)}\,ds \]
is the solution to the Cauchy problem
\[ \dot{\gamma}(t) = f_{u(t)}(\gamma(t)), \qquad \gamma(0) = q_0. \]
We look for solutions of the problem
\[ \min_{\widetilde{F}^{-1}(q_1)}\ \{J(u)+a(q_0)\}, \qquad a\in C^\infty(M). \tag{7.28} \]
Critical points of this problem can be found with the Lagrange multipliers rule, applied, in the notation introduced above, to the pair $F = \widetilde{F}$, $\varphi = J+a$. Fix a point $(q_0,\tilde{u})\in M\times\mathcal{U}$. It is easy to see that
\[ \widetilde{F}\big|_{\{q_0\}\times\mathcal{U}} = F, \qquad \widetilde{F}\big|_{M\times\{\tilde{u}\}} = P_1, \]
where $P_1$ is the time-one flow of the control $\tilde{u}$, and that the equation
\[ \lambda_1D_{(q_0,\tilde{u})}\widetilde{F} = d_{(q_0,\tilde{u})}\varphi \]
splits into
\[ \lambda_1D_{\tilde{u}}F = d_{\tilde{u}}J = \tilde{u}, \qquad \lambda_1P_{1*} = d_{q_0}a. \]
In other words, by PMP, to every critical point of the problem (7.28) we can associate the normal extremal
\[ \lambda(t) = (P_t^{-1})^*\lambda_0, \qquad \lambda_0 = d_{q_0}a, \]
where the initial condition is defined by the function $a$.

Exercise 7.22. Consider the free end-point problem, i.e. find the solutions of the problem
\[ \min_u\ \{J(u)-a(F(u))\}, \qquad a\in C^\infty(M). \tag{7.29} \]
In other words, now we do not restrict to the level set $F^{-1}(q_1)$ (we do not fix the final point of the trajectory) but we consider a penalty in the functional we want to minimize. Prove that $u$ is a critical point of this functional if and only if
\[ \lambda_1D_uF = u, \qquad \lambda_1 = d_{F(u)}a. \]

7.4 Exponential map

We now introduce the sub-Riemannian exponential map.

Definition 7.23. Let $q_0\in M$. The sub-Riemannian exponential map (based at $q_0$) is the map
\[ \mathcal{E}_{q_0}: D_{q_0}\subset T^*_{q_0}M\to M, \qquad \mathcal{E}_{q_0}(\lambda_0) = \pi\circ e^{\vec{H}}(\lambda_0), \tag{7.30} \]
where the domain $D_{q_0}$ is the set of covectors such that the corresponding solution of the Hamiltonian system is defined on $[0,1]$.

When there is no confusion about the point at which the exponential map is based, we omit it in the notation, writing $\mathcal{E}$.

Proposition 7.24. If $M$ is complete, then $D_{q_0} = T^*_{q_0}M$. Moreover, if there are no strictly abnormal minimizers, the exponential map is surjective.

The homogeneity of the sub-Riemannian Hamiltonian $H$ yields the following homogeneity property of the flow of $\vec{H}$.

Lemma 7.25. Let $H$ be the sub-Riemannian Hamiltonian. Then, for every $\lambda\in T^*M$,
\[ e^{t\vec{H}}(\alpha\lambda) = \alpha\,e^{\alpha t\vec{H}}(\lambda), \qquad \alpha,t>0. \tag{7.31} \]

Proof. By Remark 4.25 we know that if $\lambda(t) = e^{t\vec{H}}(\lambda_0)$ is a solution of the Hamiltonian system, then also $\lambda_\alpha(t) := \alpha\lambda(\alpha t)$ is a solution. The result follows from the uniqueness of the solution and the identity $\lambda_\alpha(0) = \alpha\lambda(0)$.

The exponential map sends a covector $\lambda_0$ to the point at time $1$ of the normal extremal path with initial condition $\lambda_0$. The homogeneity property lets us recover the whole geodesic as the image of the ray that joins $0$ to $\lambda_0$ in the fiber $T^*_{q_0}M$.

Corollary 7.26. Let $\lambda(t)$, $t\in[0,T]$, be the normal extremal that satisfies the initial condition $\lambda(0) = \lambda_0\in T^*_{q_0}M$. Then the normal extremal path $\gamma(t) = \pi(\lambda(t))$ satisfies
\[ \gamma(t) = \mathcal{E}(t\lambda_0), \qquad t\in[0,T]. \]

Proof. Using (7.31) we get
\[ \mathcal{E}(t\lambda_0) = \pi(e^{\vec{H}}(t\lambda_0)) = \pi(e^{t\vec{H}}(\lambda_0)) = \pi(\lambda(t)) = \gamma(t). \]

Remark 7.27. Due to the homogeneity property we can consider the following map
\[ \mathcal{E}_{q_0}:\mathbb{R}^+\times C_{q_0}\to M, \qquad \mathcal{E}_{q_0}(t,\lambda_0) = \mathcal{E}_{q_0}(t\lambda_0), \]
where $C_{q_0}$ is the hypercylinder of normalized covectors
\[ C_{q_0} = \{\lambda\in T^*_{q_0}M \mid H(\lambda) = 1/2\}. \]
In other words, we restrict to length-parametrized extremal paths, considering the time as an extra variable.
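The homogeneity property (7.31) and the identity $\gamma(t) = \mathcal{E}(t\lambda_0)$ can be tested numerically on a concrete structure. The following is only a sketch under assumptions not fixed by the text: we take the Heisenberg group on $\mathbb{R}^3$ with the (hypothetical, but common) choice of frame $f_1 = \partial_x-\frac{y}{2}\partial_z$, $f_2 = \partial_y+\frac{x}{2}\partial_z$, so that $H = \frac12(u_1^2+u_2^2)$ with $u_1 = p_x-\frac{y}{2}p_z$, $u_2 = p_y+\frac{x}{2}p_z$, and we integrate Hamilton's equations with a classical Runge-Kutta scheme. All function names are illustrative.

```python
# Heisenberg group: numerical check of E(t * lambda_0) = pi(e^{tH}(lambda_0)).

def hamiltonian_field(state):
    """Hamiltonian vector field of H = (u1^2 + u2^2)/2 at (x, y, z, px, py, pz)."""
    x, y, z, px, py, pz = state
    u1 = px - 0.5 * y * pz          # <lambda, f1>
    u2 = py + 0.5 * x * pz          # <lambda, f2>
    return [
        u1,                          # dx/dt  =  dH/dpx
        u2,                          # dy/dt  =  dH/dpy
        0.5 * (x * u2 - y * u1),     # dz/dt  =  dH/dpz
        -0.5 * pz * u2,              # dpx/dt = -dH/dx
        0.5 * pz * u1,               # dpy/dt = -dH/dy
        0.0,                         # dpz/dt = -dH/dz
    ]

def rk4_flow(state, time, steps=2000):
    """Approximate e^{time * H}(state) with the classical RK4 scheme."""
    h = time / steps
    s = list(state)
    for _ in range(steps):
        k1 = hamiltonian_field(s)
        k2 = hamiltonian_field([a + 0.5 * h * b for a, b in zip(s, k1)])
        k3 = hamiltonian_field([a + 0.5 * h * b for a, b in zip(s, k2)])
        k4 = hamiltonian_field([a + h * b for a, b in zip(s, k3)])
        s = [a + h / 6.0 * (b + 2.0 * c + 2.0 * d + e)
             for a, b, c, d, e in zip(s, k1, k2, k3, k4)]
    return s

def exp_map(q0, p0):
    """Exponential map: base point of the extremal at time 1."""
    return rk4_flow(list(q0) + list(p0), 1.0)[:3]

q0 = [0.0, 0.0, 0.0]
p0 = [1.0, 0.3, 2.0]
t = 0.7

point_on_geodesic = rk4_flow(q0 + p0, t)[:3]          # pi(e^{tH}(lambda_0))
point_from_exp = exp_map(q0, [t * p for p in p0])     # E(t * lambda_0)
```

The two points `point_on_geodesic` and `point_from_exp` agree up to numerical error, which is the statement that the exponential map along the ray $t\lambda_0$ reproduces the normal extremal path.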

We end this section with the Hamiltonian version of the Gauss Lemma.

Proposition 7.28 (Gauss Lemma). Let $(u,\lambda_1)$ be associated with a normal minimizer starting from $q_0$. Assume that the sub-Riemannian front $E_{q_0}(T^*_{q_0}M)$ is smooth at $E_{q_0}(\lambda_1)$. Then the covector $\lambda_1$ annihilates the tangent space to $E_{q_0}(T^*_{q_0}M)$.

Proof. It is enough to show that for every smooth variation $\eta^s$ of initial covectors such that $\eta^0 = \lambda_0$ we have
\[ \Big\langle \lambda(1), \frac{d}{ds}\Big|_{s=0} E_{q_0}(\eta^s) \Big\rangle = 0. \]
Let us consider a family of initial covectors $\eta^s \in H^{-1}(1/2)$ and their associated controls $u^s(\cdot)$ defined by the identities
\[ u^s_i(t) = \langle \eta^s(t), f_i(\gamma^s(t)) \rangle, \qquad \|u^s\|_{L^2} = 1, \]
where $\eta^s(t)$ is the solution of the Hamiltonian equation with initial value $\eta^s$ and $\gamma^s(t)$ is the corresponding trajectory. For these controls one has $E_{q_0}(\eta^s) = F(u^s)$, hence
\[ \frac{d}{ds}\Big|_{s=0} E_{q_0}(\eta^s) = \frac{d}{ds}\Big|_{s=0} F(u^s) = D_u F(v), \qquad v := \frac{d}{ds}\Big|_{s=0} u^s. \tag{7.32} \]
Notice that $v$ is orthogonal to $u$ since $\|u^s\| = \text{const}$. Thus by the normal equation (7.5) and (7.32)
\[ \Big\langle \lambda(1), \frac{d}{ds}\Big|_{s=0} E_{q_0}(\eta^s) \Big\rangle = \langle \lambda(1), D_u F(v) \rangle = (u,v)_{L^2} = 0. \tag{7.33} \]

Now we focus on normal extremal paths starting from the fixed point $q_0$ and we want to understand how they cover a neighborhood of the initial point. Recall that normal extremal paths are projections of the Hamiltonian flow on $T^*M$
\[ \dot\lambda(t) = \vec H(\lambda(t)), \qquad H(\lambda) = \frac12 \sum_{i=1}^k \langle \lambda, f_i(q) \rangle^2, \]
where $H$ is the sub-Riemannian Hamiltonian. In particular the exponential map can be interpreted as the restriction of the end-point map to a special class of controls parametrized by a covector $\lambda_0 \in T^*_{q_0}M$:
\[ E_{q_0}(\lambda_0) = \pi \circ e^{\vec H}(\lambda_0) = F(u^{\lambda_0}), \]
where
\[ u^{\lambda_0}(t) = (u^{\lambda_0}_i(t))_{i=1,\dots,k}, \qquad u^{\lambda_0}_i(t) = \langle \lambda(t), f_i(\gamma(t)) \rangle. \]
Recall that, if we denote by $\gamma_{\lambda_0}(t)$ the normal extremal path with initial covector $\lambda_0$, from the homogeneity property of $H$ it follows that
\[ E(t\lambda(0)) = \gamma_{\lambda_0}(t), \]
from which one gets
\[ D_0 E(\lambda_0) = \dot\gamma_{\lambda_0}(0) = \frac{\partial H}{\partial p}(q_0,p_0) = D_{\lambda_0}\big(H|_{T^*_{q_0}M}\big). \]
It follows that $0$ is a regular point of $E$ if and only if $\mathcal D_{q_0} = T_{q_0}M$.

Remark 7.29. In the Riemannian case $E$ gives local coordinates on $M$ around $q_0$, being a diffeomorphism of a small ball in $T^*_{q_0}M$ onto a small geodesic ball in $M$, where geodesics are images of straight lines in the cotangent space. Moreover there is a unique minimizer joining $q_0$ to every point of the (sufficiently small) ball and $d^2$ is a smooth function around $q_0$. As we show next, as soon as $\mathcal D_{q_0} \neq T_{q_0}M$ singularities appear.

7.5 Conjugate points and minimality properties of geodesics

Consider now an extremal pair $(u(t),\lambda(t))$, $t \in [0,1]$, such that the corresponding extremal path $\gamma(t)$ is strictly normal. Recall that by Corollary 4.59 the curve $\gamma$ is a geodesic. Moreover, $\gamma|_{[0,s]}$ is also a geodesic, for every $s > 0$, and if we reparametrize it as $\gamma_s(t) := \gamma(st)$, $t \in [0,1]$, it corresponds to the control $u_s(t) = s\,u(st)$.

Definition 7.30. A geodesic $\gamma(t)$ is said to be strongly normal if $\gamma|_{[0,s]}$ is strictly normal for all $s > 0$.

Proposition 7.31. Let $\gamma$ be a strongly normal geodesic. The following are equivalent:
(i) $\mathrm{Hess}_u\, J|_{F^{-1}(\gamma(1))}$ is positive definite,
(ii) $\mathrm{Hess}_{u_s} J|_{F^{-1}(\gamma_s(1))}$ is non degenerate for all $s > 0$.

Proof. Recall that
\[ \mathrm{Hess}_{u_s} J|_{F^{-1}(\gamma_s(1))}(v) = \|v\|^2_{L^2} - \langle \lambda_s, D^2_{u_s}F(v,v) \rangle, \tag{7.34} \]
which is a well defined quadratic form of the kind $\mathrm{Id} - Q_s$, with $Q_s$ compact, since it is a Volterra operator (see also the proof of Proposition 7.2). Then define the function
\[ \alpha(s) := \inf_{\|v\|=1} \big\{ \|v\|^2_{L^2} - \langle \lambda_s, D^2_{u_s}F(v,v) \rangle \big\} = 1 - \sup_{\|v\|=1} \langle \lambda_s, D^2_{u_s}F(v,v) \rangle. \tag{7.35} \]
Notice that, $Q_s$ being a compact operator, if the quadratic form $\langle \lambda_s, D^2_{u_s}F(v,v) \rangle$ is not negative definite then the supremum in (7.35) is attained and it is the maximum eigenvalue (see the note at the end of this page). On the other hand, if the quadratic form is negative definite the supremum is always zero (it is sufficient to evaluate it on any orthonormal sequence).

Now we prove the following claim, which immediately implies the proposition, using that if the Hessian is degenerate at some point $\bar s$ then $\alpha(\bar s) = 0$ (indeed $\alpha(\bar s) = 0$ means that the quadratic form is nonnegative and has infimum zero, hence has zero as an eigenvalue, by compactness).

Claim. $\alpha(s)$ is a continuous and monotone decreasing function, with $\alpha(0) = 1$.

Proof of the Claim. It is easy to show that the following formulas hold for the first and second differentials computed at the points $u_s$:
\[ D_{u_s}F(v) = \int_0^s (P_t^{-1})_* f_{v(t)}\,dt, \qquad D^2_{u_s}F(v,v) = \iint_{0 \le \tau \le t \le s} \big[ (P_\tau^{-1})_* f_{v(\tau)}, (P_t^{-1})_* f_{v(t)} \big]\,d\tau\,dt. \tag{7.36} \]

Note 2. A compact symmetric operator on a Hilbert space is diagonalizable and the set of eigenvalues is countable, bounded, and can be ordered in such a way that $\mu_n \to 0$.

Now consider $0 \le s \le \hat s \le 1$ and $v \in \mathrm{Ker}\, D_{u_s}F$, and define the control $\hat v$ obtained by rescaling $v$ to the subinterval $[0, s/\hat s]$ and extending it by zero on $]s/\hat s, 1]$. Then $\|\hat v\| = \|v\|$, $\hat v \in \mathrm{Ker}\, D_{u_{\hat s}}F$ and $D^2_{u_s}F(v,v) = D^2_{u_{\hat s}}F(\hat v,\hat v)$, hence $\alpha(s) \ge \alpha(\hat s)$. On the other hand, if we consider $\gamma_s(t) = \gamma(st)$ as defined on the whole segment $[0,1]$, we can rewrite (7.36) as follows:
\[ D_{u_s}F(v) = s \int_0^1 (P_{st}^{-1})_* f_{v(t)}\,dt, \qquad D^2_{u_s}F(v,v) = s^2 \iint_{0 \le \tau \le t \le 1} \big[ (P_{s\tau}^{-1})_* f_{v(\tau)}, (P_{st}^{-1})_* f_{v(t)} \big]\,d\tau\,dt. \tag{7.37} \]
To prove that $\alpha$ is continuous we need that both the integrand in the expression of $D_{u_s}F$ and the kernel $\mathrm{Ker}\, D_{u_s}F$ on which these quadratic forms are defined are continuous with respect to $s$. This follows from our main assumption on $\gamma$. Indeed, since every restriction $\gamma|_{[0,s]}$ is strictly normal, the rank of $D_{u_s}F$ is always equal to $n$ (see Note 3 below), and the kernel depends continuously on $s$.

Remark 7.32. Notice that (i) implies only that $u$ is a local minimizer in the $L^2$-topology. We will discuss stronger minimality conditions in the next sections.

As we said, the exponential map is nothing but the map of the Proposition. It is then natural to give the following definition:

Definition 7.33. Fix $q_0 \in M$ and consider the exponential map $E = E_{q_0}$ starting from $q_0$. A point $q \neq q_0$ is said to be conjugate to $q_0$ if $q$ is a critical value of $E$. We say that $q$ is conjugate to $q_0$ along the geodesic $\gamma(t) = E(t\lambda)$ if $q = \gamma(s)$ and $s\lambda$ is a critical point of the exponential map $E$.

Remark 7.34. Recall that E(λ)

Proposition 7.35. Let $\gamma(t)$ be a strongly normal geodesic and $s \le t$. Then $\gamma(s)$ is conjugate to $\gamma(0)$ if and only if $\mathrm{Hess}_{u_s} J|_{F^{-1}(\gamma_s)}$ is degenerate.

Proof. We apply the Proposition. Indeed $\gamma(s)$ is a conjugate point if and only if $u_s$ is a critical point of the exponential map, which is equivalent to the fact that $\mathrm{Hess}_{u_s} J|_{F^{-1}(\gamma_s)}$ is degenerate.

Corollary 7.36. Let $\gamma(t)$ be a strongly normal geodesic and assume that there are no conjugate points. Then $\mathrm{Hess}_u\, J|_{F^{-1}(q_1)} > 0$. In particular $\gamma(t)$ is a local minimizer in the $L^2$-topology for controls.

Proof. Indeed, since there are no conjugate points, by Proposition 7.35 it follows that $\mathrm{Hess}_{u_s} J|_{F^{-1}(\gamma_s)}$ is non degenerate for every $s \in [0,1]$, hence $\mathrm{Hess}_u\, J|_{F^{-1}(q_1)} > 0$ by Proposition 7.31.

Corollary 7.37. Let $\gamma(t)$ be a strongly normal geodesic. Then the set $\{s > 0,\ \gamma(s) \text{ is conjugate}\}$ is isolated from $0$.

Note 3. A piece of curve $\gamma_s$ is abnormal if and only if it is a critical point of $F$, which means that the rank of the derivative is not maximal at this point.

Proof. It follows from the fact that small pieces of a normal geodesic are minimizers and from the Proposition above.

Hence we have a good characterization of minimizers for the sub-Riemannian distance in terms of conjugate points, but only in the $L^2$-topology for controls, which is equivalent to the $H^1$-topology for the trajectories. Now we want to prove that, if there are no conjugate points, the trajectory is also a minimizer in the $C^0$-topology, which is stronger.

Proposition 7.38. Let $\gamma$ be a strongly normal geodesic. If $\gamma(s)$ is not conjugate to $\gamma(0)$ for every $0 < s \le 1$, then $\gamma$ is a strong minimum in the $C^0$-topology for trajectories.

Proof. Assume that
\[ \gamma(t) = \pi \circ e^{t\vec H}(\lambda_0), \qquad \lambda_0 \in T^*_{q_0}M. \]
We want to show that the hypotheses of Theorem 4.57 are satisfied. We will use the following lemma, which we prove at the end of the proof.

Lemma 7.39. There exists $a \in C^\infty(M)$ such that
\[ \lambda_0 = d_{q_0}a, \qquad \mathrm{Hess}_{(q,u)}(J + a)\big|_{F^{-1}(\gamma_s)} > 0. \]
In this case $(F, J+a)$ is a Morse problem and
\[ L_{(F,J+a)} = \{ e^{\vec H}(d_q a),\ q \in M \}. \]

From this Lemma it follows that $s\lambda_0$ is a regular point of the map $\pi \circ e^{\vec H}|_{L}$, where as usual $L = \{d_q a,\ q \in M\}$ denotes the graph of the differential. Using the homogeneity property (7.31) we can rewrite this by saying that
\[ \pi \circ e^{s\vec H}|_{L} \ \text{is an immersion at } \lambda_0, \qquad \forall\, s \in [0,1]. \]
In particular it is a local diffeomorphism. Hence we can apply the local version of the Theorem.

Proof of Lemma 7.39. First we notice that
\[ \mathrm{Ker}\, D_{(q,u)}F \subset T_qM \oplus U, \qquad U \text{ Hilbert}. \]
In particular
\[ \mathrm{Ker}\, D_{(q,u)}F \cap (0 \oplus U) = 0 \oplus \mathrm{Ker}\, D_uF. \]
Since there are no conjugate points, it follows that
\[ \mathrm{Hess}_{(q,u)}(J + a)\big|_{0 \oplus \mathrm{Ker}\, D_uF} = \mathrm{Hess}_u J > 0. \tag{7.38} \]
Then it is sufficient to show that there exists a choice of the function $a \in C^\infty(M)$ such that the Hessian is positive definite also on the complement. We define
\[ W_s := \{ \xi \oplus v \in \mathrm{Ker}\, D_{(q,u_s)}F \mid \mathrm{Hess}(J+a)(\xi \oplus v,\ 0 \oplus \mathrm{Ker}\, D_{u_s}F) = 0 \}. \]

Notice from (7.38) that, if there is some nonzero $\xi \oplus v \in W_s$, then $\xi \neq 0$. Now we prove that there exists a map
\[ B_s : T_qM \to U, \qquad W_s = \{ \xi \oplus B_s\xi,\ \xi \in T_qM \}. \]
Then we will have
\[ \mathrm{Ker}\, D_{(q,u_s)}F = (0 \oplus \mathrm{Ker}\, D_{u_s}F) + W_s \]
and we get
\[ \mathrm{Hess}(J+a)(\xi \oplus B_s\xi + 0 \oplus v,\ \xi \oplus B_s\xi + 0 \oplus v) = \mathrm{Hess}\,J(v,v) + \mathrm{Hess}(J+a)(\xi \oplus B_s\xi,\ \xi \oplus B_s\xi) = \mathrm{Hess}\,J(v,v) + d^2a(\xi,\xi) + Q(\xi), \]
where we used that mixed terms give no contribution and we denote by $Q(\xi)$ a quadratic form that does not depend on the second derivatives of $a$. In particular, since the first term is positive, we can choose $a$ in such a way that the whole expression remains positive.

Remark 7.40. The assumption that the curve $\gamma$ is strictly normal is essential in what we proved. Indeed if a curve $\gamma$ is both normal and abnormal then there exist two covectors $\lambda_1, \nu_1$ that satisfy
\[ \lambda_1 D_uF = u, \qquad \nu_1 D_uF = 0, \]
which implies
\[ (\lambda_1 + s\nu_1) D_uF = u, \qquad \forall\, s \in \mathbb R, \]
and the whole one-parameter family of covectors projects onto the same geodesic, so that $\gamma$ would be a critical point of the projection. In this case the definition of conjugate point should be changed.

Up to now we proved a sufficient condition for a strictly normal geodesic to be a strong minimum of the sub-Riemannian distance. Indeed Proposition 7.38 says that, if $\gamma$ contains no conjugate points, then it is optimal with respect to sufficiently $C^0$-close curves. On the other hand, if we consider a control $u$ such that the corresponding trajectory
\[ \gamma(t) = q_0 \circ \overrightarrow{\exp}\int_0^t f_{u(s)}\,ds \]
is strictly normal, which means that $u$ is not a critical point of the end-point map $F$, then the Hessian of $J|_{F^{-1}(q_1)}$, where $q_1 = F(u)$, is well defined at the point $u$. Moreover, if $\gamma$ is locally optimal, even in a very weak sense, then necessarily we have
\[ \mathrm{Hess}_u\, J|_{F^{-1}(q_1)} \ge 0. \]
Indeed if the Hessian is sign-indefinite, then the map is locally open around the point $u$ and small perturbations give rise to a smaller cost.

As in the proof of Proposition 7.31 we consider the family of rescaled controls (and corresponding trajectories)
\[ u_s(t) = s\,u(st), \qquad \gamma_s(t) = \gamma(st), \qquad s,t \in [0,1], \]
and we define the function
\[ \alpha(s) = \min_{\|v\|=1} \mathrm{Hess}_{u_s} J|_{F^{-1}(\gamma_s(1))} \]

that is well defined, continuous and non-increasing, under the assumption that $\gamma_s$ is strictly normal for every $s \in [0,1]$. Notice that $\alpha(\bar s) = 0$ if and only if $\gamma(\bar s)$ is a conjugate point. Since $\alpha(0) = 1$ we have only three cases:

(a) $\alpha(1) > 0$. By monotonicity this implies $\alpha(s) > 0$ for all $s$ and we have no conjugate points. Hence, by Proposition 7.38, $\gamma$ is a minimum in the strong topology.
(b) $\alpha(1) < 0$. Then the Hessian at $u$ is sign-indefinite and $\gamma$ is not a minimum, even in the weak topology.
(c) $\alpha(1) = 0$. In this case the Hessian is semi-definite and we cannot conclude anything about the minimality of $\gamma$.

Notice that in cases (b) and (c) a segment of conjugate points can also appear. To analyze case (c) in detail and to better understand the properties of a segment of conjugate points we introduce the notion of Jacobi curves, which in some sense generalize the notion of Jacobi fields in Riemannian geometry (see Chapter 13).

7.6 Application: Conjugate locus on perturbed $S^2$

In this section we prove that the conjugate locus of a generic $C^2$ perturbation of the standard metric on $S^2$ has at least 4 cusps. Recall that the conjugate locus from a point $q_0$ on the standard sphere $S^2$ coincides with the point antipodal to $q_0$, where all geodesics starting from $q_0$ meet and lose their optimality.

Let us then consider a point $q_0$ on $S^2$ with a Riemannian Hamiltonian $H$ sufficiently close to $H_0$ (with respect to the $C^2$ topology). Normal geodesics starting from $q_0$ can be parametrized by an angle $\theta \in S^1$, which describes the set of normal extremal paths parametrized by length, or equivalently covectors $\lambda \in T^*_{q_0}M$ such that $H(\lambda) = 1/2$.

For a fixed initial condition $\lambda_0 = (q_0,\theta)$ we have that $\lambda(t) = e^{t\vec H}(\lambda_0) = (p(t,\theta),\gamma(t,\theta))$, and we denote by $E_{q_0}$ the exponential map based at $q_0$
\[ E_{q_0}(t,\lambda_0) = \pi \circ e^{t\vec H}(\lambda_0) = \gamma(t,\theta). \]
For every initial condition $\theta \in S^1$ let us denote by $\gamma_\theta(t)$ (also $\gamma(t,\theta)$) the normal extremal path associated with $\theta$ and starting from $q_0$, and by $t_c(\theta)$ the first conjugate time along $\gamma_\theta$. The conjugate locus is the set
\[ \mathrm{Con}(q_0) = \{ \gamma(t_c(\theta),\theta),\ \theta \in S^1 \}. \]

Proposition 7.41. The conjugate time along $\gamma_\theta$ is characterized as follows:
\[ t_c(\theta) = \min\Big\{ t > 0 \ \Big|\ \frac{\partial E}{\partial\theta}(t,\theta) = 0 \Big\}. \tag{7.39} \]

Proof. The conjugate point corresponds to points $(t,\theta)$ such that the differential of the exponential map is not surjective, i.e. when
\[ \mathrm{rank}\Big\{ \frac{\partial E}{\partial t}(t,\theta),\ \frac{\partial E}{\partial\theta}(t,\theta) \Big\} = 1. \tag{7.40} \]

Let us show that the two vectors cannot be proportional unless $\frac{\partial E}{\partial\theta}(t,\theta) = 0$. Indeed it follows from Proposition 7.28 that
\[ \Big\langle p, \frac{\partial E}{\partial t}(t,\theta) \Big\rangle = 1, \qquad \Big\langle p, \frac{\partial E}{\partial\theta}(t,\theta) \Big\rangle = 0, \]
thus, whenever $\frac{\partial E}{\partial\theta}(t,\theta) \neq 0$, the two vectors appearing in (7.40) are always linearly independent.

Let us now consider solutions of the equation
\[ \frac{\partial E}{\partial\theta}(t,\theta) = 0. \tag{7.41} \]
In other words introduce the function $\beta : \theta \mapsto E(t_c(\theta),\theta)$. By the chain rule and (7.41), it is easy to see that
\[ \beta'(\theta) = t_c'(\theta)\,\frac{\partial E}{\partial t}(t_c(\theta),\theta) + \underbrace{\frac{\partial E}{\partial\theta}(t_c(\theta),\theta)}_{=0}. \tag{7.42} \]
Let us denote by $g : S^1 \to \mathbb R^2$ the function $g(\theta) := \frac{\partial E}{\partial t}(t_c(\theta),\theta)$. When $H$ corresponds to the Hamiltonian $H_0$ of the standard Riemannian structure on the sphere, the function $g$ describes a circle:
\[ g_0(\theta) = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}. \]
By assumption the perturbation of the metric is small in the $C^2$-topology, hence the perturbation does not change the convexity property of $g$. Then the cuspidal points of the conjugate locus correspond exactly to those points where the function $\theta \mapsto t_c'(\theta)$ changes sign.

Theorem 7.42. The conjugate locus of the perturbed sphere has at least 4 cuspidal points.

Proof. Notice that the function $\theta \mapsto t_c'(\theta)$, seen as a periodic function defined on $\mathbb R$, can change sign only an even number of times on the interval $[0,2\pi]$. Moreover it has zero mean since
\[ \int_0^{2\pi} t_c'(\theta)\,d\theta = t_c(2\pi) - t_c(0) = 0, \tag{7.43} \]
which implies that, if it is not identically zero, it has to change sign at least twice on $[0,2\pi]$. Notice also that
\[ \int_0^{2\pi} t_c'(\theta)\,g(\theta)\,d\theta = \int_0^{2\pi} \beta'(\theta)\,d\theta = \beta(2\pi) - \beta(0) = 0. \tag{7.44} \]
Let us now assume by contradiction that the function $\theta \mapsto t_c'(\theta)$ changes sign exactly twice, at points $\theta_1,\theta_2 \in S^1$. Then, by convexity, there exists a covector $\lambda \in (\mathbb R^2)^*$ such that $\langle \lambda, g(\theta_i) \rangle = 0$ for $i = 1,2$ and such that $t_c'(\theta)\langle \lambda, g(\theta) \rangle \ge 0$, which implies in particular
\[ \int_0^{2\pi} t_c'(\theta)\,\langle \lambda, g(\theta) \rangle\,d\theta > 0, \]
which contradicts (7.44).

Remark 7.43. The same argument can be applied to every small $C^2$ perturbation $H$ of the Riemannian Hamiltonian $H_0$ associated with the standard Riemannian structure on $S^2$, not necessarily a quadratic Hamiltonian coming from a Riemannian metric.
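To see how the two integral constraints (7.43) and (7.44) rule out exactly two sign changes, one can test them in the unperturbed model $g_0(\theta) = (\cos\theta, \sin\theta)$. The functions $\tau_1, \tau_2$ below are hypothetical candidates standing in for $t_c'$; they are illustrations only and not conjugate times of any particular metric.

```latex
% Candidate with exactly two sign changes: \tau_1(\theta) = \cos\theta.
% It has zero mean, so (7.43) holds, but (7.44) fails:
\int_0^{2\pi} \cos\theta \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} d\theta
  = \begin{pmatrix} \pi \\ 0 \end{pmatrix} \neq 0 .

% Candidate with four sign changes: \tau_2(\theta) = \cos 2\theta
% (zeros at \pi/4, 3\pi/4, 5\pi/4, 7\pi/4). Both constraints hold:
\int_0^{2\pi} \cos 2\theta \, d\theta = 0 , \qquad
\int_0^{2\pi} \cos 2\theta \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} d\theta
  = \begin{pmatrix} 0 \\ 0 \end{pmatrix} .
```

In Fourier terms, for the round circle (7.43) and (7.44) say that the candidate is orthogonal to $1$, $\cos\theta$ and $\sin\theta$, which is why a nonzero candidate needs at least four sign changes.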

7.7 Global minimizers

Before going to the analysis of global minimality of geodesics, let us summarize in the following theorem our results about local minimality.

Theorem 7.44. Let $M$ be complete and let $\gamma$ be such that $\gamma|_{[0,s]}$ and $\gamma|_{[s,1]}$ are strictly normal for all $0 < s < 1$.
(i) If $\gamma$ has no conjugate point then it is a minimizer in the $C^0$-topology for the trajectories,
(ii) if $\gamma$ has at least one conjugate point then it is not a minimizer in the $L^2$-topology for controls.

Remark 7.45. Notice that the hypotheses of the above theorem imply that in case (ii) it is not possible to have a segment of conjugate points up to $t = 1$.

Definition 7.46. We say that a point $q$ is in the cut locus of $q_0$ if there exist two length minimizers joining $q_0$ and $q$.

Our previous analysis of conjugate points lets us state the following result.

Theorem 7.47. Let $M$ be a complete sub-Riemannian manifold and $\gamma : [0,1] \to M$ be a normal extremal path. Then
(i) assume that $\gamma|_{[0,s]}$ is strictly normal for all $s > 0$ and that $\gamma$ is not a minimizer. Then there exists $\tau \in\, ]0,1]$ such that $\gamma(\tau)$ is either cut or conjugate to $\gamma(0)$,
(ii) assume that $\gamma|_{[s,1]}$ is strictly normal for all $s > 0$ and that there exists $\tau \in\, ]0,1]$ such that $\gamma(\tau)$ is either cut or conjugate to $\gamma(0)$. Then $\gamma$ is not a minimizer.
In particular if $\gamma$ is strongly normal then $\gamma$ is not a minimizer if and only if there exists a cut or a conjugate point along $\gamma$.

Proof. (i). Let us assume that $\gamma$ is not a minimizer and that there are no conjugate points along $\gamma$. We prove that this implies the presence of a cut point. Define
\[ t_* := \sup\{ t \in [0,1] \mid \gamma|_{[0,t]} \text{ is minimizing} \}. \]
Let us show that $0 < t_* < 1$. Indeed $t_* > 0$ since small pieces of a normal extremal path are minimizers. Moreover, since $\gamma|_{[0,1]}$ is not a minimizer, by continuity of the distance also $t_* < 1$ and $\ell(\gamma|_{[0,t_*]}) = d(\gamma(0),\gamma(t_*))$.

Fix now a sequence $t_n \to t_*$ such that $t_n > t_*$ for all $n$ and denote by $\gamma_n(\cdot)$ a minimizer joining $\gamma(0)$ to $\gamma(t_n)$, so that $\ell(\gamma_n) = d(\gamma(0),\gamma(t_n))$ (the existence of such minimizers follows from the completeness assumption). By compactness of minimizers (up to considering a subsequence) there exists a limit minimizer $\gamma_n \to \hat\gamma$ joining $\gamma(0)$ to $\gamma(t_*)$. In particular
\[ \ell(\hat\gamma|_{[0,t_*]}) = d(\gamma(0),\gamma(t_*)) = \ell(\gamma|_{[0,t_*]}). \]
On the other hand, since the segment $\gamma|_{[0,t_*]}$ contains no conjugate points, the curve $\gamma|_{[0,t_*]}$ is a minimizer in the strong $C^0$-topology. Thus $\hat\gamma$ cannot be contained in a neighborhood of $\gamma$. From this it follows that $\gamma(t_*)$ is a cut point.

(ii). Assume that there exists a conjugate point $\gamma(\tau)$ in the segment $[0,1]$. Then $\gamma$ is not a local (hence global) minimizer, as proved in Theorem 7.44. It remains to show that the same remains true if $\gamma(\tau)$ is a cut point. Indeed in this case we have a minimizer $\hat\gamma$ such that $\hat\gamma(\tau) = \gamma(\tau)$. From this it follows that the curve built with $\hat\gamma|_{[0,\tau]}$ and $\gamma|_{[\tau,1]}$ is also a minimizer and the piece

$\gamma|_{[\tau,1]}$, by uniqueness of the covector, would be associated with two different normal covectors, hence it would be abnormal, which contradicts our assumptions.

Theorem 7.48. Let $\gamma : [0,1] \to M$ be a strictly normal extremal path. Assume that for some $s > 0$
(i) $\gamma|_{[0,s]}$ is a global minimizer,
(ii) at each point in a neighborhood of $\gamma(s)$ there exists a unique minimizer joining $\gamma(0)$ to that point, and it is not abnormal.
Then there exists $\varepsilon > 0$ such that $\gamma|_{[0,s+\varepsilon]}$ is a global minimizer.

Proof. Let us consider a neighborhood $O$ of $\gamma(s)$ and, for each $q \in O$, let us denote by $u^q$ (resp. $\gamma^q$) the minimizing control (resp. trajectory) joining $\gamma(0)$ to $q$. The map $q \mapsto u^q$ is continuous in the $L^2$ topology. Hence we can consider the family $\lambda^q_1$ of covectors such that
\[ \lambda^q_1 D_{u^q}F = u^q, \qquad q \in O. \]
By the smoothness of $F$ and the continuity of the map $q \mapsto D_{u^q}F$ we have that the map $q \mapsto \lambda^q_1$ is continuous. Indeed, since the trajectory associated with $u^q$ is not abnormal by assumption, $D_{u^q}F$ is onto. Thus its adjoint $(D_{u^q}F)^*$ is injective and satisfies
\[ (D_{u^q}F)^* \lambda^q_1 = u^q. \]
Thus the map $q \mapsto \lambda^q_0$ is continuous too, being the composition of the previous one with $(P^*_{0,1})^{-1}$. Moreover, the map $q \mapsto \lambda^q_0$ is also injective. Indeed it is an inverse of the exponential map. By the invariance of domain theorem we have that $\mathcal O = \{\lambda^q_0,\ q \in O\}$ is open in $T^*_{q_0}M$. Thus
\[ (1+\varepsilon)\lambda^{\gamma(s)}_0 \in \mathcal O \]
for $\varepsilon$ small enough. Since $(1+\varepsilon)\lambda^{\gamma(s)}_0 = \lambda^{\gamma((1+\varepsilon)s)}_0$, this means that $\gamma$ is a minimizer on the interval $[0,(1+\varepsilon)s[$. Hence $\gamma(s)$ is not a conjugate point.

Corollary 7.49. If we assume in Theorem 7.48 that $\gamma$ is strongly normal, then $\gamma(s)$ is not a conjugate point.

Corollary 7.50. Assume that the sub-Riemannian structure admits no abnormal minimizer. Let $\gamma : [0,1] \to M$ be a length minimizer such that $\gamma(1)$ is conjugate to $\gamma(0)$. Then any neighborhood of $\gamma(1)$ contains a cut point.

Chapter 8 Nonholonomic tangent space

In this chapter, for a point $q \in M$, the symbol $\Omega_q$ denotes the set of smooth curves $\gamma$ on $M$ that are based at $q$, that is $\gamma(0) = q$.

8.1 Jet spaces

Fix $q$ in $M$ and a curve $\gamma \in \Omega_q$. In every coordinate chart it is meaningful to write the Taylor expansion
\[ \gamma(t) = q + \dot\gamma(0)\,t + O(t^2). \tag{8.1} \]
The tangent vector $v \in T_qM$ to $\gamma$ at $t = 0$ is by definition the equivalence class of curves in $\Omega_q$ such that, in some coordinate chart, they have the same first order Taylor polynomial. (This requirement indeed implies that the same is true in every coordinate chart, by the chain rule.)

In the same spirit we can consider, given a smooth curve such that $\gamma(0) = q$, its $m$-th order Taylor polynomial at $q$
\[ \gamma(t) = q + \dot\gamma(0)\,t + \ddot\gamma(0)\,\frac{t^2}{2} + \dots + \gamma^{(m)}(0)\,\frac{t^m}{m!} + O(t^{m+1}). \tag{8.2} \]

Exercise 8.1. Let $\gamma,\gamma' \in \Omega_q$. We say that $\gamma$ is ($m$-)equivalent to $\gamma'$ at $q$, and we write $\gamma \sim_{q,m} \gamma'$, if their Taylor polynomials at $q$ of order $m$ in some coordinate chart coincide. Prove that $\sim_{q,m}$ is a well-defined equivalence relation on the set of curves based at $q$.

Definition 8.2. Let $m > 0$ be an integer and $q \in M$. We define the set of $m$-th jets of curves at the point $q \in M$ as the set of equivalence classes of curves based at $q$ with respect to $\sim_{q,m}$. We denote by $J^m_q\gamma$ the equivalence class of a curve $\gamma$ and
\[ J^m_q := \{ J^m_q\gamma : \gamma \in \Omega_q \}. \]

Remark 8.3. From the coordinate representation (8.2), one can prove that $J^m_q$ is a smooth manifold and $\dim J^m_q = mn$. Indeed the $m$-th order Taylor polynomial is characterized by the $n$-dimensional vectors $\gamma^{(i)}(0)$ for $i = 1,\dots,m$ (cf. (8.2)).

In the following we always assume that $q \in M$ is fixed together with a coordinate chart around $q$, where $q = 0$. The Taylor expansion of a curve $\gamma \in \Omega_q$ is then written as follows:
\[ J^m_q\gamma = \sum_{i=1}^m \gamma^{(i)}(0)\,\frac{t^i}{i!}. \]
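A small worked example may help to fix the definitions. The two curves below are hypothetical choices in $\mathbb R^2$ (so $n = 2$), based at $q = 0$; they are not taken from the text.

```latex
% Two curves based at q = 0 in R^2:
\gamma(t) = (t,\ t^2), \qquad
\gamma'(t) = (\sin t,\ t^2) = \Big( t - \tfrac{t^3}{6} + O(t^5),\ t^2 \Big).

% Their second order Taylor polynomials coincide, so \gamma \sim_{q,2} \gamma':
J^2_q\gamma = J^2_q\gamma' = (1,0)\,t + (0,2)\,\frac{t^2}{2}.

% The third derivatives at 0 differ, so the curves are not 3-equivalent:
\gamma^{(3)}(0) = (0,0), \qquad \gamma'^{(3)}(0) = (-1,0).
```

Here $J^2_q\gamma$ is described by the pair of vectors $(\dot\gamma(0), \ddot\gamma(0)) = ((1,0),(0,2))$, that is by $mn = 4$ numbers, in agreement with $\dim J^m_q = mn$.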

To better understand the structure of $J^m_q$ as a smooth manifold we consider the map which forgets the $m$-th derivative
\[ \Pi^m_{m-1} : J^m_q \to J^{m-1}_q, \qquad \sum_{i=1}^m \gamma^{(i)}(0)\,\frac{t^i}{i!} \mapsto \sum_{i=1}^{m-1} \gamma^{(i)}(0)\,\frac{t^i}{i!}. \]

Proposition 8.4. $J^m_q$ is an affine bundle over $J^{m-1}_q$ with projection $\Pi^m_{m-1}$, whose fibers are affine spaces over $T_qM$.

Proof. Fix an element $j \in J^{m-1}_q$; then the fiber $(\Pi^m_{m-1})^{-1}(j)$ is the set of all $m$-th jets with fixed $(m-1)$-th jet equal to $j$. To show that it is an affine space over $T_qM$ we should define the sum of a tangent vector and an $m$-th jet, with $(m-1)$-th jet fixed, having as a result another $m$-th jet with the same $(m-1)$-th jet. Let $j = J^m_q\gamma$ be the $m$-th jet of a smooth curve in $M$ and let $v \in T_qM$. Consider a smooth vector field $V \in \mathrm{Vec}(M)$ such that $V(q) = v$ and define the sum
\[ J^m_q\gamma + v := J^m_q(\gamma^v), \qquad \gamma^v(t) = e^{t^m V}(\gamma(t)). \tag{8.3} \]
It is easy to see that, due to the presence of the power $t^m$, the $(m-1)$-th Taylor polynomials of $\gamma$ and $\gamma^v$ coincide. Indeed
\[ J^m_q(e^{t^m V}(\gamma(t))) = J^m_q\gamma + t^m V(q). \]
Hence the sum (8.3) gives $(\Pi^m_{m-1})^{-1}(j)$ the structure of an affine space over $T_qM$. Indeed it is enough to check that the definition does not depend on the representative.

The geometric meaning of the fact that $J^m_q$ is an affine bundle (and not a vector bundle) is that we cannot complete in a canonical way an $(m-1)$-th jet to an $m$-th jet, i.e. we cannot fix an origin in the fiber. On the other hand there exists a sort of global origin on $J^m_q$, that is the jet of the constant curve equal to $q$.

Now we want to define dilations on jet spaces, analogously to homotheties in Euclidean spaces. Since we have no vector space structure we have to find an appropriate notion.

Definition 8.5. Let $\alpha \in \mathbb R$ and define $\gamma_\alpha(t) := \gamma(\alpha t)$ for every $t \in \mathbb R$. Define the dilation of factor $\alpha$ on $J^m_q$ as
\[ \delta_\alpha : J^m_q \to J^m_q, \qquad \delta_\alpha(J^m_q\gamma) = J^m_q(\gamma_\alpha). \]
One can check that this definition does not depend on the representative and, in coordinates, it is written as a quasi-homogeneous multiplication
\[ \delta_\alpha\Big( \sum_{i=1}^m t^i \xi_i \Big) = \sum_{i=1}^m t^i \alpha^i \xi_i. \]

Next we extend the notion of jets to vector fields. To start with, we consider flows on the manifold.

Definition 8.6. A flow on $M$ is a smooth family of diffeomorphisms
\[ P = \{ P_t \in \mathrm{Diff}(M),\ t \in \mathbb R \}, \qquad P_0 = \mathrm{Id}. \]

Notice that we do not require the family to be a one-parametric group (i.e., the group law $P_t \circ P_s = P_{t+s}$ is not necessarily satisfied); in general such a family is characterized as the flow of the nonautonomous vector field
\[ X_t := \frac{d}{d\varepsilon}\Big|_{\varepsilon=0} P_{t+\varepsilon} \circ P_t^{-1}. \]
The set of all flows on $M$ is a group with the point-wise product, i.e. the product of the flows $P = \{P_t\}$ and $Q = \{Q_t\}$ is given by
\[ (P \circ Q)_t := P_t \circ Q_t. \]
Clearly we can act with a flow on a smooth curve on $M$ as follows:
\[ (P\gamma)(t) = P_t(\gamma(t)). \]
Moreover, since $P_0 = \mathrm{Id}$, every flow defines a map on $\Omega_q$. This action is well-behaved with respect to the equivalence relations $\sim_{q,m}$, i.e., it defines a map on $J^m_q$. Indeed if $\gamma \in \Omega_q$, then $P\gamma \in \Omega_q$ and from the chain rule it follows that $J^m_q(P\gamma)$ depends only on the first $m$ derivatives of $\gamma$ at $q$, i.e., on $J^m_q\gamma$.

Definition 8.7. Let $P$ be a smooth flow on $M$. The action of $P$ on $J^m_q$ is defined by
\[ Pj := J^m_q(P\gamma), \qquad \text{if } j = J^m_q\gamma. \]
It can be easily checked that the definition is well-posed and $(P \circ Q)j = P(Qj)$ for every $j \in J^m_q$.

Jets of vector fields

Given a vector field $V \in \mathrm{Vec}(M)$ we want to define its $m$-th jet $J^m_qV$, which should naturally be an element of $\mathrm{Vec}(J^m_q)$. Let us denote by $P^V = \{e^{tV}\}$ the 1-parametric group defined by the flow of $V$. As we explained, we can act on jets
\[ P^V : j \mapsto e^{V}(j). \]
To act on a family of curves we need a family of flows; let us then consider the 1-parametric group of flows $P^V_s = \{e^{stV}\}$.

Definition 8.8. For every $V \in \mathrm{Vec}(M)$ we define the vector field $J^m_qV \in \mathrm{Vec}(J^m_q)$ as the section $J^m_qV : J^m_q \to TJ^m_q$ defined as follows:
\[ (J^m_qV)(J^m_q\gamma) := \frac{\partial}{\partial s}\Big|_{s=0} P^V_s(J^m_q\gamma) = \frac{\partial}{\partial s}\Big|_{s=0} J^m_q(e^{tsV}(\gamma(t))). \tag{8.4} \]

Exercise 8.9. Prove the following formula for every $V \in \mathrm{Vec}(M)$:
\[ (J^m_qV)(J^m_q\gamma) = \sum_{i=1}^m \frac{t^i}{i!}\,\frac{d^i}{dt^i}\Big|_{t=0} \big( tV(\gamma(t)) \big), \]
where $V$ is identified with a vector function $V : \mathbb R^n \to \mathbb R^n$ in coordinates.

To end this section we study the interplay between dilations and jets of vector fields. Since $\delta_\alpha$ is a map on $J^m_q$, its differential $(\delta_\alpha)_*$ acts on elements of $\mathrm{Vec}(J^m_q)$, and in particular on jets of vector fields on $M$. Surprisingly, its action on these fields is linear with respect to $\alpha$.

Proposition 8.10. For every $\alpha \in \mathbb R$ and $V \in \mathrm{Vec}(M)$ one has
\[ (\delta_\alpha)_*(J^m_qV) = J^m_q(\alpha V) = \alpha J^m_qV. \]

Proof. From the very definition of the differential of a map (see also Chapter 2) we have
\[ \begin{aligned} ((\delta_\alpha)_* J^m_qV)(J^m_q\gamma) &= \frac{\partial}{\partial s}\Big|_{s=0} J^m_q\big(\delta_\alpha\, e^{tsV}\, \delta_{1/\alpha}(\gamma(t))\big) \\ &= \frac{\partial}{\partial s}\Big|_{s=0} J^m_q\big(\delta_\alpha\, e^{tsV}(\gamma(t/\alpha))\big) \\ &= \frac{\partial}{\partial s}\Big|_{s=0} J^m_q\big(e^{\alpha tsV}(\gamma(t))\big) \\ &= (J^m_q(\alpha V))(J^m_q\gamma) = \alpha\,(J^m_qV)(J^m_q\gamma). \end{aligned} \]

8.2 Admissible variations

In this section we define the appropriate notion of tangent vector to a sub-Riemannian manifold, with the goal of defining the tangent structure to a sub-Riemannian one. As usual, we assume that the sub-Riemannian structure is defined by the generating family $\{f_1,\dots,f_m\}$. Admissible curves on $M$ are maps $\gamma : [0,T] \to M$ such that there exists a control function $u \in L^\infty$ such that
\[ \dot\gamma(t) = f_{u(t)}(\gamma(t)) = \sum_{i=1}^m u_i(t) f_i(\gamma(t)). \]
To have a good definition of tangent vector we cannot restrict to families of admissible curves, because in this way we lose all the information about directions that are not in the distribution. Indeed we want the tangent space to be a first order approximation of the structure, containing information about all directions.

We need a proper definition of tangent vector, that means a proper definition of variation of a point, in order to give a precise meaning to its principal term, which is going to be the tangent vector. We now introduce the notion of smooth admissible variation.

Definition 8.11. A curve $\gamma : [0,T] \to M$ in $\Omega_q$ is said to be a smooth admissible variation if there exists a family of controls $\{u(t,s)\}_{s \in [0,\tau]}$ such that
(i) $u(t,\cdot)$ is measurable and essentially bounded for all $t \in [0,T]$,
(ii) $u(\cdot,s)$ is smooth with bounded derivatives, for all $s \in [0,\tau]$,
(iii) $u(0,s) = 0$ for all $s \in [0,\tau]$,
(iv) $\gamma(t) = q \circ \overrightarrow{\exp}\int_0^\tau f_{u(t,s)}\,ds$.

In other words $\gamma$ is a smooth admissible variation (or shortly, admissible variation) if it can be parametrized as the final point of a smooth family of admissible curves. We stress that an admissible variation is not an admissible curve, in general.

Remark 8.12. We recall that two distributions are said to be equivalent (see also Definitions 3.3 and 3.17) if and only if the corresponding modules of horizontal vector fields are isomorphic, $D \simeq D'$, where we recall that
\[ D = \mathrm{span}\{ f(\sigma),\ \sigma \text{ smooth section of } U \}, \]
which is finitely generated by a basis $f_1,\dots,f_m$.

Let us show that the definition of admissible variation does not depend on the frame $f_1,\dots,f_m$. Recall that $\gamma(t)$ is an admissible variation if $\gamma(t) = q(t,\tau)$ where $q(t,s)$ is a solution of
\[ \frac{\partial}{\partial s} q(t,s) = \sum_{i=1}^m u_i(t,s) f_i(q(t,s)), \qquad s \in [0,\tau]. \]
Let now $\tilde f_1,\dots,\tilde f_m$ be another set of local generators of the module. There exist functions $a_{ij} \in C^\infty(M)$ such that
\[ \tilde f_i(q) = \sum_{j=1}^m a_{ij}(q) f_j(q), \qquad q \in M,\ i = 1,\dots,m, \tag{8.5} \]
and assume that $\gamma$ is an admissible variation with respect to $u(t,s)$ in this new frame, i.e.
\[ \frac{\partial}{\partial s} q(t,s) = \sum_{i=1}^m u_i(t,s) \tilde f_i(q(t,s)), \qquad s \in [0,\tau]. \tag{8.6} \]
Now we prove that there exists a control $\tilde u(t,s)$ such that $\gamma$ is an admissible variation in the old frame with respect to this control. From (8.5) we get
\[ \tilde f(u,q) = \sum_i u_i(t,s) \tilde f_i(q) = \sum_{i,j} u_i(t,s) a_{ij}(q) f_j(q) = \sum_j v_j(t,s,q) f_j(q) = f(v(u,q),q). \]
Then we can define, using the solution $q(t,s)$ of (8.6), the new control
\[ \tilde u_j(t,s) = \sum_i u_i(t,s)\, a_{ij}(q(t,s)), \]
and we see from the identities above that
\[ \frac{\partial}{\partial s} q(t,s) = \sum_{j=1}^m \tilde u_j(t,s) f_j(q(t,s)), \qquad s \in [0,\tau]. \tag{8.7} \]

Note. We assume that the sub-Riemannian structure is bracket generating at $q$ and we let $m$ be the degree of nonholonomy of the distribution, i.e. such that $D^m_q = T_qM$.

Definition 8.13. The set of admissible jets with respect to the sub-Riemannian structure is
\[ J^f_q := \{ J^m_q\gamma,\ \gamma \text{ is an admissible variation} \}. \]

Example 8.14. Consider two vector fields $X,Y \in \mathrm{Vec}(M)$ and the curve
\[ \gamma : t \mapsto e^{-tY} \circ e^{-tX} \circ e^{tY} \circ e^{tX}(q). \]
It is easily seen that $\gamma$ is an admissible variation if we set
\[ \gamma(t) = q \circ \overrightarrow{\exp}\int_0^4 f_{tv(s)}\,ds, \]
where
\[ v(s) = \begin{cases} (1,0), & \text{if } s \in [0,1], \\ (0,1), & \text{if } s \in [1,2], \\ (-1,0), & \text{if } s \in [2,3], \\ (0,-1), & \text{if } s \in [3,4]. \end{cases} \]
In coordinates we have the expansion $\gamma(t) = q + t^2[X,Y] + o(t^2)$.

Now we want to describe the nonholonomic tangent space in an intrinsic, coordinate-free way. Then we will see how it can be described in special coordinates.

Definition 8.15. The group of flows of admissible variations is
\[ P_f := \Big\{ \overrightarrow{\exp}\int_0^\tau f_{u(t,s)}\,ds,\ u(t,s) \text{ smooth variation} \Big\}. \]
Any admissible variation is given by $\gamma(t) = P_t(q)$ for some $P \in P_f$, where we identify $q$ with the constant curve $\gamma(t) \equiv q$ for all $t$. Then we have
\[ J^f_q = \{ J^m_q(P(q)),\ P \in P_f \} \]
and the set of admissible jets is exactly the orbit of $q$ under the action of the group $P_f$.

Remark 8.16. It is easy to see that $P_f$ is a group since the following equality holds:
\[ \overrightarrow{\exp}\int_0^{\tau_1} f_{u(t,s)}\,ds \circ \overrightarrow{\exp}\int_0^{\tau_2} f_{v(t,s)}\,ds = \overrightarrow{\exp}\int_0^{\tau_1+\tau_2} f_{w(t,s)}\,ds, \]
where
\[ w(t,s) = \begin{cases} u(t,s), & 0 \le s \le \tau_1, \\ v(t,s-\tau_1), & \tau_1 \le s \le \tau_1+\tau_2, \end{cases} \]
is the concatenation of controls (see Note 1).

Note 1. Here we see that it is useful not to fix $\tau$ in the definition, otherwise we would need to rescale controls.
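The expansion $\gamma(t) = q + t^2[X,Y] + o(t^2)$ for the commutator curve built from $X$ and $Y$ can be checked on a model where the flows are explicit. The sketch below is an illustration under stated assumptions, not part of the text: it takes the Heisenberg fields $X = \partial_x - \tfrac{y}{2}\partial_z$, $Y = \partial_y + \tfrac{x}{2}\partial_z$ on $\mathbb R^3$, for which $[X,Y] = \partial_z$, and follows $X$, $Y$, $-X$, $-Y$ for time $t$ each, which corresponds to the piecewise constant control $t\,v(s)$ on $[0,4]$. Function names are hypothetical.

```python
# Sketch: the commutator curve for the Heisenberg fields
#   X = d/dx - (y/2) d/dz,   Y = d/dy + (x/2) d/dz,   [X, Y] = d/dz.
# Assumption (not from the text): this particular choice of X and Y.

def flow_X(point, t):
    """Exact flow of X for time t: x grows linearly, z changes by -t*y/2."""
    x, y, z = point
    return (x + t, y, z - 0.5 * t * y)

def flow_Y(point, t):
    """Exact flow of Y for time t: y grows linearly, z changes by +t*x/2."""
    x, y, z = point
    return (x, y + t, z + 0.5 * t * x)

def commutator_curve(point, t):
    """Follow X, then Y, then -X, then -Y, each for time t."""
    p = flow_X(point, t)
    p = flow_Y(p, t)
    p = flow_X(p, -t)
    p = flow_Y(p, -t)
    return p

q = (0.3, -1.2, 2.0)
for t in (0.1, 0.5, 1.0):
    end = commutator_curve(q, t)
    # Displacement from q: only the z component moves, by t**2.
    print(t, tuple(e - c for e, c in zip(end, q)))
```

In this model the displacement is $(0,0,t^2) = t^2[X,Y]$ with no remainder, because the Heisenberg algebra is nilpotent of step 2; for general $X$, $Y$ only the expansion up to $o(t^2)$ holds. Note that $\gamma$ moves in the direction $\partial_z$, which is not in the span of $X$ and $Y$ at any point: an admissible variation need not be an admissible curve.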

Now we want to describe the tangent space as the quotient of this set with respect to some subgroup of slow flows. The heuristic idea is that a flow is slow if the first nonzero jet $J^i_q\gamma$ of its associated trajectory belongs to a subspace $D^j$, with $j < i$.

Definition 8.17. Let $P \in P_f$. $P$ is said to be purely slow if it is associated with a smooth variation $u(t,s)$ that satisfies
\[ u(0,s) = \frac{\partial u}{\partial t}(0,s) = 0. \]
The subgroup of slow flows is the normal subgroup of $P_f$ generated by purely slow flows, i.e.
\[ P^0_f := \{ (Q_t)^{-1} \circ P_t \circ Q_t,\ Q_t \in P_f,\ P_t \text{ purely slow} \}. \]

Remark 8.18. Notice that, from the definition and the linearity of $f$, a purely slow flow can be written as follows: $u(t,s) = t\,v(t,s)$, where $v(0,s) = 0$. Moreover we have
\[ P_t = \overrightarrow{\exp}\int_0^\tau f_{u(t,s)}\,ds = \overrightarrow{\exp}\int_0^\tau f_{tv(t,s)}\,ds = \overrightarrow{\exp}\int_0^\tau t f_{v(t,s)}\,ds = \overrightarrow{\exp}\int_0^{t\tau} f_{v(t,s)}\,ds. \]

Definition 8.19. Let $\gamma,\gamma'$ be two curves on $M$. We say that $J^m_q\gamma$ and $J^m_q\gamma'$ are equivalent if $\gamma'(t) = P_t(\gamma(t))$ for some $P \in P^0_f$. The nonholonomic tangent space $T^f_q$ is defined as
\[ T^f_q := J^f_q / \sim. \]

We end this section with the coordinate presentation of jets of horizontal vector fields of the sub-Riemannian structure.

Proposition 8.20. Let $X \in D$ be a horizontal vector field for the sub-Riemannian structure on $M$. Then the one-parametric group $e^{tX}$ acts on the set $J^f_q$. Moreover the action is well defined on the equivalence classes with respect to $\sim$.

Proof. From the very definition of $J^f_q$ it is easy to see that if $J^m_q\gamma$ is the jet of an admissible variation then the right hand side of (8.4) is an admissible variation for every $s$. We are left to show that
\[ \gamma(t) \sim \gamma'(t) \ \Longrightarrow\ e^{tX}\gamma(t) \sim e^{tX}\gamma'(t). \]
From our assumption we get $\gamma'(t) = \gamma(t) \circ Q_t$ for a slow flow $Q \in P^0_f$. It follows that
\[ \gamma'(t) \circ e^{tX} = \gamma(t) \circ Q_t \circ e^{tX} = \gamma(t) \circ e^{tX} \circ e^{-tX} \circ Q_t \circ e^{tX} = (\gamma(t) \circ e^{tX}) \circ Q'_t, \]
where $Q'_t := e^{-tX} \circ Q_t \circ e^{tX}$ is also a slow flow. This shows that the action of $e^{tX}$ is independent of the representative and is well defined on the quotient.

8.3 Nilpotent approximation and privileged coordinates

In this section we want to introduce some coordinates in which we have a good description of the nonholonomic tangent space.

Consider some non-negative integers $k_1,\dots,k_m$ such that $n = k_1+\dots+k_m$ and the splitting
\[ \mathbb{R}^n = \mathbb{R}^{k_1}\oplus\dots\oplus\mathbb{R}^{k_m},\qquad x = (x_1,\dots,x_m), \]
where every $x_i = (x_i^1,\dots,x_i^{k_i})\in\mathbb{R}^{k_i}$.

The space $\mathrm{Der}(\mathbb{R}^n)$ of all differential operators in $\mathbb{R}^n$ with smooth coefficients forms an associative algebra with composition of operators as multiplication. The differential operators with polynomial coefficients form a subalgebra of this algebra with generators $1$, $x_i^j$, $\frac{\partial}{\partial x_i^j}$, where $i=1,\dots,m$; $j=1,\dots,k_i$. We define the weights of the generators as
\[ \nu(1) = 0,\qquad \nu(x_i^j) = i,\qquad \nu\Big(\frac{\partial}{\partial x_i^j}\Big) = -i. \]
Then for any monomial
\[ \nu\Big(y_1\cdots y_\alpha\,\frac{\partial^\beta}{\partial z_1\cdots\partial z_\beta}\Big) = \sum_{i=1}^{\alpha}\nu(y_i) - \sum_{j=1}^{\beta}\nu(z_j). \]
We say that a polynomial differential operator $D$ is homogeneous if it is a sum of monomial terms all of the same weight.

Lemma. Let $D_1,D_2$ be two homogeneous differential operators. Then $D_1\circ D_2$ is homogeneous and
\[ \nu(D_1\circ D_2) = \nu(D_1)+\nu(D_2). \tag{8.8} \]

Proof. It is sufficient to check the claim for monomials of the kind $D_1 = \frac{\partial}{\partial x_{i_1}^{j_1}}$ and $D_2 = x_{i_2}^{j_2}$, and formula (8.8) follows from the identity
\[ \frac{\partial}{\partial x_{i_1}^{j_1}}\circ x_{i_2}^{j_2} = x_{i_2}^{j_2}\,\frac{\partial}{\partial x_{i_1}^{j_1}} + \frac{\partial x_{i_2}^{j_2}}{\partial x_{i_1}^{j_1}}. \]

A special case is when we consider vector fields. If $V_1,V_2\in\mathrm{Vec}(\mathbb{R}^n)$ are homogeneous vector fields, then $[V_1,V_2]$ is homogeneous and $\nu([V_1,V_2]) = \nu(V_1)+\nu(V_2)$.

With these properties we can define a filtration in the space of all smooth differential operators. Indeed we can write (in multiindex notation)
\[ D = \sum_\alpha \varphi_\alpha(x)\,\frac{\partial^\alpha}{\partial x^\alpha}. \]
Considering the Taylor expansion at $0$ of every coefficient, we can split $D$ as a sum of its homogeneous components
\[ D \approx \sum_{i=-\infty}^{\infty} D^{(i)}, \]
and define the filtration
\[ \mathcal{D}^{(h)} = \{ D\in\mathrm{Der}(\mathbb{R}^n) : D^{(i)} = 0,\ \forall\, i<h \},\qquad h\in\mathbb{Z}. \]

It is easy to see that it is a decreasing filtration, i.e. $\mathcal{D}^{(h)}\subset\mathcal{D}^{(h-1)}$ for every $h$, and if we restrict our attention to vector fields we get
\[ V\in\mathrm{Vec}(\mathbb{R}^n)\ \Longrightarrow\ V^{(i)} = 0,\quad \forall\, i<-m. \]
Indeed every monomial of an $N$-th order differential operator has weight not smaller than $-mN$. In other words we have

(i) $\mathrm{Vec}(\mathbb{R}^n)\subset\mathcal{D}^{(-m)}$,
(ii) $V\in\mathrm{Vec}(\mathbb{R}^n)\cap\mathcal{D}^{(0)}$ implies $V(0)=0$,

so a vector field that does not vanish at the origin cannot belong to $\mathcal{D}^{(0)}$: it has a nonzero component of weight at most $-1$. This motivates the following definition.

Definition. A system of coordinates near the point $q$ is said to be privileged for a sub-Riemannian structure on $M$ if the following conditions are satisfied:

(i) $\mathcal{D}^i_q = \mathbb{R}^{k_1}\oplus\dots\oplus\mathbb{R}^{k_i}$, $i=1,\dots,m$,
(ii) $f\in\mathcal{D}^{(-1)}$ for every horizontal vector field $f\in\mathcal{D}$.

Condition (i) says that our coordinates are linearly adapted to the flag $\mathcal{D}^1_q\subset\mathcal{D}^2_q\subset\dots\subset\mathcal{D}^m_q$. Notice that this condition can always be satisfied with a linear change of coordinates.

Example. We analyze the meaning of privileged coordinates in the easiest cases $m=1,2$ and we show that in general not all systems of linearly adapted coordinates are privileged.

(1) If $m=1$ all sets of coordinates are privileged, because $\mathrm{Vec}(M)\subset\mathcal{D}^{(-1)}$ since $\nu(\partial_{x_i}) = -1$ for all $i$.

(2) If $m=2$ then all systems of coordinates that are linearly adapted to the flag are privileged. Indeed we have $\nu(\partial_{x_1^j}) = -1$ and $\nu(\partial_{x_2^j}) = -2$, and a vector field that belongs to $\mathcal{D}^{(-2)}\setminus\mathcal{D}^{(-1)}$ must contain a monomial of the second kind with constant coefficient. On the other hand the vector fields $f_1,\dots,f_k$ cannot contain such a monomial since, by our assumption, $\mathrm{span}\{f_1(0),\dots,f_k(0)\} = \mathcal{D}^1_0 = \mathbb{R}^{k_1}$.

(3) Let us consider the following set of vector fields in $\mathbb{R}^3 = \mathbb{R}\oplus\mathbb{R}\oplus\mathbb{R}$:
\[ f_1 = \partial_{x_1} + x_1\partial_{x_3},\qquad f_2 = x_1\partial_{x_2},\qquad f_3 = x_2\partial_{x_3}, \]
where we put $\nu(x_i) = i$ for $i=1,2,3$. All nontrivial commutators are computed as follows:
\[ [f_1,f_2] = \partial_{x_2},\qquad [f_2,f_3] = x_1\partial_{x_3},\qquad [[f_1,f_2],f_3] = \partial_{x_3}, \]
and it is easy to see that the flag (computed at $x=0$) is
\[ \mathcal{D}^1 = \mathrm{span}\{\partial_{x_1}\},\qquad \mathcal{D}^2 = \mathrm{span}\{\partial_{x_1},\partial_{x_2}\},\qquad \mathcal{D}^3 = \mathrm{span}\{\partial_{x_1},\partial_{x_2},\partial_{x_3}\}. \]
Then this set of coordinates is linearly adapted to the flag but is not privileged, since $\nu(x_1\partial_{x_3}) = -2$, so that $f_1\notin\mathcal{D}^{(-1)}$.
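The commutators and weights in item (3) can be reproduced mechanically. The following is a minimal sketch, not part of the original text: it represents a polynomial as a dictionary from exponent tuples to coefficients and a vector field as a list of three such polynomials (the coefficients of $\partial_{x_1},\partial_{x_2},\partial_{x_3}$). All helper names are hypothetical.

```python
N = 3                 # coordinates x1, x2, x3
WEIGHTS = (1, 2, 3)   # nu(x_i) = i, as in item (3)

def p_add(p, q, sign=1):
    """Return p + sign*q, dropping zero coefficients."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + sign * c
    return {e: c for e, c in out.items() if c != 0}

def p_mul(p, q):
    """Product of two polynomials."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def p_diff(p, i):
    """Partial derivative of p with respect to the i-th variable."""
    return {e[:i] + (e[i] - 1,) + e[i + 1:]: c * e[i]
            for e, c in p.items() if e[i] > 0}

def apply(V, p):
    """Derivative of the polynomial p along the vector field V."""
    out = {}
    for i in range(N):
        out = p_add(out, p_mul(V[i], p_diff(p, i)))
    return out

def bracket(V, W):
    """Lie bracket: component k is V(W^k) - W(V^k)."""
    return [p_add(apply(V, W[k]), apply(W, V[k]), sign=-1) for k in range(N)]

def min_weight(V):
    """Lowest weight among the monomials of a nonzero field V."""
    return min(sum(a * w for a, w in zip(e, WEIGHTS)) - WEIGHTS[k]
               for k in range(N) for e in V[k])

ONE, X1, X2 = {(0, 0, 0): 1}, {(1, 0, 0): 1}, {(0, 1, 0): 1}
f1 = [ONE, {}, X1]   # d/dx1 + x1 d/dx3
f2 = [{}, X1, {}]    # x1 d/dx2
f3 = [{}, {}, X2]    # x2 d/dx3

print(bracket(f1, f2), bracket(f2, f3), bracket(bracket(f1, f2), f3))
print([min_weight(f) for f in (f1, f2, f3)])
```

In this representation a field belongs to $\mathcal{D}^{(-1)}$ exactly when `min_weight` is at least $-1$, so the value $-2$ returned for `f1` is the obstruction to these coordinates being privileged.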

Theorem. Let $M$ be a sub-Riemannian manifold and $q\in M$. There always exists a system of privileged coordinates near $q$.

We postpone the proof of this theorem to the end of this section, after having analyzed the meaning of privileged coordinates.

Theorem. Let $M$ be a sub-Riemannian manifold and $q\in M$. In privileged coordinates we have the following:

(i) $J^f_q = \{\sum_{i=1}^m t^i\xi_i,\ \xi_i\in\mathcal{D}^i_q\}$ and $\dim J^f_q = mk_1+(m-1)k_2+\dots+k_m$.
(ii) Let $j_1,j_2\in J^f_q$. Then $j_1\sim j_2$ if and only if $j_1-j_2 = \sum_{i=1}^m t^i\eta_i$, where $\eta_i\in\mathcal{D}^{i-1}_q$.

First part of the proof of Theorem 8.25. We start by proving the inclusion $J^f_q\subset\{\sum_{i=1}^m t^i\xi_i,\ \xi_i\in\mathcal{D}^i_q\}$. For any smooth variation $\gamma(t)$ we can write
\[ \gamma(t) = q\circ\overrightarrow{\exp}\int_0^\tau f_{u(t,s)}\,ds. \]
Taylor expansion leads to
\[ \gamma(t) = q + \sum_{j=1}^{i}\ \int\!\cdots\!\int_{0\le s_j\le\dots\le s_1\le\tau} q\circ f_{u(t,s_1)}\circ\dots\circ f_{u(t,s_j)}\,ds_1\dots ds_j + O(t^{i+1}). \]
Indeed, using the fact that $f$ is linear in $u$, we can factor out $t$ from every term since $u(0,s)=0$. If we want to compute our curve in privileged coordinates (to compute weights), it is sufficient to apply everything to the coordinate functions. In particular, since $f_u\in\mathcal{D}^{(-1)}$, we have that
\[ f_{u(t,s_1)}\circ\dots\circ f_{u(t,s_j)}\in\mathcal{D}^{(-i)}, \]
and applying this operator to a coordinate function $x_\alpha^\beta$, where $\alpha=1,\dots,m$ and $\beta=1,\dots,k_\alpha$, we have
\[ f_{u(t,s_1)}\circ\dots\circ f_{u(t,s_j)}\,x_\alpha^\beta\in\mathcal{D}^{(-i+\alpha)}, \]
because $\nu(x_\alpha^\beta)=\alpha$. Then, if $\alpha>i$, this function has positive weight. Thus, when evaluated at $x=0$, it is zero. In other words we proved that, for every $i=1,\dots,m$, up to the $i$-th term we can find only elements in $\mathcal{D}^i_q$.

To prove the converse inclusion we have to show that, given some elements $\xi_i\in\mathcal{D}^i_q$, we can find a smooth variation that has these vectors as elements of its jet. We start with some preliminary lemmas.

Lemma. Let $m,n$ be two integers. Assume that we have two flows
\[ P_t = \mathrm{Id}+t^nV+O(t^{n+1}),\qquad Q_t = \mathrm{Id}+t^mW+O(t^{m+1}). \]
Then
\[ P_t\circ Q_t\circ P_t^{-1}\circ Q_t^{-1} = \mathrm{Id}+t^{n+m}[V,W]+O(t^{n+m+1}). \]

Proof. Denoting by $V_t$ the nonautonomous vector field associated to $P_t$, it is easily checked that
\[ V_t = nt^{n-1}V+O(t^n). \]
Moreover for the inverse flows we have
\[ P_t^{-1} = \mathrm{Id}-t^nV+O(t^{n+1}),\qquad Q_t^{-1} = \mathrm{Id}-t^mW+O(t^{m+1}). \]
Define $R(t,s) := P_t\circ Q_s\circ P_t^{-1}\circ Q_s^{-1}$. Since $P_0=Q_0=\mathrm{Id}$, we have that $R$ is constant on the axes, i.e. $R(0,s) = R(t,0) = \mathrm{Id}$. Hence the only derivatives that enter in the expansion of $F(t) = R(t,t)$ are the mixed derivatives. This remark lets us expand the product $P_t\circ Q_s\circ P_t^{-1}\circ Q_s^{-1}$ and consider only terms with mixed powers of $t$ and $s$, to get
\[ (\mathrm{Id}+t^nV+O(t^{n+1}))(\mathrm{Id}+s^mW+O(s^{m+1}))(\mathrm{Id}-t^nV+O(t^{n+1}))(\mathrm{Id}-s^mW+O(s^{m+1})) \]
\[ = \mathrm{Id}+t^ns^m(VW-WV)+\dots = \mathrm{Id}+t^ns^m[V,W]+\dots \]
and the lemma is proved by setting $s=t$.

Lemma. For all $l\ge h$ and $i_1,\dots,i_h\in\{1,\dots,k\}$, there exists an admissible variation $u(t,s)$ such that
\[ q\circ\overrightarrow{\exp}\int_0^\tau f_{u(t,s)}\,ds = q+t^l[f_{i_1},\dots,[f_{i_{h-1}},f_{i_h}]](q)+O(t^{l+1}). \]

Proof. By induction.

- For all $l\ge1$ and $i=1,\dots,k$ there exists an admissible variation $u(t,s)$ such that
\[ q\circ\overrightarrow{\exp}\int_0^\tau f_{u(t,s)}\,ds = q+t^lf_i(q)+O(t^{l+1}). \]
It is sufficient to consider $u=(u_1,\dots,u_k)$ where $u_i=t^l$ and $u_h=0$ for all $h\ne i$.

- For all $l\ge2$ and $i,j=1,\dots,k$, there exists an admissible variation $u(t,s)$ such that
\[ q\circ\overrightarrow{\exp}\int_0^\tau f_{u(t,s)}\,ds = q+t^l[f_i,f_j](q)+O(t^{l+1}). \]
It is sufficient to use the previous lemma, where $P_t$ and $Q_t$ are the flows of the nonautonomous vector fields $V_t = t^{l-1}f_i$ and $W_t = tf_j$, respectively.

With analogous arguments we can prove the lemma by induction.

In other words we proved that every bracket monomial of degree $i$ can be presented as the $i$-th term of a jet of some admissible variation. Now we prove that we can do the same for any linear combination of such monomials (recall that $\mathcal{D}^i$ is the linear span of all brackets of order at most $i$).

Remark. The previous construction of $u(t,s)$ does not depend on the sub-Riemannian structure but only on the structure of the Lie bracket.

Lemma. Let $\pi=\pi(f_1,\dots,f_k)$ be a bracket polynomial of degree $\deg\pi\le l$. There exists an admissible variation $u(t,s)$ such that
\[ q\circ\overrightarrow{\exp}\int_0^\tau f_{u(t,s)}\,ds = q+t^l\pi(f_1,\dots,f_k)(q)+O(t^{l+1}). \]

Proof. Let $\pi(f_1,\dots,f_k) = \sum_jV_j(f_1,\dots,f_k)$, where the $V_j$ are monomials. By our previous argument we can find $u^j(t,s)$, $s\in[0,\tau_j]$, such that
\[ q\circ\overrightarrow{\exp}\int_0^{\tau_j}f_{u^j(t,s)}\,ds = q+t^lV_j(f_1,\dots,f_k)(q)+O(t^{l+1}). \]
Now consider the concatenation of controls $u(t,s)$, where $s\in[0,\tau]$ and $\tau=\sum_j\tau_j$, defined as follows:
\[ u(t,s) = u^j\Big(t,\ s-\sum_{i=1}^{j-1}\tau_i\Big),\qquad\text{if } \sum_{i=1}^{j-1}\tau_i\le s<\sum_{i=1}^{j}\tau_i. \]

Exercise 8.30. Complete the previous proof by showing that the flow relative to $u$ has $\sum_jV_j$ as its $l$-th term. Then prove, by rescaling, that any monomial of type $\alpha V$ can also be presented.

Now we can complete the proof of the first statement of Theorem 8.25 by proving the inclusion
\[ \Big\{\sum_{i=1}^mt^i\xi_i,\ \xi_i\in\mathcal{D}^i_q\Big\}\subset J^f_q. \]

Second part of the proof of Theorem 8.25. Let us consider an $m$-th jet $\sum_{i=1}^mt^i\xi_i$, $\xi_i\in\mathcal{D}^i_q$. We argue by induction.

- From the previous lemmas there exists an admissible variation $\gamma(t)$ such that
\[ \gamma(t) = q\circ\overrightarrow{\exp}\int_0^\tau f_{u(t,s)}\,ds,\qquad \dot\gamma(0)=\xi_1. \]
Then we will have $\gamma(t)\approx t\xi_1+t^2\eta_2$, where $\eta_2\in\mathcal{D}^2$ by the first part of the proof. We want to correct the second order term.

- From the previous lemma there exists an admissible variation $\gamma_1(t)$ such that
\[ \gamma_1(t) = q\circ\overrightarrow{\exp}\int_0^\tau f_{v(t,s)}\,ds,\qquad \gamma_1(t) = t^2(\xi_2-\eta_2)+o(t^2). \]
Defining $\gamma_2(t) = \gamma_1(t)\circ\gamma(t)$ we have
\[ \gamma_2(t)\approx t\xi_1+t^2\eta_2+t^2(\xi_2-\eta_2)+t^3\eta_3\approx t\xi_1+t^2\xi_2+t^3\eta_3, \]
where $\eta_3\in\mathcal{D}^3$.

At every step we can correct the next term of the jet in the same way, and this proves the inclusion.

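(ii) We have to prove that
\[ j\sim j'\ \Longleftrightarrow\ j-j' = \sum_{i=1}^mt^i\eta_i,\qquad \eta_i\in\mathcal{D}^{i-1}_q. \]

($\Rightarrow$). Assume that $j\sim j'$, where $j=J^m_q\gamma=\sum t^i\xi_i$ and $j'=J^m_q\gamma'=\sum t^i\xi'_i$. Then $\gamma'=\gamma\circ Q_t$ for some slow flow $Q_t$ in the subgroup of slow flows $\mathcal{P}_f^0$, of the form
\[ Q_t = Q_t^1\circ\dots\circ Q_t^h,\qquad Q_t^i = P_t^i\circ\overrightarrow{\exp}\int_0^\tau f_{tv_i(t,s)}\,ds\circ(P_t^i)^{-1}, \]
for some $P^i\in\mathcal{P}_f$, $i=1,\dots,h$. For simplicity we prove only the case $h=1$. By formula (6.19) we have that
\[ Q_t = P_t\circ\overrightarrow{\exp}\int_0^\tau f_{tv(t,s)}\,ds\circ P_t^{-1} = \overrightarrow{\exp}\int_0^\tau P_t\circ f_{tv(t,s)}\circ P_t^{-1}\,ds, \]
then by linearity of $f$ we have
\[ Q_t = \overrightarrow{\exp}\int_0^\tau t\,\mathrm{Ad}P_t\,f_{v(t,s)}\,ds. \]
Now recall that $P_t = \overrightarrow{\exp}\int_0^\tau f_{w(t,\theta)}\,d\theta$ for some admissible variation $w(t,\theta)$, and from (6.17) we get
\[ Q_t = \overrightarrow{\exp}\int_0^\tau t\,\Big(\overrightarrow{\exp}\int_0^\tau\mathrm{ad}f_{w(t,\theta)}\,d\theta\Big)f_{v(t,s)}\,ds. \]
Finally, if $\gamma(t) = q\circ\overrightarrow{\exp}\int_0^\tau f_{u(t,s)}\,ds$, we can write
\[ \gamma'(t) = q\circ\overrightarrow{\exp}\int_0^\tau f_{u(t,s)}\,ds\circ\overrightarrow{\exp}\int_0^\tau t\,\Big(\overrightarrow{\exp}\int_0^\tau\mathrm{ad}f_{w(t,\theta)}\,d\theta\Big)f_{v(t,s)}\,ds. \]
Expanding with respect to $t$ we find
\[ Q_t\approx\mathrm{Id}+t\sum_it^iV_i = \mathrm{Id}+\sum_it^{i+1}V_i, \]
where $V_i$ is a bracket polynomial of degree $i$. Due to the presence of the extra factor $t$, it is easy to see that in the expansion of $\gamma'$ we find the same terms as in $\gamma$, plus terms that belong to $\mathcal{D}^{i-1}$.

($\Leftarrow$). Assume now that $j=J^m_q\gamma=\sum t^i\xi_i$ and $j'=J^m_q\gamma'=\sum t^i\xi'_i$, with
\[ j-j' = \sum_{i=1}^mt^i\eta_i,\qquad \eta_i\in\mathcal{D}^{i-1}_q. \]
We need to find a slow flow $Q_t$ such that $\gamma'=\gamma\circ Q_t$. In other words, it is sufficient to prove that we can realize with a slow flow every jet of type $\sum_{i=1}^mt^i\eta_i$, $\eta_i\in\mathcal{D}^{i-1}_q$. To this purpose we can repeat the arguments of the proof of part (i), using the following lemma.

Lemma. Let $P_t,Q_t$ be two flows with $P_t\in\mathcal{P}_f$ and $Q_t\in\mathcal{P}_f^0$ (or $P_t\in\mathcal{P}_f^0$ and $Q_t\in\mathcal{P}_f$). Then $P_t\circ Q_t\circ P_t^{-1}\circ Q_t^{-1}\in\mathcal{P}_f^0$.

Proof. If $Q_t\in\mathcal{P}_f^0$ then $Q_t^{-1}\in\mathcal{P}_f^0$. Moreover, from the definition of $\mathcal{P}_f^0$ we have that $P_t\circ Q_t\circ P_t^{-1}\in\mathcal{P}_f^0$. Hence their composition is also in $\mathcal{P}_f^0$.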

Corollary. In privileged coordinates $(x_1,\dots,x_m)$ defined by the splitting $\mathbb{R}^n=\mathbb{R}^{k_1}\oplus\dots\oplus\mathbb{R}^{k_m}$ we have
\[ J^f_q = \left\{\begin{pmatrix} tx_1+O(t^2)\\ t^2x_2+O(t^3)\\ \vdots\\ t^mx_m\end{pmatrix},\ x_i\in\mathbb{R}^{k_i},\ i=1,\dots,m\right\}. \]

Proof. Indeed we know that $\mathcal{D}^i=\mathbb{R}^{k_1}\oplus\dots\oplus\mathbb{R}^{k_i}$, and writing $\xi_i = x_{i,1}+\dots+x_{i,i}$ with $x_{i,j}\in\mathbb{R}^{k_j}$ we have, expanding and collecting terms,
\[ \sum_{i=1}^mt^i\xi_i = t\xi_1+t^2\xi_2+\dots+t^m\xi_m = tx_{1,1}+t^2(x_{2,1}+x_{2,2})+\dots+t^m(x_{m,1}+\dots+x_{m,m}) \]
\[ = \big(tx_{1,1}+t^2x_{2,1}+\dots+t^mx_{m,1},\ t^2x_{2,2}+\dots+t^mx_{m,2},\ \dots,\ t^mx_{m,m}\big). \]

Corollary. The nonholonomic tangent space $T^f_q$ is a smooth manifold of dimension $\dim T^f_q = \sum_{i=1}^{m(q)}k_i(q)$. In privileged coordinates we can write
\[ T^f_q = \left\{\begin{pmatrix} tx_1\\ t^2x_2\\ \vdots\\ t^mx_m\end{pmatrix},\ x_i\in\mathbb{R}^{k_i},\ i=1,\dots,m\right\}, \]
and the dilations $\delta_\alpha$ act on $T^f_q$ in a quasi-homogeneous way:
\[ \delta_\alpha(tx_1,\dots,t^mx_m) = (\alpha tx_1,\dots,\alpha^mt^mx_m),\qquad \alpha>0. \]

Proof. It follows directly from the representation of the equivalence relation. Indeed two elements $j$ and $j'$ can be written in coordinates as
\[ j = (tx_1+O(t^2),\ t^2x_2+O(t^3),\ \dots,\ t^mx_m),\qquad j' = (ty_1+O(t^2),\ t^2y_2+O(t^3),\ \dots,\ t^my_m), \]
and $j\sim j'$ if and only if $x_i=y_i$ for all $i$.

Remark. Notice that a polynomial differential operator that is homogeneous with respect to $\nu$ (i.e. whose monomials all have the same weight) is homogeneous with respect to the dilations $\delta_t:\mathbb{R}^n\to\mathbb{R}^n$ defined by
\[ \delta_t(x_1,\dots,x_m) = (tx_1,t^2x_2,\dots,t^mx_m),\qquad t>0. \tag{8.9} \]
In particular for a homogeneous vector field $X$ of weight $h$ it holds that $\delta_{t*}X = t^{-h}X$.

Now we can improve Proposition 8.20 and see that the jet of a horizontal vector field is actually a vector field on the tangent space and belongs to $\mathcal{D}^{(-1)}$ (in privileged coordinates).
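The scaling rule $\delta_{t*}X = t^{-h}X$ for a homogeneous field of weight $h$ is easy to test numerically. The sketch below is an illustration under assumptions that are not part of the remark above: it takes $\mathbb{R}^3$ with weights $(1,1,2)$ and the field $X=\partial_{x_2}+x_1\partial_{x_3}$, which is homogeneous of weight $-1$, so the pushforward should equal $tX$. The helper names are hypothetical.

```python
from fractions import Fraction

WEIGHTS = (1, 1, 2)   # assumed weights of (x1, x2, x3)

def dilate(t, x):
    """delta_t(x): multiply the i-th coordinate by t**w_i."""
    return tuple(t**w * xi for w, xi in zip(WEIGHTS, x))

def X(x):
    """Assumed homogeneous field of weight -1: d/dx2 + x1 d/dx3."""
    return (Fraction(0), Fraction(1), x[0])

def pushforward(t, field, y):
    """(delta_t)_* field at y: the differential of delta_t (diagonal,
    entries t**w_k) applied to field evaluated at delta_t^{-1}(y)."""
    v = field(dilate(1 / t, y))
    return tuple(t**w * vk for w, vk in zip(WEIGHTS, v))

t = Fraction(3)
y = (Fraction(5), Fraction(7), Fraction(11))
lhs = pushforward(t, X, y)
rhs = tuple(t * vk for vk in X(y))   # t**(-h) * X(y) with h = -1
print(lhs, rhs)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.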

Lemma. Fix a set of privileged coordinates. Let $V\in\mathcal{D}^{(-1)}$; then the jet $J^m_qV$ is tangent to the submanifold $J^f_q$. Moreover it is well defined as a vector field $\hat V$ on the nonholonomic tangent space. In other words $\hat V\in\mathrm{Vec}(T^f_q)$ and we have
\[ V = \begin{pmatrix} v_1(x)\\ v_2(x)\\ \vdots\\ v_m(x)\end{pmatrix}\ \Longrightarrow\ \hat V = \begin{pmatrix}\hat v_1(x)\\ \hat v_2(x)\\ \vdots\\ \hat v_m(x)\end{pmatrix}, \tag{8.10} \]
where $\hat v_i$ is the term of order $i-1$ of $v_i$.

Proof. Let $V\in\mathcal{D}^{(-1)}$ and let $\gamma(t)$ be an admissible variation. When expressed in coordinates we have (see...)
\[ V = \begin{pmatrix} v_1(x)\\ v_2(x)\\ \vdots\\ v_m(x)\end{pmatrix},\qquad \gamma(t) = \begin{pmatrix} tx_1+O(t^2)\\ t^2x_2+O(t^3)\\ \vdots\\ t^mx_m\end{pmatrix}. \]
We know that $(J^m_qV)(J^m_q\gamma)$ is expressed as the $m$-th jet of $tV(\gamma(t))$ by Exercise... Hence we compute
\[ (J^m_qV)(J^m_q\gamma) = \begin{pmatrix} tv_1(tx_1+O(t^2),\dots,t^mx_m)\\ tv_2(tx_1+O(t^2),\dots,t^mx_m)\\ \vdots\\ tv_m(tx_1+O(t^2),\dots,t^mx_m)\end{pmatrix}. \tag{8.11} \]
Notice that $V\in\mathcal{D}^{(-1)}$ means exactly that
\[ V = \sum_iv_i(x)\frac{\partial}{\partial x_i} = \sum_{i,j}v_i^j(x)\frac{\partial}{\partial x_i^j},\qquad \nu\Big(\frac{\partial}{\partial x_i^j}\Big) = -i, \]
and $v_i$ is a function of order at least $i-1$. Let us denote by $\hat v_i$ the homogeneous part of $v_i$ of order $i-1$. To compute the value of $\hat V$ we have to restrict its action to admissible variations from $T^f_q$, then evaluate and neglect the higher order part (which corresponds to the projection on the factor space), in order to have
\[ v_i(tx_1,\dots,t^mx_m) = t^{i-1}\hat v_i(x_1,\dots,x_m)+O(t^i), \]
and using equality (8.11) we have
\[ (J^m_qV)\big|_{T^f_q} = \begin{pmatrix} tv_1(tx_1,\dots,t^mx_m)\\ tv_2(tx_1,\dots,t^mx_m)\\ \vdots\\ tv_m(tx_1,\dots,t^mx_m)\end{pmatrix} = \begin{pmatrix} t\hat v_1+O(t^2)\\ t^2\hat v_2+O(t^3)\\ \vdots\\ t^m\hat v_m+O(t^{m+1})\end{pmatrix}. \tag{8.12} \]
From this (8.10) easily follows.

Remark. Notice that, since $\hat v_i$ is a homogeneous function of weight $i-1$, it depends only on the variables $x_1,\dots,x_{i-1}$, whose weight is smaller than or equal to its weight. Hence $\hat V$ has the following triangular form:
\[ \hat V(x) = \begin{pmatrix}\hat v_1\\ \hat v_2(x_1)\\ \vdots\\ \hat v_m(x_1,\dots,x_{m-1})\end{pmatrix}. \tag{8.13} \]
Moreover the flow of a vector field of this kind can be easily computed by a step by step substitution.

Now we prove the existence of privileged coordinates.

Proof of Theorem. Consider our sub-Riemannian structure on $M$ defined by the orthonormal frame $\{f_1,\dots,f_k\}$ and its flag
\[ \mathcal{D}^1_q\subset\mathcal{D}^2_q\subset\dots\subset\mathcal{D}^m_q = T_qM,\qquad n_j := \dim\mathcal{D}^j_q\quad (n_j = k_1+\dots+k_j). \]
Let us consider a basis $\{V_1,\dots,V_n\}$ of the tangent space adapted to the flag, i.e.
\[ V_i = \pi_i(f_1,\dots,f_k),\quad \pi_i \text{ bracket polynomial},\quad \deg\pi_i\le j \text{ if } i\le n_j, \]
\[ \mathcal{D}^j_q = \mathrm{span}\{V_1(q),\dots,V_{n_j}(q)\},\qquad j=1,\dots,m. \]
In particular $V_1,\dots,V_{n_1}$ are selected in $\{f_i,\ i=1,\dots,k\}$, $V_{n_1+1},\dots,V_{n_2}$ are selected from $\{[f_i,f_j],\ i,j=1,\dots,k\}$, and so on. Define the map
\[ \Psi:(s_1,\dots,s_n)\mapsto q\circ e^{s_1V_1}\circ\dots\circ e^{s_nV_n}. \tag{8.14} \]
We want to show that $\Psi^{-1}$ defines privileged coordinates around $q$. It is easy to show that (8.14) is a local diffeomorphism, since
\[ \frac{\partial\Psi}{\partial s_i}\Big|_{s=0} = \Psi_*\frac{\partial}{\partial s_i}\Big|_{s=0} = V_i(q),\qquad i=1,\dots,n. \tag{8.15} \]
Hence it remains to show that

(i) $\Psi^{-1}_*(\mathcal{D}^i_q) = \mathrm{span}\{\frac{\partial}{\partial s_1},\dots,\frac{\partial}{\partial s_{n_i}}\}$,
(ii) $\Psi^{-1}_*f_i\in\mathcal{D}^{(-1)}$ for every $i=1,\dots,k$.

Part (i) easily follows from our choice of a frame adapted to the flag and from (8.15). On the other hand the second part is not trivial, since we need to compute the differential of $\Psi$ at every point and not only at $s=0$.

Remark. In what follows we consider on $T_qM$ the weight defined by the coordinates $(y_1,\dots,y_n)$ induced by the flag. In other words we consider the basis $V_1(q),\dots,V_n(q)$ in $T_qM$ and write
\[ v = (y_1,\dots,y_n) = \sum_iy_iV_i(q),\qquad \nu(y_i) := w_i = j \ \text{ if }\ n_{j-1}<i\le n_j. \]
Moreover we can think of $v\in T_qM$ as the constant vector field on $T_qM$ identically equal to $v$. In this way it makes sense to consider the value of a bracket polynomial $\pi(f_1,\dots,f_k)$ at the point $q$ and to consider its weight $\nu(\pi)$.
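The bookkeeping that links the growth vector $(n_1,\dots,n_m)$, the block sizes $k_j = n_j-n_{j-1}$ and the weights $w_i$ can be written out in a few lines. This is a minimal sketch with hypothetical function names; the sample growth vectors $(2,3)$ and $(2,2,3)$ are the ones of the Heisenberg and Martinet examples treated later in this section, and the last function uses the dimension formula $\dim J^f_q = mk_1+(m-1)k_2+\dots+k_m$ stated above.

```python
def graded_dims(growth):
    """k_j = n_j - n_{j-1}, with the convention n_0 = 0."""
    dims, prev = [], 0
    for n in growth:
        dims.append(n - prev)
        prev = n
    return dims

def weights(growth):
    """w_i = j whenever n_{j-1} < i <= n_j."""
    w = []
    for j, k in enumerate(graded_dims(growth), start=1):
        w.extend([j] * k)
    return w

def jet_dimension(growth):
    """dim J^f_q = m*k_1 + (m-1)*k_2 + ... + k_m."""
    m = len(growth)
    return sum((m - j) * k for j, k in enumerate(graded_dims(growth)))

for growth in [(2, 3), (2, 2, 3)]:
    print(growth, graded_dims(growth), weights(growth), jet_dimension(growth))
```

A zero block, as in the growth vector $(2,2,3)$, simply means that no coordinate carries the corresponding weight.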

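We prove the following auxiliary lemma.

Lemma. Let $X = \pi(f_1,\dots,f_k)(q)\in\mathrm{Vec}(T_qM)$, with $\nu(X)\ge-d$. Consider now the polynomial vector field on $T_qM$
\[ Y(y) = \sum y_{i_l}\cdots y_{i_1}\,(\mathrm{ad}V_{i_l}\circ\dots\circ\mathrm{ad}V_{i_1}X)(q) = \sum_ip_i(y)V_i(q) \tag{8.16} \]
for some polynomials $p_i$. Then $p_i\in\mathcal{D}^{(w_i-d)}$.

Proof of Lemma. It easily follows from the definition of weights that
\[ \mathrm{ad}V_{i_l}\circ\dots\circ\mathrm{ad}V_{i_1}(X)\in\mathcal{D}^{(-\sum_jw_{i_j}-d)}, \]
hence every summand of (8.16) belongs to $\mathcal{D}^{(-d)}$. Then, if we rewrite the sum in terms of the basis $V_i(q)$, $i=1,\dots,n$, every coefficient $p_i(y)$ must belong to $\mathcal{D}^{(w_i-d)}$, since $\nu(V_i(q)) = -w_i$.

Now we prove the following claim: for every bracket polynomial $X = \pi(f_1,\dots,f_k)$ we have $\Psi^{-1}_*X\in\mathcal{D}^{(-d)}$. In particular part (ii) will follow when $d=1$. Clearly we can write in coordinates
\[ \Psi^{-1}_*X = \sum_{i=1}^na_i(s)\frac{\partial}{\partial s_i}, \tag{8.17} \]
and our claim is equivalent to showing that $a_i\in\mathcal{D}^{(w_i-d)}$. First we notice that
\[ \frac{\partial\Psi}{\partial s_i} = \frac{\partial}{\partial\varepsilon}\Big|_{\varepsilon=0}q\circ e^{s_1V_1}\circ\dots\circ e^{(s_i+\varepsilon)V_i}\circ\dots\circ e^{s_nV_n} \]
\[ = q\circ e^{s_1V_1}\circ\dots\circ e^{s_iV_i}\circ V_i\circ e^{s_{i+1}V_{i+1}}\circ\dots\circ e^{s_nV_n} \]
\[ = \underbrace{q\circ e^{s_1V_1}\circ\dots\circ e^{s_nV_n}}_{\Psi(s)}\circ e^{-s_nV_n}\circ\dots\circ e^{-s_{i+1}V_{i+1}}\circ V_i\circ e^{s_{i+1}V_{i+1}}\circ\dots\circ e^{s_nV_n}. \]
In geometric notation we can write
\[ \Psi_*\frac{\partial}{\partial s_i} = e^{s_nV_n}_*\cdots e^{s_{i+1}V_{i+1}}_*V_i\Big|_{\Psi(s)}. \tag{8.18} \]
Remember that, as operators on functions, $e^{tY}_* = e^{-t\,\mathrm{ad}Y}$. This implies that in (8.18) we have a series of bracket polynomials. Applying $\Psi_*$ to (8.17) we get
\[ X\big|_{\Psi(s)} = \sum_{i=1}^na_i(s)\,e^{s_nV_n}_*\cdots e^{s_{i+1}V_{i+1}}_*V_i\Big|_{\Psi(s)}. \]
Now we apply $e^{-s_1V_1}_*\cdots e^{-s_nV_n}_*$ to both sides, to compute the vector field at the point $q$:
\[ e^{-s_1V_1}_*\cdots e^{-s_nV_n}_*X\Big|_q = \sum_{i=1}^na_i(s)\,e^{-s_1V_1}_*\cdots e^{-s_{i-1}V_{i-1}}_*V_i\Big|_q. \tag{8.19} \]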

Rewriting this identity in coordinates we get
\[ \sum_ib_i(s)V_i(q) = \sum_{i,j}a_i(s)\big(\varphi_{ij}(s)V_j(q)+V_i(q)\big), \tag{8.20} \]
where $\varphi_{ij}(0)=0$. Indeed we split off the zero order term, since we know that for $s=0$ the pushforward of the vector fields is exactly $V_i$. Using the Lemma above with $X$ and $V_i$, $i=1,\dots,n$, we have
\[ b_i\in\mathcal{D}^{(w_i-d)},\qquad \varphi_{ij}\in\mathcal{D}^{(w_j-w_i)}. \]
On the other hand we can rewrite the relation between the coefficients as
\[ B(s) = A(s)(\Phi(s)+I), \]
where we denote $B(s) = (b_1(s),\dots,b_n(s))$, $A(s) = (a_1(s),\dots,a_n(s))$ and $\Phi(s) = (\varphi_{ij})_{ij}$. Thus we get
\[ A(s) = B(s)(I+\Phi(s))^{-1} = B(s)(I-\Phi(s)+\Phi(s)^2-\dots) = B(s)-(B\Phi)(s)+(B\Phi^2)(s)-\dots \]
and we can finish the proof by noticing that
\[ (B)_i = b_i\in\mathcal{D}^{(w_i-d)},\qquad (B\Phi)_i = \sum_jb_j\varphi_{ji}\in\mathcal{D}^{(w_j-d+(w_i-w_j))} = \mathcal{D}^{(w_i-d)}, \]
and so on. Hence we get $a_i\in\mathcal{D}^{(w_i-d)}$.

Remark. One can repeat all the calculations in chronological notation and recover the proof in a purely algebraic way. In the above computations nothing changes if we consider any permutation $\sigma = (i_1,\dots,i_n)$ of $(1,\dots,n)$ and the coordinate map
\[ \Psi_\sigma:(s_1,\dots,s_n)\mapsto q\circ e^{s_{i_n}V_{i_n}}\circ\dots\circ e^{s_{i_1}V_{i_1}}. \]
In particular we can consider the coordinate map
\[ \Phi:(x_1,\dots,x_n)\mapsto q\circ e^{x_nV_n}\circ\dots\circ e^{x_1V_1}, \]
and it is easy to see that it satisfies
\[ \Phi^{-1}_*V_1 = \partial_{x_1},\qquad \Phi^{-1}_*V_2\big|_{x_1=0} = \partial_{x_2},\qquad\dots,\qquad \Phi^{-1}_*V_i\big|_{x_1=\dots=x_{i-1}=0} = \partial_{x_i}, \tag{8.21} \]
for $i=1,\dots,n_1$, the set of vector fields among $f_1,\dots,f_k$ that generates $\mathcal{D}_q$.

In Riemannian geometry the tangent space depends only on the dimension of the manifold (i.e. all tangent spaces to an $n$-dimensional manifold are isometric). Now we can prove that in sub-Riemannian geometry this is not true. Indeed we see that, even in dimension 3, we can have non-isometric tangent spaces, depending on the growth vector $(n_1,\dots,n_m)$. In higher dimension it is also possible to prove that, for a fixed growth vector, we have non-isometric tangent spaces depending on the point of the manifold.

Example 8.40 (Heisenberg). Assume $n=3$ and that the growth vector is $(2,3)$. Then we consider coordinates $(x_1,x_2,x_3)$ and weights $(w_1,w_2,w_3) = (1,1,2)$. We can assume that
\[ V_1 = f_1,\qquad V_2 = f_2,\qquad V_3 = [f_1,f_2]. \]
From the last Remark we have that, in privileged coordinates, we can assume
\[ f_1 = \partial_{x_1},\qquad f_2 = \partial_{x_2}+\alpha x_1\partial_{x_3},\qquad \alpha\in\mathbb{R}, \tag{8.22} \]
because $f_i = \partial_{x_i}$ plus something that has weight $-1$ and depends only on $\partial_{x_j}$, $j>n_1$. On the other hand from (8.21) we have
\[ [f_1,f_2] = \partial_{x_3}\ \Longrightarrow\ \alpha=1, \]
and we get the Heisenberg algebra
\[ f_1 = \partial_{x_1},\qquad f_2 = \partial_{x_2}+x_1\partial_{x_3},\qquad f_3 = \partial_{x_3}. \tag{8.23} \]

Example (Martinet). Assume $n=3$ and that the growth vector is $(2,2,3)$. Then we consider coordinates $(x_1,x_2,x_3)$ and weights $(w_1,w_2,w_3) = (1,1,3)$. We can assume, up to a change of indices, that
\[ V_1 = f_1,\qquad V_2 = f_2,\qquad V_3 = [f_1,[f_1,f_2]]. \]
From the last Remark we have that, in privileged coordinates, we can write
\[ f_1 = \partial_{x_1},\qquad f_2 = \partial_{x_2}+(\alpha x_1^2+\beta x_1x_2)\partial_{x_3},\qquad \alpha,\beta\in\mathbb{R}, \tag{8.24} \]
since we assume $f_2|_{x_1=0} = \partial_{x_2}$, which implies $f_2 = \partial_{x_2}+x_1a(x)\partial_{x_3}$; but $\nu(f_2) = -1$, and so (8.24) follows. From $V_3|_{x=0} = \partial_{x_3}$ we have
\[ [f_1,[f_1,f_2]] = 2\alpha\,\partial_{x_3}\ \Longrightarrow\ \alpha = 1/2. \]
Moreover, since we are interested in normalizing the sub-Riemannian structure and not only the pair of vector fields, we consider rotations of the orthonormal frame.

Remark. Notice that
\[ \tilde f_1 = \cos\theta\,f_1-\sin\theta\,f_2,\quad \tilde f_2 = \sin\theta\,f_1+\cos\theta\,f_2\ \Longrightarrow\ [\tilde f_1,\tilde f_2] = [f_1,f_2]. \]
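The two bracket computations that fix the Martinet normal form can be verified mechanically by treating $\alpha$ and $\beta$ as formal parameters. The sketch below is illustrative and its helper names are hypothetical: polynomials live in the five symbols $(x_1,x_2,x_3,\alpha,\beta)$, stored as dictionaries from exponent tuples to coefficients, and only the three space variables are ever differentiated.

```python
NV = 3   # number of space variables; alpha and beta are never differentiated

def p_add(p, q, sign=1):
    """Return p + sign*q, dropping zero coefficients."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + sign * c
    return {e: c for e, c in out.items() if c != 0}

def p_mul(p, q):
    """Product of two polynomials."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def p_diff(p, i):
    """Partial derivative with respect to the i-th space variable."""
    return {e[:i] + (e[i] - 1,) + e[i + 1:]: c * e[i]
            for e, c in p.items() if e[i] > 0}

def apply(V, p):
    """Derivative of the polynomial p along the vector field V."""
    out = {}
    for i in range(NV):
        out = p_add(out, p_mul(V[i], p_diff(p, i)))
    return out

def bracket(V, W):
    """Lie bracket: component k is V(W^k) - W(V^k)."""
    return [p_add(apply(V, W[k]), apply(W, V[k]), sign=-1) for k in range(NV)]

# Exponent order: (x1, x2, x3, alpha, beta).
ONE = {(0, 0, 0, 0, 0): 1}
f1 = [ONE, {}, {}]                                       # d/dx1
f2 = [{}, ONE, {(2, 0, 0, 1, 0): 1, (1, 1, 0, 0, 1): 1}]  # (8.24)

f12 = bracket(f1, f2)
print(bracket(f1, f12))   # coefficient of d/dx3 is 2*alpha
print(bracket(f2, f12))   # coefficient of d/dx3 is beta
```

The first bracket gives $2\alpha\,\partial_{x_3}$, which is the normalization $\alpha=1/2$; the second gives $\beta\,\partial_{x_3}$, so requiring $[f_2,[f_1,f_2]]=0$ amounts to $\beta=0$.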

Thus, denoting as usual $f_u = u_1f_1+u_2f_2$, we can consider the linear map
\[ \varphi: u\mapsto[f_u,[f_1,f_2]]/\mathcal{D}, \]
which vanishes on some line in the plane $\mathcal{D} = \mathrm{span}\{f_1,f_2\}$. Up to a rotation of the frame we can assume that $f_2\in\ker\varphi$, so that $[f_2,[f_1,f_2]] = 0$, hence $\beta=0$:
\[ f_1 = \partial_{x_1},\qquad f_2 = \partial_{x_2}+\frac{x_1^2}{2}\partial_{x_3},\qquad f_3 = \partial_{x_3}. \tag{8.25} \]

8.4 Geometric meaning

In the previous section we found how $\hat V$ is analytically recovered from $V$: it is nothing else but the principal part of $V$ in privileged coordinates. Now we want to discuss in which sense $\hat V$ is an approximation of $V$. It turns out that in this nonholonomic setting it plays the same role that the linearization of a vector field plays in the Euclidean case.

Lemma. Let $V\in\mathcal{D}^{(-1)}$ be a vector field. In privileged coordinates we have the equality
\[ \varepsilon\,\delta_{\frac1\varepsilon*}V = \hat V+\varepsilon W_\varepsilon, \]
where $W_\varepsilon$ is smooth.

Proof. Write $V = \hat V+W$; applying the dilation we find
\[ \delta_{\frac1\varepsilon*}V = \delta_{\frac1\varepsilon*}\hat V+\delta_{\frac1\varepsilon*}W. \]
Since $\hat V$ is homogeneous of degree $-1$ we have $\delta_{\frac1\varepsilon*}\hat V = \frac1\varepsilon\hat V$, and setting $W_\varepsilon = \delta_{\frac1\varepsilon*}W$ we are done.

Remark. Geometrically this procedure means that we consider a small neighborhood of the point $q$ and we make a dilation. Then we properly rescale in order to catch the principal term. This is a blow-up procedure. Notice that we are blowing up in a nonisotropic way, and the procedure contains information about the local structure of the bracket.

Now we can give a very precise meaning to the fact that the nilpotent approximation is the principal part of the sub-Riemannian structure, which encodes the local geometry near the point $q$. Let us consider the end-point map
\[ F:\mathcal{U}\to M,\qquad u(\cdot)\mapsto q\circ\overrightarrow{\exp}\int_0^1f_{u(t)}\,dt, \]
where $\mathcal{U} = L^2_k(0,1) = L^2([0,1],\mathbb{R}^k)$ is the set of admissible controls. Let us denote by $\rho$ the sub-Riemannian distance from the fixed point $q$:
\[ \rho(x) := d(x,q) = \inf\{\|u\|,\ F(u)=x\}. \tag{8.26} \]
From Lemma 8.43 we can write, for $\varepsilon>0$,
\[ f^\varepsilon_u := \varepsilon\,\delta_{\frac1\varepsilon*}f_u = \hat f_u+\varepsilon W^\varepsilon_u. \]
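The blow-up identity $\varepsilon\,\delta_{\frac1\varepsilon*}V = \hat V+\varepsilon W_\varepsilon$ can be tested on a concrete field. The sketch below is an illustration under assumptions that do not come from the text: weights $(1,1,2)$ on $\mathbb{R}^3$ and the field $V = \partial_{x_2}+(x_1+x_1x_2)\partial_{x_3}$, whose principal part is $\hat V = \partial_{x_2}+x_1\partial_{x_3}$ and whose remainder $W = x_1x_2\partial_{x_3}$ has weight $0$, so that here $W_\varepsilon = W$ for every $\varepsilon$. The function names are hypothetical.

```python
from fractions import Fraction

WEIGHTS = (1, 1, 2)   # assumed weights of (x1, x2, x3)

def dilate(t, x):
    """delta_t(x): multiply the i-th coordinate by t**w_i."""
    return tuple(t**w * xi for w, xi in zip(WEIGHTS, x))

def V(x):
    """Assumed field: d/dx2 + (x1 + x1*x2) d/dx3."""
    return (Fraction(0), Fraction(1), x[0] + x[0] * x[1])

def V_hat(x):
    """Principal part (weight -1): d/dx2 + x1 d/dx3."""
    return (Fraction(0), Fraction(1), x[0])

def W(x):
    """Remainder (weight 0): x1*x2 d/dx3."""
    return (Fraction(0), Fraction(0), x[0] * x[1])

def rescaled(eps, field, y):
    """eps * (delta_{1/eps})_* field at y: the differential of delta_{1/eps}
    has diagonal entries eps**(-w_k), and the field is evaluated at delta_eps(y)."""
    v = field(dilate(eps, y))
    return tuple(eps * eps**(-w) * vk for w, vk in zip(WEIGHTS, v))

eps = Fraction(1, 10)
y = (Fraction(2), Fraction(3), Fraction(5))
lhs = rescaled(eps, V, y)
rhs = tuple(a + eps * b for a, b in zip(V_hat(y), W(y)))
print(lhs, rhs)
```

Letting `eps` tend to zero in this example leaves exactly `V_hat(y)`, which is the sense in which the rescaled field converges to its nilpotent approximation.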

Denote now by $f^\varepsilon$ and $\hat f$ the corresponding sub-Riemannian structures on $\mathbb{R}^n$, and by $d^\varepsilon$ and $\hat d$ the associated sub-Riemannian distances. Notice that, from the very definition of $d^\varepsilon$, we have
\[ d^\varepsilon(x,y) = \frac1\varepsilon\,d(\delta_\varepsilon(x),\delta_\varepsilon(y)), \]
which says that $d^\varepsilon$ is $d$ when we look infinitesimally near the point $q$ and rescale. Let $\rho^\varepsilon$, $\hat\rho$ and $F^\varepsilon$, $\hat F$ have analogous meaning. We start with an auxiliary proposition.

Proposition. $F^\varepsilon\to\hat F$ uniformly on balls in $L^2_k(0,1)$ (actually in the $C^\infty$ sense).

Proof. Consider the solutions $x^\varepsilon(t)$ and $\hat x(t)$ of the two systems based at $q=0$:
\[ \dot{\hat x}(t) = \hat f_{u(t)}(\hat x(t)),\qquad \dot x^\varepsilon(t) = f^\varepsilon_{u(t)}(x^\varepsilon(t)). \]
Using Lemma 8.43 we rewrite the second equation as
\[ \dot x^\varepsilon(t) = \hat f_{u(t)}(x^\varepsilon(t))+\varepsilon W^\varepsilon_{u(t)}(x^\varepsilon(t)), \]
and standard estimates from ODE theory prove that $x^\varepsilon\to\hat x$. Notice that, since nilpotent vector fields are complete, the solution $\hat x(t)$ is defined for all $t\in\mathbb{R}$.

Lemma. $\{\rho^\varepsilon\}_{\varepsilon>0}$ is an equicontinuous family.

Proof. We will prove the following: for every compact $K\subset\mathbb{R}^n$ there exist $\varepsilon_0,C>0$, depending on $K$, such that
\[ d^\varepsilon(x,y)\le C|x-y|^{1/m},\qquad \forall\,\varepsilon<\varepsilon_0,\ \forall\,x,y\in K, \tag{8.27} \]
where $m$ is the degree of nonholonomy. Notice that from (8.27) we get, using the triangle inequality,
\[ |\rho^\varepsilon(x)-\rho^\varepsilon(y)| = |d^\varepsilon(0,x)-d^\varepsilon(0,y)|\le d^\varepsilon(x,y)\le C|x-y|^{1/m}, \]
which proves the lemma. We are then reduced to proving (8.27). The idea is to cover a fixed neighborhood of the origin using controls with bounded norms, uniformly in $\varepsilon$.

Let $\hat V_1,\dots,\hat V_n$ be an adapted basis of the nilpotent system $\hat f$, such that $\hat V_i = \pi_i(\hat f_1,\dots,\hat f_k)$ for some bracket polynomials $\pi_i$, $i=1,\dots,n$. From the very definition we have
\[ \hat V_1(0)\wedge\dots\wedge\hat V_n(0)\ne0. \]
On the other hand, by continuity, this implies that they are linearly independent also in a small neighborhood of the origin, and by quasi-homogeneity we get
\[ \hat V_1(x)\wedge\dots\wedge\hat V_n(x)\ne0,\qquad \forall\,x\in\mathbb{R}^n. \]
Let $V_i^\varepsilon = \pi_i(f_1^\varepsilon,\dots,f_k^\varepsilon)$ denote the vector fields defined by the same bracket polynomials but in terms of the vector fields of the approximating system. For every compact $K\subset\mathbb{R}^n$ there exists $\varepsilon_0=\varepsilon_0(K)$ such that
\[ V_1^\varepsilon(x)\wedge\dots\wedge V_n^\varepsilon(x)\ne0,\qquad \forall\,x\in K,\ \forall\,\varepsilon\le\varepsilon_0. \]

Recall that by Lemma 8.29, given a bracket polynomial $\pi_i(g_1,\dots,g_k)$ with $\deg\pi_i = w_i$, there exists an admissible variation $u_i(t,s)$, depending only on $\pi_i$, such that
\[ \overrightarrow{\exp}\int_0^1g_{u_i(t,s)}\,ds = \mathrm{Id}+t^{w_i}\pi_i(g_1,\dots,g_k)+O(t^{w_i+1}). \]
If we apply this lemma for $g_i = f_i^\varepsilon$ we find $u_i(t,s)$ such that
\[ \overrightarrow{\exp}\int_0^1f^\varepsilon_{u_i(t,s)}\,ds = \mathrm{Id}+t^{w_i}V_i^\varepsilon+O(t^{w_i+1}),\qquad \forall\,\varepsilon>0, \]
where $w_i = \deg\hat V_i = \deg V_i^\varepsilon$. Now consider the map
\[ \Phi^\varepsilon(t_1,\dots,t_n,x) = x\circ\overrightarrow{\exp}\int_0^1f^\varepsilon_{u_1(t_1^{1/w_1},s)}\,ds\circ\dots\circ\overrightarrow{\exp}\int_0^1f^\varepsilon_{u_n(t_n^{1/w_n},s)}\,ds. \tag{8.28} \]

Remark. We have the expansion
\[ x\circ\overrightarrow{\exp}\int_0^1f^\varepsilon_{u_i(t_i^{1/w_i},s)}\,ds = x+t_iV_i^\varepsilon(x)+O\big(t_i^{\frac{w_i+1}{w_i}}\big). \]
In particular this is a $C^1$ map with respect to $t$. Notice that it is not $C^2$ if $w_i>1$ for some $i$ (i.e. in a genuinely sub-Riemannian problem).

From this remark it follows that $\Phi^\varepsilon$ is $C^1$ as a function of $t$, being a composition of $C^1$ maps. Moreover we get the expansion
\[ \Phi^\varepsilon(t_1,\dots,t_n,x) = x+\sum_{i=1}^nt_iV_i^\varepsilon(x)+o(|t|)\ \Longrightarrow\ \frac{\partial\Phi^\varepsilon}{\partial t_i}\Big|_{t=0} = V_i^\varepsilon(x). \]
Hence the map $\Phi^\varepsilon$ is a local diffeomorphism near the origin $t=(t_1,\dots,t_n)=0$, and by the Implicit Function Theorem there exists a constant $c>0$ such that
\[ x+c\nu B\subset\Phi^\varepsilon(\nu B,x),\qquad B = B(0,1)\subset\mathbb{R}^n,\quad \forall\,x\in K, \tag{8.29} \]
where $c$ is independent of $\varepsilon$ and $\nu$ is small enough. Let us denote now by $F_x$ the end-point map based at the point $x\in\mathbb{R}^n$ (with analogous meaning for $F_x^\varepsilon$, $\hat F_x$), and by $B_{L^2}$ the unit ball in $L^2_k[0,1]$. We claim that (8.29) implies that there exists a constant $c'$ such that
\[ x+c'\nu B\subset F_x^\varepsilon(\nu^{\frac1m}B_{L^2}),\qquad \forall\,\nu,\varepsilon>0. \tag{8.30} \]
Since $t\mapsto u_i(t,\cdot)$ is a smooth map for every $i$, and $u_i(0,\cdot)=0$, there exists a constant $c_i$ such that, for all $\nu>0$ small enough,
\[ t\in\nu B\ \Longrightarrow\ u_i(t,\cdot)\in c_i\nu B_{L^2}, \tag{8.31} \]
\[ u_i(t^{1/w_i},\cdot)\in c_i\nu^{1/w_i}B_{L^2}. \tag{8.32} \]

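For such $\nu$ we have, by inclusion (8.30), that
\[ |x-y|\le c\nu\ \Longrightarrow\ d^\varepsilon(x,y)\le\nu^{1/m}, \]
where we used the fact that $d^\varepsilon$ is the infimum of the norm of $u$ such that $F_x^\varepsilon(u)=y$. From this it easily follows that
\[ d^\varepsilon(x,y)\le c^{-\frac1m}|x-y|^{\frac1m}. \tag{8.33} \]

Remark. All estimates are valid also for $\varepsilon=0$, i.e. for the nilpotent approximation. In particular, using homogeneity,
\[ \hat d(x,y)\le C|x-y|^{\frac1m},\qquad \forall\,x,y\in\mathbb{R}^n. \tag{8.34} \]
Indeed from the proof of Lemma 8.46 it follows that the estimate (8.34) holds in a compact $K$ containing the origin. Consider two arbitrary points $x,y\in\mathbb{R}^n$ and $\varepsilon>0$ such that $\delta_\varepsilon x,\delta_\varepsilon y\in K$. By the homogeneity of the distance,
\[ \hat d(\delta_\varepsilon x,\delta_\varepsilon y) = \varepsilon\,\hat d(x,y). \]
Moreover, since the estimate (8.34) holds in $K$,
\[ \hat d(\delta_\varepsilon x,\delta_\varepsilon y)\le C|\delta_\varepsilon x-\delta_\varepsilon y|^{1/m}\le C\varepsilon|x-y|^{1/m}. \]

We can now state the main result.

Theorem. $\rho^\varepsilon\to\hat\rho$ uniformly on compact sets in $\mathbb{R}^n$.

Proof. By Lemma 8.46 it is sufficient to prove pointwise convergence. We prove the following inequalities:
\[ \limsup_{\varepsilon\to0^+}\rho^\varepsilon(x)\le\hat\rho(x)\le\liminf_{\varepsilon\to0^+}\rho^\varepsilon(x). \tag{8.35} \]

(i) Fix a point $x$ and a control $\hat u$ such that $\hat F(\hat u)=x$ and $\|\hat u\| = \hat\rho(x)$, i.e. such that the corresponding trajectory is a minimizer for the system $\hat f$. Now consider $x^\varepsilon := F^\varepsilon(\hat u)$. From Proposition 8.45 we get $x^\varepsilon\to x$ for $\varepsilon\to0$. Moreover, from the definition of $\rho^\varepsilon$ we have $\rho^\varepsilon(x^\varepsilon)\le\hat\rho(x)$. Hence
\[ \rho^\varepsilon(x) = \rho^\varepsilon(x^\varepsilon)+\rho^\varepsilon(x)-\rho^\varepsilon(x^\varepsilon)\le\hat\rho(x)+|\rho^\varepsilon(x)-\rho^\varepsilon(x^\varepsilon)|. \]
Using that $\rho^\varepsilon$ is an equicontinuous family and that $x^\varepsilon\to x$, we obtain the left inequality in (8.35).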

(ii) Let now $u^\varepsilon$ be a control such that $F^\varepsilon(u^\varepsilon)=x$ and $\|u^\varepsilon\| = \rho^\varepsilon(x)$, and define $x^\varepsilon := \hat F(u^\varepsilon)$. As before we have $\hat\rho(x^\varepsilon)\le\rho^\varepsilon(x)$. Then
\[ \hat\rho(x) = \hat\rho(x^\varepsilon)+\hat\rho(x)-\hat\rho(x^\varepsilon)\le\rho^\varepsilon(x)+|\hat\rho(x)-\hat\rho(x^\varepsilon)|, \]
and now it is sufficient to notice that $x^\varepsilon\to x$: indeed $x^\varepsilon = \hat F(u^\varepsilon)$ and $F^\varepsilon(u^\varepsilon)=x$, while $F^\varepsilon\to\hat F$ uniformly on balls of $L^2$ and the controls $u^\varepsilon$ are bounded, since the $\rho^\varepsilon$ are equicontinuous.

In privileged coordinates $x=(x_1,\dots,x_m)\in\mathbb{R}^{k_1}\oplus\dots\oplus\mathbb{R}^{k_m}=\mathbb{R}^n$ we set
\[ \Pi_\varepsilon = \{x\in\mathbb{R}^n,\ |x_i|\le\varepsilon^i,\ i=1,\dots,m\}. \]

Corollary 8.50 (Ball-Box Theorem). There exist constants $c_1,c_2>0$ such that
\[ c_1\Pi_\varepsilon\subset B(x,\varepsilon)\subset c_2\Pi_\varepsilon, \]
where $B(x,\varepsilon)$ is the sub-Riemannian ball in privileged coordinates.

Notice that this is a weaker statement with respect to the Theorem above.

Exercise. Prove Corollary 8.50.

Definition. Let $f$ and $f'$ be two sub-Riemannian structures on the same manifold $M$. We say that the structures are locally Lipschitz equivalent if, for any compact $K\subset M$, there exist $c_1,c_2>0$ such that
\[ c_1\,d(x,y)\le d'(x,y)\le c_2\,d(x,y),\qquad \forall\,x,y\in K, \]
where $d$ and $d'$ are the sub-Riemannian distances induced by $f$ and $f'$, respectively.

From the Ball-Box Theorem we easily get a characterization of locally Lipschitz equivalent structures in terms of the distribution.

Corollary. Two sub-Riemannian structures are locally Lipschitz equivalent if and only if the two flags are equal at all points, i.e.
\[ \mathcal{D}^i_q = \mathcal{D}'^{\,i}_q,\qquad \forall\,q\in M,\ \forall\,i\ge1. \]

Corollary. Two regular sub-Riemannian structures are locally Lipschitz equivalent if and only if their distributions are equal at all points, i.e.
\[ \mathcal{D}_q = \mathcal{D}'_q,\qquad \forall\,q\in M. \]
In other words, in the regular case, the distribution defines the metric up to local Lipschitz equivalence.

Remark. In the proof of Theorem 8.49 we showed that, in some coordinates, the sub-Riemannian metric satisfies a Hölder estimate with respect to the Euclidean one. The fact that the metric is Lipschitz equivalent to the Euclidean one characterizes exactly the Riemannian structures on $M$. Moreover we notice that this is only a local property, since we do not study the behaviour of the constants $c_1,c_2$ when $K$ becomes big.

8.5 Algebraic meaning

In the last section we proved in which sense the sub-Riemannian tangent space approximates the sub-Riemannian structure on the manifold. Now we also show that, at least in the regular case, the nilpotent approximation has the structure of a Lie group, endowed with a left-invariant sub-Riemannian structure.

Recall that, given an orthonormal frame $\{f_1,\dots,f_k\}$ for the sub-Riemannian structure, by Proposition 8.20 the vector field $J^m_qf_i$, jet of a vector field on $M$, is a well defined vector field on the quotient $T^f_q := J^f_q/\sim$, which we denote by $\hat f_i$.

Proposition. The Lie algebra $\mathrm{Lie}\{\hat f_1,\dots,\hat f_k\}$ is a nilpotent Lie algebra of step $m$, where $m$ is the nonholonomic degree of $f$ at $q$.

Proof. Consider privileged coordinates around the point $q$. Then $\hat f_i$ has weight $-1$ and is homogeneous with respect to the dilation $\delta_\lambda$. Moreover for any bracket monomial we have
\[ \nu([\hat f_{i_1},\dots,[\hat f_{i_{j-1}},\hat f_{i_j}]]) = -j. \]
Since every vector field $V$, when written in privileged coordinates, satisfies $\nu(V)\ge-m$, every bracket of more than $m$ vector fields is necessarily zero.

Consider now the group generated by the flows of these vector fields,
\[ G = \mathrm{Gr}\{e^{t\hat f_1},\dots,e^{t\hat f_k}\}, \]
which acts on $T^f_q$ on the right, and is by definition a nilpotent Lie group. (2) Moreover in the proof of Theorem 8.25 we showed that this action is also transitive (i.e. we can realize every element of $T^f_q$ with this action). Collecting together all these results we have the following.

Corollary. The nilpotent approximation $T^f_q$ is a homogeneous space, diffeomorphic to the quotient $G/G_0$, where $G_0$ is the isotropy group of the trivial element of $T^f_q$.

Before interpreting this construction at the level of Lie algebras, we recall some definitions.

The free associative algebra on $k$ generators $x_1,\dots,x_k$ is the associative algebra $A_k$ of linear combinations of words in its generators, where the product of two elements is defined by juxtaposition.

The free Lie algebra on $k$ generators, denoted $L_k$, is the algebra of Lie elements of $A_k$, where the product of two elements $x,y$ is defined by the commutator $[x,y] = xy-yx$.

The nilpotent step $m$ free Lie algebra on $k$ generators $x_1,\dots,x_k$ is the quotient of the free Lie algebra by the ideal $I^{m+1}$ generated as follows: $I^1 = L$, and $I^j = [I^{j-1},L]$.

Let $\mathrm{Lie}_m\{X_1,\dots,X_k\}$ be the nilpotent step $m$ free Lie algebra generated by the vector fields $X_1,\dots,X_k$ and consider the subalgebra
\[ C := \{\pi\in\mathrm{Lie}_m\{X_1,\dots,X_k\}\ |\ \pi(\hat f_1,\dots,\hat f_k)(0) = 0\} \]
of all bracket polynomials that vanish at zero when we replace $X_i$ with $\hat f_i$. Then
\[ \mathrm{Lie}\,T^f_q\simeq\mathrm{Lie}_m\{X_1,\dots,X_k\}/C. \]

(2) A Lie group $G$ is nilpotent if its Lie algebra $\mathfrak{g}$ is nilpotent. The fact that $G$ acts on the right is because the right action satisfies $R_{hg} = R_gR_h$ (i.e. $x\cdot(hg) = (x\cdot h)\cdot g$).
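The size of the nilpotent step $m$ free Lie algebra on $k$ generators can be computed explicitly. The sketch below relies on Witt's formula for the dimension of the degree-$d$ part of a free Lie algebra, $\frac1d\sum_{e|d}\mu(e)\,k^{d/e}$ with $\mu$ the Möbius function; this is a standard fact that is not stated in the text above, and the function names are hypothetical.

```python
def mobius(n):
    """Moebius function, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:       # a squared prime factor
                return 0
            result = -result
        p += 1
    if n > 1:                    # one remaining prime factor
        result = -result
    return result

def witt(k, d):
    """Dimension of the degree-d part of the free Lie algebra on k generators."""
    total = sum(mobius(e) * k ** (d // e)
                for e in range(1, d + 1) if d % e == 0)
    return total // d

def free_nilpotent_dim(k, m):
    """Dimension of the nilpotent step-m free Lie algebra on k generators."""
    return sum(witt(k, d) for d in range(1, m + 1))

for k, m in [(2, 2), (2, 3), (3, 2)]:
    print(k, m, free_nilpotent_dim(k, m))
```

For $k=2$, $m=2$ the result is $3$, which matches the dimension of the Heisenberg algebra of Example 8.40; in general this number is only the dimension of the algebra before taking the quotient by $C$.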

Remark. To discuss regularity properties of $T^f_q$ with respect to $q$, we can restate this characterization in a way that does not depend on the nilpotent approximation:
\[ \mathrm{Lie}\,T^f_q\simeq\mathrm{Lie}_m\{X_1,\dots,X_k\}/C_q, \]
where $C_q$ is the core subalgebra
\[ C_q := \{\pi\in\mathrm{Lie}_m\{X_1,\dots,X_k\}\ |\ \pi(f_1,\dots,f_k)(q)\in\mathcal{D}^{\deg\pi-1}_q\}. \tag{8.36} \]

Lemma. Assume that the sub-Riemannian structure has constant growth vector, i.e. that $n_i(q) = \dim\mathcal{D}^i_q$ does not depend on $q$. Then $C_q$ is an ideal. In particular $T^f_q$ is a Lie group.

Proof. It is sufficient to prove that
\[ X\in C_q\ \Longrightarrow\ [f_i,X]\in C_q,\qquad i=1,\dots,k. \]
Since the structure has constant growth vector, we can consider an adapted basis $V_1,\dots,V_n$, well defined in a neighborhood $O_q$ of $q$. In particular if $X = \pi(f_1,\dots,f_k)$ is a bracket polynomial of degree $\deg\pi = d$ we can write
\[ X(q') = \sum_{i:\,w_i\le d}a_i(q')V_i(q'),\qquad \forall\,q'\in O_q, \]
where the $a_i$ are suitable smooth functions. From (8.36) we have that $X\in C_q$ if and only if $X(q)$ belongs to $\mathcal{D}^{d-1}_q$, i.e. $a_i(q) = 0$ for all $i$ such that $w_i = d$. On the other hand
\[ [f_i,X] = \Big[f_i,\sum_{w_j\le d}a_jV_j\Big] = \sum_{w_j\le d}a_j[f_i,V_j]+f_i(a_j)V_j. \tag{8.37} \]
From this equality it is easy to check that every coefficient of degree $d+1$ in this sum vanishes at $q$, since such terms can appear only in the first summand of (8.37).

Corollary 8.60. Under the previous assumptions $\hat f_1,\dots,\hat f_k$ are a basis of left-invariant vector fields on $T^f_q$.

Proof. Everything relies on the fact that if we consider a left-invariant vector field $X$ on a Lie group $G$, and the right action of a normal subgroup $H$ on it, then $X$ is a well defined left-invariant vector field on the quotient $G/H$, which is still a Lie group.

Examples: Heisenberg, Martinet, Grushin.

Chapter 9 The volume in sub-Riemannian geometry

9.1 The Popp volume

For an equiregular sub-Riemannian manifold $M$, Popp's volume is a smooth volume which is canonically associated with the sub-Riemannian structure, and it is a natural generalization of the Riemannian one. In this chapter we define the Popp volume and we prove a general formula for its expression, written in terms of a frame adapted to the sub-Riemannian distribution. As a first application of this result, we prove an explicit formula for the canonical sub-Laplacian, namely the one associated with Popp's volume. Finally, we discuss sub-Riemannian isometries, and we prove that they preserve Popp's volume.

9.2 Popp volume for equiregular sub-Riemannian manifolds

Recall that a distribution $D$ is equiregular if the growth vector is constant, i.e. for each $i = 1,2,\dots,m$, $k_i(q) = \dim(D^i_q)$ does not depend on $q \in M$. In this case the subspaces $D^i_q$ are fibres of the higher order distributions $D^i \subseteq TM$. For equiregular distributions we will simply talk about growth vector and step of the distribution, without any reference to the point $q$. Next, we introduce the nilpotentization of the distribution at the point $q$, which is fundamental for the definition of Popp's volume.

Definition 9.1. Let $D$ be an equiregular distribution of step $m$. The nilpotentization of $D$ at the point $q \in M$ is the graded vector space
\[ \mathrm{gr}_q(D) = D_q \oplus D^2_q/D_q \oplus \dots \oplus D^m_q/D^{m-1}_q. \]

The vector space $\mathrm{gr}_q(D)$ can be endowed with a Lie algebra structure which respects the grading. Then there is a unique connected, simply connected group $\mathrm{Gr}_q(D)$ whose Lie algebra is $\mathrm{gr}_q(D)$. The global, left-invariant vector fields obtained by the group action on any orthonormal basis of $D_q \subseteq \mathrm{gr}_q(D)$ define a sub-Riemannian structure on $\mathrm{Gr}_q(D)$, which is called the nilpotent approximation of the sub-Riemannian structure at the point $q$.

In what follows, we provide the definition of Popp's volume. Our presentation follows closely the one that can be found in [?] (see also [23]). The definition rests on the following lemmas.

Lemma 9.2. Let $E$ be an inner product space and $V$ a vector space. Let $\pi : E \to V$ be a surjective linear map. Then $\pi$ induces an inner product on $V$ such that the norm of $v \in V$ is
\[ \|v\|_V = \min\{\|e\|_E \ \text{s.t.}\ \pi(e) = v\}. \tag{9.1} \]

Proof. It is easy to check that Eq. (9.1) defines a norm on $V$. Moreover, since $\|\cdot\|_E$ is induced by an inner product, i.e. it satisfies the parallelogram identity, it follows that $\|\cdot\|_V$ satisfies the parallelogram identity too. Notice that this is equivalent to considering the inner product on $V$ defined by the linear isomorphism $\pi : (\ker\pi)^\perp \to V$. Indeed the norm of $v \in V$ is the norm of the shortest element $e \in \pi^{-1}(v)$.

Lemma 9.3. Let $E$ be a vector space of dimension $n$ with a flag of linear subspaces $\{0\} = F_0 \subset F_1 \subset F_2 \subset \dots \subset F_m = E$. Let $\mathrm{gr}(F) = F_1 \oplus F_2/F_1 \oplus \dots \oplus F_m/F_{m-1}$ be the associated graded vector space. Then there is a canonical isomorphism $\theta : \wedge^n E \to \wedge^n \mathrm{gr}(F)$.

Proof. We only give a sketch of the proof. For $0 \le i \le m$, let $k_i := \dim F_i$. Let $X_1,\dots,X_n$ be an adapted basis for $E$, i.e. $X_1,\dots,X_{k_i}$ is a basis for $F_i$. We define the linear map $\hat\theta : E \to \mathrm{gr}(F)$ which, for $0 \le j \le m-1$, takes $X_{k_j+1},\dots,X_{k_{j+1}}$ to the corresponding equivalence classes in $F_{j+1}/F_j$. This map is indeed a non-canonical isomorphism, which depends on the choice of the adapted basis. In turn, $\hat\theta$ induces a map $\theta : \wedge^n E \to \wedge^n \mathrm{gr}(F)$, which sends $X_1 \wedge \dots \wedge X_n$ to $\hat\theta(X_1) \wedge \dots \wedge \hat\theta(X_n)$. The proof that $\theta$ does not depend on the choice of the adapted basis is dual to the proof of [23, Lemma 10.4].

The idea behind Popp's volume is to define an inner product on each $D^i_q/D^{i-1}_q$ which, in turn, induces an inner product on the orthogonal direct sum $\mathrm{gr}_q(D)$. The latter has a natural volume form, which is the canonical volume of an inner product space obtained by wedging the elements of an orthonormal dual basis. Then, we employ Lemma 9.3 to define an element of $(\wedge^n T_qM)^* \simeq \wedge^n T^*_qM$, which is Popp's volume form computed at $q$.

Fix $q \in M$. Then, let $v,w \in D_q$, and let $V,W$ be any horizontal extensions of $v,w$. Namely, $V,W \in \Gamma(D)$ and $V(q) = v$, $W(q) = w$.
The linear map $\pi : D_q \otimes D_q \to D^2_q/D_q$,
\[ \pi(v \otimes w) := [V,W]_q \mod D_q, \tag{9.2} \]
is well defined, and does not depend on the choice of the horizontal extensions. Indeed, let $\tilde V$ and $\tilde W$ be two different horizontal extensions of $v$ and $w$ respectively. Then, in terms of a local frame $X_1,\dots,X_k$ of $D$,
\[ \tilde V = V + \sum_{i=1}^k f_i X_i, \qquad \tilde W = W + \sum_{i=1}^k g_i X_i, \tag{9.3} \]
where, for $1 \le i \le k$, $f_i, g_i \in C^\infty(M)$ and $f_i(q) = g_i(q) = 0$. Therefore
\[ [\tilde V,\tilde W] = [V,W] + \sum_{i=1}^k \big(V(g_i) - W(f_i)\big) X_i + \sum_{i,j=1}^k f_i g_j\,[X_i,X_j]. \tag{9.4} \]
Thus, evaluating at $q$, $[\tilde V,\tilde W]_q = [V,W]_q \mod D_q$, as claimed. Similarly, let $1 \le i \le m$. The linear maps $\pi_i : \otimes^i D_q \to D^i_q/D^{i-1}_q$,
\[ \pi_i(v_1 \otimes \dots \otimes v_i) = [V_1,[V_2,\dots,[V_{i-1},V_i]]]_q \mod D^{i-1}_q, \tag{9.5} \]
are well defined and do not depend on the choice of the horizontal extensions $V_1,\dots,V_i$ of $v_1,\dots,v_i$.

By the bracket-generating condition, the $\pi_i$ are surjective and, by Lemma 9.2, they induce an inner product space structure on $D^i_q/D^{i-1}_q$. Therefore, the nilpotentization of the distribution at $q$, namely
\[ \mathrm{gr}_q(D) = D_q \oplus D^2_q/D_q \oplus \dots \oplus D^m_q/D^{m-1}_q, \tag{9.6} \]
is an inner product space, as the orthogonal direct sum of a finite number of inner product spaces. As such, it is endowed with a canonical volume (defined up to a sign) $\mu_q \in \wedge^n \mathrm{gr}_q(D)^*$, which is the volume form obtained by wedging the elements of an orthonormal dual basis.

Finally, Popp's volume (computed at the point $q$) is obtained by transporting the volume of $\mathrm{gr}_q(D)$ to $T_qM$ through the map $\theta_q : \wedge^n T_qM \to \wedge^n \mathrm{gr}_q(D)$ defined in Lemma 9.3. Namely
\[ P_q = \theta^*_q(\mu_q) = \mu_q \circ \theta_q, \tag{9.7} \]
where $\theta^*_q$ denotes the dual map and we employ the canonical identification $(\wedge^n T_qM)^* \simeq \wedge^n T^*_qM$. Eq. (9.7) is defined only in the domain of the chosen local frame. Since $M$ is orientable, with a standard argument, these $n$-forms can be glued together to obtain Popp's volume $P \in \Omega^n(M)$. The smoothness of $P$ follows directly from Theorem 9.5.

Remark 9.4. The definition of Popp's volume can be restated as follows. Let $(M,D)$ be an oriented sub-Riemannian manifold. Popp's volume is the unique volume $P$ such that, for all $q \in M$, the following diagram is commutative:
\[
\begin{array}{ccc}
(M,D) & \xrightarrow{\ P\ } & (\wedge^n T_qM)^* \\
{\scriptstyle \mathrm{gr}_q}\downarrow & & \uparrow{\scriptstyle \theta^*_q} \\
\mathrm{gr}_q(D) & \xrightarrow{\ \mu\ } & (\wedge^n \mathrm{gr}_q(D))^*
\end{array}
\]
where $\mu$ associates the inner product space $\mathrm{gr}_q(D)$ with its canonical volume $\mu_q$, and $\theta^*_q$ is the dual of the map defined in Lemma 9.3.

9.3 A formula for Popp volume

In this section we prove an explicit formula for the Popp volume. We say that a local frame $X_1,\dots,X_n$ is adapted if $X_1,\dots,X_{k_i}$ is a local frame for $D^i$, where $k_i := \dim D^i$, and $X_1,\dots,X_k$ are orthonormal. It is useful to define the functions $c^l_{ij} \in C^\infty(M)$ by
\[ [X_i,X_j] = \sum_{l=1}^n c^l_{ij} X_l. \tag{9.8} \]
With a standard abuse of notation we call them structure constants.
For $j = 2,\dots,m$ we define the adapted structure constants $b^l_{i_1\dots i_j} \in C^\infty(M)$ as follows:
\[ [X_{i_1},[X_{i_2},\dots,[X_{i_{j-1}},X_{i_j}]]] = \sum_{l=k_{j-1}+1}^{k_j} b^l_{i_1 i_2 \dots i_j} X_l \mod D^{j-1}, \tag{9.9} \]
where $1 \le i_1,\dots,i_j \le k$. These are a generalization of the $c^l_{ij}$, with an important difference: the structure constants of Eq. (9.8) are obtained by considering the Lie bracket of all the fields of the local frame, namely $1 \le i,j,l \le n$. On the other hand, the adapted structure constants of Eq. (9.9) are obtained by taking the iterated Lie brackets of the first $k$ elements of the adapted frame only (i.e. the local orthonormal frame for $D$), and considering the appropriate equivalence class. For $j = 2$, the adapted structure constants can be directly compared to the standard ones. Namely $b^l_{ij} = c^l_{ij}$ when both are defined, that is for $1 \le i,j \le k$, $l \ge k+1$.

Then, we define the $(k_j - k_{j-1})$-dimensional square matrix $B_j$ as follows:
\[ [B_j]^{hl} = \sum_{i_1,i_2,\dots,i_j=1}^{k} b^h_{i_1 i_2 \dots i_j}\, b^l_{i_1 i_2 \dots i_j}, \qquad j = 1,\dots,m, \tag{9.10} \]
with the understanding that $B_1$ is the $k \times k$ identity matrix. It turns out that each $B_j$ is positive definite.

Theorem 9.5. Let $X_1,\dots,X_n$ be a local adapted frame, and let $\nu^1,\dots,\nu^n$ be the dual frame. Then Popp's volume $P$ satisfies
\[ P = \frac{1}{\sqrt{\prod_j \det B_j}}\; \nu^1 \wedge \dots \wedge \nu^n, \tag{9.11} \]
where $B_j$ is defined by (9.10) in terms of the adapted structure constants (9.9).

To clarify the geometric meaning of Eq. (9.11), let us consider more closely the case $m = 2$. If $D$ is a step 2 distribution, we can build a local adapted frame $\{X_1,\dots,X_k,X_{k+1},\dots,X_n\}$ by completing any local orthonormal frame $\{X_1,\dots,X_k\}$ of the distribution to a local frame of the whole tangent bundle. Even though it may not be evident, it turns out that $B_2^{-1}(q)$ is the Gram matrix of the vectors $X_{k+1},\dots,X_n$, seen as elements of $T_qM/D_q$. The latter has a natural structure of inner product space, induced by the surjective linear map $[\,\cdot\,,\cdot\,] : D_q \otimes D_q \to T_qM/D_q$ (see Lemma 9.2). Therefore, the function appearing at the beginning of Eq. (9.11) is the volume of the parallelotope whose edges are $X_1,\dots,X_n$, seen as elements of the orthogonal direct sum $\mathrm{gr}_q(D) = D_q \oplus T_qM/D_q$.
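As a concrete check of Eq. (9.11) in the step 2 case, the following minimal sketch evaluates the density for the Heisenberg group on $\mathbb{R}^3$. The frame $X_1 = \partial_x - \tfrac{y}{2}\partial_z$, $X_2 = \partial_y + \tfrac{x}{2}\partial_z$, $X_3 = \partial_z$ and the helper names `gram_matrix`, `det` and `popp_density` are illustrative assumptions, not notation from the text; the adapted structure constants are entered by hand from $[X_1,X_2] = X_3$.

```python
import math

# Assumed example: Heisenberg group, growth vector (k, n) = (2, 3).
k, n = 2, 3

# Adapted structure constants b[l][(i, j)]: coefficient of X_l in [X_i, X_j] mod D.
# Here [X1, X2] = X3, hence b^3_{12} = 1 and b^3_{21} = -1.
b = {3: {(1, 1): 0.0, (1, 2): 1.0, (2, 1): -1.0, (2, 2): 0.0}}

def gram_matrix(b, k, n):
    """B_2 with entries [B_2]^{hl} = sum_{i,j} b^h_{ij} b^l_{ij}, h, l = k+1..n."""
    idx = range(k + 1, n + 1)
    pairs = [(i, j) for i in range(1, k + 1) for j in range(1, k + 1)]
    return [[sum(b[h][p] * b[l][p] for p in pairs) for l in idx] for h in idx]

def det(m):
    """Determinant by Laplace expansion (adequate for the tiny matrices used here)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))

B2 = gram_matrix(b, k, n)
# B_1 is the identity, so only det(B_2) enters the density of Eq. (9.11).
popp_density = 1.0 / math.sqrt(det(B2))
print(B2, popp_density)
```

Under these assumptions $B_2$ is the $1 \times 1$ matrix $(2)$, so the density is $1/\sqrt{2}$; since the dual frame of this particular choice satisfies $\nu^1 \wedge \nu^2 \wedge \nu^3 = dx \wedge dy \wedge dz$, the sketch gives $P = \tfrac{1}{\sqrt 2}\, dx \wedge dy \wedge dz$.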
Proof of Theorem 9.5

We are now ready to prove Theorem 9.5. For convenience, we first prove it for a distribution of step $m = 2$. Then, we discuss the general case. In the following subsections, everything is understood to be computed at a fixed point $q \in M$. Namely, by $\mathrm{gr}(D)$ we mean the nilpotentization of $D$ at the point $q$, and by $D^i$ we mean the fibre $D^i_q$ of the appropriate higher order distribution.

Step 2 distribution

If $D$ is a step 2 distribution, then $D^2 = TM$. The growth vector is $G = (k,n)$. We choose $n-k$ independent vector fields $\{Y_l\}_{l=k+1}^n$ such that $X_1,\dots,X_k,Y_{k+1},\dots,Y_n$ is a local adapted frame for $TM$. Then
\[ [X_i,X_j] = \sum_{l=k+1}^n b^l_{ij}\, Y_l \mod D. \tag{9.12} \]

For each $l = k+1,\dots,n$, we can think of the $b^l_{ij}$ as the components of a Euclidean vector in $\mathbb{R}^{k^2}$, which we denote by the symbol $b^l$. According to the general construction of Popp's volume, we need first to compute the inner product on the orthogonal direct sum $\mathrm{gr}(D) = D \oplus D^2/D$. By Lemma 9.2, the norm on $D^2/D$ is induced by the linear map $\pi : \otimes^2 D \to D^2/D$,
\[ \pi(X_i \otimes X_j) = [X_i,X_j] \mod D. \tag{9.13} \]
The vector space $\otimes^2 D$ inherits an inner product from the one on $D$, namely, for all $X,Y,Z,W \in D$, $\langle X \otimes Y, Z \otimes W\rangle = \langle X,Z\rangle\langle Y,W\rangle$. Since $\pi$ is surjective, we identify the range $D^2/D$ with $(\ker\pi)^\perp \subset \otimes^2 D$, and define an inner product on $D^2/D$ by this identification. In order to compute explicitly the norm on $D^2/D$ (and then, by polarization, the inner product), let $Y \in D^2/D$. Then
\[ \|Y\|_{D^2/D} = \min\{\|Z\|_{\otimes^2 D} \ \text{s.t.}\ \pi(Z) = Y\}. \tag{9.14} \]
Let $Y = \sum_{l=k+1}^n c^l Y_l$ and $Z = \sum_{i,j=1}^k a_{ij}\, X_i \otimes X_j \in \otimes^2 D$. We can think of the $a_{ij}$ as the components of a vector $a \in \mathbb{R}^{k^2}$. Then, Eq. (9.14) reads
\[ \|Y\|_{D^2/D} = \min\{|a| \ \text{s.t.}\ a \cdot b^l = c^l,\ l = k+1,\dots,n\}, \tag{9.15} \]
where $|a|$ is the Euclidean norm of $a$, and the dot denotes the Euclidean inner product. Indeed, $\|Y\|_{D^2/D}$ is the Euclidean distance of the origin from the affine subspace of $\mathbb{R}^{k^2}$ defined by the equations $a \cdot b^l = c^l$ for $l = k+1,\dots,n$. In order to find an explicit expression for $\|Y\|^2_{D^2/D}$ in terms of the $b^l$, we employ the Lagrange multipliers technique. Then, we look for extremals of
\[ L(a,b^{k+1},\dots,b^n,\lambda_{k+1},\dots,\lambda_n) = |a|^2 - 2\sum_{l=k+1}^n \lambda_l\,(a \cdot b^l - c^l). \tag{9.16} \]
We obtain the following system:
\[ \sum_{l=k+1}^n \lambda_l b^l - a = 0, \qquad \sum_{l=k+1}^n \lambda_l\, b^l \cdot b^r = c^r, \quad r = k+1,\dots,n. \tag{9.17} \]
Let us define the $(n-k)$-dimensional square matrix $B$, with components $B^{hl} = b^h \cdot b^l$. $B$ is a Gram matrix, which is positive definite if and only if the $b^l$ are $n-k$ linearly independent vectors. These vectors are exactly the rows of the representative matrix of the linear map $\pi : \otimes^2 D \to D^2/D$, which has rank $n-k$. Therefore $B$ is symmetric and positive definite, hence invertible.
It is now easy to write the solution of system (9.17) by employing the matrix $B^{-1}$, which has components $B^{-1}_{hl}$. Indeed, a straightforward computation leads to
\[ \Big\|\sum_{s} c^s Y_s\Big\|^2_{D^2/D} = c^h B^{-1}_{hl}\, c^l. \tag{9.18} \]
By polarization, the inner product on $D^2/D$ is defined, in the basis $Y_l$, by
\[ \langle Y_l,Y_h\rangle_{D^2/D} = B^{-1}_{lh}. \tag{9.19} \]
Observe that $B^{-1}$ is the Gram matrix of the vectors $Y_{k+1},\dots,Y_n$ seen as elements of $D^2/D$. Then, by the definition of Popp's volume, if $\nu^1,\dots,\nu^k,\mu^{k+1},\dots,\mu^n$ is the dual basis associated with $X_1,\dots,X_k,Y_{k+1},\dots,Y_n$, the following formula holds true:
\[ P = \frac{1}{\sqrt{\det B}}\; \nu^1 \wedge \dots \wedge \nu^k \wedge \mu^{k+1} \wedge \dots \wedge \mu^n. \tag{9.20} \]
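The Lagrange multiplier computation leading to Eq. (9.18) can be checked numerically. The sketch below is only an illustration under stated assumptions: the two vectors in `b_vecs` and the right-hand side `c` are made-up data (they play the role of the $b^l$ and of the $c^l$, but they do not come from a specific distribution), and the $2 \times 2$ inverse is written out by hand to keep the example free of dependencies.

```python
def dot(u, v):
    """Euclidean inner product of two lists of floats."""
    return sum(x * y for x, y in zip(u, v))

# Hypothetical data: two linearly independent vectors b^l in R^4 and targets c^l.
b_vecs = [[0.0, 1.0, -1.0, 0.0], [1.0, 0.0, 0.0, 2.0]]
c = [3.0, -1.0]

# Gram matrix B^{hl} = b^h . b^l and its inverse (explicit 2x2 formula).
B = [[dot(bh, bl) for bl in b_vecs] for bh in b_vecs]
det_B = B[0][0] * B[1][1] - B[0][1] * B[1][0]
B_inv = [[B[1][1] / det_B, -B[0][1] / det_B],
         [-B[1][0] / det_B, B[0][0] / det_B]]

# System (9.17): B lambda = c and a = sum_l lambda_l b^l.
lam = [dot(row, c) for row in B_inv]
a = [sum(lam[l] * b_vecs[l][i] for l in range(2)) for i in range(4)]

norm_sq = dot(a, a)      # |a|^2 at the extremal
quad_form = dot(c, lam)  # c^h B^{-1}_{hl} c^l, the right-hand side of (9.18)

# Any other feasible point differs from a by a vector orthogonal to every b^l.
d = [0.0, 1.0, 1.0, 0.0]
competitor = [x + y for x, y in zip(a, d)]
print(norm_sq, quad_form, dot(competitor, competitor))
```

In this example the extremal satisfies both constraints, its squared norm agrees with the quadratic form $c^h B^{-1}_{hl} c^l$, and the competitor `a + d` is feasible but strictly longer, in line with the interpretation of (9.15) as a distance from the origin to an affine subspace.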

General case

In the general case, the procedure above can be carried out with no difficulty. Let $X_1,\dots,X_n$ be a local adapted frame for the flag $D^0 \subset D \subset D^2 \subset \dots \subset D^m$. As usual $k_i = \dim(D^i)$. For $j = 2,\dots,m$ we define the adapted structure constants $b^l_{i_1\dots i_j} \in C^\infty(M)$ by
\[ [X_{i_1},[X_{i_2},\dots,[X_{i_{j-1}},X_{i_j}]]] = \sum_{l=k_{j-1}+1}^{k_j} b^l_{i_1 i_2 \dots i_j} X_l \mod D^{j-1}, \tag{9.21} \]
where $1 \le i_1,\dots,i_j \le k$. Again, the $b^l_{i_1\dots i_j}$ can be seen as the components of a vector $b^l \in \mathbb{R}^{k^j}$. Recall that for each $j$ we defined the surjective linear map $\pi_j : \otimes^j D \to D^j/D^{j-1}$,
\[ \pi_j(X_{i_1} \otimes X_{i_2} \otimes \dots \otimes X_{i_j}) = [X_{i_1},[X_{i_2},\dots,[X_{i_{j-1}},X_{i_j}]]] \mod D^{j-1}. \tag{9.22} \]
Then, we compute the norm of an element of $D^j/D^{j-1}$ exactly as in the previous case. It is convenient to define, for each $1 \le j \le m$, the $(k_j - k_{j-1})$-dimensional square matrix $B_j$, of components
\[ [B_j]^{hl} = \sum_{i_1,i_2,\dots,i_j=1}^{k} b^h_{i_1 i_2 \dots i_j}\, b^l_{i_1 i_2 \dots i_j}, \tag{9.23} \]
with the understanding that $B_1$ is the $k \times k$ identity matrix. Each one of these matrices is symmetric and positive definite, hence invertible, due to the surjectivity of $\pi_j$. The same computation of the previous case, applied to each $D^j/D^{j-1}$, shows that the matrices $B_j^{-1}$ are precisely the Gram matrices of the vectors $X_{k_{j-1}+1},\dots,X_{k_j} \in D^j/D^{j-1}$, in other words
\[ \langle X_{k_{j-1}+l}, X_{k_{j-1}+h}\rangle_{D^j/D^{j-1}} = [B_j^{-1}]_{lh}. \tag{9.24} \]
Therefore, if $\nu^1,\dots,\nu^n$ is the dual frame associated with $X_1,\dots,X_n$, Popp's volume is
\[ P = \frac{1}{\sqrt{\prod_{j=1}^m \det B_j}}\; \nu^1 \wedge \dots \wedge \nu^n. \tag{9.25} \]

9.4 Popp volume and isometries

In the last part of this chapter we discuss the conditions under which a local isometry preserves Popp's volume. In the Riemannian setting, an isometry is a diffeomorphism such that its differential is an isometry for the Riemannian metric. The concept is easily generalized to the sub-Riemannian case.

Definition 9.6. A (local) diffeomorphism $\varphi : M \to M$ is a (local) isometry if its differential $\varphi_* : TM \to TM$ preserves the sub-Riemannian structure $(D,\langle\cdot|\cdot\rangle)$, namely
i) $\varphi_*(D_q) = D_{\varphi(q)}$ for all $q \in M$,
ii) $\langle \varphi_* X | \varphi_* Y\rangle_{\varphi(q)} = \langle X | Y\rangle_q$ for all $q \in M$, $X,Y \in D_q$.

Remark 9.7. Condition i), which is trivial in the Riemannian case, is necessary to define isometries in the sub-Riemannian case. Actually, it also implies that all the higher order distributions are preserved by $\varphi_*$, i.e. $\varphi_*(D^i_q) = D^i_{\varphi(q)}$, for $1 \le i \le m$.

Definition 9.8. Let $M$ be a manifold equipped with a volume form $\mu \in \Omega^n(M)$. We say that a (local) diffeomorphism $\varphi : M \to M$ is a (local) volume preserving transformation if $\varphi^*\mu = \mu$.

In the Riemannian case, local isometries are also volume preserving transformations for the Riemannian volume. Then, it is natural to ask whether this is true also in the sub-Riemannian setting, for some choice of the volume. The next proposition states that the answer is positive if we choose Popp's volume.

Proposition 9.9. Sub-Riemannian (local) isometries are volume preserving transformations for Popp's volume.

Proposition 9.9 may be false for volumes different from Popp's one. We have the following.

Proposition 9.10. Let $\mathrm{Iso}(M)$ be the group of isometries of the sub-Riemannian manifold $M$. If $\mathrm{Iso}(M)$ acts transitively on $M$, then Popp's volume is the unique volume (up to multiplication by a scalar constant) such that Proposition 9.9 holds true.

Definition 9.11. Let $M$ be a Lie group. A sub-Riemannian structure $(M,D,\langle\cdot|\cdot\rangle)$ is left invariant if, for all $g \in M$, the left action $L_g : M \to M$ is an isometry.

As a trivial consequence of Proposition 9.9 we recover a well-known result (see again [23]).

Corollary 9.12. Let $(M,D,\langle\cdot|\cdot\rangle)$ be a left-invariant sub-Riemannian structure. Then Popp's volume is left invariant, i.e. $L_g^* P = P$ for every $g \in M$.

This section is devoted to the proof of Propositions 9.9 and 9.10.

Proof of Proposition 9.9

Let $\varphi \in \mathrm{Iso}(M)$ be a (local) isometry, and $1 \le i \le m$. The differential $\varphi_*$ induces a linear map
\[ \tilde\varphi_* : \otimes^i D_q \to \otimes^i D_{\varphi(q)}. \tag{9.26} \]
Moreover $\varphi_*$ preserves the flag $D \subset \dots \subset D^m$. Therefore, it induces a linear map
\[ \hat\varphi_* : D^i_q/D^{i-1}_q \to D^i_{\varphi(q)}/D^{i-1}_{\varphi(q)}. \tag{9.27} \]
The key to the proof of Proposition 9.9 is the following lemma.

Lemma 9.13. $\tilde\varphi_*$ and $\hat\varphi_*$ are isometries of inner product spaces.

Proof. The proof for $\tilde\varphi_*$ is trivial. The proof for $\hat\varphi_*$ is as follows. Remember that the inner product on $D^i/D^{i-1}$ is induced by the surjective maps $\pi_i : \otimes^i D \to D^i/D^{i-1}$ defined by Eq. (9.5). Namely, let $Y \in D^i_q/D^{i-1}_q$. Then
\[ \|Y\|_{D^i_q/D^{i-1}_q} = \min\{\|Z\|_{\otimes^i D_q} \ \text{s.t.}\ \pi_i(Z) = Y\}. \tag{9.28} \]
As a consequence of the properties of the Lie brackets, $\pi_i \circ \tilde\varphi_* = \hat\varphi_* \circ \pi_i$. Therefore
\[ \|Y\|_{D^i_q/D^{i-1}_q} = \min\{\|\tilde\varphi_* Z\|_{\otimes^i D_{\varphi(q)}} \ \text{s.t.}\ \pi_i(\tilde\varphi_* Z) = \hat\varphi_* Y\} = \|\hat\varphi_* Y\|_{D^i_{\varphi(q)}/D^{i-1}_{\varphi(q)}}. \tag{9.29} \]
By polarization, $\hat\varphi_*$ is an isometry.

Since $\mathrm{gr}_q(D) = \bigoplus_{i=1}^m D^i_q/D^{i-1}_q$ is an orthogonal direct sum, $\hat\varphi_* : \mathrm{gr}_q(D) \to \mathrm{gr}_{\varphi(q)}(D)$ is also an isometry of inner product spaces. Finally, Popp's volume is the canonical volume of $\mathrm{gr}_q(D)$ when the latter is identified with $T_qM$ through any choice of a local adapted frame. Since $\varphi_*$ is equal to $\hat\varphi_*$ under such an identification, and the latter is an isometry of inner product spaces, the result follows.

Proof of Proposition 9.10

Let $\mu$ be a volume form such that $\varphi^*\mu = \mu$ for any isometry $\varphi \in \mathrm{Iso}(M)$. There exists $f \in C^\infty(M)$, $f \neq 0$, such that $P = f\mu$. It follows that, for any $\varphi \in \mathrm{Iso}(M)$,
\[ f\mu = P = \varphi^* P = (f \circ \varphi)\,\varphi^*\mu = (f \circ \varphi)\,\mu, \tag{9.30} \]
where we used the $\mathrm{Iso}(M)$-invariance of Popp's volume. Then also $f$ is $\mathrm{Iso}(M)$-invariant, namely $\varphi^* f = f$ for any $\varphi \in \mathrm{Iso}(M)$. By hypothesis, the action of $\mathrm{Iso}(M)$ is transitive, hence $f$ is constant.

Hausdorff dimension and Hausdorff volume

Density of the Hausdorff volume with respect to a smooth volume

Bibliographical notes

Chapter 10 Regularity of the sub-Riemannian distance

In this chapter we focus our attention on the analytical properties of the sub-Riemannian squared distance from a fixed point. In particular we want to answer the following questions:
(i) What is the (minimal) regularity of $d^2$ that one can expect?
(ii) Is the sub-Riemannian squared distance $d^2$ smooth? If not, can we characterize smooth points?

10.1 General properties of the distance function

In this section we recall and collect some general properties of the sub-Riemannian distance and results related to it, some of which we already proved in the previous chapters. Let us consider a free sub-Riemannian structure $(M,U,f)$ where the vector fields $f_1,\dots,f_m$ define a generating family, i.e.
\[ f : U \to TM, \qquad f(u,q) = \sum_{i=1}^m u_i f_i(q). \]
Here $U$ is a trivial Euclidean bundle on $M$ of rank $m$.

Definition 10.1. Fix a point $q \in M$. The flag of the sub-Riemannian structure at the point $q$ is the sequence of subspaces $\{D^i_q\}_{i \in \mathbb{N}}$ defined by
\[ D^i_q := \mathrm{span}\{[f_{j_1},\dots,[f_{j_{l-1}},f_{j_l}]](q),\ \forall\, l \le i\}. \]
Notice that $D^1_q = D_q$ is the set of admissible directions. Moreover, by construction, $D^i_q \subset D^{i+1}_q$ for all $i \ge 1$.

The bracket generating assumption implies that
\[ \forall\, q \in M,\ \exists\, m(q) > 0 \ \text{s.t.}\ D^{m(q)}_q = T_qM, \]
and the smallest such $m(q)$ is called the step of the sub-Riemannian structure at $q$.
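To make the definition of the flag concrete, here is a minimal numerical sketch for the Grushin plane, with the assumed generating family $f_1 = \partial_x$, $f_2 = x\,\partial_y$ on $\mathbb{R}^2$. Lie brackets are approximated by central differences, which is an illustrative shortcut rather than a method used in the text; the function names (`bracket`, `rank`, `growth_vector`) are hypothetical. Only brackets of length at most 2 are evaluated, since nested finite differences lose accuracy quickly.

```python
def directional_derivative(field, q, v, h=1e-4):
    """Central-difference approximation of D(field)(q) applied to v."""
    qp = [qi + h * vi for qi, vi in zip(q, v)]
    qm = [qi - h * vi for qi, vi in zip(q, v)]
    return [(a - b) / (2 * h) for a, b in zip(field(qp), field(qm))]

def bracket(X, Y):
    """Lie bracket [X, Y] = DY X - DX Y, returned as a new vector field."""
    def Z(q):
        dYX = directional_derivative(Y, q, X(q))
        dXY = directional_derivative(X, q, Y(q))
        return [a - b for a, b in zip(dYX, dXY)]
    return Z

def rank(vectors, tol=1e-6):
    """Rank of a list of vectors by Gaussian elimination with partial pivoting."""
    rows = [list(v) for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        pivot = max(range(r, len(rows)), key=lambda i: abs(rows[i][col]), default=None)
        if pivot is None or abs(rows[pivot][col]) < tol:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def growth_vector(fields, q, max_len):
    """dim D^i_q for i = 1..max_len, spanning with brackets of length <= i."""
    layer, spanning, dims = list(fields), list(fields), []
    for _ in range(max_len):
        dims.append(rank([W(q) for W in spanning]))
        layer = [bracket(f, W) for f in fields for W in layer]
        spanning = spanning + layer
    return dims

# Assumed example: Grushin plane.
grushin = [lambda q: [1.0, 0.0], lambda q: [0.0, q[0]]]
on_singular_set = growth_vector(grushin, [0.0, 0.0], 2)
off_singular_set = growth_vector(grushin, [1.0, 0.0], 2)
print(on_singular_set, off_singular_set)
```

On the line $x = 0$ the family spans only one direction and the bracket $[f_1,f_2] = \partial_y$ is needed, so the step is 2 there, while at points with $x \neq 0$ the step is 1. This matches the upper semicontinuity of $q \mapsto m(q)$.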

Exercise 10.2. 1. Prove that the filtration defined by the subspaces $D^i_q$, for $i \ge 1$, is independent of the choice of a generating family (i.e., of the trivialization of $U$). 2. Show that $m(q)$ does not depend on the generating frame. Prove that the map $q \mapsto m(q)$ is upper semicontinuous.

In Chapter 8 we already proved that the sub-Riemannian distance is Hölder continuous. For the reader's convenience, we recall here the statement.

Proposition 10.3. For every $q \in M$ there exists a neighborhood $O_q$ such that, for all $q_0,q_1 \in O_q$ and for every coordinate map $\phi : O_q \to \mathbb{R}^n$,
\[ d(q_0,q_1) \le C\,|\phi(q_0) - \phi(q_1)|^{1/m}, \]
where $m = m(q)$ is the step of the sub-Riemannian structure at $q$.

In what follows we fix a point $q_0 \in M$ and $r_0 > 0$ such that $\bar B = \bar B_{q_0}(r_0)$ is a closed compact ball centered at $q_0$. Let us denote by $F = F_{q_0} : \mathcal{U} \to M$ the end-point map based at $q_0 \in M$, i.e., the map that associates to every control $u(\cdot) \in \mathcal{U} \subset L^2$ the end-point $q_u(1)$ of the solution associated to the control $u$ (we recall that $\mathcal{U}$ is the open set of $L^2$ such that the corresponding solution $q_u(\cdot)$ is defined on $[0,1]$). Denote by $\mathcal{B}$ the ball of radius $r_0$ in $L^2$ (where $r_0$ is chosen in such a way that the closure of $B_{q_0}(r_0)$ is compact). Notice that since $\bar B$ is compact, then $\mathcal{B} \subset \mathcal{U}$.

Proposition 10.4. $F|_{\mathcal{B}} : \mathcal{B} \to M$ is continuous in the weak topology. In other words, if $u_n \rightharpoonup u$ in the weak-$L^2$ topology, then $F(u_n) \to F(u)$.

Proof. Consider the solution of the problem
\[ \dot\gamma(t) = f_{u(t)}(\gamma(t)), \qquad \gamma(0) = q_0, \qquad u \in \mathcal{B}. \]
Since the ball $\bar B$ is compact, all trajectories are Lipschitzian with the same Lipschitz constant. In particular this set has compact closure in the $C^0$ topology. Assume now that $u_n \rightharpoonup u$ and consider the family of curves $\gamma_n(t)$ associated to $u_n$, which satisfy
\[ \gamma_n(t) = q_0 + \int_0^t f_{u_n(\tau)}(\gamma_n(\tau))\,d\tau. \]
By compactness there exists a subsequence, which we still denote $\gamma_n$, such that $\gamma_n \to \gamma$ uniformly, for some curve $\gamma$; in particular their endpoints converge. It remains to show that $\gamma$ is the trajectory associated to $u$.
Since $u_n \rightharpoonup u$ we have that $f_{u_n(t)}(\gamma_n(t)) \rightharpoonup f_{u(t)}(\gamma(t))$, being the product between strongly and weakly convergent sequences.$^1$ Taking the limit we find
\[ \gamma(t) = q_0 + \int_0^t f_{u(\tau)}(\gamma(\tau))\,d\tau, \]
i.e. $\gamma$ is the trajectory associated to $u$.

$^1$ One can write the coordinate expression $u^i_k\, f_i(q_k(t))$.
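The weak continuity of the end-point map can be observed numerically. The sketch below uses the Heisenberg system $\dot x = u_1$, $\dot y = u_2$, $\dot z = \tfrac12(x u_2 - y u_1)$ starting at the origin as an assumed example, with the oscillating controls $u_n(t) = (\cos 2\pi n t, \sin 2\pi n t)$, which converge weakly to $0$ in $L^2$ while $\|u_n\|_{L^2} = 1$ for every $n$. The RK4 integrator and the names `end_point` and `oscillating` are illustrative choices, not part of the text.

```python
import math

def end_point(control, steps=4000):
    """RK4 approximation of F(u) = q_u(1) for the Heisenberg system from the origin."""
    def rhs(t, q):
        u1, u2 = control(t)
        x, y, _ = q
        return (u1, u2, 0.5 * (x * u2 - y * u1))

    q, h = (0.0, 0.0, 0.0), 1.0 / steps
    for s in range(steps):
        t = s * h
        k1 = rhs(t, q)
        k2 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(q, k1)))
        k3 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(q, k2)))
        k4 = rhs(t + h, tuple(a + h * b for a, b in zip(q, k3)))
        q = tuple(a + h / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
                  for a, b1, b2, b3, b4 in zip(q, k1, k2, k3, k4))
    return q

def oscillating(n):
    """Control u_n(t) = (cos 2 pi n t, sin 2 pi n t); weakly convergent to 0 as n grows."""
    return lambda t: (math.cos(2 * math.pi * n * t), math.sin(2 * math.pi * n * t))

endpoints = {n: end_point(oscillating(n)) for n in (1, 4, 16)}
for n, q in endpoints.items():
    print(n, q)
```

For this family a direct integration gives $x(1) = y(1) = 0$ and $z(1) = 1/(4\pi n)$, so $F(u_n) \to F(0)$ even though $u_n$ does not converge strongly. The strict inequality $0 = \|u\| < \liminf \|u_n\| = 1$ also illustrates the lower semicontinuity of the $L^2$ norm under weak convergence.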

Remark 10.5. Actually we proved that all trajectories converge uniformly, and not only their endpoints.

The previous proposition gives another proof of the existence of minimizers.

Corollary 10.6 (Existence of minimizers). For any $q \in B_{q_0}(r_0)$ there exists $u$ (with $\|u\| \le r_0$) that joins $q_0$ and $q$ and is a minimizer, i.e. $\|u\| = d(q_0,q)$.

Proof. Consider a point $q$ in the compact ball $\bar B$. Then take a minimizing sequence $u_n$ such that $F(u_n) = q$ and $\|u_n\| \to d(q_0,q)$. The sequence $(\|u_n\|)_n$ is bounded, hence by weak compactness of balls in $L^2$ there exists a subsequence, which we still call $u_n$, such that $u_n \rightharpoonup u$ for some $u$. By continuity $F(u) = q$. Moreover the semicontinuity of the $L^2$ norm proves that $u$ corresponds to a minimizer joining $q_0$ to $q$, since
\[ \|u\| \le \liminf_{n\to\infty} \|u_n\| = d(q_0,q). \]

Definition 10.7. A control $u$ is called a minimizer if it satisfies $J(u) = \tfrac12 d^2(q_0,F(u))$. Notice that in this case we have $\|u\| = d(q_0,F(u))$. We denote by $\mathcal{M} \subset L^2$ the set of all minimizing controls.

Theorem 10.8 (Compactness). Let $K \subset M$ be compact. The set of all minimal controls associated with trajectories reaching $K$,
\[ \mathcal{M}_K = \{u \in \mathcal{M} \mid F(u) \in K\}, \]
is compact in the strong $L^2$ topology.

Proof. Consider a sequence $u_n \in \mathcal{M}_K$. Since $K$ is compact, the sequence $u_n$ is bounded. Since bounded sets in $L^2$ are weakly compact, we can assume that $u_n \rightharpoonup u$. Let us show that we also have $u_n \to u$ strongly. From Proposition 10.4 it follows that $F(u_n) \to F(u)$ in $M$, and the continuity of the distance implies $d(q_0,F(u_n)) \to d(q_0,F(u))$. Moreover, since $u_n \in \mathcal{M}$ we have that $\|u_n\| = d(q_0,F(u_n))$, and by weak semicontinuity of the $L^2$ norm we get
\[ \|u\| \le \liminf_{n\to\infty} \|u_n\| = \liminf_{n\to\infty} d(q_0,F(u_n)) = d(q_0,F(u)). \]
Since $\|u\| \ge d(q_0,F(u))$ holds for every control, we obtain $\|u\| = d(q_0,F(u)) = \lim_n \|u_n\|$. Weak convergence together with convergence of the norms gives $u_n \to u$ strongly in $L^2$, and $u \in \mathcal{M}$.

10.2 Regularity of the squared distance

In this section we fix once and for all a point $q_0 \in M$ and a closed ball $\bar B = \bar B_{q_0}(r_0)$ such that $\bar B$ is compact. In particular, for each $q \in \bar B$ there exists a minimizer joining $q_0$ and $q$ (see Corollary 10.6). In what follows we denote by $f$ one half of the squared distance from $q_0$:
\[ f(\cdot) = \tfrac12\, d^2(q_0,\cdot). \tag{10.1} \]
The main result of this chapter is the following.

Theorem 10.9. The function $f|_B : B \to \mathbb{R}$ is smooth on an open dense subset of $B$.

In the case of complete sub-Riemannian structures, since balls are compact for all radii, we have immediately the following corollary.

Corollary 10.10. Assume that $M$ is a complete sub-Riemannian manifold and $q_0 \in M$. Then $f$ is smooth on an open and dense subset of $M$.

We start by looking for necessary conditions for $f$ to be $C^\infty$ around a point.

Proposition 10.11. Let $q \in B$ and assume that $f$ is $C^\infty$ at $q$. Then
(i) there exists a unique length minimizer $\gamma$ joining $q_0$ with $q$. Moreover $\gamma$ is not abnormal and not conjugate.
(ii) $d_q f = \lambda_1$, where $\lambda_1$ is the final covector of the normal lift of $\gamma$.

Proof. Under the above assumptions the functional
\[ \Psi : v \mapsto J(v) - f(F(v)), \qquad v \in L^\infty([0,T],\mathbb{R}^k), \tag{10.2} \]
is smooth and non negative. For every optimal trajectory $\gamma$, associated with the control $u$, that connects $q_0$ with $q$ in time 1, one has
\[ 0 = d_u\Psi = d_uJ - d_qf \circ D_uF. \tag{10.3} \]
Thus, $\gamma$ is a normal extremal trajectory, with Lagrange multiplier $\lambda_1 = d_qf$. By Theorem 4.24, we can recover $\gamma$ by the formula $\gamma(t) = \pi \circ e^{(t-1)\vec H}(\lambda_1)$. Then, $\gamma$ is the unique minimizer of $J$ connecting its endpoints, and is normal.

Next we show that $\gamma$ is not abnormal and not conjugate. For $y$ in a neighbourhood $O_q$ of $q$, let us consider the map
\[ \Phi : O_q \to T^*_{q_0}M, \qquad \Phi(y) = e^{-\vec H}(d_yf). \tag{10.4} \]
The map $\Phi$, by construction, is a smooth right inverse for the exponential map, since
\[ E(\Phi(y)) = \pi \circ e^{\vec H}(e^{-\vec H}(d_yf)) = \pi(d_yf) = y. \tag{10.5} \]
This implies that $q$ is a regular value for the exponential map and, a fortiori, that $u$ is a regular point for the end-point map. This proves that $u$ corresponds to a trajectory that is at the same time strictly normal and not conjugate.

Remark 10.12. Notice that from the proof it follows that if we only assume that $f$ is differentiable at $q$, we can still conclude that there exists a unique minimizer $\gamma$ joining $q_0$ to $q$, and it is normal.

Before going further in the study of the smoothness properties of the distance function, we are already able to prove an important corollary of this result.

Denote, for $r > 0$, by $S_r := f^{-1}(r^2/2)$ the sub-Riemannian sphere of radius $r$ centered at $q_0$.

Corollary 10.13. Assume that $D_{q_0} \neq T_{q_0}M$. For every $r \le r_0$, the sphere $S_r$ contains a non smooth point of the function $f$.

Proof. Since $r \le r_0$, the sphere $S_r$ is non empty and contained in a compact ball. Assume, by contradiction, that $f$ is smooth at every point of $S_r$. Then $S_r$ is a level set of $f$ and $d_qf \neq 0$ for every $q \in S_r$ (since $d_qf$ is the nonzero covector attached at the final point of a geodesic, see Proposition 10.11). It follows that $S_r$ is a smooth submanifold of dimension $n-1$, without boundary. Moreover, being the level set of a continuous function, $S_r$ is closed, hence compact. Let us consider the map
\[ \Phi : S_r \to T^*_{q_0}M, \qquad \Phi(q) = e^{-\vec H}(d_qf). \]
By assumption $f$ is smooth, hence $\Phi$ is a smooth right inverse of the exponential map (see also (10.5)). In particular the differential of $\Phi$ is injective at every point. Moreover $H(\Phi(q)) = r^2/2$, since $H$ is constant along the Hamiltonian flow and $H(d_qf) = f(q) = r^2/2$ for every $q \in S_r$. It follows that $\Phi$ actually defines a smooth immersion
\[ \Phi : S_r \to H^{-1}(r^2/2) \cap T^*_{q_0}M \tag{10.6} \]
of the sphere $S_r$ into the set
\[ C_r := H^{-1}(r^2/2) \cap T^*_{q_0}M = \Big\{\lambda \in T^*_{q_0}M : \frac12 \sum_{i=1}^k \langle\lambda,f_i(q_0)\rangle^2 = \frac{r^2}{2}\Big\}. \]
Notice that $C_r$ is a smooth, connected and non compact $(n-1)$-dimensional submanifold of the fiber $T^*_{q_0}M$, indeed diffeomorphic to the cylinder $S^{k-1} \times \mathbb{R}^{n-k}$ (here $k = \dim D_{q_0} < n$ is the rank of the structure at the point $q_0$). By continuity of $\Phi$ and compactness of $S_r$, the image $\Phi(S_r)$ is closed in $C_r$. Moreover, since an immersion between manifolds of the same dimension is a local diffeomorphism and $\dim S_r = \dim C_r$, the set $\Phi(S_r)$ is also open in $C_r$. Since $C_r$ is connected and $\Phi(S_r)$ is non empty, open and closed, we get $\Phi(S_r) = C_r$. This is a contradiction since, by continuity, $\Phi(S_r)$ is compact, while $C_r$ is not.

Next we go back to the proof of the main result. Recall that $q_0 \in M$ is fixed and $f$ is one half of the distance squared from $q_0$. After Proposition 10.11, it is natural to introduce the following definition.

Definition 10.14. Fix a point $q_0 \in M$. The set of smooth points from $q_0$ is the set $\Sigma \subset M$ of points $q \in M$ such that there exists a unique length-minimizer $\gamma$ joining $q_0$ to $q$, and this minimizer is strictly normal and not conjugate.

From the proof of Proposition 10.11 (see also Remark 10.12) it follows that if the squared distance $f$ from $q_0$ is smooth at $q$, then $q \in \Sigma$. The name smooth point of $f$ is justified by the following theorem.

Theorem 10.15. The set $\Sigma$ is open and dense in $B$. Moreover $f$ is smooth at every point of $\Sigma$.

Proof. We divide the proof into three parts: (a) the set $\Sigma$ is open, (b) the function $f$ is smooth in a neighborhood of every point of $\Sigma$, (c) the set $\Sigma$ is dense in $B$.

(a). To prove that $\Sigma$ is open we have to show that for every $q \in \Sigma$ there exists a neighborhood $O_q$ of $q$ such that every $q' \in O_q$ is also in $\Sigma$. Let us start by proving the following claim: there exists a neighborhood of $q$ in $B$ such that every point in this neighborhood is reached by exactly one minimizer.

By contradiction, if this property is not true, there exists a sequence $q_n$ of points in $B$ converging to $q$ such that (at least) two minimizers $\gamma_n$ and $\gamma'_n$ join $q_0$ and $q_n$. Let us denote by $u_n$ and $v_n$ the corresponding minimizing controls. By Theorem 10.8, the set of controls associated with minimizers whose endpoint is in the compact ball $\bar B$ is compact in $L^2$ (w.r.t. the strong topology). Then there exist, up to considering a subsequence, two controls $u,v$ such that $u_n \to u$ and $v_n \to v$. Moreover, the limits $u$ and $v$ are both minimizers and join $q_0$ with $q$. Since by assumption there is a unique minimizer $\gamma$ joining $q_0$ with $q$, it follows that $u = v$ is the corresponding control.

By smoothness of the end-point map, both $D_{u_n}F$ and $D_{v_n}F$ tend to $D_uF$, which has full rank ($u$ is strictly normal, hence it is not a critical point for $F$). Hence, for $n$ big enough, both $D_{u_n}F$ and $D_{v_n}F$ are surjective, i.e., $u_n$ and $v_n$ are strictly normal, and we can build the sequences $\lambda^n_1$ and $\xi^n_1$ of corresponding final covectors in $T^*_{q_n}M$ satisfying
\[ \lambda^n_1 D_{u_n}F = u_n, \qquad \xi^n_1 D_{v_n}F = v_n. \]
These relations can be rewritten in terms of the adjoint linear maps:
\[ (D_{u_n}F)^*\lambda^n_1 = u_n, \qquad (D_{v_n}F)^*\xi^n_1 = v_n. \]
Since both $(D_{u_n}F)^*$ and $(D_{v_n}F)^*$ are families of injective linear maps converging to $(D_uF)^*$, and $u_n,v_n \to u$, it follows that the corresponding (unique) solutions $\lambda^n_1$ and $\xi^n_1$ also converge to the solution of the limit problem $(D_uF)^*\lambda_1 = u$, i.e., both converge to the final covector $\lambda_1$ corresponding to $\gamma$. By using the flow defined by the corresponding controls, we can deduce the convergence of the sequences $\lambda^n_0$ and $\xi^n_0$ of the initial covectors associated to $u_n$ and $v_n$ to the unique initial covector $\lambda_0$ corresponding to $\gamma$.

Finally, since $\lambda_0$ by assumption is a regular point of the exponential map, i.e., the unique minimizer $\gamma$ joining $q_0$ to $q$ is not conjugate, it follows that the exponential map is invertible in a neighborhood $V_{\lambda_0}$ of $\lambda_0$ onto its image $O_q := E(V_{\lambda_0})$, which is a neighborhood of $q$. For $n$ big enough both $\lambda^n_0$ and $\xi^n_0$ belong to $V_{\lambda_0}$ and have the same image $q_n$, hence they coincide, which contradicts the existence of two distinct minimizers. In particular this proves our initial claim. More precisely, we have proved that for every point $q' \in O_q$ there exists a unique minimizer joining $q_0$ to $q'$, whose initial covector $\lambda' \in V_{\lambda_0}$ is a regular point of the exponential map. This implies that every $q' \in O_q$ is a smooth point, and $\Sigma$ is open.

(b). Now we prove that $f$ is smooth in a neighborhood of each point $q \in \Sigma$. From part (a) of the proof it follows that if $q \in \Sigma$ there exist a neighborhood $V_{\lambda_0}$ of $\lambda_0$ and a neighborhood $O_q$ of $q$ such that $E|_{V_{\lambda_0}} : V_{\lambda_0} \to O_q$ is a smooth invertible map. Denote by $\Phi : O_q \to V_{\lambda_0}$ its smooth inverse. Since for every $q' \in O_q$ there is only one minimizer joining $q_0$ to $q'$, with initial covector $\Phi(q')$, it follows that
\[ f(q') = \tfrac12\, d^2(q_0,q') = H(\Phi(q')), \]
which is a composition of smooth functions, hence smooth.

(c). Our next goal is to show that $\Sigma$ is a dense set in $B$. We start with a preliminary definition.

201 Definition A point q B is said to be (i) a fair point if there exists a unique normal minimizer trajectory joining q to q. (ii) a good point if it is a fair point and the unique minimizer is strictly normal. We denote by Σ f and Σ g the set of fair and good points, respectively. We stress that a fair point can be reached by a unique minimizer that is both normal and abnormal. From the definition it is immediate that Σ Σ g Σ f. The proof of (c) relies on the following four steps: (c1) Σ f is a dense set in B, (c2) Σ g is a dense set in B, (c3) f is Lipschitz in a neighborhood of every point of Σ g, (c4) Σ is a dense set in B. (c1). Fix an open set O B and let us show that Σ f O. Consider a smooth function a : O R such that a 1 ([s,+ [) is compact for every s R. Then consider the function ψ : O R, ψ(q) = f(q) a(q) The function ψ is continuous on O and, since f is nonnegative, the set ψ 1 (],s[) are compact for every s R due to the assumption on a. It follows that ψ attains its minimum at some point q 1 O. Define a control u 1 associated with a minimizer γ joining q and F(u 1 ) = q 1. Since J(u) f(f(u)) for every u, it is easy to see that the map Φ : U R, Φ(u) = J(u) a(f(u)) attains its minimum at u 1. In particular it holds = D u1 Φ = u 1 (d q1 a)d u1 F. The last identity implies that u 1 is normal and λ 1 = d q1 a is the final covector associated with the trajectory. By Theorem 4.24, the corresponding trajectory γ is uniquely recovered by the formula γ(t) = π e (t 1) H (d q1 a). In particular γ is the uniqueminimizer joining q to q 1 O, and is normal, i.e. q 1 Σ f O. Remark In the Riemannian case Σ f = Σ g since there are no abnormal extremal. (c2). As in the proof of (c1), we shall prove that Σ g O for any open O B. By (c1) the set Σ f O is nonempty. For any q Σ f O we can define rankq := rankd u F, where u is the control associated to the unique minimizer γ joining q to q. 
To prove (c2) it is sufficient to prove that there exists a point $q \in \Sigma_f \cap O$ such that $\operatorname{rank} q = n$ (i.e., $D_u F$ is surjective, where $u$ is the control associated to the unique minimizer joining $q_0$ and $q$). Assume by contradiction that
$$k_O := \max_{q \in \Sigma_f \cap O} \operatorname{rank} q < n,$$
and consider a point $\hat q$ where the maximum is attained, i.e., such that $\operatorname{rank} \hat q = k_O$.

We claim that all points of $\Sigma_f \cap O$ that are sufficiently close to $\hat q$ have the same rank (we stress that the existence of points in $\Sigma_f \cap O$ arbitrarily close to $\hat q$ is also guaranteed by (c1)). Assume that the claim is not true, i.e., there exists a sequence of points $q_n \in \Sigma_f \cap O$ such that $q_n \to \hat q$ and $\operatorname{rank} q_n \le k_O - 1$. Reasoning as in the proof of (a), using uniqueness and compactness of the minimizers, one can prove that the sequence of controls $u_n$ associated to the unique minimizers joining $q_0$ to $q_n$ satisfies $u_n \to \hat u$ strongly in $L^2$, where $\hat u$ is the control associated to the unique minimizer joining $q_0$ with $\hat q$. By smoothness of the end-point map $F$ it follows that $D_{u_n} F \to D_{\hat u} F$ which, by semicontinuity of the rank, implies the contradiction
$$\operatorname{rank} \hat q = \operatorname{rank} D_{\hat u} F \le \liminf_{n \to \infty} \operatorname{rank} D_{u_n} F \le k_O - 1.$$
Thus, without loss of generality, we can assume that $\operatorname{rank} q = k_O < n$ for every $q \in \Sigma_f \cap O$ (maybe by restricting our neighborhood $O$). We introduce the following set
$$\Pi_q = e^{-\vec{H}}\{\xi \in T^*_q M \mid \xi D_u F = \lambda_1 D_u F\} \subset T^*_{q_0} M.$$
The set $\Pi_q$ is the set of initial covectors $\lambda_0 \in T^*_{q_0} M$ whose image via the exponential map is the point $q$.

Lemma. $\Pi_q$ is an affine subset of $T^*_{q_0} M$ such that $\dim \Pi_q = n - k_O$. Moreover the map $q \mapsto \Pi_q$ is continuous.

Proof. It is easy to check that the set $\widetilde{\Pi}_q = \{\xi \in T^*_q M \mid \xi D_u F = \lambda_1 D_u F\}$ is an affine subspace of $T^*_q M$. Indeed $\xi \in \widetilde{\Pi}_q$ if and only if $(D_u F)^*(\xi - \lambda_1) = 0$, that is
$$\widetilde{\Pi}_q = \{\xi \in T^*_q M \mid \xi D_u F = \lambda_1 D_u F\} = \lambda_1 + \operatorname{Ker}(D_u F)^*.$$
Moreover $\dim \operatorname{Ker}(D_u F)^* = n - \dim \operatorname{Im} D_u F = n - k_O$. Since all elements $\xi \in \widetilde{\Pi}_q$ are associated with the same control $u$, we have that
$$\Pi_q = e^{-\vec{H}}(\widetilde{\Pi}_q) = P^*_{0,t}(\widetilde{\Pi}_q),$$
hence $\Pi_q$ is an affine subspace of $T^*_{q_0} M$.

Let us now show that the map $q \mapsto \Pi_q$ is continuous on $\Sigma_f \cap O$. Consider a sequence of points $q_n$ in $\Sigma_f \cap O$ such that $q_n \to q \in \Sigma_f \cap O$. Let $u_n$ (resp. $u$) be the unique control associated with the minimizing trajectory joining $q_0$ and $q_n$ (resp. $q$). By the uniqueness-compactness argument already used in the previous part of the proof we have that $u_n \to u$ strongly and moreover $D_{u_n} F \to D_u F$.
Since $\operatorname{rank} D_{u_n} F$ is constant, it follows that $\operatorname{Ker}(D_{u_n} F)^* \to \operatorname{Ker}(D_u F)^*$ as subspaces.

Consider now a $k_O$-dimensional ball $A \subset T^*_{q_0} M$ that contains $\lambda_0 = e^{-\vec{H}}(\lambda_1)$ and is transversal to $\Pi_{\hat q}$. By continuity $A$ is transversal also to $\Pi_q$, for $q \in \Sigma_f \cap O$ close to $\hat q$. In particular $\Pi_q \cap A \neq \emptyset$. Since $E(\Pi_q) = q$, this implies that $\Sigma_f \cap O \subset E(A)$. By (c1), $\Sigma_f \cap O$ is a dense set, hence $E(A)$ is also dense in $O$. On the other hand, since $E$ is a smooth map and $A$ is a compact ball of positive codimension ($k_O < n$), by the Sard Lemma it follows that $E(A)$ is a closed dense set of $O$ that has measure zero, which is a contradiction.

(c3). The proof of this claim relies on the following result, which is of independent interest.

Theorem. Let $K \subset B$ be a compact set in our ball such that any minimizer connecting $q_0$ to $q \in K$ is strictly normal. Then $f$ is Lipschitz on $K$.

Proof of Theorem. Let us first notice that, since $K$ is compact, it is sufficient to show that $f$ is locally Lipschitz on $K$. Fix a point $q \in K$ and some control $u$ associated with a minimizer joining $q_0$ and $q$ (it may be not unique). By our assumptions $D_u F$ is surjective, since $u$ is strictly normal. Thus, by the inverse function theorem, there exist neighborhoods $V$ of $u$ in $U$ and $O_q$ of $q$ in $K$, together with a smooth map $\Phi : O_q \to V$ that is a local right inverse for the end-point map, namely $F(\Phi(q')) = q'$ for all $q' \in O_q$ (see also Theorem 2.47).

Fix then local coordinates around $q$. Since $\Phi$ is smooth, there exist $R > 0$ and $C_0 > 0$ such that
$$B_q(C_0 r) \subset F(B_u(r)), \qquad \forall\, 0 \le r < R, \qquad (1.7)$$
where $B_u(r)$ is the ball of radius $r$ in $L^2$ and $B_q(r)$ is the ball of radius $r$ in coordinates on $M$. Let us also observe that, since $J$ is smooth, there exists $C_1 > 0$ such that for every $u, u' \in B_u(R)$ one has
$$|J(u') - J(u)| \le C_1 \|u' - u\|_2.$$
Pick then any point $q' \in K$ such that $|q' - q| = C_0 r$, with $0 \le r \le R$. By (1.7), there exists $u' \in B_u(R)$ with $\|u - u'\|_2 \le r$ such that $F(u') = q'$. Using that $f(q') \le J(u')$ and $f(q) = J(u)$, since $u$ is a minimizer, we have
$$f(q') - f(q) \le J(u') - J(u) \le C_1 \|u' - u\|_2 \le C' |q' - q|,$$
where $C' = C_1 / C_0$. Notice that the above inequality is true for all $q'$ such that $|q - q'| \le C_0 R$.

Since $K$ is compact, and the set of controls $u$ associated with minimizers that reach the compact set $K$ is also compact, the constants $R > 0$ and $C_0, C_1$ can be chosen uniformly with respect to $q \in K$. Hence we can exchange the role of $q'$ and $q$ in the above reasoning and get
$$|f(q') - f(q)| \le C' |q - q'|,$$
for every pair of points $q, q'$ such that $|q - q'| \le C_0 R$.

To end the proof of (c3) it is sufficient to show that if $q \in \Sigma_g$ there exists a (compact) neighborhood $O_q$ of $q$ such that every point in $O_q$ is reached by only strictly normal minimizers (we stress that no uniqueness is required here). By contradiction, assume that the claim is not true. Then there exist a sequence of points $q_n$ converging to $q$ and a choice of controls $u_n$, such that the corresponding minimizers are abnormal.
By compactness of minimizers there exists $u$ such that $u_n \to u$, and by uniqueness of the limit $u$ is abnormal for the point $q$, which is a contradiction.

(c4). We have to prove that $\Sigma \cap O$ is nonempty for every open neighborhood $O$ in $B$. By (c3) we can choose $q \in \Sigma_g \cap O$ and fix a neighborhood $O' \subset O$ of $q$ such that $f$ is Lipschitz on $O'$. It is then sufficient to show that $\Sigma \cap O' \neq \emptyset$. By Proposition 1.11 (see also Remark 1.12) every differentiability point of $f$ is reached by a unique minimizer that is normal, hence is a fair point. Since we know that $f$ is Lipschitz on $O'$, it follows by the Rademacher Theorem that almost every point of $O'$ is fair, namely $\operatorname{meas}(\Sigma_f \cap O') = \operatorname{meas}(O')$. Let us also notice that the set $\Sigma_f \cap O'$ of fair points of $O'$ is also contained in the image of the exponential map. Thanks to the Sard Lemma, the set of regular values of the exponential map in

$O'$ is also a set of full measure in $O'$. Since by definition a point in $\Sigma_f$ that is a regular value for the exponential map is in $\Sigma$, this implies that $\operatorname{meas}(\Sigma \cap O') = \operatorname{meas}(\Sigma_f \cap O') = \operatorname{meas}(O')$. This in particular proves that $\Sigma \cap O'$ is not empty.

As a corollary of this result we can prove that if there are no abnormal minimizers, then the set of smooth points has full measure.

Corollary 1.2. Assume that $M$ is a complete sub-Riemannian structure and that there are no abnormal minimizers. Then $\operatorname{meas}(M \setminus \Sigma) = 0$.

This result is not known in general, and it is indeed a main open problem of sub-Riemannian geometry to establish whether Corollary 1.2 remains true in the presence of abnormal minimizers. We stress that the assumptions of the theorem are satisfied in the case of a Riemannian structure. Indeed in this case, following the same arguments of the proof, we have the following result.

Proposition. Let $M$ be a sub-Riemannian structure that is Riemannian at $q_0$, i.e., such that $\dim \mathcal{D}_{q_0} = \dim M$. Then there exists a neighborhood $O_{q_0}$ of $q_0$ such that $f$ is smooth on $O_{q_0}$.

1.3 Locally Lipschitz functions and maps

If $S$ is a subset of a vector space $V$, we denote by $\operatorname{conv}(S)$ the convex hull of $S$, that is the smallest convex set containing $S$. It is characterized as the set of $v \in V$ such that there exists a finite number of elements $v_0,\dots,v_\ell \in S$ such that
$$v = \sum_{i=0}^{\ell} \lambda_i v_i, \qquad \lambda_i \ge 0, \qquad \sum_{i=0}^{\ell} \lambda_i = 1.$$
If $\phi : M \to \mathbb{R}$ is a function defined on a smooth manifold $M$, we say that $\phi$ is locally Lipschitz if $\phi$ is locally Lipschitz in any coordinate chart, as a function defined on $\mathbb{R}^n$. The classical Rademacher theorem implies that a locally Lipschitz function $\phi : M \to \mathbb{R}$ is differentiable almost everywhere. Still we can introduce a weak notion of differential that is defined at every point.

If $\phi : M \to \mathbb{R}$ is locally Lipschitz, any point $q \in M$ is the limit of differentiability points. In what follows, whenever we write $d_q \phi$, it is implicitly understood that $q \in M$ is a differentiability point of $\phi$.
Definition. Let $\phi : M \to \mathbb{R}$ be a locally Lipschitz function. The (Clarke) generalized differential of $\phi$ at the point $q \in M$ is the set
$$\partial_q \phi := \operatorname{conv}\{\xi \in T^*_q M \mid \xi = \lim_{q_n \to q} d_{q_n} \phi\}. \qquad (1.8)$$
Notice that, by definition, $\partial_q \phi$ is a subset of $T^*_q M$. It is closed by definition and bounded since the function is locally Lipschitz, hence compact.

Exercise. (i). Show that the mapping $q \mapsto \partial_q \phi$ is upper semicontinuous in the following sense: if $q_n \to q$ in $M$ and $\xi_n \to \xi$ in $T^*M$, where $\xi_n \in \partial_{q_n} \phi$, then $\xi \in \partial_q \phi$. (ii). We say that $q$ is regular for $\phi$ if $0 \notin \partial_q \phi$. Prove that the set of regular points for $\phi$ is open in $M$.
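As a numerical illustration of definition (1.8) in dimension one, the following Python sketch approximates the generalized differential at a point by the interval spanned by finite-difference derivatives taken at nearby points. It assumes the function is differentiable everywhere near the point except possibly at the point itself; the names `derivative` and `clarke_interval`, the sampling radius and the step size are illustrative choices, not part of the text.

```python
def derivative(phi, x, h=1e-7):
    """Central finite difference; reliable when phi is smooth on [x - h, x + h]."""
    return (phi(x + h) - phi(x - h)) / (2.0 * h)


def clarke_interval(phi, q, radius=1e-3, samples=200):
    """Approximate the 1D Clarke differential at q by [min, max] of nearby slopes.

    Assumption: phi is differentiable at every sample point. The samples sit on
    both sides of q, never at q, and stay farther from q than the step h, so
    each finite difference sees only one smooth branch of phi.
    """
    slopes = []
    for k in range(1, samples + 1):
        offset = radius * k / samples
        for x in (q - offset, q + offset):
            slopes.append(derivative(phi, x))
    # In dimension one the convex hull of a set of slopes is an interval.
    return min(slopes), max(slopes)


def kinked(x):
    """Piecewise linear function with slope 1 on the left of 0 and slope 2 on the right."""
    return x if x < 0 else 2.0 * x


abs_lo, abs_hi = clarke_interval(abs, 0.0)        # close to (-1, 1)
kink_lo, kink_hi = clarke_interval(kinked, 0.0)   # close to (1, 2)
```

In this sketch the point 0 is regular for `kinked` (the computed interval stays away from zero) but not for the absolute value, whose interval contains zero.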

From the very definition of generalized differential we have the following result.

Lemma. Let $\phi : M \to \mathbb{R}$ be a locally Lipschitz function and $q \in M$. The following are equivalent:
(i) $\partial_q \phi = \{\xi\}$ is a singleton,
(ii) $d_q \phi = \xi$ and the map $x \mapsto d_x \phi$ is continuous at $q$, i.e., for every sequence of differentiability points $q_n \to q$ we have $d_{q_n} \phi \to d_q \phi$.

Remark. Let $A$ be a subset of $\mathbb{R}^n$ of measure zero and consider the set of half-lines $L_v = \{q + tv,\ t \ge 0\}$ emanating from $q$ and parametrized by $v \in S^{n-1}$. It follows from Fubini's theorem that for almost every $v \in S^{n-1}$ the one-dimensional measure of the intersection $A \cap L_v$ is zero. If we apply this fact to the case when $A$ is the set at which a locally Lipschitz function $\phi : \mathbb{R}^n \to \mathbb{R}$ fails to be differentiable, we deduce that for almost all $v \in S^{n-1}$, the function $t \mapsto \phi(q + tv)$ is differentiable for a.e. $t \ge 0$.

Example. Let $\phi : \mathbb{R} \to \mathbb{R}$ be defined by
(i) $\phi(x) = |x|$. Then $\partial_0 \phi = [-1,1]$,
(ii) $\phi(x) = x$ if $x < 0$ and $\phi(x) = 2x$ if $x \ge 0$. In this case $\partial_0 \phi = [1,2]$.
In particular, in the first example $0$ is a minimum for $\phi$ and $0 \in \partial_0 \phi$. In the second case the function is locally invertible near the origin and $\partial_0 \phi$ is separated from zero. In what follows we will prove that these facts correspond to general results (cf. Proposition 1.3 and Theorem 1.34).

The following is a classical hyperplane separation theorem for closed convex sets in $\mathbb{R}^n$.

Lemma. Let $K$ and $C$ be two disjoint, closed, convex sets in $\mathbb{R}^n$, and suppose that $K$ is compact. Then there exist $\varepsilon > 0$ and a vector $v \in S^{n-1}$ such that
$$\langle x, v \rangle > \langle y, v \rangle + \varepsilon, \qquad \forall\, x \in K,\ \forall\, y \in C. \qquad (1.9)$$

We also recall here another useful result from convex analysis.

Lemma 1.28 (Carathéodory). Let $S \subset \mathbb{R}^n$ and $x \in \operatorname{conv}(S)$. Then there exist $x_0,\dots,x_n \in S$ such that $x \in \operatorname{conv}\{x_0,\dots,x_n\}$.

The notion of generalized gradient permits us to extend some classical properties of critical points of smooth functions.

Proposition. Let $\phi : M \to \mathbb{R}$ be locally Lipschitz and $q$ be a local minimum for $\phi$. Then $0 \in \partial_q \phi$.

Proof. Since the claim is a local property we can assume without loss of generality that $M = \mathbb{R}^n$.
As usual we will identify vectors and covectors with elements of $\mathbb{R}^n$, and the duality between covectors and vectors is given by the Euclidean scalar product, that we still denote $\langle \cdot, \cdot \rangle$. Assume by contradiction that $0 \notin \partial_q \phi$ and let us show that $q$ cannot be a minimum for $\phi$. To this aim, we prove that there exists a direction $w$ in $S^{n-1}$ such that the scalar map $t \mapsto \phi(q + tw)$ has no minimum at $t = 0$.

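The set $\partial_q \phi$ is a compact convex set that does not contain the origin, hence by Lemma 1.27 there exist $\varepsilon > 0$ and $v \in S^{n-1}$ such that
$$\langle \xi, v \rangle < -\varepsilon, \qquad \forall\, \xi \in \partial_q \phi.$$
By definition of generalized differential, one can find open neighborhoods $O_q$ of $q$ in $\mathbb{R}^n$ and $V_v$ of $v$ in $S^{n-1}$ such that for all differentiability points $q' \in O_q$ of $\phi$ one has
$$\langle d_{q'} \phi, v' \rangle \le -\varepsilon/2, \qquad \forall\, v' \in V_v.$$
Fix $q' \in O_q$ where $\phi$ is differentiable and a vector $w \in V_v$ such that the set of differentiability points on the line $\{q + tw\}$ has full measure (cf. Remark 1.25). Then we can compute, for $t > 0$,
$$\phi(q + tw) - \phi(q) = \int_0^t \langle d_{q+sw} \phi, w \rangle\, ds \le -\varepsilon t/2.$$
Thus $\phi$ cannot have a minimum at $q$.

The following proposition gives an estimate for the generalized differential of a special class of functions.

Proposition 1.3. Let $\phi_\omega : M \to \mathbb{R}$ be a family of $C^1$ functions, with $\omega \in \Omega$ a compact set. Assume that the following maps are continuous:
$$(\omega, q) \mapsto \phi_\omega(q), \qquad (\omega, q) \mapsto d_q \phi_\omega.$$
Then the function $a(q) := \min_{\omega \in \Omega} \phi_\omega(q)$ is locally Lipschitz on $M$ and
$$\partial_q a \subset \operatorname{conv}\{d_q \phi_\omega \mid \forall\, \omega \in \Omega \text{ s.t. } \phi_\omega(q) = a(q)\}. \qquad (1.1)$$

Proof. As in the proof of Proposition 1.29 we can assume that $M = \mathbb{R}^n$. Notice that, if we denote by $\Omega_q = \{\omega \in \Omega,\ \phi_\omega(q) = a(q)\}$, we have by compactness of $\Omega$ that $\Omega_q$ is nonempty for every $q \in M$, and we can rewrite the claim as follows:
$$\partial_q a \subset \operatorname{conv}\{d_q \phi_\omega \mid \omega \in \Omega_q\}. \qquad (1.11)$$
We divide the proof into two steps. In step (i) we prove that $a$ is locally Lipschitz and then in (ii) we show the estimate (1.11).

(i). Fix a compact $K \subset M$. Since every $\phi_\omega$ is Lipschitz on $K$ and $\Omega$ is compact, there exists a common Lipschitz constant $C_K > 0$, i.e., the following inequality holds:
$$\phi_\omega(q) - \phi_\omega(q') \le C_K |q - q'|, \qquad \forall\, q, q' \in K,\ \forall\, \omega \in \Omega.$$
Clearly we have
$$\min_{\omega \in \Omega} \phi_\omega(q) - \phi_\omega(q') \le C_K |q - q'|, \qquad \forall\, q, q' \in K,\ \forall\, \omega \in \Omega,$$
and since the last inequality holds for all $\omega \in \Omega$ we can pass to the min with respect to $\omega$ in the left hand side and
$$a(q) - a(q') \le C_K |q - q'|, \qquad \forall\, q, q' \in K.$$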

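Since the constant $C_K$ depends only on the compact set $K$, we can exchange in the previous reasoning the role of $q$ and $q'$, which gives
$$|a(q) - a(q')| \le C_K |q - q'|, \qquad \forall\, q, q' \in K.$$

(ii). Define $D_q := \operatorname{conv}\{d_q \phi_\omega \mid \omega \in \Omega_q\}$. Let us first prove that $d_q a \in D_q$ for every differentiability point $q$ of $a$. Fix any $\xi \notin D_q$. By Lemma 1.27 applied to the pair $D_q$ and $\{\xi\}$, there exist $\varepsilon > 0$ and $v \in S^{n-1}$ such that
$$\langle d_q \phi_\omega, v \rangle > \langle \xi, v \rangle + \varepsilon, \qquad \forall\, \omega \in \Omega_q.$$
By continuity of the map $(\omega, q) \mapsto d_q \phi_\omega$, there exist a neighborhood $O_q$ of $q$ and a neighborhood $V$ of $\Omega_q$ such that
$$\langle d_{q'} \phi_{\omega'}, v \rangle > \langle \xi, v \rangle + \varepsilon/2, \qquad \forall\, q' \in O_q,\ \forall\, \omega' \in V.$$
An integration argument lets us prove that there exists $\delta > 0$ such that for $\omega' \in V$
$$\frac{1}{t}\big(\phi_{\omega'}(q + tv) - \phi_{\omega'}(q)\big) > \langle \xi, v \rangle + \varepsilon/4, \qquad \forall\, 0 < t < \delta.$$
Since $a(q) \le \phi_{\omega'}(q)$, clearly we have
$$\frac{1}{t}\big(\phi_{\omega'}(q + tv) - a(q)\big) \ge \langle \xi, v \rangle + \varepsilon/4, \qquad \forall\, 0 < t < \delta,$$
and since the minimum in $a(q + tv) = \min_{\omega \in \Omega} \phi_\omega(q + tv)$ is attained for $\omega$ in $\Omega_{q+tv} \subset V$ for $t$ small enough, we can pass to the minimum with respect to $\omega' \in V$ in the left hand side, proving that there exists $t_0 > 0$ such that
$$\frac{1}{t}\big(a(q + tv) - a(q)\big) \ge \langle \xi, v \rangle + \varepsilon/4, \qquad \forall\, 0 < t < t_0.$$
Passing to the limit for $t \to 0$ we get
$$\langle d_q a, v \rangle \ge \langle \xi, v \rangle + \varepsilon/4. \qquad (1.12)$$
If $d_q a \notin D_q$ we can choose $\xi = d_q a$ in the above reasoning and (1.12) gives the contradiction $\langle d_q a, v \rangle \ge \langle d_q a, v \rangle + \varepsilon/4$. Hence $d_q a \in D_q$ for every differentiability point $q$ of $a$.

Now suppose that one has a sequence $q_n \to q$, where the $q_n$ are differentiability points of $a$. Then $d_{q_n} a \in D_{q_n}$ for all $n$ from the first part of the proof. We want to show that, whenever the limit $\xi = \lim_{n \to \infty} d_{q_n} a$ exists, then $\xi \in D_q$. This is a consequence of the fact that the map $(\omega, q) \mapsto d_q \phi_\omega$ is continuous (in particular upper semicontinuous in the sense of Exercise 1.23) and the fact that $\Omega$ is compact.

Exercise. Complete the second part of the proof of Proposition 1.3. Hint: use the Carathéodory lemma.

Locally Lipschitz map and Lipschitz submanifolds

As for scalar functions, a map $f : M \to N$ between smooth manifolds is said to be locally Lipschitz if for any coordinate chart in $M$ and $N$ the corresponding function from $\mathbb{R}^n$ to $\mathbb{R}^n$ is locally Lipschitz.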
For a locally Lipschitz map between manifolds $f : M \to N$ the (Clarke) generalized differential is defined as follows:
$$\partial_q f := \operatorname{conv}\{L \in \operatorname{Hom}(T_q M, T_{f(q)} N) \mid L = \lim_{q_n \to q} D_{q_n} f,\ q_n \text{ diff. point of } f\}.$$
The following lemma shows how the standard chain rule extends to the Lipschitz case.

Lemma. Let $M$ be a smooth manifold and $f : M \to N$ be a locally Lipschitz map.
(a) If $\varphi : M \to M$ is a diffeomorphism and $q \in M$ we have
$$\partial_q (f \circ \varphi) = \partial_{\varphi(q)} f \cdot D_q \varphi. \qquad (1.13)$$
(b) If $\phi : N \to W$ is a $C^1$ map, and $q \in M$ we have
$$\partial_q (\phi \circ f) = D_{f(q)} \phi \cdot \partial_q f. \qquad (1.14)$$
Moreover the generalized differential, as a set, is upper semicontinuous. More precisely, for every neighborhood $\Omega \subset \operatorname{Hom}(T_q M, T_{f(q)} N)$ of $\partial_q f$ there exists a neighborhood $O_q$ of $q$ such that $\partial_{q'} f \subset \Omega$ for every $q' \in O_q$.

Sketch of the proof. For a detailed proof of this result see ??. Here we only give the main ideas. (a). Since $\varphi$ is a diffeomorphism, it sends every differentiability point $q$ of $f \circ \varphi$ to a differentiability point $\varphi(q)$ for $f$. Then (1.13) is true at differentiability points and, passing to the limit, it is also valid for the sub-differential (one proves both inclusions using $\varphi$ and $\varphi^{-1}$). Part (b) can be proved along the same lines. The semicontinuity can be proved by using the hyperplane separation theorem and the Carathéodory Lemma.

Definition. Let $f : M \to N$ be a locally Lipschitz map. A point $q \in M$ is said to be critical for $f$ if $\partial_q f$ contains a non-surjective map. If $q \in M$ is not critical it is said to be regular.

Notice that, by the semicontinuity property of Lemma 1.32, it follows that the set of regular points of a locally Lipschitz map $f$ is open.

Theorem. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be a locally Lipschitz map and $q \in \mathbb{R}^n$ be a regular point. Then there exist a neighborhood $O_{f(q)}$ and a locally Lipschitz map $g : O_{f(q)} \subset \mathbb{R}^n \to \mathbb{R}^n$ such that $f \circ g = g \circ f = \operatorname{Id}$.

Remark. The classical $C^1$ version of the inverse function theorem (cf. Theorem ??) can be proved from Theorem 1.34 and the chain rule (Lemma 1.32). Indeed Theorem 1.34 implies that there exists a locally Lipschitz inverse $g$, and using the chain rule it is easy to show that the sub-differential of $g$ contains only one element (this implies that it is differentiable at that point) and the differential of $g$ is the inverse of the differential of $f$.

Before proving Theorem 1.34 we need the following technical lemma.
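Lemma. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be a locally Lipschitz map and $q \in \mathbb{R}^n$ be a regular point. Then there exist a neighborhood $O_q$ of $q$ and $\varepsilon > 0$ such that
$$\forall\, v \in S^{n-1},\ \exists\, \xi_v \in S^{n-1} \text{ s.t. } \langle \xi_v, \partial_x f(v) \rangle > \varepsilon, \qquad \forall\, x \in O_q. \qquad (1.15)$$
Moreover $|f(x) - f(y)| \ge \varepsilon |x - y|$ for all $x, y \in O_q$.

We stress that (1.15) means that the inequality $\langle \xi_v, L(v) \rangle > \varepsilon$ holds for every $x \in O_q$ and every element $L \in \partial_x f$.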

Proof. Notice that, since $q$ is a regular point, the set $\partial_q f$ contains only invertible linear maps. For every $v \in S^{n-1}$, the set $\partial_q f(v)$ is compact and convex, and does not contain the zero vector. By the hyperplane separation theorem we can find $\xi_v$ such that $\langle \xi_v, \partial_q f(v) \rangle > \varepsilon(v)$. The map $x \mapsto \partial_x f$ is upper semicontinuous, hence there exists a neighborhood $O_q$ of $q$ such that $\langle \xi_v, \partial_x f(v) \rangle > \varepsilon(v)$ for all $x \in O_q$. Since $S^{n-1}$ is compact, there exists a uniform $\varepsilon = \min\{\varepsilon(v),\ v \in S^{n-1}\}$ that satisfies (1.15).

To prove the second statement of the Lemma, write $y = x + sv$, where $s = |x - y|$ and $v \in S^{n-1}$. Consider a vector $v' \in S^{n-1}$ close to $v$ such that almost every point in the direction of $v'$ is a point of differentiability (cf. Remark 1.25), and set $y' = x + sv'$ and $\xi_{v'}$ the vector associated to $v'$ defined by (1.15). Then we can write
$$f(y') - f(x) = \int_0^s (D_{x+tv'} f)\, v'\, dt,$$
and we have the inequality
$$|f(y') - f(x)| \ge \langle \xi_{v'}, f(y') - f(x) \rangle = \int_0^s \langle \xi_{v'}, (D_{x+tv'} f)\, v' \rangle\, dt \ge \varepsilon s = \varepsilon |y' - x|.$$
Since $\varepsilon$ does not depend on $v'$, we can pass to the limit for $v' \to v$ in the above inequality (in particular $y' \to y$) and the Lemma is proved.

Proof of Theorem. The inequality proved in Lemma 1.36 implies that $f$ is injective in the neighborhood $O_q$ of the point $q$. If we show that $f(O_q)$ covers a neighborhood $O_{f(q)}$ of the point $f(q)$, then the inverse function $g : O_{f(q)} \to \mathbb{R}^n$ is well defined and locally Lipschitz.

Without loss of generality, up to restricting the neighborhood $O_q$, we can assume that every point in $O_q$ is regular for $f$ and moreover that the estimate of Lemma 1.36 holds also on the topological boundary $\partial O_q$. Lemma 1.36 also implies that
$$\operatorname{dist}(f(q), f(\partial O_q)) \ge \varepsilon \operatorname{dist}(q, \partial O_q) > 0,$$
where $\operatorname{dist}(x, A) = \inf_{y \in A} |x - y|$ denotes the Euclidean distance from $x$ to the set $A$. Then consider a neighborhood $W$ of $f(q)$ such that $|y - f(q)| < \operatorname{dist}(y, f(\partial O_q))$ for every $y \in W$. Fix an arbitrary $\bar y \in W$ and let us show that the equation $f(x) = \bar y$ has a solution.
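Define the function
$$\psi : \overline{O}_q \to \mathbb{R}, \qquad \psi(x) = |f(x) - \bar y|^2.$$
By construction $\psi(q) < \psi(z)$ for all $z \in \partial O_q$, hence by continuity $\psi$ attains its minimum at some point $\bar x \in O_q$. By Proposition 1.29, we have $0 \in \partial_{\bar x} \psi$. Moreover, using the chain rule,
$$\partial_{\bar x} \psi = 2\,(f(\bar x) - \bar y)^T \cdot \partial_{\bar x} f.$$
Since $\bar x$ is a regular point of $f$, every linear map in $\partial_{\bar x} f$ is invertible. Thus $0 \in \partial_{\bar x} \psi$ implies $f(\bar x) = \bar y$.

We say that $c \in \mathbb{R}$ is a regular value of a locally Lipschitz function $\phi : M \to \mathbb{R}$ if $\phi^{-1}(c) \neq \emptyset$ and every $x \in \phi^{-1}(c)$ is a regular point.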
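To make the Lipschitz inverse function theorem concrete, here is a small Python sketch built on the piecewise linear function of the Example (slope 1 for negative arguments, slope 2 otherwise), whose generalized differential at the origin is $[1,2]$. The explicit inverse `g`, the helper `min_expansion` and the sample grid are illustrative assumptions; the sketch only checks numerically the two conclusions of the text, namely the lower bound $|f(x) - f(y)| \ge \varepsilon |x - y|$ (here with $\varepsilon = 1$) and the identities $f \circ g = g \circ f = \operatorname{Id}$.

```python
def f(x):
    """Locally Lipschitz map on the real line with a kink at 0 (slopes 1 and 2)."""
    return x if x < 0 else 2.0 * x


def g(y):
    """Explicit inverse of f; it is Lipschitz with slopes 1 and 1/2."""
    return y if y < 0 else y / 2.0


def min_expansion(func, points):
    """Smallest ratio |func(x) - func(y)| / |x - y| over distinct sample pairs."""
    best = float("inf")
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            best = min(best, abs(func(x) - func(y)) / abs(x - y))
    return best


grid = [k / 10.0 for k in range(-10, 11)]        # sample points in [-1, 1]
epsilon = min_expansion(f, grid)                 # lower Lipschitz constant on the grid
round_trip_error = max(
    max(abs(f(g(t)) - t), abs(g(f(t)) - t)) for t in grid
)
```

On this grid the smallest expansion ratio is attained by pairs on the branch of slope 1, which matches the lower end of the interval $[1,2]$.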

Corollary. Let $\phi : M \to \mathbb{R}$ be locally Lipschitz and assume that $c \in \mathbb{R}$ is a regular value for $\phi$. Then $\phi^{-1}(c)$ is a Lipschitz submanifold of $M$ of codimension 1.

Proof. We show that in any small neighborhood $O_x$ of every $x \in \phi^{-1}(c)$ the set $O_x \cap \phi^{-1}(c)$ can be described as the zero locus of a locally Lipschitz function. Since $\partial_x \phi$ does not contain $0$, by the hyperplane separation theorem there exists $v_1 \in S^{n-1}$ such that $\langle \partial_{x'} \phi, v_1 \rangle > 0$ for every $x'$ in the compact neighborhood $O_x \cap \phi^{-1}(c)$. Let us complete $v_1$ to an orthonormal basis $\{v_1, v_2, \dots, v_n\}$ of $\mathbb{R}^n$ and consider the map
$$f : O_x \to \mathbb{R}^n, \qquad f(x') = \begin{pmatrix} \phi(x') - c \\ \langle v_2, x' \rangle \\ \vdots \\ \langle v_n, x' \rangle \end{pmatrix}.$$
By construction $f$ is locally Lipschitz and $x$ is a regular point of $f$. Hence there exists, by Theorem 1.34, a Lipschitz inverse $g$ of $f$. In particular the inverse map is a Lipschitz function that transforms the hyperplane $\{y_1 = 0\}$ into $\phi^{-1}(c)$. Hence the level set $\phi^{-1}(c)$ is a Lipschitz submanifold.

A non-smooth version of Sard Lemma

In this section we prove a Sard-type result for the special class of Lipschitz functions we considered in the previous section. We first recall the statement of the classical Sard lemma. We denote by $C_f$ the set of critical points of a smooth map $f : M \to N$, i.e., the set of points $x$ in $M$ at which the differential of $f$ is not surjective.

Theorem 1.38 (Sard lemma). Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be a $C^k$ function, with $k \ge \max\{n - m + 1, 1\}$. Then the set $f(C_f)$ of critical values of $f$ has measure zero in $\mathbb{R}^m$.

Notice that the classical Sard Lemma does not apply to $C^1$ functions $\phi : \mathbb{R}^n \to \mathbb{R}$ whenever $n \ge 2$.

Theorem. Let $M$ be a smooth manifold and $\phi_\omega : M \to \mathbb{R}$ a family of smooth functions, with $\omega \in \Omega$. Assume that
(i) $\Omega = \bigcup_{i \in \mathbb{N}} N_i$ is a union of smooth submanifolds, and is compact,
(ii) the maps $(\omega, q) \mapsto \phi_\omega(q)$ and $(\omega, q) \mapsto d_q \phi_\omega$ are continuous on $\Omega \times M$,
(iii) the maps $\psi_i : N_i \times M \to \mathbb{R}$, $(\omega, q) \mapsto \phi_\omega(q)$ are smooth.
Then the set of critical values of the function $a(q) = \min_{\omega \in \Omega} \phi_\omega(q)$ has measure zero in $\mathbb{R}$.

Proof.
We are going to define a countable set of smooth functions $\Phi_\alpha$ indexed by $\alpha = (\alpha_0,\dots,\alpha_n) \in \mathbb{N}^{n+1}$, where $n = \dim M$, such that to every critical point $q$ of $a$ there corresponds a critical point $z_q$ of some $\Phi_\alpha$. Moreover we have $\Phi_\alpha(z_q) = a(q)$.

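Denote by $\Lambda_n = \{(\lambda_0,\dots,\lambda_n) \mid \lambda_i \ge 0,\ \sum_i \lambda_i = 1\}$. For every $\alpha = (\alpha_0,\dots,\alpha_n) \in \mathbb{N}^{n+1}$ let us consider the map
$$\Phi_\alpha : N_{\alpha_0} \times \dots \times N_{\alpha_n} \times \Lambda_n \times M \to \mathbb{R}, \qquad \Phi_\alpha(\omega_0,\dots,\omega_n,\lambda_0,\dots,\lambda_n,q) = \sum_{i=0}^{n} \lambda_i \phi_{\omega_i}(q). \qquad (1.16)$$
By computing partial derivatives, it is easy to see that a point $z = (\omega_0,\dots,\omega_n,\lambda_0,\dots,\lambda_n,q)$ is critical for $\Phi_\alpha$ if and only if it satisfies the following relations:
$$\lambda_i \frac{\partial \psi_{\alpha_i}}{\partial \omega}(\omega_i, q) = 0,\quad i = 0,\dots,n, \qquad \sum_{i=0}^{n} \lambda_i\, d_q \phi_{\omega_i} = 0, \qquad \phi_{\omega_0}(q) = \dots = \phi_{\omega_n}(q). \qquad (1.17)$$
Recall that $\psi_i$ is simply the restriction of the map $(\omega, q) \mapsto \phi_\omega(q)$ for $\omega \in N_i$.

Let us now show that every critical point $q$ of $a$ can be associated to a critical point $z_q$ of some $\Phi_\alpha$. By Proposition 1.3, the function $a$ is locally Lipschitz. Assume that $q$ is a critical point of $a$; then we have
$$0 \in \partial_q a \subset \operatorname{conv}\{d_q \phi_\omega \mid \forall\, \omega \in \Omega \text{ s.t. } \phi_\omega(q) = a(q)\}.$$
By the Carathéodory lemma there exist $n+1$ elements $\bar\omega_0,\dots,\bar\omega_n$ and $n+1$ scalars $\bar\lambda_0,\dots,\bar\lambda_n$ such that $\bar\lambda_i \ge 0$, $\sum_{i=0}^{n} \bar\lambda_i = 1$ and
$$0 = \sum_{i=0}^{n} \bar\lambda_i\, d_q \phi_{\bar\omega_i}, \qquad \phi_{\bar\omega_i}(q) = a(q), \quad \forall\, i = 0,\dots,n.$$
Moreover, let us choose for every $i = 0,\dots,n$ an index $\bar\alpha_i \in \mathbb{N}$ such that $\bar\omega_i \in N_{\bar\alpha_i}$. Since $\phi_{\bar\omega_i}(q) = a(q) = \min_\Omega \phi_\omega(q)$, $\bar\omega_i$ is critical for the map $\psi_{\bar\alpha_i}$, namely we have
$$\frac{\partial \psi_{\bar\alpha_i}}{\partial \omega}(\bar\omega_i, q) = 0.$$
This implies that $z_q = (\bar\omega_0,\dots,\bar\omega_n,\bar\lambda_0,\dots,\bar\lambda_n,q)$ satisfies the relations (1.17) for the function $\Phi_{\bar\alpha}$, with $\bar\alpha = (\bar\alpha_0,\dots,\bar\alpha_n)$. Moreover it is easy to check that $\Phi_{\bar\alpha}(z_q) = a(q)$, since
$$\Phi_{\bar\alpha}(z_q) = \sum_{i=0}^{n} \bar\lambda_i \phi_{\bar\omega_i}(q) = \Big(\sum_{i=0}^{n} \bar\lambda_i\Big) a(q) = a(q).$$
Then, if $C_a$ denotes the set of critical points of $a$ and $C_\alpha$ the set of critical points of $\Phi_\alpha$, we have
$$\operatorname{meas}(a(C_a)) \le \operatorname{meas}\Big(\bigcup_{\alpha \in \mathbb{N}^{n+1}} \Phi_\alpha(C_\alpha)\Big) \le \sum_{\alpha \in \mathbb{N}^{n+1}} \operatorname{meas}(\Phi_\alpha(C_\alpha)) = 0,$$
since $\operatorname{meas}(\Phi_\alpha(C_\alpha)) = 0$ for all $\alpha$ by the classical Sard lemma.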
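The criterion used in this proof can be tried on a toy case. The Python sketch below takes a finite parameter set $\Omega = \{0, 1\}$ (a compact zero-dimensional manifold) and the two functions $\phi_0(q) = (q-1)^2$, $\phi_1(q) = (q+1)^2$ on the real line; these choices, the grid and the function names are illustrative assumptions, not taken from the text. It lists the grid points where zero lies in the convex hull of the differentials of the active functions, which by the inclusion for $\partial_q a$ contains every critical point of $a = \min_\omega \phi_\omega$, and collects the corresponding values.

```python
OMEGA = (0, 1)  # finite, hence compact, parameter set (illustrative choice)


def phi(omega, q):
    """Illustrative family: phi_0(q) = (q - 1)^2 and phi_1(q) = (q + 1)^2."""
    center = 1.0 if omega == 0 else -1.0
    return (q - center) ** 2


def dphi(omega, q):
    """Derivative of phi(omega, .) at q."""
    center = 1.0 if omega == 0 else -1.0
    return 2.0 * (q - center)


def a(q):
    """Pointwise minimum of the family."""
    return min(phi(w, q) for w in OMEGA)


def is_candidate_critical(q, tol=1e-12):
    """True when 0 lies in conv{ dphi(w, q) : w active at q }, an interval in 1D."""
    value = a(q)
    active = [w for w in OMEGA if phi(w, q) - value <= tol]
    slopes = [dphi(w, q) for w in active]
    return min(slopes) <= tol and max(slopes) >= -tol


grid = [k / 100.0 for k in range(-200, 201)]
critical_points = [q for q in grid if is_candidate_critical(q)]
critical_values = sorted({a(q) for q in critical_points})
```

In this toy case the candidates are the two minima $q = \pm 1$, where a single function is active with vanishing derivative, and the kink $q = 0$, where both functions are active with opposite slopes. The resulting set of critical values is finite, in agreement with the measure-zero statement.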

We want to apply the previous result in the case of functions that are the infimum of smooth functions on level sets of a submersion.

Theorem 1.4. Let $F : N \to M$ be a smooth map between finite dimensional manifolds and $\phi : N \to \mathbb{R}$ be a smooth function. Assume that
(i) $F$ is a submersion,
(ii) for all $q \in M$ the set $N_q = \{x \in F^{-1}(q) \mid \phi(x) = \min_{y \in F^{-1}(q)} \phi(y)\}$ is a nonempty compact set.
Then the set of critical values of the function $a(q) = \min_{x \in F^{-1}(q)} \phi(x)$ has measure zero in $\mathbb{R}$.

Proof. Denote by $C_a$ the set of critical points of $a$, so that $a(C_a)$ is the set of its critical values. Let us first show that for every point $q \in M$ there exists an open neighborhood $O_q$ of $q$ such that $\operatorname{meas}(a(C_a \cap O_q)) = 0$.

From assumption (i), it follows that for every $q \in M$ the set $F^{-1}(q)$ is a smooth submanifold in $N$. Let us now consider an auxiliary nonnegative function $\psi : N \to \mathbb{R}$ such that
(A0) $A_\alpha := \psi^{-1}([0,\alpha])$ is compact for every $\alpha > 0$,
and select moreover a constant $c > 0$ such that the following assumptions are satisfied:
(A1) $N_q \subset \operatorname{int} A_c$,
(A2) $c$ is a regular level of $\psi|_{F^{-1}(q)}$.
The existence of such a $c > 0$ is guaranteed by the fact that (A1) is satisfied for all $c$ big enough, since $N_q$ is compact and $A_c$ contains any compact set as $c \to +\infty$. Moreover, by the classical Sard lemma (cf. Theorem 1.38), almost every $c$ is a regular value for the smooth function $\psi|_{F^{-1}(q)}$.

By continuity, there exists a neighborhood $O_q$ of the point $q$ such that assumptions (A0)-(A2) are satisfied for every $q' \in O_q$, for $c > 0$ and $\psi$ fixed. We observe that (A2) is equivalent to requiring that the level sets of $F$ are transversal to the level sets of $\psi$. We can infer that $F^{-1}(O_q) \cap A_c$ is a smooth manifold with boundary that has the structure of a locally trivial bundle. Maybe restricting the neighborhood of $q$, we can then assume
$$F^{-1}(q) \cap A_c = \Omega, \qquad F^{-1}(O_q) \cap A_c \simeq O_q \times \Omega,$$
where $\Omega$ is a smooth manifold with boundary. In this neighborhood we can split variables in $N$ as follows: $x = (\omega, q)$ with $\omega \in \Omega$ and $q \in M$, and the restriction $a|_{O_q}$ is written as
$$a|_{O_q} : O_q \to \mathbb{R}, \qquad a(q) = \min_{\omega \in \Omega} \phi(\omega, q).$$
Notice that $\Omega$ is compact and is the union of its interior and its boundary, which are smooth by assumptions (A0)-(A2). We can then apply Theorem 1.39 to $a|_{O_q}$, which gives $\operatorname{meas}(a(C_a \cap O_q)) = 0$ for every $q \in M$.

We have built a covering $M = \bigcup_{q \in M} O_q$. Since $M$ is a smooth manifold, from every covering it is possible to extract a countable covering, i.e., there exists a sequence $q_n$ of points in $M$ such that
$$M = \bigcup_{n \in \mathbb{N}} O_{q_n}.$$

In particular this implies that
$$\operatorname{meas}(a(C_a)) \le \sum_{n \in \mathbb{N}} \operatorname{meas}(a(C_a \cap O_{q_n})) = 0,$$
since $\operatorname{meas}(a(C_a \cap O_q)) = 0$ for every $q$.

Remark. Notice that we do not assume that $N$ is compact. In that case the proof is easier, since every submersion $F : N \to M$ with $N$ compact automatically endows $N$ with a locally trivial bundle structure.

We end this section by applying the previous theory to get information about the regularity of sub-Riemannian spheres. Before proving the main result we need two lemmas.

Lemma. Fix $q_0 \in M$ and let $K \subset T^*_{q_0} M \setminus (H^{-1}(0) \cap T^*_{q_0} M)$ be a compact set such that all normal extremals associated with $\lambda_0 \in K$ are not abnormal. Then there exists $\varepsilon = \varepsilon(K)$ such that $t\lambda_0$ is a regular point for $E_{q_0}$ for all $0 < t \le \varepsilon$.

Proof. By Corollary 7.37, for every strongly normal extremal $\gamma(t) = E(t\lambda_0)$, with $\lambda_0 \in T^*_{q_0} M$, there exists $\varepsilon = \varepsilon(\lambda_0) > 0$ such that $\gamma|_{]0,\varepsilon]}$ does not contain points conjugate to $q_0$, or equivalently, $t\lambda_0$ is a regular point for $E_{q_0}$ for all $0 < t \le \varepsilon$. Since $K$ is compact, it follows that there exists $\varepsilon = \varepsilon(K)$ such that the above property holds uniformly on $K$.

Lemma. Let $q_0 \in M$ and $K \subset M$ be a compact set such that every point of $K$ is reached from $q_0$ by only strictly normal minimizers. Define the set
$$C = \{\lambda_0 \in T^*_{q_0} M \mid \lambda_0 \text{ minimizer},\ E(\lambda_0) \in K\}.$$
Then $C$ is compact.

Proof. It is enough to show that $C$ is bounded. Assume by contradiction that there exists a sequence $\lambda_n \in C$ of covectors (and the associated sequence of minimizing trajectories $\gamma_n$, associated with controls $u_n$) such that $|\lambda_n| \to +\infty$, where $|\cdot|$ is some norm in $T^*_{q_0} M$. Since these minimizers are normal they satisfy the relation
$$\lambda_n D_{u_n} F = u_n, \qquad \forall\, n \in \mathbb{N}, \qquad (1.18)$$
and dividing by $|\lambda_n|$ one obtains the identity
$$\frac{\lambda_n}{|\lambda_n|} D_{u_n} F = \frac{u_n}{|\lambda_n|}, \qquad \forall\, n \in \mathbb{N}. \qquad (1.19)$$
Using compactness of minimizers whose endpoints stay in a compact region, we can assume that $u_n \to u$. Moreover the sequence $\lambda_n / |\lambda_n|$ is bounded and we can assume that $\lambda_n / |\lambda_n| \to \bar\lambda$ for some final covector $\bar\lambda$. Using that $D_{u_n} F \to D_u F$ and the fact that $|\lambda_n| \to +\infty$, passing to the limit for $n \to \infty$ in (1.19) we obtain $\bar\lambda D_u F = 0$.
This implies in particular that the minimizers $\gamma_n$ converge to a minimizer $\gamma$ (associated to $\bar\lambda$) that is abnormal and reaches a point of $K$, which is a contradiction.

214 Theorem Let M be a sub-riemannian manifold, q M and r > such that every point different from q in the compact ball Bq (r ) is not reached by abnormal minimizers. Then the sphere S q (r) is a Lipschitz submanifold of M for almost every r r. Proof. Let us fix δ > and consider the annulus A δ = B r (q )\B δ (q ). Define the set C = {λ T q M λ minimizer, E(λ ) A δ } By Lemma 1.43 the set C := C is compact. Moreover define C 1 := {λ C H 1 ([,ε ])}, for some ε > that is chosen later. Notice that C 1 is compact. For every λ T M let us consider the control u associated with γ(t) = E(tλ ) and denote by Φ λ := (P 1,t ) : T q M T E q (λ ) M, the pullback of the flow defined by the control u, computed at q. For a fixed λ C, using that C 1 is compact, let us choose ε = ε(λ ) satisfying the following property: for every λ 1 C 1, the covector Φ λ (λ 1 ) T E q (λ ) M, is a regular point of E E q (λ ). Being C also compact, we can define ε = min{ε(λ ),λ C }. Define the map Ψ : C C 1 D δ M, Ψ(λ,λ 1 ) = E Eq (λ )(Φ λ (λ 1 )). By construction Ψ is a submersion. We want to apply Theorem 1.4 to the submersion Ψ and the scalar function H : C C 1 R, H(λ,λ 1 ) = H(λ )+H(λ 1 ). Let us show that the assumption of Theorem 1.4 are satisfied. Indeed we have to show that the set N q = {(λ,λ 1 ) C C 1 H(λ,λ 1 ) = min Ψ(λ,λ 1 )=q H(λ,λ 1 )}, q A δ, is non empty and compact. Let us first notice that Ψ(λ,sλ ) = E q ((1+s)λ ), H(λ,sλ ) = (1+s 2 )H(λ ). By definition of C, for each q A δ there exists λ C such that E q ( λ ) = q and such that the corresponding trajectory is a minimizer. Moreover we can always write this unique minimizer as the union of two minimizers. It follows that min H(λ,λ 1 ) = min H(λ ) = f(q), q A δ. Ψ(λ,λ 1 )=q E q (λ )=q This implies that N q is non empty for every q. Moreover one can show that N q is compact. 
By applying Theorem 1.4 one gets that the function
$$a(q) = \min_{\Psi(\lambda_0,\lambda_1) = q} H(\lambda_0,\lambda_1) = f(q)$$
is locally Lipschitz in $A_\delta$ and the set of its critical values has measure zero. Since $\delta > 0$ is arbitrary, we let $\delta \to 0$ and we have that $f$ is locally Lipschitz in $B_{q_0}(r_0) \setminus \{q_0\}$ and the set of its critical values has measure zero. In particular, for almost every $r \le r_0$ the value $r^2/2$ is a regular value for $f$. Then, applying Corollary 1.37, the sphere $f^{-1}(r^2/2)$ is a Lipschitz submanifold for almost every $r \le r_0$.

1.4 Geodesic completeness and Hopf-Rinow theorem

We start by proving a technical lemma that is needed later.

Lemma. For every $\varepsilon > 0$ and $x \in M$ we have
$$B(x, r + \varepsilon) = \{y \in M \mid \operatorname{dist}(y, B(x,r)) < \varepsilon\}.$$

Proof. Let us prove that $\operatorname{dist}(y, B(x,r)) = d(x,y) - r$ for all $r < d(x,y)$. We will prove the two inequalities separately.

(i). From the triangle inequality $d(x,y) \le d(x,z) + d(z,y)$ one immediately gets $d(x,y) \le r + d(z,y)$ for all $z \in B(x,r)$, hence
$$d(x,y) \le r + \inf_{z \in B(x,r)} d(z,y) = r + \operatorname{dist}(y, B(x,r)).$$

(ii). Let $\gamma_n : [0,1] \to M$ be a sequence of curves joining $x$ to $y$ such that $\ell(\gamma_n) \to d(x,y)$, and let $z_n$ be a point on $\gamma_n$ such that $d(x, z_n) = r$, with $z_n = \gamma_n(t_n)$ for some sequence $t_n$ (the existence is guaranteed by continuity of the distance). Let us define $\gamma'_n = \gamma_n|_{[0,t_n]}$ and $\gamma''_n = \gamma_n|_{[t_n,1]}$. Then of course $\ell(\gamma_n) = \ell(\gamma'_n) + \ell(\gamma''_n)$. Since $z_n \in \overline{B}(x,r)$ and $\ell(\gamma'_n) \ge d(x,z_n) = r$, we have
$$\operatorname{dist}(y, B(x,r)) \le \ell(\gamma''_n) = \ell(\gamma_n) - \ell(\gamma'_n) \qquad (1.2)$$
$$\le \ell(\gamma_n) - r, \qquad (1.21)$$
and the other inequality follows by taking the limit with respect to $n$.

Theorem. Let $M$ be a sub-Riemannian manifold. Then $(M, d)$ is complete if and only if all sub-Riemannian closed balls are compact.

Proof. (i). Assume that all closed balls are compact and let $\{x_j\}$ be a Cauchy sequence in $M$. We have to prove that $\{x_j\}$ admits a convergent subsequence. By assumption, if we fix $\varepsilon > 0$ there exists $n \in \mathbb{N}$ such that $d(x_j, x_k) < \varepsilon$ for all $j, k \ge n$. Let us define $r := \max_{j \le n} d(x_j, x_n) + \varepsilon > 0$. By construction $x_j \in B(x_n, r)$ for every $j$, and $B(x_n, r)$ has compact closure by assumption. Hence the sequence admits a converging subsequence.

(ii). Assume now that $(M, d)$ is complete. Fix $x \in M$ and define
$$A := \{r > 0 \mid \overline{B}(x,r) \text{ is compact}\}, \qquad R := \sup A. \qquad (1.22)$$
Since the topology of $(M, d)$ is locally compact, $A \neq \emptyset$ and $R > 0$. First we prove that $A$ is open and then we prove that $R = +\infty$. Notice in particular that this proves that $A = ]0,+\infty[$ since, by Remark 3.4, $r \in A$ implies $]0,r[ \subset A$.

(ii.a) It is enough to show that, if $r \in A$, then there exists $\delta > 0$ such that $r + \delta \in A$. For each $y \in \overline{B}(x,r)$ there exists $r(y) < \varepsilon$ small enough such that $\overline{B}(y, r(y))$ is compact.
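We have
\[ \overline{B}(x,r) \subset \bigcup_{y\in \overline{B}(x,r)} B(y,r(y)). \]
By compactness of $\overline{B}(x,r)$ there exists a finite number of points $\{y_i\}_{i=1}^N$ in $\overline{B}(x,r)$ such that (denote $r_i := r(y_i)$)
\[ \overline{B}(x,r) \subset \bigcup_{i=1}^N B(y_i,r_i). \]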

Moreover, there exists $\delta>0$ such that the set of points
\[ \overline{B}(x,r+\delta) = \{y\in M \mid \mathrm{dist}(y,\overline{B}(x,r)) \le \delta\}, \]
where the equality is given by Lemma 10.45, satisfies
\[ \overline{B}(x,r+\delta) \subset \bigcup_{i=1}^N \overline{B}(y_i,r_i). \]
This proves that $r+\delta\in A$, since a finite union of compact sets is compact and a closed subset of a compact set is compact.

(ii.b) Assume by contradiction that $R<+\infty$ and let us prove that $B := \overline{B}(x,R)$ is compact. Since $B$ is a closed set in a complete space, it is enough to show that it is totally bounded, i.e. it admits an $\varepsilon$-net (see footnote 2) for every $\varepsilon>0$. Fix $\varepsilon>0$ and consider an $(\varepsilon/3)$-net $S$ for the ball $B' = \overline{B}(x,R-\varepsilon/3)$, which exists by compactness. By Lemma 10.45 one has, for every $y\in B$, that $\mathrm{dist}(y,B')<\varepsilon/3$. Then it is easy to show that
\[ \mathrm{dist}(y,S) < \mathrm{dist}(y,B')+\varepsilon/3 < \varepsilon, \]
that is, $S$ is an $\varepsilon$-net for $B$ and $B$ is compact.

This shows that if $R<+\infty$, then $R\in A$. Hence (ii.a) implies that $R+\delta\in A$ for some $\delta>0$, contradicting the fact that $R$ is a sup. Hence $R=+\infty$.

The next result implies that the geodesic completeness of $M$, i.e. the completeness of $\vec{H}$, implies the completeness of $M$ as a metric space.

Theorem 10.47 (sub-Riemannian Hopf-Rinow). Let $M$ be a sub-Riemannian manifold that does not admit abnormal length minimizers. If there exists a point $x\in M$ such that the exponential map $\mathcal{E}_x$ is defined on the whole $T_x^*M$, then $M$ is complete with respect to the sub-Riemannian distance.

Proof. For the fixed $x\in M$, let us consider
\[ A = \{r>0 \mid \overline{B}(x,r) \text{ is compact}\}, \qquad R := \sup A. \]
As in the proof of Theorem 10.46, one can show that $A\neq\emptyset$ and that $A$ is open (by using the local compactness of the topology and repeating the proof of (ii.a)). Assume now that $R<+\infty$ and let us show that $R\in A$. By openness of $A$ this will give a contradiction and $A=\,]0,+\infty[$.

We have to show that $\overline{B}(x,R)$ is compact, i.e., for every sequence $y_i$ in $\overline{B}(x,R)$ we can extract a convergent subsequence. Define $r_i := d(y_i,x)$. It is not restrictive to assume that $r_i\to R$ (if this is not the case, a subsequence stays in a compact ball and the existence of a convergent subsequence is clear).
Since the ball $\overline{B}(x,r_i)$ is compact, by Theorem 3.39 there exists a length-minimizing trajectory $\gamma_i:[0,r_i]\to M$ joining $x$ and $y_i$, parametrized by unit speed. Due to the completeness of the vector field $\vec{H}$, we can extend each curve $\gamma_i$, parametrized by length, to the common interval $[0,R]$. By construction each trajectory of this sequence is normal,
\[ \gamma_i(t) = \mathcal{E}(t\lambda_i) = \pi\circ e^{t\vec{H}}(\lambda_i), \]
for some $\lambda_i\in T_x^*M$, and is contained in the compact set $\overline{B}(x,R)$. Since there is no abnormal minimizer, by Lemma 10.43 the sequence $\{\lambda_i\}$ is bounded in $T_x^*M$, thus there exists a subsequence $\lambda_{i_n}$ converging to some $\lambda$. Then $r_{i_n}\lambda_{i_n}\to R\lambda$ and, by continuity of $\mathcal{E}$, the sequence $\{y_i\}$ has a convergent subsequence
\[ y_{i_n} = \gamma_{i_n}(r_{i_n}) = \mathcal{E}(r_{i_n}\lambda_{i_n}) \to \mathcal{E}(R\lambda) =: y. \]

Footnote 2: An $\varepsilon$-net $S$ for a set $B$ in a metric space is a finite set of points $S=\{z_i\}_{i=1}^N$ such that for every $y\in B$ one has $\mathrm{dist}(y,S)<\varepsilon$ (or, equivalently, for every $y\in B$ there exists $i$ such that $d(y,z_i)<\varepsilon$).

To end the proof, one should just notice that an arbitrary Cauchy sequence in $M$ is bounded, hence contained in a suitable ball centered at $x$, which is compact since $R=+\infty$. Thus it admits a convergent subsequence.

As an immediate corollary we have the following version of the geodesic completeness theorem.

Corollary. Let $M$ be a sub-Riemannian manifold that does not admit abnormal length minimizers. If the vector field $\vec{H}$ is complete on $T^*M$, then $M$ is complete with respect to the sub-Riemannian distance.


Chapter 11

Abnormal extremals and second variation

In this chapter we discuss in more detail abnormal extremals and how the regularity of the sub-Riemannian distance is affected by the presence of these extremals.

11.1 Second variation

We want to introduce the notion of Hessian (and second derivative) for smooth maps between manifolds. We first discuss the case of the second differential of a map defined on a linear space. Let $F:V\to M$ be a smooth map from a linear space $V$ to a smooth manifold $M$. As we know, the first differential of $F$ at a point $x\in V$,
\[ D_xF : V\to T_{F(x)}M, \qquad D_xF(v) = \frac{d}{dt}\Big|_{t=0} F(x+tv), \quad v\in V, \]
is a well-defined linear map independent of the linear structure on $V$. This is not the case for the second differential. Indeed it is easy to see that the second order derivative
\[ D_x^2F(v) = \frac{d^2}{dt^2}\Big|_{t=0} F(x+tv) \tag{11.1} \]
has no invariant meaning if $D_xF(v)\neq 0$. Indeed in this case the curve $\gamma: t\mapsto F(x+tv)$ is a smooth curve in $M$ with nonzero tangent vector. Then there exist local coordinates on $M$ such that the curve $\gamma$ is a straight line. Hence the second derivative $D_x^2F(v)$ vanishes in these coordinates.

In general, the linear structure on $V$ lets us define the second differential of $F$ as a quadratic map
\[ D_x^2F : \mathrm{Ker}\,D_xF \to T_{F(x)}M. \tag{11.2} \]
On the other hand the map (11.2) is not independent of the choice of the linear structure on $V$, and this construction cannot be used if the source of $F$ is a smooth manifold.

Assume now that $F:N\to M$ is a map between smooth manifolds. The first differential is a linear map between the tangent spaces
\[ D_xF : T_xN\to T_{F(x)}M, \qquad x\in N, \]

and the definition of the second order derivative should be modified using smooth curves with fixed tangent vector (belonging to the kernel of $D_xF$):
\[ D_x^2F(v) = \frac{d^2}{dt^2}\Big|_{t=0} F(\gamma(t)), \qquad \gamma(0)=x,\quad \dot\gamma(0)=v\in \mathrm{Ker}\,D_xF. \tag{11.3} \]
Computing in coordinates we find
\[ \frac{d^2}{dt^2}\Big|_{t=0} F(\gamma(t)) = \frac{d^2F}{dx^2}(\dot\gamma(0),\dot\gamma(0)) + \frac{dF}{dx}\,\ddot\gamma(0), \tag{11.4} \]
which shows that the expression (11.4) is defined only up to $\mathrm{Im}\,D_xF$, since $\ddot\gamma(0)$ is arbitrary. Thus only a certain part of the second differential is intrinsically defined. It is called the Hessian of $F$, i.e. the quadratic map
\[ \mathrm{Hess}_xF : \mathrm{Ker}\,D_xF \to T_{F(x)}M/\mathrm{Im}\,D_xF. \]

11.2 Abnormal extremals and regularity of the distance

In the previous chapter we proved that if an abnormal minimizer reaches some point $q$, then the sub-Riemannian distance is not smooth at $q$. If, moreover, no normal minimizer reaches $q$, we can say that it is not even Lipschitz.

Proposition. Assume that there are no normal minimizers that join $q_0$ to $\bar q$. Then $f$ is not Lipschitz in a neighborhood of $\bar q$. Moreover
\[ \lim_{\substack{q\to \bar q\\ q\in\Sigma}} \|d_qf\| = +\infty. \tag{11.5} \]
In the previous statement $\|\cdot\|$ is an arbitrary norm on the fibers of $T^*M$.

Proof. Consider a sequence of smooth points $q_n\in\Sigma$ such that $q_n\to\bar q$. Since the $q_n$ are smooth points, we know that there exist unique controls $u_n$ and covectors $\lambda_n$ such that
\[ \lambda_n D_{u_n}F = u_n, \qquad \lambda_n = d_{q_n}f. \]
Assume by contradiction that $\|d_{q_n}f\|\le M$. Then, using compactness, we find (up to subsequences) $u_n\to u$, $\lambda_n\to\lambda$ with $\lambda D_uF = u$, which means that the associated geodesic reaches $\bar q$. In other words, there exists a normal minimizer that reaches $\bar q$, which is a contradiction.

Let us now consider the end-point map $F:\mathcal{U}\to M$. As we explained in the previous section, its Hessian at a point $u\in\mathcal{U}$ is the quadratic vector function
\[ \mathrm{Hess}_uF : \mathrm{Ker}\,D_uF \to \mathrm{Coker}\,D_uF = T_{F(u)}M/\mathrm{Im}\,D_uF. \]

Remark. Recall that $\lambda D_uF=0$ if and only if $\lambda\in(\mathrm{Im}\,D_uF)^\perp$. In other words, for every abnormal extremal there is a well-defined scalar quadratic form
\[ \lambda\,\mathrm{Hess}_uF : \mathrm{Ker}\,D_uF\to\mathbb{R}. \]
Notice that the dimension of the space $(\mathrm{Im}\,D_uF)^\perp$ of such covectors coincides with $\dim\mathrm{Coker}\,D_uF$.
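To make the definition of the Hessian concrete, here is a minimal worked example. The map $F(x,y)=(x,y^2)$ on $\mathbb{R}^2$ and the two test curves are illustrative choices, not taken from the text; the computation only sketches why the second derivative along curves is defined up to $\mathrm{Im}\,D_xF$.

```latex
% Illustrative example (assumed map): F(x,y) = (x, y^2), base point 0.
\[
D_0F=\begin{pmatrix}1&0\\0&0\end{pmatrix},\qquad
\mathrm{Ker}\,D_0F=\mathrm{span}\{e_2\},\qquad
\mathrm{Im}\,D_0F=\mathrm{span}\{e_1\}.
\]
% Two curves with the same velocity v e_2 in Ker D_0F but different acceleration:
\[
\gamma(t)=(0,\,tv),\qquad \tilde\gamma(t)=\bigl(\tfrac{a}{2}t^2,\,tv\bigr),\qquad a\in\mathbb{R}.
\]
\[
\frac{d^2}{dt^2}\Big|_{t=0}F(\gamma(t))=(0,\,2v^2),\qquad
\frac{d^2}{dt^2}\Big|_{t=0}F(\tilde\gamma(t))=(a,\,2v^2).
\]
% The two results differ by (a,0), which lies in Im D_0F, so only the class is intrinsic:
\[
\mathrm{Hess}_0F(v\,e_2)=\bigl[(0,2v^2)\bigr]\in\mathbb{R}^2/\mathrm{Im}\,D_0F,
\qquad
\lambda\,\mathrm{Hess}_0F(v\,e_2)=2v^2\ \text{ for }\ \lambda=(0,1)\in(\mathrm{Im}\,D_0F)^\perp .
\]
```

In this example the covector $\lambda=(0,1)$ plays the role of the abnormal covector: it annihilates the image of the differential and turns the vector-valued Hessian into a scalar quadratic form.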

Definition. Let $Q:V\to\mathbb{R}$ be a quadratic form defined on a vector space $V$. The index of $Q$ is the maximal dimension of a negative subspace of $Q$:
\[ \mathrm{ind}^-Q = \sup\{\dim W \mid Q|_{W\setminus\{0\}}<0\}. \tag{11.6} \]
Recall that in the finite-dimensional case this number coincides with the number of negative eigenvalues in the diagonal form of $Q$.

The following notion of index of the map $F$ will also be useful:

Definition. Let $F:\mathcal{U}\to M$ and let $u\in\mathcal{U}$ be a critical point for $F$. The index of $F$ at $u$ is
\[ \mathrm{Ind}_uF = \min_{\substack{\lambda\in(\mathrm{Im}\,D_uF)^\perp\\ \lambda\neq 0}} \mathrm{ind}^-(\lambda\,\mathrm{Hess}_uF) - \mathrm{codim}\,\mathrm{Im}\,D_uF. \]

Remark. If $\mathrm{codim}\,\mathrm{Im}\,D_uF=1$, then there exists a unique (up to scalar multiplication) nonzero $\lambda\in(\mathrm{Im}\,D_uF)^\perp$, hence $\mathrm{Ind}_uF = \mathrm{ind}^-(\lambda\,\mathrm{Hess}_uF)-1$.

Theorem. If $\mathrm{Ind}_uF\ge 1$, then $u$ is not a strictly abnormal minimizer.

We state without proof the following result (see Lemma 2.8 of [3]).

Lemma. Let $Q:\mathbb{R}^N\to\mathbb{R}^n$ be a vector-valued quadratic form. Assume that $\mathrm{Ind}_0Q\ge 0$. Then there exists a regular point $x\in\mathbb{R}^N$ of $Q$ such that $Q(x)=0$.

Definition. Let $\Phi:E\to\mathbb{R}^n$ be a smooth map defined on a linear space $E$ and let $r>0$. We say that $\Phi$ is $r$-solid at a point $x\in E$ if there exist constants $C>0$, $\bar\varepsilon>0$ and a neighborhood $U$ of $x$ such that for all $\varepsilon<\bar\varepsilon$ there exists $\delta(\varepsilon)>0$ satisfying
\[ B_{\tilde\Phi(x)}(C\varepsilon^r) \subset \tilde\Phi(B_x(\varepsilon)) \tag{11.7} \]
for all maps $\tilde\Phi\in C^0(E,\mathbb{R}^n)$ such that $\|\tilde\Phi-\Phi\|_{C^0(U,\mathbb{R}^n)}<\delta$.

Exercise. Prove that if $x$ is a regular point of $\Phi$, then $\Phi$ is 1-solid at $x$. (Hint: use the implicit function theorem to prove that $\Phi$ satisfies (11.7) and the Brouwer theorem to show that the same holds for every small perturbation.)

Proposition. Assume that $\mathrm{Ind}_x\Phi\ge 0$. Then $\Phi$ is 2-solid at $x$.

Proof. We can assume that $x=0$ and that $\Phi(0)=0$. We divide the proof into two steps: first we prove that there exists a finite-dimensional subspace $E'\subset E$ such that the restriction $\Phi|_{E'}$ satisfies the assumptions of the proposition. Then we prove the proposition under the assumption that $\dim E<+\infty$.

(i). Denote $k := \dim\mathrm{Coker}\,D_0\Phi$ and consider the Hessian
\[ \mathrm{Hess}_0\Phi : \mathrm{Ker}\,D_0\Phi\to\mathrm{Coker}\,D_0\Phi. \]
We can rewrite the assumption on the index of $\Phi$ as follows:
\[ \mathrm{ind}^-\lambda\,\mathrm{Hess}_0\Phi \ge k, \qquad \forall\,\lambda\in(\mathrm{Im}\,D_0\Phi)^\perp\setminus\{0\}. \tag{11.8} \]

Since property (11.8) is invariant under multiplication of the covector by a positive scalar, we are reduced to the sphere
\[ \lambda\in S^{k-1} = \{\lambda\in(\mathrm{Im}\,D_0\Phi)^\perp,\ |\lambda|=1\}. \]
By definition of index, for every $\lambda\in S^{k-1}$ there exists a subspace $E_\lambda\subset E$, $\dim E_\lambda=k$, such that
\[ \lambda\,\mathrm{Hess}_0\Phi|_{E_\lambda\setminus\{0\}}<0. \]
By the continuity of the form with respect to $\lambda$, there exists a neighborhood $O_\lambda$ of $\lambda$ such that one can take $E_{\lambda'}=E_\lambda$ for every $\lambda'\in O_\lambda$. By compactness we can choose a finite covering of $S^{k-1}$ by open subsets
\[ S^{k-1} = O_{\lambda_1}\cup\dots\cup O_{\lambda_N}. \]
Then it is sufficient to consider the finite-dimensional subspace
\[ E' = \sum_{j=1}^N E_{\lambda_j}. \]

(ii). Assume $\dim E<\infty$ and split
\[ E = E_1\oplus E_2, \qquad E_2 := \mathrm{Ker}\,D_0\Phi. \]
The Hessian is a map
\[ \mathrm{Hess}_0\Phi : E_2\to\mathbb{R}^n/D_0\Phi(E_1). \]
According to Lemma 11.7 there exists $e_2\in E_2$, regular point of $\mathrm{Hess}_0\Phi$, such that
\[ \mathrm{Hess}_0\Phi(e_2)=0 \quad\Longrightarrow\quad D_0^2\Phi(e_2) = D_0\Phi(e_1), \]
for some $e_1\in E_1$. Define the map $Q:E\to\mathbb{R}^n$ by the formula
\[ Q(v_1+v_2) := D_0\Phi(v_1)+\tfrac12 D_0^2\Phi(v_2), \qquad v=v_1+v_2\in E=E_1\oplus E_2, \]
and the vector $e := -e_1/2+e_2$, so that $Q(e) = -\tfrac12 D_0\Phi(e_1)+\tfrac12 D_0^2\Phi(e_2)=0$. From our assumptions it follows that $e$ is a regular point of $Q$. In particular there exists $c>0$ such that
\[ B_0(c)\subset Q(B_0(1)) \]
and the same holds for every small perturbation of the map $Q$ (see Exercise 11.9). Consider then the map
\[ \Phi_\varepsilon : v_1+v_2 \mapsto \frac{1}{\varepsilon^2}\,\Phi(\varepsilon^2v_1+\varepsilon v_2). \tag{11.9} \]
Using that $v_2\in\mathrm{Ker}\,D_0\Phi$ we compute the Taylor expansion with respect to $\varepsilon$:
\[ \Phi_\varepsilon(v_1+v_2) = Q(v_1+v_2)+O(\varepsilon), \tag{11.10} \]
hence for small $\varepsilon$ the image of $\Phi_\varepsilon$ contains a ball around $0$, from which it follows that
\[ B_{\Phi(0)}(c\varepsilon^2)\subset\Phi(B_0(\varepsilon)). \tag{11.11} \]
Moreover, as soon as $\varepsilon$ is fixed, we can perturb the map $\Phi$ and the estimate (11.11) still holds.
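The role of the index condition can be seen in the simplest possible case. The following sketch compares two scalar maps on $\mathbb{R}^2$ with vanishing differential at the origin; the maps $\Phi$ and $\Psi$ are illustrative choices (not from the text), and only the inclusion (11.11) for the unperturbed map is checked directly.

```latex
% Illustrative example (assumed maps): Phi(x,y) = x^2 - y^2 and Psi(x,y) = x^2 + y^2,
% both from R^2 to R, at the critical point 0. Here Ker D_0 = R^2 and k = dim Coker D_0 = 1,
% so the annihilator of the image is spanned by lambda = 1.
\[
D_0^2\Phi(v)=2(v_1^2-v_2^2),\qquad
\mathrm{ind}^-(\lambda\,D_0^2\Phi)=1\ \text{ for }\lambda=\pm1,\qquad
\mathrm{Ind}_0\Phi=1-1=0 .
\]
\[
D_0^2\Psi(v)=2(v_1^2+v_2^2),\qquad
\mathrm{ind}^-(D_0^2\Psi)=0,\quad \mathrm{ind}^-(-D_0^2\Psi)=2,\qquad
\mathrm{Ind}_0\Psi=0-1=-1 .
\]
% Images of the open ball B_0(eps) of radius eps:
\[
\Phi(B_0(\varepsilon))=(-\varepsilon^2,\varepsilon^2)=B_{\Phi(0)}(\varepsilon^2),
\qquad
\Psi(B_0(\varepsilon))=[0,\varepsilon^2).
\]
```

So $\Phi$, which has nonnegative index, covers a ball of radius of order $\varepsilon^2$ around $\Phi(0)$, in agreement with (11.11), while $\Psi$, whose index is negative, is not open at the origin.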

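Actually we proved the following statement, which is stronger than 2-solidness of $\Phi$:

Lemma. Under the assumptions of Proposition 11.10, there exists $C>0$ such that for every $\varepsilon$ small enough
\[ B_{\Phi(0)}(C\varepsilon^2)\subset\Phi\bigl(B_0'(\varepsilon^2)\times B_0''(\varepsilon)\bigr), \tag{11.12} \]
where $B'$ and $B''$ denote the balls in $E_1$ and $E_2$ respectively.

The key point is that, in the subspace where the differential of $\Phi$ vanishes, the ball of radius $\varepsilon$ is mapped into a ball of radius $\varepsilon^2$, while the restriction to the other subspace preserves the order, as the estimates (11.9) and (11.10) show (see footnote 1).

Proof of Theorem 11.6. We prove that if $\mathrm{Ind}_uF\ge 1$, where $u$ is a strictly abnormal geodesic, then $u$ cannot be a minimizer. It is sufficient to show that the extended endpoint map
\[ \Phi:\mathcal{U}\to\mathbb{R}\times M, \qquad \Phi(u)=\begin{pmatrix}J(u)\\ F(u)\end{pmatrix}, \]
is locally open at $u$. Recall that $d_uJ=\lambda D_uF$ for some $\lambda\in T^*_{F(u)}M$ if and only if $d_uJ|_{\mathrm{Ker}\,D_uF}=0$ (see also Proposition 7.6). Since $u$ is strictly abnormal, it follows that
\[ d_uJ|_{\mathrm{Ker}\,D_uF}\neq 0. \tag{11.13} \]
Moreover from the definition of $\Phi$ and (11.13) one has
\[ \mathrm{Ker}\,D_u\Phi = \mathrm{Ker}\,d_uJ\cap\mathrm{Ker}\,D_uF, \qquad \dim\mathrm{Im}\,d_uJ=1. \]
Moreover, a covector $\bar\lambda=(\alpha,\lambda)$ in $\mathbb{R}\times T^*_{F(u)}M$ annihilates the image of $D_u\Phi$ if and only if $\alpha=0$ and $\lambda\in(\mathrm{Im}\,D_uF)^\perp$. Indeed, if $0=\bar\lambda D_u\Phi=\alpha\,d_uJ+\lambda D_uF$ with $\alpha\neq 0$, this would imply that $u$ is also normal. In other words we proved the equality
\[ (\mathrm{Im}\,D_u\Phi)^\perp = \{(0,\lambda)\in\mathbb{R}\times T^*_{F(u)}M \mid \lambda\in(\mathrm{Im}\,D_uF)^\perp\}. \tag{11.14} \]
Combining (11.13) and (11.14) one obtains, for every $\bar\lambda=(0,\lambda)\in(\mathrm{Im}\,D_u\Phi)^\perp$,
\[ \bar\lambda\,\mathrm{Hess}_u\Phi = \lambda\,\mathrm{Hess}_uF|_{\mathrm{Ker}\,d_uJ\cap\mathrm{Ker}\,D_uF}. \tag{11.15} \]
Moreover $\mathrm{codim}\,\mathrm{Im}\,D_u\Phi=\mathrm{codim}\,\mathrm{Im}\,D_uF$, since $\dim\mathrm{Im}\,D_u\Phi=\dim\mathrm{Im}\,D_uF+1$ by (11.13) and $D_u\Phi$ takes values in $\mathbb{R}\times T_{F(u)}M$. Then for every $\bar\lambda=(0,\lambda)\in(\mathrm{Im}\,D_u\Phi)^\perp$
\begin{align*}
\mathrm{ind}^-(\bar\lambda\,\mathrm{Hess}_u\Phi)-\mathrm{codim}\,\mathrm{Im}\,D_u\Phi
&= \mathrm{ind}^-\bigl(\lambda\,\mathrm{Hess}_uF|_{\mathrm{Ker}\,d_uJ\cap\mathrm{Ker}\,D_uF}\bigr)-\mathrm{codim}\,\mathrm{Im}\,D_uF\\
&\ge \mathrm{ind}^-(\lambda\,\mathrm{Hess}_uF)-1-\mathrm{codim}\,\mathrm{Im}\,D_uF,
\end{align*}
and passing to the infimum with respect to $\lambda$ we get
\[ \mathrm{Ind}_u\Phi\ge\mathrm{Ind}_uF-1\ge 0. \]
By Proposition 11.10 this implies that $\Phi$ is locally open at $u$. Hence $u$ cannot be a minimizer.

Footnote 1: $B_0(c)\subset\Phi_\varepsilon(B(1)) \ \Rightarrow\ B_0(c\varepsilon^2)\subset\{\Phi(\varepsilon^2v_1+\varepsilon v_2),\ v_i\in B_i(1)\} \ \Rightarrow\ B_0(c\varepsilon^2)\subset\Phi(B'_{\varepsilon^2}\times B''_\varepsilon)$.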

Now we prove that, under the same assumptions on the index of the endpoint map given in Theorem 11.6, the sub-Riemannian squared distance $f$ is Lipschitz even if some abnormal minimizers are present.

Theorem. Let $K\subset B_{q_0}(r_0)$ be a compact set and assume that $\mathrm{Ind}_uF\ge 1$ for every abnormal minimizer $u$ such that $F(u)\in K$. Then $f$ is Lipschitz on $K$.

Proof. Recall that if there are no abnormal minimizers reaching $K$, Theorem 10.44 ensures that $f$ is Lipschitz on $K$. Then, using compactness of the set of all minimizers, it is sufficient to prove the estimate in a neighborhood of a point $q=F(u)$, where $u$ is abnormal.

Since $\mathrm{Ind}_uF\ge 1$ by assumption, Theorem 11.6 implies that every abnormal minimizer $u$ is not strictly abnormal, i.e., it also has a normal lift. We have
\[ \mathrm{Hess}_uF : \mathrm{Ker}\,D_uF\to\mathrm{Coker}\,D_uF, \qquad \text{with } \mathrm{Ind}_uF\ge 1, \]
and, since $u$ is also normal, it follows that $d_uJ=\lambda D_uF$ for some $\lambda\in T^*_{F(u)}M$, hence $\mathrm{Ker}\,D_uF\subset\mathrm{Ker}\,d_uJ$.

The assumptions of the preceding Lemma are satisfied, hence, splitting the space of controls
\[ L^2_k([0,1]) = E_1\oplus E_2, \qquad E_2 := \mathrm{Ker}\,D_uF, \]
we have that there exist $C_0>0$ and $R>0$ such that for $\varepsilon<R$
\[ B_q(C_0\varepsilon^2)\subset F(B_\varepsilon), \qquad B_\varepsilon := B'_u(\varepsilon^2)\times B''_u(\varepsilon), \qquad q=F(u), \tag{11.16} \]
where $B'_u(r)$ and $B''_u(r)$ are the balls of radius $r$ in $E_1$ and $E_2$ respectively, and $B_q(r)$ is the ball of radius $r$ in coordinates on $M$.

Let us also observe that, since $J$ is smooth on $B'_u(\varepsilon^2)\times B''_u(\varepsilon)$, with $d_uJ=0$ on $E_2$, by Taylor expansion we can find constants $C_1,C_2>0$ such that for every $u'=(u_1',u_2')\in B_\varepsilon$ one has (we write $u=(u_1,u_2)$)
\[ J(u')-J(u)\le C_1\|u_1'-u_1\|+C_2\|u_2'-u_2\|^2. \]
Pick then any point $q'\in K$ such that $|q'-q|=C_0\varepsilon^2$, with $\varepsilon<R$. Then (11.16) implies that there exists $u'=(u_1',u_2')\in B_\varepsilon$ such that $F(u')=q'$. Using that $f(q')\le J(u')$ and $f(q)=J(u)$, since $u$ is a minimizer, we have
\[ f(q')-f(q)\le J(u')-J(u)\le C_1\|u_1'-u_1\|+C_2\|u_2'-u_2\|^2 \tag{11.17} \]
\[ \le C\varepsilon^2 = C'|q'-q|, \tag{11.18} \]
where we can choose $C=C_1+C_2$ and $C'=C/C_0$.
Since $K$ is compact, and the set of controls $u$ associated with minimizers that reach the compact set $K$ is also compact, the constants $R>0$ and $C_0,C_1,C_2$ can be chosen uniformly with respect to $q\in K$. Hence we can exchange the role of $q$ and $q'$ in the above reasoning and get
\[ |f(q')-f(q)|\le C'|q'-q| \]
for every pair of points $q,q'$ such that $|q-q'|\le C_0R^2$.

11.3 Goh and generalized Legendre conditions

In this section we present some necessary conditions for the index of the quadratic form along an abnormal extremal to be finite.

Theorem. Let $u$ be an abnormal minimizer and let $\lambda_1\in T^*_{F(u)}M$ satisfy $\lambda_1D_uF=0$. Assume that $\mathrm{ind}^-\lambda_1\mathrm{Hess}_uF<+\infty$. Then the following conditions are satisfied:

(i) $\langle\lambda(t),[f_i,f_j](\gamma(t))\rangle\equiv 0$, for a.e. $t$, $i,j=1,\dots,k$, (Goh condition)

(ii) $\langle\lambda(t),[[f_{u(t)},f_v],f_v](\gamma(t))\rangle\ge 0$, for a.e. $t$, $\forall\,v\in\mathbb{R}^k$, (generalized Legendre condition)

where $\lambda(t)$ and $\gamma(t)=\pi(\lambda(t))$ are respectively the extremal and the trajectory associated to $\lambda_1$.

Remark. Notice that, in the statement of the previous theorem, if $\lambda_1$ satisfies the assumption $\lambda_1D_uF=0$, then $-\lambda_1$ also satisfies the same assumption. Since $\mathrm{ind}^-(-\lambda_1\mathrm{Hess}_uF)=\mathrm{ind}^+\lambda_1\mathrm{Hess}_uF$, this implies that the statement holds under the assumption $\mathrm{ind}^+\lambda_1\mathrm{Hess}_uF<+\infty$. Indeed the proof shows that as soon as the Goh condition is not satisfied, both the positive and the negative index of this form are infinite.

Notice that these conditions are related to the properties of the distribution of the sub-Riemannian structure and not to the metric. Indeed recall that the extremal $\lambda(t)$ is abnormal if and only if it satisfies
\[ \dot\lambda(t)=\sum_{i=1}^k u_i(t)\,\vec{h}_i(\lambda(t)), \qquad \langle\lambda(t),f_i(\gamma(t))\rangle=0, \quad i=1,\dots,k, \]
i.e. $\lambda(t)$ satisfies the Hamiltonian equation and belongs to $\mathcal{D}^\perp_{\gamma(t)}$. The Goh condition is equivalent to requiring that $\lambda(t)\in(\mathcal{D}^2_{\gamma(t)})^\perp$.

Corollary. Assume that the sub-Riemannian structure is 2-generating, i.e. $\mathcal{D}^2_q=T_qM$ for all $q\in M$. Then there are no strictly abnormal minimizers. In particular $f$ is locally Lipschitz on $M$.

Proof. Since $\mathcal{D}^2_q=T_qM$ implies $(\mathcal{D}^2_{\gamma(t)})^\perp=0$ for every $q\in M$, no abnormal extremal can satisfy the Goh condition. Hence, by the previous theorem on the Goh condition (Theorem 11.13), it follows that $\mathrm{Ind}_uF=+\infty$ for any abnormal minimizer $u$.
In particular, from Theorem 11.6 it follows that the minimizer cannot be strictly abnormal. Hence $f$ is locally Lipschitz by the preceding theorem on the Lipschitz regularity of $f$.

Remark. Notice that $f$ is locally Lipschitz on $M$ if and only if the sub-Riemannian structure is 2-generating. Indeed if the structure is not 2-generating at a point $q$, then from the Ball-Box Theorem (Corollary 8.5) it follows that the squared distance $f$ is not Lipschitz at the base point $q$. On the other hand, on the set where $f$ is positive, $f$ is Lipschitz if and only if the sub-Riemannian distance $d(q_0,\cdot)$ is.

Before going into the proof of the Goh conditions (Theorem 11.13) we discuss an important corollary.

Theorem. Assume that $\mathcal{D}_{q_0}\neq\mathcal{D}^2_{q_0}$. Then for every $\varepsilon>0$ there exists a normal extremal path $\gamma$ starting from $q_0$ such that $\ell(\gamma)=\varepsilon$ and $\gamma$ is not a length-minimizer.

Before the proof, this is the idea: fix an element $\xi\in\mathcal{D}^\perp_{q_0}\setminus(\mathcal{D}^2_{q_0})^\perp$, a set which is nonempty by assumption. We want to build an abnormal minimizing trajectory that has $\xi$ as initial covector and that is the limit of a sequence of strictly normal length-minimizers. In this way this abnormal trajectory will have finite index (the abnormal quadratic form will be the limit of positive ones) and then, by the Goh condition, $\langle\xi,\mathcal{D}^2_{q_0}\rangle=0$, which is a contradiction.

Proof. Assume by contradiction that there exists $T>0$ such that all normal extremal paths $\gamma_\lambda$ associated with an initial covector $\lambda\in H^{-1}(1/2)\cap T^*_{q_0}M$ minimize on the segment $[0,T]$. Since restrictions of length-minimizers are still length-minimizers, by suitably reducing $T>0$ we can assume, thanks to Lemma 3.33, that there exists (see footnote 2) a compact set $K$ such that $\{\gamma_\lambda(T)\mid\lambda\in H^{-1}(1/2)\}\subset K$.

Fix an element $\xi\in\mathcal{D}^\perp_{q_0}\setminus(\mathcal{D}^2_{q_0})^\perp$, a set which is nonempty by assumption. Then consider, given any $\lambda_0\in H^{-1}(1/2)\cap T^*_{q_0}M$, the family of normal extremals (and corresponding normal trajectories)
\[ \lambda_s(t)=e^{t\vec{H}}(\lambda_0+s\xi), \qquad \gamma_s(t)=\pi(\lambda_s(t)), \qquad t\in[0,T], \]
and let $u_s$ be the control associated with $\gamma_s$, defined on $[0,T]$. Due to Theorem 1.9, there exists a positive sequence $s_n\to+\infty$ such that $q_n:=\gamma_{s_n}(T)$ is a smooth point for the squared distance from $q_0$, for every $n\in\mathbb{N}$.

By compactness of minimizers reaching $K$, there exist a subsequence of $s_n$, which we still denote by the same symbol, and a minimizing control $\bar u$ such that $u_{s_n}\to\bar u$ when $n\to\infty$. In particular $\gamma_{s_n}$ is a strictly normal length-minimizer for every $n\in\mathbb{N}$.

Denote by $\Phi^n_t=P^{u_{s_n}}_{0,t}$ the nonautonomous flow generated by the control $u_{s_n}$. The family $\lambda_{s_n}(t)$ satisfies
\[ \lambda_{s_n}(t)=e^{t\vec{H}}(\lambda_0+s_n\xi)=(\Phi^n_t)^*(\lambda_0+s_n\xi). \]
Moreover, by continuity of the flow with respect to convergence of controls, we have that $\Phi^n_t\to\Phi_t$ for $n\to\infty$, where $\Phi_t$ denotes the flow associated with the control $\bar u$.
Hence we have that the rescaled family
\[ \frac{1}{s_n}\lambda_{s_n}(t)=(\Phi^n_t)^*\Bigl(\frac{1}{s_n}\lambda_0+\xi\Bigr) \]
converges for $n\to\infty$ to the limit extremal $\bar\lambda(t)=\Phi_t^*\xi$. Notice that $\bar\lambda(t)$ is, by construction, an abnormal extremal associated with the minimizing control $\bar u$, and with initial covector $\xi$.

The fact that $u_{s_n}$ is a strictly normal minimizer says that the Hessian of the energy $J$ restricted to the level set $F^{-1}(q_n)$ is nonnegative. Recall that
\[ \mathrm{Hess}_u\,J|_{F^{-1}(q)} = I-\lambda_1D_u^2F, \]
where $\lambda_1\in T^*_{F(u)}M$ is the final covector of the extremal lift. In particular we have for every $n\in\mathbb{N}$ and every control $v$ the following inequality:
\[ \|v\|^2-\lambda_{s_n}(T)D^2_{u_{s_n}}F(v,v)\ge 0. \]
This immediately implies
\[ \frac{1}{s_n}\|v\|^2-\frac{1}{s_n}\lambda_{s_n}(T)D^2_{u_{s_n}}F(v,v)\ge 0, \]

Footnote 2: Indeed it is enough to fix an arbitrary compact set $K$ with $q_0\in\mathrm{int}(K)$ such that the corresponding $\delta_K$ defined by Lemma 3.33 is smaller than $T$.

and passing to the limit for $n\to\infty$ one gets
\[ -\bar\lambda(T)D^2_{\bar u}F(v,v)\ge 0. \]
In particular one has that
\[ \mathrm{ind}^+\,\bar\lambda(T)\mathrm{Hess}_{\bar u}F=\mathrm{ind}^-\bigl(-\bar\lambda(T)D^2_{\bar u}F\bigr)=0. \]
Hence the abnormal extremal has finite (positive) index and we can apply the Goh conditions (see Theorem 11.13 and Remark 11.14). Thus $\xi$ is orthogonal to $\mathcal{D}^2_{q_0}$, which is a contradiction since $\xi\in\mathcal{D}^\perp_{q_0}\setminus(\mathcal{D}^2_{q_0})^\perp$.

Remark (About the assumptions of Theorem 11.17). Assume that the sub-Riemannian structure is bracket-generating and is not Riemannian in an open set $O\subset M$, i.e., $\mathcal{D}_q\neq T_qM$ for every $q\in O$. Then there exists a dense set $D\subset O$ such that $\mathcal{D}_q\neq\mathcal{D}^2_q$ for every $q\in D$.

Indeed assume that $\mathcal{D}_q=\mathcal{D}^2_q$ for all $q$ in an open set $A$. Then it is easy to see that $\mathcal{D}^i_q=\mathcal{D}_q\neq T_qM$ for all $q\in A$ and all $i$, since the structure is not Riemannian. Hence the structure is not bracket-generating in $A$, which gives a contradiction.

Proof of Goh condition - (i) of Theorem 11.13

Proof of Theorem 11.13. Denote by $u$ the abnormal control and by $P_t=\overrightarrow{\exp}\int_0^tf_{u(s)}\,ds$ the nonautonomous flow generated by $u$. Following the argument used in the proof of Proposition 7.2 we can write the end-point map as the composition
\[ F(u+v)=P_1(G(v)), \qquad D_uF=P_{1*}D_0G, \]
and reduce the problem to the expansion of $G$, which is easier. Indeed, denoting $g^t_i:=P^{-1}_{t*}f_i$, the map $G$ can be interpreted as the end-point map for the system
\[ \dot q(t)=g^t_{v(t)}(q(t))=\sum_{i=1}^kv_i(t)\,g^t_i(q(t)), \]
and the Hessian of $F$ can be computed easily starting from the Hessian of $G$ at $v=0$:
\[ \mathrm{Hess}_uF=P_{1*}\mathrm{Hess}_0G, \]
from which we get, using that $\lambda_0=P_1^*\lambda_1$,
\[ \lambda_1\mathrm{Hess}_uF=\lambda_1P_{1*}\mathrm{Hess}_0G=\lambda_0\mathrm{Hess}_0G. \]
Moreover, computing
\[ \langle\lambda(t),[f_i,f_j](\gamma(t))\rangle=\langle\lambda_0,P^{-1}_{t*}[f_i,f_j](\gamma(t))\rangle=\langle\lambda_0,[g^t_i,g^t_j](\gamma(0))\rangle, \]
the Goh and generalized Legendre conditions can also be rewritten as
\[ \langle\lambda_0,[g^t_i,g^t_j](\gamma(0))\rangle\equiv 0, \qquad \text{for a.e. } t\in[0,1],\ i,j=1,\dots,k, \tag{G.1} \]
\[ \langle\lambda_0,[[g^t_{u(t)},g^t_i],g^t_i](\gamma(0))\rangle\ge 0, \qquad \text{for a.e. } t\in[0,1],\ i=1,\dots,k. \tag{L.1} \]

Now we want to compute the Hessian of the map $G$. Using the Volterra expansion computed in Chapter 6 we have
\[ G(v(\cdot)) = q_0\circ\Bigl(\mathrm{Id}+\int_0^1g^t_{v(t)}\,dt+\iint_{0\le\tau\le t\le 1}g^\tau_{v(\tau)}\circ g^t_{v(t)}\,d\tau\,dt\Bigr)+O(\|v\|^3), \]
where we used that $g^t_v$ is linear with respect to $v$ to estimate the remainder. This expansion lets us recover immediately the linear part, i.e. the expression for the first differential, which can be interpreted geometrically as the integral mean
\[ D_0G(v)=\int_0^1g^t_{v(t)}(q_0)\,dt. \]
On the other hand the expression for the quadratic part, i.e. the second differential
\[ D_0^2G(v)=2\,q_0\circ\iint_{0\le\tau\le t\le 1}g^\tau_{v(\tau)}\circ g^t_{v(t)}\,d\tau\,dt, \]
has no immediate geometrical interpretation. Recall that the second differential $D_0^2G$ is defined on the set
\[ \mathrm{Ker}\,D_0G=\Bigl\{v\in L^2_k[0,1],\ \int_0^1g^t_{v(t)}(q_0)\,dt=0\Bigr\} \tag{11.19} \]
and, for such a $v$, $D_0^2G(v)$ belongs to the tangent space $T_{q_0}M$. Indeed, using Lemma 7.19 and the fact that $v$ belongs to the set (11.19), we can symmetrize the second derivative, getting the formula
\[ D_0^2G(v)=\iint_{0\le\tau\le t\le 1}[g^\tau_{v(\tau)},g^t_{v(t)}](q_0)\,d\tau\,dt, \]
which shows that the second differential is computed by the integral mean of the commutator of the vector fields $g^t_{v(t)}$ at different times.

Now consider an element $\lambda_0\in(\mathrm{Im}\,D_0G)^\perp$, i.e. one that satisfies
\[ \langle\lambda_0,g^t_v(q_0)\rangle=0, \qquad \text{for a.e. } t\in[0,1],\ \forall\,v\in\mathbb{R}^k. \]
Then we can compute the Hessian
\[ \lambda_0\mathrm{Hess}_0G(v)=\iint_{0\le\tau\le t\le 1}\langle\lambda_0,[g^\tau_{v(\tau)},g^t_{v(t)}](q_0)\rangle\,d\tau\,dt. \tag{11.20} \]

Remark. Denoting by $K$ the bilinear form
\[ K(\tau,t)(v,w)=\langle\lambda_0,[g^\tau_v,g^t_w](q_0)\rangle, \]
the Goh and generalized Legendre conditions are rewritten as follows:
\[ K(t,t)(v,w)=0, \qquad \forall\,v,w\in\mathbb{R}^k,\ \text{for a.e. } t\in[0,1], \tag{G.2} \]
\[ \frac{\partial K}{\partial\tau}(\tau,t)\Big|_{\tau=t}(v,v)\ge 0, \qquad \forall\,v\in\mathbb{R}^k,\ \text{for a.e. } t\in[0,1]. \tag{L.2} \]

Indeed, the first one easily follows from (G.1). Moreover recall that $g^t_v=P^{-1}_{t*}f_v$, hence the map $t\mapsto g^t_v$ is Lipschitz for every fixed $v$. By definition of $P_t=\overrightarrow{\exp}\int_0^tf_{u(s)}\,ds$ it follows that
\[ \frac{\partial}{\partial t}g^t_v=[g^t_{u(t)},g^t_v], \]
which shows that (L.2) is equivalent to (L.1).

Finally we want to express the Hessian of $G$ in Hamiltonian terms. To this end, consider the family of functions on $T^*M$ which are linear on fibers, associated with the vector fields $g^t_v$:
\[ h^t_v(\lambda):=\langle\lambda,g^t_v(q)\rangle, \qquad \lambda\in T^*M,\ q=\pi(\lambda), \]
and define, for a fixed element $\lambda_0\in(\mathrm{Im}\,D_0G)^\perp$,
\[ \eta^t_v:=\vec{h}^t_v(\lambda_0)\in T_{\lambda_0}(T^*M). \tag{11.21} \]
Using the identities
\[ \sigma_\lambda(\vec{h}^t_v,\vec{h}^t_w)=\{h^t_v,h^t_w\}(\lambda)=\langle\lambda,[g^t_v,g^t_w](q)\rangle, \qquad q=\pi(\lambda), \]
and computing at the point $\lambda_0\in T^*_{q_0}M$ we find
\[ \sigma_{\lambda_0}(\eta^t_v,\eta^t_w)=\langle\lambda_0,[g^t_v,g^t_w](q_0)\rangle, \]
and we get the final expression for the Hessian
\[ \lambda_0\mathrm{Hess}_0G(v(\cdot))=\iint_{0\le\tau\le t\le 1}\sigma_{\lambda_0}(\eta^\tau_{v(\tau)},\eta^t_{v(t)})\,dt\,d\tau, \tag{11.22} \]
where the control $v\in\mathrm{Ker}\,D_0G$ satisfies the relation (notice that $\pi_*\eta^t_v=g^t_v(q_0)$)
\[ \pi_*\int_0^1\eta^t_{v(t)}\,dt=\int_0^1\pi_*\eta^t_{v(t)}\,dt=0. \]
Moreover the Hamiltonian version of the Goh and Legendre conditions is expressed as follows:
\[ \sigma_{\lambda_0}(\eta^t_v,\eta^t_w)=0, \qquad \forall\,v,w\in\mathbb{R}^k,\ \text{for a.e. } t\in[0,1], \tag{G.3} \]
\[ \sigma_{\lambda_0}(\dot\eta^t_v,\eta^t_v)\ge 0, \qquad \forall\,v\in\mathbb{R}^k,\ \text{for a.e. } t\in[0,1]. \tag{L.3} \]
We are reduced to proving, under the assumption $\mathrm{ind}^-\lambda_0\mathrm{Hess}_0G<+\infty$, that (G.3) and (L.3) hold. Actually we will prove that the Goh and generalized Legendre conditions are necessary for the restriction of the quadratic form to the subspace of controls in $\mathrm{Ker}\,D_0G$ that are concentrated on small segments $[t,t+s]$ to have finite index.

In what follows we fix once and for all $t\in[0,1[$. Consider an arbitrary vector control function $v:[0,1]\to\mathbb{R}^k$ with compact support in $[0,1]$ and build, for $s>0$ small enough, the control
\[ v_s(\tau)=v\Bigl(\frac{\tau-t}{s}\Bigr), \qquad \mathrm{supp}\,v_s\subset[t,t+s]. \tag{11.23} \]

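The idea is to apply the Hessian to these particular control functions and then compute the asymptotics for $s\to 0$. (If the index is finite, then it is finite here as well.) Actually, since the index of a quadratic form is finite if and only if the same holds for the restriction of the quadratic form to a subspace of finite codimension, it is not restrictive to restrict also to the subspace of zero-average controls
\[ E_s:=\Bigl\{v_s\in\mathrm{Ker}\,D_0G,\ v_s \text{ defined by (11.23)},\ \int_0^1v(\tau)\,d\tau=0\Bigr\}. \]
Notice that this space depends on the choice of $t$, while $\mathrm{codim}\,E_s$ does not depend on $s$.

Remark. We will use the following identity (writing $\sigma$ for $\sigma_{\lambda_0}$), which holds for arbitrary control functions $v,w:[0,1]\to\mathbb{R}^k$:
\[ \iint_{\alpha\le\tau\le t\le\beta}\sigma(\eta^\tau_{v(\tau)},\eta^t_{w(t)})\,dt\,d\tau=\int_\alpha^\beta\sigma\Bigl(\int_\alpha^t\eta^\tau_{v(\tau)}\,d\tau,\ \eta^t_{w(t)}\Bigr)dt=\int_\alpha^\beta\sigma\Bigl(\eta^\tau_{v(\tau)},\ \int_\tau^\beta\eta^t_{w(t)}\,dt\Bigr)d\tau. \tag{11.24} \]
For the specific choice $w(t)=\int_\alpha^tv(\tau)\,d\tau$ we have also the integration by parts formula
\[ \int_\alpha^\beta\eta^t_{v(t)}\,dt=\eta^\beta_{w(\beta)}-\eta^\alpha_{w(\alpha)}-\int_\alpha^\beta\dot\eta^t_{w(t)}\,dt. \tag{11.25} \]

Combining (11.22) and (11.24), we rewrite the Hessian applied to $v_s$ as follows:
\[ \lambda_0\mathrm{Hess}_0G(v_s(\cdot))=\int_t^{t+s}\sigma\Bigl(\int_t^\tau\eta^\theta_{v_s(\theta)}\,d\theta,\ \eta^\tau_{v_s(\tau)}\Bigr)d\tau. \tag{11.26} \]
Notice that the control $v_s$ is concentrated on the segment $[t,t+s]$, thus we have restricted the extrema of the integral. The integration by parts formula (11.25), using our boundary conditions, gives
\[ \int_t^\tau\eta^\theta_{v_s(\theta)}\,d\theta=\eta^\tau_{w_s(\tau)}-\int_t^\tau\dot\eta^\theta_{w_s(\theta)}\,d\theta, \tag{11.27} \]
where we defined
\[ w_s(\theta)=\int_t^\theta v_s(\tau)\,d\tau, \qquad \theta\in[t,t+s]. \]
Combining (11.26) and (11.27) one has
\begin{align*}
\lambda_0\mathrm{Hess}_0G(v_s(\cdot))
&=\int_t^{t+s}\sigma(\eta^\tau_{w_s(\tau)},\eta^\tau_{v_s(\tau)})\,d\tau-\int_t^{t+s}\sigma\Bigl(\int_t^\tau\dot\eta^\theta_{w_s(\theta)}\,d\theta,\ \eta^\tau_{v_s(\tau)}\Bigr)d\tau\\
&=\int_t^{t+s}\sigma(\eta^\tau_{w_s(\tau)},\eta^\tau_{v_s(\tau)})\,d\tau-\int_t^{t+s}\sigma\Bigl(\dot\eta^\tau_{w_s(\tau)},\ \int_\tau^{t+s}\eta^\theta_{v_s(\theta)}\,d\theta\Bigr)d\tau, \tag{11.28}
\end{align*}
where the second equality uses (11.24). Next consider the second term in (11.28) and apply again the integration by parts formula (recall that $w_s(t+s)=0$):
\[ \int_t^{t+s}\sigma\Bigl(\dot\eta^\tau_{w_s(\tau)},\ \int_\tau^{t+s}\eta^\theta_{v_s(\theta)}\,d\theta\Bigr)d\tau=-\int_t^{t+s}\sigma(\dot\eta^\tau_{w_s(\tau)},\eta^\tau_{w_s(\tau)})\,d\tau-\int_t^{t+s}\sigma\Bigl(\dot\eta^\tau_{w_s(\tau)},\ \int_\tau^{t+s}\dot\eta^\theta_{w_s(\theta)}\,d\theta\Bigr)d\tau. \]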

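Collecting together all these results one obtains
\begin{align*}
\lambda_0\mathrm{Hess}_0G(v_s(\cdot))
&=\int_t^{t+s}\sigma(\eta^\tau_{w_s(\tau)},\eta^\tau_{v_s(\tau)})\,d\tau+\int_t^{t+s}\sigma(\dot\eta^\tau_{w_s(\tau)},\eta^\tau_{w_s(\tau)})\,d\tau\\
&\quad+\int_t^{t+s}\sigma\Bigl(\dot\eta^\tau_{w_s(\tau)},\ \int_\tau^{t+s}\dot\eta^\theta_{w_s(\theta)}\,d\theta\Bigr)d\tau.
\end{align*}
This is indeed a homogeneous decomposition of $\lambda_0\mathrm{Hess}_0G(v_s(\cdot))$ with respect to $s$, in the following sense. Since
\[ w_s(\theta)=s\,w\Bigl(\frac{\theta-t}{s}\Bigr), \]
we can perform the change of variable
\[ \zeta=\frac{\tau-t}{s}, \qquad \tau\in[t,t+s], \]
and obtain the following expression for the Hessian:
\begin{align*}
\lambda_0\mathrm{Hess}_0G(v_s(\cdot))
&=s^2\int_0^1\sigma(\eta^{t+s\theta}_{w(\theta)},\eta^{t+s\theta}_{v(\theta)})\,d\theta+s^3\int_0^1\sigma(\dot\eta^{t+s\theta}_{w(\theta)},\eta^{t+s\theta}_{w(\theta)})\,d\theta \tag{11.29}\\
&\quad+s^4\int_0^1\sigma\Bigl(\dot\eta^{t+s\theta}_{w(\theta)},\ \int_\theta^1\dot\eta^{t+s\zeta}_{w(\zeta)}\,d\zeta\Bigr)d\theta.
\end{align*}
We recall that here $v_s$ is defined through a control $v$ compactly supported in $[0,1]$ by (11.23) and $w$ is the primitive of $v$, which is also compactly supported in $[0,1]$. In particular we can write
\[ \lambda_0\mathrm{Hess}_0G(v_s(\cdot))=s^2\int_0^1\sigma(\eta^t_{w(\theta)},\eta^t_{v(\theta)})\,d\theta+O(s^3). \tag{11.30} \]
By assumption $\mathrm{ind}^-\lambda_0\mathrm{Hess}_0G<+\infty$. This implies that the quadratic form given by its principal part
\[ w(\cdot)\mapsto\int_0^1\sigma(\eta^t_{w(\theta)},\eta^t_{\dot w(\theta)})\,d\theta \tag{11.31} \]
also has finite index. Indeed, assume that (11.31) has infinite negative index. Then by continuity every sufficiently small perturbation of (11.31) would have infinite index too. Hence, for $s$ small enough, the quadratic form $\lambda_0\mathrm{Hess}_0G$ would also have infinite index, contradicting our assumption in view of (11.30).

To prove the Goh condition, it is then sufficient to show that if (11.31) has finite index then the integrand is zero, which is guaranteed by the following

Lemma. Let $A:\mathbb{R}^k\times\mathbb{R}^k\to\mathbb{R}$ be a skew-symmetric bilinear form and define the quadratic form
\[ Q:\mathcal{U}\to\mathbb{R}, \qquad Q(w(\cdot))=\int_0^1A(w(t),\dot w(t))\,dt, \]
where $\mathcal{U}:=\{w(\cdot)\in\mathrm{Lip}[0,1],\ w(0)=w(1)=0\}$. Then $\mathrm{ind}^-Q<+\infty$ if and only if $A\equiv 0$.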

Proof. Clearly if $A=0$, then $Q=0$ and $\mathrm{ind}^-Q=0$. Assume then that $A\neq 0$; we prove that $\mathrm{ind}^-Q=+\infty$. We divide the proof into steps.

(i). The bilinear form $B:\mathcal{U}\times\mathcal{U}\to\mathbb{R}$ defined by
\[ B(w_1(\cdot),w_2(\cdot))=\int_0^1A(w_1(t),\dot w_2(t))\,dt \]
is symmetric. Indeed, integrating by parts and using the boundary conditions we get
\begin{align*}
B(w_1,w_2)&=\int_0^1A(w_1(t),\dot w_2(t))\,dt\\
&=-\int_0^1A(\dot w_1(t),w_2(t))\,dt\\
&=\int_0^1A(w_2(t),\dot w_1(t))\,dt=B(w_2,w_1).
\end{align*}

(ii). $Q$ is not identically zero. Since $Q$ is the quadratic form associated with $B$, from the polarization formula
\[ B(w_1,w_2)=\tfrac14\bigl(Q(w_1+w_2)-Q(w_1-w_2)\bigr) \]
it easily follows that $Q\equiv 0$ if and only if $B\equiv 0$. Then it is sufficient to prove that $B$ is not zero. Assume that there exist $x,y\in\mathbb{R}^k$ such that $A(x,y)\neq 0$, and consider a smooth nonconstant function
\[ \alpha:\mathbb{R}\to\mathbb{R}, \qquad \text{s.t. } \alpha(0)=\alpha(1)=\dot\alpha(0)=\dot\alpha(1)=0. \]
Then $\alpha(t)z,\ \dot\alpha(t)z\in\mathcal{U}$ for every $z\in\mathbb{R}^k$ and we can compute
\[ B(\dot\alpha(\cdot)x,\alpha(\cdot)y)=\int_0^1A(\dot\alpha(t)x,\dot\alpha(t)y)\,dt=A(x,y)\int_0^1\dot\alpha(t)^2\,dt\neq 0. \]

(iii). $Q$ has the same number of positive and negative eigenvalues. Indeed it is easy to see that $Q$ satisfies the identity
\[ Q(w(1-\cdot))=-Q(w(\cdot)), \]
from which (iii) follows.

(iv). $Q$ is nonzero on an infinite-dimensional subspace. Consider some $w\in\mathcal{U}$ such that $Q(w)=\alpha\neq 0$. For every $x=(x_1,\dots,x_N)\in\mathbb{R}^N$ one can build the function
\[ w_x(t)=x_i\,w(Nt-i+1), \qquad t\in\Bigl[\frac{i-1}{N},\frac{i}{N}\Bigr], \quad i=1,\dots,N. \]
An easy computation shows that
\[ Q(w_x)=\alpha\sum_{i=1}^Nx_i^2. \]
In particular there exists a subspace of arbitrarily large dimension where $Q$ is nondegenerate.
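A concrete instance may help to visualize steps (iii) and (iv). The sketch below takes $k=2$, the standard area form for $A$, and a circle through the origin for $w$; these choices are illustrative assumptions, not part of the text.

```latex
% Illustrative example (assumed data): k = 2, A(x,y) = x_1 y_2 - x_2 y_1,
% and the closed curve w(t) = (1 - cos 2 pi t, sin 2 pi t), which satisfies w(0) = w(1) = 0.
\[
A(w,\dot w)=w_1\dot w_2-w_2\dot w_1
=2\pi\bigl[(1-\cos 2\pi t)\cos 2\pi t-\sin^2 2\pi t\bigr]
=2\pi(\cos 2\pi t-1),
\]
\[
Q(w)=\int_0^1 2\pi(\cos 2\pi t-1)\,dt=-2\pi .
\]
% Step (iii): reversing time changes the sign.
\[
\tilde w(t)=w(1-t)=(1-\cos 2\pi t,\,-\sin 2\pi t),\qquad Q(\tilde w)=+2\pi .
\]
% Step (iv): concatenating N rescaled copies of w with weights x_i gives
\[
Q(w_x)=-2\pi\sum_{i=1}^N x_i^2<0\quad\text{for } x\neq 0,
\qquad\text{hence}\qquad \mathrm{ind}^-Q\ge N\ \text{ for every } N .
\]
```

For this particular $A$, the value $Q(w)$ is twice the signed area enclosed by the closed curve $w$ (here a unit circle traversed clockwise), which explains both the value $-2\pi$ and the sign change under time reversal.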

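Proof of generalized Legendre condition - (ii) of Theorem 11.13

Applying the previous Lemma for any $t$, we prove that the term of order $s^2$ in (11.29) vanishes and we get
\begin{align*}
\lambda_0\mathrm{Hess}_0G(v_s(\cdot))&=s^3\int_0^1\sigma(\dot\eta^{t+s\theta}_{w(\theta)},\eta^{t+s\theta}_{w(\theta)})\,d\theta+O(s^4)\\
&=s^3\int_0^1\sigma(\dot\eta^{t+s\theta}_{w(\theta)},\eta^t_{w(\theta)})\,d\theta+O(s^4),
\end{align*}
where the last equality follows from the fact that $\eta^t_v$ is Lipschitz with respect to $t$ (see also (11.21)), i.e.
\[ \eta^{t+s\theta}_v=\eta^t_v+O(s). \]
On the other hand $\dot\eta^t_v$ is only measurable and bounded, but the Lebesgue points of $u$ are the same as those of $\dot\eta$. In particular, if $t$ is a Lebesgue point of $\dot\eta$, the quantity $\dot\eta^t_{w(\cdot)}$ is well defined and we can write
\[ \lambda_0\mathrm{Hess}_0G(v_s(\cdot))=s^3\int_0^1\sigma(\dot\eta^t_{w(\theta)},\eta^t_{w(\theta)})\,d\theta+s^3\Bigl(\int_0^1\sigma(\dot\eta^{t+s\theta}_{w(\theta)},\eta^t_{w(\theta)})-\sigma(\dot\eta^t_{w(\theta)},\eta^t_{w(\theta)})\,d\theta\Bigr)+O(s^4). \]
Using the linearity of $\sigma$ and the boundedness of the vector fields we can estimate
\begin{align*}
\Bigl|\int_0^1\sigma(\dot\eta^{t+s\theta}_{w(\theta)},\eta^t_{w(\theta)})-\sigma(\dot\eta^t_{w(\theta)},\eta^t_{w(\theta)})\,d\theta\Bigr|
&\le C\int_0^1\bigl|\dot\eta^{t+s\theta}_{w(\theta)}-\dot\eta^t_{w(\theta)}\bigr|\,d\theta\\
&\le C\sup_{|v|\le 1}\frac{1}{s}\int_0^s\bigl|\dot\eta^{t+\tau}_v-\dot\eta^t_v\bigr|\,d\tau\ \xrightarrow[s\to 0]{}\ 0,
\end{align*}
where the last term tends to zero by definition of Lebesgue point. Hence we come to
\[ \lambda_0\mathrm{Hess}_0G(v_s(\cdot))=s^3\int_0^1\sigma(\dot\eta^t_{w(\theta)},\eta^t_{w(\theta)})\,d\theta+o(s^3). \tag{11.32} \]
To prove the generalized Legendre condition we have to prove that the integrand is a nonnegative quadratic form. This follows from the following Lemma, which can be proved similarly to the previous one.

Lemma. Let $Q:\mathbb{R}^k\to\mathbb{R}$ be a quadratic form on $\mathbb{R}^k$ and
\[ \mathcal{U}:=\{w(\cdot)\in\mathrm{Lip}[0,1],\ w(0)=w(1)=0\}. \]
The quadratic form
\[ \bar Q:\mathcal{U}\to\mathbb{R}, \qquad \bar Q(w(\cdot))=\int_0^1Q(w(t))\,dt \]
has finite index if and only if $Q$ is nonnegative.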

More on Goh and generalized Legendre conditions

If the Goh condition is satisfied, the generalized Legendre condition can also be characterized as an intrinsic property of the module. Indeed one can see that the quadratic map
\[ U_{\gamma(t)}\to\mathbb{R}, \qquad v\mapsto\langle\lambda(t),[[f_{u(t)},f_v],f_v](\gamma(t))\rangle \]
is well defined and does not depend on the extension of $f_v$ to a vector field $f_{v(t)}$ on $U$.

Notice that, using the notation $h_v(\lambda)=\langle\lambda,f_v(q)\rangle$, an abnormal extremal satisfies
\[ h_v(\lambda(t))\equiv 0, \qquad \forall\,v\in\mathbb{R}^k. \]
Recalling that the Poisson bracket between linear functions on $T^*M$ is computed by the Lie bracket,
\[ \{h_v,h_w\}(\lambda)=\langle\lambda,[f_v,f_w](q)\rangle, \]
we can rewrite the Goh condition as follows:
\[ \{h_v,h_w\}(\lambda(t))\equiv 0, \qquad \forall\,v,w\in\mathbb{R}^k, \tag{11.33} \]
while the generalized Legendre condition reads
\[ \{\{h_{u(t)},h_v\},h_v\}(\lambda(t))\ge 0, \qquad \forall\,v\in\mathbb{R}^k. \tag{11.34} \]
Taking the derivative of (11.33) with respect to $t$ we find
\[ \{h_{u(t)},\{h_v,h_w\}\}(\lambda(t))\equiv 0, \qquad \forall\,v,w\in\mathbb{R}^k, \]
and using the Jacobi identity of the Poisson bracket we get that the bilinear form
\[ (v,w)\mapsto\{\{h_{u(t)},h_v\},h_w\}(\lambda(t)) \tag{11.35} \]
is symmetric. Hence the generalized Legendre condition says that the quadratic form associated with (11.35) is nonnegative.

Now we want to characterize the trajectories that satisfy these conditions. Recall that, if $\lambda(t)$ is an abnormal geodesic, we have
\[ \dot\lambda(t)=\vec{h}_{u(t)}(\lambda(t)), \qquad h_i(\lambda(t))\equiv 0, \qquad 0\le t\le 1, \tag{11.36} \]
where $\vec{h}_{u(t)}=\sum_{i=1}^ku_i(t)\,\vec{h}_i$. Moreover, for any smooth function $a:T^*M\to\mathbb{R}$,
\[ \frac{d}{dt}a(\lambda(t))=\{h_{u(t)},a\}(\lambda(t))=\sum_{i=1}^ku_i(t)\{h_i,a\}(\lambda(t)). \]

Notation. We will denote the iterated Poisson brackets by
\begin{align*}
h_{i_1\dots i_k}(\lambda)&=\{h_{i_1},\dots,\{h_{i_{k-1}},h_{i_k}\}\}(\lambda) \tag{11.37}\\
&=\langle\lambda,[f_{i_1},\dots,[f_{i_{k-1}},f_{i_k}]](q)\rangle, \qquad q=\pi(\lambda). \tag{11.38}
\end{align*}

Differentiating the identities in (11.36), using (11.37), we get

$$\frac{d}{dt}h_i(\lambda(t))=\sum_{j=1}^k u_j(t)\,h_{ji}(\lambda(t))=0,\qquad \forall\, t. \qquad (11.39)$$

If $k$ is odd we always have a nontrivial solution of the system; if $k$ is even this is possible only for those $\lambda$ that satisfy $\det\{h_{ij}(\lambda)\}=0$.

But we want to characterize only those controls that satisfy the Goh conditions, i.e. such that

$$h_{ij}(\lambda(t))\equiv 0. \qquad (11.40)$$

Hence one cannot recover the control $u$ from the linear system (11.39). We differentiate again equations (11.40) and we find

$$\sum_{l=1}^k u_l(t)\,h_{lij}(\lambda(t))\equiv 0. \qquad (11.41)$$

For every fixed $t$, these are $k(k-1)/2$ equations in the $k$ variables $u_1,\ldots,u_k$. Hence

(i) If $k=2$, we have 1 equation in 2 variables and we can recover the control $u_1,u_2$ up to a scalar multiplier, if at least one of the coefficients does not vanish. Since we can always deal with length-parametrized curves, this uniquely determines the control $u$.

(ii) If $k\ge 3$, the system is overdetermined.

Remark. For generic systems it is proved that, when $k\ge 3$, the Goh conditions are not satisfied. On the other hand, in the case of Carnot groups, for big codimension of the distribution, abnormal minimizers always appear.

Rank 2 distributions and nice abnormal extremals

Consider a rank 2 distribution generated by a local frame $f_1,f_2$ and let $h_1,h_2$ be the associated linear Hamiltonians. An abnormal extremal $\lambda(t)$ associated with a control $u(t)$ satisfies the system of equations

$$\dot\lambda(t)=u_1(t)\,\vec h_1(\lambda(t))+u_2(t)\,\vec h_2(\lambda(t)),\qquad h_1(\lambda(t))=h_2(\lambda(t))=0. \qquad (11.42)$$

Define the linear Hamiltonian associated with the Lie bracket $[f_1,f_2]$,

$$h_{12}(\lambda)=\langle\lambda,[f_1,f_2](q)\rangle.$$

Notice that in this special framework the Goh condition is rewritten as $h_{12}(\lambda(t))=0$ for a.e. $t$. Equivalently, an abnormal extremal satisfies the Goh condition if and only if $\lambda(t)\in(D^2)^\perp$.

Lemma. Every nontrivial abnormal extremal on a rank 2 sub-Riemannian structure satisfies the Goh condition.

Proof. Indeed, differentiating the identity (11.42) one gets (we omit $t$ in the notation for simplicity)

$$u_2\{h_2,h_1\}=u_2\,h_{21}(\lambda)=0,\qquad u_1\{h_1,h_2\}=u_1\,h_{12}(\lambda)=0.$$

Since at least one among $u_1$ and $u_2$ is not identically zero, we have that $h_{12}(\lambda(t))\equiv 0$, that is the Goh condition.
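To see how restrictive the Goh condition is in the simplest rank 2 case, here is a sketch on the Heisenberg frame in $\mathbb R^3$. The frame and the coordinates $(x,y,z;p_x,p_y,p_z)$ on $T^*\mathbb R^3$ are an illustrative assumption, not taken from this section.

```latex
% Illustrative frame (assumption): Heisenberg distribution on R^3.
\begin{align*}
f_1 &= \partial_x-\tfrac{y}{2}\,\partial_z, &
f_2 &= \partial_y+\tfrac{x}{2}\,\partial_z, &
[f_1,f_2] &= \partial_z,\\
h_1 &= p_x-\tfrac{y}{2}\,p_z, &
h_2 &= p_y+\tfrac{x}{2}\,p_z, &
h_{12} &= p_z .
\end{align*}
% An abnormal extremal must satisfy h_1 = h_2 = 0 and, by the Lemma above,
% also h_{12} = 0. Together these give p_z = 0, then p_x = p_y = 0,
% i.e. \lambda = 0.
```

So for this frame the only covector compatible with (11.42) and the Goh condition is the zero one: the lemma recovers the familiar fact that a contact distribution in dimension 3 has no nontrivial abnormal extremals.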

From now on we focus on a special class of abnormal extremals.

Definition. An abnormal extremal $\lambda(t)$ is called nice abnormal if, for every $t\in[0,1]$, it satisfies

$$\lambda(t)\in(D^2)^\perp\setminus(D^3)^\perp.$$

Remark. Assume that $\lambda(t)$ is a nice abnormal extremal. The system (11.41), obtained by differentiating twice the equations (11.42), reads

$$u_1\,h_{112}(\lambda)=u_2\,h_{221}(\lambda). \qquad (11.43)$$

Under our assumption, at least one coefficient in (11.43) is nonzero and we can uniquely recover the control $u=(u_1,u_2)$, up to a scalar, as follows:

$$u_1(t)=h_{221}(\lambda(t)),\qquad u_2(t)=h_{112}(\lambda(t)). \qquad (11.44)$$

If we plug this control into the original equation we find that $\lambda(t)$ is a solution of

$$\dot\lambda=h_{221}(\lambda)\,\vec h_1(\lambda)+h_{112}(\lambda)\,\vec h_2(\lambda). \qquad (11.45)$$

Let us now introduce the quadratic Hamiltonian

$$H=h_{221}\,h_1+h_{112}\,h_2. \qquad (11.46)$$

Theorem. Any abnormal extremal belongs to $(D^2)^\perp$. Moreover, we have that $\lambda(t)\in(D^2)^\perp\setminus(D^3)^\perp$ for all $t\in[0,1]$ if and only if $\lambda(t)$ satisfies

$$\dot\lambda(t)=\vec H(\lambda(t)) \qquad (11.47)$$

with initial condition $\lambda_0\in(D^2_{q_0})^\perp\setminus(D^3_{q_0})^\perp$.

Remark. Notice that, as soon as $n>3$, the set $(D^2_q)^\perp\setminus(D^3_q)^\perp$ is nonempty for an open dense set of $q\in M$. Indeed, assume that we have $D^2_q=D^3_q$ for any $q$ in an open neighborhood $O_{q_0}$ of a point $q_0$ in $M$. Then it follows that

$$D^2_q=D^3_q=D^4_q=\ldots$$

and the structure cannot be bracket generating, since $\dim D^i_q<\dim M$ for every $i>1$. The case $n=3$ will be treated separately.

Proof. Using that any abnormal extremal belongs to the subset $\{h_1(\lambda(t))=h_2(\lambda(t))=0\}$, it is easy to show that an abnormal extremal $\lambda(t)$ satisfies (11.45) if and only if it is an integral curve of the Hamiltonian vector field $\vec H$. It remains to prove that a solution of the system

$$\dot\lambda(t)=\vec H(\lambda(t)),\qquad \lambda_0\in(D^2)^\perp\setminus(D^3)^\perp, \qquad (11.48)$$

satisfies $\lambda(t)\in(D^2)^\perp\setminus(D^3)^\perp$ for every $t$. First notice that the solution cannot intersect the set $(D^3)^\perp$, since these are equilibrium points of the system (11.48) (at these points the Hamiltonian has a root of order two).

We are reduced to prove that $(D^2)^\perp$ is an invariant subset for $\vec H$. Hence we prove that the functions $h_1,h_2,h_{12}$ are constantly zero when computed on the extremal. To do this we find the differential equation satisfied by these Hamiltonians. Recall that, for any smooth function $a:T^*M\to\mathbb R$ and any solution of the Hamiltonian system $\lambda(t)=e^{t\vec H}\lambda_0$, we have $\dot a=\{H,a\}$. Hence we get

$$\dot h_{12}=\{h_{221}h_1+h_{112}h_2,\,h_{12}\}=\{h_{221},h_{12}\}h_1+\{h_{112},h_{12}\}h_2+\underbrace{h_{112}h_{221}+h_{212}h_{112}}_{=0}=c_1h_1+c_2h_2$$

for some smooth coefficients $c_1$ and $c_2$. We see that there exist smooth functions $a_1,a_2,a_{12}$ and $b_1,b_2,b_{12}$ such that

$$\begin{aligned}\dot h_1&=a_1h_1+a_2h_2+a_{12}h_{12}\\ \dot h_2&=b_1h_1+b_2h_2+b_{12}h_{12}\\ \dot h_{12}&=c_1h_1+c_2h_2\end{aligned} \qquad (11.49)$$

If we plug the solution $\lambda(t)$ of (11.48) into the system (11.49), i.e. if we consider it as a system of differential equations for the scalar functions $h_i(t):=h_i(\lambda(t))$, with variable coefficients $a_i(\lambda(t)),b_i(\lambda(t)),c_i(\lambda(t))$, we find that $h_1(t),h_2(t),h_{12}(t)$ satisfy a nonautonomous homogeneous linear system of differential equations with zero initial condition, since $\lambda_0\in(D^2)^\perp$, i.e.

$$h_1(\lambda_0)=h_2(\lambda_0)=h_{12}(\lambda_0)=0. \qquad (11.50)$$

Hence

$$h_1(\lambda(t))=h_2(\lambda(t))=h_{12}(\lambda(t))=0,\qquad \forall\, t.$$

We can also prove easily that nice abnormals satisfy the generalized Legendre condition. Recall that if $\lambda(t)$ is an abnormal extremal, then $-\lambda(t)$ is also an abnormal extremal.

Lemma. Let $\lambda(t)$ be a nice abnormal. Then $\lambda(t)$ or $-\lambda(t)$ satisfies the generalized Legendre condition.

Proof. Let $u(t)$ be the control associated with the extremal $\lambda(t)$. It is sufficient to prove that the quadratic form

$$Q_t:v\mapsto\langle\lambda(t),[[f_{u(t)},f_v],f_v]\rangle,\qquad v\in\mathbb R^2, \qquad (11.51)$$

is nonnegative definite. We already proved (cf. ??) that the bilinear form

$$B_t:(v,w)\mapsto\langle\lambda(t),[[f_{u(t)},f_v],f_w]\rangle,\qquad v,w\in\mathbb R^2, \qquad (11.52)$$

is symmetric. From (11.52) it is easy to see that $u(t)\in\operatorname{Ker}B_t$ for every $t$. Hence $Q_t$ is degenerate for every $t$. On the other hand, if the quadratic form were identically zero we would have $\lambda(t)\in(D^3)^\perp$, which is a contradiction. Hence the quadratic form has rank 1 and is semi-definite, and we can choose $\pm\lambda_0$ in such a way that (11.51) is positive at $t=0$. Since the sign of the quadratic form does not change along the curve (it is continuous and it cannot vanish), it is positive for all $t$.
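The objects introduced above can be computed by hand in a concrete case. The sketch below uses the standard Martinet frame on $\mathbb R^3$, which is an illustrative assumption and not an example taken from this section; $(x,y,z;p_x,p_y,p_z)$ are canonical coordinates on $T^*\mathbb R^3$, and the control is normalized to $u=(0,1)$, a rescaling of (11.44).

```latex
% Illustrative frame (assumption): Martinet distribution on R^3.
\begin{align*}
f_1 &= \partial_x, \qquad f_2 = \partial_y+\tfrac{x^2}{2}\,\partial_z, \qquad
[f_1,f_2] = x\,\partial_z,\\
[f_1,[f_1,f_2]] &= \partial_z, \qquad [f_2,[f_2,f_1]] = 0,\\
h_1 &= p_x,\quad h_2 = p_y+\tfrac{x^2}{2}p_z,\quad h_{12}=x\,p_z,\quad
h_{112}=p_z,\quad h_{221}=0 .
\end{align*}
% (D^2)^\perp \setminus (D^3)^\perp = { x = 0, p_x = p_y = 0, p_z \neq 0 }.
% Formula (11.44) gives u_1 = h_{221} = 0, u_2 = h_{112} = p_z, and (11.46)
% gives H = p_z h_2. Its Hamiltonian flow starting from such a covector is
\begin{equation*}
\lambda(t) = \big(0,\; y_0+p_z t,\; z_0;\; 0,\; 0,\; p_z\big),
\end{equation*}
% a straight line inside the Martinet surface {x = 0}.
% Legendre form (11.51) with the normalized control u = (0,1), f_u = f_2:
\begin{align*}
[[f_2,f_v],f_v] &= v_1^2\,\partial_z \quad\text{for } f_v=v_1f_1+v_2f_2, \\
Q_t(v) &= \langle\lambda(t),[[f_2,f_v],f_v]\rangle = p_z\,v_1^2 .
\end{align*}
```

The form $Q_t$ has rank 1, its kernel contains $u=(0,1)$, and it is nonnegative exactly when $p_z>0$, which matches the statement that $\lambda$ or $-\lambda$ satisfies the generalized Legendre condition.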

11.5 Optimality of nice abnormals in rank 2 structures

Up to now we proved that every nice abnormal extremal in a rank 2 sub-Riemannian structure automatically satisfies the necessary condition for optimality. Now we prove that they actually are strict local minimizers.

Theorem. Let $\lambda(t)$ be a nice abnormal extremal and let $\gamma(t)$ be the corresponding abnormal trajectory. Then there exists $s>0$ such that $\gamma|_{[0,s]}$ is a strict local length minimizer in the $L^2$-topology for the controls (equivalently, the $H^1$-topology for trajectories).

Remark. Notice that this property of $\gamma$ does not depend on the metric but only on the distribution. In particular the value of $s$ will be independent of the metric structure defined on the distribution. It follows that, as soon as the metric is fixed, small pieces of nice abnormals are also global minimizers.

Before proving Theorem 11.3 we prove the following technical result.

Lemma. Let $\Phi:E\to\mathbb R^n$ be a smooth map defined on a Hilbert space $E$ such that $\Phi(0)=0$, where $0$ is a critical point for $\Phi$:

$$\lambda D_0\Phi=0,\qquad \lambda\in\mathbb R^{n*},\ \lambda\neq 0.$$

Assume that $\lambda\,\mathrm{Hess}_0\Phi$ is a positive definite quadratic form. Then for every $v$ such that $\langle\lambda,v\rangle<0$, there exists a neighborhood of zero $O\subset E$ such that

$$\Phi(x)\notin\mathbb R^+v,\qquad \forall\, x\in O,\ x\neq 0,\qquad \mathbb R^+=\{\alpha\in\mathbb R,\ \alpha>0\}.$$

In particular the map $\Phi$ is not locally open and $x=0$ is an isolated point on its level set.

Proof. In the first part of the proof we build a particular set of coordinates that simplifies the argument, exploiting the fact that the Hessian is well defined independently of the coordinates.

Split the domain and the range of the map as follows

$$E=E_1\oplus E_2,\qquad E_2=\operatorname{Ker}D_0\Phi, \qquad (11.53)$$
$$\mathbb R^n=\mathbb R^{k_1}\oplus\mathbb R^{k_2},\qquad \mathbb R^{k_1}=\operatorname{Im}D_0\Phi, \qquad (11.54)$$

where we select the complement $\mathbb R^{k_2}$ in such a way that $v\in\mathbb R^{k_2}$ (notice that by our assumption $v\notin\mathbb R^{k_1}$). Accordingly to the notation introduced, let us write

$$\Phi(x_1,x_2)=(\Phi_1(x_1,x_2),\Phi_2(x_1,x_2)),\qquad x_i\in E_i,\ i=1,2.$$

Since $\Phi_1$ is a submersion by construction, the Implicit function theorem implies that by a smooth change of coordinates we can linearize $\Phi_1$ and assume that $\Phi$ has the form

$$\Phi(x_1,x_2)=(D_0\Phi(x_1),\Phi_2(x_1,x_2)),$$

since $x_2\in E_2=\operatorname{Ker}D_0\Phi$. Notice that, by construction of the coordinate set, the function $x_2\mapsto\Phi_2(0,x_2)$ coincides with the restriction of $\Phi$ to the kernel of its differential, modulo its image.

Hence for every scalar function $a:\mathbb R^{k_2}\to\mathbb R$ such that $d_0a=\lambda$ we have the equality

$$\lambda\,\mathrm{Hess}_0\Phi=\mathrm{Hess}_0\big(a\circ\Phi_2(0,\cdot)\big)>0.$$

In particular the function $a\circ\Phi_2(0,y)$ is nonnegative in a neighborhood of $0$. Assume now that $\Phi(x_1,x_2)=sv$ for some $s>0$. Since $v\in\mathbb R^{k_2}$ it follows that

$$D_0\Phi(x_1)=0\ \Longrightarrow\ x_1=0,\qquad\text{and}\qquad \Phi_2(0,x_2)=sv.$$

In particular we have

$$\frac{d}{ds}\Big|_{s=0}a(\Phi_2(0,x_2))=\frac{d}{ds}\Big|_{s=0}a(sv)=\langle\lambda,v\rangle<0\ \Longrightarrow\ a(sv)<0\ \text{ for small } s>0,$$

which is a contradiction.

Let $\lambda(t)$ be an abnormal extremal and let $\gamma(t)$ be the corresponding abnormal trajectory,

$$\dot\gamma=u_1f_1(\gamma)+u_2f_2(\gamma). \qquad (11.55)$$

In what follows we always assume that $\bar\gamma:=\{\gamma(t):t\in[0,1]\}$ is a smooth one-dimensional submanifold of $M$, with or without border. Then either the curve $\gamma$ has no self-intersection or $\bar\gamma$ is diffeomorphic to $S^1$. In both cases we can choose a basis $f_1,f_2$ in a neighborhood of $\bar\gamma$ in such a way that $\gamma$ is the integral curve of $f_1$,

$$\dot\gamma=f_1(\gamma).$$

Then $\gamma$ is the solution of (11.55) with associated control $\tilde u=(1,0)$. Notice that a change of the frame on $M$ corresponds to a smooth change of coordinates on the end-point map.

With analogous reasoning as in the previous section, we describe the end-point map

$$F:(u_1,u_2)\mapsto\gamma(1)$$

as the composition

$$F=e^{f_1}\circ G,$$

where $G$ is the end-point map for the system

$$\dot q=(u_1-1)\,e^{-tf_1}_*f_1+u_2\,e^{-tf_1}_*f_2. \qquad (11.56)$$

Since $e^{-tf_1}_*f_1=f_1$, denoting $g_t:=e^{-tf_1}_*f_2$ and defining the primitives

$$w(t)=\int_0^t(1-u_1(\tau))\,d\tau,\qquad v(t)=\int_0^tu_2(\tau)\,d\tau, \qquad (11.57)$$

we can rewrite the system, whose end-point map is $G$, as follows

$$\dot q=-\dot w\,f_1(q)+\dot v\,g_t(q).$$

The Hessian of $G$ is computed as

$$\lambda\,\mathrm{Hess}_0G(u_1,\dot v)=\int_0^1\Big\langle\lambda,\Big[\int_0^t\big(-\dot w(\tau)f_1+\dot v(\tau)g_\tau\big)\,d\tau,\ -\dot w(t)f_1+\dot v(t)g_t\Big](q_0)\Big\rangle\,dt. \qquad (11.58)$$

Recall that

$$D_0G(u_1,\dot v)=\int_0^1\big(-\dot w(t)f_1(q_0)+\dot v(t)g_t(q_0)\big)\,dt=-w(1)f_1(q_0)+\int_0^1\dot v(t)g_t(q_0)\,dt,$$

and the condition $\lambda\perp\operatorname{Im}D_0G$ is rewritten as

$$\langle\lambda,f_1(q_0)\rangle=\langle\lambda,g_t(q_0)\rangle=0,\qquad \forall\, t\in[0,1]. \qquad (11.59)$$

Notice that, since equality (11.59) is valid for all $t$, we have that

$$\langle\lambda,\dot g_t(q_0)\rangle=\langle\lambda,[f_1,g_t](q_0)\rangle=0. \qquad (11.60)$$

Then we can rewrite our quadratic form only as a function of $\dot v$, since all terms containing $\dot w$ disappear:

$$\lambda\,\mathrm{Hess}_0G(\dot v)=\int_0^1\Big\langle\lambda,\Big[\int_0^t\dot v(\tau)g_\tau\,d\tau,\ \dot v(t)g_t\Big](q_0)\Big\rangle\,dt \qquad (11.61)$$

with the extra condition

$$\int_0^1\dot v(t)g_t(q_0)\,dt=w(1)f_1(q_0). \qquad (11.62)$$

Now we rearrange these formulas, using integration by parts, rewriting the Hessian as a quadratic form on the space of primitives

$$v(t)=\int_0^t\dot v(\tau)\,d\tau.$$

Using the equality

$$\int_0^t\dot v(\tau)g_\tau\,d\tau=v(t)g_t-\int_0^tv(\tau)\dot g_\tau\,d\tau \qquad (11.63)$$

we have

$$\lambda\,\mathrm{Hess}_0G(\dot v)=\int_0^1\big\langle\lambda,[v(t)g_t,\dot v(t)g_t](q_0)\big\rangle\,dt-\int_0^1\Big\langle\lambda,\Big[\int_0^tv(\tau)\dot g_\tau\,d\tau,\ \dot v(t)g_t\Big](q_0)\Big\rangle\,dt.$$

The first addend is zero since $[g_t,g_t]=0$. Exchanging the order of integration in the second term,

$$\int_0^1\Big\langle\lambda,\Big[\int_0^tv(\tau)\dot g_\tau\,d\tau,\ \dot v(t)g_t\Big](q_0)\Big\rangle\,dt=\int_0^1\Big\langle\lambda,\Big[v(t)\dot g_t,\ \int_t^1\dot v(\tau)g_\tau\,d\tau\Big](q_0)\Big\rangle\,dt,$$

and then integrating by parts,

$$\int_t^1\dot v(\tau)g_\tau\,d\tau=v(1)g_1-v(t)g_t-\int_t^1v(\tau)\dot g_\tau\,d\tau,$$

we get to

$$\lambda\,\mathrm{Hess}_0G(\dot v)=\int_0^1\langle\lambda,[\dot g_t,g_t](q_0)\rangle\,v(t)^2\,dt+\int_0^1\Big\langle\lambda,\Big[v(1)g_1-\int_t^1v(\tau)\dot g_\tau\,d\tau,\ v(t)\dot g_t\Big](q_0)\Big\rangle\,dt, \qquad (11.64)$$

which can also be rewritten as follows

$$\lambda\,\mathrm{Hess}_0G(\dot v)=\int_0^1\langle\lambda,[\dot g_t,g_t](q_0)\rangle\,v(t)^2\,dt+\int_0^1\Big\langle\lambda,\Big[\int_0^tv(\tau)\dot g_\tau\,d\tau+v(1)g_1,\ v(t)\dot g_t\Big](q_0)\Big\rangle\,dt. \qquad (11.65)$$

Moreover, again integrating by parts the extra condition (11.62), we find

$$\int_0^1v(t)\dot g_t(q_0)\,dt=-w(1)f_1(q_0)+v(1)g_1(q_0). \qquad (11.66)$$

Remark. Notice that we cannot plug the expression (11.66) directly into the formula, since this equality is valid only at the point $q_0$, while in (11.64) we have to compute the bracket.

Notice that the vectors $f_1(q_1)$ and $f_2(q_1)$ are linearly independent; then also

$$f_1(q_0)=e^{-f_1}_*(f_1(q_1))\qquad\text{and}\qquad g_1(q_0)=e^{-f_1}_*(f_2(q_1))$$

are linearly independent. From (11.66) it follows that for every pair $(w,v)$ in the kernel the following estimates are valid:

$$|w(1)|\le C\|v\|_{L^2},\qquad |v(1)|\le C\|v\|_{L^2}. \qquad (11.67)$$

Theorem. Let $\gamma:[0,1]\to M$ be an abnormal trajectory and assume that the quadratic form (11.64) satisfies, for some constant $\alpha>0$,

$$\lambda\,\mathrm{Hess}_0G(\dot v)\ge\alpha\|v\|^2_{L^2}. \qquad (11.68)$$

Then the curve is a local minimizer in the $L^2$ topology of controls.

Remark. Notice that the estimate (11.68) depends only on $v$, while the map $G$ is a smooth map of $\dot v$ and $\dot w$. Hence Lemma does not apply. Moreover, the statement of the Lemma fails for the end-point map, since it is locally open as soon as the bracket generating condition is satisfied (this is equivalent to the Chow-Rashevskii Theorem). Moreover the final point of the trajectory is never isolated in the level set. What we are going to use is part of the proof of this Lemma, to show that the statement holds for the restriction of the end-point map to some subset of controls.

Proof of Theorem

Our goal is to prove that there are no curves shorter than $\gamma$ that join $q_0$ to $q_1=\gamma(1)$. To this extent we consider the restriction of the end-point map to the set of curves that are shorter than, or have the same length as, the original curve. Hence we need to fix some sub-Riemannian structure on $M$.
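The estimates (11.67) are stated without the intermediate step. A minimal sketch of how they can be obtained from (11.66) is given below; the constant $c$ and the choice of a norm $|\cdot|$ on $T_{q_0}M$ are auxiliary notation introduced only for this computation.

```latex
% f_1(q_0) and g_1(q_0) are linearly independent, so for some c > 0
%   |a| + |b| <= c | a f_1(q_0) + b g_1(q_0) |   for all a, b in R.
% Apply this with a = -w(1), b = v(1) and use (11.66):
\begin{align*}
|w(1)|+|v(1)|
  &\le c\,\Big|\int_0^1 v(t)\,\dot g_t(q_0)\,dt\Big| \\
  &\le c\,\Big(\int_0^1 |\dot g_t(q_0)|^2\,dt\Big)^{1/2}\,\|v\|_{L^2}
  \qquad\text{(Cauchy--Schwarz)}.
\end{align*}
% Hence (11.67) holds with C = c\,\|\dot g_\cdot(q_0)\|_{L^2[0,1]}.
```

The point of the estimate is that the boundary values $w(1)$ and $v(1)$, which are not controlled by the $L^2$ norm of $v$ in general, become controlled once the pair $(w,v)$ is restricted to the kernel of the differential.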

We can then assume the orthonormal frame $f_1,f_2$ to be fixed and that the length of our curve is exactly 1 (we can always dilate all the distances on our manifold and the local optimality of the curve is not affected). The set of curves of length less than or equal to 1 can be parametrized, using Lemma 3.15, by the set

$$\{(u_1,u_2)\mid u_1^2+u_2^2\le 1\}.$$

Following the notation (11.57), notice that

$$\{(u_1,u_2)\mid u_1^2+u_2^2\le 1\}\subset\{(w,v)\mid \dot w\ge 0\}.$$

We want to show that, for some function $a\in C^\infty(M)$ such that $d_{q_0}a=\lambda\perp\operatorname{Im}D_0F$, we have in the domain $D$

$$a\circ F\big|_D(\dot w,\dot v)=\lambda\,\mathrm{Hess}_0F(\dot w,\dot v)+R(w,v),\qquad \frac{R(w,v)}{\|v\|^2}\xrightarrow[\|(\dot w,\dot v)\|\to 0]{}0, \qquad (11.69)$$

where

$$D=\{(\dot w,\dot v)\in\operatorname{Ker}D_0F,\ \dot w\ge 0\}.$$

Indeed, if we prove (11.69) we have that the point $(\dot w,\dot v)=(0,0)$ is locally optimal for $F$. This means that the curve $\gamma$, i.e. the curve associated to the controls $u_1=1,u_2=0$, is also locally optimal.

Using the identity

$$\overrightarrow{\exp}\int_0^t\dot v(\tau)f_2\,d\tau=e^{v(t)f_2}$$

and applying the variations formula (6.21) to the end-point map $F$ we get

$$F(\dot w,\dot v)=q_0\circ\overrightarrow{\exp}\int_0^1\big((1-\dot w(t))f_1+\dot v(t)f_2\big)\,dt=q_0\circ\overrightarrow{\exp}\int_0^1(1-\dot w(t))\,e^{-v(t)f_2}_*f_1\,dt\circ e^{v(1)f_2}.$$

Hence we can express the end-point map as a smooth function of the pair $(\dot w,v)$. Now, to compute (11.69), we can assume that the function $a$ is constant on the trajectories of $f_2$ (since we only fix its differential at one point), so that

$$e^{v(1)f_2}a=a,$$

which simplifies our estimates:

$$a\circ F(\dot w,\dot v)=q_0\circ\overrightarrow{\exp}\int_0^1(1-\dot w(t))\,e^{-v(t)f_2}_*f_1\,dt\ a.$$

Writing

$$(1-\dot w(t))\,e^{-v(t)f_2}_*f_1=f_1+X^0(v(t))+\dot w(t)X^1(v(t)) \qquad (11.70)$$

and using the variation formula (6.22), setting $Y^i_t=e^{(t-1)f_1}_*X^i$ for $i=0,1$, we get (recall that $q_1=q_0\circ e^{f_1}$)

$$a\circ F(\dot w,\dot v)=q_1\circ\overrightarrow{\exp}\int_0^1\big(Y^0_t(v(t))+\dot w(t)Y^1_t(v(t))\big)\,dt\ a,\qquad Y^0_t(0)=Y^1_t(0)=0.$$

Expanding the chronological exponential we find that

(a) the zero order term vanishes since $Y^0_t(0)=Y^1_t(0)=0$,

(b) all first order terms vanish since the vector fields $f_1$ and $[f_1,f_2]$ span the image of the differential (hence are orthogonal to $\lambda=d_{q_0}a$),

(c) the second order terms are in the Hessian, since our domain $D$ is contained in the kernel of the differential.

In other words it remains to show that every term in $v,w$ of order greater than or equal to 3 in the expansion can be estimated with $o(\|v\|^2)$ (here $o(\|v\|^2)$ has the same meaning as in (11.69)). Let us prove first the claim for monomials of order three:

$$\int_0^1\dot w(t)v^2(t)\,dt=o(\|v\|^2),\qquad \int_0^1\dot w(t)\int_0^t\dot w(\tau)v(\tau)\,d\tau\,dt=o(\|v\|^2),$$
$$\int_0^1\dot w(t)\int_0^t\dot w(\tau)\int_0^\tau\dot w(s)\,ds\,d\tau\,dt=o(\|v\|^2).$$

Using that $\dot w\ge 0$, which is the key assumption, and the fact that $(\dot w,\dot v)\in\operatorname{Ker}D_0F$, which gives the estimates (11.67), we compute

$$\Big|\int_0^1\dot w(t)v^2(t)\,dt\Big|\le\int_0^1|\dot w(t)|\,v^2(t)\,dt=\int_0^1\dot w(t)v^2(t)\,dt=w(1)v^2(1)-2\int_0^1w(t)v(t)\dot v(t)\,dt\le C\|v\|^3+\varepsilon\|v\|^2,$$

where the estimate for the second term follows from

$$\Big|\int_0^1w(t)v(t)\dot v(t)\,dt\Big|\le\max_t w(t)\int_0^1|v(t)\dot v(t)|\,dt\le w(1)\,\|v\|\,\|\dot v\|\le C\|\dot v\|\,\|v\|^2.$$

The second integral can be rewritten as

$$\int_0^1\dot w(t)\int_0^t\dot w(\tau)v(\tau)\,d\tau\,dt=w(1)\int_0^1\dot w(t)v(t)\,dt-\int_0^1w(t)v(t)\dot w(t)\,dt,$$

and then we estimate

$$\Big|\int_0^1\dot w(t)\int_0^t\dot w(\tau)v(\tau)\,d\tau\,dt\Big|\le 2\,w(1)\int_0^1|v(t)|\,\dot w(t)\,dt\le C\|\dot w\|\,\|v\|^2.$$

Finally, the last integral is very easy to estimate using the equality

$$\int_0^1\dot w(t)\int_0^t\dot w(\tau)\int_0^\tau\dot w(s)\,ds\,d\tau\,dt=\frac16\Big(\int_0^1\dot w(t)\,dt\Big)^3\le C\|\dot w\|\,\|v\|^2.$$

Starting from these estimates it is easy to show that any mixed monomial of order greater than three satisfies these estimates as well.

Applying these results to a small piece of abnormal trajectory we can prove that small pieces of nice abnormals are minimizers.

Proof of Theorem

If we apply the arguments above to a small piece $\gamma_s=\gamma|_{[0,s]}$ of the curve $\gamma$, it is easy to see that the Hessian rescales as follows:

$$\lambda\,\mathrm{Hess}_0G_s(v)=\int_0^s\langle\lambda,[\dot g_t,g_t](q_0)\rangle\,v(t)^2\,dt+\int_0^s\Big\langle\lambda,\Big[\int_0^tv(\tau)\dot g_\tau\,d\tau+v(s)g_s,\ v(t)\dot g_t\Big](q_0)\Big\rangle\,dt.$$

Since the generalized Legendre condition ensures (the form is semidefinite and we already know that $f_1$ is in the kernel; see also Lemma 11.29) that

$$\langle\lambda,[\dot g_t,g_t](q_0)\rangle\ge C>0,$$

the norm

$$\|v\|_g=\Big(\int_0^s\langle\lambda,[\dot g_t,g_t](q_0)\rangle\,v(t)^2\,dt\Big)^{1/2} \qquad (11.71)$$

is equivalent to the standard $L^2$-norm. Hence the Hessian can be rewritten as

$$\lambda\,\mathrm{Hess}_0G_s(v)=\|v\|_g^2+\langle Tv,v\rangle, \qquad (11.72)$$

where $T$ is a compact operator in $L^2$ of the form

$$(Tv)(t)=\int_0^sK(t,\tau)v(\tau)\,d\tau.$$

Since $\|T\|^2\le\|K\|^2_{L^2}\to 0$ for $s\to 0$, it follows that the Hessian is positive definite for small $s>0$.

Conjugate points along abnormals

In this section, we give an effective way to check the inequality (11.68), which implies local minimality of the nice abnormal geodesic according to Theorem.

We define $Q_1(v):=\lambda\,\mathrm{Hess}_0G(\dot v)$. The quadratic form $Q_1$ is continuous in the topology defined by the norm $\|v\|_{L^2}$. The closure of the domain of $Q_1$ in this topology is the space

$$D(Q_1)=\Big\{v\in L^2[0,1]:\ \int_0^1v(t)\dot g_t(q_0)\,dt\in\operatorname{span}\{f_1(q_0),g_1(q_0)\}\Big\}.$$

The extension of $Q_1$ to this closure is denoted by the same symbol $Q_1$. We set:

$$l(t)=\langle\lambda,[\dot g_t,g_t](q_0)\rangle,\qquad X_t=v_1g_1+\int_1^tv(\tau)\dot g_\tau\,d\tau,$$

and we rewrite the form $Q_1$ in these more compact notations:

$$Q_1(v)=\int_0^1l(t)v(t)^2\,dt+\int_0^1\langle\lambda,[X_t,\dot X_t](q_0)\rangle\,dt,$$
$$\dot X_t=v(t)\dot g_t,\qquad X_1\wedge g_1=0,\qquad X_0(q_0)\wedge f_1(q_0)=0. \qquad (1)$$

Moreover, we introduce the family of quadratic forms $Q_s$, for $0<s\le 1$, as follows:

$$Q_s(v):=\int_0^sl(t)v(t)^2\,dt+\int_0^s\langle\lambda,[X_t,\dot X_t](q_0)\rangle\,dt,$$
$$\dot X_t=v(t)\dot g_t,\qquad X_s\wedge g_s=0,\qquad X_0(q_0)\wedge f_1(q_0)=0. \qquad (1)$$

Recall that $l(t)$ is a strictly positive continuous function. In particular, $\int_0^1l(t)v(t)^2\,dt$ is the square of a norm of $v$ that is equivalent to the standard $L^2$-norm. The next statement is proved by the same arguments as Proposition. We leave the details to the reader.

Proposition. The form $Q_1$ is positive definite if and only if $\ker Q_s=0$, $\forall\, s\in(0,1]$.

Definition. A time moment $s\in(0,1]$ is called conjugate to $0$ for the abnormal geodesic $\gamma$ if $\ker Q_s\neq 0$.

We are going to characterize conjugate times in terms of an appropriate Jacobi equation. Let $\xi_1\in T_\lambda(T^*M)$ and $\zeta_t\in T_\lambda(T^*M)$ be the values at $\lambda$ of the Hamiltonian lifts of the vector fields $f_1$ and $g_t$. Recall that the Hamiltonian lift of a field $f\in\mathrm{Vec}\,M$ is the Hamiltonian vector field associated to the Hamiltonian function $\lambda\mapsto\langle\lambda,f(q)\rangle$, $\lambda\in T^*_qM$, $q\in M$. We have:

$$Q_s(v)=\int_0^sl(t)v(t)^2\,dt+\int_0^s\sigma(x(t),\dot x(t))\,dt,$$
$$\dot x(t)=v(t)\dot\zeta_t,\qquad x(s)\wedge\zeta_s=0,\qquad \pi_*x(0)\wedge\pi_*\xi_1=0, \qquad (2)$$

where $\sigma$ is the standard symplectic product on $T_\lambda(T^*M)$ and $\pi:T^*M\to M$ is the standard projection. Moreover

$$l(t)=\sigma(\dot\zeta_t,\zeta_t),\qquad 0\le t\le 1. \qquad (11.73)$$

Let $E=\operatorname{span}\{\xi_1,\zeta_t,\ 0\le t\le 1\}$. We use only the restriction of $\sigma$ to $E$ in the expression of $Q_s$ and we are going to get rid of unnecessary variables. Namely, we set:

$$\Sigma:=E/(\ker\sigma|_E).$$

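Lemma. $\dim\Sigma\le 2\big(\dim\operatorname{span}\{f_1(q_0),g_t(q_0),\ 0\le t\le 1\}-1\big)$.

Proof. The dimension of $\Sigma$ is equal to twice the codimension of a maximal isotropic subspace of $\sigma|_E$. We have: $\sigma(\xi_1,\zeta_t)=\langle\lambda,[f_1,g_t](q_0)\rangle=0$, $\forall\, t\in[0,1]$, hence $\xi_1\in\ker\sigma|_E$. Moreover, $\pi_*(E)=\operatorname{span}\{f_1(q_0),g_t(q_0),\ 0\le t\le 1\}$ and $E\cap\ker\pi_*$ is an isotropic subspace of $\sigma|_E$.

We denote by $\bar\zeta_t\in\Sigma$ the projection of $\zeta_t$ to $\Sigma$ and by $\Pi\subset\Sigma$ the projection of $E\cap\ker\pi_*$. Note that the projection of $\xi_1$ to $\Sigma$ is $0$; moreover, equality (11.73) implies that $\bar\zeta_t\neq 0$, $\forall\, t\in[0,1]$. In what follows, $A^\angle$ denotes the skew-orthogonal complement of $A\subset\Sigma$ with respect to $\sigma$. The final expression of $Q_s$ is as follows:

$$Q_s(v)=\int_0^sl(t)v(t)^2\,dt+\int_0^s\sigma(x(t),\dot x(t))\,dt,$$
$$\dot x(t)=v(t)\dot{\bar\zeta}_t,\qquad x(s)\wedge\bar\zeta_s=0,\qquad x(0)\in\Pi. \qquad (4)$$

We have: $v\in\ker Q_s$ if and only if

$$\int_0^s\big(l(t)v(t)+\sigma(x(t),\dot{\bar\zeta}_t)\big)\,w(t)\,dt=0$$

for any $w(\cdot)$ such that

$$\int_0^s\dot{\bar\zeta}_t\,w(t)\,dt\in\Pi+\mathbb R\bar\zeta_s. \qquad (5)$$

We obtain that $v\in\ker Q_s$ if and only if there exists $\nu\in\Pi^\angle\cap\bar\zeta_s^\angle$ such that

$$l(t)v(t)+\sigma(x(t),\dot{\bar\zeta}_t)=\sigma(\nu,\dot{\bar\zeta}_t),\qquad 0\le t\le s.$$

We set $y(t)=x(t)-\nu$ and obtain the following:

Theorem. A time moment $s\in(0,1]$ is conjugate to $0$ if and only if there exists a nontrivial solution of the equation

$$l(t)\,\dot y=\sigma(\dot{\bar\zeta}_t,y)\,\dot{\bar\zeta}_t \qquad (11.74)$$

that satisfies the following boundary conditions:

$$\exists\,\nu\in\Pi^\angle\cap\bar\zeta_s^\angle\ \text{ such that }\ (y(s)+\nu)\wedge\bar\zeta_s=0,\qquad (y(0)+\nu)\in\Pi. \qquad (11.75)$$

Remark. Notice that identity (11.73) implies that $y(t)=\bar\zeta_t$ for $t\in[0,1]$ is a solution to the equation (11.74). However this solution may violate the boundary conditions.

Let us consider the special case $\dim\operatorname{span}\{f_1(q_0),g_t(q_0),\ 0\le t\le 1\}=2$; this is what we automatically have for abnormal geodesics in a 3-dimensional sub-Riemannian manifold. In this case, $\dim\Sigma=2$, $\dim\Pi=1$; hence $\Pi=\Pi^\angle$, $\bar\zeta_s^\angle=\mathbb R\bar\zeta_s$ and $\Pi^\angle\cap\bar\zeta_s^\angle=0$. Then $\nu$ in the boundary conditions (11.75) must be $0$ and $y(s)=c\,\bar\zeta_s$, where $c$ is a nonzero constant. Hence $y(t)=c\,\bar\zeta_t$ for $0\le t\le 1$ and $y(0)=c\,\bar\zeta_0\notin\Pi$. We obtain:

Corollary. If $\dim\operatorname{span}\{f_1(q_0),g_t(q_0),\ 0\le t\le 1\}=2$, then the segment $[0,1]$ does not contain conjugate time moments and the assumption (11.68) of Theorem is satisfied.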

We can apply this corollary to the isoperimetric problem studied earlier. Abnormal geodesics correspond to connected components of the zero locus of the function $b$ (see the notation introduced there). All these abnormal geodesics are nice if and only if zero is a regular value of $b$. Take a compact connected component of $b^{-1}(0)$; this is a smooth closed curve. Our corollary, together with the theorem whose assumption is (11.68), implies that this closed curve, passed once, twice, three times or an arbitrary number of times, is a locally optimal solution of the isoperimetric problem. Moreover, this is true for any Riemannian metric on the surface!

Abnormals in dimension 3

Nice abnormals for the isoperimetric problem on surfaces

Recall the isoperimetric problem: given two points $x_0,x_1$ on a 2-dimensional Riemannian manifold $N$, a 1-form $\nu\in\Lambda^1N$ and $c\in\mathbb R$, we have to find (if it exists) the minimum

$$\min\Big\{l(\gamma),\ \gamma(0)=x_0,\ \gamma(T)=x_1,\ \int_\gamma\nu=c\Big\}. \qquad (11.76)$$

As shown in Section 4.4.2, this problem can be reformulated as a sub-Riemannian problem on the extended manifold

$$M=N\times\mathbb R=\{(x,y),\ x\in N,\ y\in\mathbb R\},$$

where the sub-Riemannian structure is defined by the form

$$D=\operatorname{Ker}(dy-\nu)$$

and the sub-Riemannian length of a curve coincides with the Riemannian length of its projection on $N$.

If we write $d\nu=b\,dV$, where $b$ is a smooth function and $dV$ denotes the Riemannian volume on $N$, we have that the Martinet surface is defined by the cylinder

$$\mathcal M=\mathbb R\times b^{-1}(0),$$

where, generically, the set $b^{-1}(0)$ is a regular level of $b$. Since the distribution is well behaved with respect to the projection on $N$ by construction, it follows that the distribution is always transversal to the Martinet surface and all abnormals are nice, since $D^3_q=T_qM$ for all $q$.

Thus the projections of abnormal geodesics on $N$ are the connected components of the set $b^{-1}(0)$, and we can recover the whole abnormal extremal by integrating the 1-form $\nu$ to find the missing component. In other words the abnormal extremals are spirals on the cylinder with step equal to $\int_Ad\nu$ (if $d\nu$ is the volume form on $N$, it coincides with the area of the region $A$ inside the curve defined on $N$ by the connected component of $b^{-1}(0)$).

Corollary. Let $M$ be a sub-Riemannian manifold, $\dim M=3$, and let $\gamma:[0,1]\to M$ be a nice abnormal geodesic. Then $\gamma$ is a strict local minimizer for the $L^2$ control topology, for any metric.

Remark. Notice that we do not require that the curve does not self-intersect, since in the 3D case this is automatically guaranteed by the fact that nice abnormals are integral curves of a smooth vector field on $M$.
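The link between the function $b$, the Martinet surface and niceness can be made explicit in coordinates. The sketch below assumes, for illustration only, local coordinates $(x_1,x_2)$ on $N$ in which the metric is Euclidean, and writes $\nu=\nu_1\,dx_1+\nu_2\,dx_2$; the frame $f_1,f_2$ is a hypothetical choice adapted to these coordinates.

```latex
% Assumption: flat coordinates (x_1, x_2) on N, \nu = \nu_1 dx_1 + \nu_2 dx_2.
\begin{align*}
f_1 &= \partial_{x_1}+\nu_1\,\partial_y, \qquad
f_2 = \partial_{x_2}+\nu_2\,\partial_y, \qquad
(dy-\nu)(f_i)=0,\\
[f_1,f_2] &= \big(\partial_{x_1}\nu_2-\partial_{x_2}\nu_1\big)\,\partial_y
          = b\,\partial_y, \qquad d\nu = b\;dx_1\wedge dx_2,\\
[f_1,[f_1,f_2]] &= (\partial_{x_1}b)\,\partial_y, \qquad
[f_2,[f_1,f_2]] = (\partial_{x_2}b)\,\partial_y .
\end{align*}
% f_1, f_2, [f_1,f_2] are linearly dependent exactly where b = 0
% (the Martinet surface), and on that set D^3 = TM if and only if db \neq 0.
```

This is consistent with the statement that all abnormals are nice when zero is a regular value of $b$, and it shows where niceness fails: at points of $b^{-1}(0)$ where $db$ vanishes.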

A non nice abnormal extremal

In this section we give an example of a non nice (and indeed not smooth) abnormal extremal. Consider the isoperimetric problem on $\mathbb R^2=\{(x_1,x_2),\ x_i\in\mathbb R\}$ defined by the 1-form $\nu$ such that

$$d\nu=x_1x_2\,dx_1\wedge dx_2.$$

Here $b(x_1,x_2)=x_1x_2$ and the set $b^{-1}(0)$ consists of the union of the two axes, with moreover $db|_0=0$. Let us fix $\bar x_1,\bar x_2>0$ and consider the curve joining $(0,\bar x_2)$ and $(\bar x_1,0)$ that is the union of two segments contained in the coordinate axes,

$$\bar\gamma:[-\bar x_2,\bar x_1]\to\mathbb R^2,\qquad \bar\gamma(t)=\begin{cases}(0,-t),& t\in[-\bar x_2,0],\\ (t,0),& t\in[0,\bar x_1].\end{cases}$$

Proposition. The curve $\bar\gamma$ is a projection of an abnormal extremal that is not a length minimizer.

Proof of Proposition. Let us build a family of variations $\gamma_{\varepsilon,\delta}$ of the curve $\bar\gamma$ defined as in Figure 11.1. Namely, in $\gamma_{\varepsilon,\delta}$ we cut a corner of size $\varepsilon$ at the origin and we turn around a small circle of radius $\delta$ before reaching the endpoint. Denoting by $D_\varepsilon$ and $D_\delta$ the two regions enclosed by the curve, it is easy to see that the isoperimetric condition rewrites as follows:

$$0=\int_{\gamma_{\varepsilon,\delta}}\nu=\int_{D_\varepsilon}d\nu-\int_{D_\delta}d\nu.$$

It is then easy, using that $d\nu=x_1x_2\,dx_1\wedge dx_2$, to show that there exist $c_1,c_2>0$ such that

$$\int_{D_\varepsilon}d\nu=c_1\varepsilon^4,\qquad \int_{D_\delta}d\nu=c_2\delta^3,$$

while

$$l(\gamma_{\varepsilon,\delta})-l(\bar\gamma)=2\pi\delta-(2-\sqrt2)\,\varepsilon. \qquad (11.77)$$

Choosing $\varepsilon$ in such a way that $c_1\varepsilon^4=c_2\delta^3$, it is an easy exercise to show that the quantity (11.77) is negative when $\delta>0$ is very small.

Remark. If you consider some plane curve $\gamma$ that is a projection of a normal extremal having the same endpoints as $\bar\gamma$ and contained in the set $\{(x_1,x_2)\in\mathbb R^2,\ x_1>0,\ x_2>0\}$, then $\gamma$ must have self-intersections. Indeed, it is easy to see that if this is not the case then the isoperimetric condition $\int_\gamma\nu=0$ cannot be satisfied.

It is still an open problem to find which is the length minimizer joining these two points. We know that it should be a projection of a normal extremal (hence smooth) but, for instance, we do not know how many self-intersections it has.

Higher dimension

Now consider another important special case that is typical if the dimension of the ambient manifold is greater than 3. Namely, assume that, for some $k\ge 2$, the vector fields

$$f_1,\ f_2,\ (\mathrm{ad}f_1)f_2,\ \ldots,\ (\mathrm{ad}f_1)^{k-1}f_2 \qquad (11.78)$$

are linearly independent at any point of a neighborhood of our nice abnormal geodesic $\gamma$, while $(\mathrm{ad}f_1)^kf_2$ is a linear combination of the vector fields (11.78) at any point of this neighborhood; in other words,

$$(\mathrm{ad}f_1)^kf_2=\sum_{i=0}^{k-1}a_i\,(\mathrm{ad}f_1)^if_2+\alpha f_1,$$

where $a_i,\alpha$ are smooth functions.

Figure 11.1: An abnormal extremal that is not a length minimizer (axes $x_1$, $x_2$; regions $D_\varepsilon$ and $D_\delta$).

In this case, all solutions of the equation $\dot q=f_1(q)$ that are close to $\gamma$ are abnormal geodesics. A direct calculation based on the fact that $\langle\lambda_t,(\mathrm{ad}f_1)^if_2(\gamma(t))\rangle=0$, $0\le t\le 1$, gives the identity:

$$\zeta^{(k)}_t=\sum_{i=0}^{k-1}a_i(\gamma(t))\,\zeta^{(i)}_t+\alpha(\gamma(t))\,\xi_1,\qquad 0\le t\le 1. \qquad (11.79)$$

Identity (11.79) implies that $\dim E=k+1$ and $\Pi=0$. The boundary conditions (11.75) take the form:

$$y(0)\in\bar\zeta_s^\angle,\qquad (y(s)-y(0))\wedge\bar\zeta_s=0. \qquad (11.80)$$

The characterization of conjugate points is especially simple and geometrically clear if the ambient manifold has dimension 4. Let $D$ be a rank 2 equiregular distribution in a 4-dimensional manifold (the Engel distribution). Then abnormal geodesics form a 1-foliation of the manifold and condition (11.78) is satisfied with $k=2$. Moreover, $\dim E=3$, $\dim\Sigma=2$ and $\bar\zeta_s^\angle=\mathbb R\bar\zeta_s$. Recall that $y(t)=\bar\zeta_t$, $0\le t\le s$, is a solution to (11.74). Hence the boundary conditions (11.80) are equivalent to the condition

$$\bar\zeta_s\wedge\bar\zeta_0=0. \qquad (11.81)$$

It is easy to rewrite relation (11.81) in an intrinsic way, without the special notations we used to simplify calculations. We have the following characterization of conjugate times.

Lemma. A time moment $t$ is conjugate to $0$ for the abnormal geodesic $\gamma$ if and only if

$$e^{tf_1}_*D_{\gamma(0)}=D_{\gamma(t)}.$$

The flow $e^{tf_1}$ preserves $D^2$ and $f_1$ but it does not preserve $D$. The plane $e^{tf_1}_*D$ rotates around the line $\mathbb Rf_1$ inside $D^2$ with a nonvanishing angular velocity. A conjugate moment is a moment when the plane makes a complete revolution. Collecting all the information we obtain:

Theorem. Let $D$ be the Engel distribution, $f_1$ be a horizontal vector field such that $[f_1,D^2]=D^2$ and $\dot\gamma=f_1(\gamma)$. Then $\gamma$ is an abnormal geodesic. Moreover,

(i) if $e^{tf_1}_*D_{\gamma(0)}\neq D_{\gamma(t)}$, $\forall\, t\in(0,1]$, then $\gamma$ is a local length minimizer for any sub-Riemannian structure on $D$;

(ii) if $e^{tf_1}_*D_{\gamma(0)}=D_{\gamma(t)}$ for some $t\in(0,1)$ and $\gamma$ is not a normal geodesic, then $\gamma$ is not a local length minimizer.

Equivalence of local minimality

Now we prove that, under the assumption that our trajectory is smooth, it is equivalent to be locally optimal in the $H^1$-topology or in the uniform topology for the trajectories.

Recall that a curve $\bar\gamma$ is called a $C^0$-local length-minimizer if $l(\bar\gamma)\le l(\gamma)$ for every curve $\gamma$ that is $C^0$-close to $\bar\gamma$ and satisfies the same boundary conditions, while it is called an $H^1$-local length-minimizer if $l(\bar\gamma)\le l(\gamma)$ for every curve $\gamma$ such that the control $u$ corresponding to $\gamma$ is close in the $L^2$ topology to the control $\bar u$ associated with $\bar\gamma$ and $\gamma$ satisfies the same boundary conditions.

Any $C^0$-local minimizer is automatically an $H^1$-local minimizer. Indeed it is possible to show that for every $v,w$ in a neighborhood of a fixed control $u$ there exists a constant $C>0$ such that

$$|\gamma_v(t)-\gamma_w(t)|\le C\|v-w\|_{L^2},\qquad t\in[0,T],$$

where $\gamma_v$ and $\gamma_w$ are the trajectories associated to the controls $v,w$ respectively.

Theorem. Let $M$ be a sub-Riemannian structure that is the restriction to $D$ of a Riemannian structure $(M,g)$. Assume $\bar\gamma$ is of class $C^1$ and has no self-intersections. If $\bar\gamma$ is a (strict) local minimizer in the $L^2$ topology for the controls, then $\bar\gamma$ is also a (strict) local minimizer in the $C^0$ topology for the trajectories.

Proof. Since $\bar\gamma$ has no self-intersections, we can look for a preferred system of coordinates on an open neighborhood $\Omega$ in $M$ of the set $V=\{\bar\gamma(t):t\in[0,1]\}$. For every $\varepsilon>0$, define the cylinder in $\mathbb R^n=\{(x,y):x\in\mathbb R,\ y\in\mathbb R^{n-1}\}$ as follows:

$$I_\varepsilon\times B^{n-1}_\varepsilon=\{(x,y)\in\mathbb R^n:\ x\in\,]-\varepsilon,1+\varepsilon[,\ y\in\mathbb R^{n-1},\ |y|<\varepsilon\}. \qquad (11.82)$$

We need the following technical lemma.

Lemma. There exist $\varepsilon>0$ and a coordinate map $\Phi:I_\varepsilon\times B^{n-1}_\varepsilon\to\Omega$ such that for all $t\in[0,1]$

(a) $\Phi(t,0)=\bar\gamma(t)$,

(b) the Riemannian metric $\Phi^*g$ is the identity matrix at $(t,0)$, i.e., along $\bar\gamma$.

Proof of the Lemma. As in the proof of Theorem ??, for every $\varepsilon>0$ we can find coordinates in the cylinder $I_\varepsilon\times B^{n-1}_\varepsilon$ such that, in these coordinates, our curve $\bar\gamma$ is rectified, $\bar\gamma(t)=(t,0)$, and has length one. Our normalization of the curve $\bar\gamma$ implies that the matrix representing the Riemannian metric $\Phi^*g$ in these coordinates satisfies

$$\Phi^*g=\begin{pmatrix}G_{11}&G_{12}\\ G_{21}&G_{22}\end{pmatrix},\qquad\text{with } G_{11}(x,0)=1,$$

where $G_{ij}$, for $i,j=1,2$, are the blocks of $\Phi^*g$ corresponding to the splitting $\mathbb R^n=\mathbb R\times\mathbb R^{n-1}$ defined in (11.82). For every point $(x,0)$ let us consider the orthogonal complement $T(x,0)$ of the tangent vector $e_1=\partial_x$ to $\bar\gamma$ with respect to $G$. It can be written as follows (in this proof, $\langle\cdot,\cdot\rangle$ is the Euclidean product in $\mathbb R^n$):

$$T(x,0)=\{(\langle v_x,y\rangle,y),\ y\in\mathbb R^{n-1}\}$$

for some family of vectors $v_x\in\mathbb R^{n-1}$, depending smoothly on $x$ (indeed it is easily checked that $v_x=-G_{21}(x,0)$, where $G_{21}$ is the off-diagonal block of $G$ in the splitting above). Let us consider now the smooth change of coordinates

$$\Psi:\mathbb R^n\to\mathbb R^n,\qquad \Psi(x,y)=(x-\langle v_x,y\rangle,\,y).$$

Fix $\varepsilon>0$ small enough such that the restriction of $\Psi$ to $I_\varepsilon\times B^{n-1}_\varepsilon$ is invertible. Notice that this is possible since

$$\det D\Psi(x,y)=1-\Big\langle\frac{\partial v_x}{\partial x},y\Big\rangle.$$

It is not difficult to check that, in the new variables (that we still denote by the same symbol), one has

$$G(x,0)=\begin{pmatrix}1&0\\ 0&M(x,0)\end{pmatrix},$$

where $M(x,0)$ is a positive definite matrix for all $x\in I_\varepsilon$. With a linear change of coordinates in the $y$ space,

$$(x,y)\mapsto(x,M(x,0)^{1/2}y),$$

we can finally normalize the matrix in such a way that $G(x,0)=\mathrm{Id}$ for all $x\in I_\varepsilon$.

We are now ready to prove the theorem. We check the equivalence between the two notions of local minimality in the coordinate set, denoted $(x,y)$, defined by the previous lemma. Notice that the notion of local minimality is independent of the coordinates.

Given an admissible curve $\gamma(t)=(x(t),y(t))$ contained in the cylinder $I_\varepsilon\times B^{n-1}_\varepsilon$ and satisfying $\gamma(0)=(0,0)$ and $\gamma(1)=(1,0)$, and denoting the reference trajectory by $\bar\gamma(t)=(t,0)$, we have that

$$\begin{aligned}\|\gamma-\bar\gamma\|^2_{H^1}&=\int_0^1\big((\dot x(t)-1)^2+|\dot y(t)|^2\big)\,dt\\&=\int_0^1\big(\dot x(t)^2+|\dot y(t)|^2\big)\,dt-2\int_0^1\dot x(t)\,dt+1\\&=\int_0^1\big(\dot x(t)^2+|\dot y(t)|^2\big)\,dt-1,\end{aligned}$$

where we used that $x(0)=0$ and $x(1)=1$ since $\gamma$ satisfies the boundary conditions. If we denote by

$$J(\gamma)=\int_0^1\langle G(\gamma(t))\dot\gamma(t),\dot\gamma(t)\rangle\,dt,\qquad J_e(\gamma)=\int_0^1\big(\dot x(t)^2+|\dot y(t)|^2\big)\,dt \qquad (11.83)$$

respectively the energy of $\gamma$ and the Euclidean energy, we have $\|\gamma-\bar\gamma\|^2_{H^1}=J_e(\gamma)-1$ and the $H^1$-local minimality can be rewritten as follows:

($*$) there exists $\varepsilon > 0$ such that for every admissible $\gamma$ with $J_e(\gamma) \le 1 + \varepsilon$ one has $J(\gamma) \ge 1$.

Next we build the following neighborhood of $\bar\gamma$: for every $\delta > 0$ define $A_\delta$ as the set of admissible curves $\gamma(t) = (x(t),y(t))$ in $I_\varepsilon \times B_\varepsilon^{n-1}$ such that the dilated curve $\gamma_\delta(t) = (x(t), \frac{1}{\delta} y(t))$ is still contained in the cylinder. This implies in particular that $\gamma$ is contained in $I_\varepsilon \times B^{n-1}_{\delta\varepsilon}$. Notice that $A_\delta \subset A_{\delta'}$ whenever $\delta < \delta'$. Moreover, every curve that is $\varepsilon\delta$-close to $\bar\gamma$ in the $C^0$-topology is contained in $A_\delta$.

It is then sufficient to prove that, for $\delta > 0$ small enough, for every $\gamma \in A_\delta$ one has $\ell(\gamma) \ge \ell(\bar\gamma)$. In fact, it is enough to check that $J(\gamma) \ge J(\bar\gamma)$. Let us consider two cases.
(i) $\gamma \in A_\delta$ and $J_e(\gamma) \le 1 + \varepsilon$. In this case ($*$) implies that $J(\gamma) \ge 1$.
(ii) $\gamma \in A_\delta$ and $J_e(\gamma) > 1 + \varepsilon$. In this case we have $G(x,0) = \mathrm{Id}$ and, by smoothness of $G$, we can write for $(x,y) \in I_\varepsilon \times B^{n-1}_{\delta\varepsilon}$ and $\delta \to 0$
\[ \langle G(x,y)v, v\rangle = (1 + O(\delta))\langle v, v\rangle, \]
where $O(\delta)$ is uniform with respect to $(x,y)$. Since $\gamma \in A_\delta$ implies that $\gamma$ is contained in $I_\varepsilon \times B^{n-1}_{\delta\varepsilon}$, we can deduce for $\delta \to 0$
\[ J(\gamma) = J_e(\gamma)(1 + O(\delta)) \ge (1 + \varepsilon)(1 + O(\delta)), \]
and one can choose $\delta > 0$ small enough such that the last quantity is strictly bigger than one. This proves that there exists $\delta > 0$ such that every admissible curve $\gamma \in A_\delta$ is longer than $\bar\gamma$.

Remark. Notice that this result implies in particular Theorem 4.57, since normal extremals are always smooth. Nevertheless, the argument of Theorem 4.57 can be adapted to more general coercive functionals (see [3]), while this proof uses specific estimates that hold only for our explicit cost (i.e., the distance).

We proved earlier that nice abnormals are smooth and cannot have self-intersections, being solutions of a smooth Hamiltonian system. Thus we can combine this fact with Theorem 11.3 and obtain the following result.

Corollary. Let $\gamma(t)$ be a nice abnormal trajectory. Then there exists $s > 0$ such that $\gamma|_{[0,s]}$ is a strict local length minimizer in the $C^0$-topology.

Chapter 12
Curves in the Lagrange Grassmannian

In this chapter we introduce the manifold of Lagrangian subspaces of a symplectic vector space. After a description of its geometric properties, we discuss how to define the curvature for regular curves in the Lagrange Grassmannian, that is, curves with non-degenerate derivative. Then we discuss the non-regular case, where a reduction procedure allows us to reduce to a regular curve in a reduced symplectic space.

12.1 The geometry of the Lagrange Grassmannian

In this section we recall some basic facts about Grassmannians of $k$-dimensional subspaces of an $n$-dimensional vector space and then we consider, for a vector space endowed with a symplectic structure, the submanifold of its Lagrangian subspaces.

Definition 12.1. Let $V$ be an $n$-dimensional vector space. The Grassmannian of $k$-planes on $V$ is the set
\[ G_k(V) := \{ W \mid W \subset V \text{ is a subspace},\ \dim(W) = k \}. \]
It is a standard fact that $G_k(V)$ is a compact manifold of dimension $k(n-k)$. Now we describe the tangent space to this manifold.

Proposition 12.2. Let $W \in G_k(V)$. We have a canonical isomorphism
\[ T_W G_k(V) \simeq \mathrm{Hom}(W, V/W). \]

Proof. Consider a smooth curve on $G_k(V)$ which starts from $W$, i.e. a smooth family of $k$-dimensional subspaces defined by a moving frame
\[ W(t) = \mathrm{span}\{e_1(t),\ldots,e_k(t)\}, \qquad W(0) = W. \]
We want to associate in a canonical way with the tangent vector $\dot W(0)$ a linear operator from $W$ to the quotient $V/W$. Fix $w \in W$ and consider any smooth extension $w(t) \in W(t)$, with $w(0) = w$. Then define the map
\[ W \to V/W, \qquad w \mapsto \dot w(0) \pmod W. \tag{12.1} \]

We are left to prove that the map (12.1) is well defined, i.e. independent of the choice of representatives. Indeed if we consider another extension $w_1(t)$ of $w$ satisfying $w_1(t) \in W(t)$ we can write
\[ w_1(t) = w(t) + \sum_{i=1}^k \alpha_i(t) e_i(t), \]
for some smooth coefficients $\alpha_i(t)$ such that $\alpha_i(0) = 0$ for every $i$. It follows that
\[ \dot w_1(t) = \dot w(t) + \sum_{i=1}^k \dot\alpha_i(t) e_i(t) + \sum_{i=1}^k \alpha_i(t) \dot e_i(t), \tag{12.2} \]
and evaluating (12.2) at $t = 0$ one has
\[ \dot w_1(0) = \dot w(0) + \sum_{i=1}^k \dot\alpha_i(0) e_i(0). \]
This shows that $\dot w_1(0) = \dot w(0) \pmod W$, hence the map (12.1) is well defined. In the same way one can prove that the map does not depend on the moving frame defining $W(t)$. Finally, it is easy to show that the map that associates the tangent vector to the curve $W(t)$ with the linear operator $W \to V/W$ is surjective, hence it is an isomorphism since the two spaces have the same dimension.

Let us now consider a symplectic vector space $(\Sigma, \sigma)$, i.e. a $2n$-dimensional vector space $\Sigma$ endowed with a non-degenerate symplectic form $\sigma \in \Lambda^2(\Sigma)$.

Definition 12.3. A vector subspace $\Pi \subset \Sigma$ of a symplectic space is called
(i) symplectic if $\sigma|_\Pi$ is nondegenerate,
(ii) isotropic if $\sigma|_\Pi \equiv 0$,
(iii) Lagrangian if $\sigma|_\Pi \equiv 0$ and $\dim \Pi = n$.

Notice that in general for every subspace $\Pi \subset \Sigma$, by nondegeneracy of the symplectic form $\sigma$, one has
\[ \dim \Pi + \dim \Pi^\angle = \dim \Sigma, \tag{12.3} \]
where as usual we denote the symplectic orthogonal by $\Pi^\angle = \{ x \in \Sigma \mid \sigma(x,y) = 0,\ \forall\, y \in \Pi \}$.

Exercise 12.4. Prove the following properties for a vector subspace $\Pi \subset \Sigma$:
(i) $\Pi$ is symplectic iff $\Pi \cap \Pi^\angle = \{0\}$,
(ii) $\Pi$ is isotropic iff $\Pi \subset \Pi^\angle$,
(iii) $\Pi$ is Lagrangian iff $\Pi = \Pi^\angle$.

Exercise 12.5. Prove that, given two subspaces $A, B \subset \Sigma$, one has the identities $(A + B)^\angle = A^\angle \cap B^\angle$ and $(A \cap B)^\angle = A^\angle + B^\angle$.

Example 12.6. Any symplectic vector space admits Lagrangian subspaces. Indeed fix any nonzero element $e_1 := e$ in $\Sigma$. Choose iteratively
\[ e_i \in \mathrm{span}\{e_1,\ldots,e_{i-1}\}^\angle \setminus \mathrm{span}\{e_1,\ldots,e_{i-1}\}, \qquad i = 2,\ldots,n. \tag{12.4} \]
Then $\Pi := \mathrm{span}\{e_1,\ldots,e_n\}$ is a Lagrangian subspace by construction. Notice that the choice (12.4) is possible by (12.3).

Lemma 12.7. Let $\Pi = \mathrm{span}\{e_1,\ldots,e_n\}$ be a Lagrangian subspace of $\Sigma$. Then there exist vectors $f_1,\ldots,f_n \in \Sigma$ such that
(i) $\Sigma = \Pi \oplus \Delta$, where $\Delta := \mathrm{span}\{f_1,\ldots,f_n\}$,
(ii) $\sigma(e_i,f_j) = \delta_{ij}$, $\sigma(e_i,e_j) = \sigma(f_i,f_j) = 0$, for all $i,j = 1,\ldots,n$.

Proof. We prove the lemma by induction. By nondegeneracy of $\sigma$ there exists a nonzero $x \in \Sigma$ such that $\sigma(e_n,x) \neq 0$. Then we define the vector
\[ f_n := \frac{x}{\sigma(e_n,x)}, \qquad \Longrightarrow \qquad \sigma(e_n,f_n) = 1. \]
The last equality implies that $\sigma$ restricted to $\mathrm{span}\{e_n,f_n\}$ is nondegenerate, hence by (i) of Exercise 12.4
\[ \mathrm{span}\{e_n,f_n\} \cap \mathrm{span}\{e_n,f_n\}^\angle = 0, \tag{12.5} \]
and we can apply induction on the $2(n-1)$-dimensional subspace $\Sigma' := \mathrm{span}\{e_n,f_n\}^\angle$. Notice that (12.5) implies that $\sigma$ is nondegenerate also on $\Sigma'$.

Remark 12.8. In particular the complementary subspace $\Delta = \mathrm{span}\{f_1,\ldots,f_n\}$ defined in Lemma 12.7 is Lagrangian and transversal to $\Pi$:
\[ \Sigma = \Pi \oplus \Delta. \]
Considering coordinates induced from the basis chosen for this splitting we can write $\Sigma = \mathbb{R}^{n*} \oplus \mathbb{R}^n$ (where $\mathbb{R}^{n*}$ denotes the set of row vectors). More precisely $x = (\zeta, z)$ if
\[ x = \sum_{i=1}^n \zeta^i e_i + z^i f_i, \qquad \zeta = \begin{pmatrix} \zeta^1 & \cdots & \zeta^n \end{pmatrix}, \qquad z = \begin{pmatrix} z^1 \\ \vdots \\ z^n \end{pmatrix}, \]
and using the canonical form of $\sigma$ on our basis (see Lemma 12.7) we find that in coordinates, if $x_1 = (\zeta_1,z_1)$, $x_2 = (\zeta_2,z_2)$, we get
\[ \sigma(x_1,x_2) = \zeta_1 z_2 - \zeta_2 z_1, \tag{12.6} \]
where we denote by $\zeta z$ the standard rows by columns product.

Lemma 12.7 shows that the group of symplectomorphisms acts transitively on pairs of transversal Lagrangian subspaces. The next exercise, whose proof is an adaptation of the previous one, describes all the orbits of the action of the group of symplectomorphisms on pairs of Lagrangian subspaces of a symplectic vector space.

Exercise 12.9. Let $\Lambda_1, \Lambda_2$ be two Lagrangian subspaces in a symplectic vector space $\Sigma$, and assume that $\dim \Lambda_1 \cap \Lambda_2 = k$. Show that there exist Darboux coordinates $(p,q)$ in $\Sigma$ such that
\[ \Lambda_1 = \{(p,0)\}, \qquad \Lambda_2 = \{((p_1,\ldots,p_k,0,\ldots,0),(0,\ldots,0,q_{k+1},\ldots,q_n))\}. \]

The Lagrange Grassmannian

Definition 12.10. The Lagrange Grassmannian $L(\Sigma)$ of a symplectic vector space $\Sigma$ is the set of its $n$-dimensional Lagrangian subspaces.

Proposition 12.11. $L(\Sigma)$ is a compact submanifold of the Grassmannian $G_n(\Sigma)$ of $n$-dimensional subspaces. Moreover
\[ \dim L(\Sigma) = \frac{n(n+1)}{2}. \tag{12.7} \]

Proof. Recall that $G_n(\Sigma)$ is an $n^2$-dimensional compact manifold. Clearly $L(\Sigma) \subset G_n(\Sigma)$ as a subset. Consider the set of all Lagrangian subspaces that are transversal to a given one
\[ \Delta^\pitchfork = \{ \Lambda \in L(\Sigma) : \Lambda \cap \Delta = 0 \}. \]
Clearly $\Delta^\pitchfork \subset L(\Sigma)$ is an open subset and, since by Lemma 12.7 every Lagrangian subspace admits a Lagrangian complement,
\[ L(\Sigma) = \bigcup_{\Delta \in L(\Sigma)} \Delta^\pitchfork. \]
It is then sufficient to find some coordinates on these open subsets. Every $n$-dimensional subspace $\Lambda \subset \Sigma$ which is transversal to $\Delta$ is the graph of a linear map from $\Pi$ to $\Delta$. More precisely there exists a matrix $S_\Lambda$ such that
\[ \Lambda \cap \Delta = 0 \quad \Longleftrightarrow \quad \Lambda = \{ (z^T, S_\Lambda z),\ z \in \mathbb{R}^n \}. \]
(Here we used the coordinates induced by the splitting $\Sigma = \Pi \oplus \Delta$.) Moreover it is easily seen that
\[ \Lambda \in L(\Sigma) \quad \Longleftrightarrow \quad S_\Lambda = (S_\Lambda)^T. \]
Indeed we have that $\Lambda \in L(\Sigma)$ if and only if $\sigma|_\Lambda = 0$, and using (12.6) this is rewritten as
\[ \sigma((z_1^T, S_\Lambda z_1),(z_2^T, S_\Lambda z_2)) = z_1^T S_\Lambda z_2 - z_2^T S_\Lambda z_1 = 0, \]
which means exactly that $S_\Lambda$ is symmetric. Hence the open set of all Lagrangian subspaces that are transversal to $\Delta$ is parametrized by the set of symmetric matrices, which gives coordinates on this open set. This also proves that the dimension of $L(\Sigma)$ coincides with the dimension of the space of symmetric matrices, hence (12.7). Notice also that $L(\Sigma)$, being a closed set in a compact manifold, is compact.

Now we describe the tangent space to the Lagrange Grassmannian.

Proposition 12.12. Let $\Lambda \in L(\Sigma)$. Then we have a canonical isomorphism
\[ T_\Lambda L(\Sigma) \simeq Q(\Lambda), \]
where $Q(\Lambda)$ denotes the set of quadratic forms on $\Lambda$.

Proof. Consider a smooth curve $\Lambda(t)$ in $L(\Sigma)$ such that $\Lambda(0) = \Lambda$ and let $\dot\Lambda(0) \in T_\Lambda L(\Sigma)$ be its tangent vector. As before consider a point $x \in \Lambda$ and a smooth extension $x(t) \in \Lambda(t)$, and denote $\dot x := \dot x(0)$. We define the map
\[ \dot\Lambda : x \mapsto \sigma(x,\dot x), \tag{12.8} \]

that is nothing else but the quadratic map associated to the self-adjoint map $x \mapsto \dot x$ by the symplectic structure. We show that in coordinates $\dot\Lambda$ is a well defined quadratic map, independent of all choices. Indeed
\[ \Lambda(t) = \{ (z^T, S_{\Lambda(t)} z),\ z \in \mathbb{R}^n \}, \]
and the curve $x(t)$ can be written
\[ x(t) = (z(t)^T, S_{\Lambda(t)} z(t)), \qquad x = x(0) = (z^T, S_\Lambda z), \]
for some curve $z(t)$ where $z = z(0)$. Taking the derivative we get
\[ \dot x(t) = (\dot z(t)^T, \dot S_{\Lambda(t)} z(t) + S_{\Lambda(t)} \dot z(t)), \]
and evaluating at $t = 0$ (we simply omit $t$ when we evaluate at $t = 0$) we have
\[ x = (z^T, S_\Lambda z), \qquad \dot x = (\dot z^T, \dot S_\Lambda z + S_\Lambda \dot z), \]
and finally get, using the symmetry of $S_\Lambda$, that
\[ \sigma(x,\dot x) = z^T(\dot S_\Lambda z + S_\Lambda \dot z) - \dot z^T S_\Lambda z = z^T \dot S_\Lambda z + z^T S_\Lambda \dot z - \dot z^T S_\Lambda z = z^T \dot S_\Lambda z. \tag{12.9} \]

Exercise 12.13. Let $\Lambda(t) \in L(\Sigma)$ be such that $\Lambda = \Lambda(0)$ and let $\sigma$ be the symplectic form. Prove that the map $S : \Lambda \times \Lambda \to \mathbb{R}$ defined by $S(x,y) = \sigma(x,\dot y)$, where $\dot y = \dot y(0)$ is the tangent vector to a smooth extension $y(t) \in \Lambda(t)$ of $y$, is a symmetric bilinear map.

Remark 12.14. We have the following natural interpretation of this result: since $L(\Sigma)$ is a submanifold of the Grassmannian $G_n(\Sigma)$, its tangent space $T_\Lambda L(\Sigma)$ is naturally identified by the inclusion with a subspace of the tangent space to the Grassmannian
\[ i : L(\Sigma) \hookrightarrow G_n(\Sigma), \qquad i_* : T_\Lambda L(\Sigma) \hookrightarrow T_\Lambda G_n(\Sigma) \simeq \mathrm{Hom}(\Lambda, \Sigma/\Lambda), \]
where the last isomorphism is Proposition 12.2. Since $\Lambda$ is a Lagrangian subspace of $\Sigma$, the symplectic structure identifies in a canonical way the factor space $\Sigma/\Lambda$ with the dual space $\Lambda^*$, defining
\[ \Sigma/\Lambda \simeq \Lambda^*, \qquad \langle [z]_\Lambda, x \rangle = \sigma(z,x). \tag{12.10} \]
Hence the tangent space to the Lagrange Grassmannian consists of those linear maps in the space $\mathrm{Hom}(\Lambda,\Lambda^*)$ that are self-adjoint, which are naturally identified with quadratic forms on $\Lambda$ itself [1].

Remark 12.15. Given a curve $\Lambda(t)$ in $L(\Sigma)$, the above procedure associates to the tangent vector $\dot\Lambda(t)$ a quadratic form on $\Lambda(t)$, for every $t$.

We end this section by computing the tangent vector to a special class of curves that will play a major role in the sequel, i.e. the curve on $L(\Sigma)$ induced by the action on $\Lambda$ of the flow of the linear Hamiltonian vector field $\vec h$ associated with a quadratic Hamiltonian $h \in C^\infty(\Sigma)$. (Recall that the flow of a Hamiltonian vector field transforms Lagrangian subspaces into Lagrangian subspaces.)

[1] Any quadratic form $q \in Q(V)$ on a vector space can be identified with a self-adjoint linear map $L : V \to V^*$, $L(v) = B(v,\cdot)$, where $B$ is the symmetric bilinear map such that $q(v) = B(v,v)$.
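The coordinate description used in this section (a graph subspace $\{(z^T, Sz)\}$ is Lagrangian exactly when $S$ is symmetric) can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `sigma`, `graph_point` and `is_lagrangian_graph` are hypothetical, and the form is the one of (12.6).

```python
def dot(a, b):
    """Euclidean pairing of a row vector with a column vector."""
    return sum(u * v for u, v in zip(a, b))

def sigma(x1, x2):
    """Canonical symplectic form (12.6) on pairs x = (zeta, z)."""
    (zeta1, z1), (zeta2, z2) = x1, x2
    return dot(zeta1, z2) - dot(zeta2, z1)

def graph_point(S, z):
    """Point (z^T, S z) of the graph subspace defined by the matrix S."""
    return (list(z), [dot(row, z) for row in S])

def is_lagrangian_graph(S):
    """sigma vanishes on the graph iff it vanishes on a basis of the graph."""
    n = len(S)
    basis = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return all(
        sigma(graph_point(S, a), graph_point(S, b)) == 0
        for a in basis
        for b in basis
    )
```

On basis vectors the test reduces to $S_{ij} - S_{ji} = 0$, which is the symmetry condition appearing in the proof of the dimension formula (12.7).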

Proposition 12.16. Let $\Lambda \in L(\Sigma)$ and define $\Lambda(t) = e^{t\vec h}(\Lambda)$. Then $\dot\Lambda = 2h|_\Lambda$.

Proof. Consider $x \in \Lambda$ and the smooth extension $x(t) = e^{t\vec h}(x)$. Then $\dot x = \vec h(x)$ and by definition of Hamiltonian vector field we find
\[ \sigma(x,\dot x) = \sigma(x,\vec h(x)) = \langle d_x h, x \rangle = 2h(x), \]
where in the last equality we used that $h$ is quadratic.

12.2 Regular curves in Lagrange Grassmannian

The isomorphism between tangent vectors to the Lagrange Grassmannian and quadratic forms gives meaning to the following definition (we denote by $\dot\Lambda$ the tangent vector to the curve at the point $\Lambda$, regarded as a quadratic map).

Definition 12.17. Let $\Lambda(t) \in L(\Sigma)$ be a smooth curve in the Lagrange Grassmannian. We say that the curve is
(i) monotone increasing (decreasing) if $\dot\Lambda(t) \ge 0$ ($\dot\Lambda(t) \le 0$),
(ii) strictly monotone increasing (decreasing) if the inequality in (i) is strict,
(iii) regular if its derivative $\dot\Lambda(t)$ is a non-degenerate quadratic form.

Remark 12.18. Notice that if $\Lambda(t) = \{(p,S(t)p),\ p \in \mathbb{R}^n\}$ in some coordinate set, then it follows from the proof of Proposition 12.12 that the quadratic form $\dot\Lambda(t)$ is represented by the matrix $\dot S(t)$ (see also (12.9)). In particular the curve is regular if and only if $\det \dot S(t) \neq 0$.

The main goal of this section is the construction of a canonical Lagrangian complement (i.e. another curve $\Lambda^\circ(t)$ in the Lagrange Grassmannian defined by $\Lambda(t)$ and such that $\Sigma = \Lambda(t) \oplus \Lambda^\circ(t)$).

Consider an arbitrary Lagrangian splitting $\Sigma = \Lambda(0) \oplus \Delta$ defined by a complement $\Delta$ to $\Lambda(0)$ (see Lemma 12.7) and fix coordinates in such a way that
\[ \Sigma = \{(p,q),\ p,q \in \mathbb{R}^n\}, \qquad \Lambda(0) = \{(p,0),\ p \in \mathbb{R}^n\}, \qquad \Delta = \{(0,q),\ q \in \mathbb{R}^n\}. \]
In these coordinates our regular curve is described by a one-parameter family of symmetric matrices $S(t)$,
\[ \Lambda(t) = \{(p,S(t)p),\ p \in \mathbb{R}^n\}, \]
such that $S(0) = 0$ and $\dot S(0)$ is invertible. All Lagrangian complements to $\Lambda(0)$ are parametrized by a symmetric matrix $B$ as follows
\[ \Delta_B = \{(Bq,q),\ q \in \mathbb{R}^n\}, \qquad B = B^T. \]
The following lemma shows how the coordinate expression of our curve $\Lambda(t)$ changes in the new coordinate set defined by the splitting $\Sigma = \Lambda(0) \oplus \Delta_B$.

Lemma 12.19. Let $S_B(t)$ be the one-parameter family of symmetric matrices defining $\Lambda(t)$ in coordinates with respect to the splitting $\Lambda(0) \oplus \Delta_B$. Then the following identity holds
\[ S_B(t) = (S(t)^{-1} - B)^{-1}. \tag{12.11} \]

Proof. It is easy to show that, if $(p,q)$ and $(p',q')$ denote the coordinates with respect to the splittings defined by the subspaces $\Delta$ and $\Delta_B$ respectively, we have
\[ \begin{cases} p' = p - Bq, \\ q' = q. \end{cases} \tag{12.12} \]
The matrix $S_B(t)$ is by definition the matrix that satisfies the identity $q' = S_B(t)p'$. Using that $q = S(t)p$ by definition of $\Lambda(t)$, from (12.12) we find
\[ q' = q = S(t)p = S(t)(p' + Bq'), \]
and with straightforward computations we finally get
\[ S_B(t) = (I - S(t)B)^{-1} S(t) = (S(t)^{-1} - B)^{-1}. \]

Since $\dot S(t)$ represents the tangent vector to the regular curve $\Lambda(t)$, its properties are invariant with respect to changes of coordinates. Hence it is natural to look for a change of coordinates (i.e. a choice of the matrix $B$) that simplifies the second derivative of our curve.

Corollary 12.20. There exists a unique symmetric matrix $B$ such that $\ddot S_B(0) = 0$.

Proof. Recall that for a one-parameter family of matrices $X(t)$ we have
\[ \frac{d}{dt} X(t)^{-1} = -X(t)^{-1} \dot X(t) X(t)^{-1}. \]
Applying this identity twice to (12.11) (we omit the argument $t$) we get
\[ \frac{d}{dt} S_B = -(S^{-1} - B)^{-1} \Big( \frac{d}{dt} S^{-1} \Big) (S^{-1} - B)^{-1} = (S^{-1} - B)^{-1} S^{-1} \dot S S^{-1} (S^{-1} - B)^{-1} = (I - SB)^{-1} \dot S (I - BS)^{-1}, \]
where the last expression makes sense also at $t = 0$. Hence for the second derivative evaluated at $t = 0$ (remember that in our coordinates $S(0) = 0$) one gets
\[ \ddot S_B = \ddot S + 2 \dot S B \dot S, \]
and using that $\dot S$ is non-degenerate, we can choose $B = -\frac{1}{2} \dot S^{-1} \ddot S \dot S^{-1}$.

We set $\Lambda^\circ(0) := \Delta_B$, where $B$ is determined by (12.13). Notice that by construction $\Lambda^\circ(0)$ is a Lagrangian subspace and it is transversal to $\Lambda(0)$. The same argument can be applied to define $\Lambda^\circ(t)$ for every $t$.
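The change of complement (12.11) and the choice of $B$ in the corollary can be illustrated with exact rational arithmetic. This is only a sketch under stated assumptions: the sample curve $S(t) = t S_1 + \frac{t^2}{2} S_2$ with $2 \times 2$ symmetric matrices `S1`, `S2` is an arbitrary choice made for illustration, and all helper names are hypothetical.

```python
from fractions import Fraction as F

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def inv(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

I2 = [[F(1), F(0)], [F(0), F(1)]]
S1 = [[F(2), F(1)], [F(1), F(3)]]    # assumed value of dot S(0), invertible
S2 = [[F(1), F(-2)], [F(-2), F(4)]]  # assumed value of ddot S(0)

def S(t):
    """Sample regular curve with S(0) = 0 (illustrative choice)."""
    return add(scale(t, S1), scale(t * t / 2, S2))

# B = -1/2 * Sdot^{-1} Sddot Sdot^{-1}, as in the corollary.
B = scale(F(-1, 2), mul(mul(inv(S1), S2), inv(S1)))

def S_B(t):
    """Curve in the new splitting, written as (I - S B)^{-1} S."""
    return mul(inv(add(I2, scale(-1, mul(S(t), B)))), S(t))

def second_difference(h):
    """Central second difference of S_B at t = 0 (note S_B(0) = 0)."""
    return scale(1 / (h * h), add(S_B(h), S_B(-h)))
```

With this $B$ the second difference of $S_B$ at $t = 0$ tends to zero as $h \to 0$, in agreement with $\ddot S_B(0) = \ddot S + 2\dot S B \dot S = 0$; the two expressions for $S_B(t)$ in the lemma agree exactly wherever $S(t)$ is invertible.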

Definition 12.21. Let $\Lambda(t)$ be a regular curve. The curve $\Lambda^\circ(t)$ defined by the condition above is called the derivative curve of $\Lambda(t)$.

Exercise 12.22. Prove that, if $\Lambda(t) = \{(p,S(t)p),\ p \in \mathbb{R}^n\}$ (without the condition $S(0) = 0$), then the derivative curve $\Lambda^\circ(t) = \{(p,S^\circ(t)p),\ p \in \mathbb{R}^n\}$ satisfies
\[ S^\circ(t) = B(t)^{-1} + S(t), \qquad \text{where } B(t) := -\frac{1}{2} \dot S(t)^{-1} \ddot S(t) \dot S(t)^{-1}, \tag{12.13} \]
provided $\Lambda^\circ(t)$ is transversal to the subspace $\Delta = \{(0,q),\ q \in \mathbb{R}^n\}$. (Actually this condition is equivalent to the invertibility of $B(t)$.) Notice that if $S(0) = 0$ then $S^\circ(0) = B(0)^{-1}$.

Remark 12.23. The set $\Lambda^\pitchfork$ of all $n$-dimensional spaces transversal to a fixed subspace $\Lambda$ is an affine space over $\mathrm{Hom}(\Sigma/\Lambda,\Lambda)$. Indeed given two elements $\Delta_1, \Delta_2 \in \Lambda^\pitchfork$ we can associate with their difference the operator
\[ \Delta_2 - \Delta_1 \mapsto A \in \mathrm{Hom}(\Sigma/\Lambda,\Lambda), \qquad A([z]_\Lambda) = z_2 - z_1 \in \Lambda, \tag{12.14} \]
where $z_i \in \Delta_i \cap [z]_\Lambda$ are uniquely identified. If $\Lambda$ is Lagrangian, the identification $\Sigma/\Lambda \simeq \Lambda^*$ given by the symplectic structure (see (12.10)) shows that the set of Lagrangian subspaces transversal to $\Lambda$, which coincides by definition with the intersection $\Lambda^\pitchfork \cap L(\Sigma)$, is an affine space over $\mathrm{Hom}_S(\Lambda^*,\Lambda)$, the space of self-adjoint maps between $\Lambda^*$ and $\Lambda$, which is isomorphic to $Q(\Lambda^*)$. Notice that if we fix a distinguished complement $\Delta$ of $\Lambda$, i.e. $\Sigma = \Lambda \oplus \Delta$, then we have also the identifications $\Sigma/\Lambda \simeq \Delta$ and $\Lambda^\pitchfork \cap L(\Sigma) \simeq Q(\Lambda^*) \simeq Q(\Delta)$.

Exercise 12.24. Prove that the operator $A$ defined by (12.14), in the case when $\Lambda$ is Lagrangian, is a self-adjoint operator.

Remark 12.25. Assume that the splitting $\Sigma = \Lambda \oplus \Delta$ is fixed. Then our curve $\Lambda(t)$ in $L(\Sigma)$, such that $\Lambda(0) = \Lambda$, is characterized by a family of symmetric matrices $S(t)$ satisfying $\Lambda(t) = \{(p,S(t)p),\ p \in \mathbb{R}^n\}$, with $S(0) = 0$. By regularity of the curve, $\Lambda(t) \in \Lambda^\pitchfork$ for $t > 0$ small enough, hence we can consider its coordinate presentation in the affine space over the vector space of quadratic forms defined on $\Delta$ (see Remark 12.23), which is given by $S^{-1}(t)$, and write the Laurent expansion of this curve in the affine space
\[ S(t)^{-1} = \Big( t\dot S + \frac{t^2}{2} \ddot S + O(t^3) \Big)^{-1} = \frac{1}{t} \dot S^{-1} \Big( I + \frac{t}{2} \ddot S \dot S^{-1} + O(t^2) \Big)^{-1} = \frac{1}{t} \dot S^{-1} \underbrace{- \frac{1}{2} \dot S^{-1} \ddot S \dot S^{-1}}_{B} + O(t). \]
It is not by chance that the matrix $B$ coincides with the free term of this expansion. Indeed the formula (12.11) for the change of coordinates can be rewritten as follows
\[ S_B(t)^{-1} = S^{-1}(t) - B, \tag{12.15} \]
and the choice of $B$ corresponds exactly to the choice of a coordinate set where the curve $\Lambda(t)$ has no free term in this expansion (i.e. $S_B(t)^{-1}$ has no term of order zero). This is equivalent to saying that a regular curve allows us to choose a privileged origin in the affine space of Lagrangian subspaces that are transversal to the curve itself.

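12.3 Curvature of a regular curve

Now we want to define the curvature of a regular curve in the Lagrange Grassmannian. Let $\Lambda(t)$ be a regular curve and consider its derivative curve $\Lambda^\circ(t)$. The tangent vectors to $\Lambda(t)$ and $\Lambda^\circ(t)$, as explained in Section 12.1, can be interpreted in a canonical way as quadratic forms on the spaces $\Lambda(t)$ and $\Lambda^\circ(t)$ respectively
\[ \dot\Lambda(t) \in Q(\Lambda(t)), \qquad \dot\Lambda^\circ(t) \in Q(\Lambda^\circ(t)). \]
Since $\Lambda^\circ(t)$ is a canonical Lagrangian complement to $\Lambda(t)$, we have the following identifications through the symplectic form [2]
\[ \Lambda(t)^* \simeq \Lambda^\circ(t), \qquad \Lambda^\circ(t)^* \simeq \Lambda(t), \]
and the quadratic forms $\dot\Lambda(t)$, $\dot\Lambda^\circ(t)$ can be treated as (self-adjoint) mappings:
\[ \dot\Lambda(t) : \Lambda(t) \to \Lambda^\circ(t), \qquad \dot\Lambda^\circ(t) : \Lambda^\circ(t) \to \Lambda(t). \tag{12.16} \]

Definition 12.26. The operator $R_\Lambda(t) := \dot\Lambda^\circ(t) \circ \dot\Lambda(t) : \Lambda(t) \to \Lambda(t)$ is called the curvature operator of the regular curve $\Lambda(t)$.

Remark 12.27. In the monotonic case, when $\dot\Lambda(t)$ defines a scalar product on $\Lambda(t)$, the operator $R(t)$ is, by definition, symmetric with respect to this scalar product. Moreover $R(t)$, as a quadratic form, has the same signature and rank as $\dot\Lambda^\circ(t)\,\mathrm{sign}(\dot\Lambda(t))$.

Definition 12.28. Let $\Lambda_1, \Lambda_2$ be two transversal Lagrangian subspaces of $\Sigma$. We denote by
\[ \pi_{\Lambda_1\Lambda_2} : \Sigma \to \Lambda_2 \tag{12.17} \]
the projection on $\Lambda_2$ parallel to $\Lambda_1$, i.e. the linear operator such that
\[ \pi_{\Lambda_1\Lambda_2}|_{\Lambda_1} = 0, \qquad \pi_{\Lambda_1\Lambda_2}|_{\Lambda_2} = \mathrm{Id}. \]

Exercise 12.29. Let $\Lambda_1$ and $\Lambda_2$ be two Lagrangian subspaces in $\Sigma$ and assume that, in some coordinate set, $\Lambda_i = \{(x,S_i x),\ x \in \mathbb{R}^n\}$ for $i = 1,2$. Prove that $\Sigma = \Lambda_1 \oplus \Lambda_2$ if and only if $\ker(S_1 - S_2) = \{0\}$. In this case show that the following matrix expression holds for $\pi_{\Lambda_1\Lambda_2}$:
\[ \pi_{\Lambda_1\Lambda_2} = \begin{pmatrix} S_{12}^{-1} S_1 & -S_{12}^{-1} \\ S_2 S_{12}^{-1} S_1 & -S_2 S_{12}^{-1} \end{pmatrix}, \qquad S_{12} := S_1 - S_2. \tag{12.18} \]

From the very definition of the derivative of our curve we can get the following geometric characterization of the curvature of a curve.

Proposition 12.30. Let $\Lambda(t)$ be a regular curve in $L(\Sigma)$ and $\Lambda^\circ(t)$ its derivative curve. Then
\[ \dot\Lambda(t)(x_t) = \pi_{\Lambda(t)\Lambda^\circ(t)}(\dot x_t), \qquad \dot\Lambda^\circ(t)(x_t) = -\pi_{\Lambda^\circ(t)\Lambda(t)}(\dot x_t). \]
In particular the curvature is the composition $R_\Lambda(t) = \dot\Lambda^\circ(t) \circ \dot\Lambda(t)$.

[2] If $\Sigma = \Lambda \oplus \Delta$ is a splitting of a vector space then $\Sigma/\Lambda \simeq \Delta$. If moreover the splitting is Lagrangian in a symplectic space, the symplectic form identifies $\Sigma/\Lambda \simeq \Lambda^*$, hence $\Delta \simeq \Lambda^*$.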

Proof. Recall that, by definition, the linear operator $\dot\Lambda : \Lambda \to \Sigma/\Lambda$ associated with the quadratic form is the map $x \mapsto \dot x \pmod \Lambda$. Hence to build the map $\Lambda \to \Lambda^\circ$ it is enough to compute the projection of $\dot x$ onto the complement $\Lambda^\circ$, that is exactly $\pi_{\Lambda\Lambda^\circ}(\dot x)$. Notice that the minus sign in Proposition 12.30 is a consequence of the skew symmetry of the symplectic product. More precisely, the sign in the identification $\Lambda^\circ \simeq \Lambda^*$ depends on the position of the argument.

The curvature $R_\Lambda(t)$ of the curve $\Lambda(t)$ is a kind of relative velocity between the two curves $\Lambda(t)$ and $\Lambda^\circ(t)$. In particular notice that if the two curves move in the same direction we have $R_\Lambda(t) > 0$.

Now we compute the expression of the curvature $R_\Lambda(t)$ in coordinates.

Proposition 12.31. Assume that $\Lambda(t) = \{(p,S(t)p)\}$ is a regular curve in $L(\Sigma)$. Then we have the following coordinate expression for the curvature of $\Lambda$ (we omit $t$ in the formula)
\[ R_\Lambda = \frac{d}{dt}\big( (2\dot S)^{-1} \ddot S \big) - \big( (2\dot S)^{-1} \ddot S \big)^2 \tag{12.19} \]
\[ \phantom{R_\Lambda} = \frac{1}{2} \dot S^{-1} \dddot S - \frac{3}{4} (\dot S^{-1} \ddot S)^2. \tag{12.20} \]

Proof. Assume that both $\Lambda(t)$ and $\Lambda^\circ(t)$ are contained in the same coordinate chart with
\[ \Lambda(t) = \{(p,S(t)p)\}, \qquad \Lambda^\circ(t) = \{(p^\circ,S^\circ(t)p^\circ)\}. \]
We start the proof by computing the expression of the linear operator associated with the derivative $\dot\Lambda : \Lambda \to \Lambda^\circ$ (we omit $t$ when we compute at $t = 0$). For each element $(p,Sp) \in \Lambda$ and any extension $(p(t),S(t)p(t))$ one can apply the matrix representing the operator $\pi_{\Lambda\Lambda^\circ}$ (see (12.18)) to the derivative at $t = 0$ and find
\[ \pi_{\Lambda\Lambda^\circ}(\dot p, \dot S p + S \dot p) = (p^\circ, S^\circ p^\circ), \qquad p^\circ = (S^\circ - S)^{-1} \dot S p. \]
Exchanging the roles of $\Lambda$ and $\Lambda^\circ$, and taking into account the minus sign, one finds that the coordinate representation of $R$ is given by
\[ R = (S^\circ - S)^{-1} \dot S^\circ (S^\circ - S)^{-1} \dot S. \tag{12.21} \]
We prove formula (12.20) under the extra assumption that $S(0) = 0$. Notice that this is equivalent to the choice of a particular coordinate set in $L(\Sigma)$ and, since the expression of $R$ is coordinate independent by construction, this is not restrictive. Under this extra assumption, it follows from (12.13) that
\[ \Lambda(t) = \{(p,S(t)p)\}, \qquad \Lambda^\circ(t) = \{(p,S^\circ(t)p)\}, \]
where $S^\circ(t) = B(t)^{-1} + S(t)$ and we denote $B(t) := -\frac{1}{2} \dot S(t)^{-1} \ddot S(t) \dot S(t)^{-1}$. Hence we have, assuming $S(0) = 0$ and omitting $t$ when $t = 0$,
\[ R = (S^\circ - S)^{-1} \dot S^\circ (S^\circ - S)^{-1} \dot S = B \Big( \frac{d}{dt}\Big|_{t=0} \big( B(t)^{-1} + S(t) \big) \Big) B \dot S = (B\dot S)^2 - \dot B \dot S. \]
Plugging $B = -\frac{1}{2} \dot S^{-1} \ddot S \dot S^{-1}$ into the last formula, after some computations one gets (12.20).
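The projection matrix (12.18) used in the proof above can be sanity-checked in the scalar case $n = 1$, where $S_1, S_2$ are numbers and $\Lambda_i$ is the line $\{(x, S_i x)\}$ in the plane. This is a minimal sketch with hypothetical helper names `projection` and `apply`, not a general implementation.

```python
from fractions import Fraction as F

def projection(s1, s2):
    """Matrix (12.18) for n = 1: projection onto Lambda_2 parallel to Lambda_1.

    Requires s1 != s2, i.e. the two lines are transversal.
    """
    s12 = s1 - s2
    return [[s1 / s12, -1 / s12],
            [s2 * s1 / s12, -s2 / s12]]

def apply(P, v):
    """Apply a 2x2 matrix to a vector of the plane."""
    return [P[0][0] * v[0] + P[0][1] * v[1],
            P[1][0] * v[0] + P[1][1] * v[1]]
```

The defining properties are that the generator $(1, S_1)$ of $\Lambda_1$ is sent to zero while the generator $(1, S_2)$ of $\Lambda_2$ is fixed; both follow from $S_{12} = S_1 - S_2$.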

Remark 12.32. The formula for the curvature $R_\Lambda(t)$ of a curve $\Lambda(t)$ in $L(\Sigma)$ takes a very simple form in a particular coordinate set given by the splitting $\Sigma = \Lambda(0) \oplus \Lambda^\circ(0)$, i.e. such that
\[ \Lambda(0) = \{(p,0),\ p \in \mathbb{R}^n\}, \qquad \Lambda^\circ(0) = \{(0,q),\ q \in \mathbb{R}^n\}. \]
Indeed using a symplectic change of coordinates in $\Sigma$ that preserves both $\Lambda$ and $\Lambda^\circ$ (i.e. of the kind $p' = Ap$, $q' = (A^{-1})^* q$) we can choose the matrix $A$ in such a way that $\dot S(0) = I$. Moreover, by the construction of the derivative curve (see (12.13)), the condition $\Lambda^\circ = \{(0,q),\ q \in \mathbb{R}^n\}$ is equivalent to $\ddot S(0) = 0$. Hence one finds from (12.20) that
\[ R = \frac{1}{2} \dddot S. \]
When the curve $\Lambda(t)$ is strictly monotone, the curvature $R$ represents a well defined operator on $\Lambda(0)$, naturally endowed with the sign definite quadratic form $\dot\Lambda(0)$. Hence in these coordinates the eigenvalues of $\dddot S$ (and not only the trace and the determinant) are invariants of the curve.

Exercise 12.33. Let $f : \mathbb{R} \to \mathbb{R}$ be a smooth function. The Schwarzian derivative of $f$ is defined as
\[ Sf := \Big( \frac{f''}{2f'} \Big)' - \Big( \frac{f''}{2f'} \Big)^2. \tag{12.22} \]
Prove that $Sf = 0$ if and only if $f(t) = \dfrac{at+b}{ct+d}$ for some $a,b,c,d \in \mathbb{R}$.

Remark 12.34. The previous proposition says that the curvature $R$ is the matrix version of the Schwarzian derivative of the matrix $S$ (cf. (12.19) and (12.22)).

Example 12.35. Let $\Sigma$ be a 2-dimensional symplectic space. In this case $L(\Sigma) \simeq P^1(\mathbb{R})$ is the real projective line. Let us compute the curvature of a curve in $L(\Sigma)$ with constant (angular) velocity $\alpha > 0$. We have
\[ \Lambda(t) = \{(p,S(t)p),\ p \in \mathbb{R}\}, \qquad S(t) = \tan(\alpha t) \in \mathbb{R}. \]
From the explicit expression it is easy to find the relations
\[ \dot S(t) = \alpha(1 + S^2(t)), \qquad \frac{\ddot S(t)}{2\dot S(t)} = \alpha S(t), \]
from which one gets that $R(t) = \alpha \dot S(t) - \alpha^2 S^2(t) = \alpha^2$, i.e. the curve has constant curvature.

We end this section with a useful formula on the curvature of a reparametrized curve.

Proposition 12.36. Let $\varphi : \mathbb{R} \to \mathbb{R}$ be a diffeomorphism and define the curve $\Lambda_\varphi(t) := \Lambda(\varphi(t))$. Then
\[ R_{\Lambda_\varphi}(t) = \dot\varphi^2(t)\, R_\Lambda(\varphi(t)) + R_\varphi(t)\,\mathrm{Id}. \tag{12.23} \]

Proof. It is a simple check that the Schwarzian derivative of the composition of two functions $f$ and $g$ satisfies
\[ S(f \circ g) = (Sf \circ g)(g')^2 + Sg. \]
Notice that $R_\varphi(t)$ makes sense as the curvature of the regular curve $\varphi : \mathbb{R} \to \mathbb{R} \subset P^1$ in the Lagrange Grassmannian $L(\mathbb{R}^2)$.
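The scalar case of formula (12.20) can be evaluated directly on the two examples discussed above. The sketch below is illustrative only: the derivatives are entered in closed form (computed by hand from $S = \tan(\alpha t)$ and from $f(t) = (at+b)/(ct+d)$), and the function names are hypothetical.

```python
import math

def curvature_scalar(d1, d2, d3):
    """Formula (12.20) for n = 1: R = S'''/(2 S') - (3/4) (S''/S')^2."""
    return d3 / (2 * d1) - 0.75 * (d2 / d1) ** 2

def tan_derivatives(alpha, t):
    """First three derivatives of S(t) = tan(alpha t)."""
    s = math.tan(alpha * t)
    d1 = alpha * (1 + s * s)
    d2 = 2 * alpha ** 2 * s * (1 + s * s)
    d3 = 2 * alpha ** 3 * (1 + s * s) * (1 + 3 * s * s)
    return d1, d2, d3

def moebius_derivatives(a, b, c, d, t):
    """First three derivatives of f(t) = (a t + b) / (c t + d)."""
    det = a * d - b * c
    u = c * t + d
    return det / u ** 2, -2 * c * det / u ** 3, 6 * c ** 2 * det / u ** 4
```

For $S = \tan(\alpha t)$ the value is $\alpha^2$ independently of $t$, as in the example, while for a fractional linear function the expression vanishes, consistently with the exercise on the Schwarzian derivative.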

Exercise 12.37 (Another formula for the curvature). Let $\Lambda_0, \Lambda_1 \in L(\Sigma)$ be such that $\Sigma = \Lambda_0 \oplus \Lambda_1$ and fix two tangent vectors $\xi_0 \in T_{\Lambda_0} L(\Sigma)$ and $\xi_1 \in T_{\Lambda_1} L(\Sigma)$. As in (12.16) we can treat each tangent vector as a linear operator
\[ \xi_0 : \Lambda_0 \to \Lambda_1, \qquad \xi_1 : \Lambda_1 \to \Lambda_0, \tag{12.24} \]
and define the cross-ratio $[\xi_1,\xi_0] = \xi_1 \circ \xi_0$. If in some coordinates $\Lambda_i = \{(p,S_i p)\}$ for $i = 0,1$ we have [3]
\[ [\xi_1,\xi_0] = (S_1 - S_0)^{-1} \dot S_1 (S_1 - S_0)^{-1} \dot S_0. \]
Let now $\Lambda(t)$ be a regular curve in $L(\Sigma)$. By regularity $\Sigma = \Lambda(0) \oplus \Lambda(t)$ for all $t > 0$ small enough, hence the cross-ratio
\[ [\dot\Lambda(t),\dot\Lambda(0)] : \Lambda(0) \to \Lambda(0) \]
is well defined. Prove the following expansion for $t \to 0$
\[ [\dot\Lambda(t),\dot\Lambda(0)] \simeq \frac{1}{t^2}\,\mathrm{Id} + \frac{1}{3} R_\Lambda(0) + O(t). \tag{12.25} \]

12.4 Reduction of non-regular curves in Lagrange Grassmannian

In this section we want to extend the notion of curvature to non-regular curves. As we will see in the next chapter, it is always possible to associate with an extremal a family of Lagrangian subspaces in a symplectic space, i.e. a curve in a Lagrange Grassmannian. This curve turns out to be regular if and only if the extremal is an extremal of a Riemannian structure. Hence, if we want to apply this theory to a genuine sub-Riemannian case we need some tools to deal with non-regular curves in the Lagrange Grassmannian.

Let $(\Sigma,\sigma)$ be a symplectic vector space and let $L(\Sigma)$ denote the Lagrange Grassmannian. We start by describing a natural subspace of $L(\Sigma)$ associated with an isotropic subspace $\Gamma$ of $\Sigma$. This will allow us to define a reduction procedure for a non-regular curve.

Let $\Gamma$ be a $k$-dimensional isotropic subspace of $\Sigma$, i.e. $\sigma|_\Gamma = 0$. This means that $\Gamma \subset \Gamma^\angle$. In particular $\Gamma^\angle/\Gamma$ is a $2(n-k)$-dimensional symplectic space with the restriction of $\sigma$.

Lemma 12.38. There is a natural identification of $L(\Gamma^\angle/\Gamma)$ as a subset of $L(\Sigma)$:
\[ L(\Gamma^\angle/\Gamma) \simeq \{ \Lambda \in L(\Sigma),\ \Gamma \subset \Lambda \} \subset L(\Sigma). \tag{12.26} \]
Moreover we have a natural projection
\[ \pi^\Gamma : L(\Sigma) \to L(\Gamma^\angle/\Gamma), \qquad \Lambda \mapsto \Lambda^\Gamma, \]
where $\Lambda^\Gamma := (\Lambda \cap \Gamma^\angle) + \Gamma = (\Lambda + \Gamma) \cap \Gamma^\angle$.

Proof. Assume that $\Lambda \in L(\Sigma)$ and $\Gamma \subset \Lambda$. Then, since $\Lambda$ is Lagrangian, $\Lambda = \Lambda^\angle \subset \Gamma^\angle$, hence the identification (12.26). Assume now that $\Lambda \in L(\Gamma^\angle/\Gamma)$ and let us show that $\pi^\Gamma(\Lambda) = \Lambda$, i.e. $\pi^\Gamma$ is a projection. Indeed from the inclusions $\Gamma \subset \Lambda \subset \Gamma^\angle$ one has
\[ \pi^\Gamma(\Lambda) = \Lambda^\Gamma = (\Lambda \cap \Gamma^\angle) + \Gamma = \Lambda + \Gamma = \Lambda. \]

[3] Here $\dot S_i$ denotes the matrix associated with $\xi_i$.

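We are left to check that $\Lambda^\Gamma$ is Lagrangian, i.e. $(\Lambda^\Gamma)^\angle = \Lambda^\Gamma$:
\[ (\Lambda^\Gamma)^\angle = ((\Lambda \cap \Gamma^\angle) + \Gamma)^\angle = (\Lambda \cap \Gamma^\angle)^\angle \cap \Gamma^\angle = (\Lambda + \Gamma) \cap \Gamma^\angle = \Lambda^\Gamma, \]
where we repeatedly used Exercise 12.5. (The identity $(\Lambda \cap \Gamma^\angle) + \Gamma = (\Lambda + \Gamma) \cap \Gamma^\angle$ is also a consequence of the same exercise.)

Remark 12.39. Let $\Gamma^\pitchfork = \{ \Lambda \in L(\Sigma),\ \Lambda \cap \Gamma = \{0\} \}$. The restriction $\pi^\Gamma|_{\Gamma^\pitchfork}$ is smooth. Indeed it can be shown that $\pi^\Gamma$ is defined by a rational function, since it is expressed via the solution of a linear system.

The following example shows that the projection $\pi^\Gamma$ is not globally continuous on $L(\Sigma)$.

Example 12.40. Consider the symplectic structure $\sigma$ on $\mathbb{R}^4$, with Darboux basis $\{e_1,e_2,f_1,f_2\}$, i.e. $\sigma(e_i,f_j) = \delta_{ij}$. Let $\Gamma = \mathrm{span}\{e_1\}$ be a one-dimensional isotropic subspace and define
\[ \Lambda_\varepsilon = \mathrm{span}\{e_1 + \varepsilon f_2,\ e_2 + \varepsilon f_1\}, \qquad \varepsilon \ge 0. \]
It is easy to see that $\Lambda_\varepsilon$ is Lagrangian for every $\varepsilon$ and that
\[ \Lambda_\varepsilon^\Gamma = \mathrm{span}\{e_1,f_2\}, \quad \varepsilon > 0, \qquad \Lambda_0^\Gamma = \mathrm{span}\{e_1,e_2\}. \tag{12.27} \]
Indeed $f_2 \in e_1^\angle$, which implies $e_1 + \varepsilon f_2 \in \Lambda_\varepsilon \cap \Gamma^\angle$, therefore $f_2 \in (\Lambda_\varepsilon \cap \Gamma^\angle) + \Gamma$. By definition of the reduction, $f_2 \in \Lambda_\varepsilon^\Gamma$ and (12.27) holds. The case $\varepsilon = 0$ is trivial.

12.5 Ample curves

In this section we introduce ample curves.

Definition 12.41. Let $\Lambda(t) \in L(\Sigma)$ be a smooth curve in the Lagrange Grassmannian. The curve $\Lambda(t)$ is ample at $t = t_0$ if there exists $N \in \mathbb{N}$ such that
\[ \Sigma = \mathrm{span}\{ \lambda^{(i)}(t_0) \mid \lambda(t) \in \Lambda(t),\ \lambda(t) \text{ smooth},\ 0 \le i \le N \}. \tag{12.28} \]
In other words we require that all derivatives up to order $N$ of all smooth sections of our curve in $L(\Sigma)$ span all the possible directions.

As usual, we can choose coordinates in such a way that, for some family of symmetric matrices $S(t)$, one has
\[ \Sigma = \{(p,q) \mid p,q \in \mathbb{R}^n\}, \qquad \Lambda(t) = \{(p,S(t)p) \mid p \in \mathbb{R}^n\}. \]

Exercise 12.42. Assume that $\Lambda(t) = \{(p,S(t)p),\ p \in \mathbb{R}^n\}$ with $S(0) = 0$. Prove that the curve is ample at $t = 0$ if and only if there exists $N \in \mathbb{N}$ such that all the columns of the derivatives of $S(t)$ up to order $N$ (computed at $t = 0$) span a maximal subspace:
\[ \mathrm{rank}\{\dot S(0), \ddot S(0), \ldots, S^{(N)}(0)\} = n. \tag{12.29} \]
In particular, a curve $\Lambda(t)$ is regular at $t_0$ if and only if it is ample at $t_0$ with $N = 1$.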

An important property of ample and monotone curves is described in the following lemma.

Lemma 12.43. Let $\Lambda(t) \in L(\Sigma)$ be a monotone curve, ample at $t_0$. Then there exists $\varepsilon > 0$ such that $\Lambda(t) \cap \Lambda(t_0) = \{0\}$ for $0 < |t - t_0| < \varepsilon$.

Proof. Without loss of generality, assume $t_0 = 0$. Choose a Lagrangian splitting $\Sigma = \Lambda \oplus \Pi$, with $\Lambda = \Lambda(0)$. For $|t| < \varepsilon$, the curve is contained in the chart defined by such a splitting. In coordinates, $\Lambda(t) = \{(p,S(t)p) \mid p \in \mathbb{R}^n\}$, with $S(t)$ symmetric and $S(0) = 0$. The curve is monotone, hence $\dot S(t)$ is a semidefinite symmetric matrix. It follows that $S(t)$ is semidefinite too.

Suppose that, for some $t$, $\Lambda(t) \cap \Lambda(0) \neq \{0\}$ (assume $t > 0$). This means that there exists $v \in \mathbb{R}^n$, $v \neq 0$, such that $S(t)v = 0$. Then also $v^* S(t) v = 0$. The function $\tau \mapsto v^* S(\tau) v$ is monotone and vanishes at $\tau = 0$ and $\tau = t$. Therefore $v^* S(\tau) v = 0$ for all $0 \le \tau \le t$. Since $S(\tau)$ is a semidefinite symmetric matrix, $v^* S(\tau) v = 0$ if and only if $S(\tau)v = 0$. Therefore, we conclude that $v \in \ker S(\tau)$ for $0 \le \tau \le t$. This implies that, for any $i \in \mathbb{N}$, $v \in \ker S^{(i)}(0)$, which is a contradiction, since the curve is ample at $0$.

Exercise 12.44. Prove that a monotone curve $\Lambda(t)$ is ample at $t_0$ if and only if one of the following equivalent conditions is satisfied:
(i) the family of matrices $S(t) - S(t_0)$ is nondegenerate for $t \neq t_0$ close enough to $t_0$, and the same remains true if we replace $S(t)$ by its $N$-th Taylor polynomial, for some $N$ in $\mathbb{N}$;
(ii) the map $t \mapsto \det(S(t) - S(t_0))$ has a finite order root at $t = t_0$.

Let us now consider an analytic monotone curve on $L(\Sigma)$. Without loss of generality we can assume the curve to be non-increasing, i.e. $\dot\Lambda(t) \le 0$. By monotonicity
\[ \Lambda(0) \cap \Lambda(t) = \bigcap_{0 \le \tau \le t} \Lambda(\tau) =: \Upsilon_t. \]
Clearly $\Upsilon_t$ is a decreasing family of subspaces, i.e. $\Upsilon_t \subset \Upsilon_\tau$ if $\tau \le t$. Hence the family $\Upsilon_t$ stabilizes for $t \to 0$ and the limit subspace $\Upsilon$ is well defined
\[ \Upsilon := \lim_{t \to 0} \Upsilon_t. \]
The symplectic reduction of the curve by the isotropic subspace $\Upsilon$ defines a new curve $\tilde\Lambda(t) := \Lambda(t)^\Upsilon \in L(\Upsilon^\angle/\Upsilon)$.

Proposition 12.45. If $\Lambda(t)$ is analytic and monotone in $L(\Sigma)$, then $\tilde\Lambda(t)$ is ample in $L(\Upsilon^\angle/\Upsilon)$.

Proof. By construction, in the reduced space $\Upsilon^\angle/\Upsilon$ we removed the intersection of $\Lambda(t)$ with $\Lambda(0)$. Hence
\[ \tilde\Lambda(0) \cap \tilde\Lambda(t) = \{0\} \quad \text{in } L(\Upsilon^\angle/\Upsilon). \tag{12.30} \]
In particular, if $\tilde S(t)$ denotes the symmetric matrix representing $\tilde\Lambda(t)$ in a coordinate chart where $\tilde S(0) = 0$, it follows that $\tilde S(t)$ is non-degenerate for $0 < |t| < \varepsilon$. The analyticity of the curve guarantees that the Taylor polynomial (of a suitable order $N$) is also non-degenerate.

12.6 From ample to regular

In this section we prove the main result of this chapter, i.e. that any ample monotone curve can be reduced to a regular one.

Theorem 12.46. Let $\Lambda(t)$ be a smooth ample monotone curve and set $\Gamma := \mathrm{Ker}\,\dot\Lambda(0)$. Then the reduced curve $t \mapsto \Lambda^\Gamma(t)$ is a smooth regular curve. In particular $\dot\Lambda^\Gamma(0) > 0$.

Before proving Theorem 12.46, let us discuss two useful lemmas.

Lemma 12.47. Let $v_1(t),\ldots,v_k(t) \in \mathbb{R}^n$ and define $V(t)$ as the $n \times k$ matrix whose columns are the vectors $v_i(t)$. Define the matrix
\[ S(t) := \int_0^t V(\tau) V(\tau)^* \,d\tau. \]
Then the following are equivalent:
(i) $S(t)$ is invertible (and positive definite),
(ii) $\mathrm{span}\{v_i(\tau) \mid i = 1,\ldots,k;\ \tau \in [0,t]\} = \mathbb{R}^n$.

Proof. Fix $t > 0$ and let us assume that (ii) holds while $S(t)$ is not invertible. Since $S(t)$ is non-negative, there exists a nonzero $x \in \mathbb{R}^n$ such that $\langle S(t)x,x \rangle = 0$. On the other hand
\[ \langle S(t)x,x \rangle = \int_0^t \langle V(\tau)V(\tau)^* x, x \rangle \,d\tau = \int_0^t |V(\tau)^* x|^2 \,d\tau. \]
This implies that $V(\tau)^* x = 0$ (or equivalently $x^* V(\tau) = 0$) for $\tau \in [0,t]$, i.e. the nonzero vector $x$ is orthogonal to
\[ \mathrm{Im}_{\tau \in [0,t]} V(\tau) = \mathrm{span}\{v_i(\tau) \mid i = 1,\ldots,k,\ \tau \in [0,t]\} = \mathbb{R}^n, \]
which is a contradiction. The converse is similar.

Lemma 12.48. Let $A, B$ be two positive symmetric matrices such that $0 < A < B$. Then we have also $0 < B^{-1} < A^{-1}$.

Proof. Assume first that $A$ and $B$ commute. Then $A$ and $B$ can be simultaneously diagonalized and the statement is trivial for diagonal matrices. In the general case, since $A$ is symmetric and positive, we can consider its square root $A^{1/2}$, which is also symmetric and positive. We can write
\[ 0 < \langle Av,v \rangle < \langle Bv,v \rangle. \]
By setting $w = A^{1/2}v$ in the above inequality and using $\langle Av,v \rangle = \langle A^{1/2}v, A^{1/2}v \rangle$ one gets
\[ 0 < \langle w,w \rangle < \langle A^{-1/2} B A^{-1/2} w, w \rangle, \]
which is equivalent to $I < A^{-1/2} B A^{-1/2}$. Since the identity matrix commutes with every other matrix, we obtain
\[ 0 < A^{1/2} B^{-1} A^{1/2} = (A^{-1/2} B A^{-1/2})^{-1} < I, \]
which is equivalent to $0 < B^{-1} < A^{-1}$, reasoning as before.
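The order-reversing property of matrix inversion stated in the last lemma can be observed on a concrete pair of non-commuting $2 \times 2$ matrices. This is a small illustrative check, not a proof: the matrices `A` and `B` are arbitrary sample data, and positivity is tested with Sylvester's criterion, which for symmetric $2 \times 2$ matrices amounts to a positive top-left entry and a positive determinant.

```python
from fractions import Fraction as F

def inv(A):
    """Inverse of an invertible 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def is_positive_definite(A):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return A[0][0] > 0 and A[0][0] * A[1][1] - A[0][1] * A[1][0] > 0

# Sample data (assumption for illustration): 0 < A < B, A and B do not commute.
A = [[F(2), F(1)], [F(1), F(2)]]
B = [[F(4), F(1)], [F(1), F(5)]]
```

Here $B - A = \mathrm{diag}(2,3) > 0$, and the difference $A^{-1} - B^{-1}$ turns out to be positive definite as well, as the lemma predicts.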

Proof of Theorem 12.46. By assumption the curve $t\mapsto\Lambda(t)$ is ample, hence $\Lambda(t)\cap\Gamma=\{0\}$ and $t\mapsto\Lambda^\Gamma(t)$ is smooth for $t>0$ small enough. We divide the proof into three parts:
(i) we compute the coordinate presentation of the reduced curve;
(ii) we show that the reduced curve, extended by continuity at $t=0$, is smooth;
(iii) we prove that the reduced curve is regular.

(i). Let us consider Darboux coordinates in the symplectic space $\Sigma$ such that
\[ \Sigma=\{(p,q): p,q\in\mathbb{R}^n\},\qquad \Lambda(t)=\{(p,S(t)p)\mid p\in\mathbb{R}^n\},\qquad S(0)=0. \]
Moreover we can also assume $\mathbb{R}^n=\mathbb{R}^k\oplus\mathbb{R}^{n-k}$, where $\Gamma=\{0\}\oplus\mathbb{R}^{n-k}$. According to this splitting we have the decomposition $p=(p_1,p_2)$ and $q=(q_1,q_2)$. The subspaces $\Gamma$ and $\Gamma^{\angle}$ are described by the equations
\[ \Gamma=\{(p,q): p_1=0,\ q=0\},\qquad \Gamma^{\angle}=\{(p,q): q_2=0\}, \]
and $(p_1,q_1)$ are natural coordinates for the reduced space $\Gamma^{\angle}/\Gamma$. Up to a symplectic change of coordinates preserving the splitting $\mathbb{R}^n=\mathbb{R}^k\oplus\mathbb{R}^{n-k}$ we can assume that
\[ S(t)=\begin{pmatrix} S_{11}(t) & S_{12}(t)\\ S_{12}^*(t) & S_{22}(t)\end{pmatrix},\qquad\text{with}\quad \dot S(0)=\begin{pmatrix} I_k & 0\\ 0 & 0\end{pmatrix}, \tag{12.31} \]
where $I_k$ is the $k\times k$ identity matrix. Finally, since $S$ is monotone and ample, which implies $S(t)>0$ for each $t>0$, it follows that
\[ S_{11}(t)>0,\qquad S_{22}(t)>0,\qquad \forall\, t>0. \tag{12.32} \]
Then we can compute the coordinate expression of the reduced curve, i.e. the matrix $S^\Gamma(t)$ such that
\[ \Lambda^\Gamma(t)=\{(p_1,S^\Gamma(t)p_1),\ p_1\in\mathbb{R}^k\}. \]
From the identity
\[ \Lambda(t)\cap\Gamma^{\angle}=\{(p,S(t)p),\ S(t)p\in\mathbb{R}^k\}=\left\{\left(S^{-1}(t)\begin{pmatrix}q_1\\0\end{pmatrix},\begin{pmatrix}q_1\\0\end{pmatrix}\right),\ q_1\in\mathbb{R}^k\right\} \tag{12.33} \]
one gets the key relation $S^\Gamma(t)^{-1}=(S(t)^{-1})_{11}$. Thus the matrix expression of the reduced curve $\Lambda^\Gamma(t)$ in $L(\Gamma^{\angle}/\Gamma)$ is recovered simply by considering it as a map of $(p_1,q_1)$ only, i.e.
\[ S(t)p=\begin{pmatrix} S_{11} & S_{12}\\ S_{12}^* & S_{22}\end{pmatrix}\begin{pmatrix}p_1\\p_2\end{pmatrix}=\begin{pmatrix} S_{11}p_1+S_{12}p_2\\ S_{12}^*p_1+S_{22}p_2\end{pmatrix}, \]
from which we get $S(t)p\in\mathbb{R}^k$ if and only if $S_{12}^*(t)p_1+S_{22}(t)p_2=0$. This means
\[ \Lambda^\Gamma(t)=\{(p_1,S_{11}p_1+S_{12}p_2): S_{12}^*(t)p_1+S_{22}(t)p_2=0\}=\{(p_1,(S_{11}-S_{12}S_{22}^{-1}S_{12}^*)p_1)\}, \]
that is,
\[ S^\Gamma=S_{11}-S_{12}S_{22}^{-1}S_{12}^*. \tag{12.34} \]

(ii).
By the coordinate presentation of $S^\Gamma(t)$ the only term that can give rise to singularities is the inverse matrix $S_{22}^{-1}(t)$. In particular, since by assumption $t\mapsto\det S_{22}(t)$ has a finite order zero at $t=0$, the a priori singularity can only be a finite order pole.
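The coordinate formula (12.34) and the key relation $S^\Gamma(t)^{-1}=(S(t)^{-1})_{11}$ can be tried on the smallest case $n=2$, $k=1$. The sketch below assumes the hypothetical family $S(t)=\int_0^t v(\tau)v(\tau)^*d\tau$ with $v(\tau)=(1,\tau)$, which is monotone, satisfies $\dot S(0)=\operatorname{diag}(1,0)$ as in (12.31), and has $S_{22}^{-1}(t)=3/t^3$ singular at $t=0$. Exact rational arithmetic shows that the pole cancels and $S^\Gamma(t)=t/4$.

```python
from fractions import Fraction


def S(t):
    """Hypothetical example: S(t) = int_0^t (1, tau)(1, tau)^T d tau, as a 2x2 matrix."""
    t = Fraction(t)
    return [[t, t**2 / 2], [t**2 / 2, t**3 / 3]]


def reduced(Sm):
    """Formula (12.34) for scalar blocks: S_11 - S_12 * S_22^{-1} * S_12."""
    (s11, s12), (_, s22) = Sm
    return s11 - s12 * s12 / s22


def inv11(Sm):
    """Upper-left entry of the inverse of a 2x2 matrix."""
    (s11, s12), (s21, s22) = Sm
    return s22 / (s11 * s22 - s12 * s21)


samples = [Fraction(1, 10), Fraction(1, 2), Fraction(2)]
values = [reduced(S(t)) for t in samples]  # equals [t / 4 for t in samples]
```

In this example the reduced curve has derivative $1/4>0$ at $t=0$, so it is regular although $\dot S(0)$ is degenerate.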

To prove that the curve is smooth it is enough to show that $S^\Gamma(t)\to 0$ for $t\to 0$, i.e. the curve remains bounded. This follows from the following

Claim I. As quadratic forms on $\mathbb{R}^k$, we have the inequality $0\le S^\Gamma(t)\le S_{11}(t)$.

Indeed, since $S(t)$ is symmetric and positive, its inverse $S(t)^{-1}$ is symmetric and positive as well. This implies that $S^\Gamma(t)^{-1}=(S(t)^{-1})_{11}>0$, and so is $S^\Gamma(t)$. This proves the left inequality of Claim I. Moreover, using (12.34) and the fact that $S_{22}$ is positive definite (and so is $S_{22}^{-1}$), one gets
\[ \langle(S_{11}-S^\Gamma)p_1,p_1\rangle=\langle S_{12}S_{22}^{-1}S_{12}^*p_1,p_1\rangle=\langle S_{22}^{-1}(S_{12}^*p_1),(S_{12}^*p_1)\rangle\ge 0. \]
Since $S(t)\to 0$ for $t\to 0$, clearly $S_{11}(t)\to 0$ when $t\to 0$, which proves that $S^\Gamma(t)\to 0$ also.

(iii). We are reduced to showing that the derivative of $t\mapsto S^\Gamma(t)$ at $0$ is a nondegenerate matrix, which is equivalent to showing that $t\mapsto S^\Gamma(t)^{-1}$ has a simple pole at $t=0$. We need the following lemma, whose proof is postponed to the end of the proof of Theorem 12.46.

Lemma. Let $A(t)$ be a smooth family of symmetric nonnegative $n\times n$ matrices. If the condition $\operatorname{rank}(A,\dot A,\dots,A^{(N)})|_{t=0}=n$ is satisfied for some $N$, then there exists $\varepsilon_0>0$ such that
\[ \varepsilon tA(0)<\int_0^t A(\tau)\,d\tau \]
for all $\varepsilon<\varepsilon_0$ and $t>0$ small enough.

Applying the Lemma to the family $A(t)=\dot S(t)$ one obtains (see also (12.31))
\[ \langle S(t)p,p\rangle>\varepsilon t|p_1|^2 \]
for all $0<\varepsilon<\varepsilon_0$, any $p\in\mathbb{R}^n$ and any small time $t>0$. Now let $p_1\in\mathbb{R}^k$ be arbitrary and extend it to a vector $p=(p_1,p_2)\in\mathbb{R}^n$ such that $(p,S(t)p)\in\Lambda(t)\cap\Gamma^{\angle}$ (i.e. $S(t)p=(q_1,0)^T$ or equivalently $S(t)^{-1}(q_1,0)=(p_1,p_2)$). This implies in particular that $S^\Gamma(t)p_1=q_1$ and
\[ \langle S^\Gamma(t)p_1,p_1\rangle=\langle S(t)p,p\rangle\ge\varepsilon t|p_1|^2. \]
This inequality can be rewritten as $S^\Gamma(t)>\varepsilon tI_k>0$ and implies, by the lemma above on inverses of ordered positive matrices,
\[ 0<S^\Gamma(t)^{-1}<\frac{1}{\varepsilon t}I_k, \]
which completes the proof.

Proof of the Lemma. We reduce the proof of the Lemma to the following statement:

Claim II. There exist $c,N>0$ such that for any sufficiently small $\varepsilon,t>0$
\[ \det\left(\int_0^t A(\tau)-\varepsilon A(0)\,d\tau\right)>ct^N. \]
Moreover $c,N$ depend only on the $2N$-th Taylor polynomial of $A(t)$.

Indeed fix $t_0>0$.
Since $A(t)\ge 0$ and $A(t)$ is not the zero family, then $\int_0^{t_0}A(\tau)\,d\tau>0$. Hence, for a fixed $t_0$, there exists $\varepsilon$ small enough such that $\int_0^{t_0}A(\tau)-\varepsilon A(0)\,d\tau>0$. Assume now that the matrix $S_t=\int_0^t A(\tau)-\varepsilon A(0)\,d\tau$ is not strictly positive for some $0<t<t_0$; then $\det S_\tau=0$ for some $\tau\in[t,t_0]$, which contradicts Claim II.

We now prove Claim II. We may assume that $t\mapsto A(t)$ is analytic. Indeed, by continuity of the determinant, the statement remains true if we substitute $A(t)$ by its Taylor polynomial of sufficiently big order.

An analytic one parameter family of symmetric matrices $t\mapsto A(t)$ can be simultaneously diagonalized (see ??), in the sense that there exists an analytic (with respect to $t$) family of vectors $v_i(t)$, with $i=1,\dots,n$, such that
\[ \langle A(t)x,x\rangle=\sum_{i=1}^n\langle v_i(t),x\rangle^2. \]
In other words $A(t)=V(t)V(t)^*$, where $V(t)$ is the $n\times n$ matrix whose columns are the vectors $v_i(t)$. (Notice that some of these vectors can vanish at $0$ or even vanish identically.)

Let us now consider the flag $E_1\subset E_2\subset\dots\subset E_N=\mathbb{R}^n$ defined as follows
\[ E_i=\operatorname{span}\{v_j^{(l)},\ 1\le j\le n,\ l\le i\}. \]
Notice that this flag is finite by our assumption on the rank of the consecutive derivatives of $A(t)$, and $N$ is the same as in the statement of the Lemma. We then choose coordinates in $\mathbb{R}^n$ adapted to this flag (i.e. the spaces $E_i$ are coordinate subspaces) and define the following integers (here $e_1,\dots,e_n$ is the standard basis of $\mathbb{R}^n$)
\[ m_i=\min\{j: e_i\in E_j\},\qquad i=1,\dots,n. \]
In other words, when written in this new coordinate set, $m_i$ is the order of the first nonzero term in the Taylor expansion of the $i$-th row of the matrix $V(t)$. Then we introduce a quasi-homogeneous family of matrices $\widehat V(t)$: the $i$-th row of $\widehat V(t)$ is the $m_i$-homogeneous part of the $i$-th row of $V(t)$. Then we define $\widehat A(t):=\widehat V(t)\widehat V(t)^*$. The columns of the matrix $\widehat V(t)$ satisfy the assumption of Lemma 12.47, hence $\int_0^t\widehat A(\tau)\,d\tau>0$ for every $t>0$.

If we denote the entries $A(t)=\{a_{ij}(t)\}_{i,j=1}^n$ and $\widehat A(t)=\{\hat a_{ij}(t)\}_{i,j=1}^n$ we obtain
\[ \hat a_{ij}(t)=c_{ij}t^{m_i+m_j},\qquad a_{ij}(t)=\hat a_{ij}(t)+O(t^{m_i+m_j+1}), \]
for suitable constants $c_{ij}$ (some of them may be zero). Then we let $A^\varepsilon(t):=A(t)-\varepsilon A(0)=\{a^\varepsilon_{ij}(t)\}_{i,j=1}^n$. Of course $a^\varepsilon_{ij}(t)=c^\varepsilon_{ij}t^{m_i+m_j}+O(t^{m_i+m_j+1})$ where
\[ c^\varepsilon_{ij}=\begin{cases}(1-\varepsilon)c_{ij}, & \text{if } m_i+m_j=0,\\ c_{ij}, & \text{if } m_i+m_j>0.\end{cases} \]
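From the equality
\[ \int_0^t a^\varepsilon_{ij}(\tau)\,d\tau=t^{m_i+m_j+1}\left(\frac{c^\varepsilon_{ij}}{m_i+m_j+1}+O(t)\right) \]
one gets
\[ \det\left(\int_0^t A^\varepsilon(\tau)\,d\tau\right)=t^{\,n+2\sum_{i=1}^n m_i}\left(\det\left(\frac{c^\varepsilon_{ij}}{m_i+m_j+1}\right)+O(t)\right). \]
On the other hand
\[ \det\left(\int_0^t\widehat A(\tau)\,d\tau\right)=t^{\,n+2\sum_{i=1}^n m_i}\left(\det\left(\frac{c_{ij}}{m_i+m_j+1}\right)+O(t)\right)>0, \]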

hence $\det\left(\frac{c^\varepsilon_{ij}}{m_i+m_j+1}\right)>0$ for small $\varepsilon$. The proof is completed by setting
\[ c:=\det\left(\frac{c_{ij}}{m_i+m_j+1}\right),\qquad N:=n+2\sum_{i=1}^n m_i. \]

12.7 Conjugate points in L(Σ)

In this section we introduce the notion of conjugate point for a curve in the Lagrange Grassmannian. In the next chapter we explain why this notion coincides with the one given for extremal paths in sub-Riemannian geometry.

Definition. Let $\Lambda(t)$ be a monotone curve in $L(\Sigma)$. We say that $\Lambda(t)$ is conjugate to $\Lambda(0)$ if $\Lambda(t)\cap\Lambda(0)\neq\{0\}$.

As a consequence of Lemma 12.43, we have the following immediate corollary.

Corollary. Conjugate points on a monotone and ample curve in $L(\Sigma)$ are isolated.

The following two results describe general properties of conjugate points.

Theorem. Let $\Lambda(t),\Delta(t)$ be two ample monotone curves in $L(\Sigma)$ defined on $\mathbb{R}$ such that
(i) $\Sigma=\Lambda(t)\oplus\Delta(t)$ for every $t\ge 0$,
(ii) $\dot\Lambda(t)\ge 0$, $\dot\Delta(t)\le 0$, as quadratic forms.
Then there exists no $\tau>0$ such that $\Lambda(\tau)$ is conjugate to $\Lambda(0)$. Moreover there exists $\lim_{t\to+\infty}\Lambda(t)=\Lambda(\infty)$.

Proof. Fix coordinates induced by some Lagrangian splitting of $\Sigma$ in such a way that $S_{\Lambda(0)}=0$ and $S_{\Delta(0)}=I$. The monotonicity assumption implies that $t\mapsto S_{\Lambda(t)}$ (resp. $t\mapsto S_{\Delta(t)}$) is a monotone increasing (resp. decreasing) curve in the space of symmetric matrices. Moreover the transversality of $\Lambda(t)$ and $\Delta(t)$ implies that $S_{\Delta(t)}-S_{\Lambda(t)}$ is a nondegenerate matrix for all $t$. Hence
\[ 0<S_{\Lambda(t)}<S_{\Delta(t)}<I,\qquad\text{for all } t>0. \]
In particular $\Lambda(t)$ never leaves the coordinate neighborhood under consideration, the subspace $\Lambda(t)$ is always transversal to $\Lambda(0)$ for $t>0$ and has a limit $\Lambda(\infty)$ whose coordinate representation is $S_{\Lambda(\infty)}=\lim_{t\to+\infty}S_{\Lambda(t)}$.

Theorem. Let $\Lambda_s(t)$, for $t,s\in[0,1]$, be a homotopy of curves in $L(\Sigma)$ such that $\Lambda_s(0)=\Lambda$ for $s\in[0,1]$. Assume that
(i) $\Lambda_s(\cdot)$ is monotone and ample for every $s\in[0,1]$,
(ii) $\Lambda_0(\cdot)$, $\Lambda_1(\cdot)$ and $\Lambda_s(1)$, for $s\in[0,1]$, contain no conjugate points to $\Lambda$.
Then no curve $t\mapsto\Lambda_s(t)$ contains conjugate points to $\Lambda$.

Proof. Let us consider the open chart $\Lambda^{\pitchfork}$ defined by all the Lagrangian subspaces transversal to $\Lambda$. The statement is equivalent to proving that $\Lambda_s(t)\in\Lambda^{\pitchfork}$ for all $t>0$ and $s\in[0,1]$. Let us fix coordinates induced by some Lagrangian splitting $\Sigma=\Lambda\oplus\Delta$ in such a way that $\Lambda=\{(p,0)\}$ and $\Lambda_s(t)=\{(B_s(t)q,q)\}$ for all $s$ and $t>0$ (at least for $t$ small enough; indeed by ampleness $\Lambda_s(t)\in\Lambda^{\pitchfork}$ for $t$ small). Moreover we can assume that $B_s(t)$ is a monotone increasing family of symmetric matrices.

Notice that $x^TB_s(\tau)x\to-\infty$ for every $x\in\mathbb{R}^n$ when $\tau\to 0^+$, due to the fact that $\Lambda_s(0)=\Lambda$ is out of the coordinate chart. Moreover, a necessary condition for $\Lambda_s(t)$ to be conjugate to $\Lambda$ is that there exists a nonzero $x$ such that $x^TB_s(\tau)x\to\infty$ for $\tau\to t$. It is then enough to show that, for all $x\in\mathbb{R}^n$, the function $(t,s)\mapsto x^TB_s(t)x$ is bounded. Indeed by assumption $t\mapsto x^TB_0(t)x$ and $t\mapsto x^TB_1(t)x$ are monotone increasing and bounded up to $t=1$. Hence the continuous family of values $M_s:=x^TB_s(1)x$ is well defined and bounded for all $s$. The monotonicity implies that actually $x^TB_s(t)x<+\infty$ for all values of $t,s\in[0,1]$. (See also Figure 12.1.)

[Figure 12.1: Proof of the Theorem. The plot shows $x^TB_s(t)x$ as a function of $s$, with the values $x^TB_0(1)x$, $x^TB_s(1)x$ and $x^TB_1(1)x$ marked.]

Comparison theorems for regular curves

In this last section we prove two comparison theorems for regular monotone curves in the Lagrange Grassmannian.

Corollary. Let $\Lambda(t)$ be a monotone and regular curve in the Lagrange Grassmannian such that $R_\Lambda(t)\le 0$. Then $\Lambda(t)$ contains no conjugate points to $\Lambda(0)$.

Proof. This is a direct consequence of Theorem

Theorem. Let $\Lambda(t)$ be a monotone and regular curve in the Lagrange Grassmannian. Assume that there exists $k\ge 0$ such that for all $t\ge 0$
(i) $R_\Lambda(t)\le k\,\mathrm{Id}$. Then, if $\Lambda(t)$ is conjugate to $\Lambda(0)$, we have $t\ge\frac{\pi}{\sqrt{k}}$.
(ii) $\frac{1}{n}\operatorname{trace}R_\Lambda(t)\ge k$. Then for every $t\ge 0$ there exists $\tau\in[t,t+\frac{\pi}{\sqrt{k}}]$ such that $\Lambda(\tau)$ is conjugate to $\Lambda(0)$.

We stress that assumption (i) means that all the eigenvalues of $R_\Lambda(t)$ are smaller than or equal to $k$, while (ii) requires only that the average of the eigenvalues is bigger than or equal to $k$.

Remark. Notice that the estimates of the Theorem are sharp, as is immediately seen by considering the example of a 1-dimensional curve of constant velocity (see Example 12.35).

Proof. (i). Consider the real function
\[ \varphi:\mathbb{R}\to\left]0,\frac{\pi}{\sqrt{k}}\right[,\qquad \varphi(t)=\frac{1}{\sqrt{k}}\left(\arctan(\sqrt{k}\,t)+\frac{\pi}{2}\right). \]
Using that $\dot\varphi(t)=(1+kt^2)^{-1}$ it is easy to show that the Schwarzian derivative of $\varphi$ is
\[ R_\varphi(t)=-\frac{k}{(1+kt^2)^2}. \]
Thus using $\varphi$ as a reparametrization we find, by the Proposition giving the curvature of a reparametrized curve,
\[ R_{\Lambda_\varphi}(t)=\dot\varphi^2R_\Lambda(\varphi(t))+R_\varphi(t)\,\mathrm{Id}=\frac{1}{(1+kt^2)^2}\left(R_\Lambda(\varphi(t))-k\,\mathrm{Id}\right)\le 0. \]
By the Corollary above the curve $\Lambda_\varphi$ has no conjugate points, i.e. $\Lambda$ has no conjugate points in the interval $]0,\frac{\pi}{\sqrt{k}}[$.

(ii). We prove the claim by showing that the curve $\Lambda(t)$, on every interval of length $\pi/\sqrt{k}$, has nontrivial intersection with every subspace (hence in particular with $\Lambda(0)$). This is equivalent to proving that $\Lambda(t)$ is not contained in a single coordinate chart for a whole interval of length $\pi/\sqrt{k}$. Assume by contradiction that $\Lambda(t)$ is contained in one coordinate chart. Then there exist coordinates such that $\Lambda(t)=\{(p,S(t)p)\}$ and we can write the coordinate expression for the curvature:
\[ R_\Lambda(t)=\dot B(t)-B(t)^2,\qquad\text{where}\quad B(t)=(2\dot S(t))^{-1}\ddot S(t). \]
Let now $b(t):=\operatorname{trace}B(t)$. Computing the trace in both sides of the equality $\dot B(t)=B^2(t)+R_\Lambda(t)$ we get
\[ \dot b(t)=\operatorname{trace}(B^2(t))+\operatorname{trace}R_\Lambda(t). \tag{12.35} \]

Lemma. For every $n\times n$ symmetric matrix $S$ the following inequality holds true
\[ \operatorname{trace}(S^2)\ge\frac{1}{n}(\operatorname{trace}S)^2. \tag{12.36} \]

Proof. For every symmetric matrix $S$ there exists a matrix $M$ such that $MSM^{-1}=D$ is diagonal. Since $\operatorname{trace}(MAM^{-1})=\operatorname{trace}(A)$ for every matrix $A$, it is enough to prove the inequality (12.36) for a diagonal matrix $D=\operatorname{diag}(\lambda_1,\dots,\lambda_n)$. In this case (12.36) reduces to the Cauchy-Schwarz inequality
\[ \sum_{i=1}^n\lambda_i^2\ge\frac{1}{n}\left(\sum_{i=1}^n\lambda_i\right)^2. \]

Applying the Lemma to (12.35) and using the assumption (ii) one gets
\[ \dot b(t)\ge\frac{1}{n}b^2(t)+nk. \tag{12.37} \]
By standard results in ODE theory we have $b(t)\ge\varphi(t)$, where $\varphi(t)$ is the solution of the differential equation
\[ \dot\varphi(t)=\frac{1}{n}\varphi^2(t)+nk. \tag{12.38} \]
The solution of (12.38), with initial datum $\varphi(t_0)=0$, is explicit and given by
\[ \varphi(t)=n\sqrt{k}\tan(\sqrt{k}(t-t_0)). \]
This solution is defined on an interval of measure $\pi/\sqrt{k}$. Thus the inequality $b(t)\ge\varphi(t)$ completes the proof.
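The two ingredients of part (ii) can be checked numerically. The sketch below is illustrative only: the matrix `S` and the values `n = 3`, `k = 2` are hypothetical, and the derivative of $\varphi$ is approximated by a central finite difference. It evaluates the gap in (12.36), the residual of (12.38) for $\varphi(t)=n\sqrt{k}\tan(\sqrt{k}(t-t_0))$, and the blow-up time $\pi/(2\sqrt{k})$ after $t_0$.

```python
import math


def trace_gap(S):
    """trace(S^2) - (trace S)^2 / n for a symmetric matrix S given as nested lists."""
    n = len(S)
    trace_sq = sum(S[i][j] * S[j][i] for i in range(n) for j in range(n))
    trace = sum(S[i][i] for i in range(n))
    return trace_sq - trace ** 2 / n


def phi(t, n, k, t0=0.0):
    """Explicit solution of (12.38) with phi(t0) = 0."""
    return n * math.sqrt(k) * math.tan(math.sqrt(k) * (t - t0))


def riccati_residual(t, n, k, h=1e-6):
    """phi'(t) - (phi(t)^2 / n + n k), with phi' from a central difference."""
    dphi = (phi(t + h, n, k) - phi(t - h, n, k)) / (2 * h)
    return dphi - (phi(t, n, k) ** 2 / n + n * k)


# Hypothetical data for the check.
S = [[1.0, 2.0, 0.0], [2.0, -1.0, 3.0], [0.0, 3.0, 4.0]]
gap = trace_gap(S)                              # nonnegative by (12.36)
residual = riccati_residual(0.3, n=3, k=2.0)    # close to zero
blow_up_time = math.pi / (2 * math.sqrt(2.0))   # phi diverges as t approaches this value
```

Since $\tan$ is finite only on an interval of length $\pi$, the solution lives on an interval of length $\pi/\sqrt{k}$, which is the length appearing in the statement.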

Chapter 13

Jacobi curves

Now we are ready to introduce the main object of this part of the book, i.e. the Jacobi curve associated with a normal extremal. Heuristically, we would like to extract geometric properties of the sub-Riemannian structure by studying the symplectic invariants of its geodesic flow, that is the flow of $\vec H$. The simplest idea is to look for invariants in its linearization. As we explain in the next sections, this object is naturally related to geodesic variations, and generalizes the notion of Jacobi fields in Riemannian geometry to more general geometric structures.

In this chapter we consider a sub-Riemannian structure $(M,U,f)$ on a smooth $n$-dimensional manifold $M$ and we denote as usual by $H:T^*M\to\mathbb{R}$ its sub-Riemannian Hamiltonian.

From Jacobi fields to Jacobi curves

Fix a covector $\lambda\in T^*M$, with $\pi(\lambda)=q$, and consider the normal extremal starting from $q$ and associated with $\lambda$, i.e.
\[ \lambda(t)=e^{t\vec H}(\lambda),\qquad \gamma(t)=\pi(\lambda(t)) \]
(i.e. $\lambda(t)\in T^*_{\gamma(t)}M$). For any $\xi\in T_\lambda(T^*M)$ we can define a vector field along the extremal $\lambda(t)$ as follows
\[ X(t):=e^{t\vec H}_*\xi\in T_{\lambda(t)}(T^*M). \]
The set of vector fields obtained in this way is a $2n$-dimensional vector space, which is the space of Jacobi fields along the extremal. For a Hamiltonian $H$ corresponding to a Riemannian structure, the projection $\pi_*$ gives an isomorphism between the space of Jacobi fields along the extremal and the classical space of Jacobi fields along the geodesic $\gamma(t)=\pi(\lambda(t))$. Notice that this definition, equivalent to the standard one in Riemannian geometry, does not need curvature or connection, and can be extended naturally to any strongly normal sub-Riemannian geodesic.

In Riemannian geometry, the study of one half of this vector space, namely the subspace of classical Jacobi fields vanishing at zero, carries information about conjugate points along the given geodesic. By the aforementioned isomorphism, this corresponds to the subspace of Jacobi fields along the extremal such that $\pi_*X(0)=0$.
This motivates the following construction: For

any $\lambda\in T^*M$, we denote by $V_\lambda:=\ker\pi_*|_\lambda$ the vertical subspace. We could study the whole family of (classical) Jacobi fields (vanishing at zero) by means of the family of subspaces along the extremal
\[ L(t):=e^{t\vec H}_*V_\lambda\subset T_{\lambda(t)}(T^*M). \]
Notice that actually, $e^{t\vec H}_*$ being a symplectic transformation and $V_\lambda$ a Lagrangian subspace, the subspace $L(t)$ is a Lagrangian subspace of $T_{\lambda(t)}(T^*M)$.

Jacobi curves

The theory of curves in the Lagrange Grassmannian developed in Chapter 12 is an efficient tool to study families of Lagrangian subspaces contained in a single symplectic vector space. It is then convenient to modify the construction of the previous section in order to collect the information about the linearization of the Hamiltonian flow into a family of Lagrangian subspaces at a fixed tangent space.

By definition, the pushforward of the flow of $\vec H$ maps the tangent space to $T^*M$ at the point $\lambda(t)$ back to the tangent space to $T^*M$ at $\lambda$:
\[ e^{-t\vec H}_*:T_{\lambda(t)}(T^*M)\to T_\lambda(T^*M). \]
If we then restrict the action of the pushforward $e^{-t\vec H}_*$ to the vertical subspace at $\lambda(t)$, i.e. the tangent space $T_{\lambda(t)}(T^*_{\gamma(t)}M)$ at the point $\lambda(t)$ to the fiber $T^*_{\gamma(t)}M$, we define a one parameter family of $n$-dimensional subspaces in the $2n$-dimensional vector space $T_\lambda(T^*M)$. This family of subspaces is a curve in the Lagrange Grassmannian $L(T_\lambda(T^*M))$.

Notation. In the following we use the notation $V_\lambda:=T_\lambda(T^*_qM)$ for the vertical subspace at the point $\lambda\in T^*M$, i.e. the tangent space at $\lambda$ to the fiber $T^*_qM$, where $q=\pi(\lambda)$. Being the tangent space to a vector space, sometimes it will be useful to identify the vertical space $V_\lambda$ with the vector space itself, namely $V_\lambda\simeq T^*_qM$.

Definition. Let $\lambda\in T^*M$. The Jacobi curve at the point $\lambda$ is defined as follows
\[ J_\lambda(t):=e^{-t\vec H}_*V_{\lambda(t)}, \tag{13.1} \]
where $\lambda(t):=e^{t\vec H}(\lambda)$ and $\gamma(t)=\pi(\lambda(t))$. Notice that $J_\lambda(t)\subset T_\lambda(T^*M)$ and $J_\lambda(0)=V_\lambda=T_\lambda(T^*_qM)$ is vertical.

As discussed in Chapter 12, the tangent vector to a curve in the Lagrange Grassmannian can be interpreted as a quadratic form.
In the case of a Jacobi curve $J_\lambda(t)$ its tangent vector is a quadratic form $\dot J_\lambda(t):J_\lambda(t)\to\mathbb{R}$.

Proposition 13.2. The Jacobi curve $J_\lambda(t)$ satisfies the following properties:
(i) $J_\lambda(t+s)=e^{-t\vec H}_*J_{\lambda(t)}(s)$, for all $t,s\ge 0$,
(ii) $\dot J_\lambda(0)=-2H|_{T^*_qM}$ as quadratic forms on $V_\lambda\simeq T^*_qM$,
(iii) $\operatorname{rank}\dot J_\lambda(t)=\operatorname{rank}H|_{T^*_{\gamma(t)}M}$.

Proof. Claim (i) is a consequence of the semigroup property of the family $\{e^{t\vec H}\}_{t\ge 0}$.

To prove (ii), introduce canonical coordinates $(p,x)$ in the cotangent bundle. Fix $\xi\in V_\lambda$. The smooth family of vectors defined by $\xi(t)=e^{-t\vec H}_*\xi$ (considering $\xi$ as a constant vertical vector field) is a smooth extension of $\xi$, i.e. it satisfies $\xi(0)=\xi$ and $\xi(t)\in J_\lambda(t)$. Therefore, by (12.8)
\[ \dot J_\lambda(0)\xi=\sigma(\xi,\dot\xi)=\sigma\left(\xi,\frac{d}{dt}\bigg|_{t=0}e^{-t\vec H}_*\xi\right)=\sigma(\xi,[\vec H,\xi]). \tag{13.2} \]
To compute the last quantity we use the following elementary, although very useful, property of the symplectic form $\sigma$.

Lemma 13.3. Let $\xi\in V_\lambda$ be a vertical vector. Then, for any $\eta\in T_\lambda(T^*M)$
\[ \sigma(\xi,\eta)=\langle\xi,\pi_*\eta\rangle, \tag{13.3} \]
where we used the canonical identification $V_\lambda=T^*_qM$.

Proof. In any Darboux basis induced by canonical local coordinates $(p,x)$ on $T^*M$, we have $\sigma=\sum_{i=1}^n dp_i\wedge dx_i$ and $\xi=\sum_{i=1}^n\xi^i\partial_{p_i}$. The result follows immediately.

To complete the proof of point (ii) it is enough to compute in coordinates
\[ \pi_*[\vec H,\xi]=\pi_*\left[\frac{\partial H}{\partial p}\frac{\partial}{\partial x}-\frac{\partial H}{\partial x}\frac{\partial}{\partial p},\ \xi\frac{\partial}{\partial p}\right]=-\frac{\partial^2H}{\partial p^2}\xi\,\frac{\partial}{\partial x}. \]
Hence by Lemma 13.3 and the fact that $H$ is quadratic on fibers one gets
\[ \sigma(\xi,[\vec H,\xi])=-\left\langle\xi,\frac{\partial^2H}{\partial p^2}\xi\right\rangle=-2H(\xi). \]

(iii). The statement for $t=0$ is a direct consequence of (ii). Using property (i) it is easily seen that the quadratic forms associated with the derivatives at different times are related by the formula
\[ \dot J_\lambda(t)\circ e^{-t\vec H}_*=\dot J_{\lambda(t)}(0). \tag{13.4} \]
Since $e^{-t\vec H}_*$ is a symplectic transformation, it preserves the sign and the rank of the quadratic form.$^1$

Remark. Notice that claim (iii) of Proposition 13.2 implies that the rank of the derivative of the Jacobi curve is equal to the rank of the sub-Riemannian structure. Hence the curve is regular if and only if it is associated with a Riemannian structure. In this case of course it is strictly monotone, namely $\dot J_\lambda(t)<0$ for all $t$.

Corollary. The Jacobi curve $J_\lambda(t)$ associated with a sub-Riemannian extremal is monotone nonincreasing for every $\lambda\in T^*M$.

$^1$ Notice that $\dot J_\lambda(t)$, $\dot J_{\lambda(t)}(0)$ are defined on $J_\lambda(t)$, $J_{\lambda(t)}(0)$ respectively, and $J_\lambda(t)=e^{-t\vec H}_*J_{\lambda(t)}(0)$.
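Property (ii) can be made concrete in a toy model. The sketch below assumes a hypothetical constant-coefficient Hamiltonian $H(p,x)=\frac12 p^TGp$ on $\mathbb{R}^2\times\mathbb{R}^2$ with a rank-one matrix `G`; this is not a bracket-generating sub-Riemannian structure, it only mimics a degenerate quadratic Hamiltonian. The flow is $(p,x)\mapsto(p,x+tGp)$, so a vertical vector $(\xi,0)$ is sent by $e^{-t\vec H}_*$ to $(\xi,-tG\xi)$, and $\sigma(\xi,\dot\xi)$ can be computed directly with $\sigma=\sum dp_i\wedge dx_i$.

```python
G = [[1.0, 0.0], [0.0, 0.0]]  # hypothetical rank-1 cometric, constant in x


def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigma(u, v):
    """Symplectic form dp ^ dx on pairs u = (p, x), v = (p', x')."""
    return dot(u[0], v[1]) - dot(v[0], u[1])


def hamiltonian(p):
    return 0.5 * dot(p, matvec(G, p))


def jacobi_vector(xi, t):
    """Image of the vertical vector (xi, 0) under the differential of the time -t flow."""
    return (list(xi), [-t * g for g in matvec(G, xi)])


def jacobi_velocity_form(xi, h=1e-3):
    """sigma(xi(0), d/dt xi(t) at t = 0), with the derivative taken by a forward difference."""
    a, b = jacobi_vector(xi, 0.0), jacobi_vector(xi, h)
    deriv = ([(bp - ap) / h for ap, bp in zip(a[0], b[0])],
             [(bx - ax) / h for ax, bx in zip(a[1], b[1])])
    return sigma(a, deriv)


xi = [2.0, 5.0]
form_value = jacobi_velocity_form(xi)   # compare with -2 * hamiltonian(xi)
```

In this model the quadratic form vanishes on the direction $(0,1)$, in line with (iii): its rank equals the rank of $H$ on the fiber.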

13.2 Conjugate points and optimality

At this stage we have two possible definitions for conjugate points along normal geodesics. On one hand we have singular points of the exponential map along the extremal path, on the other hand we can consider conjugate points of the associated Jacobi curve. The next result shows that actually the two definitions coincide.

Proposition 13.6. Let $\gamma(t)=\mathcal{E}_q(t\lambda)$ be a normal geodesic starting from $q$ with initial covector $\lambda$. Denote by $J_\lambda(t)$ its Jacobi curve. Then for $s>0$
\[ \gamma(s)\ \text{is conjugate to}\ \gamma(0)\iff J_\lambda(s)\ \text{is conjugate to}\ J_\lambda(0). \]

Proof. By Definition 7.33, $\gamma(s)$ is conjugate to $\gamma(0)$ if $s\lambda$ is a critical point of the exponential map $\mathcal{E}_q$. This is equivalent to saying that the differential of the map from $T^*_qM$ to $M$ defined by $\lambda\mapsto\pi\circ e^{s\vec H}(\lambda)$ is not surjective at the point $\lambda$, i.e. the image of the differential $e^{s\vec H}_*$ has a nontrivial intersection with the kernel of the projection $\pi_*$:
\[ e^{s\vec H}_*J_\lambda(0)\cap T_{\lambda(s)}T^*_{\gamma(s)}M\neq\{0\}. \tag{13.5} \]
Applying the linear invertible transformation $e^{-s\vec H}_*$ to both subspaces one gets that (13.5) is equivalent to
\[ J_\lambda(0)\cap J_\lambda(s)\neq\{0\}, \]
which means by definition that $J_\lambda(s)$ is conjugate to $J_\lambda(0)$.

The next result shows that, as soon as we have a segment of points that are conjugate to the initial one, the segment is also abnormal.

Theorem 13.7. Let $\gamma:[0,1]\to M$ be a normal extremal path such that $\gamma|_{[0,s]}$ is not abnormal for all $0<s\le 1$. Assume $\gamma|_{[t_0,t_1]}$ is a curve of conjugate points to $\gamma(0)$. Then the restriction $\gamma|_{[t_0,t_1]}$ is also abnormal.

Remark. Recall that if a curve $\gamma:[0,T]\to M$ is a strictly normal trajectory, it can happen that a piece of it is abnormal as well. If the trajectory is strongly normal, then if $t_0,t_1$ satisfy the assumptions of Theorem 13.7 necessarily $t_0>0$.

Proof. Let us denote by $J_\lambda(t)$ the Jacobi curve associated with $\gamma(t)$. From Proposition 13.6 it follows that $J_\lambda(t)\cap J_\lambda(0)\neq\{0\}$ for each $t\in[t_0,t_1]$. We now show that actually this implies
\[ J_\lambda(0)\cap\bigcap_{t\in[t_0,t_1]}J_\lambda(t)\neq\{0\}. \]
(13.6) We can assume that the whole piece of the Jacobi curve $J_\lambda(t)$, with $t_0\le t\le t_1$, is contained in a single coordinate chart. Otherwise we can cover $[t_0,t_1]$ with such intervals and repeat the argument on each of them. Let us fix coordinates given by a Lagrangian splitting in such a way that
\[ J_\lambda(t)=\{(p,S(t)p),\ p\in\mathbb{R}^n\},\qquad J_\lambda(0)=\{(p,0),\ p\in\mathbb{R}^n\}. \]

Moreover we can assume that $S(t)\le 0$ for every $t_0\le t\le t_1$, i.e. it is nonpositive definite and monotone decreasing.$^2$ In particular $J_\lambda(t_1)\cap J_\lambda(0)\neq\{0\}$ if and only if there exists a nonzero vector $v$ such that $S(t_1)v=0$. Since the map $t\mapsto v^TS(t)v$ is nonpositive and decreasing, this means that $S(t)v=0$ for all $t\in[t_0,t_1]$, thus
\[ J_\lambda(0)\cap J_\lambda(t_1)\subset J_\lambda(0)\cap\bigcap_{t\in[t_0,t_1]}J_\lambda(t), \tag{13.7} \]
which implies that actually we have equality in (13.7).

We are left to show that if a Jacobi curve $J_\lambda(t)$ is such that every $t$ is a conjugate point for $0\le t\le\tau$, then the corresponding extremal is also abnormal. Indeed let us fix a nonzero element
\[ \xi\in\bigcap_{t\in[0,\tau]}J_\lambda(t), \]
which exists by the above discussion. Then we consider the vertical vector field
\[ \xi(t)=e^{t\vec H}_*\xi\in T_{\lambda(t)}(T^*_{\gamma(t)}M),\qquad 0\le t\le\tau. \]
By construction, the vector field $\xi$ is preserved by the Hamiltonian field, i.e. $e^{t\vec H}_*\xi=\xi$, which implies $[\vec H,\xi](\lambda(t))=0$. Then the statement is proved by the following

Exercise 13.9. Define $\eta(t)=\xi(\lambda(t))\in T^*_{\gamma(t)}M$ (by the canonical identification $T_\lambda(T^*_qM)\simeq T^*_qM$). Show that the identity $[\vec H,\xi](\lambda(t))=0$ rewrites in coordinates as follows
\[ \sum_{i=1}^kh_i(\eta(t))^2=0,\qquad \dot\eta(t)=\sum_{i=1}^kh_i(\lambda(t))\,\vec h_i(\eta(t)). \tag{13.8} \]

Exercise 13.9 shows that $\eta(t)$ is a family of covectors associated with the extremal path corresponding to controls $u_i(t)=h_i(\lambda(t))$ and such that $h_i(\eta(t))=0$, which means that it is abnormal.

Corollary. Let $J_\lambda(t)$ be the Jacobi curve associated with $\lambda\in T^*M$ and $\gamma(t)=\pi(\lambda(t))$ the associated sub-Riemannian extremal path. Then $\gamma|_{[0,\tau]}$ is not abnormal for all $\tau\le t$ if and only if $J_\lambda(\tau)\cap J_\lambda(0)=\{0\}$ for all $\tau\le t$.

Reduction of the Jacobi curves by homogeneity

The Jacobi curve at the point $\lambda\in T^*M$ parametrizes all the possible geodesic variations of the geodesic associated with an initial covector $\lambda$. Since the variations in the direction of the motion are always trivial, i.e. the trajectory remains the same up to parametrizations, one can reduce the space of variations to an $(n-1)$-dimensional one.
This idea is formalized by considering a reduction of the Jacobi curve in a smaller symplectic space. As we show in the next section, this is a natural consequence of the homogeneity of the sub-Riemannian Hamiltonian.

$^2$ Indeed it is proved that the only invariant of a pair of two Lagrangian subspaces in a symplectic space is the dimension of the intersection, i.e. the rank of the difference $\operatorname{rank}(S(t)-S(0))$. Add exercise

Remark. This procedure was already exploited in Section 7.6, where it was obtained by a direct argument. Indeed one can recognize that the procedure that reduced the equation for conjugate points by one dimension corresponds exactly to the reduction by homogeneity of the Jacobi curve associated to the problem.

We start with a technical lemma, whose proof is left as an exercise.

Lemma. Let $\Sigma=\Sigma_1\oplus\Sigma_2$ be a splitting of the symplectic space, with $\sigma=\sigma_1\oplus\sigma_2$. Let $\Lambda_i(t)\in L(\Sigma_i)$ and define the curve $\Lambda(t):=\Lambda_1(t)\oplus\Lambda_2(t)\in L(\Sigma)$. Then one has the splittings:
\[ \dot\Lambda(t)=\dot\Lambda_1(t)\oplus\dot\Lambda_2(t),\qquad R_\Lambda(t)=R_{\Lambda_1}(t)\oplus R_{\Lambda_2}(t). \]

Consider now a Jacobi curve associated with $\lambda\in T^*M$:
\[ J_\lambda(t)=e^{-t\vec H}_*V_{\lambda(t)},\qquad V_\lambda=T_\lambda(T^*_{\pi(\lambda)}M). \]
Denote by $\delta_\alpha:T^*M\to T^*M$ the fiberwise dilation $\delta_\alpha(\lambda)=\alpha\lambda$, where $\alpha>0$.

Definition. The Euler vector field $\vec E\in\operatorname{Vec}(T^*M)$ is the vertical vector field defined by
\[ \vec E(\lambda)=\frac{d}{ds}\bigg|_{s=1}\delta_s(\lambda),\qquad\lambda\in T^*M. \]
It is easy to see that in canonical coordinates $(x,\xi)$ it satisfies $\vec E=\sum_{i=1}^n\xi_i\frac{\partial}{\partial\xi_i}$ and the following identity holds
\[ e^{t\vec E}\lambda=e^t\lambda,\qquad\text{i.e.}\quad e^{t\vec E}(\xi,x)=(e^t\xi,x). \]

Exercise. Prove that the Euler vector field is characterized by the identity
\[ i_{\vec E}\sigma=s,\qquad s=\text{Liouville 1-form in } T^*M. \]

Lemma. We have the identity $e^{-t\vec H}_*\vec E=\vec E-t\vec H$. In particular $[\vec H,\vec E]=-\vec H$.

Proof. The homogeneity property (7.31) of the Hamiltonian can be rewritten as follows
\[ e^{t\vec H}(\delta_s\lambda)=\delta_s(e^{st\vec H}(\lambda)),\qquad\forall\,s,t>0. \]
Applying $\delta_{1/s}$ to both sides and changing $t$ into $-t$ one gets the identity
\[ \delta_{1/s}\circ e^{-t\vec H}\circ\delta_s=e^{-st\vec H}. \tag{13.9} \]
Computing the 2nd order mixed partial derivative at $(t,s)=(0,1)$ in (13.9) one gets, by (2.27), that $[\vec H,\vec E]=-\vec H$. Thus, by (??) we have $e^{-t\vec H}_*\vec E=\vec E-t\vec H$, since every higher order commutator vanishes.

Proposition. The subspace $\widehat\Sigma=\operatorname{span}\{\vec E,\vec H\}$ is invariant under the action of the Hamiltonian flow. Moreover $\{\vec E,\vec H\}$ is a Darboux basis on $\widehat\Sigma\cap H^{-1}(1/2)$.

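Proof. The fact that $\widehat\Sigma$ is an invariant subspace is a consequence of the identities
\[ e^{-t\vec H}_*\vec E=\vec E-t\vec H,\qquad e^{-t\vec H}_*\vec H=\vec H. \]
Moreover, on the level set $H^{-1}(1/2)$, we have by homogeneity of $H$ with respect to $p$:
\[ \sigma(\vec E,\vec H)=\vec E(H)=\frac{d}{dt}\bigg|_{t=0}H(e^{t\vec E}(p,x))=p\,\frac{\partial H}{\partial p}=2H=1. \tag{13.10} \]
It follows that $\{\vec E,\vec H\}$ is a Darboux basis for $\widehat\Sigma$.

In particular we can consider the symplectic splitting $\Sigma=\widehat\Sigma\oplus\widehat\Sigma^{\angle}$.

Exercise. Prove the following intrinsic characterization of the skew-orthogonal to $\widehat\Sigma$:
\[ \widehat\Sigma^{\angle}=\{\xi\in T_\lambda(T^*M):\langle d_\lambda H,\xi\rangle=\langle s_\lambda,\xi\rangle=0\}. \]

The assumptions of the splitting Lemma above are satisfied and we can split our Jacobi curve.

Definition. The reduced Jacobi curve is defined as follows
\[ \widehat J_\lambda(t):=J_\lambda(t)\cap\widehat\Sigma^{\angle}. \tag{13.11} \]
Notice that, if we put $\widehat V_\lambda:=V_\lambda\cap T_\lambda H^{-1}(1/2)$, we get
\[ \widehat J_\lambda(0)=\widehat V_\lambda,\qquad\widehat J_\lambda(t)=e^{-t\vec H}_*\widehat V_{\lambda(t)}. \]
Moreover we have the splitting
\[ J_\lambda(t)=\widehat J_\lambda(t)\oplus\mathbb{R}(\vec E-t\vec H). \]
We stress again that $\widehat J_\lambda(t)$ is a curve of $(n-1)$-dimensional Lagrangian subspaces in the $(2n-2)$-dimensional vector space $\widehat\Sigma^{\angle}$.

Exercise. With the notation above:
(i) Show that the curvature of the curve $J_\lambda(t)\cap\widehat\Sigma$ in $\widehat\Sigma$ is always zero.
(ii) Prove that $J_\lambda(0)\cap J_\lambda(s)\neq\{0\}$ if and only if $\widehat J_\lambda(0)\cap\widehat J_\lambda(s)\neq\{0\}$.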
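The homogeneity property $e^{t\vec H}(\delta_s\lambda)=\delta_s(e^{st\vec H}(\lambda))$, on which the whole reduction rests, can be observed numerically. The sketch below assumes, purely as an example, the Heisenberg Hamiltonian $H=\frac12(u_1^2+u_2^2)$ with $u_1=p_x-\frac{y}{2}p_z$, $u_2=p_y+\frac{x}{2}p_z$, and integrates Hamilton's equations with a hand-written Runge-Kutta scheme; the initial point `z0` and the values of `s` and `t` are arbitrary.

```python
def hamiltonian_field(z):
    """Hamilton's equations for the assumed Heisenberg Hamiltonian; z = (x, y, z, px, py, pz)."""
    x, y, _, px, py, pz = z
    u1 = px - 0.5 * y * pz
    u2 = py + 0.5 * x * pz
    return [u1, u2, 0.5 * (x * u2 - y * u1),
            -0.5 * pz * u2, 0.5 * pz * u1, 0.0]


def rk4_flow(z, t, steps=2000):
    """Approximate flow of the Hamiltonian field for time t (classical RK4)."""
    h = t / steps
    for _ in range(steps):
        k1 = hamiltonian_field(z)
        k2 = hamiltonian_field([a + 0.5 * h * b for a, b in zip(z, k1)])
        k3 = hamiltonian_field([a + 0.5 * h * b for a, b in zip(z, k2)])
        k4 = hamiltonian_field([a + h * b for a, b in zip(z, k3)])
        z = [a + h / 6.0 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(z, k1, k2, k3, k4)]
    return z


def dilate(z, s):
    """Fiberwise dilation delta_s: scales the covector, keeps the base point."""
    return z[:3] + [s * p for p in z[3:]]


z0 = [0.1, -0.2, 0.3, 1.0, 0.5, 2.0]
s, t = 1.7, 0.8
lhs = rk4_flow(dilate(z0, s), t)        # e^{tH}(delta_s lambda)
rhs = dilate(rk4_flow(z0, s * t), s)    # delta_s(e^{stH}(lambda))
gap = max(abs(a - b) for a, b in zip(lhs, rhs))
```

Dilating the initial covector by `s` only speeds up the same trajectory by the factor `s`, which is why variations along $\vec E$ carry no new geometric information.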


Chapter 14

Riemannian curvature

On a manifold, in general there is no canonical method for identifying tangent spaces at different points (or more generally fibers of a vector bundle at different points). Thus, we have to expect that a notion of derivative for vector fields (or sections of a vector bundle) has to depend on certain choices. In our presentation we introduce the general notion of Ehresmann connection and then we discuss how this notion is related to the notions of parallel transport and covariant derivative usually introduced in classical Riemannian geometry.

Ehresmann connection

Given a smooth fiber bundle $E$, with base $M$ and canonical projection $\pi:E\to M$, we denote by $E_q=\pi^{-1}(q)$ the fiber at the point $q\in M$. The vertical distribution is by definition the collection of subspaces in $TE$ that are tangent to the fibers
\[ V=\{V_z\}_{z\in E},\qquad V_z:=\ker\pi_{*,z}=T_zE_{\pi(z)}\subset T_zE. \]

Definition. Let $E$ be a smooth fiber bundle. An Ehresmann connection on $E$ is a smooth vector distribution $H$ in $E$ satisfying
\[ H=\{H_z\}_{z\in E},\qquad T_zE=V_z\oplus H_z. \]

Notice that $V$, being the kernel of the pushforward $\pi_*$, is canonically associated with the fiber bundle. Defining a connection means exactly defining a canonical complement to this distribution. For this reason $H$ is also called the horizontal distribution.

Definition. Let $X\in\operatorname{Vec}(M)$. The horizontal lift of $X$ is the unique vector field $\nabla_X\in\operatorname{Vec}(E)$ such that
\[ \nabla_X(z)\in H_z,\qquad\pi_*\nabla_X=X,\qquad\forall\,z\in E. \tag{14.1} \]
The uniqueness follows from the fact that $\pi_{*,z}:T_zE\to T_{\pi(z)}M$ is an isomorphism when restricted to $H_z$. Indeed $\pi_{*,z}$ is a surjective linear map with $\ker\pi_{*,z}=V_z$.

Notation. In the following we will also refer to $\nabla$ as the connection on $E$.

Given a smooth curve $\gamma:[0,T]\to M$ on the manifold $M$, the connection $\nabla$ lets us define the parallel transport along $\gamma$, i.e. a way to identify tangent vectors belonging to tangent spaces at different points of the curve. Let $X_t$ be a nonautonomous smooth vector field defined on a neighborhood of $\gamma$ that is an extension of the velocity vector field of the curve$^1$, i.e. such that
\[ \dot\gamma(t)=X_t(\gamma(t)),\qquad\forall\,t\in[0,T]. \]
Then consider the nonautonomous vector field $\nabla_{X_t}\in\operatorname{Vec}(E)$ obtained by its lift.

Definition. Let $\gamma:[0,T]\to M$ be a smooth curve. The parallel transport along $\gamma$ is the map $\Phi$ defined by the flow of $\nabla_{X_t}$
\[ \Phi_{t_0,t_1}:=\overrightarrow{\exp}\int_{t_0}^{t_1}\nabla_{X_s}\,ds:E_{\gamma(t_0)}\to E_{\gamma(t_1)},\qquad\text{for } 0<t_0<t_1<T. \tag{14.2} \]
In the general case we need some extra assumptions on the vector field to ensure that (14.2) exists (even for small time $t>0$), since the existence time of a solution also depends on the point on the fiber. For instance if the fibers are compact, then it is possible to find such $t>0$.

Exercise. Show that the parallel transport map sends fibers to fibers and does not depend on the extension of the vector field $X_t$. (Hint: consider two extensions and use the existence and uniqueness of the flow.)

Curvature of an Ehresmann connection

Assume that $\pi:E\to M$ is a smooth fiber bundle and let $\nabla$ be a connection on $E$, defining the splitting $TE=V\oplus H$. Given an element $z\in TE$ we will also denote by $z_{hor}$ (resp. $z_{ver}$) its projection on the horizontal (resp. vertical) subspace at that point.

The commutator of two vertical vector fields is always vertical. The curvature operator associated with the connection measures whether the same holds true for two horizontal vector fields.

Definition. Let $E$ be a smooth fiber bundle and $\nabla$ a connection on $E$. Let $X,Y\in\operatorname{Vec}(M)$ and define
\[ R(X,Y):=[\nabla_X,\nabla_Y]_{ver}. \tag{14.3} \]
The operator $R$ is called the curvature of the connection.

Notice that, given a vector field on $E$, its horizontal part coincides, by definition, with the lift of its projection. In particular $[\nabla_X,\nabla_Y]_{hor}=\nabla_{[X,Y]}$ (i.e.
$\pi_*[\nabla_X,\nabla_Y]=[X,Y]$). Hence $R(X,Y)$ computes the nontrivial part of the bracket between the lifts of $X$ and $Y$, and $R\equiv 0$ if and only if the horizontal distribution $H$ is involutive. The curvature $R(X,Y)$ is also rewritten in the following more classical way
\[ R(X,Y)=[\nabla_X,\nabla_Y]-\nabla_{[X,Y]}=\nabla_X\nabla_Y-\nabla_Y\nabla_X-\nabla_{[X,Y]}. \]
Next we show that $R$ is actually a tensor on $T_qM$, i.e. the value of $R(X,Y)$ at $q$ depends only on the values of $X$ and $Y$ at the point $q$.

$^1$ This is always possible with a (maybe nonautonomous) vector field.

Proposition. $R$ is a skew symmetric tensor on $M$.

Proof. The skew-symmetry is immediate. To prove that the value of $R(X,Y)$ at $q$ depends only on the values of $X$ and $Y$ at the point $q$, it is sufficient to prove that $R$ is linear on functions. By skew-symmetry, we are reduced to proving that $R$ is linear in the first argument, namely
\[ R(aX,Y)=aR(X,Y),\qquad\text{where } a\in C^\infty(M). \]
Notice that the symbol $a$ in the right hand side stands for the function $\pi^*a=a\circ\pi$ in $C^\infty(E)$, which is constant on fibers. By definition of lift of a vector field it is easy to prove the identities $\nabla_{aX}=a\nabla_X$ and $\nabla_Xa=Xa$ for every $a\in C^\infty(M)$. Applying the definition of $\nabla$ and the Leibniz rule for the Lie bracket one gets
\[ \begin{aligned} R(aX,Y)&=[\nabla_{aX},\nabla_Y]-\nabla_{[aX,Y]}\\ &=a[\nabla_X,\nabla_Y]-(\nabla_Ya)\nabla_X-\nabla_{a[X,Y]-(Ya)X}\\ &=a[\nabla_X,\nabla_Y]-(Ya)\nabla_X-a\nabla_{[X,Y]}+(Ya)\nabla_X\\ &=aR(X,Y). \end{aligned} \]

Linear Ehresmann connections

Assume now that $E$ is a vector bundle on $M$ (i.e. each fiber $E_q=\pi^{-1}(q)$ has a natural structure of vector space). In this case it is natural to introduce a notion of linear Ehresmann connection on $E$. Given a vector bundle $\pi:E\to M$, we denote by $C^\infty_L(E)$ the set of smooth functions on $E$ that are linear on fibers.

Remark. For a vector bundle $\pi:E\to M$, the base manifold $M$ can be considered immersed in $E$ as the zero section (see also Example 2.41). The dual version of this identification is the inclusion $i:C^\infty(M)\to C^\infty(E)$. Indeed any function in $C^\infty(M)$ can be considered as a function in $C^\infty(E)$ which is constant on fibers, i.e. more precisely $a\in C^\infty(M)\mapsto\pi^*a\in C^\infty(E)$.

Exercise. Show that a vector field on $E$ is the lift of a vector field on $M$ if and only if, as a differential operator on $C^\infty(E)$, it maps the subspace $C^\infty(M)$ into itself.

After this discussion it is natural to give the following definition.

Definition. A linear connection on a vector bundle $E$ on the base $M$ is an Ehresmann connection $\nabla$ such that the lift $\nabla_X$ of a vector field $X\in\operatorname{Vec}(M)$ satisfies the following property: for every $a\in C^\infty_L(E)$ it holds $\nabla_Xa\in C^\infty_L(E)$.
Remark. Given a local basis of vector fields $X_1,\dots,X_n$ on $M$ we can build dual coordinates $(u_1,\dots,u_n)$ on the fibers of $E$ by defining the functions $u_i(z) = \langle z, X_i(q)\rangle$, where $q = \pi(z)$. In this way

$$E = \{(u,q) : q \in M,\ u \in \mathbb{R}^n\},$$

and the tangent space to $E$ splits as $T_zE \simeq T_qM \oplus T_zE_q$. A connection on $E$ is determined by the lifts of the vector fields $X_i$, $i = 1,\dots,n$, on the base manifold (recall that $\pi_*\nabla_{X_i} = X_i$):

$$\nabla_{X_i} = X_i + \sum_{j=1}^n a_{ij}(u,q)\,\partial_{u_j}, \qquad i = 1,\dots,n, \tag{14.4}$$

where $a_{ij} \in C^\infty(E)$ are suitable smooth functions. Then $\nabla$ is linear if and only if for every $i,j$ the function $a_{ij}(u,q) = \sum_{k=1}^n \Gamma^k_{ij}(q)u_k$ is linear with respect to $u$. The smooth functions $\Gamma^k_{ij}$ are also called the Christoffel symbols of the linear connection.

Exercise. Let $\gamma$ be a smooth curve on the manifold such that $\dot\gamma(t) = \sum_{i=1}^n v_i(t)X_i(\gamma(t))$. Show that the differential equation $\dot\xi(t) = \nabla_{\dot\gamma(t)}\xi(t)$ for the parallel transport along $\gamma$ is written as $\dot u_j = \sum_{i,k}\Gamma^k_{ij}v_iu_k$, where $(u_1,\dots,u_n)$ are the vertical coordinates of $\xi$.

Notice that, for a linear connection, the parallel transport is defined by a first order linear (nonautonomous) ODE. The existence of the flow is then guaranteed by standard results from ODE theory. Moreover, when it exists, the map $\Phi_{t_0,t_1}$ is a linear transformation between fibers.

Covariant derivative and torsion for linear connections

Once a connection on a linear vector bundle $E$ is given, we have a well defined linear parallel transport map

$$\Phi_{t_0,t_1} := \overrightarrow{\exp}\int_{t_0}^{t_1}\nabla_{X_s}\,ds : E_{\gamma(t_0)} \to E_{\gamma(t_1)}, \qquad \text{for } 0 < t_0 < t_1 < T. \tag{14.5}$$

If we consider the dual map of the parallel transport, one can naturally introduce a nonautonomous linear flow on the dual bundle (notice the exchange of $t_0,t_1$ in the integral)

$$\Phi^*_{t_0,t_1} := \left(\overrightarrow{\exp}\int_{t_1}^{t_0}\nabla_{X_s}\,ds\right)^* : E^*_{\gamma(t_0)} \to E^*_{\gamma(t_1)}, \qquad \text{for } 0 < t_0 < t_1 < T. \tag{14.6}$$

The infinitesimal generator of this adjoint flow defines a linear parallel transport, hence a linear connection, on the dual bundle $E^*$. In what follows we restrict our attention to the case of the vector bundle $E = T^*M$ and we assume that a linear connection $\nabla$ on $T^*M$ is given. Notice that, by the above discussion, all the constructions can be equivalently performed on the dual bundle $E^* = TM$.
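The coordinate form of the parallel transport equation, $\dot u_j = \sum_{i,k}\Gamma^k_{ij}v_iu_k$, can be explored numerically. The following is only a minimal sketch: the function name `parallel_transport`, the constant Christoffel symbols `GAMMA` and the velocity coefficients `velocity` are hypothetical illustration data, not taken from the text, and a classical fourth-order Runge-Kutta scheme is assumed as integrator. It illustrates that the transport map is linear in the initial vertical coordinates.

```python
import math

def parallel_transport(gamma, v, u0, t0, t1, steps=2000):
    """Integrate u_j' = sum_{i,k} gamma[i][j][k] * v_i(t) * u_k with RK4.

    gamma[i][j][k] stores the Christoffel symbol Gamma^k_{ij} (assumed
    constant here), v(t) returns the coefficients v_i(t) of the velocity
    in the frame, u0 are the initial vertical coordinates.
    """
    n = len(u0)

    def rhs(t, u):
        vt = v(t)
        return [sum(gamma[i][j][k] * vt[i] * u[k]
                    for i in range(n) for k in range(n))
                for j in range(n)]

    h = (t1 - t0) / steps
    t, u = t0, list(u0)
    for _ in range(steps):
        k1 = rhs(t, u)
        k2 = rhs(t + h / 2, [u[j] + h / 2 * k1[j] for j in range(n)])
        k3 = rhs(t + h / 2, [u[j] + h / 2 * k2[j] for j in range(n)])
        k4 = rhs(t + h, [u[j] + h * k3[j] for j in range(n)])
        u = [u[j] + h / 6 * (k1[j] + 2 * k2[j] + 2 * k3[j] + k4[j])
             for j in range(n)]
        t += h
    return u

# Hypothetical constant Christoffel symbols on a 2-dimensional fiber.
GAMMA = [[[0.0, 1.0], [-1.0, 0.0]],
         [[0.0, 0.5], [0.2, 0.0]]]

def velocity(t):
    # Hypothetical coefficients v_1(t), v_2(t) of the curve's velocity.
    return [math.cos(t), math.sin(t)]

e1 = parallel_transport(GAMMA, velocity, [1.0, 0.0], 0.0, 1.0)
e2 = parallel_transport(GAMMA, velocity, [0.0, 1.0], 0.0, 1.0)
mixed = parallel_transport(GAMMA, velocity, [2.0, -3.0], 0.0, 1.0)
```

Since the equation is linear, transporting `[2.0, -3.0]` should agree with `2*e1 - 3*e2` up to rounding, which is what the linearity of $\Phi_{t_0,t_1}$ predicts.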
For every vector field $Y \in \mathrm{Vec}(M)$ let us denote by $Y^* \in C^\infty(T^*M)$ the function

$$Y^*(\lambda) = \langle\lambda, Y(q)\rangle, \qquad q = \pi(\lambda),$$

namely the smooth function on $E$ associated with $Y$ that is linear on fibers. This identification between vector fields on $M$ and linear functions on $T^*M$ permits us to define the covariant derivative of vector fields.

Definition. Let $X,Y \in \mathrm{Vec}(M)$. We define $\nabla_XY = Z$ if and only if $\nabla_XY^* = Z^*$ with $Z \in \mathrm{Vec}(M)$.

Notice that the definition is well-posed since $\nabla$ is linear, hence $\nabla_XY^*$ is a linear function and there exists $Z \in \mathrm{Vec}(M)$ such that $\nabla_XY^* = Z^*$. (2)

Lemma. Let $\{X_1,\dots,X_n\}$ be a local frame on $M$. Then $\nabla_{X_i}X_j = \Gamma^k_{ij}X_k$, where $\Gamma^k_{ij}$ are the Christoffel symbols of the connection $\nabla$.

Proof. Let us prove this in the coordinates dual to our frame. In these coordinates, the linear connection is specified by the lifts

$$\nabla_{X_i} = X_i + \Gamma^k_{ij}u_k\,\partial_{u_j}, \qquad \text{where } u_j(\lambda) = \langle\lambda,X_j\rangle.$$

Moreover $X_j^* = u_j$. Hence it is immediate to show that $\nabla_{X_i}X_j^* = \Gamma^k_{ij}X_k^*$, and the lemma is proved.

We now introduce the torsion tensor of a linear connection on $T^*M$. As usual, $\sigma$ denotes the canonical symplectic structure on $T^*M$.

Definition. The torsion of a linear connection $\nabla$ is the map $T : \mathrm{Vec}(M)^2 \to \mathrm{Vec}(M)$ defined by the identity

$$T(X,Y) := \sigma(\nabla_X,\nabla_Y), \qquad X,Y \in \mathrm{Vec}(M). \tag{14.7}$$

It is easy to check that $T$ is actually a tensor, i.e. the value of $T(X,Y)$ at a point $q$ depends only on the values of $X,Y$ at that point. The torsion measures how far the horizontal distribution $\mathcal{H}$ is from being Lagrangian. In particular $\mathcal{H}$ is Lagrangian if and only if $T \equiv 0$.

The classical formula for the torsion tensor, in terms of the covariant derivative, is recovered in the following lemma.

Lemma. The torsion tensor satisfies the identity

$$T(X,Y) = \nabla_XY - \nabla_YX - [X,Y]. \tag{14.8}$$

Proof. We have to prove that $T(X,Y) = \nabla_XY^* - \nabla_YX^* - [X,Y]^*$. Notice that, by definition of the Liouville 1-form $s \in \Lambda^1(T^*M)$, $s_\lambda = \lambda\circ\pi_*$, we have $X^*(\lambda) = \langle\lambda,X\rangle = \langle s_\lambda,\nabla_X\rangle$. Then, using that $\sigma = ds$ and the Cartan formula (4.74),

$$\begin{aligned}
T(X,Y) = ds(\nabla_X,\nabla_Y) &= \nabla_X\langle s,\nabla_Y\rangle - \nabla_Y\langle s,\nabla_X\rangle - \langle s,[\nabla_X,\nabla_Y]\rangle \\
&= \nabla_X\langle s,\nabla_Y\rangle - \nabla_Y\langle s,\nabla_X\rangle - \langle s,\nabla_{[X,Y]}\rangle \\
&= \nabla_XY^* - \nabla_YX^* - [X,Y]^*,
\end{aligned}$$

where in the second equality we used that $\langle s,[\nabla_X,\nabla_Y]\rangle = \langle s,[\nabla_X,\nabla_Y]_{\mathrm{hor}}\rangle = \langle s,\nabla_{[X,Y]}\rangle$, since the Liouville form by definition depends only on the horizontal part of the vector.

Exercise. Show that a linear connection on a vector bundle $E$ satisfies the following Leibniz rule: $\nabla_X(aY) = a\nabla_XY + (Xa)Y$, for each $a \in C^\infty(M)$.
2 There is no confusion in the notation above since, by definition, $\nabla_X$ is well defined when applied to smooth functions on $T^*M$. Whenever it is applied to a vector field we follow the aforementioned convention.

14.2 Riemannian connection

In this section we introduce the Levi-Civita connection on a Riemannian manifold $M$ by defining an Ehresmann connection on $T^*M$ via the Jacobi curve approach. Recall that every Jacobi curve associated with a trajectory on a Riemannian manifold is regular. Moreover, as shown in Chapter 12, every regular curve in the Lagrangian Grassmannian admits a derivative curve, which defines a canonical complement to the curve itself. Hence, following this approach, it is natural to introduce the Riemannian connection at $\lambda \in T^*M$ as the canonical complement to the Jacobi curve defined at $\lambda$.

Definition. The Levi-Civita connection on $T^*M$ is the Ehresmann connection $\mathcal{H}$ defined by

$$\mathcal{H}_\lambda = J^\circ_\lambda(0), \qquad \lambda \in T^*M,$$

where as usual $J_\lambda(t)$ denotes the Jacobi curve defined at the point $\lambda \in T^*M$ and $J^\circ_\lambda$ denotes its derivative curve.

The next proposition characterizes the Levi-Civita connection as the unique connection on $T^*M$ that is linear, metric preserving and torsion free.

Proposition. The Levi-Civita connection satisfies the following properties:
(i) $\nabla$ is a linear connection,
(ii) $\nabla$ is torsion free,
(iii) $\nabla$ is metric preserving, i.e. $\nabla_XH = 0$ for each vector field $X \in \mathrm{Vec}(M)$.

Proof. (i). It is enough to prove that the connection $\mathcal{H}_\lambda$ is 1-homogeneous, i.e.

$$\mathcal{H}_{c\lambda} = \delta_{c*}\mathcal{H}_\lambda, \qquad \forall\, c > 0. \tag{14.9}$$

Indeed in this case the functions $a_{ij} \in C^\infty(T^*M)$ defining the connection (see (14.4)) are 1-homogeneous, hence linear as a consequence of the exercise stated after this proof. Let us prove (14.9). The differential of the dilation on the fibers $\delta_c : T^*M \to T^*M$ satisfies the property $\delta_{c*}(T_\lambda(T_q^*M)) = T_{c\lambda}(T_q^*M)$. From this identity and differentiating the identity

$$e^{t\vec H}\circ\delta_c = \delta_c\circ e^{ct\vec H}, \qquad \forall\, c > 0, \tag{14.10}$$

one easily gets that

$$J_{c\lambda}(t) = \delta_{c*}J_\lambda(ct), \qquad \forall\, t \ge 0,\ \lambda \in T^*M. \tag{14.11}$$

Indeed one has the following chain of identities:

$$\begin{aligned}
J_{c\lambda}(t) &= e^{-t\vec H}_*\big(T_{c\lambda}(T_q^*M)\big) \\
&= e^{-t\vec H}_*\,\delta_{c*}\big(T_\lambda(T_q^*M)\big) \\
&= \delta_{c*}\,e^{-ct\vec H}_*\big(T_\lambda(T_q^*M)\big) \qquad \text{(by (14.10))} \\
&= \delta_{c*}J_\lambda(ct).
\end{aligned}$$

Now we show that the same relation holds true also for the derivative curve, i.e.

$$J^\circ_{c\lambda}(t) = \delta_{c*}J^\circ_\lambda(ct), \qquad \forall\, t \ge 0,\ \lambda \in T^*M. \tag{14.12}$$

Indeed one can check in coordinates (we denote as usual $J_\lambda(t) = \{(p,S_\lambda(t)p),\ p \in \mathbb{R}^n\}$) that the identity (14.11) is written as $S_{c\lambda}(t) = \frac{1}{c}S_\lambda(ct)$, thus $S_{c\lambda}(t)^{-1} = cS_\lambda(ct)^{-1}$. From here (3) one also gets $B_{c\lambda}(t) = cB_\lambda(ct)$, and (14.12) follows from the identity $S^\circ(t) = B^{-1}(t) + S(t)$ (see also Exercise 12.22). In particular at $t = 0$ the identity (14.12) says that $\mathcal{H}_{c\lambda} = \delta_{c*}\mathcal{H}_\lambda$.

(ii). It is a direct consequence of the fact that $J^\circ_\lambda(0)$ is a Lagrangian subspace of $T_\lambda(T^*M)$ for every $\lambda \in T^*M$, hence the symplectic form vanishes when applied to two horizontal vectors.

(iii). Again, for every $X \in \mathrm{Vec}(M)$, both $\nabla_X$ and $\vec H$ are horizontal vector fields. Since the horizontal space is Lagrangian,

$$\nabla_XH = \sigma(\nabla_X,\vec H) = 0.$$

Exercise. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a smooth function that satisfies $f(\alpha x) = \alpha f(x)$ for every $x \in \mathbb{R}^n$ and $\alpha \ge 0$. Then $f$ is linear.

The following theorem says that a connection satisfying the three properties above is unique. It then characterizes the Levi-Civita connection in terms of the structure constants of the Lie algebra defined by an orthonormal frame.

Theorem. There is a unique Ehresmann connection satisfying the properties (i), (ii), and (iii) of Proposition 14.18, that is the Levi-Civita connection. Its Christoffel symbols are computed by

$$\Gamma^k_{ij} = \frac{1}{2}\left(c^k_{ij} - c^i_{jk} + c^j_{ki}\right), \tag{14.13}$$

where $c^k_{ij}$ are the smooth functions defined by the identity $[X_i,X_j] = \sum_{k=1}^n c^k_{ij}X_k$.

Proof. Let $X_1,\dots,X_n$ be a local orthonormal frame for the Riemannian structure and let us consider coordinates $(q,u)$ in $T^*M$, where the fiberwise coordinates $u = (u_1,\dots,u_n)$ are dual to the orthonormal frame. From the linearity of the connection it follows that there exist smooth functions $\Gamma^k_{ij} : M \to \mathbb{R}$ (depending on $q$ only) such that

$$\nabla_{X_i} = X_i + \sum_{j,k=1}^n\Gamma^k_{ij}u_k\,\partial_{u_j}, \qquad i = 1,\dots,n.$$

In particular $\nabla_{X_i}X_j = \Gamma^k_{ij}X_k$.
In these coordinates the Hamiltonian vector field associated with the Riemannian Hamiltonian $H = \frac{1}{2}\sum_{i=1}^n u_i^2$ reads (see also Exercise ??)

$$\vec H = \sum_{i=1}^n u_iX_i + \sum_{i,j,k=1}^n c^k_{ij}u_iu_k\,\partial_{u_j},$$

while the symplectic form $\sigma$ is written ($\nu_1,\dots,\nu_n$ denotes the dual basis to $X_1,\dots,X_n$)

$$\sigma = \sum_{k=1}^n du_k\wedge\nu_k - \frac{1}{2}\sum_{i,j,k=1}^n c^k_{ij}u_k\,\nu_i\wedge\nu_j.$$

3 Recall that $B$ is the zero order term of the expansion of $S^{-1}$.

Since the horizontal space is Lagrangian, one has the relations

$$0 = \sigma(\nabla_{X_i},\nabla_{X_j}) = \sum_{k=1}^n\left(\Gamma^k_{ij} - \Gamma^k_{ji} - c^k_{ij}\right)u_k, \qquad i,j = 1,\dots,n,$$

hence $c^k_{ij} = \Gamma^k_{ij} - \Gamma^k_{ji}$ for all $i,j,k$. Moreover the connection is metric, i.e. it satisfies

$$0 = \nabla_{X_i}H = \sum_{j,k=1}^n\Gamma^k_{ij}u_ku_j, \qquad i = 1,\dots,n.$$

The last identity implies that $\Gamma^k_{ij}$ is skew-symmetric with respect to the pair $(j,k)$, i.e. $\Gamma^k_{ij} = -\Gamma^j_{ik}$. Thus combining the two identities one gets

$$c^k_{ij} - c^i_{jk} + c^j_{ki} = (\Gamma^k_{ij} - \Gamma^k_{ji}) - (\Gamma^i_{jk} - \Gamma^i_{kj}) + (\Gamma^j_{ki} - \Gamma^j_{ik}) = \Gamma^k_{ij} - \Gamma^j_{ik} = 2\Gamma^k_{ij}.$$

Remark. Notice that in the classical approach one can recover formula (14.13) from the following particular case of the Koszul formula

$$\Gamma^k_{ij} = g(\nabla_{X_i}X_j,X_k) = \frac{1}{2}\Big(g([X_i,X_j],X_k) - g([X_j,X_k],X_i) + g([X_k,X_i],X_j)\Big),$$

which holds for every orthonormal basis $X_1,\dots,X_n$. Notice also that the Hamiltonian vector field is written in coordinates as $\vec H = \sum_{i=1}^n u_i\nabla_{X_i}$, which gives another proof of the fact that it is horizontal.

Let $X,Y,Z,W \in \mathrm{Vec}(M)$. We define $R(X,Y)Z = W$ if $R(X,Y)Z^* = W^*$.

Proposition (Bianchi identity). For every $X,Y,Z \in \mathrm{Vec}(M)$ the following identity holds

$$R(X,Y)Z + R(Y,Z)X + R(Z,X)Y = 0. \tag{14.14}$$

Proof. We will show that (14.14) is a consequence of the Jacobi identity (2.3). Using that $\nabla$ is a torsion free connection we can write

$$\begin{aligned}
[X,[Y,Z]] &= \nabla_X[Y,Z] - \nabla_{[Y,Z]}X = \nabla_X\nabla_YZ - \nabla_X\nabla_ZY - \nabla_{[Y,Z]}X, \\
[Z,[X,Y]] &= \nabla_Z\nabla_XY - \nabla_Z\nabla_YX - \nabla_{[X,Y]}Z, \\
[Y,[Z,X]] &= \nabla_Y\nabla_ZX - \nabla_Y\nabla_XZ - \nabla_{[Z,X]}Y.
\end{aligned}$$

Then

$$\begin{aligned}
0 &= [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] \\
&= \nabla_X\nabla_YZ - \nabla_X\nabla_ZY - \nabla_{[Y,Z]}X + \nabla_Z\nabla_XY - \nabla_Z\nabla_YX - \nabla_{[X,Y]}Z + \nabla_Y\nabla_ZX - \nabla_Y\nabla_XZ - \nabla_{[Z,X]}Y \\
&= R(X,Y)Z + R(Y,Z)X + R(Z,X)Y.
\end{aligned}$$
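Formula (14.13) can be sanity-checked numerically. The sketch below is an illustration under stated assumptions: the structure functions are taken to be constants, they are stored as `c[i][j][k]` for $c^k_{ij}$, and the helper names `christoffel` and `random_structure` are hypothetical. It checks the two identities used in the proof, $\Gamma^k_{ij} - \Gamma^k_{ji} = c^k_{ij}$ (torsion free) and $\Gamma^k_{ij} = -\Gamma^j_{ik}$ (metric), which only require $c^k_{ij} = -c^k_{ji}$.

```python
import random

def christoffel(c):
    """Gamma^k_{ij} = (c^k_{ij} - c^i_{jk} + c^j_{ki}) / 2, with c[i][j][k] = c^k_{ij}."""
    n = len(c)
    return [[[0.5 * (c[i][j][k] - c[j][k][i] + c[k][i][j])
              for k in range(n)] for j in range(n)] for i in range(n)]

def random_structure(n, seed=0):
    """Random constants with c^k_{ij} = -c^k_{ji} (hypothetical test data;
    the Jacobi identity is not imposed because the check does not need it)."""
    rng = random.Random(seed)
    c = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(n):
                x = rng.uniform(-1.0, 1.0)
                c[i][j][k] = x
                c[j][i][k] = -x
    return c

def max_defects(c):
    """Largest violation of the torsion-free and metric identities."""
    n = len(c)
    g = christoffel(c)
    idx = [(i, j, k) for i in range(n) for j in range(n) for k in range(n)]
    torsion = max(abs(g[i][j][k] - g[j][i][k] - c[i][j][k]) for i, j, k in idx)
    metric = max(abs(g[i][j][k] + g[i][k][j]) for i, j, k in idx)
    return torsion, metric

# so(3)-type constants: c^k_{ij} equals the Levi-Civita symbol eps_{ijk}.
EPS = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
for (i, j, k) in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    EPS[i][j][k] = 1.0
    EPS[j][i][k] = -1.0

torsion_defect, metric_defect = max_defects(random_structure(4))
gamma_so3 = christoffel(EPS)
```

For the constants `EPS` the formula gives $\Gamma^k_{ij} = \frac{1}{2}\varepsilon_{ijk}$, so for instance `gamma_so3[0][1][2]` is `0.5`.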

Exercise. Prove the second Bianchi identity

$$(\nabla_XR)(Y,Z,W) + (\nabla_YR)(Z,X,W) + (\nabla_ZR)(X,Y,W) = 0, \qquad \forall\, X,Y,Z,W \in \mathrm{Vec}(M).$$

(Hint: expand the identity $\nabla_{[X,[Y,Z]]+[Y,[Z,X]]+[Z,[X,Y]]}W = 0$.)

Let us denote $(X,Y,Z,W) := \langle R(X,Y)Z,W\rangle$. Following this notation, the first Bianchi identity can be rewritten as follows:

$$(X,Y,Z,W) + (Z,X,Y,W) + (Y,Z,X,W) = 0, \qquad \forall\, X,Y,Z,W \in \mathrm{Vec}(M). \tag{14.15}$$

Remark. The skew-symmetry properties of the Riemann tensor can be reformulated as follows:

$$(X,Y,Z,W) = -(Y,X,Z,W), \qquad (X,Y,Z,W) = -(X,Y,W,Z). \tag{14.16}$$

Proposition. For every $X,Y,Z,W \in \mathrm{Vec}(M)$ we have $(X,Y,Z,W) = (Z,W,X,Y)$.

Proof. Using (14.15) four times we can write the identities

$$\begin{aligned}
(X,Y,Z,W) + (Z,X,Y,W) + (Y,Z,X,W) &= 0, \\
(Y,Z,W,X) + (W,Y,Z,X) + (Z,W,Y,X) &= 0, \\
(Z,W,X,Y) + (X,Z,W,Y) + (W,X,Z,Y) &= 0, \\
(W,X,Y,Z) + (Y,W,X,Z) + (X,Y,W,Z) &= 0.
\end{aligned}$$

Summing all together and using the skew-symmetry (14.16), one gets $(X,Z,W,Y) = (W,Y,X,Z)$.

Proposition. Assume that $(X,Y,X,W) = 0$ for every $X,Y,W \in \mathrm{Vec}(M)$. Then $(X,Y,Z,W) = 0$ for every $X,Y,Z,W \in \mathrm{Vec}(M)$.

Proof. By assumption and the symmetry properties of the Riemann tensor, we have that $(X,Y,Z,W) = 0$ whenever any two of the vector fields coincide. In particular

$$0 = (X,Y+W,Z,Y+W) = (X,Y,Z,W) + (X,W,Z,Y), \tag{14.17}$$

since the two extra terms that should appear in the expansion vanish by assumption. Then (14.17) can be rewritten as $(X,Y,Z,W) = (Z,X,Y,W)$, i.e. the quantity $(X,Y,Z,W)$ is invariant under cyclic permutations of $X,Y,Z$. But the cyclic sum of these terms is zero by (14.15), hence $(X,Y,Z,W) = 0$.

We end this section by summarizing the symmetry properties of the Riemann curvature as follows.

Corollary. There is a well defined map

$$R : \wedge^2T_qM \to \wedge^2T_qM, \qquad R(X\wedge Y) := R(X,Y).$$

Moreover $R$ is self-adjoint with respect to the induced scalar product on $\wedge^2T_qM$, that is $\langle R(X\wedge Y),Z\wedge W\rangle = \langle X\wedge Y,R(Z\wedge W)\rangle$.
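The symmetries (14.15), (14.16) and the pair symmetry $(X,Y,Z,W) = (Z,W,X,Y)$ can be checked on a concrete model. The sketch below assumes, purely for illustration, the constant curvature model tensor with components $\delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}$ in an orthonormal basis (the overall sign convention is an assumption and does not affect the check); the function names are hypothetical.

```python
from itertools import product

def constant_curvature_tensor(n):
    """Components (e_i, e_j, e_k, e_l) of a model curvature tensor (assumed form)."""
    def delta(a, b):
        return 1.0 if a == b else 0.0
    return {(i, j, k, l): delta(i, k) * delta(j, l) - delta(i, l) * delta(j, k)
            for i, j, k, l in product(range(n), repeat=4)}

def check_symmetries(R, n, tol=1e-12):
    """Return which of the four symmetry properties hold for the components R."""
    ok = {"skew_first_pair": True, "skew_last_pair": True,
          "first_bianchi": True, "pair_symmetry": True}
    for i, j, k, l in product(range(n), repeat=4):
        if abs(R[i, j, k, l] + R[j, i, k, l]) > tol:
            ok["skew_first_pair"] = False
        if abs(R[i, j, k, l] + R[i, j, l, k]) > tol:
            ok["skew_last_pair"] = False
        # Cyclic sum over the first three entries, as in (14.15).
        if abs(R[i, j, k, l] + R[k, i, j, l] + R[j, k, i, l]) > tol:
            ok["first_bianchi"] = False
        if abs(R[i, j, k, l] - R[k, l, i, j]) > tol:
            ok["pair_symmetry"] = False
    return ok

report = check_symmetries(constant_curvature_tensor(3), 3)
```

All four entries of `report` come out `True` for this model, in agreement with the propositions above.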

14.3 Relation with Hamiltonian curvature

In this section we compute the curvature of the Jacobi curve associated with a Riemannian geodesic and we describe the relation with the Riemann curvature discussed in the previous section. As we show, the curvature associated with a geodesic is a kind of sectional curvature operator in the direction of the geodesic itself.

Definition. The Hamiltonian curvature tensor at $\lambda \in T^*M$ is the operator

$$R_\lambda := R_{J_\lambda}(0) : V_\lambda \to V_\lambda.$$

In other words $R_\lambda$ is the curvature of the Jacobi curve associated with $\lambda$ at $t = 0$.

Proposition. Let $\xi \in V_\lambda$ and let $V$ be a smooth vertical vector field extending $\xi$. Then

$$R_\lambda(\xi) = -[\vec H,[\vec H,V]_{\mathrm{hor}}]_{\mathrm{ver}}(\lambda).$$

Proof. Recall that the curvature of the Jacobi curve is expressed through the composition $R_\lambda = \dot J^\circ_\lambda(0)\circ\dot J_\lambda(0)$. Moreover, being $J_\lambda(0) = V_\lambda$ and $J^\circ_\lambda(0) = \mathcal{H}_\lambda$, we have that

$$\pi_{J(0)J^\circ(0)}(\xi) = \xi_{\mathrm{hor}}, \qquad \pi_{J^\circ(0)J(0)}(\eta) = \eta_{\mathrm{ver}}.$$

Finally we can extend vectors in $J_\lambda(0)$ (resp. $J^\circ_\lambda(0)$) by applying the Hamiltonian vector field, since $J_\lambda(t) = e^{-t\vec H}_*J_\lambda(0)$ (resp. $J^\circ_\lambda(t) = e^{-t\vec H}_*J^\circ_\lambda(0)$). From these remarks we obtain the formulas

$$\dot J_\lambda(0)\xi = [\vec H,V]_{\mathrm{hor}}, \qquad \dot J^\circ_\lambda(0)\eta = -[\vec H,W]_{\mathrm{ver}},$$

for some vertical extension $V$ of the vector $\xi \in V_\lambda$ (resp. horizontal extension $W$ of $\eta \in \mathcal{H}_\lambda$).

Another immediate property of the curvature tensor is the homogeneity with respect to the rescaling of the covector (which corresponds to a reparametrization of the trajectory). Indeed by choosing $\varphi(t) = ct$, with $c > 0$, in Proposition one gets

Corollary. For every $c > 0$ we have $R_{c\lambda} = c^2R_\lambda$.

If we use the Riemannian product to identify the tangent and the cotangent space at a point, we recognize that $R_\lambda$ is nothing but the sectional curvature operator where one entry is the tangent vector $\dot\gamma$ of the geodesic. Let us denote by $I : TM \to T^*M$ the isomorphism defined by the Riemannian scalar product. In particular $I(v) = \lambda$, for $\lambda \in T_q^*M$ and $v \in T_qM$, if $\langle\lambda,w\rangle = \langle v\,|\,w\rangle$ for all $w \in T_qM$. Let us denote $H_q = H|_{T_q^*M}$.
Recall that the differential of $H_q$ can be interpreted as a linear map $DH_q : T_q^*M \to T_qM$ that sends $\lambda \in T_q^*M$ into $D_\lambda H_q$, seen as a linear functional on $T_q^*M$, i.e. a tangent vector. This map is actually the inverse of the isomorphism $I$.

Lemma. $D_\lambda H_q = I^{-1}(\lambda)$.

Proof. It is a simple consequence of the formula $H(\lambda) = \frac{1}{2}\langle\lambda,I^{-1}(\lambda)\rangle$.

Corollary. Assume $I(v) = \lambda$; then $\vec H(\lambda) = \nabla_v$.

Proof. Since $\vec H$ is a horizontal vector field, it is sufficient to show that $\pi_*\vec H(\lambda) = v$, which is a consequence of the previous Lemma. Indeed for every vertical vector $\xi \in T_\lambda(T_q^*M)$ one has

$$\langle\xi,v\rangle = \langle\xi,I^{-1}(\lambda)\rangle = D_\lambda H(\xi) = \sigma(\xi,\vec H(\lambda)) = \langle\xi,\pi_*\vec H(\lambda)\rangle.$$

Since $\xi \in T_\lambda(T_q^*M)$ is arbitrary, one has the equality $v = \pi_*\vec H(\lambda)$.

Theorem. We have the following identity

$$R_{I(X)}(I(Y)) = R(X,Y)X, \qquad \forall\, X,Y \in T_qM. \tag{14.18}$$

Proof. We have to compute the quantity

$$R_{I(X)}(I(Y)) = -[\vec H,[\vec H,I(Y)]_{\mathrm{hor}}]_{\mathrm{ver}}(I(X)).$$

First notice that $\pi_*[\vec H,I(Y)] = -Y$, hence $[\vec H,I(Y)]_{\mathrm{hor}} = -\nabla_Y$. Then

$$-[\vec H,[\vec H,I(Y)]_{\mathrm{hor}}]_{\mathrm{ver}}(I(X)) = [\nabla_X,\nabla_Y]_{\mathrm{ver}}(I(X)) = R(X,Y)(X).$$

Definition. The Ricci tensor at $\lambda$ is defined as the trace of the curvature operator at $\lambda$,

$$\mathrm{Ric}(\lambda) := \mathrm{trace}\,R_\lambda.$$

Exercise. Prove the following expression for the Ricci tensor, where $X_1,\dots,X_n$ is a local orthonormal frame and $\dot\gamma(0) = v = I^{-1}(\lambda)$ is the tangent vector to the geodesic:

$$\mathrm{Ric}(\lambda) = \sum_{i=1}^n\langle R(v,X_i)v\,|\,X_i\rangle = \sum_{i=1}^n\sigma_\lambda([\vec H,\nabla_{X_i}],\nabla_{X_i}).$$

This shows that $\mathrm{Ric}(\lambda) = \mathrm{Ric}(v)$ coincides with the classical Riemannian Ricci tensor.

Locally flat spaces

In this section we show that the Riemannian curvature is the only obstruction for a Riemannian manifold to be locally Euclidean. Finally we show that the Riemannian curvature is also completely recovered by the Hamiltonian curvature $R_\lambda$.

A Riemannian manifold $M$ is called flat if $R(X,Y) = 0$ for every $X,Y \in \mathrm{Vec}(M)$.

Theorem. $M$ is flat if and only if $M$ is locally isometric to $\mathbb{R}^n$.

Proof. If $M$ is locally isometric to $\mathbb{R}^n$, then its curvature tensor is identically zero in a neighborhood of every point. Conversely, let us assume that the Riemann tensor $R$ vanishes identically and prove that $M$ is locally Euclidean. We will do so by showing that there exist coordinates in which the Hamiltonian is written as the Hamiltonian of the Euclidean $\mathbb{R}^n$.

Since $R$ is identically zero, the horizontal distribution (defined by the Levi-Civita connection) is involutive. Hence, by the Frobenius theorem, there exists a horizontal Lagrangian foliation of $T^*M$, i.e. for each $\lambda \in T^*M$ there exists a leaf $L_\lambda$ of the foliation passing through this point that is tangent to the horizontal space $\mathcal{H}_\lambda$. In particular each leaf is transversal to the fiber $T_q^*M$, where $q = \pi(\lambda)$.

Fix a point $q \in M$ and a neighborhood $O_q$ where $R$ is identically zero. Define the map

$$\Psi : \pi^{-1}(O_q) \to T_q^*M, \qquad \lambda \in \pi^{-1}(O_q) \mapsto L_\lambda\cap T_q^*M,$$

that assigns to each $\lambda$ the intersection of the leaf passing through this point with $T_q^*M$.

Exercise. Show that $\Psi$ is a linear, orthogonal transformation on each fiber, i.e. $H(\Psi(\lambda)) = H(\lambda)$ for all $\lambda \in \pi^{-1}(O_q)$. (Hint: use the linearity of the connection and the fact that $\vec H$ is horizontal.)

Fix now a basis $\{\nu_1,\dots,\nu_n\}$ in $T_q^*M$ that is orthonormal (with respect to the dual metric). Being $\Psi$ linear on fibers, we can write

$$\Psi(\lambda) = \sum_{i=1}^n\psi_i(\lambda)\nu_i, \qquad \text{where } \psi_i(\lambda) = \langle\lambda,X_i(q)\rangle,$$

for a suitable basis of vector fields $X_1,\dots,X_n$ in the neighborhood $O_q$. Moreover $X_1,\dots,X_n$ is an orthonormal basis since $\Psi$ is an orthogonal map.

We want to show that $\{X_1,\dots,X_n\}$ is an orthonormal basis of vector fields that commute everywhere. Let us show that the fact that the foliation is Lagrangian implies $[X_i,X_j] = 0$ for all $i,j = 1,\dots,n$. Indeed the tautological 1-form is written in these coordinates as $s = \sum_{i=1}^n\psi_i\nu_i$ and

$$\sigma = ds = \sum_{i=1}^n d\psi_i\wedge\nu_i + \psi_i\,d\nu_i. \tag{14.19}$$

Since on each leaf the function $\psi_i$ is constant by definition (hence $d\psi_i|_L = 0$), we have that $\sigma|_L = \sum_i\psi_i\,d\nu_i$.
In particular each leaf is Lagrangian if and only if $d\nu_i = 0$ for $i = 1,\dots,n$. Then, from the Cartan formula, one gets

$$0 = d\nu_i(X_j,X_k) = -\nu_i([X_j,X_k]), \qquad \forall\, i,j,k.$$

This proves that $[X_i,X_j] = 0$ for each $i,j = 1,\dots,n$. Hence, in the coordinate set $(\psi,q)$, we have $H(\psi,q) = \frac{1}{2}|\psi|^2$.

The next result shows that the Hamiltonian curvature can detect whether a manifold is flat or not.

Corollary. $M$ is flat if and only if $R_\lambda = 0$ for every $\lambda \in T^*M$.

Proof. Assume that $M$ is flat. Then $R$ is identically zero and a fortiori $R_\lambda = 0$ from (14.18). Let us prove the converse. Recall that $R_\lambda = 0$ implies, again by (14.18), that

$$(X,Y,X,W) = 0, \qquad \forall\, X,Y,W \in \mathrm{Vec}(M).$$

Then the statement is a consequence of the Proposition proved at the end of Section 14.2.

Exercise. Prove that actually the Riemann tensor $R$ is completely determined by the Hamiltonian curvature $R_\lambda$.

Example: curvature of the 2D Riemannian case

In this section we apply the definition of curvature discussed in this chapter to a two-dimensional Riemannian surface. As we explain, we recover that the Riemannian curvature tensor is determined by the Gauss curvature of the manifold.

Let $M$ be a 2-dimensional surface and $f_1,f_2 \in \mathrm{Vec}(M)$ be a local orthonormal frame for the Riemannian metric. The Riemannian Hamiltonian $H$ is written as follows (we use canonical coordinates $\lambda = (p,x)$ on $T^*M$):

$$H(p,x) = \frac{1}{2}\left(\langle p,f_1(x)\rangle^2 + \langle p,f_2(x)\rangle^2\right). \tag{14.20}$$

Here, for a covector $\lambda = (p,x) \in T^*M$, the symplectic vector space $\Sigma_\lambda = T_\lambda(T^*M)$ is 4-dimensional. Recall that, being $M$ 2-dimensional, the level set $H^{-1}(1/2)\cap T_x^*M$ is a circle. Hence, there is a well defined vector field that produces rotation on the reduced fiber. Let us define the angle $\theta$ on the level set $H^{-1}(1/2)\cap T_x^*M$ by setting

$$\langle p,f_1(x)\rangle = \cos\theta, \qquad \langle p,f_2(x)\rangle = \sin\theta,$$

in such a way that $\theta = 0$ corresponds to the direction of $f_1$. Denote by $\partial_\theta$ the rotation in the fiber of the unit tangent bundle and by $\vec E$ the Euler vector field. Denote finally $\vec H' := [\partial_\theta,\vec H]$. Notice that

$$\Sigma_\lambda = V_\lambda\oplus\mathcal{H}_\lambda, \qquad V_\lambda = \mathrm{span}\{\vec E,\partial_\theta\}, \qquad \mathcal{H}_\lambda = \mathrm{span}\{\vec H,\vec H'\}.$$

Lemma. The vector fields $\{\vec E,\partial_\theta,\vec H,\vec H'\}$ at $\lambda$ form a Darboux basis for $\Sigma_\lambda$.

Proof. We want to compute the following symplectic products of the vector fields:

$$\sigma(\partial_\theta,\vec E) = 0, \qquad \sigma(\partial_\theta,\vec H) = 0, \qquad \sigma(\vec E,\vec H) = 1, \tag{14.21}$$

$$\sigma(\partial_\theta,\vec H') = 1, \qquad \sigma(\vec E,\vec H') = 0, \qquad \sigma(\vec H,\vec H') = 0. \tag{14.22}$$

Let us prove first (14.21). The first equality follows from the fact that both vectors belong to the vertical subspace, which is Lagrangian.
The second one is a consequence of the fact that, by construction, $\partial_\theta$ is tangent to the level set of $H$, i.e. $\sigma(\partial_\theta,\vec H) = \partial_\theta(H) = \langle dH,\partial_\theta\rangle = 0$. The last identity is (13.1). As a preliminary step for the proof of (14.22) notice that, if $s = i_{\vec E}\sigma$ denotes the tautological Liouville form, one has

$$\langle s,\vec H\rangle = 1, \qquad \langle s,\vec H'\rangle = 0. \tag{14.23}$$

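These two identities follow from

$$\langle s,\vec H\rangle = \sigma(\vec E,\vec H) = 1, \tag{14.24}$$

$$\langle s,\vec H'\rangle = \langle s,[\partial_\theta,\vec H]\rangle = -ds(\partial_\theta,\vec H) = -\sigma(\partial_\theta,\vec H) = 0, \tag{14.25}$$

where in the second line we used the Cartan formula (4.74) and the fact that $\partial_\theta$ is vertical. Let us now prove (14.22). Being $[\partial_\theta,\vec H'] = [\partial_\theta,[\partial_\theta,\vec H]] = -\vec H$, we have again by the Cartan formula and (14.23)

$$\sigma(\partial_\theta,\vec H') = ds(\partial_\theta,\vec H') = -\langle s,[\partial_\theta,\vec H']\rangle = \langle s,\vec H\rangle = \sigma(\vec E,\vec H) = 1.$$

Moreover by (14.23)

$$\sigma(\vec E,\vec H') = \langle s,\vec H'\rangle = 0.$$

The last computation is similar. Let us write

$$\sigma(\vec H,\vec H') = -\langle dH,\vec H'\rangle = -\langle dH,[\partial_\theta,\vec H]\rangle,$$

and apply the Cartan formula to the last term (with $dH$ as 1-form):

$$dH([\partial_\theta,\vec H]) = -d^2H(\partial_\theta,\vec H) + \partial_\theta\langle dH,\vec H\rangle - \vec H\langle dH,\partial_\theta\rangle = 0,$$

since the three terms are all equal to zero.

Now we compute the curvature via the Jacobi curve, reduced by homogeneity. Notice that by Lemma 14.4 we can remove the symplectic space spanned by $\{\vec E,\vec H\}$ and, being $\{\vec E,\vec H\}^\angle = \mathrm{span}\{\partial_\theta,\vec H'\}$, we have

$$\hat J_\lambda(t) = \mathrm{span}\{e^{-t\vec H}_*\partial_\theta\}.$$

Then we define the generator of the Jacobi curve

$$V_t = e^{-t\vec H}_*\partial_\theta, \qquad \dot V_t = e^{-t\vec H}_*[\vec H,\partial_\theta] = -e^{-t\vec H}_*\vec H'.$$

Notice that

$$\sigma(V_t,\dot V_t) = -1 \qquad \text{for every } t \ge 0. \tag{14.26}$$

Indeed it is true for $t = 0$ and the equality is valid for all $t$ since the transformation $e^{-t\vec H}_*$ is symplectic. To compute the curvature of the Jacobi curve let us write, with $V := V_0$ and $\dot V := \dot V_0$,

$$V_t = \alpha(t)V - \beta(t)\dot V. \tag{14.27}$$

We claim that the matrix $S(t)$ representing the 1-dimensional Jacobi curve (which is actually a scalar) is given in these coordinates by

$$S(t) = \frac{\beta(t)}{\alpha(t)} = \frac{\sigma(V,V_t)}{\sigma(\dot V,V_t)}.$$

Indeed the identity

$$V_t = \alpha(t)V - \beta(t)\dot V = \alpha(t)\left(V - \frac{\beta(t)}{\alpha(t)}\dot V\right) \tag{14.28}$$

tells us that the line spanned by $V_t$ is the graph of the linear map $V \mapsto -\frac{\beta(t)}{\alpha(t)}\dot V$. Moreover, using that $\sigma(V,\dot V) = -1$, it is easy to compute

$$\sigma(V,V_t) = \alpha(t)\underbrace{\sigma(V,V)}_{=0} - \beta(t)\underbrace{\sigma(V,\dot V)}_{=-1} = \beta(t), \tag{14.29}$$

$$\sigma(\dot V,V_t) = \alpha(t)\underbrace{\sigma(\dot V,V)}_{=1} - \beta(t)\underbrace{\sigma(\dot V,\dot V)}_{=0} = \alpha(t). \tag{14.30}$$

Differentiating the identity (14.26) with respect to $t$ one gets the relations

$$\sigma(V_t,\ddot V_t) = 0, \qquad \sigma(V_t,V_t^{(3)}) = -\sigma(\dot V_t,\ddot V_t).$$

Notice that these quantities are constant with respect to $t$. Collecting the above results one can compute the asymptotic expansion of $S(t)$ with respect to $t$:

$$S(t) = \frac{-t + \frac{t^3}{6}\sigma(V,V^{(3)}) + O(t^5)}{1 + \frac{t^2}{2}\sigma(\dot V,\ddot V) + O(t^4)} \tag{14.31}$$

$$\phantom{S(t)} = \left(-t + \frac{t^3}{6}\sigma(V,V^{(3)}) + O(t^5)\right)\left(1 - \frac{t^2}{2}\sigma(\dot V,\ddot V) + O(t^4)\right), \tag{14.32}$$

and one gets for the derivatives of $S(t)$ at $t = 0$

$$\dot S(0) = -1, \qquad \ddot S(0) = 0, \qquad \dddot S(0) = 2\sigma(\dot V,\ddot V).$$

The formula for the curvature $R$ is finally computed in terms of $S(t)$ as follows:

$$R = -\frac{1}{2}\dddot S(0) = -\sigma(\dot V,\ddot V). \tag{14.33}$$

Using that $V_t = e^{-t\vec H}_*\partial_\theta$ we can expand $V_t$ as follows

$$V_t = \partial_\theta + t[\vec H,\partial_\theta] + \frac{t^2}{2}[\vec H,[\vec H,\partial_\theta]] + O(t^3),$$

hence (14.33) is rewritten as

$$R = \sigma([\vec H,[\vec H,\partial_\theta]],[\vec H,\partial_\theta]) \tag{14.34}$$

$$\phantom{R} = \sigma([\vec H,\vec H'],\vec H'). \tag{14.35}$$

To end this section, we compute the curvature $R$ with respect to the orthonormal frame $f_1,f_2$. Denote the Hamiltonians $h_i(p,x) = \langle p,f_i(x)\rangle$, $i = 1,2$. The PMP reads

$$\begin{cases}
\dot x = h_1f_1(x) + h_2f_2(x), \\
\dot h_1 = \{H,h_1\} = \{h_2,h_1\}h_2, \\
\dot h_2 = \{H,h_2\} = -\{h_2,h_1\}h_1.
\end{cases} \tag{14.36}$$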

Moreover $\{h_2,h_1\}(p,x) = \langle p,[f_2,f_1](x)\rangle$. Assume that

$$[f_1,f_2] = a_1f_1 + a_2f_2, \qquad a_i \in C^\infty(M).$$

Then $\{h_2,h_1\} = -a_1h_1 - a_2h_2$. If we restrict to $h_1 = \cos\theta$ and $h_2 = \sin\theta$, equations (14.36) become

$$\begin{cases}
\dot x = \cos\theta\,f_1 + \sin\theta\,f_2, \\
\dot\theta = a_1\cos\theta + a_2\sin\theta,
\end{cases}$$

and it is easy to compute the following expressions for $\vec H$ and the commutators (4):

$$\begin{aligned}
\vec H &= h_1f_1 + h_2f_2 + (a_1h_1 + a_2h_2)\,\partial_\theta, \\
\vec H' &= -h_2f_1 + h_1f_2 + (-a_1h_2 + a_2h_1)\,\partial_\theta, \\
[\vec H,\vec H'] &= (f_1a_2 - f_2a_1 - a_1^2 - a_2^2)\,\partial_\theta.
\end{aligned}$$

Recall that

$$\kappa = f_1a_2 - f_2a_1 - a_1^2 - a_2^2$$

is the Gaussian curvature of the surface $M$ (see also Chapter 4). Since $\sigma(\partial_\theta,\vec H') = 1$ one gets

$$R = \sigma([\vec H,\vec H'],\vec H') = \sigma(\kappa\,\partial_\theta,\vec H') = \kappa.$$

Exercise. In this exercise we recover the previous computations by introducing coordinates dual to our frame. Let $\nu_1,\nu_2$ be the dual basis to $f_1,f_2$ and set

$$f_\theta := h_1f_1 + h_2f_2, \qquad \nu_\theta := h_1\nu_1 + h_2\nu_2.$$

Define the smooth function $b := a_1h_1 + a_2h_2$ on $T^*M$. In this notation

$$\vec H = f_\theta + b\,\partial_\theta, \qquad \vec H' = f_{\theta'} + b'\,\partial_\theta,$$

where $'$ denotes the derivative with respect to $\theta$. Then, using that in these coordinates the tautological form is $s = \nu_\theta$, show that the symplectic form is written as

$$\sigma = ds = d\theta\wedge\nu_{\theta'} - b\,\nu_1\wedge\nu_2,$$

and compute the following expressions

$$i_{\vec H}\sigma = (b' - b)\nu_\theta - d\theta, \qquad [\vec H,\vec H'] = (f_\theta b' - f_{\theta'}b - b^2 - b'^2)\,\partial_\theta,$$

showing that this gives an alternative proof of the above computation of the curvature.

4 Here we still use the notation $h_1,h_2$ for the functions of $\theta$ satisfying $\partial_\theta h_1 = -h_2$, $\partial_\theta h_2 = h_1$.
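The frame formula $\kappa = f_1a_2 - f_2a_1 - a_1^2 - a_2^2$ can be tested on the model surfaces. The sketch below is an illustration only: it assumes local coordinates in which the frame fields and the functions $a_1,a_2$ are known in closed form, it approximates the directional derivatives by central finite differences, and the names `directional_derivative` and `gauss_curvature` are hypothetical. For the unit sphere in coordinates $(\vartheta,\varphi)$ with $f_1 = \partial_\vartheta$, $f_2 = \frac{1}{\sin\vartheta}\partial_\varphi$ one finds $[f_1,f_2] = -\cot\vartheta\,f_2$; for the hyperbolic half-plane with $f_1 = y\,\partial_x$, $f_2 = y\,\partial_y$ one finds $[f_1,f_2] = -f_1$.

```python
import math

def directional_derivative(func, field, point, h=1e-5):
    """Central-difference approximation of (field func)(point) in 2D coordinates."""
    vx, vy = field(point)
    x, y = point
    forward = func((x + h * vx, y + h * vy))
    backward = func((x - h * vx, y - h * vy))
    return (forward - backward) / (2 * h)

def gauss_curvature(f1, f2, a1, a2, point):
    """kappa = f1(a2) - f2(a1) - a1^2 - a2^2, where [f1, f2] = a1 f1 + a2 f2."""
    return (directional_derivative(a2, f1, point)
            - directional_derivative(a1, f2, point)
            - a1(point) ** 2 - a2(point) ** 2)

# Unit sphere, coordinates (polar angle, longitude).
sphere = dict(
    f1=lambda p: (1.0, 0.0),
    f2=lambda p: (0.0, 1.0 / math.sin(p[0])),
    a1=lambda p: 0.0,
    a2=lambda p: -math.cos(p[0]) / math.sin(p[0]),
)

# Hyperbolic upper half-plane with metric (dx^2 + dy^2) / y^2.
hyperbolic = dict(
    f1=lambda p: (p[1], 0.0),
    f2=lambda p: (0.0, p[1]),
    a1=lambda p: -1.0,
    a2=lambda p: 0.0,
)

kappa_sphere = gauss_curvature(point=(1.0, 0.3), **sphere)
kappa_hyperbolic = gauss_curvature(point=(0.2, 1.7), **hyperbolic)
```

Up to finite-difference error, `kappa_sphere` is close to $1$ and `kappa_hyperbolic` equals $-1$, the expected curvatures of the two model spaces.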

Chapter 15 Curvature in 3D contact sub-Riemannian geometry

The main goal of this chapter is to compute the curvature of the three-dimensional contact sub-Riemannian case. Then we will discuss some general features of the curvature in sub-Riemannian geometry.

3D contact sub-Riemannian manifolds

In this section we consider a sub-Riemannian manifold $M$ of dimension 3 whose distribution is defined as the kernel of a contact 1-form $\omega \in \Lambda^1(M)$, i.e. $\mathcal{D}_q = \ker\omega_q$ for all $q \in M$. Let us also fix a local orthonormal frame $f_1,f_2$ such that

$$\mathcal{D}_q = \ker\omega_q = \mathrm{span}\{f_1(q),f_2(q)\}.$$

Recall that the 1-form $\omega \in \Lambda^1(M)$ defines a contact distribution if and only if $\omega\wedge d\omega$ is never vanishing.

Exercise. Let $M$ be a 3D manifold, $\omega \in \Lambda^1M$ and $\mathcal{D} = \ker\omega$. The following are equivalent:
(i) $\omega$ is a contact 1-form,
(ii) $d\omega|_{\mathcal{D}} \neq 0$,
(iii) for all $f_1,f_2 \in \mathcal{D}$ linearly independent, $[f_1,f_2] \notin \mathcal{D}$.

Remark. The contact form $\omega$ is defined up to multiplication by a smooth function, i.e. if $\omega$ is a contact form, then $a\omega$ is a contact form for every never-vanishing $a \in C^\infty(M)$. This lets us normalize the contact form by requiring that

$$d\omega|_{\mathcal{D}} = \nu_1\wedge\nu_2 \qquad (\text{i.e. } d\omega(f_1,f_2) = 1),$$

where $\nu_1,\nu_2$ is the dual basis to $f_1,f_2$. This is equivalent to saying that $d\omega$ is equal to the area form induced on the distribution by the sub-Riemannian scalar product.

Definition. The Reeb vector field of the contact structure is the unique vector field $f_0 \in \mathrm{Vec}(M)$ that satisfies

$$d\omega(f_0,\cdot) = 0, \qquad \omega(f_0) = 1.$$

In particular $f_0$ is transversal to the distribution and the triple $\{f_0,f_1,f_2\}$ defines a basis of $T_qM$ at every point $q \in M$. Notice that $\omega,\nu_1,\nu_2$ is the dual basis to this frame.

Remark. The flow generated by the Reeb vector field $e^{tf_0} : M \to M$ is a group of diffeomorphisms that satisfy $(e^{tf_0})^*\omega = \omega$. Indeed

$$L_{f_0}\omega = d(i_{f_0}\omega) + i_{f_0}d\omega = 0,$$

since $i_{f_0}\omega = \omega(f_0) = 1$ is constant and $i_{f_0}d\omega = d\omega(f_0,\cdot) = 0$.

In what follows, to simplify the notation, we will replace the contact form $\omega$ by $\nu_0$, as the dual element to the vector field $f_0$. We can write the structure equations of this basis of 1-forms:

$$\begin{cases}
d\nu_0 = \nu_1\wedge\nu_2, \\
d\nu_1 = c^1_{01}\,\nu_0\wedge\nu_1 + c^1_{02}\,\nu_0\wedge\nu_2 + c^1_{12}\,\nu_1\wedge\nu_2, \\
d\nu_2 = c^2_{01}\,\nu_0\wedge\nu_1 + c^2_{02}\,\nu_0\wedge\nu_2 + c^2_{12}\,\nu_1\wedge\nu_2.
\end{cases} \tag{15.1}$$

The structure constants $c^k_{ij}$ are smooth functions on the manifold. Recall that the equation

$$d\nu_k = \sum_{i,j=0}^2 c^k_{ij}\,\nu_i\wedge\nu_j \qquad \text{holds if and only if} \qquad [f_j,f_i] = \sum_{k=0}^2 c^k_{ij}f_k.$$

Introduce the coordinates $(h_0,h_1,h_2)$ in each fiber of $T^*M$ induced by the dual frame

$$\lambda = h_0\nu_0 + h_1\nu_1 + h_2\nu_2,$$

where $h_i(\lambda) = \langle\lambda,f_i(q)\rangle$ are the Hamiltonians linear on fibers associated with $f_i$, for $i = 0,1,2$. The sub-Riemannian Hamiltonian is written as follows:

$$H = \frac{1}{2}(h_1^2 + h_2^2).$$

We now compute the Poisson bracket $\{H,h_0\}$, denoting by $\{H,h_0\}_q$ its restriction to the fiber $T_q^*M$.

Proposition. The Poisson bracket $\{H,h_0\}_q$ is a quadratic form. Moreover we have

$$\{H,h_0\} = c^1_{01}h_1^2 + (c^2_{01} + c^1_{02})h_1h_2 + c^2_{02}h_2^2, \tag{15.2}$$

$$c^1_{01} + c^2_{02} = 0. \tag{15.3}$$

Notice that $\mathcal{D}_q^\perp \subset \ker\{H,h_0\}_q$ and $\{H,h_0\}_q$ can be treated as a quadratic form on $T_q^*M/\mathcal{D}_q^\perp = \mathcal{D}_q^*$.

Proof. Using the equality $\{h_i,h_j\}(\lambda) = \langle\lambda,[f_i,f_j](q)\rangle$ we get

$$\begin{aligned}
\{H,h_0\} &= \frac{1}{2}\{h_1^2 + h_2^2,h_0\} = h_1\{h_1,h_0\} + h_2\{h_2,h_0\} \\
&= h_1(c^1_{01}h_1 + c^2_{01}h_2) + h_2(c^1_{02}h_1 + c^2_{02}h_2) \\
&= c^1_{01}h_1^2 + (c^2_{01} + c^1_{02})h_1h_2 + c^2_{02}h_2^2.
\end{aligned}$$

Differentiating the first equation in (15.1) one gets:

$$\begin{aligned}
0 = d^2\nu_0 &= d\nu_1\wedge\nu_2 - \nu_1\wedge d\nu_2 \\
&= (c^1_{01}\,\nu_0\wedge\nu_1)\wedge\nu_2 - \nu_1\wedge(c^2_{02}\,\nu_0\wedge\nu_2) \\
&= (c^1_{01} + c^2_{02})\,\nu_0\wedge\nu_1\wedge\nu_2,
\end{aligned}$$

which proves (15.3).

Remark. Being $\{H,h_0\}_q$ a quadratic form on the Euclidean plane $\mathcal{D}_q$ (using the canonical identification of the vector space $\mathcal{D}_q$ with its dual $\mathcal{D}_q^*$ given by the scalar product), it can be interpreted as a symmetric operator on the plane itself. In particular its determinant and its trace are well defined. From (15.3) we get

$$\mathrm{trace}\{H,h_0\}_q = c^1_{01} + c^2_{02} = 0.$$

This identity is a consequence of the fact that the flow defined by the normalized Reeb vector field $f_0$ preserves not only the distribution but also the area form on it.

It is natural then to define our first invariant as the positive eigenvalue of this operator, namely:

$$\chi(q) = \sqrt{-\det\{H,h_0\}_q}. \tag{15.4}$$

Notice that the function $\chi$ measures an intrinsic quantity, since both $H$ and $h_0$ are defined only by the sub-Riemannian structure and are independent of the choice of the orthonormal frame. Indeed the quantity $\{H,h_0\}$ computes the derivative of $H$ along the flow of $\vec h_0$, i.e. the obstruction for the flow of the Reeb field $f_0$ (which preserves the distribution and the volume form on it) to preserve the metric. Notice that, by definition, $\chi \ge 0$.

Corollary. Assume that the vector field $f_0$ is complete. Then $\{e^{tf_0}\}_{t\in\mathbb{R}}$ is a group of sub-Riemannian isometries if and only if $\chi \equiv 0$.

In the case when $\chi \equiv 0$ one can consider (locally) the quotient of $M$ with respect to the action of this group, i.e. the space of trajectories described by $f_0$. The two-dimensional surface defined by the quotient structure is endowed with a well defined Riemannian metric. The sub-Riemannian structure on $M$ coincides with the isoperimetric Dido problem constructed on this surface.
The Heisenberg case corresponds to the case when the surface has zero Gaussian curvature.

Curvature of a 3D contact structure

In this section we compute the sub-Riemannian curvature of a 3D contact structure with a technique similar to that used in Section 14.5 for the 2D Riemannian case. Let us consider the level set $\{H = 1/2\} = \{h_1^2 + h_2^2 = 1\}$ and define the coordinate $\theta$ in such a way that

$$h_1 = \cos\theta, \qquad h_2 = \sin\theta.$$

On the bundle $T^*M\cap H^{-1}(1/2)$ we introduce coordinates $(x,\theta,h_0)$. Notice that each fiber is topologically a cylinder $S^1\times\mathbb{R}$.

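The sub-Riemannian Hamiltonian equations written in these coordinates are

$$\begin{cases}
\dot x = h_1f_1(x) + h_2f_2(x), \\
\dot h_1 = \{H,h_1\} = \{h_2,h_1\}h_2, \\
\dot h_2 = \{H,h_2\} = -\{h_2,h_1\}h_1, \\
\dot h_0 = \{H,h_0\}.
\end{cases} \tag{15.5}$$

Computing the Poisson bracket $\{h_2,h_1\} = h_0 + c^1_{12}h_1 + c^2_{12}h_2$ and introducing the two functions $a,b : T^*M \to \mathbb{R}$ given by

$$a = \{H,h_0\} = \sum_{i,j=1}^2 c^j_{0i}h_ih_j, \qquad b := c^1_{12}h_1 + c^2_{12}h_2,$$

we can rewrite the system, when restricted to $H^{-1}(1/2)$, as follows:

$$\begin{cases}
\dot x = \cos\theta\,f_1 + \sin\theta\,f_2, \\
\dot\theta = -h_0 - b, \\
\dot h_0 = a.
\end{cases} \tag{15.6}$$

Notice that, while $a$ is intrinsic, the function $b$ depends on the choice of the orthonormal frame. In particular we have for the Hamiltonian vector field in the coordinates $(q,\theta,h_0)$ (where we use $h_1,h_2$ as a shorthand for $\cos\theta$ and $\sin\theta$):

$$\vec H = h_1f_1 + h_2f_2 - (h_0 + b)\,\partial_\theta + a\,\partial_{h_0}, \tag{15.7}$$

$$[\partial_\theta,\vec H] = \vec H' = -h_2f_1 + h_1f_2 + a'\,\partial_{h_0} - b'\,\partial_\theta, \tag{15.8}$$

where we denote by $'$ the derivative with respect to $\theta$, e.g. $h_1' = -h_2$ and $h_2' = h_1$.

Now consider the symplectic vector space $\Sigma_\lambda = T_\lambda(T^*M)$. The vertical subspace $V_\lambda$ is generated by the vectors $\partial_\theta,\partial_{h_0},\vec E$. Hence the Jacobi curve is

$$J_\lambda(t) = \mathrm{span}\{e^{-t\vec H}_*\partial_\theta,\ e^{-t\vec H}_*\partial_{h_0},\ e^{-t\vec H}_*\vec E\}.$$

The first reduction, by homogeneity, lets us split the space

$$\Sigma_\lambda = \mathrm{span}\{\vec E,\vec H\}\oplus\mathrm{span}\{\vec E,\vec H\}^\angle$$

and consider the reduced Jacobi curve $\Lambda(t) := \hat J_\lambda(t)$ in the 4-dimensional symplectic space:

$$\Lambda(t) := e^{-t\vec H}_*V_\lambda/\mathbb{R}\vec H = \mathrm{span}\{e^{-t\vec H}_*\partial_\theta,\ e^{-t\vec H}_*\partial_{h_0}\}/\mathbb{R}\vec H.$$

Next we describe the second reduction of the Jacobi curve, the one related to the fact that the curve is non-regular. Indeed notice that the rank of $\hat J_\lambda(t)$ is 1. To find the new reduced curve, we need to compute the kernel of the derivative of the curve at $t = 0$. From the definition of $\Lambda := \Lambda(0)$ it follows that, for $\Gamma := \ker\dot\Lambda(0)$,

$$\dot\Lambda(\partial_\theta) = \pi_*[\vec H,\partial_\theta] = h_2f_1 - h_1f_2, \qquad \dot\Lambda(\partial_{h_0}) = \pi_*[\vec H,\partial_{h_0}] = \pi_*(\partial_\theta) = 0.$$

Hence $\Gamma = \mathbb{R}\partial_{h_0}$ and $\Gamma^\angle$ is 3-dimensional in $V_\lambda/\mathbb{R}\vec H$.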

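Proposition. We have the following characterizations:
(i) $\Gamma^\angle = \mathrm{span}\{\partial_{h_0},\partial_\theta,\vec H'\}$ in $V_\lambda/\mathbb{R}\vec H$,
(ii) $\{\partial_\theta,\vec H'\}$ is a Darboux basis for $\Gamma^\angle/\Gamma$.

Proof. Since $\partial_{h_0}$ and $\partial_\theta$ are vertical, to prove (i) it is enough to show that $\vec H'$ is skew-orthogonal to $\partial_{h_0}$. It is easy to compute, by the Cartan formula,

$$\sigma(\partial_{h_0},\vec H') = \partial_{h_0}\langle s,\vec H'\rangle - \vec H'\langle s,\partial_{h_0}\rangle - \langle s,[\partial_{h_0},\vec H']\rangle = 0,$$

since all three terms vanish. Indeed $\langle s,\vec H'\rangle = \sigma(\vec E,\vec H') = 0$ and $\langle s,\partial_{h_0}\rangle = \langle s,[\partial_{h_0},\vec H']\rangle = 0$, since $\partial_{h_0}$ and $[\partial_{h_0},\vec H']$ are both vertical, as can be computed from (15.8). To complete the proof of (ii) it is enough to show, using $[\partial_\theta,\vec H'] = -\vec H$, that

$$\sigma(\partial_\theta,\vec H') = \partial_\theta\langle s,\vec H'\rangle - \vec H'\langle s,\partial_\theta\rangle - \langle s,[\partial_\theta,\vec H']\rangle = \langle s,\vec H\rangle = 1.$$

Next we compute the curvature in terms of the Hamiltonian vector field and its commutators. For a vector field $W$ we use the notations $\dot W := [\vec H,W]$ and $W' := [\partial_\theta,W]$. Let us consider the vector field $V_t = e^{-t\vec H}_*\partial_{h_0}$. Notice that $\dot V = \partial_\theta$ and $\ddot V = -\vec H'$. The fact that $\partial_\theta$ and $\partial_{h_0}$ are vertical implies that

$$\sigma(V_t,\dot V_t) = 0, \qquad \forall\, t.$$

Differentiating the above identity at $t = 0$ we get (from now on, we omit $t$ when we evaluate at $t = 0$)

$$\sigma(\dot V,\dot V) + \sigma(V,\ddot V) = 0 \implies \sigma(V,\ddot V) = 0.$$

Differentiating once more the last identity and using $\sigma(\dot V,\ddot V) = -\sigma(\partial_\theta,\vec H') = -1$ one gets

$$\sigma(\dot V,\ddot V) + \sigma(V,V^{(3)}) = 0 \implies \sigma(V,V^{(3)}) = 1.$$

With similar computations one can show that

$$\sigma(\dot V,V^{(3)}) = \sigma(V,V^{(4)}) = 0.$$

Evaluating all derivatives of order 4 one can see that

$$r := \sigma(\ddot V,V^{(3)}) = -\sigma(\dot V,V^{(4)}) = \sigma(V,V^{(5)}).$$

Proposition. The sub-Riemannian curvature is

$$R = \frac{1}{10}\sigma([\vec H,\vec H'],\vec H') = -\frac{r}{10}.$$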

Proof. The second equality follows from the definition of $r$ and the fact that $\ddot V = -\vec H'$ and $V^{(3)} = -[\vec H, \vec H']$. To prove the first identity we have to compute the Schwarzian derivative of the bi-reduced curve, in the symplectic basis $(\dot V, -\ddot V)$ of the space $\Gamma^{\angle}/\Gamma$ (notice the minus sign). Recall that $\Lambda(t) = \mathrm{span}\{V_t, \dot V_t\}$. To compute the 1-dimensional reduced curve $\Lambda^{\Gamma}(t)$ in the symplectic space $\Gamma^{\angle}/\Gamma$ we need to compute the intersection of $\Lambda(t)$ with $\Gamma^{\angle}$ (for all $t$). In other words we look for $x(t)$ such that

$$\sigma(\dot V_t + x(t)V_t, V) = 0 \implies x(t) = -\frac{\sigma(\dot V_t, V)}{\sigma(V_t, V)}. \tag{15.9}$$

Then we write this vector as a linear combination of the Darboux basis (cf. (14.28) for the 2D Riemannian case)

$$\dot V_t + x(t)V_t = \alpha(t)\dot V - \beta(t)\ddot V + \xi(t)V. \tag{15.10}$$

To see it as a curve in the space $\Gamma^{\angle}/\Gamma$ we simply ignore the coefficient along $V$. In these coordinates the matrix $S(t)$, which is a scalar, representing the curve is

$$S(t) = \frac{\beta(t)}{\alpha(t)}. \tag{15.11}$$

Notice that this is a one-dimensional non-degenerate curve. These coefficients are computed by the symplectic products

$$\alpha(t) = -\sigma(\dot V_t + x(t)V_t, \ddot V), \tag{15.12}$$

$$\beta(t) = -\sigma(\dot V_t + x(t)V_t, \dot V). \tag{15.13}$$

Combining (15.12), (15.13) with (15.11) and (15.9) one gets

$$S(t) = \frac{\sigma(\dot V_t, \dot V)\,\sigma(V_t, V) - \sigma(V_t, \dot V)\,\sigma(\dot V_t, V)}{\sigma(\dot V_t, \ddot V)\,\sigma(V_t, V) - \sigma(V_t, \ddot V)\,\sigma(\dot V_t, V)}. \tag{15.14}$$

After some computations, by Taylor expansion one gets

$$S(t) = -\frac{t}{4} - \frac{t^3}{120}\,r + O(t^4). \tag{15.15}$$

Since $\ddot S(0) = 0$ the curvature is computed by

$$R = \frac{\dddot S(0)}{2\dot S(0)} = \frac{r}{10}.$$

We end this section by computing the expression of the curvature in terms of the orthonormal frame for the distribution and the Reeb vector field. As usual we restrict to the level set $H^{-1}(1/2)$ where

$$h_1^2 + h_2^2 = 1, \qquad h_1 = \cos\theta, \qquad h_2 = \sin\theta.$$

In the following we use the notation

$$f_\theta = h_1 f_1 + h_2 f_2, \qquad \nu_\theta = h_1\nu_1 + h_2\nu_2.$$

If $h = (h_1,h_2) = (\cos\theta,\sin\theta)$ we denote by $h' = (-h_2,h_1) = (-\sin\theta,\cos\theta)$ its derivative with respect to $\theta$ and, more generally, we denote $F' := \partial_\theta F$ for a smooth function $F$ on $T^*M$. To express the quantity $r = \sigma([\vec H, \vec H'], \vec H')$ we start by computing the commutator $[\vec H, \vec H']$. From (15.7) and (15.8) one gets, up to a multiple of $\partial_{h_0}$ (which does not contribute to the symplectic product computed below),

$$[\vec H, \vec H'] = -f_0 + h_0 f_\theta + \big(f_2(c_{12}^1) - f_1(c_{12}^2) - (h_0+b)\,b - (b')^2 + a'\big)\,\partial_\theta.$$

Next we write, following this notation, the symplectic form $\sigma = ds$. The Liouville form $s$ is expressed, in the dual basis $\nu_0,\nu_1,\nu_2$ to the basis of vector fields $f_1,f_2,f_0$, as follows

$$s = h_0\nu_0 + \nu_\theta,$$

hence the symplectic form $\sigma$ is written as follows

$$\sigma = dh_0\wedge\nu_0 + h_0\,\nu_\theta\wedge\nu_\theta' + d\theta\wedge\nu_\theta' + d\nu_\theta,$$

where we used that $d\nu_0 = \nu_1\wedge\nu_2 = \nu_\theta\wedge\nu_\theta'$. Computing the symplectic product one then finds

$$10R = h_0^2 + \frac{3}{2}a' + \kappa, \qquad \text{where} \qquad \kappa = f_2(c_{12}^1) - f_1(c_{12}^2) - (c_{12}^1)^2 - (c_{12}^2)^2 + \frac{c_{01}^2 - c_{02}^1}{2}. \tag{15.16}$$

By homogeneity, the function $R$ is defined on the whole $T^*M$, and not only for $\lambda \in H^{-1}(1/2)$. For every $\lambda = (h_0,h_1,h_2) \in T^*_xM$

$$10R = h_0^2 + \frac{3}{2}a' + \kappa\,(h_1^2 + h_2^2).$$

Remark. The restriction of $R$ to the 1-dimensional subspace $\lambda \in D^{\perp}$ (that corresponds to $\lambda = (h_0,0,0)$) is a strictly positive quadratic form. Moreover it is equal to $1/10$ when evaluated on the Reeb vector field. Hence the curvature $R$ encodes both the contact form $\omega$ and its normalization. On the orthogonal complement (with respect to $R$) $\{h_0 = 0\}$, $R$ is the quadratic form

$$10R = \frac{3}{2}a' + \kappa\,(h_1^2 + h_2^2).$$

Remark. (i) If $a \neq 0$ there always exists a frame such that $a = 2\chi h_1 h_2$, and in this frame we can express $R$ as a quadratic form on the whole $T^*M$:

$$10R = h_0^2 + (\kappa + 3\chi)h_1^2 + (\kappa - 3\chi)h_2^2.$$

It is easily seen from this formula that we can recover the two invariants $\chi,\kappa$ by considering

$$\mathrm{trace}\big(10R\big|_{h_0=0}\big) = 2\kappa, \qquad \mathrm{discr}\big(10R\big|_{h_0=0}\big) = 36\chi^2.$$

(ii) When $a = 0$ the eigenvalues of $R$ coincide and $\chi = 0$. In this case $\kappa$ represents the Riemannian curvature of the surface defined by the quotient of $M$ with respect to the flow of the Reeb vector field.

Indeed the flow $e^{tf_0}$ preserves the metric and it is easy to see that the identities

$$e^{tf_0}_* f_i = f_i, \qquad i = 1,2,$$

imply $[f_0,f_1] = [f_0,f_2] = 0$. Hence $c_{01}^2 = c_{02}^1 = 0$ and the expression of $\kappa$ reduces to the Riemannian curvature of a surface whose orthonormal frame is $f_1,f_2$.

Exercise. Let $f_1,f_2$ be an orthonormal frame for $M$ and denote by $\hat f_1,\hat f_2$ the frame obtained by rotating $f_1,f_2$ by an angle $\theta = \theta(q)$. Show that the structure constants $\hat c_{ij}^k$ of the rotated frame satisfy

$$\hat c_{12}^1 = \cos\theta\,(c_{12}^1 - f_1(\theta)) - \sin\theta\,(c_{12}^2 - f_2(\theta)), \qquad \hat c_{12}^2 = \sin\theta\,(c_{12}^1 - f_1(\theta)) + \cos\theta\,(c_{12}^2 - f_2(\theta)).$$

Exercise. Show that the expression (15.16) for $\kappa$ does not depend on the choice of an orthonormal frame $f_1,f_2$ for the sub-Riemannian structure.
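The relation between the curvature and the two invariants can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `ten_R` and `invariants` and the sample values of $\kappa$ and $\chi$ are illustrative assumptions. It takes the quadratic form $10R = h_0^2 + \frac{3}{2}a' + \kappa(h_1^2+h_2^2)$ in a frame where $a = 2\chi h_1 h_2$, so that $a' = 2\chi(h_1^2 - h_2^2)$ after extension by homogeneity, and reads off the trace and the discriminant of its restriction to $\{h_0 = 0\}$.

```python
import math

def ten_R(h0, h1, h2, kappa, chi):
    """10*R in a frame where a = 2*chi*h1*h2 (assumed normal form)."""
    # a' is the theta-derivative of a, extended by homogeneity
    # to a quadratic form in (h1, h2).
    a_prime = 2.0 * chi * (h1 * h1 - h2 * h2)
    return h0 * h0 + 1.5 * a_prime + kappa * (h1 * h1 + h2 * h2)

def invariants(kappa, chi):
    """Trace and discriminant of 10*R restricted to {h0 = 0}.

    The restricted form has no h1*h2 cross term, so its eigenvalues
    are its values on the coordinate directions (1, 0) and (0, 1).
    """
    lam1 = ten_R(0.0, 1.0, 0.0, kappa, chi)
    lam2 = ten_R(0.0, 0.0, 1.0, kappa, chi)
    return lam1 + lam2, (lam1 - lam2) ** 2

kappa, chi = 0.7, 0.3  # arbitrary sample values (hypothetical)
trace, discr = invariants(kappa, chi)
```

Here the discriminant is taken to be the squared difference of the two eigenvalues, which is why it returns the square of $6\chi$; the trace returns $2\kappa$.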

Chapter 16 Asymptotic expansion of the 3D contact exponential map

In this chapter we study the small time asymptotics of the exponential map in the three-dimensional contact case and see how the structure of the cut and the conjugate locus is encoded in the curvature. Let us consider the sub-Riemannian Hamiltonian of a 3D contact structure (cf. Section )

$$\vec H = h_1 f_1 + h_2 f_2 - (h_0+b)\,\partial_\theta + a\,\partial_{h_0} \tag{16.1}$$

written in the dual coordinates $(h_0,h_1,h_2)$ of a local frame $f_0,f_1,f_2$, where $\nu_0$ is the normalized contact form, $f_0$ is the Reeb vector field and $f_1,f_2$ is a local orthonormal frame for the sub-Riemannian structure. As usual the coordinate $\theta$ on the level set $H^{-1}(1/2)$ is defined in such a way that $h_1 = \cos\theta$ and $h_2 = \sin\theta$. In this chapter it will be convenient to introduce the notation $\rho := -h_0$ for the function linear on fibers of $T^*M$ associated with the opposite of the Reeb vector field. The Hamiltonian system (16.1) on the level set $H^{-1}(1/2)$ is rewritten in the following form:

$$\dot q = \cos\theta\, f_1 + \sin\theta\, f_2, \qquad \dot\theta = \rho - b, \qquad \dot\rho = -a. \tag{16.2}$$

The exponential map starting from the initial point $q_0 \in M$ is the map that to each time $t > 0$ and every initial covector $(\theta_0,\rho_0) \in T^*_{q_0}M \cap H^{-1}(1/2)$ assigns the first component of the solution at time $t$ of the system (16.2), denoted by $E_{q_0}(t,\theta_0,\rho_0)$, or simply $E(t,\theta_0,\rho_0)$. Conjugate points are points where the differential of the exponential map is not surjective, i.e. solutions to the equation

$$\frac{\partial E}{\partial\theta_0} \wedge \frac{\partial E}{\partial\rho_0} \wedge \frac{\partial E}{\partial t} = 0. \tag{16.3}$$

The variation of the exponential map along time is always nonzero and independent with respect to variations of the covectors in the set $H^{-1}(1/2)$ (see also Section 7.6 and Proposition 7.28). This implies that (16.3) is equivalent to

$$\frac{\partial E}{\partial\theta_0} \wedge \frac{\partial E}{\partial\rho_0} = 0. \tag{16.4}$$

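16.1 Nilpotent case

The nilpotent case, i.e. the Heisenberg group, corresponds to the case when the functions $a$ and $b$ vanish identically, i.e. the system

$$\dot q = \cos\theta\, f_1 + \sin\theta\, f_2, \qquad \dot\theta = \rho, \qquad \dot\rho = 0. \tag{16.5}$$

Let us first recover, in this notation, the conjugate locus in the case of the Heisenberg group. Let us denote coordinates on the manifold $\mathbb{R}^3$ as follows

$$q = (x,y), \qquad x = (x_1,x_2) \in \mathbb{R}^2, \quad y \in \mathbb{R}. \tag{16.6}$$

Notice moreover that in this case the Reeb vector field is proportional to $\partial_y$ and its dual coordinate $\rho$ is constant along trajectories. There are two possible cases:

(i) $\rho = 0$. Then the solution is a straight line contained in the plane $y = 0$ and is optimal for all time.

(ii) $\rho \neq 0$. In this case we claim that the equation (16.4) is equivalent to the following

$$\frac{\partial x}{\partial\theta_0} \wedge \frac{\partial x}{\partial\rho_0} = 0. \tag{16.7}$$

By the Gauss Lemma (Proposition 7.28) the covector $p = (p_x,\rho)$ at the final point annihilates the differential of the exponential map restricted to the level set, i.e.

$$\Big\langle p, \frac{\partial E}{\partial\theta_0}\Big\rangle = \Big\langle p_x, \frac{\partial x}{\partial\theta_0}\Big\rangle + \rho\,\frac{\partial y}{\partial\theta_0} = 0, \tag{16.8}$$

$$\Big\langle p, \frac{\partial E}{\partial\rho_0}\Big\rangle = \Big\langle p_x, \frac{\partial x}{\partial\rho_0}\Big\rangle + \rho\,\frac{\partial y}{\partial\rho_0} = 0, \tag{16.9}$$

and since $\rho \neq 0$ it follows that among the three column vectors of

$$\begin{pmatrix} \dfrac{\partial x_1}{\partial\theta_0} & \dfrac{\partial x_2}{\partial\theta_0} & \dfrac{\partial y}{\partial\theta_0} \\[2mm] \dfrac{\partial x_1}{\partial\rho_0} & \dfrac{\partial x_2}{\partial\rho_0} & \dfrac{\partial y}{\partial\rho_0} \end{pmatrix} \tag{16.10}$$

the third one is always a linear combination of the first two.

Proposition. The first conjugate time is $t_c(\theta_0,\rho_0) = 2\pi/|\rho_0|$.

Proof. In the standard coordinates $(x_1,x_2,y)$ the two vector fields $f_1$ and $f_2$ defining the orthonormal frame are

$$f_1 = \partial_{x_1} - \frac{x_2}{2}\,\partial_y, \qquad f_2 = \partial_{x_2} + \frac{x_1}{2}\,\partial_y.$$

Thus, the first two coordinates of the horizontal part of the Hamiltonian system satisfy

$$\begin{cases} \dot x_1 = \cos\theta \\ \dot x_2 = \sin\theta \end{cases} \tag{16.11}$$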

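It is then easy to integrate the $x$-part of the exponential map, since $\theta(t) = \theta_0 + \rho t$ (recall that $\rho \equiv \rho_0$ and, without loss of generality, we can assume $\rho > 0$):

$$x(t;\theta_0,\rho) = \int_0^t \begin{pmatrix} \cos(\theta_0+\rho s) \\ \sin(\theta_0+\rho s) \end{pmatrix} ds = \int_{\theta_0/\rho}^{\theta_0/\rho+t} \begin{pmatrix} \cos\rho s \\ \sin\rho s \end{pmatrix} ds. \tag{16.12}$$

Due to the symmetry of the Heisenberg group, the determinant of the Jacobian map will not depend on $\theta_0$. Hence to compute the determinant of the Jacobian it is enough to compute partial derivatives at $\theta_0 = 0$:

$$\frac{\partial x}{\partial\theta_0} = \frac{1}{\rho}\begin{pmatrix} \cos\rho t - 1 \\ \sin\rho t \end{pmatrix}, \qquad \frac{\partial x}{\partial\rho} = -\frac{1}{\rho^2}\begin{pmatrix} \sin\rho t \\ 1 - \cos\rho t \end{pmatrix} + \frac{t}{\rho}\begin{pmatrix} \cos\rho t \\ \sin\rho t \end{pmatrix},$$

and denoting by $\tau := \rho t$ one can compute

$$\frac{\partial x}{\partial\theta_0} \wedge \frac{\partial x}{\partial\rho} = \frac{1}{\rho^3}\det\begin{pmatrix} \cos\tau - 1 & \tau\cos\tau - \sin\tau \\ \sin\tau & -1 + \tau\sin\tau + \cos\tau \end{pmatrix} = -\frac{1}{\rho^3}\,(\tau\sin\tau + 2\cos\tau - 2).$$

The fact that $t_c = 2\pi/|\rho|$ follows from the exercise below.

Exercise. Prove that $\tau_c = 2\pi$ is the first positive root of the equation $\tau\sin\tau + 2\cos\tau - 2 = 0$. Moreover show that $\tau_c$ is a simple root.

General case: second order asymptotic expansion

Let us consider the Hamiltonian system for the general 3D contact case

$$\dot q = f_\theta := \cos\theta\, f_1 + \sin\theta\, f_2, \qquad \dot\theta = \rho - b, \qquad \dot\rho = -a. \tag{16.13}$$

We are going to study the asymptotic expansion of our system for the initial parameter $\rho_0 \to \pm\infty$. To this aim, it is convenient to introduce the change of variables $r := 1/\rho$ and denote by $\nu := r(0) = 1/\rho_0$ its initial value. Notice that $\rho$ is no longer constant in the general case and that $\rho_0 \to \pm\infty$ implies $\nu \to 0$. The main result of this section says that the conjugate time for the perturbed system is a perturbation of the conjugate time of the nilpotent case, where the perturbation has no term of order 2.

Proposition. The conjugate time $t_c(\theta_0,\nu)$ is a smooth function of the parameter $\nu$ for $\nu > 0$. Moreover for $\nu \to 0$

$$t_c(\theta_0,\nu) = 2\pi|\nu| + O(|\nu|^3).$$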
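Before turning to the proof, the Heisenberg value $\tau_c = 2\pi$ that this proposition perturbs can be checked numerically. The following is a minimal sketch under stated assumptions: the function names, the finite-difference step and the scanning step are illustrative choices, not part of the original text. It uses the closed form of the integral (16.12), differentiates it with respect to $\theta_0$ and $\rho$ by central differences, and locates the first positive root of $\tau\sin\tau + 2\cos\tau - 2$ by a sign-change scan followed by bisection.

```python
import math

def x_heis(t, theta0, rho):
    """Closed form of the integral (16.12), valid for rho != 0."""
    return ((math.sin(theta0 + rho * t) - math.sin(theta0)) / rho,
            (math.cos(theta0) - math.cos(theta0 + rho * t)) / rho)

def jac_det(t, theta0, rho, h=1e-5):
    """Central-difference determinant of the Jacobian of x in (theta0, rho)."""
    xp, xm = x_heis(t, theta0 + h, rho), x_heis(t, theta0 - h, rho)
    yp, ym = x_heis(t, theta0, rho + h), x_heis(t, theta0, rho - h)
    d_theta = ((xp[0] - xm[0]) / (2 * h), (xp[1] - xm[1]) / (2 * h))
    d_rho = ((yp[0] - ym[0]) / (2 * h), (yp[1] - ym[1]) / (2 * h))
    return d_theta[0] * d_rho[1] - d_theta[1] * d_rho[0]

def g(tau):
    """Function whose first positive root is the rescaled conjugate time."""
    return tau * math.sin(tau) + 2.0 * math.cos(tau) - 2.0

def first_positive_root(f, step=1e-3, tau_max=20.0):
    """Scan (0, tau_max] for the first sign change of f, then bisect."""
    a = step
    while a + step <= tau_max:
        b = a + step
        if f(a) * f(b) <= 0.0:
            for _ in range(60):
                m = 0.5 * (a + b)
                if f(a) * f(m) <= 0.0:
                    b = m
                else:
                    a = m
            return 0.5 * (a + b)
        a = b
    return None  # no sign change found in the scanned range

tau_c = first_positive_root(g)
```

With $\rho = 2$, for instance, `jac_det(math.pi, theta0, 2.0)` should be numerically zero for any `theta0`, in agreement with $t_c = 2\pi/|\rho|$, while `jac_det` evaluated at earlier times should stay away from zero and be independent of `theta0`.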

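Proof. Let us introduce a new time variable $\tau$ such that $\frac{dt}{d\tau} = r$. If we now denote by $\dot F$ the derivative of a function $F$ with respect to the new time $\tau$, the system (16.13) is rewritten in the new coordinate system $(q,\theta,r)$ (where we recall $r = 1/\rho$) as follows

$$\dot q = r f_\theta, \qquad \dot\theta = 1 - r b, \qquad \dot r = r^3 a, \qquad \dot t = r. \tag{16.14}$$

To compute the asymptotics of the conjugate time, it is also convenient to consider a system of coordinates, depending on a parameter $\varepsilon$, corresponding to the quasi-homogeneous blow up of the sub-Riemannian structure at $q_0$ and converging to the nilpotent approximation. In other words we consider the change of coordinates $\Phi_\varepsilon$ such that

$$f_\theta \mapsto \frac{1}{\varepsilon} f_\theta^{\varepsilon}, \qquad \text{where} \qquad f_\theta^{\varepsilon} = \hat f + \varepsilon f^{(0)} + \varepsilon^2 f^{(1)} + \dots$$

and $\hat f$ denotes the nilpotent approximation. According to this change of coordinates we have the equalities

$$f_i = \frac{1}{\varepsilon} f_i^{\varepsilon}, \qquad f_0 = \frac{1}{\varepsilon^2} f_0^{\varepsilon}, \qquad b = \frac{1}{\varepsilon} b^{\varepsilon}, \qquad a = \frac{1}{\varepsilon^2} a^{\varepsilon},$$

where $f_0^{\varepsilon}$ is the Reeb vector field defined by the orthonormal frame $f_1^{\varepsilon}, f_2^{\varepsilon}$ (and analogously for $a^{\varepsilon}, b^{\varepsilon}$). Let us now define, for fixed $\varepsilon$, the variable $w$ such that $r = \varepsilon w$. The system (16.14) is finally rewritten in the following form

$$\dot q = w f_\theta^{\varepsilon}, \qquad \dot\theta = 1 - w b^{\varepsilon}, \qquad \dot w = \varepsilon w^3 a^{\varepsilon}, \qquad \dot t = \varepsilon w. \tag{16.15}$$

Notice that the dynamical system is written in a coordinate system that depends on $\varepsilon$. Moreover the initial asymptotic for $\rho_0 \to \infty$, corresponding to $r(0) \to 0$, is now reduced to fixing the initial value $w(0) = 1$ and sending $\varepsilon \to 0$. Consider some linearly adapted coordinates $(x,y)$, with $x \in \mathbb{R}^2$ and $y \in \mathbb{R}$ (cf. Definition 8.22). If we denote by $q^{\varepsilon} = (x^{\varepsilon},y^{\varepsilon})$ the solution of the horizontal part of the $\varepsilon$-system (16.15), conjugate points are solutions of the equation

$$\frac{\partial q^{\varepsilon}}{\partial\theta_0} \wedge \frac{\partial q^{\varepsilon}}{\partial w_0}\bigg|_{w_0=1} = 0.$$

As in Section 16.1, one can check that this condition is equivalent to

$$\frac{\partial x^{\varepsilon}}{\partial\theta_0} \wedge \frac{\partial x^{\varepsilon}}{\partial w_0}\bigg|_{w_0=1} = 0.$$

Notice that the original parameters $(t,\theta_0,\rho_0)$ parametrizing the trajectories in the exponential map correspond to a conjugate point if the corresponding parameters $(\tau,\theta_0,\varepsilon)$ satisfy

$$\varphi(\tau,\varepsilon,\theta_0) := \frac{\partial x^{\varepsilon}}{\partial\theta_0} \wedge \frac{\partial x^{\varepsilon}}{\partial w_0}\bigg|_{w_0=1} = 0. \tag{16.16}$$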

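For $\varepsilon = 0$, i.e. the nilpotent approximation, the first conjugate time is $\tau_c = 2\pi$, and moreover it is a simple root. Thus one gets

$$\varphi(2\pi,0,\theta_0) = 0, \qquad \frac{\partial\varphi}{\partial\tau}(2\pi,0,\theta_0) \neq 0. \tag{16.17}$$

Hence the implicit function theorem guarantees that there exists a smooth function $\tau_c(\varepsilon,\theta_0)$ such that $\tau_c(0,\theta_0) = 2\pi$ and

$$\varphi(\tau_c(\varepsilon,\theta_0),\varepsilon,\theta_0) = 0. \tag{16.18}$$

In other words $\tau_c(\varepsilon,\theta_0)$ computes the conjugate time $\tau$ associated with the parameters $\varepsilon,\theta_0$. By smoothness of $\tau_c$ one immediately has the expansion, for $\varepsilon \to 0$,

$$\tau_c(\varepsilon,\theta_0) = 2\pi + O(\varepsilon).$$

Now the statement of the proposition is rewritten in terms of the function $\tau_c$ as follows

$$\tau_c(\varepsilon,\theta_0) = 2\pi + O(\varepsilon^2). \tag{16.19}$$

Differentiating the identity (16.18) with respect to $\varepsilon$ one has

$$\frac{\partial\varphi}{\partial\tau}\frac{\partial\tau_c}{\partial\varepsilon} + \frac{\partial\varphi}{\partial\varepsilon} = 0,$$

hence, thanks to (16.17), the expansion (16.19) holds if and only if $\frac{\partial\varphi}{\partial\varepsilon}(2\pi,0,\theta_0) = 0$. Moreover, differentiating the expression (16.16) with respect to $\varepsilon$ one has

$$\frac{\partial\varphi}{\partial\varepsilon}(2\pi,0,\theta_0) = \left(\frac{\partial^2 x^{\varepsilon}}{\partial\varepsilon\,\partial\theta_0} \wedge \frac{\partial x^{\varepsilon}}{\partial w_0} - \frac{\partial^2 x^{\varepsilon}}{\partial\varepsilon\,\partial w_0} \wedge \frac{\partial x^{\varepsilon}}{\partial\theta_0}\right)\bigg|_{w_0=1,\ \varepsilon=0,\ \tau=2\pi}.$$

The second term vanishes since $\varepsilon = 0$ is the Heisenberg case, whose horizontal part at $\tau = 2\pi$ does not depend on $\theta_0$. Hence we are reduced to proving that

$$\frac{\partial^2 x^{\varepsilon}}{\partial\varepsilon\,\partial\theta_0}\bigg|_{\varepsilon=0,\ \tau=2\pi} = 0, \tag{16.20}$$

which is a consequence of the following lemma.

Lemma 16.4. The quantity $\dfrac{\partial x^{\varepsilon}}{\partial\varepsilon}\bigg|_{\varepsilon=0,\ \tau=2\pi}$ does not depend on $\theta_0$.

Proof of Lemma. To prove the lemma it will be enough to find the first order expansion in $\varepsilon$ of the solution of the system (16.15). Recall that when $\varepsilon = 0$ the system corresponds to the Heisenberg case, i.e. we have $a^{\varepsilon}|_{\varepsilon=0} = 0$ and $b^{\varepsilon}|_{\varepsilon=0} = 0$. This gives the expansion of $w$ (recall that $w(0) = w_0 = 1$)

$$w(t) = w(0) + \int_0^t \varepsilon\, a^{\varepsilon}(\tau)\, w^3(\tau)\, d\tau \implies w = 1 + O(\varepsilon^2).$$

Analogously we have $b^{\varepsilon} = \varepsilon\langle\beta,u\rangle + O(\varepsilon^2)$, where $\langle\beta,u\rangle = \beta_1 u_1 + \beta_2 u_2$ and $\beta$ denotes the (constant) coefficient of weight zero in the expansion of $b$ with respect to $\varepsilon$.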

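Denoting $u(\theta) = (\cos\theta,\sin\theta)$, the equation for $\theta$ then is reduced to

$$\dot\theta = 1 - \varepsilon\langle\beta,u(\theta)\rangle + O(\varepsilon^2), \qquad \theta(0) = \theta_0.$$

This equation can be integrated and one gets

$$\frac{\partial\theta}{\partial\varepsilon}\bigg|_{\varepsilon=0} = -\int_0^t \langle\beta,u(\theta_0+\tau)\rangle\, d\tau = \langle\beta,\ u'(\theta_0+t) - u'(\theta_0)\rangle, \tag{16.21}$$

where $u'(\theta) = (-\sin\theta,\cos\theta)$. Next we are going to use (16.21) to compute the derivative of $x^{\varepsilon}$ with respect to $\varepsilon$. The equation for the horizontal part of (16.15) can be expanded in $\varepsilon$ as follows

$$\dot x^{\varepsilon} = u(\theta) + \varepsilon f^{(0)}_{u(\theta)}(x) + O(\varepsilon^2),$$

where the first term is Heisenberg, and $f^{(0)}_{u(\theta)}$ is the term of weight zero of $f_u$, which is linear with respect to $x_1$ and $x_2$ because of the weight.¹ To compute the derivative of the solution with respect to the parameter we use the following general fact.

Lemma 16.5. Let $\phi(\varepsilon,t)$ denote the solution of the differential equation $\dot y = F(\varepsilon,y)$ with fixed initial condition $y(0) = y_0$. Then the derivative $\frac{\partial\phi}{\partial\varepsilon}$ satisfies the following linear ODE

$$\frac{d}{dt}\frac{\partial\phi}{\partial\varepsilon}(\varepsilon,t) = \frac{\partial F}{\partial y}(\varepsilon,\phi(\varepsilon,t))\,\frac{\partial\phi}{\partial\varepsilon}(\varepsilon,t) + \frac{\partial F}{\partial\varepsilon}(\varepsilon,\phi(\varepsilon,t)).$$

We apply the above lemma when $y = (x,\theta)$ and $F = (F^x,F^\theta)$ and we compute at $\varepsilon = 0$. In particular we need the solution of the original system at $\varepsilon = 0$:

$$\phi(0,t) = (\bar x(t),\bar\theta(t)), \qquad \bar\theta(t) = \theta_0 + t, \qquad \bar x(t) = u'(\theta_0) - u'(\theta_0+t).$$

Then by Lemma 16.5 we have

$$\frac{d}{dt}\frac{\partial x}{\partial\varepsilon} = \frac{\partial F^x}{\partial x}\frac{\partial x}{\partial\varepsilon} + \frac{\partial F^x}{\partial\theta}\frac{\partial\theta}{\partial\varepsilon} + \frac{\partial F^x}{\partial\varepsilon}.$$

Computing the derivatives at $\varepsilon = 0$ gives

$$\frac{\partial F^x}{\partial x}\bigg|_{\varepsilon=0} = 0, \qquad \frac{\partial F^x}{\partial\theta}\bigg|_{\varepsilon=0} = u'(\bar\theta(t)), \qquad \frac{\partial F^x}{\partial\varepsilon}\bigg|_{\varepsilon=0} = f^{(0)}_{u(\bar\theta(t))}(\bar x(t)),$$

and we obtain the equation for $\frac{\partial x}{\partial\varepsilon}$

$$\frac{d}{dt}\frac{\partial x}{\partial\varepsilon}\bigg|_{\varepsilon=0} = \frac{\partial\theta}{\partial\varepsilon}\bigg|_{\varepsilon=0} u'(\theta_0+t) + f^{(0)}_{u(\theta_0+t)}\big(u'(\theta_0) - u'(\theta_0+t)\big).$$

¹ Recall that this is the zero order part of the vector field $f_u$ along $\partial_x$, hence only $x$ variables appear and have order one.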

If we set $s = \theta_0 + t$ we can rewrite this equation as

$$\frac{d}{ds}\frac{\partial x}{\partial\varepsilon}\bigg|_{\varepsilon=0} = \frac{\partial\theta}{\partial\varepsilon}\bigg|_{\varepsilon=0} u'(s) + f^{(0)}_{u(s)}\big(u'(\theta_0) - u'(s)\big),$$

and integrating one has

$$\frac{\partial x}{\partial\varepsilon}(2\pi,0) = \int_{\theta_0}^{\theta_0+2\pi} \langle\beta,\ u'(s) - u'(\theta_0)\rangle\, u'(s)\, ds + \int_{\theta_0}^{\theta_0+2\pi} f^{(0)}_{u(s)}\big(u'(\theta_0) - u'(s)\big)\, ds.$$

In the last expression it is easy to see that all terms where $\theta_0$ appears explicitly are zero, while the others are integrals of periodic functions over a period, and hence do not depend on $\theta_0$. This finishes the proof of Lemma 16.4, hence the proof of the Proposition.

General case: higher order asymptotic expansion

Next we continue our analysis of the structure of the conjugate locus for a 3D contact structure by studying the higher order asymptotics. In this section we determine the coefficient of order 3 in the asymptotic expansion of the conjugate locus. Namely we have the following result, whose proof is postponed to Section .

Theorem. In a system of local coordinates around $q_0 \in M$ one has the expansion

$$\mathrm{Con}_{q_0}(\theta_0,\nu) = q_0 \pm \pi f_0 |\nu|^2 \pm \pi\,(a' f_\theta - a f_\theta')\,|\nu|^3 + O(|\nu|^4), \qquad \nu \to 0^{\pm}. \tag{16.22}$$

If we choose coordinates such that $a = 2\chi h_1 h_2$ one gets

$$\mathrm{Con}_{q_0}(\theta_0,\nu) = q_0 \pm \pi f_0 |\nu|^2 \pm 2\pi\chi(q_0)\,(\cos^3\theta\, f_2 - \sin^3\theta\, f_1)\,|\nu|^3 + O(|\nu|^4), \qquad \nu \to 0^{\pm}. \tag{16.23}$$

Moreover for the conjugate length we have the expansion

$$\ell_c(\theta_0,\nu) = 2\pi|\nu| - \pi\kappa|\nu|^3 + O(|\nu|^4), \qquad \nu \to 0^{\pm}. \tag{16.24}$$

Analogous formulas can be obtained for the asymptotics of the cut locus at a point $q_0$ where the invariant $\chi$ is non vanishing.

Theorem. Assume $\chi(q_0) \neq 0$. In a system of local coordinates around $q_0 \in M$ such that $a = 2\chi u_1 u_2$ one gets

$$\mathrm{Cut}_{q_0}(\theta,\nu) = q_0 \pm \pi\nu^2 f_0(q_0) \pm 2\pi\chi(q_0)\cos\theta\, f_1(q_0)\,\nu^3 + O(\nu^4), \qquad \nu \to 0^{\pm}.$$

Moreover the cut length satisfies

$$\ell_{cut}(\theta,\nu) = 2\pi|\nu| - \pi(\kappa + 2\chi\sin^2\theta)\,|\nu|^3 + O(\nu^4), \qquad \nu \to 0^{\pm}. \tag{16.25}$$
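As a worked consequence of the two length expansions (a short computation added for illustration, assuming both are evaluated along the geodesic with the same initial parameters $(\theta,\nu)$), subtracting (16.25) from (16.24) isolates the role of $\chi$:

```latex
\ell_c(\theta,\nu) - \ell_{cut}(\theta,\nu)
  = \bigl(2\pi|\nu| - \pi\kappa|\nu|^{3}\bigr)
  - \bigl(2\pi|\nu| - \pi(\kappa + 2\chi\sin^{2}\theta)|\nu|^{3}\bigr)
  + O(|\nu|^{4})
  = 2\pi\chi\sin^{2}\theta\,|\nu|^{3} + O(|\nu|^{4}).
```

The invariant $\kappa$ cancels, so at third order the gap between the conjugate length and the cut length is governed by $\chi$ alone, and it vanishes at this order in the directions where $\sin\theta = 0$.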

[Figure 16.1: Asymptotic structure of cut and conjugate locus. Labels in the figure: cut, conjugate, $f_0$, $f_1$, $f_2$, $q_0$, $\pi\nu^2$, $2\pi\chi(q_0)\nu^3$.]

We can collect the information given by the asymptotics of the conjugate and the cut loci in Figure 16.1. All geometrical information about the structure of these sets is encoded in a pair of quadratic forms defined on the fiber at the base point $q_0$, namely the curvature $R$ and the sub-Riemannian Hamiltonian $H$. Recall that the sub-Riemannian Hamiltonian encodes the information about the distribution and about the metric defined on it (see Exercise 4.31). Let us consider the kernel of the sub-Riemannian Hamiltonian

$$\ker H = \{\lambda \in T^*_qM : \langle\lambda,v\rangle = 0,\ \forall\, v \in D_q\} = D_q^{\perp}. \tag{16.26}$$

The restriction of $R$ to the 1-dimensional subspace $D_q^{\perp}$, for every $q \in M$, is a strictly positive quadratic form. Moreover it is equal to $1/10$ when evaluated on the Reeb vector field. Hence the curvature $R$ encodes both the contact form $\omega$ and its normalization. If we denote by $D_q^{*}$ the orthogonal complement of $D_q^{\perp}$ in the fiber with respect to $R$,² we have that $R$ is a quadratic form on $D_q^{*}$ and, by using the Euclidean metric defined by $H$ on $D_q$, it can be regarded as a symmetric operator. As we explained in the previous chapter, at each $q_0$ where $\chi(q_0) \neq 0$ there always exists a frame such that

$$\{H,h_0\} = 2\chi h_1 h_2.$$

² This is indeed isomorphic to the space of linear functionals defined on $D_q$.

