Stochastic Arnold Diffusion of deterministic systems

Stochastic Arnold Diffusion of deterministic systems V. Kaloshin December 5, 2016 Two main examples: Asteroid belt and Arnold s example. Ergodicity (almost all orbits over a long time behave the same), mixed behavior: regular and stochastic. (Nearly) integrable systems, regular & stochastic behavior. KAM and regular (quasiperiodic) behavior on a set of positive measure. Stochastic behavior: horseshoe, Anosov, geodesic flows on manifolds of negative curvature. Can it occur on a set of positive measure? Building instabilities: Arnold Example, Arnold mechanism and Mather mechanism Weak KAM theory. Variational shadowing lemma. Arnold Example, Invariant Laminations mechanism Invariant laminations and random iteration of cylinder maps Stochasticity: Diffusion process, CLT for random iterations of cylinder maps 1

1 Dynamics in the Asteroid belt We would like to analyze dynamics of Asteroids in the Asteroid belt. Location between orbits of Mars and Jupiter. Quantity 10 6 objects of diameter 1km. Total mass 4% of mass of the Moon. Goal: Understand the dynamics of many orbits in the Asteroid belt. Consider the distribution of Asteroid according to orbital period, see e.g. https://en.wikipedia.org/wiki/asteroid(underline)belt The Puzzle: This distribution has gaps called Kirkwood gaps. Heuristic: Outside of gaps motions are regular (or quasi-periodic), Inside of gaps stochastic. Notice that equations of motion are deterministic and have randomness, stochasticity comes from initial conditions! 1.1 Mathematical model Let q i R 3, i = 0, 1,..., N be a finite collection of point masses. According to Newton s law equations of motion are m i q i = j i m i m j (q j q i ) q j q i 3. Let the Sun be q 0, Jupiter be q 1. Assumptions Neglect interactions among Asteroids. 2

Neglect effect of Asteroids on the Sun and Jupiter. Neglect effect of Mars (and other planets). = The Sun and Jupiter form a two body problem. = (Kepler) Orbits of the Sun and Jupiter, denoted q 0 (t) and q 1 (t), are ellliptic. One can choose a coordinate frame so that m 0 q 0 (t) + m 1 q 1 (t) 0. The number µ = m 1 m 0 +m 1 is called mass ratio. In reality it is small and 10 3. Figure 1: The Kirkwood gaps After normalization equations of motion become q = (1 µ)(q 1 q(t)) q 0 q(t) 3 + µ(q 1 q(t)) q 1 q(t) 3. (1) Mathematical Goal: Understand the dynamics of many orbits of this equation of motion. 3

KAM theory In a region away from collisions and small µ for most of initial conditions orbits are quasiperiodic. Stochasticity Conjecture For a positive measure set of initial conditions orbits behavior of eccentricity is stochastic, where stochasticity is due to a random choice of initial condition. We shall make this conjecture more concrete later. 2 Mechanical and Hamiltonian systems Consider an evolution of a particle x : R R N, N 1. According to the Newton Second Law the acceleration vector, defined by ẍ(t 0 ) = d2 x dt 2 t=t 0, is proportional to the force applied to the particle. We define a force as a function F : R N R N R R N and equations of motion become ẍ = F (x, ẋ, t) and are called Newton s equation. Call systems satisfying this equation mechanical systems. Denote ( ) the the scalar (or dot) product in R N and x = (x x) be the length of x. Example of mechanical systems Let U : R N R be a differentiable function, called potential. Denote by U/ x the gradient of U. Consider x = (x 1,..., x N ) R N and where du dx = ( x 1 U,..., xn U). ẍ = du dx, 4

1. A stone falling to the Earth. In the case N = 1 and U = gx for Galileo s g 9.8m/s 2. More generally, for some g R N, called acceleration vector, we have U(x) = (g x). 2. Falling from great height Let U(x) = k, i.e. U is inversely proportional to the distance x. x 3. Oscillations of a spring Let N = 1 and U(x) = α2 x 2 2. More generally, let U(x) 0, U(x) = 0. 4. The mathematical pendulum Let N = 1 and U(x) = k sin x for some k > 0. See also Chapter 1, [1]. 5. Product systems One can consider X = (x 1, x 2 ) R N 1 R N 2 for some N 1, N 2 > 0 and each component x 1 and x 2 evolves according to one of the above equations. More generally, for a manifold M and a cotangent bundle T M one can define so-called canonical coordinate (q, p) T M. Then for a function smooth H(q, p) called Hamiltonian we have the following equations of motion {ṗ = H q q = H. (2) p A Hamiltonian function can also depend on time, i.e. H(q, p, t). Often time dependence assumed to be periodic, i.e. t T = R/Z. Definition 1. We call the Hamiltonian equation (2) of n degrees of freedom, if the dimension of M equals n; For time-periodic Hamiltonian with dimm = n, the degree of freedom is n + 1 2. All of the above examples with M = R N for a proper N, x = q, p = ẋ, H = (p p) 2 + U(x). 5

Geodesic flows on Reimannian manifolds Let (M, g) be a smooth Rimennian manifold with a metric g. One can define a flow on the cotangent bundle T M locally minimizing g-distance between nearby points on the same orbit. One can represent this flow as a Hamiltonian system. See Chapter 2, [3] For many other examples, see e.g. https://en.wikipedia.org/wiki/hamiltonian(underline)mechanics Two main examples:, Example 1 Arnold s example is given by the product of the Mathematical Pendulum a nonlinear spring+ O(ε). For example, the Hamiltonian can have the form H ε (q, p) = p2 1 2 + k sin q 1 + p2 2 2 + U(q 2) + εh 1 (q, p, t), where q i, p i R, (3) U is even, U(q 2 ) 0, U(0) = U (0) = 0, U (q 2 ) 0, q 2 0 and U(q 2 ) + as q 2 + and ε is small. For example, U(q 2 ) = q 2 2 +q 4 2, the function H 1 is assumed to be periodic in q 1 and t. Notice that in action-angle variable the condition is a lot easier to state, see (6). Example 2 Motion of Asteroids in the Asteroid belt. The Hamiltonan has the form H(q, p, t) = (p p) 2 + 1 µ q q 0 (t) + µ q q 1 (t), where q, p R3, and q 0 (t), q 1 (t) are elliptic orbits of the Sun and Jupiter, given by Kepler s law. The associated problem is called the Restricted Three Body Problem. 6

Theorem (Arnold) For a positive measure set of initial conditions orbits are quasiperiodic. In this setting the Stochasticity conjecture takes the following form: Stochasticity conjecture For a positive measure set of initial conditions in a proper time scale eccentricity behaves as a stochastic diffusion process, where stochasticity is due to a random choice of initial conditions. 3 Ergodicity Let X be a set and Σ be a σ-algebra over X. A non-negative function µ from Σ to [0, ) + is called measure if µ( ) = 0 µ satisfies countable additivity, i.e. for all countable collections {E i } i=1 of pairwise disjoint sets in Σ: µ( k=1 ) = k=1 µ(σ k). The pair (X, Σ) is called a measurable space and elements of Σ are called measurable sets. A triple (X, Σ, µ) is called a measure space. A probability measure is a measure with total measure one, i.e. µ(x) = 1. A probability space is a measure space with a probability measure. Let T : X X be a measure-preserving transformation on a probability space (X, Σ, µ), i.e. for every E in Σ with µ(t 1 (E)) = µ(e). Call a set E in Σ invariant if T 1 (E) = E. A measure-preserving transformation T ergodic if for every E invariant set either µ(e) = 0 or µ(e) = 1. Time and space averages! Let f is a µ-integrable function, i.e. f L 1 (µ), i.e. f(x) dµ(x) < +. Then we define the following averages: Time average: defined as the average (if it exists) over iterations of T starting from some initial point x: 1 n 1 ˆf(x) = lim f(t k x). n + n 7 k=1

Space average: If µ(x) is finite and nonzero, we can consider the space or phase average of f: f = 1 f(y)dµ(y). µ(x) Ergodic theorem If T is ergodic, then for µ-almost all x the time average equals the space average, i.e. Let X be a topological space. ˆf(x) = f. Corollary 2. In the above setting for µ-almost all x the corresponding orbit {T k x} k Z is dense in X. Example 1. Let X = T = R/Z be the circle, Σ be the σ-algebra of measurable sets, µ be the Lebesgue measure. Let f α : x x + α ( mod 1) be the rotation by an irrational angle α Q. Then f α is ergodic. Example 2. For the space probability space (X, Σ, µ) as in the previous example. Let h 2 : x 2x ( mod 1) be the doubling map. Then h 2 is ergodic. We say that a measure preserving T : X X transformation of a probability space (X, Σ, µ) has mixed behavior if there are at least two invariant sets R and S both of positive measure such that behavior of orbits in R and in S are different. For example, almost periodic in R and stochastic (in a certain sense) in S. 4 Integrable systems Definition 3 (Hamiltonian). For a n dimensional smooth and boundless manifold M, We call an ODE on T M being Hamitonian, if there exists a function H : T M R and q = H p, ṗ = H q (H) for any (q, p) T M. 8

To make the notion of a conserved quantity more precise, we give a formal definition. Let M be a manifold of dimension n and T M be the cotangent bundle. Let (p, q) = (p 1,..., p n, q 1,..., q n ) be the canonical coordinates. Define the canonical non-degenerate two form n ω = dp k dq k. i=1 The map Φ is called canonical if it preserves the canonical form ω, i.e. Φ ω = ω. Define the Piosson bracket {, } of two C 1 functions F and H by {F, H} = ω( F, H), where F and H are the gradients of F and G resp. Definition 4. F : M 2n R is a first integral of the Hamiltonian (H) if F = {F, H} = 0. Definition 5. Two real-valued functions F 1, F 2 on M 2n are in involution if {F 1, F 2 } 0. Definition 6. A Hamiltonian (H) of n degrees of freedom is completely integrable (or simply integrable) if it has n first integrals in involution: {F i, F j } 0 for all i j {1,, n} 4.1 First Integrals of the Two Body Problem For example, we claim that the two body problem is completely integrable. Without loss of generality, we may assume it is planar, so T M 2n = R 8, so we need to find 4 first integrals to show it is completely integrable. We have q 1, q 2 R 2, m 1, m 2 > 0. We claim that the energy H, the angular momentum G, and the linear momentum L from the system are first integrals. Since the linear momentum is a vector, the 3 quantities H, G, and L = (L 1, L 2 ) really give 4 conserved quantities. These quantities are H = p i 2 2m i m i m j q i q j, G = q i p i, L = p i. To verify, we compute the appropriate Poisson brackets: {G, H} = Ġ = 0 by Kepler s second law, {L i, H} = L i = 0 since the center of mass is fixed at 0 (for i = 1, 2), {G, L i } = L j (for i, j {1, 2}, i j) by Jacobi s identity, {L 1, L 2 } = 0 by Jacobi s identity, 9

so the two body problem is completely integrable. It is conjectured and widely believed that for n 3, the n body problem is not integrable. All the known integrals remain energy, angular momentum, and linear momentum, and for every additional body added the dimension of the phase space increases by four. Conjecture 7. There are no other analytic first integrals for the n body problem. Here is the famous Arnold-Liouville theorem on integrable Hamiltonians. Theorem 8. Suppose the Hamiltonian (H) is integrable with H = F 1,..., F n as smooth first integrals. Consider Γ a = {x M 2n : F j (x) = a j, 1 j n.}, a R n. Suppose F j are linearly independent on Γ a, i.e. Then Rank( F 1,, F n ) = n. 1. Γ a is a smooth n-dimensional submanifold invariant with respect to (H). 2. If Γ a is compact and connected, then it is diffeomorphic to the n-torus T n = {θ = (θ 1,..., θ n ) : θ i S 1 }. 3. In a neighborhood U of Γ a there is a canonical map Φ : (p, q) (I, θ) and an implicitly defined function H 0 (I(p, q)) = H(p, q) with { θ = I H 0 (I) = ω(i), (4) I = 0. 4. The flow of (H) defines a periodic or almost periodic motion on Γ a, i.e., dθ dt = ω. The coordinates (θ, I) are called action-angle: θ T n be an n-dimensional angle and I R n be an action. The function ω(i) is called the frequency map. Example 3. Let X = T n R n, n 1, Σ be the set of measurable sets, and µ be the Lebesgue measure. The (X, Σ, µ) be a probability space. If ω(i) = I, then the time one map in (5) has the form { θ θ + I ( mod 1), (5) I I. 10

In this case the frequency map ω(i) = I is the identity map and H 0 (I) = n j=1 I2 j /2. It turns out that in action-angle variable it is easier to write an Arnold s example (3). Let q 1, θ, t T be angle, p 1, I R be actions. Then the Hamiltonan can have the form H ε (q, p) = p2 1 2 + k sin q 1 + f(i) + εh 1 (q 1 p 1, θ, I, t), (6) 2 where f (I) and is strinctly monotone for I > 0, i.e. f (I) 0 for I > 0, H 1 is assumed to be periodic in t. Exercise 1. Consider a nonlinear spring with a potential U(x), x R such that U is even, U(x) 0, U(0) = U (0) = 0, U (x) 0 for all x 0. The Hamiltonian H(x, p) = p 2 /2 + U(x) is completely integrable and can be written in action-angle variables (θ, I) T R +. One can compute period of this string at each level set {H = E}, E > 0. Compute period T (E) and notice that strict monotonicity of T (E) implies strict monotonicity of f(i). 4.2 Regular vs chaotic behavior We start with preliminary analysis of the above three examples: 1. For the map f α : x x + α ( mod 1) with either irrational or rational α for a pair of nearby initial conditions x, x and any n Z we have f n αx f n αx = x x. In this case, there is no sensitive dependence on initial conditions. 2. For the map h 2 : x 2x ( mod 1) for a pair of nearby initial conditions x, x and any positive n Z we have h n 2x h n 2x min{2 n x x, 1}. In this case, there is sensitive dependence on initial conditions as nearby orbits go apart exponentially fast. The exponent λ of growth of the distance, i.e. is called Lyapunov exponents h n 2x h n 2x e λn x x 11

3. For the map of the cylinder (θ, I) T n R n given by (5), denoted F, for a pair of nearby initial conditions (θ, I), (θ, I ) and any positive m Z we have m I I F m (θ, I ) F m (θ, I) m (θ, I ) (θ, I). In this case, there is weak sensitive dependence on initial conditions as nearby orbits go apart linearly. Exercise 2. Construct example so that nearby orbits go apart as O(m 2 ). How about m k for any positive integer k? 5 KAM theorem and Regular Behavior Notice that all orbits of the equation (5) are either periodic or almost periodic. Consider the Hamiltonian in action angle coordinates H 0 (I). The whole phase space is completely foliated into invariant tori T n {I 0 }, and on each of these tori the flow is linear with frequencies ω = (ω 1,..., ω n ) = ω = I H 0 (I). The components of ω provide integrals in involution for the motion of the system. Such an integrable system is called nondegenerate, if these integrals are also functionally independent, which is expressed by the condition det ω I = det 2 IIH 0 0. Notice that this condition implies that ω is a local diffeomorphism and frequencies are varying in an open set. Therefore, there are always both periodic and almost periodic orbits. Consider a Hamiltonan perturbation of H 0 given by H ε (θ, I) = H 0 (I) + εh 1 (θ, I). We want to persist some of these tori T n {I 0 } for small ε. But it was already known to Poincare that tori with periodic orbits generally break up under small perturbations, and for a long time it was an open question whether any torus can be continued for ε 0. Finally, by the work of Kolmogorov, Arnold, 12

and Moser, it turned out that those tori will persist whose frequencies are not only rationally independent but satisfy a small denominator condition (ω k) γ k τ, 0 k Zn, (7) with γ > 0. Call a vector satisfying this condition (γ, τ)-diophantine and denote D γ,τ. Call a vector simply diophantine if it is (γ, τ)-diophantine for some γ > 0 and τ > n 1. Exercise 3. Show that almost all points in R n satisfy such a condition for some γ while τ > n 1 is kept fixed (Lebesgue full measure). The set D γ,τ is nowhere dense. Let Ω R n be an bounded open set. Then the complement to the set of (γ, τ)-diophatine numbers Ω \ D γ,τ has measure cγ, where c depends on measure of Ω. Fix τ > n 1. For an open set Ω R n denote by Ω γ = Ω D γ,τ the Cantor set of γ-diophatine numbers. Kolmogorov Arnold Moser Theorem (Autonomous) Let Ω R n be an open set and H 0 : Ω R be a real analytic Hamiltonian, which is nondegenerate for I Ω. Let the perturbed Hamiltonian H ε = H 0 + εh 1 be real analytic. Then, for a sufficienty small ε portional to γ 2, the perturbed system possesses an analytic invariant n-dimensional tori T w with a linear flow for all ω D γ. Moreover, if Ω is bounded, then the union of these tori occupies all but O( ε) part of T n Ω, i.e. Mes(Ω T n )\Mes( with c depending on n and H 0 C 2. w D γ T w ) c ε Sometimes one needs to consider a time-periodic perturbation of the integral systems, which is formally expressed by H ε (θ, I, t) = H 0 (I) + εh 1 (θ, I, t), (θ, I, t) T n Ω T. (8) 13

In order to reduce to the autonomous case one can complete this system into an autonomous one by adding one variable E conjugate to time t and have H ε (θ, I, t, E) = E + H ε (θ, I, t), with the unperturbed integral part H 0 (Ĩ) = E + H 0(I), Ĩ = (I, E). Notice that det 2 H ĨĨ 0 (Ĩ) = 0 so the previous Theorem is no longer available for this case, but we can verify that H 0 (Ĩ) is so called isoenergetically non-degenerate by ( det 2ĨĨ H 0 (Ĩ) H ) 0 H 0, 0 0 and we indeed have a non-autonomous version of KAM theorem w.r.t. this condition: Kolmogorov Arnold Moser Theorem (Non-autonomous) Let Ω R n be an open set and H 0 : Ω R be a real analytic Hamiltonian, which is isoenergetically non-degenerate for I Ω. Let the perturbed Hamiltonian H ε = H 0 + εh 1 be real analytic. Then, for a sufficienty small ε portional to γ 2, the perturbed system possesses an analytic invariant n-dimensional tori T w with a linear flow for all ω D γ. Moreover, if Ω is bounded, then the union of these tori occupies all but O( ε) part of T n Ω, i.e. Mes(Ω T n )\Mes( with c depending on n and H 0 C 2. w D γ T w ) c ε 5.1 The Energy Reduction We need to specify that, the autonomous case can be transferred into the time-periodic case: For a fixed energy level H ε (θ, I) = E, if In H ε (θ, I) 0, then there should be a function I n (θ, Î, E) satisfying H ε (θ, Î, I n(θ, Î, E)) = E, (ˆθ, Î) := (θ 1,, θ n 1, I 1,, I n 1 ) 14

due to the implicit function theorem. Benefit from the continuity of H ε, we can find an open interval U E R containing E, such that In H ε (θ, I) 0. UE Then I n (ˆθ, Î, θ n, E) can be interpreted by a non-autonomous Hamiltonian with θ n playing the role of time t, i.e. dˆθ dθ n = I n Î, with (ˆθ, Î) T T n 1, θ n T. dˆθ dθ n = I n ˆθ Theorem 9. Suppose M is a n + 1 dimensional manifold, on which we can define a Hamiltonian flow (2). Restricted on the energy surface S E = {H = E}, the Hamiltonian equations can be transferred into { dpi = dt qi P n+1 dq i (10) = dt pi P n+1. where the function P n+1 (P, Q, t; E) is implicitly defined by H(P, P n+1, Q, t) = E. Remark. Notice that the time-periodic system has dimension 2n + 1, which one dimension less of the original autonomous system. Since P n+1 (P, Q, t; E) can be considered as a new Hamiltonian defined on the phase space (P, Q, t) T T n T 1, the corresponding degree of freedom is n + 1 2. Exercise 4. Prove that if H ε is nearly integrable, then induced I n is also nearly integrable. Example 4. (Needs to be modified to include nearly integrable case) Consider a smooth Hamiltonian H ε (q 1,..., q n+1, p 1,..., p n+1 ) = H 0 (p 1,..., p n+1 ) + εh 1 (q, p). Fix δ > 0 and E > δ and p n+1 > δ. Then ( n p n+1 = ± 2E p 2 j 2εH 1 (q 1,..., q n+1, p 1,..., p n+1 ) j=1 15 ) 1/2 (9)

which has a solution P n+1 (P, Q, t; E) = ± ( 2E 1/2 n pj) 2 + ε P (P, Q, t : E), where P = (p 1,..., p n ), Q = (q 1,..., q n ) and P ( ) is a smooth function. j=1 Now let s turn back to the KAM theorem, instead of giving a detailed proof of that, we will make the demonstration via two low dimensional examples, which are related with the aforementioned settings. Example 5. For n = 1, any Hamiltonian system on T T 1 is integral, cause H(θ, I) is a naturally first integral. If for a fixed energy level H(θ, I) = E, we have I H 0, by the Implicit Function Theorem we can solve the implicit E equation H(θ, J (θ, E)) = E. The solution T = {(θ, I(θ, E)) : θ T} describes an invariant torus. Notice that for any sufficiently small perturbation εh 1 (θ, I), the condition E I H 0 implies that the condition I H + εh 1 0. Thus, there exists E an torus I ε (θ, E) satisfying H(θ, I ε (θ, E)) + εh 1 (θ, I ε (θ, E)) = E, θ T. Moreover, this torus is a small perturbation of the torus {I = I }, where H 0 (I ) = E. Remark. We need to specify that the deformed torus with the same frequency ω may lie a different energy surface compare to the unperturbed torus. Here is a trivial examle: H(I) = I 2 /2 and H ε (I) = H(I) + εi, for w 0 = 1, the unperturbed torus T 0 = {(θ, 1) θ T} lies on {H(θ, I) = 1/2}, whereas T ε = {(θ, 1 ε)} lies on {H ε (θ, I) = (1 ε 2 )/2}. Example 6. For n = 1.5, which means the Hamiltonian is time-periodic H ε (θ, I, t) = H 0 (I) + εh 1 (θ, I, t) with (θ, I, t) T T T, we can process the KAM iteration for the following Poincaré map: 16

5.2 Poincare maps and further dimension reduction In the previous section we showed that orbits of the autonomous system (2) satisfy a time-periodic equation (10), which has one dimension less. Now we describe the procedure of reducing (2n+1)-dimensional time-periodic system to a 2n-dimensional map. Let θ = (θ 1,..., θ n ) T n, I = (I 1,..., I n ) R n. H ε (θ, I, t) = H 0 (I) + ε H 1 (θ, I, t) (11) be a time-periodic Hamiltonian. Consider the time one map f ε : (θ, I) (θ, I ) (12) sending (θ, I, 0) to (θ, I, 1). This is a 2n-dimensional symplectic map, i.e. it preserves the canonical form ω = di dθ. Corresponding to aforementioned Example, we study the case n = 1. Call a 2n-dimensional symplectic map f : (θ, I) (θ, I ) integrable if it has the form (θ, I) (θ + ω(i) (mod 1), I) for some map ω(i) R n. Call a 2n-dimensional symplectic map f nearly integrable if it can be written in the form f = f 0 + εf 1, where f 1 is sufficiently smooth and ε can be chosen sufficiently small. Exercise 5. Show that if H ε is nearly integrable, then so is its Poincare map f ε. 5.3 Exact Area Preserving Twist maps Let (θ, I) T R = A be coordinates on the cylinder A. Call a C 1 smooth map f : (θ, I) (θ, I ). Denote by f(x, I) the list of this map to a map R R R R, i.e. f(x + 1, I) = (x + 1, I ) for all (x, I) R 2. We call f area-preserving if it preserves the area form dθ di. exact if for any noncontractible curve γ the flux of γ is zero, i.e. the area above γ below f(γ) equals the area below γ above f(γ). twist if I / x > 0, i.e. image f(l x0 ) of every vertical line L x0 = {x = x 0 } is monotonically twisted in the x-component. 17

Call f satisfying the first two items a twist map and satisfying all three items an EAPT map. Theorem 10. (Moser) Any EAPT map can be given as the time one map of a time-periodic Hamiltonian H(x, I, t) satisfying the Legendre condition 2 II H(x, I, t) > 0 for all (x, I, t) R R T. Example 7 (Standard Map). Let a be a real number. Consider the map of a cylinder Its lift has the form f a : (θ, I) (θ + I + a sin 2πθ (mod 1), I + ε sin 2πθ). f a : (x, I) (x + I + a sin 2πx (mod 1), I + ε sin 2πx). Exercise 6. Prove that f a is an EAPT map for any a R. Example 8 (The Billiard Map). Let Ω R 2 be a strictly convex domain. Let the boundary Ω be C r smooth with r 3. Fix a point P 0 Ω and let s be the length parametrization, i.e. P s is at distance s from P 0 in the clockwide direction. Define a map f Ω : (s, I) (s, I ), where ϕ [0, π] be angle of a ray with the boundary and I = cos ϕ [ 1, 1]. This is called a billiard map, see Fig.2. Exercise 7. Prove that f Ω : T [ 1, 1] T [ 1, 1] is an EAPT map. HINT: prove that the generating function is P 0 (s) P 1 (s ) e, where e is the Euclid norm on R 2. 5.4 Generating Functions We say that f : R 2 R 2 has a generating function h : R 2 R, h(x, x ) if the map f(x, I) = (x, I ) for all (x, I) R 2 satisfies { 1 h(x, x ) = y (13) 2 h(x, x ) = y. 18

Figure 2: the incident angle always equals the reflective angle Example 9 (Standard Map). The generating function for the standard map is h(x, x ) = 1 2 x x 2 ε 2π cos 2πx, x, x R. Example 10 (Billiard Map). The generating function for the billiard map is h(s, s ) = P 0 (s) P 1 (s ) e, where e is the Euclid norm on R 2. Theorem 11. Any C 1 smooth twist map f posseses a generating function h, i.e. (13) holds for all (x, I) R 2. Moreover, f is exact if and only if h is periodic in the sense that h(x + 1, x + 1) = h(x, x ) for all (x, x ) R 2. Exercise 8. Prove that for a nearly integrable Hamiltonian H ε (θ, I, t) = H 0 (I) + εh 1 (θ, I, t) with H 0 (I) be positively definite, the Poincaré map of Σ 0 is an EAPT map with the generating function by h(x, X) = inf γ(0)=x γ(1)=x γ C ab ([0,1],R) 1 0 L(γ, γ, t)dt where the Lagrangian function L(x, v, t) defined on T T T can be derived via the Legrendre transformation: L(x, v, t) = max{p v H(x, p, t)}. p R 19

HINT: use Moser s theorem. In a view of this exercise the equation (13) can be viewed as the Euler- Largange equation. Indeed, if f k (x, I) = (x k, I k ), k Z is an orbit, then for the formal sum h(x k, x k+1 ) k Z the extrema for each k Z satisfy the equation 1 h(x k, x k+1 ) + 2 h(x k 1, x k ) = 0. (14) In general, the sequences (x k ) k Z satisfying this equation called stationary. Exercise 9. Each orbit f k (x, I) = (x k, I k ), k Z defines a stationary configuration (x k ) k Z and each stationary configuration (x k ) k Z defines an orbit (x k, I k ), k Z by setting I k = 1 h(x k x k+1 ). Call a generating function h(x, x ) nearly integrable if it can be written in the form h(x, x ) = h 0 (x x) + εh 1 (x, x ), where h 1 has bounded C 3 norms and ε can be chosen sufficiently small. Exercise 10. Show that if f ε is nearly integrable, then so is its generating function h ε. 5.5 Equation for an invariant curve for EAPT Suppose Γ ω is an invariant curve of a lifted EAPT f : (x, I) (x, I ), where (x, I) R 2. Moreover, supose that in the (x, I)-coordinates there is a parametrization Γ ω = {w(θ) = (u(θ), v(θ)) R 2 θ R}, such that for some real ω R we have (IE) f(w(θ)) = w(θ + ω), θ R. This, in particular, means that f restricted on Γ ω is conjugated to the rigid rotation x x + ω. Notice that u(θ) θ, v(θ) are both 1-periodic in θ. Denote by Γ ω the natural projection of Γ ω onto the cylinder A and by the projection of f to an EAPT map f. Denote by u ± (θ) := u(θ ± ω), then we have the following criteria 20

Theorem 12. Γ ω satisfies (IE) iff u(θ) satisfies the following 2 nd order difference equation holds (EL) : E(u(θ)) = 1 h(u, u + )(θ) + 2 h(u, u)(θ) 0, θ R. (15) Remark. For the standard map, the 2 nd order equation becomes: u + (θ) 2u(θ) + u (θ) = ε sin 2πu(θ). For a nearly integrable map with a generating function h ε (x, x ) = h 0 (x x) + εh 1 (x, x ) the 2 nd order equation becomes: h 0 (u + (θ) u(θ)) + h 0 (u(θ) u (θ)) = ε( 1 h 1 (u, u + )(θ) + 2 h 1 (u, u)(θ)). So we successfully transfer the existence of KAM torus to the existence of function u(θ) of (15). In the next section we look for Γ ω so that it corresponds to just the KAM torus of H ε with the frequency (ω, 1). 5.6 Newton Iteration Method Goal: suppose E(u 0 ) ε is small and E (u 0 ) O(1), then in a neighborhood of u 0 there exists a unique solution making E(u ) = 0. The following is a scheme towards this goal, which is actually the essence of the Newton Iteration Method. For v ε 1, E(u 0 + v) = E(u 0 ) + E (u 0 )v + Q(v, v), where Q(v, v) ε 2 is a quadratic term. Step 1: solve E(u 0 ) + E (u 0 )v = 0, and denote u 1 = u 0 + v, so we finish the first step iteration by E(u 1 ) = Q(v, v) E 2 (u 0 ); Step n: make E(u n )+E (u n )v n = 0, and denote u n+1 = u n +v n. Similarly we get E(u n+1 ) = Q(v n, v n ) E 2 (u n ) E 2n (u 0 ). Remark. We have to confess that, in the previous scheme, v n has less regularity than u n, and that would cause some difficulty!!! 21

5.7 A further KAM exploration: from (IE) to a circle diffeomorphsim Now we turn back to (15), and further explore the relationship between it and the KAM iteration. Recall that f ε is nearly integrable due to Exercise 4, that implies u(θ) θ ε. So f ε Γ w ε becomes a circle diffeomorphism: (Model Problem): f 0 : x x + w (mod 1), f ε : x x + w + εv(x) (mod 1), find v(x) such that theses two circle diffeomorphisms are conjugated via the following commutative graph: f ε T T ϕ ε ϕε T f 0 T. Formally we write by ϕ ε (x) = x + εψ(x), then f ε = ϕ ε f 0 ϕ 1 ε = x + w + εψ(x + w) εψ(x) + O(ε 2 ). (16) On the other side, f ε (x) = x + w + εv(x) + O(ε 2 ), so we deduce the following cohomology equation: ψ(x + w) ψ(x) = v(x). Definition 13. Denote by W r := {f C w (T, R, r) f r = sup Iθ r f(θ) }, where r is called the analytic radius. Exercise 11. The Fourier expansion f(θ) = k Z f ke ikθ satisfies for f W r. f k f r exp( k r), k Z Lemma 14 (Iteration). Suppose w is γ Diophantine, v W r is of zero mean, i.e. v 0 = 0. Then the cohomology equation has a unique solution ψ W r with zero mean. Moreover, for any 0 < r < r, ψ r c(γ) v r r r 3. 22

Proof. Formally calculation ψ(x+w) ψ(x) = k Z ψ k(e i2πkw 1)e ikx, which implies v k ψ k = e i2πkw 1, (k 0, ψ 0 = 0). Recall that qw p γ q 2 for all q, p Z, which is due to the Diophantine condition, then On the other side, Then e i2πkw e i2πp c(γ) k 2. (17) v k v r exp( k r), k Z ψ k v r e k r c 1 (γ) k 2 v r e k s e k (r s) c 1 (γ) k 2. Recall that xe x e 1 for x 1, then for x = a k b Substitute by a = r s, b = 2 then Finally we get e a k k b e b ( b a) b, k N. ψ k < c 1 (γ) v r exp( s k ) (r s) 2. ψ r k ψ k e r k c 1(γ) v r (r s) 2 (1 exp( (s r ))) c 1 (γ) v r (r s) 2 (s r ) 2 c 1(γ) v r (r r ) 2 (18) for s = r+r. So we finish the proof. 2 23

5.8 the KAM iteration of generating functions The proof we present here is due to Moser and can be found in Moser-Levi [4]. A multidimensional version of the same approach can be found in [5]. Let s rewrite the discrete (EL) equation here: E(u(θ)) = 1 h(u, u + ) + 2 h(u, u) = 0, θ R. For nearly integrable generating function h ε (x, x ) = h 0 (x x) + εh 1 (x, x ), we can transfer (EL) into w.r.t h 0( ) 0. h 0(u + u) h 0(u u ) = ε[ 1 h 1 (u, u + ) + 2 h 1 (u, u)] Remark. Actually, (EL) is the 1 st order variational equation of the following Percival Variational Principle: δ 1 0 h(u, u + )dθ = 0. (P.V.P ). the mean value of u θ E(u) equals zero, i.e. 1 0 u θ E(u)dθ = 0. Theorem 15 (the KAM theorem of twist maps). Let D C 2 be an open set of complex plane such that a generating function h(x, x ) is real analytic and analytic in D. Moreover, the complex expansion still satisfies h(x, x ) = h(x + 1, x + 1), (x, x ) D. Let min D 12 h > κ > 0 for some κ > 0. Suppose there exists M > 0, such that h C 3 (D) < M. Denote by D R := {(x, x ) D whose R neighborhood belongs to D}. u 0 (θ) θ W r with 0 < r < 1, and (u 0, u + ) D R. There exists N 0 > 0, such that (u 0 ) θ < N 0, (u 0 ) 1 θ < N 0 Iθ < r. 24

For any diophantine ω, i.e. γ > 0, such that p qω > γ q 2 (19) for all p, q Z, if some h ε, u 0 satisfy all the previous conditions, then δ = δ(r, h ε, M, N 0, γ, R), if E(u 0 ) r < δ, then there exists a unique u close to u 0, such that E(u) = 0 and u(θ) θ W r/2 with mean value of u(θ) θ being zero. In the next section we will give a proof for this Theorem. 5.9 The homological equation and Newton Iteration method Let s recall the (EL) equation: if u = u 0 + v, then with E(u) = E(u 0 ) + E (u 0 )v + Q(v, v) E (u)v = [ 11 h(u, u + ) + 22 h(u, u)]v + [ 12 h(u, u + )v + + 12 h(u, u)v ]. We can apply the Newton Iteration method to this equation, i.e. Step 1: solve E (u 0 )v 0 = E(u 0 ). Let u 1 = u 0 + v 0, then E(u 1 ) = Q(v 0, v 0 ) E 2 (u 0 ). Step n: solve E (u n )v n = E(u n ). Let u n+1 = u n +v n, then E(u n+1 ) = Q(v n, v n ) E 2 (u n ) E 2n (u 0 ). But we still have to face the decay of regularity!!! We call E (u)v = E(u) (HE) the homological equation. Recall that we will face the quadratic derivatives in (HE), so we should improve it into a manageable case. In other words, we need to make a transformation from solving quadratic equation to solving a couple of linear equations!!! 25

Let s multiply a u θ on both sides, u θ E (u)v = u θ E(u). Recall that E(u) < δ and v < δ, then v d dθ E(u) = ve (u)u θ is a quadratic small term. So we can get which can be further transferred into u θ E (u)v ve (u)u θ = u θ E(u), (20) 12 h(u, u + )(u θ v + u + θ v) 12h(u, u)(u θ v u θ v) = u θe(u). Introduce w = u u θ v, then it becomes 12 h(u, u + )u θ u + θ (w+ w) 12 h(u, u)u θ u θ (w w ) = u θ E(u). Let s define the discrete gradient by f = f + f, f = f f, then formally we get a manageable homological equation by: ( 12 hu θ u + θ w) = g, g = u θe(u). (MHE) Lemma 16. Suppose (u, u + ) D for Imθ < r and u θ < N, u 1 θ < N, then (MHE) has a unique solution w W ρ of zero average for all 0 < ρ < r and v = u θ w satisfying v ρ where c = c(m, N, R). c (r ρ) 6 E(u) r, v θ ρ Proof. Denote by p = ( 12 hu θ u + θ ) 1 for short, then ψ = g, p 1 w = ψ + µ pψdθ where µ is a constant equal to pθ, which implies c (r ρ) 7 E(u) r µ RMN 4 g r c (r r ). 3 26

Due to Lemma 17, ψ r c(γ) (r r ) 3 g r, for all 0 < r < r, then w ρ c 1 p(ψ + µ) r (r ρ) 3 c 2 g r (r ρ) 3 (r r ) 3. Let r = r+ρ, then w 2 ρ c 3 E(u) r. Recall that v w u (r ρ) 6 θ, then v θ ρ c 4 E(u) r (r ρ) 7. Lemma 17 (Decease of domain of analyticity). w is a Diophantine frequency, g W r of zero mean, then ψ(θ) = g(θ) has a unique solution ψ W r of zero mean, 0 < r < r. Moreover, g r ψ r c 2 (γ) (r r ). (21) 3 Proof. The Fourier coefficients of g and ψ are related via the following: ψ n = g n e i2πnw 1 0, ψ 0 = 0. The Diophantine condition (19) gives the lower bound on the denominators: while g W r implies e i2πnw 1 > c(γ) n 2, g n g r e 2π n r. Using previous inequalities we obtain, for 0 < r < s < r, ψ n g rc 1 (γ)e 2πr n n 2 = g r c 1 (γ)e 2πs n e?2π(r?s) n n 2 We estimate the right hand side by using the estimate xe x e 1 for positive x, which for x = a n /b gives the inequality e a n n b e b (b/a) b for all n; with a = 2π(r s) and b = 2 we obtain the result ψ n c 1 (γ) g r 1 (r s) 2 e n 2πs. 27

From this estimate we obtain ψ r ψ n e n 2πr 2c 1 g r (r s) 2 (1 e 2π(s r ) ) 1 2c 1 g r (r s) 2 (s r ) which for s = (r+r )/2 gives the desired estimate (21) for ψ. This completes the proof. Lemma 18. For u(θ) satisfying all the conditions of Lemma 16, and ũ = u + v satisfying (ũ, ũ + ) D for all Imθ < ρ for 0 < ρ < r, then where c = c (M, N, γ). E(ũ) ρ c E(u) 2 r (r ρ) 8, Proof. With u as specified, we estimate the error E(ũ) ρ E(u + v) ρ = E(u) + E (u)v + Q ρ with Q being a quadratic remainder. From (20) we obtain u θ E(u) + u θ E (u)v = ve (u)u θ, or E(u) + E (u)v = we (u)u θ = w d E(u). Using Lemma 16 and the Cauchy dθ estimate d E(u) dθ ρ c E(u) r, we obtain r ρ E(u) + E E(u) 2 r (u)v ρ c 4 (r ρ) < c E(u) 2 r 7 4 (r ρ), 8 where c 4 = c 4 (M, N, γ). Moreover, the error Q is quadratically small in terms of E(u) r. Indeed, by Taylor s formula we have for some 0 t 1: Then we have d 2 Q = 1 E(u + tv). 2 dt2 Q ρ c 5 v 2 ρ, where c 5 depends only on h C 3. From this, we obtain the desired estimate. 28

Proof of Theorem 15: Benefit from Lemma 16, Lemma 17 and Lemma 18, the strategy at the beginning of this section is available now. So we can repeat this iteration for infinite times, with the corresponding parameters r n = r 2 + 2 n r 2, and ε n+1 = E(u n+1 ) rn+1 ε 2 nc(m, N, γ, r)2 16n. By taking η n = 2 16(n+1) cε n we get η n+1 η 2 n. Once η 0 = 2 16 cε 0 < 1, then as ε n 0. η n η 2n 0 0 6 Arnold s Example and Normally hyperbolic invariant cylinder (NHIC) Let s start from the following example: Example 11 (Arnold). For ε k 1, H ε (x, t) = p2 2 + k(cos 2πq 1) + r2 2 + εh 1(x, t) where x = (p, q, r, θ) with (q, θ) T 2, (p, r) R 2 and H 1 (x, t) is 1 periodic of time t. For ε = 0, H ε is totally integrable, i.e. the phase space x T T 2 is a product of individual rotator and pendulum systems. We can easy find that (p, q) = (0, Z) are a list of copies of saddle points, with the heteroclinic orbits H ± 0 (t) = (p ± (t), q ± (t)) satisfying (p ± ) 2 (t) 2 + k(cos 2πq ± (t) 1) = 0. Moreover, H + 0 (0, 1) as t +, and H + 0 (0, 0) as t ; H 0 (0, 1) as t, and H 0 (0, 0) as t +. ********************************************************* Goal: For any A < B and a generic perturbation εh 1, we can always find an orbit x(t), such that r(0) < A, r(t ) > B for some T > 0. (AD) 29

Hypothesis (1). H 1 (0, 0, r, θ, t) 0, forall (r, θ, t) R T 2, then Λ 0 = {(p, q, t) = (0, 0, 0)} is a 2 dimensional invariant cylinder. Consider the Poincaré map F ε : x x, restricted on the time section Σ 0 = {t = 0}. It is given by the time one map of H ε. Notice that when F ε is restricted to the cylinder Λ 0, it has a family of invariant curves with either periodic or quasiperiodic motion in each, i.e. F ε Λ0 : (r, θ) (r, θ + r). Let X be an invariant set, i.e. F ε (X) = X given by either the cylinder Λ 0 or an invariant circle T ω = Λ 0 {r = ω}. Define the stable (unstable) manifold by W s ε (X) = {x F n ε x X as n + }, W u ε (X) = {x F n ε x X as n }. Remark. For ε = 0, the invariant manifolds of the cylinder coincide and are given by 3-dimensional manifolds W s (Λ 0 ) = W u (Λ 0 ) = {x p2 2 + k(cos 2πq 1) = 0}. The invariant manifolds of the invariant circle T ω also coincide and are given by 2-dimensional manifolds W s (T ω ) = W u (T ω ) = {x p2 2 + k(cos 2πq 1) = 0, r = ω}. For ε > 0 the invariant manifolds of Λ 0 (resp. T ω ) no longer need to concide and are immersed 3-dimensional (resp. 2-dimensional) manifolds. Arnold s Mechanism: Definition 19. Call a collection of invariant curves (T ω0,, T ωn ) transition chain if W u (T ωi ) W s (T ωi+1 ) i = 0,, n 1, i.e. the intersection is non-empty and transverse. 30

Lemma 20. If there exists a transition chain such that ω 0 < A and ω n > B, then there exist orbits satisfying (AD). Remark. In the original paper of Arnold, he choose H 1 = (1 cos 2πq)(cos 2πθ + cos 2πt) of which the transition chain indeed exists. Besides, for w i away from 0, ω i ω i+1 e 1 k. *************************************************** Now we observe that to prove the Goal, we should find the transition chain first. For general H 1 with ε 0, Hypothesis (1) no longer holds, which urges us to find a deformative invariant cylinder Λ ε close to Λ 0, then try to find the transition chain of the invariant curves T wi on it. 6.1 Existence of NHIC and Conley isolating block Definition 21 (Fenichel, HPS). Let F : M M be a C 1 diffeomorphism. We call Λ M a normally hyperbolic invariant manifold (NHIM), if Λ = F (Λ) and there exists a decomposition of the bundle space, i.e. T M = T Λ E u E s, such that C > 0, 0 < λ < µ < 1 satisfying for all x Λ, v E s x df n (x)v Cλ n v, n 0, v E u x df n (x)v Cλ n v, n 0, v T x Λ df n (x)v Cµ n v, n Z. Example 12. For ε = 0, M = T T 2, Λ = Λ 0 = {x p = q = 0}, then we can see that E s is tangent to the stable direction of the saddle and E u is tangent to the unstable direction of the saddle. Moreover, λ = exp(negative eigenvalue of the saddle), µ = 1 + Cε, C = C(H 1 ) is a constant. Theorem 22. For all εh 1 be C 3 small perturbation, system H ε preserves a 2 dimensional NHIC Λ ε close to Λ 0. 31

Remark. This Theorem is a special application of HPS s invariant manifold theorem in our setting. From the definition of NHIM, we can see that at least we need Λ be C 1 sub manifold (if it exists). This implies that we need to prove the existence of topologically invariant set first (isolating block), then raise the smoothness in the tangent bundle space (cone condition). 6.1.1 Conley Isolating Block For a C 1 differential equation ẋ = F (x), x R n, (22) we can decompose R n by R ns R nu R nc, of which F = (F s, F u, F c ). We choose Ω = B u B s B c, where B ( = u, s, c) are all Euclidean balls. Assume L(x) is bounded with L uu L us L uc L(x) = df (x) = L su L ss L sc. L cu L cs L cc We call such a Ω isolating block, if the following Hypothesis holds: Hypothesis (2). Isolating Block F c = 0 on B u B s B c, F u (u, s, c)u > u on B u B s B c, F s (u, s, c)s < s on B u B s B c. Theorem 23 (Existence). If Ω is an isolating block, there exist a topological invariant set in Ω. Proof. We just need to take i Z F i (B u B s B c ) and Hypothesis (2) ensures the intersection is not empty. Hypothesis (3). Cone Condition For all x Ω, L uu (x) αi, L ss (x) αi, L ij (x) m, ij ss, uu. ij 32

Exercise 12. Let B c = 0, Prove that F satisfying Hypothesis (2) and (3) with α > 2m will preserve a saddle point in B u B s. Theorem 24. Assume Hypothesis (2) and (3) hold, then for K = m 1 2, W sc = {x x(t) Ω, t 0} is a graph of C 1 function W sc ; W uc = {x x(t) Ω, t 0} is a graph of C 1 function W uc ; W c = {x x(t) Ω, t R} is a graph of C 1 function W c ; Here W sc : B s B c B u, W uc : B u B c B s and W c : B c B u B s. Moreover, dw sc K, dw uc K and dw c 2m. α 2m Corollary 25. Theorem 24 leads to Theorem 22 for small ε, and we can estimate how small ε is!!! Lemma 26. If Hypothesis (3) holds, then π sc W sc = B s B c, < 1 > π uc W uc = B u B c. < 2 > Moreover, the closure W sc and W uc satisfies W sc B u B s B c, Wuc B s B u B c. < 3 > Here π ( = s, u, c) is the standard projection to each subspace. Proof. T + (x) = min{t > 0, x(t) Ω}. If T + (x) <, x / W sc. By Hypothesis (3), x(t + (x)) B u B s B c, which implies T + (x) is locally continuous and is C 1 dependent of the initial position x. Then we can define the retraction: Definition 27. We call the map f : X Y X a retraction, if f Y = id. We can see that there exists no retraction from B u to B u. To prove < 1 >, we assume that (s, c) B s B c such that W sc B u {s} {c} =, which implies π sc W sc (s, c) =. Then we actually find a retraction u B u π u x(t + (u, s, c)) B u, which leads to a contradiction. < 2 > can be proved in a similar way. For < 3 >, notice that in a small neighborhood of B u, T + (x) <. Then W sc (N ( B u ) B s B c ) =. This leads to W sc B u B s B c. The other cases follows the same way. 33

6.1.2 the cone condition and graph transform Recall that previous Theorem 23 ensures the existence of a topological invariant set in Ω, but we need to prove the smoothness (at leaf C 1 ) then get Theorem 24. Besides, we can see that the topological invariant set W c is actually the intersection of the following forward invariant and backward invariant sets: W sc = i Z +F i (Ω); W uc = i Z F i (Ω); W c = W sc W uc. Benefit from Hypothesis (3), we can prove that W uc (reap. W sc ) are C 1 smooth and then W c is also C 1 : For a point z Ω define a map from T z Ω to its neighborhood U Ω by considering the exponential map exp z (v) Ω. By definition exp z (0) = z and for a unit vector v we have exp z (tv) to be the position of geodesic starting at z and for a unit vector v after time t. For the Euclidean components we assume that the metric is flat and the corresponding exponential map is linear. Now let s denote by the following unstable (resp. stable) cone in T z Ω by K u (z) = {(u, s, c) u s + c }; K s (z) = {(u, s, c) s u + c }; with z Ω and (u, s, c) T z Ω is the coordinate corresponding to the components of Ω = B u B s B c. Lemma 28. For z W uc, K s (z) is backward invariant of the flow (22). Respectively, for z W sc, K u (z) is forward invariant. Remark. Due to the exponential map, we can get that W uc and W sc are both Lipschitz graphs. Actually, we can automatically raise the smoothness to C 1, this is because the topological invariant set W c is unique and every Lipschitz function is almost everywhere differentiable. At last, let s point out that due to the invariant theorem we can continue to raise the smoothness of W c, see [2] for more details. 34

7 Large Gap problem and weak KAM theory In last section, Theorem 24 implies NHIC could preserve for sufficiently small ε, even if Hypothesis (1) is not available. Then due to Arnold s mechanism, we need to find a transition chain first, then have chance to construct a diffusion orbit. However, the foliation structure of NHIC will be destructed by generic H 1, and the KAM theorem only ensures that a sparse set of KAM tori preserves, although they may occupy a large measure part. Example 13. Verify that the greatest p deviation between any two points in the zero energy level of H = p2 2 + ε(cos q 1) with (p, q) T T is 2 ε. Previous example implies that under a generic perturbation of the NHIC, there would be a Birkhoff unstable domain of the width O( ε), i.e. the two adjacent KAM tori on the NHIC will be O( ε) faraway. T w T w ε in turns implies that w w ε. On the other side, the Melnikov function implies that W u (T w ) W s (T w ) holds only for w w ε (resp. W s (T w ) W u (T w )). Here ε ε cause a so called Large Gap Problem, i.e. we couldn t use the Arnold s mechanism to overcome the Birkhoff unstable domain. ********************************************************* Goal: For any w [w, w ] with T w and T w are two adjacent non contractible invariant circles, we can fill the gap with invariant set A w (Aubry Mather set) of the rotational number w, such that W u (A wk ) W s (A wk+1 ) (resp. W s (A wk ) W u (A wk+1 ) ) for at least one sequence {w i } N i=1 [w, w ]. This {w i } will form a transition chain and we can now across the Birkhoff unstable domain. 7.1 weak KAM theory Definition 29. Suppose M is a smooth, compact manifold without boundary, thence call a Hamiltonian H : T M R be Tonelli, if 35

H(x, p) is at least C 2 smooth; H(x, p) is fibre convex, i.e. 2 pph is positively definite for all (x, p) T M; H(x, p) is superlinear, i.e. lim p H(x,p) p =. Example 14. (Geodesic Flow) H(x, p) = 1 2 p 2 x; (Mechanical) H(x, p) = 1 2 p 2 + U(x); (Mané s Hamiltonian) for any vector field X(x) on M, H(x, p) = 1 2 p 2 + p, X(x). Definition 30. Suppose M is a smooth, compact manifold without boundary, thence call a Lagrangian L : T M R be Tonelli, if L(x, v) is at least C 2 smooth; L(x, v) is fibre convex, i.e. 2 vvl is positively definite for all (x, v) T M; L(x, v) is superlinear, i.e. lim p L(x,v) v =. This Largangian is highly related with the following variation problem: denote by A L (x, x, a, b) = b a L(γ(s), γ(s))ds the Action function of the C 1 piecewise path γ : [a, b] M with fixed endpoints γ(a) = x, γ(b) = x, the following Lemma holds: Lemma 31. The critical value of A L (x, x, a, b) corresponds to curves γ satisfying the Euler Lagrange equation: for all t [a, b]. d dt L v(γ, γ) = L x (γ, γ), (EL) Remark. For autonomous Lagrangian L(x, v), (EL) is complete, i.e. any solution of (EL) can be expanded for all t R. If L(x, v, t) is time periodic, then we have to add the following item in the definition of Tonelli Lagrangian: 36

(EL) flow is complete for all t R. Theorem 32 (Legendre Transformation). There is an one to one correspondence between Tonelli Hamiltonian and Tonelli Lagrangian: L(x, v) = max { p, v H(x, p)}. ( ) p T M Respectively, H(x, p) = max v T M { p, v L(x, v)}. Recall that L : (x, p) (x, v) is a diffeomorphism due to the convexity of H(x, p) and actually v = p H(x, p). We call L the Fenichel Legendre transformation. Due to this Legrandre transformation, we get: Example 15. (Geodesic Flow) L(x, v) = 1 2 v 2 x; (Mechanical) L(x, v) = 1 2 v 2 U(x); (Mané s Lagrangian) L(x, v) = 1 2 v X(x) 2. Lemma 33. L sends an orbit of (H) into an orbit of (EL). Definition 34. We call T w a (maximal) KAM torus with w R n, if T w = {(x, c + du(x))} is a Lagrangian graph; T w is invariant of the Hamiltonian flow; the flow on T w is conjugated to the rigid flow ẋ = w (or x x + wt (mod 1)). Remark. If k, w 0 for all k Z n \{0}, then any orbit on T w is dense. Moreover, due to the Birkhoff ergodic Theorem, there exists a unique invariant probability measure µ which supports on T w and µ = ϕ Leb with Leb is the Lebesgue measure on T n and ϕ conforms to the commutative diagram: ϕt H T T ϕ ϕ T ρt w T. Theorem 35 (Herman). If k, w 0 for all k Z n \{0}, any KAM torus has to be a maximal KAM torus. 37

Definition 36. Denote by M the space of all probability invariant measures on T M, then we define by w = [µ] = vdµ the rotational vector of µ. Remark. If µ is ergodic, then due to the Birkhoff ergodic Theorem, we can find a (EL) curve γ, such that 1 T lim T T 0 T M γ(t)dt = T M vdµ = [µ]. So essentially we can see that w is the averaging velocity of the ergodic curve γ. Exercise 13. For integrable system H(x, p) = 1 2 p2, the probability measure µ supported on T w has the rational vector w. Theorem 37. The unique measure (invariant, probability) supported on T w satisfies Ldµ = inf Ldµ. [µ]=w Now we expand previous idea: for any ρ R n, can we find an invariant probability measure µ ρ, such that [µ ρ ] = ρ? In the universal cover of T n, for any fixed point x 0, we introduce the following variational principle: A(ρ, x 0, T ) := T min L(γ(s), γ(s), s)ds, γ(0)=x 0 0 γ(t )=x 0 +ρt where min is for all the absolute continuous curves. Remark. If L(x, v, t) is time-periodic, we need to impose the completeness of (EL). Due to the Tonelli Theorem, the minimizer of A(ρ, x 0, T ) is always achievable and we can improve the smoothness of the minimizer γ to C 2 ([0, T ], T n ). 38

Suppose µ T is the singular probability measure supported on {(γ(s), γ(s)) s [0, T ]}, i.e. for all function f(x, v) with compact support in T T n, T T n fdµ T = 1 T T 0 0 f(γ(s), γ(s))ds. Recall that µ T is not invariant, but indeed γ(t ) γ(0) ρ = = 1 T γ(s)ds = T T vdµ T. Due to Krylov-Bogerlybov Theorem, the space of all probability measures on T T n is weak* compact, which implies lim T µ T exists, and each limit measure µ is invariant with vdµ = ρ. Recall that if µ is ergodic, we can use the Birkhoff ergodic theorem and find at least one (γ, γ) supp (µ ), such that 1 T lim T T 0 γ(t)dt = T M vdµ = ρ. (23) If µ is not ergodic, the following example shows that we are not able to find such a supported orbit γ such that (23) holds: Example 16 (Hedlund). We choose a Riemannian Hamiltonian H(q, p) = 1 2 p 2 h, where (q, p) T T 3 and x R 3 is the lift of q T 3. Along e 1 = (1, 0, 0), e 2 = (0, 1, 0) and e 3 = (0, 0, 1), we construct 3 high way T j, j = 1, 2, 3, i.e. the Riemannian metric, h is sufficiently small; outside T j the metric is Euclidean, see Fig. 3. Exercise 14. Use the Legendre transformation we can get L(x, v) = 1 2 v 2 h. Prove that for ρ = (1, 1, 1), we can find an invariant measure µ satisfying [µ] = ρ can be decomposed into µ = µ 1 + µ 2 + µ 3 3 with µ j is a measure supported in T j with [µ j ] = e j, j = 1, 2, 3. 39

Figure 3: Lemma 38. f : T n R is C 1, then T T n df, v dµ = 0 for all invariant measure µ. Proof. Let (x 0, v 0 ) supp(µ), and (x t, v t ) be the (EL) flow. df, v dµ = 1 T dt df, v d(ϕ L t ) µ T 0 T T n 1 T = lim dt df, v T T 0 T T n 1 T = lim dµ df(x t )v t dt T T T T n 0 = lim T = 0. dµ f(x T ) f(x 0 ) T T T n (xt,v t) dµ So we finish the proof. Now we turn back to the (maximal) KAM torus: Suppose the frequency is w and µ w is the corresponding invariant measure on T w, then L(x, v)dµ w = T T n L(x, v) c + du, v dµ w + T T n c, v dµ w T T n 40