Hamiltonian Mechanics


Chapter 3 Hamiltonian Mechanics

3.1 Convex functions

As background for Hamiltonian mechanics we discuss convexity and convex functions. We will also give some applications to thermodynamics. We discuss convex functions without assuming differentiability. In thermodynamics, for instance, the functions considered are often not differentiable. This happens, e.g., in situations where we have phase transitions.

Definition 3.1. For $x, y \in \mathbb{R}^k$ and $0 \le \alpha \le 1$ we say that the point $\alpha x + (1-\alpha) y$ is a convex combination of $x$ and $y$. This point lies on the line segment between $x$ and $y$, and every point on this line segment can be written in this way for some $0 \le \alpha \le 1$.

Definition 3.2 (Convex set). A subset $C \subseteq \mathbb{R}^k$ is said to be convex if, whenever $x, y \in C$, the whole line segment joining $x$ and $y$ is also in $C$. We may rephrase this as follows: for all $x, y \in C$ and all $0 \le \alpha \le 1$ we have $\alpha x + (1-\alpha) y \in C$.

Figure 3.1: A convex set and a set that is not convex.

Chap. 3 Hamiltonian Mechanics, Version of 03.12.08

Definition 3.3 (Convex function). A function $f : C \to \mathbb{R}$ defined on a convex set $C \subseteq \mathbb{R}^k$ is said to be convex if for all $x, y \in C$ and all $0 \le \alpha \le 1$ we have

$$f(\alpha x + (1-\alpha) y) \le \alpha f(x) + (1-\alpha) f(y).$$

This says that the graph of the function $f$ lies below the line segment joining the two points $(x, f(x))$ and $(y, f(y))$. The function is said to be strictly convex if the graph is strictly below, i.e., if $x \ne y$ and $0 < \alpha < 1$ imply

$$f(\alpha x + (1-\alpha) y) < \alpha f(x) + (1-\alpha) f(y).$$

A function $f$ is said to be concave if $-f$ is convex.

Figure 3.2: For a convex function $f$ the value $f(\alpha x + (1-\alpha) y)$ lies below the chord value $\alpha f(x) + (1-\alpha) f(y)$.

Lemma 3.4. If $f : [a, b] \to \mathbb{R}$ is a convex function of one variable, then $f$ attains its maximal value at one of the endpoints $a$ or $b$.

Proof. Any point $x \in [a, b]$ can be written as $x = \alpha a + (1-\alpha) b$ with $0 \le \alpha \le 1$. Since $f$ is convex we have

$$f(x) \le \alpha f(a) + (1-\alpha) f(b) \le \max\{f(a), f(b)\}.$$

Lemma 3.5. If $f : I \to \mathbb{R}$ is a convex function of one variable defined on an open interval $I \subseteq \mathbb{R}$, then the map

$$(x, y) \mapsto \frac{f(x) - f(y)}{x - y}, \qquad x, y \in I,\ x \ne y,$$

is monotone increasing in both $x$ and $y$ separately.

Figure 3.3: Three points $x_1 < x_2 < x_3$ on the graph of $f$.

Proof. For points $x_1 < x_2 < x_3$ in $I$ we have

$$x_2 = \frac{x_3 - x_2}{x_3 - x_1}\, x_1 + \frac{x_2 - x_1}{x_3 - x_1}\, x_3, \qquad 1 = \frac{x_3 - x_2}{x_3 - x_1} + \frac{x_2 - x_1}{x_3 - x_1}.$$

Thus from the convexity of $f$ we obtain

$$f(x_2) \le \frac{x_3 - x_2}{x_3 - x_1}\, f(x_1) + \frac{x_2 - x_1}{x_3 - x_1}\, f(x_3).$$

This implies that

$$\frac{f(x_2) - f(x_1)}{x_2 - x_1} \le \frac{f(x_3) - f(x_1)}{x_3 - x_1} \le \frac{f(x_3) - f(x_2)}{x_3 - x_2}.$$

For any fixed $y \in I$ we let

$$g(x) = \frac{f(x) - f(y)}{x - y}.$$

We then see from the above inequalities that $g(x_1) \le g(x_2)$ in the three situations $x_1 < x_2 < y$, $x_1 < y < x_2$, and $y < x_1 < x_2$. This implies the statement in the lemma.

Theorem 3.6 (Supporting hyperplanes). Let $f : C \to \mathbb{R}$ be a convex function defined on a convex set $C \subseteq \mathbb{R}^k$. For each interior point $x_0 \in C$ there exists at least one vector $\mu \in \mathbb{R}^k$ such that

$$f(x) \ge f(x_0) + \mu \cdot (x - x_0) \quad \text{for all } x \in C.$$

We call $h(x) = f(x_0) + \mu \cdot (x - x_0)$ a supporting hyperplane for $f$ at $x_0$. The function $f$ has partial derivatives at $x_0$ if and only if the supporting hyperplane is unique. In this case $\mu = \nabla f(x_0)$.

Proof. We give the proof first in the case of one variable, i.e., $k = 1$. It follows from Lemma 3.5 that the right and left derivatives

$$\mu_+ = \lim_{x \to x_0+} \frac{f(x) - f(x_0)}{x - x_0}, \qquad \mu_- = \lim_{x \to x_0-} \frac{f(x) - f(x_0)}{x - x_0}$$

exist and that $\mu_- \le \mu_+$. From the lemma again we conclude that $f(x) \ge f(x_0) + \mu (x - x_0)$ for all $x$ if and only if $\mu \in [\mu_-, \mu_+]$. This proves the theorem for one variable.

For a convex function of several variables the problem is more complicated. A proof of the existence of a supporting plane is given in Exercise 3.21. If the function has partial derivatives we may consider its restriction to the lines through $x_0$ in the coordinate directions. Each of these restrictions is a convex function of one variable, and we may apply the result just proved to conclude that there is a unique supporting line along each coordinate direction. Since the function is convex it is not difficult to see that these lines span a supporting hyperplane.

Theorem 3.7 (Continuity of convex functions). A convex function $f : C \to \mathbb{R}$ defined on an open convex set $C \subseteq \mathbb{R}^k$ is continuous.

Proof. We prove this first for convex functions of one variable. Let $x_- < x_0 < x_+$ be in the domain of $f$. Then by Lemma 3.5 we have, for $x_- < x < x_+$ with $x \ne x_0$ (see Figure 3.4),

$$a_- \le \frac{f(x) - f(x_0)}{x - x_0} \le a_+,$$

where

$$a_- = \frac{f(x_-) - f(x_0)}{x_- - x_0}, \qquad a_+ = \frac{f(x_+) - f(x_0)}{x_+ - x_0}.$$

Thus

$$|f(x) - f(x_0)| \le \max\{|a_-|, |a_+|\}\, |x - x_0|,$$

which proves the continuity of $f$ at $x_0$.

For a convex function of several variables it is more complicated. We will show that $f$ is continuous at $x_0 \in C$. Let $Q_\lambda$ be a $k$-dimensional cube of side length $\lambda$ centered at $x_0$. Since $C$ is open, $Q_\lambda \subseteq C$ for $\lambda$ small enough. Applying Lemma 3.4 along each coordinate direction in turn, we conclude that the restriction of $f$ to $Q_\lambda$ must take its maximal value at one of the $2^k$ corner points of $Q_\lambda$. If we let $\lambda$ approach 0, the corners of $Q_\lambda$ trace out straight lines approaching $x_0$. From the one-dimensional case we know that $f$ is continuous when restricted to each of these straight lines. Hence the values at the $2^k$ corners approach $f(x_0)$ as $\lambda$ tends to 0. We conclude that

$$\lim_{\lambda \to 0} \max_{x \in Q_\lambda} f(x) = f(x_0).$$

On the other hand we know that $f$ has a supporting hyperplane at $x_0$, i.e., there exists $\mu_0 \in \mathbb{R}^k$ such that $f(x) \ge f(x_0) + \mu_0 \cdot (x - x_0)$. Hence

$$\lim_{\lambda \to 0} \min_{x \in Q_\lambda} f(x) = f(x_0),$$

and we conclude that

$$\lim_{\lambda \to 0} \max_{x \in Q_\lambda} |f(x) - f(x_0)| = 0,$$

and thus $f$ is continuous at $x_0$.
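Lemma 3.5 and the one-variable case of Theorem 3.6 can be checked numerically on a function that is convex but not differentiable. The following is a minimal sketch, assuming the test function $f(x) = |x|$, a sample grid on $[-3, 3]$, and small floating-point tolerances; the helper names `slope`, `one_sided_slopes` and `supports` are hypothetical and not part of the notes.

```python
def slope(f, x, y):
    """Difference quotient (f(x) - f(y)) / (x - y) from Lemma 3.5."""
    return (f(x) - f(y)) / (x - y)


def one_sided_slopes(f, x0, h=1e-6):
    """Approximate the left and right derivatives mu_- and mu_+ at x0."""
    mu_minus = slope(f, x0 - h, x0)
    mu_plus = slope(f, x0 + h, x0)
    return mu_minus, mu_plus


def supports(f, x0, mu, xs, tol=1e-12):
    """Check f(x) >= f(x0) + mu * (x - x0) at every sample point in xs."""
    return all(f(x) >= f(x0) + mu * (x - x0) - tol for x in xs)


f = abs  # assumed example: convex, but not differentiable at 0
xs = [i / 100.0 for i in range(-300, 301)]

# Lemma 3.5: for fixed y the difference quotient is non-decreasing in x.
y = 0.5
quotients = [slope(f, x, y) for x in xs if x != y]
monotone = all(a <= b + 1e-12 for a, b in zip(quotients, quotients[1:]))

# Theorem 3.6 (k = 1): mu gives a supporting line iff mu_- <= mu <= mu_+.
mu_minus, mu_plus = one_sided_slopes(f, 0.0)
inside = supports(f, 0.0, 0.3, xs)    # 0.3 lies in [mu_-, mu_+]
outside = supports(f, 0.0, 1.5, xs)   # 1.5 lies outside [mu_-, mu_+]
```

At the kink $x_0 = 0$ the supporting line is not unique, which matches the statement that uniqueness is equivalent to differentiability.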

Figure 3.4: Continuity of a convex function. The difference quotient at $x_0$ lies between the slopes $a_-$ and $a_+$.

Convex functions defined on sets that are not open may be discontinuous at the boundary (see Exercise 3.3).

Theorem 3.8 (Jensen's inequality). Let $f : I \to \mathbb{R}$ be a convex function on an open interval $I \subseteq \mathbb{R}$. Given non-negative real numbers $\alpha_1, \ldots, \alpha_m$ such that $\alpha_1 + \ldots + \alpha_m = 1$ and points $x_1, \ldots, x_m \in I$, then

$$f(\alpha_1 x_1 + \ldots + \alpha_m x_m) \le \alpha_1 f(x_1) + \ldots + \alpha_m f(x_m).$$

Proof. Since $x_0 = \alpha_1 x_1 + \ldots + \alpha_m x_m$ is a weighted average of the points $x_1, \ldots, x_m$ we have $x_0 \in I$. Since $f$ is convex it has a supporting line at $x_0$, i.e.,

$$f(x) \ge f(x_0) + \mu (x - x_0)$$

for all $x \in I$. Thus

$$\alpha_1 f(x_1) + \ldots + \alpha_m f(x_m) \ge \alpha_1 \big(f(x_0) + \mu (x_1 - x_0)\big) + \ldots + \alpha_m \big(f(x_0) + \mu (x_m - x_0)\big) = f(x_0) + \mu (\alpha_1 x_1 + \ldots + \alpha_m x_m - x_0) = f(x_0).$$

Definition 3.9 (Convex hull). Let $f : A \to \mathbb{R}$ be a function defined on a subset $A \subseteq \mathbb{R}^k$ and bounded below by an affine function, i.e., such that the set

$$D = \{(\mu, b) \in \mathbb{R}^k \times \mathbb{R} \mid f(x) \ge \mu \cdot x + b \text{ for all } x \in A\}$$

is non-empty. We define the convex hull (see Fig. 3.5) of $f$ to be the function

$$f^c(x) = \sup\{\mu \cdot x + b \mid (\mu, b) \in D\},$$

defined on the set where the sup is finite.

Figure 3.5: The convex hull $f^c$ agrees with $f$ except along the dashed line.

Theorem 3.10. The convex hull $f^c$ of a function $f$ is convex and satisfies $f^c(x) \le f(x)$ for all points in the domain of $f$. Moreover, if $f$ is defined on an open set then $f^c$ is the largest convex function with this property, i.e., for any convex function $g \le f$ on the domain of $f$ we have $g \le f^c$ on the domain of $f$.

Proof. Let $D$ be the set corresponding to the function $f$ as in Definition 3.9. If $0 \le \alpha \le 1$ and $x_1, x_2$ are in the domain of $f^c$, then for all $(\mu, b) \in D$

$$\alpha f^c(x_1) + (1-\alpha) f^c(x_2) \ge \mu \cdot (\alpha x_1 + (1-\alpha) x_2) + b.$$

Thus $\alpha x_1 + (1-\alpha) x_2$ is in the domain of $f^c$, i.e., this domain is convex. If we take the sup over $(\mu, b) \in D$ on the right side above we find

$$\alpha f^c(x_1) + (1-\alpha) f^c(x_2) \ge f^c(\alpha x_1 + (1-\alpha) x_2).$$

Thus $f^c$ is convex.

Let $g$ be a convex function such that $g \le f$. Then each $x_0$ in the domain of $f$ is an interior point of the domain of $g$, and hence $g$ has a supporting hyperplane at $x_0$, i.e.,

$$g(x_0) + \mu_0 \cdot (x - x_0) \le g(x) \le f(x)$$

for all $x$ in the domain of $f$. Thus

$$g(x_0) + \mu_0 \cdot (x - x_0) \le f^c(x)$$

for all $x$ in the domain of $f^c$ and, in particular, $g(x_0) \le f^c(x_0)$.

Corollary 3.11. If $f$ is a convex function defined on an open convex set $C \subseteq \mathbb{R}^k$ then $f^c(x) = f(x)$.

[Margin note in the source, with the relation symbol lost in extraction: "f f c in version of Dec. 3".]

Proof. Since $f$ is convex and $f \le f$, Theorem 3.10 gives $f \le f^c \le f$ on $C$.
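For a function of one variable sampled on a grid, the convex hull of Definition 3.9 can be approximated by the lower convex envelope of the sample points. The following is a sketch under stated assumptions: the double-well function $f(x) = (x^2 - 1)^2$ on $[-2, 2]$ is an illustrative choice, the envelope is computed with the standard monotone-chain lower-hull scan rather than with the supremum in the definition, and the function names are hypothetical.

```python
def lower_convex_envelope(points):
    """Vertices of the lower convex hull of (x, f(x)) samples (monotone chain)."""
    hull = []
    for px, py in sorted(points):
        # Drop the last vertex while it lies on or above the chord
        # from the vertex before it to the new point (px, py).
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append((px, py))
    return hull


def evaluate(hull, x):
    """Piecewise-linear interpolation of the envelope at x."""
    for (ax, ay), (bx, by) in zip(hull, hull[1:]):
        if ax <= x <= bx:
            t = (x - ax) / (bx - ax)
            return (1 - t) * ay + t * by
    raise ValueError("x outside the sampled range")


def f(x):
    """Assumed example: a double well, not convex on (-1, 1)."""
    return (x * x - 1.0) ** 2


xs = [i / 100.0 for i in range(-200, 201)]
hull = lower_convex_envelope([(x, f(x)) for x in xs])

hull_at_zero = evaluate(hull, 0.0)    # flat part of the envelope between the wells
hull_at_1_5 = evaluate(hull, 1.5)     # here f is convex and the envelope follows f
```

Between the two minima the envelope is the straight segment joining them, which corresponds to the dashed line in Figure 3.5.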

Theorem 3.12. A $C^2$-function $f : I \to \mathbb{R}$ defined on an open interval $I \subseteq \mathbb{R}$ is convex if and only if $f''(x) \ge 0$ for all $x \in I$. If $f''(x) > 0$ for all $x \in I$ then $f$ is strictly convex.

Proof. If $f$ is convex we conclude from Lemma 3.5 that $f'$ is monotone increasing and hence that $f''(x) \ge 0$. On the other hand assume that $f''(x) \ge 0$, $0 < \alpha < 1$ and $x_1, x_2 \in I$ with $x_1 < x_2$. Then by the Mean Value Theorem there is a $\xi_1 \in [x_1, \alpha x_1 + (1-\alpha) x_2]$ such that

$$f(\alpha x_1 + (1-\alpha) x_2) - f(x_1) = f'(\xi_1)\big((\alpha x_1 + (1-\alpha) x_2) - x_1\big) = (1-\alpha) f'(\xi_1)(x_2 - x_1).$$

Likewise there is a $\xi_2 \in [\alpha x_1 + (1-\alpha) x_2, x_2]$ such that

$$f(x_2) - f(\alpha x_1 + (1-\alpha) x_2) = f'(\xi_2)\big(x_2 - (\alpha x_1 + (1-\alpha) x_2)\big) = \alpha f'(\xi_2)(x_2 - x_1).$$

Since $f'' \ge 0$ we have $f'(\xi_1) \le f'(\xi_2)$ and thus

$$\alpha \big(f(\alpha x_1 + (1-\alpha) x_2) - f(x_1)\big) \le (1-\alpha)\big(f(x_2) - f(\alpha x_1 + (1-\alpha) x_2)\big),$$

which is equivalent to

$$f(\alpha x_1 + (1-\alpha) x_2) \le \alpha f(x_1) + (1-\alpha) f(x_2).$$

Thus $f$ is convex. If $f'' > 0$ the inequality above is strict and hence the convexity is strict.

Example 3.13. The function $f(x) = x^a$ defined on $x > 0$ satisfies $f''(x) = a(a-1) x^{a-2}$. We see that $f''(x) \ge 0$ for $x > 0$ if $a \ge 1$ or $a \le 0$. Thus $f$ is convex in these cases.

3.1.1 Legendre transform

Given a function it is often relevant to use the derivative at a point as the variable instead of the point itself. This leads to the important Legendre transform of the function. We will use a definition of the Legendre transform which does not assume the function to be differentiable.

Definition 3.14 (Legendre transform). Given a function $f : A \to \mathbb{R}$ defined on any subset $A \subseteq \mathbb{R}^k$, we define the Legendre transform of $f$ by

$$f^*(p) = \sup\{x \cdot p - f(x) \mid x \in A\},$$

defined on the set of $p \in \mathbb{R}^k$ where this supremum is finite. We call $p$ the dual variable to $x$.

In order for the Legendre transform to be defined at even a single point, $f$ must be bounded below by some affine function. If $f$ is differentiable and the sup above is attained at an interior point $x$ of the domain of $f$, then at this point we will have $p = \nabla f(x)$. We are thus using the derivative $p = \nabla f(x)$ as the variable.

One way to approach the Legendre transform is to define $p = \nabla f(x)$, attempt to solve this equation for $x$ in terms of $p$, and then express $x \cdot p - f(x)$ in terms of $p$. This approach is difficult since it requires solving an equation and discussing whether the solution is unique. Defining the Legendre transform as a supremum has several advantages: it does not require $f$ to be differentiable, and it does not require discussing the solution to an equation.
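The supremum in Definition 3.14 can be approximated directly by a maximum over sample points, which illustrates that no differentiability is needed. The following is a minimal sketch, assuming a one-dimensional grid on $[-5, 5]$; the function name `legendre` and the two test functions are illustrative choices, not taken from the notes.

```python
def legendre(f, xs, p):
    """Approximate f*(p) = sup_x (x * p - f(x)) by a maximum over the samples xs."""
    return max(x * p - f(x) for x in xs)


xs = [i / 1000.0 for i in range(-5000, 5001)]   # assumed grid on [-5, 5]


def quadratic(x):
    """f(x) = x^2 / 2, whose Legendre transform is p^2 / 2."""
    return 0.5 * x * x


quad_star = legendre(quadratic, xs, 1.5)   # close to 1.5**2 / 2

# f(x) = |x| is not differentiable at 0. Its transform is 0 for |p| <= 1,
# and the supremum is infinite for |p| > 1: on a finite grid the maximum
# then sits at the edge of the grid and grows with the grid size.
abs_star_inside = legendre(abs, xs, 0.5)
abs_star_outside = legendre(abs, xs, 2.0)  # equals (2 - 1) * 5 on this grid
```

The grid maximum only approximates the supremum: it is reliable where the maximizer lies well inside the sampled range, and a maximum attained at the edge of the grid is a sign that $p$ may lie outside the domain of $f^*$.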

Example 3.15. We want to calculate the Legendre transform of the function $f(x) = x^a / |a|$ defined for $x > 0$, where $a \ne 0$. We must maximize $g_p(x) = xp - x^a / |a|$.

We consider first $a > 1$. If $p \le 0$ then the supremum of $g_p$ is 0. If $p > 0$ the maximum occurs when $p = a x^{a-1} / |a| = x^{a-1}$, i.e., $x = p^{1/(a-1)}$. Thus

$$f^*(p) = p^{1/(a-1)}\, p - a^{-1} p^{a/(a-1)} = p^b / b, \qquad \text{where } b = \frac{a}{a-1}.$$

Note that $a^{-1} + b^{-1} = 1$ and $a, b > 1$.

If $a < 0$ and $p > 0$ then the supremum of $g_p$ is infinite. For $p = 0$ the supremum is 0. For $p < 0$ the maximum occurs if $p = a x^{a-1} / |a| = -x^{a-1}$, i.e., $x = |p|^{1/(a-1)}$. Thus for $a < 0$, $f^*$ is defined for $p \le 0$ and

$$f^*(p) = -|p|^{1/(a-1)} |p| - |p|^{a/(a-1)} / |a| = -\left(1 + |a|^{-1}\right) |p|^{a/(a-1)} = -|p|^b / b,$$

where again $b = \frac{a}{a-1} = \frac{|a|}{|a|+1}$.

We finally consider $0 < a < 1$. If $p > 0$ then the supremum of $g_p(x)$ is again infinite. If $p \le 0$ then the supremum is 0. Thus $f^*$ is defined for $p \le 0$ and $f^*(p) = 0$. In this case the Legendre transform is not very useful.

Example 3.16. If we have an affine function $f(x) = \mu \cdot x + b$ then the Legendre transform is only defined for $p = \mu$, and in this case $f^*(\mu) = -b$.

Theorem 3.17 (Convexity of the Legendre transform). The Legendre transform is a convex function defined on a convex set.

Proof. Assume that $p_1, p_2$ belong to the domain of the Legendre transform $f^*$ of the function $f : A \to \mathbb{R}$. Given $0 \le \alpha \le 1$, then for all $x \in A$

$$x \cdot (\alpha p_1 + (1-\alpha) p_2) - f(x) = \alpha (x \cdot p_1 - f(x)) + (1-\alpha)(x \cdot p_2 - f(x)) \le \alpha f^*(p_1) + (1-\alpha) f^*(p_2).$$

This proves both that the point $\alpha p_1 + (1-\alpha) p_2$ belongs to the domain of $f^*$ and that $f^*$ is convex.

Lemma 3.18. For a function $f : A \to \mathbb{R}$ defined on a set $A \subseteq \mathbb{R}^k$ we have for all $x \in A$ that $f(x) \ge f^{**}(x)$.

Proof. For all $x$ in the domain of $f$ and all $p$ in the domain of $f^*$ we have $x \cdot p - f^*(p) \le f(x)$. Hence $f^{**}(x) \le f(x)$.

Theorem 3.19. If $f : A \to \mathbb{R}$ is defined on an open set $A \subseteq \mathbb{R}^k$ and is bounded below by an affine function, then $f^{**} = f^c$ on $A$.

Proof. From the previous lemma we have that $f^{**}(x) \le f(x)$ for all $x \in A$. Hence, since $f^{**}$ is convex, we have from Theorem 3.10 that $f^{**}(x) \le f^c(x)$ for $x \in A$. We will now prove the opposite inequality. For all $x_0 \in A$ we have, since $f^c$ is convex, that there is a supporting hyperplane for $f^c$, i.e.,

$$f^c(x) \ge \mu_0 \cdot (x - x_0) + f^c(x_0)$$

for all $x$ in the domain of $f^c$. In particular we have for all $x$ in the domain of $f$ that

$$\mu_0 \cdot x_0 - f^c(x_0) \ge \mu_0 \cdot x - f^c(x) \ge \mu_0 \cdot x - f(x).$$

Thus $\mu_0 \cdot x_0 - f^c(x_0) \ge f^*(\mu_0)$ and hence

$$f^{**}(x_0) \ge \mu_0 \cdot x_0 - f^*(\mu_0) \ge f^c(x_0).$$

Corollary 3.20. If $f : C \to \mathbb{R}$ is a convex function defined on an open convex set $C$ then $f^{**}(x) = f(x)$ for all $x$ in $C$. Moreover, $p_0 \cdot (x - x_0) + f(x_0)$ is a supporting hyperplane for $f$ at $x_0$ if and only if

$$f^*(p_0) = p_0 \cdot x_0 - f(x_0).$$

If $f$ is strictly convex we may write

$$f^*(p) = p \cdot x(p) - f(x(p)),$$

where $x(p)$ is the unique point with supporting hyperplane for $f$ given by $x \mapsto p \cdot (x - x(p)) + f(x(p))$.

Proof. From Corollary 3.11 we know that $f = f^c$, and it follows from the previous theorem that $f = f^{**}$. We have that $p_0 \cdot (x - x_0) + f(x_0)$ is a supporting hyperplane for $f$ if and only if

$$p_0 \cdot (x - x_0) + f(x_0) \le f(x)$$

for all $x \in C$, i.e., if and only if

$$p_0 \cdot x - f(x) \le p_0 \cdot x_0 - f(x_0)$$

for all $x \in C$. This is however equivalent to $f^*(p_0) = p_0 \cdot x_0 - f(x_0)$. It is clear that if $f$ is strictly convex then two points cannot have the same supporting hyperplane. Thus $x_0$ is uniquely determined from $p_0$, i.e., $x_0 = x(p_0)$. This proves the last statement.

If $f$ has partial derivatives and is strictly convex, the equation for the function $x(p)$ described above is of course

$$\nabla f(x) = p.$$

Corollary 3.21. If $f$ is strictly convex on an open convex set then $f^*$ has partial derivatives at all points and the point $x(p)$ above is given by $x(p) = \nabla f^*(p)$.

Proof. According to Theorem 3.6 we must show that $f^*$ has a unique supporting hyperplane at all points. By Corollary 3.20 applied to $f^*$ we know that $f^*(p_0) + x_0 \cdot (p - p_0)$ is a supporting hyperplane for $f^*$ at $p_0$ if and only if $f(x_0) = f^{**}(x_0) = p_0 \cdot x_0 - f^*(p_0)$, but this is equivalent to $p_0 \cdot (x - x_0) + f(x_0)$ being the supporting hyperplane for $f$ at $x_0$, i.e., $x(p_0) = x_0$. Thus $x_0$ is unique and hence so is the supporting hyperplane for $f^*$ at $p_0$. Since $f^*$ thus has partial derivatives at $p_0$ we must have $x(p_0) = x_0 = \nabla f^*(p_0)$.

We see that the Legendre transform is particularly useful for convex functions, where the function can be recovered from its Legendre transform. As a geometric interpretation of the Legendre transform we see that $x \mapsto p \cdot x - f^*(p)$ is the supporting plane of $f$ with slope $p$.
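Theorem 3.19 says that applying the Legendre transform twice gives the convex hull, so for a non-convex function $f^{**}$ differs from $f$ exactly where $f$ lies above its hull. The following is a sketch that checks this on grids, assuming the double-well function $f(x) = (x^2 - 1)^2$ restricted to $[-2, 2]$ and a grid of slopes on $[-30, 30]$; these choices and the name `legendre_table` are illustrative assumptions.

```python
def legendre_table(xs, fs, ps):
    """Discrete Legendre transform: entry j is max_i (xs[i] * ps[j] - fs[i])."""
    return [max(x * p - fx for x, fx in zip(xs, fs)) for p in ps]


def f(x):
    """Assumed example: a double well with minima at x = -1 and x = 1."""
    return (x * x - 1.0) ** 2


xs = [i / 100.0 for i in range(-200, 201)]   # points, grid on [-2, 2]
ps = [j / 20.0 for j in range(-600, 601)]    # slopes, grid on [-30, 30]

fs = [f(x) for x in xs]
f_star = legendre_table(xs, fs, ps)          # f*  sampled on ps
f_star_star = legendre_table(ps, f_star, xs) # f** sampled on xs

i_zero = xs.index(0.0)   # between the wells: hull value 0, but f(0) = 1
i_out = xs.index(1.5)    # outside the wells f is convex, so f** agrees with f

biconjugate_at_zero = f_star_star[i_zero]
biconjugate_at_1_5 = f_star_star[i_out]
```

On the grid the inequality $f^{**} \le f$ of Lemma 3.18 holds at every sample point up to rounding, while equality fails between the wells.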

3.1.2 Legendre transform in thermodynamics

It is an important property of functions in thermodynamics that they are either convex or concave. As an example, the entropy $S(U, V)$ is a monotone increasing concave function of the total energy $U$ at fixed volume $V$. The inverse function $U(S, V)$, which gives the total energy as a function of $S$ and $V$, is thus a convex function (see Exercise 3.11). It is natural to ask for the Legendre transform of $U$ as a function of $S$. The negative of the Legendre transform of the total energy is called the free energy

$$F(T, V) = -U^*(T, V) = -\sup_S \big(T S - U(S, V)\big) = \inf_S \big(U(S, V) - T S\big).$$

We denote the dual variable to $S$ by $T$, since it is indeed the temperature of the system. The free energy is the amount of work that the system can perform in a thermodynamic process at constant temperature. Not all of the total energy $U$ is available.

At the critical temperature of a phase transition the free energy may not be differentiable. At a boiling point, for example, the temperature does not change while the liquid turns into vapor. Since, as we have seen above, the temperature is the derivative of $U$ with respect to $S$ at constant $V$, the entropy increases linearly with the total energy during the phase transition. This is again reflected in a jump in the derivative of the free energy, see Figure 3.6.

Figure 3.6: $T_c$ is a critical temperature of a phase transition. (Panels: $U$ as a function of $S$, with $S_1$ and $S_2$ marked; $F = -U^*$ as a function of $T$, with slopes $-S_1$ and $-S_2$ meeting at $T_c$.)

The fact that the temperature and entropy are dual variables and that the free energy and the total energy are Legendre transforms is related to what is called equivalence of ensembles. The equivalence of ensembles refers to the fact that different microscopic states lead to equivalent macroscopic states. We illustrate this for the ideal gas discussed in Chapter 1.

In Chapter 1 we discussed the microscopic state described by the Maxwell-Boltzmann probability distribution. It describes the situation of an ensemble of systems. Picking one system at random from the ensemble, the probability of finding the particles in that system with certain positions

and velocities at a given time is determined by the Maxwell-Boltzmann distribution. One refers to this as the canonical ensemble. In the Maxwell-Boltzmann distribution the state with velocities $v_1, \ldots, v_N$ is given the relative weight

$$\exp\left(-\frac{\sum_{i=1}^N \frac12 v_i^2}{k_B T}\right)$$

(assuming here that all particles have mass 1).

A different microscopic state corresponds to giving all states with total energy $\sum_{i=1}^N \frac12 v_i^2$ less than some $U$ equal probability. This is referred to as the micro-canonical ensemble. The Legendre transform equivalence between the energy and the free energy describes the equivalence of these two ensembles. This reflects the fact that as the number of particles $N$ tends to infinity the probabilities will in both cases concentrate on states with a fixed total energy.

To understand this we must explain how the entropy and free energy are given. We introduce the partition functions for the two systems, i.e., the normalization constants for the weights above (the numbers they must be divided by in order to make them probability distributions):

$$Z_{\text{Canonical}}(T, V, N) = (N!)^{-1} V^N \int \exp\left(-\frac{\sum_{i=1}^N \frac12 v_i^2}{k_B T}\right) d^{3N}v,$$

$$Z_{\text{Micro-canonical}}(U, V, N) = (N!)^{-1} V^N \int_{\sum_{i=1}^N \frac12 v_i^2 < U} 1 \, d^{3N}v.$$

(The factor $(N!)^{-1}$ comes from treating the particles as indistinguishable.) The free energy and entropy are given by

$$F_N(T, V) = -k_B T \ln Z_{\text{Canonical}}(T, V, N), \qquad S_N(U, V) = k_B \ln Z_{\text{Micro-canonical}}(U, V, N). \tag{3.1}$$

These two functions will not in general be Legendre transforms of each other. But in the large $N$ limit they will (see Exercise 3.12). This is a consequence of the probabilities in both cases concentrating on states with a fixed total energy. In the next section we discuss the micro-canonical and canonical ensembles for systems other than the ideal gas. Discussing the equivalence of these ensembles in generality goes beyond the scope of these notes.

3.2 The Hamiltonian and Hamilton's equations

We turn to the application of the Legendre transform in mechanics. We note that in all the situations discussed in Chapter 2 where a Lagrangian function $L(q, v, t)$ describes a mechanical system, the function had the form

$$L(q, v, t) = (A(q, t) v + b(q, t))^2 + h(q, t) \tag{3.2}$$

for a non-singular square matrix function $A$, a vector function $b$ and a scalar function $h$. This is the case in an inertial system, after a change of coordinates (even time dependent)

or for constrained motion. Thus in all these cases the mechanical Lagrangian is a strictly convex function of $v$ for fixed $q$ and $t$ (see Exercise 3.10). We will study its Legendre transform.

Definition 3.22 (The Hamiltonian). If $L(q, v, t)$ is a mechanical Lagrangian function, which is strictly convex in $v$ for fixed $q$ and $t$, then the Legendre transform for fixed $q$ and $t$,

$$H(q, p, t) = \sup_v \big(v \cdot p - L(q, v, t)\big),$$

is called the Hamiltonian. It is defined for all $p \in \mathbb{R}^k$.

If $v \mapsto L(q, v, t)$ is $C^1$, strictly convex and defined on an open convex set, we know from Corollary 3.20 that we may write

$$H(q, p, t) = v(q, p, t) \cdot p - L(q, v(q, p, t), t), \tag{3.3}$$

where $v(q, p, t)$ is determined as the unique solution to the equations

$$p_i = \frac{\partial L}{\partial v_i}(q, v, t), \qquad i = 1, \ldots, k, \tag{3.4}$$

i.e., $p_i$ is the generalized momentum corresponding to the coordinate $q_i$.

If we are in an inertial system and have a system of $N$ particles we have

$$L(q, v, t) = \sum_{i=1}^N \tfrac12 m_i v_i^2 - V(q),$$

where $q_i, v_i \in \mathbb{R}^3$ for $i = 1, \ldots, N$. Then $p_i = \nabla_{v_i} L(q, v) = m_i v_i$ and thus

$$H(q, p) = \sum_{i=1}^N \frac{p_i^2}{2 m_i} + V(q).$$

That is, we recognize the Hamiltonian $H(q, p)$ as the energy function.

Our goal is to rewrite the equations of motion in terms of the Hamiltonian.

Theorem 3.23. Let the Lagrangian $L(q, v, t)$ be a $C^2$-function of the form (3.2), which is strictly convex in $v$. The motion $\gamma(t)$ solves the equations of motion, i.e., the Euler-Lagrange equations for the action corresponding to $L$, if and only if $q_i(t) = \gamma_i(t)$ and

$$p_i(t) = \frac{\partial L}{\partial v_i}(\gamma(t), \dot\gamma(t), t), \tag{3.5}$$

for $i = 1, \ldots, k$, solve Hamilton's equations

$$\dot q_i(t) = \frac{\partial H}{\partial p_i}(q(t), p(t), t), \qquad \dot p_i(t) = -\frac{\partial H}{\partial q_i}(q(t), p(t), t).$$

Proof. Since $v \mapsto L(q, v, t)$ is a strictly convex $C^2$ function we know from Corollary 3.21 that the relation (3.4) is equivalent to

$$v_i = \frac{\partial H}{\partial p_i}(q, p, t), \qquad i = 1, \ldots, k.$$

Thus with $q(t) = \gamma(t)$ the first of Hamilton's equations is equivalent to (3.5). We then see that the second of Hamilton's equations is equivalent to the Euler-Lagrange equations

$$\frac{d}{dt}\left(\frac{\partial L}{\partial v_i}(\gamma(t), \dot\gamma(t), t)\right) = \frac{\partial L}{\partial q_i}(\gamma(t), \dot\gamma(t), t), \qquad i = 1, \ldots, k,$$

if we can prove that

$$\frac{\partial H}{\partial q_i}(q, p, t) = -\frac{\partial L}{\partial q_i}(q, v, t), \qquad v = v(q, p, t). \tag{3.6}$$

If the functions $v_i(q, p, t) = \frac{\partial H}{\partial p_i}(q, p, t)$ for $i = 1, \ldots, k$ are $C^1$, we can use the chain rule on (3.3) to obtain

$$\frac{\partial H}{\partial q_i}(q, p, t) = \sum_{j=1}^k \frac{\partial v_j}{\partial q_i}(q, p, t)\, p_j - \sum_{j=1}^k \frac{\partial L}{\partial v_j}(q, v(q, p, t), t)\, \frac{\partial v_j}{\partial q_i}(q, p, t) - \frac{\partial L}{\partial q_i}(q, v(q, p, t), t) = -\frac{\partial L}{\partial q_i}(q, v(q, p, t), t),$$

where we have used (3.4). If $L$ is of the form (3.2) then we can explicitly check that $v_i(q, p, t)$ are $C^1$ for $i = 1, \ldots, k$ (see Exercise 3.10). In fact, one can also conclude the validity of (3.6) without the explicit form of $L$, but we shall not give the argument here.

If $H : \Omega \to \mathbb{R}$ is a $C^2$ function on an open subset $\Omega$ of $\mathbb{R}^{2k+1}$, it follows from an existence and uniqueness theorem similar to Theorem 1.3 that there exists a unique solution to Hamilton's equations defined in some open interval around $t_0$ if we specify initial conditions $q(t_0) = q_0$ and $p(t_0) = p_0$ such that $(q_0, p_0, t_0) \in \Omega$. If we assume for simplicity that $\Omega = \mathbb{R}^{2k+1}$ and that the solution exists for all times, we find as in Section 1.4 a flow $\Psi_{t,t_0} : \mathbb{R}^{2k} \to \mathbb{R}^{2k}$ such that $\Psi_{t,t_0}(q_0, p_0) = (q(t), p(t))$ is the solution to Hamilton's equations with initial conditions $q(t_0) = q_0$ and $p(t_0) = p_0$. This flow is called the Hamiltonian flow. If $H$ is a $C^2$ function then $\Psi_{t,t_0}(q, p)$ will be a $C^2$ function of $(q, p, t)$ (see E.A. Coddington and N. Levinson: Theory of Ordinary Differential Equations).

One of the important properties of the Hamiltonian flow is that it preserves volume in phase space. This result is known as Liouville's Theorem and the precise formulation is as follows.

Theorem 3.24 (Liouville's Theorem). If $\Psi_{t,t_0}$ is the Hamiltonian flow for a Hamiltonian which is $C^2$ then $\det D\Psi_{t,t_0} = 1$, where $D\Psi_{t,t_0}$ is the Jacobian.

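In particular, for any continuous function $f : \mathbb{R}^{2k} \to \mathbb{R}$ we have

$$\int_{\mathbb{R}^{2k}} f\big(\Psi_{t,t_0}^{-1}(q, p)\big)\, d^kq\, d^kp = \int_{\mathbb{R}^{2k}} f(q, p)\, d^kq\, d^kp.$$

Proof. As in Chapter 1 we have $\Psi_{t,t_0} = \Psi_{t,t_1} \circ \Psi_{t_1,t_0}$. Thus by the chain rule

$$D\Psi_{t,t_0} = (D\Psi_{t,t_1} \circ \Psi_{t_1,t_0})\, D\Psi_{t_1,t_0}$$

and hence

$$\det(D\Psi_{t,t_0}) = \det(D\Psi_{t,t_1} \circ \Psi_{t_1,t_0}) \det(D\Psi_{t_1,t_0}).$$

Thus

$$\frac{d}{dt} \det(D\Psi_{t,t_0})\Big|_{t=t_1} = \frac{d}{dt} \det(D\Psi_{t,t_1} \circ \Psi_{t_1,t_0})\Big|_{t=t_1} \det(D\Psi_{t_1,t_0}).$$

Since $\Psi_{t_1,t_1}$ is the identity map we see from Exercise 3.16 that

$$\frac{d}{dt} \det(D\Psi_{t,t_1})\Big|_{t=t_1} = \operatorname{Tr} \frac{d}{dt} D(\Psi_{t,t_1})\Big|_{t=t_1}.$$

Since $\Psi$ is $C^2$ we have

$$\frac{d}{dt} D\Psi_{t,t_1} = D \frac{d}{dt} \Psi_{t,t_1}.$$

By Hamilton's equations we find

$$\frac{d}{dt} (\Psi_{t,t_1})\Big|_{t=t_1} = \left(\frac{\partial H}{\partial p_1}, \ldots, \frac{\partial H}{\partial p_k}, -\frac{\partial H}{\partial q_1}, \ldots, -\frac{\partial H}{\partial q_k}\right).$$

Hence, written in $k \times k$ blocks with row index $i$ and column index $j$,

$$D \frac{d}{dt} \Psi_{t,t_1}\Big|_{t=t_1} = \begin{pmatrix} \left(\dfrac{\partial^2 H}{\partial q_j \partial p_i}\right)_{i,j} & \left(\dfrac{\partial^2 H}{\partial p_j \partial p_i}\right)_{i,j} \\[2ex] -\left(\dfrac{\partial^2 H}{\partial q_j \partial q_i}\right)_{i,j} & -\left(\dfrac{\partial^2 H}{\partial p_j \partial q_i}\right)_{i,j} \end{pmatrix}$$

and therefore

$$\operatorname{Tr} \frac{d}{dt} D(\Psi_{t,t_1})\Big|_{t=t_1} = \sum_{i=1}^k \left(\frac{\partial^2 H}{\partial q_i \partial p_i} - \frac{\partial^2 H}{\partial p_i \partial q_i}\right) = 0.$$

We conclude that $\det(D\Psi_{t,t_0})$ is independent of time. Since $\Psi_{t_0,t_0}$ is the identity we have that $\det(D\Psi_{t,t_0}) = 1$. The last statement in the theorem is an immediate consequence of the transformation theorem for integrals.

As in Chapter 1 we call the space of $(q, p)$ the phase space. In Chapter 1, the points in phase space were $(x, v)$, i.e., space coordinates and velocities. Now we have replaced the velocities $v_i$ by the momenta $p_i$. For an inertial system this only means that we have included the mass: $p_i = m_i v_i$.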
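Liouville's Theorem can be probed numerically for a system with one degree of freedom. The following is a sketch under stated assumptions: the pendulum Hamiltonian $H(q, p) = p^2/2 - \cos q$ is an illustrative choice, the flow is approximated with the classical fourth-order Runge-Kutta scheme, and the Jacobian is estimated with central finite differences, so the determinant equals 1 only up to discretization error. All function names are hypothetical.

```python
import math


def hamiltonian_vector_field(q, p):
    """Right-hand side of Hamilton's equations for H(q, p) = p**2 / 2 - cos(q)."""
    return p, -math.sin(q)   # (dq/dt, dp/dt) = (dH/dp, -dH/dq)


def flow(q, p, t, steps=2000):
    """Approximate the Hamiltonian flow over time t with classical RK4."""
    h = t / steps
    for _ in range(steps):
        k1q, k1p = hamiltonian_vector_field(q, p)
        k2q, k2p = hamiltonian_vector_field(q + 0.5 * h * k1q, p + 0.5 * h * k1p)
        k3q, k3p = hamiltonian_vector_field(q + 0.5 * h * k2q, p + 0.5 * h * k2p)
        k4q, k4p = hamiltonian_vector_field(q + h * k3q, p + h * k3p)
        q += h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return q, p


def jacobian_determinant(q, p, t, eps=1e-5):
    """Central-difference estimate of det D(flow) at the initial point (q, p)."""
    qa, pa = flow(q + eps, p, t)
    qb, pb = flow(q - eps, p, t)
    qc, pc = flow(q, p + eps, t)
    qd, pd = flow(q, p - eps, t)
    dq_dq, dp_dq = (qa - qb) / (2 * eps), (pa - pb) / (2 * eps)
    dq_dp, dp_dp = (qc - qd) / (2 * eps), (pc - pd) / (2 * eps)
    return dq_dq * dp_dp - dq_dp * dp_dq


det = jacobian_determinant(1.0, 0.5, 3.0)   # expected to be close to 1
```

The individual entries of the Jacobian are far from those of the identity matrix after time 3, so the determinant staying near 1 reflects the cancellation in the trace computed in the proof.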

Chap. 3 Hamiltonian Mechanics Version of 03.12.08 15 Theorem 3.25 (Conservation of energy). If H(p, q) is a C 1 -function independent of time and (q(t), p(t)) are C 1 solutions to Hamilton s equations then d H(q(t), p(t)) = 0. dt Proof. By the chain rule we calculate d k dt H(q(t), p(t)) = H (q(t), p(t)) q i (t) + H (q(t), p(t))ṗ i (t) q i p i = i=1 k i=1 H (q(t), p(t)) H (q(t), p(t)) H (q(t), p(t)) H (q(t), p(t)) = 0. q i p i p i q i Remark 3.26. In the previous section we discussed the canonical and micro-canonical ensembles for the ideal gas. We can also define these ensembles for more complicated systems. In fact, for any system described by a time independent Hamiltonian we define the canonical ensemble as the probability distribution on phase space giving the relative weight exp( H/(k B T )) to each state. The micro-canonical ensemble is defined as giving equal weight to all states with H U. By the previous statement these probability distributions are invariant under the Hamiltonian flow. To prove that these two ensembles give equivalent macroscopic descriptions in the sense discussed in the previous section requires additional assumptions and is a highly non-trivial fact. 3.3 Noether s Theorem The final topic we want to discuss in classical mechanics is the notion of symmetries and conservation laws. For simplicity we consider a system described by a Lagrangian function L(q, v) independent of time and defined on all of space L : R 2k R. We introduce transformations as maps very similar to the coordinate changes from Chapter 2. Definition 3.27 (Transformation). A transformation in space is a C 1 map ψ : R k R k that is bijective with det Dψ( x) 0, for all x R k. A continuous transformation is a ψ depending on an additional parameter such that it is a C 2 -function ψ : R k ( a, a) R k, for some a > 0 with ψ 0 (q) = q. We have here written ψ s (q) for the transformation. 
The continuous parameter in a continuous transformation should not be confused with time and we will therefore denote it by $s$. We will denote the derivative with respect to $s$ by $\psi'_s(q)$ and the Jacobian with respect to $q$ by $D\psi_s(q)$.

Definition 3.28 (Symmetry). We say that a transformation $\psi$ is a symmetry of our system, or that $L$ is invariant under $\psi$, if (compare with how $L$ changed under a coordinate change)
\[ L(\psi(q), D\psi(q)v) = L(q, v). \]
A continuous transformation $\psi$ for which every $\psi_s$ is a symmetry is called a continuous symmetry of the system.
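The invariance condition in Definition 3.28 can be tested numerically at sample points. The sketch below is an illustration only, under assumptions not taken from the notes: a two-dimensional Lagrangian $L(q,v) = \frac{1}{2} m |v|^2 - V(|q|)$ with the arbitrary choices $m = 2$ and $V(r) = r^4$. It compares a rotation, which leaves this $L$ unchanged, with a translation, which does not. The helper names are hypothetical.

```python
import math

M = 2.0  # hypothetical mass, chosen only for the illustration

def lagrangian(q, v):
    """L(q, v) = (1/2) m |v|^2 - V(|q|) with the assumed potential V(r) = r**4."""
    r = math.hypot(q[0], q[1])
    return 0.5 * M * (v[0] ** 2 + v[1] ** 2) - r ** 4

def rotate(s, x):
    """Rotation by the angle s; the map is linear, so D psi_s(q) v = rotate(s, v)."""
    c, sn = math.cos(s), math.sin(s)
    return (c * x[0] - sn * x[1], sn * x[0] + c * x[1])

def shift(s, x):
    """Translation by s along the first axis; its Jacobian is the identity."""
    return (x[0] + s, x[1])

q, v = (0.3, -1.2), (0.7, 0.4)

# Largest deviation |L(psi_s(q), D psi_s(q) v) - L(q, v)| over a few values of s.
rot_defect = max(
    abs(lagrangian(rotate(s, q), rotate(s, v)) - lagrangian(q, v))
    for s in (-0.5, 0.1, 0.9)
)
# For the translation the velocity is unchanged because D psi_s is the identity.
shift_defect = abs(lagrangian(shift(0.5, q), v) - lagrangian(q, v))

print(rot_defect < 1e-12, shift_defect > 1e-3)
```

A vanishing defect at a few sample points is of course no proof of invariance; a non-vanishing defect, on the other hand, does show that the map is not a symmetry.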

Example 3.29. (a) If $A \in O(3)$ is a $3 \times 3$ orthogonal matrix then the linear transformation $\psi(q) = Aq$ is a symmetry for a system described by a Lagrangian function of the form
\[ L(q, v) = \tfrac{1}{2} m v^2 - V(|q|), \]
where $m > 0$ and $V : [0, \infty) \to \mathbb{R}$ is a $C^1$ function. (See Exercise 3.17.)

(b) The continuous transformation $\psi_s(q) = q + s$ (translations) is a continuous symmetry of the free action
\[ S(\gamma) = \frac{1}{2} \int_{t_1}^{t_2} \dot{\gamma}(t)^2\, dt. \]
(See Exercise 3.18.)

Theorem 3.30 (Noether's Theorem). If $\psi : \mathbb{R}^k \times (-a, a) \to \mathbb{R}^k$ is a continuous symmetry of a system described by a $C^2$-Lagrangian function $L : \mathbb{R}^{2k} \to \mathbb{R}$ then the function
\[ I(q, v) = \nabla_v L(q, v) \cdot \psi'_0(q) \]
is an integral of the motion, i.e., a conserved quantity. This means that if $\gamma(t)$ is a solution to the Euler-Lagrange equations for $L$ then
\[ \frac{d}{dt} I(\gamma(t), \dot{\gamma}(t)) = 0. \]

Proof. From the Euler-Lagrange equations we find
\[ \frac{d}{dt} I(\gamma(t), \dot{\gamma}(t)) = \frac{d}{dt} \bigl( \nabla_v L(\gamma(t), \dot{\gamma}(t)) \bigr) \cdot \psi'_0(\gamma(t)) + \nabla_v L(\gamma(t), \dot{\gamma}(t)) \cdot D\psi'_0(\gamma(t)) \dot{\gamma}(t) \]
\[ = \nabla_q L(\gamma(t), \dot{\gamma}(t)) \cdot \psi'_0(\gamma(t)) + \nabla_v L(\gamma(t), \dot{\gamma}(t)) \cdot D\psi'_0(\gamma(t)) \dot{\gamma}(t). \]
On the other hand, since $\psi$ is a continuous symmetry we have
\[ L(\psi_s(q), D\psi_s(q)v) = L(q, v) \]
for all $s \in (-a, a)$. Thus
\[ 0 = \frac{d}{ds} L(\psi_s(q), D\psi_s(q)v)\Big|_{s=0} = \nabla_q L(q, v) \cdot \psi'_0(q) + \nabla_v L(q, v) \cdot D\psi'_0(q) v. \]
Inserting $q = \gamma(t)$ and $v = \dot{\gamma}(t)$ we see that $\frac{d}{dt} I(\gamma(t), \dot{\gamma}(t)) = 0$.

Exercises

Exercise 3.1. Determine which of the following functions are convex: $x^2$, $\frac{1}{1 + x^2}$, $\exp(x)$, $\ln(x)$, $1 + x^2$, $x^2$.

Exercise 3.2. Show that the intersection of two convex sets is convex.

Exercise 3.3. Show that the function
\[ f(x) = \begin{cases} 0, & x \in [0, 1) \\ 1, & x = 1 \end{cases} \]
defined on $[0, 1]$ is convex and discontinuous.

Exercise 3.4. Use Jensen's inequality to show that the arithmetic mean is at least as large as the geometric mean, i.e.,
\[ \frac{x_1 + \ldots + x_n}{n} \ge (x_1 \cdots x_n)^{1/n}, \]
for $x_1, \ldots, x_n > 0$. Hint: You may use that the exponential function $e^x$ is convex.

Exercise 3.5. Show that the function $f(x) = a + |x - b|$, $x \in \mathbb{R}$, is convex and find its Legendre transform. Here $a$ and $b$ are arbitrary real constants.

Exercise 3.6. Let $f : D \to \mathbb{R}$ be a convex function and denote its Legendre transform by $f^* : E \to \mathbb{R}$.

(a) Show that the Legendre transform of the function $f_1 : D \to \mathbb{R}$ defined by
\[ f_1(x) = a + bx + f(x), \quad x \in D, \]
equals the function $f_1^* : E' \to \mathbb{R}$ defined by
\[ f_1^*(p) = f^*(p - b) - a, \quad p \in E', \]
where $E' = \{p + b \mid p \in E\}$.

(b) Show that the Legendre transform of the function $f_2 : D' \to \mathbb{R}$ defined by
\[ f_2(x) = f(x - b), \quad x \in D', \]
where $D' = \{x + b \mid x \in D\}$, equals the function $f_2^* : E \to \mathbb{R}$ defined by
\[ f_2^*(p) = bp + f^*(p), \quad p \in E. \]

Exercise 3.7. Show that $\exp$ is a convex function on $\mathbb{R}$ and determine its Legendre transform.

Exercise 3.8. Let $f : \mathbb{R}^n \to \mathbb{R}$ be the quadratic function defined by
\[ f(x) = \tfrac{1}{2} x^t A x, \quad x \in \mathbb{R}^n, \]
where $A$ is a positive definite matrix (i.e. $x^t A x > 0$ for all $x \in \mathbb{R}^n \setminus \{0\}$). Show that the Legendre transform $f^*$ of $f$ is given by
\[ f^*(p) = \tfrac{1}{2} p^t A^{-1} p, \quad p \in \mathbb{R}^n, \]
either by using that $f$ is a convex $C^2$-function or by rewriting $x \cdot p - f(x)$ as the difference of two quadratic expressions.
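Before attempting the proof in Exercise 3.8 it can be reassuring to test the stated formula numerically. The sketch below does this only in the one-dimensional case, where $A$ is a positive number $a$ and the formula reads $f^*(p) = p^2/(2a)$. It approximates the supremum in the definition $f^*(p) = \sup_x (xp - f(x))$ by a maximum over a uniform grid. The grid bounds, the value $a = 3$ and the helper name `legendre_on_grid` are assumptions made for the illustration.

```python
def legendre_on_grid(f, p, lo=-10.0, hi=10.0, n=20001):
    """Approximate f*(p) = sup_x (x*p - f(x)) by the maximum over a uniform grid on [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return max((lo + i * step) * p - f(lo + i * step) for i in range(n))

a = 3.0  # hypothetical positive "matrix" in dimension one

def f(x):
    """One-dimensional instance of the quadratic function: f(x) = (1/2) a x**2."""
    return 0.5 * a * x * x

# Compare the grid approximation with the closed form p**2 / (2 a) at a few momenta.
errors = [abs(legendre_on_grid(f, p) - p * p / (2 * a)) for p in (-2.0, 0.0, 1.0, 4.5)]
print(max(errors) < 1e-5)
```

The grid maximum can only underestimate the supremum, and it is reliable here because the maximizer $x = p/a$ lies well inside the grid for the chosen values of $p$.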

Exercise 3.9. (a) Show that if $f$ is any function, $x$ is in the domain of $f$ and $p$ is in the domain of its Legendre transform $f^*$, then
\[ x \cdot p \le f(x) + f^*(p). \]
(This is sometimes called Young's inequality.)

(b) Prove that for all $x, y \ge 0$ and all $a, b > 1$ with $a^{-1} + b^{-1} = 1$ we have
\[ xy \le \frac{x^a}{a} + \frac{y^b}{b}. \]

Exercise 3.10. Show that if $A$ is a non-singular square matrix and $b$ is a vector then the function $v \mapsto |Av + b|^2$ is strictly convex. Calculate its Legendre transform.

Exercise 3.11. Show that if $f : I \to \mathbb{R}$ is a monotone increasing concave function defined on an interval $I \subseteq \mathbb{R}$, then $f$ has an inverse defined on the interval $f(I)$ and the inverse function $f^{-1} : f(I) \to \mathbb{R}$ is convex. You must also show that $f(I)$ is an interval.

Exercise 3.12. In this exercise we will illustrate the equivalence of ensembles for the ideal gas.

(a) If $f : [0, \infty) \to \mathbb{R}$ is a continuous function then using polar coordinates in all dimensions we obtain the formula
\[ \int_{\mathbb{R}^n} f(|x|)\, d^n x = \omega_n \int_0^\infty f(r) r^{n-1}\, dr \]
for some constant $\omega_n$. Using that
\[ \int_{\mathbb{R}^n} e^{-|x|^2}\, d^n x = \left( \int_{-\infty}^{\infty} e^{-t^2}\, dt \right)^n = \pi^{n/2} \]
show that $\omega_n = 2\pi^{n/2} \Gamma(n/2)^{-1}$, where the Gamma function is given by
\[ \Gamma(u) = \int_0^\infty e^{-t} t^{u-1}\, dt. \]
Check the formula in the case $n = 2$. Recall that $\Gamma(n) = (n - 1)!$ for integer $n$.

(b) Use the result of the previous question to calculate the free energy and entropy in (3.1).

(c) Using the approximation in Stirling's formula
\[ \lim_{u \to \infty} u^{-1} \bigl( \ln \Gamma(u) - (u \ln(u) - u) \bigr) = 0 \]
we replace $\ln \Gamma(u)$ by $u \ln(u) - u$ in the formula for the entropy above (also replace $\ln N!$ by $N \ln N - N$). Show that in this approximation the formula agrees with what was found in Exercise 1.20.

(d) Show that in the large particle number approximation used in the previous question the Legendre transform of the total energy $U(S, V)$ as a function of entropy $S$ is $-F(T, V)$ (minus the free energy) as a function of temperature $T$.
(e) (Difficult) Show that this is a consequence of the probabilities in both cases concentrating on the set of states where the total energy is exactly $U$ for the micro-canonical ensemble and $\frac{3}{2} N k_B T$ for the canonical ensemble.
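The constant $\omega_n$ from part (a) of Exercise 3.12 can be sanity-checked numerically before it is used in the later parts. The sketch below is an illustration under its own assumptions: it takes the formula $\omega_n = 2\pi^{n/2}/\Gamma(n/2)$ as stated in the exercise, evaluates the radial Gaussian integral by a midpoint rule truncated at a hypothetical cutoff $r = 12$, and compares the product with $\pi^{n/2}$. The function names are chosen for the illustration.

```python
import math

def omega(n):
    """Constant from Exercise 3.12(a): omega_n = 2 * pi**(n/2) / Gamma(n/2)."""
    return 2.0 * math.pi ** (n / 2) / math.gamma(n / 2)

def gaussian_radial_integral(n, upper=12.0, steps=200000):
    """Midpoint-rule approximation of the integral of exp(-r**2) * r**(n-1) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += math.exp(-r * r) * r ** (n - 1)
    return h * total

# omega_n times the radial integral should reproduce pi**(n/2) for each dimension n.
checks = [
    abs(omega(n) * gaussian_radial_integral(n) - math.pi ** (n / 2))
    for n in (1, 2, 3, 6)
]
print(max(checks) < 1e-8)
```

For $n = 2$ and $n = 3$ the function `omega` returns the familiar values $2\pi$ (circumference of the unit circle) and $4\pi$ (area of the unit sphere), up to rounding.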

Exercise 3.13. Consider the Newtonian potential between two masses $m_1, m_2 > 0$ placed at the points $q_1, q_2 \in \mathbb{R}^3$:
\[ V(q_1, q_2) = -G m_1 m_2 |q_1 - q_2|^{-1}. \]
(a) Write down the Lagrangian and the Hamiltonian for this two body problem.

(b) Write down Hamilton's equations for the system.

Exercise 3.14. Let $A : \mathbb{R}^3 \to \mathbb{R}^3$ be a $C^2$ vector field. We consider the magnetic field $B : \mathbb{R}^3 \to \mathbb{R}^3$ that has $A$ as its vector potential, i.e., $B = \nabla \times A$. The Lagrangian for a particle of mass $m > 0$ and charge $e$ moving in this magnetic field is
\[ L(q, v) = \tfrac{1}{2} m v^2 + e A(q) \cdot v. \]
Find the corresponding Hamiltonian and write down Hamilton's equations.

Exercise 3.15. Consider two particles of masses $m_1, m_2 > 0$ at positions $q_1, q_2 \in \mathbb{R}^3$ with Lagrangian
\[ L(q_1, q_2, v_1, v_2) = \tfrac{1}{2} m_1 v_1^2 + \tfrac{1}{2} m_2 v_2^2 - V(q_2 - q_1), \]
where $V$ is a $C^2$ function $V : \mathbb{R}^3 \to \mathbb{R}$.

(a) Argue that the center of mass
\[ Q = \frac{m_1 q_1 + m_2 q_2}{m_1 + m_2} \]
is a point between $q_1$ and $q_2$.

(b) Define the coordinate change $\psi$ with inverse
\[ \psi^{-1}(q_1, q_2) = (Q, q_2 - q_1). \]
Determine $\psi$.

(c) Find the Lagrangian in the new coordinates $(Q, q)$ where $q = q_2 - q_1$.

(d) Find the Hamiltonian in the new coordinates.

Exercise 3.16. Let $A(t)$ be a square matrix whose entries are $C^1$ functions of time $t$. Show that if $A(0) = I$ then
\[ \left( \frac{d}{dt} \det A \right)(0) = \mathrm{Tr}\, A'(0). \]

Exercise 3.17. Prove that the linear transformation in Example 3.29(a) is really a symmetry of the given system.

Exercise 3.18. Show the claim in Example 3.29(b).

Exercise 3.19. Consider a 2-dimensional system with Lagrangian $L(q, v) = \tfrac{1}{2} m v^2 - V(|q|)$ where $m > 0$ and $V : [0, \infty) \to \mathbb{R}$ is a $C^2$ function.

(a) Show that the continuous transformation
\[ \psi_s(x, y) = (\cos(s) x - \sin(s) y,\ \sin(s) x + \cos(s) y), \quad x, y, s \in \mathbb{R}, \]
defines a continuous symmetry of the system.

(b) Determine the integral of the motion $I$ corresponding to this continuous symmetry according to Noether's Theorem.

Exercise 3.20. Show that the function $g(x) = (x^2 - 1)^4$ defined on $\mathbb{R}$ is not convex and determine its convex hull. (Hint: Write down what you think is the convex hull and show that it is convex.)

Exercise 3.21. In this exercise we will show that a convex function $f : C \to \mathbb{R}$ defined on an open convex set $C \subseteq \mathbb{R}^k$ has supporting hyperplanes at all points in $C$, even when the function does not have partial derivatives.

(a) Show that for all $x_0 \in C$ and $v \in \mathbb{R}^k$
\[ \mu_{x_0}(v) = \lim_{t \to 0+} t^{-1} (f(x_0 + tv) - f(x_0)) = \inf_{t > 0} t^{-1} (f(x_0 + tv) - f(x_0)) \]
exists. Hint: Use the argument proving the existence of $\mu_\pm$ in Theorem 3.6.

(b) Show that $\mu_{x_0}$ from (a) satisfies
\[ \mu_{x_0}(sv) = s \mu_{x_0}(v) \quad \text{and} \quad \mu_{x_0}(v + w) \le \mu_{x_0}(v) + \mu_{x_0}(w) \]
for all $s > 0$ and $v, w \in \mathbb{R}^k$. Hint: For the last property use that $f$ is convex.

(c) Show that $h(x) = f(x_0) + p \cdot (x - x_0)$ is a supporting hyperplane for $f$ at $x_0$ if and only if $\mu_{x_0}(v) \ge p \cdot v$ for all $v \in \mathbb{R}^k$.

We will use induction on the dimension $k$ to show that $f$ has a supporting hyperplane at all points. The case $k = 1$ was treated in the proof of Theorem 3.6. We assume the result holds in $k - 1$ dimensions. We want to prove it in dimension $k$. Thus we have $p' \in \mathbb{R}^{k-1}$ such that for all $v' \in \mathbb{R}^{k-1}$ we have $\mu_{x_0}(v') \ge p' \cdot v'$. Let $e$ be a unit vector in the $k$-th coordinate direction.

(d) Use this induction hypothesis and the second result from question (b) to show that for all $v', w' \in \mathbb{R}^{k-1}$
\[ p' \cdot v' - \mu_{x_0}(v' - e) \le -p' \cdot w' + \mu_{x_0}(w' + e). \]
Use this together with the first property in (b) to show that we can choose $p_k \in \mathbb{R}$ such that
\[ \sup_{v' \in \mathbb{R}^{k-1}} \bigl( p' \cdot v' - \mu_{x_0}(v' - se) \bigr) \le s p_k \le \inf_{w' \in \mathbb{R}^{k-1}} \bigl( -p' \cdot w' + \mu_{x_0}(w' + se) \bigr) \]
for all $s > 0$.

(e) Use the result of (d) to conclude that
\[ (p' + p_k e) \cdot (v' + v_k e) \le \mu_{x_0}(v' + v_k e) \]
for all $v' \in \mathbb{R}^{k-1}$ and all $v_k \in \mathbb{R}$. Conclude that $f(x_0) + p \cdot (x - x_0)$ is a supporting hyperplane for $f$ at $x_0$ if $p = p' + p_k e$.

(f) Use the construction of the supporting hyperplane in (e) to show that the hyperplane is unique if and only if $f$ has partial derivatives at $x_0$.

Exercise 3.22. Consider a single particle in a conservative force field whose corresponding potential $V(x)$ is rotationally invariant, i.e. it depends on $|x|$ only.

(a) Write down the Lagrange function in spherical coordinates. (You may use Exercise 2.18 for this purpose.)

(b) Determine the Hamilton function in spherical coordinates, find the cyclic coordinates and the corresponding conserved generalized momenta.

Exercise 3.23. Show that the function $f(x) = (|x| + 1)^2$ defined on $\mathbb{R}$ is convex and determine its Legendre transform.

Exercise 3.24. The entropy of an ideal gas of $N$ particles, total energy $U > 0$, and volume $V > 0$, is
\[ S(U, V) = N k_B \ln\left( C_0 (V/N) (U/N)^{3/2} \right), \]
where $k_B$ is Boltzmann's constant and $C_0 > 0$ is some constant. We will keep $N$ fixed in this problem.

(a) Determine the temperature $T(U)$ and pressure $P(U, V)$ such that
\[ T(U)\, dS = dU + P(U, V)\, dV. \]

(b) Determine the inverse function $U(S, V)$ of $U \mapsto S(U, V)$ and show that $U(S, V)$ is monotone increasing and convex as a function of $S$.

(c) Determine the free energy $F(T, V)$ of the ideal gas, i.e., $F(T, V) = -U^*(T, V)$ where $T \mapsto U^*(T, V)$ is the Legendre transform of $S \mapsto U(S, V)$.

Exercise 3.25. Consider 3 particles with masses $m_1, m_2, m_3 > 0$ at positions $q_1, q_2, q_3 \in \mathbb{R}^3$ in an inertial system and interacting through a conservative force given by the potential
\[ (q_1, q_2, q_3) \mapsto V(q_1 - q_2) + V(q_1 - q_3) + V(q_2 - q_3), \]
where $V : \mathbb{R}^3 \to \mathbb{R}$ is a $C^2$-function.

(a) Write down the Lagrangian function and show that $\psi : \mathbb{R}^9 \times \mathbb{R} \to \mathbb{R}^9$ given by
\[ \psi_s(q_1, q_2, q_3) = (q_1 + (s, 0, 0),\ q_2 + (s, 0, 0),\ q_3 + (s, 0, 0)) \]
for $s \in \mathbb{R}$ defines a continuous symmetry of the problem.

(b) Determine the integral of the motion given by Noether's Theorem.