
Classical Mechanics

Eric D'Hoker
Department of Physics and Astronomy, University of California, Los Angeles, CA 90095, USA

2 September 2012

Contents

1 Review of Newtonian Mechanics
  1.1 Some History
  1.2 Newton's laws
  1.3 Comments on Newton's laws
  1.4 Work
  1.5 Dissipative forces
  1.6 Conservative forces
  1.7 Velocity dependent conservative forces
  1.8 Charged particle in the presence of electro-magnetic fields
  1.9 Physical relevance of conservative forces

2 Lagrangian Formulation of Mechanics
  2.1 The Euler-Lagrange equations in general coordinates
  2.2 The action principle
  2.3 Variational calculus
  2.4 Euler-Lagrange equations from the action principle
  2.5 Equivalent Lagrangians
  2.6 Systems with constraints
  2.7 Holonomic versus non-holonomic constraints
    2.7.1 Definition
    2.7.2 Reducibility of certain velocity dependent constraints
  2.8 Lagrangian formulation for holonomic constraints
  2.9 Lagrangian formulation for some non-holonomic constraints
  2.10 Examples
  2.11 Symmetry transformations and conservation laws
  2.12 General symmetry transformations
  2.13 Noether's Theorem
  2.14 Examples of symmetries and conserved charges
    2.14.1 Translation in time
    2.14.2 Translation in a canonical variable

3 Quadratic Systems: Small Oscillations
  3.1 Equilibrium points and Mechanical Stability
  3.2 Small oscillations near a general solution
  3.3 Lagrange Points
  3.4 Stability near the non-colinear Lagrange points

4 Hamiltonian Formulation of Mechanics
  4.1 Canonical position and momentum variables
  4.2 Derivation of Hamilton's equations
  4.3 Some Examples of Hamiltonian formulation
  4.4 Variational Formulation of Hamilton's equations
  4.5 Poisson Brackets and symplectic structure
  4.6 Time evolution in terms of Poisson brackets
  4.7 Canonical transformations
  4.8 Symmetries and Noether's Theorem
  4.9 Poisson's Theorem
  4.10 Noether charge reproduces the symmetry transformation

5 Lie groups and Lie algebras
  5.1 Definition of a group
  5.2 Matrix multiplication groups
  5.3 Orthonormal frames and parametrization of SO(N)
  5.4 Three-dimensional rotations and Euler angles
  5.5 Definition of a Lie group
  5.6 Definition of a Lie algebra
  5.7 Relating Lie groups and Lie algebras
  5.8 Symmetries of the degenerate harmonic oscillator
  5.9 The relation between the Lie groups SU(2) and SO(3)

6 Motion of Rigid Bodies
  6.1 Inertial and body-fixed frames
  6.2 Kinetic energy of a rigid body
  6.3 Angular momentum of a rigid body
  6.4 Changing frames
  6.5 Euler-Lagrange equations for a freely rotating rigid body
  6.6 Relation to a problem of geodesics on SO(N)
  6.7 Solution for the maximally symmetric rigid body
  6.8 The three-dimensional rigid body in terms of Euler angles
  6.9 Euler equations
  6.10 Poinsot's solution to Euler's equations

7 Special Relativity
  7.1 Basic Postulates
  7.2 Lorentz vector and tensor notation
  7.3 General Lorentz vectors and tensors
    7.3.1 Contravariant tensors
    7.3.2 Covariant tensors
    7.3.3 Contraction and trace
  7.4 Relativistic invariance of the wave equation
  7.5 Relativistic invariance of Maxwell equations
    7.5.1 The gauge field, the electric current, and the field strength
    7.5.2 Maxwell's equations in Lorentz covariant form
  7.6 Relativistic kinematics
  7.7 Relativistic dynamics
  7.8 Lagrangian for a massive relativistic particle
  7.9 Particle collider versus fixed target experiments
  7.10 A physical application of time dilation

8 Manifolds
  8.1 The Poincaré Recurrence Theorem
  8.2 Definition of a manifold
  8.3 Examples
  8.4 Maps between manifolds
  8.5 Vector fields and tangent space
  8.6 Poisson brackets and Hamiltonian flows
  8.7 Stokes's theorem and grad-curl-div formulas
  8.8 Differential forms: informal definition
  8.9 Structure relations of the exterior differential algebra
  8.10 Integration and Stokes's Theorem on forms
  8.11 Frobenius Theorem for Pfaffian systems

9 Completely integrable systems
  9.1 Criteria for integrability, and action-angle variables
  9.2 Standard examples of completely integrable systems
  9.3 More sophisticated integrable systems
  9.4 Elliptic functions
  9.5 Solution by Lax pair
  9.6 The Korteweg-de Vries (or KdV) equation
  9.7 The KdV soliton
  9.8 Integrability of KdV by Lax pair

Bibliography

Course textbook
  Classical Dynamics (a contemporary perspective), J.V. José and E.J. Saletan, Cambridge University Press, 6th printing (2006).

Classics
  Mechanics, Course of Theoretical Physics, Vol 1, L.D. Landau and E.M. Lifshitz, Butterworth Heinemann, Third Edition (1998).
  Classical Mechanics, H. Goldstein, Addison Wesley (1980).

Further references
  The Variational Principles of Mechanics, Cornelius Lanczos, Dover, New York (1970).
  Analytical Mechanics, A. Fasano and S. Marmi, Oxford Graduate Texts.

More mathematically oriented treatments of Mechanics
  Mathematical Methods of Classical Mechanics, V.I. Arnold, Springer Verlag (1980).
  Foundations of Mechanics, R. Abraham and J.E. Marsden, Addison-Wesley (1987).
  Higher-dimensional Chaotic and Attractor Systems, V.G. Ivancevic and T.T. Ivancevic, Springer Verlag.
  Physics for Mathematicians, Mechanics I, Michael Spivak (2011).

Mathematics useful for physics
  Manifolds, Tensor Analysis, and Applications, R. Abraham, J.E. Marsden, T. Ratiu, Springer-Verlag (1988).
  Geometry, Topology and Physics, M. Nakahara, Institute of Physics Publishing (2005).
  The Geometry of Physics, An Introduction, Theodore Frankel, Cambridge (2004).

1 Review of Newtonian Mechanics

A basic assumption of classical mechanics is that the system under consideration can be understood in terms of a fixed number $N_p$ of point-like objects. Each such object is labeled by an integer $n = 1, \dots, N_p$, has a mass $m_n > 0$, and may be characterized by a position $\mathbf{x}_n(t)$. The positions constitute the dynamical degrees of freedom of the system. On these objects and/or between them, certain forces may act, such as those due to gravity and electro-magnetism. The goal of classical mechanics is to determine the time-evolution of the position $\mathbf{x}_n(t)$ due to the forces acting on body $n$, given a suitable set of initial conditions.

A few comments are in order.

The point-like nature of the objects described above is often the result of an approximation. For example, a planet may be described as a point-like object when studying its revolution around the sun. But its full volume and shape must be taken into account if we plan to send a satellite to its surface, and the planet can then no longer be approximated by a point-like object. In particular, the planet will rotate as an extended body does. This extended body may be understood in terms of smaller bodies which, in turn, may be treated as point-like. A point-like object is often referred to as a material point or a particle, even though its size may be that of a planet or a star.

In contrast with quantum mechanics, classical mechanics allows specification of both the position and the velocity for each of its particles.

In contrast with quantum field theory, classical mechanics assumes that the number of particles is fixed, with fixed masses.

In contrast with statistical mechanics, classical mechanics assumes that the positions and velocities of all particles can (in principle) be known to arbitrary accuracy.

1.1 Some History

Historically, one of the greatest difficulties that needed to be overcome was to observe and describe the motion of bodies in the absence of any forces. Friction on the ground and in the air could not easily be reduced with the tools available prior to the Renaissance. It is the motion of the planets which would produce the first reliable laws of mechanics. Based on the accurate astronomical observations which Tycho Brahe (1546-1601) made with the naked eye on the positions of various planets (especially Mars), Johannes Kepler (1571-1630) proposed his quantitative and precise mathematical laws of planetary motion.

Galileo Galilei (1564-1642) investigated the motion of bodies on Earth, how they fall, how they roll on inclined planes, and how they swing in a pendulum. He demonstrated with the help of such experiments that bodies with different masses fall to earth at the same rate (ignoring air friction), and deduced the correct (quadratic) mathematical relation between height and elapsed time during such falls. He may not have been the first one to derive such laws, but Galileo formulated the results in clear quantitative mathematical laws.

Galileo proposed that a body in uniform motion will stay so unless acted upon by a force, and he was probably the first to do so. Of course, some care is needed in stating this law precisely, as the appearance of uniform motion may change when the reference frame in which we make the observation is changed. In a so-called inertial frame, which we shall denote by $R$, the motion of a body on which no forces act is along a straight line, at constant speed and in a constant direction. A frame $R'$ which moves with constant velocity with respect to $R$ is then also inertial. But a frame $R''$ which accelerates with respect to $R$ is not inertial, as a particle in uniform motion now sweeps out a parabolic figure. Galileo stated, for the first time, that the laws of mechanics should be the same in different inertial frames, a property that is referred to as the principle of Galilean Relativity, and which we shall discuss later.

Isaac Newton (1642-1727) developed the mathematics of differential and integral calculus which is ultimately needed for the complete formulation of the laws of mechanics. These laws form the foundation of mechanics, and were laid out in his Philosophiae Naturalis Principia Mathematica, or simply the Principia, published in 1687. Remarkably, the mathematics used in the Principia is grounded in classical Greek geometry, supplemented with methods borrowed from infinitesimal calculus. Apparently, Newton believed that a formulation in terms of Greek geometry would enjoy more solid logical foundations than Descartes' analytic geometry and his own calculus.

1.2 Newton's laws

Newton's laws may be stated as follows,

1. Space is rigid, flat, and 3-dimensional, with distances measured with the Euclidean metric. Time is an absolute and universal parameter. Events are described by giving the position vector and the time $(\mathbf{x}, t)$. Events at any two points in space occur simultaneously if their time parameters are the same. It is assumed that in this spacetime setting, an inertial frame exists.

2. The laws of motion of a material point of mass $m$ and position $\mathbf{x}(t)$ are expressed in terms of the momentum of the material point, defined in terms of the mass, position and velocity by,

$$ \mathbf{p} = m \mathbf{v} \qquad\qquad \mathbf{v} = \dot{\mathbf{x}} = \frac{d\mathbf{x}}{dt} \tag{1.1} $$

Newton's second law may then be cast in the following form,

$$ \frac{d\mathbf{p}}{dt} = \mathbf{F}(\mathbf{x}) \tag{1.2} $$

where $\mathbf{F}$ is the force to which the material point with position $\mathbf{x}$ is subject. The second law holds in any inertial frame.

3. The law of action and reaction states that if a body $B$ exerts a force $\mathbf{F}_A$ on body $A$, then body $A$ exerts a force $\mathbf{F}_B = -\mathbf{F}_A$ on body $B$. In terms of momentum, the law states that the momentum transferred from $A$ to $B$ by the action of the force is opposite to the momentum transferred from $B$ to $A$.

4. The law of gravity between two bodies with masses $m_A$ and $m_B$, and positions $\mathbf{x}_A$ and $\mathbf{x}_B$ respectively, is given by

$$ \mathbf{F}_A = -\mathbf{F}_B = -G\, m_A m_B \frac{\mathbf{r}}{r^3} \qquad\qquad \mathbf{r} = \mathbf{x}_A - \mathbf{x}_B \tag{1.3} $$

where $r = |\mathbf{r}|$ and $G$ is Newton's universal constant of gravity. Here, $\mathbf{F}_A$ is the force exerted by body $B$ on body $A$, while $\mathbf{F}_B$ is the force exerted by $A$ on $B$.

1.3 Comments on Newton's laws

Some immediate comments on Newton's laws may be helpful.

The definition of momentum given in (1.1) holds in any frame, including non-inertial frames. The expression for momentum given in (1.1) will require modification, however, in relativistic mechanics, as we shall develop later in this course.

The mass parameter $m$ may depend on time. For example, the mass of a rocket will depend on time through the burning and expelling of its fuel. Therefore, the popular form of Newton's second law, $\mathbf{F} = m\mathbf{a}$ with the acceleration given by $\mathbf{a} = \dot{\mathbf{v}}$, holds only in the special case where the mass $m$ is constant.

A first result of the third law (item 3 above) is that the total momentum in an isolated system (on which no external forces act) is conserved during the motion of the system.

A second result of the same law is that total angular momentum, defined by,

$$ \mathbf{L} = \mathbf{r} \times \mathbf{p} \tag{1.4} $$

is also conserved for an isolated system.

The measured value of Newton's constant of gravity is,

$$ G = 6.67384(80) \times 10^{-11} \ \frac{\mathrm{m}^3}{\mathrm{kg}\ \mathrm{s}^2} \tag{1.5} $$
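The two conservation statements above can be checked numerically. The following is a minimal sketch that is not part of the original notes: it integrates two point masses interacting through the law of gravity (1.3) and compares the total momentum and the total angular momentum before and after. The masses, initial data, time step, the choice of units with $G = 1$, and the velocity Verlet integrator are all assumptions made for illustration.

```python
import math

G = 1.0  # assumption: units chosen so that G = 1 (not the SI value of (1.5))


def cross(u, v):
    """Cross product of two 3-vectors given as lists."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def force_on_a(xa, xb, ma, mb):
    """Force exerted by B on A, eq. (1.3): F_A = -G m_A m_B r / |r|^3, r = x_A - x_B."""
    r = [a - b for a, b in zip(xa, xb)]
    dist = math.sqrt(sum(c * c for c in r))
    return [-G * ma * mb * c / dist ** 3 for c in r]


def totals(ma, mb, xa, xb, va, vb):
    """Total momentum and total angular momentum (sum of x cross p) of both bodies."""
    pa = [ma * c for c in va]
    pb = [mb * c for c in vb]
    momentum = [a + b for a, b in zip(pa, pb)]
    angular = [a + b for a, b in zip(cross(xa, pa), cross(xb, pb))]
    return momentum, angular


def simulate(steps=2000, dt=1e-3):
    """Velocity Verlet integration; returns the totals before and after."""
    ma, mb = 3.0, 1.0                               # hypothetical masses
    xa, xb = [1.0, 0.0, 0.0], [-1.0, 0.5, 0.0]      # hypothetical initial positions
    va, vb = [0.0, 0.3, 0.1], [0.2, -0.6, 0.0]      # hypothetical initial velocities
    p0, l0 = totals(ma, mb, xa, xb, va, vb)
    fa = force_on_a(xa, xb, ma, mb)
    for _ in range(steps):
        # Half kick; the third law gives F_B = -F_A.
        va = [v + 0.5 * dt * f / ma for v, f in zip(va, fa)]
        vb = [v - 0.5 * dt * f / mb for v, f in zip(vb, fa)]
        # Drift.
        xa = [x + dt * v for x, v in zip(xa, va)]
        xb = [x + dt * v for x, v in zip(xb, vb)]
        # Second half kick with the force at the new positions.
        fa = force_on_a(xa, xb, ma, mb)
        va = [v + 0.5 * dt * f / ma for v, f in zip(va, fa)]
        vb = [v - 0.5 * dt * f / mb for v, f in zip(vb, fa)]
    p1, l1 = totals(ma, mb, xa, xb, va, vb)
    return p0, p1, l0, l1


if __name__ == "__main__":
    p0, p1, l0, l1 = simulate()
    print("momentum drift:", max(abs(a - b) for a, b in zip(p0, p1)))
    print("angular momentum drift:", max(abs(a - b) for a, b in zip(l0, l1)))
```

In this sketch both drifts should sit at the level of floating-point rounding, because each update changes the two momenta by equal and opposite amounts directed along $\mathbf{r}$.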

The force described by Newton's law of gravity acts instantaneously, and at a distance. Both properties will ultimately be negated, the first by special relativity, the second by field theory, such as general relativity.

1.4 Work

When a particle moves in the presence of a force, a certain amount of work is being done on the particle. The expression for the work $\delta W$ done by a force $\mathbf{F}$ under an infinitesimal displacement $d\mathbf{x}$ on a single particle is given by,

$$ \delta W = \mathbf{F} \cdot d\mathbf{x} \tag{1.6} $$

The work done by the force on the particle along a path $C_{12}$ between two points $\mathbf{x}(t_1)$ and $\mathbf{x}(t_2)$ on the trajectory of the particle is given by the following integral,

$$ W_{12} = \int_{C_{12}} \mathbf{F} \cdot d\mathbf{x} \tag{1.7} $$

If the mass of the particle is time-independent, and we have $\mathbf{F} = m\ddot{\mathbf{x}}$, then the integral for the work may be carried out using Newton's equations, and we find,

$$ W_{12} = \int_{C_{12}} m\, \dot{\mathbf{v}} \cdot \mathbf{v}\, dt = T_2 - T_1 \tag{1.8} $$

where $T$ is the kinetic energy,

$$ T = \frac{1}{2} m \mathbf{v}^2 \tag{1.9} $$

and $T_1$, $T_2$ are the kinetic energies corresponding to the two boundary points on the trajectory. Thus, the work done on the particle is reflected in the change of kinetic energy of the particle.

Work is additive. Thus, in a system with $N_p$ particles with positions $\mathbf{x}_n(t)$, subject to forces $\mathbf{F}_n$, the infinitesimal work is given by,

$$ \delta W = \sum_{n=1}^{N_p} \mathbf{F}_n \cdot d\mathbf{x}_n \tag{1.10} $$

For a system in which all masses are independent of time, the integral between two points in time, $t_1$ and $t_2$, may again be calculated using Newton's second law, and is given by,

$$ W_{12} = \sum_{n=1}^{N_p} \int_{C_{12}} \mathbf{F}_n \cdot d\mathbf{x}_n = T_2 - T_1 \tag{1.11} $$

where the total kinetic energy for the $N_p$ particles is given by,

$$ T = \sum_{n=1}^{N_p} \frac{1}{2} m_n \mathbf{v}_n^2 \tag{1.12} $$
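Relation (1.8) holds for any force, conservative or not, as long as the mass is constant. The sketch below, which is not taken from the notes, illustrates this for a single particle in two dimensions. The force law (a spring, a friction term, and a time-dependent drive), the parameter values, and the Runge-Kutta integrator are hypothetical choices. The work is accumulated as an extra state variable obeying $dW/dt = \mathbf{F} \cdot \mathbf{v}$, which is (1.6) divided by $dt$.

```python
import math


def force(x, v, t):
    """Hypothetical force: spring + friction + time-dependent drive (an assumption)."""
    k, kappa = 2.0, 0.3
    return [-k * x[0] - kappa * v[0] + math.cos(t),
            -k * x[1] - kappa * v[1]]


def deriv(state, t, m):
    """Time derivative of state = [x1, x2, v1, v2, W], with dW/dt = F . v."""
    x, v = state[0:2], state[2:4]
    f = force(x, v, t)
    return [v[0], v[1], f[0] / m, f[1] / m, f[0] * v[0] + f[1] * v[1]]


def rk4_step(state, t, dt, m):
    """One classical fourth-order Runge-Kutta step."""
    k1 = deriv(state, t, m)
    k2 = deriv([s + 0.5 * dt * k for s, k in zip(state, k1)], t + 0.5 * dt, m)
    k3 = deriv([s + 0.5 * dt * k for s, k in zip(state, k2)], t + 0.5 * dt, m)
    k4 = deriv([s + dt * k for s, k in zip(state, k3)], t + dt, m)
    return [s + dt * (a + 2 * b + 2 * c + d) / 6
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def kinetic(m, state):
    """Kinetic energy (1.9) of the particle."""
    return 0.5 * m * (state[2] ** 2 + state[3] ** 2)


def work_and_kinetic_change(m=1.5, t_end=5.0, steps=5000):
    """Return (W_12, T_2 - T_1) along the numerically integrated trajectory."""
    state = [1.0, 0.0, 0.0, 0.5, 0.0]   # x1, x2, v1, v2, accumulated work
    kinetic_start = kinetic(m, state)
    dt = t_end / steps
    for i in range(steps):
        state = rk4_step(state, i * dt, dt, m)
    return state[4], kinetic(m, state) - kinetic_start


if __name__ == "__main__":
    w12, delta_t = work_and_kinetic_change()
    print("W_12 =", w12, " T_2 - T_1 =", delta_t)
```

The two printed numbers should agree up to the integration error of the scheme, even though the force here includes friction and an explicit time dependence.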

1.5 Dissipative forces

For a general force $\mathbf{F}$ and given initial and final data, the result of performing the line integral (1.7) of the infinitesimal work (1.6) will depend upon the specific trajectory traversed. This occurs, for example, when the system is subject to friction and/or when dissipation occurs. A specific example for a single particle is given by the friction force law,

$$ \mathbf{F} = -\kappa \mathbf{v} \qquad\qquad \kappa > 0 \tag{1.13} $$

where $\kappa$ itself may be a function of $\mathbf{v}$. Evaluating the work done along a closed path, where the points $\mathbf{x}(t_1)$ and $\mathbf{x}(t_2)$ coincide, gives

$$ W_{12} = -\oint \kappa\, \mathbf{v} \cdot d\mathbf{x} = -\oint \kappa\, \mathbf{v}^2\, dt \tag{1.14} $$

The right-hand side is always negative, since the integrand $\kappa \mathbf{v}^2$ is manifestly positive. For a very small closed path, the work tends to zero, while for larger paths it will be a finite number since no cancellations can occur in the negative integral. As a result, the work done will depend on the path. The force of friction always takes energy away from the particle it acts on, a fact we know well from everyday experience.

1.6 Conservative forces

A force is conservative if the work done depends only on the initial and final data, but not on the specific trajectory followed. A sufficient condition to have a conservative force is easily obtained when $\mathbf{F}$ depends only on $\mathbf{x}$, and not on $t$ and $\dot{\mathbf{x}}$. Considering first the case of a single particle, and requiring that the work done on the particle vanish along all closed paths, we find using Stokes's theorem,

$$ 0 = \oint_C \mathbf{F}(\mathbf{x}) \cdot d\mathbf{x} = \int_D d^2\mathbf{s} \cdot \big( \nabla \times \mathbf{F}(\mathbf{x}) \big) \tag{1.15} $$

where $D$ is any two-dimensional domain whose boundary is $C$, and $d^2\mathbf{s}$ is the corresponding infinitesimal surface element. Vanishing of this quantity for all $D$ requires

$$ \nabla \times \mathbf{F}(\mathbf{x}) = 0 \tag{1.16} $$

Up to global issues, which need not concern us here, this means that the force $\mathbf{F}$ derives from a scalar potential $V$, which is defined, up to an arbitrary additive constant, by

$$ \mathbf{F}(\mathbf{x}) = -\nabla V(\mathbf{x}) \tag{1.17} $$
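Path independence can be made concrete with a small numerical experiment. The sketch below is an illustration that is not part of the notes: it evaluates the line integral (1.7) along two different paths joining the same endpoints, once for a force of the form (1.17) and once for a force whose curl does not vanish. The potential $V(x, y) = x^2 y + y$, the rotational force $(-y, x)$, the two paths, and the midpoint quadrature are all hypothetical choices.

```python
import math


def potential(x, y):
    """Hypothetical potential V(x, y) = x^2 y + y."""
    return x * x * y + y


def conservative_force(x, y):
    """F = -grad V for the potential above, as in (1.17)."""
    return (-2.0 * x * y, -(x * x + 1.0))


def rotational_force(x, y):
    """Hypothetical force F = (-y, x), whose curl has z-component 2, violating (1.16)."""
    return (-y, x)


def straight_path(s):
    """Straight line from (1, 0) to (0, 1), with parameter s in [0, 1]."""
    return (1.0 - s, s)


def arc_path(s):
    """Quarter of the unit circle from (1, 0) to (0, 1), with parameter s in [0, 1]."""
    angle = 0.5 * math.pi * s
    return (math.cos(angle), math.sin(angle))


def work(force, path, n=4000):
    """Midpoint-rule approximation of the line integral (1.7) of force along path."""
    total = 0.0
    for k in range(n):
        x0, y0 = path(k / n)
        x1, y1 = path((k + 1) / n)
        xm, ym = path((k + 0.5) / n)
        fx, fy = force(xm, ym)
        total += fx * (x1 - x0) + fy * (y1 - y0)
    return total


if __name__ == "__main__":
    for name, f in (("conservative", conservative_force), ("rotational", rotational_force)):
        print(name, work(f, straight_path), work(f, arc_path))
    print("V_1 - V_2 =", potential(1.0, 0.0) - potential(0.0, 1.0))
```

For the gradient force the two paths should give the same work, equal to the difference of the potential between the initial and final points. For the rotational force the two results differ, so no potential can exist.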

In terms of this potential, the work may be evaluated explicitly, and we find,

$$ W_{12} = \int_{C_{12}} \mathbf{F} \cdot d\mathbf{x} = -\int_{C_{12}} d\mathbf{x} \cdot \nabla V(\mathbf{x}) = V_1 - V_2 \tag{1.18} $$

Relation (1.8) between work and kinetic energy may be reinterpreted in terms of the total energy of the system, $T + V$, and reduces to the conservation thereof,

$$ T_1 + V_1 = T_2 + V_2 \tag{1.19} $$

whence the name of conservative force.

The case of multiple particles may be treated along the same lines. We assume again that the forces depend only on positions $\mathbf{x}_n(t)$, but not on velocities. From considering just one particle at a time, and varying its trajectory, it is immediate that the force $\mathbf{F}_n$ on each particle $n$ must be the gradient of a scalar potential, $\mathbf{F}_n = -\nabla_{\mathbf{x}_n} V^{(n)}$. Simultaneously varying trajectories of different particles yields the stronger result, however, that all these potentials $V^{(n)}$ are equal. To see this, it will be useful to introduce a slightly different notation, and define coordinates $y_i$ and components of force $f_i$ as follows,

$$ \begin{aligned} x_n^1 &= y_{3n-2} & \qquad F_n^1 &= f_{3n-2} \\ x_n^2 &= y_{3n-1} & \qquad F_n^2 &= f_{3n-1} \\ x_n^3 &= y_{3n} & \qquad F_n^3 &= f_{3n} \end{aligned} \tag{1.20} $$

where $\mathbf{x}_n = (x_n^1, x_n^2, x_n^3)$, $\mathbf{F}_n = (F_n^1, F_n^2, F_n^3)$, with $n = 1, 2, \dots, N_p$ throughout. It will be convenient throughout to work directly in terms of the number $N$ of dynamical degrees of freedom given by,

$$ N = 3 N_p \tag{1.21} $$

Vanishing of the work integral along any closed path may be recast in the following form,

$$ 0 = \sum_{n=1}^{N_p} \oint_{C_{12}} \mathbf{F}_n \cdot d\mathbf{x}_n = \sum_{i=1}^{N} \oint_{C_{12}} dy_i\, f_i \tag{1.22} $$

We now use the higher-dimensional generalization of Stokes's Theorem to recast this line integral in terms of an integral over a two-dimensional domain $D$ whose boundary is $C_{12}$, but this time in the $N$-dimensional space of all $y_i$,

$$ \sum_{i=1}^{N} \oint_{C_{12}} dy_i\, f_i = \sum_{i,j=1}^{N} \int_D d^2 y_{ij} \left( \frac{\partial f_i}{\partial y_j} - \frac{\partial f_j}{\partial y_i} \right) \tag{1.23} $$

where $d^2 y_{ij}$ are the area elements associated with directions $i, j$. Since $C_{12}$ and thus $D$ is arbitrary, it follows that

$$ \frac{\partial f_i}{\partial y_j} - \frac{\partial f_j}{\partial y_i} = 0 \tag{1.24} $$

for all $i, j = 1, 2, \dots, N$. Again ignoring global issues, this equation is solved generally in terms of a single potential. Recasting the result in terms of the original coordinates and forces gives,

$$ \mathbf{F}_n = -\nabla_{\mathbf{x}_n} V \tag{1.25} $$

where $V$ is a function of all the $\mathbf{x}_n$. The work relation (1.19) still holds, but $T$ is now the total kinetic energy of (1.12) and $V$ is the total potential energy derived above.

1.7 Velocity dependent conservative forces

The notion of conservative force naturally generalizes to a certain class of velocity dependent forces. Consider, for example, the case of the Lorentz force acting on a particle with charge $e$ due to a magnetic field $\mathbf{B}$,

$$ \mathbf{F} = e\, \mathbf{v} \times \mathbf{B} \tag{1.26} $$

The work done by this force is given by,

$$ W_{12} = e \int_{C_{12}} d\mathbf{x} \cdot (\mathbf{v} \times \mathbf{B}) = e \int_{C_{12}} dt\ \mathbf{v} \cdot (\mathbf{v} \times \mathbf{B}) = 0 \tag{1.27} $$

and vanishes for any particle trajectory. Thus the magnetic part of the Lorentz force, though velocity dependent, is in fact conservative. Therefore, it is appropriate to generalize the notion of conservative force to include the Lorentz force case, as well as others like it. This goal may be achieved by introducing a potential $U$, which generalizes the potential $V$ above in that it may now depend on both $\mathbf{x}$ and $\dot{\mathbf{x}}$. For the sake of simplicity of exposition, we begin with the case of a single particle. Let us consider the total differential of $U(\mathbf{x}, \dot{\mathbf{x}})$,

$$ dU = d\mathbf{x} \cdot \nabla_{\mathbf{x}} U + d\dot{\mathbf{x}} \cdot \nabla_{\dot{\mathbf{x}}} U \tag{1.28} $$

Here $\nabla_{\mathbf{x}} U$ denotes the vector with Cartesian components $\left( \frac{\partial U}{\partial x^1}, \frac{\partial U}{\partial x^2}, \frac{\partial U}{\partial x^3} \right)$, and $\nabla_{\dot{\mathbf{x}}} U$ the vector with components $\left( \frac{\partial U}{\partial v^1}, \frac{\partial U}{\partial v^2}, \frac{\partial U}{\partial v^3} \right)$, with $\mathbf{x} = (x^1, x^2, x^3)$ and $\dot{\mathbf{x}} = \mathbf{v} = (v^1, v^2, v^3)$. Integrating both sides of (1.28) from point 1 to point 2 on the particle trajectory, we obtain,

$$ U_2 - U_1 = \int_{C_{12}} d\mathbf{x} \cdot \nabla_{\mathbf{x}} U + \int_{C_{12}} d\dot{\mathbf{x}} \cdot \nabla_{\dot{\mathbf{x}}} U \tag{1.29} $$

Here, $U_1$ and $U_2$ are the values of the potential at the initial and final points on the trajectory. The first integral on the right hand side is analogous to the structure of equation (1.18). The second integral may be cast in a similar form by using the following relation,

$$ d\dot{\mathbf{x}} = \frac{d}{dt}\, d\mathbf{x} \tag{1.30} $$

To prove this relation, it suffices to exploit an elementary definition of the time derivative,

$$ d\dot{\mathbf{x}}(t) = d \lim_{\varepsilon \to 0} \frac{\mathbf{x}(t+\varepsilon) - \mathbf{x}(t)}{\varepsilon} = \lim_{\varepsilon \to 0} \frac{d\mathbf{x}(t+\varepsilon) - d\mathbf{x}(t)}{\varepsilon} = \frac{d}{dt}\, d\mathbf{x}(t) \tag{1.31} $$

The second integral may now be integrated by parts,

$$ \int_{C_{12}} d\dot{\mathbf{x}} \cdot \nabla_{\dot{\mathbf{x}}} U = \dot{\mathbf{x}} \cdot \nabla_{\dot{\mathbf{x}}} U \,\Big|_1^2 - \int_{C_{12}} d\mathbf{x} \cdot \frac{d}{dt} \left( \nabla_{\dot{\mathbf{x}}} U \right) \tag{1.32} $$

Putting all together, we see that (1.29) may be recast in the following form,

$$ \left( U - \dot{\mathbf{x}} \cdot \nabla_{\dot{\mathbf{x}}} U \right) \Big|_1^2 = \int_{C_{12}} d\mathbf{x} \cdot \left( \nabla_{\mathbf{x}} U - \frac{d}{dt} \nabla_{\dot{\mathbf{x}}} U \right) \tag{1.33} $$

This means that any force of the form,

$$ \mathbf{F} = -\nabla_{\mathbf{x}} U + \frac{d}{dt} \nabla_{\dot{\mathbf{x}}} U \tag{1.34} $$

is conservative in the sense defined above: by (1.33), the work it does depends only on the data at the initial and final points, and is independent of the trajectory followed in between. The generalization to the case of multiple particles proceeds along the same lines as for velocity independent conservative forces and is straightforward. The result is that a set of velocity-dependent forces $\mathbf{F}_n$ on an assembly of $N_p$ particles is conservative provided there exists a single function $U$ which depends on both positions and velocities, such that

$$ \mathbf{F}_n = -\nabla_{\mathbf{x}_n} U + \frac{d}{dt} \nabla_{\dot{\mathbf{x}}_n} U \tag{1.35} $$

The corresponding conserved quantity may be read off from equations (1.8), (1.33), and (1.35), and is given by,

$$ T + U - \sum_{n=1}^{N_p} \dot{\mathbf{x}}_n \cdot \nabla_{\dot{\mathbf{x}}_n} U \tag{1.36} $$

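This quantity may look a bit strange at first, until one realizes that the kinetic energy, being quadratic in the velocities, may alternatively be expressed as

$$ T = \sum_{n=1}^{N_p} \dot{\mathbf{x}}_n \cdot \nabla_{\dot{\mathbf{x}}_n} T - T \tag{1.37} $$

Inserting this identity into (1.36) we see that all its terms may be combined into a function of a single quantity $L = T - U$, which is nothing but the Lagrangian, in terms of which the conserved quantity is given by,

$$ \sum_{n=1}^{N_p} \dot{\mathbf{x}}_n \cdot \nabla_{\dot{\mathbf{x}}_n} L - L \tag{1.38} $$

which we recognize as the standard relation giving the Hamiltonian. Note that $U$ is allowed to be an arbitrary velocity-dependent potential, so the above derivation and results go far beyond the more familiar form $L = T - V$ where $V$ is velocity-independent.

1.8 Charged particle in the presence of electro-magnetic fields

The Lorentz force acting on a particle with charge $e$ in the presence of electro-magnetic fields $\mathbf{E}$ and $\mathbf{B}$ is given by,

$$ \mathbf{F} = e \left( \mathbf{E} + \mathbf{v} \times \mathbf{B} \right) \tag{1.39} $$

We shall treat the general case where $\mathbf{E}$ and $\mathbf{B}$ may depend on both space and time. This force is velocity dependent. To show that it is conservative in the generalized sense discussed above, we recast the fields in terms of the electric potential $\Phi$ and the vector potential $\mathbf{A}$,

$$ \mathbf{E} = -\nabla_{\mathbf{x}} \Phi - \frac{\partial \mathbf{A}}{\partial t} \qquad\qquad \mathbf{B} = \nabla_{\mathbf{x}} \times \mathbf{A} \tag{1.40} $$

This amounts to solving half of the set of all Maxwell equations, namely the two homogeneous ones. Using the identity,

$$ \mathbf{v} \times ( \nabla_{\mathbf{x}} \times \mathbf{A} ) = \nabla_{\mathbf{x}} ( \mathbf{v} \cdot \mathbf{A} ) - ( \mathbf{v} \cdot \nabla_{\mathbf{x}} ) \mathbf{A} \tag{1.41} $$

together with $d\mathbf{A}/dt = \partial \mathbf{A}/\partial t + (\mathbf{v} \cdot \nabla_{\mathbf{x}}) \mathbf{A}$ along the trajectory, the force may be recast as follows,

$$ \mathbf{F} = -\nabla_{\mathbf{x}} \left( e\Phi - e\, \mathbf{v} \cdot \mathbf{A} \right) + \frac{d}{dt} \left( -e \mathbf{A} \right) \tag{1.42} $$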

Introducing the following velocity dependent potential,

$$ U = e\Phi - e\, \mathbf{v} \cdot \mathbf{A} \tag{1.43} $$

and using the fact that $\nabla_{\dot{\mathbf{x}}} U = -e\mathbf{A}$, we see that the Lorentz force is indeed conservative in the generalized sense. The corresponding Lagrangian is given by,

$$ L = \frac{1}{2} m \mathbf{v}^2 - e\Phi + e\, \mathbf{v} \cdot \mathbf{A} \tag{1.44} $$

The total energy of the charged particle, given by (1.38), will be conserved provided the electro-magnetic fields have no explicit time dependence.

1.9 Physical relevance of conservative forces

It would seem that in any genuine physical system, there must always be some degree of dissipation, or friction, rendering the system non-conservative. When dissipation occurs, the system actually transfers energy (and momentum) to physical degrees of freedom that have not been included in the description of the system. For example, the friction produced by a body in motion in a room full of air will transfer energy from the moving body to the air. If we were to include also the dynamics of the air molecules in the description of the system, then the totality of the forces on the body and on the air will be conservative.

To summarize, it is a fundamental tenet of modern physics that, if all relevant degrees of freedom are included in the description of a system, then all forces will ultimately be conservative. Conservative systems may be described by Lagrangians, as was shown above (at least for the special case when no constraints occur). Indeed, the four fundamental forces of Nature (gravity, electro-magnetism, and the weak and strong forces) are all formulated in terms of Lagrangians, and thus effectively correspond to conservative forces.

Of course, friction and dissipation remain very useful effective phenomena. In particular, the whole set of phenomena associated with self-organization of matter, including life itself, is best understood in terms of systems subject to a high degree of dissipation. It is this dissipation of heat (mostly) that allows our bodies to self-organize, and to dissipate entropy along with heat.
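The conservation statement of Section 1.8 can be illustrated numerically. The sketch below is not part of the notes: it integrates the Lorentz force (1.39) for static, uniform fields and evaluates the quantity (1.38) from the Lagrangian (1.44) at the start and at the end. The field strengths, charge, mass, initial data, the gauge choice $\Phi = -E_0 x$, $\mathbf{A} = \frac{1}{2} \mathbf{B} \times \mathbf{x}$, and the Runge-Kutta integrator are all assumptions made for illustration.

```python
E0, B0 = 0.2, 1.0          # hypothetical uniform fields: E along x, B along z
CHARGE, MASS = 1.0, 1.0    # hypothetical charge e and mass m


def cross(u, v):
    """Cross product of two 3-vectors given as lists."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def deriv(state):
    """Time derivative of state = [x, y, z, vx, vy, vz] under the Lorentz force (1.39)."""
    v = state[3:6]
    v_cross_b = cross(v, [0.0, 0.0, B0])
    e_field = [E0, 0.0, 0.0]
    accel = [CHARGE * (e_field[i] + v_cross_b[i]) / MASS for i in range(3)]
    return list(v) + accel


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = deriv(state)
    k2 = deriv([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = deriv([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = deriv([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def energy_from_lagrangian(state):
    """Conserved quantity (1.38), v . grad_v L - L, with L taken from (1.44)."""
    x, v = state[0:3], state[3:6]
    phi = -E0 * x[0]                                  # E = -grad Phi
    a = [-0.5 * B0 * x[1], 0.5 * B0 * x[0], 0.0]      # B = curl A
    v_sq = sum(c * c for c in v)
    v_dot_a = sum(vi * ai for vi, ai in zip(v, a))
    lagrangian = 0.5 * MASS * v_sq - CHARGE * phi + CHARGE * v_dot_a
    grad_v_l = [MASS * vi + CHARGE * ai for vi, ai in zip(v, a)]
    return sum(vi * pi for vi, pi in zip(v, grad_v_l)) - lagrangian


def simulate(t_end=10.0, steps=10000):
    """Return the conserved quantity at the start and end, plus the final state."""
    state = [0.0, 0.0, 0.0, 0.3, 0.1, 0.05]
    h_start = energy_from_lagrangian(state)
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, dt)
    return h_start, energy_from_lagrangian(state), state


if __name__ == "__main__":
    h_start, h_end, _ = simulate()
    print("change in conserved quantity:", h_end - h_start)
```

In the evaluation of (1.38) the terms involving $\mathbf{v} \cdot \mathbf{A}$ cancel, leaving $\frac{1}{2} m \mathbf{v}^2 + e\Phi$, so the magnetic field does not contribute to the conserved energy, in line with (1.27).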

2 Lagrangian Formulation of Mechanics

Newton's equations are expressed in an inertial frame, parametrized by Cartesian coordinates. Lagrangian mechanics provides a reformulation of Newtonian mechanics in terms of arbitrary coordinates, which is particularly convenient for generalized conservative forces, and which naturally allows for the inclusion of certain constraints to which the system may be subject. Equally important will be the fact that Lagrangian mechanics may be derived from a variational principle, and that the Lagrangian formulation allows for a systematic investigation into the symmetries and conservation laws of the system. Finally, the Lagrangian formulation of classical mechanics provides the logical starting point for the functional integral formulation of quantum mechanics.

Joseph Louis Lagrange (1736-1813) was born in Turin, in Piedmont, now part of Italy. He worked first at the Prussian Academy of Sciences, and was subsequently appointed the first professor of analysis at the Ecole Polytechnique in Paris, which had been founded in 1794, five years after the start of the French Revolution. In 1807, Napoleon elevated Lagrange to the nobility title of Count. Besides the work in mechanics which bears his name, Lagrange developed the variational calculus, and obtained fundamental results in number theory and group theory, thereby laying the ground for the work of Galois in algebra.

In this section, we shall derive the Euler-Lagrange equations from Newton's equations for systems with generalized conservative forces, in terms of arbitrary coordinates, and including certain constraints. We shall show that the Euler-Lagrange equations result from the variational principle applied to the action functional, investigate symmetries and conservation laws, and derive Noether's theorem.

2.1 The Euler-Lagrange equations in general coordinates

Newton's equations for a system of $N_p$ particles, subject to generalized conservative forces, are formulated in an inertial frame, parametrized by Cartesian coordinates associated with the position vectors $\mathbf{x}_n(t)$, and take the form,

$$ \frac{d\mathbf{p}_n}{dt} = -\nabla_{\mathbf{x}_n} U + \frac{d}{dt} \left( \nabla_{\dot{\mathbf{x}}_n} U \right) \tag{2.1} $$

The momentum $\mathbf{p}_n$ may be expressed in terms of the total kinetic energy $T$, as follows,

$$ \mathbf{p}_n = \nabla_{\dot{\mathbf{x}}_n} T \qquad\qquad T = \sum_{n=1}^{N_p} \frac{1}{2} m_n \dot{\mathbf{x}}_n^2 \tag{2.2} $$

Using furthermore the fact that $\nabla_{\mathbf{x}_n} T = 0$, it becomes clear that equations (2.1) may be recast in terms of a single function $L$, referred to as the Lagrangian, which is defined by,

$$ L \equiv T - U \tag{2.3} $$

in terms of which equations (2.1) become the famous Euler-Lagrange equations,

$$ \frac{d}{dt} \left( \nabla_{\dot{\mathbf{x}}_n} L \right) - \nabla_{\mathbf{x}_n} L = 0 \tag{2.4} $$

These equations were derived in an inertial frame, and in Cartesian coordinates. The remarkable and extremely useful property of the Euler-Lagrange equations is that they actually take the same form in an arbitrary coordinate system. To see this, it will be convenient to make use of the slightly different notation for Cartesian coordinates, introduced already in (1.20), namely for all $n = 1, 2, \dots, N_p$ we set,

$$ x_n^1 = y_{3n-2} \qquad\qquad x_n^2 = y_{3n-1} \qquad\qquad x_n^3 = y_{3n} \tag{2.5} $$

In terms of these coordinates, the Euler-Lagrange equations take the form,

$$ \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{y}_i} \right) - \frac{\partial L}{\partial y_i} = 0 \tag{2.6} $$

for $i = 1, 2, \dots, N$. We shall again use the notation $N = 3N_p$ for the total number of degrees of freedom of the system.

Next, we change variables, from the Cartesian coordinates $y_i$ to a set of arbitrary coordinates $q_i$. This may be done by expressing the coordinates $y_1, \dots, y_N$ as functions of the coordinates $q_1, \dots, q_N$,

$$ y_i = y_i(q_1, \dots, q_N) \qquad\qquad i = 1, \dots, N \tag{2.7} $$

We want this change of coordinates to be faithful. Mathematically, we want this to be a diffeomorphism, namely differentiable and with differentiable inverse. In particular, the Jacobian matrix $\partial y_i / \partial q_j$ must be invertible at all points. Denoting the Lagrangian in Cartesian coordinates by $L^{(y)}$, we shall denote the Lagrangian in the system of arbitrary coordinates $q_i$ now by $L$, and define it by the following relation,

$$ L(q, \dot{q}) \equiv L^{(y)}(y, \dot{y}) \tag{2.8} $$

where we shall use the following shorthand throughout,

$$ L(q, \dot{q}) = L(q_1, \dots, q_N;\ \dot{q}_1, \dots, \dot{q}_N) \tag{2.9} $$

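To compare the Euler-Lagrange equations in both coordinate systems, we begin by computing the following variations,

$$ \begin{aligned} \delta L^{(y)}(y, \dot{y}) &= \sum_{i=1}^{N} \left( \frac{\partial L^{(y)}}{\partial y_i}\, \delta y_i + \frac{\partial L^{(y)}}{\partial \dot{y}_i}\, \delta \dot{y}_i \right) \\ \delta L(q, \dot{q}) &= \sum_{j=1}^{N} \left( \frac{\partial L}{\partial q_j}\, \delta q_j + \frac{\partial L}{\partial \dot{q}_j}\, \delta \dot{q}_j \right) \end{aligned} \tag{2.10} $$

Now a variation $\delta q_j$ produces a variation in each $y_i$ in view of the relations (2.7) which, together with their time-derivatives, are calculated as follows,

$$ \begin{aligned} \delta y_i &= \sum_{j=1}^{N} \frac{\partial y_i}{\partial q_j}\, \delta q_j \\ \delta \dot{y}_i &= \sum_{j=1}^{N} \left( \frac{\partial y_i}{\partial q_j}\, \delta \dot{q}_j + \frac{d}{dt} \left( \frac{\partial y_i}{\partial q_j} \right) \delta q_j \right) \end{aligned} \tag{2.11} $$

Under these variations, the left sides of equations (2.10) coincide in view of (2.8), and so the right sides must also be equal to one another. Identifying the coefficients of $\delta q_j$ and $\delta \dot{q}_j$ throughout gives the following relations,

$$ \begin{aligned} \frac{\partial L}{\partial \dot{q}_j} &= \sum_{i=1}^{N} \frac{\partial y_i}{\partial q_j} \frac{\partial L^{(y)}}{\partial \dot{y}_i} \\ \frac{\partial L}{\partial q_j} &= \sum_{i=1}^{N} \left( \frac{\partial y_i}{\partial q_j} \frac{\partial L^{(y)}}{\partial y_i} + \frac{d}{dt} \left( \frac{\partial y_i}{\partial q_j} \right) \frac{\partial L^{(y)}}{\partial \dot{y}_i} \right) \end{aligned} \tag{2.12} $$

The Euler-Lagrange equations in terms of coordinates $y_i$ and $q_i$ are then related as follows,

$$ \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = \sum_{i=1}^{N} \frac{\partial y_i}{\partial q_j} \left( \frac{d}{dt} \frac{\partial L^{(y)}}{\partial \dot{y}_i} - \frac{\partial L^{(y)}}{\partial y_i} \right) \tag{2.13} $$

If the Euler-Lagrange equations of (2.6) are satisfied in the inertial frame parametrized by Cartesian coordinates $y_i$, then the Euler-Lagrange equations for the Lagrangian defined by (2.8) in arbitrary coordinates $q_i$ will be satisfied as follows,

$$ \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = 0 \tag{2.14} $$

and vice-versa, since the Jacobian matrix $\partial y_i / \partial q_j$ in (2.13) is invertible. The Euler-Lagrange equations take the same form in any coordinate system.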
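The coordinate independence of the Euler-Lagrange equations can be checked on a simple case. The sketch below is an illustration that is not part of the notes: a particle in the plane with the hypothetical potential $V = \frac{1}{2} K (x^2 + y^2) + C x$ is evolved once with the Cartesian equations and once with the Euler-Lagrange equations obtained from the same Lagrangian written in polar coordinates $(r, \theta)$, namely $L = \frac{1}{2} M (\dot{r}^2 + r^2 \dot{\theta}^2) - \frac{1}{2} K r^2 - C r \cos\theta$. The parameter values, initial data, and the Runge-Kutta integrator are assumptions.

```python
import math

M, K, C = 1.0, 1.0, 0.5   # hypothetical mass and potential parameters


def rk4(deriv, state, dt, steps):
    """Integrate d(state)/dt = deriv(state) with the classical Runge-Kutta scheme."""
    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv([s + 0.5 * dt * k for s, k in zip(state, k1)])
        k3 = deriv([s + 0.5 * dt * k for s, k in zip(state, k2)])
        k4 = deriv([s + dt * k for s, k in zip(state, k3)])
        state = [s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state


def cartesian_deriv(state):
    """Euler-Lagrange equations in Cartesian coordinates: M xddot = -dV/dx, etc."""
    x, y, vx, vy = state
    return [vx, vy, -(K * x + C) / M, -K * y / M]


def polar_deriv(state):
    """Euler-Lagrange equations for the same Lagrangian in q = (r, theta)."""
    r, theta, vr, vtheta = state
    dv_dr = K * r + C * math.cos(theta)
    dv_dtheta = -C * r * math.sin(theta)
    # d/dt (M rdot) = M r thetadot^2 - dV/dr
    acc_r = r * vtheta ** 2 - dv_dr / M
    # d/dt (M r^2 thetadot) = -dV/dtheta
    acc_theta = (-dv_dtheta / M - 2.0 * r * vr * vtheta) / r ** 2
    return [vr, vtheta, acc_r, acc_theta]


def compare(t_end=3.0, steps=3000):
    """Evolve matching initial data in both coordinate systems; return final (x, y) pairs."""
    dt = t_end / steps
    # (x, y) = (1, 0) with velocity (0, 1) corresponds to r = 1, theta = 0, rdot = 0, thetadot = 1.
    cart = rk4(cartesian_deriv, [1.0, 0.0, 0.0, 1.0], dt, steps)
    pol = rk4(polar_deriv, [1.0, 0.0, 0.0, 1.0], dt, steps)
    from_polar = (pol[0] * math.cos(pol[1]), pol[0] * math.sin(pol[1]))
    return (cart[0], cart[1]), from_polar


if __name__ == "__main__":
    print(compare())
```

The initial data were chosen so that the trajectory stays away from the origin, where the polar coordinates fail to be a diffeomorphism and the Jacobian is no longer invertible.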

2.2 The action principle

Consider a mechanical system described by generalized coordinates $q_i(t)$, with $i = 1, \ldots, N$, and a Lagrangian $L$ which depends on these generalized positions, and associated generalized velocities $\dot q_i(t)$, and possibly also explicitly depends on time $t$,

$$L(q, \dot q, t) = L(q_1, \ldots, q_N; \dot q_1, \ldots, \dot q_N; t) \tag{2.15}$$

One defines the action of the system by the following integral,

$$S[q] \equiv \int_{t_1}^{t_2} dt \, L(q_1, \ldots, q_N; \dot q_1, \ldots, \dot q_N; t) \tag{2.16}$$

The action is not a function in the usual sense. Its value depends on the functions $q_i(t)$ with $t$ running through the entire interval $t \in [t_1, t_2]$. Therefore, $S[q]$ is referred to as a functional instead of a function, and this is why a new notation is used, with square brackets, to remind the reader of that distinction. A commonly used alternative way of expressing this is by stating that $S$ depends on the path that $q_i(t)$ sweeps out in the $N$-dimensional space (or manifold) as $t$ runs from $t_1$ to $t_2$.

The Action Principle goes back to Pierre Louis Maupertuis (1698-1759) and Leonhard Euler (1707-1783), and was formulated in its present form by Sir William Rowan Hamilton (1805-1865). Euler was a child prodigy and became the leading mathematician of the 18th century. Lagrange was Euler's student. Euler's work in mathematics and physics covers topics from the creation of the variational calculus and graph theory to the resolution of practical problems in engineering and cartography. Euler initiated the use of our standard notations for functions $f(x)$, for the trigonometric functions, and introduced the Euler $\Gamma(z)$ and $B(x, y)$ functions. He was incredibly prolific, at times producing a paper a week. Condorcet's eulogy included the famous quote "He ceased to calculate and to live."

The action principle applies to all mechanical systems governed by conservative forces, for which Newton's equations are equivalent to the Euler-Lagrange equations.
Its statement is that the solutions to the Euler-Lagrange equations are precisely the extrema or stationary points of the action functional $S[q]$. Before we present a mathematical derivation of the action principle, it is appropriate to make some comments. First, by extremum we mean that in the space of all possible paths $q_i(t)$, which is a huge space of infinite dimension, a trajectory satisfying the Euler-Lagrange equations can locally be a maximum, minimum, or saddle-point of the action. There is no need for the extremum to be a global maximum or minimum. Second, the action principle is closely related to the minimum-path-length approach to geometrical optics, and in fact Hamilton appears to have been particularly pleased by this similarity.
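The stationarity of the action on a solution can be illustrated numerically. The following is a minimal sketch under stated assumptions: the system (a particle in uniform gravity, $L = \frac{1}{2} m \dot q^2 - m g q$), the end points $q(0) = q(T) = 0$, the deformation profile $\sin(\pi t / T)$, and the Simpson quadrature are all choices made for this example and are not part of the notes. The action is evaluated on the deformed path $q(t) + \varepsilon \sin(\pi t/T)$ and its first and second derivatives in $\varepsilon$ are estimated.

```python
import math

# Illustrative parameters (assumptions of this sketch).
m, g, T = 1.0, 9.81, 1.0

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def action(eps):
    """S[q + eps*s] for L = m/2 q'^2 - m g q, with q(0) = q(T) = 0 and s = sin(pi t/T)."""
    def lagrangian(t):
        # q(t) = g t (T - t) / 2 solves m q'' = -m g with the fixed end points.
        q = 0.5 * g * t * (T - t) + eps * math.sin(math.pi * t / T)
        qdot = 0.5 * g * (T - 2 * t) + eps * (math.pi / T) * math.cos(math.pi * t / T)
        return 0.5 * m * qdot**2 - m * g * q
    return simpson(lagrangian, 0.0, T)

eps = 1e-3
# First-order change of S under the deformation: vanishes on a solution.
first_order = (action(eps) - action(-eps)) / (2 * eps)
# Second-order change: positive here, so this particular path is a local minimum.
second_order = (action(eps) + action(-eps) - 2 * action(0.0)) / eps**2

print("dS/d(eps) at 0     :", first_order)
print("d^2S/d(eps)^2 at 0 :", second_order)
```

For this quadratic Lagrangian the second derivative equals $m \pi^2 / (2T)$; in general the sign of the second variation depends on the system and on the path, in line with the remark that a solution may be a minimum, maximum, or saddle point.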

2.3 Variational calculus

Standard differential calculus on ordinary functions was extended to the variational calculus on functionals by Euler, Lagrange, Hamilton and others. To get familiar with the corresponding techniques, we start with a warm-up for which we just have a single degree of freedom, $q(t)$, so that $N = 1$. The action is then a functional of a single function $q(t)$,

$$S[q] = \int_{t_1}^{t_2} dt \, L(q, \dot q, t) \tag{2.17}$$

We wish to compare the value taken by the action $S[q]$ on a path $q(t)$ with the value taken on a different path $q'(t)$, keeping the end points fixed. In fact, we are really only interested in a path $q'(t)$ that differs infinitesimally from the path $q(t)$, and we shall denote this infinitesimal deviation by $\delta q(t)$. It may be convenient to think of the infinitesimal variation $\delta q$ as parametrized by the size of the variation $\varepsilon$ together with a fixed finite (non-infinitesimal) function $s(t)$, so that we have,

$$\delta q(t) = \varepsilon \, s(t) \tag{2.18}$$

Variations are considered here to linear order in $\varepsilon$. The deformed paths are given as follows,

$$q'(t) = q(t) + \varepsilon \, s(t) + O(\varepsilon^2) \qquad \dot q'(t) = \dot q(t) + \varepsilon \, \dot s(t) + O(\varepsilon^2) \tag{2.19}$$

Keeping the end-points fixed amounts to the requirement,

$$s(t_1) = s(t_2) = 0 \tag{2.20}$$

We are now ready to compare the values taken by the Lagrangian on these two neighboring paths at each point in time $t$,

$$\begin{aligned} \delta L(q, \dot q, t) &= L(q', \dot q', t) - L(q, \dot q, t) \\ &= L(q + \delta q, \dot q + \delta \dot q, t) - L(q, \dot q, t) \\ &= \frac{\partial L}{\partial q}\, \delta q + \frac{\partial L}{\partial \dot q}\, \delta \dot q \\ &= \varepsilon \left( \frac{\partial L}{\partial q}\, s + \frac{\partial L}{\partial \dot q}\, \dot s \right) + O(\varepsilon^2) \end{aligned} \tag{2.21}$$

Thus, the variation of the action may be expressed as follows,

$$\delta S[q] = S[q'] - S[q] = \varepsilon \int_{t_1}^{t_2} dt \left( \frac{\partial L}{\partial q}\, s + \frac{\partial L}{\partial \dot q}\, \dot s \right) + O(\varepsilon^2) \tag{2.22}$$

Figure 1: The path of a solution to the Euler-Lagrange equations (red), and an infinitesimal variation thereof (blue), both taking the same value at the boundary points.

Integrating the second term in the integral by parts, we see that the boundary terms involving $s(t)$ cancel in view of (2.20), so that we are left with,

$$\delta S[q] = \varepsilon \int_{t_1}^{t_2} dt \left( \frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot q} \right) s(t) + O(\varepsilon^2) \tag{2.23}$$

The action is stationary on a path $q(t)$ provided the first order variations in $q(t)$ leave the value of the action unchanged to linear order. Thus, a path $q(t)$ will be extremal or stationary provided that for all functions $s(t)$ in the interval $t \in [t_1, t_2]$, and obeying (2.20) on the boundary of the interval, we have,

$$\int_{t_1}^{t_2} dt \left( \frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot q} \right) s(t) = 0 \tag{2.24}$$

This implies that the integrand must vanish,

$$\frac{d}{dt} \frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} = 0 \tag{2.25}$$

which you recognize as the Euler-Lagrange equation for a single degree of freedom.

One may give an alternative, but equivalent, formulation of the variational problem. Let $q'(t)$ be a deformation of $q(t)$, as specified by (2.19) by the parameter $\varepsilon$ and the arbitrary function $s(t)$. We define the functional derivative of the action $S[q]$, denoted as $\delta S[q] / \delta q(t)$, by the following relation,

$$\left. \frac{\partial S[q']}{\partial \varepsilon} \right|_{\varepsilon = 0} \equiv \int dt \, s(t) \, \frac{\delta S[q]}{\delta q(t)} \tag{2.26}$$

Equation (2.22) allows us to compute this quantity immediately in terms of the Lagrangian, and we find,

$$\frac{\delta S[q]}{\delta q(t)} = \frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot q} \tag{2.27}$$

Paths for which the action is extremal are now analogous to points at which an ordinary function is extremal: the first derivative vanishes, but the derivative is functional.

Variational problems are ubiquitous in mathematics and physics. For example, given a metric on a space, the curves of extremal length are the geodesics, whose shapes are determined by a variational problem. The distance functions on the Euclidean plane, on the round sphere with unit radius, and on the hyperbolic half plane are given as functionals of the path. In standard coordinates, they are given respectively by,

$$D_P(1, 2) = \int_1^2 dt \, \sqrt{\dot x^2 + \dot y^2} \qquad D_S(1, 2) = \int_1^2 dt \, \sqrt{\dot\theta^2 + (\sin\theta)^2 \, \dot\phi^2} \qquad D_H(1, 2) = \int_1^2 dt \, \frac{\sqrt{\dot x^2 + \dot y^2}}{y} \quad (y > 0) \tag{2.28}$$

It is readily shown that the geodesics are respectively straight lines, great circles, and half circles centered on the $y = 0$ axis.

2.4 Euler-Lagrange equations from the action principle

The generalization to the case of multiple degrees of freedom is straightforward. We consider arbitrary variations $\delta q_i(t)$ with fixed end points, so that,

$$\delta q_i(t_1) = \delta q_i(t_2) = 0 \tag{2.29}$$

for all $i = 1, \ldots, N$. The variation of the action of (2.16) is then obtained as in (2.22),

$$\delta S[q] = \int_{t_1}^{t_2} dt \sum_{i=1}^N \left( \frac{\partial L}{\partial q_i}\, \delta q_i + \frac{\partial L}{\partial \dot q_i}\, \delta \dot q_i \right) \tag{2.30}$$

Integrating the second term in the integral by parts, and using the cancellation of the boundary contributions, we find the formula generalizing (2.23), namely,

$$\delta S[q] = \int_{t_1}^{t_2} dt \sum_{i=1}^N \left( \frac{\partial L}{\partial q_i} - \frac{d}{dt} \frac{\partial L}{\partial \dot q_i} \right) \delta q_i \tag{2.31}$$

Applying now the action principle, we see that setting $\delta S[q] = 0$ for all variations $\delta q(t)$ satisfying (2.29) requires that the Euler-Lagrange equations,

$$\frac{d}{dt} \frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = 0 \tag{2.32}$$

be satisfied.

2.5 Equivalent Lagrangians

Two Lagrangians, $L(q; \dot q; t)$ and $L'(q; \dot q; t)$, are equivalent to one another if they differ by a total time derivative of a function which is local in $t$,

$$L'(q(t); \dot q(t); t) = L(q(t); \dot q(t); t) + \frac{d}{dt} \Lambda(q(t); t) \tag{2.33}$$

Note that this equivalence must hold for all configurations $q(t)$, not just the trajectories satisfying the Euler-Lagrange equations. The Euler-Lagrange equations for equivalent Lagrangians are the same. The proof is straightforward, since it suffices to show that the Euler-Lagrange equations for the difference Lagrangian $L' - L$ vanish identically. It is, of course, also clear that the corresponding actions $S[q]$ and $S'[q]$ are the same up to boundary terms so that, by the variational principle, the Euler-Lagrange equations must coincide.

2.6 Systems with constraints

It often happens that the motion of a particle or a body is subject to one or several constraints. A first example of a system with a constraint is illustrated in figure 2 by a particle moving inside a cup which mathematically is a surface $\Sigma$. The motion of a particle with trajectory $\mathbf{x}(t)$ is subject to an external force $\mathbf{F}$ (think of the force of gravity) which pulls the particle downwards. But the impenetrable wall of the cup produces a contact force $\mathbf{f}$ which keeps the particle precisely on the surface of the cup. (This regime of the system is valid for sufficiently small velocities; if the particle is a bullet arriving at large velocity, it will cut through the cup.) While the force $\mathbf{F}$ is known here, the contact force $\mathbf{f}$ is not known. But what is known is that the particle stays on the surface $\Sigma$ at all times. Such a system is referred to as constrained. A second example of a system with a constraint is illustrated in figure 3 by a solid body rolling on a surface $\Sigma$.
The solid body has more degrees of freedom than the particle, since in addition to specifying its position (say the center of mass of the solid body), we also need to specify three angles giving its relative orientation. The position together with the

orientation must be such that the solid body touches the surface $\Sigma$ at all times, but there are further conditions on the velocity as well. If the motion is without slipping and sliding, the velocity of the solid body must equal the velocity of the surface $\Sigma$ at the point of contact.

Figure 2: Particle constrained to move on a surface $\Sigma$. The trajectory $\mathbf{x}(t)$ is indicated in red, the external force $\mathbf{F}$ in green, and the contact force $\mathbf{f}$ in blue.

Figure 3: Solid body in blue constrained to move on a surface $\Sigma$.

The fact that the contact forces $\mathbf{f}$ are not known complicates the study of constrained systems. We can make progress, however, by studying the different components of the contact force. To this end, we decompose the contact force $\mathbf{f}$, at the point on $\Sigma$ where the contact occurs, into its component $\mathbf{f}_\perp$ perpendicular to the surface $\Sigma$, and its component $\mathbf{f}_\parallel$ which is tangent to the surface,

$$\mathbf{f} = \mathbf{f}_\perp + \mathbf{f}_\parallel \tag{2.34}$$

Now if the body is constrained to move along the surface $\Sigma$, then the motion of its contact point is parallel to $\Sigma$. Thus, the force $\mathbf{f}_\perp$ is perpendicular to the motion of the body, and as

a result does zero work,

$$\delta W = d\mathbf{x} \cdot \mathbf{f}_\perp = 0 \tag{2.35}$$

Following our earlier definitions, we see that the component $\mathbf{f}_\perp$ of the contact force is conservative. The component $\mathbf{f}_\parallel$ is parallel to $\Sigma$ and, in general, will produce work on the moving body, and thus $\mathbf{f}_\parallel$ will not be conservative. We shall mostly be interested in conservative forces, and will now extend the Lagrangian formulation to include the case where the contact forces are all conservative. A schematic illustration of the difference between conservative and non-conservative contact forces is given in figure 4.

Figure 4: Schematic representation of a solid body (here a circle) constrained to move along a surface $\Sigma$ (here a horizontal line). Figures (a) and (b) respectively represent the cases without and with friction.

2.7 Holonomic versus non-holonomic constraints

For systems with conservative external forces and conservative contact forces, the equations of motion may be obtained in the Lagrangian formulation, at least for certain special classes of constraints. One general form of a constraint may be expressed as the vanishing of a function $\phi_\alpha$ which may depend on position, velocity, and possibly even explicitly on time. In keeping with the formulation of Lagrangian mechanics in generalized coordinates $q(t)$, we express constraints formulated as equalities in terms of generalized coordinates as well,

$$\phi_\alpha(q; \dot q; t) = 0 \qquad \alpha = 1, \ldots, A \tag{2.36}$$

where we continue to use the abbreviation $\phi_\alpha(q; \dot q; t) = \phi_\alpha(q_1, \ldots, q_N; \dot q_1, \ldots, \dot q_N; t)$ familiar from earlier discussions of Lagrangian mechanics. The role of the index $\alpha$ is to label the $A$

different functionally independent constraints to which the system is subject. There may also be constraints expressed in the form of inequalities. Different types of constraints may have to be treated by different methods. We begin by discussing some of these differences, through the use of various specific examples.

Example 1 A particle is constrained to move on the surface of a sphere, in the presence of external forces, such as a gravitational field. The constraint to motion on a sphere is the result, for example, of the particle being suspended by an inelastic rod or string to form a pendulum. The dynamical degrees of freedom may be chosen to be the Cartesian position of the particle $\mathbf{x}(t)$, subject to the external gravitational force $\mathbf{F} = m\mathbf{g}$, where $\mathbf{g}$ is the gravitational acceleration vector. The system is subject to a single constraint $\phi = 0$ with,

$$\phi(\mathbf{x}; \dot{\mathbf{x}}; t) = (\mathbf{x}(t) - \mathbf{x}_0)^2 - R^2 \tag{2.37}$$

where $\mathbf{x}_0$ is the center of the sphere, and $R$ is its radius.

Example 2 This example is somewhat more complicated, and is illustrated in figure 5. We consider a wheel attached to an axle (in red) which in turn has its other end point attached to a fixed point. The wheel is free to roll, without slipping and sliding, on a plane.

Figure 5: Wheel attached to an axle, and free to roll on a plane.

The degrees of freedom introduced in figure 5 consist of the point of contact on the plane described by Cartesian coordinates $(x, y)$, the angle $\theta$ giving the position of the axle, and

the angle $\varphi$ giving the rotational position of the wheel. The absence of slipping and sliding requires matching the components of the velocity of the contact points; the resulting constraints $\phi_1 = \phi_2 = 0$ are given by,

$$\phi_1 = \dot x - R \dot\varphi \sin\theta \qquad \phi_2 = \dot y + R \dot\varphi \cos\theta \tag{2.38}$$

Under the assumptions, any motion of the system is subject to the above two constraints.

Example 3 Finally, a last example is that of a particle allowed to move under the influence of external forces, but constrained to remain above a surface or above a plane. Such constraints are expressed through inequalities,

$$\psi(q, \dot q, t) > 0 \tag{2.39}$$

Often, such constraints can be approximated by subjecting the system to a potential energy which vanishes in the allowed regions, but tends to $\infty$ in the forbidden regions.

2.7.1 Definition

A constraint is holonomic provided it is given by an equality on a set of functions which may depend on positions $q$ and explicitly on time $t$, but not on the velocities $\dot q$. Thus, any set of holonomic constraints is given by a set of equations,

$$\phi_\alpha(q_1, \ldots, q_N; t) = 0 \qquad \alpha = 1, \ldots, A < N \tag{2.40}$$

All other constraints are non-holonomic, including the constraints by inequalities.

2.7.2 Reducibility of certain velocity dependent constraints

An important caveat, which is very useful to know about, is that a constraint given by an equality $\phi(q, \dot q, t) = 0$ where $\phi$ is linear in $\dot q$, may actually be holonomic, even though it exhibits $\dot q$-dependence. The velocity-dependent constraint $\phi(q, \dot q, t) = 0$ is then reducible to a holonomic one. This is the case when we have,

$$\phi(q, \dot q; t) = \frac{d}{dt} \psi(q; t) \tag{2.41}$$

where $\psi(q; t)$ depends on positions $q$ but not on velocities $\dot q$. The original constraint $\phi(q, \dot q; t) = 0$ may be replaced with the holonomic constraint $\psi(q; t) = \psi_0$ where $\psi_0$ is an arbitrary constant. Equivalently, writing out the constraint in terms of its linear dependence on velocities,

$$\phi(q, \dot q; t) = \sum_{i=1}^N a_i \, \dot q_i \tag{2.42}$$

the coefficients $a_i$ may depend on $q$ but not on $\dot q$. The constraint is then holonomic provided the set of coefficients is a gradient,

$$a_i = \frac{\partial \psi}{\partial q_i} \tag{2.43}$$

Alternatively, the differential form $a = \sum_{i=1}^N a_i \, dq_i$ must be closed for holonomic constraints, while for non-holonomic constraints, it will not be closed.

In view of the above definition and caveat, let us reconsider the examples given earlier. The constraint of example 1 is holonomic. The constraint of example 3 is non-holonomic. The constraint of example 2 is non-holonomic if the angles $\theta, \varphi$ are both dynamical variables, since the constraints in (2.38) cannot be transformed into holonomic ones. On the other hand, suppose we kept the angle $\theta$ fixed and removed the axle. The wheel is then allowed to roll in a straight line on the plane. In that case, integrating (2.38) at fixed $\theta$ shows that the constraints are equivalent to holonomic ones,

$$\psi_1 = x - R \varphi \sin\theta \qquad \psi_2 = y + R \varphi \cos\theta \tag{2.44}$$

where the values of $\psi_1$ and $\psi_2$ are allowed to be any real constants.

2.8 Lagrangian formulation for holonomic constraints

Consider a system with coordinates $q_i(t)$, for $i = 1, \ldots, N$, Lagrangian $L(q; \dot q; t)$, and subject to a set of $A$ holonomic constraints $\phi_\alpha(q; t) = 0$ for $\alpha = 1, \ldots, A < N$. The action principle still instructs us to extremize the action, but we must do so now subject to the constraints. This can be done through the use of extra dynamical degrees of freedom, referred to as Lagrange multipliers $\lambda_\alpha(t)$, again with $\alpha = 1, \ldots, A$, namely precisely as many as there are holonomic constraints. The $\lambda_\alpha$ are independent of the variables $q_i$, and may be considered more or less on the same footing with them. Instead of extremizing the action $S[q]$ introduced earlier with respect to $q$, we extremize a new action, given by,

$$S[q; \lambda] = \int_{t_1}^{t_2} dt \left( L(q; \dot q; t) + \sum_{\alpha=1}^A \lambda_\alpha(t) \, \phi_\alpha(q; t) \right) \tag{2.45}$$

with respect to both $q_i$ and $\lambda_\alpha$ for the ranges $i = 1, \ldots, N$ and $\alpha = 1, \ldots, A$. The role of the variables $\lambda_\alpha$ is as follows.
Extremizing $S[q; \lambda]$ with respect to $\lambda_\alpha$, keeping the independent variables $q_i$ fixed, we recover precisely all the holonomic constraints, while extremizing with

respect to $q_i$, keeping the independent variable $\lambda_\alpha$ fixed, gives the Euler-Lagrange equations,

$$0 = \phi_\alpha(q; t) \qquad 0 = \frac{d}{dt} \frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} - \sum_{\alpha=1}^A \lambda_\alpha \frac{\partial \phi_\alpha}{\partial q_i} \tag{2.46}$$

A few comments are in order. First, we see that the constraints themselves result from a variational principle, which is very convenient. Second, note that the counting of equations works out: we have $N + A$ independent variables $q_i$ and $\lambda_\alpha$, and we have $A$ constraint equations, and $N$ Euler-Lagrange equations. Third, notice that the $\lambda_\alpha$ never enter with time derivatives, so that they are non-dynamical variables, whose sole role is to provide the compensating force needed to keep the system satisfying the constraint.

Finally, and perhaps the most important remark is that, in principle, holonomic constraints can always be eliminated. This can be seen directly from the Lagrangian formulation. Since we are free to choose any set of generalized coordinates, we choose to make all the constraint functions into new coordinates,

$$\begin{aligned} q'_i(q; t) &= q_i(t) & i &= 1, \ldots, N - A \\ q'_i(q; t) &= \phi_\alpha(q; t) & \alpha &= 1, \ldots, A, \quad i = \alpha + N - A \end{aligned} \tag{2.47}$$

The way this works is that given the new coordinates of the first line, the coordinates $q_i$ with $i = \alpha + N - A$ then vary with $q'_i$ so as to satisfy (2.47). The Lagrangian in the new coordinates $q'$ is related to the original Lagrangian by,

$$L(q; \dot q; t) = L'(q'; \dot q'; t) \tag{2.48}$$

But now we see that in this new coordinate system, the equations of motion for $\lambda_\alpha$ simply set the corresponding coordinates $q'_i(t) = 0$ for $i = N - A + 1, \ldots, N$. The Euler-Lagrange equations also simplify. This may be seen by treating the cases $i \leq N - A$ and $i > N - A$ separately. For $i \leq N - A$, we have $\partial \phi_\alpha / \partial q'_i = 0$, since $\phi_\alpha$ and $q'_i$ are independent variables. Thus the corresponding Euler-Lagrange equations are,

$$\frac{d}{dt} \frac{\partial L'}{\partial \dot q'_i} - \frac{\partial L'}{\partial q'_i} = 0 \qquad i = 1, \ldots, N - A \tag{2.49}$$

For $i > N - A$, we have instead,

$$\frac{\partial \phi_\alpha}{\partial q'_i} = \delta_{\alpha + N - A, \, i} \tag{2.50}$$

so that the Euler-Lagrange equations become,

$$\frac{d}{dt} \frac{\partial L'}{\partial \dot q'_i} - \frac{\partial L'}{\partial q'_i} = \lambda_{i - N + A} \qquad i = N - A + 1, \ldots, N \tag{2.51}$$

We see that the role of this last set of $A$ equations is only to yield the values of $\lambda_\alpha$. Since these variables were auxiliary throughout and not of direct physical relevance, this last set of equations may be ignored altogether.

2.9 Lagrangian formulation for some non-holonomic constraints

Non-holonomic constraints are more delicate. There appears to be no systematic treatment available for the general case, so we shall present here a well-known and well-understood important special case, when the constraint functions are all linear in the velocities. The proof will be postponed until later in this course. Consider a Lagrangian $L(q, \dot q; t)$ subject to a set of non-holonomic constraints $\phi_\alpha = 0$ with,

$$\phi_\alpha(q; \dot q; t) = \sum_{i=1}^N C_\alpha^i(q; t) \, \dot q_i \tag{2.52}$$

The Euler-Lagrange equations are then given by,

$$0 = \frac{d}{dt} \frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} - \sum_{\alpha=1}^A \lambda_\alpha C_\alpha^i \tag{2.53}$$

A special case is when the constraint is reducible to a holonomic one, so that

$$C_\alpha^i = \frac{\partial \psi_\alpha}{\partial q_i} \tag{2.54}$$

When this is the case, the Euler-Lagrange equations of (2.53) reduce to the ones for holonomic constraints derived in (2.46).

2.10 Examples

The options of expressing the equations of mechanics in any coordinate system, and of solving (some, or all of) the holonomic constraints prove to be tremendous assets of the Lagrangian formulation. We shall provide two illustrations here in terms of some standard mechanical problems (see for example Landau and Lifshitz).

The first example is the system of a double pendulum, with two massive particles of masses $m_1$ and $m_2$ suspended by weightless stiff rods of lengths $l_1$ and $l_2$ respectively, as illustrated in figure 6 (a). The motion is restricted to the vertical plane, where the angles $\varphi$ and $\psi$ completely parametrize the positions of both particles. The whole is moving in the presence of the standard uniform gravitational field with acceleration $g$.

Figure 6: Motion of a double pendulum in the plane in (a), and of two coupled double pendulums subject to rotation around its axis by angle $\psi$ in (b).

To obtain the equations of motion, we first obtain the Lagrangian. We use the fact that this system is subject only to holonomic constraints, which may be solved for completely in terms of the generalized coordinates $\varphi$ and $\psi$, so that the Cartesian coordinates for the positions of the masses may be obtained as follows,

$$\begin{aligned} x_1 &= + l_1 \sin\varphi & x_2 &= l_1 \sin\varphi + l_2 \sin\psi \\ z_1 &= - l_1 \cos\varphi & z_2 &= - l_1 \cos\varphi - l_2 \cos\psi \end{aligned} \tag{2.55}$$

The kinetic energy is given by the sum of the kinetic energies for the two particles,

$$\begin{aligned} T &= \frac{1}{2} m_1 (\dot x_1^2 + \dot z_1^2) + \frac{1}{2} m_2 (\dot x_2^2 + \dot z_2^2) \\ &= \frac{1}{2} (m_1 + m_2) l_1^2 \dot\varphi^2 + \frac{1}{2} m_2 l_2^2 \dot\psi^2 + m_2 l_1 l_2 \, \dot\varphi \dot\psi \cos(\varphi - \psi) \end{aligned} \tag{2.56}$$

where the cross term arises from $2 \cdot \frac{1}{2} m_2 \, l_1 l_2 \dot\varphi \dot\psi (\cos\varphi \cos\psi + \sin\varphi \sin\psi)$, while the potential energy is given by,

$$V = -(m_1 + m_2) g l_1 \cos\varphi - m_2 g l_2 \cos\psi \tag{2.57}$$

The Lagrangian is given by $L = T - V$.
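The reduction of the double-pendulum kinetic energy to generalized coordinates is easy to get wrong by a factor in the cross term, so a quick numerical cross-check can be useful. The sketch below is only an illustration; the function names and sample values are assumptions of this example. It computes $T$ once from the Cartesian velocities obtained by differentiating the parametrization $x_1 = l_1 \sin\varphi$, $z_1 = -l_1 \cos\varphi$, $x_2 = l_1 \sin\varphi + l_2 \sin\psi$, $z_2 = -l_1 \cos\varphi - l_2 \cos\psi$, and once from the closed form with cross term $m_2 l_1 l_2 \dot\varphi \dot\psi \cos(\varphi - \psi)$.

```python
import math

def kinetic_cartesian(m1, m2, l1, l2, phi, psi, dphi, dpsi):
    """T from Cartesian velocities, using the chain rule on the parametrization."""
    vx1 = l1 * math.cos(phi) * dphi
    vz1 = l1 * math.sin(phi) * dphi
    vx2 = vx1 + l2 * math.cos(psi) * dpsi
    vz2 = vz1 + l2 * math.sin(psi) * dpsi
    return 0.5 * m1 * (vx1**2 + vz1**2) + 0.5 * m2 * (vx2**2 + vz2**2)

def kinetic_generalized(m1, m2, l1, l2, phi, psi, dphi, dpsi):
    """Closed form of T in the generalized coordinates (phi, psi)."""
    return (0.5 * (m1 + m2) * l1**2 * dphi**2
            + 0.5 * m2 * l2**2 * dpsi**2
            + m2 * l1 * l2 * dphi * dpsi * math.cos(phi - psi))

# Arbitrary sample configurations (illustrative values only):
# (m1, m2, l1, l2, phi, psi, dphi, dpsi)
samples = [
    (1.0, 2.0, 0.7, 1.3, 0.4, -0.9, 1.5, -0.6),
    (3.0, 0.5, 1.1, 0.4, 2.0, 0.3, -0.8, 2.2),
]
for s in samples:
    print(kinetic_cartesian(*s), kinetic_generalized(*s))
```

The two columns agree to rounding error for any choice of angles and angular velocities, since the identity used is $\cos\varphi \cos\psi + \sin\varphi \sin\psi = \cos(\varphi - \psi)$.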

In the second example, illustrated in figure 6 (b), we have two double pendulums, with segments of equal length $l$, coupled by a common mass $m_2$ which is free to slide, without friction, on the vertical axis. The whole is allowed to rotate in three dimensions around the vertical axis, with positions described by the angle of aperture $\varphi$ and the angle $\psi$ of rotation around the vertical axis. The positions of the masses $m_1$ in Cartesian coordinates are,

$$x_1 = \pm l \sin\varphi \cos\psi \qquad y_1 = \pm l \sin\varphi \sin\psi \qquad z_1 = - l \cos\varphi \tag{2.58}$$

where the correlated $\pm$ signs distinguish the two particles with mass $m_1$. The position of the particle with mass $m_2$ is given by,

$$(x_2, y_2, z_2) = (0, 0, -2 l \cos\varphi) \tag{2.59}$$

The kinetic energies of both masses $m_1$ coincide, and the total kinetic energy takes the form,

$$T = m_1 (\dot x_1^2 + \dot y_1^2 + \dot z_1^2) + \frac{1}{2} m_2 \dot z_2^2 = m_1 l^2 (\dot\varphi^2 + \dot\psi^2 \sin^2\varphi) + 2 m_2 l^2 \dot\varphi^2 \sin^2\varphi \tag{2.60}$$

while the potential energy is given by,

$$V = -2 (m_1 + m_2) g l \cos\varphi \tag{2.61}$$

The Lagrangian is given by $L = T - V$.

2.11 Symmetry transformations and conservation laws

The concept of symmetry plays a fundamental and dominant role in modern physics. Over the past 60 years, it has become clear that the structure of elementary particles can be organized in terms of symmetry principles. Closely related, symmetry principles play a key role in quantum field theory, in Landau's theory of second order phase transitions, and in Wilson's unification of both non-perturbative quantum field theory and phase behavior. Symmetry plays an essential role in various more modern discoveries such as topological quantum computation and topological insulators, supergravity, and superstring theory.

Here, we shall discuss the concept and the practical use of symmetry in the Lagrangian formulation (and later also in Hamiltonian formulation) of mechanics. Recall that Newton's

equations, say for a system of $N$ particles interacting via a two-body force, are invariant under Galilean transformations,

$$\mathbf{x} \to \mathbf{x}' = R(\mathbf{x}) + \mathbf{v} t + \mathbf{x}_0 \qquad t \to t' = t + t_0 \tag{2.62}$$

where $\mathbf{x}_0$ and $t_0$ produce translations in space and in time, while $\mathbf{v}$ produces a boost, and $R$ a rotation. Transformations under which the equations are invariant are referred to as symmetries. It is well-known that continuous symmetries are intimately related with conserved quantities. Time translation invariance implies conservation of total energy, while space translation invariance implies conservation of total momentum, and rotation symmetry implies conservation of angular momentum. We shall see later what happens to boosts.

We shall now proceed to define symmetries for general systems, expressed in terms of a set of arbitrary generalized coordinates $q(t)$. For the simplest cases, the derivation may be carried out without appeal to any general theory. Consider first the case of time translation invariance. We begin by calculating the time derivative of $L$,

$$\frac{dL}{dt} = \frac{\partial L}{\partial t} + \sum_{i=1}^N \left( \frac{\partial L}{\partial q_i}\, \dot q_i + \frac{\partial L}{\partial \dot q_i}\, \ddot q_i \right) \tag{2.63}$$

Using the Euler-Lagrange equations, we eliminate $\partial L / \partial q_i$, and obtain,

$$\frac{dH}{dt} = - \frac{\partial L}{\partial t} \tag{2.64}$$

where the total energy (i.e. the Hamiltonian) $H$ is defined by,

$$H = \sum_{i=1}^N \frac{\partial L}{\partial \dot q_i}\, \dot q_i - L \tag{2.65}$$

The value $E$ of the total energy function $H$ will be conserved during the time-evolution of the system, provided $q(t)$ obeys the Euler-Lagrange equations, and the Lagrangian $L$ has no explicit time-dependence. When this is the case, $L$ is invariant under time translations, and the associated function $H$ is referred to as a conserved quantity, or as a first integral of motion. Since $H$ is time-independent, its value $E$ may be fixed by initial conditions.

Note that any system of a single dynamical variable $q(t)$, governed by a Lagrangian $L(q, \dot q)$ which has no explicit time dependence, may be integrated by quadrature. Since energy is conserved for this system, and may be fixed by initial conditions, we have

$$E = \frac{\partial L}{\partial \dot q}\, \dot q - L \tag{2.66}$$

which is a function of $q$ and $\dot q$ only. Solve this equation for $\dot q$ as a function of $q$, and denote the resulting expression by

$$\dot q = v(q, E) \tag{2.67}$$

Then the complete solution is given by

$$t - t_0 = \int_{q_0}^{q} \frac{dq'}{v(q', E)} \tag{2.68}$$

where $q_0$ is the value of the position at time $t_0$, and $q$ is the position at time $t$. For example, if $L = m \dot q^2 / 2 - V(q)$, then the integral becomes,

$$t - t_0 = \int_{q_0}^{q} \frac{dq'}{\sqrt{2 (E - V(q')) / m}} \tag{2.69}$$

2.12 General symmetry transformations

More generally, a symmetry is defined to be a transformation on the generalized coordinates,

$$q_i(t) \to q'_i(q_i; t) \tag{2.70}$$

which leaves the Lagrangian invariant, up to equivalence,

$$L(q'; \dot q'; t) = L(q; \dot q; t) + \frac{d\Lambda}{dt} \tag{2.71}$$

for some $\Lambda$ which is a local function of $q_i(t)$, $t$, and possibly also of $\dot q_i$. Under composition of maps, symmetries form a group.

An alternative, but equivalent, way of looking at symmetries is as follows. One of the fundamental results of Lagrangian mechanics is that the Euler-Lagrange equations take the same form in all coordinate systems, provided the Lagrangian is mapped as follows,

$$L'(q'; \dot q'; t) = L(q; \dot q; t) \tag{2.72}$$

To every change of coordinates $q \to q'$, there is a corresponding new Lagrangian $L'$. A symmetry is such that the Lagrangian $L'$ coincides with $L$, up to equivalence.

A transformation may be discrete, such as parity $q'_i = -q_i$, or continuous, such as translations and rotations. Continuous symmetries lead to conservation laws, while discrete symmetries do not. Thus, we shall focus on continuous transformations and symmetries. By definition, a continuous symmetry is parametrized by a continuous dependence on a set of

real parameters $\varepsilon_\alpha$. For example, in space translations, we would have 3 real parameters corresponding to the coordinates of the translation, while in rotations, we could have three Euler angles. We shall concentrate on a transformation generated by a single real parameter $\varepsilon$. Thus, we consider coordinates which depend on $\varepsilon$,

$$q'_i(t) = q_i(t, \varepsilon) \tag{2.73}$$

For a rotation in the $q_1, q_2$-plane for example, we have,

$$\begin{aligned} q'_1(t, \varepsilon) &= q_1(t) \cos\varepsilon - q_2(t) \sin\varepsilon \\ q'_2(t, \varepsilon) &= q_1(t) \sin\varepsilon + q_2(t) \cos\varepsilon \end{aligned} \tag{2.74}$$

The study of continuous symmetries is greatly simplified by the fact that we can study them first infinitesimally, and then integrate the result to finite transformations later on, if needed. Thus, the finite rotations of (2.74) may be considered infinitesimally,

$$\begin{aligned} q'_1(t, \varepsilon) &= q_1(t) - \varepsilon \, q_2(t) + O(\varepsilon^2) \\ q'_2(t, \varepsilon) &= q_2(t) + \varepsilon \, q_1(t) + O(\varepsilon^2) \end{aligned} \tag{2.75}$$

A general continuous transformation may be considered infinitesimally, by writing,

$$q'_i(t, \varepsilon) = q_i(t) + \varepsilon \, \delta q_i + O(\varepsilon^2) \tag{2.76}$$

where $\delta q_i$ may be a function of $q_i$, $t$, as well as of $\dot q_i$. Alternatively, we have,

$$\delta q_i = \left. \frac{\partial q'_i(t, \varepsilon)}{\partial \varepsilon} \right|_{\varepsilon = 0} \tag{2.77}$$

so that $\delta q_i$ is a tangent vector to the configuration space of $q_i$ at time $t$, pointing in the direction of the transformation.

Next, we impose the condition (2.71) for a transformation to be a symmetry. For our infinitesimal transformation, this relation simplifies, and we find the following condition,

$$\left. \frac{\partial L(q'; \dot q'; t)}{\partial \varepsilon} \right|_{\varepsilon = 0} = \frac{d\Lambda}{dt} \tag{2.78}$$

for some local function $\Lambda$ which may depend on $q_i(t)$, $t$, and $\dot q_i(t)$. Some comments are in order here. First, for a transformation to be a symmetry, one must be able to find the function $\Lambda$ without using the Euler-Lagrange equations! Second, given a Lagrangian, it is generally easy to find some symmetry transformations (such as time translation symmetry, for which the sole requirement is the absence of explicit time dependence of $L$), but it is considerably more difficult to find all possible symmetry transformations, even at the infinitesimal level.
We shall provide some general algorithms later on.

2.13 Noether's Theorem

Noether's theorem(s) relating symmetries and conservation laws provides one of the most powerful tools for systematic investigations of classical mechanics with symmetries. Emmy Noether (1882-1935) was born in Germany to a mathematician father, and went on to become one of the most influential women mathematicians of all time. After completing her dissertation, she had great difficulty securing an academic position, and worked for many years, without pay, at the Erlangen Mathematical Institute. David Hilbert and Felix Klein nominated her for a position at Göttingen, but the faculty in the humanities objected to appointing women at the university. After the rise of Nazism in Germany in 1933, Noether emigrated to the US, and held a position at Bryn Mawr College. Her work covers many areas of abstract algebra, including Galois theory and class field theory, as well as mathematical physics with her work on symmetries.

Noether's Theorem states that, to every infinitesimal symmetry transformation of (2.76), and thus satisfying (2.78), there corresponds a first integral,

$$Q = \sum_{i=1}^N \frac{\partial L}{\partial \dot q_i}\, \delta q_i - \Lambda \tag{2.79}$$

which is conserved, i.e. it remains constant in time along any trajectory that satisfies the Euler-Lagrange equations,

$$\frac{dQ}{dt} = 0 \tag{2.80}$$

The quantity $Q$ is referred to as the Noether charge.

Given the set-up we have already developed, the Theorem is easy to prove. We begin by writing out the condition (2.78) using (2.76), and we find,

$$\sum_{i=1}^N \left( \frac{\partial L}{\partial q_i}\, \delta q_i + \frac{\partial L}{\partial \dot q_i}\, \delta \dot q_i \right) = \frac{d\Lambda}{dt} \tag{2.81}$$

Using the relation $\delta \dot q_i = d(\delta q_i)/dt$, as well as the Euler-Lagrange equations to eliminate $\partial L / \partial q_i$, it is immediate that,

$$\sum_{i=1}^N \left( \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q_i} \right) \delta q_i + \frac{\partial L}{\partial \dot q_i}\, \frac{d}{dt} \delta q_i \right) = \frac{d\Lambda}{dt} \tag{2.82}$$

from which (2.80) follows using the definition of $Q$ in (2.79).
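As a concrete check of the theorem, consider the infinitesimal rotation $\delta q_1 = -q_2$, $\delta q_2 = q_1$ with $\Lambda = 0$, for which the Noether charge is $Q = m (q_1 \dot q_2 - q_2 \dot q_1)$. The sketch below is an illustration under explicit assumptions: the rotation-invariant potential $V = \frac{1}{2} k r^2 + \frac{1}{4} \lambda r^4$, the parameter values, the initial data, and the Runge-Kutta integrator are all choices made for this example and are not part of the notes. It integrates the Euler-Lagrange equations numerically and compares $Q$ at the start and at the end of the trajectory.

```python
# Illustrative parameters (assumptions of this sketch).
m, k, lam = 1.0, 1.0, 0.5

def deriv(state):
    """Time derivative of (q1, q2, v1, v2) for L = m/2 v^2 - k r^2/2 - lam r^4/4."""
    q1, q2, v1, v2 = state
    c = -(k + lam * (q1 * q1 + q2 * q2)) / m  # acceleration = c * q
    return (v1, v2, c * q1, c * q2)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, ds, f):
        return tuple(a + f * b for a, b in zip(s, ds))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def noether_charge(state):
    """Q = sum_i (dL/dq'_i) delta q_i - Lambda, with delta q = (-q2, q1), Lambda = 0."""
    q1, q2, v1, v2 = state
    return m * v1 * (-q2) + m * v2 * q1

def evolve(state, dt=1e-3, steps=5000):
    """Integrate the equations of motion for steps * dt units of time."""
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state

start = (1.0, 0.0, 0.3, 0.8)
end = evolve(start)
print("Q at start:", noether_charge(start))
print("Q at end  :", noether_charge(end))
```

The positions and velocities change substantially along the trajectory, while $Q$ stays constant up to the integration error, as (2.80) requires.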

2.14 Examples of symmetries and conserved charges

We provide here some standard examples of symmetries and conservation laws.

2.14.1 Translation in time

Using Noether's Theorem, we can re-derive total energy conservation. The transformation of time translation acts as follows on the position variables,

$$q'_i(t, \varepsilon) = q_i(t + \varepsilon) = q_i(t) + \varepsilon \, \dot q_i(t) + O(\varepsilon^2) \tag{2.83}$$

so that $\delta q_i(t) = \dot q_i(t)$. Next, compute the transformation of the Lagrangian,

$$\left. \frac{\partial L(q'; \dot q'; t)}{\partial \varepsilon} \right|_{\varepsilon = 0} = \sum_{i=1}^N \left( \frac{\partial L}{\partial q_i}\, \dot q_i + \frac{\partial L}{\partial \dot q_i}\, \ddot q_i \right) = \frac{dL}{dt} - \frac{\partial L}{\partial t} \tag{2.84}$$

From this, we see that we can have time translation symmetry if and only if $\partial L / \partial t = 0$. In this case, we have $\Lambda = L$, and the Noether charge $Q$ then coincides with $H$. The examples of space translation and rotation symmetries are analogous but, in almost all cases, have $\Lambda = 0$.

2.14.2 Translation in a canonical variable

Translations in linear combinations of the dynamical variables $q_i$ are given by,

$$q'_i(t) = q_i(t) + \varepsilon \, a_i \tag{2.85}$$

where $a_i$ are constants (some of which, but not all of which, may vanish). Under this transformation, the Lagrangian changes as follows,

$$\frac{\partial L(q', \dot q'; t)}{\partial \varepsilon} = \sum_{i=1}^N \frac{\partial L}{\partial q_i}\, a_i \tag{2.86}$$

When the transformation is a symmetry, this is a total time derivative with $\Lambda = 0$, so that the Euler-Lagrange equations imply,

$$\frac{d}{dt} P_a = 0 \qquad P_a = \sum_{i=1}^N a_i \frac{\partial L}{\partial \dot q_i} \tag{2.87}$$

The momentum $P_a$ in the direction of translation is then conserved, and the conjugate position variable is said to be a cyclic variable.

(3) The simplest system with boost invariance is a free particle moving in one dimension, with Lagrangian $L = m \dot q^2/2$. A boost acts as follows,

$$ q'(t, \varepsilon) = q(t) + \varepsilon\, t \tag{2.88} $$

where $\varepsilon$ is the boost velocity. Proceeding as we did earlier, we find $\Lambda = m q(t)$, and thus,

$$ Q = m \dot q\, t - m q(t) \tag{2.89} $$

Charge conservation here is possible because of the explicit time dependence of $Q$. We verify indeed that $\dot Q = 0$ provided the Euler-Lagrange equations are obeyed, namely $m \ddot q = 0$. The meaning of the conserved charge is clarified by solving the equation of motion, to obtain $q(t) = v_0 t + q_0$. Thus, we have $Q = - m q_0$, referring to the time-independence of a distinguished reference position of the particle. This is not really a very interesting conserved quantity, but Noether's Theorem nonetheless demonstrates its existence, and its conservation.

(4) A particle with mass $m$ and electric charge $e$ subject to external electric and magnetic fields $\mathbf{E}$ and $\mathbf{B}$ is governed by the Lagrangian,

$$ L = \frac{1}{2} m \dot{\mathbf{x}}^2 - e \Phi + e\, \dot{\mathbf{x}} \cdot \mathbf{A} \tag{2.90} $$

where $\mathbf{B} = \nabla \times \mathbf{A}$ and $\mathbf{E} = - \partial \mathbf{A}/\partial t - \nabla \Phi$. The scalar and gauge potentials $\Phi, \mathbf{A}$ corresponding to given electro-magnetic fields $\mathbf{E}$ and $\mathbf{B}$ are not unique, and allow for arbitrary local gauge transformations,

$$
\begin{aligned}
\Phi(t, \mathbf{x}) &\to \Phi'(t, \mathbf{x}) = \Phi(t, \mathbf{x}) - \frac{\partial}{\partial t} \Theta(t, \mathbf{x}) \\
\mathbf{A}(t, \mathbf{x}) &\to \mathbf{A}'(t, \mathbf{x}) = \mathbf{A}(t, \mathbf{x}) + \nabla \Theta(t, \mathbf{x})
\end{aligned}
\tag{2.91}
$$

for an arbitrary scalar function $\Theta$. To derive the behavior of the Lagrangian under gauge transformations, we calculate,

$$ L' - L = - e \Phi' + e\, \dot{\mathbf{x}} \cdot \mathbf{A}' + e \Phi - e\, \dot{\mathbf{x}} \cdot \mathbf{A} = e\, \frac{\partial}{\partial t} \Theta(t, \mathbf{x}) + e\, \dot{\mathbf{x}} \cdot \nabla \Theta \tag{2.92} $$

and hence we have,

$$ L' - L = \frac{d}{dt} (e \Theta) \tag{2.93} $$

so that gauge transformations are symmetries of $L$. Since they act on the fields and not on the dynamical variable $\mathbf{x}(t)$, a derivation of the Noether charge will have to be postponed until we write down the Lagrangian also for the electro-magnetic fields.
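The boost charge of the free particle can be checked with a few lines of Python. The sketch below evaluates the expression (2.89) along the exact solution $q(t) = v_0 t + q_0$ at several times; the function names and the numerical values of $m$, $q_0$, $v_0$ are illustrative assumptions, not taken from the text.

```python
def free_trajectory(q0, v0, t):
    """Position and velocity of the solution q(t) = v0*t + q0 of m*q'' = 0."""
    return v0 * t + q0, v0


def boost_charge(m, q, qdot, t):
    """Noether charge Q = m*qdot*t - m*q of eq. (2.89)."""
    return m * qdot * t - m * q


# Illustrative values (assumptions, not taken from the text).
m, q0, v0 = 2.0, 1.5, 0.7
times = [0.0, 1.0, 2.5, 10.0]
charges = [boost_charge(m, *free_trajectory(q0, v0, t), t) for t in times]

for t, charge in zip(times, charges):
    print(f"t = {t:5.1f}   Q = {charge:.12f}")
```

Up to floating-point rounding, every printed value equals $- m q_0$, which illustrates that the explicit factor of $t$ in $Q$ is exactly compensated by the motion of the particle.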

3 Quadratic Systems: Small Oscillations

In many physical situations, the dynamics of a system keeps it near equilibrium, with small oscillations around the equilibrium configuration. Also, the dynamics of a given system may be too complicated for a complete analytical solution, and approximations near an analytically known solution may sometimes be the best results one can obtain. All such problems ultimately boil down to linearizing the Euler-Lagrange equations (or Hamilton's equations) or, equivalently, reducing the problem to an action which is quadratic in the positions $q_i$ and velocities $\dot q_i$. All problems of small oscillations around an equilibrium point reduce to linear differential equations, with constant coefficients, and may thus be solved by methods of linear algebra.

3.1 Equilibrium points and Mechanical Stability

A system is in mechanical equilibrium if the generalized forces acting on the system add up to zero. In the general class of potential-type problems, the Lagrangian is of the form,

$$ L = \frac{1}{2} M_{ij}(q)\, \dot q_i \dot q_j + N_i(q)\, \dot q_i - V(q) \tag{3.1} $$

This form includes electro-magnetic problems, expressed in arbitrary generalized coordinates. We assume that $L$ has no explicit time dependence, thereby precluding externally driven systems. Equilibrium points correspond to extrema $q_i^0$ of $V$, which obey

$$ \left. \frac{\partial V(q)}{\partial q_i} \right|_{q_i = q_i^0} = 0 \tag{3.2} $$

If the system is prepared at an initial time $t_0$ in a configuration with

$$ q_i(t_0) = q_i^0 \qquad\qquad \dot q_i(t_0) = 0 \tag{3.3} $$

then evolution according to the Euler-Lagrange equations will leave the system in this configuration at all times. One distinguishes three types of equilibrium configurations, namely stable, unstable, and marginal. To make this distinction quantitative, we expand the Lagrangian around the equilibrium point, as follows,

$$ q_i(t) = q_i^0 + \eta_i(t) + O(\eta^2) \tag{3.4} $$

and retain terms up to second order in $\eta_i$ and $\dot \eta_i$. The resulting Lagrangian $L_2$ is given by,

$$ L_2 = \sum_{i,j=1}^N \left( \frac{1}{2} m_{ij}\, \dot\eta_i \dot\eta_j + n_{ij}\, \dot\eta_i \eta_j - \frac{1}{2} v_{ij}\, \eta_i \eta_j \right) \tag{3.5} $$

where the constant matrices $m_{ij}$, $n_{ij}$, and $v_{ij}$ are defined by,

$$ m_{ij} = M_{ij} \Big|_{q_i = q_i^0} \qquad n_{ij} = \left. \frac{\partial N_i}{\partial q_j} \right|_{q_i = q_i^0} \qquad v_{ij} = \left. \frac{\partial^2 V}{\partial q_i \partial q_j} \right|_{q_i = q_i^0} \tag{3.6} $$

Without loss of generality, we may assume that $m_{ij}$ and $v_{ij}$ are symmetric in $ij$, while $n_{ij}$ is anti-symmetric, up to the addition of a total time derivative to the Lagrangian.

Let us concentrate on the most common situation where $m_{ij}$ is positive definite as a matrix. This will always be the case if the system originates from a problem with standard kinetic energy terms, which are always positive definite in the above sense. We may then change to new variables $\eta'$ in terms of which the matrix $m_{ij}$ is just the identity matrix. The precise form of the change of variables is as follows,

$$ \eta_i' = \sum_{j=1}^N \mu_{ij}\, \eta_j \qquad\qquad m_{ij} = \sum_{k=1}^N \mu_{ik}\, \mu_{kj} \tag{3.7} $$

where the matrix $\mu_{ij}$ is the square root of the matrix $m_{ij}$, and may be chosen positive definite and symmetric. Having performed this change of variables, we shall just set $m_{ij} = \delta_{ij}$ and omit the primes from $\eta$.

We begin by studying the special case where $n_{ij} = 0$. The Lagrangian takes the form,

$$ L_2 = \sum_{i,j=1}^N \left( \frac{1}{2} \delta_{ij}\, \dot\eta_i \dot\eta_j - \frac{1}{2} v_{ij}\, \eta_i \eta_j \right) \tag{3.8} $$

The different types of equilibrium may now be characterized as follows,

STABLE: the eigenvalues of the matrix $v_{ij}$ are all strictly positive;
UNSTABLE: the matrix $v_{ij}$ has at least one negative eigenvalue;
MARGINAL: the matrix $v_{ij}$ has at least one zero eigenvalue, all other eigenvalues being positive or zero.
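The classification above translates directly into a test on the eigenvalues of $v_{ij}$. The following minimal Python sketch does this for two degrees of freedom, using the closed-form eigenvalues of a real symmetric $2 \times 2$ matrix. The function names, the tolerance `tol`, and the sample matrices are assumptions introduced for illustration only.

```python
import math


def eigenvalues_sym2(a, b, c):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, c]], ascending."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius, mean + radius


def classify(v, tol=1e-12):
    """Classify an equilibrium from its 2x2 matrix v_ij (case n_ij = 0)."""
    w_min, _ = eigenvalues_sym2(v[0][0], v[0][1], v[1][1])
    if w_min < -tol:
        return "unstable"   # at least one negative eigenvalue
    if w_min <= tol:
        return "marginal"   # a zero eigenvalue, none negative
    return "stable"         # all eigenvalues strictly positive


# Sample matrices (illustrative assumptions).
samples = {
    "coupled oscillators": [[2.0, 1.0], [1.0, 2.0]],
    "saddle point": [[1.0, 0.0], [0.0, -1.0]],
    "flat direction": [[1.0, 0.0], [0.0, 0.0]],
}
for name, v in samples.items():
    print(name, "->", classify(v))
```

Only the smallest eigenvalue needs to be inspected, since the three cases are distinguished by whether it is negative, zero, or positive.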

The case of unstable equilibrium includes systems for which some of the eigenvalues of $v_{ij}$ are positive. Note that the free particle provides an example of marginal stability.

The relevance of the signs of the eigenvalues becomes clear upon solving the system. The Euler-Lagrange equations are,

$$ \ddot\eta_i + \sum_{j=1}^N v_{ij}\, \eta_j = 0 \tag{3.9} $$

Since $v_{ij}$ is real symmetric, it may be diagonalized by a real orthogonal matrix $S$, so that we have $v = S^t W S$ with $S^t S = I$, and $W$ real and diagonal, with components,

$$ W_{ij} = w_i\, \delta_{ij} \tag{3.10} $$

In terms of the new variables $\eta_i'$, defined by,

$$ \eta_i'(t) = \sum_{j=1}^N S_{ij}\, \eta_j \tag{3.11} $$

the Euler-Lagrange equations decouple into 1-dimensional oscillation equations,

$$ \ddot\eta_i' + w_i\, \eta_i' = 0 \tag{3.12} $$

The stability question may now be analyzed eigenvalue by eigenvalue. The solutions are given as follows,

$$
\begin{aligned}
w_i = + \omega_i^2 > 0 \qquad & \eta_i'(t) = \gamma_i^+ e^{i \omega_i t} + \gamma_i^- e^{- i \omega_i t} \\
w_i = - \omega_i^2 < 0 \qquad & \eta_i'(t) = \gamma_i^+ e^{\omega_i t} + \gamma_i^- e^{- \omega_i t} \\
w_i = 0 \qquad & \eta_i'(t) = \gamma_i^1\, t + \gamma_i^0
\end{aligned}
\tag{3.13}
$$

where in each case $\omega_i$ is real and positive, and $\gamma_i^\pm, \gamma_i^1, \gamma_i^0$ are constants to be determined by initial conditions. For generic assignments of initial conditions $\gamma_i^\pm, \gamma_i^1, \gamma_i^0$, the amplitudes of oscillation will remain bounded if and only if all eigenvalues $w_i > 0$, which is the stable case. If at least one eigenvalue $w_i$ is negative or zero, then the motion will become unbounded.

The general case where $n_{ij} \neq 0$ is more complicated, but often very interesting. It includes systems of charged particles in a magnetic field, or in the presence of a Coriolis force. The corresponding Euler-Lagrange equations are,

$$ \ddot\eta_i + \sum_{j=1}^N \left( 2 n_{ij}\, \dot\eta_j + v_{ij}\, \eta_j \right) = 0 \tag{3.14} $$

A stable system is one in which all solutions remain bounded in time, and must be oscillatory, and of the form,

$$ \eta_i(t) = \gamma_i\, e^{i \omega t} \tag{3.15} $$

with $\omega$ real, and satisfying the equation,

$$ \sum_{j=1}^N \left( - \omega^2 \delta_{ij} + 2 i \omega\, n_{ij} + v_{ij} \right) \gamma_j = 0 \tag{3.16} $$

This is not quite the standard form of a characteristic equation, but it may be analyzed and solved along parallel lines. In the section on Lagrange points, we shall encounter a real mechanical system with stable equilibrium points, but at which the potential $V$ is actually a (local) maximum!

3.2 Small oscillations near a general solution

The above discussion may be generalized by considering small oscillations not around an equilibrium point, but rather around a non-trivial classical trajectory $q_i^0(t)$. Small fluctuations away from this trajectory may be parametrized as before,

$$ q_i(t) = q_i^0(t) + \eta_i(t) \tag{3.17} $$

Without going through all the details, the corresponding quadratic Lagrangian for $\eta_i(t)$ will be of the form,

$$ L_2 = \sum_{i,j=1}^N \left( \frac{1}{2} m_{ij}(t)\, \dot\eta_i \dot\eta_j + n_{ij}(t)\, \dot\eta_i \eta_j - \frac{1}{2} v_{ij}(t)\, \eta_i \eta_j \right) \tag{3.18} $$

leading again to linear Euler-Lagrange equations, but this time with time-dependent coefficients. The corresponding Euler-Lagrange equations are given by,

$$ \sum_{j=1}^N \left( \frac{d}{dt} \left( m_{ij}\, \dot\eta_j + n_{ij}\, \eta_j \right) - n_{ji}\, \dot\eta_j + v_{ij}\, \eta_j \right) = 0 \tag{3.19} $$

where $m_{ij}$, $n_{ij}$, and $v_{ij}$ now all depend on time. Such equations cannot, in general, be solved analytically.

3.3 Lagrange Points

The general 3-body problem for masses subject to gravitational attraction cannot be solved exactly. When one of the masses is much smaller than both of the other masses, the problem becomes more tractable. This problem also has considerable practical significance for the orbits of satellites moving in a region of space whose gravitational field is dominated by two astronomical masses, such as the Earth and the Moon. Some remarkable phenomena occur, which we shall now study.

Let $m_1$ and $m_2$ be the masses of the astronomical bodies, and $m \ll m_1, m_2$ the mass of the satellite. Let $\mathbf{x}_1$, $\mathbf{x}_2$ and $\mathbf{x}$ be the corresponding Cartesian positions. For simplicity, we shall assume that the mutual orbit of $m_1$ and $m_2$ around one another is circular, and we choose an inertial frame whose center is the center of mass of $m_1$ and $m_2$. Thus, we have the relation $m_1 \mathbf{x}_1 + m_2 \mathbf{x}_2 = 0$. Given the circular orbits, we have,

$$
\begin{aligned}
\mathbf{x}_1 &= ( r_1 \cos \omega t,\ r_1 \sin \omega t,\ 0) \\
\mathbf{x}_2 &= ( - r_2 \cos \omega t,\ - r_2 \sin \omega t,\ 0)
\end{aligned}
\tag{3.20}
$$

as well as,

$$ r_1 = \frac{m_2\, d}{M} \qquad r_2 = \frac{m_1\, d}{M} \qquad d = r_1 + r_2 \qquad M = m_1 + m_2 \tag{3.21} $$

subject to Kepler's law,

$$ G M = d^3 \omega^2 \tag{3.22} $$

In the above inertial frame, the Lagrangian for the satellite is given by,

$$ L = \frac{1}{2} m \dot{\mathbf{x}}^2 + \frac{G m_1 m}{|\mathbf{x} - \mathbf{x}_1|} + \frac{G m_2 m}{|\mathbf{x} - \mathbf{x}_2|} \tag{3.23} $$

In the co-moving frame, which rotates at angular frequency $\omega$ with respect to the inertial frame, the satellite position may be parametrized as follows,

$$ \mathbf{x} = ( x \cos \omega t - y \sin \omega t,\ x \sin \omega t + y \cos \omega t,\ z) \tag{3.24} $$

In the co-moving frame, the kinetic energy $T$ and the potential energy $V$ are time-independent, and take the form,

$$
\begin{aligned}
T &= \frac{1}{2} m \left( \dot x^2 + \dot y^2 + \dot z^2 + \omega^2 (x^2 + y^2) + 2 \omega ( x \dot y - y \dot x ) \right) \\
V &= - \frac{G m_1 m}{[(x - r_1)^2 + y^2 + z^2]^{1/2}} - \frac{G m_2 m}{[(x + r_2)^2 + y^2 + z^2]^{1/2}}
\end{aligned}
\tag{3.25}
$$

The value $z = 0$ is a minimum of the potential for any fixed values of $x, y$. Thus, motion in the plane $z = 0$ is stable, and we shall restrict to this case. Even with this simplification, the problem is too hard to solve exactly.

Notice that the change of frame has provided a centrifugal force acting outward on the satellite. So, let us see whether there are any equilibrium positions in the $x, y$ plane, balancing the attractive gravitational potentials against the repulsive centrifugal force. To this end, we extremize the effective potential,

$$ V_{\rm eff} = - \frac{1}{2} m \omega^2 (x^2 + y^2) - \frac{G m_1 m}{[(x - r_1)^2 + y^2]^{1/2}} - \frac{G m_2 m}{[(x + r_2)^2 + y^2]^{1/2}} \tag{3.26} $$

The equations for an extremum are as follows,

$$
\begin{aligned}
\omega^2 x &= \frac{G m_1 (x - r_1)}{[(x - r_1)^2 + y^2]^{3/2}} + \frac{G m_2 (x + r_2)}{[(x + r_2)^2 + y^2]^{3/2}} \\
\omega^2 y &= \frac{G m_1\, y}{[(x - r_1)^2 + y^2]^{3/2}} + \frac{G m_2\, y}{[(x + r_2)^2 + y^2]^{3/2}}
\end{aligned}
\tag{3.27}
$$

The solution has two branches. The first has $y = 0$, so that the equilibrium point is co-linear with the masses $m_1$ and $m_2$. The equation governing the position $x$,

$$ \omega^2 x = \frac{G m_1 (x - r_1)}{|x - r_1|^3} + \frac{G m_2 (x + r_2)}{|x + r_2|^3} \tag{3.28} $$

is equivalent to a polynomial in $x$ of degree 5 and is solved numerically in figure 7.

For the branch $y \neq 0$, we may simplify the last equation of (3.27) by dividing by a factor of $y$. Multiplying the resulting equation by $x$ and subtracting it from the first equation gives,

$$
\begin{aligned}
0 &= - \frac{G m_1 r_1}{[(x - r_1)^2 + y^2]^{3/2}} + \frac{G m_2 r_2}{[(x + r_2)^2 + y^2]^{3/2}} \\
\omega^2 &= \frac{G m_1}{[(x - r_1)^2 + y^2]^{3/2}} + \frac{G m_2}{[(x + r_2)^2 + y^2]^{3/2}}
\end{aligned}
\tag{3.29}
$$

This is a system of linear equations for the two root factors. Solving for the root factors, and then simplifying the result using $m_1 r_1 = m_2 r_2$ and Kepler's law (3.22), gives the following equations,

$$ (x + r_2)^2 + y^2 = d^2 \qquad\qquad (x - r_1)^2 + y^2 = d^2 \tag{3.30} $$

Thus, the two equilibrium points $(x_L, \pm y_L)$ form an equilateral triangle with the masses $m_1, m_2$. These points are referred to as the Lagrange points. From looking at the potential, it becomes clear that the Lagrange points are maxima of the effective potential $V_{\rm eff}$.
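The co-linear branch can be explored numerically with a short Python sketch. The code below solves equation (3.28) by bisection in units where $d = \omega = 1$, so that Kepler's law (3.22) gives $GM = 1$, $G m_1 = m_1/M$ and $G m_2 = m_2/M$. The bisection routine, the bracketing intervals, the offset `delta` and the function names are all assumptions made for this illustration; the masses are approximate Earth and Moon values.

```python
def collinear_residual(x, mu1, mu2):
    """Left minus right side of eq. (3.28), with d = omega = G*M = 1."""
    r1, r2 = mu2, mu1  # eq. (3.21): r1 = m2*d/M and r2 = m1*d/M
    return (x
            - mu1 * (x - r1) / abs(x - r1) ** 3
            - mu2 * (x + r2) / abs(x + r2) ** 3)


def bisect(f, lo, hi, iterations=200):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Approximate Earth (m1) and Moon (m2) masses in kg.
m1, m2 = 5.97e24, 7.36e22
mu1, mu2 = m1 / (m1 + m2), m2 / (m1 + m2)
r1, r2 = mu2, mu1          # m1 sits at x = r1, m2 sits at x = -r2
delta = 1e-6               # keeps the brackets away from the two singularities


def f(x):
    return collinear_residual(x, mu1, mu2)


x_beyond_m2 = bisect(f, -2.0, -r2 - delta)       # outer branch, beyond m2
x_between = bisect(f, -r2 + delta, r1 - delta)   # center branch
x_beyond_m1 = bisect(f, r1 + delta, 2.0)         # outer branch, beyond m1

print("beyond m2:", x_beyond_m2)
print("between  :", x_between, " distance from m2:", x_between + r2)
print("beyond m1:", x_beyond_m1)
```

The residual is monotonically increasing on each of the three intervals separated by the two masses, which is why a single bracket per interval suffices and why exactly three co-linear points are found.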

Figure 7: Plot of the positions $x$ of the colinear Lagrange points (vertical axis), as a function of $r = r_1$ or $r = r_2$ (horizontal axis), in units of $d$. The Lagrange point of the center branch lies on the interval between masses $m_1$ and $m_2$, while the outer branches lie outside the interval.

3.4 Stability near the non-colinear Lagrange points

We shall focus on the non-colinear Lagrange points, since their positions are known analytically. The analysis for the co-linear Lagrange points is analogous. Linearizing the Euler-Lagrange equations around the Lagrange points,

$$
\begin{aligned}
x &= x_L + X \qquad\qquad x_L = \frac{1}{2} (r_1 - r_2) \\
y &= y_L + Y \qquad\qquad y_L = \frac{\sqrt{3}}{2}\, d
\end{aligned}
\tag{3.31}
$$

we find the following equations of motion,

$$
\begin{aligned}
\ddot X - 2 \omega \dot Y - \frac{3}{4} \omega^2 X + \lambda \omega^2 Y &= 0 \\
\ddot Y + 2 \omega \dot X - \frac{9}{4} \omega^2 Y + \lambda \omega^2 X &= 0
\end{aligned}
\tag{3.32}
$$

with

$$ \lambda = \frac{3 \sqrt{3}}{4}\, \frac{m_1 - m_2}{m_1 + m_2} \tag{3.33} $$

Since the coefficients in this differential equation are constant, the solutions are of the form,

$$ X = X_0\, e^{\kappa t} \qquad\qquad Y = Y_0\, e^{\kappa t} \tag{3.34} $$

where $\kappa$ is given by,

$$ \kappa^2 = \left( - \frac{1}{2} \pm i \sqrt{ \frac{23}{16} - \lambda^2 } \right) \omega^2 \tag{3.35} $$

Given the definition of $\lambda$, we have,

$$ \lambda^2 = \frac{27}{16} - \varepsilon \qquad\qquad 0 < \varepsilon < \frac{27}{16} \tag{3.36} $$

In the case where one of the astronomical bodies is much heavier than the other, the parameter $\varepsilon$ will be small. In this case, the solutions are as follows,

$$ \kappa^{(1)}_\pm \approx \pm i\, \omega \qquad\qquad \kappa^{(2)}_\pm \approx \pm i\, \omega \sqrt{\varepsilon} \tag{3.37} $$

To this order, the time dependence is purely oscillatory, with no runaway solutions! The Lagrange points, even though they occur at the maximum of the potential, are actually stable due to the Coriolis forces. These properties will continue to hold as long as $\lambda^2 > 23/16$, or $0 < m_1^2 + m_2^2 - 25\, m_1 m_2$. Applying these calculations to possible Lagrange points in the Earth-Moon system, we use the values of their masses,

$$ m_1^{\rm (Earth)} = 5.97 \times 10^{24}\ {\rm kg} \qquad\qquad m_2^{\rm (Moon)} = 7.36 \times 10^{22}\ {\rm kg} \tag{3.38} $$

Since $\alpha = m_2/m_1 \approx 0.0123$, and $1 - 25 \alpha + \alpha^2 \approx 0.6926 > 0$, we see that the Earth-Moon system does have stable Lagrange points.
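The stability criterion can be evaluated directly. The Python sketch below computes both roots $\kappa^2/\omega^2$ of (3.35), written equivalently as $-\frac{1}{2} \pm \sqrt{\lambda^2 - 23/16}$, together with the mass condition $m_1^2 + m_2^2 - 25\, m_1 m_2 > 0$. The function names are assumptions introduced for this illustration, and the equal-mass case is included only as a contrasting example.

```python
import cmath


def kappa_squared(m1, m2):
    """Both roots kappa^2 / omega^2 of eq. (3.35), as complex numbers."""
    lam2 = (27.0 / 16.0) * ((m1 - m2) / (m1 + m2)) ** 2  # lambda^2, eq. (3.33)
    root = cmath.sqrt(lam2 - 23.0 / 16.0)  # equals i*sqrt(23/16 - lambda^2)
    return -0.5 + root, -0.5 - root


def is_stable(m1, m2):
    """Mass condition equivalent to lambda^2 > 23/16."""
    return m1 * m1 + m2 * m2 - 25.0 * m1 * m2 > 0.0


# Masses of eq. (3.38).
m_earth, m_moon = 5.97e24, 7.36e22
alpha = m_moon / m_earth

print("alpha                =", alpha)
print("1 - 25 alpha + alpha^2 =", 1.0 - 25.0 * alpha + alpha ** 2)
print("kappa^2 / omega^2    =", kappa_squared(m_earth, m_moon))
print("stable (Earth-Moon)  =", is_stable(m_earth, m_moon))
print("stable (equal masses)=", is_stable(1.0, 1.0))
```

For the Earth-Moon values both roots are real and negative, so all four values of $\kappa$ are purely imaginary. For equal masses $\lambda = 0$, the roots $\kappa^2$ become complex, and one of the resulting $\kappa$ has a positive real part, which signals a runaway solution.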

4 Hamiltonian Formulation of Mechanics

Having transformed Newton's equations into the Lagrangian formulation of mechanics, one may wonder why yet a further different formulation is worth pursuing. The Hamiltonian formulation is based on a change of variables from velocities to associated momenta, and as a result converts the (typically) second order equations of motion of Newton and Lagrange into first order equations only. Of course, such a change can be effected on any set of second order equations, but the Hamiltonian formulation does this in a particularly elegant and useful manner. As a result, the Hamiltonian formulation has compelling geometric and algebraic structures, from which powerful methods emerge for the study of symmetries, and the formulations of both quantum and statistical physics. In this chapter we shall present the basic construction of the Hamiltonian formulation, and postpone more advanced geometric and algebraic considerations to later.

4.1 Canonical position and momentum variables

The starting point will be a Lagrangian formulation and we shall assume, for the sake of simplicity, that all holonomic constraints have been eliminated. The remaining canonical position variables are $q_i(t)$ with $i = 1, \cdots, N$. We now introduce generalized momenta through the definition,

$$ p_i \equiv \frac{\partial L}{\partial \dot q_i} (q, \dot q) \tag{4.1} $$

The canonical momentum enters directly in the Euler-Lagrange equations, which may be recast in terms of $p_i$ by,

$$ \frac{dp_i}{dt} = \frac{\partial L}{\partial q_i} \tag{4.2} $$

The Hamiltonian formulation is derived from the Lagrangian by performing a Legendre transform. We begin by performing the following change of variables,

$$ (q_i, \dot q_i; t) \to (q_i, p_i; t) \tag{4.3} $$

Thus, it will be required to invert the relation (4.1), and to obtain the function,

$$ \dot q_i (q, p; t) \tag{4.4} $$

For the simplest cases, the relation (4.1) admits an inverse and $\dot q_i$ may be uniquely obtained in terms of $q$ and $p$. For certain systems, however, the relation (4.1) is not invertible, and the system is subject to constraints.

The space (or manifold) in which the coordinates live is referred to as phase space. This concept plays a fundamental role in the global analysis of classical mechanical dynamics, as well as in statistical and quantum physics.

4.2 Derivation of the Hamilton's equations

Assuming invertibility of (4.1), the function $\dot q_i$ is uniquely defined via (4.4), and we wish to change variables to $q_i, p_i$. To this end, consider the total differential of $L$,

$$ dL = \frac{\partial L}{\partial t}\, dt + \sum_{i=1}^N \left( \frac{\partial L}{\partial q_i}\, dq_i + \frac{\partial L}{\partial \dot q_i}\, d\dot q_i \right) \tag{4.5} $$

By the definition of the momenta $p_i$, the last term may be partially recast in terms of the momenta $p_i$, so that we have,

$$ dL = \frac{\partial L}{\partial t}\, dt + \sum_{i=1}^N \left( \frac{\partial L}{\partial q_i}\, dq_i + p_i\, d\dot q_i \right) \tag{4.6} $$

This expression is not yet satisfactory since the differential still appeals to the velocity. To eliminate this dependence, we use instead the Legendre transform of $L$, which is the Hamiltonian $H$, defined by,

$$ H = \sum_{i=1}^N p_i\, \dot q_i - L \tag{4.7} $$

whose total differential is given by,

$$ dH = - dL + \sum_{i=1}^N \left( p_i\, d\dot q_i + dp_i\, \dot q_i \right) \tag{4.8} $$

Eliminating $dL$ between (4.6) and (4.8), on the one hand, we obtain,

$$ dH = - \frac{\partial L}{\partial t}\, dt + \sum_{i=1}^N \left( - \frac{\partial L}{\partial q_i}\, dq_i + \dot q_i\, dp_i \right) \tag{4.9} $$

On the other hand, eliminating now the velocity in favor of positions $q_i$ and momenta $p_i$ in $H$, using (4.4), $H$ becomes a function of $q_i$, $p_i$, and $t$, and its total differential is given by,

$$ dH = \frac{\partial H}{\partial t}\, dt + \sum_{i=1}^N \left( \frac{\partial H}{\partial q_i}\, dq_i + \frac{\partial H}{\partial p_i}\, dp_i \right) \tag{4.10} $$

Comparison of (4.9) and (4.10), together with the Euler-Lagrange equations in the form (4.2), reveals the following relations,

$$ \dot q_i = \frac{\partial H}{\partial p_i} \qquad\qquad \dot p_i = - \frac{\partial H}{\partial q_i} \tag{4.11} $$

These are Hamilton's equations of mechanics. The relation between Lagrangian and Hamiltonian mechanics, contained in the subsidiary relation,

$$ \frac{\partial L}{\partial t} = - \frac{\partial H}{\partial t} \tag{4.12} $$

may be omitted altogether.

4.3 Some Examples of Hamiltonian formulation

Example 1

Consider a system with $N$ degrees of freedom $q_i$, and Lagrangian given by

$$ L = \sum_{i,j=1}^N \frac{1}{2} M_{ij}(q)\, \dot q_i \dot q_j - V(q) \tag{4.13} $$

We assume the matrix $M_{ij}(q)$ to be symmetric and invertible for all $q$. The canonical momenta are given by

$$ p_i = \frac{\partial L}{\partial \dot q_i} = \sum_{j=1}^N M_{ij}(q)\, \dot q_j \tag{4.14} $$

Inverting this relation allows us to express the velocities $\dot q_i$ in terms of the momenta,

$$ \dot q_i = \sum_{j=1}^N \left( M(q)^{-1} \right)_{ij} p_j \tag{4.15} $$

where $M(q)^{-1}$ denotes the inverse matrix of $M(q)$. The Hamiltonian is found to be given by,

$$ H = \sum_{i,j=1}^N \frac{1}{2} \left( M(q)^{-1} \right)_{ij} p_i p_j + V(q) \tag{4.16} $$

An immediate application of this example is a particle in spherical coordinates $r, \theta, \phi$, given by the standard Lagrangian with potential,

$$ L = \frac{1}{2} m \left( \dot r^2 + r^2 \dot\theta^2 + r^2 \sin^2\theta\, \dot\phi^2 \right) - V(r, \theta, \phi) \tag{4.17} $$

Its canonical momenta are,

$$ p_r = m \dot r \qquad\qquad p_\theta = m r^2 \dot\theta \qquad\qquad p_\phi = m r^2 \sin^2\theta\, \dot\phi \tag{4.18} $$

The associated Hamiltonian is readily found with the help of the above general formulas,

$$ H = \frac{1}{2m} \left( p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2 \sin^2\theta} \right) + V(r, \theta, \phi) \tag{4.19} $$

Example 2

A particle with mass $m$ and electric charge $e$ in the presence of an electro-magnetic field with scalar potential $\Phi$ and vector potential $\mathbf{A}$, has Lagrangian,

$$ L = \frac{1}{2} m \mathbf{v}^2 - e \Phi + e\, \mathbf{v} \cdot \mathbf{A} \qquad\qquad \mathbf{v} = \dot{\mathbf{x}} \tag{4.20} $$

The canonical momentum is found to be,

$$ \mathbf{p} = m \mathbf{v} + e \mathbf{A} \tag{4.21} $$

and thus receives an electro-magnetic contribution. The Hamiltonian is found to be,

$$ H = \frac{1}{2m} (\mathbf{p} - e \mathbf{A})^2 + e \Phi \tag{4.22} $$

4.4 Variational Formulation of Hamilton's equations

The action $S[q]$ associated with the Lagrangian $L(q, \dot q; t)$,

$$ S[q] = \int_{t_1}^{t_2} dt\, L(q(t); \dot q(t); t) \tag{4.23} $$

may be recast in terms of the Hamiltonian using the Legendre transformation formula (4.7), to get a form which now depends not just on positions $q_i(t)$ but also on momenta $p_i(t)$, and is given by,

$$ I[q, p] = \int_{t_1}^{t_2} dt \left( \sum_{i=1}^N p_i(t)\, \dot q_i(t) - H(q(t); p(t); t) \right) \tag{4.24} $$

The functional derivatives of $I$ with respect to $q$ and $p$ are given by,

$$
\begin{aligned}
\frac{\delta I[q, p]}{\delta q_i(t)} &= - \dot p_i(t) - \frac{\partial H}{\partial q_i} \\
\frac{\delta I[q, p]}{\delta p_i(t)} &= \dot q_i(t) - \frac{\partial H}{\partial p_i}
\end{aligned}
\tag{4.25}
$$

Setting these functional derivatives to zero, as is appropriate for enforcing the action principle, is readily seen to be equivalent to the Hamilton equations of (4.11).
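Hamilton's equations (4.11) are first order, which makes them convenient to integrate numerically. The Python sketch below evaluates the right-hand sides by central finite differences of an arbitrary one-dimensional $H(q, p)$ and advances them with a classical fourth-order Runge-Kutta step. The harmonic-oscillator Hamiltonian, the step sizes, and all function names are assumptions chosen for illustration; they do not come from the text.

```python
import math


def hamilton_rhs(H, q, p, h=1e-6):
    """Return (dq/dt, dp/dt) = (dH/dp, -dH/dq) by central differences."""
    dH_dq = (H(q + h, p) - H(q - h, p)) / (2.0 * h)
    dH_dp = (H(q, p + h) - H(q, p - h)) / (2.0 * h)
    return dH_dp, -dH_dq


def rk4_step(H, q, p, dt):
    """One classical Runge-Kutta step of Hamilton's equations."""
    k1q, k1p = hamilton_rhs(H, q, p)
    k2q, k2p = hamilton_rhs(H, q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = hamilton_rhs(H, q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = hamilton_rhs(H, q + dt * k3q, p + dt * k3p)
    q_new = q + dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
    p_new = p + dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q_new, p_new


def oscillator(q, p, mass=1.0, k=1.0):
    """H = p^2/(2m) + k q^2/2, an illustrative choice of Hamiltonian."""
    return p * p / (2.0 * mass) + 0.5 * k * q * q


q, p = 1.0, 0.0            # initial condition, so that q(t) = cos(t)
dt, steps = 0.01, 628
energy_start = oscillator(q, p)
for _ in range(steps):
    q, p = rk4_step(oscillator, q, p, dt)
t_final = dt * steps
energy_end = oscillator(q, p)

print("q(t_final) =", q, "  cos(t_final) =", math.cos(t_final))
print("energy drift =", energy_end - energy_start)
```

Because $H$ has no explicit time dependence here, its value should stay constant along the trajectory, and the printed energy drift gives a simple check on the integration. A symplectic integrator would be the more natural choice for long runs; Runge-Kutta is used only to keep the sketch short.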

4.5 Poisson Brackets and symplectic structure

There is a powerful geometrical structure on phase space, the Poisson bracket, which is very useful for a more mathematical formulation of classical mechanics, as well as for its relation with quantum mechanics. For any two functions $A(q, p)$ and $B(q, p)$, the Poisson bracket is denoted by $\{A, B\}$, and is defined as follows,

$$ \{A, B\} \equiv \sum_{i=1}^N \left( \frac{\partial A}{\partial q_i} \frac{\partial B}{\partial p_i} - \frac{\partial A}{\partial p_i} \frac{\partial B}{\partial q_i} \right) \tag{4.26} $$

The Poisson bracket is linear in $A$ and $B$, anti-symmetric under interchange of its arguments, acts as a derivative in each argument, and satisfies the Jacobi identity,

$$
\begin{aligned}
0 &= \{A, B\} + \{B, A\} \\
\{A, BC\} &= \{A, B\}\, C + \{A, C\}\, B \\
0 &= \{\{A, B\}, C\} + \{\{B, C\}, A\} + \{\{C, A\}, B\}
\end{aligned}
\tag{4.27}
$$

These relations follow from the application of the basic chain rule, and Schwarz's identity.

Interestingly, we may proceed more algebraically, and more globally in terms of phase space. It follows from the definition (4.26) that,

$$ \{q_i, q_j\} = \{p_i, p_j\} = 0 \qquad\qquad \{q_i, p_j\} = - \{p_i, q_j\} = \delta_{ij} \tag{4.28} $$

Suppose we now postulate a pairing $\{a, b\}$ which is linear in $a$, and linear in $b$, and satisfies the properties (4.27), as well as the relations (4.28). Then it follows that the structure $\{A, B\}$ coincides with the Poisson bracket of (4.26). The novelty of our new approach is that linearity together with relations (4.27) are extremely natural and general equations which arise in many branches of mathematics, and which are related to the theory of Lie algebras. The canonical Poisson brackets of (4.28) merely express a normalization of a privileged set of coordinates $(q, p)$ on phase space. In fact, let us unify the position and momentum coordinates $q_i$ and $p_i$ with $i = 1, \cdots, N$ into a single set of coordinates $\phi_\alpha$ of phase space, where $\alpha = 1, \cdots, 2N$, and

$$ \phi_i = q_i \qquad\qquad \phi_{i+N} = p_i \qquad\qquad i = 1, \cdots, N \tag{4.29} $$

The canonical Poisson brackets of (4.28) now take the form,

$$ \{\phi_\alpha, \phi_\beta\} = J_{\alpha\beta} \tag{4.30} $$

where we define the matrix $J$ as follows,

$$ J = \begin{pmatrix} 0 & I_N \\ - I_N & 0 \end{pmatrix} \tag{4.31} $$

and $I_N$ denotes the identity matrix in $N$ dimensions. Now the matrix $J$ is very special, and corresponds to a canonical symplectic pairing. In particular, we have $J^2 = - I_{2N}$. More generally, if the Poisson brackets on phase space are given by the relation (4.30) for a general (real anti-symmetric) matrix $J$, and the matrix $J$ is invertible, then we actually have a symplectic structure on phase space. More formally, the phase space manifold is a Poisson manifold if it carries relation (4.30), and a symplectic manifold if $J$ is invertible.

4.6 Time evolution in terms of Poisson brackets

The time derivative of any function $A(q, p; t)$, along a trajectory which satisfies Hamilton's equations, may be expressed simply via Poisson brackets,

$$ \frac{d}{dt} A = \frac{\partial}{\partial t} A + \sum_i \left( \frac{\partial A}{\partial q_i}\, \dot q_i + \frac{\partial A}{\partial p_i}\, \dot p_i \right) \tag{4.32} $$

Using Hamilton's equations we eliminate $\dot q_i$ and $\dot p_i$, and we find,

$$ \frac{dA}{dt} = \frac{\partial A}{\partial t} + \{A, H\} \tag{4.33} $$

In particular, for the canonical variables $q_i$ and $p_i$, and for the phase space variables $\phi_\alpha$, Hamilton's equations may be recast in the following form,

$$ \dot q_i = \{q_i, H\} \qquad\qquad \dot p_i = \{p_i, H\} \qquad\qquad \dot\phi_\alpha = \{\phi_\alpha, H\} \tag{4.34} $$

We shall soon interpret these equations even more geometrically.

4.7 Canonical transformations

The idea is to look for a set of generalized coordinates in terms of which the system simplifies as much as possible. In the Lagrangian formulation, we allowed for a change of coordinates $q_i \to q_i'(q; t)$ accompanied by a redefinition of the Lagrangian $L'(q', \dot q'; t) = L(q, \dot q; t)$, which maps the Euler-Lagrange equations for $L$ into the Euler-Lagrange equations for $L'$. A change of coordinates is particularly useful if one of the new coordinates $q_i'$ is cyclic, since then the system may be partially integrated.

In the Hamiltonian formulation, positions and momenta are on the same footing, so we expect to be able to make changes of coordinates on the full phase space. Let us denote the coordinate transformations as follows,

$$ q_i' = q_i'(q, p; t) \qquad\qquad p_i' = p_i'(q, p; t) \tag{4.35} $$

The transformation is canonical provided Hamilton's equations in the new coordinates take on the same form as in the old coordinates,

$$ \dot q_i' = \frac{\partial H'}{\partial p_i'} \qquad\qquad \dot p_i' = - \frac{\partial H'}{\partial q_i'} \tag{4.36} $$

where $H'$ is the new Hamiltonian in the coordinates $q_i', p_i'$. Not all transformations of the form (4.35) are canonical. Rather than trying to work out directly the conditions (4.36) for a transformation to be canonical, it is easier to go back to the action principle in the Hamiltonian form, from which Hamilton's equations were derived. The integrand of the action must be the same in both coordinates, up to a total time derivative,

$$ \sum_{i=1}^N p_i\, dq_i - H(p, q; t)\, dt = \sum_{i=1}^N p_i'\, dq_i' - H'(p', q'; t)\, dt + dF \tag{4.37} $$

where $F$ is a scalar function. Now we could continue to take $q_i, p_i$ as independent phase space variables. Actually, since the differentials appearing in (4.37) are $dt$, $dq_i$, $dq_i'$, it turns out to be simpler to use $q_i$ and $q_i'$ as independent variables, and we shall thus assume that $F$ is a function of these, so that $F = F(q_i, q_i'; t)$. Identifying independent differentials gives,

$$ \frac{\partial F}{\partial t} = H'(q', p'; t) - H(q, p; t) \qquad\qquad \frac{\partial F}{\partial q_i} = p_i \qquad\qquad \frac{\partial F}{\partial q_i'} = - p_i' \tag{4.38} $$

Thus, the function $F$ may be viewed as the generating function for the canonical transformation.

There is another, but equivalent, way of looking at canonical transformations. Using the relations of (4.38), one shows that canonical transformations preserve the Poisson bracket, in the following sense,

$$ \{A', B'\}_{q', p'} = \{A, B\}_{q, p} \tag{4.39} $$

Here, we denote Poisson brackets evaluated with respect to the coordinates $q, p$ by $\{\, , \}_{q,p}$ to render this dependence explicit, and analogously for primed variables. The functions $A', B'$ are related to $A, B$ by,

$$ A'(q', p') = A(q, p) \qquad\qquad B'(q', p') = B(q, p) \tag{4.40} $$
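The invariance (4.39) can be tested numerically on a simple example. The sketch below takes one degree of freedom and the generating function $F(q, q') = q\, q'$, for which (4.38) gives $p = q'$ and $p' = -q$, so that the transformation exchanges positions and momenta up to a sign. The choice of $F$, the test functions $A$, $B$, the sample point, and the finite-difference bracket are all assumptions made for this illustration.

```python
def poisson_bracket(A, B, q, p, h=1e-5):
    """{A, B} of eq. (4.26) for one degree of freedom, by central differences."""
    dA_dq = (A(q + h, p) - A(q - h, p)) / (2.0 * h)
    dA_dp = (A(q, p + h) - A(q, p - h)) / (2.0 * h)
    dB_dq = (B(q + h, p) - B(q - h, p)) / (2.0 * h)
    dB_dp = (B(q, p + h) - B(q, p - h)) / (2.0 * h)
    return dA_dq * dB_dp - dA_dp * dB_dq


# Transformation generated by F(q, q') = q*q':  q' = p,  p' = -q.
def to_primed(q, p):
    return p, -q


def to_unprimed(q_new, p_new):
    return -p_new, q_new


# Arbitrary test functions of the old variables (illustrative choices).
def A(q, p):
    return q * q * p


def B(q, p):
    return p ** 3 + q


# The same functions expressed in the new variables, as in eq. (4.40).
def A_primed(q_new, p_new):
    return A(*to_unprimed(q_new, p_new))


def B_primed(q_new, p_new):
    return B(*to_unprimed(q_new, p_new))


q0, p0 = 0.3, 1.2
q0_new, p0_new = to_primed(q0, p0)

old_bracket = poisson_bracket(A, B, q0, p0)
new_bracket = poisson_bracket(A_primed, B_primed, q0_new, p0_new)
print("{A, B}   in (q, p)   :", old_bracket)
print("{A', B'} in (q', p') :", new_bracket)
```

For these test functions the bracket can also be computed by hand, $\{A, B\} = 6 q p^3 - q^2$, which gives a second check on both numbers.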

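Often one does not even make the distinction between primed and unprimed functions (see e.g. Landau and Lifshitz). To prove this result, we compute both sides in terms of a set of independent variables which are well-adapted to the problem, namely $q_i, q_i'$, and introduce the intermediate functions $\tilde A, \tilde B$ by,

$$
\begin{aligned}
\tilde A(q, q') &= A(q, p) = A'(q', p') \\
\tilde B(q, q') &= B(q, p) = B'(q', p')
\end{aligned}
\tag{4.41}
$$

Expressing the differential $dA = dA' = d\tilde A$ in each set of corresponding coordinates,

$$
\begin{aligned}
d\tilde A &= \sum_i \left( \frac{\partial \tilde A}{\partial q_i}\, dq_i + \frac{\partial \tilde A}{\partial q_i'}\, dq_i' \right) \\
dA &= \sum_i \left( \frac{\partial A}{\partial q_i}\, dq_i + \frac{\partial A}{\partial p_i}\, dp_i \right) \\
dA' &= \sum_i \left( \frac{\partial A'}{\partial q_i'}\, dq_i' + \frac{\partial A'}{\partial p_i'}\, dp_i' \right)
\end{aligned}
\tag{4.42}
$$

and then converting to the adapted coordinates $q, q'$ on lines two and three, and identifying coefficients of the differentials $dq_i$ and $dq_i'$, we find the following relations,

$$
\begin{aligned}
\frac{\partial \tilde A}{\partial q_i} &= \frac{\partial A}{\partial q_i} + \sum_j \frac{\partial A}{\partial p_j} \frac{\partial p_j}{\partial q_i}
& \qquad
\frac{\partial \tilde A}{\partial q_i'} &= \sum_j \frac{\partial A}{\partial p_j} \frac{\partial p_j}{\partial q_i'} \\
\frac{\partial \tilde A}{\partial q_i'} &= \frac{\partial A'}{\partial q_i'} + \sum_j \frac{\partial A'}{\partial p_j'} \frac{\partial p_j'}{\partial q_i'}
& \qquad
\frac{\partial \tilde A}{\partial q_i} &= \sum_j \frac{\partial A'}{\partial p_j'} \frac{\partial p_j'}{\partial q_i}
\end{aligned}
\tag{4.43}
$$

and similarly for the derivatives of the function $B$. Re-expressing the derivatives with respect to $q$ in $\{A, B\}_{q,p}$, and with respect to $q'$ in $\{A', B'\}_{q',p'}$, in terms of $\tilde A$ and $\tilde B$ using the above equations, we find,

$$
\begin{aligned}
\{A, B\}_{q,p} &= \sum_i \left( \frac{\partial \tilde A}{\partial q_i} \frac{\partial B}{\partial p_i} - \frac{\partial A}{\partial p_i} \frac{\partial \tilde B}{\partial q_i} \right) + \sum_{i,j} \frac{\partial A}{\partial p_i} \frac{\partial B}{\partial p_j} \left( \frac{\partial p_j}{\partial q_i} - \frac{\partial p_i}{\partial q_j} \right) \\
\{A', B'\}_{q',p'} &= \sum_i \left( \frac{\partial \tilde A}{\partial q_i'} \frac{\partial B'}{\partial p_i'} - \frac{\partial A'}{\partial p_i'} \frac{\partial \tilde B}{\partial q_i'} \right) + \sum_{i,j} \frac{\partial A'}{\partial p_i'} \frac{\partial B'}{\partial p_j'} \left( \frac{\partial p_j'}{\partial q_i'} - \frac{\partial p_i'}{\partial q_j'} \right)
\end{aligned}
\tag{4.44}
$$

Next, we use the fact that, in view of the Schwarz identity, the following matrices are symmetric under interchange of $i, j$,

$$ \frac{\partial p_i}{\partial q_j} = \frac{\partial^2 F}{\partial q_i \partial q_j} \qquad\qquad \frac{\partial p_i'}{\partial q_j'} = - \frac{\partial^2 F}{\partial q_i' \partial q_j'} \tag{4.45} $$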

so that the second terms in (4.44) cancel. Next, in the first terms in (4.44), we express the derivatives of $A, A', B, B'$ in terms of derivatives of $\tilde A, \tilde B$ using the right hand formulas of (4.43), so that we find,

$$
\begin{aligned}
\{A, B\}_{q,p} &= \sum_{i,j} \frac{\partial q_j'}{\partial p_i} \left( \frac{\partial \tilde A}{\partial q_i} \frac{\partial \tilde B}{\partial q_j'} - \frac{\partial \tilde A}{\partial q_j'} \frac{\partial \tilde B}{\partial q_i} \right) \\
\{A', B'\}_{q',p'} &= \sum_{i,j} \frac{\partial q_j}{\partial p_i'} \left( \frac{\partial \tilde A}{\partial q_i'} \frac{\partial \tilde B}{\partial q_j} - \frac{\partial \tilde A}{\partial q_j} \frac{\partial \tilde B}{\partial q_i'} \right)
\end{aligned}
\tag{4.46}
$$

Finally, to compare both lines of (4.46), we evaluate the inverses of the prefactors,

$$ \frac{\partial p_j}{\partial q_i'} = \frac{\partial^2 F}{\partial q_j \partial q_i'} \qquad\qquad \frac{\partial p_j'}{\partial q_i} = - \frac{\partial^2 F}{\partial q_j' \partial q_i} \tag{4.47} $$

and find that they are opposites of one another, up to transposition. It follows that (4.39) holds true.

4.8 Symmetries and Noether's Theorem

Noether's theorem, which we derived in the Lagrangian formulation, has an important extension and reformulation in the Hamiltonian formulation. Recall that an infinitesimal transformation $\delta q_i$ on the dynamical variables $q_i$ is a symmetry of the Euler-Lagrange equations provided the corresponding infinitesimal change in the Lagrangian obeys,

$$ \delta L = \frac{d\Lambda}{dt} \tag{4.48} $$

where $\Lambda$ is a function of $q(t)$ and $\dot q(t)$ which is local in $t$. The Noether charge associated with this symmetry transformation is given by,

$$ Q = \sum_{i=1}^N \frac{\partial L}{\partial \dot q_i}\, \delta q_i - \Lambda \tag{4.49} $$

By construction in the Lagrangian formulation, $Q$ is a function of $q$ and $\dot q$. In order to make use of $Q$ in the Hamiltonian formulation, we begin by changing variables from $(q, \dot q; t) \to (q, p; t)$ as in (4.3), so that $Q$ is now a function of $(q, p; t)$. Its conservation may be re-expressed using the Poisson bracket and formula (4.33), which applies to any function of $(q, p; t)$,

$$ 0 = \frac{dQ}{dt} = \frac{\partial Q}{\partial t} + \{Q, H\} \tag{4.50} $$

There is a powerful result, referred to as Poisson's theorem, which provides relations between different continuous symmetries, and Noether charges, of a given mechanical system.

4.9 Poisson's Theorem

Poisson's Theorem states that if $Q_1$ and $Q_2$ are two conserved Noether charges, then their Poisson bracket $\{Q_1, Q_2\}$ is also a conserved Noether charge. It is worth proving this theorem, for the sake of practice with Poisson brackets. To prove the Theorem, it will suffice to prove that the total time derivative of $\{Q_1, Q_2\}$ vanishes. Thus, we calculate,

$$
\begin{aligned}
\frac{d}{dt} \{Q_1, Q_2\} &= \frac{\partial}{\partial t} \{Q_1, Q_2\} + \{\{Q_1, Q_2\}, H\} \\
&= \left\{ \frac{\partial Q_1}{\partial t}, Q_2 \right\} + \left\{ Q_1, \frac{\partial Q_2}{\partial t} \right\} + \{\{Q_1, H\}, Q_2\} + \{Q_1, \{Q_2, H\}\}
\end{aligned}
\tag{4.51}
$$

where we have used the Jacobi identity to recast the last term of the first line into the last two terms of the second line. Finally, it suffices to combine the first and third term, and the second and fourth term, to obtain,

$$ \frac{d}{dt} \{Q_1, Q_2\} = \{\dot Q_1, Q_2\} + \{Q_1, \dot Q_2\} \tag{4.52} $$

which must indeed vanish in view of the assumption that both $Q_1$ and $Q_2$ are conserved, so that $\dot Q_1 = \dot Q_2 = 0$. Thus, $\{Q_1, Q_2\}$ is conserved. We shall later give a general definition of a Lie algebra, and see that the relations established already above are precisely such that the conserved Noether charges form a Lie algebra.

Poisson's theorem shows that if we know two infinitesimal symmetry transformations $\delta_1 q$ and $\delta_2 q$, with associated Noether charges $Q_1$ and $Q_2$, then $\{Q_1, Q_2\}$ is conserved. It may of course be the case that $\{Q_1, Q_2\} = 0$, or more generally that $\{Q_1, Q_2\}$ is a function of $Q_1$ and $Q_2$, so that we do not get any new charge from this procedure. But if $\{Q_1, Q_2\}$ is functionally independent of $Q_1$ and $Q_2$, then we obtain a new Noether charge, and we would expect a new infinitesimal symmetry transformation $\delta_{12}\, q$. Thus, the question arises as to how to compute the infinitesimal symmetry transformation from the Noether charge.

4.10 Noether charge reproduces the symmetry transformation

The fundamental result is as follows.
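If $Q$ is a conserved Noether charge, given by (4.49), and expressed in terms of the variables $(q, p; t)$,

$$ Q(q, p; t) = \sum_{i=1}^N p_i\, \delta q_i - \Lambda(q, p; t) \tag{4.53} $$

then the associated infinitesimal transformation is given, with the bracket convention of (4.26), by,

$$ \delta q_i = \{q_i, Q\} \qquad\qquad \delta p_i = \{p_i, Q\} \tag{4.54} $$

We shall prove the first relation, the second one being analogous. First, using (4.53), we compute,

$$ \{q_i, Q\} = \frac{\partial Q}{\partial p_i} = \delta q_i + \sum_{j=1}^N p_j\, \frac{\partial \delta q_j}{\partial p_i} - \frac{\partial \Lambda}{\partial p_i} \tag{4.55} $$

To obtain the last term, we use its defining relation, now expressed in terms of variables $(q, \dot q; t)$, and by expanding this relation we find,

$$ \sum_{i=1}^N \left( \frac{\partial L}{\partial q_i}\, \delta q_i + \frac{\partial L}{\partial \dot q_i}\, \delta \dot q_i \right) = \frac{\partial \Lambda}{\partial t} + \sum_{i=1}^N \left( \frac{\partial \Lambda}{\partial q_i}\, \dot q_i + \frac{\partial \Lambda}{\partial \dot q_i}\, \ddot q_i \right) \tag{4.56} $$

Expressing $\delta \dot q_i = d(\delta q_i)/dt$ in terms of its independent variables, we obtain,

$$ \delta \dot q_i = \frac{\partial \delta q_i}{\partial t} + \sum_{j=1}^N \left( \frac{\partial \delta q_i}{\partial q_j}\, \dot q_j + \frac{\partial \delta q_i}{\partial \dot q_j}\, \ddot q_j \right) \tag{4.57} $$

Identifying both sides of (4.56) using (4.57), we find for the term proportional to $\ddot q_j$,

$$ \sum_{i=1}^N \frac{\partial L}{\partial \dot q_i}\, \frac{\partial \delta q_i}{\partial \dot q_j} = \frac{\partial \Lambda}{\partial \dot q_j} \tag{4.58} $$

Changing variables to $(q, p; t)$, we find,

$$ \frac{\partial \Lambda}{\partial p_i} = \sum_{j=1}^N \frac{\partial \Lambda}{\partial \dot q_j}\, \frac{\partial \dot q_j}{\partial p_i} \tag{4.59} $$

Using now relation (4.58), we find,

$$ \frac{\partial \Lambda}{\partial p_i} = \sum_{j,k=1}^N p_j\, \frac{\partial \dot q_k}{\partial p_i}\, \frac{\partial \delta q_j}{\partial \dot q_k} \tag{4.60} $$

Putting all together, we have,

$$ \{q_i, Q\} = \delta q_i + \sum_{j=1}^N p_j \left( \frac{\partial \delta q_j}{\partial p_i} - \sum_{k=1}^N \frac{\partial \dot q_k}{\partial p_i}\, \frac{\partial \delta q_j}{\partial \dot q_k} \right) \tag{4.61} $$

By the same argument by which we derived (4.59), we also find that

$$ \frac{\partial \delta q_j}{\partial p_i} = \sum_{k=1}^N \frac{\partial \dot q_k}{\partial p_i}\, \frac{\partial \delta q_j}{\partial \dot q_k} \tag{4.62} $$

so that we indeed recover the first relation of (4.54), and prove our theorem.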
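Poisson's theorem of section 4.9 and the role of a charge as generator can both be illustrated with the components of angular momentum of a particle in three dimensions. The Python sketch below evaluates the bracket (4.26) by central finite differences at a sample phase-space point. The sample point, the quartic central potential, and all function names are assumptions chosen for illustration; the bracket ordering used in each check is written out explicitly.

```python
def poisson_bracket(A, B, z, h=1e-5):
    """{A, B} of eq. (4.26) at the point z = [x, y, z, px, py, pz]."""
    def partial(F, k):
        up, down = list(z), list(z)
        up[k] += h
        down[k] -= h
        return (F(up) - F(down)) / (2.0 * h)

    return sum(partial(A, i) * partial(B, i + 3)
               - partial(A, i + 3) * partial(B, i)
               for i in range(3))


def L_x(z):
    return z[1] * z[5] - z[2] * z[4]   # y*pz - z*py


def L_y(z):
    return z[2] * z[3] - z[0] * z[5]   # z*px - x*pz


def L_z(z):
    return z[0] * z[4] - z[1] * z[3]   # x*py - y*px


def H_central(z):
    """p^2/2 + r^4/4: an illustrative rotation-invariant Hamiltonian."""
    r2 = z[0] ** 2 + z[1] ** 2 + z[2] ** 2
    p2 = z[3] ** 2 + z[4] ** 2 + z[5] ** 2
    return 0.5 * p2 + 0.25 * r2 * r2


def coordinate(k):
    return lambda z: z[k]


point = [0.4, -1.1, 0.7, 0.9, 0.2, -0.5]   # arbitrary sample point

print("{L_x, H}   =", poisson_bracket(L_x, H_central, point))
print("{L_y, H}   =", poisson_bracket(L_y, H_central, point))
print("{L_x, L_y} =", poisson_bracket(L_x, L_y, point), "  L_z =", L_z(point))
print("{x, L_z}   =", poisson_bracket(coordinate(0), L_z, point), "  -y =", -point[1])
print("{y, L_z}   =", poisson_bracket(coordinate(1), L_z, point), "   x =", point[0])
```

The first two brackets vanish up to discretization error, so $L_x$ and $L_y$ are conserved, and their bracket reproduces $L_z$, which is therefore conserved as well. The last two lines show that the bracket of the coordinates with $L_z$ is an infinitesimal rotation about the $z$-axis.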

5 Lie groups and Lie algebras

In this section, we shall discuss the role of Lie groups and Lie algebras in classical mechanics. We begin by defining the algebraic concept of a group, then produce some of the groups most frequently used in physics within the context of the multi-dimensional harmonic oscillator system, and finally give a general definition of a Lie group, and of a Lie algebra.

5.1 Definition of a group

A group $(G, *)$ consists of a set $G$ and a multiplication law $*$, defined between any pair of elements $g_1, g_2 \in G$. The product is denoted by $g_1 * g_2$, and is subject to the following axioms,

1. Multiplication is closed in the set $G$: the product $g_1 * g_2 \in G$ for all $g_1, g_2 \in G$;
2. Associativity: the triple product obeys $(g_1 * g_2) * g_3 = g_1 * (g_2 * g_3)$ for all $g_1, g_2, g_3 \in G$;
3. $G$ contains a unit element $e$ such that $e * g = g * e = g$ for all $g \in G$;
4. Every $g \in G$ has an inverse, denoted $g^{-1}$, so that $g * g^{-1} = g^{-1} * g = e$.

A few comments are in order. (a) When a multiplication law is associative, the parentheses in the triple product may be dropped altogether, $g_1 * g_2 * g_3 = (g_1 * g_2) * g_3 = g_1 * (g_2 * g_3)$, since no ambiguity can arise. (b) The unit element is unique. (c) The inverse of each $g \in G$ is unique. These properties define a group $(G, *)$, often simply denoted by $G$ when the multiplication law is clearly specified.

Direct product of groups: If $G'$ and $G''$ are groups, with operations $*'$ and $*''$ respectively, then the direct product $G = G' \times G''$ also forms a group with the natural combined operation $g_1 * g_2 = (g_1', g_1'') * (g_2', g_2'') = (g_1' *' g_2', g_1'' *'' g_2'')$ for all $g_1', g_2' \in G'$ and $g_1'', g_2'' \in G''$.

Subgroups: A subset $H \subset G$ is referred to as a subgroup of $G$ provided $(H, *)$ is a group. $H$ is an invariant subgroup of $G$ if $g * h * g^{-1} \in H$ for all $g \in G$ and $h \in H$. For example, if $G$ is a product group, with $G = G' \times G''$, then both $G'$ and $G''$ are invariant subgroups of $G$. But this is not the most general situation; the product may also be semi-direct, as it is in the case of the group of Euclidean motions (translations and rotations) in 3-space. There, the group of translations is an invariant subgroup, but the product is not direct.

A group $G$ is defined to be simple if the only invariant subgroups of $G$ are trivial, namely the subgroup $\{e\}$ and $G$ itself. Simple groups form building blocks into which general groups may be decomposed, as direct (or semi-direct) products.
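The statement about Euclidean motions can be checked numerically. The following is only an illustrative sketch, not part of the original notes: to keep it short it uses motions of the plane rather than of 3-space, and it represents a motion $x \to R(\theta) x + a$ by a pair `(theta, a)`. The helper names `rotate`, `compose`, `inverse` and `conjugate` are hypothetical.

```python
import math

def rotate(theta, v):
    """Apply the plane rotation R(theta) to the 2-vector v."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def compose(g1, g2):
    """Group product g1 * g2: apply g2 first, then g1, where g(x) = R(theta) x + a."""
    (t1, a1), (t2, a2) = g1, g2
    ra2 = rotate(t1, a2)
    return (t1 + t2, (a1[0] + ra2[0], a1[1] + ra2[1]))

def inverse(g):
    """Inverse motion: x = R(-theta) (y - a)."""
    t, a = g
    ra = rotate(-t, a)
    return (-t, (-ra[0], -ra[1]))

def conjugate(g, h):
    """Return g * h * g^{-1}."""
    return compose(compose(g, h), inverse(g))

g = (0.7, (1.0, -2.0))            # a generic motion (arbitrary sample values)
translation = (0.0, (3.0, 0.5))   # pure translation: rotation angle zero
rotation = (0.4, (0.0, 0.0))      # pure rotation about the origin

conj_translation = conjugate(g, translation)
conj_rotation = conjugate(g, rotation)

print(conj_translation)  # rotation angle stays 0: still a pure translation
print(conj_rotation)     # picks up a non-zero translation part
```

Conjugating a translation always returns a translation, so translations form an invariant subgroup. Conjugating a rotation about the origin by a generic motion produces a translation part, so the rotations do not, and the product cannot be direct.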

Abelian versus non-Abelian groups: If, for all pairs $g_1, g_2 \in G$, we have

$$g_1 * g_2 = g_2 * g_1 \tag{5.1}$$

then the group $(G, *)$ is said to be Abelian (or commutative). In the contrary case (when there exists at least one pair $g_1, g_2 \in G$ such that $g_1 * g_2 \neq g_2 * g_1$) the group is said to be non-Abelian (or non-commutative). Both Abelian and non-Abelian groups play fundamental roles throughout physics.

There is an important topological distinction between different types of groups. A group is said to be discrete if all of its points are isolated; otherwise, it is continuous.

Elementary examples of Abelian groups are as follows,

$$\mathbb{Z},+ \;\subset\; \mathbb{Q},+ \;\subset\; \mathbb{R},+ \;\subset\; \mathbb{C},+ \qquad\qquad \mathbb{Q}^0,\times \;\subset\; \mathbb{R}^0,\times \;\subset\; \mathbb{C}^0,\times \tag{5.2}$$

where the superscript $0$ indicates that the zero element is to be removed from the set. The inclusion relations indicated in (5.2) correspond to subgroups. The groups $\mathbb{Z}, \mathbb{Q}$ are discrete, while $\mathbb{R}, \mathbb{C}$ are continuous.

The set of all $M \times N$ matrices with addition, either with integer, rational, real, or complex entries, also forms an Abelian group, for all values of $M, N$. Note that addition of matrices is a special case of multiplication of matrices. Indeed, let $A$ and $B$ be two $M \times N$ matrices, and introduce the following matrices of dimension $(M+N) \times (M+N)$,

$$\mathcal{A} = \begin{pmatrix} I_M & A \\ 0 & I_N \end{pmatrix} \qquad \mathcal{B} = \begin{pmatrix} I_M & B \\ 0 & I_N \end{pmatrix} \tag{5.3}$$

Then the product of the matrices $\mathcal{A}$ and $\mathcal{B}$ corresponds to the sum of $A$ and $B$,

$$\mathcal{A}\mathcal{B} = \begin{pmatrix} I_M & A + B \\ 0 & I_N \end{pmatrix} \tag{5.4}$$

Henceforth, we shall view matrix addition as a special case of matrix multiplication.

5.2 Matrix multiplication groups

Matrix multiplication groups are the most important groups used in physics. Square matrices, of size $N \times N$ for any positive integer $N = 1, 2, 3, \dots$, close under matrix multiplication, and matrix multiplication is always associative. The identity element is the identity $N \times N$ matrix, and will be denoted either by $I_N$, or simply by $I$ when no confusion is expected to arise. Not every $N \times N$ matrix has an inverse though. To be invertible, a matrix $M$ must have non-zero determinant. [2] This leads us to introduce the general linear groups of $N \times N$ invertible matrices (with non-zero determinant),

$$Gl(N, \mathbb{Q}) \subset Gl(N, \mathbb{R}) \subset Gl(N, \mathbb{C}) \tag{5.5}$$

respectively with entries taking values in $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$.

[2] In these notes, $N$ will almost always be taken to be finite; for definitions, problems, and properties involving infinite $N$, see my notes on Quantum Mechanics.

Rescaling any invertible matrix $M$ by a non-zero number (belonging to the number field in which the matrix entries take values) yields again an invertible matrix. This makes it natural and useful to introduce groups for which all matrices have determinant 1, a requirement that is more or less systematically indicated with the letter S preceding the group name. This leads us to introduce the special linear groups of $N \times N$ matrices with unit determinant,

$$Sl(N, \mathbb{Z}) \subset Sl(N, \mathbb{Q}) \subset Sl(N, \mathbb{R}) \subset Sl(N, \mathbb{C}) \tag{5.6}$$

The discrete groups built on $\mathbb{Z}$ and $\mathbb{Q}$ do sometimes enter into physics, but only in rather special settings. Henceforth, we shall concentrate on the continuous groups with either real or complex entries.

There are three series of classical matrix groups, each of which is defined through a quadratic relation on its elements.

The orthogonal group $O(N)$ consists of $N \times N$ real matrices $M$ which obey the relation,

$$M^t I_N M = M^t M = I_N \tag{5.7}$$

When the relation (5.7) is imposed on complex matrices $M$, the group is denoted by $O(N, \mathbb{C})$ instead. To verify that the set of matrices $M$ obeying (5.7) forms a group (either with real or complex-valued entries), it suffices to check that matrix multiplication preserves (5.7), and that each $M$ has an inverse. To check the first, we use $(M_1 M_2)^t (M_1 M_2) = M_2^t M_1^t M_1 M_2 = M_2^t M_2 = I_N$. To check invertibility, we take the determinant of relation (5.7), and find $(\det M)^2 = 1$, proving invertibility of each element. The group $SO(N)$ is defined with the additional requirement $\det M = 1$.

The unitary group $U(N)$ consists of $N \times N$ complex matrices $M$ which obey the relation

$$M^\dagger I_N M = M^\dagger M = I_N \qquad\qquad M^\dagger \equiv (M^*)^t \tag{5.8}$$

The proof that $U(N)$ is a group is elementary. Taking again the determinant, we have $|\det M|^2 = 1$, so that $M$ is invertible. To check closure, we compute $(M_1 M_2)^\dagger M_1 M_2 = M_2^\dagger M_1^\dagger M_1 M_2 = I_N$.
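The closure and determinant arguments for $U(N)$ can be illustrated for $N = 2$ with a short numerical check. This is a sketch under stated assumptions rather than anything taken from the notes: `sample_unitary` is a hypothetical helper that builds a $U(2)$ matrix as a phase times an $SU(2)$ matrix, and the angles are arbitrary.

```python
import cmath
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(X):
    """Hermitian conjugate: complex conjugate of the transpose."""
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def sample_unitary(alpha, beta, gamma, delta):
    """A U(2) matrix: exp(i alpha) times an SU(2) matrix with |u|^2 + |v|^2 = 1."""
    u = cmath.exp(1j * beta) * math.cos(delta)
    v = cmath.exp(1j * gamma) * math.sin(delta)
    phase = cmath.exp(1j * alpha)
    return [[phase * u, -phase * v.conjugate()],
            [phase * v, phase * u.conjugate()]]

def deviation_from_identity(X):
    """Largest entry of |X - I|."""
    return max(abs(X[i][j] - (1 if i == j else 0))
               for i in range(2) for j in range(2))

M1 = sample_unitary(0.3, 1.2, -0.7, 0.5)
M2 = sample_unitary(-1.1, 0.4, 2.0, 1.3)
product = matmul(M1, M2)

# (M1 M2)^dagger (M1 M2) should equal the identity, and |det| should equal 1.
closure_error = deviation_from_identity(matmul(dagger(product), product))
det_modulus = abs(det2(product))
print(closure_error, det_modulus)
```

Restricting the entries to real numbers turns the same check into the closure statement for $O(2)$.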

The group $SU(N)$ is defined with the additional requirement $\det M = 1$. These groups are of fundamental importance in physics and enter frequently. For example, the Standard Model of particle physics is based on a Yang-Mills theory with gauge group $SU(3)_c \times SU(2)_L \times U(1)_Y$.

The symplectic group $Sp(2N)$ consists of $2N \times 2N$ matrices $M$ which obey the relation,

$$M^t J M = J \qquad\qquad J = J_{2N} = \begin{pmatrix} 0 & I_N \\ -I_N & 0 \end{pmatrix} \tag{5.9}$$

where $J$ is the symplectic matrix, encountered already as the Poisson bracket on all of phase space. To prove invertibility, we take the determinant of (5.9), and find $(\det M)^2 = 1$. We also have $(M_1 M_2)^t J (M_1 M_2) = M_2^t M_1^t J M_1 M_2 = M_2^t J M_2 = J$. The symplectic group considered over the complex numbers is denoted by $Sp(2N, \mathbb{C})$. One also encounters the group $USp(2N) = Sp(2N, \mathbb{C}) \cap SU(2N)$.

5.3 Orthonormal frames and parametrization of $SO(N)$

The importance of the group $SO(N)$ derives from the fact that it preserves the Euclidean inner product and distance in the $N$-dimensional vector space $\mathbb{R}^N$. Parametrizing vectors in $\mathbb{R}^N$ by column matrices, and the inner product by matrix products,

$$(X, Y) = X^t I_N Y = \sum_{i=1}^N x_i y_i \qquad X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix} \qquad Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix} \tag{5.10}$$

relation (5.7) is equivalent to the invariance of the inner product,

$$(MX, MY) = (X, Y) \quad \text{for all } X, Y \tag{5.11}$$

The Euclidean distance between two points, defined by $d(X, Y) = \sqrt{(X - Y, X - Y)}$, is then also invariant, $d(MX, MY) = d(X, Y)$. In particular, a pair of orthogonal vectors is mapped into a pair of orthogonal vectors. This last fact gives us a very natural way of viewing orthogonal transformations physically, and of parametrizing them explicitly. To see this, introduce the following basis for $\mathbb{R}^N$,

$$V_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \qquad V_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix} \qquad \cdots \qquad V_N = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix} \tag{5.12}$$

This basis is orthonormal, namely we have $(V_i, V_j) = \delta_{ij}$. We now view an orthogonal matrix $M \in O(N)$ as a transformation on vectors in $\mathbb{R}^N$. In particular, $M$ will transform each basis vector $V_i$ into a new vector $V_i' = M V_i$. As a result of (5.11), we have $(V_i', V_j') = \delta_{ij}$. Thus, the transformed basis vectors $V_i'$ also form an orthonormal basis. In fact, the components of the column matrices $V_1', V_2', \dots, V_N'$ are precisely the entries of the matrix $M$,

$$V_i' = M V_i = \begin{pmatrix} m_{1i} \\ m_{2i} \\ \vdots \\ m_{Ni} \end{pmatrix} \qquad M = \begin{pmatrix} m_{11} & m_{12} & \cdots & m_{1N} \\ m_{21} & m_{22} & \cdots & m_{2N} \\ \vdots & \vdots & & \vdots \\ m_{N1} & m_{N2} & \cdots & m_{NN} \end{pmatrix} \tag{5.13}$$

For $\det M = 1$ the orientation of the new frame is preserved, while for $\det M = -1$ the orientation is reversed. The group $SO(N)$ maps an oriented orthonormal frame into an orthonormal frame with the same orientation. Specifically, this map is one-to-one and onto.

Conversely, viewing $M \in SO(N)$ as a transformation on the unit vectors of an orthonormal frame gives us a natural parametrization of all orthogonal matrices. The parametrization proceeds as follows. We begin by parametrizing a general $N$-dimensional unit vector $V$ in a recursive manner,

$$V = \begin{pmatrix} v \sin\phi \\ \cos\phi \end{pmatrix} \tag{5.14}$$

where $\phi$ is an angle taking values in the interval $\phi \in [0, \pi]$, and $v$ is a unit vector in dimension $N-1$, represented here by an $(N-1)$-dimensional column matrix. The vector $v$ may in turn be parametrized in this recursive manner, and so on. In all, the parametrization of an $N$-dimensional unit vector requires $N-1$ angles. Note, however, that a two-dimensional unit vector, which is the last step in the recursion, is parametrized by,

$$v = \begin{pmatrix} \sin\psi \\ \cos\psi \end{pmatrix} \tag{5.15}$$

but where the angle $\psi \in [0, 2\pi]$.

To parametrize the full matrix $M \in SO(N)$, we begin by parametrizing the last column $V_N'$ with the help of the above recursive method. Next, we parametrize $V_{N-1}'$. This is also a unit vector, which may be parametrized recursively by the above method as well, but we must now insist on one relation, namely the orthogonality with $V_N'$, which reads $(V_{N-1}', V_N') = 0$. This relation may be easily enforced by fixing one of the $N-1$ angles in the general unit vector parametrization of $V_{N-1}'$. Next, $V_{N-2}'$ is a unit vector, but it must be orthogonal to both $V_{N-1}'$ and $V_N'$, and so on. In total, the parametrization of $SO(N)$ requires

$$(N-1) + (N-2) + (N-3) + \cdots + 1 = \frac{1}{2} N(N-1) \tag{5.16}$$

angles, with ranges as indicated by the recursive construction of each unit vector.

The total number of independent angles needed to parametrize $SO(N)$ is the dimension of the group. This number could also have been derived directly, by looking at the defining relation $M^t M = I$. The matrix $M$ has $N^2$ real entries. The equation $M^t M = I$ imposes $N^2$ relations, which are not all independent. Since the matrix $M^t M$ is automatically symmetric, it imposes only $N(N+1)/2$ relations, leaving $N^2 - N(N+1)/2 = N(N-1)/2$ independent parameters.

5.4 Three-dimensional rotations and Euler angles

For the case $N = 3$, corresponding to rotations of customary 3-dimensional space, the resulting parametrization is equivalent to Euler angles. To see this, we introduce rotations around the coordinate axes $x$ and $z$, as follows,

$$R_x(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix} \qquad R_z(\phi) = \begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{5.17}$$

The parametrization of the most general rotation of $SO(3)$ is then given by

$$M = M(\phi, \theta, \psi) = R_z(\psi) R_x(\theta) R_z(\phi) \tag{5.18}$$

$$M = \begin{pmatrix} \cos\psi\cos\phi - \cos\theta\sin\phi\sin\psi & \cos\psi\sin\phi + \cos\theta\cos\phi\sin\psi & \sin\psi\sin\theta \\ -\sin\psi\cos\phi - \cos\theta\sin\phi\cos\psi & -\sin\psi\sin\phi + \cos\theta\cos\phi\cos\psi & \cos\psi\sin\theta \\ \sin\theta\sin\phi & -\sin\theta\cos\phi & \cos\theta \end{pmatrix}$$

Notice that indeed, the last column is the most general unit 3-vector, parametrized by two independent angles, with ranges $\psi \in [0, 2\pi]$ and $\theta \in [0, \pi]$. The first two columns are unit vectors orthogonal to the last column.
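The Euler-angle parametrization (5.18) can be checked numerically. The sketch below is illustrative only: it assumes the sign conventions of (5.17) as written above, and the angle values are arbitrary. It builds $M = R_z(\psi) R_x(\theta) R_z(\phi)$, then tests $M^t M = I$, $\det M = 1$, and the form of the last column.

```python
import math

def matmul(X, Y):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(X):
    return [[X[j][i] for j in range(3)] for i in range(3)]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def R_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0], [0, c, s], [0, -s, c]]

def R_z(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0], [-s, c, 0], [0, 0, 1]]

def euler(phi, theta, psi):
    """M(phi, theta, psi) = R_z(psi) R_x(theta) R_z(phi), as in (5.18)."""
    return matmul(R_z(psi), matmul(R_x(theta), R_z(phi)))

phi, theta, psi = 0.3, 1.1, 2.5   # arbitrary sample angles
M = euler(phi, theta, psi)

MtM = matmul(transpose(M), M)
last_column = [M[i][2] for i in range(3)]
expected_last = [math.sin(psi) * math.sin(theta),
                 math.cos(psi) * math.sin(theta),
                 math.cos(theta)]
print(det3(M), last_column)
```

The last column depends only on $\psi$ and $\theta$, in agreement with the recursive unit-vector parametrization (5.14) and (5.15).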

5.5 Definition of a Lie group

All of the above matrix groups whose entries are real or complex numbers are examples of Lie groups. Thus, it is appropriate to give a more general definition of a Lie group, to later expound its connection with Lie algebras. A Lie group $G$ is a group whose elements may be parametrized (at least locally, in the neighborhood of every point of $G$) by a set of real parameters $(s_1, s_2, \dots, s_d)$,

$$g(s) = g(s_1, s_2, \dots, s_d) \tag{5.19}$$

in such a way that the product and inverse functions, conveniently combined as follows,

$$g(s) * g^{-1}(t) = g(\sigma(s, t)) \qquad \sigma(s, t) = (\sigma_1(s, t), \sigma_2(s, t), \dots, \sigma_d(s, t)) \tag{5.20}$$

are governed by real analytic composition functions $\sigma(s, t)$. A powerful theorem of Sophus Lie (Norway, 1842-1899) states that if the functions $\sigma$ are merely assumed to be continuous, then they are automatically real analytic.

5.6 Definition of a Lie algebra

A Lie algebra $\mathcal{G}$ is a vector space endowed with a pairing, usually denoted by the bracket $[\,,]$ (the same symbol as the commutator). The defining properties that make this vector space with the pairing a Lie algebra are as follows. For all $X, Y, Z \in \mathcal{G}$, we must have,

1. The pairing takes values in $\mathcal{G}$, namely $[X, Y] \in \mathcal{G}$;
2. The pairing is anti-symmetric, $[X, Y] = -[Y, X]$;
3. The pairing is bilinear, $[(\lambda X + \mu Y), Z] = \lambda [X, Z] + \mu [Y, Z]$ for all $\lambda, \mu$ in the field of the vector space $\mathcal{G}$. Linearity in the second entry then follows from anti-symmetry;
4. The Jacobi identity $[[X, Y], Z] + [[Y, Z], X] + [[Z, X], Y] = 0$ holds.

Two fundamental examples of Lie algebra pairings are given by the commutator $[\,,]$ on square matrices, in which case the Jacobi identity holds trivially; the other is the Poisson bracket, for which we had established the Jacobi identity earlier.
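For the matrix commutator, anti-symmetry and the Jacobi identity follow from associativity of matrix multiplication: expanding the three double commutators gives twelve terms that cancel in pairs. A minimal check in exact integer arithmetic might look as follows; the three sample matrices are arbitrary and the helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    return sub(matmul(X, Y), matmul(Y, X))

# Arbitrary integer matrices, so that every comparison below is exact.
X = [[1, 2, 0], [0, -1, 3], [4, 0, 2]]
Y = [[0, 1, 1], [2, 0, -1], [1, 3, 0]]
Z = [[2, 0, 1], [1, 1, 0], [0, -2, 5]]

zero = [[0] * 3 for _ in range(3)]

antisymmetry = add(comm(X, Y), comm(Y, X))
jacobi = add(add(comm(comm(X, Y), Z), comm(comm(Y, Z), X)), comm(comm(Z, X), Y))

print(antisymmetry == zero, jacobi == zero)
```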

5.7 Relating Lie groups and Lie algebras

Given a Lie group $G$, it is straightforward to construct the associated Lie algebra $\mathcal{G}$, by expanding the group multiplication law in the neighborhood of the identity element of $G$. The vector space of the Lie algebra is then the tangent space to $G$ at the identity, whose elements are tangent vectors. We shall use parameters such that the identity of $G$ corresponds to $s = (s_1, s_2, \dots, s_d) = 0$. For matrix groups, one may simply expand the matrices around the identity, and one has, [3]

$$g(s) = I + \sum_{i=1}^d s_i X_i + O(s^2) \tag{5.21}$$

[3] When the Lie group is defined more abstractly, one needs to define the tangent space more carefully, customarily in terms of one-parameter subgroups.

The tangent vectors $X_i$ form a basis for the Lie algebra $\mathcal{G}$. To first order in parameters, the product in $G$ amounts to addition in the Lie algebra,

$$g(s) g(t) = I + \sum_{i=1}^d (s_i + t_i) X_i + O(s^2, st, t^2) \tag{5.22}$$

so that to linear order, the inverse is obtained by reversing the sign of the parameters. To second order, the commutator is produced. It is convenient to consider the combination,

$$g(s) g(t) g(s)^{-1} g(t)^{-1} = I + \sum_{i,j=1}^d s_i t_j [X_i, X_j] + O(s^3, s^2 t, s t^2, t^3) \tag{5.23}$$

But now, for all infinitesimal values of the parameters $s, t$, this combination must again be an element of the group $G$ close to the identity, so that its leading deviation from $I$ must be an element of the Lie algebra. Thus, the commutator must take values in $\mathcal{G}$, as required by the first axiom of the Lie algebra structure. In particular, we may decompose the commutator itself onto the basis elements $X_i$ by,

$$[X_i, X_j] = \sum_{k=1}^d f_{ijk} X_k \tag{5.24}$$

The coefficients $f_{ijk}$ are the structure constants of the Lie algebra. By construction, they are anti-symmetric in $ij$. The Jacobi identity imposes the following condition,

$$\sum_{m=1}^d \left( f_{ijm} f_{mkn} + f_{jkm} f_{min} + f_{kim} f_{mjn} \right) = 0 \tag{5.25}$$

for all $i, j, k, n = 1, \dots, d$. Associativity of the group multiplication law implies the Jacobi identity at the level of the algebra.

In conclusion, to any given Lie group $G$, there is associated a unique Lie algebra $\mathcal{G}$ obtained by expanding around the identity element of $G$. A powerful theorem of Sophus Lie informs us on the converse. Given a Lie algebra $\mathcal{G}$, there exists a unique simply connected Lie group of which $\mathcal{G}$ is the Lie algebra in the sense defined above. [4] For example, you may check that the Lie groups $SU(2)$ and $SO(3)$ are different, and yet their Lie algebras coincide and are just the angular momentum algebra; the same holds for $SU(4)$ and $SO(6)$. The power of the theory of Lie groups and Lie algebras lies in the fact that continuous symmetries in physics form Lie groups, but these can be studied, up to global topological issues such as simple-connectedness, by studying the linearized problem of Lie algebras.

[4] Simple-connectedness is a topological concept. A space $G$ is simply connected if every closed path in $G$ can be shrunk to a point through a continuous sequence of closed paths which lie entirely inside $G$.

5.8 Symmetries of the degenerate harmonic oscillator

Consider the $N$-dimensional degenerate harmonic oscillator, governed by the Lagrangian,

$$L = \sum_{i=1}^N \frac{1}{2} m \left( \dot q_i^2 - \omega^2 q_i^2 \right) \tag{5.26}$$

Note that this is a quadratic problem, but a very special one, since the frequencies of the $N$ oscillators coincide. A special case is when $N = 3$, and the generalized coordinates coincide with the Cartesian coordinates of a single particle, with Lagrangian,

$$L = \frac{1}{2} m \dot{\mathbf{x}}^2 - \frac{1}{2} m \omega^2 \mathbf{x}^2 \tag{5.27}$$

The potential term is invariant under 3-dimensional rotations. By analogy, the Lagrangian of (5.26) is invariant under rotations in $N$-dimensional space.

Orthogonal symmetry

To see how this works, it will be convenient to recast the Lagrangian of (5.26), and the associated Hamiltonian, in matrix form, by introducing,

$$Q = \begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_N \end{pmatrix} \qquad P = \begin{pmatrix} p_1 \\ p_2 \\ \vdots \\ p_N \end{pmatrix} \tag{5.28}$$

so that

$$L = \frac{1}{2} m \dot Q^t \dot Q - \frac{1}{2} m \omega^2 Q^t Q \qquad H = \frac{1}{2m} P^t P + \frac{1}{2} m \omega^2 Q^t Q \tag{5.29}$$

An orthogonal transformation $R \in O(N)$ satisfies $R^t R = I$ and acts as follows on $Q$ and $P$,

$$Q \to Q' = RQ \qquad (Q')^t (Q') = Q^t Q$$
$$P \to P' = RP \qquad (P')^t (P') = P^t P \tag{5.30}$$

Thus, the kinetic and potential parts of $L$ and $H$ are separately invariant. The Poisson bracket $\{q_i, p_j\} = \delta_{ij}$ is also invariant. The Noether charge associated with the infinitesimal rotations takes the form of generalized angular momentum, and satisfies the Lie algebra relations of $SO(N)$, namely,

$$L_{ij} = q_i p_j - q_j p_i \tag{5.31}$$

$$\{L_{ij}, L_{kl}\} = \delta_{ik} L_{jl} - \delta_{jk} L_{il} - \delta_{il} L_{jk} + \delta_{jl} L_{ik} \tag{5.32}$$

The infinitesimal transformation is again generated by the Poisson bracket,

$$\delta q_i = \{q_i, L\} = \sum_{k=1}^N \omega_{ik} q_k \qquad L = -\frac{1}{2} \sum_{k,l=1}^N \omega_{kl} L_{kl} \tag{5.33}$$

where the anti-symmetric coefficients $\omega_{ji} = -\omega_{ij}$ parametrize the infinitesimal rotation.

Unitary symmetry

By a method familiar from quantum mechanics, we can recast the Hamiltonian in terms of raising and lowering variables, defined by,

$$A = \frac{1}{\sqrt{2m\omega}} (iP + m\omega Q) \qquad A^\dagger = \frac{1}{\sqrt{2m\omega}} (-iP + m\omega Q) \tag{5.34}$$

In terms of $A$ and $A^\dagger$, and their matrix entries $a_i$ and $a_i^*$, the Hamiltonian and Poisson brackets become,

$$H = \omega A^\dagger A \qquad \{a_i, a_j\} = 0 \qquad \{a_i, a_j^*\} = -i \delta_{ij} \tag{5.35}$$
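The bracket relations above can be tested numerically. The sketch below is a check under stated assumptions, not the author's method: it assumes the convention $\{f, g\} = \sum_i (\partial f/\partial q_i \, \partial g/\partial p_i - \partial f/\partial p_i \, \partial g/\partial q_i)$, which is the one compatible with $\{q_i, p_j\} = \delta_{ij}$. It evaluates derivatives by central differences, which are exact up to rounding for the linear and quadratic functions involved. All helper names and the values of `MASS`, `OMEGA` and `omega` are illustrative.

```python
import math
import random

N = 3                      # number of oscillators (illustrative)
STEP = 1e-3                # finite-difference step
MASS, OMEGA = 1.3, 0.7     # arbitrary sample values

def partial(f, q, p, which, i):
    """Central-difference derivative of f with respect to q_i or p_i."""
    q_hi, q_lo, p_hi, p_lo = list(q), list(q), list(p), list(p)
    if which == "q":
        q_hi[i] += STEP
        q_lo[i] -= STEP
    else:
        p_hi[i] += STEP
        p_lo[i] -= STEP
    return (f(q_hi, p_hi) - f(q_lo, p_lo)) / (2 * STEP)

def poisson(f, g, q, p):
    """{f, g} with the convention {q_i, p_j} = delta_ij."""
    return sum(partial(f, q, p, "q", i) * partial(g, q, p, "p", i)
               - partial(f, q, p, "p", i) * partial(g, q, p, "q", i)
               for i in range(N))

def ang(i, j):
    """Generalized angular momentum L_ij = q_i p_j - q_j p_i."""
    return lambda q, p: q[i] * p[j] - q[j] * p[i]

def a(i):
    return lambda q, p: (1j * p[i] + MASS * OMEGA * q[i]) / math.sqrt(2 * MASS * OMEGA)

def a_star(i):
    return lambda q, p: (-1j * p[i] + MASS * OMEGA * q[i]) / math.sqrt(2 * MASS * OMEGA)

def delta(i, j):
    return 1.0 if i == j else 0.0

random.seed(0)
q = [random.uniform(-1, 1) for _ in range(N)]
p = [random.uniform(-1, 1) for _ in range(N)]
idx = range(N)

def rhs_532(i, j, k, l):
    """Right-hand side of (5.32) evaluated at the sample point."""
    return (delta(i, k) * ang(j, l)(q, p) - delta(j, k) * ang(i, l)(q, p)
            - delta(i, l) * ang(j, k)(q, p) + delta(j, l) * ang(i, k)(q, p))

max_error_532 = max(abs(poisson(ang(i, j), ang(k, l), q, p) - rhs_532(i, j, k, l))
                    for i in idx for j in idx for k in idx for l in idx)

# Antisymmetric rotation parameters and the generator L = -(1/2) sum omega_kl L_kl.
omega = [[0.0, 0.2, -0.5], [-0.2, 0.0, 0.9], [0.5, -0.9, 0.0]]
generator = lambda q, p: -0.5 * sum(omega[k][l] * ang(k, l)(q, p) for k in idx for l in idx)
delta_q = [poisson(lambda q, p, i=i: q[i], generator, q, p) for i in idx]
expected_delta_q = [sum(omega[i][k] * q[k] for k in idx) for i in idx]

bracket_a_astar = [[poisson(a(i), a_star(j), q, p) for j in idx] for i in idx]
print(max_error_532, delta_q, bracket_a_astar[0][0])
```

With this convention the check supports the signs in (5.32), the relative sign between $\delta q_i$ and the generator in (5.33), and the value $-i\delta_{ij}$ in (5.35).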

In this form, it is clear that we may make on $A$ an arbitrary unitary transformation $g \in U(N)$,

$$A \to A' = gA \qquad A^\dagger \to (A')^\dagger = A^\dagger g^\dagger \tag{5.36}$$

and $H$ as well as the Poisson bracket are invariant under this transformation. The corresponding Noether charges are given by,

$$Q_T = A^\dagger T A \tag{5.37}$$

where $T$ is any Hermitian $N \times N$ matrix (namely the generators of the Lie algebra of $U(N)$). These charges obey the following composition law under Poisson brackets,

$$\{Q_{T_1}, Q_{T_2}\} = -i Q_{[T_1, T_2]} \tag{5.38}$$

The invariance of the harmonic oscillator under $SO(N)$ is contained in the $U(N)$ symmetry of the oscillator, which is consistent with the fact that $SO(N)$ is a subgroup of $U(N)$.

Symplectic symmetry

To conclude, we point out the role of symplectic transformations. Suppose we introduce, as we had done already earlier, the combined coordinates of phase space, in the form of a column matrix of height $2N$,

$$X = \begin{pmatrix} P \\ m\omega Q \end{pmatrix} \tag{5.39}$$

whose entries will be denoted collectively by $x_\alpha$ with $\alpha = 1, \dots, 2N$. The Hamiltonian and Poisson brackets may then be recast as follows,

$$H = \frac{1}{2m} X^t X \qquad \{x_\alpha, x_\beta\} = -m\omega J_{\alpha\beta} \tag{5.40}$$

where $J$ is the symplectic matrix introduced in (5.9) when discussing the symplectic group. Now, there is no doubt that $H$ is invariant under the group $SO(2N)$, which is larger than the group $U(N)$ (their dimensions are respectively $N(2N-1)$ and $N^2$). But the Poisson bracket is naturally invariant under another group: $Sp(2N)$. So, the combined structure of the Hamiltonian itself and the Poisson bracket is invariant under the intersection $SO(2N) \cap Sp(2N)$. The intersection is easily worked out: a matrix satisfying both $M^t M = I_{2N}$ and $M^t J_{2N} M = J_{2N}$ must commute with $J_{2N}$. As a result, it must be of the form

$$M = \begin{pmatrix} M_1 & M_2 \\ -M_2 & M_1 \end{pmatrix} \qquad M_1^t M_1 + M_2^t M_2 = I_N \qquad M_1^t M_2 - M_2^t M_1 = 0 \tag{5.41}$$

But these are precisely the conditions required to make the matrix $M_1 + i M_2$ unitary. Thus, we recover $SO(2N) \cap Sp(2N) = U(N)$, in agreement with our previous result.
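The identification $SO(2N) \cap Sp(2N) = U(N)$ can be illustrated for $N = 2$. The sketch below is hedged: it assumes the block form $J = \begin{pmatrix} 0 & I_N \\ -I_N & 0 \end{pmatrix}$ for the symplectic matrix, and the sample unitary matrix `g` is an arbitrary choice. It splits `g` into real and imaginary parts $M_1$ and $M_2$, builds the real $2N \times 2N$ matrix $M$ of (5.41), and checks both defining relations.

```python
import cmath

N = 2

def matmul(X, Y):
    """Product of two matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def max_deviation(X, Y):
    """Largest entry of |X - Y|."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Sample element of U(2): a phase times an SU(2) matrix with |u|^2 + |v|^2 = 1.
phase = cmath.exp(0.9j)
u = 0.6 * cmath.exp(0.4j)
v = 0.8 * cmath.exp(-1.1j)
g = [[phase * u, -phase * v.conjugate()],
     [phase * v, phase * u.conjugate()]]

# Real and imaginary parts, g = M1 + i M2.
M1 = [[g[i][j].real for j in range(N)] for i in range(N)]
M2 = [[g[i][j].imag for j in range(N)] for i in range(N)]

# Block matrix M = [[M1, M2], [-M2, M1]] of (5.41).
M = ([M1[i] + M2[i] for i in range(N)]
     + [[-x for x in M2[i]] + M1[i] for i in range(N)])

identity = [[1.0 if i == j else 0.0 for j in range(2 * N)] for i in range(2 * N)]
# Assumed block form of the symplectic matrix: J = [[0, I_N], [-I_N, 0]].
J = [[0.0] * (2 * N) for _ in range(2 * N)]
for i in range(N):
    J[i][N + i] = 1.0
    J[N + i][i] = -1.0

orthogonal_error = max_deviation(matmul(transpose(M), M), identity)
symplectic_error = max_deviation(matmul(transpose(M), matmul(J, M)), J)
commutator_error = max_deviation(matmul(M, J), matmul(J, M))
print(orthogonal_error, symplectic_error, commutator_error)
```

All three deviations vanish to rounding accuracy, so a unitary $g = M_1 + i M_2$ gives a matrix $M$ that is both orthogonal and symplectic.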

5.9 The relation between the Lie groups $SU(2)$ and $SO(3)$

It is well-known from the angular momentum algebra of quantum mechanics that the Lie algebras of $SU(2)$ and $SO(3)$ coincide. And yet, the Lie groups are different, and the corresponding representation theories of both Lie groups are also different. Namely, $SU(2)$ has unitary representations for all non-negative half-integers $j = 0, \frac{1}{2}, 1, \frac{3}{2}, \dots$, while the (single-valued) representations of $SO(3)$ are required to have integer $j$, as is the case in orbital angular momentum.

In this subsection, we shall describe the precise map between the Lie groups, and this will allow us to see how the groups differ from one another. An arbitrary element $g \in SU(2)$ may be parametrized by two complex numbers $u, v \in \mathbb{C}$, as follows,

$$g = \begin{pmatrix} u & -v^* \\ v & u^* \end{pmatrix} \qquad |u|^2 + |v|^2 = 1 \tag{5.42}$$

The pair $(u, v)$, subject to the above constraint, precisely parametrizes the unit sphere $S^3$ embedded in $\mathbb{R}^4 = \mathbb{C}^2$, in the sense that this map is bijective. Thus we conclude that

$$SU(2) = S^3 \tag{5.43}$$

The key ingredient to the map between $SU(2)$ and $SO(3)$ is the basis of $2 \times 2$ traceless Hermitian matrices, given by the Pauli matrices,

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{5.44}$$

Now consider the composite $g \sigma_i g^\dagger$ for $i = 1, 2, 3$. Each of these $2 \times 2$ matrices is itself traceless and Hermitian, so that it can be decomposed onto Pauli matrices as follows,

$$g \sigma_i g^\dagger = \sum_{j=1}^3 R^i{}_j \sigma_j \tag{5.45}$$

Using the relation $\{\sigma_i, \sigma_j\} = 2\delta_{ij} I_2$, where $\{\,,\}$ here denotes the anti-commutator, to compute the anti-commutators $\{g \sigma_i g^\dagger, g \sigma_j g^\dagger\}$ in two different ways,

$$\{\sigma_i, \sigma_j\} = g \{\sigma_i, \sigma_j\} g^\dagger = \{g \sigma_i g^\dagger, g \sigma_j g^\dagger\} = \sum_{k,l=1}^3 R^i{}_k R^j{}_l \{\sigma_k, \sigma_l\} \tag{5.46}$$

we establish that $R$ must be orthogonal,

$$R^t R = I_3 \tag{5.47}$$

Since $SU(2)$ is connected and $g = I$ maps to $R = I$, continuity of $\det R$ then gives $\det R = 1$. Equation (5.45) thus provides a map from $g \in SU(2)$ to $R \in SO(3)$. Every $R \in SO(3)$ is attained in this way, but $g$ and $-g$ map to the same $R$, even though $g$ and $-g$ are distinct elements of $SU(2)$. Thus the map is 2-to-1, and we have $SO(3) = SU(2)/\mathbb{Z}_2$ where $\mathbb{Z}_2 = \{1, -1\}$.
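The map (5.45) is easy to evaluate explicitly. Because $\mathrm{tr}(\sigma_j \sigma_k) = 2\delta_{jk}$, the coefficients are $R^i{}_j = \frac{1}{2} \mathrm{tr}(g \sigma_i g^\dagger \sigma_j)$. The sketch below is illustrative: it uses the parametrization $g = \begin{pmatrix} u & -v^* \\ v & u^* \end{pmatrix}$ with arbitrary sample values of $u$ and $v$, and the helper name `rotation_from_su2` is hypothetical. It checks that $R$ is orthogonal with unit determinant, and that $g$ and $-g$ give the same $R$.

```python
import cmath

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul2(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(X):
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

def rotation_from_su2(g):
    """R[i][j] = (1/2) tr(g sigma_i g^dagger sigma_j), the coefficients in (5.45)."""
    g_dag = dagger(g)
    R = []
    for i in range(3):
        conjugated = matmul2(matmul2(g, SIGMA[i]), g_dag)
        row = []
        for j in range(3):
            prod = matmul2(conjugated, SIGMA[j])
            row.append(((prod[0][0] + prod[1][1]) / 2).real)
        R.append(row)
    return R

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Sample SU(2) element with |u|^2 + |v|^2 = 0.36 + 0.64 = 1.
u = 0.6 * cmath.exp(0.4j)
v = 0.8 * cmath.exp(-1.1j)
g = [[u, -v.conjugate()], [v, u.conjugate()]]
minus_g = [[-x for x in row] for row in g]

R = rotation_from_su2(g)
R_minus = rotation_from_su2(minus_g)

# Entries of R R^t, which should reproduce the 3x3 identity.
RRt = [[sum(R[i][k] * R[j][k] for k in range(3)) for j in range(3)] for i in range(3)]
print(det3(R), R == R_minus or "equal up to rounding")
```

The two signs cancel in $g \sigma_i g^\dagger$, which is why the pair $\pm g$ collapses to a single rotation.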

6 Motion of Rigid Bodies

As much as we may understand the equations governing the motion of rigid bodies, such as tops, bicycles, and gyroscopes, the actual physical motion is not intuitive, and continues to intrigue. In the photograph below Wolfgang Pauli and Niels Bohr are no longer young men, and yet still marvel at the behavior of a toy top.

Figure 22: Wolfgang Pauli and Niels Bohr stare in wonder at a spinning top.

6.1 Inertial and body-fixed frames

To derive the equations for the motion of rigid bodies, we begin by defining what we mean by a rigid body, and then carefully parametrizing its motion. A rigid body is an assembly of massive points, the relative distances of all of which remain unchanged in time. Thus, we may associate with a rigid body a frame in which all points of the rigid body are static. We shall denote this frame by Γ, and its origin by O, as indicated in figure 8. This frame is referred to as the body-fixed frame for obvious reasons. Of course, this frame is not unique: the origin may be shifted, and a new body-fixed frame may also be rotated with respect to Γ. So, we shall pick any one such frame, and discuss later the effect of changing frames. The massive points of the rigid body may be parametrized by the N Cartesian
