Statistical Mechanics

Contents

Chapter 1. Ergodicity and the Microcanonical Ensemble
  1. From Hamiltonian Mechanics to Statistical Mechanics
  2. Two Theorems From Dynamical Systems Theory
  3. The Microcanonical Ensemble and the Ergodic Hypothesis
  4. Density Operators in Quantum Mechanics
  5. Discussion

CHAPTER 1
Ergodicity and the Microcanonical Ensemble

The pressure in an ideal gas, recall, is proportional to the average kinetic energy per molecule. Since pressure may be understood as an average over billions upon billions of microscopic collisions, this simple relationship illustrates how statistical techniques may be used to suppress information about what each individual molecule is doing in order to extract information about what the molecules do on average, as a whole. Our first task, as we examine the foundations of statistical mechanics, is to understand more precisely why this suppression is necessary and how it is to be accomplished. We must, therefore, begin by considering the laws of microscopic dynamics. In physics, there are two choices here: the laws of classical mechanics and the laws of quantum mechanics. Remarkably, the choice is not important; in either case, detailed solutions to the dynamical equations are completely unnecessary. We will consider both cases, but follow the classical route through Hamiltonian mechanics first, as this provides the clearest introduction to the structure of statistical mechanics. In this section, we review the essential elements of Hamiltonian mechanics and discuss the need for, and the basic elements of, a probabilistic framework.

1. From Hamiltonian Mechanics to Statistical Mechanics

Newton's second law for a particle of mass m,

F_{\text{total}} = m \ddot{q},

is a second-order ordinary differential equation. Therefore, given the instantaneous values of the particle's position q and momentum p = m\dot{q} at some time t = 0, the particle's subsequent motion is uniquely determined for all t > 0. For this reason, the state of a classical system consisting of n configurational degrees of freedom can be thought of as a point (q_1, ..., q_n, p_1, ..., p_n) in a 2n-dimensional space called the phase space of the system.

As the state evolves in time, this point traces out a trajectory in phase space defined by the tangent vector

(1) v(t) = (\dot{q}_1(t), ..., \dot{q}_n(t), \dot{p}_1(t), ..., \dot{p}_n(t)).

A Hamiltonian system evolves according to the canonical equations of motion,

(2) \dot{q}_i = \frac{\partial H(q, p, t)}{\partial p_i},

(3) \dot{p}_i = -\frac{\partial H(q, p, t)}{\partial q_i},

where the function H(q, p, t) = H(q_1, ..., q_n, p_1, ..., p_n, t) is called the Hamiltonian of the system. These equations represent the full content of Newtonian mechanics. Note that exactly one trajectory passes through each point in the phase space; the classical picture is completely deterministic.

Example (single particle dynamics). Find the canonical equations of motion for a single particle of mass m in an external potential V(q).

Solution. The Hamiltonian for this system is simply

H(q, p) = \frac{p^2}{2m} + V(q),

which we recognize as the sum of the kinetic and potential energies. This leads to the following dynamical equations:

\dot{q} = \frac{p}{m}, \qquad \dot{p} = -\frac{\partial V}{\partial q}.

A system of many interacting particles has a similar solution, though the potential term becomes much more complicated. We see, therefore, that the first canonical equation (2) generalizes the relationship between velocity and momentum (in a more complicated system, the i-th momentum may depend on several of the q_i and \dot{q}_i). Similarly, the second canonical equation (3) generalizes the rule that force may be expressed as the (negative) gradient of an energy function.
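
As a concrete illustration (my own addition, not part of the original notes), the canonical equations from the example above can be integrated numerically. The sketch below assumes NumPy and SciPy are available and picks the potential V(q) = m\omega^2 q^2/2 purely for definiteness; it checks that the energy is conserved along the computed trajectory.

import numpy as np
from scipy.integrate import solve_ivp

m, omega = 1.0, 2.0   # illustrative mass and frequency

def V_prime(q):
    # Derivative of the potential V(q) = 0.5 * m * omega**2 * q**2.
    return m * omega**2 * q

def hamilton_rhs(t, x):
    # Canonical equations (2)-(3): dq/dt = p/m, dp/dt = -dV/dq.
    q, p = x
    return [p / m, -V_prime(q)]

x0 = [1.0, 0.0]                      # initial phase-space point (q0, p0)
T = 2 * np.pi / omega                # one oscillation period
sol = solve_ivp(hamilton_rhs, (0.0, T), x0, rtol=1e-10, atol=1e-12)

q, p = sol.y
H = p**2 / (2 * m) + 0.5 * m * omega**2 * q**2
print("max |H(t) - H(0)| =", np.max(np.abs(H - H[0])))   # tiny: energy is conserved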

In a Hamiltonian system, the time dependence of any function f = f(q_1, ..., q_n, p_1, ..., p_n, t) of the coordinates and momenta can be written

(4) \frac{df}{dt} = \{ f, H \} + \frac{\partial f}{\partial t},

where \{ f, H \} is the Poisson bracket of the function f and the Hamiltonian. The Poisson bracket of two functions f_1 and f_2 with respect to a set of canonical variables is defined as

(5) \{ f_1, f_2 \} = \sum_{j=1}^{n} \left( \frac{\partial f_1}{\partial q_j} \frac{\partial f_2}{\partial p_j} - \frac{\partial f_1}{\partial p_j} \frac{\partial f_2}{\partial q_j} \right).

The Poisson bracket is important in Hamiltonian dynamics because it is independent of how the various coordinates and momenta are defined; that is, \{ u, v \} takes the same value for any set of canonical variables q and p. Furthermore, the canonical equations of motion can be rewritten in the following form,

(6) \dot{q}_i = \{ q_i, H \},

(7) \dot{p}_i = \{ p_i, H \}.

This is known as the Poisson bracket formulation of classical mechanics. It is important to recognize that very similar expressions arise in quantum mechanics (we'll look at these in Section 4). Indeed, every classical expression involving Poisson brackets has a quantum analogue employing commutators. This elegant correspondence principle, first pointed out by Dirac, has deep significance for the relationship between classical and quantum physics. It also provides our first glimpse of why statistical mechanics transcends the details of the microscopic equations of motion. For now, we return to the classical route into the heart of statistical mechanics.
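
Before moving on, here is a small symbolic check of the Poisson bracket formulation (again my own illustration; it assumes SymPy, which the notes do not use). For the single-particle Hamiltonian H = p^2/2m + V(q), equations (6) and (7) reproduce the canonical equations found in the example above.

import sympy as sp

q, p, m = sp.symbols('q p m', real=True)
V = sp.Function('V')

def poisson_bracket(f, g, qs, ps):
    # Equation (5): {f, g} = sum_j (df/dq_j dg/dp_j - df/dp_j dg/dq_j).
    return sum(sp.diff(f, qj) * sp.diff(g, pj) - sp.diff(f, pj) * sp.diff(g, qj)
               for qj, pj in zip(qs, ps))

H = p**2 / (2 * m) + V(q)

print(poisson_bracket(q, H, [q], [p]))   # p/m, i.e. qdot = {q, H}
print(poisson_bracket(p, H, [q], [p]))   # -Derivative(V(q), q), i.e. pdot = {p, H}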

Examining a physical system from the classical mechanical point of view, one first constructs the canonical equations of motion and then integrates these from known initial conditions to determine the phase trajectory. If the system of interest involves a macroscopic number of particles, this approach condemns one to numerical computations involving matrices of bewildering size. Yet system size is not the major obstacle: the canonical equations of motion are in general nonlinear and, as a result, small changes in system parameters or initial conditions may lead to large changes in system behavior. In particular, neighboring trajectories in many nonlinear systems diverge from one another at an exponential rate, a phenomenon known as sensitive dependence on initial conditions or, more popularly, as the butterfly effect, the idea being that a flap of a butterfly's wings may make the difference between sunny skies and snow two weeks later. Systems exhibiting sensitive dependence on initial conditions are said to be chaotic. Calculations of chaotic trajectories are intolerant of even infinitesimal errors, such as those arising from finite precision and uncertainties in the state of the system. Therefore, setting aside the impractical integration problem of calculating a high-dimensional phase trajectory, our necessarily incomplete knowledge of initial conditions in a macroscopic system seriously compromises our ability to predict future evolution.

Though the prospects for dealing directly with the phase trajectories of a macroscopic system of particles seem hopeless, it is not the case that we must discard all knowledge of the microscopic physics of the system. There are many macroscopic phenomena which cannot be understood from a purely macroscopic point of view. What is combustion? What determines whether a solid will be a metal or an insulator? What are the energy sources in stellar and galactic cores? These questions are best dealt with by appealing to various microscopic details. On the other hand, given the success of the laws of thermodynamics, it is evident that macroscopic systems exhibit a collective regularity in which the exact details of each particle's motion and state are nonessential. This suggests that we may envision the time evolution of macroscopic quantities in a Hamiltonian system as some sort of average over all of the microscopic states consistent with available macroscopic knowledge and constraints. For this reason, one abandons the mechanical approach of computing the exact time evolution from a single point in phase space in favor of a statistical approach employing averages over an entire ensemble of points in phase space.

This is accomplished as follows: Consider a large collection of identical copies of the system, distributed in phase space according to a known distribution function,

\rho(q, p, t) = \rho(q_1, ..., q_{3N}, p_1, ..., p_{3N}, t)

(for N particles in three dimensions, n = 3N), where

(8) \int \rho(q, p, t) \, dq \, dp = 1 \quad \text{for all } t.

\rho(q, p, t) is the density in phase space of the points representing the ensemble, normalized according to (8), and may be interpreted as describing the probability of finding the system in the various different microscopic states. Once \rho(q, p, t) is specified, we can compute the probabilities of different values of any quantity f which is a function of the canonical variables. We can also compute the mean value \bar{f} of any such function by averaging over the probabilities of different values,

(9) \bar{f}(t) = \int f(q, p) \, \rho(q, p, t) \, dq \, dp.

Thus, instead of following the time evolution of a single system through many different microscopic states, we consider at a single time an ensemble of copies of the system distributed over these states according to probability of occupancy. This shift is one of the cornerstones of statistical mechanics.
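
In practice, the ensemble average (9) is often estimated by sampling phase-space points from \rho. The following sketch (my own illustration, assuming NumPy) samples a Gaussian density for a single harmonic oscillator, chosen only because its exact averages are known, and estimates the mean energy by a Monte Carlo average.

import numpy as np

rng = np.random.default_rng(0)
m, omega, kT = 1.0, 1.0, 1.0     # illustrative parameters

# Sample phase-space points from the Gaussian density rho(q, p) ~ exp(-H(q, p)/kT)
# for the oscillator H = p^2/(2m) + m*omega**2*q**2/2.
N = 200_000
q = rng.normal(0.0, np.sqrt(kT / (m * omega**2)), size=N)
p = rng.normal(0.0, np.sqrt(m * kT), size=N)

H = p**2 / (2 * m) + 0.5 * m * omega**2 * q**2

# Sample-mean estimate of the ensemble average (9) of H; the exact value here is kT.
print("Monte Carlo estimate of <H>:", H.mean())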

Exercise 1.1. Derive equation (4). HINT: Use the chain rule,

\frac{df}{dt} = \sum_i \frac{\partial f}{\partial q_i} \dot{q}_i + \sum_i \frac{\partial f}{\partial p_i} \dot{p}_i + \frac{\partial f}{\partial t}.

Exercise 1.2. Show that H(q, p, t) is a constant of the motion if and only if it does not depend explicitly on time.

Exercise 1.3. Show that the canonical equations of motion can be rewritten in the following form,

(10) \dot{q}_i = \{ q_i, H \},

(11) \dot{p}_i = \{ p_i, H \}.

This is known as the Poisson bracket formulation of classical mechanics.

Exercise 1.4. Compute the following Poisson brackets:
(1) \{ q_i, q_j \}
(2) \{ q_i, p_j \}
Are your results in any way familiar, given your knowledge of quantum mechanics? If so, how do the interpretations of these results differ from their quantum mechanical analogues?

Exercise 1.5. Show that the canonical equations of motion can be written in the symplectic form,

\dot{x} = M \frac{\partial H(q, p, t)}{\partial x},

where x = (q, p). (What is M in this expression?)

2. Two Theorems From Dynamical Systems Theory

One is often interested in general qualitative questions about a system's dynamics, such as the existence of stable equilibria or oscillations. In discussing such questions, mathematicians often speak of the flow of a dynamical system: any autonomous system of ordinary differential equations can be written in the form

(12) \dot{x} = f(x)

(changes of variables may be required if the equations involve second-order and higher derivatives). If we interpret a general system of differential equations (12) as representing a fluid in which the fluid velocity at each point x is given by the vector f(x), then we may envision any particular point x_0 as flowing along the trajectory \phi(x_0) defined by the velocity field. More precisely, we define \phi_t(x_0) = \phi(x_0, t), where \phi(x_0, t) is a point on the trajectory \phi(x_0) passing through the initial condition x_0; \phi_t maps the starting point x_0 to its location after moving with the flow for a time t. It is important to note that \phi_t defines a map on the entire phase space; we may envision the entire phase space flowing according to the velocity field defined by (12). Indeed, we shall see in this section that this fluid metaphor is especially appropriate in statistical mechanics.

The notion of the flow of a dynamical system very naturally accommodates a shift away from the language of initial conditions and trajectories, towards considering how whole regions of phase space participate in the dynamics. This shift is what enables mathematicians to state and prove general theorems about dynamical systems. It also turns out that this shift provides the natural setting for several of the central concepts of statistical mechanics. In the previous section, we motivated a statistical framework in which, rather than follow the time evolution of a single system, we consider at a single time an ensemble of copies of that system distributed in phase space according to probability of occupancy.

The main player in this new framework is the distribution function \rho(q, p, t) describing the ensemble. \rho allows us to take into account which states in phase space a system is likely to occupy.[1] In this section, we examine how the ensemble interacts with the flow defined by a set of canonical equations. It turns out that, in a Hamiltonian system, the time evolution of \rho has several interesting properties, which are the subject of two important theorems from dynamical systems theory.

[1] Mathematicians include this as part of a more general approach, called measurable dynamics, which we need not go into here.

We begin with a simple calculation of the rate of change of \rho. We know from (4), which describes the time evolution of any function of the canonical variables q and p, that

(13) \frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + \{ \rho, H \}.

However, we also know from local conservation of probability that \rho must satisfy a continuity equation,

(14) \frac{\partial \rho}{\partial t} + \nabla \cdot (\rho v) = 0,

where

\nabla = \left( \frac{\partial}{\partial q_1}, ..., \frac{\partial}{\partial q_n}, \frac{\partial}{\partial p_1}, ..., \frac{\partial}{\partial p_n} \right)

is the gradient operator in phase space and v is defined in (1). Applying the product rule, we see that

(15) \nabla \cdot (\rho v) = \{ \rho, H \} + \rho \, (\nabla \cdot v).

For a Hamiltonian system the divergence of the phase-space velocity vanishes,

\nabla \cdot v = \sum_i \left( \frac{\partial \dot{q}_i}{\partial q_i} + \frac{\partial \dot{p}_i}{\partial p_i} \right) = \sum_i \left( \frac{\partial^2 H}{\partial q_i \, \partial p_i} - \frac{\partial^2 H}{\partial p_i \, \partial q_i} \right) = 0,

so (13) and (14) together give

(16) \frac{\partial \rho}{\partial t} + \{ \rho, H \} = 0

and

(17) \frac{d\rho}{dt} = 0.

This result is known as Liouville's theorem. The partial derivative term in (16) expresses the change in \rho due to elapsed time dt, while the (\nabla \rho) \cdot v = \{ \rho, H \} term expresses the change in \rho due to motion along the vector field a distance v \, dt.

Thus, Liouville's theorem tells us that the local probability density as seen by an observer moving with the flow in phase space is constant in time; that is, \rho is constant along phase trajectories. The theorem can also be interpreted as stating that, in a Hamiltonian system, phase space volumes are conserved by the flow or, equivalently, that \rho moves in phase space like an incompressible fluid. From the incompressible fluid analogy, we see that while Hamiltonian systems can exhibit chaotic dynamics, they cannot have any attractors!
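
The volume-conservation statement can be checked numerically. The sketch below (my own illustration, assuming NumPy and SciPy) carries a small closed loop of initial conditions forward under the pendulum flow H = p^2/2 - \cos q and evaluates the enclosed area before and after; Liouville's theorem says the two areas agree.

import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, x):
    # Pendulum: H = p**2/2 - cos(q), so qdot = p, pdot = -sin(q).
    q, p = x
    return [p, -np.sin(q)]

def loop_area(qs, ps):
    # Area enclosed by an ordered polygon of phase points (shoelace formula).
    return 0.5 * abs(np.dot(qs, np.roll(ps, -1)) - np.dot(ps, np.roll(qs, -1)))

# A small circle of initial conditions centered at (q, p) = (0.5, 0).
theta = np.linspace(0.0, 2 * np.pi, 120, endpoint=False)
loop0 = np.column_stack([0.5 + 0.05 * np.cos(theta), 0.05 * np.sin(theta)])

# Flow every point on the loop forward for a time T.
T = 5.0
loopT = np.array([solve_ivp(rhs, (0.0, T), x0, rtol=1e-10, atol=1e-12).y[:, -1]
                  for x0 in loop0])

print("area at t = 0:", loop_area(loop0[:, 0], loop0[:, 1]))
print("area at t = T:", loop_area(loopT[:, 0], loopT[:, 1]))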

Liouville's theorem has other important consequences when combined with system constraints, such as conservation laws. Conservation laws constrain the flow to lie on families of hypersurfaces in phase space. These surfaces are bounded and invariant under the flow:

(18) \phi_t(X) = X

for each hypersurface X defined by a conservation law. When volume-preserving flows are restricted to bounded, invariant regions of phase space, a surprising result emerges: Let X be a bounded region of phase space which is invariant under a volume-preserving flow. Take any region S which occupies a finite fraction of the total volume in X (this specifically excludes what mathematicians call sets of measure zero: sets with no volume). Then any randomly selected initial condition x in S generates a trajectory \phi_t(x) which returns to S infinitely often. This is known as the Poincaré recurrence theorem.

In order to understand where this theorem comes from and what it means, we consider how the region S moves under the flow. Define a function f which maps S along the flow for a time T, f(S) = \phi_T(S). Subsequent iterations of this time-T map produce a sequence of subsets of X, f^2(S) = \phi_{2T}(S), f^3(S) = \phi_{3T}(S), and so on, all with the same finite volume as S. Since X itself has finite volume, these subsets cannot all be disjoint: if they were, their combined volume would eventually exceed the volume of X. As a result, two of these subsets must intersect; i.e., there must exist integers i and j, with i > j, such that f^i(S) \cap f^j(S) is non-empty. This implies that f^{i-j}(S) \cap S is also non-empty: S must fold back onto itself repeatedly under this time-T flow map. By considering small subsets of S, which must also have this property, we can convince ourselves that a randomly selected point in S does indeed return to S infinitely often (for a precise proof of the theorem, see the references at the end of the chapter).

The Poincaré recurrence theorem as stated implies that almost every initial condition x_0 in the bounded region X generates a trajectory which returns arbitrarily close to x_0 infinitely many times. This recurrence property is truly remarkable when you consider the bewildering array of nonlinear Hamiltonian systems to which it may be applied. Indeed, the Poincaré recurrence theorem is considered the first great theorem of modern dynamics; we will have more to say about its role in statistical mechanics later on.
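
Recurrence is easy to observe in a toy area-preserving map. The sketch below (my own illustration, not an example from the notes) iterates Arnold's cat map on the unit torus, which preserves area, and records the iteration numbers at which the orbit returns to within a small distance of its starting point.

import numpy as np

def cat_map(x):
    # Arnold's cat map on the unit torus; the matrix [[2, 1], [1, 1]] has determinant 1,
    # so the map preserves area.
    q, p = x
    return np.array([(2 * q + p) % 1.0, (q + p) % 1.0])

def torus_distance(a, b):
    # Distance on the torus, accounting for wrap-around in each coordinate.
    d = np.abs(a - b)
    return np.linalg.norm(np.minimum(d, 1.0 - d))

x0 = np.array([0.1234, 0.5678])
eps = 0.02
x = x0.copy()
returns = []
for n in range(1, 100_000):
    x = cat_map(x)
    if torus_distance(x, x0) < eps:
        returns.append(n)

print("number of returns within eps:", len(returns))
print("first few return times:", returns[:5])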

3. The Microcanonical Ensemble and the Ergodic Hypothesis

As discussed earlier, the role of the ensemble in statistical mechanics is to provide a probabilistic method of extracting important information about a macroscopic system. Naturally, the choice of a particular ensemble depends on the physical problem of interest, but quite often one is interested in equilibrium properties of a physical system. In this special case of equilibrium statistical mechanics, we expect that ensemble averages (9) do not depend explicitly on time. This implies

(19) \frac{\partial \rho(q, p, t)}{\partial t} = 0.

An ensemble satisfying (19) is said to be stationary. Note that a stationary ensemble satisfying Liouville's theorem (16) has a vanishing Poisson bracket with the Hamiltonian,

(20) \{ \rho(q, p), H(q, p) \} = 0.

Since \{ q_i, p_j \} = \delta_{ij}, no nontrivial function of q or p alone will satisfy (20). The general solution for a stationary ensemble therefore has the form \rho(q, p) = \rho(H(q, p)): the Hamiltonian plays an important role in determining the form of the distribution function. The simplest example of a stationary ensemble is the microcanonical ensemble, in which the probabilities are uniformly distributed across the hypersurfaces in phase space defined by energy conservation:

(21) \rho(q, p) = \text{constant on each energy surface}.

The microcanonical ensemble is one of the cornerstones of equilibrium statistical mechanics. Accordingly, many introductory textbooks begin with the assumption that this ensemble is valid, that all probabilities are equal a priori. It is not difficult, however, to construct low-dimensional examples for which this ensemble is clearly not valid. Why then, beyond the demonstrable success of statistical mechanics as a physical theory, do we believe in a priori equal probabilities? The answer comes again from dynamical systems theory. In this section, we expose some of the dynamical machinery underlying the microcanonical ensemble. In particular, we will introduce what physicists call the ergodic hypothesis and discuss how this hypothesis places statistical mechanics on firm ground.

Recall from the preceding section that conservation of energy constrains the flow to lie on families of hypersurfaces in phase space; for what follows, we consider one such surface X. Recall also that X is bounded and is invariant (18). Since the flow is Hamiltonian, we know that almost every point is recurrent. What we don't know is how intertwined the phase trajectories are. That is, it is conceivable that X can be broken up into a number of different invariant subspaces X_1, X_2, ..., with \phi_t(X_i) = X_i for all i, without violating the Poincaré recurrence theorem. When this partitioning into invariant subspaces is not possible, the flow is said to be ergodic. More precisely, the flow is ergodic if and only if every invariant subspace of X occupies either a volume equal to that of X or zero volume. This means that in an ergodic flow almost every trajectory wanders almost everywhere on X. Note how much stronger this is than the recurrence property. The Poincaré recurrence theorem is always true for a general Hamiltonian system, but we are not given ergodicity a priori. In physics, we assume that the Hamiltonian systems used to describe the macroscopic physical world have ergodic flows;[2] this is known as the ergodic hypothesis.

[2] Remember that we're talking about enormous systems involving over 10^23 particles here, and not low-dimensional systems such as those used to describe rigid body motion. More about this later.
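
To see why ergodicity is a genuine hypothesis rather than a consequence of the Hamiltonian structure, consider two uncoupled harmonic oscillators (an illustration of mine, not from the notes). The energy surface decomposes into invariant tori labelled by how the total energy is split between the two oscillators, so time averages depend on the initial condition rather than on the total energy alone.

import numpy as np

def time_average_q1_squared(E1, E2, omega1=1.0, T=2000.0, n=200_000):
    # Two uncoupled unit-mass oscillators; E2 is listed only to emphasize that the
    # total energy E1 + E2 is the same in both cases below. The first oscillator moves as
    # q1(t) = sqrt(2*E1)/omega1 * cos(omega1*t), independently of E2.
    t = np.linspace(0.0, T, n)
    q1 = np.sqrt(2 * E1) / omega1 * np.cos(omega1 * t)
    return np.mean(q1**2)

# Two initial conditions on the SAME total-energy surface E1 + E2 = 2.
print(time_average_q1_squared(E1=1.5, E2=0.5))   # ~1.5: time average of q1^2 is E1/omega1^2
print(time_average_q1_squared(E1=0.5, E2=1.5))   # ~0.5: a different value, same total energy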

In order to better understand the consequences of the ergodic hypothesis and how it connects with the microcanonical ensemble, we need two more theorems from dynamical systems. These will be stated without proof; interested readers can refer to the end of the chapter for recommendations on where to find a more precise treatment of what follows.

The first theorem we need states very simply that a flow is ergodic if and only if the only functions which are invariant under the flow are constants:

(22) \phi_t \text{ is ergodic} \iff \big( f(\phi_t(x)) = f(x) \text{ for all } t \implies f = \text{constant almost everywhere} \big).

This is plausible. Ergodic flows, recall, have trajectories wandering almost everywhere in X, and so having a function be constant along some trajectory means that it must be constant almost everywhere in X. This result leads us straight to the microcanonical ensemble. From Liouville's theorem, we know that the distribution function \rho is invariant under the flow. When we add the ergodic hypothesis, (22) tells us right away that \rho must be a constant on each energy surface X. Thus, the ergodic hypothesis replaces the need to accept on faith a separate assumption of a priori equal probabilities.

Next we want to consider how functions of the dynamical variables behave when averaged along trajectories in a Hamiltonian flow (and we mean any flow here, not just an ergodic one). Does the following have a well-defined limit:

\lim_{T \to \infty} \frac{1}{T} \int_0^T f(\phi_t(x)) \, dt \; ?

The answer is yes. The Birkhoff pointwise ergodic theorem states that, for almost all x, these time averages converge to something:

\lim_{T \to \infty} \frac{1}{T} \int_0^T f(\phi_t(x)) \, dt = \bar{f}(x).

The limit may depend on x, which is why \bar{f} is written above as a function of x (and is why mathematicians call this a pointwise theorem), but the limit almost always exists[3]; this is the kind of thing a physicist is usually willing to take on faith.

[3] We have to say "for almost all x" because, though the limit exists for a randomly selected x, there may be a set of measure zero for which convergence fails, and we want to be careful.

This theorem also, however, makes some interesting statements about the limiting function \bar{f}(x). First, this limiting function is invariant under the flow,

(23) \bar{f}(\phi_t(x)) = \bar{f}(x).

Even more surprising, the ensemble average of the limiting function simply equals the ensemble average of the original function f,

(24) \int \bar{f}(x) \, \rho(x) \, dx = \int f(x) \, \rho(x) \, dx;

somehow, time averaging under the integral sign doesn't affect the value of the ensemble average!

The Birkhoff theorem does not assume that the flow is ergodic. When we combine this theorem with the ergodic hypothesis of statistical mechanics, we get a major result. Begin with any function of the dynamical variables f(x). We are interested in the long time average of this function,

\lim_{T \to \infty} \frac{1}{T} \int_0^T f(\phi_t(x)) \, dt = \bar{f}(x).

We know from (23) that the limiting function \bar{f}(x) is invariant under the flow. Since the flow is ergodic, (22) implies that \bar{f}(x) is a constant (almost everywhere). We can actually compute this constant by integrating over the ensemble,

\int \bar{f}(x) \, \rho(x) \, dx = \bar{f} \int \rho(x) \, dx = \bar{f}.

However, we know from (24) that the term on the left is equal to the ensemble average of the original function f. Therefore,

(25) \lim_{T \to \infty} \frac{1}{T} \int_0^T f(\phi_t(x)) \, dt = \int f(x) \, \rho(x) \, dx;

statistical averaging over the entire ensemble at fixed time is equivalent to time-averaging a single member of the ensemble. This consequence of the ergodic hypothesis is the justification for replacing macroscopic averages over computed trajectories with an ensemble theory. When we compute the pressure in an ideal gas using kinetic theory, for example, we ignore time evolution and consider only what a typical gas molecule is doing on average. This works precisely because of the equivalence of ensemble averaging and time averaging. Indeed, all of equilibrium statistical mechanics may be understood in terms of this result. It gives physicists great confidence that the ergodic hypothesis has not led them astray. Furthermore, to the extent that all measurements in the lab are time averages, ergodicity and the microcanonical ensemble firmly ground macroscopic measurements in the microscopic dynamics of the system being investigated. No experiment to date has shaken our confidence in the foundations of statistical mechanics.
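
Equation (25) can be checked directly on the simplest ergodic flow available: rotation of the circle by an irrational angle, which is ergodic with respect to the uniform measure. The sketch below (my own illustration, assuming NumPy) compares the time average of a test function along a single orbit with its average over the uniform ensemble.

import numpy as np

alpha = np.sqrt(2.0) - 1.0        # irrational rotation number: ergodic for the uniform measure

def f(x):
    # Any reasonable test function on the circle [0, 1).
    return np.cos(2 * np.pi * x) ** 2

# Time average along a single orbit x_{n+1} = x_n + alpha (mod 1).
N = 1_000_000
orbit = (0.123 + alpha * np.arange(N)) % 1.0
time_avg = f(orbit).mean()

# Ensemble average over the invariant (uniform) density, approximated on a fine grid.
grid = np.linspace(0.0, 1.0, 1_000_000, endpoint=False)
ensemble_avg = f(grid).mean()

print("time average    :", time_avg)       # both are close to 1/2
print("ensemble average:", ensemble_avg)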

Exercise (the indicator function). Let S be a region of the energy surface and let f be its indicator function (f(x) = 1 if x lies in S and 0 otherwise). Apply (25) to f to show that, under the ergodic hypothesis, the fraction of time a typical trajectory spends in S equals the ensemble probability of finding the system in S.

4. Density Operators in Quantum Mechanics

In classical physics, the state of a system at some fixed time t is uniquely defined by specifying the values of all of the generalized coordinates q_i(t) and momenta p_i(t). In quantum mechanics, however, the Heisenberg uncertainty principle prohibits simultaneous measurements of position and momentum to arbitrary precision. We might therefore anticipate some revisions in our approach. It turns out, however, that the classical ensemble theory developed above carries over into quantum mechanics with hardly any revision at all. Most of the necessary alterations are built directly into the edifice of quantum mechanics, and all we need is to find suitable quantum mechanical replacements for the density function \rho(q, p) and Liouville's theorem. Understanding this is the goal of this section. Readers who are unfamiliar with Dirac notation and the basic concepts of quantum mechanics are referred to the references at the end of the chapter.

The uncertainty principle renders the concept of phase space meaningless in quantum mechanics. The quantum state of a physical system is instead represented by a state vector, |\psi\rangle, belonging to an abstract vector space called the state space of the system. The use of an abstract vector space stems from the important role that superposition of states plays in quantum mechanics: linear combinations of states provide new states and, conversely, quantum states can always be decomposed into linear combinations of other states. The connection between these abstract vectors and experimental results is supplied by the formalism of linear algebra, by operators and their eigenvalues. Dynamical variables, such as position and energy, are represented by self-adjoint linear operators on the state space, and the result of any measurement made on the system is always one of the eigenvalues of the appropriate operator (the eigenvectors of an observable physical quantity form a basis for the entire state space). This use of operators and eigenvalues directly encodes many of the distinctive hallmarks of quantum mechanical systems: discretization, such as that of angular momentum or energy observed in numerous experiments, simply points to an operator with a discrete spectrum of eigenvalues.

And wherever the order in which several different measurements are made may affect the results obtained, the associated quantum operators do not commute.

In quantum mechanics, the time evolution of the state vector is described by Schrödinger's equation,

(26) i\hbar \frac{\partial}{\partial t} |\psi(t)\rangle = H(t) |\psi(t)\rangle,

where H(t) is the Hamiltonian operator for the system; this evolution law replaces the canonical equations of classical mechanics.

Exercise 1.6 (single particle dynamics). Write down, using wavefunctions \psi(q, t), Schrödinger's equation for a single particle of mass m in an external potential V(q).

Solution. Recall that the classical Hamiltonian for this system is simply

H(q, p) = \frac{p^2}{2m} + V(q).

We transform this into a quantum operator by replacing q and p with the appropriate quantum operators: q is the position operator and p = -i\hbar \, \partial/\partial q is the momentum operator for a wavefunction \psi(q, t). Then Schrödinger's equation (26) becomes the following partial differential equation,

(27) i\hbar \frac{\partial}{\partial t} \psi(q, t) = \left( -\frac{\hbar^2}{2m} \nabla^2 + V(q) \right) \psi(q, t).
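
Although (27) rarely has closed-form solutions, low-dimensional cases can be treated numerically by discretizing the Hamiltonian operator on a grid. The sketch below (my own illustration, assuming NumPy, with hbar = m = omega = 1) builds a finite-difference matrix for the one-dimensional harmonic oscillator and recovers the familiar evenly spaced eigenvalues; it also hints at how quickly the eigenvalue problem grows for many-body systems.

import numpy as np

# Grid for q and finite-difference Hamiltonian H = -(1/2) d^2/dq^2 + q^2/2 (hbar = m = omega = 1).
n, L = 1000, 20.0
q = np.linspace(-L / 2, L / 2, n)
dq = q[1] - q[0]

off = -np.ones(n - 1)
kinetic = (2 * np.eye(n) + np.diag(off, 1) + np.diag(off, -1)) / (2 * dq**2)
potential = np.diag(0.5 * q**2)
H = kinetic + potential

# The lowest eigenvalues should approximate E_k = k + 1/2.
E = np.linalg.eigvalsh(H)
print(E[:5])   # approximately [0.5, 1.5, 2.5, 3.5, 4.5]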

Schrödinger's equation has a number of nice properties. First, as a linear equation, it directly expresses the principle of superposition built into the vector structure of the state space: linear combinations of solutions to (26) provide new solutions. In addition, it can be shown that the norm \langle \psi(t) | \psi(t) \rangle of a state vector is invariant in time; this turns out to have a nice interpretation in terms of local conservation of probability. On the other hand, the Schrödinger equation is not easy to solve directly. Even a system as simple as the one-dimensional harmonic oscillator requires great dexterity. For a macroscopic system, (26) generates either an enormous eigenvalue problem or a high-dimensional partial differential equation (consider the generalization of (27) to a many-body system). Either way, we see that direct solution is hopeless. The situation is essentially identical with that of macroscopic classical mechanics: the mathematics and, more importantly, our lack of information about the microscopic state (quantum numbers, in this case) necessitate a statistical approach.

We would like to find a quantum mechanical entity that replaces the classical probability density \rho(q, p), which uses probabilities to represent our ignorance of the true state of the system. Unfortunately, the usual interpretation of quantum mechanics already employs probabilities on a deeper level: if the measurement of some physical quantity A in this system is made a large number of times (i.e., on a large ensemble of identically prepared systems), the average of all the results obtained is given by the expectation value

(28) \langle A \rangle = \langle \psi | A | \psi \rangle,

provided the quantum state |\psi(t)\rangle is properly normalized to satisfy \langle \psi(t) | \psi(t) \rangle = 1. In order to understand the consequences of this, we introduce a basis of eigenstates for the operator A. Let |a_i\rangle be the eigenvector corresponding to the eigenvalue a_i. Since the |a_i\rangle form a basis, we can expand the identity operator as follows,

(29) 1 = \sum_i |a_i\rangle \langle a_i|.

Inserting this operator into (28) twice, we obtain

(30) \langle A \rangle = \sum_i a_i \, |\langle a_i | \psi \rangle|^2.

Comparing this result to the definition of the expectation value,

(31) \langle A \rangle = \sum_i a_i \, p(a_i),

we see that |\langle a_i | \psi \rangle|^2 must be interpreted as representing the probability p(a_i) of obtaining a_i as the result of the measurement. This probabilistic framework replaces the classical notion of a dynamical variable having a definite value. While the expectation value of A is a definite quantity, particular measurements are indefinite in quantum mechanics; we can only talk about the probabilities of different outcomes of an experiment.

Now we can introduce an ensemble. Instead of considering a single state |\psi\rangle, let p_k represent the probability of the system being in a quantum state represented by the normalized state vector |\psi_k\rangle.

If the system is actually in state |\psi_k\rangle, then the probability of measuring a_i is simply |\langle a_i | \psi_k \rangle|^2. If, however, we are uncertain about the true state, then we have to average over the ensemble. In this case, the total probability of measuring a_i is given by

(32) p(a_i) = \sum_k p_k \, |\langle a_i | \psi_k \rangle|^2 = \langle a_i | \Big( \sum_k |\psi_k\rangle \, p_k \, \langle \psi_k| \Big) | a_i \rangle.

The object in parentheses in this last expression,

(33) \rho = \sum_k |\psi_k\rangle \, p_k \, \langle \psi_k|,

is known as the density operator. Equation (33) turns out to be exactly what we're looking for: the quantum mechanical operator corresponding to the classical density function \rho(q, p). Recall that the classical density satisfies the following properties:

(1) Non-negativity of probabilities: \rho(q, p) must be non-negative for all points in the phase space.
(2) Normalization of probabilities: \int \rho(q, p) \, dq \, dp = 1.
(3) Expectation values: the average value of a dynamical variable A(q, p) across the entire ensemble represented by \rho(q, p) is given by \langle A \rangle = \int A(q, p) \, \rho(q, p) \, dq \, dp.

These properties carry over into the quantum mechanical setting, with appropriate modification (see the exercises). In particular, it can be shown that \langle A \rangle = \mathrm{trace}\{ A\rho \}. Apart from traces over a density operator replacing integration over the classical ensemble, the statistical description of a complex quantum system is essentially no different from that of a complex classical system. The time evolution of the density operator \rho will be given by a quantum version of Liouville's theorem and will lead to the same notions of a microcanonical ensemble and ergodicity.
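
As a concrete check of these properties (my own illustration, assuming NumPy), one can build the density matrix for a small ensemble of two-level states and verify the trace, non-negativity, and trace-formula properties numerically.

import numpy as np

# Two normalized state vectors for a two-level system and their ensemble weights p_k.
psi1 = np.array([1.0, 0.0], dtype=complex)
psi2 = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)
states, weights = [psi1, psi2], [0.25, 0.75]

# Density operator rho = sum_k p_k |psi_k><psi_k|, as in (33).
rho = sum(p * np.outer(psi, psi.conj()) for p, psi in zip(weights, states))

A = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)   # an observable (Pauli z)

print("trace(rho)         =", np.trace(rho).real)          # 1.0
print("eigenvalues of rho =", np.linalg.eigvalsh(rho))      # all non-negative
print("trace(A rho)       =", np.trace(A @ rho).real)
print("sum_k p_k <A>_k    =", sum(p * (psi.conj() @ A @ psi).real
                                  for p, psi in zip(weights, states)))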

First, we derive the quantum evolution law for \rho. Differentiating (33) term by term with the product rule, we can write

(34) i\hbar \frac{\partial \rho}{\partial t} = \sum_k \Big[ \Big( i\hbar \frac{\partial}{\partial t} |\psi_k\rangle \Big) p_k \langle \psi_k| + |\psi_k\rangle \, p_k \Big( i\hbar \frac{\partial}{\partial t} \langle \psi_k| \Big) \Big].

Substituting the Schrödinger equation (26) and its adjoint, i\hbar \, \partial_t \langle \psi_k| = -\langle \psi_k| H, this reduces to

(35) i\hbar \frac{\partial \rho}{\partial t} = \sum_k \Big[ \big( H |\psi_k\rangle \big) p_k \langle \psi_k| - |\psi_k\rangle \, p_k \big( \langle \psi_k| H \big) \Big] = H\rho - \rho H.

Thus,

(36) \frac{\partial \rho}{\partial t} = -\frac{1}{i\hbar} [\rho, H],

where [\rho, H] = \rho H - H\rho is called the commutator of \rho and H. Note the striking resemblance between (36) and Liouville's theorem (16): the commutator of the density and Hamiltonian operators has replaced the classical Poisson bracket of the density and Hamiltonian functions, but the expressions are otherwise identical. This is a special case of a correspondence first pointed out by Dirac:

\text{classical Poisson bracket } \{ u, v \} \quad \longleftrightarrow \quad \text{quantum commutator } \frac{1}{i\hbar} [u, v].

As in the classical setting, a stationary \rho should be independent of time; for an equilibrium quantum system, \rho must therefore be a function of the Hamiltonian, \rho(H). The simplest choice is again a uniform distribution,

(37) \rho = \sum_k |\psi_k\rangle \frac{1}{n} \langle \psi_k|,

where n is the number of states |\psi_k\rangle in the ensemble. This is the quantum microcanonical ensemble. It is essentially the same as the classical one, except discrete: the same statistical principles apply, and we simply switch to a discretized formalism, with traces over operators replacing integrals over phase space.

Exercise 1.7. Show that the eigenvalues of the density operator are non-negative.

Solution. Let \lambda be any eigenvalue of \rho and let |\lambda\rangle be an eigenvector associated with this eigenvalue. Then

\rho |\lambda\rangle = \sum_k |\psi_k\rangle \, p_k \, \langle \psi_k | \lambda \rangle = \lambda |\lambda\rangle.

Multiplying on the left by \langle \lambda |, we obtain

\sum_k p_k \, |\langle \psi_k | \lambda \rangle|^2 = \lambda \, \langle \lambda | \lambda \rangle.

It follows that, since the p_k are non-negative and \langle \lambda | \lambda \rangle is positive, \lambda cannot be negative. Since the eigenvalues of \rho are the quantum analogue of the values taken by the classical density \rho(q, p), this result mirrors property (1) above.

Exercise 1.8. Show that the matrix representation of \rho in any basis satisfies

(38) \mathrm{trace}\{ \rho \} = 1.

Solution. Consider a basis of eigenstates |a_i\rangle of the operator A. The matrix elements \rho_{ij} = \langle a_i | \rho | a_j \rangle are the representation of \rho in this basis. Then,

\mathrm{trace}\{ \rho \} = \sum_i \langle a_i | \rho | a_i \rangle = \sum_i \sum_k p_k \, |\langle \psi_k | a_i \rangle|^2 = \sum_k p_k \Big( \sum_i |\langle a_i | \psi_k \rangle|^2 \Big) = \sum_k p_k = 1,

where the last steps use the completeness of the |a_i\rangle and the normalization of the |\psi_k\rangle. Since the trace is invariant under a change of basis, this result holds for any basis. The condition \mathrm{trace}\{ \rho \} = 1 should be compared to the normalization property (2) above.

Exercise 1.9. Show that, in a quantum ensemble represented by the operator \rho, the expectation value of an operator A satisfies

(39) \langle A \rangle = \mathrm{trace}\{ A\rho \}.

Solution.

\langle A \rangle = \sum_k p_k \langle \psi_k | A | \psi_k \rangle = \sum_{k,i} p_k \langle \psi_k | a_i \rangle \langle a_i | A | \psi_k \rangle = \sum_{i,k} \langle a_i | A | \psi_k \rangle \, p_k \, \langle \psi_k | a_i \rangle = \sum_i \langle a_i | A\rho | a_i \rangle = \mathrm{trace}\{ A\rho \}.

This result should be compared to the classical definition of the expectation value, property (3) above.

5. Discussion

One important feature of Hamiltonian dynamics is the equal status given to coordinates and momenta as independent variables, as this allows for a great deal of freedom in selecting which quantities to designate as coordinates and momenta (the q_i and p_i are often called generalized coordinates and momenta).

Any set of variables which satisfies the canonical equations (2)-(3) is called a set of canonical variables. One may transform between different sets of canonical variables; these changes of variables are called canonical transformations. Note that while the form of the Hamiltonian depends on how the chosen set of canonical variables is defined, the form of the canonical equations is by definition invariant under canonical transformations.

Hamiltonian systems have a great deal of additional structure. The quantity

(40) \oint_\gamma p \cdot dq = \sum_{i=1}^{n} \oint_\gamma p_i \, dq_i,

known as Poincaré's integral invariant, is independent of time if the evolution of the closed path \gamma follows the flow in phase space. The left-hand side of (40) is also known as the symplectic area. This result can be generalized if we extend our phase space by adding a dimension for the time t. Let \Gamma_1 be a closed curve in phase space (at fixed time) and consider the tube of trajectories in the extended phase space passing through points on \Gamma_1. If \Gamma_2 is another closed curve in phase space enclosing the same tube of trajectories, then

(41) \oint_{\Gamma_1} (p \cdot dq - H \, dt) = \oint_{\Gamma_2} (p \cdot dq - H \, dt).

This result, that the integral \oint (p \cdot dq - H \, dt) takes the same value on any two paths around the same tube of trajectories, is called the Poincaré-Cartan integral theorem. Note that if both paths are taken at fixed time, then (41) simply reduces to (40).

Structure of this sort, as well as the presence of additional invariant quantities, greatly constrains the flow in phase space, and one may wonder whether this structure is compatible with the ergodic hypothesis and the microcanonical ensemble. The most extreme illustration of the conflict is the special case of integrable Hamiltonian systems. A time-independent Hamiltonian system is said to be integrable if it has n independent global constants of the motion (one of which is the Hamiltonian itself), no two of which have a non-zero Poisson bracket. The existence of n invariants confines the phase trajectories to an n-dimensional subspace (recall that the entire phase space is 2n-dimensional; this is a significant reduction of dimension).

The independence of these invariants guarantees that none can be expressed as a function of the others. The last condition, that no two of the invariants have a non-zero Poisson bracket, restricts the topology of the manifold to which the trajectories are confined: it must be an n-dimensional torus. A canonical transformation to what are known as action-angle variables, in which

I_i = \frac{1}{2\pi} \oint_{\gamma_i} p \cdot dq

provides the canonical momenta and the angle \theta_i around the loop \gamma_i provides the canonical coordinates, simplifies the description immensely: each I_i determines a frequency for uniform motion around the loop defined by \gamma_i, generating trajectories which spiral uniformly around the surface of the n-torus. For most choices of the I_i, a single trajectory will fill up the entire torus; this is called quasiperiodic motion. The microcanonical ensemble, for which the trajectories wander ergodically on a (2n - 1)-dimensional energy surface, captures none of this structure. On one hand, highly structured Hamiltonian systems appear to exist in Nature, the premier example being our solar system. On the other hand, we have the remarkable success of statistical mechanics (and its underlying hypotheses of ergodicity and equal a priori probabilities) in providing a foundation for thermodynamics and condensed matter physics. This success remains a mystery.
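
The contrast between quasiperiodic and ergodic behavior is easy to visualize with two action-angle pairs (a closing illustration of mine, assuming NumPy). Uniform motion in the two angles with incommensurate frequencies gradually fills the 2-torus, while commensurate frequencies give a closed curve; in neither case does the orbit explore the rest of the (2n - 1)-dimensional energy surface.

import numpy as np

def torus_coverage(omega1, omega2, n_steps=200_000, dt=0.05, bins=40):
    # Fraction of cells of a bins-by-bins grid on the 2-torus visited by the orbit
    # theta_i(t) = omega_i * t (mod 2*pi), i.e. uniform motion in the angle variables.
    t = dt * np.arange(n_steps)
    th1 = (omega1 * t) % (2 * np.pi)
    th2 = (omega2 * t) % (2 * np.pi)
    counts, _, _ = np.histogram2d(th1, th2, bins=bins,
                                  range=[[0, 2 * np.pi], [0, 2 * np.pi]])
    return np.count_nonzero(counts) / bins**2

print("incommensurate frequencies (1, sqrt 2):", torus_coverage(1.0, np.sqrt(2.0)))  # close to 1
print("commensurate frequencies   (1, 2)     :", torus_coverage(1.0, 2.0))           # well below 1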