Information Theory in Statistical Mechanics: Equilibrium and Beyond

Benjamin Good

Principle of Maximum Information Entropy

Consider the following problem: we have a number of mutually exclusive outcomes {A_i} and we wish to specify a probability distribution p_i over these outcomes. We lack the information needed to determine this distribution completely, so we wish to assign probabilities in the most non-committal way while still incorporating any additional information we might have. E. T. Jaynes proposed choosing the p_i in such a way as to maximize the information entropy

    S_I = -\sum_i p_i \log p_i    (1)

subject to normalization and any other given constraints [1]. Additional constraints often involve fixing the average values of some set of functions,

    \sum_i p_i f_j(x_i) = \langle f_j(x) \rangle, \qquad j = 1, \dots, N    (2)

To incorporate these constraints, we introduce Lagrange multipliers \mu_j (one for each constraint) and set

    \frac{\partial}{\partial p_i} \left[ S_I - \lambda \left( \sum_i p_i - 1 \right) - \sum_j \mu_j \left( \sum_i p_i f_j(x_i) - \langle f_j \rangle \right) \right] = 0    (3)

which yields the solution (after absorbing a constant into \lambda)

    p_i = e^{-\lambda - \sum_j \mu_j f_j(x_i)} = \frac{1}{Z} e^{-\sum_j \mu_j f_j(x_i)}    (4)

where Z is just a normalization constant. In principle, the \mu_j can be determined by substituting this solution back into the constraint equations.

Although Jaynes gave no deductive proof of the correctness of his method, the principle of Maximum Information Entropy (MaxEnt) can be confirmed experimentally in areas quite removed from statistical mechanics. For example, consider a sample of n independent dice rolls whose average is known to be 4. We seek the probability distribution for the individual rolls under this constraint. Clearly it is not the uniform distribution p_i = 1/6, because then the average would be 3.5. MaxEnt allows us to calculate such a distribution. If we simulate a large number of n-roll samples and discard those sequences whose average is not equal to 4, the resulting frequency distribution converges to the MaxEnt distribution in the limit n \to \infty (as predicted by the law of large numbers). This simple experiment offers powerful evidence for the far-reaching validity of the MaxEnt principle, even though theoretical justifications for its use are hard to come by.
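As a concrete illustration, here is a minimal numerical sketch of the dice experiment (the sample sizes, random seed, and use of scipy's brentq root-finder are choices of this sketch, not part of the original argument). It solves eq. (4) for the single multiplier \mu and compares the result against the brute-force rejection simulation described above; for finite n the two agree only approximately, with the agreement improving as n grows.

```python
import numpy as np
from scipy.optimize import brentq

# MaxEnt solution of eq. (4) for a die constrained to average 4:
# p_i proportional to exp(-mu * i), with mu fixed by the constraint.
faces = np.arange(1, 7)

def mean_minus_target(mu):
    """Constrained mean <i> under p_i ~ exp(-mu * i), minus the target 4."""
    w = np.exp(-mu * faces)
    return np.dot(faces, w) / w.sum() - 4.0

mu = brentq(mean_minus_target, -5.0, 5.0)   # root-find the multiplier
w = np.exp(-mu * faces)
p_maxent = w / w.sum()

# Brute force: simulate many n-roll samples, keep only those whose
# average is exactly 4, and histogram the surviving individual rolls.
rng = np.random.default_rng(0)
n, trials = 10, 500_000                     # illustrative sizes
rolls = rng.integers(1, 7, size=(trials, n))
kept = rolls[rolls.sum(axis=1) == 4 * n]    # sequences averaging 4
p_empirical = np.bincount(kept.ravel(), minlength=7)[1:] / kept.size

print("MaxEnt: ", np.round(p_maxent, 3))
print("Sampled:", np.round(p_empirical, 3))
```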
Application to Equilibrium Statistical Mechanics

The application of the MaxEnt principle to equilibrium statistical mechanics is a relatively straightforward exercise (as we'll see in the next section), but it involves a slight change of viewpoint in the interpretation. Thus, it will help to review the traditional interpretive framework of statistical mechanics first.
The Central Dogma

In the traditional approach to statistical mechanics, we assume that some sort of ergodic hypothesis holds [2]. By ergodic, we mean that the system is continually moving through all allowable states at such a rapid rate that the entire phase space (or at least the majority of it) is sampled in the time it takes to make a measurement. This has two implications:

(1) If M_F is a measurement of some observable F taken during the time interval \tau, then

    M_F = \frac{1}{\tau} \int_0^\tau F(t)\, dt    (5)

In other words, measurements correspond to time averages.

(2) For any macroscopic time interval \tau, the system samples such a large part of the phase space that

    \frac{1}{\tau} \int_0^\tau F(t)\, dt \approx \langle F \rangle    (6)

In other words, time averages are equal to phase-space averages.

Combining these two equations, we obtain

    M_F \approx \langle F \rangle    (7)

This interpretation enables us to connect the measurements we observe to the phase-space averages we can calculate, and all of statistical mechanics follows. While this approach has met with spectacular success, its central assumption, the ergodic hypothesis, is notoriously hard to prove, and for some systems it appears to be simply false.
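As the simplest possible check of eq. (6), consider a single one-dimensional harmonic oscillator (an illustrative toy, not a demonstration of ergodicity in an interacting system): its energy shell is a circle in phase space, traversed uniformly in the phase angle, so the time average of x^2 along a trajectory and the uniform average of x^2 over the shell can be compared directly.

```python
import numpy as np

# Time average of F = x^2 along the trajectory x(t) = A*cos(w*t),
# approximating (1/tau) * integral of F dt by a dense uniform sample.
A, w = 1.0, 2.0
t = np.linspace(0.0, 1000.0, 2_000_001)     # tau >> one period (pi)
time_avg = np.mean((A * np.cos(w * t)) ** 2)

# Phase-space ("energy shell") average: the shell is the circle of
# radius A, sampled uniformly in the phase angle theta.
rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 2.0 * np.pi, 1_000_000)
shell_avg = np.mean((A * np.cos(theta)) ** 2)

print(time_avg, shell_avg)   # both approach A**2 / 2 = 0.5
```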
Statmech as Inference

E. T. Jaynes and his subjectivist school follow a different line of interpretation. To them, the problem of statistical mechanics is a problem of inference. In any realistic system, the sheer number of degrees of freedom makes the exact state and evolution of the system impossible to determine. However, the successes of thermodynamics and Newtonian mechanics suggest that in some cases information about a small number of degrees of freedom allows us to make rather definite predictions about the large-scale behavior of the system. The subjectivist school asks the following question: given information about a small number of macroscopic parameters, what is the best guess we can make about the state of the system?

We have to clarify what we mean by best guess, however. If we want a point estimate of a quantity F, we seek a number f that minimizes the mean-squared error

    \langle (F - f)^2 \rangle    (8)

Setting the derivative with respect to f to zero gives f = \langle F \rangle. Thus, to a subjectivist, the reason we calculate phase averages is not that they are connected to time averages but that they are the best estimate of the actual quantity [3]. Jaynes takes this estimation problem one step further with the MaxEnt procedure: instead of providing a single point estimate, the MaxEnt method yields the best estimate of a probability distribution given the available information. The interesting systems, therefore, are just those where information about some small number of degrees of freedom yields relatively definite predictions, because these are the ones that can be tested and confirmed in the laboratory [4].
The Grand Canonical Distribution

We now apply the Jaynes formalism to a common equilibrium system. Consider a system of volume V that is free to exchange energy and particles with its environment. If we assume the system is at equilibrium, the average energy and average particle number (at any time) are enough to characterize the macroscopic behavior of the system for all time. Thus, we have the following constraints:

(i) Normalization: \sum_i p_i = 1

(ii) Average energy: \sum_i p_i E_i = \langle E \rangle

(iii) Average particle number: \sum_i p_i n_i = \langle n \rangle

Maximizing S_I subject to these constraints yields the distribution

    p_i = e^{-\lambda - \lambda_1 E_i - \lambda_2 n_i} = \frac{1}{Z} e^{-\lambda_1 E_i - \lambda_2 n_i}    (9)

We can identify \lambda_1 and \lambda_2 with the temperature T and chemical potential \mu,

    \lambda_1 = \frac{1}{kT}, \qquad \lambda_2 = -\frac{\mu}{kT}    (10)

which yields the Gibbs distribution

    p_i = \frac{1}{Z} e^{-(E_i - \mu n_i)/kT}    (11)

This is exactly the distribution we obtain using the ergodic hypothesis and the traditional framework of statistical mechanics! Jaynes believes this equivalence is a result of the fact that "...there is nothing in the general laws of motion that can provide us with any additional information about the state of a system in equilibrium beyond what we have obtained from measurement" [4]. In other words, statistics rather than mechanics dominates the behavior of equilibrium systems, a fact we have encountered many times throughout this course!
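To make the multiplier-matching step concrete, here is a small sketch (the six-state spectrum, target averages, and use of scipy's fsolve are invented for illustration). It solves the two constraint equations for \lambda_1 and \lambda_2 numerically; eq. (10) would then translate them into a temperature and chemical potential.

```python
import numpy as np
from scipy.optimize import fsolve

# Toy state space: state i has energy E[i] and particle number n[i].
E = np.array([0.0, 1.0, 1.5, 2.0, 3.0, 3.5])
n = np.array([0.0, 1.0, 1.0, 2.0, 2.0, 3.0])
E_avg, n_avg = 1.8, 1.4        # the measured constraints <E> and <n>

def constraints(lams):
    """Residuals of the two average-value constraints in eq. (9)."""
    l1, l2 = lams
    w = np.exp(-l1 * E - l2 * n)
    p = w / w.sum()
    return [np.dot(p, E) - E_avg, np.dot(p, n) - n_avg]

l1, l2 = fsolve(constraints, x0=[1.0, -1.0])
w = np.exp(-l1 * E - l2 * n)
p = w / w.sum()
print("lambda_1 =", l1, "lambda_2 =", l2)
print("check:   ", np.dot(p, E), np.dot(p, n))   # reproduces the targets
```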
Application to Nonequilibrium Statistical Mechanics

The Jaynes formalism has the added benefit that it naturally generalizes to the nonequilibrium case. Whereas the traditional approach fails as soon as ergodicity is lost, MaxEnt works in exactly the same way. A general nonequilibrium state is not a single stationary state, as in the equilibrium case, but rather a path \Gamma through state space. Thus, we seek the probabilities p_\Gamma of the system evolving along a particular path \Gamma. Our known information (apart from normalization) still consists of average values of observables, but we must now keep track of the time at which each measurement is made:

    \sum_\Gamma p_\Gamma f_j(x_\Gamma(t_k), t_k) = \langle f_j(x, t_k) \rangle, \qquad j = 1, \dots, N, \quad k = 1, \dots, M    (12)

Maximizing S_I subject to these constraints yields the distribution

    p_\Gamma = e^{-\lambda - \sum_j \sum_k \mu_{jk} f_j(x_\Gamma(t_k), t_k)} = \frac{1}{Z} e^{-\sum_j \sum_k \mu_{jk} f_j(x_\Gamma(t_k), t_k)}    (13)
If we take M \to \infty and \Delta t \to 0 (to simulate a continuum of measurements), the Lagrange multipliers \mu_{jk} become functions \mu_j(t) and we obtain

    p_\Gamma = \frac{1}{Z} e^{-\sum_j \int \mu_j(t) f_j(x_\Gamma(t), t)\, dt}    (14)

Application: Derivation of the Steady-State Fluctuation Theorem¹

Consider a steady-state nonequilibrium system of volume V with boundary \Omega. Even though the system is in a steady state, it is not in equilibrium, because steady currents of energy and particles are being driven through the system.² Let the internal energy density be given by u(x, t), the mass density by \rho(x, t), and the energy and mass flux densities by J_u(x, t) and J_m(x, t), respectively. Let J'_u(x, t) and J'_m(x, t) denote the flux densities on the boundary. Since we are considering a steady-state nonequilibrium system, the initial conditions u(x, 0), \rho(x, 0) and the time-averaged boundary fluxes \bar{J}'_u(x), \bar{J}'_m(x) completely determine the macroscopic behavior of the system. Including normalization and local energy and mass conservation, we have the following constraints:

(i) Normalization:

    \sum_\Gamma p_\Gamma = 1    (15)

(ii) Initial Conditions:

    \sum_\Gamma p_\Gamma u_\Gamma(x, 0) = u(x, 0), \qquad \sum_\Gamma p_\Gamma \rho_\Gamma(x, 0) = \rho(x, 0)    (16)

(iii) External Fluxes:

    \sum_\Gamma p_\Gamma \bar{J}'_{u,\Gamma}(x) = \bar{J}'_u(x), \qquad \sum_\Gamma p_\Gamma \bar{J}'_{m,\Gamma}(x) = \bar{J}'_m(x)    (17)

(iv) Conservation Laws:

    \frac{\partial u(x, t)}{\partial t} + \nabla \cdot J_u(x, t) = 0, \qquad \frac{\partial \rho(x, t)}{\partial t} + \nabla \cdot J_m(x, t) = 0    (18)

Maximizing S_I subject to these constraints yields the distribution

    p_\Gamma = \frac{1}{Z} e^{A(\Gamma)}    (19)

where A(\Gamma) is the path action defined by

    A(\Gamma) = \int_V dx\, \left[ \lambda_u(x) u_\Gamma(x, 0) + \lambda_m(x) \rho_\Gamma(x, 0) \right] + \int_\Omega dx\, \left[ \mu_u(x) \bar{J}'_{u,\Gamma}(x) + \mu_m(x) \bar{J}'_{m,\Gamma}(x) \right]    (20)

¹ This derivation parallels the argument given in Dewar [5].
² It is true that energy and particles are also exchanged in the equilibrium Gibbs distribution, but there the average flux is zero.

Now, the Lagrange multipliers \lambda and \mu are not entirely independent, because they are connected at the boundary by the conservation laws. To take this into account, note that we can expand the initial conditions in the form

    u_\Gamma(x, 0) = \frac{1}{2}\left[ u_\Gamma(x, 0) + u_\Gamma(x, \tau) \right] - \frac{\tau}{2}\, \overline{\frac{\partial u_\Gamma(x, t)}{\partial t}}    (21)

(this is an identity, since the time-averaged derivative is simply [u_\Gamma(x, \tau) - u_\Gamma(x, 0)]/\tau). We can then use the conservation laws (18) to substitute -\nabla \cdot \bar{J}_{u,\Gamma} for the time-averaged derivative. Integrating by parts, this in effect allows us to drop the surface integrals from the path-action formula, leaving us with

    A(\Gamma) = \frac{1}{2} \int_V dx\, \left\{ \lambda_u(x) \left[ u_\Gamma(x, 0) + u_\Gamma(x, \tau) \right] + \lambda_m(x) \left[ \rho_\Gamma(x, 0) + \rho_\Gamma(x, \tau) \right] \right\} - \frac{\tau}{2} \int_V dx\, \left[ \bar{J}_{u,\Gamma}(x) \cdot \nabla \lambda_u(x) + \bar{J}_{m,\Gamma}(x) \cdot \nabla \lambda_m(x) \right]

To put this into a more recognizable form, note that we can relate the Lagrange multipliers to a generalized temperature T(x) and a generalized chemical potential \mu(x) through the relations

    \lambda_u(x) = -\frac{1}{kT(x)}, \qquad \lambda_m(x) = \frac{\mu(x)}{kT(x)}    (22)

We also define the nonequilibrium Hamiltonian density H(x, t) = u - \mu \rho. Then we can rewrite the above equation in the form

    A(\Gamma) = -\frac{1}{2} \int_V dx\, \frac{H_\Gamma(x, 0) + H_\Gamma(x, \tau)}{kT(x)} + \frac{\tau \sigma_\Gamma}{2k}    (23)

where \sigma_\Gamma is the time-averaged entropy production rate along \Gamma, given by

    \sigma_\Gamma = \int_V dx\, \left[ \bar{J}_{u,\Gamma}(x) \cdot \nabla\!\left( \frac{1}{T} \right) - \bar{J}_{m,\Gamma}(x) \cdot \nabla\!\left( \frac{\mu}{T} \right) \right]    (24)

The Role of Time Symmetry

Because Newton's laws are time-symmetric, for every path \Gamma there exists a time-reversed path \Gamma^R that follows the same trajectory but reverses the direction of time. This implies that all velocities and fluxes change sign as well. We now seek the probability p_{\Gamma^R} of observing the time-reversed path. The first term in the path action is symmetric with respect to time reversal, because \Gamma^R(0) = \Gamma(\tau) and \Gamma^R(\tau) = \Gamma(0), so it is unchanged. The entropy production, on the other hand, is linear in the fluxes, so it is antisymmetric under time reversal. Thus,

    A(\Gamma^R) = -\frac{1}{2} \int_V dx\, \frac{H_{\Gamma^R}(x, 0) + H_{\Gamma^R}(x, \tau)}{kT(x)} + \frac{\tau \sigma_{\Gamma^R}}{2k} = -\frac{1}{2} \int_V dx\, \frac{H_\Gamma(x, 0) + H_\Gamma(x, \tau)}{kT(x)} - \frac{\tau \sigma_\Gamma}{2k}    (25)

so the ratio of the probabilities is simply

    \frac{p_\Gamma}{p_{\Gamma^R}} = e^{\tau \sigma_\Gamma / k}    (26)

If we look at the probability of observing a time-averaged entropy production rate \sigma = A, then

    P(\sigma = A) = \sum_\Gamma \delta(\sigma_\Gamma - A)\, p_\Gamma    (27)
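For completeness, the step from (26) and (27) to the fluctuation theorem below can be made explicit (a reconstruction of the intermediate algebra, using \sigma_{\Gamma^R} = -\sigma_\Gamma and the fact that \Gamma \mapsto \Gamma^R is a one-to-one relabeling of the paths):

    P(\sigma = A) = \sum_\Gamma \delta(\sigma_\Gamma - A)\, p_\Gamma
                  = \sum_\Gamma \delta(\sigma_\Gamma - A)\, e^{\tau \sigma_\Gamma / k}\, p_{\Gamma^R}
                  = e^{\tau A / k} \sum_\Gamma \delta(\sigma_{\Gamma^R} + A)\, p_{\Gamma^R}
                  = e^{\tau A / k}\, P(\sigma = -A)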

and we obtain

    \frac{P(\sigma = A)}{P(\sigma = -A)} = e^{\tau A / k}    (28)

This is the famous Evans-Searles Fluctuation Theorem (FT) [6]. Discovered in the early 1990s, this simple relation gives a quantitative version of the Second Law of Thermodynamics, which states that

    \sigma \geq 0    (29)

In other words, entropy is a non-decreasing function of time. Boltzmann himself had reservations about the absolute character of the second law and envisioned a probabilistic version in which occasional violations of the second law were permitted as long as the macroscopic entropy increased on average. Indeed, with a little algebra, one can show that the Fluctuation Theorem implies that

    \langle \sigma \rangle \geq 0    (30)

Figure 1: A sample system exhibiting the Fluctuation Theorem. The majority of the states increase entropy, but there is a nonzero probability that the system actually runs in reverse, thus destroying entropy in apparent contradiction with the second law. The Fluctuation Theorem helps quantify this probability.

Furthermore, in the limit of large systems or long waiting times, the Fluctuation Theorem also implies

    P(\sigma = A) \gg P(\sigma = -A), \qquad A > 0    (31)

Thus, the Fluctuation Theorem reduces to the Second Law in the proper limits, in addition to quantifying the transition between probabilistic fluctuations for small systems and thermodynamic certainty for large systems.
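A quick numerical illustration of (28) (a toy model, assumed only for this sketch): a Gaussian distribution of the time-averaged entropy production rate satisfies the FT precisely when its variance equals 2\langle\sigma\rangle k / \tau, so sampling such a Gaussian and measuring the slope of \log[P(A)/P(-A)] against A should recover \tau / k.

```python
import numpy as np

# Sample sigma from the Gaussian consistent with the FT:
# mean m, variance 2*m*k/tau (so the predicted slope is tau/k = 10).
k, tau, m = 1.0, 10.0, 0.5
rng = np.random.default_rng(42)
sigma = rng.normal(m, np.sqrt(2.0 * m * k / tau), 10_000_000)

# Bin symmetrically about zero and compare log[P(A)/P(-A)] to tau*A/k.
counts, edges = np.histogram(sigma, bins=np.linspace(-0.8, 0.8, 33))
A = 0.5 * (edges[:-1] + edges[1:])
pos = A > 0
fwd, rev = counts[pos], counts[::-1][pos]   # counts at +A and at -A
ok = (fwd > 0) & (rev > 0)                  # avoid log(0) in sparse tails
slope = np.polyfit(A[pos][ok], np.log(fwd[ok] / rev[ok]), 1)[0]
print(slope)                                # close to tau / k = 10
```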

It can also be used to derive the near-equilibrium Green-Kubo relations of linear transport theory. The Fluctuation Theorem has even been confirmed in the laboratory through a series of clever experiments by Evans and coworkers [7]!

Figure 2: (a) Logarithm of the ratio of the number of trajectories in forward and reverse histogram bins, log(N_i / N_{iR}), versus the average entropy production rate for that bin. The line of best fit has slope 0.98. (b) Demonstration of the Integrated FT.

References

[1] Jaynes, E. T. (2003). Probability Theory: The Logic of Science. Cambridge University Press, New York.

[2] Chandler, D. (1987). Introduction to Modern Statistical Mechanics. Oxford University Press, New York.

[3] Jaynes, E. T. (1979). Where do we stand on maximum entropy? In The Maximum Entropy Formalism (ed. R. D. Levine and M. Tribus), pp. 15-118. MIT Press, Cambridge, MA.

[4] Jaynes, E. T. (1957). Information theory and statistical mechanics. Phys. Rev., 106, 620.

[5] Dewar, R. C. Information theory explanation of the fluctuation theorem, maximum entropy production and self-organized criticality in non-equilibrium stationary states. arXiv:cond-mat/0005382.

[6] Evans, D. J. and Searles, D. J. (2002). The Fluctuation Theorem. Advances in Physics, 51, 1529-1585.

[7] Carberry, D. M. et al. (2004). Fluctuations and irreversibility: an experimental demonstration of a second-law-like theorem using a colloidal particle held in an optical trap. Phys. Rev. Lett., 92, 140601.