[Pop-up Salon, Maths, SJTU, 2019/01/10] On the Complete Monotonicity of the Heat Equation. Fan Cheng, John Hopcroft Center for Computer Science and Engineering, Shanghai Jiao Tong University. chengfan@sjtu.edu.cn
Overview. Heat equation: ∂f(x,t)/∂t = (1/2) ∂²f(x,t)/∂x². Gaussian channel: Y = X + √t Z, Z ∼ N(0,1). Differential entropy: h(X) = −∫ f(x) log f(x) dx. Orders of complete monotonicity established for h(X + √t Z): 2 (H. P. McKean, 1966), now 4. Conjecture: +∞ (Cheng, 2015).
Outline: Super-H theorem; Boltzmann equation and heat equation; Shannon entropy power inequality; complete monotonicity conjecture.
Fire and Civilization. Fire drill and myth: west and east. Steam engine: James Watt. The Wealth of Nations and the independence of the US, 1776.
Study of Heat. ∂f(x,t)/∂t = (1/2) ∂²f(x,t)/∂x². Heat transfer: the history begins with the work of Joseph Fourier around 1807. In a remarkable memoir, Fourier invented both the heat equation and the method of Fourier analysis for its solution.
Information Age. "A Mathematical Theory of Communication," Bell System Technical Journal, 27 (3): 379–423. Gaussian channel: Y = X + √t Z, where Z ∼ N(0,1), X and Z are mutually independent, and the p.d.f. of X is g(x). The p.d.f. of Y is the convolution f(y; t) = ∫ g(x) (1/√(2πt)) e^{−(y−x)²/(2t)} dx, and it satisfies ∂f(y;t)/∂t = (1/2) ∂²f(y;t)/∂y². Fundamentally, the Gaussian channel and the heat equation are identical in mathematics (a Gaussian mixture model).
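As a numerical sanity check of the identity above (a sketch, not from the slides; the two-point input X ∈ {−1, +1} is a hypothetical choice that makes the output density a simple Gaussian mixture), one can verify by finite differences that the channel output density satisfies the heat equation:

```python
import numpy as np

def channel_density(y, t):
    """Density of Y = X + sqrt(t)*Z for a two-point input X in {-1, +1}
    (equal probability) and Z ~ N(0, 1): a two-component Gaussian mixture."""
    g = lambda m: np.exp(-(y - m) ** 2 / (2 * t)) / np.sqrt(2 * np.pi * t)
    return 0.5 * g(-1.0) + 0.5 * g(1.0)

y = np.linspace(-8, 8, 2001)
dy = y[1] - y[0]
t, dt = 1.0, 1e-5

# Left side of the heat equation: time derivative via central differences
df_dt = (channel_density(y, t + dt) - channel_density(y, t - dt)) / (2 * dt)

# Right side: (1/2) * second spatial derivative
f = channel_density(y, t)
d2f_dy2 = np.gradient(np.gradient(f, dy), dy)

# The two sides should agree up to discretization error
residual = np.max(np.abs(df_dt - 0.5 * d2f_dy2))
```

The residual is on the order of the finite-difference error, illustrating that adding √t Z is literally running the heat flow for time t.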
Entropy Formula. Second law of thermodynamics: entropy goes one way only.
Ludwig Boltzmann. Boltzmann formula: S = k_B ln W. Gibbs formula: S = −k_B Σᵢ pᵢ ln pᵢ. Ludwig Eduard Boltzmann (1844–1906), Vienna, Austrian Empire. Boltzmann equation: df/dt = (∂f/∂t)_force + (∂f/∂t)_diff + (∂f/∂t)_coll. H-theorem: H(f(t)) is non-decreasing.
Super H-theorem for the Boltzmann Equation. Notation: a function is completely monotone (CM) iff the signs of its derivatives alternate: +, −, +, −, … (e.g., 1/t, e^{−t}). McKean's conjecture on the Boltzmann equation (1966): H(f(t)) is CM in t when f(t) satisfies the Boltzmann equation. False: disproved by E. Lieb in the 1970s — for the particular Bobylev–Krook–Wu explicit solutions, the theorem holds true for n ≤ 101 and breaks down afterwards. H. P. McKean, NYU, National Academy of Sciences.
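The CM definition above can be checked symbolically on the two examples from the slide (a minimal sketch; the helper name is mine, and only a finite prefix of derivatives is tested):

```python
import sympy as sp

t = sp.symbols('t', positive=True)

def is_cm_prefix(f, order=6):
    """Check (-1)^k f^(k)(t) >= 0 for k = 0..order-1, symbolically for t > 0.
    A true CM function must pass this for every order."""
    g = f
    for k in range(order):
        if not ((-1) ** k * g).is_nonnegative:
            return False
        g = sp.diff(g, t)
    return True

cm_exp = is_cm_prefix(sp.exp(-t))  # e^{-t}: derivatives alternate +, -, +, ...
cm_inv = is_cm_prefix(1 / t)       # 1/t: derivatives (-1)^k k! / t^{k+1}
cm_sin = is_cm_prefix(sp.sin(t))   # sin(t) is not CM: fails immediately
```

Both slide examples pass, while sin(t) fails at order 0 already.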
Super H-theorem for the Heat Equation. Is H(f(t)) CM in t if f(t) satisfies the heat equation? Equivalently, is h(X + √t Z) CM in t? The signs of the first two derivatives were obtained; attempts at the 3rd and 4th failed. (It is easy to compute the derivatives; it is hard to obtain their signs.) "This suggests that …, but I could not prove it" — C. Villani, 2010 Fields Medalist.
Claude E. Shannon and EPI. Central limit theorem; capacity region of the Gaussian broadcast channel; capacity region of the Gaussian Multiple-Input Multiple-Output broadcast channel; uncertainty principle — all of them can be proved by the entropy power inequality (EPI). Entropy power inequality (Shannon 1948): for any two independent continuous random variables X and Y, e^{2h(X+Y)} ≥ e^{2h(X)} + e^{2h(Y)}. Equality holds iff X and Y are Gaussian. Motivation: Gaussian noise is the worst noise. Impact: a new characterization of the Gaussian distribution in information theory. Comment: "most profound!" (Kolmogorov).
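The two cases of the EPI can be checked with closed-form entropies (a sketch using standard formulas: h(N(0,v)) = ½ ln 2πev, h(Uniform(0,1)) = 0, and h of the triangular density of a sum of two standard uniforms = ½ nat):

```python
import math

def h_gauss(var):
    # Differential entropy (nats) of N(0, var)
    return 0.5 * math.log(2 * math.pi * math.e * var)

# Equality case: independent Gaussians X ~ N(0, a), Y ~ N(0, b), X+Y ~ N(0, a+b)
a, b = 2.0, 3.0
lhs = math.exp(2 * h_gauss(a + b))
rhs = math.exp(2 * h_gauss(a)) + math.exp(2 * h_gauss(b))
gap_gaussian = lhs - rhs  # zero up to rounding: e^{2h} is 2*pi*e*variance

# Strict case: X, Y ~ Uniform(0,1) independent; h(X) = h(Y) = 0,
# X+Y is triangular on [0,2] with h = 1/2, so e^{2h(X+Y)} = e > 1 + 1 = 2
gap_uniform = math.exp(2 * 0.5) - (math.exp(0) + math.exp(0))
```

For Gaussians the entropy power e^{2h} is just 2πe times the variance, which makes the equality case transparent; for non-Gaussian inputs the inequality is strict.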
Entropy Power Inequality. Shannon himself didn't give a proof but an explanation, which turned out to be wrong. The first proofs were given by A. J. Stam (1959) and N. M. Blachman (1965). Research on EPI: generalizations, new proofs, new connections — e.g., the Gaussian interference channel is open, so some stronger EPI should exist. Stanford information theory school: Thomas Cover and his students A. El Gamal, M. H. Costa, A. Dembo, A. Barron (1980–1990). Princeton information theory school: Sergio Verdú, etc. (2000s). A battlefield of Shannon theory.
Ramification of EPI. Gaussian perturbation: h(X + √t Z) and the Shannon EPI. Fisher information: I(X + √t Z) = 2 ∂h(X + √t Z)/∂t (de Bruijn's identity), and Fisher information is decreasing in t. Fisher information inequality (FII): 1/I(X+Y) ≥ 1/I(X) + 1/I(Y). Concavity of entropy power: e^{2h(X + √t Z)} is concave in t. Tight Young's inequality: ‖X ∗ Y‖_r ≤ c ‖X‖_p ‖Y‖_q. Status quo: the FII can imply the EPI and all its generalizations. However, life is always hard: the FII is far from enough.
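For Gaussian inputs both de Bruijn's identity and the FII equality case can be verified symbolically (a sketch; for N(0,v) one has h = ½ ln 2πev and I = 1/v):

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)

# Gaussian input X ~ N(0, s): Y_t = X + sqrt(t) Z is N(0, s + t)
h = sp.Rational(1, 2) * sp.log(2 * sp.pi * sp.E * (s + t))  # entropy of Y_t
I = 1 / (s + t)                                             # Fisher info of Y_t

# de Bruijn's identity: d/dt h(X + sqrt(t) Z) = I(X + sqrt(t) Z) / 2
de_bruijn_ok = sp.simplify(sp.diff(h, t) - I / 2) == 0

# FII with equality for independent X ~ N(0, s), Y ~ N(0, t):
# 1/I(X+Y) = 1/I(X) + 1/I(Y), since I(N(0, v)) = 1/v
I_X, I_Y, I_sum = 1 / s, 1 / t, 1 / (s + t)
fii_gap = sp.simplify(1 / I_sum - (1 / I_X + 1 / I_Y))
```

Both identities reduce to exact zeros, consistent with the claim that Gaussians are the extremal case.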
On X + √t Z. X is arbitrary and h(X) may not exist. When t → 0, X + √t Z → X; when t → ∞, X + √t Z is asymptotically Gaussian. When t > 0, X + √t Z and h(X + √t Z) are infinitely differentiable. X + √t Z is a Gaussian mixture (the Gaussian Mixture Model (GMM) of machine learning); X + √t Z is the Gaussian channel/source of information theory. Gaussian noise is the worst additive noise; the Gaussian distribution maximizes h(X). Entropy power inequality, central limit theorem, etc.
Where we take off. Shannon's entropy power inequality; the Fisher information inequality; h(X + √t Z); whether h(f(t)) is CM: when f(t) satisfies the Boltzmann equation, disproved; when f(t) satisfies the heat equation, unknown. We don't even know what CM means here! Motivation: to study some inequalities, e.g., the convexity of h(X + e^t Z) and the concavity of t·I(X + √t Z). Any progress? None. Information theorists got lost in the past 70 years; mathematicians ignored it. It is widely believed that there should be no new EPI except Shannon's EPI and the FII.
Discovery. I(X + √t Z) = 2 ∂h(X + √t Z)/∂t ≥ 0 (de Bruijn, 1958). I^(1) = ∂I(X + √t Z)/∂t ≤ 0 (McKean 1966, Costa 1985). Observation: I(X + √t Z) is convex in t. For the pure Gaussian case, h(√t Z) = (1/2) ln 2πet and I(√t Z) = 1/t, so I is CM: +, −, +, −, … If the observation is true, the first three derivatives have signs +, −, +. Q: is the 4th-order derivative −? Because Z is Gaussian, the signs of the derivatives of h(X + √t Z) should be independent of X — an invariant! Exactly the same problem as in McKean 1966. To convince people, we must prove the convexity.
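The pure Gaussian benchmark can be checked symbolically: the derivatives of I(t) = 1/(σ² + t) do alternate in sign, which is the pattern the conjecture extrapolates to arbitrary X (a sketch; σ² is the variance of a hypothetical Gaussian input):

```python
import sympy as sp

t = sp.symbols('t', positive=True)
sigma2 = sp.symbols('sigma2', positive=True)

# Gaussian input X ~ N(0, sigma2): I(X + sqrt(t) Z) = 1/(sigma2 + t)
I = 1 / (sigma2 + t)

# Record the sign of I and its first four derivatives in t
signs = []
g = I
for k in range(5):
    if g.is_positive:
        signs.append('+')
    elif g.is_negative:
        signs.append('-')
    else:
        signs.append('?')
    g = sp.diff(g, t)
```

The resulting pattern is +, −, +, −, +: exactly the alternation that defines complete monotonicity.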
Challenge. Let X ∼ g(x). h(Y_t) = −∫ f(y,t) ln f(y,t) dy has no closed form except for some special g(x); f(y,t) satisfies the heat equation. Writing f_k for ∂^k f/∂y^k: I(Y_t) = ∫ f₁²/f dy, and I^(1)(Y_t) = −∫ f (f₂/f − f₁²/f²)² dy (via the heat equation and integration by parts). So what is I^(2)?
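For the tractable special case X = 0 (so Y_t = √t Z is exactly Gaussian), both integral expressions can be evaluated symbolically and cross-checked against each other (a sketch of the consistency check, not the general computation, which has no closed form):

```python
import sympy as sp

y = sp.symbols('y', real=True)
t = sp.symbols('t', positive=True)

# Special case X = 0: Y_t = sqrt(t) Z has the N(0, t) density
f = sp.exp(-y**2 / (2 * t)) / sp.sqrt(2 * sp.pi * t)

# Fisher information I(Y_t) = integral of f_1^2 / f
I = sp.simplify(sp.integrate(sp.diff(f, y)**2 / f, (y, -sp.oo, sp.oo)))

# First derivative via the integral formula:
# I^(1)(Y_t) = -integral of f * ((ln f)'')^2, with (ln f)'' = f_2/f - f_1^2/f^2
I1 = sp.simplify(-sp.integrate(f * sp.diff(sp.log(f), y, 2)**2,
                               (y, -sp.oo, sp.oo)))

# Consistency: differentiating I in t must reproduce the integral formula
check = sp.simplify(sp.diff(I, t) - I1) == 0
```

Here I = 1/t and I^(1) = −1/t², and d I/dt agrees with the integral expression, which is the sanity check one wants before attacking I^(2).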
Challenge (cont'd). It is trivial to calculate the derivatives; it is hard to prove their signs.
Breakthrough. Integration by parts: ∫ u dv = uv − ∫ v du. The first breakthrough since McKean 1966.
GCMC. Gaussian complete monotonicity conjecture: I(X + √t Z) is CM in t. Conjecture: log I(X + √t Z) is convex in t. C. Villani and G. Toscani pointed out the connection with McKean's paper. A general form involves number partitions; it is hard to determine the coefficients β_{k,j}!
Complete monotone function. How to construct g(x)? A new expression for entropy involving special functions from mathematical physics (Herbert R. Stahl, 2013).
Complete monotone function. If a function f(t) is CM, then log f(t) is convex in t; so if I(Y_t) is CM in t, then log I(Y_t) is convex in t. If a function f(t) is CM, a Schur-convex function can be obtained from it: Schur convexity, majorization theory. Remarks: the current tools of information theory don't work; more sophisticated tools must be built to attack this problem — a new mathematical theory of information.
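The implication "CM ⇒ log-convex" can be illustrated on a concrete CM function (a sketch; 1/(1+t) is a hypothetical stand-in for I(Y_t), chosen because it is CM for t > 0):

```python
import sympy as sp

t = sp.symbols('t', positive=True)

# f(t) = 1/(1+t) is completely monotone on t > 0,
# so log f(t) = -log(1+t) should be convex in t.
f = 1 / (1 + t)
second = sp.diff(sp.log(f), t, 2)  # second derivative of log f

# Convexity: the second derivative is 1/(1+t)^2 > 0 for all t > 0
log_convex = sp.simplify(second - 1 / (1 + t)**2) == 0
```

The positive second derivative confirms log-convexity, which is exactly the weaker conjecture (log I convex) stated alongside the GCMC.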
Potential application: interference channel. A challenging question: what is the application of GCMC? Mathematically speaking, a beautiful result on a fundamental problem will be very useful. Potential applications — where the EPI works: central limit theorem, capacity region of the Gaussian broadcast channel, capacity region of the Gaussian Multiple-Input Multiple-Output broadcast channel, uncertainty principle. Where the EPI fails: the Gaussian interference channel, open since the 1970s. CM is considered to be much more powerful than the EPI.
Remarks. If GCMC is true: a fundamental breakthrough in mathematical physics, information theory, and any discipline related to the Gaussian distribution; a new expression for Fisher information; the derivative signs are an invariant — though h(X + √t Z) looks very messy, certain regularity exists; application: the Gaussian interference channel? If GCMC is false: no failure, as the heat equation is a physical phenomenon; there is some lucky order (e.g., 2019) at which the Gaussian distribution fails. Painful!