ENTROPY VERSUS VARIANCE FOR SYMMETRIC LOG-CONCAVE RANDOM VARIABLES AND RELATED PROBLEMS

MOKSHAY MADIMAN, PIOTR NAYAR, AND TOMASZ TKOCZ

Abstract. We show that the uniform distribution minimises entropy among all symmetric log-concave distributions with fixed variance. We construct a counterexample regarding monotonicity and entropy comparison of weighted sums of independent identically distributed log-concave random variables.

2010 Mathematics Subject Classification. Primary 94A17; Secondary 60E15.
Key words: entropy, log-concave function, monotonicity of entropy, reverse entropy power inequality.

1. Introduction

It is a classical fact going back to Boltzmann [8] that when the variance of a real-valued random variable $X$ is kept fixed, the differential entropy is maximized by taking $X$ to be Gaussian. As is standard in information theory, we use the definition of Shannon [20]: the differential entropy (or simply entropy, henceforth, since we have no need to deal with discrete entropy in this note) of a random vector $X$ with density $f$ is defined as
$$h(X) = -\int_{\mathbb{R}^n} f \log f,$$
provided that this integral exists, this definition having a minus sign relative to Boltzmann's $H$-functional. It is easy to see that if one tried to minimize the entropy instead of maximizing it, there is no minimum among random variables with densities: indeed, a discrete random variable with variance $1$ has differential entropy $-\infty$, and densities of probability measures approaching such a discrete distribution in an appropriate topology would have differential entropies converging to $-\infty$ as well. Nonetheless, it is of significant interest to identify minimizers of entropy within structured subclasses of probability measures. For instance, it was observed independently by Keith Ball (unpublished) and in [7] that the question of minimizing entropy under a covariance matrix constraint within the class of log-concave measures on $\mathbb{R}^n$ is intimately tied to the well known hyperplane (or slicing) conjecture in convex geometry. More generally,
log-concave distributions emerge naturally from the interplay between information theory and convex geometry, and have recently been a very fruitful and active topic of research (see the recent survey [19]). A probability density $f$ on $\mathbb{R}$ is called log-concave if it is of the form $f = e^{-V}$ for a convex function $V : \mathbb{R} \to \mathbb{R} \cup \{\infty\}$. Our first goal in this note is to establish some sharp inequalities relating the entropy (and, in fact, the more general class of Rényi entropies) to moments of log-concave distributions. Our second goal is to falsify a conjecture about Schur-concavity of weighted sums of i.i.d. random variables from a log-concave distribution, motivated by connections to the monotonicity of entropy in the central limit theorem as well as

(MM was supported in part by the U.S. National Science Foundation through the grant DMS-4954. PN was partially supported by the National Science Centre Poland grant 25/8/A/ST/553. The research leading to these results is part of a project that has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme, grant agreement No 63785.)

by questions about capacity-achieving distributions for additive non-Gaussian noise channels. Finally, we make an unconnected remark about a variant, for complex-valued random variables, of a recent entropy power inequality for dependent random variables of Hao and Jog [13].

First, we show that among all symmetric log-concave probability distributions on $\mathbb{R}$ with fixed variance, the uniform distribution has minimal entropy. In fact, we obtain a slightly more general result.

Theorem 1. Let $X$ be a symmetric log-concave random variable and $p \in (0, 2]$. Then
$$h(X) \geq \log\left[\left(\mathbb{E}|X|^p\right)^{1/p}\right] + \log\left[2(p+1)^{1/p}\right],$$
with equality if and only if $X$ is a uniform random variable.

It is instructive to write this inequality using the entropy power $N(X) = e^{2h(X)}$, in which case it becomes
$$N(X) \geq 4(p+1)^{2/p} \left(\mathbb{E}|X|^p\right)^{2/p}.$$
In the special case $p = 2$, corresponding to the variance, we have the sandwich inequality
$$12\,\mathrm{Var}(X) \leq N(X) \leq (2\pi e)\,\mathrm{Var}(X),$$
with both inequalities being sharp in the class of symmetric log-concave random variables (the one on the left, coming from Theorem 1, giving equality uniquely for the uniform distribution, while the one on the right, coming from the maximum entropy property of the Gaussian, giving equality uniquely for the Gaussian distribution). Note that $2\pi e \approx 17.08$, so the range of entropy power given variance is quite constrained for symmetric log-concave random variables.

We note that Theorem 1 can be viewed as a sharp version, in the symmetric case, of some of the estimates from [15]. However, finding the sharp version is quite delicate and one needs significantly more sophisticated methods, as explained below. For related upper bounds on the variance in terms of the entropy for mixtures of densities of the form $e^{-|t|^\alpha}$, see the recent work [9].

Our argument comprises two main steps: first we reduce the problem to simple random variables (compactly supported, piecewise exponential density), using ideas and techniques developed by Fradelizi and Guédon [12] in order to elucidate the sophisticated localization technique of Lovász and Simonovits [14]. Then, in order to verify the inequality for such random variables, we prove a two-point inequality.

To motivate our next result, we first recall the notion of Schur majorisation. One vector $a = (a_1, \ldots, a_n)$ in $\mathbb{R}^n$ is majorised by another one $b = (b_1, \ldots, b_n)$, usually denoted $a \prec b$, if the nonincreasing rearrangements $a_1^* \geq \cdots \geq a_n^*$ and $b_1^* \geq \cdots \geq b_n^*$ of $a$ and $b$ satisfy the inequalities $\sum_{j=1}^k a_j^* \leq \sum_{j=1}^k b_j^*$ for each $1 \leq k \leq n$, and $\sum_{j=1}^n a_j = \sum_{j=1}^n b_j$. For instance, any vector $a$ with nonnegative coordinates adding up to $1$ is majorised by the vector $(1, 0, \ldots, 0)$ and majorises the vector $(\frac{1}{n}, \ldots, \frac{1}{n})$. It was conjectured in [4] that for two independent identically distributed log-concave random variables $X_1, X_2$, the function $\lambda \mapsto h\left(\sqrt{\lambda}\,X_1 + \sqrt{1-\lambda}\,X_2\right)$ is nondecreasing on $[0, \frac{1}{2}]$. A way to generalise this is to ask whether for any $n$ i.i.d. copies $X_1, \ldots, X_n$ of a log-concave random variable we have
$$h\left(\sum a_i X_i\right) \geq h\left(\sum b_i X_i\right), \quad \text{provided that } (a_1^2, \ldots, a_n^2) \prec (b_1^2, \ldots, b_n^2).$$
This property (called Schur-concavity in the literature) holds for symmetric Gaussian mixtures, as recently shown in [11]. We suspect it holds for uniform random variables (see also [11]), but here we prove it does not hold in general for log-concave symmetric distributions, for the following reason: since
$$\left(1, \tfrac{1}{n+1}, \ldots, \tfrac{1}{n+1}\right) \prec \left(1, \tfrac{1}{n}, \ldots, \tfrac{1}{n}, 0\right),$$
if it held, then the sequence $h\left(X_1 + \frac{X_2 + \cdots + X_{n+1}}{\sqrt n}\right)$ would be nondecreasing, and as it converges to $h(X_1 + G)$, where

$G$ is an independent Gaussian random variable with the same variance as $X_1$, we would have in particular that $h\left(X_1 + \frac{X_2 + \cdots + X_{n+1}}{\sqrt n}\right) \leq h(X_1 + G)$. We construct examples where the opposite holds.

Theorem 2. There exists a symmetric log-concave random variable $X$ with variance $1$ such that if $X_1, X_2, \ldots$ are its independent copies and $n$ is large enough, we have
$$h\left(X_1 + \frac{X_2 + \cdots + X_{n+1}}{\sqrt n}\right) > h(X_1 + Z),$$
where $Z$ is a standard Gaussian random variable, independent of the $X_i$.

Our proof is based on the sophisticated and remarkable Edgeworth type expansions recently developed in [5] en route to obtaining precise rates of convergence in the entropic central limit theorem. Theorem 2 can be compared with the celebrated monotonicity of entropy property proved in [3] (see [16, 21, 10] for simpler proofs and [17, 18] for extensions), which says that for any random variable $X$ the sequence $h\left(\frac{X_1 + \cdots + X_n}{\sqrt n}\right)$ is nondecreasing, where $X_1, \ldots, X_n$ are its independent copies. Theorem 2 also has consequences for the understanding of capacity-achieving distributions for additive noise channels with non-Gaussian noise.

Our theorem also provides an example of two independent symmetric log-concave random variables $X$ and $Y$ with the same variance such that $h(X + Y) > h(X + Z)$, where $Z$ is a Gaussian random variable with the same variance as $X$ and $Y$, independent of them, which is again in contrast to symmetric Gaussian mixtures (see [11]). It is an interesting question, already posed in [1], whether in general, for two i.i.d. summands, swapping one for a Gaussian with the same variance increases entropy.

Our third result is a short remark regarding a recent inequality by Hao and Jog (see their paper [13] for motivation and proper discussion): if $X = (X_1, \ldots, X_n)$ is an unconditional random vector in $\mathbb{R}^n$, then $\frac{1}{n} h(X) \leq h\left(\frac{X_1 + \cdots + X_n}{\sqrt n}\right)$. Recall that a random vector is called unconditional if for every choice of signs $\eta_1, \ldots, \eta_n \in \{-1, +1\}$, the vector $(\eta_1 X_1, \ldots, \eta_n X_n)$ has the same distribution as $X$. We remark that a complex analogue of this inequality also holds, and the proof is essentially trivial thanks to the existence of unitary matrices all of whose entries have modulus $\frac{1}{\sqrt n}$.

Theorem 3. Let $X = (X_1, \ldots, X_n)$ be a random vector in $\mathbb{C}^n$ which is complex-unconditional, that is, for all complex numbers $z_1, \ldots, z_n$ with $|z_j| = 1$ for every $j$, the vector $(z_1 X_1, \ldots, z_n X_n)$ has the same distribution as $X$. Then
$$\frac{1}{n} h(X) \leq h\left(\frac{X_1 + \cdots + X_n}{\sqrt n}\right).$$

In the subsequent sections we present proofs of our theorems and provide some additional remarks.

2. Proof of Theorem 1

Let $\mathcal{F}$ be the set of all even log-concave probability density functions on $\mathbb{R}$. Define for $f \in \mathcal{F}$ the following functionals: the entropy
$$h(f) = -\int f \log f,$$
and the $p$-th moment
$$m_p(f) = \left(\int |x|^p f(x)\,dx\right)^{1/p}.$$
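Both the equality case and the strict inequality in Theorem 1 can be spot-checked from closed forms. The sketch below (Python, standard library only; an illustration, not part of any proof) uses the uniform distribution on $[-1,1]$, for which $h = \log 2$ and $\mathbb{E}|X|^p = \frac{1}{p+1}$, and the two-sided exponential with density $\frac{1}{2}e^{-|x|}$, for which $h = 1 + \log 2$ and $\mathbb{E}|X|^p = \Gamma(p+1)$:

```python
import math

# gap(p) = h(X) - log (E|X|^p)^{1/p} - log[2(p+1)^{1/p}];
# Theorem 1 predicts gap = 0 for the uniform law and gap > 0 otherwise.

def gap_uniform(p):
    h = math.log(2.0)                       # uniform on [-1, 1]
    log_mp = -math.log(p + 1) / p           # log (E|X|^p)^{1/p} = -(1/p) log(p+1)
    return h - log_mp - math.log(2 * (p + 1) ** (1 / p))

def gap_exponential(p):
    h = 1 + math.log(2.0)                   # density e^{-|x|}/2
    log_mp = math.lgamma(p + 1) / p         # log (E|X|^p)^{1/p} = (1/p) log Gamma(p+1)
    return h - log_mp - math.log(2 * (p + 1) ** (1 / p))

gaps_u = [gap_uniform(p) for p in (0.5, 1.0, 2.0)]
gaps_e = [gap_exponential(p) for p in (0.5, 1.0, 2.0)]

# Entropy power sandwich at p = 2 for the two-sided exponential: Var = 2,
# N(X) = e^{2h} = 4 e^2, and 12 Var <= N <= 2*pi*e*Var should hold.
N = math.exp(2 * (1 + math.log(2.0)))
var = 2.0
sandwich = (12 * var <= N <= 2 * math.pi * math.e * var)
```

The uniform gap vanishes for every $p$, the exponential gap is strictly positive, and the entropy power of the two-sided exponential lands inside the interval $[12\,\mathrm{Var},\, 2\pi e\,\mathrm{Var}]$ from the sandwich inequality.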

Our goal is to show that
$$\inf_{\mathcal{F}} \left\{ h(f) - \log[m_p(f)] \right\} = \log\left[2(p+1)^{1/p}\right].$$

2.1. Reduction to bounded support. First we argue that it suffices to consider compactly supported densities. Let $\mathcal{F}_L$ be the set of all densities from $\mathcal{F}$ which are supported in the interval $[-L, L]$. Given $f \in \mathcal{F}$, by considering $f_L = f \mathbf{1}_{[-L,L]} / \int_{-L}^L f$, which is in $\mathcal{F}_L$, and checking that $h(f_L)$ and $m_p(f_L)$ tend to $h(f)$ and $m_p(f)$ as $L \to \infty$, we get
$$\inf_{\mathcal{F}} \left\{ h(f) - \log[m_p(f)] \right\} = \inf_{L > 0} \inf_{\mathcal{F}_L} \left\{ h(f) - \log[m_p(f)] \right\}.$$
This last infimum can be further rewritten as
$$\inf_{\alpha, L > 0}\ \Big[ \inf\left\{ h(f),\ f \in \mathcal{F}_L,\ m_p(f) = \alpha \right\} - \log \alpha \Big].$$
Consequently, to prove Theorem 1, it suffices to show that for every $\alpha, L > 0$ we have
$$\inf\left\{ h(f),\ f \in \mathcal{F}_L,\ m_p(f) = \alpha \right\} \geq \log \alpha + \log\left[2(p+1)^{1/p}\right].$$

2.2. Degrees of freedom. We shall argue that the last infimum is attained at densities $f$ which on $[0, \infty)$ are first constant and then decrease exponentially. Fix positive numbers $\alpha$ and $L$ and consider the set of densities
$$A = \left\{ f \in \mathcal{F}_L,\ m_p(f) = \alpha \right\}.$$
We treat $A$ as a subset of $L_1(\mathbb{R})$, which is a locally convex Hausdorff space (later on, this will be needed to employ Krein-Milman type theorems).

Step I. We show that $\inf_{f \in A} h(f)$ is finite and attained at a point from the set of the extremal points of $A$. Let us recall that a set $F$ is an extremal subset of $A \subset X$ ($X$ is a vector space) if it is nonempty and if, whenever $x = \lambda a + (1-\lambda) b$ for some $x \in F$, elements $a, b \in A$ and $\lambda \in (0,1)$, both $a$ and $b$ are in $F$. Notice that this definition does not require the convexity of $A$. Moreover, $a \in A$ is an extremal point of $A$ if $\{a\}$ is extremal. We remark that for a convex function $\Phi : A \to \mathbb{R}$, the set of points where its supremum is attained (if nonempty) is an extremal subset of $A$ (for instance, see Lemma 7.64 in [2]); we apply this to the convex functional $\Phi(f) = \int f \log f = -h(f)$. An application of Zorn's lemma together with a separation type theorem shows that every nonempty compact extremal subset of a locally convex Hausdorff vector space contains at least one extremal point (see Lemma 7.65 in [2]). Therefore, it remains to show that
(a) $\inf_{f \in A} h(f) > -\infty$,
(b) the set of the extremisers of entropy,
$$M = \left\{ f \in A,\ h(f) = \inf_{g \in A} h(g) \right\},$$
is nonempty and compact.

First we note a standard lemma.

Lemma 4. For every even log-concave function $f : \mathbb{R} \to [0, +\infty)$, we have
$$f(0)^p \int |x|^p f(x)\,dx \leq 2^{-p}\,\Gamma(p+1) \left(\int f(x)\,dx\right)^{p+1}.$$
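Before turning to the proof, the bound can be sanity-checked numerically (an illustrative sketch; the integrals are approximated by a crude midpoint rule). Equality is expected for the two-sided exponential $e^{-|x|}$, strict inequality for the Gaussian:

```python
import math

def lemma4_sides(f, f0, p, xmax=30.0, n=100000):
    # Midpoint rule on [0, xmax]; the factor 2 accounts for evenness of f.
    dx = xmax / n
    xs = [(i + 0.5) * dx for i in range(n)]
    mass = 2.0 * sum(f(x) for x in xs) * dx           # \int f
    moment = 2.0 * sum(x ** p * f(x) for x in xs) * dx  # \int |x|^p f
    lhs = f0 ** p * moment
    rhs = 2.0 ** (-p) * math.gamma(p + 1) * mass ** (p + 1)
    return lhs, rhs

# Standard Gaussian density: strict inequality expected.
gauss = lemma4_sides(lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi),
                     1 / math.sqrt(2 * math.pi), 1.0)
# Two-sided exponential e^{-|x|} (not normalised; the lemma allows that): equality.
expo = lemma4_sides(lambda x: math.exp(-x), 1.0, 1.0)
```

For $p = 1$ the Gaussian gives $\frac{1}{\pi} \approx 0.318$ on the left against $\frac{1}{2}$ on the right, while the two sides coincide for the exponential, matching the equality case of the lemma.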

Proof. By homogeneity we can assume that $f(0) = 1$. Consider $g(x) = e^{-a|x|}$ with $a$ chosen so that $\int g = \int f$. By log-concavity, there is exactly one sign change point $x_0$ of $f - g$ on $(0, \infty)$: $f \geq g$ on $[0, x_0]$ and $f \leq g$ beyond it. We have
$$\int |x|^p \left[f(x) - g(x)\right] dx = \int \left[|x|^p - x_0^p\right]\left[f(x) - g(x)\right] dx \leq 0,$$
since the integrand is nonpositive (here we used $\int (f - g) = 0$). It remains to verify the lemma for $g$, for which it holds with equality. □

To see (a), we observe that by Lemma 4 (with $\int f = 1$ and $m_p(f) = \alpha$, so that $f(0) \leq \Gamma(p+1)^{1/p}/(2\alpha)$) combined with the inequality $h(f) \geq -\log \|f\|_\infty = -\log f(0)$, we get $h(f) \geq \log \frac{2\alpha}{\Gamma(p+1)^{1/p}}$, which gives (a).

To see (b), let $\gamma = \inf_{g \in A} h(g)$ and take a sequence of functions $f_n$ from $A$ such that $h(f_n) \to \gamma$. To proceed we need another elementary lemma.

Lemma 5. Let $(f_n)_n$ be a sequence of functions in $A$. Then there exists a subsequence $(f_{n_k})_k$ converging almost everywhere to a function $f$ in $A$.

Proof. As noted above, the functions from $A$ are uniformly bounded (by Lemma 4) and thus, using a standard diagonal argument, by passing to a subsequence we can assume that $f_n(q)$ converges for every rational $q$ (in $[-L, L]$), say to $f(q)$. Notice that $f$ is log-concave on the rationals, that is,
$$f(\lambda q_1 + (1-\lambda) q_2) \geq f(q_1)^\lambda f(q_2)^{1-\lambda}$$
for all rationals $q_1, q_2$ and $\lambda \in [0, 1]$ such that $\lambda q_1 + (1-\lambda) q_2$ is also a rational. Moreover, $f$ is even and nonincreasing on $[0, L]$. Let
$$L_0 = \inf\{ q > 0,\ q \text{ rational},\ f(q) = 0 \}.$$
If $x > L_0$, then pick any rational $L_0 < q < x$ and observe that $f_n(x) \leq f_n(q) \to f(q) = 0$, so $f_n(x) \to 0$. The function $f$ is continuous on $[0, L_0)$: if $0 < x < L_0$, consider rationals $q_1, q_2, r$ such that $q_1 < x < q_2 < r < L_0$. Then, by monotonicity and log-concavity,
$$f(q_1) \geq f(q_2) \geq f(q_1)\left[\frac{f(r)}{f(q_1)}\right]^{\frac{q_2 - q_1}{r - q_1}} \geq f(q_1)\left[\frac{f(r)}{f(0)}\right]^{\frac{q_2 - q_1}{r - q_1}},$$
thus $\lim_{q_1 \to x^-} f(q_1) = \lim_{q_2 \to x^+} f(q_2)$ (these limits exist by the monotonicity of $f$). Now for any $\varepsilon > 0$, take rationals $q_1$ and $q_2$ such that $q_1 < x < q_2$ and $f(q_1) - f(q_2) < \varepsilon$. Since $f_n(q_2) \leq f_n(x) \leq f_n(q_1)$, we get
$$f(q_2) \leq \liminf f_n(x) \leq \limsup f_n(x) \leq f(q_1),$$
therefore $\limsup f_n(x) - \liminf f_n(x) \leq \varepsilon$. Thus $f_n(x)$ is convergent, say to $f(x)$. We also set $f(L_0) = 0$. Then $f_n$ converges to $f$ at all but the two points $\pm L_0$, and the function $f$ is even and log-concave. By Lebesgue's dominated convergence theorem, $f \in A$. □

By the lemma, $f_{n_k} \to f$ for some subsequence $(n_k)$ and $f \in A$. By the Lebesgue dominated convergence theorem, $\int f_{n_k} \log f_{n_k} \to \int f \log f$, so $f \in M$. To show that the set $M$ is compact, we repeat the same argument.

Step II. Every extremal point of $A$ has at most 2 degrees of freedom. Recall the notion of degrees of freedom of log-concave functions introduced in [12]. The degree of freedom of a log-concave function $g : \mathbb{R} \to [0, \infty)$ is the largest integer $k$ such that there exist $\delta > 0$ and linearly independent continuous functions $h_1, \ldots, h_k$, defined on $\{x \in \mathbb{R},\ g(x) > 0\}$, such that for every $(\varepsilon_1, \ldots, \varepsilon_k) \in [-\delta, \delta]^k$ the function $g + \sum_{i=1}^k \varepsilon_i h_i$ is log-concave. Suppose $f \in \mathrm{ext}(\mathrm{conv}(A)) \cap A$ and $f$ has more than two degrees of freedom. Then there are continuous functions $h_1, h_2, h_3$ (supported in $[-L, L]$) and $\delta > 0$ such that for all $\varepsilon_1, \varepsilon_2, \varepsilon_3 \in [-\delta, \delta]$ the function $f + \varepsilon_1 h_1 + \varepsilon_2 h_2 + \varepsilon_3 h_3$ is log-concave. Note that

the space of solutions $(\varepsilon_1, \varepsilon_2, \varepsilon_3)$ to the system of equations
$$\int (\varepsilon_1 h_1 + \varepsilon_2 h_2 + \varepsilon_3 h_3) = 0, \qquad \int |x|^p\, (\varepsilon_1 h_1 + \varepsilon_2 h_2 + \varepsilon_3 h_3)\,dx = 0$$
is of dimension at least $1$. Therefore this space intersected with the cube $[-\delta, \delta]^3$ contains a symmetric interval and, in particular, two antipodal points $(\eta_1, \eta_2, \eta_3)$ and $-(\eta_1, \eta_2, \eta_3)$. Take $f_+ = f + \eta_1 h_1 + \eta_2 h_2 + \eta_3 h_3$ and $f_- = f - \eta_1 h_1 - \eta_2 h_2 - \eta_3 h_3$, which are both in $A$. Then $f = \frac{1}{2}(f_+ + f_-)$, and therefore $f$ is not an extremal point.

Step III. Densities with at most 2 degrees of freedom are simple. We want to determine all nonincreasing log-concave functions $f = e^{-V}$ on $[0, \infty)$ with degree of freedom at most 2. Suppose $x_1 < x_2 < \cdots < x_n$ are points of differentiability of the potential $V = -\log f$ such that $0 < V'(x_1) < V'(x_2) < \cdots < V'(x_n)$. Define
$$V_i(x) = \begin{cases} V(x), & x < x_i \\ V(x_i) + (x - x_i) V'(x_i), & x \geq x_i. \end{cases}$$
We claim that $e^{-V}\left(1 + \delta_0 + \sum_{i=1}^n \delta_i V_i\right)$ is a log-concave nonincreasing function for $|\delta_i| \leq \varepsilon$, with $\varepsilon$ sufficiently small. To prove log-concavity we observe that on each interval the function is of the form $e^{-V(x)}(1 + \tau_1 + \tau_2 x + \tau_3 V(x))$; on the interval $[0, x_1]$ it is of the form $e^{-V}(1 + \tau V)$. Log-concavity follows from a lemma in [12]. We also have to ensure that the density is nonincreasing. On $[0, x_1]$ this follows from the fact that
$$\left(-V + \log(1 + \tau V)\right)' = -V'\left(1 - \frac{\tau}{1 + \tau V}\right) \leq 0$$
for small $\tau$. On the other intervals we have the similar expressions
$$-V'(x) + \frac{\tau_2 + \tau_3 V'(x)}{1 + \tau_1 + \tau_2 x + \tau_3 V(x)} < 0,$$
which follow from the fact that $V'(x) > \alpha$ for some $\alpha > 0$. From this it follows that if there are points $x_1 < x_2 < \cdots < x_n$ such that $0 < V'(x_1) < V'(x_2) < \cdots < V'(x_n)$, then $e^{-V}$ has degree of freedom at least $n + 1$. It follows that the only functions with degree of freedom at most 2 have potentials of the form
$$V(x) = \begin{cases} \beta, & x < a \\ \beta + \gamma(x - a), & x \in [a, a + b], \end{cases}$$
with $V = +\infty$ outside $[0, a + b]$.

2.3. A two-point inequality. It remains to show that for every density $f$ of the form
$$f(x) = c\,\mathbf{1}_{[0,a]}(|x|) + c\, e^{-\gamma(|x| - a)}\,\mathbf{1}_{(a, a+b]}(|x|),$$
where $c$ is a positive normalising constant and $a$, $b$ and $\gamma$ are nonnegative, we have
$$h(f) - \log m_p(f) \geq \log\left[2(p+1)^{1/p}\right],$$
with equality if and only if $f$ is uniform. If either $b$ or $\gamma$ is zero, then $f$ is a uniform density and we directly check that there is equality. Therefore let us from now on assume that both $b$ and $\gamma$ are positive, and we shall prove the strict inequality. Since the left-hand side does not change when $f(x)$ is replaced by $\lambda f(\lambda x)$ for any positive $\lambda$, we

shall assume that $\gamma = 1$. Then the condition $\int f = 1$ is equivalent to $2c(a + 1 - e^{-b}) = 1$. We have
$$h(f) = -2ac \log c - 2\int_0^b c e^{-x} \log(c e^{-x})\,dx = -2c(a + 1 - e^{-b})\log c + 2c\left(1 - (1+b)e^{-b}\right) = -\log c + \frac{1 - (1+b)e^{-b}}{a + 1 - e^{-b}}.$$
Moreover,
$$m_p^p(f) = 2c\left[\frac{a^{p+1}}{p+1} + \int_0^b (x + a)^p e^{-x}\,dx\right].$$
Putting these together yields
$$h(f) - \log m_p(f) - \log\left[2(p+1)^{1/p}\right] = \frac{1 - (1+b)e^{-b}}{a + 1 - e^{-b}} + \frac{p+1}{p}\log(a + 1 - e^{-b}) - \frac{1}{p}\log\left[a^{p+1} + (p+1)\int_0^b (x + a)^p e^{-x}\,dx\right].$$
Therefore, the proof of Theorem 1 is complete once we show the following two-point inequality.

Lemma 6. For nonnegative $s$, positive $t$ and $p \in (0, 2]$ we have
$$\log\left[s^{p+1} + (p+1)\int_0^t (s + x)^p e^{-x}\,dx\right] < (p+1)\log\left[s + 1 - e^{-t}\right] + p\,\frac{1 - (1+t)e^{-t}}{s + 1 - e^{-t}}.$$

Proof. Integrating by parts, we can rewrite the left-hand side as $\log\left[\int_0^t (s + x)^{p+1}\,d\mu(x)\right]$ for a Borel probability measure $\mu$ on $[0, t]$ which is absolutely continuous on $(0, t)$ with density $e^{-x}$ and has the atom $\mu(\{t\}) = e^{-t}$. With $s$ and $t$ fixed, the left-hand side is therefore a strictly convex function of $p$ (by Hölder's inequality), while the right-hand side is linear as a function of $p$. Thus it suffices to check the inequality for $p = 0$ and $p = 2$. For $p = 0$ the inequality becomes an equality. For $p = 2$, after computing the integral and exponentiating both sides, the inequality becomes
$$s^3 + 3(1 - e^{-t})s^2 + 6\left(1 - (1+t)e^{-t}\right)s + 3\left(2 - e^{-t}(t^2 + 2t + 2)\right) < a^3 e^{2b/a},$$
where we put $a = s + 1 - e^{-t}$ and $b = 1 - (1+t)e^{-t}$, which are positive. We lower-bound the right-hand side, using the estimate $e^x > 1 + x + \frac{1}{2}x^2 + \frac{1}{6}x^3$ for $x > 0$, by
$$a^3\left(1 + \frac{2b}{a} + \frac{2b^2}{a^2} + \frac{4}{3}\frac{b^3}{a^3}\right) = a^3 + 2a^2 b + 2ab^2 + \frac{4}{3}b^3.$$
Therefore it suffices to show that
$$s^3 + 3(1 - e^{-t})s^2 + 6\left(1 - (1+t)e^{-t}\right)s + 3\left(2 - e^{-t}(t^2 + 2t + 2)\right) \leq a^3 + 2a^2 b + 2ab^2 + \frac{4}{3}b^3.$$
After moving everything to one side, plugging in $a$ and $b$, expanding and simplifying, it becomes
$$2e^{-t} u(t)\,s^2 + e^{-2t} v(t)\,s + \frac{1}{3} e^{-3t} w(t) \geq 0,$$
where
$$u(t) = e^t - 1 - t,$$
$$v(t) = 3e^{2t} - 2te^t - 12e^t + 2t^2 + 8t + 9,$$
$$w(t) = e^{3t} + 3e^{2t}\left(3t^2 - 4t - 13\right) + 3e^t\left(6t^2 + 20t + 19\right) - 4t^3 - 18t^2 - 30t - 19.$$
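Before completing the proof, the inequality of Lemma 6 can be spot-checked numerically (an illustrative sketch only; the integral on the left is evaluated with a composite Simpson rule, and the near-equality as $p \to 0$ is also confirmed):

```python
import math

def simpson(fn, a, b, n=2000):
    # Composite Simpson rule on [a, b] with n (even) subintervals.
    h = (b - a) / n
    s = fn(a) + fn(b)
    s += 4 * sum(fn(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(fn(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

def lemma6_sides(s, t, p):
    integral = simpson(lambda x: (s + x) ** p * math.exp(-x), 0.0, t)
    lhs = math.log(s ** (p + 1) + (p + 1) * integral)
    a = s + 1 - math.exp(-t)
    b = 1 - (1 + t) * math.exp(-t)
    rhs = (p + 1) * math.log(a) + p * b / a
    return lhs, rhs

checks = [lemma6_sides(s, t, p)
          for s in (0.0, 0.5, 1.0, 2.0)
          for t in (0.5, 1.0, 3.0)
          for p in (0.5, 1.0, 2.0)]

# As p -> 0 both sides tend to log(s + 1 - e^{-t}), so the gap should vanish.
l0, r0 = lemma6_sides(1.0, 1.0, 1e-8)
```

On the sampled grid the left-hand side stays strictly below the right-hand side, consistent with the strict inequality claimed for $p \in (0, 2]$.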

It suffices to prove that these functions are nonnegative for $t \geq 0$. This is clear for $u$. For $v$, we check that $v(0) = v'(0) = v''(0) = 0$ and
$$\frac{1}{2} e^{-t} v'''(t) = 12e^t - t - 9 \geq 12(t + 1) - t - 9 = 11t + 3 \geq 3.$$
For $w$, we check that $w(0) = w'(0) = w''(0) = w'''(0) = 0$, $w^{(4)}(0) = 18$ and
$$\frac{1}{3} e^{-t} w^{(5)}(t) = 81 e^{2t} + 32 e^t\left(3t^2 + 11t - 8\right) + 6t^2 + 80t + 239 \geq 81 e^{2t} - 256 e^t + 239 = 81\left(e^t - \frac{128}{81}\right)^2 + \frac{2975}{81} \geq \frac{2975}{81}.$$
It follows that $v(t)$ and $w(t)$ are nonnegative for $t \geq 0$. □

Remark 7. If we put $s = 0$ and let $t \to \infty$ in the inequality from Lemma 6, we get the necessary condition $\log \Gamma(p + 2) \leq p$ (in particular, $p < 2.65$). We suspect that this necessary condition is also sufficient for the inequality to hold for all positive $s$ and $t$.

3. Proof of Theorem 2

Let us denote
$$Z_n = \frac{X_1 + \cdots + X_n}{\sqrt n},$$
let $p_n$ be the density of $Z_n$ and let $\varphi$ be the density of $Z$. Since $X$ is assumed to be log-concave, it satisfies $\mathbb{E}|X|^s < \infty$ for all $s > 0$. According to the Edgeworth-type expansion described in [5] (Theorem 3.2 in Chapter 3), for $m \leq s < m + 1$ we have
$$(1 + |x|^m)\left(p_n(x) - \varphi_m(x)\right) = o\left(n^{-\frac{s-2}{2}}\right) \quad \text{uniformly in } x,$$
where
$$\varphi_m(x) = \varphi(x) + \sum_{k=1}^{m-2} q_k(x)\, n^{-k/2}.$$
Here the functions $q_k$ are given by
$$q_k(x) = \varphi(x) \sum H_{k+2j}(x)\, \frac{1}{r_1! \cdots r_k!} \left(\frac{\gamma_3}{3!}\right)^{r_1} \cdots \left(\frac{\gamma_{k+2}}{(k+2)!}\right)^{r_k},$$
where $H_n$ are the Hermite polynomials,
$$H_n(x) = (-1)^n e^{x^2/2} \frac{d^n}{dx^n} e^{-x^2/2},$$
the summation runs over all nonnegative integer solutions $(r_1, \ldots, r_k)$ of the equation $r_1 + 2r_2 + \cdots + k r_k = k$, and one uses the notation $j = r_1 + \cdots + r_k$. The numbers $\gamma_k$ are the cumulants of $X$, namely
$$\gamma_k = i^{-k} \left.\frac{d^k}{dt^k} \log \mathbb{E} e^{itX}\right|_{t=0}.$$
Let us calculate $\varphi_4$. Under our assumptions (symmetry of $X$ and $\mathbb{E}X^2 = 1$), we have $\gamma_3 = 0$ and $\gamma_4 = \mathbb{E}X^4 - 3$. Therefore $q_1 = 0$ and
$$q_2 = \frac{1}{4!}\gamma_4\, \varphi H_4 = \frac{1}{4!}\gamma_4\, \varphi^{(4)}, \qquad \varphi_4 = \varphi + \frac{1}{4!\,n}\left(\mathbb{E}X^4 - 3\right)\varphi^{(4)}.$$
We get that for any $\varepsilon \in (0, 1)$,
$$(1 + x^4)\left(p_n(x) - \varphi_4(x)\right) = o\left(n^{-\frac{3 - \varepsilon}{2}}\right),$$
uniformly in $x$.

Let $f$ be the density of $X_1$. Let us assume that it is of the form $f = \varphi + \delta$, where $\delta$ is even, smooth and compactly supported (say, supported in $[-2, -1] \cup [1, 2]$) with

bounded derivatives. Moreover, we assume that $\frac{1}{2}\varphi \leq f \leq 2\varphi$ and that $\|\delta\|_\infty \leq \frac{1}{4}$. Multiplying $\delta$ by a very small constant we can ensure that $f$ is log-concave. We are going to use Theorem 3 from [6]. To check the assumptions of this theorem, we first observe that for any $\alpha > 1$ we have
$$D_\alpha(X_1 \,\|\, Z) = \frac{1}{\alpha - 1} \log \int \left(\frac{\varphi + \delta}{\varphi}\right)^\alpha \varphi < \infty,$$
since $\delta$ has bounded support. We also have to show that for a sufficiently big constant $\alpha_0 \geq 1$ we have $\mathbb{E} e^{tX} \leq e^{\alpha_0 t^2/2}$ for all $t$. Since $X$ is symmetric, we can assume that $t > 0$. Then, using the facts that $\int \delta = 0$, that $\delta$ has bounded support contained in $[-2, 2]$ and that $\|\delta\|_\infty \leq \frac{1}{4}$ (so that $\left|\int x^{2k} \delta(x)\,dx\right| \leq \frac{1}{2}\cdot 2^{2k}$ for $k \geq 1$), we get
$$\mathbb{E} e^{tX} = e^{t^2/2} + \sum_{k=1}^\infty \frac{t^{2k}}{(2k)!} \int x^{2k} \delta(x)\,dx \leq e^{t^2/2} + \frac{1}{2}\sum_{k=1}^\infty \frac{(2t)^{2k}}{(2k)!} \leq e^{t^2/2} + \frac{1}{2}\sum_{k=1}^\infty \frac{(4t^2)^k}{k!} = e^{t^2/2} + \frac{1}{2}\left(e^{4t^2} - 1\right) \leq e^{6t^2}.$$
We conclude that
$$|p_n(x) - \varphi(x)| \leq \frac{C_1}{\sqrt n}\, e^{-x^2/64},$$
and thus
$$p_n(x) \leq \varphi(x) + \frac{C_1}{\sqrt n}\, e^{-x^2/64} \leq C e^{-x^2/C}.$$
(In this proof $C$ and $c$ denote sufficiently large and sufficiently small universal constants that may change from one line to another. On the other hand, $C_1$, $c_1$ and $c_0$ denote constants that may depend on the distribution of $X$.) Moreover, for $|x| \leq c\sqrt{\log n}$ we have
$$|p_n(x) - \varphi(x)| \leq \frac{C_1}{\sqrt n}\, e^{-x^2/64} \leq \frac{1}{2}\varphi(x),$$
so that
$$p_n(x) \geq c_1 n^{-C_1} \quad \text{for } |x| \leq c\sqrt{\log n}.$$
Let us define $h_n = p_n - \varphi_4$. Note that $\varphi_4 = \varphi + \frac{c_0}{n}\varphi^{(4)}$, where $c_0 = \frac{1}{4!}\left(\mathbb{E}X^4 - 3\right)$. We have
$$\int (f * p_n) \log(f * p_n) = \int \left(f * \varphi + \frac{c_0}{n}\, f * \varphi^{(4)} + f * h_n\right) \log(f * p_n)$$
$$= \int (f * \varphi) \log(f * p_n) + \frac{c_0}{n} \int \left(f * \varphi^{(4)}\right) \log(f * p_n) + \int (f * h_n) \log(f * p_n) = I_1 + I_2 + I_3.$$
We first bound $I_3$. Note that
$$|(f * h_n)(x)| \leq 2\left(\varphi * |h_n|\right)(x) = o\left(n^{-5/4}\right) \int \frac{e^{-y^2/2}}{1 + (x - y)^4}\,dy.$$
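For illustration, the boundedness that the next estimate establishes analytically, namely that $(1 + x^4)\int e^{-y^2/2}\left(1 + (x-y)^4\right)^{-1}dy$ stays bounded in $x$, can be confirmed numerically (a rough sketch with a midpoint rule, not part of the argument):

```python
import math

def kernel(x, ymin=-20.0, ymax=20.0, n=8000):
    # Midpoint rule for  \int e^{-y^2/2} / (1 + (x - y)^4) dy.
    dy = (ymax - ymin) / n
    total = 0.0
    for i in range(n):
        y = ymin + (i + 0.5) * dy
        total += math.exp(-y * y / 2) / (1 + (x - y) ** 4)
    return total * dy

# (1 + x^4) * kernel(x) should stay bounded by a modest universal constant.
vals = [(1 + x ** 4) * kernel(x) for x in (0.0, 1.0, 2.0, 4.0, 6.0, 8.0)]
```

The sampled values stay well below a single modest constant, matching the bound $\int e^{-y^2/2}(1 + (x-y)^4)^{-1}\,dy \leq C/(1 + x^4)$ used below.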

Assuming without loss of generality that $x > 0$, we have
$$\int \frac{e^{-y^2/2}}{1 + (x - y)^4}\,dy = \int_{y \in [\frac{x}{2}, 2x]} + \int_{y \notin [\frac{x}{2}, 2x]} \leq \frac{3}{2}\,x\, e^{-x^2/8} + \frac{16}{x^4}\int e^{-y^2/2}\,dy \leq \frac{C}{1 + x^4}.$$
We also have
$$(f * p_n)(x) \leq 2(\varphi * p_n)(x) \leq C\left(\varphi * e^{-y^2/C}\right)(x) \leq C.$$
Moreover, assuming without loss of generality that $x > 0$,
$$(f * p_n)(x) \geq \frac{1}{2}(\varphi * p_n)(x) \geq c_1 n^{-C_1} \int_{|y| \leq c\sqrt{\log n}} e^{-(x - y)^2/2}\,dy \geq c_1 n^{-C_1} e^{-x^2} \int_{|y| \leq c\sqrt{\log n}} e^{-y^2}\,dy \geq c_1 n^{-C_1} e^{-x^2},$$
where we used $(x - y)^2 \leq 2x^2 + 2y^2$. Thus
$$\left|\log (f * p_n)(x)\right| \leq C_1\left(\log n + x^2\right),$$
and as a consequence,
$$I_3 = o\left(n^{-5/4}\right) \int \frac{C_1(\log n + x^2)}{1 + x^4}\,dx = o\left(n^{-5/4} \log n\right) = o\left(n^{-1}\right).$$
For $I_2$, fix $\beta > 0$ and observe that, since $|f * \varphi^{(4)}| \leq 2\,\varphi * |\varphi^{(4)}| \leq C(1 + x^4)e^{-x^2/4}$,
$$\int_{|x| > \beta\sqrt{\log n}} \left|f * \varphi^{(4)}\right| \left|\log(f * p_n)\right| \leq \int_{|x| > \beta\sqrt{\log n}} C(1 + x^4) e^{-x^2/4}\, C_1\left(\log n + x^2\right) dx = o(1).$$
Hence,
$$\int \left(f * \varphi^{(4)}\right) \log(f * p_n) = \int_{|x| \leq \beta\sqrt{\log n}} \left(f * \varphi^{(4)}\right) \log(f * p_n) + o(1).$$
Writing $p_n = \varphi + r_n$, we get
$$\int_{|x| \leq \beta\sqrt{\log n}} \left(f * \varphi^{(4)}\right) \log(f * p_n) = \int_{|x| \leq \beta\sqrt{\log n}} \left(f * \varphi^{(4)}\right) \left[\log(f * \varphi) + \log\left(1 + \frac{f * r_n}{f * \varphi}\right)\right].$$
Here
$$\int_{|x| \leq \beta\sqrt{\log n}} \left(f * \varphi^{(4)}\right) \log(f * \varphi) = \int \left(f * \varphi^{(4)}\right) \log(f * \varphi) + o(1)$$
and
$$\int_{|x| \leq \beta\sqrt{\log n}} \left(f * \varphi^{(4)}\right) \log\left(1 + \frac{f * r_n}{f * \varphi}\right) = o(1),$$
since for $|x| \leq \beta\sqrt{\log n}$ with $\beta$ sufficiently small we have
$$\frac{|f * r_n|}{f * \varphi} \leq \frac{2\,\varphi * |r_n|}{\frac{1}{2}\,\varphi * \varphi} \leq \frac{C_1 n^{-1/2} e^{-x^2/C}}{c\, e^{-x^2/4}} \leq C_1 n^{-1/2} e^{C x^2} \leq C_1 n^{-1/2 + C\beta^2} \longrightarrow 0.$$

By Jensen s inequality, I = f ϕ log f p n f ϕ log f ϕ Putting these things together we get f p n log f p n f ϕ log f ϕ + c f ϕ) 4) logf ϕ) + on ) n This is HX + Z) HX + Z n ) + n 4! EX4 3) f ϕ) 4) logf ϕ) + on ) It is therefore enough to construct X satisfying all previous conditions) such that EX 4 3) f ϕ) 4) logf ϕ) < It actually suffices to construct g such that g = gx 2 = gx 4 = but the function f = ϕ + εg satisfies f ϕ) logf ϕ) > for small ε Then we perturb g a bit to get EX 4 < 3 instead of EX4 = 3 This can be done without affecting log-concavity Let ϕ 2 = ϕ ϕ We have f ϕ) logf ϕ) = ϕ 2 + εϕ g) logϕ 2 + εϕ g) ϕ 2 + εϕ g) logϕ 2 ) + ε ϕ g ) ) ϕ g 2 ϕ 2 2 ε2 ϕ 2 The leading term and the term in front of ε vanish thanks to g being orthogonal to, x,, x 4 ) The term in front of ε 2 is equal to ϕ g) ϕ g) J = ϕ 2 ϕ g)2 ϕ 2 2 ϕ 2 = J J 2 2 The first integral is equal to J = 2 πe x2 /4 g s)gt) /2 2π e x s)2 e x t)2 /2 dxdsdt Now, Therefore, Moreover, so we get J 2 = 2 πe x2 /4 /2 2π e x s)2 e x t)2 /2 dx = 2e 6 s 2 +4st t 2 ) 3 J = 2 3 ϕ 2 ϕ 2 2 = e 6 s 2 +4st t 2 ) g s)gt)dsdt π 6 2 2x2 + x 4 )e x2 /4, π 6 2 2x2 + x 4 )e x2 /4 gs)gt) 2π e x s)2 /2 e x t)2 /2 dxdsdt

Since
$$\int \frac{\sqrt{\pi}}{8}\left(12 - 12x^2 + x^4\right) e^{x^2/4}\, \frac{1}{2\pi}\, e^{-(x-s)^2/2}\, e^{-(x-t)^2/2}\,dx = \frac{2}{81\sqrt 3}\, e^{\frac{1}{6}\left(-s^2 + 4st - t^2\right)} \left[27 + (s+t)^2\left((s+t)^2 - 18\right)\right],$$
we arrive at
$$J_2 = \frac{2}{81\sqrt 3} \iint e^{\frac{1}{6}\left(-s^2 + 4st - t^2\right)} \left[27 + (s+t)^2\left((s+t)^2 - 18\right)\right] g(s)\, g(t)\,ds\,dt.$$
If we integrate the first integral four times by parts, we get
$$J_1 = \frac{2}{81\sqrt 3} \iint e^{\frac{1}{6}\left(-s^2 + 4st - t^2\right)} \left[27 + s^4 - 8s^3 t - 72t^2 + 16t^4 + 8st\left(9 - 4t^2\right) + 6s^2\left(4t^2 - 3\right)\right] g(s)\, g(t)\,ds\,dt.$$
Putting these two things together we get
$$J = \frac{1}{81\sqrt 3} \iint e^{\frac{1}{6}\left(-s^2 + 4st - t^2\right)} \left[27 + s^4 - 20s^3 t - 126t^2 + 31t^4 + 6s^2\left(7t^2 - 3\right) + s\left(180t - 68t^3\right)\right] g(s)\, g(t)\,ds\,dt.$$
The function
$$g(s) = \left(208 s^3 - 945 s^2 + 1400 s - 675\right) \mathbf{1}_{[1,2]}(s)$$
(suitably normalised) satisfies $\int g = \int g x^2 = \int g x^4 = 0$. With this choice of $g$, the integral $J$ can be partially computed in closed form, the remaining one-dimensional integral being expressible in terms of the function $\bar\Phi(x) = \frac{2}{\sqrt \pi}\int_x^\infty e^{-u^2}\,du$. Numerical computations show that $J$ is non-zero, which completes the proof.

4. Proof of Theorem 3

Take any $n \times n$ unitary matrix $U$ all of whose entries are complex numbers of the same modulus $\frac{1}{\sqrt n}$, say $U = \frac{1}{\sqrt n}\left[e^{2\pi i k l / n}\right]_{k,l}$. By complex-unconditionality, each coordinate of the vector $UX$ has the same distribution as $\frac{X_1 + \cdots + X_n}{\sqrt n}$. Therefore, by subadditivity of entropy,
$$h(X) = h(UX) \leq \sum_{j=1}^n h\left((UX)_j\right) = n\, h\left(\frac{X_1 + \cdots + X_n}{\sqrt n}\right),$$
which finishes the proof.

Remark 8. In the real case, this simple idea only works in dimensions $n$ for which there exist Hadamard matrices. It would be interesting to find a way around it, for instance by reducing to such dimensions or by deriving the real version from the complex one.
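The matrix used in the proof of Theorem 3 can be verified directly. The sketch below (illustrative; standard library only) checks, for $n = 5$, that the DFT-type matrix $U = \frac{1}{\sqrt n}\left(e^{2\pi i k l/n}\right)_{k,l}$ has rows that are orthonormal (so $U$ is unitary and $|\det U| = 1$, whence $h(UX) = h(X)$) and that all its entries share the modulus $\frac{1}{\sqrt n}$:

```python
import cmath
import math

n = 5
U = [[cmath.exp(2j * math.pi * k * l / n) / math.sqrt(n) for l in range(n)]
     for k in range(n)]

# All entries have the same modulus 1/sqrt(n).
moduli_ok = all(abs(abs(U[k][l]) - 1 / math.sqrt(n)) < 1e-12
                for k in range(n) for l in range(n))

def inner(k, m):
    # Hermitian inner product of rows k and m.
    return sum(U[k][l] * U[m][l].conjugate() for l in range(n))

# U U* = I: rows are orthonormal, i.e. U is unitary.
unitary_ok = all(abs(inner(k, m) - (1 if k == m else 0)) < 1e-12
                 for k in range(n) for m in range(n))
```

The orthonormality is just the geometric-sum identity $\frac{1}{n}\sum_l e^{2\pi i (k-m)l/n} = \delta_{km}$; in the real case only $\pm\frac{1}{\sqrt n}$ entries are available, which is exactly the Hadamard-matrix obstruction mentioned in Remark 8.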

Acknowledgments

We learned about the Edgeworth expansion used in our proof of Theorem 2 from S. Bobkov during the AIM workshop "Entropy power inequalities". We are immensely grateful to him, as well as to AIM and the organisers of the workshop, which was a valuable and unique research experience for us.

References

[1] AimPL: Entropy power inequalities, available at http://aimpl.org/entropypower
[2] C. D. Aliprantis and K. C. Border, Infinite dimensional analysis: a hitchhiker's guide. Third edition, Springer, Berlin, 2006.
[3] S. Artstein, K. M. Ball, F. Barthe and A. Naor, Solution of Shannon's problem on the monotonicity of entropy. J. Amer. Math. Soc. 17 (2004), no. 4, 975-982.
[4] K. Ball, P. Nayar and T. Tkocz, A reverse entropy power inequality for log-concave random vectors. Studia Math. 235 (2016), no. 1, 17-30.
[5] S. G. Bobkov, G. P. Chistyakov and F. Götze, Rate of convergence and Edgeworth-type expansion in the entropic central limit theorem. Ann. Probab. 41 (2013), no. 4, 2479-2512.
[6] S. G. Bobkov, G. P. Chistyakov and F. Götze, Rényi divergence and the central limit theorem. Preprint (2016).
[7] S. Bobkov and M. Madiman, The entropy per coordinate of a random vector is highly constrained under convexity conditions. IEEE Trans. Inform. Theory 57 (2011), no. 8, 4940-4954.
[8] L. Boltzmann, Lectures on gas theory. Reprint of the 1896-1898 edition, Dover Publications, 1995.
[9] H. W. Chung, B. M. Sadler and A. O. Hero, Bounds on variance for unimodal distributions. IEEE Trans. Inform. Theory 63 (2017), 6936-6949.
[10] T. A. Courtade, Bounds on the Poincaré constant for convolution measures. Preprint (2018), arXiv:8727.
[11] A. Eskenazis, P. Nayar and T. Tkocz, Gaussian mixtures: entropy and geometric inequalities. To appear in Ann. Probab.
[12] M. Fradelizi and O. Guédon, A generalized localization theorem and geometric inequalities for convex bodies. Adv. Math. 204 (2006), no. 2, 509-529.
[13] J. Hao and V. Jog, An entropy inequality for symmetric random variables. Preprint (2018), arXiv:83868.
[14] L. Lovász and M. Simonovits, Random walks in a convex body and an improved volume algorithm. Random Structures Algorithms 4 (1993), no. 4, 359-412.
[15] A. Marsiglietti and V. Kostina, A lower bound on the differential entropy of log-concave random vectors with applications. Entropy 2018, 20, 185.
[16] M. Madiman and A. R. Barron, The monotonicity of information in the central limit theorem and entropy power inequalities. In Proc. IEEE Intl. Symp. Inform. Theory, Seattle, July 2006.
[17] M. Madiman and A. R. Barron, Generalized entropy power inequalities and monotonicity properties of information. IEEE Trans. Inform. Theory 53 (2007), no. 7, 2317-2329.
[18] M. Madiman and F. Ghassemi, Combinatorial entropy power inequalities: a preliminary study of the Stam region. IEEE Trans. Inform. Theory (to appear), 2018. Available online at arXiv:7477.
[19] M. Madiman, J. Melbourne and P. Xu, Forward and reverse entropy power inequalities in convex geometry. In Convexity and Concentration, IMA Vol. Math. Appl. 161, pp. 427-485, Springer, 2017.
[20] C. E. Shannon, A mathematical theory of communication. Bell System Tech. J. 27 (1948), 379-423, 623-656.
[21] A. M. Tulino and S. Verdú, Monotonic decrease of the non-Gaussianness of the sum of independent random variables: a simple proof. IEEE Trans. Inform. Theory 52 (2006), no. 9, 4295-4297.

University of Delaware (MM)
E-mail address: madiman@udel.edu

University of Warsaw (PN)
E-mail address: nayar@mimuw.edu.pl

Carnegie Mellon University (TT)
E-mail address: ttkocz@math.cmu.edu