Moment-entropy inequalities for a random vector

Erwin Lutwak, Deane Yang, and Gaoyong Zhang

Abstract

The p-th moment matrix is defined for a real random vector, generalizing the classical covariance matrix. Sharp inequalities relating the p-th moment and Renyi entropy are established, generalizing the classical inequality relating the second moment and the Shannon entropy. The extremal distributions for these inequalities are completely characterized.

I. INTRODUCTION

In [9] the authors demonstrated how the classical information theoretic inequality for the Shannon entropy and second moment of a real random variable could be extended to inequalities for Renyi entropy and the p-th moment. The extremals of these inequalities were also completely characterized. Moment-entropy inequalities, using Renyi entropy, for discrete random variables have also been obtained by Arikan [2].

We describe how to extend the definition of the second moment matrix of a real random vector to that of the p-th moment matrix. Using this, we extend the moment-entropy inequalities and the characterization of the extremal distributions proved in [9] to higher dimensions.

Variants and generalizations of the theorems presented can be found in work of the authors [8], [10], [11] and Bastero-Romance [3].

The authors would like to thank Christoph Haberl for his careful reading of this paper and valuable suggestions for improving it.

II. THE p-th MOMENT MATRIX OF A RANDOM VECTOR

A. Basic notation

Throughout this paper we denote:

$\mathbb{R}^n$ = n-dimensional Euclidean space
$x \cdot y$ = standard Euclidean inner product of $x, y \in \mathbb{R}^n$
$|x| = \sqrt{x \cdot x}$
$S$ = positive definite symmetric n-by-n matrices
$|A|$ = determinant of $A \in S$
$|K|$ = Lebesgue measure of $K \subset \mathbb{R}^n$.

The standard Euclidean ball in $\mathbb{R}^n$ will be denoted by $B$, and its volume by $\omega_n$.

Each inner product on $\mathbb{R}^n$ can be written uniquely as
\[ (x, y) \in \mathbb{R}^n \times \mathbb{R}^n \mapsto \langle x, y \rangle_A = Ax \cdot Ay, \]
for $A \in S$. The associated norm will be denoted by $|\cdot|_A$.

E. Lutwak (elutwak@poly.edu), D. Yang (dyang@poly.edu), and G. Zhang (gzhang@poly.edu) are with the Department of Mathematics, Polytechnic University, Brooklyn, New York, and were supported in part by NSF Grant DMS-0405707.
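Throughout this paper, $X$ denotes a random vector in $\mathbb{R}^n$. The probability measure on $\mathbb{R}^n$ associated with a random vector $X$ is denoted $m_X$.

We will denote the standard Lebesgue density on $\mathbb{R}^n$ by $dx$. By the density function $f_X$ of a random vector $X$, we mean the Radon-Nikodym derivative of the probability measure $m_X$ with respect to Lebesgue measure.

If $V$ is a vector space and $\Phi : \mathbb{R}^n \to V$ is a continuous function, then the expected value of $\Phi(X)$ is given by
\[ E[\Phi(X)] = \int_{\mathbb{R}^n} \Phi(x)\, dm_X(x). \]

We call a random vector $X$ nondegenerate if $E[|v \cdot X|] > 0$ for each nonzero $v \in \mathbb{R}^n$.

B. The p-th moment of a random vector

For $p \in (0, \infty)$, the standard p-th moment of a random vector $X$ is given by
\[ E[|X|^p] = \int_{\mathbb{R}^n} |x|^p\, dm_X(x). \tag{1} \]
More generally, the p-th moment with respect to the inner product $\langle \cdot, \cdot \rangle_A$ is
\[ E[|X|_A^p] = \int_{\mathbb{R}^n} |x|_A^p\, dm_X(x). \]

C. The p-th moment matrix

The second moment matrix of a random vector $X$ is defined to be
\[ M_2[X] = E[X \otimes X], \]
where for $v \in \mathbb{R}^n$, $v \otimes v$ is the linear transformation given by $x \mapsto (x \cdot v)v$. Recall that $M_2[X - E[X]]$ is the covariance matrix. An important observation is that the definition of the moment matrix does not use the inner product on $\mathbb{R}^n$.

A unique characterization of the second moment matrix is the following: Let $M = M_2[X]$. The inner product $\langle \cdot, \cdot \rangle_{M^{-1/2}}$ is the unique one whose unit ball has maximal volume among all inner products $\langle \cdot, \cdot \rangle_A$ that are normalized so that the second moment satisfies $E[|X|_A^2] = n$.

We extend this characterization to a definition of the p-th moment matrix $M_p[X]$ for all $p \in (0, \infty)$.

Theorem 1: If $p \in (0, \infty)$ and $X$ is a nondegenerate random vector in $\mathbb{R}^n$ with finite p-th moment, then there exists a unique matrix $A \in S$ such that
\[ E[|X|_A^p] = n \]
and
\[ |A| \ge |A'|, \]
for each $A' \in S$ such that $E[|X|_{A'}^p] = n$. Moreover, the matrix $A$ is the unique matrix in $S$ satisfying
\[ I = E[AX \otimes AX\, |AX|^{p-2}]. \]

We define the p-th moment matrix of a random vector $X$ to be $M_p[X] = A^{-p}$, where $A$ is given by the theorem above.

The proof of the theorem is given in Section IV.

III. MOMENT-ENTROPY INEQUALITIES

A. Entropy

The Shannon entropy of a random vector $X$ is defined to be
\[ h[X] = -\int_{\mathbb{R}^n} f_X \log f_X\, dx, \]
provided that the integral above exists. For $\lambda > 0$ the $\lambda$-Renyi entropy power of a density function is defined to be
\[ N_\lambda[X] = \begin{cases} \left( \int_{\mathbb{R}^n} f_X^\lambda\, dx \right)^{\frac{1}{1-\lambda}} & \text{if } \lambda \ne 1, \\ e^{h[X]} & \text{if } \lambda = 1, \end{cases} \]
provided that the integral above exists. Observe that
\[ \lim_{\lambda \to 1} N_\lambda[X] = N_1[X]. \]
The $\lambda$-Renyi entropy of a random vector $X$ is defined to be
\[ h_\lambda[X] = \log N_\lambda[X]. \]
The entropy $h_\lambda[X]$ is continuous in $\lambda$ and, by the Hölder inequality, decreasing in $\lambda$. It is strictly decreasing, unless $X$ is a uniform random vector.

It follows by the chain rule that
\[ N_\lambda[AX] = |A|\, N_\lambda[X], \tag{2} \]
for each $A \in S$.

B. Relative entropy

Given two random vectors $X, Y$ in $\mathbb{R}^n$, their relative Shannon entropy or Kullback-Leibler distance [6], [5], [1] (also, see page 231 in [4]) is defined by
\[ h_1[X, Y] = \int_{\mathbb{R}^n} f_X \log\left( \frac{f_X}{f_Y} \right) dx, \tag{3} \]
provided that the integral above exists. Given $\lambda > 0$, we define the relative $\lambda$-Renyi entropy power of $X$ and $Y$ as follows. If $\lambda \ne 1$, then
\[ N_\lambda[X, Y] = \frac{ \left( \int_{\mathbb{R}^n} f_Y^{\lambda-1} f_X\, dx \right)^{\frac{1}{1-\lambda}} \left( \int_{\mathbb{R}^n} f_Y^\lambda\, dx \right)^{\frac{1}{\lambda}} }{ \left( \int_{\mathbb{R}^n} f_X^\lambda\, dx \right)^{\frac{1}{\lambda(1-\lambda)}} }, \tag{4} \]
and, if $\lambda = 1$, then
\[ N_1[X, Y] = e^{h_1[X, Y]}, \]
provided in both cases that the right-hand side exists. Define the $\lambda$-Renyi relative entropy of random vectors $X$ and $Y$ by
\[ h_\lambda[X, Y] = \log N_\lambda[X, Y]. \]
Observe that $h_\lambda[X, Y]$ is continuous in $\lambda$.

Lemma 2: If $X$ and $Y$ are random vectors such that $h_\lambda[X]$, $h_\lambda[Y]$, and $h_\lambda[X, Y]$ are finite, then
\[ h_\lambda[X, Y] \ge 0. \]
Equality holds if and only if $X = Y$.

Proof: If $\lambda > 1$, then by the Hölder inequality,
\[ \int_{\mathbb{R}^n} f_Y^{\lambda-1} f_X\, dx \le \left( \int_{\mathbb{R}^n} f_Y^\lambda\, dx \right)^{\frac{\lambda-1}{\lambda}} \left( \int_{\mathbb{R}^n} f_X^\lambda\, dx \right)^{\frac{1}{\lambda}}, \]
and if $\lambda < 1$, then we have
\[ \int_{\mathbb{R}^n} f_X^\lambda = \int_{\mathbb{R}^n} \left( f_Y^{\lambda-1} f_X \right)^\lambda f_Y^{\lambda(1-\lambda)} \le \left( \int_{\mathbb{R}^n} f_Y^{\lambda-1} f_X \right)^{\lambda} \left( \int_{\mathbb{R}^n} f_Y^\lambda \right)^{1-\lambda}. \]
The inequality for $\lambda = 1$ follows by taking the limit $\lambda \to 1$.

The equality conditions for $\lambda \ne 1$ follow from the equality conditions of the Hölder inequality.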
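The inequality for $\lambda = 1$, including the equality condition, follows from the Jensen inequality (details may be found, for example, on page 234 in [4]).

C. Generalized Gaussians

We call the extremal random vectors for the moment-entropy inequalities generalized Gaussians and recall their definition here.

Given $t \in \mathbb{R}$, let
\[ t_+ = \max(t, 0). \]
Let
\[ \Gamma(t) = \int_0^\infty x^{t-1} e^{-x}\, dx \]
denote the Gamma function, and let
\[ \beta(a, b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a + b)} \]
denote the Beta function.

For each $p \in (0, \infty)$ and $\lambda \in (n/(n + p), \infty)$, define the standard generalized Gaussian to be the random vector $Z$ in $\mathbb{R}^n$ whose density function $f_Z : \mathbb{R}^n \to [0, \infty)$ is given by
\[ f_Z(x) = \begin{cases} a_{p,\lambda} \left( 1 + (1 - \lambda)|x|^p \right)_+^{1/(\lambda-1)} & \text{if } \lambda \ne 1, \\ a_{p,1}\, e^{-|x|^p} & \text{if } \lambda = 1, \end{cases} \tag{5} \]
where
\[ a_{p,\lambda} = \begin{cases} \dfrac{p(1-\lambda)^{n/p}}{n\omega_n\, \beta\left( \frac{n}{p}, \frac{1}{1-\lambda} - \frac{n}{p} \right)} & \text{if } \lambda < 1, \\[2ex] \dfrac{p}{n\omega_n\, \Gamma\left( \frac{n}{p} \right)} & \text{if } \lambda = 1, \\[2ex] \dfrac{p(\lambda-1)^{n/p}}{n\omega_n\, \beta\left( \frac{n}{p}, \frac{\lambda}{\lambda-1} \right)} & \text{if } \lambda > 1. \end{cases} \]

Any random vector $Y$ in $\mathbb{R}^n$ that can be written as $Y = AZ$, for some $A \in S$, is called a generalized Gaussian.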
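As a numerical sanity check on the normalizing constants above, the following Python sketch (an illustration added here, not part of the original paper) evaluates $a_{p,\lambda}$ from the three-case formula and integrates the density (5) in polar coordinates to confirm that the total mass is close to 1. The function names `unit_ball_volume`, `normalizing_constant`, and `radial_moment` are hypothetical, and the quadrature is a plain midpoint rule after the substitution $r = u/(1-u)$, chosen only to keep the example free of third-party dependencies.

```python
import math


def unit_ball_volume(n):
    """Volume omega_n of the unit Euclidean ball in R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def beta(a, b):
    """Beta function written through the Gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def normalizing_constant(n, p, lam):
    """a_{p,lambda} from the three-case formula in the text."""
    if p <= 0 or lam <= n / (n + p):
        raise ValueError("need p > 0 and lambda > n / (n + p)")
    surface = n * unit_ball_volume(n)  # area of the unit sphere
    if lam == 1:
        return p / (surface * math.gamma(n / p))
    if lam < 1:
        b = beta(n / p, 1 / (1 - lam) - n / p)
        return p * (1 - lam) ** (n / p) / (surface * b)
    b = beta(n / p, lam / (lam - 1))
    return p * (lam - 1) ** (n / p) / (surface * b)


def radial_moment(n, p, lam, power=0.0, steps=100_000):
    """Midpoint-rule estimate of E[|Z|^power]; power=0 gives the total mass.

    Works in polar coordinates and substitutes r = u / (1 - u), u in (0, 1),
    so that the half-line of radii becomes a bounded interval.
    """
    a = normalizing_constant(n, p, lam)
    surface = n * unit_ball_volume(n)
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) / steps
        r = u / (1 - u)
        if lam == 1:
            f = a * math.exp(-(r ** p))
        else:
            base = max(1 + (1 - lam) * r ** p, 0.0)  # the (.)_+ truncation
            f = a * base ** (1 / (lam - 1))
        total += surface * r ** (n - 1 + power) * f / (1 - u) ** 2
    return total / steps


if __name__ == "__main__":
    # Arbitrary sample parameters, one for each case of the formula.
    for n, p, lam in [(1, 2, 1), (2, 2, 0.8), (3, 1.5, 1.5)]:
        print(n, p, lam, radial_moment(n, p, lam))
```

Passing `power=p` estimates $E[|Z|^p]$, which can be compared against the closed form the paper gives for the p-th moment of $Z$. For $n = 1$, $p = 2$, $\lambda = 1$ the constant reduces to $1/\sqrt{\pi}$, the normalization of the density $e^{-x^2}$.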

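D. Information measures of generalized Gaussians

If $0 < p < \infty$ and $\lambda > n/(n + p)$, the $\lambda$-Renyi entropy power of the standard generalized Gaussian random vector $Z$ is given by
\[ N_\lambda[Z] = \begin{cases} \left( 1 + \dfrac{n(\lambda - 1)}{p\lambda} \right)^{\frac{1}{\lambda-1}} a_{p,\lambda}^{-1} & \text{if } \lambda \ne 1, \\[2ex] e^{n/p}\, a_{p,1}^{-1} & \text{if } \lambda = 1. \end{cases} \]
If $0 < p < \infty$ and $\lambda > n/(n + p)$, then the p-th moment of $Z$ is given by
\[ E[|Z|^p] = \left[ \lambda\left( 1 + \frac{p}{n} \right) - 1 \right]^{-1}. \]
We define the constant
\[ c(n, p, \lambda) = \frac{E[|Z|^p]^{1/p}}{N_\lambda[Z]^{1/n}} = a_{p,\lambda}^{1/n} \left[ \lambda\left( 1 + \frac{p}{n} \right) - 1 \right]^{-\frac{1}{p}} b(n, p, \lambda), \tag{6} \]
where
\[ b(n, p, \lambda) = \begin{cases} \left( 1 - \dfrac{n(1-\lambda)}{p\lambda} \right)^{\frac{1}{n(1-\lambda)}} & \text{if } \lambda \ne 1, \\[2ex] e^{-1/p} & \text{if } \lambda = 1. \end{cases} \]

Observe that if $\lambda \ne 1$ and $0 < p < \infty$, then
\[ \int_{\mathbb{R}^n} f_Z^\lambda = a_{p,\lambda}^{\lambda-1} \left( 1 + (1 - \lambda) E[|Z|^p] \right), \tag{7} \]
and if $\lambda = 1$, then
\[ h[Z] = -\log a_{p,1} + E[|Z|^p]. \tag{8} \]

We will also need the following scaling identities:
\[ f_{tZ}(x) = t^{-n} f_Z(t^{-1}x), \tag{9} \]
for each $x \in \mathbb{R}^n$. Therefore,
\[ \int_{\mathbb{R}^n} f_{tZ}^\lambda\, dx = t^{n(1-\lambda)} \int_{\mathbb{R}^n} f_Z^\lambda\, dx, \tag{10} \]
and
\[ E[|tZ|^p] = t^p E[|Z|^p]. \]

E. Spherical moment-entropy inequalities

The proof of Theorem 2 in [9] extends easily to prove the following. A more general version can be found in [7].

Theorem 3: If $p \in (0, \infty)$, $\lambda > n/(n + p)$, and $X$ is a random vector in $\mathbb{R}^n$ such that $N_\lambda[X], E[|X|^p] < \infty$, then
\[ \frac{E[|X|^p]^{1/p}}{N_\lambda[X]^{1/n}} \ge c(n, p, \lambda), \]
where $c(n, p, \lambda)$ is given by (6). Equality holds if and only if $X = tZ$, for some $t \in (0, \infty)$.

Proof: For convenience let $a = a_{p,\lambda}$. Let
\[ t = \left( \frac{E[|X|^p]}{E[|Z|^p]} \right)^{1/p} \tag{11} \]
and $Y = tZ$.

If $\lambda \ne 1$, then by (9) and (5), (1), (11), and (7),
\[ \begin{aligned} \int_{\mathbb{R}^n} f_Y^{\lambda-1} f_X &\ge a^{\lambda-1} t^{n(1-\lambda)} + (1 - \lambda)\, a^{\lambda-1} t^{n(1-\lambda)-p} \int_{\mathbb{R}^n} |x|^p f_X(x)\, dx \\ &= a^{\lambda-1} t^{n(1-\lambda)} \left( 1 + (1 - \lambda)\, t^{-p} E[|X|^p] \right) \\ &= a^{\lambda-1} t^{n(1-\lambda)} \left( 1 + (1 - \lambda) E[|Z|^p] \right) \\ &= t^{n(1-\lambda)} \int_{\mathbb{R}^n} f_Z^\lambda, \end{aligned} \tag{12} \]
where equality holds if $\lambda < 1$. It follows that if $\lambda \ne 1$, then by Lemma 2, (4), (10) and (12), and (11), we have
\[ \begin{aligned} 1 \le N_\lambda[X, Y]^\lambda &= \left( \int_{\mathbb{R}^n} f_Y^\lambda \right) \left( \int_{\mathbb{R}^n} f_X^\lambda \right)^{-\frac{1}{1-\lambda}} \left( \int_{\mathbb{R}^n} f_Y^{\lambda-1} f_X \right)^{\frac{\lambda}{1-\lambda}} \\ &\le t^n\, \frac{N_\lambda[Z]}{N_\lambda[X]} \\ &= \frac{E[|X|^p]^{n/p}}{N_\lambda[X]} \cdot \frac{N_\lambda[Z]}{E[|Z|^p]^{n/p}}. \end{aligned} \]

If $\lambda = 1$, then by Lemma 2, (3) and (5), and (8) and (11),
\[ \begin{aligned} 0 \le h_1[X, Y] &= -h[X] - \log a + n \log t + t^{-p} E[|X|^p] \\ &= -h[X] + h[Z] + \frac{n}{p} \log \frac{E[|X|^p]}{E[|Z|^p]}. \end{aligned} \]

Lemma 2 shows that equality holds in all cases if and only if $Y = X$.

F. Elliptic moment-entropy inequalities

Corollary 4: If $A \in S$, $p \in (0, \infty)$, $\lambda > n/(n + p)$, and $X$ is a random vector in $\mathbb{R}^n$ satisfying $N_\lambda[X], E[|X|^p] < \infty$, then
\[ \frac{E[|X|_A^p]^{1/p}}{|A|^{1/n}\, N_\lambda[X]^{1/n}} \ge c(n, p, \lambda), \tag{13} \]
where $c(n, p, \lambda)$ is given by (6).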
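Equality holds if and only if $X = tA^{-1}Z$ for some $t \in (0, \infty)$.

Proof: By (2) and Theorem 3,
\[ \frac{E[|X|_A^p]^{1/p}}{|A|^{1/n}\, N_\lambda[X]^{1/n}} = \frac{E[|AX|^p]^{1/p}}{N_\lambda[AX]^{1/n}} \ge \frac{E[|Z|^p]^{1/p}}{N_\lambda[Z]^{1/n}}, \]
and equality holds if and only if $AX = tZ$ for some $t \in (0, \infty)$.

G. Affine moment-entropy inequalities

Optimizing Corollary 4 over all $A \in S$ yields the following affine inequality.

Theorem 5: If $p \in (0, \infty)$, $\lambda > n/(n + p)$, and $X$ is a random vector in $\mathbb{R}^n$ satisfying $N_\lambda[X], E[|X|^p] < \infty$, then
\[ \frac{|M_p[X]|^{1/p}}{N_\lambda[X]} \ge n^{-n/p}\, c(n, p, \lambda)^n, \]
where $c(n, p, \lambda)$ is given by (6). Equality holds if and only if $X = A^{-1}Z$ for some $A \in S$.

Proof: Substitute $A = M_p[X]^{-1/p}$ into (13).

Conversely, Corollary 4 follows from Theorem 5 by Theorem 1.

IV. PROOF OF THEOREM 1

A. Isotropic position of a probability measure

A Borel measure $\mu$ on $\mathbb{R}^n$ is said to be in isotropic position, if
\[ \int_{\mathbb{R}^n} \frac{x \otimes x}{|x|^2}\, d\mu(x) = \frac{1}{n} I, \tag{14} \]
where $I$ is the identity matrix.

Lemma 6: If $p \ge 0$ and $\mu$ is a Borel probability measure in isotropic position, then for each $A \in S$,
\[ |A|^{-1/n} \left( \int_{\mathbb{R}^n} \frac{|Ax|^p}{|x|^p}\, d\mu(x) \right)^{1/p} \ge 1, \]
with equality holding if and only if $A = aI$ for some $a > 0$.

Proof: By Hölder's inequality,
\[ \left( \int_{\mathbb{R}^n} \frac{|Ax|^p}{|x|^p}\, d\mu(x) \right)^{1/p} \ge \exp\left( \int_{\mathbb{R}^n} \log \frac{|Ax|}{|x|}\, d\mu(x) \right), \]
so it suffices to prove the $p = 0$ case only. By (14),
\[ \int_{\mathbb{R}^n} \frac{(x \cdot e)^2}{|x|^2}\, d\mu(x) = \frac{1}{n}, \tag{15} \]
for any unit vector $e$.

Let $e_1, \ldots, e_n$ be an orthonormal basis of eigenvectors of $A$ with corresponding eigenvalues $\lambda_1, \ldots, \lambda_n$. By the concavity of $\log$, and (15),
\[ \begin{aligned} \int_{\mathbb{R}^n} \log \frac{|Ax|}{|x|}\, d\mu(x) &= \frac{1}{2} \int_{\mathbb{R}^n} \log \frac{|Ax|^2}{|x|^2}\, d\mu(x) \\ &= \frac{1}{2} \int_{\mathbb{R}^n} \log \sum_{i=1}^n \lambda_i^2\, \frac{(x \cdot e_i)^2}{|x|^2}\, d\mu(x) \\ &\ge \frac{1}{2} \int_{\mathbb{R}^n} \sum_{i=1}^n \frac{(x \cdot e_i)^2}{|x|^2} \log \lambda_i^2\, d\mu(x) \\ &= \log |A|^{1/n}. \end{aligned} \]
The equality condition follows from the strict concavity of $\log$.

B. Proof of theorem

Lemma 7: If $p > 0$ and $X$ is a nondegenerate random vector in $\mathbb{R}^n$ with finite p-th moment, then there exists $c > 0$ such that
\[ E[|e \cdot X|^p] \ge c, \tag{16} \]
for every unit vector $e$.

Proof: The left side of (16) is a positive continuous function on the unit sphere, which is compact.

Theorem 8: If $p \ge 0$ and $X$ is a nondegenerate random vector in $\mathbb{R}^n$ with finite p-th moment, then there exists $A \in S$, unique up to a scalar multiple, such that
\[ |A|^{-1/n} E[|AX|^p]^{1/p} \le |A'|^{-1/n} E[|A'X|^p]^{1/p} \tag{17} \]
for every $A' \in S$.

Proof: Let $S' \subset S$ be the subset of matrices whose maximum eigenvalue is exactly 1. This is a bounded set inside the set of all symmetric matrices, with its boundary $\partial S'$ equal to positive semidefinite matrices with maximum eigenvalue 1 and minimum eigenvalue 0. Given $A' \in S'$, let $e$ be an eigenvector of $A'$ with eigenvalue 1. By Lemma 7,
\[ |A'|^{-1/n} E[|A'X|^p]^{1/p} \ge |A'|^{-1/n} E[|X \cdot e|^p]^{1/p} \ge c^{1/p} |A'|^{-1/n}. \tag{18} \]
Therefore, if $A'$ approaches the boundary $\partial S'$, the left side of (18) grows without bound.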
Since the left side of (18) is a continuous function on $S'$, the existence of a minimum follows.

Let $A \in S'$ be such a minimum and $Y = AX$. Then for each $B \in S$,
\[ \begin{aligned} |B|^{-1/n} E[|BY|^p]^{1/p} &= |A|^{1/n}\, |BA|^{-1/n} E[|(BA)X|^p]^{1/p} \\ &\ge |A|^{1/n}\, |A|^{-1/n} E[|AX|^p]^{1/p} \\ &= E[|Y|^p]^{1/p}, \end{aligned} \tag{19} \]
with equality holding if and only if equality holds for (17) with $A' = BA$. Setting $B = I + tB'$ for $B' \in S$, we get
\[ |I + tB'|^{-1/n} E[|(I + tB')Y|^p]^{1/p} \ge E[|Y|^p]^{1/p}, \]
for each $t$ near 0. It follows that
\[ \left. \frac{d}{dt} \right|_{t=0} |I + tB'|^{-1/n} E[|(I + tB')Y|^p]^{1/p} = 0, \]
for each $B' \in S$. A straightforward computation shows that this holds only if
\[ \frac{1}{n} E[|Y|^p]\, I = E[Y \otimes Y\, |Y|^{p-2}]. \tag{20} \]
Applying Lemma 6 to
\[ d\mu(x) = \frac{|x|^p\, dm_Y(x)}{E[|Y|^p]} \]
implies that equality holds for (19) only if $B = aI$ for some $a \in (0, \infty)$. This, in turn, implies that equality holds for (17) only if $A' = aA$.

Theorem 1 follows from Theorem 8 by rescaling $A$ so that $E[|Y|^p] = n$ and substituting $Y = AX$ into (20).

REFERENCES

[1] Shun-ichi Amari, Differential-geometrical methods in statistics, Lecture Notes in Statistics, vol. 28, Springer-Verlag, New York, 1985. MR 86m:62053
[2] Erdal Arikan, An inequality on guessing and its application to sequential decoding, IEEE Trans. Inform. Theory 42 (1996), 99-105.
[3] J. Bastero and M. Romance, Positions of convex bodies associated to extremal problems and isotropic measures, Adv. Math. 184 (2004), no. 1, 64-88.

[4] Thomas M. Cover and Joy A. Thomas, Elements of information theory, John Wiley & Sons Inc., New York, 1991, A Wiley-Interscience Publication.
[5] I. Csiszár, Information-type measures of difference of probability distributions and indirect observations, Studia Sci. Math. Hungar. 2 (1967), 299-318. MR 36 #2428
[6] S. Kullback and R. A. Leibler, On information and sufficiency, Ann. Math. Statistics 22 (1951), 79-86. MR 12,623a
[7] E. Lutwak, D. Yang, and G. Zhang, The Cramer-Rao inequality for star bodies, Duke Math. J. 112 (2002), 59-81.
[8] E. Lutwak, D. Yang, and G. Zhang, Moment entropy inequalities, Annals of Probability 32 (2004), 757-774.
[9] E. Lutwak, D. Yang, and G. Zhang, Cramer-Rao and moment-entropy inequalities for Renyi entropy and generalized Fisher information, IEEE Trans. Inform. Theory 51 (2005), 473-478.
[10] E. Lutwak, D. Yang, and G. Zhang, L_p John ellipsoids, Proc. London Math. Soc. 90 (2005), 497-520.
[11] E. Lutwak, D. Yang, and G. Zhang, Optimal Sobolev norms and the L^p Minkowski problem, Int. Math. Res. Not. (2006), 62987, 1-21.

Erwin Lutwak: Erwin Lutwak received his B.S., M.S., and Ph.D. degrees in Mathematics from Polytechnic University, where he is now Professor of Mathematics.

Deane Yang: Deane Yang received his B.A. in mathematics and physics from University of Pennsylvania and Ph.D. in mathematics from Harvard University. He has been an NSF Postdoctoral Fellow at the Courant Institute and on the faculty of Rice University and Columbia University. He is now a full professor at Polytechnic University.

Gaoyong Zhang: Gaoyong Zhang received his B.S. degree in mathematics from Wuhan University of Science and Technology, M.S. degree in mathematics from Wuhan University, Wuhan, China, and Ph.D. degree in mathematics from Temple University, Philadelphia. He was a Rademacher Lecturer at the University of Pennsylvania, a member of the Institute for Advanced Study at Princeton, and a member of the Mathematical Sciences Research Institute at Berkeley. He was an assistant professor and is now a full professor at Polytechnic University.