Log-concave distributions: definitions, properties, and consequences

Log-concave distributions: definitions, properties, and consequences Jon A. Wellner University of Washington, Seattle; visiting Heidelberg Seminaire, Institut de Mathématiques de Toulouse 28 February 202

Seminaire, Toulouse Part, 28 February Part 2, 28 February Chernoff s distribution is log-concave

Seminaire 2, Thursday, March: Strong log-concavity of Chernoff s density; problems connections and Seminaire 3, Monday, 5 March: Nonparametric estimation of log-concave densities Seminaire 4, Thursday, 8 March: A local maximal inequality under uniform entropy

Outline, Part : 2: 3: 4: 5: 6: 7. Log-concave densities / distributions: definitions Properties of the class Some consequences (statistics and probability) Strong log-concavity: definitions Examples & counterexamples Some consequences, strong log-concavity Questions & problems Seminar, Institut de Mathématiques de Toulouse; 28 February 202.3

. Log-concave densities / distributions: definitions Suppose that a density f can be written as f(x) f ϕ (x) = exp(ϕ(x)) = exp ( ( ϕ(x))) where ϕ is concave (and ϕ is convex). The class of all densities f on R, or on R d, of this form is called the class of log-concave densities, P log concave P 0. Note that f is log-concave if and only if : logf(λx+( λ)y) λlogf(x)+( λ)logf(y) for all 0 λ and for all x, y. iff f(λx + ( λ)y) f(x) λ f(y) λ iff f((x + y)/2) iff f((x + y)/2) 2 f(x)f(y). f(x)f(y), (assuming f is measurable) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.4

. Log-concave densities / distributions: definitions Examples, R Example : standard normal f(x) = (2π) /2 exp( x 2 /2), logf(x) = 2 x2 + log 2π, ( logf) (x) =. Example 2: Laplace f(x) = 2 exp( x ), logf(x) = x + log2, ( logf) (x) = 0 for all x 0. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.5

. Log-concave densities / distributions: definitions Example 3: Logistic Example 4: Subbotin e x f(x) = ( + e x ) 2, logf(x) = x + 2log( + e x ), ( logf) e x (x) = ( + e x ) 2 = f(x). f(x) = C r exp( x r /r), C r = 2Γ(/r)r /r, logf(x) = r x r + logc r, ( logf) (x) = (r ) x r 2, r, x 0. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.6

. Log-concave densities / distributions: definitions Many univariate parametric families on R are log-concave, for example: Normal (µ, σ 2 ) Uniform(a, b) Gamma(r, λ) for r Beta(a, b) for a, b Subbotin(r) with r. t r densities with r > 0 are not log-concave Tails of log-concave densities are necessarily sub-exponential: i.e. if X f P F 2, then Eexp(c X ) < for some c > 0. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.7

. Log-concave densities / distributions: definitions Log-concave densities on R d : A density f on R d is log-concave if f(x) = exp(ϕ(x)) with ϕ concave. Examples The density f of X N d (µ, Σ) with Σ positive definite: ( f(x) = f(x; µ, Σ) = exp (2π) d Σ 2 (x µ)t Σ (x µ) logf(x) = 2 (x µ)t Σ (x µ) (/2)log(2π Σ), D 2 ( logf)(x) ( 2 x i x j ( logf)(x), i, j =,..., d ) ) = Σ. If K R d is compact and convex, then f(x) = K (x)/λ(k) is a log-concave density., Seminar, Institut de Mathématiques de Toulouse; 28 February 202.8

. Log-concave densities / distributions: definitions Log-concave measures: Suppose that P is a probability measure on (R d, B d ). P is a logconcave measure if for all nonempty A, B B d and λ (0, ) we have P (λa + ( λ)b) {P (A)} λ {P (B)} λ. A set A R d is affine if tx + ( t)y A for all x, y A, t R. The affine hull of a set A R d is the smallest affine set containing A. Theorem. (Prékopa (97, 973), Rinott (976)). Suppose P is a probability measure on B d such that the affine hull of supp(p ) has dimension d. Then P is log-concave if and only if there is a log-concave (density) function f on R d such that P (B) = B f(x)dx for all B B d. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.9

2. Properties of log-concave densities Properties: log-concave densities on R: A density f on R is log-concave if and only if its convolution with any unimodal density is again unimodal (Ibragimov, 956). Every log-concave density f is unimodal (but need not be symmetric). P 0 is closed under convolution. P 0 is closed under weak limits Seminar, Institut de Mathématiques de Toulouse; 28 February 202.0

2. Properties of log-concave densities Properties: log-concave densities on R d : Any log concave f is unimodal. The level sets of f are closed convex sets. Log-concave densities correspond to log-concave measures. Prékopa, Rinott. Marginals of log-concave distributions are log-concave: f(x, y) is a log-concave density on R m+n, then g(x) = f(x, y)dy Rn is a log-concave density on R m. Prékopa, Brascamp-Lieb. if Products of log-concave densities are log-concave. P 0 is closed under convolution. P 0 is closed under weak limits. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.

3. Some consequences and connections (statistics and probability) (a) f is log-concave if and only if det((f(x i y j )) i,j {,2} ) 0 for all x x 2, y y 2 ; i.e f is a Polya frequency density of order 2; thus log-concave = P F 2 = strongly uni-modal (b) The densities p θ (x) f(x θ) for θ R have monotone likelihood ratio (in x) if and only if f is log-concave. Proof of (b): p θ (x) = f(x θ) has MLR iff f(x θ ) f(x θ) f(x θ ) f(x θ) This holds if and only if for all x < x, θ < θ logf(x θ ) + logf(x θ) logf(x θ ) + logf(x θ). () Let t = (x x)/(x x + θ θ) and note that Seminar, Institut de Mathématiques de Toulouse; 28 February 202.2

3. Some consequences and connections (statistics and probability) x θ = t(x θ ) + ( t)(x θ), x θ = ( t)(x θ ) + t(x θ) Hence log-concavity of f implies that logf(x θ) t logf(x θ ) + ( t)logf(x θ), logf(x θ ) ( t)logf(x θ ) + t logf(x θ). Adding these yields (??); i.e. MLR in x. f log-concave implies p θ (x) has Now suppose that p θ (x) has MLR so that (??) holds. In particular that holds if x, x, θ, θ satisfy x θ = a < b = x θ and t = (x x)/(x x+θ θ) = /2, so that x θ = (a+b)/2 = x θ. Then (??) becomes logf(a) + logf(b) 2logf((a + b)/2). This together with measurability of f implies that f is logconcave. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.3

3. Some consequences and connections (statistics and probability) Proof of (a): Suppose f is P F 2. Then for x < x, y < y, ( f(x y) f(x y det ) ) f(x y) f(x y ) if and only if or, if and only if = f(x y)f(x y ) f(x y )f(x y) 0 f(x y )f(x y) f(x y)f(x y ), That is, p y (x) has MLR in x. log-concave. f(x y ) f(x y) f(x y ) f(x y). By (b) this is equivalent to f Seminar, Institut de Mathématiques de Toulouse; 28 February 202.4

3. Some consequences and connections (statistics and probability) Theorem. (Brascamp-Lieb, 976). Suppose X f = e ϕ with ϕ convex and D 2 ϕ > 0, and let g C (R d ). Then V ar f (g(x)) E (D 2 ϕ) g(x), g(x). (Poincaré - type inequality for log-concave densities) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.5

3. Some consequences and connections (statistics and probability) Further consequences: Peakedness and majorization Theorem. (Proschan, 965) Suppose that f on R is logconcave and symmetric about 0. Let X,..., X n be i.i.d. with density f, and suppose that p, p R n + satisfy p p 2 p n, p p 2 p n, k p j k p j, k {,..., n}, n p j = n p j =. (That is, p p.) Then n p j X j is strictly more peaked than n p j X j : P n p j X j t < P n p j X j t for all t 0. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.6

3. Some consequences and connections (statistics and probability) Example: p = = p n = /(n ), p n = 0, while p = = p n = /n. Then p p (since n p j = n p j = and k p j = k/(n ) k/n = k p j ), and hence if X,..., X n are i.i.d. f symmetric and log-concave, P ( X n t) < P ( X n t) < < P ( X t) for all t 0. Definition: A d dimensional random variable X is said to be more peaked than a random variable Y if both X and Y have densities and P (Y A) P (X A) for all A A d, the class of subsets of R d symmetric about the origin. which are compact, convex, and Seminar, Institut de Mathématiques de Toulouse; 28 February 202.7

3. Some consequences and connections (statistics and probability) Theorem 2. (Olkin and Tong, 988) Suppose that f on R d is log-concave and symmetric about 0. Let X,..., X n be i.i.d. with density f, and suppose that a, b R n satisfy a a 2 a n, b b 2 b n, k a j k b j, k {,..., n}, n a j = n b j. (That is, a b.) Then n a j X j is more peaked than n b j X j : P n In particular, P n a j X j A P n a j X j t P b j X j A for all A A d n b j X j t for all t 0. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.8

3. Some consequences and connections (statistics and probability) Corollary: If g is non-decreasing on R + with g(0) = 0, then Eg n Another peakedness result: a j X j Eg n b j X j Suppose that Y = (Y,..., Y n ) where Y j N(µ j, σ 2 ) are independent and µ... µ n ; i.e. µ K n where K n {x R n : x x n }. Let µ n = Π(Y K n ), the least squares projection of Y onto K n. It is well-known that ( sj=r ) µ n = min max Y j s i r i s r +, i =,..., n.. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.9

3. Some consequences and connections (statistics and probability) Theorem 3. (Kelly) If Y N n (µ, σ 2 I) and µ K n, then µ k µ k is more peaked than Y k µ k for each k {,..., n}; that is P ( µ k µ k t) P ( Y k µ k t) for all t > 0, k {,..., n}. Question: Does Kelly s theorem continue to hold if the normal distribution is replaced by an arbitrary log-concave joint density symmetric about µ? Seminar, Institut de Mathématiques de Toulouse; 28 February 202.20

4. Strong log-concavity: definitions Definition. A density f on R is strongly log-concave if f(x) = h(x)cφ(cx) for some c > 0 where h is log-concave and φ(x) = (2π) /2 exp( x 2 /2). Sufficient condition: logf C 2 (R) with ( logf) (x) c 2 > 0 for all x. Definition 2. A density f on R d is strongly log-concave if f(x) = h(x)cγ(cx) for some c > 0 where h is log-concave and γ is the N d (0, ci d ) density. Sufficient condition: logf C 2 (R d ) with D 2 ( logf)(x) c 2 I d for some c > 0 for all x R d. These agree with strong convexity as defined by Rockafellar & Wets (998), p. 565. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.2

5. Examples & conterexamples Examples Example. f(x) = h(x)φ(x)/ hφdx where h is the logistic density, h(x) = e x /( + e x ) 2. Example 2. f(x) = h(x)φ(x)/ hφdx where h is the Gumbel density. h(x) = exp(x e x ). Example 3. f(x) = h(x)h( x)/ h(y)h( y)dy where h is the Gumbel density. Counterexamples Counterexample. f logistic: f(x) = e x /( + e x ) 2 ; ( logf) (x) = f(x). Counterexample 2. f Subbotin, r [, 2) (2, ); f(x) = C r exp( x r /r); ( logf) (x) = (r 2) x r 2. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.22

0.6 0.5 0.4 0.3 0.2 0. 3 2 2 3 Ex. : Logistic (red) perturbation of N(0, ) (green): f (blue) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.23

3 2 3 2 0 2 3 Ex. : ( logf), Logistic perturbation of N(0, ) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.24

0.4 0.3 0.2 0. 4 2 0 2 Ex. 2: Gumbel (red) perturbation of N(0, ) (green): f (blue) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.25

4 3 2 3 2 0 2 3 Ex. 2: ( logf), Gumbel perturbation of N(0, ) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.26

0.6 0.5 0.4 0.3 0.2 0. 2 0 2 Ex. 3: Gumbel ( ) Gumbel( ) (purple); N(0, V f ) (blue) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.27

6 4 2 2 0 2 Ex. 3: loggumbel( ) Gumbel( ) (purple); logn(0, V f ) (blue) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.28

4 3 2 2 0 2 Ex. 3: D 2 ( loggumbel( ) Gumbel( )) (purple); D 2 ( logn(0, V f )) (blue) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.29

0.4 0.3 0.2 0. 4 2 2 4 Subbotin f r r = (blue), r =.5 (red), r = 2 (green), r = 3 (purple) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.30

5 4 3 2 4 2 2 4 log f r : r = (blue), r =.5 (red), r = 2 (green), r = 3 (purple) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.3

4 3 2 4 2 0 2 4 ( log f r ) : r = (blue), r =.5 (red), r = 2 (green), r = 3 (purple) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.32

6. Some consequences, strong log-concavity First consequence Theorem. (Hargé, 2004). Suppose X N n (µ, Σ) with density γ and Y has density h γ with h log-concave, and let g : R n R be convex. Then Eg(Y E(Y )) Eg(X EX)). Equivalently, with µ = EX, ν = EY = E(Xh(X))/Eh(X), and g g( + µ) E{ g(x ν + µ)h(x)} E g(x) Eh(X). Seminar, Institut de Mathématiques de Toulouse; 28 February 202.33

6. Some consequences, strong log-concavity More consequences Corollary. (Brascamp-Lieb, 976). Suppose X f = exp( ϕ) with D 2 ϕ λi d, λ > 0, and let g C (R d ). Then V ar f (g(x)) E (D 2 ϕ) g(x), g(x) λ E g(x) 2. (Poincaré inequality for strongly log-concave densities; improvements by Hargé (2008)) Theorem. (Caffarelli, 2002). Suppose X N d (0, I) with density γ d and Y has density e v γ d with v convex. Let T = ϕ be the unique gradient of a convex map ϕ such that ϕ(x) d = Y. Then 0 D 2 ϕ I d. (cf. Villani (2003), pages 290-29) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.34

7. Questions & problems Does strong log-concavity occur naturally? Are there natural examples? Are there large classes of strongly log-concave densities in connection with other known classes such as P F (Pólya frequency functions of order infinity) or L. Bondesson s class HM of completely hyperbolically monotone densities? Does Kelly s peakedness result for projection onto the ordered cone K n continue to hold with Gaussian replaced by log-concave (or symmetric log concave)? Seminar, Institut de Mathématiques de Toulouse; 28 February 202.35

Selected references: Villani, C. (2002). Topics in Optimal Transportation. Amer. Math Soc., Providence. Dharmadhikari, S. and Joag-dev, K. (988). Unimodality, Convexity, and Applications. Academic Press. Marshall, A. W. and Olkin, I. (979). Inequalities: Theory of Majorization and Its Applications. Academic Press. Marshall, A. W., Olkin, I., and Arnold, B. C. (20). Inequalities: Theory of Majorization and Its Applications. Second edition, Springer. Rockfellar, R. T. and Wets, R. (998). Variational Analysis. Springer. Hargé, G. (2004). A convex/log-concave correlation inequality for Gaussian measure and an application to abstract Wiener spaces. Probab. Theory Related Fields 30, 45-440. Brascamp, H. J. and Lieb, E. H. (976). On extensions of the Brunn- Minkowski and Prékopa-Leindler theorems, including inequalities for log concave functions, and with an application to the diffusion equation. J. Functional Analysis 22, 366-389. Hargé, G. (2008). Reinforcement of an inequality due to Brascamp and Lieb. J. Funct. Anal., 254, 267-300. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.36