Log-concave distributions: definitions, properties, and consequences

Similar documents
Chernoff s distribution is log-concave. But why? (And why does it matter?)

Nonparametric estimation of log-concave densities

Nonparametric estimation under Shape Restrictions

Nonparametric estimation of log-concave densities

Nonparametric estimation of. s concave and log-concave densities: alternatives to maximum likelihood

Nonparametric estimation: s concave and log-concave densities: alternatives to maximum likelihood

Strong log-concavity is preserved by convolution

Shape constrained estimation: a brief introduction

1 An Elementary Introduction to Monotone Transportation

Bi-s -Concave Distributions

Around the Brunn-Minkowski inequality

Some stochastic inequalities for weighted sums

On Strong Unimodality of Multivariate Discrete Distributions

Stochastic Comparisons of Weighted Sums of Arrangement Increasing Random Variables

Local strong convexity and local Lipschitz continuity of the gradient of convex functions

NONPARAMETRIC ESTIMATION OF MULTIVARIATE CONVEX-TRANSFORMED DENSITIES. By Arseni Seregin, and Jon A. Wellner, University of Washington

A concavity property for the reciprocal of Fisher information and its consequences on Costa s EPI

Quick Tour of Basic Probability Theory and Linear Algebra

Inverse Brascamp-Lieb inequalities along the Heat equation

Spectral Gap and Concentration for Some Spherically Symmetric Probability Measures

Total positivity in Markov structures

Convexification and Multimodality of Random Probability Measures

ECE 4400:693 - Information Theory

Maximum likelihood estimation

CONVOLUTIONS OF LOGARITHMICALLY. Milan Merkle. Abstract. We give a simple proof of the fact that the logarithmic concavity of real

Some properties of skew-symmetric distributions

Logarithmic Sobolev Inequalities

High-dimensional distributions with convexity properties

From the Brunn-Minkowski inequality to a class of Poincaré type inequalities

A glimpse into convex geometry. A glimpse into convex geometry

BORELL S GENERALIZED PRÉKOPA-LEINDLER INEQUALITY: A SIMPLE PROOF. Arnaud Marsiglietti. IMA Preprint Series #2461. (December 2015)

Lecture 17: Differential Entropy

A Law of the Iterated Logarithm. for Grenander s Estimator

A function(al) f is convex if dom f is a convex set, and. f(θx + (1 θ)y) < θf(x) + (1 θ)f(y) f(x) = x 3

Math 341: Convex Geometry. Xi Chen

Maximization of a Strongly Unimodal Multivariate Discrete Distribution

Probability Background

Convex Geometry. Otto-von-Guericke Universität Magdeburg. Applications of the Brascamp-Lieb and Barthe inequalities. Exercise 12.

Random Variables and Their Distributions

Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem

Lecture 4: Convex Functions, Part I February 1

Maximum likelihood estimation of a log-concave density based on censored data

Fisher information and Stam inequality on a finite group

Steiner s formula and large deviations theory

Dimensional behaviour of entropy and information

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X.

Some Applications of Mass Transport to Gaussian-Type Inequalities

Convex Functions. Daniel P. Palomar. Hong Kong University of Science and Technology (HKUST)

OPTIMISATION CHALLENGES IN MODERN STATISTICS. Co-authors: Y. Chen, M. Cule, R. Gramacy, M. Yuan

ON A LINEAR REFINEMENT OF THE PRÉKOPA-LEINDLER INEQUALITY

n! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2

Lecture 1 October 9, 2013

Chapter 8: Differential entropy. University of Illinois at Chicago ECE 534, Natasha Devroye

Entropy Power Inequalities: Results and Speculation

ST5215: Advanced Statistical Theory

Notes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed

Weak and strong moments of l r -norms of log-concave vectors

Efficiency of linear estimators under heavy-tailedness: convolutions of [alpha]-symmetric distributions.

Probability Lecture III (August, 2006)

Log-concave Probability Distributions: Theory and Statistical Testing 1

Journal of Inequalities in Pure and Applied Mathematics

Exercises: Brunn, Minkowski and convex pie

BORIS MORDUKHOVICH Wayne State University Detroit, MI 48202, USA. Talk given at the SPCOM Adelaide, Australia, February 2015

BASICS OF CONVEX ANALYSIS

EXPONENTIAL PROBABILITY INEQUALITY FOR LINEARLY NEGATIVE QUADRANT DEPENDENT RANDOM VARIABLES

SMOOTH OPTIMAL TRANSPORTATION ON HYPERBOLIC SPACE

Theory of Probability Fall 2008

A NOTE ON A DISTRIBUTION OF WEIGHTED SUMS OF I.I.D. RAYLEIGH RANDOM VARIABLES

Albert W. Marshall. Ingram Olkin Barry. C. Arnold. Inequalities: Theory. of Majorization and Its Applications. Second Edition.

Convex analysis and profit/cost/support functions

Nonparametric estimation under Shape Restrictions

A Brunn Minkowski inequality for the Hessian eigenvalue in three-dimensional convex domain

P (x). all other X j =x j. If X is a continuous random vector (see p.172), then the marginal distributions of X i are: f(x)dx 1 dx n

Assignment 1: From the Definition of Convexity to Helley Theorem

x log x, which is strictly convex, and use Jensen s Inequality:

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization

3. Convex functions. basic properties and examples. operations that preserve convexity. the conjugate function. quasiconvex functions

Notes on Random Vectors and Multivariate Normal

SWFR ENG 4TE3 (6TE3) COMP SCI 4TE3 (6TE3) Continuous Optimization Algorithm. Convex Optimization. Computing and Software McMaster University

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2)

WEAK CONVERGENCE THEOREMS FOR EQUILIBRIUM PROBLEMS WITH NONLINEAR OPERATORS IN HILBERT SPACES

40.530: Statistics. Professor Chen Zehua. Singapore University of Design and Technology

A Brief Introduction to Copulas

Murat Akman. The Brunn-Minkowski Inequality and a Minkowski Problem for Nonlinear Capacities. March 10

Lecture 25: Review. Statistics 104. April 23, Colin Rundel

3. Probability and Statistics

Brunn-Minkowski inequalities for two functionals involving the p-laplace operator

Lecture 3: Expected Value. These integrals are taken over all of Ω. If we wish to integrate over a measurable subset A Ω, we will write

n! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2

Finite Dimensional Optimization Part III: Convex Optimization 1

On intermediate value theorem in ordered Banach spaces for noncompact and discontinuous mappings

Convex Analysis and Economic Theory AY Elementary properties of convex functions

Nonparametric Estimation of the Dependence Function for a Multivariate Extreme Value Distribution

Richard F. Bass Krzysztof Burdzy University of Washington

Multivariate Analysis and Likelihood Inference

Statistics 581, Problem Set 1 Solutions Wellner; 10/3/ (a) The case r = 1 of Chebychev s Inequality is known as Markov s Inequality

Continuous Random Variables

(somewhat) expanded version of the note in C. R. Acad. Sci. Paris 340, (2005). A (ONE-DIMENSIONAL) FREE BRUNN-MINKOWSKI INEQUALITY

Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics

Convex Functions. Wing-Kin (Ken) Ma The Chinese University of Hong Kong (CUHK)

Transcription:

Log-concave distributions: definitions, properties, and consequences Jon A. Wellner University of Washington, Seattle; visiting Heidelberg Seminaire, Institut de Mathématiques de Toulouse 28 February 202

Seminaire, Toulouse Part, 28 February Part 2, 28 February Chernoff s distribution is log-concave

Seminaire 2, Thursday, March: Strong log-concavity of Chernoff s density; problems connections and Seminaire 3, Monday, 5 March: Nonparametric estimation of log-concave densities Seminaire 4, Thursday, 8 March: A local maximal inequality under uniform entropy

Outline, Part : 2: 3: 4: 5: 6: 7. Log-concave densities / distributions: definitions Properties of the class Some consequences (statistics and probability) Strong log-concavity: definitions Examples & counterexamples Some consequences, strong log-concavity Questions & problems Seminar, Institut de Mathématiques de Toulouse; 28 February 202.3

. Log-concave densities / distributions: definitions Suppose that a density f can be written as f(x) f ϕ (x) = exp(ϕ(x)) = exp ( ( ϕ(x))) where ϕ is concave (and ϕ is convex). The class of all densities f on R, or on R d, of this form is called the class of log-concave densities, P log concave P 0. Note that f is log-concave if and only if : logf(λx+( λ)y) λlogf(x)+( λ)logf(y) for all 0 λ and for all x, y. iff f(λx + ( λ)y) f(x) λ f(y) λ iff f((x + y)/2) iff f((x + y)/2) 2 f(x)f(y). f(x)f(y), (assuming f is measurable) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.4

. Log-concave densities / distributions: definitions Examples, R Example : standard normal f(x) = (2π) /2 exp( x 2 /2), logf(x) = 2 x2 + log 2π, ( logf) (x) =. Example 2: Laplace f(x) = 2 exp( x ), logf(x) = x + log2, ( logf) (x) = 0 for all x 0. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.5

. Log-concave densities / distributions: definitions Example 3: Logistic Example 4: Subbotin e x f(x) = ( + e x ) 2, logf(x) = x + 2log( + e x ), ( logf) e x (x) = ( + e x ) 2 = f(x). f(x) = C r exp( x r /r), C r = 2Γ(/r)r /r, logf(x) = r x r + logc r, ( logf) (x) = (r ) x r 2, r, x 0. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.6

. Log-concave densities / distributions: definitions Many univariate parametric families on R are log-concave, for example: Normal (µ, σ 2 ) Uniform(a, b) Gamma(r, λ) for r Beta(a, b) for a, b Subbotin(r) with r. t r densities with r > 0 are not log-concave Tails of log-concave densities are necessarily sub-exponential: i.e. if X f P F 2, then Eexp(c X ) < for some c > 0. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.7

. Log-concave densities / distributions: definitions Log-concave densities on R d : A density f on R d is log-concave if f(x) = exp(ϕ(x)) with ϕ concave. Examples The density f of X N d (µ, Σ) with Σ positive definite: ( f(x) = f(x; µ, Σ) = exp (2π) d Σ 2 (x µ)t Σ (x µ) logf(x) = 2 (x µ)t Σ (x µ) (/2)log(2π Σ), D 2 ( logf)(x) ( 2 x i x j ( logf)(x), i, j =,..., d ) ) = Σ. If K R d is compact and convex, then f(x) = K (x)/λ(k) is a log-concave density., Seminar, Institut de Mathématiques de Toulouse; 28 February 202.8

. Log-concave densities / distributions: definitions Log-concave measures: Suppose that P is a probability measure on (R d, B d ). P is a logconcave measure if for all nonempty A, B B d and λ (0, ) we have P (λa + ( λ)b) {P (A)} λ {P (B)} λ. A set A R d is affine if tx + ( t)y A for all x, y A, t R. The affine hull of a set A R d is the smallest affine set containing A. Theorem. (Prékopa (97, 973), Rinott (976)). Suppose P is a probability measure on B d such that the affine hull of supp(p ) has dimension d. Then P is log-concave if and only if there is a log-concave (density) function f on R d such that P (B) = B f(x)dx for all B B d. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.9

2. Properties of log-concave densities Properties: log-concave densities on R: A density f on R is log-concave if and only if its convolution with any unimodal density is again unimodal (Ibragimov, 956). Every log-concave density f is unimodal (but need not be symmetric). P 0 is closed under convolution. P 0 is closed under weak limits Seminar, Institut de Mathématiques de Toulouse; 28 February 202.0

2. Properties of log-concave densities Properties: log-concave densities on R d : Any log concave f is unimodal. The level sets of f are closed convex sets. Log-concave densities correspond to log-concave measures. Prékopa, Rinott. Marginals of log-concave distributions are log-concave: f(x, y) is a log-concave density on R m+n, then g(x) = f(x, y)dy Rn is a log-concave density on R m. Prékopa, Brascamp-Lieb. if Products of log-concave densities are log-concave. P 0 is closed under convolution. P 0 is closed under weak limits. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.

3. Some consequences and connections (statistics and probability) (a) f is log-concave if and only if det((f(x i y j )) i,j {,2} ) 0 for all x x 2, y y 2 ; i.e f is a Polya frequency density of order 2; thus log-concave = P F 2 = strongly uni-modal (b) The densities p θ (x) f(x θ) for θ R have monotone likelihood ratio (in x) if and only if f is log-concave. Proof of (b): p θ (x) = f(x θ) has MLR iff f(x θ ) f(x θ) f(x θ ) f(x θ) This holds if and only if for all x < x, θ < θ logf(x θ ) + logf(x θ) logf(x θ ) + logf(x θ). () Let t = (x x)/(x x + θ θ) and note that Seminar, Institut de Mathématiques de Toulouse; 28 February 202.2

3. Some consequences and connections (statistics and probability) x θ = t(x θ ) + ( t)(x θ), x θ = ( t)(x θ ) + t(x θ) Hence log-concavity of f implies that logf(x θ) t logf(x θ ) + ( t)logf(x θ), logf(x θ ) ( t)logf(x θ ) + t logf(x θ). Adding these yields (??); i.e. MLR in x. f log-concave implies p θ (x) has Now suppose that p θ (x) has MLR so that (??) holds. In particular that holds if x, x, θ, θ satisfy x θ = a < b = x θ and t = (x x)/(x x+θ θ) = /2, so that x θ = (a+b)/2 = x θ. Then (??) becomes logf(a) + logf(b) 2logf((a + b)/2). This together with measurability of f implies that f is logconcave. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.3

3. Some consequences and connections (statistics and probability) Proof of (a): Suppose f is P F 2. Then for x < x, y < y, ( f(x y) f(x y det ) ) f(x y) f(x y ) if and only if or, if and only if = f(x y)f(x y ) f(x y )f(x y) 0 f(x y )f(x y) f(x y)f(x y ), That is, p y (x) has MLR in x. log-concave. f(x y ) f(x y) f(x y ) f(x y). By (b) this is equivalent to f Seminar, Institut de Mathématiques de Toulouse; 28 February 202.4

3. Some consequences and connections (statistics and probability) Theorem. (Brascamp-Lieb, 976). Suppose X f = e ϕ with ϕ convex and D 2 ϕ > 0, and let g C (R d ). Then V ar f (g(x)) E (D 2 ϕ) g(x), g(x). (Poincaré - type inequality for log-concave densities) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.5

3. Some consequences and connections (statistics and probability) Further consequences: Peakedness and majorization Theorem. (Proschan, 965) Suppose that f on R is logconcave and symmetric about 0. Let X,..., X n be i.i.d. with density f, and suppose that p, p R n + satisfy p p 2 p n, p p 2 p n, k p j k p j, k {,..., n}, n p j = n p j =. (That is, p p.) Then n p j X j is strictly more peaked than n p j X j : P n p j X j t < P n p j X j t for all t 0. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.6

3. Some consequences and connections (statistics and probability) Example: p = = p n = /(n ), p n = 0, while p = = p n = /n. Then p p (since n p j = n p j = and k p j = k/(n ) k/n = k p j ), and hence if X,..., X n are i.i.d. f symmetric and log-concave, P ( X n t) < P ( X n t) < < P ( X t) for all t 0. Definition: A d dimensional random variable X is said to be more peaked than a random variable Y if both X and Y have densities and P (Y A) P (X A) for all A A d, the class of subsets of R d symmetric about the origin. which are compact, convex, and Seminar, Institut de Mathématiques de Toulouse; 28 February 202.7

3. Some consequences and connections (statistics and probability) Theorem 2. (Olkin and Tong, 988) Suppose that f on R d is log-concave and symmetric about 0. Let X,..., X n be i.i.d. with density f, and suppose that a, b R n satisfy a a 2 a n, b b 2 b n, k a j k b j, k {,..., n}, n a j = n b j. (That is, a b.) Then n a j X j is more peaked than n b j X j : P n In particular, P n a j X j A P n a j X j t P b j X j A for all A A d n b j X j t for all t 0. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.8

3. Some consequences and connections (statistics and probability) Corollary: If g is non-decreasing on R + with g(0) = 0, then Eg n Another peakedness result: a j X j Eg n b j X j Suppose that Y = (Y,..., Y n ) where Y j N(µ j, σ 2 ) are independent and µ... µ n ; i.e. µ K n where K n {x R n : x x n }. Let µ n = Π(Y K n ), the least squares projection of Y onto K n. It is well-known that ( sj=r ) µ n = min max Y j s i r i s r +, i =,..., n.. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.9

3. Some consequences and connections (statistics and probability) Theorem 3. (Kelly) If Y N n (µ, σ 2 I) and µ K n, then µ k µ k is more peaked than Y k µ k for each k {,..., n}; that is P ( µ k µ k t) P ( Y k µ k t) for all t > 0, k {,..., n}. Question: Does Kelly s theorem continue to hold if the normal distribution is replaced by an arbitrary log-concave joint density symmetric about µ? Seminar, Institut de Mathématiques de Toulouse; 28 February 202.20

4. Strong log-concavity: definitions Definition. A density f on R is strongly log-concave if f(x) = h(x)cφ(cx) for some c > 0 where h is log-concave and φ(x) = (2π) /2 exp( x 2 /2). Sufficient condition: logf C 2 (R) with ( logf) (x) c 2 > 0 for all x. Definition 2. A density f on R d is strongly log-concave if f(x) = h(x)cγ(cx) for some c > 0 where h is log-concave and γ is the N d (0, ci d ) density. Sufficient condition: logf C 2 (R d ) with D 2 ( logf)(x) c 2 I d for some c > 0 for all x R d. These agree with strong convexity as defined by Rockafellar & Wets (998), p. 565. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.2

5. Examples & conterexamples Examples Example. f(x) = h(x)φ(x)/ hφdx where h is the logistic density, h(x) = e x /( + e x ) 2. Example 2. f(x) = h(x)φ(x)/ hφdx where h is the Gumbel density. h(x) = exp(x e x ). Example 3. f(x) = h(x)h( x)/ h(y)h( y)dy where h is the Gumbel density. Counterexamples Counterexample. f logistic: f(x) = e x /( + e x ) 2 ; ( logf) (x) = f(x). Counterexample 2. f Subbotin, r [, 2) (2, ); f(x) = C r exp( x r /r); ( logf) (x) = (r 2) x r 2. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.22

0.6 0.5 0.4 0.3 0.2 0. 3 2 2 3 Ex. : Logistic (red) perturbation of N(0, ) (green): f (blue) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.23

3 2 3 2 0 2 3 Ex. : ( logf), Logistic perturbation of N(0, ) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.24

0.4 0.3 0.2 0. 4 2 0 2 Ex. 2: Gumbel (red) perturbation of N(0, ) (green): f (blue) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.25

4 3 2 3 2 0 2 3 Ex. 2: ( logf), Gumbel perturbation of N(0, ) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.26

0.6 0.5 0.4 0.3 0.2 0. 2 0 2 Ex. 3: Gumbel ( ) Gumbel( ) (purple); N(0, V f ) (blue) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.27

6 4 2 2 0 2 Ex. 3: loggumbel( ) Gumbel( ) (purple); logn(0, V f ) (blue) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.28

4 3 2 2 0 2 Ex. 3: D 2 ( loggumbel( ) Gumbel( )) (purple); D 2 ( logn(0, V f )) (blue) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.29

0.4 0.3 0.2 0. 4 2 2 4 Subbotin f r r = (blue), r =.5 (red), r = 2 (green), r = 3 (purple) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.30

5 4 3 2 4 2 2 4 log f r : r = (blue), r =.5 (red), r = 2 (green), r = 3 (purple) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.3

4 3 2 4 2 0 2 4 ( log f r ) : r = (blue), r =.5 (red), r = 2 (green), r = 3 (purple) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.32

6. Some consequences, strong log-concavity First consequence Theorem. (Hargé, 2004). Suppose X N n (µ, Σ) with density γ and Y has density h γ with h log-concave, and let g : R n R be convex. Then Eg(Y E(Y )) Eg(X EX)). Equivalently, with µ = EX, ν = EY = E(Xh(X))/Eh(X), and g g( + µ) E{ g(x ν + µ)h(x)} E g(x) Eh(X). Seminar, Institut de Mathématiques de Toulouse; 28 February 202.33

6. Some consequences, strong log-concavity More consequences Corollary. (Brascamp-Lieb, 976). Suppose X f = exp( ϕ) with D 2 ϕ λi d, λ > 0, and let g C (R d ). Then V ar f (g(x)) E (D 2 ϕ) g(x), g(x) λ E g(x) 2. (Poincaré inequality for strongly log-concave densities; improvements by Hargé (2008)) Theorem. (Caffarelli, 2002). Suppose X N d (0, I) with density γ d and Y has density e v γ d with v convex. Let T = ϕ be the unique gradient of a convex map ϕ such that ϕ(x) d = Y. Then 0 D 2 ϕ I d. (cf. Villani (2003), pages 290-29) Seminar, Institut de Mathématiques de Toulouse; 28 February 202.34

7. Questions & problems Does strong log-concavity occur naturally? Are there natural examples? Are there large classes of strongly log-concave densities in connection with other known classes such as P F (Pólya frequency functions of order infinity) or L. Bondesson s class HM of completely hyperbolically monotone densities? Does Kelly s peakedness result for projection onto the ordered cone K n continue to hold with Gaussian replaced by log-concave (or symmetric log concave)? Seminar, Institut de Mathématiques de Toulouse; 28 February 202.35

Selected references: Villani, C. (2002). Topics in Optimal Transportation. Amer. Math Soc., Providence. Dharmadhikari, S. and Joag-dev, K. (988). Unimodality, Convexity, and Applications. Academic Press. Marshall, A. W. and Olkin, I. (979). Inequalities: Theory of Majorization and Its Applications. Academic Press. Marshall, A. W., Olkin, I., and Arnold, B. C. (20). Inequalities: Theory of Majorization and Its Applications. Second edition, Springer. Rockfellar, R. T. and Wets, R. (998). Variational Analysis. Springer. Hargé, G. (2004). A convex/log-concave correlation inequality for Gaussian measure and an application to abstract Wiener spaces. Probab. Theory Related Fields 30, 45-440. Brascamp, H. J. and Lieb, E. H. (976). On extensions of the Brunn- Minkowski and Prékopa-Leindler theorems, including inequalities for log concave functions, and with an application to the diffusion equation. J. Functional Analysis 22, 366-389. Hargé, G. (2008). Reinforcement of an inequality due to Brascamp and Lieb. J. Funct. Anal., 254, 267-300. Seminar, Institut de Mathématiques de Toulouse; 28 February 202.36