Concentration Properties of Restricted Measures with Applications to Non-Lipschitz Functions


S. G. Bobkov, P. Nayar, and P. Tetali

April 4, 2016

Mathematics Subject Classification: Primary 60Gxx.
Keywords and phrases: Subgaussian constant, spread constant, restricted measures, concentration of measure.

Author affiliations and support:
School of Mathematics, University of Minnesota, Minneapolis, MN 55455; research supported in part by the Humboldt Foundation, NSF and BSF grants.
Institute of Mathematics & Applications, Minneapolis, MN 55455; e-mail address: nayar@mimuw.edu.pl (corresponding author); research supported in part by NCN grant DEC-2012/05/B/ST1/00412.
School of Mathematics and School of Computer Science, Georgia Institute of Technology, Atlanta, GA 30332; research supported in part by NSF DMS-1407657.

Abstract

For any metric probability space $(M, d, \mu)$ with a subgaussian constant $\sigma^2(\mu)$ and any Borel measurable set $A \subset M$, we show that

$$\sigma^2(\mu_A) \le c \log\big(e/\mu(A)\big)\, \sigma^2(\mu),$$

where $\mu_A$ is the normalized restriction of $\mu$ to the set $A$ and $c$ is a universal constant. As a consequence, we deduce concentration inequalities for non-Lipschitz functions.

1 Introduction

It is known that many high-dimensional probability distributions $\mu$ on the Euclidean space $\mathbb{R}^n$ (and on other metric spaces, including graphs) possess strong concentration properties. In functional language, this may informally be stated as the assertion that any sufficiently smooth function $f$ on $\mathbb{R}^n$, e.g., one having a bounded Lipschitz semi-norm, is almost constant on almost all of the space. There are several ways to quantify such a property. One natural approach, proposed by N. Alon, R. Boppana and J. Spencer [A-B-S], associates with a given metric probability space $(M, d, \mu)$ its spread constant

$$s^2(\mu) = \sup \mathrm{Var}_\mu(f) = \sup \int (f - m)^2 \, d\mu,$$

where $m = \int f \, d\mu$ and the sup is taken over all functions $f$ on $M$ with $\|f\|_{\rm Lip} \le 1$. More information is contained in the so-called subgaussian constant $\sigma^2 = \sigma^2(\mu)$, which is defined as the infimum over all $\sigma^2$ such that

$$\int e^{tf} \, d\mu \le e^{\sigma^2 t^2/2} \quad \text{for all } t \in \mathbb{R}, \qquad (1.1)$$

in the class $L_0$ of all $f$ on $M$ with $m = 0$ and $\|f\|_{\rm Lip} \le 1$ (cf. [B-G-H]). Describing the diameter of $L_0$ in the Orlicz space $L^{\psi_2}(\mu)$ for the Young function $\psi_2(t) = e^{t^2} - 1$ (within universal factors), the quantity $\sigma^2(\mu)$ appears as a parameter in a subgaussian concentration inequality for the class of all Borel subsets of $M$. As an equivalent approach, it may also be introduced via the transport-entropy inequality connecting the classical Kantorovich distance and the relative entropy from an arbitrary probability measure on $M$ to the measure $\mu$ (cf. [B-G]).

While in general $s^2 \le \sigma^2$, the latter characteristic allows one to control subgaussian tails under the probability measure $\mu$ uniformly in the entire class of Lipschitz functions on $M$. More generally, when $\|f\|_{\rm Lip} \le L$, (1.1) yields

$$\mu\{|f - m| \ge t\} \le 2 e^{-t^2/(2\sigma^2 L^2)}, \qquad t > 0. \qquad (1.2)$$

Classical and well-known examples include the standard Gaussian measure on $M = \mathbb{R}^n$, in which case $s^2 = \sigma^2 = 1$, and the normalized Lebesgue measure on the unit sphere $M = S^{n-1}$, with $s^2 = \sigma^2 = \frac{1}{n-1}$. The last example was a starting point in the study of the concentration of measure phenomena, a fruitful direction initiated in the early 1970s by V. D. Milman.

Other examples often come after verifying that $\mu$ satisfies certain Sobolev-type inequalities, such as Poincaré-type inequalities

$$\lambda \, \mathrm{Var}_\mu(u) \le \int |\nabla u|^2 \, d\mu,$$

and logarithmic Sobolev inequalities

$$\rho \, \mathrm{Ent}_\mu(u^2) = \rho \Big[ \int u^2 \log u^2 \, d\mu - \int u^2 \, d\mu \, \log \int u^2 \, d\mu \Big] \le 2 \int |\nabla u|^2 \, d\mu,$$

where $u$ may be any locally Lipschitz function on $M$, and the constants $\lambda > 0$ and $\rho > 0$ do not depend on $u$. Here the modulus of the gradient may be understood in the generalized sense as the function

$$|\nabla u(x)| = \limsup_{y \to x} \frac{|u(x) - u(y)|}{d(x, y)}, \qquad x \in M$$

(this is the so-called continuous setting), while in discrete spaces, e.g., graphs, we deal with other naturally defined gradients. In both cases, one has respectively the well-known upper bounds

$$s^2(\mu) \le \frac{1}{\lambda}, \qquad \sigma^2(\mu) \le \frac{1}{\rho}. \qquad (1.3)$$

For example, $\lambda = \rho = n - 1$ on the unit sphere (best possible values, [M-W]), which can be used to make a corresponding statement about the spread and subgaussian constants.

One of the purposes of this note is to give new examples by involving the family of normalized restricted measures

$$\mu_A(B) = \frac{\mu(A \cap B)}{\mu(A)}, \qquad B \subset M \ \text{(Borel)},$$

where a Borel measurable set $A \subset M$ is fixed and has positive measure. As an example, returning to the standard Gaussian measure $\mu$ on $\mathbb{R}^n$, it is known that $\sigma^2(\mu_A) \le 1$ for any convex body $A \subset \mathbb{R}^n$. This remarkable property, discovered by D. Bakry and M. Ledoux [B-L] in the sharper form of a Gaussian-type isoperimetric inequality, nowadays has several proofs and generalizations, cf. [B1, B2]. Of course, in general, the set $A$ may have a rather disordered structure; for instance, it may be disconnected. In that case there is no hope for the validity of a Poincaré-type inequality for the measure $\mu_A$. Nevertheless, it turns out that the concentration property of $\mu_A$ is inherited from $\mu$, unless the measure of $A$ is too small. In particular, we have the following observation about abstract metric probability spaces.

Theorem 1.1. For any measurable set $A \subset M$ with $\mu(A) > 0$, the subgaussian constant $\sigma^2(\mu_A)$ of the normalized restricted measure satisfies

$$\sigma^2(\mu_A) \le c \log\Big(\frac{e}{\mu(A)}\Big) \, \sigma^2(\mu), \qquad (1.4)$$

where $c$ is an absolute constant.

Although this assertion is technically simple, we will describe two approaches: one is direct and refers to estimates on the $\psi_2$-norms over the restricted measures, and the other uses a general comparison result due to F. Barthe and E. Milman on the concentration functions ([B-M]).

One may further generalize Theorem 1.1 by defining the subgaussian constant $\sigma_F^2(\mu)$ within a given fixed subclass $F$ of functions on $M$, by using the same bound (1.1) on the Laplace transform. This is motivated by a possibly different level of concentration for different classes; indeed, in the case $M = \mathbb{R}^n$, the concentration property may be considerably strengthened for the class $F$ of all convex Lipschitz functions. In particular, one result of M. Talagrand [T1, T2] provides a dimension-free bound $\sigma_F^2(\mu) \le C$ for an arbitrary product probability measure $\mu$ on the $n$-dimensional cube $[0, 1]^n$. Hence, a more general version of Theorem 1.1 yields the bound

$$\sigma_F^2(\mu_A) \le c \log\Big(\frac{e}{\mu(A)}\Big)$$

with some absolute constant $c$, which holds for any Borel subset $A$ of $[0, 1]^n$ (cf. Section 6 below).

According to the very definition, the quantities $\sigma^2(\mu)$ and $\sigma^2(\mu_A)$ might seem responsible for deviations of only Lipschitz functions $f$ on $M$ and $A$, respectively. However, the inequality (1.4) may also be used to control deviations of non-Lipschitz $f$ on large parts of the space and under certain regularity hypotheses. Assume, for example, that $\int |\nabla f| \, d\mu \le 1$ (which is a kind of normalization condition) and consider

$$A = \{x \in M : |\nabla f(x)| \le L\}. \qquad (1.5)$$

If $L \ge 2$, then by Markov's inequality this set has measure $\mu(A) \ge 1 - \frac{1}{L} \ge \frac{1}{2}$, and hence $\sigma^2(\mu_A) \le c\,\sigma^2(\mu)$ with some absolute constant $c$. If we assume that $f$ has a Lipschitz semi-norm $\le L$ on $A$, then, according to (1.2),

$$\mu_A\{x \in A : |f - m| \ge t\} \le 2 e^{-t^2/(c\,\sigma^2(\mu) L^2)}, \qquad t > 0, \qquad (1.6)$$

where $m$ is the mean of $f$ with respect to $\mu_A$. It is in this sense that one may say that $f$ is almost constant on the set $A$. This also yields a corresponding deviation bound on the whole space,

$$\mu\{x \in M : |f - m| \ge t\} \le 2 e^{-t^2/(c\,\sigma^2(\mu) L^2)} + \frac{1}{L}.$$

Stronger integrability conditions imposed on $|\nabla f|$ can considerably sharpen the conclusion. By a similar argument, Theorem 1.1 yields, for example, the following exponential bound, known in the presence of a logarithmic Sobolev inequality for the space $(M, d, \mu)$, with $\sigma^2(\mu)$ replaced by $1/\rho$ (cf. [B-G]).

Corollary 1.2. Let $f$ be a locally Lipschitz function on $M$ with Lipschitz semi-norms $\le L$ on the sets (1.5). If $\int e^{|\nabla f|^2} \, d\mu \le 2$, then $f$ is $\mu$-integrable, and moreover,

$$\mu\{x \in M : |f - m| \ge t\} \le 2 e^{-t/(c\,\sigma(\mu))}, \qquad t > 0,$$

where $m$ is the $\mu$-mean of $f$ and $c$ is an absolute constant.

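Equivalently (up to an absolute factor), we have a Sobolev-type inequality

$$\|f - m\|_{\psi_1} \le c\,\sigma(\mu) \, \big\| |\nabla f| \big\|_{\psi_2},$$

connecting the $\psi_1$-norm of $f - m$ with the $\psi_2$-norm of the modulus of the gradient of $f$. We prove a more general version of this corollary in Section 6. As will be explained in the same section, similar assertions may also be made about convex $f$ and product measures $\mu$ on $M = [0, 1]^n$, thus extending Talagrand's theorem to the class of non-Lipschitz functions.

In view of the right bound in (1.3) and of (1.4), the spread and subgaussian constants for restricted measures can be controlled in terms of the logarithmic Sobolev constant $\rho$, via

$$s^2(\mu_A) \le \sigma^2(\mu_A) \le \frac{c}{\rho} \log\Big(\frac{e}{\mu(A)}\Big).$$

However, it may happen that $\rho = 0$ and $\sigma^2(\mu) = \infty$, while $\lambda > 0$ (e.g., for the product exponential distribution on $\mathbb{R}^n$). Then one may wonder whether one can estimate the spread constant of a restricted measure in terms of the spectral gap. In such a case there is a bound similar to (1.4).

Theorem 1.3. Assume the metric probability space $(M, d, \mu)$ satisfies a Poincaré-type inequality with $\lambda > 0$. For any $A \subset M$ with $\mu(A) > 0$, with some absolute constant $c$,

$$s^2(\mu_A) \le \frac{c}{\lambda} \log^2\Big(\frac{e}{\mu(A)}\Big). \qquad (1.7)$$

It should be mentioned that the logarithmic terms in (1.4) and (1.7) may not be removed and are actually asymptotically optimal as functions of $\mu(A)$, as $\mu(A)$ is getting small; see Section 7.

Our contribution below is organized into sections as follows:
2. Bounds on $\psi_\alpha$-norms for restricted measures
3. Proof of Theorem 1.1. Transport-entropy formulation
4. Proof of Theorem 1.3. Spectral gap
5. Examples
6. Deviations for non-Lipschitz functions
7. Optimality
8. Appendix

2 Bounds on $\psi_\alpha$-norms for restricted measures

A measurable function $f$ on the probability space $(M, \mu)$ is said to have a finite $\psi_\alpha$-norm, $\alpha \ge 1$, if for some $r > 0$,

$$\int e^{(|f|/r)^\alpha} \, d\mu \le 2.$$

The infimum over all such $r$ represents the $\psi_\alpha$-norm $\|f\|_{\psi_\alpha}$ or $\|f\|_{L^{\psi_\alpha}(\mu)}$, which is just the Orlicz norm associated with the Young function $\psi_\alpha(t) = e^{|t|^\alpha} - 1$. We are mostly interested in the particular cases $\alpha = 1$ and $\alpha = 2$. In this section we recall well-known relations between the $\psi_1$- and $\psi_2$-norms and the usual $L^p$-norms $\|f\|_p = \|f\|_{L^p(\mu)} = \big( \int |f|^p \, d\mu \big)^{1/p}$. For the reader's convenience, we include the proof in the appendix.

Lemma 2.1. We have

$$\sup_{p \ge 1} \frac{\|f\|_p}{\sqrt{p}} \le \|f\|_{L^{\psi_2}(\mu)} \le 4 \sup_{p \ge 1} \frac{\|f\|_p}{\sqrt{p}}, \qquad (2.1)$$

$$\sup_{p \ge 1} \frac{\|f\|_p}{p} \le \|f\|_{L^{\psi_1}(\mu)} \le 6 \sup_{p \ge 1} \frac{\|f\|_p}{p}. \qquad (2.2)$$

Given a measurable subset $A$ of $M$ with $\mu(A) > 0$, we consider the normalized restricted measure $\mu_A$ on $M$, i.e.,

$$\mu_A(B) = \frac{\mu(A \cap B)}{\mu(A)}, \qquad B \subset M.$$

Our basic tool leading to Theorem 1.1 will be the following assertion.

Proposition 2.2. For any measurable function $f$ on $M$,

$$\|f\|_{L^{\psi_2}(\mu_A)} \le 4e \, \log^{1/2}\Big(\frac{e}{\mu(A)}\Big) \, \|f\|_{L^{\psi_2}(\mu)}. \qquad (2.3)$$

Proof. Assume that $\|f\|_{L^{\psi_2}(\mu)} = 1$ and fix $p \ge 1$. By the left inequality in (2.1), for any $q \ge 1$,

$$q^{q/2} \ge \int |f|^q \, d\mu \ge \mu(A) \int |f|^q \, d\mu_A, \quad \text{so} \quad \frac{\|f\|_{L^q(\mu_A)}}{\sqrt{q}} \le \Big(\frac{1}{\mu(A)}\Big)^{1/q}.$$

But by the right inequality in (2.1),

$$\|f\|_{\psi_2} \le 4 \sup_{q \ge 1} \frac{\|f\|_q}{\sqrt{q}} \le 4\sqrt{p} \, \sup_{q \ge p} \frac{\|f\|_q}{\sqrt{q}}.$$

Applying it on the space $(M, \mu_A)$, we then get

$$\|f\|_{L^{\psi_2}(\mu_A)} \le 4\sqrt{p} \, \sup_{q \ge p} \frac{\|f\|_{L^q(\mu_A)}}{\sqrt{q}} \le 4\sqrt{p} \, \sup_{q \ge p} \Big(\frac{1}{\mu(A)}\Big)^{1/q} = 4\sqrt{p} \, \Big(\frac{1}{\mu(A)}\Big)^{1/p}.$$

The obtained inequality,

$$\|f\|_{L^{\psi_2}(\mu_A)} \le 4\sqrt{p} \, \Big(\frac{1}{\mu(A)}\Big)^{1/p},$$

holds true for any $p \ge 1$ and therefore may be optimized over $p$. Choosing $p = \log\frac{e}{\mu(A)}$, for which $(1/\mu(A))^{1/p} \le e$, we arrive at (2.3).

A possible weak point in the bound (2.3) is that the means of $f$ are not involved. For example, in applications, if $f$ were defined only on $A$ and had $\mu_A$-mean zero, we might need to find an extension of $f$ to the whole space $M$ keeping the mean zero with respect to $\mu$. In fact, this should not create any difficulty, since one may work with the symmetrization of $f$. More precisely, we may apply Proposition 2.2 on the product space $(M \times M, \mu \otimes \mu)$ to the product sets $A \times A$ and functions of the form $f(x) - f(y)$. Then we get

$$\|f(x) - f(y)\|_{L^{\psi_2}(\mu_A \otimes \mu_A)} \le 4e \, \log^{1/2}\Big(\frac{e}{\mu(A)^2}\Big) \, \|f(x) - f(y)\|_{L^{\psi_2}(\mu \otimes \mu)}.$$

Since $\log\big(\frac{e}{\mu(A)^2}\big) \le 2 \log\big(\frac{e}{\mu(A)}\big)$, we arrive at:

Corollary 2.3. For any measurable function $f$ on $M$,

$$\|f(x) - f(y)\|_{L^{\psi_2}(\mu_A \otimes \mu_A)} \le 4e\sqrt{2} \, \log^{1/2}\Big(\frac{e}{\mu(A)}\Big) \, \|f(x) - f(y)\|_{L^{\psi_2}(\mu \otimes \mu)}.$$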
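The optimization over $p$ in the proof of the $\psi_2$ bound can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and it only verifies that the choice $p = \log(e/\mu(A))$ makes $4\sqrt{p}\,(1/\mu(A))^{1/p}$ at most $4e \log^{1/2}(e/\mu(A))$ for a few sample values of $\mu(A)$.

```python
import math

def factor_at_p(mu_a: float, p: float) -> float:
    """Value of 4*sqrt(p)*(1/mu_a)**(1/p), the bound obtained for a fixed p >= 1."""
    return 4.0 * math.sqrt(p) * (1.0 / mu_a) ** (1.0 / p)

def closed_form_bound(mu_a: float) -> float:
    """The constant 4e*sqrt(log(e/mu_a)) claimed after optimizing over p."""
    return 4.0 * math.e * math.sqrt(math.log(math.e / mu_a))

def optimized_choice_is_within_bound(mu_a: float) -> bool:
    """Check the claim for p = log(e/mu_a) = 1 + log(1/mu_a) >= 1."""
    p_star = math.log(math.e / mu_a)
    return factor_at_p(mu_a, p_star) <= closed_form_bound(mu_a)

for mu_a in (1.0, 0.5, 1e-3, 1e-12):
    print(mu_a, factor_at_p(mu_a, math.log(math.e / mu_a)), closed_form_bound(mu_a))
```

The check works because, with $p = 1 + \log(1/\mu(A))$, the factor $(1/\mu(A))^{1/p} = \exp\{\log(1/\mu(A))/p\}$ has exponent strictly less than one.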

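Let us now derive an analog of Proposition 2.2 for the $\psi_1$-norm, using similar arguments. Assume that $\|f\|_{L^{\psi_1}(\mu)} = 1$ and fix $p \ge 1$. By the left inequality in (2.2), for any $q \ge 1$,

$$q^q \ge \int |f|^q \, d\mu \ge \mu(A) \int |f|^q \, d\mu_A, \quad \text{so} \quad \frac{\|f\|_{L^q(\mu_A)}}{q} \le \Big(\frac{1}{\mu(A)}\Big)^{1/q}.$$

But, by the right inequality in (2.2),

$$\|f\|_{L^{\psi_1}} \le 6 \sup_{q \ge 1} \frac{\|f\|_q}{q} \le 6p \, \sup_{q \ge p} \frac{\|f\|_q}{q}.$$

Applying it on the space $(M, \mu_A)$, we get

$$\|f\|_{L^{\psi_1}(\mu_A)} \le 6p \, \sup_{q \ge p} \frac{\|f\|_{L^q(\mu_A)}}{q} \le 6p \, \sup_{q \ge p} \Big(\frac{1}{\mu(A)}\Big)^{1/q} = 6p \, \Big(\frac{1}{\mu(A)}\Big)^{1/p}.$$

The obtained inequality,

$$\|f\|_{L^{\psi_1}(\mu_A)} \le 6p \, \Big(\frac{1}{\mu(A)}\Big)^{1/p},$$

holds true for any $p \ge 1$ and therefore may be optimized over $p$. Choosing $p = \log\frac{e}{\mu(A)}$, we arrive at:

Proposition 2.4. For any measurable function $f$ on $M$, we have

$$\|f\|_{L^{\psi_1}(\mu_A)} \le 6e \, \log\Big(\frac{e}{\mu(A)}\Big) \, \|f\|_{L^{\psi_1}(\mu)}.$$

Similarly to Corollary 2.3, one may write down this relation on the product probability space $(M \times M, \mu \otimes \mu)$ with functions of the form $\tilde f(x, y) = f(x) - f(y)$ and the product sets $\tilde A = A \times A$. Then we get

$$\|f(x) - f(y)\|_{L^{\psi_1}(\mu_A \otimes \mu_A)} \le 12e \, \log\Big(\frac{e}{\mu(A)}\Big) \, \|f(x) - f(y)\|_{L^{\psi_1}(\mu \otimes \mu)}. \qquad (2.4)$$

3 Proof of Theorem 1.1. Transport-entropy formulation

The finiteness of the subgaussian constant for a given metric probability space $(M, d, \mu)$ means that the $\psi_2$-norms of Lipschitz functions on $M$ with mean zero are uniformly bounded. Equivalently, for any (for all) $x_0 \in M$, we have that, for some $\lambda > 0$,

$$\int e^{d(x, x_0)^2/\lambda^2} \, d\mu(x) < \infty.$$

The definition (1.1) of $\sigma^2(\mu)$ suggests considering another norm-like quantity

$$\sigma_f^2 = \sup_{t \ne 0} \Big[ \frac{1}{t^2/2} \log \int e^{tf} \, d\mu \Big].$$

Here is a well-known relation (with explicit numerical constants) which holds in the setting of an abstract probability space $(M, \mu)$. Once again, we include a proof in the appendix for completeness.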

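Lemma 3.1. If $f$ has mean zero and finite $\psi_2$-norm, then

$$\frac{1}{6} \, \|f\|_{\psi_2}^2 \le \sigma_f^2 \le 4 \, \|f\|_{\psi_2}^2.$$

One can now relate the subgaussian constant of the restricted measure to the subgaussian constant of the original measure. Let $(M, d, \mu)$ be a metric probability space. First, Lemma 3.1 immediately yields an equivalent description in terms of $\psi_2$-norms, namely

$$\frac{1}{6} \sup_f \|f\|_{\psi_2}^2 \le \sigma^2(\mu) \le 4 \sup_f \|f\|_{\psi_2}^2, \qquad (3.1)$$

where the supremum runs over all $f : M \to \mathbb{R}$ with $\mu$-mean zero and $\|f\|_{\rm Lip} \le 1$. Here, one can get rid of the mean zero assumption by considering functions of the form $f(x) - f(y)$ on the product space $(M \times M, \mu \otimes \mu, d_1)$, where $d_1$ is the $\ell^1$-type metric given by $d_1((x_1, y_1), (x_2, y_2)) = d(x_1, x_2) + d(y_1, y_2)$. If $f$ has mean zero, then, by Jensen's inequality,

$$\int\!\!\int e^{(f(x) - f(y))^2/r^2} \, d\mu(x) \, d\mu(y) \ge \int e^{f(x)^2/r^2} \, d\mu(x),$$

which implies that $\|f(x) - f(y)\|_{L^{\psi_2}(\mu \otimes \mu)} \ge \|f\|_{L^{\psi_2}(\mu)}$. On the other hand, by the triangle inequality, $\|f(x) - f(y)\|_{L^{\psi_2}(\mu \otimes \mu)} \le 2 \|f\|_{L^{\psi_2}(\mu)}$. Hence, we arrive at another, more flexible relation, where the mean zero assumption may be removed.

Lemma 3.2. We have

$$\frac{1}{(4\sqrt{6})^2} \sup_f \|f(x) - f(y)\|_{L^{\psi_2}(\mu \otimes \mu)}^2 \le \sigma^2(\mu) \le 4 \sup_f \|f(x) - f(y)\|_{L^{\psi_2}(\mu \otimes \mu)}^2,$$

where the supremum runs over all functions $f$ on $M$ with $\|f\|_{\rm Lip} \le 1$.

Proof of Theorem 1.1. We are prepared to take the last steps in the proof of the inequality (1.4). We use the well-known Kirszbraun theorem: any function $f : A \to \mathbb{R}$ with Lipschitz semi-norm $\|f\|_{\rm Lip} \le 1$ on $A$ admits a Lipschitz extension to the whole space with the same semi-norm ([K], [MS]). Namely, one may put

$$\tilde f(x) = \inf_{a \in A} \big[ f(a) + d(a, x) \big], \qquad x \in M.$$

Applying first Corollary 2.3 and then the left inequality of Lemma 3.2 to $\tilde f$, we get

$$\|f(x) - f(y)\|_{L^{\psi_2}(\mu_A \otimes \mu_A)}^2 = \|\tilde f(x) - \tilde f(y)\|_{L^{\psi_2}(\mu_A \otimes \mu_A)}^2 \le (4e\sqrt{2})^2 \log\Big(\frac{e}{\mu(A)}\Big) \, \|\tilde f(x) - \tilde f(y)\|_{L^{\psi_2}(\mu \otimes \mu)}^2 \le (4e\sqrt{2})^2 \log\Big(\frac{e}{\mu(A)}\Big) \, (4\sqrt{6})^2 \, \sigma^2(\mu).$$

Another application of Lemma 3.2, now in the space $(A, d, \mu_A)$ and using the right inequality, yields

$$\sigma^2(\mu_A) \le 4 \, (4e\sqrt{2})^2 \log\Big(\frac{e}{\mu(A)}\Big) \, (4\sqrt{6})^2 \, \sigma^2(\mu).$$

This is exactly (1.4) with constant $c = 4 \, (4e\sqrt{2})^2 (4\sqrt{6})^2 = 3 \cdot 2^{12} e^2 \approx 90796.72$.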
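The extension formula used in the proof (an infimum of the functions $f(a) + d(a, \cdot)$, often called the McShane-Whitney extension) can be illustrated on a finite metric space. The sketch below is only an illustration under assumptions: the helper names and the toy data (ten points on a line, a four-point subset) are hypothetical and not taken from the paper.

```python
def extend_lipschitz(points, dist, subset, f_on_subset):
    """Return {x: min over a in subset of f(a) + dist(a, x)} for every x in points."""
    return {x: min(f_on_subset[a] + dist(a, x) for a in subset) for x in points}

def lipschitz_seminorm(points, dist, f):
    """Largest ratio |f(x) - f(y)| / dist(x, y) over distinct pairs of points."""
    return max(abs(f[x] - f[y]) / dist(x, y)
               for x in points for y in points if x != y)

# Toy metric space: the integers 0..9 with the usual distance on the line.
points = list(range(10))
dist = lambda a, b: abs(a - b)

# A hypothetical 1-Lipschitz function given only on a subset A.
subset = [0, 3, 4, 9]
f_on_subset = {0: 0.0, 3: 2.5, 4: 1.5, 9: -1.0}

extension = extend_lipschitz(points, dist, subset, f_on_subset)
print(extension)
print(lipschitz_seminorm(points, dist, extension))
```

On this example the extension agrees with the given values on the subset and its Lipschitz semi-norm on the whole space does not exceed 1, which are the two properties the proof relies on.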

Remark 3.3. Let us also record the following natural generalization of Theorem 1.1, which is obtained along the same arguments. Given a collection $F$ of (integrable) functions on the probability space $(M, \mu)$, define $\sigma_F^2(\mu)$ as the infimum over all $\sigma^2$ such that

$$\int e^{t(f - m)} \, d\mu \le e^{\sigma^2 t^2/2} \quad \text{for all } t \in \mathbb{R},$$

for any $f \in F$, where $m = \int f \, d\mu$. Then, with the same constant $c$ as in Theorem 1.1, for any measurable $A \subset M$ with $\mu(A) > 0$, we have

$$\sigma_{F_A}^2(\mu_A) \le c \log\Big(\frac{e}{\mu(A)}\Big) \, \sigma_F^2(\mu),$$

where $F_A$ denotes the collection of restrictions of functions $f$ from $F$ to the set $A$.

Let us now mention an interesting connection of the subgaussian constant with the Kantorovich distance

$$W_1(\mu, \nu) = \inf_\pi \int\!\!\int d(x, y) \, d\pi(x, y)$$

and the relative entropy

$$D(\nu \| \mu) = \int \log \frac{d\nu}{d\mu} \, d\nu$$

(also called Kullback-Leibler distance or informational divergence). Here, $\nu$ is a probability measure on $M$ which is absolutely continuous with respect to $\mu$ (for short, $\nu \ll \mu$), and the infimum in the definition of $W_1$ runs over all probability measures $\pi$ on the product space $M \times M$ with marginal distributions $\mu$ and $\nu$, i.e., such that

$$\pi(B \times M) = \mu(B), \qquad \pi(M \times B) = \nu(B) \qquad \text{(Borel } B \subset M).$$

As was shown in [B-G], if $(M, d)$ is a Polish space (complete separable), the subgaussian constant $\sigma^2 = \sigma^2(\mu)$ may be described as the optimal value in the transport-entropy inequality

$$W_1^2(\mu, \nu) \le 2\sigma^2 \, D(\nu \| \mu). \qquad (3.2)$$

Hence, we obtain from the inequality (1.4) a similar relation for measures $\nu$ supported on given subsets of $M$.

Corollary 3.4. Given a Borel probability measure $\mu$ on a Polish space $(M, d)$ and a closed set $A$ in $M$ such that $\mu(A) > 0$, for any Borel probability measure $\nu$ supported on $A$,

$$W_1^2(\mu_A, \nu) \le c\,\sigma^2(\mu) \log\Big(\frac{e}{\mu(A)}\Big) \, D(\nu \| \mu_A),$$

where $c$ is an absolute constant.

This assertion is actually equivalent to Theorem 1.1. Note that, for $\nu$ supported on $A$, there is an identity $D(\nu \| \mu_A) = \log \mu(A) + D(\nu \| \mu)$. In particular, $D(\nu \| \mu_A) \le D(\nu \| \mu)$, so the relative entropies decrease when turning to restricted measures.

For another (almost equivalent) description of the subgaussian constant, introduce the concentration function

$$K_\mu(r) = \sup \big[ 1 - \mu(A^r) \big] \qquad (r > 0),$$

where $A^r = \{x \in M : d(x, a) < r \text{ for some } a \in A\}$ denotes the open $r$-neighbourhood of $A$ for the metric $d$, and the sup runs over all Borel sets $A \subset M$ of measure $\mu(A) \ge \frac{1}{2}$. As is well known, the transport-entropy inequality (3.2) gives rise to a concentration inequality on $(M, d, \mu)$ of subgaussian type (K. Marton's argument), but this can also be seen by a direct application of (1.1). Indeed, for any function $f$ on $M$ with $\|f\|_{\rm Lip} \le 1$, it implies

$$\int\!\!\int e^{t(f(x) - f(y))} \, d\mu(x) \, d\mu(y) \le e^{\sigma^2 t^2}, \qquad t \in \mathbb{R},$$

and, by Chebyshev's inequality, we have a deviation bound

$$(\mu \otimes \mu)\{(x, y) \in M \times M : f(x) - f(y) \ge r\} \le e^{-r^2/(4\sigma^2)}, \qquad r \ge 0.$$

In particular, one may apply it to the distance functions $f(x) = d(A, x) = \inf_{a \in A} d(a, x)$. Assuming that $\mu(A) \ge \frac{1}{2}$, the measure on the left-hand side is greater than or equal to $\frac{1}{2} \big(1 - \mu(A^r)\big)$, so that we obtain a concentration inequality

$$1 - \mu(A^r) \le 2 e^{-r^2/(4\sigma^2)}.$$

Therefore,

$$K_\mu(r) \le \min\big\{ \tfrac{1}{2}, \, 2 e^{-r^2/(4\sigma^2)} \big\} \le e^{-r^2/(8\sigma^2)}.$$

To argue in the opposite direction, suppose that the concentration function admits a bound of the form $K_\mu(r) \le e^{-r^2/b^2}$ for all $r > 0$ with some constant $b > 0$. Given a function $f$ on $M$ with $\|f\|_{\rm Lip} \le 1$, let $m$ be a median of $f$ under $\mu$. Then the set $A = \{f \le m\}$ has measure $\mu(A) \ge \frac{1}{2}$, and by the Lipschitz property, $A^r \subset \{f < m + r\}$ for all $r > 0$. Hence, by the concentration hypothesis,

$$\mu\{f - m \ge r\} \le K_\mu(r) \le e^{-r^2/b^2}.$$

A similar deviation bound also holds for the function $-f$ with its median $-m$, so that

$$\mu\{|f - m| \ge r\} \le 2 e^{-r^2/b^2}, \qquad r > 0.$$

This is sufficient to properly estimate the $\psi_2$-norm of $f - m$ on $(M, \mu)$. Namely, for any $\lambda < 1/b^2$,

$$\int e^{\lambda |f - m|^2} \, d\mu = 1 + 2\lambda \int_0^\infty r e^{\lambda r^2} \mu\{|f - m| \ge r\} \, dr \le 1 + 4\lambda \int_0^\infty r e^{\lambda r^2} e^{-r^2/b^2} \, dr = 1 + \frac{2\lambda}{\frac{1}{b^2} - \lambda} = 2,$$

where in the last equality the value $\lambda = \frac{1}{3b^2}$ is chosen. Thus, $\int e^{|f - m|^2/(3b^2)} \, d\mu \le 2$, which means that $\|f - m\|_{\psi_2} \le \sqrt{3}\, b$. The latter gives $\|f(x) - f(y)\|_{L^{\psi_2}(\mu \otimes \mu)} \le 2\sqrt{3}\, b$. Taking the supremum over $f$, it remains to apply Lemma 3.2, and then we get $\sigma^2(\mu) \le 48\, b^2$. Let us summarize.

Proposition 3.5. Let $b = b(\mu)$ be the optimal value such that the concentration function of the space $(M, d, \mu)$ satisfies a subgaussian bound $K_\mu(r) \le e^{-r^2/b^2}$ $(r > 0)$. Then

$$\frac{1}{8} \, b^2(\mu) \le \sigma^2(\mu) \le 48 \, b^2(\mu).$$

Once this description of the subgaussian constant is recognized, one may give another proof of Theorem 1.1 by relating the concentration function $K_{\mu_A}$ to $K_\mu$. In this connection let us state below, as a lemma, a general observation due to F. Barthe and E. Milman (cf. [B-M], p. 585).

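Lemma 3.6. Let a Borel probability measure $\nu$ on $M$ be absolutely continuous with respect to $\mu$ and have density $p$. Suppose that, for some right-continuous, non-increasing function $R : (0, 1/4] \to (0, \infty)$ such that $\beta(\varepsilon) = \varepsilon/R(\varepsilon)$ is increasing, we have

$$\nu\{x \in M : p(x) > R(\varepsilon)\} \le \varepsilon \qquad \big(0 < \varepsilon \le \tfrac{1}{4}\big).$$

Then

$$K_\nu(r) \le 2 \beta^{-1}\big( K_\mu(r/2) \big) \quad \text{for all } r \ge 2 K_\mu^{-1}(\beta(1/4)).$$

Here $\beta^{-1}$ denotes the inverse function, and $K_\mu^{-1}(\varepsilon) = \inf\{r > 0 : K_\mu(r) < \varepsilon\}$.

The second proof of Theorem 1.1. The normalized restricted measure $\nu = \mu_A$ has density $p = \frac{1}{\mu(A)} 1_A$ (thus taking only one non-zero value), and an optimal choice of $R$ is the constant function $R(\varepsilon) = \frac{1}{\mu(A)}$. Hence, Lemma 3.6 yields the relation

$$K_{\mu_A}(r) \le \frac{2}{\mu(A)} \, K_\mu(r/2) \quad \text{for } r \ge 2 K_\mu^{-1}(\mu(A)/4).$$

In particular, if $K_\mu(r) \le e^{-r^2/b^2}$, then

$$K_{\mu_A}(r) \le \frac{2}{\mu(A)} \, e^{-r^2/(4b^2)} \quad \text{for } r \ge 2b \sqrt{\log(4/\mu(A))}.$$

Necessarily $K_{\mu_A}(r) \le \frac{1}{2}$, so the last relation may be extended to the whole positive half-axis. Moreover, at the expense of a factor in the exponent, one can remove the factor $\frac{2}{\mu(A)}$; more precisely, we get $K_{\mu_A}(r) \le e^{-r^2/\tilde b^2}$ with $\tilde b^2 = \frac{4b^2}{\log 2} \log\frac{4}{\mu(A)}$, that is,

$$b^2(\mu_A) \le \frac{4}{\log 2} \, \log\Big(\frac{4}{\mu(A)}\Big) \, b^2(\mu).$$

It remains to apply the two-sided bound of Proposition 3.5.

4 Proof of Theorem 1.3. Spectral gap

Theorem 1.1 ensures, in particular, that, for any function $f$ on the metric probability space $(M, d, \mu)$ with Lipschitz semi-norm $\|f\|_{\rm Lip} \le 1$,

$$\mathrm{Var}_{\mu_A}(f) \le c \log\Big(\frac{e}{\mu(A)}\Big) \, \sigma^2(\mu),$$

up to some absolute constant $c$. In fact, in order to reach a similar concentration property of the restricted measures, it is enough to start with a Poincaré-type inequality on $M$,

$$\lambda \, \mathrm{Var}_\mu(f) \le \int |\nabla f|^2 \, d\mu.$$

Under this hypothesis, a well-known theorem due to Gromov-Milman and Borovkov-Utev asserts that mean zero Lipschitz functions $f$ have bounded $\psi_1$-norm. One may use a variant of this theorem proposed by Aida and Stroock [A-S], who showed that, for mean zero $f$,

$$\int e^{\sqrt{\lambda}\, f} \, d\mu \le K_0 = 1.720102\ldots \qquad (\|f\|_{\rm Lip} \le 1).$$

Hence

$$\int e^{\sqrt{\lambda}\, |f|} \, d\mu \le 2K_0 \quad \text{and} \quad \int e^{\frac{1}{2}\sqrt{\lambda}\, |f|} \, d\mu \le \sqrt{2K_0} < 2,$$

thus implying that $\|f\|_{\psi_1} \le \frac{2}{\sqrt{\lambda}}$. In addition,

$$\int\!\!\int e^{\sqrt{\lambda}\, (f(x) - f(y))} \, d\mu(x) \, d\mu(y) \le K_0^2 \quad \text{and} \quad \int\!\!\int e^{\sqrt{\lambda}\, |f(x) - f(y)|} \, d\mu(x) \, d\mu(y) \le 2K_0^2 < 6.$$

From this,

$$\int\!\!\int e^{\frac{1}{3}\sqrt{\lambda}\, |f(x) - f(y)|} \, d\mu(x) \, d\mu(y) < 6^{1/3} < 2,$$

which means that $\|f(x) - f(y)\|_{\psi_1} \le \frac{3}{\sqrt{\lambda}}$ with respect to the product measure $\mu \otimes \mu$ on the product space $M \times M$. This inequality is translation invariant, so the mean zero assumption may be removed. Thus, we arrive at:

Lemma 4.1. Under the Poincaré-type inequality with spectral gap $\lambda > 0$, for any mean zero function $f$ on $(M, d, \mu)$ with $\|f\|_{\rm Lip} \le 1$,

$$\|f\|_{\psi_1} \le \frac{2}{\sqrt{\lambda}}.$$

Moreover, for any $f$ with $\|f\|_{\rm Lip} \le 1$,

$$\|f(x) - f(y)\|_{L^{\psi_1}(\mu \otimes \mu)} \le \frac{3}{\sqrt{\lambda}}. \qquad (4.1)$$

This is a version of the concentration of measure phenomenon (with exponential integrability) in the presence of a Poincaré-type inequality. Our goal is therefore to extend this property to the normalized restricted measures $\mu_A$. This can be achieved by virtue of the inequality (2.4), which when combined with (4.1) yields an upper bound

$$\|f(x) - f(y)\|_{L^{\psi_1}(\mu_A \otimes \mu_A)} \le \frac{36e}{\sqrt{\lambda}} \, \log\Big(\frac{e}{\mu(A)}\Big).$$

Moreover, if $f$ has $\mu_A$-mean zero, the left norm dominates $\|f\|_{L^{\psi_1}(\mu_A)}$ (by Jensen's inequality). We can summarize, taking into account once again Kirszbraun's theorem, as we did in the proof of Theorem 1.1.

Proposition 4.2. Assume the metric probability space $(M, d, \mu)$ satisfies a Poincaré-type inequality with constant $\lambda > 0$. Given a measurable set $A \subset M$ with $\mu(A) > 0$, for any function $f : A \to \mathbb{R}$ with $\mu_A$-mean zero and such that $\|f\|_{\rm Lip} \le 1$ on $A$,

$$\|f\|_{L^{\psi_1}(\mu_A)} \le \frac{36e}{\sqrt{\lambda}} \, \log\Big(\frac{e}{\mu(A)}\Big).$$

Theorem 1.3 is now easily obtained with the constant $c = 2(36e)^2$, by noting that $L^2$-norms are dominated by $L^{\psi_1}$-norms. More precisely, since $e^{|t|} - 1 \ge \frac{1}{2} t^2$, one has $\|f\|_{\psi_1}^2 \ge \frac{1}{2} \|f\|_2^2$.

Remark 4.3. Related stability results are known for various classes of probability distributions on the Euclidean spaces $M = \mathbb{R}^n$ (and even in a more general situation, where $\mu_A$ is replaced by a measure absolutely continuous with respect to $\mu$). See, in particular, the works by E. Milman [Mi1, Mi2] on convex bodies and log-concave measures.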
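The last step of this section, the domination of the $L^2$-norm by the $\psi_1$-norm, can be checked on a finitely supported distribution. The sketch below is an illustration only: the function names are hypothetical, the $\psi_1$-norm is computed by bisection as the smallest $r$ with $\int e^{|f|/r}\,d\mu \le 2$, and the three-point example is chosen so that the norm equals $1/\log 2$ in closed form.

```python
import math

def orlicz_integral(values, weights, r):
    """Return sum_i w_i * exp(|v_i| / r), or +inf if the exponential overflows."""
    try:
        return sum(w * math.exp(abs(v) / r) for v, w in zip(values, weights))
    except OverflowError:
        return math.inf

def psi1_norm(values, weights, rel_tol=1e-12):
    """Approximate the smallest r with sum_i w_i * exp(|v_i| / r) <= 2 by bisection."""
    top = max(abs(v) for v in values)
    if top == 0.0:
        return 0.0
    hi = top / math.log(2.0)      # every exponential is at most 2 at this radius
    lo = hi
    while orlicz_integral(values, weights, lo) <= 2.0:
        lo /= 2.0                 # shrink until the integral exceeds 2
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if orlicz_integral(values, weights, mid) <= 2.0:
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical example: f takes the values -1, 0, 2 with probabilities 1/4, 1/2, 1/4.
values, weights = (-1.0, 0.0, 2.0), (0.25, 0.5, 0.25)
norm = psi1_norm(values, weights)
second_moment = sum(w * v * v for v, w in zip(values, weights))
print(norm, second_moment, 2.0 * norm ** 2)
```

For this example the defining equation reduces to $u^2 + u - 6 = 0$ with $u = e^{1/r}$, so $r = 1/\log 2$, and the second moment $1.25$ is indeed below $2r^2$.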

5 Examples

Theorems 1.1 and 1.3 cover many interesting examples. Here are some natural ones.

1. The standard Gaussian measure $\mu = \gamma$ on $\mathbb{R}^n$ satisfies a logarithmic Sobolev inequality on $M = \mathbb{R}^n$ with a dimension-free constant $\rho = 1$. Hence, from Theorem 1.1 we get:

Corollary 5.1. For any measurable set $A \subset \mathbb{R}^n$ with $\gamma(A) > 0$, the subgaussian constant $\sigma^2(\gamma_A)$ of the normalized restricted measure $\gamma_A$ satisfies

$$\sigma^2(\gamma_A) \le c \log\Big(\frac{e}{\gamma(A)}\Big),$$

where $c$ is an absolute constant.

As was already mentioned, if $A$ is convex, there is a sharper bound $\sigma^2(\gamma_A) \le 1$. However, it may not hold without the convexity assumption. Nevertheless, if $\gamma(A)$ is bounded away from zero, we obtain a more universal principle.

Clearly, Corollary 5.1 extends to all product measures $\mu = \nu^n$ on $\mathbb{R}^n$ such that $\nu$ satisfies a logarithmic Sobolev inequality on the real line, with constants $c$ depending on $\rho$ only. A characterization of the property $\rho > 0$ in terms of the distribution function of the measure $\nu$ and the density of its absolutely continuous component may be found in [B-G].

2. Consider a uniform distribution $\nu$ on the shell

$$A_\varepsilon = \{x \in \mathbb{R}^n : 1 - \varepsilon \le |x| \le 1\}, \qquad 0 \le \varepsilon \le 1 \quad (n \ge 2).$$

Corollary 5.2. The subgaussian constant of $\nu$ satisfies $\sigma^2(\nu) \le \frac{c}{n}$, up to some absolute constant $c$.

In other words, mean zero Lipschitz functions $f$ on $A_\varepsilon$ are such that $\sqrt{n}\, f$ are subgaussian with a universal constant factor. This property is well known in the extreme cases, on the unit Euclidean ball $A = B^n$ ($\varepsilon = 1$) and on the unit sphere $A = S^{n-1}$ ($\varepsilon = 0$).

Let $\mu$ denote the normalized Lebesgue measure on $B^n$. In the case $\varepsilon \ge \frac{1}{n}$, the shell $A_\varepsilon$ represents the part of $B^n$ of measure

$$\mu(A_\varepsilon) = 1 - (1 - \varepsilon)^n \ge 1 - \Big(1 - \frac{1}{n}\Big)^n \ge 1 - \frac{1}{e}.$$

Since the logarithmic Sobolev constant of the unit ball is of order $n$, and therefore $\sigma^2(\mu) \le \frac{c}{n}$, the assertion of Corollary 5.2 immediately follows from Theorem 1.1.

In the case $\varepsilon \le \frac{1}{n}$, the assertion follows from a similar concentration property of the uniform distribution $\sigma_{n-1}$ on the unit sphere. Indeed, with every Lipschitz function $f$ on $A_\varepsilon$ one may associate its restriction to $S^{n-1}$, which is also Lipschitz (with respect to the Euclidean distance). We have $|f(r\theta) - f(\theta)| \le 1 - r \le \varepsilon \le \frac{1}{n}$, for any $r \in [1 - \varepsilon, 1]$ and $\theta \in S^{n-1}$. Hence,

$$|f(r\theta) - f(r'\theta')| \le |f(\theta) - f(\theta')| + |f(r\theta) - f(\theta)| + |f(r'\theta') - f(\theta')| \le |f(\theta) - f(\theta')| + \frac{2}{n},$$

whenever $r, r' \in [1 - \varepsilon, 1]$ and $\theta, \theta' \in S^{n-1}$, which implies

$$|f(r\theta) - f(r'\theta')|^2 \le 2 |f(\theta) - f(\theta')|^2 + \frac{8}{n^2}.$$

3 But the map (r, θ θ pushes forward ν onto σ n, so, we obtain that, for any c >, exp{cn f(r θ f(rθ } dν(r, θ dν(r, θ e 8/n exp{cn f(θ f(θ } dσ n (θ dσ n (θ Here, for a certain numerical constant c >, the right-hand side is bounded by a universal constant, thus proving the claim 3 The two-sided product exponential measure µ on R n with density n e ( x + + xn satisfies a Poincaré-type inequality on M = R n with a dimension-free constant λ = /4 Hence, from Proposition 4 we get: Corollary 53 For any measurable set A R n with >, and for any function f : A R with µ A -mean zero and f Lip, we have f L ψ (µa c log, where c is an absolute constant In particular, ( s (µ A c log e Clearly, Corollary 53 extends to all product measures µ = ν n on R n such that ν satisfies a Poincaré-type inequality on the real line, and with constants c depending on λ, only A characterization of the property λ > may also be given in terms of the distribution function of ν and the density of its absolutely continuous component (cf [B-G] 4a Let us take the metric probability space ({, } n, d n, µ, where d n is the Hamming distance, that is, d n (x, y = {i : x i y i }, equipped with the uniform measure µ For this particular space, Marton established the transport-entropy inequality (3 with an optimal constant σ = n 4, cf [Mar] Using the relation (3 as an equivalent definition of the subgaussian constant, we obtain from Theorem : Corollary 54 For any non-empty set A {, } n, the subgaussian constant σ (µ A of the normalized restricted measure µ A satisfies, up to an absolute constant c, σ (µ A cn log (5 4b Let us now assume that A is monotone, ie, A satisfies the condition (x,, x n A = (y,, y n A, whenever y i x i, i =,, n Recall that the discrete cube can be equipped with a natural graph structure: there is an edge between x and y whenever they are of Hamming distance d n (x, y = For monotone sets A, the graph metric d A on the subgraph of A is equal to the restriction of d n to A A Indeed, we 
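have: $d_n(x, y) \le d_A(x, y) \le d_A(x, x \wedge y) + d_A(y, x \wedge y) = d_n(x, x \wedge y) + d_n(y, x \wedge y) = d_n(x, y)$, where $x \wedge y = (x_1 \wedge y_1, \dots, x_n \wedge y_n)$. Thus, $s^2(\mu_A, d_A) \le \sigma^2(\mu_A, d_A) \le cn \log\big(e/\mu(A)\big)$.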
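The identity between the graph metric $d_A$ and the Hamming metric on monotone sets can be checked by brute force on small cubes. The sketch below is only an illustration: the helper names (`random_down_set`, `graph_distances`, `distances_agree`) and the way a monotone set is generated (as the downward closure of a few random points) are assumptions made here, not part of the paper.

```python
from collections import deque
from itertools import product
import random


def random_down_set(n, num_generators, seed):
    """Downward closure in {0,1}^n of a few random points (a monotone set)."""
    rng = random.Random(seed)
    closed = set()
    for _ in range(num_generators):
        x = [rng.randint(0, 1) for _ in range(n)]
        support = [i for i in range(n) if x[i] == 1]
        # Every y <= x coordinatewise is obtained by switching off part of the support.
        for bits in product((0, 1), repeat=len(support)):
            y = [0] * n
            for i, b in zip(support, bits):
                y[i] = b
            closed.add(tuple(y))
    return closed


def graph_distances(A, source):
    """Breadth-first distances from `source` in the subgraph of the cube induced by A."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for i in range(len(x)):
            y = x[:i] + (1 - x[i],) + x[i + 1:]  # flip one coordinate
            if y in A and y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))


def distances_agree(A):
    """True if the graph metric on A coincides with the Hamming metric on A x A."""
    for x in A:
        d_A = graph_distances(A, x)
        if any(d_A.get(y) != hamming(x, y) for y in A):
            return False
    return True
```

For a set that is not monotone the two metrics may differ: for instance, `distances_agree({(0, 0), (1, 1)})` is false, since the two points are not even connected inside the set.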

4 This can be compared with what follows from a recent result of Ding and Mossel (see [D-M] The authors proved that the conductance (Cheeger constant of (A, µ A satisfies φ(a 6n However, these type of isoperimetric results may not imply sharp concentration bounds Indeed, by using Cheeger inequality, the above inequality leads to λ c /n and s (µ A, d A /λ cn /, which is even worse than the trivial estimate s (µ A, d A diam(a n / 5 Let (M, d, µ be a (separable metric probability space with finite subgaussian constant σ (µ The previous example can be naturally generalized to the product space (M n, µ n, when it is equipped with the l -type metric n d n (x, y = d(x i, y i, x = (x,, x n, y = (y,, y n M n i= This can be done with the help of the following elementary observation Proposition 55 The subgaussian constant of the space (M n, d n, µ n is related to the subgaussian constant of (M, d, µ by the equality σ (µ n = nσ (µ Indeed, one may argue by induction on n Let f be a function on M n The Lipschitz property f Lip with respect to d n is equivalent to the assertion that f is coordinatewise Lipschitz, that is, any function of the form x i f(x has a Lipschitz semi-norm on M for all fixed coordinates x j M (j i Hence, in this case, for all t R, M e tf(x dµ(x n exp { t M f(x dµ(x n + σ t where σ = σ (µ Here the function (x,, x n M f(x dµ(x n is also coordinatewise Lipschitz Integrating the above inequality with respect to dµ n (x,, x n and applying the induction hypothesis, we thus get M n e tf(x dµ n (x exp { t }, M n f(x dµ n (x + n σ t But this means that σ (µ n nσ (µ For an opposite bound, it is sufficient to test ( for (M n, d n, µ n in the class of all coordinatewise Lipschitz functions of the form f(x = u(x + + u(x n with µ-mean zero functions u on M such that u Lip Corollary 56 For any Borel set A M n such that µ n (A >, the subgaussian constant of the normalized restricted measure µ n A with respect to the l -type metric d n satisfies ( σ (µ n A cnσ e (µ log µ 
n, (A where c is an absolute constant For example, if µ is a probability measure on M = R such that /λ dµ(x ex (λ >, then for the restricted product measures we have ( σ (µ n A cnλ e log µ n (5 (A with respect to the l -norm x = x + + x n on R n Indeed, by the integral hypothesis on µ, for any f on R with f Lip, e (f(x f(y /λ dµ(xdµ(y } e (x y /λ dµ(xdµ(y e (x +y /λ dµ(xdµ(y 4

5 Hence, if f has µ-mean zero, by Jensen s inequality, e f(x /4λ dµ(x e (f(x f(y /4λ dµ(xdµ(y, meaning that f L ψ (µ λ By Lemma 3, cf (3, it follows that σ (µ 6λ, so, (5 holds true by an application of Corollary 56 6 Deviations for non-lipschitz functions Let us now turn to the interesting question on the relationship between the distribution of a locally Lipschitz function and the distribution of its modulus of the gradient We still keep the setting of a metric probability space (M, d, µ and assume it has a finite subgaussian constant σ = σ (µ (σ Let us say that a continuous function f on M is locally Lipschitz, if f(x is finite for all x M Recall that we consider the sets A = {x M : f(x L}, L > (6 First we state a more general version of Corollary Theorem 6 Assume that a locally Lipschitz function f on M has Lipschitz semi-norms L on sets of the form (6 If µ{ f L }, then for all t >, (µ µ { f(x f(y t } [ inf e t /cσ L + µ { f > L }], (6 L L where c is an absolute constant Proof Although the argument is already mentioned in Section, let us replace (6 with a slightly different bound First note that the Lipschitz semi-norm of f with respect to the metric d in M is the same as its Lipschitz semi-norm with respect to the metric on the set A induced from M (which is true for any non-empty subset of M Hence, we are in a position to apply Theorem, and then Definition ( for the normalized restriction µ A yields a subgaussian bound: e t(f(x f(y dµ A (xdµ A (y e cσ L t /, for all t R, where A is defined in (6 with L L, and c is a universal constant From this, for any t >, (µ A µ A {(x, y A A : f(x f(y t} e t /(cσ L, and therefore (µ µ {(x, y A A : f(x f(y t} e t /(cσ L The product measure of the complement of A A does not exceed µ{ f(x > L}, and we obtain (6 If e f dµ, we have, by Chebyshev s inequality, µ{ f L} e L, so one may take L = log 4 Theorem 6 then gives that, for any L log 4, (µ µ { f(x f(y t } e t /cσ L + 4e L

6 For t σ, one may choose here L = t σ, leading to (µ µ { f(x f(y t } 6 e t/cσ, for some absolute constant c > In case t σ, this inequality is fulfilled automatically, so it holds for all t As a result, with some absolute constant C, f(x f(y ψ Cσ, which is an equivalent way to state the inequality of Corollary As we have already mentioned, with the same arguments inequalities like (6 can be derived on the basis of subgaussian constants defined for different classes of functions For example, one may consider the subgaussian constant σf (µ for the class F of all convex Lipschitz functions f on the Euclidean space M = R n (which we equip with the Euclidean distance Note that f(x is everywhere finite in the n-space, when f is convex Keeping in mind Remark 43, what we need is the following analog of Kirszbraun s theorem: Lemma 6 Let f be a convex function on R n For any L >, there exists a convex function g on R n such that f = g on the set A = {x : f(x L} and g L on R n Accepting for a moment this lemma without proof, we get: Theorem 63 Assume that a convex function f on R n satisfies µ{ f L } Then for all t >, (µ µ { f(x f(y t } [ inf e t /cσ L + µ { f > L }], L L where σ = σf (µ and c is an absolute constant For illustration, let µ = µ µ n be an arbitrary product probability measure on the cube [, ] n If f is convex and Lipschitz on R n, thus with f, then (µ µ { f(x f(y t } e t /c (63 This is one of the forms of Talagrand s concentration phenomenon for the family of convex sets/functions (cf [T, T], [M], [L] That is, the subgaussian constants σf (µ are bounded for the class F of convex Lipschitz f and product measures µ on the cube Hence, using Theorem 63, Talagrand s deviation inequality (63 admits a natural extension to the class of non-lipschitz convex functions: Corollary 64 Let µ be a product probability measure on the cube, and let f be a convex function on R n If µ{ f L }, then for all t >, (µ µ { f(x f(y t } [ inf e t /cl + µ { f > L }], L L where c is an 
absolute constant. In particular, we have a statement similar to Corollary 1.2 for this family of functions, namely $\|f - m\|_{L^{\psi_1}(\mu)} \le c\, \| |\nabla f| \|_{L^{\psi_2}(\mu)}$, where $m$ is the $\mu$-mean of $f$.

Proof of Lemma 6.2. An affine function $l_{a,v}(x) = a + \langle x, v \rangle$ ($v \in \mathbb{R}^n$, $a \in \mathbb{R}$) may be called a tangent function to $f$, if $f \ge l_{a,v}$ on $\mathbb{R}^n$ and $f(x) = l_{a,v}(x)$ for at least one point $x$. It is well known that $f(x) = \sup\{l_{a,v}(x) : l_{a,v} \in \mathcal{L}\}$,

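where $\mathcal{L}$ denotes the collection of all tangent functions $l_{a,v}$. Put $g(x) = \sup\{l_{a,v}(x) : l_{a,v} \in \mathcal{L},\ |v| \le L\}$. By the construction, $g \le f$ on $\mathbb{R}^n$ and, moreover, $\|g\|_{\rm Lip} \le \sup\{\|l_{a,v}\|_{\rm Lip} : l_{a,v} \in \mathcal{L},\ |v| \le L\} = \sup\{|v| : l_{a,v} \in \mathcal{L},\ |v| \le L\} \le L$. It remains to show that $g = f$ on the set $A = \{|\nabla f| \le L\}$. Let $x \in A$ and let $l_{a,v}$ be tangent to $f$ and such that $l_{a,v}(x) = f(x)$. This implies that $f(y) - f(x) \ge \langle y - x, v \rangle$ for all $y \in \mathbb{R}^n$ and hence $|\nabla f(x)| = \limsup_{y \to x} \frac{|f(y) - f(x)|}{|y - x|} \ge \limsup_{y \to x} \frac{\langle y - x, v \rangle}{|y - x|} = |v|$. Thus, $|v| \le L$, so that $g(x) \ge l_{a,v}(x) = f(x)$.

7. Optimality

Here we show that the logarithmic dependence in $\mu(A)$ in Theorems 1.1 and 1.3 is optimal, up to the universal constant $c$. We provide several examples.

Example 1. Let us return to Example 4, Section 5, of the discrete hypercube $M = \{0, 1\}^n$, which we equip with the Hamming distance $d_n$ and the uniform measure $\mu$. Let us test the inequality (5.1) of Corollary 5.4 on the set $A \subset \{0, 1\}^n$ consisting of $n + 1$ points $(0, 0, 0, \dots, 0)$, $(1, 0, 0, \dots, 0)$, $(1, 1, 0, \dots, 0)$, $\dots$, $(1, 1, 1, \dots, 1)$. We have $\mu(A) = (n + 1)/2^n \ge 1/2^n$. The function $f : A \to \mathbb{R}$, defined by $f(x) = \#\{i : x_i = 1\} - \frac{n}{2}$, has a Lipschitz semi-norm $\|f\|_{\rm Lip} \le 1$ with respect to $d_n$ and the $\mu_A$-mean zero. Moreover, $\int f^2\, d\mu_A = \frac{n(n+2)}{12}$. Expanding the inequality $\int e^{tf}\, d\mu_A \le e^{\sigma^2(\mu_A) t^2/2}$ at the origin yields $\int f^2\, d\mu_A \le \sigma^2(\mu_A)$. Hence, recalling that $\sigma^2(\mu) \le \frac{n}{4}$, we get $\sigma^2(\mu_A) \ge \int f^2\, d\mu_A \ge \frac{n^2}{12} \ge \frac{n}{3}\, \sigma^2(\mu) \ge \frac{1}{3 \log 2}\, \sigma^2(\mu) \log \frac{1}{\mu(A)}$. This example shows the optimality of (5.1) in the regime $\mu(A) \to 0$.

Example 2. Let $\gamma_n$ be the standard Gaussian measure on $\mathbb{R}^n$ of dimension $n \ge 2$. We have $\sigma^2(\gamma_n) = 1$. Consider the normalized measure $\gamma_{A_R}$ on the set $A_R = \{(x_1, x_2, \dots, x_n) \in \mathbb{R}^n : x_1^2 + x_2^2 \ge R^2\}$, $R \ge 0$. Using the property that the function $\frac{1}{2}(x_1^2 + x_2^2)$ has a standard exponential distribution under the measure $\gamma_n$, we find that $\gamma_n(A_R) = e^{-R^2/2}$. Moreover, $s^2(\gamma_{A_R}) \ge {\rm Var}_{\gamma_{A_R}}(x_1) = \int x_1^2\, d\gamma_{A_R}(x) = \frac{1}{2} \int (x_1^2 + x_2^2)\, d\gamma_{A_R}(x) = e^{R^2/2} \int_{R^2/2}^\infty r e^{-r}\, dr = \frac{R^2}{2} + 1 = \log \frac{e}{\gamma_n(A_R)}$.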
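The two explicit computations in Examples 1 and 2 are easy to confirm numerically. The following sketch is only a sanity check under stated assumptions: the function names are hypothetical, Example 1 is evaluated exactly with rational arithmetic, and the integral of Example 2 is approximated by a midpoint rule on a truncated range (the truncation point `upper` and the number of steps are arbitrary choices made here).

```python
import math
from fractions import Fraction


def example1_moments(n):
    """Mean and second moment of f(x) = #{i : x_i = 1} - n/2 under the uniform
    measure on the n + 1 points (0,...,0), (1,0,...,0), ..., (1,...,1)."""
    values = [Fraction(k) - Fraction(n, 2) for k in range(n + 1)]
    mean = sum(values) / (n + 1)
    second = sum(v * v for v in values) / (n + 1)
    return mean, second


def example2_variance(R, upper=60.0, steps=200_000):
    """Midpoint-rule approximation of e^{R^2/2} * int_{R^2/2}^{infinity} r e^{-r} dr,
    with the integral truncated at R^2/2 + upper."""
    a = R * R / 2.0
    h = upper / steps
    total = 0.0
    for k in range(steps):
        r = a + (k + 0.5) * h
        total += r * math.exp(a - r)  # integrand with the factor e^{a} absorbed
    return total * h


def log_e_over_measure(R):
    """log(e / gamma_n(A_R)) with gamma_n(A_R) = exp(-R^2/2)."""
    return 1.0 - math.log(math.exp(-R * R / 2.0))
```

With these helpers, `example1_moments(n)` returns mean zero and second moment $n(n+2)/12$, and `example2_variance(R)` agrees with `log_e_over_measure(R)` up to the quadrature error.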

8 Therefore, ( σ (γ AR s (γ AR log e γ n (A R showing that the inequality (4 of Theorem is optimal, up to the universal constant, for any value of γ n (A [, ] Example 3 A similar conclusion can be made about the uniform probability measure µ on the Euclidean ball B(, n of radius n, centred at the origin (asymptotically for growing dimension n To see this, it is sufficient to consider the cylinders A ε = { (x, y R R n : x n ε and y ε }, < ε n, and the function f(x = x We leave to the readers the corresponding computations Example 4 Let µ be the two-sided exponential measure on R with density e x In this case σ (µ =, but, as easy to see, s (µ 4 (recall that λ (µ = 4 We are going to test optimality of the inequality (7 on the sets A R = {x R : x R} (R Clearly, µ(a R = e R, and we find that Therefore, s (µ AR Var µar (x = x dµ AR (x = e R, R r e r dr = R + R + (R + = log µ(a R ( s (µ AR log e, µ(a R showing that the inequality (7 is optimal, up to the universal constant, for any value of (, ] Acknowledgment The authors gratefully acknowledge the support and hospitality of the Institute for Mathematics & Applications (IMA, and the University of Minnesota, Minneapolis, where much of this work was conducted The second named author would like to acknowledge the hospitality of the Georgia Institute of Technology, Atlanta, during the period /8-3/5 References [A-S] Aida, S; Strook, D Moment estimates derived from Poincaré and logarithmic Sobolev inequalities Math Research Letters (994, 75 86 [A-B-S] Alon, N; Boppana, R; Spencer, J An asymptotic isoperimetric inequality Geometric and Functional Anal 8 (998, 4 436 [B-L] Bakry, D; Ledoux, M Lévy Gromov s isoperimetric inequality for an infinite dimensional diffusion generator Invent Math 3 (996, 59 8 [B-M] Barthe, F; Milman, E Transference principles for log-sobolev and spectral-gap with applications to conservative spin systems Comm Math Phys 33 (3, no, 575 65 [B] Bobkov, S G Localization proof of the isoperimetric Bakry-Ledoux 
inequality and some applications Teor Veroyatnost i Primenen 47 (, no, 34 346 Translation in: Theory Probab Appl 47 (3, no, 38 34 [B] Bobkov, S G Perturbations in the Gaussian isoperimetric inequality J Math Sciences (New York, 66 (, no 3, 5 38 Translated from: Problems in Math Analysis 45 (, 3 4

9 [B-G] Bobkov, S G; Götze, F Exponential integrability and transportation cost related to logarithmic Sobolev inequalities J Funct Anal 63 (999, no, pp 8 [B-G-H] Bobkov, S G; Houdré, C; Tetali, P The subgaussian constant and concentration inequalities Israel J Math 56 (6, 55 83 [D-M] Ding, J; Mossel, E Mixing under monotone censoring Electronic Comm Probab 9 (4, 6 [K] [L] Kirszbraun, M D Über die zusammenziehende und Lipschitzsche Transformationen Fund Math (934 77 8 Ledoux, M The concentration of measure phenomenon Mathematical Surveys and Monographs 89 (, Amer Math Soc, Providence, RI [Mar] Marton, K Bounding d-distance by informational divergence: a method to prove measure concentration, Ann Probab 4 (996, no, 857 866 [M] Maurey, B Some deviation inequalities Geom Funct Anal (99, 88 97 [MS] McShane, E J Extension of range of functions Bull Amer Math Soc 4 (934, no, 837 84 [Mi] Milman, E On the role of convexity in isoperimetry, spectral gap and concentration Invent Math 77 (9, no, 43 [Mi] Milman, E Properties of isoperimetric, functional and transport-entropy inequalities via concentration Probab Theory Related Fields 5 (, no 3 4, 475-57 [M-W] Mueller, C E; Weissler, F B Hypercontractivity for the heat semigroup for ultraspherical polynomials and on the n-sphere J Funct Anal 48 (99, 5 83 [T] Talagrand, M An isoperimetric theorem on the cube and the Khinchine Kahane inequalities Proc Amer Math Soc 4 (988, 95 99 [T] Talagrand, M Concentration of measure and isoperimetric inequalities in product spaces Publ Math IHES 8 (995, 73 5 Appendix Proof of Lemma Using the homogeneity, in order to derive the right-hand side inequality f in (, we may assume that sup p p p Then f p dµ p p/ for all p, and by Chebyshev s inequality, ( p p, F (t µ{ f t} for all t > t If t, choose here p = 4 t, in which case F (t 4 t Integrating by parts, we have, for any < ε < log 4, e εf dµ = = + ε + ε = e 4ε + e εt d( F (t te εt ( F (t dt + ε te εt dt + ε ε log 4 ε e (log 4ε te εt ( F (t dt te 
εt e log 4 t dt = e ( 4ε ε + ( log 4 ε

If ε log 8, the latter expression does not exceed 3 e4ε which does not exceed for ε log(4/3 4 Both inequalities are fulfilled for ε = log, and with this value e εf dµ Hence f L ψ (µ = ε log < 4, which yields the right inequality in ( Conversely, if f L ψ (µ =, then e 3 4 f dµ Since u(t = t p e 3 4 t p is maximized in t > at t = 3, we get ( p f p p = u( f e f p dµ u(t = 3e Hence, f p p /p 3e <, which yields the left inequality f Now, let us turn to ( and assume that sup p p p = Then f p dµ p p for all p, and by Chebyshev s inequality, for all t >, ( p p F (t µ{ f t} t If t, we may choose here p = t in which case F (t t, while for t < we choose p =, so that F (t log t Arguing as before, we have, for any < ε <, e ε f dµ = + ε e εt ( F (t dt + ε e εt ( F (t dt + ε e εt ( F (t dt + ε e εt e εt dt + ε t dt + ε The pre-last integral can be bounded by e ε t e ε f dµ e ε + εe ε log + e εt e log t dt dt = e ε log, so ε log log ε e ( ε For ε = 6, the latter expression is equal to 98939, and thus e ε f dµ < Hence f L ψ (µ ε = 6 Conversely, if f L ψ (µ =, then e 3 4 f dµ = Since u(t = t p e 3 4 t is maximized at t = p, we get ( p f p p = u( f e 3 4p 4 f dµ u(t = 3e Hence, f p p /p 4 3e <, which yields the left inequality Proof of Lemma 3 First assume that f ψ =, in particular e 7 8 f dµ The function u(t = log e tf dµ is smooth, convex, with u( = and u (t = fe tf dµ e tf dµ

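In particular, $u'(0) = 0$. Note that, by Jensen's inequality, $\int e^{tf}\, d\mu \ge 1$, so $u(t) \ge 0$. Further differentiation gives

$u''(t) = \frac{\int f^2 e^{tf}\, d\mu \int e^{tf}\, d\mu - \big(\int f e^{tf}\, d\mu\big)^2}{\big(\int e^{tf}\, d\mu\big)^2} \le \int f^2 e^{tf}\, d\mu.$

Using $tf \le \frac{t^2 + f^2}{2}$ and the elementary inequality $x^2 e^{-\frac{3}{8} x^2} \le \frac{8}{3e}$, we get, for $|t| \le 1$,

$\int f^2 e^{tf}\, d\mu \le \int f^2 e^{\frac{t^2 + f^2}{2}}\, d\mu = e^{t^2/2} \int f^2 e^{-\frac{3}{8} f^2} e^{\frac{7}{8} f^2}\, d\mu \le e^{1/2}\, \frac{8}{3e} \int e^{\frac{7}{8} f^2}\, d\mu \le 4.$

Thus, $u''(t) \le 4$, and by Taylor's formula, $u(t) \le 2t^2$. On the other hand, for $|t| \ge 1$, by Cauchy's inequality,

$\int e^{tf}\, d\mu \le \int e^{\frac{t^2 + f^2}{2}}\, d\mu = e^{t^2/2} \int e^{f^2/2}\, d\mu \le e^{t^2/2} \Big(\int e^{\frac{7}{8} f^2}\, d\mu\Big)^{4/7} \le 2^{4/7} e^{t^2/2} \le e^{(\frac{1}{2} + \frac{4}{7} \log 2)\, t^2} \le e^{t^2}.$

Hence, in this case $u(t) \le t^2$. Thus, $\sigma_f^2 = \sup_{t \ne 0} \frac{u(t)}{t^2/2} \le 4$, proving the right inequality of Lemma 2.3.

For the left inequality, let $\sigma_f^2 = 1$. Then $\int e^{tf}\, d\mu \le e^{t^2/2}$ for all $t \in \mathbb{R}$, which implies $1 - F(t) = \mu\{|f| \ge t\} \le 2 e^{-t^2/2}$, $t \ge 0$. From this, integrating by parts, we have, for any $0 < \varepsilon < \frac{1}{2}$,

$\int e^{\varepsilon f^2}\, d\mu = \int_0^\infty e^{\varepsilon t^2}\, dF(t) = -\int_0^\infty e^{\varepsilon t^2}\, d(1 - F(t)) = 1 + 2\varepsilon \int_0^\infty t e^{\varepsilon t^2} (1 - F(t))\, dt \le 1 + 4\varepsilon \int_0^\infty t e^{\varepsilon t^2} e^{-t^2/2}\, dt = 1 + \frac{4\varepsilon}{1 - 2\varepsilon}.$

The last expression is equal to 2 for $\varepsilon = \frac{1}{6}$, which means that $\|f\|_{\psi_2} \le \sqrt{6}$.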
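The numerical constants used in this last proof can be double-checked with a few lines of code. The sketch below is an illustration only: the function names, the truncation point `upper` and the grid sizes are assumptions chosen here. It evaluates the bound $1 + 4\varepsilon \int_0^\infty t e^{\varepsilon t^2} e^{-t^2/2}\, dt$ by a midpoint rule and compares the grid maximum of $x^2 e^{-3x^2/8}$ with $8/(3e)$.

```python
import math


def tail_integral(eps, upper=40.0, steps=200_000):
    """Midpoint rule for the integral of t * exp(eps*t^2) * exp(-t^2/2) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += t * math.exp((eps - 0.5) * t * t)
    return total * h


def psi2_bound(eps):
    """Numerical value of 1 + 4*eps*integral, for 0 < eps < 1/2."""
    return 1.0 + 4.0 * eps * tail_integral(eps)


def closed_form(eps):
    """Closed form 1 + 4*eps / (1 - 2*eps) of the same bound."""
    return 1.0 + 4.0 * eps / (1.0 - 2.0 * eps)


def grid_max_elementary(num=100_000, upper=10.0):
    """Maximum of x^2 * exp(-3 x^2 / 8) over a uniform grid on [0, upper]."""
    return max(
        (i * upper / num) ** 2 * math.exp(-3.0 * (i * upper / num) ** 2 / 8.0)
        for i in range(num + 1)
    )
```

At `eps = 1/6` both `psi2_bound` and `closed_form` return 2 up to rounding, and `grid_max_elementary()` stays below $8/(3e) \approx 0.981$, the value attained at $x = \sqrt{8/3}$.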