NEW FUNCTIONAL INEQUALITIES


NEW FUNCTIONAL INEQUALITIES VIA STEIN'S METHOD

Giovanni Peccati (Luxembourg University)

IMA, Minneapolis: April 28, 2015

INTRODUCTION

Based on two joint works: (1) Nourdin, Peccati and Swan (JFA, 2014), and (2) Ledoux, Nourdin and Peccati (GAFA, 2015).

The results to follow are a ramification of a recently established direction of research, focussing on deriving probabilistic approximations (e.g., Berry-Esseen theorems) from the combination of Stein's method and the Malliavin calculus of variations. See Nourdin and Peccati (2009, 2012).

I shall use the notation $(\Omega, \mathcal{F}, P)$ for a generic probability space, with $E$ indicating expectation with respect to $P$.

A GENERAL QUESTION

Fix two integers $k, d \geq 1$.

- $G = (G_1, \ldots, G_k)$ is a $k$-dimensional vector of i.i.d. standard Gaussian random variables (the underlying Gaussian field).
- $F = (f_1(G), \ldots, f_d(G))$ is a $d$-dimensional vector of smooth (non-linear) transformations of $G$ (the unknown distribution), with identity covariance matrix. We assume that $F$ has a density.
- $N = (N_1, \ldots, N_d)$ is a $d$-dimensional vector of i.i.d. standard Gaussian random variables (the target distribution).

Question: can we (meaningfully!) bound the quantity
$$\mathrm{TV}(F, N) = \sup_{A \in \mathcal{B}(\mathbb{R}^d)} \big| P(F \in A) - P(N \in A) \big| \, ?$$

NOTATION

The law of $N$ is denoted by $\gamma_d$, that is,
$$\gamma_d(dx_1, \ldots, dx_d) = \frac{1}{(2\pi)^{d/2}} \exp\Big\{ -\frac{1}{2} \sum_{i=1}^d x_i^2 \Big\}\, dx_1 \cdots dx_d =: \phi(x)\, dx.$$

We shall assume that the law of $F$ has a smooth density $h$ with respect to $\gamma_d$, that is:
$$P[F \in A] = \int_A h(x)\, \gamma_d(dx), \quad A \subset \mathbb{R}^d.$$

This implies, in particular, that
$$\mathrm{TV}(F, N) = \frac{1}{2} \int_{\mathbb{R}^d} |h(x) - 1|\, \gamma_d(dx).$$

THE ORNSTEIN-UHLENBECK SEMIGROUP

It will be useful to consider the standard interpolation
$$F_t := e^{-t} F + \sqrt{1 - e^{-2t}}\, N, \quad t \geq 0,$$
so that $F_0 = F$ and $F_\infty = N$.

It is easy to prove that the density of $F_t$ is given by $P_t h$, where, for a given test function $g$,
$$P_t g(x) = E\big[ g(e^{-t} x + \sqrt{1 - e^{-2t}}\, N) \big], \quad t \geq 0,$$
defines the Ornstein-Uhlenbeck semigroup (Mehler's form). We denote by $L$ the generator of $\{P_t\}$. It is well known that $L$ (as an operator on $L^2(\gamma_k)$) has eigenvalues $0, -1, -2, \ldots$; the corresponding eigenspaces $\{C_n : n \geq 0\}$ are the so-called Wiener chaoses associated with $\gamma_k$.
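
As a quick numerical sanity check (not part of the talk), Mehler's form of $P_t$ can be simulated directly. The Hermite polynomial $H_2(x) = x^2 - 1$ lies in the second Wiener chaos, so $P_t H_2 = e^{-2t} H_2$; the sketch below verifies this by Monte Carlo (the sample size and test point are arbitrary choices):

```python
import math
import random

random.seed(0)

def P_t(g, x, t, n=200_000):
    # Ornstein-Uhlenbeck semigroup in Mehler's form:
    # P_t g(x) = E[g(e^{-t} x + sqrt(1 - e^{-2t}) N)], with N ~ N(0, 1).
    a = math.exp(-t)
    b = math.sqrt(1.0 - math.exp(-2.0 * t))
    return sum(g(a * x + b * random.gauss(0.0, 1.0)) for _ in range(n)) / n

# H_2(x) = x^2 - 1 is an eigenfunction of the semigroup: P_t H_2 = e^{-2t} H_2.
x, t = 1.5, 0.7
approx = P_t(lambda y: y * y - 1.0, x, t)
exact = math.exp(-2.0 * t) * (x * x - 1.0)
print(approx, exact)
```

The two printed values agree up to Monte Carlo error, illustrating both the interpolation and the spectral picture above.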

THE STEIN KERNEL

An application of integration by parts shows that, for every smooth mapping $g : \mathbb{R}^d \to \mathbb{R}$,
$$E[F g(F)] = E[\tau_h(F)\, \nabla g(F)] \quad \text{(as vectors)},$$
where $\tau_h$ is the $d \times d$ matrix given by
$$\tau_h^{i,j}(F) = E\big[ \langle \nabla f_j(G),\, -\nabla L^{-1} f_i(G) \rangle_{\mathbb{R}^k} \,\big|\, F \big]$$
($L^{-1}$ is the (pseudo-)inverse of the OU generator on $L^2(\gamma_k)$). We call $\tau_h$ the Stein kernel associated with $F$. Note that, if $h = 1$ (and therefore $F \stackrel{\mathrm{law}}{=} N$), then necessarily $\tau_h = \mathrm{Id}$.

DISCREPANCY

Definition. The Stein discrepancy $S$ between $F$ and $N$ is defined as follows:
$$S = S(F \,\|\, N) := \Big( \sum_{i,j=1}^d E\big[ (\tau_h^{i,j} - \delta_{ij})^2 \big] \Big)^{1/2}.$$

A direct computation shows that
$$S(F_t \,\|\, N) \leq e^{-2t}\, S(F \,\|\, N), \quad t \geq 0.$$

SOME ESTIMATES

Can one use $S$ to rigorously measure the distance between $F$ and $N$? Stein's method (Stein '72, '86) allows one to obtain the following bounds (see Nourdin and Peccati '09, '12; Nourdin and Rosinski '12).

- When $d = 1$,
$$\mathrm{TV}(F, N) \leq 2\, S(F \,\|\, N).$$
- When $d = 1$ and $F$ belongs to the $q$th Wiener chaos of $\gamma_k$ ($q \geq 2$), then
$$\mathrm{TV}(F, N) \leq 2\, S(F \,\|\, N) \leq 2\sqrt{\frac{q-1}{3q}}\, \sqrt{E[F^4] - 3}$$
(note that $3 = E[N^4]$; see Nualart and Peccati (2005)).
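
As a numerical illustration (not from the talk), take the simplest chaotic example with $d = k = 1$, $q = 2$: $F = (G^2 - 1)/\sqrt{2}$. A hand computation under the conventions above gives $\tau_h(F) = 1 + \sqrt{2}\, F$, hence $S^2(F \,\|\, N) = 2 E[F^2] = 2$, while $E[F^4] = 15$, so the fourth-moment bound $\frac{q-1}{3q}(E[F^4] - 3) = 2$ is attained with equality. A Monte Carlo check:

```python
import math
import random

random.seed(7)
n = 500_000
# F = (G^2 - 1)/sqrt(2): 2nd-chaos element with E F = 0, E F^2 = 1.
# Hand computation (an assumption verified here, not stated on the slides):
# tau_h(F) = 1 + sqrt(2) F, so S^2 = E[(tau_h(F) - 1)^2] = 2 E[F^2] = 2.
fs = [(g * g - 1.0) / math.sqrt(2.0)
      for g in (random.gauss(0.0, 1.0) for _ in range(n))]
S2_hat = sum(2.0 * f * f for f in fs) / n     # Monte Carlo estimate of S^2
m4_hat = sum(f ** 4 for f in fs) / n          # Monte Carlo estimate of E[F^4]
bound = (2 - 1) / (3 * 2) * (m4_hat - 3.0)    # fourth-moment bound, q = 2
print(S2_hat, m4_hat, bound)
```

Both estimates hover around 2, so for this example the chain of inequalities collapses to equalities.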

SOME ESTIMATES

In general, for any $d \geq 1$,
$$\mathrm{Wass}_1(F, N) := \sup_{g \in \mathrm{Lip}(1)} \big| E[g(F)] - E[g(N)] \big| \leq S(F \,\|\, N).$$

Finally, for any $d \geq 1$ and when each $F_i$ is chaotic,
$$\mathrm{Wass}_1(F, N) \leq S(F \,\|\, N) \leq \sqrt{ E[\|F\|_{\mathbb{R}^d}^4] - E[\|N\|_{\mathbb{R}^d}^4] }.$$

Very large scope of applications, e.g.: (1) variations of Gaussian-subordinated processes, (2) local times of fractional processes, (3) zeros of random polynomials, (4) statistical analysis of spherical fields, (5) excursion sets of random fields on homogeneous spaces, (6) random matrices, (7) universality principles.

FROM WASSERSTEIN TO TOTAL VARIATION

For a general dimension $d$, going from $\mathrm{Wass}_1$ to $\mathrm{TV}$ is a remarkably difficult task. For instance, Nourdin, Nualart and Poly (2013) have proved that, if $F_n \stackrel{\mathrm{law}}{\to} N$ and $F_n$ lives in the sum of the first $q$ chaoses, then for every
$$\alpha < \frac{1}{1 + (d+1)(3 + 4d(q+1))},$$
$$\mathrm{TV}(F_n, N) = O(1)\, \Big( E[\|F_n\|_{\mathbb{R}^d}^4] - E[\|N\|_{\mathbb{R}^d}^4] \Big)^{\alpha}.$$

We will address this task by using the notions of entropy ($H$), Fisher information ($I$), and 2-Wasserstein distance ($W$).

ENTROPY

Definition. The relative entropy of $F$ with respect to $N$ is given by
$$H(F \,\|\, N) := \int_{\mathbb{R}^d} h(x) \log h(x)\, \gamma_d(dx) = E[-\log \phi(N)] - E[-\log p(F)] = \mathrm{Ent}(N) - \mathrm{Ent}(F)$$
($\phi$ = density of $N$; $p = h\phi$ = density of $F$).

Recall Pinsker's inequality:
$$\mathrm{TV}(F, N)^2 \leq \frac{1}{2}\, H(F \,\|\, N).$$
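
Pinsker's inequality can be checked in closed form on Gaussian shifts (a standard example, added here for concreteness, not taken from the slides): for $F \sim \mathcal{N}(m, 1)$ and $N \sim \mathcal{N}(0, 1)$ one has $H(F \,\|\, N) = m^2/2$ and $\mathrm{TV}(F, N) = 2\Phi(m/2) - 1$, where $\Phi$ is the standard normal CDF:

```python
import math

def Phi(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# For F ~ N(m, 1) and N ~ N(0, 1):
#   H(F||N) = m^2 / 2  and  TV(F, N) = 2 Phi(m/2) - 1  (closed forms).
for m in (0.5, 1.0, 2.0):
    H = m * m / 2.0
    TV = 2.0 * Phi(m / 2.0) - 1.0
    assert TV <= math.sqrt(H / 2.0)   # Pinsker: TV <= sqrt(H/2)
    print(m, TV, math.sqrt(H / 2.0))
```

For instance, $m = 1$ gives $\mathrm{TV} \approx 0.383 \leq 0.5 = \sqrt{H/2}$, so the inequality holds with visible slack.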

FISHER INFORMATION

Definition. The relative Fisher information of $F$ with respect to $N$ is given by
$$I(F \,\|\, N) := \int_{\mathbb{R}^d} \frac{|\nabla h(x)|^2}{h(x)}\, \gamma_d(dx) = E\big[ |\nabla \log h(F)|^2 \big] = E\big[ |\nabla \log p(F) - \nabla \log \phi(F)|^2 \big]$$
($\phi$ = density of $N$; $p = h\phi$ = density of $F$).

It is well known that
$$I(F_t \,\|\, N) \leq e^{-2t}\, I(F \,\|\, N), \quad t \geq 0 \quad \text{(exponential decay)}.$$

IMPORTANT RELATIONS

A famous formula, due to de Bruijn, states that
$$H(F \,\|\, N) = \int_0^\infty I(F_t \,\|\, N)\, dt$$
(the proof is based on the use of the heat equation). Using the exponential decay of $t \mapsto I(F_t \,\|\, N)$, we immediately deduce the log-Sobolev inequality (Gross, 1975):
$$H(F \,\|\, N) \leq I(F \,\|\, N) \int_0^\infty e^{-2t}\, dt = \frac{1}{2}\, I(F \,\|\, N).$$
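
A quick sanity check (not on the slides): one-dimensional Gaussian shifts saturate this log-Sobolev inequality. For $F \sim \mathcal{N}(m, 1)$ the density ratio is $h(x) = e^{mx - m^2/2}$, so that

```latex
% F ~ N(m,1), N ~ N(0,1), h(x) = exp(m x - m^2/2), hence \nabla \log h \equiv m:
H(F \,\|\, N) = E\big[ m F - \tfrac{m^2}{2} \big] = \frac{m^2}{2},
\qquad
I(F \,\|\, N) = E\big[ |\nabla \log h(F)|^2 \big] = m^2,
% so H = I/2: equality in the log-Sobolev inequality.
```

showing that the constant $1/2$ cannot be improved.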

2-WASSERSTEIN DISTANCE

Definition. The 2-Wasserstein distance between the laws of $F$ and $N$ is given by
$$W(F, N) = \inf \big( E\|X - Y\|_{\mathbb{R}^d}^2 \big)^{1/2},$$
where the infimum runs over all pairs $(X, Y)$ such that $X \stackrel{\mathrm{law}}{=} F$ and $Y \stackrel{\mathrm{law}}{=} N$.

SOME CLASSIC RELATIONS

The famous Talagrand transportation inequality (Talagrand, 1996) states that
$$W(F, N) \leq \sqrt{2\, H(F \,\|\, N)}.$$

In Otto-Villani (2001), it is proved that
$$W(F, N) \leq \int_0^\infty \sqrt{I(F_t \,\|\, N)}\, dt.$$

Finally, in Otto-Villani (2001) one can find the well-known HWI inequality
$$H(F \,\|\, N) \leq W(F, N)\, \sqrt{I(F \,\|\, N)} - \frac{1}{2}\, W^2(F, N).$$
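
For concreteness (again not on the slides), the same Gaussian shift $F \sim \mathcal{N}(m, 1)$ saturates Talagrand's inequality: the coupling $Y = X - m$ is optimal, and $H(F \,\|\, N) = m^2/2$ for this shift, so

```latex
W(F, N) = \inf \big( E|X - Y|^2 \big)^{1/2} = |m|
        = \sqrt{2 \cdot \tfrac{m^2}{2}}
        = \sqrt{2\, H(F \,\|\, N)},
% i.e. equality in Talagrand's inequality W <= sqrt(2H).
```

so the constant $2$ is sharp as well.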

THE ROLE OF STEIN DISCREPANCY

Proposition (Ledoux, Nourdin, Peccati (2015)). Assume $F$ admits a Stein kernel $\tau_h$: $E[Fg(F)] = E[\tau_h(F)\, \nabla g(F)]$ ($g$ smooth). Then,
$$I(F_t \,\|\, N) \leq \frac{e^{-4t}}{1 - e^{-2t}}\, S^2(F \,\|\, N), \quad t > 0.$$

Proof: writing $p_t$ for the density of $F_t$, the estimate follows from the relation
$$\nabla \log p_t(F_t) - \nabla \log \phi(F_t) = \frac{e^{-2t}}{\sqrt{1 - e^{-2t}}}\, E\big[ (\mathrm{Id} - \tau_h(F))\, N \,\big|\, F_t \big],$$
which can be easily verified by a direct computation.

THE HSI INEQUALITY

The following result is proved in Ledoux, Nourdin and Peccati (2015).

Theorem (HSI inequality). One has that:
$$H(F \,\|\, N) \leq \frac{S^2(F \,\|\, N)}{2}\, \log\!\Big( 1 + \frac{I(F \,\|\, N)}{S^2(F \,\|\, N)} \Big).$$

Remark: since (for $x, y > 0$) $x \log(1 + y/x) \leq y$, this estimate (strictly) improves the log-Sobolev inequality $2H \leq I$.
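
The size of the improvement is easy to see numerically (an illustration of mine, with arbitrary sample values of $S^2$ and $I$): when the Stein discrepancy is small relative to the Fisher information, the HSI right-hand side is far below the log-Sobolev one.

```python
import math

# Compare the HSI bound (S^2/2) log(1 + I/S^2) with the log-Sobolev bound I/2
# for a few (hypothetical) values of S^2 and I.
for S2, I in [(0.01, 100.0), (0.1, 10.0), (1.0, 1.0)]:
    hsi = 0.5 * S2 * math.log(1.0 + I / S2)
    lsi = 0.5 * I
    assert hsi <= lsi     # HSI always improves on log-Sobolev
    print(S2, I, hsi, lsi)
```

For $S^2 = 0.01$, $I = 100$, the HSI bound is about $0.046$ against a log-Sobolev bound of $50$: three orders of magnitude smaller.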

REMARKS

The proof follows from the estimate
$$H(F \,\|\, N) = \int_0^\infty I(F_t \,\|\, N)\, dt = \int_0^x I(F_t \,\|\, N)\, dt + \int_x^\infty I(F_t \,\|\, N)\, dt \leq I(F \,\|\, N) \int_0^x e^{-2t}\, dt + S^2(F \,\|\, N) \int_x^\infty \frac{e^{-4t}}{1 - e^{-2t}}\, dt,$$
and then by optimising in $x$.

It is easy to cook up examples where the RHS of HSI converges to zero, while the RHS of HWI (and therefore of log-Sobolev) explodes to infinity. We were not able to construct examples going in the other direction.

THE HWS INEQUALITY

A suitable modification of the above approach gives the following bound (Ledoux, Nourdin, Peccati, 2015).

Theorem (HWS inequality). Under the above notation,
$$W(F, N) \leq S(F \,\|\, N)\, \arccos\!\Big( e^{-\frac{H(F \,\|\, N)}{S^2(F \,\|\, N)}} \Big).$$

Remark: this estimate improves Talagrand's transportation inequality $W \leq \sqrt{2H}$. The reason is that $\arccos(e^{-x}) \leq \sqrt{2x}$ for every $x \geq 0$.

REMARKS

Although we have the hierarchy HSI, HWI $\Rightarrow$ log-Sobolev $\Rightarrow$ Talagrand, the relations between HSI and HWI are far from being clear.

More to the point, when trying to deduce a HWSI inequality by a similar optimisation procedure, one finds that the minimum is either HWI or HSI. This phenomenon is difficult to interpret at the moment.

SOME APPLICATIONS & EXTENSIONS

Reinforced convergence to equilibrium:
$$H(F_t \,\|\, N) \leq \frac{e^{-4t}}{1 - e^{-2t}}\, S^2(F \,\|\, N)$$
(compare with the estimate $H(F_t \,\|\, N) \leq e^{-2t}\, H(F \,\|\, N)$).

Concentration via $L^p$ norms of Lipschitz functions:
$$E[|u(F)|^p]^{1/p} \leq C\big[ S_p + \sqrt{p} + \sqrt{p}\, S_p \big], \quad \text{where } S_p := \big( E\|\mathrm{Id} - \tau_h\|_{HS}^p \big)^{1/p}.$$

Also, $W_p(F, N) \leq C_p\, S_p$.

EXTENSIONS

HSI can be suitably extended to the case where the target distribution $\mu$ is the invariant measure of a symmetric Markov semigroup $\{P_t\}$ with generator
$$Lf = \langle a, \mathrm{Hess}\, f \rangle_{HS} + \langle b, \nabla f \rangle.$$

In this case, we require the Stein kernel $\tau_\nu$ to verify the relation
$$\int \langle b, \nabla f \rangle\, d\nu + \int \langle \tau_\nu, \mathrm{Hess}\, f \rangle_{HS}\, d\nu = 0,$$
and therefore
$$S^2(\nu \,\|\, \mu) = \int \big\| a^{-1/2}\, \tau_\nu\, a^{-1/2} - \mathrm{Id} \big\|_{HS}^2\, d\nu.$$

The conditions for HSI to hold involve iterated gradients $\Gamma_n$ of orders $n = 1, 2, 3$.

THE PROBLEM WITH FISHER INFORMATION

Recall that we are interested in bounding $H(F \,\|\, N)$, when $F = (f_1(G), \ldots, f_d(G))$ and the $f_i$'s are non-linear transforms of the Gaussian field $G$. The problem is that, in such a general framework, there is actually no available technique for properly bounding the relative Fisher information $I(F \,\|\, N)$ without further heavy constraints on $F$.

This means that one has to be much more careful in studying the term
$$\int_0^x I(F_t \,\|\, N)\, dt,$$
by suitably bounding $I(F_t \,\|\, N)$ in terms of the total variation distance. This can be done by exploiting an important inequality (due to Carbery and Wright, 2001), yielding bounds on the small-ball probabilities of polynomial random variables. See Nourdin and Poly (2012).

ENTROPIC CLTS ON WIENER SPACE

Theorem (Entropic 4th moment theorem, Nourdin, Peccati and Swan (2014)). Let $F_n = (F_{1,n}, \ldots, F_{d,n})$, $n \geq 1$, be a chaotic sequence such that $F_n$ converges in distribution to $N = (N_1, \ldots, N_d) \sim \mathcal{N}(0, C)$, where $C > 0$. Set
$$\Delta_n := E[\|F_n\|_{\mathbb{R}^d}^4] - E[\|N\|_{\mathbb{R}^d}^4] > 0, \quad n \geq 1.$$
Then, as $n \to \infty$,
$$H(F_n \,\|\, N) = O(1)\, \Delta_n\, |\log \Delta_n|.$$

REMARKS

Of course, by Pinsker's inequality, the above statement translates into a bound on the total variation distance.

In Ledoux, Nourdin and Peccati (2015): extensions to random vectors living in the eigenspaces of the generator $L$ of a Markov semigroup $\{P_t\}$, whenever some form of the Carbery-Wright inequality is available (in some log-concave cases, for instance).

AN EXPLICIT EXAMPLE

Consider the sequence
$$V_n := \frac{1}{\sqrt{n}} \Big( \sum_{k=1}^n (X_k^2 - 1),\; \sum_{k=1}^n (X_k^3 - 3X_k) \Big), \quad n \geq 1,$$
where the Gaussian sequence $\{X_k\}$ has autocorrelation $\varrho(k) \sim k^{-\alpha}$ for some $\alpha > 2/3$ (this corresponds to the increments of a fractional Brownian motion of Hurst index $H < 2/3$). Let $N \sim \mathcal{N}(0, I_2)$. Then, for a suitable constant $\sigma > 0$,
$$H(\sigma^{-1} V_n \,\|\, N),\; \mathrm{TV}(\sigma^{-1} V_n, N)^2 = O(1)\, \frac{\log n}{n}.$$
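
The two coordinates of $V_n$ are built from the Hermite polynomials $H_2(x) = x^2 - 1$ and $H_3(x) = x^3 - 3x$. In the simplest i.i.d. case ($\varrho \equiv 0$; a degenerate instance I use only for illustration, since correlations only change the constants), the normalisation $\sigma$ can be read off from single-variable moments: $\mathrm{Var}\, H_2(X) = 2! = 2$, $\mathrm{Var}\, H_3(X) = 3! = 6$, and the two are uncorrelated. A Monte Carlo check:

```python
import math
import random

random.seed(1)
n = 300_000
# Moments of H_2(X) = X^2 - 1 and H_3(X) = X^3 - 3X for X ~ N(0, 1):
# E[H_2^2] = 2, E[H_3^2] = 6, E[H_2 H_3] = 0 (odd moment),
# so in the i.i.d. case the limiting covariance of V_n is diag(2, 6).
v2 = v3 = cross = 0.0
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    h2, h3 = x * x - 1.0, x ** 3 - 3.0 * x
    v2 += h2 * h2
    v3 += h3 * h3
    cross += h2 * h3
v2_hat, v3_hat, cross_hat = v2 / n, v3 / n, cross / n
print(v2_hat, v3_hat, cross_hat)
```

The printed estimates are close to $(2, 6, 0)$, consistent with the factorial-moment formula $E[H_q(X)^2] = q!$.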

ANOTHER EXAMPLE

Let $C \subset \mathbb{R}^d$ be a closed convex cone, $g$ a $d$-dimensional standard Gaussian vector, and $\Pi_C(g)$ the metric projection of $g$ onto $C$.

Question (Amelunxen, Lotz, McCoy and Tropp, 2013; compressed sensing): how far is the distribution of $\|\Pi_C(g)\|^2$ from that of a Gaussian random variable $N$?

ANOTHER EXAMPLE

The HSI inequality yields (under non-degeneracy): for large $d$,
$$\mathrm{Ent}(N) - \mathrm{Ent}\big( \|\Pi_C(g)\|^2 \big) \lesssim \frac{\log \delta}{\delta},$$
where $\delta := E\big[ \|\Pi_C(g)\|^2 \big]$ is the statistical dimension of the cone.

See Ivan's talk on Thursday! (Based on a collaboration with Larry Goldstein.)

THANK YOU FOR YOUR ATTENTION!