OXFORD UNIVERSITY PRESS


Concentration Inequalities: A Nonasymptotic Theory of Independence

STÉPHANE BOUCHERON, GÁBOR LUGOSI, PASCAL MASSART

OXFORD UNIVERSITY PRESS

CONTENTS

1 Introduction 1
1.1 Sums of Independent Random Variables and the Martingale Method 2
1.2 The Concentration-of-Measure Phenomenon 4
1.3 The Entropy Method 9
1.4 The Transportation Method 12
1.5 Reading Guide 13
1.6 Acknowledgments 16

2 Basic Inequalities 18
2.1 From Moments to Tails 19
2.2 The Cramér-Chernoff Method 21
2.3 Sub-Gaussian Random Variables 24
2.4 Sub-Gamma Random Variables 27
2.5 A Maximal Inequality 31
2.6 Hoeffding's Inequality 34
2.7 Bennett's Inequality 35
2.8 Bernstein's Inequality 36
2.9 Random Projections and the Johnson-Lindenstrauss Lemma 39
2.10 Association Inequalities 43
2.11 Minkowski's Inequality 44
2.12 Bibliographical Remarks 45
2.13 Exercises 46

3 Bounding the Variance 52
3.1 The Efron-Stein Inequality 53
3.2 Functions with Bounded Differences 56
3.3 Self-Bounding Functions 60
3.4 More Examples and Applications 63
3.5 A Convex Poincaré Inequality 66
3.6 Exponential Tail Bounds via the Efron-Stein Inequality 68
3.7 The Gaussian Poincaré Inequality 72
3.8 A Proof of the Efron-Stein Inequality Based on Duality 73
3.9 Bibliographical Remarks 76
3.10 Exercises 78

4 Basic Information Inequalities 83
4.1 Shannon Entropy and Relative Entropy 84
4.2 Entropy on Product Spaces and the Chain Rule 85
4.3 Han's Inequality 86
4.4 Edge Isoperimetric Inequality on the Binary Hypercube 87
4.5 Combinatorial Entropies 89
4.6 Han's Inequality for Relative Entropies 91
4.7 Sub-Additivity of the Entropy 93
4.8 Entropy of General Random Variables 96
4.9 Duality and Variational Formulas 97
4.10 A Transportation Lemma 101
4.11 Pinsker's Inequality 102
4.12 Birgé's Inequality 103
4.13 Sub-Additivity of Entropy: The General Case 105
4.14 The Brunn-Minkowski Inequality 107
4.15 Bibliographical Remarks 110
4.16 Exercises 112

5 Logarithmic Sobolev Inequalities 117
5.1 Symmetric Bernoulli Distributions 118
5.2 Herbst's Argument: Concentration on the Hypercube 121
5.3 A Gaussian Logarithmic Sobolev Inequality 124
5.4 Gaussian Concentration: The Tsirelson-Ibragimov-Sudakov Inequality 125
5.5 A Concentration Inequality for Suprema of Gaussian Processes 127
5.6 Gaussian Random Projections 128
5.7 A Performance Bound for the Lasso 132
5.8 Hypercontractivity: The Bonami-Beckner Inequality 139
5.9 Gaussian Hypercontractivity 146
5.10 The Largest Eigenvalue of Random Matrices 147
5.11 Bibliographical Remarks 152
5.12 Exercises 154

6 The Entropy Method 168
6.1 The Bounded Differences Inequality 170
6.2 More on Bounded Differences 174
6.3 Modified Logarithmic Sobolev Inequalities 175
6.4 Beyond Bounded Differences 176
6.5 Inequalities for the Lower Tail 178
6.6 Concentration of Convex Lipschitz Functions 180
6.7 Exponential Inequalities for Self-Bounding Functions 181
6.8 Symmetrized Modified Logarithmic Sobolev Inequalities 184
6.9 Exponential Efron-Stein Inequalities 185
6.10 A Modified Logarithmic Sobolev Inequality for the Poisson Distribution 188
6.11 Weakly Self-Bounding Functions 189
6.12 Proof of Lemma 6.22 196
6.13 Some Variations 199
6.14 Janson's Inequality 204
6.15 Bibliographical Remarks 207
6.16 Exercises 209

7 Concentration and Isoperimetry 215
7.1 Lévy's Inequalities 215
7.2 The Classical Isoperimetric Theorem 218
7.3 Vertex Isoperimetric Inequality in the Hypercube 222
7.4 Convex Distance Inequality 224
7.5 Convex Lipschitz Functions Revisited 229
7.6 Bin Packing 230
7.7 Bibliographical Remarks 232
7.8 Exercises 233

8 The Transportation Method 237
8.1 The Bounded Differences Inequality Revisited 239
8.2 Bounded Differences in Quadratic Mean 241
8.3 Applications of Marton's Conditional Transportation Inequality 247
8.4 The Convex Distance Inequality Revisited 249
8.5 Talagrand's Gaussian Transportation Inequality 251
8.6 Appendix: A General Induction Lemma 256
8.7 Bibliographical Remarks 259
8.8 Exercises 260

9 Influences and Threshold Phenomena 262
9.1 Influences 263
9.2 Some Fundamental Inequalities for Influences 264
9.3 Local Concentration 271
9.4 Discrete Fourier Analysis and a Variance Inequality 273
9.5 Monotone Sets 277
9.6 Threshold Phenomena 279
9.7 Bibliographical Remarks 286
9.8 Exercises 287

10 Isoperimetry on the Hypercube and Gaussian Spaces 290
10.1 Bobkov's Inequality for Functions on the Hypercube 291
10.2 An Isoperimetric Inequality on the Binary Hypercube 297
10.3 Asymmetric Bernoulli Distributions and Threshold Phenomena 298
10.4 The Gaussian Isoperimetric Theorem 303
10.5 Lipschitz Functions of Gaussian Random Variables 307
10.6 Bibliographical Remarks 308
10.7 Exercises 309

11 The Variance of Suprema of Empirical Processes 312
11.1 General Upper Bounds for the Variance 315
11.2 Nemirovski's Inequality 317
11.3 The Symmetrization and Contraction Principles 322
11.4 Weak and Wimpy Variances 327
11.5 Unbounded Summands 330
11.6 Bibliographical Remarks 335
11.7 Exercises 336

12 Suprema of Empirical Processes: Exponential Inequalities 341
12.1 An Extension of Hoeffding's Inequality 342
12.2 A Bernstein-Type Inequality for Bounded Processes 342
12.3 A Symmetrization Argument 344
12.4 Bousquet's Inequality for Suprema of Empirical Processes 347
12.5 Non-Identically Distributed Summands and Left-Tail Inequalities 351
12.6 Chi-Square Statistics and Quadratic Forms 353
12.7 Bibliographical Remarks 354
12.8 Exercises 355

13 The Expected Value of Suprema of Empirical Processes 362
13.1 Classical Chaining 363
13.2 Lower Bounds for Gaussian Processes 366
13.3 Chaining and VC-Classes 371
13.4 Gaussian and Rademacher Averages of Symmetric Matrices 374
13.5 Variations of Nemirovski's Inequality 377
13.6 Random Projections of Sparse and Large Sets 379
13.7 Normalized Processes: Slicing and Reweighting 387
13.8 Relative Deviations for L_2 Distances 391
13.9 Risk Bounds in Classification 392
13.10 Bibliographical Remarks 395
13.11 Exercises 397

14 Φ-Entropies 412
14.1 Φ-Entropy and its Sub-Additivity 412
14.2 From Φ-Entropies to Φ-Sobolev Inequalities 419
14.3 Φ-Sobolev Inequalities for Bernoulli Random Variables 423
14.4 Bibliographical Remarks 427
14.5 Exercises 428

15 Moment Inequalities 430
15.1 Generalized Efron-Stein Inequalities 431
15.2 Moments of Functions of Independent Random Variables 432
15.3 Some Variants and Corollaries 436
15.4 Sums of Random Variables 440
15.5 Suprema of Empirical Processes 443
15.6 Conditional Rademacher Averages 446
15.7 Bibliographical Remarks 447
15.8 Exercises 449

References 451
Author Index 473
Subject Index 477