Extrema of log-correlated random variables: Principles and Examples


Extrema of log-correlated random variables: Principles and Examples
Louis-Pierre Arguin, Université de Montréal & City University of New York
Introductory School, IHP Trimester. CIRM, January 5-9 2014

Acknowledgements

Thank you very much to the organizers for the invitation! Much of what I know on the topic I learned from my collaborators: Anton Bovier, Nicola Kistler, Olivier Zindy, David Belius; and my students: Samuel April, Jean-Sébastien Turcotte and Frédéric Ouimet. I am grateful for all the discussions and insights on the subject.

There are many outstanding papers on the subject; I will not be able to reference everybody on the slides. See my webpage http://www.dms.umontreal.ca/~arguinlp/recherche.html for the slides and detailed complementary references.

What is the Statistics of Extremes?

The statistics of extremes, or extreme value theory, in probability deals with questions about the maxima of a collection of random variables. Consider $N$ random variables $(X_i, i = 1, \dots, N)$ on a probability space $(\Omega, \mathcal{F}, P)$. In the limit $N \to \infty$:

- What can be said about the r.v. $\max_{i=1,\dots,N} X_i$? (Law of the maximum)
- What can be said about the joint law of the reordered collection $X_{(1)} \ge X_{(2)} \ge X_{(3)} \ge \dots$? (Order Statistics or Extremal Process)

In this mini-course, we will mostly focus on the law of the maximum.

Statistics of Extremes

To keep in mind: $\max_{i=1,\dots,N} X_i$ is a functional of the process $(X_i, i = 1, \dots, N)$, like the sum. Our objectives are similar in spirit to the limit theorems for a sum of random variables $\sum_{i=1}^N X_i$ in the limit $N \to \infty$:

- Order of magnitude of the maximum ~ Law of Large Numbers
- Fluctuation of the maximum ~ Central Limit Theorem

Ultimately, we want to answer the following question:

Problem. Find $a_N$ and $b_N$ such that
$$\frac{\max_{i \le N} X_i - a_N}{b_N} \quad \text{converges in law in the limit } N \to \infty.$$
Identify the limit.

Statistics of Extremes: A brief history

Earlier works on the theory of extreme values focused on the case where $(X_i, i \le N)$ are IID or weakly correlated r.v. There, we have a complete answer to the question: there are only three possible limit laws, Fréchet, Weibull or Gumbel.

- 1925: Tippett studies the largest values from samples of Gaussians.
- 1927: Fréchet studies distributions other than Gaussian and obtains the Fréchet limit law.
- 1928: Fisher & Tippett find the two other limit laws.
- 1936: von Mises finds sufficient conditions for convergence to the 3 classes.
- 1943: Gnedenko finds necessary and sufficient conditions.
- 1958: Gumbel writes the first book, Statistics of Extremes.

[Figure: portrait of Gumbel, 1891-1966]

Statistics of Extremes: Motivation

There are applications of the theory of extreme values for IID r.v. in meteorology (floods, droughts, etc.). One goal of today's probability theory: find other classes of limit laws for the maximum when the r.v.'s are STRONGLY CORRELATED.

What are the motivations to look at strongly correlated r.v.?

- Finance: evidence of slowly-decaying correlations for volatility.
- Physics: the behavior of systems in statistical physics is determined by the states of lowest energies, and states are often correlated through the environment. Ex: spin glasses, polymers, growth models (KPZ), random matrices.
- Mathematics: the distribution of the prime numbers seems to exhibit features of strongly correlated r.v. (Lecture 3).

Of course, there are many correlation structures that can be studied. We will focus on one class: LOG-CORRELATED models.

Outline

Lecture 1
1. Warm-up: Extrema of IID r.v.
2. Log-correlated Gaussian fields (LGF): Branching Random Walk (BRW) and 2D Gaussian Free Field (2DGFF)
3. Three fundamental properties
4. First order of the maximum (~ LLN)

Lecture 2
Intermezzo: Relations to statistical physics
5. Second order of the maximum (~ refined LLN)
6. A word on Convergence and Order Statistics

Lecture 3: Universality Class of LGF
7. The maxima of the characteristic polynomial of unitary matrices
8. The maxima of the Riemann zeta function

General Setup

When we are dealing with correlations, it is convenient to index the r.v.'s by points in a metric space, say $V_n$ with metric $d$. Choice of parametrization: $(X_n(v), v \in V_n)$ with

- $\#V_n = 2^n$
- $E[X_n(v)] = 0$ for all $v \in V_n$
- $E[X_n(v)^2] = \sigma^2 n$

For simplicity, assume that $(X_n(v), v \in V_n)$ is a Gaussian process. Technical advantages: the covariance encodes the law, and comparison arguments (Slepian's Lemma) may simplify some proofs. The principles that we will discuss hold (or are expected to hold) in general.

1. Warm-up: The maximum of IID variables

Consider $(X_i, i = 1, \dots, 2^n)$ IID Gaussians of variance $\sigma^2 n$. In this case it is easy to find $a_n$ and $b_n$ such that
$$P\Big(\max_i \frac{X_i - a_n}{b_n} \le x\Big) \quad \text{converges.}$$
Note that $a_n$ and $b_n$ are defined up to constants, additive and multiplicative respectively. We obviously have
$$P\Big(\max_i \frac{X_i - a_n}{b_n} \le x\Big) = \big(P(X_1 \le b_n x + a_n)\big)^{2^n} = \big(1 - P(X_1 > b_n x + a_n)\big)^{2^n}.$$
We need to establish the convergence of $2^n P(X_1 > b_n x + a_n)$. This is more refined than a large deviation estimate.

1. Warm-up: Extrema of IID variables

Proposition. Consider $(X_i, i = 1, \dots, 2^n)$ IID Gaussians of variance $\sigma^2 n$. Then for
$$a_n = cn - \frac{\sigma^2}{2c} \log n, \qquad c = \sqrt{2 \log 2}\,\sigma,$$
we have
$$P\big(\max_i X_i - a_n \le x\big) \longrightarrow \exp(-e^{-cx}) \quad \text{(Gumbel distribution)}.$$
In other words,
$$\max_{i \le 2^n} X_i = \underbrace{cn - \frac{\sigma^2}{2c} \log n}_{\text{Deterministic Order}} + \underbrace{G}_{\text{Fluctuation}}$$
We refer to $cn$ as the first order of the maximum, and to $-\frac{\sigma^2}{2c} \log n$ as the second order of the maximum.

Our goal: establish a general method to prove similar results for log-correlated fields.
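As a quick numerical illustration of the Proposition (my own sketch, not part of the slides; the parameters are arbitrary), one can simulate the maximum of $2^n$ IID Gaussians and compare it with the predicted centering $a_n$:

```python
# Sketch: maxima of 2^n IID N(0, sigma^2 n) variables vs. the predicted
# centering a_n = c n - (sigma^2 / 2c) log n, with c = sqrt(2 log 2) sigma.
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0
c = np.sqrt(2 * np.log(2)) * sigma

for n in (10, 14, 18):
    X = rng.normal(0.0, sigma * np.sqrt(n), size=2 ** n)
    a_n = c * n - (sigma ** 2 / (2 * c)) * np.log(n)
    print(f"n={n:2d}  max={X.max():7.3f}  a_n={a_n:7.3f}  diff={X.max() - a_n:+6.3f}")
# The difference max - a_n stays of order one, while max / n approaches c.
```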

2. Log-correlated Gaussian Fields

[Figure: a realization of a 2D log-correlated Gaussian field on $V_n$, with two points $v, v'$ marked.]

Log-correlated Gaussian fields

A Gaussian field $(X_n(v), v \in V_n)$ is log-correlated if the covariance decays slowly with the distance:
$$E[X_n(v) X_n(v')] \approx -\log \frac{d(v, v')}{2^n}$$
This is to be compared with $d(v,v')^{-\alpha}$ or $e^{-d(v,v')}$. It implies that there is an exponential number of points whose correlation with $v$ is of the order of the variance. Precisely, for $0 < r < 1$ and a given $v \in V_n$,
$$\#\big\{v' \in V_n : E[X_n(v) X_n(v')] \ge r\, E[X_n(v)^2]\big\} \approx 2^{(1-r)n}$$
The correlations do not have to be exactly logarithmic. Approximate or asymptotic log-correlation is enough.

Example 1: Branching Random Walk

$V_n$: leaves of a binary tree of depth $n$. Let $(Y_l)$ be IID $\mathcal{N}(0, \sigma^2)$ indexed by the edges, and set
$$X_n(v) = \sum_{l=1}^{n} Y_l(v),$$
where $Y_1(v), \dots, Y_n(v)$ are the edge variables on the path from the root to the leaf $v$.

Variance: $E[X_n(v)^2] = \sum_{l=1}^{n} E[Y_l(v)^2] = \sigma^2 n$

Covariance: writing $v \wedge v'$ for the level of the tree at which the paths to $v$ and $v'$ branch,
$$E[X_n(v) X_n(v')] = \sum_{l=1}^{v \wedge v'} E[Y_l(v)^2] = \sigma^2 (v \wedge v')$$

For any $0 \le r \le 1$,
$$\#\big\{v' \in V_n : E[X_n(v) X_n(v')] \ge r\, E[X_n(v)^2]\big\} \approx 2^{(1-r)n}$$

[Figure: binary tree with edge variables $Y_1(v), Y_2(v), Y_3(v)$ and the branching point $v \wedge v'$ of two leaves $v, v' \in V_n$.]
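A short simulation (my own illustration, not part of the slides) makes the covariance structure concrete: sampling the BRW level by level and estimating $E[X_n(v) X_n(v')]$ for two leaves recovers $\sigma^2 (v \wedge v')$:

```python
# Sketch: sample a binary branching random walk of depth n and check
# E[X_n(v) X_n(v')] = sigma^2 (v ^ v') for two chosen leaves.
import numpy as np

rng = np.random.default_rng(1)
sigma, n, trials = 1.0, 10, 5000

def sample_brw():
    # X starts at the root and doubles at each level; each child adds its own
    # independent edge variable Y_l.  Leaf index in binary encodes the path.
    X = np.zeros(1)
    for _ in range(n):
        X = np.repeat(X, 2) + rng.normal(0.0, sigma, size=2 * X.size)
    return X

samples = np.stack([sample_brw() for _ in range(trials)])
v, w = 0, 3                       # leaves 0 and 3 branch at level v ^ w = n - 2
emp = np.mean(samples[:, v] * samples[:, w])
print(emp, sigma ** 2 * (n - 2))  # empirical covariance vs sigma^2 (v ^ w)
```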

Example 2: 2D Gaussian Free Field

$V_n$: square box in $\mathbb{Z}^2$ with $2^n$ points. $(X_n(v), v \in V_n)$ is the Gaussian field with
$$E[X_n(v) X_n(v')] = E_v\Big[\sum_{k=0}^{\tau_{V_n}} 1_{\{S_k = v'\}}\Big],$$
where $(S_k)_{k \ge 0}$ is the SRW starting at $v$ and $\tau_{V_n}$ is its exit time from $V_n$.

Log-correlations: for $v, v' \in V_n$ far from the boundary,
$$E[X_n(v)^2] = \sigma^2 n + O(1), \qquad E[X_n(v) X_n(v')] = \frac{1}{\pi} \log \frac{2^n}{\|v - v'\|^2} + O(1),$$
where $\sigma^2 = \frac{\log 2}{\pi}$.

[Figure: a sample of the 2DGFF on a 20 x 20 box.]
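The Green's function representation above can be checked numerically. The sketch below (my own, with an arbitrary box size) computes $G(v,v') = E_v[\sum_{k=0}^{\tau} 1_{\{S_k = v'\}}]$ as $(I - P)^{-1}$ for the SRW killed at the boundary of an $m \times m$ box, and compares it with $\sigma^2 n$ and $\frac{1}{\pi} \log \frac{2^n}{\|v-v'\|^2}$; the match is only up to the $O(1)$ corrections in the slide.

```python
# Sketch: Green's function of SRW killed at the boundary of an m x m box.
import numpy as np

m = 41                                   # interior side; the box has 2^n ~ m^2 points
idx = {(i, j): i * m + j for i in range(m) for j in range(m)}
P = np.zeros((m * m, m * m))
for (i, j), k in idx.items():
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if (i + di, j + dj) in idx:      # steps leaving the box are killed
            P[k, idx[(i + di, j + dj)]] = 0.25

G = np.linalg.inv(np.eye(m * m) - P)     # G(v,v') = expected visits to v' before exit
n = np.log2(m * m)
c0 = idx[(m // 2, m // 2)]
c4 = idx[(m // 2 + 4, m // 2)]           # a point at distance 4 from the center
print(G[c0, c0], (np.log(2) / np.pi) * n)        # ~ sigma^2 n, up to O(1)
print(G[c0, c4], np.log(m * m / 16) / np.pi)     # ~ (1/pi) log(2^n/|v-v'|^2), up to O(1)
```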

3. Fundamental Properties

There are three fundamental properties of log-correlated random variables. They are well illustrated by the case of branching random walk.

1. Multiscale decomposition:
$$X_n(v) = \sum_{l=1}^{n} Y_l(v)$$
Define $X_k(v) = \sum_{l=1}^{k} Y_l(v)$, $1 \le k \le n$.

2. Self-similarity of scales: for a given $v$, $(X_n(v') - X_l(v'),\ v' : v \wedge v' \ge l)$ is a BRW on the subtree $\{v' : v \wedge v' \ge l\}$ of $2^{n-l}$ points.

3. Dichotomy of scales:
$$E[Y_l(v) Y_l(v')] = \begin{cases} \sigma^2 & \text{if } l \le v \wedge v' \\ 0 & \text{if } l > v \wedge v' \end{cases}$$

[Figure: binary tree with edge variables $Y_1(v), Y_2(v), Y_3(v)$ and branching point $v \wedge v'$ of $v, v' \in V_n$.]

Fundamental Properties

We now verify the properties for the 2DGFF $(X_n(v), v \in V_n)$.

Reminder: it is good to see $(X_n(v), v \in V_n)$ as vectors in a Gaussian Hilbert space:

- $E[X_n(v)^2]$ is the square norm of the vector;
- $E[X_n(v) X_n(v')]$ is the inner product.

For $B \subset V_n$, the conditional expectation of $X_n(v)$ given $\{X_n(v'), v' \in B\}$,
$$E[X_n(v) \mid \{X_n(v'), v' \in B\}] = \sum_{v' \in B} a_{vv'} X_n(v'),$$
is the projection on the subspace spanned by $X_n(v'), v' \in B$. In particular, it is a linear combination of the $X_n(v')$, hence also Gaussian.

Orthogonal decomposition:
$$X_n(v) = \big(X_n(v) - E[X_n(v) \mid \{X_n(v'), v' \in B\}]\big) + E[X_n(v) \mid \{X_n(v'), v' \in B\}]$$
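The projection formula is easy to verify numerically. This small check (my own, on an arbitrary 4-dimensional Gaussian vector) computes the coefficients $a_{vv'}$ as $\Sigma_{vB} \Sigma_{BB}^{-1}$ and confirms that the residual in the orthogonal decomposition is uncorrelated with the conditioning variables:

```python
# Sketch: conditional expectation of a Gaussian vector as a linear projection.
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))
Sigma = A @ A.T                          # a generic covariance matrix
L = np.linalg.cholesky(Sigma)

v, B = 0, [1, 2, 3]                      # condition X_v on the coordinates in B
coef = Sigma[v, B] @ np.linalg.inv(Sigma[np.ix_(B, B)])   # the a_{vv'} above

X = L @ rng.normal(size=(4, 200000))     # samples of the Gaussian vector
resid = X[v] - coef @ X[B]               # X_v minus its projection on span(X_B)
for b in B:                              # residual is uncorrelated with each X_b
    print(np.cov(resid, X[b])[0, 1])     # all ~ 0
```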

Fundamental Properties 1: Multiscale decomposition

Consider $B_l(v)$, a ball around $v$ containing $2^{n-l}$ points. Define
$$\mathcal{F}_l = \sigma\{X_n(v') : v' \notin B_l(v)\}, \qquad X_l(v) = E[X_n(v) \mid \mathcal{F}_l], \quad l < n.$$
Then $(X_l(v), l \le n)$ is a martingale.

Lemma (Multiscale). The increments $Y_l(v) = X_l(v) - X_{l-1}(v)$, $l = 1, \dots, n$, are independent Gaussians, and $X_n(v) = \sum_{l=1}^{n} Y_l(v)$.

[Figure: the box $V_n$ with the neighborhood $B_l(v)$ of $2^{n-l}$ points.]

Fundamental Properties 2: Self-Similarity

Lemma (Self-Similarity). For a given $v$, $(X_n(v') - X_l(v'),\ v' : v \wedge v' \ge l)$ has the original law on the $2^{n-l}$ points $\{v' : v \wedge v' \ge l\}$.

If $B \subset V_n$, write $X_B(v) = E[X_n(v) \mid \{X_n(v'), v' \notin B\}]$. Then $(X_n(v) - X_B(v),\ v \in B)$ is a GFF on $B$. In our case, $B$ is the neighborhood $B_l(v)$ containing $2^{n-l}$ points:
$$E[(X_n(v) - X_l(v))^2] = \sigma^2 (n - l) + O(1)$$
The $Y_l$'s have variance $\sigma^2 (1 + o(1))$: linearity of the scales!

Warning! If $v' \in B_l(v)$, it is not true that $X_l(v') = X_l(v)$ (as in BRW)... but close!

[Figure: the box $V_n$ with the neighborhood $B_l(v)$ of $2^{n-l}$ points.]

Fundamental Properties 3: Dichotomy

$$E[Y_l(v) Y_l(v')] = \begin{cases} \sigma^2 & \text{if } l \le v \wedge v' \\ 0 & \text{if } l > v \wedge v' \end{cases}$$

Define $v \wedge v' :=$ the greatest $l$ such that $B_l(v) \cap B_l(v') \neq \emptyset$.

Lemma (Gibbs-Markov Property). For $B \subset V_n$,
$$E[X_n(v) \mid \{X_n(v'), v' \notin B\}] = \sum_{u \in \partial B} p_u(v)\, X_n(u)$$
This implies that $X_n(v) - X_l(v) = \sum_{k=l+1}^{n} Y_k(v)$ is independent of $Y_l(v')$ for all $l$ such that $v \wedge v' < l$. The decoupling is not exact at the branching scale, but it holds for larger scales soon after.

[Figure: the box $V_n$ with the points $v, v'$ and their neighborhoods $B_l$.]

Fundamental Properties 3: Splitting

Lemma. For all $l$ such that $\|v - v'\|^2 < 2^{n-l}$ (the scale-$l$ neighborhoods touch),
$$E[(X_l(v) - X_l(v'))^2] = O(1)$$
This implies $E[X_l(v) X_l(v')] = E[X_l(v)^2] + O(1) = \sigma^2 l + O(1)$. Thus
$$E[Y_l(v) Y_l(v')] = \sigma^2 + o(1).$$

[Figure: the box $V_n$ with two points $v, v'$ whose scale-$l$ neighborhoods overlap.]

Lecture goals

For the remaining part of the lectures, our specific goals are to prove the deterministic orders of the maximum using the 3 properties.

Theorem.
1. First order: $\displaystyle \lim_{n \to \infty} \frac{\max_{v \in V_n} X_n(v)}{n} = \sqrt{2 \log 2}\,\sigma =: c$ in probability.
2. Second order: $\displaystyle \frac{\max_{v \in V_n} X_n(v) - cn}{\log n} \to -\frac{3}{2} \frac{\sigma^2}{c}$ in probability.

In other words, with large probability,
$$\max_{v \in V_n} X_n(v) = cn - \frac{3}{2} \frac{\sigma^2}{c} \log n + O(\varepsilon \log n).$$

Lectures 1 and 2: 2DGFF (BRW as a guide). Lecture 3: toy model of the Riemann zeta function.

4. The first order of the maximum

$$\lim_{n \to \infty} \frac{\max_{v \in V_n} X_n(v)}{n} = \sqrt{2 \log 2}\,\sigma$$

[Figure: tree of scales with increments $W_1(v), W_2(v), W_3(v), W_3(v')$ and branching scale $\frac{1}{K} n < v \wedge v' \le \frac{2}{K} n$.]

First order of the maximum

Let $(X_n(v), v \in V_n)$ be a Gaussian field with $\#V_n = 2^n$ and $E[X_n(v)^2] = \sigma^2 n$.

Theorem (First order of the maximum). If $(X_n(v), v \in V_n)$ satisfies the three properties (multiscale, self-similarity, splitting), we have
$$\lim_{n \to \infty} \frac{\max_{v \in V_n} X_n(v)}{n} = \underbrace{\sqrt{2 \log 2}\,\sigma}_{=c} \quad \text{in probability}$$

This was shown by Biggins '77 for the BRW, and by Bolthausen, Deuschel & Giacomin 2001 for the GFF. We follow here the general method of Kistler (2013).

1. Upper bound: $P\big(\max_{v \in V_n} X_n(v) > (c + \delta) n\big) \to 0$
2. Lower bound: $P\big(\max_{v \in V_n} X_n(v) > (c - \delta) n\big) \to 1$

[Figure: distribution of the maximum around $cn$, with the levels $(c - \delta)n$ and $(c + \delta)n$ marked.]
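Before the proof, a quick simulation of the BRW maximum (my own sketch; $n$ is small here, so the convergence is only rough) illustrates the statement $\max_v X_n(v)/n \to c$:

```python
# Sketch: max of a depth-n BRW over 2^n leaves, compared with c n.
import numpy as np

rng = np.random.default_rng(4)
sigma = 1.0
c = np.sqrt(2 * np.log(2)) * sigma
for n in (8, 12, 16, 20):
    X = np.zeros(1)
    for _ in range(n):                   # level-by-level sampling, as before
        X = np.repeat(X, 2) + rng.normal(0.0, sigma, size=2 * X.size)
    print(f"n={n:2d}  max/n={X.max() / n:.3f}  c={c:.3f}")
# max/n creeps up toward c; the gap is the -(3/2)(sigma^2/c)(log n)/n correction.
```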

Upper bound: Plain Markov

This is the easy part. Consider the number of exceedances of a level $a$:
$$N_n(a) = \#\{v \in V_n : X_n(v) > a\}$$
Clearly, by Markov's inequality (or a union bound),
$$P\big(\max_{v \in V_n} X_n(v) > a\big) = P\big(N_n(a) \ge 1\big) \le E[N_n(a)]$$
Note that correlations play no role here! By the Gaussian tail estimate, with $a = (c + \delta) n$,
$$E[N_n(a)] = 2^n P(X_n(v) > a) \le 2^n e^{-(c+\delta)^2 n / 2\sigma^2} \le e^{-\sqrt{2 \log 2}\,\delta n / \sigma},$$
which goes to zero exponentially fast as $n \to \infty$. The constant $c = \sqrt{2 \log 2}\,\sigma$ is designed to counterbalance the entropy: $2^n e^{-c^2 n / 2\sigma^2} = 1$.
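For concreteness, the exponential decay of the first moment can be evaluated directly (a quick check of mine, with arbitrary $\sigma$ and $\delta$; `scipy.stats.norm.sf` is the Gaussian tail):

```python
# Sketch: E[N_n(a)] = 2^n P(X_n(v) > (c + delta) n) decays exponentially in n.
from math import log, sqrt
from scipy.stats import norm

sigma, delta = 1.0, 0.1
c = sqrt(2 * log(2)) * sigma
for n in (20, 40, 80):
    a = (c + delta) * n
    EN = 2 ** n * norm.sf(a / (sigma * sqrt(n)))  # 2^n P(N(0, sigma^2 n) > a)
    print(n, EN)  # drops roughly like exp(-c delta n / sigma^2)
```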

Lower bound: Multiscale second moment

The only tool at our disposal to get a lower bound on the right tail of a nonnegative integer-valued random variable $N$ is the Paley-Zygmund inequality:
$$P(N \ge 1) \ge \frac{E[N]^2}{E[N^2]}$$
We would like to show that for $a = (c - \delta) n$,
$$P\big(N_n(a) \ge 1\big) \ge \frac{E[N_n(a)]^2}{E[N_n(a)^2]} \to 1$$
The correlations play a role in the denominator. Good news: we need an upper bound on it.
$$E[N_n(a)^2] = \sum_{v, v' \in V_n} P\big(X_n(v) > a,\ X_n(v') > a\big)$$
If the r.v. were independent:
$$E[N_n(a)^2] = \sum_{v \neq v'} P\big(X_n(v) > a\big)^2 + \sum_{v} P\big(X_n(v) > a\big) \le \underbrace{E[N_n(a)]^2}_{\text{dominant for } a \text{ small!}} + E[N_n(a)]$$
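A Monte Carlo sanity check (mine, in the IID case where the bound is easy to simulate) shows the Paley-Zygmund inequality in action at the level $a = (c - \delta) n$:

```python
# Sketch: P(N_n(a) >= 1) versus the Paley-Zygmund lower bound E[N]^2 / E[N^2].
import numpy as np

rng = np.random.default_rng(3)
sigma, n, delta, trials = 1.0, 10, 0.2, 5000
c = np.sqrt(2 * np.log(2)) * sigma
a = (c - delta) * n

X = rng.normal(0.0, sigma * np.sqrt(n), size=(trials, 2 ** n))
N = (X > a).sum(axis=1).astype(float)    # exceedance counts N_n(a), one per trial
pz = N.mean() ** 2 / (N ** 2).mean()
print((N >= 1).mean(), ">=", pz)         # empirical P(N >= 1) dominates the bound
```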

Lower bound: Multiscale second moment

Use the multiscale decomposition (Prop. 1). $K$ scales (large but fixed) suffice:
$$X_n(v) = \sum_{k=1}^{K} \underbrace{\sum_{\frac{k-1}{K} n < l \le \frac{k}{K} n} Y_l(v)}_{:= W_k(v)}$$
By Props. 1 and 2, $(W_k(v), k = 1, \dots, K)$ are IID $\mathcal{N}(0, \sigma^2 n / K)$.

Define a modified number of exceedances:
$$\widetilde{N}_n(a) = \#\Big\{v \in V_n : W_k(v) > \frac{a}{K},\ k = 1, \dots, K\Big\}$$
Since the first order is linear in the scales, this is a good choice. Note that
$$P\big(N_n(a) \ge 1\big) \ge P\big(\widetilde{N}_n(a) \ge 1\big)$$

[Figure: tree of scales with the increments $W_1(v), W_2(v)$ at levels $k = 1, 2$ above a point $v$.]

Lower bound: Multiscale second moment

Not losing much in dropping $W_1$:
$$X_n(v) = \underbrace{W_1(v)}_{> -\delta n} + \underbrace{\sum_{k=2}^{K} W_k(v)}_{> a \frac{K-1}{K}}$$
Indeed, if $W_1(v) > -\delta n$ and $\sum_{k=2}^{K} W_k(v) > a \frac{K-1}{K}$ with $a = (c - \delta) n$, then $X_n(v) > (c - 3\delta) n$ for $K$ large. Moreover, $P(W_1(v) > -\delta n) \to 1$ since $\mathrm{Var}(W_1) = \sigma^2 n / K$. This step is crucial and not only technical.

So redefine
$$\widetilde{N}_n(a) = \#\Big\{v \in V_n : W_k(v) > \frac{a}{K},\ k = 2, \dots, K\Big\}$$
It remains to show, for $a = (c - \delta) n$,
$$P\big(\widetilde{N}_n(a) \ge 1\big) \ge \frac{E[\widetilde{N}_n(a)]^2}{E[\widetilde{N}_n(a)^2]} \to 1$$

Lower bound: Multiscale second moment

The second moment for these exceedances is
$$E[\widetilde{N}_n(a)^2] = \sum_{k=1}^{K} \sum_{\substack{v, v' :\\ \frac{k-1}{K} n < v \wedge v' \le \frac{k}{K} n}} P\Big(W_j(v) > \frac{a}{K},\ W_j(v') > \frac{a}{K}\ \ \forall j \ge 2\Big)$$
We expect the dominant term to be $k = 1$ (most independence). For $v, v'$ with $v \wedge v' \le n/K$, Prop. 3 (splitting) gives
$$P\Big(W_j(v) > \frac{a}{K},\ W_j(v') > \frac{a}{K}\ \ \forall j \ge 2\Big) = \prod_{j \ge 2} P\Big(W_j(v) > \frac{a}{K}\Big)^2$$
and
$$\#\{v, v' : v \wedge v' \le n/K\} = 2^{2n} - 2^n \cdot 2^{n - n/K} = 2^{2n} (1 + o(1)).$$
But $E[\widetilde{N}_n(a)]^2 = 2^{2n} \prod_{j \ge 2} P\big(W_j(v) > \frac{a}{K}\big)^2$, so
$$E[\widetilde{N}_n(a)^2] = (1 + o(1))\, E[\widetilde{N}_n(a)]^2 + \underbrace{\dots}_{k > 1 \text{ dominant?}}$$

Lower bound: Multiscale second moment

For the terms $k > 1$:
$$\sum_{k > 1} \sum_{\substack{v, v' :\\ \frac{k-1}{K} n < v \wedge v' \le \frac{k}{K} n}} P\Big(W_j(v) > \frac{a}{K},\ W_j(v') > \frac{a}{K}\ \ \forall j \ge 2\Big)$$
Since we need an upper bound, we can drop conditions in the probability. Take $v \wedge v' = l$ for $\frac{k-1}{K} n < l \le \frac{k}{K} n$, and keep
$$P\Big(W_j(v) > \frac{a}{K},\ j \ge 2;\ W_j(v') > \frac{a}{K},\ j \ge k + 1\Big)$$
Use Prop. 3 (splitting): if $j > v \wedge v'$, then $Y_j(v)$ is independent of $Y_j(v')$, so this is
$$\le \prod_{j=2}^{k} P\Big(W_j(v) > \frac{a}{K}\Big) \prod_{j=k+1}^{K} P\Big(W_j(v) > \frac{a}{K}\Big)^2$$

[Figure: tree of scales with $W_1(v), W_2(v), W_3(v), W_3(v')$ and branching scale $\frac{1}{K} n < v \wedge v' \le \frac{2}{K} n$.]

Lower bound: Multiscale second moment

Take $v \wedge v' = l$ for $\frac{k-1}{K} n < l \le \frac{k}{K} n$. There are at most $2^n \cdot 2^{n - \frac{k-1}{K} n}$ such pairs. Hence the $k$-th term of the sum is at most $E[\widetilde{N}_n(a)]^2$ times
$$\frac{2^{-\frac{k-1}{K} n}}{\prod_{j=2}^{k} P\big(W_j(v) > \frac{a}{K}\big)} \le 2^{-\frac{k-1}{K} n} \cdot 2^{\frac{k-1}{K} n (1 - \delta/c)^2},$$
using the Gaussian tail $P\big(W_j(v) > \frac{a}{K}\big) \ge 2^{-\frac{n}{K}(1 - \delta/c)^2}$ (up to polynomial factors) for $a = (c - \delta) n$. This goes to 0 exponentially fast!

Outline

Lecture 1
1. Warm-up: Extrema of IID r.v.
2. Log-correlated Gaussian fields (LGF): Branching Random Walk (BRW) and 2D Gaussian Free Field (2DGFF)
3. Three fundamental properties
4. First order of the maximum (~ LLN)

Lecture 2
Intermezzo: Relations to statistical physics
5. Second order of the maximum (~ refined LLN)
6. A word on Convergence and Order Statistics

Lecture 3: Universality Class of LGF
7. The maxima of the characteristic polynomial of unitary matrices
8. The maxima of the Riemann zeta function