A Note On Large Deviation Theory and Beyond


Jin Feng

In this set of notes, we will develop and explain a mathematical theory that can be summarized through one simple observation:
$$ \lim_{n \to +\infty} \frac{1}{n} \log \big( e^{na} + e^{nb} \big) = a \vee b. $$
Staring at the above identity for a moment, if you are sufficiently sensitive, you discover that two subjects of mathematics are shouting at you: on the left-hand side, you see summation, and hence probability theory; on the right-hand side, you see maximization, and hence the calculus of variations. Large deviation theory, an abstract framework that makes the above simple observation rigorous and extensive, has brought profound impacts to mathematics as well as to physics and engineering...

Copyright 200 Jin Feng
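The identity can be checked numerically. Below is a minimal sketch (my own illustration, not part of the notes): a log-sum-exp evaluation of $\frac{1}{n}\log(e^{na}+e^{nb})$ for growing $n$, showing convergence to $a \vee b = \max(a, b)$.

```python
import math

def log_sum_rate(a, b, n):
    """Compute (1/n) * log(e^{n a} + e^{n b}) stably via the log-sum-exp trick."""
    m = max(n * a, n * b)
    return (m + math.log(math.exp(n * a - m) + math.exp(n * b - m))) / n

a, b = 0.3, 1.7
for n in (1, 10, 100, 1000):
    print(n, log_sum_rate(a, b, n))
# the rate decreases toward max(a, b) = 1.7 as n grows
```

The log-sum-exp rescaling matters: for large $n$, $e^{na}$ overflows floating point, while the rescaled sum never exceeds 2.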

LECTURE 1
Sanov Theorem, from the Point of View of Boltzmann

1.1. Outline of Lecture
Sanov theorem, the mathematics.
Why did Boltzmann care: the concept of entropy, and an elementary proof of Sanov via the Stirling formula.
Gibbs conditioning and maximum entropy principles.

1.2. Sanov theorem, the abstract setup
Let
(a) $(S, d)$ be a complete separable metric space;
(b) $\{X_i : i = 1, 2, \dots\}$ be i.i.d. $S$-valued random variables with probability law $\gamma(dx) := P(X_1 \in dx)$;
(c) $\mu_n(dx) := \frac{1}{n} \sum_{i=1}^n \delta_{X_i}(dx) \in \mathcal{P}(S)$ be the measure-valued random variable (the empirical measure).

Define
$$ S(\rho \,\|\, \gamma) := \int_S \log \frac{d\rho}{d\gamma} \, d\rho. $$

Let $\mathcal{P}(S)$ be given the weak convergence topology with a compatible metric. Then

Theorem 1.1. For each $\rho \in \mathcal{P}(S)$,
$$ \lim_{\epsilon \to 0+} \lim_{n \to \infty} \frac{1}{n} \log P\big(\mu_n \in B(\rho; \epsilon)\big) = -S(\rho \,\|\, \gamma). $$

1.3. Boltzmann in 1877
Why did Boltzmann care? The setting of a discrete ideal gas:
(a) $S := \{x_1, x_2, \dots, x_m\}$;
(b) $\mathcal{P}(S) = \{\gamma = (\gamma_1, \dots, \gamma_m) : \sum_{k=1}^m \gamma_k = 1, \ \gamma_k \geq 0\}$;
(c) $\mu_n = (\mu_n(x_1), \dots, \mu_n(x_m))$, where $\mu_n(x) = \frac{1}{n} \sum_{i=1}^n \delta_{X_i}(\{x\})$, $x \in S$.

$\mu_n$ is a model for the shape of a thin gas. Think about why.

Theorem 1.2 (Boltzmann).
$$ P(\mu_n \approx \rho) \approx \exp\{-n S(\rho \,\|\, \gamma)\}. $$

Proof.
$$ P(\mu_n \approx \rho) = P\Big( (\mu_n(x_1), \dots, \mu_n(x_m)) \approx \tfrac{1}{n}(n\rho_1, \dots, n\rho_m) \Big) = P\Big( (\#\{X_i = x_1\}, \dots, \#\{X_i = x_m\}) \approx (n\rho_1, \dots, n\rho_m) \Big) = \frac{n!}{(n\rho_1)! \cdots (n\rho_m)!} \, \gamma_1^{n\rho_1} \cdots \gamma_m^{n\rho_m}. $$
By Stirling's formula (we will revisit this issue using the Gamma function in the second lecture),
$$ \log(k!) = k \log k - k + O(\log k), $$
so
$$ \frac{1}{n} \log P(\mu_n \approx \rho) = \log n - 1 + O\Big(\frac{\log n}{n}\Big) - \sum_i \rho_i \log(n\rho_i) + \sum_i \rho_i + \sum_i O\Big(\frac{\log(n\rho_i)}{n}\Big) + \sum_i \rho_i \log \gamma_i = \sum_i \big( -\rho_i \log \rho_i + \rho_i \log \gamma_i \big) + O\Big(\frac{\log n}{n}\Big) = -S(\rho \,\|\, \gamma) + O\Big(\frac{\log n}{n}\Big). $$

Definition 1.3 (Relative Entropy).
$$ S(\rho \,\|\, \gamma) := \int_S \Big( \log \frac{d\rho}{d\gamma} \Big) \, d\rho. $$
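Boltzmann's computation can be checked directly on a finite state space. The sketch below (my addition, not part of the notes; the three-point state space and the measures are made up for illustration) evaluates the exact multinomial probability via `math.lgamma` and compares $-\frac{1}{n}\log P(\mu_n = \rho)$ with $S(\rho \,\|\, \gamma)$.

```python
import math

def rel_entropy(rho, gamma):
    """S(rho || gamma) = sum_i rho_i log(rho_i / gamma_i) on a finite state space."""
    return sum(r * math.log(r / g) for r, g in zip(rho, gamma) if r > 0)

def log_prob_empirical(counts, gamma):
    """Exact log P(mu_n = counts / n) for n i.i.d. samples from gamma (multinomial)."""
    n = sum(counts)
    out = math.lgamma(n + 1)
    for k, g in zip(counts, gamma):
        out += -math.lgamma(k + 1) + k * math.log(g)
    return out

gamma = [0.5, 0.3, 0.2]
rho = [0.2, 0.3, 0.5]              # target empirical shape, rational entries
for n in (10, 100, 1000, 10000):
    counts = [round(n * r) for r in rho]
    print(n, -log_prob_empirical(counts, gamma) / n, rel_entropy(rho, gamma))
# the finite-n rate approaches S(rho || gamma), with an O(log n / n) error
```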

From the above, we know that $\mu_n \to \gamma$ in probability (the law of large numbers). Indeed, we know more than just that.

Lemma 1.4. Let $A$ be a set whose closure $\bar{A}$ does not contain $\gamma$. Then
$$ P(\mu_n \in A) \leq C e^{-n I(A)} \to 0 \quad \text{as } n \to \infty, \qquad \text{where } I(A) := \inf_{\rho \in \bar{A}} S(\rho \,\|\, \gamma) > 0. $$

What does the distribution of $(X_1, \dots, X_K)$ converge to? The following is a complicated way to answer this extremely simple question. First of all, the limit is a product measure (why?). Note that
$$ \langle f, \mu_n \rangle := \int f \, d\mu_n = \frac{1}{n} \sum_{i=1}^n f(X_i), $$
so by the identical distribution property, and by the above lemma,
$$ E[f(X_1)] = E[\langle f, \mu_n \rangle] \to \int f \, d\gamma $$
(indeed, the last limit holds as an equality without the limit). Hence the answer is $\gamma^{\otimes K} := \gamma \otimes \cdots \otimes \gamma$.

1.4. Maximum entropy and Gibbs conditioning

Problem. Suppose that we made one observation regarding the samples $\{X_1, \dots, X_n\}$. Knowing such a priori information, how does it change the setup and conclusion of the above Sanov theorem?

For instance, suppose that $h$ is a function on $S$ and we observe
$$ H_n := \frac{h(X_1) + \dots + h(X_n)}{n} = \int_S h(x) \, \mu_n(dx) =: f(\mu_n). $$
What is
$$ \lim_n \frac{1}{n} \log P(\mu_n \approx \rho \mid H_n \approx e)? $$
Since $H_n = f(\mu_n)$ is a function of $\mu_n$, the event $\{H_n = e\} = \{\mu_n \in f^{-1}(e)\}$. Therefore we arrive at a more general question, which is answered by the following

Theorem 1.5 (Gibbs conditioning principle).
$$ \lim_n \frac{1}{n} \log P(\mu_n \in A \mid \mu_n \in B) = -I(A \cap B) + I(B). $$

Proof. By the Sanov theorem, for a large class of sets $A \subset \mathcal{P}(S)$,
$$ \lim_n \frac{1}{n} \log P(\mu_n \in A) = -I(A). $$
Therefore
$$ \frac{1}{n} \log P(\mu_n \in A \mid \mu_n \in B) = \frac{1}{n} \log P(\mu_n \in A \cap B) - \frac{1}{n} \log P(\mu_n \in B) \to -I(A \cap B) + I(B) = -\inf_{\rho \in A \cap B} S(\rho \,\|\, \gamma) + \inf_{\rho \in B} S(\rho \,\|\, \gamma). $$

What is the most likely state for $\mu_n$ under the conditional probability $P(\mu_n \in \cdot \mid \mu_n \in B)$?

Theorem 1.6 (Maximum entropy principle). Suppose that $\rho^*$ is the unique minimizer such that
$$ S(\rho^* \,\|\, \gamma) = \inf_{\rho \in B} S(\rho \,\|\, \gamma). $$
Then
$$ \lim_n P\big(d(\mu_n, \rho^*) > \delta \mid \mu_n \in B\big) = 0, \qquad \forall \delta > 0. $$

Proof. Let $A := \{\rho : d(\rho, \rho^*) > \delta\}$. Then
$$ M := \inf_{\rho \in B \cap \{\rho : d(\rho, \rho^*) > \delta\}} S(\rho \,\|\, \gamma) - \inf_{\rho \in B} S(\rho \,\|\, \gamma) > 0. $$
Hence
$$ P\big(d(\mu_n, \rho^*) > \delta \mid \mu_n \in B\big) \approx e^{-n M} \to 0. $$

We now consider the special case $\{\mu_n \in B\} := \{H_n \approx e\}$. By the maximum entropy principle, we would like to minimize $S(\rho \,\|\, \gamma)$ under the constraint $\langle h, \rho \rangle = e$. By the Lagrange multiplier method, we optimize the function
$$ F(\rho, \beta) := S(\rho \,\|\, \gamma) - \beta \big( \langle h, \rho \rangle - e \big) = S(\rho \,\|\, \gamma_\beta) - \log Z_\beta, $$
where the parametrized probability measure
$$ \gamma_\beta(dx) = \frac{1}{Z_\beta} e^{\beta (h(x) - e)} \gamma(dx). $$

From $\partial F = 0$, we get, for each $i$,
$$ \log \rho_i - \log \gamma_i - \beta h(x_i) = \text{const}. $$
That is,
$$ \rho_i^* := \frac{e^{\beta^* h(x_i)} \gamma_i}{\sum_j e^{\beta^* h(x_j)} \gamma_j} = \gamma_i^{\beta^*}, $$
where $\beta^*$ is determined by
$$ \langle h, \rho^* \rangle = \int_S h \, d\gamma_{\beta^*} = e. \tag{1.7} $$

For people familiar with the statistical theory of estimation and inference, one recognizes the exponential-family connection. It is therefore natural to introduce the pressure function (to be discussed more extensively in the next lecture)
$$ \Lambda(\beta) := \log \int e^{\beta h(x)} \, \gamma(dx). $$
It can be verified that
$$ \Lambda'(\beta) = \frac{\sum_i h(x_i) e^{\beta h(x_i)} \gamma_i}{\sum_j e^{\beta h(x_j)} \gamma_j} = \int_S h(x) \, \gamma_\beta(dx), $$
and that $\Lambda''(\beta) > 0$.

Corollary 1.8 (Macro-state). The most likely "macro"-state is
$$ \rho_i^* := \frac{e^{\beta^* h(x_i)} \gamma_i}{Z_{\beta^*}}, $$
with $\beta^*$ uniquely determined by $\Lambda'(\beta^*) = e$.

Next, we derive $\lim_n P(X_1 \in \cdot \mid H_n \approx e)$. As in the Sanov case, by symmetry,
$$ E[f(X_1) \mid H_n \approx e] = E[\langle f, \mu_n \rangle \mid H_n \approx e]. $$
By the law of large numbers,
$$ \lim_n E[f(X_1) \mid H_n \approx e] = \int_S f \, d\gamma_{\beta^*}. $$
By the de Finetti theorem (review the concept of exchangeability),
$$ P(X_1 \in dx_1, \dots, X_n \in dx_n \mid H_n) = \prod_{i=1}^n P(X_i \in dx_i \mid H_n). $$
Therefore
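On a finite state space, the tilted measure $\gamma_\beta$ and the multiplier $\beta^*$ can be computed explicitly. The sketch below (my own illustration; the four-level observable and the target mean $e = 1.0$ are made up) finds $\beta^*$ by bisection, which is justified because $\Lambda'$ is strictly increasing ($\Lambda'' > 0$), and then checks the constraint $\langle h, \rho^* \rangle = e$.

```python
import math

def pressure_derivative(beta, h, gamma):
    """Lambda'(beta) = sum_i h_i e^{beta h_i} gamma_i / sum_j e^{beta h_j} gamma_j."""
    w = [math.exp(beta * hi) * g for hi, g in zip(h, gamma)]
    return sum(hi * wi for hi, wi in zip(h, w)) / sum(w)

def tilt(beta, h, gamma):
    """The Gibbs measure gamma_beta with density e^{beta h} / Z_beta w.r.t. gamma."""
    w = [math.exp(beta * hi) * g for hi, g in zip(h, gamma)]
    z = sum(w)
    return [wi / z for wi in w]

def solve_beta(e, h, gamma, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for Lambda'(beta) = e, using the monotonicity of Lambda'."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pressure_derivative(mid, h, gamma) < e:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gamma = [0.25, 0.25, 0.25, 0.25]   # uniform prior on four levels
h = [0.0, 1.0, 2.0, 3.0]           # observable h
beta_star = solve_beta(1.0, h, gamma)   # observed mean e = 1.0 < unconditioned mean 1.5
rho_star = tilt(beta_star, h, gamma)
print(beta_star, rho_star, sum(hi * ri for hi, ri in zip(h, rho_star)))
```

Since the observed mean 1.0 is below the unconditioned mean 1.5, the solver returns a negative $\beta^*$: conditioning pushes weight toward the low levels.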

Corollary 1.9. For each fixed $K$,
$$ \lim_n P\big( (X_1, \dots, X_K) \in \cdot \mid H_n \approx e \big) = (\gamma_{\beta^*})^{\otimes K}. $$


LECTURE 2
Free Energy and Entropy, à la Gibbs

2.1. Outline of Lecture
A duality between free energy and entropy.
Properties of relative entropy.

2.2. An entropy-free energy (pressure) duality
The Gibbs conditioning principle tells us the following. We start with a model $X_1, \dots, X_n \sim \gamma$ and make observations based on $H_n := \frac{1}{n} \sum_i h(X_i)$. Conditioning on the fact that we saw $H_n$, we should update our prior belief: the underlying measure becomes
$$ d\gamma_{\beta, h} := \frac{1}{Z_{\beta, h}} e^{\beta h} \, d\gamma $$
for some constant $\beta$. This is essentially a Bayes theorem. Since $\beta$ and $h$ always come together, we will just set $\beta = 1$ and write the renormalized new reference measure (a Gibbs measure)
$$ d\gamma_h := \frac{e^h}{Z_h} \, d\gamma $$
with normalizing (partition) constant $Z_h := \int e^h \, d\gamma$.

Let $h \in C_b(S)$. The log-partition functional
$$ \Lambda(h) := \log Z_h = \log \int e^h \, d\gamma $$
plays a key role as a dual functional to entropy. We first observe that
$$ S(\rho \,\|\, \gamma) - \langle h, \rho \rangle = \int \Big( \log \frac{d\rho}{d\gamma} - h \Big) \, d\rho = \int \log \frac{d\rho}{d\gamma_h} \, d\rho - \log Z_h = S(\rho \,\|\, \gamma_h) - \Lambda(h). $$

We have the following infinite-dimensional version of the Legendre-Fenchel transform.

Theorem 2.1 (Lanford-Varadhan).
$$ S(\rho \,\|\, \gamma) = \sup_{h \in C_b(S)} \{ \langle h, \rho \rangle - \Lambda(h) \}, \qquad \Lambda(h) = \sup_{\rho \in \mathcal{P}(S)} \{ \langle h, \rho \rangle - S(\rho \,\|\, \gamma) \}. $$
The supremum in the second identity is uniquely attained at $\gamma_h$.

Proof. Since
$$ S(\rho \,\|\, \gamma) + \Lambda(h) = \langle h, \rho \rangle + S(\rho \,\|\, \gamma_h) \geq \langle h, \rho \rangle, $$
and since $\rho = \gamma_h$ is the only solution of $S(\rho \,\|\, \gamma_h) = 0$, the conclusions follow.

2.3. Properties of entropy

Lemma 2.2. $S(\cdot \,\|\, \gamma) : \mathcal{P}(S) \to \mathbb{R}_+$ is convex.

Proof. This is because
$$ S(\rho \,\|\, \gamma) = \int_S \frac{d\rho}{d\gamma} \log \frac{d\rho}{d\gamma} \, d\gamma = \int_S h\Big( \frac{d\rho}{d\gamma} \Big) \, d\gamma, $$
where $h(r) = r \log r$ is convex. The positivity follows from Jensen's inequality.

Review the concept, and give examples, of semicontinuous functions.

Lemma 2.3. Let $f_\alpha(\cdot) : S \to \mathbb{R}$ be lower semicontinuous for each fixed $\alpha \in \Lambda$. Then
$$ f(x) := \sup_{\alpha \in \Lambda} f_\alpha(x) $$
is still lower semicontinuous.
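On a finite state space, Theorem 2.1 can be checked numerically: the supremum in the first identity is attained at $h = \log \frac{d\rho}{d\gamma}$, and no other $h$ does better. A small sketch (my illustration; the two measures are made up):

```python
import math, random

gamma = [0.5, 0.3, 0.2]
rho = [0.2, 0.3, 0.5]

def Lambda(h):
    """Log-partition functional Lambda(h) = log sum_i e^{h_i} gamma_i."""
    return math.log(sum(math.exp(hi) * g for hi, g in zip(h, gamma)))

def pairing(h, p):
    """<h, p> = sum_i h_i p_i."""
    return sum(hi * pi for hi, pi in zip(h, p))

S = sum(r * math.log(r / g) for r, g in zip(rho, gamma))  # S(rho || gamma)

random.seed(0)
best = max(pairing(h, rho) - Lambda(h)
           for h in ([random.uniform(-5, 5) for _ in gamma] for _ in range(2000)))

h_opt = [math.log(r / g) for r, g in zip(rho, gamma)]  # h = log(d rho / d gamma)
print(best, pairing(h_opt, rho) - Lambda(h_opt), S)
# random h stay strictly below S; the explicit optimizer attains it exactly
```

For the optimizer, $\Lambda(h_{\mathrm{opt}}) = \log \sum_i (\rho_i/\gamma_i)\gamma_i = \log 1 = 0$, so the value is exactly $S(\rho \,\|\, \gamma)$, matching the equality case in the proof above.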

Proof. This is a consequence of
$$ \{x : f(x) \leq c\} = \bigcap_{\alpha \in \Lambda} \{x : f_\alpha(x) \leq c\}. \tag{2.4} $$

Lemma 2.5. $S(\cdot \,\|\, \cdot) : \mathcal{P}(S) \times \mathcal{P}(S) \to \mathbb{R}_+$ is lower semicontinuous in the weak convergence topology.

Proof. This is a consequence of the variational representation in Theorem 2.1.


LECTURE 3
Large Deviation, General Theory

3.1. Outline of Lecture
Laplace lemma.
Large deviation principle, Laplace principle, and related notions.
Exponential tightness.
Rate functions and techniques for identifying them.
The situation for stochastic processes.

3.2. Laplace lemma
We will make sense of an infinite-dimensional generalization of
$$ \lim_{n \to \infty} \Big( \int_0^1 e^{-n f(x)} \, dx \Big)^{1/n} = \exp\Big\{ -\min_{0 \leq x \leq 1} f(x) \Big\} $$
and its far-reaching impacts on physical applications.

Lemma 3.1 (Laplace Lemma).
$$ \frac{1}{n} \log \int_S e^{n f(z)} \, \mu(dz) \to \sup_{z \in \mathrm{supp}(\mu)} f(z) \quad \text{as } n \to \infty, $$
with $f \in C_b(S)$, $\mu \in M_b(S)$.

Proof. Take-home exercise.

As an application, we prove the Stirling formula for the Gamma function
$$ \Gamma(\alpha) := \int_0^\infty x^{\alpha - 1} e^{-x} \, dx. $$
Note that $\Gamma(n) = (n - 1)!$. We are interested in the behavior of $\Gamma(\alpha)$ as $\alpha \to \infty$.

By the change of variable $x = \alpha y$, and by using the Laplace lemma,
$$ \Gamma(\alpha) = \alpha^\alpha \int_0^\infty \frac{1}{y} \exp\{-\alpha (y - \log y)\} \, dy \approx e^{\alpha \log \alpha - \alpha}, $$
since $\min_{y > 0} \{y - \log y\} = 1 - \log 1 = 1$.

To be more precise,
$$ \lim_{\alpha \to \infty} \Big( \frac{\Gamma(\alpha)}{\alpha^\alpha} \Big)^{1/\alpha} = e^{-1}. $$
The special case $\alpha = n + 1$ gives the well-known Stirling formula
$$ n! \approx e^{n \log n - n}. $$
Indeed, if we are more careful, we have the next-order expansion around the stationary point $y_0 = 1$:
$$ y - \log y = 1 + \frac{1}{2}(y - 1)^2 + O((y - 1)^3). $$
By Gaussian integral properties,
$$ \Gamma(\alpha) = \alpha^{\alpha - \frac{1}{2}} e^{-\alpha} (2\pi)^{\frac{1}{2}} \big( 1 + O(\alpha^{-1}) \big). $$

3.3. Large Deviation Principle and Laplace Principle
A rate (action) function is a lower semicontinuous function $I : S \to [0, +\infty]$. If $I$ has compact level sets, we call it good. We denote $I(A) := \inf_{x \in A} I(x)$.

Definition 3.2 (LDP). $\{X_n : n = 1, 2, \dots\}$ is said to satisfy the large deviation principle with rate function $I$ if and only if
(a) for each closed set $F \subset S$,
$$ \limsup_n \frac{1}{n} \log P(X_n \in F) \leq -I(F); $$
(b) for each open set $G \subset S$,
$$ \liminf_n \frac{1}{n} \log P(X_n \in G) \geq -I(G). $$

Definition 3.3. $\{X_n : n = 1, 2, \dots\}$ is said to satisfy the Laplace principle with rate function $I$ if
(a) for all $f \in C_b(S)$,
$$ \limsup_n \frac{1}{n} \log E[e^{n f(X_n)}] \leq \sup_{x \in S} \{f(x) - I(x)\}; $$
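The refined asymptotic can be sanity-checked against `math.gamma` (a small sketch of my own, not from the notes):

```python
import math

def stirling(alpha):
    """Leading-order Laplace approximation: sqrt(2 pi) * alpha^(alpha - 1/2) * e^(-alpha)."""
    return math.sqrt(2 * math.pi) * math.exp((alpha - 0.5) * math.log(alpha) - alpha)

for alpha in (5.0, 20.0, 100.0):
    print(alpha, math.gamma(alpha) / stirling(alpha))
# the ratio tends to 1 from above, with a 1 + O(1/alpha) correction
```

Working with `exp((alpha - 0.5) * log(alpha) - alpha)` rather than `alpha ** (alpha - 0.5) * exp(-alpha)` avoids overflow for moderately large `alpha`.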

(b) for each $f \in C_b(S)$,
$$ \liminf_n \frac{1}{n} \log E[e^{n f(X_n)}] \geq \sup_{x \in S} \{f(x) - I(x)\}. $$

Theorem 3.4. The Laplace principle is equivalent to the large deviation principle.

Proof. First, we prove that the large deviation principle implies the Laplace principle; this is due to Varadhan. Let
$$ F_{N,j} := \Big\{ x \in S : -\|f\| + \frac{j - 1}{N} 2\|f\| \leq f(x) \leq -\|f\| + \frac{j}{N} 2\|f\| \Big\}, $$
a closed set, and approximate $f$ from above by step functions:
$$ f_N(x) := \sum_{j=1}^N \Big( -\|f\| + \frac{j}{N} 2\|f\| \Big) \mathbf{1}(x \in F_{N,j}). $$
Note that the level sets of $f_N$ are closed. Therefore, by the large deviation upper bound,
$$ \limsup_n \frac{1}{n} \log E[e^{n f(X_n)}] \leq \limsup_n \frac{1}{n} \log E[e^{n f_N(X_n)}] \leq \max_{j=1,\dots,N} \Big\{ -\|f\| + \frac{j}{N} 2\|f\| - I(F_{N,j}) \Big\} \leq \max_{j=1,\dots,N} \sup_{x \in F_{N,j}} \{f(x) - I(x)\} + \frac{2\|f\|}{N} \leq \sup_{x \in S} \{f(x) - I(x)\} + \frac{2\|f\|}{N}. $$

Let $x_0 \in S$ and $\epsilon > 0$. Then $G := \{x : f(x) > f(x_0) - \epsilon\}$ is open, so by the large deviation lower bound,
$$ \liminf_n \frac{1}{n} \log E[e^{n f(X_n)}] \geq \liminf_n \frac{1}{n} \log E[\mathbf{1}(X_n \in G) e^{n f(X_n)}] \geq f(x_0) - \epsilon + \liminf_n \frac{1}{n} \log P(X_n \in G) \geq f(x_0) - \epsilon - I(G) \geq f(x_0) - I(x_0) - \epsilon. $$
The Laplace lower bound follows from the arbitrariness of $x_0 \in S$ and $\epsilon > 0$.

Next, we prove that the Laplace principle implies the large deviation principle. This seems to have been first realized by Dupuis and Ellis. ...
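A concrete instance of the upper and lower bounds is the coin-tossing case of Cramér's theorem. The sketch below (my illustration, not from the notes) computes the exact tail probability $P(S_n \geq a n)$ for a fair coin via `math.lgamma` and compares the empirical rate with $I(a) = a \log(2a) + (1 - a)\log(2(1 - a))$.

```python
import math

def log_binom_tail(n, a):
    """Exact log P(S_n >= a n) for S_n ~ Binomial(n, 1/2), via lgamma + log-sum-exp."""
    k0 = math.ceil(a * n)
    terms = [math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
             - n * math.log(2) for k in range(k0, n + 1)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

def rate(a):
    """Cramer rate function for a fair coin: I(a) = a log(2a) + (1-a) log(2(1-a))."""
    return a * math.log(2 * a) + (1 - a) * math.log(2 * (1 - a))

a = 0.7
for n in (100, 1000, 10000):
    print(n, -log_binom_tail(n, a) / n, rate(a))
# -(1/n) log P(S_n / n >= 0.7) decreases toward I(0.7)
```

Since the Chernoff bound $P(S_n \geq a n) \leq e^{-n I(a)}$ holds for every $n$, the finite-$n$ rate sits above $I(a)$ and only the lower bound needs the limit.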


LECTURE 4
Occupation Measure and Random Perturbation of ODEs

4.1. Outline of Lecture
The Donsker-Varadhan theory.
The Freidlin-Wentzell theory.


LECTURE 5
An HJB Equation Approach to Large Deviations of Markov Processes

5.1. Outline of Lecture
Martingale problems.
A nonlinear semigroup.
Hamilton-Jacobi-Bellman equations and viscosity solutions.
Convergence.
Variational problems through the view of optimal control.


LECTURE 6
Examples

6.1. Outline of Lecture
Examples: Freidlin-Wentzell, Donsker-Varadhan, multi-scale diffusions.
Applications to infinite dimensions: stochastic PDEs.
Another type of infinite dimensions: interacting particles.


LECTURE 7
Beyond Large Deviation

7.1. Outline of Lecture
Variational formulation of PDEs: compressible Euler equations.
Incompressible Navier-Stokes.
Lasry-Lions mean-field games.
Transition path theory.
An approach to large-time statistical structures of complex flows.


More information

Statistical Machine Learning Lectures 4: Variational Bayes

Statistical Machine Learning Lectures 4: Variational Bayes 1 / 29 Statistical Machine Learning Lectures 4: Variational Bayes Melih Kandemir Özyeğin University, İstanbul, Turkey 2 / 29 Synonyms Variational Bayes Variational Inference Variational Bayesian Inference

More information

SOLVABLE VARIATIONAL PROBLEMS IN N STATISTICAL MECHANICS

SOLVABLE VARIATIONAL PROBLEMS IN N STATISTICAL MECHANICS SOLVABLE VARIATIONAL PROBLEMS IN NON EQUILIBRIUM STATISTICAL MECHANICS University of L Aquila October 2013 Tullio Levi Civita Lecture 2013 Coauthors Lorenzo Bertini Alberto De Sole Alessandra Faggionato

More information

Large deviations and averaging for systems of slow fast stochastic reaction diffusion equations.

Large deviations and averaging for systems of slow fast stochastic reaction diffusion equations. Large deviations and averaging for systems of slow fast stochastic reaction diffusion equations. Wenqing Hu. 1 (Joint work with Michael Salins 2, Konstantinos Spiliopoulos 3.) 1. Department of Mathematics

More information

A relative entropy characterization of the growth rate of reward in risk-sensitive control

A relative entropy characterization of the growth rate of reward in risk-sensitive control 1 / 47 A relative entropy characterization of the growth rate of reward in risk-sensitive control Venkat Anantharam EECS Department, University of California, Berkeley (joint work with Vivek Borkar, IIT

More information

A D VA N C E D P R O B A B I L - I T Y

A D VA N C E D P R O B A B I L - I T Y A N D R E W T U L L O C H A D VA N C E D P R O B A B I L - I T Y T R I N I T Y C O L L E G E T H E U N I V E R S I T Y O F C A M B R I D G E Contents 1 Conditional Expectation 5 1.1 Discrete Case 6 1.2

More information

Connection to Branching Random Walk

Connection to Branching Random Walk Lecture 7 Connection to Branching Random Walk The aim of this lecture is to prepare the grounds for the proof of tightness of the maximum of the DGFF. We will begin with a recount of the so called Dekking-Host

More information

Contents: 1. Minimization. 2. The theorem of Lions-Stampacchia for variational inequalities. 3. Γ -Convergence. 4. Duality mapping.

Contents: 1. Minimization. 2. The theorem of Lions-Stampacchia for variational inequalities. 3. Γ -Convergence. 4. Duality mapping. Minimization Contents: 1. Minimization. 2. The theorem of Lions-Stampacchia for variational inequalities. 3. Γ -Convergence. 4. Duality mapping. 1 Minimization A Topological Result. Let S be a topological

More information

Information Theory and Predictability Lecture 6: Maximum Entropy Techniques

Information Theory and Predictability Lecture 6: Maximum Entropy Techniques Information Theory and Predictability Lecture 6: Maximum Entropy Techniques 1 Philosophy Often with random variables of high dimensional systems it is difficult to deduce the appropriate probability distribution

More information

2. Dual space is essential for the concept of gradient which, in turn, leads to the variational analysis of Lagrange multipliers.

2. Dual space is essential for the concept of gradient which, in turn, leads to the variational analysis of Lagrange multipliers. Chapter 3 Duality in Banach Space Modern optimization theory largely centers around the interplay of a normed vector space and its corresponding dual. The notion of duality is important for the following

More information

Other properties of M M 1

Other properties of M M 1 Other properties of M M 1 Přemysl Bejda premyslbejda@gmail.com 2012 Contents 1 Reflected Lévy Process 2 Time dependent properties of M M 1 3 Waiting times and queue disciplines in M M 1 Contents 1 Reflected

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence

Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

Consistency of the maximum likelihood estimator for general hidden Markov models

Consistency of the maximum likelihood estimator for general hidden Markov models Consistency of the maximum likelihood estimator for general hidden Markov models Jimmy Olsson Centre for Mathematical Sciences Lund University Nordstat 2012 Umeå, Sweden Collaborators Hidden Markov models

More information

Week 6 Notes, Math 865, Tanveer

Week 6 Notes, Math 865, Tanveer Week 6 Notes, Math 865, Tanveer. Energy Methods for Euler and Navier-Stokes Equation We will consider this week basic energy estimates. These are estimates on the L 2 spatial norms of the solution u(x,

More information

From Boltzmann Equations to Gas Dynamics: From DiPerna-Lions to Leray

From Boltzmann Equations to Gas Dynamics: From DiPerna-Lions to Leray From Boltzmann Equations to Gas Dynamics: From DiPerna-Lions to Leray C. David Levermore Department of Mathematics and Institute for Physical Science and Technology University of Maryland, College Park

More information

Exercises Measure Theoretic Probability

Exercises Measure Theoretic Probability Exercises Measure Theoretic Probability 2002-2003 Week 1 1. Prove the folloing statements. (a) The intersection of an arbitrary family of d-systems is again a d- system. (b) The intersection of an arbitrary

More information

Lecture 8: Information Theory and Statistics

Lecture 8: Information Theory and Statistics Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang

More information

The Way of Analysis. Robert S. Strichartz. Jones and Bartlett Publishers. Mathematics Department Cornell University Ithaca, New York

The Way of Analysis. Robert S. Strichartz. Jones and Bartlett Publishers. Mathematics Department Cornell University Ithaca, New York The Way of Analysis Robert S. Strichartz Mathematics Department Cornell University Ithaca, New York Jones and Bartlett Publishers Boston London Contents Preface xiii 1 Preliminaries 1 1.1 The Logic of

More information

LECTURE 15: COMPLETENESS AND CONVEXITY

LECTURE 15: COMPLETENESS AND CONVEXITY LECTURE 15: COMPLETENESS AND CONVEXITY 1. The Hopf-Rinow Theorem Recall that a Riemannian manifold (M, g) is called geodesically complete if the maximal defining interval of any geodesic is R. On the other

More information

Variational approach to mean field games with density constraints

Variational approach to mean field games with density constraints 1 / 18 Variational approach to mean field games with density constraints Alpár Richárd Mészáros LMO, Université Paris-Sud (based on ongoing joint works with F. Santambrogio, P. Cardaliaguet and F. J. Silva)

More information

Optimality Conditions for Constrained Optimization

Optimality Conditions for Constrained Optimization 72 CHAPTER 7 Optimality Conditions for Constrained Optimization 1. First Order Conditions In this section we consider first order optimality conditions for the constrained problem P : minimize f 0 (x)

More information

4 Expectation & the Lebesgue Theorems

4 Expectation & the Lebesgue Theorems STA 205: Probability & Measure Theory Robert L. Wolpert 4 Expectation & the Lebesgue Theorems Let X and {X n : n N} be random variables on a probability space (Ω,F,P). If X n (ω) X(ω) for each ω Ω, does

More information

Weak Convergence of Numerical Methods for Dynamical Systems and Optimal Control, and a relation with Large Deviations for Stochastic Equations

Weak Convergence of Numerical Methods for Dynamical Systems and Optimal Control, and a relation with Large Deviations for Stochastic Equations Weak Convergence of Numerical Methods for Dynamical Systems and, and a relation with Large Deviations for Stochastic Equations Mattias Sandberg KTH CSC 2010-10-21 Outline The error representation for weak

More information

Homework 1 Due: Thursday 2/5/2015. Instructions: Turn in your homework in class on Thursday 2/5/2015

Homework 1 Due: Thursday 2/5/2015. Instructions: Turn in your homework in class on Thursday 2/5/2015 10-704 Homework 1 Due: Thursday 2/5/2015 Instructions: Turn in your homework in class on Thursday 2/5/2015 1. Information Theory Basics and Inequalities C&T 2.47, 2.29 (a) A deck of n cards in order 1,

More information

(1) Consider the space S consisting of all continuous real-valued functions on the closed interval [0, 1]. For f, g S, define

(1) Consider the space S consisting of all continuous real-valued functions on the closed interval [0, 1]. For f, g S, define Homework, Real Analysis I, Fall, 2010. (1) Consider the space S consisting of all continuous real-valued functions on the closed interval [0, 1]. For f, g S, define ρ(f, g) = 1 0 f(x) g(x) dx. Show that

More information

ELEMENTS OF PROBABILITY THEORY

ELEMENTS OF PROBABILITY THEORY ELEMENTS OF PROBABILITY THEORY Elements of Probability Theory A collection of subsets of a set Ω is called a σ algebra if it contains Ω and is closed under the operations of taking complements and countable

More information

Constrained Optimization Theory

Constrained Optimization Theory Constrained Optimization Theory Stephen J. Wright 1 2 Computer Sciences Department, University of Wisconsin-Madison. IMA, August 2016 Stephen Wright (UW-Madison) Constrained Optimization Theory IMA, August

More information

Reaction-Diffusion Equations In Narrow Tubes and Wave Front P

Reaction-Diffusion Equations In Narrow Tubes and Wave Front P Outlines Reaction-Diffusion Equations In Narrow Tubes and Wave Front Propagation University of Maryland, College Park USA Outline of Part I Outlines Real Life Examples Description of the Problem and Main

More information

6. Brownian Motion. Q(A) = P [ ω : x(, ω) A )

6. Brownian Motion. Q(A) = P [ ω : x(, ω) A ) 6. Brownian Motion. stochastic process can be thought of in one of many equivalent ways. We can begin with an underlying probability space (Ω, Σ, P) and a real valued stochastic process can be defined

More information

Robust control and applications in economic theory

Robust control and applications in economic theory Robust control and applications in economic theory In honour of Professor Emeritus Grigoris Kalogeropoulos on the occasion of his retirement A. N. Yannacopoulos Department of Statistics AUEB 24 May 2013

More information

Notes on Large Deviations in Economics and Finance. Noah Williams

Notes on Large Deviations in Economics and Finance. Noah Williams Notes on Large Deviations in Economics and Finance Noah Williams Princeton University and NBER http://www.princeton.edu/ noahw Notes on Large Deviations 1 Introduction What is large deviation theory? Loosely:

More information

Lattice spin models: Crash course

Lattice spin models: Crash course Chapter 1 Lattice spin models: Crash course 1.1 Basic setup Here we will discuss the basic setup of the models to which we will direct our attention throughout this course. The basic ingredients are as

More information

MAT 135B Midterm 1 Solutions

MAT 135B Midterm 1 Solutions MAT 35B Midterm Solutions Last Name (PRINT): First Name (PRINT): Student ID #: Section: Instructions:. Do not open your test until you are told to begin. 2. Use a pen to print your name in the spaces above.

More information