A Note On Large Deviation Theory and Beyond
A Note On Large Deviation Theory and Beyond

Jin Feng

In this set of notes, we will develop and explain a whole mathematical theory which can be highly summarized through one simple observation:

lim_{n→+∞} (1/n) log(e^{na} + e^{nb}) = a ∨ b.

Staring at the above identity for a moment, if you are sufficiently over-sensitive, you discover that two subjects of mathematics are shouting at you: on the left hand side, you see summation and hence probability theory; on the right hand side, you see maximization and hence calculus of variations. The large deviation theory, which is an abstract framework making the above simple observation rigorous and extensive, has brought profound impacts to mathematics as well as to physics and engineering...

Copyright 200 Jin Feng
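The opening identity is easy to test numerically. The following is a minimal sketch (the function name and sample values of a, b are mine, not from the notes); the max is factored out of the logarithm so the computation stays stable for large n.

```python
import math

def log_sum_rate(a, b, n):
    # (1/n) * log(e^{n a} + e^{n b}), computed with the max factored out
    # so that neither exponential overflows for large n
    m = max(a, b)
    return m + math.log(math.exp(n * (a - m)) + math.exp(n * (b - m))) / n

a, b = 1.3, 2.0
for n in (1, 10, 100, 1000):
    print(n, log_sum_rate(a, b, n))
# The values decrease toward max(a, b) = 2.0 as n grows.
```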
LECTURE 1
Sanov Theorem, from the Point of View of Boltzmann

1.1. Outline of Lecture 1
- Sanov theorem, the mathematics
- Why did Boltzmann care: the concept of entropy, and an elementary proof of Sanov via the Stirling formula
- Gibbs conditioning and maximum entropy principles

1.2. Sanov theorem, the abstract setup
Let
(a) (S, d) be a complete separable metric space;
(b) {X_i : i = 1, 2, ...} be i.i.d. S-valued random variables with probability law γ(dx) := P(X_1 ∈ dx);
(c) µ_n(dx) := (1/n) Σ_{i=1}^n δ_{X_i}(dx) ∈ P(S) be the measure-valued empirical random variable.

Define
S(ρ‖γ) := ∫_S log(dρ/dγ) dρ.

Let P(S) be given the weak convergence topology with a compatible metric. Then

Theorem 1.1. For each ρ ∈ P(S),
lim_{ε→0+} lim_{n→∞} (1/n) log P(µ_n ∈ B(ρ; ε)) = −S(ρ‖γ).
1.3. Boltzmann in 1877
Why did Boltzmann care? The setting of discrete ideal gas:
(a) S := {x_1, x_2, ..., x_m};
(b) P(S) = {γ := (γ_1, ..., γ_m) : Σ_{k=1}^m γ_k = 1, γ_k ≥ 0};
(c) µ_n = (µ_n(x_1), ..., µ_n(x_m)), where µ_n(x) = (1/n) Σ_{i=1}^n δ_{X_i}({x}), x ∈ S.

µ_n is a model for the shape of thin gas. Think about why.

Theorem 1.2 (Boltzmann). P(µ_n ≈ ρ) ≈ exp{−n S(ρ‖γ)}.

Proof.
P(µ_n ≈ ρ) = P((µ_n(x_1), ..., µ_n(x_m)) ≈ (1/n)(nρ_1, ..., nρ_m))
= P((#{i : X_i = x_1}, ..., #{i : X_i = x_m}) ≈ (nρ_1, ..., nρ_m))
= n! / ((nρ_1)! ··· (nρ_m)!) · γ_1^{nρ_1} ··· γ_m^{nρ_m}.

By Stirling's formula (we will revisit this issue using the Gamma function in the second lecture), log(k!) = k log k − k + O(log k); then

(1/n) log P(µ_n ≈ ρ) = log n − 1 + O(log n / n) − Σ_i ρ_i log(nρ_i) + Σ_i ρ_i + Σ_i O(log(nρ_i)/n) + Σ_i ρ_i log γ_i
= Σ_i (−ρ_i log ρ_i + ρ_i log γ_i) + O(log n / n)
= −S(ρ‖γ) + O(log n / n).

Definition 1.3 (Relative Entropy).
S(ρ‖γ) := ∫_S log(dρ/dγ) dρ.
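Boltzmann's computation can be checked directly: on a finite alphabet, the exact multinomial probability of an empirical measure, normalized by −1/n, converges to the relative entropy at rate O(log n / n). A small sketch (the particular γ and ρ are example values of mine, not from the notes):

```python
import math

def multinomial_log_prob(counts, gamma):
    # exact log P(observing exactly these counts) for n i.i.d. samples from gamma
    n = sum(counts)
    out = math.lgamma(n + 1)
    for c, g in zip(counts, gamma):
        out += c * math.log(g) - math.lgamma(c + 1)
    return out

def relative_entropy(rho, gamma):
    return sum(r * math.log(r / g) for r, g in zip(rho, gamma) if r > 0)

def empirical_rate(n, rho, gamma):
    # -(1/n) log P(mu_n = rho), with n*rho_i rounded to integer counts
    counts = tuple(round(n * r) for r in rho)
    return -multinomial_log_prob(counts, gamma) / n

gamma = (0.5, 0.3, 0.2)
rho = (0.2, 0.3, 0.5)
for n in (10, 100, 1000, 10000):
    print(n, empirical_rate(n, rho, gamma), relative_entropy(rho, gamma))
# The rates decrease toward S(rho || gamma) as n grows.
```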
From the above, we know that µ_n → γ in probability (law of large numbers). Indeed, we know more than just that.

Lemma 1.4. Let A be a set whose closure Ā does not contain γ. Then
P(µ_n ∈ A) ≤ C e^{−n I(A)} → 0 as n → ∞,
where
I(A) := inf_{ρ ∈ Ā} S(ρ‖γ) > 0.

What does the distribution of (X_1, ..., X_K) converge to? The following is a complicated way to answer this extremely simple question. First of all, the limit is a product measure (why?). Note that
⟨f, µ_n⟩ := ∫ f dµ_n = (1/n) Σ_{i=1}^n f(X_i);
by the identical distribution property, and by the above lemma,
E[f(X_1)] = E[⟨f, µ_n⟩] → ∫ f dγ
(indeed, the last limit holds as an equality without taking the limit). Hence the answer is γ^{⊗K} := γ ⊗ ··· ⊗ γ.

1.4. Maximum entropy and Gibbs conditioning
Problem. Suppose that we made one observation regarding the samples {X_1, ..., X_n}. Knowing such a priori information, how does it change the setup and conclusion of the above Sanov theorem? For instance, suppose that h is a function on S and we observe
H_n := (1/n)(h(X_1) + ··· + h(X_n)) = ∫_S h(x) µ_n(dx) =: f(µ_n).
What is
lim_{n→∞} (1/n) log P(µ_n ≈ ρ | H_n ≈ e) = ?
Note that H_n = f(µ_n) is a function of µ_n, and the event {H_n = e} = {µ_n ∈ f^{−1}(e)}. Therefore we arrive at a more general question, which is answered by the following

Theorem 1.5 (Gibbs conditioning principle).
lim_{n→∞} (1/n) log P(µ_n ∈ A | µ_n ∈ B) = −I(A ∩ B) + I(B).
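The exponential decay in Lemma 1.4 can be illustrated exactly in the simplest case: a fair coin, with A the set of laws putting mass at least 0.7 on heads. Then I(A) equals the relative entropy of Bernoulli(0.7) with respect to Bernoulli(0.5), and the exact binomial tail probability decays at that rate. This is a sketch with example parameters of mine, not from the notes:

```python
import math

def log_binom_tail(n, p, k0):
    # exact log P(S_n >= k0) for S_n ~ Binomial(n, p), summed in log space
    terms = [math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
             + k * math.log(p) + (n - k) * math.log(1 - p)
             for k in range(k0, n + 1)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

def bernoulli_entropy(r, g):
    # S(Bernoulli(r) || Bernoulli(g))
    return r * math.log(r / g) + (1 - r) * math.log((1 - r) / (1 - g))

def tail_rate(n):
    # -(1/n) log P(mu_n puts mass >= 0.7 on heads), fair coin
    return -log_binom_tail(n, 0.5, math.ceil(0.7 * n)) / n

I_A = bernoulli_entropy(0.7, 0.5)
for n in (100, 1000, 5000):
    print(n, tail_rate(n), I_A)
# The rates decrease toward I(A) = S(Bernoulli(0.7) || Bernoulli(0.5)).
```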
Proof. By the Sanov theorem, for a large class of sets A ⊂ P(S),
lim_{n→∞} (1/n) log P(µ_n ∈ A) = −I(A).
Therefore
(1/n) log P(µ_n ∈ A | µ_n ∈ B) = (1/n) log P(µ_n ∈ A ∩ B) − (1/n) log P(µ_n ∈ B)
→ −I(A ∩ B) + I(B) = −inf_{ρ ∈ A ∩ B} S(ρ‖γ) + inf_{ρ ∈ B} S(ρ‖γ).

What is the most likely state for µ_n under the conditional probability P(µ_n ∈ · | µ_n ∈ B)?

Theorem 1.6 (Maximum entropy principle). Suppose that ρ* is the unique minimizer such that
S(ρ*‖γ) = inf_{ρ ∈ B} S(ρ‖γ).
Then
lim_{n→∞} P(d(µ_n, ρ*) > δ | µ_n ∈ B) = 0, for all δ > 0.

Proof. Let
A := {ρ : d(ρ, ρ*) > δ}.
Then
M := inf_{ρ ∈ B ∩ {ρ : d(ρ, ρ*) > δ}} S(ρ‖γ) − inf_{ρ ∈ B} S(ρ‖γ) > 0.
Hence
P(d(µ_n, ρ*) > δ | µ_n ∈ B) ≈ e^{−nM} → 0.

We now consider the special case of {µ_n ∈ B} := {H_n ≈ e}. By the maximum entropy principle, we would like to optimize S(ρ‖γ) under the constraint ⟨h, ρ⟩ = e. By the Lagrange multiplier method, we optimize the function
F(ρ, β) := S(ρ‖γ) − β(⟨h, ρ⟩ − e) = S(ρ‖γ_β) − log Z_β,
where the parametrized probability measure is
γ_β(dx) = Z_β^{−1} e^{β(h(x)−e)} γ(dx).
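The conditional concentration in the maximum entropy principle can be checked by exact enumeration on a tiny state space. The following sketch uses example data of mine, not from the notes: S = {1, 2, 3}, γ uniform, h(x) = x, conditioning on the sample mean lying within a small window of e = 2.4; the "most likely state" ρ* is approximated by the most probable admissible empirical measure, and the conditional mass at distance more than δ from it is computed exactly.

```python
import math

gamma3 = (1/3, 1/3, 1/3)          # uniform prior on S = {1, 2, 3}
e_target, window, delta = 2.4, 0.05, 0.15

def log_multinomial(counts):
    # exact log-probability of an empirical count vector under gamma3
    n = sum(counts)
    out = math.lgamma(n + 1)
    for c, g in zip(counts, gamma3):
        out += c * math.log(g) - math.lgamma(c + 1)
    return out

def conditional_far_mass(n):
    # exact P( max_i |mu_n(i) - rho*_i| > delta  |  |H_n - e| <= window )
    admissible = []
    for c1 in range(n + 1):
        for c2 in range(n - c1 + 1):
            c3 = n - c1 - c2
            mean = (c1 * 1 + c2 * 2 + c3 * 3) / n
            if abs(mean - e_target) <= window:
                admissible.append(((c1, c2, c3), log_multinomial((c1, c2, c3))))
    counts_star, _ = max(admissible, key=lambda t: t[1])
    rho_star = tuple(c / n for c in counts_star)
    m = max(w for _, w in admissible)
    total = sum(math.exp(w - m) for _, w in admissible)
    far = sum(math.exp(w - m) for counts, w in admissible
              if max(abs(c / n - r) for c, r in zip(counts, rho_star)) > delta)
    return far / total

for n in (30, 100, 300):
    print(n, conditional_far_mass(n))
# The conditional probability of being delta-far from the minimizer decays in n.
```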
From ∇_ρ F = 0, we get
log ρ_i − log γ_i − β h(x_i) = constant.
That is,
(1.7) ρ*_i := e^{β* h(x_i)} γ_i / Σ_j e^{β* h(x_j)} γ_j = γ_i^{β*},
where β* is determined by
⟨h, ρ*⟩ = ∫_S h dγ_{β*} = e.

For people familiar with the advanced statistical theory of estimation and inference, one recognizes the exponential family connection. Therefore, it is natural to introduce the pressure function (to be discussed more extensively in the next lecture)
Λ(β) := log ∫_S e^{βh(x)} γ(dx).
It can be verified that
Λ'(β) = Σ_i h(x_i) e^{βh(x_i)} γ_i / Σ_j e^{βh(x_j)} γ_j = ∫_S h(x) γ_β(dx),
and that Λ''(β) > 0.

Corollary 1.8 (Macro-state). The most likely "macro"-state is
ρ*_i := e^{β* h(x_i)} γ_i / Z_{β*},
with β* uniquely determined by Λ'(β*) = e.

Next, we derive lim_{n→∞} P(X_1 ∈ · | H_n ≈ e). As in the Sanov case, by symmetry,
E[f(X_1) | H_n ≈ e] = E[⟨f, µ_n⟩ | H_n ≈ e].
By the law of large numbers,
lim_{n→∞} E[f(X_1) | H_n ≈ e] = ∫_S f dγ_{β*}.
By the de Finetti theorem (review the concept of exchangeability),
P(X_1 ∈ dx_1, ..., X_n ∈ dx_n | H_n) = Π_{i=1}^n P(X_i ∈ dx_i | H_n).
Therefore
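Since Λ''(β) > 0, the equation Λ'(β*) = e has at most one root and can be solved by bisection. A sketch on an assumed example (S = {1, 2, 3}, γ uniform, h(x) = x, target mean e = 2.4; none of these values are from the notes), which also reads off the resulting tilted measure γ_{β*}:

```python
import math

support = (1.0, 2.0, 3.0)
gamma = (1/3, 1/3, 1/3)
e = 2.4

def Lambda_prime(beta):
    # Lambda'(beta) = mean of h under the tilted measure gamma_beta
    Z = sum(g * math.exp(beta * x) for x, g in zip(support, gamma))
    return sum(x * g * math.exp(beta * x) for x, g in zip(support, gamma)) / Z

# Lambda' is strictly increasing (Lambda'' > 0), so bisection converges.
lo, hi = -20.0, 20.0
for _ in range(200):
    mid = (lo + hi) / 2
    if Lambda_prime(mid) < e:
        lo = mid
    else:
        hi = mid
beta_star = (lo + hi) / 2

Z = sum(g * math.exp(beta_star * x) for x, g in zip(support, gamma))
rho_star = tuple(g * math.exp(beta_star * x) / Z for x, g in zip(support, gamma))
print(beta_star, rho_star)
# rho_star is the exponentially tilted measure with mean e = 2.4.
```

For this example the root is also available in closed form: with t = e^{β*}, the mean constraint reduces to 3t² − 2t − 7 = 0, so t = (1 + √22)/3.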
Corollary 1.9. For each fixed K,
lim_{n→∞} P((X_1, ..., X_K) ∈ · | H_n ≈ e) = (γ_{β*})^{⊗K}.
LECTURE 2
Free Energy and Entropy, à la Gibbs

2.1. Outline of Lecture 2
- A duality between free energy and entropy
- Properties of relative entropy

2.2. An entropy-free energy (pressure) duality
The Gibbs conditioning principle tells us the following: we start with a model X_1, ..., X_n ~ γ. We make observations based on H_n := (1/n) Σ_i h(X_i). Conditioning on what we saw of H_n, we should update our prior belief: the underlying measure should be
dγ_{β,h} := Z_{β,h}^{−1} e^{βh} dγ
for some constant β. This is essentially a Bayes theorem. Since β and h always come together, we will just set β = 1 and write the renormalized new reference measure (a Gibbs measure)
dγ_h := Z_h^{−1} e^h dγ
with the normalizing (partition) constant Z_h := ∫ e^h dγ.
Let h ∈ C_b(S). The log partition functional
Λ(h) := log Z_h = log ∫ e^h dγ
plays a key role as a dual functional to entropy. We first observe that
S(ρ‖γ) − ⟨h, ρ⟩ = ∫ log(dρ/dγ) dρ − ⟨h, ρ⟩ = S(ρ‖γ_h) − log Z_h = S(ρ‖γ_h) − Λ(h).

We have the following infinite dimensional version of the Legendre-Fenchel transform.

Theorem 2.1 (Lanford-Varadhan).
S(ρ‖γ) = sup_{h ∈ C_b(S)} {⟨h, ρ⟩ − Λ(h)},
Λ(h) = sup_{ρ ∈ P(S)} {⟨h, ρ⟩ − S(ρ‖γ)}.
The supremum in the second identity is uniquely attained at γ_h.

Proof. Since
S(ρ‖γ) + Λ(h) = ⟨h, ρ⟩ + S(ρ‖γ_h) ≥ ⟨h, ρ⟩,
and since ρ = γ_h is the only solution of S(ρ‖γ_h) = 0, the conclusions follow.

2.3. Properties of entropy

Lemma 2.2. S(·‖γ) : P(S) → [0, +∞] is convex.

Proof. This is because
S(ρ‖γ) = ∫_S (dρ/dγ) log(dρ/dγ) dγ = ∫_S φ(dρ/dγ) dγ,
where φ(r) = r log r is convex. The nonnegativity follows from Jensen's inequality.

Review the concept and give examples of semicontinuous functions.

Lemma 2.3. Let f_α : S → R be lower semicontinuous for each fixed α ∈ Λ. Then
f(x) := sup_{α ∈ Λ} f_α(x)
is still lower semicontinuous.
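On a finite state space the Lanford-Varadhan duality can be verified directly: the supremum in the first identity is attained at h* = log(dρ/dγ), and the supremum in the second at γ_h. A sketch with example measures of mine, not from the notes:

```python
import math

gamma = (0.5, 0.3, 0.2)
rho = (0.2, 0.3, 0.5)

def Lam(h):
    # log partition functional Lambda(h) = log sum_i gamma_i e^{h_i}
    return math.log(sum(g * math.exp(hi) for g, hi in zip(gamma, h)))

def S(p, q):
    # relative entropy S(p || q) on a finite state space
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def pairing(h, p):
    return sum(hi * pi for hi, pi in zip(h, p))

# First identity: the sup over h is attained at h* = log(d rho / d gamma).
h_star = tuple(math.log(r / g) for r, g in zip(rho, gamma))
lhs = S(rho, gamma)
rhs = pairing(h_star, rho) - Lam(h_star)
print(lhs, rhs)

# Any other h gives a value at most S(rho || gamma):
for h in ((1.0, -0.5, 2.0), (0.0, 0.0, 0.0), (3.0, 3.0, -1.0)):
    assert pairing(h, rho) - Lam(h) <= lhs + 1e-12

# Second identity: the sup over rho is attained at the Gibbs measure gamma_h.
h = (1.0, -0.5, 2.0)
Zh = sum(g * math.exp(hi) for g, hi in zip(gamma, h))
gamma_h = tuple(g * math.exp(hi) / Zh for g, hi in zip(gamma, h))
assert abs(pairing(h, gamma_h) - S(gamma_h, gamma) - Lam(h)) < 1e-12
```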
Proof. This is a consequence of
(2.4) {x : f(x) ≤ c} = ∩_{α ∈ Λ} {x : f_α(x) ≤ c}.

Lemma 2.5. S(·‖·) : P(S) × P(S) → [0, +∞] is lower semicontinuous in the weak convergence topology.

Proof. This is because of the variational representation in Theorem 2.1.
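The sublevel-set identity (2.4) behind Lemma 2.3 is worth seeing concretely. A minimal sketch with an assumed finite family of functions (my example, not from the notes): a point lies in the sublevel set of the supremum exactly when it lies in every individual sublevel set.

```python
# Finite family f_alpha and level c chosen for illustration only.
fs = [lambda x: x * x, lambda x: 1 - x, lambda x: 0.5 * x]
c = 0.6
xs = [k / 100 for k in range(-200, 201)]  # sample grid on [-2, 2]

for x in xs:
    in_sup_sublevel = max(fn(x) for fn in fs) <= c
    in_intersection = all(fn(x) <= c for fn in fs)
    assert in_sup_sublevel == in_intersection
print("identity (2.4) holds on the grid")
```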
LECTURE 3
Large Deviation, General Theory

3.1. Outline of Lecture 3
- Laplace lemma
- Large deviation principle, Laplace principle, and related results
- Exponential tightness
- Rate function and techniques for identifying it
- The situation of stochastic processes

3.2. Laplace lemma
We will make sense of an infinite dimensional generalization of
lim_{n→∞} (∫_0^1 e^{−nf(x)} dx)^{1/n} = exp{−min_{0≤x≤1} f(x)}
and its far-reaching impacts on physical applications.

Lemma 3.1 (Laplace Lemma).
(1/n) log ∫_S e^{nf(z)} µ(dz) → sup_{z ∈ supp(µ)} f(z) as n → ∞,
with f ∈ C_b(S), µ ∈ M_b(S).

Proof. Take-home exercise.

As an application, we prove the Stirling formula for the Gamma function
Γ(α) := ∫_0^∞ x^{α−1} e^{−x} dx.
Note that Γ(n) = (n−1)!. We are interested in the behavior of Γ as α → +∞.
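A one-dimensional numerical illustration of the Laplace lemma, with an assumed example (f(z) = sin(3z) on [0, 1] with µ the uniform measure; both choices are mine, not from the notes). The normalized log-integral should approach sup f = 1, attained at z = π/6.

```python
import math

def f(z):
    return math.sin(3.0 * z)  # sup over [0, 1] is 1, attained at z = pi/6

def laplace_rate(n, grid=20000):
    # midpoint-rule approximation of (1/n) log ∫_0^1 e^{n f(z)} dz,
    # with the max factored out for numerical stability
    zs = [(k + 0.5) / grid for k in range(grid)]
    vals = [n * f(z) for z in zs]
    m = max(vals)
    return (m + math.log(sum(math.exp(v - m) for v in vals) / grid)) / n

for n in (10, 100, 1000):
    print(n, laplace_rate(n))
# The rates increase toward sup f = 1 from below.
```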
By the change of variable x = αy, and by using the Laplace lemma,
Γ(α) = α^α ∫_0^∞ y^{−1} exp{−α(y − log y)} dy ≈ e^{α log α − α},
since min_{y>0} (y − log y) = 1 − log 1 = 1.

To be more precise,
lim_{α→∞} (Γ(α) / α^α)^{1/α} = e^{−1}.

The special case of α = n + 1 gives the well-known Stirling formula
n! ≈ e^{n log n − n}.

Indeed, if we are more careful, we have the next order expansion around the stationary point y_0 = 1,
y − log y ≈ 1 + (1/2)(y − 1)^2 + O((y − 1)^3).
By Gaussian integral properties,
Γ(α) = α^{α − 1/2} e^{−α} (2π)^{1/2} (1 + O(α^{−1})).

3.3. Large Deviation Principle and Laplace Principle
A rate (action) function is a function I : S → [0, +∞] which is lower semicontinuous. If I has compact level sets, we call it good. We denote I(A) := inf_{x ∈ A} I(x).

Definition 3.2 (LDP). {X_n : n = 1, 2, ...} is said to satisfy the large deviation principle with rate function I, if and only if
(a) for each closed set F ⊂ S,
limsup_{n→∞} (1/n) log P(X_n ∈ F) ≤ −I(F);
(b) for each open set G ⊂ S,
liminf_{n→∞} (1/n) log P(X_n ∈ G) ≥ −I(G).

Definition 3.3. {X_n : n = 1, 2, ...} is said to satisfy the Laplace principle with rate function I, if
(a) for all f ∈ C_b(S),
limsup_{n→∞} (1/n) log E[e^{nf(X_n)}] ≤ sup_{x ∈ S} {f(x) − I(x)};
(b) for each f ∈ C_b(S),
liminf_{n→∞} (1/n) log E[e^{nf(X_n)}] ≥ sup_{x ∈ S} {f(x) − I(x)}.

Theorem 3.4. The Laplace principle is equivalent to the large deviation principle.

Proof. First, we prove that the large deviation principle implies the Laplace principle. This is due to Varadhan. Let the closed set
F_{N,j} := {x ∈ S : −‖f‖ + (j − 1)(2‖f‖/N) ≤ f(x) ≤ −‖f‖ + j(2‖f‖/N)},
and approximate f from above by step functions:
f_N(x) := Σ_{j=1}^N (−‖f‖ + j(2‖f‖/N)) 1(x ∈ F_{N,j}).
Note that the level sets of f_N are closed. Therefore, by the large deviation upper bound,
limsup_{n→∞} (1/n) log E[e^{nf(X_n)}] ≤ limsup_{n→∞} (1/n) log E[e^{nf_N(X_n)}]
≤ max_{j=1,...,N} {−‖f‖ + j(2‖f‖/N) − I(F_{N,j})}
≤ max_{j=1,...,N} sup_{x ∈ F_{N,j}} {f(x) − I(x)} + 2‖f‖/N
≤ sup_{x ∈ S} {f(x) − I(x)} + 2‖f‖/N.

Let x_0 ∈ S and ε > 0. Then G := {x : f(x) > f(x_0) − ε} is open, and by the large deviation lower bound,
liminf_{n→∞} (1/n) log E[e^{nf(X_n)}] ≥ liminf_{n→∞} (1/n) log E[1(X_n ∈ G) e^{nf(X_n)}]
≥ f(x_0) − ε + liminf_{n→∞} (1/n) log P(X_n ∈ G)
≥ f(x_0) − ε − I(G) ≥ f(x_0) − I(x_0) − ε.
The Laplace lower bound follows from the arbitrariness of x_0 ∈ S and ε > 0.

Next, we prove that the Laplace principle implies the large deviation principle. This seems to have been first realized by Dupuis and Ellis. ...
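The refined Stirling expansion for the Gamma function obtained above via the Laplace lemma can be checked against the exact log-Gamma function; the residual error is of order 1/(12α), consistent with the 1 + O(α^{−1}) correction. A small sketch (the test values of α are mine):

```python
import math

def stirling_log_error(alpha):
    # log Gamma(alpha) minus the Laplace-lemma approximation
    # log[ sqrt(2 pi) * alpha^(alpha - 1/2) * e^(-alpha) ]
    approx = 0.5 * math.log(2 * math.pi) + (alpha - 0.5) * math.log(alpha) - alpha
    return math.lgamma(alpha) - approx

for alpha in (2.0, 10.0, 100.0):
    print(alpha, stirling_log_error(alpha), 1.0 / (12.0 * alpha))
# The error closely tracks 1/(12*alpha), the first correction term.
```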
LECTURE 4
Occupation Measure and Random Perturbation of ODEs

4.1. Outline of Lecture 4
- The Donsker-Varadhan theory
- The Freidlin-Wentzell theory
LECTURE 5
An HJB Equation Approach to Large Deviation of Markov Processes

5.1. Outline of Lecture 5
- Martingale problems
- A nonlinear semigroup
- Hamilton-Jacobi-Bellman equation and viscosity solutions
- Convergence
- Variational problems through the view of optimal control
LECTURE 6
Examples

6.1. Outline of Lecture 6
- Examples: Freidlin-Wentzell, Donsker-Varadhan, multi-scale diffusion
- Applications to infinite dimensions: stochastic PDEs
- Another type of infinite dimensions: interacting particles
LECTURE 7
Beyond Large Deviation

7.1. Outline of Lecture 7
- Variational formulation of PDEs: compressible Euler equations
- Incompressible Navier-Stokes
- Lasry-Lions mean-field games
- Transition path theory
- An approach to large time statistical structures of complex flows
From Boltzmann Equations to Gas Dynamics: From DiPerna-Lions to Leray C. David Levermore Department of Mathematics and Institute for Physical Science and Technology University of Maryland, College Park
More informationExercises Measure Theoretic Probability
Exercises Measure Theoretic Probability 2002-2003 Week 1 1. Prove the folloing statements. (a) The intersection of an arbitrary family of d-systems is again a d- system. (b) The intersection of an arbitrary
More informationLecture 8: Information Theory and Statistics
Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang
More informationThe Way of Analysis. Robert S. Strichartz. Jones and Bartlett Publishers. Mathematics Department Cornell University Ithaca, New York
The Way of Analysis Robert S. Strichartz Mathematics Department Cornell University Ithaca, New York Jones and Bartlett Publishers Boston London Contents Preface xiii 1 Preliminaries 1 1.1 The Logic of
More informationLECTURE 15: COMPLETENESS AND CONVEXITY
LECTURE 15: COMPLETENESS AND CONVEXITY 1. The Hopf-Rinow Theorem Recall that a Riemannian manifold (M, g) is called geodesically complete if the maximal defining interval of any geodesic is R. On the other
More informationVariational approach to mean field games with density constraints
1 / 18 Variational approach to mean field games with density constraints Alpár Richárd Mészáros LMO, Université Paris-Sud (based on ongoing joint works with F. Santambrogio, P. Cardaliaguet and F. J. Silva)
More informationOptimality Conditions for Constrained Optimization
72 CHAPTER 7 Optimality Conditions for Constrained Optimization 1. First Order Conditions In this section we consider first order optimality conditions for the constrained problem P : minimize f 0 (x)
More information4 Expectation & the Lebesgue Theorems
STA 205: Probability & Measure Theory Robert L. Wolpert 4 Expectation & the Lebesgue Theorems Let X and {X n : n N} be random variables on a probability space (Ω,F,P). If X n (ω) X(ω) for each ω Ω, does
More informationWeak Convergence of Numerical Methods for Dynamical Systems and Optimal Control, and a relation with Large Deviations for Stochastic Equations
Weak Convergence of Numerical Methods for Dynamical Systems and, and a relation with Large Deviations for Stochastic Equations Mattias Sandberg KTH CSC 2010-10-21 Outline The error representation for weak
More informationHomework 1 Due: Thursday 2/5/2015. Instructions: Turn in your homework in class on Thursday 2/5/2015
10-704 Homework 1 Due: Thursday 2/5/2015 Instructions: Turn in your homework in class on Thursday 2/5/2015 1. Information Theory Basics and Inequalities C&T 2.47, 2.29 (a) A deck of n cards in order 1,
More information(1) Consider the space S consisting of all continuous real-valued functions on the closed interval [0, 1]. For f, g S, define
Homework, Real Analysis I, Fall, 2010. (1) Consider the space S consisting of all continuous real-valued functions on the closed interval [0, 1]. For f, g S, define ρ(f, g) = 1 0 f(x) g(x) dx. Show that
More informationELEMENTS OF PROBABILITY THEORY
ELEMENTS OF PROBABILITY THEORY Elements of Probability Theory A collection of subsets of a set Ω is called a σ algebra if it contains Ω and is closed under the operations of taking complements and countable
More informationConstrained Optimization Theory
Constrained Optimization Theory Stephen J. Wright 1 2 Computer Sciences Department, University of Wisconsin-Madison. IMA, August 2016 Stephen Wright (UW-Madison) Constrained Optimization Theory IMA, August
More informationReaction-Diffusion Equations In Narrow Tubes and Wave Front P
Outlines Reaction-Diffusion Equations In Narrow Tubes and Wave Front Propagation University of Maryland, College Park USA Outline of Part I Outlines Real Life Examples Description of the Problem and Main
More information6. Brownian Motion. Q(A) = P [ ω : x(, ω) A )
6. Brownian Motion. stochastic process can be thought of in one of many equivalent ways. We can begin with an underlying probability space (Ω, Σ, P) and a real valued stochastic process can be defined
More informationRobust control and applications in economic theory
Robust control and applications in economic theory In honour of Professor Emeritus Grigoris Kalogeropoulos on the occasion of his retirement A. N. Yannacopoulos Department of Statistics AUEB 24 May 2013
More informationNotes on Large Deviations in Economics and Finance. Noah Williams
Notes on Large Deviations in Economics and Finance Noah Williams Princeton University and NBER http://www.princeton.edu/ noahw Notes on Large Deviations 1 Introduction What is large deviation theory? Loosely:
More informationLattice spin models: Crash course
Chapter 1 Lattice spin models: Crash course 1.1 Basic setup Here we will discuss the basic setup of the models to which we will direct our attention throughout this course. The basic ingredients are as
More informationMAT 135B Midterm 1 Solutions
MAT 35B Midterm Solutions Last Name (PRINT): First Name (PRINT): Student ID #: Section: Instructions:. Do not open your test until you are told to begin. 2. Use a pen to print your name in the spaces above.
More information