
Mathematical Methods for Neurosciences
ENS - Master MVA / Paris 6 - Master Maths-Bio (2014-2015)
Etienne Tanré - Olivier Faugeras, INRIA - Team Tosca
October 22nd, 2014

Outline
1 Introduction and Definitions
2 On the convergence rate of Monte Carlo methods
3 Simulation Methods of Classical Laws
4 Variance Reduction
5 Low discrepancy sequences

1 Introduction and Definitions

Motivation
- What is really random?
- Stochastic models in general
- Sources of noise in neuronal activities
- Monte Carlo methods
- Efficiency


Noise in neuronal activity
- Thermal noise
- Channel noise
- Electrical noise
- Synaptic noise

Two approaches: a strong link between deterministic and stochastic approaches

Toy example: for $U, U_1, U_2, U_3$ i.i.d. uniform on $(0,1)$,
$A = \int_0^1 f(\theta)\,d\theta = E[f(U)]$
$\tilde{A} = \int_{[0,1]^3} f(\theta_1, \theta_2, \theta_3)\,d\theta_1\,d\theta_2\,d\theta_3 = E[f(U_1, U_2, U_3)]$

Heat equation and Brownian motion:
$\frac{\partial u}{\partial t}(t,x) + \frac{1}{2}\frac{\partial^2 u}{\partial x^2}(t,x) = 0, \qquad u(T,x) = \Psi(x)$
$u(t,x) = E[\Psi(W_T) \mid W_t = x]$
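As a minimal illustration of the toy example (a sketch, not from the slides): estimate $A = \int_0^1 f(\theta)\,d\theta$ by averaging $f$ over uniform draws. The integrand $f(\theta) = \cos(\pi\theta/2)$, with exact integral $2/\pi$, is a hypothetical choice made here for checking purposes.

    import numpy as np

    rng = np.random.default_rng(0)
    f = lambda theta: np.cos(np.pi * theta / 2)  # hypothetical integrand, exact integral 2/pi

    n = 100_000
    u = rng.random(n)                # U_1, ..., U_n i.i.d. uniform on (0, 1)
    estimate = f(u).mean()           # A ~ (1/n) * sum of f(U_i)
    print(estimate, 2 / np.pi)       # both close to 0.63662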

Probability Space

σ-field: Consider an arbitrary nonempty space Ω. A class F of subsets of Ω is called a σ-field if
- Ω ∈ F;
- A ∈ F implies A^c ∈ F;
- A_1, A_2, ... ∈ F implies $\bigcup_{k \in \mathbb{N}} A_k \in F$.

Probability measure: Consider a real-valued function P defined on a σ-field F of Ω. P is a probability measure if it satisfies:
1. 0 ≤ P(A) ≤ 1 for A ∈ F;
2. P(∅) = 0 and P(Ω) = 1;
3. if A_1, A_2, ... is a disjoint sequence of F-sets, then
$P\left(\bigcup_{k=1}^{\infty} A_k\right) = \sum_{k=1}^{\infty} P(A_k).$

Limit sets

Definition: For a sequence A_1, A_2, ... of sets, we define
$\limsup_n A_n = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k, \qquad \liminf_n A_n = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k.$
If $\limsup_n A_n = \liminf_n A_n$, we write $\lim A_n$.

Theorem:
1. For each sequence (A_n),
$P(\liminf_n A_n) \le \liminf_n P(A_n) \le \limsup_n P(A_n) \le P(\limsup_n A_n).$
2. If $A_n \to A$ then $P(A_n) \to P(A)$.

Borel-Cantelli Lemmas

Theorem (First Borel-Cantelli Lemma): If $\sum_n P(A_n)$ converges, then $P(\limsup_n A_n) = 0$.

Theorem (Second Borel-Cantelli Lemma): If {A_n} is an independent sequence of events and $\sum_n P(A_n)$ diverges, then $P(\limsup_n A_n) = 1$.

Random Variables

Definition (Random variable): A random variable on a probability space (Ω, F, P) is a real-valued, F-measurable function X = X(ω).

Definition (Cumulative distribution function): The cumulative distribution function F_X of a random variable X is $F_X(y) := P(X \le y)$.

Properties:
- F_X characterizes the law of X;
- F_X is a nondecreasing function;
- F_X is right continuous;
- $\lim_{y \to -\infty} F_X(y) = 0$ and $\lim_{y \to +\infty} F_X(y) = 1$.

Expected value

For a general real random variable, the expectation is built in steps. For indicators and simple functions,
$E(1_A) = P(A), \qquad E\left(\sum_i a_i 1_{A_i}\right) = \sum_i a_i P(A_i).$
If X ≥ 0,
$E(X) = \sup\left\{E(Y) : Y \le X,\ Y = \sum_i a_i 1_{A_i}\right\}.$
In general,
$E(X) = E(X^+) - E(X^-).$

Convergence of sequences of random variables

Convergence in distribution: A sequence X_1, X_2, ... of random variables is said to converge in distribution to a random variable X if
$\lim_n F_{X_n}(x) = F_X(x)$
for every x such that F_X is continuous at x.

Equivalent properties:
- $E(f(X_n)) \to E(f(X))$ for all bounded, continuous functions f;
- $E(f(X_n)) \to E(f(X))$ for all bounded, Lipschitz functions f;
- $\limsup E(f(X_n)) \le E(f(X))$ for every upper semi-continuous function f bounded from above;
- $\liminf E(f(X_n)) \ge E(f(X))$ for every lower semi-continuous function f bounded from below;
- $\limsup P(X_n \in C) \le P(X \in C)$ for all closed sets C;
- $\liminf P(X_n \in U) \ge P(X \in U)$ for all open sets U.

Convergence in distribution does not imply convergence of the probability density functions.


Convergence of sequences of random variables (2)

Convergence in probability: A sequence X_1, X_2, ... of random variables is said to converge in probability to a random variable X if for every ε > 0,
$\lim_n P(|X_n - X| \ge \varepsilon) = 0.$

Almost sure convergence: A sequence X_1, X_2, ... of random variables is said to converge almost surely to a random variable X if $P(\lim_n X_n = X) = 1$.

Convergence in mean: A sequence X_1, X_2, ... of random variables is said to converge in r-th mean (in $L^r$) to a random variable X if
$\lim_n E(|X_n - X|^r) = 0.$

Implications
- $X_n \xrightarrow{a.s.} X \Rightarrow X_n \xrightarrow{P} X$;
- $X_n \xrightarrow{P} X \Rightarrow$ there is a subsequence such that $X_{\varphi(n)} \xrightarrow{a.s.} X$;
- $X_n \xrightarrow{P} X \Rightarrow X_n \xrightarrow{L} X$;
- $X_n \xrightarrow{L^r} X \Rightarrow X_n \xrightarrow{P} X$;
- if $r \ge s \ge 1$, $X_n \xrightarrow{L^r} X \Rightarrow X_n \xrightarrow{L^s} X$.

Independence

Independent events: Events A and B are independent if $P(A \cap B) = P(A)P(B)$. More generally, a finite collection A_1, ..., A_n of events is independent if
$P(A_{k_1} \cap \cdots \cap A_{k_j}) = P(A_{k_1}) \cdots P(A_{k_j})$
for all $2 \le j \le n$ and $1 \le k_1 < \cdots < k_j \le n$.

Independent random variables: Random variables X and Y are independent if the σ-fields σ(X) and σ(Y) are independent (i.e. for all A ∈ σ(X) and B ∈ σ(Y), A and B are independent).

Conditional probability

A first definition:
$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad \text{if } P(B) \neq 0.$
If B_1, B_2, ... is a countable partition of Ω, set $f(\omega) = P(A \mid B_i)$ if $\omega \in B_i$. The random variable f is the conditional probability of A given G = σ(B_1, B_2, ...).

Remark: It is necessary to extend the definition to nonempty sets B_i such that P(B_i) = 0.

Conditional Expectation

Definition: Let X be an integrable random variable on (Ω, F, P) and G a σ-field contained in F. There exists a random variable E(X | G), called the conditional expected value of X given G, having the two properties:
(i) E(X | G) is G-measurable and integrable;
(ii) $E[1_G\, E(X \mid G)] = E[1_G\, X]$ for all G ∈ G.

2 On the convergence rate of Monte Carlo methods

Strong Law of Large Numbers

Theorem (Law of Large Numbers): Consider a random variable X such that $E(|X|) < \infty$, and denote the mean by µ := E(X). Consider a sample of n independent random variables X_1, ..., X_n with the same law as X. Then
$\frac{X_1 + \cdots + X_n}{n} \xrightarrow[n \to \infty]{a.s.} \mu.$

Central Limit Theorem

Theorem: Consider a random variable X such that $E(X^2) < \infty$. Denote the mean and variance by µ := E(X) and $\sigma^2 := E[(X - E(X))^2] = E(X^2) - (E(X))^2$. Consider a sample of n i.i.d. random variables X_1, ..., X_n with the same law as X. Then
$\frac{\sqrt{n}}{\sigma}\left(\frac{X_1 + \cdots + X_n}{n} - \mu\right) \xrightarrow[n \to \infty]{L} N(0, 1).$

Confidence intervals

Set µ = E(X). An estimator of µ is $\hat{\mu}_n := \frac{1}{n}(X_1 + \cdots + X_n)$. Assume n is large enough to be in the asymptotic regime; then
$P\left[\frac{\sqrt{n}}{\sigma}(\hat{\mu}_n - \mu) \in A\right] \approx P(G \in A), \quad \text{where } G \sim N(0, 1).$
For a confidence level α there exists $y_\alpha$ such that $P(|G| \le y_\alpha) = \alpha$.

An example of the size of the confidence interval: for α = 95%, $y_\alpha = 1.96$ and
$P\left(\mu \in \left[\hat{\mu}_n - \frac{1.96\,\sigma}{\sqrt{n}},\ \hat{\mu}_n + \frac{1.96\,\sigma}{\sqrt{n}}\right]\right) \approx 95\%.$
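A minimal sketch of this recipe, assuming (as is usual in practice, though not stated on the slide) that the unknown σ is replaced by the empirical standard deviation; the exponential sample is a hypothetical example with true mean 1.

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.exponential(size=10_000)   # hypothetical sample; true mean is 1

    mu_hat = x.mean()
    sigma_hat = x.std(ddof=1)          # empirical std stands in for the unknown sigma
    half_width = 1.96 * sigma_hat / np.sqrt(len(x))
    print(f"95% CI: [{mu_hat - half_width:.4f}, {mu_hat + half_width:.4f}]")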

A non-asymptotic estimate

Theorem (Berry-Esseen): Let (X_i)_{i≥1} be a sequence of independent and identically distributed random variables with zero mean. Denote by σ the common standard deviation and suppose that $E|X_1|^3 < +\infty$. Then
$\varepsilon_N := \sup_{x \in \mathbb{R}} \left| P\left(\frac{X_1 + \cdots + X_N}{\sigma\sqrt{N}} \le x\right) - \int_{-\infty}^{x} \frac{e^{-u^2/2}}{\sqrt{2\pi}}\,du \right| \le \frac{C\,E|X_1|^3}{\sigma^3 \sqrt{N}}.$
In addition, 0.398 ≤ C ≤ 0.8. For a proof, see, e.g., Shiryayev (1984).

A more precise result

We now give a result which is slightly more precise than the Berry-Esseen theorem: the estimate is non-uniform in x. See Petrov (1975) for a proof and extensions.

Theorem (Bikelis): Let (X_i)_{i≥1} be a sequence of independent real random variables, which are not necessarily identically distributed. Suppose that $EX_i = 0$ for all i, and that there exists 0 < δ ≤ 1 such that $E|X_i|^{2+\delta} < +\infty$ for all i. Set
$\sigma_i^2 := EX_i^2, \qquad B_N := \sum_{i=1}^{N} \sigma_i^2, \qquad F_N(x) := P\left[\frac{\sum_{i=1}^{N} X_i}{\sqrt{B_N}} \le x\right].$
Denote by Φ the distribution function of a Gaussian law with zero mean and unit variance. There exists a universal constant A in $(\frac{1}{\sqrt{2\pi}}, 1)$, independent of N and of the sequence (X_i)_{i≥1}, such that, for all x,
$|F_N(x) - \Phi(x)| \le \frac{A}{B_N^{1+\delta/2}\,(1+|x|)^{2+\delta}} \sum_{i=1}^{N} E|X_i|^{2+\delta}.$

3 Simulation Methods of Classical Laws

Uniform Law

Goal: generate a sequence U_1, ..., U_n of i.i.d. uniform random variables.

Properties (empirical checks):
$\frac{1}{n}\sum_{i=1}^{n} 1_{a \le U_i \le b} \to b - a \quad \text{for all } a, b \in [0,1];$
$\frac{1}{n}\sum_{i=1}^{n} \left(U_{2i+1} - \tfrac{1}{2}\right)\left(U_{2i+2} - \tfrac{1}{2}\right) \to 0;$
etc.

Congruential generator: fix $N_{\max} \in \mathbb{N}$ and a seed $n_0 \in \mathbb{N}$, and iterate
$n_{k+1} = a\,n_k + b \pmod{N_{\max}}, \qquad u_k = \frac{n_k}{N_{\max}}.$
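A minimal sketch of such a congruential generator. The constants a = 16807, b = 0, N_max = 2^31 - 1 are those of the classic "minimal standard" generator, chosen here for illustration; the slide does not prescribe particular values.

    def lcg(seed, a=16807, b=0, n_max=2**31 - 1):
        """Yield u_k = n_k / N_max with n_{k+1} = a*n_k + b (mod N_max)."""
        n = seed
        while True:
            n = (a * n + b) % n_max
            yield n / n_max

    gen = lcg(seed=12345)
    u = [next(gen) for _ in range(100_000)]
    # empirical checks from the slide: interval frequency and pair correlation
    print(sum(0.2 <= x <= 0.5 for x in u) / len(u))    # close to 0.3
    print(sum((u[2*i+1] - 0.5) * (u[2*i+2] - 0.5)
              for i in range(49_999)) / 49_999)        # close to 0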

Remarks
1. The sequence (u_k)_{k≥1} mimics a sequence of independent random variables uniformly distributed on (0, 1), but it is a deterministic sequence.
2. It allows us to compare several methods on the same random events.
3. These sequences are periodic.
4. The period must be chosen as long as possible.
5. A good choice: the Mersenne Twister.

Do not forget: if you want to use a software package or a given language to apply stochastic numerical methods, you have to identify its built-in uniform random generator or download a good uniform generator.

Inverting the Cumulative Distribution Function

Proposition: Let X be a random variable with cumulative distribution function F_X, and let U be a random variable uniformly distributed on (0, 1). Then
$F_X^{-1}(U) \sim X,$
i.e. $F_X^{-1}(U)$ has the same law as X (with $F_X^{-1}$ the generalized inverse $F_X^{-1}(u) := \inf\{y : F_X(y) \ge u\}$ when F_X is not invertible).
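For instance (a sketch; the exponential law is our choice of example, not one made on the slide): for X exponential with rate λ, $F_X(y) = 1 - e^{-\lambda y}$, so $F_X^{-1}(u) = -\log(1-u)/\lambda$.

    import numpy as np

    rng = np.random.default_rng(2)
    lam = 2.0
    u = rng.random(100_000)
    x = -np.log(1.0 - u) / lam     # F_X^{-1}(U) for the exponential law
    print(x.mean(), 1 / lam)       # sample mean close to 1/lambda = 0.5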

Rejection Procedure

Principle: Our aim is to estimate E[FG], with 0 ≤ G ≤ 1 almost surely. The idea is to write G as an acceptance probability: introduce U uniform on (0, 1), independent of (F, G), and let X denote the acceptance event {U ≤ G}, so that $G = P(X \mid F, G)$. Then
$E[FG] = E[F\,1_X] = E[F \mid X]\,E[G],$
since $P(X) = E[G]$.

Simulation of a random variable with a rejection procedure

Let X be a r.v. with density f; we do not know how to simulate it. Let Y be a r.v. with density g; we know how to simulate it. Assumption: $0 \le f(x) \le C g(x)$ for all $x \in \mathbb{R}$. We set
$h(x) := \frac{f(x)}{C g(x)}\, 1_{\{g(x) > 0\}}.$
Then, for U uniform on (0, 1) and independent of Y,
$E[\varphi(X)] = \int \varphi(x) f(x)\,dx = \int \varphi(x)\,\frac{f(x)}{g(x)}\,g(x)\,dx = E\left[\varphi(Y)\,\frac{f(Y)}{g(Y)}\right] = C\,E\left[\varphi(Y)\,\frac{f(Y)}{C g(Y)}\right] = C\,E[\varphi(Y) h(Y)]$
$= C\,E\left[\varphi(Y)\,1_{\{U \le h(Y)\}}\right] = C\,E[\varphi(Y) \mid U < h(Y)]\;P[U \le h(Y)].$

Moreover,
$P[U \le h(Y)] = E[h(Y)] = \int \frac{f(y)}{C g(y)}\,g(y)\,dy = \frac{1}{C}\int f(y)\,dy = \frac{1}{C},$
so that
$E[\varphi(X)] = C\,E[\varphi(Y) \mid U < h(Y)]\;P[U \le h(Y)] = E[\varphi(Y) \mid U < h(Y)].$

Algorithm
1. Generate Y.
2. Compute $\frac{f(Y)}{C g(Y)}$ for this realisation.
3. Generate a random variable U, independent of Y, with uniform law on (0, 1).
4. If $U \le \frac{f(Y)}{C g(Y)}$, accept the realisation: X = Y.
5. Else (if $U > \frac{f(Y)}{C g(Y)}$), reject the realisation and start again from the first step.

Remark: You have to wait a random number of trials to obtain each realisation. The probability of acceptance is equal to 1/C: the smaller C is, the better the algorithm.
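A minimal sketch of the algorithm. The target here is our illustrative choice, the Beta(2, 2) density $f(x) = 6x(1-x)$ on (0, 1), dominated by the uniform density g ≡ 1 with C = 1.5; these choices are assumptions, not from the slides.

    import numpy as np

    rng = np.random.default_rng(3)
    f = lambda x: 6.0 * x * (1.0 - x)   # target density, Beta(2, 2)
    C = 1.5                             # sup of f/g, with g the uniform density on (0, 1)

    def sample_rejection(n):
        out = []
        while len(out) < n:
            y = rng.random()            # step 1: draw Y ~ g
            u = rng.random()            # step 3: draw U ~ U(0, 1), independent of Y
            if u <= f(y) / C:           # step 4: accept with probability h(Y)
                out.append(y)
        return np.array(out)

    x = sample_rejection(100_000)
    print(x.mean(), x.var())            # close to 0.5 and 0.05 for Beta(2, 2)

The acceptance probability is 1/C = 2/3 here, so roughly 1.5 draws of Y are needed per accepted sample.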

Gaussian r.v.: Box-Muller Procedure

Algorithm: Generate (U_1, U_2), two independent random variables uniformly distributed on (0, 1), and set
$G_1 = \sqrt{-2\log(U_1)}\,\cos(2\pi U_2), \qquad G_2 = \sqrt{-2\log(U_1)}\,\sin(2\pi U_2).$
Then G_1 and G_2 are two independent Gaussian random variables N(0, 1).
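A direct transcription of the procedure (a sketch, using NumPy's uniform generator):

    import numpy as np

    rng = np.random.default_rng(4)
    u1, u2 = rng.random(50_000), rng.random(50_000)
    r = np.sqrt(-2.0 * np.log(u1))                       # common radius
    g1 = r * np.cos(2 * np.pi * u2)
    g2 = r * np.sin(2 * np.pi * u2)
    g = np.concatenate([g1, g2])
    print(g.mean(), g.std())                             # close to 0 and 1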

Gaussian r.v.: Marsaglia Polar Procedure

Algorithm: Generate (X_1, X_2), a point uniformly distributed in the unit disc, and compute $S = X_1^2 + X_2^2$. Set
$G_1 = X_1\,\sqrt{\frac{-2\log(S)}{S}}, \qquad G_2 = X_2\,\sqrt{\frac{-2\log(S)}{S}}.$
Then G_1 and G_2 are two independent Gaussian random variables N(0, 1).

Marsaglia Polar Procedure (2)

Remark: To generate (X_1, X_2) uniformly distributed on the unit disc, we use a rejection procedure:
- generate U_1 ~ U(0, 1) and U_2 ~ U(0, 1), independent;
- set Y_1 = 2U_1 - 1 and Y_2 = 2U_2 - 1;
- if $Y_1^2 + Y_2^2 \le 1$, accept: (X_1, X_2) = (Y_1, Y_2);
- else reject and go back to the first step.
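Both steps combined in a minimal sketch (the acceptance probability of the disc rejection is π/4 ≈ 0.785; the guard against S = 0 is our addition, to avoid log(0)):

    import numpy as np

    rng = np.random.default_rng(5)

    def marsaglia_polar(n):
        out = []
        while len(out) < n:
            y1, y2 = 2 * rng.random() - 1, 2 * rng.random() - 1
            s = y1 * y1 + y2 * y2
            if 0 < s <= 1:                           # keep only points inside the unit disc
                factor = np.sqrt(-2.0 * np.log(s) / s)
                out.extend([y1 * factor, y2 * factor])
        return np.array(out[:n])

    g = marsaglia_polar(100_000)
    print(g.mean(), g.std())                         # close to 0 and 1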

4 Variance Reduction

Context

Notation:
$\mu = E(\Psi(Z)), \qquad \hat{\mu}_N = \frac{1}{N}\sum_{i=1}^{N} \Psi(Z_i).$

Analysis of the error: the Central Limit Theorem gives
$|\mu - \hat{\mu}_N| \approx \frac{\sigma}{\sqrt{N}}, \qquad \text{where } \sigma^2 = \mathrm{Var}(\Psi(Z)),$
and our aim is to obtain the best accuracy.

Comparison of methods

We want to approximate µ = E(X), and we also know that µ = E(Y). We therefore have two ways to approximate µ:
$\mu \approx \frac{1}{N_X}\sum_{i=1}^{N_X} X_i, \qquad \mu \approx \frac{1}{N_Y}\sum_{i=1}^{N_Y} Y_i.$

Constraint: the best method should give the result with the best accuracy in a fixed execution time (say A).

Comparison of methods (2)
- Each simulation of X takes a time T_X, each simulation of Y a time T_Y.
- In time A we can generate $N_X = A/T_X$ samples of X (and $N_Y = A/T_Y$ of Y).
- The accuracy is
$\frac{\sigma_X}{\sqrt{N_X}} = \sqrt{\frac{\sigma_X^2\, T_X}{A}}.$
- Hence we have to compare $\sigma_X^2 T_X$ and $\sigma_Y^2 T_Y$.

Variance Reduction: Importance Sampling

You want to estimate E[ϕ(X)] with a Monte Carlo procedure, where X has a density g:
$E[\varphi(X)] = \int \varphi(x) g(x)\,dx.$
You choose a random variable Y with density f. Then
$E[\varphi(X)] = \int \varphi(x) g(x)\,dx = \int \frac{\varphi(x) g(x)}{f(x)}\,f(x)\,dx = E\left[\frac{\varphi(Y) g(Y)}{f(Y)}\right].$

Control Variables:
$E(f(X)) = E(f(X) - h(X)) + E(h(X)).$
If E(h(X)) is known in closed form and h is close to f, the Monte Carlo estimation of E(f(X) - h(X)) has a smaller variance.
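A minimal sketch of importance sampling. The target, $E[\varphi(X)]$ for X standard Gaussian with $\varphi(x) = 1_{\{x > 3\}}$ (a rare event), and the proposal Y ~ N(3, 1) shifted into the region of interest, are both illustrative assumptions, not choices made on the slide.

    import numpy as np

    rng = np.random.default_rng(6)
    n = 100_000

    # Plain Monte Carlo: almost all samples miss the event {X > 3}
    x = rng.normal(size=n)
    plain = (x > 3).mean()

    # Importance sampling: Y ~ N(3, 1); weight g(Y)/f(Y) with g = N(0,1), f = N(3,1)
    # (the normalizing constants of the two densities cancel in the ratio)
    y = rng.normal(loc=3.0, size=n)
    weights = np.exp(-y**2 / 2) / np.exp(-(y - 3.0)**2 / 2)
    is_est = ((y > 3) * weights).mean()

    print(plain, is_est)      # exact value: P(X > 3) = 1 - Phi(3) ~ 1.35e-3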

Variance Reduction: Antithetic Variables

You want to estimate
$I = \int_0^1 g(x)\,dx.$
Thanks to the change of variable y = 1 - x,
$I = \int_0^1 g(1-y)\,dy = \frac{1}{2}\int_0^1 \left(g(x) + g(1-x)\right) dx = \frac{1}{2}\,E\left(g(U) + g(1-U)\right).$

Antithetic Variables

Theorem: Let us consider a monotone function g, a nonincreasing function T, and a sequence of i.i.d. random variables (X_i, i ∈ N) such that X_i and T(X_i) have the same law. Then
$\mathrm{Var}\left[\frac{1}{2n}\sum_{i=1}^{n} \left(g(X_i) + g(T(X_i))\right)\right] \le \mathrm{Var}\left[\frac{1}{2n}\sum_{i=1}^{2n} g(X_i)\right].$
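A minimal sketch on the [0, 1] example, with T(x) = 1 - x and the hypothetical monotone integrand $g(x) = e^x$ (exact integral e - 1):

    import numpy as np

    rng = np.random.default_rng(7)
    g = lambda x: np.exp(x)             # hypothetical monotone integrand
    n = 50_000

    u = rng.random(2 * n)
    plain = g(u).mean()                 # 2n plain draws

    v = u[:n]
    antithetic = (0.5 * (g(v) + g(1.0 - v))).mean()   # n draws, each paired with 1 - v

    print(plain, antithetic, np.e - 1)  # the antithetic estimate has smaller variance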

Stratification

Idea: X is a r.v. with values in E, and you want to estimate E[g(X)]. You split the space E into E_1, E_2, ..., E_k:
$I = \sum_{i=1}^{k} E[g(X) \mid X \in E_i]\;P[X \in E_i] = \sum_{i=1}^{k} p_i I_i, \qquad \hat{I} = \sum_{i=1}^{k} p_i \hat{I}_i, \qquad \sigma_i^2 = \mathrm{Var}(g(X) \mid X \in E_i).$

You use a Monte Carlo procedure with n_i realisations to estimate each I_i, so that
$\mathrm{Var}(\hat{I}) = \sum_{i=1}^{k} \frac{p_i^2 \sigma_i^2}{n_i}.$
Under the constraint $n = \sum_i n_i$, the optimal choice of n_i is
$n_i = n\,\frac{p_i \sigma_i}{\sum_{j=1}^{k} p_j \sigma_j},$
which gives
$\mathrm{Var}(\hat{I}) = \frac{1}{n}\left(\sum_{i=1}^{k} p_i \sigma_i\right)^2.$
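A minimal sketch of this allocation on a hypothetical example: X uniform on (0, 1), g(x) = x² (so E[g(X)] = 1/3), two strata E_1 = (0, 0.5] and E_2 = (0.5, 1], with the σ_i estimated by small pilot runs.

    import numpy as np

    rng = np.random.default_rng(8)
    g = lambda x: x**2                  # hypothetical integrand; E[g(X)] = 1/3
    n = 100_000
    strata = [(0.0, 0.5), (0.5, 1.0)]
    p = np.array([0.5, 0.5])            # p_i = P[X in E_i] for the uniform law

    # pilot runs to estimate the per-stratum standard deviations sigma_i
    sigma = np.array([g(rng.uniform(a, b, 1000)).std() for a, b in strata])
    n_i = np.maximum(1, (n * p * sigma / (p * sigma).sum()).astype(int))  # optimal allocation

    I_hat = sum(p_i * g(rng.uniform(a, b, m)).mean()
                for p_i, (a, b), m in zip(p, strata, n_i))
    print(I_hat, 1 / 3)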

5 Low discrepancy sequences

Low discrepancy Sequences

Using sequences of points more regular than random points may sometimes improve Monte Carlo methods. We look for deterministic sequences (x_i, i ≥ 1) such that
$\int_{[0,1]^d} f(x)\,dx \approx \frac{1}{n}\left(f(x_1) + \cdots + f(x_n)\right)$
for all functions f in a large enough set.

Definition: These methods with deterministic sequences are called quasi-Monte Carlo methods.

One can find sequences such that the speed of convergence of the previous approximation is of order
$K\,\frac{(\log n)^d}{n}.$

Low discrepancy Sequences (2)

Definition (Uniformly distributed sequences): For all $y, z \in [0,1]^d$, we say that y ≤ z if $y_i \le z_i$ for all i = 1, ..., d. A sequence (x_i, i ≥ 1) is said to be uniformly distributed on $[0,1]^d$ if one of the following equivalent properties is fulfilled:
1. For all $y = (y_1, \ldots, y_d) \in [0,1]^d$,
$\lim_{n \to +\infty} \frac{1}{n}\sum_{k=1}^{n} 1_{x_k \in [0,y]} = \mathrm{Volume}([0,y]);$
2. Let
$D_n^*(x) = \sup_{y \in [0,1]^d} \left| \frac{1}{n}\sum_{k=1}^{n} 1_{x_k \in [0,y]} - \mathrm{Volume}([0,y]) \right|$
be the discrepancy of the sequence; then $\lim_n D_n^*(x) = 0$;
3. For every (bounded) continuous function f on $[0,1]^d$,
$\lim_{n \to +\infty} \frac{1}{n}\sum_{k=1}^{n} f(x_k) = \int_{[0,1]^d} f(x)\,dx.$

Low discrepancy Sequences (3)

Remark: If (U_n)_{n≥1} is a sequence of independent random variables with uniform law on [0, 1], the random sequence (U_n(ω), n ≥ 1) is almost surely uniformly distributed. The discrepancy fulfills an iterated logarithm law:
$\limsup_n \sqrt{\frac{2n}{\log(\log n)}}\;D_n^*(U) = 1 \quad a.s.$

Lower bound for the discrepancy (Roth's theorem): The discrepancy of any infinite sequence satisfies
$D_n^* > C_d\,\frac{(\log n)^{\frac{d-1}{2}}}{n} \quad \text{for } d \ge 3$
for an infinite number of values of n, where C_d is a constant which depends on d only.

Low discrepancy Sequences (4)

Koksma-Hlawka inequality: Let g be a finite variation function in the sense of Hardy and Krause and denote by V(g) its variation. Then, for N ≥ 1,
$\left| \frac{1}{N}\sum_{k=1}^{N} g(x_k) - \int_{[0,1]^d} g(u)\,du \right| \le V(g)\,D_N^*(x).$

Finite variation function in the sense of Hardy and Krause: If the function g is d times continuously differentiable, the variation V(g) is given by
$V(g) = \sum_{k=1}^{d} \;\sum_{1 \le i_1 < \cdots < i_k \le d} \;\int_{\{x \in [0,1]^d :\; x_j = 1 \text{ for } j \notin \{i_1, \ldots, i_k\}\}} \left| \frac{\partial^k g(x)}{\partial x_{i_1} \cdots \partial x_{i_k}} \right| dx_{i_1} \cdots dx_{i_k}.$

Popular Quasi-Monte Carlo Sequences
1. Faure sequences
2. Halton sequences
3. Sobol sequences
4. van der Corput sequences

An upper bound: for such sequences, we obtain an upper bound of the discrepancy
$D_n^* \le C\,\frac{(\log n)^d}{n}.$

Remark:
- For small d: deterministic methods.
- For moderate d: quasi-Monte Carlo methods.
- For large d: Monte Carlo methods.
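A minimal sketch of the van der Corput sequence in base b (obtained by reversing the base-b digits of k about the radix point), with a 2-d Halton sequence built by pairing van der Corput sequences in distinct prime bases; the test integrand f(x, y) = xy is our illustrative choice.

    import numpy as np

    def van_der_corput(k, base=2):
        """k-th van der Corput point: reverse the base-b digits of k about the radix point."""
        x, denom = 0.0, 1.0
        while k > 0:
            k, digit = divmod(k, base)
            denom *= base
            x += digit / denom
        return x

    # 2-d Halton points: base 2 in the first coordinate, base 3 in the second
    halton = np.array([[van_der_corput(k, 2), van_der_corput(k, 3)]
                       for k in range(1, 1001)])

    # quasi-Monte Carlo average vs. the exact integral of f(x, y) = x*y over [0, 1]^2
    print((halton[:, 0] * halton[:, 1]).mean(), 0.25)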

Low discrepancy Sequences: illustration

[Figure: Halton points. Figure: (Pseudo-)uniform points.]