Average laws in analysis
Silvius Klein, Norwegian University of Science and Technology (NTNU)
The law of large numbers: informal statement
The theoretical expected value of an experiment is approximated by the average of a large number of independent samples.
theoretical expected value ≈ empirical average
The law of large numbers (LLN)
Let X_1, X_2, ..., X_n, ... be a sequence of jointly independent, identically distributed copies of a scalar random variable X. Assume that X is absolutely integrable, with expectation µ. Define the partial sum process
S_n := X_1 + X_2 + ... + X_n.
Then the average process
S_n / n → µ as n → ∞.
The law of large numbers: formal statements
Let X_1, X_2, ... be a sequence of independent, identically distributed random variables with common expectation µ. Let S_n := X_1 + X_2 + ... + X_n be the corresponding partial sums process. Then:
1 (weak LLN) S_n / n → µ in probability. That is, for every ε > 0,
P { |S_n / n − µ| > ε } → 0 as n → ∞.
2 (strong LLN) S_n / n → µ almost surely.
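The statements above can be illustrated numerically. Here is a minimal sketch (not part of the original slides): we draw i.i.d. samples uniform on [0, 1], so µ = 1/2, and watch the running averages S_n / n settle near µ.

```python
# Monte Carlo illustration of the law of large numbers: the running
# average S_n / n of i.i.d. uniform samples approaches the expectation
# mu = 1/2 as n grows. The seed and sample size are arbitrary choices.
import random

def running_average(samples):
    """Return the sequence of partial averages S_n / n."""
    total = 0.0
    averages = []
    for n, x in enumerate(samples, start=1):
        total += x
        averages.append(total / n)
    return averages

random.seed(0)
samples = [random.random() for _ in range(100_000)]  # X uniform on [0, 1]
avgs = running_average(samples)
print(abs(avgs[-1] - 0.5) < 0.01)  # the final average is close to mu
```

The strong LLN says this convergence happens along almost every sample path, not just in probability.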
It was the best of times, it was the worst of times.
Charles Dickens, A Tale of Two Cities
Application of LLN: the infinite monkey theorem
Let X_1, X_2, ... be i.i.d. random variables drawn uniformly from a finite alphabet. Then almost surely, every finite phrase (i.e. finite string of symbols in the alphabet) appears (infinitely often) in the string X_1 X_2 X_3 ...
A typical realization:
yskpw,qol,all/alkmas;.a ma;;lal;,qwmswl,;q;[; lkle 78623rhbkbads m,q l;, ;f.w, fwe It was the best of times, it was the worst of times. jllkasjllmk,a s.,qjwejhns;.2;oi0ppk;q,qkjkqhjnqnmnmmasi[oqw qqnkm,sa;l;[ml/w/ q
The second Borel–Cantelli lemma
Let E_1, E_2, ..., E_n, ... be a sequence of jointly independent events. If
∑_{n=1}^∞ P(E_n) = ∞,
then almost surely, an infinite number of the E_n hold simultaneously.
This can be deduced from the strong law of large numbers, applied to the random variables X_k := 1_{E_k}.
The actual proof of the infinite monkey theorem
Split every realization of the infinite string of symbols X_1 X_2 X_3 ... X_n ... into consecutive finite strings S_1, S_2, ... of length 52 each. Let E_n be the event that the phrase "It was the best of times, it was the worst of times." is exactly the n-th finite string S_n. These are independent events, and they each have the same probability p > 0 of occurring. Apply the second Borel–Cantelli lemma.
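The block-splitting argument can be sketched on a toy scale. Here is an illustrative simulation with a hypothetical 2-letter alphabet and the 2-character "phrase" 'ab' (chosen so the probability p is large enough to observe): disjoint blocks of the phrase's length are drawn independently, and by the strong LLN the hit frequency tends to p = (1/2)² = 1/4.

```python
# Toy version of the Borel-Cantelli argument: split the random string into
# disjoint blocks of the phrase's length; each block independently equals
# the phrase with probability p = (1/2)^2 = 1/4, so hits occur infinitely
# often and with limiting frequency p.
import random

random.seed(1)
alphabet = "ab"        # hypothetical tiny alphabet (the real slide uses text)
phrase = "ab"          # stands in for the 52-character Dickens phrase
n_blocks = 200_000

hits = 0
for _ in range(n_blocks):
    block = "".join(random.choice(alphabet) for _ in range(len(phrase)))
    if block == phrase:
        hits += 1      # the event E_n occurred in this block

frequency = hits / n_blocks
print(frequency)       # close to p = 0.25
```

With the real 52-character phrase the same argument applies; p is astronomically small but still positive, which is all Borel–Cantelli needs.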
The law of large numbers
We have seen that if X_1, X_2, ..., X_n, ... is a sequence of jointly independent, identically distributed copies of a scalar random variable X, and if we denote the corresponding sum process by
S_n := X_1 + X_2 + ... + X_n,
then the average process
S_n / n → E X as n → ∞.
A rather deterministic system: circle rotations
Let S be the unit circle in the (complex) plane. There is a natural measure λ on S (i.e. the extension of the arc-length). Let 2πα be an angle, and denote by R_α the rotation by 2πα on S. That is, consider the transformation R_α : S → S where, if z = e^{2πi x} ∈ S and if we denote ω := e^{2πi α}, then
R_α(z) = e^{2πi (x+α)} = z ω.
Note that R_α preserves the measure λ.
Iterations of the circle rotation
Let 2πα be an angle. Start with a point z = e^{2πi x} ∈ S and consider successive applications of the rotation map R_α:
R_α^1(z) = R_α(z) = e^{2πi (x+α)}
R_α^2(z) = R_α ∘ R_α(z) = e^{2πi (x+2α)}
...
R_α^n(z) = R_α ∘ ... ∘ R_α(z) = e^{2πi (x+nα)}.
The maps R_α^1, R_α^2, ..., R_α^n, ... are the iterations of R_α. Given a point z ∈ S, the set
{ R_α^1(z), R_α^2(z), ..., R_α^n(z), ... }
is called the orbit of z.
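The iteration formula above translates directly into code. A minimal sketch (the choices of α and x are arbitrary): compute the first few orbit points R_α^n(z) = e^{2πi(x + nα)} and check they lie on the unit circle.

```python
# Computing orbit points of the circle rotation R_alpha via the closed
# formula R_alpha^n(z) = exp(2 pi i (x + n alpha)) for z = exp(2 pi i x).
import cmath

def rotation_orbit(x, alpha, n):
    """Return [R_alpha^1(z), ..., R_alpha^n(z)] for z = e^{2 pi i x}."""
    return [cmath.exp(2j * cmath.pi * (x + k * alpha)) for k in range(1, n + 1)]

alpha = 2 ** 0.5      # an irrational rotation number (sqrt(2))
orbit = rotation_orbit(0.0, alpha, 5)
# Every orbit point lies on the unit circle:
print(all(abs(abs(w) - 1.0) < 1e-9 for w in orbit))
```

Note that each orbit point is the previous one multiplied by ω = e^{2πiα}, matching the formula R_α(z) = z ω.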
An orbit of a circle rotation
Let R_α be the circle rotation by the angle 2πα, where α is an irrational number. Pick a point z on the circle S.
The orbit of z (or rather a finite subset of it).
The orbit of every point is dense on the circle. This transformation satisfies a very weak form of independence called ergodicity.
Observables on the unit circle
Any measurable function f : S → R is called a (scalar) observable of the measure space (S, A, λ). We will assume our observables to be absolutely integrable. A basic example of an observable: f = 1_I, where I is an arc (or any other measurable set) on the circle.
"Observations" of the orbit points of a circle rotation.
Average number of orbit points visiting an arc
Let R_α be the circle rotation by the angle 2πα, where α is an irrational number. Let I be an arc on the circle.
The first n orbit points of a circle rotation and their visits to I.
The average number of visits to I:
#{ j ∈ {1, 2, ..., n} : R_α^j(z) ∈ I } / n
What does this look like for large enough n? Or in other words, is there a limit of these averages as n → ∞?
Average number of orbit points visiting an arc
Let R_α be the circle rotation by the angle 2πα, where α is an irrational number. Let I be an arc on the circle.
The first n orbit points of a circle rotation and their visits to I.
As n → ∞, the average number of visits to I:
#{ j ∈ {1, 2, ..., n} : R_α^j(z) ∈ I } / n → λ(I),
for all points z ∈ S.
Average number of orbit points visiting an arc
Let R_α be the circle rotation by the angle 2πα, where α is an irrational number. Let I be an arc on the circle.
The first n orbit points of a circle rotation and their visits to I.
#{ j ∈ {1, 2, ..., n} : R_α^j(z) ∈ I } = ∑_{j=1}^n 1_I(R_α^j(z)).
Then the average number of visits to I can be written:
( 1_I(R_α^1(z)) + 1_I(R_α^2(z)) + ... + 1_I(R_α^n(z)) ) / n → λ(I) = ∫_S 1_I dλ.
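This convergence is easy to check numerically. A sketch (α = √2 and the arc are arbitrary choices): working with angles in [0, 1), the rotation becomes x ↦ x + α mod 1, and the fraction of orbit points falling in the arc I of normalized length 0.3 approaches λ(I) = 0.3.

```python
# Birkhoff averages for the irrational rotation: the visit frequency of
# the orbit to an arc I converges to the arc's normalized length lambda(I).
# The rotation acts on angle coordinates as x -> (x + alpha) mod 1.
import math

alpha = math.sqrt(2)      # irrational rotation number
arc_length = 0.3          # lambda(I), with total circle measure 1
n = 100_000

x = 0.0                   # starting point z = e^{2 pi i x}
visits = 0
for _ in range(n):
    x = (x + alpha) % 1.0          # one application of R_alpha
    if x < arc_length:             # R_alpha^j(z) lies in the arc I
        visits += 1

print(visits / n)         # approximately lambda(I) = 0.3
```

Unlike the probabilistic LLN, this is a fully deterministic computation, and the slide's claim is that the limit holds for every starting point z.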
Measure preserving dynamical systems
A probability space (X, B, µ) together with a transformation T : X → X define a measure preserving dynamical system if T is measurable and it preserves the measure of any B-measurable set:
µ(T^{-1} A) = µ(A) for all A ∈ B.
Ergodic dynamical system. For any B-measurable set A with µ(A) > 0, the iterations T A, T^2 A, ..., T^n A, ... fill up the whole space X, except possibly for a set of measure zero. Ergodicity leads to some very, very weak form of independence.
Some examples of ergodic dynamical systems
1 The Bernoulli shift, which encodes sequences of independent, identically distributed random variables.
2 The circle rotation by an irrational angle.
3 The doubling map T : [0, 1] → [0, 1], Tx = 2x mod 1.
The pointwise ergodic theorem
Given an ergodic dynamical system (X, B, µ, T) and an absolutely integrable observable f : X → R, define the n-th Birkhoff sum
S_n f(x) := f(Tx) + f(T^2 x) + ... + f(T^n x).
Then as n → ∞,
(1/n) S_n f(x) → ∫_X f dµ for µ-a.e. x ∈ X.
The law of large numbers
We have seen that if X_1, X_2, ..., X_n, ... is a sequence of jointly independent, identically distributed copies of a scalar random variable X, and if we denote the corresponding sum process by
S_n := X_1 + X_2 + ... + X_n,
then as n → ∞,
(1/n) S_n → E X almost surely.
An immediate application of the ergodic theorem
Let (X, B, µ, T) be an ergodic dynamical system. Let x ∈ X, and consider its orbit Tx, T^2 x, ..., T^n x, ...
Equidistribution of orbit points. For any B-measurable set A, the average number of orbit points that visit A converges as n → ∞:
#{ j ∈ {1, 2, ..., n} : T^j x ∈ A } / n → µ(A),
for µ-almost every point x ∈ X.
Proof. Just apply the pointwise ergodic theorem to the observable f = 1_A, and note that the counting of orbit points above equals the n-th Birkhoff sum of this observable.
Another simple application of the ergodic theorem
Consider the decimal representation of every real number x ∈ [0, 1):
x = 0.x_1 x_2 ... x_n ...,
where the digits x_k ∈ {0, 1, 2, ..., 9}. What is the frequency (average occurrence) of each digit in the decimal representation of a "typical" real number x ∈ [0, 1)? For instance,
#{ j ∈ {1, 2, ..., n} : x_j = 7 } / n → ? as n → ∞.
Solution. Consider the dynamical system given by the 10-fold map
T : [0, 1) → [0, 1), Tx = 10x mod 1.
Let f : [0, 1) → R be the observable defined as
f(x) = 1 if x_1 = 7, and f(x) = 0 otherwise.
The Birkhoff average of f along the orbit of x is exactly the frequency of the digit 7, and by the ergodic theorem it converges to ∫ f dλ = 1/10 for Lebesgue-almost every x.
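The conclusion can be seen in a quick simulation. Since the theorem says that for Lebesgue-almost every x the digits behave like i.i.d. uniform draws from {0, ..., 9}, we model a "typical" x by random digits (a sketch; iterating Tx = 10x mod 1 in floating point would lose all precision after a few steps, so the i.i.d. model is the honest numerical stand-in).

```python
# Digit frequency of a "typical" number in [0, 1): for Lebesgue-a.e. x the
# decimal digits are distributed like i.i.d. uniform draws from {0,...,9},
# so the frequency of the digit 7 among the first n digits tends to 1/10.
import random

random.seed(7)
n = 100_000
digits = [random.randrange(10) for _ in range(n)]   # x_1, x_2, ..., x_n
freq = sum(1 for d in digits if d == 7) / n         # Birkhoff average of f
print(freq)  # approximately 1/10
```

Numbers with this property for every digit (and every block of digits) are called normal; the ergodic theorem shows Lebesgue-almost every number is normal in base 10.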
The law of large numbers
We have seen that if X_1, X_2, ..., X_n, ... is a sequence of jointly independent, identically distributed scalar random variables, and if we denote the corresponding sum process by
S_n := X_1 + X_2 + ... + X_n,
then the arithmetic averages (1/n) S_n converge almost surely as n → ∞.
Random matrices and geometric averages
Consider a sequence M_1, M_2, ..., M_n, ... of random matrices. We assume that this sequence is independent and identically distributed. Consider the partial products process:
Π_n = M_n ... M_2 M_1.
Furstenberg–Kesten's theorem. Almost surely, and as n → ∞, the "geometric averages"
(1/n) log ‖Π_n‖
converge to a constant. This constant is called the Lyapunov exponent of the multiplicative process.
If you liked this... then MA3105 Advanced real analysis will cover in depth many of these topics. You are ready to take MA3105 if this picture makes some sense to you.
MA3105 Advanced real analysis main topics
A monkey typing random stuff.
Arnold's cat map.