The Pointwise Ergodic Theorem and its Applications

Peter Oberly
11/9/2018

Introduction

Algebra has homomorphisms and topology has continuous maps; in these notes we explore the structure preserving maps for measure theory, known (somewhat unimaginatively) as measure preserving transformations. The first section contains some (but not all) of the necessary definitions for this talk, and in the second we introduce some classical examples to illustrate these definitions. We then turn our attention to the dynamics of measure preserving maps, which leads us to the pointwise ergodic theorem. In the final section we use the ergodic theorem to prove Borel's theorem on normal numbers.

Definitions

Definition. A σ-algebra $\mathcal{A}$ is a collection of subsets of a non-empty set $X$ such that $X \in \mathcal{A}$ and $\mathcal{A}$ is closed under complementation and countable unions. The pair $(X, \mathcal{A})$ is called a measurable space, and elements of $\mathcal{A}$ are called measurable sets. A particularly important σ-algebra is the collection of Borel sets, defined to be the σ-algebra generated by the open subsets of a topological space $X$.

Definition. A measure $m : \mathcal{A} \to [0, \infty]$ is a function which satisfies the following:
1. $m(E) \ge 0$ for all $E \in \mathcal{A}$;
2. $m(\emptyset) = 0$;
3. If $\{E_n\}_{n=1}^{\infty} \subseteq \mathcal{A}$ is a sequence of pairwise disjoint sets in $\mathcal{A}$, then $m\left(\bigcup_{n=1}^{\infty} E_n\right) = \sum_{n=1}^{\infty} m(E_n)$.

A measure space is a triple $(X, \mathcal{A}, m)$ where $(X, \mathcal{A})$ is a measurable space and $m$ is a measure defined on $\mathcal{A}$. The triple $(X, \mathcal{A}, m)$ is called a probability space if $m(X) = 1$.

Definition. Let $(X, \mathcal{A}, m)$ and $(Y, \mathcal{B}, \mu)$ be measure spaces, and let $T : X \to Y$ be a map from $X$ into $Y$. $T$ is said to be measurable if $T^{-1}(E) \in \mathcal{A}$ for each $E \in \mathcal{B}$; that is, if the pre-image of every measurable set is measurable.

Definition. A measurable transformation $T : (X, \mathcal{A}, m) \to (Y, \mathcal{B}, \mu)$ is said to be measure preserving if $m(T^{-1}(E)) = \mu(E)$ for all $E \in \mathcal{B}$. If $T$ is a bijection and $T^{-1}$ is also measure preserving, then $T$ is said to be invertible. If $(X, \mathcal{A}, m)$ is a probability space, and if $T : X \to X$ is measure preserving, then the quadruple $(X, \mathcal{A}, m, T)$ is sometimes referred to as a measurable dynamical system.
Remarks:

(1) We should really write $T : (X, \mathcal{A}, m) \to (Y, \mathcal{B}, \mu)$ since the measure preserving property depends on both the σ-algebras and the measures, but we will often write $T : X \to Y$ instead.

(2) If $T : (X, \mathcal{A}, m) \to (Y, \mathcal{B}, \mu)$ and $S : (Y, \mathcal{B}, \mu) \to (Z, \mathcal{C}, p)$ are measure preserving, then so is $S \circ T$.

(3) Measure preserving maps are the structure preserving transformations (morphisms) of measure spaces.

(4) As such, a measure preserving map $T : X \to X$ induces a morphism on the Banach space of $m$-integrable functions $L^1(m)$. In detail, let $U_T : L^1(m) \to L^1(m)$ be defined by $U_T(f) = f \circ T$. It is evident that $U_T$ is linear, and if $f \ge 0$ (and so is real valued), then $(U_T f)(x) = f(T(x)) \ge 0$ for $x \in X$. So $U_T$ is positive. In fact, $U_T$ is an isometry. For if $s$ is a non-negative simple function $s = \sum_{k=1}^{n} a_k \chi_{A_k}$, where the $a_k > 0$ are scalars and the $A_k$ are the measurable sets on which $s$ takes the value $a_k$, then
$$\int U_T(s)\,dm = \sum_{k=1}^{n} a_k \int \chi_{A_k} \circ T\,dm = \sum_{k=1}^{n} a_k\, m(T^{-1}(A_k)) = \sum_{k=1}^{n} a_k\, m(A_k) = \int s\,dm.$$
Therefore choosing a sequence of simple functions $s_n$ which converges monotonically to $|f|$, where $f \in L^1(m)$, shows $\|U_T(f)\|_1 = \|f\|_1$. Note also that this shows $U_T$ really does map into $L^1(m)$.

(5) As we are interested in the dynamics of measure preserving maps, from now on we will restrict our attention to measurable functions $T : X \to X$. Additionally, unless otherwise stated, we will assume that $(X, \mathcal{A}, m)$ is a probability space.

Our last definition requires a bit of motivation. Let $(X, \mathcal{A}, m, T)$ be a measurable dynamical system. If $T^{-1}(E) = E$ for some $E \in \mathcal{A}$, then $T^{-1}(X \setminus E) = X \setminus E$ and we could study our system by examining the two simpler systems $(E, \mathcal{A} \cap E, m|_{\mathcal{A} \cap E}, T|_E)$ and $(X \setminus E, \mathcal{A} \cap (X \setminus E), m|_{\mathcal{A} \cap (X \setminus E)}, T|_{X \setminus E})$ (with the corresponding measures normalized appropriately). If $0 < m(E) < 1$, then we have actually decomposed our original system into two smaller ones. However, if $m(E) = 0$ or $m(X \setminus E) = 0$ (i.e. $m(E) = 1$), then one of our simpler systems is in fact trivial, and we are left with a system essentially the same as the one we started with. It follows that those measurable dynamical systems where $T^{-1}(E) = E$ implies $m(E) = 0$ or $1$ are not usefully decomposable in this way.
It makes sense therefore to study those systems where such decomposition is not possible, for understanding these will enable us to understand the ones which can be simplified. We call such systems ergodic.

Definition. A measurable dynamical system $(X, \mathcal{A}, m, T)$ is said to be ergodic if $E \in \mathcal{A}$ and $T^{-1}(E) = E$ implies that $m(E) = 0$ or $1$. We will often have a specific probability space $(X, \mathcal{A}, m)$ in mind and refer to the measure preserving transformation $T$ as ergodic. There are many characterizations of ergodicity; one which will prove useful in this talk is the following.

Theorem 1. $(X, \mathcal{A}, m, T)$ is ergodic if and only if $f \in L^1(m)$ and $f \circ T = f$ a.e. implies that $f$ is constant a.e.

Proof. Assume that for all $f \in L^1(m)$, if $f \circ T = f$ a.e. then $f$ is constant a.e. Let $E \in \mathcal{A}$ be such that $T^{-1}(E) = E$. Then $\chi_E \circ T = \chi_{T^{-1}(E)} = \chi_E$. As $\chi_E \in L^1(m)$, $\chi_E$ is constant a.e. Therefore $\chi_E$ is either $0$ or $1$ a.e. and so $m(E) = 0$ or $1$. The converse is more technical, and can be found in McDonald and Weiss on page 616.
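Before the classical examples, the definition can be tried out on a finite toy case that is not part of the original notes. Take a permutation of $\{0, \dots, N-1\}$ with the uniform probability measure. Every permutation preserves this measure, its invariant sets are exactly the unions of its cycles, and so the system is ergodic precisely when the permutation consists of a single cycle. The following is a minimal sketch under those assumptions; the helper names `cycles`, `is_ergodic` and `rotation` are made up for illustration.

```python
def cycles(perm):
    """Return the cycles of a permutation given as a list with perm[i] = T(i)."""
    seen = set()
    result = []
    for start in range(len(perm)):
        if start in seen:
            continue
        cycle = []
        i = start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm[i]
        result.append(cycle)
    return result


def is_ergodic(perm):
    """Uniform measure on a finite set: ergodic iff there is exactly one cycle."""
    return len(cycles(perm)) == 1


def rotation(n, step):
    """The map i -> (i + step) mod n on {0, ..., n-1} (a discrete rotation)."""
    return [(i + step) % n for i in range(n)]


print(is_ergodic(rotation(6, 1)))  # a single cycle of length 6
print(cycles(rotation(6, 2)))      # two invariant sets: evens and odds
```

Rotation by 2 on six points splits into the even and the odd residues. Each is invariant with measure 1/2, so this is a finite instance of a system that decomposes into two smaller ones.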

Examples

(1) Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear map and let $m$ be the Lebesgue measure on the Borel sets of $\mathbb{R}^n$. If $T$ is singular, then the range of $T$ is a proper subspace of $\mathbb{R}^n$, which has measure zero while its pre-image is all of $\mathbb{R}^n$, and so $T$ is not measure preserving. If instead $T$ is non-singular, then from linear algebra $m(T^{-1}(E)) = m(E)/|\det T|$ for all Borel sets $E$. Therefore $T$ is a measure preserving linear map if and only if $|\det T| = 1$.

(2) Let $S^1 = \{z \in \mathbb{C} : |z| = 1\}$ and let $\mathcal{B}$ denote the Borel σ-algebra. Then with normalized circular Lebesgue measure $m$, the triple $(S^1, \mathcal{B}, m)$ is a probability space. For $a \in S^1$, define the rotation $T_a : S^1 \to S^1$ by $T_a(z) = az$. Then $T_a$ is measure preserving and invertible for all $a$. It is very instructive to show the following.

Theorem 2. The rotation $T = T_a$ is ergodic if and only if $a$ is not a root of unity.

Proof. Suppose that $a$ is a root of unity. Then $a^p = 1$ for some $p \ne 0$. Let $f : S^1 \to S^1$ be defined by $f(z) = z^p$. Then $(f \circ T)(z) = f(az) = a^p z^p = f(z)$ for all $z \in S^1$. Therefore $f \circ T = f$ but $f$ is non-constant. So $T$ is not ergodic by Theorem 1.

Conversely let $A$ be a measurable subset of $S^1$ so that $T^{-1}(A) = A$. Notice that the functions $e_n : S^1 \to S^1$ defined by $e_n(z) = z^n$, $n \in \mathbb{Z}$, form an orthonormal basis for $L^2(m)$. Let the Fourier series for $\chi_A$ be $\chi_A \sim \sum_n b_n e_n$. Since $e_n(Tz) = a^n e_n(z)$, it follows by a change of variable that
$$b_n = \int \chi_A\, \overline{e_n}\,dm = a^{-n} \int_{T^{-1}(A)} \overline{e_n}\,dm,$$
and so $\chi_{T^{-1}(A)} \sim \sum_n a^n b_n e_n$. As $T^{-1}(A) = A$, we have $\chi_A = \chi_{T^{-1}(A)}$ and thus they have the same Fourier coefficients. Therefore $b_n = a^n b_n$ for all $n$. If $a$ is not a root of unity, the only way this can hold is if $b_n = 0$ for all $n \ne 0$. By the uniqueness of Fourier coefficients, $\chi_A$ is constant almost everywhere and so $m(A) = 0$ or $1$. Therefore $T_a$ is ergodic when $a$ is not a root of unity.

(3) Let $([0, 1), \mathcal{B}, m)$ be the probability space consisting of the half open unit interval with Borel sets $\mathcal{B}$ and $m$ the Lebesgue measure. Define $T : [0, 1) \to [0, 1)$ by
$$T(x) = 2x \bmod 1 = \begin{cases} 2x, & \text{if } 0 \le x < 1/2; \\ 2x - 1, & \text{if } 1/2 \le x < 1. \end{cases}$$
This map is referred to as the dyadic transformation.
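The definition of the dyadic transformation translates directly into code. The sketch below is an illustration rather than part of the original notes; the names `dyadic` and `orbit` are hypothetical. It iterates the map in exact rational arithmetic and then, as a caution for numerical experiments, in binary floating point. Every double in $[0, 1)$ is a dyadic rational, so its floating-point orbit is not typical: it lands on 0 after finitely many steps.

```python
from fractions import Fraction


def dyadic(x):
    """The dyadic transformation T(x) = 2x mod 1 on [0, 1)."""
    y = 2 * x
    return y - 1 if y >= 1 else y


def orbit(x, n):
    """Return the list [x, T(x), ..., T^(n-1)(x)]."""
    points = []
    for _ in range(n):
        points.append(x)
        x = dyadic(x)
    return points


# Exact arithmetic: the orbit of 1/3 is periodic with period 2.
print(orbit(Fraction(1, 3), 4))

# Binary floats: doubling and subtracting 1 are exact operations here, so the
# orbit of the double nearest to 0.1 reaches 0 and stays there.
x = 0.1
for _ in range(60):
    x = dyadic(x)
print(x)
```

For experiments with time averages of this map, exact rationals with large denominators (or explicit digit sequences) are therefore a safer choice than floats.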
Notice that if $x$ has binary expansion $x = 0.x_1 x_2 x_3 \ldots_{(2)}$ then $T(x) = 0.x_2 x_3 \ldots_{(2)}$. It is worth showing that $T$ is measure preserving. From measure theory [Billingsley, p 4], it suffices to prove that $T$ preserves measure on a semi-algebra which generates the Borel σ-algebra. The collection of half open intervals with rational dyadic endpoints is such a semi-algebra. So let $E = \left[\frac{k}{2^n}, \frac{j}{2^n}\right)$ where $n \ge 0$ and
$0 \le k < j \le 2^n$. Then
$$\begin{aligned} T^{-1}(E) &= \left\{x \in [0, 1/2) : \tfrac{k}{2^n} \le 2x < \tfrac{j}{2^n}\right\} \cup \left\{x \in [1/2, 1) : \tfrac{k}{2^n} \le 2x - 1 < \tfrac{j}{2^n}\right\} \\ &= \left[\tfrac{k}{2^{n+1}}, \tfrac{j}{2^{n+1}}\right) \cup \left[\tfrac12 + \tfrac{k}{2^{n+1}}, \tfrac12 + \tfrac{j}{2^{n+1}}\right) \\ &= \tfrac12 E \cup \left(\tfrac12 + \tfrac12 E\right) \end{aligned}$$
and the translation invariance of the Lebesgue measure implies
$$m(T^{-1}(E)) = \tfrac12 m(E) + \tfrac12 m(E) = m(E).$$
So $T$ is measure preserving.

We sketch the proof that $T$ is in fact ergodic. Let $A$ be a measurable subset of $[0, 1)$ with $T^{-1}(A) = A$. Let $x = 0.0x_2 x_3 \ldots_{(2)}$ and $x' = 0.1x_2 x_3 \ldots_{(2)}$ and assume that these are unique expansions. Then $T(x) = T(x') = 0.x_2 x_3 \ldots_{(2)}$. Now $x \in A$ is equivalent to $Tx \in A$ and similarly $x' \in A$ exactly when $Tx' \in A$. So $T(x) = T(x')$ implies $x \in A$ if and only if $x' \in A$. It follows that, up to a countable set, $A \cap [1/2, 1) = 1/2 + A \cap [0, 1/2)$. So $m(A \cap [0, 1/2)) = m(A \cap [1/2, 1))$, and hence
$$m(A) = m(A \cap [0, 1/2)) + m(A \cap [1/2, 1)) = 2m(A \cap [0, 1/2)) = \frac{m(A \cap [0, 1/2))}{m([0, 1/2))}.$$
Thus $m(A)\,m([0, 1/2)) = m(A \cap [0, 1/2))$. This argument can be elaborated to show that the same is true of any half open interval with rational dyadic endpoints, or any disjoint union of such intervals. Now given $\epsilon > 0$, choose such a disjoint union $E$ so that $m(A \,\triangle\, E) < \epsilon$, where $\triangle$ denotes the symmetric difference (which we can do as $A$ is measurable and the half open dyadic intervals generate the Borel sets). Then $|m(A) - m(E)| < \epsilon$ and $|m(A) - m(A \cap E)| = |m(A) - m(A)m(E)| < \epsilon$. Hence $|m(A) - m(A)^2| < 2\epsilon$ and, as $\epsilon$ is arbitrary, $m(A) = m(A)^2$. So $m(A) = 0$ or $1$ and $T$ is ergodic.

The Ergodic Theorem

To motivate the pointwise ergodic theorem, we first show that all measure preserving transformations on a finite measure space enjoy the property of recurrence:

Theorem 3 (The Poincaré Recurrence Theorem). Let $T : X \to X$ be a measure preserving transformation of a probability space $(X, \mathcal{A}, m)$. Let $E \in \mathcal{A}$ with $m(E) > 0$. Then almost all points of $E$ return to $E$ infinitely often under iteration by $T$; that is, for almost all $x \in E$, $T^n(x) \in E$ for infinitely many $n$.

Proof. Given $N \ge 0$, set $E_N = \bigcup_{n=N}^{\infty} T^{-n}(E)$ and set $F = E \cap \bigcap_{N=0}^{\infty} E_N$. Then $x \in F$ if and only if $x \in E$ and for all $N \ge 0$, there is an $n \ge N$ so that $T^n(x) \in E$. So $F$ is the set
of points of $E$ which return to $E$ infinitely often under iteration by $T$. Note that if $x \in F$, then there is a subsequence $n_1 < n_2 < \ldots < n_j < \ldots$ of natural numbers so that $T^{n_j}(x) \in E$ for all $j$; therefore for each $i$ we have $T^{n_i}(x) \in F$, since $T^{n_j - n_i}(T^{n_i}(x)) = T^{n_j}(x) \in E$ for all $j > i$. Thus every point of $F$ returns to $F$ infinitely often under iteration by $T$. It remains to show that $m(F) = m(E)$. Note that $T^{-1}(E_N) = \bigcup_{n=N}^{\infty} T^{-(n+1)}(E) = E_{N+1}$ and so $m(E_N) = m(E_{N+1})$ for all $N$. Therefore $m(E_N) = m(E_0)$ for all $N$, and since $E_0 \supseteq E_1 \supseteq \ldots$ and $m$ is finite, $m\left(\bigcap_{N=0}^{\infty} E_N\right) = m(E_0)$. Therefore $m(F) = m(E \cap E_0) = m(E)$ as $E \subseteq E_0$.

This raises the natural question: how often, or with what frequency, do the iterates $T^n(x)$ return to a set? There is a very big difference between $T^{2n}(x) \in E$ and $T^{n!}(x) \in E$ for all $n$ (and almost all $x \in E$) even though both return to $E$ infinitely often. It makes sense then to consider the long term behavior of the average number of times $T^n(x)$ returns to $E$; that is, to consider the limit of the ratios
$$\frac{1}{n} \sum_{k=0}^{n-1} \chi_E(T^k(x))$$
as $n \to \infty$. It is not obvious in what sense, if any at all, this limit exists. It is also quite restrictive to consider just characteristic functions; in a wide variety of applications both in theoretical math and the sciences, it is impossible to calculate or observe the orbit of a point directly. Instead, we rely on numerical data. We are therefore led to consider the convergence of the ratios
$$\frac{1}{n} \sum_{k=0}^{n-1} (f \circ T^k)(x)$$
where $f : X \to \mathbb{C}$ is now a measurable function. It is even less clear in what sense this limit may exist, or what restrictions we may require to ensure convergence. Birkhoff's celebrated pointwise ergodic theorem provides an answer to these questions.

Theorem 4 (Birkhoff's Pointwise Ergodic Theorem). Let $(X, \mathcal{A}, m)$ be a (possibly σ-finite) measure space and let $T : X \to X$ be measure preserving. If $f \in L^1(m)$, then the averages
$$\frac{1}{n} \sum_{k=0}^{n-1} (f \circ T^k)$$
converge pointwise almost everywhere as $n \to \infty$ to a function $f^* \in L^1(m)$. Furthermore, $f^* \circ T = f^*$ a.e. ($f^*$ is invariant), and if $m(X) < \infty$ then $\int f^*\,dm = \int f\,dm$.

Remark. If $T$ is also ergodic, then $f^*$ is constant a.e. by Theorem 1. So if $m(X) < \infty$, then $\int f^*\,dm = f^*\, m(X) = \int f\,dm$ and thus $f^* = \frac{1}{m(X)} \int f\,dm$ a.e. In particular, if $T$ is ergodic and $(X, \mathcal{A}, m)$ is a probability space then
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} (f \circ T^k)(x) = \int f\,dm$$
for almost all $x \in X$ and all $f \in L^1(m)$. This is the form of the ergodic theorem that may be the most familiar: the time average tends to the space average for almost every point. This answers our question on the asymptotic frequency with which the orbit of a point $x$ lies in a given measurable set $E$. For if $T$ is ergodic, then
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \chi_E(T^k(x)) = m(E)$$
for almost every $x$ in the probability space $X$.

We will only outline the proof. A detailed exposition can be found in Walters, Halmos, or Billingsley. The form of this proof is from Walters.

(1) The first step is to prove the maximal ergodic theorem, or rather the following corollary of it. The maximal ergodic theorem, along with the convergence theorems of Lebesgue theory, is what drives the proof of the pointwise ergodic theorem.

Theorem 5 (Maximal Ergodic Theorem). Let $(X, \mathcal{A}, m)$ be a finite measure space and $T : X \to X$ be measure preserving. If $f$ is real-valued and integrable, then $\int_A f\,dm \ge 0$, where
$$A = \left\{x \in X : \sup_{n \ge 1} \frac{1}{n} \sum_{k=0}^{n-1} f(T^k(x)) > 0\right\}.$$

Proof. As noted earlier, the map $U_T : L^1_{\mathbb{R}}(m) \to L^1_{\mathbb{R}}(m)$ defined by $U_T(f) = f \circ T$ is a positive linear isometry. Let $f_0 = 0$ and $f_n = f + U_T f + \ldots + U_T^{n-1} f$ for $n \ge 1$. Set $F_N = \max_{0 \le n \le N} f_n$ and note that $F_N \ge 0$ for all $N \in \mathbb{N}$. Also observe that $F_N$ is integrable since $f$ is. We have $F_N \ge f_n$ for $0 \le n \le N$, and so $U_T(F_N) \ge U_T(f_n)$ by positivity. Hence $U_T(F_N) + f \ge f_{n+1}$, and therefore $U_T(F_N) + f \ge \max_{1 \le n \le N} f_n$. Thus if $x \in X$ and $F_N(x) > 0$, then (since $f_0 = 0$)
$$(U_T F_N)(x) + f(x) \ge \max_{0 \le n \le N} f_n(x) = F_N(x).$$
So $f \ge F_N - U_T F_N$ on $A_N = \{x \in X : F_N(x) > 0\}$. As $F_N(x) = 0$ on $X \setminus A_N$, then
$$\begin{aligned} \int_{A_N} f\,dm &\ge \int_{A_N} F_N\,dm - \int_{A_N} U_T(F_N)\,dm \\ &= \int_X F_N\,dm - \int_{A_N} U_T(F_N)\,dm \\ &\ge \int_X F_N\,dm - \int_X U_T(F_N)\,dm \\ &= \|F_N\|_1 - \|U_T(F_N)\|_1 = 0, \end{aligned}$$
where we have used the fact that $\int_{A_N} U_T(F_N)\,dm \le \int_X U_T(F_N)\,dm$ and that $U_T$ is an isometry of $L^1(m)$. Given $x \in X$, we see that $\sup_{n \ge 1} \frac{1}{n} \sum_{k=0}^{n-1} U_T^k(f)(x) > 0$ if and only if there is an $N$ so that $\max_{0 \le n \le N} f_n(x) = F_N(x) > 0$; hence $A = \bigcup_{N=0}^{\infty} A_N$. As $F_N \le F_{N+1}$, we have $A_N \subseteq A_{N+1}$, and so applying the dominated convergence theorem to $f \chi_{A_N}$ (which is dominated by $|f|$) yields the desired claim.

(2) We make some simplifying assumptions and introduce notation. Assume first that $m(X) < \infty$ and that $f$ is real valued. Given $x \in X$, define
$$a_n(x) = \frac{1}{n} \sum_{k=0}^{n-1} f(T^k(x)),$$
and
$$f^*(x) = \limsup_{n \to \infty} a_n(x), \qquad f_*(x) = \liminf_{n \to \infty} a_n(x).$$
As $a_n$ is measurable for all $n$, so are $f^*$ and $f_*$. Notice that
$$a_n(Tx) = \left(\frac{n+1}{n}\right) a_{n+1}(x) - \frac{f(x)}{n}$$
for all $n$. Since $f \in L^1(m)$, we can assume that $|f(x)| < \infty$ by redefining $f$ on a set of measure zero if necessary. Therefore $f(x)/n \to 0$ as $n \to \infty$ and so
$$f^*(T(x)) = \limsup_{n \to \infty} a_n(Tx) = \limsup_{n \to \infty} \left(\frac{n+1}{n}\, a_{n+1}(x) - \frac{f(x)}{n}\right) = \limsup_{n \to \infty} a_{n+1}(x) = f^*(x).$$
A similar argument shows that $f_* \circ T = f_*$ a.e.

(3) We show that $f^* = f_*$ a.e.; that is, that the set $E = \{x \in X : f_*(x) < f^*(x)\}$ has measure zero. For real numbers $a$ and $b$ with $a < b$, let
$$E(a, b) = \{x \in X : f_*(x) < a < b < f^*(x)\}.$$
Then $E = \bigcup \{E(a, b) : a, b \in \mathbb{Q},\ a < b\}$, so we show $m(E(a, b)) = 0$. As $f^*$ and $f_*$ are measurable, so is $E(a, b)$ and therefore so is $E$. As $f^* \circ T = f^*$ and $f_* \circ T = f_*$ a.e., then
$$T^{-1}(E(a, b)) = \{x \in X : f_*(Tx) < a < b < f^*(Tx)\} = E(a, b).$$
It is here that we need to use the maximal ergodic theorem.

(4) Notice that
$$E(a, b) \cap \left\{x \in X : \sup_{n \ge 1} \frac{1}{n} \sum_{k=0}^{n-1} f(T^k(x)) > b\right\} = E(a, b).$$
So apply the maximal ergodic theorem, on the invariant set $E(a, b)$, to the function $f - b$ to conclude
$$\int_{E(a,b)} (f - b)\,dm \ge 0, \quad \text{so} \quad \int_{E(a,b)} f\,dm \ge b\,m(E(a, b)),$$
and similarly
$$\int_{E(a,b)} (a - f)\,dm \ge 0, \quad \text{so} \quad \int_{E(a,b)} f\,dm \le a\,m(E(a, b)).$$
Therefore $a\,m(E(a, b)) \ge b\,m(E(a, b))$; since $b > a$, this can be true only if $m(E(a, b)) = 0$. Hence $f^* = f_*$ a.e.

(5) To show that $f^*$ is integrable, note that
$$\int |a_n|\,dm \le \frac{1}{n} \sum_{k=0}^{n-1} \int |f \circ T^k|\,dm = \int |f(x)|\,dm < \infty,$$
where we have used a change of variables and the fact that $m$ is $T$-invariant. Fatou's lemma then implies
$$\int |f^*|\,dm = \int \liminf_{n \to \infty} |a_n|\,dm \le \liminf_{n \to \infty} \int |a_n|\,dm \le \int |f|\,dm < \infty.$$
So $f^* \in L^1(m)$.

(6) The last part is to show that $\int f\,dm = \int f^*\,dm$. Notice that
$$\int a_n\,dm = \frac{1}{n} \sum_{k=0}^{n-1} \int f \circ T^k\,dm = \int f\,dm$$
by changing variables and since $T$ preserves measure. Therefore if we show that the interchange of limit and integral
$$\int f^*\,dm = \int \lim_{n \to \infty} a_n\,dm = \lim_{n \to \infty} \int a_n\,dm = \int f\,dm$$
is valid, then the proof for the case $m(X) < \infty$ is complete. This is accomplished by another application of the maximal ergodic theorem and the dominated convergence theorem.

(7) For the case when $X$ is σ-finite, the above will work so long as $m(E(a, b)) < \infty$ so that we can apply the maximal ergodic theorem. This is done by choosing a subset $C \subseteq E(a, b)$ with finite measure (which exists by σ-finiteness) and applying the maximal ergodic theorem to the function $f - b\chi_C$ to conclude (after a few more steps) that $\int |f|\,dm \ge b\,m(C)$. Therefore if $C \subseteq E(a, b)$ has $m(C) < \infty$, then $m(C) \le \frac{1}{b} \int |f|\,dm$; it follows from σ-finiteness that $m(E(a, b)) < \infty$ as well.

Consequences of the Ergodic Theorem

A real number $x$ is normal to base $r$ if the expansion of $x$ in base $r$ contains each digit in the same proportion.

Theorem 6 (Borel's Theorem on Normal Numbers). Almost all numbers in $[0, 1)$ are normal to base $r$ for all integers $r \ge 2$; i.e. for almost all $x \in [0, 1)$, each of the digits $0, 1, 2, \ldots, r - 1$ occurs in the base $r$ expansion of $x$ with the same frequency $1/r$.
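The statement can be explored numerically; what follows is an illustration and not a proof. The sketch iterates the map $x \mapsto rx \bmod 1$ (the base-$r$ analogue of the dyadic transformation) in exact rational arithmetic and counts how often the orbit visits $[k/r, (k+1)/r)$, which is the same as counting occurrences of the digit $k$. The function name `digit_frequencies`, the random seed, and the sample size are choices made for this example, and the test point is an assumed stand-in for a "typical" number: its first 5000 decimal digits are drawn uniformly at random.

```python
import random
from fractions import Fraction


def digit_frequencies(x, r, n):
    """Frequencies of the digits 0..r-1 among the first n base-r digits of x.

    The j-th digit is read off as floor(r * T^j(x)) with T(x) = r*x mod 1,
    i.e. as the index k of the interval [k/r, (k+1)/r) containing T^j(x).
    """
    counts = [0] * r
    for _ in range(n):
        y = r * x
        k = int(y)      # y lies in [0, r), so k is a digit in 0..r-1
        counts[k] += 1
        x = y - k       # apply T once
    return [Fraction(c, n) for c in counts]


# Build a point of [0, 1) from n random base-r digits (exact rational).
random.seed(0)
r, n = 10, 5000
numerator = 0
for _ in range(n):
    numerator = numerator * r + random.randrange(r)
x = Fraction(numerator, r ** n)

freqs = digit_frequencies(x, r, n)
print([float(f) for f in freqs])  # each entry should be close to 1/r
```

As a sanity check on the digit extraction, `digit_frequencies(Fraction(1, 7), 10, 6)` reads the digits 1, 4, 2, 8, 5, 7 of $1/7 = 0.142857\ldots$, so exactly those six digits receive frequency 1/6. Rational numbers such as this one have eventually periodic expansions and belong to the exceptional null set.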

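Proof. Let $r \ge 2$ be an integer and define the $r$-adic transformation $T : [0, 1) \to [0, 1)$ by
$$T(x) = rx \bmod 1 = \begin{cases} rx, & 0 \le x < \frac{1}{r}; \\ rx - 1, & \frac{1}{r} \le x < \frac{2}{r}; \\ \ \vdots \\ rx - (r - 1), & \frac{r-1}{r} \le x < 1. \end{cases}$$
Just as for the dyadic transformation ($r = 2$), $T$ is ergodic on $[0, 1)$ with respect to the Lebesgue measure and Borel σ-algebra. Let $X$ denote the set of points of $[0, 1)$ which have a unique base $r$ expansion. Then $[0, 1) \setminus X$ is countable so $m(X) = 1$. Let $x \in X$ and write $x$ uniquely as $x = 0.x_1 x_2 x_3 \ldots_{(r)}$. Then $T(x) = T(0.x_1 x_2 \ldots) = 0.x_2 x_3 x_4 \ldots_{(r)}$, and so $T^j(x) = 0.x_{j+1} x_{j+2} \ldots_{(r)}$ where $j \ge 0$. For ease of writing, let $f$ denote the characteristic function $f = \chi_{\left[\frac{k}{r}, \frac{k+1}{r}\right)}$, where $0 \le k < r$ is an integer. Then
$$f(T^j(x)) = f(0.x_{j+1} x_{j+2} \ldots) = \begin{cases} 1, & \text{if } x_{j+1} = k; \\ 0, & \text{else.} \end{cases}$$
Therefore the number of times $k$ appears in the first $n$ digits of the $r$-adic expansion of $x$ is $\sum_{j=0}^{n-1} f(T^j(x))$. Dividing by $n$ and applying the ergodic theorem gives
$$\frac{1}{n} \sum_{j=0}^{n-1} f(T^j(x)) \to \int_{[0,1)} f\,dm = m\left(\left[\tfrac{k}{r}, \tfrac{k+1}{r}\right)\right) = \frac{1}{r}$$
for almost all $x$. Hence the frequency with which $k \in \{0, 1, \ldots, r - 1\}$ appears in the $r$-adic expansion of almost all numbers in $[0, 1)$ is $1/r$. Since there are only countably many pairs $(r, k)$, the union of the exceptional null sets is still null, so almost every $x$ is normal to every base simultaneously.

The pointwise ergodic theorem gives the following nice characterization of ergodicity.

Theorem 7. A measurable dynamical system $(X, \mathcal{A}, m, T)$ is ergodic if and only if for all $A, B \in \mathcal{A}$
$$\frac{1}{n} \sum_{k=0}^{n-1} m(T^{-k}(A) \cap B) \to m(A)\,m(B).$$

Proof. Suppose that $T$ is ergodic. Applying the ergodic theorem to $\chi_A$ shows that
$$\frac{1}{n} \sum_{k=0}^{n-1} \chi_A(T^k x) \to m(A) \quad \text{a.e.}$$
Multiplying by $\chi_B$ gives
$$\frac{1}{n} \sum_{k=0}^{n-1} \chi_A(T^k x)\,\chi_B(x) \to m(A)\,\chi_B(x) \quad \text{a.e.},$$
and so, since $\chi_A(T^k x)\,\chi_B(x) = \chi_{T^{-k}(A) \cap B}(x)$ and the averages are bounded by $1$, the dominated convergence theorem implies
$$\frac{1}{n} \sum_{k=0}^{n-1} m(T^{-k}(A) \cap B) \to m(A)\,m(B).$$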

Conversely, suppose the convergence property holds. Suppose that $E \in \mathcal{A}$ with $T^{-1}(E) = E$. Set $A = B = E$. Since $T^{-k}(E) \cap E = E$ for every $k$, the assumption gives
$$\frac{1}{n} \sum_{k=0}^{n-1} m(E) \to m(E)^2.$$
Since $\frac{1}{n} \sum_{k=0}^{n-1} m(E) = m(E)$ for all $n$, we get $m(E) = m(E)^2$ and so $m(E) = 0$ or $1$.

This theorem provides a physical aid for understanding ergodic transformations; they are the maps which stir our space enough so that every measurable set will intersect every other measurable set in proportion to their relative size.

References

1. John McDonald, Neil Weiss. A Course in Real Analysis, 2nd edition, Academic Press, 2012, chapter 16.
2. Paul R. Halmos. Lectures on Ergodic Theory, Martino Publishing, 2013.
3. Peter Walters. An Introduction to Ergodic Theory, Springer-Verlag New York Inc., 1982.
4. Patrick Billingsley. Ergodic Theory and Information, John Wiley and Sons Inc., 1965.