Lecture 4. Entropy and Markov Chains

preliminary version: Not for diffusion

The most important numerical invariant related to the orbit growth in topological dynamical systems is topological entropy.¹ It represents the exponential growth rate of the number of orbit segments which are distinguishable with an arbitrarily high but finite precision. Topological entropy is invariant under topological conjugacy. For measurable dynamical systems, an entropy can be defined using the invariant measure; it gives an indication of the amount of randomness or complexity of the system. The relation between measure-theoretic entropy and topological entropy is given by a variational principle.

4.1 Topological Entropy

We will follow the definition given by Rufus Bowen in [Bo, Chapter 4]. Let $X$ be a compact metric space and $f : X \to X$ a continuous map.

Definition 4.1 Let $S \subset X$, $n \in \mathbb{N}$ and $\varepsilon > 0$. $S$ is an $(n, \varepsilon)$-spanning set if for every $x \in X$ there exists $y \in S$ such that $d(f^j(x), f^j(y)) \le \varepsilon$ for all $0 \le j \le n$.

It is immediate to check that the compactness of $X$ implies the existence of finite spanning sets. Let $r(n, \varepsilon)$ be the least number of points in an $(n, \varepsilon)$-spanning set. If we bound the time of observation of our dynamical system by $n$ and our precision in making measurements is $\varepsilon$, we will see at most $r(n, \varepsilon)$ orbits.

Exercise 4.2 Show that if $X$ admits a cover by $m$ sets of diameter at most $\varepsilon$ then $r(n, \varepsilon) \le m^{n+1}$.

Definition 4.3 The topological entropy $h_{\mathrm{top}}(f)$ of $f$ is given by

$$h_{\mathrm{top}}(f) = \lim_{\varepsilon \to 0} \limsup_{n \to +\infty} \frac{1}{n} \log r(n, \varepsilon). \qquad [4.1]$$

In the previous definition one cannot replace $\limsup$ with $\lim$, since there exist examples of maps for which the limit does not exist. However, one can replace it with $\liminf$ and still obtain the topological entropy (see [Mn1], Proposition 7.1, p. 237).

Exercise 4.4 Show that the topological entropy of any diffeomorphism of a compact manifold is always finite.

Exercise 4.5 Let $X = \{x \in \ell^2(\mathbb{N}),\ |x_i| < 2^{-i} \text{ for all } i \in \mathbb{N}\}$, $f((x_i)_{i \in \mathbb{N}}) = (2x_{i+1})_{i \in \mathbb{N}}$. Let $k \in \mathbb{N}$. Show that for this system $r(n, k^{-1}) > k^n$, thus $h_{\mathrm{top}}(f) = \infty$.

¹ According to Roy Adler [BKS, p. 103] topological entropy was first defined by C. Shannon [Sh] and called by him noiseless channel capacity.
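To get a feeling for Definition 4.3, the growth of $r(n, \varepsilon)$ can be explored numerically. The following Python sketch is only an illustration under explicit assumptions: it uses the doubling map $x \mapsto 2x \bmod 1$ on the circle (a standard example which is not defined in this section), restricted to the grid $\{k/M\}$ so that all arithmetic is exact, and it greedily builds a set of grid points which is $(n, \varepsilon)$-spanning for the grid and whose points are pairwise $(n, \varepsilon)$-separated in the sense of Exercise 4.9. The names `bowen_close` and `greedy_count` are hypothetical helpers introduced here.

```python
import math

def bowen_close(a, b, n, M, eps_units):
    """True if the orbits of a/M and b/M under x -> 2x mod 1 stay within
    eps (measured in grid units) for all 0 <= j <= n, as in Definition 4.1."""
    for _ in range(n + 1):
        diff = (a - b) % M
        if min(diff, M - diff) > eps_units:   # distance on the circle
            return False
        a, b = (2 * a) % M, (2 * b) % M       # apply the doubling map exactly
    return True

def greedy_count(n, M=2048, eps=0.1):
    """Size of a greedily built set of grid points that spans the grid
    and is pairwise (n, eps)-separated."""
    eps_units = eps * M
    centers = []
    for k in range(M):
        # k becomes a new center only if no existing center shadows it
        if not any(bowen_close(k, c, n, M, eps_units) for c in centers):
            centers.append(k)
    return len(centers)

counts = [greedy_count(n) for n in range(5)]
rates = [math.log(counts[n + 1] / counts[n]) for n in range(4)]
print("counts:", counts)
print("log-ratios:", rates, "log 2 =", math.log(2))
```

The successive log-ratios approach $\log 2$, which is the growth rate that formula [4.1] extracts in the limit; the grid size `M` and the value of `eps` are arbitrary choices made for this sketch.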

Exercise 4.6 Show that the topological entropy of the $p$-adic map of Exercise 2.36 is $\log p$.

Remark 4.7 The topological entropy of a flow $\varphi_t$ is defined as the topological entropy of the time-one diffeomorphism $f = \varphi_1$.

Exercise 4.8 Show that:
(i) the topological entropy of an isometry is zero; if $h$ is an isometry, the topological entropy of $f$ equals that of $h^{-1} \circ f \circ h$;
(ii) if $f$ is a homeomorphism of a compact space $X$ then $h_{\mathrm{top}}(f) = h_{\mathrm{top}}(f^{-1})$;
(iii) $h_{\mathrm{top}}(f^m) = m\, h_{\mathrm{top}}(f)$.

Exercise 4.9 Let $X$ be a metric space and $f$ a continuous endomorphism of $X$. We say that a set $A$ is $(n, \varepsilon)$-separated if for all $x, y \in A$ with $x \ne y$ there exists $0 \le j \le n$ such that $d(f^j(x), f^j(y)) > \varepsilon$. We denote by $s(n, \varepsilon)$ the maximal cardinality of an $(n, \varepsilon)$-separated set. Show that:
(i) $s(n, 2\varepsilon) \le r(n, \varepsilon) \le s(n, \varepsilon)$;
(ii) $h_{\mathrm{top}}(f) = \lim_{\varepsilon \to 0} \limsup_{n \to +\infty} \frac{1}{n} \log s(n, \varepsilon)$;
(iii) if $X$ is a compact subset of $\mathbb{R}^l$ and $f$ is Lipschitz with Lipschitz constant $K$ then $h_{\mathrm{top}}(f) \le l \log K$.

Proposition 4.10 The topological entropy does not depend on the choice of the metric on $X$ provided that the induced topology is the same. The topological entropy is invariant under topological conjugacy.

Proof. We first show how the second statement is a consequence of the first. Let $f, g$ be topologically conjugate via a homeomorphism $h$. Let $d$ denote a fixed metric on $X$ and $d'$ denote the pullback of $d$ via $h$: $d'(x_1, x_2) = d(h^{-1}(x_1), h^{-1}(x_2))$. Then $h$ becomes an isometry, so $h_{\mathrm{top}}(f) = h_{\mathrm{top}}(g)$ (see Exercise 4.8).

Let us now show the first part. Let $d$ and $d'$ be two different metrics on $X$ which induce the same topology and let $r_d(n, \varepsilon)$ and $r_{d'}(n, \varepsilon)$ denote the minimal cardinality of an $(n, \varepsilon)$-spanning set in the two metrics. We will denote by $h_{\mathrm{top},d}(f)$ and $h_{\mathrm{top},d'}(f)$ the corresponding topological entropies. Let $\varepsilon > 0$ and consider the set $D_\varepsilon$ of all pairs $(x_1, x_2) \in X \times X$ such that $d(x_1, x_2) \ge \varepsilon$. This is a compact subset of $X \times X$, thus $d'$ takes a minimum $\delta'(\varepsilon) > 0$ on $D_\varepsilon$. Thus any $\delta'(\varepsilon)$-ball in the metric $d'$ is contained in an $\varepsilon$-ball in the metric $d$. From this one gets $r_{d'}(n, \delta'(\varepsilon)) \ge r_d(n, \varepsilon)$, thus $h_{\mathrm{top},d'}(f) \ge h_{\mathrm{top},d}(f)$. Interchanging the roles of the two metrics one obtains the opposite inequality.

Exercise 4.11 Show that if $g$ is a factor of $f$ then $h_{\mathrm{top}}(g) \le h_{\mathrm{top}}(f)$.

An alternative but equivalent definition of topological entropy is obtained by considering all possible open covers of $X$ and their refinements obtained by iterating $f$.

C. Carminati and S. Marmi, An Introduction to Dynamical Systems

Definition 4.12 If $\alpha, \beta$ are open covers of $X$, their join $\alpha \vee \beta$ is the open cover by all sets of the form $A \cap B$, where $A \in \alpha$ and $B \in \beta$. An open cover $\beta$ is a refinement of $\alpha$, written $\alpha < \beta$, if every member of $\beta$ is a subset of a member of $\alpha$. Let $\alpha$ be an open cover of $X$ and let $N(\alpha)$ be the number of sets in a finite subcover of $\alpha$ with smallest cardinality. We denote by $f^{-1}\alpha$ the open cover consisting of all sets $f^{-1}(A)$ where $A \in \alpha$.

Exercise 4.13 If $\{a_n\}_{n \in \mathbb{N}}$ is a sequence of real numbers such that $a_{n+m} \le a_n + a_m$ for all $n, m$, then $\lim_{n \to +\infty} a_n / n$ exists and equals $\inf_{n \in \mathbb{N}} a_n / n$. [Hint: $n = kp + m$, $\frac{a_n}{n} \le \frac{a_p}{p} + \frac{a_m}{kp}$.]

Theorem 4.14 The topological entropy of $f$ is given by

$$h_{\mathrm{top}}(f) = \sup_{\alpha} \lim_{n \to \infty} \frac{1}{n} \log N\left( \bigvee_{i=0}^{n-1} f^{-i}\alpha \right). \qquad [4.2]$$

For its proof see [Wa, pp. 173-174].

4.2 Entropy and information. Metric entropy.

In order to define metric entropy and to make clear its analogy with formula [4.2] for topological entropy, we first introduce some general considerations on the relationship between entropy and information (see [Khi]).

Suppose that one performs an experiment, which we will denote $\alpha$, with $m \in \mathbb{N}$ possible mutually exclusive outcomes $A_1, \dots, A_m$ (e.g. tossing a coin, $m = 2$, or rolling a die, $m = 6$). Assume that each possible outcome $A_i$ happens with probability $p_i \in [0, 1]$, $\sum_{i=1}^m p_i = 1$ (in an experimental situation the probability will be defined statistically). In a probability space $(X, \mathcal{A}, \mu)$ this corresponds to the following setting: $\alpha$ is a finite partition $X = A_1 \cup \dots \cup A_m$ mod(0), $A_i \in \mathcal{A}$, $\mu(A_i \cap A_j) = 0$ for $i \ne j$, $\mu(A_i) = p_i$.

We want to define a function (called entropy) which measures the uncertainty associated with a prediction of the result of the experiment (or, equivalently, which measures the amount of information which one can gain from performing the experiment). Let $\Delta^{(m)}$ denote the standard $m$-simplex of $\mathbb{R}^m$,

$$\Delta^{(m)} = \Big\{ (x_1, \dots, x_m) \in \mathbb{R}^m \ \Big|\ x_i \in [0, 1],\ \sum_{i=1}^m x_i = 1 \Big\}.$$
Definition 4.15 A continuous function $H^{(m)} : \Delta^{(m)} \to [0, +\infty]$ is called an entropy if it has the following properties:
(1) symmetry: for all $i, j \in \{1, \dots, m\}$, $H^{(m)}(p_1, \dots, p_i, \dots, p_j, \dots, p_m) = H^{(m)}(p_1, \dots, p_j, \dots, p_i, \dots, p_m)$;
(2) $H^{(m)}(1, 0, \dots, 0) = 0$;
(3) $H^{(m)}(0, p_2, \dots, p_m) = H^{(m-1)}(p_2, \dots, p_m)$ for all $m \ge 2$ and all $(p_2, \dots, p_m) \in \Delta^{(m-1)}$;
(4) for all $(p_1, \dots, p_m) \in \Delta^{(m)}$ one has $H^{(m)}(p_1, \dots, p_m) \le H^{(m)}\left(\frac{1}{m}, \dots, \frac{1}{m}\right)$, where equality is possible if and only if $p_i = \frac{1}{m}$ for all $i = 1, \dots, m$;
(5) let $(\pi_{11}, \dots, \pi_{1l}, \pi_{21}, \dots, \pi_{2l}, \dots, \pi_{m1}, \dots, \pi_{ml}) \in \Delta^{(ml)}$ and set $p_i = \sum_{j=1}^{l} \pi_{ij}$, so that $(p_1, \dots, p_m) \in \Delta^{(m)}$; then one must have

$$H^{(ml)}(\pi_{11}, \dots, \pi_{1l}, \pi_{21}, \dots, \pi_{ml}) = H^{(m)}(p_1, \dots, p_m) + \sum_{i=1}^{m} p_i\, H^{(l)}\left( \frac{\pi_{i1}}{p_i}, \dots, \frac{\pi_{il}}{p_i} \right).$$

In the above definition: (2) says that if some outcome is certain then the entropy is zero; (3) says that no information is gained from impossible outcomes (i.e. outcomes with probability zero); (4) says that the maximal uncertainty of the outcome is obtained when the possible results have the same probability; (5) describes the behaviour of entropy when two distinct experiments are performed jointly. Let $\beta$ denote another experiment with possible outcomes $B_1, \dots, B_l$ (i.e. another partition of $(X, \mathcal{A}, \mu)$). Let $\pi_{ij}$ be the probability of $A_i$ and $B_j$ (i.e. $\mu(A_i \cap B_j)$). The conditional probability of $B_j$ given $A_i$ is $\mathrm{prob}(B_j \mid A_i) = \frac{\pi_{ij}}{p_i}$. Clearly the uncertainty of the outcome of the experiment $\beta$, once one has already performed $\alpha$ with outcome $A_i$, is given by $H^{(l)}\left( \frac{\pi_{i1}}{p_i}, \dots, \frac{\pi_{il}}{p_i} \right)$.

Theorem 4.16 An entropy is necessarily a positive multiple of

$$H(p_1, \dots, p_m) = -\sum_{i=1}^{m} p_i \log p_i. \qquad [4.3]$$

Here we adopt the convention $0 \log 0 = 0$. The above theorem and its proof are taken from [Khi, pp. 10-13].

Proof. Let $K(m) = H\left(\frac{1}{m}, \dots, \frac{1}{m}\right)$. By (3) and (4) $K$ is increasing:

$$K(m) = H(0, 1/m, \dots, 1/m) \le H(1/(m+1), \dots, 1/(m+1)) = K(m+1).$$

Let $m$ and $l$ be two positive integers. Applying (5) with $\pi_{ij} \equiv \frac{1}{ml}$, $p_i \equiv \frac{1}{m}$ gives

$$K(lm) = K(m) + \sum_{i=1}^{m} \frac{1}{m} K(l) = K(m) + K(l),$$

thus $K(l^m) = m K(l)$. Given three integers $r, n, l$ let $m$ be such that $l^m \le r^n \le l^{m+1}$, i.e.

$$\frac{m}{n} \le \frac{\log r}{\log l} \le \frac{m}{n} + \frac{1}{n}.$$

Since $m K(l) = K(l^m) \le K(r^n) = n K(r) \le K(l^{m+1}) = (m+1) K(l)$ one obtains

$$\frac{m}{n} \le \frac{K(r)}{K(l)} \le \frac{m}{n} + \frac{1}{n}, \quad \text{i.e.} \quad \left| \frac{K(r)}{K(l)} - \frac{\log r}{\log l} \right| \le \frac{1}{n}.$$

Since $n$ is arbitrary, $\frac{K(r)}{\log r} = \frac{K(l)}{\log l}$ and $K(m) = c \log m$, $c > 0$.

Let $(p_1, \dots, p_m) \in \mathbb{Q}^m \cap \Delta^{(m)}$ and let $s$ denote the least common multiple of their denominators. Then $p_i = \frac{r_i}{s}$ and $\sum_{i=1}^m r_i = s$. In addition to the partition $\alpha$ with elements $A_1, \dots, A_m$ and associated probabilities $p_1, \dots, p_m$ we also consider $\beta$ with $s$ outcomes $B_1, \dots, B_s$, which we divide into $m$ groups containing $r_1, \dots, r_m$ outcomes respectively. Let $\pi_{ij} = \frac{p_i}{r_i} = \frac{1}{s}$, $i = 1, \dots, m$, $j = 1, \dots, r_i$. Given any outcome $A_i$ of $\alpha$ the possible $r_i$ outcomes of $\beta$ are equally probable, thus

$$H\left( \frac{\pi_{i1}}{p_i}, \dots, \frac{\pi_{i r_i}}{p_i} \right) = c \log r_i$$

and

$$\sum_{i=1}^{m} p_i\, H\left( \frac{\pi_{i1}}{p_i}, \dots, \frac{\pi_{i r_i}}{p_i} \right) = c \sum_{i=1}^{m} p_i \log r_i = c \sum_{i=1}^{m} p_i \log p_i + c \log s.$$

On the other hand $H(\pi_{11}, \dots, \pi_{m r_m}) = c \log s$ and by (5)

$$H(p_1, \dots, p_m) = H(\pi_{11}, \dots, \pi_{m r_m}) - \sum_{i=1}^{m} p_i\, H\left( \frac{\pi_{i1}}{p_i}, \dots, \frac{\pi_{i r_i}}{p_i} \right) = -c \sum_{i=1}^{m} p_i \log p_i,$$

thus [4.3] holds on a dense subset of $\Delta^{(m)}$. By continuity it must hold everywhere.

The entropy $H$ can be regarded as $-\frac{1}{N}$ times the logarithm of the probability of a typical result of the experiment $\alpha$ repeated $N$ times. Indeed, if $N$ is large and $\alpha$ is repeated $N$ times, by the law of large numbers one should observe each $A_i$ approximately $p_i N$ times. Thus the probability of a typical outcome is $p_1^{p_1 N} p_2^{p_2 N} \cdots p_m^{p_m N}$.

We now want to extend the notion of entropy to measurable dynamical systems $(X, \mathcal{A}, \mu, f)$. If $\alpha$ and $\beta$ are two partitions of $X$, their joint partition $\alpha \vee \beta$ is $\{A \cap B,\ A \in \alpha,\ B \in \beta\}$. Given $n$ partitions $\alpha_1, \dots, \alpha_n$ we will denote by $\bigvee_{i=1}^{n} \alpha_i$ their joint partition. If $f$ is measurable, i.e. $f^{-1}(A) \in \mathcal{A}$ for all $A \in \mathcal{A}$, and $\alpha$ is a partition, $f^{-1}\alpha$ is the partition defined by the subsets $\{f^{-1}A,\ A \in \alpha\}$. Finally, a partition $\beta$ is finer than $\alpha$, denoted $\alpha < \beta$, if for every $B \in \beta$ there exists $A \in \alpha$ such that $B \subset A$.

The entropy $H(\alpha)$ of a partition $\alpha = \{A_1, \dots, A_m\}$ is given by $H(\alpha) = -\sum_{i=1}^{m} \mu(A_i) \log \mu(A_i)$.

Definition 4.17 Let $(X, \mathcal{A}, \mu, f)$ be a measurable dynamical system and $\alpha$ a partition. The entropy of $f$ w.r.t. the partition $\alpha$ is

$$h_\mu(f, \alpha) := \lim_{n \to \infty} \frac{1}{n} H\left( \bigvee_{i=0}^{n-1} f^{-i}\alpha \right). \qquad [4.4]$$

The entropy of $f$ is

$$h_\mu(f) := \sup\{ h_\mu(f, \alpha),\ \alpha \text{ is a finite partition of } X \}. \qquad [4.5]$$

Remark 4.18 Using the strict convexity of $x \log x$ on $\mathbb{R}^+$, one can prove the existence of the limit [4.4]. Indeed the sequence $\frac{1}{n} H\left( \bigvee_{i=0}^{n-1} f^{-i}\alpha \right)$ is non-negative and monotonically non-increasing. Thus $h_\mu(f, \alpha) \ge 0$ for all $\alpha$.

Exercise 4.19 Show that if two measurable dynamical systems are isomorphic then they have the same entropy.

The above considerations show that the entropy of a partition $\alpha$ measures the amount of information obtained by making a measurement by means of a device which distinguishes points of $X$ with the resolution prescribed by $\{A_1, \dots, A_m\} = \alpha$. If $x \in X$ and we consider the orbit of $x$ up to time $n - 1$,

$$x, fx, f^2 x, \dots, f^{n-1} x,$$

since $\alpha$ is a partition mod(0) of $X$ each point $f^i x$, $0 \le i \le n - 1$, belongs (almost surely) to exactly one of the sets of $\alpha$: $f^i x \in A_{k_i}$ with $k_i \in \{1, \dots, m\}$ for all $i = 0, \dots, n - 1$. $H\left( \bigvee_{i=0}^{n-1} f^{-i}\alpha \right)$ measures the information obtained from the knowledge of the distribution w.r.t. $\alpha$ of a segment of orbit of length $n$. Thus $\frac{1}{n} H\left( \bigvee_{i=0}^{n-1} f^{-i}\alpha \right)$ is the average amount of information per unit of time, and $h_\mu(f, \alpha)$ is the amount of information (asymptotically) obtained at each iteration of the dynamical system from the knowledge of the distribution of the orbit of a point w.r.t. the partition $\alpha$. A more satisfactory formulation of this is given by the following theorem [Mn1].

Theorem 4.20 (Shannon-Breiman-McMillan) Let $(X, \mathcal{A}, \mu, f)$ be an ergodic measurable dynamical system and $\alpha$ a finite partition of $X$. Given $x \in X$ let $\alpha^n(x)$ be the element of $\bigvee_{i=0}^{n-1} f^{-i}\alpha$ which contains $x$. For $\mu$-a.e. $x \in X$ one has

$$h_\mu(f, \alpha) = \lim_{n \to \infty} -\frac{1}{n} \log \mu(\alpha^n(x)). \qquad [4.6]$$
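The simplest instance of [4.6] is the case of independent repetitions of an experiment with outcome probabilities $(p_1, \dots, p_m)$ (the Bernoulli schemes of Section 4.3): the measure of the cell $\alpha^n(x)$ is then the product of the probabilities of the $n$ observed outcomes, and [4.6] reduces to the law of large numbers discussed after Theorem 4.16. The Python sketch below is a hedged numerical illustration of this special case only; the probability vector, the seed and the helper names `shannon_entropy` and `empirical_rate` are assumptions made for the example.

```python
import math
import random

def shannon_entropy(p):
    """H(p) = -sum_i p_i log p_i, with the convention 0 log 0 = 0 (formula [4.3])."""
    return -sum(q * math.log(q) for q in p if q > 0)

def empirical_rate(p, n, rng):
    """Sample one word of n independent outcomes and return
    -(1/n) log(probability of that word), the quantity appearing in [4.6]."""
    symbols = rng.choices(range(len(p)), weights=p, k=n)
    return -sum(math.log(p[s]) for s in symbols) / n

p = (0.5, 0.3, 0.2)        # illustrative outcome probabilities
rng = random.Random(0)     # fixed seed so that the run is reproducible
H = shannon_entropy(p)
for n in (100, 10_000, 100_000):
    print(n, empirical_rate(p, n, rng), "H =", H)
```

As $n$ grows the sampled values concentrate around $H(p_1, \dots, p_m)$; the uniform vector gives the largest value, $\log m$, in agreement with property (4) of Definition 4.15.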

Remark 4.21 The previous theorem admits the following interpretation: if a system is ergodic then there exists a non-negative number $h$ such that for all $\varepsilon > 0$, if $\alpha$ is a sufficiently fine partition of $X$, then there exists a positive integer $N$ such that for all $n \ge N$ there is a subset $X_n$ of $X$ with measure $\mu(X_n) > 1 - \varepsilon$, made of approximately $e^{nh}$ elements of $\bigvee_{i=0}^{n-1} f^{-i}\alpha$, each of measure about $e^{-nh}$.

Let $X$ be a compact metric space and $\mathcal{A}$ be the Borel σ-algebra. Brin and Katok [M. Brin and A. Katok, Lecture Notes in Mathematics 1007 (1983) 30-38] gave a topological version of the Shannon-Breiman-McMillan Theorem. Let $B(x, \varepsilon)$ be the ball of center $x \in X$ and radius $\varepsilon$. Let $f : X \to X$ be continuous and preserve the probability measure $\mu : \mathcal{A} \to [0, 1]$. Let

$$B(x, \varepsilon, n) := \{ y \in X \mid d(f^i x, f^i y) \le \varepsilon \text{ for all } i = 0, \dots, n - 1 \},$$

i.e. $B(x, \varepsilon, n)$ is the set of points $y \in X$ whose orbit stays at a distance at most $\varepsilon$ from the orbit of $x$ for at least $n - 1$ iterations. Then one has

Theorem 4.22 (Brin-Katok) Assume that $(X, \mathcal{A}, \mu, f)$ is ergodic. For $\mu$-a.e. $x \in X$ one has

$$\sup_{\varepsilon > 0} \limsup_{n \to \infty} -\frac{1}{n} \log \mu(B(x, \varepsilon, n)) = h_\mu(f). \qquad [4.7]$$

When the entropy is positive some of the observables are not predictable. A system is chaotic if it has positive entropy. Brin-Katok's Theorem together with the Poincaré recurrence theorem shows that the orbits of chaotic systems are subject to two apparently contrasting requirements. On one hand almost every orbit is recurrent. On the other hand the probability that two orbits stay close to each other for an interval of time of length $n$ decays exponentially with $n$. Since two initially close orbits must come back close to their origin infinitely many times, if the entropy is positive they cannot be correlated. Typically they will separate from each other and return at different times $n$. To this complexity of the motions one associates the notion of chaos; it shows how it can be impossible to compute the values that an observable will assume from the knowledge of the past.

Remark 4.23 To compute the entropy one can use the following important result of Kolmogorov and Sinai: if $\alpha$ is a partition of $X$ which generates the σ-algebra $\mathcal{A}$, the entropy of $(X, \mathcal{A}, \mu, f)$ is simply given by

$$h_\mu(f) = h_\mu(f, \alpha). \qquad [4.8]$$

We recall that $\alpha$ generates $\mathcal{A}$ iff

$$\bigvee_{i=-\infty}^{+\infty} f^{-i}\alpha = \mathcal{A} \ \text{mod}(0) \ \text{ if } f \text{ is invertible}, \qquad \bigvee_{i=0}^{+\infty} f^{-i}\alpha = \mathcal{A} \ \text{mod}(0) \ \text{ if } f \text{ is not invertible}.$$

Exercise 4.24 Show that the entropy of the $p$-adic map is $\log p$.

Exercise 4.25 Interpret formula [4.2] in terms of information (so that its analogy with [4.4] is clear).

4.3 Shifts and Bernoulli schemes

Let $N \ge 2$, $\Sigma_N = \{1, \dots, N\}^{\mathbb{Z}}$. For $x = (x_i)_{i \in \mathbb{Z}}$, $y = (y_i)_{i \in \mathbb{Z}}$ we define their distance

$$d(x, y) = 2^{-a(x, y)} \quad \text{where} \quad a(x, y) = \inf\{ |n|,\ n \in \mathbb{Z},\ x_n \ne y_n \}. \qquad [4.9]$$

Then $(\Sigma_N, d)$ is a compact (ultra)metric space. The shift $\sigma : \Sigma_N \to \Sigma_N$ is the bi-Lipschitz homeomorphism of $\Sigma_N$ (with the metric [4.9] the Lipschitz constant is 2) defined by

$$\sigma((x_i)_{i \in \mathbb{Z}}) = (x_{i+1})_{i \in \mathbb{Z}}. \qquad [4.10]$$

Topological properties of the shift map: The phase space $\Sigma_N$ is totally disconnected and has Hausdorff dimension 1. The homeomorphism $\sigma$ is expansive: for all $x \ne y$ there exists $n$ such that $d(\sigma^n(x), \sigma^n(y)) \ge 1$. The topological entropy of $(\Sigma_N, \sigma)$ is $\log N$.

Let $(p_1, \dots, p_N) \in \Delta^{(N)}$ and let $\nu$ be the probability measure on $\{1, \dots, N\}$ such that $\nu(\{i\}) = p_i$.

Definition 4.26 The Bernoulli scheme $BS(p_1, \dots, p_N)$ is the measurable dynamical system given by the shift map $\sigma : \Sigma_N \to \Sigma_N$ with the (product) probability measure $\mu = \nu^{\mathbb{Z}}$ on $\Sigma_N$.

Exercise 4.27 Show that the σ-algebra of measurable subsets of $\Sigma_N$ coincides with its Borel σ-algebra and is generated by cylinders: if $j_1, \dots, j_k \in \{1, \dots, N\}$ and $i_1, \dots, i_k \in \mathbb{Z}$ the corresponding cylinder is

$$C\begin{pmatrix} j_1, \dots, j_k \\ i_1, \dots, i_k \end{pmatrix} = \{ x \in \Sigma_N \mid x_{i_1} = j_1,\ x_{i_2} = j_2,\ \dots,\ x_{i_k} = j_k \}. \qquad [4.11]$$

Check that the measure of cylinders for the Bernoulli scheme $BS(p_1, \dots, p_N)$ is

$$\mu\left( C\begin{pmatrix} j_1, \dots, j_k \\ i_1, \dots, i_k \end{pmatrix} \right) = p_{j_1} \cdots p_{j_k}, \qquad [4.12]$$

and that it is preserved by the shift map.

Proposition 4.27 The Kolmogorov-Sinai entropy of the Bernoulli scheme $BS(p_1, \dots, p_N)$ is $-\sum_{i=1}^{N} p_i \log p_i$.

Proof. The partition $\alpha$ defined by the cylinders $\left\{ C\begin{pmatrix} j \\ 0 \end{pmatrix} \right\}_{j = 1, \dots, N}$ generates the σ-algebra $\mathcal{A}$. By Remark 4.23 we can thus use it to compute the entropy. Since

$$\alpha \vee \sigma^{-1}\alpha = \left\{ C\begin{pmatrix} j_0, j_1 \\ 0, 1 \end{pmatrix} \right\}_{j_0, j_1 = 1, \dots, N}, \qquad \alpha \vee \sigma^{-1}\alpha \vee \sigma^{-2}\alpha = \left\{ C\begin{pmatrix} j_0, j_1, j_2 \\ 0, 1, 2 \end{pmatrix} \right\}_{j_i = 1, \dots, N,\ i = 0, 1, 2}$$

and so on, the corresponding entropies are

$$H(\alpha) = -\sum_{j=1}^{N} p_j \log p_j,$$

$$H(\alpha \vee \sigma^{-1}\alpha) = -\sum_{j_0=1}^{N} \sum_{j_1=1}^{N} p_{j_0} p_{j_1} \log(p_{j_0} p_{j_1}) = -\sum_{j_0=1}^{N} p_{j_0} \log p_{j_0} \sum_{j_1=1}^{N} p_{j_1} - \sum_{j_1=1}^{N} p_{j_1} \log p_{j_1} \sum_{j_0=1}^{N} p_{j_0} = -2 \sum_{j=1}^{N} p_j \log p_j,$$

$$H(\alpha \vee \sigma^{-1}\alpha \vee \sigma^{-2}\alpha) = -\sum_{j_0, j_1, j_2} p_{j_0} p_{j_1} p_{j_2} \log(p_{j_0} p_{j_1} p_{j_2}) = -3 \sum_{j=1}^{N} p_j \log p_j,$$

and so on. Thus $h_\mu(\sigma, \alpha) = -\sum_{j=1}^{N} p_j \log p_j$.

Remark 4.28 Note that $h_\mu(\sigma) \le \log N$ for all $(p_1, \dots, p_N) \in \Delta^{(N)}$, with equality if and only if $p_i = 1/N$ for all $i$. In this case we get the unique invariant measure of the shift on $N$ symbols which realizes the topological entropy.

Let us see how the shift and the shift-invariant compact subsets of $\Sigma_N$ arise naturally in the context of symbolic dynamics (the following description is taken from the lectures of J. C. Yoccoz at the 1992 ICTP School on Dynamical Systems).

Let $(Y, d)$ be a compact metric space and $f$ a homeomorphism of $Y$. Let $Y = Y_1 \cup \dots \cup Y_N$, where the $Y_i$ are compact. Given a point $y \in Y$ we define

$$\Sigma(f, y) = \{ x \in \Sigma_N,\ f^i(y) \in Y_{x_i}\ \forall i \in \mathbb{Z} \}.$$

This is a nonempty compact subset of $\Sigma_N$. Moreover we define

$$\Sigma(f) = \bigcup_{y \in Y} \Sigma(f, y) = \Big\{ x \in \Sigma_N,\ \bigcap_{i \in \mathbb{Z}} f^{-i}(Y_{x_i}) \ne \emptyset \Big\}.$$

Exercise 4.29 Show that $\Sigma(f)$ is also a compact subset of $\Sigma_N$, invariant under the shift. [Hint: $\Sigma(f, f(y)) = \sigma(\Sigma(f, y))$.]

Assume that the map $f$ is expansive, i.e. there exists $\varepsilon > 0$ such that for all $y_1 \ne y_2$ there exists an integer $n$ such that $d(f^n(y_1), f^n(y_2)) > \varepsilon$, and choose the compacts $Y_i$ above with $\mathrm{diam}(Y_i) < \varepsilon$. Then by expansivity, if $y_1 \ne y_2$ the sets $\Sigma(f, y_1)$ and $\Sigma(f, y_2)$ are disjoint, and we can define a map $h : \Sigma(f) \to Y$ by the property $h^{-1}(y) = \Sigma(f, y)$.

Exercise 4.30 Show that $h$ is surjective, continuous and $h \circ \sigma = f \circ h$, i.e. $h$ is a semiconjugacy from the restriction of the shift $\sigma$ to $\Sigma(f)$ to $f$.

Exercise 4.31 Show that the semiconjugacy above is indeed a topological conjugacy if and only if $Y$ is totally disconnected (and $f$ is expansive). [Hint: choose the compacts $Y_i$ with $\mathrm{diam}(Y_i) < \varepsilon$ and disjoint.]

4.4 (Topological) Markov chains and Markov maps

The discussion at the end of the previous section shows the importance of the shift-invariant compact subsets of $\Sigma_N$. Among these, a very important subclass is that of the so-called topological Markov chains or subshifts of finite type.

Let $\Gamma \subset \{1, \dots, N\}^2$ and let $\mathcal{G}$ be a connected directed graph on the vertices $\{1, \dots, N\}$ with at most one arrow between two vertices: there is an arrow from $i$ to $j$ if and only if $(i, j) \in \Gamma$. We denote by $A = A_\Gamma$ the $N \times N$ matrix with entries $a_{ij} \in \{0, 1\}$ defined as follows:

$$a_{ij} = \begin{cases} 1 & \text{if } (i, j) \in \Gamma \text{ (i.e. there is an arrow in } \mathcal{G} \text{ from } i \text{ to } j\text{)} \\ 0 & \text{otherwise.} \end{cases}$$

We moreover assume that for all $i \in \{1, \dots, N\}$ there exist $j, k \in \{1, \dots, N\}$ such that $a_{ij} = a_{ki} = 1$. We associate to the matrix $A$ (or, equivalently, to the directed graph $\mathcal{G}$) the subset $\Sigma_A \subset \Sigma_N$ defined as follows:

$$\Sigma_A = \{ x \in \Sigma_N,\ (x_i, x_{i+1}) \in \Gamma\ \forall i \in \mathbb{Z} \}.$$

Exercise 4.32 Show that $\Sigma_A$ is a compact shift-invariant subset of $\Sigma_N$.

The restriction of the shift $\sigma$ to $\Sigma_A$ is denoted $\sigma_A$ and is called the topological Markov chain (or subshift of finite type) associated to the matrix $A$ (equivalently to the graph $\mathcal{G}$).
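The definition of $\Sigma_A$ can be made concrete by listing the finite words $(x_0, \dots, x_{n-1})$ whose consecutive pairs are all allowed by $A$. The Python sketch below is an illustration only: it takes as an assumed example the $2 \times 2$ matrix with a single forbidden transition (often called the golden mean shift, which is not discussed in the text), uses symbols $0, \dots, N-1$ instead of $1, \dots, N$, and compares brute-force enumeration with the sum of the entries of $A^{n-1}$. The helper names are hypothetical.

```python
import math
from itertools import product

def admissible_words(A, n):
    """All words (x_0, ..., x_{n-1}) over {0, ..., N-1} such that
    A[x_i][x_{i+1}] == 1 for every consecutive pair."""
    N = len(A)
    return [w for w in product(range(N), repeat=n)
            if all(A[w[i]][w[i + 1]] for i in range(n - 1))]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, k):
    """A**k by repeated multiplication (k >= 0), starting from the identity."""
    n = len(A)
    R = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        R = mat_mul(R, A)
    return R

def count_by_matrix(A, n):
    """Number of admissible words of length n: the sum of the entries of A**(n-1)."""
    return sum(sum(row) for row in mat_pow(A, n - 1))

A = [[1, 1],
     [1, 0]]   # assumed example: the transition 1 -> 1 is forbidden
for n in range(1, 11):
    brute = len(admissible_words(A, n))
    print(n, brute, count_by_matrix(A, n), math.log(brute) / n)
```

The two counts agree, and the last column shows the exponential growth rate of the number of admissible words, which is the kind of quantity that topological entropy measures.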

Exercise 4.33 Show that $\mathrm{card}(\mathrm{Fix}(\sigma_A^n)) = \mathrm{Tr}(A^n)$ for all $n \in \mathbb{N}$. Deduce from this that the Artin-Mazur zeta function

$$\zeta_A(t) = \exp\left( \sum_{n=1}^{\infty} \frac{1}{n}\, \mathrm{card}(\mathrm{Fix}(\sigma_A^n))\, t^n \right)$$

is rational (indeed it is equal to $\det(I - tA)^{-1}$).

The matrix $A$ is called primitive if there exists a positive integer $m$ such that all the entries of $A^m$ are strictly positive: $A^m = (a^m_{ij})$ and $a^m_{ij} > 0$ for all $i, j$. Then it is easy to show that for all $n \ge m$ one also has $a^n_{ij} > 0$ for all $i, j$.

Exercise 4.34 Show that if $A$ is primitive then $\sigma_A$ is topologically transitive, and its periodic orbits are dense in $\Sigma_A$. Moreover $\sigma_A$ is topologically mixing (...).

When the matrix is primitive one can apply the classical Perron-Frobenius theorem to compute the topological entropy of the associated subshift.

Theorem 4.35 (Perron-Frobenius, see [Gan]) If $A$ is primitive then there exists an eigenvalue $\lambda_A > 0$ such that:
(i) $\lambda_A > |\lambda|$ for all eigenvalues $\lambda \ne \lambda_A$;
(ii) the left and right eigenvectors associated to $\lambda_A$ are strictly positive and are unique up to constant multiples;
(iii) $\lambda_A$ is a simple root of the characteristic polynomial of $A$.

Exercise 4.35 Assume that $A$ is primitive. Show that the topological entropy of $\sigma_A$ is $\log \lambda_A$ (clearly $\lambda_A > 1$ since all the integers $a^m_{ij} > 0$).

Very much as the shift on $N$ symbols preserves many invariant measures (the Bernoulli schemes on $N$ symbols), a topological Markov chain preserves many invariant measures (which are called Markov chains). Let $P = (P_{ij})$ be an $N \times N$ matrix such that
(i) $P_{ij} \ge 0$ for all $i, j$, and $P_{ij} > 0 \iff a_{ij} = 1$;
(ii) $\sum_{j=1}^{N} P_{ij} = 1$ for all $i = 1, \dots, N$;
(iii) $P^m$ has all its entries strictly positive.
Such a matrix is called a stochastic matrix. Applying the Perron-Frobenius theorem to $P$ we see that 1 is a simple eigenvalue of $P$ and there exists a normalized eigenvector $p = (p_1, \dots, p_N) \in \Delta^{(N)}$ such that $p_i > 0$ for all $i$ and

$$\sum_{i=1}^{N} p_i P_{ij} = p_j, \qquad 1 \le j \le N.$$
We define a probability measure $\mu$ on $\Sigma_A$ corresponding to $P$ by prescribing its value on the cylinders:

$$\mu\left( C\begin{pmatrix} j_0, \dots, j_k \\ i, \dots, i + k \end{pmatrix} \right) = p_{j_0} P_{j_0 j_1} \cdots P_{j_{k-1} j_k},$$

for all $i \in \mathbb{Z}$, $k \ge 0$ and $j_0, \dots, j_k \in \{1, \dots, N\}$. It is called the Markov measure associated to the stochastic matrix $P$.

Exercise 4.36 Prove that the subshift $\sigma_A$ preserves the Markov measure $\mu$.

The subshift of finite type $\sigma_A$ with the preserved measure $\mu$ is called a Markov chain.

Exercise 4.37 Show that the Kolmogorov-Sinai entropy of $(\Sigma_A, \mathcal{A}, \sigma_A, \mu)$ is

$$h_\mu(\sigma_A) = -\sum_{i,j=1}^{N} p_i P_{ij} \log P_{ij}.$$

Check that $h_\mu(\sigma_A) \le h_{\mathrm{top}}(\sigma_A)$.

One can prove that there exists a stochastic matrix $P$ such that the entropy of the associated Markov chain is equal to the topological entropy of $\sigma_A$. Moreover this measure is unique (Parry measure, see [Mn1]).

Remark 4.38 There is another point of view which can be useful in studying topological Markov chains and their invariant Markov measures. Call a sequence $x \in \Sigma_A$ a configuration of a one-dimensional spin system (or Potts system) with configuration space $\Sigma_A$. Then part of the classical statistical mechanics of spin systems [Ru] is just the ergodic theory of the topological Markov chain (the shift-invariant measures being interpreted as translation-invariant measures).

Remark 4.39 An interesting application of the symbolic dynamics method described at the end of Section 4.3 is the theory of piecewise expanding Markov maps of the interval (Exercise 2.21). Let $Y = [0, 1]$ and $f : Y \to Y$ be piecewise monotonic and $C^2$, i.e. there exists a finite decomposition of the interval $[0, 1]$ into $N$ subintervals $I_i = [a_i, a_{i+1})$ ($a_1 = 0$, $a_{N+1} = 1$) on which $f$ is monotonic and of class $C^2$ on their closure. On each of these subintervals an inverse branch $f_i^{-1}$ of $f$ is well defined. Assume moreover:
Markov property: $f(I_i) = I_{k_i} \cup I_{k_i + 1} \cup \dots \cup I_{k_i + n_i}$;
aperiodicity: there exists an integer $m$ such that $f^m(I_i) = Y$ for all $i = 1, \dots, N$;
eventual expansivity: some iterate of $f$ has its derivative bounded away from 1 in modulus.
After Section 4.3 the symbolic dynamics of these maps is just a topological Markov chain.
Moreover one can prove that there exists a unique invariant ergodic measure, absolutely continuous w.r.t. the Lebesgue measure, with piecewise continuous density bounded away from 0 and $\infty$. With this measure the system is isomorphic to the Markov chain with the Parry measure: see [AF]. The existence of an absolutely continuous invariant measure can be proven also under weaker assumptions, see the classical [LY].
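The statements of Exercises 4.35 and 4.37 and the existence of the Parry measure can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not the construction used in [Mn1]: it takes the $2 \times 2$ matrix $A$ with rows $(1, 1)$ and $(1, 0)$ as an assumed example, computes $\lambda_A$ and the positive eigenvectors by power iteration, and uses the standard formula $P_{ij} = a_{ij} r_j / (\lambda_A r_i)$, $p_i \propto l_i r_i$ for the maximal-entropy Markov measure, where $r$ and $l$ are the right and left Perron eigenvectors. This formula is not stated in the text above, and all function names are hypothetical.

```python
import math

def perron(M, iters=500):
    """Power iteration for a primitive non-negative matrix M.
    Returns (leading eigenvalue, right eigenvector normalised to sum 1)."""
    n = len(M)
    v = [1.0 / n] * n
    lam = 1.0
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)                  # v sums to 1, so sum(M v) estimates lambda
        v = [x / lam for x in w]
    return lam, v

def transpose(M):
    return [list(col) for col in zip(*M)]

def parry(A):
    """Assumed standard formula for the maximal-entropy stochastic matrix."""
    lam, r = perron(A)                # right eigenvector
    _, l = perron(transpose(A))       # left eigenvector
    n = len(A)
    P = [[A[i][j] * r[j] / (lam * r[i]) for j in range(n)] for i in range(n)]
    Z = sum(l[i] * r[i] for i in range(n))
    p = [l[i] * r[i] / Z for i in range(n)]
    return lam, p, P

def markov_entropy(p, P):
    """-sum_{i,j} p_i P_ij log P_ij, the formula of Exercise 4.37."""
    n = len(P)
    return -sum(p[i] * P[i][j] * math.log(P[i][j])
                for i in range(n) for j in range(n) if P[i][j] > 0)

def stationary(P):
    """Stationary vector p with p P = p, i.e. the leading left eigenvector of P."""
    return perron(transpose(P))[1]

A = [[1, 1],
     [1, 0]]
lam, p, P = parry(A)
h_parry = markov_entropy(p, P)

Q = [[0.5, 0.5],
     [1.0, 0.0]]                      # another stochastic matrix compatible with A
h_other = markov_entropy(stationary(Q), Q)

print("log lambda_A =", math.log(lam))
print("entropy of the Parry measure =", h_parry)
print("entropy for Q =", h_other)
```

For this example the entropy of the Parry measure coincides with $\log \lambda_A$, while the other compatible stochastic matrix gives a strictly smaller value, in agreement with the inequality $h_\mu(\sigma_A) \le h_{\mathrm{top}}(\sigma_A)$.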