Chaos, Complexity, and Inference (36-462)

Lecture 5: Symbolic Dynamics; Making Discrete Stochastic Processes from Continuous Deterministic Dynamics
Cosma Shalizi, 27 January 2009

Symbolic dynamics: reducing continuous time series to sequences of discrete symbols
Stochastic processes: how to get random sequences from deterministic dynamics
Topological entropy rate: how many different things can your process do?
Formal languages: ways of making your process talk

Further reading: Badii and Politi (1997) is the reference on symbolic dynamics which comes closest to what we are doing (because I learned it from there). It's mathematically advanced, however. Chapter 9 of Devaney (1992) has a more purely mathematical introduction; if you like that, you might consider Lind and Marcus (1995); Kitchens (1998) assumes you know a lot of abstract algebra.

Symbolic Dynamics

Start with our favorite dynamical system, with a continuous state S_t and a map Φ:
  S_{t+1} = Φ(S_t)
Partition B: divide the state space up into non-overlapping cells B_0, B_1, ..., B_{k-1}.
b(S_t) = label (symbol) for the cell S_t is in = X_t (say)
Symbol sequence: X_1^∞ = b(S_1), b(S_2), b(S_3), ... = b(S_1), b(Φ(S_1)), b(Φ^(2)(S_1)), ...
i.e., given the initial condition S_1 and the partition B, the symbol sequence X_1^∞ is fixed.
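The construction is easy to sketch in code. The following is a minimal illustration (function names are mine, and I assume the convention from earlier lectures that the logistic map is Φ(x) = 4rx(1−x), so r = 1 is the fully chaotic case):

```python
# Sketch: turn a logistic-map trajectory into a symbol sequence using
# the two-cell partition B_0 = [0, 0.5) -> 'L', B_1 = [0.5, 1] -> 'R'.
# Assumes the map convention Phi(x) = 4*r*x*(1-x).

def logistic(x, r=1.0):
    return 4.0 * r * x * (1.0 - x)

def symbolize(x1, n, r=1.0):
    """First n symbols of the itinerary of the initial condition x1."""
    x, symbols = x1, []
    for _ in range(n):
        symbols.append('L' if x < 0.5 else 'R')
        x = logistic(x, r)
    return ''.join(symbols)

print(symbolize(0.3, 10))
```

Fixing the initial condition fixes the whole symbol sequence, as the slide says; randomness can only enter through the choice of x1.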

The Shift Map

Seen the symbols; what about the dynamics? The shift map Σ:
  Σ(X_1^∞) = X_2^∞
Σ shifts the symbol sequence one place over; Σ^(k) shifts the symbol sequence k places:
  Σ^(k)(X_1^∞) = X_{1+k}^∞
Commutative diagram: Φ takes S_t to S_{t+1}, b takes S_t to X_t and S_{t+1} to X_{t+1}, and Σ takes X_t to X_{t+1}.

Why do this?
1. Model of finite-resolution measurements
2. Continuous math is hard; let's go discretize: discrete-math tools, probability tools
3. Sometimes involves no real loss

Refinement of partitions

Subdivide cells according to which symbol they will give us. [Figure from Badii and Politi (1997, p. 71)]

In math: work out Φ^{-1}(B_i) for each cell B_i. The new partition B^2 is all the sets B_i ∩ Φ^{-1}(B_j). Refinement: knowing the cell in B^2 tells you the cell in B, but not the other way around.
Example: take the r = 1 logistic map with B_0 = [0, 0.5), B_1 = [0.5, 1]. Call these L and R, so we don't mix them up with other things, and color them red and black.
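To make the refinement concrete, here is a small sketch (my own code, with Φ(x) = 4x(1−x), i.e. r = 1) that labels each point by its cell in the refined partition B^2, that is, by its first two symbols, and scans a grid to locate the cells:

```python
# Sketch: the refined partition B^2 labels x by the cell of B that x is
# in AND the cell that Phi(x) lands in (here Phi(x) = 4x(1-x), r = 1).

def phi(x):
    return 4.0 * x * (1.0 - x)

def word2(x):
    """Two-symbol label of x under the refinement B^2."""
    return ('L' if x < 0.5 else 'R') + ('L' if phi(x) < 0.5 else 'R')

# Scan a fine grid to see where each two-symbol cell sits in [0, 1].
cells = {}
n = 10000
for i in range(n):
    x = (i + 0.5) / n
    cells.setdefault(word2(x), []).append(x)
for w in sorted(cells):
    print(w, round(min(cells[w]), 3), '-', round(max(cells[w]), 3))
```

Knowing the B^2 cell (say LR) tells you the B cell (L), but not conversely, which is exactly the refinement property.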

[Figure: boundaries of the partitions B, Φ^{-1}(B), Φ^{-2}(B), ... plotted against x ∈ [0, 1], for t = 0 through 10]

Generating Partitions

A partition is generating if the cells of B, B^2, B^3, ... keep getting smaller forever; or: the infinite symbol sequence X_1^∞ corresponds to a unique initial condition S_1. Then we are back in change-of-coordinates land: b is invertible, and Φ on states and Σ on symbol sequences are the same dynamics in different coordinates.
Example: the example partition for the logistic map is generating for all r, not just r = 1.

Where did all the details go?

Most of these maps get pretty complicated pretty quickly; e.g., try writing out Φ^(20) for the logistic map. But Σ^(20) is as trivial as Σ.
Trick: the complexity has moved out of the dynamical map and into the state space, which is now the space of symbol sequences (and, possibly, probability distributions on the sequence space). This lets us use different mathematical tools.

When are there generating partitions?

For one-dimensional maps, make a generating partition by putting boundaries at the critical points, i.e., maxima, minima, and vertical asymptotes. For higher-dimensional maps there are fewer general rules, and generating partitions don't always exist.

Estimating generating partitions

Symbolic false nearest neighbors (Kennel and Buhl, 2003): if you have a generating partition, then close symbol sequences should only come from close points in the state space.
1. Reconstruct your state space
2. Start with an initial partition
3. Calculate distances among symbol sequences and distances among state points
4. Find false symbolic neighbors
5. Tweak partition boundaries to reduce the number of false neighbors
6. Iterate to convergence
Another approach ("symbolic shadowing"): similar symbol sequences should imply close trajectories in state space (Hirata et al., 2004).

... from fully deterministic continuous dynamics

If S_1 has a distribution, then so do X_1 = b(S_1), X_2 = b(Φ(S_1)), ..., X_t = b(Φ^(t-1)(S_1)), ...
In general the X_t will be dependent on each other, so symbol sequences are stochastic processes. Studying these processes can tell us about the dynamical system, and symbolic dynamics tells us about how stochastic processes arise.

Again with the Logistic Map

r = 1, B_0 = [0, 1/2), B_1 = [1/2, 1], so the X_t are binary variables (values L, R). Take S_1 from the invariant distribution.
Claim: X_1^∞ is a sequence of IID variables, with P(X_t = L) = 0.5.
Translation: the logistic map gives us perfect coin-tossing.
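The claim is easy to probe numerically. A quick sketch (my code; the map convention Φ(x) = 4x(1−x) is assumed), tallying symbol frequencies along one trajectory:

```python
# Sketch: empirical symbol frequencies for the r = 1 logistic map,
# Phi(x) = 4x(1-x), with the partition L = [0, 0.5), R = [0.5, 1].

def phi(x):
    return 4.0 * x * (1.0 - x)

def symbol_freqs(x, n=100000, burn=1000):
    """Fraction of L's and R's in a length-n stretch of the itinerary."""
    for _ in range(burn):          # discard a transient
        x = phi(x)
    counts = {'L': 0, 'R': 0}
    for _ in range(n):
        counts['L' if x < 0.5 else 'R'] += 1
        x = phi(x)
    return {s: c / n for s, c in counts.items()}

print(symbol_freqs(0.3))  # both frequencies come out near 0.5
```

This only checks the marginal frequencies, of course; independence is what the next two slides argue for.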

The usual argument for this

1. Under the invariant distribution, P(B_0) = 1/2.
2. Under the invariant distribution, P(Φ^{-1}(B_i) | B_j) = 1/2 for all i, j. (The cell boundaries are not evenly placed, but the distribution isn't uniform either, and this cancels out.) So b(S_t) and b(S_{t+1}) are independent.
3. In fact, P(Φ^{-k}(B_i) | Φ^{-(k-1)}(B_{j_k}), Φ^{-(k-2)}(B_{j_{k-1}}), ..., B_{j_1}) = 1/2, so b(S_{t_1}), b(S_{t_2}), b(S_{t_3}), ... are all independent.

The usual mathematician's argument

Integrals with the invariant distribution and pre-images of the partition get messy; avoid them:
1. Smoothly change coordinates to go from the logistic map, with state S_t, to the tent map, with state R_t
2. This changes the invariant distribution to the uniform distribution
3. It leaves the generating partition alone
4. Write R_t as a binary number
5. b(R_t) = R if and only if the first binary digit of R_t is 1
6. b(R_{t+1}) = R if the first digit of R_t was 1 and its second digit was 0, or if the first digit was 0 and the second digit was 1; etc. for the other two-digit combinations
7. So b(R_{t+1}) is independent of b(R_t)
8. In fact b(R_{t_1}), b(R_{t_2}), b(R_{t_3}), ... are all independent

But why think when you can do the experiment?

EXERCISE 1: Write a program to simulate the symbolic dynamics of the logistic map with r = 1. Tabulate the frequencies of sub-sequences of length 2n. Test whether the block X_t^{t+n-1} is independent of the block X_{t-n}^{t-1}.

Independent symbols is an extreme case; more generally there is dependence across symbols. Two aspects:
absolute restrictions on what sequences can appear (today)
dependence in the relative frequencies of sequences (next lecture)

Forbidden Sequences

Take r = 0.966; this is moderately chaotic (Lyapunov exponent ≈ 0.42). You can verify, either by calculation or simulation, that LLL never appears, nor LLRR, nor infinitely many others. These are all forbidden; those which do appear are allowed. We also say allowed and forbidden words (because they're made from letters).
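One can run that simulation directly; a sketch (my code; again assuming the convention Φ(x) = 4rx(1−x), under which r = 0.966 means the map 3.864x(1−x)):

```python
# Sketch: look for the forbidden words LLL and LLRR in a long symbolic
# trajectory of the logistic map Phi(x) = 4*r*x*(1-x) at r = 0.966.

def symbol_sequence(x, n, r=0.966):
    seq = []
    for _ in range(n):
        seq.append('L' if x < 0.5 else 'R')
        x = 4.0 * r * x * (1.0 - x)
    return ''.join(seq)

seq = symbol_sequence(0.3, 100000)
for word in ('LLL', 'LLRR', 'LL', 'LR'):
    print(word, 'appears' if word in seq else 'never appears')
```

Absence from one finite run is evidence, not proof; the calculation route works by checking that the pre-images needed for LLL and LLRR fall outside the attractor.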

Topological Entropy Rate

Every allowed word of length n contains an allowed word of length n-1, so there are at least as many longer words as shorter words.
Let W_n = the number of allowed words of length n, and define
  h_0 ≡ lim_{n→∞} (1/n) log_2 W_n
This measures the exponential growth rate of the number of allowed patterns as their length grows. For any process, h_0 ≥ 0.
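A brute-force estimate of h_0 just counts the distinct words observed in a long simulated sequence; a sketch (my code, with the map convention Φ(x) = 4rx(1−x) assumed):

```python
# Sketch: estimate the topological entropy rate h_0 as
# log2(number of distinct length-n words) / n, from a long
# simulated symbol sequence of the logistic map.
import math

def itinerary(x, n, r):
    seq = []
    for _ in range(n):
        seq.append('L' if x < 0.5 else 'R')
        x = 4.0 * r * x * (1.0 - x)
    return ''.join(seq)

def h0_estimate(seq, n):
    words = {seq[i:i + n] for i in range(len(seq) - n + 1)}
    return math.log2(len(words)) / n

full = itinerary(0.3, 200000, r=1.0)     # fully chaotic: h_0 should be 1
mild = itinerary(0.3, 200000, r=0.966)   # forbidden words: h_0 below 1
for n in (2, 4, 8):
    print(n, h0_estimate(full, n), h0_estimate(mild, n))
```

Estimates of this kind are biased downward at large n, because rare allowed words go unobserved in a finite sample; that is one way into Exercise 3 below.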

2^{h_0} says (roughly) how many choices there are for ways of continuing the typical sequence. h_0 > 0 is necessary for sensitive dependence on initial conditions; fixed points or periodicity give h_0 = 0. With r = 1, W_n = 2^n, so h_0 = log_2 2 = 1, thus two possible continuations. Careful: h_0 = 0 can mean there is only one possible sequence, or just sub-exponential growth.

EXERCISE 2: Write a program to calculate the topological entropy rate for the logistic map at any r.
EXERCISE 3: How would you put a standard error on your estimate of h_0?

Languages

A word is a finite sequence of symbols. A (formal) language is a set of allowed words. A (formal) grammar is a collection of rules which give you all and only the allowed words. (Blame the linguists for the mixed metaphor.)
See Badii and Politi (1997) for more on how this applies to dynamical systems; Charniak (1993) for statistical aspects; Minsky (1967) and Lewis and Papadimitriou (1998) for good introductions to formal languages.

Regular expressions

The simplest sorts of formal grammars; used in Unix, Perl, etc. Basic operations:
literals: e.g. L, R, etc., depending on the alphabet; also the empty string, abbreviated ε
alternation ("this or that"): make an arbitrary choice between two sets; L|R means either L or R
concatenation ("string together"): L(L|R) means L followed by either L or R, i.e. LL or LR
star (repetition): repeat something zero or more times; L* matches ε, L, LL, LLL, ...; (L(L|R))* matches ε, LL, LR, LLLL, LLLR, LRLL, LRLR, ...

(LR)*: a period-two sequence; forbidden words include RR, LL
(L(L|R))*: odd-place symbols must be L, even places can be L or R; or: every other symbol must be L; forbidden words include RR; the periodicity here is hidden
(LRR(RR)*)* = (L(RR)+)*: blocks of Rs, of even length, separated by isolated Ls; forbidden words include LL, LRRRL
(L*(RR)*)*: even-length blocks of Rs, separated by blocks of Ls of arbitrary length; forbidden words include LRRRL
Not describable by any regular expression: every ( must be followed eventually by a matching )
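These expressions can be tried directly in any regex engine; here is a sketch using Python's re module (with '|' written explicitly for alternation):

```python
# Sketch: checking the example expressions against a few test words
# with Python's re module ('|' alternation, '*' star, '+' one-or-more).
import re

expressions = {
    '(LR)*':      r'(LR)*',
    '(L(L|R))*':  r'(L(L|R))*',
    '(L(RR)+)*':  r'(L(RR)+)*',
    '(L*(RR)*)*': r'(L*(RR)*)*',
}
words = ['LRLR', 'LRR', 'LRRLRR', 'LLRR', 'LRRRL']
for name, pattern in expressions.items():
    allowed = [w for w in words if re.fullmatch(pattern, w)]
    print(name, 'allows:', allowed)
```

From this word list, (LR)* allows only LRLR, while (L*(RR)*)* allows LRR, LRRLRR, and LLRR; LRRRL, with its odd block of Rs, matches neither of the last two expressions.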

Expressive Machinery

Basic theorem (Kleene, 1956): every regular expression can be implemented by a machine ("automaton") with a finite memory, and finite automata can only implement regular expressions. "Implement" can mean: check whether words match the expression, or generate words which match; these are equivalent. Think about generating; it feels more like dynamics.

Machines as Directed Graphs

Write machines as circle-and-arrow diagrams (directed graphs):
circles are states; a state fixes the possible future sequences. For symbolic dynamics, each state in the diagram is a set of states in the original, continuous state space
arrows go from circle to circle, labeled with symbols from the alphabet
paths generate words: write down the labels you hit following that path
concatenation = following arrows; alternation = more than one outgoing arrow from a circle; star = loops
one should distinguish allowed start and stop states

Some Machines and Expressions

[Diagrams of four machines. (LR)*: state 1 --L--> state 2, state 2 --R--> state 1. (L(L|R))*: state 1 --L--> state 2, state 2 --L--> state 1, state 2 --R--> state 1. Similar multi-state diagrams for (L(RR)+)* and (L*(RR)*)*.]
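Path-counting in these diagrams can be mechanized with powers of an edge-count matrix. A sketch (my code) for a two-state machine of the (L(L|R))* type, whose arrows I take to be 1 --L--> 2, 2 --L--> 1, and 2 --R--> 1:

```python
# Sketch: count the words a machine generates by counting labeled paths
# from its start state, via powers of the edge-count matrix. Machine
# assumed: state 1 --L--> 2; state 2 --L--> 1; state 2 --R--> 1.
import math

A = [[0, 1],   # one edge out of state 1 (labeled L), into state 2
     [2, 0]]   # two edges out of state 2 (L and R), both into state 1

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def paths_from_start(n):
    """Number of length-n labeled paths leaving state 1."""
    P = [[1, 0], [0, 1]]          # identity matrix
    for _ in range(n):
        P = matmul(P, A)
    return sum(P[0])              # paths starting at state 1

for n in (2, 4, 10):
    print(n, paths_from_start(n), math.log2(paths_from_start(n)) / n)
```

The count doubles every two symbols, so the growth rate is 1/2 bit per symbol; more generally (for a suitable graph presentation), the topological entropy rate is the log of the largest eigenvalue of this matrix.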

Sofic Systems

Sofic: only finitely many circles (also called finitary).
Finite type: to determine what symbol is possible next, you need only look back k symbols, for some fixed k.
Strictly sofic: sofic, but not of finite type.
(LR)* is of finite type; (L(L|R))*, (L(RR)+)* and (L*(RR)*)* are strictly sofic.
These are the skeletons of stochastic processes: finite type corresponds to finite-order Markov chains; strictly sofic corresponds to hidden Markov chains and long-range correlations.
Next time: some statistics!

Badii, Remo and Antonio Politi (1997). Complexity: Hierarchical Structures and Scaling in Physics. Cambridge, England: Cambridge University Press.

Charniak, Eugene (1993). Statistical Language Learning. Cambridge, Massachusetts: MIT Press.

Devaney, Robert L. (1992). A First Course in Chaotic Dynamical Systems: Theory and Experiment. Reading, Massachusetts: Addison-Wesley.

Hirata, Yoshito, Kevin Judd and Devin Kilminster (2004). "Estimating a Generating Partition from Observed Time Series: Symbolic Shadowing." Physical Review E, 70: 016215. doi:10.1103/physreve.70.016215.

Kennel, Matthew B. and Michael Buhl (2003). "Estimating good discrete partitions from observed data: symbolic false nearest neighbors." Physical Review Letters, 91: 084102. URL http://arxiv.org/abs/nlin.cd/0304054.

Kitchens, Bruce P. (1998). Symbolic Dynamics: One-sided, Two-sided and Countable State Markov Shifts. Berlin: Springer-Verlag.

Kleene, S. C. (1956). "Representation of Events in Nerve Nets and Finite Automata." In Automata Studies (Claude E. Shannon and John McCarthy, eds.), vol. 34 of Annals of Mathematics Studies, pp. 3-41. Princeton, New Jersey: Princeton University Press.

Lewis, Harry R. and Christos H. Papadimitriou (1998). Elements of the Theory of Computation. Upper Saddle River, New Jersey: Prentice-Hall, 2nd edn.

Lind, Douglas and Brian Marcus (1995). An Introduction to Symbolic Dynamics and Coding. Cambridge, England: Cambridge University Press.

Minsky, Marvin (1967). Computation: Finite and Infinite Machines. Englewood Cliffs, New Jersey: Prentice-Hall.