Dynamical Systems and Information Theory

Information Theory Lecture 4

Let's consider systems that evolve with time:

$x_{t+1} = F(x_t, x_{t-1}, x_{t-2}, \dots)$  or  $\frac{d^n x}{dt^n} = F(x)$

That is, systems that can be described as the evolution of a set of state variables. Such evolution can be in discrete or continuous time. The former is governed by difference or recurrence equations, the latter by differential equations.

Some Vocabulary
- Differential equations in first-order form
- If F is linear, the system is a linear system; likewise nonlinear
- The order of the system is the number of historical terms in the difference equations, or the highest derivative order n in the differential equations

In general, a system of differential equations can be converted to a first-order system through the addition of variables. Here is an example for a second-order, linear system:

$m \frac{d^2 x}{dt^2} + b \frac{dx}{dt} + k x = 0 \quad\Rightarrow\quad \frac{d^2 x}{dt^2} = -\frac{b}{m}\frac{dx}{dt} - \frac{k}{m} x$

Let $x_1 = x$ and $x_2 = \frac{dx}{dt}$. Then

$\frac{dx_1}{dt} = x_2, \qquad \frac{dx_2}{dt} = -\frac{k}{m} x_1 - \frac{b}{m} x_2, \qquad \text{i.e.}\quad \dot{\mathbf{x}} = A\mathbf{x}$

Eigenvalues and Eigenvectors
- "Eigen" is a German word, which roughly translates to "characteristic"
- For a mathematical transformation of some vector of variables:
  - An eigenvector of the transformation is a characteristic shape for that transformation
  - An eigenvalue is a corresponding magnitude for that shape
- A transformation may have several eigenvalues and eigenvectors
- Representing behaviors of transformations as a combination of eigenvectors is a form of data compression
- We will examine eigenvalues and eigenvectors in continuous dynamical systems as an example

An example
Consider solving an ordinary, linear differential equation:

$m\ddot{x} + b\dot{x} + kx = 0$

We solve by assuming a solution form $x = C e^{\lambda t}$, so that $\dot{x} = \lambda x$ and $\ddot{x} = \lambda^2 x$. Ignoring the trivial $x = 0$ solution, this gives

$\lambda^2 + \frac{b}{m}\lambda + \frac{k}{m} = 0, \qquad \lambda = \frac{-b \pm \sqrt{b^2 - 4mk}}{2m}$

which reduces to the problem of finding eigenvectors.
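As a sketch of the conversion to first-order form and the characteristic roots above, the Python snippet below builds the matrix $A$ for the mass-spring-damper system and checks that its eigenvalues match the roots of $\lambda^2 + (b/m)\lambda + (k/m) = 0$. The parameter values m, b, k are illustrative assumptions, not values from the lecture.

```python
import numpy as np

# Illustrative parameters (assumed, not from the lecture)
m, b, k = 1.0, 0.4, 2.0

# First-order form: x1 = x, x2 = dx/dt  =>  d/dt [x1, x2] = A [x1, x2]
A = np.array([[0.0,     1.0],
              [-k / m, -b / m]])

# Eigenvalues of A ...
eig_A = np.linalg.eigvals(A)

# ... should equal the roots of lambda^2 + (b/m) lambda + (k/m) = 0
roots = np.roots([1.0, b / m, k / m])

print("eigenvalues of A:", np.sort_complex(eig_A))
print("quadratic roots: ", np.sort_complex(roots))
```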

In first-order form
In dynamical systems, we write

$\dot{\mathbf{x}} = A\mathbf{x}, \qquad \mathbf{x} = \mathbf{c}\, e^{\lambda t} \quad\Rightarrow\quad \lambda \mathbf{x} = A\mathbf{x}$

Ignoring the trivial $\mathbf{x} = 0$ solution, this requires

$(A - \lambda I)\mathbf{x} = 0, \qquad \det(A - \lambda I) = 0$

This is the standard eigenvalue problem for A. Solutions $\lambda$ are the eigenvalues of the matrix (transformation) A. For a given $\lambda$, the solution for $\mathbf{x}$ in $\lambda\mathbf{x} = A\mathbf{x}$ is an eigenvector. Eigenvectors (shapes) represent modes of the characteristic (unforced) behavior of the system. Eigenvalues (magnitudes) are related to those shapes' durations through time.

Behold the wonder of Euler
Eigenvalues come in complex conjugate pairs. Thus positive real parts indicate growth, negative real parts indicate decay, and imaginary parts indicate the frequency of oscillation of the associated eigenvector (shape):

$e^{i\omega t} = \cos\omega t + i\sin\omega t, \qquad e^{(r+i\omega)t} = e^{rt}(\cos\omega t + i\sin\omega t), \qquad e^{(r \pm i\omega)t} \rightarrow e^{rt}\cos\omega t \ \text{for complex conjugate pairs}$

In summary
- For a transformation, eigenvectors are characteristic shapes, eigenvalues their characteristic magnitudes
- For dynamical systems, these describe the modes of behavior and their durations through time
- We can describe continuous linear dynamical systems with a matrix, via first-order form
- Eigenvectors of this matrix indicate one of several characteristic shapes of a dynamical system's evolution
- For the corresponding eigenvalues:
  - Positive real parts indicate that the shape grows exponentially
  - Negative real parts indicate that the shape dies off exponentially
  - Imaginary parts indicate the speed of oscillation around that shape (the natural frequency)

Attractors
In general, we can say that dynamical systems have transient behavior (that which dies out over time) and steady-state behavior. Any steady-state behavior is also known as an attractor of that system. Systems can also diverge (one or more of their state variables can go to infinity).

Three kinds of attractors
- Fixed points: an equilibrium value of the state vector
- Periodic attractors: a repeating sequence of state vector values
- Chaotic attractors: a sequence that never diverges, but never repeats (!?)
Attractors can also be stable or unstable.
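To make the growth/decay/oscillation reading concrete, here is a small sketch (same assumed m, b, k as in the earlier snippet) that classifies each eigenvalue of the mass-spring-damper matrix by its real and imaginary parts.

```python
import numpy as np

# Same illustrative mass-spring-damper matrix as above (assumed values)
m, b, k = 1.0, 0.4, 2.0
A = np.array([[0.0, 1.0], [-k / m, -b / m]])

for lam in np.linalg.eigvals(A):
    trend = "grows" if lam.real > 0 else "decays"
    print(f"lambda = {lam.real:.3f} {lam.imag:+.3f}j: "
          f"mode {trend} at rate {abs(lam.real):.3f}, "
          f"oscillates at {abs(lam.imag):.3f} rad/s")
```

Since both eigenvalues here have negative real parts, the origin is a stable fixed-point attractor for this particular system.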

Examining attractors
As an experiment, let's construct a matrix describing a dynamical system's behavior using the method of delays. This method gives us a non-analytical way of examining system behavior without having to have the system equations. We can treat either discrete or continuous systems with this method:

$X = [\,x_t \;\; x_{t-1} \;\; x_{t-2} \;\; \dots \;\; x_{t-m}\,]$

Singular value decomposition
Is a generalization of eigendecomposition (which we'll talk about in more detail later). Let's get the singular values $\sigma_i$ of X, then normalize them so they sum to one. The distribution indicates the complexity of the system dynamics. Let's take the entropy of the resulting distribution:

$\sigma_i' = \frac{\sigma_i}{\sum_j \sigma_j}, \qquad H = -\sum_i \sigma_i' \log \sigma_i'$

An Example
Let's consider a set of particles connected with nonlinear springs and dampers. We can think of this as a sort of particle swarm. Let's look at how H varies with the spring and damper strength.
[Figure: entropy H versus the log of the spring/damper strength]

Low
Motion in this figure is largely right to left. This is the case where the long-term behavior is for the particles to lock and behave like a single particle. Relative to the particles' center of mass, this is a fixed point.
[Figure: particle trajectories in the x-y plane, coalescing toward a single cluster]

Medium
Is the situation where the particles do not diverge, but do not coalesce. It is likely that this is a chaotic attractor (but I haven't technically proven that). We might call the behavior complex, emergent, or self-organized. We'll look a bit more at complexity measures.
[Figure: particle trajectories in the x-y plane, neither diverging nor coalescing]

Symbolic Dynamics
Let's assume that we are taking measurements of a dynamical system in discrete time, and that each measurement results in one symbol from an alphabet A, consisting of k possible symbols. The underlying system might be a discrete or continuous dynamical system, with or without stochastic elements. Note that we are brushing over details of stochastic processes at this point.
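Here is a minimal sketch of the delay-matrix construction and the singular-value entropy H described above. The test signal (a decaying oscillation plus a little noise) and the window length are assumptions made only for illustration.

```python
import numpy as np

def delay_matrix(x, m):
    """Stack m delayed copies of the scalar series x (method of delays)."""
    return np.array([x[i:len(x) - m + i] for i in range(m)])

def singular_value_entropy(X):
    """Entropy of the normalized singular-value distribution of X."""
    s = np.linalg.svd(X, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]                       # avoid log(0)
    return -(p * np.log2(p)).sum()

# Illustrative signal (assumed): a decaying oscillation plus a little noise
t = np.linspace(0, 20, 2000)
x = np.exp(-0.1 * t) * np.cos(2 * t) + 0.01 * np.random.randn(t.size)

print("singular-value entropy:", singular_value_entropy(delay_matrix(x, 10)))
```

A signal that settles onto a fixed point concentrates its singular values in one component (low H), while more complex motion spreads them out (higher H).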

Let's consider a symbolic dynamical system (Crutchfield and Shalizi) generating a sequence of symbols

$\dots S_{-2}\, S_{-1}\, S_0\, S_1\, S_2 \dots$

For a given time t, we will label the past and future sequences: $\overleftarrow{S}_t$ is the past and $\overrightarrow{S}_t$ is the future; $\overleftarrow{S}^L$ are the last L symbols and $\overrightarrow{S}^L$ are the next L symbols. And we define the notion of a stationary stochastic process: the system is stationary if the probability of any measurable future event sequence (taken from the possible set F) is independent of time,

$P(\overrightarrow{S}_t \in A \mid \overleftarrow{S}_t = \overleftarrow{s}) = P(\overrightarrow{S}_{t'} \in A \mid \overleftarrow{S}_{t'} = \overleftarrow{s}) \quad \text{for all } t \text{ and } t'$

Predicting the future
We want to look at previous symbols, and predict the probability distribution of future symbol sequences. We are going to partition the set of possible previous symbols such that all the elements in a given cell of this partition are matched to the same predicted distribution over the set of possible future sequences. If the function mapping a past history to a future distribution is $\eta$, past sequences $\overleftarrow{s}$ and $\overleftarrow{s}'$ are in the same partition cell if and only if $\eta(\overleftarrow{s}) = \eta(\overleftarrow{s}')$.

Effective states
We will call each cell in this partition an effective state of the underlying process, for a given prediction function $\eta$. We will call R the set of effective states induced by $\eta$.

Learning
We would like to learn the partition, and the predicted distributions, based on past sequences. Let's concentrate on getting the right partitions. We'd like to maximize the mutual information between the partition R and the possible sequences of future states:

$I(\overrightarrow{S}^L; R) = H(\overrightarrow{S}^L) - H(\overrightarrow{S}^L \mid R)$

Any prediction that is as good as one could do remembering all past states is called prescient:

$H(\overrightarrow{S}^L \mid R) = H(\overrightarrow{S}^L \mid \overleftarrow{S})$

Statistical Complexity
C(R) is the number of bits needed to represent the partition. Note that while this is computed in bits, and is based on a statistical model, it is a different sort of complexity measure than H. It is a sort of machine size.

Causal states
We will call the (unique) set of prescient states that minimizes statistical complexity the causal states of the system. Let's recap: this is the most efficient set of sets of previous symbols that predict the probability distribution of future sequences.
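The partition idea can be sketched empirically: estimate, for each length-L history observed in a symbol stream, the conditional distribution over the symbols that follow, then group histories whose distributions agree. The function below is a minimal illustration of that estimation step; the example sequence and the lengths are assumptions, and a real implementation would add the statistical test used to decide when two distributions count as the same.

```python
from collections import Counter, defaultdict

def future_distributions(symbols, past_len=2, future_len=1):
    """Empirical distribution of length-`future_len` futures that follow each
    length-`past_len` history in a symbol sequence."""
    counts = defaultdict(Counter)
    for i in range(past_len, len(symbols) - future_len + 1):
        past = tuple(symbols[i - past_len:i])
        future = tuple(symbols[i:i + future_len])
        counts[past][future] += 1
    return {past: {fut: c / sum(futs.values()) for fut, c in futs.items()}
            for past, futs in counts.items()}

# Illustrative binary sequence (assumed), e.g. from a noisy alternating process
seq = list("01010110010101010011")
for past, dist in sorted(future_distributions(seq).items()):
    print(past, dist)
```

Histories whose distributions pass a statistical test of equality would then be merged into the same effective state.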

But there is more
Given one causal state, and a symbol from the real process, we move to another causal state. We want to find those transitions as well. It turns out that this gives a deterministic dynamical system in the following sense: for a causal state and current symbol s, the machine moves to another particular causal state with probability 1. However, recall that the system we are modeling is stochastic, so the model is stochastic in the sense that the sequence of input symbols s is stochastic. Also recall that the causal states are mapped to probability distributions over the future states by the function $\eta$. Whew!

The system's $\epsilon$-machine
Is defined by the symbol set of the original symbolic dynamical system, that system's causal states, and the transition probability matrices $T^{(s)}$:

$T^{(s)}_{ij} = P(S_{t+1} = s,\; S'_{t+1} = \sigma_j \mid S'_t = \sigma_i)$

Markov Process
The causal states form a Markov process. That is, you only need to know the current state to completely determine the probability distribution over all possible future states. We also call this the Markov property.

Recurrent, Transient, and Synchronization States
In a Markov process, states are either:
- Recurrent: visited over and over again in an infinite loop
- Transient: visited once, and never returned to again
In an $\epsilon$-machine, transient states are also called synchronization states, since they represent the history of symbols you have to see before you can fix yourself into the appropriate recurrent state. Crutchfield's complexity measures will, in general, ignore synchronization states. We might also call a set of connected recurrent states an attractor of the process.

Complexity metrics
We need two numbers to characterize the complexity of the system, given the $\epsilon$-machine:
- C(R), the statistical complexity: the variable memory needed to represent the machine
- H, the entropy of the state transitions

Two kinds of predictable
This is rather profound! Weather that is wildly variable is predictable in its variability (high H); it is well treated with probabilistic models. Weather that is very periodic is very predictable (high C); it is well treated with deterministic models. Complex weather is neither of these things (complexity in this sense is characterized by bounded randomness and a relatively large machine needed to describe the dynamics), and it is hard to get a good model of either kind.
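As a small illustration of computing these two numbers from an $\epsilon$-machine, the sketch below uses a tiny hypothetical two-state machine whose transition matrices are invented for this example (they are not from the lecture): C is taken as the entropy of the stationary distribution over causal states, and H as the entropy rate of the symbol-labeled transitions.

```python
import numpy as np

# A tiny hypothetical epsilon-machine with two causal states.
# T[s][i, j] = P(emit symbol s and move to state j | currently in state i).
# These numbers are invented for illustration only.
T = {"0": np.array([[0.5, 0.0],
                    [0.0, 0.0]]),
     "1": np.array([[0.0, 0.5],
                    [1.0, 0.0]])}

# State-to-state transition matrix (summed over symbols) and its stationary distribution
M = sum(T.values())
vals, vecs = np.linalg.eig(M.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()

# Statistical complexity C: bits needed to specify the current causal state
C = -np.sum(pi * np.log2(pi))

# Entropy H of the state transitions (bits per symbol)
H = -sum(pi[i] * p * np.log2(p)
         for Ts in T.values() for i in range(len(pi)) for p in Ts[i] if p > 0)

print(f"C = {C:.3f} bits, H = {H:.3f} bits/symbol")
```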

Causal state splitting reconstruction (CSSR)
A somewhat exhaustive algorithm for finding a system's $\epsilon$-machine. We start by assuming only one causal state, and the largest possible entropy. It is very interesting to look at the complexity metrics inferred for various systems.

The CSSR algorithm
Given data from a system of symbolic dynamics:
- Start with one causal state and the assumption that symbols are uniformly randomly generated (maximum H)
- Test statistically to see if causal states should be added
- If so, add a state, compute the appropriate distributions and transition probabilities from the given data, and repeat
- If not, stop

Slightly more detail
- Set L = 0 and S = {null} (the null causal state)
- While L < L_max:
  - For each causal state in S:
    - Calculate the conditional probability distribution of all future state sequences of length L
    - For each history in the state:
      - Consider each sequence that consists of this history and one more previous character
      - Calculate the conditional probability distribution of all future state sequences of length L
      - Use a statistical test to see if this distribution is the same as that for any existing causal state
      - If the new history gives a distribution that is statistically the same as that of an existing causal state, add this history to that state
      - Else, create a new state that contains just this history
- Calculate the causal state transitions corresponding to any given symbol
I have simplified this terribly!

A CSSR Example
Consider the famous logistic equation

$X(t+1) = r\,X(t)\,(1 - X(t))$

This is the primary example of deterministic chaos. We convert it to a symbolic dynamical system by outputting 1 if X(t) > 0.5, and 0 otherwise.

CSSR gives an $\epsilon$-machine
For each value of r, and L_max = 6, these are plotted in the space of the two complexity measures C (machine size) and H (randomness). The phase transition occurs at the Feigenbaum number.
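The symbolization step for the logistic-map example can be sketched directly; the sequence it produces is exactly the kind of input CSSR would consume. The specific r values, sequence length, and seed point below are illustrative assumptions.

```python
def logistic_symbols(r, n=10000, x0=0.3, burn_in=1000):
    """Binary symbols from the logistic map: 1 if X(t) > 0.5, else 0."""
    x = x0
    out = []
    for t in range(n + burn_in):
        x = r * x * (1.0 - x)
        if t >= burn_in:
            out.append(1 if x > 0.5 else 0)
    return out

# Below the Feigenbaum point the sequence is periodic; well past it (r = 4)
# it looks essentially random. CSSR would be run on sequences like these.
for r in (3.5, 3.5699, 4.0):      # illustrative r values
    s = logistic_symbols(r)
    print(f"r = {r}: first 20 symbols {s[:20]}, fraction of 1s = {sum(s)/len(s):.3f}")
```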

At the phase transition
Adding more inference to CSSR (increasing L_max) just leads to larger and larger machine size (V is approximately C). This is the so-called edge of chaos. It also indicates a jump up Chomsky's hierarchy of grammars.

The Edge of Chaos
Is a phenomenon often discussed in the field of Complexity. It seems to indicate a region of system dynamics, bounded by simple and simply random behaviors, where interesting developmental or accidental patterns and phenomena occur in the system. It is what I was trying to capture with "medium".

Another study of the edge
Consider Kauffman's Random Boolean Networks:
- Recurrent networks (dynamical systems) with binary inputs/outputs, and random Boolean functions at the nodes
- Characterized by N (number of nodes) and K (connectivity)
- Started with some bit string, they settle towards one of (possibly many) attractors
[Diagram: N nodes, each computing a random Boolean function F(x) of K inputs]

Attractor Length
As a function of N and K:
- For K < 3 (ish), the length of attractors grows as sqrt(N)
- For K > 5 (ish), the length of attractors grows exponentially with N
- For K around 3, the length of attractors is sublinear in N

Number of distinct attractors
As a function of N and K:
- For K < 3 (ish), the number of attractors grows exponentially with N
- For K > 5 (ish), the number of attractors grows as a low-order polynomial of N
- For K around 3, the number of attractors grows sub-linearly in N

Stability of attractors
That is, whether small random perturbations return to a given attractor, or go to some other attractor:
- For K < 3 (ish), attractors are fairly unstable
- For K > 5, attractors are unstable
- For K around 3, attractors are stable
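To make the Random Boolean Network construction concrete, here is a minimal sketch that builds a random network with N nodes of connectivity K, iterates it synchronously from a random initial state, and reports the length of the attractor it falls into. The sizes and seeds are illustrative assumptions; measuring how the length scales with N and K would mean averaging over many networks and initial states.

```python
import random

def random_boolean_network(N, K, seed=0):
    """Random wiring and a random Boolean truth table for each node."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(N), K) for _ in range(N)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** K)] for _ in range(N)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronous update of all N nodes."""
    return tuple(tables[i][sum(state[j] << b for b, j in enumerate(inputs[i]))]
                 for i in range(len(state)))

def attractor_length(N=12, K=2, seed=0):
    """Run from a random initial state until a state repeats; return cycle length."""
    rng = random.Random(seed + 1)
    inputs, tables = random_boolean_network(N, K, seed)
    state = tuple(rng.randint(0, 1) for _ in range(N))
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return t - seen[state]

print("attractor length:", attractor_length())
```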

[Figure: dimensionless output entropy versus dimensionless input (NK) entropy, curves for several history lengths L]

Summary of this edge
- K < 3: many simple, unstable behaviors
- K > 5: few complicated, unstable behaviors
- K around 3: few medium-complicated, stable behaviors
This is another edge of chaos. But is it the same one?

Uniting Crutchfield's and Kauffman's Edges?
Procedure:
- Generate large numbers of RBNs, with various levels of ongoing perturbation (mutations of the output)
- Use CSSR to find $\epsilon$-machines for the results
- Find a unified method of examining the results

Dimensionless Entropy
Consider H/C, the random complexity relative to the machine complexity. We examine this for the input and the output of the RBNs:
- At the input, C is the number of bits necessary to describe the RBN, and H is the entropy of the mutations
- At the output, C and H are as given by CSSR
We are measuring the complexity of what we can infer, versus what is actually there.

Preliminary Results
[Figure: dimensionless output ($\epsilon$-machine) entropy versus dimensionless input (NK) entropy, one curve per connectivity K = 1 through 8]

Take Home Messages
Dynamical system behavior (including symbolic dynamics) can be characterized by (compressed into):
- Eigendecomposition (and similar)
- Attractor description
- And, in a broader sense, information-theoretic approaches, which can be characterized by Markov chains
Such examination reveals, among other things:
- Two distinct kinds of complexity: randomness and machine size
- The edge-of-chaos phenomenon
These remain active research topics.