Reconstruction. Reading for this lecture: Lecture Notes.

ɛM Reconstruction. Reading for this lecture: Lecture Notes.

The Learning Channel... ɛ-Machine of a Process: intrinsic representation!
Predictive (or causal) equivalence relation: $\overleftarrow{s} \sim \overleftarrow{s}\,' \iff \Pr(\overrightarrow{S} \mid \overleftarrow{S} = \overleftarrow{s}) = \Pr(\overrightarrow{S} \mid \overleftarrow{S} = \overleftarrow{s}\,')$.
Causal state: the set of pasts $\overleftarrow{s}, \overleftarrow{s}\,' \in \overleftarrow{\mathbf{S}}$ sharing the same morph $\Pr(\overrightarrow{S} \mid \overleftarrow{s}\,)$.
Set of causal states: $\mathcal{S} = \overleftarrow{\mathbf{S}}/\!\sim \; = \{\mathcal{S}_0, \mathcal{S}_1, \mathcal{S}_2, \ldots\}$.
Causal state map: $\epsilon : \overleftarrow{\mathbf{S}} \to \mathcal{S}$, with $\epsilon(\overleftarrow{s}\,) = \{\overleftarrow{s}\,' : \overleftarrow{s}\,' \sim \overleftarrow{s}\,\}$.
Causal state morph: $\Pr(\overrightarrow{S}^L \mid \mathcal{S})$.

The Learning Channel... Causal State Dynamic: state-to-state transitions
$\{T^{(s)}_{ij} : s \in \mathcal{A};\ i, j = 0, 1, \ldots, |\mathcal{S}|-1\}$, where
$T^{(s)}_{ij} = \Pr(\mathcal{S}_j, s \mid \mathcal{S}_i) = \Pr\!\big(\mathcal{S}' = \epsilon(\overleftarrow{s}s) \mid \mathcal{S} = \epsilon(\overleftarrow{s}\,)\big)$.

The Learning Channel... The ɛ-Machine of a Process: $M = \{\mathcal{S}, \{T^{(s)} : s \in \mathcal{A}\}\}$.
[Figure: example state-transition diagram showing transient states, recurrent states, and the unique start state.]
Unique start state: no measurements made, $\overleftarrow{s} = \lambda$; start state $\mathcal{S}_0 = [\lambda]$.
Start-state distribution: $\Pr(\mathcal{S}_0, \mathcal{S}_1, \mathcal{S}_2, \ldots) = (1, 0, 0, \ldots)$.

ɛM Reconstruction: any method to go from a process $P \sim \Pr(\overleftrightarrow{S})$ to its ɛM.
(1) Analytical: given a model, equations of motion, a description, ...
(2) Statistical inference: given samples of $P$:
  (i) Subtree reconstruction: time or spacetime data to ɛM
  (ii) State-splitting (CSSR): time or spacetime data to ɛM
  (iii) Spectral (eMSR): power spectra to ɛM
  (iv) Optimal Causal Inference: time or spacetime data to ɛM
  (v) Enumerative Bayesian Inference

How to reconstruct an ɛM: Subtree algorithm.
Given: word distributions $\Pr(s^D)$, $D = 1, 2, 3, \ldots$
Steps:
(1) Form the depth-$D$ parse tree.
(2) Calculate node-to-node transition probabilities.
(3) Causal states: find morphs $\Pr(s^L \mid s^K)$ as subtrees.
(4) Label tree nodes with morph (causal state) names.
(5) Extract state-to-state transitions from the parse tree.
(6) Assemble into the ɛM: $M = \{\mathcal{S}, \{T^{(s)} : s \in \mathcal{A}\}\}$.
Algorithm parameters: $D$, $L$, $K$. (A code sketch of steps (1)–(2) follows below.)
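
The following is a minimal Python sketch of steps (1)–(2), not taken from the lecture: it represents the depth-$D$ parse tree as a dictionary of word counts and turns counts into word probabilities. The function names and the dictionary representation are my own choices.

```python
from collections import defaultdict

def parse_tree_counts(data, D):
    """Build the depth-D parse tree as word counts.

    Every length-D window of the data stream is parsed from the root;
    each prefix of a window is a tree node, keyed by the word leading
    to it (the root is the empty word "")."""
    counts = defaultdict(int)
    n_windows = len(data) - D + 1              # number of length-D samples
    for i in range(n_windows):
        window = data[i:i + D]
        for length in range(1, D + 1):         # every prefix is a node
            counts[window[:length]] += 1
    counts[""] = n_windows                     # the root sees every window
    return dict(counts)

def word_probability(counts, w):
    """Pr(w) ~ (node count) / (total number of samples), as on the slides."""
    return counts.get(w, 0) / counts[""]

# Toy usage on a short binary string (numbers are only illustrative):
counts = parse_tree_counts("01101101101101", D=3)
print(word_probability(counts, "01"), word_probability(counts, "11"))
```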

How to reconstruct an ɛM... Form the parse-tree estimate of $\Pr(s^D)$.
Data stream: $s^M = \ldots$
[Figure: depth $D = 5$ parse tree grown from the data stream, starting from the start node at the root.]
Number of samples: $M - D$. History length: $K = 0, 1, 2, 3, \ldots$

How to reconstruct an ɛM... Form the parse-tree estimate of $\Pr(s^D)$.
Data stream: $s^M = \ldots$; total samples: $M - D \approx M$.
[Figure: depth $D = 5$ parse tree with word counts stored at each node.]
Store word counts at nodes. Probability of a node = probability of the word leading to that node: $\Pr(w) = \dfrac{\text{node count}}{M}$.

How to reconstruct an ɛM... Assume we have the correct word distribution: $\Pr(\cdot) = \Pr(s^D)$.
[Figure: parse tree with each node labeled by its word probability $\Pr(w)$.]

How to reconstruct an ɛM... Node-to-node transition probability: with $w' = ws$,
$\Pr(n \to n', s) = \Pr(n \to n') = \dfrac{\Pr(n')}{\Pr(n)} = \dfrac{\Pr(w')}{\Pr(w)} = \Pr(s \mid w)$.
[Figure: node $n$ reached by word $w$ and its child $n'$ reached by $w' = ws$; the edge is labeled $\Pr(s \mid w)$.]
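
As a small illustration (my own sketch, with hypothetical toy counts), the node-to-node transition probability is just a ratio of the counts stored at the two nodes:

```python
def transition_probability(counts, w, s):
    """Node-to-node transition Pr(n -> n', s) = Pr(ws)/Pr(w) = Pr(s | w),
    estimated from the word counts stored at the parse-tree nodes."""
    c_w = counts.get(w, 0)
    return counts.get(w + s, 0) / c_w if c_w else 0.0

# Hypothetical counts: of 12 windows, 8 start with '0' and 4 of those continue with '1'.
counts = {"": 12, "0": 8, "1": 4, "00": 4, "01": 4}
print(transition_probability(counts, "0", "1"))   # Pr(1 | 0) = 4/8 = 0.5
```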

How to reconstruct an ɛM... Find morphs $\Pr(s^L \mid s^K)$ as subtrees. Future: $L = 2$; past: $K = 1$.
Morph: $\Pr(S^2 \mid S^1)$ for a given one-symbol past $s^K$.
[Figure: parse tree with branch probabilities $1/2, 1/2$ at the root and $2/3, 1/3$ or $1/2, 1/2$ at deeper nodes; the depth-2 subtree below the node reached by the past is its morph.]
Given this past, the four length-2 futures have probabilities
$\tfrac{2}{3}\cdot\tfrac{2}{3} = \tfrac{4}{9}$, $\;\tfrac{2}{3}\cdot\tfrac{1}{3} = \tfrac{2}{9}$, $\;\tfrac{1}{3}\cdot\tfrac{1}{2} = \tfrac{1}{6}$, $\;\tfrac{1}{3}\cdot\tfrac{1}{2} = \tfrac{1}{6}$.

How to reconstruct an ɛM... Find morphs $\Pr(s^L \mid s^K)$ as subtrees. Future: $L = 2$; past: $K = 1$.
Morph: $\Pr(S^2 \mid S^1)$ for the other one-symbol past $s^K$.
Given that past, the four length-2 futures have probabilities
$\tfrac{1}{2}\cdot\tfrac{2}{3} = \tfrac{1}{3}$, $\;\tfrac{1}{2}\cdot\tfrac{1}{3} = \tfrac{1}{6}$, $\;\tfrac{1}{2}\cdot\tfrac{1}{2} = \tfrac{1}{4}$, $\;\tfrac{1}{2}\cdot\tfrac{1}{2} = \tfrac{1}{4}$.
[Figure: the corresponding depth-2 subtree of the parse tree.]
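
A hedged sketch of step (3): given the word counts, the morph of the node reached by a past is the conditional distribution over length-$L$ futures. It assumes the tree depth satisfies $D \ge K + L$; the function name, the toy counts, and the assignment of symbol labels to branches are mine, chosen only so the output reproduces the probabilities worked out above.

```python
from itertools import product

def estimate_morph(counts, past, L, alphabet=("0", "1")):
    """Morph of the node reached by `past`: the distribution Pr(future | past)
    over all length-L futures, read off the parse-tree word counts."""
    c_past = counts.get(past, 0)
    futures = ("".join(f) for f in product(alphabet, repeat=L))
    return {f: (counts.get(past + f, 0) / c_past if c_past else 0.0)
            for f in futures}

# Hypothetical counts whose morph matches the slide's 4/9, 2/9, 1/6, 1/6:
counts = {"": 36, "1": 18, "10": 12, "11": 6,
          "100": 8, "101": 4, "110": 3, "111": 3}
print(estimate_morph(counts, "1", L=2))
# {'00': 0.444..., '01': 0.222..., '10': 0.166..., '11': 0.166...}
```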

How to reconstruct an ɛM... Set of distinct morphs:
Morph B: first branch pair $(2/3, 1/3)$; Morph A: first branch pair $(1/2, 1/2)$.
[Figure: the two distinct depth-2 subtrees, with deeper branch pairs $(2/3, 1/3)$ and $(1/2, 1/2)$.]
Set of causal states = set of distinct morphs: $\mathcal{S} = \{A, B\}$.
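
One way to implement "set of causal states = set of distinct morphs" on finite data is to group pasts whose estimated morphs agree within a tolerance. This is my own sketch (it ignores transient states and uses a simple tolerance rule); the example morphs are hypothetical, loosely modeled on the two distinct morphs above.

```python
def group_by_morph(morphs, tol=1e-6):
    """Group pasts with (approximately) identical morphs into candidate causal states.

    `morphs` maps each past to its morph, a dict future -> probability.
    Returns a list of (representative morph, member pasts) pairs."""
    states = []
    for past, m in morphs.items():
        for rep, members in states:
            if all(abs(m[f] - rep.get(f, 0.0)) < tol for f in m):
                members.append(past)       # same morph: same causal state
                break
        else:
            states.append((dict(m), [past]))
    return states

# Hypothetical length-1 morphs (the symbol-to-morph assignment is made up):
morphs = {"0": {"0": 0.5, "1": 0.5}, "1": {"0": 2/3, "1": 1/3}}
print(group_by_morph(morphs))   # two candidate states, one per distinct morph
```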

How to reconstruct an ɛM... Causal-state transitions? Label the tree nodes with their morph (causal state) names: $\mathcal{S} = \{A, B\}$.
[Figure: parse tree with each node labeled A or B according to its morph.]

How to reconstruct an ɛM... Form the ɛM. Causal states: $\mathcal{S} = \{A, B\}$.
[Figure: two-state machine with states A and B.]
Start state $\sim$ top (root) tree node.

How to reconstruct an ɛM... Form the ɛM: causal-state transitions from the node-to-node transitions.
[Figure: two-state machine with transition probabilities $1/2, 1/2$ out of A and $2/3, 1/3$ out of B.]
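
For steps (5)–(6), a sketch of assembling labeled transition matrices $T^{(s)}$ from the word counts and a past-to-state assignment. It assumes the causal state reached after emitting $s$ is determined by the last $K$ symbols (a finite-history approximation), and all names are my own.

```python
import numpy as np

def transition_matrices(counts, state_of, K, alphabet=("0", "1")):
    """Estimate T[s][i, j] = Pr(go to state j and emit s | in state i).

    `state_of` maps every length-K word to a state index 0..n-1; the state
    reached after emitting s from past w is taken to be state_of of the
    length-K suffix of w + s."""
    n = max(state_of.values()) + 1
    T = {s: np.zeros((n, n)) for s in alphabet}
    weight = np.zeros(n)                       # total count seen in each state
    for past, i in state_of.items():
        c_past = counts.get(past, 0)
        if c_past == 0:
            continue
        weight[i] += c_past
        for s in alphabet:
            j = state_of[(past + s)[-K:]]      # state reached after seeing s
            T[s][i, j] += counts.get(past + s, 0)
    for s in alphabet:                         # normalize rows by state weight
        T[s] = np.divide(T[s], weight[:, None],
                         out=np.zeros_like(T[s]), where=weight[:, None] > 0)
    return T
```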

How to reconstruct an ɛM: Subtree algorithm.
Given: word distributions $\Pr(s^D)$, $D = 1, 2, 3, \ldots$
Steps:
(1) Form the depth-$D$ parse tree.
(2) Calculate node-to-node transition probabilities.
(3) Causal states: find morphs $\Pr(s^L \mid s^K)$ as subtrees.
(4) Label tree nodes with morph (causal state) names.
(5) Extract state-to-state transitions from the parse tree.
(6) Assemble into the ɛM: $M = \{\mathcal{S}, \{T^{(s)} : s \in \mathcal{A}\}\}$.
Algorithm parameters: $D$, $L$, $K$.

How to reconstruct an ɛM... Example processes:
1. Period-1
2. Fair Coin
3. Biased Coin
4. Period-2
5. Golden Mean Process
6. Even Process

Examples (back to the Prediction Game): Period-1 (a single symbol repeated forever).
[Figure: depth $D = 5$ parse tree, a single chain of nodes.]

Examples (back to the Prediction Game): Period-1... Parse tree of depth $D = 5$; morph at $L = 2$.
[Figure: the single depth-2 subtree.]

Examples (back to the Prediction Game)... Period-1...
Space of histories: a single point. One future morph.
Support: the single repeated-symbol future, $\{s^{+}\}$.
Distribution: $\Pr(\overrightarrow{S}^L = s^L \mid \overleftarrow{s}^K) = 1$.

Examples (back to the Prediction Game)... Period-1... ɛM: $M = \{\mathcal{S}, \{T^{(s)} : s \in \mathcal{A}\}\}$.
$\mathcal{S} = \{\mathcal{S}_0\}$: a single causal state containing every past.
Transitions: the $1 \times 1$ matrices $T^{(s)} = (1)$ for the repeated symbol and $(0)$ for the other.

Examples (back to the Prediction Game)... Period-1...
Causal state distribution: $p_\mathcal{S} = (1)$.
Entropy rate: $h_\mu = 0$ bits per symbol.
Statistical complexity: $C_\mu = 0$ bits.

Examples (back to the Prediction Game)... Fair Coin...
[Figure: depth $D = 5$ parse tree; every branch has probability $1/2$.]

Examples (back to the Prediction Game)... Fair Coin... Parse tree of depth $D = 5$; future morph at $L = 2$.
[Figure: every depth-2 subtree has all branch probabilities $1/2$.]

Examples (back to the Prediction Game)... Fair Coin...
Space of histories: all length-$K$ pasts, $\overleftarrow{S}^K = \mathcal{A}^K$.
One future morph. Support: all of $\mathcal{A}^L$. Distribution: $\Pr(\overrightarrow{S}^L \mid \overleftarrow{s}\,) = 2^{-L}$.
Call it state A.

Examples (back to the Prediction Game)... Fair Coin... Label tree nodes with state names.
[Figure: parse tree with every node labeled A; all branch probabilities $1/2$.]

Examples (back to the Prediction Game)... Fair Coin... ɛM: $M = \{\mathcal{S}, \{T^{(s)} : s \in \mathcal{A}\}\}$.
States: $\mathcal{S} = \{A\}$, the single state containing every past.
Transitions: $T^{(0)} = \big(\tfrac12\big)$, $T^{(1)} = \big(\tfrac12\big)$.

Examples (back to the Prediction Game)... Fair Coin...
Causal state distribution: $p_\mathcal{S} = (1)$.
Entropy rate: $h_\mu = 1$ bit per symbol.
Statistical complexity: $C_\mu = 0$ bits.
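
For reference, the values quoted on these example slides follow from the ɛ-machine via the standard formulas (not restated on the slide), using the asymptotic causal-state distribution $p_\mathcal{S}$:
\[
h_\mu = -\sum_{\sigma \in \mathcal{S}} p_\sigma \sum_{s \in \mathcal{A}} \Pr(s \mid \sigma)\, \log_2 \Pr(s \mid \sigma),
\qquad
C_\mu = -\sum_{\sigma \in \mathcal{S}} p_\sigma \log_2 p_\sigma .
\]
Fair Coin check: a single state with $\Pr(0 \mid A) = \Pr(1 \mid A) = \tfrac12$ gives $h_\mu = -\big(\tfrac12 \log_2 \tfrac12 + \tfrac12 \log_2 \tfrac12\big) = 1$ bit per symbol and $C_\mu = -1 \cdot \log_2 1 = 0$ bits.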

Examples... Biased Coin with bias $p = \tfrac{2}{3}$.
[Figure: depth $D = 5$ parse tree; every branch pair has probabilities $1/3$ and $2/3$.]

Examples (back to the Prediction Game)... Biased Coin... Parse tree of depth $D = 5$; future morph at $L = 2$.
[Figure: every depth-2 subtree has branch probabilities $1/3$ and $2/3$.]

Examples (back to the Prediction Game)... Biased Coin...
Space of histories: all length-$K$ pasts, $\overleftarrow{S}^K = \mathcal{A}^K$.
Single future morph. Support: all of $\mathcal{A}^L$.
Distribution: $\Pr(\overrightarrow{S}^L = s^L \mid \overleftarrow{s}\,) = p^{n} (1-p)^{L-n}$, where $n$ is the number of occurrences in $s^L$ of the symbol with probability $p$.
Call it state A.

Examples (back to the Prediction Game)... Biased Coin... Label tree nodes with state names.
[Figure: parse tree with every node labeled A; branch probabilities $1/3$ and $2/3$.]

Examples (back to the Prediction Game)... Biased Coin... ɛM: $M = \{\mathcal{S}, \{T^{(s)} : s \in \mathcal{A}\}\}$.
States: $\mathcal{S} = \{A\}$, the single state containing every past.
Transitions: the $1 \times 1$ matrices $T^{(s)} = \big(\tfrac13\big)$ and $\big(\tfrac23\big)$ for the two symbols.

Examples (back to the Prediction Game)... Biased Coin...
Causal state distribution: $p_\mathcal{S} = (1)$.
Entropy rate: $h_\mu = H(\tfrac23) \approx 0.918$ bits per symbol.
Statistical complexity: $C_\mu = 0$ bits.
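
A quick check of the quoted entropy rate, the binary entropy of the bias:
\[
H(\tfrac23) = -\tfrac23 \log_2 \tfrac23 - \tfrac13 \log_2 \tfrac13 = \log_2 3 - \tfrac23 \approx 1.585 - 0.667 \approx 0.918 \ \text{bits}.
\]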

Examples (back to the Prediction Game)... Period-2 Process...
[Figure: depth $D = 5$ parse tree; only the root branches probabilistically, with probabilities $1/2, 1/2$; the rest is deterministic.]

Examples (back to the Prediction Game)... Period-2 Process... Parse tree of depth $D = 5$; future morphs at $L = 2$.
[Figure: three distinct depth-2 subtrees, labeled $\mathcal{S}_0$, $\mathcal{S}_1$, $\mathcal{S}_2$; the root branches with probabilities $1/2, 1/2$.]

Examples (back to the Prediction Game)... Period-2 Process...
Space of histories: two points, the two phases of the alternation, $\overleftarrow{\mathbf{S}} = \{\ldots 0101, \ \ldots 1010\}$.
Future morphs: $\{\overrightarrow{S} \mid \lambda\} = \{0101\ldots, 1010\ldots\}$; after any nonempty past there is a single phase-locked future, e.g. $\{\overrightarrow{S} \mid \ldots 0\} = \{1010\ldots\}$ and $\{\overrightarrow{S} \mid \ldots 1\} = \{0101\ldots\}$.

Examples (back to the Prediction Game)... Period-2 Process...
Morph distributions: $\Pr(0101\ldots \mid \lambda) = \tfrac12$ and $\Pr(1010\ldots \mid \lambda) = \tfrac12$; conditioned on any nonempty past, the phase-consistent future has probability $1$ and the other has probability $0$.

Examples (back to the Prediction Game)... Period-2 Process... Label tree nodes.
[Figure: parse tree with nodes labeled $\mathcal{S}_0$, $\mathcal{S}_1$, $\mathcal{S}_2$; the root branches with probabilities $1/2, 1/2$.]

Examples (back to the Prediction Game)... Period-2 Process... $M = \{\mathcal{S}, \{T^{(s)} : s \in \mathcal{A}\}\}$.
States: $\mathcal{S} = \{\mathcal{S}_0 = \{\ldots 0101, \ldots 1010\},\ \mathcal{S}_1,\ \mathcal{S}_2\}$, where $\mathcal{S}_1$ and $\mathcal{S}_2$ each contain one of the two phases.
Transitions: $T^{(0)}$ and $T^{(1)}$ are $3 \times 3$ matrices: from the start state $\mathcal{S}_0$ each symbol occurs with probability $\tfrac12$ and leads to one of the recurrent states; thereafter the two recurrent states alternate deterministically.
[Figure: three-state machine $\mathcal{S}_0$, $\mathcal{S}_1$, $\mathcal{S}_2$.]

Examples (back to the Prediction Game)... Period-2 Process...
Causal state distribution: $p_\mathcal{S} = \big(0, \tfrac12, \tfrac12\big)$.
Entropy rate: $h_\mu = 0$ bits per symbol.
Statistical complexity: $C_\mu = 1$ bit.
[Figure: three-state machine with the asymptotic weights on the recurrent states.]
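
These two values can be checked directly from the recurrent part of the machine: each recurrent state emits its next symbol with probability $1$, so
\[
h_\mu = \tfrac12 \cdot 0 + \tfrac12 \cdot 0 = 0 \ \text{bits per symbol},
\qquad
C_\mu = H\big(\tfrac12, \tfrac12\big) = 1 \ \text{bit}.
\]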

Examples... Golden Mean Process: topological reconstruction (only the support of the word distribution).
[Figure: depth $D = 5$ parse tree; the two distinct morphs at $L = 2$ are labeled A and B.]

Examples... Golden Mean Process: topological ɛ-machine, $M = \{\mathcal{S}, \{T^{(s)} : s \in \mathcal{A}\}\}$.
Topological causal states: $\mathcal{S} = \{A, B\}$.
Transitions (0–1 connection matrices): $T^{(0)} = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$, $T^{(1)} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$.
[Figure: two-state diagram with states A and B.]

Examples... Golden Mean Process: probabilistic transitions:
$T^{(0)} = \begin{pmatrix} \tfrac12 & 0 \\ 1 & 0 \end{pmatrix}$, $T^{(1)} = \begin{pmatrix} 0 & \tfrac12 \\ 0 & 0 \end{pmatrix}$.
Causal state distribution: $p_\mathcal{S} = \big(\tfrac23, \tfrac13\big)$.
Entropy rate: $h_\mu = \tfrac23$ bits per symbol.
Statistical complexity: $C_\mu = H(\tfrac23) \approx 0.918$ bits.
[Figure: two-state diagram with edge probabilities $1/2$, $1/2$, and $1$.]
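
A short check of these values from the transitions above (state A branches $\tfrac12, \tfrac12$; state B emits its single allowed symbol with probability $1$):
\[
p_A = \tfrac12 p_A + p_B, \quad p_B = \tfrac12 p_A, \quad p_A + p_B = 1 \;\Rightarrow\; (p_A, p_B) = \big(\tfrac23, \tfrac13\big),
\]
\[
h_\mu = p_A H\big(\tfrac12\big) + p_B \cdot 0 = \tfrac23 \ \text{bits per symbol},
\qquad
C_\mu = H\big(\tfrac23, \tfrac13\big) \approx 0.918 \ \text{bits}.
\]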

Examples... Golden Mean Process: probabilistic reconstruction (capture the full word distribution), $M = \{\mathcal{S}, \{T^{(s)} : s \in \mathcal{A}\}\}$.
Causal states: $\mathcal{S} = \{A, B, C\}$, now including a transient start state.
Transitions: $T^{(0)}$ and $T^{(1)}$ are $3 \times 3$ matrices; from the start state the two symbols occur with probabilities $\tfrac23$ and $\tfrac13$, and the recurrent part is the two-state machine above (branches $\tfrac12, \tfrac12$ from one state, $1$ from the other).
[Figure: three-state diagram.]

Examples... Even Process: topological reconstruction.
[Figure: depth $D = 5$ parse tree; the distinct future morphs at $L = 2$ are labeled A, B, and C.]

Examples... Even Process: topological reconstruction. Label tree nodes:
[Figure: depth $D = 5$ parse tree with nodes labeled A, B, and C according to their morphs.]

Examples... Even Process: topological reconstruction.
Topological states: $\mathcal{S} = \{A, B, C\}$.
Topological transitions: $T^{(0)}$ and $T^{(1)}$ are $3 \times 3$ 0–1 connection matrices read off the labeled tree.

Examples... Even Process: topological ɛ-machine, $M = \{\mathcal{S}, \{T^{(s)} : s \in \mathcal{A}\}\}$.
States: $\mathcal{S} = \{A, B, C\}$.
Transitions: $T^{(0)}$ and $T^{(1)}$ are $3 \times 3$ matrices with entries drawn from $\{0, \tfrac12, 1\}$.
[Figure: three-state diagram with edges labeled $1/2$.]

Examples... Even Process: probabilistic reconstruction, $M = \{\mathcal{S}, \{T^{(s)} : s \in \mathcal{A}\}\}$.
States: $\mathcal{S} = \{A, B, C, D\}$.
Transitions: $T^{(0)}$ and $T^{(1)}$ are $4 \times 4$ matrices.
[Figure: four-state diagram with edge probabilities including $2/3$, $1/3$, $3/4$, $1/4$, and $1/2$.]
Entropy rate: $h_\mu = \tfrac23$ bits per symbol.
Statistical complexity: $C_\mu = H(\tfrac23) \approx 0.918$ bits.

Reading for next lecture: CMR article CMPPSS.
Homework: ɛM reconstruction for the GMP, EP, and RRXOR processes.
Helpful? Tree & morph paper at: http://csc.ucdavis.edu/~cmg/compmech/tutorials.htm