Graphs and Trees: cycles detection and stream segmentation. Lorenzo Cioni Dipartimento di Informatica Largo Pontecorvo 3 Pisa


Graphs and Trees: cycles detection and stream segmentation
Lorenzo Cioni
Dipartimento di Informatica, Largo Pontecorvo 3, Pisa
lcioni@di.unipi.it

Main topics of the talk
Two algorithms:
- segmentation of a stream of data
  - application: syllabification of written Italian
- cycles detection in directed (even complete) graphs
  - application: graphic editor for graph editing with cycles detection, designed for defining the directed graphs used in causal loop graphs (system dynamics)

Trees and segmentation
- RB-tree (paper of mine, 1996): syllabification of written Italian
  - binary tree;
  - recursive calls;
  - vowels/consonants.
- m-ary tree, under development
  - m direct descendants of each node;
  - recursive calls;
  - alphabet of m distinct symbols.

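Trees and segmentation 2
Lexicon
- $A$: alphabet, $|A| = m$
- $S$: strings on $A$, of length $n$, potentially unbounded, $S = \{ s_i^{+} \mid s_i \in A \}$
- $M$: set of markers, $M \cap A = \emptyset$
- $R$: rules, a fixed and finite set, $R = \{ r_1, \dots, r_k \}$
- $W$: weights, a fixed and finite set, $W = \{ w_1, \dots, w_k \}$
- $r_i \leftrightarrow w_i$: one-to-one correspondence
Lexicon/Syntax
- $A = \{ \alpha_0, \dots, \alpha_{m-1} \}$
- $\beta = m$: base of the numbering scheme
- $\hat{W} = \{ w_j \mid w_j = \sum_{i=0}^{k} \beta^{i} l_i \}$, $k \in [0, \dots, K]$, $K$ the maximum number of elements of a rule
- $W \subseteq \hat{W}$ identifies the set of weights to which a rule corresponds
- m-ary tree built dynamically, no permanent data structures
- rules as weights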

Trees and segmentation 3
- input stream $n_1 n_2 n_3 \dots n_i \dots n_k \dots$ with $n_i \in A$
- $\hat{w} = \sum_i n_i \beta^{i}$, where $n_i$ is a coding of each input symbol (cf. the examples)
- if $\hat{w} \in W$ then it identifies a rule $r_j$
- formally: we define a running sum $\hat{w} = \sum_i n_i \beta^{i}$ and, at each step, we check whether $\hat{w} \in W$ or not. In the former case we apply the corresponding rule $r_j$; otherwise we step to the following input symbol and update $\hat{w}$.

Trees and segmentation 4
- evaluation of $\hat{w}$
- check if $\hat{w} \in W$, cost $O(1)$, two cases:
  - no: we increment $k$ (the pointer on the input stream) by one ($k = k+1$) and evaluate a new value of $\hat{w}$ as $\hat{w} = \hat{w} + n_k \beta^{k}$
  - yes: we apply the corresponding rule $r_i$ and reset to zero both the pointer $k$ and the running sum $\hat{w}$
- the application of a rule is equivalent to the insertion of:
  - one marker: after (segmentation) or inside (syllabification)
  - two markers, one before and one after (extraction)
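
The incremental evaluation of the running sum can be sketched as below. This is only a minimal sketch: it assumes that the first symbol after a reset is multiplied by β^0, the next by β^1, and so on, which agrees with the weights quoted in the examples of the talk (b coded as 0, a coded as 1). The names `running_weights` and `coding` are hypothetical and do not come from the slides.

```python
def running_weights(symbols, coding):
    """Yield the running sum after each input symbol.

    Assumption: the k-th symbol (k starting at 0) contributes n_k * beta**k,
    where beta is the size of the alphabet.
    """
    beta = len(coding)              # base of the numbering scheme, beta = |A|
    w = 0                           # running sum, reset to zero at the start
    for k, symbol in enumerate(symbols):
        w += coding[symbol] * beta ** k   # w = w + n_k * beta^k
        yield w


coding = {"b": 0, "a": 1}           # coding used in the two-symbol examples
print(list(running_weights("aba", coding)))  # [1, 1, 5]
```

Each update costs one multiplication and one addition, so scanning a stream with step-forward only stays linear in the number of symbols.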

Properties of the rules
- Uniformity: we have uniformity if every rule involves the same number of input symbols; otherwise we have a value $i_{max}$ that defines the longest rule[s]
- Completeness: a set of rules is complete if there is a rule for every value of the running sum from $0$ to $\beta^{i} - 1$
- There is no relation between the two properties
- Uniformity and completeness translate into properties of the structure of the dynamically built m-ary tree

Execution of the rules
if $\hat{w}$ does not identify a rule $r_i$ then
  k = k+1
  update running sum with the current input symbol
else
  apply $r_i$
  reset pointer k and running sum
apply = insert, in the proper position[s], one or two markers

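Example 1
- $A = \{a, b\}$, $b \to 0$, $a \to 1$, $M = \{*\}$, $S = \{ s_i^{+} \mid s_i \in A \}$
- completeness and uniformity
- $r_0 = bb \leftrightarrow w_0 = 0$; $r_1 = ab \leftrightarrow w_1 = 1$; $r_2 = ba \leftrightarrow w_2 = 2$; $r_3 = aa \leftrightarrow w_3 = 3$
- $R = \{r_0, r_1, r_2, r_3\} \leftrightarrow W = \{w_0, w_1, w_2, w_3\}$
- $w_i \leftrightarrow$ a rule that defines where to insert the marker[s]
  - $r_0, r_3$: marker inside: aa → a*a, bb → b*b
  - $r_1, r_2$: marker after: ab → ab*, ba → ba*
- abbbaaab... → ab*b*ba*aab*...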

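Example 1 continued
- $A = \{a, b\}$, $b \to 0$, $a \to 1$, $M = \{*\}$, $S = \{ s_i^{+} \mid s_i \in A \}$
- no completeness but uniformity
- $r_0 = bb \leftrightarrow w_0 = 0$; $r_3 = aa \leftrightarrow w_3 = 3$
- $r_0, r_3$: marker before and after
- abbbaaab... → ab*bb**aa*ab...
- no completeness and no uniformity: conflicting rules? used in syllabification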
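
The two uniform rule sets above can be replayed with a short sketch. Several things here are assumptions rather than the talk's own design: rules are stored as a dictionary from weight to marker offsets inside the matched chunk, the stream is consumed in fixed chunks of the uniform rule length, and a chunk whose weight has no rule is copied unchanged. The names `weight`, `apply_rule` and `segment` are hypothetical.

```python
def weight(chunk, coding):
    """Weight of a chunk: sum of n_i * beta**i, first symbol least significant."""
    beta = len(coding)
    return sum(coding[s] * beta ** i for i, s in enumerate(chunk))


def apply_rule(chunk, offsets, marker):
    """Insert the marker before the positions listed in offsets (0..len(chunk))."""
    pieces = []
    for pos in range(len(chunk) + 1):
        if pos in offsets:
            pieces.append(marker)
        if pos < len(chunk):
            pieces.append(chunk[pos])
    return "".join(pieces)


def segment(stream, rules, coding, length, marker="*"):
    """Segment a stream with uniform rules of the given length (assumed layout)."""
    out = []
    for start in range(0, len(stream), length):
        chunk = stream[start:start + length]
        # An incomplete trailing chunk, or a weight without a rule, is left as is.
        offsets = rules.get(weight(chunk, coding), ()) if len(chunk) == length else ()
        out.append(apply_rule(chunk, offsets, marker))
    return "".join(out)


coding = {"b": 0, "a": 1}
complete = {0: (1,), 1: (2,), 2: (2,), 3: (1,)}   # inside for bb/aa, after for ab/ba
partial = {0: (0, 2), 3: (0, 2)}                  # before and after for bb/aa only
print(segment("abbbaaab", complete, coding, 2))   # ab*b*ba*aab*
print(segment("abbbaaab", partial, coding, 2))    # ab*bb**aa*ab
```

Both printed strings agree with the segmentations shown on the slides. Non-uniform rule sets would need the lookahead and stepback mechanism, which this sketch does not attempt.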

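Example 1 continued
- $r = ab \leftrightarrow w = 1$; $r = aba \leftrightarrow w = 5$: conflict, the former covers the latter
- no completeness, no uniformity (cf. figure): $r = ab \leftrightarrow w = 1$; $r = bab \leftrightarrow w = 2$; $r = bba \leftrightarrow w = 4$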
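
The weights quoted above, and the conflict between ab and aba, can be checked with a few lines. The sketch assumes that "covers" means "is a proper prefix of", so that the running sum reaches the shorter rule before the longer one can ever be completed; that reading is an interpretation, and the function names are hypothetical.

```python
def weight(rule, coding):
    """Weight of a rule: sum of n_i * beta**i, first symbol least significant."""
    beta = len(coding)
    return sum(coding[s] * beta ** i for i, s in enumerate(rule))


def conflicts(rules):
    """Pairs (shorter, longer) where the shorter rule is a proper prefix of the longer."""
    return [(shorter, longer)
            for shorter in rules
            for longer in rules
            if shorter != longer and longer.startswith(shorter)]


coding = {"b": 0, "a": 1}
for rule in ["ab", "aba", "bab", "bba"]:
    print(rule, weight(rule, coding))       # ab 1, aba 5, bab 2, bba 4
print(conflicts(["ab", "aba"]))             # [('ab', 'aba')]
print(conflicts(["ab", "bab", "bba"]))      # []
```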

Example 2: syllabification
Main features:
- marker inside
- no uniformity
- completeness
- lookahead & stepback vs. stepforward
Two versions:
- simple version: small alphabet, complex rules;
- complex version: big alphabet, simpler rules

Example 2 continued
Simple version:
- alphabet $A$: vowels $V$ (with/without stress), consonants $C$, separators $S$ (spaces, tabs) and punctuation marks $P$; $A = V \cup C \cup S \cup P$
- markers $M = \{-\}$
- weights $W = \{w_i\}$
- rules $R = \{r_i\}$
- $r_i \leftrightarrow w_i$
Rules:
- rules are applied with a lookahead and a stepback, with a recursive call
- rules are complex since we have to discriminate many cases within a rule

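Example 2 continued
Some rules of the simple version:
- $v \in V$, $v \to 1$; $c \in C$, $c \to 0$
- $w_{10} = 10 \leftrightarrow r_{10} = cvcv$ → cv-cv
- $w_9 = 9 \leftrightarrow r_9 = v c_1 c_2 v$, some cases:
  - if $c_1 = c_2$ then $v c_1$-$c_2 v$
  - if $c_1 = n$ then $v c_1$-$c_2 v$
  - if $c_1 = s$ then $v$-$c_1 c_2 v$
Examples (on the slide, the substrings analysed with recursive calls are in blue; the number over each arrow is the weight of the rule applied):
- pallone →(9) pal-lone →(10) pal-lo-ne
- asmatico →(9) a-smatico →(10) a-sma-tico →(10) a-sma-ti-co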
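
With the coding v ↦ 1, c ↦ 0 and base 2, the pattern cvcv evaluates to 10 and the pattern vccv to 9. The sketch below checks this on windows taken from the two example words and applies the listed sub-cases. It is only an illustration under stated assumptions: vowels are the five unstressed letters, windows are exactly four letters long, the lookahead, stepback and recursive calls are left out, and the function names are hypothetical.

```python
VOWELS = set("aeiou")   # assumption: stressed vowels are not handled here


def pattern_weight(window):
    """Weight of a window with v -> 1, c -> 0, base 2, first symbol least significant."""
    return sum((1 if ch in VOWELS else 0) * 2 ** i for i, ch in enumerate(window))


def apply_r9(window):
    """Sub-cases of the v c1 c2 v rule listed on the slide; None if no case applies."""
    v1, c1, c2, v2 = window
    if c1 == c2 or c1 == "n":
        return f"{v1}{c1}-{c2}{v2}"      # split between the two consonants
    if c1 == "s":
        return f"{v1}-{c1}{c2}{v2}"      # split before the consonant group
    return None                          # cases not shown on the slide


def apply_r10(window):
    """The cvcv rule: cv-cv."""
    return window[:2] + "-" + window[2:]


print(pattern_weight("allo"), apply_r9("allo"))    # 9 al-lo
print(pattern_weight("asma"), apply_r9("asma"))    # 9 a-sma
print(pattern_weight("lone"), apply_r10("lone"))   # 10 lo-ne
```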

Example 2 continued
Complex version (under development):
- alphabet $A' = A \cup V_n \cup C_2 \cup C_3 \dots$
- alphabet $A'$ contains the old alphabet plus groups of vowels ($V_n$) and of two ($C_2$) or three ($C_3$) consonants, and all the relevant subgroups
- markers $M = \{-\}$
- weights $W = \{w_i\}$, rules $R = \{r_i\}$, $r_i \leftrightarrow w_i$
- $\beta = |A'|$
Rules:
- rules are applied with a lookahead and a stepback, with a recursive call
- we have more rules, but each rule is simpler since it contains a small set of sub-cases

Segmentation: computational complexity
- Constant number of rules
- Input stream of $n$ symbols. Two cases:
  - $n$ fixed (not really relevant for complexity);
  - $n$ not known a priori.
- Only stepforward: complexity $O(n)$
- With lookahead and stepback:
  - complexity $> O(n)$
  - complexity $< O(n^2)$
  - complexity $O(n \log n)$??

Cycles detection in directed graphs
Data structures
- $G = (N, E)$ directed graph, $|N| = n$, $|E| = m$
- $A$ alphabet, $A = \{a_i\}$, $i = 1, \dots, n$
- $S = \hat{A} = \{ a_1 \dots a_k \mid a_i \in A,\ k \geq 2 \}$
- $a_i \leftrightarrow n_i \in N$, $(a_l, a_k) \leftrightarrow (n_l, n_k) \in E$
Three main steps
- mapping: from graphs to strings
- searching substrings with given properties (every substring identifies a cycle)
- pruning of equivalent substrings (duplicate cycles)

Cycles detection in directed graphs 2
Data structures
- $G = (N, E)$ directed graph, $|N| = n$, $|E| = m$
- $m = n(n-1)$ at most
- number of cycles (complete graph): (formula not recovered)
First example
- five cycles: ABBA, BCCB, ACCA, ABBCCA (anticlockwise), ACCBBA (clockwise)
- CBBAAC and ACCBBA are equivalent through a left shift
- BCCAAB and ABBCCA are equivalent through a left shift

Cycles detection in directed graphs 3
Second example
- three cycles: ABBA, BCCB, ABBCCA
- casual (random) mapping: ABCBCABCBA
- lexicographic mapping (by node order and by forward star): ABBABCCACB
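
The lexicographic mapping of the second example can be reproduced with a short sketch. It assumes one-character node identifiers and a forward-star representation given as a dictionary from each node to its list of successors; the arcs AB, BA, BC, CA, CB are read off the string on the slide, and the function name is hypothetical.

```python
def lexicographic_mapping(forward_star):
    """Map a directed graph to a string: one two-symbol substring per arc,
    arcs ordered by source node and then by target node."""
    return "".join(source + target
                   for source in sorted(forward_star)
                   for target in sorted(forward_star[source]))


# Second example of the talk, as a forward star (node -> successors).
graph = {"A": ["B"], "B": ["A", "C"], "C": ["A", "B"]}
print(lexicographic_mapping(graph))   # ABBABCCACB
```

For a complete graph of n nodes the string has 2n(n-1) symbols, which is where the quadratic cost of the mapping step comes from.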

Cycles detection in directed graphs 4
Second example continued: cycles detection in ABBABCCACB (on the slide: head node in blue, tail node in black, subcycles in italics)
- ABBABCCACB: ABBA cycle, 1 scan; ABBCCB subcycle (premature closure), discard; ABBCCA cycle, 2 scans
- ABBABCCACB: BAAB duplicate cycle, 2 scans
- ABBABCCACB: BCCAAB duplicate cycle, 2 scans; BCCB cycle, 1 scan
- ABBABCCACB: CAABBA subcycle, discard; CAABBC duplicate cycle, 2 scans
- and so on...

Operations
Mapping
- $g: G = (N, E) \to s \in S = \{ s = a a^{+} \mid a \in A \}$
- uniform vs. non-uniform number of symbols for each node identifier
- modes of mapping: casual (random), lexicographic
Searching
- number of scans: maximum $n - 1$, with $n = |N|$
- the search operations produce a list of substrings, one for each (duplicate) cycle
Pruning (or removal of duplicate cycles)
- ex-ante
- ex-post

Search operations
- contiguity_check (cc): finds couples of contiguous arcs (pairwise underlined on the slide); returns a boolean
- premature_closure (pc): finds subcycles (i.e. cyclical substrings) to be discarded (velvet on the slide); condition: head ≠ tail
- closure (cl): finds a cycle as represented by a cyclic substring without pc; condition: head (blue) = tail (italic blue)
Examples: ABB CCACB, ABB CCBCA, ABB CCA
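
The three search operations can be illustrated with a small recursive search over the arcs of the mapped string. This is a swapped-in technique, not the scan-based procedure of the talk, whose details the slides do not give. It assumes one-character node identifiers and reads premature closure as "the chain comes back to an already visited node other than its starting head". The names `arcs_of` and `find_cycles` are hypothetical.

```python
def arcs_of(mapped):
    """Split the mapped string into two-symbol arcs."""
    return [mapped[i:i + 2] for i in range(0, len(mapped), 2)]


def find_cycles(mapped):
    """List every cyclic substring, duplicates (shifted copies) included."""
    arcs = arcs_of(mapped)
    found = []

    def extend(chain):
        head, tail = chain[0][0], chain[-1][1]
        if tail == head:                      # closure: head = tail
            found.append("".join(chain))
            return
        visited = {arc[0] for arc in chain}   # nodes already left by the chain
        for arc in arcs:
            if arc[0] != tail:                # contiguity check fails
                continue
            if arc[1] != head and arc[1] in visited:
                continue                      # premature closure: discard
            extend(chain + [arc])

    for arc in arcs:
        extend([arc])
    return found


print(find_cycles("ABBABCCACB"))
# ['ABBA', 'ABBCCA', 'BAAB', 'BCCAAB', 'BCCB', 'CAABBC', 'CBBC']
```

On the second example the list starts with the same substrings, in the same order, as the walkthrough on the slide; duplicates are left in place for the pruning step.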

Pruning
- removes duplicate cycles, i.e. substrings of equal length that are equivalent under left/right shift
- ex-ante
  - during the search phase
  - does not insert equivalent substrings in the list, usually one single growing vector $C[]$
- ex-post
  - after the search phase
  - two main cases:
    - one single unordered vector $C[]$ of $k$ elements
    - a set of vectors $C_i[]$, $i = 2, \dots, n$, each of $k_i$ elements, some even empty
  - all substrings contained in the list created at the end of the search phase are examined and equivalent substrings are removed

Pruning 2
ex-ante
- to each arc (of two distinct nodes) corresponds a string $s_i \in S$, so that to a chain of arcs corresponds a chain of strings $s_1 s_2 s_3 \dots s_k$
- $C_1[]$ is either empty (at the very beginning) or contains the substrings $\leftrightarrow$ cycles defined up to a certain point
- we have an iterated algorithm: at step $i$ ($i = 1, \dots, k$) we check if $s_i \in C_i[]$ and define $C_{i+1}[] = \{ c_j \in C_i[] \mid c_j \supset s_i \}$
- if the cycle contains $k$ arcs, the corresponding string is of $2k$ symbols and $C_k[] \neq \emptyset$, then the current substring corresponds to a duplicate cycle and must be discarded
- if, at any step, we have $C_i[] = \emptyset$, then the current string (if it satisfies the aforesaid properties) corresponds to a new cycle and must be added to $C_1[]$

Pruning 3
ex-post
- first case: one single unordered vector $C[]$ of $k$ elements
  - we have a loop for $i = 1$ to $k-1$; at each step we compare $C[i]$ with all the shifted versions of $C[j]$ for $j = i+1$ to $k$
  - every matching string must be removed from $C[]$, whose number of elements decreases
  - at the end $C[]$ contains the residual substrings, each of which corresponds to a cycle of the given graph
- second case: a set of vectors $C_i[]$, $i = 2, \dots, n$, each of $k_i$ elements, some even empty
  - we repeat the algorithm designed for the single vector case for all the non-empty vectors $C_i[]$
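
The single-vector case of ex-post pruning can be sketched as follows. Two assumptions are made: a shift moves whole arcs, i.e. two symbols at a time with one-character node identifiers, and the sketch keeps the first copy of each cycle in a new list instead of deleting entries from C[] in place. The names `shifts` and `prune_ex_post` are hypothetical.

```python
def shifts(cycle):
    """All left shifts of a cyclic substring by whole arcs (two symbols)."""
    return {cycle[i:] + cycle[:i] for i in range(0, len(cycle), 2)}


def prune_ex_post(cycles):
    """Keep one representative per class of shift-equivalent substrings."""
    kept = []
    for candidate in cycles:
        duplicate = any(len(candidate) == len(k) and candidate in shifts(k)
                        for k in kept)
        if not duplicate:
            kept.append(candidate)
    return kept


found = ["ABBA", "ABBCCA", "BAAB", "BCCAAB", "BCCB", "CAABBC", "CBBC"]
print(prune_ex_post(found))              # ['ABBA', 'ABBCCA', 'BCCB']
print("CBBAAC" in shifts("ACCBBA"))      # True
```

The pairwise comparison is quadratic in the number of substrings found, in line with the remark that ex-post pruning is very costly. Mapping each substring to a canonical shift and using a set would be a cheaper variant, but that is not what the slides describe.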

Applications of the algorithm
- cycles detection
- connectivity?
- more?

Complexity of the algorithm
- mapping (complete graph of $n$ nodes): $O(n^2)$
- searching (complete graph of $n$ nodes): $O(n^{n-1})$??
- pruning:
  - ex-ante, costly
  - ex-post, very costly

Concluding remarks
so many things to do and so short a time...

Game Over... Thank you for your attention