N-Synchronous Kahn Networks A Relaxed Model of Synchrony for Real-Time Systems

Similar documents
N-Synchronous Kahn Networks

Relaxing Synchronous Composition with Clock Abstraction

Relaxation du parallélisme synchrone pour les systèmes asynchrones communicant par buffers : réseaux de Kahn n-synchrones

Integer Clocks and Local Time Scales

Scheduling and Buffer Sizing of n-synchronous Systems

A Multi-Periodic Synchronous Data-Flow Language

Synchronous Modelling of Complex Systems

Analysis of Periodic Clock Relations in Polychronous Systems

Equalities and Uninterpreted Functions. Chapter 3. Decision Procedures. An Algorithmic Point of View. Revision 1.0

Automatic Code Generation

Fundamental Algorithms for System Modeling, Analysis, and Optimization

A Multi-Periodic Synchronous Data-Flow Language

Hybrid Systems Modeling Challenges caused by CPS

Course Runtime Verification

Design of Real-Time Software

Formal Specification and Verification of Task Time Constraints for Real-Time Systems

Alan Bundy. Automated Reasoning LTL Model Checking

Design of Embedded Systems: Models, Validation and Synthesis (EE 249) Lecture 9

Clocks in Kahn Process Networks

CPE100: Digital Logic Design I

The Quasi-Synchronous Approach to Distributed Control Systems

Causality Interfaces and Compositional Causality Analysis

BOOLEAN ALGEBRA INTRODUCTION SUBSETS

A Logical Basis for Component-Based Systems Engineering *

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2)

Solving Fuzzy PERT Using Gradual Real Numbers

Digital Control of Electric Drives

Introduction to Artificial Intelligence Propositional Logic & SAT Solving. UIUC CS 440 / ECE 448 Professor: Eyal Amir Spring Semester 2010

Using the Principles of Synchronous Languages in Discrete-event and Continuous-time Models

NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1

Static Program Analysis using Abstract Interpretation

Outline. policies for the first part. with some potential answers... MCS 260 Lecture 10.0 Introduction to Computer Science Jan Verschelde, 9 July 2014

A Hybrid Synchronous Language with Hierarchical Automata

Sequential Equivalence Checking without State Space Traversal

First-Order Theorem Proving and Vampire

Logic BIST. Sungho Kang Yonsei University

Semantic Equivalences and the. Verification of Infinite-State Systems 1 c 2004 Richard Mayr

EE 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization Fall 2016

Dynamic Semantics. Dynamic Semantics. Operational Semantics Axiomatic Semantics Denotational Semantic. Operational Semantics

GRAPHITE Two Years After First Lessons Learned From Real-World Polyhedral Compilation

VLSI Signal Processing

Streams and Coalgebra Lecture 2

Chapter 3. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 3 <1>

Towards Algorithmic Synthesis of Synchronization for Shared-Memory Concurrent Programs

Can an Operation Both Update the State and Return a Meaningful Value in the Asynchronous PRAM Model?

Compiling Techniques

Chapter I : Basic Notions

Computational Logic. Davide Martinenghi. Spring Free University of Bozen-Bolzano. Computational Logic Davide Martinenghi (1/30)

Chapter 5. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 5 <1>

Circuits & Numbers. Symbolic Numbers 28/11/ /11/2012 Digital Synchronous Circuit Digital Number Digital Algebra Digital Function

ELEC Digital Logic Circuits Fall 2014 Sequential Circuits (Chapter 6) Finite State Machines (Ch. 7-10)

Automata-Theoretic Model Checking of Reactive Systems

Composition for Component-Based Modeling

EE40 Lec 15. Logic Synthesis and Sequential Logic Circuits

Symbolic Model Checking with ROBDDs

Testability. Shaahin Hessabi. Sharif University of Technology. Adapted from the presentation prepared by book authors.

Normalization by Evaluation

EECS 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization

VOTE : Group Editors Analyzing Tool

Introduction to mcrl2

The Earlier the Better: A Theory of Timed Actor Interfaces

a = a i 2 i a = All such series are automatically convergent with respect to the standard norm, but note that this representation is not unique: i<0

Overview. Discrete Event Systems Verification of Finite Automata. What can finite automata be used for? What can finite automata be used for?

Time and Schedulability Analysis of Stateflow Models

THE UNIVERSITY OF MICHIGAN. Faster Static Timing Analysis via Bus Compression

Mapping to Parallel Hardware

Adequate set of connectives, logic gates and circuits

Fuzzy Answer Set semantics for Residuated Logic programs

Shared Memory vs Message Passing

Concurrent Models of Computation

Lecture 4. Algebra, continued Section 2: Lattices and Boolean algebras

Pairing Transitive Closure and Reduction to Efficiently Reason about Partially Ordered Events

Cyclic Dependencies in Modular Performance Analysis

Scalable and Accurate Verification of Data Flow Systems. Cesare Tinelli The University of Iowa

Chapter 7. Sequential Circuits Registers, Counters, RAM

Reduced Ordered Binary Decision Diagrams

Simulation & Modeling Event-Oriented Simulations

VOTE : Group Editors Analyzing Tool

Safety Analysis versus Type Inference

Timo Latvala. March 7, 2004

Time. Today. l Physical clocks l Logical clocks

For smaller NRE cost For faster time to market For smaller high-volume manufacturing cost For higher performance

Exercises Solutions. Automation IEA, LTH. Chapter 2 Manufacturing and process systems. Chapter 5 Discrete manufacturing problems

Automata-Theoretic LTL Model-Checking

What s category theory, anyway? Dedicated to the memory of Dietmar Schumacher ( )

2. Accelerated Computations

Embedded Systems Development

Interfaces for Stream Processing Systems

ClC (X ) : X ω X } C. (11)

Some Results on (Synchronous) Kleene Algebra with Tests

Matching Logic: Syntax and Semantics

Semantic Foundation of the Tagged Signal Model

Probabilistic Reasoning. (Mostly using Bayesian Networks)

Ordering Constraints over Feature Trees

A note on monotone real circuits

Formal Methods for Java

Complexity. Complexity Theory Lecture 3. Decidability and Complexity. Complexity Classes

The Discrete EVent System specification (DEVS) formalism

Digital Electronics. Part A

What is Binary? Digital Systems and Information Representation. An Example. Physical Representation. Boolean Algebra

Transcription:

N-Synchronous Kahn Networks A Relaxed Model of Synchrony for Real-Time Systems Albert Cohen 1, Marc Duranton 2, Christine Eisenbeis 1, Claire Pagetti 1,4, Florence Plateau 3 and Marc Pouzet 3 POPL, Charleston January 12th, 2006 1: INRIA, Orsay France 2: PHILIPS NatLabs, Eindhoven, The Netherlands 3: University of Paris-Sud, Orsay, France 4: ONERA, Toulouse, France 1

Context Video intensive applications (TV boxes, medical systems) tera-operations per second (on pixel components) is typical Ensure three properties: hard real-time and performance and safety Implementations Today: specific hardware (ASIC) Evolution: multi-clock asynchronous architectures, mixing hardware/software because of costs, variability of supported algorithms Design and programming tools General purpose languages and compilers are not well adapted Kahn Networks (KN) is common practice in the field 2

A typical example: the Downscaler high definition (HD) standard definition (SD) 1920 1080 pixels 720 480 horizontal filter: number of pixels in a line from 1920 pixels downto 720 pixels, vertical filter: number of lines from 1080 downto 480 reorder HD input SD output hf vf Real-Time Constraints the input and output processes: 30Hz. HD pixels arrive at 30 1920 1080 = 62, 208, 000Hz SD pixels at 30 720 480 = 10, 368, 000Hz (6 times slower) 3

Our Goal Define a programming language dedicated to those Kahn Networks providing: a modular functional description a modular description and programming of the timing requirements with a semantics and a compiler which statically guarantees four important properties. E.g., on the downscaler: a proof that, according to worst-case time conditions, the frame and pixel rate will be sustained a proof that the system executes in bounded memory an evaluation of the delay introduced by the downscaler in the video processing chain, i.e., the delay before the output process starts receiving pixels an evaluation of memory requirements, to store data within the processes, and to buffer the stream produced by the vertical filter in front of the output process 4

What about Synchronous Languages? dedicated to hard real-time critical systems generation of custom hardware and software system with static guarantees (real-time and resource constraints) static analysis, verification and testing tools synchrony is ensured by a type-system for clocks: a clock calculus it allows to program Synchronous Kahn Networks 5

But too restrictive for our video applications 0 0 1 1 x when when y z?? + t 0 1 streams must be synchronous when composed (y+z is rejected by the clock calculus) y z adding buffer code (by hand) is feasible but hard and error-prone can we compute it automatically and obtain regular synchronous code? we need a relaxed model of synchrony and relaxed clock calculus 6

N-Synchronous Kahn Networks propose a programming model based on a relaxed notion of synchrony yet compilable to some synchronous code allows to compose programs as soon as they can be made synchronous through the insertion of a bounded buffer y 1 1 0 0 1 1 0 0 1 1 0 0 1 1 buff[1] z 1 0 1 0 1 0 1 0 1 0 1 0 1 0 based on the use of infinite ultimately periodic clocks a precedence relation between clocks ck 1 <: ck 2 7

Infinite Ultimately Periodic Clocks Introduce Q 2 as the set of infinite periodic binary words. Coincides with rational 2-adic numbers (01) = 01 01 01 01 01 01 01 01 01... 0(1101) = 0 1101 1101 1101 1101 1101 1101 1101... 1 stands for the presence of an event 0 for its absence Definition: w ::= u(v) where u (0 + 1) and v (0 + 1) + 8

Causality order and Synchronisability Precedence relation: w 1 w 2 1s from w 1 arrive before 1s from w 2 is a partial order which abstracts the causality order between streams (Q 2,,, ) is a lattice Synchronisability: Two infinite periodic binary words w and w are synchronisable, noted w w iff it exists d N such that w 0 d w and d N such that w 0 d w. 1. 11(01) and (10) are synchronisable 2. (010) and (10) are not synchronisable since there are too much reads or too much writes (infinite buffers) Subsumption (sub-typing): w 1 <: w 2 w 1 w 2 w 1 w 2 9

Multi-sampled Systems (clock sampling) c on w denotes a subsampled clock. c ::= w c on w w (0 + 1) ω c on w is the clock obtained in advancing in w at the pace of clock c. E.g., 1(10) on (01) = (0100). base 1 1 1 1 1 1 1 1 1 1... (1) p 1 1 1 0 1 0 1 0 1 0 1... 1(10) base on p 1 1 1 0 1 0 1 0 1 0 1... 1(10) p 2 0 1 0 1 0 1... (01) (base on p 1 ) on p 2 0 1 0 0 0 1 0 0 0 1... (0100) Proposition 1 (on-associativity) Let w 1, w 2 and w 3 be three infinite binary words. Then w 1 on (w 2 on w 3 ) = (w 1 on w 2 ) on w 3. 10

Computing with Periodic Clocks In the case of infinite periodic binary words, precedence relation, synchronizability, equality can be decided in bounded time Synchronizability: Two infinite periodic binary words u(v) and u (v ) are synchronizable, noted u(v) u (v v ) iff they have the same rate, i.e., 1 v 1 = v v. Equality: Let w = u(v) and w = u (v ). We can always write w = a(b) and w = a (b ) with a = a = max( u, u ) and b = b = lcm( v, v ) Delays and Buffers: can be computed practically after normalisation The set of infinite periodic binary words is closed by sampling (on), delaying (pre) and point-wise application of a boolean operation w ::= u(v) c ::= w c on w not c pre c... 11

A Synchronous Data-flow Kernel Reminiscent to Lustre and Lucid Synchrone receive a standard (strictly) synchronous semantics e ::= x i e where x = e e(e) e fby e e when pe merge pe e e d ::= let node f(x) = e d; d dp ::= period p = pe dp; dp pe ::= p w pe on pe not pe... fby is the initialized delay (or register) when is the sampling operator allowing to extract a sub-stream from a stream merge is the combination operator allowing to combine two complementary streams (with opposite clocks) 12

The Downscaler let period c = (10100100) p let node hf p = o where rec o2 = 0 fby p and o3 = 0 fby o2 f when o and o4 = 0 fby o3 and o5 = 0 fby o4 and o6 = 0 fby o5 1 0 1 0 0 1 0 0 and o = f (p,o2,o3,o4,o5,o6) when c val hf : a -> a on c let node main i = o where rec t = hf i and (i1,i2,i3,i4,i5,i6) = reorder t and o = vf (i1,i2,i3,i4,i5,i6) The clock signature of each process abstracts its timing behavior Clock calculus: what is the clock signature of main? 13

Clock calculus σ ::= α.σ ct ct ::= ct ct ct ct ck ck ::= ck on pe α H ::= [x 1 : σ 1,..., x m : σ m ] P ::= [p 1 : w 1,..., p n : w n ] Judgment: P, H e : ct expression e receive clock type ct in environments H and P 14

From 0-Synchrony to N-Synchrony 0-Synchrony: 0-synchrony can be checked using standard Milner-type system [ICFP 96,Emsoft 03] only need clock equality (and clocks are not necessarily periodic) H, P e 1 : ck H, P e 2 : ck N-Synchrony: H, P op(e 1, e 2 ) : ck extend the basic clock calculus of a synchronous language with a sub-typing rule: P, H e : ck on w w <: w (SUB) P, H e : ck on w defines the synchronisation points where buffer code should be inserted 15

An Example 0 0 1 1 0 0 1 1 1 buffer x when when y z + t x when when z + t 0 1 0 1 let node f(x) = t where t = (x when (1100)) + (x when (10)) (1100) and (10) can be synchronized using a buffer of size 1. Indeed: P, H x when (1100)) : α on (1100) (1100) <: (10) P, H x when (1100) : α on (10) Finaly, f : α.α α on (01) and the 1-buffer buffer[1]: α.α on (1100) α on (1010) 16

Translation into 0-Synchronous Programs (TRANSLATION) P, H e : ck on w e w <: w P, H e : ck on w buffer w,w (e ) buffer w,w : α.α on w α on w Theorem (correctness): Any well clocked (N-synchronous) program can be transformed into a 0-synchronous program This is a constructive proof: sub-typing points define where some buffering is necessary The translated program can be checked with the basic clock calculus 17

Algorithm (constraint resolution) The sub-typing system is not deterministic and is thus not an algorithm Standard solution: apply the (SUB) rule at every program construction generate a set of sub-typing constraints {ck 1 <: ck 1,..., ck n <: ck n} rely on a resolution algorithm Resolution amounts to rewriting (simplifying) the set of constraints until we get the empty set Theorem (completeness): For any expression e, and for any period and clock environments P and H, if e has an admissible clock type in P, H for the relaxed clock calculus, then the type inference algorithm computes a clock ct verifying P, H e : ct 18

Clock sampling (gating) vs Buffering In general, there exists an infinite number of solutions. f g + f : α 1.α 1 α 1 on (1100) g : α 2.α 2 α 2 on (10) (+) : α 3.α 3 α 3 α 3 We have to solve the constraint: α 1 on (1100) <: α 3 and α 2 on (10) <: α 3 Clock sampling: (unification) find v 1 and v 2 st α 1 = α 4 on v 1 and α 2 = α 4 on v 2 Solution: α 4 on (10111) on (1100) = α 4 on (10100) = α 3 α 4 on (11110) on (10) = α 4 on (10100) = α 3 No buffering but the base clock must be faster Buffering: (sub-typing) α 1 = α 4 and α 2 = α 4, α 4 on (1100) <: α 3 and α 4 on (10) <: α 3 α 4 on (1100) (10) = α 4 on (10) = α 3 A 1-buffer is needed 19

(EQUAL) S h α 1 S on V(w 1, w 2 )/α 1 α 2 on V (w 1, w 2 )/α 2 if S = S + I 1 + I 2, i I 1 = {α 1 on w 1 <: ck 1 } or {ck 1 <: α 1 on w 1 } I 2 = {α 2 on w 2 <: ck 2 } or {ck 2 <: α 2 on w 2 }, α 1 α 2 w 1 w 2 (CYCLE) S + {α on w 1 <: α on w 2 } S if w 1 <: w 2 (SUP) (INF) (CUT) (FORK) (JOIN) (SUBST) S + {α on w 1 <: α, α on w 2 <: α } S + {α on w 1 w 2 <: α } if w 1 w 2 S + {α <: α on w 1, α <: α on w 2 } S + {α <: α on w 1 w 2 } if w 1 w 2 S + {α 1 on w <: α 2 on w} S + {α 1 <: α 3 on u 1, α 3 on u 2 <: α 2 } if α 1 α 2, u 1 = U max (w), u 2 = U min (w) S + {α <: α 1 on w, α <: α 2 on w} S[α 3 on u on w/α] + {α 3 on u <: α 1, α 3 on u <: α 2 } if α 1 α 2, u = U min (w) S + {α 1 on w <: α, α 2 on w <: α} S[α 3 on u on w/α] + {α 1 <: α 3 on u, α 2 <: α 3 on u} if α 1 α 2, u = U max (w) S I S[ck/α] if I = {α <: ck} or {ck <: α}, α / FV(ck) 20

Conclusion N-Synchronous Kahn Networks introduce a relaxed model of synchrony extended synchrononous framework: automatic generation of the synchronous buffers which are semantically (as defined by Kahn) guaranteed correct a relaxed clock calculus where buffering corresponds to sub-typing N-synchronous programs can be translated into 0-synchronous ones extend the expressive power of synchronous languages, yet allowing to do compilation, simulation and verification after translation Lustre programs are 0-Synchronous Kahn Networks Kahn networks are -Synchronous Kahn Networks 21

Current and Future Work algebraic characterisation and symbolic representation of clocks implementation inside an existing synchronous language optimization and architecture considerations (buffer size, locality, clock gating) forgetting buffering mechanism, periodic clocks are useful for dealing with several implementations of the same function: parallel vs pipelined vs serial computation going from a version to an other changes clocks how to prove them to be equivalent (or to derive them from the same program) according to resource constraints? 22