Programming Model Application SW Translator Productivity SW model in BIP Functional Correctness D-Finder System model in BIP DOL Performance Efficiency/ Correctness Correctness Source2Source Distributed BIP Code Generator Deployed SW Multicore Platform Compiler C++ Code BIP Engine/Linux Simulation/ Model Checking
BIP Global state model Distributed centralized implementation Decentralized implementations O V E R V I E W Architecture transformations Discussion 4
BIP Basic Concepts Layered component model Priorities (conflict resolution) Interactions (collaboration) B E H A V I O R Composition operation parameterized by glue IN12, PR12 PR1 IN1 PR12 IN12 PR2 IN2 PR1 PR2 PR12 IN1 IN2 IN12
BIP Basic Concepts Priorities: Interactions: sr1r2r3 s r1 r2 r3 s r1 r2 r3 Sender Receiver1 Receiver2 Receiver3 Rendezvous
BIP Basic Concepts Priorities: xπxy for x,xy Interactions Interactions: s + sr1 + sr2 + sr3 + sr1r2 + sr2r3 + sr1r3 + sr1r2r3 s r1 r2 r3 s r1 r2 r3 Sender Receiver1 Receiver2 Receiver3 Broadcast
BIP Basic Concepts Priorities: xπxy for x,xy Interactions Interactions: s + sr1r2r3 s r1 r2 r3 s r1 r2 r3 Sender Receiver1 Receiver2 Receiver3 Atomic Broadcast
BIP Basic Concepts Priorities: xπxy for x,xy Interactions Interactions: s + sr1 + sr1r2 + sr1r2r3 s r1 r2 r3 s r1 r2 r3 Sender Receiver1 Receiver2 Receiver3 Causal Chain
BIP Basic Concepts: Semantics a set of atomic components {B i } i=1..n where B i =(Q i, 2 Pi, i ) a set of interactions γ priorities π γ ( Q i ) γ π γ (B 1,., B n ) Interactions a γ i [1,n] q i - a P i i q i (q 1,., q n ) - a γ (q 1,., q n ) where q I =q I if a P i = Priorities q- a γ q ( q- b γ a π q b ) q- a π q
BIP The execution Engine busy execute init stable choose ready filter
BIP Global state model Distributed centralized implementation Decentralized implementations O V E R V I E W Architecture transformations Discussion 12
Distributed Implementation BIP is based on: Global state semantics, defined by operational semantics rules, implemented by the Engine Atomic multiparty interactions, e.g. by rendezvous or broadcast Translate BIP models into distributed models Collection of independent components intrinsically concurrent - No global state Separate interaction from internal computation of components Point to point communication by asynchronous message passing Correctness by construction that is, the initial BIP model is observationally equivalent to the implementation Approach: BIP Partial state BIP Distributed BIP
Centralized Distributed Implementation The Principle Priorities: π Interactions: γ B 1 B 2 B n Priorities: π Interactions: γ B 1 B 2 B n Engine Oracle B 1 B 2 B n
Distributed Implementation Global vs. Partial State Models a Priorities: π Interactions: γ b c d Priorities: π Interactions: γ a b c d a,f a b,f b c,f c d,f d a β β β β f a b f b c f c d f d (a) Global State Model (b) Partial State Model Broadcast γ = a+ab+ac+ad+abc+abd+acd+abcd, with maximal progress. (a): only abcd is possible. (b): arbitrary desynchronization may occur. Rendezvous γ = ab+bc+cd and priority abπbc, cdπbc. (a): only (bc) is ω is possible (b): it is possible to reach a state from which bc never occurs e.g. ab(f a cd f b f d ab f c ) ω.
Distributed Implementation Partial state semantics q 1 q 2 p, f q 1 p q 2 β,f States are global or partial β-transitions interleave From any state q a unique global state is reached by application of β-transitions a? ab? abd? abcd β q A q B q C q D β q A β q A q B q A q B q D β Objective: Safe and efficient execution from partial states a γ i [1,n] q i - a P i i q i O(q 1,., q n ;a) (q 1,., q n ) - a γ (q 1,., q n ) where q I =q I if a P i =
Distributed Implementation Oracles x 2 y 1 b (a π b) Ideal Oracle Knowledgebased Oracle a more precise global state x 2 y 1 x 1 y 1 b (a π b) Dynamic Oracle O( y 1 ;a) x 2 y 1 x 1 y 1 x 2 y 2 x 1 y 2 ParallelismOverhead Static Oracle b (a π b) O( y 1 ;a) = false Lazy Oracle
Distributed Implementation Asynchronous MP Model a b?a!{a,b}?b!{a,b}!{a,b} a b?a?b β, f a β, f b β, f a β, f b Partial State Model Message Passing Model Before reaching a ready state, the set of the enabled ports is sent to the Engine From a ready state, await notification from the Engine indicating the selected port
Distributed Implementation Example (Forte 08)
BIP Global state model Distributed centralized implementation Decentralized implementations O V E R V I E W Architecture transformations Discussion 20
Decentralized Distributed Implementation The principle B b A a c C B!b?b A!a?a Engine!c?c C
Distributed Implementation Implementing Connectors!b?b!a?a Engine!c?c!b?b!b?b!a?a!c?c!a?a!c?c
Centralized solution Conflict Resolution I1 I2 I3 A B C D I4 I5 I6 I1 I2 I3 A B C D I4 I5 I6
Centralized solution Conflict Resolution I1 I2 I3 A B C D I4 I5 I6 A B C D I2 I6
Centralized solution Conflict Resolution I1 I2 I3 A B C D I4 I5 I6 alpha core I1 alpha core I2 alpha core I3 A alpha B C D core alpha alpha core core alpha core alpha core alpha core alpha core I4 I5 I6
Centralized solution Conflict Resolution I1 I2 I3 A B C D I4 I5 I6 Distributed Independent Set of Conflicting Interactions I1 I2 I3 A B C D I4 I5 I6 Distributed Independent Set of Conflicting Interactions
Centralized solution Conflict Resolution I1 I2 I3 A B C D I4 I5 I6 Distributed Clique of non Conflicting Interactions I1 I2 I3 A B C D I4 I5 I6 Distributed Clique of non Conflicting Interactions
Decentralized Solution I1 I2 I3 A B C D I4 I5 I6 Distributed Graph Matching (edges not sharing a common vertex) Protocol Protocol Protocol A B C D Protocol
BIP Global state model Distributed centralized implementation Decentralized implementations O V E R V I E W Architecture transformations Discussion 29
Architecture Transformations Partitioning Monolithic Code Generation
Architecture Transformations Composite to Monolithic Component Flattening Connector Flattening
Architecture Transformations Composite to Monolithic G, F p 1 p 2 p 1 p 2 p 1 p 2 g 1 p 1 p g 2 2 f 1 f 2 g 1 p 1 p g 2 1 p 2 p 2 f 1 f 2 g f g= g 1 g 2 G f = F;(f 1 f 2 )
Example MPEG4 Video Encoder Transform the monolithic sequential program (12000 lines of C code) into a componentized one: ++ reusability, schedulability analysis, reconfigurability overhead in memory and execution time Decomposition: GrabFrame: gets a frame and produces macroblocks OutputFrame: produces an encoded frame Encode: encodes macroblocks GrabFrame Encode OutputFrame f_in f_in f_out f_out f_in f_out f_in f_in f_out f_out grabframe() outputframe()
Example MPEG4 Video Encoder f_in GrabMacroBlock out in MotionEstimation out in DCT out in Quant out in Intraprediction out in Coding out f_in in 1 in 2 IQuant out in IDCT out in Reconstruction f_out f_out GrabMacroBlock: splits a frame in (W*H)/256 macro blocks, outputs one at a time Reconstruction: regenerates the encoded frame from the encoded macro blocks. : buffered connections
Example MPEG4 Video Encoder f_in exit c=max c:=0 f_in out in in c<max c:=c+1 f_out out c<max grabmacroblock(), c:=c+1 GrabMacroBlock f_out c=max c:=0 reconstruction() Reconstruction in in fn() out out MAX=(W*H)/256 W=width of frame H=height of frame Generic Functional component
Axample MPEG4 Video Encoder: Results ~ 500 lines of BIP code Consists of 20 atomic components and 34 connectors Components call routines from the encoder library The generated C++ code from BIP is ~ 2,000 lines BIP binary is 288 KB compared to 172 KB of monolithic binary 100% overhead in execution time wrt monolithic code ~66% due to computation of interactions (can be reduced by composing components) ~34% due to evaluation of priorities (can be reduced by applying priorities to atomic components)
Source-to-Source MPEG4 Video Encoder: Results
BIP Global state model Distributed centralized implementation Decentralized implementations O V E R V I E W Architecture transformations Discussion 38
Papers Results A. Basu, Ph. Bidinger, M. Bozga and J. Sifakis. Distributed Semantics and Implementation for Systems with Interaction and Priority FORTE, 2008, pp. 116-133. Marius Bozga, Mohamad Jaber, Joseph Sifakis: Source-to-source architecture transformation for performance optimization in BIP. IEEE Fourth International Symposium on Industrial Embedded Systems 2009, pp 152-160 Ananda Basu, Saddek Bensalem, Doron Peled, Joseph Sifakis: Priority Scheduling of Distributed Systems Based on Model Checking. CAV 2009: 79-93 Borzoo Bonakdarpour, Marius Bozga, Mohamad Jaber, Joseph Sifakis. Incremental Component-based Modeling, Verification, and Performnce Evaluation of Distributed Reset, DISC 09. Distributed Implementations Centralized BIP Engine (FORTE08) Weakly decentralized (one Engine per set of conflicting interactions ) over Linux using TCP sockets Weakly decentralized (one Engine per interaction + Conflict resolution with alphacore) over Linux using TCP sockets
Discussion Study different distributed implementations from fully decentralized to fully centralized ones Use existing distributed algorithms for multiparty interaction and conflict resolution e.g. maximal matching algorithm Prove correctness by using composability techniques - non interference of features of the composed algorithms Performance evaluation tradeoffs wrt two criteria: degree of parallelism overhead for coordination Implementation tools and case studies.