2IN35 VLSI Programming Lab Work Communication Protocols: A Synchronous and an Asynchronous One


René Gabriëls, r.gabriels@student.tue.nl
July 1, 2008

Contents
1 Introduction
2 Problem Description
3 Synchronous solution
3.1 Protocol
3.2 Composition
3.3 Verilog implementation
4 Asynchronous solution
4.1 Protocol
4.2 Composition
4.3 Verilog implementation
A Verilog code listings
A.1 Synchronous protocol
A.2 Asynchronous protocol

1 Introduction

This document describes the problem of communication between two hardware components, and how to solve it using a communication protocol. Two solutions are presented: a synchronous one (using a shared clock) and an asynchronous one (using a handshake protocol).

2 Problem Description

Imagine a situation in which there are two hardware components: a producer P and a consumer C, connected together by a wire (see figure 1). P sends data to C over this wire. But how does C know when to look for a data item, and how does P know when C has received the data item and is ready for the next one? This problem is tackled in the next two sections by a synchronous and an asynchronous approach.

Figure 1: Producer P and consumer C with shared communication wire data.

3 Synchronous solution

3.1 Protocol

The first solution introduces a shared notion of time between the sender and receiver, known as a clock (hence the name "synchronous"), plus a set of conventions that indicate when data items have to be sent and received. A clock can be any periodic signal, but a square wave with equal low and high periods is most common. For such a clock, a communication convention might be that one data item has to be sent and received every clock cycle. The number of values exchanged per second then equals the frequency of the clock.

This scheme works, except when the sender cannot send a data item every clock cycle, or the receiver cannot receive a data item every clock cycle. This can be solved by introducing signaling wires, with which the sender and receiver can notify each other whether they are ready to communicate or not. When both are ready during the same clock cycle, communication can proceed according to the protocol outlined above.

Each side in the communication can be either active or passive.
An active side signals that it is ready, no matter what the state of the other side is, while a passive side only signals readiness whenever the other side has already signaled readiness. A communication channel always has a passive and an active component connected to it. This gives rise to two possible configurations: a pull configuration, where the consumer is active and the producer is passive, and a push configuration, where the producer is active and the consumer is passive. A pull configuration is shown in figure 2.

Figure 2: Producer P and consumer C with communication wire data, signaling wires ready and enabled, and a shared clock.

Whenever C is ready to accept new data in this configuration, it signals this by making the ready wire high. If C is ready, P can signal that it has data available by making the enabled wire high. Every clock cycle in which enabled is high, one data item is sent over the wire. Such a system can be in four states:

ready  enabled  meaning
0      0        P and C are both busy.
0      1        Forbidden! No data items may be sent when ready is low.
1      0        C is ready to accept a value, P is busy.
1      1        C is ready to accept a value, P sends a value.

Figure 3 shows an example of this scheme in action between a producer P and a consumer C. Every clock cycle in which both ready and enabled are high, one data item is exchanged. In this example, 4 data items are exchanged between P and C in total.

Figure 3: Operation of a synchronous protocol with a shared clock, signaling wires ready and enabled, and data wire data.

3.2 Composition

The disadvantage of a pull or push configuration is that one side must always be passive (P in the pull configuration above), while the other side must always be active (C in the pull configuration above). To prevent mismatching components, one could adopt the convention that everything is constructed in a pull configuration, or that everything is constructed in a push configuration. However, it is more attractive to choose a third option: make both sides active. In every clock cycle in which both sides are ready, a data item is exchanged. This can be implemented straightforwardly by combining the two ready signals with an AND-gate and feeding the result back to both sides as their enabled signal. Figure 4 shows a pipeline consisting of a producer, a combined producer/consumer and a consumer connected together using this scheme.

Figure 4: Sequential composition of three active synchronous components P, C&P and C, interconnected by passivators.

3.3 Verilog implementation

A skeleton Verilog implementation of the pipeline shown in figure 4 is listed in appendix A.1. It consists of 4 components: a producer prod, a combined producer and consumer prodcons, a consumer cons, and a passivator passivator. These components are instantiated in the top-level module pipeline. The components cons and prodcons have an input port consisting of the wires in_ready, in_enabled and in_data. Conversely, the components prod and prodcons have an output port consisting of the wires out_ready, out_enabled and out_data. Additionally, each component has a parameter that specifies the width of the data wires. In each component, a buffer is provided for every output wire to keep it stable (see figure 5). Implementing the protocol amounts to controlling these buffers in the behavioral section (the statements in always @(...)). The behavior is a state machine that alternates between computation and communication.

Figure 5: Generic synchronous producer/consumer component.
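To make the synchronous scheme concrete, the sketch below fills in the skeleton's placeholders with the simplest possible behavior. It is an illustration under stated assumptions, not the listing from appendix A.1: the module names counter_prod, store_cons and sync_demo_tb, the register next_value, and the choice of an incrementing counter as the "computation" are all hypothetical. The AND-gate passivator is written inline in the test bench.

```verilog
// Hypothetical producer: offers an incrementing counter value whenever idle.
module counter_prod #(parameter DWIDTH = 8)
  (input clock, input reset,
   output out_ready, input out_enabled,
   output [0:DWIDTH-1] out_data);
  reg out_ready_buf;
  reg [0:DWIDTH-1] out_data_buf, next_value;
  assign out_ready = out_ready_buf;
  assign out_data  = out_data_buf;
  always @(posedge clock) begin
    if (reset) begin
      out_ready_buf <= 0; out_data_buf <= 0; next_value <= 0;
    end else begin
      if (out_enabled) out_ready_buf <= 0;  // item was taken this cycle
      if (!out_ready) begin                 // idle: offer the next value
        out_ready_buf <= 1;
        out_data_buf  <= next_value;
        next_value    <= next_value + 1;
      end
    end
  end
endmodule

// Hypothetical consumer: stores each received value and prints it.
module store_cons #(parameter DWIDTH = 8)
  (input clock, input reset,
   output in_ready, input in_enabled,
   input [0:DWIDTH-1] in_data);
  reg in_ready_buf;
  reg [0:DWIDTH-1] buffer;
  assign in_ready = in_ready_buf;
  always @(posedge clock) begin
    if (reset) begin
      in_ready_buf <= 0; buffer <= 0;
    end else begin
      if (in_enabled) begin                 // both sides were ready: take data
        buffer <= in_data;
        in_ready_buf <= 0;
        $display("t=%0t received %0d", $time, in_data);
      end
      if (!in_ready) in_ready_buf <= 1;     // idle: ask for the next value
    end
  end
endmodule

// Test bench: two active components joined by an AND-gate passivator.
module sync_demo_tb;
  reg clock = 0, reset = 1;
  wire p_ready, c_ready, enabled;
  wire [0:7] data;
  assign enabled = p_ready & c_ready;       // passivator
  counter_prod #(8) p (clock, reset, p_ready, enabled, data);
  store_cons   #(8) c (clock, reset, c_ready, enabled, data);
  always #5 clock = ~clock;
  initial begin
    #12 reset = 0;
    #100 $finish;
  end
endmodule
```

Because both components in this sketch drop their ready signal in the cycle in which a transfer happens and raise it again one cycle later, an item is exchanged every second clock cycle. A component that keeps ready high across consecutive cycles would exchange one item per cycle, as described in section 3.1.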

4 Asynchronous solution

4.1 Protocol

The other solution uses the signaling wires in such a way that no global clock, and hence no shared notion of time, is needed. This opens the door to components that work at different clock speeds internally, but can still communicate with each other. Components might even be asynchronous internally. Such a design is shown in figure 6.

Figure 6: Producer P and consumer C with communication wire data, and signaling wires request and acknowledge.

The most common asynchronous protocols are the 2-phase and the 4-phase handshake protocol. We describe a 4-phase protocol, as illustrated in figure 7. The four phases of the protocol are:

1. Whenever C is ready to receive a value, it sends a request to P (by making the request wire high).
2. If C has sent a request, P can send an acknowledge back to C (by making the acknowledge wire high) as soon as it has data available.
3. When C receives an acknowledge, it reads the data item, and signals that it has finished reading by making the request wire low again.
4. Whenever the request wire becomes low, P responds by making the acknowledge wire low as well, ending the 4-phase handshake.

Note that after a cycle of this protocol, the system is in the same state as before. The four states the wires pass through are (in order):

request  acknowledge  meaning
0        0            P and C are both busy.
1        0            C has requested a data item, P is busy.
1        1            C has requested a data item, P has sent and acknowledged it.
0        1            C has received the data item, P has yet to make acknowledge low.

Note that although the synchronous and asynchronous solutions both use two signaling wires, they are not the same! The asynchronous protocol always needs to go through the four phases in order, and thereby exchanges one data item. The synchronous protocol, on the other hand, does not have to complete a cycle to exchange one data item: if both signaling wires are high during n consecutive clock cycles, n data items are exchanged. The clock signal is used to separate consecutive data items, which is impossible in an asynchronous system.
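The four phases can be traced in a small behavioral simulation. The sketch below is only an illustration of the pull configuration described above and is not synthesizable: the module name four_phase_tb, the delays, and the choice of sending the values 1 to 4 are assumptions made for the example. There is no clock; each side only waits for the other side's wire to change.

```verilog
module four_phase_tb;
  reg request = 0;       // driven by the active consumer C
  reg acknowledge = 0;   // driven by the passive producer P
  reg [7:0] data = 0;
  reg [7:0] received;
  integer p, c;          // separate loop counters for the two processes

  // Passive producer P: reacts to changes on the request wire.
  initial begin
    for (p = 1; p <= 4; p = p + 1) begin
      wait (request);          // phase 1 observed: C asks for an item
      #3 data = p;             // arbitrary delay until data is available
      acknowledge = 1;         // phase 2: data is valid
      wait (!request);         // phase 3 observed: C has read the item
      #1 acknowledge = 0;      // phase 4: handshake complete
    end
  end

  // Active consumer C: initiates every handshake.
  initial begin
    for (c = 1; c <= 4; c = c + 1) begin
      #2 request = 1;          // phase 1: request an item
      wait (acknowledge);      // phase 2 observed: data is valid
      received = data;         // read the item ...
      $display("t=%0t item %0d", $time, received);
      #2 request = 0;          // ... phase 3: signal that reading is done
      wait (!acknowledge);     // phase 4 observed: back to the initial state
    end
    $finish;
  end
endmodule
```

Each loop iteration moves exactly one data item and returns both wires to low, which is the property that distinguishes this protocol from the synchronous one.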

Figure 7: Operation of an asynchronous protocol with signaling wires request and acknowledge and data wire data.

4.2 Composition

As was the case for the synchronous solution, this solution also has an active and a passive side: the consumer is active (because it controls the request wire), and the producer is passive (because it controls the acknowledge wire). This is again a pull configuration. Reversing the roles of the producer and consumer would result in a push configuration. In both cases, an active side and a passive side have to be matched. To prevent mismatches, we could standardize on one of these two approaches. But again, another solution is more attractive: make all components active, and insert a simple two-port passive component in between them, known as a passivator. This device may sound more complex than it actually is. Instead of an AND-gate (which worked for the synchronous solution), we need a symmetric Muller C-element. A Muller C-element can be implemented with a 3-input majority gate with one feedback wire. On an FPGA this gate can be mapped onto 1 LUT.

A composition of three components in a pipeline using the asynchronous protocol to communicate is shown in figure 8. Note the similarity to the synchronous solution. The differences are the absence of a global clock signal (although there might be one) and the Muller C-element instead of the AND-gate to connect the signaling wires.

Figure 8: An active producer P and consumer C with a passivator in between.

4.3 Verilog implementation

The Verilog implementation of the asynchronous protocol for the pipeline of figure 8 is very similar to the implementation of the synchronous pipeline. Only the names of the registers and wires, and the guards of the conditionals have changed. See appendix A.2 for a complete listing.
Note that this implementation still uses a global clock, but in principle each component can have its own clock. A design in which each component has its own clock is called a globally asynchronous, locally synchronous (GALS) design.
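The behavior of the Muller C-element from section 4.2 can be checked in isolation. The sketch below writes it as a 3-input majority gate whose output is fed back as the third input, and drives it through one full cycle. The module names c_element and c_element_tb are hypothetical, and the test bench is a simulation aid only; whether a given synthesis tool maps the feedback loop hazard-free onto a single LUT is not verified here.

```verilog
// Muller C-element as a majority gate over (a, b, c) with c fed back:
// c follows a and b when they agree, and holds its value when they differ.
module c_element (input a, input b, output c);
  assign c = (a & b) | (a & c) | (b & c);
endmodule

module c_element_tb;
  reg a, b;
  wire c;
  c_element dut (a, b, c);
  initial begin
    $monitor("t=%0t a=%b b=%b c=%b", $time, a, b, c);
    a = 0; b = 0;   // both low: c becomes 0
    #10 a = 1;      // inputs differ: c holds 0
    #10 b = 1;      // both high: c becomes 1
    #10 a = 0;      // inputs differ: c holds 1
    #10 b = 0;      // both low: c returns to 0
    #10 $finish;
  end
endmodule
```

Used as a passivator, a and b are the request wires of the two active components and c is the acknowledge returned to both: it rises only after both have requested, and falls only after both have withdrawn their request, which matches the 4-phase protocol of section 4.1.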

A Verilog code listings

A.1 Synchronous protocol

module prod #(parameter DWIDTH = 8)
  (input clock,
   input reset,
   output out_ready,
   input out_enabled,
   output [0:DWIDTH-1] out_data);

  // Register for output ready signal
  reg out_ready_buf;
  assign out_ready = out_ready_buf;

  // Register for output data
  reg [0:DWIDTH-1] out_data_buf;
  assign out_data = out_data_buf;

  always @(posedge clock) begin
    // Clear all registers on reset
    if (reset) begin
      out_ready_buf <= 0;
      out_data_buf <= 0;
    end
    else begin
      // Stop providing data if output request was acknowledged
      if (out_enabled) begin
        out_ready_buf <= 0;
      end
      // Compute if no output request is open
      // (<...> marks a skeleton placeholder to be filled in)
      if (!out_ready) begin
        if (<output ready next cycle>) begin
          out_ready_buf <= 1;
          out_data_buf <= <output>;
        end
        <compute>;
      end
    end
  end

endmodule

module cons #(parameter DWIDTH = 8)
  (input clock,
   input reset,
   output in_ready,
   input in_enabled,
   input [0:DWIDTH-1] in_data);

  // Register for input ready signal
  reg in_ready_buf;
  assign in_ready = in_ready_buf;

  // Buffer to store incoming values
  reg [0:DWIDTH-1] buffer;

  always @(posedge clock) begin
    // Clear all registers on reset
    if (reset) begin
      in_ready_buf <= 0;
      buffer <= 0;
    end
    else begin
      // Take data if input request was acknowledged
      if (in_enabled) begin
        buffer <= in_data;
        in_ready_buf <= 0;
      end
      // Compute if no input request is open
      // (<...> marks a skeleton placeholder to be filled in)
      if (!in_ready) begin
        if (<input required next cycle>) begin
          in_ready_buf <= 1;
        end
        <compute>;
      end
    end
  end

endmodule

module prodcons #(parameter DWIDTH = 8)
  (input clock,
   input reset,
   output in_ready,
   input in_enabled,
   input [0:DWIDTH-1] in_data,
   output out_ready,
   input out_enabled,
   output [0:DWIDTH-1] out_data);

  // Register for input ready signal
  reg in_ready_buf;
  assign in_ready = in_ready_buf;

  // Register for output ready signal
  reg out_ready_buf;
  assign out_ready = out_ready_buf;

  // Register for output data
  reg [0:DWIDTH-1] out_data_buf;
  assign out_data = out_data_buf;

  // Buffer to store incoming values
  reg [0:DWIDTH-1] buffer;

  always @(posedge clock) begin
    // Clear all registers on reset
    if (reset) begin
      in_ready_buf <= 0;
      out_ready_buf <= 0;
      out_data_buf <= 0;
      buffer <= 0;
    end
    else begin
      // Take data if input request was acknowledged
      if (in_enabled) begin
        buffer <= in_data;
        in_ready_buf <= 0;
      end
      // Stop providing data if output request was acknowledged
      if (out_enabled) begin
        out_ready_buf <= 0;
      end
      // Compute if no requests are open
      // (<...> marks a skeleton placeholder to be filled in)
      if (!in_ready && !out_ready) begin
        if (<output ready next cycle>) begin
          out_ready_buf <= 1;
          out_data_buf <= <output>;
        end
        if (<input required next cycle>) begin
          in_ready_buf <= 1;
        end
        <compute>;
      end
    end
  end

endmodule

module passivator #(parameter DWIDTH = 8)
  (input in_ready,
   output in_enabled,
   input [0:DWIDTH-1] in_data,
   input out_ready,
   output out_enabled,
   output [0:DWIDTH-1] out_data);

  // Passivator behaviour (AND gate, 1 LUT)
  assign in_enabled = in_ready & out_ready;
  assign out_enabled = in_enabled;

  // Data pass-through
  assign out_data = in_data;

endmodule

module pipeline #(parameter DWIDTH = 8)
  (input clock,
   input reset);

  // Interconnects
  wire rdy1, rdy2, rdy3, rdy4;
  wire ena1, ena2, ena3, ena4;
  wire [0:DWIDTH-1] data1, data2, data3, data4;

  // Instantiate the pipeline
  prod #(DWIDTH) stage1 (clock, reset, rdy1, ena1, data1);
  passivator #(DWIDTH) pass1 (rdy1, ena1, data1, rdy2, ena2, data2);
  prodcons #(DWIDTH) stage2 (clock, reset, rdy2, ena2, data2, rdy3, ena3, data3);
  passivator #(DWIDTH) pass2 (rdy3, ena3, data3, rdy4, ena4, data4);
  cons #(DWIDTH) stage3 (clock, reset, rdy4, ena4, data4);

endmodule

A.2 Asynchronous protocol

module prod #(parameter DWIDTH = 8)
  (input clock,
   input reset,
   output out_request,
   input out_acknowledge,
   output [0:DWIDTH-1] out_data);

  // Register for output request signal
  reg out_request_buf;
  assign out_request = out_request_buf;

  // Register for output data
  reg [0:DWIDTH-1] out_data_buf;
  assign out_data = out_data_buf;

  always @(posedge clock) begin
    // Clear all registers on reset
    if (reset) begin
      out_request_buf <= 0;
      out_data_buf <= 0;
    end
    else begin
      // Stop providing data if output request was acknowledged
      if (out_request && out_acknowledge) begin
        out_request_buf <= 0;
      end
      // Compute if no requests are open
      // (<...> marks a skeleton placeholder to be filled in)
      if (!out_request && !out_acknowledge) begin
        if (<output ready next cycle>) begin
          out_request_buf <= 1;
          out_data_buf <= <output>;
        end
        <compute>;
      end
    end
  end

endmodule

module cons #(parameter DWIDTH = 8)
  (input clock,
   input reset,
   output in_request,
   input in_acknowledge,
   input [0:DWIDTH-1] in_data);

  // Register for input request signal
  reg in_request_buf;
  assign in_request = in_request_buf;

  // Buffer to store incoming values
  reg [0:DWIDTH-1] buffer;

  always @(posedge clock) begin
    // Clear all registers on reset
    if (reset) begin
      in_request_buf <= 0;
      buffer <= 0;
    end
    else begin
      // Take data if input request was acknowledged
      if (in_request && in_acknowledge) begin
        buffer <= in_data;
        in_request_buf <= 0;
      end
      // Compute if no requests are open
      // (<...> marks a skeleton placeholder to be filled in)
      if (!in_request && !in_acknowledge) begin
        if (<input required next cycle>) begin
          in_request_buf <= 1;
        end
        <compute>;
      end
    end
  end

endmodule

module prodcons #(parameter DWIDTH = 8)
  (input clock,
   input reset,
   output in_request,
   input in_acknowledge,
   input [0:DWIDTH-1] in_data,
   output out_request,
   input out_acknowledge,
   output [0:DWIDTH-1] out_data);

  // Register for input request signal
  reg in_request_buf;
  assign in_request = in_request_buf;

  // Register for output request signal
  reg out_request_buf;
  assign out_request = out_request_buf;

  // Register for output data
  reg [0:DWIDTH-1] out_data_buf;
  assign out_data = out_data_buf;

  // Buffer to store incoming values
  reg [0:DWIDTH-1] buffer;

  always @(posedge clock) begin
    // Clear all registers on reset
    if (reset) begin
      in_request_buf <= 0;
      out_request_buf <= 0;
      out_data_buf <= 0;
      buffer <= 0;
    end
    else begin
      // Take data if input request was acknowledged
      if (in_request && in_acknowledge) begin
        buffer <= in_data;
        in_request_buf <= 0;
      end
      // Stop providing data if output request was acknowledged
      if (out_request && out_acknowledge) begin
        out_request_buf <= 0;
      end
      // Compute if no requests are open
      // (<...> marks a skeleton placeholder to be filled in)
      if (!in_request && !in_acknowledge && !out_request && !out_acknowledge) begin
        if (<output ready next cycle>) begin
          out_request_buf <= 1;
          out_data_buf <= <output>;
        end
        if (<input required next cycle>) begin
          in_request_buf <= 1;
        end
        <compute>;
      end
    end
  end

endmodule

module passivator #(parameter DWIDTH = 8)
  (input in_req,
   output in_ack,
   input [0:DWIDTH-1] in_data,
   input out_req,
   output out_ack,
   output [0:DWIDTH-1] out_data);

  // Passivator behaviour (majority gate with feedback, 1 LUT)
  assign in_ack = (in_req & out_req) | (in_req & in_ack) | (out_req & in_ack);
  assign out_ack = in_ack;

  // Data pass-through
  assign out_data = in_data;

endmodule

module pipeline #(parameter DWIDTH = 8)
  (input clock,
   input reset);

  // Interconnects
  wire req1, req2, req3, req4;
  wire ack1, ack2, ack3, ack4;
  wire [0:DWIDTH-1] data1, data2, data3, data4;

  // Instantiate the pipeline
  prod #(DWIDTH) stage1 (clock, reset, req1, ack1, data1);
  passivator #(DWIDTH) pass1 (req1, ack1, data1, req2, ack2, data2);
  prodcons #(DWIDTH) stage2 (clock, reset, req2, ack2, data2, req3, ack3, data3);
  passivator #(DWIDTH) pass2 (req3, ack3, data3, req4, ack4, data4);
  cons #(DWIDTH) stage3 (clock, reset, req4, ack4, data4);

endmodule