Increasing Design Productivity for FPGAs Through IP Reuse and Meta-Data Encapsulation


Brigham Young University
BYU ScholarsArchive
All Theses and Dissertations

Increasing Design Productivity for FPGAs Through IP Reuse and Meta-Data Encapsulation
Adam T. Arnesen, Brigham Young University - Provo

Part of the Electrical and Computer Engineering Commons

BYU ScholarsArchive Citation: Arnesen, Adam T., "Increasing Design Productivity for FPGAs Through IP Reuse and Meta-Data Encapsulation" (2011). All Theses and Dissertations.

This thesis is brought to you for free and open access by BYU ScholarsArchive. It has been accepted for inclusion in All Theses and Dissertations by an authorized administrator of BYU ScholarsArchive. For more information, please contact scholarsarchive@byu.edu.

Increasing Design Productivity for FPGAs Through Intellectual Property Reuse and Meta-Data Encapsulation

Adam Arnesen

A thesis submitted to the faculty of Brigham Young University in partial fulfillment of the requirements for the degree of Master of Science

Michael J. Wirthlin, Chair
Brad L. Hutchings
Brent E. Nelson

Department of Electrical and Computer Engineering
Brigham Young University
April 2011

Copyright © 2011 Adam Arnesen. All Rights Reserved.


ABSTRACT

Increasing Design Productivity for FPGAs Through Intellectual Property Reuse and Meta-Data Encapsulation

Adam Arnesen
Department of Electrical and Computer Engineering
Master of Science

As Moore's law continues to progress, it is becoming increasingly difficult for hardware designers to fully utilize the increasing number of transistors available on semiconductor devices, including FPGAs. This design productivity gap must be addressed to allow designs to take full advantage of the increased logic density that results from rising transistor density. The reuse of previously developed and verified intellectual property (IP) is one approach that has been claimed to narrow the design productivity gap. Reuse, however, has proved difficult to realize in practice because of the complexity of IP and the reluctance of designers to reuse IP that they do not understand.

This thesis proposes to narrow the design productivity gap for FPGAs by simplifying the reuse problem: IP is encapsulated with extra machine-readable information, or meta-data. This meta-data simplifies reuse by providing a language-independent format for composing complex systems, providing a parameter representation system, defining high-level data types for FPGA IP, and allowing arbitrary IP to be described as actors in the homogeneous synchronous dataflow model of computation.

This work implements meta-data in XML and presents two XML schemas that enable reuse. A new XML schema known as CHREC XML is presented, as well as extensions that enable IP-XACT to be used to describe FPGA dataflow IP. Two tools developed in this work that leverage meta-data to simplify reuse of arbitrary IP are also presented. These tools simplify structural composition of IP, allow designers to manipulate parameters, check and validate high-level data types, and automatically synthesize control circuitry for dataflow designs. Productivity improvements are also demonstrated by reusing IP to quickly compose software radio receivers.
Keywords: meta-data, FPGA, intellectual property reuse, interface synthesis, IP-XACT, synchronous dataflow, architectural synthesis


ACKNOWLEDGMENTS

I would like to thank my advisor, Mike Wirthlin, and my committee members, Brent Nelson and Brad Hutchings, who have encouraged me in my work. I would also like to thank my other professors at BYU who have mentored me through my undergraduate and graduate studies and who have provided opportunities for learning and inspiration for my studies. I am grateful for the support and help of Marc Padilla, who helped me design communication systems, Derrick Gibelyou, who helped with algorithms and data structures, and the other students in the BYU Configurable Computing Laboratory who have helped me in my research.

My family also deserves thanks for supporting me throughout my education. My parents have encouraged my academic work from my elementary school years and have always helped me to push myself to be my best. My loving wife Sarah also deserves my deepest thanks. She has been patient and supportive as I have spent long hours in school and research. She has been an unfailing source of love and support.

I would also like to thank Newton Peterson, HoJin Kee, and Jeff Washington at National Instruments for their support of my ideas and their encouragement to pursue my education.

This work was supported by a grant from the Rocky Mountain NASA Space Grant Consortium, as well as by the Brigham Young University CHREC center funded by the I/UCRC Program of the National Science Foundation under Grant No


Table of Contents

List of Tables
List of Figures

1 Introduction: Design Productivity and Reuse
The Design Productivity Gap for FPGAs
Increasing Design Productivity
IP Reuse
Enabling Reuse with Meta-Data
Thesis Contributions

IP Reuse and Meta-Data Descriptions
Meta-Data for HDL Reuse
Meta-Data in Commercial Design Tools
Xilinx CORE Generator
Xilinx EDK
Xilinx System Generator
National Instruments LabVIEW FPGA

XML-Based Meta-Data for Reuse
XML as a Meta-Data Format
CHREC XML
3.2.1 Segment 1: The Structural Segment
Segment 2: High-Level Datatype Segment
Segment 3: Temporal Behavior
Weaknesses of CHREC XML
IP-XACT and Extensions
Parameterization and Mathematical Expressions
Ports and Structural Interface
Generator Chains
Modifying IP-XACT for FPGA IP
Extensions for High-Level Datatypes
Extensions for Temporal Interface Behavior

Meta-Data Enabled Design Environment
A Structural Design GUI
Parameter Representation and Manipulation
Traditional Low-Level Parameterization
High-Level Parameterization
Parameter Dependencies and Translation
A Parameter Manipulation GUI
Language-Specific Wrapper Generation

Meta-Data Enabled H-SDF Synthesis Using IP
Numerical Datatypes
Representing Datatypes
Utilizing Numerical Types
Representing Coarse-Grained IP as H-SDF Actors
5.2.1 The H-SDF Model of Computation
Latency
Data Introduction Interval
Sample Delay
IP-XACT Extensions for H-SDF
Applying H-SDF Synthesis Techniques to Coarse-Grain IP
Translating Schematics to H-SDF Graphs
Applying Iterative Modulo Scheduling
Control Synthesis

Meta-Data Enabled Rapid Radio Development
A Highly Parameterized IP Library
Manually Constructing Radios
Automatic Radio Generation

Conclusion: Productivity Gains from Meta-Data-Assisted Reuse
Productivity Increases Demonstrated
The Role of Meta-Data in Reuse
Future Work
The Need for Increased Design Productivity

Bibliography

A CHREC XML Extensions to IP-XACT
A.1 Extending IP-XACT
A.2 Parameter Extensions
A.3 Port Description Extensions
A.4 High Level Datatypes Extension
A.4.1 Bit Vector
A.4.2 Integer
A.4.3 Floating Point
A.4.4 Fixed Point
A.4.5 Custom Type
A.5 Behavioral Layer Extension
A.5.1 Pipeline Depth
A.5.2 Data Introduction Interval
A.5.3 Sample Delay
A.5.4 Signal Associations

B Dataflow Interface Automata
B.1 Introduction
B.2 Definition and Examples
B.3 Visualizing DIAs

C The IP-XACT Extensions Schema
C.1 CHREC Extensions
C.2 Parameters
C.3 High-Level Datatypes
C.4 H-SDF Interface Definitions
C.5 Port Extensions
C.6 Supporting Code

D Generated VHDL from Ogre
D.1 Top-Level VHDL
D.2 Finite State Machine


List of Tables

4.1 High-Level Parameters Example for Loop Filter
IP-XACT Enabled IP Core Library for Communication
Productivity Gains With Ogre vs. Manual Creation


List of Figures

1.1 The Design Productivity Gap
A Basic FPGA Fabric
A Generic Design Flow Using Meta-Data
Independent Segments of XML in CHREC XML
The IP-XACT Design Environment
Datatype Mismatch Caused by Matching Bitwidths
Tool Flow for Creating Wrappers from Multi-Segment CHREC XML
The CHREC XML Design Composition Tool
High Level Parameters Translated to Low-Level Parameters
Parameter Manipulation GUI Based on CHREC XML
The Ogre Tool-Flow
The Simulink Front End to Ogre Tools
Synthesis of Datatype Conversion Logic
The Homogeneous Synchronous Dataflow Model of Computation
H-SDF Initial Conditions
Design Represented as H-SDF Graphs
Scheduling H-SDF Graphs
A QPSK System For Input to Ogre
6.2 Bit Error Rate Curves for Software Radios
B.1 Deterministic Dataflow Interface Automata (DIA)
B.2 Deterministic DIA With Control
B.3 Non-deterministic DIA
B.4 Non-deterministic DIA for core with required clear
B.5 A DIA for an Upsampling Core
B.6 A DIA for a Downsampling Core
B.7 Full Example of Dataflow Interface Automata Operation

Chapter 1

Introduction: Design Productivity and Reuse

As the density of transistors on semiconductor devices continues to increase following Moore's Law, designers in the electronics industry have found it increasingly difficult to make use of all the transistors available on silicon devices. The disparity between the capability of the technology and the designer's ability to utilize it is called the design productivity gap. The widening of this gap has concerned the electronics industry for several years. If the gap is not closed significantly, then despite the improvements in technology, the designs the industry produces will not scale at the same rate as the technology on which they are implemented.

The trends in productivity and device capability are each encouraging when viewed on their own, because improvements are being observed in both areas, as shown in Figure 1.1 [1]. Figure 1.1 shows that the productivity of engineers designing systems for digital hardware is doubling about every 3.5 years. This steady increase in design productivity is due in part to the progression of tools (verification methods, design entry methods, etc.) and to the ability of designers to design systems at a higher level (e.g., System-on-Chip). Figure 1.1 also shows encouraging advances in device capability: transistor density is increasing as predicted by Moore's Law, doubling every 1.5 to 2 years.

While both productivity and device capability are improving, comparing their rates of improvement reveals the design productivity gap. For each year that designer productivity increases at its current rate, transistor density increases at more than double that rate. If this gap continues to widen at its current rate, electronic systems designers will not be able to keep up with the increases in density, and the growing computing resources on available platforms will go underutilized.

Figure 1.1: The design productivity gap for hardware systems [1]. The gap between productivity and technology capability is increasing.

1.1 The Design Productivity Gap for FPGAs

The design productivity gap observed for digital hardware circuits also exists for field programmable gate arrays (FPGAs) and other reconfigurable devices. FPGAs allow the computational capacity of the machine to be highly customized to the instantaneous needs of an application while allowing that capacity to be reused many times for different applications [2]. FPGAs offer a flexible design implementation solution that occupies a niche between software implementations that run on a processor and application specific integrated circuits (ASICs) and other custom silicon solutions.

FPGAs provide an array of customizable hardware blocks and programmable interconnect, as shown in Figure 1.2 [3]. The logic elements and interconnect can be configured repeatedly to implement many different hardware designs. Designing for FPGAs is like designing software in the sense that hardware description language (HDL) code is compiled and then run on the device. The compiled FPGA code represents a set of configuration instructions that define the behavior of the logic blocks and the interconnect routing in the FPGA fabric.

Figure 1.2: A basic FPGA fabric. Interconnect and logic block contents are both programmable. Dots on interconnect represent programmable connections between wires that enable signals to route between logic blocks.

Data-flow systems such as digital signal processing (DSP) are often straightforward to implement on modern FPGAs because these types of systems map naturally onto the FPGA fabric. Because of the abundance of resources, DSP systems can be pipelined and logically optimized to perform at high clock frequencies. The reconfigurable nature of FPGAs also allows data-flow systems to be upgraded quickly as technology demands change. This work focuses primarily on increasing designer productivity for data-flow systems, particularly DSP and communication systems implemented on FPGAs.
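The growth rates quoted at the start of this chapter can be made concrete with a short calculation. The sketch below is illustrative only: it assumes pure exponential growth with the doubling periods read from Figure 1.1 (3.5 years for productivity, 1.5 to 2 years for transistor density), and the function names are hypothetical rather than part of this work.

```python
def annual_growth(doubling_years: float) -> float:
    """Fractional growth per year for a quantity that doubles every `doubling_years`."""
    return 2.0 ** (1.0 / doubling_years) - 1.0


def gap_factor(years: float, density_doubling: float,
               productivity_doubling: float = 3.5) -> float:
    """Ratio of density growth to productivity growth after `years`.

    Both quantities are assumed to start from the same normalized value of 1.
    """
    density = 2.0 ** (years / density_doubling)
    productivity = 2.0 ** (years / productivity_doubling)
    return density / productivity


# Compare the two ends of the 1.5 to 2 year range for transistor density.
print(f"productivity: {annual_growth(3.5):.1%} per year")
for t_d in (1.5, 2.0):
    print(f"density doubling every {t_d} years: "
          f"{annual_growth(t_d):.1%} per year, "
          f"gap after 10 years: {gap_factor(10, t_d):.1f}x")
```

Under these assumptions the ratio between available density and designer productivity itself grows exponentially with time, which is why the gap widens even though both trends are improving.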

Design productivity challenges for FPGAs are different from the productivity challenges facing the electronics industry in general. Historically, the challenges of FPGA design have included achieving timing closure and making designs compact enough to fit on the fabric of a single device. This was especially true for early FPGAs, which had such a small logic density that even basic designs, or sometimes a single complex core, easily consumed all of the computational resources available on the device. In recent years, however, as FPGAs and other reconfigurable devices continue to increase their computing capability following Moore's law, the design productivity problem has become more pronounced. The increase in transistor density has enabled production of devices with many more logic elements. As the number of available logic elements increases, it becomes increasingly difficult to utilize all available FPGA logic with a single design.

Low design productivity continues to be a primary barrier to the more widespread adoption of FPGAs as a computing platform. Unless design productivity for FPGAs increases significantly, FPGA adoption will be limited to relatively few dedicated application and hardware development experts who have the skills necessary to create low-level FPGA circuits.

1.2 Increasing Design Productivity

The International Technology Roadmap for Semiconductors (ITRS) [1] has outlined several ways of addressing the increasing design productivity gap for digital semiconductor systems as well as for FPGAs. These include migrating to platform-based design, improving and simplifying high level synthesis, and increasing intellectual property (IP) reuse.

Platform-based design is essentially a form of component or IP reuse.
Platform-based design is a meeting-in-the-middle process, where successive refinements of specifications meet with abstractions of potential implementations and the identification of precisely defined layers where the refinement and abstraction processes take place [4]. This means that platform-based design simplifies the process of mapping a concept system onto an implementation by providing a large variety of functionality on a single circuit board or even in a single integrated circuit package. Because of the large amount of pre-built design implementation circuitry, both the mapping of ideas to that circuitry and the design space exploration are simplified.

High level synthesis is generally defined as the process of automatically creating digital circuits by starting with an abstract behavioral specification of a digital system and finding a register-transfer level (RTL) structure that realizes the behavioral specification [5]. These abstract systems can be specified in a traditional software language such as C and then translated into a high performance hardware system [6, 7]. High level synthesis generally maps behavioral specifications to small, low-level primitives such as adders and multipliers. This is a powerful approach to increasing design productivity [5].

Design reuse is the process of designing robust, verified IP and reusing that IP in future designs. Reuse increases design productivity by avoiding duplication of core design effort. If a core has been used successfully in an application in the past, the effort spent designing and testing that IP should not be duplicated when the same functionality is needed for another application. Leveraging IP reuse to increase productivity is the primary approach used in this thesis.

1.3 IP Reuse

Reuse of previously verified and tested soft and hard intellectual property (IP) has long been touted as a primary method of improving design productivity [8], because an enormous amount of previously verified and tested hard and soft IP already exists in the electronic design industry. Easy access to this IP and the ability to integrate it into designs quickly and easily would drastically narrow the design productivity gap. Despite this potential, reuse has been hampered because it is often difficult to reuse IP.
In most current design environments and methodologies, in order to successfully reuse IP the designer must 1) manually find and select the appropriate IP, 2) understand the details of the implementation of the core, and 3) understand the interface and timing protocol used in order to integrate the IP into an overall system. Control and interface circuitry often must be manually generated. This is a complex and time-consuming process that must nevertheless be done very quickly in order for reuse to be feasible. IP also often comes from many sources and in many formats, making reuse an intimidating prospect. In fact, in order for any reuse process to be feasible, the entire process must not require more than 30% of the effort required to create the same IP from scratch [9].

Reuse based on standard platforms and formats has long been used in software engineering and has resulted in significant productivity improvements [10]. Software reuse, in the form of libraries, is so commonplace that most programmers do not even realize they are in fact reusing IP when they program. Programmers are able to develop systems by simply reusing previously developed IP from a library, with little or no knowledge of how the underlying implementation of the IP operates. It is this reuse, and the standard method of representing software IP, that has most significantly increased the design productivity level in software development [11]. The success of this reuse scheme and the role of standard methods are important to remember as reuse schemes are developed for hardware IP.

The issues to be overcome to facilitate reuse of IP for general hardware design are somewhat analogous to those that have been addressed for software reuse. In most hardware design and integration methodologies, hardware designers are required to operate at a very low level of abstraction; traditionally, design is done at the hardware description language (HDL) layer. This abstraction level could be considered analogous to the assembly or low-level C that was yesterday's design entry level for software engineering. HDL code for hardware and C for software both require an understanding of the underlying hardware. Raising the abstraction, even slightly, away from the RTL layer and allowing tools to translate from the abstraction to HDL can contribute to design productivity for hardware just as high-level compilers have done for software. This increase in abstraction would provide part of the common reuse scheme that is currently lacking for hardware design.
Despite the difficulty of reusing arbitrary hardware IP, IP reuse has been used successfully to increase design productivity. Designers often reuse their own previously developed IP in other designs. Design paradigms such as System-on-Chip (SoC), in which cores are integrated using well-defined bus interfaces such as AMBA, PLB (CoreConnect), and Wishbone [12], have also helped increase reuse. Tools such as Xilinx EDK [13], System Generator [14], and LabVIEW FPGA [15] have also leveraged reuse to obtain productivity improvements in a particular design space. Even though it is difficult to reuse arbitrary IP in FPGA designs, design productivity has improved through direct IP reuse and through select tools that leverage reuse.

1.4 Enabling Reuse with Meta-Data

Successful reuse of IP depends on having extra information about the IP in addition to its actual implementation code. Such extra information constitutes meta-data about a piece of IP. Reuse of IP for FPGAs can be enabled by encapsulating IP in meta-data that describes the interface and other details of a core in a generic way. Meta-data can be used to describe basic interfaces and low-level details of a core. Meta-data can also be used to define a higher-level, more abstract view of the IP. Meta-data encapsulation and abstraction can enable the development of tools that automatically manipulate, instance, and interconnect cores within a design, thereby removing this responsibility from the human designer.

All hardware IP has similar characteristics that can be represented in meta-data. IP is developed in many languages and comes from many different sources. Despite this variety in representation and source, all hardware IP is similar in its fundamental makeup: every core has input and output ports, some representation in an external file, a name, and so on. Many reusable cores also have parameters, and the values of these parameters often depend on each other. Much of the IP for FPGAs operates on numerical data and communicates its data in high-level numerical types (e.g., fixed point or floating point). Capturing these types of information in meta-data allows the interface, or external view, of the core to be represented in a standard way that is independent of any particular hardware description language (HDL) or design environment.

1.5 Thesis Contributions

This work introduces several techniques that exploit the common elements of IP in meta-data to increase design productivity of FPGA-based systems.
Meta-data encapsulation can enable reuse by hiding low-level IP implementation details from designers and allowing them to design at a higher level. This higher-level view can be achieved by leveraging meta-data to enable tools to do many of the low-level design tasks that have traditionally been time consuming for human designers, such as interface and control circuitry generation. This work discusses and demonstrates several ways that this encapsulation can be done and discusses its benefits. Toward this end, this thesis contributes several techniques that enable the development of tools to increase design productivity by exploiting novel meta-data encapsulation techniques. The specific contributions of this thesis include the following:

- This work demonstrates the benefits of representing the interface components of IP in a standard, language-independent format by automatically composing complex systems based on IP from several different languages. A structural design tool is presented in Chapter 4 that demonstrates the ability of meta-data to enable this type of language-independent structural composition.

- This thesis demonstrates the ability of meta-data to simplify resolution of complex IP parameters. Complex parameter resolution is facilitated by meta-data that describes IP parameters and their relationships via mathematical expressions. Chapter 4 demonstrates the ability of a tool to leverage meta-data to automatically create parameterization interfaces. This parameter manipulation tool also leverages mathematical parameter dependencies to ensure proper parameterization of cores with many inter-related parameters.

- Much of the IP for FPGAs operates on numerical values. This work contributes a meta-data description of high-level numerical data types for data-flow IP that enables tools to check and resolve numerical datatypes between connected IP. This datatype description is used as part of a design tool named Ogre, presented in Chapter 5, to verify compatibility of types in data-flow systems. The meta-data descriptions enable designers to know when datatypes need to be converted to avoid data corruption.

- Many systems that are implemented in FPGAs can be modeled with the homogeneous synchronous dataflow (H-SDF) model of computation. This work contributes meta-data descriptions that allow coarse-grained IP to be represented as actors in H-SDF. These descriptions enable architectural synthesis algorithms to use coarse-grained IP as primitive operators in much the same way as more fine-grained IP (e.g., adders, multipliers) is traditionally used. Chapter 5 discusses how meta-data is used in Ogre to enable synthesis of control and interface circuitry based on an iterative modulo scheduling [16] approach.

- To demonstrate the productivity improvements provided by meta-data and meta-data enabled tools, this work presents a study of the rapid construction of digital communications radio receivers. Radio receivers were built using synthesis techniques with reusable IP described in meta-data. This rapid construction, discussed in Chapter 6, demonstrates the ability of the contributed meta-data descriptions to enable improvements in designer productivity. For radios developed in this work, design time was reduced from 3 days to under an hour.

This thesis addresses the need to increase design productivity for FPGAs by introducing meta-data that enables tools to perform much of the work that has traditionally been required of human designers. A standard meta-data description of the structure of IP interfaces allows tools to structurally interconnect IP from different languages and sources. A robust parameter resolution mechanism based on mathematical expressions allows tools to ensure valid parameterization of IP. Meta-data that describes the high-level numerical datatypes associated with the inputs and outputs of IP allows tools to ensure correct representation of communicated data. Meta-data that describes coarse-grained IP as actors in an H-SDF system enables synthesis algorithms to automatically construct control circuitry for data-flow systems. These meta-data elements enable tools to significantly decrease the design time for complex data-flow systems on FPGAs.
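The kinds of information named in these contributions can be pictured as a small data structure. The following Python sketch only illustrates the idea; it is not the CHREC XML or IP-XACT schema used in this work, and the class names, field names, and example cores are hypothetical. It shows ports that carry high-level fixed-point types, a parameter derived from other parameters through an expression, and a check that flags a connection whose bit widths match but whose numerical types differ.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass(frozen=True)
class FixedPoint:
    """High-level numerical type: total width and number of fraction bits."""
    width: int
    frac_bits: int
    signed: bool = True


@dataclass
class Port:
    name: str
    direction: str  # "in" or "out"
    dtype: FixedPoint


@dataclass
class CoreMetaData:
    """Hypothetical language-independent description of one IP core."""
    name: str
    source_file: str
    ports: Dict[str, Port] = field(default_factory=dict)
    params: Dict[str, int] = field(default_factory=dict)
    # Derived parameters: name -> expression over already known parameters.
    # They are evaluated in insertion order, so dependencies must come first.
    derived: Dict[str, Callable[[Dict[str, int]], int]] = field(default_factory=dict)

    def resolve_params(self) -> Dict[str, int]:
        """Evaluate derived parameters from the user-set ones."""
        resolved = dict(self.params)
        for pname, expr in self.derived.items():
            resolved[pname] = expr(resolved)
        return resolved


def check_connection(src: Port, dst: Port) -> str:
    """Classify a connection from an output port to an input port."""
    if src.direction != "out" or dst.direction != "in":
        return "invalid direction"
    if src.dtype == dst.dtype:
        return "ok"
    if src.dtype.width == dst.dtype.width:
        return "same width, different type: conversion needed"
    return "width mismatch: conversion needed"


# Two made-up cores written in different HDLs.
fir = CoreMetaData(
    name="fir_filter",
    source_file="fir_filter.vhd",
    ports={"dout": Port("dout", "out", FixedPoint(16, 12))},
    params={"TAPS": 8, "DATA_WIDTH": 16},
    derived={
        # Accumulator width grows with the data width and the number of taps.
        "ACC_WIDTH": lambda p: 2 * p["DATA_WIDTH"] + (p["TAPS"] - 1).bit_length(),
    },
)
mixer = CoreMetaData(
    name="mixer",
    source_file="mixer.v",
    ports={"din": Port("din", "in", FixedPoint(16, 15))},
)

print(fir.resolve_params())
print(check_connection(fir.ports["dout"], mixer.ports["din"]))
```

In this sketch both ports are 16 bits wide, so a purely structural check would accept the connection; only the high-level type information reveals that the binary point differs and a conversion is required.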


Chapter 2

IP Reuse and Meta-Data Descriptions

Meta-data descriptions of IP are essential to enabling reuse. Meta-data is any information describing an IP core that exists separately from the actual IP implementation code. Meta-data can include in-code comments, human-readable documentation, and any other extra information regarding an IP's interface or internal operation. Meta-data enables reuse by providing designers and computer aided design (CAD) tools with the information needed to properly integrate a piece of IP into a complete system. Without meta-data, the designer would be left with only the raw HDL code, which may be difficult to integrate without the extra information meta-data provides.

Various existing design approaches leverage IP reuse, and all of them rely on some type of meta-data to enable reuse and design composition. Traditional HDL-based IP reuse requires meta-data in the form of in-code comments and written documentation to be successful. Tools such as Xilinx CoreGen [17] require meta-data that describes the parameters for generating a particular piece of IP. Design composition tools such as Xilinx EDK, Xilinx System Generator, and National Instruments LabVIEW FPGA all require meta-data that describes the IP that can be used in these systems.

While existing tools exploit reuse by using meta-data, the meta-data in these approaches is proprietary and limited to a specific tool environment. This thesis introduces a more general approach to representing IP meta-data and demonstrates the ability of this approach to enable the construction of design tools that simplify the task of reusing arbitrary IP. This chapter presents the use of meta-data in existing reuse approaches and tools. It then presents the development of a standard XML-based meta-data format for describing data-flow IP for FPGAs. The CHREC XML representation format developed in this work is discussed, along with the transition in this work from using CHREC XML to extending the IP-XACT [18] standard.

2.1 Meta-Data for HDL Reuse

The most common way to reuse IP is simply to reuse previously developed HDL code in a new design. Meta-data is essential to enabling this type of reuse. Meta-data for HDL reuse consists primarily of written documentation. This documentation describes the purpose of the different inputs and outputs on the core and the proper timing protocols to be used to communicate with the IP. For FPGA design, this documentation will also often contain information about the timing and area characteristics of the IP when implemented in a particular device. For open-source HDL IP, meta-data also comes in the form of in-code comments and readable, well-written HDL code.

Highly parameterized IP requires comprehensive documentation to enable the designer to reuse that IP properly. Many IP cores are highly parameterized because comprehensive parameterization of HDL-based IP significantly increases its reusability [19]. Parameters allow IP to be used in a variety of situations with no manual changes to the core's internal HDL code. Without exhaustive documentation describing valid parameter values and the effect of the parameters on the core's operation, it can be difficult or even impossible to reuse parameterized IP.

Inconveniences arise when trying to reuse a core that was developed in one language in a system primarily based on another. Reusable cores are commonly written in VHDL, Verilog, or other languages, and when reusing this IP, wrappers must be manually created to include the IP in the system's language. Machine-readable meta-data could simplify the task of integrating IP from multiple HDLs by enabling the automatic creation of wrappers in the designer's preferred language.

Meta-data in the form of documentation and in-code comments is essential when reusing HDL directly.
While HDL itself is often human readable, if the designer only has access to the HDL code and no accompanying documentation, reuse will be nearly impossible. However, if that HDL is accompanied by meta-data in the form of comments and documentation, reuse becomes both possible and simpler.
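To illustrate how machine-readable meta-data could drive automatic wrapper creation, the sketch below renders a VHDL component declaration from a minimal port and generic description. It is a hypothetical example, not the wrapper generator developed in this work: the function name and the example core are invented, every port is assumed to be `std_logic` or `std_logic_vector`, every generic is assumed to be an integer, and the core is assumed to have at least one port.

```python
from typing import Dict, List, Tuple


def vhdl_component(name: str,
                   generics: Dict[str, int],
                   ports: List[Tuple[str, str, int]]) -> str:
    """Render a VHDL component declaration.

    ports: (port name, "in" or "out", width in bits); width 1 maps to std_logic.
    """
    def port_type(width: int) -> str:
        if width == 1:
            return "std_logic"
        return f"std_logic_vector({width - 1} downto 0)"

    lines = [f"component {name}"]
    if generics:
        # Entries are separated by ';' with no separator after the last one.
        body = ";\n".join(f"    {g} : integer := {v}" for g, v in generics.items())
        lines += ["  generic (", body, "  );"]
    body = ";\n".join(f"    {p} : {d} {port_type(w)}" for p, d, w in ports)
    lines += ["  port (", body, "  );", "end component;"]
    return "\n".join(lines)


# Hypothetical core described only by its meta-data.
decl = vhdl_component(
    "fir_filter",
    {"TAPS": 8},
    [("clk", "in", 1), ("din", "in", 16), ("dout", "out", 16)],
)
print(decl)
```

The same port list could feed a second renderer for Verilog, which is the sense in which a language-independent description removes the need to write each wrapper by hand.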

2.2 Meta-Data in Commercial Design Tools

Commercial design tools often leverage reuse. These tools typically target a specific type of user and design domain and attempt to make the design process simpler by providing a library of IP with associated meta-data and a design environment capable of composing cores from the library into complete designs. This section discusses the use of meta-data in the Xilinx CORE Generator tool, Xilinx EDK, Xilinx System Generator, and National Instruments LabVIEW FPGA.

2.2.1 Xilinx CORE Generator

The Xilinx CORE Generator tool generates reusable IP based on a set of user parameters [17]. CORE Generator facilitates the delivery of IP cores by allowing designers to generate IP in a language or format that enables its reuse in a vendor-specific design tool. This type of tool is essential to reuse because it provides easy access to a large variety of IP. IP from generation tools such as CORE Generator also tends to be well tested and verified and can therefore be reused without requiring additional testing.

CORE Generator and other IP generation tools use meta-data to describe the particular IP that is to be generated. This meta-data describes the variety of IP available for generation and also captures the different variations that can be created for a core. Meta-data for these generation tools may also include code templates for different output languages as well as directives on how to implement specific parts of the IP. For example, meta-data may describe that VHDL is to be generated, provide a template for that VHDL, and direct that all names are to be case-insensitive.

Meta-data may also be used to describe project-specific and IP-specific parameters and the values that should be used for generation. The CORE Generator tool describes this type of parameterization in two meta-data files. The .xco file describes the parameters for the current CORE Generator project.
This includes data about the target device, the HDL synthesis tool that will be used, the implementation language that should be generated, etc. The .xcp file also describes parameters used in generation, but these parameters are specific to a particular IP. For example, for the CORE Generator's FIR filter these parameters include the number of taps for the filter, the pipeline depth that should be generated, and the type of memory that should be used in the design. CORE Generator also uses a set of IP-specific .tcl files to generate a GUI that allows a designer to parameterize the IP. These .tcl scripts are also a form of meta-data.

Meta-data is essential to the CORE Generator tool. Meta-data provides all of the necessary information to generate a core in any number of languages or formats. Without this meta-data the reuse facilitated by generation tools such as CORE Generator would be impossible.

Xilinx EDK

Both of the large FPGA vendors, Xilinx and Altera, have design tools that leverage reuse of IP to create System-on-Chip designs. The Xilinx Embedded Developers Kit (EDK) [13] is a good example of such a tool. Xilinx states that the EDK "is a suite of tools and Intellectual Property that enables you to design a complete embedded processor system for implementation in an FPGA device" [13]. The EDK environment provides a way for designers to quickly access hardware IP and to integrate it into a complete SoC system. EDK accelerates construction of designs that have a processor, a communication bus, and peripherals that communicate over that bus. In EDK the designer selects processors, busses, and I/O components from a library and uses these to create an SoC system. The EDK also enables the designer to reuse software components in the SoC.

The EDK uses meta-data to describe the reusable IP. An example of the meta-data in the EDK is the set of files required when using custom IP in the EDK environment. When a designer imports reusable IP into the EDK system, they are required to create two files: the microprocessor peripheral definition (MPD) file and the peripheral analyze order (PAO) file. These files define the mapping between different bus interfaces in the EDK and the ports on the reusable IP. These files make it possible for the EDK to recognize reusable IP.
These two files are meta-data that are necessary to reuse IP in the EDK.

Xilinx System Generator

In addition to the EDK, Xilinx also has the System Generator design environment, which is intended for design and deployment of DSP systems to FPGAs [14]. System Generator allows designers to choose from a large library of generated and hard IP and to stitch them together using point-to-point connections. System Generator utilizes the Xilinx CORE Generator [17] system for many of its blocks and also allows users to specify black box components that can contain arbitrary IP. In order for System Generator to correctly generate synchronous systems, it requires that all cores are clocked and have a clock enable signal. System Generator uses these signals to control the flow of data between IP in the system.

System Generator uses meta-data both for native IP from CORE Generator and for arbitrary IP in black boxes. When using CORE Generator IP, System Generator requires that there be a mapping between the ports of the generated IP and the graphical representation presented to the user. This mapping is done with .m code that defines which ports are presented to the user and which signals are the clock, reset, and clock enable signals. When using a black box, the designer must create a .m file that specifies meta-data about the IP to System Generator. This .m file defines the mapping between ports on the HDL and the ports that are needed for the System Generator simulation and synthesis tools. For example, if designing a clocked system, the .m file defines which ports on the IP are clock, clock enable, and reset. In order to use any type of IP in System Generator, meta-data in .m code is required.

National Instruments LabVIEW FPGA

LabVIEW FPGA from National Instruments [20, 15] is a design environment that allows a domain expert to access the computational power of FPGAs by providing the user with a set of easily understood operations in a graphical programming environment. Because these operations are not necessarily hardware operations, nor are they tightly coupled to a specific piece of IP, there must be a mapping between the user operation and some synthesizable IP. LabVIEW leverages meta-data to define the mappings between the high-level description of the algorithm and operations and the low-level implementing IP.
This meta-data is not defined in a user-editable file; however, the user is able to use the IP without having to worry about implementation details because those are understood implicitly by the tool.

Efficient IP reuse always depends on meta-data. Direct reuse of RTL is simplified by having documentation and comments in code. Each of the tools discussed in this chapter, CORE Generator, EDK, System Generator, and LabVIEW FPGA, utilizes meta-data to facilitate reuse. Each tool defines meta-data in its own format, and the meta-data does not always contain the same information. While these differences in format can make it difficult to migrate IP from one tool to another, the meta-data is a primary enabler of the tools for which it was designed.

Chapter 3 XML-Based Meta-Data for Reuse

Because of the importance of meta-data in reuse, it is important that standard methods of representing meta-data be developed. Standard meta-data descriptions can enable design environments to be built that depend only on the standard to simplify and enable reuse. A standard would allow these tools to not rely directly on HDL implementations or on proprietary or tool-specific meta-data. An example of an environment that depends only on standard meta-data is shown in Figure 3.1. This type of design environment would require that all IP are wrapped in a standard meta-data format. This format would enable a generic design environment, independent of language and IP source, to compose designs from meta-data wrapped IP. This design environment could also produce designs in such a way that the result is reusable again in a meta-data enabled environment.

Figure 3.1: Meta-data encapsulation of IP enables a generic design tool to reason with the cores and integrate them in a common framework.

Several research and industry projects have addressed different aspects of standard meta-data encapsulation of cores. MetaRTL is a language that was created from scratch to describe the protocol information for a piece of IP [21]. MetaRTL describes only a high-level view of a core and does not translate directly to a common implementation format. The dataflow interchange format (DIF) is a similar attempt to capture the semantics of data-driven computation using blocks of IP [22]. While these specifications address some of the meta-data needs for reusable IP, they require custom parsers and tools to understand and manipulate that meta-data and have not gained widespread acceptance.

3.1 XML as a Meta-Data Format

Any meta-data standard for IP reuse should depend on commonly available languages and software tools. Extensible markup language (XML) is a powerful mechanism for representing meta-data. Because XML is the standard for data transmission on the web [23], it has the advantage of being widely known and used. Its real power is that it is extensible and, in conjunction with XML Schema [24], can represent virtually any type of data. XML is also a good choice for meta-data representations because there are existing tools for reading, manipulating, and saving XML data in almost any programming language. This base of existing code enables engineers to easily develop design environments using common software techniques without the need to interpret custom meta-data formats.

The meta-data development done in this work leverages XML. This development was done in two stages, with the second stage building on the successes of the first stage and correcting its weaknesses. The initial meta-data development attempt was an XML schema called CHREC XML. This schema was built completely from scratch and attempted to address specifically the needs of FPGA IP.
Before developing this schema, the emerging IP-XACT standard was reviewed and it was decided that it did not sufficiently represent the description needs of FPGA IP [25]. CHREC XML is introduced and briefly discussed in section 3.2. Upon completion of CHREC XML and its associated tools, the updated versions of IP-XACT were reviewed and the determination was made that many of the inadequacies of earlier versions had now been corrected. Because of these corrections, this work chose to continue its meta-data description efforts by leveraging the IP-XACT schema and augmenting it slightly to suit the needs of FPGA IP. The IP-XACT schema and the extensions developed in this work are introduced in section 3.3.

3.2 CHREC XML

The CHREC XML schema organizes the core meta-data into several distinct segments of abstraction: the structural segment, the datatype segment, and the temporal interface segment. Organizing IP meta-data in segments that represent different levels of abstraction allows IP core providers and tool vendors to support the integration of IP at different levels of abstraction and to do this integration independently of other meta-data segments, as shown in Figure 3.2. For example, low-level tools such as a netlisting tool may require only low-level information such as port naming and bitwidths. High-level synthesis tools, however, are better served with a more abstract, higher-level view of the interface, datatypes, and timing information of a core.

Figure 3.2: CHREC XML defines separate segments that represent different parts of meta-data descriptions. These segments are independent of each other in their interpretation and implementation. Tools that use this type of representation need only understand segments applicable to their own operation.

Figure 3.2 shows the three CHREC XML segments. Each of the segments of CHREC XML is defined and used independently of the other segments, and when implemented each segment is defined in a separate XML file. The segmentation approach allows for additional segments of encapsulation information to be easily added without the need to modify the other, unrelated segments. CHREC XML also supports describing cores from any source language or environment. The abstraction segments of CHREC XML are described in detail as follows.

Segment 1: The Structural Segment

The Structural Segment provides all of the needed information to instance and use cores in a basic composition environment and is very similar to the schema described in [25]. The Structural Segment is responsible for IP library taxonomy and naming of IP. This segment also includes the naming of ports and a mapping of these ports to the actual HDL ports. It further includes a list of parameters for the core as well as mathematical expressions and enumerated values that these parameters may depend on. This segment also contains a list of files required to simulate or synthesize the core. The structural segment is especially useful to low-level simulation and hardware synthesis tools whose primary purpose is the structural interconnection of cores and low-level communication between them. This segment of CHREC XML defines the primary meta-data elements that enabled construction of the language- and source-independent design environment and the parameterization manipulation and dependency enforcement tool discussed in Chapter 4.

Segment 2: High-Level Datatype Segment

The high-level datatype segment primarily defines and associates high-level numerical datatypes with their bit-level implementations. Types are specifically defined for bit vector, integer, fixed point, floating point, and custom types. Separate XML element sets specify each type.
Each high-level type defines the mapping between fields of that type and bits in the underlying signal. This meta-data segment defines the relationship between fields and bits for each type and lists each of the signals from the HDL segment that are of that type.
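The field-to-bit mapping described above can be illustrated with a short sketch. It is not the CHREC XML tooling itself; the `Field` class, the `extract_fields` helper, and the 16-bit layout (1 sign bit, 5 exponent bits, 10 fraction bits) are hypothetical names and values chosen only to show how a tool might use such a mapping to pull named fields out of a raw port value.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Field:
    """A named slice of a port's bit vector (hypothetical structure)."""
    name: str
    msb: int  # index of the field's most significant bit
    lsb: int  # index of the field's least significant bit


def extract_fields(raw: int, fields: List[Field]) -> Dict[str, int]:
    """Split a raw port value into its named fields."""
    result = {}
    for field in fields:
        width = field.msb - field.lsb + 1
        mask = (1 << width) - 1
        result[field.name] = (raw >> field.lsb) & mask
    return result


# Illustrative layout only: 1 sign bit, 5 exponent bits, 10 fraction bits.
LAYOUT = [
    Field("sign", 15, 15),
    Field("exponent", 14, 10),
    Field("fraction", 9, 0),
]

decoded = extract_fields(0xC400, LAYOUT)
print(decoded)
```

A tool that only wires ports together never needs this view; a tool that converts between datatypes does, which is the motivation for keeping the datatype information in its own segment.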

The datatype segment is independently useful to a tool which reasons about the details of actually wiring cores together and preserving data integrity. Other details of the core such as parameters and naming are not important to this type of tool. The data typing of signals and the associated bit-based signals described in this segment allow the tool to correctly match bits from one signal to another and to automatically perform any needed conversions of datatypes.

Segment 3: Temporal Behavior

Very little of this segment was actually implemented in CHREC XML. Some initial attempts were made to describe interfaces as finite automata as described in Appendix B, but these proved to be overly complex. This segment of CHREC XML represented the pipeline depth, or latency, of a core and provided a starting point for the further development of interface descriptions in extensions to IP-XACT.

Weaknesses of CHREC XML

There were two primary weaknesses in CHREC XML.

1. Support for parameter dependencies and mathematical relations was complicated and difficult to use.
2. The method used for representing the bitwidth of ports was inadequate.

The method of representing the mathematical dependencies between parameters was weak in CHREC XML. This weakness came primarily from the complex nature of representing dependent parameters. There were several reasons that parameter dependencies were complex.

1. Variables in expressions were based on parameter names. There is no way in current XML to enforce matching between the contents of an arbitrary XML element and text elsewhere in the XML file. Because of this lack of enforcement, it is impossible to verify if the parameter names used in expressions actually exist as parameters in the meta-data description.

2. The mathematical operations defined for CHREC XML did not take into account the syntactical structure of XML. Only expressions involving +, -, *, /, and = were experimented with in IP described in CHREC XML; however, if an expression needs to use > or <, these would cause syntactical errors in XML.
3. The method of doing conditional statements in CHREC XML was incomplete and complex. It involved a long series of overly verbose XML tags that were difficult to understand and write.

XML 1: CHREC XML allows for bitwidths to be described as constant values. This does not allow for determination of which bits of the underlying bit-vector should be treated as the most significant bits.

    <chrec:rtlcore>
      ...
      <chrec:port>
        <chrec:name>x</chrec:name>
        <chrec:sourcename>x</chrec:sourcename>
        <chrec:direction>in</chrec:direction>
        <chrec:portwidth>
          <chrec:bitwidth chrec:resolve="static">31</chrec:bitwidth>
        </chrec:portwidth>
      </chrec:port>
      ...
    </chrec:rtlcore>

The representation of port bitwidths was also weak in CHREC XML. Bitwidths in CHREC XML were represented by a single integer value as shown in XML 1. While this may be adequate for some uses, it did not describe which bit in the signal was the most significant bit. Even with attached numerical types, this discrepancy was not addressed fully because the high-level type simply stated how many bits were in each portion of the signal. This detail was overlooked in CHREC XML because an assumption was made that the left-most bit was always the most significant. While CHREC XML did allow for bitwidths to be represented as a VHDL-style vector (left downto right), an allowance was also made for a single-value description of bitwidth. This allowance created an ambiguity in the description that was difficult to reconcile.
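The ambiguity can be made concrete with a minimal sketch. It assumes the convention described later in this chapter for IP-XACT vectors, namely that the larger of the two indices marks the MSB end; the function name `describe_vector` is hypothetical and is not part of either schema.

```python
from typing import Tuple


def describe_vector(left: int, right: int) -> Tuple[int, str]:
    """Return (width, msb_side) for a port declared with a (left, right) pair.

    Assumption: the larger index marks the most significant bit.
    """
    width = abs(left - right) + 1
    msb_side = "left" if left >= right else "right"
    return width, msb_side


# A VHDL-style "31 downto 0" declaration: 32 bits, MSB on the left.
print(describe_vector(31, 0))

# A "0 to 31" declaration has the same width but the opposite MSB side.
print(describe_vector(0, 31))
```

A single integer, such as the 31 in XML 1, carries no second index to compare against, so a tool reading it has to guess the MSB side rather than derive it.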

When evaluating newer versions of IP-XACT after the completion of CHREC XML, these weaknesses were used as some of the evaluation criteria. IP-XACT addresses these weaknesses well, influencing the decision to migrate to IP-XACT as the basis for meta-data descriptions.

3.3 IP-XACT and Extensions

IP-XACT is a standard XML schema which defines meta-data for describing reusable circuit cores in a vendor and language neutral manner. The IP-XACT [26, 18] standard was developed by The Spirit Consortium and standardized as IEEE 1685 [27]. Targeted primarily for System-on-Chip (SoC) design, IP-XACT defines the busses, ports, configuration, and properties of a reusable core to facilitate core reuse in higher-level designs. IP-XACT enables tools to allow designers to drag-and-drop arbitrarily complex IP into an SoC design and automatically use third-party tools to generate and verify SoC designs. This type of design paradigm simplifies the process of reusing IP by enabling a domain expert to quickly and easily integrate IP from any environment into a new design.

Figure 3.3 provides an overview of the IP-XACT strategy for SoC IP reuse. Reusable cores in IP-XACT are defined as components in XML and exist in a library accessible by an IP-XACT enabled design environment. A designer can select IP from this component library and create complex SoC designs with relative ease. After composing the design, external third-party tools, called generators, are run in sequence as defined by generator chains to verify, simulate, and synthesize the design.

The strength of IP-XACT is in describing cores that are intended for use in System-on-Chip (SoC) designs, which are typically characterized by a centralized processor that is connected to peripheral devices via a standard bus structure [12]. More recently SoCs are also characterized by network-on-chip interconnection schemes [28].
The common denominator between all SoC designs is that they leverage a standard interconnection scheme and protocol for inter-core communication. The IP-XACT standard is specifically designed to describe this scheme. While the intent of IP-XACT was to describe IP for SoC, many of the strengths of IP-XACT are easily adapted to the description of data-flow IP typically used on FPGAs.

Figure 3.3: The IP-XACT Design Environment includes several different types of XML description files that work together to provide design entry and HDL generation. This type of a graphical representation of a library can be automatically generated based on taxonomy given in meta-data.

The general strengths of IP-XACT for describing large libraries of cores include strong parameterization support, hardware port information, and descriptions of interactions with external tools. This research extends IP-XACT by adding descriptions of high-level numerical datatypes and a description of the temporal behavior of data-flow IP. The native IP-XACT elements along with these extensions make IP more reusable and more accessible to designers and domain experts.

Parameterization and Mathematical Expressions

The parameterization approach in IP-XACT addresses the weaknesses of the CHREC XML dependent parameterization method by utilizing the standard XPath expression language [29]. XPath is an expression language that is used by XML parsers to find particular XML elements within a document. In general XPath expressions look very similar to hierarchical paths that might be seen in a file system. The nature of XPath as an expression language enables it to address the weaknesses of CHREC XML's parameterization scheme.

1. XPath provides the ability to add an identification tag to any XML element. XML validation tools enforce uniqueness on these tags. This uniqueness enables expressions to reference variables based on their ID as defined in XPath and removes the need for expressions to have their variables based solely on parameter names.
2. Because XPath defines syntax for expressions and because that syntax is meant to be used in conjunction with XML documents, there are never collisions with XML syntax. For example, instead of using > and <, XPath expressions embedded in XML use &gt; and &lt;.
3. XPath provides a standard way for doing conditional expressions. These conditionals are based on datatypes and use common operators such as * and + to determine values based on conditionals. An example of this type of conditional in IP-XACT is shown in XML 2. This expression means that if Sregsize ≥ 2 then the value should be Sregsize; otherwise, if Sregsize < 2, the value should be 2. Although this representation may not be immediately intuitive, it is standard and can be easily evaluated when parsed by a tool.

XML 2: Definition of a dependent parameter value that is evaluated based on a high-level parameter named Sregsize and has a default value of 2. This dependent parameter utilizes the XPath expression language [29].

    <spirit:value spirit:resolve="dependent"
                  spirit:dependency="(id('Sregsize') &gt;= 2) * id('Sregsize') + (id('Sregsize') &lt; 2) * 2">2</spirit:value>

Ports and Structural Interface

In addition to providing a standardized, robust parameterization framework, IP-XACT also provides information about the structural interface of a core. IP-XACT describes the structural nature of the ports with several important elements: a name, a direction, a width, and a low-level type.
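Returning to the dependent-parameter expression in XML 2, the arithmetic form of the conditional can be checked with a small sketch. This is not an XPath evaluator; it simply mirrors the expression in Python, relying on the fact that a comparison result converts to 1 or 0 when used in arithmetic, which is also how XPath 1.0 converts booleans to numbers. The function name `resolve_sregsize` is hypothetical.

```python
def resolve_sregsize(sregsize: int) -> int:
    """Mirror (Sregsize >= 2) * Sregsize + (Sregsize < 2) * 2."""
    at_least_two = int(sregsize >= 2)  # 1 when the condition holds, else 0
    below_two = int(sregsize < 2)      # exactly one of the two flags is 1
    return at_least_two * sregsize + below_two * 2


for candidate in (0, 1, 2, 5):
    print(candidate, "->", resolve_sregsize(candidate))
```

Because exactly one of the two comparison terms is nonzero for any input, the expression behaves like an if/else that clamps the parameter to a minimum of 2.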

Port Naming

IP-XACT defines several naming elements for ports. It defines a mapping between the meta-data description of the port and the name in the actual HDL with the display name and name elements. The display name provides easy understanding to the user. An extra description element is also provided. An example of the port naming meta-data is shown in XML 3.

XML 3: A port description in IP-XACT. Elements of interest include naming and vector elements as well as mathematical dependencies and low-level types.

    <spirit:port>
      <spirit:name>premu</spirit:name>
      <spirit:displayName>previous Mu</spirit:displayName>
      <spirit:description>the value of mu from the previous iteration</spirit:description>
      <spirit:wire>
        <spirit:direction>in</spirit:direction>
        <spirit:vector>
          <spirit:left spirit:resolve="dependent"
                       spirit:dependency="id('premu_length') - 1">18</spirit:left>
          <spirit:right spirit:resolve="immediate">0</spirit:right>
        </spirit:vector>
        <spirit:wireTypeDefs>
          <spirit:wireTypeDef>
            <spirit:typeName>unsigned</spirit:typeName>
            <spirit:typeDefinition>ieee.numeric_std.all</spirit:typeDefinition>
            <spirit:viewNameRef>source</spirit:viewNameRef>
          </spirit:wireTypeDef>
        </spirit:wireTypeDefs>
      </spirit:wire>
    </spirit:port>

Port Direction

The direction of the port is essential when connecting IP together. Input ports must be connected to output ports and vice versa. It is also important that inputs are not driven by multiple outputs. There is also the possibility of having bi-directional, in/out ports. This information is especially important when attempting to enable automatic synthesis and verification of point-to-point connections. IP-XACT provides this description in its port description as shown in XML 3.

Port Widths

In addition to naming and direction, it is essential to know how many bits wide each port is. Typical HDL is written at a level of abstraction that allows a single logical port to contain multiple bits. This is also an appropriate level for representation in meta-data. There are two basic pieces of information that need to be represented: the number of bits in the port and which side of the port contains the most significant bit (MSB). Both the number of bits and the MSB information are contained in IP-XACT in a vector element. The vector defines the left and right ends of the vector with an integer. The greater of these two integers defines which end of the vector is the MSB, and the absolute width of the port is given by width = |left - right| + 1 (for example, left = 18 and right = 0 describe a 19-bit port). The values of left and right can be parameterized to provide flexibility in implementation. An example of how port widths are represented in IP-XACT is shown in XML 3.

Low-Level Types

For meta-data wrapping of certain types of HDL, the low-level type of the signal is important. For example, if the IP being wrapped is VHDL, it is important to know the VHDL bit vector type, such as whether a signal is a VHDL std_logic_vector or a VHDL unsigned signal. This information allows connections to be made with the proper VHDL or other low-level HDL types to ensure that a completed system will compile and build properly. Not all core wrappers will require a low-level type, but for typed HDLs this is appropriate to represent in meta-data.

Generator Chains

One of the primary advantages of using meta-data to encapsulate core details is that cores from any source can be composed and manipulated in a common environment which is aware of that meta-data. While this is a large advantage, if there is no way to compile or convert the various IP into a common synthesizable language, the meta-data descriptions and the accompanying environment are worthless. IP-XACT addresses this need with generator chains as shown in Figure 3.3. A generator chain in IP-XACT defines sequences of external tools that should be run in order to convert IP from its HDL to a low-level standard format for implementation or simulation. Each IP can then point to one or more tool chains. A single IP may be compatible with several different generator chains depending on the end implementation. For example, a generator chain for a piece of IP in VHDL that is intended for implementation on a Xilinx FPGA might include tools such as XST, PAR, and Xilinx bit-gen. This particular chain also defines commands to download a completed bit file to an FPGA device.

The overall goal of IP-XACT is to encapsulate IP in a vendor and implementation neutral manner in order to facilitate reuse. System-on-Chip is the primary target for IP that are supported by IP-XACT, which provides appropriate meta-data wrappers for this type of design. Native IP-XACT allows SoC designs to be created from arbitrary IP in any language from any environment and allows generic environments to be created to allow designers to reason with these designs.

Modifying IP-XACT for FPGA IP

Because IP-XACT is designed to support SoC design, new methods are needed to meet the description needs of arbitrary non-SoC IP for FPGAs. Two primary concerns need to be addressed in this description. First, many designs that are typically targeted to FPGAs consist of fine-grained IP. They do not tend to fit the SoC design paradigm but tend to be data-flow designs such as DSP applications in which cores communicate through computational data dependencies rather than communicating on a standard interconnect bus [30].
Second, the increasing availability of FPGAs has driven the development of tools that make FPGAs available to users without a digital hardware development background [15]. The SoC model is not suitable and may be too complex for simple use by many of these domain experts. In order to make the computing power of FPGAs available to these domain experts, reusable cores should be encapsulated in meta-data in such a way that they can be easily used in a non-SoC system.

Both of these inadequacies in IP-XACT can be addressed by representing wrapped IP at a higher level of abstraction using meta-data. The abstraction chosen for a particular piece of IP should be appropriate to its intended use [31]. For many FPGA designs and cores a DSP-compatible abstraction is appropriate. Increasing design productivity by increasing abstraction is not new. The concept of raising abstraction to increase productivity was also used when HDLs were originally created [32]. Before HDL, schematic capture was used and designers often had to work at a very low level, sometimes having to design with individual transistors. With HDL came the ability to synthesize logic from a higher-level description without having to manually construct the logic.

The concept of raising abstraction to increase productivity can be applied to existing HDL IP with proper meta-data wrappers. The meta-data that is included in basic IP-XACT is fundamentally structural in its encapsulation approach and does not significantly increase abstraction over a standard HDL description. The structural description provided by IP-XACT can be supplemented and expanded by using IP-XACT as a base description and adding high-level datatypes and temporal behavior information as extensions to the schema. These additional description elements are similar to those that were originally represented in CHREC XML; they provide a method of raising the abstraction level through encapsulation and are easily applied to common FPGA IP used for data-flow computation.

Extensions for High-Level Datatypes

Many data-flow FPGA IP communicate numerical data. A difficulty with the IP-XACT representation of ports is that it does not reflect this higher-level data typing. If datatypes are not represented, data corruption can occur between cores when only bit-widths are required to match on connected ports. Figure 3.4 shows an example of this problem.
Here two 16-bit ports are connected to each other and the most and least significant bits are properly aligned. However, by naively connecting these ports by bitwidth only, there has been a data corruption: the value that was transmitted has been interpreted by the receiver as a different value. This problem becomes more pronounced when floating point or other complex types are used. In order to ensure correctness of data transfer between cores with arbitrary interfaces and to avoid the problem caused by naive bitwidth-only connections, higher-level datatypes should be associated with port bit-vectors.

Figure 3.4: The two fixed-point numbers shown have the same bitwidth; however, if only the bitwidths are matched the data transferred between these types will be incorrectly interpreted.

This work extends the IP-XACT standard to include several numerical and other high-level types that can be associated with signals. This research proposes meta-data descriptions for bit vector, integer, fixed point, floating point, and custom datatypes. These types are briefly discussed here and described in detail in Appendix A.

Bit Vector: Bit Vector types have no associated numerical data and are simply represented as a vector of bits.

Integer: This datatype is basic for standard integer representations of any bitwidth. It can describe unsigned, 1's complement, 2's complement, and signed magnitude integer types.

Fixed Point: This type defines either a number of integer bits or a number of fractional bits to be included. The total distribution of bits between integer and fractional bits can be determined from either of these along with the entire bitwidth of the signal. All of the cores used in the radios built in this study utilize the fixed point representation.

Floating Point: This datatype is highly parameterizable to represent possible divisions of bits in floating point number representations. The floating point type is similar to the IEEE standard for floating point representations. It has three fields, {1 sign bit}{k bits for exponent}{n bits for fraction of significand}, which can be arbitrarily mapped to bits in the underlying signal.

Custom Types: Custom types contain a list of fields which associate a name and a sub-vector of bits from the underlying bit representation.
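The bitwidth-only failure mode can be reproduced with a minimal sketch. The numbers below are illustrative and are not the values from Figure 3.4; the `FixedPoint` class, the `compatible` check, and the `convert` helper are hypothetical names, and the sketch assumes unsigned fixed-point values for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedPoint:
    """Unsigned fixed-point type: total width and number of fractional bits."""
    width: int
    frac_bits: int

    def encode(self, value: float) -> int:
        raw = int(round(value * (1 << self.frac_bits)))
        return raw & ((1 << self.width) - 1)

    def decode(self, raw: int) -> float:
        return raw / (1 << self.frac_bits)


def compatible(src: FixedPoint, dst: FixedPoint) -> bool:
    """Type-aware check: matching bitwidth alone is not sufficient."""
    return src == dst


def convert(raw: int, src: FixedPoint, dst: FixedPoint) -> int:
    """Realign the binary point; assumes dst has no more fractional bits than src."""
    return raw >> (src.frac_bits - dst.frac_bits)


producer = FixedPoint(width=16, frac_bits=8)
consumer = FixedPoint(width=16, frac_bits=4)

raw = producer.encode(3.5)
print(consumer.decode(raw))                               # naive wiring: wrong value
print(consumer.decode(convert(raw, producer, consumer)))  # after realignment
print(compatible(producer, consumer))
```

Both ports are 16 bits wide, so a width-only check accepts the connection, yet the consumer reads 56.0 instead of 3.5. With the datatype recorded in meta-data, a tool can either reject the connection or insert the realignment shown by `convert`.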

These datatypes are used in the architectural synthesis tool presented in chapter 5 to ensure that data transfer between cores is correct.

Extensions for Temporal Interface Behavior

Because IP-XACT was originally intended to describe IP that is used in SoC systems, it does not have a mechanism for describing the temporal behavior of IP interfaces. There are many ways of representing these interfaces and several methods have been developed for representing IP interfaces mathematically. Finite automata, both deterministic (DFA) and nondeterministic (NFA), have been used to describe the protocol and timing behavior of cores with handshaking protocols which interface with various bus protocols. These include the approaches discussed in [33], [34], [35], [36], [37], [38], and [9]. While these specifications are important to many types of IP, they do not meet the requirements of data-flow DSP and FPGA IP, which tend to be data-driven pipelined computational cores and often do not have handshaking type protocols. An attempt at creating an automata-based description for DSP IP is given in Appendix B; however, there has not yet been an attempt to implement this description or to describe it in meta-data.

This work attempts to match the temporal interface behavior of data-flow IP for FPGAs to the homogeneous synchronous dataflow (H-SDF) model of computation. This is done by defining three extensions to IP-XACT that allow IP to be described as actors in an H-SDF graph. These parameters are latency, data introduction interval, and sample delay. These parameters are important in creating a synthesis system that is able to apply architectural synthesis algorithms to systems composed of coarse-grain IP. A complete discussion of these parameters and their use in architectural synthesis is given in chapter 5.

The meta-data formats described in this section, both CHREC XML and IP-XACT with extensions, propose a standard way of representing meta-data to describe data-flow FPGA IP.
Because meta-data is important to reuse, it is helpful to have a standard way of representing that meta-data. A standard mechanism enables tools to be built that can work with the standard and thereby expands the set of IP that is reusable in a tool to all IP that is described in this standard way. The following chapters will address such tools. Chapter 4 will present a structural design tool and a parameterization manipulation interface

based on CHREC XML that demonstrate the use of standard meta-data. Chapter 5 will present a different tool called Ogre that utilizes IP-XACT and the high-level extensions to enable architectural synthesis.

Chapter 4

Meta-Data Enabled Design Environment

This chapter presents a structural design tool that demonstrates the ability of the meta-data developed in this work to represent the structural interface of FPGA IP in a language and source independent manner. This tool enables IP cores from any language to be instanced and connected independently of their underlying representation. In addition to the ability to structurally interconnect cores, this tool demonstrates the ability of meta-data to represent parameterization and the interdependency of parameters in IP in a standard way. The parameterization manipulation interface presented in section 4.2 leverages meta-data descriptions to automatically create a custom parameterization interface for individual IP. This GUI demonstrates the usage of parameter relationships via mathematical expressions by automatically adjusting parameter values and the graphical representation of the IP when a user makes adjustments to parameter values. This structural design tool is based primarily on concepts developed in CHREC XML. However, the meta-data enabled techniques demonstrated in this chapter can also be accomplished by leveraging IP-XACT and the extensions contributed by this work.

4.1 A Structural Design GUI

Meta-data that defines a standard IP naming scheme and that represents the basic structural interface of IP can be used as the basis for a language and IP source independent design composition tool. To enable IP reuse, meta-data can define standard naming methods that enable a designer to rapidly find desired IP. All hardware IP have common elements in their structural interface. When these elements are represented in standard meta-data, a tool can use these elements to represent IP to a designer in a language independent manner.

Meta-data can also enable design composition by defining the structural nature of IP input and output ports. In addition to enabling language and source independent representation of libraries of IP, CHREC XML also enabled the building of a generic structural design tool. This tool allowed a designer to structurally interconnect IP from the library and automatically generate bitstreams for download to an FPGA. The structural interconnection of IP and the use of tools to create completed, downloadable designs were enabled by meta-data.

The general flow for this structural design tool is shown in Figure 4.1. IP in any language could be imported into the library. For VHDL cores, this research developed a parser that automatically created much of the CHREC XML meta-data that was required for representation in the library. Once this basic XML was created by the parser, the user was prompted to add extra information not contained in HDL but required for a complete CHREC XML description.

Figure 4.1: The CHREC XML schema supported a library structure, allowed for basic composition of IP, and allowed appropriate vendor tools to be automatically run as generators on IP. The XML also enabled automatic generation of wrappers for the created core in multiple common languages.

Once IP was imported into the library, the CHREC XML meta-data allowed this IP to be dragged and dropped onto a design canvas where its ports were connected as shown in Figure 4.2. The graphical representation for the IP on the design canvas was automatically generated by the tool based on the meta-data descriptions. The XML was queried to determine the names and widths of the input and output ports and the graphical representation was created based on this data. Because the GUI's representation of cores depended only on the meta-data in XML, any IP represented in XML could be used in this structural composition GUI.

Figure 4.2: The GUI tool demonstrates the ability of CHREC XML to standardize descriptions for cores from multiple environments and enable them to communicate. The two matched filters in this design are from Xilinx CoreGen and the PLL circuit from JHDL.
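The port query described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the XML layout, the element and attribute names, and the helper names `read_ports` and `describe_icon` are hypothetical and are not taken from the CHREC XML schema or from the tool developed in this work.

```python
import xml.etree.ElementTree as ET

# Hypothetical meta-data fragment; element and attribute names are
# illustrative only and are NOT the CHREC XML schema.
CORE_XML = """
<core name="matched_filter">
  <port name="clk" direction="in" width="1"/>
  <port name="din" direction="in" width="12"/>
  <port name="dout" direction="out" width="16"/>
</core>
"""

def read_ports(xml_text):
    """Return (core name, list of (port name, direction, width))."""
    root = ET.fromstring(xml_text)
    ports = [(p.get("name"), p.get("direction"), int(p.get("width")))
             for p in root.findall("port")]
    return root.get("name"), ports

def describe_icon(xml_text):
    """Build the text labels a canvas icon could show for a core."""
    name, ports = read_ports(xml_text)
    inputs = [f"{n}[{w}]" for n, d, w in ports if d == "in"]
    outputs = [f"{n}[{w}]" for n, d, w in ports if d == "out"]
    return {"name": name, "left": inputs, "right": outputs}
```

Because the icon description is derived only from the XML text, the same code would serve a core written in any language, which is the property the structural design GUI relied on.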

Figure 4.2 shows the structural design of part of a QPSK demodulator in the composition GUI. This QPSK fragment used two filter circuits generated by the Xilinx CoreGen tool and a PLL circuit created from JHDL. Because of the different representation languages and sources of these IP, connecting them in a design would normally require significant manual manipulation. However, because of the meta-data wrappers defined in CHREC XML, the composition of these cores could be achieved in a straightforward manner. Once cores were instanced in the GUI, each core icon showed the ports and the tool allowed the designer to parameterize the cores (if parameters existed) and to connect ports graphically.

Once the designer had connected the cores in the GUI as desired, the entire design could be synthesized and a downloadable bitstream created automatically. This was done based on the file sets and the external tools that were defined as generators in CHREC XML as shown in Figure 4.1. For the IP in the QPSK segment shown in Figure 4.2, the Xilinx CoreGen tools were automatically run to create an EDIF version of the filter circuits. The JHDL compilation tools were run to create a structural VHDL representation of the PLL. These cores were automatically instanced and connected in a top-level VHDL file and this VHDL was then passed by the tool to the Xilinx tool chain. The tool issued the commands to the Xilinx tools to synthesize, place, and route the design and generate a complete bitstream.

The language and source independent design environment was enabled by meta-data in CHREC XML. The meta-data enabled a standard representation of libraries of IP and allowed that library to be easily searched. Structural composition of IP was also enabled by the meta-data that allowed the design environment to represent IP graphically regardless of its implementation language or source environment.
While these demonstrations are simple, they show the ability of meta-data to enable tools to organize IP from different sources into a new design. Other tools could also be created that leverage the meta-data in CHREC XML. One such tool that performs architectural synthesis based on IP described in meta-data will be presented in Chapter 5.

4.2 Parameter Representation and Manipulation

In addition to providing the ability to structurally interconnect IP in a graphical environment, the structural design environment also enabled core parameters to be manipulated in a generic GUI. Because reusable IP tends to be highly parameterized, it is important to be able to correctly set parameter values [19]. Two types of parameters were represented in CHREC XML and in IP-XACT with extensions. Traditional low-level parameters such as those typically included in HDL were listed in XML. Higher-level, domain specific parameters that did not exist in the original HDL were also listed. Mathematical expressions enabled the tool to set low-level parameters based on higher-level parameter values. This section discusses the concepts of low-level parameterization, high-level parameterization, the translation between levels enabled by mathematical expressions, and the parameter manipulation GUI that utilized these types of parameterizations.

Traditional Low-Level Parameterization

Hardware cores have traditionally been parameterized with low-level parameters such as bit-widths and operating modes which are typically represented in HDL. These parameters describe relatively low-level changes that can be applied to the core. This low-level parameterization increases the reusability of a core by allowing it to be used in designs that require different bit-widths. Low-level parameters are typically defined by a name-value pair. Their values typically have little or no direct dependency or effect on other parameters. Low-level parameterization is quite common and fairly simple to implement, and meta-data can represent this type of parameterization quite simply.

Traditional parameterization can significantly increase IP reusability for experienced hardware designers; however, this low-level HDL parameterization can be difficult to understand and use for a non-hardware expert.
Even when a core has extensive low-level parameterization, experienced hardware designers still have to understand its low-level implementation in order to integrate it properly into a system. The designer must understand how each of the parameters affects the other parameters and the core's behavior. Extensive low-level parameterization presents a greater challenge to a domain expert, who may be more confused by a highly parameterized core that has only

low-level parameterization. For example, a core with several parameterizable bit-widths that the designer must manually select without a knowledge of how these will affect the operation of the IP may be less attractive than a core with no parameterization at all [19].

High-Level Parameterization

Encapsulating low-level parameters in higher-level, more domain specific parameters can further increase the reusability of cores without causing user confusion. Meta-data can be used to create high-level parameters that did not exist in the original HDL but which are more applicable and understandable to a particular domain. For example, for IP for digital communication systems, high-level parameters that are readily understood by communication systems experts could be defined. These high-level parameters allow the designer to interact with the core at a higher level of abstraction and therefore increase the reusability of the core. This reusability can be realized both for the experienced hardware designer and for the inexperienced domain expert. The need for experienced designers to manipulate the low-level details is removed and the ability of the domain expert to understand the operation of the core is increased because the parameters are more familiar.

Examples of high-level parameters are shown in Table 4.1 for a typical loop filter core from the communications IP library developed in this work [39]. This is a simple first-order loop filter consisting of a multiplier and an accumulator. This core has low-level parameters for signal bit-widths and constant multiplication coefficients; however, high-level parameters specific to communication receivers are presented in the meta-data wrapper. The parameters shown in Table 4.1 are used to perform a non-trivial calculation which determines both the bit-widths and the coefficients necessary for a given signal processing function.
In addition, this core contains a parameter named samplespersymbol, which allows the core to be quickly used in different radio personalities that each operate on a different number of samples for each output symbol computed. Changing this parameter fundamentally changes the core's internal behavior and structure. This high level of parameterization significantly improves the ease of reuse for this IP by allowing users to quickly adapt it to a number of similar but significantly different designs.

Table 4.1: High-level parameterization for loop filter core.

Parameter            Description
loopbandwidth        B_n T, used in calculating constant multiplicand values
loopdampingfactor    ζ, used in calculating constant multiplicand values
phasedetectorgain    K_p, used in calculating constant multiplicand values
accumulationwidth    Number of bits right of radix point for internal accumulator
kprecision           Number of fractional bits used for constant multiplicand values
ddsgain              K_0, used in calculating constant multiplicand values
samplespersymbol     N
order                Loop order: first (no accumulator) or second (with accumulation)

Parameter Dependencies and Translation

Because not all IP is designed with high-level parameters, reusability can be improved by providing a mechanism for enabling IP with robust low-level parameterization to be parameterized at a high level in a meta-data wrapper. Once high-level parameters have been set in meta-data they can be translated to more traditional, low-level parameters as shown in Figure 4.3. High-level parameters can exist in meta-data alone and the meta-data description itself can describe the relationship between derived high-level parameters and actual low-level parameters on the IP. This type of dependency is supported by both CHREC XML and IP-XACT, which both provide mathematical dependency relationships between parameter values.

Meta-data can be used to compute lower-level parameters based on the higher-level parameters defined exclusively in meta-data. For example, if low-level parameters for the bitwidth of a core are available in HDL, meta-data wrapping that IP could present the user with a parameter that asked for the minimum and maximum values expected by the IP. The meta-data could then define the mathematical relationship between this given parameter range and the proper bitwidth parameter that should be set. The ability to parameterize exclusively in meta-data allows higher-level parameters to be created that did not originally exist for a core.
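The range-to-bitwidth example can be made concrete with a small Python sketch. It assumes signed two's-complement integer data, which the text above does not specify, and the function name `signed_bitwidth` is hypothetical; a meta-data expression would state the same relationship declaratively.

```python
def signed_bitwidth(min_value, max_value):
    """Smallest two's-complement width holding every integer in [min, max].

    Assumption: the port carries signed integers. The thesis text only
    says that a bitwidth can be derived from a user-supplied range.
    """
    if min_value > max_value:
        raise ValueError("min_value must not exceed max_value")
    width = 1
    # An n-bit two's-complement word covers -2**(n-1) .. 2**(n-1) - 1.
    while not (-(1 << (width - 1)) <= min_value
               and max_value <= (1 << (width - 1)) - 1):
        width += 1
    return width
```

With this relationship, a user who enters a range of -128 to 127 never sees the low-level bitwidth parameter; the wrapper would derive a width of 8 and pass it to the HDL generic.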
The encapsulation of complex, low-level parameters in higher-level parameters and accompanying mathematical relationships reduces the number of unique parameters that

must be set by the user, while still allowing for powerful, deep parameterization. It should be noted here that not all of the dependencies between high- and low-level parameters must be expressed in meta-data; VHDL functions (in corresponding core module generators) may also be leveraged for the computation of some low-level parameters.

Figure 4.3: The high-level parameters represented only in meta-data can be translated to existing low-level, HDL parameters using mathematical and dependency relationships between parameters.

Because IP can be extensively parameterized it is important for meta-data descriptions to robustly support this parameterization. The meta-data developed in this work, both CHREC XML and extended IP-XACT, supports parameterization. Both low- and high-level parameters can be described in meta-data. The meta-data also describes the relationship between these parameters mathematically and allows higher-level parameters to be defined exclusively in meta-data and be translated into low-level parameters.

A Parameter Manipulation GUI

A parameter manipulation GUI was created to demonstrate the ability of the CHREC XML meta-data to represent parameters. The parameterization of IP used in this GUI was often complicated, with some IP having many interrelated parameters. Because of the meta-data, the parameter manipulation tool was able to automatically set and correct parameter

values based on mathematical expressions.

Figure 4.4: Each core instance can be individually opened and its parameters modified to fit it to a particular use. The parameterization manipulation window is automatically generated on the fly from the meta-data descriptions.

The parameter manipulation GUI was generated on the fly from meta-data describing IP that had been instanced in the structural design tool. After an instance had been created for a core as shown in the GUI in Figure 4.2, the designer could open an individual instantiated core and edit that core's parameters as shown in Figure 4.4. The parameterization interface shown in Figure 4.4 is for the FIR filter that was used in the QPSK segment shown in Figure 4.2.

Meta-data was leveraged to generate the left panel of the GUI shown in Figure 4.4. This panel displayed the different parameters that could be changed by the user. The parameter names and their default values were extracted from the meta-data and appropriately represented on the parameter panel. The GUI presented only the high-level parameters that should be set by the user; all others were automatically calculated and set as defined in XML.

The meta-data also enabled the tools to correctly represent the parameter input method. Some parameter values could be typed in a field, while others required selecting a value from a drop-down menu. This reflected the definition of valid parameter values as defined in meta-data. In CHREC XML some parameters were allowed to fall in a given range of numerical values. Other parameter values were to be selected from a set of choices. These value types were reflected in the manipulation GUI with ranged numerical values being input via a field and choices via a combo-box.

The mathematical expressions defined in meta-data were used by the GUI to ensure proper parameterization of the IP. When users manipulated parameter values, the GUI responded to these manipulations by altering other parameter values to ensure a valid core parameterization. For example, when setting the parameters for the FIR filter shown in Figure 4.4, if the value of the parameter CSETPassbandMin changed, then the minimum valid value for the parameter passband max also changed. If the user had already set a value for this parameter that was now outside of the valid range, the GUI corrected that value to be within the range and notified the user. Other parameters that were dependent on user-set parameters were also modified as the user changed the parameter values.
Parameters in some IP, especially IP from Xilinx CoreGen [17], affected which ports would exist on a particular IP when it was generated. When these parameters were changed in the GUI, the representation in the right panel of the GUI changed to reflect these changes. This right panel also reflected changes in bitwidths of signals as parameter values changed.
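The correction behavior described for the passband parameters can be sketched in Python as follows. This is an illustration under assumptions, not the GUI's implementation: the dictionary keys `passband_min` and `passband_max`, the function name, and the rule that the maximum may not fall below the minimum are all stand-ins for whatever expression the meta-data actually carried.

```python
def apply_passband_min(params, new_min):
    """Set passband_min and re-validate passband_max against it.

    Returns a list of messages for the user (empty if nothing changed).
    """
    messages = []
    params["passband_min"] = new_min
    # Assumed dependency: passband_max may not be smaller than passband_min.
    if params["passband_max"] < new_min:
        messages.append(
            f"passband_max corrected from {params['passband_max']} to {new_min}")
        params["passband_max"] = new_min
    return messages
```

A generic tool would evaluate such dependencies from expressions stored in the meta-data rather than from hand-written functions, so that no part of the GUI is specific to one core.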

The ability of the parameter manipulation GUI to represent parameters was based solely on meta-data from CHREC XML. None of the manipulation GUI was specific to any particular IP. The parameterization of IP from any hardware description language or IP generated by a tool could be represented in this GUI because it depended on meta-data. The GUI was able to update parameter values and change the graphical representation of the IP because of the mathematical parameter relationships in CHREC XML. This type of HDL-independent parameter manipulation is important to describing large libraries of IP from different sources because it provides a common method for setting parameters based on mathematical relationships described in meta-data.

4.3 Language-Specific Wrapper Generation

In addition to manipulation of parameters, the CHREC XML meta-data enabled the tool to create custom wrappers in a variety of languages, both for a single piece of IP and for composed designs. The structural design tool allowed the designer to export a particular IP with its parameterization in a VHDL, Verilog, or EDIF wrapper. The generation of wrappers in multiple languages demonstrated the ability of the meta-data to represent sufficient information to duplicate the interface of an IP core in any language. Wrapper generation was performed by querying the XML for port data and using this data to create a proper wrapper in different languages. If the IP described in XML had parameters, the tool was able to use this meta-data to properly set the parameters in the appropriate wrapper languages. The tool also allowed the designer to export a CHREC XML file that contained the current parameterization for a particular core and allowed the designer to change the naming of the IP to reflect any customization that may have been made. The generation of wrappers and the ability to export the core in several formats all depended on the meta-data in CHREC XML.
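As a rough illustration of wrapper generation from port data, the Python sketch below emits only the VHDL entity declaration of a wrapper. The function names, the tuple layout of a port, and the restriction to integer generics and std_logic ports are assumptions made for the example; they do not describe the generator built in this work.

```python
def vhdl_port_type(width):
    """Map a port width to a VHDL type string."""
    if width == 1:
        return "std_logic"
    return f"std_logic_vector({width - 1} downto 0)"

def vhdl_entity(name, ports, generics=None):
    """Emit a VHDL entity declaration.

    ports    -- list of (name, direction, width); direction is "in" or "out"
    generics -- optional dict mapping integer generic names to defaults
    """
    lines = ["library ieee;", "use ieee.std_logic_1164.all;", "",
             f"entity {name} is"]
    if generics:
        body = ";\n".join(f"    {g} : integer := {v}"
                          for g, v in generics.items())
        lines += ["  generic (", body, "  );"]
    body = ";\n".join(f"    {n} : {d} {vhdl_port_type(w)}"
                      for n, d, w in ports)
    lines += ["  port (", body, "  );", f"end entity {name};"]
    return "\n".join(lines)
```

A Verilog or EDIF back end would consume the same port list, which is why a single meta-data description is enough to target several wrapper languages.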
This meta-data allowed tools to generate wrappers and run tools regardless of the original implementation language because all needed data was encapsulated in the XML.

Meta-data enabled design composition and parameter manipulation in a language independent manner. The tool presented in this chapter leveraged the CHREC XML descriptions of IP to enable structural composition of IP. The graphical representation of cores

in the GUI was generated based on the XML descriptions. CHREC XML meta-data also enabled the automatic generation of top-level design VHDL and synthesis of complete designs to bitstreams by automatically running vendor tools specific to different pieces of IP in the design. Manipulation of parameters was also enabled by meta-data. The parameter manipulation interface was generated based on meta-data descriptions and the mathematical dependencies in CHREC XML enabled the GUI to enforce validity of parameter sets. Meta-data was the key factor in enabling these language- and source-independent design methods.

Chapter 5

Meta-Data Enabled H-SDF Synthesis Using IP

As demonstrated in Chapter 4, it is relatively easy to instance, specialize (i.e., set parameters), and connect arbitrary IP that is described using XML-based meta-data. This type of structural design was simple because all of the data required to perform these tasks was readily available in XML. This chapter introduces another tool that can compose FPGA designs based on XML IP. This tool, however, operates at a higher level of abstraction and uses meta-data to reason about correctness of data transfer between connected IP and to automatically synthesize control circuitry for dataflow designs in FPGAs.

Traditionally, control synthesis systems have used fine-grain IP such as multipliers and adders as their primitive operators. When more coarse-grain IP have been used for synthesis, the set of possible IP was often limited to a small set of IP that is native to the synthesis tool. This work, however, allows any coarse-grain IP that is described in meta-data to be used in synthesis, thus allowing the set of operations for synthesis to include any arbitrary IP from any source.

The addition of two primary types of information to a meta-data description for FPGA IP was required for the construction of this synthesis tool: high-level numerical datatypes and temporal behavior specifications. Meta-data for high-level numerical datatypes describes the datatypes that are represented on input and output ports for IP. This information enables tools to assist a designer in ensuring that datatypes are correctly manipulated as data flows through a design. Meta-data specifying the temporal behavior of IP allows tools to automatically synthesize control circuitry that enforces the proper sequences of IP operation and ensures that no data is lost in communication. This chapter will introduce a design tool known as Ogre that uses datatype and temporal behavior specifications in meta-data to enable the synthesis of dataflow systems.

Figure 5.1: An overview of the Ogre System. There are four primary components: library representation, translation of schematic information to H-SDF graphs, scheduling the H-SDF graph, and synthesizing control and interface circuitry to create a complete downloadable bitstream.

The general flow of the Ogre system is shown in Figure 5.1. Ogre utilizes the Simulink GUI and model file to represent models of systems composed of reusable IP as shown in Figure 5.2. These models are understood by the underlying synthesis system which can then reason about high-level datatypes and synthesize control circuitry. This chapter will discuss the use of high-level datatypes to help a user properly match datatypes between IP in Ogre. It will also present the method used by Ogre to leverage the H-SDF model of computation to schedule IP operation and synthesize control and interface circuitry by leveraging meta-data descriptions.

Figure 5.2: Simulink is used as a front end for design entry by the Ogre tool. IP blocks described in IP-XACT XML are automatically included in a Simulink library and can be dropped onto the Simulink design palette to create complete designs.

5.1 Numerical Datatypes

The Ogre tool used high-level numerical datatypes to check for valid data transfer between IP that have been connected in a design. Much of the IP that is used in dataflow designs for FPGAs operates on numerical data. Because of this, signals in data-flow computations are also often meant to be interpreted as some type of number such as an integer or a fractional number represented as fixed or floating point. When composing dataflow designs and attempting to reuse IP, it is vital to know the mapping of bits in a signal to the numerical data being represented by this signal. If this mapping data is not available, blindly tying cores together will almost certainly result in incorrect data transmission. This is especially important when working with fractional numbers and their fixed and floating point representations. Ogre utilized the meta-data descriptions developed as extensions to

IP-XACT which define a standard way of mapping high-level datatypes to their underlying bit-vector implementations.

Representing Datatypes

The extension to IP-XACT that allows for the representation of datatypes defines mappings between high-level datatypes and the bits that implement those types on a particular signal. There are two components to this representation: the definition of the underlying bit-vector and the mapping of these bits to a high-level type. The bit-vector representation is contained in the meta-data description of each port in XML. The port is defined by the XML vector element with the width of the port being defined as the left side of the vector minus the right side of the vector (left - right).

XML 4: This code snippet shows an example of a high-level datatype extension. This example shows a signed fixed-point type where two bits are used to represent the integer part.

    <spirit:component>
      ...
      <spirit:vendorextensions>
        <chrec:highleveldatatypes>
          <chrec:portdatatype>
            <chrec:name>sfix_2_a</chrec:name>
            <chrec:fixedpoint chrec:sign="2scomplement">
              <chrec:intbits chrec:resolve="static"> 2 </chrec:intbits>
            </chrec:fixedpoint>
          </chrec:portdatatype>
        </chrec:highleveldatatypes>
        ...
      </spirit:vendorextensions>
      ...
    </spirit:component>

The high-level type is defined separately from the description of the port. Each port points to the high-level type that should be used to represent it, allowing multiple ports to be defined by the same type without the need to duplicate the description of the type. An

example of a high-level type definition is shown in XML 4. This particular definition is for a fixed-point datatype. It defines that the two most significant bits should be interpreted as integer bits and the rest of the bits of a signal should be interpreted as fractional bits. For fixed-point datatypes it is also possible to specify the number of fractional bits that should exist on a signal and assume that the rest are integer bits. The extensions to IP-XACT for datatypes and the details of their implementation are discussed at length in Appendix A and in [39].

Utilizing Numerical Types

The IP-XACT extensions developed in this research that describe high-level types were used in the Ogre design synthesis system to assist designers in ensuring that datatypes matched between IP. Ogre ensures first that bitwidths between IP match and then checks the datatypes of these connections. If there is a mismatch, Ogre alerts the user to the problem.

Bitwidth matching is done by performing a traversal of the diagram starting at the inputs and propagating the input bitwidths through the design. Mathematical relationships between parameters and port widths enable the tool to properly set parameters to make bitwidths match between IP. Once all bitwidth matching has been done, the tool iterates over all nets in the design and checks that the high-level datatypes are compatible between the ports on that net. If one of the ports does not match the others, an error is reported to the user, advising the user of which net had the offending port.

Although Ogre leveraged high-level types only for checking of proper data connections, these meta-data defined types provide the ability for a tool to perform more sophisticated datatype synthesis.
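The per-net check described above (bitwidths first, then datatypes) can be sketched in Python. The sketch is not Ogre's code: the `Port` record, its field names, and the choice to compare every port against the first port on the net are assumptions, and the sketch only reports width mismatches rather than resolving them by propagating parameters as Ogre did.

```python
from collections import namedtuple

# A port as the checker sees it; field names are illustrative.
Port = namedtuple("Port", "name width sign int_bits")

def check_net(net_name, ports):
    """Return a list of error strings for one net (empty if consistent)."""
    errors = []
    ref = ports[0]
    for p in ports[1:]:
        if p.width != ref.width:
            errors.append(f"{net_name}: width of {p.name} ({p.width}) "
                          f"differs from {ref.name} ({ref.width})")
        elif (p.sign, p.int_bits) != (ref.sign, ref.int_bits):
            errors.append(f"{net_name}: datatype of {p.name} "
                          f"does not match {ref.name}")
    return errors

def check_design(nets):
    """nets maps a net name to the list of ports connected to it."""
    errors = []
    for name, ports in nets.items():
        errors.extend(check_net(name, ports))
    return errors
```

Each error string names the net, matching the behavior in which the user is told which net had the offending port.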
For example, a tool could leverage the datatypes in meta-data to automatically synthesize datatype conversions between IP when it detects that there is a datatype mismatch. When a tool detects a mismatch, a parameterized block of IP could be inserted between the incompatible ports, making their datatypes compatible as shown in Figure 5.3.

The high-level datatypes developed in this work are essential for any tool that will automatically compose cores for data-flow designs on FPGAs. If the high-level numerical

datatypes are not defined, there will most likely be corruption of data as it moves between IP in a design. The meta-data high-level types included in CHREC XML and in extensions to IP-XACT provide the necessary mapping between the bits of an IP port and the high-level type to which that port's data belongs.

Figure 5.3: Datatype-aware tools can use high-level datatype information to synthesize conversion logic between incompatible datatypes and thereby ensure correct data transmission.

5.2 Representing Coarse-Grained IP as H-SDF Actors

In order to automatically compose arbitrary IP in a dataflow system, a description of the timing behavior of the IP's interface is required. If temporal core behavior for IP can be matched to a particular model of computation, tools will be able to reason with these cores and automatically generate control circuitry for designs. The meta-data proposed in this work to describe timing behavior is based on the homogeneous synchronous dataflow (H-SDF) model of computation [40]. This section will briefly describe the H-SDF model and describe how meta-data implemented as extensions to IP-XACT enabled cores to be mapped as H-SDF actors.

The H-SDF Model of Computation

The H-SDF model of computation defines the execution semantics for a system based on the dataflow relationships between portions of the system. H-SDF is represented by a

directed, vertex-weighted graph $G = \{V, E\}$. Each vertex $v \in V$ is called an H-SDF actor and each edge $(x, y) \in E$ represents the operation precedence between two actors. The edges in $E$ are used to enforce execution semantics on the H-SDF graph. For example, the presence of edge $(a, b)$ in $G$ means that all computation must be done in vertex $a$ before vertex $b$ can start its computation. Computation in H-SDF is done by the actors. When an actor performs a computation, it is said that the actor fires. The weight of the vertex $v$ represents the number of steps required for that particular actor to fire.

In addition to simply defining edges between vertices, H-SDF also uses the notion of tokens to enforce semantics. Each edge in $G$ can contain multiple tokens at any given time. In order for any actor in H-SDF to fire, or perform computations, it must have an input token on each of its inputs. If there are no input edges to an actor, it may fire at any time. When an actor fires it produces tokens on all of its output edges.

(a) Actors A and B fire and produce tokens on outgoing edges.
(b) Actor C fires, consuming all input tokens and producing an output token on outgoing edges.
(c) Actor D fires and consumes all inputs. Computation is now finished.

Figure 5.4: The homogeneous synchronous dataflow model of computation allows each node to fire when one token is available on each of its inputs. Each firing produces one token on the node's output.

Homogeneous synchronous dataflow is a subset of standard synchronous dataflow (SDF) because when H-SDF actors fire they consume only one token from their inputs and produce one token on their outputs. In general SDF, actors are allowed to consume and produce multiple tokens when they fire (i.e., multi-rate dataflow). In this work H-SDF was

chosen as the model of computation because many single-rate systems can be represented as actors that consume and produce single tokens when they perform computations. H-SDF was also chosen because it is easier to use than general SDF.

Figure 5.4 shows an example of an H-SDF model and a valid sequence of actors firing. Because actors A and B have no inputs they can fire at any time. When they fire, they each produce tokens on their output edges. Actor C is allowed to fire once tokens produced by A and B are both present on its inputs. When C fires, it also produces a single output token which is consumed by actor D when it fires. Actor D's firing completes a valid computation from this H-SDF model.

(a) Invalid H-SDF initial conditions. (b) Valid H-SDF initial condition with one initial token in the loop. (c) Valid H-SDF initial condition with two tokens in the loop.
Figure 5.5: A cyclic H-SDF graph must have proper initial conditions. Each cycle in the graph must start with at least one token already on an edge in the cycle. The graph in Figure 5.5(a) is invalid because it has no such initial condition. Figure 5.5(b) shows the simple case of a valid initial condition with one token initially in the loop. Multiple initial tokens are valid as shown in Figure 5.5(c).

The H-SDF model of computation also supports cyclic data dependency graphs. However, when an H-SDF graph is cyclic, care must be taken to correctly satisfy the initial conditions for the computation. For each cycle in an H-SDF graph there must be at least one token on an edge in that cycle. If there is no token in the cycle, the computation will not be able to start because no actor will have the needed inputs. Multiple tokens may exist in the cycle or even on a single edge, but at least one token must be in the cycle. An example of proper and improper initialization of H-SDF graphs is shown in Figure 5.5.
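The firing rule and the initial-token requirement described above can be illustrated with a short simulation. The following Python sketch is not part of the tools developed in this work; the function name fire_once_each and the edge-list representation are assumptions made purely for illustration. It fires each actor at most once per iteration and reports a deadlock when no remaining actor has a token on every input.

```python
def fire_once_each(actors, edges, initial_tokens=()):
    """Simulate one H-SDF iteration (hypothetical helper, for illustration).

    actors: list of actor names
    edges: list of (source, destination) pairs
    initial_tokens: edges that carry one token before execution starts
    Returns (firing_order, completed).
    """
    tokens = {edge: 0 for edge in edges}
    for edge in initial_tokens:
        tokens[edge] += 1

    order = []
    pending = list(actors)
    while pending:
        # An actor is ready when every one of its input edges holds a token.
        ready = [a for a in pending
                 if all(tokens[e] > 0 for e in edges if e[1] == a)]
        if not ready:
            return order, False  # deadlock: no actor has all of its inputs
        actor = ready[0]
        for e in edges:
            if e[1] == actor:
                tokens[e] -= 1  # consume one token per input edge
            if e[0] == actor:
                tokens[e] += 1  # produce one token per output edge
        order.append(actor)
        pending.remove(actor)
    return order, True


# Acyclic graph with the shape described for Figure 5.4.
acyclic = [("A", "C"), ("B", "C"), ("C", "D")]
print(fire_once_each(["A", "B", "C", "D"], acyclic))

# Hypothetical three-actor cycle, without and with an initial token.
cycle = [("a", "c"), ("c", "b"), ("b", "a")]
print(fire_once_each(["a", "b", "c"], cycle))
print(fire_once_each(["a", "b", "c"], cycle, initial_tokens=[("b", "a")]))
```

Under these assumptions the acyclic graph fires in the order A, B, C, D, the cycle without an initial token cannot start, and a single initial token on the edge from b to a allows every actor in the cycle to fire once.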

The execution semantics of H-SDF allow a static schedule to be computed for the graph. This schedule will be a repeating schedule that defines the relative start times for each of the actors. When using H-SDF to represent hardware systems, the schedule can be mapped onto clock cycles for pipelined IP.

Because H-SDF enforces execution semantics on a dataflow graph, it is useful in describing the execution of single-rate dataflow systems for FPGAs. If each IP core in a system can be interpreted as an actor using H-SDF semantics, then the execution semantics of H-SDF can be used to determine how the hardware system should execute by creating a static schedule for the operation of IP in the system. This research defines three meta-data elements that allow coarse-grain IP to be described in a way that allows them to be interpreted as actors in an H-SDF graph. This meta-data defines the latency, the data introduction interval, and the sample delay for IP.

5.2.2 Latency

The latency of an IP core is the number of clock cycles that elapse from the time that data is consumed on the inputs of the core to the time that the corresponding results are produced on the outputs. This does not mean that the core is pipelined in the traditional sense or that data can be accepted by the core on every cycle. For example, cores that accept data only every 8 cycles and take 9 cycles to compute a result would be given a latency value of 9.

When mapping an IP core onto an H-SDF actor, the latency of the IP is represented in the H-SDF graph by the weight of the actor. Because this weight defines the amount of time that elapses while an actor is performing a computation, this weight can be interpreted as the number of latency clock cycles. This information can be used by H-SDF scheduling algorithms to determine the time that data will appear on the output of a core. The latency can also allow synthesis algorithms to appropriately control IP downstream to wait until valid data has been produced by the IP.
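The latency example above can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function name io_cycles and its parameters are hypothetical and do not come from the tools described in this work. It lists the clock cycle on which each sample is consumed and the cycle on which the corresponding result appears, for a core that accepts data every 8 cycles and has a latency of 9.

```python
def io_cycles(latency, accept_interval, samples):
    """Return (input_cycle, output_cycle) pairs for consecutive samples.

    latency: cycles from consuming inputs to producing the matching result
    accept_interval: cycles between successive accepted inputs
    Both names are assumptions made for this illustration.
    """
    return [(k * accept_interval, k * accept_interval + latency)
            for k in range(samples)]


# A core that accepts data every 8 cycles and takes 9 cycles per result.
for consumed, produced in io_cycles(latency=9, accept_interval=8, samples=3):
    print(f"consumed on cycle {consumed}, result on cycle {produced}")
```

The first result appears on cycle 9 even though a second sample has already been accepted on cycle 8, which is why the latency alone does not say how often a core can accept data.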

5.2.3 Data Introduction Interval

The data introduction interval for a core describes how many clock cycles must elapse between the introduction of data for each new sample. Cores with a data introduction interval of one can accept new samples each clock cycle. The data introduction interval of a core is independent of its latency. For example, a core that has a data introduction interval of 3 can consume data on clock cycle 0 but then will not consume data again until clock cycle 3 and then again on cycle 6. This same core may take 9 cycles to compute a result from a set of inputs.

The data introduction interval imposes an additional constraint on the scheduling algorithms that generate control for H-SDF execution. If tools are aware that a core can only accept new data every n clock cycles, then any synthesized control circuitry must ensure that data is given to a core only when it is able to receive it.

5.2.4 Sample Delay

Sample delay is perhaps the most complicated parameter used to describe IP as H-SDF actors. The sample delay is the number of cycle iterations separating an actor from the downstream actors. In other words, the sample delay defines how many cycle iterations later the data produced by an IP will be needed for computation. Sample delay is important when IP are going to be used in a cyclic manner. For example, in the design shown in Figure 5.2 there is a cycle in the design. The sample delay defines the break between iterations of the cycle.

Sample delay can also be thought of as the number of initial tokens in a cycle in an H-SDF graph. If we consider a design to be represented as an H-SDF graph, we know that there must be at least one initial token on an edge in the cycle in order to enable this loop to execute properly according to H-SDF semantics. It is this initial condition that the sample delay represents. The sample delay parameter indicates the number of H-SDF initial tokens existing on the outputs of a particular IP.
Another way to conceptualize sample delay is that it represents the state generated by the previous iteration of the loop. For example, in Figure 5.5(b) the token that exists on

the arc ba represents the result of the computation done by the previous execution sequence {a, c, b}.

This research chose to represent sample delay as a property of a piece of IP. While the proper way of representing sample delay in cyclic models is an open research question, there are several advantages to representing it as a property of a particular IP block. Representing sample delay as a property of a block of IP is especially useful when IP cores are used in situations that are closely related to their original design. When the sample delay is a property of particular communications IP, for example, these blocks need only be inserted in a cycle and the sample delay of that cycle is automatically satisfied. This type of representation is also logical when thinking of sample delay simply as the state from the previous iteration of the cycle. If the IP with the sample delay also has internal registers to maintain state, these registers contain the result of all upstream computation in the cycle.

This type of representation, however, is not without its weaknesses. It breaks down if a core is used in a situation that is not similar to its original use. This may cause the sample delay on the IP to be in an incorrect location. The demonstration avenue for these techniques was communication systems. IP that are defined to have sample delay in these communication systems are rarely if ever used in a situation different from their initial usage. There may be models, however, that require a computation cycle but do not have any IP with an implicit sample delay. For this purpose, a sample-delay marker that could be used in Simulink models was also created in this research.

5.2.5 IP-XACT Extensions for H-SDF

The meta-data description elements needed to represent coarse-grain IP as actors in H-SDF were implemented as a set of XML elements called the behavioral layer. This set of elements was added to IP-XACT as a vendor extension.
XML 5 shows the definition of a temporal H-SDF interface as it appears as an IP-XACT extension. This particular interface has a data introduction interval of 7, a pipeline depth of 8, and a sample delay of 0. This representation method allows sample delay to be represented as part of the IP core and does not require the user to understand the complex concept of sample delay.
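To illustrate how a tool might consume this behavioral layer, the following Python sketch reads the three temporal values from a fragment shaped like the one in XML 5. It is a minimal sketch and not the parser used in this work: the namespace URI, the function name read_behavioral_layer, and the numeric values are all assumptions, and the element names are copied as they appear in the listing.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the listing does not show the real one.
CHREC_NS = "http://example.org/chrec"

# Illustrative values only; they are not taken from any core in the library.
FRAGMENT = f"""
<chrec:behaviorallayer xmlns:chrec="{CHREC_NS}">
  <chrec:dataintroductioninterval>3</chrec:dataintroductioninterval>
  <chrec:pipelinedepth>9</chrec:pipelinedepth>
  <chrec:sampledelay>1</chrec:sampledelay>
</chrec:behaviorallayer>
"""


def read_behavioral_layer(xml_text):
    """Return the temporal description of a core as a dict of integers."""
    root = ET.fromstring(xml_text.strip())
    ns = {"chrec": CHREC_NS}

    def field(name):
        # Look up a direct child element and convert its text to an int.
        return int(root.findtext(f"chrec:{name}", namespaces=ns))

    return {
        "data_introduction_interval": field("dataintroductioninterval"),
        "latency": field("pipelinedepth"),  # pipeline depth maps to actor weight
        "sample_delay": field("sampledelay"),
    }


print(read_behavioral_layer(FRAGMENT))
```

Mapping the pipeline depth to the actor latency follows the translation described later in this chapter; a real IP-XACT document would carry this fragment inside a vendor extension element rather than as a stand-alone root.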

XML 5: Definition of temporal interface for H-SDF compliant cores.

    <chrec:behaviorallayer>
      <chrec:dataintroductioninterval>7</chrec:dataintroductioninterval>
      <chrec:pipelinedepth>8</chrec:pipelinedepth>
      <chrec:sampledelay>1</chrec:sampledelay>
    </chrec:behaviorallayer>

The IP-XACT extensions representing H-SDF interfaces enabled scheduling and synthesis algorithms to be applied to IP that had this type of interface. These types of algorithms allowed hardware to be automatically synthesized to control the flow of data between the H-SDF cores.

5.3 Applying H-SDF Synthesis Techniques to Coarse-Grain IP

The description of coarse-grain IP as actors in the homogeneous synchronous dataflow model of computation allows traditional architectural synthesis algorithms to be used to synthesize control logic for systems. Although the algorithms used for this synthesis have been used before, the meta-data presented in this work enables coarse-grain IP to be used as primitives in these algorithms.

There are several assumptions made in Ogre about the structural interfaces of the IP that will be used to create designs. All IP must have a fixed latency and all inputs must be consumed on the same clock cycle. The latency may be parameterized; however, once a particular instance of the IP exists, the latency must be the same for every computation done by that core. Ogre also assumes that each IP has two control signals: clock-enable and data-valid. These signals are used by synthesized control circuitry to properly start and stop IP operation.

This section will describe the method used in the Ogre tool to leverage meta-data to perform architectural synthesis. The Ogre tool leverages the Mathworks Simulink tool as an input method for data-flow designs as shown in Figure 5.2. Once IP have been connected in the Simulink GUI, the Ogre tool can parse the .mdl file and use the meta-data describing the IP to perform synthesis of complete designs.
An overview of this synthesis flow is shown in Figure 5.1. Details of the synthesis flow will be described in this section. The method of

constructing an H-SDF dataflow dependency graph will be presented as well as an overview of the iterative modulo scheduling algorithm that was used as a first step toward creating control circuitry. The method of converting schedules to finite state machines will also be presented.

5.3.1 Translating Schematics to H-SDF Graphs

Before architectural synthesis algorithms could be applied to systems composed of IP described in meta-data, the interconnection of the IP had to be represented as an H-SDF graph. This translation used the structural interconnection between IP described in the Simulink GUI and the meta-data describing each of the blocks to create the H-SDF graph.

An intermediate netlist data structure was used as part of the translation that represented the connectivity of the complete design. This netlist structure, shown in Figure 5.1 as the Ogre Netlist, represented all of the data that was contained in the extended IP-XACT meta-data. The netlist structure was connected to a library of IP meta-data that it queried to determine the structural interface of a core, its parameters, its datatypes, and its temporal behavior. Each of these description elements was encapsulated in an instance of each core in the design. Each of these instances contained ports that could be connected in the netlist structure to represent a full or partial design.

The first step in translating a Simulink model file to an H-SDF graph was to populate the Ogre netlist structure with instances of the IP that were in the design and to connect the data ports as represented in the model. Many of these IP were parameterized, and the parameter values set in Simulink were translated into the IP instance in the Ogre netlist. Once parameter values were set, Ogre verified the correctness of the provided parameter set by using the mathematical expressions provided in the meta-data.
Once these parameters were validated, mathematical expressions were used to properly set all of the low-level parameters on each of the IP instances.

Datatype checking was a two-step process. First the bitwidths were set to be compatible across the design. The bitwidths set by the user on the input ports in Simulink were used as a starting point to propagate the bitwidths throughout the design. Many of the IP used had parameterizable bitwidths. Because of this parameterization, making bitwidths

compatible was often a simple matter of setting the correct parameter to match bitwidths. When parameters could not be set to correctly resolve bitwidths, Ogre would report this to the user, who would have to resolve the conflict. While resolving bitwidth values in the design, Ogre also checked for conflicting high-level datatypes. Mismatches identified using the datatypes defined in meta-data were also reported to the user.

(a) The initial translation of design to H-SDF represents initial conditions (sample delay) as a distance on the edge after the IP with the sample delay. Node weights reflect the latency of IP. (b) After scheduling, nodes are annotated with the start time determined by the iterative modulo scheduling algorithm.
Figure 5.6: The two H-SDF graphs shown here are the graphs created to represent the design shown in Figure 5.2. Before scheduling, only weights and sample delay are present on nodes. Scheduling applies a start time to each node.

Once the Ogre netlist was completely populated from the Simulink description, an H-SDF graph was produced that represented the temporal behavior of the computation system defined in Simulink. For each of the instances of IP in the netlist an H-SDF actor was created. The weight of this actor was the latency of that particular IP. For each wire in the netlist the corresponding edge was created in the H-SDF graph. These edges were

weighted according to the sample delay description of their source actor. Because sample delay occurs infrequently in blocks, the weight of most edges was 0. Input and output ports were also represented as actors but their weight was always 0. They were included only as a means for determining a consistent starting position for the scheduling algorithm discussed in subsection 5.3.2.

An example of the result of this translation is shown in Figure 5.6. This particular example shows the H-SDF graph that results when the design shown in Figure 5.2 was translated to H-SDF. Note the translation of the pipeline depth to the latency or weight of each of the nodes. Also note that because the nco has a sample delay value of 1 there is a weight of 1 applied to the edge from the nco to CMult1.

The translation from dataflow block diagram to H-SDF graph was enabled by the Ogre netlist structure that was based on meta-data contained in the extended IP-XACT specification developed in this work. This meta-data allowed a simple H-SDF graph to be created in which coarse-grain IP from a library were represented as actors and the connections between them in the data path of a design were represented correctly.

5.3.2 Applying Iterative Modulo Scheduling

The H-SDF model of computation defines its execution semantics, but the implementation of these semantics must be properly represented in hardware in order to produce a working design. As a first step in translating the semantics of H-SDF to hardware, Ogre applied a scheduling technique to determine the relative start times of each of the IP in a design based on a global clock signal. Ogre used an iterative modulo scheduling (IMS) approach to compute this schedule as described in [16]. The IMS algorithm described in [16] was intended for general scheduling of multi-cycle actors onto available processors. Ogre simply needed to determine the clock cycle on which data would be ready for each IP to use to do computation.
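The translation rules given earlier in this section (actor weight equals latency, edge weight equals the sample delay of the source actor) and the idea of assigning a start cycle to every actor can be sketched in Python. This is not the IMS algorithm of [16] and not the Ogre implementation. It is a simplified longest-path relaxation that ignores data introduction intervals and resource limits, and it searches for the smallest repeat interval, in clock cycles, for which consistent start times exist. The function names, instance names, and latencies below are hypothetical.

```python
def build_hsdf(instances, wires):
    """instances: name -> (latency, sample_delay); wires: list of (src, dst)."""
    weight = {name: latency for name, (latency, _) in instances.items()}
    # Each edge is weighted with the sample delay of its source actor.
    edges = [(src, dst, instances[src][1]) for src, dst in wires]
    return weight, edges


def schedule(weight, edges, max_interval=64):
    """Return (interval, start_times) for the smallest feasible interval.

    Constraint per edge: start[dst] >= start[src] + weight[src] - interval * delay.
    """
    for interval in range(1, max_interval + 1):
        start = {actor: 0 for actor in weight}
        for _ in range(len(weight)):
            changed = False
            for src, dst, delay in edges:
                earliest = start[src] + weight[src] - interval * delay
                if earliest > start[dst]:
                    start[dst] = earliest
                    changed = True
            if not changed:
                return interval, start  # every constraint is satisfied
        # Still changing after |V| passes: a cycle cannot be satisfied.
    raise ValueError("no feasible interval up to max_interval")


# Hypothetical three-core loop; only "osc" carries a sample delay.
instances = {"mult": (2, 0), "filt": (3, 0), "osc": (1, 1)}
wires = [("mult", "filt"), ("filt", "osc"), ("osc", "mult")]
weight, edges = build_hsdf(instances, wires)
interval, start = schedule(weight, edges)
print(interval, start)
```

For this hypothetical loop the total latency around the cycle is 6 and there is one sample delay, so the sketch settles on an interval of 6 with start times 0, 2, and 5. A cycle with no sample delay on any edge never becomes feasible, which mirrors the initial-token requirement of H-SDF.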
The power of IMS for this use was that IMS computed a minimum initiation interval (II) for cyclic H-SDF graphs. The initiation interval was the number of clock cycles that must elapse between times that the design was able to consume new data. Minimizing the initiation interval was important because lower

initiation intervals corresponded to higher throughput for cyclic data-flow designs. Because many of the designs developed for this work were cyclic in nature, this was an appropriate algorithm choice.

(a) Most common scheduling algorithms require that a complete computation iteration be finished before beginning another. This type of scheduling generally does not produce the minimum II; in this case II=8. (b) The IMS schedule allows for iterations of a loop to overlap each other. Each color represents the progression of a complete computation through the H-SDF graph. This produces a schedule with II=4, which is much better than II=8 and is the minimum possible for the H-SDF graph shown in Figure 5.6(a). (c) The kernel of the IMS schedule represents the repeated start times for IP in a cycle.
Figure 5.7: Scheduling possibilities for the design shown in Figure 5.2 and the H-SDF graph shown in Figure 5.6(a). Scheduling algorithms determine the start times for H-SDF actors. The iterative modulo scheduling algorithm defines the start times for IP in a kernel that allows iterations of a cycle to overlap and computes the minimum initiation interval for an H-SDF graph.

Many scheduling algorithms, when applied to cyclic H-SDF graphs, did not allow for the minimum initiation interval. For example, many algorithms required that a complete computation through the graph be complete before beginning a new computation as shown in Figure 5.7(a). IMS allowed a schedule to be created that overlaps different computational iterations as shown in Figure 5.7(b). To produce this type of schedule, the sample delay characteristic of IP was very important. Because sample delay in H-SDF represented initial

values available to an actor, IP that are downstream from a sample delay could be scheduled before nodes that came before them in strict dataflow. The schedule in Figure 5.7(b) is the generated schedule for the H-SDF graph shown in Figure 5.6(a). Notice that in the schedule in Figure 5.7(b) CMult1 started before the nco even though the dataflow edges in Figure 5.6(a) seem to require that the nco run first.

The IMS algorithm computed a schedule kernel that described the repetitive schedule that should be used to continually operate the H-SDF graph properly. An example of this kernel is shown in Figure 5.7(c). The ability of sample delay to allow IMS to fold long schedules into shorter schedules allowed Ogre to compute the minimum II for an H-SDF graph. This minimum II allowed for maximum throughput on data-flow hardware designs.

Once Ogre had completed the scheduling process through IMS, the nodes of the H-SDF graph were labeled with their start times as shown in Figure 5.6(b). The schedule kernel produced by IMS allowed control circuitry to be created to control the passage of data through the hardware.

5.3.3 Control Synthesis

Once a schedule had been generated for the design, this schedule could be used to generate control circuitry for the system. The Ogre synthesis system used the schedule generated by the IMS algorithm to create a finite state machine (FSM) that controlled when each of the IP in the design was active. By ensuring that IP were active only during their scheduled times, this FSM was able to ensure that data moved between IP during the correct clock cycle.

The FSM generated by the Ogre system assumes that clock-enable and data-valid signals exist on the IP. Which hardware ports on the IP correspond to these types of signals was described in meta-data. The FSM controlled the flow by properly manipulating the values on the clock-enable and data-valid signals.
When the schedule indicated that an IP should start, the FSM would raise the data-valid signal for a single clock, signaling to the IP that it should begin computation because there was real data on its inputs. This data-valid signal often was connected to an enable signal on the first bank of pipeline registers in the IP block. In addition to starting the computation with the data-valid

signal, the FSM raised the clock-enable signal on the IP for each of the clock cycles that the IP should be running as determined by the length of the schedule time.

Ogre synthesized the FSM and other interface circuitry in VHDL. When synthesizing the FSM, a VHDL file was produced to implement the proper behavior. The FSM was also added to the Ogre netlist structure as an instance of a component. The FSM was connected to the proper signals in the netlist based on the port names determined from the meta-data. Once the FSM had been connected to the IP in the system, global clock and reset signals were also added and connected to IP as needed. An example of the generated VHDL for a state machine is shown in Appendix D.2.

At this point, the Ogre netlist structure represented a complete and correct hardware design. Synthesizable VHDL was automatically created from the netlist structure. The data path between IP was created and the control from the FSM was connected in VHDL. This generated, top-level VHDL file was then passed to a traditional synthesis flow to create a downloadable bitstream. An example of the generated top-level VHDL is shown in Appendix D.1.

Creation of the synthesis algorithms used in Ogre was possible because of the meta-data in IP-XACT with the extensions describing datatypes and temporal behavior. Meta-data enabled data-flow designs captured in Simulink to be translated into a structural netlist in Ogre. This netlist and the meta-data were then used to create an H-SDF graph that enabled the IMS algorithm to produce a schedule that minimized the II for the design. The IMS schedule was used to synthesize a finite state machine that ensured that data moved through the design correctly. This FSM was included in a top-level VHDL file that could be synthesized to a bitstream and downloaded to an FPGA.
Ogre demonstrates the ability of meta-data to enable tools to increase design productivity by performing tasks that are traditionally required of human designers when reusing IP. Using Ogre, a designer no longer has to manually create control circuitry to ensure correct flow of data. The designer does not have to worry about bitwidth and datatype correctness. While Ogre may not be suitable for all types of designs, the ability of Ogre to synthesize fully functional data-flow designs shows that meta-data can enable tools that can increase design productivity by performing complex architectural synthesis.
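The control-synthesis step described in this section can be pictured as a table with one row per state of the repeating schedule. The Python sketch below is one plausible reading of that description and not the VHDL generator used in Ogre: the function name control_table, the core names, and the schedule values are hypothetical. For each state it raises data-valid on the cycle an IP is scheduled to start and holds clock-enable for as many cycles as the IP's latency.

```python
def control_table(start, latency, interval):
    """Build per-state control values for a repeating schedule.

    start: core name -> scheduled start cycle
    latency: core name -> latency in clock cycles
    interval: number of states in the repeating schedule
    Returns a list with one dict per state.
    """
    table = []
    for state in range(interval):
        row = {}
        for core in start:
            # Cycles elapsed since this core last started, modulo the interval.
            phase = (state - start[core]) % interval
            row[core] = {
                "data_valid": phase == 0,                # single-cycle start pulse
                "clock_enable": phase < latency[core],   # active while computing
            }
        table.append(row)
    return table


# Hypothetical schedule: three cores repeating every 6 cycles.
start = {"mult": 0, "filt": 2, "osc": 5}
latency = {"mult": 2, "filt": 3, "osc": 1}
for state, row in enumerate(control_table(start, latency, interval=6)):
    active = [core for core, sig in row.items() if sig["clock_enable"]]
    print(state, active)
```

A generated FSM would drive the ports named in the meta-data with these values on every state; a core whose latency is at least as long as the interval would simply keep its clock-enable raised in every state.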

Several radio receivers were developed using the Ogre system. This development leveraged a library of IP cores that were described in IP-XACT and that were able to be used as H-SDF actors. Ogre was used to develop several different radio personalities. The development of radios in Ogre and the design productivity improvements observed in this development will be presented in the next chapter.


Chapter 6
Meta-Data Enabled Rapid Radio Development

To demonstrate the usefulness of the Ogre design environment and show the potential design productivity gains, several radio designs were developed. These designs were based on a library of IP and the designs were created both by hand in VHDL and by using the Ogre synthesis system. The results of this radio construction demonstrate that the meta-data, in particular IP-XACT extensions, and the accompanying Ogre design flow reduced design time for the selected radios from a few days to less than an hour per radio.

To further test the flexibility of the meta-data and the Ogre synthesis system, seven different QPSK designs, each of which uses a different set of IP and occupies a different location in the area/time trade-off space, were created. This variety of designs demonstrated the ability of Ogre to support rapid design space exploration and find a variety of solutions to a problem instance by automatically handling many of the timing details for the designer.

6.1 A Highly Parameterized IP Library

The design of digital radio receivers was chosen as the demonstration vehicle for this work because of the close correlation of digital radio designs to the H-SDF model of computation. To that end, a library of building blocks suitable for the creation of a variety of radio personalities was developed. The blocks were first created as parameterized VHDL modules. The development of these modules and the decisions about how to parameterize and partition the IP took approximately 3 months.

Meta-data descriptions of the IP cores were created in IP-XACT with the extensions discussed in Chapter 5. The temporal characteristics of the cores were specified to facilitate their use within the H-SDF model of computation. High-level datatypes were also used to describe the input and output data for the IP.

Table 6.1: A listing of different versions of blocks that were created and their timing/area characteristics. Latency is measured in clock cycles and is therefore omitted in combinational versions which have no input clock. Block Delay is the total time from when the input is presented to when the corresponding output appears. (These results are based on a Virtex 4-SX35 FPGA.)

Columns: Block Type | Latency (cycles) | Block Delay (ns) | Max Freq. (MHz) | Area (Slices) | Area (DSP48s)
Block types: Cubic Interpolators; Decision (QPSK); Timing Error Detectors; Loop Filters; NCO; Calculate Mu; Phase Error Detectors; Clockwise Rotations; DDS

The radio receiver personalities targeted in this research include QPSK, Offset QPSK, PCM/FM, 16QAM, 8PSK, and 16APSK, although other desired constellations may also be possible with slight adjustments to the developed block set. The creation of the block set took advantage of the fact that there are many recurring blocks in these different radio types [41]. These recurring blocks include interpolators, timing error detectors (TEDs), phase error detectors (PEDs), loop filters, direct digital synthesis (DDS) blocks, and numerically controlled oscillators (NCOs). A list of some of the created blocks and their functions is as follows:

Clockwise Rotation: Rotates a complex signal by a certain angle, determined by sine and cosine inputs.
Interpolator: FIR filter which outputs an approximation of the desired sample based on available sample values.
Decision Block: Finds and outputs the constellation point for a given modulation scheme that is closest to the processed input value.
Timing Error Detector: Computes the sample timing error. The rest of the blocks in a typical timing loop attempt to drive this error to zero.
Loop Filter: A proportional-plus-integrator filter. This is commonly used to smooth the output error signals coming from the TED and PED cores.
Numerically Controlled Oscillator: Part of a typical timing loop control; generates control signals for TED and PED.
Calculate Mu: Generates fractional interval, µ, typically for use by an interpolator.
Phase Error Detector: Computes a sample phase error. The rest of the blocks in a typical phase loop attempt to drive this error to zero.
Direct Digital Synthesizer: Generates sine and cosine outputs based on an input phase value.

One of the goals of this work was to support the exploration of cost/performance points in the overall solution space for each radio personality selected. Thus, multiple versions of each block were designed which differ in their temporal behavior as well as in their timing and area characteristics. For each block there are combinational versions as well as heavily pipelined versions to facilitate different radio implementation requirements. In addition, the various blocks exhibit different data consumption rates based on their internal design. Table 6.1 lists the blocks in the library and their variations. For example, four cubic interpolator blocks are available to support a range of latencies, clock rates, and resource requirements.

6.2 Manually Constructing Radios

The meta-data enabled library of cores was used to manually construct two different radios in VHDL. This was done by manually selecting IP from the library, manually determining the proper datatype conversions between the IP, and manually creating a finite state machine to control the flow of data through the design.

The first manually built VHDL design was a basic combinational QPSK demodulator which consumed one data sample per clock cycle. Construction of the combinational QPSK radio was fairly straightforward. Connecting the cores in VHDL and producing a working radio that could run on hardware with a zero bit error rate took about a day. This radio test was fairly uninteresting, but it proved that the VHDL cores were functionally correct.

The second VHDL design was more difficult to create. This design consisted of pipelined versions of many of the cores and was thus able to run at a much higher clock rate. However, the pipelined cores increased the complexity of the radio design because of feedback in the design and the difficulty of aligning data dependencies in the loop.
After determining the desired timing and sequencing required, a finite state machine controller was created manually to control and sequence the blocks in the new design. Finally, a number of manual design iterations were required to find a solution in the design space which was able to meet both sample and cycle-level timing. The final design required 15 clock cycles per loop iteration (input sample) and the design time was approximately three days.

Figure 6.1: A simple QPSK radio block diagram in Simulink that can be used in Ogre. The color on the wires shows the location of the sample delay blocks. Note that only data path signals are connected in the diagram. Control signals need not be connected. The BYU Interface Synthesis block provides access to the Ogre system.


More information

GIS Data Conversion: Strategies, Techniques, and Management

GIS Data Conversion: Strategies, Techniques, and Management GIS Data Conversion: Strategies, Techniques, and Management Pat Hohl, Editor SUB G6ttlngen 208 494219 98 A11838 ONWORD P R E S S V Contents SECTION 1: Introduction 1 Introduction and Overview 3 Ensuring

More information

Numbering Systems. Computational Platforms. Scaling and Round-off Noise. Special Purpose. here that is dedicated architecture

Numbering Systems. Computational Platforms. Scaling and Round-off Noise. Special Purpose. here that is dedicated architecture Computational Platforms Numbering Systems Basic Building Blocks Scaling and Round-off Noise Computational Platforms Viktor Öwall viktor.owall@eit.lth.seowall@eit lth Standard Processors or Special Purpose

More information

VLSI Signal Processing

VLSI Signal Processing VLSI Signal Processing Lecture 1 Pipelining & Retiming ADSP Lecture1 - Pipelining & Retiming (cwliu@twins.ee.nctu.edu.tw) 1-1 Introduction DSP System Real time requirement Data driven synchronized by data

More information

Combinational Logic Design Combinational Functions and Circuits

Combinational Logic Design Combinational Functions and Circuits Combinational Logic Design Combinational Functions and Circuits Overview Combinational Circuits Design Procedure Generic Example Example with don t cares: BCD-to-SevenSegment converter Binary Decoders

More information

Determining Appropriate Precisions for Signals in Fixed-Point IIR Filters

Determining Appropriate Precisions for Signals in Fixed-Point IIR Filters 38.3 Determining Appropriate Precisions for Signals in Fixed-Point IIR Filters Joan Carletta Akron, OH 4435-3904 + 330 97-5993 Robert Veillette Akron, OH 4435-3904 + 330 97-5403 Frederick Krach Akron,

More information

EECS150 - Digital Design Lecture 23 - FFs revisited, FIFOs, ECCs, LSFRs. Cross-coupled NOR gates

EECS150 - Digital Design Lecture 23 - FFs revisited, FIFOs, ECCs, LSFRs. Cross-coupled NOR gates EECS150 - Digital Design Lecture 23 - FFs revisited, FIFOs, ECCs, LSFRs April 16, 2009 John Wawrzynek Spring 2009 EECS150 - Lec24-blocks Page 1 Cross-coupled NOR gates remember, If both R=0 & S=0, then

More information

Digital Systems Roberto Muscedere Images 2013 Pearson Education Inc. 1

Digital Systems Roberto Muscedere Images 2013 Pearson Education Inc. 1 Digital Systems Digital systems have such a prominent role in everyday life The digital age The technology around us is ubiquitous, that is we don t even notice it anymore Digital systems are used in:

More information

Looking at a two binary digit sum shows what we need to extend addition to multiple binary digits.

Looking at a two binary digit sum shows what we need to extend addition to multiple binary digits. A Full Adder The half-adder is extremely useful until you want to add more that one binary digit quantities. The slow way to develop a two binary digit adders would be to make a truth table and reduce

More information

And Inverter Graphs. and and nand. inverter or nor xor

And Inverter Graphs. and and nand. inverter or nor xor And Inverter Graphs and and nand inverter or nor xor And Inverter Graphs A B gate 1 u gate 4 X C w gate 3 v gate 2 gate 5 Y A u B X w Y C v And Inverter Graphs Can represent any Boolean function: v i+1

More information

Qualitative Behavior Prediction for Simple Mechanical Systems. Jonathan P. Pearce

Qualitative Behavior Prediction for Simple Mechanical Systems. Jonathan P. Pearce Qualitative Behavior Prediction for Simple Mechanical Systems by Jonathan P. Pearce Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements

More information

DO NOT COPY DO NOT COPY

DO NOT COPY DO NOT COPY Drill Problems 3 benches. Another practical book is VHDL for Programmable Logic, by Kevin Skahill of Cypress Semiconductor (Addison-esley, 1996). All of the ABEL and VHDL examples in this chapter and throughout

More information

Formal Verification of Systems-on-Chip

Formal Verification of Systems-on-Chip Formal Verification of Systems-on-Chip Wolfgang Kunz Department of Electrical & Computer Engineering University of Kaiserslautern, Germany Slide 1 Industrial Experiences Formal verification of Systems-on-Chip

More information

ECE 448 Lecture 6. Finite State Machines. State Diagrams, State Tables, Algorithmic State Machine (ASM) Charts, and VHDL Code. George Mason University

ECE 448 Lecture 6. Finite State Machines. State Diagrams, State Tables, Algorithmic State Machine (ASM) Charts, and VHDL Code. George Mason University ECE 448 Lecture 6 Finite State Machines State Diagrams, State Tables, Algorithmic State Machine (ASM) Charts, and VHDL Code George Mason University Required reading P. Chu, FPGA Prototyping by VHDL Examples

More information

Transactions on Information and Communications Technologies vol 18, 1998 WIT Press, ISSN

Transactions on Information and Communications Technologies vol 18, 1998 WIT Press,   ISSN GIS in the process of road design N.C. Babic, D. Rebolj & L. Hanzic Civil Engineering Informatics Center, University ofmaribor, Faculty of Civil Engineering, Smetanova 17, 2000 Maribor, Slovenia. E-mail:

More information

A COMBINED 16-BIT BINARY AND DUAL GALOIS FIELD MULTIPLIER. Jesus Garcia and Michael J. Schulte

A COMBINED 16-BIT BINARY AND DUAL GALOIS FIELD MULTIPLIER. Jesus Garcia and Michael J. Schulte A COMBINED 16-BIT BINARY AND DUAL GALOIS FIELD MULTIPLIER Jesus Garcia and Michael J. Schulte Lehigh University Department of Computer Science and Engineering Bethlehem, PA 15 ABSTRACT Galois field arithmetic

More information

SIMULATION-BASED APPROXIMATE GLOBAL FAULT COLLAPSING

SIMULATION-BASED APPROXIMATE GLOBAL FAULT COLLAPSING SIMULATION-BASED APPROXIMATE GLOBAL FAULT COLLAPSING Hussain Al-Asaad and Raymond Lee Computer Engineering Research Laboratory Department of Electrical & Computer Engineering University of California One

More information

Mark Redekopp, All rights reserved. Lecture 1 Slides. Intro Number Systems Logic Functions

Mark Redekopp, All rights reserved. Lecture 1 Slides. Intro Number Systems Logic Functions Lecture Slides Intro Number Systems Logic Functions EE 0 in Context EE 0 EE 20L Logic Design Fundamentals Logic Design, CAD Tools, Lab tools, Project EE 357 EE 457 Computer Architecture Using the logic

More information

FPGA IMPLEMENTATION OF BASIC ADDER CIRCUITS USING REVERSIBLE LOGIC GATES

FPGA IMPLEMENTATION OF BASIC ADDER CIRCUITS USING REVERSIBLE LOGIC GATES FPGA IMPLEMENTATION OF BASIC ADDER CIRCUITS USING REVERSIBLE LOGIC GATES B.Ravichandra 1, R. Kumar Aswamy 2 1,2 Assistant Professor, Dept of ECE, VITS College of Engineering, Visakhapatnam (India) ABSTRACT

More information

Intro To Digital Logic

Intro To Digital Logic Intro To Digital Logic 1 Announcements... Project 2.2 out But delayed till after the midterm Midterm in a week Covers up to last lecture + next week's homework & lab Nick goes "H-Bomb of Justice" About

More information

Testability. Shaahin Hessabi. Sharif University of Technology. Adapted from the presentation prepared by book authors.

Testability. Shaahin Hessabi. Sharif University of Technology. Adapted from the presentation prepared by book authors. Testability Lecture 6: Logic Simulation Shaahin Hessabi Department of Computer Engineering Sharif University of Technology Adapted from the presentation prepared by book authors Slide 1 of 27 Outline What

More information

A GUI FOR EVOLVE ZAMS

A GUI FOR EVOLVE ZAMS A GUI FOR EVOLVE ZAMS D. R. Schlegel Computer Science Department Here the early work on a new user interface for the Evolve ZAMS stellar evolution code is presented. The initial goal of this project is

More information

CprE 281: Digital Logic

CprE 281: Digital Logic CprE 28: Digital Logic Instructor: Alexander Stoytchev http://www.ece.iastate.edu/~alexs/classes/ Simple Processor CprE 28: Digital Logic Iowa State University, Ames, IA Copyright Alexander Stoytchev Digital

More information

NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1

NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1 NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1 Multi-processor vs. Multi-computer architecture µp vs. DSP RISC vs. DSP RISC Reduced-instruction-set Register-to-register operation Higher throughput by using

More information

Efficient random number generation on FPGA-s

Efficient random number generation on FPGA-s Proceedings of the 9 th International Conference on Applied Informatics Eger, Hungary, January 29 February 1, 2014. Vol. 1. pp. 313 320 doi: 10.14794/ICAI.9.2014.1.313 Efficient random number generation

More information

Introduction to Digital Signal Processing

Introduction to Digital Signal Processing Introduction to Digital Signal Processing What is DSP? DSP, or Digital Signal Processing, as the term suggests, is the processing of signals by digital means. A signal in this context can mean a number

More information

Design and Study of Enhanced Parallel FIR Filter Using Various Adders for 16 Bit Length

Design and Study of Enhanced Parallel FIR Filter Using Various Adders for 16 Bit Length International Journal of Soft Computing and Engineering (IJSCE) Design and Study of Enhanced Parallel FIR Filter Using Various Adders for 16 Bit Length D.Ashok Kumar, P.Samundiswary Abstract Now a day

More information

Introduction to the Xilinx Spartan-3E

Introduction to the Xilinx Spartan-3E Introduction to the Xilinx Spartan-3E Nash Kaminski Instructor: Dr. Jafar Saniie ECE597 Illinois Institute of Technology Acknowledgment: I acknowledge that all of the work (including figures and code)

More information

Arduino and Raspberry Pi in a Laboratory Setting

Arduino and Raspberry Pi in a Laboratory Setting Utah State University DigitalCommons@USU Physics Capstone Project Physics Student Research 5-16-217 Arduino and Raspberry Pi in a Laboratory Setting Dustin Johnston Utah State University Follow this and

More information

Contents. Chapter 3 Combinational Circuits Page 1 of 36

Contents. Chapter 3 Combinational Circuits Page 1 of 36 Chapter 3 Combinational Circuits Page of 36 Contents Combinational Circuits...2 3. Analysis of Combinational Circuits...3 3.. Using a Truth Table...3 3..2 Using a Boolean Function...6 3.2 Synthesis of

More information

ECEN 248: INTRODUCTION TO DIGITAL SYSTEMS DESIGN. Week 9 Dr. Srinivas Shakkottai Dept. of Electrical and Computer Engineering

ECEN 248: INTRODUCTION TO DIGITAL SYSTEMS DESIGN. Week 9 Dr. Srinivas Shakkottai Dept. of Electrical and Computer Engineering ECEN 248: INTRODUCTION TO DIGITAL SYSTEMS DESIGN Week 9 Dr. Srinivas Shakkottai Dept. of Electrical and Computer Engineering TIMING ANALYSIS Overview Circuits do not respond instantaneously to input changes

More information

BASIC TECHNOLOGY Pre K starts and shuts down computer, monitor, and printer E E D D P P P P P P P P P P

BASIC TECHNOLOGY Pre K starts and shuts down computer, monitor, and printer E E D D P P P P P P P P P P BASIC TECHNOLOGY Pre K 1 2 3 4 5 6 7 8 9 10 11 12 starts and shuts down computer, monitor, and printer P P P P P P practices responsible use and care of technology devices P P P P P P opens and quits an

More information

Let s now begin to formalize our analysis of sequential machines Powerful methods for designing machines for System control Pattern recognition Etc.

Let s now begin to formalize our analysis of sequential machines Powerful methods for designing machines for System control Pattern recognition Etc. Finite State Machines Introduction Let s now begin to formalize our analysis of sequential machines Powerful methods for designing machines for System control Pattern recognition Etc. Such devices form

More information

Experimental designs for multiple responses with different models

Experimental designs for multiple responses with different models Graduate Theses and Dissertations Graduate College 2015 Experimental designs for multiple responses with different models Wilmina Mary Marget Iowa State University Follow this and additional works at:

More information

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2)

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2) INF2270 Spring 2010 Philipp Häfliger Summary/Repetition (1/2) content From Scalar to Superscalar Lecture Summary and Brief Repetition Binary numbers Boolean Algebra Combinational Logic Circuits Encoder/Decoder

More information

Digital Control of Electric Drives

Digital Control of Electric Drives Digital Control of Electric Drives Logic Circuits - equential Description Form, Finite tate Machine (FM) Czech Technical University in Prague Faculty of Electrical Engineering Ver.. J. Zdenek 27 Logic

More information

EECS 579: SOC Testing

EECS 579: SOC Testing EECS 579: SOC Testing Core-Based Systems-On-A-Chip (SOCs) Cores or IP circuits are predesigned and verified functional units of three main types Soft core: synthesizable RTL Firm core: gate-level netlist

More information

EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture. Rajeevan Amirtharajah University of California, Davis

EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture. Rajeevan Amirtharajah University of California, Davis EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture Rajeevan Amirtharajah University of California, Davis Outline Announcements Review: PDP, EDP, Intersignal Correlations, Glitching, Top

More information

CS 226: Digital Logic Design

CS 226: Digital Logic Design CS 226: Digital Logic Design 0 1 1 I S 0 1 0 S Department of Computer Science and Engineering, Indian Institute of Technology Bombay. 1 of 29 Objectives In this lecture we will introduce: 1. Logic functions

More information

Analog computation derives its name from the fact that many physical systems are analogous to

Analog computation derives its name from the fact that many physical systems are analogous to Tom Woodfin 6.911 Architectures Anonymous Tom Knight 11 April 2000 Analog Computation Analog computation derives its name from the fact that many physical systems are analogous to problems we would like

More information

Multivariate Gaussian Random Number Generator Targeting Specific Resource Utilization in an FPGA

Multivariate Gaussian Random Number Generator Targeting Specific Resource Utilization in an FPGA Multivariate Gaussian Random Number Generator Targeting Specific Resource Utilization in an FPGA Chalermpol Saiprasert, Christos-Savvas Bouganis and George A. Constantinides Department of Electrical &

More information

Vectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier

Vectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier Vectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier Espen Stenersen Master of Science in Electronics Submission date: June 2008 Supervisor: Per Gunnar Kjeldsberg, IET Co-supervisor: Torstein

More information

Efficient Polynomial Evaluation Algorithm and Implementation on FPGA

Efficient Polynomial Evaluation Algorithm and Implementation on FPGA Efficient Polynomial Evaluation Algorithm and Implementation on FPGA by Simin Xu School of Computer Engineering A thesis submitted to Nanyang Technological University in partial fullfillment of the requirements

More information

Implementation Of Digital Fir Filter Using Improved Table Look Up Scheme For Residue Number System

Implementation Of Digital Fir Filter Using Improved Table Look Up Scheme For Residue Number System Implementation Of Digital Fir Filter Using Improved Table Look Up Scheme For Residue Number System G.Suresh, G.Indira Devi, P.Pavankumar Abstract The use of the improved table look up Residue Number System

More information

Examining the accuracy of the normal approximation to the poisson random variable

Examining the accuracy of the normal approximation to the poisson random variable Eastern Michigan University DigitalCommons@EMU Master's Theses and Doctoral Dissertations Master's Theses, and Doctoral Dissertations, and Graduate Capstone Projects 2009 Examining the accuracy of the

More information

Novel Devices and Circuits for Computing

Novel Devices and Circuits for Computing Novel Devices and Circuits for Computing UCSB 594BB Winter 2013 Lecture 4: Resistive switching: Logic Class Outline Material Implication logic Stochastic computing Reconfigurable logic Material Implication

More information

Industrial Rotating Kiln Simulation

Industrial Rotating Kiln Simulation Industrial Rotating Kiln Simulation This thesis is presented for the degree of Doctor of Philosophy Faculty of Science University of Technology, Sydney 1999 Submitted by Dennis Van Puyvelde, B. Chem. Eng.

More information

Navigating to Success: Finding Your Way Through the Challenges of Map Digitization

Navigating to Success: Finding Your Way Through the Challenges of Map Digitization Library Faculty Presentations Library Faculty/Staff Scholarship & Research 10-15-2011 Navigating to Success: Finding Your Way Through the Challenges of Map Digitization Cory K. Lampert University of Nevada,

More information

Optimization of new Chinese Remainder theorems using special moduli sets

Optimization of new Chinese Remainder theorems using special moduli sets Louisiana State University LSU Digital Commons LSU Master's Theses Graduate School 2010 Optimization of new Chinese Remainder theorems using special moduli sets Narendran Narayanaswamy Louisiana State

More information

Constrained Clock Shifting for Field Programmable Gate Arrays

Constrained Clock Shifting for Field Programmable Gate Arrays Constrained Clock Shifting for Field Programmable Gate Arrays Deshanand P. Singh Dept. of Electrical and Computer Engineering University of Toronto Toronto, Canada singhd@eecg.toronto.edu Stephen D. Brown

More information

An Automotive Case Study ERTSS 2016

An Automotive Case Study ERTSS 2016 Institut Mines-Telecom Virtual Yet Precise Prototyping: An Automotive Case Study Paris Sorbonne University Daniela Genius, Ludovic Apvrille daniela.genius@lip6.fr ludovic.apvrille@telecom-paristech.fr

More information

Construction of a reconfigurable dynamic logic cell

Construction of a reconfigurable dynamic logic cell PRAMANA c Indian Academy of Sciences Vol. 64, No. 3 journal of March 2005 physics pp. 433 441 Construction of a reconfigurable dynamic logic cell K MURALI 1, SUDESHNA SINHA 2 and WILLIAM L DITTO 3 1 Department

More information

Oakland County Parks and Recreation GIS Implementation Plan

Oakland County Parks and Recreation GIS Implementation Plan Oakland County Parks and Recreation GIS Implementation Plan TABLE OF CONTENTS 1.0 Introduction... 3 1.1 What is GIS? 1.2 Purpose 1.3 Background 2.0 Software... 4 2.1 ArcGIS Desktop 2.2 ArcGIS Explorer

More information

YEAR III SEMESTER - V

YEAR III SEMESTER - V YEAR III SEMESTER - V Remarks Total Marks Semester V Teaching Schedule Hours/Week College of Biomedical Engineering & Applied Sciences Microsyllabus NUMERICAL METHODS BEG 389 CO Final Examination Schedule

More information

Pipelined Viterbi Decoder Using FPGA

Pipelined Viterbi Decoder Using FPGA Research Journal of Applied Sciences, Engineering and Technology 5(4): 1362-1372, 2013 ISSN: 2040-7459; e-issn: 2040-7467 Maxwell Scientific Organization, 2013 Submitted: July 05, 2012 Accepted: August

More information

Vidyalankar S.E. Sem. III [CMPN] Digital Logic Design and Analysis Prelim Question Paper Solution

Vidyalankar S.E. Sem. III [CMPN] Digital Logic Design and Analysis Prelim Question Paper Solution . (a) (i) ( B C 5) H (A 2 B D) H S.E. Sem. III [CMPN] Digital Logic Design and Analysis Prelim Question Paper Solution ( B C 5) H (A 2 B D) H = (FFFF 698) H (ii) (2.3) 4 + (22.3) 4 2 2. 3 2. 3 2 3. 2 (2.3)

More information

The Design Procedure. Output Equation Determination - Derive output equations from the state table

The Design Procedure. Output Equation Determination - Derive output equations from the state table The Design Procedure Specification Formulation - Obtain a state diagram or state table State Assignment - Assign binary codes to the states Flip-Flop Input Equation Determination - Select flipflop types

More information

Intel Stratix 10 Thermal Modeling and Management

Intel Stratix 10 Thermal Modeling and Management Intel Stratix 10 Thermal Modeling and Management Updated for Intel Quartus Prime Design Suite: 17.1 Subscribe Send Feedback Latest document on the web: PDF HTML Contents Contents 1...3 1.1 List of Abbreviations...

More information

Pipelining and Parallel Processing

Pipelining and Parallel Processing Pipelining and Parallel Processing ( 范倫達 ), Ph. D. Department of Computer Science National Chiao Tung University Taiwan, R.O.C. Fall, 010 ldvan@cs.nctu.edu.tw http://www.cs.nctu.edu.tw/~ldvan/ Outlines

More information

CHAPTER 3 RESEARCH METHODOLOGY

CHAPTER 3 RESEARCH METHODOLOGY CHAPTER 3 RESEARCH METHODOLOGY 3.1 INTRODUCTION The research methodology plays an important role in implementing the research and validating the results. Therefore, this research methodology is derived

More information

EEE 480 LAB EXPERIMENTS. K. Tsakalis. November 25, 2002

EEE 480 LAB EXPERIMENTS. K. Tsakalis. November 25, 2002 EEE 480 LAB EXPERIMENTS K. Tsakalis November 25, 2002 1. Introduction The following set of experiments aims to supplement the EEE 480 classroom instruction by providing a more detailed and hands-on experience

More information

CMOS Ising Computer to Help Optimize Social Infrastructure Systems

CMOS Ising Computer to Help Optimize Social Infrastructure Systems FEATURED ARTICLES Taking on Future Social Issues through Open Innovation Information Science for Greater Industrial Efficiency CMOS Ising Computer to Help Optimize Social Infrastructure Systems As the

More information

Event Operators: Formalization, Algorithms, and Implementation Using Interval- Based Semantics

Event Operators: Formalization, Algorithms, and Implementation Using Interval- Based Semantics Department of Computer Science and Engineering University of Texas at Arlington Arlington, TX 76019 Event Operators: Formalization, Algorithms, and Implementation Using Interval- Based Semantics Raman

More information

ECEN 651: Microprogrammed Control of Digital Systems Department of Electrical and Computer Engineering Texas A&M University

ECEN 651: Microprogrammed Control of Digital Systems Department of Electrical and Computer Engineering Texas A&M University ECEN 651: Microprogrammed Control of Digital Systems Department of Electrical and Computer Engineering Texas A&M University Prof. Mi Lu TA: Ehsan Rohani Laboratory Exercise #4 MIPS Assembly and Simulation

More information

Design for Testability

Design for Testability Design for Testability Outline Ad Hoc Design for Testability Techniques Method of test points Multiplexing and demultiplexing of test points Time sharing of I/O for normal working and testing modes Partitioning

More information

DSP Configurations. responded with: thus the system function for this filter would be

DSP Configurations. responded with: thus the system function for this filter would be DSP Configurations In this lecture we discuss the different physical (or software) configurations that can be used to actually realize or implement DSP functions. Recall that the general form of a DSP

More information

Special Nodes for Interface

Special Nodes for Interface fi fi Special Nodes for Interface SW on processors Chip-level HW Board-level HW fi fi C code VHDL VHDL code retargetable compilation high-level synthesis SW costs HW costs partitioning (solve ILP) cluster

More information

VLSI. Faculty. Srikanth

VLSI. Faculty. Srikanth J.B. Institute of Engineering & Technology Department of CSE COURSE FILE VLSI Faculty Srikanth J.B. Institute of Engineering & Technology Department of CSE SYLLABUS Subject Name: VLSI Subject Code: VLSI

More information

FUZZY PERFORMANCE ANALYSIS OF NTT BASED CONVOLUTION USING RECONFIGURABLE DEVICE

FUZZY PERFORMANCE ANALYSIS OF NTT BASED CONVOLUTION USING RECONFIGURABLE DEVICE FUZZY PERFORMANCE ANALYSIS OF NTT BASED CONVOLUTION USING RECONFIGURABLE DEVICE 1 Dr.N.Anitha, 2 V.Lambodharan, 3 P.Arunkumar 1 Assistant Professor, 2 Assistant Professor, 3 Assistant Professor 1 Department

More information

Binary addition (1-bit) P Q Y = P + Q Comments Carry = Carry = Carry = Carry = 1 P Q

Binary addition (1-bit) P Q Y = P + Q Comments Carry = Carry = Carry = Carry = 1 P Q Digital Arithmetic In Chapter 2, we have discussed number systems such as binary, hexadecimal, decimal, and octal. We have also discussed sign representation techniques, for example, sign-bit representation

More information

Design for Testability

Design for Testability Design for Testability Outline Ad Hoc Design for Testability Techniques Method of test points Multiplexing and demultiplexing of test points Time sharing of I/O for normal working and testing modes Partitioning

More information

COMPUTER SCIENCE TRIPOS

COMPUTER SCIENCE TRIPOS CST.2016.2.1 COMPUTER SCIENCE TRIPOS Part IA Tuesday 31 May 2016 1.30 to 4.30 COMPUTER SCIENCE Paper 2 Answer one question from each of Sections A, B and C, and two questions from Section D. Submit the

More information

Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks

Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks Yufei Ma, Yu Cao, Sarma Vrudhula,

More information

2. Accelerated Computations

2. Accelerated Computations 2. Accelerated Computations 2.1. Bent Function Enumeration by a Circular Pipeline Implemented on an FPGA Stuart W. Schneider Jon T. Butler 2.1.1. Background A naive approach to encoding a plaintext message

More information

EE 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization Fall 2016

EE 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization Fall 2016 EE 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization Fall 2016 Discrete Event Simulation Stavros Tripakis University of California, Berkeley Stavros Tripakis (UC Berkeley)

More information

THERMAL MODELING OF INTEGRATED CIRCUITS IN DIGITAL SIMULATION ENVIRONMENTS

THERMAL MODELING OF INTEGRATED CIRCUITS IN DIGITAL SIMULATION ENVIRONMENTS U.P.B. Sci. Bull., Series C, Vol. 75, Iss. 4, 2013 ISSN 2286 3540 THERMAL MODELING OF INTEGRATED CIRCUITS IN DIGITAL SIMULATION ENVIRONMENTS Vlad MOLEAVIN 1, Ovidiu-George PROFIRESCU 2, Marcel PROFIRESCU

More information

Formal Verification of Systems-on-Chip Industrial Practices

Formal Verification of Systems-on-Chip Industrial Practices Formal Verification of Systems-on-Chip Industrial Practices Wolfgang Kunz Department of Electrical & Computer Engineering University of Kaiserslautern, Germany Slide 1 Industrial Experiences Formal verification

More information

FPGA Implementation of a Predictive Controller

FPGA Implementation of a Predictive Controller FPGA Implementation of a Predictive Controller SIAM Conference on Optimization 2011, Darmstadt, Germany Minisymposium on embedded optimization Juan L. Jerez, George A. Constantinides and Eric C. Kerrigan

More information

Designing and Evaluating Generic Ontologies

Designing and Evaluating Generic Ontologies Designing and Evaluating Generic Ontologies Michael Grüninger Department of Industrial Engineering University of Toronto gruninger@ie.utoronto.ca August 28, 2007 1 Introduction One of the many uses of
