Fast FPGA Placement Based on Quantum Model


Fast FPGA Placement Based on Quantum Model
Lingli Wang
School of Microelectronics, Fudan University, Shanghai, China
Email: llwang@fudan.edu.cn

Outline
  Background & Motivation
  VPR placement
  Quantum Model
  Fast FPGA Placement Algorithm
  Experimental Results
  Conclusions

What is an FPGA?
Field-Programmable Gate Array: general-purpose programmable hardware.
Software: C/C++, MATLAB scripts. Compilers: Visual Studio, GCC, MATLAB/Simulink, etc. Targets: CPU, memory, DSP.
Hardware: Verilog, VHDL. FPGA compilers: ISE, Quartus. Target: FPGA.

FPGA Architecture
[Figure: FPGA architecture and the FPGA bit stream]

FPGA Compilation Flow
RTL input -> Logic Synthesis -> Placement -> Routing -> Timing/Power Analysis -> Bit Stream File

Motivation
How to reduce the compilation time? (Placement)
[Figure: two bar charts of device capacity by generation. LUT4 count (axis 0 to 2,000,000) for Virtex-2, Virtex-4, Virtex-5 and Virtex-6; equivalent LE count (axis 0 to 800,000) for Stratix (0.13 um), Stratix-II (90 nm), Stratix-III (65 nm) and Stratix-IV (40 nm).]


VPR Placement
VPR: Versatile Place and Route
Developed by the University of Toronto
Regarded as the best placement tool in academic research

VPR Placement Algorithm Based on Simulated Annealing
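To make the simulated-annealing idea concrete, here is a minimal Python sketch of annealing-based placement. It is not VPR's code: the function names `wirelength` and `anneal`, the two-pin Manhattan cost, and the cooling parameters are all assumptions chosen for illustration.

```python
import math
import random


def wirelength(pos, nets):
    """Toy cost: total Manhattan length of two-pin nets (not VPR's cost)."""
    return sum(abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1])
               for a, b in nets)


def anneal(pos, nets, t_start=5.0, t_end=0.05, cooling=0.9, moves=20, seed=0):
    """Swap-based simulated annealing over a block -> (x, y) mapping."""
    rng = random.Random(seed)
    pos = dict(pos)                      # work on a copy
    blocks = list(pos)
    cost = wirelength(pos, nets)
    best_pos, best_cost = dict(pos), cost
    t = t_start
    while t > t_end:
        for _ in range(moves):
            a, b = rng.sample(blocks, 2)
            pos[a], pos[b] = pos[b], pos[a]          # propose a swap
            delta = wirelength(pos, nets) - cost
            # Metropolis rule: always accept improvements, sometimes accept
            # a worse placement so the search can escape local minima.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost += delta
                if cost < best_cost:
                    best_pos, best_cost = dict(pos), cost
            else:
                pos[a], pos[b] = pos[b], pos[a]      # reject: undo the swap
        t *= cooling                                  # cool down
    return best_pos, best_cost
```

The many move evaluations per temperature step are what make annealing slow on large devices, which is the cost the quantum model in this talk tries to reduce.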


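Qubit: Quantum Bit
$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \qquad |\alpha|^2 + |\beta|^2 = 1$$
$$|\psi\rangle = \cos\frac{\theta}{2}\,|0\rangle + e^{i\varphi}\sin\frac{\theta}{2}\,|1\rangle$$
After the measurement: 0 or 1.
n qubits can represent $2^n$ states simultaneously (as linear combinations): a superposition state.
[Figure: Bloch sphere with axes x, y, z and angles $\theta$, $\varphi$; $|0\rangle$ and $|1\rangle$ at the poles]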

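Quantum gates
NOT:
$$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \begin{pmatrix} \alpha' \\ \beta' \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix}$$
Rotation:
$$G(\Delta\theta_i) = \begin{pmatrix} \cos\Delta\theta_i & -\sin\Delta\theta_i \\ \sin\Delta\theta_i & \cos\Delta\theta_i \end{pmatrix}, \qquad \begin{pmatrix} \alpha_i' \\ \beta_i' \end{pmatrix} = \begin{pmatrix} \cos\Delta\theta_i & -\sin\Delta\theta_i \\ \sin\Delta\theta_i & \cos\Delta\theta_i \end{pmatrix} \begin{pmatrix} \alpha_i \\ \beta_i \end{pmatrix}$$
CNOT: $|a\rangle|b\rangle \rightarrow |a\rangle|a \oplus b\rangle$
$$U_{CN} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$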
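As a quick check of the NOT and CNOT matrices, the sketch below applies them to amplitude vectors in plain Python. The helper `apply` is a hypothetical name, and the two-qubit basis order |00>, |01>, |10>, |11> (with a as the first, controlling qubit) is an assumption.

```python
def apply(matrix, state):
    """Multiply a gate matrix by a state vector of amplitudes."""
    return [sum(m * s for m, s in zip(row, state)) for row in matrix]


# NOT gate: swaps the amplitudes of |0> and |1>.
X = [[0, 1],
     [1, 0]]

# CNOT gate: flips the second qubit b when the first qubit a is 1.
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

flipped = apply(X, [1, 0])              # |0> becomes |1>
controlled = apply(CNOT, [0, 0, 1, 0])  # |10> becomes |11>
```

Both gates only permute amplitudes, so the normalization of the state is preserved.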

Quantum model for FPGA placement
Quantum encoding
For an FPGA device with $N_{CLB}$ logic blocks and $N_{IO}$ IO blocks:
$$0 < N_{CLB} + N_{IO} \le 2^n$$
The number of qubits n:
$$n = \mathrm{ceil}\{\log_2 [N_{CLB} + N_{IO}]\}$$
For example, with 400 CLB blocks and 80 IOs, n is 9, i.e. $2^9 = 512 > 480$.
  0-399: CLB locations
  400-479: IO locations
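The encoding arithmetic can be checked with a short Python sketch. The function names are hypothetical, and labeling indices 480 to 511 as "unused" is an assumption, since the slide does not say how they are handled.

```python
def qubits_needed(n_clb, n_io):
    """Smallest n with 2**n >= n_clb + n_io, i.e. ceil(log2(n_clb + n_io))."""
    total = n_clb + n_io
    if total <= 0:
        raise ValueError("the device needs at least one block location")
    # (total - 1).bit_length() equals ceil(log2(total)) for total >= 1
    # and avoids floating-point rounding.
    return (total - 1).bit_length()


def location_kind(index, n_clb, n_io):
    """Classify a measured index: CLB locations first, then IO locations."""
    if 0 <= index < n_clb:
        return "CLB"
    if n_clb <= index < n_clb + n_io:
        return "IO"
    return "unused"   # assumption: indices past the last IO map to nothing


n = qubits_needed(400, 80)   # the slide's example: 480 locations
```

For the slide's example this gives n = 9, with indices 0 to 399 classified as CLB and 400 to 479 as IO.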


FPGA Placement Flow
  t    : iteration number
  Q(t) : quantum representation of placement
  P(t) : placement location after measurement
  G    : rotation gate operation

  t <- 0;
  Initialize Q(t);                          (|psi_i> = (1/sqrt(2))|0> + (1/sqrt(2))|1> for all qubits)
  Measure Q(t) to obtain P(t);
  Calculate the placement cost of P(t);     (VPR function)
  Save P(t) as the best solution;
  while (!stop_condition)
  begin
      t <- t + 1;
      Measure Q(t-1) to obtain P(t);
      Calculate the placement cost of P(t); (VPR function)
      Update Q(t-1) to Q(t) using G;
      Save the best solution;
  end
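The loop above can be sketched in Python as follows. This is a toy under stated assumptions, not the author's implementation: the cost function is supplied by the caller (the slide uses a VPR function), each qubit is rotated toward the corresponding bit of the best solution found so far, and the clamp `eps` that keeps every qubit slightly uncertain is an added assumption. The names `quantum_search`, `dtheta` and `eps` are hypothetical.

```python
import math
import random


def quantum_search(cost, n_qubits, iterations=1000,
                   dtheta=0.01 * math.pi, eps=0.05 * math.pi, seed=0):
    """Measure, evaluate, keep the best, rotate; returns (bits, cost)."""
    rng = random.Random(seed)
    # Qubit i is stored as an angle phi_i with alpha_i = cos(phi_i) and
    # beta_i = sin(phi_i); pi/4 gives alpha_i = beta_i = 1/sqrt(2).
    phi = [math.pi / 4] * n_qubits

    def measure():
        # Bit i collapses to 0 with probability alpha_i^2, otherwise to 1.
        return [0 if rng.random() < math.cos(p) ** 2 else 1 for p in phi]

    best = measure()
    best_cost = cost(best)
    for _ in range(iterations):
        candidate = measure()
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
        # Applying G(d) to (cos phi, sin phi) yields (cos(phi + d),
        # sin(phi + d)), so the rotation is a shift of the stored angle.
        # Assumption: rotate toward the bit held by the best solution.
        phi = [min(max(p + (dtheta if bit else -dtheta), eps),
                   math.pi / 2 - eps)
               for p, bit in zip(phi, best)]
    return best, best_cost
```

Here the best solution is saved before the rotation so that the rotation can use it; the slide lists the two steps in the opposite order.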

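How to measure a qubit?
Qubit: $\alpha|0\rangle + \beta|1\rangle$
Generate a random number r:
$$r < |\alpha|^2 \Rightarrow |0\rangle, \qquad r \ge |\alpha|^2 \Rightarrow |1\rangle$$
[Figure: example in which qubits 1 to 9 are measured to obtain a binary index, which is then mapped to a real location]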
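A minimal Python sketch of this measurement step follows. It assumes r is drawn uniformly from [0, 1) and that the measured bits form the location index most significant bit first; neither detail is stated on the slide, and the function names are hypothetical.

```python
import random


def measure_qubit(alpha, beta, rng):
    """Collapse one qubit: 0 with probability |alpha|^2, otherwise 1."""
    # beta is implied by |alpha|^2 + |beta|^2 = 1; kept for readability.
    return 0 if rng.random() < abs(alpha) ** 2 else 1


def measure_register(qubits, rng):
    """Measure every qubit and pack the bits into an integer index."""
    bits = [measure_qubit(a, b, rng) for a, b in qubits]
    index = 0
    for bit in bits:
        index = (index << 1) | bit   # assumption: first qubit is the MSB
    return bits, index


rng = random.Random(0)
# Basis states always measure to the same bit, so this index is fixed.
bits, index = measure_register([(0.0, 1.0), (1.0, 0.0), (0.0, 1.0)], rng)
```

With the three basis-state qubits above, the bits are 1, 0, 1 and the index is 5; qubits in superposition give a random index on each call.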

How to avoid location conflicts?
[Figure with legend: CLB block, IO block]

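Rotation Gate Operation
$$G(\Delta\theta_i) = \begin{pmatrix} \cos\Delta\theta_i & -\sin\Delta\theta_i \\ \sin\Delta\theta_i & \cos\Delta\theta_i \end{pmatrix}$$
$$\begin{pmatrix} \alpha_i' \\ \beta_i' \end{pmatrix} = \begin{pmatrix} \cos\Delta\theta_i & -\sin\Delta\theta_i \\ \sin\Delta\theta_i & \cos\Delta\theta_i \end{pmatrix} \begin{pmatrix} \alpha_i \\ \beta_i \end{pmatrix}$$
where the rotation angle $\Delta\theta_i$ is set by the function $s(\alpha_i, \beta_i)$.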
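The effect of the rotation gate on the measurement probabilities can be seen in a short Python sketch. It assumes real amplitudes and a fixed step of 0.01*pi per application; the step size and the function names are illustrative assumptions, not values confirmed by the slides.

```python
import math


def rotate(alpha, beta, dtheta):
    """Apply G(dtheta) to the amplitude pair (alpha, beta)."""
    c, s = math.cos(dtheta), math.sin(dtheta)
    return c * alpha - s * beta, s * alpha + c * beta


def rotate_n(alpha, beta, dtheta, steps):
    """Apply the same rotation repeatedly."""
    for _ in range(steps):
        alpha, beta = rotate(alpha, beta, dtheta)
    return alpha, beta


# Start from the uniform state and rotate ten times toward |1>.
start = 1 / math.sqrt(2)
alpha, beta = rotate_n(start, start, 0.01 * math.pi, 10)
prob_one = beta ** 2   # probability of measuring 1 after the rotations
```

A positive angle raises the probability of measuring 1 and a negative angle raises the probability of measuring 0, while $\alpha^2 + \beta^2$ stays equal to 1.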

Quantum Model Efficiency
[Figure]

Quantum + VPR Flow
RTL input -> Logic Synthesis -> Placement -> Routing -> Timing/Power analysis -> Bit Stream
Placement is split into two stages:
  Global Placement: Quantum model
  Local Placement: VPR placement


Experimental Results: before routing
=0.01; average over 5 runs; iteration number for Quantum: 1000

            Quantum+VPR               VPR                       Comparison
Benchmark   Cost         Time         Cost         Time         Placement   Speed-up
            (x10^-8 s)   (ms)         (x10^-8 s)   (ms)         Cost        ratio
alu4         9.97         6493.4      10.22         16202.4     0.98        2.50 X
apex2       14.85         7959        14.8          23398.8     1.00        2.94 X
apex4       23.01         5556        21.26         15511.8     1.08        2.79 X
clma        68.52        43105.2      63.66        180767.6     1.08        4.19 X
diffeq       1.47         6589.8       1.43         22357       1.03        3.39 X
elliptic     9.66        23587.4       9.6          68911.8     1.01        2.92 X
ex5p        13.7          5746.4      14.2          16249.2     0.96        2.83 X
ex1010      88.12        26430.4      96.95        106251       0.91        4.02 X
frisc        5.64        19583.4       5.48         71151.2     1.03        3.63 X
pdc         49.17        23064.6      47.82         89782.2     1.03        3.89 X
s298         8.65         5349.6       8.44         13452.4     1.02        2.51 X
s38417      12.49        33292.6      12.15        114763       1.03        3.45 X
s38584.1     3.13        45242.6       2.97        137506.6     1.05        3.04 X
seq         11.15         8080.8      12.71         24752.6     0.88        3.06 X
spla        43           17699.4      42.4          62108       1.01        3.51 X
Average:                                                        1.01        3.25 X
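The two comparison columns follow directly from the measured values: the placement cost ratio is the Quantum+VPR cost divided by the VPR cost, and the speed-up is the VPR time divided by the Quantum+VPR time. The Python sketch below recomputes them for three rows copied from the table; the names `rows` and `ratios` are hypothetical.

```python
# (Quantum+VPR cost, Quantum+VPR time in ms, VPR cost, VPR time in ms)
rows = {
    "alu4": (9.97, 6493.4, 10.22, 16202.4),
    "clma": (68.52, 43105.2, 63.66, 180767.6),
    "seq": (11.15, 8080.8, 12.71, 24752.6),
}


def ratios(q_cost, q_time, v_cost, v_time):
    """Return (placement cost ratio, speed-up ratio) for one benchmark."""
    return q_cost / v_cost, v_time / q_time


for name, values in rows.items():
    cost_ratio, speed_up = ratios(*values)
    print(f"{name}: cost ratio {cost_ratio:.2f}, speed-up {speed_up:.2f} X")
```

A cost ratio below 1.00 (as for alu4 and seq) means the combined flow found a cheaper placement than VPR alone.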

Experimental Results: after routing

Benchmark   Quantum+VPR   VPR          Critical-Path
            (x10^-9 s)    (x10^-9 s)   Delay
alu4        19.3          19.46        0.99
apex2       22.17         21.5         1.03
apex4       20.1          21.15        0.95
clma        34.71         34.43        1.01
diffeq      25.59         25.37        1.01
elliptic    31.74         32.01        0.99
ex5p        20.6          20.48        1.01
ex1010      30.24         29.6         1.02
frisc       49.27         48.79        1.01
pdc         28.93         28.25        1.02
s298        38.72         38.61        1.00
s38417      28.1          27.77        1.01
s38584.1    23.82         24.04        0.99
seq         19.26         19.14        1.01
spla        26.58         25.75        1.03
Average:                               1.01


Conclusions
A quantum model is proposed for FPGA placement.
FPGA placement can be more than 3 times faster if the quantum model is adopted.
About 1% of the timing performance is sacrificed.
Much future work remains; this is only a starting point.

Acknowledgement
This research is sponsored by:
  Shanghai Municipal Pu-Jiang Research Foundation, 2008
  National Science Foundation of China, 2007-2009

Thank you! Q & A

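Unit: $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
Hadamard: $H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$
Pauli-X: $X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$
Rotation-X: $R_x(\theta) = \begin{pmatrix} \cos\frac{\theta}{2} & -i\sin\frac{\theta}{2} \\ -i\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}$
Pauli-Y: $Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$
Rotation-Y: $R_y(\theta) = \begin{pmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}$
Pauli-Z: $Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
Rotation-Z: $R_z(\theta) = \begin{pmatrix} e^{-i\theta/2} & 0 \\ 0 & e^{i\theta/2} \end{pmatrix}$
Phase: $S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$
$\pi/8$ gate: $T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}$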