Fast FPGA Placement Based on Quantum Model. Lingli Wang, School of Microelectronics, Fudan University, Shanghai, China. Email: llwang@fudan.edu.cn
Outline: Background & Motivation; VPR placement; Quantum Model; Fast FPGA Placement Algorithm; Experimental Results; Conclusions
What is FPGA? Field-Programmable Gate Array: general-purpose programmable hardware. [Figure: C/C++ and MATLAB scripts pass through compilers (Visual Studio, GCC, MATLAB/Simulink, etc.) to CPU, memory, and DSP; Verilog and VHDL pass through FPGA compilers (ISE, Quartus) to the FPGA]
FPGA Architecture [Figure: FPGA architecture; label: FPGA bit stream]
FPGA Compilation Flow: RTL input -> Logic Synthesis -> Placement -> Routing -> Timing/Power Analysis -> Bit Stream File
Motivation: How to reduce the compilation time? Focus: placement. [Figure: two bar charts of device capacity by generation. LUT4 count for Virtex-2, Virtex-4, Virtex-5, Virtex-6 (axis 0 to 2,000,000); equivalent LE count for Stratix (0.13 um), Stratix-II (90 nm), Stratix-III (65 nm), Stratix-IV (40 nm) (axis 0 to 800,000)]
VPR Placement. VPR: Versatile Place and Route. Developed at the University of Toronto. Regarded as the best placement tool in academic research.
VPR Placement Algorithm Based on Simulated Annealing
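The slide names only the technique, so the following is a minimal sketch of swap-based simulated annealing rather than VPR's actual implementation. It assumes a toy half-perimeter wirelength cost and a plain geometric cooling schedule; the names `hpwl` and `anneal` and all parameter values are hypothetical.

```python
import math
import random

def hpwl(placement, nets):
    """Toy cost: sum of half-perimeter bounding boxes over all nets."""
    total = 0
    for net in nets:
        xs = [placement[b][0] for b in net]
        ys = [placement[b][1] for b in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def anneal(placement, nets, t_start=10.0, t_end=0.01, alpha=0.9,
           moves_per_t=200, seed=0):
    """Swap-based simulated annealing over a dict block -> (x, y)."""
    rng = random.Random(seed)
    place = dict(placement)
    blocks = list(place)
    cost = hpwl(place, nets)
    best, best_cost = dict(place), cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            a, b = rng.sample(blocks, 2)
            place[a], place[b] = place[b], place[a]      # propose a swap
            new_cost = hpwl(place, nets)
            delta = new_cost - cost
            # Always accept improvements; accept worse moves with prob e^(-delta/t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = dict(place), cost
            else:
                place[a], place[b] = place[b], place[a]  # undo the swap
        t *= alpha                                       # geometric cooling
    return best, best_cost
```

The number of proposed moves grows with the temperature steps and the moves per step, which is why placement time becomes a concern as device capacity grows.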
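Qubit: Quantum Bit. $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, with $|\alpha|^2 + |\beta|^2 = 1$. On the Bloch sphere: $|\psi\rangle = \cos\frac{\theta}{2}|0\rangle + e^{i\varphi}\sin\frac{\theta}{2}|1\rangle$. After the measurement: 0 or 1. n qubits can represent $2^n$ states simultaneously (linear combinations): a superposition state. [Figure: Bloch sphere with axes x, y, z and angles $\theta$, $\varphi$]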
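As a small illustration of the qubit definition, the sketch below converts Bloch-sphere angles into the two amplitudes and checks that the measurement probabilities sum to 1. The function names are hypothetical and chosen only for this example.

```python
import cmath
import math

def bloch_to_amplitudes(theta, phi):
    """Return (alpha, beta) for cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    alpha = complex(math.cos(theta / 2), 0.0)
    beta = cmath.exp(1j * phi) * math.sin(theta / 2)
    return alpha, beta

def probabilities(alpha, beta):
    """Measurement probabilities (P(0), P(1)) from the amplitudes."""
    return abs(alpha) ** 2, abs(beta) ** 2

# A state on the equator of the sphere is an equal superposition.
alpha, beta = bloch_to_amplitudes(math.pi / 2, 0.0)
p0, p1 = probabilities(alpha, beta)
```

With `theta = pi / 2` both outcomes are equally likely; with `theta = 0` the qubit always measures 0.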
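Quantum gates. NOT: $X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $|\psi'\rangle = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} \alpha \\ \beta \end{pmatrix}$. Rotation: $G(\theta_i) = \begin{pmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{pmatrix}$, $\begin{pmatrix} \alpha_i' \\ \beta_i' \end{pmatrix} = G(\theta_i)\begin{pmatrix} \alpha_i \\ \beta_i \end{pmatrix}$. CNOT: $|a\rangle|b\rangle \to |a\rangle|a \oplus b\rangle$, $U_{CN} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$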
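The three gates can be tried out as plain matrices acting on amplitude vectors. This is a sketch only: the minus sign in the rotation matrix follows the standard rotation convention and is an assumption here, and `matvec` and `rotation` are hypothetical helper names.

```python
import math

def matvec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

# NOT gate: swaps the amplitudes of |0> and |1>.
X = [[0, 1],
     [1, 0]]

def rotation(delta):
    """Rotation gate: maps (cos t, sin t) to (cos(t + delta), sin(t + delta))."""
    return [[math.cos(delta), -math.sin(delta)],
            [math.sin(delta),  math.cos(delta)]]

# CNOT on the basis |00>, |01>, |10>, |11>: flips the second bit when the first is 1.
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

flipped = matvec(X, [1, 0])            # |0> becomes |1>
target = matvec(CNOT, [0, 0, 1, 0])    # |10> becomes |11>
```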
Quantum model for FPGA placement: quantum encoding. For an FPGA device with $N_{CLB}$ logic blocks and $N_{IO}$ IO blocks: $0 < N_{CLB} + N_{IO} \le 2^n$. The number of qubits n: $n = \lceil \log_2 (N_{CLB} + N_{IO}) \rceil$. For example: with 400 CLB blocks and 80 IOs, n is 9, i.e. $2^9 = 512 > 480$. 0-399: CLB locations; 400-479: IO locations.
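The encoding arithmetic can be checked with a short sketch. The function names are hypothetical, and returning `None` for indices beyond the last IO location is an assumption of this example; the slide does not say how such indices are handled.

```python
def num_qubits(n_clb, n_io):
    """Smallest n with 2**n >= n_clb + n_io, i.e. ceil(log2(n_clb + n_io))."""
    total = n_clb + n_io
    if total <= 0:
        raise ValueError("device must have at least one block")
    return (total - 1).bit_length()   # exact integer form of ceil(log2(total))

def decode_location(index, n_clb, n_io):
    """Map a measured index to ('CLB', k) or ('IO', k); None if out of range."""
    if 0 <= index < n_clb:
        return ("CLB", index)
    if n_clb <= index < n_clb + n_io:
        return ("IO", index - n_clb)
    return None                        # assumption: unused codes are rejected

n = num_qubits(400, 80)                # the slide's example device
```

For the slide's device, 32 of the 512 codes (480 to 511) correspond to no location.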
FPGA Placement Flow
t: iteration number
Q(t): quantum representation of placement
P(t): placement location after measurement
G: rotation gate operation

t = 0;
Initialize Q(t);    ($|\psi_i\rangle = \frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle$ for all qubits)
Measure Q(t) to obtain P(t);
Calculate the placement cost of P(t);    (VPR function)
Save P(t) as the best solution;
while (!stop_condition)
begin
    t = t + 1;
    Measure Q(t-1) to obtain P(t);
    Calculate the placement cost of P(t);    (VPR function)
    Update Q(t-1) to Q(t) using G;
    Save the best solution;
end
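The loop above can be sketched on a plain bit string. This is an illustration under stated assumptions, not the author's implementation: each qubit is stored as an angle with real amplitudes (cos, sin), the update rotates every qubit by a fixed step toward the best solution found so far, and the cost function is supplied by the caller in place of the VPR cost function. The name `run_flow` and all parameter values are hypothetical.

```python
import math
import random

def run_flow(n_bits, cost, iterations=200, step=0.05 * math.pi, seed=0):
    """Skeleton of the slide's loop over a register of n_bits qubits.

    Qubit i has amplitudes (cos q[i], sin q[i]), so P(bit = 1) = sin(q[i])**2.
    """
    rng = random.Random(seed)

    def measure(q):
        return [1 if rng.random() < math.sin(th) ** 2 else 0 for th in q]

    q = [math.pi / 4] * n_bits           # Initialize Q(0): equal superposition
    best = measure(q)                    # Measure Q(0) to obtain P(0)
    best_cost = cost(best)               # Cost of P(0), saved as the best solution
    for _ in range(iterations):          # Stand-in for the stop condition
        p = measure(q)                   # Measure Q(t-1) to obtain P(t)
        c = cost(p)                      # Placement cost of P(t)
        # Update Q(t-1) to Q(t): rotate toward the stored best bit, clamped
        # so each angle stays in [0, pi/2].
        q = [min(max(th + (step if b else -step), 0.0), math.pi / 2)
             for th, b in zip(q, best)]
        if c < best_cost:                # Save the best solution
            best, best_cost = p, c
    return best, best_cost
```

A usage example with a toy cost, the number of bits that differ from a fixed target: `run_flow(8, lambda bits: sum(b != t for b, t in zip(bits, [1, 0, 1, 1, 0, 0, 1, 0])))`.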
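How to measure a qubit? Qubit: $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$. Generate a random number r: if $r < |\alpha|^2$ the result is 0; if $r \ge |\alpha|^2$ the result is 1. [Example: the measured bits for qubit indices 1 to 9 are decoded into a real location]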
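A minimal sketch of the measurement step, assuming r is drawn uniformly from [0, 1) and that the measured bits are read most-significant bit first when decoded into a location index. The bit order and the function names are assumptions of this example.

```python
import random

def measure_qubit(alpha, beta, rng=random):
    """Collapse one qubit: 0 with probability |alpha|^2, otherwise 1."""
    r = rng.random()                  # uniform in [0, 1)
    return 0 if r < abs(alpha) ** 2 else 1

def bits_to_index(bits):
    """Decode measured bits into an integer (assumed MSB first)."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return index

# Nine measured bits 1 0 0 0 0 0 0 0 0 decode to 256 under the MSB-first assumption.
location = bits_to_index([1, 0, 0, 0, 0, 0, 0, 0, 0])
```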
How to avoid location conflicts? [Figure: CLB blocks and IO blocks]
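Rotation Gate Operation: $G(\theta_i) = \begin{pmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{pmatrix}$, $\begin{pmatrix} \alpha_i' \\ \beta_i' \end{pmatrix} = \begin{pmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{pmatrix}\begin{pmatrix} \alpha_i \\ \beta_i \end{pmatrix}$, where the rotation angle $\theta_i$ is set through the function $s(\alpha_i, \beta_i)$.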
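The exact form of the angle rule is not recoverable from the slide, so the sketch below substitutes a common choice: a fixed-magnitude step whose sign depends on the quadrant of the amplitudes and on the matching bit of the best solution so far. The step size, the sign rule, and the names `rotate` and `update_qubit` are all assumptions of this example.

```python
import math

def rotate(alpha, beta, delta):
    """Apply the rotation gate with angle delta to real amplitudes."""
    return (math.cos(delta) * alpha - math.sin(delta) * beta,
            math.sin(delta) * alpha + math.cos(delta) * beta)

def update_qubit(alpha, beta, best_bit, step=0.01 * math.pi):
    """Rotate toward |1> if best_bit is 1, toward |0> otherwise."""
    # When alpha * beta >= 0 a positive angle increases |beta|;
    # in the other two quadrants the sign has to flip.
    sign = 1.0 if alpha * beta >= 0 else -1.0
    direction = sign if best_bit == 1 else -sign
    return rotate(alpha, beta, direction * step)

h = 2 ** -0.5
toward_one = update_qubit(h, h, best_bit=1)    # P(1) rises above 0.5
toward_zero = update_qubit(h, h, best_bit=0)   # P(1) falls below 0.5
```

Because the gate is a rotation, the norm $|\alpha_i|^2 + |\beta_i|^2 = 1$ is preserved by every update.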
Quantum Model Efficiency
Quantum + VPR Flow: RTL input -> Logic Synthesis -> Placement -> Routing -> Timing/Power analysis -> Bit Stream. Within Placement: Quantum model (Global Placement) -> VPR placement (Local Placement).
Experimental Results: before routing
Average over 5 runs. Iteration number for Quantum: 1000. One parameter (symbol missing in the source) = 0.01.

Benchmark   Quantum+VPR               VPR                       Comparison
            Cost          Time        Cost          Time        Placement    Speed-up
            (x10^-8 s)    (ms)        (x10^-8 s)    (ms)        cost ratio   ratio
alu4         9.97          6493.4     10.22          16202.4    0.98         2.50x
apex2       14.85          7959       14.8           23398.8    1.00         2.94x
apex4       23.01          5556       21.26          15511.8    1.08         2.79x
clma        68.52         43105.2     63.66         180767.6    1.08         4.19x
diffeq       1.47          6589.8      1.43          22357      1.03         3.39x
elliptic     9.66         23587.4      9.6           68911.8    1.01         2.92x
ex5p        13.7           5746.4     14.2           16249.2    0.96         2.83x
ex1010      88.12         26430.4     96.95         106251      0.91         4.02x
frisc        5.64         19583.4      5.48          71151.2    1.03         3.63x
pdc         49.17         23064.6     47.82          89782.2    1.03         3.89x
s298         8.65          5349.6      8.44          13452.4    1.02         2.51x
s38417      12.49         33292.6     12.15         114763      1.03         3.45x
s38584.1     3.13         45242.6      2.97         137506.6    1.05         3.04x
seq         11.15          8080.8     12.71          24752.6    0.88         3.06x
spla        43            17699.4     42.4           62108      1.01         3.51x
Average:                                                        1.01         3.25x
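The two comparison columns appear to follow directly from the raw columns: the placement cost ratio is the Quantum+VPR cost divided by the VPR cost, and the speed-up is the VPR time divided by the Quantum+VPR time. A short check on three rows copied from the table (the helper name `compare` is hypothetical):

```python
# (quantum_cost, quantum_time_ms, vpr_cost, vpr_time_ms), copied from the table
rows = {
    "alu4": (9.97, 6493.4, 10.22, 16202.4),
    "clma": (68.52, 43105.2, 63.66, 180767.6),
    "seq":  (11.15, 8080.8, 12.71, 24752.6),
}

def compare(q_cost, q_time, v_cost, v_time):
    """Return (placement cost ratio, speed-up ratio), rounded to 2 decimals."""
    return round(q_cost / v_cost, 2), round(v_time / q_time, 2)

results = {name: compare(*values) for name, values in rows.items()}
```

A cost ratio below 1 means the combined flow found a lower-cost placement than VPR alone, as it does for alu4 and seq.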
Experimental Results: after routing

Benchmark   Critical-path delay         Critical-path
            Quantum+VPR   VPR           delay ratio
            (x10^-9 s)    (x10^-9 s)
alu4        19.3          19.46         0.99
apex2       22.17         21.5          1.03
apex4       20.1          21.15         0.95
clma        34.71         34.43         1.01
diffeq      25.59         25.37         1.01
elliptic    31.74         32.01         0.99
ex5p        20.6          20.48         1.01
ex1010      30.24         29.6          1.02
frisc       49.27         48.79         1.01
pdc         28.93         28.25         1.02
s298        38.72         38.61         1.00
s38417      28.1          27.77         1.01
s38584.1    23.82         24.04         0.99
seq         19.26         19.14         1.01
spla        26.58         25.75         1.03
Average:                                1.01
Conclusions: A quantum model is proposed for FPGA placement. Placement runs more than 3 times faster when the quantum model is adopted (3.25x on average over the benchmarks). About 1% of timing performance is sacrificed (average critical-path delay ratio of 1.01). Much future work remains; this is only a starting point.
Acknowledgement: This research is sponsored by the Shanghai Municipal Pu-Jiang Research Foundation, 2008, and the National Science Foundation of China, 2007-2009.
Thank you! Q & A
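Unit: $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
Hadamard: $H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$
Pauli-X: $X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$
Rotation-X: $R_x(\theta) = \begin{pmatrix} \cos\frac{\theta}{2} & -i\sin\frac{\theta}{2} \\ -i\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}$
Pauli-Y: $Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$
Rotation-Y: $R_y(\theta) = \begin{pmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}$
Pauli-Z: $Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
Rotation-Z: $R_z(\theta) = \begin{pmatrix} e^{-i\theta/2} & 0 \\ 0 & e^{i\theta/2} \end{pmatrix}$
Phase: $S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$
$\pi/8$ gate: $T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}$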