Numbering Systems. Computational Platforms. Scaling and Round-off Noise.


Outline
- Computational Platforms
- Numbering Systems
- Basic Building Blocks
- Scaling and Round-off Noise

Computational Platforms
Viktor Öwall, viktor.owall@eit.lth.se

Standard Processors or Special Purpose
"Special purpose" here means a dedicated architecture.
[Figure: an algorithm mapped onto either a standard processor or a special-purpose architecture]

Special purpose
An architecture is developed to fulfill special requirements. It can be hardware mapped or time-multiplexed.

Standard processor (μprocessor or DSP)
- Programmable/flexible
- Short design time/TTM
- Low price?

Dedicated architecture
- High calculation capacity
- Low power consumption
- Low price at volume. What is volume?

The architecture can then be implemented on either an FPGA or an ASIC.

Fixed-point DSP: Motorola DSP56x
[Figure: DSP56x datapath with X and Y operand registers, multiplier, ALU, shifter, accumulators A (56) and B (56), and a shifter/limiter. Annotations: guard bits, scaling.]
- Standard DSPs are MAC (Multiply-Accumulate) based and usually have a single-cycle multiplier, which may be pipelined.
- The multiplier delivers a double-wordlength output. The alternative is a multiplier with a reduced-wordlength output.

Architectural options
OTS (Off The Shelf) processors
- Programmable microprocessors or DSPs
- Based on generic computational units, for DSPs usually a MAC
- Prefabricated or IP cores
Time-multiplexed application-specific processors
- Several algorithmic operations performed on the same hardware unit
- Trades reduced hardware for longer computation time
Hardware-mapped architectures
- One (or more) hardware unit per algorithmic operation
- High hardware cost and high throughput

Hardware Implementation Techniques
Hardware solution: design for FPGA or ASIC
- Hardware description language, e.g. VHDL or Verilog
- Simulation
- Synthesis
- P&R
- Post-layout simulation

FPGA (Field Programmable Gate Array)
- Already fabricated silicon
- Configuration
- Reconfigurable
- Fast turn-around
- Prototyping

Full custom / cell library
- Fabrication necessary
- High calculation capacity
- High utilization
- Low power
- Low price at volume

FPGA
A preprocessed array that can be programmed, e.g. from VHDL/Verilog.

Heterogeneous Programmable Platforms
[Figure: Xilinx Virtex-II Pro, courtesy Xilinx. Labeled parts: FPGA fabric, CLBs, routing, block RAM and embedded memories, hardwired multipliers, embedded PowerPC, IOBs arranged in banks, high-speed I/O, timing.]
We will discuss this again at the end of the course.

Number Representation

Floating vs. Fixed Point
In floating point a value is represented by
- a mantissa, determining the resolution/precision
- an exponent, determining the dynamic range
In fixed point we only have a single value.
Floating point gives higher dynamic range, but the cost is high in
- energy
- area
- calculation time
For energy-efficient implementations fixed point is preferred.

Binary numbers, unsigned integers
[Figure: an N-bit word from MSB (Most Significant Bit) to LSB (Least Significant Bit), with a list of binary words and their integer values. N bits give $2^N$ words.]

Dynamic Range and Resolution
[Table: number of bits, number of levels, resolution $V_{LSB}$ for a given full-scale voltage $V_{fs}$, and dynamic range. Most entries are not recoverable; the surviving ones include 256 levels for 8 bits, 4096 levels, and 65 536 levels with a resolution of 7.6 μV.]
How do we use the bits? Depends on the application!

Unsigned Number Representation
Fixed radix (base) systems. The digits $a_i \in \{0, 1, 2, \ldots, r-1\}$ in a radix $r$ system:

$$\sum_{i=-l}^{k} a_i r^i = a_k r^k + a_{k-1} r^{k-1} + \cdots + a_1 r + a_0 + a_{-1} r^{-1} + \cdots + a_{-l} r^{-l}$$

described in a fixed-point positional number system:

$$a_k a_{k-1} \ldots a_1 a_0 \,.\, a_{-1} \ldots a_{-l}$$

where $a_{-1} \ldots a_{-l}$ is the fractional part.
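As a rough illustration of the dynamic range and resolution slide, the sketch below computes the number of levels, the step size and the dynamic range for a few wordlengths. The function names are hypothetical, and the full-scale voltage of 0.5 V is an assumption, chosen only because it reproduces the 7.6 μV resolution that survives for 65 536 levels.

```python
import math

def resolution(v_fs: float, n_bits: int) -> float:
    """Step size V_LSB when the full-scale range v_fs is split into 2**n_bits levels."""
    return v_fs / (2 ** n_bits)

def dynamic_range_db(n_bits: int) -> float:
    """Ratio between full scale and one LSB in dB (about 6.02 dB per bit)."""
    return 20.0 * math.log10(2 ** n_bits)

V_FS = 0.5  # assumed full-scale voltage in volts, not stated legibly in the source

for n in (4, 8, 12, 16):
    print(n, 2 ** n, resolution(V_FS, n), round(dynamic_range_db(n), 1))
```

Each extra bit halves the step size and adds about 6 dB of dynamic range, which is the trade-off behind the question of how the bits should be used.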

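Example: Unsigned Number
With $a_i \in \{0, 1, \ldots, 9\}$ in radix 10:

$$\sum_{i=-l}^{k} a_i 10^i = a_k 10^k + \cdots + a_1 10 + a_0 + a_{-1} 10^{-1} + \cdots + a_{-l} 10^{-l}$$

Example: Unsigned Number
With $a_i \in \{0, 1\}$ in radix 2:

$$\sum_{i=-l}^{k} a_i 2^i = a_k 2^k + \cdots + a_1 2 + a_0 + a_{-1} 2^{-1} + \cdots + a_{-l} 2^{-l}$$

[Worked example: an eight-digit binary number expanded term by term; the digits are not recoverable.]

Signed Digit Number Representation
The digits $a_i \in \{-\alpha, \ldots, r-1-\alpha\}$ in a radix $r$ system:

$$\sum_{i=-l}^{k} a_i r^i$$

Example, radix 10: $a_i \in \{-4, -3, \ldots, 4, 5\}$
[Worked examples: one integer and one fractional signed-digit number evaluated digit by digit; the digits are not recoverable.]

Signed Number Representation
- Sign Magnitude
- One's Complement
- Two's Complement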
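The positional sums for unsigned and signed-digit numbers can be evaluated with one small routine. This is a minimal sketch: `positional_value` is a hypothetical helper, and the digit strings used below are made-up illustrations, not the examples from the slides.

```python
from fractions import Fraction

def positional_value(int_digits, frac_digits, radix):
    """Evaluate sum(a_i * radix**i) for digits a_k..a_0 . a_-1..a_-l.

    Digits may be negative, which covers signed-digit representations.
    """
    value = Fraction(0)
    for d in int_digits:            # Horner's rule over the integer digits
        value = value * radix + d
    weight = Fraction(1)
    for d in frac_digits:           # radix**-1, radix**-2, ...
        weight /= radix
        value += d * weight
    return value

# Illustrative inputs only (assumptions, not taken from the slides)
print(positional_value([1, 0, 1, 1], [0, 1], 2))   # binary 1011.01
print(positional_value([2, 0, -5], [], 10))        # signed digits 2, 0, -5
```

Exact fractions are used so that fractional digits do not pick up floating-point error.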

Signed Magnitude: unsigned numbers with a sign bit
- Two zeros
+ Low power?
+ Easy to convert to negative

One's Complement: signed numbers by inverting (complement)
- Two zeros
+ Easy to convert to negative

Two's Complement: complement + LSB
+ Easy addition
- Not so easy to convert to negative

[Figure: the code words of the three representations with their signed values]

Two's Complement
Most widely used fixed-point numbering system. The digits $a_i \in \{0, 1\}$ in a radix 2 system:

$$-a_k 2^k + \sum_{i=-l}^{k-1} a_i 2^i = -a_k 2^k + a_{k-1} 2^{k-1} + \cdots + a_1 2 + a_0 + a_{-1} 2^{-1} + \cdots + a_{-l} 2^{-l}$$

described in a fixed-point positional number system:

$$a_k a_{k-1} \ldots a_1 a_0 \,.\, a_{-1} \ldots a_{-l}$$

where $a_k$ is the sign bit and $a_{-1} \ldots a_{-l}$ is the fractional part.
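To make the comparison of the three signed representations concrete, here is a small sketch of negation in each of them for integer bit patterns. All function names are hypothetical and the 4-bit wordlength is an arbitrary choice for illustration.

```python
def negate_sign_magnitude(word: int, n_bits: int) -> int:
    """Sign magnitude: flip the sign bit only."""
    return word ^ (1 << (n_bits - 1))

def negate_ones_complement(word: int, n_bits: int) -> int:
    """One's complement: invert every bit."""
    return ~word & ((1 << n_bits) - 1)

def negate_twos_complement(word: int, n_bits: int) -> int:
    """Two's complement: invert every bit, then add one LSB (modulo 2**n_bits)."""
    mask = (1 << n_bits) - 1
    return ((~word & mask) + 1) & mask

def twos_complement_value(word: int, n_bits: int) -> int:
    """Interpret an n-bit pattern; the MSB carries weight -2**(n_bits - 1)."""
    sign = (word >> (n_bits - 1)) & 1
    return word - (sign << n_bits)

N = 4
three = 0b0011
print(format(negate_sign_magnitude(three, N), "04b"))    # 1011
print(format(negate_ones_complement(three, N), "04b"))   # 1100
print(format(negate_twos_complement(three, N), "04b"))   # 1101

# "Easy addition": a plain modulo-2**N add gives the right signed result.
total = (0b0101 + negate_twos_complement(three, N)) & 0b1111
print(twos_complement_value(total, N))                   # 2
```

Negation is cheapest in sign magnitude and one's complement, while two's complement pays for its extra increment with an adder that needs no special cases.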

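Example: 2's complement
[Worked example: an eight-digit two's complement number expanded term by term; the digits are not recoverable.]

If nothing else is said, we assume numbers $|x| < 1$.

Sign Extension in Two's Complement
Since $-a_k 2^k = -a_k 2^{k+1} + a_k 2^k$, the sign bit can be repeated without changing the value:

$$-a_k 2^k + a_{k-1} 2^{k-1} + \cdots + a_{-l} 2^{-l} = -a_k 2^{k+1} + a_k 2^k + a_{k-1} 2^{k-1} + \cdots + a_{-l} 2^{-l}$$

The Wordlength, i.e. number of bits
[Figure: FIR filter with delay elements D and coefficient multipliers h. Example: a UMTS filter, floating-point coefficients versus a limited number of bits.]
Every extra bit costs
- energy/power
- delay
- area
so the wordlength has to be reduced.

The Wordlength, i.e. number of bits
- Adder: the output needs an extra bit to be sure of no overflow.
- Multiplier: M x N bits give M+N bits for full precision, sometimes M+N-1.
Precision has to be limited.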
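The word growth rules for adders and multipliers, and sign extension, can be checked by brute force for small wordlengths. The sketch below is only an illustration under the assumption of two's complement integers; the helper names and the 4-bit operands are hypothetical.

```python
def bits_needed(value: int) -> int:
    """Smallest two's complement wordlength that can hold value."""
    n = 1
    while not (-(1 << (n - 1)) <= value <= (1 << (n - 1)) - 1):
        n += 1
    return n

def sign_extend(word: int, n_bits: int, extra: int) -> int:
    """Repeat the sign bit of an n-bit pattern 'extra' times."""
    sign = (word >> (n_bits - 1)) & 1
    return word | ((((1 << extra) - 1) << n_bits) if sign else 0)

def as_signed(word: int, n_bits: int) -> int:
    """Value of an n-bit two's complement pattern."""
    return word - ((word >> (n_bits - 1)) << n_bits)

M = N = 4  # operand wordlengths chosen for illustration
a_range = range(-(1 << (M - 1)), 1 << (M - 1))
b_range = range(-(1 << (N - 1)), 1 << (N - 1))

worst_sum = max(bits_needed(a + b) for a in a_range for b in b_range)
worst_prod = max(bits_needed(a * b) for a in a_range for b in b_range)
print(worst_sum, worst_prod)   # 5 8

# Sign extension keeps the value: 1101 (-3) becomes 11111101 (-3)
print(as_signed(sign_extend(0b1101, 4, 4), 8))   # -3
```

The full M+N bits are only needed for the single product of the two most negative operands; every other product fits in M+N-1 bits, which is one reason the shorter output is sometimes used.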

Basic Building Blocks
[Figure: FIR filter with delay elements D and coefficient multipliers h]
In the FIR filter:
- adders
- multipliers
- registers
In other algorithms also: shift, minus, division, ...
- A left shift is a multiplication by 2.
- A right shift is a division by 2.
Both have low complexity!

Comparing Basic Building Blocks
From high to low complexity:
- Divider
- Generic multiplier
- Fixed multiplier
- Adder/subtractor
- Shifter
[Figure: bit-level diagrams of the blocks with operand bits a and b, sum bits s and product bits p]

Scaling and Round-off Noise

Two Types of Quantization
Coefficient quantization
- Non-ideal transfer function
- Compare to analog component variations
Signal quantization
- Round-off noise
- Limit cycles

Signal Quantization
Round-off noise
- Affects the output as a random disturbance
Limit cycle oscillations
- Undesired periodic components
- Due to non-linear behavior in the feedback (rounding or overflow)

Quantization Analysis
Using real rounding, truncation, and overflow
- Gives exact results
- Tricky: needs integer representation
Using noise models
- Floating-point representation can still be used
- Suitable for Matlab, C/C++, ...

Rounding and Truncation
Rounding/truncation is always there! Especially necessary in recursive systems.
Without quantization (infinite wordlength):
- Multiplication: n+m output bits
- Addition: n+1 output bits

Truncation and Rounding
[Figure: quantizer characteristics between level X and level X+1 for truncation, rounding, and truncation towards zero]

Truncation
- All values approximated in the same direction
- Max error 1 LSB
- DC error
- All values go towards -infinity

Rounding
- Values approximated up or down
- Max error 1/2 LSB
- Rounded to even

Truncation towards zero
- No energy added to the system
- Often used in recursive algorithms
- Add LSB before truncation if negative

Scaling
- Adjust signal range to fit the hardware
- Unchanged transfer function (scaled coefficients might move the pole-zeros)

Example Where Scaling is Needed
[Figure: signal path with input u(n) in which overflow occurs without scaling; the numeric values are not recoverable.]

Trade-off
- Scale up to reduce roundoff noise
- Scale down to avoid overflow, but you lose precision!
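The three quantization rules on the truncation and rounding slide can be compared numerically. The following is a sketch with hypothetical function names; exact fractions are used so that ties are really ties, and the LSB of 1/4 is an arbitrary choice.

```python
import math
from fractions import Fraction

def truncate(x, lsb):
    """Two's complement truncation: always towards -infinity."""
    return math.floor(x / lsb) * lsb

def round_half_even(x, lsb):
    """Rounding to the nearest level, ties to the even level."""
    return round(x / lsb) * lsb

def truncate_towards_zero(x, lsb):
    """Magnitude truncation: the result never has larger magnitude than x."""
    return math.trunc(x / lsb) * lsb

lsb = Fraction(1, 4)
x = Fraction(-5, 16)
print(truncate(x, lsb), round_half_even(x, lsb), truncate_towards_zero(x, lsb))

# Error statistics over a fine grid of inputs
grid = [Fraction(k, 16) for k in range(-32, 33)]
for q in (truncate, round_half_even, truncate_towards_zero):
    errors = [x - q(x, lsb) for x in grid]
    print(q.__name__, max(abs(e) for e in errors), sum(errors) / len(errors))
```

On a symmetric grid only plain truncation shows a non-zero mean error, which is the DC error listed above.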

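Scaling
[Figure: a filter with unit sample response f(n) and the scaling factor β at its input]
Safe scaling if

$$\beta = \frac{1}{\sum_{i=0}^{\infty} |f(i)|}$$

where $f(i)$ is the unit sample response.

Example: Safe Scaling
[Worked example: a linear-phase FIR filter h(n) with bounded input x(n) and output y(n), for which β follows from the sum of |f(i)|; the numeric values are not recoverable.]
- Increased roundoff noise
- Internal scaling might improve
(Linear phase FIR. Note the strength reduction.)

Example: Safe Scaling
[Worked example: a filter whose unit sample response is a geometric series, so the sum of |f(i)| has a closed form; the numeric values are not recoverable.]

Scaling
Safe scaling is pessimistic. An alternative is scaling with

$$\beta = \frac{1}{\sqrt{\sum_{i=0}^{\infty} f(i)^2}}$$

In practice: scaling with $\beta = 2^{\pm n}$, which is easy to do (a shift).
Increased internal wordlength is an alternative.
[Figure: original filter with overflow]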
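The two scaling rules can be sketched for a first-order recursive filter, whose unit sample response is the geometric series $f(i) = a^i$. The coefficient a = 0.9, the truncation of the infinite sums to 500 terms, and all function names are assumptions made for this illustration.

```python
import math

def unit_sample_response(a: float, length: int):
    """First-order recursive filter y(n) = a*y(n-1) + x(n): f(i) = a**i."""
    return [a ** i for i in range(length)]

def safe_scaling(f) -> float:
    """beta = 1 / sum |f(i)|: overflow is impossible for any bounded input."""
    return 1.0 / sum(abs(v) for v in f)

def l2_scaling(f) -> float:
    """beta = 1 / sqrt(sum f(i)**2): less pessimistic, overflow remains possible."""
    return 1.0 / math.sqrt(sum(v * v for v in f))

def shift_exponent(beta: float) -> int:
    """Largest n with 2**n <= beta, so that the scaling is a plain shift."""
    return math.floor(math.log2(beta))

a = 0.9                                   # hypothetical coefficient
f = unit_sample_response(a, 500)          # long enough for the tail to vanish
for name, beta in (("safe", safe_scaling(f)), ("L2", l2_scaling(f))):
    n = shift_exponent(beta)
    print(name, beta, "shift by", n, "i.e. multiply by", 2.0 ** n)
```

For this filter the safe rule asks for a scale factor near 1 - a, while the L2 rule allows a considerably larger one, which illustrates why safe scaling is called pessimistic.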

Pacemaker example
[Figure: the electrocardiogram (EGM) and the interfered signal, amplitude versus time [ms]]

Filtering Performance
[Figure: EGM with added interference from an AC hand drill, and the output T of the GLRT together with its threshold, versus time [ms]. The SNR value in dB is not recoverable.]

Wavelet Filterbank Bit-optimization
Signals have been monitored to determine the upper bound of the wordlength.
Comparison of worst-case wordlength and implemented wordlength at the wavelet output:
[Table: worst-case wordlength $N_{wc}$ and implemented wordlength $N_{imp}$ for the outputs $y_1$ to $y_6$ of the filters F(z) and G(z), each given as the input wordlength N plus a number of extra bits. The values are only partly recoverable.]

Example: Internal Scaling
- VHDL bit-level simulation of the FFT
- Compared with Matlab floating-point simulation
- Optimized internal scaling
[Figure: radix-2 FFT signal-flow graph with twiddle factors W and the basic butterfly unit; pipelined FFT with data in, five radix stages (stage 5 down to stage 1), counter, clock and data out. White noise input.]
Source: Fredrik Kristensen

Example: Internal Scaling
[Figure: the same pipelined FFT annotated with the wordlength at each stage boundary; most of the bit counts are not recoverable. White noise input. Source: Fredrik Kristensen]

Limit Cycles
Example: zero-input oscillations in a 2nd-order IIR filter.
[Figure: 2nd-order recursive section with zero input x(n), quantizers Q and coefficients $b_1$ and $b_2$]
- Rounding after multiplication
- Truncation after multiplication

$$b_1 = \frac{489}{256} = 1.91015625; \quad b_2 = -\frac{15}{16} = -0.9375$$

Source: Lars Wanhammar, DSP Integrated Circuits
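Zero-input limit cycles are easiest to see in a first-order loop, so the sketch below uses $y(n) = Q[a\,y(n-1)]$ rather than the 2nd-order section of the slide. The coefficient a = -9/10, the start value and the function names are assumptions chosen for illustration; the LSB is taken as 1.

```python
import math
from fractions import Fraction

def round_nearest(x):
    """Round to the nearest integer LSB, ties upwards."""
    return math.floor(x + Fraction(1, 2))

def towards_zero(x):
    """Magnitude truncation: drop the fraction, never increase |x|."""
    return math.trunc(x)

def zero_input_response(a, y0, quantize, steps):
    """Iterate y(n) = Q[a * y(n-1)] with zero input, starting from y0."""
    y, out = y0, []
    for _ in range(steps):
        y = quantize(a * y)
        out.append(y)
    return out

a = Fraction(-9, 10)  # hypothetical coefficient, not from the slide
print(zero_input_response(a, 10, round_nearest, 12))
# [-9, 8, -7, 6, -5, 5, -4, 4, -4, 4, -4, 4]
print(zero_input_response(a, 10, towards_zero, 12))
# [-9, 8, -7, 6, -5, 4, -3, 2, -1, 0, 0, 0]
```

With rounding the loop gets stuck alternating between -4 and 4 although the ideal response decays to zero. Magnitude truncation never adds energy, so the same loop dies out.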

Limit Cycles
Poles close to the unit circle. Changing the precision moves the poles!
Matlab: zplane(1,[1 -1.91015625 0.9375])
Matlab: zplane(1,[1 -1.9375 0.9375])
[Figure: pole-zero plots, imaginary part versus real part, for the two coefficient sets]

Zero-input oscillations
- Very difficult problem
- Often not accepted in audio
- In general, no solutions for structures > 2nd order
- Can be limited by increased internal wordlength
- Can in some 2nd-order structures be eliminated by pole positioning
- Wave Digital Filters are free from parasitic oscillations

Saturation Arithmetic: Overflow Oscillations
- Oscillations are limited by saturation
- Overflow if $C_{out\text{-}msb}$ differs from $C_{in\text{-}msb}$
- Overflow changes the sign
[Figure: saturation logic after the adder. The carry into and out of the MSB detect negative overflow (NOF) and positive overflow (POF); the two's complement sum and its sign bit select between the correct sum and the saturated output.]
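The overflow rule for the adder, carry into the MSB versus carry out of the MSB, and the difference between wrap-around and saturation can be sketched as follows. The function names and the 4-bit wordlength are hypothetical choices for illustration.

```python
def wrap(value: int, n_bits: int) -> int:
    """Keep the n low bits and reinterpret them as two's complement."""
    value &= (1 << n_bits) - 1
    return value - (1 << n_bits) if value >> (n_bits - 1) else value

def saturate(value: int, n_bits: int) -> int:
    """Clamp to the most negative or most positive representable value."""
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    return max(lo, min(hi, value))

def add_with_overflow_flag(a: int, b: int, n_bits: int):
    """Wrap-around sum plus a flag: overflow iff carry into MSB != carry out of MSB."""
    mask = (1 << n_bits) - 1
    low_mask = mask >> 1
    ua, ub = a & mask, b & mask
    c_in_msb = ((ua & low_mask) + (ub & low_mask)) >> (n_bits - 1)
    c_out_msb = (ua + ub) >> n_bits
    return wrap(a + b, n_bits), c_in_msb != c_out_msb

N = 4
for a, b in ((5, 6), (-5, -6), (3, 2), (-1, -1)):
    wrapped, overflow = add_with_overflow_flag(a, b, N)
    print(a, b, "wrap:", wrapped, "overflow:", overflow, "saturated:", saturate(a + b, N))
```

For 5 + 6 the wrapped result is -5, so the overflow has changed the sign, while saturation holds the output at 7.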

Limit Cycles due to Overflow
Lab example, zero input.
[Figure: two panels, two's complement arithmetic (wrap-around) and saturated arithmetic, each showing the input, the unquantized output and the quantized output as amplitude versus time]
Source: Lars Wanhammar, DSP Integrated Circuits

Simple Noise Analysis

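Scaling and White Noise Input

$$\beta = \frac{1}{\sum_{i=0}^{\infty} |f(i)|} \quad \text{(safe scaling)}$$

$$\beta = \frac{1}{\delta \sqrt{\sum_{i=0}^{\infty} f(i)^2}} \quad \text{(possible overflow)}$$

where $f(i)$ is the unit sample response and $\sum_i f(i)^2$ gives the variance for a white noise input.
- Safe scaling, but not guaranteed
- δ sets the probability of an overflow
- Typically one overflow every $10^6$ samples is accepted in audio [Wanhammar]

Roundoff Noise Model
[Figure: an 8-bit signal u(n) multiplied by an 8-bit coefficient a; the product is rounded back to 8 bits by the quantizer Q.]
Model: the rounding is modeled with an added noise e(n) as an input error.

Roundoff Noise
Assume the quantization error is uniformly distributed in the interval

$$-\frac{\Delta}{2} \le e(n) \le \frac{\Delta}{2}, \quad \Delta = 2^{-(W-1)}$$

where $W$ is the number of bits after the rounding.
[Figure: probability density $P_e(e)$, constant at $1/\Delta$ between $-\Delta/2$ and $\Delta/2$, i.e. within ½ LSB]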

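Roundoff Noise
Mean value:

$$\bar{e} = E[e(n)] = 0$$

Variance:

$$\sigma_e^2 = E[(e - \bar{e})^2] = \int_{-\Delta/2}^{\Delta/2} (e - \bar{e})^2 P_e(e)\, de = [\bar{e} = 0] = \frac{1}{\Delta} \int_{-\Delta/2}^{\Delta/2} e^2\, de = \frac{1}{3\Delta} \left( \frac{\Delta^3}{8} + \frac{\Delta^3}{8} \right) = \frac{\Delta^2}{12}$$

[Figure: probability density $P_e(e)$, constant at $1/\Delta$ within ½ LSB]

Example: Roundoff Noise
In the case of rounding (mean 0) the variance and the average power are the same, i.e. if a value is rounded to $W$ bits the quantization noise becomes

$$\sigma_e^2 = \frac{\Delta^2}{12} = \frac{2^{-2(W-1)}}{12}$$

If we scale down one bit:

$$\frac{2^{-2(W-2)}}{12} = 4 \cdot \frac{2^{-2(W-1)}}{12} = 4\sigma_e^2$$

Signal to Noise Ratio (SNR)
One extra bit reduces the quantization error power by a factor 4:

$$10 \log_{10} \frac{4\sigma_e^2}{\sigma_e^2} \approx 6.02 \text{ dB}$$

$$\mathrm{SNR} = 10 \log_{10} \frac{\sigma_x^2}{\sigma_e^2} = 10 \log_{10} \left( 12\, \sigma_x^2\, 2^{2(W-1)} \right)$$

where $\sigma_x^2$ is the signal power (variance) and $\sigma_e^2$ is the roundoff error power (variance).
Good to remember: 6 dB increase in SNR per bit.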
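The result $\sigma_e^2 = \Delta^2/12$ and the 6 dB per bit rule can be checked with a small Monte Carlo experiment. This is a sketch under stated assumptions: the input is drawn uniformly from [-1, 1), $\Delta = 2^{-(W-1)}$ as above, and the function names, sample count and seed are arbitrary choices.

```python
import math
import random

def round_to_bits(x: float, w: int) -> float:
    """Round x to a grid with step delta = 2**-(w - 1)."""
    delta = 2.0 ** -(w - 1)
    return math.floor(x / delta + 0.5) * delta

def measured_noise_variance(w: int, n: int = 100_000, seed: int = 1) -> float:
    """Sample variance of the rounding error for uniformly distributed inputs."""
    rng = random.Random(seed)
    total = total_sq = 0.0
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        e = round_to_bits(x, w) - x
        total += e
        total_sq += e * e
    mean = total / n
    return total_sq / n - mean * mean

def predicted_noise_variance(w: int) -> float:
    """delta**2 / 12 from the uniform error model."""
    delta = 2.0 ** -(w - 1)
    return delta * delta / 12.0

for w in (8, 9):
    print(w, measured_noise_variance(w), predicted_noise_variance(w))

gain_db = 10.0 * math.log10(predicted_noise_variance(8) / predicted_noise_variance(9))
print("noise power drop per extra bit [dB]:", gain_db)
```

The measured and predicted variances should agree to within the statistical error of the experiment, and the ratio between two neighbouring wordlengths is the factor 4 derived above.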

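Signal to Noise Ratio (SNR)
Example: full-scale sine wave with amplitude $A$ rounded to 8 bits:

$$\mathrm{SNR} = 10 \log_{10} \frac{A^2/2}{\Delta^2/12} \approx 50 \text{ dB}, \quad A = 1,\ \Delta = 2^{-7}$$

Roundoff Noise: Addition
[Figure: two signals $u_1(n)$ and $u_2(n)$ with noise sources $e_1(n)$ and $e_2(n)$ being added]

$$E[(e_1 + e_2)^2] = E[e_1^2 + 2 e_1 e_2 + e_2^2] = E[e_1^2] + 2E[e_1 e_2] + E[e_2^2] = E[e_1^2] + E[e_2^2]$$

since $E[e_1 e_2]$ is zero if $e_1$ and $e_2$ are independent.

Example: Roundoff Noise
First-order IIR filter with coefficient $a$ and noise source $e(n)$; the variance at the output is

$$\sigma^2 = \sigma_e^2 \sum_{i=0}^{\infty} f(i)^2 = \sigma_e^2 \left( 1 + (a)^2 + (a^2)^2 + (a^3)^2 + \cdots \right) = \frac{\sigma_e^2}{1 - a^2}$$

- No feedback, $a = 0$: $\sigma^2 = \sigma_e^2$
- $a = 0.5$: $\sigma^2 \approx 1.33\, \sigma_e^2$
- $a = 0.998$: $\sigma^2 \approx 250\, \sigma_e^2$

Example: SNR
Full-scale sine wave, rounded to 8 bits in the IIR filter.
- No feedback: SNR ≈ 50 dB
- Narrow band filter ($a = 0.998$): [SNR value not recoverable]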
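The noise gain of the first-order loop and its effect on the SNR can be reproduced with a few lines. In this sketch the function names are hypothetical, and the SNR with feedback assumes that the sine wave itself reaches the output with unit gain, which the slide does not state.

```python
import math

def noise_gain_first_order(a: float) -> float:
    """Sum of f(i)**2 for f(i) = a**i, i.e. 1 / (1 - a**2), valid for |a| < 1."""
    return 1.0 / (1.0 - a * a)

def snr_db_full_scale_sine(w: int, noise_gain: float = 1.0) -> float:
    """SNR of a sine with amplitude 1 against W-bit rounding noise times noise_gain."""
    delta = 2.0 ** -(w - 1)
    signal_power = 0.5                         # A**2 / 2 with A = 1
    noise_power = noise_gain * delta * delta / 12.0
    return 10.0 * math.log10(signal_power / noise_power)

for a in (0.0, 0.5, 0.998):
    g = noise_gain_first_order(a)
    # The last column assumes unit signal gain through the filter.
    print(a, round(g, 2), round(snr_db_full_scale_sine(8, g), 1))
```

Moving the pole from 0.5 to 0.998 raises the noise gain from about 1.33 to about 250, which is why narrow band filters are sensitive to the internal wordlength.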