VLSI IMPLEMENTATION OF PARALLEL-SERIAL LMS ADAPTIVE FILTERS


Run-Bo Fu, Paul Fortier
Dept. of Electrical and Computer Engineering, Laval University
Québec, Québec, Canada G1K 7P4
email: fortier@gel.ulaval.ca

Abstract - In this paper, a parallel realization of the LMS algorithm in FPGA is presented. It is based on a parallel-serial multiplier, in which one of the inputs and the outputs are transferred in series, most significant digit first. The implementation is relatively low in complexity. The modularity exhibited is attractive for VLSI implementations. It can be realized on a single chip or on a few modular chips for most practical applications.

I. Introduction

Adaptive filters have found many applications in the areas of communications and signal processing, such as echo cancellation, channel equalization, noise cancellation, and system identification [1]. The difference between adaptive filters and conventional digital filters is that the former need an appropriate algorithm for updating the filter coefficients. The least-mean-square (LMS) algorithm, due to its simplicity and good convergence behavior, has been widely used in practical applications [2].

Since LMS adaptive filtering involves a large number of computations, parallel implementation of the LMS algorithm is necessary for real-time applications. However, the implementation would be expensive in this case because N multipliers would be needed for N filter coefficients. Multiplier arrays are known to be fast, but because of their complexity, just one array can be implemented on a chip. Another alternative is the use of parallel-serial multipliers, which are slower but also less complex.

In this paper, a parallel-serial multiplier is used [3], [4]. It is based on an algorithm which performs multiplication in the most significant digit (MSD) first manner. The pipelining of the multiplier depends on the use of a redundant number system. The implementation of the LMS algorithm presented in this paper has three steps: the computation of the convolution terms, the summation of these terms, and the adaptation algorithm.
By an elaborate design of the three kinds of multipliers used, the three steps are overlapped in a full pipeline fashion. The redundant arithmetic adders and multipliers which are used for the realization of the LMS algorithm are presented in Section II. The LMS algorithm is described briefly in Section III. Then, the parallel realization of the LMS algorithm is shown in Section IV. Finally, the implementation of the algorithm in FPGA is discussed in Section V.

II. Redundant arithmetic adders and multipliers

The multiplication discussed in this section is carried out by summing the partial products in a decreasing order. The partial result is represented as a signed-digit-binary (SDB) number. The partial products are formed in two's complement (2C) representation or SDB representation.

1. Redundant and hybrid adders

Two kinds of number representation are used in the multiplication. Let X and Y be 2C and SDB numbers, respectively,

$$X = -x_0 + \sum_{j=1}^{n} x_j 2^{-j}, \qquad x_j \in \{0, 1\}, \qquad X \in [-1, 1] \tag{1}$$

$$Y = \sum_{j=1}^{n} y_j 2^{-j}, \qquad y_j \in \{-1, 0, 1\}, \qquad Y \in [-1, 1] \tag{2}$$

where digit $y_j$ is coded by two bits $c_j$ and $s_j$, such that

$$y_j = c_j + s_j - 1, \qquad c_j, s_j \in \{0, 1\} \tag{3}$$

Under this representation, the redundant adder (Fig. 1.a) and the hybrid adder (Fig. 1.b) are obtained. The redundant adder carries out the addition of two SDB numbers, and the hybrid adder carries out the addition of an SDB number with a 2C number. We will also need to convert an SDB number into 2C representation in a data block fashion.
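The two number representations and the block conversion from SDB to 2C can be checked with a short software sketch. This is only an illustration: the function names are hypothetical, and the digit coding y = c + s - 1 is an assumption inferred from the statement that the conversion adds the C and S words with an offset of -1 (the operators in the transcribed equation are lost).

```python
from fractions import Fraction

def value_2c(bits):
    """Value of a two's-complement fraction x_0.x_1...x_n; x_0 has weight -1."""
    return -bits[0] + sum(Fraction(b, 2 ** j) for j, b in enumerate(bits[1:], start=1))

def value_sdb(digits):
    """Value of a signed-digit-binary fraction with digits y_1..y_n in {-1, 0, 1}."""
    return sum(Fraction(y, 2 ** j) for j, y in enumerate(digits, start=1))

# Assumed two-bit coding of one digit, y = c + s - 1.
# The digit 0 could equally be coded as (0, 1); (1, 0) is an arbitrary choice.
ENCODE = {-1: (0, 0), 0: (1, 0), 1: (1, 1)}

def sdb_to_2c(digits):
    """Convert SDB digits (MSD first) to n+1 two's-complement bits [x_0, ..., x_n]."""
    n = len(digits)
    c_word = 0
    s_word = 0
    for y in digits:
        c, s = ENCODE[y]
        c_word = (c_word << 1) | c
        s_word = (s_word << 1) | s
    # Sum of (c_j + s_j - 1) * 2^(n-j): add both words, subtract the all-ones offset.
    total = c_word + s_word - ((1 << n) - 1)
    total &= (1 << (n + 1)) - 1  # wrap to n+1 bits, i.e. two's complement
    return [(total >> (n - j)) & 1 for j in range(n + 1)]
```

For example, the digits [1, -1, 0, 1] have the value 5/16, and `sdb_to_2c` returns the bits [0, 0, 1, 0, 1], which `value_2c` maps back to 5/16.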

With the coding (3), an SDB number Y can be written as

$$Y = \sum_{j=1}^{n} y_j 2^{-j} = \sum_{j=1}^{n} (c_j + s_j - 1)\, 2^{-j} = \sum_{j=1}^{n} (c_j + s_j)\, 2^{-j} - \sum_{j=1}^{n} 2^{-j} \tag{4}$$

By comparing equations (4) and (1), we get the conversion circuit illustrated in Fig. 2. The conversion is carried out by simply adding the two 2C numbers C and S, which represent the vectors formed by $c_j$ and $s_j$, with an offset of -1; the bit $x_0$ is the inverse of the most significant carry bit.

2. Serial-parallel multiplication

The serial-parallel multiplication described in this paper is carried out by summing the partial products in a decreasing order. This allows the result to be generated digit by digit, most significant first, in a pipeline manner [5]. Assuming that the weights of the partial products decrease by a factor r, the sequence of the partial products can be described as

$$\{\, y_j r^{-j} : y_j \in Y,\ j = 1, 2, \ldots, k_{\max} \,\} \tag{5}$$

The factor r is the basis of the number system of the multiplier operand, and $k_{\max}$ is the total number of steps of the accumulation. Without loss of generality, we assume that the initial value of the partial accumulation is $x_0$. The partial accumulation is expressed as

$$X_k = x_0 + \sum_{j=1}^{k} y_j r^{-j} \tag{6}$$

$z_j$ is the result digit, which is extracted at each step, most significant first, from the partial accumulation. The partial result at step k is represented as

$$Z_k = \sum_{j=1}^{k} z_j r^{-j}, \qquad z_j \in \{-\rho, \ldots, \rho\} \tag{7}$$

where [6]

$$\frac{r-1}{2} < \rho < r \tag{8}$$

The residual $W_k$, which is the scaled difference between the two partial accumulations, is given by the following expressions:

$$W_k = r^k (X_k - Z_k) = r^k \Bigl( x_0 + \sum_{j=1}^{k} y_j r^{-j} - \sum_{j=1}^{k} z_j r^{-j} \Bigr) \tag{9}$$

$$W_k = r\, W_{k-1} + y_k - z_k \tag{10}$$

The multiplication is realized as follows:

    Initialize W_0 ← x_0
    for k = 1, ..., k_max
        W'_k ← r W_{k-1} + y_k
        z_k ← Select(W'_k)
        W_k ← W'_k - z_k
    End for

[The source attaches a constant fractional term to W'_k and to the update of W_k; it is illegible in this transcription and is omitted here, in agreement with (10), which does not contain it.]

In the procedure of the realization, three issues should be addressed: the selection rule, the domain of $y_j$, and the error bound. To extract the result digit from the partial accumulation, we can convert the most significant n bits of $W'_k$ into a 2C number. The most significant 3 bits of the 2C number represent the extracted digit.
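The recurrence (10) can be exercised in software with exact rational arithmetic. The sketch below is not the circuit of the paper: the function name is hypothetical, and the selection rule (round the exact residual to the nearest integer, then clamp to the digit set) is an assumption standing in for the truncated 2C estimate used in hardware. Under this assumed rule, digits stay within the set as long as |y_k| ≤ ρ + 1/2 - r/2, which is a property of the sketch and not a restatement of the paper's bounds.

```python
from fractions import Fraction
import math

def msd_first_accumulate(x0, ys, r, rho):
    """Accumulate x0 + sum(y_j * r^-j) and emit result digits, most significant first.

    Returns (digits, final_residual). Selection by rounding is an assumption.
    """
    w = Fraction(x0)
    digits = []
    for y in ys:
        w_prime = r * w + y                            # W'_k = r W_{k-1} + y_k
        z = math.floor(w_prime + Fraction(1, 2))       # assumed Select: round to nearest
        z = max(-rho, min(rho, z))                     # keep z_k in {-rho, ..., rho}
        w = w_prime - z                                # W_k = W'_k - z_k
        digits.append(z)
    return digits, w

# Serial-parallel product: parallel multiplicand x, serial radix-4 multiplier digits.
x = Fraction(5, 8)
multiplier_digits = [2, -1, 0, 1, -2]                  # hypothetical operand, MSD first
partial_products = [x * d for d in multiplier_digits]
result_digits, residual = msd_first_accumulate(0, partial_products, r=4, rho=3)
```

By construction the residual satisfies W_n = r^n (X_n - Z_n), so the truncation error of the digit-serial result is the final residual scaled by r^-n.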
To make sure that the extracted digit does not exceed the domain $\{-\rho, \ldots, \rho\}$, $W'_k$ must lie between a lower bound near $-\rho$ and an upper bound near $\rho$ (equation (11); the constant terms are illegible in this transcription). Substituting the definition of $W'_k$ turns (11) into a two-sided bound on $r W_{k-1} + y_k$ (equation (12), illegible). After selection of $z_k$, the new residual $W_k$ is obtained from $W'_k$ by removing $z_k$, so $W_k$ is confined to a fixed interval (equation (13), illegible). From (12) and (13), $y_k$ is bounded on both sides in terms of $\rho$ (equation (14), illegible), and the input domain Y of $y_j$ is defined as the corresponding interval (equation (15), illegible).

The accumulation procedure stops after n steps. Since the result is generated one digit at each step, the less significant part of the total result is then ignored. The number which is ignored in the last step is $W_n$. Because the weight of the result decreases by a factor r as the accumulation progresses, the error is

$$X_n - Z_n = W_n r^{-n} \tag{16}$$

and is therefore bounded by the bound on $W_n$ scaled by $r^{-n}$.

The procedure described above results in an easier design than the method described in [3]. We need not determine the domain of convergence and the selection interval by a complicated procedure. It is enough to determine the domains of W and Y through equations (13) and (15). From (13) and (15), we observe that the domains of W and Y depend not only on ρ and r but also on m, the number of bits to be converted from redundant notation into 2C. With definite ρ and r, there is a lower bound for m.

III. LMS algorithm

The LMS algorithm is described as

$$Y(n) = \sum_{i=1}^{N} x_i(n)\, c_i(n)$$
$$E(n) = Z(n) - Y(n) \tag{17}$$
$$c_i(n+1) = c_i(n) + \mu\, E(n)\, x_i(n)$$

The convergence speed is a function of the step size μ, which must remain below a bound set by N and P to ensure that the convergence will be in a quadratic mean sense (equation (18), illegible), where P is the power of the primary signal.

IV. Pipeline realization of the LMS algorithm

By using the multiplier of Section II, the LMS algorithm is realized in three steps. The three steps are overlapped in a pipeline manner.

The first step is to compute the convolution terms $x_i c_i$, $i = 1, \ldots, N$. We assume that the input $x_i$, represented as a 2C number, is the multiplicand, that the coefficient $c_i$ is expressed as a redundant number with ρ = 2, and that the result of the multiplication is represented as

$$Z = \sum_{j} z_j 4^{-j} \tag{19}$$

This means that the parameters r and ρ are defined as follows:

$$r = 4, \qquad \rho = 3 \tag{20}$$

A hybrid adder is used. For the chosen value of m (illegible) and from equation (14), we obtain the admissible range of $x_i$ (equation (21), illegible). Fig. 3 illustrates the multiplier used to compute the convolution terms.

The summation of the N extracted digits and the computation of the error are carried out in the second step. The N digits are summed up by a binary adder tree. The adder implementation is like that of Fig. 1.a, except that the pair of bits to the far right must be replaced by a fixed pair of values (illegible). In this way, the adder can add four 2C numbers and give two numbers in 2C as its result. For the addition in the binary tree, the least significant bit of the result in every stage cannot simply be cut off, as is the case for usual adders, because the extracted digit is in most significant first fashion.
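As a behavioural reference for the three pipeline steps (convolution terms, summation and error, coefficient update), the recursion (17) of Section III can be sketched in floating point. This is a software model only, not the digit-serial hardware: the function name, the system-identification setup, the uniformly distributed input and the step size are all assumptions made for illustration.

```python
import random

def lms_identify(h, mu, n_samples, seed=0):
    """Adapt coefficients c toward an unknown FIR response h with the LMS recursion.

    Returns (c, errors). The input signal and the noise-free primary signal
    are hypothetical test stimuli.
    """
    rng = random.Random(seed)
    n_taps = len(h)
    c = [0.0] * n_taps          # adaptive coefficients c_i(n)
    x = [0.0] * n_taps          # delay line x_i(n), newest sample first
    errors = []
    for _ in range(n_samples):
        x = [rng.uniform(-1.0, 1.0)] + x[:-1]
        z = sum(hi * xi for hi, xi in zip(h, x))          # primary signal Z(n)
        y = sum(ci * xi for ci, xi in zip(c, x))          # steps 1 and 2: Y(n)
        e = z - y                                         # step 2: E(n) = Z(n) - Y(n)
        c = [ci + mu * e * xi for ci, xi in zip(c, x)]    # step 3: coefficient update
        errors.append(e)
    return c, errors

h_true = [0.5, -0.25, 0.125, 0.0625]                      # hypothetical unknown system
c_final, err = lms_identify(h_true, mu=0.1, n_samples=2000)
```

With a white input of power 1/3 and four taps, the chosen step size is small enough that the coefficients settle on the unknown response; a fixed-point or digit-serial implementation can be compared against this model sample by sample.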
Because the digits arrive most significant first, the adder width grows along the tree: a 3 bit adder is needed in the first stage of the binary tree, a 4 bit adder in the second stage, a 5 bit adder in the third stage, and so on. The sample period depends on the number of stages of the adder tree. If four numbers are added in one stage instead of two, the number of stages of the tree decreases from $\log_2 N$ to $\log_4 N$.

The sequence generated by the binary tree decreases by a factor of 4 at each step. An accumulator (Fig. 4) is used to sum this sequence, generating the final result digit by digit, most significant first. In this accumulator, a redundant adder is used. Since the result will be used as an operand of the multiplication in the next step, the result of the accumulation is represented as

$$Z = \sum_{j} z_j 4^{-j} \tag{22}$$

This means that

$$r = 4, \qquad \rho = 2 \tag{23}$$

From equation (14), with m = 3, we obtain the admissible range of $x_i$ (equation (24), illegible).

The last step is to compute the new coefficients. It also needs N multipliers (Fig. 5). The multiplicand is in 2C and saved in a parallel register. The multiplier is the error from step 2, most significant first. The resulting coefficient is fed back to step 1 for the next sample period. Hence, the result of the multiplication is also represented as in equation (22). The difference with the accumulation used in step 2 is that a hybrid adder is used here.

For the LMS filter with N = 8 coefficients, the pipeline period is indicated in Table 1. The sample period is [illegible] clock cycles.

V. FPGA implementation

Our implementation of the LMS filter is modular. We need two types of modules, one for the convolution computations and one for the updates. Moreover, all communications between modules are serial, which requires only narrow buses. Each module is quite easy to build because the multipliers can be placed regularly. The binary trees exhibit some regularity, which is an attractive characteristic for VLSI implementations.

Fast and simple hardware programmability are the key FPGA features that reduce manufacturing costs and allow the rapid development of custom components. With in-circuit reprogrammability, FPGAs can play an important role in research on algorithm implementations.
To achieve customization flexibility, FPGAs sacrifice area and some processing speed. The parallel realization of the LMS algorithm proposed in this paper has been designed on several Xilinx FPGA chips (XC4000 family). The clock period is 5 ns. To implement the LMS algorithm, two types of FPGA chips are designed. The convolution computation and the adaptor, which share the same coefficients, compose the first module (Fig. 6), and the summator composes a second module. More than one Module 1 chip is used to compute the convolution and the new coefficients. The partial convolution results are summed and the error signal is produced by the second type of chip. The error signal is fed back to the first kind of chip (Fig. 7).

As an example of implementation, 8 multipliers and [illegible] adder can be put on a Xilinx XC4000-family circuit to realize a 4 coefficient Module 1 chip with [illegible] bit input data and [illegible] bit coefficients. This design uses 7 control logic blocks, 39 I/O blocks and 4 ports. The clock speed is [illegible] MHz and, since [illegible] clock cycles are needed for each adaptation, the data rate is [illegible] MHz.

After implementing the algorithm in FPGA, we plan to carry on with the realization of the proposed pipeline LMS filter as an application specific integrated circuit (ASIC).

VI. Conclusion

We have implemented in FPGA an LMS filter using an architecture based on redundant arithmetic. The implementation is both simple and modular. The performance (sampling rate) is adequate for many DSP and digital communication applications.

References

[1] B. Widrow and S. D. Stearns, Adaptive signal processing, Series in signal processing, Prentice-Hall, Englewood Cliffs, NJ, 1985.
[2] S. Haykin, Adaptive filter theory, Prentice-Hall, Englewood Cliffs, NJ, 1987.
[3] M. Lapointe, H. T. Huynh, and P. Fortier, Systematic design of pipelined recursive filters, IEEE Transactions on Computers, vol. 42, no. 4, pp. 413-426, April 1993.
[4] M. Lapointe, P. Fortier, and H. T. Huynh, Fast parallel realization of the LMS algorithm in O(log N) computation time, 15th Biennial Symposium on Communications, Kingston, Canada, June 1990.
[5] N. Weste and K. Eshraghian, Principles of CMOS VLSI design: A systems perspective, Addison-Wesley, Reading, MA, 1985.
[6] A. Avizienis, Signed-digit number representations for fast parallel arithmetic, IRE Trans. Electron. Comput., vol. 10, pp. 389-400, September 1961.

Figure 1. (a) Redundant adder, (b) hybrid adder.

Figure 2. SDB to 2C converter.

Table 1. Pipeline cycle of the LMS algorithm. [Table contents illegible; the rows track the coefficients C_i(n), the products M_i(n) and the error digits E(n) against the clock cycle.]

Figure 3. Multiplier for convolution computation.

Figure 4. Convolution accumulator.

Figure 5. Multiplier for adaptation computation.

Figure 6. Block diagram of Module 1.

Figure 7. Block diagram of the FPGA implementation of the LMS algorithm.