IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-II: EXPRESS BRIEFS

New Systolic Algorithm and Array Architecture for Prime-Length Discrete Sine Transform

Pramod K. Meher, Senior Member, IEEE, and M. N. S. Swamy, Fellow, IEEE

Abstract: Using a simple input regeneration approach and index transformation techniques, a new formulation is presented in this paper for computing an $N$-point prime-length discrete sine transform (DST) through two pairs of $[(N-1)/4]$-point cyclic convolutions, where $(N-1)/4$ is an odd number. The cyclic convolution-based algorithm is used further to obtain a simple, regular and locally connected linear systolic array for concurrent pipelined implementation of the DST. It is shown that the proposed systolic structure involves significantly less area-time complexity than the existing structures.

Index Terms: Discrete sine transform (DST), discrete cosine transform (DCT), systolic array, very-large-scale integration (VLSI).

I. INTRODUCTION

The discrete sine transform (DST) and the discrete cosine transform (DCT) have key functions in signal and image processing systems, not only for their near-optimal transform coding behaviour but also for several other applications, e.g., block filtering, transform-domain adaptive filtering, digital signal interpolation, adaptive beamforming and image resizing [1]-[4]. For transform coding the usual block size is 8, but for most other applications it is quite useful to have the DST and the DCT of prime transform lengths and of composite transform lengths (consisting of two or more relatively prime factors). While there are only limited choices of power-of-two transform lengths, the prime-factor approach provides closely spaced suitable choices of transform lengths [5], [6].
Moreover, the prime-factor approach not only offers scalability in hardware, time and transform size, but also involves significantly less area-time complexity than the direct implementation of long-length transforms. Various algorithms and architectures have therefore been suggested in the literature for efficient computation of prime-length DCT and DST, and for combining them efficiently to compute transforms of composite lengths [5], [6]. Several attempts have been made in recent years to implement the prime-length DST and DCT efficiently in systolic hardware through a cyclic convolutional formulation [7]-[11], owing to its remarkable advantage over other formulations, particularly for efficient input/output and data transfer operations. Apart from this, the convolutional representation of the DST is found to be more suitable for memory- and adder-based systolic realization. Attempts have, therefore, been made to implement the $N$-point DST more efficiently through a pair of $[(N-1)/2]$-point cyclic convolutions [10] or convolution-like computation [9].

Manuscript submitted July 21, 2006; revised October 6, 2006. P. K. Meher is with the School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore (e-mail: aspkmeher@ntu.edu.sg). M. N. S. Swamy is with the Department of Electrical & Computer Engineering, Concordia University, 1455 de Maisonneuve Blvd. West, Montreal, Quebec, Canada H3G 1M8 (e-mail: swamy@ece.concordia.ca). Copyright (c) 2006 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an e-mail to pubs-permissions@ieee.org.
Using a new input regeneration scheme and simple index transformation techniques, we formulate in this paper a low-complexity concurrent algorithm that converts an $N$-point prime-length DST into a pair of exact $[(N-1)/2]$-point cyclic convolutions. Each of those $[(N-1)/2]$-point cyclic convolutions is further reduced to a pair of $[(N-1)/4]$-point cyclic convolutions when $(N-1)/4$ is odd. The proposed convolution-based algorithm is used to derive a simple, regular and area-time efficient linear systolic array for the prime-length DST. The low-complexity convolutional formulation is derived in Section II and illustrated in Section III. The proposed systolic architecture is derived in Section IV. Hardware and time complexities of the proposed design are estimated and compared with those of the existing structures in Section V. Conclusions are presented in Section VI.

II. CONVOLUTIONAL FORMULATION OF THE DST

The DST of a sequence $\{y(n),\ 0 \le n \le N-1\}$ can be defined as

$$X(k) = \sum_{n=0}^{N-1} y(n)\,\sin\left[\frac{\pi(2n+1)k}{2N}\right], \quad 1 \le k \le N. \tag{1}$$

Using the properties of sine and cosine functions, for any positive integer $n$, one can find that

$$\sin\left[\frac{\pi(2n+1)k}{2N}\right] = \left[1 + 2\sum_{i=1}^{n}\cos\frac{\pi k i}{N}\right]\sin\frac{\pi k}{2N}. \tag{2}$$

Substituting (2) in (1), the DST can also be expressed as

$$X(N) = x(0), \tag{3a}$$

and

$$X(k) = [2S(k) + x(0)]\,\sin\frac{\beta k}{2}, \tag{3b}$$

where

$$S(k) = \sum_{n=1}^{N-1} x(n)\cos(\beta k n), \tag{3c}$$

for $k = 1, 2, \ldots, N-1$, $\beta = \pi/N$, and the input sequence $\{x(n)\}$ is generated by successive accumulation given by

$$x(N-1) = y(N-1), \tag{3d}$$

and

$$x(N-n) = y(N-n) + x(N-n+1), \tag{3e}$$

for $n = 2, \ldots, N$.

When $N$ is any odd number, the even- and odd-indexed components of the intermediate result $S(k)$ can be separated out from (3c), and the even- and odd-indexed terms in each of those components can be combined by using the symmetry of cosine functions to have

$$S(2k) = P(k) = \sum_{n=1}^{(N-1)/2} p(n)\cos[2\beta\,\phi(k,n)], \tag{4a}$$

$$S(N-2k) = Q(k) = \sum_{n=1}^{(N-1)/2} q(n)\cos[2\beta\,\phi(k,n)], \tag{4b}$$

where

$$p(n) = x(2n) + x(N-2n), \tag{4c}$$

$$q(n) = x(2n) - x(N-2n), \tag{4d}$$

for $k, n = 1, 2, \ldots, (N-1)/2$, and $\phi(k,n)$ in the argument of the cosine function is given by

$$\phi(k,n) = \begin{cases} (2kn)_N, & \text{if } (2kn)_N \le M, \\ N - (2kn)_N, & \text{if } (2kn)_N > M. \end{cases} \tag{5}$$

The symbol $(\cdot)_N$ in (5) denotes the modulo-$N$ operation, and $M = (N-1)/2$.

When the transform length $N$ is prime, each of the two sequences $\{P(k)\}$ and $\{Q(k)\}$, for $1 \le k \le M$, given by (4) can be converted into an $[(N-1)/2]$-point circular convolution by suitable permutations, achieved by mapping the indices $k$ to $l$ and $n$ to $m$ according to the following equations:

$$n = \begin{cases} (\eta^{-m})_N, & \text{if } (\eta^{-m})_N \le M, \\ N - (\eta^{-m})_N, & \text{otherwise,} \end{cases} \tag{6}$$

and

$$k = \begin{cases} (\eta^{l})_N, & \text{if } (\eta^{l})_N \le M, \\ N - (\eta^{l})_N, & \text{otherwise,} \end{cases} \tag{7}$$

where $\eta$ is the $(N-1)$-th primitive root of unity, such that $\eta^{N-1} \bmod N = 1$ and $\eta^{j} \bmod N \ne 1$ for $0 < j < (N-1)$. Using the mapping given by (6) and (7), each of the sequences $\{P(k)\}$ and $\{Q(k)\}$ of (4) may thus be expressed as an $M$-point cyclic convolution of the form

$$y(k) = \sum_{n=0}^{M-1} h(n)\, r\big((k-n)_M\big). \tag{8}$$

The input sequence $\{r(n)\}$ in (8) corresponds to one of the two sequences $\{p(n)\}$ and $\{q(n)\}$, $\{h(n)\}$ corresponds to the fixed coefficients $\{\cos(2\beta\,\phi(k,n))\}$, and the convolved output sequence corresponds to $\{P(k)\}$ and $\{Q(k)\}$ of (4a) and (4b), respectively.

When $M = 2K$ and $K$ is any odd number, each of the sequences $\{r(n)\}$, $\{h(n)\}$ and $\{y(n)\}$ can be converted into a 2-dimensional array of size $K \times 2$ by mapping the index $n$ to the pair $(n_1, n_2)$ using the Chinese remainder theorem (CRT) according to the relations:

$$n_1 = n \bmod K, \quad 0 \le n_1 < K, \tag{9a}$$

$$n_2 = n \bmod 2, \quad n_2 = 0 \text{ and } 1, \tag{9b}$$

where the inverse mapping from $(n_1, n_2)$ to $n$ is performed by the relation:

$$n = (n_1 s_1 + n_2 s_2) \bmod M, \quad 0 \le n < M, \tag{10}$$

for $s_1 \equiv 1 \bmod K$, $s_2 \equiv 1 \bmod 2$, $s_1 \equiv 0 \bmod 2$ and $s_2 \equiv 0 \bmod K$. Using the mapping of (9), the cyclic convolution of (8) may be converted into a two-dimensional form [12]:

$$y(k_1, k_2) = \sum_{n_1=0}^{K-1}\sum_{n_2=0}^{1} h(n_1, n_2)\, r(k_1 - n_1,\, k_2 - n_2), \tag{11}$$

where the indices $(k_1 - n_1)$ and $(k_2 - n_2)$ of $r(\cdot,\cdot)$ are understood to be taken mod $K$ and mod 2, respectively. The 2-point convolutions in (11) can then be expanded to:

$$y(k_1, 0) = \sum_{n_1=0}^{K-1} \big[h(n_1, 0)\, r(k_1 - n_1, 0) + h(n_1, 1)\, r(k_1 - n_1, 1)\big], \tag{12a}$$

$$y(k_1, 1) = \sum_{n_1=0}^{K-1} \big[h(n_1, 0)\, r(k_1 - n_1, 1) + h(n_1, 1)\, r(k_1 - n_1, 0)\big]. \tag{12b}$$

From (12), one can obtain

$$y(k_1, 0) = d(k_1, 0) + d(k_1, 1), \tag{13a}$$

$$y(k_1, 1) = d(k_1, 0) - d(k_1, 1), \tag{13b}$$

where

$$d(k_1, 0) = \sum_{n_1=0}^{K-1} a(n_1, 0)\, b\big((k_1 - n_1)_K,\, 0\big), \tag{13c}$$

$$d(k_1, 1) = \sum_{n_1=0}^{K-1} a(n_1, 1)\, b\big((k_1 - n_1)_K,\, 1\big), \tag{13d}$$

$$a(n_1, 0) = [h(n_1, 0) + h(n_1, 1)]/2, \tag{13e}$$

$$a(n_1, 1) = [h(n_1, 0) - h(n_1, 1)]/2, \tag{13f}$$

$$b(n_1, 0) = r(n_1, 0) + r(n_1, 1), \tag{13g}$$

$$b(n_1, 1) = r(n_1, 0) - r(n_1, 1), \tag{13h}$$

for $0 \le k_1, n_1 < K$. The $M$-point cyclic convolution of (8) may, therefore, be computed from the pair of $M/2$-point cyclic convolutions of (13c) and (13d). The pair of $[(N-1)/2]$-point cyclic convolutions of (4a) and (4b) for the $N$-point DST may thus be computed from two pairs of $[(N-1)/4]$-point cyclic convolutions. It may be noted here that, as shown in [13], an $N$-point DCT can also be converted to a form similar to that of (3); it can then be converted into two pairs of $[(N-1)/4]$-point cyclic convolutions, as in the case of the DST discussed in this Section.
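As a numerical sanity check of the first part of this Section, the following Python sketch regenerates the input by the successive accumulation of (3d) and (3e), forms $P(k)$ and $Q(k)$ from (4) and (5), and compares the values of $X(k)$ obtained from (3b) with the direct definition (1) for $k = 1, \ldots, N-1$. The function names and the test vector are hypothetical choices made for illustration; they do not come from the paper, and the sketch makes no attempt at efficiency.

```python
import math

def dst_direct(y):
    """DST evaluated directly from definition (1); result keyed by k = 1..N."""
    N = len(y)
    return {k: sum(y[n] * math.sin(math.pi * (2 * n + 1) * k / (2 * N))
                   for n in range(N))
            for k in range(1, N + 1)}

def regenerate_input(y):
    """Successive accumulation of (3d)-(3e): x(N-1) = y(N-1), x(n) = y(n) + x(n+1)."""
    N = len(y)
    x = [0.0] * N
    x[N - 1] = y[N - 1]
    for n in range(N - 2, -1, -1):
        x[n] = y[n] + x[n + 1]
    return x

def dst_via_cosine_sums(y):
    """X(k), k = 1..N-1, from (3b) with S(k) assembled from P(k), Q(k) of (4)."""
    N = len(y)                      # N is assumed to be odd
    M = (N - 1) // 2
    beta = math.pi / N
    x = regenerate_input(y)
    p = {n: x[2 * n] + x[N - 2 * n] for n in range(1, M + 1)}   # (4c)
    q = {n: x[2 * n] - x[N - 2 * n] for n in range(1, M + 1)}   # (4d)

    def phi(k, n):                                               # (5)
        r = (2 * k * n) % N
        return r if r <= M else N - r

    S = {}
    for k in range(1, M + 1):
        S[2 * k] = sum(p[n] * math.cos(2 * beta * phi(k, n))
                       for n in range(1, M + 1))                 # (4a)
        S[N - 2 * k] = sum(q[n] * math.cos(2 * beta * phi(k, n))
                           for n in range(1, M + 1))             # (4b)
    return {k: (2 * S[k] + x[0]) * math.sin(beta * k / 2)        # (3b)
            for k in range(1, N)}

# Illustrative 13-point input (arbitrary values).
y = [0.5, -1.0, 2.0, 3.5, -0.25, 1.0, 0.0, 4.0, -2.0, 1.5, 0.75, -3.0, 2.25]
direct = dst_direct(y)
via_sums = dst_via_cosine_sums(y)
max_err = max(abs(direct[k] - via_sums[k]) for k in range(1, len(y)))
print(max_err)
```

The even-indexed outputs $S(2k)$ and the odd-indexed outputs $S(N-2k)$ together cover every $k$ from 1 to $N-1$, which is why the two half-length sums suffice for (3b).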

III. EXAMPLE OF CONVERSION OF DST INTO CIRCULAR CONVOLUTION FORM

For a simple illustration of the proposed formulation, we show here the conversion of a 13-point DST into two pairs of 3-point cyclic convolutions. For transform length $N = 13$, we can write (4a) and (4b) in a common matrix-vector form:

$$\begin{bmatrix} U(1) \\ U(2) \\ U(3) \\ U(4) \\ U(5) \\ U(6) \end{bmatrix} = \begin{bmatrix} c_2 & c_4 & c_6 & c_5 & c_3 & c_1 \\ c_4 & c_5 & c_1 & c_3 & c_6 & c_2 \\ c_6 & c_1 & c_5 & c_2 & c_4 & c_3 \\ c_5 & c_3 & c_2 & c_6 & c_1 & c_4 \\ c_3 & c_6 & c_4 & c_1 & c_2 & c_5 \\ c_1 & c_2 & c_3 & c_4 & c_5 & c_6 \end{bmatrix} \begin{bmatrix} u(1) \\ u(2) \\ u(3) \\ u(4) \\ u(5) \\ u(6) \end{bmatrix} \tag{14}$$

where $c_i = \cos(2i\beta)$ and $\beta = \pi/13$ for $1 \le i \le 6$. Equation (14) represents (4a) when $U(i) = P(i)$ and $u(i) = p(i)$, while it represents (4b) when $U(i) = Q(i)$ and $u(i) = q(i)$ for $1 \le i \le 6$. To convert (14) into the desired circular convolution, we can find the primitive root of unity $\eta$ to be 2 for $N = 13$, and can map the indices according to (6) and (7) as shown in Table I.

TABLE I
MAPPING OF INDICES k AND n

l and m    0    1    2    3    4    5
n          1    6    3    5    4    2
k          1    2    4    5    3    6

TABLE II
MAPPING OF INDEX n TO (n_1, n_2)

n             0        1        2        3        4        5
(n_1, n_2)    (0, 0)   (1, 1)   (2, 0)   (0, 1)   (1, 0)   (2, 1)

Using the mapping of Table I and the commutative property of cyclic convolution, (14) can be written as a 6-point cyclic convolution of the form:

$$\begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ y(3) \\ y(4) \\ y(5) \end{bmatrix} = \begin{bmatrix} r(0) & r(5) & r(4) & r(3) & r(2) & r(1) \\ r(1) & r(0) & r(5) & r(4) & r(3) & r(2) \\ r(2) & r(1) & r(0) & r(5) & r(4) & r(3) \\ r(3) & r(2) & r(1) & r(0) & r(5) & r(4) \\ r(4) & r(3) & r(2) & r(1) & r(0) & r(5) \\ r(5) & r(4) & r(3) & r(2) & r(1) & r(0) \end{bmatrix} \begin{bmatrix} h(0) \\ h(1) \\ h(2) \\ h(3) \\ h(4) \\ h(5) \end{bmatrix} \tag{15}$$

where

$$[y(0)\ y(1)\ y(2)\ y(3)\ y(4)\ y(5)]^T = [U(1)\ U(2)\ U(4)\ U(5)\ U(3)\ U(6)]^T,$$

$$[r(0)\ r(1)\ r(2)\ r(3)\ r(4)\ r(5)]^T = [u(1)\ u(6)\ u(3)\ u(5)\ u(4)\ u(2)]^T,$$

$$[h(0)\ h(1)\ h(2)\ h(3)\ h(4)\ h(5)]^T = [c_2\ c_4\ c_5\ c_3\ c_6\ c_1]^T.$$

The 6-point sequences $\{y(n)\}$, $\{h(n)\}$ and $\{r(n)\}$ of (15) can, respectively, be mapped into $3 \times 2$ matrices $[y(n_1, n_2)]$, $[h(n_1, n_2)]$ and $[r(n_1, n_2)]$ according to (9), as shown in Table II. The 6-point cyclic convolution of (15) may then be computed according to (13a) and (13b) from a pair of 3-point cyclic convolutions of (16).
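The index permutation of this example can be reproduced with a few lines of Python. The sketch below rebuilds the mappings of Table I from $\eta = 2$ and $N = 13$, forms $\{r(n)\}$ and $\{h(n)\}$, and checks that the 6-point cyclic convolution of (15) gives the same values as the matrix-vector product of (14). The helper names (`fold`, `U_direct`, `U_by_cyclic_convolution`) and the test input are assumptions made for illustration only.

```python
import math

N, ETA = 13, 2                  # transform length and primitive root of Section III
M = (N - 1) // 2
BETA = math.pi / N

def fold(v):
    """Reduce v modulo N and fold the result into 1..M, as in (5)-(7)."""
    v %= N
    return v if v <= M else N - v

# Modular inverse of ETA by Fermat's little theorem (valid because N is prime).
ETA_INV = pow(ETA, N - 2, N)

k_of_l = [fold(pow(ETA, l, N)) for l in range(M)]        # mapping (7)
n_of_m = [fold(pow(ETA_INV, m, N)) for m in range(M)]    # mapping (6)
h_index = [fold(2 * pow(ETA, j, N)) for j in range(M)]   # subscript i of c_i in h(j)

def U_direct(u, k):
    """Row k of the matrix-vector product (14); u is keyed by n = 1..M."""
    return sum(math.cos(2 * BETA * fold(2 * k * n)) * u[n]
               for n in range(1, M + 1))

def U_by_cyclic_convolution(u):
    """Evaluate (15) and undo the output permutation; result keyed by k."""
    r = [u[n] for n in n_of_m]
    h = [math.cos(2 * BETA * i) for i in h_index]
    y = [sum(h[j] * r[(l - j) % M] for j in range(M)) for l in range(M)]
    return {k_of_l[l]: y[l] for l in range(M)}

# Arbitrary test input u(1)..u(6).
u = {n: float(v) for n, v in zip(range(1, M + 1), [3, -1, 4, 1, -5, 9])}
U = U_by_cyclic_convolution(u)
max_err = max(abs(U[k] - U_direct(u, k)) for k in range(1, M + 1))
print(k_of_l, n_of_m, h_index, max_err)
```

The three index lists printed by the sketch correspond to the rows of Table I and to the coefficient order $[c_2\ c_4\ c_5\ c_3\ c_6\ c_1]$ stated after (15).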
$$\begin{bmatrix} d(0,i) \\ d(1,i) \\ d(2,i) \end{bmatrix} = \begin{bmatrix} b(0,i) & b(2,i) & b(1,i) \\ b(1,i) & b(0,i) & b(2,i) \\ b(2,i) & b(1,i) & b(0,i) \end{bmatrix} \begin{bmatrix} a(0,i) \\ a(1,i) \\ a(2,i) \end{bmatrix} \tag{16}$$

for $i = 0$ and 1, where $a(n_1, i)$ and $b(n_1, i)$, for $0 \le n_1 \le 2$ and $i = 0$ and 1, are computed according to (13e), (13f) and (13g), (13h), respectively. A 13-point DST of (4a) and (4b) may thus be obtained from two pairs of 3-point circular convolutions as given by (16).

IV. PROPOSED SYSTOLIC ARRAY ARCHITECTURE

A simple, regular and locally connected linear systolic structure is derived here for the computation of a 6-point cyclic convolution according to (13) by concurrent implementation of a pair of 3-point cyclic convolutions of (16). The proposed convolution structure can be used further for the computation of the 13-point DST according to (3) and (4).

The dependence graphs (DGs) for computation of the pair of 3-point convolutions of (16) are shown in Figs. 1(a) and 1(b), respectively. The function of each node is depicted in Fig. 1(c). The fixed multiplying coefficients $\{a(n_1, 0)\}$ and $\{a(n_1, 1)\}$ for $0 \le n_1 \le 2$ are pre-computed according to (13e) and (13f). $\{b(n_1, 0)\}$ and $\{b(n_1, 1)\}$ are computed by an input adder according to (13g) and (13h), respectively, and made available to the nodes of the DG. Since both of these DGs have identical functions, they can be merged together and projected along the $j$ direction with a schedule along $[1\ 1]^T$ to obtain a linear systolic array consisting of 3 PEs in the $i$ direction. According to the requirement of systolic transformation, the multiplying coefficients available to the nodes of the DG from direction $[0\ 1]^T$ stay in the PEs, the input values available from direction $[1\ 1]^T$ are transferred to the next PE with 2 delays, and the partial result available to each of the nodes from the $[1\ 0]^T$ direction is moved to the next PE in the subsequent cycle. The proposed linear systolic array to realize a pair of 3-point cyclic convolutions is shown in Fig. 2.
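The arithmetic carried out by this array, namely the pair of $K$-point cyclic convolutions of (13c) and (13d) followed by the recombination of (13a) and (13b), can be checked in software. The Python sketch below is a plain functional model, not a cycle-accurate simulation of the systolic array; the function names and test vectors are hypothetical. It splits an $M$-point cyclic convolution with $M = 2K$, $K$ odd, using the index mapping of (9), and compares the result with a direct evaluation of (8).

```python
def cyclic_conv(h, r):
    """Direct M-point cyclic convolution of (8)."""
    M = len(h)
    return [sum(h[n] * r[(k - n) % M] for n in range(M)) for k in range(M)]

def cyclic_conv_split(h, r):
    """M-point cyclic convolution (M = 2K, K odd) from two K-point ones, per (13)."""
    M = len(h)
    K = M // 2
    if M != 2 * K or K % 2 == 0:
        raise ValueError("M must be twice an odd number")

    # Index mapping (9): n -> (n mod K, n mod 2).
    h2 = [[0.0, 0.0] for _ in range(K)]
    r2 = [[0.0, 0.0] for _ in range(K)]
    for n in range(M):
        h2[n % K][n % 2] = h[n]
        r2[n % K][n % 2] = r[n]

    # Pre-computed coefficients (13e), (13f) and input additions (13g), (13h).
    a = [[(h2[n][0] + h2[n][1]) / 2, (h2[n][0] - h2[n][1]) / 2] for n in range(K)]
    b = [[r2[n][0] + r2[n][1], r2[n][0] - r2[n][1]] for n in range(K)]

    # The pair of K-point cyclic convolutions (13c), (13d); cf. (16) for K = 3.
    d = [[sum(a[n][i] * b[(k - n) % K][i] for n in range(K)) for i in (0, 1)]
         for k in range(K)]

    # Recombination (13a), (13b), written back to the 1-D index k.
    y = [0.0] * M
    for k in range(M):
        k1, k2 = k % K, k % 2
        y[k] = d[k1][0] + d[k1][1] if k2 == 0 else d[k1][0] - d[k1][1]
    return y

h = [0.5, -1.25, 2.0, 0.75, -0.5, 1.5]
r = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0]
reference = cyclic_conv(h, r)
split = cyclic_conv_split(h, r)
max_err = max(abs(p - q) for p, q in zip(reference, split))
print(max_err)
```

Each of the two short convolutions needs $K$ multiplications per output instead of $2K$, at the cost of the extra additions in (13g), (13h), (13a) and (13b).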
The input values are fed to the individual PEs through a circularly extended input interface, such that the input values to a PE are staggered by one cycle period with respect to the preceding PE to maintain the data dependency requirement.

Fig. 1. The DGs for computation of $\{d(n_1, 0)\}$ and $\{d(n_1, 1)\}$ for $0 \le n_1 \le 2$, given by the circular convolution form of (13). (a) The DG for $\{d(n_1, 0)\}$. (b) The DG for $\{d(n_1, 1)\}$. (c) Function of each node of the DGs: $Y_{out} = Y_{in}$; $Z_{out} = Z_{in}$; $X_{out} = X_{in} + Z_{in} \cdot Y_{in}$.

The function of each PE of the structure is shown in Fig. 2(b). Each of the PEs performs a pair of multiplications and a pair of additions in each cycle period, where the multiplications in a PE are always performed with a pair of fixed coefficients. This feature of the PEs can be utilized to implement the multiplications in each PE by a pair of look-up-table (LUT) ROMs that store the product values for all possible input values for the given pair of multiplying coefficients of the PE. The structure of the proposed memory-based PEs is shown in Fig. 3. It consists of two dual-port ROMs, each of size $2^{L/2}$ words, where $L$ is the word length. Each dual-port ROM serves as a look-up table for all the possible values of the product. The bits of each of the input words $U_{in}$ and $V_{in}$ are separated into two equal halves of $L/2$ bits each, and the two halves of each of the two input words are fed in parallel to the pair of address ports of a dual-port ROM, as shown in Fig. 3. The ROM outputs corresponding to the more significant halves are left-shifted by $L/2$ bit positions and added to the outputs corresponding to the other halves by a pair of shift-add cells to generate the desired product values. The pair of outputs of the shift-add cells are added to $X1_{in}$ and $X2_{in}$ to generate the pair of outputs of the PE. As shown in Fig. 3, the function of the proposed memory-based PEs can be implemented in three pipelined stages. The duration of a cycle period of the memory-based implementation can be $T = \max(T_{Mem}, T_{AS})$, where $T_{Mem}$ and $T_{AS}$ are the time required for each memory-read operation and the time required for a shift-add operation in the PEs, respectively. The actual duration of the cycle period of the memory-based PE will, however, depend on the word length $L$ and on how the adders and memory elements are implemented in the PEs.
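The look-up-table multiplication described above can be modelled in a few lines of Python. This is a behavioural sketch under simplifying assumptions that are not taken from the paper: the input words are unsigned $L$-bit integers, the fixed coefficients are integers (for example, scaled fixed-point values), each ROM is addressed by an $L/2$-bit half-word and therefore holds $2^{L/2}$ entries, and pipelining is ignored. The names `build_rom`, `lut_multiply` and `pe_step` are hypothetical.

```python
L = 8                # assumed word length of the inputs (must be even)
HALF = L // 2        # width of each half-word used as a ROM address

def build_rom(c):
    """ROM contents for a fixed coefficient c: entry v holds the product c * v."""
    return [c * v for v in range(1 << HALF)]

def lut_multiply(rom, u):
    """Product c * u from two reads of the same ROM and one shift-add."""
    hi = u >> HALF                   # more significant half of the input word
    lo = u & ((1 << HALF) - 1)       # less significant half of the input word
    # Two address ports of a dual-port ROM would serve both reads concurrently.
    return (rom[hi] << HALF) + rom[lo]

def pe_step(rom1, rom2, u_in, v_in, x1_in, x2_in):
    """One PE operation: X1out = X1in + C1*Uin and X2out = X2in + C2*Vin."""
    return x1_in + lut_multiply(rom1, u_in), x2_in + lut_multiply(rom2, v_in)

C1, C2 = 37, -21                     # illustrative fixed coefficients of one PE
rom1, rom2 = build_rom(C1), build_rom(C2)
x1_out, x2_out = pe_step(rom1, rom2, u_in=200, v_in=99, x1_in=5, x2_in=-7)
print(x1_out, x2_out)
```

Splitting the address into two halves is what keeps each table at $2^{L/2}$ entries rather than $2^{L}$; the price is the extra shift-add stage that appears as the second pipeline stage in Fig. 3.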
In a multiply-accumulate implementation, if each PE is designed to have two multipliers and two adders, then the duration of the cycle period would be $T = (T_M + T_A)$, where $T_M$ and $T_A$ are, respectively, the times involved in performing one multiplication and one addition in the PE. The right-most PE of the structure yields its first output 3 cycles after the first input arrives at its left-most PE, and produces its subsequent outputs in every cycle period thereafter. It delivers one pair of convolved output sequences in every 3 cycle periods once the pipeline is filled in the first 3 cycles. For implementation of a pair of $[(N-1)/2]$-point cyclic convolutions, in general, one can have two such linear arrays, each consisting of $(N-1)/4$ PEs. The pair of $[(N-1)/2]$-point cyclic convolutions associated with the computation of an $N$-point DST may, however, be time-multiplexed into the same structure to be computed one after the other.

V. HARDWARE AND TIME COMPLEXITIES

In Section II, it was shown that an $N$-point DST can be computed via two $[(N-1)/2]$-point cyclic convolutions. Moreover, when $(N-1)/4$ is odd, each of the $[(N-1)/2]$-point cyclic convolutions can be computed by a pair of $[(N-1)/4]$-point cyclic convolutions using a linear systolic array of $(N-1)/4$ PEs. The structure would yield its first pair of convolution outputs after a latency of $[2 + (N-1)/4]$ cycles. It would provide a pair of outputs in every cycle after the latency period. A complete set of convolved outputs can be computed in every $(N-1)/2$ cycles, where the duration of a cycle period is $T = (T_M + T_A)$ for the multiplier-based implementation and $T = \max(T_{Mem}, T_{AS})$ for the memory-based implementation.

Fig. 2. The proposed linear systolic array for computing a pair of 3-point cyclic convolutions. (a) The linear array. (b) Function of each PE: $X1_{out} = X1_{in} + C1 \cdot U_{in}$; $X2_{out} = X2_{in} + C2 \cdot V_{in}$. $U_{in} = b(i-1, 0)$, $V_{in} = b(i-1, 1)$, $1 \le i \le 3$, for the $i$-th PE. $C1$ and $C2$ are constants for a given PE. $\Delta$ stands for unit delay.

Fig. 3. Proposed structure of the memory-based processing elements. Stage 1: two dual-port ROMs. Stage 2: two shift-add cells. Stage 3: two adders.

The hardware and time complexities of the proposed systolic realization, along with those of the existing systolic structures for the DST and DCT of [7]-[9] and [14], are listed in Table III. It is found that the proposed structure involves the same average computation time (ACT) as that of [9], but needs nearly half the number of multipliers and adders used in the latter. Although the proposed structure involves nearly the same number of multipliers and adders as that of [7], it has half the ACT of the latter. The structure of [14] involves nearly twice the ACT and double the number of adders of the proposed structure. Even though the structure of [8] has nearly the same number of multipliers as the proposed one, it has 3 times the number of adders. Furthermore, the cycle period of the proposed structure is only $T_M + T_A$ compared with $T_M + 3T_A$ for that of [8], even though the ACT involves the same number of computational cycles for both. Also, the proposed structure requires the same number of I/O channels as that of [8], which is comparable to that of the other structures as well. The hardware and time complexities of the memory-based realization of the proposed systolic structure are listed in Table IV, along with those of the recently proposed memory-based structure for the DST [10]. The proposed structure requires the same number of cycles of ACT as that of [10], but it involves half the ROM size and nearly half the number of adders, a lower cycle period and fewer I/O channels than the latter. Unlike that of [10], the proposed convolutional formulation provides a much simpler input regeneration by successive accumulation according to (3d) and (3e). Besides, no additional computation is needed in the proposed formulation to obtain the last DST component $X(N)$, since it is the same as the regenerated input value $x(0)$, as given by (3a). It may also be noted that, unlike the existing structures of [8] and [9], the proposed structure does not involve any tag-bit control for sign alterations for realization of convolution-like operations.

TABLE III
HARDWARE- AND TIME-COMPLEXITIES OF PROPOSED STRUCTURE AND THE EXISTING MULTIPLIER-BASED SYSTOLIC STRUCTURES

Structure              Multipliers   Adders      Registers   Cycle time (T)   Latency      ACT         I/O channels
Chiper et al. [9]      (N-1)         (N+1)       5(N-1)/2    T_M + T_A        2(N-1)T      (N-1)T/2    3L + N
Chiper [7]             (N-1)/2       (N-1)/2     5(N-1)/2    T_M + T_A        (3N-2)T      (N-1)T      3L + 1
Fang and Wu [14]       (N/2+3)       (N+3)       (11N+4)     T_M + 2T_A       7NT/2        NT          3L + 1
Cheng and Parhi [8]    (N-1)/2       3(N-1)/2    3(N-3) 4    T_M + 3T_A       (5N-1)T/4    (N-1)T/2    4L + 1
Proposed               (N+3)/2       (N+11)/2    2(N-1)      T_M + T_A        2(N-1)T      (N-1)T/2    4L + 1

TABLE IV
HARDWARE- AND TIME-COMPLEXITIES OF PROPOSED STRUCTURE AND THE EXISTING MEMORY-BASED SYSTOLIC STRUCTURE

Structure              Multipliers / Adders / ROM (words)   Cycle time (T)       Latency      ACT         I/O channels
Chiper et al. [10]     (N 1) 2 2N (+1)                      T_Mem + T_A          2(N-1)T      (N-1)T/2    7L + 1
Proposed               (N 1) 2 N (+1)                       max(T_Mem, T_AS)     2(N-1)T      (N-1)T/2    4L + 1

VI. CONCLUSIONS

A new convolutional formulation is derived to compute the prime-length DST of size $N$ from a pair of $[(N-1)/2]$-point cyclic convolutions. Moreover, a reduced-complexity recursive algorithm is proposed for systolization of each $[(N-1)/2]$-point cyclic convolution through a pair of $[(N-1)/4]$-point cyclic convolutions, when $(N-1)/4$ is an odd number.
A simple, regular and locally connected linear array is presented for concurrent pipelined systolic implementation of those cyclic convolutions. It offers a significantly lower area-time complexity than the existing structures. It is interesting to note that the proposed convolutional formulation eliminates the use of control tag-bits, which are otherwise involved in most of the existing structures. The proposed scheme is found to be suitable for efficient memory-based implementation of the DST using ROM-based look-up tables. It would also be useful for DST implementation based on distributed arithmetic and on constant multiplication schemes such as canonical signed digit multiplication and CORDIC multiplication. The proposed convolutional formulation and the systolic structure for the prime-length DST can be used directly for implementation of the DCT as well. In this paper, we have used a new and simple input regeneration scheme, which interestingly yields the last DST component as the first regenerated input value.

REFERENCES

[1] Z. Wang, G. A. Jullien, and W. C. Miller, "Interpolation using the discrete sine transform with increased accuracy," Electronics Letters, vol. 29, no. 22, Oct.
[2] S. A. Martucci and R. Mersereau, "New approaches to block filtering of images using symmetric convolution and the DST or DCT," in Proc. IEEE International Symp. Circuits and Systems (ISCAS 93), May 1993.
[3] F. Beaufays, "Transform-domain adaptive filters: an analytical approach," IEEE Trans. Signal Processing, vol. SP-43, no. 2, Feb.
[4] Y. S. Park and H. W. Park, "Arbitrary-ratio image resizing using fast DCT of composite length for DCT-based transcoder," IEEE Trans. Image Processing, vol. 15, no. 2, Feb. 2006.
[5] A. Tatsaki, C. Dre, T. Stouraitis, and C. Goutis, "Prime-factor DCT algorithms," IEEE Trans. Signal Processing, vol. SP-43, no. 3, Mar.
[6] D. Kar and V. V. B. Rao, "On the prime factor decomposition algorithm for the discrete sine transform," IEEE Trans. Signal Processing, vol. SP-42, no. 11, Nov.
[7] D. F. Chiper, "A new systolic array algorithm for memory-based VLSI array implementation of DCT," in Proc. Second IEEE Symp. on Computers and Communications, July 1997.
[8] C. Cheng and K. K. Parhi, "A novel systolic array structure for DCT," IEEE Trans. Circuits Syst.-II: Express Briefs, vol. 52, no. 7, July 2005.
[9] D. F. Chiper, M. N. S. Swamy, M. O. Ahmad, and T. Stouraitis, "A systolic array architecture for the discrete sine transform," IEEE Trans. Signal Process., vol. 50, no. 9, Sept. 2002.
[10] D. F. Chiper, M. N. S. Swamy, M. O. Ahmad, and T. Stouraitis, "Systolic algorithms and a memory-based design approach for a unified architecture for the computation of DCT/DST/IDCT/IDST," IEEE Trans. Circuits Syst.-I: Regular Papers, vol. 52, no. 6, June 2005.
[11] P. K. Meher, "Systolic designs for DCT using a low-complexity concurrent convolutional formulation," IEEE Trans. Circuits & Systems for Video Technology, vol. 16, no. 9, Sept. 2006.
[12] R. C. Agarwal and J. W. Cooley, "New algorithms for digital convolution," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-25, no. 5, Oct.
[13] J.-I. Guo, C.-M. Liu, and C.-W. Jen, "A new array architecture for prime-length discrete cosine transform," IEEE Trans. Signal Processing, vol. 41, no. 1, Jan.
[14] W. H. Fang and M. L. Wu, "Unified fully-pipelined implementations of one- and two-dimensional real discrete trigonometric transforms," IEICE Trans. Fund. Electron., Commun. Comput. Sci., vol. E82-A, no. 10, Oct.

A High-Speed Realization of Chinese Remainder Theorem

A High-Speed Realization of Chinese Remainder Theorem Proceedings of the 2007 WSEAS Int. Conference on Circuits, Systems, Signal and Telecommunications, Gold Coast, Australia, January 17-19, 2007 97 A High-Speed Realization of Chinese Remainder Theorem Shuangching

More information

An Effective New CRT Based Reverse Converter for a Novel Moduli Set { 2 2n+1 1, 2 2n+1, 2 2n 1 }

An Effective New CRT Based Reverse Converter for a Novel Moduli Set { 2 2n+1 1, 2 2n+1, 2 2n 1 } An Effective New CRT Based Reverse Converter for a Novel Moduli Set +1 1, +1, 1 } Edem Kwedzo Bankas, Kazeem Alagbe Gbolagade Department of Computer Science, Faculty of Mathematical Sciences, University

More information

AN IMPROVED LOW LATENCY SYSTOLIC STRUCTURED GALOIS FIELD MULTIPLIER

AN IMPROVED LOW LATENCY SYSTOLIC STRUCTURED GALOIS FIELD MULTIPLIER Indian Journal of Electronics and Electrical Engineering (IJEEE) Vol.2.No.1 2014pp1-6 available at: www.goniv.com Paper Received :05-03-2014 Paper Published:28-03-2014 Paper Reviewed by: 1. John Arhter

More information

Implementation Of Digital Fir Filter Using Improved Table Look Up Scheme For Residue Number System

Implementation Of Digital Fir Filter Using Improved Table Look Up Scheme For Residue Number System Implementation Of Digital Fir Filter Using Improved Table Look Up Scheme For Residue Number System G.Suresh, G.Indira Devi, P.Pavankumar Abstract The use of the improved table look up Residue Number System

More information

Volume 3, No. 1, January 2012 Journal of Global Research in Computer Science RESEARCH PAPER Available Online at

Volume 3, No. 1, January 2012 Journal of Global Research in Computer Science RESEARCH PAPER Available Online at Volume 3, No 1, January 2012 Journal of Global Research in Computer Science RESEARCH PAPER Available Online at wwwjgrcsinfo A NOVEL HIGH DYNAMIC RANGE 5-MODULUS SET WHIT EFFICIENT REVERSE CONVERTER AND

More information

Subquadratic Computational Complexity Schemes for Extended Binary Field Multiplication Using Optimal Normal Bases

Subquadratic Computational Complexity Schemes for Extended Binary Field Multiplication Using Optimal Normal Bases 1 Subquadratic Computational Complexity Schemes for Extended Binary Field Multiplication Using Optimal Normal Bases H. Fan and M. A. Hasan March 31, 2007 Abstract Based on a recently proposed Toeplitz

More information

KEYWORDS: Multiple Valued Logic (MVL), Residue Number System (RNS), Quinary Logic (Q uin), Quinary Full Adder, QFA, Quinary Half Adder, QHA.

KEYWORDS: Multiple Valued Logic (MVL), Residue Number System (RNS), Quinary Logic (Q uin), Quinary Full Adder, QFA, Quinary Half Adder, QHA. GLOBAL JOURNAL OF ADVANCED ENGINEERING TECHNOLOGIES AND SCIENCES DESIGN OF A QUINARY TO RESIDUE NUMBER SYSTEM CONVERTER USING MULTI-LEVELS OF CONVERSION Hassan Amin Osseily Electrical and Electronics Department,

More information

Low Power, High Speed Parallel Architecture For Cyclic Convolution Based On Fermat Number Transform (FNT)

Low Power, High Speed Parallel Architecture For Cyclic Convolution Based On Fermat Number Transform (FNT) RESEARCH ARTICLE OPEN ACCESS Low Power, High Speed Parallel Architecture For Cyclic Convolution Based On Fermat Number Transform (FNT) T.Jyothsna 1 M.Tech, M.Pradeep 2 M.Tech 1 E.C.E department, shri Vishnu

More information

On Equivalences and Fair Comparisons Among Residue Number Systems with Special Moduli

On Equivalences and Fair Comparisons Among Residue Number Systems with Special Moduli On Equivalences and Fair Comparisons Among Residue Number Systems with Special Moduli Behrooz Parhami Department of Electrical and Computer Engineering University of California Santa Barbara, CA 93106-9560,

More information

HARDWARE IMPLEMENTATION OF FIR/IIR DIGITAL FILTERS USING INTEGRAL STOCHASTIC COMPUTATION. Arash Ardakani, François Leduc-Primeau and Warren J.

HARDWARE IMPLEMENTATION OF FIR/IIR DIGITAL FILTERS USING INTEGRAL STOCHASTIC COMPUTATION. Arash Ardakani, François Leduc-Primeau and Warren J. HARWARE IMPLEMENTATION OF FIR/IIR IGITAL FILTERS USING INTEGRAL STOCHASTIC COMPUTATION Arash Ardakani, François Leduc-Primeau and Warren J. Gross epartment of Electrical and Computer Engineering McGill

More information

A Bit-Plane Decomposition Matrix-Based VLSI Integer Transform Architecture for HEVC

A Bit-Plane Decomposition Matrix-Based VLSI Integer Transform Architecture for HEVC IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 64, NO. 3, MARCH 2017 349 A Bit-Plane Decomposition Matrix-Based VLSI Integer Transform Architecture for HEVC Honggang Qi, Member, IEEE,

More information

Design and Implementation of Efficient Modulo 2 n +1 Adder

Design and Implementation of Efficient Modulo 2 n +1 Adder www..org 18 Design and Implementation of Efficient Modulo 2 n +1 Adder V. Jagadheesh 1, Y. Swetha 2 1,2 Research Scholar(INDIA) Abstract In this brief, we proposed an efficient weighted modulo (2 n +1)

More information

Computing running DCTs and DSTs based on their second-order shift properties

Computing running DCTs and DSTs based on their second-order shift properties University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering Information Sciences 000 Computing running DCTs DSTs based on their second-order shift properties

More information

ISSN (PRINT): , (ONLINE): , VOLUME-5, ISSUE-7,

ISSN (PRINT): , (ONLINE): , VOLUME-5, ISSUE-7, HIGH PERFORMANCE MONTGOMERY MULTIPLICATION USING DADDA TREE ADDITION Thandri Adi Varalakshmi Devi 1, P Subhashini 2 1 PG Scholar, Dept of ECE, Kakinada Institute of Technology, Korangi, AP, India. 2 Assistant

More information

Citation Ieee Signal Processing Letters, 2001, v. 8 n. 6, p

Citation Ieee Signal Processing Letters, 2001, v. 8 n. 6, p Title Multiplierless perfect reconstruction modulated filter banks with sum-of-powers-of-two coefficients Author(s) Chan, SC; Liu, W; Ho, KL Citation Ieee Signal Processing Letters, 2001, v. 8 n. 6, p.

More information
