Sparse Least Mean Square Algorithm for Estimation of Truncated Volterra Kernels


Bijit Kumar Das and Mrityunjoy Chakraborty
Department of Electronics and Electrical Communication Engineering
Indian Institute of Technology, Kharagpur, India
E-mail: bijitbijit@gmail.com, mrityun@ece.iitkgp.ernet.in
Proceedings of the Second APSIPA Annual Summit and Conference, pages 350-354, Biopolis, Singapore, 14-17 December 2010.

Abstract - The Volterra series model, though a popular tool for modeling many practical nonlinear systems, suffers from over-parameterization: too many coefficients need to be identified, requiring very long data records. On the other hand, it is often observed that only a few of the model coefficients are prominent while the others are relatively insignificant. The sparsity inherent in such systems is, however, not exploited by standard estimators, which are based on minimizing an L2 criterion such as the mean square error or the sum of squared errors. This paper draws inspiration from the field of compressive sampling and proposes an adaptive algorithm for estimating sparse Volterra kernels by embedding an L1-norm penalty on the coefficients into the quadratic least mean squares (LMS) cost function. It is shown that the proposed algorithm can achieve a lower steady-state mean square error than a standard LMS-based algorithm for identifying the Volterra model.

Index terms: Volterra series, L1 norm, sparse systems, LMS adaptation

I. INTRODUCTION

Adaptive identification of nonlinear systems has found many applications in areas such as control, communications, biological signal processing, and image processing. For systems with sufficiently smooth nonlinearity, the Volterra series [1] offers a well-appreciated model, expressing the output as a polynomial expansion of the input. The number of terms in the Volterra series, however, increases exponentially with the model order, and as a result a truncated model (up to 2nd order) is often used in practice. The coefficients of such a model are then identified by an appropriate adaptive algorithm, e.g., the LMS algorithm [9]-[10]. In various applications, however, one comes across sparse Volterra models in which several coefficients are zero or negligible. Such a priori knowledge about the sparsity of the system, if embedded in the identification algorithm, can boost its performance. Except for [2], however, sparsity has not so far been exploited in the identification of Volterra systems. In [2], new algorithms, both batch and recursive, have been developed, but the recursive algorithm, being a variant of the recursive least squares (RLS) algorithm [9]-[10], carries the demerit of a large computational burden. This motivates us to develop an LMS-based alternative that exploits the sparse nature of the Volterra system model.

II. PROBLEM FORMULATION AND ALGORITHM

A. LMS Algorithm for the Truncated Volterra Series Model

The development of a gradient-type LMS adaptive algorithm for truncated Volterra series nonlinear models follows the same approach as for linear systems. The truncated p-th order Volterra series expansion is given as [1]

y(n) = h_0 + \sum_{m_1=0}^{N-1} h_1(m_1) x(n-m_1) + \sum_{m_1=0}^{N-1} \sum_{m_2=0}^{N-1} h_2(m_1, m_2) x(n-m_1) x(n-m_2) + ... + \sum_{m_1=0}^{N-1} ... \sum_{m_p=0}^{N-1} h_p(m_1, m_2, ..., m_p) x(n-m_1) x(n-m_2) ... x(n-m_p).    (1)
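To make (1) concrete for the second-order case used in the rest of the paper, the following short sketch (our own illustration in Python/NumPy, not the authors' code; the function name volterra_output and the triangular storage of h2 are assumptions) evaluates y(n) directly from given kernels with h_0 = 0.

```python
import numpy as np

def volterra_output(x, h1, h2):
    """Evaluate the truncated 2nd-order expansion (1) with h0 = 0.

    x  : input samples, shape (L,)
    h1 : linear kernel h1(m1), shape (N,)
    h2 : quadratic kernel h2(m1, m2), shape (N, N); only the m2 >= m1
         entries are used (symmetric/triangular convention).
    """
    N = len(h1)
    xpad = np.concatenate([np.zeros(N - 1), np.asarray(x, dtype=float)])
    y = np.zeros(len(x))
    for n in range(len(x)):
        win = xpad[n:n + N][::-1]              # [x(n), x(n-1), ..., x(n-N+1)]
        y[n] = h1 @ win                         # linear term of (1)
        for m1 in range(N):
            for m2 in range(m1, N):
                y[n] += h2[m1, m2] * win[m1] * win[m2]   # quadratic term of (1)
    return y
```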
Assuming h_0 = 0 and p = 2, the weight vector of the adaptive filter at index n is

H(n) = [h_1(0; n), h_1(1; n), ..., h_1(N-1; n), h_2(0, 0; n), h_2(0, 1; n), ..., h_2(0, N-1; n), h_2(1, 1; n), ..., h_2(N-1, N-1; n)]^T.    (2)

Similarly, the input vector at index n is

X(n) = [x(n), x(n-1), ..., x(n-N+1), x^2(n), x(n) x(n-1), ..., x(n) x(n-N+1), x^2(n-1), ..., x^2(n-N+1)]^T.    (3)

The linear and quadratic coefficients are updated separately by minimizing the instantaneous squared error

J(n) = e^2(n),    (4)

where

e(n) = d(n) - \hat{d}(n)    (5)

and \hat{d}(n) is the estimate of d(n). This results in the following update equations:

h_1(m_1; n+1) = h_1(m_1; n) - (\mu/2) \partial e^2(n) / \partial h_1(m_1; n) = h_1(m_1; n) + \mu e(n) x(n-m_1),    (6)
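For p = 2, the per-coefficient recursion (6), together with its quadratic counterpart derived next, is equivalent to a single vector update on the stacked quantities of (2)-(3). A minimal sketch follows (illustrative only; the helper names volterra_regressor and lms_step are ours).

```python
import numpy as np

def volterra_regressor(xbuf):
    """Build X(n) of (3): the N delayed samples followed by all products
    x(n-m1) x(n-m2) with m2 >= m1 (symmetric-kernel convention)."""
    N = len(xbuf)
    quad = [xbuf[m1] * xbuf[m2] for m1 in range(N) for m2 in range(m1, N)]
    return np.concatenate([xbuf, quad])

def lms_step(H, xbuf, d, mu):
    """One stacked LMS iteration: e(n) = d(n) - H^T X(n); H <- H + mu e(n) X(n)."""
    X = volterra_regressor(xbuf)
    e = d - H @ X
    return H + mu * e * X, e
```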

and

h_2(m_1, m_2; n+1) = h_2(m_1, m_2; n) - (\mu/2) \partial e^2(n) / \partial h_2(m_1, m_2; n) = h_2(m_1, m_2; n) + \mu e(n) x(n-m_1) x(n-m_2),    (7)

where \mu is the so-called step size, used to control the speed of convergence and to ensure stability of the filter.

Fig. 1. Second-order Volterra series model with N = 3.

Using the weight-vector notation H(n), the two update equations can be combined into a single coefficient update:

e(n) = d(n) - H^T(n) X(n),    (8)

H(n+1) = H(n) + \mu X(n) e(n),    (9)

where \mu is chosen such that

0 < \mu < 2 / \lambda_{max},    (10)

with \lambda_{max} denoting the maximum eigenvalue of the autocorrelation matrix of the input vector X(n). For nonlinear Volterra filters, the eigenvalue spread of this autocorrelation matrix is quite large, which leads to slow convergence. Note that the symmetry of the quadratic kernel reduces the length of the coefficient vector by roughly half.

B. Sparse Nature of Volterra Kernels

In many applications the associated Volterra kernels are sparse, meaning that many of the entries of H(n) are zero. Consider, for example, the Linear-Nonlinear-Linear (LNL) model employed in applications such as modeling the effects of nonlinear amplifiers in OFDM, the satellite communication channel, or the transfer function of loudspeakers and headphones. The LNL model consists of a linear filter h_a(k), k = 0, 1, ..., L_a - 1, in cascade with a memoryless nonlinearity f(x), followed by a second linear filter h_b(k), k = 0, 1, ..., L_b - 1. The overall memory is thus L = L_a + L_b - 1. If the nonlinear function is analytic on an open set (a, b), it admits a Taylor series expansion f(x) = \sum_{p=0}^{\infty} c_p x^p, x \in (a, b). It can then be shown that the p-th order Volterra kernel is given by [1]

h_p(k_1, k_2, ..., k_p) = c_p \sum_{k=0}^{L_b - 1} h_b(k) h_a(k_1 - k) ... h_a(k_p - k).    (11)

In (11) there exist p-tuples (k_1, k_2, ..., k_p) for which there is no k \in {0, ..., L_b - 1} such that (k_i - k) \in {0, ..., L_a - 1} for all i = 1, ..., p. For these p-tuples the Volterra kernel is zero. Further, if the second filter in the LNL model is dropped, one obtains the so-called Wiener model, whose p-th order Volterra kernel is

h_p(k_1, ..., k_p) = c_p h_a(k_1) ... h_a(k_p).    (12)

Due to the separability of the kernel in (12), if the impulse response h_a(k) is itself sparse, the Volterra kernel becomes even sparser. Apart from such nonlinear systems with special structure, it has been observed in many applications that only a few kernel coefficients contribute to the output [3]. Furthermore, sparsity of the Volterra representation can also arise when the degree of the nonlinearity and the system memory are not known a priori; in that case kernel estimation must be performed jointly with model-order selection. Based on these considerations, exploiting the sparsity present in many Volterra representations is well motivated.

C. A Sparsity-Aware Variant of LMS for Volterra Kernel Estimation

In the proposed method, L1-norm regularization is employed to exploit the a priori information that the Volterra model is over-parameterized and sparse. Combining the L1-norm penalty of the coefficient vector with the instantaneous squared error of (4), a new cost function J_1(n) is defined as

J_1(n) = e^2(n) + \gamma ||H(n)||_1,    (13)

where ||.||_1 denotes the L1 norm of its argument. Using gradient-descent updating, the new filter update is obtained as

H(n+1) = H(n) - \mu \partial J_1(n) / \partial H(n) = H(n) + \mu X(n) e(n) - \rho sign(H(n)),    (14)

where \rho = \mu \gamma and sign(.) is the component-wise sign function defined as

sign(x) = x / |x| for x \neq 0, and sign(x) = 0 for x = 0.    (15)
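A one-iteration sketch of (14) (our own code, with the helper name sparse_lms_step assumed; X is the stacked regressor of (3)) shows how small the change from the standard LMS step is.

```python
import numpy as np

def sparse_lms_step(H, X, d, mu, rho):
    """Zero-attracting update (14): H <- H + mu e(n) X(n) - rho sign(H(n))."""
    e = d - H @ X                                  # e(n) of (5) with d_hat = H^T X
    return H + mu * e * X - rho * np.sign(H), e    # np.sign(0) = 0, matching (15)
```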

Compared with (9), the update (14) contains the additional term -\rho sign(H(n)), which always attracts the tap coefficients towards zero; in other words, it exploits the sparse nature of the system model. This update equation is an extension of the ZA-LMS algorithm for linear sparse systems [4] to nonlinear, over-parameterized Volterra kernels. Following steps analogous to [4], the mean coefficient vector E[H(n)] can be shown to converge as

E[H(\infty)] = H_{opt} - (\rho/\mu) R^{-1} E[sign(H(\infty))],    (16)

provided \mu satisfies (10). Similarly, the steady-state excess mean square error is given by

P_{ex}(\infty) = \frac{\eta}{2 - \eta} P_0 + \frac{\alpha_1}{(2 - \eta)\mu} \rho \left( \rho - \frac{2\alpha_2}{\alpha_1} \right),    (17)

where

\alpha_1 = E[sign(H(\infty))^T (I - \mu R)^{-1} sign(H(\infty))],    (18)

with I denoting the identity matrix, R the autocorrelation matrix of the input vector X(n), P_0 the minimum mean square error, \eta = Tr(\mu R (I - \mu R)^{-1}), and

\alpha_2 = E[ ||H(\infty)||_1 ] - ||H_{opt}||_1.    (19)

[Derivations of (16)-(19) are skipped in this paper and will be provided in the revised version of the manuscript.] For highly sparse systems, if \rho is properly selected between 0 and 2\alpha_2/\alpha_1, a lower MSE than that obtainable with the standard LMS algorithm is observed.

III. SIMULATION STUDIES

A. Linear-Nonlinear-Linear (LNL) Model

Fig. 2. General nonlinear model (LNL).

The proposed algorithm was simulated in MATLAB. First, an LNL model was constructed as shown in Fig. 2, consisting of a linear FIR filter with impulse response h(n) = [-0.9, 0, 0.87, 0, -0.3, 0.2, 0, 0]^T, in cascade with the memoryless nonlinearity f(x) = 0.4x^2 + 0.5x, which is followed by the same linear filter h(n). This system is exactly described by a Volterra expansion with N = 15 and p = 2, leading to a total of 136 kernel coefficients stored in the vector H, of which only a few are nonzero. The system input was a zero-mean, unit-variance white Gaussian process (i.e., N(0, 1)), while the output was corrupted by additive white Gaussian noise of zero mean and variance 0.001 (standard deviation 0.0316), giving a signal-to-noise ratio (SNR) of 30 dB. Fig. 3 shows the learning curves, obtained by plotting the observed mean square error (MSE), averaged over 3000 experiments, against the iteration index n for two cases: (i) the standard LMS algorithm of (9) with \mu = 0.0002 (the blue curve), and (ii) the proposed sparse LMS algorithm of (14) with \mu = 0.0002 and \rho = 0.000003 (the red curve). While the convergence rates are almost identical, as expected since the same value of \mu is used in both cases, the proposed sparse LMS algorithm clearly attains a lower steady-state mean square error than the standard LMS.
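The LNL experiment can be reproduced approximately with the skeleton below (Python/NumPy rather than the authors' MATLAB; every identifier, the seed, and the data-generation details are our assumptions). Dropping the second convolution with hb and replacing ha by the 15-tap filter of the Wiener-model experiment described next gives that second case; the learning curves in the paper are additionally averaged over 3000 independent runs.

```python
import numpy as np

rng = np.random.default_rng(0)
ha = np.array([-0.9, 0.0, 0.87, 0.0, -0.3, 0.2, 0.0, 0.0])   # first linear filter
hb = ha.copy()                                                 # second linear filter (identical)
f = lambda v: 0.4 * v**2 + 0.5 * v                             # memoryless nonlinearity

N, mu, rho, L = 15, 0.0002, 3e-6, 14000
x = rng.standard_normal(L)                                     # N(0, 1) input
u = np.convolve(x, ha)[:L]
d = np.convolve(f(u), hb)[:L] + np.sqrt(0.001) * rng.standard_normal(L)   # 30 dB SNR

def regressor(xbuf):
    """X(n) of (3): delayed samples plus products x(n-m1) x(n-m2), m2 >= m1."""
    quad = [xbuf[i] * xbuf[j] for i in range(N) for j in range(i, N)]
    return np.concatenate([xbuf, quad])

M = N + N * (N + 1) // 2                 # 15 + 120 = 135 adaptive coefficients (h0 excluded)
H_lms, H_za = np.zeros(M), np.zeros(M)
xpad = np.concatenate([np.zeros(N - 1), x])
mse_lms, mse_za = np.zeros(L), np.zeros(L)
for n in range(L):
    X = regressor(xpad[n:n + N][::-1])
    e1 = d[n] - H_lms @ X
    H_lms += mu * e1 * X                                  # standard LMS, (9)
    e2 = d[n] - H_za @ X
    H_za += mu * e2 * X - rho * np.sign(H_za)             # sparse (zero-attracting) LMS, (14)
    mse_lms[n], mse_za[n] = e1**2, e2**2
```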
B. Wiener Model

Fig. 4. The Wiener nonlinear model.

Next, a Wiener model was considered, i.e., a cascade of a linear filter and a memoryless nonlinearity, as shown in Fig. 4. For this simulation, the linear filter had the impulse response h(n) = [-0.9, 0, 0.87, 0, -0.3, 0.2, 0, 0, 0, 0, 0, 0, 0.514, -0.95, -0.12]^T and the memoryless nonlinearity was f(x) = 0.4x^2 + 0.5x. This system too is exactly described by a Volterra expansion with N = 15 and p = 2, leading to a total of 136 kernel coefficients, of which only a few are nonzero. As before, the system input was a zero-mean, unit-variance white Gaussian process and the output noise was zero-mean white Gaussian with variance 0.001, resulting in an SNR of 30 dB. The corresponding learning curves, obtained by averaging the MSE over 3000 experiments, are shown in Fig. 5 for the standard LMS (the blue curve) with \mu = 0.0007 and the proposed sparse LMS (the red curve) with \mu = 0.0007 and \rho = 0.000003. Under the same convergence-rate condition, the proposed algorithm again exhibits a considerably lower steady-state mean square error than the standard LMS algorithm.

IV. CONCLUSIONS

An algorithm has been presented for adaptive identification of nonlinear systems described by sparse, truncated Volterra kernels. The algorithm adds an L1-norm penalty on the filter coefficients to the instantaneous squared error and derives an LMS-like update that forces the insignificant coefficients to converge to zero faster. Simulation results demonstrating the superiority of the proposed method over the standard LMS are provided.

Fig. 3. MSE versus iteration index n for the general nonlinear (LNL) model: sparsity-aware LMS versus standard LMS.

Fig. 5. MSE versus iteration index n for the nonlinear Wiener model: sparsity-aware LMS versus standard LMS.

REFERENCES

[1] V. J. Mathews and G. L. Sicuranza, Polynomial Signal Processing, John Wiley and Sons, 2000.
[2] V. Kekatos, D. Angelosante, and G. B. Giannakis, "Sparsity-aware estimation of nonlinear Volterra kernels," in Proc. 3rd IEEE Int. Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), Aruba, Dutch Antilles, 2009.
[3] S. Benedetto and E. Biglieri, "Nonlinear equalization of digital satellite channels," IEEE J. Select. Areas Commun., no. 1, pp. 57-62, Jan. 1983.
[4] Y. Chen, Y. Gu, and A. O. Hero, "Sparse LMS for system identification," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Taipei, Taiwan, Apr. 2009.
[5] T. Ogunfunmi, Adaptive Nonlinear System Identification: The Volterra and Wiener Model Approaches, Springer, 2007.
[6] R. Tibshirani, "Regression shrinkage and selection via the lasso," J. Royal Statist. Soc. B, vol. 58, pp. 267-288, 1996.
[7] E. Candes, "Compressive sampling," in Proc. Int. Congress of Mathematicians, vol. 3, pp. 1433-1452, 2006.
[8] R. Baraniuk, "Compressive sensing," IEEE Signal Processing Magazine, vol. 25, pp. 21-30, March 2007.
[9] S. Haykin, Adaptive Filter Theory, 3rd ed., Prentice Hall.
[10] B. Farhang-Boroujeny, Adaptive Filters, John Wiley and Sons.
[11] D. G. Manolakis, V. K. Ingle, and S. M. Kogon, Statistical and Adaptive Signal Processing, McGraw-Hill.
[12] D. Angelosante, J. A. Bazerque, and G. B. Giannakis, "Online adaptive estimation of sparse signals: where RLS meets the l1-norm," IEEE Transactions on Signal Processing (to appear).
[13] A. H. Sayed, Fundamentals of Adaptive Filtering, John Wiley and Sons, 2003.