
TWO-LAYER LINEAR STRUCTURES FOR FAST ADAPTIVE FILTERING

a dissertation submitted to the department of electrical engineering and the committee on graduate studies of stanford university in partial fulfillment of the requirements for the degree of doctor of philosophy

By Francoise Beaufays
June 1995

© Copyright 1995 by Francoise Beaufays. All Rights Reserved.

I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and in quality, as a dissertation for the degree of Doctor of Philosophy.

Bernard Widrow (Principal Advisor)

I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and in quality, as a dissertation for the degree of Doctor of Philosophy.

Thomas Kailath

I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and in quality, as a dissertation for the degree of Doctor of Philosophy.

Umran Inan

Approved for the University Committee on Graduate Studies:

Abstract

The least mean squares (LMS) algorithm is simple, robust, and is one of the most widely used algorithms for adaptive filtering. Unfortunately, it is highly sensitive to the conditioning of its input autocorrelation matrix: the higher the input eigenvalue spread, the slower the convergence of the adaptive weights. This problem can be overcome by preprocessing the inputs to the filter with a fixed, data-independent transformation that, at least partially, decorrelates the inputs. Typical transformations are the discrete Fourier transform (DFT) and the discrete cosine transform (DCT). The adaptive filter is then adapted using LMS with normalized learning rates. The resulting algorithms are called DFT-LMS and DCT-LMS.

We first give a brief intuitive explanation of the algorithms. We then analyze the performance of DFT/DCT-LMS for first-order Markov inputs. In particular, we show that for Markov-1 inputs of correlation ρ ∈ [0, 1], the eigenvalue spread after DFT and amplitude normalization tends to (1 + ρ)/(1 − ρ) as the size of the filter gets large, while after DCT and amplitude normalization it reduces to (1 + ρ). For comparison, the eigenvalue spread before transformation is asymptotically equal to (1 + ρ)^2/(1 − ρ)^2.

We next show that the DFT/DCT preprocessing stage can advantageously be implemented using the LMS spectrum analyzer, an adaptive filter originally proposed as an alternative way of computing the DFT of a time series. We show that this circuit is extremely robust to the propagation of round-off errors due to finite precision effects. Analytical results and computer simulations are given to support this point. The LMS spectrum analyzer concept is then extended to the cosine transform, and two alternative circuits are proposed to implement the DCT adaptively.

The overall structure composed of the preprocessing and filtering stages forms a fully adaptive two-layer linear filter, which achieves better speed performance than pure LMS while retaining its low computational cost and its extreme robustness.

Acknowledgements

I wish to express my deepest gratitude to my principal advisor, Prof. Bernard Widrow, for his support, his encouragement, and his advice throughout my studies at Stanford. He initiated me into the world of research, and I will always remember his teaching. Even more than for his academic help, I thank Prof. Widrow for his care and understanding. I gratefully acknowledge the professors on my reading and orals committees, Prof. Thomas Kailath, Umran Inan, Anoop Gupta, and Dwight Nishimura, for their time and valuable advice. I also wish to thank Prof. Amir Dembo and Istvan Kollar for their helpful suggestions. Special thanks go to Prof. Stephen Boyd for the great idea of introducing a real espresso machine in ISL! I thank all my friends from the "zoo" - past and present - Michel Bilello, Takeshi Doi (alias Keish), Boyd Fowler, Dana How, Jack Kouloharis, Michael Lehr (best known as Mister Mike), Ming-Chang Liu, Derrick Nguyen, Steven Piche, Gregory Plett, Edward Plumer (Edouard for his fellow French speakers), Raymond Shen, Maryhelen Stevenson, Linda Tomassini, and Eric Wan, for all the instructive discussions and all the fun we had together. I heartily thank Joice DeBolt for her help with everything, for her constant good humour, and for her kindness. I also wish to thank all the folks from the Italian department who added so much to my stay at Stanford, and especially Dina Viggiano, who from a professor became a great friend. Finally, I would like to thank my parents Oscar and Denise for making me understand at a young age how important and fun learning is. Last but not least, I thank my friend Luca for distracting me from my work in such a lovely way.

I acknowledge the financial help of the Belgian American Educational Foundation whose fellowship supported my first year at Stanford, and of the Zonta Foundation whose two fellowships helped me later on. I also acknowledge the Electric Power Research Institute and its project manager, John Maulbetsch, for their sponsorship during most of my PhD, and my current employer, SRI International, for facilitating the end of my dissertation.

Contents

Abstract
Acknowledgements

1 Introduction
  1.1 Author's Contributions

2 Adaptation Algorithms for Linear Filtering
  2.1 Introduction to Adaptive Filters
  2.2 The LMS Algorithm
    2.2.1 Derivation of the LMS Algorithm
    2.2.2 Properties of the LMS Algorithm
    2.2.3 The Complex LMS Algorithm
    2.2.4 The Block-LMS Algorithm
  2.3 The RLS Algorithm
    2.3.1 Derivation of the RLS Algorithm
    2.3.2 Properties of the RLS Algorithm
  2.4 Transform-Domain LMS Algorithms
    2.4.1 Transform-Domain Block-LMS Algorithms
    2.4.2 Transform-Domain Non-block LMS Algorithms

3 Transform-Domain Algorithms
  3.1 General Description of DFT-LMS and DCT-LMS
  3.2 Intuitive Justifications of DFT/DCT-LMS
    3.2.1 Filtering Approach
    3.2.2 Geometrical Approach
  3.3 Towards an Analytical Study of DFT/DCT-LMS

4 Eigenvalue Spread Computation
  4.1 Introduction
  4.2 Eigenvalues and Eigenvalue Spread with Markov-1 Inputs
  4.3 Eigenvalue Distribution of DFT-LMS with Markov-1 Inputs
  4.4 Eigenvalue Distribution of DCT-LMS with Markov-1 Inputs
  4.5 Conclusion

5 Simulations
  5.1 An Adaptive Modeling Task with Markov-1 Inputs
  5.2 Adaptive Filters with Other Low-Pass Inputs
  5.3 Band-Pass Input Signals
  5.4 Conclusion

6 Implementation of the Sliding-DFT
  6.1 Introduction
  6.2 Non-adaptive Implementations of the Sliding-DFT
    6.2.1 The Straightforward Non-adaptive Implementation
    6.2.2 Shynk's Implementation
  6.3 The LMS Spectrum Analyzer
  6.4 Propagation of Limited-Precision Errors in the Sliding-DFT
  6.5 Behavior of the Sliding-DFT in Floating Point Arithmetic
  6.6 Conclusion

7 Implementation of the Sliding-DCT
  7.1 Introduction
  7.2 Derivation of a Real-Valued LMS Cosine-Spectrum Analyzer
  7.3 Derivation of a Complex LMS Cosine-Spectrum Analyzer
  7.4 Conclusion

8 Conclusions and Future Work
  8.1 Further Work

A DFT-LMS with Markov-1 Inputs
  A.1 Toeplitz Nature of D, and its Analytical Form
  A.2 Asymptotic Equivalence D̃ ~ D
  A.3 Asymptotic Equivalence X̃ ~ R^{-1} D̃

B DCT-LMS with Markov-1 Inputs
  B.1 Analytical Expression of Y ≜ R X̃, and Asymptotic Equivalence Ỹ ~ Y
  B.2 Asymptotic Equivalence C Ỹ C^T ~ diag(b)

C Ongoing Deadbeat Spectral Observers
  C.1 Introduction
    C.1.1 The Spectral Observer
  C.2 Relationship with the LMS Spectrum Analyzer

D Floating Point Representation of Numbers
  D.1 Representation of Numbers in Computers
  D.2 IEEE Single Precision Standard for Floating Point Arithmetic
  D.3 A Simple Procedure for Simulating Low Precision Processors

Bibliography

List of Tables

3.1 Summary of the DFT-LMS and DCT-LMS algorithms (u_k* denotes the complex conjugate of u_k, and u_k* = u_k is real when the data preprocessing is performed by the DCT).
3.2 Summary of the DFT-LMS and DCT-LMS algorithms: amplitude-normalization of the inputs to the LMS filter.
5.1 Eigenvalue spreads for a band-pass signal of increasing bandwidth. N is the size of the adaptive filter, R is the autocorrelation matrix as seen by LMS (i.e. without preprocessing), S_C is the autocorrelation matrix after DCT and amplitude normalization, and S_F is the autocorrelation matrix after DFT and amplitude normalization.
7.1 Schematic description of the DCT spectrum analyzer.

List of Figures

2.1 Linear adaptive filter.
2.2 Linear adaptive filter with tap-delayed inputs.
2.3 Error surface for a 2-weight adaptive filter.
3.1 DFT-LMS and DCT-LMS block diagram.
3.2 DFT-LMS and DCT-LMS block diagram: amplitude-normalization of the inputs to the LMS filter.
3.3 Magnitude of a sample transfer function for a DFT: |H_5(ω)|^2.
3.4 Magnitude of a sample transfer function for a DCT: |H_1(ω)|^2.
3.5 MSE hyperellipsoid (2-D section) (a) before transformation, (b) after DCT, (c) after amplitude normalization.
3.6 Error function ξ(x|x=x_o, y, z) for a sinusoidal input without and with additive white noise (upper and lower plots, respectively).
4.1 Eigenvalue spread of S vs. ρ (DFT-LMS).
4.2 3-D plots of the main matrices involved in the DFT-LMS eigenvalue spread derivation.
4.3 Eigenvalue spread of S vs. ρ (DCT-LMS).
4.4 3-D plots of the main matrices involved in the DCT-LMS eigenvalue spread derivation.
4.5 Similarity between the DCT basis functions and the eigenvectors of a Markov-1 autocorrelation matrix (N = 16, ρ = 0.95).
5.1 Block diagram of an adaptive modeling system.
5.2 Impulse responses of the dynamic system to be modelled (IIR filter), of the LMS adaptive filter, and of the DCT-LMS adaptive filter.
5.3 Comparison between the LMS and the DCT-LMS learning curves for the adaptive modeling application.
5.4 Eigenvalues of the 18x18 autocorrelation matrix of a Markov-1 signal (ρ = 0.9) before (points marked 'o') and after (points marked 'x') preprocessing by the DCT and amplitude normalization.
5.5 Real and imaginary parts of the autocorrelation matrix of a Markov-2 signal (ρ_1 = 0.8, ρ_2 = 0.9) after DFT and amplitude normalization.
5.6 Autocorrelation matrix of a Markov-2 signal (ρ_1 = 0.8, ρ_2 = 0.9) after DCT and amplitude normalization.
5.7 Diagonal of the matrix B = C R C^T (ρ_1 = 0.95, ρ_2 = 0.99).
6.1 The sliding-DFT.
6.2 Comparison between the exact and the modified sliding-DFTs.
6.3 The LMS spectrum analyzer.
6.4 LMS spectrum analyzer vs. non-adaptive sliding-DFT with a random perturbation hitting the system at time k = 1: (a) the sum of the squares of the DFT components plotted versus time, (b) the sum of the squared errors in the DFT components plotted versus time.
6.5 Recursive implementations of the DFT with limited precision.
7.1 Block diagram of the LMS cosine-spectrum analyzer (N = 6).
7.2 Signals used as desired outputs to the LMS filters involved in the complex LMS cosine-spectrum analyzer.
7.3 Block diagram of the complex LMS cosine-spectrum analyzer (N = 6).
8.1 Block diagram of DFT-LMS: a two-layer linear fully adaptive structure.
D.1 IEEE standard for single precision representation of real numbers in floating point arithmetic.

Chapter 1

Introduction

The first steps in the field of adaptive filtering can be traced back to the 1930's - 1940's with Wiener's early work on linear estimation of stochastic processes and the formulation of the famous Wiener-Hopf equations (see e.g. [23, 3]). These equations allow the determination of the linear filter that best maps, in a least squares sense, an input signal into some target output. Depending on the nature of the target (past, current, or future value of a given signal), the task that the filter performs is referred to as smoothing, estimation, or prediction. The Wiener-Hopf equations can handle all three cases. They can be phrased to solve continuous-time problems as well as discrete-time problems. In all cases, the impulse response of the filter that best maps the inputs into the target outputs is a function of the autocorrelation of the inputs and of the cross-correlation between the inputs and the target outputs. Writing and solving the Wiener-Hopf equations thus necessitates the knowledge of the second order statistics (correlation functions) of the input and target output signals. In most practical applications, these statistics are not known beforehand and they need to be estimated from the data samples that are presented to the system. For example, a batch of inputs and target outputs can be observed, the autocorrelation and cross-correlation functions can be estimated from these data, and the impulse response of the optimum filter can be calculated. The filter is then ready for use. The so-called adaptive filters differ from the filters we just described in that their

impulse response (a set of coefficients for discrete-time filters) is adjusted iteratively as data flow through the filter instead of being determined once and for all in a preliminary design phase. This second method has the advantage that the filter parameters can be continuously adjusted to reflect changes that may occur in the statistics of the intervening signals. The algorithms used to adjust the parameters of these adaptive filters are referred to as adaptation algorithms. One such algorithm is the least mean squares or LMS algorithm, which was invented by Widrow and Hoff in the late 1950's - early 1960's [65, 66]. The principle underlying LMS is extremely simple: it consists of defining an error function as the average square difference between the filter output and its target, and of iteratively minimizing this error function over the filter coefficient space, using a simple gradient-based optimization method. Because of its extreme simplicity, this algorithm has an elegance and robustness that are unsurpassed by other adaptation algorithms. Its main disadvantage is its very slow convergence under certain input conditions. As we will show in this thesis, some modifications can be brought to LMS to ameliorate its convergence properties. Another famous adaptation algorithm is the recursive least squares or RLS algorithm (see e.g. [18, 23]). In the RLS algorithm, the filter coefficients are made equal at each iteration to the best approximation of the Wiener solution that can be calculated based on all the data the system has seen so far. In this sense, it is an exact least squares algorithm, as opposed to LMS, whose coefficients don't follow the Wiener solution so closely during adaptation. Because of this property, the RLS algorithm generally displays better convergence performance than LMS. However, it suffers from different problems, such as a lack of robustness for certain input conditions and a higher computational cost. In spite of its slow convergence, LMS is a very popular algorithm, mostly because of its simplicity and robustness. For many decades, it has been a major component in a large number of engineering systems such as, for example, automatic controllers for linear systems (adaptive modeling filters, adaptive inverse controllers, ... [7]), various telephony and communication devices (adaptive interference and echo cancellers, adaptive equalizers, adaptive pulse code modulators, ... [36, 55, 47]), signal detection

circuits (adaptive line enhancer [56]), parameter estimation systems (adaptive spectrum analyzers, adaptive correlators, ... [64]), beamforming circuits [67, 22], and so forth. As new applications were developed and as the adaptation speed required from existing systems increased, various adaptation algorithms were developed to replace LMS, some based on RLS techniques, some based on LMS itself. In this thesis, we will concentrate on the second category. For completeness, we should also mention that interest in LMS recently increased with the advent of feedforward multi-layer neural networks. These neural networks, which can be described as layered arrangements of linear adaptive units followed by nonlinear unimodal functions, typically contain a very large number of parameters that must be adapted. The most famous algorithm for adjusting these parameters is the backpropagation algorithm [62, 5, 63], which is nothing else than a generalization of LMS to this more complicated structure. Backpropagation suffers from the same convergence speed problems as LMS, but remedying this problem turns out to be much more complicated in the case of neural networks, mostly because these circuits contain so many elements that only very simple algorithms can be used if the computational cost is to be kept reasonably low. In this thesis, we will discuss a class of LMS-based algorithms whose inputs are preprocessed by a fixed, data-independent transformation such as the discrete Fourier transform (DFT) or the discrete cosine transform (DCT). The purpose of this transformation is to decorrelate, at least partially, the inputs to the filter. The filter coefficients are then adjusted using the LMS algorithm with normalized learning rates. This has the effect of redistributing the energy of the input signal more or less evenly over all the filter inputs, thereby improving the convergence speed of the filter coefficients. The resulting algorithms are referred to as DFT-LMS and DCT-LMS. The best performance we observe with such algorithms is obtained with DCT-LMS, for low-pass input signals. In order to maintain the computational efficiency and robustness of LMS, we propose that the orthogonalizing transforms, DFT or DCT, be implemented using the so-called LMS spectrum analyzer [64], an adaptive structure that calculates the DFT of a time series efficiently and which is extremely robust to

error propagation.

The outline of the thesis is as follows. In chapter 2, we give a detailed introduction to adaptive linear filters, to LMS and RLS, and to the algorithms we will later focus on: DFT-LMS and DCT-LMS. In chapter 3, we define, model, and justify intuitively these algorithms. In chapter 4, we study in detail the convergence properties of both algorithms with the assumption that the input signals are generated by a first-order Markov system. We present computer simulations illustrating these results in chapter 5. Chapter 6 describes the LMS spectrum analyzer and demonstrates its robustness to noise propagation. In chapter 7, we generalize the LMS spectrum analyzer to the case of the DCT. We conclude in chapter 8 by summarizing the dissertation, adding some comments, and listing several points that we think would be interesting to study further.

1.1 Author's Contributions

The major contributions to knowledge of this work can be summarized as follows.

- Modeling of the DFT-LMS and DCT-LMS algorithms so as to simplify the analytical study of their performance, and development of a mathematical framework where such an analytical study can take place.
- Derivation of asymptotic results on the transformed input eigenvalues and on the speed of convergence of DFT-LMS and DCT-LMS under the assumption of first-order Markov inputs (a small numerical illustration follows this list).
- Mathematical proof and experimental demonstration of the robustness of the LMS spectrum analyzer to noise propagation.
- Generalization of the LMS spectrum analyzer to the DCT (two different structures are proposed).
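The asymptotic eigenvalue-spread result mentioned in the second item can be checked numerically. The following sketch is not part of the original dissertation; it assumes NumPy, an orthonormal DCT-II matrix, and arbitrary values of N and ρ, and it compares the eigenvalue spread of a Markov-1 autocorrelation matrix before and after DCT preprocessing and amplitude (power) normalization with the asymptotic limits ((1 + ρ)/(1 − ρ))^2 and (1 + ρ).

    import numpy as np

    N, rho = 64, 0.9

    # Markov-1 (first-order Markov) autocorrelation matrix: R(l, m) = rho^|l - m|
    n = np.arange(N)
    R = rho ** np.abs(np.subtract.outer(n, n))

    # Orthonormal DCT-II matrix (rows = basis functions)
    C = np.sqrt(2.0 / N) * np.cos(np.pi * np.outer(n, n + 0.5) / N)
    C[0, :] = np.sqrt(1.0 / N)

    # Autocorrelation of the DCT outputs, then amplitude (power) normalization
    B = C @ R @ C.T
    D = np.diag(1.0 / np.sqrt(np.diag(B)))      # divide each branch by its RMS power
    S_C = D @ B @ D

    def spread(M):
        eig = np.linalg.eigvalsh(M)
        return eig[-1] / eig[0]

    # Finite-N values approach the asymptotic limits only slowly; expect rough agreement.
    print("before transform:", spread(R), " limit ((1+rho)/(1-rho))^2 =", ((1 + rho) / (1 - rho)) ** 2)
    print("after DCT+norm :", spread(S_C), " limit (1+rho) =", 1 + rho)

For moderate N the measured spreads only approach these limits, but the reduction of several orders of magnitude already shows the benefit of the preprocessing.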

Chapter 2

Adaptation Algorithms for Linear Filtering

In this chapter, we introduce more formally the concept of an adaptive filter and we briefly summarize the characteristics of two famous adaptation algorithms, LMS and RLS. The conclusions drawn from their comparison will lead to the introduction of another family of adaptation algorithms, the transform-domain LMS algorithms, of which DFT-LMS and DCT-LMS are two examples.

2.1 Introduction to Adaptive Filters

A discrete-time linear adaptive combiner [65, 66] of length N is shown in Fig. 2.1. At time k, a set of signals, x_k(0), x_k(1), ..., x_k(N-1), are input to the combiner. The combiner coefficients, w_k(0), w_k(1), ..., w_k(N-1), are referred to in this context as the weights of the combiner. These coefficients can be adjusted by an adaptation algorithm so as to make the output y_k resemble a given desired output signal that we denote d_k. In another set-up, the inputs could come from a tap-delay line as shown in Fig. 2.2. This second structure is a very common particular case of the general diagram of Fig. 2.1. It is used essentially for prediction and filtering applications, and it is the basic structure on which this thesis will elaborate.

Figure 2.1: Linear adaptive filter.

The task of the adaptation algorithm is to iteratively minimize some error criterion, where by error we mean a measure of how distant the actual outputs are from the desired outputs. Typically, the error criterion is chosen to be the expectation of the square of the difference e_k between the desired and the actual outputs,

    ξ(w) = E[e_k^2]                 (2.1)
         = E[(d_k - y_k)^2],        (2.2)

where the expectation E[.] is taken over the input space. Let w_k ≜ [w_k(0) w_k(1) ... w_k(N-1)] be the weight vector, and x_k ≜ [x_k(0) x_k(1) ... x_k(N-1)] be the input vector. For tap-delayed inputs, x_k = [x_k x_{k-1} ... x_{k-N+1}]. The output signal y_k can then be expressed as the dot product of the weight and the input vectors, y_k = w_k^T x_k, where the superscript T denotes the vector transpose. The mean square error (MSE) ξ(w) defined in Eq. 2.2 can be expanded as

    ξ(w) = E[(d_k - y_k)^2]                     (2.3)
         = E[d_k^2] + w^T R w - 2 p^T w,        (2.4)

Figure 2.2: Linear adaptive filter with tap-delayed inputs.

where R is the autocorrelation matrix of the inputs,

    R ≜ E[x_k x_k^T],               (2.5)

and p is the cross-correlation between the inputs and the desired outputs,

    p ≜ E[x_k d_k].                 (2.6)

When the inputs are tap-delayed, the matrix R is Toeplitz, i.e.

    R(l, m) = R(|l - m|)   for all l, m,        (2.7)

a property that will be used throughout this thesis. The error ξ(w) is a quadratic function of the weights and assumes the shape of a hyperparaboloid, as illustrated in Fig. 2.3 for a 2-weight case. The sections of the error surface, ξ = constant, are hyperellipsoids (ellipses in the 2-D case). The orientation and the shape of these ellipsoids depend on the eigenvalues of the input

autocorrelation matrix R. It is easy to show that the axes of the hyperellipsoids are aligned with the eigenvectors of R and that their lengths are inversely proportional to the square roots of the corresponding eigenvalues. In the 2-D case, if the two eigenvalues are very different the ellipses are thin and long, while if the eigenvalues are equal the ellipses degenerate into circles.

Figure 2.3: Error surface for a 2-weight adaptive filter (error contours over the plane (w(0), w(1)), minimum error ξ_min, and optimal solution w_opt).

The weight vector that minimizes the error ξ(w) corresponds to the "bottom of the bowl" (see Fig. 2.3). It is obtained mathematically by taking the derivative of ξ(w) with respect to the weights, setting it to zero, and solving for w. The solution

w_opt, which is a special case of the Wiener solution (see footnote 1), is equal to

    w_opt = arg min_w ξ(w) = R^{-1} p.          (2.8)

The minimum achievable mean square error is obtained by replacing w with w_opt in Eq. 2.4:

    ξ_min = E[d_k^2] - p^T R^{-1} p = E[d_k^2] - p^T w_opt.      (2.9)

The error function ξ(w) can thus be rewritten as

    ξ(w) = ξ_min + (w - w_opt)^T R (w - w_opt).      (2.10)

This is the function that the adaptation algorithm has to minimize. Typically, adaptation algorithms work as follows. The filter weights are initially set to zero or to small - possibly random - values. Then, at each iteration, the weights are adjusted so as to travel down the error surface and to eventually reach its minimum w_opt or a vicinity of it. The speed at which this happens, the precision of the solution after convergence, the overall robustness of the algorithm, its simplicity, its capacity to deal with non-stationary inputs and/or desired outputs, the number of calculations required per iteration, ... are all factors that must be taken into account when comparing different adaptation algorithms. In the following sections, we will successively discuss three families of algorithms: the least mean squares (LMS) algorithms, the recursive least squares (RLS) algorithms, and the transform-domain LMS algorithms.

Footnote 1: The filter that best maps, in a least squares sense, an input signal into a given desired output is in general of infinite length [3]. It therefore requires the resolution of an infinite set of equations, which may be impractical in computer implementations. Limiting the number of filter taps to N constrains the solution w_opt to be of finite length. It makes the computation of w_opt more tractable, but it also somewhat increases the minimum achievable error of the filter, ξ_min. Note that w_opt is in general different from the solution that would be obtained by solving the infinite set of equations and then truncating the solution after the N-th component. In general, it is assumed that N is chosen large enough to make these effects negligible.
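As a small numerical companion to Eqs. 2.5 to 2.10, the following sketch (not taken from the dissertation; the system-identification setup, signal length, and noise level are arbitrary assumptions) estimates R and p from data for a tap-delayed filter and computes the Wiener solution and the minimum achievable error.

    import numpy as np

    rng = np.random.default_rng(0)
    N = 4                                   # filter length (arbitrary choice)
    K = 50_000                              # number of samples

    # Hypothetical setup: d_k is an unknown FIR filter applied to x_k, plus a little noise
    x = rng.standard_normal(K)
    w_true = np.array([1.0, -0.5, 0.25, 0.1])
    d = np.convolve(x, w_true, mode='full')[:K] + 0.01 * rng.standard_normal(K)

    # Tap-delayed input vectors x_k = [x_k, x_{k-1}, ..., x_{k-N+1}]
    X = np.stack([np.concatenate([np.zeros(i), x[:K - i]]) for i in range(N)], axis=1)

    # Sample estimates of R = E[x_k x_k^T] (Toeplitz for tap-delayed inputs) and p = E[x_k d_k]
    R = X.T @ X / K
    p = X.T @ d / K

    w_opt = np.linalg.solve(R, p)           # Wiener solution, Eq. 2.8
    xi_min = np.mean(d ** 2) - p @ w_opt    # minimum achievable MSE, Eq. 2.9
    print("w_opt  =", w_opt)                # close to w_true
    print("xi_min =", xi_min)               # close to the added noise power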

2.2 The LMS Algorithm

The Least Mean Squares (LMS) algorithm invented by Widrow and Hoff [65, 66] is the simplest and one of the most widely used adaptation algorithms. In this section, we summarize the main features of LMS, insisting only on the properties that will influence the remainder of this thesis. For more details, we refer the reader to Widrow's textbook [69] and to the original publications mentioned above.

2.2.1 Derivation of the LMS Algorithm

The LMS algorithm minimizes the error function using a stochastic steepest descent approach, that is, at each iteration, the weights are updated proportionally to an estimate of the error gradient. Let ∇_k denote the true error gradient at time k, and ∇̂_k its estimate. The true gradient of the error function is given by

    ∇_k = dξ(w_k)/dw_k                              (2.11)
        = d E[(d_k - w_k^T x_k)^2] / dw_k.          (2.12)

The gradient estimate, ∇̂_k, is simply obtained by omitting the expectation in Eq. 2.12, hence the name "stochastic gradient":

    ∇̂_k = d(d_k - w_k^T x_k)^2 / dw_k              (2.13)
         = -2 (d_k - w_k^T x_k) x_k                 (2.14)
         = -2 e_k x_k.                              (2.15)

By adjusting the weights proportionally to the stochastic gradient instead of the true gradient, LMS follows on the error surface a zig-zag path whose average course is the exact steepest descent path. The main motivation behind this stochastic approximation is to avoid the cost of computing an expectation over the whole input space at each iteration.
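The claim that the stochastic gradient follows the true gradient on average can be illustrated numerically. In the following sketch (not from the dissertation; the white-noise input and the FIR target are hypothetical choices for which R and p are known analytically), the sample average of -2 e_k x_k is compared with the gradient 2 R w - 2 p of the MSE of Eq. 2.4.

    import numpy as np

    rng = np.random.default_rng(1)
    N, K = 3, 200_000
    w_true = np.array([0.7, -0.3, 0.2])

    # Hypothetical stationary setup: unit-variance white input in a tap-delay line,
    # so R = E[x_k x_k^T] = I and p = E[x_k d_k] = R w_true = w_true analytically.
    x = rng.standard_normal(K)
    X = np.stack([np.concatenate([np.zeros(i), x[:K - i]]) for i in range(N)], axis=1)
    d = X @ w_true + 0.05 * rng.standard_normal(K)
    R, p = np.eye(N), w_true

    w = np.array([0.1, 0.0, -0.2])          # some fixed weight vector
    e = d - X @ w                           # output errors at these weights

    grad_true = 2 * R @ w - 2 * p                          # gradient of the MSE of Eq. 2.4
    grad_stoch = (-2 * X * e[:, None]).mean(axis=0)        # sample average of -2 e_k x_k (Eq. 2.15)
    print(grad_true, grad_stoch)            # the averaged stochastic gradient approximates the true one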

The LMS weight update is thus given by the simple formula:

    w_{k+1} = w_k - μ ∇̂_k                  (2.16)
            = w_k + 2μ e_k x_k,             (2.17)

where the learning rate, μ, is a constant that governs the speed of convergence of the algorithm: large μ's allow the algorithm to converge fast (bigger steps are taken towards the bottom of the bowl), but large μ's also lower the precision of the weight vector after convergence has been reached (the weight vector keeps on wandering in a large neighborhood of the optimum solution). Moreover, large μ's can create instability problems. Choosing the right value for μ is an important and difficult task.

2.2.2 Properties of the LMS Algorithm

Although it may seem counter-intuitive, the very simplicity of LMS makes its exact analysis quite complicated. Most of the published proofs of convergence of LMS are based on the average behavior of the algorithm rather than on its stochastic behavior. The early theory of LMS developed by Widrow and Hoff considers the convergence in the mean of the weight vector. Later studies have also included convergence in the mean square [57, 4, 23]. For our purposes, the former will suffice, and we will limit ourselves to a summary of Widrow's main results, referring the reader to Widrow's and Haykin's textbooks [69, 23] for more details. Widrow based his analysis on the exact steepest descent algorithm:

    w_{k+1} = w_k - μ ∇_k.                  (2.18)

The exact error gradient at time k can be expressed as

    ∇_k = dξ(w_k)/dw_k                      (2.19)
        = 2 R w_k - 2 p                     (2.20)
        = 2 R (w_k - w_opt).                (2.21)
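A minimal sketch of the LMS filter defined by Eqs. 2.16 and 2.17 is given below (not from the dissertation; the filter length, learning rate value, and test signal are illustrative assumptions).

    import numpy as np

    def lms(x, d, N, mu):
        """Tap-delayed LMS filter: returns the final weights and the output errors."""
        w = np.zeros(N)                    # weights initially set to zero
        x_vec = np.zeros(N)                # tap-delay line x_k = [x_k, x_{k-1}, ..., x_{k-N+1}]
        errors = np.empty(len(x))
        for k in range(len(x)):
            x_vec = np.concatenate(([x[k]], x_vec[:-1]))
            y = w @ x_vec                  # filter output y_k = w_k^T x_k
            e = d[k] - y                   # output error e_k
            w = w + 2 * mu * e * x_vec     # LMS update, Eq. 2.17
            errors[k] = e
        return w, errors

    # Illustrative use: identify a short FIR system driven by white noise
    rng = np.random.default_rng(2)
    x = rng.standard_normal(20_000)
    d = np.convolve(x, [1.0, -0.5, 0.25], mode='full')[:len(x)]
    w, errors = lms(x, d, N=3, mu=0.01)
    print(w)                               # converges towards [1.0, -0.5, 0.25]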

Let v_k be the translated weight vector

    v_k ≜ w_k - w_opt.                      (2.22)

The weight update formula of Eq. 2.18 can be rewritten in terms of translated weights as

    v_{k+1} = (I - 2μR) v_k,                (2.23)

where I is the identity matrix. The next step consists in performing a rotation of the translated weights,

    v'_k ≜ Q^T v_k,                         (2.24)

where the unitary matrix Q contains the eigenvectors of R, that is, R = Q Λ Q^T, where Λ is a diagonal matrix containing the eigenvalues of R. Eq. 2.23 can then be rewritten as

    v'_{k+1} = (I - 2μΛ) v'_k.              (2.25)

This last formula can be iterated from time 0 to time k to give the transformed weight vector at time k:

    v'_k = (I - 2μΛ)^k v'_0,                (2.26)

where v'_0 is the initial value of the transformed weight vector. Also the error at time k can be expressed in the transformed weight space:

    ξ_k = ξ_min + v'_k^T Λ v'_k.            (2.27)

By introducing the newly found formula for the weight vector (Eq. 2.26) in the error

function and with a little algebra, one finds

    ξ_k = ξ_min + Σ_{n=0}^{N-1} v'_0(n)^2 λ_n (1 - 2μλ_n)^{2k}.      (2.28)

The implications of this equation are extremely important. First, we see that the error decreases as a sum of geometrical series, or exponentials if we interpret the adaptation as a continuous process. Each exponential corresponds to one weight and evolves independently of the others. This is due to the decorrelation of the weights resulting from their transformation by the matrix Q. The time constants of the exponentials are given by

    τ_n = 1 / (4μλ_n),                      (2.29)

where λ_n is the eigenvalue associated with the n-th weight. Small eigenvalues (low energy modes) correspond to long time constants and slow down the overall convergence of the adaptive filter. High eigenvalues, on the other hand, can cause the modulus of (1 - 2μλ_n) to be larger than one, thereby causing the algorithm to diverge. Of course this divergence can be avoided by reducing the learning rate, but decreasing μ will have the direct consequence of slowing down even further those modes that are already slow because they correspond to small eigenvalues. Input signals with high eigenvalue spread will therefore always result in poor convergence performance. Clearly, the problem faced by LMS when its input eigenvalues are very spread apart is due to the fact that it has a single learning rate that must satisfy all the weights. The problem would be greatly reduced if we could associate to each decorrelated weight v'(n) a specific learning rate μ_n such that the product μ_n λ_n is more or less constant over n. Note however that this reasoning holds only in a weight space that has been previously orthogonalized (i.e. it holds in the weight space v' but not in the weight spaces v or w). Without this preliminary decorrelating step, each weight would be associated with a combination of modes in the error function instead of just one mode, and the learning rates μ_n could not be chosen efficiently. Another important property to discuss is the precision of the steady-state solution

of LMS. When it converges, LMS does not reach the optimal solution w_opt exactly; rather, it reaches a vicinity of the optimum solution where it keeps on wandering forever. This is due to the fact that the weight update in LMS is proportional to the stochastic error gradient instead of the true error gradient. At the bottom of the error function, the true gradient is equal to zero, but the stochastic gradient (i.e. the product of the input by the output error) is not necessarily equal to zero (except of course in the degenerate case where the minimum achievable error, ξ_min, is equal to zero). The steady-state solution found by LMS is thus noisy. Its precision is characterized by a quantity called misadjustment, which is equal to the variance of the steady-state solution normalized by the minimum achievable error ξ_min. It can be shown [69] that the misadjustment is in first approximation given by

    Misadjustment = μ Trace(R).             (2.30)

This formula shows that improving the precision of the steady-state solution can easily be achieved by decreasing the learning rate. However, this has the inconvenience of slowing down the adaptation process. A better solution, which is often used in practice, consists in starting the adaptation with a large μ and decreasing it progressively as the weights converge (see [14]). The misadjustment of LMS should therefore not be seen as a major limitation of the algorithm. Another way of interpreting the non-zero misadjustment of LMS is to note that the algorithm has no memory. Other algorithms such as RLS accumulate information about the data and use this information to progressively reduce the uncertainty about the solution. LMS does not. While for steady-state convergence this is a disadvantage because it causes steady-state misadjustment, the same feature turns out to be advantageous in non-stationary environments. By not accumulating any information about the data, LMS can more easily track a time-varying solution than RLS, whose weights are delayed in their evolution by the obsolete information they have accumulated (see e.g. [23]).
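As a small illustration of Eqs. 2.29 and 2.30 (not from the dissertation; the Markov-1 autocorrelation matrix and the learning rate are arbitrary choices), the per-mode time constants and the first-order misadjustment follow directly from the eigenvalues of R.

    import numpy as np

    mu = 0.005                                  # arbitrary learning rate
    rho, N = 0.9, 8                             # Markov-1 input example, arbitrary choices
    n = np.arange(N)
    R = rho ** np.abs(np.subtract.outer(n, n))  # input autocorrelation matrix

    lam = np.linalg.eigvalsh(R)                 # eigenvalues lambda_n
    tau = 1.0 / (4 * mu * lam)                  # per-mode time constants, Eq. 2.29
    misadjustment = mu * np.trace(R)            # first-order misadjustment, Eq. 2.30

    print("eigenvalue spread :", lam[-1] / lam[0])
    print("time constants    :", np.sort(tau))  # the slow modes come from the small eigenvalues
    print("misadjustment     :", misadjustment)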

In conclusion, aside from its slow convergence when the inputs are highly correlated, LMS displays excellent properties. Its weight update requires only O(N) computations per iteration, it is by far the simplest algorithm to implement, it is robust to error propagation in limited precision implementations [9], and it tracks non-stationarities better than other adaptation algorithms [5, 38, 7]. These properties have greatly contributed to the popularity of LMS, although the major complaint about the algorithm remains its slow convergence. In applications where it is critical to achieve fast convergence, LMS is often not a viable solution. In this respect, exact least squares algorithms such as RLS may be more attractive. We will see, however, that the convergence speed of LMS can be greatly ameliorated if an adequate preprocessing is effected on its inputs. This will bring the discussion to the family of transform-domain LMS algorithms. In order to facilitate the later description of these transform-domain algorithms, we would like to briefly introduce two extensions of LMS: complex LMS and block-LMS, after which we will turn our attention to RLS algorithms, and then to transform-domain LMS algorithms.

2.2.3 The Complex LMS Algorithm

Complex LMS [68] is the straightforward extension of real LMS to the case where the inputs, the adaptive weights, and the desired outputs are allowed to take on complex values. The error function to be minimized is defined as the expectation of the square modulus of the output error,

    ξ(w) = E[e_k e_k*],                     (2.31)

where e_k* is the complex conjugate of e_k. Applying stochastic steepest descent to ξ, one finds the following weight update formula (see footnote 2):

    w_{k+1} = w_k + 2μ e_k x_k*,            (2.32)

Footnote 2: A formal derivation of the algorithm is given in [68]. An intuitive derivation can be obtained by considering a few particular cases such as real inputs with imaginary desired outputs, etc.

with e_k = d_k - y_k = d_k - w_k^T x_k, and where the weights w_k(i) are complex numbers. Equivalently, the weight update can be formulated as

    w_{k+1} = w_k + 2μ e_k* x_k,            (2.33)

with e_k = d_k - y_k = d_k - w_k^H x_k, where the superscript H denotes the hermitian, i.e. the transpose conjugate [23]. The properties of complex LMS are very similar to those of real LMS, although slight differences can be observed in terms of mean square convergence and stability performance [24].

2.2.4 The Block-LMS Algorithm

Block-LMS is a partially batched extension of LMS [12, 1]. The instantaneous error gradient, ∇̂, is computed at each iteration as in regular non-block LMS, but rather than being used right away to update the weights, it is buffered for a certain number, L, of iterations. The weight update takes place every L iterations and is made proportional to the sum of the last L instantaneous error gradients:

    w(k + 1) = w(k) + 2μ Σ_{l=0}^{L-1} x(kL + l) e(kL + l).     (2.34)

If L = 1, the weight update of Eq. 2.34 describes regular LMS; if L is equal to the number of input patterns available for training, Eq. 2.34 describes a batched implementation of LMS. In practice, L is often made equal to N, the length of the filter. Having L > 1 essentially modifies the nature of the error gradient, making it less stochastic and closer to the true error gradient, ∇. This has the consequence of smoothing the learning curve, but it also affects other properties of the algorithm. For example, buffering the gradient for L iterations may hurt the tracking capabilities of the algorithm in non-stationary environments because of the delay it introduces in the weight update. Also, the maximum learning rate that can be used without encountering stability problems is L times smaller than the one that could be used, under

identical input conditions, with regular LMS [16]. This, of course, may adversely influence the convergence speed of the algorithm. The advantage of block-LMS comes from the fact that the block-gradient in Eq. 2.34 can be seen as a linear correlation between the input signal and the output error signal. It can therefore be implemented efficiently by taking the Fourier transforms of the two signals, computing their product, and inverse transforming the result [12, 1]. The computational efficiency of this method counterbalances the slowness of the weight convergence. We will see in section 2.4 that a whole class of transform-domain algorithms is based on this principle.

2.3 The RLS Algorithm

The Recursive Least Squares (RLS) algorithm implements recursively an exact least squares solution [18, 23]. We saw previously that the Wiener solution for an adaptive filter of finite length is given by w_opt = R^{-1} p, where R is the autocorrelation matrix of the inputs and p is the cross-correlation between inputs and desired outputs. At each time step, RLS estimates recursively R^{-1} and p based on all past data and computes the weight vector as w_k = R_k^{-1} p_k, which is thus the best to-date approximation to the Wiener solution.

2.3.1 Derivation of the RLS Algorithm

At time k, the best estimates of R and p are given by

    R_k = Σ_{i=1}^{k} λ^{k-i} x_i x_i^T         (2.35)
        = λ R_{k-1} + x_k x_k^T,                (2.36)

    p_k = Σ_{i=1}^{k} λ^{k-i} x_i d_i           (2.37)
        = λ p_{k-1} + x_k d_k,                  (2.38)

where the constant λ ∈ [0, 1] is generally chosen close to one but slightly smaller for stability reasons. The best estimate of the optimum weight vector (see footnote 3) w_opt is given by

    w_k = R_k^{-1} p_k.                         (2.39)

Applying the matrix inversion lemma (see footnote 4) to Eq. 2.39, we get:

    R_k^{-1} = λ^{-1} R_{k-1}^{-1} - [λ^{-2} R_{k-1}^{-1} x_k x_k^T R_{k-1}^{-1}] / [1 + λ^{-1} x_k^T R_{k-1}^{-1} x_k]      (2.40)
             = λ^{-1} R_{k-1}^{-1} - λ^{-1} K_k x_k^T R_{k-1}^{-1},                                                          (2.41)

where K_k, the gain vector, is defined by

    K_k = G_k x_k,                              (2.42)

and

    G_k = λ^{-1} R_{k-1}^{-1} / [1 + λ^{-1} x_k^T R_{k-1}^{-1} x_k].                (2.43)

Introducing Eqs. 2.38 and 2.41 in Eq. 2.39, and after some algebraic manipulation, we find the weight update formula

    w_k = w_{k-1} + K_k α_k                     (2.44)
        = w_{k-1} + G_k α_k x_k,                (2.45)

where

    α_k = d_k - w_{k-1}^T x_k.                  (2.46)

Footnote 3: Alternatively, the weight update formula can be found by minimizing the error function ξ_k = Σ_{i=1}^{k} λ^{k-i} e_i^2, where e_i is the output error at time i, and where the sum over i causes the weight vector at time k to take into account all the past history (i = 1 to k) of the system.

Footnote 4: Let A and B be two N x N positive definite matrices, C an N x M matrix, and D a positive definite M x M matrix. The matrix inversion lemma states that if A = B + C D^{-1} C^T, then A^{-1} = B^{-1} - B^{-1} C (D + C^T B^{-1} C)^{-1} C^T B^{-1} (see e.g. [29]).
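A minimal sketch of the RLS recursion of Eqs. 2.40 to 2.46 follows (not from the dissertation; the forgetting factor, the initialization constant δ, and the test signal are illustrative assumptions; P_k denotes the running estimate of R_k^{-1}).

    import numpy as np

    def rls(x, d, N, lam=0.999, delta=0.01):
        """Tap-delayed RLS filter following Eqs. 2.40-2.46 (P plays the role of R_k^{-1})."""
        w = np.zeros(N)                         # weights initialized to zero
        P = np.eye(N) / delta                   # R_0^{-1} = delta^{-1} I
        x_vec = np.zeros(N)
        for k in range(len(x)):
            x_vec = np.concatenate(([x[k]], x_vec[:-1]))
            Px = P @ x_vec
            K = Px / (lam + x_vec @ Px)         # gain vector K_k (Eqs. 2.42-2.43, rearranged)
            alpha = d[k] - w @ x_vec            # a priori error, Eq. 2.46
            w = w + K * alpha                   # weight update, Eqs. 2.44-2.45
            P = (P - np.outer(K, Px)) / lam     # R_k^{-1} update, Eq. 2.41
        return w

    # Illustrative use: system identification with a correlated (Markov-1) input
    rng = np.random.default_rng(3)
    u = rng.standard_normal(5_000)
    x = np.zeros_like(u)
    for k in range(1, len(u)):                  # first-order Markov (low-pass) input, rho = 0.9
        x[k] = 0.9 * x[k - 1] + u[k]
    d = np.convolve(x, [1.0, -0.5, 0.25], mode='full')[:len(x)]
    print(rls(x, d, N=3))                       # converges quickly towards [1.0, -0.5, 0.25]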

Equations 2.43, 2.45, and 2.46 summarize the algorithm. The weights are typically initialized to zero, while the R_k^{-1} matrix is initialized by R_0^{-1} = δ^{-1} I, where δ is a small positive constant and I is the identity matrix.

2.3.2 Properties of the RLS Algorithm

Note the formal resemblance between RLS and LMS. The RLS weight vector is updated proportionally to the product of the current input x_k and some error signal α_k, as in LMS. The error in RLS is defined a priori in the sense that it is based on the old weight vector w_{k-1}, whereas in LMS the error e_k = d_k - w_k^T x_k is computed a posteriori, that is, based on the current weight vector, w_k. A more important difference results from the fact that the constant learning rate μ in LMS is replaced in RLS by a matrix G_k that is time- and data-dependent. In this respect, RLS can be thought of as a sort of LMS algorithm having a matrix-controlled optimal learning rate (optimal because the weight vector at each iteration is the best achievable one in the least mean square sense, given all the past data samples). The weight update formula of Eq. 2.45 could also be rewritten as

    w_k = w_{k-1} + γ_k R_{k-1}^{-1} α_k x_k,               (2.47)

with

    γ_k = λ^{-1} / [1 + λ^{-1} x_k^T R_{k-1}^{-1} x_k].     (2.48)

This formulation places in evidence the decorrelation operation performed by RLS on the input data: the stochastic gradient α_k x_k is premultiplied by an estimate of the inverse autocorrelation matrix, R_{k-1}^{-1}, which has the effect of decorrelating the inputs to the adaptive filter. This decorrelation, along with the specific expression of the learning rate, reduces the sensitivity of the algorithm to its input eigenvalue spread, and enhances its convergence properties with respect to LMS. This premultiplication by R^{-1} can unfortunately hurt the stability of the filter if the matrix R is ill-conditioned or close to being ill-conditioned, a situation that arises each time

some of the filter inputs are linearly dependent or, in other words, each time the filter contains more weights than necessary. Another issue to be discussed is the precision of the steady-state solution. The asymptotic misadjustment in RLS can be arbitrarily decreased by increasing the parameter λ up to one. Intuitively, λ → 1 causes the entries of the matrix R_k to grow as k increases (see Eq. 2.35), and forces γ_k (Eq. 2.48) to gradually decrease down to zero (see [23] for a more formal justification). Note that choosing λ = 1 does not impair the convergence properties: the convergence speed of RLS is roughly independent of λ. As a counterpart, RLS displays poor tracking capabilities in non-stationary environments [5, 38, 7]. Intuitively, the weight vector in RLS is based on all the past history of the input signal. If the statistics of this signal change over time, it will be harder for RLS to adjust to these changes than for LMS, whose weight update is based solely on the current stochastic gradient. In addition, RLS suffers from a high computational complexity: due to matrix-vector multiplications, O(N^2) operations are required for each weight update, whereas only O(N) are necessary with LMS. In conclusion, while RLS has the advantage of a fast convergence rate and low sensitivity to the input eigenvalue spread, it is computationally intensive, prone to numerical instabilities, and inefficient at tracking non-stationarities when compared to LMS. LMS is intrinsically slow because it does not decorrelate its inputs prior to adaptive filtering, but preprocessing the inputs by an estimate of the inverse input autocorrelation matrix in the fashion of RLS leads to the problems cited above. One solution, which we will further discuss in this thesis, consists of preprocessing the inputs to the LMS filter with a fixed transformation that does not depend on the actual input data. The decorrelation will only be approximate, but the computational cost will remain of O(N), and the robustness and tracking capability of LMS will be preserved. These algorithms are generally called transform-domain LMS algorithms or frequency-domain LMS algorithms. Before leaving this section, we should mention that many other exact least squares algorithms have been studied in the literature. The main motivation for doing so was

to reduce the computational complexity of RLS and improve its robustness while maintaining its convergence characteristics. The most famous algorithms in this family are those based on the so-called lattice structure [42, 31, 32, 33]. These algorithms take advantage of the fact that in a Toeplitz matrix such as the autocorrelation matrix R, only N out of N^2 elements are distinct. This observation can be used to establish a one-to-one correspondence between these elements and so-called reflection coefficients and, with some algebraic manipulation, to reduce the computational cost of the algorithm from O(N^2) to O(N). Another characteristic that makes lattice filters very popular is the ease with which their stability can be monitored (see e.g. [23]). It has been observed, however, that finite arithmetic effects can severely degrade the algorithm performance [9]. The price for the improvement brought by the lattice structure is the increased complexity of the algorithm in terms of number of equations to be implemented, number of variables to be stored, and general complication of the algebra. Transform-domain LMS algorithms may be seen as a sort of intermediate solution that tries to combine the advantages of both LMS and RLS.

2.4 Transform-Domain LMS Algorithms

The name "transform-domain LMS algorithms" is somewhat ambiguous. It has been used in the literature to designate two different categories of algorithms: block-LMS algorithms implemented in the frequency domain, and non-block LMS algorithms whose inputs are transformed into the frequency domain prior to filtering. In this section, we give a quick overview of both families of algorithms. Since the rest of this thesis is devoted to non-block algorithms, these will be further detailed in the next chapter.

2.4.1 Transform-Domain Block-LMS Algorithms

This category of algorithms builds upon the Fourier implementation of the block-LMS algorithm (see section 2.2.4). Assume that the outputs, desired outputs, and output

errors are buffered into vectors. The output vector as well as the error gradient can be estimated in the frequency domain rather than in the time domain, since they both result from the convolution of two vectors. Because the product of two Fourier series corresponds in the time domain to a circular convolution rather than a linear one, some constraints must be implemented to restore the linearity of the convolution (see e.g. [44]). Not implementing these constraints results in wrap-around effects that affect the performance of the algorithms (biased optimal solution, extra noise in the steady-state solution, etc.) [1, 46]. Two methods have been described in the literature that calculate the linear convolution of two signals by taking the product of their Fourier transforms: the overlap-save method and the overlap-add method [44, 11]. These methods typically require one Fourier transform, one inverse Fourier transform, and a few appropriate vector manipulations (zero-padding, truncation, concatenation of vectors, ...) to calculate one convolution. Since two convolutions are calculated at each weight update (one for the error gradient and one for the filter output), frequency-domain block-LMS algorithms are quite involved (see e.g. [52] for a detailed description of the algorithms). The main advantage of these algorithms, in addition to their computational efficiency, is their potentially very fast convergence. By attributing to each transformed weight a learning rate that is inversely proportional to the energy of the corresponding input, the convergence of the algorithms can be greatly improved [15, 54].

2.4.2 Transform-Domain Non-block LMS Algorithms

This family of algorithms was first introduced by Narayan under the name transform-domain LMS algorithms [43]. Narayan's structure consists simply of an LMS filter whose inputs are preprocessed by a DFT and whose learning rates (one per weight) are adjusted as a function of the input energy levels (a full description of the algorithm is given in the next chapter). Referring back to our discussions on LMS and RLS (sections 2.2 and 2.3, respectively), the DFT is used to decorrelate the inputs of the LMS filter. The learning rate normalization is then used to optimize the convergence speed of each individual weight. Since the DFT is not a perfect decorrelator, this structure decorrelates the inputs only approximately; nevertheless, it converges considerably faster than pure LMS while keeping the computational cost at O(N) and preserving the robustness and tracking capability of LMS.
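A minimal sketch of such a transform-domain (non-block) LMS filter is given below. It is not taken from the dissertation: the tap-delayed input vector is simply multiplied by a unitary DFT matrix (rather than computed with a sliding DFT), each transformed input receives a learning rate inversely proportional to a running estimate of its power, and the weights are adapted with complex LMS. The smoothing constant beta and the small constant eps guarding the division are illustrative assumptions.

    import numpy as np

    def dft_lms(x, d, N, mu=0.05, beta=0.99, eps=1e-8):
        """Sketch of a DFT-LMS style transform-domain adaptive filter (complex LMS after a DFT)."""
        F = np.fft.fft(np.eye(N)) / np.sqrt(N)   # unitary DFT matrix applied to the tap-delay line
        w = np.zeros(N, dtype=complex)           # weights of the LMS layer (complex)
        power = np.ones(N)                       # running power estimate of each DFT output
        x_vec = np.zeros(N)
        errors = np.empty(len(x))
        for k in range(len(x)):
            x_vec = np.concatenate(([x[k]], x_vec[:-1]))
            u = F @ x_vec                        # transformed (partially decorrelated) inputs u_k
            power = beta * power + (1 - beta) * np.abs(u) ** 2
            y = np.real(w @ u)                   # filter output
            e = d[k] - y                         # output error
            w = w + 2 * mu * e * np.conj(u) / (power + eps)   # per-weight normalized complex LMS update
            errors[k] = e
        return errors

    # Illustrative use: correlated (Markov-1) input, for which plain LMS converges slowly
    rng = np.random.default_rng(4)
    u = rng.standard_normal(20_000)
    x = np.zeros_like(u)
    for k in range(1, len(u)):
        x[k] = 0.95 * x[k - 1] + u[k]
    d = np.convolve(x, [1.0, -0.5, 0.25, 0.1], mode='full')[:len(x)]
    e = dft_lms(x, d, N=8)
    print(np.mean(e[-2000:] ** 2))               # small residual MSE after convergence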


More information

Contents. 2.1 Vectors in R n. Linear Algebra (part 2) : Vector Spaces (by Evan Dummit, 2017, v. 2.50) 2 Vector Spaces

Contents. 2.1 Vectors in R n. Linear Algebra (part 2) : Vector Spaces (by Evan Dummit, 2017, v. 2.50) 2 Vector Spaces Linear Algebra (part 2) : Vector Spaces (by Evan Dummit, 2017, v 250) Contents 2 Vector Spaces 1 21 Vectors in R n 1 22 The Formal Denition of a Vector Space 4 23 Subspaces 6 24 Linear Combinations and

More information

4. Multilayer Perceptrons

4. Multilayer Perceptrons 4. Multilayer Perceptrons This is a supervised error-correction learning algorithm. 1 4.1 Introduction A multilayer feedforward network consists of an input layer, one or more hidden layers, and an output

More information

CS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works

CS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works CS68: The Modern Algorithmic Toolbox Lecture #8: How PCA Works Tim Roughgarden & Gregory Valiant April 20, 206 Introduction Last lecture introduced the idea of principal components analysis (PCA). The

More information

Machine Learning and Adaptive Systems. Lectures 3 & 4

Machine Learning and Adaptive Systems. Lectures 3 & 4 ECE656- Lectures 3 & 4, Professor Department of Electrical and Computer Engineering Colorado State University Fall 2015 What is Learning? General Definition of Learning: Any change in the behavior or performance

More information

`First Come, First Served' can be unstable! Thomas I. Seidman. Department of Mathematics and Statistics. University of Maryland Baltimore County

`First Come, First Served' can be unstable! Thomas I. Seidman. Department of Mathematics and Statistics. University of Maryland Baltimore County revision2: 9/4/'93 `First Come, First Served' can be unstable! Thomas I. Seidman Department of Mathematics and Statistics University of Maryland Baltimore County Baltimore, MD 21228, USA e-mail: hseidman@math.umbc.edui

More information

Linear-Quadratic Optimal Control: Full-State Feedback

Linear-Quadratic Optimal Control: Full-State Feedback Chapter 4 Linear-Quadratic Optimal Control: Full-State Feedback 1 Linear quadratic optimization is a basic method for designing controllers for linear (and often nonlinear) dynamical systems and is actually

More information

Least Mean Squares Regression. Machine Learning Fall 2018

Least Mean Squares Regression. Machine Learning Fall 2018 Least Mean Squares Regression Machine Learning Fall 2018 1 Where are we? Least Squares Method for regression Examples The LMS objective Gradient descent Incremental/stochastic gradient descent Exercises

More information

Ch4: Method of Steepest Descent

Ch4: Method of Steepest Descent Ch4: Method of Steepest Descent The method of steepest descent is recursive in the sense that starting from some initial (arbitrary) value for the tap-weight vector, it improves with the increased number

More information

Learning with Ensembles: How. over-tting can be useful. Anders Krogh Copenhagen, Denmark. Abstract

Learning with Ensembles: How. over-tting can be useful. Anders Krogh Copenhagen, Denmark. Abstract Published in: Advances in Neural Information Processing Systems 8, D S Touretzky, M C Mozer, and M E Hasselmo (eds.), MIT Press, Cambridge, MA, pages 190-196, 1996. Learning with Ensembles: How over-tting

More information

Independent Component Analysis. Contents

Independent Component Analysis. Contents Contents Preface xvii 1 Introduction 1 1.1 Linear representation of multivariate data 1 1.1.1 The general statistical setting 1 1.1.2 Dimension reduction methods 2 1.1.3 Independence as a guiding principle

More information

Learning with Momentum, Conjugate Gradient Learning

Learning with Momentum, Conjugate Gradient Learning Learning with Momentum, Conjugate Gradient Learning Introduction to Neural Networks : Lecture 8 John A. Bullinaria, 2004 1. Visualising Learning 2. Learning with Momentum 3. Learning with Line Searches

More information

Advanced Digital Signal Processing -Introduction

Advanced Digital Signal Processing -Introduction Advanced Digital Signal Processing -Introduction LECTURE-2 1 AP9211- ADVANCED DIGITAL SIGNAL PROCESSING UNIT I DISCRETE RANDOM SIGNAL PROCESSING Discrete Random Processes- Ensemble Averages, Stationary

More information

Lecture: Adaptive Filtering

Lecture: Adaptive Filtering ECE 830 Spring 2013 Statistical Signal Processing instructors: K. Jamieson and R. Nowak Lecture: Adaptive Filtering Adaptive filters are commonly used for online filtering of signals. The goal is to estimate

More information

Chapter 3 Least Squares Solution of y = A x 3.1 Introduction We turn to a problem that is dual to the overconstrained estimation problems considered s

Chapter 3 Least Squares Solution of y = A x 3.1 Introduction We turn to a problem that is dual to the overconstrained estimation problems considered s Lectures on Dynamic Systems and Control Mohammed Dahleh Munther A. Dahleh George Verghese Department of Electrical Engineering and Computer Science Massachuasetts Institute of Technology 1 1 c Chapter

More information

A general theory of discrete ltering. for LES in complex geometry. By Oleg V. Vasilyev AND Thomas S. Lund

A general theory of discrete ltering. for LES in complex geometry. By Oleg V. Vasilyev AND Thomas S. Lund Center for Turbulence Research Annual Research Briefs 997 67 A general theory of discrete ltering for ES in complex geometry By Oleg V. Vasilyev AND Thomas S. und. Motivation and objectives In large eddy

More information

Least Mean Squares Regression

Least Mean Squares Regression Least Mean Squares Regression Machine Learning Spring 2018 The slides are mainly from Vivek Srikumar 1 Lecture Overview Linear classifiers What functions do linear classifiers express? Least Squares Method

More information

Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space

Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) Contents 1 Vector Spaces 1 1.1 The Formal Denition of a Vector Space.................................. 1 1.2 Subspaces...................................................

More information

strain appears only after the stress has reached a certain critical level, usually specied by a Rankine-type criterion in terms of the maximum princip

strain appears only after the stress has reached a certain critical level, usually specied by a Rankine-type criterion in terms of the maximum princip Nonlocal damage models: Practical aspects and open issues Milan Jirasek LSC-DGC, Swiss Federal Institute of Technology at Lausanne (EPFL), Switzerland Milan.Jirasek@ep.ch Abstract: The purpose of this

More information

In: Proc. BENELEARN-98, 8th Belgian-Dutch Conference on Machine Learning, pp 9-46, 998 Linear Quadratic Regulation using Reinforcement Learning Stephan ten Hagen? and Ben Krose Department of Mathematics,

More information

Lessons in Estimation Theory for Signal Processing, Communications, and Control

Lessons in Estimation Theory for Signal Processing, Communications, and Control Lessons in Estimation Theory for Signal Processing, Communications, and Control Jerry M. Mendel Department of Electrical Engineering University of Southern California Los Angeles, California PRENTICE HALL

More information

Introduction to Machine Learning Prof. Sudeshna Sarkar Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Introduction to Machine Learning Prof. Sudeshna Sarkar Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Introduction to Machine Learning Prof. Sudeshna Sarkar Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Module 2 Lecture 05 Linear Regression Good morning, welcome

More information

Dominant Pole Localization of FxLMS Adaptation Process in Active Noise Control

Dominant Pole Localization of FxLMS Adaptation Process in Active Noise Control APSIPA ASC 20 Xi an Dominant Pole Localization of FxLMS Adaptation Process in Active Noise Control Iman Tabatabaei Ardekani, Waleed H. Abdulla The University of Auckland, Private Bag 9209, Auckland, New

More information

Submitted to Electronics Letters. Indexing terms: Signal Processing, Adaptive Filters. The Combined LMS/F Algorithm Shao-Jen Lim and John G. Harris Co

Submitted to Electronics Letters. Indexing terms: Signal Processing, Adaptive Filters. The Combined LMS/F Algorithm Shao-Jen Lim and John G. Harris Co Submitted to Electronics Letters. Indexing terms: Signal Processing, Adaptive Filters. The Combined LMS/F Algorithm Shao-Jen Lim and John G. Harris Computational Neuro-Engineering Laboratory University

More information

Adaptive Inverse Control based on Linear and Nonlinear Adaptive Filtering

Adaptive Inverse Control based on Linear and Nonlinear Adaptive Filtering Adaptive Inverse Control based on Linear and Nonlinear Adaptive Filtering Bernard Widrow and Gregory L. Plett Department of Electrical Engineering, Stanford University, Stanford, CA 94305-9510 Abstract

More information

On Mean Curvature Diusion in Nonlinear Image Filtering. Adel I. El-Fallah and Gary E. Ford. University of California, Davis. Davis, CA

On Mean Curvature Diusion in Nonlinear Image Filtering. Adel I. El-Fallah and Gary E. Ford. University of California, Davis. Davis, CA On Mean Curvature Diusion in Nonlinear Image Filtering Adel I. El-Fallah and Gary E. Ford CIPIC, Center for Image Processing and Integrated Computing University of California, Davis Davis, CA 95616 Abstract

More information

Recursive Least Squares for an Entropy Regularized MSE Cost Function

Recursive Least Squares for an Entropy Regularized MSE Cost Function Recursive Least Squares for an Entropy Regularized MSE Cost Function Deniz Erdogmus, Yadunandana N. Rao, Jose C. Principe Oscar Fontenla-Romero, Amparo Alonso-Betanzos Electrical Eng. Dept., University

More information

IS NEGATIVE STEP SIZE LMS ALGORITHM STABLE OPERATION POSSIBLE?

IS NEGATIVE STEP SIZE LMS ALGORITHM STABLE OPERATION POSSIBLE? IS NEGATIVE STEP SIZE LMS ALGORITHM STABLE OPERATION POSSIBLE? Dariusz Bismor Institute of Automatic Control, Silesian University of Technology, ul. Akademicka 16, 44-100 Gliwice, Poland, e-mail: Dariusz.Bismor@polsl.pl

More information

Outline Introduction: Problem Description Diculties Algebraic Structure: Algebraic Varieties Rank Decient Toeplitz Matrices Constructing Lower Rank St

Outline Introduction: Problem Description Diculties Algebraic Structure: Algebraic Varieties Rank Decient Toeplitz Matrices Constructing Lower Rank St Structured Lower Rank Approximation by Moody T. Chu (NCSU) joint with Robert E. Funderlic (NCSU) and Robert J. Plemmons (Wake Forest) March 5, 1998 Outline Introduction: Problem Description Diculties Algebraic

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 11 Adaptive Filtering 14/03/04 http://www.ee.unlv.edu/~b1morris/ee482/

More information

BLOCK LMS ADAPTIVE FILTER WITH DETERMINISTIC REFERENCE INPUTS FOR EVENT-RELATED SIGNALS

BLOCK LMS ADAPTIVE FILTER WITH DETERMINISTIC REFERENCE INPUTS FOR EVENT-RELATED SIGNALS BLOCK LMS ADAPTIVE FILTER WIT DETERMINISTIC REFERENCE INPUTS FOR EVENT-RELATED SIGNALS S. Olmos, L. Sörnmo, P. Laguna Dept. of Electroscience, Lund University, Sweden Dept. of Electronics Eng. and Communications,

More information

Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore

Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore Lecture - 27 Multilayer Feedforward Neural networks with Sigmoidal

More information

Acoustic Signal Processing. Algorithms for Reverberant. Environments

Acoustic Signal Processing. Algorithms for Reverberant. Environments Acoustic Signal Processing Algorithms for Reverberant Environments Terence Betlehem B.Sc. B.E.(Hons) ANU November 2005 A thesis submitted for the degree of Doctor of Philosophy of The Australian National

More information

Adaptive Systems Homework Assignment 1

Adaptive Systems Homework Assignment 1 Signal Processing and Speech Communication Lab. Graz University of Technology Adaptive Systems Homework Assignment 1 Name(s) Matr.No(s). The analytical part of your homework (your calculation sheets) as

More information

3. ESTIMATION OF SIGNALS USING A LEAST SQUARES TECHNIQUE

3. ESTIMATION OF SIGNALS USING A LEAST SQUARES TECHNIQUE 3. ESTIMATION OF SIGNALS USING A LEAST SQUARES TECHNIQUE 3.0 INTRODUCTION The purpose of this chapter is to introduce estimators shortly. More elaborated courses on System Identification, which are given

More information

Non-Convex Optimization. CS6787 Lecture 7 Fall 2017

Non-Convex Optimization. CS6787 Lecture 7 Fall 2017 Non-Convex Optimization CS6787 Lecture 7 Fall 2017 First some words about grading I sent out a bunch of grades on the course management system Everyone should have all their grades in Not including paper

More information

Massoud BABAIE-ZADEH. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.1/39

Massoud BABAIE-ZADEH. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.1/39 Blind Source Separation (BSS) and Independent Componen Analysis (ICA) Massoud BABAIE-ZADEH Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.1/39 Outline Part I Part II Introduction

More information

ADAPTIVE FILTER THEORY

ADAPTIVE FILTER THEORY ADAPTIVE FILTER THEORY Fifth Edition Simon Haykin Communications Research Laboratory McMaster University Hamilton, Ontario, Canada International Edition contributions by Telagarapu Prabhakar Department

More information

Approximation of the Karhunen}Loève transformation and its application to colour images

Approximation of the Karhunen}Loève transformation and its application to colour images Signal Processing: Image Communication 6 (00) 54}55 Approximation of the Karhunen}Loève transformation and its application to colour images ReH mi Kouassi, Pierre Gouton*, Michel Paindavoine Laboratoire

More information

THE PROBLEMS OF ROBUST LPC PARAMETRIZATION FOR. Petr Pollak & Pavel Sovka. Czech Technical University of Prague

THE PROBLEMS OF ROBUST LPC PARAMETRIZATION FOR. Petr Pollak & Pavel Sovka. Czech Technical University of Prague THE PROBLEMS OF ROBUST LPC PARAMETRIZATION FOR SPEECH CODING Petr Polla & Pavel Sova Czech Technical University of Prague CVUT FEL K, 66 7 Praha 6, Czech Republic E-mail: polla@noel.feld.cvut.cz Abstract

More information

QM and Angular Momentum

QM and Angular Momentum Chapter 5 QM and Angular Momentum 5. Angular Momentum Operators In your Introductory Quantum Mechanics (QM) course you learned about the basic properties of low spin systems. Here we want to review that

More information

Ridge analysis of mixture response surfaces

Ridge analysis of mixture response surfaces Statistics & Probability Letters 48 (2000) 3 40 Ridge analysis of mixture response surfaces Norman R. Draper a;, Friedrich Pukelsheim b a Department of Statistics, University of Wisconsin, 20 West Dayton

More information

A Novel Approach to the 2D Analytic Signal? Thomas Bulow and Gerald Sommer. Christian{Albrechts{Universitat zu Kiel

A Novel Approach to the 2D Analytic Signal? Thomas Bulow and Gerald Sommer. Christian{Albrechts{Universitat zu Kiel A Novel Approach to the 2D Analytic Signal? Thomas Bulow and Gerald Sommer Christian{Albrechts{Universitat zu Kiel Institute of Computer Science, Cognitive Systems Preuerstrae 1{9, 24105 Kiel Tel:+49 431

More information

Chapter 7 Interconnected Systems and Feedback: Well-Posedness, Stability, and Performance 7. Introduction Feedback control is a powerful approach to o

Chapter 7 Interconnected Systems and Feedback: Well-Posedness, Stability, and Performance 7. Introduction Feedback control is a powerful approach to o Lectures on Dynamic Systems and Control Mohammed Dahleh Munther A. Dahleh George Verghese Department of Electrical Engineering and Computer Science Massachuasetts Institute of Technology c Chapter 7 Interconnected

More information

The Growth of Functions. A Practical Introduction with as Little Theory as possible

The Growth of Functions. A Practical Introduction with as Little Theory as possible The Growth of Functions A Practical Introduction with as Little Theory as possible Complexity of Algorithms (1) Before we talk about the growth of functions and the concept of order, let s discuss why

More information

= w 2. w 1. B j. A j. C + j1j2

= w 2. w 1. B j. A j. C + j1j2 Local Minima and Plateaus in Multilayer Neural Networks Kenji Fukumizu and Shun-ichi Amari Brain Science Institute, RIKEN Hirosawa 2-, Wako, Saitama 35-098, Japan E-mail: ffuku, amarig@brain.riken.go.jp

More information

On the Use of A Priori Knowledge in Adaptive Inverse Control

On the Use of A Priori Knowledge in Adaptive Inverse Control 54 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS PART I: FUNDAMENTAL THEORY AND APPLICATIONS, VOL 47, NO 1, JANUARY 2000 On the Use of A Priori Knowledge in Adaptive Inverse Control August Kaelin, Member,

More information

Linear Algebra and Eigenproblems

Linear Algebra and Eigenproblems Appendix A A Linear Algebra and Eigenproblems A working knowledge of linear algebra is key to understanding many of the issues raised in this work. In particular, many of the discussions of the details

More information

Riccati difference equations to non linear extended Kalman filter constraints

Riccati difference equations to non linear extended Kalman filter constraints International Journal of Scientific & Engineering Research Volume 3, Issue 12, December-2012 1 Riccati difference equations to non linear extended Kalman filter constraints Abstract Elizabeth.S 1 & Jothilakshmi.R

More information

SIMON FRASER UNIVERSITY School of Engineering Science

SIMON FRASER UNIVERSITY School of Engineering Science SIMON FRASER UNIVERSITY School of Engineering Science Course Outline ENSC 810-3 Digital Signal Processing Calendar Description This course covers advanced digital signal processing techniques. The main

More information

Sequential Monte Carlo methods for filtering of unobservable components of multidimensional diffusion Markov processes

Sequential Monte Carlo methods for filtering of unobservable components of multidimensional diffusion Markov processes Sequential Monte Carlo methods for filtering of unobservable components of multidimensional diffusion Markov processes Ellida M. Khazen * 13395 Coppermine Rd. Apartment 410 Herndon VA 20171 USA Abstract

More information

Algorithms to solve block Toeplitz systems and. least-squares problems by transforming to Cauchy-like. matrices

Algorithms to solve block Toeplitz systems and. least-squares problems by transforming to Cauchy-like. matrices Algorithms to solve block Toeplitz systems and least-squares problems by transforming to Cauchy-like matrices K. Gallivan S. Thirumalai P. Van Dooren 1 Introduction Fast algorithms to factor Toeplitz matrices

More information

Boxlets: a Fast Convolution Algorithm for. Signal Processing and Neural Networks. Patrice Y. Simard, Leon Bottou, Patrick Haner and Yann LeCun

Boxlets: a Fast Convolution Algorithm for. Signal Processing and Neural Networks. Patrice Y. Simard, Leon Bottou, Patrick Haner and Yann LeCun Boxlets: a Fast Convolution Algorithm for Signal Processing and Neural Networks Patrice Y. Simard, Leon Bottou, Patrick Haner and Yann LeCun AT&T Labs-Research 100 Schultz Drive, Red Bank, NJ 07701-7033

More information

A new fast algorithm for blind MA-system identication. based on higher order cumulants. K.D. Kammeyer and B. Jelonnek

A new fast algorithm for blind MA-system identication. based on higher order cumulants. K.D. Kammeyer and B. Jelonnek SPIE Advanced Signal Proc: Algorithms, Architectures & Implementations V, San Diego, -9 July 99 A new fast algorithm for blind MA-system identication based on higher order cumulants KD Kammeyer and B Jelonnek

More information

V. Adaptive filtering Widrow-Hopf Learning Rule LMS and Adaline

V. Adaptive filtering Widrow-Hopf Learning Rule LMS and Adaline V. Adaptive filtering Widrow-Hopf Learning Rule LMS and Adaline Goals Introduce Wiener-Hopf (WH) equations Introduce application of the steepest descent method to the WH problem Approximation to the Least

More information

Adaptive Filters. un [ ] yn [ ] w. yn n wun k. - Adaptive filter (FIR): yn n n w nun k. (1) Identification. Unknown System + (2) Inverse modeling

Adaptive Filters. un [ ] yn [ ] w. yn n wun k. - Adaptive filter (FIR): yn n n w nun k. (1) Identification. Unknown System + (2) Inverse modeling Adaptive Filters - Statistical digital signal processing: in many problems of interest, the signals exhibit some inherent variability plus additive noise we use probabilistic laws to model the statistical

More information

6.12 System Identification The Easy Case

6.12 System Identification The Easy Case 252 SYSTEMS 612 System Identification The Easy Case Assume that someone brings you a signal processing system enclosed in a black box The box has two connectors, one marked input and the other output Other

More information

On the Equivariance of the Orientation and the Tensor Field Representation Klas Nordberg Hans Knutsson Gosta Granlund Computer Vision Laboratory, Depa

On the Equivariance of the Orientation and the Tensor Field Representation Klas Nordberg Hans Knutsson Gosta Granlund Computer Vision Laboratory, Depa On the Invariance of the Orientation and the Tensor Field Representation Klas Nordberg Hans Knutsson Gosta Granlund LiTH-ISY-R-530 993-09-08 On the Equivariance of the Orientation and the Tensor Field

More information

Optimization for neural networks

Optimization for neural networks 0 - : Optimization for neural networks Prof. J.C. Kao, UCLA Optimization for neural networks We previously introduced the principle of gradient descent. Now we will discuss specific modifications we make

More information

Essentials of Intermediate Algebra

Essentials of Intermediate Algebra Essentials of Intermediate Algebra BY Tom K. Kim, Ph.D. Peninsula College, WA Randy Anderson, M.S. Peninsula College, WA 9/24/2012 Contents 1 Review 1 2 Rules of Exponents 2 2.1 Multiplying Two Exponentials

More information

Eigenspaces in Recursive Sequences

Eigenspaces in Recursive Sequences Eigenspaces in Recursive Sequences Ben Galin September 5, 005 One of the areas of study in discrete mathematics deals with sequences, in particular, infinite sequences An infinite sequence can be defined

More information

1 Introduction Independent component analysis (ICA) [10] is a statistical technique whose main applications are blind source separation, blind deconvo

1 Introduction Independent component analysis (ICA) [10] is a statistical technique whose main applications are blind source separation, blind deconvo The Fixed-Point Algorithm and Maximum Likelihood Estimation for Independent Component Analysis Aapo Hyvarinen Helsinki University of Technology Laboratory of Computer and Information Science P.O.Box 5400,

More information

Analog Neural Nets with Gaussian or other Common. Noise Distributions cannot Recognize Arbitrary. Regular Languages.

Analog Neural Nets with Gaussian or other Common. Noise Distributions cannot Recognize Arbitrary. Regular Languages. Analog Neural Nets with Gaussian or other Common Noise Distributions cannot Recognize Arbitrary Regular Languages Wolfgang Maass Inst. for Theoretical Computer Science, Technische Universitat Graz Klosterwiesgasse

More information

Linear stochastic approximation driven by slowly varying Markov chains

Linear stochastic approximation driven by slowly varying Markov chains Available online at www.sciencedirect.com Systems & Control Letters 50 2003 95 102 www.elsevier.com/locate/sysconle Linear stochastic approximation driven by slowly varying Marov chains Viay R. Konda,

More information

A vector from the origin to H, V could be expressed using:

A vector from the origin to H, V could be expressed using: Linear Discriminant Function: the linear discriminant function: g(x) = w t x + ω 0 x is the point, w is the weight vector, and ω 0 is the bias (t is the transpose). Two Category Case: In the two category

More information

Adaptive Beamforming Algorithms

Adaptive Beamforming Algorithms S. R. Zinka srinivasa_zinka@daiict.ac.in October 29, 2014 Outline 1 Least Mean Squares 2 Sample Matrix Inversion 3 Recursive Least Squares 4 Accelerated Gradient Approach 5 Conjugate Gradient Method Outline

More information

IMPROVEMENTS IN ACTIVE NOISE CONTROL OF HELICOPTER NOISE IN A MOCK CABIN ABSTRACT

IMPROVEMENTS IN ACTIVE NOISE CONTROL OF HELICOPTER NOISE IN A MOCK CABIN ABSTRACT IMPROVEMENTS IN ACTIVE NOISE CONTROL OF HELICOPTER NOISE IN A MOCK CABIN Jared K. Thomas Brigham Young University Department of Mechanical Engineering ABSTRACT The application of active noise control (ANC)

More information

Discrete quantum random walks

Discrete quantum random walks Quantum Information and Computation: Report Edin Husić edin.husic@ens-lyon.fr Discrete quantum random walks Abstract In this report, we present the ideas behind the notion of quantum random walks. We further

More information

Conjugate Directions for Stochastic Gradient Descent

Conjugate Directions for Stochastic Gradient Descent Conjugate Directions for Stochastic Gradient Descent Nicol N Schraudolph Thore Graepel Institute of Computational Science ETH Zürich, Switzerland {schraudo,graepel}@infethzch Abstract The method of conjugate

More information

Signal Modeling Techniques in Speech Recognition. Hassan A. Kingravi

Signal Modeling Techniques in Speech Recognition. Hassan A. Kingravi Signal Modeling Techniques in Speech Recognition Hassan A. Kingravi Outline Introduction Spectral Shaping Spectral Analysis Parameter Transforms Statistical Modeling Discussion Conclusions 1: Introduction

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

weightchanges alleviates many ofthe above problems. Momentum is an example of an improvement on our simple rst order method that keeps it rst order bu

weightchanges alleviates many ofthe above problems. Momentum is an example of an improvement on our simple rst order method that keeps it rst order bu Levenberg-Marquardt Optimization Sam Roweis Abstract Levenberg-Marquardt Optimization is a virtual standard in nonlinear optimization which signicantly outperforms gradient descent and conjugate gradient

More information

PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION. A Thesis MELTEM APAYDIN

PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION. A Thesis MELTEM APAYDIN PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION A Thesis by MELTEM APAYDIN Submitted to the Office of Graduate and Professional Studies of Texas A&M University in partial fulfillment of the

More information

No. of dimensions 1. No. of centers

No. of dimensions 1. No. of centers Contents 8.6 Course of dimensionality............................ 15 8.7 Computational aspects of linear estimators.................. 15 8.7.1 Diagonalization of circulant andblock-circulant matrices......

More information

Error Empirical error. Generalization error. Time (number of iteration)

Error Empirical error. Generalization error. Time (number of iteration) Submitted to Neural Networks. Dynamics of Batch Learning in Multilayer Networks { Overrealizability and Overtraining { Kenji Fukumizu The Institute of Physical and Chemical Research (RIKEN) E-mail: fuku@brain.riken.go.jp

More information

Institute for Advanced Computer Studies. Department of Computer Science. On Markov Chains with Sluggish Transients. G. W. Stewart y.

Institute for Advanced Computer Studies. Department of Computer Science. On Markov Chains with Sluggish Transients. G. W. Stewart y. University of Maryland Institute for Advanced Computer Studies Department of Computer Science College Park TR{94{77 TR{3306 On Markov Chains with Sluggish Transients G. W. Stewart y June, 994 ABSTRACT

More information