Adaptive Filter Theory


Sung Ho Cho, Hanyang University, Seoul, Korea
(Office) +8--0-0390, (Mobile) +8-10-541-5178, dragon@hanyang.ac.kr

Table of Contents
1. Wiener Filters
2. Gradient Search by the Steepest Descent Method
3. Stochastic Gradient Adaptive Algorithms
4. Recursive Least Squares (RLS) Algorithm

Wiener Filters

Filter Optimization Problem

Wiener Filtering:
- A priori knowledge of the signal statistics, or at least estimates of them, is required.
- Complex and expensive hardware is necessary (particularly in nonstationary environments).

Adaptive Filtering:
- Complete knowledge of the signal statistics is not required.
- The filter weights eventually converge to the optimum Wiener solution for stationary processes.
- The filter weights can track slowly time-varying nonstationary environments.
- Complex and expensive hardware is, in general, not necessary.

Wiener Filters (1/7)

Objectives: We want to design a filter $h_i$ that minimizes the mean-squared estimation error $E\{e^2(n)\}$, so that the estimated signal $\hat{d}(n)$ best approximates the desired signal $d(n)$.

Estimation error signal: $e(n) = d(n) - \hat{d}(n)$

Estimated signal: $\hat{d}(n) = \sum_{i=0}^{N-1} h_i\, x(n-i)$, where $x(n)$ is the reference signal and $h_i$, $0 \le i \le N-1$, are the filter coefficients.

Wiener Filters (2/7)

Basic Structure: a transversal (FIR) filter forms the estimate as a linear combination of the current and past input samples $x(n), x(n-1), \ldots, x(n-N+1)$ (obtained through unit delays $z^{-1}$), weighted by $h_0, h_1, \ldots, h_{N-1}$:

$\hat{d}(n) = \sum_{i=0}^{N-1} h_i\, x(n-i), \qquad e(n) = d(n) - \sum_{i=0}^{N-1} h_i\, x(n-i) = d(n) - H^T X(n)$

Wiener Filters (3/7)

Basic Assumptions:
- $d(n)$ and $x(n)$ are zero-mean.
- $d(n)$ and $x(n)$ are jointly wide-sense stationary.

Notations:
- Filter coefficient vector: $H = [h_0, h_1, \ldots, h_{N-1}]^T$
- Reference input vector: $X(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^T$
- Estimation error signal: $e(n) = d(n) - \sum_{i=0}^{N-1} h_i\, x(n-i) = d(n) - H^T X(n)$
- Autocorrelation matrix: $R_{XX} = E\{X(n)\, X^T(n)\}$
- Cross-correlation vector: $R_{dX} = E\{d(n)\, X(n)\}$
- Optimum filter coefficient vector: $H_{opt} = [h_{0,opt}, h_{1,opt}, \ldots, h_{N-1,opt}]^T$

Wiener Filters (4/7)

Performance Measure (Cost Function):

$\xi = E\{e^2(n)\} = E\{(d(n) - H^T X(n))^2\} = E\{d^2(n)\} - 2 H^T R_{dX} + H^T R_{XX} H$

We now want to minimize $\xi$ with respect to $H$:

$\dfrac{\partial \xi}{\partial H} = -2 R_{dX} + 2 R_{XX} H = 0$

Wiener-Hopf Solution (1931):

$R_{XX}\, H_{opt} = R_{dX} \quad\Longrightarrow\quad H_{opt} = R_{XX}^{-1}\, R_{dX}$
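
To make the Wiener-Hopf solution concrete, the sketch below estimates $R_{XX}$ and $R_{dX}$ by time averaging over a finite data record and then solves $R_{XX} H_{opt} = R_{dX}$ numerically. The FIR "unknown system", the noise level, and all other parameter values are illustrative assumptions, not taken from these notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (assumed): d(n) is the output of an unknown FIR system
# driven by the zero-mean reference x(n), plus a small observation noise.
N = 4                                   # adaptive filter length
h_true = np.array([0.6, -0.3, 0.2, 0.1])
M = 10000                               # number of samples
x = rng.standard_normal(M)              # reference input x(n)
d = np.convolve(x, h_true)[:M] + 0.01 * rng.standard_normal(M)

# Reference input vectors X(n) = [x(n), x(n-1), ..., x(n-N+1)]^T
X = np.array([x[n - N + 1:n + 1][::-1] for n in range(N - 1, M)])
d_vec = d[N - 1:M]

# Time-average estimates of R_XX = E{X X^T} and R_dX = E{d X}
R_XX = X.T @ X / len(d_vec)
R_dX = X.T @ d_vec / len(d_vec)

# Wiener-Hopf solution: R_XX H_opt = R_dX
H_opt = np.linalg.solve(R_XX, R_dX)
print("H_opt ≈", np.round(H_opt, 3))    # should be close to h_true
```

Note that the linear system is solved directly rather than forming $R_{XX}^{-1}$ explicitly, which is the numerically preferable way to evaluate $H_{opt} = R_{XX}^{-1} R_{dX}$.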

Wiener Filters (5/7)

Autocorrelation Matrix $R_{XX}$:

$R_{XX} = E\{X(n)\, X^T(n)\} =
\begin{bmatrix}
r_{xx}(0) & r_{xx}(1) & \cdots & r_{xx}(N-1) \\
r_{xx}(1) & r_{xx}(0) & \cdots & r_{xx}(N-2) \\
\vdots & \vdots & \ddots & \vdots \\
r_{xx}(N-1) & r_{xx}(N-2) & \cdots & r_{xx}(0)
\end{bmatrix}$

$R_{XX}$ is symmetric and Toeplitz.

Is $R_{XX}$ invertible? Yes, almost always: $R_{XX}$ is almost always a positive definite matrix. A symmetric matrix $A$ is called positive definite if $x^T A x > 0$ for every nonzero $x$; then all the eigenvalues of $A$ are positive, and the determinant of every principal submatrix of $A$ is positive. Since the determinant of $A$ is not zero, $A$ is invertible.
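
A minimal numerical check of these claims, assuming an illustrative white reference signal: build the symmetric Toeplitz autocorrelation matrix from estimated lags and verify that all of its eigenvalues are positive.

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(1)
x = rng.standard_normal(5000)                 # illustrative zero-mean reference signal
N = 4

# Sample autocorrelation lags r_xx(0), ..., r_xx(N-1)
r = np.array([np.mean(x[k:] * x[:len(x) - k]) for k in range(N)])
R_XX = toeplitz(r)                            # symmetric Toeplitz autocorrelation matrix

eigvals = np.linalg.eigvalsh(R_XX)
print("eigenvalues:", np.round(eigvals, 3))   # all positive => positive definite => invertible
```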

Wiener Filters (6/7)

Let $X^B(n)$ denote the vector obtained by rearranging the elements of $X(n)$ backward, i.e.,

$X^B(n) = [x(n-N+1), x(n-N+2), \ldots, x(n)]^T$

Then $E\{X^B(n)\, X^{B\,T}(n)\} = R_{XX}$.

Cross-correlation Vector $R_{dX}$:

$R_{dX} = E\{d(n)\, X(n)\} = [r_{dX}(0), r_{dX}(1), \ldots, r_{dX}(N-1)]^T$

Minimum Estimation Error:

$e_{min}(n) = d(n) - H_{opt}^T X(n) = d(n) - X^T(n)\, H_{opt}$

Wiener Filters (7/7)

Minimum Mean-Squared Estimation Error:

$\xi_{min} = E\{e_{min}^2(n)\} = E\{(d(n) - H_{opt}^T X(n))^2\} = E\{d^2(n)\} - H_{opt}^T R_{dX} = E\{d^2(n)\} - H_{opt}^T R_{XX} H_{opt}$

Example (error surfaces): for $N = 1$ the cost $\xi$ is a parabola in $h_0$ with minimum $\xi_{min}$ at $h_{0,opt}$; for $N = 2$ it is a paraboloid over $(h_0, h_1)$ with minimum $\xi_{min}$ at $(h_{0,opt}, h_{1,opt})$.

Orthogonality Principle:

Geometrically, the estimate $\hat{d}(n) = \sum_{i=0}^{N-1} h_i\, x(n-i)$ lies in the plane $M$ spanned by the elements of $X(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^T$. At the optimum, the minimum error $e_{min}(n)$ is orthogonal to the plane $M$:

$E\{e_{min}(n)\, X(n)\} = 0_N$

With $\theta$ denoting the angle between $d(n)$ and the plane $M$, perfect estimation is possible if $\theta = 0$, and the estimation fails if $\theta = \pi/2$.

Some Drawbacks of the Wiener Filter:
- The signal statistics must be known a priori: we must know $R_{XX}$ and $R_{dX}$, or at least their estimates.
- A matrix inversion operation is required: heavy computational load, not well suited to real-time applications.
- The situation gets worse in nonstationary environments: we would have to compute $R_{XX}(n)$ and $R_{dX}(n)$, and perform the matrix inversion, at every time instant $n$.

Gradient Search by the Steepest Descent Method

Steepest Descent Method (1/5)

Objectives: We want to design the filter coefficients $h_i(n)$, $0 \le i \le N-1$, in a recursive form, in order to avoid the matrix inversion operation required in the Wiener solution.

$e(n) = d(n) - \hat{d}(n), \qquad \hat{d}(n) = \sum_{i=0}^{N-1} h_i(n)\, x(n-i)$

Steepest Descent Method (2/5)

Basic Structure: the same transversal filter as before, but now with time-varying coefficients $h_0(n), h_1(n), \ldots, h_{N-1}(n)$ applied to $x(n), x(n-1), \ldots, x(n-N+1)$:

$e(n) = d(n) - \sum_{i=0}^{N-1} h_i(n)\, x(n-i) = d(n) - H^T(n)\, X(n)$

Steepest Descent Method (3/5)

Basic Assumptions:
- $d(n)$ and $x(n)$ are zero-mean.
- $d(n)$ and $x(n)$ are jointly wide-sense stationary.

Notations:
- Filter coefficient vector: $H(n) = [h_0(n), h_1(n), \ldots, h_{N-1}(n)]^T$
- Reference input vector: $X(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^T$
- Estimation error signal: $e(n) = d(n) - \sum_{i=0}^{N-1} h_i(n)\, x(n-i) = d(n) - H^T(n)\, X(n)$
- Autocorrelation matrix: $R_{XX} = E\{X(n)\, X^T(n)\}$
- Cross-correlation vector: $R_{dX} = E\{d(n)\, X(n)\}$
- Optimum filter coefficient vector: $H_{opt} = [h_{0,opt}, h_{1,opt}, \ldots, h_{N-1,opt}]^T$

Steepest Descent Method (4/5)

The filter coefficient vector at time $n+1$ equals the coefficient vector at time $n$ plus a change proportional to the negative gradient of the mean-squared error:

$H(n+1) = H(n) - \tfrac{1}{2}\,\mu\, \nabla_{H(n)}(n), \qquad \mu = \text{adaptation step-size}$

where $H(n) = [h_0(n), h_1(n), \ldots, h_{N-1}(n)]^T$.

Performance Measure (Cost Function):

$\xi(n) = E\{e^2(n)\} = E\{d^2(n)\} - 2 H^T(n) R_{dX} + H^T(n) R_{XX} H(n)$

Steepest Descent Method (5/5)

The Gradient of the Mean-Squared Error:

$\nabla_{H(n)}(n) = \dfrac{\partial \xi(n)}{\partial H(n)} = -2 R_{dX} + 2 R_{XX} H(n)$

Therefore, the recursive update equation for the coefficient vector becomes

$H(n+1) = [I_N - \mu R_{XX}]\, H(n) + \mu R_{dX}$

Misalignment Vector: $V(n) = H(n) - H_{opt}$, which satisfies

$V(n+1) = [I_N - \mu R_{XX}]\, V(n)$
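
The following sketch iterates the update $H(n+1) = [I_N - \mu R_{XX}] H(n) + \mu R_{dX}$ for an assumed pair $(R_{XX}, R_{dX})$, showing convergence to the Wiener solution without any matrix inversion; the numerical values are purely illustrative.

```python
import numpy as np

# Minimal steepest-descent sketch (illustrative statistics assumed throughout):
# iterate H(n+1) = H(n) + mu * (R_dX - R_XX @ H(n)) using known statistics.
R_XX = np.array([[1.0, 0.5],
                 [0.5, 1.0]])            # assumed autocorrelation matrix
R_dX = np.array([0.9, 0.2])              # assumed cross-correlation vector
H_opt = np.linalg.solve(R_XX, R_dX)      # Wiener solution, for comparison only

mu = 0.1                                 # must satisfy 0 < mu < 2 / lambda_max
H = np.zeros(2)
for n in range(200):
    H = H + mu * (R_dX - R_XX @ H)       # H(n+1) = [I - mu R_XX] H(n) + mu R_dX

print("steepest descent:", np.round(H, 4))
print("Wiener solution: ", np.round(H_opt, 4))
```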

Convergence of Steepest Descent Method (1/2)

Convergence (or Stability) Condition:

$|1 - \mu \lambda_i| < 1 \;\Longleftrightarrow\; 0 < \mu < \dfrac{2}{\lambda_i}, \;\forall i \;\Longleftrightarrow\; 0 < \mu < \dfrac{2}{\lambda_{max}}$

($\lambda_i$ = the $i$-th eigenvalue of $R_{XX}$.)

Convergence is slow if the eigenvalue spread $\lambda_{max}/\lambda_{min}$ is large.

Convergence of Steepest Descent Method (2/2)

Time Constant: the convergence behavior of the $i$-th element of the misalignment vector is

$v_i(n+1) = (1 - \mu \lambda_i)\, v_i(n) \quad\Longrightarrow\quad v_i(n) = (1 - \mu \lambda_i)^n\, v_i(0)$

The time constant for the $i$-th element of the misalignment vector follows from $1 - \mu\lambda_i = \exp(-1/\tau_i)$:

$\tau_i = \dfrac{-1}{\ln(1 - \mu \lambda_i)} \approx \dfrac{1}{\mu \lambda_i} \;\text{(samples)} \quad\text{for } \mu \lambda_i \ll 1$

Steady-State Value: $H(\infty) = H_{opt}$, or equivalently $V(\infty) = 0_N$.

We still need a priori knowledge of the signal statistics.
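
A short numerical illustration of these formulas, with assumed eigenvalues $\lambda_{max} = 2$, $\lambda_{min} = 0.1$ and step size $\mu = 0.05$ (values chosen only for illustration):

```latex
\[
  0 < \mu < \frac{2}{\lambda_{max}} = 1, \qquad
  \tau_{\lambda_{max}} \approx \frac{1}{\mu\,\lambda_{max}} = \frac{1}{0.05 \times 2} = 10 \ \text{samples}, \qquad
  \tau_{\lambda_{min}} \approx \frac{1}{\mu\,\lambda_{min}} = \frac{1}{0.05 \times 0.1} = 200 \ \text{samples}.
\]
% The slowest mode (associated with lambda_min) dominates, which is why a large
% eigenvalue spread lambda_max / lambda_min makes steepest descent converge slowly.
```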

Stochastic Gradient Adaptive Algorithms

Stochastic Gradient Adaptive Filters

Motivations:
- No a priori information about the signal statistics
- No matrix inversion
- Tracking capability
- Self-designing (recursive method): the filter gradually learns the required correlation of the input signals and adjusts its coefficient vector recursively according to some suitably chosen instantaneous error criterion.

Evaluation Criteria:
- Rate of convergence
- Misadjustment (deviation from the optimum solution)
- Robustness to ill-conditioned data
- Computational cost
- Hardware implementation cost
- Numerical problems

Applications of Stochastic Gradient Adaptive Filters (1/2)

System Identification: the adaptive filter and the unknown system are driven by the same input $x(n)$; the desired signal $d(n)$ is the system output corrupted by measurement noise $\xi(n)$, and the adaptive filter minimizes $e(n) = d(n) - \hat{d}(n)$ so as to model the unknown system.

Adaptive Prediction: the reference input is a delayed version of the desired signal, $x(n) = d(n-\Delta)$ (obtained through a delay $z^{-\Delta}$), and the adaptive filter predicts the current sample $d(n)$ from its past.

Applications of Stochastic Gradient Adaptive Filters (2/2)

Noise Cancellation: the primary signal is $d(n) = y(n) + \xi(n)$, where $\xi(n)$ is the noise; the reference input $x(n)$ is correlated with $\xi(n)$ but not with $y(n)$. The adaptive filter produces the noise estimate $\hat{\xi}(n)$, and the error $e(n) = d(n) - \hat{\xi}(n)$ is the enhanced signal.

Inverse Filtering: a training signal (TX) is sent through an unknown channel; the received signal $x(n)$, corrupted by noise $\xi(n)$, drives the adaptive filter, while a delayed copy ($z^{-\Delta}$) of the training signal (RX) serves as the desired response $d(n)$. The adaptive filter approximates the inverse of the channel.

Classification of Adaptive Filters

System Identification: system identification, layered earth modeling
Adaptive Prediction: linear predictive coding, autoregressive spectral analysis, ADPCM
Noise Cancellation: adaptive noise cancellation, adaptive echo cancellation, active noise control, adaptive beamforming
Inverse Filtering: adaptive equalization, deconvolution, blind equalization

Stochastic Gradient Adaptive Algorithms (1/6)

An adaptive algorithm adjusts the coefficients $h_i(n)$, $0 \le i \le N-1$, of the filter

$\hat{d}(n) = \sum_{i=0}^{N-1} h_i(n)\, x(n-i), \qquad e(n) = d(n) - \hat{d}(n) = d(n) - H^T(n)\, X(n)$

by a stochastic gradient update of the general form

$H(n+1) = H(n) - \dfrac{\mu}{\alpha}\, \nabla_{H(n)}(n), \qquad \nabla_{H(n)}(n) = \dfrac{\partial\, |e(n)|^{\alpha}}{\partial H(n)}$

Various forms result according to the choice of the performance measure (the exponent $\alpha$). If there is no correlation between $d(n)$ and $x(n)$, then no estimation can be made.

Stochastic Gradient Adaptive Algorithms (2/6)

Notations:
- Filter coefficient vector: $H(n) = [h_0(n), h_1(n), \ldots, h_{N-1}(n)]^T$
- Reference input vector: $X(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^T$
- Estimation error signal: $e(n) = d(n) - \sum_{i=0}^{N-1} h_i(n)\, x(n-i) = d(n) - H^T(n)\, X(n)$
- Autocorrelation matrix: $R_{XX} = E\{X(n)\, X^T(n)\}$
- Cross-correlation vector: $R_{dX} = E\{d(n)\, X(n)\}$
- Optimum filter coefficient vector: $H_{opt} = [h_{0,opt}, h_{1,opt}, \ldots, h_{N-1,opt}]^T$
- Misalignment vector: $V(n) = H(n) - H_{opt}$
- Covariance matrix of the misalignment vector: $K(n) = E\{V(n)\, V^T(n)\}$

Stochastic Gradient Adaptive Algorithms (3/6)

Sign Algorithm ($\alpha = 1$): the sign algorithm tries to minimize the instantaneous absolute error value at each iteration.

$e(n) = d(n) - H^T(n)\, X(n), \qquad \nabla_{H(n)}(n) = \dfrac{\partial\, |e(n)|}{\partial H(n)}$

Filter Coefficient Update:

$H(n+1) = H(n) + \mu\, X(n)\, \mathrm{sign}\{e(n)\}, \qquad \mathrm{sign}\{e(n)\} = \begin{cases} 1, & e(n) \ge 0 \\ -1, & e(n) < 0 \end{cases}$

Stochastic Gradient Adaptive Algorithms (4/6)

Least Mean Square (LMS) Algorithm ($\alpha = 2$): the LMS algorithm tries to minimize the instantaneous squared error value at each iteration.

$e(n) = d(n) - H^T(n)\, X(n), \qquad \nabla_{H(n)}(n) = \dfrac{\partial\, e^2(n)}{\partial H(n)}$

Filter Coefficient Update:

$H(n+1) = H(n) + \mu\, X(n)\, e(n)$
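
A minimal LMS sketch in a system-identification setting may help; the unknown system, step size, and noise level are assumed for illustration and are not values taken from these notes.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative LMS system-identification sketch (all parameter values assumed):
N = 4
h_true = np.array([0.6, -0.3, 0.2, 0.1])   # "unknown system" to be identified
mu = 0.02                                  # adaptation step-size
M = 5000

x = rng.standard_normal(M)                                      # reference input x(n)
d = np.convolve(x, h_true)[:M] + 0.01 * rng.standard_normal(M)  # desired signal d(n)

H = np.zeros(N)                            # H(0) = 0
for n in range(N - 1, M):
    X = x[n - N + 1:n + 1][::-1]           # X(n) = [x(n), ..., x(n-N+1)]^T
    e = d[n] - H @ X                       # e(n) = d(n) - H^T(n) X(n)
    H = H + mu * X * e                     # LMS update: H(n+1) = H(n) + mu X(n) e(n)

print("estimated H:", np.round(H, 3))      # should approach h_true
```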

Stochastic Gradient Adaptive Algorithms (5/6)

Least Mean Absolute Third (LMAT) Algorithm ($\alpha = 3$): the LMAT algorithm tries to minimize the instantaneous absolute error value raised to the third power at each iteration.

$e(n) = d(n) - H^T(n)\, X(n), \qquad \nabla_{H(n)}(n) = \dfrac{\partial\, |e(n)|^3}{\partial H(n)}$

Filter Coefficient Update:

$H(n+1) = H(n) + \mu\, X(n)\, e^2(n)\, \mathrm{sign}\{e(n)\}$

Stochastic Gradient Adaptive Algorithms (6/6)

Least Mean Fourth (LMF) Algorithm ($\alpha = 4$): the LMF algorithm tries to minimize the instantaneous error value raised to the fourth power at each iteration.

$e(n) = d(n) - H^T(n)\, X(n), \qquad \nabla_{H(n)}(n) = \dfrac{\partial\, e^4(n)}{\partial H(n)}$

Filter Coefficient Update:

$H(n+1) = H(n) + \mu\, X(n)\, e^3(n)$
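
Since the four updates above differ only in the error nonlinearity $|e(n)|^{\alpha-1}\,\mathrm{sign}\{e(n)\}$, they can be written as one parametrized update. The helper below is a sketch of that observation; its name and interface are mine, not from the notes.

```python
import numpy as np

def stochastic_gradient_update(H, X, d, mu, alpha):
    """One coefficient update H(n) -> H(n+1) for the |e|^alpha family.

    alpha = 1: sign algorithm, 2: LMS, 3: LMAT, 4: LMF.
    (Illustrative helper; the name and interface are not from the notes.)
    """
    e = d - H @ X                                  # e(n) = d(n) - H^T(n) X(n)
    g = np.abs(e) ** (alpha - 1) * np.sign(e)      # error nonlinearity |e|^(alpha-1) sign(e)
    return H + mu * X * g                          # H(n+1) = H(n) + mu X(n) |e|^(alpha-1) sign(e)
```

Calling this helper with alpha = 2 inside a sample-by-sample loop reproduces the LMS iteration of the previous sketch; alpha = 1, 3, and 4 give the sign, LMAT, and LMF updates respectively.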

Convergence of the Adaptive Algorithms (1/2)

Basically, we need to know the mean and the mean-squared behavior of the algorithms.

For the analysis of the statistical mean behavior, we want to find a set of statistical difference equations that characterizes $E\{H(n)\}$ or $E\{V(n)\}$. We also need to check:
- Stability conditions
- Convergence speed
- Unbiased estimation capability

For the analysis of the statistical mean-squared behavior, we want to find a set of statistical difference equations that characterizes $\sigma_e^2(n) = E\{e^2(n)\}$ and $K(n) = E\{V(n)\, V^T(n)\}$. We also need to check:
- Stability conditions
- Convergence speed
- Estimation precision

Convergence of the Adaptive Algorithms (2/2)

Basic Assumptions for the Convergence Analysis:
- The input signals $d(n)$ and $x(n)$ are zero-mean, jointly wide-sense stationary, and jointly Gaussian with finite variances. A consequence of this assumption is that the estimation error $e(n) = d(n) - H^T(n) X(n)$ is also zero-mean and Gaussian when conditioned on the coefficient vector $H(n)$.
- Independence Assumption: the input pair $\{d(n), X(n)\}$ at time $n$ is independent of $\{d(k), X(k)\}$ at time $k$, if $n$ is not equal to $k$. This assumption is seldom true in practice, but it is valid when the step-size $\mu$ is chosen to be sufficiently small. One direct consequence of the independence assumption is that the coefficient vector $H(n)$ is uncorrelated with the input pair $\{d(n), X(n)\}$, since $H(n)$ depends only on inputs at time $n-1$ and before.

Sign Algorithm (1/2)

Mean Behavior:

$E\{H(n+1)\} = \left[ I_N - \mu \sqrt{\tfrac{2}{\pi \sigma_e^2(n)}}\, R_{XX} \right] E\{H(n)\} + \mu \sqrt{\tfrac{2}{\pi \sigma_e^2(n)}}\, R_{dX}$

$E\{V(n+1)\} = \left[ I_N - \mu \sqrt{\tfrac{2}{\pi \sigma_e^2(n)}}\, R_{XX} \right] E\{V(n)\}$

Mean-Squared Behavior:

$\sigma_e^2(n) = \xi_{min} + \mathrm{tr}\{K(n)\, R_{XX}\}$

$K(n+1) = K(n) - \mu \sqrt{\tfrac{2}{\pi \sigma_e^2(n)}} \left[ K(n) R_{XX} + R_{XX} K(n) \right] + \mu^2 R_{XX}$

Sign Algorithm (2/2)

Steady-State Mean-Squared Estimation Error:

$\sigma_e^2(\infty) \approx \xi_{min} + \dfrac{\mu}{2} \sqrt{\dfrac{\pi\, \xi_{min}}{2}}\; \mathrm{tr}\{R_{XX}\}$

Convergence Condition (Weak Convergence): the long-term time-average of the mean absolute error is bounded for any positive value of $\mu$.

Very robust, but slow.

LMS Algorithm (1/2)

Mean Behavior:

$E\{H(n+1)\} = [I_N - \mu R_{XX}]\, E\{H(n)\} + \mu R_{dX}$

$E\{V(n+1)\} = [I_N - \mu R_{XX}]\, E\{V(n)\}$

Mean-Squared Behavior:

$\sigma_e^2(n) = \xi_{min} + \mathrm{tr}\{K(n)\, R_{XX}\}$

$K(n+1) = K(n) - \mu \left[ K(n) R_{XX} + R_{XX} K(n) \right] + \mu^2 \left[ \sigma_e^2(n)\, I_N + 2\, R_{XX} K(n) \right] R_{XX}$

LMS Algorithm (2/2)

Steady-State Mean-Squared Estimation Error:

$\sigma_e^2(\infty) \approx \xi_{min} + \dfrac{\mu}{2}\, \xi_{min}\, \mathrm{tr}\{R_{XX}\}$

Mean Convergence: $0 < \mu < \dfrac{2}{\lambda_{max}}$

Mean-Squared Convergence: $0 < \mu < \dfrac{2}{3\, \mathrm{tr}\{R_{XX}\}}$

If $\mu_{LMS} = \mu_{sign} \sqrt{\dfrac{\pi}{2\, \xi_{min}}}$, then $\sigma_e^2(\infty)_{LMS} \approx \sigma_e^2(\infty)_{sign}$.

The convergence of the algorithm strongly depends on the input signal statistics.

LMAT Algorithm (1/2)

Mean Behavior:

$E\{H(n+1)\} = \left[ I_N - 2\mu \sqrt{\tfrac{2}{\pi}}\, \sigma_e(n)\, R_{XX} \right] E\{H(n)\} + 2\mu \sqrt{\tfrac{2}{\pi}}\, \sigma_e(n)\, R_{dX}$

$E\{V(n+1)\} = \left[ I_N - 2\mu \sqrt{\tfrac{2}{\pi}}\, \sigma_e(n)\, R_{XX} \right] E\{V(n)\}$

Mean-Squared Behavior:

$\sigma_e^2(n) = \xi_{min} + \mathrm{tr}\{K(n)\, R_{XX}\}$

$K(n+1) = K(n) - 2\mu \sqrt{\tfrac{2}{\pi}}\, \sigma_e(n) \left[ K(n) R_{XX} + R_{XX} K(n) \right] + 3\mu^2 \sigma_e^2(n) \left[ \sigma_e^2(n)\, I_N + 4\, R_{XX} K(n) \right] R_{XX}$

LMAT Algorithm (2/2)

Steady-State Mean-Squared Estimation Error:

$\sigma_e^2(\infty) \approx \xi_{min} + \dfrac{3\mu}{4} \sqrt{\dfrac{\pi\, \xi_{min}}{2}}\; \xi_{min}\, \mathrm{tr}\{R_{XX}\}$

Mean Convergence:

$0 < \mu < \dfrac{1}{\lambda_{max}\, \sigma_e(n)} \sqrt{\dfrac{\pi}{2}}, \quad \forall n$

Very fast, but one must be careful: the convergence of the LMAT algorithm depends on the initial choice of the coefficient vector.

If $\mu_{LMAT} = \mu_{LMS}\, \dfrac{2}{3} \sqrt{\dfrac{2}{\pi\, \xi_{min}}}$, then $\sigma_e^2(\infty)_{LMAT} \approx \sigma_e^2(\infty)_{LMS}$.

LMF Algorithm (1/2)

Mean Behavior:

$E\{H(n+1)\} = [I_N - 3\mu\, \sigma_e^2(n)\, R_{XX}]\, E\{H(n)\} + 3\mu\, \sigma_e^2(n)\, R_{dX}$

$E\{V(n+1)\} = [I_N - 3\mu\, \sigma_e^2(n)\, R_{XX}]\, E\{V(n)\}$

Mean-Squared Behavior:

$\sigma_e^2(n) = \xi_{min} + \mathrm{tr}\{K(n)\, R_{XX}\}$

$K(n+1) = K(n) - 3\mu\, \sigma_e^2(n) \left[ K(n) R_{XX} + R_{XX} K(n) \right] + 15\mu^2\, \sigma_e^4(n) \left[ \sigma_e^2(n)\, I_N + 6\, R_{XX} K(n) \right] R_{XX}$

LMF Algorithm (2/2)

Steady-State Mean-Squared Estimation Error: ?

Mean Convergence:

$0 < \mu < \dfrac{2}{3\, \lambda_{max}\, \sigma_e^2(n)}, \quad \forall n$

Very fast, but one must also be careful: the convergence of the LMF algorithm also depends on the initial choice of the coefficient vector.

Further Observations (1/2)

Misadjustment: $M \triangleq \dfrac{\xi_{ex}(\infty)}{\xi_{min}}$ (excess steady-state MSE relative to $\xi_{min}$)

Sign Algorithm: $M \approx \dfrac{\mu}{2} \sqrt{\dfrac{\pi}{2\, \xi_{min}}}\; \mathrm{tr}\{R_{XX}\}$

LMS Algorithm: $M \approx \dfrac{\mu}{2}\, \mathrm{tr}\{R_{XX}\}$

LMAT Algorithm: $M \approx \dfrac{3\mu}{4} \sqrt{\dfrac{\pi\, \xi_{min}}{2}}\; \mathrm{tr}\{R_{XX}\}$

LMF Algorithm: ?
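
A quick numerical illustration of the LMS misadjustment formula above, with assumed values (say $N = 8$ taps and a unit-power white input, so $\mathrm{tr}\{R_{XX}\} = N \sigma_x^2 = 8$, and $\mu = 0.01$):

```latex
\[
  M \;\approx\; \frac{\mu}{2}\,\operatorname{tr}\{R_{XX}\}
    \;=\; \frac{0.01}{2} \times 8 \;=\; 0.04,
\]
% i.e., the steady-state MSE sits about 4% above the minimum xi_min.
% Halving mu halves M, but roughly doubles the convergence time constants.
```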

Further Observations (2/2)

The misadjustment $M$ increases with the filter order $N$.
The misadjustment $M$ is directly proportional to $\mu$.
The convergence speed is inversely proportional to $\mu$.

Convergence Speed: (Fast) LMAT > LMF > LMS > Sign (Slow)
Robustness (or Stability): (Good) Sign > LMS > LMAT > LMF (Bad)

Example: System Identification Mode (1/6)

Setup: the adaptive filter identifies an unknown FIR system driven by the reference input $x(n)$; the desired signal is the system output corrupted by measurement noise $\xi(n)$, and $e(n)$ is the identification error.

$H_{opt} = [0.1,\ 0.3,\ 0.5,\ 0.7,\ 0.5,\ 0.3,\ 0.1]^T$

Example: System Identification Mode (2/6)

Two Sets of Reference Inputs:

CASE 1 (eigenvalue spread ratio = 5.3):
$x_1(n) = \zeta(n) + 0.9\, x_1(n-1) - 0.1\, x_1(n-2) - 0.2\, x_1(n-3)$

CASE 2 (eigenvalue spread ratio = 185.8):
$x_2(n) = \zeta(n) + 1.5\, x_2(n-1) - x_2(n-2) - 0.5\, x_2(n-3)$

Measurement noise $\zeta(n)$: white Gaussian process.

Convergence parameter $\mu$: Sign 0.00016, LMS 0.00, LMAT 0.011, LMF 0.00
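
The eigenvalue spread ratio quoted above can be estimated empirically from a long realization of the AR process. The sketch below does this for CASE 1 using the coefficients as recovered from the slide text; treat both the helper and the coefficient values as illustrative.

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.signal import lfilter

def eigenvalue_spread(ar_coeffs, N, M=200000, seed=0):
    """Empirical eigenvalue spread ratio lambda_max/lambda_min of the N x N
    autocorrelation matrix R_XX of an AR process x(n) = sum_k a_k x(n-k) + zeta(n).
    (Illustrative helper; name and interface are not from the notes.)"""
    rng = np.random.default_rng(seed)
    zeta = rng.standard_normal(M)                        # white Gaussian driving noise
    a = np.concatenate(([1.0], -np.asarray(ar_coeffs)))  # AR polynomial 1 - a_1 z^-1 - ...
    x = lfilter([1.0], a, zeta)[M // 10:]                # discard the initial transient
    r = [np.mean(x[k:] * x[:len(x) - k]) for k in range(N)]
    lam = np.linalg.eigvalsh(toeplitz(r))
    return lam.max() / lam.min()

# CASE 1 as reconstructed above, with N = 7 taps as in H_opt:
print(eigenvalue_spread([0.9, -0.1, -0.2], N=7))
```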

Example: System Identification Mode (3/6)

CASE 1 (eigenvalue spread ratio = 5.3): [Figure: MSE in dB versus number of iterations for the four algorithms (1: LMAT, 2: LMS, 3: LMF, 4: Sign) — mean-squared behavior of the coefficients.]

Example: System Identification Mode (4/6) 47 0.16 0.1 4 1 : LM A : LM S 3 : LM F 4 : S IG N E( h1(n)) 0.08 1 3 0.0 4 0.00 0 4 0 0 0 8 0 0 0 1 0 0 0 1 6 0 0 0 0 0 0 0 # of Iteration Mean Behavior of the Coefficients

Example: System Identification Mode (5/6)

CASE 2 (eigenvalue spread ratio = 185.8): [Figure: MSE in dB versus number of iterations for the four algorithms (1: LMAT, 2: LMS, 3: LMF, 4: Sign) — mean-squared behavior of the coefficients.]

Example: System Identification Mode (6/6)

CASE 2 (continued): [Figure: $E\{h_1(n)\}$ versus number of iterations for the four algorithms (1: LMAT, 2: LMS, 3: LMF, 4: Sign) — mean behavior of the coefficients.]

Other Algorithms (1/2)

Signed Regressor Algorithm:
$H(n+1) = H(n) + \mu\, \mathrm{sign}\{X(n)\}\, e(n)$

Sign-Sign Algorithm:
$H(n+1) = H(n) + \mu\, \mathrm{sign}\{X(n)\}\, \mathrm{sign}\{e(n)\}$

Normalized LMS Algorithm:
$H(n+1) = H(n) + \dfrac{\mu}{X^T(n)\, X(n)}\, X(n)\, e(n)$

Complex LMS Algorithm:
$H(n+1) = H(n) + \mu\, X^*(n)\, e(n)$
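
As a small illustration of the normalized LMS update above, the helper below adds a tiny regularization constant to the denominator, a common practical safeguard that is not part of the notes; the function name and interface are mine.

```python
import numpy as np

def nlms_update(H, X, d, mu=0.5, eps=1e-8):
    """One normalized-LMS update H(n) -> H(n+1).

    eps guards against division by zero when X(n) is (nearly) zero;
    it is a common practical addition, not part of the original notes."""
    e = d - H @ X                           # e(n) = d(n) - H^T(n) X(n)
    H_next = H + (mu / (X @ X + eps)) * X * e
    return H_next, e
```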

Other Algorithms (2/2)

Hybrid Algorithm #1 (LMS + LMF): the cost mixes the squared and fourth-power errors,

$\nabla_{H(n)}(n) = \dfrac{\partial \{\varphi\, e^2(n) + (1-\varphi)\, e^4(n)\}}{\partial H(n)}, \qquad 0 \le \varphi \le 1$

$H(n+1) = H(n) + \mu \left\{ \varphi\, X(n)\, e(n) + 2(1-\varphi)\, X(n)\, e^3(n) \right\}$

Hybrid Algorithm #2 (Sign + LMAT): the cost mixes the absolute and absolute-cubed errors,

$\nabla_{H(n)}(n) = \dfrac{\partial \{\varphi\, |e(n)| + (1-\varphi)\, |e(n)|^3\}}{\partial H(n)}, \qquad 0 \le \varphi \le 1$

$H(n+1) = H(n) + \mu \left\{ \varphi\, X(n) + 3(1-\varphi)\, X(n)\, e^2(n) \right\} \mathrm{sign}\{e(n)\}$

Recursive Least Squares (RLS) Algorithm

RLS Algorithm (1/5)

Cost Function:

$\varepsilon(n) = \sum_{i=1}^{n} \beta(n, i)\, e^2(i)$

where $n$ is the length of the observable data and the error signal at time instant $i$ is $e(i) = d(i) - H^T(n)\, X(i)$; the coefficient vector $H(n)$ remains fixed during the observation interval $1 \le i \le n$.

Weighting: $0 < \beta(n, i) \le 1$ (normally $\beta(n, i) = \lambda^{n-i}$, where $\lambda$ is the forgetting factor).

By the method of exponentially weighted least squares, we want to minimize

$\varepsilon(n) = \sum_{i=1}^{n} \lambda^{n-i}\, e^2(i)$

Very fast, but computationally very complex. The algorithm is useful when the number of taps required is small.

RLS Algorithm (2/5)

Normal Equation: $\Phi(n)\, H(n) = \Theta(n)$, where

$\Phi(n) = \sum_{i=1}^{n} \lambda^{n-i}\, X(i)\, X^T(i), \qquad \Theta(n) = \sum_{i=1}^{n} \lambda^{n-i}\, d(i)\, X(i)$

We can write

$\Phi(n) = \lambda \sum_{i=1}^{n-1} \lambda^{n-1-i}\, X(i)\, X^T(i) + X(n)\, X^T(n) = \lambda\, \Phi(n-1) + X(n)\, X^T(n)$

$\Theta(n) = \lambda\, \Theta(n-1) + d(n)\, X(n)$

Do we need a matrix inversion? No!

RLS Algorithm (3/5)

Matrix Inversion Lemma: if $A = B^{-1} + C D^{-1} C^T$, then

$A^{-1} = B - B C \left( D + C^T B C \right)^{-1} C^T B$

where $A$ and $B$ are $N \times N$ positive definite, $C$ is $N \times M$, and $D$ is $M \times M$ positive definite.

Letting $A = \Phi(n)$, $B^{-1} = \lambda\, \Phi(n-1)$, $C = X(n)$, and $D = 1$, we can express $\Phi^{-1}(n)$ in a recursive form:

$\Phi^{-1}(n) = \lambda^{-1}\, \Phi^{-1}(n-1) - \dfrac{\lambda^{-2}\, \Phi^{-1}(n-1)\, X(n)\, X^T(n)\, \Phi^{-1}(n-1)}{1 + \lambda^{-1}\, X^T(n)\, \Phi^{-1}(n-1)\, X(n)}$

(The factor multiplying $X^T(n)\, \Phi^{-1}(n-1)$ in the second term is the gain vector $K(n)$ defined on the next slide.)

RLS Algorithm (4/5)

Define $P(n) = \Phi^{-1}(n)$ ($N \times N$) and the gain vector ($N \times 1$)

$K(n) = \dfrac{\lambda^{-1}\, P(n-1)\, X(n)}{1 + \lambda^{-1}\, X^T(n)\, P(n-1)\, X(n)}$

Rearranging,

$K(n) \left[ 1 + \lambda^{-1}\, X^T(n)\, P(n-1)\, X(n) \right] = \lambda^{-1}\, P(n-1)\, X(n)$

$K(n) = \lambda^{-1} \left[ P(n-1) - K(n)\, X^T(n)\, P(n-1) \right] X(n) = P(n)\, X(n) = \Phi^{-1}(n)\, X(n)$

Therefore,

$P(n) = \lambda^{-1}\, P(n-1) - \lambda^{-1}\, K(n)\, X^T(n)\, P(n-1)$

RLS Algorithm (5/5)

Time Update for $H(n)$:

$H(n) = \Phi^{-1}(n)\, \Theta(n) = P(n)\, \Theta(n) = \lambda\, P(n)\, \Theta(n-1) + d(n)\, P(n)\, X(n)$
$= P(n-1)\, \Theta(n-1) - K(n)\, X^T(n)\, P(n-1)\, \Theta(n-1) + d(n)\, K(n)$
$= \Phi^{-1}(n-1)\, \Theta(n-1) - K(n)\, X^T(n)\, \Phi^{-1}(n-1)\, \Theta(n-1) + d(n)\, K(n)$

$H(n) = H(n-1) + K(n) \left[ d(n) - X^T(n)\, H(n-1) \right]$

Innovation (a priori estimation error): $\alpha(n) = d(n) - X^T(n)\, H(n-1)$, so that

$H(n) = H(n-1) + K(n)\, \alpha(n)$

A posteriori estimation error: $e(n) = d(n) - X^T(n)\, H(n)$

Summary of the RLS Algorithm

Initialization: determine the forgetting factor $\lambda$ (normally $0.9 \le \lambda < 1$), and set

$(N \times N):\quad P(0) = \delta^{-1}\, I_N$ ($\delta$ = a small positive number)
$(N \times 1):\quad H(0) = 0_N$

Main Iteration (for $n = 1, 2, \ldots$):

$(N \times 1):\quad K(n) = \dfrac{\lambda^{-1}\, P(n-1)\, X(n)}{1 + \lambda^{-1}\, X^T(n)\, P(n-1)\, X(n)}$

$(1 \times 1):\quad \alpha(n) = d(n) - X^T(n)\, H(n-1)$

$(N \times 1):\quad H(n) = H(n-1) + K(n)\, \alpha(n)$

$(N \times N):\quad P(n) = \lambda^{-1}\, P(n-1) - \lambda^{-1}\, K(n)\, X^T(n)\, P(n-1)$

$(1 \times 1):\quad e(n) = d(n) - X^T(n)\, H(n)$ (if necessary)
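
A compact sketch of the summarized recursion, applied to the same illustrative system-identification setup used in the earlier LMS sketch; the forgetting factor, initialization constant, and signal model are assumed values, not taken from the notes.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative RLS sketch following the summary above (all parameter values assumed).
N = 4
h_true = np.array([0.6, -0.3, 0.2, 0.1])
lam, delta = 0.99, 1e-2                      # forgetting factor and initialization constant
M = 2000

x = rng.standard_normal(M)
d = np.convolve(x, h_true)[:M] + 0.01 * rng.standard_normal(M)

P = np.eye(N) / delta                        # P(0) = delta^{-1} I_N
H = np.zeros(N)                              # H(0) = 0_N
for n in range(N - 1, M):
    X = x[n - N + 1:n + 1][::-1]             # X(n) = [x(n), ..., x(n-N+1)]^T
    PX = P @ X
    K = (PX / lam) / (1.0 + X @ PX / lam)    # gain vector K(n)
    alpha = d[n] - X @ H                     # a priori error alpha(n)
    H = H + K * alpha                        # H(n) = H(n-1) + K(n) alpha(n)
    P = (P - np.outer(K, X @ P)) / lam       # P(n) = lambda^{-1}[P(n-1) - K(n) X^T(n) P(n-1)]

print("RLS estimate:", np.round(H, 3))       # should be close to h_true
```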