Adaptive Filter Theory


Sung Ho Cho, Hanyang University, Seoul, Korea
(Office) +8--0-0390, (Mobile) +8-10-541-5178, dragon@hanyang.ac.kr

Table of Contents
1. Wiener Filters
2. Gradient Search by the Steepest Descent Method
3. Stochastic Gradient Adaptive Algorithms
4. Recursive Least Squares (RLS) Algorithm

Wiener Filters

Filter Optimization Problem

Wiener Filtering:
- A priori knowledge of the signal statistics, or at least estimates of them, is required.
- Complex and expensive hardware is necessary (particularly in nonstationary environments).

Adaptive Filtering:
- Complete knowledge of the signal statistics is not required.
- The filter weights eventually converge to the optimum Wiener solution for stationary processes.
- The filter weights can track slowly time-varying nonstationary environments.
- Complex and expensive hardware is, in general, not necessary.

Wiener Filters (1/7)

Objectives: We want to design a filter $h_i$ that minimizes the mean-squared estimation error $E\{e^2(n)\}$, so that the estimated signal $\hat{d}(n)$ best approximates the desired signal $d(n)$.

Estimation error signal: $e(n) = d(n) - \hat{d}(n)$

Estimated signal: $\hat{d}(n) = \sum_{i=0}^{N-1} h_i\, x(n-i)$, where $x(n)$ is the reference signal and $h_i$, $0 \le i \le N-1$, are the filter coefficients.

Wiener Filters (2/7)

Basic Structure: a transversal (FIR) filter forms the estimate as a linear combination of the current and past input samples $x(n), x(n-1), \ldots, x(n-N+1)$ (obtained through unit delays $z^{-1}$), weighted by $h_0, h_1, \ldots, h_{N-1}$:

$\hat{d}(n) = \sum_{i=0}^{N-1} h_i\, x(n-i), \qquad e(n) = d(n) - \sum_{i=0}^{N-1} h_i\, x(n-i) = d(n) - H^T X(n)$

Wiener Filters (3/7)

Basic Assumptions:
- $d(n)$ and $x(n)$ are zero-mean.
- $d(n)$ and $x(n)$ are jointly wide-sense stationary.

Notations:
- Filter coefficient vector: $H = [h_0, h_1, \ldots, h_{N-1}]^T$
- Reference input vector: $X(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^T$
- Estimation error signal: $e(n) = d(n) - \sum_{i=0}^{N-1} h_i\, x(n-i) = d(n) - H^T X(n)$
- Autocorrelation matrix: $R_{XX} = E\{X(n)\, X^T(n)\}$
- Cross-correlation vector: $R_{dX} = E\{d(n)\, X(n)\}$
- Optimum filter coefficient vector: $H_{opt} = [h_{0,opt}, h_{1,opt}, \ldots, h_{N-1,opt}]^T$

Wiener Filters (4/7)

Performance Measure (Cost Function):

$\xi = E\{e^2(n)\} = E\{(d(n) - H^T X(n))^2\} = E\{d^2(n)\} - 2 H^T R_{dX} + H^T R_{XX} H$

We now want to minimize $\xi$ with respect to $H$:

$\dfrac{\partial \xi}{\partial H} = -2 R_{dX} + 2 R_{XX} H = 0$

Wiener-Hopf Solution (1931):

$R_{XX}\, H_{opt} = R_{dX} \quad\Longrightarrow\quad H_{opt} = R_{XX}^{-1}\, R_{dX}$
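
To make the Wiener-Hopf solution concrete, the sketch below estimates $R_{XX}$ and $R_{dX}$ by time averaging over a finite data record and then solves $R_{XX} H_{opt} = R_{dX}$ numerically. The FIR "unknown system", the noise level, and all other parameter values are illustrative assumptions, not taken from these notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (assumed): d(n) is the output of an unknown FIR system
# driven by the zero-mean reference x(n), plus a small observation noise.
N = 4                                   # adaptive filter length
h_true = np.array([0.6, -0.3, 0.2, 0.1])
M = 10000                               # number of samples
x = rng.standard_normal(M)              # reference input x(n)
d = np.convolve(x, h_true)[:M] + 0.01 * rng.standard_normal(M)

# Reference input vectors X(n) = [x(n), x(n-1), ..., x(n-N+1)]^T
X = np.array([x[n - N + 1:n + 1][::-1] for n in range(N - 1, M)])
d_vec = d[N - 1:M]

# Time-average estimates of R_XX = E{X X^T} and R_dX = E{d X}
R_XX = X.T @ X / len(d_vec)
R_dX = X.T @ d_vec / len(d_vec)

# Wiener-Hopf solution: R_XX H_opt = R_dX
H_opt = np.linalg.solve(R_XX, R_dX)
print("H_opt ≈", np.round(H_opt, 3))    # should be close to h_true
```

Note that the linear system is solved directly rather than forming $R_{XX}^{-1}$ explicitly, which is the numerically preferable way to evaluate $H_{opt} = R_{XX}^{-1} R_{dX}$.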

Wiener Filters (5/7)

Autocorrelation Matrix $R_{XX}$:

$R_{XX} = E\{X(n)\, X^T(n)\} =
\begin{bmatrix}
r_{xx}(0) & r_{xx}(1) & \cdots & r_{xx}(N-1) \\
r_{xx}(1) & r_{xx}(0) & \cdots & r_{xx}(N-2) \\
\vdots & \vdots & \ddots & \vdots \\
r_{xx}(N-1) & r_{xx}(N-2) & \cdots & r_{xx}(0)
\end{bmatrix}$

$R_{XX}$ is symmetric and Toeplitz.

Is $R_{XX}$ invertible? Yes, almost always: $R_{XX}$ is almost always a positive definite matrix. A symmetric matrix $A$ is called positive definite if $x^T A x > 0$ for every nonzero $x$; then all the eigenvalues of $A$ are positive, and the determinant of every principal submatrix of $A$ is positive. Since the determinant of $A$ is not zero, $A$ is invertible.
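
A minimal numerical check of these claims, assuming an illustrative white reference signal: build the symmetric Toeplitz autocorrelation matrix from estimated lags and verify that all of its eigenvalues are positive.

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(1)
x = rng.standard_normal(5000)                 # illustrative zero-mean reference signal
N = 4

# Sample autocorrelation lags r_xx(0), ..., r_xx(N-1)
r = np.array([np.mean(x[k:] * x[:len(x) - k]) for k in range(N)])
R_XX = toeplitz(r)                            # symmetric Toeplitz autocorrelation matrix

eigvals = np.linalg.eigvalsh(R_XX)
print("eigenvalues:", np.round(eigvals, 3))   # all positive => positive definite => invertible
```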

Wiener Filters (6/7)

Let $X^B(n)$ denote the vector obtained by rearranging the elements of $X(n)$ backward, i.e.,

$X^B(n) = [x(n-N+1), x(n-N+2), \ldots, x(n)]^T$

Then $E\{X^B(n)\, X^{B\,T}(n)\} = R_{XX}$.

Cross-correlation Vector $R_{dX}$:

$R_{dX} = E\{d(n)\, X(n)\} = [r_{dX}(0), r_{dX}(1), \ldots, r_{dX}(N-1)]^T$

Minimum Estimation Error:

$e_{min}(n) = d(n) - H_{opt}^T X(n) = d(n) - X^T(n)\, H_{opt}$

Wiener Filters (7/7)

Minimum Mean-Squared Estimation Error:

$\xi_{min} = E\{e_{min}^2(n)\} = E\{(d(n) - H_{opt}^T X(n))^2\} = E\{d^2(n)\} - H_{opt}^T R_{dX} = E\{d^2(n)\} - H_{opt}^T R_{XX} H_{opt}$

Example (error surfaces): for $N = 1$ the cost $\xi$ is a parabola in $h_0$ with minimum $\xi_{min}$ at $h_{0,opt}$; for $N = 2$ it is a paraboloid over $(h_0, h_1)$ with minimum $\xi_{min}$ at $(h_{0,opt}, h_{1,opt})$.

Orthogonality Principle:

Geometrically, the estimate $\hat{d}(n) = \sum_{i=0}^{N-1} h_i\, x(n-i)$ lies in the plane $M$ spanned by the elements of $X(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^T$. At the optimum, the minimum error $e_{min}(n)$ is orthogonal to the plane $M$:

$E\{e_{min}(n)\, X(n)\} = 0_N$

With $\theta$ denoting the angle between $d(n)$ and the plane $M$, perfect estimation is possible if $\theta = 0$, and the estimation fails if $\theta = \pi/2$.

Some Drawbacks of the Wiener Filter:
- The signal statistics must be known a priori: we must know $R_{XX}$ and $R_{dX}$, or at least their estimates.
- A matrix inversion operation is required: heavy computational load, not well suited to real-time applications.
- The situation gets worse in nonstationary environments: we would have to compute $R_{XX}(n)$ and $R_{dX}(n)$, and perform the matrix inversion, at every time instant $n$.

Gradient Search by the Steepest Descent Method

Steepest Descent Method (1/5)

Objectives: We want to design the filter coefficients $h_i(n)$, $0 \le i \le N-1$, in a recursive form, in order to avoid the matrix inversion operation required in the Wiener solution.

$e(n) = d(n) - \hat{d}(n), \qquad \hat{d}(n) = \sum_{i=0}^{N-1} h_i(n)\, x(n-i)$

Steepest Descent Method (2/5)

Basic Structure: the same transversal filter as before, but now with time-varying coefficients $h_0(n), h_1(n), \ldots, h_{N-1}(n)$ applied to $x(n), x(n-1), \ldots, x(n-N+1)$:

$e(n) = d(n) - \sum_{i=0}^{N-1} h_i(n)\, x(n-i) = d(n) - H^T(n)\, X(n)$

Steepest Descent Method (3/5)

Basic Assumptions:
- $d(n)$ and $x(n)$ are zero-mean.
- $d(n)$ and $x(n)$ are jointly wide-sense stationary.

Notations:
- Filter coefficient vector: $H(n) = [h_0(n), h_1(n), \ldots, h_{N-1}(n)]^T$
- Reference input vector: $X(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^T$
- Estimation error signal: $e(n) = d(n) - \sum_{i=0}^{N-1} h_i(n)\, x(n-i) = d(n) - H^T(n)\, X(n)$
- Autocorrelation matrix: $R_{XX} = E\{X(n)\, X^T(n)\}$
- Cross-correlation vector: $R_{dX} = E\{d(n)\, X(n)\}$
- Optimum filter coefficient vector: $H_{opt} = [h_{0,opt}, h_{1,opt}, \ldots, h_{N-1,opt}]^T$

Steepest Descent Method (4/5)

The filter coefficient vector at time $n+1$ equals the coefficient vector at time $n$ plus a change proportional to the negative gradient of the mean-squared error:

$H(n+1) = H(n) - \tfrac{1}{2}\,\mu\, \nabla_{H(n)}(n), \qquad \mu = \text{adaptation step-size}$

where $H(n) = [h_0(n), h_1(n), \ldots, h_{N-1}(n)]^T$.

Performance Measure (Cost Function):

$\xi(n) = E\{e^2(n)\} = E\{d^2(n)\} - 2 H^T(n) R_{dX} + H^T(n) R_{XX} H(n)$

Steepest Descent Method (5/5)

The Gradient of the Mean-Squared Error:

$\nabla_{H(n)}(n) = \dfrac{\partial \xi(n)}{\partial H(n)} = -2 R_{dX} + 2 R_{XX} H(n)$

Therefore, the recursive update equation for the coefficient vector becomes

$H(n+1) = [I_N - \mu R_{XX}]\, H(n) + \mu R_{dX}$

Misalignment Vector: $V(n) = H(n) - H_{opt}$, which satisfies

$V(n+1) = [I_N - \mu R_{XX}]\, V(n)$
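
The following sketch iterates the update $H(n+1) = [I_N - \mu R_{XX}] H(n) + \mu R_{dX}$ for an assumed pair $(R_{XX}, R_{dX})$, showing convergence to the Wiener solution without any matrix inversion; the numerical values are purely illustrative.

```python
import numpy as np

# Minimal steepest-descent sketch (illustrative statistics assumed throughout):
# iterate H(n+1) = H(n) + mu * (R_dX - R_XX @ H(n)) using known statistics.
R_XX = np.array([[1.0, 0.5],
                 [0.5, 1.0]])            # assumed autocorrelation matrix
R_dX = np.array([0.9, 0.2])              # assumed cross-correlation vector
H_opt = np.linalg.solve(R_XX, R_dX)      # Wiener solution, for comparison only

mu = 0.1                                 # must satisfy 0 < mu < 2 / lambda_max
H = np.zeros(2)
for n in range(200):
    H = H + mu * (R_dX - R_XX @ H)       # H(n+1) = [I - mu R_XX] H(n) + mu R_dX

print("steepest descent:", np.round(H, 4))
print("Wiener solution: ", np.round(H_opt, 4))
```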

Convergence of Steepest Descent Method (1/2)

Convergence (or Stability) Condition:

$|1 - \mu \lambda_i| < 1 \;\Longleftrightarrow\; 0 < \mu < \dfrac{2}{\lambda_i}, \;\forall i \;\Longleftrightarrow\; 0 < \mu < \dfrac{2}{\lambda_{max}}$

($\lambda_i$ = the $i$-th eigenvalue of $R_{XX}$.)

Convergence is slow if the eigenvalue spread $\lambda_{max}/\lambda_{min}$ is large.

Convergence of Steepest Descent Method (2/2)

Time Constant: the convergence behavior of the $i$-th element of the misalignment vector is

$v_i(n+1) = (1 - \mu \lambda_i)\, v_i(n) \quad\Longrightarrow\quad v_i(n) = (1 - \mu \lambda_i)^n\, v_i(0)$

The time constant for the $i$-th element of the misalignment vector follows from $1 - \mu\lambda_i = \exp(-1/\tau_i)$:

$\tau_i = \dfrac{-1}{\ln(1 - \mu \lambda_i)} \approx \dfrac{1}{\mu \lambda_i} \;\text{(samples)} \quad\text{for } \mu \lambda_i \ll 1$

Steady-State Value: $H(\infty) = H_{opt}$, or equivalently $V(\infty) = 0_N$.

We still need a priori knowledge of the signal statistics.
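
A short numerical illustration of these formulas, with assumed eigenvalues $\lambda_{max} = 2$, $\lambda_{min} = 0.1$ and step size $\mu = 0.05$ (values chosen only for illustration):

```latex
\[
  0 < \mu < \frac{2}{\lambda_{max}} = 1, \qquad
  \tau_{\lambda_{max}} \approx \frac{1}{\mu\,\lambda_{max}} = \frac{1}{0.05 \times 2} = 10 \ \text{samples}, \qquad
  \tau_{\lambda_{min}} \approx \frac{1}{\mu\,\lambda_{min}} = \frac{1}{0.05 \times 0.1} = 200 \ \text{samples}.
\]
% The slowest mode (associated with lambda_min) dominates, which is why a large
% eigenvalue spread lambda_max / lambda_min makes steepest descent converge slowly.
```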

Stochastic Gradient Adaptive Algorithms

Stochastic Gradient Adaptive Filters

Motivations:
- No a priori information about the signal statistics
- No matrix inversion
- Tracking capability
- Self-designing (recursive method): the filter gradually learns the required correlation of the input signals and adjusts its coefficient vector recursively according to some suitably chosen instantaneous error criterion.

Evaluation Criteria:
- Rate of convergence
- Misadjustment (deviation from the optimum solution)
- Robustness to ill-conditioned data
- Computational cost
- Hardware implementation cost
- Numerical problems

Applications of Stochastic Gradient Adaptive Filters (1/2)

System Identification: the adaptive filter and the unknown system are driven by the same input $x(n)$; the desired signal $d(n)$ is the system output corrupted by measurement noise $\xi(n)$, and the adaptive filter minimizes $e(n) = d(n) - \hat{d}(n)$ so as to model the unknown system.

Adaptive Prediction: the reference input is a delayed version of the desired signal, $x(n) = d(n-\Delta)$ (obtained through a delay $z^{-\Delta}$), and the adaptive filter predicts the current sample $d(n)$ from its past.

Applications of Stochastic Gradient Adaptive Filters (2/2)

Noise Cancellation: the primary signal is $d(n) = y(n) + \xi(n)$, where $\xi(n)$ is the noise; the reference input $x(n)$ is correlated with $\xi(n)$ but not with $y(n)$. The adaptive filter produces the noise estimate $\hat{\xi}(n)$, and the error $e(n) = d(n) - \hat{\xi}(n)$ is the enhanced signal.

Inverse Filtering: a training signal (TX) is sent through an unknown channel; the received signal $x(n)$, corrupted by noise $\xi(n)$, drives the adaptive filter, while a delayed copy ($z^{-\Delta}$) of the training signal (RX) serves as the desired response $d(n)$. The adaptive filter approximates the inverse of the channel.

Classification of Adaptive Filters

System Identification: system identification, layered earth modeling
Adaptive Prediction: linear predictive coding, autoregressive spectral analysis, ADPCM
Noise Cancellation: adaptive noise cancellation, adaptive echo cancellation, active noise control, adaptive beamforming
Inverse Filtering: adaptive equalization, deconvolution, blind equalization

Stochastic Gradient Adaptive Algorithms (1/6)

An adaptive algorithm adjusts the coefficients $h_i(n)$, $0 \le i \le N-1$, of the filter

$\hat{d}(n) = \sum_{i=0}^{N-1} h_i(n)\, x(n-i), \qquad e(n) = d(n) - \hat{d}(n) = d(n) - H^T(n)\, X(n)$

by a stochastic gradient update of the general form

$H(n+1) = H(n) - \dfrac{\mu}{\alpha}\, \nabla_{H(n)}(n), \qquad \nabla_{H(n)}(n) = \dfrac{\partial\, |e(n)|^{\alpha}}{\partial H(n)}$

Various forms result according to the choice of the performance measure (the exponent $\alpha$). If there is no correlation between $d(n)$ and $x(n)$, then no estimation can be made.

Stochastic Gradient Adaptive Algorithms (2/6)

Notations:
- Filter coefficient vector: $H(n) = [h_0(n), h_1(n), \ldots, h_{N-1}(n)]^T$
- Reference input vector: $X(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^T$
- Estimation error signal: $e(n) = d(n) - \sum_{i=0}^{N-1} h_i(n)\, x(n-i) = d(n) - H^T(n)\, X(n)$
- Autocorrelation matrix: $R_{XX} = E\{X(n)\, X^T(n)\}$
- Cross-correlation vector: $R_{dX} = E\{d(n)\, X(n)\}$
- Optimum filter coefficient vector: $H_{opt} = [h_{0,opt}, h_{1,opt}, \ldots, h_{N-1,opt}]^T$
- Misalignment vector: $V(n) = H(n) - H_{opt}$
- Covariance matrix of the misalignment vector: $K(n) = E\{V(n)\, V^T(n)\}$

Stochastic Gradient Adaptive Algorithms (3/6)

Sign Algorithm ($\alpha = 1$): the sign algorithm tries to minimize the instantaneous absolute error value at each iteration.

$e(n) = d(n) - H^T(n)\, X(n), \qquad \nabla_{H(n)}(n) = \dfrac{\partial\, |e(n)|}{\partial H(n)}$

Filter Coefficient Update:

$H(n+1) = H(n) + \mu\, X(n)\, \mathrm{sign}\{e(n)\}, \qquad \mathrm{sign}\{e(n)\} = \begin{cases} 1, & e(n) \ge 0 \\ -1, & e(n) < 0 \end{cases}$

Stochastic Gradient Adaptive Algorithms (4/6)

Least Mean Square (LMS) Algorithm ($\alpha = 2$): the LMS algorithm tries to minimize the instantaneous squared error value at each iteration.

$e(n) = d(n) - H^T(n)\, X(n), \qquad \nabla_{H(n)}(n) = \dfrac{\partial\, e^2(n)}{\partial H(n)}$

Filter Coefficient Update:

$H(n+1) = H(n) + \mu\, X(n)\, e(n)$
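
A minimal LMS sketch in a system-identification setting may help; the unknown system, step size, and noise level are assumed for illustration and are not values taken from these notes.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative LMS system-identification sketch (all parameter values assumed):
N = 4
h_true = np.array([0.6, -0.3, 0.2, 0.1])   # "unknown system" to be identified
mu = 0.02                                  # adaptation step-size
M = 5000

x = rng.standard_normal(M)                                      # reference input x(n)
d = np.convolve(x, h_true)[:M] + 0.01 * rng.standard_normal(M)  # desired signal d(n)

H = np.zeros(N)                            # H(0) = 0
for n in range(N - 1, M):
    X = x[n - N + 1:n + 1][::-1]           # X(n) = [x(n), ..., x(n-N+1)]^T
    e = d[n] - H @ X                       # e(n) = d(n) - H^T(n) X(n)
    H = H + mu * X * e                     # LMS update: H(n+1) = H(n) + mu X(n) e(n)

print("estimated H:", np.round(H, 3))      # should approach h_true
```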

Stochastic Gradient Adaptive Algorithms (5/6)

Least Mean Absolute Third (LMAT) Algorithm ($\alpha = 3$): the LMAT algorithm tries to minimize the instantaneous absolute error value raised to the third power at each iteration.

$e(n) = d(n) - H^T(n)\, X(n), \qquad \nabla_{H(n)}(n) = \dfrac{\partial\, |e(n)|^3}{\partial H(n)}$

Filter Coefficient Update:

$H(n+1) = H(n) + \mu\, X(n)\, e^2(n)\, \mathrm{sign}\{e(n)\}$

Stochastic Gradient Adaptive Algorithms (6/6)

Least Mean Fourth (LMF) Algorithm ($\alpha = 4$): the LMF algorithm tries to minimize the instantaneous error value raised to the fourth power at each iteration.

$e(n) = d(n) - H^T(n)\, X(n), \qquad \nabla_{H(n)}(n) = \dfrac{\partial\, e^4(n)}{\partial H(n)}$

Filter Coefficient Update:

$H(n+1) = H(n) + \mu\, X(n)\, e^3(n)$
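
Since the four updates above differ only in the error nonlinearity $|e(n)|^{\alpha-1}\,\mathrm{sign}\{e(n)\}$, they can be written as one parametrized update. The helper below is a sketch of that observation; its name and interface are mine, not from the notes.

```python
import numpy as np

def stochastic_gradient_update(H, X, d, mu, alpha):
    """One coefficient update H(n) -> H(n+1) for the |e|^alpha family.

    alpha = 1: sign algorithm, 2: LMS, 3: LMAT, 4: LMF.
    (Illustrative helper; the name and interface are not from the notes.)
    """
    e = d - H @ X                                  # e(n) = d(n) - H^T(n) X(n)
    g = np.abs(e) ** (alpha - 1) * np.sign(e)      # error nonlinearity |e|^(alpha-1) sign(e)
    return H + mu * X * g                          # H(n+1) = H(n) + mu X(n) |e|^(alpha-1) sign(e)
```

Calling this helper with alpha = 2 inside a sample-by-sample loop reproduces the LMS iteration of the previous sketch; alpha = 1, 3, and 4 give the sign, LMAT, and LMF updates respectively.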

Convergence of the Adaptive Algorithms (1/2)

Basically, we need to know the mean and the mean-squared behavior of the algorithms.

For the analysis of the statistical mean behavior, we want to find a set of statistical difference equations that characterizes $E\{H(n)\}$ or $E\{V(n)\}$. We also need to check:
- Stability conditions
- Convergence speed
- Unbiased estimation capability

For the analysis of the statistical mean-squared behavior, we want to find a set of statistical difference equations that characterizes $\sigma_e^2(n) = E\{e^2(n)\}$ and $K(n) = E\{V(n)\, V^T(n)\}$. We also need to check:
- Stability conditions
- Convergence speed
- Estimation precision

Convergence of the Adaptive Algorithms (2/2)

Basic Assumptions for the Convergence Analysis:
- The input signals $d(n)$ and $x(n)$ are zero-mean, jointly wide-sense stationary, and jointly Gaussian with finite variances. A consequence of this assumption is that the estimation error $e(n) = d(n) - H^T(n) X(n)$ is also zero-mean and Gaussian when conditioned on the coefficient vector $H(n)$.
- Independence Assumption: the input pair $\{d(n), X(n)\}$ at time $n$ is independent of $\{d(k), X(k)\}$ at time $k$, if $n$ is not equal to $k$. This assumption is seldom true in practice, but it is valid when the step-size $\mu$ is chosen to be sufficiently small. One direct consequence of the independence assumption is that the coefficient vector $H(n)$ is uncorrelated with the input pair $\{d(n), X(n)\}$, since $H(n)$ depends only on inputs at time $n-1$ and before.

Sign Algorithm (1/2)

Mean Behavior:

$E\{H(n+1)\} = \left[ I_N - \mu \sqrt{\tfrac{2}{\pi \sigma_e^2(n)}}\, R_{XX} \right] E\{H(n)\} + \mu \sqrt{\tfrac{2}{\pi \sigma_e^2(n)}}\, R_{dX}$

$E\{V(n+1)\} = \left[ I_N - \mu \sqrt{\tfrac{2}{\pi \sigma_e^2(n)}}\, R_{XX} \right] E\{V(n)\}$

Mean-Squared Behavior:

$\sigma_e^2(n) = \xi_{min} + \mathrm{tr}\{K(n)\, R_{XX}\}$

$K(n+1) = K(n) - \mu \sqrt{\tfrac{2}{\pi \sigma_e^2(n)}} \left[ K(n) R_{XX} + R_{XX} K(n) \right] + \mu^2 R_{XX}$

Sign Algorithm (2/2)

Steady-State Mean-Squared Estimation Error:

$\sigma_e^2(\infty) \approx \xi_{min} + \dfrac{\mu}{2} \sqrt{\dfrac{\pi\, \xi_{min}}{2}}\; \mathrm{tr}\{R_{XX}\}$

Convergence Condition (Weak Convergence): the long-term time-average of the mean absolute error is bounded for any positive value of $\mu$.

Very robust, but slow.

LMS Algorithm (1/2)

Mean Behavior:

$E\{H(n+1)\} = [I_N - \mu R_{XX}]\, E\{H(n)\} + \mu R_{dX}$

$E\{V(n+1)\} = [I_N - \mu R_{XX}]\, E\{V(n)\}$

Mean-Squared Behavior:

$\sigma_e^2(n) = \xi_{min} + \mathrm{tr}\{K(n)\, R_{XX}\}$

$K(n+1) = K(n) - \mu \left[ K(n) R_{XX} + R_{XX} K(n) \right] + \mu^2 \left[ \sigma_e^2(n)\, I_N + 2\, R_{XX} K(n) \right] R_{XX}$

LMS Algorithm (2/2)

Steady-State Mean-Squared Estimation Error:

$\sigma_e^2(\infty) \approx \xi_{min} + \dfrac{\mu}{2}\, \xi_{min}\, \mathrm{tr}\{R_{XX}\}$

Mean Convergence: $0 < \mu < \dfrac{2}{\lambda_{max}}$

Mean-Squared Convergence: $0 < \mu < \dfrac{2}{3\, \mathrm{tr}\{R_{XX}\}}$

If $\mu_{LMS} = \mu_{sign} \sqrt{\dfrac{\pi}{2\, \xi_{min}}}$, then $\sigma_e^2(\infty)_{LMS} \approx \sigma_e^2(\infty)_{sign}$.

The convergence of the algorithm strongly depends on the input signal statistics.

LMAT Algorithm (1/2)

Mean Behavior:

$E\{H(n+1)\} = \left[ I_N - 2\mu \sqrt{\tfrac{2}{\pi}}\, \sigma_e(n)\, R_{XX} \right] E\{H(n)\} + 2\mu \sqrt{\tfrac{2}{\pi}}\, \sigma_e(n)\, R_{dX}$

$E\{V(n+1)\} = \left[ I_N - 2\mu \sqrt{\tfrac{2}{\pi}}\, \sigma_e(n)\, R_{XX} \right] E\{V(n)\}$

Mean-Squared Behavior:

$\sigma_e^2(n) = \xi_{min} + \mathrm{tr}\{K(n)\, R_{XX}\}$

$K(n+1) = K(n) - 2\mu \sqrt{\tfrac{2}{\pi}}\, \sigma_e(n) \left[ K(n) R_{XX} + R_{XX} K(n) \right] + 3\mu^2 \sigma_e^2(n) \left[ \sigma_e^2(n)\, I_N + 4\, R_{XX} K(n) \right] R_{XX}$

LMAT Algorithm (2/2)

Steady-State Mean-Squared Estimation Error:

$\sigma_e^2(\infty) \approx \xi_{min} + \dfrac{3\mu}{4} \sqrt{\dfrac{\pi\, \xi_{min}}{2}}\; \xi_{min}\, \mathrm{tr}\{R_{XX}\}$

Mean Convergence:

$0 < \mu < \dfrac{1}{\lambda_{max}\, \sigma_e(n)} \sqrt{\dfrac{\pi}{2}}, \quad \forall n$

Very fast, but one must be careful: the convergence of the LMAT algorithm depends on the initial choice of the coefficient vector.

If $\mu_{LMAT} = \mu_{LMS}\, \dfrac{2}{3} \sqrt{\dfrac{2}{\pi\, \xi_{min}}}$, then $\sigma_e^2(\infty)_{LMAT} \approx \sigma_e^2(\infty)_{LMS}$.

LMF Algorithm (1/2)

Mean Behavior:

$E\{H(n+1)\} = [I_N - 3\mu\, \sigma_e^2(n)\, R_{XX}]\, E\{H(n)\} + 3\mu\, \sigma_e^2(n)\, R_{dX}$

$E\{V(n+1)\} = [I_N - 3\mu\, \sigma_e^2(n)\, R_{XX}]\, E\{V(n)\}$

Mean-Squared Behavior:

$\sigma_e^2(n) = \xi_{min} + \mathrm{tr}\{K(n)\, R_{XX}\}$

$K(n+1) = K(n) - 3\mu\, \sigma_e^2(n) \left[ K(n) R_{XX} + R_{XX} K(n) \right] + 15\mu^2\, \sigma_e^4(n) \left[ \sigma_e^2(n)\, I_N + 6\, R_{XX} K(n) \right] R_{XX}$

LMF Algorithm (2/2)

Steady-State Mean-Squared Estimation Error: ?

Mean Convergence:

$0 < \mu < \dfrac{2}{3\, \lambda_{max}\, \sigma_e^2(n)}, \quad \forall n$

Very fast, but one must also be careful: the convergence of the LMF algorithm also depends on the initial choice of the coefficient vector.

Further Observations (1/2)

Misadjustment: $M \triangleq \dfrac{\xi_{ex}(\infty)}{\xi_{min}}$ (excess steady-state MSE relative to $\xi_{min}$)

Sign Algorithm: $M \approx \dfrac{\mu}{2} \sqrt{\dfrac{\pi}{2\, \xi_{min}}}\; \mathrm{tr}\{R_{XX}\}$

LMS Algorithm: $M \approx \dfrac{\mu}{2}\, \mathrm{tr}\{R_{XX}\}$

LMAT Algorithm: $M \approx \dfrac{3\mu}{4} \sqrt{\dfrac{\pi\, \xi_{min}}{2}}\; \mathrm{tr}\{R_{XX}\}$

LMF Algorithm: ?
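
A quick numerical illustration of the LMS misadjustment formula above, with assumed values (say $N = 8$ taps and a unit-power white input, so $\mathrm{tr}\{R_{XX}\} = N \sigma_x^2 = 8$, and $\mu = 0.01$):

```latex
\[
  M \;\approx\; \frac{\mu}{2}\,\operatorname{tr}\{R_{XX}\}
    \;=\; \frac{0.01}{2} \times 8 \;=\; 0.04,
\]
% i.e., the steady-state MSE sits about 4% above the minimum xi_min.
% Halving mu halves M, but roughly doubles the convergence time constants.
```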

Further Observations (2/2)

The misadjustment $M$ increases with the filter order $N$.
The misadjustment $M$ is directly proportional to $\mu$.
The convergence speed is inversely proportional to $\mu$.

Convergence Speed: (Fast) LMAT > LMF > LMS > Sign (Slow)
Robustness (or Stability): (Good) Sign > LMS > LMAT > LMF (Bad)

Example: System Identification Mode (1/6)

Setup: the adaptive filter identifies an unknown FIR system driven by the reference input $x(n)$; the desired signal is the system output corrupted by measurement noise $\xi(n)$, and $e(n)$ is the identification error.

$H_{opt} = [0.1,\ 0.3,\ 0.5,\ 0.7,\ 0.5,\ 0.3,\ 0.1]^T$

Example: System Identification Mode (2/6)

Two Sets of Reference Inputs:

CASE 1 (eigenvalue spread ratio = 5.3):
$x_1(n) = \zeta(n) + 0.9\, x_1(n-1) - 0.1\, x_1(n-2) - 0.2\, x_1(n-3)$

CASE 2 (eigenvalue spread ratio = 185.8):
$x_2(n) = \zeta(n) + 1.5\, x_2(n-1) - x_2(n-2) - 0.5\, x_2(n-3)$

Measurement noise $\zeta(n)$: white Gaussian process.

Convergence parameter $\mu$: Sign 0.00016, LMS 0.00, LMAT 0.011, LMF 0.00
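
The eigenvalue spread ratio quoted above can be estimated empirically from a long realization of the AR process. The sketch below does this for CASE 1 using the coefficients as recovered from the slide text; treat both the helper and the coefficient values as illustrative.

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.signal import lfilter

def eigenvalue_spread(ar_coeffs, N, M=200000, seed=0):
    """Empirical eigenvalue spread ratio lambda_max/lambda_min of the N x N
    autocorrelation matrix R_XX of an AR process x(n) = sum_k a_k x(n-k) + zeta(n).
    (Illustrative helper; name and interface are not from the notes.)"""
    rng = np.random.default_rng(seed)
    zeta = rng.standard_normal(M)                        # white Gaussian driving noise
    a = np.concatenate(([1.0], -np.asarray(ar_coeffs)))  # AR polynomial 1 - a_1 z^-1 - ...
    x = lfilter([1.0], a, zeta)[M // 10:]                # discard the initial transient
    r = [np.mean(x[k:] * x[:len(x) - k]) for k in range(N)]
    lam = np.linalg.eigvalsh(toeplitz(r))
    return lam.max() / lam.min()

# CASE 1 as reconstructed above, with N = 7 taps as in H_opt:
print(eigenvalue_spread([0.9, -0.1, -0.2], N=7))
```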

Example: System Identification Mode (3/6)

CASE 1 (eigenvalue spread ratio = 5.3): [Figure: MSE in dB versus number of iterations for the four algorithms (1: LMAT, 2: LMS, 3: LMF, 4: Sign) — mean-squared behavior of the coefficients.]

Example: System Identification Mode (4/6) 47 0.16 0.1 4 1 : LM A : LM S 3 : LM F 4 : S IG N E( h1(n)) 0.08 1 3 0.0 4 0.00 0 4 0 0 0 8 0 0 0 1 0 0 0 1 6 0 0 0 0 0 0 0 # of Iteration Mean Behavior of the Coefficients

Example: System Identification Mode (5/6)

CASE 2 (eigenvalue spread ratio = 185.8): [Figure: MSE in dB versus number of iterations for the four algorithms (1: LMAT, 2: LMS, 3: LMF, 4: Sign) — mean-squared behavior of the coefficients.]

Example: System Identification Mode (6/6)

CASE 2 (continued): [Figure: $E\{h_1(n)\}$ versus number of iterations for the four algorithms (1: LMAT, 2: LMS, 3: LMF, 4: Sign) — mean behavior of the coefficients.]

Other Algorithms (1/2)

Signed Regressor Algorithm:
$H(n+1) = H(n) + \mu\, \mathrm{sign}\{X(n)\}\, e(n)$

Sign-Sign Algorithm:
$H(n+1) = H(n) + \mu\, \mathrm{sign}\{X(n)\}\, \mathrm{sign}\{e(n)\}$

Normalized LMS Algorithm:
$H(n+1) = H(n) + \dfrac{\mu}{X^T(n)\, X(n)}\, X(n)\, e(n)$

Complex LMS Algorithm:
$H(n+1) = H(n) + \mu\, X^*(n)\, e(n)$
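
As a small illustration of the normalized LMS update above, the helper below adds a tiny regularization constant to the denominator, a common practical safeguard that is not part of the notes; the function name and interface are mine.

```python
import numpy as np

def nlms_update(H, X, d, mu=0.5, eps=1e-8):
    """One normalized-LMS update H(n) -> H(n+1).

    eps guards against division by zero when X(n) is (nearly) zero;
    it is a common practical addition, not part of the original notes."""
    e = d - H @ X                           # e(n) = d(n) - H^T(n) X(n)
    H_next = H + (mu / (X @ X + eps)) * X * e
    return H_next, e
```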

Other Algorithms (2/2)

Hybrid Algorithm #1 (LMS + LMF): the cost mixes the squared and fourth-power errors,

$\nabla_{H(n)}(n) = \dfrac{\partial \{\varphi\, e^2(n) + (1-\varphi)\, e^4(n)\}}{\partial H(n)}, \qquad 0 \le \varphi \le 1$

$H(n+1) = H(n) + \mu \left\{ \varphi\, X(n)\, e(n) + 2(1-\varphi)\, X(n)\, e^3(n) \right\}$

Hybrid Algorithm #2 (Sign + LMAT): the cost mixes the absolute and absolute-cubed errors,

$\nabla_{H(n)}(n) = \dfrac{\partial \{\varphi\, |e(n)| + (1-\varphi)\, |e(n)|^3\}}{\partial H(n)}, \qquad 0 \le \varphi \le 1$

$H(n+1) = H(n) + \mu \left\{ \varphi\, X(n) + 3(1-\varphi)\, X(n)\, e^2(n) \right\} \mathrm{sign}\{e(n)\}$

Recursive Least Squares (RLS) Algorithm

RLS Algorithm (1/5)

Cost Function:

$\varepsilon(n) = \sum_{i=1}^{n} \beta(n, i)\, e^2(i)$

where $n$ is the length of the observable data and the error signal at time instant $i$ is $e(i) = d(i) - H^T(n)\, X(i)$; the coefficient vector $H(n)$ remains fixed during the observation interval $1 \le i \le n$.

Weighting: $0 < \beta(n, i) \le 1$ (normally $\beta(n, i) = \lambda^{n-i}$, where $\lambda$ is the forgetting factor).

By the method of exponentially weighted least squares, we want to minimize

$\varepsilon(n) = \sum_{i=1}^{n} \lambda^{n-i}\, e^2(i)$

Very fast, but computationally very complex. The algorithm is useful when the number of taps required is small.

RLS Algorithm (2/5)

Normal Equation: $\Phi(n)\, H(n) = \Theta(n)$, where

$\Phi(n) = \sum_{i=1}^{n} \lambda^{n-i}\, X(i)\, X^T(i), \qquad \Theta(n) = \sum_{i=1}^{n} \lambda^{n-i}\, d(i)\, X(i)$

We can write

$\Phi(n) = \lambda \sum_{i=1}^{n-1} \lambda^{n-1-i}\, X(i)\, X^T(i) + X(n)\, X^T(n) = \lambda\, \Phi(n-1) + X(n)\, X^T(n)$

$\Theta(n) = \lambda\, \Theta(n-1) + d(n)\, X(n)$

Do we need a matrix inversion? No!

RLS Algorithm (3/5)

Matrix Inversion Lemma: if $A = B^{-1} + C D^{-1} C^T$, then

$A^{-1} = B - B C \left( D + C^T B C \right)^{-1} C^T B$

where $A$ and $B$ are $N \times N$ positive definite, $C$ is $N \times M$, and $D$ is $M \times M$ positive definite.

Letting $A = \Phi(n)$, $B^{-1} = \lambda\, \Phi(n-1)$, $C = X(n)$, and $D = 1$, we can express $\Phi^{-1}(n)$ in a recursive form:

$\Phi^{-1}(n) = \lambda^{-1}\, \Phi^{-1}(n-1) - \dfrac{\lambda^{-2}\, \Phi^{-1}(n-1)\, X(n)\, X^T(n)\, \Phi^{-1}(n-1)}{1 + \lambda^{-1}\, X^T(n)\, \Phi^{-1}(n-1)\, X(n)}$

(The factor multiplying $X^T(n)\, \Phi^{-1}(n-1)$ in the second term is the gain vector $K(n)$ defined on the next slide.)

RLS Algorithm (4/5)

Define $P(n) = \Phi^{-1}(n)$ ($N \times N$) and the gain vector ($N \times 1$)

$K(n) = \dfrac{\lambda^{-1}\, P(n-1)\, X(n)}{1 + \lambda^{-1}\, X^T(n)\, P(n-1)\, X(n)}$

Rearranging,

$K(n) \left[ 1 + \lambda^{-1}\, X^T(n)\, P(n-1)\, X(n) \right] = \lambda^{-1}\, P(n-1)\, X(n)$

$K(n) = \lambda^{-1} \left[ P(n-1) - K(n)\, X^T(n)\, P(n-1) \right] X(n) = P(n)\, X(n) = \Phi^{-1}(n)\, X(n)$

Therefore,

$P(n) = \lambda^{-1}\, P(n-1) - \lambda^{-1}\, K(n)\, X^T(n)\, P(n-1)$

RLS Algorithm (5/5)

Time Update for $H(n)$:

$H(n) = \Phi^{-1}(n)\, \Theta(n) = P(n)\, \Theta(n) = \lambda\, P(n)\, \Theta(n-1) + d(n)\, P(n)\, X(n)$
$= P(n-1)\, \Theta(n-1) - K(n)\, X^T(n)\, P(n-1)\, \Theta(n-1) + d(n)\, K(n)$
$= \Phi^{-1}(n-1)\, \Theta(n-1) - K(n)\, X^T(n)\, \Phi^{-1}(n-1)\, \Theta(n-1) + d(n)\, K(n)$

$H(n) = H(n-1) + K(n) \left[ d(n) - X^T(n)\, H(n-1) \right]$

Innovation (a priori estimation error): $\alpha(n) = d(n) - X^T(n)\, H(n-1)$, so that

$H(n) = H(n-1) + K(n)\, \alpha(n)$

A posteriori estimation error: $e(n) = d(n) - X^T(n)\, H(n)$

Summary of the RLS Algorithm

Initialization: determine the forgetting factor $\lambda$ (normally $0.9 \le \lambda < 1$), and set

$(N \times N):\quad P(0) = \delta^{-1}\, I_N$ ($\delta$ = a small positive number)
$(N \times 1):\quad H(0) = 0_N$

Main Iteration (for $n = 1, 2, \ldots$):

$(N \times 1):\quad K(n) = \dfrac{\lambda^{-1}\, P(n-1)\, X(n)}{1 + \lambda^{-1}\, X^T(n)\, P(n-1)\, X(n)}$

$(1 \times 1):\quad \alpha(n) = d(n) - X^T(n)\, H(n-1)$

$(N \times 1):\quad H(n) = H(n-1) + K(n)\, \alpha(n)$

$(N \times N):\quad P(n) = \lambda^{-1}\, P(n-1) - \lambda^{-1}\, K(n)\, X^T(n)\, P(n-1)$

$(1 \times 1):\quad e(n) = d(n) - X^T(n)\, H(n)$ (if necessary)
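
A compact sketch of the summarized recursion, applied to the same illustrative system-identification setup used in the earlier LMS sketch; the forgetting factor, initialization constant, and signal model are assumed values, not taken from the notes.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative RLS sketch following the summary above (all parameter values assumed).
N = 4
h_true = np.array([0.6, -0.3, 0.2, 0.1])
lam, delta = 0.99, 1e-2                      # forgetting factor and initialization constant
M = 2000

x = rng.standard_normal(M)
d = np.convolve(x, h_true)[:M] + 0.01 * rng.standard_normal(M)

P = np.eye(N) / delta                        # P(0) = delta^{-1} I_N
H = np.zeros(N)                              # H(0) = 0_N
for n in range(N - 1, M):
    X = x[n - N + 1:n + 1][::-1]             # X(n) = [x(n), ..., x(n-N+1)]^T
    PX = P @ X
    K = (PX / lam) / (1.0 + X @ PX / lam)    # gain vector K(n)
    alpha = d[n] - X @ H                     # a priori error alpha(n)
    H = H + K * alpha                        # H(n) = H(n-1) + K(n) alpha(n)
    P = (P - np.outer(K, X @ P)) / lam       # P(n) = lambda^{-1}[P(n-1) - K(n) X^T(n) P(n-1)]

print("RLS estimate:", np.round(H, 3))       # should be close to h_true
```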