MMSE Equalizer Design
Phil Schniter
March 6, 2008

[Block diagram: a[m] → ↑P → ã[k] → g[k] → m[k] → h[k] → + (noise w[k]) → ṽ[k] → q[k] → ỹ[k] → ↓P → y[m]]

For a trivial channel (i.e., h[k] = δ[k]), we know that the use of square-root raised-cosine (SRRC) pulses at the transmitter and receiver suppresses inter-symbol interference (ISI) and maximizes the received signal-to-noise ratio (SNR) in the presence of white noise {w[k]}. With a non-trivial channel, however, we need to revisit the design of the receiver pulse {q[k]}, which is called an equalizer when it tries to compensate for the channel.

Here we design the minimum mean-squared error (MMSE) equalizer coefficients {q[k]} assuming that the input symbols {a[n]} and the noise {w[k]} are white random sequences that are uncorrelated with each other. This means that

    E{a[m] a*[n]} = σ_a² δ[m−n]                                  (1)
    E{w[k] w*[l]} = σ_w² δ[k−l]                                  (2)
    E{a[m] w*[l]} = 0                                            (3)

for some positive variances σ_a² and σ_w². For practical implementation, we will consider a causal equalizer with length N_q, so that q[k] = 0 for k < 0 and k ≥ N_q. To simplify the derivation, we combine the transmitted pulse g[k] and the complex-baseband channel h[k] into the effective channel h̃[k] := g[k] ∗ h[k] and assume that this effective channel is causal with finite length N_h. Throughout, we assume that the effective channel coefficients {h̃[k]}, as well as the variances σ_a² and σ_w², are known. Learning these quantities is a separate (and often challenging) problem.

Notice that, because the effective channel is causal with length N_h, it can delay the upsampled input signal ã[k] by between 0 and N_h − 1 samples. Since it is difficult to compensate for this delay with a causal equalizer, we will allow for the presence of end-to-end system delay. Thus, our goal is to make y[m] ≈ a[m−∆] for some integer ∆ ≥ 0. Throughout the design, we assume that ∆ has been chosen for us, although eventually we shall see how to optimize ∆.
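As a concrete illustration of the model above (not part of the original notes), the sketch below simulates the transmit chain of the block diagram in NumPy: it forms the effective channel h̃[k] = g[k] ∗ h[k], upsamples a block of symbols by P, filters, and adds white noise. All parameter values (P, g, h, the variances) and the use of QPSK-like symbols are hypothetical placeholders chosen only to make the sketch runnable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example parameters (placeholders, not from the notes)
P = 2                                    # upsampling factor
g = np.array([0.5, 1.0, 0.5])            # transmit pulse g[k]
h = np.array([1.0, 0.0, -0.3 + 0.2j])    # complex-baseband channel h[k]
sigma_a2, sigma_w2 = 1.0, 0.01           # symbol variance and noise variance

# Effective channel h_eff[k] = (g * h)[k], causal with length Nh
h_eff = np.convolve(g, h)
Nh = len(h_eff)

# A block of white, unit-variance QPSK-like symbols a[m]
M = 100
a = (rng.choice([-1.0, 1.0], M) + 1j * rng.choice([-1.0, 1.0], M)) / np.sqrt(2)

# Upsample by P (insert P-1 zeros), filter with h_eff, and add white noise w[k]
a_up = np.zeros(M * P, dtype=complex)
a_up[::P] = a
L = M * P + Nh - 1                       # length of the filtered sequence
w = np.sqrt(sigma_w2 / 2) * (rng.standard_normal(L) + 1j * rng.standard_normal(L))
v = np.convolve(a_up, h_eff) + w         # received samples v~[k], the input to the equalizer q[k]
```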
Recall that if y[m] = a[m−∆], then we would be able to make perfect decisions on the symbols a[m] from the output sequence y[m]. However, we would never expect a perfect output in the presence of noise. Thus, we take as our objective the minimization of the error signal e[m] := y[m] − a[m−∆]. In particular, we minimize the mean squared error (MSE) E := E{|e[m]|²}. We saw earlier that, if e[m] can be modelled as a zero-mean Gaussian random variable (with variance σ_e² = E), then the symbol error rate (SER) decreases as E/σ_a² decreases. Thus, there is good reason to minimize E.

Our eventual goal is to derive an expression for the MSE E from which the equalizer coefficients can be optimized. But first we notice that, due to the stationarity of {a[m]} and {w[k]} (i.e., the time-invariance of their statistics) and the LTI nature of our filters, the statistics of {e[m]} will also be time-invariant, allowing us to write E = E{|e[0]|²}. This allows us to focus on e[0] instead of e[m], which simplifies the development.

The next step is then to find an expression for e[0]. From the block diagram,

    e[0] = y[0] − a[−∆]                                          (4)
         = Σ_{l=0}^{N_q−1} q[l] ṽ[−l] − a[−∆],                   (5)

where

    ṽ[k] = w[k] + Σ_{l=−∞}^{∞} ã[l] h̃[k−l]                       (6)
         = w[k] + Σ_{n=−∞}^{∞} a[n] h̃[k−nP],                     (7)

and where in (7) we used the fact that ã[l] = 0 when l is not a multiple of P and that ã[nP] = a[n]. Though (7) is written with an infinite summation, it turns out that most values of n will not contribute. Due to the causality and length N_h of h̃[n], the values of n which lead to contributions to ṽ[k] in (5) are those which ensure that

    0 ≤ k − nP ≤ N_h − 1            for at least one k ∈ {0, −1, ..., 1−N_q}   (8)
    ⟺  k/P ≥ n ≥ (k − N_h + 1)/P    for at least one k ∈ {0, −1, ..., 1−N_q}   (9)
    ⟺  0 ≥ n ≥ −(N_h + N_q − 2)/P =: −(N_a − 1),                               (10)

where N_a denotes the number of contributing symbols. In other words, for the values k ∈ {0, −1, ..., 1−N_q} that appear in (5),

    ṽ[k] = w[k] + Σ_{n=1−N_a}^{0} a[n] h̃[k−nP].                  (11)
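The bound (10) is easy to check numerically. The following sketch (an illustration added here, with hypothetical sizes N_h, N_q, P) enumerates the symbol indices n that satisfy condition (8) for some k ∈ {0, −1, ..., 1−N_q} and compares their count with the number implied by (10) once n is restricted to integers.

```python
import numpy as np

# Hypothetical example sizes (placeholders)
Nh, Nq, P = 5, 8, 2

# Indices n that contribute, i.e., 0 <= k - n*P <= Nh - 1 for some k in {0, -1, ..., 1-Nq}
contributing = sorted({n for k in range(1 - Nq, 1)
                         for n in range(-(Nh + Nq), 1)
                         if 0 <= k - n * P <= Nh - 1})

# Count implied by (10) once n is an integer: floor((Nh + Nq - 2)/P) + 1
Na = int(np.floor((Nh + Nq - 2) / P)) + 1

print(contributing)            # [-5, -4, -3, -2, -1, 0]
print(len(contributing), Na)   # 6 6
```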
Next we use a vector formulation to simplify the development. We start by rewriting (5) as

    e[0] = q^T ṽ − a[−∆],                                        (12)

where q := [q[0], q[1], ..., q[N_q−1]]^T and ṽ := [ṽ[0], ṽ[−1], ..., ṽ[1−N_q]]^T. Here, for l ∈ {0, ..., N_q−1}, equation (11) gives

    ṽ[−l] = w[−l] + h_l^T a,                                     (13)

where h_l := [h̃[−l], h̃[−l+P], ..., h̃[−l+(N_a−1)P]]^T and a := [a[0], a[−1], ..., a[1−N_a]]^T, so that

    ṽ = w + H a,                                                 (14)

where w := [w[0], w[−1], ..., w[1−N_q]]^T. In (14), the row vectors {h_l^T}_{l=0}^{N_q−1} were stacked to form the N_q × N_a matrix H. Throughout, we use underlined lower-case letters to represent column vectors, and underlined upper-case letters to represent matrices. Plugging (14) into (12), we get

    e[0] = q^T (w + H a) − a[−∆].                                (15)

Defining δ_∆ as the column vector with a 1 in the ∆-th place¹ and 0's elsewhere, we can write δ_∆^T a = a[−∆], which yields the final expression for the time-0 error:

    e[0] = q^T (w + H a) − δ_∆^T a                               (16)
         = q^T w + (q^T H − δ_∆^T) a.                            (17)

Next, we derive an expression for the MSE E. Notice that

    E = E{|e[0]|²} = E{e[0] e*[0]}                                           (18)
      = E{[q^T w + (q^T H − δ_∆^T) a][q^T w + (q^T H − δ_∆^T) a]*}           (19)
      = E{[q^T w + (q^T H − δ_∆^T) a][w^T q + a^T (H^T q − δ_∆)]*}           (20)
      = E{[q^T w + (q^T H − δ_∆^T) a][w^H q* + a^H (H^H q* − δ_∆)]},         (21)

¹ Note that ∆ = 0 indicates the first place, so that δ_0 = [1, 0, ..., 0]^T.
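As an illustration of (13)–(14) (added here, with hypothetical example values), the sketch below assembles the N_q × N_a matrix H directly from the effective channel taps: its (l, j)-th entry is h̃[−l + jP], taken as zero outside the support 0 ≤ k ≤ N_h − 1.

```python
import numpy as np

def build_H(h_eff, Nq, P):
    """Assemble the Nq x Na matrix H of (14): row l is
    [h_eff[-l], h_eff[-l+P], ..., h_eff[-l+(Na-1)P]], with h_eff[k] = 0 outside 0 <= k < Nh."""
    Nh = len(h_eff)
    Na = int(np.floor((Nh + Nq - 2) / P)) + 1      # number of contributing symbols, per (10)
    H = np.zeros((Nq, Na), dtype=complex)
    for l in range(Nq):
        for j in range(Na):                        # column j multiplies a[-j]
            k = -l + j * P
            if 0 <= k < Nh:
                H[l, j] = h_eff[k]
    return H

# Example usage with a hypothetical effective channel
h_eff = np.array([1.0, 0.5, -0.2 + 0.1j])
H = build_H(h_eff, Nq=6, P=2)
print(H.shape)    # (6, 4)
```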
where in (20) we transposed the scalar quantities on the right (e.g., q^T w = (q^T w)^T = w^T q), and in (21) we distributed the complex conjugate, using the Hermitian-transpose notation (·)^H := ((·)*)^T. Expanding (21) gives

    E = E{q^T w w^H q*} + E{q^T w a^H (H^H q* − δ_∆)}
        + E{(q^T H − δ_∆^T) a w^H q*} + E{(q^T H − δ_∆^T) a a^H (H^H q* − δ_∆)}          (22)
      = q^T E{w w^H} q* + q^T E{w a^H} (H^H q* − δ_∆)
        + (q^T H − δ_∆^T) E{a w^H} q* + (q^T H − δ_∆^T) E{a a^H} (H^H q* − δ_∆),         (23)

where in (23) we moved the non-random quantities through the expectations. Equation (23) can be simplified using the statistical properties of the random vectors a and w, which were each constructed from white sequences that are uncorrelated with each other. In particular, notice that

    E{w w^H} = E{ [w[0], w[−1], ..., w[1−N_q]]^T [w*[0], w*[−1], ..., w*[1−N_q]] }       (24)
             = E{ the N_q × N_q matrix whose (m,n)-th entry is w[−m] w*[−n] }            (25)
             = the N_q × N_q matrix whose (m,n)-th entry is E{w[−m] w*[−n]}              (26)
             = the N_q × N_q matrix whose (m,n)-th entry is σ_w² δ[n−m]                  (27)
             = σ_w² I,

where I denotes the identity matrix. The same reasoning can be used to show

    E{a a^H} = σ_a² I.                                           (28)

Similarly,

    E{a w^H} = the matrix whose (m,n)-th entry is E{a[−m] w*[−n]} = 0                    (29)
             = 0,                                                                        (30)

and E{w a^H} = (E{a w^H})^H = 0^H = 0. Applying these relationships to (23), we get

    E = σ_w² q^T q* + σ_a² (q^T H − δ_∆^T)(H^H q* − δ_∆)         (31)
      = σ_w² ‖q‖² + σ_a² ‖H^T q − δ_∆‖²,                         (32)
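The closed-form MSE (32) can be checked against a direct simulation of e[0] from (16). The sketch below is an added illustration (function names are placeholders); it assumes circular complex-Gaussian symbols and noise with the stated variances, so the two results should agree to within Monte-Carlo error.

```python
import numpy as np

def mse_closed_form(q, H, Delta, sigma_a2, sigma_w2):
    """MSE per (32): sigma_w^2 ||q||^2 + sigma_a^2 ||H^T q - delta_Delta||^2."""
    d = np.zeros(H.shape[1]); d[Delta] = 1.0
    return (sigma_w2 * np.sum(np.abs(q) ** 2)
            + sigma_a2 * np.sum(np.abs(H.T @ q - d) ** 2))

def mse_monte_carlo(q, H, Delta, sigma_a2, sigma_w2, trials=200_000, seed=0):
    """Empirical E{|e[0]|^2}, with e[0] = q^T (w + H a) - a[-Delta] as in (16)."""
    rng = np.random.default_rng(seed)
    Nq, Na = H.shape
    a = np.sqrt(sigma_a2 / 2) * (rng.standard_normal((trials, Na))
                                 + 1j * rng.standard_normal((trials, Na)))
    w = np.sqrt(sigma_w2 / 2) * (rng.standard_normal((trials, Nq))
                                 + 1j * rng.standard_normal((trials, Nq)))
    e0 = (w + a @ H.T) @ q - a[:, Delta]    # a[:, Delta] plays the role of a[-Delta]
    return np.mean(np.abs(e0) ** 2)
```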
which shows that the MSE consists of σ_w² ‖q‖² (due to noise) plus σ_a² ‖H^T q − δ_∆‖² (due to ISI and error in the end-to-end gain). In general, minimizing the sum of these two components requires making a tradeoff between them. For example, setting q = 0 would cancel the noise component but result in an end-to-end gain of zero. Similarly, choosing q such that H^T q = δ_∆ (if this is even possible) can amplify the noise.

To proceed further, we rewrite the MSE as follows:

    E = q^T (σ_w² I + σ_a² H H^H) q* − q^T (σ_a² H δ_∆) − (σ_a² δ_∆^T H^H) q* + σ_a² δ_∆^T δ_∆     (33)
      = q^T A q* − q^T b − b^H q* + σ_a²,                                                          (34)

where A := σ_w² I + σ_a² H H^H, b := σ_a² H δ_∆, and we used δ_∆^T δ_∆ = 1. By completing the square,² the MSE expression (34) can be put into a convenient form from which the MSE-minimizing q will become readily apparent. To do this, it is essential that the matrix A can³ be decomposed as A = B B^H, so that

    E = q^T B B^H q* − q^T b − b^H q* + σ_a²                                   (35)
      = q^T B B^H q* − q^T B B^{−1} b − b^H B^{−H} B^H q* + σ_a²               (36)
      = (q^T B − b^H B^{−H})(B^H q* − B^{−1} b) − b^H B^{−H} B^{−1} b + σ_a²   (37)
      = (B^H q* − B^{−1} b)^H (B^H q* − B^{−1} b) + σ_a² − b^H A^{−1} b,       (38)

where in (38) we used B^{−H} B^{−1} = A^{−1}. Note that the equalizer parameters only affect the first term in (38), which is non-negative. So, to minimize E via the choice of q, the best we can do is to set the first term in (38) to zero, at which point the second term, σ_a² − b^H A^{−1} b, specifies the minimum possible E. Thus, the MSE-minimizing equalizer parameters are those which give B^H q*_min = B^{−1} b, i.e.,

    q*_min = B^{−H} B^{−1} b = A^{−1} b = (σ_w² I + σ_a² H H^H)^{−1} H δ_∆ σ_a²            (39)
           = ((σ_w²/σ_a²) I + H H^H)^{−1} H δ_∆,                                           (40)

and the minimum MSE is

    E_min = σ_a² − b^H A^{−1} b                                                            (41)
          = σ_a² − σ_a² δ_∆^T H^H (σ_w² I + σ_a² H H^H)^{−1} H δ_∆ σ_a²                    (42)
          = σ_a² (1 − δ_∆^T H^H ((σ_w²/σ_a²) I + H H^H)^{−1} H δ_∆).                       (43)

² For scalar quantities, this means writing x² − 2xy + z = (x − y)² − y² + z.
³ To see this, we can use the singular value decomposition (SVD) H = U S V^H, where U and V are unitary and S is non-negative diagonal, to write A = σ_w² U U^H + σ_a² U S V^H V S U^H = U (σ_w² I + σ_a² S²) U^H =: U Σ² U^H. Thus, if we define B := U Σ, then A = B B^H. Note that B is guaranteed invertible when σ_w² > 0.
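Equations (40) and (43) translate directly into a few lines of linear algebra. The sketch below (added here; names are placeholders) solves for q_min and E_min, conjugating the solution of (40) since that equation gives q*_min.

```python
import numpy as np

def mmse_equalizer(H, Delta, sigma_a2, sigma_w2):
    """Return (q_min, E_min) per (40) and (43)."""
    Nq, Na = H.shape
    d = np.zeros(Na); d[Delta] = 1.0
    A = (sigma_w2 / sigma_a2) * np.eye(Nq) + H @ H.conj().T   # (sigma_w^2/sigma_a^2) I + H H^H
    x = np.linalg.solve(A, H @ d)                             # = q*_min, per (40)
    q_min = np.conj(x)
    E_min = sigma_a2 * (1.0 - np.real(d @ (H.conj().T @ x)))  # per (43)
    return q_min, E_min
```

Plugging the returned q_min into the closed-form MSE of (32) (e.g., the mse_closed_form sketch above) reproduces E_min, which provides a convenient consistency check.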
Finally, we can see how the delay ∆ can be optimized. Notice from (43) that the term

    δ_∆^T H^H ((σ_w²/σ_a²) I + H H^H)^{−1} H δ_∆

is simply the ∆-th diagonal element of the matrix H^H ((σ_w²/σ_a²) I + H H^H)^{−1} H, and that E_min decreases as this term gets bigger. Thus, the MSE-minimizing ∆ is simply the index of the maximum diagonal element of the matrix H^H ((σ_w²/σ_a²) I + H H^H)^{−1} H.
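Since the optimal ∆ is just the index of the largest diagonal entry, it can be found with one matrix solve. The sketch below (added here; names are placeholders) returns that index, which can then be passed to the equalizer-design step.

```python
import numpy as np

def best_delay(H, sigma_a2, sigma_w2):
    """Index of the largest diagonal entry of H^H ((sigma_w^2/sigma_a^2) I + H H^H)^{-1} H."""
    Nq = H.shape[0]
    A = (sigma_w2 / sigma_a2) * np.eye(Nq) + H @ H.conj().T
    M = H.conj().T @ np.linalg.solve(A, H)        # Na x Na matrix appearing in (43)
    return int(np.argmax(np.diag(M).real))
```

For example, Delta = best_delay(H, sigma_a2, sigma_w2) followed by mmse_equalizer(H, Delta, sigma_a2, sigma_w2) gives the delay-optimized MMSE design.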