Linear and Nonlinear Iterative Multiuser Detection
Alex Grant and Lars Rasmussen, University of South Australia, October 2011
Outline
1 Introduction
2 System Model
3 Multiuser Detection
4 Interference Cancellation
5 Linear Methods
6 Nonlinear Methods
7 Numerical Results
System Model
[Block diagram: each user k's data source b_k[i] is encoded by C_k, interleaved by π_k, and mapped to symbols x_k[j]; each symbol stream is spread by s_k[j] and scaled by amplitude a_k; the K signals are summed with noise n to form r[j], and the multiuser receiver produces outputs y_k[i].]
BPSK data: x_k[j] ∈ {-1, +1}
Spreading: s_k[i] ∈ {-1/√N, +1/√N}^N
AWGN: E{n[i] n^t[i]} = σ² I
System Model
[Same block diagram as the previous slide.]
r[i] = Σ_{k=1}^K s_k[i] a_k x_k[i] + n[i] = S[i] A x[i] + n[i]
System Model
[Same block diagram as the previous slide.]
y[i] = S^t[i] r[i] = S^t[i] S[i] A x[i] + S^t[i] n[i] = R[i] A x[i] + z[i]
Matched Filter Output
y[i] = S^t[i] r[i] = S^t[i] S[i] A x[i] + S^t[i] n[i] = R[i] A x[i] + z[i]
The matched filter output is a sufficient statistic.
R[i] = S^t[i] S[i] is the correlation matrix.
z[i] is colored Gaussian noise, E{z[i] z^t[i]} = σ² R[i].
Assume unit-norm modulation: R_kk = 1 and |R_kj| ≤ 1 for k ≠ j.
Optimal Joint Detection
Minimum-error-probability (MAP) multiuser detector:
x̂_MAP = arg max_{x ∈ {-1,+1}^K} Pr(x | r) = arg max_{x ∈ {-1,+1}^K} p(r | x) Pr(x).
For equiprobable data this reduces to the maximum-likelihood detector:
x̂_ML = arg max_{x ∈ {-1,+1}^K} p(r | x)
     = arg max_{x ∈ {-1,+1}^K} exp(-½ ‖r - SAx‖²)
     = arg max_{x ∈ {-1,+1}^K} exp(-½ (y - RAx)^t R^{-1} (y - RAx)).
Complexity: O(2^K).
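A minimal numpy sketch of this exhaustive search (the function name and dense-matrix interface are our own; the slides specify only the maximization):

```python
import numpy as np
from itertools import product

def ml_detector(y, R, A):
    """Exhaustive ML detection over all 2^K BPSK vectors.

    Minimizes (y - R A x)^t R^{-1} (y - R A x); complexity O(2^K),
    so this is only feasible for small K.
    """
    K = len(y)
    R_inv = np.linalg.inv(R)
    best_x, best_metric = None, np.inf
    for bits in product((-1.0, 1.0), repeat=K):
        x = np.asarray(bits)
        e = y - R @ (A @ x)
        metric = e @ R_inv @ e
        if metric < best_metric:
            best_metric, best_x = metric, x
    return best_x
```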
Optimal Joint Detection
W. van Etten. Maximum likelihood receiver for multiple channel transmission systems. IEEE Trans. Commun., 24(2):276–283, February 1976.
K. S. Schneider. Optimum detection of code division multiplexed signals. IEEE Trans. Aerosp. Electron. Systems, 15:181–185, January 1979.
R. Kohno, M. Hatori, and H. Imai. Cancellation techniques of co-channel interference in asynchronous spread spectrum multiple access systems. Electronics and Commun., 66-A:20–29, 1983.
S. Verdú. Minimum probability of error for asynchronous Gaussian multiple access channels. IEEE Trans. Inform. Theory, 32(1):85–96, January 1986.
Decorrelator
Suppose the correlation matrix R = S^t S is invertible:
x̂ = (S^t S)^{-1} S^t r = R^{-1} y = Ax + R^{-1} z,  E{R^{-1} z z^t R^{-1}} = σ² R^{-1}.
y consists of correlated data RAx and correlated noise (covariance σ²R); x̂ consists of decoupled data Ax and correlated noise (covariance σ²R^{-1}).
Multiple-access interference elimination vs. noise enhancement.
ML when relaxing x ∈ {-1,+1}^K to x ∈ R^K.
ML for unknown A (estimates Ax rather than x).
R. Lupas and S. Verdú. Linear multiuser detectors for synchronous code division multiple access channels. IEEE Trans. Inform. Theory, 35(1):123–136, January 1989.
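As an illustration, a minimal numpy sketch (our naming; S is the N × K spreading matrix, and the amplitudes are assumed positive so the sign recovers the symbols):

```python
import numpy as np

def decorrelator(r, S):
    """Decorrelating detector x_hat = (S^t S)^{-1} S^t r.

    lstsq solves the relaxed least-squares problem directly,
    avoiding an explicit matrix inverse.
    """
    ax_hat, *_ = np.linalg.lstsq(S, r, rcond=None)  # estimate of A x
    return np.sign(ax_hat)  # hard decisions (valid for positive amplitudes)
```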
Minimum Mean Squared Error Estimation
Let x and y be random vectors with
x̄ = E{x},  ȳ = E{y},  G_xy = (Cov{y, y})^{-1} Cov{y, x}.
The LMMSE estimate of x given y is x̄ + G^t_xy (y - ȳ).
For jointly Gaussian x, y this linear estimate in fact minimizes the mean squared error over all estimators.
H. V. Poor. An Introduction to Signal Detection and Estimation. Springer-Verlag, 1994.
Linear Minimum Mean Square Error Detectors
Chip-level (considering x, r) or symbol-level (considering x, y):
x̂_r = E{x} + G^t_xr (r - r̄),  x̂_y = E{x} + G^t_xy (y - ȳ).
Independent data with E{x} = x̄ and Cov{x, x} = I - diag(x̄ x̄^t) = V gives
x̂_r = x̄ + V A S^t (S A V A S^t + σ² I)^{-1} (r - S A x̄)
x̂_y = x̄ + V A R (R A V A R + σ² R)^{-1} (y - R A x̄)
    = x̄ + A^{-1} (R + σ² (A V A)^{-1})^{-1} (y - R A x̄).
Zero-mean data gives
x̂_r = A S^t (S A² S^t + σ² I)^{-1} r,  x̂_y = A^{-1} (R + σ² A^{-2})^{-1} y.
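A sketch of the zero-mean symbol-level filter (our naming; a is the vector of amplitudes, so A^{-2} is diagonal with entries 1/a_k²):

```python
import numpy as np

def lmmse_symbol_level(y, R, a, sigma2):
    """Symbol-level LMMSE for zero-mean data:
    x_hat = A^{-1} (R + sigma^2 A^{-2})^{-1} y."""
    a = np.asarray(a, dtype=float)
    M = R + sigma2 * np.diag(1.0 / a**2)
    return np.linalg.solve(M, y) / a  # solve, then apply A^{-1}
```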
Linear Minimum Mean Square Error Detectors
Z. Xie, R. T. Short, and C. K. Rushforth. A family of suboptimum detectors for coherent multiuser communications. IEEE J. Select. Areas Commun., 8(4):683–690, May 1990.
P. B. Rapajic and B. S. Vucetic. Adaptive receiver structures for asynchronous CDMA systems. IEEE J. Select. Areas Commun., 12(4):685–697, May 1994.
U. Madhow and M. L. Honig. MMSE interference suppression for direct sequence spread spectrum CDMA. IEEE Trans. Commun., 42(12):3178–3188, December 1994.
Interference Cancellation
Matched filter output for user k:
y_k = s_k^t r = a_k x_k + Σ_{j≠k} s_k^t s_j a_j x_j + s_k^t n
    = a_k x_k + Σ_{j≠k} R_kj a_j x_j + z_k,
where the middle term is the multiple-access interference (MAI).
User k could subtract an estimate of the MAI.
The estimate will not be perfect, leaving residual MAI.
This motivates an iterative cancellation approach.
Iterative Interference Cancellation
Let the estimate of user k at iteration n be x̂_k^(n):
x̂_k^(n+1) = a_k^{-1} s_k^t (r - Σ_{j≠k} s_j a_j x̂_j^(n))   (chip rate)
          = a_k^{-1} (y_k - Σ_{j≠k} R_kj a_j x̂_j^(n)).   (symbol rate)
Choosing the initial estimate x̂^(0) = 0 yields x̂_k^(1) = a_k^{-1} y_k.
If x̂^(n) = x, cancellation is perfect: x̂_k^(n+1) = x_k + z_k / a_k.
Parallel Cancellation
x̂^(n+1) = A^{-1} (y - (R - I) A x̂^(n)).
[Block diagram: all users cancel simultaneously at each stage; stage 0 produces x̂^(1) from y, stage 1 produces x̂^(2), and so on.]
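A direct numpy sketch of this recursion (our naming; with x̂^(0) = 0 the first stage returns the scaled matched-filter outputs):

```python
import numpy as np

def parallel_ic(y, R, a, n_iter=8):
    """Parallel IC: x^(n+1) = A^{-1} (y - (R - I) A x^(n)), x^(0) = 0."""
    K = len(y)
    x = np.zeros(K)
    for _ in range(n_iter):
        x = (y - (R - np.eye(K)) @ (a * x)) / a
    return x
```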
Serial Cancellation
Cancel estimates as soon as they are available:
x̂_k^(n+1) = a_k^{-1} s_k^t (r - Σ_{j=1}^{k-1} s_j a_j x̂_j^(n+1) - Σ_{j=k+1}^{K} s_j a_j x̂_j^(n))
          = a_k^{-1} (y_k - Σ_{j=1}^{k-1} R_kj a_j x̂_j^(n+1) - Σ_{j=k+1}^{K} R_kj a_j x̂_j^(n)).
Let R = L + L^t + I, where L is strictly lower triangular; then
x̂^(n+1) = A^{-1} (y - L A x̂^(n+1) - L^t A x̂^(n)).
Serial Cancellation
x̂^(n+1) = A^{-1} (y - L A x̂^(n+1) - L^t A x̂^(n)).
[Block diagram: within each stage, users are cancelled in sequence, each using the newest available estimates.]
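A sketch of the serial sweep (our naming); updating x in place realizes the L / L^t splitting, since users j < k have already been updated within the current sweep:

```python
import numpy as np

def serial_ic(y, R, a, n_iter=8):
    """Serial (successive) IC: user k cancels the newest available
    estimates (x_j^(n+1) for j < k, x_j^(n) for j > k)."""
    K = len(y)
    x = np.zeros(K)
    for _ in range(n_iter):
        for k in range(K):
            # sum_{j != k} R_kj a_j x_j, using R_kk = 1
            mai = R[k] @ (a * x) - a[k] * x[k]
            x[k] = (y[k] - mai) / a[k]
    return x
```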
Implementation via Residual Error Update
Chip-level parallel interference canceller:
x̂_k^(n+1) = a_k^{-1} s_k^t (r - Σ_{j=1}^K s_j a_j x̂_j^(n) + s_k a_k x̂_k^(n))
          = x̂_k^(n) + a_k^{-1} s_k^t (r - Σ_{j=1}^K s_j a_j x̂_j^(n))
          = x̂_k^(n) + a_k^{-1} s_k^t η^(n).
Residual error, or noise hypothesis:
η^(n) = r - Σ_{j=1}^K s_j a_j x̂_j^(n).
Under perfect cancellation the residual is the thermal noise, η^(n) = n.
Implementation via Residual Error Update
The same can be done for serial cancellation:
x̂_k^(n+1) = x̂_k^(n) + a_k^{-1} s_k^t η_k^(n),
with the residual error seen by user k after iteration n
η_k^(n) = r - Σ_{j=1}^{k-1} s_j a_j x̂_j^(n+1) - Σ_{j=k}^{K} s_j a_j x̂_j^(n).
Implementation via Residual Error Update
The iteration can be written in terms of an update to the residual error:
Δη_k^(n) = s_k a_k (x̂_k^(n-1) - x̂_k^(n)) = -s_k s_k^t η_k^(n-1).
[Interference cancellation module: inputs x̂_k^(n-1) and η^(n-1), outputs x̂_k^(n) and Δη_k^(n), built from the correlators s_k^t and s_k.]
η^(n) = η^(n-1) + Σ_{j=1}^K Δη_j^(n)   (parallel)
η_k^(n) = η_K^(n-1) + Δη_K^(n) for k = 1;  η_k^(n) = η_{k-1}^(n) + Δη_{k-1}^(n+1) for k > 1.   (serial)
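A sketch of the serial variant of this residual bookkeeping (our naming; S is N × K with columns s_k; a parallel variant would compute all Δη_j from the same η^(n-1) before applying them):

```python
import numpy as np

def residual_update_ic(r, S, a, n_iter=8):
    """Serial chip-level IC via the residual error
    eta = r - sum_j s_j a_j x_j (the "noise hypothesis").

    Each user touches eta only through its own signature s_k, so a
    full re-cancellation per user is avoided.
    """
    N, K = S.shape
    x = np.zeros(K)
    eta = r.copy()  # x = 0 implies eta^(0) = r
    for _ in range(n_iter):
        for k in range(K):
            x_new = x[k] + (S[:, k] @ eta) / a[k]
            eta += S[:, k] * a[k] * (x[k] - x_new)  # delta-eta update
            x[k] = x_new
    return x
```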
Modular Parallel Cancellation
[Block diagram: K interference cancellation modules operating in parallel; each stage takes (x̂_k^(n-1), η^(n-1)) for all users and produces (x̂_k^(n), η^(n)).]
Modular Serial Cancellation
[Block diagram: the same modules chained in series; each module passes its updated residual η_k^(n) directly to the next user's module.]
Tentative Decision Functions
Transmitted symbols are discrete: x_k ∈ {-1, +1}.
Estimates x̂_k^(n) could be any real number. What if |x̂_k^(n)| > 1?
Apply a non-linear tentative decision function ζ : R → [-1, +1].
[Module diagram: ζ is inserted between the correlator output and the estimate x̂_k^(n) in the interference cancellation module.]
Tentative Decision Functions
[Plot: the tentative decision functions tanh(x), clip(x), and sgn(x) over x ∈ [-2, 2].]
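The three functions in one short sketch (the tanh slope parameter is our assumption; in practice it is often tied to the residual-interference variance):

```python
import numpy as np

def dec_tanh(x, scale=1.0):
    return np.tanh(x / scale)     # soft decision

def dec_clip(x):
    return np.clip(x, -1.0, 1.0)  # clipped (identity on [-1, 1])

def dec_sgn(x):
    return np.sign(x)             # hard decision
```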
Linear Methods
Both the decorrelator and the LMMSE detector require a K × K matrix inversion.
Complexity scales as O(K³).
There are many lower-complexity approaches for solving linear systems:
Series expansions
Iterative matrix inversion
Gradient descent
These can all be implemented as interference cancellation.
Direct Solution Methods
The output of the decorrelator or LMMSE detector can be written as
x̂ = arg min_{x ∈ R^K} ‖Mx - y‖₂²,
where M = R for the decorrelator, M = RA for the normalized decorrelator, and M = R + σ²A^{-2} for LMMSE.
The solution to this unconstrained optimization problem satisfies Mx̂ = y.
Direct Solution
Gaussian elimination followed by back-substitution.
For symmetric M this is equivalent to Cholesky factorization M = FF^t followed by forward and backward substitution:
Fz = y   (forward substitution)
F^t x̂ = z   (backward substitution)
Cholesky factorization is O(K³/3); each substitution step is O(K²/2).
If M_ij = 0 for |i - j| > b, the Cholesky decomposition is O(K(b² + 3b)) and the substitution steps are O(Kb).
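A sketch using scipy's Cholesky helpers (one factorization, then two triangular solves):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def direct_solve(M, y):
    """Solve M x = y for symmetric positive definite M via M = F F^t."""
    factor = cho_factor(M)        # O(K^3 / 3)
    return cho_solve(factor, y)   # forward + backward substitution, O(K^2)
```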
Series Expansions
Find coefficients c_n such that M^{-1} = Σ_n c_n M^n.
[Block diagram: a cascade of multiplications by M applied to y, with tap weights c_0, c_1, c_2, ...]
With K terms this is O(K³). Truncate the series to n ≪ K terms.
Series Expansions
S. Moshavi, E. G. Kanterakis, and D. L. Schilling. Multistage linear receivers for DS-CDMA systems. Int. J. Wireless Inform. Networks, 3:1–17, January 1996.
D. Guo, L. K. Rasmussen, S. Sun, and T. J. Lim. A matrix-algebraic approach to linear parallel interference cancellation in CDMA. IEEE Trans. Commun., 48(1):152–161, January 2000.
R. R. Müller and S. Verdú. Design and analysis of low-complexity interference mitigation on vector channels. IEEE J. Select. Areas Commun., 19:1429–1441, August 2001.
D. Guo, S. Verdú, and L. K. Rasmussen. Asymptotic normality of linear multiuser receiver outputs. IEEE Trans. Inform. Theory, 48(12):3080–3095, December 2002.
Cayley-Hamilton: Finite Series
Theorem (Cayley-Hamilton)
p_M(M) = Σ_{n=0}^K (-1)^{K-n} c_{K-n} M^n = 0.
Hence any power of M can be written as a linear combination of M^n, n = 0, 1, ..., K - 1, and
M^{-1} = -(-1)^K / det(M) · Σ_{n=1}^K (-1)^{K-n} c_{K-n} M^{n-1}.
K-stage multistage implementation.
Computation of the c_n is as complex as matrix inversion.
There exist coefficients such that a finite power series implements matrix inversion exactly.
Taylor Series
Taylor series
(I + X)^{-1} = Σ_{n=0}^∞ (-X)^n,
convergent if the spectral radius satisfies ρ(X) < 1.
Setting X = M - I,
M^{-1} = Σ_{n=0}^∞ (-1)^n (M - I)^n,
convergent if ρ(M) < 2.
Taylor Series
First-order truncation gives the approximate decorrelator
x̂^(1) = (2I - R) y = y - (R - I) y,
where (R - I) y is the interference estimate.
Higher-order truncation:
x̂^(n) = y + (I - R) y + (I - R)² y + ... + (I - R)^n y = y - (R - I) x̂^(n-1).
This is parallel interference cancellation: the matched filter output y minus the interference estimate (R - I) x̂^(n-1) from the previous stage.
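A sketch of the truncated series as a multistage loop (amplitudes absorbed into y, i.e. A = I, which is our simplification):

```python
import numpy as np

def taylor_decorrelator(y, R, n_stages=4):
    """Truncated Taylor series for R^{-1} y:
    x^(0) = y, then x^(n) = y - (R - I) x^(n-1)."""
    K = len(y)
    x = y.copy()
    for _ in range(n_stages):
        x = y - (R - np.eye(K)) @ x
    return x
```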
Taylor Series: Decorrelator
[Block diagram: multistage structure computing x̂^(n) = y - (R - I) x̂^(n-1), with per-user correlator branches R_kj.]
Taylor Series: LMMSE
[Block diagram: the same multistage structure with an additional per-user term σ²/a_k² in each branch.]
Taylor Series: Interference Cancellation Module
[Module diagram: the interference cancellation module with a per-user scaling α_k applied to the previous estimate and 1/α_k on the residual path.]
Decorrelator: α_k = 1. LMMSE: α_k = 1 - σ²/a_k².
Taylor Series
Infinite series rather than finite series.
Convergent only if ρ(M) < 2.
Easy computation of the series coefficients.
LMMSE only marginally more complex than the decorrelator.
Taylor Series
S. Verdú. Multiuser Detection. Cambridge University Press, Cambridge, 1998.
N. B. Mandayam and S. Verdú. Analysis of an approximate decorrelating detector. Wireless Personal Commun., 6:97–111, June 1998.
S. Moshavi, E. G. Kanterakis, and D. L. Schilling. Multistage linear receivers for DS-CDMA systems. Int. J. Wireless Inform. Networks, 3:1–17, January 1996.
Iterative Solution Methods
Let M = M₁ - M₂. Then (M₁ - M₂) x̂ = y.
Fixed-point equation: M₁ x̂ = y + M₂ x̂.
Motivates an iteration of the form
M₁ x^(n+1) = y + M₂ x^(n).
Requirements:
1 Easy to solve M₁ x = z, e.g. M₁ triangular or diagonal.
2 Choose M₁ and M₂ so the iteration converges quickly.
Iterative Solution Methods
Let e^(n) = x̂ - x^(n). Then
e^(n) = (M₁^{-1} M₂)^n e^(0),  ‖e^(n)‖ ≤ ‖(M₁^{-1} M₂)^n‖ ‖e^(0)‖.
Theorem
A necessary and sufficient condition for convergence of the iteration in any norm is ρ(M₁^{-1} M₂) < 1.
Jacobi Iteration
Choose
M₁ = ωD,  M₂ = ωD - M.
With x^(0) = y, the iteration becomes
x^(n+1) = ω^{-1} D^{-1} (y - (M - ωD) x^(n)).
Implements a Taylor series expansion of M^{-1}: multistage parallel interference cancellation.
Theorem
The Jacobi implementation of the decorrelator with M₁ = ωI is convergent for any ω > 0 such that ρ(R) < 2ω.
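A sketch of the Jacobi splitting for a generic M (decorrelator: M = R; LMMSE: M = R + σ²A^{-2}); the helper name is ours:

```python
import numpy as np

def jacobi(y, M, omega=1.0, n_iter=8):
    """Jacobi iteration x^(n+1) = (omega D)^{-1} (y - (M - omega D) x^(n)),
    with D = diag(M) and x^(0) = y as on the slide."""
    d = omega * np.diag(M)  # diagonal of M_1 as a vector
    x = y.copy()
    for _ in range(n_iter):
        x = (y - (M @ x - d * x)) / d
    return x
```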
Jacobi Iteration: LMMSE
For ω = 1, the LMMSE Jacobi iteration is
x^(n+1) = (I + A^{-2} σ²)^{-1} (y - (R - I) x^(n)).
Per-user signal-to-noise-ratio scaling at each iteration.
Theorem
The Jacobi iterative implementation of the LMMSE filter is convergent if and only if
ρ_J = ρ((I + A^{-2} σ²)^{-1} (I - R)) < 1.
Jacobi Iteration: LMMSE
Theorem
The Jacobi iterative implementation of the LMMSE filter is convergent if
ρ(R - I) < 1 + γ_max^{-1},  where γ_max = max_k a_k²/σ².
The iteration is not convergent if
ρ(R - I) > 1 + γ_min^{-1},  where γ_min = min_k a_k²/σ².
Gauss-Seidel Iteration
Choose
M₁ = ω^{-1} D + L,  M₂ = (1 - ω) ω^{-1} D - L^t.
This results in the iteration (with x^(0) = y)
(ω^{-1} D + L) x^(n+1) = y + ((1 - ω) ω^{-1} D - L^t) x^(n).
Theorem
The successive cancellation (Gauss-Seidel) iteration is convergent for symmetric positive definite M and ω ∈ (0, 2).
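A sketch of the component-wise sweep (equivalent to the splitting above; ω = 1 gives plain Gauss-Seidel, other ω gives SOR):

```python
import numpy as np

def gauss_seidel(y, M, omega=1.0, n_iter=8):
    """Gauss-Seidel / SOR sweep for M x = y, updating in place so each
    user immediately sees the newest estimates (serial cancellation)."""
    K = len(y)
    x = y.copy()
    for _ in range(n_iter):
        for k in range(K):
            off_diag = M[k] @ x - M[k, k] * x[k]
            x[k] = (1 - omega) * x[k] + omega * (y[k] - off_diag) / M[k, k]
    return x
```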
Descent Algorithms
For symmetric positive definite M, define the norm
‖x‖²_{M^{-1}} = ‖M^{-1/2} x‖₂² = x^t M^{-1} x,
and the objective
f(x) = ½ ‖Mx - y‖²_{M^{-1}}.
The least-squares solution is
x̂ = arg min_{x ∈ R^K} f(x).
Note that
∇f(x) = (∂f/∂x_1, ∂f/∂x_2, ..., ∂f/∂x_K)^t = Mx - y.
The gradient equals the error vector e = Mx - y, and the unique stationary point satisfies Mx = y.
Descent Algorithms
Descent algorithms take the form
x^(n+1) = x^(n) + t_n d^(n).
The direction d^(n) is chosen to reduce the objective function.
The step size t_n is chosen at each iteration to minimize the objective function along the direction d^(n).
Steepest Descent
The search direction is the negative gradient of the objective function, d^(n) = -∇f(x^(n)), resulting in
x̂^(n+1) = x̂^(n) - t_n ∇f(x̂^(n)) = t_n y - (t_n M - I) x̂^(n),
where the optimal step size is
t_n = ‖e^(n)‖₂² / ‖M^{1/2} e^(n)‖₂².
Jacobi and Gauss-Seidel are steepest descent algorithms with suboptimal t_n.
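A sketch of the optimal-step recursion (x^(0) = 0 is our choice of initialization):

```python
import numpy as np

def steepest_descent(y, M, n_iter=8):
    """Steepest descent for M x = y with the optimal step size
    t_n = ||e||_2^2 / ||M^{1/2} e||_2^2, where e = M x - y."""
    x = np.zeros_like(y)
    for _ in range(n_iter):
        e = M @ x - y              # gradient = error vector
        t = (e @ e) / (e @ M @ e)  # optimal step (M positive definite)
        x -= t * e
    return x
```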
Steepest Descent
Theorem
The error norm of the steepest descent algorithm with optimal step size decreases geometrically with rate at least
1 - λ_min/λ_max,
where λ_min and λ_max are the smallest and largest eigenvalues of M.
Conjugate Gradient
A better approach: make each new search direction M-orthogonal to all previous directions,
(d^(n+1), M d^(j)) = 0,  j = 0, 1, ..., n.
Choose a linear combination of the current error vector (the steepest descent direction) and the previous direction,
d^(0) = e^(0),  d^(n+1) = e^(n+1) + β_n d^(n),
where the combining coefficient
β_n = -(e^(n+1), M d^(n)) / (d^(n), M d^(n))
ensures the orthogonality.
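A sketch in the slide's notation (e = Mx - y, directions combined via β_n; x^(0) = 0 is our choice of initialization):

```python
import numpy as np

def conjugate_gradient(y, M, n_iter=8):
    """Conjugate gradient for M x = y with e = M x - y and
    d^(n+1) = e^(n+1) + beta_n d^(n), beta_n enforcing M-orthogonality."""
    x = np.zeros_like(y)
    e = M @ x - y
    d = e.copy()
    for _ in range(n_iter):
        Md = M @ d
        t = (e @ e) / (d @ Md)       # optimal step along d
        x -= t * d                   # descend
        e = M @ x - y                # new error vector
        beta = -(e @ Md) / (d @ Md)  # makes (d_new, M d) = 0
        d = e + beta * d
    return x
```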
Conjugate Gradient
Theorem
The error norm for the conjugate gradient algorithm decreases geometrically with rate at least
(√κ - 1)/(√κ + 1),
where κ = λ_max/λ_min is the condition number of M.
Conjugate Gradient
Efficient implementation: only two extra inner products and one extra vector addition compared to parallel cancellation.
[Block diagram: conjugate gradient datapath with multiplication by M, inner products producing t_n and β_n, and direction/error updates.]
Constrained Optimization
Linear detectors are solutions to the unconstrained optimization problem
x̂ = arg min ½ ‖Mx - y‖²_{M^{-1}},
which ignores the discrete alphabet of x.
Idea: introduce manageable constraints to approximate the discrete alphabet:
min f(x) subject to g_i(x) ≤ 0, i = 1, 2, ..., m.
Constrained Optimization
Use barrier functions to replace the constrained problem with an equivalent unconstrained problem
min f(x) + Σ_{i=1}^m I(g_i(x)),
where the sum is the penalty function and the ideal barrier function I is
I(u) = 0 for u ≤ 0,  I(u) = ∞ for u > 0.
Constrained Optimization
Approximate I(·) by a differentiable approximation b(u) ≈ I(u).
Define the penalty function
ϕ(x) = Σ_{i=1}^m b(g_i(x)).
New optimization problem:
min f(x) + ϕ(x).
Gradient Descent
Gradient descent on f(x) + ϕ(x):
x̂^(n+1) = t_n y - (t_n M - I) x̂^(n) - t_n ϕ′(x̂^(n+1)).
Setting t_n = 1,
x̂^(n+1) + ϕ′(x̂^(n+1)) = y - (M - I) x̂^(n).
Supposing ξ(x) = x + ϕ′(x) has an inverse function ζ = ξ^{-1},
x̂^(n+1) = ζ(y - (M - I) x̂^(n)):
nonlinear parallel interference cancellation!
Nonlinear Iteration
Theorem
The non-linear iteration
x̂^(n+1) = ζ(y - (M - I) x̂^(n))
is a gradient method for the numerical solution of
min ½ ‖Mx - y‖²_{M^{-1}} + ϕ(x),
where ϕ satisfies ζ^{-1}(x) = ϕ′(x) + x. If ϕ is convex, then the iteration converges to the unique point x̂ satisfying
M x̂ + ϕ′(x̂) = y.
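Putting the pieces together, a sketch of nonlinear parallel IC with a tentative decision function (M = R and unit amplitudes are our simplifications; dec may be np.tanh, a clip, or np.sign):

```python
import numpy as np

def nonlinear_pic(y, R, dec=np.tanh, n_iter=8):
    """Nonlinear parallel IC: x^(n+1) = zeta(y - (M - I) x^(n))."""
    K = len(y)
    x = np.zeros(K)
    for _ in range(n_iter):
        x = dec(y - (R - np.eye(K)) @ x)
    return np.sign(x)  # final hard decisions
```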
Tentative Decision Functions
[Plot: the tentative decision functions tanh, clip, and sgn over x ∈ [-1, 1], zoomed to show their behavior near the origin.]
Parallel cancellation decorrelator, K = 8, E_b/N_0 = 7 dB.
[Plot: bit error rate vs. system load K/N for linear, sgn, clip, and tanh tentative decisions, with the decorrelator as reference.]
Parallel cancellation, K = 8, N = 32, E_b/N_0 = 7 dB.
[Plot: bit error rate vs. iteration (1-8) for linear, sgn, tanh, and clip tentative decisions, with the decorrelator as reference.]
Parallel, K = 8, N = 32, 8 iterations.
[Plot: bit error rate vs. E_b/N_0 (0-14 dB) for linear, sgn, clip, and tanh tentative decisions, with the decorrelator as reference.]
Serial, K = 8, N = 16, E_b/N_0 = 10 dB.
[Plot: bit error rate vs. iteration (1-8) for linear, sgn, tanh, and clip tentative decisions, with the decorrelator and LMMSE as references.]
Serial, K = 8, N = 16, 8 iterations.
[Plot: bit error rate vs. E_b/N_0 (0-12 dB) for linear, tanh, and clip tentative decisions, with the decorrelator as reference.]
Descent algorithms, K = 8, N = 16, E_b/N_0 = 10 dB.
[Plot: bit error rate vs. iteration (1-8) for serial cancellation, conjugate gradient, and gradient descent, with the decorrelator as reference.]
Descent, LMMSE, K = 8, N = 16, E_b/N_0 = 10 dB.
[Plot: bit error rate vs. iteration (1-8) for conjugate gradient, serial cancellation, and gradient descent, with LMMSE as reference.]
Decorrelator & LMMSE, K = 8, N = 16, E_b/N_0 = 10 dB.
[Plot: bit error rate vs. iteration (1-8) for conjugate gradient, linear serial cancellation, and clipped serial cancellation, with the decorrelator and LMMSE as references.]
For more information...
A. Grant and L. Rasmussen, "Iterative Techniques," Chapter 3 in Advances in Multiuser Detection, Wiley, 2009.