NEURAL NETWORKS FOR TRANSMISSION OVER NONLINEAR MIMO CHANNELS


NEURAL NETWORKS FOR TRANSMISSION OVER NONLINEAR MIMO CHANNELS

by

AL MUKHTAR AL-HINAI

A thesis submitted to the Department of Electrical and Computer Engineering in conformity with the requirements for the degree of Master of Science (Engineering)

Queen's University
Kingston, Ontario, Canada
August, 2007

Copyright AL Mukhtar AL-Hinai, 2007

To my Father, who taught me the virtue of modesty
To my Mother, who taught me the meaning of ambition

Abstract

Multiple-input multiple-output (MIMO) systems have gained an enormous amount of attention as one of the most promising research areas in wireless communications. However, while MIMO systems have been extensively explored over the past decade, few schemes acknowledge the nonlinearity caused by the use of high power amplifiers (HPAs) in the communication chain. When HPAs operate near their saturation points, nonlinear distortions are introduced in the transmitted signals, and the resulting MIMO channel will be nonlinear. The nonlinear distortion is further exacerbated by the fading caused by the propagation channel. The goal of this thesis is: 1) to use neural networks (NNs) to model and identify nonlinear MIMO channels; and 2) to employ the proposed NN model in designing efficient detection techniques for these types of MIMO channels.

In the first part of the thesis, we follow previous work on modeling and identification of nonlinear MIMO channels, where it has been shown that a block-oriented NN scheme allows not only good identification of the overall MIMO input-output transfer function but also good characterization of each component of the system. The proposed scheme employs an ordinary gradient descent based algorithm to update the NN weights during the learning process, and it assumes only real-valued inputs. In this thesis, natural gradient (NG) descent is used for training the NN. Moreover, we derive an improved variation of the previously proposed NN scheme to avoid the input type restriction and allow for complex modulated inputs as well. We also investigate the scheme's tracking capabilities for time-varying nonlinear MIMO channels. Simulation results show that NG

descent learning significantly outperforms ordinary gradient descent in terms of convergence speed, mean squared error (MSE) performance, and nonlinearity approximation. Moreover, the NG descent based NN provides better tracking capabilities than the previously proposed NN.

The second part of the thesis focuses on signal detection. We propose a receiver that employs the neural network channel estimator (NNCE) proposed in part one and uses the Zero-Forcing Vertical Bell Laboratories Layered Space-Time (ZF V-BLAST) detection algorithm to retrieve the transmitted signals. Computer simulations show that in slowly time-varying environments the performance of our receiver is close to that of the ideal V-BLAST receiver in which the channel is perfectly known. We also present a NN based linearization technique for HPAs, which takes advantage of the channel information provided by the NNCE. This linearization technique can be used for adaptive data predistortion at the transmitter side or adaptive nonlinear equalization at the receiver side. Simulation results show that, when higher-order modulation schemes (>16-QAM) are used, the nonlinear distortion caused by the use of HPAs is greatly reduced by our proposed NN predistorter and the performance of the communication system is significantly improved.

Acknowledgements

I would like to thank my supervisor, Dr. Mohamed Ibnkahla, for his help, guidance, and encouragement. I would also like to thank Dr. Aboelmagd Noureldin, Dr. Saeed Gazor, and Dr. Il-Min Kim for their valuable comments during my defense. I extend my gratitude to all the members of the Satellite and Mobile Communication Lab for their friendship and help. I would especially like to thank Ali Alamdar for taking the time to proof-read my thesis. A special thank you goes to the Ministry of Higher Education in the Sultanate of Oman for their financial support. I am infinitely grateful to my friend Tricia Armson for her long-time encouragement and support. Without the smile of the moon I would be lost in the void.

My deepest gratitude goes to my parents. To my Mom, who endured the patience of time waiting for this moment that I may return home. To my Dad, who always believed in the importance of education. Thank you for believing in me and giving me the motivation to continue even through the trying times. Dad, you made it possible to have the education to buy all the shoes I need.

Contents

Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Summary of Abbreviations and Symbols

Chapter 1: Introduction
1.1 Motivation
1.2 Background and Literature Review
1.2.1 Neural Network
1.2.2 V-BLAST Detection Algorithm
1.2.3 HPA Linearization
1.3 Thesis Contribution
1.4 Thesis Outline

Chapter 2: Nonlinear MIMO Channel Model and Neural Network Scheme
2.1 Nonlinear MIMO Channel Model
2.2 Neural Network Scheme
2.3 Learning Algorithms
2.3.1 Nonlinear Adaptive Algorithms
2.3.2 Linear Adaptive Algorithms

Chapter 3: Neural Network Modeling and Identification of Nonlinear MIMO Channels
3.1 Study Case
3.2 Modeling and Identification of Nonlinear MIMO Channels
3.3 M x L MIMO Systems (M > 2, L > 2)
3.4 Tracking of Time-Varying Nonlinear MIMO Channels

Chapter 4: Improved Neural Network Modeling and Identification Scheme
4.1 Channel Model
4.2 Identification Structure
4.3 Learning Algorithm
4.4 Simulation Results

Chapter 5: Applications
5.1 Channel Model
5.2 A Neural Network Based V-BLAST Receiver
5.2.1 The NNCE
5.2.2 V-BLAST Detection Algorithm
5.3 LMS-NN Equalizer
5.4 NN Predistorter
5.5 Simulation

Chapter 6: Conclusion and Future Work
6.1 Conclusion
6.2 Suggestions for Future Work

References

List of Figures

Fig. 2.1: Nonlinear MIMO channel
Fig. 2.2: NN identification structure
Fig. 2.3: Adaptive system diagram
Fig. 3.1: MSE vs. µ: comparison between BP and NG based algorithms (M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.2: Smoothed MSE curve (LMS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.3: Evolution of the normalized weights (LMS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.4: HPA nonlinearity g1(x) and normalized NN1(x) (LMS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.5: HPA nonlinearity g2(x) and normalized NN2(x) (LMS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.6: Smoothed MSE curve (RLS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.7: Evolution of the normalized weights (RLS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.8: HPA nonlinearity g1(x) and normalized NN1(x) (RLS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.9: HPA nonlinearity g2(x) and normalized NN2(x) (RLS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.10: Smoothed MSE curve (LMS-NGBP, M = L = 2, µ = , N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.11: Evolution of the normalized weights (LMS-NGBP, M = L = 2, µ = , N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.12: HPA nonlinearity g1(x) and normalized NN1(x) (LMS-NGBP, M = L = 2, µ = , N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.13: HPA nonlinearity g2(x) and normalized NN2(x) (LMS-NGBP, M = L = 2, µ = , N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.14: Smoothed MSE curve (RLS-NGBP, M = L = 2, µ = , N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.15: Evolution of the normalized weights (RLS-NGBP, M = L = 2, µ = , N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.16: HPA nonlinearity g1(x) and normalized NN1(x) (RLS-NGBP, M = L = 2, µ = , N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.17: HPA nonlinearity g2(x) and normalized NN2(x) (RLS-NGBP, M = L = 2, µ = , N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.18: Smoothed MSE curves: comparison between BP and NG based algorithms (M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.19: Smoothed MSE curves for N = 3, 5 and 7 (LMS-BP, M = L = 2, µ = 0.09, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.20: Smoothed MSE curves for N = 3, 5 and 7 (LMS-NGBP, M = L = 2, µ = , SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.21: MSE vs. µ: comparison between BP and NG based algorithms (M = L = 3, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.22: Smoothed MSE curve (LMS-BP, M = L = 3, µ = 0.07, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.23: Evolution of the normalized weights (LMS-BP, M = L = 3, µ = 0.07, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.24: HPA nonlinearity g1(x) and normalized NN1(x) (LMS-BP, M = L = 3, µ = 0.07, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.25: HPA nonlinearity g2(x) and normalized NN2(x) (LMS-BP, M = L = 3, µ = 0.07, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.26: HPA nonlinearity g3(x) and normalized NN3(x) (LMS-BP, M = L = 3, µ = 0.07, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.27: Smoothed MSE curve (RLS-BP, M = L = 3, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.28: Evolution of the normalized weights (RLS-BP, M = L = 3, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.29: HPA nonlinearity g1(x) and normalized NN1(x) (RLS-BP, M = L = 3, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.30: HPA nonlinearity g2(x) and normalized NN2(x) (RLS-BP, M = L = 3, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.31: HPA nonlinearity g3(x) and normalized NN3(x) (RLS-BP, M = L = 3, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.32: Smoothed MSE curve (LMS-NGBP, M = L = 3, µ = 0.0, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.33: Evolution of the normalized weights (LMS-NGBP, M = L = 3, µ = 0.0, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.34: HPA nonlinearity g1(x) and normalized NN1(x) (LMS-NGBP, M = L = 3, µ = 0.0, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.35: HPA nonlinearity g2(x) and normalized NN2(x) (LMS-NGBP, M = L = 3, µ = 0.0, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.36: HPA nonlinearity g3(x) and normalized NN3(x) (LMS-NGBP, M = L = 3, µ = 0.0, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.37: Smoothed MSE curve (RLS-NGBP, M = L = 3, µ = , N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.38: Evolution of the normalized weights (RLS-NGBP, M = L = 3, µ = , N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.39: HPA nonlinearity g1(x) and normalized NN1(x) (RLS-NGBP, M = L = 3, µ = , N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.40: HPA nonlinearity g2(x) and normalized NN2(x) (RLS-NGBP, M = L = 3, µ = , N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.41: HPA nonlinearity g3(x) and normalized NN3(x) (RLS-NGBP, M = L = 3, µ = , N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.42: Smoothed MSE curves: comparison between BP and NG based algorithms (M = L = 3, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.43: Time varying case - MSE vs. µ: comparison between BP and NG based algorithms (M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.44: Time varying case - Smoothed MSE curves: comparison between LMS-BP and RLS-BP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.45: Time varying case: comparison between LMS-BP and RLS-BP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.46: Time varying case - HPA nonlinearity g1(x) and normalized NN1(x): comparison between LMS-BP and RLS-BP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.47: Time varying case - HPA nonlinearity g2(x) and normalized NN2(x): comparison between LMS-BP and RLS-BP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.48: Time varying case - Smoothed MSE curves: comparison between LMS-NGBP and RLS-NGBP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.49: Time varying case: comparison between LMS-NGBP and RLS-NGBP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.50: Time varying case - HPA nonlinearity g1(x) and normalized NN1(x): comparison between LMS-NGBP and RLS-NGBP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.51: Time varying case - HPA nonlinearity g2(x) and normalized NN2(x): comparison between LMS-NGBP and RLS-NGBP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.52: Time varying case - Smoothed MSE curves: comparison between BP and NG based algorithms (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.53: Time varying case - Smoothed MSE curves: RLS-NGBP performance comparison for different forgetting factors (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.54: RBF identification structure
Fig. 3.55: MSE vs. µ: RBF identification structure (M = L = 2, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.56: Smoothed MSE curves: comparison between NN scheme and RBF structure (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.57: RBF identification of the time-varying channel: h11(n) and normalized w11(n) (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.58: RBF HPA nonlinearity approximation: g1(x) and normalized NN1(x) (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 3.59: RBF HPA nonlinearity approximation: g2(x) and normalized NN2(x) (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)
Fig. 4.1: Improved NN identification structure
Fig. 4.2: Improved NN model - MSE vs. µ: comparison between LMS-BP and LMS-NGBP algorithms (M = L = 2, N = 5, SNR = 60 dB, 16-QAM)
Fig. 4.3: Improved NN model - Smoothed MSE curves: comparison between LMS-BP and LMS-NGBP algorithms (M = L = 2, N = 5, SNR = 60 dB, 16-QAM)
Fig. 4.4: Improved NN model - Evolution of the normalized weights (LMS-BP, M = L = 2, µ = 0.05, N = 5, SNR = 60 dB, 16-QAM)

Fig. 4.5: Improved NN model - Evolution of the normalized weights (LMS-NGBP, M = L = 2, µ = 0.003, N = 5, SNR = 60 dB, 16-QAM)
Fig. 4.6: Improved NN model - HPA A/A characteristic of g1(x): true curve and normalized NN models; (o) and (*) represent the three 16-QAM amplitudes and their corresponding outputs for LMS-NGBP and LMS-BP (M = L = 2, N = 5, SNR = 60 dB)
Fig. 4.7: Improved NN model - HPA A/A characteristic of g2(x): true curve and normalized NN models; (o) and (*) represent the three 16-QAM amplitudes and their corresponding outputs for LMS-NGBP and LMS-BP (M = L = 2, N = 5, SNR = 60 dB)
Fig. 4.8: Improved NN model - HPA A/P characteristic of g1(x): true curve and normalized NN models; (o) and (*) represent the three 16-QAM amplitudes and their corresponding outputs for LMS-NGBP and LMS-BP (M = L = 2, N = 5, SNR = 60 dB)
Fig. 4.9: Improved NN model - HPA A/P characteristic of g2(x): true curve and normalized NN models; (o) and (*) represent the three 16-QAM amplitudes and their corresponding outputs for LMS-NGBP and LMS-BP (M = L = 2, N = 5, SNR = 60 dB)
Fig. 4.10: Improved NN model - Time varying case: MSE vs. µ: comparison between LMS-BP and LMS-NGBP algorithms (M = L = 2, N = 5, SNR = 60 dB, 16-QAM)
Fig. 4.11: Improved NN model - Time varying case: Smoothed MSE curve: comparison between LMS-BP and LMS-NGBP algorithms (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, 16-QAM)
Fig. 4.12: Improved NN model - Identification of the time-varying channel: comparison between LMS-BP and LMS-NGBP algorithms (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, 16-QAM)
Fig. 4.13: Improved NN model - Time varying case: HPA A/A characteristic of g1(x): (o) and (*) represent the three 16-QAM amplitudes and their corresponding outputs for LMS-NGBP and LMS-BP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB)
Fig. 4.14: Improved NN model - Time varying case: HPA A/A characteristic of g2(x): (o) and (*) represent the three 16-QAM amplitudes and their corresponding outputs for LMS-NGBP and LMS-BP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB)
Fig. 4.15: Improved NN model - Time varying case: HPA A/P characteristic of g1(x): (o) and (*) represent the three 16-QAM amplitudes and their corresponding outputs for LMS-NGBP and LMS-BP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB)
Fig. 4.16: Improved NN model - Time varying case: HPA A/P characteristic of g2(x): (o) and (*) represent the three 16-QAM amplitudes and their corresponding outputs for LMS-NGBP and LMS-BP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB)
Fig. 4.17: Improved NN model - Time varying case: Smoothed MSE curve: comparison between LMS-BP and LMS-NGBP algorithms (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, 64-QAM)
Fig. 4.18: Improved NN model - Identification of the time-varying channel: comparison between LMS-BP and LMS-NGBP algorithms (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB, 64-QAM)
Fig. 4.19: Improved NN model - Time varying case: HPA A/A characteristic of g1(x): (o) and (*) represent the three 64-QAM amplitudes and their corresponding outputs for LMS-NGBP and LMS-BP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB)
Fig. 4.20: Improved NN model - Time varying case: HPA A/A characteristic of g2(x): (o) and (*) represent the three 64-QAM amplitudes and their corresponding outputs for LMS-NGBP and LMS-BP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB)
Fig. 4.21: Improved NN model - Time varying case: HPA A/P characteristic of g1(x): (o) and (*) represent the three 64-QAM amplitudes and their corresponding outputs for LMS-NGBP and LMS-BP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB)

Fig. 4.22: Improved NN model - Time varying case: HPA A/P characteristic of g2(x): (o) and (*) represent the three 64-QAM amplitudes and their corresponding outputs for LMS-NGBP and LMS-BP (fd = 0.0001, M = L = 2, N = 5, SNR = 60 dB)
Fig. 4.23: Improved NN model - Smoothed MSE curves: comparison between LMS-BP and LMS-NGBP for 16-QAM and 64-QAM
Fig. 4.24: Improved NN model - Smoothed MSE curves: performance comparison between fd = and fd = 0.001 for 64-QAM
Fig. 5.1: Simplified nonlinear MIMO communication system
Fig. 5.2: Neural network based V-BLAST receiver: detection part
Fig. 5.3: LMS-NN equalizer
Fig. 5.4: NN predistorter
Fig. 5.5: Training of the nonlinearity inverter
Fig. 5.6: g1(x) - HPA A/A: amplitude linearization
Fig. 5.7: g2(x) - HPA A/A: amplitude linearization
Fig. 5.8: g1(x) - HPA A/P: phase shift cancellation
Fig. 5.9: g2(x) - HPA A/P: phase shift cancellation
Fig. 5.10: Rectangular 16-QAM constellation
Fig. 5.11: g1(x) - Output distorted 16-QAM constellation
Fig. 5.12: g2(x) - Output distorted 16-QAM constellation
Fig. 5.13: g1(x) - Output 16-QAM constellation after predistortion
Fig. 5.14: g2(x) - Output 16-QAM constellation after predistortion
Fig. 5.15: Rectangular 64-QAM constellation
Fig. 5.16: g1(x) - Output distorted 64-QAM constellation
Fig. 5.17: g2(x) - Output distorted 64-QAM constellation
Fig. 5.18: g1(x) - Output 64-QAM constellation after predistortion
Fig. 5.19: g2(x) - Output 64-QAM constellation after predistortion
Fig. 5.20: Frame structure
Fig. 5.21: BER vs. SNR (normalized fd = 10^-5, 16-QAM)
Fig. 5.22: BER vs. SNR (normalized fd = 0.0001, 16-QAM)
Fig. 5.23: BER vs. SNR (normalized fd = 10^-5, 32-QAM)
Fig. 5.24: BER vs. SNR (normalized fd = 0.0001, 32-QAM)
Fig. 5.25: BER vs. SNR (normalized fd = 10^-5, 64-QAM)
Fig. 5.26: BER vs. SNR (normalized fd = 0.0001, 64-QAM)

List of Tables

Table 3.1: MSE and scaling factors for the different algorithms
Table 5.1: SNR (dB) needed to reach 10^-4 BER for the different proposed detection techniques; (-) indicates that the 10^-4 BER cannot be achieved (fd = 10^-5)
Table 5.2: SNR (dB) needed to reach 10^-4 BER for the different proposed detection techniques; (-) indicates that the 10^-4 BER cannot be achieved (fd = 0.0001)

Summary of Abbreviations and Symbols

Abbreviations

A/A    Amplitude-to-Amplitude Conversion
A/P    Amplitude-to-Phase Conversion
AWGN   Additive White Gaussian Noise
BER    Bit Error Rate
BP     Back-Propagation
CSI    Channel State Information
dB     Decibel
DD     Decision Directed
FIM    Fisher Information Matrix
FIR    Finite Impulse Response
HPA    High Power Amplifier
LMS    Least Mean Squares
MIMO   Multiple-Input Multiple-Output
MMSE   Minimum Mean-Squared Error
MSE    Mean Squared Error
NG     Natural Gradient
NGBP   Natural Gradient Back-Propagation
NN     Neural Network
NNCE   Neural Network Channel Estimator
QAM    Quadrature Amplitude Modulation
RLS    Recursive Least Squares

SNR        Signal-to-Noise Ratio
TS         Training Sequence
ZF         Zero-Forcing
ZF V-BLAST Zero-Forcing Vertical Bell Laboratories Layered Space-Time

Symbols

A(r_k)           A/A response of the k-th HPA
a_ki             Input weight of the i-th neuron of the k-th block
b_ki             Bias weight of the i-th neuron of the k-th block
c_ki             Output weight of the i-th neuron of the k-th block
e_j              The j-th error
e_j^I            Imaginary part of the j-th error
e_j^R            Real part of the j-th error
f(.)             Neural network activation function
f_d              Normalized Doppler frequency
G                Inverse of the Fisher information matrix
g_k(.)           The system's k-th memoryless nonlinearity
H                Linear combiner matrix
H_estimate       Estimate of the channel matrix H
(H_estimate)^+   Moore-Penrose pseudo-inverse of H_estimate
(H_estimate)_ki  The k_i-th column of H_estimate

(H_estimate)_i   Matrix obtained by zeroing the columns k_1, k_2, ..., k_i of H_estimate
(H_estimate)_i^+ Pseudo-inverse of (H_estimate)_i
J                Cost function
L                Number of antennas at the receiver side
M                Number of antennas at the transmitter side
N                Number of neurons in each neural network block
N_j              The j-th white noise
NN_k^G(r_k)      The k-th neural network block gain output
NN_k(.)          The k-th neural network block
NN_k^P(r_k)      The k-th neural network block phase output
P(r_k)           A/P response of the k-th HPA
Q(.)             Quantizer to the nearest constellation point
r_k              Amplitude of the k-th input signal
s_j              The neural network's j-th output
x_k              The system's k-th input
W                Neural network weight matrix
(W_i)_j          The j-th row of W_i
y_j              The system's j-th output
alpha_Gk, beta_Gk, alpha_Pk, beta_Pk   Parameters of the k-th HPA
theta_k          System parameter vector for the k-th neural network block
µ                Learning rate

∇_θ              Ordinary gradient with respect to the matrix θ
λ                RLS forgetting factor
γ_k              Scaling factor
φ_k              Phase of the k-th input signal

Chapter 1

Introduction

Nonlinear MIMO systems are increasingly applied in several engineering fields, including satellite communications [23, 35], system control [42, 44], and control of underwater vehicles [4, 9, 25].

1.1 Motivation

The quest for higher data rates in wireless communications is growing at a rapid pace. To achieve this goal, MIMO systems have been widely investigated as a means of increasing channel capacity [1]. The MIMO concept can offer significantly higher data rates and capacity, with no additional bandwidth, when the channel exhibits rich scattering and its variation can be accurately tracked [2].

In order to achieve higher data rates and fulfill the power requirement at the same time, MIMO communication systems may be equipped with high power amplifiers (HPAs). However, HPAs introduce nonlinearity into the system when operating near their saturation regions. This nonlinearity imposes a major restriction on the modulation scheme the communication system can use. For instance, multi-level modulation schemes, such as M-ary (M > 4) Quadrature Amplitude Modulation (QAM), are sensitive to nonlinear distortion due to their large envelope fluctuation; hence, the system is restricted to simple modulation schemes such as Binary Phase Shift Keying (BPSK) [3, 5, 23, 26, 35].
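To make the HPA distortion concrete, the sketch below implements the Saleh model, a widely used memoryless HPA description built from an amplitude-to-amplitude (A/A) and an amplitude-to-phase (A/P) conversion; the parameter values here are illustrative placeholders, not the ones used in this thesis.

```python
import numpy as np

def saleh_hpa(x, alpha_g=2.0, beta_g=1.0, alpha_p=np.pi / 3, beta_p=1.0):
    """Memoryless Saleh HPA: A/A gain A(r) and A/P shift P(r) applied to a complex input."""
    r = np.abs(x)
    phi = np.angle(x)
    A = alpha_g * r / (1.0 + beta_g * r**2)       # A/A conversion: compresses large amplitudes
    P = alpha_p * r**2 / (1.0 + beta_p * r**2)    # A/P conversion: amplitude-dependent phase shift
    return A * np.exp(1j * (phi + P))

# A small input sees a nearly linear gain; a large input is compressed toward saturation
print(abs(saleh_hpa(0.1 + 0j)))  # ≈ 0.198 (close to alpha_g * 0.1)
print(abs(saleh_hpa(2.0 + 0j)))  # 0.8 (far below alpha_g * 2.0)
```

This envelope compression is precisely why large-envelope constellations such as M-ary QAM suffer much more than constant-envelope schemes such as BPSK.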

In addition to the nonlinear behavior of HPAs, there is the further challenge of the variations of the wireless fading channel. The parameters of the time-varying channel are not directly observed and thus must be accurately tracked. Therefore, to reach maximum throughput, knowledge of accurate and timely channel state information (CSI) is extremely important in wireless MIMO communication systems. For example, the receiver needs to know the channel for accurate detection and data demodulation [27]. However, modeling the channel's time-varying parameters and the HPA nonlinearities is a highly challenging task, especially when both the nonlinearity and the fading parameters are unknown.

The goal of this thesis is to improve a previously proposed neural network channel estimator (NNCE) for nonlinear MIMO channels composed of M inputs, M memoryless nonlinearities, a linear combiner, and L outputs. The improved NNCE is expected to adaptively identify the overall MIMO input-output transfer function and characterize each component of the channel. The NNCE is then applied to the design of efficient detection techniques for the type of nonlinear MIMO channels under study. These detection techniques include: a Zero-Forcing Vertical Bell Laboratories Layered Space-Time (ZF V-BLAST) receiver, an LMS-NN equalizer, and a NN based predistorter.

1.2 Background and Literature Review

Power efficiency represents the ability of a system to reliably transmit information at the lowest practical power level, while spectral efficiency is the ability of a system (e.g., of its modulation scheme) to accommodate data within an allocated bandwidth

[20]. High power efficiency can be achieved through the employment of HPAs within the communication chain. However, HPAs cause nonlinear distortion to the transmitted signals, which becomes significant when high-level modulation schemes are used, thus restricting the system to simple and spectrally inefficient modulation schemes. High-level modulation schemes are particularly susceptible to the distortion caused by HPAs because of their large envelope fluctuation. Therefore, the task of achieving power- and spectrally-efficient communication systems remains highly challenging, particularly when considering the nonlinearity problem caused by the use of HPAs. One way to address this challenge is to accurately identify the channel parameters.

1.2.1 Neural Network

Neural networks have been widely applied to modeling and identification of nonlinear MIMO channels. This is mainly due to their universal approximation, learning, and adaptation abilities, which make them powerful modeling tools for nonlinear dynamic systems [4, 5, 9, 2, 22, 28]. When adopting a block-oriented structure, the NN model is capable of characterizing not only the overall input-output transfer function but also individual components of the channel. The block-oriented approach copies the general physical structure of the channel to be identified [7]. In other words, for each component in the channel there is a corresponding component in the block-oriented NN model.

Several ways exist for training NNs. One such way is supervised learning, where a set of training data is presented to the neural network. The training set consists of pairs of inputs and desired outputs. The network parameters are then continuously adjusted

under the effect of the training data and the error signal; the error signal is defined as the difference between the actual output of the network and the desired output [4, 7].

One of the well-known adaptive learning algorithms used for training NNs is the back-propagation (BP) algorithm. BP, however, has two major limitations: first, it has a slow convergence speed, and second, it can get trapped in local minima, resulting in a suboptimal approximation. Natural gradient (NG) learning, on the other hand, has been shown to have better convergence speed than classical BP because it takes into account the geometry of the coordinate system in which the NN parameters evolve. This makes NG learning better at escaping the plateau regions that are typical of classical BP [8, 20].

1.2.2 V-BLAST Detection Algorithm

Several detection algorithms have been proposed in order to exploit the high spectral capacity offered by MIMO channels. One such algorithm is the V-BLAST algorithm, where the data streams are independently encoded, transmitted from each transmit antenna simultaneously, and detected at the receiver by a nulling and successive cancellation scheme [10, 40]. The specifics of the detection process depend on the method used to perform the nulling cancellation, the most common choices being zero-forcing (ZF) and minimum mean-squared error (MMSE). The detection algorithm described in this thesis is based on the ZF criterion due to its simplicity.
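For a linear channel y = Hx + n, the nulling and successive cancellation procedure can be sketched as follows. This is a minimal ZF V-BLAST illustration with Moore-Penrose pseudo-inverse nulling and post-detection-SNR ordering, not the nonlinearity-aware variant developed later in this thesis.

```python
import numpy as np

def zf_vblast_detect(y, H, slicer):
    """ZF V-BLAST: ordered nulling and successive interference cancellation."""
    H = H.astype(complex)
    y = y.astype(complex).copy()
    M = H.shape[1]
    x_hat = np.zeros(M, dtype=complex)
    remaining = list(range(M))
    while remaining:
        Hp = np.linalg.pinv(H[:, remaining])   # Moore-Penrose pseudo-inverse (nulling)
        norms = np.sum(np.abs(Hp) ** 2, axis=1)
        i = int(np.argmin(norms))              # least noise amplification first
        k = remaining[i]
        x_hat[k] = slicer(Hp[i] @ y)           # null, then slice to the constellation
        y -= H[:, k] * x_hat[k]                # cancel the detected stream
        remaining.pop(i)
    return x_hat

# QPSK example (noise-free): detection recovers the transmitted symbols
qpsk = lambda z: (np.sign(z.real) + 1j * np.sign(z.imag)) / np.sqrt(2)
H = np.array([[1.0, 0.5], [0.2, 0.9]], dtype=complex)
x = np.array([1 + 1j, -1 + 1j]) / np.sqrt(2)
print(np.allclose(zf_vblast_detect(H @ x, H, qpsk), x))  # True
```

At each step, the stream whose pseudo-inverse row has the smallest norm (i.e., the least ZF noise amplification) is detected first, and its contribution is removed before the next step.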

1.2.3 HPA Linearization

Two approaches have been proposed to overcome the distortion caused by HPAs: predistortion [3, 32, 39] and equalization [6, 29, 38]. Predistortion is performed at the transmitter side, prior to the amplification stage, and aims at pre-canceling the nonlinear effects by modeling the inverse of the amplifier characteristics. Equalization, on the other hand, is performed at the receiver side by post-canceling the amplifier's nonlinear distortions.

1.3 Thesis Contribution

Our first contribution is to examine a previously proposed NN scheme for nonlinear MIMO channel modeling and identification. We investigate the performance of the NN scheme under natural gradient descent learning and compare it to that obtained under ordinary gradient descent learning. Indeed, NG descent learning significantly outperforms ordinary gradient descent learning in terms of convergence speed, mean squared error (MSE) performance, and nonlinearity approximation. As a result, the NG descent based NN scheme allows better characterization of the different parts of the unknown MIMO channel as well as excellent adaptive identification of the overall MIMO input-output transfer function. In addition, the NN scheme exhibits better tracking capabilities under NG descent learning.

Our second contribution is to modify the previously proposed NN scheme to account for complex-valued signals. The improved scheme is able to model the amplitude distortion as well as the phase shift caused by the HPAs, which was not possible with the previous NN scheme.
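The predistortion idea can be sketched numerically: drive the HPA with a pre-warped amplitude so that the cascade of predistorter and HPA is linear. Here the inverse of an illustrative Saleh-type A/A curve is obtained by bisection over its monotonic region below saturation; in the thesis, a NN is instead trained to approximate such an inverse.

```python
import numpy as np

def saleh_aa(r, alpha=2.0, beta=1.0):
    """Illustrative Saleh A/A conversion (monotonic for r <= 1/sqrt(beta))."""
    return alpha * r / (1.0 + beta * r**2)

def predistort(r_target, r_max=1.0, iters=60):
    """Numerically invert the A/A curve by bisection on [0, r_max]."""
    lo, hi = 0.0, r_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if saleh_aa(mid) < r_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# The cascade predistorter -> HPA reproduces the desired amplitude
r_des = 0.6
print(abs(saleh_aa(predistort(r_des)) - r_des) < 1e-9)  # True
```

Note that only amplitudes up to the saturation output can be reached, so predistortion linearizes the amplifier over its usable range but cannot extend that range.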

Our third contribution is to propose an adaptive NN based V-BLAST receiver for nonlinear MIMO systems. The receiver is composed of the proposed neural network channel estimator and a ZF V-BLAST detection algorithm. The ZF V-BLAST algorithm has been modified so that the channel nonlinearity is taken into consideration. Computer simulations show that in slowly time-varying environments the performance of our receiver is close to that of the ideal V-BLAST receiver in which the channel is perfectly known.

Our fourth contribution is to present a NN based linearization technique for the HPAs used in the MIMO communication systems under study. We apply our proposed NN scheme to approximate the inverse of the nonlinearities caused by the HPAs, which can then be used for adaptive data predistortion at the transmitter side or adaptive nonlinear equalization at the receiver side. Simulation results show that, when higher-order modulation schemes (>16-QAM) are used, the nonlinear distortion caused by the use of HPAs is greatly reduced by our proposed NN predistorter and the performance of the communication system is significantly improved.

1.4 Thesis Outline

The rest of this thesis is organized as follows. In Chapter 2, we state the class of nonlinear MIMO channels to be identified. Then, we present the previously proposed block-oriented NN identification structure. Finally, we survey the different learning algorithms that will be used to train the proposed NN scheme. The algorithms to be studied are a combination of nonlinear adaptive algorithms, ordinary Back-Propagation (BP) and Natural Gradient Back-Propagation (NGBP), and

linear adaptive algorithms, Least Mean Squares (LMS) and Recursive Least Squares (RLS).

In Chapter 3, we test the NN scheme by simulating a 2 x 2 MIMO system. The scheme's performance is examined and compared under each of the algorithms discussed in Chapter 2. The comparison is done in terms of MSE, convergence speed, and nonlinearity approximation. In addition, by simulating a 3 x 3 MIMO system, we illustrate that the NN identification structure can be extended to model MIMO systems with a higher number of antennas. Finally, we apply the NN scheme to the tracking of time-varying nonlinear MIMO channels.

In Chapter 4, we modify the NN model to make it more suitable for complex-valued signals. The new NN model is able to model the phase shift caused by the HPAs.

In Chapter 5, we introduce a neural network V-BLAST receiver for nonlinear MIMO channels. The receiver is composed of a neural network channel estimator and a ZF V-BLAST detection algorithm. We also propose a NN based linearization technique for HPAs, which can be used either at the transmitter or the receiver side.

In Chapter 6, we summarize the conclusions and make suggestions for future work.

Chapter 2

Nonlinear MIMO Channel Model and Neural Network Scheme

In this chapter, we state the nonlinear MIMO channel model that will be used throughout this thesis. Then, we delineate the NN scheme used for modeling the nonlinear channel. Finally, we present several learning algorithms used to train the proposed NN structure.

2.1 Nonlinear MIMO Channel Model

Fig. 2.1: Nonlinear MIMO channel

Fig. 2.1 shows the nonlinear MIMO channel considered in this thesis [5]. The system consists of M uncorrelated inputs x_k (k = 1, ..., M), each of which is nonlinearly transformed by a memory-less nonlinearity g_k(.). The outputs of these nonlinearities are then linearly combined by the L×M matrix H = [h_jk].

The j-th output can be expressed as:

y_j = Σ_{i=1}^{M} h_{ji} g_i(x_i) + N_j,  j = 1, ..., L   (2.1)

where N_j is a white noise. The system output-input relation can be expressed in matrix form as:

[y_1, y_2, ..., y_L]^t = H [g_1(x_1), g_2(x_2), ..., g_M(x_M)]^t + [N_1, N_2, ..., N_L]^t   (2.2)

Depending on the application, the linear combiner H can be static or time varying. Our modeling approach assumes that only the structure of the nonlinear MIMO channel is known. In other words, we know that the MIMO channel is composed of nonlinear memory-less blocks and a linear combining matrix, but we do not know the values or the behavior of those components.

2.2 Neural Network Scheme

The neural network scheme proposed for modeling the nonlinear MIMO channel is shown in Fig. 2.2 [7]. It is composed of M NN blocks, modeling the channel nonlinearities, followed by an adaptive L×M linear combiner, modeling the linear component of the channel.
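To make the channel model concrete, the relation y = H g(x) + N of Eq. 2.2 can be sketched in a few lines. The matrix H and the nonlinearities g1 and g2 below are illustrative stand-ins, not the exact study case of the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

M, L = 2, 2                       # number of inputs and outputs
H = np.array([[1.0, 0.3],         # linear combiner; values are illustrative
              [0.3, 1.0]])

def g1(x):                        # example memoryless, HPA-style saturation
    return 2.0 * x / (1.0 + x ** 2)

def g2(x):                        # second example nonlinearity
    return 1.0 - np.exp(-2.0 * x ** 2)

def channel(x, sigma=1e-3):
    """y = H g(x) + N (Eq. 2.2): per-branch nonlinearity, then linear mixing."""
    gx = np.array([g1(x[0]), g2(x[1])])
    return H @ gx + sigma * rng.standard_normal(L)

x = rng.standard_normal(M)        # zero-mean white Gaussian inputs
y = channel(x)
```

Only the block structure (memoryless nonlinearities followed by a mixing matrix) matters here; the identification scheme below never assumes knowledge of H, g1, or g2.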

Fig. 2.2: NN identification structure

Each block k has a scalar input x_k (k = 1, ..., M), N neurons, and a scalar output:

NN_k(x_k) = Σ_{i=1}^{N} c_ki f(a_ki x_k + b_ki),  k = 1, ..., M   (2.3)

where f(.) is the NN activation function, and a_ki, b_ki, c_ki represent, respectively, the input weight, bias weight, and output weight of the i-th neuron of the k-th block. The output NN_k of the k-th block is connected to the j-th output of the system through the weight w_jk. The j-th system output is then expressed as:

s_j = Σ_{k=1}^{M} w_jk NN_k,  j = 1, ..., L   (2.4)

A decoupling constraint on C_k is introduced to avoid error-propagation ambiguity (i.e., which part of the nonlinear channel caused the error), where the vector C_k = [c_k1, ..., c_kN] represents the output weight set of block k. The NN model output-input relation can be expressed in matrix form as:

[s_1, s_2, ..., s_L]^t = W [NN_1(x_1), NN_2(x_2), ..., NN_M(x_M)]^t   (2.5)

where W = [w_jk] (j = 1, ..., L; k = 1, ..., M) is the weight matrix.

2.3 Learning Algorithms

The NN model uses supervised learning during the training process to update the weight parameters. The unknown MIMO channel and the proposed NN model are fed with the same input vector, as shown in Fig. 2.3. At each iteration, the NN parameters are updated

in order to minimize a cost function, which is taken here to be the sum of the squared errors between the known channel outputs and the corresponding model outputs:

J(n) = Σ_{j=1}^{L} e_j(n)²   (2.6)

where e_j = y_j − s_j.

Fig. 2.3: Adaptive system diagram

The task of training the NN scheme can be split into two parts:

- Nonlinear training: this part corresponds to the training of the memory-less NN blocks, and it uses nonlinear adaptive algorithms to update the NN weights a_ki, b_ki, c_ki (k = 1, ..., M; i = 1, ..., N).

- Linear training: this part uses linear adaptive algorithms to update the coefficients of the linear combiner matrix W = [w_jk] (j = 1, ..., L; k = 1, ..., M).
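Before detailing the individual algorithms, the forward model of Eqs. 2.3-2.4 can itself be sketched in a few lines (f = erf, as used later in the simulations; all weight values below are random placeholders, not trained values):

```python
import math
import numpy as np

rng = np.random.default_rng(1)
M, L, N = 2, 2, 5                       # blocks, outputs, neurons per block

# Per-block weights: a (input), b (bias), c (output), plus combiner W.
a = rng.standard_normal((M, N))
b = rng.standard_normal((M, N))
c = rng.standard_normal((M, N))
W = rng.standard_normal((L, M))

def nn_block(k, x_k):
    """Scalar output of block k (Eq. 2.3), with f = erf."""
    return sum(c[k, i] * math.erf(a[k, i] * x_k + b[k, i]) for i in range(N))

def model(x):
    """Model outputs s_j = sum_k w_jk NN_k(x_k) (Eq. 2.4)."""
    nn = np.array([nn_block(k, x[k]) for k in range(M)])
    return W @ nn

s = model(rng.standard_normal(M))       # s has shape (L,)
```

Because |erf| ≤ 1, each block output is bounded by the sum of the magnitudes of its output weights, which is what makes a constraint on C_k a natural normalization.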

2.3.1 Nonlinear Adaptive Algorithms

The system parameter vector for NN block k will be denoted by θ_k, which includes all NN parameters to be updated in block k:

θ_k = [a_k1, ..., a_kN, b_k1, ..., b_kN, c_k1, ..., c_kN]^t

Ordinary Gradient Descent Back-Propagation (BP) Algorithm [7]

The BP algorithm updates the weights by following the ordinary gradient descent of the error surface:

θ_k(n+1) = θ_k(n) − µ ∇_θk J   (2.7)

where µ is a small positive constant and ∇_θk denotes the ordinary gradient with respect to the vector θ_k, which is expressed as:

∇_θk J = −2 Σ_{j=1}^{L} e_j ∇_θk s_j   (2.8)

where the entries of ∇_θk s_j follow from Eqs. 2.3 and 2.4: for the i-th neuron of block k, ∂s_j/∂a_ki = w_jk c_ki x_k f′(a_ki x_k + b_ki), ∂s_j/∂b_ki = w_jk c_ki f′(a_ki x_k + b_ki), and ∂s_j/∂c_ki = w_jk f(a_ki x_k + b_ki).

Natural Gradient Descent Back-Propagation (NGBP) Algorithm

The ordinary gradient is the steepest descent direction of a cost function only if the space of parameters is an orthonormal coordinate system. It has been shown [1, 20, 4, 43] that

in the case of multilayer networks, the steepest descent direction of the loss function is actually given by:

∇̃_θ J = G⁻¹ ∇_θ J   (2.9)

where G⁻¹ is the inverse of the Fisher information matrix (FIM) G = [g_{i,j}], with:

g_{i,j} = E[(∂J/∂θ_i)(∂J/∂θ_j)]   (2.10)

Therefore, the neural network weights are updated as follows:

θ_k(n+1) = θ_k(n) − µ ∇̃_θk J   (2.11)

The calculation of the expectation in the expression of G requires the probability distribution of the inputs x_k(n) (k = 1, ..., M), which is unknown in most cases. Moreover, the inversion of G is computationally costly. To obtain G⁻¹ directly, a Kalman filter technique is used [2, 20]:

Ĝ⁻¹(n+1) = (1 + ε_n) Ĝ⁻¹(n) − ε_n Ĝ⁻¹(n) ∇_θ s_j (∇_θ s_j)^t Ĝ⁻¹(n)   (2.12)

A search-and-converge schedule is used for ε_n in order to obtain a good trade-off between convergence speed and stability:

ε_n = (ε_0 + c_ε n/τ) / (1 + (c_ε/ε_0)(n/τ) + n²/τ)   (2.13)

such that small n corresponds to a search phase (ε_n is close to ε_0) and large n corresponds to a convergence phase (ε_n behaves as c_ε/n for large n), where ε_0, c_ε, and τ are positive real constants. Using this online Kalman filter technique, the update of the weights (Eq. 2.11) becomes:

θ_k(n+1) = θ_k(n) − µ Ĝ⁻¹ ∇_θk J   (2.14)
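The BP step and its natural-gradient variant can be sketched for a single block as follows. The weights, learning rate, and schedule constants are illustrative assumptions, and the search-and-converge schedule follows the form given for ε_n above (ε_n ≈ ε_0 for small n, ε_n ≈ c_ε/n for large n):

```python
import math
import numpy as np

# Sketch of the BP gradient (Eq. 2.8) for one block k with f = erf, and of the
# natural-gradient machinery (Eqs. 2.12-2.14). All constants are illustrative.
N = 5
rng = np.random.default_rng(2)
a, b, c = rng.standard_normal(N), rng.standard_normal(N), rng.standard_normal(N)
w = np.array([0.8, -0.5])            # weights w_jk, j = 1..L, for this block
mu = 0.01

f  = np.vectorize(math.erf)
fp = lambda z: 2.0 / math.sqrt(math.pi) * np.exp(-z * z)   # erf'(z)

def grad_J(x_k, e):
    """Ordinary gradient of J w.r.t. theta_k = [a, b, c] (Eq. 2.8)."""
    z = a * x_k + b
    ew = float(e @ w)                # common factor sum_j e_j w_jk
    return np.concatenate([-2.0 * ew * c * x_k * fp(z),   # d s_j / d a_ki terms
                           -2.0 * ew * c * fp(z),         # d s_j / d b_ki terms
                           -2.0 * ew * f(z)])             # d s_j / d c_ki terms

# Search-and-converge step-size schedule for the inverse-Fisher recursion:
eps0, c_eps, tau = 0.1, 1.0, 1000.0
def eps(n):
    return (eps0 + c_eps * n / tau) / (1.0 + (c_eps / eps0) * (n / tau) + n * n / tau)

def ng_update(P, grad_s, n):
    """One step of Eq. 2.12: P <- (1 + eps_n) P - eps_n P g g^t P, P ~ G^{-1}."""
    e = eps(n)
    return (1.0 + e) * P - e * P @ np.outer(grad_s, grad_s) @ P

theta = np.concatenate([a, b, c])
P = np.eye(3 * N)                     # initial estimate of G^{-1}
g_J = grad_J(0.5, np.array([0.3, 0.1]))
theta_bp = theta - mu * g_J           # BP update (Eq. 2.7)
theta_ng = theta - mu * P @ g_J       # NGBP update (Eq. 2.14)
```

With P initialized to the identity the two updates coincide; they diverge as the recursion shapes P into an estimate of the inverse Fisher matrix.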

2.3.2 Linear Adaptive Algorithms

Least-Mean-Square (LMS) Algorithm [3]

The LMS algorithm is one of the most widely used adaptive filtering algorithms. It is a stochastic gradient descent method in that the weights are updated based only on the error signal at the current time. The error is defined as the difference between the desired signal and the actual signal, and the underlying cost function is minimized by driving this error signal to zero. The LMS algorithm may be described in words as follows:

updated value of tap-weight vector = old value of tap-weight vector + learning-rate parameter × tap-input vector × error signal

Therefore, the elements of matrix W are updated as follows:

s_j = Σ_{k=1}^{M} w_jk NN_k   (2.15)

e_j = y_j − s_j   (2.16)

w_jk(n+1) = w_jk(n) + 2µ e_j NN_k   (2.17)

The LMS algorithm is simple to implement; however, its major limitation is a relatively slow rate of convergence.

Recursive Least-Squares (RLS) Algorithm [3, 6]

The RLS algorithm may be described in words as follows:

updated value of the state = old value of the state + Kalman gain × innovation vector

Hence, we update the elements of matrix W as follows:

K(n+1) = P̂(n) NN / (λ + NN^t P̂(n) NN)   (2.18)

s_j = Σ_{k=1}^{M} w_jk NN_k   (2.19)

e_j = y_j − s_j   (2.20)

w_jk(n+1) = w_jk(n) + K e_j   (2.21)

P̂(n+1) = (1/λ) (P̂(n) − K NN^t P̂(n))   (2.22)

The RLS forgetting factor is λ, and the initial value of P̂ is P̂(0) = I, where I is the identity matrix. The RLS algorithm is capable of realizing a rate of convergence that is, in general, much faster than that of the LMS algorithm, because the RLS algorithm utilizes all the information contained in the input data from the start of the adaptation up to the present. However, this is attained at the expense of increased computational complexity.

The learning curve is defined as the evolution of the mean squared error during the learning process. The MSE decreases until the system reaches a steady state, where only slight changes in the weights are observed. The learning process may then be stopped if the modeled MIMO channel is static; however, if the MIMO channel is time varying, then the learning process should continue in order to keep tracking the time-varying parameters.
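The two linear updates can be sketched on a toy noise-free identification problem; W_true, the step size, and the forgetting factor are illustrative assumptions, and the white-noise regressor stands in for the vector of NN block outputs:

```python
import numpy as np

# LMS (Eqs. 2.15-2.17) for the whole combiner W, then RLS (Eqs. 2.18-2.22)
# for a single row of W. Noise-free setup; all values are illustrative.
rng = np.random.default_rng(3)
L, M, mu, lam = 2, 2, 0.01, 0.999
W_true = np.array([[1.0, 0.3],
                   [0.3, 1.0]])

# --- LMS ---
W = np.zeros((L, M))
for n in range(5000):
    nn = rng.standard_normal(M)           # stands in for block outputs NN_k
    e = W_true @ nn - W @ nn              # Eqs. 2.15-2.16 (desired minus model)
    W += 2.0 * mu * np.outer(e, nn)       # Eq. 2.17 for every (j, k) at once

# --- RLS (single output j, so w is one row of the combiner) ---
w = np.zeros(M)
P = np.eye(M)                             # P(0) = I
for n in range(5000):
    nn = rng.standard_normal(M)
    K = P @ nn / (lam + nn @ P @ nn)      # Eq. 2.18: Kalman gain
    e = W_true[0] @ nn - w @ nn           # Eqs. 2.19-2.20: innovation
    w = w + K * e                         # Eq. 2.21
    P = (P - np.outer(K, nn) @ P) / lam   # Eq. 2.22
```

In the actual scheme the regressor would be the evolving NN block outputs rather than white noise, but the structure of both recursions is unchanged.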

Chapter 3

Neural Network Modeling and Identification of Nonlinear MIMO Channels

In this chapter we evaluate, through simulation, the performance of the NN scheme described in Chapter 2. The LMS-BP, LMS-NGBP, RLS-BP, and RLS-NGBP algorithms are tested in training the NN scheme, and the performance of the scheme under each algorithm is compared in terms of convergence speed, MSE performance, and nonlinearity approximation.

3.1 Study Case

For the purpose of conducting the simulation, we choose a study case which we think is comprehensive and illustrative. In this chapter, the inputs are zero-mean white Gaussian processes with unit variance. A 2×2 MIMO system is considered, with the unknown nonlinearities to be identified chosen as the saturating memory-less functions g_1(x) = α_1 x / (1 + β_1 x²) and g_2(x) = α_2 (1 − exp(−β_2 x²)), where α_1 = β_2 = 2 and α_2 = β_1 = 1. These nonlinear functions are reasonable models for the amplitude conversions of nonlinear high power amplifiers used in digital communications. For this chapter, the noise is taken as white Gaussian with variance σ². Later in this chapter, the simulation is extended to include 3×3 MIMO systems. Each block in the proposed NN scheme is composed of N = 5 neurons. The NN activation function is taken to be the erf function. The RLS forgetting factor is taken to

be λ. We have tested the NN model for a range of values of µ.

3.2 Modeling and Identification of Nonlinear MIMO Channels

In this section we consider a fixed MIMO channel, in which the linear combiner matrix is static. In other words, the matrix H is assigned fixed values and does not change during the simulation. For illustration purposes we choose H as a 2×2 matrix with off-diagonal entries equal to 0.3.

The MSE vs. µ curves of BP and NG (at 50,000 iterations) are shown in Fig. 3.1. It can be seen that µ = 0.09 is the optimal value for the MSE performance of LMS-BP and RLS-BP, while LMS-NGBP and RLS-NGBP attain their optimal MSE at a different value of µ. When the learning rate is too small, the MSE curve converges slowly and the MSE value remains high, as a minimum is not yet reached. On the other hand, when the learning rate is too large, the NN performance experiences an increase in the MSE value, moving away from the minimum, and eventually becomes unstable. Therefore, to achieve near-optimum performance, one must carefully choose the learning rate which results in the lowest MSE.

Fig. 3.1: MSE vs. µ: comparison between BP and NG based algorithms (M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.2, Fig. 3.6, Fig. 3.10, and Fig. 3.14 show the learning curves of the LMS-BP, RLS-BP, LMS-NGBP, and RLS-NGBP algorithms, respectively. Table 3.1 lists the MSE for the different algorithms. The MSE values are calculated by averaging over the last 5,000 iterations of the MSE curves. Fig. 3.3, Fig. 3.7, Fig. 3.11, and Fig. 3.15 demonstrate the transient behavior of the weights of matrix W. The memory-less nonlinearities have been successfully identified by the NN approach (see Fig. 3.4, Fig. 3.5, Fig. 3.8, Fig. 3.9, Fig. 3.12, Fig. 3.13, Fig. 3.16, and Fig. 3.17). It is important to note that each of the nonlinearities, as well as the coupling matrix H, has been identified to within a scaling factor. That is, NN_1(x) has converged to γ_1 g_1(x), NN_2(x) has converged to γ_2 g_2(x), and

W has converged to H diag(1/γ_1, 1/γ_2), i.e., w_jk has converged to h_jk/γ_k, where γ_1 and γ_2 are real constants. The scaling factors are determined by comparing the average gain of the unknown channel nonlinearities with the average gain of the corresponding NN blocks. For instance, if the average gain of the unknown nonlinearity g_1(.) is given by:

g_1 average gain = sqrt(g_1 average output power / average input power)

and the average gain of the corresponding NN block NN_1(.) is given by:

NN_1 average gain = sqrt(NN_1 average output power / NN_1 average input power)

then the scaling factor γ_1 is given by:

γ_1 = NN_1 average gain / g_1 average gain

Table 3.1: MSE and scaling factors γ_1, γ_2 for the LMS-BP, RLS-BP, LMS-NGBP, and RLS-NGBP algorithms
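The average-gain estimate of the scaling factor can be sketched as follows; g is an illustrative nonlinearity and nn plays the role of a converged NN block equal to 0.5·g, so the recovered factor should come out as 0.5:

```python
import numpy as np

# Estimating gamma between a NN block and the true nonlinearity via average
# gains; both functions below are hypothetical stand-ins.
rng = np.random.default_rng(5)
x = rng.standard_normal(100_000)          # probe with the training input statistics

g  = lambda v: 2.0 * v / (1.0 + v ** 2)   # "true" nonlinearity (illustrative)
nn = lambda v: 0.5 * g(v)                 # converged block, scaled by gamma = 0.5

def average_gain(func, x):
    """sqrt(average output power / average input power)."""
    return np.sqrt(np.mean(func(x) ** 2) / np.mean(x ** 2))

gamma = average_gain(nn, x) / average_gain(g, x)
```

Dividing the identified block by gamma (and compensating W accordingly) resolves the scaling ambiguity left by the decoupled training.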

Fig. 3.2: Smoothed MSE curve (LMS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.3: Evolution of the normalized weights (LMS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.4: HPA nonlinearity g_1(x) and normalized NN_1(x) (LMS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.5: HPA nonlinearity g_2(x) and normalized NN_2(x) (LMS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.6: Smoothed MSE curve (RLS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.7: Evolution of the normalized weights (RLS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.8: HPA nonlinearity g_1(x) and normalized NN_1(x) (RLS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.9: HPA nonlinearity g_2(x) and normalized NN_2(x) (RLS-BP, M = L = 2, µ = 0.09, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.10: Smoothed MSE curve (LMS-NGBP, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.11: Evolution of the normalized weights (LMS-NGBP, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.12: HPA nonlinearity g_1(x) and normalized NN_1(x) (LMS-NGBP, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.13: HPA nonlinearity g_2(x) and normalized NN_2(x) (LMS-NGBP, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.14: Smoothed MSE curve (RLS-NGBP, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.15: Evolution of the normalized weights (RLS-NGBP, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.16: HPA nonlinearity g_1(x) and normalized NN_1(x) (RLS-NGBP, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Fig. 3.17: HPA nonlinearity g_2(x) and normalized NN_2(x) (RLS-NGBP, M = L = 2, N = 5, SNR = 60 dB, zero-mean white Gaussian input)

Discussion

From the figures above it is evident that the NN scheme has successfully modeled and identified the MIMO channel under test. Yet, the performance varies depending on the learning algorithm used for training the NN. For instance, Fig. 3.7 exhibits a faster convergence of the normalized weights than Fig. 3.3. This should not be surprising, since the rate of convergence of the RLS algorithm is typically an order of magnitude faster than that of the LMS algorithm [3]. Also, Table 3.1 shows that ordinary BP descent based algorithms tend to converge to an MSE value of approximately 4×10⁻⁵, while NG descent based algorithms converge to a lower MSE value. This can be explained by the fact that the MSE is mostly controlled by the channel nonlinearities, which take more time to converge; the nonlinearities are trained using the nonlinear adaptive algorithms (i.e., BP and NGBP). Furthermore, the simulation results show that NG descent based algorithms outperform BP descent based algorithms in terms of convergence speed, MSE performance, and nonlinearity approximation. NG has better capabilities of avoiding the plateau phenomenon, which is typical of BP learning curves, because it accounts for the parameter space in which the NN weights evolve [20]. This yields faster convergence and a lower MSE. Fig. 3.18 demonstrates the considerable convergence speed and MSE improvement obtained by employing NG instead of BP. We can see that while the BP MSE curves converge to a value of about 4×10⁻⁵, the NG MSE curves converge to a lower value. In addition, Fig. 3.18 shows that, in order to achieve an MSE of 10⁻⁴, NG needs fewer than 5,000 iterations, whereas BP needs more than 10,000 iterations.


Artificial Neural Network : Training Artificial Neural Networ : Training Debasis Samanta IIT Kharagpur debasis.samanta.iitgp@gmail.com 06.04.2018 Debasis Samanta (IIT Kharagpur) Soft Computing Applications 06.04.2018 1 / 49 Learning of neural

More information

Analysis of Finite Wordlength Effects

Analysis of Finite Wordlength Effects Analysis of Finite Wordlength Effects Ideally, the system parameters along with the signal variables have infinite precision taing any value between and In practice, they can tae only discrete values within

More information

Multilayer Perceptrons (MLPs)

Multilayer Perceptrons (MLPs) CSE 5526: Introduction to Neural Networks Multilayer Perceptrons (MLPs) 1 Motivation Multilayer networks are more powerful than singlelayer nets Example: XOR problem x 2 1 AND x o x 1 x 2 +1-1 o x x 1-1

More information

Timing Recovery at Low SNR Cramer-Rao bound, and outperforming the PLL

Timing Recovery at Low SNR Cramer-Rao bound, and outperforming the PLL T F T I G E O R G A I N S T I T U T E O H E O F E A L P R O G R ESS S A N D 1 8 8 5 S E R V L O G Y I C E E C H N O Timing Recovery at Low SNR Cramer-Rao bound, and outperforming the PLL Aravind R. Nayak

More information

Improved Multiple Feedback Successive Interference Cancellation Algorithm for Near-Optimal MIMO Detection

Improved Multiple Feedback Successive Interference Cancellation Algorithm for Near-Optimal MIMO Detection Improved Multiple Feedback Successive Interference Cancellation Algorithm for Near-Optimal MIMO Detection Manish Mandloi, Mohammed Azahar Hussain and Vimal Bhatia Discipline of Electrical Engineering,

More information

Utilizing Correct Prior Probability Calculation to Improve Performance of Low-Density Parity- Check Codes in the Presence of Burst Noise

Utilizing Correct Prior Probability Calculation to Improve Performance of Low-Density Parity- Check Codes in the Presence of Burst Noise Utah State University DigitalCommons@USU All Graduate Theses and Dissertations Graduate Studies 5-2012 Utilizing Correct Prior Probability Calculation to Improve Performance of Low-Density Parity- Check

More information

The Modeling and Equalization Technique of Nonlinear Wireless Channel

The Modeling and Equalization Technique of Nonlinear Wireless Channel Send Orders for Reprints to reprints@benthamscience.ae The Open Cybernetics & Systemics Journal, 4, 8, 97-3 97 Open Access The Modeling and Equalization Technique of Nonlinear Wireless Channel Qiu Min

More information

Computation of Bit-Error Rate of Coherent and Non-Coherent Detection M-Ary PSK With Gray Code in BFWA Systems

Computation of Bit-Error Rate of Coherent and Non-Coherent Detection M-Ary PSK With Gray Code in BFWA Systems Computation of Bit-Error Rate of Coherent and Non-Coherent Detection M-Ary PSK With Gray Code in BFWA Systems Department of Electrical Engineering, College of Engineering, Basrah University Basrah Iraq,

More information

EFFECTS OF ILL-CONDITIONED DATA ON LEAST SQUARES ADAPTIVE FILTERS. Gary A. Ybarra and S.T. Alexander

EFFECTS OF ILL-CONDITIONED DATA ON LEAST SQUARES ADAPTIVE FILTERS. Gary A. Ybarra and S.T. Alexander EFFECTS OF ILL-CONDITIONED DATA ON LEAST SQUARES ADAPTIVE FILTERS Gary A. Ybarra and S.T. Alexander Center for Communications and Signal Processing Electrical and Computer Engineering Department North

More information

Lecture 4. Capacity of Fading Channels

Lecture 4. Capacity of Fading Channels 1 Lecture 4. Capacity of Fading Channels Capacity of AWGN Channels Capacity of Fading Channels Ergodic Capacity Outage Capacity Shannon and Information Theory Claude Elwood Shannon (April 3, 1916 February

More information

Condensed Table of Contents for Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control by J. C.

Condensed Table of Contents for Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control by J. C. Condensed Table of Contents for Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control by J. C. Spall John Wiley and Sons, Inc., 2003 Preface... xiii 1. Stochastic Search

More information

8. Lecture Neural Networks

8. Lecture Neural Networks Soft Control (AT 3, RMA) 8. Lecture Neural Networks Learning Process Contents of the 8 th lecture 1. Introduction of Soft Control: Definition and Limitations, Basics of Intelligent" Systems 2. Knowledge

More information

ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD

ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD WHAT IS A NEURAL NETWORK? The simplest definition of a neural network, more properly referred to as an 'artificial' neural network (ANN), is provided

More information

Unit III. A Survey of Neural Network Model

Unit III. A Survey of Neural Network Model Unit III A Survey of Neural Network Model 1 Single Layer Perceptron Perceptron the first adaptive network architecture was invented by Frank Rosenblatt in 1957. It can be used for the classification of

More information

NUMERICAL COMPUTATION OF THE CAPACITY OF CONTINUOUS MEMORYLESS CHANNELS

NUMERICAL COMPUTATION OF THE CAPACITY OF CONTINUOUS MEMORYLESS CHANNELS NUMERICAL COMPUTATION OF THE CAPACITY OF CONTINUOUS MEMORYLESS CHANNELS Justin Dauwels Dept. of Information Technology and Electrical Engineering ETH, CH-8092 Zürich, Switzerland dauwels@isi.ee.ethz.ch

More information

Statistical and Adaptive Signal Processing

Statistical and Adaptive Signal Processing r Statistical and Adaptive Signal Processing Spectral Estimation, Signal Modeling, Adaptive Filtering and Array Processing Dimitris G. Manolakis Massachusetts Institute of Technology Lincoln Laboratory

More information

KNOWN approaches for improving the performance of

KNOWN approaches for improving the performance of IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 58, NO. 8, AUGUST 2011 537 Robust Quasi-Newton Adaptive Filtering Algorithms Md. Zulfiquar Ali Bhotto, Student Member, IEEE, and Andreas

More information

ECE 564/645 - Digital Communications, Spring 2018 Homework #2 Due: March 19 (In Lecture)

ECE 564/645 - Digital Communications, Spring 2018 Homework #2 Due: March 19 (In Lecture) ECE 564/645 - Digital Communications, Spring 018 Homework # Due: March 19 (In Lecture) 1. Consider a binary communication system over a 1-dimensional vector channel where message m 1 is sent by signaling

More information

The Viterbi Algorithm EECS 869: Error Control Coding Fall 2009

The Viterbi Algorithm EECS 869: Error Control Coding Fall 2009 1 Bacground Material 1.1 Organization of the Trellis The Viterbi Algorithm EECS 869: Error Control Coding Fall 2009 The Viterbi algorithm (VA) processes the (noisy) output sequence from a state machine

More information

Expected Error Based MMSE Detection Ordering for Iterative Detection-Decoding MIMO Systems

Expected Error Based MMSE Detection Ordering for Iterative Detection-Decoding MIMO Systems Expected Error Based MMSE Detection Ordering for Iterative Detection-Decoding MIMO Systems Lei Zhang, Chunhui Zhou, Shidong Zhou, Xibin Xu National Laboratory for Information Science and Technology, Tsinghua

More information

Improved Detected Data Processing for Decision-Directed Tracking of MIMO Channels

Improved Detected Data Processing for Decision-Directed Tracking of MIMO Channels Improved Detected Data Processing for Decision-Directed Tracking of MIMO Channels Emna Eitel and Joachim Speidel Institute of Telecommunications, University of Stuttgart, Germany Abstract This paper addresses

More information

Constellation Shaping for Communication Channels with Quantized Outputs

Constellation Shaping for Communication Channels with Quantized Outputs Constellation Shaping for Communication Channels with Quantized Outputs Chandana Nannapaneni, Matthew C. Valenti, and Xingyu Xiang Lane Department of Computer Science and Electrical Engineering West Virginia

More information

Lattice Reduction Aided Precoding for Multiuser MIMO using Seysen s Algorithm

Lattice Reduction Aided Precoding for Multiuser MIMO using Seysen s Algorithm Lattice Reduction Aided Precoding for Multiuser MIMO using Seysen s Algorithm HongSun An Student Member IEEE he Graduate School of I & Incheon Korea ahs3179@gmail.com Manar Mohaisen Student Member IEEE

More information

Blind MIMO communication based on Subspace Estimation

Blind MIMO communication based on Subspace Estimation Blind MIMO communication based on Subspace Estimation T. Dahl, S. Silva, N. Christophersen, D. Gesbert T. Dahl, S. Silva, and N. Christophersen are at the Department of Informatics, University of Oslo,

More information

Analysis of Receiver Quantization in Wireless Communication Systems

Analysis of Receiver Quantization in Wireless Communication Systems Analysis of Receiver Quantization in Wireless Communication Systems Theory and Implementation Gareth B. Middleton Committee: Dr. Behnaam Aazhang Dr. Ashutosh Sabharwal Dr. Joseph Cavallaro 18 April 2007

More information

Lecture 12. Block Diagram

Lecture 12. Block Diagram Lecture 12 Goals Be able to encode using a linear block code Be able to decode a linear block code received over a binary symmetric channel or an additive white Gaussian channel XII-1 Block Diagram Data

More information

19. Channel coding: energy-per-bit, continuous-time channels

19. Channel coding: energy-per-bit, continuous-time channels 9. Channel coding: energy-per-bit, continuous-time channels 9. Energy per bit Consider the additive Gaussian noise channel: Y i = X i + Z i, Z i N ( 0, ). (9.) In the last lecture, we analyzed the maximum

More information

EE6604 Personal & Mobile Communications. Week 13. Multi-antenna Techniques

EE6604 Personal & Mobile Communications. Week 13. Multi-antenna Techniques EE6604 Personal & Mobile Communications Week 13 Multi-antenna Techniques 1 Diversity Methods Diversity combats fading by providing the receiver with multiple uncorrelated replicas of the same information

More information

A FEASIBILITY STUDY OF PARTICLE FILTERS FOR MOBILE STATION RECEIVERS. Michael Lunglmayr, Martin Krueger, Mario Huemer

A FEASIBILITY STUDY OF PARTICLE FILTERS FOR MOBILE STATION RECEIVERS. Michael Lunglmayr, Martin Krueger, Mario Huemer A FEASIBILITY STUDY OF PARTICLE FILTERS FOR MOBILE STATION RECEIVERS Michael Lunglmayr, Martin Krueger, Mario Huemer Michael Lunglmayr and Martin Krueger are with Infineon Technologies AG, Munich email:

More information

ADVERSARY ANALYSIS OF COCKROACH NETWORK UNDER RAYLEIGH FADING CHANNEL: PROBABILITY OF ERROR AND ADVERSARY DETECTION. A Dissertation by.

ADVERSARY ANALYSIS OF COCKROACH NETWORK UNDER RAYLEIGH FADING CHANNEL: PROBABILITY OF ERROR AND ADVERSARY DETECTION. A Dissertation by. ADVERSARY ANALYSIS OF COCKROACH NETWORK UNDER RAYLEIGH FADING CHANNEL: PROBABILITY OF ERROR AND ADVERSARY DETECTION A Dissertation by Tze Chien Wong Master of Science, Wichita State University, 2008 Bachelor

More information

MIMO Broadcast Channels with Finite Rate Feedback

MIMO Broadcast Channels with Finite Rate Feedback IO Broadcast Channels with Finite Rate Feedbac Nihar Jindal, ember, IEEE Abstract ultiple transmit antennas in a downlin channel can provide tremendous capacity ie multiplexing gains, even when receivers

More information

Neural networks: Associative memory

Neural networks: Associative memory Neural networs: Associative memory Prof. Sven Lončarić sven.loncaric@fer.hr http://www.fer.hr/ipg 1 Overview of topics l Introduction l Associative memories l Correlation matrix as an associative memory

More information

The Optimality of Beamforming: A Unified View

The Optimality of Beamforming: A Unified View The Optimality of Beamforming: A Unified View Sudhir Srinivasa and Syed Ali Jafar Electrical Engineering and Computer Science University of California Irvine, Irvine, CA 92697-2625 Email: sudhirs@uciedu,

More information

New Recursive-Least-Squares Algorithms for Nonlinear Active Control of Sound and Vibration Using Neural Networks

New Recursive-Least-Squares Algorithms for Nonlinear Active Control of Sound and Vibration Using Neural Networks IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 12, NO. 1, JANUARY 2001 135 New Recursive-Least-Squares Algorithms for Nonlinear Active Control of Sound and Vibration Using Neural Networks Martin Bouchard,

More information

Introduction to Wireless & Mobile Systems. Chapter 4. Channel Coding and Error Control Cengage Learning Engineering. All Rights Reserved.

Introduction to Wireless & Mobile Systems. Chapter 4. Channel Coding and Error Control Cengage Learning Engineering. All Rights Reserved. Introduction to Wireless & Mobile Systems Chapter 4 Channel Coding and Error Control 1 Outline Introduction Block Codes Cyclic Codes CRC (Cyclic Redundancy Check) Convolutional Codes Interleaving Information

More information

Neural networks. Chapter 20. Chapter 20 1

Neural networks. Chapter 20. Chapter 20 1 Neural networks Chapter 20 Chapter 20 1 Outline Brains Neural networks Perceptrons Multilayer networks Applications of neural networks Chapter 20 2 Brains 10 11 neurons of > 20 types, 10 14 synapses, 1ms

More information

On the Optimality of Multiuser Zero-Forcing Precoding in MIMO Broadcast Channels

On the Optimality of Multiuser Zero-Forcing Precoding in MIMO Broadcast Channels On the Optimality of Multiuser Zero-Forcing Precoding in MIMO Broadcast Channels Saeed Kaviani and Witold A. Krzymień University of Alberta / TRLabs, Edmonton, Alberta, Canada T6G 2V4 E-mail: {saeed,wa}@ece.ualberta.ca

More information

Least Squares Estimation Namrata Vaswani,

Least Squares Estimation Namrata Vaswani, Least Squares Estimation Namrata Vaswani, namrata@iastate.edu Least Squares Estimation 1 Recall: Geometric Intuition for Least Squares Minimize J(x) = y Hx 2 Solution satisfies: H T H ˆx = H T y, i.e.

More information

Principles of Communications

Principles of Communications Principles of Communications Chapter V: Representation and Transmission of Baseband Digital Signal Yongchao Wang Email: ychwang@mail.xidian.edu.cn Xidian University State Key Lab. on ISN November 18, 2012

More information

Part 8: Neural Networks

Part 8: Neural Networks METU Informatics Institute Min720 Pattern Classification ith Bio-Medical Applications Part 8: Neural Netors - INTRODUCTION: BIOLOGICAL VS. ARTIFICIAL Biological Neural Netors A Neuron: - A nerve cell as

More information

Adaptive Space-Time Shift Keying Based Multiple-Input Multiple-Output Systems

Adaptive Space-Time Shift Keying Based Multiple-Input Multiple-Output Systems ACSTSK Adaptive Space-Time Shift Keying Based Multiple-Input Multiple-Output Systems Professor Sheng Chen Electronics and Computer Science University of Southampton Southampton SO7 BJ, UK E-mail: sqc@ecs.soton.ac.uk

More information

Learning Vector Quantization (LVQ)

Learning Vector Quantization (LVQ) Learning Vector Quantization (LVQ) Introduction to Neural Computation : Guest Lecture 2 John A. Bullinaria, 2007 1. The SOM Architecture and Algorithm 2. What is Vector Quantization? 3. The Encoder-Decoder

More information

Statistical Machine Learning from Data

Statistical Machine Learning from Data January 17, 2006 Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Multi-Layer Perceptrons Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole

More information

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)

More information

Data-aided and blind synchronization

Data-aided and blind synchronization PHYDYAS Review Meeting 2009-03-02 Data-aided and blind synchronization Mario Tanda Università di Napoli Federico II Dipartimento di Ingegneria Biomedica, Elettronicae delle Telecomunicazioni Via Claudio

More information

Decision Weighted Adaptive Algorithms with Applications to Wireless Channel Estimation

Decision Weighted Adaptive Algorithms with Applications to Wireless Channel Estimation Decision Weighted Adaptive Algorithms with Applications to Wireless Channel Estimation Shane Martin Haas April 12, 1999 Thesis Defense for the Degree of Master of Science in Electrical Engineering Department

More information

Approximate Best Linear Unbiased Channel Estimation for Frequency Selective Channels with Long Delay Spreads: Robustness to Timing and Carrier Offsets

Approximate Best Linear Unbiased Channel Estimation for Frequency Selective Channels with Long Delay Spreads: Robustness to Timing and Carrier Offsets Approximate Best Linear Unbiased Channel Estimation for Frequency Selective Channels with Long Delay Spreads: Robustness to Timing and Carrier Offsets Serdar Özen,SMNerayanuru, Christopher Pladdy, and

More information

VID3: Sampling and Quantization

VID3: Sampling and Quantization Video Transmission VID3: Sampling and Quantization By Prof. Gregory D. Durgin copyright 2009 all rights reserved Claude E. Shannon (1916-2001) Mathematician and Electrical Engineer Worked for Bell Labs

More information

Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi

Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi Lecture - 41 Pulse Code Modulation (PCM) So, if you remember we have been talking

More information

Sub-Gaussian Model Based LDPC Decoder for SαS Noise Channels

Sub-Gaussian Model Based LDPC Decoder for SαS Noise Channels Sub-Gaussian Model Based LDPC Decoder for SαS Noise Channels Iulian Topor Acoustic Research Laboratory, Tropical Marine Science Institute, National University of Singapore, Singapore 119227. iulian@arl.nus.edu.sg

More information

Effect of number of hidden neurons on learning in large-scale layered neural networks

Effect of number of hidden neurons on learning in large-scale layered neural networks ICROS-SICE International Joint Conference 009 August 18-1, 009, Fukuoka International Congress Center, Japan Effect of on learning in large-scale layered neural networks Katsunari Shibata (Oita Univ.;

More information

Direct-Sequence Spread-Spectrum

Direct-Sequence Spread-Spectrum Chapter 3 Direct-Sequence Spread-Spectrum In this chapter we consider direct-sequence spread-spectrum systems. Unlike frequency-hopping, a direct-sequence signal occupies the entire bandwidth continuously.

More information