Adaptive Noise Cancellation Using Deep Cerebellar Model Articulation Controller


Yu Tsao, Member, IEEE, Hao-Chun Chu, Shih-Wei Lan, Shih-Hau Fang, Senior Member, IEEE, Junghsi Lee*, and Chih-Min Lin, Fellow, IEEE

Abstract: This paper proposes a deep cerebellar model articulation controller (DCMAC) for adaptive noise cancellation (ANC). We expand upon the conventional CMAC by stacking single-layer CMAC models into multiple layers to form a DCMAC model, and we derive a modified backpropagation training algorithm to learn the DCMAC parameters. Compared with the conventional CMAC, the DCMAC can characterize nonlinear transformations more effectively because of its deep structure. Experimental results confirm that the proposed DCMAC model outperforms the CMAC in terms of residual noise in an ANC task, showing that the DCMAC provides enhanced modeling capability for the channel characteristics.

I. INTRODUCTION

The goal of an adaptive noise cancellation (ANC) system is to remove noise components from signals. In ANC systems, linear filters are widely used for their simple structure and satisfactory performance under general conditions, where least mean square (LMS) [1] and normalized LMS [2] are two effective criteria for estimating the filter parameters. However, when the system has a nonlinear and complex response, a linear filter may not provide optimal performance. Accordingly, several nonlinear adaptive filters have been developed. Notable examples include the unscented Kalman filter [3, 4] and the Volterra filter [5, 6]. Meanwhile, the cerebellar model articulation controller (CMAC), a feedforward neural network model, has been used as a complex piecewise linear filter [7, 8]. Experimental results showed that the CMAC provides satisfactory performance in terms of mean squared error (MSE) for nonlinear systems [9, 10]. The CMAC model is a partially connected perceptron-like associative memory network [11]. Owing to its peculiar structure, it overcomes the fast-growing memory problem and the learning difficulties that other neural networks face when the amount of training data is limited [8, 12, 13]. Moreover, because of its simple computation and good generalization capability, the CMAC model has been widely used to control complex dynamical systems [14], nonlinear systems [9, 10], robot manipulators [15], and multi-input multi-output (MIMO) systems [16, 17].

More recently, deep learning has become part of many state-of-the-art systems, particularly in computer vision [18-20] and speech recognition [21-23]. Numerous studies indicate that by stacking several shallow structures into a single deep structure, the overall system can achieve better data representation and thus deal more effectively with nonlinear and highly complex tasks. Successful examples include stacked denoising autoencoders [20], stacked sparse coding [24], multilayer nonnegative matrix factorization [25], and deep neural networks [26, 27]. In this study, we propose a deep CMAC (DCMAC) framework that stacks several single-layered CMACs. We also derive a modified backpropagation algorithm to train the DCMAC model. Experimental results on ANC tasks show that the DCMAC provides better results than the conventional CMAC in terms of MSE scores.

* Yu Tsao is with the Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan (corresponding author; phone: +886-2-2787-2390; e-mail: yu.tsao@citi.sinica.edu.tw). Hao-Chun Chu, Shih-Wei Lan, Shih-Hau Fang, Junghsi Lee, and Chih-Min Lin are with the Department of Electrical Engineering, Yuan Ze University, Taoyuan, Taiwan (e-mail: {david46331, qooqoo6308}@gmail.com; {shfang, eelee, cml}@saturn.yzu.edu.tw).
II. PROPOSED ALGORITHM

2.1 System Overview

Figure 1 shows the block diagram of a typical ANC system containing two microphones, one external and one internal. The external microphone receives the noise source signal n(k), while the internal one receives the noisy signal v(k). The noisy signal is a mixture of the signal of interest s(k) and the damage noise signal z(k); therefore, v(k) = s(k) + z(k), where z(k) is generated by passing the noise signal n(k) through an unknown channel F(·). The relation between the noise signal n(k) and the damage noise z(k) follows [28]. The ANC system computes a filter, F̂(·), which transforms n(k) into y(k), so that the final output, v(k) − y(k), is close to the signal of interest, s(k). The parameters of F̂(·) are updated by minimizing the MSE. Recently, the concept of deep learning has garnered great attention. Inspired by deep learning, we propose a DCMAC framework, which stacks several layers of the single-layered CMAC, to construct the filter F̂(·), as indicated in Fig. 1.

[Figure 1. Block diagram of the proposed ANC system: the noise n(k) passes through the unknown system F(·) to produce z(k), which corrupts s(k) into the noisy signal v(k); the DCMAC system F̂(·) produces y(k), and the system output is v(k) − y(k).]
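To make the setup concrete, below is a minimal Python sketch of this ANC loop, with a plain LMS filter standing in for F̂(·); the signal choices, filter order, and step size are illustrative assumptions rather than the paper's configuration (the DCMAC filter itself is developed in the following sections).

```python
import numpy as np

# Minimal ANC loop sketch (assumed illustrative setup, not the paper's exact one):
# v(k) = s(k) + z(k), z(k) = F(n(k)); the adaptive filter maps n(k) to y(k),
# and the system output e(k) = v(k) - y(k) should approach s(k).
rng = np.random.default_rng(0)
K = 1000                                   # number of samples (assumption)
s = np.sin(0.06 * np.arange(K))            # signal of interest (illustrative)
n = rng.uniform(-1.5, 1.5, K)              # noise source at the external microphone
z = 0.6 * n ** 3                           # damage noise via a nonlinear channel F
v = s + z                                  # noisy signal at the internal microphone

order, mu = 8, 0.05                        # LMS tap count and step size (assumed)
w = np.zeros(order)
e = np.zeros(K)
for k in range(order, K):
    x = n[k - order:k][::-1]               # most recent noise samples
    y = w @ x                              # filter output y(k)
    e[k] = v[k] - y                        # system output, the estimate of s(k)
    w += mu * e[k] * x                     # LMS update, minimizing the MSE

print("residual MSE against s(k):", np.mean((e[order:] - s[order:]) ** 2))
```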

Fig. 2 shows the architecture of the DCMAC, which is composed of a plurality of CMAC layers. The A, R, and W in Fig. 2 denote the association memory space, the receptive field space, and the weight memory space, respectively, of a CMAC model; the next section details these three spaces. The output of the first CMAC layer is treated as the input of the next CMAC layer. The derived F̂(·), as modeled by the DCMAC, can better characterize the signals by using multiple nonlinear processing layers, and thus achieves improved noise cancellation performance.

[Figure 2. Architecture of the deep CMAC: Layers 1 through L, each consisting of an association memory space A, a receptive field space R, and a weight memory space W.]

2.2 Review of the CMAC Model

This section reviews the structure and the parameter-learning algorithm of the CMAC model.

A. Structure of a CMAC

Fig. 3 shows a CMAC model with five spaces: an input space, an association memory space, a receptive field space, a weight memory space, and an output space. The main functions of these five spaces are as follows:

[Figure 3. Architecture of a CMAC: input, association memory (A), receptive field (R), weight memory (W), and output spaces.]

1) Input space: This space holds the input of the CMAC. In Fig. 3, the input vector is x = [x_1, x_2, ..., x_N]^T ∈ R^N, where N is the feature dimension.

2) Association memory space: This space holds the excitation functions of the CMAC, and it has a multi-layer structure. Please note that the layers here (indicating the depth of the association memory space) are different from those presented in Section 2.1 (indicating the number of CMACs in a DCMAC). To avoid confusion, in the following discussion we call a layer of the association memory an S_layer and a layer of CMACs a Layer. Fig. 4 shows an example of an association memory space with a two-dimensional input vector, x = [x_1, x_2]^T with N = 2. LB and UB are the lower and upper bounds, respectively. We first divide x_1 into blocks (A, B) and x_2 into blocks (a, b). Next, by shifting each variable by one element, we obtain blocks (C, D) and blocks (c, d) for the second S_layer.

[Figure 4. CMAC with a two-dimensional input vector (N = 2): four S_layers per variable, blocks A-H for x_1 and a-h for x_2, bounded by LB and UB.]

Likewise, by shifting the variables again, we can generate further S_layers. In Fig. 4, we have four S_layers, each having two blocks. Therefore, the block number for one variable is eight (N_B = 8); accordingly, the overall association memory space has 16 blocks (N_A = N_B × N). Each block contains an excitation function, which must be a continuously bounded function, such as a Gaussian, triangular, or wavelet function. In this study, we use the Gaussian function [as shown in Fig. 4]:

\phi_{il} = \exp\left[-\frac{(x_i - m_{il})^2}{\sigma_{il}^2}\right], \quad l = 1, 2, \ldots, N_B; \; i = 1, 2, \ldots, N, \quad (1)

where x_i is the i-th input signal, and m_{il} and \sigma_{il} represent the mean and variance, respectively, of the associative memory function for the i-th input and the l-th block.

3) Receptive field space: In Fig. 4, the areas formed by blocks are called receptive fields. The receptive field space has eight areas (N_R = 8): Aa, Bb, Cc, Dd, Ee, Ff, Gg, and Hh. Given the input x, the l-th receptive field function is represented as

b_l = \prod_{i=1}^{N} \phi_{il} = \exp\left[-\sum_{i=1}^{N} \frac{(x_i - m_{il})^2}{\sigma_{il}^2}\right], \quad l = 1, 2, \ldots, N_R. \quad (2)

In the following, we express the receptive field functions in vector form, namely b = [b_1, b_2, ..., b_{N_R}]^T.
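As a quick numerical illustration of Eqs. (1) and (2), the sketch below evaluates the Gaussian excitation functions and the receptive field activations for a two-dimensional input; the field centers and widths are made-up values for illustration, not the paper's initialization.

```python
import numpy as np

def receptive_fields(x, m, sigma):
    """Eqs. (1)-(2): phi[i, l] = exp(-(x_i - m_il)^2 / sigma_il^2) and
    b_l = prod_i phi[i, l], i.e. exp of the negated sum of squared distances."""
    phi = np.exp(-((x[:, None] - m) ** 2) / sigma ** 2)   # shape (N, N_R)
    return phi.prod(axis=0)                               # shape (N_R,)

# Two-dimensional example (N = 2) with N_R = 8 receptive fields (assumed values).
N_R = 8
m = np.tile(np.linspace(-2.4, 2.4, N_R), (2, 1))          # assumed field centers
sigma = np.full((2, N_R), 0.6)                            # assumed field widths
x = np.array([0.3, -0.7])
print("b =", np.round(receptive_fields(x, m, sigma), 4))
```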

4) Weight memory space: This space specifies the adjustable weights attached to the results of the receptive field space:

w_t = [w_{1t}, w_{2t}, \ldots, w_{N_R t}]^T \quad \text{for } t = 1, 2, \ldots, M, \quad (3)

where M denotes the output vector dimension.

5) Output space: From Fig. 3, the output of the CMAC is

y_t = w_t^T b = \sum_{l=1}^{N_R} w_{lt} \exp\left[-\sum_{i=1}^{N} \frac{(x_i - m_{il})^2}{\sigma_{il}^2}\right], \quad (4)

where y_t is the t-th element of the output vector, y = [y_1, y_2, ..., y_M]^T. The output for a state point is the algebraic sum of the outputs of the receptive fields (Aa, Bb, Cc, Dd, Ee, Ff, Gg, and Hh) multiplied by the corresponding weights.

B. Parameters of Adaptive Learning Algorithm

To estimate the parameters in the association memory, receptive field, and weight memory spaces of the CMAC, we first define an objective function:

O(k) = \frac{1}{2} \sum_{t=1}^{M} [e_t(k)]^2, \quad (5)

where the error signal e_t(k) = d_t(k) − y_t(k) indicates the difference between the desired response d_t(k) and the filter output y_t(k) at the k-th sample. Based on Eq. (5), the normalized gradient descent method can be used to derive the update rules for the parameters of a CMAC model:

m_{il}(k+1) = m_{il}(k) + \mu_m \frac{\partial O}{\partial m_{il}}, \quad \text{where } \frac{\partial O}{\partial m_{il}} = b_l \, \frac{2(x_i - m_{il})}{\sigma_{il}^2} \sum_{t=1}^{M} e_t w_{lt}; \quad (6)

\sigma_{il}(k+1) = \sigma_{il}(k) + \mu_\sigma \frac{\partial O}{\partial \sigma_{il}}, \quad \text{where } \frac{\partial O}{\partial \sigma_{il}} = b_l \, \frac{2(x_i - m_{il})^2}{\sigma_{il}^3} \sum_{t=1}^{M} e_t w_{lt}; \quad (7)

w_{lt}(k+1) = w_{lt}(k) + \mu_w \frac{\partial O}{\partial w_{lt}}, \quad \text{where } \frac{\partial O}{\partial w_{lt}} = e_t b_l. \quad (8)

2.3 Proposed DCMAC Model

A. Structure of the DCMAC

From Eq. (4), the output of the first Layer, y^1, is obtained by

y_t^1 = \sum_{l=1}^{N_R^1} w_{lt}^1 \exp\left[-\sum_{i=1}^{N} \frac{(x_i - m_{il}^1)^2}{(\sigma_{il}^1)^2}\right], \quad (9)

where y_t^1 is the t-th element of y^1, and N_R^1 is the number of receptive fields in the first Layer. Next, the relation between the output of the (λ−1)-th Layer, y^{λ−1}, and that of the λ-th Layer, y^λ, can be formulated as

y_t^λ = \sum_{l=1}^{N_R^λ} w_{lt}^λ \exp\left[-\sum_{i=1}^{N^λ} \frac{(y_i^{λ-1} - m_{il}^λ)^2}{(\sigma_{il}^λ)^2}\right], \quad λ = 2, \ldots, L, \quad (10)

where N^λ is the input dimension of the λ-th Layer (i.e., the output dimension of the (λ−1)-th Layer); N_R^λ is the number of receptive fields in the λ-th Layer; y_t^λ is the t-th element of y^λ; m_{il}^λ, σ_{il}^λ, and w_{lt}^λ are the parameters of the λ-th CMAC; and L is the total number of CMAC Layers in a DCMAC.

B. Backpropagation Algorithm for DCMAC

Assume that the output vector of a DCMAC is y = [y_1, y_2, ..., y_M]^T ∈ R^M, where M is the feature dimension. The objective function of the DCMAC is

O(k) = \frac{1}{2} \sum_{t=1}^{M} [d_t(k) - y_t(k)]^2. \quad (11)

In the following, we present the backpropagation algorithm used to estimate the parameters of the DCMAC. Because the update rules for the means and variances differ from those for the weights, they are presented separately.

1) Update algorithm for means and variances: The update algorithms of the means and variances for the last Layer (the L-th Layer) of the DCMAC are the same as those of the CMAC (Eqs. (6) and (7)). For the penultimate Layer (the (L−1)-th Layer), with z standing for either m or σ, the parameter gradient is

\frac{\partial O}{\partial z_{ip}^{L-1}} = \frac{\partial O}{\partial b_p^{L-1}} \frac{\partial b_p^{L-1}}{\partial z_{ip}^{L-1}}, \quad (12)

where b_p^{L−1} is the p-th receptive field function of the (L−1)-th Layer. We define the momentum δ_{zp} = ∂O/∂b_p^{L−1} of the p-th receptive field function in the (L−1)-th Layer. Then we have

\delta_{zp} = \sum_{l=1}^{N_R^L} \frac{\partial O}{\partial b_l^{L}} \frac{\partial b_l^{L}}{\partial b_p^{L-1}}, \quad (13)

where b_l^L is the l-th receptive field function of the L-th Layer. Notably, by replacing z with m and σ in Eq. (13), we obtain the momentums δ_{mp} and δ_{σp}. Similarly, we can derive the momentum, δ_{zq}, for the q-th receptive field function in the (L−2)-th Layer:

\delta_{zq} = \sum_{p=1}^{N_R^{L-1}} \frac{\partial O}{\partial b_p^{L-1}} \frac{\partial b_p^{L-1}}{\partial b_q^{L-2}} = \sum_{p=1}^{N_R^{L-1}} \delta_{zp} \frac{\partial b_p^{L-1}}{\partial b_q^{L-2}}, \quad (14)

where b_q^{L−2} is the q-th receptive field function of the (L−2)-th Layer, and N_R^{L−1} is the number of receptive fields in the (L−1)-th Layer. Based on the normalized gradient descent method, the learning rule for m_{il}^λ (the i-th mean parameter of the l-th receptive field in the λ-th Layer) is

m_{il}^λ(k+1) = m_{il}^λ(k) + \mu_m \frac{\partial b_l^λ}{\partial m_{il}^λ} \delta_{ml}; \quad (15)

similarly, the learning rule for σ_{il}^λ (the i-th variance parameter of the l-th receptive field in the λ-th Layer) is

\sigma_{il}^λ(k+1) = \sigma_{il}^λ(k) + \mu_\sigma \frac{\partial b_l^λ}{\partial \sigma_{il}^λ} \delta_{σl}, \quad (16)

where μ_m in Eq. (15) and μ_σ in Eq. (16) are the learning rates for the mean and variance updates, respectively.
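The following sketch gathers Eqs. (4)-(8) into a single-layer CMAC with per-sample gradient updates and then chains copies of it in the feedforward sense of Eqs. (9)-(10). It is a minimal sketch under assumed hyperparameters; the full backpropagation through the stack (Eqs. (12)-(16) and (17)-(20)) is deliberately omitted here.

```python
import numpy as np

class CMAC:
    """Single-layer CMAC sketch: Gaussian association memory, product
    receptive fields (Eqs. (1)-(2)), and linear weight memory (Eqs. (3)-(4))."""
    def __init__(self, n_in, n_out, n_fields=8, lo=-3.0, hi=3.0, sigma=0.6):
        step = (hi - lo) / n_fields
        self.m = np.tile(np.linspace(lo + step / 2, hi - step / 2, n_fields),
                         (n_in, 1))                    # field centers (assumed)
        self.sigma = np.full((n_in, n_fields), sigma)  # field widths (assumed)
        self.w = np.zeros((n_fields, n_out))           # initial weights = 0

    def forward(self, x):
        self.x = x
        d2 = ((x[:, None] - self.m) ** 2 / self.sigma ** 2).sum(axis=0)
        self.b = np.exp(-d2)                           # receptive fields, Eq. (2)
        return self.b @ self.w                         # outputs y_t, Eq. (4)

    def update(self, e, mu=1e-3):
        """One-sample updates following Eqs. (6)-(8), with e_t = d_t - y_t."""
        g = self.b * (self.w @ e)                      # b_l * sum_t e_t w_lt
        diff = self.x[:, None] - self.m
        self.m += mu * g * 2 * diff / self.sigma ** 2           # Eq. (6)
        self.sigma += mu * g * 2 * diff ** 2 / self.sigma ** 3  # Eq. (7)
        self.w += mu * np.outer(self.b, e)                      # Eq. (8)

# Feedforward stacking in the sense of Eqs. (9)-(10): the output of Layer
# lambda-1 is the input of Layer lambda (a DCMAC(3)-style stack).
layers = [CMAC(1, 1) for _ in range(3)]
x = np.array([0.5])
for layer in layers:
    x = layer.forward(x)
print("stacked DCMAC-style output:", x)
```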
2) Update algorithm for weights: The update rule for the weights of the last Layer (the L-th Layer) of the DCMAC is the same as that of the CMAC. For the penultimate Layer (the (L−1)-th Layer), the parameter gradient is

\frac{\partial O}{\partial w_{lt}^{L-1}} = \frac{\partial O}{\partial y_t^{L-1}} \frac{\partial y_t^{L-1}}{\partial w_{lt}^{L-1}}, \quad (17)

where the momentum of the (L−1)-th Layer is

\delta_{wt} = \frac{\partial O}{\partial y_t^{L-1}} = \sum_{l=1}^{N_R^L} \sum_{r=1}^{M} \frac{\partial O}{\partial y_r^{L}} \frac{\partial y_r^{L}}{\partial b_l^{L}} \frac{\partial b_l^{L}}{\partial y_t^{L-1}}, \quad (18)

where y_r^L is the r-th element of y^L. Similarly, the momentum for the (L−2)-th Layer can be computed by

\delta_{wc} = \sum_{t} \sum_{l=1}^{N_R^{L-1}} \frac{\partial O}{\partial y_t^{L-1}} \frac{\partial y_t^{L-1}}{\partial b_l^{L-1}} \frac{\partial b_l^{L-1}}{\partial y_c^{L-2}} = \sum_{t} \sum_{l=1}^{N_R^{L-1}} \delta_{wt} \frac{\partial y_t^{L-1}}{\partial b_l^{L-1}} \frac{\partial b_l^{L-1}}{\partial y_c^{L-2}}, \quad (19)

where y_c^{L−2} is the c-th element of y^{L−2}. According to the normalized gradient descent method, the learning rule for w_{lt}^λ (the weight connecting the l-th receptive field and the t-th output in the λ-th Layer) is defined as

w_{lt}^λ(k+1) = w_{lt}^λ(k) + \mu_w \frac{\partial y_t^λ}{\partial w_{lt}^λ} \delta_{wt}, \quad (20)

where μ_w is the learning rate for the weights.

III. EXPERIMENTS

3.1 Experimental Setup

In the experiment, the signal of interest s(k) = sin(0.06k) is multiplied by a white noise signal and normalized within [−1, 1], as shown in Fig. 5(A). The noise signal, n(k), is generated as white noise normalized within [−1.5, 1.5]. A total of 100 training samples are used in this experiment. The noise signal n(k) passes through a nonlinear channel to generate the damage noise z(k). The relation between n(k) and z(k) is z(k) = F(n(k)), where F(·) represents the function of the nonlinear channel. In this experiment, we used twelve different channel functions, {0.6(n(k))^{2i−1}; 0.6 cos((n(k))^{2i−1}); 0.6 sin((n(k))^{2i−1}); i = 1, 2, 3, 4}, to generate different damage noise signals z(k). The noisy signals v(k) associated with three representative channel functions, namely F(·) = 0.6(n(k))^3, F(·) = 0.6 cos((n(k))^3), and F(·) = 0.6 sin((n(k))^3), are shown in Figs. 5(B), (C), and (D), respectively.

[Figure 5. (A) Signal of interest s(k). (B-D) Noisy signal v(k) under the three representative channel functions F(·) = 0.6(n(k))^3, 0.6 cos((n(k))^3), and 0.6 sin((n(k))^3).]

We followed reference [8] to set up the parameters of the DCMAC, as characterized below:
1) Number of association memory layers (S_layer): 4.
2) Number of blocks: N_B = Ceil(N_e / S_layer) × S_layer = Ceil(5/4) × 4 = 8.
3) Number of receptive fields: N_R = 8.
4) Associative memory functions: φ_{il} = exp[−(x_i − m_{il})²/σ_{il}²], i = 1; l = 1, 2, ..., N_R.
Note that Ceil(·) denotes the ceiling function (unconditional carry of the remainder). Signal range detection is required to set the bounds UB and LB so that all the signals are covered. In this study, our preliminary results showed that [UB, LB] = [3, −3] gives the best performance. Please note that the main goal of this study is to investigate whether the DCMAC can yield better ANC results than a single-layer CMAC; therefore, we report the results using [3, −3] for both the CMAC and the DCMAC in the following discussions. The initial means of the Gaussian functions (m_{il}) are set at the middle of each block, and the initial variances (σ_{il}) are determined by the block size. With [UB, LB] = [3, −3], we initialize the mean parameters as m_{i1} = −2.4, m_{i2} = −1.8, m_{i3} = −1.2, m_{i4} = −0.6, m_{i5} = 0.6, m_{i6} = 1.2, m_{i7} = 1.8, and m_{i8} = 2.4, so that the eight blocks cover [UB, LB] evenly. Meanwhile, we set σ_{il} = 0.6 for l = 1, ..., 8, and initialize the weights (w_{lt}) to zeros. Our experiments show that different parameter initializations only affect the performance in the first few epochs, after which the parameters quickly converge to similar values. The learning rates are chosen as μ_s = μ_z = μ_w = μ_m = μ_σ = 0.001. The parameters are the same within all Layers of the DCMAC. In this study, we examine the performance of DCMACs formed by three, five, and seven Layers of CMACs, denoted as DCMAC(3), DCMAC(5), and DCMAC(7), respectively. The input dimension was set to N = 1, and the output dimension was set to M = 1 for both the CMAC and the DCMACs.
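The sketch below generates training data along the lines of this setup; where the text is ambiguous (the exact white-noise draw and the random seed), the choices are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 100                                        # training samples, as in the text
k = np.arange(K)
s = np.sin(0.06 * k) * rng.uniform(-1, 1, K)   # signal of interest in [-1, 1]
n = rng.uniform(-1.5, 1.5, K)                  # white noise source in [-1.5, 1.5]

# The twelve nonlinear channel functions F(.) with odd exponents 1, 3, 5, 7:
channels = []
for i in (1, 2, 3, 4):
    p = 2 * i - 1
    channels += [lambda n, p=p: 0.6 * n ** p,
                 lambda n, p=p: 0.6 * np.cos(n ** p),
                 lambda n, p=p: 0.6 * np.sin(n ** p)]

z = channels[3](n)                             # e.g. F(.) = 0.6 (n(k))^3
v = s + z                                      # noisy signal v(k) = s(k) + z(k)
```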
3.2 Experimental Results

This section compares the DCMAC with different algorithms based on two performance metrics: the MSE and the convergence speed. Fig. 6 shows the converged MSE of the CMAC and the DCMAC under three different settings, (S_layer = 2, N_e = 5), (S_layer = 4, N_e = 5), and (S_layer = 4, N_e = 9), tested on the channel function F(·) = 0.6 cos((n(k))^3). To benchmark the proposed DCMAC, we also conducted experiments using two popular adaptive filter methods, namely the LMS filter [1] and the Volterra filter [5, 6].

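For reference, here is a minimal sketch of the kind of second-order Volterra adaptive filter used as a baseline in [5, 6]; the memory length, the LMS-style kernel update, and the step size are assumptions rather than the exact baseline configuration.

```python
import numpy as np

def volterra2_lms(n, v, order=3, mu=0.01):
    """Second-order Volterra filter adapted sample-by-sample:
    y(k) = sum_i h1[i] x_i + sum_{i<=j} h2[i,j] x_i x_j, with x_i = n(k-i);
    the kernels are updated LMS-style from the output error e(k) = v(k) - y(k)."""
    h1 = np.zeros(order)
    pairs = [(i, j) for i in range(order) for j in range(i, order)]
    h2 = np.zeros(len(pairs))
    e = np.zeros(len(n))
    for k in range(order, len(n)):
        x = n[k - order:k][::-1]
        q = np.array([x[i] * x[j] for i, j in pairs])   # quadratic cross terms
        y = h1 @ x + h2 @ q
        e[k] = v[k] - y
        h1 += mu * e[k] * x
        h2 += mu * e[k] * q
    return e
```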
For a fair comparison, the learning epochs are set to be the same for LMS, Volterra, CMAC, and DCMAC, with 100 data samples in each epoch. The parameters of the LMS and Volterra filters were tuned, and the best results are reported in Fig. 6. From Fig. 6, we see that the DCMAC outperforms not only Volterra and LMS but also the CMAC under all three setups. The same trends are observed across the 12 channel functions, and thus only the result for F(·) = 0.6 cos((n(k))^3) is presented as a representative.

[Figure 6. MSE of LMS, Volterra, CMAC, and DCMAC with the channel function F(·) = 0.6 cos((n(k))^3).]

Speed is also an important performance metric for ANC tasks. Fig. 7 shows the convergence speed, namely the MSE reduction versus the number of epochs, for the different algorithms. For ease of comparison and due to limited space, Fig. 7 only shows the results of the three-layer DCMAC (denoted simply as DCMAC in Fig. 7), since the trends of the DCMAC performance are consistent across different numbers of layers. For the CMAC and the DCMAC, we adopted S_layer = 4 and N_e = 5. Fig. 7 shows the results for three channel functions: F(·) = 0.6(n(k))^3, F(·) = 0.6 cos((n(k))^3), and F(·) = 0.6 sin((n(k))^3). The results in Fig. 7 show that LMS and Volterra yield better performance than CMAC and DCMAC when the number of epochs is small. On the other hand, as the number of epochs increases, both the DCMAC and the CMAC give lower MSEs than LMS and Volterra over all testing channels. Moreover, the DCMAC consistently outperforms the CMAC with lower converged MSE scores. The results also show that the performance gain of the DCMAC becomes increasingly significant as the nonlinearity of the channel increases. Finally, we note that the performance of both the DCMAC and the CMAC saturates at around 400 epochs. In a real-world application, a development data set can be used to determine the saturation point, so that adaptation can be switched off.

[Figure 7. MSEs of LMS, Volterra, CMAC, and DCMAC with three types of channel functions: (A) F(·) = 0.6(n(k))^3; (B) F(·) = 0.6 cos((n(k))^3); (C) F(·) = 0.6 sin((n(k))^3). More results are presented at https://goo.gl/frzvk]

Simulation results of the CMAC and the DCMAC, both after 400 epochs of training, are shown in Figs. 8(A) and (B), respectively. The results show that the proposed DCMAC achieves better filtering performance than the CMAC in this noise cancellation system.

[Figure 8. Recovered signal using (A) the CMAC and (B) the DCMAC, where F(·) = 0.6 cos((n(k))^3).]
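The epoch-wise curves of Figs. 6 and 7 can be reproduced in outline with a loop such as the one below; it assumes the `forward`/`update` interface of the CMAC sketch given earlier, and the bookkeeping (per-epoch MSE of the recovered signal against s(k)) is illustrative.

```python
import numpy as np

def run_epochs(filt, n, v, s, epochs=400):
    """Adapt a filter over repeated passes (epochs) through the same record
    and log the per-epoch MSE of the recovered signal against s(k)."""
    mse = []
    for _ in range(epochs):
        err = np.zeros(len(n))
        for k in range(len(n)):
            y = filt.forward(np.array([n[k]]))       # y(k) from the noise input
            err[k] = v[k] - y[0]                     # output e(k) = v(k) - y(k)
            filt.update(np.array([err[k]]))          # adapt toward d(k) = v(k)
        mse.append(np.mean((err - s) ** 2))          # residual noise vs. s(k)
    return np.array(mse)                             # mse[-1] ~ converged MSE
```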

Table I lists the mean and variance of the MSE scores for LMS, Volterra, CMAC, and DCMAC across the 12 channel functions. The MSE of each method for each channel function was obtained after 1000 epochs of training. From the results, both the CMAC and the DCMAC give lower MSEs than LMS and Volterra. In addition to the results in Table I, we applied the dependent t-test as a hypothesis test on the 12 sets of results. The t-test revealed that the DCMAC outperforms the CMAC with p-value = 0.005.

TABLE I. MEAN AND VARIANCE OF THE MSES FOR LMS, VOLTERRA, CMAC, AND DCMAC OVER THE 12 CHANNEL FUNCTIONS

            LMS      Volterra   CMAC    DCMAC
Mean        4.35     5.05       7.01    7.59
Variance    11.95    11.57      1.08    0.19

IV. CONCLUSION

The contribution of the present study is two-fold. First, inspired by the recent success of deep learning algorithms, we extended the CMAC structure into a deep one, termed the deep CMAC (DCMAC). Second, a backpropagation algorithm was derived to estimate the DCMAC parameters. Owing to the five-space structure, the backpropagation for the DCMAC differs from that used in related artificial neural networks. The parameter updates involved in DCMAC training include two parts: (1) the update algorithm for the means and variances, and (2) the update algorithm for the weights. Experimental results on ANC tasks showed that the proposed DCMAC achieves better noise cancellation performance than the conventional single-layer CMAC. In the future, we will investigate the capabilities of the DCMAC on other signal processing tasks, such as echo cancellation and single-microphone noise reduction. Meanwhile, advanced deep learning techniques used in deep neural networks, such as dropout and sparsity constraints, will be incorporated into the DCMAC framework. Finally, as in related deep learning research, identifying ways to optimize the number of layers and the initial parameters of the DCMAC according to the amount of training data is an important direction for future work.

REFERENCES

[1] B. Widrow, et al., "Adaptive noise cancelling: Principles and applications," Proceedings of the IEEE, vol. 63 (12), pp. 1692-1716, 1975.
[2] S. Haykin, Adaptive Filter Theory, fourth edition, Prentice-Hall, 2002.
[3] E. A. Wan and R. van der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proc. AS-SPCC, pp. 153-158, 2000.
[4] F. Daum, "Nonlinear filters: beyond the Kalman filter," IEEE Aerospace and Electronic Systems Magazine, vol. 20 (8), pp. 57-69, 2005.
[5] L. Tan and J. Jiang, "Adaptive Volterra filters for active control of nonlinear noise processes," IEEE Transactions on Signal Processing, vol. 49 (8), pp. 1667-1676, 2001.
[6] V. J. Mathews, "Adaptive Volterra filters using orthogonal structures," IEEE Signal Processing Letters, vol. 3 (12), pp. 307-309, 1996.
[7] G. Horvath and T. Szabo, "CMAC neural network with improved generalization property for system modeling," in Proc. IMTC, vol. 2, pp. 1603-1608, 2002.
[8] C. M. Lin, L. Y. Chen, and D. S. Yeung, "Adaptive filter design using recurrent cerebellar model articulation controller," IEEE Trans. on Neural Networks, vol. 21 (7), pp. 1149-1157, 2010.
[9] C. M. Lin and Y. F. Peng, "Adaptive CMAC-based supervisory control for uncertain nonlinear systems," IEEE Trans. on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 34 (2), pp. 1248-1260, 2004.
[10] C. P. Hung, "Integral variable structure control of nonlinear system using a CMAC neural network learning approach," IEEE Trans. on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 34 (1), pp. 702-709, 2004.
[11] J. S. Albus, "A new approach to manipulator control: The cerebellar model articulation controller (CMAC)," Journal of Dynamic Systems, Measurement, and Control, vol. 97 (3), pp. 220-227, 1975.
[12] P. E. M. Almeida and M. G. Simoes, "Parametric CMAC networks: Fundamentals and applications of a fast convergence neural structure," IEEE Trans. Ind. Applicat., vol. 39 (5), pp. 1551-1557, 2003.
[13] C. M. Lin, L. Y. Chen, and C. H. Chen, "RCMAC hybrid control for MIMO uncertain nonlinear systems using sliding-mode technology," IEEE Trans. Neural Netw., vol. 18 (3), pp. 708-720, 2007.
[14] S. Commuri and F. L. Lewis, "CMAC neural networks for control of nonlinear dynamical systems: Structure, stability and passivity," Automatica, vol. 33 (4), pp. 635-641, 1997.
[15] Y. H. Kim and F. L. Lewis, "Optimal design of CMAC neural-network controller for robot manipulators," IEEE Trans. on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 30 (1), pp. 22-31, 2000.
[16] J. Y. Wu, "MIMO CMAC neural network classifier for solving classification problems," Applied Soft Computing, vol. 11 (2), pp. 2326-2333, 2011.
[17] Z. R. Yu, T. C. Yang, and J. G. Juang, "Application of CMAC and FPGA to a twin rotor MIMO system," in Proc. ICIEA, pp. 264-269, 2010.
[18] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, "Learning hierarchical features for scene labeling," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 35 (8), pp. 1915-1929, 2013.
[19] H. Lee, C. Ekanadham, and A. Y. Ng, "Sparse deep belief net model for visual area V2," in Proc. NIPS, 2007.
[20] P. Vincent, et al., "Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion," The Journal of Machine Learning Research, vol. 11, pp. 3371-3408, 2010.
[21] G. Hinton, et al., "Deep neural networks for acoustic modeling in speech recognition," IEEE Signal Processing Magazine, vol. 29 (6), pp. 82-97, 2012.
[22] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, pp. 436-444, 2015.
[23] S. M. Siniscalchi, T. Svendsen, and C. H. Lee, "An artificial neural network approach to automatic speech processing," Neurocomputing, vol. 140, pp. 326-338, 2014.
[24] Y. He, K. Kavukcuoglu, Y. Wang, A. Szlam, and Y. Qi, "Unsupervised feature learning by deep sparse coding," in Proc. SDM, pp. 902-910, 2014.
[25] A. Cichocki and R. Zdunek, "Multilayer nonnegative matrix factorization," Electronics Letters, pp. 947-948, 2006.
[26] S. Liang and R. Srikant, "Why deep neural networks?" arXiv preprint arXiv:1610.04161, 2016.
[27] J. Ba and R. Caruana, "Do deep nets really need to be deep?" in Proc. NIPS, pp. 2654-2662, 2014.
[28] C. T. Lin and C. F. Juang, "An adaptive neural fuzzy filter and its applications," IEEE Trans. on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 27 (4), pp. 635-656, 1997.