Journal of Circuits, Systems, and Computers
Vol. 21, No. 1 (2012) 1250006 (16 pages)
© World Scientific Publishing Company
DOI: 10.1142/S0218126612500065

STOCHASTIC INFORMATION GRADIENT ALGORITHM WITH GENERALIZED GAUSSIAN DISTRIBUTION MODEL*

BADONG CHEN, JOSE C. PRINCIPE, JINCHUN HU and YU ZHU

Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611, USA

Department of Precision Instruments and Mechanology, Tsinghua University, Beijing 100084, P. R. China

Received 26 October 2009
Accepted 3 August 2010
Published 29 February 2012

This paper presents a parameterized version of the stochastic information gradient (SIG) algorithm, in which the error distribution is modeled by a generalized Gaussian density (GGD) with location, shape, and dispersion parameters. Compared with the kernel-based SIG (SIG-Kernel) algorithm, the GGD-based SIG (SIG-GGD) algorithm does not involve kernel width selection. If the error is zero-mean, the SIG-GGD algorithm becomes a least mean p-power (LMP) algorithm with adaptive order and variable step-size. Owing to its well-matched density estimation and automatic switching capability, the proposed algorithm compares favorably with existing algorithms.

Keywords: Minimum error entropy (MEE) criterion; stochastic information gradient (SIG) algorithm; generalized Gaussian density (GGD); least mean p-power (LMP).

1. Introduction

Stochastic gradient (SG) algorithms under various error criteria have been extensively studied and widely used over the last decades [1, 2]. The least mean square (LMS) algorithm, which is based on the minimum mean-square error (MSE) criterion, is perhaps the most popular one because of its low computational requirements and its optimality in the Gaussian case. However, in the presence of non-Gaussian noises, the LMS algorithm suffers significant performance degradation.

*This paper was recommended by Regional Editor Majid Ahmadi.

To improve the learning performance (degree of stability, convergence speed, misadjustment error, etc.) in non-Gaussian situations, many stochastic gradient algorithms under non-mean-square-error (non-MSE) criteria have been studied [3-5]. Among these algorithms, the least mean p-power (LMP) algorithm [5, 6] deserves special attention owing to its mathematical tractability and satisfactory performance. For adaptive system training, the LMP algorithm of order p is given by

$$W(k+1) = W(k) - \eta\,\frac{\partial\,(|e(k)|^{p})}{\partial W} = W(k) - \eta\,p\,|e(k)|^{p-1}\operatorname{sign}(e(k))\,\frac{\partial e(k)}{\partial W}\,, \qquad (1)$$

where $W(k) = [w_{1}(k), w_{2}(k), \ldots, w_{M}(k)]^{T}$ is the weight vector of the adaptive model at iteration $k$, $\eta > 0$ is the step-size, $e(k)$ denotes the prediction error, and the gradient $\partial e(k)/\partial W$ depends on the topology of the adaptive system under consideration. Well-known special cases of the LMP algorithm include the least absolute difference (LAD) algorithm ($p = 1$) [7], the LMS algorithm ($p = 2$), and the least mean fourth (LMF) algorithm ($p = 4$) [3].

More recently, researchers have proposed the minimum error entropy (MEE) criterion as an information-theoretic alternative to traditional error criteria in supervised adaptive system training [8, 23]. As shown in Fig. 1, under the MEE criterion, the weight vector of the adaptive model $h(x, W)$ is adjusted online so that the error entropy $H(e(k))$ is minimized. The error entropy measures the average uncertainty contained in the error signal, and its minimization forces the error samples to concentrate. Numerical examples suggest that the MEE criterion can achieve a better error distribution, especially with respect to the higher-order statistics of the error, in adaptive system training [12].

Fig. 1. Scheme of adaptive system training under the MEE criterion, where $d(k)$ is the noisy observation of the system output and is considered as the desired response of the adaptive model, which is driven by the input signal $x(k)$ and perturbed by the noise samples $n(k)$.
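For concreteness, here is a minimal Python sketch (ours, not from the paper) of the LMP update (1) for an FIR filter, for which $\partial e(k)/\partial W = -X(k)$; the function name, default order, and step-size value are placeholders:

```python
import numpy as np

def lmp_update(W, x, d, p=2.0, eta=0.01):
    """One LMP step, Eq. (1), for an FIR model y(k) = W^T x(k).

    Since e(k) = d(k) - W^T x(k), we have de(k)/dW = -x(k), so the minus
    sign in Eq. (1) turns into a plus.  p = 1 gives LAD, p = 2 LMS, p = 4 LMF.
    """
    e = d - W @ x                                          # prediction error e(k)
    W = W + eta * p * np.abs(e) ** (p - 1) * np.sign(e) * x
    return W, e
```

With p = 2 the update reduces to the familiar LMS correction W + 2*eta*e*x.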

Under the MEE criterion, the optimal weight vector is given by

$$W^{*} = \arg\min_{W \in \mathbb{R}^{M}} H(e) = \arg\min_{W \in \mathbb{R}^{M}} \left\{ -\int_{\mathbb{R}} p_{e}(e) \log p_{e}(e)\, de \right\} = \arg\min_{W \in \mathbb{R}^{M}} E_{e}\left(-\log p_{e}(e)\right), \qquad (2)$$

where $p_{e}(\cdot)$ denotes the PDF of the error and $E_{e}(\cdot)$ denotes the expectation operator. The optimal weight vector can be searched for by the stochastic information gradient (SIG) algorithm [9]:

$$W(k+1) = W(k) + \eta\,\frac{\partial}{\partial W}\log \hat{p}_{e}(e(k))\,. \qquad (3)$$

The key problem of the SIG algorithm is how to estimate the PDF of the error $e(k)$. The nonparametric kernel approach has been widely used owing to its smoothing properties [13]. By kernel density estimation (KDE), the estimated error distribution is given by

$$\hat{p}_{e}(e(k)) = \frac{1}{L} \sum_{i=k-L+1}^{k} K_{h}\left(e(k) - e(i)\right), \qquad (4)$$

where $K_{h}(\cdot)$ represents the kernel function, $h$ is the kernel width, and $L$ denotes the sliding window length. With the kernel approach, one is often confronted with the problem of choosing a suitable kernel width: an inappropriate width will significantly deteriorate the behavior of the algorithm [8, 10]. Although the effects of the kernel width on the shape of the performance surface and on the eigenvalues of the Hessian at and around the optimal solution have been carefully investigated, the choice of the kernel width remains a difficult task. In Ref. 24, an adaptive kernel width is proposed; however, its computation relies on a gradient search that is about as complex as the adaptive algorithm itself. Hence, a parameterized density estimate, which does not involve the choice of a kernel width, can be more practical. Furthermore, if some prior knowledge about the structure of the density is available, parameterized density estimation may achieve better accuracy than nonparametric estimation. These facts motivate us to develop a parameterized SIG algorithm, in which the error distribution is modeled by a parametric family of probability density functions.

In this paper, the generalized Gaussian density (GGD) [14] is used to match the error distribution. The reasons for this choice are at least twofold: (1) the GGD densities form a family of maximum entropy densities, which have simple yet flexible functional forms and can approximate a large number of different signals covering a wide range of statistical distributions [15-18]; (2) the GGD densities have only three parameters, i.e., location, shape, and dispersion, which can easily be estimated from sample data [17, 18].
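As a reference point for the kernel-based variant, the following sketch (our illustration, with a Gaussian kernel and placeholder names) evaluates the score $\partial \log \hat{p}_{e}(e(k))/\partial e(k)$ of the KDE in Eq. (4), treating the past errors in the window as constants with respect to $W$; an FIR SIG-Kernel update would then multiply this score by $\partial e(k)/\partial W = -X(k)$:

```python
import numpy as np

def gaussian_kernel(u, h):
    """Gaussian kernel K_h(u) with width h."""
    return np.exp(-u ** 2 / (2 * h ** 2)) / (np.sqrt(2 * np.pi) * h)

def sig_kernel_score(err_window, h):
    """d log p_hat(e(k)) / d e(k) for the KDE of Eq. (4).

    err_window = [e(k-L+1), ..., e(k)]; the last entry is the current
    error, and past errors are treated as constants.
    """
    u = err_window[-1] - np.asarray(err_window)   # differences e(k) - e(i)
    K = gaussian_kernel(u, h)                     # kernel values
    dK = -(u / h ** 2) * K                        # derivative K_h'(u)
    return dK.sum() / K.sum()                     # score = p_hat' / p_hat
```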

The paper is organized as follows. In Sec. 2, the generalized Gaussian distribution and its parametric estimation are introduced. In Sec. 3, the GGD-based SIG algorithm is derived, and its connections with the LMP algorithm are discussed. In Sec. 4, a series of Monte Carlo simulation experiments demonstrates the performance of the new algorithm. Finally, concluding remarks are presented in Sec. 5.

2. Generalized Gaussian PDF and Its Parametric Estimation

The generalized Gaussian density (GGD) with location $\mu$ and standard deviation $\sigma > 0$ is defined as

$$p_{X}(x) = \frac{\alpha}{2\beta\,\Gamma(1/\alpha)} \exp\left(-\frac{|x-\mu|^{\alpha}}{\beta^{\alpha}}\right), \qquad (5)$$

in which

$$\beta = \sigma\sqrt{\frac{\Gamma(1/\alpha)}{\Gamma(3/\alpha)}}\,, \qquad (6)$$

where $\Gamma(\cdot)$ denotes the Gamma function [19], given by

$$\Gamma(z) = \int_{0}^{\infty} x^{z-1} e^{-x}\, dx\,, \quad z > 0\,. \qquad (7)$$

In Eq. (5), $\alpha$ describes the exponential rate of decay, and hence the shape of the PDF, while $\beta$ models the dispersion of the distribution. Usually, $\alpha$ is called the shape parameter and $\beta$ is referred to as the dispersion or scale parameter. The GGD family includes the Gaussian ($\alpha = 2$) and Laplacian ($\alpha = 1$) distributions as special cases. Figure 2 shows GGD distributions for several shape parameters with the same variance ($\sigma = 1$). From Fig. 2, it is evident that smaller values of the shape parameter correspond to heavier tails and therefore to sharper distributions. In the limiting cases, as $\alpha \to \infty$ the GGD becomes close to the uniform distribution, whereas as $\alpha \to 0^{+}$ it approaches an impulse function ($\delta$-distribution).

The GGD densities are the maximum entropy (Maxent) distributions that solve the following constrained optimization problem (see Appendix A):

$$\max_{p}\ \left\{-\int_{-\infty}^{+\infty} p(x)\log p(x)\,dx\right\} \quad \text{s.t.} \quad \int_{-\infty}^{+\infty} p(x)\,dx = 1\,, \quad \int_{-\infty}^{+\infty} |x-\mu|^{\alpha}\, p(x)\,dx = m_{\alpha}\,, \qquad (8)$$

where $m_{\alpha}$ denotes the $\alpha$-order absolute central moment.
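A minimal sketch (ours, not the authors' code) of the GGD of Eqs. (5) and (6), parameterized by the standard deviation so that the Gaussian case $\alpha = 2$ can serve as a sanity check:

```python
import numpy as np
from scipy.special import gamma

def ggd_pdf(x, mu=0.0, alpha=2.0, sigma=1.0):
    """GGD of Eq. (5), with the dispersion beta tied to sigma by Eq. (6)."""
    beta = sigma * np.sqrt(gamma(1.0 / alpha) / gamma(3.0 / alpha))
    return (alpha / (2.0 * beta * gamma(1.0 / alpha))
            * np.exp(-(np.abs(x - mu) / beta) ** alpha))
```

For alpha = 2 this reduces to the N(mu, sigma^2) density, and for alpha = 1 to a Laplace density with variance sigma^2.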

Fig. 2. Generalized Gaussian distribution for different shape parameters with unit variance.

According to Jaynes's maximum entropy principle (MEP) [20], the Maxent distribution is "uniquely determined as the one which is maximally noncommittal with regard to missing information", one that "agrees with what is known, but expresses maximum uncertainty with respect to all other matters". Therefore, the GGD distribution is the most unbiased distribution that agrees with a given absolute central moment constraint.

Until now, a number of methods have been proposed for the parametric estimation of a GGD distribution; the state of the art can be found in Refs. 14 and 18. Here, we only introduce the moment-matching estimators (MMEs). A complete statistical description of GGD-modeled distributions, obtained by simply matching the underlying moments of the data set with those of the assumed distribution, was first given by Mallat [21]. For a GGD distribution, the $r$-order absolute central moment is calculated as

$$m_{r} = E\left[|X-\mu|^{r}\right] = \int_{-\infty}^{+\infty} |x-\mu|^{r}\, p_{X}(x)\,dx = \beta^{r}\,\frac{\Gamma\left(\frac{r+1}{\alpha}\right)}{\Gamma\left(\frac{1}{\alpha}\right)}\,. \qquad (9)$$

The ratio of the mean absolute deviation ($m_{1}$) to the standard deviation ($\sqrt{m_{2}}$) is

$$M(\alpha) = \frac{m_{1}}{\sqrt{m_{2}}} = \frac{\Gamma(2/\alpha)}{\sqrt{\Gamma(1/\alpha)\,\Gamma(3/\alpha)}}\,, \qquad (10)$$

which is a monotonically increasing function of the shape parameter $\alpha$.
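A quick numerical sketch of the ratio in Eq. (10) (our illustration); its monotonicity in $\alpha$ is what makes the inversion used below well defined:

```python
import numpy as np
from scipy.special import gamma

def M(alpha):
    """Moment ratio of Eq. (10): Gamma(2/a) / sqrt(Gamma(1/a) * Gamma(3/a))."""
    a = np.asarray(alpha, dtype=float)
    return gamma(2.0 / a) / np.sqrt(gamma(1.0 / a) * gamma(3.0 / a))

# M is increasing in alpha:
# M(1) = 1/sqrt(2) ~ 0.707 (Laplace),  M(2) = sqrt(2/pi) ~ 0.798 (Gaussian)
```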

An estimate of $\alpha$ can therefore be expressed as

$$\hat{\alpha} = M^{-1}\left(\frac{\hat{m}_{1}}{\sqrt{\hat{m}_{2}}}\right), \qquad (11)$$

where $M^{-1}(\cdot)$ denotes the inverse function of $M(\alpha)$, and $\hat{m}_{1}$ and $\hat{m}_{2}$ represent the sample mean absolute deviation and the sample second central moment, respectively. Given independent, identically distributed (i.i.d.) random samples $\{x_{1}, x_{2}, \ldots, x_{N}\}$, we have

$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} x_{i}\,, \qquad \hat{m}_{1} = \frac{1}{N}\sum_{i=1}^{N} |x_{i}-\hat{\mu}|\,, \qquad \hat{m}_{2} = \frac{1}{N}\sum_{i=1}^{N} (x_{i}-\hat{\mu})^{2}\,. \qquad (12)$$

The shape parameter can also be expressed in terms of the kurtosis of the GGD distribution [22]. A similar ratio, equal to the reciprocal of the square root of the kurtosis of the GGD distribution, is given by

$$K(\alpha) = \frac{m_{2}}{\sqrt{m_{4}}} = \frac{\Gamma(3/\alpha)}{\sqrt{\Gamma(1/\alpha)\,\Gamma(5/\alpha)}}\,. \qquad (13)$$

Therefore, another estimate of $\alpha$ is

$$\hat{\alpha} = K^{-1}\left(\frac{\hat{m}_{2}}{\sqrt{\hat{m}_{4}}}\right), \qquad (14)$$

where $K^{-1}(\cdot)$ denotes the inverse function of $K(\alpha)$, and $\hat{m}_{4}$ is the sample fourth-order central moment, that is,

$$\hat{m}_{4} = \frac{1}{N}\sum_{i=1}^{N} (x_{i}-\hat{\mu})^{4}\,. \qquad (15)$$

To estimate the shape parameter, we have to evaluate the inverse function $M^{-1}(\cdot)$ or $K^{-1}(\cdot)$, whose curves are depicted in Fig. 3. As neither inverse function has an explicit form, some interpolation method should be used in practical applications. Plugging the estimate of $\alpha$ into (6) and substituting $\sqrt{\hat{m}_{2}}$ for $\sigma$, we obtain the estimate of the dispersion parameter:

$$\hat{\beta} = \sqrt{\hat{m}_{2}\,\frac{\Gamma(1/\hat{\alpha})}{\Gamma(3/\hat{\alpha})}}\,. \qquad (16)$$
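The following sketch (ours) implements the kurtosis-based estimator of Eqs. (14)-(16), inverting $K(\alpha)$ by linear interpolation on a grid in the spirit of the paper's simulations; the grid bounds, step, and the degenerate-window guard are assumptions:

```python
import numpy as np
from scipy.special import gamma

def K(alpha):
    """Kurtosis-based ratio of Eq. (13)."""
    a = np.asarray(alpha, dtype=float)
    return gamma(3.0 / a) / np.sqrt(gamma(1.0 / a) * gamma(5.0 / a))

ALPHA_GRID = np.arange(0.05, 4.05, 0.05)   # K is increasing, so interp inverts it

def estimate_ggd_params(x):
    """Moment-matching GGD estimates (mu, alpha, beta) via Eqs. (12)-(16)."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    m2 = np.mean((x - mu) ** 2)
    m4 = np.mean((x - mu) ** 4)
    r = m2 / np.sqrt(m4) if m4 > 0 else float(K(2.0))  # guard: degenerate window
    alpha = np.interp(r, K(ALPHA_GRID), ALPHA_GRID)    # Eq. (14) by interpolation
    beta = np.sqrt(m2 * gamma(1.0 / alpha) / gamma(3.0 / alpha))  # Eq. (16)
    return mu, alpha, beta
```

Because np.interp clips at the grid ends, the estimate is automatically confined to [0.05, 4], which also realizes the cap on the shape parameter used later in Sec. 4.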

Fig. 3. Curves of the inverse functions $M^{-1}(x)$ and $K^{-1}(x)$.

3. GGD-Based Stochastic Information Gradient Algorithm

Using a GGD density to approximate the error distribution, we have

$$\hat{p}_{e}(e(k)) = \frac{\hat{\alpha}_{k}}{2\hat{\beta}_{k}\,\Gamma(1/\hat{\alpha}_{k})} \exp\left(-\frac{|e(k)-\hat{\mu}_{k}|^{\hat{\alpha}_{k}}}{\hat{\beta}_{k}^{\hat{\alpha}_{k}}}\right), \qquad (17)$$

where $\hat{\mu}_{k}$, $\hat{\alpha}_{k}$, and $\hat{\beta}_{k}$ denote the estimated location, shape, and dispersion parameters at iteration $k$. By moment-matching estimation, say the kurtosis-based approach, we have

$$\hat{\mu}_{k} = \frac{1}{L}\sum_{i=k-L+1}^{k} e(i)\,, \qquad \hat{\alpha}_{k} = K^{-1}\left(\frac{\frac{1}{L}\sum_{i=k-L+1}^{k}\left(e(i)-\hat{\mu}_{k}\right)^{2}}{\sqrt{\frac{1}{L}\sum_{i=k-L+1}^{k}\left(e(i)-\hat{\mu}_{k}\right)^{4}}}\right), \qquad \hat{\beta}_{k} = \sqrt{\left(\frac{1}{L}\sum_{i=k-L+1}^{k}\left(e(i)-\hat{\mu}_{k}\right)^{2}\right)\frac{\Gamma(1/\hat{\alpha}_{k})}{\Gamma(3/\hat{\alpha}_{k})}}\,, \qquad (18)$$

where $L$ denotes the sliding window length.
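Tracking Eq. (18) online just means re-running the moment-matching estimator on the last $L$ errors; a small sketch (ours), reusing estimate_ggd_params() from the block above, with a placeholder window length:

```python
import numpy as np
from collections import deque

class GGDTracker:
    """Sliding-window GGD parameter estimates (mu_k, alpha_k, beta_k), Eq. (18)."""

    def __init__(self, L=100):
        self.window = deque(maxlen=L)   # holds e(k-L+1), ..., e(k)

    def update(self, e_k):
        self.window.append(e_k)
        return estimate_ggd_params(np.array(self.window))
```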

Thus, the GGD-based SIG algorithm can be derived as

$$W(k+1) = W(k) + \eta\,\frac{\partial}{\partial W}\log\left\{\frac{\hat{\alpha}_{k}}{2\hat{\beta}_{k}\,\Gamma(1/\hat{\alpha}_{k})}\exp\left(-\frac{|e(k)-\hat{\mu}_{k}|^{\hat{\alpha}_{k}}}{\hat{\beta}_{k}^{\hat{\alpha}_{k}}}\right)\right\} = W(k) - \eta\,\frac{\hat{\alpha}_{k}}{\hat{\beta}_{k}^{\hat{\alpha}_{k}}}\,|e(k)-\hat{\mu}_{k}|^{\hat{\alpha}_{k}-1}\,\operatorname{sign}\left(e(k)-\hat{\mu}_{k}\right)\frac{\partial e(k)}{\partial W}\,. \qquad (19)$$

In this paper, to make a distinction, we denote by "SIG-Kernel" and "SIG-GGD" the SIG algorithms based on the kernel approach and on GGD densities, respectively. Compared with the SIG-Kernel algorithm, the SIG-GGD algorithm just needs to estimate the three parameters of the GGD density, without resorting to the choice of a kernel width and the calculation of kernel functions.

Comparing (1) and (19), we find that when $\hat{\mu}_{k} \approx 0$, the SIG-GGD algorithm can be regarded as an LMP algorithm with adaptive order $\hat{\alpha}_{k}$ and variable step-size $\eta/\hat{\beta}_{k}^{\hat{\alpha}_{k}}$. In fact, under certain conditions, the SIG-GGD algorithm converges to a specific LMP algorithm. Consider the finite impulse response (FIR) system identification case, in which the plant and the adaptive model are both FIR filters of the same order. In this case, the error signal $e(k)$ is given by

$$e(k) = W^{*T}X(k) + n(k) - W^{T}(k)X(k) = V^{T}(k)X(k) + n(k)\,, \qquad (20)$$

where $W^{*} = [w_{1}^{*}, w_{2}^{*}, \ldots, w_{M}^{*}]^{T}$ is the weight vector of the unknown system, $n(k)$ is the interference noise, $X(k) = [x(k), x(k-1), \ldots, x(k-M+1)]^{T}$ is the input vector, and $V(k) = W^{*} - W(k)$ is the weight error vector. Then, near convergence, we have $W(k) \approx W^{*}$, and hence

$$e(k) = \left(W^{*} - W(k)\right)^{T}X(k) + n(k) \approx n(k)\,. \qquad (21)$$

If $\{n(k)\}$ is zero-mean white noise, we have $\hat{\mu}_{k} \approx 0$ and

$$\hat{\alpha}_{k} \approx K^{-1}\left(\frac{\frac{1}{L}\sum_{i=k-L+1}^{k} n^{2}(i)}{\sqrt{\frac{1}{L}\sum_{i=k-L+1}^{k} n^{4}(i)}}\right), \qquad \hat{\beta}_{k} \approx \sqrt{\left(\frac{1}{L}\sum_{i=k-L+1}^{k} n^{2}(i)\right)\frac{\Gamma(1/\hat{\alpha}_{k})}{\Gamma(3/\hat{\alpha}_{k})}}\,. \qquad (22)$$
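A minimal sketch (ours) of one SIG-GGD step, Eq. (19), for the FIR case where $\partial e(k)/\partial W = -X(k)$; the step-size value is a placeholder, and the cap on the shape parameter mirrors the setting described in Sec. 4:

```python
import numpy as np

def sig_ggd_step(W, x, d, mu_k, alpha_k, beta_k, eta=0.01, alpha_max=4.0):
    """One SIG-GGD update, Eq. (19), for an FIR model y(k) = W^T x(k)."""
    alpha_k = min(alpha_k, alpha_max)       # cap on alpha, as in Sec. 4
    e = d - W @ x                           # e(k); de(k)/dW = -x(k)
    u = e - mu_k
    grad = (alpha_k / beta_k ** alpha_k) * np.abs(u) ** (alpha_k - 1) * np.sign(u)
    return W + eta * grad * x, e            # minus sign of Eq. (19) times -x(k)
```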

Assuming $L$ is large enough, by the limit theorems of probability,

$$\frac{1}{L}\sum_{i=k-L+1}^{k} n^{2}(i) \to E\left[n^{2}(k)\right], \qquad \frac{1}{L}\sum_{i=k-L+1}^{k} n^{4}(i) \to E\left[n^{4}(k)\right]. \qquad (23)$$

Thus, the estimated parameters $\hat{\alpha}_{k}$ and $\hat{\beta}_{k}$ approach certain constants; i.e., the SIG-GGD algorithm converges to a certain LMP algorithm with fixed order and step-size. For example, if $n(k)$ is zero-mean Gaussian distributed (shape parameter $\alpha = 2$), we have $\hat{\alpha}_{k} \approx 2$ near convergence, and consequently the SIG-GGD algorithm converges to the LMS algorithm. Similarly, for Laplace noise, the SIG-GGD algorithm converges to the LAD algorithm. In Ref. 4, it has been shown that under slow adaptation, the LMS and LAD algorithms are, respectively, the optimum algorithms for Gaussian and Laplace interfering noises. We may therefore conclude that the SIG-GGD algorithm has the ability to adjust its parameter pair ($\hat{\alpha}_{k}$, $\hat{\beta}_{k}$) so as to switch to a certain optimum algorithm. This automatic switching property is verified in the following section through a set of Monte Carlo simulations.

4. Simulation Results

To demonstrate the performance of the SIG-GGD algorithm, we perform a series of Monte Carlo simulations of FIR channel identification, in comparison with the SIG-Kernel, LAD, LMS, and LMF algorithms. Let the weight vector of the unknown channel be $W^{*} = [0.1, 0.3, 0.5, 0.3, 0.1]^{T}$, and let the input signal be white Gaussian noise with unit power. For the SIG-Kernel algorithm, the Gaussian kernel is used, and the kernel width is chosen according to the well-known Silverman's rule of thumb [13]. For the SIG-GGD algorithm, the kurtosis-based moment-matching estimator is used (see Ref. 18). To evaluate the inverse function $K^{-1}(\cdot)$, we adopt basic linear interpolation with step 0.05. In addition, to avoid very large gradients, we set the upper bound of $\hat{\alpha}_{k}$ to 4; that is, if $\hat{\alpha}_{k} > 4$, we set $\hat{\alpha}_{k} = 4$.
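Putting the pieces together, here is a condensed sketch (ours; it reuses GGDTracker and sig_ggd_step from above, and the run length, window length, step-size, warm-up length, and Laplace noise scale are placeholders) of one identification run measuring the mean-square deviation:

```python
import numpy as np

rng = np.random.default_rng(0)
W_true = np.array([0.1, 0.3, 0.5, 0.3, 0.1])        # unknown channel
M, N, L = len(W_true), 5000, 100
x_sig = rng.standard_normal(N)                      # unit-power white Gaussian input
noise = rng.laplace(scale=1 / np.sqrt(2), size=N)   # unit-variance Laplace noise

W, tracker, msd = np.zeros(M), GGDTracker(L=L), []
for k in range(M, N):
    x = x_sig[k - M + 1:k + 1][::-1]                # input vector X(k)
    d = W_true @ x + noise[k]                       # noisy desired response d(k)
    mu_k, a_k, b_k = tracker.update(d - W @ x)      # sliding-window GGD estimates
    if k > M + 10:                                  # short warm-up for the window
        W, _ = sig_ggd_step(W, x, d, mu_k, a_k, b_k, eta=0.005)
    msd.append(np.sum((W_true - W) ** 2))           # MSD = V(k)^T V(k)
```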

In the experiments below, we focus on four special distributions of the disturbance noise, whose density functions are depicted in Fig. 4. Figures 5-8 illustrate the convergence curves of the mean-square deviation (MSD), $E[V^{T}(k)V(k)]$, averaged over independent Monte Carlo runs for each disturbance noise. Figure 9 shows the averaged parameter $\hat{\alpha}_{k}$ for the Laplace, Gaussian, and Uniform noises. To compare the statistical results, we also summarize in Table 1 the sample mean and standard deviation of the weight $w_{3}$ ($w_{3}^{*} = 0.5$).

Fig. 4. Probability density functions of the four disturbance noises (Gaussian, Laplace, Uniform, and MixNorm).

Fig. 5. Convergence curves of the mean-square deviation, MSD (dB) versus iterations, for the SIG-Kernel, SIG-GGD, LAD, LMS, and LMF algorithms (Laplace noise case).

Fig. 6. Convergence curves of the mean-square deviation, MSD (dB) versus iterations, for the SIG-Kernel, SIG-GGD, LAD, LMS, and LMF algorithms (Gaussian noise case).

Fig. 7. Convergence curves of the mean-square deviation, MSD (dB) versus iterations, for the SIG-Kernel, SIG-GGD, LAD, LMS, and LMF algorithms (Uniform noise case).

Fig. 8. Convergence curves of the mean-square deviation, MSD (dB) versus iterations, for the SIG-Kernel, SIG-GGD, LAD, LMS, and LMF algorithms (MixNorm noise case).

Table 1. Sample mean and standard deviation of $w_{3}$ over the Monte Carlo simulations, for the SIG-Kernel, SIG-GGD, LAD, LMS, and LMF algorithms under the Laplace, Gaussian, Uniform, and MixNorm noises.

From the figures and the table, we have the following observations:

(1) The performance of the LAD, LMS, and LMF algorithms depends crucially on the distribution of the disturbance noise. Each of the three may achieve the smallest misadjustment for a certain noise distribution (e.g., the LMS algorithm for Gaussian noise); however, for other noise distributions, their performance deteriorates dramatically.

(2) The SIG-GGD and SIG-Kernel algorithms are both robust to the noise distribution. In the case of symmetric and unimodal noise (e.g., Laplace, Gaussian, Uniform), the SIG-GGD algorithm may achieve a smaller misadjustment than the SIG-Kernel algorithm. Although in the case of nonsymmetric and nonunimodal noise (e.g., MixNorm) the performance of the SIG-GGD algorithm may not be as good as that of the SIG-Kernel algorithm, it is still better than that of most of the LMP algorithms.

(3) Near convergence, the parameter $\hat{\alpha}_{k}$ converges approximately to 1, 2, and 4 (note that $\hat{\alpha}_{k}$ is artificially restricted to $\hat{\alpha}_{k} \leq 4$) when disturbed by the Laplace, Gaussian, and Uniform noises, respectively, which implies that the SIG-GGD algorithm switches approximately to the LAD, LMS, and LMF algorithms for these noises. This agrees with the discussion in Sec. 3.

Fig. 9. Parameter $\hat{\alpha}_{k}$ averaged across independent Monte Carlo runs: (a) Laplace noise; (b) Gaussian noise; (c) Uniform noise.

5. Conclusions

We have presented in this paper a new stochastic information gradient (SIG) algorithm, in which the error distribution is modeled by the generalized Gaussian density (GGD). Unlike the kernel-based SIG algorithm, the GGD-based algorithm does not resort to the choice of a kernel width, relying only on estimates of the location, shape, and dispersion parameters during adaptation. The relationships between the proposed algorithm and the least mean p-power (LMP)

algorithm are investigated. The new algorithm has the ability to adjust its parameters so as to switch approximately to a certain optimal LMP algorithm. Simulation results suggest that when the noise has a symmetric and unimodal distribution, the SIG-GGD algorithm may outperform the SIG-Kernel algorithm. In the case of a nonsymmetric or nonunimodal disturbance, the performance of the SIG-GGD algorithm may not be as good as that of the SIG-Kernel algorithm; however, it still outperforms most of the LMP algorithms.

Acknowledgments

This work was supported by the National Natural Science Foundation of China, and was partially supported by NSF Grants ECCS 0856441 and IIS 0964197, and by ONR Grant N00014.

Appendix A

To solve the optimization problem (8), we create an unconstrained expression for the entropy using Lagrange multipliers:

$$L = -\int_{-\infty}^{+\infty} p(x)\log p(x)\,dx - (\lambda_{0}-1)\left(\int_{-\infty}^{+\infty} p(x)\,dx - 1\right) - \lambda_{1}\left(\int_{-\infty}^{+\infty} |x-\mu|^{\alpha}\,p(x)\,dx - m_{\alpha}\right). \qquad (A.1)$$

Using the calculus of variations, we maximize $L$ with respect to $p(x)$:

$$\frac{\partial}{\partial p}\left\{-p\log p - (\lambda_{0}-1)p - \lambda_{1}|x-\mu|^{\alpha}p\right\} = 0 \;\Rightarrow\; -\log p - \lambda_{0} - \lambda_{1}|x-\mu|^{\alpha} = 0 \;\Rightarrow\; p = \exp\left\{-\lambda_{0} - \lambda_{1}|x-\mu|^{\alpha}\right\}, \qquad (A.2)$$

where $\lambda_{0}$ and $\lambda_{1}$ are determined by

$$\int_{-\infty}^{+\infty} \exp\left\{-\lambda_{0} - \lambda_{1}|x-\mu|^{\alpha}\right\}dx = 1\,, \qquad \int_{-\infty}^{+\infty} |x-\mu|^{\alpha}\exp\left\{-\lambda_{0} - \lambda_{1}|x-\mu|^{\alpha}\right\}dx = m_{\alpha}\,. \qquad (A.3)$$

Hence, it is easy to derive $\lambda_{0}$ and $\lambda_{1}$ as follows:

$$\lambda_{0} = \log\left\{\frac{2\beta\,\Gamma(1/\alpha)}{\alpha}\right\}, \qquad \lambda_{1} = \frac{1}{\beta^{\alpha}}\,. \qquad (A.4)$$

Combining (A.2) and (A.4) yields the GGD distribution (5). This completes the proof.

References

1. S. Haykin, Adaptive Filter Theory, 3rd edn. (Prentice Hall, Upper Saddle River, NJ, 1996).
2. B. Widrow and S. D. Stearns, Adaptive Signal Processing (Prentice Hall, Englewood Cliffs, NJ, 1985).
3. E. Walach and B. Widrow, The least mean fourth (LMF) adaptive algorithm and its family, IEEE Trans. Inform. Theory 30 (1984).
4. S. C. Douglas and H.-Y. Meng, Stochastic gradient adaptation under general error criteria, IEEE Trans. Signal Process. 42 (1994).
5. S. C. Pei and C. C. Tseng, Least mean p-power error criterion for adaptive FIR filter, IEEE J. Select. Areas Commun. 12 (1994).
6. Y. Xiao, Y. Tadokoro and K. Shida, Adaptive algorithm based on least mean p-power error criterion for Fourier analysis in additive noise, IEEE Trans. Signal Process. 47 (1999).
7. V. J. Mathews and S. H. Cho, Improved convergence analysis of stochastic gradient adaptive filters using the sign algorithm, IEEE Trans. Acoust. Speech Signal Process. 35 (1987).
8. D. Erdogmus and J. C. Principe, Generalized information potential criterion for adaptive system training, IEEE Trans. Neural Netw. 13 (2002).
9. D. Erdogmus, K. E. Hild II and J. C. Principe, Online entropy manipulation: Stochastic information gradient, IEEE Signal Process. Lett. 10 (2003).
10. D. Erdogmus and J. C. Principe, Convergence properties and data efficiency of the minimum error entropy criterion in Adaline training, IEEE Trans. Signal Process. 51 (2003).
11. T. M. Cover and J. A. Thomas, Elements of Information Theory (Wiley, New York, 1991).
12. D. Erdogmus and J. C. Principe, Comparison of entropy and mean square error criteria in adaptive system training using higher order statistics, Proc. Second Int. Workshop on Independent Component Analysis and Blind Signal Separation (2000).
13. B. W. Silverman, Density Estimation for Statistics and Data Analysis (Chapman & Hall, New York, 1986).
14. M. K. Varanasi and B. Aazhang, Parametric generalized Gaussian density estimation, J. Acoust. Soc. Amer. 86 (1989).
15. R. L. Joshi and T. R. Fischer, Comparison of generalized Gaussian and Laplacian modeling in DCT image coding, IEEE Signal Process. Lett. 2 (1995).
16. F. Wang, H. Li and R. Li, Unified parametric and nonparametric ICA algorithms for hybrid source signals and stability analysis, Int. J. Innovative Comput. Inform. Control 4 (2008).
17. K. Sharifi and A. Leon-Garcia, Estimation of shape parameter for generalized Gaussian distributions in subband decompositions of video, IEEE Trans. Circuits Syst. Video Technol. 5 (1995).
18. K. Kokkinakis and A. K. Nandi, Exponent parameter estimation for generalized Gaussian probability density functions with application to speech modeling, Signal Process. 85 (2005).
19. M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions (Dover, New York, 1970).
20. E. T. Jaynes, Information theory and statistical mechanics, Phys. Rev. 106 (1957).
21. S. Mallat, A theory for multiresolution signal decomposition: The wavelet representation, IEEE Trans. Pattern Anal. Machine Intell. 11 (1989).

22. S. Choi, A. Cichocki and S. Amari, Flexible independent component analysis, J. VLSI Signal Process. 26 (2000).
23. B. Chen, J. Hu, L. Pu and Z. Sun, Stochastic gradient algorithm under (h, φ)-entropy criterion, Circuits Syst. Signal Process. 26 (2007).
24. A. Singh and J. C. Principe, Information theoretic learning with adaptive kernels, Signal Process. 91 (2011).
