Convergence of a Fixed-Point Minimum Error Entropy Algorithm


Entropy 2015, 17, 5549–5560; doi:10.3390/e17085549

OPEN ACCESS
entropy
ISSN 1099-4300
www.mdpi.com/journal/entropy

Article

Convergence of a Fixed-Point Minimum Error Entropy Algorithm

Yu Zhang 1, Badong Chen 2,*, Xi Liu 2, Zejian Yuan 2 and Jose C. Principe 2,3

1 School of Aeronautics and Astronautics, Zhejiang University, Hangzhou 310027, China; E-Mail: zhangyu80@zju.edu.cn
2 School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China; E-Mails: lxi10@163.com (X.L.); yzejian@gmail.com (Z.Y.)
3 Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611, USA; E-Mail: principe@cnel.ufl.edu

* Author to whom correspondence should be addressed; E-Mail: chenbd@mail.xjtu.edu.cn.

Academic Editor: Kevin H. Knuth

Received: 13 May 2015 / Accepted: 28 July 2015 / Published: 3 August 2015

Abstract: The minimum error entropy (MEE) criterion is an important learning criterion in information theoretical learning (ITL). However, the MEE solution cannot be obtained in closed form even for a simple linear regression problem, and one has to search for it, usually, in an iterative manner. The fixed-point iteration is an efficient way to solve for the MEE solution. In this work, we study a fixed-point MEE algorithm for linear regression, and our focus is mainly on the convergence issue. We provide a sufficient condition (although a little loose) that guarantees the convergence of the fixed-point MEE algorithm. An illustrative example is also presented.

Keywords: information theoretic learning (ITL); minimum error entropy (MEE) criterion; fixed-point algorithm

MSC Codes: 62B10

1. Introduction

In recent years, information theoretic measures, such as entropy and mutual information, have been widely applied in domains of machine learning (so-called information theoretic learning (ITL) [1]) and

signal processing [1,2]. A possible main reason for the success of ITL is that information theoretic quantities can capture higher-order statistics of the data and offer potentially significant performance improvements in machine learning applications [1]. Based on the Parzen window method [3], the smooth and nonparametric information theoretic estimators can be applied directly to the data without imposing any a priori assumptions (say, the Gaussian assumption) about the underlying probability density function (PDF). In particular, Renyi's quadratic entropy estimator can be easily calculated by a double sum over samples [4–7]. The entropy in supervised learning serves as a measure of similarity and follows a framework similar to that of the well-known mean square error (MSE) [1,2]. An adaptive system can be trained by minimizing the entropy of the error over the training dataset [4]. This learning criterion is called the minimum error entropy (MEE) criterion [1,2,8–10]. MEE may achieve much better performance than MSE, especially when data are heavy-tailed or multimodal non-Gaussian [1,2,10]. However, the MEE solution cannot be obtained in closed form even when the system is a simple linear model such as a finite impulse response (FIR) filter. A practical approach is to search for the solution over the performance surface by an iterative algorithm. Usually, a simple gradient based search algorithm is adopted. With a gradient based learning algorithm, however, one has to select a proper learning rate (or step-size) to ensure stability and achieve a better tradeoff between misadjustment and convergence speed [4–7]. Another, more promising, search algorithm is the fixed-point iterative algorithm, which is step-size free and is often much faster than gradient based methods [11]. Fixed-point algorithms have received considerable attention in machine learning and signal processing due to their desirable properties of low computational requirements and fast convergence speed [12–17]. Convergence is a key issue for an iterative learning algorithm.

For the gradient based MEE algorithms, the convergence problem has already been studied and some theoretical results have been obtained [6,7]. For the fixed-point MEE algorithm, up to now there has been no study concerning the convergence. The goal of this paper is to study the convergence of a fixed-point MEE algorithm and provide a sufficient condition that ensures the convergence to a unique solution (the fixed point). It is worth noting that the convergence of a fixed-point maximum correntropy criterion (MCC) algorithm has been studied in [18].

The remainder of the paper is organized as follows. In Section 2, we derive a fixed-point MEE algorithm. In Section 3, we prove a sufficient condition to guarantee the convergence. In Section 4, we present an illustrative example. Finally, in Section 5, we give the conclusion.

2. Fixed-Point MEE Algorithm

Consider a simple linear regression (filtering) case where the error signal is

e(i) = d(i) - y(i) = d(i) - W^T X(i)    (1)

with d(i) being a desired value at time i, y(i) = W^T X(i) the output of the linear model, W = [w_1, w_2, \ldots, w_m]^T the weight vector, and X(i) = [x_1(i), x_2(i), \ldots, x_m(i)]^T the input vector (i.e., the regressor). The goal is to find a weight vector such that the error signal is as small as possible. Under the MEE criterion, the optimal weight vector is obtained by minimizing the error entropy [1,2]. With Renyi's quadratic entropy, the MEE solution can be expressed as

W_{MEE} = \arg\min_W \left[ -\log \int p_e^2(x)\,dx \right] = \arg\max_W \int p_e^2(x)\,dx    (2)

where p_e(\cdot) denotes the PDF of the error signal. In ITL, the quantity \int p_e^2(x)\,dx is also called the quadratic information potential (QIP) [1]. In a practical situation, however, the error distribution is usually unknown, and one has to estimate it from the error samples \{e(1), e(2), \ldots, e(N)\}, where N denotes the sample number. Based on the Parzen window approach [3], the estimated PDF takes the form

\hat{p}_e(x) = \frac{1}{N} \sum_{i=1}^{N} \kappa(x - e(i))    (3)

where \kappa(\cdot) stands for a kernel function (not necessarily a Mercer kernel), satisfying \kappa(x) \ge 0 and \int \kappa(x)\,dx = 1. Unless mentioned otherwise, the kernel function is selected as a Gaussian kernel, given by

\kappa_\sigma(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{x^2}{2\sigma^2} \right)    (4)

where \sigma denotes the kernel bandwidth. With a Gaussian kernel, the QIP can be simply estimated as [1]

\int \hat{p}_e^2(x)\,dx = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \int \kappa_\sigma(x - e(i)) \kappa_\sigma(x - e(j))\,dx = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_{\sqrt{2}\sigma}(e(i) - e(j))    (5)

Therefore, in practical situations, the solution of (2) becomes

W_{MEE} = \arg\max_W \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_{\sqrt{2}\sigma}(e(i) - e(j))    (6)

Unfortunately, there is no closed form solution of (6). One can apply a gradient based iterative algorithm to search for the solution, starting from an initial point. Below we derive a fixed-point iterative algorithm, which is, in general, much faster than a gradient based method (although a gradient method can be viewed as a special case of the fixed-point method, it involves a step-size parameter). Let's take the following first order derivative:

\frac{\partial}{\partial W} \left[ \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \right]
= \frac{1}{2\sigma^2 N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \big( e(i) - e(j) \big) \big[ X(i) - X(j) \big]
= \frac{1}{2\sigma^2 N^2} \left\{ \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \big( d(i) - d(j) \big) \big[ X(i) - X(j) \big] - \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T W \right\}
= \frac{1}{2\sigma^2 N^2} \left\{ P_{dX} - R_{MEE} W \right\}    (7)

where

R_{MEE} = \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T
P_{dX} = \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \big( d(i) - d(j) \big) \big[ X(i) - X(j) \big]    (8)

Setting \frac{\partial}{\partial W} \frac{1}{N^2} \sum_i \sum_j \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) = 0, and assuming that the matrix R_{MEE} is invertible, we obtain the following solution [15]:

W = R_{MEE}^{-1} P_{dX}    (9)

The above solution is, in form, very similar to the well-known Wiener solution [19]. However, it is not a closed form solution, since both the matrix R_{MEE} and the vector P_{dX} depend on the weight vector W (note that e(i) depends on W). Therefore, the solution of (9) is actually a fixed-point equation, which can also be expressed as W = f(W), where

f(W) = R_{MEE}^{-1} P_{dX}    (10)

The solution (fixed point) of the equation W = f(W) can be found by the following iterative fixed-point algorithm:

W_{k+1} = f(W_k)    (11)

where W_k denotes the estimated weight vector at iteration k. This algorithm is called the fixed-point MEE algorithm [15]. An online fixed-point MEE algorithm was also derived in [15]. In the next section, we will prove a sufficient condition under which the algorithm (11) surely converges to a unique fixed point.

3. Convergence of the Fixed-Point MEE Algorithm

The convergence of a fixed-point algorithm can be proved by the well-known contraction mapping theorem (also known as the Banach fixed-point theorem) [11]. According to the contraction mapping theorem, the convergence of the fixed-point MEE algorithm (11) is guaranteed if \exists \beta > 0 and 0 < \alpha < 1 such that the initial weight vector satisfies \|W_0\|_p \le \beta, and \forall W \in \{W : \|W\|_p \le \beta\}, it holds that

\|f(W)\|_p \le \beta, \qquad \|\nabla_W f(W)\|_p \le \alpha    (12)

where \|\cdot\|_p denotes an l_p-norm of a vector or an induced norm of a matrix, defined by \|A\|_p = \max_{X \ne 0} \|AX\|_p / \|X\|_p, with p \ge 1, and \nabla_W f(W) denotes the m \times m Jacobian matrix of f(W) with respect to W, given by

\nabla_W f(W) = \left[ \frac{\partial}{\partial w_1} f(W), \; \frac{\partial}{\partial w_2} f(W), \; \ldots, \; \frac{\partial}{\partial w_m} f(W) \right]    (13)
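In code, one pass of the batch fixed-point update (8)–(11) can be sketched as follows (a minimal NumPy sketch; the function names and array layout are our own, not from the paper):

```python
import numpy as np

def gaussian_kernel(u, s):
    """Gaussian kernel with bandwidth s, Equation (4)."""
    return np.exp(-u**2 / (2.0 * s**2)) / (np.sqrt(2.0 * np.pi) * s)

def fixed_point_mee_step(W, X, d, sigma):
    """One fixed-point MEE update W <- R_MEE^{-1} P_dX, Equations (8)-(11).

    X is an (N, m) matrix whose rows are the input vectors X(i);
    d is the length-N vector of desired values d(i).
    """
    e = d - X @ W                            # e(i) = d(i) - W^T X(i)
    dX = X[:, None, :] - X[None, :, :]       # pairwise X(i) - X(j), shape (N, N, m)
    dd = d[:, None] - d[None, :]             # pairwise d(i) - d(j)
    de = e[:, None] - e[None, :]             # pairwise e(i) - e(j)
    K = gaussian_kernel(de, np.sqrt(2.0) * sigma)
    R = np.einsum('ij,ijk,ijl->kl', K, dX, dX)   # R_MEE of Equation (8)
    P = np.einsum('ij,ij,ijk->k', K, dd, dX)     # P_dX of Equation (8)
    return np.linalg.solve(R, P)                 # solve R W = P rather than inverting R
```

Repeatedly calling `W = fixed_point_mee_step(W, X, d, sigma)` realizes the iteration (11). For large N, the O(N^2 m^2) pairwise sums dominate the cost, which is the usual price of Parzen-based ITL criteria.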

where

\frac{\partial}{\partial w_k} f(W) = \frac{\partial}{\partial w_k} \left[ R_{MEE}^{-1} P_{dX} \right] = -R_{MEE}^{-1} \left[ \frac{\partial}{\partial w_k} R_{MEE} \right] R_{MEE}^{-1} P_{dX} + R_{MEE}^{-1} \left[ \frac{\partial}{\partial w_k} P_{dX} \right]

with

\frac{\partial}{\partial w_k} R_{MEE} = \frac{1}{2\sigma^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \big( e(i) - e(j) \big) \big( x_k(i) - x_k(j) \big) \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T

\frac{\partial}{\partial w_k} P_{dX} = \frac{1}{2\sigma^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \big( e(i) - e(j) \big) \big( x_k(i) - x_k(j) \big) \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \big( d(i) - d(j) \big) \big[ X(i) - X(j) \big]    (14)

To obtain a sufficient condition that guarantees the convergence of the fixed-point MEE algorithm (11), we prove two theorems below.

Theorem 1. If \beta > \xi, where

\xi = \frac{ \sqrt{m} \sum_{i=1}^{N} \sum_{j=1}^{N} |d(i) - d(j)| \, \|X(i) - X(j)\|_1 }{ \lambda_{\min}\left[ \sum_{i=1}^{N} \sum_{j=1}^{N} \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T \right] }

and \sigma \ge \sigma^*, where \sigma^* is the solution of the equation \varphi(\sigma) = \beta over (0, \infty), with

\varphi(\sigma) = \frac{ \sqrt{m} \sum_{i=1}^{N} \sum_{j=1}^{N} |d(i) - d(j)| \, \|X(i) - X(j)\|_1 }{ \lambda_{\min}\left[ \sum_{i=1}^{N} \sum_{j=1}^{N} \exp\left( -\frac{ \big( \beta \|X(i) - X(j)\|_1 + |d(i) - d(j)| \big)^2 }{4\sigma^2} \right) \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T \right] }, \quad \sigma \in (0, \infty)    (15)

then \|f(W)\|_1 \le \beta for all W \in \{W : \|W\|_1 \le \beta\}.

Proof. The induced matrix norm is compatible with the corresponding vector l_p-norm; hence

\|f(W)\|_1 = \|R_{MEE}^{-1} P_{dX}\|_1 \le \|R_{MEE}^{-1}\|_1 \, \|P_{dX}\|_1    (16)

where \|R_{MEE}^{-1}\|_1 is the 1-norm (also referred to as the column-sum norm) of the inverse matrix R_{MEE}^{-1}, which is simply the maximum absolute column sum of the matrix. According to matrix theory, the following inequality holds:

\|R_{MEE}^{-1}\|_1 \le \sqrt{m}\, \|R_{MEE}^{-1}\|_2 = \frac{\sqrt{m}}{\lambda_{\min}(R_{MEE})}    (17)

where \|R_{MEE}^{-1}\|_2 is the 2-norm (also referred to as the spectral norm) of R_{MEE}^{-1}, which equals \lambda_{\max}(R_{MEE}^{-1}), the maximum eigenvalue of the matrix. Further, we have

\lambda_{\min}(R_{MEE}) = \lambda_{\min}\left[ \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T \right]
\overset{(a)}{\ge} \lambda_{\min}\left[ \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_{\sqrt{2}\sigma}\big( \beta \|X(i) - X(j)\|_1 + |d(i) - d(j)| \big) \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T \right]    (18)

where (a) comes from

|e(i) - e(j)| = \left| \big( d(i) - d(j) \big) - W^T \big( X(i) - X(j) \big) \right| \le \|W\|_1 \|X(i) - X(j)\|_1 + |d(i) - d(j)| \le \beta \|X(i) - X(j)\|_1 + |d(i) - d(j)|    (19)

In addition, it holds that

\|P_{dX}\|_1 = \left\| \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \big( d(i) - d(j) \big) \big[ X(i) - X(j) \big] \right\|_1
\overset{(b)}{\le} \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \, |d(i) - d(j)| \, \|X(i) - X(j)\|_1
\overset{(c)}{\le} \frac{1}{2\sqrt{\pi}\,\sigma} \sum_{i=1}^{N} \sum_{j=1}^{N} |d(i) - d(j)| \, \|X(i) - X(j)\|_1    (20)

where (b) follows from the convexity of the vector l_1-norm, and (c) is because \kappa_{\sqrt{2}\sigma}(x) \le \frac{1}{2\sqrt{\pi}\,\sigma} for any x. Combining (16)–(18) and (20), and noting \kappa_{\sqrt{2}\sigma}(u) = \frac{1}{2\sqrt{\pi}\,\sigma} \exp\left( -\frac{u^2}{4\sigma^2} \right), we derive

\|f(W)\|_1 \le \frac{ \sqrt{m}\, \frac{1}{2\sqrt{\pi}\,\sigma} \sum_{i} \sum_{j} |d(i) - d(j)| \, \|X(i) - X(j)\|_1 }{ \lambda_{\min}\left[ \sum_{i} \sum_{j} \kappa_{\sqrt{2}\sigma}\big( \beta \|X(i) - X(j)\|_1 + |d(i) - d(j)| \big) \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T \right] } = \frac{ \sqrt{m} \sum_{i} \sum_{j} |d(i) - d(j)| \, \|X(i) - X(j)\|_1 }{ \lambda_{\min}\left[ \sum_{i} \sum_{j} \exp\left( -\frac{ \big( \beta \|X(i) - X(j)\|_1 + |d(i) - d(j)| \big)^2 }{4\sigma^2} \right) \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T \right] } = \varphi(\sigma)    (21)

Clearly, the function \varphi(\sigma) is a continuous and monotonically decreasing function of \sigma over (0, \infty), satisfying \lim_{\sigma \to 0+} \varphi(\sigma) = \infty and \lim_{\sigma \to \infty} \varphi(\sigma) = \xi. Therefore, if \beta > \xi, the equation \varphi(\sigma) = \beta will have a unique solution \sigma^* over (0, \infty), and if \sigma \ge \sigma^*, we have \varphi(\sigma) \le \beta, which completes the proof.

Theorem 2. If \beta > \xi and \sigma \ge \max\{\sigma^*, \sigma^\dagger\}, where \xi and \sigma^* are as in Theorem 1, and \sigma^\dagger is the solution of the equation \psi(\sigma) = \alpha (0 < \alpha < 1) over (0, \infty), with

\psi(\sigma) = \frac{ \sqrt{m}\,\gamma }{ 2\sigma^2 \, \lambda_{\min}\left[ \sum_{i=1}^{N} \sum_{j=1}^{N} \exp\left( -\frac{ \big( \beta \|X(i) - X(j)\|_1 + |d(i) - d(j)| \big)^2 }{4\sigma^2} \right) \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T \right] }, \quad \sigma \in (0, \infty)    (22)

in which

\gamma = \sum_{i=1}^{N} \sum_{j=1}^{N} \big( \beta \|X(i) - X(j)\|_1 + |d(i) - d(j)| \big) \|X(i) - X(j)\|_1 \left( \beta \left\| \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T \right\|_1 + |d(i) - d(j)| \, \|X(i) - X(j)\|_1 \right)    (23)

then it holds that \|f(W)\|_1 \le \beta and \|\nabla_W f(W)\|_1 \le \alpha for all W \in \{W : \|W\|_1 \le \beta\}.

Proof. By Theorem 1, we have \|f(W)\|_1 \le \beta. To prove \|\nabla_W f(W)\|_1 \le \alpha, it suffices to prove \left\| \frac{\partial}{\partial w_k} f(W) \right\|_1 \le \alpha for every k. By (14), we have

\left\| \frac{\partial}{\partial w_k} f(W) \right\|_1 \le \left\| R_{MEE}^{-1} \frac{1}{2\sigma^2} \sum_{i} \sum_{j} \big( e(i) - e(j) \big) \big( x_k(i) - x_k(j) \big) \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T f(W) \right\|_1 + \left\| R_{MEE}^{-1} \frac{1}{2\sigma^2} \sum_{i} \sum_{j} \big( e(i) - e(j) \big) \big( x_k(i) - x_k(j) \big) \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \big( d(i) - d(j) \big) \big[ X(i) - X(j) \big] \right\|_1    (24)

It is easy to derive

\left\| R_{MEE}^{-1} \frac{1}{2\sigma^2} \sum_{i} \sum_{j} \big( e(i) - e(j) \big) \big( x_k(i) - x_k(j) \big) \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T f(W) \right\|_1
\overset{(d)}{\le} \frac{\beta}{2\sigma^2} \|R_{MEE}^{-1}\|_1 \sum_{i} \sum_{j} |e(i) - e(j)| \, |x_k(i) - x_k(j)| \, \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \left\| \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T \right\|_1
\overset{(e)}{\le} \frac{\beta}{4\sqrt{\pi}\,\sigma^3} \|R_{MEE}^{-1}\|_1 \sum_{i} \sum_{j} \big( \beta \|X(i) - X(j)\|_1 + |d(i) - d(j)| \big) \|X(i) - X(j)\|_1 \left\| \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T \right\|_1    (25)

where (d) follows from the convexity of the vector l_1-norm and \|f(W)\|_1 \le \beta, and (e) is due to the facts that |e(i) - e(j)| \, |x_k(i) - x_k(j)| \le \big( \beta \|X(i) - X(j)\|_1 + |d(i) - d(j)| \big) \|X(i) - X(j)\|_1 and \kappa_{\sqrt{2}\sigma}(x) \le \frac{1}{2\sqrt{\pi}\,\sigma} for any x. In a similar way, one can derive

\left\| R_{MEE}^{-1} \frac{1}{2\sigma^2} \sum_{i} \sum_{j} \big( e(i) - e(j) \big) \big( x_k(i) - x_k(j) \big) \kappa_{\sqrt{2}\sigma}(e(i) - e(j)) \big( d(i) - d(j) \big) \big[ X(i) - X(j) \big] \right\|_1
\le \frac{1}{4\sqrt{\pi}\,\sigma^3} \|R_{MEE}^{-1}\|_1 \sum_{i} \sum_{j} \big( \beta \|X(i) - X(j)\|_1 + |d(i) - d(j)| \big) \|X(i) - X(j)\|_1 \, |d(i) - d(j)| \, \|X(i) - X(j)\|_1    (26)

Then, combining (24)–(26), (17) and (18), we have

\left\| \frac{\partial}{\partial w_k} f(W) \right\|_1 \le \frac{1}{4\sqrt{\pi}\,\sigma^3} \|R_{MEE}^{-1}\|_1 \, \gamma \le \frac{\sqrt{m}\,\gamma}{4\sqrt{\pi}\,\sigma^3 \, \lambda_{\min}(R_{MEE})} \le \frac{ \sqrt{m}\,\gamma }{ 2\sigma^2 \, \lambda_{\min}\left[ \sum_{i} \sum_{j} \exp\left( -\frac{ \big( \beta \|X(i) - X(j)\|_1 + |d(i) - d(j)| \big)^2 }{4\sigma^2} \right) \big[ X(i) - X(j) \big] \big[ X(i) - X(j) \big]^T \right] } = \psi(\sigma)    (27)

Obviously, \psi(\sigma) is also a continuous and monotonically decreasing function of \sigma over (0, \infty), satisfying \lim_{\sigma \to 0+} \psi(\sigma) = \infty and \lim_{\sigma \to \infty} \psi(\sigma) = 0. Therefore, given 0 < \alpha < 1, the equation \psi(\sigma) = \alpha has a unique solution \sigma^\dagger over (0, \infty), and if \sigma \ge \sigma^\dagger, we have \psi(\sigma) \le \alpha. This completes the proof.

According to Theorem 2 and the Banach fixed-point theorem [11], given an initial weight vector satisfying \|W_0\|_1 \le \beta, the fixed-point MEE algorithm (11) will surely converge to a unique fixed point in the range \{W : \|W\|_1 \le \beta\} provided that the kernel bandwidth \sigma is larger than a certain value. Moreover, the value of \alpha (0 < \alpha < 1) determines the guaranteed convergence speed. It is worth noting that the derived sufficient condition will be, certainly, a little loose, due to the zooming out in the proof process.

4. Illustrative Example

In the following, we give an illustrative example to verify the derived sufficient condition that guarantees the convergence of the fixed-point MEE algorithm. Let us consider a simple linear model:

d(i) = 2X(i) + v(i)    (28)

where X(i) is a scalar input and v(i) is an additive noise. Assume that X(i) is uniformly distributed over [-3, 3] and v(i) is zero-mean Gaussian with variance 0.01. There are 100 training samples \{X(i), d(i)\}_{i=1}^{100} generated from the system (28). Based on these data we calculate

\xi = \frac{ \sum_{i=1}^{100} \sum_{j=1}^{100} |d(i) - d(j)| \, |X(i) - X(j)| }{ \sum_{i=1}^{100} \sum_{j=1}^{100} |X(i) - X(j)|^2 } \approx 1.974    (29)

We choose \beta = 3 > \xi and \alpha = 0.9938 < 1. Then, by solving the equations \varphi(\sigma) = \beta and \psi(\sigma) = \alpha, we obtain \sigma^* \approx 1.38 and \sigma^\dagger \approx 2.68. Therefore, by Theorem 2, if \sigma \ge 2.68 the fixed-point MEE algorithm will converge to a unique solution in the range -3 \le W \le 3. Figures 1–3 illustrate the curves of the functions W, f(W) and df(W)/dW (where f(W) = R_{MEE}^{-1} P_{dX}) when \sigma = 3.0, 0.1 and 0.01, respectively. From the figures we observe: (i) when \sigma = 3.0 > 2.68, we have |f(W)| < 3 and |df(W)/dW| < \alpha for -3 \le W \le 3; (ii) when \sigma = 0.1 < 2.68, we still have |f(W)| < 3 and |df(W)/dW| < \alpha for -3 \le W \le 3. In this case, the algorithm will still converge to a unique solution in the range -3 \le W \le 3. This result confirms the fact that the derived sufficient condition is a little loose (i.e., far from being necessary). The main reason for this is that there is a lot of zooming out in the derivation process; (iii) however, when \sigma is too small, say \sigma = 0.01, the condition |df(W)/dW| < \alpha will not hold for some W \in \{W : -3 \le W \le 3\}. In this case, the algorithm may diverge.
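The setup of this example can be replicated with a short script (a sketch with our own variable names and random draw; the computed \xi and the located fixed point depend on the particular samples, so they will not exactly reproduce the values reported above):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100
X = rng.uniform(-3.0, 3.0, size=N)            # scalar inputs X(i), uniform over [-3, 3]
d = 2.0 * X + rng.normal(0.0, 0.1, size=N)    # d(i) = 2 X(i) + v(i), var(v) = 0.01

dX = X[:, None] - X[None, :]                  # pairwise X(i) - X(j)
dd = d[:, None] - d[None, :]                  # pairwise d(i) - d(j)

# xi from Equation (29); Theorem 1 requires beta > xi
xi = np.sum(np.abs(dd) * np.abs(dX)) / np.sum(dX**2)

def f(W, sigma):
    """The scalar fixed-point map f(W) = R_MEE^{-1} P_dX.

    The constant kernel factor 1/(2 sqrt(pi) sigma) cancels in the ratio,
    so only the Gaussian exponential is kept.
    """
    e = d - W * X
    de = e[:, None] - e[None, :]
    K = np.exp(-de**2 / (4.0 * sigma**2))     # kernel with bandwidth sqrt(2)*sigma
    return np.sum(K * dd * dX) / np.sum(K * dX**2)

# fixed-point iteration (11) with sigma = 3.0, above the theoretical threshold
W = 0.1
for k in range(1, 101):
    W_new = f(W, 3.0)
    if abs(W_new - W) / abs(W) < 1e-6:
        break
    W = W_new

print(f"xi = {xi:.3f}, fixed point W = {W_new:.4f} after {k} iterations")
```

With a bandwidth this large, the kernel weights are nearly uniform and the map is close to the least-squares estimator, so the iteration settles near the true weight after only a few updates.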

Figure 1. Plot of the functions W, f(W) and df(W)/dW when \sigma = 3.0.

Figure 2. Plot of the functions W, f(W) and df(W)/dW when \sigma = 0.1.

Figure 3. Plot of the functions W, f(W) and df(W)/dW when \sigma = 0.01.

Table 1 shows the number of iterations for convergence with different kernel bandwidths (\sigma = 3.0, 1.0, 0.1, 0.05). The initial weight vector is set at W_0 = 0.1, and the stop condition for the convergence is

\frac{\|W_k - W_{k-1}\|}{\|W_{k-1}\|} < 10^{-6}    (30)
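The convergence-speed experiment can be sketched as follows (our own code and random seed; the iteration counts depend on the particular data draw, so they will not exactly match Table 1):

```python
import numpy as np

rng = np.random.default_rng(7)
N = 100
X = rng.uniform(-3.0, 3.0, size=N)
d = 2.0 * X + rng.normal(0.0, 0.1, size=N)
dX = X[:, None] - X[None, :]
dd = d[:, None] - d[None, :]

def iterations_to_converge(sigma, W0=0.1, tol=1e-6, max_iter=500):
    """Run the fixed-point iteration until the relative change (30) drops below tol."""
    W = W0
    for k in range(1, max_iter + 1):
        e = d - W * X
        de = e[:, None] - e[None, :]
        K = np.exp(-de**2 / (4.0 * sigma**2))   # constant kernel factor cancels in the ratio
        W_new = np.sum(K * dd * dX) / np.sum(K * dX**2)
        if abs(W_new - W) / abs(W) < tol:
            return k, W_new
        W = W_new
    return max_iter, W                           # did not meet the tolerance

for sigma in (3.0, 1.0, 0.1, 0.05):
    k, W = iterations_to_converge(sigma)
    print(f"sigma = {sigma:4.2f}: {k:3d} iterations, W = {W:.4f}")
```

The qualitative trend mirrors the discussion above: larger bandwidths converge in a handful of iterations, while small bandwidths slow the iteration down or prevent it from settling at all.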

As one can see, when \sigma = 3.0 \ge \max\{\sigma^*, \sigma^\dagger\}, the fixed-point MEE algorithm surely converges to a solution within a few iterations. When \sigma becomes smaller, the algorithm may still converge, but the convergence speed becomes much lower. Note that when \sigma is too small (e.g., \sigma = 0.01), the algorithm will diverge (the corresponding results are not shown in Table 1).

Table 1. Number of iterations for convergence with different kernel bandwidths.

\sigma     | 3.0 | 1.0 | 0.1 | 0.05
Iterations |  3  |  4  |  6  |  43

5. Conclusions

The MEE criterion has received increasing attention in signal processing and machine learning due to its desirable performance in adaptive system training, especially with non-Gaussian data. Many iterative optimization methods have been developed to minimize the error entropy for practical use. But the fixed-point MEE algorithms have been seldom studied and, in particular, too little attention has been paid to the convergence issue of the fixed-point MEE algorithm. This paper presented a theoretical study of this problem, and proved a sufficient condition to guarantee the convergence of a fixed-point MEE algorithm. The results of this study may provide a possible range for choosing a kernel bandwidth for MEE learning. However, the derived sufficient condition may give a much larger kernel bandwidth than a desired one due to the zooming out in the formula derivation process. In future studies, we will try to derive a tighter sufficient condition that ensures the convergence of the fixed-point MEE algorithm.

Acknowledgments

This work was supported by the 973 Program (No. 2015CB351703) and the National NSF of China (No. 61372152).

Author Contributions

Yu Zhang and Badong Chen proved the main theorems in this paper, Xi Liu presented the illustrative example, Zejian Yuan and Jose C. Principe polished the language and were in charge of technical checking. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Principe, J.C. Information Theoretic Learning: Renyi's Entropy and Kernel Perspectives; Springer: New York, NY, USA, 2010.
2. Chen, B.; Zhu, Y.; Hu, J.C.; Principe, J.C. System Parameter Identification: Information Criteria and Algorithms; Elsevier: Amsterdam, The Netherlands, 2013.

3. Silverman, B.W. Density Estimation for Statistics and Data Analysis; Chapman & Hall: New York, NY, USA, 1986.
4. Erdogmus, D.; Principe, J.C. An error-entropy minimization algorithm for supervised training of nonlinear adaptive systems. IEEE Trans. Signal Process. 2002, 50, 1780–1786.
5. Erdogmus, D.; Principe, J.C. Generalized information potential criterion for adaptive system training. IEEE Trans. Neural Netw. 2002, 13, 1035–1044.
6. Erdogmus, D.; Principe, J.C. Convergence properties and data efficiency of the minimum error entropy criterion in ADALINE training. IEEE Trans. Signal Process. 2003, 51, 1966–1978.
7. Chen, B.; Zhu, Y.; Hu, J. Mean-square convergence analysis of ADALINE training with minimum error entropy criterion. IEEE Trans. Neural Netw. 2010, 21, 1168–1179.
8. Chen, B.; Principe, J.C. Some further results on the minimum error entropy estimation. Entropy 2012, 14, 966–977.
9. Chen, B.; Principe, J.C. On the smoothed minimum error entropy criterion. Entropy 2012, 14, 2311–2323.
10. Marques de Sá, J.P.; Silva, L.M.A.; Santos, J.M.F.; Alexandre, L.A. Minimum Error Entropy Classification; Springer: London, UK, 2013.
11. Agarwal, R.P.; Meehan, M.; O'Regan, D. Fixed Point Theory and Applications; Cambridge University Press: Cambridge, UK, 2001.
12. Cichocki, A.; Amari, S. Adaptive Blind Signal and Image Processing: Learning Algorithms and Applications; Wiley: New York, NY, USA, 2002.
13. Regalia, P.A.; Kofidis, E. Monotonic convergence of fixed-point algorithms for ICA. IEEE Trans. Neural Netw. 2003, 14, 943–949.
14. Fiori, S. Fast fixed-point neural blind-deconvolution algorithm. IEEE Trans. Neural Netw. 2004, 15, 455–459.
15. Han, S.; Principe, J.C. A fixed-point minimum error entropy algorithm. In Proceedings of the 16th IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing, Arlington, VA, USA, 6–8 September 2006; pp. 167–172.
16. Chen, J.; Richard, C.; Bermudez, J.C.M.; Honeine, P. Non-negative least-mean-square algorithm. IEEE Trans. Signal Process. 2011, 59, 5225–5235.
17. Chen, J.; Richard, C.; Bermudez, J.C.M.; Honeine, P. Variants of non-negative least-mean-square algorithm and convergence analysis. IEEE Trans. Signal Process. 2014, 62, 3990–4005.
18. Chen, B.; Wang, J.; Zhao, H.; Zheng, N.; Principe, J.C. Convergence of a fixed-point algorithm under maximum correntropy criterion. IEEE Signal Process. Lett. 2015, 22, 1723–1727.
19. Kailath, T.; Sayed, A.H.; Hassibi, B. Linear Estimation; Prentice Hall: Upper Saddle River, NJ, USA, 2000.

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).