Lyapunov Stability Analysis of the Quantization Error for DCS Neural Networks
Sampath Yerramalla, Bojan Cukic
Lane Department of Computer Science and Electrical Engineering
West Virginia University, Morgantown, WV

Edgar Fuller
Department of Mathematics
West Virginia University, Morgantown, WV

Abstract - In this paper we show that the quantization error for Dynamic Cell Structures (DCS) Neural Networks (NN), as defined by Bruske and Sommer, provides a measure of the Lyapunov stability of the weight centers of the neural net. We also show, however, that this error is insufficient in itself to verify that DCS neural networks provide a stable topological representation of a given fixed input feature manifold. While it is true that DCS generates a topology preserving feature map, it is unclear when and under what circumstances DCS will have achieved an accurate representation. This is especially important in safety-critical systems, where it is necessary to understand when the topological representation is complete and accurate. The stability analysis here shows that there exists a Lyapunov function for the weight adaptation of the DCS NN system applied to a fixed feature manifold. The Lyapunov function works in parallel during DCS learning and provides a measure of the effective placement of neural units during the NN's approximation. It does not, however, guarantee the formation of an accurate representation of the feature manifold. Simulation studies on a selected CMU benchmark using the constructed Lyapunov function indicate the existence of a Globally Asymptotically Stable (GAS) state for the placement of neural units, but an example is given where the topology of the constructed network fails to mirror that of the input manifold even though the quantization error continues to decrease monotonically.

I. INTRODUCTION

The learning process in most common neural nets evolves sequentially over time.
Because this learning process is also combinatorial and nonlinear in nature, it is a challenging task to evaluate the dynamic properties that govern the performance of a neural net over time. Stability is an important dynamic property, and the question then is: is there an effective tool that can analyze the stability of the dynamic learning process inside the NN? Even after decades of extensive research on neural nets and their widespread use in many practical applications, the answer to this question remains uncertain and contradictory. The objective of the mathematical theory of nonlinear stability analysis is often misunderstood as the process of finding a solution for the differential equation(s) that govern the dynamics, either analytically or numerically (through simulation studies). Theoretically, there is no guarantee that a solution to a nonlinear differential equation even exists at all times, let alone the complicated task of solving it. In practice, however, there exists a variety of techniques and tools that analyze the existence of stable states in nonlinear systems without the need to solve any differential equation [1], [5]. Almost all stability definitions for linear systems lead to a similar stability criterion (root locus, eigenvalues, etc.), so the choice of a precise stability definition for linear systems can be relaxed. Eigenvalue methods, however, are not applicable to nonlinear systems, and so one needs to be careful when dealing with systems that have nonlinear properties. It is often seen that systems that are stable under one definition of stability may become unstable under other definitions [5]. This difficulty in imposing strong stability restrictions on nonlinear systems was recognized over a century ago by the Russian mathematician A. M. Lyapunov. Details on Lyapunov's stability analysis technique for nonlinear discrete systems can be found in [6], [4].
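The contrast drawn above can be made concrete. For a discrete linear system the single eigenvalue test mentioned in the text settles stability; the short NumPy check below illustrates this (the matrix is an arbitrary illustrative example, not from the paper), which is exactly the kind of test that has no counterpart for nonlinear systems and motivates Lyapunov's direct method.

```python
import numpy as np

# For a discrete LINEAR system x(t+1) = A x(t), nearly all stability
# definitions collapse to one eigenvalue criterion: the origin is
# asymptotically stable iff every eigenvalue of A lies strictly inside
# the unit circle. The matrix A here is an arbitrary example.
A = np.array([[0.5, 0.2],
              [0.1, 0.3]])
spectral_radius = max(abs(np.linalg.eigvals(A)))
assert spectral_radius < 1.0          # criterion met, so x(t) -> 0

# Iterating the system confirms the eigenvalue prediction.
x = np.array([10.0, -7.0])
for _ in range(200):
    x = A @ x
assert np.linalg.norm(x) < 1e-6       # trajectory has decayed to the origin
```

No such one-line criterion exists once the update rule is nonlinear, which is why the analysis that follows constructs a Lyapunov function instead.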
The fact that Lyapunov's direct method of nonlinear stability analysis can be readily applied to validate the existence (or non-existence) of stable states in nonlinear systems motivated us to use a Lyapunov function in our stability analysis as a means to answer the question posed earlier. Dynamic Cell Structures (DCS) NN represents a family of Topology Representing Networks (TRN), used for both supervised and unsupervised learning. Applications of DCS include vector quantization, clustering, and dimensionality reduction [17], [16]. In this paper we analyze the stability properties of neural unit positioning by the DCS NN using Lyapunov theory, and we present a qualified answer to the question: is the learning process of the DCS NN stable? Throughout this stability analysis we consider the DCS NN as a sequentially evolving discrete logic system, closely related to the family of Logical Discrete Event Systems; for a detailed description of a Logical Discrete Event System, refer to [2]. We also assume that the feature manifold being mapped by DCS is fixed or stationary, meaning the manifold does not evolve over time. The paper is organized as follows. Section II provides a description of the competitive Hebb rule and the Kohonen-like rule that govern the learning process of the DCS NN. This description of the building blocks is later used in Section III
to present a state-space representation of the dynamics of the network. Also in Section III, the construction of the Lyapunov function is described in detail, along with proofs of the existence of a Globally Asymptotically Stable learning state in the DCS NN. Simulation results are presented in Section IV using a selected CMU benchmark of artificially generated data. Finally, conclusions are drawn in Section V based on the proofs and simulation results.

II. DCS NN BUILDING BLOCKS

In 1994, Jorg Bruske and Gerald Sommer of the University of Kiel introduced DCS as a family of topology representing, self-organizing NNs [17], [16]. This Topology Preserving Feature Map (TPFM) generation scheme was motivated by the Growing Neural Gas algorithm developed by Fritzke and the earlier work on TRNs by Martinetz [13]. DCS uses Kohonen-like adaptation to shift the weighted centers of the local neighborhood structure closer to the feature manifold. This, applied in conjunction with the Competitive Hebb Rule (CHR) that updates the lateral connections of the neighbors, produces a network representation that preserves the features of the input manifold. We believe that these two essential building blocks of the DCS algorithm play a key role in the generation of a TPFM. In an attempt to study these rules in depth, a formal overview is given in the following section.

A. Competitive Hebb Rule (CHR)

The DCS NN rests upon a Radial Basis Function (RBF) network and an additional layer of lateral connections between the neural units [16]. These lateral connection strengths are symmetric and bounded, C_ij = C_ji ∈ [0, 1]. The goal of CHR is to update the lateral connections by mapping neighborhoods of the input manifold to neighborhoods of the network, thereby avoiding any restrictions on the topology of the network [17].
For each input element of the feature manifold, CHR operates by setting the connection strength between the two neural units that are closer to the input than any other neuron pair to the highest possible connection strength of 1. These two neural units are referred to as the Best Matching Unit (bmu) and the Second Best Unit (sbu). CHR then proceeds to decay the strength of all other existing lateral connections emanating from the bmu using a forgetting constant α. If any of these existing connections drops below a predefined threshold θ, it is set to zero. The set of all neural units that are connected to the bmu is defined as the neighborhood set of the bmu, represented by nbr. All other connections of the network remain unaltered. In this way CHR induces a Delaunay triangulation in the network, preserving the neighborhood structure of the feature manifold.

    c_ij(t+1) = 1            if (i = bmu) and (j = sbu)
    c_ij(t+1) = 0            if (i = bmu) and (j ∈ nbr \ sbu) and (c_ij < θ)
    c_ij(t+1) = α c_ij(t)    if (i = bmu) and (j ∈ nbr \ sbu) and (c_ij ≥ θ)
    c_ij(t+1) = c_ij(t)      if i, j ≠ bmu                                      (1)

It was proved in [13] that algorithms utilizing CHR to update lateral connections between neural units generate a TPFM.

B. Kohonen-Like Rule

Unlike in a feedforward neural net, the weight center w_i associated with a neural unit i of the DCS network represents the location of the neural unit in the output space. It is crucial that these weighted centers be updated in a manner that preserves the geometry of the input manifold. This can be achieved by adjusting the weighted center of the bmu and its surrounding neighborhood structure nbr closer to the input element. Thus, for each element of the feature manifold, DCS adapts the corresponding bmu and its neighborhood set nbr in a Kohonen-like manner, see [17].
Over any training cycle, let Δw_i = w_i(t+1) - w_i(t) represent the adjustment of the weight center of neural unit i. The Kohonen-like rule followed in DCS can then be represented as

    Δw_i = ε_bmu (m - w_i(t))    if i = bmu
    Δw_i = ε_nbr (m - w_i(t))    if i ∈ nbr
    Δw_i = 0                     if (i ≠ bmu) and (i ∉ nbr)                     (2)

where ε_bmu, ε_nbr ∈ (0, 1) are predefined constants, known as the learning rates, that define the momentum of the update process. For every input element, applying CHR before any other adjustment ensures that the sbu is a member of the nbr set for all further adjustments within the inner loop of the DCS algorithm.

III. STATE-SPACE ANALYSIS

The first step of any stability analysis is the identification of the states of the dynamical system. State-space representation is the most common method of representing the identified states of a dynamic system, and this representation technique is very effective during the construction of a Lyapunov function for nonlinear systems [6].

A. States of DCS Logic

The following equations represent the states of DCS logic during the CHR and Kohonen-like adjustments as discrete differences over a time step.

    1) Center State:      ΔW = f_W(x_W, bmu, nbr, ε_bmu, ε_nbr)                 (3)
    2) Connection State:  ΔC = f_C(x_C, bmu, nbr, α, θ)                         (4)

Here x_W and x_C are the states of DCS learning that represent the spatial distribution of neural units and the lateral connection structure between the neural units, respectively. If I and O are the input and output spaces of DCS that contain the feature manifold M and the network representation A, then f_W : I → O and f_C : I → O are continuous functions that provide the required adjustments to x_W and x_C, respectively.
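The two building-block rules (1) and (2) can be sketched in code. The following is a minimal NumPy illustration of one inner-loop step, not the authors' implementation; the parameter values and the array layout (one row per neural unit) are assumptions.

```python
import numpy as np

def chr_update(C, W, m, alpha=0.9, theta=0.01):
    """Competitive Hebb Rule (1) for one input element m.
    C: (N, N) symmetric lateral connections in [0, 1]; W: (N, d) weight
    centers. alpha (forgetting constant) and theta (threshold) are
    illustrative values. Returns the index of the bmu."""
    dists = np.linalg.norm(W - m, axis=1)
    bmu, sbu = np.argsort(dists)[:2]         # two units closest to m
    C[bmu, sbu] = C[sbu, bmu] = 1.0          # strongest possible connection
    for j in range(C.shape[0]):              # decay or prune other bmu links
        if j in (bmu, sbu) or C[bmu, j] == 0.0:
            continue
        new = 0.0 if C[bmu, j] < theta else alpha * C[bmu, j]
        C[bmu, j] = C[j, bmu] = new
    return bmu

def kohonen_update(W, C, m, bmu, eps_bmu=0.1, eps_nbr=0.01):
    """Kohonen-like rule (2): move the bmu and its lateral neighbors
    toward m; all other weight centers stay unchanged."""
    W[bmu] += eps_bmu * (m - W[bmu])
    for j in np.flatnonzero(C[bmu] > 0.0):   # nbr = units connected to bmu
        if j != bmu:
            W[j] += eps_nbr * (m - W[j])
```

Calling chr_update before kohonen_update for each input element mirrors the ordering noted above, which guarantees that the sbu belongs to nbr when the weights are adjusted.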
B. DCS Output/Recall

When the DCS NN is presented with a test input u ∈ R^I, the network is searched for the appropriate bmu and for the closest of all neural units that is connected to the bmu. Note that this neural unit need not always be the sbu. DCS then applies linear interpolation between these two neural units to generate the output vector y(u) ∈ R^O. The output y(u) generated at time t by the DCS network depends not only on the states x_C(t) and x_W(t), but also on the location of the bmu and its connected neighborhood structure. Thus, the output of DCS for a given input stimulus u can be represented as

    y(u) = G(x_C(t), x_W(t), bmu, nbr, u)                                       (5)

If the bmu is not connected to any other neural unit of the network, then the generated output y(u) is based solely on the bmu's location. Obviously, the output produced by a network with no neurons is zero.

C. DCS State-Space Representation

We consider the DCS neural net's approximation of any given input manifold M ⊂ R^I. In this neural net, a set of neural units i, for i = 1...N, with weighted centers w_i ∈ W ⊂ R^O and lateral connections c_ij ∈ [0, 1], represents the map generated by DCS from the input manifold M onto the output space O ⊂ R^O. It was proved by Martinetz and Schulten in [13] that this is a topology preserving operation in the sense that neighborhoods of M are preserved in O. Let G(W, C) be the collection of N neural units of DCS having W as their weighted center matrix and C_{N×N} as their connection strength matrix, representing the DCS map from the given input manifold M onto the output space O. Then G(W, C) can be considered an undirected graph of N multivalent vertices.

Definition 1 (DCS Mapping): The DCS mapping G(W, C) of a given manifold M ⊂ I is the N-th order neural unit representation of M in the output space O, obtained by assigning N neural units in t_n steps of the DCS algorithm.
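The recall step of eq. (5) above can be sketched as follows. This is our illustration, not the authors' code: the per-unit output values Y and the distance-based interpolation weight are assumptions, since the text only fixes that the output interpolates linearly between the bmu and the closest unit connected to it.

```python
import numpy as np

def dcs_recall(u, W, C, Y):
    """Output/recall sketch for a test input u: locate the bmu, then the
    closest unit connected to it, and linearly interpolate their outputs.
    Y[i] is a hypothetical stored output value for neural unit i."""
    dists = np.linalg.norm(W - u, axis=1)
    bmu = int(np.argmin(dists))
    connected = np.flatnonzero(C[bmu] > 0.0)
    if connected.size == 0:                  # isolated bmu: use it alone
        return Y[bmu]
    second = int(connected[np.argmin(dists[connected])])
    d1, d2 = dists[bmu], dists[second]
    lam = d1 / (d1 + d2) if d1 + d2 > 0 else 0.0
    return (1.0 - lam) * Y[bmu] + lam * Y[second]
```

Note that `second` is chosen among the units connected to the bmu, matching the remark above that it need not be the sbu.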
Considering DCS as a discrete-time dynamical system, the following state-space representation can be formulated:

    ΔW = f_W(x_W, bmu, nbr, ε_bmu, ε_nbr)
    ΔC = f_C(x_C, bmu, nbr, α, θ)
    y = G(W, C)                                                                 (6)

IV. LYAPUNOV'S STABILITY THEORY

We want to determine whether the DCS algorithm will reliably learn a fixed input manifold M on successive applications. The question is then: how much can the evolving DCS approximation of M, denoted by y_f, deviate from a previously learned version of M, denoted by y_i? Since the DCS approximation process is a time-varying construction whose state changes according to the difference relations in (6), it is natural to use the Lyapunov formulation of stability for a dynamical system to answer this question. The relevant result of Lyapunov [6] is the following.

Theorem 1 (Lyapunov Stability): Given a system of differential equations

    dX/dt = f(x_1, ..., x_n, t)                                                 (7)

if it is possible to find a positive-definite function V of X for which V(0) = 0 and dV/dt ≤ 0, then the trivial solution of (7) is Lyapunov stable, or locally stable.

The Lyapunov condition in the discrete case can be formulated as follows.

Theorem 2 (Discrete Lyapunov Stability): The discrete system of difference equations

    ΔS = f(s_1, ..., s_n, t)                                                    (8)

is stable in the sense of Lyapunov if there is a positive-definite function V for which V(0) = 0 and ΔV/Δt ≤ 0 for some time step Δt > 0. If, besides the system being stable, ΔV happens to be negative-definite, ΔV/Δt < 0, then the system is said to be asymptotically stable. Moreover, if V → 0 as t → ∞, the system is Globally Asymptotically Stable (GAS).

A. Lyapunov Function Construction

To set up a Lyapunov function for DCS, we measure how accurately a current state of the algorithm models the input manifold, in terms of the amount of geometry and topology preserved in the output manifold. To measure the accuracy of the DCS approximation, Bruske and Sommer introduced the quantization error defined in [17].
This measures the effectiveness of the placement of neural units by computing the amount by which a neural unit deviates from an element m ∈ M of the input manifold for which it is the Best Matching Unit, averaged over the total number of neurons N during that state of the DCS approximation:

    Q(t) = (1/N) Σ_{m ∈ M} ||m - w_bmu(m,t)||                                   (9)

In the following section we show that Q serves as a Lyapunov function for the neural unit placement of DCS.

B. Neural Unit Placement by DCS is Lyapunov Stable

With the above setup, it follows that

Theorem 3: The neural units w_i assigned by the DCS approximation during learning of a fixed input manifold M ⊂ I are globally asymptotically stable in the sense of Lyapunov.

Proof: First we need to show that the Lyapunov function (9), constructed for the time-varying discrete system of DCS, is valid. It is obvious that Q ≥ 0, since ||m - w_bmu(m,t)|| ≥ 0. Also Q(0) = 0, since there are no neurons in the DCS network in the zero state x_C = x_W = 0. To show that ΔQ/Δt < 0, first note that since the time step Δt > 0, the numerator ΔQ will determine the sign in question. Computing
using (9), we see that over any time interval Δt = t_{i+1} - t_i > 0, adjustments of the weighted centers w_i are determined by the Kohonen-like rule. Over any learning cycle, DCS adjusts the bmu and its neighbors for every m ∈ M according to the rule in (2). Let ||m - w_bmu(m,t)|| be denoted by d(m, t). Over a single training cycle, in which the network grows from N to N + 1 neural units, we then have

    ΔQ = (1/(N+1)) Σ_m d(m, t+1) - (1/N) Σ_m d(m, t)
       = [N Σ_m d(m, t+1) - (N+1) Σ_m d(m, t)] / (N(N+1))
       = Σ_m [N (d(m, t+1) - d(m, t)) - d(m, t)] / (N(N+1))                     (10)

For any m ∈ M, we need to show that either the corresponding portion of the numerator of (10) is negative, or that some other portion compensates when it is positive. There are three ways in which a neural unit's weighted center may be adjusted. It may be updated 1) as the bmu for m, 2) as the bmu of some other m' ∈ M, or 3) as one of the neighbors of the bmu of some other m' ∈ M. In the first case, to show that the contribution to (10) is negative it is sufficient to show that

    d(m, t+1) - d(m, t) < 0                                                     (11)

Computing using (2) gives

    d(m, t+1) = ||m - w_bmu(m,t+1)||
              = ||m - (w_bmu(m,t) + ε_bmu (m - w_bmu(m,t)))||
              = ||(m - w_bmu(m,t)) - ε_bmu (m - w_bmu(m,t))||
              = (1 - ε_bmu) ||m - w_bmu(m,t)||
              = (1 - ε_bmu) d(m, t)

Since ε_bmu ∈ (0, 1), d(m, t+1) < d(m, t), which implies (11). In the second case, the best matching unit for m, bmu(m), may be updated as the best matching unit for some other m' ∈ M. This update will be either preceded or followed by an update of the neuron as the bmu for m. If preceded, and the neuron is closer to m' than to m, the resulting decrease in d(m', t) will be larger than the increase in d(m, t). If preceded, and the neuron is farther from m' than from m, the increase in d(m, t) due to this adjustment will be smaller than the decrease from the neuron's adjustment as bmu. If followed by the first-case adjustment, the previous two relationships are reversed, but the net effect on ΔQ is still a decrease. All of these follow from the triangle inequality for the Euclidean metric. The third case also yields ΔQ < 0, since in general ε_bmu >> ε_nbr.
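Both the Lyapunov function (9) and the first-case contraction d(m, t+1) = (1 - ε_bmu) d(m, t) can be checked numerically. The NumPy sketch below is our illustration, not the authors' code; it assumes the manifold is given as a finite array of sample points.

```python
import numpy as np

def quantization_error(M, W):
    """Eq. (9): distance from each m to its best matching unit, summed
    over the manifold samples and divided by the number of units N."""
    dists = np.linalg.norm(M[:, None, :] - W[None, :, :], axis=2)  # (|M|, N)
    return dists.min(axis=1).sum() / W.shape[0]

# First case of the proof: adjusting the bmu by rule (2) scales its
# distance to m by exactly (1 - eps_bmu), so eq. (11) holds.
m = np.array([1.0, 1.0])
w = np.array([0.0, 0.0])                     # bmu weight center
eps_bmu = 0.3
w_next = w + eps_bmu * (m - w)
assert np.isclose(np.linalg.norm(m - w_next),
                  (1.0 - eps_bmu) * np.linalg.norm(m - w))

# Repeated bmu updates therefore drive the quantization error downward.
M = np.array([[1.0, 1.0]])
W = np.array([[0.0, 0.0]])
q_prev = quantization_error(M, W)
for _ in range(50):
    W[0] += eps_bmu * (M[0] - W[0])
    q = quantization_error(M, W)
    assert q < q_prev                        # Q decreases monotonically
    q_prev = q
```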
When the two values are comparable, the result follows from the same argument as in the second case above. As a result, the function Q is a Lyapunov function for the weight system of DCS; furthermore, since ΔQ < 0 and Q → 0 as t → ∞, the stability is global and asymptotic. This result reflects the fact that DCS, as a Kohonen-based algorithm, is topology preserving, as shown by Martinetz and Schulten [13]. Unfortunately, this stability of the neural unit placement does not guarantee an accurate topological representation, as is demonstrated in the next section.

V. SIMULATION RESULTS

The fixed feature manifold M ⊂ I ⊂ R^2 is the two-spirals data set provided by CMU's archive of artificial neural net benchmark data [21], [20]. To aid understanding of this stability analysis, the parameters (density, radius) of the training data are set in a manner similar to that described by Bruske and Sommer in [17]: the density of the two-spirals training data is set to 5 and the spiral radius to 6.5. The program two-spirals.c of [21] then generates 962 input elements in the x-y plane. Fig. 1(a) is a snapshot of a 51st-order DCS NN representation, A^(51) ⊂ O, of the fixed two-spirals feature manifold M ⊂ I.

Fig. 1. DCS mapping and the corresponding quantization error (Lyapunov function)

The relatively few neural units of the network fail to produce an accurate representation of M in A. This is also indicated by a correspondingly large value of the quantization error in Fig. 1(b). As the number of neural units is incremented to 251, the network gradually evolves into a topology preserving feature map, as in Fig. 2(a), meaning that the neighborhoods of M are preserved in the 251st-order DCS neural network representation, A^(251). The correspondingly small quantization error of the network is shown in Fig. 2(b).

Fig. 2. DCS mapping and the corresponding quantization error (Lyapunov function)

However, if the simulation is continued for another
iteration, as shown in Fig. 3(a), a fault, or defective connection, develops in the network. To provide a better understanding of the concept of a defective network representation, let us give a precise definition of a fault in DCS learning.

Definition 2 (Fault): We say that a fault has occurred in DCS learning if two neural units {i, j} of the network A ⊂ O that are mapped to relatively distant input elements {m_i, m_j} ⊂ M ⊂ I develop a positive connection strength between them, c_ij > 0.

The faults, however, are not detected by the quantization error of Fig. 3(b), which continues to decrease monotonically.

Fig. 3. Faults in the DCS network near 25 epochs

VI. CONCLUSION

In this paper we proved, through Lyapunov stability analysis of the quantization error, that the DCS NN mapping of a fixed input manifold achieves a globally asymptotically stable learning state. The credit for this stable behavior goes to the design of the DCS approximation algorithm, which uses Kohonen-like adaptation to adjust the positions of the weighted centers in conjunction with a Competitive Hebb Rule that updates the lateral connection structure to form a topology preserving feature map. The simulation study showed that the formation of an accurate representation of M ⊂ I in A ⊂ O depends greatly on the precise moment at which the learning process is stopped. The simulation study also showed that the quantization error by itself does not indicate the formation of an accurate topology preserving feature map, and thus cannot be used directly as a criterion for stopping the learning. Finally, we propose the need for an efficient measure of the amount of topology preservation in the network, in terms of a topology error operating in conjunction with the quantization error, to provide a performance measure for the accuracy of the DCS NN's approximation.

REFERENCES

[1] N. Rouche, P. Habets, and M. Laloy. Stability Theory by Liapunov's Direct Method, Springer-Verlag, New York, 1977.
[2] Kevin M. Passino, A. N. Michel, and Panos J. Antsaklis. Lyapunov Stability of a Class of Discrete Event Systems, IEEE Transactions on Automatic Control, Vol. 39, No. 2, February 1994.
[3] Xinghuo Yu, M. Onder Efe, and Okyay Kaynak. A Backpropagation Learning Framework for Feedforward Neural Networks, IEEE, 2001.
[4] W. L. Brogan. Modern Control Theory, 2nd Edition, Prentice Hall Inc.
[5] Bernard Friedland. Advanced Control System Design, Prentice Hall Inc.
[6] V. I. Zubov. Methods of A. M. Lyapunov and Their Applications, U.S. Atomic Energy Commission, 1957.
[7] Bernd Fritzke. Growing Cell Structures - A Self-organizing Network for Unsupervised and Supervised Learning, Neural Networks, Vol. 7, No. 9, 1994.
[8] Bernd Fritzke. Growing Self-organizing Networks - Why?, European Symposium on Artificial Neural Networks, Pages 61-72, 1996.
[9] Bernd Fritzke. A Growing Neural Gas Network Learns Topologies, Advances in Neural Information Processing Systems, Vol. 7, Pages 625-632, MIT Press, 1995.
[10] Bernd Fritzke. Unsupervised Clustering With Growing Cell Structures, Proc. of the IJCNN, 1991.
[11] Bernd Fritzke. Growing Grid - A Self-Organizing Network With Constant Neighborhood Range and Adaptation Strength, Neural Processing Letters, Vol. 2, No. 5, Pages 9-13, 1995.
[12] Teuvo Kohonen. The Self-Organizing Map, Proc. of the IEEE, Vol. 78, No. 9, Pages 1464-1480, September 1990.
[12] Jurgen Rahmel. On The Role of Topology for Neural Network Interpretation, Proc. of the European Conference on Artificial Intelligence.
[13] Thomas Martinetz and Klaus Schulten. Topology Representing Networks, Neural Networks, Vol. 7, No. 3, Pages 507-522, 1994.
[14] Lucius Chudy and Igor Farkas. Prediction of Chaotic Time-Series Using Dynamic Cell Structures and Local Linear Models, Institute of Measurement Science, Slovak Academy of Sciences, Slovakia.
[15] Lucius Chudy and Igor Farkas. Modified Dynamic Cell Structures as a Thinning Algorithm, Institute of Measurement Science, Slovak Academy of Sciences, Slovakia.
[16] Ingo Ahrns, Jorg Bruske, and Gerald Sommer. On-line Learning with Dynamic Cell Structures, Proc. of ICANN, Vol. 2, 1995.
[17] Jorg Bruske and Gerald Sommer. Dynamic Cell Structures, Advances in Neural Information Processing Systems, Vol. 7, Pages 497-504, 1995.
[18] Jorg Bruske and Gerald Sommer. Dynamic Cell Structure Learns Perfectly Topology Preserving Map, Neural Computation, Vol. 7, No. 4, Pages 845-865, 1995.
[19] Charles C. Jorgensen. Feedback Linearized Aircraft Control Using Dynamic Cell Structures, NASA Ames Research Center, ISSCI.
[20] S. E. Fahlman. CMU Benchmark Collection for Neural Net Learning Algorithms, Carnegie Mellon University, School of Computer Science, machine-readable data repository, Pittsburgh.
[21] CMU Benchmark Archive. bench/cmu/, 2002.
More informationNeural Networks and Fuzzy Logic Rajendra Dept.of CSE ASCET
Unit-. Definition Neural network is a massively parallel distributed processing system, made of highly inter-connected neural computing elements that have the ability to learn and thereby acquire knowledge
More informationPattern Classification
Pattern Classification All materials in these slides were taen from Pattern Classification (2nd ed) by R. O. Duda,, P. E. Hart and D. G. Stor, John Wiley & Sons, 2000 with the permission of the authors
More informationAn Adaptive LQG Combined With the MRAS Based LFFC for Motion Control Systems
Journal of Automation Control Engineering Vol 3 No 2 April 2015 An Adaptive LQG Combined With the MRAS Based LFFC for Motion Control Systems Nguyen Duy Cuong Nguyen Van Lanh Gia Thi Dinh Electronics Faculty
More informationOn Node-Fault-Injection Training of an RBF Network
On Node-Fault-Injection Training of an RBF Network John Sum 1, Chi-sing Leung 2, and Kevin Ho 3 1 Institute of E-Commerce, National Chung Hsing University Taichung 402, Taiwan pfsum@nchu.edu.tw 2 Department
More informationSeveral ways to solve the MSO problem
Several ways to solve the MSO problem J. J. Steil - Bielefeld University - Neuroinformatics Group P.O.-Box 0 0 3, D-3350 Bielefeld - Germany Abstract. The so called MSO-problem, a simple superposition
More informationGLOBAL ANALYSIS OF PIECEWISE LINEAR SYSTEMS USING IMPACT MAPS AND QUADRATIC SURFACE LYAPUNOV FUNCTIONS
GLOBAL ANALYSIS OF PIECEWISE LINEAR SYSTEMS USING IMPACT MAPS AND QUADRATIC SURFACE LYAPUNOV FUNCTIONS Jorge M. Gonçalves, Alexandre Megretski y, Munther A. Dahleh y California Institute of Technology
More informationIn the Name of God. Lectures 15&16: Radial Basis Function Networks
1 In the Name of God Lectures 15&16: Radial Basis Function Networks Some Historical Notes Learning is equivalent to finding a surface in a multidimensional space that provides a best fit to the training
More informationA Modified Incremental Principal Component Analysis for On-line Learning of Feature Space and Classifier
A Modified Incremental Principal Component Analysis for On-line Learning of Feature Space and Classifier Seiichi Ozawa, Shaoning Pang, and Nikola Kasabov Graduate School of Science and Technology, Kobe
More informationImmediate Reward Reinforcement Learning for Projective Kernel Methods
ESANN'27 proceedings - European Symposium on Artificial Neural Networks Bruges (Belgium), 25-27 April 27, d-side publi., ISBN 2-9337-7-2. Immediate Reward Reinforcement Learning for Projective Kernel Methods
More informationApplication of Artificial Neural Networks in Evaluation and Identification of Electrical Loss in Transformers According to the Energy Consumption
Application of Artificial Neural Networks in Evaluation and Identification of Electrical Loss in Transformers According to the Energy Consumption ANDRÉ NUNES DE SOUZA, JOSÉ ALFREDO C. ULSON, IVAN NUNES
More informationPDF hosted at the Radboud Repository of the Radboud University Nijmegen
PDF hosted at the Radboud Repository of the Radboud University Nijmegen The following full text is a preprint version which may differ from the publisher's version. For additional information about this
More informationMatching the dimensionality of maps with that of the data
Matching the dimensionality of maps with that of the data COLIN FYFE Applied Computational Intelligence Research Unit, The University of Paisley, Paisley, PA 2BE SCOTLAND. Abstract Topographic maps are
More informationUsing a Hopfield Network: A Nuts and Bolts Approach
Using a Hopfield Network: A Nuts and Bolts Approach November 4, 2013 Gershon Wolfe, Ph.D. Hopfield Model as Applied to Classification Hopfield network Training the network Updating nodes Sequencing of
More informationData Mining Part 5. Prediction
Data Mining Part 5. Prediction 5.5. Spring 2010 Instructor: Dr. Masoud Yaghini Outline How the Brain Works Artificial Neural Networks Simple Computing Elements Feed-Forward Networks Perceptrons (Single-layer,
More informationBlind Equalization Formulated as a Self-organized Learning Process
Blind Equalization Formulated as a Self-organized Learning Process Simon Haykin Communications Research Laboratory McMaster University 1280 Main Street West Hamilton, Ontario, Canada L8S 4K1 Abstract In
More informationLearning Vector Quantization
Learning Vector Quantization Neural Computation : Lecture 18 John A. Bullinaria, 2015 1. SOM Architecture and Algorithm 2. Vector Quantization 3. The Encoder-Decoder Model 4. Generalized Lloyd Algorithms
More informationPOWER SYSTEM DYNAMIC SECURITY ASSESSMENT CLASSICAL TO MODERN APPROACH
Abstract POWER SYSTEM DYNAMIC SECURITY ASSESSMENT CLASSICAL TO MODERN APPROACH A.H.M.A.Rahim S.K.Chakravarthy Department of Electrical Engineering K.F. University of Petroleum and Minerals Dhahran. Dynamic
More informationLecture 7 Artificial neural networks: Supervised learning
Lecture 7 Artificial neural networks: Supervised learning Introduction, or how the brain works The neuron as a simple computing element The perceptron Multilayer neural networks Accelerated learning in
More informationNeural Networks. Hopfield Nets and Auto Associators Fall 2017
Neural Networks Hopfield Nets and Auto Associators Fall 2017 1 Story so far Neural networks for computation All feedforward structures But what about.. 2 Loopy network Θ z = ቊ +1 if z > 0 1 if z 0 y i
More informationEEE 241: Linear Systems
EEE 4: Linear Systems Summary # 3: Introduction to artificial neural networks DISTRIBUTED REPRESENTATION An ANN consists of simple processing units communicating with each other. The basic elements of
More informationIntroduction to Neural Networks: Structure and Training
Introduction to Neural Networks: Structure and Training Professor Q.J. Zhang Department of Electronics Carleton University, Ottawa, Canada www.doe.carleton.ca/~qjz, qjz@doe.carleton.ca A Quick Illustration
More informationCOMS 4771 Introduction to Machine Learning. Nakul Verma
COMS 4771 Introduction to Machine Learning Nakul Verma Announcements HW1 due next lecture Project details are available decide on the group and topic by Thursday Last time Generative vs. Discriminative
More informationArtificial Neural Networks Examination, March 2002
Artificial Neural Networks Examination, March 2002 Instructions There are SIXTY questions (worth up to 60 marks). The exam mark (maximum 60) will be added to the mark obtained in the laborations (maximum
More informationHOPFIELD neural networks (HNNs) are a class of nonlinear
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 52, NO. 4, APRIL 2005 213 Stochastic Noise Process Enhancement of Hopfield Neural Networks Vladimir Pavlović, Member, IEEE, Dan Schonfeld,
More informationCMSC 421: Neural Computation. Applications of Neural Networks
CMSC 42: Neural Computation definition synonyms neural networks artificial neural networks neural modeling connectionist models parallel distributed processing AI perspective Applications of Neural Networks
More informationNeural Networks Lecture 2:Single Layer Classifiers
Neural Networks Lecture 2:Single Layer Classifiers H.A Talebi Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Winter 2011. A. Talebi, Farzaneh Abdollahi Neural
More informationCurvilinear Components Analysis and Bregman Divergences
and Machine Learning. Bruges (Belgium), 8-3 April, d-side publi., ISBN -9337--. Curvilinear Components Analysis and Bregman Divergences Jigang Sun, Malcolm Crowe and Colin Fyfe Applied Computational Intelligence
More information488 Solutions to the XOR Problem
488 Solutions to the XOR Problem Frans M. Coetzee * eoetzee@eee.emu.edu Department of Electrical Engineering Carnegie Mellon University Pittsburgh, PA 15213 Virginia L. Stonick ginny@eee.emu.edu Department
More informationFEEDBACK GMDH-TYPE NEURAL NETWORK AND ITS APPLICATION TO MEDICAL IMAGE ANALYSIS OF LIVER CANCER. Tadashi Kondo and Junji Ueno
International Journal of Innovative Computing, Information and Control ICIC International c 2012 ISSN 1349-4198 Volume 8, Number 3(B), March 2012 pp. 2285 2300 FEEDBACK GMDH-TYPE NEURAL NETWORK AND ITS
More informationComputational Intelligence Lecture 6: Associative Memory
Computational Intelligence Lecture 6: Associative Memory Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Fall 2011 Farzaneh Abdollahi Computational Intelligence
More informationImplicitely and Densely Discrete Black-Box Optimization Problems
Implicitely and Densely Discrete Black-Box Optimization Problems L. N. Vicente September 26, 2008 Abstract This paper addresses derivative-free optimization problems where the variables lie implicitly
More informationDynamic Classification of Power Systems via Kohonen Neural Network
Dynamic Classification of Power Systems via Kohonen Neural Network S.F Mahdavizadeh Iran University of Science & Technology, Tehran, Iran M.A Sandidzadeh Iran University of Science & Technology, Tehran,
More informationStability of Amygdala Learning System Using Cell-To-Cell Mapping Algorithm
Stability of Amygdala Learning System Using Cell-To-Cell Mapping Algorithm Danial Shahmirzadi, Reza Langari Department of Mechanical Engineering Texas A&M University College Station, TX 77843 USA danial_shahmirzadi@yahoo.com,
More informationIntroduction to Artificial Neural Networks
Facultés Universitaires Notre-Dame de la Paix 27 March 2007 Outline 1 Introduction 2 Fundamentals Biological neuron Artificial neuron Artificial Neural Network Outline 3 Single-layer ANN Perceptron Adaline
More informationEM-algorithm for Training of State-space Models with Application to Time Series Prediction
EM-algorithm for Training of State-space Models with Application to Time Series Prediction Elia Liitiäinen, Nima Reyhani and Amaury Lendasse Helsinki University of Technology - Neural Networks Research
More informationThis is a repository copy of Improving the associative rule chaining architecture.
This is a repository copy of Improving the associative rule chaining architecture. White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/75674/ Version: Accepted Version Book Section:
More information9 Competitive Neural Networks
9 Competitive Neural Networks the basic competitive neural network consists of two layers of neurons: The distance-measure layer, The competitive layer, also known as a Winner-Takes-All (WTA) layer The
More informationOn the convergence speed of artificial neural networks in the solving of linear systems
Available online at http://ijimsrbiauacir/ Int J Industrial Mathematics (ISSN 8-56) Vol 7, No, 5 Article ID IJIM-479, 9 pages Research Article On the convergence speed of artificial neural networks in
More informationA Modified Incremental Principal Component Analysis for On-Line Learning of Feature Space and Classifier
A Modified Incremental Principal Component Analysis for On-Line Learning of Feature Space and Classifier Seiichi Ozawa 1, Shaoning Pang 2, and Nikola Kasabov 2 1 Graduate School of Science and Technology,
More information18.6 Regression and Classification with Linear Models
18.6 Regression and Classification with Linear Models 352 The hypothesis space of linear functions of continuous-valued inputs has been used for hundreds of years A univariate linear function (a straight
More informationHover Control for Helicopter Using Neural Network-Based Model Reference Adaptive Controller
Vol.13 No.1, 217 مجلد 13 العدد 217 1 Hover Control for Helicopter Using Neural Network-Based Model Reference Adaptive Controller Abdul-Basset A. Al-Hussein Electrical Engineering Department Basrah University
More informationFrequency Selective Surface Design Based on Iterative Inversion of Neural Networks
J.N. Hwang, J.J. Choi, S. Oh, R.J. Marks II, "Query learning based on boundary search and gradient computation of trained multilayer perceptrons", Proceedings of the International Joint Conference on Neural
More informationLearning Vector Quantization (LVQ)
Learning Vector Quantization (LVQ) Introduction to Neural Computation : Guest Lecture 2 John A. Bullinaria, 2007 1. The SOM Architecture and Algorithm 2. What is Vector Quantization? 3. The Encoder-Decoder
More informationA Globally Stabilizing Receding Horizon Controller for Neutrally Stable Linear Systems with Input Constraints 1
A Globally Stabilizing Receding Horizon Controller for Neutrally Stable Linear Systems with Input Constraints 1 Ali Jadbabaie, Claudio De Persis, and Tae-Woong Yoon 2 Department of Electrical Engineering
More informationArtificial Neural Networks
Artificial Neural Networks 鮑興國 Ph.D. National Taiwan University of Science and Technology Outline Perceptrons Gradient descent Multi-layer networks Backpropagation Hidden layer representations Examples
More informationNeural Network Based Response Surface Methods a Comparative Study
. LS-DYNA Anwenderforum, Ulm Robustheit / Optimierung II Neural Network Based Response Surface Methods a Comparative Study Wolfram Beyer, Martin Liebscher, Michael Beer, Wolfgang Graf TU Dresden, Germany
More informationUnit 8: Introduction to neural networks. Perceptrons
Unit 8: Introduction to neural networks. Perceptrons D. Balbontín Noval F. J. Martín Mateos J. L. Ruiz Reina A. Riscos Núñez Departamento de Ciencias de la Computación e Inteligencia Artificial Universidad
More informationCSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska. NEURAL NETWORKS Learning
CSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS Learning Neural Networks Classifier Short Presentation INPUT: classification data, i.e. it contains an classification (class) attribute.
More informationAdministration. Registration Hw3 is out. Lecture Captioning (Extra-Credit) Scribing lectures. Questions. Due on Thursday 10/6
Administration Registration Hw3 is out Due on Thursday 10/6 Questions Lecture Captioning (Extra-Credit) Look at Piazza for details Scribing lectures With pay; come talk to me/send email. 1 Projects Projects
More informationNon-linear Measure Based Process Monitoring and Fault Diagnosis
Non-linear Measure Based Process Monitoring and Fault Diagnosis 12 th Annual AIChE Meeting, Reno, NV [275] Data Driven Approaches to Process Control 4:40 PM, Nov. 6, 2001 Sandeep Rajput Duane D. Bruns
More informationArtificial Neural Networks. Historical description
Artificial Neural Networks Historical description Victor G. Lopez 1 / 23 Artificial Neural Networks (ANN) An artificial neural network is a computational model that attempts to emulate the functions of
More informationC4 Phenomenological Modeling - Regression & Neural Networks : Computational Modeling and Simulation Instructor: Linwei Wang
C4 Phenomenological Modeling - Regression & Neural Networks 4040-849-03: Computational Modeling and Simulation Instructor: Linwei Wang Recall.. The simple, multiple linear regression function ŷ(x) = a
More informationDEEP LEARNING AND NEURAL NETWORKS: BACKGROUND AND HISTORY
DEEP LEARNING AND NEURAL NETWORKS: BACKGROUND AND HISTORY 1 On-line Resources http://neuralnetworksanddeeplearning.com/index.html Online book by Michael Nielsen http://matlabtricks.com/post-5/3x3-convolution-kernelswith-online-demo
More informationNeural Networks. Prof. Dr. Rudolf Kruse. Computational Intelligence Group Faculty for Computer Science
Neural Networks Prof. Dr. Rudolf Kruse Computational Intelligence Group Faculty for Computer Science kruse@iws.cs.uni-magdeburg.de Rudolf Kruse Neural Networks 1 Hopfield Networks Rudolf Kruse Neural Networks
More informationINTRODUCTION TO NEURAL NETWORKS
INTRODUCTION TO NEURAL NETWORKS R. Beale & T.Jackson: Neural Computing, an Introduction. Adam Hilger Ed., Bristol, Philadelphia and New York, 990. THE STRUCTURE OF THE BRAIN The brain consists of about
More informationMCE/EEC 647/747: Robot Dynamics and Control. Lecture 8: Basic Lyapunov Stability Theory
MCE/EEC 647/747: Robot Dynamics and Control Lecture 8: Basic Lyapunov Stability Theory Reading: SHV Appendix Mechanical Engineering Hanz Richter, PhD MCE503 p.1/17 Stability in the sense of Lyapunov A
More informationNeural Networks. Fundamentals Framework for distributed processing Network topologies Training of ANN s Notation Perceptron Back Propagation
Neural Networks Fundamentals Framework for distributed processing Network topologies Training of ANN s Notation Perceptron Back Propagation Neural Networks Historical Perspective A first wave of interest
More informationPower Grid State Estimation after a Cyber-Physical Attack under the AC Power Flow Model
Power Grid State Estimation after a Cyber-Physical Attack under the AC Power Flow Model Saleh Soltan, Gil Zussman Department of Electrical Engineering Columbia University, New York, NY Email: {saleh,gil}@ee.columbia.edu
More informationLecture 4: Feed Forward Neural Networks
Lecture 4: Feed Forward Neural Networks Dr. Roman V Belavkin Middlesex University BIS4435 Biological neurons and the brain A Model of A Single Neuron Neurons as data-driven models Neural Networks Training
More informationCOGS Q250 Fall Homework 7: Learning in Neural Networks Due: 9:00am, Friday 2nd November.
COGS Q250 Fall 2012 Homework 7: Learning in Neural Networks Due: 9:00am, Friday 2nd November. For the first two questions of the homework you will need to understand the learning algorithm using the delta
More informationOn the SIR s ( Signal -to- Interference -Ratio) in. Discrete-Time Autonomous Linear Networks
On the SIR s ( Signal -to- Interference -Ratio) in arxiv:93.9v [physics.data-an] 9 Mar 9 Discrete-Time Autonomous Linear Networks Zekeriya Uykan Abstract In this letter, we improve the results in [5] by
More informationReview on Aircraft Gain Scheduling
Review on Aircraft Gain Scheduling Z. Y. Kung * and I. F. Nusyirwan a Department of Aeronautical Engineering, Faculty of Mechanical Engineering, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia.
More informationNONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition
NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function
More informationIterative Laplacian Score for Feature Selection
Iterative Laplacian Score for Feature Selection Linling Zhu, Linsong Miao, and Daoqiang Zhang College of Computer Science and echnology, Nanjing University of Aeronautics and Astronautics, Nanjing 2006,
More information