Storage Capacity and Dynamics of Nonmonotonic Networks

Bruno Crespi (a) and Ignazio Lazzizzera (b)
(a) IRST, I-38050 Povo (Trento), Italy
(b) Univ. of Trento, Physics Dept., I-38050 Povo (Trento), Italy; INFN Gruppo collegato di Trento

Abstract. This work investigates the retrieval capacities of different types of nonmonotonic neurons. Storage capacity is maximized when the neuron response is a function with well-defined geometrical characteristics. Numerical experiments demonstrate that storage capacity is directly related to the dynamical properties of the iterative map that describes the network evolution. Maximum capacity is reached when the neuron dynamics are subdivided into two non-overlapping "erratic bands" around points $x_i = \pm 1$.

1. Introduction

We consider the storage capacity of fully connected Hopfield models [2, 1]. The discrete dynamics of the network elements, $X(t) = \{x_i(t);\ i = 1, \dots, N\}$, are assumed to be governed by the $N$-dimensional iterative map

$$x_i(t+1) = g\Big(\sum_{j=1}^{N} T_{ij}\, x_j(t)\Big) \quad \text{with} \quad T_{ij} = \frac{1}{N} \sum_{\mu=1}^{M} \xi_i^{\mu} \xi_j^{\mu} \tag{1}$$

where $\xi^{\mu} = (\xi_1^{\mu}, \xi_2^{\mu}, \dots, \xi_N^{\mu})$, $\xi_i^{\mu} = \pm 1$, $\mu = 1, 2, \dots, M$, are the configurations to be memorized. The retrieval process is identified with the convergence of the network configuration $X(t)$ towards one of the stable fixed points of iterative map (1). A successful recognition is achieved when the overlap between the final static configuration and one of the memorized configurations, e.g. $\mu = 1$, is close to unity, $\frac{1}{N} \sum_i \xi_i^{1} x_i \approx 1$.

In this work, functions $g(x)$ are normalized in such a way that $g(\pm 1) = \pm 1$. This normalization ensures that in the ideal case in which the stored vectors are orthogonal, $(\xi^{\mu}, \xi^{\nu}) = \delta_{\mu,\nu}$ (scalar product normalized by $N$), configurations $\xi^{\mu}$ are fixed points of iterative map (1).

In the traditional models, in which the neuron response is a monotone function, the number of configurations that can be stored and retrieved without errors, the absolute capacity, is $M_A \approx N / (2 \ln N)$. If a small fraction of errors is allowed, the relative capacity is $M_R \approx \alpha_c N$, where $\alpha_c \approx 0.14$ [3, 4]. In the $N \to \infty$ limit, critical values $M_A$ and $M_R$ separate two distinct phases of the system.
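As a minimal sketch of iterative map (1), the snippet below builds the Hebbian matrix $T_{ij}$ from a set of patterns and applies one synchronous update. The function names (`hebbian_weights`, `step`, `overlap`, `sign`) are illustrative and not from the paper, and the sign function stands in for any response normalized so that $g(\pm 1) = \pm 1$. The two toy patterns are orthogonal, so each is a fixed point of the map.

```python
def hebbian_weights(patterns):
    """T_ij = (1/N) * sum_mu xi_i^mu * xi_j^mu, as in eq. (1)."""
    n = len(patterns[0])
    return [[sum(p[i] * p[j] for p in patterns) / n for j in range(n)]
            for i in range(n)]


def step(weights, x, g):
    """One synchronous update x_i(t+1) = g(sum_j T_ij * x_j(t))."""
    return [g(sum(w * xj for w, xj in zip(row, x))) for row in weights]


def overlap(pattern, x):
    """Overlap (1/N) * sum_i xi_i * x_i between a stored pattern and a state."""
    return sum(p * xi for p, xi in zip(pattern, x)) / len(x)


def sign(u):
    """Monotone stand-in response (assumption); satisfies g(+-1) = +-1."""
    return 1.0 if u >= 0 else -1.0


# Two orthogonal patterns with N = 4 (toy data, chosen for illustration).
patterns = [[1, 1, 1, 1], [1, -1, 1, -1]]
T = hebbian_weights(patterns)
state = step(T, [float(v) for v in patterns[1]], sign)
print(state, overlap(patterns[1], state))
```

Any other response function of one variable can be passed as `g` without changing the rest of the sketch.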
It was shown that if the sigmoid is replaced with a nonmonotone function, storage capacity is decisively improved [6, 7]. In [7], a theoretical study of two-stage dynamics with neuron response $f(x) = -a x + c\, \mathrm{sgn}(x)$, with $a = (c + 1)/2$ and $a > 0$, indicates an absolute capacity of about $N / \sqrt{2 \ln N}$, and computer simulations estimate a relative capacity of $0.3\,N$.

2. The model considered

As a first example of nonmonotone models, we consider the function

$$g_{gd}(u) = u\, e^{-\frac{\beta}{2}(u^2 - 1)} \tag{2}$$

that depends on a single parameter, $\beta$. The slope at $x = 1$ is negative for $\beta > 1$. Figure 1 shows the retrieval performance for a $N = 100$ network with neuron response given by (2), as a function of parameter $\beta$ for several values of the loading parameter $\alpha = M/N$. Initial configurations have an $m_0 = 0.8$ overlap with the target. Each point, for $1 < \beta < 5$, is generated by averaging over 1000 samples. Performance is measured in terms of the overlap between the network output and the pattern to be retrieved. In the following simulations the final output of the system is binarized, i.e. final values $x_i$ are set to $\pm 1$ by means of $x_i \to \mathrm{sgn}(x_i)$. We refer to the overlap obtained using the binarized variables as the binarized overlap. The final configuration has values $x_i$ that are contained in two bands around $\pm 1$, the width of which depends on $\alpha$ and $\beta$. For low $\alpha$ the range of values of $\beta$ that maximizes the overlap is wide; as $\alpha$ increases this range shrinks around the value $\beta \approx 3$. Retrieval with few errors is obtained for $\alpha < 0.4$. Networks of larger size give the same results, enhancing the difference between different phases of the network behavior.

[Figure 1: The retrieval performance for a N = 100 network. Overlap versus $\beta$, one curve per value of $\alpha$ between 0.2 and 1.0.]

[Figure 2: Contour plot of overlap for the piecewise linear response for $\alpha = 0.4$. Axes: a and b.]

Capacity enhancement with respect to monotone functions is a general behavior of nonmonotone response functions. Referring to [5, 6, 7], we have experimentally compared function (2) with two other functions: 1) the piecewise linear function,

$$g_{pl}(x) = \begin{cases} a x & \text{for } |x| < \frac{1+b}{a+b}, \\ -b x \pm (1+b) & \text{for } \frac{1+b}{a+b} < \pm x < d, \\ \mp b d \pm (1+b) & \text{for } d < \pm x, \end{cases} \tag{3}$$

and 2) the Morita function,

$$g_{M}(x) = A\, \frac{1 - e^{-c x}}{1 + e^{-c x}} \cdot \frac{1 + \kappa\, e^{c'(|x| - h)}}{1 + e^{c'(|x| - h)}} \tag{4}$$

where $A$ is a normalization factor such that $g_M(1) = 1$, and $h = 1$ (footnote 1). The shape of these functions depends on the parameter values. We performed a numerical study to identify the parameter values that maximize retrieval capacity. Better results are achieved when positive (negative) values are mapped into positive (negative) values. For this reason we set $d = (1 + b)/b$ in the piecewise linear function, and $\kappa = 0$ in the Morita function. Once parameters $d$ and $\kappa$ are set, the other parameters ($a$ and $b$ for the piecewise linear function, $c$ and $c'$ for the Morita function) are related to the positive and negative slopes of the functions. For these two parameters we searched for the values that maximize the overlap between the system configuration and the pattern to be retrieved, averaged over a large number of samples (1000). For $\alpha = 0.4$, the optimal values are $a \approx 6$ and $b \approx 1.4$ for the piecewise linear function, see Fig. 2. The spacing between contour levels is 0.3; the highest contour level is 0.936 (footnote 2). In the same way, for the Morita function we found $c \approx 6$, $c' \approx 5$. Note that even for $\alpha = 0.4$ there is a large plateau of parameter values that maximize storage capacity, i.e. the contour line that includes values within a small fraction of the maximum delimits a large area. For lower values of the load parameter this area enlarges, while it shrinks for higher values. This behavior is common to all three functions. The optimal values change slightly with the loading parameter. For example, for $\alpha = 0.46$ we have $a \approx 6$ and $b \approx 1.0$ for the piecewise linear response.

A direct comparison between the storage capacities of the three functions (the Gaussian derivative of eq. (2), the Morita function, and the piecewise linear function) is shown in Figure 3. For all the functions, parameter values that maximize storage capacity are used.
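The three response functions compared here can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: the piecewise linear branches follow the form with $d = (1+b)/b$ (so the outer branch is zero), the Morita function is normalized numerically by dividing by its value at 1, and the parameter values in the usage lines are only the approximate ones quoted in the text. All function names are hypothetical.

```python
import math


def g_gd(u, beta):
    """Gaussian-derivative response, eq. (2): u * exp(-(beta/2) * (u^2 - 1))."""
    return u * math.exp(-0.5 * beta * (u * u - 1.0))


def g_pl(x, a, b):
    """Piecewise linear response with d = (1 + b) / b (assumed setting)."""
    knee = (1.0 + b) / (a + b)   # where slope a meets slope -b
    d = (1.0 + b) / b            # where the descending branch reaches zero
    s = 1.0 if x >= 0 else -1.0
    ax = abs(x)
    if ax < knee:
        return a * x
    if ax < d:
        return s * (-b * ax + (1.0 + b))
    return 0.0                   # -b*d + (1 + b) vanishes for this choice of d


def _inv_one_plus_exp(z):
    """Numerically stable 1 / (1 + exp(z))."""
    if z > 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def g_morita(x, c, c_prime, kappa=0.0, h=1.0):
    """Morita-type response, normalized so that g(1) = 1.

    With h = 1 the normalization requires kappa != -1.
    """
    def unnormalized(v):
        sigmoid = math.tanh(0.5 * c * v)  # (1 - e^{-cv}) / (1 + e^{-cv})
        # (1 + kappa*e) / (1 + e) rewritten as kappa + (1 - kappa) / (1 + e)
        gate = kappa + (1.0 - kappa) * _inv_one_plus_exp(c_prime * (abs(v) - h))
        return sigmoid * gate
    return unnormalized(x) / unnormalized(1.0)


# All three satisfy g(1) = 1 up to rounding (illustrative parameter values).
print(g_gd(1.0, 3.2), g_pl(1.0, 6.0, 1.4), g_morita(1.0, 6.0, 5.0))
```

Each function is odd in its argument, so checking the normalization at $+1$ also covers $-1$.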
The figure was obtained for a $N = 100$ network, starting from initial configurations that have an $m_0 = 0.8$ overlap with the target. Each point is generated by averaging over 1000 samples. The simulation results indicate that retrieval performance depends on the geometric shape of the nonmonotone function. In fact, as shown in Fig. 4, the three functions have very similar shapes and performance when the optimal parameters are used. For function (2) we choose as optimal value the median of the intersections between the line of overlap 0.9 and the $\alpha = 0.4$ curve, i.e. $\beta \approx 3.2$.

Footnote 1: The value of $h$ was set to unity in order to lower the number of parameters while keeping a reasonable shape.

Footnote 2: The optimal values are defined as the approximate position of the maximum; they were found using a higher number of contour levels than that shown in the figure.
[Figure 3: Performance for the three functions with optimal parameters. Overlap versus $\alpha$; curves: g-d., p-l., Mor.]

[Figure 4: Shape of the three functions for "optimal" parameter values. Y versus X; curves: g-d., p-l., Mor., and the line y = x.]

3. Relation with the dynamical properties of the response function

We consider function (2). Fig. 5 shows the distribution of the overlap as a function of parameter $\beta$ for different values of the loading parameter $\alpha = M/N$. Note that in the computation of this overlap, final values $x_i$ are not binarized to $\pm 1$ by means of $x \to \mathrm{sgn}(x)$ as previously done. The distribution was obtained using 1000 samples for each value of $\alpha$ and $\beta$. The gray level of the pixel at $(\beta, x)$, with $x$ the overlap value on the vertical axis, is proportional to the number of samples that fall into $[\beta - 1/128,\ \beta + 1/128]$ and $[x - 1/128,\ x + 1/128]$.

[Figure 5: Distribution of overlap in the range $1 < \beta < 5$ for six values of the load parameter (initial overlap $m_0 = 0.8$).]

[Figure 6: Distribution of $x$ in the range $1 < \beta < 5$ for six values of the load parameter (initial overlap $m_0 = 0.8$).]

For $\alpha$ fixed, there is a sharp transition in performance when $\beta$ becomes larger than a threshold value (for $\alpha = 0.2$ the transition point is at $\beta \approx 6$). The value of this threshold decreases as $\alpha$ increases.
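A single retrieval trial of the kind summarized by these overlap distributions might look like the sketch below. It is not the authors' simulation code: the stopping rule (`max_iter`, `tol`), the random pattern generation, and the helper name `run_sample` are assumptions made for illustration. The local field uses the identity $\sum_j T_{ij} x_j = \sum_\mu \xi_i^\mu \big(\frac{1}{N} \sum_j \xi_j^\mu x_j\big)$, which avoids building the full matrix.

```python
import math
import random


def g_gd(u, beta):
    """Gaussian-derivative response of eq. (2); satisfies g(+-1) = +-1."""
    return u * math.exp(-0.5 * beta * (u * u - 1.0))


def run_sample(n, n_patterns, beta, flips, rng, max_iter=50, tol=1e-9):
    """Return (initial, final raw, final binarized) overlap for one sample."""
    patterns = [[rng.choice((-1.0, 1.0)) for _ in range(n)]
                for _ in range(n_patterns)]
    target = patterns[0]

    # Flipping k of the n target entries gives an initial overlap of 1 - 2k/n.
    x = list(target)
    for i in rng.sample(range(n), flips):
        x[i] = -x[i]
    m0 = sum(t * v for t, v in zip(target, x)) / n

    for _ in range(max_iter):
        # Projections (1/N) * sum_j xi_j^mu * x_j onto every stored pattern.
        proj = [sum(pj * xj for pj, xj in zip(p, x)) / n for p in patterns]
        new_x = [g_gd(sum(p[i] * c for p, c in zip(patterns, proj)), beta)
                 for i in range(n)]
        converged = max(abs(a - b) for a, b in zip(new_x, x)) < tol
        x = new_x
        if converged:  # assumed stopping rule, not taken from the paper
            break

    raw = sum(t * v for t, v in zip(target, x)) / n
    binarized = sum(t * (1.0 if v >= 0 else -1.0)
                    for t, v in zip(target, x)) / n
    return m0, raw, binarized


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
for beta in (1.0, 3.0, 5.0):
    m0, raw, binarized = run_sample(100, 20, beta, 10, rng)
    print(f"beta={beta}: m0={m0}, raw={raw:.3f}, binarized={binarized:.3f}")
```

With `n=100` and `flips=10` the initial overlap is 0.8, matching the setting used in the figures; averaging many such trials per $(\alpha, \beta)$ pair would give one point of a performance curve.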
The low value of the overlap for "high" $\beta$ is caused by neuron values $x_i$, $i = 1, \dots, N$, moving close to zero. This is evidenced by Fig. 6, which displays values $x_i$ during the network evolution for different values of $\alpha$ as a function of $\beta$. Each iteration is generated from a random initial configuration that has an $m_0 = 0.8$ overlap with the target. The gray level of a pixel $(\beta, x)$ is proportional to the number of times the iterated points $x_i$, $i = 1, \dots, N$, fall into the ranges of values $[\beta - 1/128,\ \beta + 1/128]$ and $[x - 1/128,\ x + 1/128]$. In general, the network configuration has values $x_i$ that are not strictly $\pm 1$ but are contained within two bands around $\pm 1$. The width of these bands depends on $\alpha$ and $\beta$.

The comparison between Figures 5 and 6 indicates that, for a fixed memory load, the system achieves maximum storage capacity when the dynamics form, around points $\pm 1$, two well-separated bands in which iterated points move erratically, i.e. evenly covering the available space. On the contrary, when the iterated points accumulate around the fixed points, storage capacity is lowered.

[Figure 7: Comparison between retrieval capability, in terms of the final overlaps (non-binarized and binarized), and the measure of the irregularity of the iteration process (as defined by eq. (5)) for $\alpha = 0.4$. The initial overlap is $m_0 = 0.8$. Horizontal axis: $\beta$ from 1.0 to 5.0; the measure is labeled E.]

This irregular behavior is quantitatively characterized in Fig. 7, where the non-binarized and binarized overlaps are plotted against a measure of the spread of the iterated points, as a function of $\beta$ (similar results were obtained for other values of $\alpha$). To define an appropriate measure, interval $[-3, 3]$ is discretized into $W = 10^{6}$ segments (of length $\Delta = 10^{-6}$). The measure is defined as the ratio between the number of segments visited during the iteration process and the total number of values produced, NVP,

$$E = \frac{\sum_{i=0}^{W} \theta_i}{\mathrm{NVP}} \in [0, 1] \tag{5}$$

where $\theta_i = 1$ if during iteration one or more values fall inside interval $[-3 + i\Delta,\ -3 + (i+1)\Delta]$, otherwise $\theta_i = 0$.

The total number of values, NVP, is given by $N \sum_{s=1}^{S} I_s$, where $N = 100$ is the number of neurons, $S = 1000$ is the number of test samples, and $I_s$ is the number of iterations corresponding to sample $s$.
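The measure of eq. (5) can be sketched in a few lines: count how many segments of $[-3, 3]$ receive at least one iterated value and divide by the number of values produced. In this sketch the segment width is taken as the interval length divided by the number of segments, and values outside the interval still count toward NVP but visit no segment; both choices, and the name `spread_measure`, are assumptions of the example.

```python
def spread_measure(values, n_segments, lo=-3.0, hi=3.0):
    """E = (segments visited) / (number of values produced), as in eq. (5)."""
    width = (hi - lo) / n_segments
    visited = set()
    for v in values:
        if lo <= v < hi:
            # Index of the segment [lo + i*width, lo + (i+1)*width) holding v.
            visited.add(min(int((v - lo) / width), n_segments - 1))
    return len(visited) / len(values)


# Values spread over distinct segments give E = 1 ...
print(spread_measure([-2.5, -1.5, 0.5, 2.5], n_segments=6))
# ... while values piled onto one point give E = 1 / (number of values).
print(spread_measure([1.0, 1.0, 1.0, 1.0], n_segments=6))
```

In a full experiment, `values` would collect every $x_i$ produced at every iteration of every sample, so that `len(values)` plays the role of NVP.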
Measure $E$ tends toward unity if the iterated points spread uniformly in interval $[-3, 3]$. On the contrary, it decreases toward zero if points accumulate around particular values. The result is stable under variations of the number of intervals $W$ (for large $W$, of order NVP). Similar results are obtained for other values of the load parameter $\alpha$.

4. Conclusions

The capacity of associative memories with different types of nonmonotonic neurons was numerically investigated to determine the "optimal" shape of the neuron response function. It was shown that, starting from different analytical expressions, such as a piecewise linear function, the Morita function, and a Gaussian derivative function, capacity is maximized when the functions approximate a well-defined shape. The capacity enhancement is connected with the dynamical properties of the iterative map that describes the discrete network evolution. While in monotone models neurons are constrained to quasi-binary values close to $\pm 1$, in the nonmonotone models they can assume values in a wider range. We have shown that this property is directly related to the capacity increase. In fact, the capability of retrieving patterns is maximized when the dynamics of the individual neurons are confined to two wide but well-disconnected bands around $\pm 1$. The optimal shape of the neuron function corresponds to the case in which the two bands are uniformly covered by the iterated points.

References

[1] D. J. Amit. Modelling Brain Functions. Cambridge University Press, New York, 1989.
[2] J. J. Hopfield. "Neural networks and physical systems with emergent collective computational abilities". Proc. Natl. Acad. Sci. USA, Vol. 79, pp. 2554-2558, April 1982.
[3] I. Kanter and H. Sompolinsky. "Associative recall of memory without errors". Phys. Rev. A, Vol. 35, No. 1, pp. 380-392, 1987.
[4] L. F. Abbott and Y. Arian. "Storage capacity of generalized networks". Phys. Rev. A, Vol. 36, No. 10, pp. 5091-5094, 1987.
[5] S. Yoshizawa, M. Morita, and S. Amari. "Capacity of associative memory using a nonmonotonic neuron model". Neural Networks, Vol. 6, pp. 167-176, 1993.
[6] M. Morita. "Associative memory with nonmonotonic dynamics". Neural Networks, Vol. 6, pp. 115-126, 1993.
[7] H. Yanai and S. Amari. "Auto-associative memory with two-stage dynamics of nonmonotonic neurons". IEEE Transactions on Neural Networks, Vol. 7, No. 4, pp. 803-815, 1996.