
Storage Capacity and Dynamics of Nonmonotonic Networks

Bruno Crespi (a) and Ignazio Lazzizzera (b)
a. IRST, I-38050 Povo (Trento), Italy
b. Univ. of Trento, Physics Dept., I-38050 Povo (Trento), Italy; INFN Gruppo collegato di Trento

Abstract. This work investigates the retrieval capacities of different types of nonmonotonic neurons. Storage capacity is maximized when the neuron response is a function with well-defined geometrical characteristics. Numerical experiments demonstrate that storage capacity is directly related to the dynamical property of the iterative map that describes the network evolution. Maximum capacity is reached when the neuron dynamics are subdivided into two non-overlapping "erratic bands" around the points $x_i = \pm 1$.

1. Introduction

We consider the storage capacity of fully connected Hopfield models [2, 1]. The discrete dynamics of the network elements, $X(t) = \{x_i(t);\ i = 1, \ldots, N\}$, are assumed to be governed by the N-dimensional iterative map

$$x_i(t+1) = g\Big(\sum_{j=1}^{N} T_{ij}\, x_j(t)\Big) \quad \text{with} \quad T_{ij} = \frac{1}{N} \sum_{\mu=1}^{M} \xi_i^{\mu} \xi_j^{\mu} \tag{1}$$

where $\xi^{\mu} = (\xi_1^{\mu}, \xi_2^{\mu}, \ldots, \xi_N^{\mu})$, $\xi_i^{\mu} = \pm 1$, $\mu = 1, 2, \ldots, M$, are the configurations to be memorized. The retrieval process is identified with the convergence of the network configuration $X(t)$ towards one of the stable fixed points of the iterative map (1). A successful recognition is achieved when the overlap between the final static configuration and one of the memorized configurations, e.g. $\mu = 1$, is close to unity, $\frac{1}{N} \sum_i \xi_i^{1} x_i \simeq 1$.

In this work, functions $g(x)$ are normalized in such a way that $g(1) = 1$. This normalization ensures that in the ideal case in which the stored vectors are orthogonal, $(\xi^{\mu}, \xi^{\nu}) = \delta_{\mu,\nu}$, the configurations $\xi^{\mu}$ are fixed points of the iterative map (1).

In the traditional models, in which the neuron response is a monotone function, the number of configurations that can be stored and retrieved without errors (the absolute capacity) is $M_A \simeq N / (2 \ln N)$. If a small fraction of errors is allowed, the corresponding number (the relative capacity) is $M_R \simeq \alpha_c N$, where $\alpha_c \simeq 0.14$ [3, 4]. In the $N \to \infty$ limit, the critical values $M_A$ and $M_R$ separate two distinct phases of the system.
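The iterative map (1) and the overlap criterion can be illustrated with a short NumPy sketch. It is only an illustration under stated assumptions, not the authors' code: the function names, the network size (N = 200, M = 5), the random seed, and the use of a sign function as the monotone reference response are all choices made here.

```python
import numpy as np

def hebbian_couplings(patterns):
    """T_ij = (1/N) * sum_mu xi_i^mu xi_j^mu for patterns of shape (M, N)."""
    n = patterns.shape[1]
    return patterns.T @ patterns / n

def overlap(pattern, x):
    """Overlap (1/N) * sum_i xi_i x_i between a stored pattern and a state."""
    return float(pattern @ x) / pattern.size

def iterate_map(couplings, x0, g, max_steps=50):
    """Apply x(t+1) = g(T x(t)) synchronously until the state stops changing."""
    x = x0.astype(float)
    for _ in range(max_steps):
        x_next = g(couplings @ x)
        if np.array_equal(x_next, x):
            break
        x = x_next
    return x

def sign_response(h):
    """Monotone reference response (assumption); maps 0 to +1."""
    return np.where(h >= 0.0, 1.0, -1.0)

def noisy_copy(pattern, n_flipped, rng):
    """Flip n_flipped randomly chosen entries of a +/-1 pattern."""
    x = pattern.astype(float).copy()
    flipped = rng.choice(pattern.size, size=n_flipped, replace=False)
    x[flipped] *= -1.0
    return x

# Hypothetical small experiment: low load, initial overlap 0.8.
rng = np.random.default_rng(0)
N, M = 200, 5
patterns = rng.choice([-1, 1], size=(M, N))
T = hebbian_couplings(patterns)
x0 = noisy_copy(patterns[0], n_flipped=N // 10, rng=rng)
x_final = iterate_map(T, x0, sign_response)
print("initial overlap:", overlap(patterns[0], x0))
print("final overlap:  ", overlap(patterns[0], x_final))
```

Flipping 10% of the entries gives an initial overlap of exactly 0.8, the starting condition used in the simulations reported later. At this low load the monotone network is expected to relax to a state very close to the stored pattern.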

It was shown that if the sigmoid is replaced with a nonmonotone function, storage capacity is decisively improved [6, 7]. In [7], a theoretical study of two-stage dynamics with neuron response $f(x) = -ax + c\,\mathrm{sgn}(x)$, with $a = (c+1)/2$ and $a > 0$, indicates an absolute capacity of about $N / \sqrt{2 \ln N}$, and computer simulations estimate a relative capacity of $0.3\,N$.

2. The model considered

As a first example of nonmonotone models, we consider the function

$$g_{gd}(u) = u\, e^{-\frac{\beta}{2}(u^2 - 1)} \tag{2}$$

which depends on a single parameter, $\beta$. The slope at $x = 1$ is negative for $\beta > 1$. Figure 1 shows the retrieval performance for an $N = 100$ network with neuron response given by (2), as a function of the parameter $\beta$, for several values of the loading parameter $\alpha = M/N$. Initial configurations have an overlap $m_0 = 0.8$ with the target. Each point, for $1 < \beta < 5$, is generated by averaging over 1000 samples. Performance is measured in terms of the overlap between the network output and the pattern to be retrieved. In the following simulations the final output of the system is binarized, i.e. the final values $x_i$ are set to $\pm 1$ by means of $x_i \to \bar{x}_i = \mathrm{sgn}(x_i)$. We denote by $\bar{m}$ the overlap obtained using the binarized variables $\bar{x}_i$.

The final configuration has values $x_i$ that are contained in two bands around $\pm 1$, the width of which depends on $\alpha$ and $\beta$. For low $\alpha$ the range of values of $\beta$ that maximizes the overlap is wide; as $\alpha$ increases, this range shrinks around the value $\beta \simeq 3$. Retrieval with few errors is obtained for $\alpha < 0.4$. Networks of larger size give the same results, enhancing the difference between the different phases of the network behavior.

[Figure 1: The retrieval performance for an N = 100 network. Overlap versus $\beta$, one curve per load value from $\alpha = 0.2$ to $\alpha = 1.0$.]

[Figure 2: Contour plot of overlap for the piecewise linear response for $\alpha = 0.4$. Axes: a and b.]

Capacity enhancement with respect to monotone functions is a general behavior of nonmonotone response functions. Referring to [5, 6, 7], we have experimentally compared function (2) with two other functions: 1) the piecewise linear function,

$$g_{pl}(x) = \begin{cases} a x & \text{for } |x| < \dfrac{1+b}{a+b}, \\[2mm] -b x + \mathrm{sgn}(x)\,(1+b) & \text{for } \dfrac{1+b}{a+b} \le |x| < d, \\[2mm] \mathrm{sgn}(x)\,\big(-b d + (1+b)\big) & \text{for } |x| \ge d, \end{cases} \tag{3}$$

and 2) the Morita function,

$$g_{M}(x) = A\, \frac{1 - e^{-c x}}{1 + e^{-c x}} \cdot \frac{1 + \kappa\, e^{c'(|x| - h)}}{1 + e^{c'(|x| - h)}} \tag{4}$$

where $A$ is a normalization factor such that $g_M(1) = 1$, and $h = 1$.

The shape of these functions depends on the parameter values. We performed a numerical study to identify the parameter values that maximize retrieval capacity. Better results are achieved when positive (negative) values are mapped into positive (negative) values. For this reason we set $d = (1+b)/b$ in the piecewise linear function, so that the response vanishes for $|x| \ge d$, and $\kappa = 0$ in the Morita function. Once the parameters $d$ and $\kappa$ are set, the other parameters ($a$ and $b$ for the piecewise linear function, $c$ and $c'$ for the Morita function) are related to the positive and negative slopes of the functions. For these two parameters we searched for the values that maximize the overlap between the system configuration and the pattern to be retrieved, averaged over a large number of samples (1000).

For $\alpha = 0.4$, the optimal values are $a \simeq 6$ and $b \simeq 1.4$ for the piecewise linear function, see Fig. 2. The spacing between contour levels is 0.3, and the highest contour level is 0.936 (see note 2). In the same way, for the Morita function we found $c \simeq 6$, $c' \simeq 5$. Note that even for $\alpha = 0.4$ there is a large plateau of parameter values that maximize storage capacity, i.e. the contour line that includes values within a small fraction of the maximum delimits a large area. For lower values of the load parameter $\alpha$ this area enlarges, while it shrinks for higher values. This behavior is common to all three functions. The optimal values change slightly with the loading parameter. For example, for $\alpha = 0.46$ we have $a \simeq 6$ and $b \simeq 1.0$ for the piecewise linear response.

A direct comparison between the storage capacities of the three functions (the Gaussian derivative of eq. (2), the Morita function, and the piecewise linear function) is shown in Figure 3. For all the functions, the parameter values that maximize storage capacity are used. The figure was obtained for an $N = 100$ network, starting from initial configurations that have an overlap $m_0 = 0.8$ with the target. Each point is generated by averaging over 1000 samples. The simulation results indicate that retrieval performance depends on the geometric shape of the nonmonotone function. In fact, as shown in Fig. 4, the three functions have very similar shapes and performance when the optimal parameters are used.

Note 1. For function (2) we choose as the optimal value the median of the intersections between the $m = 0.9$ line and the $\alpha = 0.4$ curve, i.e. $\beta \simeq 3.2$. The value of $h$ was set to unity in order to lower the number of parameters while keeping a reasonable shape.

Note 2. The optimal values are defined as the approximate position of the maximum; they were found using a higher number of contour levels than that shown in the figure.
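The properties of the Gaussian-derivative response (2) and the binarized overlap can be checked numerically with a small sketch. It is a minimal illustration under assumptions, not the authors' simulation code: `beta` stands for the single shape parameter of function (2), the function names, seed, and number of iterations are hypothetical choices, and one run is shown instead of an average over 1000 samples.

```python
import numpy as np

def g_gd(u, beta):
    """Gaussian-derivative response u * exp(-(beta / 2) * (u**2 - 1))."""
    u = np.asarray(u, dtype=float)
    return u * np.exp(-0.5 * beta * (u ** 2 - 1.0))

def slope_at_one(beta, eps=1e-6):
    """Central-difference estimate of dg/du at u = 1 (analytically 1 - beta)."""
    return float((g_gd(1.0 + eps, beta) - g_gd(1.0 - eps, beta)) / (2.0 * eps))

def binarized_overlap(pattern, x):
    """Overlap computed after mapping each x_i to sgn(x_i) (0 is mapped to +1)."""
    return float(pattern @ np.where(x >= 0.0, 1.0, -1.0)) / pattern.size

def retrieve(patterns, target, beta, n_flipped, steps, rng):
    """Iterate x(t+1) = g_gd(T x(t)) from a corrupted copy of one pattern."""
    n = patterns.shape[1]
    couplings = patterns.T @ patterns / n
    x = patterns[target].astype(float).copy()
    x[rng.choice(n, size=n_flipped, replace=False)] *= -1.0
    # Fixed number of steps (assumption): the nonmonotone map need not settle.
    for _ in range(steps):
        x = g_gd(couplings @ x, beta)
    return x

# Hypothetical single run: N = 100, load alpha = 0.2, initial overlap 0.8.
rng = np.random.default_rng(1)
N, alpha, beta = 100, 0.2, 3.0
M = round(alpha * N)
patterns = rng.choice([-1, 1], size=(M, N))
x_final = retrieve(patterns, target=0, beta=beta, n_flipped=10, steps=100, rng=rng)
print("slope at u = 1:", slope_at_one(beta))
print("binarized overlap:", binarized_overlap(patterns[0], x_final))
```

Differentiating (2) gives $g'_{gd}(u) = (1 - \beta u^2)\, e^{-\frac{\beta}{2}(u^2 - 1)}$, so the slope at $u = 1$ is $1 - \beta$, which is negative exactly when $\beta > 1$. The numerical estimate printed by the sketch can be compared against this value.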

[Figure 3: Performance for the three functions with optimal parameters. Overlap versus $\alpha$; curves: g-d., p-l., Mor.]

[Figure 4: Shape of the three functions for "optimal" parameter values. Y versus X; curves: g-d., p-l., Mor., and y = x.]

3. Relation with the dynamical properties of the response function

We consider function (2). Fig. 5 shows the distribution of the overlap $m$ as a function of the parameter $\beta$ for different values of the loading parameter $\alpha = M/N$. Note that in the computation of $m$, the final values $x_i$ are not binarized to $\pm 1$ by means of $x \to \mathrm{sgn}(x)$ as previously done. The distribution was obtained using 1000 samples for each value of $\alpha$ and $\beta$. The gray level of a pixel $(\beta, m)$ is proportional to the number of samples that fall into $[\beta - 1/128,\ \beta + 1/128]$ and $[m - 1/128,\ m + 1/128]$. For $\alpha$ fixed, there is a sharp transition in performance when $\beta$ becomes larger than a threshold value (for $\alpha = 0.2$ the transition point is at $\beta \simeq 6$). The value of this threshold decreases as $\alpha$ increases.

[Figure 5: Distribution of overlap $m$ in the range $1 < \beta < 5$ for six values of the load parameter $\alpha$ (initial overlap $m_0 = 0.8$).]

[Figure 6: Distribution of $x$ in the range $1 < \beta < 5$ for six values of the load parameter $\alpha$ (initial overlap $m_0 = 0.8$).]

The low value of $m$ for "high" $\beta$ is caused by the neuron values $x_i$, $i = 1, \ldots, N$, moving close to zero. This is evidenced by Fig. 6, which displays the values $x_i$ during the network evolution for different values of $\alpha$ as a function of $\beta$. Each iteration is generated from a random initial configuration that has an overlap $m_0 = 0.8$ with the target. The gray level of a pixel $(\beta, x)$ is proportional to the number of times the iterated points $x_i$, $i = 1, \ldots, N$, fall into the ranges of values $[\beta - 1/128,\ \beta + 1/128]$ and $[x - 1/128,\ x + 1/128]$.

In general, the network configuration has values $x_i$ that are not strictly $\pm 1$ but are contained within two bands around $\pm 1$. The width of these bands depends on $\alpha$ and $\beta$. The comparison between Figures 5 and 6 indicates that, for a fixed memory load, the system achieves maximum storage capacity when the dynamics form, around the points $\pm 1$, two well-separated bands in which the iterated points move erratically, i.e. evenly covering the available space. Conversely, when the iterated points accumulate around the fixed points, storage capacity is lowered.

[Figure 7: Comparison between retrieval capability, in terms of the final overlaps $m$ and $\bar{m}$, and the measure $E$ of the irregularity of the iteration process (as defined by eq. (5)) for $\alpha = 0.4$, as a function of $\beta$. The initial overlap is $m_0 = 0.8$.]

This irregular behavior is quantitatively characterized in Fig. 7, where the overlaps $m$ and $\bar{m}$ are plotted together with a measure of the spread of the iterated points, as a function of $\beta$ (similar results were obtained for other values of $\alpha$). To define an appropriate measure, the interval $[-3, 3]$ is discretized into $W = 10^6$ segments of equal length $\delta = 6/W$, i.e. of order $10^{-6}$. The measure is defined as the ratio between the number of segments visited during the iteration process and the total number of values produced, NVP:

$$E = \frac{\sum_{i=0}^{W-1} \epsilon_i}{\mathrm{NVP}} \in [0, 1] \tag{5}$$

where $\epsilon_i = 1$ if during the iteration one or more values fall inside the interval $[-3 + i\delta,\ -3 + (i+1)\delta]$, and $\epsilon_i = 0$ otherwise. The total number of values, NVP, is given by $N \sum_{s=1}^{S} I_s$, where $N = 100$ is the number of neurons, $S = 1000$ is the number of test samples, and $I_s$ is the number of iterations corresponding to sample $s$.
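The measure in eq. (5) can be sketched in a few lines of NumPy. This is a minimal illustration under assumptions: the function name `spread_measure` is hypothetical, and values falling outside $[-3, 3]$ are counted in NVP but treated as visiting no segment, a boundary choice the text does not specify.

```python
import numpy as np

def spread_measure(values, n_segments, low=-3.0, high=3.0):
    """E = (number of segments visited) / (number of values produced)."""
    values = np.asarray(values, dtype=float).ravel()
    delta = (high - low) / n_segments
    # Assumption: values outside [low, high) visit no segment.
    inside = values[(values >= low) & (values < high)]
    segment_index = np.floor((inside - low) / delta).astype(np.int64)
    # Guard against rounding that could push an index to n_segments.
    segment_index = np.minimum(segment_index, n_segments - 1)
    visited = np.unique(segment_index).size
    return visited / values.size

# Evenly spread points: every value lands in its own segment.
n_segments = 10
centers = -3.0 + (np.arange(n_segments) + 0.5) * (6.0 / n_segments)
print("evenly spread:", spread_measure(centers, n_segments))

# Accumulated points: all values fall into a single segment.
print("accumulated:  ", spread_measure(np.full(50, 1.0), n_segments))
```

Because each value can visit at most one segment, the number of visited segments never exceeds NVP, which is why $E$ stays in $[0, 1]$. In a simulation, `values` would collect every $x_i$ produced over all neurons, samples, and iterations.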

The measure $E$ tends toward unity if the iterated points spread uniformly in the interval $[-3, 3]$. Conversely, it decreases toward zero if the points accumulate around particular values. The result is stable under variations of the number of intervals $W$ (for large $W$, of order NVP). Similar results are obtained for other values of the load parameter $\alpha$.

4. Conclusions

The capacity of associative memories with different types of nonmonotonic neurons was numerically investigated to determine the "optimal" shape of the neuron response function. It was shown that, starting from different analytical expressions, such as a piecewise linear function, the Morita function, and a Gaussian derivative function, capacity is maximized when the functions approximate a well-defined shape. The capacity enhancement is connected with the dynamical properties of the iterative map that describes the discrete network evolution. While in monotone models neurons are constrained to quasi-binary values close to $\pm 1$, in the nonmonotone models they can assume values in a wider range. We have shown that this property is directly related to the capacity increase. In fact, the capability of retrieving patterns is maximized when the dynamics of the individual neuron are confined in two wide but well-disconnected bands around $\pm 1$. The optimal shape of the neuron function corresponds to the case in which the two bands are uniformly covered by the iterated points.

References

[1] D. J. Amit, Modelling Brain Functions. Cambridge University Press, New York, 1989.
[2] J. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities". Proc. Natl. Acad. Sci. USA, Vol. 79, pp. 2554-2558, April 1982.
[3] I. Kanter and H. Sompolinsky, "Associative recall of memory without errors". Phys. Rev. A, Vol. 35, No. 1, pp. 380-392, 1987.
[4] L. F. Abbott and Y. Arian, "Storage capacity of generalized networks". Phys. Rev. A, Vol. 36, No. 10, pp. 5091-5094, 1987.
[5] S. Yoshizawa, M. Morita, and S. Amari, "Capacity of associative memory using a nonmonotonic neuron model". Neural Networks, Vol. 6, pp. 167-176, 1993.
[6] M. Morita, "Associative memory with nonmonotonic dynamics". Neural Networks, Vol. 6, pp. 115-126, 1993.
[7] H. Yanai and S. Amari, "Auto-associative memory with two-stage dynamics of nonmonotonic neurons". IEEE Transactions on Neural Networks, Vol. 7, No. 4, pp. 803-815, 1996.