THIRD-ORDER HOPFIELD NETWORKS: EXTENSIVE CALCULATIONS AND SIMULATIONS


Philips J. Res. 44, 501-519, 1990 R 1223
Philips Journal of Research Vol. 44 No. 5 1990

by J.A. SIRAT and D. JORAND
Laboratoires d'Electronique Philips, 3 avenue Descartes, B.P. 15, 94451 Limeil-Brévannes Cédex, France

Abstract
Third-order Hopfield nets have been extensively studied. Very good agreement was obtained between numerical simulations and theoretical calculations by random walk analysis. Properties similar to those of second-order nets were observed, including wide basins of attraction and resistance to synaptic dilution and quantization. However, third-order nets do not seem to exhibit a catastrophic deterioration of the memory. The principal difference between third- and second-order nets is quantitative: the storage capacity corresponds to the number of synaptic coefficients. Two examples of rotationally and translationally invariant character recognition on small networks are given. For this task, we derived a simple learning rule from analytical arguments on Hebb's law, which is suitable for biased patterns.

Keywords: associative memory, higher-order networks, Hopfield networks, neural networks, rotational invariance, translational invariance

1. Introduction
Recently, Hopfield has demonstrated that formal neurons can perform associative memory tasks (ref. 1). Their storage capacity was calculated using the replica method developed in statistical physics (ref. 2). A simpler formalism, random walk analysis of the first iteration, can however predict the shape of the basins of attraction in second-order nets (ref. 3) and the storage capacity for different learning rules (ref. 4). This capacity rises when the interaction order is increased, as shown by theoretical calculations (spin glass or random walk approaches) and numerical simulations (refs 5-10). Higher-order nets were also proposed as invariant associative memories with respect to any set of transformations (ref. 11).

We present an extensive random walk analysis and numerical simulation of third-order Hopfield networks with the Hebbian learning rule. In the first part, the storage capacity under various conditions is studied, and the information contents of different Hopfield nets are compared. In the second part, analytical considerations lead to a learning rule suited to biased patterns. Application to translationally and rotationally invariant character recognition is given as an illustration. Finally, the invariant property of the memory is discussed in terms of memory capacity.

2. General properties of third-order Hopfield networks

2.1. Network and learning rule
As in the second-order model of associative memory proposed by Hopfield, we consider a network of N bipolar neurons. The state of neuron i is represented by $\sigma_i = \pm 1$. Triplets of neurons (i, j and k) interact through the synaptic coefficient $C_{ijk}$, which is computed in the learning stage. Once learning is complete, the network can be used as a content-addressable memory. The capacity is defined as the maximal number of patterns that can be stored. Retrieval requires a sequence of steps (iterations) which may converge towards the desired state of the network, generally one of the prescribed patterns. In the retrieval step t, neuron states are updated according to the sign of the local field $h_i$:

$$ \sigma_i(t+1) = \mathrm{sign}[h_i(t)] \tag{1} $$

and

$$ h_i(t) = \sideset{}{'}\sum_{j,k=1}^{N} C_{ijk}\, \sigma_j(t)\, \sigma_k(t) \tag{2} $$

where $\sum'_{j,k=1}$ denotes the sum over j and k excluding j = i, k = i or j = k. In this paper, we only use synchronous dynamics, i.e. the N neurons are updated simultaneously in one step (or iteration). To memorize a given set of P patterns $(\xi_i^\mu)_{\mu=1,P}$, where $\mu$ denotes the pattern index and i the neuron, we use the generalized Hebb rule:

$$ C_{ijk} = \frac{1}{N^2} \sum_{\mu=1}^{P} \xi_i^\mu \xi_j^\mu \xi_k^\mu \tag{3} $$
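The learning rule of Eq. (3) and the synchronous update of Eqs. (1) and (2) can be sketched in a few lines. This is only an illustration using the Python standard library: the function names, the small network size (N = 24, P = 3), the fixed random seed and the convention that a zero field maps to +1 are all assumptions made here, not choices taken from the paper.

```python
import random

def hebb_third_order(patterns):
    """Generalized Hebb rule of Eq. (3): C[i][j][k] = (1/N^2) * sum_mu xi_i xi_j xi_k."""
    n = len(patterns[0])
    return [[[sum(p[i] * p[j] * p[k] for p in patterns) / n**2
              for k in range(n)] for j in range(n)] for i in range(n)]

def local_field(C, sigma, i):
    """Local field of Eq. (2); the primed sum skips j == i, k == i and j == k."""
    n = len(sigma)
    return sum(C[i][j][k] * sigma[j] * sigma[k]
               for j in range(n) for k in range(n)
               if j != i and k != i and j != k)

def synchronous_step(C, sigma):
    """Eq. (1): all neurons are updated at once (assumption: a zero field maps to +1)."""
    return [1 if local_field(C, sigma, i) >= 0 else -1 for i in range(len(sigma))]

def random_pattern(n, rng):
    """Unbiased random bipolar pattern."""
    return [rng.choice((-1, 1)) for _ in range(n)]

rng = random.Random(0)               # fixed seed, for reproducibility only
N, P = 24, 3                         # illustrative sizes; alpha = P / N^2 is well below 0.04
patterns = [random_pattern(N, rng) for _ in range(P)]
C = hebb_third_order(patterns)

noisy = list(patterns[0])
noisy[0] = -noisy[0]                 # flip one neuron of the first stored pattern

print("stored pattern is a fixed point:", synchronous_step(C, patterns[0]) == patterns[0])
print("noisy pattern restored in one step:", synchronous_step(C, noisy) == patterns[0])
```

Storing the full array of N^3 coefficients keeps the sketch close to Eq. (2), at the cost of memory; for larger N one would exploit the symmetry of the coefficients or an array library.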

2.2. Random walk analysis of the first iteration
Under the assumption that the $\xi_i^\mu$ are PN independent random variables with zero average, the probability that a given pattern is stable after one iteration can easily be computed (ref. 4). Let us first determine the probability distribution of the local field $h_i$ obtained when presenting a memorized pattern $\xi^\nu$:

$$ h_i(t) = \xi_i^\nu + \frac{1}{N^2} \sum_{\mu \neq \nu} \xi_i^\mu \sum_{j,k=1}^{N} \xi_j^\mu \xi_k^\mu \xi_j^\nu \xi_k^\nu \tag{4} $$

The second term represents the zero-mean noise due to overlapping with the other patterns; $h_i$ is hence a Gaussian variable of mean

$$ \bar h_i = \xi_i^\nu \tag{5} $$

The variance $\Delta$ is obtained by averaging the square of the noise term over the possible sets of patterns $\xi_i^\mu$:

$$ \Delta = \frac{1}{N^4} \Big\langle \sum_{\mu \neq \nu} \sum_{\mu' \neq \nu} \sum_{j,k=1}^{N} \sum_{j',k'=1}^{N} \xi_i^\mu \xi_i^{\mu'} \, \xi_j^\mu \xi_{j'}^{\mu'} \, \xi_k^\mu \xi_{k'}^{\mu'} \, \xi_j^\nu \xi_{j'}^\nu \, \xi_k^\nu \xi_{k'}^\nu \Big\rangle \tag{6} $$

Only terms with $\mu = \mu'$ and ((j, k) = (j', k') or (j, k) = (k', j')) do not vanish, and

$$ \Delta = \frac{2P}{N^2} \tag{7} $$

These formulae and those following are first-order approximations in 1/N and 1/P. The factor 2 comes from the j-k permutation symmetry. The probability $P_i$ that the state of neuron i is not equal to $\xi_i^\nu$ after one iteration is equal to the probability that $h_i \xi_i^\nu < 0$:

$$ P_i = \frac{1}{\sqrt{2\pi\Delta}} \int_{-\infty}^{0} \exp\left[-\frac{(z - \bar h_i \xi_i^\nu)^2}{2\Delta}\right] dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-|\bar h_i|/\sqrt{\Delta}} \exp\left(-\frac{z^2}{2}\right) dz \tag{8} $$

With the approximation that the $P_i$ are independent for the different i, the condition for retrieval with less than $\varepsilon N$ neurons in the wrong state is

$$ \sum_{i=1}^{N} P_i \leq \varepsilon N \tag{9} $$

For perfect retrieval, we take $\varepsilon N = 1$.
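As a numerical illustration of Eqs. (7) to (9), the sketch below evaluates the Gaussian tail of Eq. (8) for unit mean and variance 2α, with α = P/N², and solves the single-pattern condition of Eq. (9) for α by bisection. The function names and the bisection bounds are assumptions made for this example; the criterion implemented is exactly the one written in Eq. (9), and stricter criteria (for instance over all patterns at once) would give smaller values.

```python
import math

def error_probability(alpha):
    """Eq. (8) with mean 1 and variance Delta = 2 * alpha (Eq. (7)).

    The Gaussian tail beyond 1/sqrt(Delta) equals 0.5 * erfc(1 / (2 * sqrt(alpha))).
    """
    return 0.5 * math.erfc(1.0 / (2.0 * math.sqrt(alpha)))

def capacity(n, eps_n=1.0, lo=1e-6, hi=10.0):
    """Largest alpha such that n * P_i(alpha) <= eps_n, i.e. Eq. (9) with equal P_i.

    Bisection is valid because the error probability grows monotonically with alpha.
    The bounds lo and hi are illustrative and assume eps_n is small compared with n.
    """
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if n * error_probability(mid) <= eps_n:
            lo = mid
        else:
            hi = mid
    return lo

for n in (10, 20, 30, 40, 50):
    print(n, round(capacity(n), 4))
```

The same two functions cover the other rows of Table I once the variance-to-mean-square ratio 2α is replaced by the corresponding entry.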

When the retrieval conditions or the network parameters are modified, the mean value and variance of $h_i$ after one iteration can also be computed (see Table I):

- Exact capacity: each learnt pattern is stable without error;
- Noisy exact capacity: retrieval without errors of each pattern presented with noise;
- ε capacity: a learnt pattern yields a state at a Hamming distance less than εN from the original pattern (refs 3, 12);
- Exact capacity in diluted networks: synaptic coefficients are randomly suppressed with probability τ;
- Exact capacity with clipped synapses: synaptic coefficients take only one of the two values +1 or -1, depending on sign($C_{ijk}$).

The distribution parameters of the local field $h_i$ are easily calculated for the different cases. They are displayed in Table I.

TABLE I
Distribution parameters of $h_i$ in the different cases.

| Case | Mean $\bar h_i \xi_i^\nu$ | Variance $\Delta$ | $\Delta / \bar h_i^2$ |
|---|---|---|---|
| Exact capacity | $1$ | $2P/N^2$ | $2P/N^2$ |
| Noisy exact capacity (noise r) | $(1-2r)^2$ | $2P/N^2$ | $2P/[N^2 (1-2r)^4]$ |
| Exact capacity in a diluted network (dilution τ) | $1-\tau$ | $(1-\tau)(2-\tau)P/N^2$ | $(2-\tau)P/[(1-\tau)N^2]$ |
| Exact capacity with clipped synapses | $\sqrt{2/(\pi P)}$ | $2/N^2$ | $\pi P/N^2$ |

2.3. Numerical simulations and results
We have performed extensive numerical simulations on an IBM 3090 computer. Calculations were carried out with integer-type synaptic coefficients

(except in the last section). Results were averaged over 20 runs each time. Two running modes were studied: in the first mode a single iteration was performed, and in the second mode successive iterations were performed until convergence (maximum of 20 iterations).

Fig. 1. Exact capacity α = P/N² vs the network size N (theory and simulations).

2.3.1. Exact capacity
This takes the form P = αN² with α of order 0.04 (for N ranging from 10 to 50 only a slight dependence on N was observed). The agreement with the statistical physics result (refs 5, 7, 13)

$$ P = \frac{N^{d-1}}{2\, d!\, \ln N}, $$

where d is the network order, is consistent with the fact that no significant iteration effect was observed, as already noted for second-order nets (ref. 4). More precisely, a single iteration allowed convergence towards the stable patterns in almost all the runs (fig. 1).

2.3.2. ε capacity
When the state after one iteration and the learnt pattern are allowed to differ by less than εN bits, α increases slightly (δα/α = 0.5 for ε = 0.1 with N = 40). When further iterations are allowed, errors are amplified and the capacity increase is partially suppressed. This means that the extra learnt

patterns do not yield supplementary stable states of the network dynamics (fig. 2).

Fig. 2. ε capacity α = P/N² vs ε, the maximum authorized fraction of wrong neurons in the output patterns (simulations, N = 40).

2.3.3. Non-catastrophic deterioration of the memory?
When α = P/N² rises above α_c, the mean proportion of inverted neurons ⟨δ⟩ (averaged over the learnt patterns) grows linearly with α. The slope is small and varies weakly with N. This result has been confirmed by a study of the basins of attraction, i.e. final Hamming distance vs initial Hamming distance, for α > α_c (fig. 3). This contrasts with second-order networks: similar curves are then obtained where the slope diverges as N goes to infinity (see Appendix 1), leading to the catastrophic deterioration of the memory (ref. 2).

2.3.4. Basins of attraction
When a noisy pattern (neurons of the original pattern are inverted with probability r) is presented, the exact retrieval capacity is not significantly altered for r < 20% (fig. 4a)). In this case, running a few iterations (< 5) yields a greater capacity, which means that the prescribed patterns are stable states of the dynamics (fig. 4b)).

2.3.5. Dilution
Dilution weakly reduces the storage capacity for perfect retrieval (fig. 5a)). To evaluate the storage efficiency, we plot the stored information divided by

the number of synaptic coefficients available vs the dilution τ (fig. 5b)). The former increases by a factor of order 4 at strong dilutions. A similar effect was observed in second-order asymmetrically diluted networks in the limit of infinite dilution (ref. 14). We confirmed that symmetric dilution, i.e. cancelling simultaneously $C_{ijk}$ and $C_{ikj}$, does not modify the result qualitatively.

Fig. 3. Non-catastrophic deterioration of the capacity in third-order nets. The mean Hamming distance ⟨δ⟩ is plotted vs α = P/N² (simulations, N = 30 and N = 50).

2.3.6. Synapse clipping
When the synaptic coefficients are restricted to one of the two values +1 and -1, the theoretical storage capacity is only reduced by a factor of π/2 (see Appendix 2 for a detailed calculation). Numerical simulations agree perfectly with this estimate. This result is of great interest for hardware implementations of such memories: they only require 1-bit precision for efficient storage (fig. 6).

2.4. Discussion

2.4.1. Comparing theoretical and numerical results: weak effect of the iterations
As for second-order networks (ref. 4), a single iteration after initialization with a given pattern allows the stable state to be reached in most cases. Therefore, for many properties we can model a Hopfield network as a set of single-layer perceptrons of the corresponding order. This makes the random walk analysis a powerful (and simple) tool for predicting the different properties

of Hopfield nets of any order, as shown by the very good agreement between numerical simulations and theoretical results. The slight discrepancy (by a factor of order 1.5) is probably due to the approximation that the fields $h_i$ are independent variables.

Fig. 4. Exact capacity α = P/N² for noisy input patterns vs r, the probability for flipping one neuron (N = 40): a) comparison of theory and simulation for one iteration and b) iteration effect in the simulations (one iteration vs convergence).

Some properties, however, cannot be described using this approach: ε capacity, precise shape of the basins of attraction, and deterioration of the memory beyond the storage capacity. For those aspects,

the iteration effect is predominant and other theoretical tools would be required (ref. 13).

Fig. 5. a) Exact capacity α = P/N² in a diluted network vs the dilution τ, the proportion of cancelled synaptic coefficients (simulations, N = 40, asymmetric and symmetric dilution), and b) α* = P/[(1-τ)N²] = α/(1-τ) vs τ (simulations, N = 40, asymmetric dilution).

2.4.2. Comparing second- and third-order networks
As already noted by several authors on partial aspects (see the references on higher-order nets mentioned in Sec. 1), the results of this section show that the properties of second- and third-order nets are qualitatively identical (for second-order nets see for example ref. 3). However, the catastrophic

deterioration of the memory seems to disappear at orders higher than 2. To the best of our knowledge, this remains an open problem.

Fig. 6. Exact capacity α = P/N² vs the network size N for quantized synaptic coefficients $C_{ijk} = \pm 1$ (theory and simulations).

Fig. 7. Information stored (PN bits) vs the number of synaptic coefficients for second- and third-order Hopfield nets (simulations).

The quantitative difference arises from the greater number of synaptic coefficients, N³. This has already been noted in theoretical calculations (ref. 5) and is shown by numerical curves for d = 2 and d = 3 in diluted or intact nets (fig. 7): the stored information (PN bits) is proportional to the memory size used for the synaptic coefficients

($N^d/d!$). Their ratio q gives the yield of the Hopfield memory. This can be precisely evaluated in the case of clipped synapses, and when taking into account the redundancy of the synaptic coefficients:

$$ q = \frac{\alpha\, d!}{\pi/2} \tag{10} $$

In our case q = 0.14 × 2/(π/2) = 0.18 for second order, and q = 0.04 × 6/(π/2) = 0.15 for third order.

2.4.3. Improving the storage capacity
Two ways are proposed to enhance the storage capacity. In still higher-order networks the number of stored patterns grows as $N^{d-1}$ (d is the order of the network) at the cost of a greater number of synaptic coefficients ($N^d$). Moreover, we expect from simple considerations that the quality of retrieval will improve when d grows to infinity. The field value obtained when presenting a pattern a at the input is

$$ h_i = \sum_{\mu=1}^{P} \xi_i^\mu \,(\xi^\mu \cdot a)^{d-1} \tag{11} $$

where

$$ \xi^\mu \cdot a = \frac{1}{N} \sum_{j=1}^{N} \xi_j^\mu a_j \tag{12} $$

When d goes to infinity, we can approximate $h_i$ by the leading term in the sum:

$$ h_i^{nn} = \xi_i^\nu \,(\xi^\nu \cdot a)^{d-1} \tag{13} $$

where the pattern ν maximizes the quantity $|\xi^\nu \cdot a|$. $h_i^{nn}$ may be viewed as a 'nearest neighbour' field.
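The dominance of the leading term at high order can be checked numerically. The sketch below compares the full field of Eq. (11) with the nearest-neighbour field of Eq. (13) for increasing d. It assumes an overlap normalized by 1/N, and the function names, network size, number of patterns and noise level are illustrative choices, not values from the paper.

```python
import random

def overlap(pattern, state):
    """Overlap (1/N) * sum_j xi_j a_j; the 1/N normalization is an assumption."""
    return sum(x * a for x, a in zip(pattern, state)) / len(state)

def field(patterns, state, d):
    """Order-d field of Eq. (11): h_i = sum_mu xi_i^mu * (xi^mu . a)^(d-1)."""
    weights = [overlap(p, state) ** (d - 1) for p in patterns]
    return [sum(w * p[i] for w, p in zip(weights, patterns))
            for i in range(len(state))]

def nearest_neighbour_field(patterns, state, d):
    """Leading term of Eq. (13): keep only the pattern with the largest |overlap|."""
    best = max(patterns, key=lambda p: abs(overlap(p, state)))
    w = overlap(best, state) ** (d - 1)
    return [w * x for x in best]

def relative_gap(patterns, state, d):
    """Largest deviation of the full field from its leading term, relative to that term."""
    full = field(patterns, state, d)
    lead = nearest_neighbour_field(patterns, state, d)
    scale = max(abs(x) for x in lead)
    return max(abs(f - l) for f, l in zip(full, lead)) / scale

def sign(x):
    """Sign convention used here: a zero field maps to +1."""
    return 1 if x >= 0 else -1

rng = random.Random(0)               # fixed seed, for reproducibility only
N, P = 30, 10                        # illustrative sizes
patterns = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(P)]
state = [-x if i < 3 else x for i, x in enumerate(patterns[0])]  # 3 flipped neurons

for d in (2, 3, 5, 9, 17):
    print(d, relative_gap(patterns, state, d))
```

Because each term is weighted by a power of its overlap, the contribution of every pattern other than the closest one shrinks geometrically with d, which is the sense in which the high-order field acts as a nearest-neighbour classifier.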

Iterative algorithms already developed for second-order nets (refs 15, 16) should give optimal performances, as shown by preliminary numerical simulations (capacity multiplied by a factor of order 10).

3. Application to invariant pattern recognition

3.1. Collapse of Hebb's rule for biased patterns
In the first part, the Hebbian rule was used for efficient storage of zero-mean patterns. However, in many cases of interest, patterns are biased, leading to memory failure. We will consider patterns $(\xi_i^\mu)_{i=1,N}$, where the $\xi_i^\mu$ are PN independent random variables, with a mean value for each neuron given by (14). With the standard Hebb rule, the mean field value becomes (15), where (16) and $\langle\ \rangle_\mu$ indicates averaging over pattern μ. Therefore the recovery of 'minority' neurons is hindered by the modification of $h_i$. The network would then converge towards the mean pattern $\bar\xi$ with high probability. Moreover, the variance is enhanced as follows: (17). When the mean pattern $\bar\xi$ is not equal to zero, this dramatic noise leads to memory breakdown.

3.2. Modified Hebb rule for biased patterns
To overcome this problem, we propose a simple generalization of the Hebbian rule which is naturally derived from the random walk analysis. First,

we use centred variables

$$ \tilde\xi_i^\mu = \xi_i^\mu - \bar\xi_i \tag{18} $$

The synaptic coefficients become

$$ C_{ijk} = \frac{1}{N^2} \sum_{\mu=1}^{P} \tilde\xi_i^\mu \tilde\xi_j^\mu \tilde\xi_k^\mu \tag{19} $$

and we add the thresholds

$$ \theta_i = \frac{1}{P} \sum_{\mu=1}^{P} \bar\xi_i \left[ \frac{1}{N} \sum_{j=1}^{N} (\tilde\xi_j^\mu)^2 \right]^2 \tag{20} $$

to correct the local fields. Neuron updating is performed as follows:

$$ \sigma_i(t+1) = \mathrm{sign}\Big[ \sum_{j,k=1}^{N} C_{ijk}\, (\sigma_j(t) - \bar\xi_j)(\sigma_k(t) - \bar\xi_k) + \theta_i \Big] \tag{21} $$

Keeping in mind that the $\tilde\xi_i^\mu$ are independent random variables with (22), we obtain (23), which is proportional to α = P/N² as in the non-correlated case. This also depends on the mean contrast of neuron i (first bracket), and on the relative total contrast of pattern μ (last bracket).

3.3. Application to invariant character recognition
Higher-order nets were suggested for the performance of invariant pattern retrieval by systematically learning the transformed patterns (ref. 11). The required capacity is then PT, where P is the number of basic patterns and T the number of transformations. T is often of order N, so a higher-order network is needed. Moreover, in the case of structured patterns (images, characters, ...), strong

biases may appear; the standard Hebb rule is then inefficient. We used the modified rule (defined earlier) in a heteroassociative version: when a transformed pattern is presented to the network, it has to yield the original (non-transformed) pattern. As the output patterns are associated with themselves, the network can run many iterations to improve convergence (e.g. in the case of noisy patterns). For more details on the network parameters, see Appendix 3.

Fig. 8. Examples of translationally invariant character recognition (three characters, ten translations with periodic boundary conditions). The original non-translated patterns are obtained after a few iterations (of order 3).

In the first illustration (fig. 8), we take 5 × 5 characters (X, O and R) on a 10 × 5 white background. The network learns the patterns with ten horizontal translations (periodic boundary conditions). All the learnt characters are correctly retrieved, even with a little noise (10% wrong neurons) and clipped synapses ($C_{ijk} = \pm 1$, $\theta_i$ is normalized by $\langle |C_{ijk}| \rangle_{ijk}$). The information storage efficiency is evaluated by the ratio R of stored information (3 patterns × 50 neurons × 10 translations) to the number of synaptic coefficients (50³): R = 1/80. This corresponds to α = 0.012 < α_c = 0.04. In the second illustration (fig. 9), two 5 × 5 characters (T, R) could be translated and rotated on a 6 × 6 white background (4 × 4 transformations). Retrieval was as satisfactory as in the first case. In both cases, we confirmed that the simple Hebbian rule does not work (the mean state values $\langle \bar\xi_i \rangle_i$ were respectively 0.5 and 0.2).

3.4. Comments
This limited application shows that the principal advantage provided by the third-order net is a larger storage capacity: P ~ N² instead of P ~ N. The space complexity, i.e. the memory size needed, can be reduced when considering the intrinsic symmetry of the synaptic coefficients. However, the

time complexity, i.e. the number of arithmetic operations performed in the retrieval stage, grows as the total number of synapses. Therefore the computational time for invariant recognition by a Hopfield network is comparable with the time needed to test recognition successively on the different transforms of the pattern. To improve the storage capacity, still higher-order nets would be needed. However, generalization of iterative algorithms already tested on second-order nets (refs 15, 16) should give optimal performances, as shown by preliminary numerical simulations (capacity multiplied by a factor of order 10).

Fig. 9. Examples of translationally and rotationally invariant character recognition (2 characters, 4 translations × 4 rotations). The original non-transformed patterns are obtained after a few iterations (of order 3).

Acknowledgements
D.J. stayed at the Laboratoires d'Electronique et de Physique Appliquée during his 'stage d'option' of the Ecole Polytechnique, Palaiseau, France, which was supervised by P. Weinfeld from PNHE, Ecole Polytechnique. We are grateful to P. Peretto and M. Gordon from CEN, Grenoble, France, to B. Derrida from CEN, Saclay, France, and to P. Weisbuch from ENS, Paris, for stimulating discussions. We particularly thank S. Makram from LEP for attentive re-reading of the manuscript.

Appendix 1: catastrophic deterioration of the memory in finite-size second-order networks
We have performed numerical simulations on second-order Hopfield nets for different sizes: N = 100, 200, 300, 400. Figure 10 shows the normalized mean error ⟨δ⟩ after convergence (averaged over the learnt patterns) vs the number of learnt patterns P. For small N, ⟨δ⟩ varies linearly with P (finite

size effect). The slope grows to infinity with N and the curve approaches the replica calculation result (ref. 2): catastrophic deterioration of the memory. This has a second origin: iterations propagate and amplify errors, as seen on the curve for iteration 1, which does not vary with size (theory and simulations). This last phenomenon had already been noted by several authors (ref. 13).

Fig. 10. a) Normalized mean error on output patterns ⟨δ⟩ vs α = P/N in second-order nets for different sizes N (simulations for N = 100, 200 and 400, with iterations continuing until convergence), compared with the theory for N = ∞ (Amit et al., 1985; α_c = 0.139...), and b) same plot with a single iteration (simulations and theory, this work).
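The kind of experiment described in this appendix can be sketched for a small second-order net. The code below stores P random patterns with the standard Hebb rule, starts from each stored pattern, iterates synchronously, and reports the mean fraction of wrong neurons as a function of α = P/N. The network size (N = 60, far smaller than the sizes used in the paper), the cap of 20 iterations, the removal of self-coupling, the seed and all function names are assumptions made for illustration.

```python
import random

def hebb_weights(patterns):
    """Second-order Hebb rule W[i][j] = (1/N) * sum_mu xi_i xi_j, without self-coupling."""
    n = len(patterns[0])
    return [[0.0 if i == j else sum(p[i] * p[j] for p in patterns) / n
             for j in range(n)] for i in range(n)]

def run_to_convergence(W, state, max_iter=20):
    """Synchronous updates until a fixed point is reached or max_iter is exhausted."""
    for _ in range(max_iter):
        new = [1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1 for row in W]
        if new == state:
            break
        state = new
    return state

def mean_error(n, p, rng, max_iter=20):
    """Fraction of wrong neurons after retrieval, averaged over the p learnt patterns."""
    patterns = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(p)]
    W = hebb_weights(patterns)
    wrong = 0
    for pat in patterns:
        final = run_to_convergence(W, list(pat), max_iter)
        wrong += sum(a != b for a, b in zip(final, pat))
    return wrong / (n * p)

rng = random.Random(1)               # fixed seed, for reproducibility only
N = 60                               # illustrative size, smaller than in the paper
for P in (3, 9, 18, 30):
    print("alpha =", P / N, " mean error =", mean_error(N, P, rng))
```

Calling `mean_error` with `max_iter=1` gives the single-iteration variant, which can be compared with the converged result to see how much of the error comes from the iterations themselves.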

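Appendix 2: local field mean value and variance for clipped synapses
When presenting the pattern ν at the input, the field value is

$$ h_i = \frac{\xi_i^\nu}{N^2} \sideset{}{'}\sum_{j,k=1}^{N} q_{jk} \tag{24} $$

with

$$ q_{jk} = \sum_{\mu=1}^{P} \xi_i^\mu \xi_i^\nu \, \xi_j^\mu \xi_j^\nu \, \xi_k^\mu \xi_k^\nu $$

When clipping the synaptic coefficients, i.e. when replacing $C_{ijk}$ by sign($C_{ijk}$), the $q_{jk}$ are replaced by sign($q_{jk}$). As the $q_{jk}$ for j < k are random independent Gaussian variables of mean 1 and variance P, and $q_{jk} = q_{kj}$, we can compute the distribution of the clipped $q_{jk}$. The mean value is

$$ \overline{\mathrm{sign}(q_{jk})} = -\int_{-\infty}^{0} \exp\left[-\frac{(q-1)^2}{2P}\right] \frac{dq}{\sqrt{2\pi P}} + \int_{0}^{\infty} \exp\left[-\frac{(q-1)^2}{2P}\right] \frac{dq}{\sqrt{2\pi P}} \tag{25} $$

or

$$ \overline{\mathrm{sign}(q_{jk})} = 2 \int_{0}^{1} \exp\left[-\frac{(q-1)^2}{2P}\right] \frac{dq}{\sqrt{2\pi P}} \approx \sqrt{\frac{2}{\pi P}} \tag{26} $$

As

$$ \overline{\mathrm{sign}(q_{jk})^2} - \left(\overline{\mathrm{sign}(q_{jk})}\right)^2 = 1 - \frac{2}{\pi P} \approx 1 \tag{27} $$

the 'clipped field' $h_i$ is a Gaussian variable of mean value $\sqrt{2/(\pi P)}$ and variance $2/N^2$ (the factor 2 arises from the j-k permutation symmetry). The effective α for clipped synapses is

$$ \alpha_{\mathrm{eff}} = \frac{\pi}{2} \frac{P}{N^2} \tag{28} $$

The loss of capacity by a factor π/2 is identical to that noted for second-order networks (ref. 17).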
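The two numerical statements of this appendix are easy to check: the mean of sign(q) for a Gaussian q of mean 1 and variance P, and the resulting factor π/2 in the variance-to-mean-square ratio of the field. In the sketch below the exact mean is written with the error function, erf(1/√(2P)), which follows from the Gaussian assumption; the function names and the sample values of P and N are illustrative.

```python
import math

def clipped_mean_exact(p):
    """Mean of sign(q) for q Gaussian with mean 1 and variance p: erf(1 / sqrt(2 p))."""
    return math.erf(1.0 / math.sqrt(2.0 * p))

def clipped_mean_approx(p):
    """Large-p approximation sqrt(2 / (pi p)) of the clipped mean."""
    return math.sqrt(2.0 / (math.pi * p))

def noise_to_signal(p, n, clipped):
    """Variance divided by squared mean of the local field (last column of Table I)."""
    if clipped:
        return (2.0 / n**2) / clipped_mean_approx(p) ** 2
    return 2.0 * p / n**2

for p in (10, 50, 200):
    print(p, clipped_mean_exact(p), clipped_mean_approx(p))

# Clipping multiplies the ratio by pi/2, whatever the values of p and n.
print(noise_to_signal(64, 40, True) / noise_to_signal(64, 40, False), math.pi / 2)
```

The approximation improves as P grows, since the relative correction to the error function is of order 1/P.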

Appendix 3: heteroassociative version of the modified Hebb rule
Several authors (refs 18, 19) have extended associative Hopfield memory to heteroassociation in second-order networks: the input pattern $X^\mu$ is associated with the output pattern $Y^\mu$. Input and output pattern sizes can be different. In the case of the standard Hebb rule in third-order nets we take

$$ C_{ijk} = \frac{1}{N^2} \sum_{\mu=1}^{P} Y_i^\mu X_j^\mu X_k^\mu \tag{29} $$

We generalize this for biased patterns:

$$ C_{ijk} = \frac{1}{N^2} \sum_{\mu=1}^{P} y_i^\mu x_j^\mu x_k^\mu \tag{30} $$

$$ \theta_i = \bar Y_i \left( \frac{1}{PN} \sum_{\mu=1}^{P} \sum_{j=1}^{N} (x_j^\mu)^2 \right)^2 \tag{31} $$

When a pattern a is presented, the field is

$$ h_i = \sum_{j,k=1}^{N} C_{ijk}\, (a_j - \bar X_j)(a_k - \bar X_k) + \theta_i \tag{32} $$

where $y_i$ and $x_j$ are the reduced variables $Y_i - \bar Y_i$ and $X_j - \bar X_j$. As shown for second-order nets, properties such as memory capacity and fault tolerance are similar for auto- and heteroassociation. The formalism that we have used in this paper works as well for heteroassociation.

REFERENCES
1) J.J. Hopfield, Proc. Natl. Acad. Sci. USA, 79, 2554 (1982).
2) D.J. Amit, H. Gutfreund and H. Sompolinsky, Phys. Rev. Lett., 55 (14), 1530 (1985).
3) R. McEliece, E.C. Posner, E.R. Rodemich and S.S. Venkatesh, IEEE Trans. Inf. Theory, 33 (4), 461 (1987).
4) M. Gordon, J. Phys. (Paris), 48, 2053 (1987).
5) P. Peretto and J.J. Niez, Biol. Cybern., 54 (1), 53 (1986).
6) A.D. Bruce, A. Canning, B. Forrest, E. Gardner and D.J. Wallace, Proc. of the AIP Conf., Snowbird, UT, 1986, pp. 65-70.
7) P. Baldi and S.S. Venkatesh, Phys. Rev. Lett., 58 (9), 913 (1987).
8) L. Personnaz, I. Guyon and G. Dreyfus, Europhys. Lett., 4 (8), 863 (1987).
9) D. Horn and M. Usher, J. Phys. (Paris), 49 (3), 389 (1988).
10) D. Psaltis, C.H. Park and J. Hong, Neural Netw., 1, 149 (1988).
11) C.L. Giles and T. Maxwell, Appl. Opt., 26 (23), 4972 (1987).
12) S.S. Venkatesh, Proc. of the AIP Conf., Snowbird, UT, 1986, pp. 440-445.

13) E. Gardner, J. Phys. A, 20 (11), 3453 (1987).
14) B. Derrida, E. Gardner and A. Zippelius, Europhys. Lett., 4 (2), 167 (1987).
15) W. Krauth and M. Mezard, J. Phys. A, 20, L745 (1987).
16) G. Poppel and U. Krey, Europhys. Lett., 4 (9), 979 (1987).
17) H. Sompolinsky, Phys. Rev. A, 34 (3), 2571 (1986).
18) E. Domany, R. Meir and W. Kinzel, Europhys. Lett., 2 (3), 175 (1986).
19) L. Personnaz, I. Guyon and G. Dreyfus, Phys. Rev. A, 34 (5), 4217 (1986).

Authors
J.A. Sirat: Ing. degree, Ecole Polytechnique, Palaiseau, France, 1983; Ph.D. (Nouvelle Thèse), Université Scientifique et Médicale de Grenoble, France, 1986; Laboratoires d'Electronique Philips, 1986-. His main topics of research have been, successively, low-temperature solid state physics (thesis) and neural networks.

D. Jorand: Ing. degree, Ecole Polytechnique, Palaiseau, France, 1988; Laboratoires d'Electronique Philips, 1988-. He is currently studying computer science at the Ecole Nationale Supérieure des Télécommunications, Paris, France, with financial support from the Laboratoires d'Electronique Philips. His main topics of interest are neural networks, parallel computation and high-level languages.