CHAPTER III Neural Networks as Associative Memory

Introduction

One of the primary functions of the brain is associative memory. We associate faces with names, letters with sounds, and we can recognize people even if they are wearing sunglasses or have grown somewhat older. In this chapter, first the basic definitions concerning associative memory are given, and then it is explained how neural networks can be made linear associators so as to perform as interpolative memory. Next it is explained how the Hopfield network can be used as autoassociative memory, and then the Bidirectional Associative Memory network, which is designed to operate as heteroassociative memory, is introduced.

3.1. Associative Memory

In an associative memory we store a set of patterns µ^k, k = 1..K, so that the network responds by producing whichever of the stored patterns most closely resembles the one presented to the network. Suppose that the stored patterns, which are called exemplars or memory elements, are in the form of pairs of associations, µ^k = (u^k, y^k), where u^k ∈ R^N, y^k ∈ R^M, k = 1..K. According to the mapping ϕ: R^N → R^M that they implement, we distinguish the following types of associative memories:

- Interpolative associative memory
- Accretive associative memory

3.1. Associative Memory: Interpolative Memory

In interpolative associative memory, when u = u^r is presented to the memory it responds by producing y^r of the stored association. However, if u differs from u^r by an amount ε, that is if u = u^r + ε is presented to the memory, then the response differs from y^r by some amount ε^r. Therefore in interpolative associative memory we have

    ϕ(u^r + ε) = y^r + ε^r  such that  ε → 0 ⇒ ε^r → 0,  r = 1..K        (3.1.1)

3.1. Associative Memory: Accretive Memory

In accretive associative memory, when u is presented to the memory it responds by producing y^r of the stored association such that u^r is the one closest to u among u^k, k = 1..K, that is,

    ϕ(u) = y^r  such that  ‖u^r − u‖ = min_k ‖u^k − u‖,  k = 1..K        (3.1.2)

3.1. Associative Memory: Heteroassociative-Autoassociative

The accretive associative memory in the form given above, in which u^k and y^k are different, is called heteroassociative memory. However, if the stored exemplars are in a special form such that the desired patterns and the input patterns are the same, that is y^k = u^k for k = 1..K, then it is called autoassociative memory. In such a memory, whenever u is presented to the memory it responds by u^r, which is the closest one to u among u^k, k = 1..K, that is,

    ϕ(u) = u^r  such that  ‖u^r − u‖ = min_k ‖u^k − u‖,  k = 1..K        (3.1.3)
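
The accretive mapping (3.1.2) is essentially a nearest-exemplar lookup. As a purely illustrative rendering of this definition (not code from the lecture notes; the function name accretive_recall and the toy patterns are invented for the example), a minimal NumPy sketch:

```python
import numpy as np

def accretive_recall(u, exemplars):
    """Accretive (heteroassociative) recall: return the stored output y^r
    whose input u^r is closest to the presented pattern u (Euclidean norm)."""
    inputs = np.array([uk for uk, _ in exemplars])          # K x N matrix of inputs u^k
    r = np.argmin(np.linalg.norm(inputs - u, axis=1))       # index of the nearest u^k
    return exemplars[r][1]                                  # corresponding y^r

# toy exemplar pairs (u^k, y^k); autoassociative memory is the special case y^k = u^k
exemplars = [(np.array([1., 0., 0.]), np.array([1., -1.])),
             (np.array([0., 1., 0.]), np.array([-1., 1.]))]
noisy = np.array([0.9, 0.1, -0.05])                         # u = u^1 + ε
print(accretive_recall(noisy, exemplars))                   # -> [ 1. -1.]
```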

3.1. Associative Memory

While interpolative memories can be implemented by using feed-forward neural networks, it is more appropriate to use recurrent networks as accretive memories. The advantage of using recurrent networks as associative memory is their convergence to one of a finite number of stable states when started at some initial state. The basic goals are:

- to be able to store as many exemplars as we need, each corresponding to a different stable state of the network,
- to have no other stable states,
- to have the stable state that the network converges to be the one closest to the applied pattern.

3.1. Associative Memory

The problems that we are faced with are:

- the capacity of the network is restricted, depending on the number and properties of the patterns to be stored,
- some of the exemplars may not be among the stable states,
- some spurious stable states different from the exemplars may arise by themselves,
- the converged stable state may be other than the one closest to the applied pattern.

3.1. Associative Memory

One way of using recurrent neural networks as associative memory is to fix the external input of the network and present the input pattern u^r to the system by setting x(0) = u^r. If we relax such a network, then it will converge to the attractor x* for which x(0) is within the basin of attraction, as explained in Chapter 2. If we are able to place each µ^k as an attractor of the network by proper choice of the connection weights, then we expect the network to relax to the attractor x* = µ^r that is related to the initial state x(0) = u^r. For a good performance of the network, we need the network to converge only to one of the stored patterns µ^k, k = 1..K.

3.1. Associative Memory

Unfortunately, some initial states may converge to spurious states, which are undesired attractors of the network representing none of the stored patterns. Spurious states may arise by themselves depending on the model used and the patterns stored. The capacity of neural associative memories is restricted by the size of the network: if we increase the number of stored patterns for a fixed-size neural network, spurious states arise inevitably. Sometimes the network may converge not to a spurious state, but to a memory pattern that is not the closest one to the pattern presented.

3.1. Associative Memory

What we expect for a feasible operation is that, at least for the memory patterns themselves, if any of the stored patterns is presented to the network by setting x(0) = µ^r, then the network should stay converged to x* = µ^r (Figure 3.1).

Figure 3.1. In associative memory each memory element is assigned to an attractor

3.1. Associative Memory

A second way to use recurrent networks as associative memory is to present the input pattern u^r to the system as an external input. This can be done by setting θ = u^r, where θ is the threshold vector whose ith component θ_i corresponds to the threshold of neuron i. After setting x(0) to some fixed value, we relax the network and then wait until it converges to an attractor x*. For a good performance of the network, we desire the network to have a single attractor x* = µ^r for each stored input pattern u^r, so that the network converges to this attractor independent of its initial state. Another solution to the problem is to have predetermined initial values, so that these initial values lie within the basin of attraction of µ^r whenever u^r is applied. We will consider this kind of network in more detail in Chapter 7, where we will examine how these recurrent networks are trained.

3.2. Linear Associators: Orthonormal patterns

It is quite easy to implement interpolative associative memory when the set of input memory elements {u^k} constitutes an orthonormal set of vectors, that is,

    u^iT u^j = 1 for i = j  and  u^iT u^j = 0 for i ≠ j        (3.2.1)

where T denotes transpose. By using the Kronecker delta, we write simply

    u^iT u^j = δ_ij        (3.2.2)

3.2. Linear Associators: Orthonormal patterns

The mapping function ϕ(u) defined below may be used to establish an interpolative associative memory:

    ϕ(u) = W^T u        (3.2.3)

where

    W = Σ_k u^k ⊗ y^k        (3.2.4)

Here the symbol ⊗ is used to denote the outer product of the vectors u ∈ R^N and y ∈ R^M, which is defined as

    u ⊗ y = u y^T = (y u^T)^T        (3.2.5)

resulting in a matrix of size N by M.

3.2. Linear Associators: Orthonormal patterns

By defining the matrices [Haykin 94]

    U = [u^1 u^2 .. u^k .. u^K]        (3.2.6)

and

    Y = [y^1 y^2 .. y^k .. y^K]        (3.2.7)

the weight matrix can be formulated as

    W^T = Y U^T        (3.2.8)

If the network is going to be used as autoassociative memory we have Y = U, so

    W^T = U U^T        (3.2.9)

3.2. Linear Associators: Orthonormal patterns

For a function ϕ(u) to constitute an interpolative associative memory, it should satisfy the condition

    ϕ(u^r) = y^r,  r = 1..K        (3.2.10)

We can check it simply as

    ϕ(u^r) = W^T u^r        (3.2.11)

which is

    W^T u^r = Y U^T u^r        (3.2.12)

3.2. Linear Associators: Orthonormal patterns

Since the set {u^k} is orthonormal, we have

    Y U^T u^r = Σ_k δ_kr y^k = y^r        (3.2.13)

which results in

    ϕ(u^r) = Y U^T u^r = y^r        (3.2.14)

as we desired.

3.2. Linear Associators: Orthonormal patterns

Remember:

    W^T u^r = Y U^T u^r        (3.2.12)
    Y U^T u^r = Σ_k δ_kr y^k = y^r        (3.2.13)

Furthermore, if an input pattern u = u^r + ε different from the stored patterns is applied as input to the network, we obtain

    ϕ(u) = W^T (u^r + ε) = W^T u^r + W^T ε        (3.2.15)

Using equations (3.2.12) and (3.2.13) results in

    ϕ(u) = y^r + W^T ε        (3.2.16)

Therefore, we have

    ϕ(u) = y^r + ε^r        (3.2.17)

in the required form, where

    ε^r = W^T ε        (3.2.18)

3.2. Linear Associators: Orthonormal patterns

Such a memory can be implemented by using M neurons, each having N inputs, as shown in Figure 3.2.

Figure 3.2. Linear Associator

3.2. Linear Associators: Orthonormal patterns

The connection weight vector of neuron i is assigned the value W_i, which is the ith column vector of the matrix W. Here each neuron has a linear output transfer function f(a) = a. When a stored pattern u^k is applied as input to the network, the desired value y^k is observed at the output of the network as

    x = W^T u^k        (3.2.19)
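
To illustrate equations (3.2.4)/(3.2.8) and the recall step (3.2.19), the following minimal NumPy sketch (not part of the original notes; the sizes and variable names are arbitrary choices) builds the weight matrix from an orthonormal input set and checks exact recall:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 8, 4, 3                       # input size, output size, number of stored pairs

# orthonormal input patterns u^k: take K columns of a random orthogonal matrix
U = np.linalg.qr(rng.standard_normal((N, N)))[0][:, :K]    # N x K, U^T U = I
Y = rng.standard_normal((M, K))                            # arbitrary outputs y^k

# weight matrix, eq. (3.2.8): W^T = Y U^T (equivalently W = sum_k u^k (y^k)^T)
Wt = Y @ U.T                                               # M x N

# recall, eq. (3.2.19): x = W^T u^k reproduces y^k exactly for orthonormal inputs
for k in range(K):
    assert np.allclose(Wt @ U[:, k], Y[:, k])

# a perturbed input u^r + eps gives y^r + W^T eps, eq. (3.2.16)
eps = 1e-3 * rng.standard_normal(N)
print(np.linalg.norm(Wt @ (U[:, 0] + eps) - Y[:, 0]))      # small, of the order of |eps|
```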

3.2. Linear Associators: General case

Until now, we have investigated the use of the linear mapping W^T = Y U^T as associative memory, which works well when the input patterns are orthonormal. When the input patterns are not orthonormal, the linear associator cannot map some input patterns to the desired output patterns without error. In the following we investigate the conditions necessary to minimize the output error for the exemplar patterns.

3.2. Linear Associators: General case

Remember:

    U = [u^1 u^2 .. u^k .. u^K]        (3.2.6)
    Y = [y^1 y^2 .. y^k .. y^K]        (3.2.7)

Therefore, for a given set of exemplars µ^k = (u^k, y^k), u^k ∈ R^N, y^k ∈ R^M, k = 1..K, our purpose is to find a linear mapping A* among A: R^N → R^M such that

    A* = arg min_A Σ_k ‖y^k − A u^k‖        (3.2.20)

where ‖.‖ is chosen as the Euclidean norm. The problem may be reformulated by using the matrices U and Y [Haykin 94]:

    A* = arg min_A ‖Y − A U‖        (3.2.21)

3.2. Linear Associators: General case

The pseudoinverse method [Kohonen 76], based on least squares estimation, provides a solution to the problem in which A* is determined as

    A* = Y U^+        (3.2.22)

where U^+ is the pseudoinverse of U. The pseudoinverse U^+ is a matrix satisfying the condition

    U^+ U = I        (3.2.23)

where I is the identity matrix.

3.2. Linear Associators: General case

A perfect match is obtained by using A* = Y U^+, since

    A* U = Y U^+ U = Y        (3.2.24)

resulting in no error due to the fact that

    Y − A* U = 0        (3.2.25)

3.2. Linear Associators: Linearly Independent Patterns

Remember:

    U^+ U = I        (3.2.23)

In the case where the input patterns are linearly independent, that is, none of them can be obtained as a linear combination of the others, a matrix U^+ satisfying Eq. (3.2.23) can be obtained by applying the formula [Golub and Van Loan 89, Haykin 94]

    U^+ = (U^T U)^{-1} U^T        (3.2.26)

Notice that for the input patterns, which are the columns of the matrix U, to be linearly independent, the number of columns should not be more than the number of rows, that is K ≤ N; otherwise U^T U will be singular and no inverse will exist. The condition K ≤ N means that the number of entries constituting the patterns restricts the capacity of the memory: at most N patterns can be stored in such a memory.

3.2. Linear Associators: Linearly Independent Patterns

This memory can be implemented by a neural network for which W^T = Y U^+. The desired value y^k appears at the output of the network as x when u^k is applied as input to the network:

    x = W^T u^k        (3.2.27)

as explained previously.
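
A corresponding sketch for the general case, again only an illustration under the assumption that the K ≤ N input patterns are linearly independent, computes the pseudoinverse of (3.2.26) explicitly (np.linalg.pinv would give the same result here) and verifies error-free recall of the exemplars:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, K = 6, 3, 4                                   # K <= N so the columns can be independent

U = rng.standard_normal((N, K))                     # generic, hence linearly independent, inputs
Y = rng.standard_normal((M, K))                     # desired outputs

# pseudoinverse, eq. (3.2.26): U^+ = (U^T U)^{-1} U^T, valid when rank(U) = K
U_pinv = np.linalg.inv(U.T @ U) @ U.T               # K x N, satisfies U^+ U = I
assert np.allclose(U_pinv @ U, np.eye(K))

# weights, eq. (3.2.22): W^T = A* = Y U^+
Wt = Y @ U_pinv                                     # M x N

# exemplars are recalled without error, eqs. (3.2.24)-(3.2.25)
assert np.allclose(Wt @ U, Y)
print("max recall error:", np.abs(Wt @ U - Y).max())
```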

3.2. Linear Associators: Linearly Independent Patterns

Remember:

    U^+ = (U^T U)^{-1} U^T        (3.2.26)

Notice that for the special case of orthonormal patterns, which we examined previously in this section, we have

    U^T U = I        (3.2.28)

which results in a pseudoinverse of the form

    U^+ = U^T        (3.2.29)

and therefore

    W^T = Y U^+ = Y U^T        (3.2.30)

as we derived previously.

3.3. Hopfield Autoassociative Memory

In this section we will investigate how the Hopfield network can be used as autoassociative memory. For this purpose some modifications are made to the continuous Hopfield network so that it works in discrete state space and discrete time.

Figure 3.3. Hopfield Associative Memory

3.3. Hopfield Autoassociative Memory

Note that whenever the patterns to be stored in the Hopfield network are from the N-dimensional bipolar space constituting a hypercube, that is u^k ∈ {-1, 1}^N, k = 1..K, it is convenient to have the stable states of the network on the corners of the hypercube. If we let the output transfer function of the neurons in the network have very high gain, in the extreme case

    f(a) = lim_{κ→∞} tanh(κ a)        (3.3.1)

we obtain

    f(a) = sgn(a) = { +1 for a > 0;  0 for a = 0;  -1 for a < 0 }        (3.3.2)

3.3. Hopfield Autoassociative Memory

Furthermore, note that the second term of the energy function (which was given in Chapter 2 previously)

    E = -(1/2) Σ_i Σ_j w_ij x_i x_j + Σ_i (1/R_i) ∫_0^{x_i} f^{-1}(x) dx - Σ_i θ_i x_i        (3.3.3)

approaches zero. Therefore the stable states of the network correspond to the local minima of the function

    E = -(1/2) Σ_i Σ_j w_ij x_i x_j - Σ_i θ_i x_i        (3.3.4)

so that they lie on the corners of the hypercube, as explained previously.

3.3. Hopfield Autoassociative Memory

The discrete-time state excitation [Hopfield 82] of the network is provided in the following:

    x_i(k+1) = f(a_i(k)) = { +1 for a_i(k) > 0;  x_i(k) for a_i(k) = 0;  -1 for a_i(k) < 0 }        (3.3.5)

where a_i(k) is defined as before, that is,

    a_i(k) = Σ_j w_ij x_j(k) + θ_i        (3.3.6)

The processing elements of the network are updated one at a time, such that all of the processing elements are updated at the same average rate.

3.3. Hopfield Autoassociative Memory

Remember:

    U = [u^1 u^2 .. u^k .. u^K]        (3.2.6)
    Y = [y^1 y^2 .. y^k .. y^K]        (3.2.7)

For stability of the bipolar discrete Hopfield network, it is further required to have w_ii = 0 in addition to the symmetry constraint w_ij = w_ji. In order to use the discrete Hopfield network as autoassociative memory, its weights are fixed to

    W^T = U U^T        (3.3.8)

where U is the input pattern matrix as defined in Eq. (3.2.6), and then the w_ii are set to 0. Remember that in autoassociative memory we have Y = U, where Y is the matrix of desired output patterns as defined in Eq. (3.2.7).

3.3. Hopfield Autoassociative Memory

If all the states of the network are updated at once, then the next state of the system may be represented in the form

    x(k+1) = f(W^T x(k))        (3.3.9)

For the special case in which the exemplars are orthonormal, we have

    f(W^T u^r) = f(u^r) = u^r        (3.3.10)

which means that each exemplar is a stable state of the network. Whenever the initial state is set to one of the exemplars, the system remains there. However, if the initial state is set to some arbitrary input, then the network converges to one of the stored exemplars, depending on the basin of attraction in which x(0) lies.

3.3. Hopfield Autoassociative Memory

In general, however, the input patterns are not orthonormal, so there is no guarantee that each exemplar corresponds to a stable state. Therefore the problems that we mentioned in Section 3.1 arise. The capacity of the Hopfield network is less than 0.138·N patterns, where N is the number of units in the network [Lippmann 89]. It is shown in the lecture notes that the energy function always decreases as the states of the processing elements are changed one by one (asynchronous update).
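
To make the storage rule (3.3.8) and the asynchronous update (3.3.5)-(3.3.6) concrete, here is a minimal NumPy sketch of a discrete bipolar Hopfield memory. It is an illustration only (zero thresholds, random exemplars, helper names of my own choosing), not code from the course:

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 100, 5                                         # units and stored patterns (K well below 0.138*N)

U = rng.choice([-1, 1], size=(N, K)).astype(float)    # bipolar exemplars u^k

# outer-product storage, eq. (3.3.8): W = U U^T, then zero the diagonal (w_ii = 0)
W = U @ U.T
np.fill_diagonal(W, 0.0)

def recall(x0, theta=None, sweeps=20):
    """Asynchronous recall: update one unit at a time, eqs. (3.3.5)-(3.3.6)."""
    x = x0.copy()
    theta = np.zeros(N) if theta is None else theta
    for _ in range(sweeps):
        for i in rng.permutation(N):                  # one unit at a time, random order
            a = W[i] @ x + theta[i]
            if a != 0:
                x[i] = 1.0 if a > 0 else -1.0         # a == 0 leaves x_i unchanged
    return x

# flip 10% of the bits of an exemplar and let the network clean it up
noisy = U[:, 0].copy()
flip = rng.choice(N, size=N // 10, replace=False)
noisy[flip] *= -1
print("bits recovered:", int((recall(noisy) == U[:, 0]).sum()), "of", N)
```

With K well below 0.138·N the noisy probe is typically cleaned up; pushing K higher makes spurious states and convergence to the wrong exemplar increasingly common, as discussed above.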

3.4. Bi-directional Associative Memory

The Bi-directional Associative Memory (BAM) introduced in [Kosko 88] is a recurrent network (Figure 3.4) designed to work as heteroassociative memory [Nielsen 90].

Figure 3.4: Bi-directional Associative Memory

3.4. Bi-directional Associative Memory

The BAM network consists of two sets of neurons whose outputs are represented by the vectors x ∈ R^N and v ∈ R^M respectively, having activations defined by the pair of equations

    da_{x_i}/dt = -α_i a_{x_i} + Σ_{j=1..M} w_ij f(a_{v_j}) + θ_i,  i = 1..N        (3.4.1)

    da_{v_j}/dt = -β_j a_{v_j} + Σ_{i=1..N} w_ij f(a_{x_i}) + φ_j,  j = 1..M        (3.4.2)

where α_i, β_j, θ_i, φ_j are positive constants for i = 1..N, j = 1..M, f is the tanh function, and W = [w_ij] is any N×M real matrix.

3.4. Bi-directional Associative Memory

The stability of the BAM network can be proved easily by applying the Cohen-Grossberg theorem, by defining a state vector z ∈ R^{N+M} such that

    z_i = x_i for i = 1..N  and  z_{N+j} = v_j for j = 1..M        (3.4.3)

that is, z is obtained through the concatenation of x and v.

3.4. Bi-directional Associative Memory

Since the BAM is a special case of the network defined by the Cohen-Grossberg theorem, it has a Lyapunov energy function, as provided in the following:

    E(x, v) = -Σ_{i=1..N} Σ_{j=1..M} w_ij f(a_{x_i}) f(a_{v_j})
              + Σ_{i=1..N} α_i ∫_0^{a_{x_i}} a f'(a) da + Σ_{j=1..M} β_j ∫_0^{a_{v_j}} b f'(b) db
              - Σ_{i=1..N} θ_i f(a_{x_i}) - Σ_{j=1..M} φ_j f(a_{v_j})        (3.4.4)

3.4. Bi-directional Associative Memory

The discrete BAM model is defined in a manner similar to the discrete Hopfield network. The output functions are chosen to be f(a) = sgn(a) and the states are excited as

    x_i(k+1) = f(a_{x_i}(k)) = { +1 for a_{x_i}(k) > 0;  x_i(k) for a_{x_i}(k) = 0;  -1 for a_{x_i}(k) < 0 }        (3.4.5)

where

    a_{x_i} = Σ_{j=1..M} w_ij f(a_{v_j}) + θ_i,  i = 1..N        (3.4.6)

and

    v_j(k+1) = f(a_{v_j}(k)) = { +1 for a_{v_j}(k) > 0;  v_j(k) for a_{v_j}(k) = 0;  -1 for a_{v_j}(k) < 0 }        (3.4.7)

where

    a_{v_j} = Σ_{i=1..N} w_ij f(a_{x_i}) + φ_j,  j = 1..M        (3.4.8)

3.4. Bi-directional Associative Memory

In compact matrix notation this is simply

    x(k+1) = f(W v(k))        (3.4.9)

and

    v(k+1) = f(W^T x(k+1))        (3.4.10)

3.4. Bi-directional Associative Memory

In the discrete BAM, the energy function becomes

    E(x, v) = -Σ_{i=1..N} Σ_{j=1..M} w_ij f(a_{x_i}) f(a_{v_j}) - Σ_{i=1..N} θ_i f(a_{x_i}) - Σ_{j=1..M} φ_j f(a_{v_j})        (3.4.11)

satisfying the condition

    ΔE ≤ 0        (3.4.12)

which implies the stability of the system.

3.4. Bi-directional Associative Memory

The weights of the BAM are determined by the equation

    W^T = Y U^T        (3.4.13)

For the special case of orthonormal input and output patterns we have

    f(W^T u^r) = f(Y U^T u^r) = f(y^r) = y^r        (3.4.14)

and

    f(W y^r) = f(U Y^T y^r) = f(u^r) = u^r        (3.4.15)

indicating that the exemplars are stable states of the network.
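
The discrete BAM recall can be sketched in the same spirit. The following illustration (pattern sizes, the threshold helper and the stopping rule are my own choices, not from the notes) stores the pairs with W^T = Y U^T as in (3.4.13) and iterates (3.4.9)-(3.4.10) until a stable pair is reached:

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, K = 16, 10, 2

U = rng.choice([-1, 1], size=(N, K)).astype(float)    # input exemplars u^k (bipolar)
Y = rng.choice([-1, 1], size=(M, K)).astype(float)    # output exemplars y^k (bipolar)

W = U @ Y.T                                           # N x M, so that W^T = Y U^T as in (3.4.13)

def threshold(a, prev):
    """sgn-like rule of (3.4.5)/(3.4.7): +1 if a > 0, -1 if a < 0, keep previous state if a = 0."""
    out = prev.copy()
    out[a > 0] = 1.0
    out[a < 0] = -1.0
    return out

def bam_recall(x0, iters=20):
    """Discrete BAM iteration: x(k+1) = f(W v(k)), v(k+1) = f(W^T x(k+1))."""
    x = x0.copy()
    v = threshold(W.T @ x, np.ones(M))                # first pass from the x side to the v side
    for _ in range(iters):
        x_next = threshold(W @ v, x)
        v_next = threshold(W.T @ x_next, v)
        if np.array_equal(x_next, x) and np.array_equal(v_next, v):
            break                                     # (x, v) has reached a stable pair
        x, v = x_next, v_next
    return x, v

x_star, v_star = bam_recall(U[:, 0])                  # present u^1 on the x side
print("recovered y^1:", bool(np.array_equal(v_star, Y[:, 0])))
```

For pairs that are not orthonormal, the same caveats as for the Hopfield network apply, as noted in the closing paragraph below.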

3.4. Bi-directional Associative Memory

Whenever the initial state is set to one of the exemplars, the system remains there. For arbitrary initial states the network converges to one of the stored exemplars, depending on the basin of attraction in which x(0) lies. For input patterns that are not orthonormal, the network behaves as explained for the Hopfield network.