CHAPTER III Neural Networks as Associative Memory

Introduction

One of the primary functions of the brain is associative memory. We associate faces with names, letters with sounds, and we can recognize people even if they are wearing sunglasses or have grown somewhat older. In this chapter, first the basic definitions concerning associative memory are given, and then it is explained how neural networks can be made linear associators so as to perform as interpolative memory. Next it is explained how the Hopfield network can be used as autoassociative memory, and then the Bidirectional Associative Memory network, which is designed to operate as heteroassociative memory, is introduced.

3.1. Associative Memory

In an associative memory we store a set of patterns µ^k, k = 1..K, so that the network responds by producing whichever of the stored patterns most closely resembles the one presented to the network. Suppose that the stored patterns, which are called exemplars or memory elements, are in the form of pairs of associations, µ^k = (u^k, y^k), where u^k ∈ R^N, y^k ∈ R^M, k = 1..K. According to the mapping ϕ: R^N → R^M that they implement, we distinguish the following types of associative memories:

- Interpolative associative memory
- Accretive associative memory

3.1. Associative Memory: Interpolative Memory

In interpolative associative memory, when u = u^r is presented to the memory it responds by producing y^r of the stored association. However, if u differs from u^r by an amount ε, that is if u = u^r + ε is presented to the memory, then the response differs from y^r by some amount ε^r. Therefore in interpolative associative memory we have

    ϕ(u^r + ε) = y^r + ε^r  such that  ε → 0 ⇒ ε^r → 0,  r = 1..K        (3.1.1)

3.1. Associative Memory: Accretive Memory

In accretive associative memory, when u is presented to the memory it responds by producing y^r of the stored association such that u^r is the one closest to u among u^k, k = 1..K, that is,

    ϕ(u) = y^r  such that  ‖u^r − u‖ = min_k ‖u^k − u‖,  k = 1..K        (3.1.2)

3.1. Associative Memory: Heteroassociative-Autoassociative

The accretive associative memory in the form given above, in which u^k and y^k are different, is called heteroassociative memory. However, if the stored exemplars are in a special form such that the desired patterns and the input patterns are the same, that is y^k = u^k for k = 1..K, then it is called autoassociative memory. In such a memory, whenever u is presented to the memory it responds by u^r, which is the closest one to u among u^k, k = 1..K, that is,

    ϕ(u) = u^r  such that  ‖u^r − u‖ = min_k ‖u^k − u‖,  k = 1..K        (3.1.3)
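
The accretive mapping (3.1.2) is essentially a nearest-exemplar lookup. As a purely illustrative rendering of this definition (not code from the lecture notes; the function name accretive_recall and the toy patterns are invented for the example), a minimal NumPy sketch:

```python
import numpy as np

def accretive_recall(u, exemplars):
    """Accretive (heteroassociative) recall: return the stored output y^r
    whose input u^r is closest to the presented pattern u (Euclidean norm)."""
    inputs = np.array([uk for uk, _ in exemplars])          # K x N matrix of inputs u^k
    r = np.argmin(np.linalg.norm(inputs - u, axis=1))       # index of the nearest u^k
    return exemplars[r][1]                                  # corresponding y^r

# toy exemplar pairs (u^k, y^k); autoassociative memory is the special case y^k = u^k
exemplars = [(np.array([1., 0., 0.]), np.array([1., -1.])),
             (np.array([0., 1., 0.]), np.array([-1., 1.]))]
noisy = np.array([0.9, 0.1, -0.05])                         # u = u^1 + ε
print(accretive_recall(noisy, exemplars))                   # -> [ 1. -1.]
```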

3.1. Associative Memory

While interpolative memories can be implemented by using feed-forward neural networks, it is more appropriate to use recurrent networks as accretive memories. The advantage of using recurrent networks as associative memory is their convergence to one of a finite number of stable states when started at some initial state. The basic goals are:

- to be able to store as many exemplars as we need, each corresponding to a different stable state of the network,
- to have no other stable states,
- to have the stable state that the network converges to be the one closest to the applied pattern.

3.1. Associative Memory

The problems that we are faced with are:

- the capacity of the network is restricted, depending on the number and properties of the patterns to be stored,
- some of the exemplars may not be among the stable states,
- some spurious stable states different from the exemplars may arise by themselves,
- the converged stable state may be other than the one closest to the applied pattern.

3.1. Associative Memory

One way of using recurrent neural networks as associative memory is to fix the external input of the network and present the input pattern u^r to the system by setting x(0) = u^r. If we relax such a network, then it will converge to the attractor x* for which x(0) is within the basin of attraction, as explained in Chapter 2. If we are able to place each µ^k as an attractor of the network by proper choice of the connection weights, then we expect the network to relax to the attractor x* = µ^r that is related to the initial state x(0) = u^r. For a good performance of the network, we need the network to converge only to one of the stored patterns µ^k, k = 1..K.

3.1. Associative Memory

Unfortunately, some initial states may converge to spurious states, which are undesired attractors of the network representing none of the stored patterns. Spurious states may arise by themselves depending on the model used and the patterns stored. The capacity of neural associative memories is restricted by the size of the network: if we increase the number of stored patterns for a fixed-size neural network, spurious states arise inevitably. Sometimes the network may converge not to a spurious state, but to a memory pattern that is not the closest one to the pattern presented.

3.1. Associative Memory

What we expect for a feasible operation is that, at least for the memory patterns themselves, if any of the stored patterns is presented to the network by setting x(0) = µ^r, then the network should stay converged to x* = µ^r (Figure 3.1).

Figure 3.1. In associative memory each memory element is assigned to an attractor

3.1. Associative Memory

A second way to use recurrent networks as associative memory is to present the input pattern u^r to the system as an external input. This can be done by setting θ = u^r, where θ is the threshold vector whose ith component θ_i corresponds to the threshold of neuron i. After setting x(0) to some fixed value, we relax the network and then wait until it converges to an attractor x*. For a good performance of the network, we desire the network to have a single attractor x* = µ^r for each stored input pattern u^r, so that the network converges to this attractor independent of its initial state. Another solution to the problem is to have predetermined initial values, so that these initial values lie within the basin of attraction of µ^r whenever u^r is applied. We will consider this kind of network in more detail in Chapter 7, where we will examine how these recurrent networks are trained.

3.2. Linear Associators: Orthonormal patterns

It is quite easy to implement interpolative associative memory when the set of input memory elements {u^k} constitutes an orthonormal set of vectors, that is,

    u^iT u^j = 1 for i = j  and  u^iT u^j = 0 for i ≠ j        (3.2.1)

where T denotes transpose. By using the Kronecker delta, we write simply

    u^iT u^j = δ_ij        (3.2.2)

3.2. Linear Associators: Orthonormal patterns

The mapping function ϕ(u) defined below may be used to establish an interpolative associative memory:

    ϕ(u) = W^T u        (3.2.3)

where

    W = Σ_k u^k ⊗ y^k        (3.2.4)

Here the symbol ⊗ is used to denote the outer product of the vectors u ∈ R^N and y ∈ R^M, which is defined as

    u ⊗ y = u y^T = (y u^T)^T        (3.2.5)

resulting in a matrix of size N by M.

3.2. Linear Associators: Orthonormal patterns

By defining the matrices [Haykin 94]

    U = [u^1 u^2 .. u^k .. u^K]        (3.2.6)

and

    Y = [y^1 y^2 .. y^k .. y^K]        (3.2.7)

the weight matrix can be formulated as

    W^T = Y U^T        (3.2.8)

If the network is going to be used as autoassociative memory we have Y = U, so

    W^T = U U^T        (3.2.9)

3.2. Linear Associators: Orthonormal patterns

For a function ϕ(u) to constitute an interpolative associative memory, it should satisfy the condition

    ϕ(u^r) = y^r,  r = 1..K        (3.2.10)

We can check it simply as

    ϕ(u^r) = W^T u^r        (3.2.11)

which is

    W^T u^r = Y U^T u^r        (3.2.12)

3.2. Linear Associators: Orthonormal patterns

Since the set {u^k} is orthonormal, we have

    Y U^T u^r = Σ_k δ_kr y^k = y^r        (3.2.13)

which results in

    ϕ(u^r) = Y U^T u^r = y^r        (3.2.14)

as we desired.

3.2. Linear Associators: Orthonormal patterns

Remember:

    W^T u^r = Y U^T u^r        (3.2.12)
    Y U^T u^r = Σ_k δ_kr y^k = y^r        (3.2.13)

Furthermore, if an input pattern u = u^r + ε different from the stored patterns is applied as input to the network, we obtain

    ϕ(u) = W^T (u^r + ε) = W^T u^r + W^T ε        (3.2.15)

Using equations (3.2.12) and (3.2.13) results in

    ϕ(u) = y^r + W^T ε        (3.2.16)

Therefore, we have

    ϕ(u) = y^r + ε^r        (3.2.17)

in the required form, where

    ε^r = W^T ε        (3.2.18)

3.2. Linear Associators: Orthonormal patterns

Such a memory can be implemented by using M neurons, each having N inputs, as shown in Figure 3.2.

Figure 3.2. Linear Associator

3.2. Linear Associators: Orthonormal patterns

The connection weight vector of neuron i is assigned the value W_i, which is the ith column vector of the matrix W. Here each neuron has a linear output transfer function f(a) = a. When a stored pattern u^k is applied as input to the network, the desired value y^k is observed at the output of the network as

    x = W^T u^k        (3.2.19)
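
To illustrate equations (3.2.4)/(3.2.8) and the recall step (3.2.19), the following minimal NumPy sketch (not part of the original notes; the sizes and variable names are arbitrary choices) builds the weight matrix from an orthonormal input set and checks exact recall:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 8, 4, 3                       # input size, output size, number of stored pairs

# orthonormal input patterns u^k: take K columns of a random orthogonal matrix
U = np.linalg.qr(rng.standard_normal((N, N)))[0][:, :K]    # N x K, U^T U = I
Y = rng.standard_normal((M, K))                            # arbitrary outputs y^k

# weight matrix, eq. (3.2.8): W^T = Y U^T (equivalently W = sum_k u^k (y^k)^T)
Wt = Y @ U.T                                               # M x N

# recall, eq. (3.2.19): x = W^T u^k reproduces y^k exactly for orthonormal inputs
for k in range(K):
    assert np.allclose(Wt @ U[:, k], Y[:, k])

# a perturbed input u^r + eps gives y^r + W^T eps, eq. (3.2.16)
eps = 1e-3 * rng.standard_normal(N)
print(np.linalg.norm(Wt @ (U[:, 0] + eps) - Y[:, 0]))      # small, of the order of |eps|
```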

3.2. Linear Associators: General case

Until now, we have investigated the use of the linear mapping W^T = Y U^T as associative memory, which works well when the input patterns are orthonormal. When the input patterns are not orthonormal, the linear associator cannot map some input patterns to the desired output patterns without error. In the following we investigate the conditions necessary to minimize the output error for the exemplar patterns.

3.2. Linear Associators: General case

Remember:

    U = [u^1 u^2 .. u^k .. u^K]        (3.2.6)
    Y = [y^1 y^2 .. y^k .. y^K]        (3.2.7)

Therefore, for a given set of exemplars µ^k = (u^k, y^k), u^k ∈ R^N, y^k ∈ R^M, k = 1..K, our purpose is to find a linear mapping A* among A: R^N → R^M such that

    A* = arg min_A Σ_k ‖y^k − A u^k‖        (3.2.20)

where ‖.‖ is chosen as the Euclidean norm. The problem may be reformulated by using the matrices U and Y [Haykin 94]:

    A* = arg min_A ‖Y − A U‖        (3.2.21)

3.2. Linear Associators: General case

The pseudoinverse method [Kohonen 76], based on least squares estimation, provides a solution to the problem in which A* is determined as

    A* = Y U^+        (3.2.22)

where U^+ is the pseudoinverse of U. The pseudoinverse U^+ is a matrix satisfying the condition

    U^+ U = I        (3.2.23)

where I is the identity matrix.

3.2. Linear Associators: General case

A perfect match is obtained by using A* = Y U^+, since

    A* U = Y U^+ U = Y        (3.2.24)

resulting in no error due to the fact that

    Y − A* U = 0        (3.2.25)

3.2. Linear Associators: Linearly Independent Patterns

Remember:

    U^+ U = I        (3.2.23)

In the case where the input patterns are linearly independent, that is, none of them can be obtained as a linear combination of the others, a matrix U^+ satisfying Eq. (3.2.23) can be obtained by applying the formula [Golub and Van Loan 89, Haykin 94]

    U^+ = (U^T U)^{-1} U^T        (3.2.26)

Notice that for the input patterns, which are the columns of the matrix U, to be linearly independent, the number of columns should not be more than the number of rows, that is K ≤ N; otherwise U^T U will be singular and no inverse will exist. The condition K ≤ N means that the number of entries constituting the patterns restricts the capacity of the memory: at most N patterns can be stored in such a memory.

3.2. Linear Associators: Linearly Independent Patterns

This memory can be implemented by a neural network for which W^T = Y U^+. The desired value y^k appears at the output of the network as x when u^k is applied as input to the network:

    x = W^T u^k        (3.2.27)

as explained previously.
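
A corresponding sketch for the general case, again only an illustration under the assumption that the K ≤ N input patterns are linearly independent, computes the pseudoinverse of (3.2.26) explicitly (np.linalg.pinv would give the same result here) and verifies error-free recall of the exemplars:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, K = 6, 3, 4                                   # K <= N so the columns can be independent

U = rng.standard_normal((N, K))                     # generic, hence linearly independent, inputs
Y = rng.standard_normal((M, K))                     # desired outputs

# pseudoinverse, eq. (3.2.26): U^+ = (U^T U)^{-1} U^T, valid when rank(U) = K
U_pinv = np.linalg.inv(U.T @ U) @ U.T               # K x N, satisfies U^+ U = I
assert np.allclose(U_pinv @ U, np.eye(K))

# weights, eq. (3.2.22): W^T = A* = Y U^+
Wt = Y @ U_pinv                                     # M x N

# exemplars are recalled without error, eqs. (3.2.24)-(3.2.25)
assert np.allclose(Wt @ U, Y)
print("max recall error:", np.abs(Wt @ U - Y).max())
```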

3.2. Linear Associators: Linearly Independent Patterns

Remember:

    U^+ = (U^T U)^{-1} U^T        (3.2.26)

Notice that for the special case of orthonormal patterns, which we examined previously in this section, we have

    U^T U = I        (3.2.28)

which results in a pseudoinverse of the form

    U^+ = U^T        (3.2.29)

and therefore

    W^T = Y U^+ = Y U^T        (3.2.30)

as we derived previously.

3.3. Hopfield Autoassociative Memory

In this section we will investigate how the Hopfield network can be used as autoassociative memory. For this purpose some modifications are made to the continuous Hopfield network so that it works in discrete state space and discrete time.

Figure 3.3. Hopfield Associative Memory

3.3. Hopfield Autoassociative Memory

Note that whenever the patterns to be stored in the Hopfield network are from the N-dimensional bipolar space constituting a hypercube, that is u^k ∈ {-1, 1}^N, k = 1..K, it is convenient to have the stable states of the network on the corners of the hypercube. If we let the output transfer function of the neurons in the network have very high gain, in the extreme case

    f(a) = lim_{κ→∞} tanh(κ a)        (3.3.1)

we obtain

    f(a) = sgn(a) = { +1 for a > 0;  0 for a = 0;  -1 for a < 0 }        (3.3.2)

3.3. Hopfield Autoassociative Memory

Furthermore, note that the second term of the energy function (which was given in Chapter 2 previously)

    E = -(1/2) Σ_i Σ_j w_ij x_i x_j + Σ_i (1/R_i) ∫_0^{x_i} f^{-1}(x) dx - Σ_i θ_i x_i        (3.3.3)

approaches zero. Therefore the stable states of the network correspond to the local minima of the function

    E = -(1/2) Σ_i Σ_j w_ij x_i x_j - Σ_i θ_i x_i        (3.3.4)

so that they lie on the corners of the hypercube, as explained previously.

3.3. Hopfield Autoassociative Memory

The discrete-time state excitation [Hopfield 82] of the network is provided in the following:

    x_i(k+1) = f(a_i(k)) = { +1 for a_i(k) > 0;  x_i(k) for a_i(k) = 0;  -1 for a_i(k) < 0 }        (3.3.5)

where a_i(k) is defined as before, that is,

    a_i(k) = Σ_j w_ij x_j(k) + θ_i        (3.3.6)

The processing elements of the network are updated one at a time, such that all of the processing elements are updated at the same average rate.

3.3. Hopfield Autoassociative Memory

Remember:

    U = [u^1 u^2 .. u^k .. u^K]        (3.2.6)
    Y = [y^1 y^2 .. y^k .. y^K]        (3.2.7)

For stability of the bipolar discrete Hopfield network, it is further required to have w_ii = 0 in addition to the symmetry constraint w_ij = w_ji. In order to use the discrete Hopfield network as autoassociative memory, its weights are fixed to

    W^T = U U^T        (3.3.8)

where U is the input pattern matrix as defined in Eq. (3.2.6), and then the w_ii are set to 0. Remember that in autoassociative memory we have Y = U, where Y is the matrix of desired output patterns as defined in Eq. (3.2.7).

3.3. Hopfield Autoassociative Memory

If all the states of the network are updated at once, then the next state of the system may be represented in the form

    x(k+1) = f(W^T x(k))        (3.3.9)

For the special case in which the exemplars are orthonormal, we have

    f(W^T u^r) = f(u^r) = u^r        (3.3.10)

which means that each exemplar is a stable state of the network. Whenever the initial state is set to one of the exemplars, the system remains there. However, if the initial state is set to some arbitrary input, then the network converges to one of the stored exemplars, depending on the basin of attraction in which x(0) lies.

3.3. Hopfield Autoassociative Memory

In general, however, the input patterns are not orthonormal, so there is no guarantee that each exemplar corresponds to a stable state. Therefore the problems that we mentioned in Section 3.1 arise. The capacity of the Hopfield network is less than 0.138·N patterns, where N is the number of units in the network [Lippmann 89]. It is shown in the lecture notes that the energy function always decreases as the states of the processing elements are changed one by one (asynchronous update).
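
To make the storage rule (3.3.8) and the asynchronous update (3.3.5)-(3.3.6) concrete, here is a minimal NumPy sketch of a discrete bipolar Hopfield memory. It is an illustration only (zero thresholds, random exemplars, helper names of my own choosing), not code from the course:

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 100, 5                                         # units and stored patterns (K well below 0.138*N)

U = rng.choice([-1, 1], size=(N, K)).astype(float)    # bipolar exemplars u^k

# outer-product storage, eq. (3.3.8): W = U U^T, then zero the diagonal (w_ii = 0)
W = U @ U.T
np.fill_diagonal(W, 0.0)

def recall(x0, theta=None, sweeps=20):
    """Asynchronous recall: update one unit at a time, eqs. (3.3.5)-(3.3.6)."""
    x = x0.copy()
    theta = np.zeros(N) if theta is None else theta
    for _ in range(sweeps):
        for i in rng.permutation(N):                  # one unit at a time, random order
            a = W[i] @ x + theta[i]
            if a != 0:
                x[i] = 1.0 if a > 0 else -1.0         # a == 0 leaves x_i unchanged
    return x

# flip 10% of the bits of an exemplar and let the network clean it up
noisy = U[:, 0].copy()
flip = rng.choice(N, size=N // 10, replace=False)
noisy[flip] *= -1
print("bits recovered:", int((recall(noisy) == U[:, 0]).sum()), "of", N)
```

With K well below 0.138·N the noisy probe is typically cleaned up; pushing K higher makes spurious states and convergence to the wrong exemplar increasingly common, as discussed above.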

3.4. Bi-directional Associative Memory

The Bi-directional Associative Memory (BAM) introduced in [Kosko 88] is a recurrent network (Figure 3.4) designed to work as heteroassociative memory [Nielsen 90].

Figure 3.4: Bi-directional Associative Memory

3.4. Bi-directional Associative Memory

The BAM network consists of two sets of neurons whose outputs are represented by the vectors x ∈ R^N and v ∈ R^M respectively, having activations defined by the pair of equations

    da_{x_i}/dt = -α_i a_{x_i} + Σ_{j=1..M} w_ij f(a_{v_j}) + θ_i,  i = 1..N        (3.4.1)

    da_{v_j}/dt = -β_j a_{v_j} + Σ_{i=1..N} w_ij f(a_{x_i}) + φ_j,  j = 1..M        (3.4.2)

where α_i, β_j, θ_i, φ_j are positive constants for i = 1..N, j = 1..M, f is the tanh function, and W = [w_ij] is any N×M real matrix.

3.4. Bi-directional Associative Memory

The stability of the BAM network can be proved easily by applying the Cohen-Grossberg theorem, by defining a state vector z ∈ R^{N+M} such that

    z_i = x_i for i = 1..N  and  z_{N+j} = v_j for j = 1..M        (3.4.3)

that is, z is obtained through the concatenation of x and v.

3.4. Bi-directional Associative Memory

Since the BAM is a special case of the network defined by the Cohen-Grossberg theorem, it has a Lyapunov energy function, as provided in the following:

    E(x, v) = -Σ_{i=1..N} Σ_{j=1..M} w_ij f(a_{x_i}) f(a_{v_j})
              + Σ_{i=1..N} α_i ∫_0^{a_{x_i}} a f'(a) da + Σ_{j=1..M} β_j ∫_0^{a_{v_j}} b f'(b) db
              - Σ_{i=1..N} θ_i f(a_{x_i}) - Σ_{j=1..M} φ_j f(a_{v_j})        (3.4.4)

3.4. Bi-directional Associative Memory

The discrete BAM model is defined in a manner similar to the discrete Hopfield network. The output functions are chosen to be f(a) = sgn(a) and the states are excited as

    x_i(k+1) = f(a_{x_i}(k)) = { +1 for a_{x_i}(k) > 0;  x_i(k) for a_{x_i}(k) = 0;  -1 for a_{x_i}(k) < 0 }        (3.4.5)

where

    a_{x_i} = Σ_{j=1..M} w_ij f(a_{v_j}) + θ_i,  i = 1..N        (3.4.6)

and

    v_j(k+1) = f(a_{v_j}(k)) = { +1 for a_{v_j}(k) > 0;  v_j(k) for a_{v_j}(k) = 0;  -1 for a_{v_j}(k) < 0 }        (3.4.7)

where

    a_{v_j} = Σ_{i=1..N} w_ij f(a_{x_i}) + φ_j,  j = 1..M        (3.4.8)

3.4. Bi-directional Associative Memory

In compact matrix notation this is simply

    x(k+1) = f(W v(k))        (3.4.9)

and

    v(k+1) = f(W^T x(k+1))        (3.4.10)

3.4. Bi-directional Associative Memory

In the discrete BAM, the energy function becomes

    E(x, v) = -Σ_{i=1..N} Σ_{j=1..M} w_ij f(a_{x_i}) f(a_{v_j}) - Σ_{i=1..N} θ_i f(a_{x_i}) - Σ_{j=1..M} φ_j f(a_{v_j})        (3.4.11)

satisfying the condition

    ΔE ≤ 0        (3.4.12)

which implies the stability of the system.

3.4. Bi-directional Associative Memory

The weights of the BAM are determined by the equation

    W^T = Y U^T        (3.4.13)

For the special case of orthonormal input and output patterns we have

    f(W^T u^r) = f(Y U^T u^r) = f(y^r) = y^r        (3.4.14)

and

    f(W y^r) = f(U Y^T y^r) = f(u^r) = u^r        (3.4.15)

indicating that the exemplars are stable states of the network.
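
The discrete BAM recall can be sketched in the same spirit. The following illustration (pattern sizes, the threshold helper and the stopping rule are my own choices, not from the notes) stores the pairs with W^T = Y U^T as in (3.4.13) and iterates (3.4.9)-(3.4.10) until a stable pair is reached:

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, K = 16, 10, 2

U = rng.choice([-1, 1], size=(N, K)).astype(float)    # input exemplars u^k (bipolar)
Y = rng.choice([-1, 1], size=(M, K)).astype(float)    # output exemplars y^k (bipolar)

W = U @ Y.T                                           # N x M, so that W^T = Y U^T as in (3.4.13)

def threshold(a, prev):
    """sgn-like rule of (3.4.5)/(3.4.7): +1 if a > 0, -1 if a < 0, keep previous state if a = 0."""
    out = prev.copy()
    out[a > 0] = 1.0
    out[a < 0] = -1.0
    return out

def bam_recall(x0, iters=20):
    """Discrete BAM iteration: x(k+1) = f(W v(k)), v(k+1) = f(W^T x(k+1))."""
    x = x0.copy()
    v = threshold(W.T @ x, np.ones(M))                # first pass from the x side to the v side
    for _ in range(iters):
        x_next = threshold(W @ v, x)
        v_next = threshold(W.T @ x_next, v)
        if np.array_equal(x_next, x) and np.array_equal(v_next, v):
            break                                     # (x, v) has reached a stable pair
        x, v = x_next, v_next
    return x, v

x_star, v_star = bam_recall(U[:, 0])                  # present u^1 on the x side
print("recovered y^1:", bool(np.array_equal(v_star, Y[:, 0])))
```

For pairs that are not orthonormal, the same caveats as for the Hopfield network apply, as noted in the closing paragraph below.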

3.4. Bi-directional Associative Memory

Whenever the initial state is set to one of the exemplars, the system remains there. For arbitrary initial states the network converges to one of the stored exemplars, depending on the basin of attraction in which x(0) lies. For input patterns that are not orthonormal, the network behaves as explained for the Hopfield network.