Hopfield Neural Network and Associative Memory

PHY 411-506 Computational Physics 2
Wednesday, March 5

[Figure: Typical myelinated vertebrate motoneuron (Wikipedia)]
[Figure: Drawings of neurons by Ramon y Cajal, 1906 Nobel Prize in Physiology or Medicine]
[Figure: Action potential of an excited neuron]

Neuron Networks in the Brain

The human brain contains about 10^11 neurons connected in tree-like networks.

The cell body of a neuron has as many as 10^5 dendrites, or tree-like fibers, on its surface.

The neuron integrates the electrochemical signals received by its dendrites from other neurons.

If the integrated signal exceeds a threshold, the neuron generates a spike train of soliton-like signals that propagates along its axon.
The axon branches into as many as 10^4 strands, each ending in a synapse with a dendrite on another neuron.

The axonic spike train triggers a flow of chemicals (neurotransmitters) across the synaptic junction.

Because spikes have a characteristic fixed magnitude, profile, and speed, information is likely encoded in the length of a spike train and in the pattern of spikes as a function of time.

Physiology of Memory

The brain can store information and recall it at a later time.

A sensory stimulus, for example seeing a face or hearing a voice, triggers a cascade of spike trains in a large network of neurons; see Wikipedia: Neuroanatomy of memory.

Information contained in the spike-train cascade modifies the chemical structure of a network of synaptic connections, thus encoding the information in short- or long-term memory; see Wikipedia: Physiology of memory.

A different stimulus at a later time can trigger a spike-train cascade similar to that of the original stimulus, thus recalling the stored memory.
McCulloch-Pitts Neuron Model

An artificial neuron model was introduced by W. McCulloch and W. Pitts in 1943 to model the recall of information in a network of neural synapses.

The action potential of the i-th neuron in a network is modeled by a simple binary voltage or state

$$ V_i = \begin{cases} 1 & \text{firing (active)} \\ 0 & \text{not firing (resting)} \end{cases} $$

Synaptic information is represented by a matrix of weights $w$. The action potential of the i-th neuron is determined by the potentials of all of the other neurons which make synaptic connections to it:

$$ V_i = \Theta\!\left( \mu_i + \sum_j w_{ij} V_j \right) $$

where $w_{ij}$ is the synaptic weight or strength of the $j \to i$ synaptic connection, $\mu_i$ is a threshold bias factor for neuron $i$, and $\Theta$ is a binary transfer function taking the values 0 and 1, for example a simple Heaviside step function

$$ \Theta(x) = \theta(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases} $$

The neuron is a binary threshold device that switches on and off in time.
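In code, one update step of this rule is just a matrix-vector product followed by a threshold. Below is a minimal NumPy sketch; the function name mcculloch_pitts_update and the toy weights, biases, and state are illustrative choices, not part of the original model description.

```python
import numpy as np

def mcculloch_pitts_update(V, w, mu):
    """One synchronous update: V_i = Theta(mu_i + sum_j w_ij V_j)."""
    x = mu + w @ V                    # integrated input to each neuron
    return np.where(x > 0, 1, 0)      # Heaviside step: 1 if x > 0, else 0

# Toy example: 3 neurons with hand-chosen weights and thresholds
V  = np.array([1, 0, 1])
w  = np.array([[ 0.0, 0.5, -0.2],
               [ 0.3, 0.0,  0.4],
               [-0.1, 0.6,  0.0]])
mu = np.array([-0.1, -0.1, -0.1])
print(mcculloch_pitts_update(V, w, mu))   # new binary state of all neurons
```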
Hebb's Rule and Hebbian Learning

How does the weight matrix $w$ get modified to store memories? In 1949 D. Hebb proposed a mechanism called Hebb's rule,

$$ w_{ij} = \frac{1}{p} \sum_{k=1}^{p} V_i^{(k)} V_j^{(k)} $$

for a network excited by $p$ different memories with excitation patterns $V_i^{(k)}$.
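As a minimal sketch of how Hebb's rule might be coded (assuming NumPy, with the p stored 0/1 patterns as the rows of an array; the helper name hebb_weights is illustrative):

```python
import numpy as np

def hebb_weights(patterns):
    """Hebb's rule: w_ij = (1/p) sum_k V_i^(k) V_j^(k) for p stored 0/1 patterns."""
    patterns = np.asarray(patterns, dtype=float)   # shape (p, N)
    p = patterns.shape[0]
    return patterns.T @ patterns / p               # (N, N) weight matrix

# Two 4-neuron memory patterns
w = hebb_weights([[1, 0, 1, 0],
                  [1, 1, 0, 0]])
print(w)
```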
The Hopfield Neural Network Model

J.J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," Proc. Natl. Acad. Sci. USA 79, 2554-2558 (1982), introduced a neural network model for the storage and retrieval of memories in a binary network. He showed that the model can store multiple memories, and that one or more memories can be simultaneously retrieved by stimulating the network with partial or damaged versions of the memories at a later time. This classic article has been cited more than 12,000 times, and is clearly written and relatively easy to understand.

The Model System

Consider $N$ binary McCulloch-Pitts neurons with action potentials

$$ V_i(t) = \begin{cases} 1 & \text{firing} \\ 0 & \text{not firing} \end{cases} $$

Each neuron changes its state at random times, with a mean attempt rate $W$, by instantaneously sampling
the potentials of all the other neurons:

$$ V_i(t) = \theta\!\left( \sum_j T_{ij} V_j(t) \right) $$

The Information Storage Algorithm

The network stores memories in a learning phase. A memory is represented by a pattern of action potentials, i.e., by a binary vector $V$ with particular values for its $N$ neuronal components $V_i$, $i = 0, \ldots, N-1$. To store $P$ memories $V^{(k)}$, $k = 0, \ldots, P-1$, the synaptic matrix is set to

$$ T_{ij} = \sum_{k=0}^{P-1} \left( 2V_i^{(k)} - 1 \right)\left( 2V_j^{(k)} - 1 \right), \quad i \neq j, \qquad T_{ii} = 0. $$

Information Retrieval

To retrieve memories the model is evolved using the McCulloch-Pitts update algorithm, starting from a given initial state. The network is operated by updating the neurons according to some protocol, for example by choosing neurons at random or sequentially (which is usually what is done in software networks), or by updating the whole network synchronously (which is more natural for a hardwired network controlled by a clock).
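A minimal NumPy sketch of the storage rule and of retrieval by randomly chosen sequential updates (the helper names store and recall, and the toy patterns and damaged input below, are illustrative):

```python
import numpy as np

def store(patterns):
    """T_ij = sum_k (2 V_i^(k) - 1)(2 V_j^(k) - 1), with T_ii = 0, for 0/1 patterns."""
    S = 2.0 * np.asarray(patterns) - 1.0      # map {0, 1} -> {-1, +1}
    T = S.T @ S                               # sum over the stored patterns
    np.fill_diagonal(T, 0.0)                  # no self-connections
    return T

def recall(T, V, sweeps=10, seed=0):
    """Update randomly chosen neurons with V_i = theta(sum_j T_ij V_j)."""
    rng = np.random.default_rng(seed)
    V = np.array(V, dtype=int)
    N = len(V)
    for _ in range(sweeps * N):
        i = rng.integers(N)                   # pick a neuron at random
        V[i] = 1 if T[i] @ V > 0 else 0       # McCulloch-Pitts update
    return V

patterns = [[1, 0, 1, 0, 1, 0, 1, 0],
            [1, 1, 1, 1, 0, 0, 0, 0]]
T = store(patterns)
damaged = [1, 0, 1, 1, 1, 0, 1, 0]            # corrupted copy of the first pattern
print(recall(T, damaged))                     # relaxed network state
```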
In general the state of the system tends to one of several equilibrium configurations, which represents the recalled memory.

Energy Function and Ising Model Equivalence

The behavior of the model can be analyzed by defining an energy function

$$ E = -\frac{1}{2} \sum_{i \neq j} T_{ij} V_i V_j $$

which represents the energy of a random spin glass with spin variables $s_i = 1 - 2V_i = \pm 1$. If $V_i$ changes by $\Delta V_i$ in an evolution step, the energy of the system changes by

$$ \Delta E = -\Delta V_i \sum_{j \neq i} T_{ij} V_j $$

Hopfield showed that the network dynamics decreases the energy of the network. This implies that if the network is started in an arbitrary state, then it will evolve to the nearest local energy minimum. The stored states are local minima of the energy function. So if the initial state happens to be in the basin of attraction of one of the stored minima, then that pattern will be recalled!
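The monotonic decrease of the energy is easy to check numerically. A small sketch, assuming a random symmetric matrix T with zero diagonal in place of one built from stored memories:

```python
import numpy as np

def energy(T, V):
    """E = -1/2 sum_{i != j} T_ij V_i V_j  (T has zero diagonal)."""
    return -0.5 * V @ T @ V

rng = np.random.default_rng(1)
N = 20
T = rng.normal(size=(N, N))
T = T + T.T                                   # symmetric synaptic matrix
np.fill_diagonal(T, 0.0)
V = rng.integers(0, 2, size=N)                # random initial 0/1 state

E = energy(T, V)
for step in range(200):                       # asynchronous random updates
    i = rng.integers(N)
    V[i] = 1 if T[i] @ V > 0 else 0
    E_new = energy(T, V)
    assert E_new <= E + 1e-9                  # energy never increases
    E = E_new
print("final energy:", E)
```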
A network with $N$ neurons has a huge number, $2^N$, of states. The network works best if the stored memories partition the space of network states into well-defined basins. The storage capacity of the network is found to be about $0.13N$ memories. If too many memories are stored, then the minima are not well defined and memories may not be perfectly recalled.

The Hamming Distance

There are various possible criteria for deciding how closely a given pattern, which is an $N$-component binary vector, resembles another pattern. A commonly used measure of the distance between two binary vectors $\xi$ and $\zeta$ is the Hamming distance

$$ D_H = \sum_{i=0}^{N-1} \left[ \xi_i (1 - \zeta_i) + (1 - \xi_i) \zeta_i \right]. $$
This distance varies between 0, if all $N$ bits are identical, and $N$, if none of the bit components are the same.

A simple application of neural networks is to the storage and reconstruction of associative memories, illustrated in the Hopfield Java Applet.
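A minimal NumPy sketch of the Hamming distance between two 0/1 pattern vectors (the helper name hamming_distance is illustrative):

```python
import numpy as np

def hamming_distance(xi, zeta):
    """D_H = sum_i [xi_i (1 - zeta_i) + (1 - xi_i) zeta_i] for 0/1 vectors."""
    xi, zeta = np.asarray(xi), np.asarray(zeta)
    return int(np.sum(xi * (1 - zeta) + (1 - xi) * zeta))

print(hamming_distance([1, 0, 1, 0], [1, 0, 1, 0]))   # 0: patterns identical
print(hamming_distance([1, 0, 1, 0], [0, 1, 0, 1]))   # 4: every bit differs
```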