III. Recurrent Neural Networks: A. The Hopfield Network

Typical Artificial Neuron (figure): inputs, connection weights, linear combination giving the net input (local field), threshold, activation function, output.

Equations
Net input: $h_i = \sum_{j=1}^{n} w_{ij} s_j - \theta_i$, or in vector form $\mathbf{h} = W\mathbf{s} - \boldsymbol{\theta}$.
New neural state: $s'_i = \sigma(h_i)$, or $\mathbf{s}' = \sigma(\mathbf{h})$.

Hopfield Network
- Symmetric weights: $w_{ij} = w_{ji}$
- No self-action: $w_{ii} = 0$
- Zero threshold: $\theta = 0$
- Bipolar states: $s_i \in \{-1, +1\}$
- Discontinuous bipolar activation function:
  $\sigma(h) = \mathrm{sgn}(h) = \begin{cases} -1, & h < 0 \\ +1, & h > 0 \end{cases}$
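To make the update rule concrete, here is a minimal sketch in Python with NumPy (not from the slides; the function names are illustrative) of the net input and an asynchronous single-neuron update under the conventions above:

```python
import numpy as np

def net_input(W, s, i):
    """Local field h_i = sum_j w_ij s_j (zero threshold, as assumed above)."""
    return W[i] @ s

def update_neuron(W, s, i):
    """Asynchronously update neuron i with the bipolar sign activation."""
    h = net_input(W, s, i)
    if h > 0:
        s[i] = 1
    elif h < 0:
        s[i] = -1
    # h == 0: leave the state unchanged (one of the tie-breaking options discussed next)
    return s
```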

What to do about h = 0?
There are several options:
- $\sigma(0) = +1$
- $\sigma(0) = -1$
- $\sigma(0) = -1$ or $+1$ with equal probability
- $h_i = 0$ implies no state change ($s'_i = s_i$)
Not much difference, but be consistent.
The last option is slightly preferable, since it is symmetric.

Positive Coupling (figure): positive sense (sign), large strength.
Negative Coupling (figure): negative sense (sign), large strength.
Weak Coupling (figure): either sense (sign), little strength.

State = -1 and Local Field < 0 (figure): h < 0.
State = -1 and Local Field > 0 (figure): h > 0.
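A short sketch of the symmetric tie-breaking option (illustrative; assumes NumPy for the random choice):

```python
import numpy as np

rng = np.random.default_rng(0)

def sgn_random_tie(h):
    """Bipolar sign; resolves h == 0 to -1 or +1 with equal probability (the symmetric option)."""
    if h > 0:
        return 1
    if h < 0:
        return -1
    return int(rng.choice([-1, 1]))
```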

State Reverses (figure): h > 0, so the state flips from -1 to +1.
State = +1 and Local Field > 0 (figure): h > 0, so the state is stable.
State = +1 and Local Field < 0 (figure): h < 0.
State Reverses (figure): h < 0, so the state flips from +1 to -1.

NetLogo Demonstration of Hopfield State Updating
Run Hopfield-update.nlogo

Hopfield Net as Soft Constraint Satisfaction System
- States of neurons act as yes/no decisions.
- Weights represent soft constraints between decisions:
  - hard constraints must be respected;
  - soft constraints have degrees of importance.
- Decisions change to better respect constraints.
- Is there an optimal set of decisions that best respects all constraints?

Demonstration of Hopfield Net Dynamics I
Run Hopfield-dynamics.nlogo

Convergence
- Does such a system converge to a stable state?
- Under what conditions does it converge?
- There is a sense in which each step relaxes the "tension" in the system.
- But could a relaxation of one neuron lead to greater tension in other places?

Quantifying Tension
- If $w_{ij} > 0$, then $s_i$ and $s_j$ want to have the same sign ($s_i s_j = +1$).
- If $w_{ij} < 0$, then $s_i$ and $s_j$ want to have opposite signs ($s_i s_j = -1$).
- If $w_{ij} = 0$, their signs are independent.
- Strength of interaction varies with $|w_{ij}|$.
- Define the disharmony ("tension") $D_{ij}$ between neurons i and j:
  $D_{ij} = -s_i w_{ij} s_j$
  $D_{ij} < 0$: they are "happy"; $D_{ij} > 0$: they are "unhappy".

Total Energy of System
The energy of the system is the total tension (disharmony) in it:
$E\{\mathbf{s}\} = \sum_{i<j} D_{ij} = -\sum_{i<j} s_i w_{ij} s_j = -\tfrac{1}{2}\sum_i \sum_{j\ne i} s_i w_{ij} s_j = -\tfrac{1}{2}\,\mathbf{s}^{\mathsf T} W \mathbf{s}$

Review of Some Vector Notation
- Column vector: $\mathbf{x} = (x_1, \ldots, x_n)^{\mathsf T}$.
- Inner product: $\mathbf{x}^{\mathsf T}\mathbf{y} = \sum_{i=1}^{n} x_i y_i = \mathbf{x}\cdot\mathbf{y}$.
- Outer product: $\mathbf{x}\mathbf{y}^{\mathsf T}$ is the matrix with entries $x_i y_j$.
- Quadratic form: $\mathbf{x}^{\mathsf T} M \mathbf{y} = \sum_{i=1}^{m}\sum_{j=1}^{n} x_i M_{ij} y_j$.

Another View of Energy
The energy measures the disharmony of the neurons' states with their local fields (i.e., states of opposite sign to their fields):
$E\{\mathbf{s}\} = -\tfrac{1}{2}\sum_i s_i \sum_j w_{ij} s_j = -\tfrac{1}{2}\sum_i s_i h_i = -\tfrac{1}{2}\,\mathbf{s}^{\mathsf T}\mathbf{h}$
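A minimal sketch (assuming NumPy, symmetric W with zero diagonal, and bipolar s) of the two equivalent energy expressions:

```python
import numpy as np

def energy(W, s):
    """E{s} = -1/2 s^T W s, the total disharmony of the network."""
    return -0.5 * s @ W @ s

def energy_via_fields(W, s):
    """Equivalent form E{s} = -1/2 s^T h, with local fields h = W s."""
    h = W @ s
    return -0.5 * s @ h
```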

Do State Changes Decrease Energy?
Suppose that neuron k changes state. The change of energy is:
$\Delta E = E\{\mathbf{s}'\} - E\{\mathbf{s}\}
= -\sum_{\{i,j\}} s'_i w_{ij} s'_j + \sum_{\{i,j\}} s_i w_{ij} s_j
= -\sum_{j\ne k} s'_k w_{kj} s_j + \sum_{j\ne k} s_k w_{kj} s_j
= -(s'_k - s_k)\sum_{j\ne k} w_{kj} s_j
= -\Delta s_k\, h_k < 0$
(The change is negative because a neuron only changes state when $\Delta s_k$ has the same sign as $h_k$.)

Energy Does Not Increase
- In each step in which a neuron is considered for update: $E\{\mathbf{s}(t+1)\} - E\{\mathbf{s}(t)\} \le 0$.
- Energy cannot increase.
- Energy decreases if any neuron changes.
- Must it stop?

Proof of Convergence in Finite Time
- There is a minimum possible energy:
  - the number of possible states $\mathbf{s} \in \{-1,+1\}^n$ is finite;
  - hence $E_{\min} = \min\{E(\mathbf{s}) : \mathbf{s} \in \{\pm 1\}^n\}$ exists.
- A minimum must be reached in a finite number of steps, because there is only a finite number of states.

Conclusion
If we do asynchronous updating, the Hopfield net must reach a stable, minimum-energy state in a finite number of updates.
This does not imply that it is a global minimum.

Lyapunov Functions
- A way of showing the convergence of discrete- or continuous-time dynamical systems.
- For a discrete-time system we need a Lyapunov function E (the "energy" of the state) such that:
  - E is bounded below ($E\{\mathbf{s}\} > E_{\min}$);
  - $\Delta E < (\Delta E)_{\max} \le 0$ (energy decreases by at least a certain minimum amount each step);
  - then the system will converge in finite time.
- Problem: finding a suitable Lyapunov function.

Example Limit Cycle with Synchronous Updating (figure): two neurons coupled with w > 0 can oscillate indefinitely under synchronous updates.
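The convergence argument can be exercised in code. The sketch below (illustrative, assuming NumPy) sweeps neurons asynchronously, asserts that the energy never increases, and stops when no neuron changes:

```python
import numpy as np

def run_to_stability(W, s, rng=np.random.default_rng(1)):
    """Asynchronous updating until no neuron changes; energy is checked to be non-increasing."""
    s = s.copy()
    n = len(s)
    while True:
        changed = False
        for i in rng.permutation(n):                        # visit neurons in random order
            h = W[i] @ s
            new = 1 if h > 0 else (-1 if h < 0 else s[i])   # leave ties unchanged
            if new != s[i]:
                e_before = -0.5 * s @ W @ s
                s[i] = new
                assert -0.5 * s @ W @ s <= e_before         # energy never increases
                changed = True
        if not changed:
            return s                                        # a stable (locally minimal-energy) state
```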

The Hopfield Energy Function is Even
- A function f is odd if $f(-x) = -f(x)$ for all x.
- A function f is even if $f(-x) = f(x)$ for all x.
- Observe: $E\{-\mathbf{s}\} = -\tfrac{1}{2}(-\mathbf{s})^{\mathsf T} W (-\mathbf{s}) = -\tfrac{1}{2}\mathbf{s}^{\mathsf T} W \mathbf{s} = E\{\mathbf{s}\}$.

Conceptual Picture of Descent on Energy Surface (fig. from Solé & Goodwin)
Energy Surface (fig. from Haykin, Neural Networks)
Energy Surface + Flow Lines (fig. from Haykin, Neural Networks)
Flow Lines / Basins of Attraction (fig. from Haykin, Neural Networks)
Bipolar State Space (figure)
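A quick numerical check of the evenness property (illustrative sketch, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
A = rng.normal(size=(n, n))
W = (A + A.T) / 2               # symmetric weights
np.fill_diagonal(W, 0.0)        # no self-action
s = rng.choice([-1, 1], size=n)

E = lambda v: -0.5 * v @ W @ v
assert np.isclose(E(s), E(-s))  # E is even: negating every state leaves the energy unchanged
```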

Basins in Bipolar State Space (fig. from Solé & Goodwin): energy-decreasing paths lead to attractors.

Demonstration of Hopfield Net Dynamics II
Run initialized Hopfield.nlogo

Storing Memories as Attractors (fig. from Arbib 1995)

Restoration (sequence of figures from Arbib 1995): a noisy version of a stored pattern is driven back to the stored pattern.
Completion (sequence of figures from Arbib 1995): a partial pattern is filled in to the complete stored pattern.
Association (sequence of figures from Arbib 1995): an associated input recalls the stored pattern linked with it.

Applications of Hopfield Memory
- restoration
- completion
- generalization
- association

Hopfield Net for Optimization and for Associative Memory
- For optimization: we know the weights (couplings); we want to know the minima (solutions).
- For associative memory: we know the minima (retrieval states); we want to know the weights.

Hebb's Rule
"When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased." (Donald Hebb, The Organization of Behavior, 1949, p. 62)
"Neurons that fire together, wire together."

Hebbian Learning: Imprinted Pattern (figure)
Hebbian Learning: Partial Pattern Reconstruction (figure)

Mathematical Model of Hebbian Learning for One Pattern
Let $W_{ij} = \begin{cases} x_i x_j, & i \ne j \\ 0, & i = j \end{cases}$
Since $x_i x_i = x_i^2 = 1$, this is $W = \mathbf{x}\mathbf{x}^{\mathsf T} - I$.
For simplicity, we will include self-coupling: $W = \mathbf{x}\mathbf{x}^{\mathsf T}$.
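A minimal sketch (assuming NumPy) of imprinting one bipolar pattern with the outer-product rule and verifying that it is a fixed point of the update:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 16
x = rng.choice([-1, 1], size=n)       # one bipolar pattern

W = np.outer(x, x)                    # W = x x^T (self-coupling included, as on the slide)
h = W @ x                             # h = x (x . x) = n x
assert np.array_equal(np.sign(h), x)  # sgn(n x) = x: the imprinted pattern is a stable state
```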

A Single Imprinted Pattern is a Stable State
- Suppose $W = \mathbf{x}\mathbf{x}^{\mathsf T}$.
- Then $\mathbf{h} = W\mathbf{x} = \mathbf{x}\mathbf{x}^{\mathsf T}\mathbf{x} = n\mathbf{x}$, since $\mathbf{x}^{\mathsf T}\mathbf{x} = \sum_{i=1}^{n} x_i^2 = n$ (each $x_i = \pm 1$).
- Hence, if the initial state is $\mathbf{s} = \mathbf{x}$, then the new state is $\mathbf{s}' = \mathrm{sgn}(n\mathbf{x}) = \mathbf{x}$.
- For this reason, scale W by 1/n.
- There may be other stable states (e.g., $-\mathbf{x}$).

Questions
- How big is the basin of attraction of the imprinted pattern?
- How many patterns can be imprinted?
- Are there unneeded spurious stable states?
- These issues will be addressed in the context of multiple imprinted patterns.

Imprinting Multiple Patterns
Let $\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^p$ be patterns to be imprinted.
Define the sum-of-outer-products matrix:
$W_{ij} = \frac{1}{n}\sum_{k=1}^{p} x_i^k x_j^k, \qquad W = \frac{1}{n}\sum_{k=1}^{p} \mathbf{x}^k (\mathbf{x}^k)^{\mathsf T}$

Definition of Covariance
Consider samples $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$, and let $\bar{x} = \langle x_k \rangle$, $\bar{y} = \langle y_k \rangle$.
Covariance of the x and y values:
$C_{xy} = \langle (x_k - \bar{x})(y_k - \bar{y}) \rangle = \langle x_k y_k \rangle - \bar{x}\langle y_k \rangle - \langle x_k \rangle \bar{y} + \bar{x}\bar{y} = \langle x_k y_k \rangle - \bar{x}\bar{y}$

Weights and the Covariance Matrix
- Sample pattern vectors: $\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^p$.
- Covariance of the i-th and j-th components: $C_{ij} = \langle x_i^k x_j^k \rangle - \langle x_i \rangle\langle x_j \rangle$.
- If $\forall i: \langle x_i \rangle = 0$ ($\pm 1$ equally likely in all positions), then $C_{ij} = \langle x_i^k x_j^k \rangle = \frac{1}{p}\sum_{k=1}^{p} x_i^k x_j^k$, so $nW = pC$.

Characteristics of Hopfield Memory
- Distributed ("holographic"): every pattern is stored in every location (weight).
- Robust: correct retrieval in spite of noise or error in patterns; correct operation in spite of considerable weight damage or noise.
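A sketch of the sum-of-outer-products rule (assuming NumPy; the p-by-n pattern array X is an illustrative layout):

```python
import numpy as np

def imprint(X):
    """Hebbian sum-of-outer-products weights: W = (1/n) sum_k x^k (x^k)^T."""
    p, n = X.shape
    W = (X.T @ X) / n           # equivalent to summing the p outer products
    np.fill_diagonal(W, 0.0)    # optionally enforce w_ii = 0 (no self-coupling)
    return W

# usage: imprint three random bipolar patterns in a 32-neuron network
rng = np.random.default_rng(4)
X = rng.choice([-1, 1], size=(3, 32))
W = imprint(X)
```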

Demonstration of Hopfield Net
Run Malasri Hopfield Demo

Stability of Imprinted Memories
Suppose the state is one of the imprinted patterns, $\mathbf{x}^m$. Then:
$\mathbf{h} = W\mathbf{x}^m = \frac{1}{n}\Big[\sum_k \mathbf{x}^k (\mathbf{x}^k)^{\mathsf T}\Big]\mathbf{x}^m
= \frac{1}{n}\mathbf{x}^m (\mathbf{x}^m)^{\mathsf T}\mathbf{x}^m + \frac{1}{n}\sum_{k\ne m} \mathbf{x}^k (\mathbf{x}^k)^{\mathsf T}\mathbf{x}^m
= \mathbf{x}^m + \frac{1}{n}\sum_{k\ne m} (\mathbf{x}^k \cdot \mathbf{x}^m)\,\mathbf{x}^k$

Interpretation of Inner Products
- $\mathbf{x}^k \cdot \mathbf{x}^m = n$ if they are identical (highly correlated).
- $\mathbf{x}^k \cdot \mathbf{x}^m = -n$ if they are complementary (highly correlated, reversed).
- $\mathbf{x}^k \cdot \mathbf{x}^m = 0$ if they are orthogonal (largely uncorrelated).
- $\mathbf{x}^k \cdot \mathbf{x}^m$ measures the "crosstalk" between patterns k and m.

Cosines and Inner Products (figure: angle $\theta_{uv}$ between u and v)
- $\mathbf{u}\cdot\mathbf{v} = \lVert\mathbf{u}\rVert\,\lVert\mathbf{v}\rVert\cos\theta_{uv}$.
- If u is bipolar, then $\lVert\mathbf{u}\rVert^2 = \mathbf{u}\cdot\mathbf{u} = n$.
- Hence $\mathbf{u}\cdot\mathbf{v} = \sqrt{n}\sqrt{n}\cos\theta_{uv} = n\cos\theta_{uv}$.
- Hence $\mathbf{h} = \mathbf{x}^m + \sum_{k\ne m} \mathbf{x}^k \cos\theta_{km}$.

Conditions for Stability
- Stability of the entire pattern: $\mathbf{x}^m = \mathrm{sgn}\big(\mathbf{x}^m + \sum_{k\ne m}\mathbf{x}^k\cos\theta_{km}\big)$.
- Stability of a single bit: $x_i^m = \mathrm{sgn}\big(x_i^m + \sum_{k\ne m} x_i^k\cos\theta_{km}\big)$.

Sufficient Conditions for Instability (Case 1)
Suppose $x_i^m = -1$. Then the bit is unstable if:
$(-1) + \sum_{k\ne m} x_i^k\cos\theta_{km} > 0$, i.e., $\sum_{k\ne m} x_i^k\cos\theta_{km} > 1$.
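A sketch (assuming NumPy) of measuring the crosstalk on each bit of a stored pattern and counting the bits that violate the single-bit stability condition:

```python
import numpy as np

def count_unstable_bits(X, m):
    """Count bits of imprinted pattern m whose crosstalk is large enough to flip them."""
    p, n = X.shape
    W = (X.T @ X) / n            # sum-of-outer-products weights (self-coupling kept)
    h = W @ X[m]                 # h = x^m + crosstalk
    crosstalk = h - X[m]
    # bit i is unstable when the crosstalk pushes h_i across zero against x_i^m
    return int(np.sum(X[m] * crosstalk < -1))
```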

Sufficient Conditions for Instability (Case 2)
Suppose $x_i^m = +1$. Then the bit is unstable if:
$(+1) + \sum_{k\ne m} x_i^k\cos\theta_{km} < 0$, i.e., $\sum_{k\ne m} x_i^k\cos\theta_{km} < -1$.

Sufficient Conditions for Stability
$\Big|\sum_{k\ne m} x_i^k\cos\theta_{km}\Big| \le 1$
The crosstalk with the sought pattern must be sufficiently small.

Capacity of Hopfield Memory
- Depends on the patterns imprinted.
- If they are orthogonal, $p_{\max} = n$,
  - but then the weight matrix is the identity matrix,
  - hence every state is stable: trivial basins.
- So $p_{\max} < n$.
- Let the load parameter be $\alpha = p/n$.

Single Bit Stability Analysis
- For simplicity, suppose the $\mathbf{x}^k$ are random.
- Then the $\mathbf{x}^k \cdot \mathbf{x}^m$ are sums of n random $\pm 1$s:
  - binomial distribution, approximately Gaussian,
  - in the range $-n, \ldots, +n$,
  - with mean $\mu = 0$ and variance $\sigma^2 = n$.
- Probability that the sum exceeds t: $\tfrac{1}{2}\big[1 - \operatorname{erf}\big(\tfrac{t}{\sqrt{2n}}\big)\big]$.
[See "Review of Gaussian (Normal) Distributions" on the course website.]

Approximation of Probability
Let the crosstalk be $C_i^m = \frac{1}{n}\sum_{k\ne m} x_i^k (\mathbf{x}^k \cdot \mathbf{x}^m)$.
We want $\Pr\{C_i^m > 1\} = \Pr\{nC_i^m > n\}$.
Note: $nC_i^m = \sum_{k\ne m}\sum_{j=1}^{n} x_i^k x_j^k x_j^m$, a sum of $n(p-1) \approx np$ random $\pm 1$s, with variance $\sigma^2 = np$.

Probability of Bit Instability
$\Pr\{nC_i^m > n\} = \tfrac{1}{2}\big[1 - \operatorname{erf}\big(\tfrac{n}{\sqrt{2np}}\big)\big] = \tfrac{1}{2}\big[1 - \operatorname{erf}\big(\sqrt{n/2p}\big)\big] = \tfrac{1}{2}\big[1 - \operatorname{erf}\big(1/\sqrt{2\alpha}\big)\big]$
(fig. from Hertz & al., Introduction to the Theory of Neural Computation)
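The erf expression can be evaluated directly with the standard library; a small sketch:

```python
import math

def p_bit_error(alpha):
    """P_error = 1/2 [1 - erf(1 / sqrt(2 alpha))] for load parameter alpha = p/n."""
    return 0.5 * (1.0 - math.erf(1.0 / math.sqrt(2.0 * alpha)))

# e.g. alpha = 0.138 gives roughly 0.0036, i.e. about 0.36% of bits unstable
print(p_bit_error(0.138))
```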

Tabulated Probability of Single-Bit Instability
(table from Hertz & al., Introduction to the Theory of Neural Computation)

  P_error   alpha
  0.1%      0.105
  0.36%     0.138
  1%        0.185
  5%        0.37
  10%       0.61

Orthogonality of Random Bipolar Vectors of High Dimension
- There is a 99.99% probability of being within $4\sigma$ of the mean:
  $|\mathbf{u}\cdot\mathbf{v}| < 4\sigma$ iff $n|\cos\theta| < 4\sqrt{n}$ iff $|\cos\theta| < 4/\sqrt{n} = \varepsilon$.
- So it is 99.99% probable that random n-dimensional bipolar vectors will be within $\varepsilon = 4/\sqrt{n}$ of orthogonal.
- $\varepsilon = 4\%$ for $n = 10{,}000$.
- The probability of being less orthogonal than $\varepsilon$ decreases exponentially with n:
  $\Pr\{|\cos\theta| > \varepsilon\} = \operatorname{erfc}\big(\varepsilon\sqrt{n/2}\big) \le \tfrac{1}{6}\exp(-\varepsilon^2 n/2) + \tfrac{1}{2}\exp(-2\varepsilon^2 n/3)$
- The brain gets approximate orthogonality by assigning random high-dimensional vectors.

Spurious Attractors
- Mixture states:
  - sums or differences of odd numbers of retrieval states;
  - their number increases combinatorially with p;
  - shallower, smaller basins;
  - basins of mixtures swamp basins of retrieval states: overload;
  - useful as combinatorial generalizations?
  - self-coupling generates spurious attractors.
- Spin-glass states:
  - not correlated with any finite number of imprinted patterns;
  - occur beyond overload, because the weights are effectively random.

Basins of Mixture States (figure): a mixture state lies among $\mathbf{x}^{k_1}, \mathbf{x}^{k_2}, \mathbf{x}^{k_3}$, with
$x_i^{\mathrm{mix}} = \mathrm{sgn}\big(x_i^{k_1} + x_i^{k_2} + x_i^{k_3}\big)$

Demonstration: Fraction of Unstable Imprints (n = 100)
Run Hopfield-Capacity Test (fig. from Bar-Yam)
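As an illustrative check on these numbers (not from the slides; assumes NumPy), one can estimate the fraction of unstable imprinted bits empirically for random patterns at a given load and compare it with the erf prediction:

```python
import math
import numpy as np

def empirical_bit_error(n, p, trials=20, rng=np.random.default_rng(5)):
    """Fraction of imprinted bits that flip on the first update, for random bipolar patterns."""
    errors, total = 0, 0
    for _ in range(trials):
        X = rng.choice([-1, 1], size=(p, n))
        W = (X.T @ X) / n
        for m in range(p):
            h = W @ X[m]
            errors += int(np.sum(np.sign(h) != X[m]))
            total += n
    return errors / total

n, p = 100, 14                        # load alpha = 0.14, near the 0.138 limit
predicted = 0.5 * (1 - math.erf(1 / math.sqrt(2 * p / n)))
print(empirical_bit_error(n, p), predicted)   # both should be a few tenths of a percent
```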

Number of Stable Imprints (n = 100) (fig. from Bar-Yam)
Number of Imprints with Basins of Indicated Size (n = 100) (fig. from Bar-Yam)

Summary of Capacity Results
- Absolute limit: $p_{\max} < \alpha_c n = 0.138\, n$.
- If a small number of errors in each pattern is permitted: $p_{\max} \propto n$.
- If all or most patterns must be recalled perfectly: $p_{\max} \propto n / \log n$.
- Recall: all this analysis is based on random patterns.
- Unrealistic, but sometimes can be arranged.

[Continued in Part III B.]