Hebbian Learning
Neural Information Processing
Amos Storkey

Overview
- Types of learning, learning strategies
- Neurophysiology, LTP/LTD
- Basic Hebb rule, covariance rule, BCM rule
- Eigenanalysis
- Constraints: subtractive and multiplicative normalization
- Multiple output neurons
- Timing-based rules
Reading: [Dayan and Abbott, 2002] ch. 8; [Hertz et al., 1991]

Types of Learning
- Supervised: input u, output v; model p(v|u)
- Reinforcement: input u and scalar reward r; often associated with a temporal credit-assignment problem
- Unsupervised: model p(u)
Learning Strategies
- Bottom-up: rules of neural plasticity, e.g. Hebbian
- Top-down: use of an objective function E(w), with updates determined by the gradient ∇_w E(w)
- Focus in the course will be mostly on the objective-function approach, but we start with Hebbian learning

Hebbian Learning
- Hebb [Hebb, 1949] conjectured that if input from neuron A often contributes to the firing of neuron B, then the synapse from A to B should be strengthened
- Strengthening: long-term potentiation (LTP); weakening: long-term depression (LTD)
- Strong element of causality
- Extensive biochemical mechanisms, protein synthesis
- Observed in many systems: hippocampus, neocortex, cerebellum
- LTP and LTD in Schaffer collateral inputs to CA1 [Figure from Dayan and Abbott (2001)]
- Hebbian learning: long-term modification of synapses based on pre- and post-synaptic activity (in contrast to: homeostasis, excitability changes, non-local learning such as backpropagation, etc.)

Firing-rate Model
  τ_r dv/dt = −v + w·u
Assume that τ_r is small compared to the timescale of weight change, so set dv/dt = 0 to give v = w·u.
The weights then evolve as
  τ_w dw/dt = f(v, u, w)
Basic Hebb rule:
  τ_w dw/dt = uv
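As an illustration, here is a minimal NumPy sketch of the basic Hebb rule on toy two-dimensional inputs (all values, including the input correlation of 0.8, are made up for the example). It shows the positive-feedback growth of |w| that the later slides address with constraints:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 2 correlated inputs, linear firing-rate neuron v = w.u
# (tau_r assumed small, so v is at its steady-state value each step).
n_steps, dt, tau_w = 5000, 0.1, 100.0
w = np.array([0.1, 0.1])
norms = []
for _ in range(n_steps):
    u = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]])
    v = w @ u                      # steady-state firing rate
    w = w + (dt / tau_w) * v * u   # basic Hebb rule: tau_w dw/dt = v u
    norms.append(np.linalg.norm(w))

# Positive feedback: |w| grows without bound under the plain Hebb rule
print(norms[0], norms[-1])
```

The unbounded growth seen here is exactly the instability computed analytically on the next slide.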
Basic Hebb Rule
Average over the input distribution ⟨·⟩ to give
  τ_w dw/dt = ⟨uv⟩ = ⟨u uᵀ⟩w = Qw
where Q is the input correlation matrix.
Positive feedback instability:
  τ_w d|w|²/dt = 2 τ_w w · dw/dt = 2⟨v²⟩ > 0
so the weights grow without bound.
Discrete-time version of the Hebb rule: w → w + ε Qw

Covariance Rule
Implementing LTD at low activity:
  τ_w dw/dt = ⟨u(v − θ_v)⟩   or   τ_w dw/dt = ⟨(u − θ_u)v⟩
If θ_v = ⟨v⟩ or θ_u = ⟨u⟩ then
  τ_w dw/dt = Cw
where C = ⟨(u − ⟨u⟩)(u − ⟨u⟩)ᵀ⟩ is the input covariance matrix. One gets LTD with v = 0 or u = 0 respectively, which is odd.
Still unstable, as
  τ_w d|w|²/dt = 2⟨v(v − ⟨v⟩)⟩ = 2 var(v) > 0

Hebb Rule: Single Postsynaptic Neuron (PCA)
Basic Hebb rule τ_w dw/dt = Qw; analyse using an eigendecomposition of Q:
  Q e_μ = λ_μ e_μ,   λ_1 ≥ λ_2 ≥ …
As Q is symmetric and positive semi-definite, the eigenvalues are real and non-negative, and the eigenvectors are orthogonal. Write
  w(t) = Σ_{μ=1}^{N_u} (w(0) · e_μ) exp(λ_μ t / τ_w) e_μ
As e_1 has the largest eigenvalue, its component grows fastest, so w ∝ e_1 for large t. A similar analysis applies for τ_w dw/dt = Cw. [Figure: Dayan and Abbott 2001]
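The eigen-analysis can be checked numerically. The sketch below (with an arbitrary toy correlation matrix Q, values made up for the example) integrates the averaged rule τ_w dw/dt = Qw and confirms that w lines up with the principal eigenvector e_1:

```python
import numpy as np

# Toy symmetric positive semi-definite "correlation" matrix (assumed values).
Q = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.0, 0.2],
              [0.1, 0.2, 0.5]])
evals, evecs = np.linalg.eigh(Q)
e1 = evecs[:, np.argmax(evals)]        # principal eigenvector of Q

# Euler-integrate the averaged Hebb rule tau_w dw/dt = Q w.
tau_w, dt = 1.0, 0.01
w = np.array([1.0, -0.3, 0.7])         # arbitrary initial weights
for _ in range(5000):
    w = w + (dt / tau_w) * (Q @ w)

w_hat = w / np.linalg.norm(w)
alignment = abs(w_hat @ e1)            # approaches 1: w becomes proportional to e1
print(alignment)
```

The component along e_1 outgrows the others exponentially, so the normalized weight vector converges to ±e_1 even though |w| itself diverges.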
Energy Function Analysis
Under [Linsker, 1988]
  E = −½ wᵀCw
Using gradient-descent dynamics,
  τ_w dw/dt = −∇_w E = Cw

BCM Rule
Bienenstock, Cooper and Munro [Bienenstock et al., 1982] proposed
  τ_w dw/dt = vu(v − θ_v)
for which there is experimental evidence. If θ_v is fixed then the BCM rule is unstable. But if θ_v is set to a low-pass filtered version of v² (a sliding threshold) then stability is achieved. With a sliding threshold the BCM rule implements competition between synapses.

Stability and Competition
Hebbian learning involves positive feedback. Control by:
- LTD: not enough (covariance and correlation rules)
- saturation or bounds: prevent weights getting too big or too small
- normalization over pre-synaptic or post-synaptic arbors:
  - subtractive: decrease all synapses by the same amount, whether large or small
  - multiplicative: decrease large synapses more than small synapses

Subtractive Normalization
For non-negative w set
  Σ_i w_i = n·w = const,   where n = (1, 1, …, 1)ᵀ
Dynamics:
  τ_w dw/dt = vu − v(n·u)n / N_u
n·w is constant, as
  τ_w d(n·w)/dt = v(n·u)(1 − n·n/N_u) = 0
Subtractive normalization needs a lower bound to avoid weights becoming negative. If there is no upper saturation constraint then the final outcome is that all weights but one become 0.
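A quick numerical sanity check of the conservation argument for subtractive normalization (toy inputs and constants assumed; the lower bound on the weights is omitted in this sketch): the sum n·w stays fixed while the individual weights move.

```python
import numpy as np

rng = np.random.default_rng(1)

# The subtractive-normalization update
#   tau_w dw/dt = v u - v (n.u) n / N_u
# changes individual weights but leaves n.w exactly constant.
N_u, dt, tau_w = 4, 0.05, 10.0
n = np.ones(N_u)
w = np.array([0.4, 0.3, 0.2, 0.1])
initial_sum = n @ w
for _ in range(2000):
    u = rng.normal(0.5, 0.2, size=N_u)       # positive-mean toy inputs
    v = w @ u
    dw = v * u - v * (n @ u) * n / N_u       # subtractively normalized Hebb step
    w = w + (dt / tau_w) * dw

print(initial_sum, n @ w)
```

The conservation follows directly from n·dw = v(n·u)(1 − n·n/N_u) = 0, which the simulation reproduces to floating-point precision.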
Multiplicative Normalization and the Oja Rule
[Oja, 1982] Ensure |w|² is constant by setting
  τ_w dw/dt = vu − αv²w,   α > 0
Taking the dot product with 2w gives
  τ_w d|w|²/dt = 2v²(1 − α|w|²)
which shows that |w|² relaxes to 1/α, preventing the weights growing without bound. The Oja rule gives w(t) → e_1/√α.

The Effect of Constraints
- Saturation can change this outcome. Hebbian dynamics with saturation: principal eigenvector is (1, 1)ᵀ/√2 [Figure: Dayan and Abbott 2001]
- Subtractive normalization: if e_1 · n ≠ 0 then growth of w in the direction of n is stunted

Example: Ocular Dominance
Consider a simplified model where a single cortical cell receives input from two LGN afferents with activities u_l, u_r from the left and right eyes:
  v = w_r u_r + w_l u_l
The weights are constrained to be non-negative. Ocular dominance (OD) occurs when one weight becomes 0 while the other is positive.
  Q = ⟨u uᵀ⟩ = [ q_s  q_d ; q_d  q_s ],   with q_s > q_d (Same and Different)
Eigenvectors/values:
  e_1 = (1, 1)ᵀ/√2,   λ_1 = q_s + q_d
  e_2 = (1, −1)ᵀ/√2,  λ_2 = q_s − q_d
The principal eigenvector does not give rise to an OD solution. OD can be obtained by use of subtractive normalization, as e_1 ∝ n.
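The Oja rule's two claims (|w|² → 1/α, and w → e_1 up to scale) can be illustrated with a small stochastic simulation on synthetic zero-mean inputs; the covariance matrix and all constants here are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(2)

# Oja rule tau_w dw/dt = v u - alpha v^2 w on zero-mean Gaussian inputs
# with an assumed covariance C (values chosen for illustration only).
C = np.array([[1.0, 0.6],
              [0.6, 1.0]])
L = np.linalg.cholesky(C)                    # so L @ z has covariance C
alpha, dt, tau_w = 1.0, 0.02, 5.0
w = rng.normal(size=2)
for _ in range(20000):
    u = L @ rng.normal(size=2)               # sample input with covariance C
    v = w @ u
    w = w + (dt / tau_w) * (v * u - alpha * v**2 * w)   # Oja rule

evals, evecs = np.linalg.eigh(C)
e1 = evecs[:, np.argmax(evals)]              # principal eigenvector of C
norm_sq = np.linalg.norm(w)**2               # relaxes towards 1/alpha
alignment = abs(w @ e1) / np.linalg.norm(w)  # approaches 1
print(norm_sq, alignment)
```

Because the updates use single samples rather than the averaged rule, the final values fluctuate around the fixed point rather than reaching it exactly.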
Example: Orientation Selectivity
Given ON and OFF LGN cells, Hebbian learning can give rise to cortical orientation selectivity [Miller, 1994]. Needs constraints to avoid the uniform component. [Figure: Dayan and Abbott (2001), after Miller (1994)]

Linsker's Model [Linsker, 1988]
Layered architecture (layers A, B, C, …) and arbor-function constraints + Hebbian learning:
- Uncorrelated noise in the retina (layer A)
- Learned centre-surround receptive fields in layer C, and orientation-selective RFs in layer G (bar detector)

Analysis by MacKay and Miller [MacKay and Miller, 2001]
  τ_w dw/dt = (Q + k_2 J)w + k_1 n
where J is the matrix of ones and n is the vector of ones. Depending on the choice of the parameters k_1 and k_2, uniform, centre-surround and orientation-selective RFs can be obtained.
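The role of k_2 can be illustrated numerically. Since J = nnᵀ, adding k_2 J only shifts the component along the uniform vector n, so its sign selects between a uniform and a structured principal eigenvector. The sketch below uses a toy smooth correlation matrix Q and arbitrary k_2 values, and ignores the constant k_1 n drive:

```python
import numpy as np

# Toy smooth correlations between N nearby inputs (values assumed).
N = 5
i = np.arange(N)
Q = np.exp(-0.5 * (i[:, None] - i[None, :])**2)
J = np.ones((N, N))                      # J = n n^T for n = (1,...,1)^T

def principal_mode(k2):
    """Principal eigenvector of Q + k2 * J."""
    evals, evecs = np.linalg.eigh(Q + k2 * J)
    return evecs[:, np.argmax(evals)]

uniform = np.ones(N) / np.sqrt(N)
mode_pos = principal_mode(+2.0)          # boosts the uniform mode
mode_neg = principal_mode(-2.0)          # suppresses the uniform mode
print(abs(mode_pos @ uniform), abs(mode_neg @ uniform))
```

With k_2 large and positive the dominant mode is essentially uniform; with k_2 large and negative the uniform direction is penalized and a structured mode dominates, which is the mechanism behind the uniform versus centre-surround/orientation-selective outcomes.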
Multiple Output Neurons
With multiple output neurons and no interaction between them, each output neuron would behave similarly (different initial conditions could still lead to different RFs). Recurrent interactions can prevent different outputs from representing the same eigenvector.

Fixed recurrent connections M:
  τ_r dv/dt = −v + Wu + Mv
which at steady state gives
  v = (I − M)⁻¹Wu = KWu,   where K = (I − M)⁻¹
Hebbian learning of W gives
  τ_w dW/dt = ⟨v uᵀ⟩ = KWQ
Can consider 3 cases: (i) plastic W, fixed M; (ii) fixed W, plastic M; (iii) plastic W and M. Dayan and Abbott (2001) show OD stripes arising from fixed Mexican-hat interactions K and plastic W. [Figure: Dayan and Abbott 2001, after (right) Nicholls et al. 1992]

Anti-Hebbian Modification of Lateral Connections
Instead of fixed M, make it plastic. One possible goal is to decorrelate the outputs, i.e. ⟨v vᵀ⟩ = I. Anti-Hebbian learning: synapses decrease when there is simultaneous pre- and post-synaptic activity:
  τ_m dM/dt = −v vᵀ + βM   (∗)
(need β > 0 to avoid the M weights decaying to 0). For suitable β and τ_m, a combination of the Oja rule and (∗) leads to the rows of W being different eigenvectors of Q, and M ultimately decaying to 0. Other neural algorithms for multiple-output PCA are possible, e.g. [Sanger, 1989, Oja, 1989].
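The steady-state identity v = (I − M)⁻¹Wu = KWu is easy to verify numerically. In this sketch W, M and u are arbitrary toy values, with M chosen so that I − M is invertible and the rate dynamics are stable:

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.normal(size=(3, 4))                  # feedforward weights (toy)
M = np.array([[ 0.0, -0.3, -0.1],
              [-0.3,  0.0, -0.3],
              [-0.1, -0.3,  0.0]])           # fixed inhibitory lateral weights
u = rng.normal(size=4)                       # one fixed input pattern

# Closed-form steady state: v = K W u with K = (I - M)^{-1}.
K = np.linalg.inv(np.eye(3) - M)
v_closed = K @ W @ u

# Euler-integrate tau_r dv/dt = -v + W u + M v to the same fixed point.
v = np.zeros(3)
dt, tau_r = 0.01, 1.0
for _ in range(5000):
    v = v + (dt / tau_r) * (-v + W @ u + M @ v)

print(v, v_closed)
```

The integrated rates match the closed-form K W u to numerical precision, which is why the weight dynamics can be analyzed directly in terms of K.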
Timing-based Rules
An approximation to spiking models:
  τ_w dw/dt = ∫₀^∞ dτ [H(τ) v(t) u(t − τ) + H(−τ) v(t − τ) u(t)]
If H(τ) is positive for τ > 0 and negative for τ < 0, then the first term on the RHS is LTP and the second is LTD.
[Figures: Dayan and Abbott (2001), after (left) Markram et al. (1997) and (right) Zhang et al. (1998)] Left: LTP and LTD. Right: spike-timing dependent plasticity (STDP) in Xenopus. Window of ±50 ms; the causal order agrees with Hebb's conjecture.

Temporal Hebbian Rules and Trace Learning
An approximate solution of the timing-based rule is
  w = (1/T) ∫₀^T dt v(t) ∫ dτ H(τ) u(t − τ)
Temporally dependent Hebbian plasticity depends on the correlation between v(t) and a filtered version of u(t). The filtered version is a (memory) trace of u(t). This rule can be used to model the development of invariant responses, e.g. to the presence of an object in an image [Földiák, 1991, Wallis and Rolls, 1996].

References I
Bienenstock, E. L., Cooper, L. N., and Munro, P. W. (1982). Theory for the development of neuron selectivity: orientation specificity and binocular interaction in visual cortex. J. Neurosci., 2:32-48.
Dayan, P. and Abbott, L. F. (2002). Theoretical Neuroscience. MIT Press, Cambridge, MA.
Földiák, P. (1991). Learning invariance from transformation sequences. Neural Comp., 3:194-200.
Hebb, D. (1949). The Organization of Behavior. Wiley, New York.
Hertz, J., Krogh, A., and Palmer, R. G. (1991). Introduction to the Theory of Neural Computation. Perseus, Reading, MA.
References II
Linsker, R. (1988). Self-organization in a perceptual network. Computer, 21(3):105-117.
MacKay, D. and Miller, K. (2001). Analysis of Linsker's simulations of Hebbian rules. In Self-Organizing Map Formation: Foundations of Neural Computation.
Miller, K. D. (1994). A model for the development of simple cell receptive fields and the ordered arrangement of orientation columns through activity-dependent competition between on- and off-center inputs. J. Neurosci., 14(1):409-441.
Oja, E. (1982). A simplified neuron model as a principal component analyzer. J. Math. Biol., 15:267-273.
Oja, E. (1989). Neural networks, principal components, and subspaces. International Journal of Neural Systems, 1(1):61-68.

References III
Sanger, T. (1989). Optimal unsupervised learning in a single-layer linear feedforward neural network. Neural Networks, 2(6):459-473.
Wallis, G. and Rolls, E. T. (1996). A model of invariant object recognition in the visual system. Prog. Neurobiol., 51:167-194.