Probabilistic Models in Theoretical Neuroscience


Neural models of probabilistic sampling: introduction
Matt Graham, 16th January 2014
[Title slide figure: a Boltzmann machine, a semi-restricted Boltzmann machine and a restricted Boltzmann machine, with visible and hidden units indicated.]

Overview
1 Motivation: What is the neural sampling hypothesis? Why is it interesting? Toy example
2 Theory review: Stochastic networks. Why sigmoidal conditionals? Boltzmann machines
3 Neural dynamics as sampling: Introduction. Overview of model. Simulations
4 More recent work


Neural sampling hypothesis. A model for probabilistic perception and learning. Proposes that activity patterns across networks of neurons represent samples from the posterior distribution over interpretations given the input. Neural response variability reflects uncertainty in the interpretation of inputs. Spontaneous network activity corresponds to samples from the prior distribution over inputs and interpretations. Some experimental support: systematic variation in response variability, and a high degree of structure in spontaneous network activity together with its similarity to stimulus-evoked activity.

Computational advantages (I): Anytime computing. [Figure: samples in the (y1, y2) plane; more sampling time gives increasing accuracy, less time a coarser estimate.]

Computational advantages (II): Marginalisation at no extra cost. [Figure: samples in the (y1, y2) plane.]
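
To make the "marginalisation at no extra cost" point concrete, here is a minimal numpy sketch (mine, not from the talk; the correlated Gaussian posterior is just a stand-in example): given joint samples over (y1, y2), the marginal over y1 is obtained by simply ignoring the y2 coordinate of each sample.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical correlated 2D Gaussian standing in for a posterior over (y1, y2).
mean = np.array([0.0, 1.0])
cov = np.array([[1.0, 0.8],
                [0.8, 2.0]])

# Joint samples represent the full posterior ...
joint_samples = rng.multivariate_normal(mean, cov, size=5000)

# ... and the marginal over y1 is just the first coordinate of each sample:
# no extra computation (no integration over y2) is needed.
y1_samples = joint_samples[:, 0]

print("E[y1] estimate:", y1_samples.mean())    # ~ 0.0
print("Var[y1] estimate:", y1_samples.var())   # ~ 1.0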

Computational advantages (III): Consistency of representation. The distinction between input and output is arbitrary; applies to hierarchical and recurrent models; supports learning from examples; naturally deals with incomplete input.

Toy example. [Figure: an ambiguous handwritten character, with candidate interpretations 'o', 'b' and '6'.]

Toy example. [Figure: the same ambiguous character shown next to digits; the context shifts the interpretation towards '6'.]

Overview, revisited for section 2: Theory review (Stochastic networks; Why sigmoidal conditionals?; Boltzmann machines).

Stochastic binary neural network models. Spiking point neuron models. Inter-neuron communication is assumed to be entirely spike based (binary). Neural spiking is stochastic: the network dynamics define the probability of each neuron firing given the current state of the network. Typically discrete time models: time is binned into small intervals and the network state is defined as a set of binary variables indicating whether each neuron fired in the last interval or not.

General sigmoidal stochastic binary network (SSBN). Network of N binary neurons, with states $\mathbf{s} = [s_i]_{i \in \{1 \dots N\}} \in \{0, 1\}^N$. Parametrised by a weight matrix $W = [w_{ij}]_{i,j \in \{1 \dots N\}} \in \mathbb{R}^{N \times N}$ and a bias vector $\mathbf{b} = [b_i]_{i \in \{1 \dots N\}} \in \mathbb{R}^N$. The local potential is the weighted sum of the states of the other units plus a bias:
$$u_i^{(t)} = \sum_{j=1}^{N} w_{ij} s_j^{(t)} + b_i$$
If unit $i$ is updated at $t + 1$, its new state is sampled from the conditional
$$P\left(s_i^{(t+1)} = 1 \mid \mathbf{s}^{(t)}\right) = \sigma\left(u_i^{(t)}\right) = \frac{1}{1 + e^{-u_i^{(t)}}}$$
A special case of a more general Markov random field. [Figure: the sigmoid function $\sigma(u)$.]
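
As a concrete illustration, a minimal Python/numpy sketch (mine; the asynchronous random update order and the small random network are assumptions made for the example) of a single SSBN update step as defined above:

import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def ssbn_update(s, W, b, rng):
    """Update one randomly chosen unit of a sigmoidal stochastic binary network.

    s : (N,) binary state vector, W : (N, N) weights (zero diagonal assumed),
    b : (N,) biases.
    """
    i = rng.integers(len(s))
    u_i = W[i] @ s + b[i]                # local potential u_i = sum_j w_ij s_j + b_i
    s = s.copy()
    s[i] = rng.random() < sigmoid(u_i)   # P(s_i = 1 | rest of network) = sigma(u_i)
    return s

# Example: run a small random network for a few update steps.
rng = np.random.default_rng(0)
N = 5
W = rng.normal(scale=0.5, size=(N, N))
np.fill_diagonal(W, 0.0)
b = rng.normal(scale=0.5, size=N)
s = rng.integers(0, 2, size=N)
for _ in range(10):
    s = ssbn_update(s, W, b, rng)
print(s)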

A brief aside: is there any biological justification for using sigmoidal conditional distributions? (Yes.)

Origins of stochasticity in biological neurons. For a fixed injected current signal, neural firing tends to be highly consistent. Variability appears to arise mainly from synaptic transmission: the number of synaptic vesicles released on arrival of a presynaptic spike, and the transmitter content of each vesicle, both fluctuate. Figure source: Mainen and Sejnowski (1995).

Synaptic noise model (I). Number of vesicles released: Poisson distributed. Transmitter content of each vesicle: Gaussian distributed. Figure source: Del Castillo and Katz (1954).

Synaptic noise model (II). [Figure: the Gaussian CDF $y = \Phi(x)$ compared with the scaled sigmoid $y = \sigma(\sqrt{\pi}\, x)$.] Under the assumption of independent distributions and a large number of synaptic connections, the central limit theorem implies that the conditional distribution of the membrane potential, given the spiking state of the rest of the network, is Gaussian. The conditional probability of a neuron being super-threshold, and so spiking, therefore takes the form of a Gaussian CDF. The Gaussian CDF $\Phi(x)$ is well approximated by a scaled sigmoid, where $\sigma(x) = [1 + \exp(-x)]^{-1}$.
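
A quick numerical check of that closing claim (my illustration; the scaling constant is chosen here to match the slopes of the two curves at zero, which need not be exactly the scaling used in the slide's figure):

import numpy as np
from scipy.stats import norm

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

x = np.linspace(-4, 4, 401)
gauss_cdf = norm.cdf(x)

# Scale chosen so sigmoid and Gaussian CDF have equal slope at x = 0:
# c = 4 / sqrt(2*pi) ~= 1.596.
c = 4.0 / np.sqrt(2.0 * np.pi)
approx = sigmoid(c * x)

# The maximum pointwise discrepancy is small (on the order of 0.02).
print("max |Phi(x) - sigma(c x)|:", np.abs(gauss_cdf - approx).max())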

Boltzmann machines (BM). [Figure: a Boltzmann machine, a semi-restricted Boltzmann machine and a restricted Boltzmann machine, with visible and hidden units indicated.] An analytically tractable variant of the SSBN, also known as an Ising model within statistical physics. Constrained to have symmetric connectivity ($w_{ij} = w_{ji}\ \forall i, j$) and zero self-connectivity ($w_{ii} = 0\ \forall i$). Visible units are fixed (clamped) to known values; hidden units are freely varying. Restricted and semi-restricted BMs are special cases of the general BM with restricted connectivity graphs that allow simpler updates.

Boltzmann machine dynamics. At each time step a single unit is picked to update, either in a deterministic sequence or at random (Gibbs sampling). Symmetric connectivity enforces the detailed balance condition, i.e. that transitions are reversible, guaranteeing the existence of an equilibrium distribution:
$$P\left(\mathbf{s}^{(t)} = \mathbf{u}\right) P\left(\mathbf{s}^{(t+1)} = \mathbf{v} \mid \mathbf{s}^{(t)} = \mathbf{u}\right) = P\left(\mathbf{s}^{(t+1)} = \mathbf{v}\right) P\left(\mathbf{s}^{(t)} = \mathbf{u} \mid \mathbf{s}^{(t+1)} = \mathbf{v}\right)$$
After an initial burn in, the dynamics of the network cause it to sample at equilibrium from the Boltzmann distribution
$$P(\mathbf{s}) = \frac{1}{Z} \exp\left(-E(\mathbf{s})\right) = \frac{1}{Z} \exp\left(\frac{1}{2} \mathbf{s}^T W \mathbf{s} + \mathbf{b}^T \mathbf{s}\right)$$
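
A small Python sketch of this claim (mine, assuming numpy; the network size and sweep counts are arbitrary): Gibbs sampling a tiny network with symmetric weights and zero self-connections, then comparing the empirical state frequencies after burn-in against the exact Boltzmann probabilities.

import itertools
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

rng = np.random.default_rng(1)
N = 3

# Random symmetric weights with zero self-connectivity, and random biases.
A = rng.normal(size=(N, N))
W = (A + A.T) / 2
np.fill_diagonal(W, 0.0)
b = rng.normal(size=N)

def energy(s):
    return -(0.5 * s @ W @ s + b @ s)

# Exact Boltzmann distribution over all 2^N states.
states = np.array(list(itertools.product([0, 1], repeat=N)), dtype=float)
p_exact = np.exp([-energy(s) for s in states])
p_exact /= p_exact.sum()

# Gibbs sampling: pick a unit, resample it from its sigmoidal conditional.
s = rng.integers(0, 2, size=N).astype(float)
counts = np.zeros(len(states))
for t in range(60000):
    i = rng.integers(N)
    s[i] = float(rng.random() < sigmoid(W[i] @ s + b[i]))
    if t >= 10000:  # discard burn-in samples
        counts[int(sum(s[k] * 2 ** (N - 1 - k) for k in range(N)))] += 1

print(np.round(p_exact, 3))                # exact Boltzmann probabilities
print(np.round(counts / counts.sum(), 3))  # empirical frequencies (should be close)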

Boltzmann machine learning. Boltzmann machines can be trained so that the equilibrium distribution tends towards any arbitrary distribution over binary vectors, given samples from that distribution (Ackley, Hinton and Sejnowski, 1985). With the state partitioned as $\mathbf{s} = [\mathbf{s}_h^T\ \mathbf{s}_v^T]^T$, the log likelihood derivative is
$$\frac{\partial \log P(\mathbf{s}_v)}{\partial w_{ij}} = \sum_{\mathbf{s}_h} s_i s_j\, P(\mathbf{s}_h \mid \mathbf{s}_v) - \sum_{\mathbf{s}} s_i s_j\, P(\mathbf{s}) = \langle s_i s_j \rangle_{+} - \langle s_i s_j \rangle_{-}$$
The expectations are generally analytically intractable and are approximated with MCMC sampling based methods. The learning rule is local and Hebbian-like, hence biologically plausible. For large networks, learning is very slow due to the need to allow the network to converge to its equilibrium distribution.
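
A hedged sketch of one step of this learning rule (mine; the network layout, sweep counts and learning rate are placeholder choices): both expectations are estimated by Gibbs sampling, once with the visible units clamped to a data pattern ("positive" phase) and once free-running ("negative" phase), and the weights are moved along their difference.

import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def gibbs_sweep(s, W, b, free_idx, rng):
    """One asynchronous Gibbs sweep, resampling only the units in free_idx."""
    for i in free_idx:
        s[i] = float(rng.random() < sigmoid(W[i] @ s + b[i]))
    return s

def estimate_correlations(W, b, clamp, n_sweeps, rng):
    """Estimate <s_i s_j> by Gibbs sampling; units in `clamp` (index -> value) stay fixed."""
    N = len(b)
    free_idx = [i for i in range(N) if i not in clamp]
    s = rng.integers(0, 2, size=N).astype(float)
    for i, v in clamp.items():
        s[i] = v
    corr = np.zeros((N, N))
    for t in range(n_sweeps):
        s = gibbs_sweep(s, W, b, free_idx, rng)
        if t >= n_sweeps // 2:            # crude burn-in: discard the first half
            corr += np.outer(s, s)
    return corr / (n_sweeps - n_sweeps // 2)

# Tiny example: units 0, 1 are visible, units 2, 3 hidden; one training pattern.
rng = np.random.default_rng(2)
N, eta = 4, 0.1
A = rng.normal(scale=0.1, size=(N, N))
W = (A + A.T) / 2
np.fill_diagonal(W, 0.0)
b = np.zeros(N)
data_pattern = {0: 1.0, 1: 0.0}

clamped = estimate_correlations(W, b, data_pattern, 2000, rng)  # "positive" phase
free = estimate_correlations(W, b, {}, 2000, rng)               # "negative" phase
dW = eta * (clamped - free)
np.fill_diagonal(dW, 0.0)   # keep zero self-connectivity
W += dW                     # dW is symmetric, so W stays symmetric
print(np.round(W, 3))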

Parallel updates and asymmetric connectivity. Updating all units in parallel while maintaining symmetric connectivity gives a different but still tractable equilibrium distribution and learning rule (Apolloni and de Falco, 1991). Relaxing the symmetry constraint generally means it is no longer tractable to find an analytic form for the stationary distribution, and it is possible there will be none if the Markov chain is non-ergodic. The irreversibility introduced by weight asymmetry may however improve the speed of convergence to the stationary distribution, while also being more biologically relevant. A learning rule can still be derived using the time-dependent state distribution, but this introduces the requirement to take expectations over the history of states (Apolloni, Bertoni, Campadelli and de Falco, 1991).

Boltzmann machines as a model for cortical computation.
Pros: distributed computation; binary communication between units; high representational power; local learning rule.
Cons: discrete time formulation; reversible dynamics; symmetric connectivity; slow convergence.

Overview, revisited for section 3: Neural dynamics as sampling (Introduction; Overview of model; Simulations).

Neural Dynamics as Sampling: A model for stochastic computation in recurrent networks of spiking neurons. L. Buesing, J. Bill, B. Nessler and W. Maass, PLOS Computational Biology (2011). Demonstrates a network model with more biologically plausible dynamics than a BM that nevertheless samples from a Boltzmann distribution. Consists of a recurrently connected network of spiking neurons with irreversible dynamics. The irreversible dynamics allow the inclusion of a refractory mechanism and finite-duration post-synaptic potentials. Discrete time models with both absolute and relative refractory mechanisms are demonstrated. A continuous time formulation is shown as a limiting case of the discrete time dynamics.

Relation between spike activity and network state. [Figure: spike trains of neurons k = 1, ..., 8 over the window (t − τ, t], with the corresponding auxiliary variables ζ_k[t] and binary states z_k[t].] The network state is defined by $\boldsymbol{\zeta}[t] = [\zeta_1[t] \dots \zeta_N[t]]^T$, with the Markov property
$$P\left(\boldsymbol{\zeta}[t+1] \mid \boldsymbol{\zeta}[t], \boldsymbol{\zeta}[t-1], \dots\right) = P\left(\boldsymbol{\zeta}[t+1] \mid \boldsymbol{\zeta}[t]\right)$$
Here τ = absolute refractory period = PSP duration.

Discrete time model with absolute refractory mechanism:

for k = 1 to N:
    if ζ_k[t-1] > 1:
        ζ_k[t] = ζ_k[t-1] - 1
    else:
        u_k = Σ_{j=1}^{N} w_kj z_j[t] + b_k
        r ~ rand(0, 1)
        z_k[t] = 1 if r ≤ σ(u_k - log τ) else 0
        if z_k[t] = 1:
            ζ_k[t] = τ
        else:
            ζ_k[t] = 0
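
A minimal Python translation of the update step above (mine; the example weights, biases and τ are placeholders), keeping the same logic: refractory neurons count down their auxiliary variable, non-refractory neurons fire with probability σ(u_k − log τ) and, on firing, start a new refractory/PSP window of length τ.

import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def neural_sampling_step(zeta, z, W, b, tau, rng):
    """One discrete time step of the absolute-refractory sampling dynamics.

    zeta : (N,) integer auxiliary variables (time left in refractory/PSP window)
    z    : (N,) binary states (PSP active or not)
    """
    zeta = zeta.copy()
    z = z.copy()
    for k in range(len(zeta)):
        if zeta[k] > 1:
            zeta[k] -= 1                      # still refractory: count down
        else:
            u_k = W[k] @ z + b[k]             # membrane potential from active PSPs
            fired = rng.random() < sigmoid(u_k - np.log(tau))
            z[k] = 1 if fired else 0
            zeta[k] = tau if fired else 0
    return zeta, z

# Example run with placeholder parameters.
rng = np.random.default_rng(3)
N, tau = 5, 10
A = rng.normal(scale=0.5, size=(N, N))
W = (A + A.T) / 2
np.fill_diagonal(W, 0.0)
b = rng.normal(scale=0.5, size=N)
zeta = np.zeros(N, dtype=int)
z = np.zeros(N, dtype=int)
for _ in range(100):
    zeta, z = neural_sampling_step(zeta, z, W, b, tau, rng)
print(zeta, z)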

Discrete time model with relative refractory mechanism. Relaxes the assumption of a hard refractory period with no firing. The probability of a neuron firing is defined as the product of a function of its membrane potential and a function of its last firing time:
$$P\left(z_k[t] = 1 \mid \zeta_k[t-1], u_k[t-1]\right) = f\left(u_k[t-1]\right)\, g\left(\zeta_k[t-1]\right)$$

Sampling from random distributions with relative refractory mechanism

Effect of using more realistic post-synaptic potentials

Toy model of perceptual multistability

Overview, revisited for section 4: More recent work.

Bayesian computation emerges in generic cortical microcircuits through spike-timing-dependent plasticity. B. Nessler, M. Pfeiffer, L. Buesing and W. Maass, PLOS Computational Biology (2013). Proposes a biologically plausible probabilistic learning rule: spike-timing-dependent plasticity updates within soft winner-take-all cortical microcircuits are shown to approximate expectation maximisation. Limited to single layer networks in this paper; proposes that the approach could potentially be extended to deep and/or recurrent structures.

Stochastic Computations in Cortical Microcircuit Models. S. Habenschuss, Z. Jonke and W. Maass, PLOS Computational Biology (2013). Shows that, under quite general conditions, the activity of a network of neurons with some degree of stochasticity in its dynamics will converge to a stationary distribution. Oscillatory input / activity is shown to lead to phase-specific stationary distributions. Simulations are performed with a cortical microcircuit model with an anatomically based laminar structure, separate inhibitory and excitatory populations, and data-based network connectivity and short-term dynamics.

Thank you - any questions?

References
Mainen, Z. F., & Sejnowski, T. J. (1995). Reliability of spike timing in neocortical neurons. Science, 268(5216), 1503-1506.
Del Castillo, J., & Katz, B. (1954). Quantal components of the end-plate potential. The Journal of Physiology, 124(3), 560-573.
Ackley, D., Hinton, G., & Sejnowski, T. (1985). A learning algorithm for Boltzmann machines. Cognitive Science, 9(1), 147-169.
Apolloni, B., & de Falco, D. (1991). Learning by parallel Boltzmann machines. IEEE Transactions on Information Theory, 37(4), 1162-1165.
Apolloni, B., Bertoni, A., Campadelli, P., & de Falco, D. (1991). Asymmetric Boltzmann machines. Biological Cybernetics, 66(1), 61-70.
Buesing, L., Bill, J., Nessler, B., & Maass, W. (2011). Neural dynamics as sampling: A model for stochastic computation in recurrent networks of spiking neurons. PLoS Computational Biology, 7(11), e1002211.

Resources
Kappen, H. J. (2001). An introduction to stochastic neural networks. Handbook of Biological Physics, 4, 517-552.
Hinton, G. E. (2007). Boltzmann machine. Scholarpedia, 2(5), 1668.