Event-Driven Random Backpropagation: Enabling Neuromorphic Deep Learning Machines


Event-Driven Random Backpropagation: Enabling Neuromorphic Deep Learning Machines. Emre Neftci, Department of Cognitive Sciences and Department of Computer Science, UC Irvine. March 7, 2017

Scalable Event-Driven Learning Machines
Cauwenberghs, Proceedings of the National Academy of Sciences, 2013; Karakiewicz, Genov, and Cauwenberghs, IEEE Sensors Journal, 2012; Neftci, Augustine, Paul, and Detorakis, arXiv preprint arXiv:1612.05596, 2016
1000x power improvements compared to future GPU technology through two factors: architecture- and device-level optimization in event-based computing, and algorithmic optimization in neurally inspired learning and inference.

Neuromorphic Computing Can Enable Low-Power, Massively Parallel Computing
Only spikes are communicated and routed between neurons; weights and internal states stay local. To use this architecture for practical workloads, we need algorithms that operate on local information.

Why Do Embedded Learning?
For many industrial applications involving controlled environments, where existing data is readily available, off-chip/off-line learning is often sufficient. So why do embedded learning? Two main use cases: mobile, low-power platforms in uncontrolled environments, where adaptive behavior is required; and working around device mismatch and non-idealities. This potentially rules out: self-driving cars, data mining, fraud detection.

Neuromorphic Learning Machines
Neuromorphic Learning Machines: online learning for data-driven autonomy and algorithmic efficiency.
Hardware & Architecture: scalable neuromorphic learning hardware design.
Programmability: neuromorphic supervised, unsupervised, and reinforcement learning framework.

Foundations for Neuromorphic Machine Learning: Software Framework & Library (neon_mlp_extract.py)
Excerpt from a neon MLP example; imports added and class names restored to neon's capitalization. The names init_norm (a weight initializer) and args (command-line options) are defined elsewhere in the example.

    from neon.layers import Affine, GeneralizedCost
    from neon.transforms import Rectlin, Logistic, CrossEntropyBinary
    from neon.optimizers import GradientDescentMomentum

    # setup model layers
    layers = [Affine(nout=100, init=init_norm, activation=Rectlin()),
              Affine(nout=10, init=init_norm, activation=Logistic(shortcut=True))]

    # setup cost function as CrossEntropy
    cost = GeneralizedCost(costfunc=CrossEntropyBinary())

    # setup optimizer
    optimizer = GradientDescentMomentum(0.1, momentum_coef=0.9,
                                        stochastic_round=args.rounding)

Can we design a digital neuromorphic learning machine that is flexible and efficient?

Examples of linear I&F neuron models
Leaky Stochastic I&F Neuron (LIF):
V[t+1] = α V[t] + Σ_{j=1}^{n} ξ_j w_j(t) s_j(t)   (1a)
V[t+1] ≥ T :  V[t+1] ← V_reset   (1b)
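As a minimal sketch (not from the slides; the parameter values, input rates, and blank-out probability are illustrative), the Eq. (1) update can be stepped directly in discrete time:

    # Minimal NumPy sketch of the stochastic LIF update in Eq. (1).
    # alpha, T, the input statistics and the noise gate are illustrative choices.
    import numpy as np

    rng = np.random.default_rng(0)
    n, n_steps = 100, 1000                        # synapses, simulation ticks
    alpha, T, V_reset = 0.95, 1.0, 0.0
    w = rng.normal(0.0, 0.05, n)                  # synaptic weights w_j
    V, out_spikes = 0.0, []

    for t in range(n_steps):
        s = (rng.random(n) < 0.02).astype(float)  # Bernoulli input spikes s_j[t]
        xi = (rng.random(n) < 0.8).astype(float)  # stochastic synapse gate xi_j
        V = alpha * V + np.sum(xi * w * s)        # Eq. (1a)
        if V >= T:                                # Eq. (1b): threshold and reset
            out_spikes.append(t)
            V = V_reset

    print(len(out_spikes), "output spikes in", n_steps, "ticks")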

Examples of linear I&F neuron models, continued
LIF with first-order kinetic synapse:
V[t+1] = α V[t] + I_syn[t]   (2a)
I_syn[t+1] = a_1 I_syn[t] + Σ_{j=1}^{n} w_j(t) s_j(t)   (2b)
V[t+1] ≥ T :  V[t+1] ← V_reset   (2c)

Examples of linear I&F neuron models, continued
LIF with second-order kinetic synapse:
V[t+1] = α V[t] + I_syn[t]   (3a)
I_syn[t+1] = a_1 I_syn[t] + c_1 I_s[t] + η[t] + b   (3b)
I_s[t+1] = a_2 I_s[t] + Σ_{j=1}^{n} w_j s_j[t]   (3c)
V[t+1] ≥ T :  V[t+1] ← V_reset   (3d)

Examples of linear I&F neuron models, continued
Dual-compartment LIF with synapses:
V_1[t+1] = α V_1[t] + α_21 V_2[t]   (4a)
V_2[t+1] = α V_2[t] + α_12 V_1[t] + I_syn[t]   (4b)
I_syn[t+1] = a_1 I_syn[t] + Σ_{j=1}^{n} w^1_j(t) s_j(t) + η[t] + b   (4c)
V_1[t+1] ≥ T :  V_1[t+1] ← V_reset   (4d)

Mihalas-Niebur Neuron (MNN)
V[t+1] = α V[t] + I_e − G E_L + Σ_{i=1}^{n} I_i[t]   (5a)
Θ[t+1] = (1 − b) Θ[t] + a V[t] − a E_L + b   (5b)
I_1[t+1] = α_1 I_1[t]   (5c)
I_2[t+1] = α_2 I_2[t]   (5d)
V[t+1] ≥ Θ[t+1] :  Reset(V[t+1], I_1, I_2, Θ)   (5e)
The MNN can produce a wide variety of spiking behaviors. Mihalas and Niebur, Neural Computation, 2009
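A small sketch of how Eq. (5) can be stepped in software, assuming the multi-variable reset of Mihalas and Niebur (2009) (V ← V_r, Θ ← max(Θ_r, Θ), I_k ← R_k I_k + A_k); all constants below are illustrative placeholders, not values from the slides:

    # Sketch of the MNN update in Eq. (5) with an illustrative reset and constants.
    alpha, a, b, E_L, G, I_e = 0.95, 0.01, 0.01, 0.0, 0.05, 0.12
    alpha_1, alpha_2 = 0.9, 0.8
    V_r, Theta_r = 0.0, 0.2
    R_1, A_1, R_2, A_2 = 0.0, 0.1, 1.0, 0.0

    V, Theta, I_1, I_2 = 0.0, 0.2, 0.0, 0.0
    spikes = []
    for t in range(1000):
        V_new = alpha * V + I_e - G * E_L + I_1 + I_2      # Eq. (5a)
        Theta = (1 - b) * Theta + a * V - a * E_L + b      # Eq. (5b): adaptive threshold
        I_1, I_2 = alpha_1 * I_1, alpha_2 * I_2            # Eqs. (5c), (5d)
        V = V_new
        if V >= Theta:                                     # Eq. (5e): reset V, Theta, I_1, I_2
            spikes.append(t)
            V, Theta = V_r, max(Theta_r, Theta)
            I_1, I_2 = R_1 * I_1 + A_1, R_2 * I_2 + A_2

    print(len(spikes), "spikes")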

Digital Neural and Synaptic Array Transceiver (NSAT)
Multicompartment generalized integrate-and-fire neurons; multiplierless design; weight sharing (convnets) at the level of the core; equivalent software simulations for analyzing fault tolerance, precision, performance, and efficiency trade-offs (available publicly soon!).

NSAT Neural Dynamics Flexibility
[Figure: simulated membrane-potential traces (amplitude in mV vs. time in ticks) illustrating tonic spiking, phasic spiking, tonic bursting, mixed mode, Class I, and Class II behaviors.]
Detorakis, Augustine, Paul, Pedroni, Sheik, Cauwenberghs, and Neftci (in preparation)

Flexible Learning Dynamics
w_k[t+1] = w_k[t] + s_k[t+1] e_k   (Weight update)
e_k = x_m (K[t − t_k] + K[t_k − t_last])   (Eligibility; the kernel terms implement STDP)
x_m = Σ_i γ_i x_i   (Modulation)
Detorakis, Augustine, Paul, Pedroni, Sheik, Cauwenberghs, and Neftci (in preparation)
Based on two insights:
1. Causal and acausal STDP weight updates are applied on presynaptic spikes only, using only forward-lookup access of the synaptic connectivity table (Pedroni et al., 2016).
2. Plasticity involves, as a third factor, a local dendritic potential, besides pre- and postsynaptic firing times (Urbanczik and Senn, Neuron, 2014; Clopath, Büsing, Vasilaki, and Gerstner, Nature Neuroscience, 2010).
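A compact sketch of this update for a single synapse, with an exponential STDP kernel standing in for K (the kernel shape and time constant are illustrative, not specified on the slide):

    # Sketch of the three-factor update above for one synapse k.
    # K is taken to be an exponential kernel here; A and tau are illustrative.
    import math

    A, tau = 1.0, 20.0

    def K(dt):
        return A * math.exp(-abs(dt) / tau)

    def modulation(gammas, xs):
        """x_m = sum_i gamma_i * x_i, built from local state variables x_i."""
        return sum(g * x for g, x in zip(gammas, xs))

    def weight_update(w_k, s_k, t, t_k, t_last, x_m):
        """w_k[t+1] = w_k[t] + s_k[t+1] * e_k, applied on presynaptic spikes only.
        e_k = x_m * (K[t - t_k] + K[t_k - t_last]) combines causal and acausal STDP."""
        e_k = x_m * (K(t - t_k) + K(t_k - t_last))
        return w_k + s_k * e_k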

Applications for Three-factor Plasticity Rules
Example learning rules:
Reinforcement Learning: Δw_ij = η r STDP_ij (Florian, Neural Computation, 2007)
Unsupervised Representation Learning: Δw_ij = η g(t) STDP_ij (Neftci, Das, Pedroni, Kreutz-Delgado, and Cauwenberghs, Frontiers in Neuroscience, 2014)
Unsupervised Sequence Learning: Δw_ij = η (Θ(V) − α(ν_i − C)) ν_j (Sheik et al., 2016)
Supervised Deep Learning: Δw_ij = η (ν_tgt − ν_i) φ'(V) ν_j (Neftci, Augustine, Paul, and Detorakis, arXiv preprint arXiv:1612.05596, 2016)


Gradient Backpropagation (BP) is Non-local on Neural Substrates
Potential incompatibilities of BP on a neural (neuromorphic) substrate:
1. Symmetric weights
2. Computing multiplications and derivatives
3. Propagating error signals with high precision
4. Precise alternation between forward and backward passes
5. Synaptic weights can change sign
6. Availability of targets

Feedback Alignment
Replace the weight matrices in the backprop phase with (fixed) random weights.
Lillicrap, Cownden, Tweed, and Akerman, arXiv preprint arXiv:1411.0247, 2014; Baldi, Sadowski, and Lu, arXiv preprint arXiv:1612.02734, 2016
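A minimal NumPy sketch of the idea for a two-layer network, where a fixed random matrix B replaces W2.T in the backward pass; layer sizes, loss, and learning rate are illustrative:

    # Feedback alignment sketch: the error is fed back through a fixed random
    # matrix B instead of the transpose of the forward weights W2.
    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hid, n_out = 784, 100, 10
    W1 = rng.normal(0, 0.1, (n_hid, n_in))
    W2 = rng.normal(0, 0.1, (n_out, n_hid))
    B = rng.normal(0, 0.1, (n_hid, n_out))     # fixed random feedback weights (never trained)

    def fa_step(x, target, W1, W2, B, lr=0.01):
        h = np.maximum(0.0, W1 @ x)            # ReLU hidden layer
        y = W2 @ h                             # linear output
        e = y - target                         # output error (squared-loss gradient)
        dh = (B @ e) * (h > 0)                 # backward pass through B, not W2.T
        W2 -= lr * np.outer(e, h)              # outer-product weight updates, in place
        W1 -= lr * np.outer(dh, x)
        return e

    fa_step(rng.random(n_in), np.eye(n_out)[3], W1, W2, B)  # one update toward a one-hot target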

Event-Driven Random Backpropagation (eRBP) for Deep Supervised Learning
Event-driven Random Backpropagation learning rule: error-modulated, membrane voltage-gated, event-driven, supervised.
Δw_ik ∝ φ'(I_syn,i[t]) S_k[t] Σ_j G_ij (L_j[t] − P_j[t])   (eRBP)
Here φ'(I_syn,i[t]) is the derivative term, Σ_j G_ij (L_j[t] − P_j[t]) is the (randomly projected) error term, and their product forms the local modulation T_i.
Approximate the derivative with a boxcar function: one addition and two comparisons per synaptic event.
Neftci, Augustine, Paul, and Detorakis, arXiv preprint arXiv:1612.05596, 2016
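A sketch of what one event-driven update looks like in software. The rule itself and the "one addition and two comparisons" count are from the slide; the boxcar boundaries, learning rate, and the shape of the random matrix G are illustrative assumptions:

    # eRBP sketch: update of synapse (i, k) on a presynaptic spike S_k[t] = 1.
    # b_min, b_max, eta and the random matrix G are illustrative.
    import numpy as np

    rng = np.random.default_rng(0)
    n_hidden, n_labels = 100, 10
    G = rng.normal(0.0, 1.0, (n_hidden, n_labels))   # fixed random feedback weights

    def boxcar(I_syn, b_min=-10.0, b_max=10.0):
        """Boxcar stand-in for the derivative phi'(I_syn): two comparisons."""
        return 1.0 if b_min <= I_syn <= b_max else 0.0

    def error_projection(i, L, P):
        """T_i = sum_j G_ij (L_j - P_j): random projection of label/prediction error."""
        return G[i] @ (L - P)

    def erbp_update(w_ik, I_syn_i, T_i, eta=1e-4):
        """Applied only when presynaptic neuron k spikes (S_k[t] = 1):
        one addition, gated by the two boxcar comparisons."""
        return w_ik + eta * boxcar(I_syn_i) * T_i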

eRBP PI MNIST Benchmarks (network classification error)
Dataset     Network           eRBP     peRBP    RBP (GPU)   BP (GPU)
PI MNIST    784-100-10        3.94%    3.02%    2.74%       2.19%
PI MNIST    784-200-10        3.53%    2.69%    2.15%       1.81%
PI MNIST    784-500-10        2.76%    2.40%    2.08%       1.8%
PI MNIST    784-200-200-10    3.48%    2.29%    2.42%       1.91%
PI MNIST    784-500-500-10    --       2.02%    2.20%       1.90%
peRBP = eRBP with stochastic synapses

peRBP MNIST Benchmarks (Convolutional Neural Net, network classification error)
Dataset   peRBP             RBP (GPU)   BP (GPU)
MNIST     3.8% (5 epochs)   1.95%       1.23%

Energetic Efficiency
Energy efficiency during inference: <5% error at about 100,000 SynOps per classification (SynOps counted until the first output spike).
                   eRBP            DropConnect (GPU)   SpiNNaker   TrueNorth
Implementation     (20 pJ/SynOp)   CPU/GPU             ASIC        ASIC
Accuracy           95%             99.79%              95%         95%
Energy/classify    2 µJ            1265 µJ             6000 µJ     4 µJ
Technology         28 nm           Unknown             --          28 nm
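The 2 µJ entry for eRBP follows directly from the SynOp count and the assumed cost per synaptic operation: 100,000 SynOps/classification × 20 pJ/SynOp = 2 × 10^6 pJ = 2 µJ per classification.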

Energetic Efficiency
Energy efficiency during training: SynOp-MAC parity; embedded local plasticity dynamics for continuous (life-long) learning.

Learning using Fixed-Point Variables
16-bit neural states and 8-bit synaptic weights (1 Mbit of synaptic weight memory). An all-digital implementation for exploring scalable event-based learning. UCI (Neftci, Krichmar, Dutt), UCSD (Cauwenberghs)
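As a rough consistency check (assuming, hypothetically, on the order of 2^17 ≈ 131k synapses per core, a figure not given on the slide): 2^17 weights × 8 bits/weight = 2^20 bits = 1 Mbit of synaptic weight memory.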

Summary & Acknowledgements
Summary:
1. NSAT: flexible and efficient neural learning machines.
2. Supervised deep learning with event-driven random backpropagation can achieve good learning results with >100x improvements in energy efficiency.
Challenges:
1. Catastrophic forgetting: need for hippocampus, intrinsic replay, and neurogenesis.
2. Build a neuromorphic library of deep learning tricks (batch normalization, Adam, ...).

Acknowledgements
Collaborators: Georgios Detorakis (UCI), Somnath Paul (Intel), Charles Augustine (Intel).
Support:

P. Baldi, P. Sadowski, and Z. Lu. "Learning in the Machine: Random Backpropagation and the Learning Channel". arXiv preprint arXiv:1612.02734 (2016).
G. Cauwenberghs. "Reverse engineering the cognitive brain". Proceedings of the National Academy of Sciences 110.39 (2013), pp. 15512–15513.
C. Clopath, L. Büsing, E. Vasilaki, and W. Gerstner. "Connectivity reflects coding: a model of voltage-based STDP with homeostasis". Nature Neuroscience 13.3 (2010), pp. 344–352.
R. V. Florian. "Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity". Neural Computation 19.6 (2007), pp. 1468–1502.
R. Karakiewicz, R. Genov, and G. Cauwenberghs. "1.1 TMACS/mW Fine-Grained Stochastic Resonant Charge-Recycling Array Processor". IEEE Sensors Journal 12.4 (Apr. 2012), pp. 785–792.
T. P. Lillicrap, D. Cownden, D. B. Tweed, and C. J. Akerman. "Random feedback weights support learning in deep neural networks". arXiv preprint arXiv:1411.0247 (2014).
S. Mihalas and E. Niebur. "A generalized linear integrate-and-fire neural model produces diverse spiking behavior". Neural Computation 21 (2009), pp. 704–718.
E. Neftci, S. Das, B. Pedroni, K. Kreutz-Delgado, and G. Cauwenberghs. "Event-Driven Contrastive Divergence for Spiking Neuromorphic Systems". Frontiers in Neuroscience 7.272 (Jan. 2014). DOI: 10.3389/fnins.2013.00272.
E. Neftci, C. Augustine, S. Paul, and G. Detorakis. "Event-driven Random Back-Propagation: Enabling Neuromorphic Deep Learning Machines". arXiv preprint arXiv:1612.05596 (2016).
B. U. Pedroni, S. Sheik, S. Joshi, G. Detorakis, S. Paul, C. Augustine, E. Neftci, and G. Cauwenberghs. "Forward Table-Based Presynaptic Event-Triggered Spike-Timing-Dependent Plasticity". IEEE Biomedical Circuits and Systems Conference (BioCAS), Oct. 2016. URL: https://arxiv.org/abs/1607.03070.
R. Urbanczik and W. Senn. "Learning by the dendritic prediction of somatic spiking". Neuron 81.3 (2014), pp. 521–528.