Applications of Memristors in ANNs


Outline: brief intro to ANNs; firing-rate networks; single-layer perceptron experiment; other (simulation) examples; spiking networks and STDP.

ANNs. An ANN is a bio-inspired, massively parallel network, i.e. a directed graph with nodes acting as neurons and edges acting as synapses. The functionality is learned during a training phase by changing the weights of the synapses. ANNs can be classified by topology, by learning paradigm, and by the coding of neural information.

Very good review

Applications

Complexity: ~10^11 neurons, ~10^15 synapses. Connectivity: ~1:10,000. Massive parallelism: the "100-step rule" (neurons fire at a few to several hundred hertz, yet face recognition takes ~100 ms). Challenges: the cortical sheet is only 2-3 mm thick with an area of ~2200 cm^2.

McCulloch-Pitts neuron (1943); different activation functions.

By topology

By learning paradigm. Key questions: capacity, sample complexity, computational complexity.

By information coding: firing-rate vs. spiking models.

Perceptron: main idea. Single-layer perceptron with a bias input x_0 and inputs x_1 ... x_9: y = sgn[ Σ_{i=0}^{9} w_i x_i ].
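As a minimal illustration of this thresholded weighted sum (a Python sketch; the 3x3 pixel input size and the random weights are assumptions for the example, not values from the slide):

```python
import numpy as np

def perceptron_output(w, x):
    """Single-layer perceptron: y = sgn(sum_i w_i * x_i); x[0] = 1 is the bias input."""
    return 1 if np.dot(w, x) >= 0 else -1

x = np.array([1, 1, -1, 1, -1, 1, -1, 1, -1, 1])   # bias plus 9 pixel inputs (+1/-1)
w = np.random.uniform(-0.1, 0.1, size=10)          # illustrative random weights
print(perceptron_output(w, x))
```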

Hebbian rule: learning using only local information. Example: orientation selectivity.
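A minimal sketch of the plain rate-based Hebbian update Δw_i = η·x_i·y (the specific variant used in the lecture is not spelled out on the slide):

```python
import numpy as np

def hebbian_update(w, x, eta=0.01):
    """Plain Hebbian rule: each weight grows with the product of its presynaptic
    input x_i and the postsynaptic activity y, i.e. only local information."""
    y = np.dot(w, x)              # linear postsynaptic activity
    return w + eta * x * y

w = np.array([0.1, -0.2, 0.05])
x = np.array([1.0, 0.0, 1.0])
for _ in range(5):
    w = hebbian_update(w, x)
print(w)
```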

Multilayer perceptron. Key questions: number of layers, number of hidden neurons.

Backpropagation: gradient-descent method to minimize a cost function.
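A minimal backpropagation sketch for a tiny two-layer network with a quadratic cost; the network size, dataset (XOR), and learning rate are illustrative assumptions, not taken from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Tiny 2-4-1 multilayer perceptron trained on XOR by batch gradient descent.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([[0], [1], [1], [0]], dtype=float)
W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)

for _ in range(10000):
    h = sigmoid(X @ W1 + b1)                # hidden layer
    y = sigmoid(h @ W2 + b2)                # output layer
    delta2 = (y - d) * y * (1 - y)          # chain rule for the quadratic cost 0.5*(y-d)^2
    delta1 = (delta2 @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ delta2; b2 -= 0.5 * delta2.sum(axis=0)
    W1 -= 0.5 * X.T @ delta1; b1 -= 0.5 * delta1.sum(axis=0)

print(np.round(y.ravel(), 2))               # typically approaches [0, 1, 1, 0]
```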

Competitive learning

Learning binary patterns with a competitive network (instar learning law). What happens if more than four unique patterns are presented? What happens when the all-white pattern is presented?
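A sketch of winner-take-all competitive learning with an instar-style update, Δw_winner = α·(x - w_winner); the unit count, patterns, and learning rate are illustrative assumptions. With only four units, a fifth unique pattern has to overwrite a learned prototype, and an all-zero ("all-white") input activates no unit:

```python
import numpy as np

rng = np.random.default_rng(1)

def instar_step(W, x, alpha=0.2):
    """Winner-take-all competitive learning: only the winning unit's weight
    vector is moved toward the current input (instar law)."""
    winner = int(np.argmax(W @ x))
    W[winner] += alpha * (x - W[winner])
    return winner

# Four competitive units learning 9-pixel binary patterns (1 = black, 0 = white).
W = rng.uniform(0.0, 1.0, size=(4, 9))
patterns = [rng.integers(0, 2, 9).astype(float) for _ in range(4)]
for _ in range(20):
    for x in patterns:
        instar_step(W, x)
print(np.round(W, 2))   # rows move toward the patterns each unit wins
# An all-zero ("all-white") input gives W @ x = 0 for every unit: the no-signal issue.
```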

Complementary coding resolves the no-signal issue for the instar learning law. How can invariance (translation, size, angle, etc.) be learned?
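A one-function sketch of complement coding, a standard construction; the slide does not give its exact form:

```python
import numpy as np

def complement_code(x):
    """Complement coding: present x together with its complement (1 - x), so the
    total input activity is constant and an all-zero pattern is no longer silent."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([x, 1.0 - x])

print(complement_code([0, 0, 0]))   # -> [0. 0. 0. 1. 1. 1.]
```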

With added complex cells

With added complex cells: AND in the bottom layer, OR in the top layer; present one-hot patterns to the top layer.

Perceptron: main idea. Single-layer perceptron over a 3x3 binary pixel array (inputs x_1 ... x_9, pixel values +1 or -1) plus a bias input x_0: y = sgn[ Σ_{i=0}^{9} w_i x_i ]. Considered training/test patterns: pattern X (class d = +1) and pattern T (class d = -1). Perceptron training rule: Δw_i = α x_i^(p) (d^(p) - y^(p)). Crossbar implementation: each weight is a differential pair of conductances, G_i^+ - G_i^- = G_w; input voltages V_0 ... V_9 drive the rows, two columns collect the currents I^+ and I^-, and the output is y = sgn[I^+ - I^-], measured with a parameter analyzer. Alibart et al., submitted, 2012
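A sketch of this scheme: each weight is stored as a conductance pair, the output is the sign of the differential column current, and the perceptron rule is applied as equal-and-opposite conductance increments. Device-level numbers (conductance range, learning rate, clipping floor) are illustrative assumptions, not the experimental values:

```python
import numpy as np

ALPHA = 1e-5     # learning rate in conductance units (illustrative)
V_READ = 0.2     # read voltage, volts

def output(G_plus, G_minus, x):
    """y = sgn(I+ - I-) with inputs applied as voltages V_i = V_READ * x_i."""
    i_diff = V_READ * np.dot(x, G_plus - G_minus)
    return 1 if i_diff >= 0 else -1

def train_step(G_plus, G_minus, x, d):
    """Perceptron rule dw_i = ALPHA * x_i * (d - y), split over the G+/G- pair."""
    y = output(G_plus, G_minus, x)
    dw = ALPHA * x * (d - y)
    G_plus[:] = np.clip(G_plus + dw / 2, 1e-6, None)    # conductances stay positive
    G_minus[:] = np.clip(G_minus - dw / 2, 1e-6, None)

# 3x3 'X' and 'T' patterns (+1 = black, -1 = white) with a bias input x_0 = +1.
X_pat = np.array([1, -1, 1, -1, 1, -1, 1, -1, 1])
T_pat = np.array([1, 1, 1, -1, 1, -1, -1, 1, -1])
data = [(np.append(1, X_pat), +1), (np.append(1, T_pat), -1)]

rng = np.random.default_rng(0)
G_plus = rng.uniform(1e-4, 2e-4, 10)    # conductances in siemens (illustrative range)
G_minus = rng.uniform(1e-4, 2e-4, 10)
for _ in range(50):
    for x, d in data:
        train_step(G_plus, G_minus, x, d)
print([output(G_plus, G_minus, x) for x, _ in data])   # -> [1, -1]
```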

Widrow's memistor: the ADALINE concept and its hardware implementation. Bernard Widrow, Marcian Hoff. B. Widrow and M. E. Hoff, Jr., IRE WESCON Convention Record, 4:96, 1960

Pt/TiO_{2-x}/Pt devices; conductance defined as g = I(0.2 V)/0.2 V. Device stack: 25 nm Au / 15 nm Pt top electrode, 30 nm TiO_{2-x}, e-beam-patterned Pt protrusion, 5 nm Ti / 25 nm Pt bottom electrode. [I-V plot: current (mA) vs. voltage (V), with switching thresholds -V_switch and +V_switch.] Any state between ON and OFF is accessible; in principle this is a dynamic system with frequency-dependent loop size, but the switching dynamics are strongly (super-exponentially) nonlinear. Gray area = no change; the state is well defined within the gray area. Alibart et al., submitted, 2012

Switching dynamics. Protocol: initialize to R_OFF, apply SET pulses, read; initialize to R_ON, apply RESET pulses, read. A small pulse amplitude gives finer state changes but may require exponentially long times; a large pulse amplitude is faster but gives cruder steps. [Plots: R/R_0 vs. pulse voltage (RESET: R_0 = R_ON; SET: R_0 = R_OFF) and current @ -200 mV vs. time for pulse amplitudes from -0.5 V to -1.3 V.] F. Alibart et al., Nanotechnology 23, 075201, 2012

Nonlinear switching dynamics: the effective barrier for ion hopping (activation energy U_A, hop distance a) is modulated by (1) heating (~k_B T), (2) the electric field (barrier lowering ~ E·a·q/2, with oxidation at one electrode and reduction at the other), and (3) phase transition or redox reaction. J. Yang et al., submitted, 2012

Speed vs. retention. For linear ionic transport, t_store/t_write ~ V_write/V_T (V_T = thermal voltage), which is far too small. A nonlinear effect due to temperature and/or electric field is required; e.g., with temperature activation only: t_store/t_write ~ exp(U_A/k_B T_store) / exp(U_A/k_B T_write). D. Strukov et al., Appl. Phys. A 94, 515 (2009)
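A small numerical illustration of the thermally activated case above; the barrier height and temperatures are assumed values, chosen only to show the orders of magnitude involved:

```python
import math

K_B = 8.617e-5          # Boltzmann constant, eV/K

def retention_to_write_ratio(U_A, T_store, T_write):
    """t_store / t_write ~ exp(U_A/(kB*T_store)) / exp(U_A/(kB*T_write))
    when ionic mobility is thermally activated (temperature effect only)."""
    return math.exp(U_A / (K_B * T_store) - U_A / (K_B * T_write))

# Assumed numbers: 1 eV hopping barrier, storage at 300 K, local heating to 600 K during write.
print(f"{retention_to_write_ratio(1.0, 300.0, 600.0):.2e}")   # ~2.5e+08
```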

Switching statistics for 10 TiO_{2-x} devices. [Plots: current @ 200 mV vs. cumulative pulse time for SET (0.6 V to 1.4 V) and RESET (-0.6 V to -1.4 V).] Large dispersion in the switching dynamics! Alibart et al., submitted, 2012

Variations in switching behavior; conductance defined as g = I(0.2 V)/0.2 V. Write-then-read protocol (SET or RESET pulse, then tune/read) gives a continuous state change. [Plots: I-V curve; g_AFTER/g_INITIAL vs. pulse voltage for SET and RESET; synaptic weight (mS) vs. initial conductance.] Alibart et al., submitted, 2012

Tuning algorithm. Start (inputs: desired state I_desired, desired accuracy A_desired; initialize the write voltage to a small, non-disturbing value V_WRITE = 200 mV and the voltage step to V_STEP = 10 mV). Read: apply V_READ = 200 mV and read the current I_current. Processing: is the state reached within the required precision, i.e. (I_desired - I_current)/I_desired < A_desired? If yes, finish. If no: check for overshoot and set the sign of the increment, i.e. sign = sgn(I_current - I_desired); if V_WRITE != V_READ and sign != oldsign, re-initialize V_WRITE = 200 mV. Write: V_WRITE = V_WRITE + sign * V_STEP, oldsign = sign, apply the pulse V_WRITE, and return to the read step. Intuitive algorithm: alternate set and reset pulses with reads in between; implemented algorithm: the same, but with non-disturbing read pulses. F. Alibart et al., Nanotechnology 23, 075201, 2012
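A self-contained sketch of this feedback write-verify loop. The device is replaced by a toy nonlinear model so the code runs on its own; the switching threshold, the device response, and the way pulse polarity is handled separately from the ramped amplitude are assumptions of the sketch, not the published implementation:

```python
def read_current(v_read, device):
    """Stand-in for the instrument read: Ohmic device model I = g * V."""
    return device["g"] * v_read

def apply_pulse(v_pulse, device):
    """Stand-in device model: conductance change grows steeply (nonlinearly)
    with pulse amplitude beyond an assumed 0.5 V switching threshold."""
    dv = abs(v_pulse) - 0.5
    if dv > 0:
        device["g"] += 1e-5 * (1 if v_pulse > 0 else -1) * dv ** 3

def tune(device, i_desired, a_desired, v_read=0.2, v_step=0.01, v_max=1.5):
    """Feedback write-verify tuning: ramp the pulse amplitude in small steps,
    restarting from a non-disturbing amplitude whenever the target is overshot."""
    v_write, oldsign = v_read, 0
    for _ in range(10000):                         # safety bound on pulse count
        i_current = read_current(v_read, device)
        if abs(i_desired - i_current) / i_desired < a_desired:
            return i_current                       # tuned within required precision
        sign = 1 if i_current < i_desired else -1  # increase or decrease conductance
        if oldsign != 0 and sign != oldsign:
            v_write = v_read                       # overshoot: restart the ramp
        v_write = min(v_write + v_step, v_max)
        oldsign = sign
        apply_pulse(sign * v_write, device)        # positive = SET, negative = RESET
    raise RuntimeError("did not converge")

device = {"g": 1e-4}                               # initial conductance, siemens
print(tune(device, i_desired=5e-5, a_desired=0.01))
```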

High-precision tuning of TiO_{2-x} devices (without protrusion). The pulse sequence alternates increase-weight (set), decrease-weight (reset), and stand-by (read-only) phases. [Plot: current @ -200 mV vs. pulse number, with target current levels of 120, 60, 30, 15, and 7 uA.] Tuning error 100·(g_des - g_act)/g_des < 1%, i.e. roughly 8-bit precision. F. Alibart et al., Nanotechnology 23, 075201, 2012

Limitation to tuning accuracy: random telegraph noise. [Plots: current vs. time for device resistances from 0.5 kΩ to 5 kΩ; ΔR/R (%) vs. resistance; PSD/I^2 (Hz^-1) vs. frequency.] Solid-state-electrolyte (electrochemical) devices are noisier, and the higher the resistance, the larger the noise. For a-Si devices this limits precision to roughly 5-6 bits (without optimization). Ligang Gao et al., VLSI-SoC, 2012

Perceptron experimental setup: an Agilent B1500 mainframe with an arbitrary waveform generator (B1530), a switching matrix (Agilent E5250A), current measurement using the B1530 in fast-IV mode, and a ground unit (GNDU); external wires implement the crossbar circuit, connected to a chip with packaged, wire-bonded memristive devices. Alibart et al., submitted, 2012

Perceptron: ex situ training. Weights are computed externally and imported by sequentially tuning each conductance pair g_i^+, g_i^- with read and write pulses (±V_switch). [Plot: evolution of the synaptic conductances g (mS) upon sequential tuning vs. pulse number, with the final weights after programming and tuning.] Weight-import accuracy ~10%. Crossbar half-select trick: half-selected devices are only slightly affected (>5-bit precision). Alibart et al., submitted, 2012

Perceptron: in situ training. Parallel weight update Δg_i^± = ±α x_i (d^(p) - y^(p)), applied in four steps; the effective learning rate α depends on the applied voltage V and the current conductance g. Training pulses ±V_train/2 and ±V_train are applied on the input and output lines according to the input pattern x and the desired class d. [Plot: evolution of the synaptic conductances g_1^± ... g_4^± upon parallel tuning over 16 training epochs, for V_train = 0.9 V and V_train = 1 V.] Alibart et al., submitted, 2012

Results. [Histograms: number of patterns vs. output current I^+ - I^- for the X and T classes, starting from initial (random) weights.] Ex situ: class separation improves as the weight-import accuracy improves from ~40% to ~10% to ~2%. In situ: results after 10 training epochs with V_train = 0.9 V and after 7 more epochs with V_train = 1 V. About 3-bit weight precision is enough for the considered task. Alibart et al., submitted, 2012

Big picture: memristive crossbars added on top of the CMOS stack, tightly integrated with CMOS logic (CMOL), implementing a multilayer perceptron network. Each weight w_ji of a neuron y_j is realized as a memristor conductance g_ji at a crosspoint above a CMOS cell.

Spiking Networks and Spike Timing Dependent Plasticity (STDP)

Spiking vs. firing-rate neural networks. Firing-rate networks: the average frequency matters (high frequency = level 1, low frequency = level 0). Spiking networks: the relative timing of the spikes matters and the delays between neurons matter, which enriches the functionality.

Spiking neural networks enable spatiotemporal processing, known to happen in biology, e.g. detecting the direction of a sound with two sensors and two neurons.

Polychronization: computation with spikes. According to Izhikevich, accounting for the timing of spikes increases the capacity of the network beyond that of Hopfield networks.

Hopfield networks. Binary Hopfield network: v_j(t+1) = sgn[ Σ_i w_ji v_i(t) ]. Capacity is p_max = N/log N.
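A minimal sketch of such a network with Hebbian (outer-product) storage and the update rule above; the pattern count, network size, and noise level are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_hopfield(patterns):
    """Hebbian storage: W = (1/N) * sum_p x_p x_p^T with zero diagonal, x_i in {-1, +1}."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, v, steps=5):
    """Asynchronous update v_j(t+1) = sgn(sum_i w_ji v_i(t))."""
    for _ in range(steps):
        for j in rng.permutation(len(v)):
            v[j] = 1 if W[j] @ v >= 0 else -1
    return v

N = 64
patterns = rng.choice([-1, 1], size=(3, N))            # store 3 random patterns
W = train_hopfield(patterns)
noisy = patterns[0].copy()
noisy[rng.choice(N, 8, replace=False)] *= -1           # flip 8 bits
print(np.array_equal(recall(W, noisy), patterns[0]))   # usually True
```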

Polychronization: computation with spikes. Due to STDP, the system can self-organize to activate various polychronous groups.

Spike Timing Dependent Plasticity

STDP implementation (first attempt): we have implemented a CMOS neuron circuit that converts the relative timing of the neuron spikes into the pulse-width information seen by the memristor synapse.
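For reference, the pairwise exponential STDP window that such a timing-to-pulse-width scheme is meant to approximate; this is a common textbook form with illustrative amplitudes and time constants, not the exact window realized by the CMOS/memristor circuit on the slide:

```python
import math

def stdp_dw(dt_ms, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Pairwise STDP: potentiation when the presynaptic spike precedes the
    postsynaptic one (dt = t_post - t_pre > 0), depression otherwise."""
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_plus)
    return -a_minus * math.exp(dt_ms / tau_minus)

for dt in (-40, -10, 10, 40):
    print(dt, round(stdp_dw(dt), 5))
```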

STDP implementation proposal for memristors: assumes a rate of conductance change that is a function of the applied voltage.

STDP Implementation with PCM

Long-Term Depression and Short-Term Potentiation

Electronic Pavlov's Dog

Snider's Spiking Networks

Example: network self-organization (spatial orientation filter array), using an adaptive recurrent network with inputs x_i. G. Snider, Nanotechnology 18, 365202 (2007)