Model-Free Stochastic Perturbative Adaptation and Optimization


1 Model-Free Stochastic Perturbative Adaptation and Optimization Gert Cauwenberghs Johns Hopkins University Learning on Silicon

2 Model-Free Stochastic Perturbative Adaptation and Optimization: OUTLINE
- Model-Free Learning: Model Complexity; Compensation of Analog VLSI Mismatch
- Stochastic Parallel Gradient Descent: Algorithmic Properties; Mixed-Signal Architecture; VLSI Implementation
- Extensions: Learning of Continuous-Time Dynamics; Reinforcement Learning
- Model-Free Adaptive Optics: AdOpt VLSI Controller; Adaptive Optics Quality Metrics; Applications to Laser Communication and Imaging

3 The Analog Computing Paradigm. Local functions are efficiently implemented with minimal circuitry, exploiting the physics of the devices. Excessive global interconnects are avoided: currents or charges are accumulated along a single wire; voltage is distributed along a single wire. Pros: massive parallelism; low power dissipation; real-time, real-world interface; continuous-time dynamics. Cons: limited dynamic range; mismatches and nonlinearities (WYDINWYG).

4 Effect of Implementation Mismatches. [Block diagram: inputs drive the system with parameters {p_i}; the outputs are compared against a reference to form the error ε(p).] Associative element: mismatches can be properly compensated by adjusting the parameters p_i accordingly, provided sufficient degrees of freedom are available to do so. Adaptive element: requires precise implementation; the accuracy of the implemented polarity (rather than amplitude) of the parameter update increments Δp_i is the performance-limiting factor.

5 Example: LMS Rule. A linear perceptron under supervised learning,
    y_i(k) = Σ_j p_ij x_j(k)
    ε = ½ Σ_k Σ_i ( y_i^target(k) − y_i(k) )²
with gradient descent,
    Δp_ij(k) = −η ∂ε(k)/∂p_ij = η ( y_i^target(k) − y_i(k) ) x_j(k)
reduces to an incremental outer-product update rule, with scalable, modular implementation in analog VLSI.
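
As a software reference, the following NumPy sketch illustrates this delta rule; the array shapes, data, and learning rate are illustrative assumptions, not taken from the slides.

import numpy as np

def lms_epoch(P, X, Y_target, eta=0.01):
    # One pass of the LMS (delta) rule for a linear perceptron y = P x.
    for x, y_t in zip(X, Y_target):        # present one training pattern at a time
        y = P @ x                          # forward pass: y_i = sum_j p_ij x_j
        e = y_t - y                        # output error: y_target - y
        P = P + eta * np.outer(e, x)       # incremental outer-product update: dp_ij = eta * e_i * x_j
    return P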

6 Incremental Outer-Product Learning in Neural Nets. Multi-layer perceptron: x_i = f( Σ_j p_ij x_j ). Outer-product learning update: Δp_ij = η x_j e_i, with the error term defined by the learning rule:
    Hebbian (Hebb, 1949): e_i = x_i
    LMS Rule (Widrow-Hoff, 1960): e_i = f'_i ( x_i^target − x_i )
    Backpropagation (Werbos, Rumelhart, LeCun): e_j = f'_j Σ_i p_ij e_i

7 Gradient Descent Learning. Minimize ε(p) by iterating
    p_i(k+1) = p_i(k) − η ∂ε/∂p_i
from calculation of the gradient,
    ∂ε(k)/∂p_i = Σ_l Σ_m ( ∂ε(k)/∂y_l ) ( ∂y_l/∂x_m ) ( ∂x_m/∂p_i ).
Implementation problems: requires an explicit model of the internal network dynamics; sensitive to model mismatches and noise in the implemented network and learning system; the amount of computation typically scales strongly with the number of parameters.

8 Gradient-Free Approach to Error-Descent Learning. Avoid the model sensitivity of gradient descent by observing the parameter dependence of the performance error directly on the network, rather than calculating gradient information from an assumed model of the network.
Stochastic approximation:
- Multi-dimensional Kiefer-Wolfowitz (Kushner & Clark 1978)
- Function Smoothing Global Optimization (Styblinski & Tang 1990)
- Simultaneous Perturbation Stochastic Approximation (Spall 1992)
Hardware-related variants:
- Model-Free Distributed Learning (Dembo & Kailath 1990)
- Noise Injection and Correlation (Anderson & Kerns; Kirk et al.)
- Stochastic Error Descent (Cauwenberghs 1993)
- Constant Perturbation, Random Sign (Alspector et al. 1993)
- Summed Weight Neuron Perturbation (Flower & Jabri 1993)

9 Stochastic Error-Descent Learning. Minimize ε(p) by iterating
    p(k+1) = p(k) − µ ε̂(k) π(k)
from observation of the gradient in the direction of π(k),
    ε̂(k) = ½ [ ε( p(k) + π(k) ) − ε( p(k) − π(k) ) ]
with random uncorrelated binary components of the perturbation vector π(k):
    π_i(k) = ±σ ;   E( π_i(k) π_j(l) ) = σ² δ_ij δ_kl
Advantages: no explicit model knowledge is required; robust in the presence of noise and model mismatches; computational load is significantly reduced; allows simple, modular, and scalable implementation; convergence properties similar to exact gradient descent.
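
One iteration of this update can be sketched in software as follows; the error function is treated as a black box, and the step sizes are illustrative assumptions.

import numpy as np

def sed_step(p, error_fn, mu=0.01, sigma=0.01):
    # One stochastic error-descent step on a black-box error function error_fn(p).
    pi = sigma * np.random.choice([-1.0, 1.0], size=p.shape)   # random binary perturbation, +/- sigma
    eps_hat = 0.5 * (error_fn(p + pi) - error_fn(p - pi))       # two-sided differential error measurement
    return p - mu * eps_hat * pi                                # descend along the perturbation direction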

10 Stochastic Perturbative Learning Cell Architecture. [Mixed-signal cell: each local cell stores its parameter p_i(t) and adds the perturbation φ(t) π_i(t); the network is evaluated at ε( p(t) + φ(t) π(t) ); the scaled differential error η ε̂(t) is broadcast globally and correlated locally with π_i(t) to update the stored p_i.] The cell implements
    p(k+1) = p(k) − µ ε̂(k) π(k),   ε̂(k) = ½ [ ε( p(k) + π(k) ) − ε( p(k) − π(k) ) ]

11 Stochastic Perturbative Learning Circuit Cell. [Circuit schematic: the perturbation capacitor C_perturb couples ±V_σ, gated by π_i, onto the parameter voltage stored on C_store, producing p_i(t) + φ(t) π_i(t); a charge pump steered by the polarity bit POL = sign(ε̂) and the enable signals EN_p, EN_n (biased by V_bp, V_bn) increments or decrements the stored value.]

12 Charge Pump Characteristics. [(a) n-type branch: EN_n, V_bn; (b) p-type branch: EN_p, V_bp, POL; the pump delivers adaptation current I_adapt and charge Q_adapt onto the storage capacitor C. Measured plots: voltage decrement ΔV_stored versus gate voltage V_bn, and voltage increment ΔV_stored versus gate voltage V_bp, for adaptation times Δt from 23 µsec to 40 msec.]

13 Supervised Learning of Recurrent Neural Dynamics. [Architecture: a six-neuron continuous-time recurrent network with weights W_ij, thresholds θ_i and offsets W_off; binary quantization Q(.), update activation and probe multiplexing of the perturbations π_1 ... π_6; teacher forcing clamps the observed states x_1(t), x_2(t) toward the targets x_1^T(t), x_2^T(t).] Dynamical system with input y(t) and output z(t):
    dx/dt = F( p, x, y ),   z = G( x )
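
A simulation-level sketch of applying the same perturbative rule to the recurrent weights is given below; the particular network dynamics (leaky tanh units, Euler integration) and the teacher trajectory are illustrative assumptions, not the circuit equations of the chip.

import numpy as np

def trajectory_error(W, theta, x0, x_teacher, dt=1e-3):
    # Integrate dx/dt = -x + tanh(W x + theta) and accumulate the squared tracking error
    # against a teacher trajectory x_teacher (a sequence of target state vectors).
    x, err = x0.copy(), 0.0
    for x_t in x_teacher:
        x = x + dt * (-x + np.tanh(W @ x + theta))
        err += np.sum((x_t - x) ** 2)
    return err

def sed_recurrent_step(W, theta, x0, x_teacher, mu=1e-3, sigma=1e-3):
    # Perturb all recurrent weights in parallel and correlate with the trajectory-error difference.
    Pi = sigma * np.random.choice([-1.0, 1.0], size=W.shape)
    d_eps = 0.5 * (trajectory_error(W + Pi, theta, x0, x_teacher)
                   - trajectory_error(W - Pi, theta, x0, x_teacher))
    return W - mu * d_eps * Pi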

14 The Credit Assignment Problem, or How to Learn from Delayed Rewards. [Block diagram: inputs, system with parameters {p_i}, outputs; an adaptive critic relates the external reinforcement r(t) to an internal reinforcement r*(t).] The reinforcement signal r(t) is external and discontinuous. Adaptive critics: Discrimination Learning (Grossberg, 1975); Heuristic Dynamic Programming (Werbos, 1977); Reinforcement Learning (Sutton and Barto, 1983); TD(λ) (Sutton, 1988); Q-Learning (Watkins, 1989).

15 Reinforcement Learning (Barto and Sutton, 1983). Locally tuned, address-encoded neurons:
    χ(t) ∈ { 0, ..., 2^n − 1 } : n-bit address encoding of the state space
    y(t) = y_χ(t) : classifier output
    q(t) = q_χ(t) : adaptive critic
Adaptation of classifier and adaptive critic:
    y_k(t+1) = y_k(t) + α r̂(t) e_k(t)
    q_k(t+1) = q_k(t) + β r̂(t) e_k(t)
eligibilities: e_k(t+1) = λ e_k(t) + (1 − λ) δ_{k,χ(t)}
internal reinforcement: r̂(t) = r(t) + γ q(t) − q(t−1)
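
A behavioral sketch of one such update step is shown below; the state encoder, reward signal, and coefficient values are placeholders, and only the update equations above are taken from the slide.

import numpy as np

def rl_step(y, q, e, chi, r, q_prev, alpha=0.1, beta=0.1, gamma=0.95, lam=0.8):
    # One update of the address-encoded classifier y_k and adaptive critic q_k.
    # chi: current state address; q_prev: critic value q(t-1) from the previous step.
    r_hat = r + gamma * q[chi] - q_prev          # internal (TD-like) reinforcement
    y = y + alpha * r_hat * e                    # classifier update
    q = q + beta * r_hat * e                     # adaptive-critic update
    one_hot = np.zeros_like(e)
    one_hot[chi] = 1.0
    e = lam * e + (1.0 - lam) * one_hot          # eligibility trace, refreshed at state chi
    return y, q, e, q[chi]                       # q[chi] becomes q_prev for the next step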

16 Reinforcement Learning Classifier for Binary Control. [Chip architecture: an array of 64 reinforcement learning neurons, each holding a state eligibility e_k, an adaptive critic value q_k, and an action weight y_k; the quantized state (x_1(t), x_2(t)) address-selects a neuron, and the action network outputs a binary action y = ±1 driving the control u(t); select (SEL_hor, SEL_vert), update (UPD), hysteresis and lock (HYST, LOCK) signals and bias voltages (V_bp, V_bn, V_αp, V_αn, V_δ) control the adaptation.]

17 A Biological Adaptive Optics System. [Diagram of the human eye: cornea, iris, lens, zonule fibers, retina, optic nerve, brain.]

18 Wavefront Distortion and Adaptive Optics. Imaging: defocus, motion. Laser beam: beam wander/spread, intensity fluctuations.

19 Adaptive Optics: Conventional Approach. Performs phase conjugation (assumes intensity is unaffected). Complex: requires an accurate wavefront phase sensor (Shack-Hartmann, Zernike nonlinear filter, etc.) and a computationally intensive control system.

20 Adaptive Optics: Model-Free Integrated Approach. The incoming wavefront is shaped by a wavefront corrector with N elements u_1, ..., u_n, ..., u_N. Optimizes a direct measure J of optical performance (a "quality metric"). No (explicit) model information is required: any type of quality metric J and wavefront corrector (MEMS, LC, ...) can be used, with no need for a wavefront phase sensor. Tolerates imprecision in the implementation of the updates; system-level precision is limited by the accuracy of the measured J.

21 Adaptive Optics Controller Chip: Optimization by Parallel Perturbative Stochastic Gradient Descent. [System loop: the wavefront Φ(u) is shaped by a wavefront corrector driven by the controls u; the resulting image falls on a performance metric sensor, which reports J(u) to the AdOpt VLSI wavefront controller.]

22 Adaptive Optics Controller Chip: Optimization by Parallel Perturbative Stochastic Gradient Descent (continued). [The controller applies the perturbed controls u + δu; the corrector produces Φ(u + δu) and the metric sensor measures J(u + δu), to be compared with J(u).]

23 Adaptive Optics Controller Chip: Optimization by Parallel Perturbative Stochastic Gradient Descent (continued). [The controller then applies u − δu and measures J(u − δu); the differential δJ between the two measurements drives the parallel update u_j(k+1) = u_j(k) + γ δJ(k) δu_j(k).]

24 Parallel Perturbative Stochastic Gradient Descent Architecture. [Per-channel datapath: a register (z⁻¹) holds u_j; the perturbation δu_j ∈ {−1, 0, +1} is applied, the metric J(u) is measured, and the correlate γ δJ δu_j is accumulated into u_j.] Uncorrelated perturbations: ⟨δu_i δu_j⟩ = σ² δ_ij.
    u_j(k+1) = u_j(k) + γ δJ(k) δu_j(k)
    δJ(k) = J( u_1(k) + δu_1(k), ..., u_N(k) + δu_N(k) ) − J( u_1(k), ..., u_N(k) )

25 Parallel Perturbative Stochastic Gradient Descent: Mixed-Signal Architecture. [The same datapath, with the update split into an analog amplitude and a digital polarity: γ |δJ| sets the amplitude, while sgn(δJ) and the Bernoulli perturbation δu_j = ±1 set the polarity.]
    u_j(k+1) = u_j(k) + γ |δJ(k)| · sgn(δJ(k)) δu_j(k)

26 Wavefront Controller VLSI Implementation. Edwards, Cohen, Cauwenberghs, Vorontsov & Carhart (1999).
- Generate Bernoulli-distributed perturbations { δu_j(k) } with |δu_j(k)| = σ and sgn(δu_j(k)) = π_j(k) = ±1.
- Decompose δJ(k) into sgn(δJ(k)) and |δJ(k)|.
- Update u_j(k+1) = u_j(k) ± γ |δJ(k)|, with the polarity obtained by XOR-ing the sign bits of δJ(k) and π_j(k).
AdOpt mixed-mode chip: 2.2 mm², 1.2 µm CMOS; controls 19 channels; interfaces with LC SLM or MEMS mirrors.
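
The sign/amplitude decomposition can be mirrored in software as follows; here the metric is maximized, the channel count, step size, and perturbation amplitude are illustrative, and the XOR reproduces the digital polarity logic described above.

import numpy as np

def adopt_update(u, J, gamma=0.05, sigma=0.02):
    # One parallel perturbative update with digital polarity (XOR of sign bits)
    # and analog amplitude (gamma * |dJ|), maximizing the quality metric J(u).
    pi = np.random.choice([-1, 1], size=u.shape)     # Bernoulli perturbation signs pi_j
    dJ = J(u + sigma * pi) - J(u - sigma * pi)       # differential metric measurement
    flip = (dJ > 0) ^ (pi > 0)                       # XOR of the sign bits of dJ and du_j
    polarity = np.where(flip, -1.0, 1.0)             # same signs: increment; opposite signs: decrement
    return u + gamma * abs(dJ) * polarity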

27 Wavefront Controller Chip-level Characterization

28 Wavefront Controller System-level Characterization (LC SLM) HEX-127 LC SLM

29 Wavefront Controller System-level Characterization (SLM). [J(u) maximized and minimized 100 times; max and min convergence traces.]

30 Wavefront Controller System-Level Characterization (MEMS). Weyrauch, Vorontsov, Bifano, Hammer, Cohen & Cauwenberghs. [Response function of the MEMS mirror (OKO).]

31 Wavefront Controller System-level Characterization (over 500 trials)

32 Quality Metrics.
- Imaging: J_image = ∫ I(r, t)^υ d²r  (Horn 1968; Delbruck 1999)
- Laser beam focusing: J_beam = F{ I(r, t) }  (Muller et al. 1974; Vorontsov et al. 1996)
- Laser communication: J_comm = bit-error rate

33 Image Quality Metric Chip. Cohen, Cauwenberghs, Vorontsov & Carhart (2001).
    IQM = Σ_{i,j}^{N,M} | I^K_ij |,   where I^K is the edge image: the image I convolved with an edge kernel K.
[Chip photograph: 2 mm die, 22 × 22 pixel array, 120 µm pixels.]
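
A reference implementation of the metric that the chip computes in parallel analog hardware might look as follows; the particular 3 × 3 edge kernel is an assumption for illustration, not the kernel programmed on the chip.

import numpy as np
from scipy.signal import convolve2d

EDGE_KERNEL = np.array([[-1., -1., -1.],
                        [-1.,  8., -1.],
                        [-1., -1., -1.]])   # illustrative Laplacian-like 3x3 edge kernel K

def image_quality_metric(I, K=EDGE_KERNEL):
    # Edge-content metric: IQM = sum over pixels of |image convolved with edge kernel|.
    return np.sum(np.abs(convolve2d(I, K, mode='valid')))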

34 Architecture (3 x 3) Edge and Corner Kernels

35 Image Quality Metric Chip Characterization Experimental Setup

36 Image Quality Metric Chip Characterization: Experimental Results. [Measured IQM dynamic range ~ 20; reference: uniformly illuminated array.]

37 Image Quality Metric Chip Characterization Experimental Results

38 Beam Variance Metric Chip. Cohen, Cauwenberghs, Vorontsov & Carhart (2001).
    BVM = N M Σ_{i,j}^{N,M} I_ij² − ( Σ_{i,j}^{N,M} I_ij )²
[Chip: 22 × 22 pixel array (20 × 20 active, surrounded by a perimeter of dummy pixels), 70 µm pixels.]
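
In software the same metric reduces to a few lines; the N × M array shape is the only assumption.

import numpy as np

def beam_variance_metric(I):
    # BVM = N*M * sum(I_ij^2) - (sum(I_ij))^2, proportional to the spatial variance of the spot intensity.
    N, M = I.shape
    return N * M * np.sum(I ** 2) - np.sum(I) ** 2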

39 [Analytic expression relating the BVM to the pixel intensities I_ij, a uniform intensity level I_u, the M × N array size, and a beam concentration factor κ.]

40 Beam Variance Metric Chip Characterization Experimental Setup

41 Beam Variance Metric Chip Characterization Experimental Results dynamic range ~ 5

42 Beam Variance Metric Chip Characterization Experimental Results

43 Beam Variance Metric Sensor in the Loop Laser Receiver Setup

44 Beam Variance Metric Sensor in the Loop

45 Conclusions. Computational primitives of adaptation and learning are naturally implemented in analog VLSI, and make it possible to compensate for inaccuracies in the physical implementation of the system under adaptation. Care should still be taken to avoid inaccuracies in the implementation of the adaptive element itself; this is readily achieved by ensuring the correct polarity, rather than amplitude, of the parameter update increments. Adaptation algorithms based on physical observation of the performance gradient in parameter space are better suited to analog VLSI implementation than algorithms based on a calculated gradient. Among the most generally applicable learning architectures are those that operate on reinforcement signals, and those that blindly extract and classify signals. Model-free adaptive optics leads to an efficient and robust analog implementation of the control algorithm, using a criterion that can be freely chosen to accommodate different wavefront correctors and different imaging or laser communication applications.
