Learning and Memory in Neural Networks


Guy Billings, Neuroinformatics Doctoral Training Centre, The School of Informatics, The University of Edinburgh, UK.

Neural networks consist of computational units (neurons) that are linked by a directed graph with some degree of connectivity (the network). The connections comprising the edges of the graph are termed weights. As the name suggests, the magnitude of a weight determines the magnitude of the effect that the connecting neuron can have upon its target partner. In caricature, neural networks use the many parallel operations of simple units to perform computations with uncertain data, rather than serial operations with logical blocks to perform computations with exact data. Neural networks are useful computational devices for learning data classifications, for autoassociative (content addressable) memories and for associative (classical conditioning) memories. This brief introduces a neural network for each of these tasks, respectively: the multilayer perceptron, the Hopfield network and the associative network.

Two class data classification: The Perceptron

The perceptron (Rosenblatt, 1958) can be used to learn a distinction between two clusters within some data set, Fig. 1A. The aim of the perceptron is to classify data into two classes $C_1$ and $C_2$ by labelling each data point $x$ with its output $f(a) \in \{-1, 1\}$, such that $f(a) = 1$ for class $C_1$ and $f(a) = -1$ for class $C_2$. As input the perceptron is passed a feature vector $\phi(x_i) \in \{-1, 1\}$, $i \in \{1, \ldots, N\}$, constructed from the $N$ dimensional data point to be classified by means of some fixed non-linear transformation (for example brightness thresholding of the pixels of an image). In order to perform a classification, the activation is first calculated

$$a(x) = \sum_i w_i \phi(x_i) - \varphi \qquad (1)$$

where the scalar $\varphi$ (distinct from the feature map $\phi$) is the bias. The activation is then passed through the step function

$$f(a) = \begin{cases} 1 & a \geq 0 \\ -1 & a < 0. \end{cases} \qquad (2)$$
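The classification step can be sketched in a few lines of Python. This is a minimal illustration rather than the author's implementation: it assumes the bias is subtracted from the weighted sum in Eq. (1), and the function names, weights and bias value are hypothetical choices made for the example.

```python
from typing import Sequence


def activation(weights: Sequence[float], features: Sequence[int], bias: float) -> float:
    """Eq. (1): weighted sum of the features minus the bias (assumed sign)."""
    return sum(w * phi for w, phi in zip(weights, features)) - bias


def step(a: float) -> int:
    """Eq. (2): +1 when the activation is non-negative, -1 otherwise."""
    return 1 if a >= 0 else -1


def classify(weights: Sequence[float], features: Sequence[int], bias: float) -> int:
    """Label a feature vector with +1 (class C1) or -1 (class C2)."""
    return step(activation(weights, features, bias))


# Hypothetical hand-picked parameters for a two-feature problem.
weights = [1.0, 1.0]
bias = 1.5

print(classify(weights, [1, 1], bias))   # prints 1  (class C1)
print(classify(weights, [1, -1], bias))  # prints -1 (class C2)
```

With these values only the input with both features equal to 1 lands in class $C_1$, so the unit computes a logical AND of its two inputs.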
What classifications of the data can the perceptron perform given some feature vector $\phi(x)$? The perceptron can only separate classes whose feature vectors are linearly separable. The decision between attributing a given input vector to class $C_1$ or to class $C_2$ changes where $a(x) = 0$. This criterion is satisfied on an $(N-1)$ dimensional hyperplane within feature space. This surface is the decision boundary of the classification. Data that can be separated by the perceptron are described as linearly separable because the decision boundary of a single linear thresholder is linear (a hyperplane). The bias determines the displacement of the decision boundary from the origin.

How does the decision boundary relate to the correct weight vector for a given classification? Consider two locations on the decision boundary, $\phi(x_1)$ and $\phi(x_2)$. Since $a(x_1) = a(x_2) = 0$, it is the case that $w^T(\phi(x_1) - \phi(x_2)) = 0$, which can only be satisfied if the displacement from $\phi(x_1)$ to $\phi(x_2)$ is orthogonal to the weight vector (Bishop, 2006). Thus the weight vector associated with a decision boundary is parallel to the normal to that boundary in feature space. The aim of learning with the perceptron is to find a weight vector whose decision boundary separates the two classes of input data. One method for achieving this is the perceptron learning algorithm, an explanation of which is beyond the scope of this brief, but see Rojas or Bishop (Rojas, 1996; Bishop, 2006). The perceptron convergence theorem states that the perceptron learning algorithm is guaranteed to find a weight vector corresponding to a separating decision boundary in a finite number of steps, as long as the feature vectors are linearly separable (Minsky and Papert, 1969; Bishop, 2006). When the data are linearly separable there may be more than one valid decision boundary, in which case the one found will depend upon the initial conditions.
If the data are not linearly separable, then the perceptron learning algorithm does not converge. For any given data set there may be many possible decision boundaries. Assume that the learning algorithm has converged upon some successful decision boundary. The information that the perceptron transmits about the data in this case is a disambiguation between the chosen labelling and the other possible labellings. By counting all possible classifications of a set of random points in feature space it can be shown that the limiting capacity of the perceptron is 2 bits per weight (MacKay, 2003).
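The convergence behaviour described above can be illustrated with a short sketch. The brief does not specify the perceptron learning algorithm, so the update used here (add the misclassified feature vector, scaled by its target, to the weights) is the standard textbook rule and is an assumption, as are the function names, the learning rate and the epoch limit. The bias is again assumed to be subtracted from the weighted sum.

```python
def step(a):
    """Threshold output: +1 for a >= 0, -1 otherwise."""
    return 1 if a >= 0 else -1


def train_perceptron(data, n_features, max_epochs=100, rate=1.0):
    """Standard perceptron rule (assumed, not taken from the brief).

    data is a list of (features, target) pairs with target in {-1, +1}.
    Returns (weights, bias, converged).
    """
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for features, target in data:
            a = sum(w * f for w, f in zip(weights, features)) - bias
            if step(a) != target:
                # Shift the boundary towards labelling this point correctly.
                weights = [w + rate * target * f for w, f in zip(weights, features)]
                bias -= rate * target
                errors += 1
        if errors == 0:
            return weights, bias, True  # every point is classified correctly
    return weights, bias, False  # gave up: no error-free pass was found


# Linearly separable labelling (logical AND in a -1/+1 encoding).
AND_DATA = [([-1, -1], -1), ([-1, 1], -1), ([1, -1], -1), ([1, 1], 1)]
# Labelling that is not linearly separable (exclusive OR).
XOR_DATA = [([-1, -1], -1), ([-1, 1], 1), ([1, -1], 1), ([1, 1], -1)]

print(train_perceptron(AND_DATA, 2)[2])  # prints True
print(train_perceptron(XOR_DATA, 2)[2])  # prints False
```

The exclusive OR labelling never yields an error-free pass, so the loop only stops because of the epoch limit, which is the practical face of non-convergence on data that are not linearly separable.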

Figure 1: Diagrams contrasting the architectures of the neural networks discussed in this brief. A: The perceptron consists of a single unit (open circle) that takes a threshold of the scalar product of its feedforward weights (lines) with its inputs (filled circles). The perceptron can classify two-class linearly separable data. B: The multilayer perceptron (MLP) consists of one hidden layer of units (grey circles) between the input layer (filled circles) and the output layer (open circles). Each unit in the hidden layer computes a continuous threshold function of the scalar product of its feedforward weights (lines) with the values of the input layer. Each unit in the output layer takes the scalar product of its feedforward weights (lines) with the inputs received from the hidden layer. The MLP can fit non-linear curves to classifications of data sets where the number of classes is determined by the number of outputs. C: A Hopfield network is a feedback network. Thus the synaptic weights are directed (arrows) and symmetric. The Hopfield net functions as a content addressable memory by completing a disrupted pattern. Each unit operates as both an input and an output unit. D: The associative network learns an association between two binary patterns on the inputs (filled circles) and the outputs (open circles). At each crossing of an input line and an output line, there is a binary synaptic weight. The weights in this diagram show the result of storing the association between the input pattern and the output pattern using the learning rule in the text. Each filled square represents a weight that has been set to one. All other weights on the grid are zero.

Multiclass data classification with a neural network: The multilayer perceptron

As we have seen, the classifications that can be performed by a single perceptron are limited to those that are linearly separable. This is an enormous restriction, since many interesting patterns in data give rise to feature vectors that are not linearly separable. This point was demonstrated by Minsky and Papert (Minsky and Papert, 1969). However, this restriction does not apply to multilayer perceptrons (MLPs), and so these neural networks find far greater utility in learning data classifications. The most common implementation of the multilayer perceptron makes use of three layers: an input layer, a hidden layer and an output layer, Fig. 1B. Multilayer perceptrons are fully feedforward networks, meaning that the graph describing their structure is acyclic. The input layer of the network provides the feature space into which the data must be mapped. Each input comprising the $N$ dimensional feature vector $\phi(x)$ is connected to every perceptron in the hidden layer. There are $H$ perceptrons in the hidden layer. In turn each of the perceptrons in the hidden layer passes its output to every perceptron in the output layer. There are $K$ perceptrons in the output layer. The classification of the input feature vector is read out from the output layer. The activations of the units in the hidden layer, layer (1), are

$$a^{(1)}_j = \sum_l w^{(1)}_{jl} \phi(x_l) + \theta^{(1)}_j \qquad (3)$$

where $j \in \{1, \ldots, H\}$, $l \in \{1, \ldots, N\}$ and the $\theta^{(1)}_j$ are the biases of the hidden layer perceptrons. The activations in the output layer, layer (2), are

$$a^{(2)}_i = \sum_j w^{(2)}_{ij} h_j + \theta^{(2)}_i \qquad (4)$$

where $i \in \{1, \ldots, K\}$ and the $\theta^{(2)}_i$ are the biases of the output layer perceptrons. The outputs for units in each layer respectively are

$$h_j = f^{(1)}(a^{(1)}_j), \qquad y_i = f^{(2)}(a^{(2)}_i) \qquad (5)$$

where the activation function $f^{(2)}$ is the identity function $f^{(2)}(a) = a$.
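The forward pass defined by Eqs. (3) to (5) can be sketched as follows. This is an illustrative sketch only: the hidden activation function is passed in as a parameter, with a logistic sigmoid assumed as the default, and the function names, layer sizes and weight values are hypothetical.

```python
import math


def logistic(a):
    """Logistic sigmoid, assumed here as the hidden-layer non-linearity."""
    return 1.0 / (1.0 + math.exp(-a))


def mlp_forward(phi, w1, theta1, w2, theta2, f1=logistic):
    """Forward pass of a one-hidden-layer MLP.

    phi:    list of N feature values
    w1:     H rows of N hidden-layer weights; theta1: H hidden biases
    w2:     K rows of H output-layer weights; theta2: K output biases
    Returns (hidden outputs h_j, network outputs y_i).
    """
    # Eq. (3) followed by h_j = f1(a_j): hidden activations and outputs.
    hidden = [f1(sum(w * p for w, p in zip(row, phi)) + t)
              for row, t in zip(w1, theta1)]
    # Eq. (4) with the identity output function: y_i = a_i.
    outputs = [sum(w * h for w, h in zip(row, hidden)) + t
               for row, t in zip(w2, theta2)]
    return hidden, outputs


# Hypothetical network with N = 2 inputs, H = 3 hidden units, K = 2 outputs.
w1 = [[0.5, -0.5], [1.0, 1.0], [-1.0, 0.5]]
theta1 = [0.0, -0.5, 0.5]
w2 = [[1.0, -1.0, 0.5], [0.0, 2.0, -1.0]]
theta2 = [0.1, -0.1]

hidden, outputs = mlp_forward([1.0, -1.0], w1, theta1, w2, theta2)
print(hidden)
print(outputs)
```

Passing `f1` as an argument keeps the sketch agnostic about the hidden non-linearity, since the only structural requirement in the equations is that the output layer is linear.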
The form of the activation function $f^{(1)}$ can differ from implementation to implementation. For learning in the MLP, however, it is important that this function be a continuous non-linearity. Consequently the logistic sigmoid is often chosen,

$$f^{(1)}(a) = \frac{1}{1 + \exp(-a)}. \qquad (6)$$

Training in the MLP adjusts the weights to minimise a sum squared error function. In this manner the network fits a non-linear function to the data (the target classification). Again, a full discussion of the learning algorithm is beyond the scope of this brief, but see MacKay and Bishop (MacKay, 2003; Bishop, 2006). One important algorithm is the backpropagation algorithm (Rumelhart, Hinton, and Williams, 1986).

The multilayer perceptron is a non-linear curve fitting device. The number of hidden units $H$ that should be chosen in order to perform that fit is not constrained by the data but is a free parameter. Since increasing the number of hidden units increases the complexity of the model, we might expect that the complexity of the function that is fit to the data should also increase. Indeed this is the case, but it turns out that in the limit $H \to \infty$ the complexity of the curve is independent of the number of hidden units (Neal, 1996; MacKay, 2003) and is determined instead by the magnitude of the weights themselves. It may seem advantageous to choose the most complex fit that is possible with the available computational resources, but this is not the case. Since the input data contain noise that is unreproducible and peculiar to the example under consideration, it is possible for the neural network to overfit the data. Overfitting results in a match that is too close to one particular example and degrades the recognition of examples that should belong to the same class but that have random variations (generalisation). The approach taken to prevent overfitting is to add a regularizer that penalises excessively large weights.
One is then left with the problem of how to choose the regularizer so as to optimise the trade-off between specificity and generalisation performance (MacKay, 2003; Bishop, 2006).
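As a concrete illustration of a regularised objective, the sketch below adds a quadratic weight penalty (often called weight decay) to a sum squared error. The brief does not state which regularizer is intended, so the quadratic form, the factors of one half, the parameter name `strength` and the function names are all assumptions made for the example.

```python
def sum_squared_error(outputs, targets):
    """Half the sum of squared differences between outputs and targets."""
    return 0.5 * sum((y - t) ** 2 for y, t in zip(outputs, targets))


def weight_decay(weight_matrices, strength):
    """Quadratic penalty on every weight (an assumed choice of regularizer)."""
    total = sum(w * w for matrix in weight_matrices for row in matrix for w in row)
    return 0.5 * strength * total


def regularised_error(outputs, targets, weight_matrices, strength):
    """Objective that trades data fit against the size of the weights."""
    return sum_squared_error(outputs, targets) + weight_decay(weight_matrices, strength)


# Hypothetical values: one output vector, its target, and two weight matrices.
outputs = [0.9, 0.2]
targets = [1.0, 0.0]
weights = [[[0.5, -0.5], [1.0, 1.0]], [[2.0, -1.0]]]

for strength in (0.0, 0.01, 0.1):
    print(strength, regularised_error(outputs, targets, weights, strength))
```

Larger values of `strength` penalise large weights more heavily, which favours smoother fits at the expense of matching the training examples exactly.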

Auto-associative memory: Hopfield networks

Hopfield networks can be used to create content addressable memories that store data in such a way that it can be retrieved by supplying a partial version of the original pattern. The graph of the connections in a Hopfield network is cyclic and thus the network is a feedback rather than a feedforward network, Fig. 1C. Hopfield networks are constructed from $N$ neurons where each neuron $i \in \{1, \ldots, N\}$ is connected to the other neurons $j \in \{0, 1, \ldots, N\}$ by symmetric connections $w_{ij}$ such that $w_{ij} = w_{ji}$. There are no self connections, $w_{ii} = 0$. Each neuron can have a bias $w_{i0}$ that can be considered as resulting from the feedforward input from a zeroth layer of neurons with constant activity. The activation of each neuron is

$$a_i = \sum_j w_{ij} x_j. \qquad (7)$$

For the binary Hopfield network the output is a threshold function of the activation as in Eq. (2), where $x_i = f(a_i) \in \{-1, 1\}$. Alternatively, for a continuous Hopfield network where outputs vary between $-1$ and $1$, the output function is $x_i = f(a_i) = \tanh(a_i)$. In feedback networks the output of each neuron is also an input to other neurons. Because of this, updates to the outputs of the neurons can be either synchronous or asynchronous. In the synchronous case all neurons calculate their activations according to Eq. (7) and then update their outputs only after the calculation of all other activations is complete. In the asynchronous case each neuron in turn first calculates its activation and then updates its output, before the activations of the other neurons are calculated. For some fixed set of weights, the Hopfield network stores data as features in the $N$ dimensional phase space defined by the activity variables $x_i$. Each memory is stored as a fixed point of the activity of the network. The partial pattern to be recalled is presented as the initial condition of the activity in the network.
After some time the network converges upon the fixed point whose basin of attraction contains that initial condition. The aim is that this fixed point should be the complete pattern and that the initial condition is the partial pattern to be completed. The values of the weights in the Hopfield network determine the locations of the fixed points in the activity space. For an asynchronous Hopfield net to store a set of $M$ patterns $\{x^{(m)}\}$, $m \in \{1, \ldots, M\}$, the weights are set with one-shot learning according to

$$w_{ij} = \alpha \sum_m x^{(m)}_i x^{(m)}_j \qquad (8)$$

where $\alpha$ is a constant. How do we know that the simple operation in Eq. (8) ensures that the Hopfield net recalls the given patterns? Asynchronous Hopfield networks have Lyapunov functions of their dynamics. A Lyapunov function is a function, bounded below, that never increases under the evolution of the system; its existence ensures that the system settles upon a fixed point. Proof of this is beyond the scope of this brief, but see MacKay (MacKay, 2003).

What about synchronous Hopfield nets? In the synchronous case we can guarantee that the activity of the Hopfield net settles upon a fixed point as long as time is continuous. In this case the activations $a_i(t)$ and the activities of the neurons $x_i(t)$ are defined as continuous functions of time,

$$a_i(t) = \sum_j w_{ij} x_j(t) \qquad (9)$$

and each neuron's response to its activation is filtered with some time constant $\tau$,

$$\frac{dx_i(t)}{dt} = -\frac{1}{\tau}\left(x_i(t) - f(a_i)\right) \qquad (10)$$

where the output function is the hyperbolic tangent, $f(a_i) = \tanh(a_i)$. The synchronous continuous time Hopfield net has a Lyapunov function and is thus guaranteed to settle to a fixed point if the weights are set with Eq. (8). As mentioned, due to the presence of the Lyapunov function, the Hopfield network is guaranteed to settle into some stable state (output pattern) when supplied with an initial activity state (input pattern).
However, as the number of patterns stored in the network is increased, there comes a point at which the output patterns are garbled and are not valid completions of the input patterns. This failure of performance of the Hopfield net is graceful rather than catastrophic. Typically, when the limit of overload is approached, some memories survive with a minority of bits flipped. The limiting number of stored patterns at which the transition occurs from memory storage to spurious spin glass states is $M = 0.138N$ (Amit, Gutfreund, and Sompolinsky, 1985).
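Storage with Eq. (8) and asynchronous recall can be sketched for a small binary network. The sketch makes several assumptions that are not in the brief: biases are omitted, units are visited in a random order fixed by a seed, and the quantity $E = -\frac{1}{2}\sum_{ij} w_{ij} x_i x_j$ is used as the Lyapunov (energy) function, which is the standard choice for this network. The patterns and function names are hypothetical.

```python
import random


def store(patterns, alpha=1.0):
    """Eq. (8): one-shot learning with no self connections (w_ii = 0)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += alpha * p[i] * p[j]
    return w


def energy(w, x):
    """Standard energy function for the binary net (assumed, see lead-in)."""
    n = len(x)
    return -0.5 * sum(w[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def recall(w, x, sweeps=10, seed=0):
    """Asynchronous updates: each unit recomputes Eq. (7) and thresholds it."""
    rng = random.Random(seed)
    x = list(x)
    n = len(x)
    for _ in range(sweeps):
        changed = False
        order = list(range(n))
        rng.shuffle(order)
        for i in order:
            a = sum(w[i][j] * x[j] for j in range(n))
            new = 1 if a >= 0 else -1
            if new != x[i]:
                x[i] = new
                changed = True
        if not changed:
            break  # a full sweep with no change: a fixed point was reached
    return x


# Two hypothetical, mutually orthogonal patterns on N = 8 units.
P1 = [1, 1, 1, 1, -1, -1, -1, -1]
P2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = store([P1, P2])

probe = [-1, 1, 1, 1, -1, -1, -1, -1]  # P1 with its first bit flipped
completed = recall(w, probe)
print(completed == P1)                       # prints True
print(energy(w, probe), energy(w, completed))
```

Two patterns on eight units is well below the overload limit quoted above, which is why the corrupted probe is completed cleanly in this example.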

Associative networks

One distinctive feature of biological learning is the ability to form associations between stimuli (for example, in the famous case of Pavlovian, or classical, conditioning, where a dog learns to associate the ring of a bell with being fed). Associative networks are feedforward neural networks that learn an association between two patterns. When one pattern is presented to the inputs of the network, the outputs of the associative network present the pattern that has been associated with the input pattern. Adjustment of the weights allows this pairing between an arbitrary input pattern and an arbitrary output pattern. The associative network can be envisaged as a grid of horizontal input lines and vertical output lines. At the intersections between these lines, the points on the grid, are the weights. The input lines terminate at one edge of the grid in an array of points that are the $N_I$ inputs. The output lines terminate at another edge in an array of points that are the $N_O$ outputs, Fig. 1D. The inputs $x_i \in \{0, 1\}$, $i \in \{1, \ldots, N_I\}$, are binary and the outputs $y_j \in \{0, 1\}$, $j \in \{1, \ldots, N_O\}$, are binary. The weights at each grid point, $w_{ij} \in \{0, 1\}$, are also binary. When one pattern $k$ of the set $\{x^{(k)}\}$, $k \in \{1, \ldots, K\}$, is presented, each output is set to one, $y_j = 1$, if its dendritic sum

$$d^{(k)}_j = \sum_i w_{ij} x^{(k)}_i \qquad (11)$$

is equal to the input activity

$$a^{(k)} = \sum_i x^{(k)}_i \qquad (12)$$

and is set to zero, $y_j = 0$, otherwise (Willshaw, Buneman, and Longuet-Higgins, 1969; Buckingham and Willshaw, 1992). Associations are stored in the network by applying the input pattern to be associated to the inputs $x$ while the output pattern to be associated is applied to the outputs $y$. The following rule is then applied to each weight $w_{ij}$ associated with input line $i$ and output line $j$:

$$w_{ij} = \begin{cases} 1 & \text{if } x_i = 1 \wedge y_j = 1 \\ 0 & \text{otherwise.} \end{cases}$$

When several associations are stored, a weight that has been set to one by an earlier association remains at one. There is a rich literature dealing with the optimisation of the storage capacity of associative networks under differing conditions (Dayan and Willshaw, 1991).
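The storage rule and the threshold recall described above can be sketched as follows. The sketch assumes that `weights[i][j]` connects input line `i` to output line `j` and that weights switched on by one association are left on by later ones; the function names, network size and patterns are hypothetical.

```python
def new_network(n_in, n_out):
    """Grid of binary weights, all initially zero."""
    return [[0] * n_out for _ in range(n_in)]


def store(weights, x, y):
    """Switch on every weight whose input line and output line are both active."""
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            if xi == 1 and yj == 1:
                weights[i][j] = 1


def recall(weights, x):
    """Output j fires only if its dendritic sum equals the input activity."""
    activity = sum(x)  # number of active input lines
    n_out = len(weights[0])
    outputs = []
    for j in range(n_out):
        dendritic_sum = sum(weights[i][j] * x[i] for i in range(len(x)))
        outputs.append(1 if dendritic_sum == activity else 0)
    return outputs


# Hypothetical network with 6 inputs and 4 outputs storing two sparse pairs.
net = new_network(6, 4)
store(net, [1, 1, 0, 0, 0, 0], [1, 0, 0, 1])
store(net, [0, 0, 1, 1, 0, 0], [0, 1, 1, 0])

print(recall(net, [1, 1, 0, 0, 0, 0]))  # prints [1, 0, 0, 1]
print(recall(net, [0, 0, 1, 1, 0, 0]))  # prints [0, 1, 1, 0]
```

Because the two input patterns share no active lines, neither association interferes with the other; overlapping patterns are what eventually cause outputs to fire spuriously.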
An important factor in determining the performance of associative networks is the sparsity of the patterns. Sparse patterns have only a small fraction of their units active. Consider $N_I$ inputs, of which $M_I$ are active on average in each pattern to be stored, being stored in association with $N_O$ outputs, of which $M_O$ are active on average. For the simple network described here, the limit to the number of associations $R$ that can be stored before the expected number of output units that fire spuriously approaches one is

$$R \approx -\frac{N_I N_O}{M_I M_O} \ln\left(1 - \frac{1}{N_O}\right) \qquad (13)$$

(Buckingham and Willshaw, 1992).

References

Amit, D.J., H. Gutfreund, and H. Sompolinsky (1985). Storing infinite numbers of patterns in a spin glass model of neural networks. Phys. Rev. Lett. 55.
Bishop, C. (2006). Pattern Recognition and Machine Learning. Springer-Verlag, New York.
Buckingham, J. and D. Willshaw (1992). Performance characteristics of the associative net. Network 3.
Dayan, P. and D.J. Willshaw (1991). Optimising synaptic learning rules in linear associative memories. Biological Cybernetics 65.
MacKay, D.J.C. (2003). Information theory, inference and learning algorithms. Cambridge University Press, Cambridge, UK.
Minsky, M. and S. Papert (1969). Perceptrons. MIT Press, Cambridge, Mass.
Neal, R. (1996). Bayesian Learning for Neural Networks. Springer, Berlin.
Rojas, R. (1996). Neural Networks. Springer, Berlin.
Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage in the brain. Psychological Review 65.
Rumelhart, D.E., G.E. Hinton, and R.J. Williams (1986). Learning internal representations by error propagation. MIT Press, Cambridge, Mass.
Willshaw, D.J., O.P. Buneman, and H.C. Longuet-Higgins (1969). Non-holographic associative memory. Nature 222.


More information

Neural Networks. Mark van Rossum. January 15, School of Informatics, University of Edinburgh 1 / 28

Neural Networks. Mark van Rossum. January 15, School of Informatics, University of Edinburgh 1 / 28 1 / 28 Neural Networks Mark van Rossum School of Informatics, University of Edinburgh January 15, 2018 2 / 28 Goals: Understand how (recurrent) networks behave Find a way to teach networks to do a certain

More information

Master Recherche IAC TC2: Apprentissage Statistique & Optimisation

Master Recherche IAC TC2: Apprentissage Statistique & Optimisation Master Recherche IAC TC2: Apprentissage Statistique & Optimisation Alexandre Allauzen Anne Auger Michèle Sebag LIMSI LRI Oct. 4th, 2012 This course Bio-inspired algorithms Classical Neural Nets History

More information

Neural Networks for Machine Learning. Lecture 2a An overview of the main types of neural network architecture

Neural Networks for Machine Learning. Lecture 2a An overview of the main types of neural network architecture Neural Networks for Machine Learning Lecture 2a An overview of the main types of neural network architecture Geoffrey Hinton with Nitish Srivastava Kevin Swersky Feed-forward neural networks These are

More information

Simple neuron model Components of simple neuron

Simple neuron model Components of simple neuron Outline 1. Simple neuron model 2. Components of artificial neural networks 3. Common activation functions 4. MATLAB representation of neural network. Single neuron model Simple neuron model Components

More information

Multi-layer Neural Networks

Multi-layer Neural Networks Multi-layer Neural Networks Steve Renals Informatics 2B Learning and Data Lecture 13 8 March 2011 Informatics 2B: Learning and Data Lecture 13 Multi-layer Neural Networks 1 Overview Multi-layer neural

More information

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function

More information

Neural Nets and Symbolic Reasoning Hopfield Networks

Neural Nets and Symbolic Reasoning Hopfield Networks Neural Nets and Symbolic Reasoning Hopfield Networks Outline The idea of pattern completion The fast dynamics of Hopfield networks Learning with Hopfield networks Emerging properties of Hopfield networks

More information

SPSS, University of Texas at Arlington. Topics in Machine Learning-EE 5359 Neural Networks

SPSS, University of Texas at Arlington. Topics in Machine Learning-EE 5359 Neural Networks Topics in Machine Learning-EE 5359 Neural Networks 1 The Perceptron Output: A perceptron is a function that maps D-dimensional vectors to real numbers. For notational convenience, we add a zero-th dimension

More information

Artificial Neural Networks. MGS Lecture 2

Artificial Neural Networks. MGS Lecture 2 Artificial Neural Networks MGS 2018 - Lecture 2 OVERVIEW Biological Neural Networks Cell Topology: Input, Output, and Hidden Layers Functional description Cost functions Training ANNs Back-Propagation

More information

Revision: Neural Network

Revision: Neural Network Revision: Neural Network Exercise 1 Tell whether each of the following statements is true or false by checking the appropriate box. Statement True False a) A perceptron is guaranteed to perfectly learn

More information

3.3 Discrete Hopfield Net An iterative autoassociative net similar to the nets described in the previous sections has been developed by Hopfield

3.3 Discrete Hopfield Net An iterative autoassociative net similar to the nets described in the previous sections has been developed by Hopfield 3.3 Discrete Hopfield Net An iterative autoassociative net similar to the nets described in the previous sections has been developed by Hopfield (1982, 1984). - The net is a fully interconnected neural

More information

Artificial Neural Network

Artificial Neural Network Artificial Neural Network Eung Je Woo Department of Biomedical Engineering Impedance Imaging Research Center (IIRC) Kyung Hee University Korea ejwoo@khu.ac.kr Neuron and Neuron Model McCulloch and Pitts

More information

Neural Networks: Introduction

Neural Networks: Introduction Neural Networks: Introduction Machine Learning Fall 2017 Based on slides and material from Geoffrey Hinton, Richard Socher, Dan Roth, Yoav Goldberg, Shai Shalev-Shwartz and Shai Ben-David, and others 1

More information

CSC321 Lecture 5: Multilayer Perceptrons

CSC321 Lecture 5: Multilayer Perceptrons CSC321 Lecture 5: Multilayer Perceptrons Roger Grosse Roger Grosse CSC321 Lecture 5: Multilayer Perceptrons 1 / 21 Overview Recall the simple neuron-like unit: y output output bias i'th weight w 1 w2 w3

More information

COMP9444 Neural Networks and Deep Learning 11. Boltzmann Machines. COMP9444 c Alan Blair, 2017

COMP9444 Neural Networks and Deep Learning 11. Boltzmann Machines. COMP9444 c Alan Blair, 2017 COMP9444 Neural Networks and Deep Learning 11. Boltzmann Machines COMP9444 17s2 Boltzmann Machines 1 Outline Content Addressable Memory Hopfield Network Generative Models Boltzmann Machine Restricted Boltzmann

More information

Neural Networks with Applications to Vision and Language. Feedforward Networks. Marco Kuhlmann

Neural Networks with Applications to Vision and Language. Feedforward Networks. Marco Kuhlmann Neural Networks with Applications to Vision and Language Feedforward Networks Marco Kuhlmann Feedforward networks Linear separability x 2 x 2 0 1 0 1 0 0 x 1 1 0 x 1 linearly separable not linearly separable

More information

Deep Feedforward Networks

Deep Feedforward Networks Deep Feedforward Networks Liu Yang March 30, 2017 Liu Yang Short title March 30, 2017 1 / 24 Overview 1 Background A general introduction Example 2 Gradient based learning Cost functions Output Units 3

More information

Artificial Neural Networks. Q550: Models in Cognitive Science Lecture 5

Artificial Neural Networks. Q550: Models in Cognitive Science Lecture 5 Artificial Neural Networks Q550: Models in Cognitive Science Lecture 5 "Intelligence is 10 million rules." --Doug Lenat The human brain has about 100 billion neurons. With an estimated average of one thousand

More information

Using a Hopfield Network: A Nuts and Bolts Approach

Using a Hopfield Network: A Nuts and Bolts Approach Using a Hopfield Network: A Nuts and Bolts Approach November 4, 2013 Gershon Wolfe, Ph.D. Hopfield Model as Applied to Classification Hopfield network Training the network Updating nodes Sequencing of

More information

22c145-Fall 01: Neural Networks. Neural Networks. Readings: Chapter 19 of Russell & Norvig. Cesare Tinelli 1

22c145-Fall 01: Neural Networks. Neural Networks. Readings: Chapter 19 of Russell & Norvig. Cesare Tinelli 1 Neural Networks Readings: Chapter 19 of Russell & Norvig. Cesare Tinelli 1 Brains as Computational Devices Brains advantages with respect to digital computers: Massively parallel Fault-tolerant Reliable

More information

Neural Networks. Hopfield Nets and Auto Associators Fall 2017

Neural Networks. Hopfield Nets and Auto Associators Fall 2017 Neural Networks Hopfield Nets and Auto Associators Fall 2017 1 Story so far Neural networks for computation All feedforward structures But what about.. 2 Loopy network Θ z = ቊ +1 if z > 0 1 if z 0 y i

More information

Associative Memories (I) Hopfield Networks

Associative Memories (I) Hopfield Networks Associative Memories (I) Davide Bacciu Dipartimento di Informatica Università di Pisa bacciu@di.unipi.it Applied Brain Science - Computational Neuroscience (CNS) A Pun Associative Memories Introduction

More information

Links between Perceptrons, MLPs and SVMs

Links between Perceptrons, MLPs and SVMs Links between Perceptrons, MLPs and SVMs Ronan Collobert Samy Bengio IDIAP, Rue du Simplon, 19 Martigny, Switzerland Abstract We propose to study links between three important classification algorithms:

More information

A Logarithmic Neural Network Architecture for Unbounded Non-Linear Function Approximation

A Logarithmic Neural Network Architecture for Unbounded Non-Linear Function Approximation 1 Introduction A Logarithmic Neural Network Architecture for Unbounded Non-Linear Function Approximation J Wesley Hines Nuclear Engineering Department The University of Tennessee Knoxville, Tennessee,

More information

ARTIFICIAL INTELLIGENCE. Artificial Neural Networks

ARTIFICIAL INTELLIGENCE. Artificial Neural Networks INFOB2KI 2017-2018 Utrecht University The Netherlands ARTIFICIAL INTELLIGENCE Artificial Neural Networks Lecturer: Silja Renooij These slides are part of the INFOB2KI Course Notes available from www.cs.uu.nl/docs/vakken/b2ki/schema.html

More information

An Introduction to Statistical and Probabilistic Linear Models

An Introduction to Statistical and Probabilistic Linear Models An Introduction to Statistical and Probabilistic Linear Models Maximilian Mozes Proseminar Data Mining Fakultät für Informatik Technische Universität München June 07, 2017 Introduction In statistical learning

More information

Machine Learning for Large-Scale Data Analysis and Decision Making A. Neural Networks Week #6

Machine Learning for Large-Scale Data Analysis and Decision Making A. Neural Networks Week #6 Machine Learning for Large-Scale Data Analysis and Decision Making 80-629-17A Neural Networks Week #6 Today Neural Networks A. Modeling B. Fitting C. Deep neural networks Today s material is (adapted)

More information

Supervised (BPL) verses Hybrid (RBF) Learning. By: Shahed Shahir

Supervised (BPL) verses Hybrid (RBF) Learning. By: Shahed Shahir Supervised (BPL) verses Hybrid (RBF) Learning By: Shahed Shahir 1 Outline I. Introduction II. Supervised Learning III. Hybrid Learning IV. BPL Verses RBF V. Supervised verses Hybrid learning VI. Conclusion

More information

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)

More information

COMP-4360 Machine Learning Neural Networks

COMP-4360 Machine Learning Neural Networks COMP-4360 Machine Learning Neural Networks Jacky Baltes Autonomous Agents Lab University of Manitoba Winnipeg, Canada R3T 2N2 Email: jacky@cs.umanitoba.ca WWW: http://www.cs.umanitoba.ca/~jacky http://aalab.cs.umanitoba.ca

More information

arxiv: v2 [nlin.ao] 19 May 2015

arxiv: v2 [nlin.ao] 19 May 2015 Efficient and optimal binary Hopfield associative memory storage using minimum probability flow arxiv:1204.2916v2 [nlin.ao] 19 May 2015 Christopher Hillar Redwood Center for Theoretical Neuroscience University

More information

Keywords- Source coding, Huffman encoding, Artificial neural network, Multilayer perceptron, Backpropagation algorithm

Keywords- Source coding, Huffman encoding, Artificial neural network, Multilayer perceptron, Backpropagation algorithm Volume 4, Issue 5, May 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Huffman Encoding

More information

Artificial Intelligence Hopfield Networks

Artificial Intelligence Hopfield Networks Artificial Intelligence Hopfield Networks Andrea Torsello Network Topologies Single Layer Recurrent Network Bidirectional Symmetric Connection Binary / Continuous Units Associative Memory Optimization

More information

A. The Hopfield Network. III. Recurrent Neural Networks. Typical Artificial Neuron. Typical Artificial Neuron. Hopfield Network.

A. The Hopfield Network. III. Recurrent Neural Networks. Typical Artificial Neuron. Typical Artificial Neuron. Hopfield Network. III. Recurrent Neural Networks A. The Hopfield Network 2/9/15 1 2/9/15 2 Typical Artificial Neuron Typical Artificial Neuron connection weights linear combination activation function inputs output net

More information

EE04 804(B) Soft Computing Ver. 1.2 Class 2. Neural Networks - I Feb 23, Sasidharan Sreedharan

EE04 804(B) Soft Computing Ver. 1.2 Class 2. Neural Networks - I Feb 23, Sasidharan Sreedharan EE04 804(B) Soft Computing Ver. 1.2 Class 2. Neural Networks - I Feb 23, 2012 Sasidharan Sreedharan www.sasidharan.webs.com 3/1/2012 1 Syllabus Artificial Intelligence Systems- Neural Networks, fuzzy logic,

More information

This is a repository copy of Improving the associative rule chaining architecture.

This is a repository copy of Improving the associative rule chaining architecture. This is a repository copy of Improving the associative rule chaining architecture. White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/75674/ Version: Accepted Version Book Section:

More information

CSE 5526: Introduction to Neural Networks Hopfield Network for Associative Memory

CSE 5526: Introduction to Neural Networks Hopfield Network for Associative Memory CSE 5526: Introduction to Neural Networks Hopfield Network for Associative Memory Part VII 1 The basic task Store a set of fundamental memories {ξξ 1, ξξ 2,, ξξ MM } so that, when presented a new pattern

More information

Artificial Neural Networks (ANN)

Artificial Neural Networks (ANN) Artificial Neural Networks (ANN) Edmondo Trentin April 17, 2013 ANN: Definition The definition of ANN is given in 3.1 points. Indeed, an ANN is a machine that is completely specified once we define its:

More information

Neural Networks and Fuzzy Logic Rajendra Dept.of CSE ASCET

Neural Networks and Fuzzy Logic Rajendra Dept.of CSE ASCET Unit-. Definition Neural network is a massively parallel distributed processing system, made of highly inter-connected neural computing elements that have the ability to learn and thereby acquire knowledge

More information

Machine Learning Lecture 10

Machine Learning Lecture 10 Machine Learning Lecture 10 Neural Networks 26.11.2018 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Today s Topic Deep Learning 2 Course Outline Fundamentals Bayes

More information

Equivalence of Backpropagation and Contrastive Hebbian Learning in a Layered Network

Equivalence of Backpropagation and Contrastive Hebbian Learning in a Layered Network LETTER Communicated by Geoffrey Hinton Equivalence of Backpropagation and Contrastive Hebbian Learning in a Layered Network Xiaohui Xie xhx@ai.mit.edu Department of Brain and Cognitive Sciences, Massachusetts

More information

Machine Learning Lecture 5

Machine Learning Lecture 5 Machine Learning Lecture 5 Linear Discriminant Functions 26.10.2017 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Course Outline Fundamentals Bayes Decision Theory

More information

Computational Intelligence Lecture 3: Simple Neural Networks for Pattern Classification

Computational Intelligence Lecture 3: Simple Neural Networks for Pattern Classification Computational Intelligence Lecture 3: Simple Neural Networks for Pattern Classification Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Fall 2011 arzaneh Abdollahi

More information

Efficient and Robust Associative Memory from a Generalized Bloom Filter

Efficient and Robust Associative Memory from a Generalized Bloom Filter oname manuscript o (will be inserted by the editor) Efficient and Robust Associative Memory from a Generalied Bloom Filter Philip Sterne Received: date / Accepted: date Abstract We develop a variant of

More information

Computational Intelligence Lecture 6: Associative Memory

Computational Intelligence Lecture 6: Associative Memory Computational Intelligence Lecture 6: Associative Memory Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Fall 2011 Farzaneh Abdollahi Computational Intelligence

More information

Sections 18.6 and 18.7 Analysis of Artificial Neural Networks

Sections 18.6 and 18.7 Analysis of Artificial Neural Networks Sections 18.6 and 18.7 Analysis of Artificial Neural Networks CS4811 - Artificial Intelligence Nilufer Onder Department of Computer Science Michigan Technological University Outline Univariate regression

More information

y(x n, w) t n 2. (1)

y(x n, w) t n 2. (1) Network training: Training a neural network involves determining the weight parameter vector w that minimizes a cost function. Given a training set comprising a set of input vector {x n }, n = 1,...N,

More information

Introduction to Deep Learning

Introduction to Deep Learning Introduction to Deep Learning Some slides and images are taken from: David Wolfe Corne Wikipedia Geoffrey A. Hinton https://www.macs.hw.ac.uk/~dwcorne/teaching/introdl.ppt Feedforward networks for function

More information

Last updated: Oct 22, 2012 LINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

Last updated: Oct 22, 2012 LINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition Last updated: Oct 22, 2012 LINEAR CLASSIFIERS Problems 2 Please do Problem 8.3 in the textbook. We will discuss this in class. Classification: Problem Statement 3 In regression, we are modeling the relationship

More information

Neural Turing Machine. Author: Alex Graves, Greg Wayne, Ivo Danihelka Presented By: Tinghui Wang (Steve)

Neural Turing Machine. Author: Alex Graves, Greg Wayne, Ivo Danihelka Presented By: Tinghui Wang (Steve) Neural Turing Machine Author: Alex Graves, Greg Wayne, Ivo Danihelka Presented By: Tinghui Wang (Steve) Introduction Neural Turning Machine: Couple a Neural Network with external memory resources The combined

More information

Neural Networks DWML, /25

Neural Networks DWML, /25 DWML, 2007 /25 Neural networks: Biological and artificial Consider humans: Neuron switching time 0.00 second Number of neurons 0 0 Connections per neuron 0 4-0 5 Scene recognition time 0. sec 00 inference

More information

Unit III. A Survey of Neural Network Model

Unit III. A Survey of Neural Network Model Unit III A Survey of Neural Network Model 1 Single Layer Perceptron Perceptron the first adaptive network architecture was invented by Frank Rosenblatt in 1957. It can be used for the classification of

More information

What Do Neural Networks Do? MLP Lecture 3 Multi-layer networks 1

What Do Neural Networks Do? MLP Lecture 3 Multi-layer networks 1 What Do Neural Networks Do? MLP Lecture 3 Multi-layer networks 1 Multi-layer networks Steve Renals Machine Learning Practical MLP Lecture 3 7 October 2015 MLP Lecture 3 Multi-layer networks 2 What Do Single

More information