Artificial Neural Networks. Edward Gatt


What are Neural Networks? Models of the brain and nervous system: highly parallel, they process information much more like the brain than a serial computer. Learning: very simple principles give rise to very complex behaviours. Applications: as powerful problem solvers and as biological models.

ANNs, the basics: ANNs incorporate the two fundamental components of biological neural nets: 1. Neurones (nodes), 2. Synapses (weights).

Neurone vs. Node

Structure of a node: a squashing function limits the node's output.

Synapse vs. weight

Feed-forward nets: information flow is unidirectional. Data is presented to the input layer, passed on to the hidden layer, and passed on to the output layer. Information is distributed, information processing is parallel, and the network forms an internal representation (interpretation) of the data.

Feeding data through the net: (1 × 0.25) + (0.5 × (−1.5)) = 0.25 − 0.75 = −0.5. Squashing: 1 / (1 + e^0.5) = 0.3775.
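As a quick sanity check, here is a minimal Python sketch of this forward step, using the inputs (1, 0.5) and weights (0.25, −1.5) from the example above:

```python
import math

def sigmoid(x):
    """Logistic squashing function 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

inputs = [1.0, 0.5]
weights = [0.25, -1.5]

# Weighted sum: (1 * 0.25) + (0.5 * -1.5) = -0.5
net = sum(x * w for x, w in zip(inputs, weights))

print(net)           # -0.5
print(sigmoid(net))  # ~0.3775
```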

Supervised vs. unsupervised: networks can be supervised, in which case they need to be trained ahead of time with lots of data. Unsupervised networks adapt to the input, with applications in clustering and dimensionality reduction. Learning may be very slow.

What can a neural net do? Compute a known function, approximate an unknown function, perform pattern recognition and signal processing, and learn to do any of the above.

Basic concepts: a neural network generally maps a set of inputs (Input 0, Input 1, ..., Input n) to a set of outputs (Output 0, Output 1, ..., Output m). The number of inputs/outputs is variable, and the network itself is composed of an arbitrary number of nodes with an arbitrary topology.

Basic concepts, definition of a node: a node receives inputs x_0, ..., x_n over weighted connections W_0, ..., W_n plus a bias weight W_b, and is an element which performs the function y = f_H(Σ_i (w_i · x_i) + W_b).
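A minimal sketch of that node equation in Python; the activation is passed in as a parameter, and the weight and bias values in the example call are placeholders, not values from the slides:

```python
import math

def node(inputs, weights, w_b, f_h):
    """One node: y = f_H(sum_i(w_i * x_i) + W_b)."""
    net = sum(w * x for w, x in zip(weights, inputs)) + w_b
    return f_h(net)

# Example call with an arbitrary sigmoid squashing function and placeholder values
sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
y = node([1.0, 0.5], [0.25, -1.5], 0.0, sigmoid)
```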

Simple perceptron, binary logic application: f_H(x) = u(x) [linear threshold], the weights are initialized as W_i = random(−1, 1), and the output is Y = u(W_0·X_0 + W_1·X_1 + W_b). Now how do we train it?

Basic training, the perceptron learning rule: ΔW_i = η · (D − Y) · X_i, where η is the learning rate and D is the desired output. Adjust the weights based on how well the current weights match the objective.

Logic training: expose the network to the logical OR operation and update the weights after each epoch. As the output approaches the desired output for all cases, ΔW_i will approach 0. Truth table:
X_0  X_1  D
 0    0   0
 0    1   1
 1    0   1
 1    1   1
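A minimal Python sketch of this training, assuming the unit-step activation and the perceptron learning rule from the previous slide; the learning rate and epoch count are arbitrary choices, and for simplicity the weights here are updated after each pattern rather than once per epoch:

```python
import random

# OR truth table: (X0, X1) -> D
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

eta = 0.1                                     # learning rate
w = [random.uniform(-1, 1) for _ in range(2)] # W_0, W_1
w_b = random.uniform(-1, 1)                   # bias weight W_b

def u(x):
    """Linear threshold (unit step) activation."""
    return 1 if x >= 0 else 0

for epoch in range(50):
    for (x0, x1), d in data:
        y = u(w[0] * x0 + w[1] * x1 + w_b)
        # Perceptron learning rule: dW_i = eta * (D - Y) * X_i
        w[0] += eta * (d - y) * x0
        w[1] += eta * (d - y) * x1
        w_b  += eta * (d - y)      # the bias sees a constant input of 1
```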

Training the network (learning): backpropagation requires a training set (input/output pairs) and starts with small random weights. The error is used to adjust the weights (supervised learning), performing gradient descent on the error landscape.

The Backpropagation Network: the backpropagation network (BPN) is the most popular type of ANN for applications such as classification or function approximation. Like other networks using supervised learning, the BPN is not biologically plausible. The structure of the network is identical to the one we discussed before: three (sometimes more) layers of neurons, only feedforward processing (input layer → hidden layer → output layer), and sigmoid activation functions.

Typical activation functions: F(x) = 1 / (1 + e^(−k · Σ_i(w_i x_i))), shown for k = 0.5, 1 and 10. Using a nonlinear function which approximates a linear threshold allows a network to approximate nonlinear functions.
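A small sketch of this gain-parameterized sigmoid; the sample net-input values are arbitrary, chosen only to show how a larger k pushes the curve toward a hard threshold:

```python
import math

def sigmoid(net, k=1.0):
    """F(net) = 1 / (1 + e^(-k * net)), where net = sum_i(w_i * x_i)."""
    return 1.0 / (1.0 + math.exp(-k * net))

for k in (0.5, 1.0, 10.0):
    # Larger k makes the sigmoid approximate the linear threshold u(net)
    print(k, [round(sigmoid(net, k), 3) for net in (-2, -0.5, 0, 0.5, 2)])
```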

Alternative activation functions, radial basis functions: square, triangle, Gaussian. (μ, σ) can be varied at each hidden node to guide training. [Diagram: inputs Input 0, ..., Input n feed hidden nodes with f_RBF(x) activations, followed by nodes with f_H(x) activations.]
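A minimal sketch of a Gaussian RBF activation with a per-node centre μ and width σ, as described above; the specific centre and width values in the example call are placeholders:

```python
import math

def gaussian_rbf(x, mu, sigma):
    """Gaussian radial basis function of the distance between input x and centre mu."""
    dist_sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return math.exp(-dist_sq / (2.0 * sigma ** 2))

# Each hidden node can have its own centre mu and width sigma
print(gaussian_rbf([1.0, 0.5], mu=[0.0, 0.0], sigma=1.0))
```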

The Backpropagation Network, BPN units and activation functions: [Diagram: output units O_1, ..., O_K with activation f(net^o) producing the output vector y; hidden units H_1, ..., H_J with activation f(net^h); input units I_1, ..., I_I receiving the input vector x.]

Supervised Learning in the BPN: before the learning process starts, all weights (synapses) in the network are initialized with pseudorandom numbers. We also have to provide a set of training patterns (exemplars). They can be described as a set of ordered vector pairs {(x_1, y_1), (x_2, y_2), ..., (x_P, y_P)}. Then we can start the backpropagation learning algorithm. This algorithm iteratively minimizes the network's error by finding the gradient of the error surface in weight space and adjusting the weights in the opposite direction (the gradient-descent technique).

Supervised Learning in the BPN, gradient-descent example: finding the absolute minimum of a one-dimensional error function f(x). Starting at x_0, where the slope is f'(x_0), take the step x_1 = x_0 − η·f'(x_0). Repeat this iteratively until, for some x_i, f'(x_i) is sufficiently close to 0.
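A minimal sketch of that one-dimensional update rule; the quadratic error function and the learning rate are arbitrary illustrative choices:

```python
def f_prime(x):
    """Derivative of the example error function f(x) = (x - 3)**2."""
    return 2.0 * (x - 3.0)

eta = 0.1   # learning rate
x = 0.0     # starting point x_0

# x_{i+1} = x_i - eta * f'(x_i), repeated until the slope is nearly zero
while abs(f_prime(x)) > 1e-6:
    x = x - eta * f_prime(x)

print(x)  # converges towards the minimum at x = 3
```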

Supervised Learning in the BPN: in the BPN, learning is performed as follows. 1. Randomly select a vector pair (x_p, y_p) from the training set and call it (x, y). 2. Use x as input to the BPN and successively compute the outputs of all neurons in the network (bottom-up) until you get the network output o. 3. Compute the error δ^o_pk for pattern p across all K output layer units by using the formula: δ^o_pk = (y_k − o_k) · f'(net^o_k).

Supervised Learning in the BPN: 4. Compute the error δ^h_pj for all J hidden layer units by using the formula: δ^h_pj = f'(net^h_j) · Σ_{k=1..K} δ^o_pk · w_kj. 5. Update the connection-weight values to the hidden layer by using the following equation: w_ji(t + 1) = w_ji(t) + η · δ^h_pj · x_i.

Supervised Learning in the BPN: 6. Update the connection-weight values to the output layer by using the following equation: w_kj(t + 1) = w_kj(t) + η · δ^o_pk · f(net^h_j). Repeat steps 1 to 6 for all vector pairs in the training set; this is called a training epoch. Run as many epochs as required to reduce the network error E below a threshold ε, where E = Σ_{p=1..P} Σ_{k=1..K} (δ^o_pk)².
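A compact Python sketch of steps 1 to 6 for a single-hidden-layer BPN with sigmoid activations; the layer sizes, learning rate, epoch count, and the tiny XOR-style training set are illustrative assumptions, not values from the slides (bias weights are omitted to mirror the slide equations):

```python
import math
import random

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

# Illustrative sizes: I inputs, J hidden units, K outputs
I, J, K = 2, 3, 1
eta = 0.5

# Weights into the hidden layer (w_ji) and into the output layer (w_kj)
w_ji = [[random.uniform(-0.5, 0.5) for _ in range(I)] for _ in range(J)]
w_kj = [[random.uniform(-0.5, 0.5) for _ in range(J)] for _ in range(K)]

# Illustrative training set of (x, y) vector pairs (XOR)
patterns = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]

for epoch in range(5000):
    for x, y in patterns:                       # step 1: pick a pattern
        # Step 2: forward pass, bottom-up
        h = [sigmoid(sum(w_ji[j][i] * x[i] for i in range(I))) for j in range(J)]
        o = [sigmoid(sum(w_kj[k][j] * h[j] for j in range(J))) for k in range(K)]

        # Step 3: output-layer errors, using f'(net) = o * (1 - o)
        delta_o = [(y[k] - o[k]) * o[k] * (1 - o[k]) for k in range(K)]

        # Step 4: hidden-layer errors
        delta_h = [h[j] * (1 - h[j]) * sum(delta_o[k] * w_kj[k][j] for k in range(K))
                   for j in range(J)]

        # Step 5: update weights into the hidden layer
        for j in range(J):
            for i in range(I):
                w_ji[j][i] += eta * delta_h[j] * x[i]

        # Step 6: update weights into the output layer
        for k in range(K):
            for j in range(J):
                w_kj[k][j] += eta * delta_o[k] * h[j]

# After training, the outputs should move towards the targets for each pattern
for x, y in patterns:
    h = [sigmoid(sum(w_ji[j][i] * x[i] for i in range(I))) for j in range(J)]
    o = [sigmoid(sum(w_kj[k][j] * h[j] for j in range(J))) for k in range(K)]
    print(x, y, [round(v, 2) for v in o])
```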

Supervised Learning in the BPN: the only thing that we need to know before we can start our network is the derivative of our sigmoid function, for example f(net_k) for the output neurons: f(net_k) = 1 / (1 + e^(−net_k)), so f'(net_k) = ∂f(net_k)/∂net_k = o_k · (1 − o_k).
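A quick numerical check of that identity at an arbitrary test point; the analytic derivative o·(1 − o) should agree with a finite-difference estimate:

```python
import math

def f(net):
    return 1.0 / (1.0 + math.exp(-net))

net = 0.7          # arbitrary test point
o = f(net)
eps = 1e-6

analytic = o * (1 - o)                              # f'(net) = o * (1 - o)
numeric = (f(net + eps) - f(net - eps)) / (2 * eps) # finite-difference estimate
print(analytic, numeric)  # the two values should match closely
```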

Supervised Learning in the BPN: now our BPN is ready to go! If we choose the type and number of neurons in our network appropriately, after training the network should show the following behavior: If we input any of the training vectors, the network should yield the expected output vector (with some margin of error). If we input a vector that the network has never seen before, it should be able to generalize and yield a plausible output vector based on its knowledge about similar input vectors.

Self-Organizing Maps (Kohonen Maps): in the BPN, we used supervised learning. This is not biologically plausible: in a biological system, there is no external teacher who manipulates the network's weights from outside the network. Biologically more adequate: unsupervised learning. We will study Self-Organizing Maps (SOMs) as examples for unsupervised learning (Kohonen, 1980).

Self-Organizing Maps (Kohonen Maps): such a topology-conserving mapping can be achieved by SOMs. Two layers: an input layer and an output (map) layer. The input and output layers are completely connected, the output neurons are interconnected within a defined neighborhood, and a topology (neighborhood relation) is defined on the output layer.

Self-Organizing Maps (Kohonen Maps): a neighborhood function φ(i, k) indicates how closely neurons i and k in the output layer are connected to each other. Usually, a Gaussian function of the distance between the positions of the two neurons in the layer is used.

Unsupervised Learning in SOMs: for an n-dimensional input space and m output neurons:
(1) Choose a random weight vector w_i for each neuron i, i = 1, ..., m.
(2) Choose a random input x.
(3) Determine the winner neuron k: ||w_k − x|| = min_i ||w_i − x|| (Euclidean distance).
(4) Update the weight vectors of all neurons i in the neighborhood of neuron k: w_i := w_i + η · φ(i, k) · (x − w_i) (w_i is shifted towards x).
(5) If the convergence criterion is met, STOP. Otherwise, narrow the neighborhood function φ and the learning parameter η and go to (2).
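A minimal Python sketch of this loop for a one-dimensional output map; the map size, input data, decay schedule, and Gaussian neighborhood width are illustrative assumptions, and a fixed step count stands in for the convergence test:

```python
import math
import random

n, m = 2, 10           # n-dimensional inputs, m output neurons on a 1-D map
eta, sigma = 0.5, 2.0  # learning rate and neighborhood width (both narrow over time)

# (1) Random weight vector w_i for each output neuron i
w = [[random.random() for _ in range(n)] for _ in range(m)]

# Illustrative input data in the unit square
data = [[random.random() for _ in range(n)] for _ in range(200)]

def phi(i, k):
    """Gaussian neighborhood function of the map distance between neurons i and k."""
    return math.exp(-((i - k) ** 2) / (2.0 * sigma ** 2))

for step in range(2000):
    x = random.choice(data)                                    # (2) random input
    # (3) winner neuron k: smallest Euclidean distance ||w_i - x||
    k = min(range(m), key=lambda i: sum((w[i][d] - x[d]) ** 2 for d in range(n)))
    # (4) shift the winner and its neighbors towards x
    for i in range(m):
        for d in range(n):
            w[i][d] += eta * phi(i, k) * (x[d] - w[i][d])
    # (5) narrow the neighborhood function and the learning parameter
    eta *= 0.999
    sigma *= 0.999
```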