Neural Networks

Plan: Perceptron, linear discriminant. Associative memories, Hopfield networks, chaotic networks. Multilayer perceptron, backpropagation.

Perceptron. Historically the first neural net, inspired by the human brain. Proposed by Rosenblatt between 1957 and 1961, when the brain appeared to be the best available computer. Goal: associate input patterns with recognition outputs. Akin to a linear discriminant.

Perceptron Constitution. An input layer (retina) of binary units (1/0), an output layer of binary units ({0/1}), and connections/synapses linking them.


Perceptron Constitution. Each output neuron j receives the binary inputs x_i through weights w_ij, computes the activation a_j = Σ_i x_i·w_ij, and emits o_j = f(a_j) (threshold or sigmoid).

Notation: a_j is the activation of output neuron j, x_i the activation of input neuron i, and w_ij the connection parameter between input neuron i and output neuron j. Decision rule: o_j = 0 for a_j ≤ θ_j, o_j = 1 for a_j > θ_j.
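A minimal sketch of this forward computation (NumPy assumed; the input values, random weights, and thresholds are illustrative, not from the slides):

```python
import numpy as np

# Hypothetical example: 4 binary inputs, 3 output neurons.
x = np.array([1, 0, 1, 1])              # input layer / retina activations x_i
W = np.random.uniform(-1, 1, (4, 3))    # w_ij: connection from input i to output j
theta = np.zeros(3)                     # thresholds theta_j

a = x @ W                               # a_j = sum_i x_i w_ij
o = (a > theta).astype(int)             # o_j = 0 if a_j <= theta_j, else 1
print(o)                                # e.g. [1 0 1], depending on the random weights
```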

Perceptron. Needs an associated learning procedure. Learning is supervised, based on pairs of an input pattern and a desired output. If the activation of the output neuron is correct, nothing happens. Otherwise, inspired by neurophysiological data: if the neuron is wrongly activated, decrease the value of its connections; if it is wrongly inactive, increase them. This is iterated until the output neurons reach the desired values.

Perceptron Supervised learning. How to decrease or increase the connections? The learning rule of Widrow-Hoff, close to Hebbian learning: w_ij(t+1) = w_ij(t) + η (t_j − o_j) x_i = w_ij(t) + Δw_ij, where t_j is the desired value of output neuron j and η the learning rate.
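A small sketch of the Widrow-Hoff update on one (input, target) pair, assuming NumPy; the pattern, target, and learning rate are illustrative assumptions:

```python
import numpy as np

def widrow_hoff_step(W, theta, x, t, eta=0.1):
    """One Widrow-Hoff update: w_ij <- w_ij + eta * (t_j - o_j) * x_i."""
    a = x @ W                         # activations a_j = sum_i x_i w_ij
    o = (a > theta).astype(float)     # current outputs o_j
    W += eta * np.outer(x, t - o)     # nothing changes where t_j == o_j
    return W

# Illustrative pattern: 4 binary inputs, 3 output neurons, desired output t.
W = np.zeros((4, 3))
theta = np.zeros(3)
x = np.array([1.0, 0.0, 1.0, 1.0])
t = np.array([1.0, 0.0, 1.0])
for _ in range(10):                   # iterate until the outputs match the targets
    W = widrow_hoff_step(W, theta, x, t)
```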

Theory of the linear discriminant. Compute g(x) = Wᵀx + W_0; the hyperplane g(x) = 0 separates the region g(x) > 0 from g(x) < 0. Choose class 1 if g(x) > 0, class 2 otherwise. But how to find W on the basis of the data?

Gradient descent: ΔW_i = −η ∂E/∂W_i. In general a sigmoid is used for the statistical interpretation (output in (0,1)): Y = 1/(1 + exp(−g(x))), easy to differentiate since Y' = Y(1 − Y). Class 1 if Y > 0.5 and class 2 otherwise. The error can be least squares, (Y − Y_d)², or maximum likelihood, Y_d log Y + (1 − Y_d) log(1 − Y). In the end you get the learning rule: ΔW_j = η (Y_d − Y) X_j.
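A hedged sketch of this gradient rule with the sigmoid output; the synthetic two-class data, learning rate, and iteration count are assumptions for illustration:

```python
import numpy as np

def sigmoid(g):
    return 1.0 / (1.0 + np.exp(-g))

# Synthetic two-class data (illustrative assumption): class 1 around (+2,+2),
# class 2 around (-2,-2); a column of ones stands in for the bias W_0.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(+2, 1, (50, 2)), rng.normal(-2, 1, (50, 2))])
Xb = np.hstack([X, np.ones((100, 1))])
Yd = np.concatenate([np.ones(50), np.zeros(50)])   # desired outputs Y_d

W = np.zeros(3)
eta = 0.1
for _ in range(200):
    Y = sigmoid(Xb @ W)                  # Y = 1 / (1 + exp(-g(x)))
    W += eta * (Yd - Y) @ Xb / len(Yd)   # batch-averaged Delta W = eta (Y_d - Y) X

pred_class1 = sigmoid(Xb @ W) > 0.5      # class 1 if Y > 0.5, class 2 otherwise
print(np.mean(pred_class1 == (Yd == 1))) # accuracy, typically close to 1.0
```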

Perceptron limitations. Not always easy to train but, above all, unable to separate data that are not linearly separable. Why so? The XOR problem ((0,0) and (1,1) versus (0,1) and (1,0)) killed NN research for 20 years (Minsky and Papert were responsible). Consequence: we had to wait for the magical hidden layer and for backpropagation.

Associative memories. Around 1970. Two types, hetero-associative and auto-associative; we will treat here only the auto-associative ones. They make an interesting connection between neuroscience and the physics of complex systems (John Hopfield).

Auto-associative memories Constitution: the input is presented to a fully connected neural network.

Associative memories: Hopfield (→ DEMO). Fully connected graphs: input layer = output layer = the network, so the same units serve as IN and OUT. The connections have to be symmetric. It is again a Hebbian learning rule.

Associative memories: Hopfield. The network becomes a dynamical machine: it has been shown to converge to a fixed point, and this fixed point is a minimum of a Lyapunov energy. These fixed points are used for storing "patterns". Discrete time and asynchronous updating, inputs in {−1, 1}: x_i ← sign(Σ_j w_ij x_j).

Associative memories: Hopfield. The learning is done by Hebbian learning, summed over all patterns to learn: ΔW_ij = Σ_p X_i^p X_j^p.
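A small sketch of a Hopfield memory combining this Hebbian storage rule with the asynchronous sign updates above; the pattern count, size, and noise level are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def store(patterns):
    """Hebbian learning: W_ij = sum_p X_i^p X_j^p (no self-connections)."""
    W = patterns.T @ patterns
    np.fill_diagonal(W, 0)            # symmetric weights, w_ii = 0
    return W

def recall(W, x, sweeps=5):
    """Asynchronous updates x_i <- sign(sum_j w_ij x_j) toward a fixed point."""
    x = x.copy()
    for _ in range(sweeps):
        for i in rng.permutation(len(x)):    # update one random unit at a time
            x[i] = 1 if W[i] @ x >= 0 else -1
    return x

patterns = rng.choice([-1, 1], size=(3, 100))   # 3 random patterns of 100 units
W = store(patterns)
noisy = patterns[0] * rng.choice([1, -1], 100, p=[0.9, 0.1])  # flip ~10% of bits
print(np.array_equal(recall(W, noisy), patterns[0]))          # usually True
```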

My research: chaotic encoding of memories in the brain.

Multilayer perceptron Constitution. Connection matrices W and Z: input layer x (I neurons) → hidden layer h (L neurons) → output layer o (J neurons).

Multilayer Perceptron Constitution. As in the perceptron, each unit j computes a_j = Σ_i x_i·w_ij and its output o_j = f(a_j), which is then passed on through the next connection matrix Z (weights Z_j0, Z_j1, Z_j2, ...).

Error backpropagation Learning algorithm. How it proceeds: inject an input, get the output, compute the error with respect to the desired output, and propagate this error back from the output layer to the input layer of the network. It is just a consequence of the chain rule applied in the gradient descent.

Backpropagation. Select a differentiable transfer function. Classically used: the logistic function f(x) = 1/(1 + e^(−x)) and its derivative f'(x) = f(x)[1 − f(x)].

Backpropagation Algorithm
1. Inject an input x
2. Compute the hidden layer: h = f(Wx)
3. Compute the output: o = f(Zh)
4. Compute the output error: δ_output = f'(Zh)·(t − o)
5. Adjust Z on the basis of this error: Z(t+1) = Z(t) + η δ_output h = Z(t) + ΔZ(t)
6. Compute the error on the hidden layer: δ_hidden = f'(Wx)·(Z δ_output)
7. Adjust W on the basis of this error: W(t+1) = W(t) + η δ_hidden x = W(t) + ΔW(t)
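A hedged NumPy sketch of the seven steps above with the logistic transfer function; the layer sizes, XOR-style data, learning rate, and iteration count are illustrative assumptions:

```python
import numpy as np

def f(x):
    """Logistic transfer function."""
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(W, Z, x, t, eta=0.5):
    # 1-3: inject the input, compute the hidden layer and the output
    h = f(W @ x)                          # h = f(Wx)
    o = f(Z @ h)                          # o = f(Zh)
    # 4: output error, using f'(a) = f(a)(1 - f(a)) = o(1 - o)
    delta_out = o * (1 - o) * (t - o)
    # 5: adjust Z (the slides adjust Z before computing the hidden error;
    #    the textbook variant uses the old Z, and both work here)
    Z += eta * np.outer(delta_out, h)
    # 6: error on the hidden layer, backpropagated through Z
    delta_hid = h * (1 - h) * (Z.T @ delta_out)
    # 7: adjust W
    W += eta * np.outer(delta_hid, x)
    return W, Z

# Illustrative XOR task: 2 inputs, 3 hidden units, 1 output.
rng = np.random.default_rng(0)
W = rng.normal(0, 1, (3, 2))
Z = rng.normal(0, 1, (1, 3))
data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
for _ in range(5000):
    for x, t in data:
        W, Z = backprop_step(W, Z, np.array(x, float), np.array(t, float))
```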

Neural network: a simple linear discriminant.

Neural networks: few layers, little learning.

Neural networks: more layers, more learning.


Neural networks Tricks. Favour simple NNs (you can add a penalty for the structure in the error function). Few layers are enough (theoretically only one hidden layer). Exploit cross-validation.