Neural networks. Chapter 19, Sections 1-5


Outline
Brains; Neural networks; Perceptrons; Multilayer perceptrons; Applications of neural networks

Brains
$10^{11}$ neurons of > 20 types, $10^{14}$ synapses, 1-10 ms cycle time.
Signals are noisy "spike trains" of electrical potential.
[Figure: schematic neuron, labelling the axonal arborization, synapse, axon from another cell, dendrite, axon, nucleus, synapses, and cell body or soma.]

McCulloch-Pitts unit
Output is a "squashed" linear function of the inputs:
$a_i \leftarrow g(in_i) = g\!\left(\sum_j W_{j,i}\, a_j\right)$
with a fixed bias input $a_0 = -1$ and bias weight $W_{0,i}$.
[Figure: unit schematic showing the input links $a_j$, the input function $\Sigma$ producing $in_i$, the activation function $g$, the output $a_i$, and the output links.]
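For concreteness, here is a minimal Python sketch (not from the slides) of the unit computation, assuming a sigmoid squashing function and arbitrary illustrative weights:

```python
import math

def sigmoid(x):
    """Sigmoid activation g(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def unit_output(weights, inputs, g=sigmoid):
    """McCulloch-Pitts-style unit: a_i = g(sum_j W_{j,i} * a_j).
    inputs[0] is the fixed bias input a_0 = -1; weights[0] is the bias weight W_{0,i}."""
    in_i = sum(w * a for w, a in zip(weights, inputs))
    return g(in_i)

# Illustrative numbers only: bias weight 0.5, two real inputs.
print(unit_output([0.5, 1.0, 1.0], [-1, 0.3, 0.7]))
```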

Activation functions
[Figure: plots of $g(in_i)$ vs. $in_i$ for the two functions, each saturating at +1.]
(a) is a step function or threshold function; (b) is a sigmoid function $1/(1 + e^{-x})$.
Changing the bias weight $W_{0,i}$ moves the threshold location.
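Both activation functions in code, as a small sketch; the derivative of the sigmoid is included because the learning rules later in the slides need $g'$:

```python
import math

def step(x):
    """(a) Step / threshold function."""
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    """(b) Sigmoid function 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    """Derivative g'(x) = g(x) * (1 - g(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)
```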

Implementing logical functions
AND: $W_0 = 1.5$, $W_1 = 1$, $W_2 = 1$
OR: $W_0 = 0.5$, $W_1 = 1$, $W_2 = 1$
NOT: $W_0 = -0.5$, $W_1 = -1$
McCulloch and Pitts: every Boolean function can be implemented.
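A quick check of these weights, assuming a step-function unit with bias input $a_0 = -1$ (so the unit fires when the weighted sum of the real inputs exceeds $W_0$); the code itself is a sketch, not part of the slides:

```python
def threshold_unit(w0, weights, inputs):
    """Step unit with bias input a_0 = -1: output 1 iff sum_j W_j * a_j - W_0 > 0."""
    in_i = -w0 + sum(w * a for w, a in zip(weights, inputs))
    return 1 if in_i > 0 else 0

AND = lambda a, b: threshold_unit(1.5, [1, 1], [a, b])
OR  = lambda a, b: threshold_unit(0.5, [1, 1], [a, b])
NOT = lambda a:    threshold_unit(-0.5, [-1], [a])

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b))
print(NOT(0), NOT(1))  # -> 1 0
```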

Network structures
Feed-forward networks: single-layer perceptrons, multi-layer perceptrons. Feed-forward networks implement functions, have no internal state.
Recurrent networks:
Hopfield networks have symmetric weights ($W_{i,j} = W_{j,i}$), $g(x) = \mathrm{sign}(x)$, $a_i = \pm 1$; holographic associative memory.
Boltzmann machines use stochastic activation functions, $\approx$ MCMC in BNs.
Recurrent neural nets have directed cycles with delays, hence internal state (like flip-flops), can oscillate etc.

Feed-forward example
[Figure: network with input units 1 and 2, hidden units 3 and 4, output unit 5, and weights $W_{1,3}, W_{1,4}, W_{2,3}, W_{2,4}, W_{3,5}, W_{4,5}$.]
Feed-forward network = a parameterized family of nonlinear functions:
$a_5 = g(W_{3,5}\, a_3 + W_{4,5}\, a_4) = g\big(W_{3,5}\, g(W_{1,3}\, a_1 + W_{2,3}\, a_2) + W_{4,5}\, g(W_{1,4}\, a_1 + W_{2,4}\, a_2)\big)$
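The nested expression for $a_5$ written out directly for the 2-input, 2-hidden, 1-output network in the figure; the weight values below are arbitrary assumptions used only to show that the function is parameterized by $\mathbf{W}$:

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))  # sigmoid activation

def a5(a1, a2, W):
    """Output of the 5-unit feed-forward network: a_5 = g(W_{3,5} a_3 + W_{4,5} a_4)."""
    a3 = g(W["13"] * a1 + W["23"] * a2)
    a4 = g(W["14"] * a1 + W["24"] * a2)
    return g(W["35"] * a3 + W["45"] * a4)

# Arbitrary weights, just to illustrate the parameterized family of functions.
W = {"13": 0.5, "23": -0.4, "14": 0.9, "24": 0.1, "35": 1.2, "45": -0.7}
print(a5(1.0, 0.0, W))
```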

Perceptrons
[Figure: single-layer perceptron, with input units connected by weights $W_{j,i}$ directly to output units, and a surface plot of the perceptron output as a function of $x_1$ and $x_2$.]

Expressiveness of perceptrons
Consider a perceptron with $g$ = step function (Rosenblatt, 1957, 1960).
Can represent AND, OR, NOT, majority, etc.
Represents a linear separator in input space: $\sum_j W_j x_j > 0$, or $\mathbf{W} \cdot \mathbf{x} > 0$.
[Figure: plots of $I_2$ vs. $I_1$ for (a) $I_1$ and $I_2$, (b) $I_1$ or $I_2$, both linearly separable, and (c) $I_1$ xor $I_2$, for which no linear separator exists.]

Perceptron learning
Learn by adjusting weights to reduce error on training set.
The squared error for an example with input $\mathbf{x}$ and true output $y$ is
$E = \frac{1}{2}\, Err^2 \equiv \frac{1}{2}\,\big(y - h_{\mathbf{W}}(\mathbf{x})\big)^2$
Perform optimization search by gradient descent:
$\frac{\partial E}{\partial W_j} = Err \cdot \frac{\partial Err}{\partial W_j} = Err \cdot \frac{\partial}{\partial W_j}\Big(y - g\big(\textstyle\sum_{j=0}^{n} W_j x_j\big)\Big) = -\,Err \cdot g'(in) \cdot x_j$
Simple weight update rule:
$W_j \leftarrow W_j + \alpha \cdot Err \cdot g'(in) \cdot x_j$
E.g., +ve error $\Rightarrow$ increase network output $\Rightarrow$ increase weights on +ve inputs, decrease on -ve inputs.
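A sketch of this update rule in Python, with sigmoid $g$; the learning rate $\alpha$, the bias convention, and the toy OR data set are assumptions added for illustration:

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def g_prime(x):
    s = g(x)
    return s * (1.0 - s)

def perceptron_update(W, x, y, alpha=0.5):
    """One gradient-descent step: W_j <- W_j + alpha * Err * g'(in) * x_j.
    x[0] is the fixed bias input."""
    in_ = sum(Wj * xj for Wj, xj in zip(W, x))
    err = y - g(in_)
    return [Wj + alpha * err * g_prime(in_) * xj for Wj, xj in zip(W, x)]

# Toy run: learn OR on 2 inputs plus a bias input of -1.
data = [([-1, 0, 0], 0), ([-1, 0, 1], 1), ([-1, 1, 0], 1), ([-1, 1, 1], 1)]
W = [0.0, 0.0, 0.0]
for epoch in range(2000):
    for x, y in data:
        W = perceptron_update(W, x, y)

# Rounded predictions should approach [0, 1, 1, 1].
print([round(g(sum(Wj * xj for Wj, xj in zip(W, x)))) for x, _ in data])
```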

Perceptron learning contd.
Perceptron learning rule converges to a consistent function for any linearly separable data set.
[Figure: two plots of proportion correct on test set vs. training set size (0-100), comparing the perceptron and a decision tree on two different problems.]

Multilayer perceptrons
Layers are usually fully connected; numbers of hidden units typically chosen by hand.
[Figure: output units $a_i$, connected by weights $W_{j,i}$ to hidden units $a_j$, connected by weights $W_{k,j}$ to input units $a_k$.]
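A forward pass through one fully connected hidden layer, using the slide's indexing (inputs $a_k$, hidden $a_j$, outputs $a_i$); the layer sizes and random weights are illustrative assumptions:

```python
import math, random

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(weights, activations):
    """weights[m][n] is the weight from unit n in the previous layer to unit m."""
    return [g(sum(w * a for w, a in zip(row, activations))) for row in weights]

random.seed(0)
n_in, n_hidden, n_out = 3, 4, 2
W_kj = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
W_ji = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]

a_k = [0.2, -0.5, 0.9]    # input units
a_j = layer(W_kj, a_k)    # hidden units
a_i = layer(W_ji, a_j)    # output units
print(a_i)
```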

Expressiveness of MLPs
All continuous functions w/ 2 layers, all functions w/ 3 layers.
[Figure: two surface plots of $h_{\mathbf{W}}(x_1, x_2)$ over $x_1, x_2 \in [-4, 4]$.]
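One way to see the expressiveness claim: adding two opposite-facing soft-threshold units already produces a ridge in the input space, and combining ridges yields localized bumps. A small sketch with hand-picked (hypothetical) weights:

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def ridge(x1, x2):
    """Sum of two soft thresholds facing opposite directions along x1 forms a ridge."""
    return g(5 * (x1 + 1)) + g(-5 * (x1 - 1)) - 1.0

for x1 in (-4, -1, 0, 1, 4):
    print(x1, round(ridge(x1, 0.0), 3))  # high near x1 = 0, low far away
```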

Back-propagation learning
Output layer: same as for single-layer perceptron,
$W_{j,i} \leftarrow W_{j,i} + \alpha \cdot a_j \cdot \Delta_i$, where $\Delta_i = Err_i \cdot g'(in_i)$.
Hidden layer: back-propagate the error from the output layer:
$\Delta_j = g'(in_j)\, \sum_i W_{j,i}\, \Delta_i$
Update rule for weights in hidden layer:
$W_{k,j} \leftarrow W_{k,j} + \alpha \cdot a_k \cdot \Delta_j$
(Most neuroscientists deny that back-propagation occurs in the brain.)
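The same rules for a single training example, spelled out for one hidden layer of sigmoid units; the network shapes, initial weights, and $\alpha$ below are assumptions:

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def g_prime(x):
    s = g(x)
    return s * (1.0 - s)

def backprop_step(W_kj, W_ji, a_k, y, alpha=0.5):
    """One back-propagation update on a single example (a_k: inputs, y: target outputs).
    W_kj[j][k]: input k -> hidden j; W_ji[i][j]: hidden j -> output i."""
    in_j = [sum(w * a for w, a in zip(row, a_k)) for row in W_kj]
    a_j = [g(v) for v in in_j]
    in_i = [sum(w * a for w, a in zip(row, a_j)) for row in W_ji]
    a_i = [g(v) for v in in_i]
    # Output layer: Delta_i = Err_i * g'(in_i)
    delta_i = [(yi - ai) * g_prime(ini) for yi, ai, ini in zip(y, a_i, in_i)]
    # Hidden layer: Delta_j = g'(in_j) * sum_i W_{j,i} * Delta_i
    delta_j = [g_prime(in_j[j]) * sum(W_ji[i][j] * delta_i[i] for i in range(len(delta_i)))
               for j in range(len(a_j))]
    # Weight updates: W_{j,i} += alpha * a_j * Delta_i ; W_{k,j} += alpha * a_k * Delta_j
    new_W_ji = [[W_ji[i][j] + alpha * a_j[j] * delta_i[i] for j in range(len(a_j))]
                for i in range(len(delta_i))]
    new_W_kj = [[W_kj[j][k] + alpha * a_k[k] * delta_j[j] for k in range(len(a_k))]
                for j in range(len(a_j))]
    return new_W_kj, new_W_ji

# Illustrative single step: 3 inputs (including a bias input of -1) -> 2 hidden -> 1 output.
W_kj = [[0.1, -0.2, 0.3], [0.0, 0.4, -0.1]]
W_ji = [[0.2, -0.3]]
W_kj, W_ji = backprop_step(W_kj, W_ji, [-1, 1.0, 0.0], [1.0])
print(W_ji)
```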

Back-propagation derivation
The squared error on a single example is defined as
$E = \frac{1}{2} \sum_i (y_i - a_i)^2$,
where the sum is over the nodes in the output layer.
$\frac{\partial E}{\partial W_{j,i}} = -(y_i - a_i)\,\frac{\partial a_i}{\partial W_{j,i}} = -(y_i - a_i)\,\frac{\partial g(in_i)}{\partial W_{j,i}}$
$\qquad = -(y_i - a_i)\, g'(in_i)\,\frac{\partial\, in_i}{\partial W_{j,i}} = -(y_i - a_i)\, g'(in_i)\,\frac{\partial}{\partial W_{j,i}}\Big(\textstyle\sum_j W_{j,i}\, a_j\Big)$
$\qquad = -(y_i - a_i)\, g'(in_i)\, a_j = -\,a_j\, \Delta_i$

Back-propagation derivation contd.
$\frac{\partial E}{\partial W_{k,j}} = -\sum_i (y_i - a_i)\,\frac{\partial a_i}{\partial W_{k,j}} = -\sum_i (y_i - a_i)\,\frac{\partial g(in_i)}{\partial W_{k,j}}$
$\qquad = -\sum_i (y_i - a_i)\, g'(in_i)\,\frac{\partial\, in_i}{\partial W_{k,j}} = -\sum_i \Delta_i\,\frac{\partial}{\partial W_{k,j}}\Big(\textstyle\sum_j W_{j,i}\, a_j\Big)$
$\qquad = -\sum_i \Delta_i\, W_{j,i}\,\frac{\partial a_j}{\partial W_{k,j}} = -\sum_i \Delta_i\, W_{j,i}\,\frac{\partial g(in_j)}{\partial W_{k,j}}$
$\qquad = -\sum_i \Delta_i\, W_{j,i}\, g'(in_j)\,\frac{\partial\, in_j}{\partial W_{k,j}} = -\sum_i \Delta_i\, W_{j,i}\, g'(in_j)\,\frac{\partial}{\partial W_{k,j}}\Big(\textstyle\sum_k W_{k,j}\, a_k\Big)$
$\qquad = -\sum_i \Delta_i\, W_{j,i}\, g'(in_j)\, a_k = -\,a_k\, \Delta_j$
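A quick numerical sanity check of the derivation, comparing the analytic gradient $\partial E / \partial W_{j,i} = -a_j \Delta_i$ against a finite-difference estimate on a tiny network; all of the numbers are illustrative:

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward_error(W_kj, W_ji, a_k, y):
    """Return E = 1/2 * sum_i (y_i - a_i)^2 plus the activations, for a one-hidden-layer net."""
    a_j = [g(sum(w * a for w, a in zip(row, a_k))) for row in W_kj]
    a_i = [g(sum(w * a for w, a in zip(row, a_j))) for row in W_ji]
    E = 0.5 * sum((yi - ai) ** 2 for yi, ai in zip(y, a_i))
    return E, a_j, a_i

W_kj = [[0.1, -0.2], [0.3, 0.4]]
W_ji = [[0.5, -0.6]]
a_k, y = [0.7, -0.3], [1.0]

# Analytic gradient for W_ji[0][0]: dE/dW_{j,i} = -(y_i - a_i) g'(in_i) a_j = -a_j * Delta_i.
E, a_j, a_i = forward_error(W_kj, W_ji, a_k, y)
delta_i = (y[0] - a_i[0]) * a_i[0] * (1.0 - a_i[0])   # g'(in_i) = a_i (1 - a_i) for sigmoid
analytic = -a_j[0] * delta_i

# Finite-difference estimate of the same partial derivative.
eps = 1e-6
W_ji[0][0] += eps
E_plus, _, _ = forward_error(W_kj, W_ji, a_k, y)
numeric = (E_plus - E) / eps

print(analytic, numeric)   # the two estimates should agree to several decimal places
```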

Back-propagation learning contd.
At each epoch, sum gradient updates for all examples and apply.
[Figure: total error on training set (0-14) vs. number of epochs (0-400).]
Usual problems with slow convergence, local minima.
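The epoch-wise (batch) scheme in outline: accumulate the gradient contribution of every example before touching the weights. A sketch, where `gradient_fn` is a hypothetical helper returning $\partial E / \partial W$ for one example:

```python
def epoch_update(W, examples, gradient_fn, alpha=0.1):
    """Sum the gradient contributions over all examples, then apply one weight change per epoch.
    gradient_fn(W, x, y) is assumed to return a nested list matching W with dE/dW entries."""
    total = [[0.0 for _ in row] for row in W]
    for x, y in examples:
        grad = gradient_fn(W, x, y)
        for r, row in enumerate(grad):
            for c, val in enumerate(row):
                total[r][c] += val
    # One gradient-descent step on the summed (batch) gradient.
    return [[W[r][c] - alpha * total[r][c] for c in range(len(W[r]))]
            for r in range(len(W))]
```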

Back-propagation learning contd.
[Figure: % correct on test set vs. training set size (0-100), comparing a multilayer network and a decision tree.]

Handwritten digit recognition
3-nearest-neighbor = 2.4% error
400-300-10 unit MLP = 1.6% error
LeNet: 768-192-30-10 unit MLP = 0.9% error

Summary
Most brains have lots of neurons; each neuron $\approx$ linear threshold unit (?)
Perceptrons (one-layer networks) insufficiently expressive
Multi-layer networks are sufficiently expressive; can be trained by gradient descent, i.e., error back-propagation
Many applications: speech, driving, handwriting, credit cards, etc.