Artificial Neural Networks

CS 533 Artificial Intelligence: Artificial Neural Networks

Contents

Neural Networks
    Biological Neural Networks
    Artificial Neural Networks
    ANN Structure
    ANN Illustration
Perceptrons
    Perceptron Learning Rule
    Perceptrons Continued
    Example of Perceptron Learning (α = 1)
Gradient Descent
    Gradient Descent (Linear)
    The Delta Rule
Multilayer Networks
    Illustration
    Sigmoid Activation
    Plot of Sigmoid Function
    Applying the Chain Rule
    Backpropagation
    Hypertangent Activation Function
    Gaussian Activation Function
    Issues in Neural Networks

Biological Neural Networks

Neural networks are inspired by our brains. The human brain has about 10^11 neurons and 10^14 synapses. A neuron consists of a soma (the cell body), axons (which send signals), and dendrites (which receive signals). A synapse connects an axon to a dendrite. Given a signal, a synapse might increase (excite) or decrease (inhibit) electrical potential. A neuron fires when its electrical potential reaches a threshold. Learning might occur by changes to synapses.

Artificial Neural Networks

An (artificial) neural network consists of units, connections, and weights. Inputs and outputs are numeric. The biological metaphor maps onto the artificial model as follows:

    Biological NN      Artificial NN
    soma               unit
    axon, dendrite     connection
    synapse            weight
    potential          weighted sum
    threshold          bias weight
    signal             activation

ANN Structure

A typical unit i receives inputs a_j1, a_j2, ... from other units and computes a weighted sum

    in_i = W_{0,i} + Σ_j W_{j,i} a_j

and outputs the activation a_i = g(in_i). Typically, input units store the inputs, hidden units transform the inputs into an internal numeric vector, and an output unit transforms the hidden values into the prediction.

An ANN is a function f(x, W) = a, where x is an example, W is the weights, and a is the prediction (the activation value of the output unit). Learning is finding a W that minimizes error.

ANN Illustration

[Figure: a feedforward network with input units x1-x4, hidden units H5 and H6, and output unit O7 producing activation a7. Weights W15, ..., W45 and W16, ..., W46 connect the inputs to the hidden units, W57 and W67 connect the hidden units to the output, and W05, W06, and W07 are bias weights.]

Perceptrons

Perceptron Learning Rule

A perceptron is a single unit with activation

    a = sign(W_0 + Σ_j W_j x_j)

where sign returns 1 or -1 and W_0 is the bias weight. One version of the perceptron learning rule is:

    in ← W_0 + Σ_j W_j x_j
    E ← max(0, -y · in)
    if E > 0 then
        W_0 ← W_0 + α y
        W_j ← W_j + α x_j y   for each input x_j

Here x is the input vector, y ∈ {-1, 1} is the target, and α > 0 is the learning rate.

Perceptrons Continued

This learning rule tends to minimize E. The perceptron convergence theorem states that if some W classifies all the training examples correctly, then the perceptron learning rule will converge to zero error on the training examples. Usually, many epochs (passes over the training examples) are needed until convergence. If zero error is not possible, use α ≈ 0.1/n, where n is the number of normalized or standardized inputs.
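The rule above translates directly into code. Here is a minimal Python sketch of the perceptron learning rule as stated on the slide; the function name, the small random initialization, and the AND-function example are my own additions, not from the original:

    import numpy as np

    def train_perceptron(X, y, alpha=0.1, epochs=100, seed=0):
        """Perceptron learning rule for targets y in {-1, +1}."""
        rng = np.random.default_rng(seed)
        W0 = rng.normal(scale=0.01)               # bias weight (small random start)
        W = rng.normal(scale=0.01, size=X.shape[1])
        for _ in range(epochs):                   # each pass is one epoch
            for x, target in zip(X, y):
                in_ = W0 + W @ x                  # weighted sum
                E = max(0.0, -target * in_)       # E <- max(0, -y * in)
                if E > 0:                         # misclassified: apply the rule
                    W0 += alpha * target
                    W += alpha * target * x
        return W0, W

    # Usage: learn the AND function with a {-1, +1} encoding.
    X = np.array([[-1., -1.], [-1., 1.], [1., -1.], [1., 1.]])
    y = np.array([-1., -1., -1., 1.])
    W0, W = train_perceptron(X, y)
    print(np.sign(W0 + X @ W))                    # expect [-1. -1. -1.  1.]

The small random initialization avoids the degenerate all-zero start, where in = 0 gives E = 0 and the rule as stated would never update.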

Example of Perceptron Learning (α = 1)

[Table: a worked trace of the perceptron learning rule with α = 1, showing for each training example the inputs x1, x2, x3, x4, the target y, the weighted sum in, the error E, and the resulting weights W0, W1, W2, W3, W4. The numeric entries were lost in transcription.]

Gradient Descent

Gradient Descent (Linear)

Suppose activation is a linear function:

    a = W_0 + Σ_j W_j x_j

The Widrow-Hoff (LMS) update rule is:

    diff ← y - a
    W_0 ← W_0 + α · diff
    W_j ← W_j + α x_j · diff   for each input j

where α is the learning rate. This update rule tends to minimize the squared error

    E(W) = Σ_{(x,y)} (y - a)^2

The Delta Rule

Given error E(W), obtain the gradient:

    ∇E(W) = [∂E/∂W_0, ∂E/∂W_1, ..., ∂E/∂W_n]

To decrease error, use the update rule:

    W_j ← W_j - α ∂E/∂W_j

The LMS update rule can be derived using E(W) = (y - (W_0 + Σ_j W_j x_j))^2 / 2. For the perceptron learning rule, use E(W) = max(0, -y (W_0 + Σ_j W_j x_j)).
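As a concrete illustration of the delta rule applied to a linear unit, here is a minimal Python sketch of the Widrow-Hoff (LMS) update; the function name and the sample regression data are assumptions of mine, not from the slides:

    import numpy as np

    def train_lms(X, y, alpha=0.05, epochs=200):
        """Widrow-Hoff (LMS) rule for a linear unit a = W0 + sum_j Wj xj."""
        W0 = 0.0
        W = np.zeros(X.shape[1])
        for _ in range(epochs):
            for x, target in zip(X, y):
                a = W0 + W @ x           # linear activation
                diff = target - a        # diff <- y - a
                W0 += alpha * diff       # gradient step on the bias
                W += alpha * diff * x    # gradient step on each weight
        return W0, W

    # Usage: recover a known linear function y = 1 + 2*x1 - 3*x2.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 2))
    y = 1 + 2 * X[:, 0] - 3 * X[:, 1]
    W0, W = train_lms(X, y)
    print(round(W0, 3), np.round(W, 3))  # approximately 1.0 [ 2. -3.]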

Multilayer Networks

Illustration

[Figure: the same 4-2-1 network as before (inputs x1-x4, hidden units H5 and H6, output unit O7 with activation a7), shown with numeric weight values on the connections; the values were lost in transcription.]

Sigmoid Activation

The sigmoid function is defined as:

    sigmoid(x) = 1 / (1 + e^{-x})

It is commonly used for ANN activation functions:

    a_i = sigmoid(in_i) = sigmoid(W_{0,i} + Σ_j W_{j,i} a_j)

Note that

    d/dx sigmoid(x) = sigmoid(x)(1 - sigmoid(x))

Plot of Sigmoid Function

[Figure: plot of sigmoid(x) for x from -4 to 4; the curve rises from near 0 through 0.5 at x = 0 toward 1.]

Applying the Chain Rule

Using E = (y_i - a_i)^2 / 2 for output unit i:

    ∂E/∂W_{j,i} = (∂E/∂a_i)(∂a_i/∂in_i)(∂in_i/∂W_{j,i})
                = -(y_i - a_i) a_i (1 - a_i) a_j

For weights from input to hidden units:

    ∂E/∂W_{k,j} = (∂E/∂a_i)(∂a_i/∂in_i)(∂in_i/∂a_j)(∂a_j/∂in_j)(∂in_j/∂W_{k,j})
                = -(y_i - a_i) a_i (1 - a_i) W_{j,i} a_j (1 - a_j) x_k

Backpropagation

Backpropagation is an application of the delta rule. Update each weight W using the rule:

    W ← W - α ∂E/∂W

where α is the learning rate.
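To make the two chain-rule gradients concrete, here is a minimal Python sketch of one backpropagation step for the 4-2-1 network above, with sigmoid activations at the hidden and output units; all names (backprop_step, W_h, b_h, W_o, b_o) and the toy input/target pair are my own, not from the slides:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def backprop_step(x, y, W_h, b_h, W_o, b_o, alpha=0.5):
        """One update W <- W - alpha * dE/dW for E = (y - a)^2 / 2."""
        a_h = sigmoid(b_h + W_h @ x)             # hidden activations (H5, H6)
        a_o = sigmoid(b_o[0] + W_o @ a_h)        # output activation (a7)
        # Output weights: dE/dW_{j,i} = -(y_i - a_i) a_i (1 - a_i) a_j
        delta_o = -(y - a_o) * a_o * (1 - a_o)
        # Hidden weights: dE/dW_{k,j} = delta_o * W_{j,i} a_j (1 - a_j) x_k
        delta_h = delta_o * W_o * a_h * (1 - a_h)
        W_o -= alpha * delta_o * a_h             # all updates are in place
        b_o -= alpha * delta_o
        W_h -= alpha * np.outer(delta_h, x)
        b_h -= alpha * delta_h
        return (y - a_o) ** 2 / 2                # squared error before the update

    # Usage: drive the error down on a single input/target pair.
    rng = np.random.default_rng(0)
    W_h, b_h = rng.normal(scale=0.5, size=(2, 4)), np.zeros(2)
    W_o, b_o = rng.normal(scale=0.5, size=2), np.zeros(1)
    x, y = np.array([1., 0., 1., 0.]), 0.9
    for _ in range(200):
        err = backprop_step(x, y, W_h, b_h, W_o, b_o)
    print(err)                                   # small after training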

Hypertangent Activation Function

[Figure: plot of tanh(x), which ranges from -1 to 1 and passes through 0 at x = 0.]

Gaussian Activation Function

[Figure: plot of gaussian(x, 1) for x from -3 to 3, a bell curve peaking at x = 0.]

Issues in Neural Networks

Sufficiently complex ANNs can approximate any reasonable function. ANNs approximate a preference bias for interpolation.

How many hidden units and layers? What are good initial values?

One approach to avoid overfitting is:
    Remove validation examples from the training examples.
    Train the neural network using the training examples.
    Choose the weights that are best on the validation set.

Faster algorithms might rely on:
    Momentum, a running average of deltas;
    Conjugate Gradient, a second-derivative method; or
    RPROP, which updates based only on the sign of the gradient.
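As a sketch of the validation-set approach to avoiding overfitting described above, the following Python reuses sigmoid and backprop_step from the earlier sketch; the loop structure and the names predict and train_with_validation are assumptions of mine, not from the slides:

    import copy

    def predict(x, W_h, b_h, W_o, b_o):
        """Forward pass through the 4-2-1 network from the previous sketch."""
        a_h = sigmoid(b_h + W_h @ x)
        return sigmoid(b_o[0] + W_o @ a_h)

    def train_with_validation(train_set, val_set, weights, epochs=500):
        """Keep whichever weights score best on the held-out validation set."""
        best_err = float("inf")
        best_weights = copy.deepcopy(weights)
        for _ in range(epochs):
            for x, y in train_set:               # one training epoch
                backprop_step(x, y, *weights)
            val_err = sum((y - predict(x, *weights)) ** 2 / 2
                          for x, y in val_set)   # error on validation examples
            if val_err < best_err:               # remember the best weights seen
                best_err = val_err
                best_weights = copy.deepcopy(weights)
        return best_weights

Because the weights chosen are those that minimize validation error rather than training error, training can run for many epochs without the returned network overfitting the training examples.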