Supervised Learning NNs

EE788 Robot Cognition and Planning, Prof. J.-H. Kim
Lecture 6: Supervised Learning NNs
Robot Intelligence Technology Lab.
From Jang, Sun, Mizutani, Ch. 9, Neuro-Fuzzy and Soft Computing, Prentice Hall

Contents
1. Introduction
2. Perceptrons
3. Adaline
4. Backpropagation MLPs
5. Radial Basis Function Networks
6. Modular Networks

1. Introduction
Artificial neural networks, or simply NNs
Perceptron: McCulloch and Pitts (1943)
Single-layer perceptrons (Rosenblatt, 1962): applied to pattern classification learning
Interest in NNs dwindled in the 1970s: limitation of single-layer systems (Minsky and Papert, 1969)
The recent resurgence of interest in the field of NNs (since the 1980s):
New NN learning algorithms
Analog VLSI circuits
Parallel processing techniques

1. Introduction
Classification of NN models
Learning methods: supervised vs. unsupervised
Architectures: feedforward vs. recurrent
Output types: binary vs. continuous
Node types: uniform vs. hybrid
Implementations: software vs. hardware
Connection weights: adjustable vs. hardwired
Supervised learning, or mapping, networks:
Desired input-output data sets
Adjustable parameters, updated by a supervised learning rule

2. Perceptrons
Architecture: a single-layer perceptron for pattern recognition
g maps all or a part of the input pattern into
A binary value {0, 1}, or
A bipolar value {-1, 1}
The terms: 1: active or excitatory; 0: inactive; -1: inhibitory

2. Perceptrons
Output: o = f(Σ_{i=1}^{n} w_i x_i + w_0)
where w_i: a modifiable weight, w_0: the bias term
Activation function f(·): signum or step function
sign(x) = 1 if x ≥ 0, -1 otherwise
step(x) = 1 if x ≥ 0, 0 otherwise

2. Perceptrons
Learning algorithm:
1. Select an input vector x from the training data set.
2. If the perceptron gives an incorrect response, modify all connection weights according to Δw_i = η t x_i, where t: a target output, η: learning rate.
3. Repeat 1 and 2.
Learning rate η:
A constant throughout training, or
Proportional to the error: faster convergence, but may lead to unstable learning
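
A minimal sketch of this learning loop, assuming bipolar targets and the sign activation above; function and variable names are illustrative, not from the slides:

```python
import numpy as np

def sign(x):
    """Signum activation: +1 if x >= 0, else -1."""
    return 1.0 if x >= 0 else -1.0

def train_perceptron(X, t, eta=0.1, epochs=100):
    """Single-layer perceptron learning (illustrative sketch).

    X: (P, n) input patterns, t: (P,) bipolar targets in {-1, +1}.
    The bias w_0 is handled by appending a constant +1 input.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias input x_0 = 1
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        errors = 0
        for x, target in zip(Xb, t):
            if sign(w @ x) != target:          # incorrect response
                w += eta * target * x          # delta_w_i = eta * t * x_i
                errors += 1
        if errors == 0:                        # every pattern classified
            break
    return w

# AND gate with bipolar inputs/targets (linearly separable)
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
t = np.array([-1, -1, -1, 1], dtype=float)
w = train_perceptron(X, t)
print(w, [sign(np.append(x, 1.0) @ w) for x in X])
```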

2. Perceptrons
Exclusive-OR problem: o = 0: class 0, o = 1: class 1
Not linearly separable using a single-layer perceptron (Minsky and Papert, 1969), because no single line can separate {(0,0), (1,1)} from {(0,1), (1,0)}.
The two-layer perceptron: multilayer perceptrons solve nonlinearly separable problems.
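
The two-layer claim can be made concrete with hand-picked weights; the particular values below are an assumption for illustration, not from the slides. The hidden units compute OR and NAND, and the output unit ANDs them, which is exactly XOR:

```python
import numpy as np

def step(x):
    """Step activation: 1 if x > 0, else 0."""
    return (x > 0).astype(float)

# Hand-picked weights (illustrative): hidden units compute OR and NAND,
# the output unit computes their AND, which is exactly XOR.
W1 = np.array([[1.0, 1.0],    # OR:   x1 + x2 - 0.5 > 0
               [-1.0, -1.0]]) # NAND: -x1 - x2 + 1.5 > 0
b1 = np.array([-0.5, 1.5])
W2 = np.array([1.0, 1.0])     # AND of the two hidden outputs
b2 = -1.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h = step(W1 @ np.array(x, dtype=float) + b1)
    y = step(W2 @ h + b2)
    print(x, int(y))           # prints the XOR truth table
```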

3. Adaline
Adaline (adaptive linear element), Widrow and Hoff (1960)
Delta rule for adjusting the weights on the pth I/O pattern:
E_p = (t_p - o_p)^2, with linear output o_p = Σ_{i=1}^{n} w_i x_{i,p}
Δ_p w_i ∝ -∂E_p/∂w_i = 2 (t_p - o_p) x_{i,p}

3. Adaline
Delta rule (Widrow-Hoff learning rule): a least-mean-square (LMS) learning procedure that minimizes squared errors.
Features:
Simplicity
Distributed learning (performed locally at each node level)
On-line learning (weights updated after the presentation of each pattern)
Applications:
Adaptive noise cancellation
Interference canceling in electrocardiograms
Echo elimination from long-distance telephone transmission lines
Antenna sidelobe interference canceling
Adaptive inverse control
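
A sketch of the Widrow-Hoff rule with on-line (per-pattern) updates; the toy regression task, the separate bias term, and all constants are illustrative assumptions:

```python
import numpy as np

def adaline_lms(X, t, eta=0.05, epochs=50):
    """Adaline trained with the Widrow-Hoff (LMS) delta rule -- sketch.

    The output is the linear sum o = w.x + b; after each pattern the
    weights move along the negative gradient of (t - o)^2.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, t):      # on-line: update per pattern
            o = w @ x + b                # linear output (no threshold)
            err = target - o
            w += eta * err * x           # delta_w = eta * (t - o) * x
            b += eta * err
    return w, b

# Fit a noisy linear mapping t = 2*x1 - x2 + 0.5
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
t = 2 * X[:, 0] - X[:, 1] + 0.5 + 0.01 * rng.standard_normal(200)
w, b = adaline_lms(X, t)
print(w, b)   # approximately [2, -1] and 0.5
```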

4. Backpropagation MLPs
MLP (Multi-Layer Perceptron): a feedforward network that employs the delta rule for training.
Feedforward networks: fully connected multilayer networks; all neurons in a particular layer are fully connected to all neurons in the subsequent layer.
Ex) Three-layer feedforward NN: a three-neuron input layer (Layer 0), a four-neuron hidden layer (Layer 1), and a two-neuron output layer (Layer 2).

4. Backpropagation MLPs
Basic model of a single artificial neuron:
x_1, ..., x_N: inputs; w_1, ..., w_N: weights; b: a bias; f: the activation function; y: the output
Let s be a weighted sum:
s(t) = Σ_{j=1}^{N} w_j x_j(t) + b, i.e., s(t) = Wx + b

4. Backpropagation MLPs
Activation function f(s)
Sigmoid (or logistic) activation function: f(s) = 1 / (1 + e^{-s})
Differentiable, monotonic
The bias b will move the curve along the s-axis.
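
A quick numeric illustration of the bias-shift claim: with a bias, the neuron computes f(s + b), so the curve keeps its shape but its midpoint f = 0.5 moves to s = -b. The sample values are illustrative:

```python
import numpy as np

def sigmoid(s):
    """Logistic sigmoid f(s) = 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + np.exp(-s))

# The midpoint of f(s + b) sits at s = -b for every bias b.
for b in (-2.0, 0.0, 2.0):
    s = np.array([-4.0, -b, 4.0])
    print(b, np.round(sigmoid(s + b), 3))   # middle entry is always 0.5
```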

4. Backpropagation MLPs
Backpropagation: a supervised learning method to train NNs. It uses a gradient-descent optimization method, also referred to as the delta rule when applied to feedforward networks.
Performance index, or cost function, J:
J = Σ_{m=1}^{M} (d_m - y_m)^2
where d: desired network output, y: actual network output.
Using gradient descent, the weight increment is
Δw_j = -μ ∂J/∂w_j
where μ is a constant.

4. Backpropagation MLPs
Using the chain rule,
∂J/∂w_j = (∂J/∂y)(∂y/∂w_j)   (1)
∂J/∂y = -2(d - y)   (2)
If the activation function is the sigmoid function, then its derivative is
f'(s) = e^{-s} / (1 + e^{-s})^2, or f'(s) = f(s)(1 - f(s))
Since f(s) is the neuron output y, the above equation can be written as
∂y/∂s = y(1 - y)   (3)
From (1), again using the chain rule,
∂y/∂w_j = (∂y/∂s)(∂s/∂w_j)   (4)

4. Backpropagation MLPs
The bias b is called w_0, with a constant input x_0 = 1; thus
s = Σ_{j=0}^{N} w_j x_j   (5)
∂s/∂w_j = x_j   (6)
Substituting (3) and (5) into (4), and putting (6) into (1):
Δw_j = -μ ∂J/∂w_j = 2μ (d - y) y(1 - y) x_j   (7)
where
δ = y(1 - y)(d - y)   (8)

4. Backpropagation MLPs
Substituting (8) into (7), the weight increment becomes
Δw_j = 2μ δ x_j   (9)
This leads to a weight increment, called the delta rule, for a particular neuron:
Δw_j(kT) = η δ x_j   (10)
where η is the learning rate, a value between 0 and 1. Hence the new weight becomes
w_j((k + 1)T) = w_j(kT) + Δw_j(kT)   (11)
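
The gradient behind (7)-(10) can be sanity-checked numerically for a single sigmoid neuron with J = (d - y)^2; the inputs, seed, and tolerances below are arbitrary choices for illustration:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Single sigmoid neuron: y = f(w.x + b), cost J = (d - y)^2.
rng = np.random.default_rng(1)
x = np.array([0.3, -0.8, 0.5])
w = rng.standard_normal(3)
b = 0.2
d = 1.0

y = sigmoid(w @ x + b)
delta = y * (1 - y) * (d - y)          # eq. (8)
grad_analytic = -2 * delta * x         # dJ/dw_j = -2 * delta * x_j

# Central finite-difference check of the same gradient
h = 1e-6
grad_numeric = np.empty(3)
for j in range(3):
    wp, wm = w.copy(), w.copy()
    wp[j] += h
    wm[j] -= h
    Jp = (d - sigmoid(wp @ x + b)) ** 2
    Jm = (d - sigmoid(wm @ x + b)) ** 2
    grad_numeric[j] = (Jp - Jm) / (2 * h)

print(np.allclose(grad_analytic, grad_numeric, atol=1e-8))  # True
```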

4. Backpropagation MLPs
Consider a three-layered network: input layer (l = 0), hidden layer (l = 1), and output layer (l = 2).
Back-propagation commences with the output layer, where d is known and hence δ can be calculated using (8), and the weights adjusted using (11).
To adjust the weights on the hidden layer (l = 1), (8) is replaced by
[δ_j]_l = [x_j(1 - x_j)]_l Σ_{k=1}^{N} [w_{kj} δ_k]_{l+1}   (12)
(Figure: three-layer feedforward NN)

4. Backpropagation MLPs
Hence, the δ values for layer l are calculated using the neuron outputs from layer l (the hidden layer) together with the summation of the w and δ products from layer l+1 (the output layer).
The back-propagation process continues until all weights have been adjusted. Then, using a new set of inputs, information is fed forward through the network (using the new weights) and the errors at the output layer are computed. The process continues until:
(i) the performance index J reaches an acceptably low value,
(ii) a maximum iteration count (number of epochs) has been exceeded, or
(iii) a training-time period has been exceeded.

4. Backpropagation MLPs
The equations that govern the BPA can be summarized as:
Single neuron summation: s = Σ_{j=1}^{N} w_j(t) x_j(t) + b   (13)
Sigmoid activation function: y = 1 / (1 + e^{-s})   (14)
Delta rule: Δw_j(kT) = η δ x_j   (15)
New weight: w_j((k + 1)T) = w_j(kT) + Δw_j(kT)   (16)
Output layer: δ = y(1 - y)(d - y)   (17)
Performance index: J = Σ_{m=1}^{M} (d_m - y_m)^2   (18)
Other layers: [δ_j]_l = [x_j(1 - x_j)]_l Σ_{k=1}^{N} [w_{kj} δ_k]_{l+1}   (19)
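
A compact sketch of the whole BPA for a 2-input, 3-hidden-neuron, 1-output network (the same shape as the worked example that follows). The architecture sizes, random seed, epoch count, and the choice of XOR as the task are assumptions; convergence may require a different seed or more epochs:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def train_step(x, d, W1, b1, W2, b2, eta=0.5):
    """One forward/backward pass of the BPA, eqs. (13)-(19)."""
    # Forward pass: eqs. (13)-(14)
    h = sigmoid(W1 @ x + b1)           # hidden-layer outputs
    y = sigmoid(W2 @ h + b2)           # network output (scalar)

    # Output-layer delta: eq. (17)
    delta2 = y * (1 - y) * (d - y)

    # Hidden-layer deltas: eq. (19), back-propagated through W2
    delta1 = h * (1 - h) * (W2 * delta2)

    # Delta rule updates, eqs. (15)-(16); biases use x_0 = 1
    W2 = W2 + eta * delta2 * h
    b2 = b2 + eta * delta2
    W1 = W1 + eta * np.outer(delta1, x)
    b1 = b1 + eta * delta1
    return W1, b1, W2, b2, y

# Learn XOR (illustrative task; needs many epochs with these settings)
rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 0.5, (3, 2)), np.zeros(3)
W2, b2 = rng.normal(0, 0.5, 3), 0.0
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
for _ in range(20000):
    for x, d in data:
        W1, b1, W2, b2, _ = train_step(np.array(x, float), d, W1, b1, W2, b2)
for x, d in data:
    _, _, _, _, y = train_step(np.array(x, float), d, W1, b1, W2, b2, eta=0.0)
    print(x, d, round(float(y), 2))
```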

4. Backpropagation MLPs
Learning with momentum: make the current weight change equal to a proportion of the previous weight change summed with the weight change calculated using the delta rule.
The delta rule given in (15) can be modified as
Δw(kT) = (1 - α) η δ x + α Δw((k - 1)T)   (20)
where α is the momentum coefficient, a value between 0 and 1.
Used in the BPA, the solution stands less chance of becoming trapped in local minima.
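
A short sketch of the momentum update of eq. (20); the coefficient values, inputs, and the sequence of deltas are illustrative:

```python
import numpy as np

# Delta rule with momentum, eq. (20): the new increment blends the
# gradient term with the previous increment, damping oscillations.
alpha, eta = 0.9, 0.5

def momentum_update(w, delta, x, prev_dw):
    dw = (1 - alpha) * eta * delta * x + alpha * prev_dw
    return w + dw, dw          # keep dw as delta_w((k-1)T) for the next step

w = np.zeros(3)
prev_dw = np.zeros(3)
x = np.array([0.1, 0.6, 1.0])   # illustrative inputs (x_0 = 1 for the bias)
for delta in [0.2, 0.15, 0.1]:  # illustrative deltas from successive passes
    w, prev_dw = momentum_update(w, delta, x, prev_dw)
print(w)
```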

4. Backpropagation MLPs
Ex) Training using back-propagation
Calculate the output, and hence the new values for the weights and biases. Assume a learning rate of 0.5.
Current inputs: x_1, x_2 (with x_2 = 0.6)
Desired output: d = 1
Existing weights and biases are given for the hidden layer (W, b) and for the output layer (W, b).

4. Backpropagation MLPs
Sol.) Forward propagation
Hidden layer (l = 1):
Single neuron summation: s_j = Σ_{i=1}^{2} w_ij x_i + b_j   (21)
By sigmoid activation functions (j = 1 to 3): x_j(1) = 1 / (1 + e^{-s_j})   (22)

4. Backpropagation MLPs
Inserting values into (21) and (22) gives the hidden-layer outputs x_j(1).
Output layer (l = 2):
s = Σ_{j=1}^{3} w_j x_j(1) + b   (23)
y = 1 / (1 + e^{-s})   (24)
Inserting values into (23) and (24) gives the network output y.

4. Backpropagation MLPs
Back propagation
Output layer (l = 2): from (8), δ = y(1 - y)(d - y). Since x_0 = 1 for the bias input, the delta rule then gives the increments for the output-layer weights and bias, and hence the new weights and bias for the output layer.

4. Backpropagation MLPs
Hidden layer (l = 1): from (19), [δ_j]_1 = [x_j(1 - x_j)]_1 Σ_k [w_kj δ_k]_2.
To illustrate this equation: had there been two neurons in layer l+1 (i.e., the output layer), values for δ_1 and δ_2 for layer l+1 would have been calculated, and both would enter the sum for the [δ_j]_l values of layer l (the hidden layer). However, since in this example there is only a single neuron in layer l+1, δ_2 = 0. Thus the δ values for layer l follow directly.

4. Backpropagation MLPs
Hence, using the delta rule, the weight increments for the hidden layer are Δw_ij = η δ_j x_i. Adding these increments (and the corresponding bias increments) to the initial hidden-layer W and b gives the new weights and biases for the hidden layer.
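
The same computation can be traced end to end in code. The inputs, desired output, and initial weights below are illustrative assumptions, not the slide's values; the structure (2 inputs, 3 hidden neurons, 1 output, η = 0.5) matches the example:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Illustrative values only -- not the slide's.
x = np.array([0.1, 0.6])
d, eta = 1.0, 0.5
W1 = np.array([[0.2, 0.5], [0.5, 0.3], [0.1, 0.2]])   # hidden-layer weights
b1 = np.array([0.1, 0.2, 0.3])
W2 = np.array([0.4, 0.2, 0.5])                        # output-layer weights
b2 = 0.1

# Forward propagation, eqs. (21)-(24)
h = sigmoid(W1 @ x + b1)
y = sigmoid(W2 @ h + b2)

# Back propagation, eqs. (17) and (19)
delta2 = y * (1 - y) * (d - y)
delta1 = h * (1 - h) * W2 * delta2

# Delta rule, eqs. (15)-(16)
W2_new = W2 + eta * delta2 * h
b2_new = b2 + eta * delta2
W1_new = W1 + eta * np.outer(delta1, x)
b1_new = b1 + eta * delta1

print("y =", round(float(y), 4))
print("new hidden weights:\n", W1_new)
print("new output weights:", W2_new)
```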

5. Radial Basis Function Networks
Locally tuned and overlapping receptive fields: structures found in regions of the cerebral cortex, the visual cortex, etc.
RBFN: a network structure employing local receptive fields to perform function mappings.
The activation level of the ith receptive field unit (or hidden unit):
w_i = R_i(x) = R_i(||x - u_i|| / σ_i), i = 1, 2, ..., H
where x is the input vector and u_i is the center of the ith receptive field.
(Figure: (a) weighted sum of the output values; (b) weighted average of the output values)

5. Radial Basis Function Networks
R_i(·) is a radial basis function, e.g.:
Gaussian function: R_i(x) = exp(-||x - u_i||^2 / (2σ_i^2))
Logistic function: R_i(x) = 1 / (1 + exp[||x - u_i||^2 / σ_i^2])
The activation level is maximum when the input vector x is at the center u_i of that unit.
Note that there are no connection weights between the input layer and the hidden layer.

5. Radial Basis Function Networks
Output of an RBFN
Weighted sum of the output value associated with each receptive field:
d(x) = Σ_{i=1}^{H} c_i w_i = Σ_{i=1}^{H} c_i R_i(x)
Weighted average of the output value associated with each receptive field:
d(x) = Σ_{i=1}^{H} c_i w_i / Σ_{i=1}^{H} w_i = Σ_{i=1}^{H} c_i R_i(x) / Σ_{i=1}^{H} R_i(x)
The weighted average has a disadvantage, a higher degree of computational complexity, and an advantage, a well-interpolated overall output between the outputs of the overlapping receptive fields.
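
A sketch of both output forms with Gaussian receptive fields; the centers, widths, and output weights below are illustrative values:

```python
import numpy as np

def rbfn_output(x, centers, sigmas, c, average=False):
    """RBFN forward pass with Gaussian receptive fields (sketch).

    centers: (H, n) unit centers u_i; sigmas: (H,) widths; c: (H,) output
    weights. average=True uses the weighted-average form, else weighted sum.
    """
    r = np.linalg.norm(x - centers, axis=1)          # ||x - u_i||
    w = np.exp(-(r ** 2) / (2 * sigmas ** 2))        # Gaussian activations
    if average:
        return (c @ w) / w.sum()                     # weighted average
    return c @ w                                     # weighted sum

# Three receptive fields on a 1-D input (illustrative values)
centers = np.array([[-1.0], [0.0], [1.0]])
sigmas = np.array([0.5, 0.5, 0.5])
c = np.array([0.2, 1.0, -0.5])
print(rbfn_output(np.array([0.3]), centers, sigmas, c))
print(rbfn_output(np.array([0.3]), centers, sigmas, c, average=True))
```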

5. Radial Basis Function Networks
An RBFN's approximation capacity can be further improved with supervised adjustment of the center and shape of the receptive field (or radial basis) functions.
Sequential training algorithm: fix the receptive field functions first, then adjust the weights of the output layer.
Functional equivalence to a Fuzzy Inference System:
The same aggregation method to derive their overall outputs (weighted average or weighted sum)
The same number of receptive field units and fuzzy if-then rules
Each radial basis function of the RBFN is equal to a multidimensional composite MF (membership function).

6. Modular Networks
Modular networks: a hierarchical organization comprising multiple NNs.
Two principal components:
Local experts (or expert networks)
An integrating unit (or gating network)
Overall output, using the estimated combination weights g_i:
y = Σ_{i=1}^{K} g_i o_i
where o_i is the output of the ith local expert.
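
A sketch of this combination; the softmax gating network below is a common choice but an assumption here, since the slide does not specify how the g_i are computed:

```python
import numpy as np

def modular_output(x, experts, gate_W, gate_b):
    """Overall output y = sum_i g_i * o_i of a modular network (sketch).

    experts: list of callables, each a local expert o_i(x). The gating
    network produces combination weights g_i via a softmax over a linear
    map of the input (an illustrative choice).
    """
    o = np.array([expert(x) for expert in experts])  # local expert outputs
    z = gate_W @ x + gate_b                          # gating activations
    g = np.exp(z - z.max())
    g = g / g.sum()                                  # softmax weights, sum to 1
    return g @ o

# Two illustrative linear experts and a 2-way gate on 2-D inputs
experts = [lambda x: 2.0 * x[0] + 1.0, lambda x: -x[1]]
gate_W = np.array([[1.0, 0.0], [0.0, 1.0]])
gate_b = np.zeros(2)
print(modular_output(np.array([0.5, -0.2]), experts, gate_W, gate_b))
```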

7. Summary
Learning modes, by the characteristics of the information available for learning:
Supervised: instructive information on desired responses, explicitly specified by a teacher
Reinforcement: partial information about desired responses, or only "right" or "wrong" evaluative information
Unsupervised: no information about the desired response
Recording: a priori design information for memory storing