Radial Basis Function Networks. Ravi Kaushik. Project 1, CSC 84010 Neural Networks and Pattern Recognition.

History. Radial Basis Function (RBF) networks emerged in the late 1980s as a variant of the artificial neural network. The activation of each hidden unit depends on the distance between the input vector and a prototype vector. Related topics include function approximation, regularization, noisy interpolation, density estimation, optimal classification theory, and potential functions.

Motivation. An RBF network can approximate any regular function and typically trains faster than a multi-layer perceptron, because it has just two layers of weights and each layer can be determined sequentially. Each hidden unit implements a radially activated function; the hidden-layer mapping is non-linear while the output layer is linear.

Advantages. An RBFN can be trained faster than a multilayer perceptron due to its two-stage training procedure. It is a two-layer network, performs non-linear approximation, uses both unsupervised and supervised learning, does not saturate while generating outputs, and does not get stuck in local minima during training.

Network Topology. (Diagram: hidden-unit basis functions φ_j(x) feeding output units ψ_k(x).)

Basis Functions. The RBF network has been shown to be a universal approximator for continuous functions, provided that the number n_r of hidden nodes is sufficiently large. In addition, using a direct multi-quadric function as the activation function avoids saturation of the node outputs.

Network Topology: Gaussian Activation Function. The hidden-unit activations are
φ_j(x) = exp[ -(x - μ_j)^T Σ_j^{-1} (x - μ_j) ],   j = 1...L
The output layer is a weighted sum of the hidden activations,
ψ_k(x) = Σ_{j=1}^{L} λ_jk φ_j(x)
and for pattern recognition problems the output is passed through a logistic function,
Y_k(x) = 1 / (1 + exp(-ψ_k(x))),   k = 1...M

RBF NN Mapping. The network output is
y_k(x) = Σ_{j=1}^{M} w_kj φ_j(x) + w_k0,   with   φ_j(x) = exp( -||x - μ_j||² / (2σ_j²) )
where x is a d-dimensional input vector with elements x_i, and μ_j is the vector determining the center of basis function φ_j, with elements μ_ji.
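
As an illustration of this mapping, below is a minimal Java sketch of the forward pass: a single hidden layer of spherical Gaussian units followed by a linear output layer. The class and field names (RbfNetwork, centers, widths, weights) are illustrative and not taken from the project code.

class RbfNetwork {
    double[][] centers;  // centers[j] = mu_j, one d-dimensional center per hidden unit
    double[] widths;     // widths[j] = sigma_j
    double[][] weights;  // weights[k][j] = w_kj; the extra last column holds the bias w_k0

    // phi_j(x) = exp(-||x - mu_j||^2 / (2 sigma_j^2))
    double[] hiddenActivations(double[] x) {
        double[] phi = new double[centers.length];
        for (int j = 0; j < centers.length; j++) {
            double dist2 = 0.0;
            for (int i = 0; i < x.length; i++) {
                double diff = x[i] - centers[j][i];
                dist2 += diff * diff;
            }
            phi[j] = Math.exp(-dist2 / (2.0 * widths[j] * widths[j]));
        }
        return phi;
    }

    // y_k(x) = sum_j w_kj * phi_j(x) + w_k0 (linear output layer)
    double[] output(double[] x) {
        double[] phi = hiddenActivations(x);
        double[] y = new double[weights.length];
        for (int k = 0; k < weights.length; k++) {
            double sum = weights[k][phi.length];   // bias term w_k0
            for (int j = 0; j < phi.length; j++) {
                sum += weights[k][j] * phi[j];
            }
            y[k] = sum;
        }
        return y;
    }
}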

Network Training. Training proceeds in two stages. Stage 1: unsupervised training, which determines the parameters of the basis functions (μ_j and σ_j) using only the input data {x^n}.

Network Training. Stage 2: optimization of the second-layer weights. Absorbing the bias into the sum,
y_k(x) = Σ_{j=0}^{M} w_kj φ_j(x),   or in matrix form   y(x) = Wφ
The sum-of-squares error is
E = (1/2) Σ_n Σ_k { y_k(x^n) - t_k^n }²
Minimizing E gives the normal equations Φ^T Φ W^T = Φ^T T, so W^T = Φ† T, where Φ† denotes the pseudo-inverse of the design matrix Φ.
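
A minimal Java sketch of this second stage, assuming the basis-function parameters are already fixed: it forms the normal equations Φ^T Φ W^T = Φ^T T from a design matrix phi (one row per pattern, one column per basis function, including a constant column for the bias) and a target matrix t, and solves them by Gauss-Jordan elimination. The class and method names are illustrative.

class RbfOutputLayer {
    // Solve Phi^T Phi W^T = Phi^T T; returns W^T (rows indexed by basis function, columns by output).
    static double[][] solveOutputWeights(double[][] phi, double[][] t) {
        int n = phi.length, m = phi[0].length, k = t[0].length;
        double[][] a = new double[m][m];   // A = Phi^T Phi
        double[][] b = new double[m][k];   // B = Phi^T T
        for (int p = 0; p < n; p++)
            for (int i = 0; i < m; i++) {
                for (int j = 0; j < m; j++) a[i][j] += phi[p][i] * phi[p][j];
                for (int c = 0; c < k; c++) b[i][c] += phi[p][i] * t[p][c];
            }
        // Gauss-Jordan elimination with partial pivoting on [A | B]
        for (int col = 0; col < m; col++) {
            int piv = col;
            for (int r = col + 1; r < m; r++)
                if (Math.abs(a[r][col]) > Math.abs(a[piv][col])) piv = r;
            double[] tmp = a[col]; a[col] = a[piv]; a[piv] = tmp;
            tmp = b[col]; b[col] = b[piv]; b[piv] = tmp;
            for (int r = 0; r < m; r++) {
                if (r == col) continue;
                double f = a[r][col] / a[col][col];
                for (int c = col; c < m; c++) a[r][c] -= f * a[col][c];
                for (int c = 0; c < k; c++) b[r][c] -= f * b[col][c];
            }
        }
        double[][] w = new double[m][k];   // A is now diagonal, so W^T = A^{-1} B
        for (int j = 0; j < m; j++)
            for (int c = 0; c < k; c++) w[j][c] = b[j][c] / a[j][j];
        return w;
    }
}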

Training Algorithms. There are two kinds of training algorithms, supervised and unsupervised. RBF networks are used mainly in supervised applications, where both the input data X_i and the desired outputs F_k(X_i) are known. The network parameters are found so as to minimize the cost function
min Σ_{i=1}^{Q} ( Y_k(X_i) - F_k(X_i) )^T ( Y_k(X_i) - F_k(X_i) )

Training Algorithms. Clustering algorithms (k-means): the centers of the radial basis functions are initialized randomly. For a given data sample X_i the algorithm adapts the closest center μ̂_j, i.e. the center satisfying
||X_i - μ̂_j|| = min_{k=1...L} ||X_i - μ̂_k||
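
Below is a minimal Java sketch of this clustering step for choosing the centers, run for a fixed number of iterations. For brevity it seeds the centers with the first L data points rather than at random; the names (RbfCenters, kMeans) are illustrative.

class RbfCenters {
    // Standard batch k-means: assign each sample to its nearest center, then move each center
    // to the mean of its assigned samples.
    static double[][] kMeans(double[][] data, int L, int iterations) {
        int d = data[0].length;
        double[][] mu = new double[L][];
        for (int j = 0; j < L; j++) mu[j] = data[j].clone();   // simple seeding with the first L points
        for (int it = 0; it < iterations; it++) {
            double[][] sum = new double[L][d];
            int[] count = new int[L];
            for (double[] x : data) {
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int j = 0; j < L; j++) {                  // find the closest center to x
                    double dist = 0.0;
                    for (int i = 0; i < d; i++) { double diff = x[i] - mu[j][i]; dist += diff * diff; }
                    if (dist < bestDist) { bestDist = dist; best = j; }
                }
                count[best]++;
                for (int i = 0; i < d; i++) sum[best][i] += x[i];
            }
            for (int j = 0; j < L; j++)                        // move each center to its cluster mean
                if (count[j] > 0)
                    for (int i = 0; i < d; i++) mu[j][i] = sum[j][i] / count[j];
        }
        return mu;
    }
}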

Training Algorithms (cont.). Regularization (Haykin, 1994); orthogonal least squares using the Gram-Schmidt algorithm; expectation-maximization using a gradient-descent algorithm (Moody and Darken, 1989) for modeling input-output distributions.

Regularization. Determines the weights by matrix computation. The regularized error is
E = (1/2) Σ_n { y(x^n) - t^n }² + (ν/2) ∫ ||P y||² dx
where E is the total error to be minimized, P is some differential operator, and ν is the regularization parameter; ν controls the relative importance of the regularization term and hence the degree of smoothness of the function y(x).

Regularization. If the regularization parameter is zero, the weights converge to the pseudo-inverse solution. If the input dimension and the number of patterns are large, not only is it difficult to implement the regularization, but numerical errors may also occur during the computation.
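
As a concrete (and simplified) illustration, assume the regularizer is replaced by a plain weight-decay penalty on the second-layer weights rather than the differential-operator penalty ||Py||² above; the normal equations then become (Φ^T Φ + νI) W^T = Φ^T T, and the only change to the least-squares sketch given earlier is adding ν to the diagonal of Φ^T Φ before elimination. Setting ν = 0 recovers the pseudo-inverse solution. The class and method names below are illustrative.

class RbfRegularization {
    // Weight-decay regularization: A <- A + nu * I, where A = Phi^T Phi.
    // With nu = 0 the solution reduces to the ordinary pseudo-inverse (least-squares) weights.
    static void addRegularization(double[][] a, double nu) {
        for (int i = 0; i < a.length; i++) a[i][i] += nu;
    }
}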

Gradient Descent Method. The gradient descent method passes through the entire set of training patterns repeatedly. It tends to settle into a local minimum, and it may not converge at all if the patterns of the outputs of the middle layer are not linearly separable. It is also difficult to choose parameters such as the learning rate.
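
For comparison, here is a minimal Java sketch of such a gradient-descent pass over the output-layer weights (the delta rule w_kj <- w_kj - η (y_k - t_k) φ_j), assuming the hidden activations are precomputed; names such as eta and epochs are illustrative.

class RbfGradientDescent {
    // Repeated passes over all training patterns, updating the linear output weights by gradient descent.
    static void train(double[][] w, double[][] phi, double[][] t, double eta, int epochs) {
        for (int e = 0; e < epochs; e++) {
            for (int p = 0; p < phi.length; p++) {
                for (int k = 0; k < w.length; k++) {
                    double y = 0.0;
                    for (int j = 0; j < phi[p].length; j++) y += w[k][j] * phi[p][j];
                    double err = y - t[p][k];   // derivative of the pattern error w.r.t. y_k
                    for (int j = 0; j < phi[p].length; j++) w[k][j] -= eta * err * phi[p][j];
                }
            }
        }
    }
}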

RBFNN vs. Multi-Layer Perceptron. An RBFNN uses the distance to a prototype vector followed by transformation with a localized function, whereas an MLP depends on weighted linear summations of the inputs transformed by monotonic activation functions. In an MLP, for a given input value, many hidden units will typically contribute to the output value; in an RBF network, for a given input vector, only a few hidden units are activated.

RBFNN vs. Multi-Layer Perceptron. An MLP may have many layers of weights and a complex pattern of connectivity, so that not all possible weights in a given layer are present. An RBF network is simpler, with two layers: the first layer contains the parameters of the basis functions, and the second layer forms linear combinations of the basis-function activations to generate the outputs. All parameters of an MLP are determined simultaneously using supervised training, whereas an RBFNN uses a two-stage training technique, with the first-layer parameters computed by unsupervised methods and the second layer trained using fast linear supervised methods.

Programming Paradigm and Languages. Java with the Eclipse IDE; Matlab 7.4 Neural Network Toolbox. Java application development: existing code is available online, it supports object-oriented programming, debugging is easier in the Eclipse IDE, and the Java documentation is extensive.

Java Eclipse IDE

Matlab 7.0 Neural Network Toolbox

Matlab 7.0 Neural Network Toolbox

Applications of RBFNN. Pattern recognition (Lampariello & Sciandrone): the problem is formulated as a system of non-linear inequalities, with a suitable error function that depends only on the violated inequalities. Reason to choose an RBFNN over an MLP: with a suitable choice of activation function, the classification problem will not saturate.

Pattern Recognition (using RBFNN). Different error functions are used, such as the cross-entropy and exponential error functions.

Pattern Recognition (using RBFNN). Non-linear inequality error function (equation shown on the slide).

Four 2D Gaussian Clusters grouped into two classes

Modeling a 3D Shape. Algorithms using robust statistics provide better parameter estimation than classical RBF network estimation.

Classification Problem Applied to Diabetes Mellitus. The RBF NN is trained in two stages. Stage one fixes the radial basis centers μ_j using the k-means clustering algorithm. Stage two determines the weights w_ij that approximate the limited sample data X, which leads to a linear optimization problem solved by least squares.

Classification Problem Applied to Diabetes Mellitus: Results. 1200 cases were used: 600 for training, 300 for validation, and 300 for testing.

Conclusion. RBF networks have very good properties, including localization, functional approximation, interpolation, cluster modeling, and quasi-orthogonality. Applications include telecommunications, signal and image processing, control engineering, and computer vision.

References
Broomhead, D. S. and Lowe, D. (1988). Multivariable functional interpolation and adaptive networks. Complex Systems, 2, 321-355.
Moody, J. and Darken, C. J. (1989). Fast learning in networks of locally-tuned processing units. Neural Computation, 1, 281-294.
Poggio, T. and Girosi, F. (1990). Networks for approximation and learning. Proceedings of the IEEE, 78, 1481-1497.

References
Hwang, Young-Sup and Sung-Yang (1996). An efficient method to construct a radial basis function neural network classifier and its application to unconstrained handwritten digit recognition. 13th Intl. Conference on Pattern Recognition, vol. 4, p. 640.
Venkatesan, P. and Anitha, S. (2006). Application of a radial basis function neural network for diagnosis of diabetes mellitus. Current Science, 91, 1195-1199.

References
Bishop, C. (1995). Neural Networks for Pattern Recognition. Oxford University Press.