INTRODUCTION TO NEURAL NETWORKS

R. Beale & T. Jackson: Neural Computing, an Introduction. Adam Hilger, Bristol, Philadelphia and New York, 1990.

THE STRUCTURE OF THE BRAIN

The brain consists of about 10^10 basic units, called neurons, each connected to about 10^4 others. The neuron is an analogue logical processing unit. Neurons fall into two main types:
- local processing interneuron cells,
- output cells, which connect different regions of the brain to each other, connect the brain to muscle, or connect from the sensory organs into the brain.

The operation of the neuron: the neuron accepts many inputs, all of which are added up in some fashion; if enough active inputs are received at once, the neuron is activated and fires.
Fig. 1. The basic features of a biological neuron: cell body (soma), dendrites, axon and synapses.

The components of a neuron:
- the soma is the body of the neuron;
- the dendrites act as the connections through which all the inputs to the neuron arrive; they are not electrically active; the dendrites perform more complex functions than simple addition on the inputs they receive, but considering a simple summation is a reasonable approximation;
- the axon is another type of nerve process attached to the soma; it is electrically active and serves as the output channel of the neuron; the axon is a non-linear threshold device, producing a voltage pulse called the action potential (in fact a series of voltage spikes);
- the synapse couples the axon with the dendrite of another cell; it is a temporary chemical linkage; the neurotransmitters are chemicals released by the synapse when its potential is raised sufficiently by the action potential.
Learning in Biological Systems

Learning is thought to occur when modifications are made to the effective coupling between one cell and another, at the synaptic junction.

Fig. 2. The synapse: axon, synaptic vesicles, neurotransmitter, neurotransmitter receptors, synaptic cleft, dendrite.

The mechanism for achieving this is thought to be the facilitated release of more neurotransmitters. This has the effect of opening more gates on the dendrite on the post-synaptic side of the junction, and so increasing the coupling effect of the two cells. The adjustment of coupling so as to favourably reinforce good connections is an important feature of artificial neural network models, as is the effective coupling, or weighting, that occurs on connections into a neuronal cell.
Summary
- The brain is a parallel, distributed processing system.
- Its basic processing units are called neurons.
- There are approximately 10^10 neurons, each connected to about 10^4 others.
- Operation of a neuron: it fires a pulse down its axon when sufficient input is received from the dendrites.
- Connections are made via chemical junctions called synapses.
- Learning increases the efficacy of a synaptic junction.
- Machines can learn through positive reinforcement.
- The cerebral cortex shows local areas of specialised function.

Further reading
1. Parallel Distributed Processing, Volumes 1, 2 and 3. J.L. McClelland & D.E. Rumelhart.
2. An Introduction to Computing with Neural Nets. Richard P. Lippmann. In IEEE ASSP Magazine, April 1987. An excellent, concise overview of the whole area.
3. An Introduction to Neural Computing. Teuvo Kohonen. In Neural Networks, volume 1, number 1, 1988. A general review.
4. Neurocomputing: Foundations of Research. Edited by Anderson and Rosenfeld. MIT Press, 1988. An expensive book but excellent for reference; it is a collection of reprints of most of the major papers in the field.
5. Neural Computing: Theory and Practice. Philip D. Wasserman. Routledge, Chapman & Hall, 1989. An introductory text. Well written.

Journals: Neural Networks; Network: Computation in Neural Systems; Neural Information Processing Systems (NIPS) (annual conference proceedings); IJCNN Conference (annual conference proceedings).
PATTERN RECOGNITION

Pattern recognition is the dominant application area for neural networks.

Example: the classification of letters from the alphabet. This could be resolved using a template-matching technique:
- each letter is read into a fixed frame size,
- the frame is compared to the templates of all possible characters.

The difficulties are further complicated when we turn our attention to processing images, speech or even stock market trends.

Pattern recognition - a definition

The fundamental objective is classification: given an input of some form, can we analyse that input to provide a meaningful categorisation of its data content? A pattern recognition system is a two-stage device:
- feature extraction; we define a feature as a measurement,
- classification; the classifier must decide which class category the list of measured features matches most closely; classifiers typically rely on distance metrics and probability theory to do this.
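The template-matching idea above can be sketched in a few lines. This is a minimal illustration, not the text's method in detail: the tiny 3x3 binary frames and the two templates below are invented for the example, and the comparison simply counts mismatching pixels.

```python
# Sketch of template matching for letter classification. Each letter is
# read into a fixed-size binary frame and compared against a template
# for every candidate character; the template with the fewest
# mismatching pixels wins. The 3x3 templates here are hypothetical.

TEMPLATES = {
    "T": [1, 1, 1,
          0, 1, 0,
          0, 1, 0],
    "L": [1, 0, 0,
          1, 0, 0,
          1, 1, 1],
}

def mismatches(frame, template):
    """Count pixels where the input frame differs from the template."""
    return sum(f != t for f, t in zip(frame, template))

def classify(frame):
    """Return the letter whose template has the fewest mismatching pixels."""
    return min(TEMPLATES, key=lambda letter: mismatches(frame, TEMPLATES[letter]))

# A noisy 'T' (one corrupted pixel) still matches the 'T' template.
noisy_t = [1, 1, 1,
           0, 1, 0,
           0, 1, 1]
print(classify(noisy_t))  # -> T
```

Even this toy version shows why the technique struggles: a shifted or rescaled letter no longer lines up with its template, which is one reason the text moves on to feature-based classification.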
Feature Vectors and Feature Space

A feature vector is a list of n measurements; together these create an n-dimensional feature space. Usually several measurements are required for a classification, to be able to adequately distinguish inputs that belong to different categories or classes.

Example: distinguishing ballet dancers from Rugby players. The feature vector consists of height and weight. If we make a series of height and weight measurements on typical examples of each, then we can plot the range of readings in a two-dimensional Euclidean plane; that plane defines our feature space.

Fig. 3. A two-dimensional Euclidean feature space: height against weight, with a decision boundary separating ballet dancers from Rugby players.

Discriminant function

A discriminant function is a function that maps our input features onto a classification space - in the example above, by defining a line (in general, a hyperplane) that separates the two clusters.
Classification Techniques

- Numeric: deterministic and stochastic measures; a particular implementation of discriminant function analysis is known as K-nearest-neighbour.
- Non-numeric: symbolic processing - fuzzy sets.

Nearest Neighbour Classification

A discriminant function f(x) can be defined as

    f(x) = d_closest(class 1) - d_closest(class 2)

where d_closest(class i) is the distance from x to the closest sample of class i. If f(x) is negative, x is assigned to class 1; if f(x) is positive, x is assigned to class 2.

Fig. 4. Classification by comparison to the nearest neighbour: the shortest distances d_1 and d_2 from an unclassified pattern to each class, in the height-weight feature space.
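The nearest-neighbour discriminant above can be sketched directly. The (height, weight) samples below are invented for illustration, continuing the ballet-dancer versus Rugby-player example:

```python
import math

# Sketch of the nearest-neighbour discriminant:
#   f(x) = (distance to closest class-1 sample)
#        - (distance to closest class-2 sample)
# f(x) < 0 assigns x to class 1; f(x) > 0 assigns it to class 2.

def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def discriminant(x, class1, class2):
    d1 = min(euclidean(x, s) for s in class1)  # shortest distance to class 1
    d2 = min(euclidean(x, s) for s in class2)  # shortest distance to class 2
    return d1 - d2

# Hypothetical (height in m, weight in kg) samples.
ballet = [(1.60, 50.0), (1.65, 55.0)]   # class 1: ballet dancers
rugby  = [(1.85, 95.0), (1.90, 100.0)]  # class 2: Rugby players

x = (1.62, 52.0)
print("class 1" if discriminant(x, ballet, rugby) < 0 else "class 2")  # -> class 1
```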
The rogue pattern problem

Fig. 5. Measuring to the nearest neighbour can produce errors in classification if a rogue sample is selected.

The solution to this fairly basic problem is to take several distance measures against many class samples, so that the effect of any rogue measurement is likely to be averaged out. This is K-nearest-neighbour classification, where K is the number of neighbouring samples against which we decide to measure.
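A minimal K-nearest-neighbour sketch shows how a rogue sample is outvoted. The sample points and labels below are invented for illustration:

```python
import math
from collections import Counter

# Minimal K-nearest-neighbour classifier: measure the distance to the K
# closest samples rather than one, so a single rogue sample is outvoted.

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def knn_classify(x, samples, k=3):
    """samples: list of (point, label) pairs. Returns the majority label
    among the k samples closest to x."""
    nearest = sorted(samples, key=lambda s: euclidean(x, s[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

samples = [
    ((1.60, 50.0), "ballet"), ((1.65, 55.0), "ballet"),
    ((1.62, 52.0), "ballet"),                       # genuine ballet samples
    ((1.61, 51.0), "rugby"),                        # a rogue rugby sample
    ((1.85, 95.0), "rugby"), ((1.90, 100.0), "rugby"),
]

# With k=1 the rogue sample wins; with k=3 it is outvoted.
print(knn_classify((1.61, 51.5), samples, k=1))  # -> rugby
print(knn_classify((1.61, 51.5), samples, k=3))  # -> ballet
```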
Distance Metrics

For two vectors X = (x_1, x_2, ..., x_n) and Y = (y_1, y_2, ..., y_n):

Hamming distance measure (for binary data, x_i XOR y_i):

    H = sum_i |x_i - y_i|

Euclidean distance measure:

    d_Euc(X, Y) = sqrt( sum_{i=1..n} (x_i - y_i)^2 )

City block distance:

    D_cb = sum_{i=1..n} |x_i - y_i|

Square distance:

    D_sq = max_i |x_i - y_i|
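The four metrics above translate directly into code. A short sketch for two equal-length vectors:

```python
import math

# The four distance metrics, for two equal-length vectors.

def hamming(x, y):
    """Hamming distance: number of differing positions (x_i XOR y_i for binary data)."""
    return sum(a != b for a, b in zip(x, y))

def euclid(x, y):
    """Euclidean distance: square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def city_block(x, y):
    """City block (Manhattan) distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def square(x, y):
    """Square (maximum) distance: largest absolute difference in any dimension."""
    return max(abs(a - b) for a, b in zip(x, y))

print(hamming([1, 0, 1, 1], [1, 1, 1, 0]))  # -> 2
print(euclid([0, 0], [3, 4]))               # -> 5.0
print(city_block([0, 0], [3, 4]))           # -> 7
print(square([0, 0], [3, 4]))               # -> 4
```

Note how the same pair of points gets a different distance under each metric; the choice of metric changes which neighbour counts as "nearest".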
LINEAR CLASSIFIERS

We wish to classify an input into one of two possible classes, A or B. The classes may be separated in pattern space by the use of a linear decision boundary.

Fig. 6. Discriminating classes with a linear decision boundary defined by the weight vector.

The weight vector orientation in pattern space defines a linear decision boundary:

    f(X) = sum_{i=1..n} w_i x_i

where X is the input vector and W is the weight vector. If f(X) > 0, X belongs to class A; if f(X) < 0, X belongs to class B. The problem lies in actually finding a suitable weight vector that will give these results for all inputs from class A and class B.
More generally

    f(X) = sum_{i=1..n} w_i x_i - θ,

where θ is the bias (threshold) value. The boundary condition is therefore

    sum_{i=1..n} w_i x_i - θ = 0,

which for a two-dimensional problem gives

    w_1 x_1 + w_2 x_2 - θ = 0

and finally

    x_2 = -(w_1 / w_2) x_1 + θ / w_2.

Fig. 7. Relationships between the weight vector and the decision boundary.
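The biased linear discriminant can be sketched as follows. The weights and threshold below are chosen by hand for illustration; the text's point is precisely that finding them automatically is the hard part:

```python
# Sketch of the linear discriminant f(X) = sum_i w_i x_i - theta.
# The weights and threshold are hand-picked (hypothetical), not learned.

def f(x, w, theta):
    """Weighted sum of inputs minus the bias value theta."""
    return sum(wi * xi for wi, xi in zip(w, x)) - theta

def classify(x, w, theta):
    """f(X) > 0 -> class A, f(X) < 0 -> class B."""
    return "A" if f(x, w, theta) > 0 else "B"

w, theta = (1.0, 1.0), 1.5   # decision boundary: x_1 + x_2 = 1.5

print(classify((1.0, 1.0), w, theta))  # 2.0 - 1.5 > 0 -> A
print(classify((0.0, 1.0), w, theta))  # 1.0 - 1.5 < 0 -> B
```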
If we have the correct values for the weight vector, we can indeed perform the discrimination and set the position of the decision boundary. This is not a trivial problem: the weight vector must be found by iterative trial-and-error methods that modify the weight values according to some error function. The error function typically compares the output of the classifier with a desired response, and gives an indication of the difference between the two.

The linear separability problem is one of the most important issues. A pattern that is not linearly separable by a single boundary can still be classified piecewise, by combining the signs of more than one decision line:

    d_1   d_2   classification
    +     +     class A
    +     -     class B
    -     +     class B

Fig. 8. Piecewise linear classification for a non-linearly separable pattern, using two decision lines d_1 and d_2.
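Piecewise linear classification can be sketched by combining two decision lines and reading off the pair of signs. The two lines below are hypothetical (they simply threshold x_1 and x_2), chosen to give an XOR-like region layout; the sign combination not listed in the table is assigned to class B here as an assumption:

```python
# Sketch of piecewise linear classification: two decision lines d1 and
# d2 are evaluated, and the pair of signs selects the class. The lines
# are hand-picked for illustration (d1 thresholds x1, d2 thresholds x2).

def sign(v):
    return "+" if v > 0 else "-"

def classify(x1, x2):
    d1 = x1 - 0.5          # decision line 1: x1 = 0.5
    d2 = x2 - 0.5          # decision line 2: x2 = 0.5
    signs = (sign(d1), sign(d2))
    # (+, +) -> class A; other sign pairs -> class B (the (-, -) case is
    # an assumption not covered by the table above).
    return "A" if signs == ("+", "+") else "B"

print(classify(1, 1))  # (+, +) -> A
print(classify(1, 0))  # (+, -) -> B
print(classify(0, 1))  # (-, +) -> B
```

Each individual line is still a linear classifier; it is only the combination of their signs that carves out a region no single straight boundary could describe.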