ECLT 5810 Classification: Neural Networks
Reference: Data Mining: Concepts and Techniques, by J. Han, M. Kamber, and J. Pei, Morgan Kaufmann
Neural Networks
A neural network is a set of connected input/output units in which each connection has an associated weight. This approach is also referred to as connectionist learning. It has been applied in various industries, including finance, e.g. stock market trading.
Demonstrating Some Intelligence
Mastering the Game of Go with Deep Neural Networks and Tree Search, Nature 529, Jan 28, 2016.
Application in the Financial Industry
Algorithms Take Control of Wall Street, Felix Salmon and Jon Stokes, Wired Magazine, Dec 27, 2010.
Neural Networks
Advantages
- Prediction accuracy is generally high
- Robust: works even when training examples contain errors
- Fast evaluation of the learned target function
Criticism
- Long training time
- The learned function (a set of weights) is difficult to understand
- Not easy to incorporate domain knowledge
Multi-Layer Feed-Forward Network
[Figure: input nodes receive the input vector x_i; hidden nodes and output nodes are neurons (processing units), each with incoming weights w_i and a bias θ; the output nodes produce the output vector.]
Defining a Network Topology
- Normalize the input values for each attribute
- Discrete-valued attributes may be encoded with one input unit per domain value
- A single output unit can represent two classes; if there are more than two classes, one output unit per class is used
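The normalization and encoding rules above can be sketched in Python; the attribute names and value ranges here are hypothetical examples, not from the slides.

```python
# Sketch: preparing inputs for a feed-forward network.
# Continuous attributes are scaled into [0, 1]; a discrete attribute with
# k domain values becomes k binary input units (one unit per domain value).

def min_max_normalize(value, lo, hi):
    """Scale a continuous attribute value into [0, 1]."""
    return (value - lo) / (hi - lo)

def one_hot(value, domain):
    """One input unit per domain value: 1 for the matching value, 0 elsewhere."""
    return [1 if value == v else 0 for v in domain]

# Hypothetical attributes: income in [20000, 100000], colour in {red, green, blue}
x_income = min_max_normalize(60000, 20000, 100000)     # 0.5
x_colour = one_hot("green", ["red", "green", "blue"])  # [0, 1, 0]
```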
A Neuron (I)
The n-dimensional input vector (the outputs O_0, O_1, ..., O_n from the previous layer) is mapped to the output O by a weighted sum followed by a nonlinear activation function f:
  I = Σ_i w_i O_i + θ    (θ: bias)
  O = f(I)
where w = (w_0, w_1, ..., w_n) is the weight vector.
A Neuron (II)
With the weighted sum I = Σ_i w_i O_i + θ, the sigmoid activation
  O = 1 / (1 + e^(−I))
is a squashing function: it maps a large input domain onto the interval [0, 1].
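As a minimal sketch (assuming the bias θ is added to the weighted sum, as in the reference text), a single sigmoid neuron can be written as:

```python
import math

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus bias, squashed by the sigmoid to (0, 1)."""
    net = sum(w * o for w, o in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-net))

# Hidden unit 4 of the example network on the next slide:
# net = 0.2*1 + 0.4*0 + (-0.5)*1 + (-0.4) = -0.7
o4 = neuron_output([1, 0, 1], [0.2, 0.4, -0.5], -0.4)
print(round(o4, 3))  # 0.332
```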
An Example of a Neural Network
[Figure: inputs x1, x2, x3 feed hidden units 4 and 5 via weights w14, w15, w24, w25, w34, w35; hidden units 4 and 5 feed output unit 6 via w46 and w56, which produces the prediction.] Assume all the weights and biases have been trained:

x1  x2  x3 | w14  w15  w24  w25  w34  w35  w46  w56 | θ4   θ5   θ6
 1   0   1 | 0.2 −0.3  0.4  0.1 −0.5  0.2 −0.3 −0.2 | −0.4  0.2  0.1
Using the Neural Network for Prediction
Feeding the input (x1, x2, x3) = (1, 0, 1) forward through the trained network gives output unit value O6 ≈ 0.474; since this is below 0.5, the prediction output = 0.
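The forward pass behind this prediction can be reproduced from the weights and biases of the example network; this sketch assumes the bias is added to each unit's weighted sum, following the reference text:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Inputs and trained parameters from the example network.
x1, x2, x3 = 1, 0, 1
w14, w15, w24, w25, w34, w35 = 0.2, -0.3, 0.4, 0.1, -0.5, 0.2
w46, w56 = -0.3, -0.2
theta4, theta5, theta6 = -0.4, 0.2, 0.1

o4 = sigmoid(w14 * x1 + w24 * x2 + w34 * x3 + theta4)  # hidden unit 4
o5 = sigmoid(w15 * x1 + w25 * x2 + w35 * x3 + theta5)  # hidden unit 5
o6 = sigmoid(w46 * o4 + w56 * o5 + theta6)             # output unit 6

print(round(o4, 3), round(o5, 3), round(o6, 3))  # 0.332 0.525 0.474
# o6 < 0.5, so the predicted class is 0.
```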
Network Training
The ultimate objective of training is to obtain a set of weights that classifies almost all of the tuples in the training data correctly.
Steps:
  Initialize weights with random values
  While terminating condition not satisfied
    For each training sample (input tuple)
      Compute the output value of each unit (including the output units)
      // back-propagate the errors
      For each output unit: compute the error
      For each hidden unit: compute the error
      // weight and bias updating (case updating)
      For each output and hidden unit: update the weights and the bias
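The steps above can be sketched as a small backpropagation trainer for a network with one hidden layer and one output unit; the function name, initialization range, and fixed-epoch stopping rule are illustrative choices, not from the slides:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(samples, n_in, n_hidden, lr=0.9, epochs=1000):
    """Backpropagation with case updating (illustrative sketch).
    samples: list of (input_vector, target) pairs with target in {0, 1}."""
    # Initialize weights and biases with small random values.
    w_h = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b_h = [random.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w_o = [random.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_o = random.uniform(-0.5, 0.5)

    for _ in range(epochs):                      # terminating condition: fixed epochs
        for x, t in samples:                     # for each training sample
            # Forward: compute the output value of each unit.
            h = [sigmoid(sum(w * xi for w, xi in zip(w_h[j], x)) + b_h[j])
                 for j in range(n_hidden)]
            o = sigmoid(sum(w * hj for w, hj in zip(w_o, h)) + b_o)
            # Back-propagate the errors (using the pre-update weights).
            err_o = o * (1 - o) * (t - o)                  # output unit
            err_h = [h[j] * (1 - h[j]) * err_o * w_o[j]    # hidden units
                     for j in range(n_hidden)]
            # Case updating: adjust weights and biases after each sample.
            for j in range(n_hidden):
                w_o[j] += lr * err_o * h[j]
                for i in range(n_in):
                    w_h[j][i] += lr * err_h[j] * x[i]
                b_h[j] += lr * err_h[j]
            b_o += lr * err_o
    return w_h, b_h, w_o, b_o
```

Prediction with the trained network is the same forward pass: compute the hidden outputs, then the output unit, and threshold at 0.5.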
Network Training (Backpropagation) (I)
For an output unit j:
  Err_j = O_j (1 − O_j)(T_j − O_j)
where T_j is the true output.
For a hidden unit j:
  Err_j = O_j (1 − O_j) Σ_k Err_k w_jk
where the sum runs over the units k in the next layer.
Network Training (Backpropagation) (II)
With learning rate (l), each weight and bias is updated by:
  w_ij = w_ij + (l) Err_j O_i
  θ_j = θ_j + (l) Err_j
An Example of Training a Neural Network
Same network as before; the training sample is (x1, x2, x3) = (1, 0, 1) with correct output = 1. Assume these are the initial values for training:

x1  x2  x3 | w14  w15  w24  w25  w34  w35  w46  w56 | θ4   θ5   θ6
 1   0   1 | 0.2 −0.3  0.4  0.1 −0.5  0.2 −0.3 −0.2 | −0.4  0.2  0.1
Learning Example
[Figure: worked backpropagation computations for the example network, with learning rate = 0.9.]
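The worked computations summarized above can be reproduced in a few lines; this sketch applies the error and update formulas from the earlier slides to the initial values and target output 1, assuming biases are added to the weighted sums:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

lr = 0.9                       # learning rate from the slide
x = [1, 0, 1]                  # x1, x2, x3
t = 1                          # correct output

# Initial weights and biases from the table.
w14, w15, w24, w25, w34, w35 = 0.2, -0.3, 0.4, 0.1, -0.5, 0.2
w46, w56 = -0.3, -0.2
th4, th5, th6 = -0.4, 0.2, 0.1

# Forward pass.
o4 = sigmoid(w14 * x[0] + w24 * x[1] + w34 * x[2] + th4)   # ~0.332
o5 = sigmoid(w15 * x[0] + w25 * x[1] + w35 * x[2] + th5)   # ~0.525
o6 = sigmoid(w46 * o4 + w56 * o5 + th6)                    # ~0.474

# Back-propagate the errors.
err6 = o6 * (1 - o6) * (t - o6)        # output unit: ~0.1311
err5 = o5 * (1 - o5) * err6 * w56      # hidden unit 5: ~-0.0065
err4 = o4 * (1 - o4) * err6 * w46      # hidden unit 4: ~-0.0087

# Case updating of (some of) the weights and biases.
w46 += lr * err6 * o4                  # ~-0.261
w56 += lr * err6 * o5                  # ~-0.138
w14 += lr * err4 * x[0]                # ~0.192
th6 += lr * err6                       # ~0.218
```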
Weight Updating
- Case updating: the weights and biases are updated after the presentation of each sample.
- Epoch updating: the weight and bias increments are accumulated in variables, and the weights and biases are updated after all of the samples in the training set have been presented.
In practice, case updating is more common.
Terminating Condition
Training stops when:
- all changes in weights were so small as to be below some threshold, or
- the percentage of samples misclassified is below some threshold, or
- a pre-specified number of epochs has expired.
In practice, several hundred thousand epochs may be required before the weights converge.