Self-organizing Neural Networks


Bachelor project

Self-organizing Neural Networks

Advisor: Klaus Meer
Jacob Aae Mikkelsen

December 2006

Abstract

In the first part of this report, a brief introduction to neural networks in general and to the principles of learning with a neural network is given. Later, the Kohonen Self Organizing Map (SOM) and two algorithms for learning with this type of network are presented. In the second part, recognition of handwritten characters as a specific task for a Kohonen SOM is described, and three different possibilities for encoding the input are implemented and tested.

Contents

1 Introduction
2 Introduction to Neural Networks
  2.1 The Neuron
  2.2 Network Structure
  2.3 Kohonen Network
  2.4 Learning with a neural network
3 Kohonen Self-organizing Map
  3.1 Structure
  3.2 Neighborhood function
  3.3 η - the learning parameter
  3.4 Border factor
4 Learning algorithms for SOM
  4.1 The Original incremental SOM algorithm
  4.2 The Batch SOM algorithm
5 Vector Quantization
  5.1 Definitions
  5.2 Vector Quantization algorithm
  5.3 Comparison between VQ and SOM algorithms
6 Kohonen SOM for pattern recognition
  6.1 Preprocessing
  6.2 Implementation
  6.3 Character Classification
  6.4 Results
7 Conclusion
Bibliography

A Test results
  A.1 Results from test 1
  A.2 Parameters for test 2
  A.3 Parameters for test 3
  A.4 Parameters for test 4
  A.5 Parameters for test 5
  A.6 Estimate of τ_1 for Batch Algorithm
  A.7 Parameters for test 6
  A.8 Parameters for test 7
  A.9 Output from test 7
B Characteristics tested for
C Source Code
D Material produced

Chapter 1

Introduction

The background knowledge for this project originates from the course DM74 - Neural Networks and Learning Theory at the University of Southern Denmark, taken in the spring semester. The textbook in the course was Haykin [1]. The reader is not expected to have prior knowledge of neural networks, but a background in computer science or mathematics is recommended. Typical applications of neural networks are function approximation, pattern recognition, statistical learning etc., which makes neural networks useful for a wide range of people such as computer scientists, statisticians, biologists, engineers and so on. The Kohonen self-organizing maps are frequently used in speech recognition, but also in pattern classification, statistical analysis and other areas.

In the second chapter, the building block of neural networks, the neuron, is presented, along with an introduction to its computation. The chapter also describes what learning with a neural network is, and which problems can be handled. The third chapter treats the central parts of the Kohonen self-organizing map. This includes the structure of the neurons in the lattice and a description of the neighborhood function, the learning parameter etc. Two learning algorithms for the Kohonen self-organizing map are presented in chapter four: the original incremental SOM algorithm and the batch SOM algorithm. In chapter five, vector quantization is defined and the LBG (Linde-Buzo-Gray) algorithm is described. Relations between the SOM algorithms and vector quantization are treated here as well. Finally, in chapter six, the implementation of the Kohonen SOM for recognition of handwritten characters is treated, including three encodings for the characters as input and a thorough testing of the algorithms. The results from the tests are compared to the results of others using SOM. The source code produced can be found in the appendix, but can also be downloaded from the web, see appendix D.

Chapter 2

Introduction to Neural Networks

In this chapter we take a look at the neuron as a single unit and how it works, the structure of neural networks, and the different learning processes. The basic idea in neural networks is to have a network built from fundamental units, neurons, that interact. Each neuron receives input from other neurons, processes it in a certain way, and produces an output signal. Although earlier work had been done in neural network research, the modern research was started by McCulloch and Pitts, who published A Logical Calculus of the Ideas Immanent in Nervous Activity in 1943 [2]. There are two approaches to neural networks. The first idea is to model the brain. The second is to create heuristics which are biologically inspired. In the early years the first was dominant, but now the second is prevailing.

2.1 The Neuron

The neuron works as follows: it gets input from other neurons or input synapses X_1, X_2, ..., X_n. Then it combines these signals in a certain way and computes the output.

Figure 2.1: A model of neuron k

Computation of neuron k

1. On input signals X_1, X_2, ..., X_n the sum

   v_k = Σ_{i=1}^{n} W_{ki} X_i + b_k

   is computed.

2. Apply the activation function φ to v_k:

   y_k = φ(v_k)

   This will be the next output of neuron k.

Activation functions

Different activation functions are used in the different types of networks. Often used functions are the threshold function (figure 2.2 (a), equation 2.1), used by McCulloch & Pitts, a stepwise linear function (figure 2.2 (b), equation 2.2), and the sigmoidal activation (figure 2.2 (c), equation 2.3). The stepwise linear function is continuous, but not differentiable, whereas the sigmoidal function is both continuous and differentiable. The threshold function can be approximated by the sigmoidal function by choosing a sufficiently large a.

Figure 2.2: Three different activation functions: (a) threshold, (b) stepwise linear, (c) sigmoidal (a = 3)

   φ(v) = 0 if v < 0, 1 if v ≥ 0    (2.1)

   φ(v) = 0 if v < −1/2, v + 1/2 if −1/2 ≤ v ≤ 1/2, 1 if v > 1/2    (2.2)

   φ(v) = 1 / (1 + exp(−a·v))    (2.3)

When a neuron using the threshold function outputs a 1, it is said to fire. In the Kohonen self-organizing network, the activation function is the identity function φ(v) = v, so y_k = v_k.
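To make the computation concrete, here is a small illustrative Java sketch (not the project's own code, which is listed in appendix C) of a single neuron evaluating the weighted sum and the three activation functions above; the class and method names are chosen for this example only.

// Illustrative sketch only; the project's actual implementation is in appendix C.
public class ActivationDemo {

    // Equation 2.1: threshold (McCulloch & Pitts)
    static double threshold(double v) {
        return v < 0 ? 0.0 : 1.0;
    }

    // Equation 2.2: stepwise linear, continuous but not differentiable
    static double piecewiseLinear(double v) {
        if (v < -0.5) return 0.0;
        if (v > 0.5) return 1.0;
        return v + 0.5;
    }

    // Equation 2.3: sigmoid with slope parameter a
    static double sigmoid(double v, double a) {
        return 1.0 / (1.0 + Math.exp(-a * v));
    }

    // Computation of neuron k: weighted sum plus bias, then an activation function
    static double neuronOutput(double[] x, double[] w, double bias, double a) {
        double v = bias;
        for (int i = 0; i < x.length; i++) {
            v += w[i] * x[i];
        }
        return sigmoid(v, a);
    }

    public static void main(String[] args) {
        double[] x = {0.2, 0.7, 0.1};
        double[] w = {0.5, -0.3, 0.8};
        System.out.println(neuronOutput(x, w, 0.1, 3.0));
    }
}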

2.2 Network Structure

Neurons are often connected in layers to form a neural network. The first layer is called the input layer, the last layer is called the output layer, and the ones in between are called hidden layers. The neurons are connected in different ways depending on the network type. If the underlying structure is an acyclic directed graph, with neurons as vertices, the network is called a feed forward network, as opposed to a recurrent network, where the underlying structure is directed but not acyclic. We suppose the network is clocked in discrete time steps. The output of a neuron is sent to a neighbor neuron, the weighted sum and the activation are computed, and one time step later the output of the neighbor neuron is available.

In Kohonen networks there is just one layer of neurons, thus no hidden layers, and the input layer is the same as the output layer. All the neurons in the layer are, however, connected. In one time step in the Kohonen network, a complete update of the neurons is made.

2.3 Kohonen Network

The Kohonen neural network works as follows. First the weights of the neurons in the network are initialized to random values. When a sample is presented, the neurons calculate their output, which is the distance between their weight and the sample presented. The neuron which has the smallest (or largest, depending on the metric used) distance is found. This neuron is called the winning neuron. The neurons in the network are then all updated, moving the weights of the neurons in the direction of the sample presented. The winning neuron is updated the most, and the greater the distance between the winner and the location of a neuron, the less that neuron is updated. In training, the samples are repeatedly presented to the network, and the amount that the weights of the neurons are moved in a single time step decreases, just as the radius within which surrounding neurons are affected decreases. Toward the end of training, a training sample affects a single neuron, and only by a small amount. Because of the winning neuron's influence on neurons close to it, the network lattice obtains a topological ordering, where neurons located close to each other have similar weights. The different elements of the Kohonen network are described in detail in chapter 3.

2.4 Learning with a neural network

Learning with a neural network depends on the problem considered. Typically the network is supposed to deduce general characteristics from the samples presented. If the task is pattern recognition/clustering, samples from the same cluster should be recognized as being alike, thus being recognized by the same neuron in the lattice. If the problem is approximation of a function, learning consists of updating the weights in the network, so that the output produced when the network is presented with an input is closer to the output corresponding to that input. This should of course hold for all the inputs presented, but without overfitting the function, where the outputs in the points presented are exactly correct but the function fluctuates between them.

Supervised and unsupervised learning

In supervised learning, for each sample presented the result (or the correct class, if it is pattern classification) is known, and the network can use the result to correct the weights of the neurons in the right direction. This is typical for the perceptron network and the radial-basis function networks. The learning rules for this type of learning typically use error correction or minimization of output error.

When using an unsupervised strategy, the network doesn't know the classification or result of the input samples until learning has been completed. After learning, the network is calibrated by a manually analyzed set of samples. One type of learning for the unsupervised network is correlation based learning, which tries to find correlations between neurons that always fire at the same time. It then makes sense to increase the weight between two such correlated neurons. Another type is competitive learning, which tries to find a winning neuron when the input comes from a particular cluster of points in the input space, and change the weight of this neuron.

Hebb based learning

Hebbian learning is used for the correlation learning scenario. Consider two neurons A and B, and suppose we observe that each time A fires, B fires in the next step. This can be incorporated in the following rule for how to change the weight W_AB between A and B:

   W_AB(new) = W_AB(old) + Δ_AB    (2.4)

with

   Δ_AB = η X_A Y_B    (2.5)

where X_A and Y_B are the outputs of the two neurons and η > 0 is a constant.

Competitive learning

The overall goal is classification of inputs. Several output neurons compete against each other; the one with the largest output value wins and fires. If for an input x ∈ R^n the output neuron X_k wins, then the weights coming into X_k are changed in the direction of x. This is done by updating the weights W_kj as follows:

   W_kj(new) = W_kj(old) + ΔW_kj    (2.6)

where

   ΔW_kj = η (x_j − W_kj)    (2.7)

and η is the learning parameter, with 0 < η < 1. The Kohonen self-organizing map is an example of competitive learning, where the neighbors of the winning neuron are also updated, and the learning parameter decreases over time.
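The following minimal Java sketch, purely illustrative and with names invented for this example, shows one competitive-learning step according to equations 2.6 and 2.7: the neuron with the largest output wins, and only its weights are moved toward the input.

// Minimal competitive-learning sketch (equations 2.6 and 2.7); illustrative only.
public class CompetitiveLearningDemo {

    // One update step: the neuron with the largest output wins and is moved toward x.
    static void updateStep(double[][] weights, double[] x, double eta) {
        int winner = 0;
        double best = Double.NEGATIVE_INFINITY;
        for (int k = 0; k < weights.length; k++) {
            double output = 0.0;                  // output value of neuron k
            for (int j = 0; j < x.length; j++) {
                output += weights[k][j] * x[j];
            }
            if (output > best) { best = output; winner = k; }
        }
        // W_kj(new) = W_kj(old) + eta * (x_j - W_kj(old)), only for the winner
        for (int j = 0; j < x.length; j++) {
            weights[winner][j] += eta * (x[j] - weights[winner][j]);
        }
    }

    public static void main(String[] args) {
        double[][] w = {{0.1, 0.9}, {0.8, 0.2}};
        updateStep(w, new double[]{1.0, 0.0}, 0.1);
        System.out.println(java.util.Arrays.deepToString(w));
    }
}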

Chapter 3

Kohonen Self-organizing Map

The Kohonen SOM works like the brain in pattern recognition tasks: when presented with an input, it excites neurons in a specific area. It does this by clustering a large input space onto a smaller output space. To achieve this, the Kohonen SOM algorithms build a lattice of neurons, where neurons located close to each other have similar characteristics. There are several key concepts in this process, such as the structure of the lattice, how the geographical distance between neurons influences the training (the neighborhood), how much each update time step should affect each neuron (the learning parameter), and making sure all samples fall within the lattice (the border factor). These matters are described in this chapter.

3.1 Structure

As described in section 2.2 there is only one layer of neurons in the Kohonen self-organizing map. The location of the neurons in relation to each other in the layer is important.

Figure 3.1: Two different lattice structures: (a) network structure using squares, (b) network structure using regular hexagons

The neurons can in principle be located in a square lattice, a hexagonal lattice or even an irregular pattern. When using a structure where the neurons form a pattern of regular hexagons, each neuron will have six neighbors

with the same geographical distance, where the square network structure only has four. This means that more neurons belong to the same neighborhood and are thus updated with the same magnitude in the direction of the winner. Neighboring neurons in the net should end up being mutually similar, and the hexagonal lattice does not favor the horizontal or vertical direction as much as the square lattice (Makhoul et al. [3] and Kohonen [4]).

3.2 Neighborhood function

When using Kohonen self-organizing maps, the distance in the lattice between two neurons influences the learning process. The neighborhood function should decrease over time, for all but the winning neuron. A typical choice of neighborhood function is:

   h_{j,i(x)}(n) = exp( −d²_{j,i} / (2σ²(n)) )    (3.1)

As σ_0, the radius of the net (or a little bigger) is recommended, and σ is calculated by:

   σ(n) = σ_0 exp(−n / τ_1)    (3.2)

As τ_1, Haykin [1] recommends 1000/log(σ_0). In figure 3.2 the neighborhood function is displayed for two neurons at distance 1, for different τ_1 values.

Figure 3.2: How the neighborhood function decreases over time, with d = 1, for four different τ_1 values (a-d), including τ_1 = 1000 and τ_1 = 5000

This neighborhood function provides a percentage-like measure of how close the neurons are. The function decreases not just over time, but also with the geographical distance between the two neurons in the net.
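As an illustration of equations 3.1 and 3.2, the following small Java sketch (names chosen for this example, not taken from the project's source in appendix C) computes σ(n) and the neighborhood value for a given lattice distance and time step; the parameter values in main are assumptions for the demonstration.

// Sketch of the neighborhood function (equations 3.1 and 3.2); illustrative only.
public class NeighborhoodDemo {

    // sigma(n) = sigma0 * exp(-n / tau1)
    static double sigma(double sigma0, double tau1, int n) {
        return sigma0 * Math.exp(-n / tau1);
    }

    // h_{j,i(x)}(n) = exp(-d^2 / (2 * sigma(n)^2)), d = lattice distance between neuron j and the winner
    static double neighborhood(double d, double sigma0, double tau1, int n) {
        double s = sigma(sigma0, tau1, n);
        return Math.exp(-(d * d) / (2.0 * s * s));
    }

    public static void main(String[] args) {
        // e.g. sigma0 = lattice radius, tau1 = 1000 / log(sigma0) as recommended above
        double sigma0 = 16.0;
        double tau1 = 1000.0 / Math.log(sigma0);
        System.out.println(neighborhood(1.0, sigma0, tau1, 500));
    }
}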

Another possible neighborhood function is a boolean function, simply decreasing the radius over time. At first all neurons are in the neighborhood, but as the radius decreases, the number of neurons in the neighborhood decreases, ending with only the winning neuron itself. Raising τ_1 causes the neighborhood function to decrease more slowly, and therefore the ordering takes a longer time.

3.3 η - the learning parameter

The learning parameter also decreases over time. One possible formula is:

   η(n) = η_0 exp(−n / τ_2)    (3.3)

Another possibility is a linear decrease:

   η(n) = 0.9 (1 − n/1000)    (3.4)

A graph of η decreasing over time using formula 3.3 can be seen in figure 3.3, with different values for τ_2. It is easy to see that raising τ_2 makes the graph of η decrease more slowly. After the ordering phase, the learning parameter should remain at a low value like 0.01 (Haykin [1]) or 0.02 (Kohonen [4]).

Figure 3.3: η displayed over time, with a) τ_2 = 1000, b) τ_2 = 2000 and c) τ_2 = 5000

3.4 Border factor

To make sure all samples fall within the lattice, an enlarged update of the neurons on the border of the lattice is made once such a neuron is the winner. The ordering phase is the first part of the algorithm defined in section 4.1. The function B(i(x)) is defined as follows:

   B(i(x)) = 1 if the ordering phase is not over; otherwise b² if i(x) is located in a corner of the lattice, b if i(x) is located on an edge, and 1 otherwise    (3.5)

where b is a constant. The border factor increases the update of the neurons on the border of the lattice, thus stretching it, so that more of the samples will be matched inside the net and not outside. If we don't use the border factor, we risk having too many samples matching on the outside of the lattice, and the network will not converge properly. If b is set to 1, the border factor is simply ignored.
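A possible reading of definition 3.5 in code form is sketched below in Java; it is only an illustration, and the lattice-coordinate convention (row/column indices) and the class name are assumptions of this example.

// Sketch of the border factor B(i(x)) from equation 3.5; illustrative only.
public class BorderFactorDemo {

    // row/col index the winning neuron inside a rows x cols lattice; b is the border constant.
    static double borderFactor(int row, int col, int rows, int cols,
                               boolean orderingPhaseOver, double b) {
        if (!orderingPhaseOver) {
            return 1.0;                               // no stretching during ordering
        }
        boolean onRowEdge = (row == 0 || row == rows - 1);
        boolean onColEdge = (col == 0 || col == cols - 1);
        if (onRowEdge && onColEdge) return b * b;     // corner neuron
        if (onRowEdge || onColEdge) return b;         // edge neuron
        return 1.0;                                   // interior neuron
    }

    public static void main(String[] args) {
        System.out.println(borderFactor(0, 0, 40, 30, true, 2.0)); // corner: prints 4.0
    }
}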

Chapter 4

Learning algorithms for SOM

In this chapter we take a look at two fundamentally different algorithms for the Kohonen SOM: the original incremental algorithm and the batch learning algorithm. The main difference between the two algorithms is found after the initialization. The incremental method repeats the following steps: process one training sample, then update the weights. The batch learning method repeats the steps: process all the training samples, then update the weights. The incremental algorithm repeats its steps many more times than the batch algorithm.

4.1 The Original incremental SOM algorithm

The original SOM algorithm works by updating neurons in a neighborhood around the best matching neuron. The units closest to the winning neuron are allowed to learn the most. Learning, or updating, consists of a linear combination of the old weight and the new sample presented to the network. The algorithm is defined as follows:

1. Initialization: The weights of the neurons in the network are initialized.
2. Sampling: Samples are drawn from the input space in accordance with the probability distribution of the input.
3. Similarity matching: The winning neuron is determined as the one with the weight vector most similar to the sample presented.

4. Updating: The neurons in the net are all updated according to the updating rule, equation 4.1.
5. Continuation: Continue steps 2-4 until no noticeable changes in the weights are observed.
6. Calibration: The neurons are labeled from a set of manually analyzed samples.

The first approximately 1000 steps (Haykin [1]) or more of the algorithm constitute a self-organizing/ordering phase, and the rest is called the convergence phase. In the ordering phase, the general topology of the input vectors is ordered in the net. In the convergence phase the map is fine tuned to provide an accurate quantification of the input data.

Initialization

This can be done by using randomized values for the entries in the weights or by selecting different training samples as the initial weights for the neurons. The samples are vectors in R^n, and thus the weight for each neuron is a vector in R^n.

Sampling

The samples can be drawn from the input set either by selecting samples at random or by presenting the samples in a cyclic way, see Kohonen [4].

Similarity Matching

Similarity is measured by the Euclidean distance between the weights of the neurons in the net and the selected input. It could alternatively be done by selecting the largest dot product. The winning neuron is the one updated the most, and in the last part of the algorithm (the last time steps) it is the only neuron updated, which justifies the label winner takes all.

Updating the neurons

For each sample x ∈ R^n presented, all neurons in the net are updated according to the following rule:

   W_j(n+1) = W_j(n) + η(n) B(i(x)) h_{j,i(x)}(n) (x − W_j(n))    (4.1)

where x is the sample presented to the network, i(x) is the index of the winning neuron, j is the index of the neuron being updated, η is the learning parameter depending on the time step n, W_j(n) is the weight of neuron j at time step n, and B(i(x)) is the border factor.

Continuation

The algorithm must run until there are no notable changes in the updates of the network. This is difficult to measure, but as a guideline for the number of iterations in the convergence phase, Kohonen [4] recommends at least 500 times the number of neurons in the network.

Calibration

Once the network has finished training, the neurons in the lattice must be assigned the class they correspond to. This can be accomplished with a manually analyzed set of samples, where each neuron is assigned the class of the sample it resembles the most.
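To summarize the incremental algorithm, the following Java sketch combines similarity matching (Euclidean distance) with update rule 4.1, using the decay formulas 3.2 and 3.3. It is an illustrative sketch only, not the implementation from appendix C; the border factor is passed in as a precomputed value, and all names and parameter choices are assumptions of this example.

// Sketch of one incremental SOM step (similarity matching + update rule 4.1); illustrative only.
// pos[j] = {row, col} is the lattice position of neuron j; weights[j] is W_j.
public class IncrementalSomDemo {

    static int findWinner(double[][] weights, double[] x) {
        int winner = 0;
        double best = Double.MAX_VALUE;
        for (int j = 0; j < weights.length; j++) {
            double d2 = 0.0;                          // squared Euclidean distance
            for (int i = 0; i < x.length; i++) {
                double diff = x[i] - weights[j][i];
                d2 += diff * diff;
            }
            if (d2 < best) { best = d2; winner = j; }
        }
        return winner;
    }

    // W_j(n+1) = W_j(n) + eta(n) * B(i(x)) * h_{j,i(x)}(n) * (x - W_j(n))
    static void update(double[][] weights, int[][] pos, double[] x,
                       int n, double eta0, double tau2,
                       double sigma0, double tau1, double borderFactor) {
        int winner = findWinner(weights, x);
        double eta = eta0 * Math.exp(-n / tau2);      // equation 3.3
        double sigma = sigma0 * Math.exp(-n / tau1);  // equation 3.2
        for (int j = 0; j < weights.length; j++) {
            double dr = pos[j][0] - pos[winner][0];
            double dc = pos[j][1] - pos[winner][1];
            double d2 = dr * dr + dc * dc;            // squared lattice distance
            double h = Math.exp(-d2 / (2.0 * sigma * sigma));
            double step = eta * borderFactor * h;
            for (int i = 0; i < x.length; i++) {
                weights[j][i] += step * (x[i] - weights[j][i]);
            }
        }
    }

    public static void main(String[] args) {
        double[][] w = {{0.0, 0.0}, {1.0, 1.0}};
        int[][] pos = {{0, 0}, {0, 1}};
        update(w, pos, new double[]{0.9, 0.8}, 0, 0.1, 1000.0, 1.0, 1000.0, 1.0);
        System.out.println(java.util.Arrays.deepToString(w));
    }
}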

4.2 The Batch SOM algorithm

The original algorithm works online and thus does not need all samples at the same time. This is not the case for the Batch SOM algorithm: here all training samples must be available when learning begins, since they are all considered in each step. The algorithm is described here:

1. Initialization of the weights: using random weights or the first K training vectors, where K is the number of neurons.
2. Sort the samples: each training sample's best matching neuron is found, and the sample is added to this neuron's list of samples.
3. Update: all weights of the neurons are updated as a weighted mean of the lists found in step 2.
4. Continuation: repeat steps 2-3 a number of times, depending on the neighborhood function.
5. Calibration: the neurons are labeled from a set of manually analyzed samples.

Initialization of the weights

The initialization of the weights can be done in the same way as in the original incremental algorithm.

Sort the samples

All the input samples are distributed into lists, one for each neuron, where each sample is added to the list corresponding to the neuron it matches best.

Updating the neurons

For the new weights, the mean of the weight vectors in the neighborhood is taken, where the neighborhood is defined as in the incremental algorithm, and the weight vector for each neuron is the mean of the samples in the list collected in step two. The new weights in the network are calculated from the following update rule:

   W_j(n+1) = ( Σ_i n_i h_{ji} x̄_i ) / ( Σ_i n_i h_{ji} )    (4.2)

where x̄_i is the mean of the vectors in the list of neuron i collected in step two, n_i is the number of elements in that list, and both sums are taken over all neurons in the net. When the edges are updated, the mean of the neuron's own list is weighted higher than the others, and in the corners even higher. The neighborhood function is the same as in the incremental algorithm; only the parameters differ, so in the last couple of iterations the neighborhood consists of only the single neuron.

Continuation

This algorithm typically loops through steps two and three 15 to 25 times (Kohonen [4]). This should be sufficient for the final updates to be very moderate.

Calibration

As in the incremental algorithm, the neurons are assigned a label from the class they are most similar to, taken from a set of manually analyzed samples.
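The corresponding sketch for one batch iteration (steps 2 and 3 above, with update rule 4.2) could look as follows in Java; again this is only an illustration with invented names, not the project's code, and the neighborhood width sigma is passed in as a fixed value for the iteration.

import java.util.ArrayList;
import java.util.List;

// Sketch of one batch-SOM iteration (sorting step + update rule 4.2); illustrative only.
public class BatchSomDemo {

    // Assigns every sample to the list of its best-matching neuron, then recomputes each
    // weight as a neighborhood-weighted mean of the list means.
    static void batchIteration(double[][] weights, int[][] pos, double[][] samples, double sigma) {
        int numNeurons = weights.length;
        int dim = weights[0].length;

        // Step 2: sort the samples into one list per neuron
        List<List<double[]>> lists = new ArrayList<>();
        for (int j = 0; j < numNeurons; j++) lists.add(new ArrayList<>());
        for (double[] x : samples) {
            int best = 0; double bestD = Double.MAX_VALUE;
            for (int j = 0; j < numNeurons; j++) {
                double d2 = 0.0;
                for (int i = 0; i < dim; i++) { double df = x[i] - weights[j][i]; d2 += df * df; }
                if (d2 < bestD) { bestD = d2; best = j; }
            }
            lists.get(best).add(x);
        }

        // Step 3: W_j = sum_i n_i * h_ji * mean_i / sum_i n_i * h_ji
        double[][] newWeights = new double[numNeurons][dim];
        for (int j = 0; j < numNeurons; j++) {
            double denom = 0.0;
            for (int i = 0; i < numNeurons; i++) {
                int ni = lists.get(i).size();
                if (ni == 0) continue;
                double dr = pos[j][0] - pos[i][0], dc = pos[j][1] - pos[i][1];
                double h = Math.exp(-(dr * dr + dc * dc) / (2.0 * sigma * sigma));
                double[] mean = new double[dim];
                for (double[] x : lists.get(i))
                    for (int k = 0; k < dim; k++) mean[k] += x[k] / ni;
                for (int k = 0; k < dim; k++) newWeights[j][k] += ni * h * mean[k];
                denom += ni * h;
            }
            if (denom > 0)
                for (int k = 0; k < dim; k++) newWeights[j][k] /= denom;
            else
                newWeights[j] = weights[j].clone();   // keep the old weight if no list is nearby
        }
        for (int j = 0; j < numNeurons; j++) weights[j] = newWeights[j];
    }

    public static void main(String[] args) {
        double[][] w = {{0.0, 0.0}, {1.0, 1.0}};
        int[][] pos = {{0, 0}, {0, 1}};
        double[][] samples = {{0.1, 0.1}, {0.9, 0.8}};
        batchIteration(w, pos, samples, 1.0);
        System.out.println(java.util.Arrays.deepToString(w));
    }
}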

Chapter 5

Vector Quantization

Vector quantization is typically used in lossy image compression, where the main idea is to code values from a high-dimensional input space into a key in a discrete output space of lower dimension, to reduce the bit rate of transfer or the space needed for archiving. Compression is extensively used in codecs for video and images, and is especially important when a signal must pass through a low transmission rate network and still maintain an acceptable quality [5]. One example of compression is the HTML safe color codes, where the entire color spectrum is encoded into 216 safe colors, which should always remain the same no matter which browser is used. Note that 216 different colors can easily be encoded using 8 bits, compressing the encoding of the millions of colors monitors today can handle. Vector quantization can generate a better choice of colors corresponding to each of the (in this case 216) possibilities, depending on how extensively the different colors are used.

5.1 Definitions

As seen in figure 5.1, when selecting 10 codebook vectors (the dots), the vectors lying in the same area bounded by the lines (or, in general, hyperplanes if the dimension of the vectors is greater than two) are called the Voronoi set corresponding to the codebook vector. The lines bounding the Voronoi sets are altogether called the Voronoi tessellation.

Figure 5.1: 10 codebook vectors (the dots) and the corresponding Voronoi tessellation (the lines) in R²

5.2 Vector Quantization algorithm

Lloyd's algorithm is the original algorithm for vector quantization. The LBG (Linde-Buzo-Gray) algorithm, which is very similar to the K-means algorithm, is a generalization of Lloyd's algorithm (Gray [6]). It tries to find the best location of the codebook vectors according to the samples presented. It works as follows:

1. Determine the number of vectors in the codebook, N
2. Select N random vectors and let them be the initial codebook vectors
3. Clusterize the input vectors around the vectors in the codebook using a distance measure
4. Compute the new set of codebook vectors, by obtaining the mean of the clustered vectors for each codebook vector
5. Repeat 3 and 4 until the change in the codebook vectors is very small

In compression, by selecting N codebook vectors, where N is smaller than the original number of colors in the picture or the video stream, the algorithm tries to find the N colors which give the smallest distortion in the picture or video stream.
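A minimal Java sketch of the LBG loop described above might look as follows; the stopping threshold eps and the class name are assumptions of this example, not part of the original algorithm description.

import java.util.Arrays;
import java.util.Random;

// Sketch of the LBG / generalized Lloyd iteration; illustrative only.
public class LbgDemo {

    // Runs the clusterize/recompute loop until codebook movement falls below eps.
    static double[][] lbg(double[][] samples, int codebookSize, double eps, Random rnd) {
        int dim = samples[0].length;
        double[][] codebook = new double[codebookSize][];
        for (int k = 0; k < codebookSize; k++)               // step 2: random initial codebook vectors
            codebook[k] = samples[rnd.nextInt(samples.length)].clone();

        double change = Double.MAX_VALUE;
        while (change > eps) {
            double[][] sums = new double[codebookSize][dim];
            int[] counts = new int[codebookSize];
            for (double[] x : samples) {                     // step 3: clusterize by nearest codebook vector
                int best = 0; double bestD = Double.MAX_VALUE;
                for (int k = 0; k < codebookSize; k++) {
                    double d2 = 0.0;
                    for (int i = 0; i < dim; i++) { double df = x[i] - codebook[k][i]; d2 += df * df; }
                    if (d2 < bestD) { bestD = d2; best = k; }
                }
                counts[best]++;
                for (int i = 0; i < dim; i++) sums[best][i] += x[i];
            }
            change = 0.0;
            for (int k = 0; k < codebookSize; k++) {         // step 4: new codebook vector = cluster mean
                if (counts[k] == 0) continue;
                for (int i = 0; i < dim; i++) {
                    double updated = sums[k][i] / counts[k];
                    change = Math.max(change, Math.abs(updated - codebook[k][i]));
                    codebook[k][i] = updated;
                }
            }
        }
        return codebook;
    }

    public static void main(String[] args) {
        double[][] samples = {{0, 0}, {0, 1}, {10, 10}, {10, 11}};
        System.out.println(Arrays.deepToString(lbg(samples, 2, 1e-6, new Random(1))));
    }
}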

5.3 Comparison between VQ and SOM algorithms

Especially the batch SOM algorithm has similarities with the LBG algorithm. The first step in the LBG algorithm, selecting the number of codebook vectors, corresponds to choosing the number of neurons in the lattice used in the Batch SOM algorithm. Both algorithms can use randomly selected input samples as their initial values.

The third step is quite identical in the two algorithms: in both it is a separation of the input samples into Voronoi sets. Compared to the Batch SOM algorithm, only step 4 in the LBG algorithm, the calculation of the new codebook vectors, is really different. In the SOM algorithm, the mean of the neighboring neurons is considered with a weight decided by the distance between the neurons. This gives the Batch SOM algorithm its topological ordering. The last step is identical in the two algorithms, and when the neighborhood in the Batch SOM algorithm has decreased to only the single neuron, the two algorithms behave exactly the same.

In [5] the Kohonen algorithm and vector quantization algorithms are compared, documenting that the Kohonen algorithm usually does at least as well as the vector quantization algorithms in practice, although this has only been shown empirically.

Chapter 6

Kohonen SOM for pattern recognition

In this chapter we take a look at the Kohonen self-organizing map implemented to recognize handwritten characters. This recognition problem is a typical example of a clustering problem, where the clusters are the different characters available as input. After the characters are collected, there are two main stages in the process: preprocessing and classification. The purpose of the preprocessing stage is to produce an encoding of the characters that can be used as input to the network. Three different encodings have been implemented and are described. In the classification stage, the Kohonen self-organizing map performs the training, and the results are reported. Last in the chapter, the test results are evaluated and compared to results in other papers.

To collect data, 200 persons were first asked to fill out a form as seen in D.1, and one person filled out 20 forms. The forms were then scanned using a HP PSC 1402 scanner, and the individual letters were saved as single files, using a tool developed to do exactly this. To reduce the size of the images (for the one-to-one encoding) Faststone Foto Resizer was used. The source code of the preprocessing and the algorithm can be found in appendix C. The tests have been run from inside Eclipse, and therefore the parameters have been changed directly in the source code.

6.1 Preprocessing

Three different input encodings have been tried. Figure 6.1 and figure 6.2 show the letters A and O written by different writers. It is clearly seen that regardless of the encoding used, they will be different, making the task of recognition complex.

Figure 6.1: Example of different versions of the capital letter A

Figure 6.2: Example of different versions of the capital letter O

The three encodings are described in the next sections.

One-to-one of pixels and input

The first approach was to reduce the size of the characters to a dimension of 20 times 25 pixels, making the input vector have 500 entries, and then map the pixels row by row, white pixels to 0 and black pixels to 1. It turned out that the scaling of the input and the relatively large input dimension made the training ineffective and complex, without yielding any impressive results. Figure A.1 shows a visual output of the weights of the neurons in the network after training. The large input dimension makes the amount of calculation in the algorithm too high to have the desired number of neurons in the network on a simple personal computer. This method was therefore discarded and will only be briefly mentioned in this report.

Division into lines

The second encoding tried uses the number of black lines seen vertically and the number of lines seen horizontally in the letter. Figure 6.3 shows an example of this encoding. In the project, most of the letters have a dimension (not reduced as in the first approach) of 60 times 80 pixels. This encoding takes 30 evenly spaced horizontal rows, counting the vertical lines seen in each, and 20 evenly spaced vertical columns, counting the horizontal lines seen in each. This makes the input dimension 50, which is a factor of 10 less than in the first approach (500 versus 50 entries). Furthermore, the input values in the vector are integers, not boolean values (zero or one).
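As an illustration of how such a count can be obtained, the following Java sketch counts the black line segments crossed by a single scan row; it assumes a boolean black/white pixel representation and is not the preprocessing code from appendix C.

// Sketch of the second encoding: counting black line segments in one pixel row; illustrative only.
public class LineCountDemo {

    // Counts maximal runs of black pixels (true = black) in one row of the image.
    static int countSegments(boolean[] row) {
        int segments = 0;
        boolean inSegment = false;
        for (boolean black : row) {
            if (black && !inSegment) segments++;   // a new black segment starts here
            inSegment = black;
        }
        return segments;
    }

    public static void main(String[] args) {
        // e.g. a horizontal scan line through the middle of an "R" crosses two strokes
        boolean[] row = {false, true, true, false, false, true, false};
        System.out.println(countSegments(row));    // prints 2
    }
}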

Figure 6.3: Simple 5 × 4 example of the encoding for the character R, making it [1, 2, 1, 2, 2, 1, 2, 3, 2]^T

The results on most of the letters are fairly good, but as seen in figure 6.4 the encodings for two different characters can be very similar if they are symmetric and their mirrored images are alike.

Figure 6.4: Example of the same encoding for different characters

Characteristics

The final encoding tested in this project preprocesses the characters to see which characteristics they possess, and codes the input vector with zeros and ones accordingly. The testing for the characteristics is thus vital, but unfortunately sensitive to rotations etc. The complete list of characteristics and the methods used to examine them can be found in appendix B.

Possible encodings not used

The two latter encodings unfortunately don't use the same numeric scale; otherwise combining them would be a possibility. In the articles studied for this project, different encodings have been used besides the three described above. These include Fourier coefficients extracted from the handwritten shapes, as seen in [7], and an encoding using the Shadow Method described by Jameel and Koutsougeras [8].

6.2 Implementation

When implementing the network, the hexagonal net was chosen. The charts shown in A.1 don't display this; all even rows should be shifted half a character to the right. When presenting the samples to the network, they are chosen at random from the entire set of training samples.

6.3 Character Classification

To limit the scope of the assignment, only the capital letters A, B, ..., Z have been taken into consideration in the pattern classification. The first data set consists of one sample from each of 200 test persons, making the total set 5200 individual characters. Of those, 3900 are used to train the network and 1300 are used to test the network. The second dataset consists of 20 samples from one person. This makes a total of 520 separate characters, of which 390 are used for training and 130 are used for testing. During training, the network did not know the label of each character; the labels were only used when assigning letters to the neurons in the calibration phase.

There are a number of parameters that can be changed in the algorithms, so testing will focus on the following:

1. The different encodings for the input
2. The number of neurons in the network
3. The number of iterations
4. The τ parameters
5. The start of the convergence phase

The incremental algorithm using the first data set is tested the most, since this is expected to be a more challenging task than a dataset from only one person.

Test 1: Encodings

First the best encoding, in this implementation, is decided, using a small number of neurons and iterations, and the same τ parameters and start of the convergence phase. Table A.1 shows the parameters used in this test, and table 6.1 shows the results of the test.

Table 6.1: Results for test 1 — rate of success on the training set and on the test set for each encoding

It seems that the high-dimensional one-to-one mapping is less successful than the two others. The running time of the algorithm is much longer for the first encoding (R^500) than for the two other encodings (R^50 or R^63), and therefore we disregard this encoding in the further tests.

Test 2: Number of neurons

Using the result of test 1, we now only examine the second and third encodings. In this test, the parameters are kept the same, only changing the dimension of the network, and thus the number of neurons. It is worth noting that since we initialize and draw samples at random, and only run each test one time, there is a bit of inaccuracy in the results. Obviously it would be better to run each test a number of times and take the average of the results. The results of test 2 can be seen in table 6.2, and the parameters used in table A.2.

Table 6.2: Results for test 2 — success rates on the training and test sets for encodings 2 and 3, for different lattice dimensions and total numbers of neurons

Here, the exact dimension of the network does not appear to be decisive.

Kohonen [4] recommends not to use a square, but instead a rectangular lattice. These results are not conclusive as to whether a square or a rectangle is the best choice, and increasing the number of neurons in the lattice doesn't seem to have a large effect on the success rate.

Figure 6.5: Rate of success depending on the number of neurons in the lattice. Plus signs are the test set for encoding 2, dots are the test set for encoding 3

Test 3: Number of iterations

We now test the number of iterations, using encodings 2 and 3 and the dimension 40 times 30 that seemed to be a fair choice according to test two.

Table 6.3: Results for test 3 — success rates on the training and test sets for encodings 2 and 3, for different numbers of iterations

As is clearly seen in table 6.3 and in figure 6.6, the number of iterations only improves the success rate up to a certain point; above that level the results are stagnant, and nothing is gained by using more iterations. The advice in Haykin [1] and Kohonen [4] about using 500 times the number of neurons in the network does not apply to lattices of this larger size.

Figure 6.6: Rate of success depending on the number of iterations in the algorithm. Plus signs are the test set for encoding 2, dots are the test set for encoding 3

Test 4: The τ parameters

We now turn our attention to the τ parameters. τ_1 controls how fast the neighborhood should decrease; a larger τ_1 makes the neighborhood shrink more slowly. The τ_2 parameter controls how rapidly the learning parameter η decreases; the smaller τ_2 gets, the faster η decreases. The results of different combinations of the τ parameters are in table 6.4. The recommended value of 1000 (Haykin [1]) for the τ_2 parameter seems too small for this problem; a larger value is preferred.

Table 6.4: Results for test 4 — success rates on the training and test sets for different combinations of the τ_1 and τ_2 parameters

Test 5: Start of the convergence phase

The start of the convergence phase indicates when the border factor should be applied. We therefore test these two parameters together. The results are displayed in table 6.5, and the other parameters used can be found in table A.4.

Table 6.5: Results for test 5 — success rates on the training and test sets for different combinations of convergence phase start and border factor

The results indicate that the start of the convergence phase is not as important as the size of the border factor. A factor of 4 seems to be a good choice.

Test 6: The Batch algorithm

For the Batch Kohonen SOM algorithm we rely on tests two and three. Thus we use a lattice of dimension 40 times 30 and only consider encoding 3. Table A.6 presents a calculation of the values of the neighborhood function when τ_1 is selected to be 5.1. A spreadsheet was used to select this value, knowing that in the last couple of iterations the neighborhood should be narrowed down so that the neurons are only influenced by the samples in their own Voronoi set.

Table 6.6: Results for test 6 — success rates on the training and test sets for different numbers of iterations and values of τ_1

The success rates here are quite similar to the ones obtained with the incremental algorithm.

Test 7: Samples from one person

Imagine the Kohonen SOM being used in a personal digital assistant (PDA). If the PDA could adapt to one person's handwriting, the owner would not have to learn to write letters in a way the device can recognize. Using a smaller lattice size and the results gained from tests one to five, we train on the dataset of 15 alphabets from one person and test on the last 5 alphabets. We run tests on encodings 2 and 3, with parameters as in table A.8; the results can be seen in table 6.7.

Table 6.7: Results for test 7 — success rates on the training and test sets for encodings 2 and 3, for different lattice dimensions and values of σ_0

The recognition rate in this test seems satisfactory.

The output from the last of the tests can be found in A.9. The characters the network has confused are N recognized as M, O as D, Q as B, Z as I, and three times R as D. Based on this result, a test for a characteristic that distinguishes between R and D should have been added to the third encoding. A PDA would have one major advantage that cannot be used in this report: when examining characteristics, the starting location of the pen could be taken into consideration. This is not possible here, since the characters in this project have been written on paper and scanned.

6.4 Results

For the first encoding style, Idan and Chevallier [9], using a training set of 735 and a test set of 265 samples, present a success rate of 90.7% on the training samples and 75.5% on the test samples, but this is on digits, thus with only 10 different shapes to consider. The speed of this preprocessing is good, but the input dimension is much higher than for other methods, which in practical applications could become an issue.

In [7], 7400 samples were collected from only 17 writers. Searching for only 18 different shapes and using Fourier descriptors, a recognition rate of 88.43% is reached. They also had the advantage of knowing the writer's starting location and writing speed. Considering their relatively small number of writers and different shapes, the recognition rate of around 70% found in this report seems reasonable.

In [8], which focuses on a new input encoding, the shadow method, 1000 characters from 13 writers yield a 78.6% result at best, but the paper does not specify whether the training samples are the same as the test samples.

In comparison, with at least a 93% success rate when using one writer against 70% for the 200 writers, the results in this report, obtained using a rather naive characteristics test on the letters, are satisfactory.

Chapter 7

Conclusion

In this report, neural networks in general, and learning with them, are presented. The primary focus is on unsupervised learning with the Kohonen Self Organizing Map. Two learning algorithms for the Kohonen SOM are described: the original incremental algorithm and the batch style algorithm. A brief introduction to vector quantization is given, and similarities between vector quantization algorithms and the self-organizing map algorithms are described.

In the second part of the project, the Kohonen SOM is used to recognize single handwritten characters. When using neural networks for such a pattern classification problem, the input encoding is as important as the network itself. Therefore, three different encodings have been implemented and used as input to the network. Both the incremental algorithm and the batch algorithm have been implemented, and both are tested with datasets collected and processed for this specific purpose.

The recognition rate for the dataset from 200 different writers is approximately 70%, which compared to other results using similar methods is satisfactory. Using samples from only one writer, the recognition rate is 96%, which implies that the method could be used on a PDA, which in time could adapt to the way the owner writes, instead of the owner adapting to the PDA.

Bibliography

[1] Simon Haykin. Neural Networks: A Comprehensive Foundation, 2nd Ed. Prentice-Hall, 1999.

[2] Warren S. McCulloch and Walter Pitts. A Logical Calculus of the Ideas Immanent in Nervous Activity. MIT Press.

[3] John Makhoul, Salim Roucos, and Herbert Gish. Vector quantization in speech coding. Proceedings of the IEEE, Vol. 73, No. 11, November 1985.

[4] Teuvo Kohonen. Self-Organizing Maps, 3rd Ed. Springer-Verlag, 2001.

[5] Nasser M. Nasrabadi and Yushu Feng. Vector quantization of images based upon the Kohonen self-organizing feature maps. IEEE International Conference on Neural Networks.

[6] Robert M. Gray. Vector quantization. IEEE ASSP Magazine, Volume 1, Issue 2, pages 4-29, 1984.

[7] N. Mezghani, A. Mitiche, and M. Cheriet. On-line recognition of handwritten Arabic characters using a Kohonen neural network. Eighth International Workshop on Frontiers in Handwriting Recognition, Proceedings.

[8] Akhtar Jameel and Cris Koutsougeras. Experiments with Kohonen's learning vector quantization in handwritten character recognition systems. Proceedings of the 37th Midwest Symposium on Circuits and Systems.

[9] Yizhak Idan and Raymond C. Chevallier. Handwritten digits recognition by a supervised Kohonen-like learning algorithm. IEEE International Joint Conference on Neural Networks.

[10] Autar Kaw and Michael Keteltas. Lagrangian interpolation.

Appendix A

Test results

A.1 Results from test 1

Table A.1: Parameters for test 1 (iterations; dimension 30 × 15; η; σ_0 = 16; τ_1 = …/log(σ_0); τ_2; border factor b = 2; convergence phase start)

Results encoding 1

Figure A.1: Visual display of the neurons in the network after training, using the first type of encoding (test 1)

The labels given to the neurons in the lattice, using encoding 1:

LOCOOCGOOUUUVHWNKMHNWRMHAAAATT
CCCGCCULLUUQDHKNMNMMFMHQPEXAIQ
CLECECCLLDQVLHMNMNNNMMXHIYIXTI
ULSECGCLLUGLIMNNMHMNHFXXXZSSIU
SLQBVVCLLLLIIKINMMMNNYMHZZZSJA
JQSVVVGLIIIIVHINMHHHWHYXZZZLIU
SJJJVVIKIIIIIIFNNHHVWYYFFZZZCU
JZJESHVILIIIIIPNHHHVVVYFZZZCCU
JJZIYILILIIIJIPKRRRVFIFFFRZZCC
IIZTTYICLIIJPFIFKFGFFTFFBSCZCC
TLJZIXCIFIIFIYKFFPQCFFTIFBZJCZ
TJIFXXIIIIFIFRPRIIGFTPIBBSVLUJ
TIAXXXFIIAFFIIIIHIIGTTTBBBCCOL
YTXXRRIIIHIIFPIMIPCFTTBBBCCCUQ
TTYYXXBAIRAAPHPFBPBBBBDBDSLCJQ

Results encoding 2

The labels given to the neurons in the lattice, using encoding 2:

FCCZZZEESSSGGGGBBBBRBQQQVWNTWW
CCCSSEEZZSSGGGBBBBOQDOOODWNNWN
FCCSZSZEEZSSBBBBBBDBOOOOOMWNNN
CCSSSZZZSSSSCBBOBBDDODOOONNNNN
CCCSCCISSSSSBODODDODOOOOUUNMNN
CCCCJJSSSSCPPDOOOOODOOOOUUMNMU
FCJCCCJSSZCPPPDDDDOOOGDOUUVVMN
FLJCCCCGZCCPPPDDDODOOGODUUVMMU
ZITJLCCCCZCPPPPODDDOAOOOUNMNVM
TIJLLCCCTIPPPPPCDDDAAAGUNVNVVV
CTTTLCCCZTTPPPPJKRRDAAVVUVMUVV
ITIICCCTCYYPPPKKKRKAAAJVVVVUVO
IITLTLLTTYYYPPPKKCKKRAVUUVUVVU
LILLICTTYYYYYPKXXKKKAUUVVVVVVV
IILLJIFCYTYYYYYXXXXXKHNHDUVVVV

Results encoding 3

The labels given to the neurons in the lattice, using encoding 3:

ZESSGGGQQOSPDYNMYYYXXKHNNVVVUU
IISSGGBBDBBOYYMYXYVVHHHNMVVVUU
ZEESSGGBBBBBAKPYKHHMVMMHMVUUVV
ZESSJGGBBBBBBKVRVVHMMMNNWNUUVU
ESESSRRBBBBRBBBKKKMUMMNWWWUUUV
FEZZRRQBBBBBORRGUHUHMNWWWUVVUV
ZEPPPPPRQOBBBOCCGHHUUWWMWNVVAV
IFFIPRPPPODBBUCUKKKHWMMWMVAVVV
IIIFPPPPPDDDDDBGAKKXYMXMKRKXXL
IFFFFPPPPDDDOBDAKKKYYXYKRJAALL
ILIFFFPPPDDDOBRHGKKXXYYAAJJJLL
LLFFTPPPPPODDDQQGMMXYYHTJTJJLL
LCCCCYYQPPOOODSQQGSJTTTTTYJJLL
CCCCCRPRQBOOODOZGSSSZTTTTTTILI
LCCCCRCRROCOOOOGSJSJIZITTYTTII

A.2 Parameters for test 2

Table A.2: Parameters for test 2 (iterations; η; σ_0 = max(x, y)/2 + 1; τ_1 = …/log(σ_0); τ_2; border factor b = 2; convergence phase start)

A.3 Parameters for test 3

Table A.3: Parameters for test 3 (dimension 40 × 30; η; σ_0 = 21; τ_1 = …/log(σ_0); τ_2; border factor b = 2; convergence phase start)

A.4 Parameters for test 4

Table A.4: Parameters for test 4 (iterations; encoding 3; dimension 40 × 30; η; σ_0 = 21; border factor b = 2; convergence phase start)

A.5 Parameters for test 5

Table A.5: Parameters for test 5 (iterations; encoding 3; dimension 40 × 30; η; σ_0 = 21; τ_1 = …/log(σ_0); τ_2)

A.6 Estimate of τ_1 for Batch Algorithm

Table A.6: Estimate of the neighborhood function for the Batch Algorithm, depending on the distance and the number of iterations, when τ_1 is selected to be 5.1

A.7 Parameters for test 6

Table A.7: Parameters for test 6 (iterations = 25; encoding 3; dimension 40 × 30; σ_0 = 21; τ_1)

A.8 Parameters for test 7

Table A.8: Parameters for test 7 (iterations; η; τ_1 = …/log(σ_0); τ_2; border factor b = 4; convergence phase start = 1000)

A.9 Output from test 7

In the output below, "dvs: 15 ud af 15" is Danish for "i.e.: 15 out of 15"; the letter appended to an entry is the character that the misrecognized samples were recognized as. The first block is the success rate on the training set:

Succesrate: {Z=1.0 dvs: 15 ud af 15, Y=1.0 dvs: 15 ud af 15, X=1.0 dvs: 15 ud af 15, W=1.0 dvs: 15 ud af 15, V=1.0 dvs: 15 ud af 15, U=1.0 dvs: 15 ud af 15, T=1.0 dvs: 15 ud af 15, S=1.0 dvs: 15 ud af 15, R=1.0 dvs: 15 ud af 15, Q= dvs: 14 ud af 15 B: %, P=1.0 dvs: 15 ud af 15, O=1.0 dvs: 15 ud af 15, N= dvs: 13 ud af 15 M: %, M=1.0 dvs: 15 ud af 15, L=1.0 dvs: 15 ud af 15, K= dvs: 14 ud af 15 R: %, J= dvs: 14 ud af 15 S: %, I=1.0 dvs: 15 ud af 15, H=1.0 dvs: 15 ud af 15, G= dvs: 14 ud af 15 S: %, F=1.0 dvs: 15 ud af 15, E= dvs: 13 ud af 15 F: %, D=1.0 dvs: 15 ud af 15, C=1.0 dvs: 15 ud af 15, B=1.0 dvs: 15 ud af 15, A=1.0 dvs: 15 ud af 15 }

The second block is the success rate on the test set:

Succesrate: {Z=0.8 dvs: 4 ud af 5 I: %, Y=1.0 dvs: 5 ud af 5, X=1.0 dvs: 5 ud af 5, W=1.0 dvs: 5 ud af 5, V=1.0 dvs: 5 ud af 5, U=1.0 dvs: 5 ud af 5, T=1.0 dvs: 5 ud af 5, S=1.0 dvs: 5 ud af 5, R=0.4 dvs: 2 ud af 5 D: %, Q=0.8 dvs: 4 ud af 5 B: %, P=1.0 dvs: 5 ud af 5, O=0.8 dvs: 4 ud af 5 D: %, N=0.8 dvs: 4 ud af 5 M: %, M=1.0 dvs: 5 ud af 5, L=1.0 dvs: 5 ud af 5, K=1.0 dvs: 5 ud af 5, J=1.0 dvs: 5 ud af 5, I=1.0 dvs: 5 ud af 5, H=1.0 dvs: 5 ud af 5, G=1.0 dvs: 5 ud af 5, F=1.0 dvs: 5 ud af 5, E=1.0 dvs: 5 ud af 5, D=1.0 dvs: 5 ud af 5, C=1.0 dvs: 5 ud af 5, B=1.0 dvs: 5 ud af 5, A=1.0 dvs: 5 ud af 5 }

NNHHHHHHAALLLEEEIBBBRPPFFFEIII
NNKKHHHAAALLFEESEBRRRPFFFEIZZZ
NNKKKHXXAAALLIGSBBQRRDRFFFEZZZ
MMKKKKXXAALLLGSGBRRQDDDCCLCZZZ
MMMKKXXXAAAALQGGBBBDDOURCCLITT
HNNXXXXXAAAQQQPBRRDDDDUCCCIIZZ
HNMMXXXXTTWWWQPPRRRDOODUCGSIIJ
UNMMYYYTTTWWWPPPRRQOOOBGSSSJJJ
UUVVVYYTTTWWWWPPPQQQOOOGGSSSJJ
UUVVVYYTTTWWWWPPPQQQOOGGGSSIJJ

Appendix B

Characteristics tested for

Here is a list of the different characteristics tested for in the third encoding style; the numbers refer to the location in the input vector. A small illustrative sketch of one such test follows the list.

0,1,2: Tests if there is a horizontal bar in the top, bottom or middle.
3,4,5: Tests if there is a vertical bar in the left, middle or right of the picture.
6,7,8,9: Checks if there is a rounded line segment in each of the four quadrants. The test is done using Lagrangian interpolation [10] to model a desired curve, and calculating the percentage of black pixels on this arc.
10,11: Examines if there are diagonal bars like / or \.
12,13,14,15: Tests if the number of vertical bars seen 1/4 from the top of the image is equal to 0, 1, 2 or 3 (similar to encoding 2).
16,17,18,19: Tests if the number of vertical bars seen 1/2 from the top of the image is equal to 0, 1, 2 or 3 (similar to encoding 2).
20,21,22,23: Tests if the number of vertical bars seen 3/4 from the top of the image is equal to 0, 1, 2 or 3 (similar to encoding 2).
24,25,26: Tests if the number of horizontal bars seen 1/3 into the image is equal to 1, 2 or 3 (similar to encoding 2).
27,28,29: Tests if the number of horizontal bars seen 1/2 into the image is equal to 1, 2 or 3 (similar to encoding 2).
30,31,32: Tests if the number of horizontal bars seen 2/3 into the image is equal to 1, 2 or 3 (similar to encoding 2).
33: Checks if the picture is very narrow (below a threshold limit).
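As an illustrative sketch of how one of these characteristics might be tested (here, characteristic 0: a horizontal bar in the top of the picture), the following Java example scans the top quarter of a boolean pixel image for a mostly black row. The threshold parameter, the "top quarter" band and the class name are assumptions of this example, not the method used in appendix C.

// Sketch of one characteristic test (a horizontal bar near the top of the image); illustrative only.
public class HorizontalBarDemo {

    // Returns 1 if some row in the top quarter of the image is mostly black, otherwise 0.
    static int hasTopHorizontalBar(boolean[][] pixels, double blackFraction) {
        int rows = pixels.length;
        int cols = pixels[0].length;
        int topRows = Math.max(1, rows / 4);          // scan the top quarter of the image
        for (int r = 0; r < topRows; r++) {
            int black = 0;
            for (int c = 0; c < cols; c++) {
                if (pixels[r][c]) black++;
            }
            if (black >= blackFraction * cols) {
                return 1;                             // the entry in the input vector is set to 1
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        boolean[][] t = new boolean[8][8];
        for (int c = 0; c < 8; c++) t[0][c] = true;   // a bar across the top row, as in a "T"
        System.out.println(hasTopHorizontalBar(t, 0.8));
    }
}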


(Feed-Forward) Neural Networks Dr. Hajira Jabeen, Prof. Jens Lehmann (Feed-Forward) Neural Networks 2016-12-06 Dr. Hajira Jabeen, Prof. Jens Lehmann Outline In the previous lectures we have learned about tensors and factorization methods. RESCAL is a bilinear model for

More information

Effects of Interactive Function Forms in a Self-Organized Critical Model Based on Neural Networks

Effects of Interactive Function Forms in a Self-Organized Critical Model Based on Neural Networks Commun. Theor. Phys. (Beijing, China) 40 (2003) pp. 607 613 c International Academic Publishers Vol. 40, No. 5, November 15, 2003 Effects of Interactive Function Forms in a Self-Organized Critical Model

More information

9 Competitive Neural Networks

9 Competitive Neural Networks 9 Competitive Neural Networks the basic competitive neural network consists of two layers of neurons: The distance-measure layer, The competitive layer, also known as a Winner-Takes-All (WTA) layer The

More information

AI Programming CS F-20 Neural Networks

AI Programming CS F-20 Neural Networks AI Programming CS662-2008F-20 Neural Networks David Galles Department of Computer Science University of San Francisco 20-0: Symbolic AI Most of this class has been focused on Symbolic AI Focus or symbols

More information

Neural Networks. Fundamentals Framework for distributed processing Network topologies Training of ANN s Notation Perceptron Back Propagation

Neural Networks. Fundamentals Framework for distributed processing Network topologies Training of ANN s Notation Perceptron Back Propagation Neural Networks Fundamentals Framework for distributed processing Network topologies Training of ANN s Notation Perceptron Back Propagation Neural Networks Historical Perspective A first wave of interest

More information

Scuola di Calcolo Scientifico con MATLAB (SCSM) 2017 Palermo 31 Luglio - 4 Agosto 2017

Scuola di Calcolo Scientifico con MATLAB (SCSM) 2017 Palermo 31 Luglio - 4 Agosto 2017 Scuola di Calcolo Scientifico con MATLAB (SCSM) 2017 Palermo 31 Luglio - 4 Agosto 2017 www.u4learn.it Ing. Giuseppe La Tona Sommario Machine Learning definition Machine Learning Problems Artificial Neural

More information

CS:4420 Artificial Intelligence

CS:4420 Artificial Intelligence CS:4420 Artificial Intelligence Spring 2018 Neural Networks Cesare Tinelli The University of Iowa Copyright 2004 18, Cesare Tinelli and Stuart Russell a a These notes were originally developed by Stuart

More information

EEE 241: Linear Systems

EEE 241: Linear Systems EEE 4: Linear Systems Summary # 3: Introduction to artificial neural networks DISTRIBUTED REPRESENTATION An ANN consists of simple processing units communicating with each other. The basic elements of

More information

Ch.8 Neural Networks

Ch.8 Neural Networks Ch.8 Neural Networks Hantao Zhang http://www.cs.uiowa.edu/ hzhang/c145 The University of Iowa Department of Computer Science Artificial Intelligence p.1/?? Brains as Computational Devices Motivation: Algorithms

More information

9 Classification. 9.1 Linear Classifiers

9 Classification. 9.1 Linear Classifiers 9 Classification This topic returns to prediction. Unlike linear regression where we were predicting a numeric value, in this case we are predicting a class: winner or loser, yes or no, rich or poor, positive

More information

Sections 18.6 and 18.7 Artificial Neural Networks

Sections 18.6 and 18.7 Artificial Neural Networks Sections 18.6 and 18.7 Artificial Neural Networks CS4811 - Artificial Intelligence Nilufer Onder Department of Computer Science Michigan Technological University Outline The brain vs. artifical neural

More information

CMSC 421: Neural Computation. Applications of Neural Networks

CMSC 421: Neural Computation. Applications of Neural Networks CMSC 42: Neural Computation definition synonyms neural networks artificial neural networks neural modeling connectionist models parallel distributed processing AI perspective Applications of Neural Networks

More information

Artificial Neural Network : Training

Artificial Neural Network : Training Artificial Neural Networ : Training Debasis Samanta IIT Kharagpur debasis.samanta.iitgp@gmail.com 06.04.2018 Debasis Samanta (IIT Kharagpur) Soft Computing Applications 06.04.2018 1 / 49 Learning of neural

More information

Introduction Biologically Motivated Crude Model Backpropagation

Introduction Biologically Motivated Crude Model Backpropagation Introduction Biologically Motivated Crude Model Backpropagation 1 McCulloch-Pitts Neurons In 1943 Warren S. McCulloch, a neuroscientist, and Walter Pitts, a logician, published A logical calculus of the

More information

Classification and Clustering of Printed Mathematical Symbols with Improved Backpropagation and Self-Organizing Map

Classification and Clustering of Printed Mathematical Symbols with Improved Backpropagation and Self-Organizing Map BULLETIN Bull. Malaysian Math. Soc. (Second Series) (1999) 157-167 of the MALAYSIAN MATHEMATICAL SOCIETY Classification and Clustering of Printed Mathematical Symbols with Improved Bacpropagation and Self-Organizing

More information

18.6 Regression and Classification with Linear Models

18.6 Regression and Classification with Linear Models 18.6 Regression and Classification with Linear Models 352 The hypothesis space of linear functions of continuous-valued inputs has been used for hundreds of years A univariate linear function (a straight

More information

Artificial neural networks

Artificial neural networks Artificial neural networks Chapter 8, Section 7 Artificial Intelligence, spring 203, Peter Ljunglöf; based on AIMA Slides c Stuart Russel and Peter Norvig, 2004 Chapter 8, Section 7 Outline Brains Neural

More information

Neural Networks. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

Neural Networks. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington Neural Networks CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Perceptrons x 0 = 1 x 1 x 2 z = h w T x Output: z x D A perceptron

More information

Neural networks. Chapter 20, Section 5 1

Neural networks. Chapter 20, Section 5 1 Neural networks Chapter 20, Section 5 Chapter 20, Section 5 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 20, Section 5 2 Brains 0 neurons of

More information

Proyecto final de carrera

Proyecto final de carrera UPC-ETSETB Proyecto final de carrera A comparison of scalar and vector quantization of wavelet decomposed images Author : Albane Delos Adviser: Luis Torres 2 P a g e Table of contents Table of figures...

More information

Artificial Neural Networks (ANN) Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso

Artificial Neural Networks (ANN) Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso Artificial Neural Networks (ANN) Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Fall, 2018 Outline Introduction A Brief History ANN Architecture Terminology

More information

Neural Networks. Chapter 18, Section 7. TB Artificial Intelligence. Slides from AIMA 1/ 21

Neural Networks. Chapter 18, Section 7. TB Artificial Intelligence. Slides from AIMA   1/ 21 Neural Networks Chapter 8, Section 7 TB Artificial Intelligence Slides from AIMA http://aima.cs.berkeley.edu / 2 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural

More information

Sections 18.6 and 18.7 Artificial Neural Networks

Sections 18.6 and 18.7 Artificial Neural Networks Sections 18.6 and 18.7 Artificial Neural Networks CS4811 - Artificial Intelligence Nilufer Onder Department of Computer Science Michigan Technological University Outline The brain vs artifical neural networks

More information

Introduction to Artificial Neural Networks

Introduction to Artificial Neural Networks Facultés Universitaires Notre-Dame de la Paix 27 March 2007 Outline 1 Introduction 2 Fundamentals Biological neuron Artificial neuron Artificial Neural Network Outline 3 Single-layer ANN Perceptron Adaline

More information

Artificial Neural Networks Examination, March 2002

Artificial Neural Networks Examination, March 2002 Artificial Neural Networks Examination, March 2002 Instructions There are SIXTY questions (worth up to 60 marks). The exam mark (maximum 60) will be added to the mark obtained in the laborations (maximum

More information

Application of SOM neural network in clustering

Application of SOM neural network in clustering J. Biomedical Science and Engineering, 2009, 2, 637-643 doi: 10.4236/jbise.2009.28093 Published Online December 2009 (http://www.scirp.org/journal/jbise/). Application of SOM neural network in clustering

More information

3.4 Linear Least-Squares Filter

3.4 Linear Least-Squares Filter X(n) = [x(1), x(2),..., x(n)] T 1 3.4 Linear Least-Squares Filter Two characteristics of linear least-squares filter: 1. The filter is built around a single linear neuron. 2. The cost function is the sum

More information

CS 4700: Foundations of Artificial Intelligence

CS 4700: Foundations of Artificial Intelligence CS 4700: Foundations of Artificial Intelligence Prof. Bart Selman selman@cs.cornell.edu Machine Learning: Neural Networks R&N 18.7 Intro & perceptron learning 1 2 Neuron: How the brain works # neurons

More information

The Perceptron. Volker Tresp Summer 2016

The Perceptron. Volker Tresp Summer 2016 The Perceptron Volker Tresp Summer 2016 1 Elements in Learning Tasks Collection, cleaning and preprocessing of training data Definition of a class of learning models. Often defined by the free model parameters

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 Exam policy: This exam allows two one-page, two-sided cheat sheets; No other materials. Time: 2 hours. Be sure to write your name and

More information

) (d o f. For the previous layer in a neural network (just the rightmost layer if a single neuron), the required update equation is: 2.

) (d o f. For the previous layer in a neural network (just the rightmost layer if a single neuron), the required update equation is: 2. 1 Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.034 Artificial Intelligence, Fall 2011 Recitation 8, November 3 Corrected Version & (most) solutions

More information

Nonlinear Classification

Nonlinear Classification Nonlinear Classification INFO-4604, Applied Machine Learning University of Colorado Boulder October 5-10, 2017 Prof. Michael Paul Linear Classification Most classifiers we ve seen use linear functions

More information

Effects of Interactive Function Forms and Refractoryperiod in a Self-Organized Critical Model Based on Neural Networks

Effects of Interactive Function Forms and Refractoryperiod in a Self-Organized Critical Model Based on Neural Networks Commun. Theor. Phys. (Beijing, China) 42 (2004) pp. 121 125 c International Academic Publishers Vol. 42, No. 1, July 15, 2004 Effects of Interactive Function Forms and Refractoryperiod in a Self-Organized

More information

Artificial Intelligence

Artificial Intelligence Artificial Intelligence Jeff Clune Assistant Professor Evolving Artificial Intelligence Laboratory Announcements Be making progress on your projects! Three Types of Learning Unsupervised Supervised Reinforcement

More information

Neural Networks. Nicholas Ruozzi University of Texas at Dallas

Neural Networks. Nicholas Ruozzi University of Texas at Dallas Neural Networks Nicholas Ruozzi University of Texas at Dallas Handwritten Digit Recognition Given a collection of handwritten digits and their corresponding labels, we d like to be able to correctly classify

More information

Introduction to Neural Networks: Structure and Training

Introduction to Neural Networks: Structure and Training Introduction to Neural Networks: Structure and Training Professor Q.J. Zhang Department of Electronics Carleton University, Ottawa, Canada www.doe.carleton.ca/~qjz, qjz@doe.carleton.ca A Quick Illustration

More information

Administration. Registration Hw3 is out. Lecture Captioning (Extra-Credit) Scribing lectures. Questions. Due on Thursday 10/6

Administration. Registration Hw3 is out. Lecture Captioning (Extra-Credit) Scribing lectures. Questions. Due on Thursday 10/6 Administration Registration Hw3 is out Due on Thursday 10/6 Questions Lecture Captioning (Extra-Credit) Look at Piazza for details Scribing lectures With pay; come talk to me/send email. 1 Projects Projects

More information

Sections 18.6 and 18.7 Analysis of Artificial Neural Networks

Sections 18.6 and 18.7 Analysis of Artificial Neural Networks Sections 18.6 and 18.7 Analysis of Artificial Neural Networks CS4811 - Artificial Intelligence Nilufer Onder Department of Computer Science Michigan Technological University Outline Univariate regression

More information

Handwritten English Character Recognition using Pixel Density Gradient Method

Handwritten English Character Recognition using Pixel Density Gradient Method International Journal of Computer Science and Engineering Open Access Research Paper Volume-2, Issue-3 E-ISSN: 2347-2693 Handwritten English Character Recognition using Pixel Density Gradient Method Rakesh

More information

Artificial Neural Network

Artificial Neural Network Artificial Neural Network Eung Je Woo Department of Biomedical Engineering Impedance Imaging Research Center (IIRC) Kyung Hee University Korea ejwoo@khu.ac.kr Neuron and Neuron Model McCulloch and Pitts

More information

Artificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011!

Artificial Neural Networks and Nonparametric Methods CMPSCI 383 Nov 17, 2011! Artificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011! 1 Todayʼs lecture" How the brain works (!)! Artificial neural networks! Perceptrons! Multilayer feed-forward networks! Error

More information

Master Recherche IAC TC2: Apprentissage Statistique & Optimisation

Master Recherche IAC TC2: Apprentissage Statistique & Optimisation Master Recherche IAC TC2: Apprentissage Statistique & Optimisation Alexandre Allauzen Anne Auger Michèle Sebag LIMSI LRI Oct. 4th, 2012 This course Bio-inspired algorithms Classical Neural Nets History

More information

2.1 Definition. Let n be a positive integer. An n-dimensional vector is an ordered list of n real numbers.

2.1 Definition. Let n be a positive integer. An n-dimensional vector is an ordered list of n real numbers. 2 VECTORS, POINTS, and LINEAR ALGEBRA. At first glance, vectors seem to be very simple. It is easy enough to draw vector arrows, and the operations (vector addition, dot product, etc.) are also easy to

More information

Neural Networks biological neuron artificial neuron 1

Neural Networks biological neuron artificial neuron 1 Neural Networks biological neuron artificial neuron 1 A two-layer neural network Output layer (activation represents classification) Weighted connections Hidden layer ( internal representation ) Input

More information

Classic K -means clustering. Classic K -means example (K = 2) Finding the optimal w k. Finding the optimal s n J =

Classic K -means clustering. Classic K -means example (K = 2) Finding the optimal w k. Finding the optimal s n J = Review of classic (GOF K -means clustering x 2 Fall 2015 x 1 Lecture 8, February 24, 2015 K-means is traditionally a clustering algorithm. Learning: Fit K prototypes w k (the rows of some matrix, W to

More information

AE = q < H(p < ) + (1 q < )H(p > ) H(p) = p lg(p) (1 p) lg(1 p)

AE = q < H(p < ) + (1 q < )H(p > ) H(p) = p lg(p) (1 p) lg(1 p) 1 Decision Trees (13 pts) Data points are: Negative: (-1, 0) (2, 1) (2, -2) Positive: (0, 0) (1, 0) Construct a decision tree using the algorithm described in the notes for the data above. 1. Show the

More information

Neural Networks Introduction

Neural Networks Introduction Neural Networks Introduction H.A Talebi Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Winter 2011 H. A. Talebi, Farzaneh Abdollahi Neural Networks 1/22 Biological

More information

Effect of number of hidden neurons on learning in large-scale layered neural networks

Effect of number of hidden neurons on learning in large-scale layered neural networks ICROS-SICE International Joint Conference 009 August 18-1, 009, Fukuoka International Congress Center, Japan Effect of on learning in large-scale layered neural networks Katsunari Shibata (Oita Univ.;

More information

Lecture 7: DecisionTrees

Lecture 7: DecisionTrees Lecture 7: DecisionTrees What are decision trees? Brief interlude on information theory Decision tree construction Overfitting avoidance Regression trees COMP-652, Lecture 7 - September 28, 2009 1 Recall:

More information

Dynamic Classification of Power Systems via Kohonen Neural Network

Dynamic Classification of Power Systems via Kohonen Neural Network Dynamic Classification of Power Systems via Kohonen Neural Network S.F Mahdavizadeh Iran University of Science & Technology, Tehran, Iran M.A Sandidzadeh Iran University of Science & Technology, Tehran,

More information

Radial-Basis Function Networks

Radial-Basis Function Networks Radial-Basis Function etworks A function is radial basis () if its output depends on (is a non-increasing function of) the distance of the input from a given stored vector. s represent local receptors,

More information

The error-backpropagation algorithm is one of the most important and widely used (and some would say wildly used) learning techniques for neural

The error-backpropagation algorithm is one of the most important and widely used (and some would say wildly used) learning techniques for neural 1 2 The error-backpropagation algorithm is one of the most important and widely used (and some would say wildly used) learning techniques for neural networks. First we will look at the algorithm itself

More information

PV021: Neural networks. Tomáš Brázdil

PV021: Neural networks. Tomáš Brázdil 1 PV021: Neural networks Tomáš Brázdil 2 Course organization Course materials: Main: The lecture Neural Networks and Deep Learning by Michael Nielsen http://neuralnetworksanddeeplearning.com/ (Extremely

More information

4. Multilayer Perceptrons

4. Multilayer Perceptrons 4. Multilayer Perceptrons This is a supervised error-correction learning algorithm. 1 4.1 Introduction A multilayer feedforward network consists of an input layer, one or more hidden layers, and an output

More information

Sample Exam COMP 9444 NEURAL NETWORKS Solutions

Sample Exam COMP 9444 NEURAL NETWORKS Solutions FAMILY NAME OTHER NAMES STUDENT ID SIGNATURE Sample Exam COMP 9444 NEURAL NETWORKS Solutions (1) TIME ALLOWED 3 HOURS (2) TOTAL NUMBER OF QUESTIONS 12 (3) STUDENTS SHOULD ANSWER ALL QUESTIONS (4) QUESTIONS

More information

Course 10. Kernel methods. Classical and deep neural networks.

Course 10. Kernel methods. Classical and deep neural networks. Course 10 Kernel methods. Classical and deep neural networks. Kernel methods in similarity-based learning Following (Ionescu, 2018) The Vector Space Model ò The representation of a set of objects as vectors

More information

Radial-Basis Function Networks

Radial-Basis Function Networks Radial-Basis Function etworks A function is radial () if its output depends on (is a nonincreasing function of) the distance of the input from a given stored vector. s represent local receptors, as illustrated

More information

Chap 1. Overview of Statistical Learning (HTF, , 2.9) Yongdai Kim Seoul National University

Chap 1. Overview of Statistical Learning (HTF, , 2.9) Yongdai Kim Seoul National University Chap 1. Overview of Statistical Learning (HTF, 2.1-2.6, 2.9) Yongdai Kim Seoul National University 0. Learning vs Statistical learning Learning procedure Construct a claim by observing data or using logics

More information

Last update: October 26, Neural networks. CMSC 421: Section Dana Nau

Last update: October 26, Neural networks. CMSC 421: Section Dana Nau Last update: October 26, 207 Neural networks CMSC 42: Section 8.7 Dana Nau Outline Applications of neural networks Brains Neural network units Perceptrons Multilayer perceptrons 2 Example Applications

More information

COMP-4360 Machine Learning Neural Networks

COMP-4360 Machine Learning Neural Networks COMP-4360 Machine Learning Neural Networks Jacky Baltes Autonomous Agents Lab University of Manitoba Winnipeg, Canada R3T 2N2 Email: jacky@cs.umanitoba.ca WWW: http://www.cs.umanitoba.ca/~jacky http://aalab.cs.umanitoba.ca

More information

Artificial Neural Networks. Historical description

Artificial Neural Networks. Historical description Artificial Neural Networks Historical description Victor G. Lopez 1 / 23 Artificial Neural Networks (ANN) An artificial neural network is a computational model that attempts to emulate the functions of

More information