A Hybrid Neuron with Gradient-based Learning for Binary Classification Problems


A Hybrid Neuron with Gradient-based Learning for Binary Classification Problems

Ricardo de A. Araújo 1,2, Adriano L. I. Oliveira 1, Silvio R. L. Meira 1

1 Informatics Center, Federal University of Pernambuco, Recife, PE, Brazil
2 Informatics Department, Federal Institute of Sertão Pernambucano, Ouricuri, PE, Brazil

ricardo.araujo@ifsertao-pe.edu.br or {raa,alio,srlm}@cin.ufpe.br

Abstract. In this paper we present a hybrid neuron based on principles of mathematical morphology and lattice theory to solve binary classification problems. For the model design we present a gradient-based method that uses ideas from the back propagation algorithm together with a systematic approach to overcome the problem of nondifferentiability of morphological operations. Furthermore, we conduct an experimental analysis using two relevant binary classification problems, and the obtained results are discussed and compared with those obtained by established techniques in the literature.

1. Introduction

The perceptron is the best-known artificial neuron proposed in the literature [Haykin 1998, Haykin 2007]. It was inspired by the concept of biological neurons and it is able to solve linear classification problems [Haykin 1998, Haykin 2007]. A particular class of artificial neurons based on the framework of mathematical morphology (MM) [Maragos 1989] under the context of lattice theory [Heijmans 1994], called morphological perceptrons (MPs) [Ritter and Urcid 2003, Sussner and Esmi 2011], has been successfully applied to linear and nonlinear problems [Ritter et al. 1997, Ritter et al. 1998, Sussner 1998a, Petridis and Kaburlasos 1998, Kaburlasos and Petridis 2000, Khabou and Gader 2000, Hocaoglu and Gader 2003, Sussner and Valle 2006a, Sussner and Valle 2006b, de A. Araújo et al. 2006b, de A. Araújo et al. 2006c, de A. Araújo et al. 2006a, Sussner and Valle 2007, Valle and Sussner 2008, Silva and Sussner 2008, Sussner and Esmi 2009b, Sussner and Esmi 2009a, Sussner and Esmi 2011].

In this context, this work proposes a hybrid artificial neuron, which can be seen as a particular class of morphological-linear perceptrons, for dealing with binary classification problems. The proposed model, called the dilation-erosion-linear perceptron (DELP), consists of a linear combination of nonlinear morphological operators under the context of lattice theory and a linear operator. Also, a gradient-based method is presented to design the proposed DELP (learning process), based on ideas from the back propagation (BP) algorithm [Haykin 1998, Haykin 2007] and using a systematic approach to overcome the problem of nondifferentiability of morphological operations, based on ideas from Pessoa and Maragos [Pessoa and Maragos 2000] and Sousa [de Sousa et al. 2000].

Furthermore, an experimental analysis is conducted with the proposed model using the Ripley's Synthetic [Ripley 1996] and the Wisconsin Breast Cancer [Asuncion and Newman 2007] classification problems. The achieved results are compared with those obtained by established techniques in the literature, where it is possible to notice that the DELP model can be used as an accurate binary classifier.

This work is organized as follows. Section 2 describes the proposed DELP model. Section 3 presents simulations and experimental results with the proposed DELP model, as well as a comparison between the obtained results and those given by established techniques presented in the literature. Finally, Section 4 presents the final remarks of this work.

2. The Dilation-Erosion-Linear Perceptron

The proposed dilation-erosion-linear perceptron (DELP) consists of a linear combination of a nonlinear operator (dilation and erosion operators) and a linear operator (finite impulse response). Next we present the definition, the fundamentals and the proposed training algorithm to design the DELP.

Let x = (x_1, x_2, ..., x_n) ∈ R^n be a real-valued input signal inside an n-point moving window and let y be the output of the DELP. Then, the DELP is defined by a hybrid morphological-linear system with local signal transformation rule x → y, given by

y = λα + (1 − λ)β,  λ ∈ [0, 1],   (1)

where

β = x p^T = x_1 p_1 + x_2 p_2 + ... + x_n p_n,   (2)

α = θφ + (1 − θ)ω,  θ ∈ [0, 1],   (3)

in which

φ = δ_a(x) = ⋁_{i=1}^{n} (x_i + a_i),   (4)

ω = ε_b(x) = ⋀_{i=1}^{n} (x_i + b_i),   (5)

where the term n denotes the dimensionality of the input signal (x), and the terms λ, θ ∈ R and a, b, p ∈ R^n. The vector p ∈ R^n represents the coefficients (weights) of the linear operator. The term β represents the output of the linear operator. The term α represents the convex combination of the morphological operators of dilation and erosion (the mixture term is defined by θ). The terms φ and ω represent the outputs of the morphological operators of dilation and erosion, respectively. The vectors a and b represent the structuring elements (weights) of the dilation (δ_a(x)) and erosion (ε_b(x)) operators employed in the nonlinear module of the DELP. The terms ⋁ and ⋀ represent the supremum and the infimum operations.

Note that the output y is given by a convex combination of the linear operator and another convex combination of the morphological operators of dilation and erosion (the mixture term is defined by λ).
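To make the transformation rule of Equations 1-5 concrete, the following is a minimal NumPy sketch (the function delp_output and its argument names are ours, not part of the original formulation); it assumes finite real-valued inputs, so ordinary addition can stand in for the extended additions discussed next.

```python
import numpy as np

def delp_output(x, a, b, p, lam, theta):
    """Compute the DELP output y for a single input pattern x (Equations 1-5)."""
    beta = np.dot(x, p)                          # linear (FIR) term, Eq. (2)
    phi = np.max(x + a)                          # dilation: supremum of x_i + a_i, Eq. (4)
    omega = np.min(x + b)                        # erosion: infimum of x_i + b_i, Eq. (5)
    alpha = theta * phi + (1.0 - theta) * omega  # convex mix of dilation and erosion, Eq. (3)
    return lam * alpha + (1.0 - lam) * beta      # convex mix with the linear term, Eq. (1)

# Example with a 3-point window and arbitrary weights:
x = np.array([0.2, 0.7, 0.4])
a = np.array([0.1, -0.3, 0.05])
b = np.array([-0.2, 0.1, 0.0])
p = np.array([0.5, -0.1, 0.3])
y = delp_output(x, a, b, p, lam=0.6, theta=0.4)
```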

The main differences between the additions employed in the dilation (+) and in the erosion (+') are given by the following rules for the extended values:

(−∞) + (+∞) = (+∞) + (−∞) = −∞,   (6)

(−∞) +' (+∞) = (+∞) +' (−∞) = +∞.   (7)

2.1. Learning Process

The design of the DELP model requires the adjustment of the parameters a, b, p ∈ R^n and λ, θ ∈ R. Therefore, the weight vector w (note that w ∈ R^{3n+2}) of the DELP model is given by

w = (a, b, p, λ, θ).   (8)

During the proposed learning process, all parameters of the DELP model are iteratively adjusted according to an error criterion until convergence. Therefore, it is necessary to define an objective function J(w) (representing the error between the target and the model output using the weight vector w) to be minimized during the learning process, given by

J(w) = Σ_{m=1}^{M} e^2(m),   (9)

in which M represents the number of input patterns in the learning process and e(m) represents the instantaneous error for the m-th input pattern, given by

e(m) = t(m) − y(m),   (10)

where t(m) and y(m) are the target and the model output, respectively.

Note that the objective function builds an error surface within the space R^{3n+2}. The main problem in minimizing J(w) is to find the point in this space which minimizes the error between the target and the model output, that is, to determine w* = argmin_w J(w). In this work we propose a gradient steepest descent method using ideas of the back propagation algorithm [Haykin 1998, Haykin 2007], which is used to obtain the gradient vector to adjust the weight vector of the DELP.

The learning process of the DELP model updates the weight vector w based on the steepest descent method. The adjustment of the vector w for the m-th input training pattern is given by the following iterative formula:

w(i + 1) = w(i) − µ∇J(w),   (11)

where µ > 0 (usually called step size or learning rate) and i ∈ {1, 2, ...}. The term µ is responsible for regulating the tradeoff between stability and speed of convergence of the iterative procedure. The iteration of Equation 11 starts with an initial guess w(0) and stops when some desired condition is reached.
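As a rough sketch of how the update rule of Equation 11 can be organized in practice, the pattern-by-pattern loop below uses the delp_output function from the previous sketch and a grad_y function (sketched after Equation 31) that returns the partial derivatives of the output; the dictionary packing of w and the stopping logic are our own illustrative choices, not a prescription from the paper.

```python
import numpy as np

def train_delp(patterns, targets, w, mu=0.01, max_epochs=10_000, tol=1e-6):
    """Pattern-wise steepest descent on J(w), Equations (9)-(11).

    `w` is a dict with keys 'a', 'b', 'p', 'lam' and 'theta'; grad_y(x, w)
    must return the partial derivatives of y with respect to each key.
    The convexity constraints lam, theta in [0, 1] are not enforced here.
    """
    prev_cost = np.inf
    for epoch in range(max_epochs):
        cost = 0.0
        for x, t in zip(patterns, targets):
            y = delp_output(x, w['a'], w['b'], w['p'], w['lam'], w['theta'])
            e = t - y                                        # instantaneous error, Eq. (10)
            g = grad_y(x, w)                                 # partials of y, Eqs. (18)-(31)
            for key in w:                                    # Eq. (11): w <- w - mu * dJ/dw,
                w[key] = w[key] - mu * (-2.0 * e * g[key])   # with dJ/dw = -2 e(m) dy/dw
            cost += e ** 2                                   # accumulate J(w), Eq. (9)
        if abs(prev_cost - cost) < tol:                      # stop on a small cost decrease
            break                                            # (or when max_epochs is reached)
        prev_cost = cost
    return w
```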

The term ∇J(w) is the gradient, which is given by

∇J(w) = ∂J/∂w = (∂J/∂a, ∂J/∂b, ∂J/∂p, ∂J/∂λ, ∂J/∂θ),   (12)

in which

∂J/∂a = −2e(m) ∂y/∂a,   (13)

∂J/∂b = −2e(m) ∂y/∂b,   (14)

∂J/∂p = −2e(m) ∂y/∂p,   (15)

∂J/∂λ = −2e(m) ∂y/∂λ,   (16)

∂J/∂θ = −2e(m) ∂y/∂θ.   (17)

Note that the existence of the gradient of J with respect to w depends on the existence of the gradients ∂y/∂a, ∂y/∂b, ∂y/∂p, ∂y/∂λ and ∂y/∂θ. Next we present the formulas to calculate them.

The term ∂y/∂λ is given by

∂y/∂λ = α − β.   (18)

The term ∂y/∂p is given by

∂y/∂p = (∂y/∂β)(∂β/∂p),   (19)

in which

∂y/∂β = 1 − λ,   (20)

∂β/∂p = x,   (21)

where x represents the input signal (the m-th input training pattern).

The term ∂y/∂θ is given by

∂y/∂θ = (∂y/∂α)(∂α/∂θ),   (22)

in which

∂y/∂α = λ,   (23)

∂α/∂θ = φ − ω.   (24)

The terms ∂φ/∂a and ∂ω/∂b are estimated using the concept of the smoothed rank indicator vector [Pessoa and Maragos 2000, Sousa 2000, de Sousa et al. 2000] (because the dilation and erosion operators can be seen as particular cases of the rank function), where we choose the smoothed unit sample function Q_σ(x) = [q_σ(x_1), q_σ(x_2), ..., q_σ(x_n)], in which

q_σ(x_i) = sech^2(x_i / σ),  i = 1, ..., n.   (25)

Note that the choice of the scale factor σ directly affects the estimation and interpolation of the gradients ∂φ/∂a and ∂ω/∂b. However, the learning process of the DELP model even works with σ → 0, since in this particular case the gradient will be given in terms of the usual rank indicator vector [Pessoa and Maragos 2000, Sousa 2000, de Sousa et al. 2000].

Therefore, the term ∂y/∂a is given by

∂y/∂a = (∂y/∂φ)(∂φ/∂a) = λ(∂α/∂φ)(∂φ/∂a),   (26)

in which

∂α/∂φ = θ,   (27)

∂φ/∂a = Q_σ(φ1 − (x + a)) / [Q_σ(φ1 − (x + a)) 1^T].   (28)

In the same way, the term ∂y/∂b is given by

∂y/∂b = (∂y/∂ω)(∂ω/∂b) = λ(∂α/∂ω)(∂ω/∂b),   (29)

in which

∂α/∂ω = 1 − θ,   (30)

∂ω/∂b = Q_σ(ω1 − (x + b)) / [Q_σ(ω1 − (x + b)) 1^T],   (31)

where 1 denotes an n-dimensional vector of ones.
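Continuing the same illustrative naming, a possible NumPy realization of Equations 18-31 is sketched below; q_sigma implements the smoothed unit sample function of Equation 25, and the structuring-element gradients use the normalized smoothed rank indicator of Equations 28 and 31.

```python
import numpy as np

def q_sigma(v, sigma):
    """Smoothed unit sample function applied element-wise, Eq. (25): sech^2(v / sigma)."""
    return 1.0 / np.cosh(v / sigma) ** 2

def grad_y(x, w, sigma=1.5):
    """Partial derivatives of the DELP output y w.r.t. each parameter, Eqs. (18)-(31)."""
    a, b, p = w['a'], w['b'], w['p']
    lam, theta = w['lam'], w['theta']
    beta = np.dot(x, p)
    phi = np.max(x + a)                          # dilation output, Eq. (4)
    omega = np.min(x + b)                        # erosion output, Eq. (5)
    alpha = theta * phi + (1.0 - theta) * omega  # Eq. (3)

    qa = q_sigma(phi - (x + a), sigma)           # smoothed rank indicator, Eq. (28)
    dphi_da = qa / np.sum(qa)
    qb = q_sigma(omega - (x + b), sigma)         # smoothed rank indicator, Eq. (31)
    domega_db = qb / np.sum(qb)

    return {
        'lam': alpha - beta,                     # Eq. (18)
        'p': (1.0 - lam) * x,                    # Eqs. (19)-(21)
        'theta': lam * (phi - omega),            # Eqs. (22)-(24)
        'a': lam * theta * dphi_da,              # Eqs. (26)-(28)
        'b': lam * (1.0 - theta) * domega_db,    # Eqs. (29)-(31)
    }
```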

3. Simulations and Experimental Results

The well-known Ripley's synthetic and Wisconsin breast cancer classification problems were used as a test bed for the evaluation of the proposed model. To assess the classification performance we use the percentage of misclassified patterns (PMP) [Sussner and Esmi 2011] metric. Also, we use the percentage gain (PG) metric, in terms of the PMP obtained using the DELP model and using the other models investigated in this work, which is given by

PG = (1 − PMP_delp / PMP_model) × 100,   (32)

where PMP_delp represents the PMP obtained using the proposed DELP model and PMP_model represents the PMP obtained using the investigated model.

It is worth mentioning that the data were normalized to lie within the range [0, 1] according to Prechelt [Prechelt 1994]. The entries of the DELP weight vectors a, b and p are randomly initialized within the range [−1, 1]. The initial DELP mixture coefficients λ and θ are randomly chosen in the interval [0, 1]. Based on exhaustive experiments to determine the best learning rate (µ) and scale factor (σ), we use µ = 0.01 and σ = 1.5. It is worth mentioning that the following stop conditions are used in the learning process: i) the maximum epoch number equals 10^4; ii) the decrease of the training process error (Pt) of the cost function equals 10^-6.
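A small sketch of this experimental setup, assuming plain min-max scaling for the [0, 1] normalization (the exact Prechelt procedure is not reproduced here) and NumPy's uniform sampler for the random initialization; the helper names are ours:

```python
import numpy as np

def minmax_normalize(X):
    """Rescale each feature of the data matrix X to the range [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)

def init_delp_weights(n, seed=0):
    """Random initialization of the DELP weight vector w = (a, b, p, lam, theta)."""
    rng = np.random.default_rng(seed)
    return {
        'a': rng.uniform(-1.0, 1.0, n),   # structuring element of the dilation
        'b': rng.uniform(-1.0, 1.0, n),   # structuring element of the erosion
        'p': rng.uniform(-1.0, 1.0, n),   # coefficients of the linear operator
        'lam': rng.uniform(0.0, 1.0),     # mixture between morphological and linear parts
        'theta': rng.uniform(0.0, 1.0),   # mixture between dilation and erosion
    }
```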

In order to establish a fair performance comparison, results with the following classification models were examined in the same context and under the same experimental conditions: multilayer perceptron (MLP) [Haykin 1998, Haykin 2007], morphological-rank-linear neural network (MRLNN) [Pessoa and Maragos 2000], morphological perceptron with competitive learning (MP/CL) [Sussner and Esmi 2011], single layer morphological perceptron (SLMP) [Sussner 1998b], fuzzy lattice neural network (FLNN) [Petridis and Kaburlasos 1998], fuzzy lattice reasoning (FLR) [Kaburlasos et al. 2007], k-nearest neighbors (KNN) [Devroye et al. 1996], decision tree (DT) [Breiman et al. 1984, Esposito et al. 1997] and support vector machine (SVM) [Haykin 2007].

In all experiments we used the MLP model with sigmoidal processing units and a single hidden layer. For its learning process we used the Levenberg-Marquardt algorithm [Hagan and Menhaj 1994] with the following stopping criteria [Prechelt 1994]: i) the maximum epoch number equals 10^4; ii) the decrease of the training process error (Pt) of the cost function falling below a fixed threshold. Also, for the MRLNN model we used the same parameters suggested by [Pessoa and Maragos 2000] with a single hidden layer; for its learning process we used the generalized back propagation (GBP) algorithm [Pessoa and Maragos 2000] with a learning rate equal to 0.01 and a fixed scale factor, using the same stopping criteria as the MLP model. It is worth mentioning that for both the MLP and MRLNN models, we applied 10-fold cross validation to determine the number of hidden processing units (5, 10, 15, 20, 25 or 50).

For the MP/CL model we used the same design process and parameter definitions suggested by [Sussner and Esmi 2011]. For the SLMP model we used the same design process and parameter definitions suggested by [Sussner 1998b]. For the FLNN model we used the same design process and parameter definitions suggested by [Petridis and Kaburlasos 1998, Sussner and Esmi 2011]. For the FLR model we used the same design process and parameter definitions suggested by [Kaburlasos et al. 2007, Sussner and Esmi 2011]. For the KNN model we used 10-fold cross-validation to determine the best value of k (1, 2, ..., 20) in terms of the mean error rate on the validation set, as suggested by [Sussner and Esmi 2011]. For the DT model we used Gini's diversity index as the criterion for choosing a split, as suggested by [Sussner and Esmi 2011]. Finally, for the SVM model we used linear (SVM-L), polynomial (SVM-P), quadratic (SVM-Q) and RBF (SVM-RBF) kernels with the least squares method to find the separating hyperplane, as defined in [Haykin 2007].

3.1. Ripley's Synthetic Problem

The Ripley's synthetic problem [Ripley 1996] consists of samples from two classes. Each sample has a 2-dimensional feature vector. The data are divided into training and test sets. The training set consists of 250 samples, while the test set consists of 1000 samples. It is worth mentioning that, for both the training and test sets, we have the same number of samples belonging to each of the two classes, characterizing a balanced binary classification problem in R^2.

Table 1 presents the experimental results on the test set obtained by the models presented in the literature, as well as those achieved by the proposed DELP model. According to Table 1, it is possible to notice that the best model found in the literature is the SVM-RBF (with PMP = 8.30%). However, a slightly inferior classification performance can be achieved using the SVM-P, MLP, SVM-Q, MRLNN, KNN and MP/CL models.
It is worth mentioning that the proposed DELP model obtained a good classification performance, having the same PMP value as the best model found in the literature.

Table 1. Percentage of misclassified patterns of the test set for the Ripley's synthetic problem.

Model     PMP (%)
MLP       9.30
MRLNN     9.50
MP/CL
SLMP
FLNN
FLR
KNN       9.60
DT
SVM-L
SVM-P     9.10
SVM-Q     9.40
SVM-RBF   8.30
DELP      8.30

Table 2 presents the PG (test set), in terms of the PMP obtained using the DELP model and using the other models investigated in this work.

Table 2. Percentage gain (test set) of the proposed DELP model regarding the MLP, MRLNN, MP/CL, SLMP, FLNN, FLR, KNN, DT, SVM-L, SVM-P, SVM-Q and SVM-RBF models for the Ripley's synthetic problem.

                  PG (%)
DELP / MLP
DELP / MRLNN
DELP / MP/CL
DELP / SLMP
DELP / FLNN
DELP / FLR
DELP / KNN
DELP / DT
DELP / SVM-L
DELP / SVM-P      8.79
DELP / SVM-Q
DELP / SVM-RBF    0.00

According to Table 2, and setting aside the result obtained with SVM-RBF (where the proposed DELP model achieved the same classification performance), it is possible to notice that the proposed DELP model obtained an improvement greater than 8% over the results achieved using the MLP, MRLNN, MP/CL, SLMP, FLNN, FLR, KNN, DT, SVM-L, SVM-P and SVM-Q models.
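As a quick numerical check of Equation 32, the two gains reported in Table 2 can be recovered from the corresponding PMP entries in Table 1 (the helper name percentage_gain is ours):

```python
def percentage_gain(pmp_delp, pmp_model):
    """Percentage gain of the DELP over a reference model, Eq. (32)."""
    return (1.0 - pmp_delp / pmp_model) * 100.0

print(round(percentage_gain(8.30, 9.10), 2))   # DELP vs. SVM-P   -> 8.79
print(round(percentage_gain(8.30, 8.30), 2))   # DELP vs. SVM-RBF -> 0.0
```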

The decision surface generated by the proposed DELP model for the Ripley's synthetic problem is depicted in Figure 1.

Figure 1. Decision surface produced by the proposed DELP model for the Ripley's synthetic problem.

3.2. Wisconsin Breast Cancer Problem

The Wisconsin breast cancer problem [Asuncion and Newman 2007] consists of samples from two classes, representing malignant and benign breast cancer. The data are divided into training and test sets, where we used the same partitioning scheme suggested by [Sussner and Esmi 2011] (the first 249 samples of the benign class and the first 148 samples of the malignant class are used in the training set, and the rest of the samples from both classes are used in the test set). Each sample has a 30-dimensional feature vector.

Table 3 presents the experimental results on the test set obtained by the models presented in the literature, as well as those achieved by the proposed DELP model. According to Table 3, we can verify that the best models found in the literature are the SVM-L and SVM-Q (having the same PMP = 1.75%). However, a slightly inferior classification performance can be found using the MRLNN, SVM-RBF, FLR, MP/CL and MLP models. It is possible to notice that the proposed DELP model obtained a good classification performance (with PMP = 1.40%), overcoming the best models found in the literature.

Table 4 presents the PG (test set), in terms of the PMP obtained using the DELP model regarding the PMP obtained using the other models investigated in this work. According to Table 4, we can see that the proposed DELP model obtained an improvement greater than 20% over the results achieved using the MLP, MRLNN, MP/CL, SLMP, FLNN, FLR, KNN, DT, SVM-L, SVM-Q, SVM-P and SVM-RBF models.

Table 3. Percentage of misclassified patterns of the test set for the Wisconsin breast cancer problem.

Model     PMP (%)
MLP       4.55
MRLNN     2.10
MP/CL     4.20
SLMP
FLNN      5.59
FLR       3.50
KNN       5.94
DT        8.74
SVM-L     1.75
SVM-P
SVM-Q     1.75
SVM-RBF   3.15
DELP      1.40

Table 4. Percentage gain (test set) of the proposed DELP model regarding the MLP, MRLNN, MP/CL, SLMP, FLNN, FLR, KNN, DT, SVM-L, SVM-P, SVM-Q and SVM-RBF models for the Wisconsin breast cancer problem.

                  PG (%)
DELP / MLP
DELP / MRLNN
DELP / MP/CL
DELP / SLMP
DELP / FLNN
DELP / FLR
DELP / KNN
DELP / DT
DELP / SVM-L
DELP / SVM-P
DELP / SVM-Q
DELP / SVM-RBF

4. Conclusion

In this paper, a hybrid artificial neuron was presented for dealing with synthetic and real-world binary classification problems. The proposed model, called the dilation-erosion-linear perceptron (DELP), consists of a linear combination of nonlinear morphological operators under the context of lattice theory and a linear operator. For the DELP design (learning process) we presented a gradient steepest descent method based on ideas from the back propagation (BP) algorithm, using a systematic approach to overcome the problem of nondifferentiability of morphological operations.

The classification performance of the proposed DELP model was assessed against well-known models presented in the literature (MLP, MRLNN, MP/CL, SLMP, FLNN, FLR, KNN, DT, SVM-L, SVM-P, SVM-Q and SVM-RBF) using the percentage of misclassified patterns metric. Besides, two relevant binary classification problems were investigated in this work: Ripley's Synthetic and Wisconsin Breast Cancer. The experimental results demonstrated similar performance (for the Ripley's problem) and better performance (for the Wisconsin Breast Cancer problem) of the proposed DELP model in comparison to the best models found in the literature.

In other words, the proposed DELP model succeeded in solving the aforementioned classification problems, exhibiting very satisfactory classification results.

Further studies must be developed to better formalize and explain the properties of the proposed DELP model and to determine its possible limitations on other binary classification problems. Further studies, in terms of convergence analysis, must also be done on the learning process of the DELP model. Finally, a particular study of the computational complexity and CPU time of the proposed DELP model must be done in order to establish a complete cost-performance evaluation of the proposed model. According to this investigation, it will be possible to relate, in terms of cost, the necessary time to generate an optimal model.

References

Asuncion, A. and Newman, D. J. (2007). UCI machine learning repository.

Breiman, L., Friedman, J., Olshen, R., and Stone, C. (1984). Classification and Regression Trees. Wadsworth and Brooks, Monterey, CA.

de A. Araújo, R., Madeiro, F., de Sousa, R. P., and Pessoa, L. F. C. (2006a). Modular morphological neural network training via adaptive genetic algorithm for designing translation invariant operators. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, volume 2.

de A. Araújo, R., Madeiro, F., de Sousa, R. P., Pessoa, L. F. C., and Ferreira, T. A. E. (2006b). An evolutionary morphological approach for financial time series forecasting. In Proceedings of the IEEE Congress on Evolutionary Computation.

de A. Araújo, R., Madeiro, F., Ferreira, T. A. F., de Sousa, R. P., and Pessoa, L. F. C. (2006c). Improved evolutionary hybrid method for designing morphological operators. In Proceedings of the IEEE International Conference on Image Processing.

de Sousa, R. P., Carvalho, J. M., Assis, F. M., and Pessoa, L. F. C. (2000). Designing translation invariant operations via neural network training. In Proc. of the IEEE Intl Conference on Image Processing, Vancouver, Canada.

Devroye, L., Gyorfi, L., and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Springer.

Esposito, F., Malerba, D., and Semeraro, G. (1997). A comparative analysis of methods for pruning decision trees. IEEE Trans. Pattern Anal. Mach. Intell., 19(5).

Hagan, M. and Menhaj, M. (1994). Training feedforward networks with the Marquardt algorithm. IEEE Transactions on Neural Networks, 5(6).

Haykin, S. (1998). Neural Networks: A Comprehensive Foundation. Prentice Hall, New Jersey.

Haykin, S. (2007). Neural Networks and Learning Machines. McMaster University, Canada.

Heijmans, H. J. A. M. (1994). Morphological Image Operators. Academic Press, New York, NY.

Hocaoglu, A. K. and Gader, P. D. (2003). Domain learning using Choquet integral-based morphological shared weight neural networks. Image and Vision Computing, 21(7).

Kaburlasos, V. G., Athanasiadis, I. N., and Mitkas, P. A. (2007). Fuzzy lattice reasoning (FLR) classifier and its application for ambient ozone estimation. Int. J. Approx. Reasoning, 45(1).

Kaburlasos, V. G. and Petridis, V. (2000). Fuzzy lattice neurocomputing (FLN) models. Neural Networks, 13(10).

Khabou, M. A. and Gader, P. D. (2000). Automatic target detection using entropy optimized shared-weight neural networks. IEEE Transactions on Neural Networks, 11(1).

Maragos, P. (1989). A representation theory for morphological image and signal processing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11.

Pessoa, L. F. C. and Maragos, P. (2000). Neural networks with hybrid morphological/rank/linear nodes: a unifying framework with applications to handwritten character recognition. Pattern Recognition, 33.

Petridis, V. and Kaburlasos, V. G. (1998). Fuzzy lattice neural network (FLNN): a hybrid model for learning. IEEE Transactions on Neural Networks, 9(5).

Prechelt, L. (1994). Proben1: A set of neural network benchmark problems and benchmarking rules. Technical Report 21/94.

Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge, United Kingdom.

Ritter, G. X., Sussner, P., and de Leon, J. L. D. (1998). Morphological associative memories. IEEE Transactions on Neural Networks, 9(2).

Ritter, G. X., Sussner, P., and Hacker, W. B. (1997). Associative memories with infinite storage capacity. In InterSymp 97, 9th International Conference on Systems Research, Informatics and Cybernetics, Baden-Baden, Germany. Invited Plenary Paper.

Ritter, G. X. and Urcid, G. (2003). Lattice algebra approach to single-neuron computation. IEEE Transactions on Neural Networks, 14(2).

Silva, A. M. and Sussner, P. (2008). A brief review and comparison of feedforward morphological neural networks with applications to classification. In Proceedings of the International Conference on Artificial Neural Networks.

Sousa, R. P. (2000). Design of translation invariant operators via neural network training. PhD thesis, UFPB, Campina Grande, Brazil.

Sussner, P. (1998a). Kernels for morphological associative memories. In Proceedings of the International ICSA/IFAC Symposium on Neural Computation, pages 79-85, Vienna.

Sussner, P. (1998b). Morphological perceptron learning. In Proceedings of the IEEE International Symposium on Intelligent Control, Gaithersburg, MD.

Sussner, P. and Esmi, E. L. (2009a). Constructive morphological neural networks: some theoretical aspects and experimental results in classification. In Kacprzyk, J., editor, Constructive Neural Networks, Studies in Computational Intelligence. Springer Verlag, Heidelberg, Germany.

Sussner, P. and Esmi, E. L. (2009b). An introduction to morphological perceptrons with competitive learning. In Proceedings of the International Joint Conference on Neural Networks, Atlanta, GA.

Sussner, P. and Esmi, E. L. (2011). Morphological perceptrons with competitive learning: Lattice-theoretical framework and constructive learning algorithm. Information Sciences, 181(10).

Sussner, P. and Valle, M. E. (2006a). Grayscale morphological associative memories. IEEE Transactions on Neural Networks, 17(3).

Sussner, P. and Valle, M. E. (2006b). Implicative fuzzy associative memories. IEEE Transactions on Fuzzy Systems, 14(6).

Sussner, P. and Valle, M. E. (2007). Morphological and certain fuzzy morphological associative memories for classification and prediction. In Kaburlassos, V. G. and Ritter, G. X., editors, Computational Intelligence Based on Lattice Theory, volume 67. Springer Verlag, Heidelberg, Germany.

Valle, M. E. and Sussner, P. (2008). A general framework for fuzzy morphological associative memories. Fuzzy Sets and Systems, 159(7).
