An Adaptively Constructing Multilayer Feedforward Neural Networks Using Hermite Polynomials
L. Ma (Department of Applied Computer Science, Tokyo Polytechnic University, 1583 Iiyama, Atsugi, Kanagawa, Japan; maly@cs.t-kougei.ac.jp) and K. Khorasani (Department of Electrical and Computer Engineering, Concordia University, Montreal, Quebec, Canada H3G 1M8; kash@ece.concordia.ca)

Abstract. In this paper a new strategy is introduced for constructing a multi-hidden-layer feedforward neural network (FNN) in which each hidden unit employs a polynomial activation function that differs from those of the other units. The proposed scheme incorporates both a structure-level and a function-level adaptation methodology in constructing the desired network. The activation functions considered are the orthonormal Hermite polynomials. Using this strategy, an FNN can be constructed with as many hidden layers and hidden units as dictated by the complexity of the problem being considered.

Keywords: Adaptive structure neural networks; Hermite polynomials; Multilayer feedforward neural networks.

1 Introduction

The design of a feedforward neural network (FNN) requires consideration of three important issues, whose solutions significantly influence the overall performance of the neural network (NN) with respect to (i) the recognition rate on new patterns, and (ii) the generalization performance on data sets not presented during network training.

The first problem is the selection of data/patterns for network training. This problem has practical implications and has received limited attention, even though the selection of the training data set can have considerable effects on the performance of the trained network. The second problem is the selection of an appropriate and efficient training algorithm from the large number of training algorithms that have been developed in the literature.
Many new training algorithms with faster convergence properties and lower computational requirements are being developed by researchers in the NN community. The third problem is the determination of the network size. From a practical point of view this problem is more important than the above two, and it is generally more difficult to solve. The task is to find a network structure that is as small as possible while meeting certain desired performance specifications. What is usually done in practice is that the developer trains a number of networks of different sizes and then selects the smallest network that fulfills all or most of the performance requirements, which amounts to a tedious process of trial and error. The second and third problems are closely related to one another in the sense that different training algorithms are suitable for different NN topologies. The above three considerations are therefore critical whenever an NN is to be applied to a real-life problem.

This paper focuses on developing a systematic procedure for the automatic determination and/or adaptation of the network architecture of an FNN. Algorithms that can determine an appropriate network architecture automatically, according to the complexity of the underlying function embedded in the data set, are very cost-efficient and thus highly desirable. Efforts toward network size determination have been made in the literature for many years, and many techniques have been developed [3], [6] (and the references therein).

This research was supported in part by NSERC (the Natural Sciences and Engineering Research Council of Canada). D.-S. Huang et al. (Eds.): ICIC 2008, LNAI 5227, c Springer-Verlag Berlin Heidelberg 2008.

2 Hermite Polynomials

In this section, the Hermite polynomials and their hierarchical nonlinearities are first introduced. These polynomials will subsequently be used as the activation functions of the hidden units of the proposed constructive feedforward neural network (FNN).
The orthogonal Hermite polynomials are defined over the interval (-∞, +∞) of the input space and are given formally as follows [11] (see also the references therein):

H_0(x) = 1,   (1)
H_1(x) = 2x,   (2)
H_n(x) = 2x H_{n-1}(x) - 2(n-1) H_{n-2}(x),  n ≥ 2.   (3)

The definition of H_n(x) may alternatively be given by

H_n(x) = (-1)^n e^{x²} (d^n/dx^n) e^{-x²},  n > 0,  with H_0(x) = 1.   (4)

The polynomials given in (1)-(3) are orthogonal to each other but not orthonormal. The orthonormal Hermite polynomials may then be defined according to

h_n(x) = α_n H_n(x) φ(x),   (5)
where

α_n = (n!)^{-1/2} π^{1/4} 2^{-(n-1)/2},   (6)
φ(x) = (1/√(2π)) e^{-x²/2}.   (7)

The first-order derivative of h_n(x) is easily obtained by virtue of the recursive nature of the polynomials defined in (3), that is,

dh_n(x)/dx = (2n)^{1/2} h_{n-1}(x) - x h_n(x),  n ≥ 1,   (8)
dh_0(x)/dx = α_0 dφ(x)/dx = -x h_0(x),  n = 0.   (9)

In [11], the orthonormal Hermite polynomials are used as basis functions to model 1-D signals in the biomedical field for the purposes of signal analysis and detection. In [4], a selected combination of the orthonormal Hermite polynomials in a weighted-sum form is used as the fixed activation function of all the hidden units of the OHL-FNN. In [8], [9], Hermite coefficients are used as pre-processing filters, or serve as features of the process, which are then fed to (fuzzy) neural networks for classification problems. A feedforward neural network is designed in [10] that uses a Hermite function regression formula to approximate the hidden units' activation functions in order to obtain an improved generalization capability; in this FNN, a fixed number of Hermite polynomials is used.

It was indicated earlier that the Hermite polynomials are chosen due to their suitable properties. Other polynomials have similar properties, such as the Legendre, Tchebycheff, and Laguerre polynomials [5]. However, the input range of these polynomials does not meet the requirement for a hidden-unit activation function, which should accept values ranging from -∞ to +∞. If one uses a polynomial with a limited input range as an activation function for a hidden unit, one must then restrict the input-side weight space in which an optimal input-side weight vector is to be searched for under a given training criterion. Such a restriction of the input-side weight space is not desirable, as it would limit the representational capability of the network.
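As a concrete illustration, the recursion (1)-(3) and the normalization (5)-(7) can be sketched in a few lines of Python (the function names are ours, not from the paper); a quick numerical integration confirms the orthonormality:

```python
import math
import numpy as np

def hermite_poly(n, x):
    """Physicists' Hermite polynomial H_n(x) via the recursion (1)-(3)."""
    h_prev = np.ones_like(x)          # H_0(x) = 1
    if n == 0:
        return h_prev
    h = 2.0 * x                       # H_1(x) = 2x
    for k in range(2, n + 1):
        h_prev, h = h, 2.0 * x * h - 2.0 * (k - 1) * h_prev
    return h

def hermite_orthonormal(n, x):
    """Orthonormal Hermite function h_n(x) = alpha_n H_n(x) phi(x), eqs. (5)-(7)."""
    alpha = math.factorial(n) ** -0.5 * math.pi ** 0.25 * 2.0 ** (-(n - 1) / 2.0)
    phi = np.exp(-x ** 2 / 2.0) / math.sqrt(2.0 * math.pi)
    return alpha * hermite_poly(n, x) * phi

# numerical check of orthonormality on a fine grid (integrand decays fast)
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
print(np.sum(hermite_orthonormal(2, x) ** 2) * dx)                        # ≈ 1 (normalized)
print(np.sum(hermite_orthonormal(2, x) * hermite_orthonormal(3, x)) * dx)  # ≈ 0 (orthogonal)
```

Since the integrands decay like a Gaussian, a plain Riemann sum on a symmetric grid is accurate to many digits here.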
The justification and rationale for choosing the Hermite polynomials are therefore further motivated by their restriction-free input range. Alternatively, the above polynomials could still be considered as activation functions of the hidden units provided that the input to a hidden unit is normalized each time its input-side weight vector is updated. Although the weight vector is then unconstrained and only the input is normalized by a properly selected constant, this constant is itself a function of the weight vector. The input to the hidden unit during the input-side training phase is therefore no longer linear with respect to the weight vector. Consequently, as far as weight training is concerned, the input-side training now involves two types of nonlinearities: one arising from the input to the hidden unit, and the other due to the activation function of the hidden unit. The corresponding optimization problem is clearly more complicated than the one with only the nonlinearity of
the activation function. It is therefore our conclusion that the Hermite polynomials are expected to be the most suitable choice for the activation functions of the hidden units of an FNN.

3 The Proposed Incremental Constructive Training Algorithm

Consider a typical constructive one-hidden-layer feedforward neural network (OHL-FNN) with a linear output layer and a polynomial-type hidden layer, as shown in Figure 1. The output layer can also be nonlinear, as will be shown in our simulation results. The hidden unit whose input and output connections are denoted by dotted lines is the neuron currently being trained before it is allowed to join the other existing hidden units in the network. Suppose, without loss of generality, that a given regression problem has an M-dimensional input vector and a scalar output. The j-th input and output samples are denoted by X^j = (x^j_0, x^j_1, …, x^j_M) (with x^j_0 = 1 denoting the bias) and d^j (the target), respectively, where j = 1, 2, …, P (P is the number of training data samples). The OHL constructive algorithm starts from a null network without any hidden units. At any given point during the constructive learning process, suppose there are n-1 hidden units in the hidden layer and the n-th hidden unit is being trained before it is added to the existing network (see Figure 1). The output of the n-th hidden unit for the j-th training sample is given by

f_n(s^j_n) = h_{n-1}( Σ_{m=0}^{M} w_{n,m} x^j_m )   (10)

Fig. 1. Structure of a constructive OHL-FNN that utilizes the orthonormal Hermite polynomials as its activation functions for the hidden units
where h_n(·) denotes the n-th order orthonormal Hermite polynomial, whose derivative can be calculated recursively from equation (8), and s^j_n is the input to the n-th hidden unit, given by

s^j_n = Σ_{m=0}^{M} w_{n,m} x^j_m   (11)

where w_{n,0} is the bias weight of the node with its input x^j_0 = 1, and w_{n,m} (m ≠ 0) is the weight from the m-th component of the input vector to the n-th hidden node. The output of the network may now be expressed as follows:

y^j_{n-1} = v_0 + Σ_{i=1}^{n-1} v_i h_{i-1}(s^j_i)   (12)
          = v_0 + v_1 h_0(s^j_1) + v_2 h_1(s^j_2) + v_3 h_2(s^j_3) + … + v_{n-1} h_{n-2}(s^j_{n-1}),

where v_0 is the bias of the output node, and v_1, v_2, …, v_{n-1} are the weights from the hidden layer to the output node. Clearly, the above expression closely resembles a functional series expansion that uses the orthonormal Hermite polynomials as its basis functions, in which each additional term contributes to improving the accuracy of the function being approximated. In such a series expansion, the approximation error becomes smaller as more terms, having higher-order nonlinearities, are included. This situation is quite similar to that of an incrementally constructed OHL-FNN, in the sense that the error is expected to become smaller as new hidden units having hierarchically higher-order nonlinearities are successively added to the network. In fact, it has been shown that under certain circumstances incorporating new hidden units in an OHL-FNN monotonically decreases the training error [7]. In this sense, adding a new hidden unit to the network is somewhat equivalent to adding a higher-order term to a Hermite polynomial-based series expansion, and it is in this sense that the present network is envisaged to be more suitable for the constructive learning paradigm than a fixed-structure FNN.
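The series-expansion view above can be illustrated numerically: fitting a target with a least-squares combination of the orthonormal Hermite basis functions, the training error shrinks as higher-order terms are included. This is an illustrative sketch; the target function and term counts are our choices, not from the paper:

```python
import numpy as np

def hermite_fn(n, x):
    """Orthonormal Hermite function h_n(x) via a standard stable recursion
    (equivalent to eqs. (3), (5)-(7))."""
    h0 = np.pi ** -0.25 * np.exp(-x ** 2 / 2.0)
    if n == 0:
        return h0
    h1 = np.sqrt(2.0) * x * h0
    for k in range(2, n + 1):
        h0, h1 = h1, np.sqrt(2.0 / k) * x * h1 - np.sqrt((k - 1.0) / k) * h0
    return h1

x = np.linspace(-3.0, 3.0, 400)
target = np.sin(2.0 * x) * np.exp(-x ** 2 / 4.0)   # illustrative target function

errors = []
for n_terms in (2, 4, 8):
    # least-squares fit y ~ sum_k c_k h_k(x), mirroring the expansion in (12)
    B = np.column_stack([hermite_fn(k, x) for k in range(n_terms)])
    c, *_ = np.linalg.lstsq(B, target, rcond=None)
    errors.append(np.sum((B @ c - target) ** 2))

# the training error is non-increasing as higher-order terms join the expansion
assert errors[0] >= errors[1] >= errors[2]
```

Because each smaller basis is nested inside the larger one, the least-squares error is guaranteed not to increase as terms are added, which is the analogue of the monotone error decrease cited from [7].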
Note that only if s^j_1 = s^j_2 = … = s^j_{n-1} will y^j_{n-1} be an exact Hermite polynomial-based series expansion. Otherwise, as structured in the algorithm above, the expansion is only approximate, since the weights associated with each added neuron are adjusted separately, resulting in different inputs to the activation functions.

The problem addressed now is how to train the n-th hidden unit that is to be added to the active network. Many objective functions (see for example [2], [6] for more details) can be considered for the input-side training of this hidden unit. A simple but general cost function that has been shown in the literature to work quite well is as follows [2]:

J_input = | Σ_{j=1}^{P} (e^j_{n-1} - ē_{n-1})(f_n(s^j_n) - f̄_n) |   (13)
where

f_n(s^j_n) = h_{n-1}(s^j_n),   (14)
ē_{n-1} = (1/P) Σ_{j=1}^{P} e^j_{n-1},   (15)
f̄_n = (1/P) Σ_{j=1}^{P} h_{n-1}(s^j_n),   (16)
e^j_{n-1} = d^j - y^j_{n-1},   (17)

where f_n(s^j_n) (or h_{n-1}(s^j_n)) is the orthonormal Hermite polynomial of order n-1 used as the activation function of the n-th hidden unit, e^j_{n-1} is the network output error when the network has n-1 hidden units, and d^j is the j-th target output used for network training. The derivative of J_input with respect to the weight w_{n,i} is calculated as

∂J_input/∂w_{n,i} = sgn( Σ_{j=1}^{P} (e^j_{n-1} - ē_{n-1})(f_n(s^j_n) - f̄_n) ) Σ_{j=1}^{P} (e^j_{n-1} - ē_{n-1}) h'_{n-1}(s^j_n) x^j_i   (18)

where sgn(·) is the sign function. The first-order derivative h'_{n-1}(s^j_n) in the above expression is easily evaluated using the recursive expression (8) for n ≥ 2 and (9) for n = 1.

A candidate unit that maximizes the objective function (13) is incorporated into the network as the n-th hidden unit. Following this stage, the output-side training is performed by solving a least-squares (LS) problem, given that the output layer has a linear activation function (see Figure 1). Specifically, once the input-side training is accomplished, the network output y^j with n hidden units may be expressed as

y^j = Σ_{k=0}^{n} v_k f_k(s^j_k),  j = 1, 2, …, P   (19)

where v_k (k = 1, 2, …, n) are the output-side weights of the k-th hidden unit, and v_0 is the bias of the output unit with its input fixed to f_0 = 1. The corresponding output neuron tracking error is now given by

e^j = d^j - y^j = d^j - Σ_{k=0}^{n} v_k f_k(s^j_k),  j = 1, 2, …, P.   (20)
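The objective (13) and its gradient (18) can be sketched as follows. Note that since Σ_j (e^j - ē) = 0, centering f by f̄ contributes nothing to the derivative, which makes (18) exact; the sketch verifies this with a finite-difference check. Naming is ours; X is assumed to be the P × (M+1) input matrix including the bias column:

```python
import numpy as np

def hermite_fn(n, x):
    """Orthonormal Hermite function h_n(x) (stable recursion, cf. Sec. 2)."""
    h0 = np.pi ** -0.25 * np.exp(-x ** 2 / 2.0)
    if n == 0:
        return h0
    h1 = np.sqrt(2.0) * x * h0
    for k in range(2, n + 1):
        h0, h1 = h1, np.sqrt(2.0 / k) * x * h1 - np.sqrt((k - 1.0) / k) * h0
    return h1

def j_input(X, w, e, order):
    """Correlation objective (13) for a candidate unit h_order(s), s = X @ w."""
    f = hermite_fn(order, X @ w)
    return abs(np.sum((e - e.mean()) * (f - f.mean())))

def j_input_grad(X, w, e, order):
    """Gradient (18): sgn(corr) * sum_j (e_j - e_bar) h'_order(s_j) x_j."""
    s = X @ w
    f = hermite_fn(order, s)
    # derivative via (8)-(9): h_n' = sqrt(2n) h_{n-1} - x h_n (n >= 1), h_0' = -x h_0
    fp = np.sqrt(2.0 * order) * hermite_fn(order - 1, s) - s * f if order >= 1 else -s * f
    e_c = e - e.mean()
    sign = np.sign(np.sum(e_c * (f - f.mean())))
    return sign * (e_c * fp) @ X

# finite-difference check of the analytic gradient on random data
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(20), rng.normal(size=(20, 3))])
w, e = rng.normal(size=4), rng.normal(size=20)
g = j_input_grad(X, w, e, order=2)
num = np.array([(j_input(X, w + 1e-6 * d, e, 2) - j_input(X, w - 1e-6 * d, e, 2)) / 2e-6
                for d in np.eye(4)])
assert np.allclose(g, num, atol=1e-4)
```

A candidate unit would then follow this gradient uphill (the paper uses quickprop for this step), maximizing the correlation between its output and the current residual error.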
Subsequently, the output-side training is performed by solving the following LS problem, given that the output layer has a linear activation function:

J_output = (1/2) Σ_{j=1}^{P} (e^j)² = (1/2) Σ_{j=1}^{P} { d^j - Σ_{k=0}^{n} v_k f_k(s^j_k) }².

After performing the output-side training, a new error signal e^j is calculated for the next cycle of input-side and output-side training. Our proposed constructive FNN algorithm may now be summarized according to the following steps:

Step 1: Initialization of the network. Start the network training process with an OHL-FNN having one hidden unit. Set n = 1, e^j = d^j, and the activation function to h_0(·).

Step 2: Input-side training for the n-th hidden unit. Train only the input-side weights associated with the n-th hidden unit h_{n-1}(·). The input-side weights of the existing hidden units, if any, are all frozen. One candidate for the n-th hidden unit, with random initial weights, is trained using the objective function defined in (13), based on the quickprop algorithm [1]. If, instead, a pool of candidates is trained for the n-th unit, then the candidate that yields the maximum objective function is chosen as the n-th hidden unit to be added to the active network.

Step 3: Output-side training. Train the output-side weights of all the hidden units in the present network. When the output layer has a linear activation function, the output-side training may be performed by, for example, invoking the pseudo-inverse operation resulting from the least-squares optimization criterion, as indicated earlier. When the activation function of the output layer is nonlinear, the quasi-Newton algorithm is used for training.

Step 4: Network performance evaluation and training control. Evaluate the network performance by monitoring the fraction of variance unexplained (FVU) [7] on the training data set.
The FVU measure is defined according to

FVU = Σ_{j=1}^{P} (ĝ(x^j) - g(x^j))² / Σ_{j=1}^{P} (g(x^j) - ḡ)²   (21)

where g(·) is the function being implemented by the FNN, ĝ(·) is an estimate of g(·), i.e., the output of the trained network, and ḡ is the mean value of g(·). The network training is terminated when certain stopping conditions are satisfied. These conditions may be specified formally, for example, in terms of a prespecified FVU threshold or a maximum number of permissible hidden units, among others. If the stopping conditions are not satisfied, then the network output y^j and the network output error e^j = d^j - y^j are computed, n is increased by 1 (i.e., n = n + 1), and we proceed to Step 2.
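The stopping metric in Step 4 is straightforward to compute; a minimal sketch of (21) (the function name is ours):

```python
import numpy as np

def fvu(g_hat, g):
    """Fraction of variance unexplained, eq. (21): residual sum of squares
    divided by the total sum of squares of the target around its mean."""
    g_hat, g = np.asarray(g_hat, float), np.asarray(g, float)
    return np.sum((g_hat - g) ** 2) / np.sum((g - g.mean()) ** 2)

g = np.array([0.0, 1.0, 2.0, 3.0])
print(fvu(g, g))                      # 0.0: a perfect fit explains all variance
print(fvu(np.full(4, g.mean()), g))   # 1.0: predicting the mean explains none
```

FVU = 0 thus corresponds to a perfect fit and FVU = 1 to a trivial constant predictor, which is what makes a fixed FVU threshold a sensible stopping condition across problems of different scales.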
It is well known in the constructive OHL-FNN literature [7] that the determination of a proper initial network size is a problem that needs careful attention. This initialization may significantly influence the effectiveness and efficiency of the network training that follows. In our proposed scheme, however, the network training starts from the smallest possible architecture, that is, the null network. This is an important advantage of our proposed algorithm over similar methods previously presented in the literature.

4 Proposed Multi-Hidden-Layer FNN Strategy

Construction and design of multi-hidden-layer networks are generally more difficult than those of one-hidden-layer (OHL) networks, since one has to determine not only the depth of the network in terms of the number of hidden layers, but also the number of neurons in each layer. In this section, a new strategy for constructing a multi-hidden-layer FNN with regular connections is proposed. The new constructive algorithm has the following features and advantages: (1) As with OHL constructive algorithms, the number of hidden units in a given hidden layer is automatically determined by the algorithm. (2) The number of hidden layers is also determined automatically by the algorithm. (3) The algorithm adds hidden units and hidden layers one at a time, when and if they are needed. The input-side weights of all the hidden units in an installed hidden layer are fixed (frozen); only the input-side weights of the new hidden unit being added to the network are updated during the input-side training. A constructive OHL network may in general need to grow a large number of hidden units, or could even fail to represent a map that actually requires more than a single hidden layer.
Therefore, it is expected that our proposed algorithm, which incrementally expands its hidden layers and units automatically as a function of the complexity of the problem, converges faster and is computationally more efficient when compared to other constructive OHL networks. (4) Incentives are also introduced in the proposed algorithm to control and monitor the hidden layer creation process such that a network with fewer layers is constructed. This encourages the algorithm to solve problems using smaller networks, reducing the costs of network training and implementation.

4.1 Construction, Design and Training of the Proposed Algorithm

For the sake of simplicity, and without any loss of generality, suppose one desires to construct an FNN to solve a mapping or regression problem having a multi-dimensional input and a single scalar output. Extension to the case of a multi-dimensional output space is straightforward. It is also assumed that the output unit has a linear activation function for dealing with regression problems, although nonlinear activation functions have to be used
for pattern recognition problems. Specifically, the output-side training of the weights would then have to be performed by utilizing the Delta rule, instead of the LS solution that is used here when the activation function is linear.

Fig. 2. Multi-hidden-layer network construction process for the proposed strategy using Hermite polynomials for the nonlinear activation functions of the neurons: (a) initial linear network; (b) creation of the first hidden layer; (c) addition of hidden units to the first hidden layer; (d) creation of the second hidden layer and its units

The proposed constructive algorithm consists of the following steps:
Step 1: Initially train the network with a single output node and no hidden layers (refer to Figure 2(a)). The weights, including the bias of the output node, are trained according to an LS solution or some gradient-based algorithm. The training error FVU (as described below) is monitored; if it is not smaller than some prespecified threshold, one proceeds to Step 2a, otherwise one goes to Step 2c. The performance of the network is measured by the fraction of variance unexplained (FVU) [7], which is defined as

FVU = Σ_{j=1}^{P} (ĝ(x^j) - g(x^j))² / Σ_{j=1}^{P} (g(x^j) - ḡ)²   (22)

where g(·) is the function to be implemented by the FNN, ĝ(·) is the estimate of g(·) realized by the network, and ḡ is the mean value of g(·). The FVU is the ratio of the error variance to the variance of the function being analyzed by the network. Generally, the larger the variance of the function, the more difficult the regression analysis. The FVU may therefore be viewed as an error measure normalized by the complexity of the function; note that it is proportional to the mean square error (MSE). Furthermore, the function under study is likely to be contaminated by an additive noise ε. In this case the signal-to-noise ratio (SNR) is defined by

SNR = 10 log_{10} [ Σ_{j=1}^{P} (g(x^j) - ḡ)² / Σ_{j=1}^{P} (ε^j - ε̄)² ]   (23)

where ε̄ is the mean value of the additive noise, which is usually assumed to be zero.

Step 2a: Treat the output node as a hidden unit with unit output-side weight and zero bias (refer to Figure 2(b)). The input-side weights of this hidden unit are fixed as determined in the previous training step. The output error of the new linear output node will be identical to that of its hidden unit. Proceed to Step 2b.

Step 2b: New hidden units are added to the layer created in Step 2a one at a time, just as in a constructive OHL-FNN (refer to Figure 2(c)).
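The SNR in (23) is likewise a one-liner; a sketch under the same naming assumptions as before:

```python
import numpy as np

def snr_db(g, eps):
    """Signal-to-noise ratio, eq. (23): ratio of the target's variance (around
    its mean) to the noise variance (around its mean), in decibels."""
    g, eps = np.asarray(g, float), np.asarray(eps, float)
    return 10.0 * np.log10(np.sum((g - g.mean()) ** 2) / np.sum((eps - eps.mean()) ** 2))

g = np.array([0.0, 1.0, 2.0, 3.0])
noise = (g - g.mean()) / np.sqrt(10.0)   # noise power exactly 10x below signal
print(snr_db(g, noise))                   # 10.0 dB
```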
Each time a new hidden unit is added, the output error (FVU) is monitored to see whether it is smaller than a certain prescribed threshold value. If so, one proceeds to Step 2c. Otherwise, hidden units are added until the output error FVU stabilizes above the prescribed threshold. Once the error has stabilized, Step 2a is invoked again and a new hidden layer is introduced to possibly further reduce the output error (refer to Figure 2(d)). Steps 2a and 2b are
repeated as many times as necessary until the output error FVU is below the prescribed threshold, following which one proceeds to Step 2c. Input-side training of a new hidden unit's weights is performed by maximizing a correlation-based objective function, whereas the output-side training of the weights is performed through an LS-type algorithm.

Step 2c: The constructive training is terminated.

For implementing the above algorithm some further specifications are needed. First, a metric stop-ratio is introduced to determine whether the training error FVU has stabilized as a function of the added hidden units, namely

stop-ratio = [ (FVU(n-1) - FVU(n)) / FVU(n-1) ] × 100%   (24)

where FVU(n) denotes the training error FVU when the n-th hidden unit is added to a hidden layer of the network. Note that, according to [6], the addition of a proper hidden unit to an existing network should reduce the training error. The issue that needs addressing here is the selection of the magnitude of the stop-ratio. At the early stages of the constructive training this ratio may be large, but as more hidden units are added to the network the ratio will monotonically decrease. When it has become sufficiently small, say a few percent, the inclusion of further hidden units will contribute little to the reduction of the training error, and may in fact start to increase the generalization error, as the unnecessary hidden units may force the network to begin to memorize the data. Towards this end, when the stop-ratio is found to be smaller than a prespecified percentage successively for a given number of times (denoted by the metric stop-num), the algorithm treats the error FVU as stabilized, and hence proceeds to the next step to explore the possibility of further reducing the training error by extending the network with a new layer.
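The layer-creation test described above can be sketched as follows; the default threshold values here are illustrative, not values from the paper:

```python
def fvu_stabilized(fvu_history, stop_ratio=0.02, stop_num=3):
    """Return True when the relative FVU improvement (24) has stayed below
    stop_ratio for stop_num consecutive added hidden units, i.e. when the
    training error is treated as stabilized and a new layer may be created.
    stop_ratio=0.02 ("a few percent") and stop_num=3 are illustrative defaults."""
    if len(fvu_history) < stop_num + 1:
        return False
    ratios = [(fvu_history[i - 1] - fvu_history[i]) / fvu_history[i - 1]
              for i in range(1, len(fvu_history))]
    return all(r < stop_ratio for r in ratios[-stop_num:])

# early units give large improvements; later ones barely help
history = [1.0, 0.50, 0.30, 0.299, 0.298, 0.297]
print(fvu_stabilized(history))          # True: last three ratios are ~0.3% each
print(fvu_stabilized([1.0, 0.5, 0.3]))  # False: improvements are still large
```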
It should be noted that the stop-ratio and the stop-num are generally to be determined by trial and error, and one may utilize prior experience derived from simulation results to assign them.

5 Conclusions

In this paper, a new strategy for constructing a multi-hidden-layer FNN was presented. The proposed algorithm extends the constructive algorithm developed for adding hidden units to an OHL network by generating additional hidden layers. The network obtained by our proposed algorithm has regular connections, which facilitates hardware implementation. The constructed network adds new hidden units and layers incrementally, only when they are needed, based on monitoring the residual error that cannot be reduced any further by the existing network. We have also proposed a new type of constructive OHL-FNN that adaptively assigns appropriate orthonormal Hermite polynomials to its generated neurons. The network generally learns as effectively as, but generalizes much better than, conventional constructive OHL networks.
References

1. Fahlman, S.E.: An Empirical Study of Learning Speed in Back-Propagation Networks. Tech. Rep. CMU-CS, Carnegie Mellon University (1988)
2. Fahlman, S.E., Lebiere, C.: The Cascade-Correlation Learning Architecture. Tech. Rep. CMU-CS, Carnegie Mellon University (1991)
3. Hush, D.R., Horne, B.G.: Progress in Supervised Neural Networks. IEEE Signal Processing Magazine, 8-39 (January 1993)
4. Hwang, J.N., Lay, S.R., Maechler, M., Martin, R.D., Schimert, J.: Regression Modeling in Back-propagation and Projection Pursuit Learning. IEEE Trans. on Neural Networks 5 (1994)
5. Korn, G.A., Korn, T.M.: Mathematical Handbook for Scientists and Engineers, 2nd edn. McGraw-Hill, New York (1968)
6. Kwok, T.Y., Yeung, D.Y.: Constructive Algorithms for Structure Learning in Feedforward Neural Networks for Regression Problems. IEEE Trans. on Neural Networks 8(3) (1997)
7. Kwok, T.Y., Yeung, D.Y.: Objective Functions for Training New Hidden Units in Constructive Neural Networks. IEEE Trans. on Neural Networks 8(5) (1997)
8. Lau, B., Chao, T.H.: Aided Target Recognition Processing of MUDSS Sonar Data. Proceedings of the SPIE 3392 (1998)
9. Linh, T.H., Osowski, S., Stodolski, M.: On-line Heart Beat Recognition Using Hermite Polynomials and Neuro-fuzzy Network. In: Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC/2002), Anchorage, AK, USA, vol. 1 (2002)
10. Pilato, G., Sorbello, F., Vassallo, G.: Using the Hermite Regression Algorithm to Improve the Generalization Capability of a Neural Network. In: Neural Nets WIRN Vietri: Proceedings of the 11th Italian Workshop on Neural Nets, Salerno, Italy (1999)
11. Rasiah, A.I., Togneri, R., Attikiouzel, Y.: Modelling 1-D Signals Using Hermite Basis Functions. IEE Proc.-Vis. Image Signal Process. 144(6) (1997)
More informationPattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore
Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore Lecture - 27 Multilayer Feedforward Neural networks with Sigmoidal
More informationA Neuro-Fuzzy Scheme for Integrated Input Fuzzy Set Selection and Optimal Fuzzy Rule Generation for Classification
A Neuro-Fuzzy Scheme for Integrated Input Fuzzy Set Selection and Optimal Fuzzy Rule Generation for Classification Santanu Sen 1 and Tandra Pal 2 1 Tejas Networks India Ltd., Bangalore - 560078, India
More informationVC-dimension of a context-dependent perceptron
1 VC-dimension of a context-dependent perceptron Piotr Ciskowski Institute of Engineering Cybernetics, Wroc law University of Technology, Wybrzeże Wyspiańskiego 27, 50 370 Wroc law, Poland cis@vectra.ita.pwr.wroc.pl
More informationFEEDBACK GMDH-TYPE NEURAL NETWORK AND ITS APPLICATION TO MEDICAL IMAGE ANALYSIS OF LIVER CANCER. Tadashi Kondo and Junji Ueno
International Journal of Innovative Computing, Information and Control ICIC International c 2012 ISSN 1349-4198 Volume 8, Number 3(B), March 2012 pp. 2285 2300 FEEDBACK GMDH-TYPE NEURAL NETWORK AND ITS
More informationAn artificial neural networks (ANNs) model is a functional abstraction of the
CHAPER 3 3. Introduction An artificial neural networs (ANNs) model is a functional abstraction of the biological neural structures of the central nervous system. hey are composed of many simple and highly
More informationNeural Network to Control Output of Hidden Node According to Input Patterns
American Journal of Intelligent Systems 24, 4(5): 96-23 DOI:.5923/j.ajis.2445.2 Neural Network to Control Output of Hidden Node According to Input Patterns Takafumi Sasakawa, Jun Sawamoto 2,*, Hidekazu
More informationIntroduction to Machine Learning Spring 2018 Note Neural Networks
CS 189 Introduction to Machine Learning Spring 2018 Note 14 1 Neural Networks Neural networks are a class of compositional function approximators. They come in a variety of shapes and sizes. In this class,
More informationLearning and Memory in Neural Networks
Learning and Memory in Neural Networks Guy Billings, Neuroinformatics Doctoral Training Centre, The School of Informatics, The University of Edinburgh, UK. Neural networks consist of computational units
More informationNeural Networks Learning the network: Backprop , Fall 2018 Lecture 4
Neural Networks Learning the network: Backprop 11-785, Fall 2018 Lecture 4 1 Recap: The MLP can represent any function The MLP can be constructed to represent anything But how do we construct it? 2 Recap:
More informationUnit III. A Survey of Neural Network Model
Unit III A Survey of Neural Network Model 1 Single Layer Perceptron Perceptron the first adaptive network architecture was invented by Frank Rosenblatt in 1957. It can be used for the classification of
More informationNotes on Back Propagation in 4 Lines
Notes on Back Propagation in 4 Lines Lili Mou moull12@sei.pku.edu.cn March, 2015 Congratulations! You are reading the clearest explanation of forward and backward propagation I have ever seen. In this
More informationArtificial Neural Networks. MGS Lecture 2
Artificial Neural Networks MGS 2018 - Lecture 2 OVERVIEW Biological Neural Networks Cell Topology: Input, Output, and Hidden Layers Functional description Cost functions Training ANNs Back-Propagation
More informationIntroduction to Neural Networks: Structure and Training
Introduction to Neural Networks: Structure and Training Professor Q.J. Zhang Department of Electronics Carleton University, Ottawa, Canada www.doe.carleton.ca/~qjz, qjz@doe.carleton.ca A Quick Illustration
More informationy(n) Time Series Data
Recurrent SOM with Local Linear Models in Time Series Prediction Timo Koskela, Markus Varsta, Jukka Heikkonen, and Kimmo Kaski Helsinki University of Technology Laboratory of Computational Engineering
More informationSerious limitations of (single-layer) perceptrons: Cannot learn non-linearly separable tasks. Cannot approximate (learn) non-linear functions
BACK-PROPAGATION NETWORKS Serious limitations of (single-layer) perceptrons: Cannot learn non-linearly separable tasks Cannot approximate (learn) non-linear functions Difficult (if not impossible) to design
More information488 Solutions to the XOR Problem
488 Solutions to the XOR Problem Frans M. Coetzee * eoetzee@eee.emu.edu Department of Electrical Engineering Carnegie Mellon University Pittsburgh, PA 15213 Virginia L. Stonick ginny@eee.emu.edu Department
More information(Feed-Forward) Neural Networks Dr. Hajira Jabeen, Prof. Jens Lehmann
(Feed-Forward) Neural Networks 2016-12-06 Dr. Hajira Jabeen, Prof. Jens Lehmann Outline In the previous lectures we have learned about tensors and factorization methods. RESCAL is a bilinear model for
More informationA Modified Incremental Principal Component Analysis for On-Line Learning of Feature Space and Classifier
A Modified Incremental Principal Component Analysis for On-Line Learning of Feature Space and Classifier Seiichi Ozawa 1, Shaoning Pang 2, and Nikola Kasabov 2 1 Graduate School of Science and Technology,
More informationARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD
ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD WHAT IS A NEURAL NETWORK? The simplest definition of a neural network, more properly referred to as an 'artificial' neural network (ANN), is provided
More informationIdentification of Nonlinear Systems Using Neural Networks and Polynomial Models
A.Janczak Identification of Nonlinear Systems Using Neural Networks and Polynomial Models A Block-Oriented Approach With 79 Figures and 22 Tables Springer Contents Symbols and notation XI 1 Introduction
More informationFrom perceptrons to word embeddings. Simon Šuster University of Groningen
From perceptrons to word embeddings Simon Šuster University of Groningen Outline A basic computational unit Weighting some input to produce an output: classification Perceptron Classify tweets Written
More informationIntroduction to Natural Computation. Lecture 9. Multilayer Perceptrons and Backpropagation. Peter Lewis
Introduction to Natural Computation Lecture 9 Multilayer Perceptrons and Backpropagation Peter Lewis 1 / 25 Overview of the Lecture Why multilayer perceptrons? Some applications of multilayer perceptrons.
More informationCS 6501: Deep Learning for Computer Graphics. Basics of Neural Networks. Connelly Barnes
CS 6501: Deep Learning for Computer Graphics Basics of Neural Networks Connelly Barnes Overview Simple neural networks Perceptron Feedforward neural networks Multilayer perceptron and properties Autoencoders
More informationRadial Basis Function Networks. Ravi Kaushik Project 1 CSC Neural Networks and Pattern Recognition
Radial Basis Function Networks Ravi Kaushik Project 1 CSC 84010 Neural Networks and Pattern Recognition History Radial Basis Function (RBF) emerged in late 1980 s as a variant of artificial neural network.
More informationKeywords- Source coding, Huffman encoding, Artificial neural network, Multilayer perceptron, Backpropagation algorithm
Volume 4, Issue 5, May 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Huffman Encoding
More information22c145-Fall 01: Neural Networks. Neural Networks. Readings: Chapter 19 of Russell & Norvig. Cesare Tinelli 1
Neural Networks Readings: Chapter 19 of Russell & Norvig. Cesare Tinelli 1 Brains as Computational Devices Brains advantages with respect to digital computers: Massively parallel Fault-tolerant Reliable
More informationChapter 3 Supervised learning:
Chapter 3 Supervised learning: Multilayer Networks I Backpropagation Learning Architecture: Feedforward network of at least one layer of non-linear hidden nodes, e.g., # of layers L 2 (not counting the
More informationDeep Feedforward Networks
Deep Feedforward Networks Liu Yang March 30, 2017 Liu Yang Short title March 30, 2017 1 / 24 Overview 1 Background A general introduction Example 2 Gradient based learning Cost functions Output Units 3
More informationMultiple Similarities Based Kernel Subspace Learning for Image Classification
Multiple Similarities Based Kernel Subspace Learning for Image Classification Wang Yan, Qingshan Liu, Hanqing Lu, and Songde Ma National Laboratory of Pattern Recognition, Institute of Automation, Chinese
More informationComments. Assignment 3 code released. Thought questions 3 due this week. Mini-project: hopefully you have started. implement classification algorithms
Neural networks Comments Assignment 3 code released implement classification algorithms use kernels for census dataset Thought questions 3 due this week Mini-project: hopefully you have started 2 Example:
More informationNovel determination of dierential-equation solutions: universal approximation method
Journal of Computational and Applied Mathematics 146 (2002) 443 457 www.elsevier.com/locate/cam Novel determination of dierential-equation solutions: universal approximation method Thananchai Leephakpreeda
More informationMultilayer Perceptrons (MLPs)
CSE 5526: Introduction to Neural Networks Multilayer Perceptrons (MLPs) 1 Motivation Multilayer networks are more powerful than singlelayer nets Example: XOR problem x 2 1 AND x o x 1 x 2 +1-1 o x x 1-1
More informationNeural Networks for Protein Structure Prediction Brown, JMB CS 466 Saurabh Sinha
Neural Networks for Protein Structure Prediction Brown, JMB 1999 CS 466 Saurabh Sinha Outline Goal is to predict secondary structure of a protein from its sequence Artificial Neural Network used for this
More informationECE521 Lectures 9 Fully Connected Neural Networks
ECE521 Lectures 9 Fully Connected Neural Networks Outline Multi-class classification Learning multi-layer neural networks 2 Measuring distance in probability space We learnt that the squared L2 distance
More informationArtificial Neural Networks
Introduction ANN in Action Final Observations Application: Poverty Detection Artificial Neural Networks Alvaro J. Riascos Villegas University of los Andes and Quantil July 6 2018 Artificial Neural Networks
More informationMLPR: Logistic Regression and Neural Networks
MLPR: Logistic Regression and Neural Networks Machine Learning and Pattern Recognition Amos Storkey Amos Storkey MLPR: Logistic Regression and Neural Networks 1/28 Outline 1 Logistic Regression 2 Multi-layer
More informationOutline. MLPR: Logistic Regression and Neural Networks Machine Learning and Pattern Recognition. Which is the correct model? Recap.
Outline MLPR: and Neural Networks Machine Learning and Pattern Recognition 2 Amos Storkey Amos Storkey MLPR: and Neural Networks /28 Recap Amos Storkey MLPR: and Neural Networks 2/28 Which is the correct
More informationOn Learning µ-perceptron Networks with Binary Weights
On Learning µ-perceptron Networks with Binary Weights Mostefa Golea Ottawa-Carleton Institute for Physics University of Ottawa Ottawa, Ont., Canada K1N 6N5 050287@acadvm1.uottawa.ca Mario Marchand Ottawa-Carleton
More informationReading Group on Deep Learning Session 1
Reading Group on Deep Learning Session 1 Stephane Lathuiliere & Pablo Mesejo 2 June 2016 1/31 Contents Introduction to Artificial Neural Networks to understand, and to be able to efficiently use, the popular
More informationPATTERN CLASSIFICATION
PATTERN CLASSIFICATION Second Edition Richard O. Duda Peter E. Hart David G. Stork A Wiley-lnterscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane Singapore Toronto CONTENTS
More informationCh.6 Deep Feedforward Networks (2/3)
Ch.6 Deep Feedforward Networks (2/3) 16. 10. 17. (Mon.) System Software Lab., Dept. of Mechanical & Information Eng. Woonggy Kim 1 Contents 6.3. Hidden Units 6.3.1. Rectified Linear Units and Their Generalizations
More informationEEE 241: Linear Systems
EEE 4: Linear Systems Summary # 3: Introduction to artificial neural networks DISTRIBUTED REPRESENTATION An ANN consists of simple processing units communicating with each other. The basic elements of
More informationComputational Graphs, and Backpropagation
Chapter 1 Computational Graphs, and Backpropagation (Course notes for NLP by Michael Collins, Columbia University) 1.1 Introduction We now describe the backpropagation algorithm for calculation of derivatives
More informationAutomatic Structure and Parameter Training Methods for Modeling of Mechanical System by Recurrent Neural Networks
Automatic Structure and Parameter Training Methods for Modeling of Mechanical System by Recurrent Neural Networks C. James Li and Tung-Yung Huang Department of Mechanical Engineering, Aeronautical Engineering
More informationCSC321 Lecture 2: Linear Regression
CSC32 Lecture 2: Linear Regression Roger Grosse Roger Grosse CSC32 Lecture 2: Linear Regression / 26 Overview First learning algorithm of the course: linear regression Task: predict scalar-valued targets,
More information100 inference steps doesn't seem like enough. Many neuron-like threshold switching units. Many weighted interconnections among units
Connectionist Models Consider humans: Neuron switching time ~ :001 second Number of neurons ~ 10 10 Connections per neuron ~ 10 4 5 Scene recognition time ~ :1 second 100 inference steps doesn't seem like
More informationRecursive Least Squares for an Entropy Regularized MSE Cost Function
Recursive Least Squares for an Entropy Regularized MSE Cost Function Deniz Erdogmus, Yadunandana N. Rao, Jose C. Principe Oscar Fontenla-Romero, Amparo Alonso-Betanzos Electrical Eng. Dept., University
More informationA summary of Deep Learning without Poor Local Minima
A summary of Deep Learning without Poor Local Minima by Kenji Kawaguchi MIT oral presentation at NIPS 2016 Learning Supervised (or Predictive) learning Learn a mapping from inputs x to outputs y, given
More informationClassifier s Complexity Control while Training Multilayer Perceptrons
Classifier s Complexity Control while Training Multilayer Perceptrons âdu QDVRaudys Institute of Mathematics and Informatics Akademijos 4, Vilnius 2600, Lithuania e-mail: raudys@das.mii.lt Abstract. We
More informationArtificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011!
Artificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011! 1 Todayʼs lecture" How the brain works (!)! Artificial neural networks! Perceptrons! Multilayer feed-forward networks! Error
More informationRegML 2018 Class 8 Deep learning
RegML 2018 Class 8 Deep learning Lorenzo Rosasco UNIGE-MIT-IIT June 18, 2018 Supervised vs unsupervised learning? So far we have been thinking of learning schemes made in two steps f(x) = w, Φ(x) F, x
More informationNeural Networks. Nicholas Ruozzi University of Texas at Dallas
Neural Networks Nicholas Ruozzi University of Texas at Dallas Handwritten Digit Recognition Given a collection of handwritten digits and their corresponding labels, we d like to be able to correctly classify
More informationUsing the Delta Test for Variable Selection
Using the Delta Test for Variable Selection Emil Eirola 1, Elia Liitiäinen 1, Amaury Lendasse 1, Francesco Corona 1, and Michel Verleysen 2 1 Helsinki University of Technology Lab. of Computer and Information
More informationA Logarithmic Neural Network Architecture for Unbounded Non-Linear Function Approximation
1 Introduction A Logarithmic Neural Network Architecture for Unbounded Non-Linear Function Approximation J Wesley Hines Nuclear Engineering Department The University of Tennessee Knoxville, Tennessee,
More informationNeural Networks Lecture 3:Multi-Layer Perceptron
Neural Networks Lecture 3:Multi-Layer Perceptron H.A Talebi Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Winter 2011 H. A. Talebi, Farzaneh Abdollahi Neural
More informationOptimal In-Place Self-Organization for Cortical Development: Limited Cells, Sparse Coding and Cortical Topography
Optimal In-Place Self-Organization for Cortical Development: Limited Cells, Sparse Coding and Cortical Topography Juyang Weng and Matthew D. Luciw Department of Computer Science and Engineering Michigan
More informationUnit 8: Introduction to neural networks. Perceptrons
Unit 8: Introduction to neural networks. Perceptrons D. Balbontín Noval F. J. Martín Mateos J. L. Ruiz Reina A. Riscos Núñez Departamento de Ciencias de la Computación e Inteligencia Artificial Universidad
More informationARTIFICIAL NEURAL NETWORKS گروه مطالعاتي 17 بهار 92
ARTIFICIAL NEURAL NETWORKS گروه مطالعاتي 17 بهار 92 BIOLOGICAL INSPIRATIONS Some numbers The human brain contains about 10 billion nerve cells (neurons) Each neuron is connected to the others through 10000
More informationApplication of Artificial Neural Networks in Evaluation and Identification of Electrical Loss in Transformers According to the Energy Consumption
Application of Artificial Neural Networks in Evaluation and Identification of Electrical Loss in Transformers According to the Energy Consumption ANDRÉ NUNES DE SOUZA, JOSÉ ALFREDO C. ULSON, IVAN NUNES
More informationEM-algorithm for Training of State-space Models with Application to Time Series Prediction
EM-algorithm for Training of State-space Models with Application to Time Series Prediction Elia Liitiäinen, Nima Reyhani and Amaury Lendasse Helsinki University of Technology - Neural Networks Research
More informationCSE446: Neural Networks Spring Many slides are adapted from Carlos Guestrin and Luke Zettlemoyer
CSE446: Neural Networks Spring 2017 Many slides are adapted from Carlos Guestrin and Luke Zettlemoyer Human Neurons Switching time ~ 0.001 second Number of neurons 10 10 Connections per neuron 10 4-5 Scene
More informationArtificial Neural Networks
Artificial Neural Networks 鮑興國 Ph.D. National Taiwan University of Science and Technology Outline Perceptrons Gradient descent Multi-layer networks Backpropagation Hidden layer representations Examples
More informationMultilayer Perceptrons and Backpropagation
Multilayer Perceptrons and Backpropagation Informatics 1 CG: Lecture 7 Chris Lucas School of Informatics University of Edinburgh January 31, 2017 (Slides adapted from Mirella Lapata s.) 1 / 33 Reading:
More informationCS:4420 Artificial Intelligence
CS:4420 Artificial Intelligence Spring 2018 Neural Networks Cesare Tinelli The University of Iowa Copyright 2004 18, Cesare Tinelli and Stuart Russell a a These notes were originally developed by Stuart
More informationNeural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /9/17
3/9/7 Neural Networks Emily Fox University of Washington March 0, 207 Slides adapted from Ali Farhadi (via Carlos Guestrin and Luke Zettlemoyer) Single-layer neural network 3/9/7 Perceptron as a neural
More informationA Novel Activity Detection Method
A Novel Activity Detection Method Gismy George P.G. Student, Department of ECE, Ilahia College of,muvattupuzha, Kerala, India ABSTRACT: This paper presents an approach for activity state recognition of
More informationChapter 2 Single Layer Feedforward Networks
Chapter 2 Single Layer Feedforward Networks By Rosenblatt (1962) Perceptrons For modeling visual perception (retina) A feedforward network of three layers of units: Sensory, Association, and Response Learning
More informationLecture 5: Logistic Regression. Neural Networks
Lecture 5: Logistic Regression. Neural Networks Logistic regression Comparison with generative models Feed-forward neural networks Backpropagation Tricks for training neural networks COMP-652, Lecture
More informationFrequency Selective Surface Design Based on Iterative Inversion of Neural Networks
J.N. Hwang, J.J. Choi, S. Oh, R.J. Marks II, "Query learning based on boundary search and gradient computation of trained multilayer perceptrons", Proceedings of the International Joint Conference on Neural
More informationDeep Learning and Information Theory
Deep Learning and Information Theory Bhumesh Kumar (13D070060) Alankar Kotwal (12D070010) November 21, 2016 Abstract T he machine learning revolution has recently led to the development of a new flurry
More informationIntelligent Modular Neural Network for Dynamic System Parameter Estimation
Intelligent Modular Neural Network for Dynamic System Parameter Estimation Andrzej Materka Technical University of Lodz, Institute of Electronics Stefanowskiego 18, 9-537 Lodz, Poland Abstract: A technique
More informationLinear Least-Squares Based Methods for Neural Networks Learning
Linear Least-Squares Based Methods for Neural Networks Learning Oscar Fontenla-Romero 1, Deniz Erdogmus 2, JC Principe 2, Amparo Alonso-Betanzos 1, and Enrique Castillo 3 1 Laboratory for Research and
More informationUnsupervised Discovery of Nonlinear Structure Using Contrastive Backpropagation
Cognitive Science 30 (2006) 725 731 Copyright 2006 Cognitive Science Society, Inc. All rights reserved. Unsupervised Discovery of Nonlinear Structure Using Contrastive Backpropagation Geoffrey Hinton,
More informationGMDH-type Neural Networks with a Feedback Loop and their Application to the Identification of Large-spatial Air Pollution Patterns.
GMDH-type Neural Networks with a Feedback Loop and their Application to the Identification of Large-spatial Air Pollution Patterns. Tadashi Kondo 1 and Abhijit S.Pandya 2 1 School of Medical Sci.,The Univ.of
More informationArtificial Neural Networks (ANN)
Artificial Neural Networks (ANN) Edmondo Trentin April 17, 2013 ANN: Definition The definition of ANN is given in 3.1 points. Indeed, an ANN is a machine that is completely specified once we define its:
More informationNeural Networks, Computation Graphs. CMSC 470 Marine Carpuat
Neural Networks, Computation Graphs CMSC 470 Marine Carpuat Binary Classification with a Multi-layer Perceptron φ A = 1 φ site = 1 φ located = 1 φ Maizuru = 1 φ, = 2 φ in = 1 φ Kyoto = 1 φ priest = 0 φ
More information