Multi-Layer Perceptrons for Functional Data Analysis: a Projection Based Approach
Brieuc Conan-Guez 1 and Fabrice Rossi 2,3

1 INRIA, Domaine de Voluceau, Rocquencourt, B.P. 105, 78153 Le Chesnay Cedex, France. Brieuc.Conan-Guez@inria.fr
2 LISE/CEREMADE, UMR CNRS 7534, Université Paris-IX Dauphine, Place du Maréchal de Lattre de Tassigny, Paris, France. Fabrice.Rossi@dauphine.fr
3 Up to date contact information for Fabrice Rossi is available at

Abstract. In this paper, we propose a new way to use Functional Multi-Layer Perceptrons (FMLP). In our previous work, we introduced a natural extension of Multi-Layer Perceptrons (MLP) to functional inputs based on direct manipulation of input functions. We propose here to rely on a representation of input and weight functions through projection on a truncated basis. We show that the proposed model has the universal approximation property and that parameter estimation for this model is consistent. The new model is compared to the previous one on simulated data: performances are comparable but training time is greatly reduced.

1 Introduction

In many practical applications, a multivariate representation of the studied objects is not a very satisfactory choice. If we study for instance the growth of a group of children, it is interesting to take into account the fact that the height of each child is a continuous function from a real interval (the time period of the study) to $\mathbb{R}$. Functional Data Analysis (FDA) [5] is an extension of traditional Data Analysis to individuals described by one or several regular real valued functions. The main difference between multivariate analysis and FDA is that the latter manipulates the functional representation of each individual and can therefore take into account the smoothness of the considered functions.
Moreover, when measurement points differ from one individual to another, a multivariate representation is quite difficult to use, because the features (or variables) then differ from one individual to another: direct comparison is no longer possible.

In [6, 7], we have proposed an extension of Multi-Layer Perceptrons (MLP) that works on functional inputs. Given an input $x \in \mathbb{R}^n$, a numerical neuron calculates an output given by $T(a \cdot x + b)$, where $T$ is an activation function from $\mathbb{R}$ to $\mathbb{R}$, $a$ is a vector in $\mathbb{R}^n$ and $b$ is a real number. The main idea of Functional MLP (FMLP) is to replace the linear form $x \mapsto a \cdot x$ by a continuous linear form defined on a functional space. This approach is quite different from what is commonly used in FDA, which is mostly based on linear methods. Moreover, we proposed a direct manipulation of functions, whereas FDA relies on a pre-processing phase in which each input function is projected on a set of smooth functions (for instance on a truncated B-spline basis). In the present paper, we show that FMLP can also be applied to pre-processed functions. The main advantage of such a solution is that computation time can be greatly reduced without impairing performances (for low dimensional input spaces). The price to pay is less flexibility in the way continuous linear forms are represented, and therefore reduced performances for high dimensional input spaces.

The paper is organized as follows. We first introduce the FMLP model and our new projection based approach. Then we give important theoretical results for the new model: universal approximation and consistency of parameter estimation. We conclude with some experiments that illustrate the differences between the direct and projection based approaches.

Footnote 0: Published in ICANN 2002 Proceedings. Available at

2 Functional MLP

2.1 Functional neurons

As explained in the introduction, the main idea of FMLP is to replace, in a neuron, a linear form on $\mathbb{R}^n$ by a continuous linear form defined on the input space of the neuron. This idea was proposed and studied from a theoretical point of view in [8]. Basically, a generalized neuron takes its input in $E$, a normed vector space. It is parametrized by a real number $b$ and a weight form, i.e., an element $w$ of $E^*$, the vector space of continuous linear forms on $E$. The output of the neuron is $T(w(x) + b)$. Of course, it is quite difficult to represent arbitrary continuous linear forms.
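For intuition only (this sketch is ours, not part of the paper): when $E$ is a space of functions on $[0, 1]$ and the weight form is integration against a kernel function $w$, the output $T(w(g) + b)$ can be approximated by numerical quadrature. The grid size and the particular choices of $T$, $w$ and $g$ below are arbitrary:

```python
import numpy as np

def functional_neuron(g, w, b, T=np.tanh, m=1000):
    """Approximate T(b + integral_0^1 w(x) g(x) dx) with the midpoint rule."""
    x = (np.arange(m) + 0.5) / m          # midpoint quadrature grid on [0, 1]
    return T(b + np.mean(w(x) * g(x)))    # mean = (1/m) * sum = Riemann sum

# Hypothetical example: w(x) = 2x and g(x) = 1, so the integral over [0, 1]
# is exactly 1 (the midpoint rule is exact for polynomials of degree <= 1).
out = functional_neuron(g=lambda x: np.ones_like(x), w=lambda x: 2 * x, b=0.0)
```

The next subsections make this restriction to integral linear forms precise.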
That is why we proposed in [7] to restrict ourselves to functional inputs. More precisely, we denote by $\mu$ a finite positive Borel measure on $\mathbb{R}^n$ (the rationale for using such a measure will be explained in section 3), and by $L^p(\mu)$ the space of measurable functions from $\mathbb{R}^n$ to $\mathbb{R}$ such that $\int |f|^p \, d\mu < \infty$. Then we can define a neuron that maps elements of $L^p(\mu)$ to $\mathbb{R}$ thanks to an activation function $T$, a numerical threshold $b$ and a weight function, i.e. a function $w \in L^q(\mu)$ (where $q$ is the conjugate exponent of $p$). Such a neuron maps an input function $g$ to $T(b + \int wg \, d\mu) \in \mathbb{R}$.

2.2 Functional MLP

As a functional neuron gives a numerical output, we can define a functional MLP by combining numerical neurons with functional neurons. The first hidden layer of the network consists exclusively of functional neurons, whereas subsequent layers are constructed exclusively with numerical neurons. For instance, a one hidden layer functional MLP with real output computes the following function:

$$H(g) = \sum_{i=1}^{k} a_i T\left(b_i + \int w_i g \, d\mu\right), \qquad (1)$$

where the $w_i$ are functions of the functional weight space.

3 Practical implementation

3.1 Two problems

Unfortunately, even if equation (1) uses simplified linear forms, we still have two practical problems: it is not easy to manipulate functions (i.e., weights), and we have only incomplete data (i.e., input functions are known only through finite sets of input/output pairs). Therefore, computing $\int wg \, d\mu$ is quite difficult.

3.2 Projection based solution

We have already proposed a direct solution to both problems in [6, 7]. The main drawback of the direct solution is that computation times can be quite long. In this paper, we propose a new solution based on the projected representation commonly used in FDA (see [5]). The main idea is to use a regularized representation of each input function. In this case we assume that the input space is $L^2(\mu)$ and we consider a Hilbertian basis $(\phi_i)_{i \in \mathbb{N}}$ of $L^2(\mu)$. We define $\Pi_n(g)$ as the projection of $g \in L^2(\mu)$ on the vector space spanned by $(\phi_1, \ldots, \phi_n)$ (denoted $\mathrm{span}(\phi_1, \ldots, \phi_n)$), i.e. $\Pi_n(g) = \sum_{i=1}^{n} \left(\int \phi_i g \, d\mu\right) \phi_i$. Rather than computing $H(g)$, we try to compute $H(\Pi_n(g))$. It is important to note that in the proposed approach we consider $\Pi_n(g)$ to be the effective input function; with the projection based approach, we therefore no longer focus on computing $H(g)$.

In order to represent the weight functions, we consider another Hilbertian basis of $L^2(\mu)$, $(\psi_i)_{i \in \mathbb{N}}$, and we choose weight functions in $\mathrm{span}(\psi_1, \ldots, \psi_p)$. For the weight function $w = \sum_{i=1}^{p} \alpha_i \psi_i$, we have:

$$\int w \, \Pi_n(g) \, d\mu = \sum_{i=1}^{n} \sum_{j=1}^{p} \left(\int \phi_i g \, d\mu\right) \alpha_j \int \phi_i \psi_j \, d\mu. \qquad (2)$$

For well chosen bases (e.g., B-splines and Fourier series), the $\int \phi_i \psi_j \, d\mu$ can be easily computed exactly (especially if we use the same basis for both input and weight functions!).
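As a toy illustration of equation (2) (our own sketch; the sizes, coefficients and the choice of an orthonormal basis are arbitrary assumptions): when the same orthonormal basis is used for inputs and weights, the precomputed cross-product matrix of the $\int \phi_i \psi_j \, d\mu$ is simply a truncated identity, and the functional neuron reduces to an ordinary weighted sum of projection coefficients:

```python
import numpy as np

n, p = 6, 4                     # basis sizes for inputs and weights
# Precomputed cross-products M[i, j] = integral(phi_i * psi_j dmu): with the
# same orthonormal basis on both sides, M is a truncated identity matrix.
M = np.eye(n)[:, :p]

g_coef = np.array([0.5, -1.0, 0.2, 0.0, 0.3, -0.7])  # coordinates of Pi_n(g)
alpha = np.array([1.0, 2.0, 0.0, -1.0])              # coordinates of w
b = 0.1                                              # threshold

integral = g_coef @ M @ alpha    # equation (2): integral of w * Pi_n(g) dmu
output = np.tanh(b + integral)   # functional neuron output T(b + ...)
```

Since M does not depend on the data, it is computed once and reused for every input function and every weight update.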
Moreover, these values do not depend on the actual weight and/or input functions. The efficiency of the method is deeply linked to this fact, which is a direct consequence of using truncated bases.

An FMLP based on this approach uses a finite number of real parameters, because each weight function is represented by its $p$ real coordinates in $\mathrm{span}(\psi_1, \ldots, \psi_p)$. Weights can be adjusted by gradient descent based algorithms, as the actual calculation done by a functional neuron is now a weighted sum of its inputs if we consider the finite dimensional input vector $\left(\int \phi_1 g \, d\mu, \ldots, \int \phi_n g \, d\mu\right)$. Back-propagation is very easily adapted to this model and allows efficient calculation of the gradient.

3.3 Approximation of the projection

Of course, $\Pi_n(g)$ cannot be computed exactly, because our knowledge of $g$ is limited to a finite set of input/output pairs $(x_i, g(x_i))$. We assume that the measurement points $x_i$ are realizations of independent identically distributed random variables $(X_l)_{l \in \mathbb{N}}$ (defined on a probability space $P$) and we denote by $P_X$ the probability measure induced on $\mathbb{R}^n$ by those random variables. This observation measure weights the measurement points, and it therefore seems natural to consider that $\mu = P_X$. Then we can define the random element $\hat{\Pi}_n(g)_m$ as the function $\sum_{i=1}^{n} \beta_i \phi_i$ that minimizes

$$\frac{1}{m} \sum_{l=1}^{m} \left( g(x_l) - \sum_{i=1}^{n} \beta_i \phi_i(x_l) \right)^2.$$

This is a straightforward generalized linear problem which can be solved easily and efficiently with standard techniques. Therefore, rather than computing $H(\Pi_n(g))$, we compute a realization of the random variable $H(\hat{\Pi}_n(g)_m)$. Following [1], we show in [4] that $\hat{\Pi}_n(g)_m$ converges almost surely to $\Pi_n(g)$ (in $L^2(\mu)$). This proof is based on an application of a theorem of [3].

4 Theoretical results

4.1 Universal approximation

Projection based FMLPs are universal approximators:

Corollary 1. We use the notations and hypotheses of subsection 3.2. Let $F$ be a continuous function from a compact subset $K$ of $L^2(\mu)$ to $\mathbb{R}$ and let $\epsilon$ be an arbitrary strictly positive real number. We use a continuous non polynomial activation function $T$. Then there are $n > 0$ and $p > 0$ and an FMLP $H$ (based on $T$ and using weight functions in $\mathrm{span}(\psi_1, \ldots, \psi_p)$) such that $|H(\Pi_n(g)) - F(g)| < \epsilon$ for each $g \in K$.

Proof. Details can be found in [4].
First we apply corollary 1 of [7] (this result is based on very general theorems from [8]) to find an FMLP $H$ based on $T$ that approximates $F$ with precision $\frac{\epsilon}{2}$. This is possible because $(\psi_i)_{i \in \mathbb{N}}$ is a Hilbertian basis of $L^2(\mu)$, which is its own dual. $p$ is obtained as the maximum number of basis functions used to represent the weight functions in the obtained FMLP. $H$ is continuous (because $T$ is continuous) and therefore uniformly continuous on $K$: there is $\eta > 0$ such that for each $(f, g) \in K^2$, $\|f - g\|_2 < \eta$ implies $|H(f) - H(g)| < \frac{\epsilon}{2}$. As $(\phi_i)_{i \in \mathbb{N}}$ is a basis and $K$ is compact, there is $n > 0$ such that for each $g \in K$, $\|\Pi_n(g) - g\|_2 < \eta$, which allows us to conclude.
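Returning to the least-squares problem of subsection 3.3 that defines $\hat{\Pi}_n(g)_m$: it is an ordinary linear fit. A minimal sketch (ours; the monomial basis, the sample size and the noise-free choice $g = \cos$ are illustrative assumptions, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
g = np.cos                            # "unknown" input function
x = rng.uniform(0.0, 1.0, size=200)   # random measurement points x_1, ..., x_m
y = g(x)                              # observed values g(x_l)

# Design matrix Phi[l, i] = phi_i(x_l) for the monomial basis phi_i(x) = x^i.
n = 5
Phi = np.vander(x, n, increasing=True)

# beta minimizes (1/m) * sum_l (g(x_l) - sum_i beta_i * phi_i(x_l))^2
beta, *_ = np.linalg.lstsq(Phi, y, rcond=None)

def projection(t):
    """Evaluate the fitted approximation of Pi_n(g) at points t."""
    return np.vander(np.atleast_1d(t), n, increasing=True) @ beta
```

Feeding the resulting coordinate vector to a standard MLP is then all that remains of the projection based FMLP evaluation.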
In the practical case, $\Pi_n(g)$ is replaced by $\hat{\Pi}_n(g)_m$, which does not introduce any problem thanks to the convergence result given in subsection 3.3.

4.2 Consistency

When we estimate optimal parameters for a functional MLP, our knowledge of the data is limited in two ways. As always in Data Analysis problems, we have a finite set of input/output pairs. Moreover (and this is specific to FDA), we also have only a limited knowledge of each input function. Therefore we cannot apply classical MLP consistency results (e.g., [9]) to our model. Nevertheless, we demonstrate a consistency result in [4], which is summarized here.

We assume that the input functions are realizations of a sequence of independent identically distributed random elements $(G^j)_{j \in \mathbb{N}}$ with values in $L^2(\mu)$, and we denote $G = G^1$. In regression or discrimination problems, each studied function $G^j$ is associated to a real value $Y^j$ ($(Y^j)_{j \in \mathbb{N}}$ is a sequence of independent identically distributed random variables and we denote $Y = Y^1$). Our goal is that the FMLP gives a correct mapping from $g$ to $y$. This is modeled thanks to a cost function $l$ (in fact we embed in $l$ both the cost function and the FMLP calculation). If we call $w$ the vector of all numerical parameters of an FMLP, we have to minimize $\lambda(w) = E(l(G, Y, w))$. As we have a finite number of functions, the theoretical cost is replaced by the random variable $\lambda_N(w) = \frac{1}{N} \sum_{j=1}^{N} l(G^j, Y^j, w)$. Moreover, $H(\Pi_n(g))$ is replaced by the random variable $H(\hat{\Pi}_n(g)_m)$; we therefore have to replace $l$ by an approximation, which gives the random variable $\hat{\lambda}_N^m(w) = \frac{1}{N} \sum_{j=1}^{N} \hat{l}_m(G^j, Y^j, w)$. We call $\hat{w}_N^m$ a minimizer of $\hat{\lambda}_N^m(w)$ and $W$ the set of minimizers of $\lambda(w)$. We show in [4] that $\lim_{N \to \infty} \lim_{m \to \infty} d(\hat{w}_N^m, W) = 0$.

5 Simulation result

We have tried our model on a simple discrimination problem: we ask a functional MLP to classify functions into two classes.
Examples of the first class are functions of the form $f_d(x) = \sin(2\pi(x - d))$, whereas functions of the second class have the general form $g_d(x) = \sin(4\pi(x - d))$. For each class we generate example input functions according to the following procedure:

1. $d$ is randomly chosen uniformly in $[0, 1]$;
2. measurement points are randomly chosen uniformly in $[0, 1]$;
3. we add a centered Gaussian noise with 0.7 standard deviation to the corresponding outputs.

Training is done with a conjugate gradient algorithm and uses early stopping: we used 100 training examples (50 of each class), 100 validation examples (50 of each class) and 300 test examples (150 of each class). We have compared on those data our direct approach ([6]) to the projection based approach. For both approaches we used an FMLP with three hidden
functional neurons. Each functional neuron uses 4 cubic B-splines to represent its weight function. For the projection based approach, input functions are represented with 6 cubic B-splines. Both models use a total of 19 numerical parameters. For the direct approach, we obtain a recognition rate of 94.4%, whereas the projection based approach achieves a 97% recognition rate. Moreover, each iteration of the training algorithm takes about 20 times longer for the direct approach than for the projection based approach. We have conducted other experiments (for instance the circle based one described in [6]) which give similar results.

When the dimension of the input space of each function increases, the generalized linear model used to represent input and weight functions becomes less efficient than a non linear representation ([2]). We therefore have to increase the number of parameters in order to remain competitive with the direct method, which can use numerical MLPs to represent weights.

6 Conclusion

We have proposed in this paper a new way to work with Functional Multi-Layer Perceptrons (FMLP). The projection based approach shares with the direct one very important theoretical properties: projection based FMLPs are universal approximators and parameter estimation is consistent for such models. The main advantage of projection based FMLPs is that computation time is greatly reduced compared to FMLPs based on direct manipulation of input functions, whereas performances remain comparable. Additional experiments are needed on input functions with higher dimensional input spaces (especially on real world data). For this kind of functions, the advantages of projection based FMLPs should be reduced by the fact that they cannot use traditional MLPs to represent weight and input functions. The direct approach does not have this limitation and should therefore use fewer parameters. Our goal is to establish practical guidelines for choosing between both approaches.

References

1. Christophe Abraham, Pierre-André Cornillon, Eric Matzner-Lober, and Nicolas Molinari. Unsupervised curve clustering using B-splines. Technical Report 00 04, ENSAM INRA UM II Montpellier, October 2000.
2. Andrew R. Barron. Universal approximation bounds for superpositions of a sigmoidal function. IEEE Transactions on Information Theory, 39(3):930-945, May 1993.
3. Helga Bunke and Olaf Bunke, editors. Nonlinear Regression, Functional Relations and Robust Methods, volume II of Series in Probability and Mathematical Statistics. Wiley, 1989.
4. Brieuc Conan-Guez and Fabrice Rossi. Projection based functional multi layer perceptrons. Technical report, LISE/CEREMADE & INRIA, dauphine.fr/, February 2002.
5. Jim Ramsay and Bernard Silverman. Functional Data Analysis. Springer Series in Statistics. Springer Verlag, June 1997.
6. Fabrice Rossi, Brieuc Conan-Guez, and François Fleuret. Functional data analysis with multi layer perceptrons. In Proceedings of IJCNN 2002/WCCI 2002, volume 3. IEEE/NNS/INNS, May 2002.
7. Fabrice Rossi, Brieuc Conan-Guez, and François Fleuret. Theoretical properties of functional multi layer perceptrons. In Proceedings of ESANN 2002, April 2002.
8. Maxwell B. Stinchcombe. Neural network approximation of continuous functionals and continuous functions on compactifications. Neural Networks, 12(3):467-477, 1999.
9. Halbert White. Learning in artificial neural networks: A statistical perspective. Neural Computation, 1(4):425-464, 1989.
More informationNN V: The generalized delta learning rule
NN V: The generalized delta learning rule We now focus on generalizing the delta learning rule for feedforward layered neural networks. The architecture of the two-layer network considered below is shown
More informationARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD
ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD WHAT IS A NEURAL NETWORK? The simplest definition of a neural network, more properly referred to as an 'artificial' neural network (ANN), is provided
More informationArtifical Neural Networks
Neural Networks Artifical Neural Networks Neural Networks Biological Neural Networks.................................. Artificial Neural Networks................................... 3 ANN Structure...........................................
More informationArtificial neural networks
Artificial neural networks Chapter 8, Section 7 Artificial Intelligence, spring 203, Peter Ljunglöf; based on AIMA Slides c Stuart Russel and Peter Norvig, 2004 Chapter 8, Section 7 Outline Brains Neural
More informationFeature Extraction with Weighted Samples Based on Independent Component Analysis
Feature Extraction with Weighted Samples Based on Independent Component Analysis Nojun Kwak Samsung Electronics, Suwon P.O. Box 105, Suwon-Si, Gyeonggi-Do, KOREA 442-742, nojunk@ieee.org, WWW home page:
More informationLECTURE NOTE #NEW 6 PROF. ALAN YUILLE
LECTURE NOTE #NEW 6 PROF. ALAN YUILLE 1. Introduction to Regression Now consider learning the conditional distribution p(y x). This is often easier than learning the likelihood function p(x y) and the
More informationEngineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 6: Multi-Layer Perceptrons I
Engineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 6: Multi-Layer Perceptrons I Phil Woodland: pcw@eng.cam.ac.uk Michaelmas 2012 Engineering Part IIB: Module 4F10 Introduction In
More informationNonlinear Support Vector Machines through Iterative Majorization and I-Splines
Nonlinear Support Vector Machines through Iterative Majorization and I-Splines P.J.F. Groenen G. Nalbantov J.C. Bioch July 9, 26 Econometric Institute Report EI 26-25 Abstract To minimize the primal support
More informationHilbert s 13th Problem Great Theorem; Shame about the Algorithm. Bill Moran
Hilbert s 13th Problem Great Theorem; Shame about the Algorithm Bill Moran Structure of Talk Solving Polynomial Equations Hilbert s 13th Problem Kolmogorov-Arnold Theorem Neural Networks Quadratic Equations
More informationECS171: Machine Learning
ECS171: Machine Learning Lecture 4: Optimization (LFD 3.3, SGD) Cho-Jui Hsieh UC Davis Jan 22, 2018 Gradient descent Optimization Goal: find the minimizer of a function min f (w) w For now we assume f
More informationIdentification of Nonlinear Systems Using Neural Networks and Polynomial Models
A.Janczak Identification of Nonlinear Systems Using Neural Networks and Polynomial Models A Block-Oriented Approach With 79 Figures and 22 Tables Springer Contents Symbols and notation XI 1 Introduction
More informationArtificial Neural Networks
Introduction ANN in Action Final Observations Application: Poverty Detection Artificial Neural Networks Alvaro J. Riascos Villegas University of los Andes and Quantil July 6 2018 Artificial Neural Networks
More informationWhat Do Neural Networks Do? MLP Lecture 3 Multi-layer networks 1
What Do Neural Networks Do? MLP Lecture 3 Multi-layer networks 1 Multi-layer networks Steve Renals Machine Learning Practical MLP Lecture 3 7 October 2015 MLP Lecture 3 Multi-layer networks 2 What Do Single
More informationADAPTIVE NEURO-FUZZY INFERENCE SYSTEMS
ADAPTIVE NEURO-FUZZY INFERENCE SYSTEMS RBFN and TS systems Equivalent if the following hold: Both RBFN and TS use same aggregation method for output (weighted sum or weighted average) Number of basis functions
More informationCombination of M-Estimators and Neural Network Model to Analyze Inside/Outside Bark Tree Diameters
Combination of M-Estimators and Neural Network Model to Analyze Inside/Outside Bark Tree Diameters Kyriaki Kitikidou, Elias Milios, Lazaros Iliadis, and Minas Kaymakis Democritus University of Thrace,
More informationMachine Learning. Neural Networks. Le Song. CSE6740/CS7641/ISYE6740, Fall Lecture 7, September 11, 2012 Based on slides from Eric Xing, CMU
Machine Learning CSE6740/CS7641/ISYE6740, Fall 2012 Neural Networks Le Song Lecture 7, September 11, 2012 Based on slides from Eric Xing, CMU Reading: Chap. 5 CB Learning highly non-linear functions f:
More informationMultilayer Perceptron
Aprendizagem Automática Multilayer Perceptron Ludwig Krippahl Aprendizagem Automática Summary Perceptron and linear discrimination Multilayer Perceptron, nonlinear discrimination Backpropagation and training
More informationAN INTRODUCTION TO NEURAL NETWORKS. Scott Kuindersma November 12, 2009
AN INTRODUCTION TO NEURAL NETWORKS Scott Kuindersma November 12, 2009 SUPERVISED LEARNING We are given some training data: We must learn a function If y is discrete, we call it classification If it is
More informationLinear Discrimination Functions
Laurea Magistrale in Informatica Nicola Fanizzi Dipartimento di Informatica Università degli Studi di Bari November 4, 2009 Outline Linear models Gradient descent Perceptron Minimum square error approach
More information3.4 Linear Least-Squares Filter
X(n) = [x(1), x(2),..., x(n)] T 1 3.4 Linear Least-Squares Filter Two characteristics of linear least-squares filter: 1. The filter is built around a single linear neuron. 2. The cost function is the sum
More informationContinuous Neural Networks
Continuous Neural Networks Nicolas Le Roux Université de Montréal Montréal, Québec nicolas.le.roux@umontreal.ca Yoshua Bengio Université de Montréal Montréal, Québec yoshua.bengio@umontreal.ca Abstract
More informationTutorial on Tangent Propagation
Tutorial on Tangent Propagation Yichuan Tang Centre for Theoretical Neuroscience February 5, 2009 1 Introduction Tangent Propagation is the name of a learning technique of an artificial neural network
More informationStatistical Machine Learning from Data
January 17, 2006 Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Multi-Layer Perceptrons Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole
More informationVasil Khalidov & Miles Hansard. C.M. Bishop s PRML: Chapter 5; Neural Networks
C.M. Bishop s PRML: Chapter 5; Neural Networks Introduction The aim is, as before, to find useful decompositions of the target variable; t(x) = y(x, w) + ɛ(x) (3.7) t(x n ) and x n are the observations,
More informationMachine Learning
Machine Learning 10-315 Maria Florina Balcan Machine Learning Department Carnegie Mellon University 03/29/2019 Today: Artificial neural networks Backpropagation Reading: Mitchell: Chapter 4 Bishop: Chapter
More informationCS:4420 Artificial Intelligence
CS:4420 Artificial Intelligence Spring 2018 Neural Networks Cesare Tinelli The University of Iowa Copyright 2004 18, Cesare Tinelli and Stuart Russell a a These notes were originally developed by Stuart
More informationConfidence Estimation Methods for Neural Networks: A Practical Comparison
, 6-8 000, Confidence Estimation Methods for : A Practical Comparison G. Papadopoulos, P.J. Edwards, A.F. Murray Department of Electronics and Electrical Engineering, University of Edinburgh Abstract.
More informationNeural networks. Chapter 19, Sections 1 5 1
Neural networks Chapter 19, Sections 1 5 Chapter 19, Sections 1 5 1 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 19, Sections 1 5 2 Brains 10
More informationMachine Learning : Support Vector Machines
Machine Learning Support Vector Machines 05/01/2014 Machine Learning : Support Vector Machines Linear Classifiers (recap) A building block for almost all a mapping, a partitioning of the input space into
More informationNeural Networks. Nicholas Ruozzi University of Texas at Dallas
Neural Networks Nicholas Ruozzi University of Texas at Dallas Handwritten Digit Recognition Given a collection of handwritten digits and their corresponding labels, we d like to be able to correctly classify
More informationPart 8: Neural Networks
METU Informatics Institute Min720 Pattern Classification ith Bio-Medical Applications Part 8: Neural Netors - INTRODUCTION: BIOLOGICAL VS. ARTIFICIAL Biological Neural Netors A Neuron: - A nerve cell as
More informationLearning and Neural Networks
Artificial Intelligence Learning and Neural Networks Readings: Chapter 19 & 20.5 of Russell & Norvig Example: A Feed-forward Network w 13 I 1 H 3 w 35 w 14 O 5 I 2 w 23 w 24 H 4 w 45 a 5 = g 5 (W 3,5 a
More informationNeural Networks (and Gradient Ascent Again)
Neural Networks (and Gradient Ascent Again) Frank Wood April 27, 2010 Generalized Regression Until now we have focused on linear regression techniques. We generalized linear regression to include nonlinear
More informationArtificial Neural Networks The Introduction
Artificial Neural Networks The Introduction 01001110 01100101 01110101 01110010 01101111 01101110 01101111 01110110 01100001 00100000 01110011 01101011 01110101 01110000 01101001 01101110 01100001 00100000
More informationCOMS 4771 Introduction to Machine Learning. Nakul Verma
COMS 4771 Introduction to Machine Learning Nakul Verma Announcements HW1 due next lecture Project details are available decide on the group and topic by Thursday Last time Generative vs. Discriminative
More information