Stability of backpropagation learning rule


Petr Krupanský, Petr Pivoňka, Jiří Dohnal
Department of Control and Instrumentation, Brno University of Technology, Božetěchova 2, Brno, Czech Republic

Abstract

Control of real processes requires a different approach to neural network learning. The presented modification of the backpropagation learning algorithm changes the meaning of the learning constants. The modification is based on a stability condition for the learning dynamics.

Keywords: Neural networks, ARMA model, control, backpropagation, stability, largest singular value, Euclidean norm.

1 Introduction

The backpropagation learning algorithm has suitable properties for on-line adaptation. The algorithm has low time and memory requirements. Its disadvantages are a relatively slow convergence and possible instability when the learning constants are chosen improperly. On-line learning running simultaneously with control can bring about a number of problems which substantially influence the criteria commonly used in neural network learning. One of the main criteria of the quality of the learning process is fast convergence. Specific problems in the control of real processes require a different approach to the neural network learning algorithm. An optimum learning speed must be chosen, and it is of crucial importance that the algorithm is able to complete the modification of the weights within a limited time period. In connection with the control of real processes, the stability of the learning algorithm must be considered.

2 Backpropagation algorithm

The learning algorithm is based on minimization of the error E_k between the network output y_k and the desired value d_k:

    E_k = \frac{1}{2} \sum_{j=1}^{m} [y_k(j) - d_k(j)]^2    (1)

where:
y_k(j) ... response of the network to the j-th input pattern in step k,
d_k(j) ... desired response to the j-th input pattern in step k,
m ... number of input patterns.

The gradient of the error with respect to the weights of the network (with a sigmoid output transfer function) is

    \frac{\partial E_k}{\partial w_k(i,j)} = \frac{\partial E_k}{\partial y_k(i)} \, \frac{\partial y_k(i)}{\partial u_k(i)} \, \frac{\partial u_k(i)}{\partial w_k(i,j)} = [y_k(i) - d_k(i)] \, y_k(i) [1 - y_k(i)] \, x_k(j) = \delta_k(i) \, x_k(j)    (2)

where:
\partial E_k / \partial y_k(i) ... derivative of the error with respect to the neuron output; for an output neuron it is the difference between the neuron output and the desired value, for a neuron in an inner layer it is the weighted sum of the deltas of the following layer,
\partial y_k(i) / \partial u_k(i) ... derivative of the output transfer function of the neuron,
\partial u_k(i) / \partial w_k(i,j) ... derivative of the summed neuron input with respect to the relevant weight, i.e. the previous neuron output,
x_k(j) ... the previous neuron output (the neuron input),
\delta_k(i) ... derivative of the error with respect to the summed neuron input.
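As an illustration of equations (1) and (2), the following sketch (not from the paper; a minimal NumPy example, with the weight and input values chosen only to show the calling convention) computes the error and the gradient term \delta_k(i) x_k(j) for a single sigmoid output neuron.

    import numpy as np

    def sigmoid(u):
        # Sigmoid output transfer function of the neuron.
        return 1.0 / (1.0 + np.exp(-u))

    def output_error(y, d):
        # Equation (1): squared error of the output.
        return 0.5 * np.sum((y - d) ** 2)

    def output_gradient(w, x, d):
        # Equation (2) for a sigmoid output neuron:
        # delta = (y - d) * y * (1 - y), dE/dw(j) = delta * x(j).
        u = np.dot(w, x)          # summed neuron input u_k
        y = sigmoid(u)            # neuron output y_k
        delta = (y - d) * y * (1.0 - y)
        return delta * x, y       # gradient vector and the output

    # Hypothetical values, for illustration only.
    w = np.array([0.2, -0.1, 0.05])
    x = np.array([1.0, 0.5, -1.0])   # previous-layer outputs (neuron inputs)
    grad, y = output_gradient(w, x, d=0.8)
    print(output_error(y, 0.8), grad)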

The learning parameters \alpha and \beta are used: \alpha represents the speed of movement along the gradient, \beta the inertia of the previous value. The final formula for the change of the weights is

    \Delta w_k(i,j) = -\alpha \, \delta_{k-1}(i) \, x_{k-1}(j) + \beta \, \Delta w_{k-1}(i,j)    (3)

These values \Delta w_k are calculated for the whole network, and the weights are changed in the next step.

3 Modification for ARMA model

If the neural network is used to construct an ARMA model of the system, the network configuration reduces to one neuron with a linear output function. This structure corresponds exactly to the ARMA model. For learning on one pattern, the input x, the system parameters a and the system output d are constants, independent of the learning dynamics. The gradient of the local error is

    \frac{\partial E_k}{\partial w_k(i)} = \frac{\partial E_k}{\partial y_k} \, \frac{\partial y_k}{\partial u_k} \, \frac{\partial u_k}{\partial w_k(i)} = (y_k - d) \cdot 1 \cdot x(i)    (4)

where:
d ... constant value of the system output,
x(i) ... constant value of the input vector.

The formula for \Delta w_k(i) is

    \Delta w_k(i) = -\alpha (y_k - d) x(i) + \beta \, \Delta w_{k-1}(i)    (5)

and the weights are

    w_{k+1}(i) = w_k(i) + \Delta w_k(i) = w_k(i) - \alpha (y_k - d) x(i) + \beta [w_k(i) - w_{k-1}(i)]    (6)

In vector form:

    w_k = w_{k-1} - \alpha [y_{k-1} - d] x + \beta [w_{k-1} - w_{k-2}]    (7)

4 Stability conditions

The network output is

    y_k = w_k x^T    (8)

Let us assume the system can be approximated by

    d = a x^T    (9)

where a is the vector of the system parameters. Substituting (8) and (9) into (7), we get

    w_k = w_{k-1} - \alpha (w_{k-1} x^T - a x^T) x + \beta (w_{k-1} - w_{k-2})    (10)

Rewriting this in the Z-transform gives

    w(z) = w(z) z^{-1} - \alpha [w(z) z^{-1} x^T - a x^T] x + \beta [w(z) z^{-1} - w(z) z^{-2}]    (11)

and the final formula

    w(z) = (\alpha \, a x^T x) \left( I + [\alpha x^T x - \beta I - I] z^{-1} + \beta I z^{-2} \right)^{-1} = N(z) D(z)^{-1}    (12)

where I is the unit matrix of suitable dimension. For simplicity it is possible to replace the matrix \alpha x^T x by a scalar q and the unit matrices I by the scalar 1. The stability condition for this transfer function is satisfied if all roots of the characteristic polynomial lie inside the unit circle in the Z plane. The characteristic polynomial D_1 (for the system with one input) is

    D_1(z) = 1 + [q - \beta - 1] z^{-1} + \beta z^{-2}    (13)

The roots of the characteristic polynomial D_1(z) are

    z_{1,2} = \frac{-q + \beta + 1 \pm \sqrt{(q - \beta)^2 - 2(q + \beta) + 1}}{2}    (14)

The absolute values of the roots must be smaller than 1:

    |z_{1,2}| < 1    (15)
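The root condition (13)-(15) can be checked numerically. The sketch below (not part of the paper; a small NumPy illustration with arbitrarily chosen values of q and \beta) forms the characteristic polynomial, computes its roots and tests whether they lie inside the unit circle.

    import numpy as np

    def learning_dynamics_stable(q, beta):
        # Roots of z^2 + (q - beta - 1) z + beta = 0, i.e. of D_1(z) in (13)
        # after multiplying by z^2. Stable if |z_1|, |z_2| < 1, condition (15).
        roots = np.roots([1.0, q - beta - 1.0, beta])
        return np.all(np.abs(roots) < 1.0), roots

    # Illustrative values only: beta = 0.9, q swept around the limit 2 + 2*beta.
    beta = 0.9
    for q in (0.5 * (2 + 2 * beta), 2 + 2 * beta, 1.05 * (2 + 2 * beta)):
        stable, roots = learning_dynamics_stable(q, beta)
        print(q, stable, np.abs(roots))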

By substituting, we get the condition for q:

    0 < q < 2 + 2\beta    (16)

For a system with more inputs the condition is

    0 < \|\alpha x^T x\|_S < \|2I + 2\beta I\|_S    (17)

where \|\cdot\|_S denotes the largest singular value norm. The solution for \alpha has the lower limit \alpha > 0. The upper limit is

    \alpha < \frac{\|2I + 2\beta I\|_S}{\|x^T x\|_S} = \frac{2 + 2\beta}{\|x^T x\|_S}    (18)

5 Batch learning on a number of patterns

If the network is learnt on h patterns in the learning set, then for i = 1 ... h the network output is given by

    y_{k,i} = w_k x_i^T    (19)

and the system output is

    d_i = a x_i^T    (20)

The modification of the weights is then

    w_k = w_{k-1} - \frac{\alpha}{h} \sum_{i=1}^{h} \left[ (w_{k-1} x_i^T - a x_i^T) x_i \right] + \beta (w_{k-1} - w_{k-2})    (21)

After transformation to the Z-domain and modification, we get the final formula for the weights:

    w(z) = \left( \frac{\alpha}{h} \sum_{i=1}^{h} a x_i^T x_i \right) \left[ I + \left( \frac{\alpha}{h} \sum_{i=1}^{h} x_i^T x_i - \beta I - I \right) z^{-1} + \beta I z^{-2} \right]^{-1}    (22)

The relationship can be compared with equation (12). After modifications we get the stability condition for a number of patterns:

    0 < \left\| \frac{\alpha}{h} \sum_{i=1}^{h} x_i^T x_i \right\|_S < \|2I + 2\beta I\|_S    (23)

The upper limit for \alpha is given by

    \alpha < \frac{h \, \|2I + 2\beta I\|_S}{\left\| \sum_{i=1}^{h} x_i^T x_i \right\|_S} = \frac{h (2 + 2\beta)}{\left\| \sum_{i=1}^{h} x_i^T x_i \right\|_S}    (24)

If the patterns in the learning set are not too far apart, the largest singular value can be replaced by the much simpler Euclidean norm

    \|A\|_E = \sqrt{\sum_i \sum_j a_{i,j}^2}    (25)

The simplified condition is then

    \alpha \le \frac{h (2 + 2\beta)}{\left\| \sum_{i=1}^{h} x_i^T x_i \right\|_E}    (26)

6 Examples of learning behaviour

6.1 Learning dynamics

The example shows the learning behaviour of a neuron with two weights. Figure 1 shows the zeros and poles of the characteristic polynomial of the learning dynamics with \alpha on the edge of stability, \alpha = (2 + 2\beta) / \|x^T x\|_S; figure 2 shows the corresponding step response of the learning dynamics. Figures 3 and 4 show the learning dynamics for half the stability limit, \alpha = 0.5 (2 + 2\beta) / \|x^T x\|_S. The last two figures, 5 and 6, show the behaviour for a value of \alpha slightly above the stability limit.

Figure 1: The characteristic polynomial on the edge of stability
Figure 2: The learning dynamics on the edge of stability
Figure 3: Stable characteristic polynomial
Figure 4: Stable learning dynamics
Figure 5: Non-stable characteristic polynomial
Figure 6: Non-stable learning dynamics
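Conditions (24)-(26) translate directly into an upper bound on the learning rate. The sketch below (not from the paper; a NumPy illustration with a randomly generated pattern set) computes the limit for \alpha both from the largest singular value norm and from the Euclidean (Frobenius) norm.

    import numpy as np

    def alpha_limits(X, beta):
        # Upper bounds on alpha from (24) and (26).
        # X holds one pattern per row; S = sum_i x_i^T x_i.
        h = X.shape[0]
        S = X.T @ X                        # equals the sum of outer products x_i^T x_i
        sv_norm = np.linalg.norm(S, 2)     # largest singular value norm
        eu_norm = np.linalg.norm(S, 'fro') # Euclidean (Frobenius) norm
        return h * (2 + 2 * beta) / sv_norm, h * (2 + 2 * beta) / eu_norm

    # Illustrative data only: 15 patterns with 8 inputs, as in section 6.2.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(15, 8))
    print(alpha_limits(X, beta=0.9))

Since the Euclidean norm is never smaller than the largest singular value, the second limit is the more conservative of the two.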

6.2 Norm comparison

For simplicity of computation in a real-time control process it is convenient to use the Euclidean norm instead of the largest singular value norm. The graphs show the behaviour of the norm values during the process. The modelled system is described by the transfer function

    F_S(p) = \frac{1.5}{10 p^2 + 0.7 p + 1}    (27)

The sample period is T = 1 s, the level of noise is 0.5, and there are 15 patterns in the training set. The patterns S(i) have the form

    S(i) = \left( x(i), x(i-1), x(i-2), x(i-3), y(i-1), y(i-2), y(i-3), 1 \right)    (28)

where:
i ... i-th pattern in the training set,
x(i) ... i-th system input,
y(i) ... i-th system output,
1 ... bias.
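A pattern set of the form (28) can be built from recorded input/output samples, for instance as in the following sketch (not from the paper; a small NumPy illustration in which the signals x and y are assumed to be already measured arrays).

    import numpy as np

    def build_patterns(x, y):
        # ARMA training patterns S(i) per equation (28):
        # (x(i), x(i-1), x(i-2), x(i-3), y(i-1), y(i-2), y(i-3), 1).
        patterns = []
        for i in range(3, len(x)):
            patterns.append([x[i], x[i-1], x[i-2], x[i-3],
                             y[i-1], y[i-2], y[i-3], 1.0])   # trailing 1 is the bias
        return np.array(patterns)

    # Illustrative signals only; in practice these are the measured system input/output.
    x = np.sin(0.3 * np.arange(20))
    y = 0.1 * np.cumsum(x)
    print(build_patterns(x, y).shape)   # (17, 8): 8 inputs per pattern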

Figure 7 shows the system response to the input signal with noise. The difference between the norms is shown in figure 8, and figure 9 shows the values of the norms. The difference values are approximately 10 times smaller than the norm values, so the learning constant can be chosen with an accuracy of one decimal place.

Figure 7: The response to input signal with noise
Figure 8: The difference between norms
Figure 9: The norm values

6.3 Algorithm comparison

For demonstration, the following backpropagation-based algorithms implemented in MATLAB were tested:

1. GD - Gradient descent backpropagation.
2. GDA - Gradient descent with adaptive learning rate backpropagation.
3. GDM - Gradient descent with momentum backpropagation.
4. GDX - Gradient descent with momentum and adaptive learning rate backpropagation.
5. RP - Resilient backpropagation.
6. SBP - Modified BP by stability condition.

The simulations used the same learning set and the same learning constant \beta = 0.9. The learning constant \alpha was set by the MATLAB function maxlinlr; for the modified BP algorithm, \alpha was set to 0.9 of the stability limit derived above. As the criterion of learning quality, the mean square error MSE = 1 \cdot 10^{-8} was used. The main parameters of the learning were the number of epochs, the learning time of one epoch, and the quality rate, calculated as

    Quality rate = Learning time of one epoch \cdot Number of epochs    (29)
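A training loop in the spirit of the SBP variant (choose \alpha from the stability condition, then run gradient descent with momentum until the MSE target is reached) might look like the sketch below. This is not the paper's MATLAB implementation; it is an illustrative NumPy version for the single linear neuron of section 3, and the safety factor 0.9 and the generated data are assumptions.

    import numpy as np

    def train_sbp(X, d, beta=0.9, safety=0.9, mse_target=1e-8, max_epochs=10000):
        # Batch gradient descent with momentum for a single linear neuron,
        # with alpha chosen as a fraction of the stability limit (26).
        h, n = X.shape
        S = X.T @ X                                   # sum_i x_i^T x_i
        alpha = safety * h * (2 + 2 * beta) / np.linalg.norm(S, 'fro')
        w = np.zeros(n)
        dw_prev = np.zeros(n)
        for epoch in range(max_epochs):
            y = X @ w                                 # network outputs y_{k,i}
            err = y - d
            mse = np.mean(err ** 2)
            if mse <= mse_target:
                return w, epoch, mse
            grad = (X.T @ err) / h                    # averaged gradient over the patterns
            dw = -alpha * grad + beta * dw_prev       # update with momentum, cf. (3) and (21)
            w += dw
            dw_prev = dw
        return w, max_epochs, mse

    # Illustrative data: patterns X and targets from an assumed parameter vector a.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(15, 8))
    a = rng.normal(size=8)
    d = X @ a                                         # system outputs d_i = a x_i^T
    w, epochs, mse = train_sbp(X, d)
    print(epochs, mse)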

The following table shows the simulation results of these learning algorithms:

Table 1: Algorithm comparison.

Algorithm | Number of epochs | Learning time of one epoch (10^-3 s) | Quality rate
GD        |                  |                                      |
GDA       |                  |                                      |
GDM       |                  |                                      |
GDX       |                  |                                      |
RP        |                  |                                      |
SBP       |                  |                                      |

7 Conclusion

The stability conditions were derived and, on their basis, the limits for the learning constants. This principle changes the meaning of the learning constants: their magnitude does not only determine the speed of the learning algorithm, but also expresses the degree of its stability. The principle also makes it possible to set the constants so that, for different patterns and independently of the initial network values, learning is faster than with the common modifications of BP.

Acknowledgements

The paper has been prepared as a part of the solution of GAČR project No. 12/1/1485, with the support of the research plan CEZ: MSM 2613 and the Ph.D. research grant FR: MSMT "Adaptive Controllers based on Neural Networks".
