ADAPTIVE NEURO-FUZZY INFERENCE SYSTEMS

RBFN and TS systems

The two are equivalent if the following hold:
- Both the RBFN and the TS system use the same aggregation method for the output (weighted sum or weighted average).
- The number of basis functions in the RBFN equals the number of rules in the TS system.
- The TS system uses Gaussian membership functions with the same parameters as the basis functions, and rule firing strengths are computed by multiplication.
- The RBFN output weights c_i and the TS rule consequents are equal.
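As a quick numerical illustration of this equivalence (a minimal sketch with illustrative centres, widths and weights; the function and variable names are not from the slides), a Gaussian RBFN with weighted-sum aggregation and a zero-order TS system with matching parameters produce identical outputs:

```python
import numpy as np

def gaussian(x, c, sigma):
    """Gaussian membership / basis function."""
    return np.exp(-((x - c) ** 2) / (2 * sigma ** 2))

# Shared parameters: centres, widths and output weights (illustrative values)
centres = np.array([-1.0, 0.0, 1.0])
sigmas  = np.array([0.5, 0.7, 0.5])
c_i     = np.array([2.0, -1.0, 0.5])   # RBFN output weights = TS rule consequents

def rbfn(x):
    # Weighted sum of radial basis responses
    return np.sum(c_i * gaussian(x, centres, sigmas))

def ts_zero_order(x):
    # Rule firing strengths from Gaussian MFs (for one input the product
    # reduces to the MF itself); weighted-sum aggregation, constant consequents c_i
    w = gaussian(x, centres, sigmas)
    return np.sum(w * c_i)

x = 0.3
print(rbfn(x), ts_zero_order(x))   # identical outputs
```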

ANFIS

Adaptive Neuro-Fuzzy Inference Systems (ANFIS): a Takagi-Sugeno fuzzy system mapped onto a neural network structure. Different representations are possible, but the one with 5 layers is the most common. Network nodes in different layers have different structures.

ANFIS

Consider a first-order Sugeno fuzzy model with two inputs, x and y, and one output, z.

Rule set:
Rule 1: If x is A_1 and y is B_1, then f_1 = p_1 x + q_1 y + r_1
Rule 2: If x is A_2 and y is B_2, then f_2 = p_2 x + q_2 y + r_2

[Figure: membership functions A_1, A_2 on X and B_1, B_2 on Y give the firing strengths w_1 and w_2 of the two rules.]

Weighted fuzzy mean:

f = \frac{w_1 f_1 + w_2 f_2}{w_1 + w_2} = \bar{w}_1 f_1 + \bar{w}_2 f_2
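A minimal sketch of the two-rule first-order Sugeno model above (membership function shapes and all parameter values are illustrative, not taken from the slides):

```python
import numpy as np

def bell(x, a, b, c):
    """Generalized bell membership function."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

# Premise parameters (a, b, c) of each membership function (illustrative)
A1, A2 = (2.0, 2.0, -5.0), (2.0, 2.0, 5.0)   # MFs on x
B1, B2 = (2.0, 2.0, -5.0), (2.0, 2.0, 5.0)   # MFs on y
# Consequent parameters (p, q, r) of each rule (illustrative)
p1, q1, r1 = 1.0, 0.5, 0.0
p2, q2, r2 = -0.3, 2.0, 1.0

def sugeno(x, y):
    w1 = bell(x, *A1) * bell(y, *B1)          # firing strength of rule 1
    w2 = bell(x, *A2) * bell(y, *B2)          # firing strength of rule 2
    f1 = p1 * x + q1 * y + r1
    f2 = p2 * x + q2 * y + r2
    return (w1 * f1 + w2 * f2) / (w1 + w2)    # weighted fuzzy mean

print(sugeno(1.0, -2.0))
```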

ANFIS architecture

Corresponding equivalent ANFIS architecture: [Figure: 5-layer ANFIS network for the two-rule model.]

ANFIS layers

Layer 1: every node is an adaptive node with node function
O_{1,i} = \mu_{A_i}(x).
Parameters in this layer are called premise parameters.

Layer 2: every node is a fixed node whose output (representing the rule firing strength) is the product of its inputs:
O_{2,i} = w_i = \prod_j \mu_j, e.g. w_i = \mu_{A_i}(x)\,\mu_{B_i}(y).

Layer 3: every node is a fixed node (normalization):
O_{3,i} = \bar{w}_i = \frac{w_i}{\sum_j w_j}.

ANFIS layers

Layer 4: every node is an adaptive node (consequent parameters):
O_{4,i} = O_{3,i} f_i = \bar{w}_i (p_{i0} + p_{i1} x_1 + \dots + p_{in} x_n).

Layer 5: a single node that sums up its inputs:
O_{5,1} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}.

The adaptive network is functionally equivalent to a Sugeno fuzzy model!

ANFIS with multiple rules

[Figure: ANFIS architecture with multiple rules.]
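The five layers can be traced explicitly for the two-input, two-rule case. A sketch, assuming generalized bell membership functions and illustrative parameter values:

```python
import numpy as np

def bell(x, a, b, c):
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def anfis_forward(x, y, premise, consequent):
    # Layer 1: adaptive nodes, membership degrees (premise parameters a, b, c)
    mu_A = [bell(x, *p) for p in premise['A']]
    mu_B = [bell(y, *p) for p in premise['B']]
    # Layer 2: fixed nodes, firing strengths by product
    w = np.array([mu_A[0] * mu_B[0], mu_A[1] * mu_B[1]])
    # Layer 3: fixed nodes, normalized firing strengths
    w_bar = w / w.sum()
    # Layer 4: adaptive nodes, w_bar_i * f_i with linear consequents (p, q, r)
    f = np.array([p * x + q * y + r for (p, q, r) in consequent])
    o4 = w_bar * f
    # Layer 5: single fixed node, sum of incoming signals
    return o4.sum()

premise = {'A': [(2.0, 2.0, -5.0), (2.0, 2.0, 5.0)],
           'B': [(2.0, 2.0, -5.0), (2.0, 2.0, 5.0)]}
consequent = [(1.0, 0.5, 0.0), (-0.3, 2.0, 1.0)]
print(anfis_forward(1.0, -2.0, premise, consequent))
```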

Hybrid learning for ANFIS

Consider the two-rule ANFIS with two inputs x and y and one output z. Let the premise parameters be fixed; the ANFIS output is then a linear combination of the consequent parameters p_i, q_i and r_i:

z = \frac{w_1}{w_1 + w_2} f_1 + \frac{w_2}{w_1 + w_2} f_2
  = \bar{w}_1 (p_1 x + q_1 y + r_1) + \bar{w}_2 (p_2 x + q_2 y + r_2)
  = (\bar{w}_1 x) p_1 + (\bar{w}_1 y) q_1 + (\bar{w}_1) r_1 + (\bar{w}_2 x) p_2 + (\bar{w}_2 y) q_2 + (\bar{w}_2) r_2

Hybrid learning for ANFIS

Partition the total parameter set S as:
S_1: set of premise (nonlinear) parameters
S_2: set of consequent (linear) parameters
q: unknown vector whose elements are the parameters in S_2

z = A q is then a standard linear least-squares problem. The best solution for q, the one that minimizes \|A q - z\|^2, is the least-squares estimator q^*:

q^* = (A^T A)^{-1} A^T z
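A sketch of the least-squares identification of the consequent parameters, with the premise parameters held fixed (synthetic data and illustrative membership functions; numpy's lstsq is used rather than forming (A^T A)^{-1} A^T z explicitly, which is numerically safer but equivalent):

```python
import numpy as np

def bell(x, a, b, c):
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

rng = np.random.default_rng(0)
X = rng.uniform(-10, 10, size=(200, 2))              # training inputs (x, y)
z = np.sin(X[:, 0]) + 0.5 * X[:, 1]                  # target outputs (synthetic)

premise_A = [(4.0, 2.0, -5.0), (4.0, 2.0, 5.0)]      # fixed premise parameters
premise_B = [(4.0, 2.0, -5.0), (4.0, 2.0, 5.0)]

rows = []
for x, y in X:
    w = np.array([bell(x, *premise_A[0]) * bell(y, *premise_B[0]),
                  bell(x, *premise_A[1]) * bell(y, *premise_B[1])])
    wb = w / w.sum()                                  # normalized firing strengths
    # One row of A: [wb1*x, wb1*y, wb1, wb2*x, wb2*y, wb2]
    rows.append(np.concatenate([wb[0] * np.array([x, y, 1.0]),
                                wb[1] * np.array([x, y, 1.0])]))
A = np.vstack(rows)

# q* = argmin ||A q - z||^2   (equivalent to (A^T A)^{-1} A^T z)
q_star, *_ = np.linalg.lstsq(A, z, rcond=None)
print(q_star)        # [p1, q1, r1, p2, q2, r2]
```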

Hybrid learning for ANFIS

What if the premise parameters are not optimal? Combine steepest descent and the least-squares estimator to update the parameters of the adaptive network. Each epoch is composed of:
1. Forward pass: node outputs go forward up to Layer 4 and the consequent parameters are identified by the least-squares estimator;
2. Backward pass: error signals propagate backward and the premise parameters are updated by gradient descent.

Hybrid learning for ANFIS

Error signals: derivative of the error measure with respect to each node output.

                         Forward pass              Backward pass
Premise parameters       Fixed                     Gradient descent
Consequent parameters    Least-squares estimator   Fixed
Signals                  Node outputs              Error signals

The hybrid approach converges much faster than pure backpropagation because it reduces the search space.
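A compact sketch of one hybrid-learning epoch for the two-rule model. The forward pass identifies the consequents by least squares as above; for brevity the backward pass here uses a finite-difference gradient for the premise parameters instead of analytic backpropagation, and all data and parameter values are illustrative:

```python
import numpy as np

def bell(x, a, b, c):
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def design_matrix(X, prem):
    """Rows of A built from normalized firing strengths; prem holds the
    premise parameters in the order [A1, A2, B1, B2], three values each."""
    pA1, pA2, pB1, pB2 = prem.reshape(4, 3)
    rows = []
    for x, y in X:
        w = np.array([bell(x, *pA1) * bell(y, *pB1),
                      bell(x, *pA2) * bell(y, *pB2)])
        wb = w / w.sum()
        xy1 = np.array([x, y, 1.0])
        rows.append(np.concatenate([wb[0] * xy1, wb[1] * xy1]))
    return np.vstack(rows)

def epoch(X, z, prem, lr=1e-3, h=1e-5):
    # Forward pass: identify consequent parameters by least squares
    A = design_matrix(X, prem)
    q, *_ = np.linalg.lstsq(A, z, rcond=None)
    err = np.sum((A @ q - z) ** 2)
    # Backward pass: update premise parameters by gradient descent
    # (finite differences used here as a stand-in for backpropagated errors)
    grad = np.zeros_like(prem)
    for i in range(prem.size):
        p2 = prem.copy(); p2[i] += h
        err2 = np.sum((design_matrix(X, p2) @ q - z) ** 2)
        grad[i] = (err2 - err) / h
    return prem - lr * grad, q, err

rng = np.random.default_rng(0)
X = rng.uniform(-10, 10, size=(200, 2))
z = np.sin(X[:, 0]) + 0.5 * X[:, 1]
prem = np.array([4.0, 2.0, -5.0,  4.0, 2.0, 5.0,   # MFs on x
                 4.0, 2.0, -5.0,  4.0, 2.0, 5.0])  # MFs on y
for _ in range(10):
    prem, q, err = epoch(X, z, prem)
    print(err)    # training error should decrease over the epochs
```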

Stone-Weierstrass theorem

Let D be a compact space of N dimensions and let F be a set of continuous real-valued functions on D satisfying:
1. Identity function: the constant function f(x) = 1 is in F.
2. Separability: for any two points x_1 \neq x_2 in D, there is an f in F such that f(x_1) \neq f(x_2).
3. Algebraic closure: if f and g are two functions in F, then fg and af + bg are also in F, for any reals a and b.
Then F is dense in C(D), the set of continuous real-valued functions on D, i.e.:
\forall \varepsilon > 0, \forall g \in C(D), \exists f \in F : |g(x) - f(x)| < \varepsilon, \forall x \in D.

Universal approximator ANFIS

According to the Stone-Weierstrass theorem, an ANFIS has unlimited approximation power: it can match any continuous nonlinear function arbitrarily well.
Identity: obtained by having a constant consequent.
Separability: obtained by selecting different parameters in the network.

Algebraic closure

Consider two two-rule systems with final outputs

z_1 = \frac{w_1 f_1 + w_2 f_2}{w_1 + w_2}   and   z_2 = \frac{\tilde{w}_1 \tilde{f}_1 + \tilde{w}_2 \tilde{f}_2}{\tilde{w}_1 + \tilde{w}_2}

Additive:

a z_1 + b z_2 = \frac{w_1 \tilde{w}_1 (a f_1 + b \tilde{f}_1) + w_1 \tilde{w}_2 (a f_1 + b \tilde{f}_2) + w_2 \tilde{w}_1 (a f_2 + b \tilde{f}_1) + w_2 \tilde{w}_2 (a f_2 + b \tilde{f}_2)}{(w_1 + w_2)(\tilde{w}_1 + \tilde{w}_2)}

Construct a 4-rule inference system that computes a z_1 + b z_2.

Algebraic closure

Multiplicative:

z_1 z_2 = \frac{w_1 \tilde{w}_1 f_1 \tilde{f}_1 + w_1 \tilde{w}_2 f_1 \tilde{f}_2 + w_2 \tilde{w}_1 f_2 \tilde{f}_1 + w_2 \tilde{w}_2 f_2 \tilde{f}_2}{(w_1 + w_2)(\tilde{w}_1 + \tilde{w}_2)}

Construct a 4-rule inference system that computes z_1 z_2.
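A quick numerical check of the two constructions (illustrative firing strengths and consequents): a 4-rule system with firing strengths w_i \tilde{w}_j reproduces a z_1 + b z_2 and z_1 z_2 exactly:

```python
import numpy as np

# Firing strengths and consequents of two 2-rule systems (illustrative values)
w  = np.array([0.7, 0.2]); f  = np.array([1.5, -0.4])   # system 1
wt = np.array([0.3, 0.9]); ft = np.array([2.0,  0.1])   # system 2
a, b = 2.0, -1.0

z1 = np.dot(w, f) / w.sum()
z2 = np.dot(wt, ft) / wt.sum()

# 4-rule system: firing strengths w_i * wt_j, consequents a*f_i + b*ft_j (additive)
# or f_i * ft_j (multiplicative)
W     = np.outer(w, wt)
F_add = a * f[:, None] + b * ft[None, :]
F_mul = f[:, None] * ft[None, :]

print(a * z1 + b * z2, np.sum(W * F_add) / W.sum())   # equal
print(z1 * z2,         np.sum(W * F_mul) / W.sum())   # equal
```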

Model building guidelines

Select the number of fuzzy sets per variable:
- empirically, by examining the data or by trial and error
- using clustering techniques
- using regression trees (CART)
Initially, distribute bell-shaped membership functions evenly over each input range.
Using an adaptive step size can speed up training.

How to design an ANFIS?

Initialization:
- Define the number and type of inputs
- Define the number and type of outputs
- Define the number of rules and the type of consequents
- Define the objective function and the stop conditions
- Collect data
- Normalize inputs
- Determine initial rules
- Initialize the network
- TRAIN
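A sketch of the initialization step: bell-shaped membership functions distributed evenly over each input's range (the overlap factor and the helper name are illustrative choices, not from the slides):

```python
import numpy as np

def init_bell_mfs(lo, hi, n_mfs, overlap=0.5):
    """Evenly spaced generalized bell MFs (a, b, c) over [lo, hi]."""
    centres = np.linspace(lo, hi, n_mfs)
    spacing = (hi - lo) / (n_mfs - 1)
    # a controls the width, b the slope of the shoulders, c the centre
    return [(overlap * spacing, 2.0, c) for c in centres]

# Example: 4 MFs per input on [-10, 10], matching the sinc example that follows
mfs_x = init_bell_mfs(-10, 10, 4)
mfs_y = init_bell_mfs(-10, 10, 4)
print(mfs_x)
```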

Ex. 1: Two-input sinc function

z = \mathrm{sinc}(x, y) = \frac{\sin(x)\,\sin(y)}{x\,y}

Input range: [-10, 10] x [-10, 10], 121 training data pairs.

Multi-Layer Perceptron vs. ANFIS:
- MLP: 18 neurons in the hidden layer, 73 parameters, quick propagation (the best-performing learning algorithm for the backpropagation MLP).
- ANFIS: 16 rules, 4 membership functions per variable, 72 fitting parameters (48 linear, 24 nonlinear), hybrid learning rule.

MLP vs. ANFIS results

Results are averaged over repeated runs: the MLP with different sets of initial random weights, ANFIS with different initial step sizes.
The MLP's approximation power decreases because the learning process can get trapped in local minima, or because some neurons can be pushed into saturation during training.
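Returning to the training data for this example: a sketch of how the 121 pairs can be generated (the 11 x 11 grid over [-10, 10]^2 is an assumption consistent with the counts above):

```python
import numpy as np

def sinc2(x, y):
    # sin(x)*sin(y)/(x*y), with the removable singularity handled at 0
    sx = np.where(x == 0, 1.0, np.sin(x) / np.where(x == 0, 1.0, x))
    sy = np.where(y == 0, 1.0, np.sin(y) / np.where(y == 0, 1.0, y))
    return sx * sy

xs = np.linspace(-10, 10, 11)
X, Y = np.meshgrid(xs, xs)
data = np.column_stack([X.ravel(), Y.ravel(), sinc2(X.ravel(), Y.ravel())])
print(data.shape)   # (121, 3): 121 training pairs (x, y, z)
```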

ANFIS output

[Figure: training data and ANFIS output surfaces; root-mean-squared error curve and step size curve versus epoch number.]

ANFIS model

[Figure: initial and final membership functions on X and Y.]

Ex. 2: 3-input nonlinear function

output = (1 + x^{0.5} + y^{-1} + z^{-1.5})^2

Two membership functions per variable, 8 rules.
Input ranges: [1, 6] x [1, 6] x [1, 6].
216 training data, 125 validation data.

[Figure: training and checking error curves and step size curve versus epoch number.]

ANFIS model

[Figure: initial membership functions, and final membership functions on X, Y and Z.]
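A sketch generating training and checking data for this example (the 6^3 and 5^3 grids over [1, 6]^3, giving 216 and 125 samples, are assumptions consistent with the counts above):

```python
import numpy as np

def target(x, y, z):
    return (1.0 + x ** 0.5 + y ** -1.0 + z ** -1.5) ** 2

def grid_data(points_per_axis):
    axis = np.linspace(1, 6, points_per_axis)
    X, Y, Z = np.meshgrid(axis, axis, axis)
    inputs = np.column_stack([X.ravel(), Y.ravel(), Z.ravel()])
    return inputs, target(*inputs.T)

train_in, train_out = grid_data(6)   # 6^3 = 216 training samples
check_in, check_out = grid_data(5)   # 5^3 = 125 checking samples
print(train_in.shape, check_in.shape)
```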

Results comparison

Average Percentage Error:

APE = \frac{100\%}{P} \sum_{i=1}^{P} \frac{|T(i) - O(i)|}{|T(i)|}

where P is the number of data pairs, T(i) the desired output and O(i) the model output.

Model               Training error   Checking error   # Param.   Training data size   Checking data size
ANFIS               0.043%           1.066%           50         216                  125
GMDH model [1]      4.7%             5.7%             -          20                   20
Fuzzy model 1 [2]   1.5%             2.1%             22         20                   20
Fuzzy model 2 [2]   0.59%            3.4%             32         20                   20

[1] T. Kondo. Revised GMDH algorithm estimating degree of the complete polynomial. Trans. of the Society of Instrument and Control Engineers, 22(9):928-934, 1986.
[2] M. Sugeno and G. T. Kang. Structure identification of fuzzy model. Fuzzy Sets and Systems, 28:15-33, 1988.

Ex. 3: Modeling a dynamic system

Plant equation:
y(k+1) = 0.3\, y(k) + 0.6\, y(k-1) + f(u(k))
where f(.) has the following form:
f(u) = 0.6 \sin(\pi u) + 0.3 \sin(3\pi u) + 0.1 \sin(5\pi u)

Estimate the nonlinear function f with an ANFIS F:
\hat{y}(k+1) = 0.3\, \hat{y}(k) + 0.6\, \hat{y}(k-1) + F(u(k))

Plant input: u(k) = \sin(2\pi k / 250)
ANFIS parameters are updated at each step (on-line), with a learning rate \eta and a forgetting factor \lambda = 0.99.
ANFIS can adapt even after the input changes.
Question: was the input signal rich enough?
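A sketch of the plant simulation that produces the on-line training signal (initial conditions, simulation length, and the series-parallel target for F are illustrative choices, not from the slides):

```python
import numpy as np

def f(u):
    return 0.6 * np.sin(np.pi * u) + 0.3 * np.sin(3 * np.pi * u) + 0.1 * np.sin(5 * np.pi * u)

N = 500
u = np.sin(2 * np.pi * np.arange(N) / 250)      # plant input
y = np.zeros(N + 1)                             # y[0] = y[1] = 0 as initial conditions
for k in range(1, N):
    # y(k+1) = 0.3 y(k) + 0.6 y(k-1) + f(u(k))
    y[k + 1] = 0.3 * y[k] + 0.6 * y[k - 1] + f(u[k])

# One possible way to form on-line training targets for F(u(k)):
# recover f(u(k)) from the measured outputs (series-parallel identification)
target_F = y[2:N + 1] - 0.3 * y[1:N] - 0.6 * y[0:N - 1]
print(u[1:N].shape, target_F.shape)             # pairs (u(k), f(u(k))), fed one at a time
```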

Plant and model outputs

[Figure: plant output and ANFIS model output.]

Effect of number of MFs

5 membership functions.
[Figure: initial and final membership functions; f(u) and ANFIS outputs; each rule's output.]

Effect of number of MFs

4 membership functions.
[Figure: initial and final membership functions; f(u) and ANFIS outputs; each rule's output.]

Effect of number of MFs

3 membership functions.
[Figure: initial and final membership functions; f(u) and ANFIS outputs; each rule's output.]

Ex. 4: Chaotic time series

Consider a chaotic time series generated by

\dot{x}(t) = \frac{0.2\, x(t - \tau)}{1 + x^{10}(t - \tau)} - 0.1\, x(t)

Task: predict the system output at some future instant t + P using past outputs.
500 training data, 500 validation data.
ANFIS input: [x(t-18), x(t-12), x(t-6), x(t)]
ANFIS output: x(t+6)
Two MFs per variable, 16 rules, 104 parameters (24 premise, 80 consequent).
Data generated from t = 118 to t = 1117.

ANFIS model

[Figure: final membership functions on the four inputs x(t-18), x(t-12), x(t-6) and x(t).]
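A sketch generating the chaotic series by Euler integration of the delay-differential equation above and assembling the ANFIS input/output pairs (the delay tau = 17, the time step, and the initial condition are assumptions, not stated on the slides):

```python
import numpy as np

def chaotic_series(n_steps, tau=17, dt=1.0, x0=1.2):
    """Euler integration of dx/dt = 0.2 x(t-tau)/(1 + x(t-tau)^10) - 0.1 x(t)."""
    hist = int(tau / dt)
    x = np.zeros(n_steps + hist)
    x[:hist] = x0                                  # constant history before t = 0
    for t in range(hist, n_steps + hist - 1):
        x_tau = x[t - hist]
        x[t + 1] = x[t] + dt * (0.2 * x_tau / (1.0 + x_tau ** 10) - 0.1 * x[t])
    return x[hist:]

x = chaotic_series(1200)
# Input vectors [x(t-18), x(t-12), x(t-6), x(t)] and target x(t+6)
t = np.arange(18, len(x) - 6)
inputs  = np.column_stack([x[t - 18], x[t - 12], x[t - 6], x[t]])
targets = x[t + 6]
print(inputs.shape, targets.shape)
```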

Model output

[Figure: training and checking error curves; step sizes; desired and ANFIS outputs; prediction errors.]

3rd order AR model

[Figure: predictions of a 3rd order AR model.]

Order selection

AR model of order n:
y(t) = a_1\, y(t-1) + a_2\, y(t-2) + \dots + a_n\, y(t-n) + u(t)

Select the optimal order of the AR model in order to prevent overfitting: choose the order that minimizes the error on a test set.

44th order AR model

[Figure: predictions of a 44th order AR model.]
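A sketch of this order-selection procedure: fit AR(n) models by least squares on a training segment and keep the order with the smallest error on a held-out test segment (the signal and the train/test split are illustrative):

```python
import numpy as np

def fit_ar(y, n):
    """Least-squares fit of y(t) = a_1 y(t-1) + ... + a_n y(t-n)."""
    A = np.column_stack([y[n - i - 1:len(y) - i - 1] for i in range(n)])
    return np.linalg.lstsq(A, y[n:], rcond=None)[0]

def test_error(y, coeffs):
    n = len(coeffs)
    A = np.column_stack([y[n - i - 1:len(y) - i - 1] for i in range(n)])
    return np.mean((A @ coeffs - y[n:]) ** 2)

rng = np.random.default_rng(0)
y = np.sin(0.3 * np.arange(600)) + 0.05 * rng.standard_normal(600)
train, test = y[:400], y[400:]

errors = {n: test_error(test, fit_ar(train, n)) for n in range(1, 11)}
best = min(errors, key=errors.get)
print(best, errors[best])   # order with the smallest test-set error
```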

ANFIS output for P = 84

[Figure: desired and ANFIS outputs for prediction horizon P = 84.]

ANFIS extensions

- Different types of membership functions in layer 1
- Parameterized t-norms in layer 2
- Interpretability:
  - constrained gradient descent optimization
  - bounds on fuzziness, by augmenting the error measure: E' = E + \sum_i w_i \ln(w_i)
  - parameterization to reflect constraints
- Structure identification