
CHAPTER 3

3.1 Introduction

An artificial neural network (ANN) model is a functional abstraction of the biological neural structures of the central nervous system. ANNs are composed of many simple and highly interconnected computational elements, also called neurons, that operate in parallel and are arranged in patterns similar to biological neural nets. Figure 3.1 shows the typical diagram of a neuron. The neurons are connected by weighted links that pass signals from one neuron to another. Each neuron receives a number of input signals through its connections. The output signal is transmitted through the neuron's outgoing connection. The outgoing connection, in turn, splits into a number of branches that transmit the same signal, and these branches terminate at the incoming connections of other neurons in the network.

A neural network is generally regarded as a highly sophisticated nonlinear dynamic system. Although each neuron is primitive in both architecture and function, a network comprising many neurons is intricate. Besides being nonlinear, a neural network is also a signal processing system. Its inherent dynamic processes can be divided into a fast process and a slow process. The former is a numerical process that evolves to an equilibrium state for the given inputs. The latter is a learning process in which the values of the connective weights between neurons are adjusted according to the environment.

After learning, environmental information is stored in the connective weights.

Figure 3.1 Diagram of a neuron (input signals, connection weights, and output signals)

In 1943, McCulloch, a neurobiologist, and Pitts, a statistician, published a seminal paper [75] which inspired the development of the modern digital computer. Later, Rosenblatt [76], also motivated by this paper, investigated the computation of the eye, which eventually led to the first generation of artificial neural networks, known as the perceptron. Since then, the theory and design of ANNs have advanced significantly. Over the last two decades, ANNs have found applications in pattern recognition, signal processing, intelligent control, system identification, optimization, and other fields [48, 49, 77] because of their excellent learning capacity and their high tolerance to partially inaccurate data. They are particularly suitable for problems too complex to be modeled and solved by classical mathematics and traditional procedures. A review article by Adeli [78] summarized the applications of ANNs in civil engineering during the 20th century.

Artificial neural networks are typically characterized by their computational elements, their network topology, and the learning algorithm used. According to the learning approach adopted, ANNs can be classified into two major groups: supervised and unsupervised. A supervised network is given pairs of inputs and desired outputs for training or learning; the network adjusts its weights until the errors between its outputs and the desired outputs fall within a predefined bound. An unsupervised network is commonly used for classification or clustering; its weights are adjusted using predefined criteria until the network has performed a classification. Since both types of networks are adopted for signal processing and/or damage detection in this work, they are introduced in turn in the following sections. The first network model introduced is a supervised neural network with an L-BFGS learning algorithm, and the second is an unsupervised neural network with a fuzzy reasoning algorithm.

3.2 Supervised Neural Network with L-BFGS Learning Algorithm

3.2.1 Backpropagation Network (BPN)

Among the several different types of ANN, the feedforward, multilayered, supervised neural network with the error back-propagation algorithm, the backpropagation network (BPN) [79], is by far the most frequently applied neural network learning model, owing to its simplicity. The architecture of BP networks, depicted in Figure 3.2, includes an input layer, one or more hidden layers, and an output layer.

Five components exist in such a network [48]: neurons (nodes), links, bias units (thresholds), weights, and transfer functions. The nodes in each layer are connected to every node in the adjacent layer. Notably, Hecht-Nielsen [80] proved that one hidden layer of neurons suffices to model any solution surface of practical interest. Hence, a network with only one hidden layer is considered in this work.

Figure 3.2 A typical three-layer neural network (input layer, hidden layer, and output layer, with connective weights $v_{kj}$ and $w_{ij}$, bias units, and the back-propagated error signal)

Before an ANN can be used, it must be trained from an existing training set of input-output pairs. The training of a supervised neural network using a BP learning algorithm normally involves three stages. The first stage is the data feed-forward. The net input to node j in the hidden layer is

$net_j = \sum_{k=1}^{N_i} v_{kj} x_k + \theta_{vj}$    (3.1)

where $v_{kj}$ is the connective weight between node k in the input layer and node j in the hidden layer; $x_k$ is the input of the kth node in the input layer; $\theta_{vj}$ is the bias term associated with the jth hidden node; and $N_i$ is the number of nodes in the input layer. The output of this node is then

$O_j = g(net_j)$    (3.2)

where g is the transfer function of the node. The transfer function can be linear or nonlinear according to the problem of interest. Similarly, the computed output of the ith node in the output layer is defined as

$y_i = g\left( \sum_{j=1}^{N_h} w_{ij}\, g\left( \sum_{k=1}^{N_i} v_{kj} x_k + \theta_{vj} \right) + \theta_{wi} \right), \quad i = 1, 2, \ldots, N_o$    (3.3)

where $w_{ij}$ is the connective weight between node j in the hidden layer and node i in the output layer; $\theta_{wi}$ is the bias term associated with the output layer; and $N_h$ and $N_o$ are the numbers of nodes in the hidden and output layers, respectively.
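As a concrete illustration of this feed-forward stage, the following NumPy sketch evaluates Equations (3.1)-(3.3) for one input vector. The function names, array shapes, and the choice of a sigmoid transfer function g are illustrative assumptions, not part of the original implementation.

```python
import numpy as np

def sigmoid(net):
    """One common choice of nonlinear transfer function g(.)."""
    return 1.0 / (1.0 + np.exp(-net))

def feed_forward(x, V, theta_v, W, theta_w):
    """Forward pass of a three-layer network.

    x       : input vector, shape (N_i,)
    V       : input-to-hidden weights v_kj, shape (N_i, N_h)
    theta_v : hidden-layer bias terms, shape (N_h,)
    W       : hidden-to-output weights w_ij, shape (N_h, N_o)
    theta_w : output-layer bias terms, shape (N_o,)
    """
    net_hidden = x @ V + theta_v        # Equation (3.1)
    o_hidden = sigmoid(net_hidden)      # Equation (3.2)
    net_output = o_hidden @ W + theta_w
    return sigmoid(net_output)          # Equation (3.3)
```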

The second stage is error back-propagation through the network. During training, a system error function is used to monitor the performance of the network. This function is often defined as

$E(W) = \frac{1}{2P} \sum_{p=1}^{P} (\tilde{Y}_p - Y_p)^T (\tilde{Y}_p - Y_p)$    (3.4a)

$\tilde{Y} = (\tilde{y}_1\ \tilde{y}_2\ \cdots\ \tilde{y}_i\ \cdots\ \tilde{y}_{N_o})$    (3.4b)

$Y = (y_1\ y_2\ \cdots\ y_i\ \cdots\ y_{N_o})$    (3.4c)

where W represents the collection of the weight matrices and bias terms of the current network, P is the number of training pairs, and $\tilde{y}_i$ is the desired (or measured) value of output node i. The final stage is the adjustment of the weights. The standard BP algorithm uses a gradient descent approach with a constant step length (learning ratio) to train the network:

$W^{(k+1)} = W^{(k)} + \Delta W^{(k)}$    (3.5)

$\Delta W^{(k)} = -\eta \, \frac{\partial E}{\partial W^{(k)}}$    (3.6)

where $\eta$ is the constant learning ratio, generally in the range [0, 1], and the superscript index (k) indicates the kth learning iteration. Back-propagation supervised neural network learning models, however, typically take an extended period to learn. Moreover, the convergence of a BP neural network depends strongly on the choice of the learning rate $\eta$. Herein, a more effective adaptive L-BFGS learning algorithm [81], based on the limited-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton second-order method [82] with an inexact line search algorithm, is employed.
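For comparison with the L-BFGS scheme described next, here is a minimal sketch of the system error of Equation (3.4) and the constant-step update of Equations (3.5)-(3.6). To keep the example self-contained, the gradient is approximated by finite differences rather than by the analytic back-propagation derivatives used in the thesis; all names are hypothetical.

```python
import numpy as np

def system_error(weights, forward, inputs, targets):
    """E(W) of Equation (3.4): mean squared error over the P training pairs."""
    P = len(inputs)
    total = 0.0
    for x, y_desired in zip(inputs, targets):
        diff = y_desired - forward(weights, x)
        total += diff @ diff
    return total / (2.0 * P)

def numerical_gradient(error_fn, weights, eps=1e-6):
    """Finite-difference stand-in for dE/dW (the thesis uses back-propagation)."""
    grad = np.zeros_like(weights)
    for i in range(weights.size):
        w_plus, w_minus = weights.copy(), weights.copy()
        w_plus[i] += eps
        w_minus[i] -= eps
        grad[i] = (error_fn(w_plus) - error_fn(w_minus)) / (2.0 * eps)
    return grad

def gradient_descent_step(weights, error_fn, eta=0.1):
    """One learning iteration, Equations (3.5)-(3.6): W(k+1) = W(k) - eta * dE/dW."""
    return weights - eta * numerical_gradient(error_fn, weights)
```

Here `weights` is assumed to be the flattened vector W of all weights and biases, and `error_fn` is a closure such as `lambda w: system_error(w, forward, inputs, targets)`.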

3.2.2 L-BFGS Learning Algorithm

In the conventional BFGS method, the approximation $H_{k+1}$ to the inverse Hessian matrix of the error function E is updated by

$H_{k+1} = (I - \rho_k s_k z_k^T)\, H_k\, (I - \rho_k z_k s_k^T) + \rho_k s_k s_k^T = V_k^T H_k V_k + \rho_k s_k s_k^T$    (3.7a)

where

$\rho_k = 1 / (z_k^T s_k)$    (3.7b)

$V_k = I - \rho_k z_k s_k^T$    (3.7c)

$s_k = W^{(k+1)} - W^{(k)}$    (3.7d)

$z_k = g_{k+1} - g_k$    (3.7e)

$g_k = \partial E / \partial W^{(k)}$    (3.7f)

Rather than forming the matrix $H_k$ explicitly as in the BFGS method, the limited-memory variant stores the vectors $s_k$ and $z_k$ and implicitly and dynamically updates the Hessian approximation using information from the preceding few iterations. The final stage of adjusting the weights in the supervised ANN is therefore modified as

$W^{(k+1)} = W^{(k)} + \alpha_k d_k$    (3.8)

The search direction is given by

$d_k = -H_k g_k + \beta_k d_{k-1}$    (3.9)
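The recursion of Equations (3.7)-(3.9) can be sketched as follows. This version keeps the full matrix $H_k$ for clarity, whereas the limited-memory variant stores only the most recent $(s_k, z_k)$ pairs; the extra $\beta_k d_{k-1}$ term of Equation (3.9) is also omitted here. Names and shapes are assumptions for illustration.

```python
import numpy as np

def bfgs_update(H, s, z):
    """Update the inverse-Hessian approximation, Equations (3.7a)-(3.7c).

    H : current approximation H_k, shape (n, n)
    s : step W(k+1) - W(k), Equation (3.7d)
    z : gradient difference g(k+1) - g(k), Equation (3.7e)
    """
    rho = 1.0 / (z @ s)                          # Equation (3.7b)
    I = np.eye(H.shape[0])
    V = I - rho * np.outer(z, s)                 # Equation (3.7c)
    return V.T @ H @ V + rho * np.outer(s, s)    # Equation (3.7a)

def search_direction(H, g):
    """Quasi-Newton direction d_k = -H_k g_k (the beta_k d_{k-1} term of
    Equation (3.9) is left out of this simplified sketch)."""
    return -H @ g
```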

$\beta_k = \dfrac{z_{k-1}^T H_k\, g_k}{z_{k-1}^T d_{k-1}}$    (3.10)

The step length $\alpha_k$ is adapted mathematically during the learning process according to the inexact line search algorithm, which is used in the L-BFGS learning algorithm in place of a constant learning ratio. The algorithm is based on three sequential operations: bracketing, sectioning, and interpolation. The bracketing operation brackets the potential step length $\alpha_k$ between two points through a series of function evaluations. Sectioning then takes the two points of the bracket, for example $\alpha_1$ and $\alpha_2$, as the initial points, reduces the step size piecemeal, and determines the minimum between the points to a desired degree of accuracy. Finally, a quadratic interpolation approach takes the three points $\alpha_1$, $\alpha_2$, and $(\alpha_1 + \alpha_2)/2$ to fit a parabola and thereby determine the step length $\alpha_k$. Accordingly, the step length $\alpha_k$ must satisfy the following conditions in each iteration:

$E(W^{(k)} + \alpha_k d_k) \le E(W^{(k)}) + \beta \alpha_k \, \nabla E(W^{(k)})^T d_k, \quad \beta \in (0, 1),\ \alpha_k > 0$    (3.11a)

$\nabla E(W^{(k)} + \alpha_k d_k)^T d_k \ge \theta \, \nabla E(W^{(k)})^T d_k, \quad \theta \in (\beta, 1),\ \alpha_k > 0$    (3.11b)

$\nabla E(W^{(k)} + \alpha_k d_k)^T d_k < 0$    (3.11c)

Hence, the problem of trial-and-error selection of a learning ratio in the conventional BP algorithm is avoided in the adaptive L-BFGS learning algorithm.
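In practice, a limited-memory BFGS update combined with an inexact (Wolfe-type) line search of the kind expressed by conditions (3.11) is available in standard optimization libraries. The sketch below uses SciPy's implementation with a toy quadratic error function standing in for the network error E(W); it is a generic illustration, not the implementation used in this work.

```python
import numpy as np
from scipy.optimize import minimize

# Toy stand-in for the network error E(W) and its gradient; in the actual
# application these would come from the three-layer network of Section 3.2.1.
def error(w):
    return 0.5 * np.sum((w - 1.0) ** 2)

def error_gradient(w):
    return w - 1.0

w0 = np.zeros(10)  # initial flattened weight vector W(0)
result = minimize(error, w0, jac=error_gradient, method="L-BFGS-B",
                  options={"maxcor": 10})  # maxcor: number of stored (s_k, z_k) pairs
trained_weights = result.x
```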

3.3 Unsupervised Fuzzy Neural Network Reasoning Model

The unsupervised fuzzy neural network (UFN) reasoning model was proposed by Hung and Jan [83]. The model has been successfully applied to problems of preliminary design [84, 85] and control of building structures [86], and is further applied here to the damage detection of structures [87]. The UFN reasoning model consists of an unsupervised neural network with a fuzzy computing process. Its basic concept is that the solution of a new instance can be obtained by retrieving similar instances from a collection of solved instances for a specified domain, called the instance base. The following is a brief review of the UFN reasoning model.

The UFN reasoning model is implemented in the following steps (Figure 3.3): (i) measuring the similarities between the new instance and existing instances; (ii) generating the fuzzy set of similar instances; and (iii) synthesizing the solution based on the fuzzy set of similar instances. These three steps are introduced in the following subsections.

3.3.1 Similarity Measurement

The first step involves searching the instance base for instances ($U_j$) that are similar to the new instance (Y) according to their inputs ($Y_i$ and $U_{j,i}$). It is performed through a single-layered, laterally connected network with an unsupervised competing algorithm. The similarity measurement is implemented by calculating the degree of difference between two instances. The function of the degree of difference is defined as

$d_{Yj} = \mathrm{diff}(Y_i, U_{j,i}) = \sum_{m=1}^{M} \alpha_m (x_m - u_{jm})^2$    (3.12)

where $d_{Yj}$ denotes the discrepancy between the inputs of the new instance Y and the jth instance $U_j$ in the instance base; $x_m$ and $u_{jm}$ are the mth input variables of the two instances; and $\alpha_m$ denotes a predefined weight representing the degree of importance of the mth decision variable in the input. After the values of $d_{Yj}$ for all instances are calculated, the degree of similarity between instances Y and $U_j$ is derived from the following fuzzy membership function:

$\mu_{Yj} = f(d_{Yj}, R_{max}, R_{min}) = \begin{cases} 0 & \text{if } d_{Yj} \ge R_{max} \\ \dfrac{R_{max} - d_{Yj}}{R_{max} - R_{min}} & \text{if } R_{min} < d_{Yj} < R_{max} \\ 1 & \text{if } d_{Yj} \le R_{min} \end{cases}$    (3.13)

The terms $R_{max}$ and $R_{min}$ define the upper and lower bounds of the degree of difference. Whenever the degree of difference is less than the upper bound, the two instances are treated as similar to some extent. Equations (3.12) and (3.13) show that the smaller the discrepancy $d_{Yj}$, the higher the degree of similarity. It is also seen from Equation (3.13) that the upper bound $R_{max}$ heavily influences the measurement of similarity. A large $R_{max}$ implies a loose similarity relationship between instances, so that a large number of instances are considered similar; since the solution of the new instance is based on the similar instances retrieved, choosing a large $R_{max}$ can result in an inferior solution. Conversely, a small $R_{max}$ indicates that a strict similarity relationship is adopted; most of the instances in the instance base are then sorted as dissimilar to the new instance, and the UFN reasoning model may generate no solution. Therefore, a linear correlation analysis is employed to systematically determine an appropriate value of $R_{max}$. More details about the correlation analysis process can be found in Appendix A [85, 88].
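A short sketch of the similarity measurement, Equations (3.12) and (3.13), is given below; the variable names (`alpha`, `r_max`, `r_min`) mirror the symbols above, but the functions themselves are illustrative assumptions rather than the original code.

```python
import numpy as np

def degree_of_difference(y_inputs, u_j_inputs, alpha):
    """d_Yj of Equation (3.12): weighted squared difference between the inputs
    of the new instance Y and the stored instance U_j."""
    return float(np.sum(alpha * (y_inputs - u_j_inputs) ** 2))

def fuzzy_membership(d_yj, r_max, r_min):
    """mu_Yj of Equation (3.13): degree of similarity derived from d_Yj."""
    if d_yj >= r_max:
        return 0.0
    if d_yj <= r_min:
        return 1.0
    return (r_max - d_yj) / (r_max - r_min)
```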

3.3.2 Fuzzy Set Generation

The second step entails representing the fuzzy relationships between the new instance and its similar instances. The instances whose degree of difference is smaller than $R_{max}$ (in other words, whose fuzzy membership value is larger than zero) are extracted from the instance base as similar instances. The fuzzy set of instances similar to Y is then formed from the similar instances and their corresponding fuzzy membership values, and is expressed as

$S_{sup,Y} = \{ S_1(\mu_1),\ S_2(\mu_2),\ \ldots,\ S_p(\mu_p),\ \ldots \}$    (3.14)

where $S_p$ is the pth instance similar to instance Y, and $\mu_p$ is the corresponding fuzzy membership value.

3.3.3 Solution Synthesis

Finally, the solution for instance Y is generated by synthesizing the outputs of the similar instances according to their associated fuzzy membership values through the center of gravity (COG) method. As a result, the output $Y_o$ of instance Y via the COG method is defined as follows:

$Y_o = \dfrac{\sum_{p} \mu_p S_{p,o}}{\sum_{p} \mu_p}$    (3.15)

where $S_{p,o}$ is the output of the pth similar instance.
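Continuing the sketch above (it relies on the `degree_of_difference` and `fuzzy_membership` helpers defined earlier), the fuzzy set of Equation (3.14) and the COG synthesis of Equation (3.15) could be combined as follows; the function and variable names are again assumptions, and instance outputs are assumed to be NumPy arrays or scalars.

```python
def ufn_solution(new_inputs, instance_inputs, instance_outputs,
                 alpha, r_max, r_min):
    """Steps (i)-(iii) of the UFN reasoning model for one new instance."""
    # (i)-(ii): build the fuzzy set of similar instances, Equation (3.14)
    similar = []
    for u_in, u_out in zip(instance_inputs, instance_outputs):
        d_yj = degree_of_difference(new_inputs, u_in, alpha)
        mu = fuzzy_membership(d_yj, r_max, r_min)
        if mu > 0.0:                 # keep only instances with nonzero membership
            similar.append((mu, u_out))
    if not similar:
        return None                  # no similar instance found; R_max may be too strict
    # (iii): centre-of-gravity synthesis of the output, Equation (3.15)
    numerator = sum(mu * out for mu, out in similar)
    denominator = sum(mu for mu, _ in similar)
    return numerator / denominator
```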

Figure 3.3 Process of the UFN reasoning model: (i) measuring the similarities between the new instance and existing instances, (ii) generating the fuzzy set of similar instances, and (iii) synthesizing the solution based on the fuzzy set via the COG method