Bidirectional Representation and Backpropagation Learning
Int'l Conf. on Advances in Big Data Analytics, ABDA'16
ISBN: , CSREA Press

Bidirectional Representation and Backpropagation Learning

Olaoluwa Adigun and Bart Kosko
Department of Electrical Engineering, Signal and Image Processing Institute, University of Southern California

Abstract - The backpropagation learning algorithm extends to bidirectional training of multilayer neural networks. The bidirectional operation gives a form of backward chaining or backward inference from a network output. We first prove that a fixed three-layer network of threshold neurons can exactly represent any finite permutation function and its inverse. The forward pass gives the function value. The backward pass through the same network gives the inverse value. We then derive and test a bidirectional version of the backpropagation algorithm that can learn bidirectional mappings or their approximations.

Keywords: bidirectional associative memory, backpropagation learning, function representation, backward inference

Fig. 1: Bidirectional Representation of a Permutation Function. (The figure shows an input layer, a hidden layer, and an output layer; the forward pass runs a^x -> a^h -> a^y and the backward pass runs a^y -> a^h -> a^x.) This 3-layer bidirectional threshold network exactly represents the invertible 3-bit bipolar permutation function f in Table 1. The forward pass feeds the input vector x to the input layer and passes it through the weighted links and the hidden layer of threshold neurons (each with zero threshold) to the output layer. The backward pass sends the output bit vector y back through the same weighted links and threshold neurons. The network computes y = f(x) on the forward pass and the inverse value f^{-1}(y) on the backward pass.

1 Bidirectional Backpropagation

We show that bidirectional backpropagation (B-BP) training endows a multilayered neural network N: R^n -> R^p with a form of backward inference. The forward pass gives the usual predicted neural output N(x) given a vector input x. The output vector value y = N(x) in effect answers the what-if question that x poses: What would we observe if x occurred? What would be the effect? Then the backward pass answers the why question that y poses: Why did y occur? What type of input would cause y? Feedback convergence to a resonating bidirectional fixed-point attractor [1], [2] gives a long-term or equilibrium answer to both the what-if and why questions.

This bidirectional approach to neural learning applies to big data because the BP algorithm [3], [4], [5] scales linearly with training data. BP has time complexity O(n) for n training samples because the forward pass is O(1) while the backward pass is O(n). So the new B-BP algorithm still has only O(n) complexity. This linear scaling does not hold in general for most machine-learning algorithms. An example is the quadratic complexity O(n^2) of support-vector kernel methods [6].

We present the bidirectional results in two parts. The first part proves that there exist fixed-weight multilayer threshold networks that can exactly represent some invertible functions. Theorem 1 shows that this holds for all finite bipolar (or binary) permutation functions. Figure 1 shows such a bidirectional 3-layer network of zero-threshold neurons. It exactly represents the 3-bit permutation function f in Table 1, where {-, +} denotes {-1, +1}. So f is a self-bijection that rearranges the 8 vectors in the bipolar hypercube {-1, 1}^3. The forward pass converts the input bipolar vector (1, 1, 1) into the output bipolar vector (-1, -1, 1). The backward pass converts (-1, -1, 1) into (1, 1, 1) over the same fixed synaptic connection weights. These same weights and neurons convert the other 7 input vectors in the first column of Table 1 to the corresponding 7 output vectors in the second column, and conversely.

Theorem 1 requires 2^n hidden neurons to represent a permutation function on the bipolar hypercube {-1, 1}^n. Using so many hidden neurons is neither practical nor necessary. The representation in Figure 1 uses only 4 hidden neurons. It is just one example of a representation that uses fewer than 8 hidden neurons. We seek instead an efficient learning algorithm that can learn bidirectional representations (or at least approximations) from sample data.

The second part extends the BP algorithm to just this bidirectional case. This takes some care because training the same weights in one direction tends to overwrite or undo the BP training in the other direction. The B-BP algorithm solves this problem by minimizing a joint error. It found representations of the permutation in Table 1 that needed only 3 hidden neurons. The learning approximation also improves by adding more hidden neurons. Figure 2 shows the effect of training with a fixed number of hidden neurons. Figure 3 shows how the B-BP error falls off as the number of hidden neurons grows when learning the 5-bit permutation in Table 2.

2 Bidirectional Function Representation of Bipolar Permutations

This section proves that there exist multilayered neural networks that can exactly and bidirectionally represent some invertible functions. We first define the network variables. The proof uses threshold neurons while the B-BP algorithm uses soft-threshold logistic sigmoids for hidden neurons and uses identity activations for input and output neurons.

A bidirectional neural network is a multilayer network N: X -> Y that maps the input space X to the output space Y and conversely through the same set of weights. The backward pass uses the transposes of the weight matrices that the forward pass uses. Such a network is a bidirectional associative memory or BAM [1], [2]. The forward pass sends the input vector x through the weight matrix W from the input layer to the hidden layer and then on through the matrix U to the output layer. The backward pass sends the output y from the output layer back through the hidden layer to the input layer. Let I, J, and K denote the respective numbers of input, hidden, and output neurons. Then the I x J matrix W connects the input layer to the hidden layer. The J x K matrix U connects the hidden layer to the output layer.

Table 1: 3-Bit Bipolar Permutation Function f

  Input x     Output t
  [+ + +]     [- - +]
  [+ + -]     [- + +]
  [+ - +]     [+ + +]
  [+ - -]     [+ - +]
  [- + +]     [- + +]
  [- + -]     [- - -]
  [- - +]     [+ - -]
  [- - -]     [+ + -]

The hidden-neuron input o_j^h has the affine form

  o_j^h = Σ_{i=1}^{I} w_{ij} a_i^x(x_i) + b_j^h    (1)

where weight w_{ij} connects the i-th input neuron to the j-th hidden neuron, a_i^x is the activation of the i-th input neuron, and b_j^h is the bias term of the j-th hidden neuron. The activation a_j^h of the j-th hidden neuron is a bipolar threshold:

  a_j^h(o_j^h) = -1 if o_j^h <= 0, and 1 if o_j^h > 0    (2)

The B-BP algorithm in the next section uses soft-threshold bipolar logistic functions for the hidden activations because such sigmoid functions are differentiable. The proof below also modifies the hidden thresholds to take on binary values in (13) and to fire with a slightly different condition.

The input o_k^y to the k-th output neuron from the hidden layer is also affine:

  o_k^y = Σ_{j=1}^{J} u_{jk} a_j^h + b_k^y    (3)

where weight u_{jk} connects the j-th hidden neuron to the k-th output neuron. Term b_k^y is the additive bias of the k-th output neuron. The output activation vector a^y gives the predicted outcome or target on the forward pass. The k-th output neuron has bipolar threshold activation a_k^y:

  a_k^y(o_k^y) = -1 if o_k^y <= 0, and 1 if o_k^y > 0    (4)

The forward pass of an input bipolar vector x from Table 1 through the network in Figure 1 gives an output activation vector a^y that equals the table's corresponding target vector y. The backward pass feeds y from the output layer back through the hidden layer to the input layer. Then the backward-pass input o_j^{hb} to the j-th hidden neuron is

  o_j^{hb} = Σ_{k=1}^{K} u_{jk} y_k + b_j^h    (5)

where y_k is the output of the k-th output neuron. The backward-pass activation a_j^{hb} of the j-th hidden neuron is

  a_j^{hb}(o_j^{hb}) = -1 if o_j^{hb} <= 0, and 1 if o_j^{hb} > 0    (6)

The backward-pass input o_i^{xb} to the i-th input neuron is

  o_i^{xb} = Σ_{j=1}^{J} w_{ij} a_j^{hb} + b_i^x    (7)

where b_i^x is the bias for the i-th input neuron. The input-layer activation a^x gives the predicted value for the backward pass. The i-th input neuron has bipolar activation

  a_i^x(o_i^{xb}) = -1 if o_i^{xb} <= 0, and 1 if o_i^{xb} > 0    (8)
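The threshold passes in equations (1)-(8) can be sketched in a few lines of NumPy. The example below is a minimal illustration, not the paper's network: it assumes zero biases and a 2-bit toy permutation of our own choosing, and it borrows the weight construction used in the representation proof of the next section (the columns of W list the input vectors and the rows of U list the output vectors).

```python
import numpy as np

# All 2^n bipolar vectors for n = 2, in canonical order.
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)

# A toy permutation f (hypothetical example, not the paper's Table 1):
# f cyclically shifts the list of vectors, so f(X[k]) = Y[k].
perm = [1, 2, 3, 0]
Y = X[perm]

n = X.shape[1]          # number of input (and output) neurons
W = X.T                 # n x 2^n: columns list all input vectors
U = Y                   # 2^n x n: rows list all output vectors

def forward(x):
    """Forward pass N(x): exactly one hidden threshold neuron fires."""
    a_h = (W.T @ x >= n).astype(float)          # binary hidden threshold at n
    return np.where(U.T @ a_h > 0, 1.0, -1.0)   # bipolar output threshold

def backward(y):
    """Backward pass through the same weights via their transposes."""
    a_h = (U @ y >= n).astype(float)
    return np.where(W @ a_h > 0, 1.0, -1.0)     # bipolar input threshold
```

On the forward pass only the hidden neuron matched to x reaches the maximal inner product n, so the output threshold reproduces f(x); the backward pass runs through the same weights transposed and recovers f^{-1}(y).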
We can now state and prove the bidirectional representation theorem for bipolar permutations. The theorem also applies to binary permutations because the input and output neurons have bipolar threshold activations.

Theorem 1 (Exact Bidirectional Representation of Bipolar Permutation Functions): Suppose that the invertible function f: {-1, 1}^n -> {-1, 1}^n is a permutation. Then there exists a 3-layer bidirectional neural network N: {-1, 1}^n -> {-1, 1}^n that exactly represents f in the sense that N(x) = f(x) and N^{-1}(x) = f^{-1}(x) for all x.

Proof: The proof strategy picks the weight matrices W and U so that only one hidden neuron fires on both the forward and the backward pass. So we structure the network such that any input vector x fires only one hidden neuron on the forward pass and such that the output vector y = N(x) fires only the same hidden neuron on the backward pass.

The bipolar permutation f is a bijective map of the bipolar hypercube {-1, 1}^n into itself. The bipolar hypercube contains the 2^n input bipolar column vectors x_1, x_2, ..., x_{2^n}. It likewise contains the 2^n output bipolar vectors y_1, y_2, ..., y_{2^n}. The network will use 2^n corresponding hidden threshold neurons. So J = 2^n. Matrix W connects the input layer to the hidden layer. Matrix U connects the hidden layer to the output layer. Define W so that its columns list all 2^n bipolar input vectors and define U so that its rows list all 2^n transposed bipolar output vectors:

  W = [x_1 x_2 ... x_{2^n}]  and  U = [y_1 y_2 ... y_{2^n}]^T

We now show that this arrangement fires only one hidden neuron and that the forward pass of any input vector x_m gives the corresponding output vector y_m. Assume that every neuron has zero bias. Pick a bipolar input vector x_m for the forward pass. Then the input activation vector a^x(x_m) = (a_1^x(x_m^1), ..., a_n^x(x_m^n)) equals the input bipolar vector x_m because the input activations (8) are bipolar threshold functions with zero threshold. So a^x equals x_m because the vector space is bipolar: {-1, 1}^n.

The hidden-layer input o^h is the same as (1). It has the matrix-vector form

  o^h = W^T a^x    (9)
      = W^T x_m    (10)
      = (o_1^h, o_2^h, ..., o_k^h, ..., o_{2^n}^h)^T    (11)
      = (x_1^T x_m, x_2^T x_m, ..., x_k^T x_m, ..., x_{2^n}^T x_m)^T    (12)

from the definition of W since o_k^h is the inner product of x_k and x_m. The input o_k^h to the k-th neuron of the hidden layer obeys o_k^h = n when k = m and o_k^h < n when k ≠ m. This holds because the vectors x_k are bipolar with scalar components in {-1, 1}, and so the inner product x_m^T x_m equals n. The inner product x_k^T x_m is maximal when both vectors have the same direction. This occurs when k = m. The inner product is otherwise less than n.

Now comes the key step in the proof. Define the hidden activation a_k^h as a binary (not bipolar) threshold function where n is the threshold value:

  a_k^h(o_k^h) = 1 if o_k^h >= n, and 0 if o_k^h < n    (13)

Then the hidden-layer activation a^h is the unit bit vector (0, ..., 0, 1, 0, ..., 0)^T where a_k^h = 1 when k = m and a_k^h = 0 when k ≠ m. This holds because all 2^n bipolar vectors x_k in {-1, 1}^n are distinct and so exactly one of these 2^n vectors achieves the maximum inner-product value n = x_m^T x_m.

The input vector o^y to the output layer is

  o^y = U^T a^h    (14)
      = Σ_k y_k a_k^h    (15)
      = y_m    (16)

where a_k^h is the activation of the k-th hidden neuron. The activation a^y of the output layer is

  a_k^y(o_k^y) = 1 if o_k^y >= 0, and -1 if o_k^y < 0    (17)

The output-layer activation leaves o^y unchanged because o^y equals y_m and because y_m is a vector in {-1, 1}^n. So

  a^y = y_m    (18)

So the forward pass of an input vector x_m through the network yields the desired corresponding output vector y_m where y_m = f(x_m) for the bipolar permutation map f.

Consider next the backward pass over N. The backward pass propagates the output vector y_m from the output layer to the input layer through the hidden layer. The hidden-layer input o^h has the form (5) and so

  o^h = U y_m    (19)
where o^h = (y_1^T y_m, y_2^T y_m, ..., y_k^T y_m, ..., y_{2^n}^T y_m)^T. The input o_k^h of the k-th neuron in the hidden layer equals the inner product of y_k and y_m. So o_k^h = n when k = m and o_k^h < n when k ≠ m. This holds because again the inner product of a bipolar vector in {-1, 1}^n with itself is n. The inner product o_k^h is maximal when the vectors y_m and y_k lie in the same direction. The activation a_k^h for the hidden layer has the same form as (13). So the hidden-layer activation a^h again equals the unit bit vector (0, ..., 0, 1, 0, ..., 0)^T where a_k^h = 1 when k = m and a_k^h = 0 when k ≠ m. Then the input vector o^x for the input layer is

  o^x = W a^h    (20)
      = Σ_k x_k a_k^h    (21)
      = x_m    (22)

The i-th input neuron has a threshold activation of the same form as (17):

  a_i^x(o_i^x) = 1 if o_i^x >= 0, and -1 if o_i^x < 0    (23)

where o_i^x is the input to the i-th neuron in the input layer. This activation leaves o^x unchanged because o^x equals x_m and because the vector x_m lies in {-1, 1}^n. So

  a^x = o^x    (24)
      = x_m    (25)

So the backward pass of any target vector y_m yields the desired input vector x_m where f^{-1}(y_m) = x_m. This completes the backward pass and the proof.

3 Bidirectional BP Learning

We now develop the new bidirectional BP algorithm for learning bidirectional function representations or approximations. Bidirectional BP training minimizes both the error function for the forward pass and the error function for the backward pass. The forward-pass error E_f is the error at the output layer. The backward-pass error E_b is the error at the input layer. Bidirectional BP training combines these two errors.

The forward pass sends the input vector x through the hidden layer to the output layer. We use only one hidden layer. There is no loss of generality in using any finite number of them. The hidden-layer input values o_j^h are the same as (1). The j-th hidden activation a_j^h is the bipolar logistic that shifts and scales the ordinary logistic:

  a_j^h(o_j^h) = 2 / (1 + e^{-o_j^h}) - 1    (26)

and (3) gives the input o_k^y to the k-th output neuron. The hidden activations can also be logistic or any other sigmoidal function. The activation for an output neuron is the identity function:

  a_k^y = o_k^y    (27)

where a_k^y is the activation of the k-th output neuron. The error function for the forward pass is the squared error

  E_f = (1/2) Σ_k (y_k - a_k^y)^2    (28)

where y_k is the value of the k-th neuron in the output layer. Ordinary unidirectional BP updates the weights and other network parameters by propagating the error from the output layer back to the input layer.

The backward pass sends the output vector y from the output layer to the input layer through the hidden layer. The input o_j^{hb} to the j-th hidden neuron is the same as (5). The activation a_j^{hb} of the j-th hidden neuron is

  a_j^{hb} = 2 / (1 + e^{-o_j^{hb}}) - 1    (29)

The input o_i^{xb} for the i-th input neuron is the same as (7). The activation at the input layer is the identity function:

  a_i^x = o_i^{xb}    (30)

A nonlinear sigmoid (or Gaussian) activation can replace the linear function. The backward-pass error E_b is

  E_b = (1/2) Σ_i (x_i - a_i^x)^2    (31)

The partial derivative of the hidden-layer activation in the forward direction is

  ∂a_j^h/∂o_j^h = ∂/∂o_j^h [2/(1 + e^{-o_j^h}) - 1]    (32)
                = 2 e^{-o_j^h} / (1 + e^{-o_j^h})^2    (33)
                = (1/2) [2/(1 + e^{-o_j^h})] [2 - 2/(1 + e^{-o_j^h})]    (34)
                = (1/2) (a_j^h + 1)(1 - a_j^h)    (35)

Let a_j^{h'} denote this derivative of a_j^h with respect to the inner-product term o_j^h. We again use the superscript b to denote the backward pass. Then the partial derivative of E_f with respect to the weight u_{jk} is

  ∂E_f/∂u_{jk} = ∂/∂u_{jk} [(1/2) Σ_k (y_k - a_k^y)^2]    (36)
               = ∂E_f/∂a_k^y · ∂a_k^y/∂o_k^y · ∂o_k^y/∂u_{jk}    (37)
               = -(y_k - a_k^y) a_j^h    (38)
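The derivative identity for the bipolar logistic derived above is easy to confirm numerically. This small sketch (an illustrative check, not part of the paper) compares the closed form (1/2)(1 + a)(1 - a) against a finite-difference estimate of da/do:

```python
import numpy as np

# Bipolar logistic a(o) = 2/(1 + exp(-o)) - 1 on a grid of inputs.
o = np.linspace(-4.0, 4.0, 201)
a = 2.0 / (1.0 + np.exp(-o)) - 1.0

# Closed-form derivative from the identity da/do = (1/2)(1 + a)(1 - a).
analytic = 0.5 * (1.0 + a) * (1.0 - a)

# Forward finite-difference estimate of the same derivative.
eps = 1e-6
a_shift = 2.0 / (1.0 + np.exp(-(o + eps))) - 1.0
numeric = (a_shift - a) / eps

max_gap = np.max(np.abs(analytic - numeric))
```

The derivative peaks at o = 0, where a = 0 and the identity gives the maximal slope 1/2.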
The partial derivative of E_f with respect to w_{ij} is

  ∂E_f/∂w_{ij} = ∂/∂w_{ij} [(1/2) Σ_k (y_k - a_k^y)^2]    (39)
               = (Σ_{k=1}^{K} ∂E_f/∂a_k^y · ∂a_k^y/∂o_k^y · ∂o_k^y/∂a_j^h) ∂a_j^h/∂o_j^h · ∂o_j^h/∂w_{ij}    (40)
               = -Σ_k (y_k - a_k^y) u_{jk} a_j^{h'} x_i    (41)

The partial derivative of E_f with respect to the bias b_k^y of the k-th output neuron is

  ∂E_f/∂b_k^y = ∂/∂b_k^y [(1/2) Σ_k (y_k - a_k^y)^2]    (42)
              = ∂E_f/∂a_k^y · ∂a_k^y/∂o_k^y · ∂o_k^y/∂b_k^y    (43)
              = -(y_k - a_k^y)    (44)

The partial derivative of E_f with respect to the bias b_j^h of the j-th hidden neuron is

  ∂E_f/∂b_j^h = ∂/∂b_j^h [(1/2) Σ_k (y_k - a_k^y)^2]    (45)
              = (Σ_{k=1}^{K} ∂E_f/∂a_k^y · ∂a_k^y/∂o_k^y · ∂o_k^y/∂a_j^h) ∂a_j^h/∂o_j^h · ∂o_j^h/∂b_j^h    (46)
              = -Σ_k (y_k - a_k^y) u_{jk} a_j^{h'}    (47)

The partial derivative of E_b with respect to w_{ij} is

  ∂E_b/∂w_{ij} = ∂/∂w_{ij} [(1/2) Σ_i (x_i - a_i^x)^2]    (48)
               = ∂E_b/∂a_i^x · ∂a_i^x/∂o_i^{xb} · ∂o_i^{xb}/∂w_{ij}    (49)
               = -(x_i - a_i^x) a_j^{hb}    (50)

The partial derivative of E_b with respect to u_{jk} is

  ∂E_b/∂u_{jk} = ∂/∂u_{jk} [(1/2) Σ_i (x_i - a_i^x)^2]    (51)
               = (Σ_i ∂E_b/∂a_i^x · ∂a_i^x/∂o_i^{xb} · ∂o_i^{xb}/∂a_j^{hb}) ∂a_j^{hb}/∂o_j^{hb} · ∂o_j^{hb}/∂u_{jk}    (52)
               = -Σ_i (x_i - a_i^x) w_{ij} a_j^{hb'} y_k    (53)

The partial derivative of E_b with respect to the bias b_i^x of the i-th input neuron is

  ∂E_b/∂b_i^x = ∂/∂b_i^x [(1/2) Σ_i (x_i - a_i^x)^2]    (54)
              = ∂E_b/∂a_i^x · ∂a_i^x/∂o_i^{xb} · ∂o_i^{xb}/∂b_i^x    (55)
              = -(x_i - a_i^x)    (56)

The partial derivative of E_b with respect to the bias b_j^h of the j-th hidden neuron is

  ∂E_b/∂b_j^h = ∂/∂b_j^h [(1/2) Σ_i (x_i - a_i^x)^2]    (57)
              = (Σ_i ∂E_b/∂a_i^x · ∂a_i^x/∂o_i^{xb} · ∂o_i^{xb}/∂a_j^{hb}) ∂a_j^{hb}/∂o_j^{hb} · ∂o_j^{hb}/∂b_j^h    (58)
              = -Σ_i (x_i - a_i^x) w_{ij} a_j^{hb'}    (59)

Bidirectional BP training minimizes the joint error E of the forward and backward passes. The joint error E sums the forward error E_f and the backward error E_b:

  E = E_f + E_b    (60)

Then the partial derivative of E with respect to u_{jk} is

  ∂E/∂u_{jk} = ∂E_f/∂u_{jk} + ∂E_b/∂u_{jk}    (61)
             = -(y_k - a_k^y) a_j^h - Σ_i (x_i - a_i^x) w_{ij} a_j^{hb'} y_k    (62)

from (36) and (51). The partial derivative of the joint error E with respect to the weight w_{ij} is

  ∂E/∂w_{ij} = ∂E_f/∂w_{ij} + ∂E_b/∂w_{ij}    (63)
             = -Σ_k (y_k - a_k^y) u_{jk} a_j^{h'} x_i    (64)
               - (x_i - a_i^x) a_j^{hb}    (65)

from (39) and (48). The partial derivative of E with respect to b_j^h gives

  ∂E/∂b_j^h = ∂E_f/∂b_j^h + ∂E_b/∂b_j^h    (66)
            = -Σ_k (y_k - a_k^y) u_{jk} a_j^{h'}    (67)
              - Σ_i (x_i - a_i^x) w_{ij} a_j^{hb'}    (68)

from (45) and (57).
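The joint-gradient expressions vectorize directly. The following sketch is a minimal NumPy illustration under assumed random weights and an arbitrary training pair (not the paper's experiment): it evaluates the joint error E = E_f + E_b, computes the analytic gradients, and checks one weight gradient against a finite difference.

```python
import numpy as np

def bipolar_logistic(o):
    """Return the bipolar logistic activation and its derivative."""
    a = 2.0 / (1.0 + np.exp(-o)) - 1.0
    return a, 0.5 * (1.0 + a) * (1.0 - a)

def joint_error(W, U, bh, bx, by, x, y):
    """Joint error E = E_f + E_b for one training pair (x, y)."""
    ah, _ = bipolar_logistic(W.T @ x + bh)   # forward hidden layer
    ay = U.T @ ah + by                       # identity output layer
    Ef = 0.5 * np.sum((y - ay) ** 2)
    ahb, _ = bipolar_logistic(U @ y + bh)    # backward hidden layer
    ax = W @ ahb + bx                        # identity input layer
    Eb = 0.5 * np.sum((x - ax) ** 2)
    return Ef + Eb

def joint_gradients(W, U, bh, bx, by, x, y):
    """Analytic gradients of the joint error E."""
    ah, dah = bipolar_logistic(W.T @ x + bh)
    ey = y - (U.T @ ah + by)                 # forward output error
    ahb, dahb = bipolar_logistic(U @ y + bh)
    ex = x - (W @ ahb + bx)                  # backward input error
    dU = -np.outer(ah, ey) - np.outer(dahb * (W.T @ ex), y)
    dW = -np.outer(x, dah * (U @ ey)) - np.outer(ex, ahb)
    dbh = -dah * (U @ ey) - dahb * (W.T @ ex)
    return dW, dU, dbh, -ex, -ey             # dbx = -ex, dby = -ey

rng = np.random.default_rng(0)
I, J, K = 3, 4, 3
W = rng.normal(size=(I, J)); U = rng.normal(size=(J, K))
bh = rng.normal(size=J); bx = rng.normal(size=I); by = rng.normal(size=K)
x = np.array([1.0, -1.0, 1.0]); y = np.array([-1.0, 1.0, 1.0])

dW, dU, dbh, dbx, dby = joint_gradients(W, U, bh, bx, by, x, y)

# Finite-difference check on one weight entry of W.
eps = 1e-6
W_eps = W.copy(); W_eps[1, 2] += eps
fd = (joint_error(W_eps, U, bh, bx, by, x, y)
      - joint_error(W, U, bh, bx, by, x, y)) / eps
```

A full B-BP update then subtracts the learning rate times each gradient; repeating the step over training pairs gives a batch version of the training loop.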
The error for the input-neuron bias is E_b only because the forward pass takes the input activation as x itself. The error for the output-neuron bias is E_f only because the backward pass takes the output activation as y itself. Then

  ∂E/∂b_i^x = ∂E_b/∂b_i^x = -(x_i - a_i^x)    (69)
  ∂E/∂b_k^y = ∂E_f/∂b_k^y = -(y_k - a_k^y)    (70)

Then B-BP training updates the parameters as

  u_{jk}^{(n+1)} = u_{jk}^{(n)} - η ∂E/∂u_{jk}    (71)
  w_{ij}^{(n+1)} = w_{ij}^{(n)} - η ∂E/∂w_{ij}    (72)
  b_i^{x(n+1)} = b_i^{x(n)} - η ∂E/∂b_i^x    (73)
  b_j^{h(n+1)} = b_j^{h(n)} - η ∂E/∂b_j^h    (74)
  b_k^{y(n+1)} = b_k^{y(n)} - η ∂E/∂b_k^y    (75)

where η is the learning rate. The partial derivatives are from (61)-(70). Algorithm 1 summarizes the B-BP algorithm.

Algorithm 1: The Bidirectional BP Algorithm
  Data: T input vectors {x_1, ..., x_T} and T target vectors {y_1, ..., y_T} such that f(x_i) = y_i; number of hidden neurons J; batch size S; number of epochs R; learning rate η.
  Result: A bidirectional network for the function f.
  Initialize: Randomly pick the weights of W^(0) and U^(0) and the bias weights {b^x, b^h, b^y} of the input, hidden, and output neurons.
  while epoch r = 1 : R do
    Initialize the batch gradient sums: ∇W = 0, ∇U = 0, ∇b^x = 0, ∇b^h = 0, ∇b^y = 0
    while batch sample l = 1 : S do
      Pick an input vector x and its corresponding target vector y
      Compute the hidden activation a^h and the output activation a^y for the forward pass
      Compute the hidden activation a^{hb} and the input activation a^x for the backward pass
      Compute the derivatives with respect to W and U: ∇_W E and ∇_U E
      Compute the derivatives with respect to the bias weights: ∇_{b^x} E, ∇_{b^h} E, and ∇_{b^y} E
      Accumulate the gradients: ∇W <- ∇W + ∇_W E, ∇U <- ∇U + ∇_U E, ∇b^x <- ∇b^x + ∇_{b^x} E, ∇b^h <- ∇b^h + ∇_{b^h} E, ∇b^y <- ∇b^y + ∇_{b^y} E
    end
    Update:
      W^(r+1) = W^(r) - (η/S) ∇W
      U^(r+1) = U^(r) - (η/S) ∇U
      b^x(r+1) = b^x(r) - (η/S) ∇b^x
      b^h(r+1) = b^h(r) - (η/S) ∇b^h
      b^y(r+1) = b^y(r) - (η/S) ∇b^y
  end

4 Simulation Results

We tested the bidirectional BP algorithm on a 5-bit permutation function in 3-layer networks with different numbers of hidden neurons. The B-BP algorithm produced either an exact representation or an approximation. We report results for learning a permutation function on the 5-bit bipolar vector space {-1, 1}^5. The hidden neurons used bipolar logistic activations. The input and output neurons used identity activations. Table 2 displays the permutation test function that mapped {-1, 1}^5 to itself.

Table 2: 5-Bit Bipolar Permutation Function f (input-output pairs for all 32 vectors in {-1, 1}^5)

We compared the forward and backward forms of unidirectional BP with bidirectional BP. We also tested whether adding more hidden neurons improved the network approximation accuracy. The simulations used separate sample sets for network training and testing. Forward-pass (standard) BP used E_f as its error while backward-pass BP used E_b as its error. Bidirectional BP combined both E_f and E_b for its joint error. We computed the testing error for the forward pass and the backward pass. Each plotted error value is an average over several runs.

Figure 2 shows the results of running the three types of BP learning on a 3-layer network with a fixed number of hidden neurons. The bidirectional training error falls along both directions as the training progresses. This is not the case for the unidirectional cases of forward BP and backward BP training. Forward training and backward training perform well only for function approximation in their preferred direction and not in the opposite direction.

Fig. 2: Training-set squared error with forward BP training, backward BP training, and bidirectional BP training. (The three panels plot E_f and E_b against the training iterations for forward, backward, and bidirectional backpropagation.) Forward BP tuned the network with respect to E_f only. Backward BP training tuned it with respect to E_b only. Bidirectional BP training combined E_f and E_b to update the network parameters.

Fig. 3: B-BP testing error for the 5-bit permutation in Table 2 with different numbers of hidden neurons. The two curves describe the error for the forward and backward passes through the 3-layer network. The number of hidden neurons took several increasing values starting at 5.

Table 3 shows the forward-pass test errors for 3-layer neural networks as the number of hidden neurons grows. We again compared the three forms of BP for the network training: two forms of unidirectional BP and bidirectional BP.

Table 3: Forward-Pass Testing Error E_f for forward, backward, and bidirectional BP at different numbers of hidden neurons

Table 4: Backward-Pass Testing Error E_b for forward, backward, and bidirectional BP at different numbers of hidden neurons

The forward-pass error for forward BP fell significantly as the number of hidden neurons grew. The forward-pass error of backward BP decreased only slightly as the number of hidden neurons grew and gave the worst performance. Bidirectional BP performed well on the test set. Its forward-pass error also fell significantly as the number of hidden neurons grew. Table 4 shows similar error-versus-hidden-neuron results for the backward-pass error. The two tables jointly show that the unidirectional forms of BP performed well only in one direction while the B-BP algorithm performed well in both directions. Hence we propose using only the B-BP algorithm for learning bidirectional function representations or approximations.

5 Conclusions

We have shown that a bidirectional multilayer network can exactly represent bipolar or binary permutation mappings if the network uses enough threshold or sigmoidal neurons. The proof requires an exponential number of hidden neurons for exact representation, but much simpler representations exist in general. The new B-BP algorithm allows bidirectional learning of sampled functions. It can often find efficient bidirectional representations or approximations with a smaller set of hidden neurons.

References

[1] B. Kosko, "Bidirectional associative memories," IEEE Transactions on Systems, Man, and Cybernetics, vol. 18, no. 1, pp. 49-60, 1988.
[2] B. Kosko, Neural Networks and Fuzzy Systems: A Dynamical Systems Approach to Machine Intelligence. Prentice Hall, 1991.
[3] D. Rumelhart, G. Hinton, and R. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, pp. 533-536, 1986.
[4] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, pp. 436-444, 2015.
[5] M. Jordan and T. Mitchell, "Machine learning: trends, perspectives, and prospects," Science, vol. 349, pp. 255-260, 2015.
[6] S. Y. Kung, Kernel Methods and Machine Learning. Cambridge University Press, 2014.
More information18.6 Regression and Classification with Linear Models
18.6 Regression and Classification with Linear Models 352 The hypothesis space of linear functions of continuous-valued inputs has been used for hundreds of years A univariate linear function (a straight
More informationRadial-Basis Function Networks
Radial-Basis Function etworks A function is radial () if its output depends on (is a nonincreasing function of) the distance of the input from a given stored vector. s represent local receptors, as illustrated
More informationIntroduction to Neural Networks
Introduction to Neural Networks Steve Renals Automatic Speech Recognition ASR Lecture 10 24 February 2014 ASR Lecture 10 Introduction to Neural Networks 1 Neural networks for speech recognition Introduction
More informationMulti-layer Neural Networks
Multi-layer Neural Networks Steve Renals Informatics 2B Learning and Data Lecture 13 8 March 2011 Informatics 2B: Learning and Data Lecture 13 Multi-layer Neural Networks 1 Overview Multi-layer neural
More informationCSC 411 Lecture 10: Neural Networks
CSC 411 Lecture 10: Neural Networks Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 10-Neural Networks 1 / 35 Inspiration: The Brain Our brain has 10 11
More informationCHAPTER 3. Pattern Association. Neural Networks
CHAPTER 3 Pattern Association Neural Networks Pattern Association learning is the process of forming associations between related patterns. The patterns we associate together may be of the same type or
More informationCS4442/9542b Artificial Intelligence II prof. Olga Veksler. Lecture 5 Machine Learning. Neural Networks. Many presentation Ideas are due to Andrew NG
CS4442/9542b Artificial Intelligence II prof. Olga Vesler Lecture 5 Machine Learning Neural Networs Many presentation Ideas are due to Andrew NG Outline Motivation Non linear discriminant functions Introduction
More informationMultilayer Perceptron
Aprendizagem Automática Multilayer Perceptron Ludwig Krippahl Aprendizagem Automática Summary Perceptron and linear discrimination Multilayer Perceptron, nonlinear discrimination Backpropagation and training
More informationEEE 241: Linear Systems
EEE 4: Linear Systems Summary # 3: Introduction to artificial neural networks DISTRIBUTED REPRESENTATION An ANN consists of simple processing units communicating with each other. The basic elements of
More informationPattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore
Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore Lecture - 27 Multilayer Feedforward Neural networks with Sigmoidal
More informationA Logarithmic Neural Network Architecture for Unbounded Non-Linear Function Approximation
1 Introduction A Logarithmic Neural Network Architecture for Unbounded Non-Linear Function Approximation J Wesley Hines Nuclear Engineering Department The University of Tennessee Knoxville, Tennessee,
More informationIntroduction to Neural Networks
Introduction to Neural Networks Philipp Koehn 4 April 205 Linear Models We used before weighted linear combination of feature values h j and weights λ j score(λ, d i ) = j λ j h j (d i ) Such models can
More informationNeural Networks and Deep Learning
Neural Networks and Deep Learning Professor Ameet Talwalkar November 12, 2015 Professor Ameet Talwalkar Neural Networks and Deep Learning November 12, 2015 1 / 16 Outline 1 Review of last lecture AdaBoost
More informationSpeaker Representation and Verification Part II. by Vasileios Vasilakakis
Speaker Representation and Verification Part II by Vasileios Vasilakakis Outline -Approaches of Neural Networks in Speaker/Speech Recognition -Feed-Forward Neural Networks -Training with Back-propagation
More informationMultilayer Perceptron = FeedForward Neural Network
Multilayer Perceptron = FeedForward Neural Networ History Definition Classification = feedforward operation Learning = bacpropagation = local optimization in the space of weights Pattern Classification
More informationESTIMATING THE ACTIVATION FUNCTIONS OF AN MLP-NETWORK
ESTIMATING THE ACTIVATION FUNCTIONS OF AN MLP-NETWORK P.V. Vehviläinen, H.A.T. Ihalainen Laboratory of Measurement and Information Technology Automation Department Tampere University of Technology, FIN-,
More informationNeural Networks. Chapter 18, Section 7. TB Artificial Intelligence. Slides from AIMA 1/ 21
Neural Networks Chapter 8, Section 7 TB Artificial Intelligence Slides from AIMA http://aima.cs.berkeley.edu / 2 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural
More informationUnit III. A Survey of Neural Network Model
Unit III A Survey of Neural Network Model 1 Single Layer Perceptron Perceptron the first adaptive network architecture was invented by Frank Rosenblatt in 1957. It can be used for the classification of
More informationCSC321 Lecture 5: Multilayer Perceptrons
CSC321 Lecture 5: Multilayer Perceptrons Roger Grosse Roger Grosse CSC321 Lecture 5: Multilayer Perceptrons 1 / 21 Overview Recall the simple neuron-like unit: y output output bias i'th weight w 1 w2 w3
More informationMultilayer Feedforward Networks. Berlin Chen, 2002
Multilayer Feedforard Netors Berlin Chen, 00 Introduction The single-layer perceptron classifiers discussed previously can only deal ith linearly separable sets of patterns The multilayer netors to be
More informationCOGS Q250 Fall Homework 7: Learning in Neural Networks Due: 9:00am, Friday 2nd November.
COGS Q250 Fall 2012 Homework 7: Learning in Neural Networks Due: 9:00am, Friday 2nd November. For the first two questions of the homework you will need to understand the learning algorithm using the delta
More informationNeural networks. Chapter 20. Chapter 20 1
Neural networks Chapter 20 Chapter 20 1 Outline Brains Neural networks Perceptrons Multilayer networks Applications of neural networks Chapter 20 2 Brains 10 11 neurons of > 20 types, 10 14 synapses, 1ms
More informationLast update: October 26, Neural networks. CMSC 421: Section Dana Nau
Last update: October 26, 207 Neural networks CMSC 42: Section 8.7 Dana Nau Outline Applications of neural networks Brains Neural network units Perceptrons Multilayer perceptrons 2 Example Applications
More informationECE 471/571 - Lecture 17. Types of NN. History. Back Propagation. Recurrent (feedback during operation) Feedforward
ECE 47/57 - Lecture 7 Back Propagation Types of NN Recurrent (feedback during operation) n Hopfield n Kohonen n Associative memory Feedforward n No feedback during operation or testing (only during determination
More informationMACHINE LEARNING AND PATTERN RECOGNITION Fall 2006, Lecture 4.1 Gradient-Based Learning: Back-Propagation and Multi-Module Systems Yann LeCun
Y. LeCun: Machine Learning and Pattern Recognition p. 1/2 MACHINE LEARNING AND PATTERN RECOGNITION Fall 2006, Lecture 4.1 Gradient-Based Learning: Back-Propagation and Multi-Module Systems Yann LeCun The
More informationCS 4700: Foundations of Artificial Intelligence
CS 4700: Foundations of Artificial Intelligence Prof. Bart Selman selman@cs.cornell.edu Machine Learning: Neural Networks R&N 18.7 Intro & perceptron learning 1 2 Neuron: How the brain works # neurons
More informationy(x n, w) t n 2. (1)
Network training: Training a neural network involves determining the weight parameter vector w that minimizes a cost function. Given a training set comprising a set of input vector {x n }, n = 1,...N,
More informationArtificial Neural Networks
Introduction ANN in Action Final Observations Application: Poverty Detection Artificial Neural Networks Alvaro J. Riascos Villegas University of los Andes and Quantil July 6 2018 Artificial Neural Networks
More informationIntroduction to Natural Computation. Lecture 9. Multilayer Perceptrons and Backpropagation. Peter Lewis
Introduction to Natural Computation Lecture 9 Multilayer Perceptrons and Backpropagation Peter Lewis 1 / 25 Overview of the Lecture Why multilayer perceptrons? Some applications of multilayer perceptrons.
More informationPart 8: Neural Networks
METU Informatics Institute Min720 Pattern Classification ith Bio-Medical Applications Part 8: Neural Netors - INTRODUCTION: BIOLOGICAL VS. ARTIFICIAL Biological Neural Netors A Neuron: - A nerve cell as
More informationArtifical Neural Networks
Neural Networks Artifical Neural Networks Neural Networks Biological Neural Networks.................................. Artificial Neural Networks................................... 3 ANN Structure...........................................
More information2015 Todd Neller. A.I.M.A. text figures 1995 Prentice Hall. Used by permission. Neural Networks. Todd W. Neller
2015 Todd Neller. A.I.M.A. text figures 1995 Prentice Hall. Used by permission. Neural Networks Todd W. Neller Machine Learning Learning is such an important part of what we consider "intelligence" that
More informationLearning and Memory in Neural Networks
Learning and Memory in Neural Networks Guy Billings, Neuroinformatics Doctoral Training Centre, The School of Informatics, The University of Edinburgh, UK. Neural networks consist of computational units
More informationThe Multi-Layer Perceptron
EC 6430 Pattern Recognition and Analysis Monsoon 2011 Lecture Notes - 6 The Multi-Layer Perceptron Single layer networks have limitations in terms of the range of functions they can represent. Multi-layer
More informationNeural Networks (Part 1) Goals for the lecture
Neural Networks (Part ) Mark Craven and David Page Computer Sciences 760 Spring 208 www.biostat.wisc.edu/~craven/cs760/ Some of the slides in these lectures have been adapted/borrowed from materials developed
More informationSPSS, University of Texas at Arlington. Topics in Machine Learning-EE 5359 Neural Networks
Topics in Machine Learning-EE 5359 Neural Networks 1 The Perceptron Output: A perceptron is a function that maps D-dimensional vectors to real numbers. For notational convenience, we add a zero-th dimension
More informationNeural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /9/17
3/9/7 Neural Networks Emily Fox University of Washington March 0, 207 Slides adapted from Ali Farhadi (via Carlos Guestrin and Luke Zettlemoyer) Single-layer neural network 3/9/7 Perceptron as a neural
More informationNeural Network Identification of Non Linear Systems Using State Space Techniques.
Neural Network Identification of Non Linear Systems Using State Space Techniques. Joan Codina, J. Carlos Aguado, Josep M. Fuertes. Automatic Control and Computer Engineering Department Universitat Politècnica
More informationKeywords- Source coding, Huffman encoding, Artificial neural network, Multilayer perceptron, Backpropagation algorithm
Volume 4, Issue 5, May 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Huffman Encoding
More informationArtificial Intelligence Hopfield Networks
Artificial Intelligence Hopfield Networks Andrea Torsello Network Topologies Single Layer Recurrent Network Bidirectional Symmetric Connection Binary / Continuous Units Associative Memory Optimization
More informationIntroduction to Deep Learning
Introduction to Deep Learning Some slides and images are taken from: David Wolfe Corne Wikipedia Geoffrey A. Hinton https://www.macs.hw.ac.uk/~dwcorne/teaching/introdl.ppt Feedforward networks for function
More informationThe Backpropagation Algorithm Functions for the Multilayer Perceptron
Proceedings of the th WSEAS International Conference on Sustainability in Science Engineering The Bacpropagation Algorithm Functions for the Multilayer Perceptron MARIUS-CONSTANTIN POPESCU VALENTINA BALAS
More informationIntelligent Systems Discriminative Learning, Neural Networks
Intelligent Systems Discriminative Learning, Neural Networks Carsten Rother, Dmitrij Schlesinger WS2014/2015, Outline 1. Discriminative learning 2. Neurons and linear classifiers: 1) Perceptron-Algorithm
More informationLecture 17: Neural Networks and Deep Learning
UVA CS 6316 / CS 4501-004 Machine Learning Fall 2016 Lecture 17: Neural Networks and Deep Learning Jack Lanchantin Dr. Yanjun Qi 1 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions
More informationArtificial Neural Networks 2
CSC2515 Machine Learning Sam Roweis Artificial Neural s 2 We saw neural nets for classification. Same idea for regression. ANNs are just adaptive basis regression machines of the form: y k = j w kj σ(b
More informationECLT 5810 Classification Neural Networks. Reference: Data Mining: Concepts and Techniques By J. Hand, M. Kamber, and J. Pei, Morgan Kaufmann
ECLT 5810 Classification Neural Networks Reference: Data Mining: Concepts and Techniques By J. Hand, M. Kamber, and J. Pei, Morgan Kaufmann Neural Networks A neural network is a set of connected input/output
More informationBackpropagation Neural Net
Backpropagation Neural Net As is the case with most neural networks, the aim of Backpropagation is to train the net to achieve a balance between the ability to respond correctly to the input patterns that
More informationBits of Machine Learning Part 1: Supervised Learning
Bits of Machine Learning Part 1: Supervised Learning Alexandre Proutiere and Vahan Petrosyan KTH (The Royal Institute of Technology) Outline of the Course 1. Supervised Learning Regression and Classification
More informationNeural Networks Learning the network: Backprop , Fall 2018 Lecture 4
Neural Networks Learning the network: Backprop 11-785, Fall 2018 Lecture 4 1 Recap: The MLP can represent any function The MLP can be constructed to represent anything But how do we construct it? 2 Recap:
More informationArtificial Neural Networks. Historical description
Artificial Neural Networks Historical description Victor G. Lopez 1 / 23 Artificial Neural Networks (ANN) An artificial neural network is a computational model that attempts to emulate the functions of
More informationCOMP9444 Neural Networks and Deep Learning 2. Perceptrons. COMP9444 c Alan Blair, 2017
COMP9444 Neural Networks and Deep Learning 2. Perceptrons COMP9444 17s2 Perceptrons 1 Outline Neurons Biological and Artificial Perceptron Learning Linear Separability Multi-Layer Networks COMP9444 17s2
More information1 What a Neural Network Computes
Neural Networks 1 What a Neural Network Computes To begin with, we will discuss fully connected feed-forward neural networks, also known as multilayer perceptrons. A feedforward neural network consists
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationNeural networks. Chapter 20, Section 5 1
Neural networks Chapter 20, Section 5 Chapter 20, Section 5 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 20, Section 5 2 Brains 0 neurons of
More informationMachine Learning
Machine Learning 10-315 Maria Florina Balcan Machine Learning Department Carnegie Mellon University 03/29/2019 Today: Artificial neural networks Backpropagation Reading: Mitchell: Chapter 4 Bishop: Chapter
More informationLecture 3 Feedforward Networks and Backpropagation
Lecture 3 Feedforward Networks and Backpropagation CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago April 3, 2017 Things we will look at today Recap of Logistic Regression
More informationLab 5: 16 th April Exercises on Neural Networks
Lab 5: 16 th April 01 Exercises on Neural Networks 1. What are the values of weights w 0, w 1, and w for the perceptron whose decision surface is illustrated in the figure? Assume the surface crosses the
More informationComputational statistics
Computational statistics Lecture 3: Neural networks Thierry Denœux 5 March, 2016 Neural networks A class of learning methods that was developed separately in different fields statistics and artificial
More informationCourse 10. Kernel methods. Classical and deep neural networks.
Course 10 Kernel methods. Classical and deep neural networks. Kernel methods in similarity-based learning Following (Ionescu, 2018) The Vector Space Model ò The representation of a set of objects as vectors
More informationCS 4700: Foundations of Artificial Intelligence
CS 4700: Foundations of Artificial Intelligence Prof. Bart Selman selman@cs.cornell.edu Machine Learning: Neural Networks R&N 18.7 Intro & perceptron learning 1 2 Neuron: How the brain works # neurons
More informationClassification with Perceptrons. Reading:
Classification with Perceptrons Reading: Chapters 1-3 of Michael Nielsen's online book on neural networks covers the basics of perceptrons and multilayer neural networks We will cover material in Chapters
More informationSTA 414/2104: Lecture 8
STA 414/2104: Lecture 8 6-7 March 2017: Continuous Latent Variable Models, Neural networks Delivered by Mark Ebden With thanks to Russ Salakhutdinov, Jimmy Ba and others Outline Continuous latent variable
More informationAddress for Correspondence
Research Article APPLICATION OF ARTIFICIAL NEURAL NETWORK FOR INTERFERENCE STUDIES OF LOW-RISE BUILDINGS 1 Narayan K*, 2 Gairola A Address for Correspondence 1 Associate Professor, Department of Civil
More information