CS407 Neural Computation
- Aldous Simpson
1 CS407 Neural Computation Lecture 5: The Multi-Layer Perceptron (MLP) and Backpropagation Lecturer: A/Prof. M. Bennamoun
2 What is a perceptron and what is a Multi-Layer Perceptron (MLP)?
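3 What is a perceptron? Input signals $x_1, x_2, \dots, x_m$ are scaled by the synaptic weights $w_{k1}, w_{k2}, \dots, w_{km}$ and added to the bias $b_k$ at the summing junction: $v_k = \sum_{j=1}^{m} w_{kj} x_j + b_k$. The activation function then gives the output $y_k = \varphi(v_k)$. Discrete Perceptron: $\varphi(\cdot) = \mathrm{sign}(\cdot)$. Continuous Perceptron: $\varphi(\cdot)$ = S shape.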
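The summing junction and the two activation choices above can be sketched in a few lines of Python. This is only an illustration: the input, weight and bias values are made up, the logistic function is one possible S-shaped choice, and the convention sign(0) = +1 is an assumption.

```python
import math

def perceptron_output(x, w, b, activation):
    """Return y_k = phi(v_k) with v_k = sum_j w_kj * x_j + b_k."""
    v = sum(wj * xj for wj, xj in zip(w, x)) + b
    return activation(v)

def sign(v):
    # Discrete perceptron; treating v == 0 as +1 is a convention assumed here.
    return 1 if v >= 0 else -1

def logistic(v):
    # Continuous perceptron; the logistic curve is one common S-shaped choice.
    return 1.0 / (1.0 + math.exp(-v))

# Illustrative values only (not taken from the slides).
x = [1.0, -2.0, 0.5]
w = [0.4, 0.3, -0.2]
b = 0.1

y_discrete = perceptron_output(x, w, b, sign)        # v_k is about -0.2
y_continuous = perceptron_output(x, w, b, logistic)
```

With these values the net input is about -0.2, so the discrete unit outputs -1 while the continuous unit outputs a value just below 0.5.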
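4 Activation Function of a perceptron [Figure: two plots of the activation against $v_i$: the signum function (sign), stepping between -1 and +1, and an S-shaped curve.] Discrete Perceptron: $\varphi(\cdot) = \mathrm{sign}(\cdot)$. Continuous Perceptron: $\varphi(v)$ = S shape.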
5 MLP Architecture The Multi-Layer Perceptron was first introduced by M. Minsky and S. Papert in 1969. Type: Feedforward. Neuron layers: 1 input layer, 1 or more hidden layers, 1 output layer. Learning Method: Supervised.
6 Terminology/Conventions Arrows indicate the direction of data flow. The first layer, termed input layer, just contains the input vector and does not perform any computations. The second layer, termed hidden layer, receives input from the input layer and sends its output to the output layer. After applying their activation function, the neurons in the output layer contain the output vector.
7 Why the MLP? The single-layer perceptron classifiers discussed previously can only deal with linearly separable sets of patterns. The multilayer networks to be introduced here are the most widespread neural network architecture. They were not made useful until the 1980s, because of the lack of efficient training algorithms (McClelland and Rumelhart 1986). The introduction of the backpropagation training algorithm changed this.
8 Different Non-Linearly Separable Problems
Structure: Types of Decision Regions (figure columns: Exclusive-OR Problem, Classes with Meshed Regions, Most General Region Shapes)
Single-Layer: Half plane bounded by hyperplane
Two-Layer: Convex open or closed regions
Three-Layer: Arbitrary (complexity limited by number of nodes)
9 What is backpropagation training and how does it work?
10 What is Backpropagation? Supervised error back-propagation training. The mechanism of backward error transmission (delta learning rule) is used to modify the synaptic weights of the internal (hidden) and output layers. The mapping error can be propagated into hidden layers. The network can implement arbitrarily complex input/output mappings or decision surfaces to separate pattern classes, for which the explicit derivation of mappings and discovery of relationships is almost impossible. It can produce surprising results and generalizations.
11 Architecture: Backpropagation Network The Backpropagation Net was first introduced by G.E. Hinton, D.E. Rumelhart and R.J. Williams in 1986. Type: Feedforward. Neuron layers: 1 input layer, 1 or more hidden layers, 1 output layer. Learning Method: Supervised. Reference: Clara Boyd
12 Backpropagation Preparation. Training Set: a collection of input-output patterns that are used to train the network. Testing Set: a collection of input-output patterns that are used to assess network performance. Learning Rate α: a scalar parameter, analogous to step size in numerical integration, used to set the rate of adjustments.
13 Backpropagation training cycle (Reference: Eric Plummer) 1/ Feedforward of the input training pattern 2/ Backpropagation of the associated error 3/ Adjustment of the weights
14 Backpropagation Neural Networks: Architecture; BP training algorithm; Generalization; Examples (Example 1, Example 2); Uses (applications) of BP networks; Options/variations on BP (momentum, sequential vs. batch, adaptive learning rates); Appendix; References and suggested reading
15 BP NN With Single Hidden Layer (Reference: Dan St. Clair; Fausett: Chapter 6) [Figure: the I/P layer is connected to the hidden layer by weights $v_{i,j}$, and the hidden layer to the O/P layer by weights $w_{j,k}$.] Source: Fausett, L., Fundamentals of Neural Networks, Prentice Hall, 1994. Notation follows Fausett.
16 Notation $x$ = input training vector. $t$ = output target vector. $\delta_k$ = portion of error correction weight for $w_{jk}$ that is due to an error at output unit $Y_k$; also the information about the error at unit $Y_k$ that is propagated back to the hidden units that feed into unit $Y_k$. $\delta_j$ = portion of error correction weight for $v_{ij}$ that is due to the backpropagation of error information from the output layer to the hidden unit $Z_j$. $\alpha$ = learning rate. $v_{0j}$ = bias on hidden unit $j$. $w_{0k}$ = bias on output unit $k$.
17 Activation Functions [Figure labels: binary step, hyperbolic tangent.] The activation function should be continuous, differentiable, and monotonically non-decreasing. Plus, its derivative should be easy to compute. Example (logistic sigmoid): $f(x) = \frac{1}{1 + \exp(-x)}$, with $f'(x) = f(x)\,[1 - f(x)]$. Source: Fausett, L., Fundamentals of Neural Networks, Prentice Hall,
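As a quick check of the "easy to compute" derivative, the sketch below implements the logistic sigmoid and its derivative $f'(x) = f(x)[1 - f(x)]$ and compares the latter with a central finite difference. The step size `h` and the test point `x0` are arbitrary choices made for this illustration.

```python
import math

def f(x):
    """Logistic sigmoid f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def f_prime(x):
    """Derivative written in terms of the function value itself."""
    fx = f(x)
    return fx * (1.0 - fx)

def numeric_derivative(func, x, h=1e-6):
    """Central finite-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

x0 = 0.5
analytic = f_prime(x0)
numeric = numeric_derivative(f, x0)
```

The derivative peaks at $x = 0$, where $f'(0) = 0.25$, and falls towards zero for large $|x|$.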
19-22 [Figures: steps of the BP training algorithm, illustrated on a network with input units $X_1, X_2, X_3$, hidden units $Z_1, Z_j, Z_3$ and output unit $Y_k$. Source: Fausett, L.]
23 Let's examine the Training Algorithm Equations. Vectors and matrices make computation easier:
$X = [x_1 \dots x_n]$
$V = \begin{bmatrix} v_{1,1} & \dots & v_{1,p} \\ \vdots & & \vdots \\ v_{n,1} & \dots & v_{n,p} \end{bmatrix}$, $V_0 = [v_{0,1} \dots v_{0,p}]$
$W = \begin{bmatrix} w_{1,1} & \dots & w_{1,m} \\ \vdots & & \vdots \\ w_{p,1} & \dots & w_{p,m} \end{bmatrix}$, $W_0 = [w_{0,1} \dots w_{0,m}]$
Step 4 computation becomes: $Z_{in} = V_0 + XV$, $Z = [f(z_{in_1}) \dots f(z_{in_p})]$
Step 5 computation becomes: $Y_{in} = W_0 + ZW$, $Y = [f(y_{in_1}) \dots f(y_{in_m})]$
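The matrix form of Steps 4 and 5 can be sketched with plain Python lists. The helper `row_times_matrix` and the small 2-2-1 example weights are hypothetical and chosen only so the result is easy to check by hand; the logistic sigmoid is assumed for $f$.

```python
import math

def f(x):
    return 1.0 / (1.0 + math.exp(-x))

def row_times_matrix(row, matrix):
    """Multiply a 1 x n row vector by an n x p matrix given as a list of rows."""
    n_cols = len(matrix[0])
    return [sum(row[i] * matrix[i][j] for i in range(len(row)))
            for j in range(n_cols)]

def forward(X, V, V0, W, W0):
    # Step 4: Z_in = V0 + X V, Z = f(Z_in) element-wise
    Z_in = [b + s for b, s in zip(V0, row_times_matrix(X, V))]
    Z = [f(s) for s in Z_in]
    # Step 5: Y_in = W0 + Z W, Y = f(Y_in) element-wise
    Y_in = [b + s for b, s in zip(W0, row_times_matrix(Z, W))]
    Y = [f(s) for s in Y_in]
    return Z_in, Z, Y_in, Y

# Hypothetical network with n = 2 inputs, p = 2 hidden units, m = 1 output.
X = [1.0, 0.0]
V = [[1.0, -1.0],
     [0.0, 0.0]]
V0 = [0.0, 0.0]
W = [[1.0],
     [1.0]]
W0 = [0.0]

Z_in, Z, Y_in, Y = forward(X, V, V0, W, W0)
```

Here $Z_{in} = [1, -1]$, and because $f(1) + f(-1) = 1$ the output net input is 1, so $Y = [f(1)]$.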
25 Generalisation Once trained, weights are held constant, and input patterns are applied in feedforward mode. This is commonly called recall mode. We wish the network to generalize, i.e. to make sensible choices about input vectors which are not in the training set. Commonly we check the generalization of a network by dividing known patterns into a training set, used to adjust weights, and a test set, used to evaluate the performance of the trained network.
26 Generalisation Generalisation can be improved by: using a smaller number of hidden units (the network must learn the rule, not just the examples); not overtraining (occasionally check that the error on the test set is not increasing); ensuring the training set includes a good mixture of examples. There is no good rule for deciding upon a good network size (number of layers, number of units per layer). Usually use one input/output per class rather than a continuous variable or binary encoding.
28 Example 1 (Reference: R. Spillman) The XOR function could not be solved by a single-layer perceptron network. The function is:
X Y F
0 0 0
0 1 1
1 0 1
1 1 0
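29 XOR Architecture [Figure: 2-2-1 network. The two inputs and a bias feed hidden unit 1 through weights $v_{01}, v_{11}, v_{21}$ and hidden unit 2 through $v_{02}, v_{12}, v_{22}$; the hidden units and a bias feed the output $y$ through $w_0, w_1, w_2$. Each unit is a summing junction $\Sigma$ followed by the activation $f$.]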
30 Initial Weights Randomly assign small weight values. [Figure: the 2-2-1 network annotated with its initial weights.]
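31 Feedforward, 1st Pass. Training case: (0 0 0), i.e. inputs (0, 0) with target 0. Activation function: $f(x) = \frac{1}{1 + e^{-x}}$.
$z_{in_1} = -.3(1) + .2(0) + .5(0) = -.3$, so $z_1 = f(-.3) = .43$
$z_{in_2} = .25(1) - .4(0) + .1(0) = .25$, so $z_2 = f(.25) = .56$
$y_{in} = -.4(1) - .2(.43) + .3(.56) = -.318$, so $y = f(-.318) = .42$ (not 0)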
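32 Backpropagate.
Output unit: $\delta = (t - y) f'(y_{in}) = (t - y) f(y_{in})[1 - f(y_{in})] = (0 - .42)(.42)[1 - .42] = -.102$
Hidden unit 1: $\delta_{in_1} = \delta w_1 = -.102(-.2) = .02$, $\delta_1 = \delta_{in_1} f'(z_{in_1}) = .02(.43)(1 - .43) = .005$
Hidden unit 2: $\delta_{in_2} = \delta w_2 = -.102(.3) = -.03$, $\delta_2 = \delta_{in_2} f'(z_{in_2}) = -.03(.56)(1 - .56) = -.007$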
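33 Calculate the Weights, First Pass. $\Delta v_{ij} = \alpha \delta_j x_i$ and $\Delta v_{0j} = \alpha \delta_j$ for $j = 1, 2$; $\Delta w_j = \alpha \delta z_j$ for $j = 1, 2$ and $\Delta w_0 = \alpha \delta$. The worked values below correspond to $\alpha = 1$.
$\Delta v_{01} = \delta_1 = .005$; $\Delta v_{11} = \delta_1 x_1 = (.005)(0) = 0$; $\Delta v_{21} = \delta_1 x_2 = (.005)(0) = 0$
$\Delta v_{02} = \delta_2 = -.007$; $\Delta v_{12} = \delta_2 x_1 = (-.007)(0) = 0$; $\Delta v_{22} = \delta_2 x_2 = (-.007)(0) = 0$
$\Delta w_0 = \delta = -.102$; $\Delta w_1 = \delta z_1 = (-.102)(.43) = -.044$; $\Delta w_2 = \delta z_2 = (-.102)(.56) = -.057$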
34 Update the Weights, First Pass [Figure: the network annotated with the updated weights, e.g. $w_2 = .3 - .057 = .243$.]
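The first pass of the XOR example (feedforward, backpropagation, weight update) can be replayed with the short sketch below. It assumes the initial weights as far as they are legible in the worked arithmetic and a learning rate of 1; the two input weights written here as 0.2 and 0.5 may have lost a digit in the slides, which does not affect this pass because both inputs are 0.

```python
import math

def f(x):
    return 1.0 / (1.0 + math.exp(-x))

# Initial weights read from the worked example (assumed where digits are unclear).
v0 = [-0.3, 0.25]               # hidden biases v_01, v_02
v = [[0.2, -0.4],               # v[i][j]: weight from input i+1 to hidden unit j+1
     [0.5, 0.1]]
w0 = -0.4                       # output bias
w = [-0.2, 0.3]                 # hidden-to-output weights w_1, w_2
alpha = 1.0                     # assumed: the slide's updates use delta directly
x, t = [0.0, 0.0], 0.0          # training case (0 0 0)

# 1/ Feedforward
z_in = [v0[j] + sum(x[i] * v[i][j] for i in range(2)) for j in range(2)]
z = [f(s) for s in z_in]
y_in = w0 + sum(z[j] * w[j] for j in range(2))
y = f(y_in)

# 2/ Backpropagation of the error
delta = (t - y) * y * (1.0 - y)
delta_hidden = [delta * w[j] * z[j] * (1.0 - z[j]) for j in range(2)]

# 3/ Weight adjustment
w_new = [w[j] + alpha * delta * z[j] for j in range(2)]
w0_new = w0 + alpha * delta
v_new = [[v[i][j] + alpha * delta_hidden[j] * x[i] for j in range(2)]
         for i in range(2)]
v0_new = [v0[j] + alpha * delta_hidden[j] for j in range(2)]
```

Run at full precision, the results agree with the slide values to within rounding; small differences arise because the slides round the intermediate values to two decimals.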
35 Final Result After about 500 iterations: [Figure: the trained network annotated with its final weights, including two weights shown as -.5.]
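37 Example 2 (Reference: Vamsi Pegatraju and Aparna Patsa) Network with $n = 3$ inputs $X_1, X_2, X_3$, $p = 3$ hidden units $Z_1, Z_2, Z_3$ and $m = 1$ output $Y$. $X = [\dots]$; $V = [\dots]$; $V_0 = [0\ 0\ {-1}]$; $W = [-1\ 1\ 2]^T$; $W_0 = [-1]$; $t = 0.9$ = desired output for input $X$; $\alpha = 0.3$; $f(x) = \frac{1}{1 + e^{-x}}$.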
38 Primary Values: Inputs to Epoch 1: $X = [\dots]$; $W = [-1\ 1\ 2]^T$; $W_0 = [-1]$; $V = [\dots]$; $V_0 = [0\ 0\ {-1}]$; Target $t = 0.9$; $\alpha = 0.3$
39 Epoch 1. Step 4: $Z_{in} = V_0 + XV = [\dots]$; $Z = f(Z_{in}) = [\dots]$. Step 5: $Y_{in} = W_0 + ZW = 0.3114$; $Y = f(Y_{in}) = 0.5772$. Sum of Squares Error obtained originally: $(0.9 - 0.5772)^2 = 0.1042$
40 Step 6: Error $= t_k - Y_k = 0.9 - 0.5772 = 0.3228$. We have only one output, hence $k = 1$. $\delta_1 = (t_1 - y_1) f'(Y_{in_1})$. For the sigmoid, $f'(x) = f(x)(1 - f(x))$, so $\delta_1 = (0.9 - 0.5772)(0.5772)(1 - 0.5772) = 0.0788$
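The output-layer arithmetic of this epoch can be checked with a few lines of Python. The sketch assumes the net input $Y_{in} = 0.3114$, which is the value consistent with the stated output 0.5772, together with $t = 0.9$ and $\alpha = 0.3$ from the example.

```python
import math

def f(x):
    return 1.0 / (1.0 + math.exp(-x))

t = 0.9          # target from the example
alpha = 0.3      # learning rate from the example
y_in = 0.3114    # assumed net input of the output unit

y = f(y_in)                        # network output
error = t - y                      # t_k - Y_k
sse = error ** 2                   # sum of squares error (single output)
delta = error * y * (1.0 - y)      # delta_1 = (t - y) f'(y_in)
bias_change = alpha * delta        # bias correction alpha * delta_1
```

Rounded to four decimals this gives an output of 0.5772, a squared error of 0.1042, an error term of 0.0788 and a bias correction of 0.0236.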
41 For intermediate weights we have ($j = 1, 2, 3$): $\Delta W_{j,k} = \alpha \delta_k Z_j = \alpha \delta_1 Z_j$; $\Delta W = (0.3)(0.0788)\,Z^T = [\dots]$; Bias: $\Delta W_{0,1} = \alpha \delta_1 = (0.3)(0.0788) = 0.0236$
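42 Step 7: Backpropagation to the first hidden layer. For $Z_j$ ($j = 1, 2, 3$), we have $\delta_{in_j} = \sum_{k=1}^{m} \delta_k W_{j,k} = \delta_1 W_{j,1}$: $\delta_{in_1} = -0.0788$; $\delta_{in_2} = 0.0788$; $\delta_{in_3} = 0.1576$. $\delta_j = \delta_{in_j} f'(Z_{in_j})$ => $\delta_1 = \dots$; $\delta_2 = 0.007$; $\delta_3 = 0.036$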
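43 $X = [\dots]$; $\Delta V_{i,j} = \alpha \delta_j X_i$: $\Delta V_1 = [\dots]$; $\Delta V_2 = [\dots]$; $\Delta V_3 = [\dots]$; $\Delta V_0 = \alpha[\delta_1\ \delta_2\ \delta_3] = [\dots]$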
44 Step 8: Updating of $W$, $V$, $W_0$, $V_0$. $W_{new} = W_{old} + \Delta W = [\dots]$; $V_{new} = V_{old} + \Delta V = [\dots; \dots; 0\ 3\ \dots]$; $W_{0new} = \dots$; $V_{0new} = [\dots]$. Completion of the first epoch.
45 Primary Values: Inputs to Epoch 2: $X = [\dots]$; $W = [\dots]$; $W_0 = [\dots]$; $V = [\dots; \dots; 0\ 3\ \dots]$; $V_0 = [\dots]$; Target $t = 0.9$; $\alpha = 0.3$
46 Epoch 2. Step 4: $Z_{in} = V_0 + XV = [\dots]$; $Z = f(Z_{in}) = [\dots]$. Step 5: $Y_{in} = W_0 + ZW = 0.3925$; $Y = f(Y_{in}) = 0.5969$. Sum of Squares Error obtained from first epoch: $(0.9 - 0.5969)^2 = 0.0919$
47 Step 6: Error $= t_k - Y_k = 0.9 - 0.5969 = 0.3031$. Again, as we have only one output, $k = 1$. $\delta_1 = (t_1 - y_1) f'(Y_{in_1})$ => $\delta_1 = (0.9 - 0.5969)(0.5969)(1 - 0.5969) = 0.0729$
48 For intermediate weights we have ($j = 1, 2, 3$): $\Delta W_{j,k} = \alpha \delta_k Z_j = \alpha \delta_1 Z_j$; $\Delta W = (0.3)(0.0729)\,Z^T = [\dots]$; Bias: $\Delta W_{0,1} = \alpha \delta_1 = 0.0219$
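49 Step 7: Backpropagation to the first hidden layer. For $Z_j$ ($j = 1, 2, 3$), we have $\delta_{in_j} = \sum_{k=1}^{m} \delta_k W_{j,k} = \delta_1 W_{j,1}$: $\delta_{in_1} = -0.0714$; $\delta_{in_2} = 0.0745$; $\delta_{in_3} = 0.1469$. $\delta_j = \delta_{in_j} f'(Z_{in_j})$ => $\delta_1 = \dots$; $\delta_2 = 0.0067$; $\delta_3 = 0.0334$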
50 $\Delta V_{i,j} = \alpha \delta_j X_i$: $\Delta V_1 = [\dots]$; $\Delta V_2 = [\dots]$; $\Delta V_3 = [\dots]$; $\Delta V_0 = \alpha[\delta_1\ \delta_2\ \delta_3] = [\dots]$
51 Step 8: Updating of $W$, $V$, $W_0$, $V_0$. $W_{new} = W_{old} + \Delta W = [\dots]$; $V_{new} = V_{old} + \Delta V = [\dots; \dots; 0\ 3\ \dots]$; $W_{0new} = \dots$; $V_{0new} = [\dots]$. Completion of the second epoch.
52 $Z_{in} = V_0 + XV = [\dots]$ => $Z = f(Z_{in}) = [\dots]$. Step 5: $Y_{in} = W_0 + ZW = 0.4684$ => $Y = f(Y_{in}) = 0.6150$. Sum of Squares Error at the end of the second epoch: $(0.9 - 0.6150)^2 = 0.0812$. From the last two values of the Sum of Squares Error, we see that the error decreases gradually as the weights are updated.
54 Functional Approximation Multi-Layer Perceptrons can approximate any continuous function by a two-layer network with squashing activation functions. If the activation functions are allowed to vary with the function, it can be shown that an n-input, m-output function requires at most 2n+1 hidden units. See Fausett for more details.
55 Function Approximators Example: a function h(x) approximated by H(w,x)
56 Applications We look at a number of applications for backpropagation MLPs. In each case we'll examine: the problem to be solved, the architecture used, and the results. Reference: J. Hertz, A. Krogh, R.G. Palmer, Introduction to the Theory of Neural Computation, Addison Wesley, 1991
57 NETtalk - Specifications The problem is to convert written text to speech. Conventionally, this is done by hand-coded linguistic rules, such as the DECtalk system. NETtalk uses a neural network to achieve similar results. Input is written text. Output is the choice of phoneme for the speech synthesiser.
58 NETtalk - architecture 26 output units, 1-of-26 code representing the most likely phoneme. 80 hidden units, fully interconnected. 7-letter sliding window (e.g. over the text "The cat on ..."), generating the phoneme for the centre character. Input units use a 1-of-29 code => 203 input units (= 29x7).
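The 203-unit input layer follows directly from the encoding: each of the 7 window positions is a 1-of-29 code. The sketch below illustrates this. The particular 29-symbol alphabet (26 letters plus space, comma and period) and the function name `encode_window` are assumptions made for illustration, not details taken from the slides.

```python
import string

# Assumed 29-symbol alphabet: 26 lowercase letters plus three extra symbols.
SYMBOLS = string.ascii_lowercase + " ,."
WINDOW = 7

def encode_window(window):
    """Concatenate a 1-of-29 code for each of the 7 characters in the window."""
    if len(window) != WINDOW:
        raise ValueError("window must contain exactly 7 characters")
    vector = []
    for ch in window.lower():
        one_hot = [0] * len(SYMBOLS)
        one_hot[SYMBOLS.index(ch)] = 1   # raises ValueError for unknown symbols
        vector.extend(one_hot)
    return vector

inputs = encode_window("the cat")   # the centre character is the space
```

Exactly 7 of the 203 inputs are active for any window, one per character position.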
59 NETtalk - Results Training set of 1024 words. After 10 epochs: intelligible speech. After 50 epochs: 95% correct on training set, 78% correct on test set. Note that this network must generalise: many input combinations are not in the training set. Results are not as good as DECtalk, but required significantly less effort to code up.
60 Sonar Classifier Task: distinguish between rock and metal cylinder from the sonar return of the bottom of a bay. Convert the time-varying input signal to the frequency domain to reduce the input dimension. (This is a linear transform and could be done with a fixed-weight neural network.) Used a 60-x-2 network with x from 0 to 24. Training took about 200 epochs. Classified about 80% of training set; classified 100% training, 85% test set.
61 ALVINN Drives 70 mph on a public highway. 30 outputs for steering; 4 hidden units; 30x32 pixels as inputs; 30x32 weights into each of the four hidden units.
62 Navigation of a Car The task is to control a car on a winding road. Inputs are a 30x32 pixel image from a video camera on the roof and an 8x32 image from a range finder => 1216 inputs. 29 hidden units. 45 output units arranged in a line, 1-of-45 code representing hard-left..straight-ahead..hard-right.
63 Navigation of Car - Results Training set of 1200 simulated road images. Trained for 40 epochs. Could drive at 5 km/hr on road, limited by the calculation speed of the feed-forward network. Twice as fast as the best non-net solution.
64 Backgammon Trained on 3000 example board scenarios of (position, dice, move) rated from -100 (very bad) to +100 (very good) by a human expert. Some important information such as pipcount and degree-of-trapping was included as input. Some noise was added to the input set (scenarios with random score). Handcrafted examples were added to the training set to correct obvious errors.
65 Backgammon results 459 inputs, 2 hidden layers, each 24 units, plus output for score (all possible moves evaluated). Won 59% against a conventional backgammon program (4% without extra info, 45% without noise in training set). Won the computer olympiad, 1989, but lost to a human expert (not surprising since trained by human-scored examples).
66 Encoder / Image Compression We wish to encode a number of input patterns in an efficient number of bits for storage or transmission. We can use an autoassociative network, i.e. an M-N-M network, where we have M inputs, N<M hidden units and M outputs, trained with target outputs the same as the inputs. The hidden units need to encode the inputs in fewer signals in the hidden layer. The outputs from the hidden layer are the encoded signal.
67 Encoders We can store/transmit hidden values using the first half of the network and decode using the second half. We may need to truncate hidden unit values to fixed precision, which must be considered during training. Cottrell et al. tried 8x8 blocks (8 bits each) of images, encoded in 16 units, giving results similar to conventional approaches. Works best with similar images.
68 Neural network for OCR Feedforward network trained using backpropagation. [Figure: input layer, hidden layer and output layer, with output units labelled A B C D E.]
69 Pattern Recognition Post-code (or ZIP code) recognition is a good example: hand-written characters need to be classified. One interesting network used a 16x16 pixel map input of handwritten digits already found and scaled by another system. 3 hidden layers plus a 1-of-10 output layer. The first two hidden layers were feature detectors.
70 ZIP code classifier The first layer had the same feature detector connected to 5x5 blocks of input, at 2 pixel intervals => 8x8 array of the same detector, each with the same weights but connected to different parts of the input. Twelve such feature detector arrays. Same for the second hidden layer, but 4x4 arrays connected to 5x5 blocks of the first hidden layer, with 12 different features. Conventional 30-unit 3rd hidden layer.
71 ZIP Code Classifier - Results Note that the 8x8 and 4x4 arrays of feature detectors use the same weights => many fewer weights to train. Trained on 7300 digits, tested on 2000. Error rates: 1% on training, 5% on test set. If cases with no clear winner are rejected (i.e. largest output not much greater than second largest output), then, with 12% rejection, the error rate on the test set is reduced to 1%. Performance improved further by removing more weights: optimal brain damage.
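The rejection rule described above (discard a case when the largest output is not much greater than the second largest) can be sketched as follows. The function name and the `margin` threshold are hypothetical; the slides do not say how "not much greater" was quantified.

```python
def classify_with_rejection(outputs, margin):
    """Return the index of the winning output unit, or None if the case is rejected.

    A case is rejected when the gap between the largest and second-largest
    outputs is smaller than `margin` (an assumed, tunable threshold).
    """
    ranked = sorted(range(len(outputs)), key=lambda i: outputs[i], reverse=True)
    best, second = ranked[0], ranked[1]
    if outputs[best] - outputs[second] < margin:
        return None
    return best

clear_case = classify_with_rejection([0.1, 0.9, 0.2], margin=0.3)
ambiguous_case = classify_with_rejection([0.1, 0.55, 0.5], margin=0.3)
```

Raising the margin rejects more cases and lowers the error rate on the cases that remain, which is the trade-off the results above describe.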
73 Heuristics for making BP Better Training with BP is more an art than a science; the heuristics below are the result of experience. Normalizing the inputs: inputs should be preprocessed so that their mean value is close to zero (see the prestd function in MATLAB). Input variables should be uncorrelated, which can be achieved by Principal Component Analysis (PCA); see the prepca and trapca functions in MATLAB.
74 Sequential vs. Batch update Sequential learning means that a given input pattern is forward propagated, the error is determined and back-propagated, and the weights are updated. Then the same procedure is repeated for the next pattern. Batch learning means that the weights are updated only after the entire set of training patterns has been presented to the network. In other words, all patterns are forward propagated, and the error is determined and back-propagated, but the weights are only updated when all patterns have been processed. Thus, the weight update is only performed every epoch. If $P$ = number of patterns in one epoch: $\Delta w = \frac{1}{P} \sum_{p=1}^{P} \Delta w_p$
75 Sequential vs. Batch update In some cases, it is advantageous to accumulate the weight correction terms for several patterns (or even an entire epoch if there are not too many patterns) and make a single weight adjustment (equal to the average of the weight correction terms) for each weight, rather than updating the weights after each pattern is presented. This procedure has a smoothing effect (because of the use of the average) on the correction terms. In some cases, this smoothing may increase the chances of convergence to a local minimum.
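The difference between the two update schemes can be illustrated on the smallest possible case. The sketch below uses a single linear unit $y = w x$ with the delta-rule correction $\Delta w_p = \alpha (t_p - w x_p) x_p$; the unit, the two training patterns and the learning rate are hypothetical and serve only to make the arithmetic easy to follow.

```python
def sequential_epoch(w, patterns, alpha):
    """Update the weight after every pattern."""
    for x, t in patterns:
        w += alpha * (t - w * x) * x
    return w

def batch_epoch(w, patterns, alpha):
    """Accumulate the corrections and apply their average once per epoch."""
    corrections = [alpha * (t - w * x) * x for x, t in patterns]
    return w + sum(corrections) / len(corrections)

# Hypothetical patterns (x, t) generated by the target mapping t = 2x.
patterns = [(1.0, 2.0), (2.0, 4.0)]

w_sequential = sequential_epoch(0.0, patterns, alpha=0.1)
w_batch = batch_epoch(0.0, patterns, alpha=0.1)
```

After one epoch the sequential scheme reaches 0.92 because the second pattern already sees the updated weight, while the batch scheme applies the averaged correction (0.2 + 0.8) / 2 and reaches 0.5.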
76 Initial weights Initial weights will influence whether the net reaches a global (or only a local) minimum of the error and, if so, how quickly it converges. The values for the initial weights must not be too large; otherwise, the initial input signals to each hidden or output unit will be likely to fall in the region where the derivative of the sigmoid function has a very small value ($f'(net) \approx 0$), the so-called saturation region. On the other hand, if the initial weights are too small, the net input to a hidden or output unit will be close to zero, which also causes extremely slow learning. It is best to set the initial weights (and biases) to random numbers between -0.5 and 0.5 (or between -1 and 1, or some other suitable interval). The values may be positive or negative because the final weights after training may be of either sign also.
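A minimal sketch of this initialisation with Python's standard `random` module follows. The helper name `init_weights`, the layer sizes and the fixed seed are illustrative choices, not part of the lecture.

```python
import random

def init_weights(n_rows, n_cols, limit=0.5, rng=None):
    """Return an n_rows x n_cols weight matrix drawn uniformly from [-limit, limit]."""
    if rng is None:
        rng = random.Random()
    return [[rng.uniform(-limit, limit) for _ in range(n_cols)]
            for _ in range(n_rows)]

rng = random.Random(0)                # fixed seed so the example is repeatable
V = init_weights(3, 3, rng=rng)       # input-to-hidden weights for a 3-3-1 net
V0 = init_weights(1, 3, rng=rng)[0]   # hidden biases
```

Passing `limit=1.0` gives the wider interval between -1 and 1 mentioned above.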
77 Memorization vs. generalization How long to train the net: since the usual motivation for applying a backprop net is to achieve a balance between memorization and generalization, it is not necessarily advantageous to continue training until the error actually reaches a minimum. Use 2 disjoint sets of data during training: 1/ a set of training patterns and 2/ a set of training-testing patterns (or validation set). Weight adjustments are based on the training patterns; however, at intervals during training, the error is computed using the validation patterns. As long as the error for the validation set decreases, training continues. When the error begins to increase, the net is starting to memorize the training patterns too specifically (it starts to lose its ability to generalize). At this point, training is terminated.
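78 Early stopping (L. Studer, IPHE-UNIL) [Figure: error against training time for two curves: the error on the training set (which changes $w_{ij}$) and the error on the validation set (which does not change $w_{ij}$). Training stops where the validation error starts to rise: "Stop Here!"]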
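The stopping rule can be sketched as a function over a recorded sequence of validation errors. The `patience` parameter (how many non-improving checks to tolerate) is an addition for illustration; with `patience=1` it matches the rule described here, stopping as soon as the validation error stops decreasing. The error values are made up.

```python
def early_stopping_epoch(validation_errors, patience=1):
    """Return the index of the check with the lowest validation error seen
    before training would be terminated."""
    best_epoch = 0
    best_error = validation_errors[0]
    for epoch, err in enumerate(validation_errors):
        if err < best_error:
            best_error, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            break   # validation error no longer decreasing: stop training
    return best_epoch

# Hypothetical validation errors measured at regular intervals during training.
errors = [0.50, 0.40, 0.33, 0.30, 0.31, 0.35]
stop_at = early_stopping_epoch(errors)
```

In practice the weights saved at the returned check are the ones kept, since later weights have started to memorize the training patterns.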
79 Backpropagation with momentum The weight change is in a direction that is a combination of 1/ the current gradient and 2/ the previous gradient. Momentum can be added so weights tend to change more quickly if changing in the same direction for several training cycles: $\Delta w_{ij}(t+1) = \alpha \delta_j x_i + \mu\, \Delta w_{ij}(t)$. $\mu$ is called the momentum factor and ranges over $0 < \mu < 1$. When subsequent changes are in the same direction, the rate increases (accelerated descent). When subsequent changes are in opposite directions, the rate decreases (stabilizes).
80 Backpropagation with momentum Weight update equation with momentum: $w(t+1) = w(t) + \alpha \delta z + \mu\,[w(t) - w(t-1)]$. Source: Fausett, L., Fundamentals of Neural Networks, Prentice Hall, 1994.
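The effect of the momentum term can be seen in a small numerical sketch. The function below applies one update of the form "new change = learning rate times gradient term plus momentum factor times previous change"; the constant and alternating gradient terms, and the values 0.1 and 0.9 for the learning rate and momentum factor, are made up for illustration.

```python
def momentum_step(w, prev_change, grad_term, alpha, mu):
    """Apply one weight update with momentum and return (new_w, change)."""
    change = alpha * grad_term + mu * prev_change
    return w + change, change

def run(grad_terms, alpha=0.1, mu=0.9):
    """Apply a sequence of gradient terms and record each weight change."""
    w, change, changes = 0.0, 0.0, []
    for g in grad_terms:
        w, change = momentum_step(w, change, g, alpha, mu)
        changes.append(change)
    return w, changes

# Same direction three times: the step grows (accelerated descent).
_, same_direction = run([1.0, 1.0, 1.0])
# Alternating direction: successive steps largely cancel (stabilising).
_, alternating = run([1.0, -1.0, 1.0])
```

With a constant gradient term the changes grow as 0.1, 0.19, 0.271, whereas with alternating signs the second change shrinks to -0.01.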
81 BP training algorithm: Adaptive Learning Rate [Figure] Source: Fausett, L., Fundamentals of Neural Networks, Prentice Hall,
82 Adaptive Learning rate Adaptive Parameters: vary the learning rate during training, accelerating learning slowly if all is well (error $E$ decreasing), but reducing it quickly if things go unstable ($E$ increasing). For example: $\alpha(t+1) = \alpha(t) + a$ if $\Delta E < 0$ for the last few epochs; $\alpha(t+1) = (1 - b)\,\alpha(t)$ if $\Delta E > 0$; $\alpha(t+1) = \alpha(t)$ otherwise. Typically, $a = 0.1$, $b = 0.5$.
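A minimal sketch of this rule is given below. It assumes that "the last few epochs" is represented by a non-empty list of recent error changes, and that the decrease test is applied to the most recent change; both are interpretation choices, and the function name is hypothetical.

```python
def adapt_learning_rate(alpha, recent_delta_E, a=0.1, b=0.5):
    """Return the next learning rate given recent changes in the error E.

    recent_delta_E: non-empty list of E(t) - E(t-1) for the last few epochs.
    """
    if all(d < 0 for d in recent_delta_E):
        return alpha + a            # error consistently falling: speed up slowly
    if recent_delta_E[-1] > 0:
        return (1.0 - b) * alpha    # error just increased: slow down quickly
    return alpha                    # otherwise leave the rate unchanged

faster = adapt_learning_rate(0.3, [-0.10, -0.20])
slower = adapt_learning_rate(0.3, [-0.10, 0.20])
unchanged = adapt_learning_rate(0.3, [0.10, -0.10])
```

The additive increase and multiplicative decrease make the rate grow cautiously but back off sharply when training becomes unstable.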
83 Matlab BP NN Architecture A neuron with a single R-element input vector is shown below. Here the individual element inputs are multiplied by weights and the weighted values are fed to the summing junction. Their sum is simply Wp, the dot product of the (single row) matrix W and the vector p. The neuron has a bias b, which is summed with the weighted inputs to form the net input n. This sum, n, is the argument of the transfer function f. This expression can, of course, be written in MATLAB code as: n = W*p + b However, the user will seldom be writing code at this low level, for such code is already built into functions to define and simulate entire networks.
84 Matlab BP NN Architecture [Figure]
86 Learning Rule (Fausett, section 6.3, p324) Similar to the Delta Rule. Our goal is to minimize the error $E$, which measures the difference between the targets $t_k$ and our outputs $y_k$ using a least squares error measure: $E = \frac{1}{2} \sum_k (t_k - y_k)^2$. To find out how to change $w_{jk}$ and $v_{ij}$ to reduce $E$, we need to find $\frac{\partial E}{\partial w_{jk}}$ and $\frac{\partial E}{\partial v_{ij}}$.
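87 Delta Rule Derivation: Hidden-to-Output. $E = \frac{1}{2} \sum_k [t_k - y_k]^2$, hence $\frac{\partial E}{\partial w_{JK}} = \frac{\partial}{\partial w_{JK}} \frac{1}{2} \sum_k [t_k - y_k]^2$, where $y_k = f(y_{in_k})$ and $y_{in_k} = \sum_j z_j w_{jk}$. Notice the difference between the subscripts $k$ (which corresponds to any node between hidden and output layers) and $K$ (which represents a particular node $K$ of interest). Only the term with $k = K$ depends on $w_{JK}$, so
$\frac{\partial E}{\partial w_{JK}} = \frac{\partial}{\partial w_{JK}} \frac{1}{2} [t_K - f(y_{in_K})]^2 = -(t_K - y_K) \frac{\partial f(y_{in_K})}{\partial w_{JK}} = -(t_K - y_K)\, f'(y_{in_K}) \frac{\partial y_{in_K}}{\partial w_{JK}} = -(t_K - y_K)\, f'(y_{in_K})\, z_J$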
88 Delta Rule Derivation: Hidden-to-Output. It is convenient to define $\delta_K = (t_K - y_K)\, f'(y_{in_K})$. Thus $\Delta w_{jk} = -\alpha \frac{\partial E}{\partial w_{jk}} = \alpha\, [t_k - y_k]\, f'(y_{in_k})\, z_j = \alpha \delta_k z_j$. In summary: $\Delta w_{jk} = \alpha \delta_k z_j$ with $\delta_K = (t_K - y_K)\, f'(y_{in_K})$.
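89 Delta Rule Derivation: Input to Hidden. $E = \frac{1}{2} \sum_k [t_k - y_k]^2$, hence $\frac{\partial E}{\partial v_{IJ}} = \frac{\partial}{\partial v_{IJ}} \frac{1}{2} \sum_k [t_k - y_k]^2$, where $y_k = f(y_{in_k})$ and $y_{in_k} = \sum_j z_j w_{jk}$. Notice the difference between the subscripts $j$ and $J$, and $i$ and $I$.
$\frac{\partial E}{\partial v_{IJ}} = -\sum_k [t_k - y_k] \frac{\partial y_k}{\partial v_{IJ}} = -\sum_k [t_k - y_k]\, f'(y_{in_k}) \frac{\partial y_{in_k}}{\partial v_{IJ}} = -\sum_k \delta_k \frac{\partial y_{in_k}}{\partial v_{IJ}} = -\sum_k \delta_k w_{Jk} \frac{\partial z_J}{\partial v_{IJ}} = -\sum_k \delta_k w_{Jk}\, f'(z_{in_J})\, x_I$
It is convenient to define $\delta_J = \sum_k \delta_k w_{Jk}\, f'(z_{in_J})$. Thus $\Delta v_{ij} = -\alpha \frac{\partial E}{\partial v_{ij}} = \alpha\, f'(z_{in_j})\, x_i \sum_k \delta_k w_{jk} = \alpha \delta_j x_i$.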
90 Delta Rule Derivation: Input to Hidden. In summary: $\Delta v_{ij} = \alpha \delta_j x_i$, where $\delta_J = \sum_k \delta_k w_{Jk}\, f'(z_{in_J})$.
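The two results of the derivation, $\partial E / \partial w_{JK} = -\delta_K z_J$ and $\partial E / \partial v_{IJ} = -\delta_J x_I$, can be checked numerically against finite differences of $E$. The sketch below does this for a hypothetical 2-2-1 network without biases (matching $y_{in_k} = \sum_j z_j w_{jk}$ as written above); all weight, input and target values are made up.

```python
import math

def f(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, v, w):
    """Forward pass without biases: z_j = f(sum_i x_i v_ij), y_k = f(sum_j z_j w_jk)."""
    z = [f(sum(x[i] * v[i][j] for i in range(len(x)))) for j in range(len(v[0]))]
    y = [f(sum(z[j] * w[j][k] for j in range(len(z)))) for k in range(len(w[0]))]
    return z, y

def error(x, t, v, w):
    _, y = forward(x, v, w)
    return 0.5 * sum((tk - yk) ** 2 for tk, yk in zip(t, y))

# Hypothetical network and training pair.
x, t = [0.5, -1.0], [0.9]
v = [[0.2, -0.4], [0.5, 0.1]]   # v[i][j]
w = [[-0.2], [0.3]]             # w[j][k]

# Error terms from the derivation.
z, y = forward(x, v, w)
delta_k = [(t[k] - y[k]) * y[k] * (1.0 - y[k]) for k in range(len(y))]
delta_j = [sum(delta_k[k] * w[j][k] for k in range(len(y))) * z[j] * (1.0 - z[j])
           for j in range(len(z))]

def numeric_gradient(weights, r, c, h=1e-6):
    """Central-difference estimate of dE/d(weights[r][c]); weights is v or w."""
    original = weights[r][c]
    weights[r][c] = original + h
    e_plus = error(x, t, v, w)
    weights[r][c] = original - h
    e_minus = error(x, t, v, w)
    weights[r][c] = original
    return (e_plus - e_minus) / (2.0 * h)

grad_w = numeric_gradient(w, 1, 0)   # dE/dw for hidden unit 2 -> output 1
grad_v = numeric_gradient(v, 0, 1)   # dE/dv for input 1 -> hidden unit 2
```

Both numerical estimates agree with the analytic expressions to within finite-difference error, which is a useful sanity check when implementing backpropagation by hand.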
92 Suggested Reading: L. Fausett, Fundamentals of Neural Networks, Prentice-Hall, 1994, Chapter 6.
93 References: These lecture notes were based on the references of the previous slide, and the following references:
1. Eric Plummer, University of Wyoming
2. Clara Boyd, Columbia Univ. N.Y, comet.ctr.columbia.edu/courses/elen_e40/2002/artificial.ppt
3. Dan St. Clair, University of Missouri-Rolla, 404_fall200/Lectures/Lect09_0230/
4. Vamsi Pegatraju and Aparna Patsa: web.umr.edu/~stclair/class/classfiles/cs404_fs02/ Lectures/Lect09_02902/Lect8_Homework/L8_3.ppt
5. Richard Spillman, Pacific Lutheran University
6. Khurshid Ahmad and Matthew Casey, Univ. Surrey
More informationNeural Networks. Chapter 18, Section 7. TB Artificial Intelligence. Slides from AIMA 1/ 21
Neural Networks Chapter 8, Section 7 TB Artificial Intelligence Slides from AIMA http://aima.cs.berkeley.edu / 2 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural
More informationNeural networks. Chapter 20. Chapter 20 1
Neural networks Chapter 20 Chapter 20 1 Outline Brains Neural networks Perceptrons Multilayer networks Applications of neural networks Chapter 20 2 Brains 10 11 neurons of > 20 types, 10 14 synapses, 1ms
More information1 What a Neural Network Computes
Neural Networks 1 What a Neural Network Computes To begin with, we will discuss fully connected feed-forward neural networks, also known as multilayer perceptrons. A feedforward neural network consists
More informationCMSC 421: Neural Computation. Applications of Neural Networks
CMSC 42: Neural Computation definition synonyms neural networks artificial neural networks neural modeling connectionist models parallel distributed processing AI perspective Applications of Neural Networks
More informationAdministration. Registration Hw3 is out. Lecture Captioning (Extra-Credit) Scribing lectures. Questions. Due on Thursday 10/6
Administration Registration Hw3 is out Due on Thursday 10/6 Questions Lecture Captioning (Extra-Credit) Look at Piazza for details Scribing lectures With pay; come talk to me/send email. 1 Projects Projects
More informationNeural networks. Chapter 19, Sections 1 5 1
Neural networks Chapter 19, Sections 1 5 Chapter 19, Sections 1 5 1 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 19, Sections 1 5 2 Brains 10
More informationNeural networks. Chapter 20, Section 5 1
Neural networks Chapter 20, Section 5 Chapter 20, Section 5 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 20, Section 5 2 Brains 0 neurons of
More informationCSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska. NEURAL NETWORKS Learning
CSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS Learning Neural Networks Classifier Short Presentation INPUT: classification data, i.e. it contains an classification (class) attribute.
More informationAN INTRODUCTION TO NEURAL NETWORKS. Scott Kuindersma November 12, 2009
AN INTRODUCTION TO NEURAL NETWORKS Scott Kuindersma November 12, 2009 SUPERVISED LEARNING We are given some training data: We must learn a function If y is discrete, we call it classification If it is
More informationLecture 4: Perceptrons and Multilayer Perceptrons
Lecture 4: Perceptrons and Multilayer Perceptrons Cognitive Systems II - Machine Learning SS 2005 Part I: Basic Approaches of Concept Learning Perceptrons, Artificial Neuronal Networks Lecture 4: Perceptrons
More information100 inference steps doesn't seem like enough. Many neuron-like threshold switching units. Many weighted interconnections among units
Connectionist Models Consider humans: Neuron switching time ~ :001 second Number of neurons ~ 10 10 Connections per neuron ~ 10 4 5 Scene recognition time ~ :1 second 100 inference steps doesn't seem like
More informationNeed for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels
Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)
More informationChapter 2 Single Layer Feedforward Networks
Chapter 2 Single Layer Feedforward Networks By Rosenblatt (1962) Perceptrons For modeling visual perception (retina) A feedforward network of three layers of units: Sensory, Association, and Response Learning
More informationMultilayer Perceptrons and Backpropagation
Multilayer Perceptrons and Backpropagation Informatics 1 CG: Lecture 7 Chris Lucas School of Informatics University of Edinburgh January 31, 2017 (Slides adapted from Mirella Lapata s.) 1 / 33 Reading:
More informationA Logarithmic Neural Network Architecture for Unbounded Non-Linear Function Approximation
1 Introduction A Logarithmic Neural Network Architecture for Unbounded Non-Linear Function Approximation J Wesley Hines Nuclear Engineering Department The University of Tennessee Knoxville, Tennessee,
More informationCS 6501: Deep Learning for Computer Graphics. Basics of Neural Networks. Connelly Barnes
CS 6501: Deep Learning for Computer Graphics Basics of Neural Networks Connelly Barnes Overview Simple neural networks Perceptron Feedforward neural networks Multilayer perceptron and properties Autoencoders
More informationMachine Learning and Data Mining. Multi-layer Perceptrons & Neural Networks: Basics. Prof. Alexander Ihler
+ Machine Learning and Data Mining Multi-layer Perceptrons & Neural Networks: Basics Prof. Alexander Ihler Linear Classifiers (Perceptrons) Linear Classifiers a linear classifier is a mapping which partitions
More informationCourse 395: Machine Learning - Lectures
Course 395: Machine Learning - Lectures Lecture 1-2: Concept Learning (M. Pantic) Lecture 3-4: Decision Trees & CBC Intro (M. Pantic & S. Petridis) Lecture 5-6: Evaluating Hypotheses (S. Petridis) Lecture
More informationCS:4420 Artificial Intelligence
CS:4420 Artificial Intelligence Spring 2018 Neural Networks Cesare Tinelli The University of Iowa Copyright 2004 18, Cesare Tinelli and Stuart Russell a a These notes were originally developed by Stuart
More informationNeural Networks and Fuzzy Logic Rajendra Dept.of CSE ASCET
Unit-. Definition Neural network is a massively parallel distributed processing system, made of highly inter-connected neural computing elements that have the ability to learn and thereby acquire knowledge
More informationARTIFICIAL NEURAL NETWORKS گروه مطالعاتي 17 بهار 92
ARTIFICIAL NEURAL NETWORKS گروه مطالعاتي 17 بهار 92 BIOLOGICAL INSPIRATIONS Some numbers The human brain contains about 10 billion nerve cells (neurons) Each neuron is connected to the others through 10000
More informationBack-Propagation Algorithm. Perceptron Gradient Descent Multilayered neural network Back-Propagation More on Back-Propagation Examples
Back-Propagation Algorithm Perceptron Gradient Descent Multilayered neural network Back-Propagation More on Back-Propagation Examples 1 Inner-product net =< w, x >= w x cos(θ) net = n i=1 w i x i A measure
More informationNONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition
NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function
More informationNeural Networks and Deep Learning
Neural Networks and Deep Learning Professor Ameet Talwalkar November 12, 2015 Professor Ameet Talwalkar Neural Networks and Deep Learning November 12, 2015 1 / 16 Outline 1 Review of last lecture AdaBoost
More informationArtificial Neural Networks Examination, June 2004
Artificial Neural Networks Examination, June 2004 Instructions There are SIXTY questions (worth up to 60 marks). The exam mark (maximum 60) will be added to the mark obtained in the laborations (maximum
More informationMark Gales October y (x) x 1. x 2 y (x) Inputs. Outputs. x d. y (x) Second Output layer layer. layer.
University of Cambridge Engineering Part IIB & EIST Part II Paper I0: Advanced Pattern Processing Handouts 4 & 5: Multi-Layer Perceptron: Introduction and Training x y (x) Inputs x 2 y (x) 2 Outputs x
More informationArtificial Neural Networks. Q550: Models in Cognitive Science Lecture 5
Artificial Neural Networks Q550: Models in Cognitive Science Lecture 5 "Intelligence is 10 million rules." --Doug Lenat The human brain has about 100 billion neurons. With an estimated average of one thousand
More informationADALINE for Pattern Classification
POLYTECHNIC UNIVERSITY Department of Computer and Information Science ADALINE for Pattern Classification K. Ming Leung Abstract: A supervised learning algorithm known as the Widrow-Hoff rule, or the Delta
More informationKeywords- Source coding, Huffman encoding, Artificial neural network, Multilayer perceptron, Backpropagation algorithm
Volume 4, Issue 5, May 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Huffman Encoding
More informationIntroduction to Machine Learning
Introduction to Machine Learning Neural Networks Varun Chandola x x 5 Input Outline Contents February 2, 207 Extending Perceptrons 2 Multi Layered Perceptrons 2 2. Generalizing to Multiple Labels.................
More informationNeural Networks Learning the network: Backprop , Fall 2018 Lecture 4
Neural Networks Learning the network: Backprop 11-785, Fall 2018 Lecture 4 1 Recap: The MLP can represent any function The MLP can be constructed to represent anything But how do we construct it? 2 Recap:
More informationArtificial Neural Networks
Artificial Neural Networks Oliver Schulte - CMPT 310 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of biological plausibility We will focus on
More informationLast update: October 26, Neural networks. CMSC 421: Section Dana Nau
Last update: October 26, 207 Neural networks CMSC 42: Section 8.7 Dana Nau Outline Applications of neural networks Brains Neural network units Perceptrons Multilayer perceptrons 2 Example Applications
More informationMultilayer Feedforward Networks. Berlin Chen, 2002
Multilayer Feedforard Netors Berlin Chen, 00 Introduction The single-layer perceptron classifiers discussed previously can only deal ith linearly separable sets of patterns The multilayer netors to be
More informationNeural Networks. Xiaojin Zhu Computer Sciences Department University of Wisconsin, Madison. slide 1
Neural Networks Xiaoin Zhu erryzhu@cs.wisc.edu Computer Sciences Department University of Wisconsin, Madison slide 1 Terminator 2 (1991) JOHN: Can you learn? So you can be... you know. More human. Not
More informationNeural Networks with Applications to Vision and Language. Feedforward Networks. Marco Kuhlmann
Neural Networks with Applications to Vision and Language Feedforward Networks Marco Kuhlmann Feedforward networks Linear separability x 2 x 2 0 1 0 1 0 0 x 1 1 0 x 1 linearly separable not linearly separable
More informationMachine Learning. Neural Networks. (slides from Domingos, Pardo, others)
Machine Learning Neural Networks (slides from Domingos, Pardo, others) Human Brain Neurons Input-Output Transformation Input Spikes Output Spike Spike (= a brief pulse) (Excitatory Post-Synaptic Potential)
More informationECE 471/571 - Lecture 17. Types of NN. History. Back Propagation. Recurrent (feedback during operation) Feedforward
ECE 47/57 - Lecture 7 Back Propagation Types of NN Recurrent (feedback during operation) n Hopfield n Kohonen n Associative memory Feedforward n No feedback during operation or testing (only during determination
More informationArtificial Neural Networks. MGS Lecture 2
Artificial Neural Networks MGS 2018 - Lecture 2 OVERVIEW Biological Neural Networks Cell Topology: Input, Output, and Hidden Layers Functional description Cost functions Training ANNs Back-Propagation
More informationArtificial Neural Networks
Artificial Neural Networks Threshold units Gradient descent Multilayer networks Backpropagation Hidden layer representations Example: Face Recognition Advanced topics 1 Connectionist Models Consider humans:
More informationComputational Intelligence Winter Term 2017/18
Computational Intelligence Winter Term 207/8 Prof. Dr. Günter Rudolph Lehrstuhl für Algorithm Engineering (LS ) Fakultät für Informatik TU Dortmund Plan for Today Single-Layer Perceptron Accelerated Learning
More informationIntroduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary. Neural Networks - I. Henrik I Christensen
Neural Networks - I Henrik I Christensen Robotics & Intelligent Machines @ GT Georgia Institute of Technology, Atlanta, GA 30332-0280 hic@cc.gatech.edu Henrik I Christensen (RIM@GT) Neural Networks 1 /
More informationSimple neuron model Components of simple neuron
Outline 1. Simple neuron model 2. Components of artificial neural networks 3. Common activation functions 4. MATLAB representation of neural network. Single neuron model Simple neuron model Components
More informationFeedforward Neural Nets and Backpropagation
Feedforward Neural Nets and Backpropagation Julie Nutini University of British Columbia MLRG September 28 th, 2016 1 / 23 Supervised Learning Roadmap Supervised Learning: Assume that we are given the features
More informationCOMP-4360 Machine Learning Neural Networks
COMP-4360 Machine Learning Neural Networks Jacky Baltes Autonomous Agents Lab University of Manitoba Winnipeg, Canada R3T 2N2 Email: jacky@cs.umanitoba.ca WWW: http://www.cs.umanitoba.ca/~jacky http://aalab.cs.umanitoba.ca
More informationMulti-layer Neural Networks
Multi-layer Neural Networks Steve Renals Informatics 2B Learning and Data Lecture 13 8 March 2011 Informatics 2B: Learning and Data Lecture 13 Multi-layer Neural Networks 1 Overview Multi-layer neural
More informationIntroduction to feedforward neural networks
. Problem statement and historical context A. Learning framework Figure below illustrates the basic framework that we will see in artificial neural network learning. We assume that we want to learn a classification
More informationNeural Nets Supervised learning
6.034 Artificial Intelligence Big idea: Learning as acquiring a function on feature vectors Background Nearest Neighbors Identification Trees Neural Nets Neural Nets Supervised learning y s(z) w w 0 w
More informationComputational Intelligence
Plan for Today Single-Layer Perceptron Computational Intelligence Winter Term 00/ Prof. Dr. Günter Rudolph Lehrstuhl für Algorithm Engineering (LS ) Fakultät für Informatik TU Dortmund Accelerated Learning
More information22c145-Fall 01: Neural Networks. Neural Networks. Readings: Chapter 19 of Russell & Norvig. Cesare Tinelli 1
Neural Networks Readings: Chapter 19 of Russell & Norvig. Cesare Tinelli 1 Brains as Computational Devices Brains advantages with respect to digital computers: Massively parallel Fault-tolerant Reliable
More informationIncremental Stochastic Gradient Descent
Incremental Stochastic Gradient Descent Batch mode : gradient descent w=w - η E D [w] over the entire data D E D [w]=1/2σ d (t d -o d ) 2 Incremental mode: gradient descent w=w - η E d [w] over individual
More informationTemporal Backpropagation for FIR Neural Networks
Temporal Backpropagation for FIR Neural Networks Eric A. Wan Stanford University Department of Electrical Engineering, Stanford, CA 94305-4055 Abstract The traditional feedforward neural network is a static
More informationNeural Networks for Machine Learning. Lecture 2a An overview of the main types of neural network architecture
Neural Networks for Machine Learning Lecture 2a An overview of the main types of neural network architecture Geoffrey Hinton with Nitish Srivastava Kevin Swersky Feed-forward neural networks These are
More informationArtificial neural networks
Artificial neural networks Chapter 8, Section 7 Artificial Intelligence, spring 203, Peter Ljunglöf; based on AIMA Slides c Stuart Russel and Peter Norvig, 2004 Chapter 8, Section 7 Outline Brains Neural
More informationEPL442: Computational
EPL442: Computational Learning Systems Lab 2 Vassilis Vassiliades Department of Computer Science University of Cyprus Outline Artificial Neuron Feedforward Neural Network Back-propagation Algorithm Notes
More informationNeural Networks. Learning and Computer Vision Prof. Olga Veksler CS9840. Lecture 10
CS9840 Learning and Computer Vision Prof. Olga Veksler Lecture 0 Neural Networks Many slides are from Andrew NG, Yann LeCun, Geoffry Hinton, Abin - Roozgard Outline Short Intro Perceptron ( layer NN) Multilayer
More informationPattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore
Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore Lecture - 27 Multilayer Feedforward Neural networks with Sigmoidal
More information) (d o f. For the previous layer in a neural network (just the rightmost layer if a single neuron), the required update equation is: 2.
1 Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.034 Artificial Intelligence, Fall 2011 Recitation 8, November 3 Corrected Version & (most) solutions
More informationMachine Learning. Neural Networks. (slides from Domingos, Pardo, others)
Machine Learning Neural Networks (slides from Domingos, Pardo, others) For this week, Reading Chapter 4: Neural Networks (Mitchell, 1997) See Canvas For subsequent weeks: Scaling Learning Algorithms toward
More informationIntroduction to Artificial Neural Networks
Facultés Universitaires Notre-Dame de la Paix 27 March 2007 Outline 1 Introduction 2 Fundamentals Biological neuron Artificial neuron Artificial Neural Network Outline 3 Single-layer ANN Perceptron Adaline
More informationy(x n, w) t n 2. (1)
Network training: Training a neural network involves determining the weight parameter vector w that minimizes a cost function. Given a training set comprising a set of input vector {x n }, n = 1,...N,
More informationArtificial Neural Networks
Introduction ANN in Action Final Observations Application: Poverty Detection Artificial Neural Networks Alvaro J. Riascos Villegas University of los Andes and Quantil July 6 2018 Artificial Neural Networks
More informationHow to do backpropagation in a brain
How to do backpropagation in a brain Geoffrey Hinton Canadian Institute for Advanced Research & University of Toronto & Google Inc. Prelude I will start with three slides explaining a popular type of deep
More informationCOMP 551 Applied Machine Learning Lecture 14: Neural Networks
COMP 551 Applied Machine Learning Lecture 14: Neural Networks Instructor: Ryan Lowe (ryan.lowe@mail.mcgill.ca) Slides mostly by: Class web page: www.cs.mcgill.ca/~hvanho2/comp551 Unless otherwise noted,