Fundamentals of Computational Neuroscience
Thomas Trappenberg
February 7, 9
Chapter 6: Feed-forward mapping networks
Digital representation of the letter A

[Figure: the letter A on a pixel grid, with a numerical value per pixel]

Optical character recognition: predict meaning from features. E.g., given features x, what is the character y?

f: x ∈ S^n → y ∈ S^m
Further examples given by lookup tables

A. Boolean AND function

x1 x2 | y
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

B. A non-Boolean function, given as a lookup table of real-valued outputs y for each input pair (x1, x2)
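Such lookup tables translate directly into code. A minimal sketch in Python (the slides' own listings are MATLAB) for the Boolean AND table:

```python
# Boolean AND as an explicit lookup table: the mapping f is fully
# specified by enumerating every input-output pair.
and_table = {
    (0, 0): 0,
    (0, 1): 0,
    (1, 0): 0,
    (1, 1): 1,
}

def f(x1, x2):
    """Look up the output y for the feature pair (x1, x2)."""
    return and_table[(x1, x2)]
```

This works only because the input space is tiny and fully enumerable; the rest of the chapter is about learning such mappings from examples instead.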
The population node as perceptron

Update rule: r^out = g(w r^in)   (component-wise: r_i^out = g(Σ_j w_ij r_j^in))

For example, with r^in = x, ỹ = r^out, and a linear gain function g(x) = x:

ỹ = w1 x1 + w2 x2

[Figure: node diagram with inputs r1^in, r2^in, weights w1, w2, summation Σ, and gain function g producing r^out; plot of the resulting plane ỹ(x1, x2)]
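The update rule can be sketched in a few lines of Python with NumPy (the slides' listings are MATLAB); the weight and input values below are made up for illustration:

```python
import numpy as np

def perceptron_out(w, r_in, g=lambda h: h):
    """Perceptron update rule r_out = g(w @ r_in); the default gain is linear."""
    return g(w @ r_in)

# Illustrative weights w1 = 1, w2 = -1 and input x = (x1, x2) = (2, 0.5)
w = np.array([[1.0, -1.0]])
x = np.array([2.0, 0.5])
y = perceptron_out(w, x)   # linear gain: y = w1*x1 + w2*x2 = 1.5
```

A nonlinear node is obtained by passing a different gain function, e.g. `g=lambda h: 1/(1+np.exp(-h))` for the sigmoid used later in the chapter.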
How to find the right weight values?

Objective (error) function, for example: mean square error (MSE)

E = ½ Σ_i (r_i^out − y_i)²

Gradient descent method:

w_ij ← w_ij − ε ∂E/∂w_ij = w_ij + ε (y_i − r_i^out) r_j^in   (for MSE and linear gain)

[Plot: error surface E(w)]

Initialize weights arbitrarily
Repeat until the error is sufficiently small:
  Apply a sample pattern to the input nodes: r_i := r_i^in = ξ_i^in
  Calculate the rates of the output nodes: r_i^out = g(Σ_j w_ij r_j^in)
  Compute the delta term for the output layer: δ_i = g′(h_i^out)(ξ_i^out − r_i^out)
  Update the weight matrix by adding the term: Δw_ij = ε δ_i r_j^in
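The steps above can be sketched as a Python training loop for a single linear unit (the slides' listings are MATLAB); the target mapping y = 2·x1 − x2 and all constants are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))          # sample input patterns
Y = X @ np.array([2.0, -1.0])         # targets from an assumed linear map

w = np.zeros(2)                       # initialize weights arbitrarily
eps = 0.1                             # learning rate
for _ in range(200):                  # repeat until the error is small
    for x, y in zip(X, Y):
        r_out = w @ x                 # output rate with linear gain g(h) = h
        w += eps * (y - r_out) * x    # delta rule: dw_j = eps*(y - r_out)*r_j_in
```

Because the gain is linear, g′ = 1 and the delta term reduces to the plain output error, so the loop recovers the generating weights (2, −1).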
Example: OCR

A. Training pattern: binary pixel image of a letter, shown by >> displayletter()
B. Learning curve: average Hamming distance vs. training step
C. Generalization ability: average Hamming distance vs. number of flipped bits in the input
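The error measure in panels B and C, the Hamming distance, simply counts the positions where two binary vectors differ; a minimal Python sketch:

```python
import numpy as np

def hamming(a, b):
    """Number of positions where two binary vectors differ."""
    return int(np.sum(np.asarray(a) != np.asarray(b)))

# e.g. the vectors below differ in their second and fourth components
d = hamming([1, 0, 1, 1], [1, 1, 1, 0])
```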
Example: Boolean functions

A. Boolean OR function

x1 x2 | y
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 1

A single threshold node implements OR (e.g. w1 = w2 = 1, Θ = 0.5): y = 1 if w1 x1 + w2 x2 ≥ Θ, with the decision boundary given by the line w1 x1 + w2 x2 = Θ.

B. Boolean XOR function

x1 x2 | y
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0

No single line can separate the two classes: XOR is not linearly separable, so no single threshold node can implement it.
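Both claims can be checked directly in Python (the slides' listings are MATLAB): the stated weights realize OR, while a brute-force search over a grid of parameters (a sanity check, not a proof) finds none that realize XOR:

```python
import numpy as np

def threshold_unit(x1, x2, w1, w2, theta):
    """Threshold perceptron: fires iff the weighted input reaches theta."""
    return int(w1 * x1 + w2 * x2 >= theta)

# OR is realized by w1 = w2 = 1, theta = 0.5
or_ok = all(threshold_unit(x1, x2, 1, 1, 0.5) == (x1 or x2)
            for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)])

# A coarse search finds no (w1, w2, theta) realizing XOR, illustrating
# that XOR is not linearly separable.
grid = np.linspace(-2, 2, 21)
xor_found = any(
    all(threshold_unit(x1, x2, w1, w2, th) == (x1 ^ x2)
        for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)])
    for w1 in grid for w2 in grid for th in grid
)
```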
perceptrontrain.m

%% Letter recognition with threshold perceptron
clear; clf;
nin=12*13; nout=26;
wout=rand(nout,nin)-0.5;

% training vectors
load pattern1;
rin=reshape(pattern1, nin, 26);
rdes=diag(ones(1,26));

% Updating and training network
for training_step=1:20;
    % test all patterns
    rout=(wout*rin)>0.5;
    distH=sum(sum((rdes-rout).^2))/26;
    error(training_step)=distH;
    % training with delta rule
    wout=wout+0.1*(rdes-rout)*rin';
end
plot(0:19,error)
xlabel('Training step')
ylabel('Average Hamming distance')
Multilayer Perceptron (MLP)

[Figure: network with input rates r^in, hidden rates r^h, output rates r^out, and weight matrices w^h and w^out]

Update rule: r^out = g^out(w^out g^h(w^h r^in))

Learning rule (error backpropagation): w_ij ← w_ij − ε ∂E/∂w_ij

Initialize weights arbitrarily
Repeat until the error is sufficiently small:
  Apply a sample pattern to the input nodes: r_i := r_i^in = ξ_i^in
  Propagate the input through the network by calculating the rates of nodes in successive layers l: r_i^l = g(h_i^l) = g(Σ_j w_ij^l r_j^(l−1))
  Compute the delta term for the output layer: δ_i^out = g′(h_i^out)(ξ_i^out − r_i^out)
  Back-propagate the delta terms through the network: δ_i^(l−1) = g′(h_i^(l−1)) Σ_j w_ji^l δ_j^l
  Update each weight matrix by adding the term: Δw_ij^l = ε δ_i^l r_j^(l−1)
perceptrontrain.m

%% MLP with backpropagation learning on XOR problem
clear; clf;
N_i=2; N_h=2; N_o=1;
w_h=rand(N_h,N_i)-0.5; w_o=rand(N_o,N_h)-0.5;

% training vectors (XOR)
r_i=[0 0 1 1; 0 1 0 1];
r_d=[0 1 1 0];

% Updating and training network with sigmoid activation function
for sweep=1:10000;
    % training randomly on one pattern
    i=ceil(4*rand);
    r_h=1./(1+exp(-w_h*r_i(:,i)));
    r_o=1./(1+exp(-w_o*r_h));
    d_o=(r_o.*(1-r_o)).*(r_d(:,i)-r_o);
    d_h=(r_h.*(1-r_h)).*(w_o'*d_o);
    w_o=w_o+0.7*(r_h*d_o')';
    w_h=w_h+0.7*(r_i(:,i)*d_h')';
    % test all patterns
    r_o_test=1./(1+exp(-w_o*(1./(1+exp(-w_h*r_i)))));
    d(sweep)=0.5*sum((r_o_test-r_d).^2);
end
plot(d)
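The backpropagation steps can also be sketched in Python with NumPy for the same 2-2-1 sigmoid network (as in the MATLAB listing, no bias terms); the learning rate, iteration count, and seed are made up, and the network uses batch rather than per-pattern updates:

```python
import numpy as np

rng = np.random.default_rng(1)

def sig(h):
    return 1.0 / (1.0 + np.exp(-h))

X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]], dtype=float)  # one pattern per column
Y = np.array([[0, 1, 1, 0]], dtype=float)                # XOR targets

w_h = rng.uniform(-0.5, 0.5, size=(2, 2))                # hidden weights
w_o = rng.uniform(-0.5, 0.5, size=(1, 2))                # output weights

def loss():
    return 0.5 * np.sum((sig(w_o @ sig(w_h @ X)) - Y) ** 2)

e0 = loss()
eps = 0.5
for _ in range(5000):
    r_h = sig(w_h @ X)                        # forward pass, hidden layer
    r_o = sig(w_o @ r_h)                      # forward pass, output layer
    d_o = r_o * (1 - r_o) * (Y - r_o)         # output-layer delta
    d_h = r_h * (1 - r_h) * (w_o.T @ d_o)     # back-propagated hidden delta
    w_o += eps * d_o @ r_h.T                  # dw = eps * delta * r'
    w_h += eps * d_h @ X.T
```

After training, `loss()` is below the initial error `e0`; as the learning-curve slide shows, convergence on XOR is not guaranteed from every initialization.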
A. An MLP representing the XOR function
B. Approximation of a sine function by a small MLP (network output f(x) vs. input x)
C. Learning curve for the XOR problem (training error vs. training steps)
Overfitting and underfitting

[Figure: noisy data points f(x) together with an overfitted curve, an underfitted curve, and the true mean]

Regularization, for example:

E = ½ Σ_i (r_i^out − y_i)² + γ Σ_ij (w_ij)²
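The effect of such a quadratic (weight-decay) penalty can be sketched with ridge regression, where the regularized minimizer has the closed form w = (XᵀX + γI)⁻¹Xᵀy; the data, polynomial degree, and γ below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of an underlying smooth function
x = np.linspace(0, 1, 15)
y = np.sin(2 * np.pi * x) + 0.3 * rng.normal(size=x.size)

# High-degree polynomial features invite overfitting
X = np.vander(x, 10)

def fit(gamma):
    """Minimize ||Xw - y||^2 + gamma*||w||^2 via the normal equations."""
    return np.linalg.solve(X.T @ X + gamma * np.eye(X.shape[1]), X.T @ y)

w_plain = fit(0.0)   # unregularized least squares
w_reg = fit(0.1)     # with weight-decay penalty
```

The penalty shrinks the weight vector, pulling the fitted curve away from the wildly oscillating overfitted solution toward a smoother one.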
Support Vector Machines

A. Linear large-margin classifier
B. Linearly non-separable case
C. Linearly separable case
D. Non-linear separation after mapping the inputs with a feature function φ(x)
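The idea behind panel D can be illustrated in Python: the XOR patterns are not linearly separable in the original space, but after an illustrative feature map φ(x1, x2) = (x1, x2, x1·x2) a simple threshold perceptron separates them (the learning rate and epoch count are made up):

```python
import numpy as np

X = np.array([(0, 0), (0, 1), (1, 0), (1, 1)], dtype=float)
y = [0, 1, 1, 0]                         # XOR labels

def phi(x):
    """Illustrative feature map adding the product feature x1*x2."""
    return np.array([x[0], x[1], x[0] * x[1]])

w = np.zeros(4)                          # three features plus a bias weight
for _ in range(1000):                    # perceptron learning in feature space
    for x, t in zip(X, y):
        f = np.append(phi(x), 1.0)       # augment with a constant bias input
        out = int(w @ f >= 0)            # threshold unit
        w += 0.1 * (t - out) * f         # perceptron learning rule

preds = [int(w @ np.append(phi(x), 1.0) >= 0) for x in X]
# preds reproduces the XOR labels [0, 1, 1, 0]
```

SVMs combine this trick with the large-margin objective of panel A, and use kernels to avoid computing φ(x) explicitly.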
Further Readings

Simon Haykin (1999), Neural Networks: A Comprehensive Foundation, MacMillan (2nd edition)
John Hertz, Anders Krogh, and Richard G. Palmer (1991), Introduction to the Theory of Neural Computation, Addison-Wesley
Berndt Müller, Joachim Reinhardt, and Michael Thomas Strickland (1995), Neural Networks: An Introduction, Springer
Christopher M. Bishop (2006), Pattern Recognition and Machine Learning, Springer
Laurence F. Abbott and Sacha B. Nelson (2000), Synaptic plasticity: taming the beast, Nature Neuroscience (suppl) 3: 1178-1183
Christopher J. C. Burges (1998), A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery 2: 121-167
Alex J. Smola and Bernhard Schölkopf (2004), A tutorial on support vector regression, Statistics and Computing 14: 199-222
David E. Rumelhart, James L. McClelland, and the PDP Research Group (1986), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, MIT Press
Peter McLeod, Kim Plunkett, and Edmund T. Rolls (1998), Introduction to Connectionist Modelling of Cognitive Processes, Oxford University Press
E. Bruce Goldstein (1999), Sensation & Perception, Brooks/Cole Publishing Company (5th edition)