ME 539, Fall 2008: Learning-Based Control
- Roger Blaise Wilkerson
ME 539, Fall 2008: Learning-Based Control
Neural Network Basics
10/1/2008 & 10/6/2008
Oregon State University

Questions???
Announcement: Homework 1 has been posted; due Friday 10/10/08 at noon.
Reading assignment: Sections 4.1 to 4.5 in text. Suggested: Chapters 1 and 2.
Neural Networks for Nonlinear Control
Motivation: control a system with nonlinear dynamics (robot, satellite, air vehicle).
Do we know what the good control strategies are?
- Yes: teach a neural network those strategies. Drive a car and record good driver actions for each state; fly a helicopter and record good pilot actions for each state.
- No: have a neural network discover those strategies. Let the car drive around and provide feedback on performance.

Outline
- Why Neural Networks?
- McCulloch-Pitts Neurons
- Neural Network Architectures
- Activation Functions
- Single Layer Feed Forward Networks
- Multi Layer Feed Forward Networks
- Error Backpropagation
- Implementation Issues
Why Neural Networks?
Neural network: a massively parallel distributed processor made up of simple processing units. It stores knowledge.
An artificial neural network is similar to the brain in that:
- Knowledge is acquired by the network from its environment through a learning process
- Interneuron connection strengths (synaptic weights) are used to store the acquired knowledge
An artificial neural network is different from the brain in a thousand ways. Think of a neural network as a statistical tool.

Benefits of Neural Networks
- Performs an input/output mapping: nonlinear regression
- Can be trained from examples: the functional form of the mapping need not be known
- Is adaptive to changing environments: tracks nonstationarity
- Provides a probabilistic response: confidence in the solution
- Results in fault-tolerant computing: graceful degradation
Input / Output Mapping
Supervised learning: learning with a set of labeled examples. Each example has an input and a desired output.
Training:
- Present input
- Compute output
- Compare network output to desired output
- Update network weights to minimize error
When the weights are stable, the network has learned an input/output mapping.

Types of Learning
Learning rules: Hebbian, memory based, competitive, gradient descent.
Learning paradigms: supervised, critic (reinforcement learning), unsupervised.
Hebbian Learning
If two neurons are activated at the same time, strengthen the weight between them (Hebb, 1949).
Properties: highly local, time dependent, interactive.
Appeal: evidence for biological plausibility.

Memory Based Learning
Explicitly store experiences (patterns) in memory. When a new pattern is observed, find stored patterns in the neighborhood of the test pattern.
Example: the Nearest Neighbor algorithm. For each new unseen pattern, find the closest (or closest K) patterns in memory, and assign the new pattern to the class most frequently represented in the neighborhood.
Drawback: slow recall (search through all stored patterns).
Competitive Learning
Only neurons winning some competition are updated.
Basic elements:
- All neurons start the same
- There is a limit on the total strength of each neuron
- A mechanism for neurons to compete; the winner is called the winner-takes-all neuron
Example: neurons represent concentrations of data. For each pattern, the winning neuron is modified to be closer to that particular pattern. Neurons form clumps to represent the different data clusters.

Gradient Descent
Update weights to minimize error: take steps proportional to the negative of the derivative. More later.
Model of a Neuron
- Each input is the product of some signal (output) and a weight
- All incoming inputs are summed
- The sum goes through an activation function
- The output is sent out to the network

Activation Functions (figures)
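The neuron model above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the lecture; the function names and example values are made up.

```python
import math

def neuron(inputs, weights, bias, activation):
    """Weighted sum of inputs plus a bias term, passed through an activation function."""
    a = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(a)

# Two common activation functions
sigmoid = lambda a: 1.0 / (1.0 + math.exp(-a))   # smooth, in (0, 1)
step = lambda a: 1.0 if a >= 0 else 0.0          # McCulloch-Pitts threshold

# Example: a threshold unit computing logical AND of two binary inputs
y_and = neuron([1.0, 1.0], [1.0, 1.0], -1.5, step)
y_sig = neuron([1.0, 0.5], [0.4, -0.2], 0.1, sigmoid)
```

With the threshold activation and bias −1.5, the unit fires only when both inputs are on, which is the classic McCulloch-Pitts picture of a neuron as a weighted sum followed by a nonlinearity.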
Neural Network Architectures (figures)
Single Layer Feed Forward Networks
- Input x is an m-element input vector
- Target t is the desired output (can be a vector)
- Output y is the response to x
- Error e is the difference between the desired and network outputs: e = t − y

Linear discrimination:
  y = Σ_{k=1..m} w_k x_k + w_0
Logistic discrimination:
  y = f( Σ_{k=1..m} w_k x_k + w_0 )

Given N patterns (x^n, t^n), the error on pattern n is e^n = t^n − y^n, and the mean square error is
  E = (1/2) Σ_{n=1..N} (t^n − y^n)²
Least mean square (LMS) algorithm:
  ∂E/∂w = Σ_n e^n ∂e^n/∂w
Gradient descent:
  Δw = −η ∂E/∂w
Gradient Descent: move in the direction of the negative derivative of E(w).
- Where dE(w)/dw > 0, the update Δw = −η dE(w)/dw is negative, i.e., the rule decreases w.
- Where dE(w)/dw < 0, the update Δw = −η dE(w)/dw is positive, i.e., the rule increases w.
In both cases the step moves w toward decreasing E(w).
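A minimal one-dimensional illustration of this rule (the quadratic error function is chosen just for the example):

```python
def gradient_descent_1d(dE, w, eta=0.1, steps=50):
    """Repeatedly apply delta_w = -eta * dE/dw."""
    for _ in range(steps):
        w = w - eta * dE(w)
    return w

# E(w) = (w - 3)^2 has its minimum at w = 3, with dE/dw = 2(w - 3).
# Starting left of the minimum the derivative is negative, so w increases;
# starting right of it the derivative is positive, so w decreases.
w_min = gradient_descent_1d(lambda w: 2.0 * (w - 3.0), w=0.0)
```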
Single Layer Feed Forward Networks
Linear activation function:
  ∂E/∂w_{i,j} = e_i^n ∂e_i^n/∂w_{i,j} = −e_i^n ∂y_i^n/∂w_{i,j} = −e_i^n x_j^n
Weight update:
  Δw_{i,j} = −η ∂E/∂w_{i,j} = η e_i^n x_j^n

Sigmoid activation function:
  f(a) = 1 / (1 + e^{−a})
Derivative of the sigmoid:
  f'(a) = f(a)(1 − f(a))
Gradient descent:
  ∂E/∂w_{i,j} = e_i^n ∂e_i^n/∂w_{i,j} = −e_i^n f'( Σ_l w_{i,l} x_l^n ) x_j^n = −e_i^n y_i^n (1 − y_i^n) x_j^n
Weight update:
  Δw_{i,j} = −η ∂E/∂w_{i,j} = η e_i^n y_i^n (1 − y_i^n) x_j^n
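The sigmoid update rule above, sketched for a single logistic unit. The AND data set, learning rate, and epoch count are illustrative choices, not from the lecture.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train_logistic(patterns, w, w0, eta=0.5, epochs=2000):
    """Per-pattern update: delta_w = eta * e * y * (1 - y) * x, using f' = f(1 - f)."""
    for _ in range(epochs):
        for x, t in patterns:
            y = sigmoid(sum(wk * xk for wk, xk in zip(w, x)) + w0)
            e = t - y
            g = e * y * (1.0 - y)            # error times the sigmoid derivative
            w = [wk + eta * g * xk for wk, xk in zip(w, x)]
            w0 += eta * g
    return w, w0

# Learn the AND function, which is linearly separable
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, w0 = train_logistic(data, [0.0, 0.0], 0.0)
```

Note the extra y(1 − y) factor relative to the linear LMS update: it shrinks the step when the unit is saturated near 0 or 1.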
Illustration of Gradient Descent
(Figures: the error surface E(w) over weight space; the direction of steepest descent is the direction of the negative gradient, and each update moves from the original point in weight space to a new point lower on the surface.)
Neural Network Basics (part 2)
Questions???
Announcements: Data sets for homework 1 are online; due Friday 10/10/08 at noon.
Reading assignment: Sections 9.2 & 11.1 in text. Suggested reading: Chapter 11.
For 10/8 (Project): pick at least two problem/approach pairs.

Multi Layer Feed Forward Networks
(Figure: inputs x_i feed hidden units h_j through weights v_{i,j}; hidden units feed outputs y_k through weights w_{j,k}.)
Multi Layer Feed Forward Networks
  y_k = f( Σ_j w_{j,k} h_j + w_{0,k} ),  where f(a) = 1 / (1 + e^{−a})
      = f( Σ_j w_{j,k} f( Σ_i v_{i,j} x_i + v_{0,j} ) + w_{0,k} )

Weight Updates
Derivative of the error with respect to weight w_{j,k}:
  ∂E/∂w_{j,k} = e_k ∂e_k/∂w_{j,k}
              = −e_k f( Σ_l w_{l,k} h_l )(1 − f( Σ_l w_{l,k} h_l )) h_j
              = −e_k y_k (1 − y_k) h_j
Weight Updates
Updating hidden-to-output layer weights:
  Δw_{j,k} = −η ∂E/∂w_{j,k} = η δ_k h_j
Hidden-to-output layer deltas:
  δ_k = e_k y_k (1 − y_k)

Multi Layer Feed Forward Networks
Hidden unit values:
  h_j = f( Σ_i v_{i,j} x_i + v_{0,j} )
What are the errors for the hidden layer? We don't know the targets for the hidden units. Now what?

Error Backpropagation
Updating input-to-hidden layer weights:
  Δv_{i,j} = η δ_j x_i
Delta:
  δ_j = e_j f( Σ_i v_{i,j} x_i )(1 − f( Σ_i v_{i,j} x_i )) = e_j h_j (1 − h_j) = ( Σ_k w_{j,k} δ_k ) h_j (1 − h_j)
Here the hidden-unit "error" e_j is taken to be the weighted sum of output deltas, Σ_k w_{j,k} δ_k.
The errors for the hidden layer are the deltas backpropagated from the output layer.

Backpropagation Summary
For sigmoidal activation functions, update any weight connecting unit i to unit j:
  Δw_{i,j} = η δ_j x_i
Deltas are given by:
- For the output layer: δ_j = e_j y_j (1 − y_j)
- For the hidden layer: δ_j = ( Σ_k w_{j,k} δ_k ) h_j (1 − h_j)
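Putting the summary together, here is a minimal one-hidden-layer backpropagation sketch. All names, the XOR data set, and the hyperparameters are illustrative assumptions, not code from the lecture.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train_mlp(data, n_in, n_hid, eta=0.5, epochs=5000, seed=0):
    """Per-pattern backprop for a network with one hidden layer and one sigmoid output."""
    rng = random.Random(seed)
    # v[i][j]: input->hidden weights (row n_in holds the biases); w[j]: hidden->output (last is bias)
    v = [[rng.uniform(-1, 1) for _ in range(n_hid)] for _ in range(n_in + 1)]
    w = [rng.uniform(-1, 1) for _ in range(n_hid + 1)]
    for _ in range(epochs):
        for x, t in data:
            # Forward pass
            h = [sigmoid(sum(v[i][j] * x[i] for i in range(n_in)) + v[n_in][j])
                 for j in range(n_hid)]
            y = sigmoid(sum(w[j] * h[j] for j in range(n_hid)) + w[n_hid])
            # Deltas: output delta, then backpropagated hidden deltas (using the old weights)
            delta_out = (t - y) * y * (1 - y)
            delta_hid = [w[j] * delta_out * h[j] * (1 - h[j]) for j in range(n_hid)]
            # Weight updates
            for j in range(n_hid):
                w[j] += eta * delta_out * h[j]
                for i in range(n_in):
                    v[i][j] += eta * delta_hid[j] * x[i]
                v[n_in][j] += eta * delta_hid[j]
            w[n_hid] += eta * delta_out
    return v, w

def predict(v, w, x):
    n_in, n_hid = len(v) - 1, len(w) - 1
    h = [sigmoid(sum(v[i][j] * x[i] for i in range(n_in)) + v[n_in][j])
         for j in range(n_hid)]
    return sigmoid(sum(w[j] * h[j] for j in range(n_hid)) + w[n_hid])

# XOR is not linearly separable, so it needs the hidden layer
xor = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
v, w = train_mlp(xor, n_in=2, n_hid=3)
```

Note that the hidden deltas are computed from the hidden-to-output weights before those weights are updated, matching the order of the algorithm on the next slide.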
Backpropagation Algorithm
For each epoch:
- Present pattern x to the network
- Propagate the signal forward: compute the hidden unit values, then the output values
- Find the error
- Compute the output layer deltas
- Compute the hidden layer deltas
- Compute the gradient for each weight
- Update each weight
- Present the next pattern
Repeat this process until the MSE is satisfactory.

Radial Basis Function Networks
Key RBF differences:
- Local activation
- Linear output layer recommended
- Euclidean norm activation
- All hidden units are different functions
- One hidden layer
(Figure: inputs x_i feed radial basis units R_1, …, R_j, which feed outputs y_k through weights w_{j,k}.)
Radial Basis Function Networks
c_j is the center of the j-th radial basis function, c_j = {c_{j,1}, …, c_{j,i}, …, c_{j,N}};
σ_j is the radius of the j-th radial basis function.
  R_j(x) = exp( −‖x − c_j‖² / (2 σ_j²) )
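The radial basis response R_j(x) and a linear output layer (as the slides recommend) can be sketched directly. The centers, radii, and weights below are illustrative values, not from the lecture.

```python
import math

def rbf_unit(x, c, sigma):
    """R_j(x) = exp(-||x - c_j||^2 / (2 sigma_j^2)): local, peaked at the center."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def rbf_net(x, centers, sigmas, w, w0):
    """Linear combination of radial basis responses (linear output layer)."""
    return sum(wj * rbf_unit(x, cj, sj)
               for wj, cj, sj in zip(w, centers, sigmas)) + w0

# Two basis functions with opposite weights; at the midpoint the responses cancel
y = rbf_net([0.5, 0.5],
            centers=[[0.0, 0.0], [1.0, 1.0]],
            sigmas=[0.5, 0.5],
            w=[1.0, -1.0], w0=0.0)
```

Unlike a sigmoid unit, each RBF unit responds strongly only near its own center, which is the "local activation" property listed above.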
Radial Basis Function Networks
  y_k = f( Σ_j w_{j,k} R_j + w_{0,k} ) = f( Σ_j w_{j,k} exp( −‖x − c_j‖² / (2 σ_j²) ) + w_{0,k} )

RBF Center Updates
For the quadratic distance:
  ‖x − c_j‖² = (x_1 − c_{j,1})² + … + (x_i − c_{j,i})² + … + (x_N − c_{j,N})²
Center updates:
  ∂R_j/∂c_{j,k} = ∂/∂c_{j,k} exp( −‖x − c_j‖² / (2 σ_j²) )
               = exp( −‖x − c_j‖² / (2 σ_j²) ) · (−2) · ( −1 / (2 σ_j²) ) · (x_k − c_{j,k})
               = R_j (x_k − c_{j,k}) / σ_j²
RBF Center Updates
For a single output y with a linear output layer, updating the centers:
  ∂E/∂c_{j,k} = (∂E/∂e)(∂e/∂y)(∂y/∂R_j)(∂R_j/∂c_{j,k}) = e · (−1) · w_j · R_j (x_k − c_{j,k}) / σ_j²
  Δc_{j,k} = −η ∂E/∂c_{j,k} = η e w_j R_j (x_k − c_{j,k}) / σ_j²

Implementation Issues
- Training, testing and validation
- Network architecture and training
- Initial weights & parameter selection
- Local minima
- Momentum term for weights
- Network complexity
- Convergence
- Generalization
- Model complexity
- Universal approximator theorem
Training, Testing and Validation
- Training: using known samples to set the parameters
- Testing: verifying that the learned mapping applies to unseen samples
- Validation: testing on held-out samples during training to tune parameters
- Generalization: the ability to extend learning to new samples
Example: 1000 data points.
- Use 600 for training: set the parameters
- Use 200 for validation: check performance, adjust parameters
- Use 200 for testing: measure generalization performance
Cross-validation: train and validate on data partitions. 4-fold cross-validation means splitting the data into four parts, then, for each combination, training on three quarters and validating on the remaining quarter.
- All of the training data is used for training (unlike the 200 validation points left unused above)
- Validation results are still valid (four validation sets instead of one)

Network Architecture and Training
Architecture: feed forward network (2-layer FFN); neuron selection; activation functions.
Learning algorithm: gradient descent.
Open questions: How many hidden units? How long should training last?
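The 4-fold scheme described above can be sketched as an index-splitting helper (the function name is illustrative, and only the split logic is shown, with training and validation left to the caller):

```python
def k_fold_splits(n_points, k):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_points))
    fold = n_points // k
    for i in range(k):
        val = indices[i * fold:(i + 1) * fold]          # one fold held out for validation
        train = indices[:i * fold] + indices[(i + 1) * fold:]  # the rest used for training
        yield train, val

# The 800 non-test points from the example, split four ways: each split is 600 train / 200 validation
splits = list(k_fold_splits(800, 4))
```

Every point appears in exactly one validation fold across the four splits, which is why all the training data gets used while the validation estimates remain honest.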
Initial Weights
- Random
- Seed (special concept)
- Clustering (for RBF networks)

Local Minima
The error surface can have multiple local minima, and gradient descent goes to the closest one.
Solution: random restarts from multiple places in weight space.
Momentum Term
The weight update changes too fast with:
  Δw_{i,j} = η δ_j x_i
Let each update be closer to the last update by giving the gradient "momentum":
  Δw_{i,j}(t) = α Δw_{i,j}(t−1) + η δ_j x_i

Convergence criteria:
- Preset time: train for 2000 epochs
- Preset error criterion: train until the MSE reaches a preset value
- Relative error criterion: train until the MSE changes by less than 0.1% per epoch
- Early stopping: use some left-out patterns to validate training; when the validation error bottoms out, stop training.
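The momentum rule above in sketch form, written with an explicit gradient (so −η dE/dw plays the role of η δ_j x_i in the slides' notation); the quadratic objective and the hyperparameters are illustrative:

```python
def momentum_step(grad, w, prev_dw, eta=0.1, alpha=0.9):
    """delta_w(t) = alpha * delta_w(t-1) - eta * dE/dw; returns the new w and delta_w."""
    dw = alpha * prev_dw - eta * grad(w)
    return w + dw, dw

# Minimize E(w) = (w - 3)^2, with dE/dw = 2(w - 3)
w, dw = 0.0, 0.0
for _ in range(200):
    w, dw = momentum_step(lambda u: 2.0 * (u - 3.0), w, dw)
```

Because each step carries a fraction α of the previous step, consecutive updates in the same direction build up speed, while oscillating updates partially cancel, smoothing the trajectory through weight space.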
Generalization Error
(Figure: training set error and test set error versus training time.)
- Training set error is reduced continuously
- Test set error (generalization error) increases after a point
- The network starts to learn the noise in the training data

Model Complexity
Universal Function Approximation
How good an approximator is a multi layer feed forward network?
Universal Approximation Theorem: under some assumptions, for any given constant ε > 0 and continuous function f(x_1, …, x_m), there exists a three-layer MLP with the property that
  |f(x_1, …, x_m) − H(x_1, …, x_m)| < ε
where
  H(x_1, …, x_m) = Σ_i v_i h( Σ_j w_{i,j} x_j + b_i )
and h(·) is a nonlinear activation function.
More information1 Duality revisited. AM 221: Advanced Optimization Spring 2016
AM 22: Advaced Optimizatio Sprig 206 Prof. Yaro Siger Sectio 7 Wedesday, Mar. 9th Duality revisited I this sectio, we will give a slightly differet perspective o duality. optimizatio program: f(x) x R
More informationSince X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain
Assigmet 9 Exercise 5.5 Let X biomial, p, where p 0, 1 is ukow. Obtai cofidece itervals for p i two differet ways: a Sice X / p d N0, p1 p], the variace of the limitig distributio depeds oly o p. Use the
More informationEstimation for Complete Data
Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of
More informationFIR Filters. Lecture #7 Chapter 5. BME 310 Biomedical Computing - J.Schesser
FIR Filters Lecture #7 Chapter 5 8 What Is this Course All About? To Gai a Appreciatio of the Various Types of Sigals ad Systems To Aalyze The Various Types of Systems To Lear the Skills ad Tools eeded
More informationChapter 3: Other Issues in Multiple regression (Part 1)
Chapter 3: Other Issues i Multiple regressio (Part 1) 1 Model (variable) selectio The difficulty with model selectio: for p predictors, there are 2 p differet cadidate models. Whe we have may predictors
More informationInfinite Sequences and Series
Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet
More informationAlgorithms for Clustering
CR2: Statistical Learig & Applicatios Algorithms for Clusterig Lecturer: J. Salmo Scribe: A. Alcolei Settig: give a data set X R p where is the umber of observatio ad p is the umber of features, we wat
More informationRank Modulation with Multiplicity
Rak Modulatio with Multiplicity Axiao (Adrew) Jiag Computer Sciece ad Eg. Dept. Texas A&M Uiversity College Statio, TX 778 ajiag@cse.tamu.edu Abstract Rak modulatio is a scheme that uses the relative order
More informationLecture 9: Boosting. Akshay Krishnamurthy October 3, 2017
Lecture 9: Boostig Akshay Krishamurthy akshay@csumassedu October 3, 07 Recap Last week we discussed some algorithmic aspects of machie learig We saw oe very powerful family of learig algorithms, amely
More informationA Unified Approach on Fast Training of Feedforward and Recurrent Networks Using EM Algorithm
2270 IEEE TRASACTIOS O SIGAL PROCESSIG, VOL. 46, O. 8, AUGUST 1998 [12] Q. T. Zhag, K. M. Wog, P. C. Yip, ad J. P. Reilly, Statistical aalysis of the performace of iformatio criteria i the detectio of
More informationHypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance
Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?
More informationForecasting SO 2 air pollution in Salamanca, Mexico using an ADALINE.
Iovative Productio Machies ad Systems D.T. Pham, E.E. Eldukhri ad A.J. Soroka (eds) 2008 MEC. Cardiff Uiversity, UK. Forecastig SO 2 air pollutio i Salamaca, Mexico usig a ADALINE. M.G. Cortia a, U.S.
More informationv = -!g(x 0 ) Ûg Ûx 1 Ûx 2 Ú If we work out the details in the partial derivatives, we get a pleasing result. n Ûx k, i x i - 2 b k
The Method of Steepest Descet This is the quadratic fuctio from to that is costructed to have a miimum at the x that solves the system A x = b: g(x) = - 2 I the method of steepest descet, we
More informationApplication of Neural Networks in Bridge Health Prediction based on Acceleration and Displacement Data Domain
Proceedigs of the Iteratioal MultiCoferece of Egieers ad Computer Scietists 213 Vol I,, March 13-15, 213, Hog Kog Applicatio of Neural Networks i Bridge Health Predictio based o Acceleratio ad Displacemet
More informationCHAPTER 10 INFINITE SEQUENCES AND SERIES
CHAPTER 10 INFINITE SEQUENCES AND SERIES 10.1 Sequeces 10.2 Ifiite Series 10.3 The Itegral Tests 10.4 Compariso Tests 10.5 The Ratio ad Root Tests 10.6 Alteratig Series: Absolute ad Coditioal Covergece
More information4.1 SIGMA NOTATION AND RIEMANN SUMS
.1 Sigma Notatio ad Riema Sums Cotemporary Calculus 1.1 SIGMA NOTATION AND RIEMANN SUMS Oe strategy for calculatig the area of a regio is to cut the regio ito simple shapes, calculate the area of each
More informationCS 2750 Machine Learning. Lecture 22. Concept learning. CS 2750 Machine Learning. Concept Learning
Lecture 22 Cocept learig Milos Hauskrecht milos@cs.pitt.edu 5329 Seott Square Cocept Learig Outlie: Learig boolea fuctios Most geeral ad most specific cosistet hypothesis. Mitchell s versio space algorithm
More informationMath 25 Solutions to practice problems
Math 5: Advaced Calculus UC Davis, Sprig 0 Math 5 Solutios to practice problems Questio For = 0,,, 3,... ad 0 k defie umbers C k C k =! k!( k)! (for k = 0 ad k = we defie C 0 = C = ). by = ( )... ( k +
More informationCALCULUS BASIC SUMMER REVIEW
CALCULUS BASIC SUMMER REVIEW NAME rise y y y Slope of a o vertical lie: m ru Poit Slope Equatio: y y m( ) The slope is m ad a poit o your lie is, ). ( y Slope-Itercept Equatio: y m b slope= m y-itercept=
More informationLecture #18
18-1 Variatioal Method (See CTDL 1148-1155, [Variatioal Method] 252-263, 295-307[Desity Matrices]) Last time: Quasi-Degeeracy Diagoalize a part of ifiite H * sub-matrix : H (0) + H (1) * correctios for
More informationVector Quantization: a Limiting Case of EM
. Itroductio & defiitios Assume that you are give a data set X = { x j }, j { 2,,, }, of d -dimesioal vectors. The vector quatizatio (VQ) problem requires that we fid a set of prototype vectors Z = { z
More informationLyapunov Stability Analysis for Feedback Control Design
Copyright F.L. Lewis 008 All rights reserved Updated: uesday, November, 008 Lyapuov Stability Aalysis for Feedbac Cotrol Desig Lyapuov heorems Lyapuov Aalysis allows oe to aalyze the stability of cotiuous-time
More informationSECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES
SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,
More informationProbabilistic Unsupervised Learning
HT2015: SC4 Statistical Data Miig ad Machie Learig Dio Sejdiovic Departmet of Statistics Oxford http://www.stats.ox.ac.u/~sejdiov/sdmml.html Probabilistic Methods Algorithmic approach: Data Probabilistic
More information