Dimitri Solomatine. D.P. Solomatine. Data-driven modelling (part 2). 2

Data-driven modelling. Part 2: Artificial Neural Networks
Dimitri Solomatine

Artificial neural networks

ANN: main types

[Diagram: taxonomy of artificial neural networks. Branches: feed-forward, feedback, self-organising. Node labels: linear, non-linear, Hopfield model, Boltzmann machine, feature maps, ART. Learning modes: supervised, unsupervised.]

Linear regression as a simple ANN

[Figure: observed data points (actual output value) and the fitted line $Y = a_1 X + a_0$. For a new input value $x^{(v)}$ the model predicts the new output value $y^{(v)} = a_0 + a_1 x^{(v)}$. The same model is drawn as a single node with input $x_1$, weight $a_1$ and bias $a_0$.]

In the one-dimensional case (one input $x$), given $T$ data vectors $\{x^{(t)}, y^{(t)}\}$, $t = 1, \dots, T$, the coefficients of the equation $y = f(x) = a_1 x + a_0$ can be found. Then, for $V$ new vectors $\{x^{(v)}\}$, $v = 1, \dots, V$, this equation can approximately reproduce the corresponding function values $\{y^{(v)}\}$, $v = 1, \dots, V$.

How to measure the error?

The least-squares error is used, since it allows for the best estimation of the parameters when the errors of the individual measurements are independent and normally distributed.

An optimization problem has to be solved: find $a_0$ and $a_1$ such that $E$ is minimal:

$$E = \sum_{t=1}^{T} \left( y^{(t)} - \left( a_0 + a_1 x^{(t)} \right) \right)^2$$

In a similar fashion the problem can be posed for multiple regression with many inputs, for example:

$$y = a_0 + a_1 x_1 + a_2 x_2$$

[Figure: a single node with inputs $x_1$, $x_2$, weights $a_1$, $a_2$ and bias $a_0$.]

Function approximation: linear regression and ANN

[Figure: two panels plotting $Y$ against $X$. Left: linear regression, $Y = a_1 X + a_0$. Right: neural network approximation, $Y = f(X, a_1, \dots, a_n)$.]
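The least-squares problem for one input has a well-known closed-form solution. The sketch below is only an illustration: the function names and the four data points are hypothetical and do not come from the slides.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a0, a1) for the model y = a0 + a1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered sums of squares and cross-products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    return a0, a1


def sum_squared_error(xs, ys, a0, a1):
    """E = sum over instances of (observed - predicted)^2."""
    return sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical training data lying exactly on y = 1 + 2x.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [1.0, 3.0, 5.0, 7.0]
a0, a1 = fit_line(train_x, train_y)
error = sum_squared_error(train_x, train_y, a0, a1)

# Prediction for a new input value x_v.
x_v = 4.0
y_v = a0 + a1 * x_v
```

With noisy data the error would be positive, and the fitted line would only approximately reproduce the values for new inputs.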

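ANN: multi-layer perceptron (MLP)

[Figure: a three-layer network. Inputs $X = (x_1, x_2, x_3, \dots, x_{N_{inp}})$ feed a hidden layer $y_1, \dots, y_{N_{hid}}$ through weights $a_{ij}$. The hidden layer feeds the outputs $Z = F(X) = (z_1, z_2, z_3, \dots, z_{N_{out}})$ through weights $b_{jk}$. The modelled (real) system supplies the observed $f(X)$, and the error $F(X) - f(X)$ is minimized.]

Hidden layer:

$$y_j = F\left( a_{0j} + \sum_{i=1}^{N_{inp}} a_{ij} x_i \right), \quad j = 1, \dots, N_{hid}$$

Output layer:

$$z_k = F\left( b_{0k} + \sum_{j=1}^{N_{hid}} b_{jk} y_j \right), \quad k = 1, \dots, N_{out}$$

Binary sigmoid: $F(u) = 1 / (1 + e^{-u})$

There are $(N_{inp} + 1) N_{hid} + (N_{hid} + 1) N_{out}$ weights $a_{ij}$ and $b_{jk}$ to be identified by minimizing the mean squared error between the network output and the observed output. Method used: a gradient-based steepest descent method called error backpropagation.

ANN: identification of weights by training (calibration)

The ANN error in reproducing the observed output $OBS_i$ is:

$$E = \sum_{i=1}^{N_{examp}} \left( OBS_i - ANN_i \right)^2$$

Training of an ANN consists in solving a multi-extremum optimization problem: find values of the weights that bring $E$ to a minimum.

Problem of the backpropagation algorithm: it assumes single-extremality.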
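The weight count can be checked with a few lines of code. This is a minimal sketch; the function name and the example layer sizes are illustrative assumptions.

```python
def mlp_weight_count(n_inp: int, n_hid: int, n_out: int) -> int:
    """Number of weights in a one-hidden-layer MLP, bias weights included."""
    hidden_weights = (n_inp + 1) * n_hid   # a_ij plus one bias a_0j per hidden node
    output_weights = (n_hid + 1) * n_out   # b_jk plus one bias b_0k per output node
    return hidden_weights + output_weights


# Hypothetical network: 1 input, 4 hidden nodes, 2 outputs.
n_weights = mlp_weight_count(1, 4, 2)
print(n_weights)  # (1 + 1) * 4 + (4 + 1) * 2 = 18
```

The count grows quickly with the number of hidden nodes, which is one reason the training problem has many local minima.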

Biological motivation

- signals are transmitted between neurons by electrical pulses
- the neuron sums up the effects of thousands of impulses
- if the integrated potential exceeds a threshold, the cell fires: it generates an impulse that travels further along the axon

[Figure labels: dendrites, cell bodies.]

Hidden node

[Figure: a node with inputs $x_1$, $x_2$, weights $a_1$, $a_2$ and bias $a_0$; it computes $u = a_0 + a_1 x_1 + a_2 x_2$ and outputs $y = g(u)$.]

Inputs to the network are $x_i$, $i = 1, \dots, N_{inp}$.

The output of the $j$-th node of the hidden layer is:

$$y_j = g\left( a_{0j} + \sum_{i=1}^{N_{inp}} a_{ij} x_i \right), \quad j = 1, \dots, N_{hid}$$

Output node

[Figure: a node with inputs $y_1$, $y_2$, weights $b_1$, $b_2$ and bias $b_0$; it computes $v = b_0 + b_1 y_1 + b_2 y_2$ and outputs $z = g(v)$.]

The inputs are the outputs of the hidden nodes $y_1, \dots, y_{N_{hid}}$. The outputs are:

$$z_k = g\left( b_{0k} + \sum_{j=1}^{N_{hid}} b_{jk} y_j \right), \quad k = 1, \dots, N_{out}$$

Transfer function g

The transfer function is usually non-linear, bounded and differentiable. Widely used is the logistic function:

$$g(u) = \frac{1}{1 + e^{-\alpha u}}$$

[Figure: plot of the logistic function. Horizontal axis: input value, from -10 to 10. Vertical axis: output value, from about -0.2 to 1.2. The slope at the origin is $\alpha / 4$.]
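The hidden-node and output-node formulas combine into a forward pass through the network. The sketch below assumes the logistic transfer function with a steepness parameter `alpha`, and it stores each node's weights as a row with the bias first. The weight values are hypothetical.

```python
import math


def logistic(u, alpha=1.0):
    """Logistic transfer function; its slope at u = 0 equals alpha / 4."""
    return 1.0 / (1.0 + math.exp(-alpha * u))


def mlp_forward(x, a, b, g=logistic):
    """Forward pass of a one-hidden-layer network.

    x: list of N_inp input values.
    a: one row per hidden node, [a_0j, a_1j, ..., a_Ninp,j] (bias first).
    b: one row per output node, [b_0k, b_1k, ..., b_Nhid,k] (bias first).
    Returns (y, z): hidden-node outputs and network outputs.
    """
    y = [g(row[0] + sum(w * xi for w, xi in zip(row[1:], x))) for row in a]
    z = [g(row[0] + sum(w * yj for w, yj in zip(row[1:], y))) for row in b]
    return y, z


# Hypothetical weights: 2 inputs, 2 hidden nodes, 1 output.
a = [[0.0, 1.0, -1.0],
     [0.5, 0.5, 0.5]]
b = [[0.0, 1.0, -1.0]]
y, z = mlp_forward([1.0, 1.0], a, b)

# Numerical check of the slope at the origin for alpha = 2 (expected 2 / 4).
h = 1e-6
slope = (logistic(h, alpha=2.0) - logistic(-h, alpha=2.0)) / (2 * h)
```

Because the logistic function is bounded, every output lies strictly between 0 and 1, which is why target data are scaled before training.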

ANN complexity and its approximating ability

- the combination of transfer functions of hidden nodes produces a complex function
- with many hidden nodes any function can be approximated

ANN complexity and its approximating ability: example of approximating a harmonic function

- one input $x$ and two outputs $y_1$ and $y_2$
- the outputs are given by $\sin x$ and $\cos x$
- data is generated by running $x$ from 0 to 6.28 with the step 0.02
- training set: 315 instances
- test set: 12 instances

Training set:

x        y1        y2
0.0000   0.00000   1.00000
0.0200   0.02000   0.99980
0.0400   0.03999   0.99920
...
1.5400   0.99953   0.03079
1.5600   0.99994   0.01080
1.5800   0.99996  -0.00920
1.6000   0.99957  -0.02920
...
6.2400  -0.04318   0.99907
6.2600  -0.02319   0.99973
6.2800  -0.00319   0.99999

Test set:

x        y1        y2
0.0000   0.00000   1.00000
0.3600   0.35227   0.93590
0.9800   0.83050   0.55702
1.8200   0.96911  -0.24663
3.6400  -0.47802  -0.87835
4.4600  -0.96832  -0.24972
5.6200  -0.61563   0.78803
5.8400  -0.42882   0.90339
6.1200  -0.16247   0.98671
6.2400  -0.04318   0.99907
6.2600  -0.02319   0.99973
6.2800  -0.00319   0.99999
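A data set of this kind can be regenerated as follows. This is a sketch under the assumption that $x$ is computed as an integer multiple of the step. The original file may have accumulated the step instead, so the last printed digit can differ slightly in places.

```python
import math


def make_sin_cos_data(n_instances=315, step=0.02):
    """Rows (x, sin x, cos x) for x = 0, step, 2*step, ..."""
    rows = []
    for i in range(n_instances):
        x = i * step
        rows.append((x, math.sin(x), math.cos(x)))
    return rows


data = make_sin_cos_data()

# Print the first three rows in the same layout as the training file.
for x, y1, y2 in data[:3]:
    print(f"{x:.4f} {y1:.5f} {y2:.5f}")
```

Running from 0 to 6.28 in steps of 0.02 gives 314 intervals, hence the 315 instances quoted above.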

Performance of ANNs as complexity increases

[Figure: four panels showing the ANN fit with (a) 1 hidden node, (b) 2 hidden nodes, (c) 3 hidden nodes, (d) 4 hidden nodes.]

ANN training as an optimization problem

If $N_{out}$ functions, each with $N_{inp}$ independent input variables, are given, and $T$ instances (vectors)

$$\{x_1^{(t)}, x_2^{(t)}, \dots, x_{N_{inp}}^{(t)}, f_1^{(t)}, \dots, f_{N_{out}}^{(t)}\}, \quad t = 1, \dots, T$$

are given, then, on the basis of these instances, the ANN can be trained so that the error is minimal.

Then, if presented with $V$ other instances (vectors)

$$\{x_1^{(v)}, x_2^{(v)}, \dots, x_{N_{inp}}^{(v)}\}, \quad v = 1, \dots, V$$

it would approximately reproduce the corresponding function values

$$\{f_1^{(v)}, f_2^{(v)}, \dots, f_{N_{out}}^{(v)}\}, \quad v = 1, \dots, V$$

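Detailed description of ANN error to be minimized

For output $k$ the error for the input pattern $t$ is:

$$E_k^{(t)} = f_k^{(t)} - z_k^{(t)}$$

The total error for all outputs for input pattern $t$ is:

$$E^{(t)} = \frac{1}{2} \sum_{k=1}^{N_{out}} \left( f_k^{(t)} - z_k^{(t)} \right)^2$$

The total error is the summation of the errors for all output nodes for all $T$ instances:

$$E = \frac{1}{2} \sum_{t=1}^{T} \sum_{k=1}^{N_{out}} \left( f_k^{(t)} - z_k^{(t)} \right)^2$$

Written out in terms of the weights:

$$E = \frac{1}{2} \sum_{t=1}^{T} \sum_{k=1}^{N_{out}} \left[ f_k^{(t)} - g\left( b_{0k} + \sum_{j=1}^{N_{hid}} b_{jk} \, g\left( a_{0j} + \sum_{i=1}^{N_{inp}} a_{ij} x_i^{(t)} \right) \right) \right]^2 \to \min$$

Error function w.r.t. weights (error surfaces)

[Figure: error surfaces, showing $E$ as a function of the weights.]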
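The total error can be computed directly from the targets and the network outputs. The sketch below assumes the conventional factor 1/2 in front of the double sum. The function name and the sample values are illustrative only.

```python
def total_error(targets, outputs):
    """Half the sum of squared differences over all instances and outputs.

    targets, outputs: lists of T rows, each row holding N_out values.
    The factor 0.5 is an assumed convention; it does not change where
    the minimum lies.
    """
    return 0.5 * sum(
        (f - z) ** 2
        for f_row, z_row in zip(targets, outputs)
        for f, z in zip(f_row, z_row)
    )


# Hypothetical example: T = 2 instances, N_out = 2 outputs.
targets = [[1.0, 0.0], [0.0, 1.0]]
outputs = [[0.5, 0.0], [0.0, 0.0]]
E = total_error(targets, outputs)  # 0.5 * (0.25 + 1.0)
```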

How to update weights

- optimization is done by the steepest descent algorithm
- steps are made in the space of variables (weights $w$) in the direction opposite to the direction of the gradient of the function $E$:

  $$w^{N+1} = w^{N} - \eta \, \nabla E(w^{N})$$

  where $N$ is the iteration number and $\eta > 0$ is the step size (learning rate)
- in individual weights the changes will be $w_s^{N+1} = w_s^{N} + \Delta w_s$, and the update step for weight $s$ is:

  $$\Delta w_s = -\eta \, \frac{\partial E}{\partial w_s} \bigg|_{w = w^{N}}$$

- this is the delta rule of Widrow and Hoff (1960) for a single linear perceptron

Practical issues of training

- preparing data
  - scaling input data to prevent network paralysis: g(1.0) = 0.76, g(2.0) = 0.964, g(3.0) = 0.995, g(4.0) = 0.999 (these values correspond to the hyperbolic tangent), so scale input data to [-3, +3]
  - scaling output data: since the sigmoid function is in the range [0, 1], scale measured (target) output data to the range [0, 1], or better [0.1, 0.9], to allow the ANN to extrapolate
  - apply the inverse scaling formulas in ANN testing or operation
- the number of hidden nodes $N_{hid}$ (chosen in relation to $N_{inp}$ and $N_{out}$)
- choice of the activation functions
- remove part of the connections (optimal brain damage)
- deal with local optima (re-randomize weights)
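The steepest-descent update and the output scaling can both be illustrated on a single linear unit. This is a minimal sketch, not the backpropagation algorithm itself. The step size `eta`, the number of epochs, the function names and the data are all assumptions chosen so that the iteration converges.

```python
def train_linear_unit(xs, ys, eta=0.05, epochs=2000):
    """Steepest descent on E = 1/2 * sum (y - (a0 + a1 * x))^2."""
    a0, a1 = 0.0, 0.0
    for _ in range(epochs):
        # Partial derivatives of E at the current weights.
        grad_a0 = -sum(y - (a0 + a1 * x) for x, y in zip(xs, ys))
        grad_a1 = -sum((y - (a0 + a1 * x)) * x for x, y in zip(xs, ys))
        # Step against the gradient: delta_w = -eta * dE/dw.
        a0 -= eta * grad_a0
        a1 -= eta * grad_a1
    return a0, a1


def scale_to_range(values, lo=0.1, hi=0.9):
    """Linearly map values to [lo, hi]; also return the original bounds."""
    vmin, vmax = min(values), max(values)
    scaled = [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]
    return scaled, (vmin, vmax)


def unscale(scaled, bounds, lo=0.1, hi=0.9):
    """Inverse of scale_to_range, applied in testing or operation."""
    vmin, vmax = bounds
    return [vmin + (s - lo) * (vmax - vmin) / (hi - lo) for s in scaled]


# Hypothetical data lying exactly on y = 1 + 2x.
w0, w1 = train_linear_unit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])

scaled, bounds = scale_to_range([0.0, 5.0, 10.0])
restored = unscale(scaled, bounds)
```

Too large a step size makes the iteration diverge. For this data set anything above roughly 0.12 does so, which shows why the choice of step size matters in practice.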

Radial basis function networks

Function approximation by combining functions

- linear regression
- splines: using cubic functions that pass through the points and whose 1st and 2nd derivatives are equal at the boundaries
- orthogonal functions (Chebyshev polynomials)
- combining simple kernel functions

Radial basis functions

- use simple functions $F(x)$ that approximate the given function in the proximity of some representative locations (centers)
- these $F(x)$ depend only on the distance from these centers and drop to zero as the distance from the centers increases

[Figure: centers 1, ..., J.]

Radial basis functions

- function $z = f(x)$, where $x$ is a vector $\{x_1, \dots, x_I\}$ in $I$-dimensional space
- centers $w_j$, $j = 1, \dots, J$, are selected
- $f(x)$ is approximated by

  $$z(x) = \sum_{j=1}^{J} F\left( \lVert x - w_j \rVert ; b_j \right)$$

  where $\lVert x - w_j \rVert$ is the distance (e.g. Euclidean) and $b_j$ are the coefficients associated with the $j$-th center $w_j$

Radial basis functions

We can choose a linear combination of basis functions:

$$z(x) = \sum_{j=1}^{J} b_j \, F\left( \lVert x - w_j \rVert \right)$$

It is common to choose the Gaussian function for $F$:

$$f(r) = \exp\left( -\frac{r^2}{2 \sigma^2} \right)$$

where $\sigma$ is analogous to the standard deviation in a Gaussian (normal) distribution.

The distance $\lVert x - w_j \rVert$ is usually understood in the Euclidean sense and denoted as $\delta_j$:

$$\delta_j = \sqrt{ \sum_{i=1}^{I} \left( x_i - w_{ij} \right)^2 }$$

so the approximation becomes:

$$z(x) = \sum_{j=1}^{J} b_j \exp\left( -\frac{\delta_j^2}{2 \sigma_j^2} \right)$$

Radial basis functions

$$z(x) = \sum_{j=1}^{J} b_j \exp\left( -\frac{\lVert x - w_j \rVert^2}{2 \sigma_j^2} \right)$$

The problem of approximation requires:

- the placement of the localized Gaussians to cover the space (positions of the centers $w_j$);
- the control of the width of each Gaussian (parameter $\sigma_j$);
- the setting of the amplitude of each Gaussian (parameters $b_j$).
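Evaluating a Gaussian radial basis approximation at a point takes only a few lines. The sketch below assumes the Gaussian form $\exp(-\delta^2 / (2\sigma^2))$ with one width per center. The function name and the example centers, widths and amplitudes are hypothetical.

```python
import math


def rbf_output(x, centers, widths, amplitudes):
    """z(x) = sum_j b_j * exp(-||x - w_j||^2 / (2 * sigma_j^2))."""
    total = 0.0
    for w, sigma, b in zip(centers, widths, amplitudes):
        # Squared Euclidean distance between x and the center w.
        dist_sq = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
        total += b * math.exp(-dist_sq / (2.0 * sigma ** 2))
    return total


# Hypothetical two-center model in a 2-dimensional input space.
centers = [[0.0, 0.0], [1.0, 1.0]]
widths = [1.0, 1.0]
amplitudes = [2.0, 3.0]

z_at_origin = rbf_output([0.0, 0.0], centers, widths, amplitudes)
z_far_away = rbf_output([50.0, 50.0], centers, widths, amplitudes)
```

At the first center the first Gaussian contributes its full amplitude, while far from all centers the output drops to practically zero, which is the local behaviour described above.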

Radial basis function problem viewed as a neural network

[Figure: inputs $x_i$ connect through the center coordinates $w_{ij}$ to a hidden layer of Gaussian functions with outputs $y_j$; these connect through weights $b_j$ to the output $z$, which is a linear function.]

Training the RBF network (1)

1. Find the positions of the centers $\{w_j\}$:
   - Choose randomly $J$ instances $x$ and use them as the positions of the centers $\{w_j\}$.
   - All other instances are assigned to the class $j$ of the closest center $w_j$, and the locations of each center are calculated again (using e.g. the k-nearest neighbor method).
   - The above steps are repeated until the locations of the centers stop changing.
2. Calculate the output $y_j(x)$ from each hidden neuron $j$.
...
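Step 1 can be sketched as an iterative assign-and-recompute loop, which is essentially k-means clustering. That is an assumption about how the step is implemented; the slides do not give the details. For simplicity this version assigns every instance, including the ones first picked as centers, and recomputes each center as the mean of its class. The function name and the data are hypothetical.

```python
import random


def find_centers(points, n_centers, seed=0, max_iter=100):
    """Place centers by repeated nearest-center assignment and averaging."""
    rng = random.Random(seed)
    # Start from n_centers randomly chosen instances.
    centers = [list(p) for p in rng.sample(points, n_centers)]
    for _ in range(max_iter):
        # Assign each instance to the class of its closest center.
        classes = [[] for _ in centers]
        for p in points:
            dists = [sum((pi - ci) ** 2 for pi, ci in zip(p, c)) for c in centers]
            classes[dists.index(min(dists))].append(p)
        # Recompute each center as the mean of its class.
        new_centers = []
        for c, members in zip(centers, classes):
            if members:
                new_centers.append(
                    [sum(m[d] for m in members) / len(members) for d in range(len(c))]
                )
            else:
                new_centers.append(c)  # keep a center that attracted no instances
        if new_centers == centers:  # locations stopped changing
            break
        centers = new_centers
    return centers


# Hypothetical 1-D data with two obvious groups.
points = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
found = sorted(find_centers(points, 2, seed=1))
```

For data with less clearly separated groups the result depends on the random start, so the procedure is often repeated with several seeds.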

Training the RBF network (2)

3. Weights $\{b_j\}$ for the output layer are calculated by solving a multiple linear regression problem, which is formulated as a system of linear equations. The output from the output node can be expressed as

   $$z = \sum_{j=1}^{J} b_j y_j$$

   where $b_j$ is the weight on the connection from the hidden node $j$ to the output node, and $y_j$ is the output from the hidden node $j$.
4. If the total error is more than the desired limit, change the number of hidden units and repeat all the steps.

Example: using an RBF network to reproduce the SIN and COS functions

Input file: 1 input (X), 2 outputs (SIN, COS), 315 examples

315 1 2
0.0000   0.00000   1.00000
0.0200   0.02000   0.99980
0.0400   0.03999   0.99920
0.0600   0.05996   0.99820
0.0800   0.07991   0.99680
0.1000   0.09983   0.99500
0.1200   0.11971   0.99281
...
6.2400  -0.04318   0.99907
6.2600  -0.02319   0.99973
6.2800  -0.00319   0.99999

Result: 18 centers found.
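Step 3 amounts to ordinary least squares on the hidden-node outputs. The sketch below forms the normal equations and solves them by Gaussian elimination. It assumes a one-dimensional input, a single shared width `sigma`, fixed centers and one output; all names and numbers are hypothetical.

```python
import math


def gaussian_features(xs, centers, sigma):
    """Hidden-node outputs y_j(x) for each 1-D input x."""
    return [[math.exp(-(x - c) ** 2 / (2.0 * sigma ** 2)) for c in centers]
            for x in xs]


def solve_linear_system(A, rhs):
    """Solve A * sol = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (M[r][n] - s) / M[r][r]
    return sol


def fit_output_weights(Y, targets):
    """Least-squares weights b from the normal equations (Y^T Y) b = Y^T f."""
    J = len(Y[0])
    YtY = [[sum(row[p] * row[q] for row in Y) for q in range(J)] for p in range(J)]
    Ytf = [sum(row[p] * f for row, f in zip(Y, targets)) for p in range(J)]
    return solve_linear_system(YtY, Ytf)


# Hypothetical check: targets built from known weights should be recovered.
xs = [0.5 * i for i in range(13)]
Y = gaussian_features(xs, centers=[1.0, 3.0, 5.0], sigma=1.0)
true_b = [2.0, -1.0, 0.5]
targets = [sum(b * y for b, y in zip(true_b, row)) for row in Y]
fitted_b = fit_output_weights(Y, targets)
```

Because this step is linear in the weights, it has a single solution and needs no iterative search, unlike the training of a multi-layer perceptron.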

Example: using an RBF network to reproduce the behaviour of a 1-D modelling system (SOBEK)

Input file: 7 inputs (previous rainfalls, flows), 1 output (flow), 1303 examples

Result: 1 center found.

Radial basis functions: comments

- RBF networks provide a global approximation to the target function, represented by a linear combination of many local kernel functions
- this can be viewed as a smooth linear combination of piecewise local non-linear functions, that is, the best function is chosen for a particular range of input data
- training is faster than for backpropagation networks since it is done in two steps
- it is an eager method, but it uses the idea of local approximation, as in lazy methods such as k-NN