In the name of God
Lecture 4: Perceptron and ADALINE
Dr. Majid Ghoshunih

Introduction
The LMS algorithm is built around a linear neuron, a neuron with a linear activation function. However, Rosenblatt's Perceptron (1958) is built around a nonlinear neuron, namely the McCulloch-Pitts model of a neuron. This neuron has a hard-limiting activation function performing the signum function. Recently the term multilayer Perceptron has often been used as a synonym for the term multilayer feedforward neural network. In this section we will be referring to the former meaning.
Perceptron
Goal: classify an applied input x into one of two classes.
Procedure: if the output of the hard limiter is +1, assign x to class C1; if it is -1, assign x to class C2.
The input of the hard limiter is the weighted sum of the inputs. The effect of the bias b is merely to shift the decision boundary away from the origin. The synaptic weights are adapted on an iteration-by-iteration basis.

Perceptron decision regions are separated by a hyperplane: a point (x1, x2) above the boundary line is assigned to class C1; a point below the boundary line to class C2.
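This decision rule can be sketched in a few lines of Python; the weights below are a hypothetical choice, picked so the boundary w1*x1 + w2*x2 + b = 0 is the line x1 + x2 = 1:

```python
import numpy as np

def perceptron_output(w, b, x):
    """Hard-limiter (signum) output: +1 means class C1, -1 means class C2."""
    v = np.dot(w, x) + b          # linear combiner: weighted sum plus bias
    return 1 if v >= 0 else -1

# Hypothetical weights: boundary is the line x1 + x2 = 1
w = np.array([1.0, 1.0])
b = -1.0
print(perceptron_output(w, b, np.array([2.0, 2.0])))   # point above the line -> +1
print(perceptron_output(w, b, np.array([0.0, 0.0])))   # point below the line -> -1
```

Setting b = 0 would force the boundary through the origin, which is why the bias input is needed in general.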
Perceptron Learning Theorem
Linearly separable: if two classes are linearly separable, there exists a decision surface consisting of a hyperplane. If so, there exists a weight vector w such that
wᵀx > 0 for every input vector x belonging to class C1
wᵀx ≤ 0 for every input vector x belonging to class C2
Only for linearly separable classes does the perceptron work well.

Perceptron Learning Theorem
Using the modified signal-flow graph, the bias b(n) is treated as a synaptic weight driven by a fixed input +1, so that w0(n) is b(n); v(n) is the linear combiner output.
Perceptron Learning Theorem
Weight adjustment: if x(n) is correctly classified, w(n+1) = w(n); otherwise w(n+1) = w(n) + η(n)[d(n) - y(n)]x(n), where the learning-rate parameter η(n) controls the adjustment applied to the weight vector.

Summary of Learning
1. Initialization: set w(0) = 0.
2. Activation: at time step n, activate the perceptron by applying the continuous-valued input vector x(n) and the desired response d(n).
3. Computation of actual response: y(n) = sgn[wᵀ(n)x(n)].
4. Adaptation of the weight vector: w(n+1) = w(n) + η[d(n) - y(n)]x(n).
5. Continuation: increment time step n and go back to step 2.
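The five steps above can be sketched as follows (a minimal Python version; the AND-gate data at the bottom is an illustrative choice, not taken from the slides):

```python
import numpy as np

def train_perceptron(X, d, eta=1.0, max_epochs=100):
    """Perceptron learning law: w(n+1) = w(n) + eta*[d(n) - y(n)]*x(n).
    Each row of X carries a fixed +1 first element, so w[0] plays the role of the bias b."""
    w = np.zeros(X.shape[1])                       # step 1: w(0) = 0
    for _ in range(max_epochs):
        errors = 0
        for x, target in zip(X, d):                # step 2: apply x(n), d(n)
            y = 1 if np.dot(w, x) >= 0 else -1     # step 3: y(n) = sgn(w^T x)
            if y != target:                        # step 4: adapt only on misclassification
                w += eta * (target - y) * x
                errors += 1
        if errors == 0:                            # step 5: stop once every pattern is correct
            return w
    return w

# AND gate with +1/-1 targets (linearly separable, so convergence is guaranteed)
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
d = np.array([-1, -1, -1, 1])
w = train_perceptron(X, d)
```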
The network is capable of solving linearly separable problems.

Learning rule
An algorithm to update the weights so that finally the input patterns lie on both sides of the line decidedded by the perceptron. Let t be the time; starting from the initial weights at t = 0, the slides show the decision line after each update, at t = 1, 2, and 3.

Implementation of Logical NOT, AND, and OR
Implementation of Logical Gates

Finding Weights by the MSE Method
Write one equation for each training data point. The output for the first class is +1 and for the second class is -1 (or 0). Apply the MSE method to solve the problem.
Example: implementation of the AND gate.
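A sketch of this procedure for the AND gate (Python; one linear equation per pattern, solved in the least-squares sense — the numbers here are this sketch's own worked example):

```python
import numpy as np

# One equation per training pattern: [1, x1, x2] @ [b, w1, w2] = d
X = np.array([[1, 0, 0],
              [1, 0, 1],
              [1, 1, 0],
              [1, 1, 1]], dtype=float)
d = np.array([-1.0, -1.0, -1.0, 1.0])       # +1 for the first class, -1 for the second

w, *_ = np.linalg.lstsq(X, d, rcond=None)   # MSE (least-squares) solution
y = np.sign(X @ w)                          # hard-limit the linear output
# The solution works out to b = -1.5, w1 = w2 = 1 (up to rounding),
# and the thresholded outputs y reproduce the AND targets exactly.
```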
Summary: Perceptron vs. MSE procedures
Perceptron rule: the perceptron rule always finds a solution if the classes are linearly separable, but does not converge if the classes are non-separable.
MSE criterion: the MSE solution has guaranteed convergence, but it may not find a separating hyperplane even if the classes are linearly separable. Notice that MSE tries to minimize the sum of the squares of the distances of the training data to the separating hyperplane.

Convergence of the Perceptron learning law
Rosenblatt proved that if the input patterns are linearly separable, then the Perceptron learning law converges, and the hyperplane separating the two classes of input patterns can be determined.
Fixed-increment convergence theorem: for linearly separable vectors X1 and X2, the perceptron converges after some n0 iterations.
Limitation of the Perceptron
The XOR problem (Minsky): nonlinear separability.

Perceptron with sigmoid activation function
For a single neuron with a step activation function: Δw = η(d - y)x.
For a single neuron with a sigmoid activation function: Δw = η(d - y)y(1 - y)x.
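The two update rules can be contrasted in code (a minimal sketch; the step neuron here uses 0/1 outputs, and the sigmoid update carries the extra derivative factor y(1 - y)):

```python
import numpy as np

def step_update(w, x, d, eta):
    """Delta rule with a step activation: dw = eta*(d - y)*x, y in {0, 1}."""
    y = 1.0 if np.dot(w, x) >= 0 else 0.0
    return w + eta * (d - y) * x

def sigmoid_update(w, x, d, eta):
    """Delta rule with a sigmoid activation: the error is weighted by the
    derivative phi'(v) = y*(1 - y), which makes the update differentiable."""
    y = 1.0 / (1.0 + np.exp(-np.dot(w, x)))
    return w + eta * (d - y) * y * (1.0 - y) * x
```

The differentiable form is what later allows gradient-based training of multilayer networks, which is how the XOR limitation is eventually overcome.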
Representation of the Perceptron in MATLAB
MATLAB TOOLBOX
net = newp(P,T,TF,LF)
Description of the function: perceptrons are used to solve simple (i.e. linearly separable) classification problems.
NET = NEWP(P,T,TF,LF) takes these inputs:
P: R-by-Q matrix of Q input vectors of R elements each.
T: S-by-Q matrix of Q target vectors of S elements each.
TF: transfer function, default = 'hardlim'.
LF: learning function, default = 'learnp'.
Returns a new perceptron.
Classification example: linear separability. See the M-file.
Classification of data: nonlinear separability
Classification of data: nonlinear separability

ADALINE: The Adaptive Linear Element
ADALINE is a Perceptron with a linear activation function. It was proposed by Widrow.
Applications of Adaline
In general, the Adaline is used to perform:
- Linear approximation of a small segment of a nonlinear hypersurface, which is generated by a p-variable function y = f(x). In this case the bias is usually needed.
- Linear filtering and prediction of data signals.
- Pattern association, that is, generation of m-element output vectors associated with respective p-element input vectors.

Error concept
For a single neuron: ε = d - y.
For multiple neurons: εᵢ = dᵢ - yᵢ, i = 1:m, where m is the number of output neurons.
The total measure of the goodness of approximation, or the performance index, can be specified by the mean-squared error over the m neurons and the N training vectors:
J(W) = (1/(2N)) Σⱼ₌₁ᴺ Σᵢ₌₁ᵐ εᵢ(j)²
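The performance index can be computed directly (a Python sketch, with the training patterns stored as columns):

```python
import numpy as np

def performance_index(W, X, D):
    """J(W) = 1/(2N) * sum of squared errors over N patterns and m outputs.
    X: (p x N) input matrix, D: (m x N) desired outputs, W: (m x p) weights."""
    E = D - W @ X                 # error matrix, one column per pattern
    N = X.shape[1]
    return 0.5 * np.sum(E ** 2) / N
```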
W is the m×(p+1) weight matrix, and X is the (p+1)×N matrix whose columns are the training input vectors.
The error equation is: E = D - W·X
The performance index is: J(W) = (1/(2N)) Σ‖E‖² (sum of squared errors over the N training vectors)
The MSE solution is: W = D·Xᵀ·(X·Xᵀ)⁻¹
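A quick numerical check of the closed-form solution (Python; the data below is a hypothetical exact linear system, so the recovered W matches the generating weights):

```python
import numpy as np

X = np.array([[1.0, 1.0, 1.0, 1.0],    # fixed +1 row plays the role of the bias input
              [0.0, 1.0, 0.0, 2.0],
              [0.0, 0.0, 1.0, 1.0]])   # (p+1) x N, patterns as columns
D = np.array([[1.0, 3.0, 0.0, 4.0]])   # generated exactly by the weights [1, 2, -1]

W = D @ X.T @ np.linalg.inv(X @ X.T)   # MSE solution: W = D X^T (X X^T)^(-1)
E = D - W @ X                          # error equation: E = D - W X (here ~0)
```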
For a single neuron (m = 1), with the input vectors stored as the rows of X:
J(w) = (1/(2N)) EᵀE
Replacing the error in the equation:
J(w) = (1/(2N)) [d - Xw]ᵀ[d - Xw]
     = (1/(2N)) [dᵀd - 2wᵀXᵀd + wᵀXᵀXw]
so J is a quadratic function of w, and the gradient ∇J = (1/N)[XᵀXw - Xᵀd] vanishes at the optimum.

Example: the slide evaluates J(w) numerically for a small two-weight data set X, d and minimizes it by setting ∂J/∂w1 = ∂J/∂w2 = 0.
The plot of the performance index J of the example.

Example: the performance index in the general case.
Method of steepest descent
If N is large, the order of calculation will be high. In order to avoid this problem, we can find the optimal weight vector, for which the mean-squared error J attains its minimum, by iterative modification of the weight vector for each training exemplar, in the direction opposite to the gradient of the performance index J. An example is illustrated in Figure 4.5 for a single-weight situation.

Illustration of the steepest descent method
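For the single-weight situation the iteration is easy to trace (a minimal sketch on a hypothetical quadratic index J(w) = (w - 2)², whose minimum sits at w = 2):

```python
def steepest_descent(w0=0.0, eta=0.1, steps=100):
    """Steepest descent: w(n+1) = w(n) - eta * dJ/dw, for J(w) = (w - 2)**2."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * (w - 2.0)   # gradient of the performance index
        w -= eta * grad          # move opposite to the gradient
    return w

print(steepest_descent())        # converges toward the minimum at w = 2
```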
Method of steepest descent
When the weight vector attains the optimal value, for which the gradient is zero (w* in Figure 4.5), the iterations are stopped. More precisely, the iterations are specified as
w(n+1) = w(n) + Δw(n)
where the weight adjustment Δw(n) is proportional to the gradient of the mean-squared error:
Δw(n) = -η·∇J(w(n))
where η is a learning gain.

The LMS (Widrow-Hoff) Learning Law
The Least-Mean-Square learning law replaces the gradient of the mean-squared error with the gradient of the instantaneous squared error, and the update can be written in the following form:
W(n+1) = W(n) + η·ε(n)·xᵀ(n),  where ε(n) = d(n) - y(n)
The LMS (Widrow-Hoff) Learning Law
For a single neuron:
- Linear neuron: y = wᵀx, ε = d - y, ∂J/∂wᵢ = -ε·xᵢ, so Δwᵢ = η·ε·xᵢ.
- Nonlinear neuron: y = φ(v) with v = wᵀx, ε = d - y, ∂J/∂wᵢ = -ε·φ'(v)·xᵢ, so Δwᵢ = η·ε·φ'(v)·xᵢ.

Network training
Two types of network training:
- Sequential mode (on-line, stochastic, or per-pattern): weights are updated after each pattern is presented.
- Batch mode (off-line or per-epoch): weights are updated after all patterns are presented.
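In sequential (per-pattern) mode, the LMS law for a linear neuron can be sketched as below (Python; the target is a hypothetical noiseless linear plant d = 2·x1 - x2, so the weights should approach [2, -1]):

```python
import numpy as np

def lms_train(X, d, eta=0.05, epochs=50):
    """Widrow-Hoff LMS, sequential mode:
    y(n) = w^T x(n),  w(n+1) = w(n) + eta * (d(n) - y(n)) * x(n)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, target in zip(X, d):
            e = target - np.dot(w, x)   # instantaneous error
            w += eta * e * x            # gradient of the instantaneous squared error
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
d = 2 * X[:, 0] - X[:, 1]               # hypothetical plant to identify
w = lms_train(X, d)
```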
Some general comments on the learning process
Computationally, the learning process goes through all training examples (an epoch) a number of times, until a stopping criterion is reached. The convergence process can be monitored with the plot of the mean-squared error function J(W(n)). The popular stopping criteria are:
- the mean-squared error is sufficiently small;
- the rate of change of the mean-squared error is sufficiently small.

The effect of the learning rate η
Applications
MA (moving average) modeling (filtering). For M = 3:
y(n) = b0·x(n) + b1·x(n-1) + b2·x(n-2) + b3·x(n-3)
Each row of the data matrix X collects [x(n), x(n-1), x(n-2), x(n-3)], and the weights are the coefficients b0..b3.

Applications
AR (auto-regressive) modeling. Model of order M:
y(n) = Σᵢ₌₁ᴹ aᵢ·y(n-i) + b·x(n)
For M = 3:
y(n) = a1·y(n-1) + a2·y(n-2) + a3·y(n-3) + b·x(n)
Each row of X collects the past outputs (and the input), and the weights are the coefficients a1..a3 (and b).
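The AR identification above amounts to ordinary least squares on a matrix of delayed outputs. A sketch (Python, with hypothetical AR(2) coefficients 0.5 and -0.3 and no input term, so the regression recovers them exactly):

```python
import numpy as np

def ar_design_matrix(y, M):
    """Rows are [y(n-1), ..., y(n-M)]; targets are y(n)."""
    X = np.array([[y[n - i] for i in range(1, M + 1)] for n in range(M, len(y))])
    t = np.array(y[M:])
    return X, t

# Hypothetical AR(2) signal: y(n) = 0.5*y(n-1) - 0.3*y(n-2)
y = [1.0, 0.5]
for _ in range(50):
    y.append(0.5 * y[-1] - 0.3 * y[-2])

X, t = ar_design_matrix(y, 2)
a, *_ = np.linalg.lstsq(X, t, rcond=None)   # recovers the coefficients [0.5, -0.3]
```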
Applications
PID controller.

Simulation of MA modeling
Suppose the MA model has order M = 3, with coefficients b. The input x is Gaussian noise with mean = 0 and var = 1; y is calculated by the recursive equation. Please see the M-file.
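Since the M-file itself is not reproduced here, the following Python sketch mirrors the described simulation; the MA(3) coefficients b_true are hypothetical stand-ins for the slide's values:

```python
import numpy as np

rng = np.random.default_rng(1)
b_true = np.array([1.0, 0.5, -0.3, 0.2])   # hypothetical b0..b3, M = 3
x = rng.normal(0.0, 1.0, size=2000)        # Gaussian input: mean 0, var 1
y = np.convolve(x, b_true)[:len(x)]        # y(n) = sum_i b_i * x(n-i)

eta = 0.01
w = np.zeros(4)                            # zero initial weights
for n in range(3, len(x)):
    u = x[n-3:n+1][::-1]                   # [x(n), x(n-1), x(n-2), x(n-3)]
    e = y[n] - np.dot(w, u)                # instantaneous error
    w += eta * e * u                       # LMS update; w approaches b_true
```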
M-file of MA Modeling

MA Modeling
Zero initial weights and η = 0.01. Number of data in training set: N = 10.
MA Modeling
Zero initial weights and η = 0.1. Number of data in training set: N = 10.

MA Modeling
Random initial weights and η = 0.01. Number of data in training set: N = 10.
MA Modeling
Random initial weights and η = 0.1. Number of data in training set: N = 10.

MA Modeling
Random initial weights and η = 0.1. Number of data in training set: N = 10.
MATLAB TOOLBOX
net = newlin(PR,S,ID,LR)
Description of the function: linear layers are often used as adaptive filters for signal processing and prediction.
NEWLIN(PR,S,ID,LR) takes these arguments:
PR - R-by-Q matrix of Q representative input vectors.
S - number of elements in the output vector.
ID - input delay vector, default = [0].
LR - learning rate, default = 0.01;
and returns a new linear layer.