Towards Multi-Layer Perceptron as an Evaluator Through Randomly Generated Training Patterns


Proceedings of the 5th WSEAS Int. Conf. on Artificial Intelligence, Knowledge Engineering and Data Bases, Madrid, Spain, February 15-17, 2006 (pp254-258)

JANIS ZUTERS
Department of Computer Science
University of Latvia
Raina bulv. 19, Riga, LV-1050
LATVIA

Abstract: - Multi-layer perceptron (MLP) is widely used, because many problems can be reduced to approximation of functions. Pattern evaluation, which is discussed in this article, belongs to this range of problems. MLP manages function approximation problems quite well; however, an important prerequisite is a uniformly distributed set of training patterns. Unfortunately, such a set is not always available. In this article, the use of randomly generated additional training patterns is examined to see whether this improves the training result in cases when just positive patterns are available.

Key-Words: - Learning process, Multi-layer perceptron, Randomly generated patterns, Pattern evaluation

1 Introduction
The problem of pattern evaluation can be simply solved with a multi-layer perceptron. We just have to design a network that approximates the unknown evaluation function f(·) [1]:

$\mathbf{d} = f(\mathbf{x})$, (1)

where the vector x is the input and the vector d is the output. We are given the set of labeled examples (training patterns) Τ:

$T = \{(\mathbf{x}_i, \mathbf{d}_i)\}_{i=1}^{n}$ (2)

The function F(·) describing the input-output mapping actually realized by the network should be close enough to f(·) in a Euclidean sense over all inputs:

$\forall \mathbf{x}: \; \|F(\mathbf{x}) - f(\mathbf{x})\| < \varepsilon$, (3)

where ε is a small positive number. Such problems are perfect candidates for supervised learning, among others with MLPs.

A problem arises if we have no set of patterns adequately representing f(·). [2] describes such a case in solving a timetabling problem via a genetic algorithm (GA), where a neural network is intended to be a part of the fitness function within the GA in order to support the evaluation (rating) of solutions (timetables). The idea is to train the network on existing (i.e., valid) school timetables, thus obtaining a network that is able to evaluate timetables. Unfortunately, for this approach we have no bad or not very good examples of timetables, so the available training set is incomplete, and the evaluation should be done as some kind of similarity computation between the candidate timetable and valid timetables. Figure 1 shows the schema of GA-based timetabling, where the timetable evaluation should be partially committed to a neural network.

[Figure 1: flowchart. Steps: 1. Generating the initial set of timetables; 2. Rating (fitness function); 3. Selecting; 4. Choice of timetables to be modified and reproduction; 5. Modification of reproduced timetables (mutation). Other elements: hard constraints, soft constraints, set of unrated timetables, set of timetables, termination condition satisfied.]
Fig. 1. The schema for GA timetabling [2]

In this article the author examines a method of organizing a learning process (LP) through adding randomly generated training patterns (TPs) to the training set. The method is proposed in order to compensate for the lack of a complete training set.

2 Learning Process of a Neural Network
The ability of the network to learn from its environment and to improve its performance through learning is the property of primary significance for neural networks. [1]
The two aspects of the learning process are to be distinguished:
1. Learning algorithm (LA). LAs differ significantly among various neural network models.
2. Organization of the LP.
The latter encompasses the way in which TPs are passed to the network and, correspondingly, to the LA. Usually the LP is arranged in epochs. An epoch is one sweep through all the patterns in the training set, presenting them to the network, i.e. to the LA. Provided that the set of training patterns Τ is available (see (2)), the LP can be generally described as follows (Fig. 2):

% Τ - set of training patterns
do
    p := fetch the first pattern from Τ
    do
        operate the learning algorithm on p
        p := fetch the next pattern from Τ
    while p is available
while the stopping criterion not met

Fig. 2. A learning process

Fetching of patterns can differ in terms of sequence: some LAs require random order, others do not.

3 Organization of the Learning Process with Incomplete Available Training Set
3.1 The Problem
Let's explore the following problem: compute the extent of similarity between a given alphanumeric symbol and a fixed set of alphanumeric symbols Τ. In order to solve this problem, we need to design a similarity function for the fixed set Τ:

$h_T : A \to \{0..1\}$, (4)

where Τ ⊂ Α, and Α is a set of alphanumeric symbols. This problem is the same as the one discussed in Section 1, just replacing patterns of symbols with timetables. To solve this problem we could face the 2 following sub-problems.
Sub-problem 1 (extremely important with timetable evaluation within GA). Determine the criterion of similarity between 2 patterns (e.g., how close are the symbols 'B' and '8').
Sub-problem 2 (assuming that the problem is being solved using a neural network trained just on positive patterns). Overcome the lack of a complete training set.
The author proposes adding randomly generated patterns to the training set to try to solve such problems.
3.2 Unsuccessful Attempts
The first idea was to examine the Kohonen network. The competition principle based on comparison between neurons seemed very promising: just replace "winner takes it all" with "everyone takes as much as deserved" according to the gained evaluation. Unfortunately, various attempts failed: the Kohonen network did the clusterization well, but it was unable to find out the evaluation of assigning a pattern to some cluster.

3.3 The Proposed Method: Training MLPs with Randomly Generated Patterns
The second idea, which yielded results, was to use additional training patterns along with the available ones. With this approach we have to choose the random rate τ ∈ {0..1}, which denotes the proportion of randomly generated patterns used. The proposed supervised learning method can be described as follows (see Figure 3):

% Τ - set of positive training patterns
% τ - the random rate {0..1}
% s - size of the training set
% determining the size of an epoch (s2 ≥ s):
s2 := s / (1 - τ)
do
    for i := 1 to s2 do
        rd := get random value from interval 0..1
        if rd < τ then
            p := generate random pattern
            d := 0
        else
            p := choose a pattern from Τ at random
            d := 1
        operate the learning algorithm on [p, d]
while the stopping criterion not met

Fig. 3. A supervised learning process with additional randomly generated training patterns
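As an illustration only, one epoch of the process in Fig. 3 might be assembled as in the following Python sketch. The function names, the `black_fraction` default and the toy 4-pixel patterns are assumptions made for this example and are not taken from the paper; the call to the learning algorithm is left out, so the sketch only builds the [p, d] pairs that would be passed to it.

```python
import random


def epoch_size(s, tau):
    """Epoch size s2 = s / (1 - tau), rounded to a whole number of patterns."""
    if not 0.0 <= tau < 1.0:
        raise ValueError("tau must lie in [0, 1)")
    return round(s / (1.0 - tau))


def random_pattern(n_pixels, rng, black_fraction=0.2):
    """Hypothetical generator: each pixel is black (1) with probability black_fraction."""
    return [1 if rng.random() < black_fraction else 0 for _ in range(n_pixels)]


def make_epoch(positives, tau, rng):
    """Build one epoch of (pattern, target) pairs following Fig. 3."""
    n_pixels = len(positives[0])
    epoch = []
    for _ in range(epoch_size(len(positives), tau)):
        if rng.random() < tau:
            # randomly generated pattern, desired output d = 0
            epoch.append((random_pattern(n_pixels, rng), 0.0))
        else:
            # positive pattern drawn at random from the training set, d = 1
            epoch.append((rng.choice(positives), 1.0))
    return epoch


rng = random.Random(0)
positives = [[0, 1, 1, 0], [1, 0, 0, 1], [1, 1, 0, 0]]  # toy stand-ins for images
epoch = make_epoch(positives, tau=0.5, rng=rng)
print(len(epoch))  # 6 pairs, since s2 = 3 / (1 - 0.5)
```

Enlarging the epoch to s2 keeps the expected number of positive patterns per epoch at s, so raising τ adds random patterns without reducing the exposure to the positive ones.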

The proposed method was tested through the experimentation described below.

4 Description of the Experimentation
The goal of the experimentation was to examine a method proposed by the author for organizing the LP by adding randomly generated TPs (solving problems like the one described in Section 3.1). The goal of each experiment is to obtain a neural network that could be used for evaluation of patterns, i.e., one which realizes the similarity function (4).

4.1 Describing the Learning Process of the Experimentation
The classic MLP with the backpropagation learning algorithm was used in the experimentation along with the author's proposed learning method (see Fig. 3). The core MLP operating and training algorithm is briefly shown as follows [3].
Propagation function:

$NET_j = \sum_{i=0}^{m} w_{ji} o_i$, (5)

where NET_j is the propagation value of the neuron j; w_ji is the i-th weight of the neuron j (w_j0 is the bias of the neuron j); o_i is the i-th input signal of the neuron j (o_0 = 1).
Activation function:

$o_j = \varphi(NET_j)$, (6)

where o_j is the output value of the neuron j; φ(·) is the logistic activation function.
The general weight correction rule:

$\Delta w_{ji} = \eta \delta_j o_i$, (7)

where Δw_ji is the correction of the i-th weight of the neuron j; η is the learning rate; δ_j is the local gradient of the neuron j (not specified in this paper); o_i is the i-th input signal of the neuron j.

4.2 Architecture of Neural Networks Used in Experiments
In computer experiments, neural networks of the classic MLP architecture were used:
Size of the input signal: 1280.
1 hidden layer with 3, 5, or 7 neurons.
1 neuron in the output layer.
Networks operate as described above: see (5), (6), (7). The logistic function was used as the activation function, with the gain γ taking one of three values that depend on m, where m is the number of neuron inputs.

4.3 Available Set of Training and Testing Patterns
The networks were tested on black and white images of a size of 32×40 pixels with black hand-written digits and capital Latin letters depicted on a white background (Fig. 4). The number of pattern (symbol) types was n=36 (10 digits + 26 letters). Let's denote the set of pattern types as Γ. The number of variants for each pattern type was v=4.
So, the total size of the set of available patterns Τ was n·v = 144. As we have 4 different variants of each symbol (i.e., four rows of patterns), for each experiment one of the rows served as the test set, and the other three as the training set.

Fig. 4. A subset of the set of available patterns Τ

4.4 Course of a Single Experiment
A total of ,25 experiments were conducted. Each experiment involves the designation, training, and testing of one MLP.
1. Create a new MLP according to the description in Section 4.2.
2. Choose from Τ, at random, three rows of available patterns (Fig. 4) as a candidate set of training patterns Τ_A ⊂ Τ, so that those of the remaining row form the test set Τ_B ⊂ Τ.
3. Choose at random the random rate τ among the values {0, 0.2, 0.5, 0.7, 0.9}. τ=0 means that the network is trained just on the selected patterns without the use of randomly generated ones.
4. Choose at random the number of pattern types c in the training set among the values {1, 2, 4}.
5. Select pattern types at random for the training set from the 36 available digits and letters: Γ'={q_1, q_2, ..., q_c} ⊂ Γ. As three rows of the four available serve for training, the chosen pattern types represent s = c·3 patterns, so the training set would consist of s patterns: Τ'={p_1, p_2, ..., p_s} ⊂ Τ_A.
6. Train the network using the author's proposed method (Section 3.3) until a preset number of epochs has passed or the total error of the output neurons decreases under a threshold ε.
7. Random patterns were generated as black and white images of a size of 32×40 pixels, with the proportion of black at .7-.35 (also chosen at random).
8. Test the trained network on the set of test patterns Τ_B and record the outputs, thus forming the experimental results.
The goal of an experiment is to obtain a network that is able to evaluate input patterns of Τ_B (which represents Γ) with respect to a fixed set Γ', i.e. one that realizes a version of the similarity function h_Γ'.

5 Analysis of the Proposed Training Method
Sub-problem 1 (Section 3.1) is stated as determining the criterion of similarity between two patterns. In this case (analyzing the training method), such a criterion should be obtained by external means, and the similarity function that is based on it would then serve as a benchmark to measure the quality of the solution.

5.1 A Questionnaire Method to Obtain the Similarity Criterion
As there exist no manifest formal criteria in terms of similarity between two patterns, human opinion was chosen as the criterion. A survey was arranged, and eight respondents were asked to evaluate pairs of symbols (a total of 630 pairs: each of the 36 symbols with each of the rest) with a mark between 0 and 1 (with a fixed minimum step), where 1 means very similar, and 0 absolutely different (Fig. 5). All the respondents rated most of the pairs as 0. As the average result of all respondents, the similarity measure for two patterns g(·,·) was obtained:

$g : \Gamma \times \Gamma \to \{0..1\}$ (8)

To reduce the computation, the function g(·,·) was simplified in a manner so that the following 2 equations hold true:

$\forall i \in \Gamma : g(i, i) = 1$ (9)
$\forall i, j \in \Gamma : g(i, j) = g(j, i)$ (10)

Fig. 5. An excerpt of a query form used in the survey

Now we can define the similarity function with respect to Γ':

$h_{\Gamma'}(i) = \max_{j \in \Gamma'} g(i, j)$, (11)

where Γ' ⊂ Γ is the set of pattern types representing the training set, and i ∈ Γ is the pattern type.
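A minimal Python sketch of the similarity function (11) may make the definition concrete. It assumes that the averaged survey marks are stored in a dictionary keyed by unordered pairs of symbols and that unrated pairs count as 0; both the storage format and the numeric marks below are invented for illustration and are not the survey data.

```python
def g(a, b, marks):
    """Similarity measure: 1 for identical types (9), symmetric lookup otherwise (10)."""
    if a == b:
        return 1.0
    # frozenset keys make g(a, b) == g(b, a); unrated pairs default to 0 (assumption)
    return marks.get(frozenset((a, b)), 0.0)


def h(a, trained_types, marks):
    """Similarity of type a to the trained set: the best match over its members (11)."""
    return max(g(a, b, marks) for b in trained_types)


# Invented marks, for illustration only
marks = {frozenset(("B", "8")): 0.6, frozenset(("O", "0")): 0.9}
trained = {"B", "O"}

print(h("8", trained, marks))  # 0.6: closest trained type is 'B'
print(h("B", trained, marks))  # 1.0: 'B' itself is in the trained set
print(h("X", trained, marks))  # 0.0: no rated resemblance to 'B' or 'O'
```

Taking the maximum means that a symbol is rated by its single closest trained type, so adding types to the trained set can only raise the similarity values, never lower them.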
The introduced similarity function represents the average evaluation of all respondents with respect to Γ'.

5.2 Representing the Similarity by the Similarity Sequence
Unfortunately, it was impossible to examine experimental results by comparing them directly to the values yielded by the similarity function. That was because of disparate absolute output values among experiments. There was a need for a different notion of the similarity function. The idea is, instead of the similarity function (11), to represent the similarity by the similarity sequence with respect to Γ':

$\chi = (\chi_i)_{i=1}^{n}$, (12)

where i ∈ Γ determines the type of the pattern; χ_i determines the position of the pattern type in the sequence, ordered according to the similarity function:

$h_{\Gamma'}(i) > h_{\Gamma'}(j) \Rightarrow \chi_i < \chi_j$ (13)

5.3 The Proposed Error Function to Evaluate the Training Method
Assume that the result of an experiment is also expressed as an ordered sequence of pattern types:

$\psi = (\psi_i)_{i=1}^{n}$, (14)

ordered according to the outputs of the network:

$F(i) > F(j) \Rightarrow \psi_i < \psi_j$ (15)

where F(·) is the function realized by the neural network trained on Γ'.
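The two orderings can be sketched in Python as follows. This is only an illustration under stated assumptions: positions are 1-based, higher scores come first, and ties are broken by the symbol itself (the paper does not say how ties are handled). The scores are invented for the example.

```python
def positions(scores):
    """Map each pattern type to its 1-based position when sorted by descending score.

    Ties are broken by the type label, which is an assumption of this sketch.
    """
    ordered = sorted(scores, key=lambda t: (-scores[t], t))
    return {t: pos for pos, t in enumerate(ordered, start=1)}


# Invented scores: similarity-function values and network outputs for three types
chi = positions({"A": 1.0, "B": 0.2, "C": 0.6})
psi = positions({"A": 0.71, "B": 0.64, "C": 0.30})

print(chi)  # {'A': 1, 'C': 2, 'B': 3}
print(psi)  # {'A': 1, 'B': 2, 'C': 3}
```

Working with positions rather than raw values removes the dependence on the absolute scale of the network outputs, which is the reason given above for introducing the sequences.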

Then we can define the sequential error λ(·,·) as the difference between positions in the two sequences, computed over all n pattern types from the terms (χ_i − ψ_i):

$\lambda(\chi, \psi)$ (16)

Using the similarity sequence (12) instead of the similarity function (11) is acceptable for an evaluation (fitness) function within a GA, as the fitness function is involved in determining which solutions are to be eliminated or to be chosen as parents, and the absolute values of the evaluation are not of great importance. In the next section, the performed experiments are examined according to the sequential error λ (16).

5.4 Analysis of Experimental Results and Conclusion
All the experiments were examined according to the sequential error λ and grouped by the number of pattern types c in the training set and the random rate τ (see Section 4.4). The summary of experimental results is shown in Table 1.

Table 1. Experimental results as the average of the sequential error λ, grouped by the number of pattern types c and the random rate τ.

sequential error (λ)    random rate (τ)    number of pattern types in Τ' (c)
.549                    0.5                2
.52552                  0.9                4
.7323                   0.5                4
.73762                  0.7                4
.76883                  0.2                4
.882                    0.5                1
.933                    0.2                2
.963                    0.7                2
.99388                  0.7                1
.32                     0.2                1
.23665                  0.9                1
.26268                  0.9                2
.4785                   0                  1
.62686                  0                  2
2.28688                 0                  4

The digits 0 and 1 are missing from the λ values in this transcription; only the surviving digits are shown. The rows are listed in ascending order of λ.

According to the results shown in Table 1, the value of λ is a great distance away from the ideal value of 0. It is just a little better than the one that can be acquired from randomly generated sequences, which yield a λ value of approximately 14.

To better evaluate the experimental results, the similarity sequences obtained from separate respondents and the total similarity sequence (12) were also examined. This was done by comparing them in terms of the sequential error λ, and the results were between 2.82 and 5.99, i.e., also rather far away from the ideal value. Against that background, the acquired results for the proposed training method look fairly good. Although the improvement is small, we still observe the following benefits:
There is a noticeable effect of using just positive patterns in the training set: the improvement is small, but stable.
The worst results were shown by the random rate of value 0. This shows the effect of using randomly generated patterns in the learning process.
The results encourage the author to continue research in order to build a neural network that would serve as an evaluator of school timetables, one which would be included in a genetic algorithm as a part of the fitness function.

References:
[1] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed., Prentice-Hall, Inc., 1999.
[2] J. Zuters, An Adaptable Computational Model for Scheduling Training Sessions, Annual Proceedings of Vidzeme University College "ICTE in Regional Development", 2005, pp. - 3.
[3] J. Zuters, An Extension of Multi-Layer Perceptron Based on Layer-Topology, Proceedings of the 5th International Enformatika Conference '05, 2005, pp. 78-8.