Competitive Self-adaptation in Evolutionary Algorithms


Josef Tvrdík, University of Ostrava, josef.tvrdik@osu.cz
Ivan Křivý, University of Ostrava, ivan.krivy@osu.cz

Abstract

Heuristic search for the global minimum is studied. This paper focuses on the adaptation of control parameters in differential evolution (DE) and in controlled random search (CRS). Competition among different control-parameter settings is used in order to ensure the self-adaptation of parameter values within the search process. In the generalized CRS, self-adaptation is ensured by several competing local-search heuristics for the generation of a new trial point. The adaptive DE was compared experimentally with other adaptive algorithms on a benchmark; the self-adaptive CRS was compared with other algorithms in the estimation of regression parameters on NIST nonlinear regression datasets. The competitive algorithms outperformed the other algorithms both in reliability and in convergence rate.

Keywords: Global optimization, Differential evolution, Controlled random search, Self-adaptation.

1 Introduction

We deal with the heuristic search for the solution of the global optimization problem defined as follows: for a given objective function f : S → R, S ⊆ R^D, the point x* is to be found such that x* = arg min_{x∈S} f(x). The point x* is called the global minimum point and S is the search space. We focus on problems where the objective function is continuous and the search space is a closed compact set, S = ∏_{d=1}^{D} [a_d, b_d], a_d < b_d, d = 1, 2, ..., D. This specification of S is called box constraints. The problem of global optimization is hard, which is why stochastic (heuristic) algorithms are used for its solution, see e.g. [2, 12]. The authors of many stochastic algorithms claim both efficiency and reliability of their search for the global minimum.
Reliability means that the point with the minimum function value found in the search process is sufficiently close to the global minimum point, and efficiency means that the algorithm finds such a point in reasonable time. However, when using such algorithms, we face the problem of setting their control parameters. The efficiency and reliability of many algorithms depend strongly on the values of the control parameters, and the user is supposed to adjust these values according to the results of trial-and-error preliminary experiments with the search process. Such an approach is time-consuming. Adaptive robust algorithms that are reliable enough at reasonable time consumption, without the necessity of fine-tuning their input parameters, have therefore been studied in recent years. The theoretical analysis by Wolpert and Macready [21] implies that no heuristic search algorithm can outperform all others for all objective functions. In spite of this fact, there is empirical evidence that some algorithms outperform others over a relatively wide range of problems, both in convergence rate and in the reliability of finding the global minimum point. This paper presents an adaptive procedure for setting the control parameters in the differential evolution algorithm, and a self-adaptive controlled random search algorithm whose adaptation is based on the competition of local-search heuristics.

2 Differential Evolution and Adaptation of its Control Parameters

Differential evolution (DE), introduced by Storn and Price [13], is a simple but powerful evolutionary algorithm for global optimization over a box-constrained search space. The algorithm of DE in pseudo-code is

shown as Algorithm 1.

Algorithm 1. Differential evolution
 1  generate P = (x_1, x_2, ..., x_NP);  (points in S)
 2  repeat
 3    for i := 1 to NP do
 4      compute a mutant vector u;
 5      create y by the crossover of u and x_i;
 6      if f(y) < f(x_i) then insert y into Q
 7      else insert x_i into Q
 8      endif;
 9    endfor;
10    P := Q;
11  until stopping condition;

There are several strategies for generating the mutant point u. One of the most popular variants (called DE/rand/1/bin in the literature [3, 10, 13]) generates the point u by adding the weighted difference of two points,

  u = r_1 + F (r_2 − r_3),    (1)

where r_1 ≠ r_2 ≠ r_3 ≠ x_i, the points r_1, r_2, and r_3 are taken at random from P, and F > 0 is an input parameter. Another variant, called DE/best/2/bin, generates the point u according to the formula

  u = x_min + F (r_1 + r_2 − r_3 − r_4),    (2)

where r_1 ≠ r_2 ≠ r_3 ≠ r_4 ≠ x_min, the points r_1, r_2, r_3, and r_4 are taken at random from P (not coinciding with the current x_i), x_min is the point of P with the minimum function value, and F > 0 is an input parameter. In both strategies mentioned above, the elements y_d, d = 1, 2, ..., D, of the trial point y are built up by the crossover of its parents x_i and u using the following rule:

  y_d = u_d    if U_d ≤ CR or d = l,
  y_d = x_id   if U_d > CR and d ≠ l,    (3)

where l ∈ {1, 2, ..., D} is a randomly chosen integer, U_1, U_2, ..., U_D are independent random variables uniformly distributed in [0, 1), and CR ∈ [0, 1] is an input parameter influencing the number of elements to be exchanged by crossover. Rule (3) ensures that at least one element of x_i is replaced by the corresponding element of u, even if CR = 0. Differential evolution has become one of the most frequently used algorithms for solving continuous global optimization problems in recent years [10]. However, it is also known that the efficiency of the search for the global minimum is very sensitive to the setting of the values of F and CR.
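The mutation (1) and binomial crossover (3) of the DE/rand/1/bin strategy can be sketched as follows. This is a minimal sketch assuming NumPy; the function name and signature are illustrative, not taken from the authors' implementation.

```python
import numpy as np

def de_rand_1_bin(P, i, F, CR, rng):
    """One DE/rand/1/bin trial point for individual i: mutation (1) + crossover (3)."""
    NP, D = P.shape
    # r1, r2, r3: mutually distinct points of P, all different from x_i
    r1, r2, r3 = P[rng.choice([j for j in range(NP) if j != i], size=3, replace=False)]
    u = r1 + F * (r2 - r3)            # mutant vector, Eq. (1)
    l = rng.integers(D)               # element copied from u unconditionally
    cross = rng.random(D) <= CR       # U_d <= CR, Eq. (3)
    cross[l] = True                   # at least one element comes from u, even if CR = 0
    return np.where(cross, u, P[i])
```

With CR = 0 exactly one element of the trial point is taken from the mutant vector, as rule (3) guarantees.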
The recommended values are F = 0.8 and CR = 0.5, but even Storn and Price in their principal paper [13] used 0.5 ≤ F ≤ 1 and 0 ≤ CR ≤ 1, depending on the results of preliminary tuning. They also used a population size smaller than the recommended NP = 10 D in many test tasks. Many papers deal with the setting of control parameters for differential evolution. Ali and Törn [1] suggested adapting the value of the scale factor F within the search process according to the equation

  F = max(F_min, 1 − |f_max / f_min|)  if |f_max / f_min| < 1,
  F = max(F_min, 1 − |f_min / f_max|)  otherwise,    (4)

where f_min, f_max are the minimum and maximum function values in the population and F_min is an input parameter ensuring F ∈ [F_min, 1]. According to [1], this calculation of F reflects the demand to make the search more diversified at the early stage and more intensified at later stages, i.e. to produce rather larger values of F when the difference f_max − f_min is large, and rather smaller values of F otherwise. The rule (4) works properly only for f_min > 0. When f_max > 0 and f_min < 0, especially if |f_max| < |f_min|, the values of F fluctuate very rapidly in [F_min, 1], even if the changes in f_max or f_min are small. However, from a practical point of view, this occurs only as a short episode of the search process in most optimization tasks. Zaharie [22] derived a critical interval for the control parameters of DE. This interval ensures that the mean of the population variance is kept non-decreasing, which results in the following relationship:

  2 p F² − 2p/NP + p²/NP > 0,    (5)

where p = max(1/D, CR) is the probability of differential perturbation according to (3). The relationship (5) implies that the mean of the population variance is non-decreasing if F > √(1/NP), but the practical value of this result is very limited, because it brings no new information when compared with the minimum value F = 0.5 used in [13] and in other applications of differential evolution.
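The adaptive rule (4) and the variance condition (5) can be sketched in a few lines. A minimal sketch; the function names are illustrative, and the absolute values in (4) follow the Ali-Törn formulation reconstructed above.

```python
def adaptive_F(f_min, f_max, F_min=0.45):
    """Scale factor F from Eq. (4); assumes f_min != 0 and f_max != 0."""
    if abs(f_max / f_min) < 1:
        return max(F_min, 1 - abs(f_max / f_min))
    return max(F_min, 1 - abs(f_min / f_max))

def variance_nondecreasing(F, CR, NP, D):
    """Zaharie's condition (5): 2 p F^2 - 2p/NP + p^2/NP > 0, p = max(1/D, CR)."""
    p = max(1.0 / D, CR)
    return 2 * p * F**2 - 2 * p / NP + p**2 / NP > 0
```

For example, with the common setting F = 0.8, CR = 0.5, NP = 20 the condition (5) holds, while a very small F such as 0.1 violates it.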
Some other approaches to the adaptation of DE control parameters have appeared; the recent state of adaptive parameter control in differential evolution is summarized by Liu and Lampinen [5]. A new idea of self-adaptation of the control parameters F and CR in differential evolution was proposed by Brest et al. [3]. The values of F and CR can be changed in each generation with probabilities τ_1 and τ_2, respectively. Thus, the successful values are used more frequently, due to the fact that individuals with lower function values survive longer. New values of F are distributed uniformly in [F_l, F_u], and new values of CR are uniform random values from [0, 1].

The setting of the control parameters can be made self-adaptive through the implementation of a competition into the algorithm. This idea, similar to the competition of local-search heuristics in an evolutionary algorithm [14] or in controlled random search [15], was proposed recently [17]. Let us have H settings (different values of F and CR used in the statements on lines 4 and 5 of Algorithm 1) and choose among them at random with probability q_h, h = 1, 2, ..., H. The probabilities can be changed according to the success rate of the settings in the preceding steps of the search process. The h-th setting is successful if it generates a trial point y such that f(y) < f(x_i). When n_h is the current number of successes of the h-th setting, the probability q_h can be evaluated simply as the relative frequency

  q_h = (n_h + n_0) / Σ_{j=1}^{H} (n_j + n_0),    (6)

where n_0 > 0 is a constant. Setting n_0 ≥ 1 prevents a dramatic change in q_h caused by a single random success of the h-th parameter setting. In order to avoid degeneration of the process, the current values of q_h are reset to their starting values (q_h = 1/H) if any probability q_h decreases below a given limit δ > 0. It is supposed that such a competition of different settings will prefer the successful settings. The competition thus provides a self-adaptive mechanism for setting the control parameters to values appropriate for the problem actually solved. Four variants of such competitive differential evolution were implemented and tested on a benchmark in [19]. The benchmark consists of six commonly used functions, see e.g.
[1, 10, 13], at four levels of dimension D of the search space, namely D = 2, D = 5, D = 10, and D = 30:

Ackley's function (multimodal, separable)
  f(x) = −20 exp(−0.02 √((1/D) Σ_{d=1}^{D} x_d²)) − exp((1/D) Σ_{d=1}^{D} cos 2πx_d) + 20 + exp(1),
  x_d ∈ [−30, 30], x* = (0, 0, ..., 0), f(x*) = 0

First De Jong's function, sphere model (unimodal, continuous, convex)
  f(x) = Σ_{d=1}^{D} x_d²,
  x_d ∈ [−5.12, 5.12], x* = (0, 0, ..., 0), f(x*) = 0

Griewank's function (multimodal, nonseparable)
  f(x) = Σ_{d=1}^{D} x_d²/4000 − ∏_{d=1}^{D} cos(x_d/√d) + 1,
  x_d ∈ [−400, 400], x* = (0, 0, ..., 0), f(x*) = 0

Rastrigin's function (multimodal, separable)
  f(x) = 10 D + Σ_{d=1}^{D} [x_d² − 10 cos(2πx_d)],
  x_d ∈ [−5.12, 5.12], x* = (0, 0, ..., 0), f(x*) = 0

Rosenbrock's function, banana valley (unimodal, nonseparable)
  f(x) = Σ_{d=1}^{D−1} [100 (x_d² − x_{d+1})² + (1 − x_d)²],
  x_d ∈ [−2.048, 2.048], x* = (1, 1, ..., 1), f(x*) = 0

Schwefel's function (multimodal, the global minimum distant from the next best local minima)
  f(x) = −Σ_{d=1}^{D} x_d sin(√|x_d|),
  x_d ∈ [−500, 500], x* = (s, s, ..., s), s = 420.9687, f(x*) = −418.9829 D

The most reliable results (and at the same time the second best in time consumption) were provided by the algorithm DEBR18 with 18 competing settings of control parameters: all the combinations of (CR = 0, CR = 0.5, CR = 1) and (F = 0.5, F = 0.8, F = 1) in both the DE/rand/1/bin and DE/best/2/bin strategies. The parameters for the competition of settings were set to n_0 = 2 and δ = 1/(5 H). Here we present a comparison of this algorithm with standard DE and three other recently published adaptive DE algorithms. The tests were performed for all the functions at the four levels of dimension D, as in [19]. One hundred independent runs were carried out for each function and each level of D. The search for the global minimum was stopped if f_max − f_min < 1 × 10⁻⁷ or if the number of objective function evaluations exceeded the input upper limit of 20000 D.
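Four of the benchmark functions above can be sketched directly from their formulas. A minimal sketch assuming NumPy; the Ackley variant uses the coefficient 0.02 as in the definition above.

```python
import numpy as np

def ackley(x):
    x = np.asarray(x, dtype=float)
    return (-20 * np.exp(-0.02 * np.sqrt(np.mean(x**2)))
            - np.exp(np.mean(np.cos(2 * np.pi * x))) + 20 + np.e)

def rastrigin(x):
    x = np.asarray(x, dtype=float)
    return 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

def rosenbrock(x):
    x = np.asarray(x, dtype=float)
    return np.sum(100 * (x[:-1]**2 - x[1:])**2 + (1 - x[:-1])**2)

def schwefel(x):
    x = np.asarray(x, dtype=float)
    return -np.sum(x * np.sin(np.sqrt(np.abs(x))))
```

At the listed minimum points these return f(x*) = 0 (Ackley, Rastrigin, Rosenbrock) and approximately −418.9829 D (Schwefel).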
The population size was set to NP = max(20, 2 D) in all the tested algorithms except BREST2, where the population size recommended in [3], i.e. NP = 10 D, was used. The input parameters for self-adaptation in the BREST1 and BREST2 algorithms were set as follows: τ_1 = τ_2 = 0.1, F_l = 0.1, and F_u = 0.9. In standard DE the recommended control parameter values F = 0.8 and CR = 0.5 were used. In the adaptive DE-Ali the control parameters were set to CR = 0.5 and F_min = 0.45.

Table 1: Comparison of DE algorithms

                       DEBR18        DER       DE-Ali      BREST1      BREST2
Function    D       ne    RP     rne   RP    rne   RP    rne   RP    rne   RP
ackley      2     2409   100      -2  100      2  100    -19   99    -18  100
dejong1     2     1162   100      -1  100     12  100    -18  100    -19  100
griewank    2     2876   100      25   78      5   94      1   96      8   95
rastrig     2     1778   100      -2   99     -2   99    -18  100    -18   99
rosen       2     1956   100     105  100    150  100     72  100     71  100
schwefel    2     1640   100      -3  100    -21  100    -18   99    -18   98
ackley      5     6401   100       1   99     -9  100    -25  100     91  100
dejong1     5     3176   100      -3  100     14  100    -26  100     90  100
griewank    5     8686   100      14   70      9   80    -15   84    127  100
rastrig     5     4989   100      16   95     20   99    -16   98    117  100
rosen       5     6256   100     528  100    331   32    155   97    511  100
schwefel    5     4564    98      -3   98    -33   97    -25   96     91  100
ackley     10    13569   100      14   99    -38   96    -35   97    248  100
dejong1    10     6973   100       6  100     -2  100    -34  100    252  100
griewank   10    13153    99      18   78    -15   83    -37   81    260  100
rastrig    10    10711   100     104   82     78   84     -4   96    414  100
rosen      10    20524   100     429  100     -6    0    130   97    729  100
schwefel   10     9964    99       9   96    -41   90    -31   88    264  100
ackley     30   142208   100     164  100    -58  100    -49  100    179  100
dejong1    30    78664   100     141  100    -23  100    -48  100    182  100
griewank   30   103095   100     174  100    -25  100    -46   98    191  100
rastrig    30   110071   100     445    0    445    0     62  100    445    0
rosen      30   381972   100      57    0     50    0      6   98     57    0
schwefel   30   108050   100     206  100    -41  100    -35   99    246  100

Figure 1: Comparison of DE algorithms, boxplots of RP and rne.

The results of the comparison of the algorithms are summarized in Fig. 1; the values for all the test tasks are given in more detail in Table 1. The reliability of the search, reported in the columns RP, is the percentage of runs in which the minimum objective function value found by the search duplicates at least four digits of the correct result. The time consumption is expressed as the average number (ne) of objective function evaluations needed to reach the stopping condition.
For easier comparison of the algorithms, the ne values are given only for the DEBR18 algorithm; for the other algorithms, the relative change of ne (in percent) with respect to DEBR18 is presented in the columns denoted rne. Negative values of rne therefore mean smaller time consumption than DEBR18, positive values larger. For example, rne = −50 means half the ne of DEBR18, while rne = 100 means that ne is twice as large. The number of objective function evaluations can easily be recalculated as ne = ne_0 (1 + rne/100), where ne_0 is the corresponding number of objective function evaluations for DEBR18. As is apparent from Fig. 1, DEBR18 is the most reliable algorithm in the test. The second best is BREST1, both in reliability and in convergence rate. BREST2 is highly reliable except for two tasks with D = 30, but its convergence is much slower than that of DEBR18. The competitive setting of the control parameters proved to be an efficient adaptive scheme.
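The competition mechanism of Eq. (6), including the reset of the probabilities below the limit δ, can be sketched as follows. A minimal sketch assuming NumPy; the class and method names are illustrative, not from the authors' library.

```python
import numpy as np

class SettingCompetition:
    """Competition of H control-parameter settings, Eq. (6), with reset below delta."""
    def __init__(self, H, n0=2, delta=None, rng=None):
        self.H, self.n0 = H, n0
        self.delta = delta if delta is not None else 1.0 / (5 * H)  # as in DEBR18
        self.n = np.zeros(H)                  # success counts n_h
        self.rng = rng or np.random.default_rng()

    def probabilities(self):
        q = (self.n + self.n0) / (self.n + self.n0).sum()   # Eq. (6)
        if q.min() < self.delta:              # degeneration guard: reset to q_h = 1/H
            self.n[:] = 0
            q = np.full(self.H, 1.0 / self.H)
        return q

    def choose(self):
        """Draw the index of the setting to use on lines 4-5 of Algorithm 1."""
        return self.rng.choice(self.H, p=self.probabilities())

    def reward(self, h):
        """Call when setting h produced a trial point y with f(y) < f(x_i)."""
        self.n[h] += 1
```

With n0 = 2 and three settings, ten successes of the first setting raise its probability from 1/3 to 12/16 = 0.75.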

3 Controlled Random Search with Competing Heuristics

In an additive nonlinear regression model, the elements of the random vector Y are expressed as

  Y_i = g(x_i, β) + ε_i,  i = 1, 2, ..., n,    (7)

where x_i^T = (x_{i1}, x_{i2}, ..., x_{ik}) is the i-th row of the regressor matrix X, β is the vector of parameters, g is a given function nonlinear in the parameters, and the ε_i are iid random variables with zero means. Estimating the parameters by the least-squares method means finding the estimate of β that minimizes the residual sum of squares Q(β) given by

  Q(β) = Σ_{i=1}^{n} [Y_i − g(x_i, β)]².    (8)

Because the function Q(β) need not be unimodal, the estimation of β is a global optimization problem. The iterative deterministic algorithms (e.g. Levenberg-Marquardt) used in standard statistical packages often fail when searching for the true solution of the problem. Several statistical packages were tested [16] on the NIST tasks of higher-level difficulty [8]. For approximately one half of the tasks, the algorithms either failed completely or produced results in significant disagreement with the true parameter values. A stochastic algorithm based on controlled random search (CRS) was therefore proposed for the estimation of nonlinear-regression parameters. The CRS algorithm was originally published by Price [9]; the reflection known from the simplex method [7] was used for generating a new trial point. There are several modifications of the CRS algorithm that have been used successfully in solving global optimization problems [1]. The algorithm can be written in pseudo-code as follows.

Algorithm 2. Controlled random search
 1  generate P (population of N points in S);
 2  find x_max (the point with the maximum function value);
 3  repeat
 4    generate a new trial point y using a heuristic;
 5    if f(y) < f(x_max) then
 6      x_max := y;
 7      find the new x_max;
 8    endif
 9  until stopping condition;

The role of the heuristic on line 4 can be played by any non-deterministic rule generating a new trial point y ∈ S.
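Algorithm 2 can be sketched with the heuristic passed in as a callable. A minimal sketch assuming NumPy; the function names, the bounding of the trial point, and the evaluation budget are illustrative choices, not from the paper.

```python
import numpy as np

def crs(f, bounds, N, heuristic, max_evals=10000, tol=1e-9, rng=None):
    """Controlled random search (Algorithm 2): replace the worst point of P
    whenever the heuristic yields a better trial point."""
    rng = rng or np.random.default_rng()
    lo, hi = np.asarray(bounds, dtype=float).T
    P = lo + rng.random((N, lo.size)) * (hi - lo)   # line 1: N points in S
    fP = np.array([f(x) for x in P])
    evals = N
    while evals < max_evals and fP.max() - fP.min() > tol:
        y = np.clip(heuristic(P, fP, rng), lo, hi)  # line 4: new trial point
        fy = f(y)
        evals += 1
        worst = fP.argmax()                         # x_max, lines 2 and 7
        if fy < fP[worst]:                          # lines 5-7: replace x_max by y
            P[worst], fP[worst] = y, fy
    best = fP.argmin()
    return P[best], fP[best]
```

Any non-deterministic rule can serve as the heuristic; a naive example is sampling uniformly inside the bounding box of the current population.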
There are many different heuristics that can be used and, moreover, the heuristics can alternate during the course of the search. Four competing heuristics were used in the implementation of the CRS algorithm for nonlinear-regression parameter estimation. Three of them are based on a randomized reflection in a simplex Σ (d + 1 points chosen from P), proposed in [4]. A new trial point y is generated from the simplex by the relation

  y = g + U (g − x_H),    (9)

where x_H = arg max_{x∈Σ} f(x) and g is the centroid of the remaining d points of the simplex. The multiplication factor U is a random variable distributed uniformly in [s, α − s), with α > 0 and s being input parameters, 0 < s < α/2. In two of the heuristics, all d + 1 points of the simplex are chosen at random from P: the first heuristic uses α = 2 and s = 0.5, the second one α = 5 and s = 1.5. In the third heuristic, one point of the simplex is the point of P with the minimum objective function value, and the remaining d points of the simplex are chosen at random from the remaining points of P; the input parameters of this heuristic are set to α = 2 and s = 0.5. The fourth competing heuristic is based on differential evolution, see (3) and (4), with F_min = 0.4 and CR = 0.9. The algorithm based on the competition of all four heuristics is denoted CRS4. The rules for the competition of heuristics are similar to those described in Section 2, but the success is weighted by its relative change in the objective function value,

  w_h = (f_max − max(f(y), f_min)) / (f_max − f_min).    (10)

Thus w_h ∈ (0, 1], and the corresponding probability q_h is evaluated as

  q_h = (W_h + w_0) / Σ_{j=1}^{H} (W_j + w_0),    (11)

where W_h is the sum of the weights w_h from the previous search steps and w_0 > 0 is an input parameter of the algorithm. The collection of NIST datasets [8] was used as a benchmark. This collection contains 27 nonlinear regression datasets (tasks) ordered according to their level of difficulty (lower: 8 tasks, average: 11 tasks, higher: 8 tasks).
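The randomized reflection (9) and the weighted competition (10)-(11) can be sketched as follows. A minimal sketch assuming NumPy; the function names are illustrative, and the simplex is drawn entirely at random from P as in the first two heuristics.

```python
import numpy as np

def reflection_trial(P, fP, rng, alpha=2.0, s=0.5):
    """Trial point by randomized reflection in a simplex of d+1 random points, Eq. (9)."""
    N, d = P.shape
    idx = rng.choice(N, size=d + 1, replace=False)
    S, fS = P[idx], fP[idx]
    H = fS.argmax()                          # worst simplex point x_H
    g = (S.sum(axis=0) - S[H]) / d           # centroid of the remaining d points
    U = s + rng.random() * (alpha - 2 * s)   # U ~ Uniform[s, alpha - s)
    return g + U * (g - S[H])

def success_weight(fy, f_min, f_max):
    """Eq. (10): weight of a successful trial point, w_h in (0, 1]."""
    return (f_max - max(fy, f_min)) / (f_max - f_min)

def heuristic_probs(W, w0):
    """Eq. (11): selection probabilities from the accumulated weights W_h."""
    W = np.asarray(W, dtype=float)
    return (W + w0) / (W + w0).sum()
```

A trial point that reaches f_min gets the full weight 1, while one barely below f_max gets a weight near 0, so heuristics producing large improvements gain probability faster.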
The corresponding nonlinear regression models are of exponential or rational type, with the number of parameters ranging from 2 to 9. Most of the datasets (18 of the 27 tasks) result from experimental studies; the remaining ones are generated artificially. The total number of observations varies over a wide range, from 6 to 250. Several algorithms for the estimation of parameters were compared. One of them is a modification of the Levenberg-Marquardt algorithm implemented in the nlinfit procedure of the Statistics Toolbox [6], used with the default values of its control parameters. The other algorithms

in the tests are population-based: differential evolution (DER), two variants of the CRS with competition (CRS4 and CRS4e), and one variant without competition, consisting of a single local-search heuristic, the randomized reflection with α = 2 and s = 0.5 (REFL1). These algorithms were tested in 100 repeated runs for each task. The search spaces S for the individual tasks are given in [18].

Table 2: Comparison of algorithms in the estimation of nonlinear-regression parameters

                    nlinfit      CRS4       CRS4e        DER       REFL1
Task       Level    Result    RP     ne    RP   rne    RP   rne    RP   rne
chwirut1   lower    OK       100   3008   100   -35   100   574    84    47
chwirut2   lower    OK       100   2987   100   -35   100   571    71    74
danwood    lower    OK       100   1620   100   -28   100   300    81     3
gauss1     lower    OK       100  14137   100   -35   100   530    24   243
gauss2     lower    OK        98  14726    98   -36    98  1039    38   280
lanczos3   lower    OK       100  29810   100     2     0   705     0   444
misra1a    lower    OK       100   2157   100   -17   100   325     2    59
misra1b    lower    OK       100   1861   100   -19   100   386     9    57
enso       middle   OK        87  19220    86   -30   100   722    86    56
gauss3     middle   OK       100  15908    99   -35    99  1912     4   532
hahn1      middle   OK        93  16509    93   -26     0  1596     0   992
kirby2     middle   OK       100   8508   100   -23   100  2251    34   261
lanczos1   middle   L          0  28361   100   639     0   746     0   523
lanczos2   middle   OK        55  28251   100     8     0   750     0   475
mgh17      middle   X        100  11023   100   -18    25  1714     0   381
misra1c    middle   OK       100   2104   100   -11   100   435     0    62
misra1d    middle   OK       100   2043   100   -12   100   414     0    73
nelson     middle   OK       100   5904   100   -17   100   619     0    88
roszman1   middle   OK       100   5301   100   -36   100  1245    91    51
bennett5   higher   L        100  41335   100   -11     5   190     1   -69
boxbod     higher   X        100   1308   100   -37   100   206    90    -1
eckerle4   higher   L        100   2629   100   -35   100   140    94     3
mgh09      higher   L        100  10422   100   -15   100  1427     0   184
mgh10      higher   L        100  20761   100     1     0   478     0   -50
rat42      higher   OK       100   2942   100   -35   100   474    81    42
rat43      higher   OK       100   4807   100   -39   100   904    87    83
thurber    higher   OK       100  13915   100   -30     0  1912     1   824

Figure 2: Comparison of the stochastic algorithms used in nonlinear regression, boxplots of RP and rne.
The common control parameters of all the population-based algorithms were set up as follows: population size N = 10 d, and stopping condition R²_max − R²_min < ε or ne > 40000 d, where ε = 1 × 10⁻¹⁵ (except for CRS4e, see below) and R²_max, R²_min are the maximum and minimum values in the population of the determination index R², R² = 1 − Q(β̂) / Σ_{i=1}^{n} (Y_i − Ȳ)². The algorithm CRS4e (described in [18]) contains the same four heuristics as CRS4 and, moreover, provides a procedure for adapting the value of ε in the stopping condition. The value of ε starts from an input value ε_0 and can be decreased if 1 − R²_max < γ ε, where γ ≥ 1 is

another input parameter. The input parameters of CRS4e were set to ε_0 = 1 × 10⁻⁹ and γ = 1 × 10⁷. The results of our experiments are summarized in Table 2. The quantities reported in this table have a similar meaning as in Section 2, with rne denoting the relative change in ne when compared with CRS4. The overall comparison of all the stochastic algorithms used in the testing is shown in Fig. 2. The algorithm CRS4 is reliable enough for most tasks except lanczos1 and lanczos2 (which is caused by their extremely low values of 1 − R²). As regards the other tasks, RP is less than 100 in only four of them, but it always exceeds 85 %. The best results in both RP and ne were achieved by CRS4e, where the use of the adaptive stopping condition decreases ne by one tenth to one third in most cases and increases RP in the tasks with a small value of 1 − R².

4 Conclusions

Several self-adaptive evolutionary algorithms were described and their experimental results briefly presented. The results showed that the algorithms whose search strategy is adapted by competition (using different control-parameter settings or several local-search heuristics) outperform the other algorithms in most test tasks. Some of these self-adaptive evolutionary algorithms are implemented in the Matlab program library [20], available on the website¹. This program library is free software; any user of Matlab can download selected functions of the library and use and/or modify them under the terms of the GNU General Public License. Although all the algorithms included in the Matlab program library were tested extensively and proved to be highly reliable, often with significantly lower time consumption in comparison with other stochastic algorithms, there is no guarantee of their correct performance on other global optimization problems. However, the advantage of these algorithms lies in the fact that they can be run with the default values of their control parameters for a wide range of optimization tasks.
The presentation of this non-fuzzy topic at a fuzzy-experts forum follows two objectives. The first is a challenge for a fuzzy approach to self-adaptation in evolutionary algorithms. The second is to offer a new robust optimization procedure for application. The self-adaptive evolutionary algorithms described in this paper can also be applied to the optimization of Takagi-Sugeno fuzzy models. When optimizing them, it is desirable (see [11]):

- to realize the partitioning of the premise space of the models, preferably from real data, by using an efficient fuzzy clustering algorithm,
- to optimize the model successively by combining an evolutionary algorithm with fuzzy rule-base reduction and simplification,
- to check for meeting both partition and search-space constraints during the optimization process.

¹ http://albert.osu.cz/oukip/optimization/

Acknowledgement

The research was supported by the grant 201/05/0284 of the Czech Grant Agency and by the research scheme MSM 6198898701 of the Institute for Research and Applications of Fuzzy Modeling.

References

[1] M. M. Ali, A. Törn, Population set based global optimization algorithms: Some modifications and numerical studies, Computers and Operations Research 31 (2004) 1703-1725.
[2] T. Bäck, Evolutionary Algorithms in Theory and Practice, Oxford University Press, New York, 1996.
[3] J. Brest, S. Greiner, B. Bošković, M. Mernik, V. Žumer, Self-Adapting Control Parameters in Differential Evolution: A Comparative Study on Numerical Benchmark Problems, IEEE Transactions on Evolutionary Computation 10 (2006) 646-657.
[4] I. Křivý, J. Tvrdík, The controlled random search algorithm in optimizing regression models, Comput. Statist. and Data Anal. 20 (1995) 229-234.
[5] J. Liu, J. Lampinen, A Fuzzy Adaptive Differential Evolution Algorithm, Soft Computing 9 (2005) 448-462.
[6] MATLAB, version 2006b, The MathWorks, Inc., 2006.
[7] J. A. Nelder, R. Mead, A simplex method for function minimization, Computer J.
7 (1964) 308 313. [8] NIST, Statistical Reference Datasets: nonlinear regression, NIST Information Technology Laboratory, 2001. http://www.itl.nist.gov/div898/strd/. [9] W. L. Price, A controlled random search procedure for global optimization, Computer J. 20, (1977) 367 370. [10] K. V. Price, R. Storn, J. Lampinen, Differential Evolution: A Practical Approach to Global Optimization, Springer-Verlag, 2005.

[11] H. Roubos, M. Setnes, Compact Fuzzy Models and Classifiers through Model Reduction and Evolutionary Optimization, in: L. Chambers (Ed.), The Practical Handbook of Genetic Algorithms: Applications, chapter 2, Chapman & Hall/CRC, New York, 2000, pp. 31–59.
[12] J. C. Spall, Introduction to Stochastic Search and Optimization, Wiley-Interscience, 2003.
[13] R. Storn, K. V. Price, Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces, J. Global Optimization 11 (1997) 341–359.
[14] J. Tvrdík, L. Mišík, I. Křivý, Competing Heuristics in Evolutionary Algorithms, in: P. Sinčák et al. (Eds.), 2nd Euro-ISCI, Intelligent Technologies – Theory and Applications, IOS Press, Amsterdam, 2002, pp. 159–165.
[15] J. Tvrdík, Generalized controlled random search and competing heuristics, in: R. Matoušek and P. Ošmera (Eds.), MENDEL 2004, 10th International Conference on Soft Computing, Technical University Press, Brno, 2004, pp. 228–233.
[16] J. Tvrdík, I. Křivý, Comparison of algorithms for nonlinear regression estimates, in: J. Antoch (Ed.), COMPSTAT 2004, Physica-Verlag, Heidelberg, 2004, pp. 1917–1924.
[17] J. Tvrdík, Competitive Differential Evolution, in: R. Matoušek and P. Ošmera (Eds.), MENDEL 2006, 12th International Conference on Soft Computing, Technical University Press, Brno, 2006, pp. 7–12.
[18] J. Tvrdík, I. Křivý, L. Mišík, Adaptive Population-based Search: Application to Estimation of Nonlinear Regression Parameters, Computational Statistics and Data Analysis (2007), in press; corrected proof available online since 9 November 2006 (http://www.sciencedirect.com/science/article/B6V8V-4M9HWXT-3/2/e4fa1077aa80c154b130396b3c486286).
[19] J. Tvrdík, Differential Evolution with Competitive Setting of its Control Parameters, TASK Quarterly 11 (2007), in press.
[20] J. Tvrdík, H. Habiballa, V. Pavliska, Matlab Program Library for Box-Constrained Global Optimization, in: APLIMAT 2007, 7th International Conference on Applied Math., Slovak Technical University, Bratislava, 2007, pp. 463–470.
[21] D. H. Wolpert, W. G. Macready, No Free Lunch Theorems for Optimization, IEEE Transactions on Evolutionary Computation 1 (1997) 67–82.
[22] D. Zaharie, Critical Values for the Control Parameter of the Differential Evolution Algorithms, in: R. Matoušek and P. Ošmera (Eds.), MENDEL 2002, 8th International Conference on Soft Computing, Technical University Press, Brno, 2002, pp. 62–67.