Starrylink Editrice. Tesi e Ricerca. Informatica


Table of Contents

Measures to Characterize Search in Evolutionary Algorithms
Giancarlo Mauri and Leonardo Vanneschi
  1 Introduction
  2 Previous and Related Work
  3 Fitness Distance Correlation
    3.1 Structural Tree Distance
    3.2 Structural Mutations (Property 1: Distance/Operator Consistency)
    3.3 Experimental Results on fdc: W Multimodal Trap Functions (Series 1, 2 and 3), Royal Trees, MAX Problem
    3.4 Counterexample
  4 Negative Slope Coefficient
    4.1 Experimental Results on nsc: the binomial-3 problem, the even parity k problem
    4.2 Summing Up
  5 Subtree Crossover Distance
    5.1 Experimental Results: Fitness Distance Correlation (Syntactic Trees, Trap Functions), Fitness Sharing, Diversity, Summing Up
  6 Conclusions and Future Work


Measures to Characterize Search in Evolutionary Algorithms

Giancarlo Mauri and Leonardo Vanneschi
Dipartimento di Informatica, Sistemistica e Comunicazione (D.I.S.Co.), University of Milano-Bicocca, Milan, Italy

Abstract. The ability of evolutionary algorithms to solve a combinatorial optimization problem is often very hard to verify. Thus, it would be very useful to have one or more numeric measures able to quantify the ability of evolutionary algorithms to find good quality solutions to a given problem from its high level specifications. In this paper, two difficulty measures are presented: fitness distance correlation and negative slope coefficient. Advantages and drawbacks of both these measures are presented from both a theoretical and an empirical point of view for genetic programming. Furthermore, to analyse various properties of the search process of evolutionary algorithms, it is useful to quantify the distance between two individuals. Using operator-based distance measures can make this analysis more accurate and reliable than using distance measures which have no relationship with the genetic operators. This paper also presents a pseudo-distance measure based on subtree crossover for genetic programming. Empirical studies are presented that show the suitability of this measure to dynamically calculate the fitness distance correlation during the evolution, to construct a fitness sharing system for genetic programming and to measure genotypic diversity in the population. Experiments have been performed on a set of well-known hand-tailored problems and real-life-like GP benchmarks.

1 Introduction

For classical algorithms a well-developed theory exists to categorise problems into complexity classes. Problems in the same class have roughly the same complexity, i.e., they consume (asymptotically) the same amount of computational resources, usually time [36]. Although, properly speaking, Evolutionary Algorithms (EAs) are randomised heuristics and not algorithms, it would be useful to be able to likewise classify problems according to some measure of difficulty. Difficulty studies in Genetic Algorithms (GAs) have been pioneered by Goldberg and coworkers (e.g., see [14, 9, 21]). Their approach consisted in constructing functions that should a priori be easy or hard for GAs to solve. These ideas have been followed by many others (e.g. [33, 12]) and have been at least partly successful, in the sense that they have been the source of a considerable amount of other work on what makes a problem easy or difficult for GAs.

One concept that underlies many approaches is the notion of fitness landscape, which originated with the work of Wright in genetics [53]. The fitness landscape metaphor can be helpful to understand the difficulty of a problem for a searcher that is trying to find the optimal solution for that problem. For example, imagine a very smooth and regular landscape with a single hill top. This is the typical fitness landscape of an easy problem: most search strategies (hill climbing, simulated annealing, tabu search, EAs, etc.) are able to find the top of the hill in a straightforward manner. The opposite is true for a very rugged landscape, with many hills and local optima which are not as high as the best one. In this case, even approaches based on populations of individuals, like GAs or Genetic Programming (GP), might have problems. The graphical visualisation of fitness landscapes, whenever possible, can give an indication about the difficulty of a problem for a searching agent like EAs. However, even assuming that one is able to draw a fitness landscape (which is generally not the case, given the huge sizes of typical search spaces and neighbourhoods), the mere observation of its graph surely lacks formality. The ideal situation would be to have a numeric measure able to condense useful information on fitness landscapes.

This work presents two numeric indicators of problem hardness: fitness distance correlation (fdc) and negative slope coefficient (nsc). Their ability to measure the difficulty of fitness landscapes will be investigated for tree-based GP [25] in this paper. Nevertheless, all the concepts can be applied to other search heuristics, like local search, simulated annealing or tabu search, and to other kinds of EAs, like GAs.

Tree-based GP uses transformation operators on tree structures [26] to carry out search. These operators define a neighbourhood structure over the trees. To analyse various dynamics of the GP search process, it is often useful to quantify the distance between two trees in this topological space. For example, the distance between trees is useful if we want to monitor population diversity (see for instance [17, 16, 4, 32, 43, 11]) or if we want to calculate the fdc, as in the first part of this work (see among others [23, 42, 48, 44]). Operator-based distance measures can make calculating distance and the analysis of the search process more accurate [48, 44, 17, 16, 4, 32]. The difficulty of defining operator-based distance measures was highlighted in [18]. Defining a distance measure, or a measure of similarity, that is in some sense bound to (or consistent with) the genetic operators informally means that if two trees are close to each other, or similar, one can be transformed into the other in a few applications of the operator(s). Mutation-based distance measures for GP have been defined, the most common being some variations on the Levenshtein edit distance [16] and the structural distance [11]. In the second part of this paper, a subtree crossover based pseudo-distance measure for GP is defined and its usefulness to analyse some properties of the search process is experimentally shown.

This paper is structured as follows: Section 2 presents the main results and investigations related to the present work that can be found in the literature. Section 3 defines the fdc, presents some experimental results and discusses the main advantages and drawbacks of this measure. Section 4 presents the nsc as a measure
defined to overcome the main limitations of the fdc. A set of experimental results shows the suitability of the nsc for some standard and real-life-like GP benchmarks. Section 5 contains the definition of the subtree crossover pseudo-distance for GP. A large set of experiments demonstrates its usefulness to describe some important dynamics of GP using the crossover operator. Finally, Section 6 concludes this work and proposes some hints for future research activities.

2 Previous and Related Work

The usual empirical approach to problem difficulty in GP has been to run a more or less agreed upon set of test problems that have their origin in Koza's work [25], such as the even-n parity problem, various kinds of symbolic regression and the artificial ant on the Santa Fe trail. However, this point of view, while useful for practical benchmarking purposes, lacks generality, since results are problem-dependent and it is difficult to infer more general issues pertaining to intrinsic GP difficulty by just looking at statistics derived from running these problems a number of times.

There have been few attempts to date to characterize GP difficulty by means of a single measure. One early approach has been proposed by Koza [25] and consists in calculating, for a given problem, the number of individuals that must be processed in order for a solution to be found with a given probability P (usually P = 0.99). This gives a number characterizing the required computational effort, but it cannot be relied upon for distinguishing easy from hard in GP. In particular, it cannot give any clue as to what causes difficulty for GP on a given problem. However, empirical performance measures like this one are required in any case to assess the reliability of other synthetic hardness indicators, and will be used later in this work.

A potentially more fruitful approach would be to transfer to GP some of the considerations that have proved useful for studying the difficulty of GAs, in particular investigations that make use of the concept of fitness landscape. One early example is the work of Kinnear [24], where GP difficulty was related to the shape of the fitness landscape and analysed through the use of the fitness autocorrelation function, as first proposed by Weinberger [52] and later used in GA work by Manderick et al. [30]. While fitness autocorrelation analysis has been useful in the study of NK landscapes for GAs [52], Kinnear found his results inconclusive and difficult to interpret for GP: essentially no simple relationship was found between correlation length values and GP hardness. As well, correlation analysis was found to be unreliable in another study on the MAX3SAT problem, where the measure predicts the problem to be an easy one while it is actually difficult [40].

The work of Nikolaev and Slavov [35] is the only previous one that makes use of the concept of fitness distance correlation in GP. They define a suitable distance for trees and apply fdc to a problem of regular expression induction. However, their main goal was to determine which mutation operator, among a
few that they propose, sees a smoother landscape on this problem, rather than a general study of problem difficulty in GP, which is the aim here.

In the same vein, Punch and coworkers [37, 38] proposed a new synthetic benchmark problem of tunable difficulty dubbed the Royal Tree problem, which was inspired by the well-known Royal Road problem used in GA theory [33]. However, Royal Trees were used by Punch and coworkers to test the effectiveness of multipopulation GP as compared to standard single population GP and not to gauge intrinsic GP difficulty. Nevertheless, Royal Trees will be used to that end in the present work.

A recent, and atypical, attempt to quantify GP problem difficulty also deserves to be mentioned. In [7, 8], Daida and coworkers claim that, although a fitness landscape description of what is seen by a GP searcher may be legitimate, there are other factors that are not taken into account by this view and that are related to structural mechanisms such as tree shapes and depths. Their approach is based on the exhaustive study of the dynamics of the binomial-3 problem [7], and on purely structural constructive problems [8]. These are of tunable difficulty due to the existence of a range of ephemeral random constants for the binomial problem and other parameters. This approach is interesting, but it is different from the one used here and, in some sense, it goes counter to the main goal of difficulty studies, namely trying to find broad classes of problem types. In fact, it effectively restricts the scope of the search by limiting itself to contingencies and problem-specific aspects. However, this point of view is not necessarily contradictory with the fitness landscape view and could yield useful insights of general value.

Langdon and Poli take an experimental view of GP fitness landscapes in several works summarized in their book [29]. After selecting important and typical classes of GP problems, such as boolean problems, the ant problem, and the MAX problem, they study these fitness landscapes either exhaustively, whenever possible, or by randomly sampling the program space when enumeration becomes unfeasible. Their work highlights several important characteristics of GP spaces, such as density and size of solutions and their distribution. This is useful work and, even if the authors' goals are not openly aimed at establishing problem difficulty, it certainly has a bearing on it. More recently, Langdon has extended his studies of convergence rates in GP for simple machine models (which are amenable to quantitative analysis by Markov chain techniques) to convergence of program fitness landscapes for the same machine models using genetic operators and search strategies to traverse the space [27]. This approach is rigorous because the models are simple enough to be mathematically treatable. The ideas are thus welcome, although their extension to standard GP might prove difficult. A middle ground approach, based on what Goldberg has aptly called facetwise models [15], still has something to offer in the effort to better understand the behavior of real programs on real program spaces using standard, or at least typical, GP operators and machine models.

In fact, probably, an ensemble of techniques, rather than a single magic silver bullet, would prove useful to advance the understanding of GP problem
difficulty. To this aim, two indicators, which are different but not contradictory, will be proposed in Sections 3 and 4 respectively.

3 Fitness Distance Correlation

Fitness distance correlation has previously been studied for GAs by Jones [22] and some preliminary results on GP have been presented in [44, 41, 46, 47, 5]. Jones's approach to GA problem difficulty states that what makes a problem easy or hard is the relationship between the fitness of individuals in the search space and their distance to the global optima. The easiest way to measure the extent to which the fitness function values are correlated to the distance to a global optimum is to examine a problem with known optima, take a sample of individuals and compute the correlation of the set of (fitness, distance) pairs. Thus, given a sample F = {f_1, f_2, ..., f_n} of n individual fitnesses and a corresponding sample D = {d_1, d_2, ..., d_n} of the n distances to the nearest global optimum, fdc is defined as:

    fdc = C_FD / (σ_F σ_D)

where

    C_FD = (1/n) Σ_{i=1}^{n} (f_i − f̄)(d_i − d̄)

is the covariance of F and D, and σ_F, σ_D, f̄ and d̄ are the standard deviations and means of F and D. As shown in [22], GA problems can be classified in three classes, depending on the value of the fdc coefficient: misleading (fdc ≥ 0.15), unknown (−0.15 < fdc < 0.15) and straightforward (fdc ≤ −0.15), where the threshold values −0.15 and 0.15 have been determined empirically by Jones [22]. The second class corresponds to problems for which the difficulty cannot be estimated, because there is virtually no correlation between fitness and distance. This section of the paper contains an empirical demonstration that this problem classification also holds for GP.

3.1 Structural Tree Distance

The distance metric used to calculate the fdc should be defined with regard to the neighborhood produced by the genetic operators, so as to assure the conservation of the genetic material between neighbors [22, 39]. This is generally an easy task for GAs, where the well-known Hamming distance has a clear and simple relationship with standard GA mutation [20, 14]. For GP, defining a distance bound to the main genetic operators is obviously more difficult, given that genotypes are trees and genetic operators are more complex. In the first part of this paper, the well-known structural distance (see [11]) is used for GP. This distance is the most suitable among the different definitions found in the literature, and the transformations on which this distance is based make it possible to define two new mutation genetic operators in a very simple way. The new resulting evolutionary process
will be called structural mutation genetic programming (SMGP), to distinguish it from GP based on the standard Koza crossover (which will be referred to as standard GP). Later in this paper, a new pseudo-distance measure bound to GP crossover will be defined (see Section 5).

According to structural distance, given the sets F and T of functions and terminal symbols that are used to build the trees that are evolved by GP, a coding function c must be defined such that c : {F ∪ T} → ℕ. The distance of two trees T_1 and T_2 with roots R_1 and R_2 is defined as follows:

    d(T_1, T_2) = d(R_1, R_2) + k · Σ_{i=1}^{m} d(child_i(R_1), child_i(R_2))    (1)

where d(R_1, R_2) = (|c(R_1) − c(R_2)|)^z with z ∈ ℕ, child_i(Y) is the i-th of the m possible children of a generic node Y if i ≤ m, or the empty tree otherwise, and c evaluated on the root of an empty tree is 0. The constant k is used to give different weights to nodes belonging to different levels. In most of this section of the paper, individuals will be coded using the same syntax as in [5] and [37], i.e. considering a set of functions A, B, C, etc. with increasing arity (i.e. arity(A) = 1, arity(B) = 2, and so on) and a single terminal X (i.e. arity(X) = 0), as follows: F = {A, B, C, D, ...}, T = {X}. The c function, for this particular language, is defined as follows: for all x ∈ {F ∪ T}, c(x) = arity(x) + 1. In the experiments presented here, the constant k will always be set to 1/2.

3.2 Structural Mutations

Given the sets F and T and the coding function c defined in Section 3.1, we define c_max (resp. c_min) as the maximum (resp. the minimum) value assumed by c on the domain {F ∪ T}. Moreover, given a symbol n (resp. m) such that n ∈ {F ∪ T} (resp. m ∈ {F ∪ T}) and c(n) < c_max (resp. c(m) > c_min), we define succ(n) (resp. pred(m)) as a node such that c(succ(n)) = c(n) + 1 (resp. c(pred(m)) = c(m) − 1). Then we can define the following operators on a generic tree T [44]:

inflate mutation: a node labelled with a symbol n such that c(n) < c_max is selected in T and replaced by succ(n). A new random terminal node is added to this new node in a random position (i.e. the new terminal becomes the i-th son of succ(n), where i is comprised between 0 and arity(n)).

deflate mutation: a node labelled with a symbol m such that c(m) > c_min, and such that at least one of its sons is a leaf, is selected in T and replaced by pred(m). A random leaf, chosen among the sons of this node, is deleted from T.
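To make the definition of equation (1) concrete, the following minimal sketch computes the coding function c and the structural distance for the A, B, C, ..., X language of Section 3.1. Python, the nested-tuple tree representation and the function names are assumptions of this sketch, not part of the original work; the default k = z = 1 matches Property 1 below, while the experiments in the text use k = 1/2.

    # Trees are nested tuples whose first element is the node symbol,
    # e.g. ('B', ('A', ('X',)), ('X',)); () denotes the empty tree.

    def c(symbol):
        # Coding function of Section 3.1: c(x) = arity(x) + 1,
        # with arity(X) = 0, arity(A) = 1, arity(B) = 2, and so on.
        return 1 if symbol == 'X' else ord(symbol) - ord('A') + 2

    def structural_distance(t1, t2, k=1.0, z=1):
        # Equation (1): d(T1, T2) = d(R1, R2) + k * sum_i d(child_i(R1), child_i(R2)),
        # with d(R1, R2) = |c(R1) - c(R2)|^z and c of an empty tree equal to 0.
        r1 = c(t1[0]) if t1 else 0
        r2 = c(t2[0]) if t2 else 0
        dist = abs(r1 - r2) ** z
        ch1 = list(t1[1:]) if t1 else []
        ch2 = list(t2[1:]) if t2 else []
        for i in range(max(len(ch1), len(ch2))):
            a = ch1[i] if i < len(ch1) else ()
            b = ch2[i] if i < len(ch2) else ()
            dist += k * structural_distance(a, b, k, z)
        return dist

For example, structural_distance(('B', ('X', ), ('X', )), ('A', ('X', ))) returns 2.0 with the defaults: one unit for the different roots plus one unit for the extra leaf.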

Given these definitions, the following property holds.

Property 1 (Distance/Operator Consistency). Consider the sets F and T and the coding function c defined in Section 3.1, and let T_1 and T_2 be two trees composed of symbols belonging to {F ∪ T}. Let the constants k and z of definition (1) both be equal to 1. If d(T_1, T_2) = D, then T_2 can be obtained from T_1 by a sequence of D/2 editing operations, where an editing operation can be an inflate mutation or a deflate mutation. (This property has been formally proven in [44, 48].)

From this property, we conclude that the operators of inflate and deflate mutation are consistent with the notion of structural distance: applying these operators makes it possible to move through the search space from a tree to its neighbors according to that distance.

3.3 Experimental Results on fdc

To test the suitability of the fdc as a hardness measure for GP, we have used a large set of hand-tailored GP benchmarks that all share the property of having a tunable difficulty (i.e. the difficulty of the problem for GP can be changed by simply varying the value of some parameters, which makes these problems particularly interesting for the study presented here). These benchmarks are: unimodal trap functions, multimodal trap functions, Royal Trees and the MAX problem. Only results concerning multimodal trap functions, Royal Trees and the MAX problem are shown here, given that unimodal trap functions can be seen as a particular case of multimodal ones. Before showing the experimental results, these problems are briefly presented in the respective paragraphs. For a more detailed introduction, see for instance [44].

In all experiments, fdc has been calculated via a sampling of randomly chosen individuals without repetitions. For the GP simulations, the total population size was 200 and generational GP was used with tournament selection of size 10. Both standard GP (i.e. GP using standard subtree crossover [25]) and SMGP (i.e. GP using the structural mutation operators presented in Section 3.2 as the sole genetic operators) will be used in the experiments. In both cases, the GP process was stopped either when a perfect solution was found (global optimum) or when 500 generations were executed. All experiments have been performed 100 times.

Once a measure of hardness and the way to compute it have been chosen, the problem remains of finding a means to validate the prediction of the measure with respect to the problem instance and the algorithm. This is a necessary step in this approach, for otherwise the whole argument might become circular, since there would be no way to relate different values of the difficulty measure among themselves. The easiest way is to use a performance measure. Naudts and Kallel [34] have a good discussion of that point. For the purposes of the present work, performance is defined as the proportion of the runs for which the global optimum has been found in less than 500 generations over 100 runs. Even if this definition is informal and prone to criticism, good or bad performance values correspond to our intuition of what easy or hard means in practice.¹

¹ McPhee and Poli [31] defined some other performance measures for a trivial problem (the 1-then-0s problem) and showed how one can use the GP schema equation to find optimum choices of operators and parameters.
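As an illustration of how the fdc values reported below are obtained, here is a minimal sketch that computes the fdc of Section 3 from lists of sampled fitnesses and distances to the nearest known global optimum, together with Jones's empirical classification. Python and the function names are assumptions of the sketch.

    def fdc(fitnesses, distances):
        # fdc = C_FD / (sigma_F * sigma_D); assumes neither sample is constant.
        n = len(fitnesses)
        f_mean = sum(fitnesses) / n
        d_mean = sum(distances) / n
        c_fd = sum((f - f_mean) * (d - d_mean)
                   for f, d in zip(fitnesses, distances)) / n
        sigma_f = (sum((f - f_mean) ** 2 for f in fitnesses) / n) ** 0.5
        sigma_d = (sum((d - d_mean) ** 2 for d in distances) / n) ** 0.5
        return c_fd / (sigma_f * sigma_d)

    def classify(fdc_value):
        # Jones's empirically determined thresholds.
        if fdc_value >= 0.15:
            return "misleading"
        if fdc_value <= -0.15:
            return "straightforward"
        return "unknown"

In the experiments of this section, the distances are structural distances to the nearest of the known global optima.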

W Multimodal Trap Functions. Multimodal trap functions, first proposed in [10] (informally called W trap functions, given their typical shape shown in Figure 1), are typical EA benchmarks, where the fitness of each individual can be calculated as a function of its distance to one of the global optima. They are characterized by the presence of several global optima. They depend on 5 variables called B1, B2, B3, R1 and R2, each one belonging to the range [0, 1], and they can be expressed by the following formula:

    f(d) = 1 − d/B1                        if d ≤ B1
    f(d) = R1 (d − B1) / (B2 − B1)         if B1 < d ≤ B2
    f(d) = R1 (B3 − d) / (B3 − B2)         if B2 < d ≤ B3
    f(d) = R2 (d − B3) / (1 − B3)          otherwise

where the property B1 ≤ B2 ≤ B3 must hold.

[Fig. 1. Graphical representation of a W trap function with B1 = 0.1, B2 = 0.3, B3 = 0.7, R1 = 1, R2 = 0.7. Note that distances and fitness are normalized into the range [0, 1].]

To build a GP landscape for multimodal trap functions, one has to choose a particular tree T_o as the origin of the fitness/distance plane. All the distances lying on the abscissas of this plane are calculated with respect to T_o, and thus T_o is a global optimum, since its fitness is equal to 1. All the trees having a distance to T_o equal to B2 (if R1 = 1) or equal to 1 (if R2 = 1) are global optima, since they have the same fitness as T_o.
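A direct transcription of this piecewise definition into code may help. The sketch below (Python and the function name are assumed choices) takes a normalized distance d and the five parameters, and explicitly guards the B1 = 0 case used in several of the experiments below to avoid a division by zero.

    def w_trap_fitness(d, b1, b2, b3, r1, r2):
        # d and b1 <= b2 <= b3 are normalized distances in [0, 1].
        if d <= b1:
            return 1.0 if b1 == 0.0 else 1.0 - d / b1
        if d <= b2:
            return r1 * (d - b1) / (b2 - b1)
        if d <= b3:
            return r1 * (b3 - d) / (b3 - b2)
        return r2 * (d - b3) / (1.0 - b3)

The fitness of a tree is then w_trap_fitness applied to its normalized structural distance to the origin tree T_o.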

The term multimodal has been used here to indicate the presence of multiple global optima (while this term is often used in the literature to indicate the presence of multiple optima, either local or global). In this work, a particular tree belonging to the search space is chosen as origin (it will be called T_o). The maximum fitness (i.e. a fitness value equal to 1) is arbitrarily assigned to T_o, so as to make it a global optimum. Now, if the R1 constant is set to 1, all the trees having a normalized distance equal to B2 to the origin have a fitness equal to 1 and thus are global optima too. In the same way, if R2 is set to 1, all the trees at distance 1 from the origin are global optima. Let S_o be the set of all these global optima different from T_o. One tree (that will be indicated as T_1) belonging to S_o is arbitrarily chosen, and all the other trees belonging to S_o, but different from T_1, receive a random normalized fitness different from 1. To calculate the fdc, the minimum of the distances from each sampled tree to T_o and T_1 is considered (as suggested for GAs by Jones in [22]).

Three series of different experiments have been performed, using both SMGP and standard GP and the coding language introduced in Section 3.1: in the first two series (series 1 and series 2), the distance between T_o and T_1 has been chosen inside the range (0, 1) (precisely, it is equal to 0.25 in series 1 and equal to 0.75 in series 2); in the third series (series 3), the distance between T_o and T_1 has been chosen to be equal to 1. Results of these experiments are discussed below.

Series 1. Trees T_o and T_1 have been chosen randomly among all the couples of trees in the search space that respect the property that the distance between them is equal to 0.25. Table 1 shows a subset of the experimental results of this first series of tests, on various W trap functions, obtained by changing the values of B1, B3 and R2, while R1 has been kept constantly equal to 1 and B2 has been set to the value of the normalized distance between T_o and T_1. Results shown have been chosen in the following way: values for which fdc gives a result included in the range [−0.15, 0.15] have been discarded, since no correlation between fitness and distance exists. Among all the other cases, a subset of triples (B1, B3, R2) has been randomly chosen. They are shown in the first column of Table 1. The value of the fdc for the corresponding multimodal trap function is shown in the second column of Table 1. The third column of Table 1 classifies each multimodal trap function according to Jones's terminology, i.e. problems for which fdc ≤ −0.15 are labelled as straightforward and problems for which fdc ≥ 0.15 are labelled as misleading. For each multimodal trap function, corresponding to different triple values of B1, B3 and R2, 100 independent GP runs have been executed and performance has been calculated, both for SMGP (reported in the fourth column of Table 1) and standard GP (reported in the fifth column of Table 1).

[Table 1. Results of fdc using SMGP and standard GP for the first series of experiments with W trap functions; p stands for performance. For each selected triple (B1, B3, R2) the table reports the fdc value, the fdc prediction and the performance of SMGP and of standard GP; all the selected triples with B1 = 0 (B3 ranging from 0.3 to 1) are predicted to be misleading, while all the triples with B1 between 0.1 and 0.25 are predicted to be straightforward.]

As this table clearly shows, for each one of the considered multimodal trap functions, when the fdc value is lower than −0.15, i.e. the problem is classified as an easy one, the corresponding performance value is remarkably higher than 0.5 both for standard GP and SMGP, i.e. the global optimum has been found in the large majority of the runs that have been executed. This clearly corresponds
to our intuition of what an easy problem is. Analogously, for each multimodal trap function with an fdc value larger than 0.15, i.e. for each problem that is classified as a hard one by the fdc, performance is remarkably lower than 0.5, and often equal or approximately equal to zero, i.e. the problem is indeed hard
(the global optimum has been found very rarely over the runs that have been performed). Results shown in Table 1 are encouraging and suggest that fdc could be a reasonable measure to predict problem difficulty for some typical W trap functions. Finally, we remark that this is true both for SMGP (which uses genetic operators bound to the distance metric used to compute the fdc) and for standard GP (which uses standard subtree crossover). This probably means that a relationship also exists between structural distance and subtree crossover, even though we are not able to formalize it.

Series 2. In this case, T_o and T_1 have been chosen randomly with the constraint that the distance between them has to be equal to 0.75. Table 2 shows a subset of the experimental results of this second series of tests, on various W trap functions, obtained by changing the values of B1, B3 and R2. This table has to be interpreted exactly as Table 1. Once again, fdc seems to be a fairly good indicator of problem difficulty for each one of the functions that we have tested, both for standard GP and for SMGP.

[Table 2. Results of fdc using SMGP and standard GP for the second series of experiments with W trap functions; p stands for performance. For each selected triple (B1, B3, R2) the table reports the fdc value, the fdc prediction and the performance of SMGP and of standard GP; all the selected triples with B1 = 0 are predicted to be misleading, while all the triples with B1 ≥ 0.05 are predicted to be straightforward.]

Series 3. This last series of experiments consists in choosing as global optima two trees with a normalized distance equal to 1 between them. Table 3 shows a subset of the results obtained with this set of experiments. These results are encouraging too, thus confirming the suitability of fdc as an indicator of problem hardness for multimodal trap functions, both for standard GP and for SMGP.

[Table 3. Results of fdc using SMGP and standard GP for the third series of experiments with W trap functions; p stands for performance. For each selected tuple (B1, B2, B3, R1) the table reports the fdc value, the fdc prediction and the performance of SMGP and of standard GP; the three tuples with B1 = 0 are predicted to be misleading (performance 0 for both algorithms), while the tuples with B1 ≥ 0.1 are predicted to be straightforward (performance 1 for both algorithms).]

Royal Trees. This set of functions uses the same language to code individuals as the one presented in Section 3.1. It was first introduced in [37] and it is based on the concept of perfect tree. For instance, if only the function symbols A and B are allowed (i.e. the maximum allowed arity for the nodes is equal to 2), the perfect tree has B as root (node at level 0), A as sons of the root (nodes at level 1) and leaves (X nodes) as nodes of level 2. This tree is shown at the left side of Figure 2. If the set of function nodes is composed of A, B and C (i.e. the maximum arity allowed is 3), then the perfect tree is the one shown in the right part of Figure 2. Note that this tree has C as root and three optimum trees of root B as subtrees. By the same argument, the form of the optimum trees of any
other root can be deduced.

[Fig. 2. Optimum trees for F = {A, B} (left) and for F = {A, B, C} (right).]

The fitness of a tree (or any subtree) is defined as the score of its root. Each function calculates its score by summing the weighted scores of its direct children. If the child is a perfect tree of the appropriate level (for instance, a complete level-C tree beneath a D node), then the score of that
subtree, times a FullBonus weight, is added to the score of the root. If the child has a correct root but is not a perfect tree, then the weight is PartialBonus. If the child's root is incorrect, then the weight is Penalty. After scoring the root, if the function is itself the root of a perfect tree, the final sum is multiplied by CompleteBonus. Usual values for these constants are as follows: FullBonus = 2, PartialBonus = 1, Penalty = 1/3, CompleteBonus = 2 (see [37] for a more detailed explanation and for some examples of fitness calculations for some simple trees). According to this algorithm, the global optima for the Royal Trees of a given arity are the perfect trees having as root the node with the maximum arity.

Different experiments, considering different nodes as the node with maximum arity, have been performed. The values of the Royal Tree constants used here are, as in [37], FullBonus = 2, PartialBonus = 1, Penalty = 1/3, CompleteBonus = 2. Results are shown in Table 4.

[Table 4. Results of fdc for the Royal Trees; p stands for performance. Roots B, C and D are classified as straightforward (with performance 1 for both SMGP and standard GP for roots B and C), root E as unknown, and roots F and G as misleading, with fdc values of 0.44 and 0.73 respectively and performance 0 for both SMGP and standard GP.]

Predictions made by fdc for level-A, level-B, level-C and level-D functions are correct. For the level-E function, no correlation between fitness and distance is observed and performance is either equal (for SMGP) or near (for standard GP) to zero. Finally, level-F and level-G functions are predicted to be misleading (in accord with Punch in [37]) and they really are, since the global optimum is never found before 500 generations.
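For concreteness, the scoring rule described above can be sketched as follows. Python, the nested-tuple tree representation and the convention that a leaf X scores 1 are assumptions of this sketch, since the description above only specifies the child weights and bonuses.

    FULL_BONUS, PARTIAL_BONUS, PENALTY, COMPLETE_BONUS = 2.0, 1.0, 1.0 / 3.0, 2.0

    def arity(sym):
        # X is the only terminal; A has arity 1, B arity 2, and so on.
        return 0 if sym == 'X' else ord(sym) - ord('A') + 1

    def correct_child(sym):
        # The symbol expected directly beneath sym in a perfect tree.
        return 'X' if sym == 'A' else chr(ord(sym) - 1)

    def is_perfect(tree):
        sym, children = tree[0], tree[1:]
        if sym == 'X':
            return len(children) == 0
        return (len(children) == arity(sym)
                and all(c[0] == correct_child(sym) and is_perfect(c) for c in children))

    def royal_tree_score(tree):
        sym, children = tree[0], tree[1:]
        if sym == 'X':
            return 1.0  # assumed score of a leaf
        total = 0.0
        for child in children:
            if child[0] == correct_child(sym) and is_perfect(child):
                weight = FULL_BONUS
            elif child[0] == correct_child(sym):
                weight = PARTIAL_BONUS
            else:
                weight = PENALTY
            total += weight * royal_tree_score(child)
        if is_perfect(tree):
            total *= COMPLETE_BONUS
        return total

For example, royal_tree_score(('B', ('A', ('X',)), ('A', ('X',)))) scores the perfect level-B tree of Figure 2 (32 under these assumptions).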

The Royal Trees problem spans all the classes of difficulty described by the fdc, and fdc works for these problems, both for standard GP and for SMGP.

MAX Problem. Contrarily to trap functions and to the Royal Tree problem, the MAX problem does not use the coding language defined in Section 3.1, but sets of arithmetic functions and constants. The task of the MAX problem for GP, defined in [13] and [28], is to find the program which returns the largest value for a given terminal and function set with a depth limit d, where the root node counts as depth 0. The choice of the sets F and T strongly influences the ability of GP to generate the optimum tree. For a deeper introduction to this problem and a study of the GP behavior on it, see [28]. Three series of experiments have been performed for the MAX problem and are presented here: in the first series, F = {+} and T = {1} have been used; in the second series, F = {+} and T = {1, 2} have been used; and in the third series F = {+, *} and T = {0.3} have been used (where the particular constant value has been chosen to avoid the presence of multiple global optima). Table 5 shows the fdc and p values for these three series (each line corresponds to a series).

[Table 5. Results of fdc for the MAX problem using SMGP and standard GP; the first column shows the sets of functions and terminals used in the experiments; p stands for performance. With F = {+}, T = {1} and with F = {+}, T = {1, 2} the problem is predicted to be straightforward and performance is 1 for both SMGP and standard GP; with F = {+, *}, T = {0.3} the fdc value is about 0.08 and performance is 0 for both.]

Two problems are correctly classified as straightforward by fdc, both for SMGP and standard GP. For the third problem, which is difficult since solutions are never found over the 100 runs performed, no correlation between fitness and distance to the global optimum has been detected for the individuals of the sample used (the fdc value is approximately equal to 0). In conclusion, also for the MAX problem the fdc seems a reasonable measure to quantify difficulty.

3.4 Counterexample

Until now, we have only shown test problems for which the fdc succeeds in correctly quantifying difficulty for GP. A hand-tailored problem, built to contradict the fdc conjecture, is presented here. This problem is based on the Royal Trees and is inspired by the technique used in [39] to build a counterexample for fdc in GAs. The technique basically consists in assigning to all the trees of the search
space the same fitness as for the Royal Tree problem, except for all the trees containing only the nodes A and X. To these trees, a fitness equal to the optimal Royal Tree's fitness times their own depth is assigned. It is clear that the optimal Royal Tree is now a local optimum, while the global optimum is the tree containing only A and X symbols having the maximum possible depth. Moreover, it is clear that a very specific path {A(X), A(A(X)), A(A(A(X))), A(A(A(A(X)))), ...} has been defined. Each tree belonging to this path has a fitness greater than or equal to that of the optimal Royal Tree, while all the trees that do not belong to this path have a fitness smaller than or equal to that of the optimal Royal Tree. The value of fdc for this function is 0.88. Thus, according to the fdc conjecture, this function should be difficult to solve. Nevertheless, over 100 independent runs, the global optimum has been found 100 times before generation 500 (i.e. p = 1) both by standard GP and by SMGP. These results obviously contradict the fdc conjecture. This clearly happens because individuals belonging to the path are easy to build by means of genetic operators and, once at least one of them has been generated, it is easy to obtain the global optimum by composition or by simple mutations. In conclusion, the fdc is a reasonable hardness measure for many test functions, but it is not infallible: functions can be built for which it fails to correctly measure the difficulty.

4 Negative Slope Coefficient

As shown in the previous section, fdc is a rather reliable indicator of problem hardness. However, it has some flaws: the existence of counterexamples casts a shadow on its usefulness, although such cases are contrived ones and they do not seem to appear often among natural problems. But the most severe drawback of fdc, and its main weakness, is that the optimal solution (or solutions) must be known beforehand, which is obviously unrealistic in applied search and optimization problems, and prevents one from applying fdc to more usual GP benchmarks and real-life applications. Thus, although the study of fdc is useful to understand EA dynamics, it is also important to try other approaches based on quantities that can be measured without any explicit knowledge of the genotype of optimal solutions.

The measure presented in this section, the negative slope coefficient (nsc), is based on the concepts of evolvability and fitness clouds. Evolvability is a feature that is intuitively related, although not exactly identical, to problem difficulty. It has been defined as the ability of genetic operators to improve fitness quality [1]. The most natural way to study evolvability is, probably, to plot the fitness values of individuals against the fitness values of their neighbours, where a neighbour is obtained by applying one step of a genetic operator to the individual. Such a plot has been presented in [51, 6, 50, 3] and it is called a fitness cloud. Since high-fitness points tend to be much more important than low-fitness ones in determining the behaviour of EAs, an alternative algorithm to generate fitness clouds was proposed in [45]. The main steps of this algorithm can be informally summarised as follows:
Generate a set of individuals Γ = {γ_1, ..., γ_n} by sampling the search space and let f_i = f(γ_i), where f(·) is the fitness function. For each γ_j ∈ Γ generate k neighbours v_j^1, ..., v_j^k by applying a genetic operator to γ_j, and let f'_j = max_i f(v_j^i). Finally, take C = {(f_1, f'_1), ..., (f_n, f'_n)} as the fitness cloud. This is the interpretation of fitness cloud used in this paper. Note how this algorithm essentially corresponds to the sampling produced by a set of n stochastic hill-climbers at their first iteration after initialisation.

The fitness cloud can be of help in determining some characteristics of the fitness landscape related to evolvability and problem difficulty. But the mere observation of the scatterplot is not sufficient to quantify these features. The nsc has been defined to capture with a single number some interesting characteristics of fitness clouds. It can be calculated as follows: let us partition C into a certain number of separate ordered bins C_1, ..., C_m such that (f_a, f'_a) ∈ C_j and (f_b, f'_b) ∈ C_k with j < k implies f_a < f_b. Consider the average fitnesses f̄_i = (1/|C_i|) Σ_{(f, f') ∈ C_i} f and f̄'_i = (1/|C_i|) Σ_{(f, f') ∈ C_i} f'. The points (f̄_i, f̄'_i) can be seen as the vertices of a polyline, which effectively represents the skeleton of the fitness cloud. For each of the segments of this polyline we can define a slope S_i = (f̄'_{i+1} − f̄'_i) / (f̄_{i+1} − f̄_i). Finally, the negative slope coefficient is defined as:

    nsc = Σ_{i=1}^{m−1} min(0, S_i).    (2)

The hypothesis that is proposed in this paper is that nsc should classify problems in the following way: if nsc = 0, the problem is easy; if nsc < 0 the problem is difficult, and the value of nsc quantifies this difficulty: the smaller its value, the more difficult the problem. The justification for this hypothesis is that the presence of a segment with negative slope would indicate bad evolvability for individuals having fitness values contained in that segment, as neighbours are, on average, worse than their parents in that segment.

The definition of nsc is very general and has many degrees of freedom. In particular, a question must be answered to be able to calculate the nsc: how should we partition the abscissas of a fitness cloud into bins? A partitioning technique called size driven bisection, inspired by the well-known bisection algorithm, was proposed and justified in [49]. This technique is used in this paper, given its strong theoretical foundation based on statistics.
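A minimal sketch of this procedure may help. Python, the function names and the use of simple equal-size bins (in place of the size driven bisection actually used in the paper) are assumptions of the sketch; fitness and mutate are user-supplied fitness and one-step mutation functions.

    def fitness_cloud(sample, fitness, mutate, k):
        # One point (f, f') per sampled individual, where f' is the best fitness
        # among k mutated neighbours (the hill-climber-like sampling described above).
        cloud = []
        for g in sample:
            f = fitness(g)
            f_prime = max(fitness(mutate(g)) for _ in range(k))
            cloud.append((f, f_prime))
        return cloud

    def nsc(cloud, m=10):
        # Partition the cloud into m ordered, equal-size bins by parent fitness,
        # average each bin, and sum the negative slopes of the resulting polyline
        # as in equation (2).
        pts = sorted(cloud)
        bins = [pts[i * len(pts) // m:(i + 1) * len(pts) // m] for i in range(m)]
        bins = [b for b in bins if b]
        centers = [(sum(f for f, _ in b) / len(b), sum(fp for _, fp in b) / len(b))
                   for b in bins]
        total = 0.0
        for (f1, fp1), (f2, fp2) in zip(centers, centers[1:]):
            if f2 != f1:  # skip degenerate segments to avoid a zero division
                total += min(0.0, (fp2 - fp1) / (f2 - f1))
        return total

Equal-size bins keep the sketch short; the size driven bisection of [49] chooses the bins adaptively and is the variant whose results are reported below.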

4.1 Experimental Results on nsc

In the last few years, the nsc has been tested as a measure of problem hardness for GP over a large set of problems. These problems include hand-tailored functions such as the ones used to test the fdc (trap functions, Royal Trees, MAX functions); but, given that the nsc can be calculated without prior knowledge of the global optima, it can also be applied to more realistic, real-life-like GP benchmarks, like various forms of symbolic regression, the even parity problem, the artificial ant on the Santa Fe trail, the multiplexer problem and the intertwined spirals problem (all these problems are described and discussed in detail in [25]). Tests of the ability of the nsc to predict difficulty have been performed for all these problems and have given remarkably positive results. In this paper, results on two of these problems are presented: the binomial-3 problem (an instance of the symbolic regression problem, which is particularly interesting for this work because its difficulty can be tuned by simply changing the value of a parameter) and the even parity problem. Both these problems are briefly introduced below in the respective paragraphs, before showing and discussing the experimental results.

The binomial-3 problem. This benchmark (first introduced by Daida et al. in [7]) is an instance of the well-known symbolic regression problem [25]. The function to be approximated is f(x) = 1 + 3x + 3x² + x³. Fitness cases are 50 equidistant points over the range [−1, 0). Fitness is the sum of absolute errors over all fitness cases. A hit is defined as being within 0.01 in ordinate for each one of the 50 fitness cases. The function set is F = {+, −, *, //}, where // is the protected division, i.e. it returns 1 if the denominator is 0. The terminal set is T = {x, R}, where x is the symbolic variable and R is the set of ephemeral random constants (ERCs). ERCs are uniformly distributed over a specified interval of the form [−a_R, a_R]; they are generated once at population initialization and they are not changed in value during the course of a GP run. According to Daida and coworkers, difficulty tuning is achieved by varying the value of a_R.

Figure 3 shows the scatterplots and the set of segments {S_1, S_2, ..., S_m} as defined in the previous section (with m = 10) for the binomial-3 problem with a_R = 1 (Figure 3(a)), a_R = 10 (Figure 3(b)), a_R = 100 (Figure 3(c)) and a_R = 1000 (Figure 3(d)). Parameters used are as follows: maximum tree depth = 26 and a sample of randomly chosen individuals. Table 6 shows some data about these experiments: column one indicates the corresponding scatterplot in Figure 3, column two contains the a_R value, column three contains performance and column four contains the value of the nsc.

[Table 6. Binomial-3 problem. Some data related to the scatterplots of Figure 3: for each value of a_R (1, 10, 100 and 1000) the table reports the performance p and the nsc value.]

These results show that nsc values get smaller as the problem becomes harder, and the nsc is zero when the problem is easy (a_R = 1). Moreover, the points in the scatterplots seem to cluster around good (i.e. small) fitness values as the problem gets easier.
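As an illustration of this setup, a minimal sketch of the binomial-3 fitness computation could look as follows. Python, the names and the exact placement of the 50 equidistant points within [−1, 0) are assumptions of the sketch.

    def binomial3_target(x):
        # Target polynomial of the binomial-3 problem: (1 + x)^3.
        return 1.0 + 3.0 * x + 3.0 * x ** 2 + x ** 3

    # 50 equidistant fitness cases over [-1, 0)
    CASES = [-1.0 + i * (1.0 / 50.0) for i in range(50)]

    def binomial3_fitness(program):
        # program: a callable candidate; fitness is the sum of absolute errors (lower is better).
        return sum(abs(program(x) - binomial3_target(x)) for x in CASES)

    def hits(program, tol=0.01):
        # Number of fitness cases on which the candidate is within 0.01 in ordinate.
        return sum(1 for x in CASES if abs(program(x) - binomial3_target(x)) <= tol)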

[Fig. 3. Binomial-3 results. (a): a_R = 1. (b): a_R = 10. (c): a_R = 100. (d): a_R = 1000.]

The even parity k problem. The boolean even parity k function [25] of k boolean arguments returns true if an even number of its boolean arguments evaluates to true, and otherwise it returns false. The number of fitness cases to be checked is 2^k. Fitness is computed as 2^k minus the number of hits over the 2^k cases. Thus a perfect individual has fitness 0, while the worst individual has fitness 2^k. The set of functions we employed is F = {NAND, NOR}. The terminal set is composed of k different boolean variables. Difficulty tuning is achieved by varying the value of k. Figure 4 shows the scatterplots and the set of segments for the even parity 3, even parity 5, even parity 7 and even parity 9 problems. Parameters used are as follows: maximum tree depth = 10 and a sample of randomly chosen individuals.
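A minimal sketch of this fitness computation (again in Python; the calling convention of the candidate program is an assumption) is:

    from itertools import product

    def even_parity_fitness(program, k):
        # program: a callable taking k booleans and returning a boolean.
        # The target returns True iff an even number of inputs is True;
        # fitness is 2^k minus the number of matching cases (0 is a perfect score).
        hits = 0
        for bits in product([False, True], repeat=k):
            target = (sum(bits) % 2 == 0)
            if program(*bits) == target:
                hits += 1
        return 2 ** k - hits

For instance, even_parity_fitness(lambda a, b, c: not (a ^ b ^ c), 3) returns 0, since the negated XOR of three inputs is exactly the even parity function.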

[Fig. 4. Results for the even parity k problem. (a): even parity 3. (b): even parity 5. (c): even parity 7. (d): even parity 9.]

Table 7 shows some data about these experiments, with the same notation and meaning as in Table 6, except that column two now refers to the problem rank.

[Table 7. Even parity. Indicators related to the scatterplots of Figure 4: for each of the even parity 3, 5, 7 and 9 problems the table reports the performance p and the nsc value.]

Analogously to what happens for the binomial-3 problem, nsc values get smaller as the problem becomes harder; they are always negative for hard problems, and zero for easy ones.

4.2 Summing Up

The goal of the study of GP problem hardness is to provide the final user with a tool, or a set of tools, to measure the ability of GP systems to find good solutions to a given problem. Before being able to define and develop these tools, it is necessary to identify the features that make a problem easy or hard for GP to solve. The attempt to answer this question, in order to develop a theory of problem hardness, has led to the hypothesis that the relationship between fitness and distance to the goal is one of the main features that make a problem easy or hard, and to the definition of the fdc. In Section 3, it has been shown that the fdc is an adequate descriptor of GP landscape statistics and can be used to measure the difficulty of many test functions, but also that it has some known drawbacks: functions can be built for which fdc fails to correctly measure the difficulty and, even more importantly, fdc is not predictive: global optima must be known beforehand to be able to calculate it. Even though sometimes some genotypic characteristics of the global optima can be known even in real-life applications, and this can allow one to approximate fdc values, this is not the case in general, and this prevents fdc from being used in a wide set of practical cases. For this reason, in the present section, the negative slope coefficient (nsc) has been presented. It is based on the concept of fitness cloud and it is a predictive measure, in the sense that no prior knowledge of the global optima's genotypes is required to calculate it. For this reason, nsc can be applied to any GP problem (and also to other EAs different from GP). The suitability of nsc as an indicator of GP problem hardness has been experimentally shown on a set of rather diverse standard GP benchmarks and synthetic hand-tailored test functions. The main drawback of the nsc is that it uses only mutation as a variation operator to generate neighborhoods. A measure to model and capture the main properties of subtree crossover, the most common and most frequently used GP genetic operator, is presented below.

5 Subtree Crossover Distance

In the previous section, only mutation has been considered as a variation operator to obtain neighbors and generate fitness clouds. Indeed, since mutation is a unary operator, it is much easier to use it, rather than crossover, in theoretical models and measures. Nevertheless, crossover is the most common and the most frequently used genetic operator in EAs. In particular, in GP, Koza's subtree crossover is often considered as one of the main search engines [25, 2]. Thus, it would be a relevant result to define a distance, or a similarity/dissimilarity measure, which could be bound to standard subtree crossover in a similar sense as structural distance is related to structural mutation (see Section 3.2).

Following the same notation as in [18], let P be a population of trees, T_1 be the tree we want to compute a distance from (or the parent tree) and T_2 be the tree into which we would like to transform T_1. The subtree crossover
distance² (SCD) between T_1 and T_2 depends not only on T_1 and T_2 themselves, but also on the population P. Let diff(T_1, T_2) be an operator that returns the set S = {(s^1_{T_1}, s^1_{T_2}), (s^2_{T_1}, s^2_{T_2}), ..., (s^n_{T_1}, s^n_{T_2})} such that, for all i ∈ [1, n], if we replace s^i_{T_1} with s^i_{T_2} in T_1 we obtain T_2; diff(T_1, T_2) returns the empty set if T_1 and T_2 share no genetic material. Now, the new SCD can be defined by the following algorithm:

    func SCD(T_1, T_2, P) {
        S = diff(T_1, T_2)
        res = 0
        for i = 1 to cardinality(S) do
            ps1 = probSelecting(s^i_{T_1}, T_1)
            ps2 = probCreating(s^i_{T_2}, P)
            res = res + (ps1 * ps2)
        endfor
        return (1 - res)
    }

Given the subtrees s^i_{T_2} that need to replace the s^i_{T_1} in T_1, the distance is defined in terms of the probability of selecting the subtrees s^i_{T_1} in T_1 and the probability of creating (or selecting) the subtrees s^i_{T_2} from P. Both functions, probSelecting(·) and probCreating(·), require knowledge of the selection probabilities used in the algorithm, but can be calculated in linear computational time, as proven in [19].

² The subtree crossover distance that we consider in this paper is a probability and thus it is clearly not a metric. Furthermore, it is not just a function of two trees, but also of the population they belong to, and in general it does not respect the properties of metrics (like, for instance, the triangle inequality). Thus, the term pseudo-distance (in the sense that it indicates how far apart the two items are) would be more appropriate than the term distance. In some sense, we could say that our measure is more like a similarity/dissimilarity measure than a proper (Euclidean) distance metric: it conveys information about how likely it is to make two trees equal, which largely depends on their similarity. Nevertheless, we use the term distance for the sake of brevity.

5.1 Experimental Results

The goal of this section is to show the suitability of the new definition of SCD for monitoring various properties of the GP search process. In particular, it is shown below how this distance can be used to calculate the fitness distance correlation (fdc) inside the population during the search process, how fitness sharing can be implemented using the SCD, and how the SCD can be used to measure the genotypic diversity of populations.

Fitness Distance Correlation. Given that no bound has ever been proven between subtree crossover and structural distance, large samples of individuals (and not just the individuals composing a normal population) have been
used in Section 3 to calculate the fdc. On the other hand, the study of the trend of the fdc in the population during the evolution would be very interesting, since this study would be more dynamic than studying the fdc once and for all on a single large sample of individuals. In fact, this investigation would allow us to study how the fdc gets modified during the evolution, and this information could allow us to draw some conclusions on the dynamics of the GP search process. In particular, if the fdc value decreases during the evolution and tends towards −1, the population should be converging towards the global optimum (individuals are approaching the global optimum as fitness is improving). On the other hand, if the fdc value increases during the evolution, or it remains static at some initial positive level or at zero, this probably means that the population is converging towards a local optimum (fitness is improving, but the distance to the global optimum is not decreasing). The following experiments have been done to confirm this hypothesis and to test the suitability of SCD to calculate the fdc.

Syntactic Trees. In the syntactic trees problem, as used in [18], trees are represented using the set of functions F = {N}, where N is a binary operator (N stands for "Non-terminal"), and the set of terminal symbols T = {L} (L stands for "Leaf"). No content is associated with the nodes, and fitness is simply equal to the structural distance to a fixed global optimum. The global optimum of an instance is generated using a random tree growing algorithm described in [18]. Figure 5(a) shows the tree chosen as optimum for the experiments in Figure 6, and Figure 5(b) shows the tree chosen as optimum for the experiments in Figure 7.

[Fig. 5. (a) The tree used as optimum for the experiments in Figure 6. (b) The tree used as optimum for the experiments in Figure 7.]

These experiments have been performed using the following set of parameters: generational GP, population size of 30 individuals, standard subtree crossover as the only genetic operator, tournament selection of size 5, ramped half-and-half initialisation, maximum depth of individuals for the initialisation phase equal to 4, maximum depth of individuals for crossover equal to 8. All the runs have
been stopped at generation 100.

[Fig. 6. Syntactic Trees Problem. Average values (a) and average values with their standard deviations (b) of average fitness, best fitness and fdc in the population against generations over 50 independent GP runs. In all these runs the optimum has been found before generation 100. The tree used as optimum in these experiments was the tree in Figure 5(a).]

Figure 6 reports the average values (with their standard deviations in Figure 6(b)) of the best fitness, the average fitness and the fdc (calculated using SCD) in the population (against generations) over 50 independent GP runs in which the global optimum has been found before generation 100 (successful runs). Producing 50 successful runs has been easy, probably because of the very simple shape of the tree that we have used as optimum (shown in Figure 5(a)). The method that has been used to collect 50 successful runs was simply to execute a sequence of GP runs until 50 successful ones were found. It has been sufficient to execute 52 runs to get 50 successful ones. Figure 7 reports the same information, but this time for 50 unsuccessful runs. Collecting 50 unsuccessful runs has also been easy, probably because of the particular shape of the tree that has been used as optimum, shown in Figure 5(b) (over 61 runs, 50 were unsuccessful).

These figures show that, in case of success, the fdc decreases until the global optimum is found and then remains negative until the end of the run. In case of unsuccessful runs, the fdc value always stays around zero, independently of the fact that fitness is slightly improving. The interpretation is that, in this last case, the evolutionary process is leading the population towards a local optimum, which probably has a rather large crossover distance from the global one. For successful runs, the fact that fdc is negative indicates that evolution is leading the population towards the global optimum.

Trap Functions. Unimodal trap functions are used here. They are defined like multimodal trap functions, but they have only one global optimum. They can be considered a particular case of multimodal W trap functions where B2 = B3
Fig. 7. Syntactic Trees Problem. Average values (a) and average values with their standard deviations (b) of average fitness, best fitness and fdc in the population against generations over 50 independent GP runs. In all these runs the optimum has not been found before generation 100. The tree used as optimum in these experiments was the tree in Figure 5(b).

Trap Functions. Unimodal trap functions are used here. They are defined like the multimodal trap functions, but they have only one global optimum; they can be considered a particular case of the multimodal W trap functions in which B2 = B3 and R1 = R2. Figure 8(a) shows the tree chosen as optimum for the experiments in Figure 9, and Figure 8(b) shows the tree chosen as optimum for the experiments in Figure 10. The parameters used in these experiments are as follows: generational GP, population size of 100 individuals, standard subtree crossover as the sole genetic operator, tournament selection of size 10, ramped half-and-half initialisation, maximum depth of individuals for the initialisation phase equal to 6, and maximum depth of individuals for crossover equal to 10. Larger trees than in the case of the syntactic trees problem discussed in the previous section have been used here (arity-3 nodes and deeper trees have been considered), because the hypotheses presented in this section have to be tested under different conditions.

Figure 9 reports the average values (with their standard deviations in Figure 9(b)) of the best fitness, the average fitness and the fdc (calculated using the SCD) in the population, against generations, over 50 independent successful GP runs. The method used to collect 50 successful runs was the same as the one discussed above. In these experiments, the B2, B3, R1 and R2 trap function parameters have been set as follows: B2 = B3 = 0.9 and R1 = R2 = 0.1. With this setting, the fitness landscape is easy to search for GP [44], and thus it is easy to obtain successful runs. Figure 10 reports the same information as Figure 9, but for 50 independent unsuccessful GP runs. In this case, B2 and B3 were both set to 0.1 and R1 and R2 to 0.9, in order to make the fitness landscape difficult to search for GP [44]. The method used to collect 50 unsuccessful runs was the same as the one discussed in the previous section.
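For reference, the following is a sketch of the usual unimodal trap function over a distance d normalised to [0, 1]. The parameterisation actually used in the paper is obtained from the multimodal W traps by setting B2 = B3 and R1 = R2 and is defined in an earlier section, so this code is only illustrative of the easy (b = 0.9, r = 0.1) and hard (b = 0.1, r = 0.9) settings mentioned above; the names `unimodal_trap`, `b` and `r` are assumptions of this sketch.

```python
def unimodal_trap(d, b, r):
    """Textbook unimodal trap over a normalised distance d in [0, 1].

    Fitness is 1 at the global optimum (d = 0), falls to 0 at d = b, and
    then rises again to r at d = 1 (the deceptive attractor). This is an
    illustrative sketch; the paper's own definition (via the multimodal W
    traps) should be taken as authoritative.
    """
    if d <= b:
        return 1.0 - d / b              # basin of the global optimum
    return r * (d - b) / (1.0 - b)      # deceptive slope towards d = 1
```

With b = 0.9 and r = 0.1 most of the landscape slopes towards the global optimum and the deceptive peak is low, which matches the easy setting; with b = 0.1 and r = 0.9 the situation is reversed, which matches the hard one.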

Fig. 8. (a) The tree used as optimum for the experiments in Figure 9. (b) The tree used as optimum for the experiments in Figure 10.

Fig. 9. Trap Functions. Average values (a) and average values with their standard deviations (b) of average fitness, best fitness and fdc in the population against generations over 50 independent GP runs. In all these runs the optimum has been found before generation 100. The tree used as optimum in these experiments was the tree in Figure 8(a).

Fig. 10. Trap Functions. Average values (a) and average values with their standard deviations (b) of average fitness, best fitness and fdc in the population against generations over 50 independent GP runs. In all these runs the optimum has not been found before generation 100. The tree used as optimum in these experiments was the tree in Figure 8(b).

In Figure 10(b), the scale on the ordinate axis has been restricted in order to enlarge the graph and make it clearer and more readable. These figures show that for successful runs the fdc decreases until the global optimum is found and then remains negative until the end of the run, while in case of failure the fdc is always positive. Here the phenomenon is even more marked than in the case of the syntactic trees. In fact, for successful runs the fdc rapidly stabilises at approximately -0.6, while for unsuccessful runs it always remains approximately equal to 0.8. Once again, the conclusion is that the value of the fdc in the population (calculated using the SCD) is a good indicator of the direction in which the search process is leading the population: negative values of the fdc mean that the search is moving towards the global optimum, while positive values mean that the search is moving towards local ones.

Fitness Sharing. In the previous section, it has been shown that the SCD can appropriately be used to dynamically calculate the population fdc during the evolution. However, as discussed previously, the fdc is not a predictive measure (i.e. the global optima must be known in order to calculate it), which makes it almost unusable in practice. Besides being a measure of diversity, can the SCD be useful to practitioners? In this section, we discuss fitness sharing,
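As an illustration of the direction this section takes, the following is a minimal sketch of the classical fitness sharing scheme, assuming the standard triangular sharing function and a maximisation setting, with an SCD-like `distance` callable, a niche radius `sigma` and an exponent `alpha`; all of these names and choices are assumptions of the sketch, not the paper's exact fitness sharing system.

```python
def shared_fitnesses(population, fitnesses, distance, sigma, alpha=1.0):
    """Classical fitness sharing: each raw fitness is divided by the
    individual's niche count.

    Assumes fitness is maximised and that `distance` behaves like the SCD
    pseudo-distance; `sigma` (niche radius) and `alpha` are parameters the
    practitioner must choose. Illustrative sketch only.
    """
    shared = []
    for i, ind_i in enumerate(population):
        niche_count = 0.0
        for ind_j in population:
            d = distance(ind_i, ind_j)
            if d < sigma:
                niche_count += 1.0 - (d / sigma) ** alpha
        # niche_count >= 1 because distance(ind_i, ind_i) == 0
        shared.append(fitnesses[i] / niche_count)
    return shared
```

For the minimisation problems used in the experiments above, the raw fitness would first have to be transformed (e.g. inverted) before being shared; the point of the sketch is only that an operator-based measure such as the SCD can directly play the role of `distance`.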
