Epigenetics in Evolutionary Algorithms and Computer Generated Artwork COMP8755


The Australian National University

William Maroney (u )
Supervised by: Tom Gedeon and Bob McKay
May 26, 2017

Abstract

This report presents extensions to genetic algorithms that incorporate possible models of epigenetics, and demonstrates how they can be applied to an interactive evolutionary computing problem involving computer generated artwork. This computer generated artwork problem provides a sufficiently complex, nontrivial setting to investigate whether these extensions help, hinder or do not influence the performance of the genetic algorithm. The genetic algorithm is extended with two possible models incorporating epigenetics. The abstract models used assume that two identical phenotypes could come from different underlying genomes, but that there would be a differential cost to the individual to achieve the phenotype, which would provide different evolutionary selective pressure in even seemingly identical individuals. Initial experimental results are inconclusive as to whether these extensions actually improve the convergence of genetic algorithms; however, they are shown to affect the selective pressures of genetic algorithms. The suggestive results identify future work options to take these ideas forward.

Acknowledgements

Thank you to my supervisors Tom Gedeon and Bob McKay for their continual support, encouragement and direction throughout this project. Without their tireless efforts in working with me to understand issues as they arose, and in identifying resolutions, this project would not have progressed at all. In particular their willingness to review my work, a number of times while in a questionable state of completion, was invaluable. To everyone who generously gave their time to participate in testing, I thank you.

Contents

1. Introduction
2. Background
    Evolutionary Computing Background
    Computer Generated Artwork
    Prior and Related Works
    Motivation
    Major Contribution of this Work
    Roadmap for this Report
3. Algorithms
    Overview
    Genotypes, Phenotypes and Epigenetics
    Notation and Definitions
    Hyper-parameters
    The Genetic Algorithm
    Proposed Epigenetic Models
4. Software Architecture
    Software Operation
    Software Synopsis
5. Experiments
    Test Hypotheses
    Key Performance Measures
    Experimental Set-up
    Results and Analysis
6. Conclusion and Future Work
    Suggested Future Work
References
Appendix A. Independent Study Contract
Appendix B. Original Project Outline
Appendix C. Software Description
Appendix D. Software Usage
Appendix E. Darwindrian Representation (Jian Yin Shen)
Appendix F. Darwindrian Representation (Mathew Smith)
Appendix G. Experimental Data

1. Introduction

Evolutionary biology, as it is currently understood, models changes in heritable traits in a population of individual biological entities across generations. Whether a trait possessed by an individual in one generation is passed on to subsequent generations is subject to natural selection, or survival of the fittest. Simulating this core concept forms the basis of genetic algorithms.

Genetic algorithms can solve various computational problems by simulating biological evolution. This simulation guides the search of a set of possible solutions in a way that mimics natural selection. With a suitable problem and a mechanism to determine the fitness of any possible solution, genetic algorithms can outperform naive brute force or random search strategies. By performing these simulations on a computer, genetic algorithms may very quickly explore possible solutions, leveraging the ever-increasing computational power of modern computers.

However, there is a subset of problems where assessing the fitness of individuals is not well understood or well defined, meaning complete automation of a genetic algorithm solution is not possible. In such problems, a human must provide insight into the critical fitness evaluation step before the genetic algorithm can proceed. This is known as interactive evolutionary computing.

Examples of such problems involve the computer generation of music or artwork that is aesthetically pleasing. Determining whether any individual, or many people, will enjoy a certain piece of music or artwork is not something that has been reduced to a deterministic function. The surest way to find out is to ask people their opinion. Requiring humans to evaluate many individual works, however, exposes the inconsistency and mortal limits of the human. A human may make a mistake, may not know and guess, may change their mind, or may get tired. The human factor will introduce noise and fragility into any interactive evolutionary computing solution.

In the context of genetic algorithms, various bodies of research have attempted to optimise how quickly fitness evaluations may be produced; however, the human element will always be a limiting factor in interactive evolutionary computing. Optimising other aspects of genetic algorithms is unlikely to overcome this limitation.

This work takes a different approach. That is, accepting the human factor as an unavoidable and limiting constraint, how can one extract the most value from each piece of human provided information? The overarching goal of this work is to maximise the value extracted from each and every human engagement in the interactive evolutionary computing model. That is:

- to extract as much information as possible from individual fitness evaluations
- to converge to satisfactory solutions in as few iterations as possible
- to reduce or smooth the noise inherent in human provided fitness evaluations

This report investigates a proposition by Gedeon [13] for incorporating a model of epigenetics into genetic algorithms.

2. Background

2.1. Evolutionary Computing Background

Genetic algorithms, often attributed in their contemporary form to Bremermann [1], Fraser [2] and Holland [3], are a set of techniques that have practical applications to optimisation and search problems where:

- there may be no single best answer
- we may be satisfied with a good enough answer
- the search space is not well-defined, precluding gradient-descent methods
- the search space is prohibitively large or complex
- the quality of given candidate solutions (fitness) can be determined relatively easily

The genetic algorithm is a specific example of an evolutionary algorithm, or evolutionary computing; that is, algorithms inspired by biological evolutionary processes. In the case of genetic algorithms, the theory of natural selection is simulated to guide the exploration of the search space, with full details presented in section 3.5.

Evolutionary computation includes the special case of interactive evolutionary computation (IEC). IEC is defined by the additional constraint that determining the quality of candidate solutions (i.e. computing their fitness) requires human intervention.

IEC can be used where an individual's aesthetic preferences are involved in determining the fitness of a solution, for example in the creation of computer generated artwork or music that is pleasing to a person or a group of people. With current knowledge, it is generally not possible to describe human aesthetic responses programmatically or deterministically. So while IEC provides a framework to intelligently produce many candidate solutions quickly, human input is generally required to assess those solutions.

Importantly, there are some optimisation/search problems where the requirement for human input is unavoidable; this inevitably introduces a significant bottleneck. The utilisation of computing power can only accelerate evolutionary algorithms so much. They all generally involve an iterative step where fitness values must be computed (or in the case of IEC, received from a human). This periodic and sequential need for human input limits the number of candidates that can be evaluated within an IEC algorithm. This is further exacerbated by human fatigue; people can only perform repetitive tasks for so long before they tire or become distracted.

These factors motivate the search for better performing search/optimisation algorithms, that is, algorithms that will find good enough solutions as quickly as possible.

2.2. Computer Generated Artwork

The Dutch painter Piet Mondrian ( ) was one of the key contributors to an artistic movement known as neoplasticism [4]. This category of artwork was abstract in nature, consisting of very simple geometric objects and rules. In particular:

- the use of a white canvas background
- solid black lines drawn only in horizontal and vertical directions
- rectangular regions coloured in one of three primary colours
- all lines ending on the canvas border or intersecting another line
- no right angles (excluding a line intersecting the canvas border)
- no lines parallel to and adjacent to a border
- no lines parallel to and adjacent to each other

The following image is one of Mondrian's works and illustrates these geometric criteria.

Figure 2.1: Composition II in Red, Blue, and Yellow, 1930 [5]

These specific geometric properties are particularly amenable to computer representation and manipulation, and thus form the motivating problem considered in this work. That is, can we produce computer-generated and aesthetically pleasing Mondrian-like images based on simple user feedback on candidate artworks?

2.3. Prior and Related Works

This exact problem has been considered in earlier works by Shen [6] and Smith [7]. Shen encapsulated this as an optimisation problem and applied various evolutionary algorithms (see footnote 1) to solve it. Smith then built on this work, broadly maintaining the problem encapsulation while investigating whether the performance of Shen's methods could be improved.

Smith investigated a number of problem and implementation specific optimisations along with some algorithmic adaptations; the major contribution modelled user behaviour with artificial neural networks in an attempt to automatically make choices based on likely user actions. This allowed actual human input to be reserved for cases where only choices with higher degrees of uncertainty were available.

These prior works applied the conventional genetic algorithm, without a model for epigenetics, to the problem of generating aesthetically pleasing Mondrian-like artwork. The algorithmic details of Shen's and Smith's prior works are described in appendices E and F respectively. These algorithmic details are captured in this report to illustrate a number of problem-specific heuristics that were incorporated into those implementations of the interactive genetic algorithm. This work removes these heuristics as they introduce biases into the optimisation, thus allowing for a clearer comparison between the genetic algorithm with and without the proposed epigenetic models.

Some of these problem-specific heuristics resulted in significant deviation from the typical behaviour of genetic algorithms. For example, Shen's work introduced constraints on valid pairs of parents, making their pairing almost deterministic. Further, each gene within a chromosome was independently subject to crossover, the details of which were fully determined by the relative fitness values of the parents undergoing crossover. Finally, no mutation operator was employed. The lack of fresh genetic material resulted in heavily biased search behaviour and an inability to escape local optima.

Outside of the changes to the genetic algorithms, specific desirable traits for the final Mondrian-like artworks were identified and sought out in Shen's model. In particular, final images that included small coloured rectangles were preferred. Complementary biases were introduced to ensure that such coloured rectangles would quickly be achieved, artificially converging to pre-identified examples of good solutions.

Smith's work identified both the lack of a mutation operator and this convergence towards small coloured rectangles. However, the solutions employed reduced the strength of these biases but did not ultimately remove them.

The work discussed in this report broadly reuses the problem definition, the encoding of candidate solutions and the Mondrian-like image creation logic. While these design components are retained, they have been re-implemented in an unbiased framework of genetic algorithms to enable an objective comparison between the various epigenetic models and traditional genetic algorithms.

A more detailed treatment of genetic algorithms is presented in section 3, which sets out the conceptual set-up for genetic algorithms and defines in detail how they can be applied to the Mondrian-like problem.

Footnote 1: The interactive genetic algorithm and the interactive bacterial evolution algorithm.

2.4. Motivation

The evolutionary computing background presented in section 2.1 and the problem definition motivate the desire for a better performing IEC technique. Better performing in this context means an algorithm that converges to a good enough solution with fewer fitness evaluations, thus mitigating to some extent the human bottleneck factor.

2.5. Major Contribution of this Work

Genetic algorithms have been demonstrated to practically solve real-world optimisation problems (for example, see [8], [9], [10]). They do this by modifying their search behaviour through a possible solution space in a way inspired by the theory of natural selection. Importantly, this simulation is somewhat crude and captures just a few key concepts:

- survival of the fittest - the fitter an individual is, the greater chance it has of passing its genetic material on to subsequent generations
- random mutation - randomly, the genetic material of individuals can mutate, producing better or worse results than would otherwise be achieved

The fact that even this very high-level approximation of biological evolution results in convergence better than expected by random search is suggestive. It motivates the question: does simulating biological evolutionary processes more accurately result in even better performance?

The major contribution of this work is to investigate this question. In particular, the genetic algorithm is extended to incorporate aspects of epigenetics. Additionally, the minimisation of heuristics specific to the Mondrian-like problem in this work aims to ensure that any findings, positive or negative, can be applied more generally to optimisation problems, be they interactive or not.

2.6. Roadmap for this Report

The rest of this report is broken up into a number of major sections. Firstly, the genetic algorithm as it applies to the Mondrian-like problem is described in detail along with two specific epigenetic extensions. Secondly, the software architecture of an experimental environment that was built to test the ideas of this work is described. Thirdly, the experiments conducted are documented along with their results. Finally, this report concludes with a high-level analysis of the results of this work and suggests future work to carry this further.

3. Algorithms

This section presents the algorithmic details of the Mondrian-like problem and how to use the genetic algorithm to solve it, with and without extensions incorporating possible models of epigenetics.

3.1. Overview

The genetic algorithm is an optimisation technique that simulates evolutionary processes to converge to optima. These processes are derived from the theory of natural selection. The genetic algorithm operates on a fixed-size generation of candidate solutions, or individuals. The genetic algorithm also requires a real-valued fitness function which acts on an individual and provides a measure of how good a solution it is.

The genetic algorithm operates on the current generation of individuals by selecting subsets of individuals to act as parents that are combined to produce offspring in the next generation. The fittest individuals (i.e. those with the highest fitness value) are more likely to be selected as parents and thus contribute to subsequent generations. The specific details of this transition from one generation to another can vary by problem and implementation; this report will denote the selection of these specific details as the evolution algorithm (see footnote 2).

The evolution algorithm can utilise any or all of the following genetic operators:

- selection produces a collection of one or more unique individuals from the current generation that will combine their genetic material to produce one or more offspring - this selection is based on the fitness of individuals, those having greater fitness will have a greater chance of selection than individuals with lesser fitness
- crossover takes a collection of selected individuals and merges their genetic material in some way, resulting in one or more offspring
- mutation takes a single individual and randomly mutates its genes

Together with the evolution algorithm itself, these operators make up the four key functions that define a specific implementation of the genetic algorithm. Such an implementation can then be applied to any problem that can be phrased as an optimisation problem. To do so, we must define a fitness function and encode the set of all individuals (i.e. all candidate solutions) in such a way that it is closed under all of the genetic operators listed above.

Note: a common technique employed in genetic algorithms is elitism, or elite selection [11]. This technique identifies one (or several) of the fittest individual(s) from every generation and propagates them unchanged to the next. Where the number of elites selected is non-zero, the quality of the best solution will never decrease across generations. The work discussed in this report makes use of elitism.

Footnote 2: While this is not a commonly employed nomenclature in the area of genetic algorithms, it is a useful construct to clearly contrast such design choices made in this work with those made in [6] and [7].

3.2. Genotypes, Phenotypes and Epigenetics

The final critical component of this set-up is to distinguish genotypes from phenotypes. This is best first illustrated with a more familiar example and then tied back to the Mondrian-like problem.

A genotype encapsulates the genetic make-up of a biological entity (e.g. a human) - our DNA. However, while our DNA constitutes our genetic material (our genotype), that genotype results in, or expresses, the actual physical human being - us; our phenotype. While most genetic operators (i.e. crossover and mutation) are applied to our genotypes, it is our expressed phenotypes (actual people) that compete for survival of their genetic material across subsequent generations. It is the fitness of an individual's phenotype that influences the selection of their corresponding genotype during evolution.

This nuance is particularly important in this work as epigenetics, broadly speaking, is about a more complex interplay between genotypes and phenotypes than is traditionally described by the theory of natural selection, but is now part of modern evolutionary biology. It is this complexity that is utilised in both possible epigenetic extensions to the genetic algorithm proposed in this work.

Briefly, epigenetics is concerned with individual traits that may be inherited across generations but cannot be explained by genetic material. It is considered outside of genetics - hence the name, epigenetics. The full details of the application of epigenetics to genetic algorithms and the Mondrian-like problem are described in section 3.6. Subsequent sections will clearly identify when objects are genotypes or phenotypes, and when functions are defined to act on genotypes or phenotypes.

Returning to the case of the Mondrian-like problem, Shen proposed a suitable genotype representation in [6] that was largely adopted by [7] and will also be utilised in this work; it is described in detail in section 3.5.1. These genotypes then probabilistically express an actual Mondrian-like artwork - the phenotype.

3.3. Notation and Definitions

The following notations and definitions will be used throughout the rest of this report.

- Let $U(a, b) \in \mathbb{R}$ be a random variable, uniformly selected from the closed interval $[a, b]$
- Let $G$ be the set of all valid genotypes (abstract algebraic objects, see 3.5.1)
- Let $P$ be the set of all valid phenotypes (Mondrian-like images, see 3.5.2)
- Let $g \in G$ be a genotype
- Let $p \in P$ be a phenotype
- Let $|g|$ be the number of genes in the genotype $g$
- Let $m : G \to P$ be the probabilistic expression mapping from genotypes to phenotypes

Vectors will be denoted in bold, e.g. $\mathbf{g}$, while scalars will not, e.g. $a$. Vector components will be denoted and indexed by subscript, e.g. $g_1$.

Mathematically, the key functions comprising the genetic algorithm are defined as:

- evolution : $G^n \to G^n$ (where $n$ is the population size)
- selection : $G^n \to G^m$ (where $m$ is the number of parents used during crossover)
- crossover : $G^m \to G^l$ (where $l$ is the number of offspring produced during crossover)
- mutation : $G \to G$

3.4. Hyper-parameters

The following parameters will be fixed within the context of any experiment. They are almost completely common across all of the representations presented in this report.

- Let $n$ be the number of individuals within a generation
- Let $w, h$ be the width and height (respectively) of a phenotype in pixels
- Let $n_q$ be the number of generating points (see section 3.5.1)
- Let $n_l$ be the number of generating loops (see section 3.5.3)
- Let $n_{elites}$ be the number of elites to select for the next generation
- Let $p_t$ be the probability of emitting any line out of a point of type $t$ (see footnote 3)

3.5. The Genetic Algorithm

The prior works of Shen [6] and Smith [7] employed a number of problem-specific biases and heuristics. For example, the fitness of an individual influenced every genetic operator (i.e. crossover and mutation), not just selection as is typical of genetic algorithms. In an effort to focus this work on the proposed epigenetic models and reduce overall complexity, these heuristics and biases were fully removed and replaced with the algorithms outlined in this section.

The representation presented in this chapter is a generalisation of the representations from prior work on this problem, each fully specified in appendices E and F.

3.5.1. Genotype Representation

The genotype used in the Mondrian-like problem is a probabilistic structure that can produce many possible phenotypes. At a high level, it encodes a set of generating points on a canvas that will emit lines in directions that loosely follow the geometric rules of a Mondrian-like image. Associated with these generating points are probability distributions that dictate how lines are drawn from these points, how rectangles are selected for filling and the colour to use when filling those rectangles.
The following is a visual representation of an example Mondrian-like genotype. In this case there are $n_q = 3$ generating points plotted at specific locations on a two-dimensional canvas, all of which share a probability distribution $\mathbf{d}$ that describes the likelihood of emitting lines in the valid directions (north, south, east and west). The lengths of the vectors are proportional to the corresponding probabilities, with (in this example) the lowest probability associated with emitting a line west, and the highest probability associated with emitting a line south.

Figure 3.1: Graphical representation of $\mathbf{q}$, $\mathbf{d}$ genes in a genotype

The remaining genes in the genotype representation describe bounds used to identify rectangles that can be coloured ($a_{min}$, $a_{max}$, $c_{max}$) and a probability distribution over the possible colours ($\mathbf{c}$). More formally:

Let $g \in G$ s.t. $g = (a_{min}, a_{max}, c_{max}, \mathbf{d}, \mathbf{c}, \mathbf{q})$; where

- $a_{min} \in \mathbb{Z}/w\mathbb{Z} \times \mathbb{Z}/h\mathbb{Z}$ is the minimum dimensions of a rectangle that may be coloured
- $a_{max} \in \mathbb{Z}/w\mathbb{Z} \times \mathbb{Z}/h\mathbb{Z}$ is the maximum dimensions of a rectangle that may be coloured
- $c_{max} \in \mathbb{Z}$ is the maximum number of rectangles that may be coloured
- $\mathbf{d} \in \mathbb{R}^4$ s.t. $0 \le d_i \le 1 \;\forall i \in [1, 4]$ is the probability of drawing north/south/east/west
- $\mathbf{c} \in \mathbb{R}^3$ s.t. $0 \le c_i \le 1 \;\forall i \in [1, 3]$ is the probability of colouring red/blue/yellow
- $\mathbf{q} \in (\mathbb{Z}/w\mathbb{Z} \times \mathbb{Z}/h\mathbb{Z})^{n_q}$ is the set of generating points

Note: this is the genotype encoding from section F.1 with continuous probabilities.

Footnote 3: Where $t \in$ {terminal, online, right-angled, connected, etc.} - possible generating point states encountered during phenotype emission as defined by Shen in [6].

3.5.2. Phenotype Representation

The phenotype used in this work on the Mondrian-like problem is a geometric representation of a two-dimensional artwork that conforms to the criteria of section 2.2.

More specifically, the phenotype consists of a canvas, a set of horizontal and vertical lines and coloured rectangles. For example, software constructed for this work produced the following phenotype - encoded as an 8-bit RGB PNG image file:

Figure 3.2: Sample Mondrian-like phenotype

3.5.3. Genotype to Phenotype Expression Mapping

The following algorithm expresses one of the possible phenotypes from a given genotype.

Algorithm 1 Expression mapping m(g) (genotype to phenotype)
    Let (a_min, a_max, c_max, d, c, q) ← g                ▷ decompose g
    Let segments ← the empty set
    for i = 1, ..., n_l do                                 ▷ iterate number of generating loops times
        randomly shuffle q
        for q_j ∈ q do                                     ▷ random ordering over q_j
            if possible to emit line from q_j then
                Let d ← random direction drawn using the probability distribution {p_t}
                Let e ← all valid end-points (i.e. points of intersection)
                Let e_x ∈ e be an end-point selected from e uniformly at random
                Let l ← (q_j → e_x)                        ▷ line from q_j to e_x
                Let segments ← segments ∪ {l}
    return c(a_min, a_max, c, segments)
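To make the genotype that Algorithm 1 decomposes a little more concrete, the following is a minimal Python sketch of one possible in-memory representation. The class name `Genotype`, the helpers `sample_direction` and `random_genotype`, and the use of weighted sampling over the gene d are illustrative assumptions, not the report's actual software; the geometric line-emission logic of Algorithm 1 is deliberately left out.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

# Order follows the report's north/south/east/west convention for the gene d.
DIRECTIONS = ("north", "south", "east", "west")


@dataclass
class Genotype:
    """Hypothetical container for g = (a_min, a_max, c_max, d, c, q)."""
    a_min: Tuple[int, int]      # minimum (width, height) of a colourable rectangle
    a_max: Tuple[int, int]      # maximum (width, height) of a colourable rectangle
    c_max: int                  # maximum number of rectangles to colour
    d: List[float]              # weights for drawing north/south/east/west
    c: List[float]              # weights for colouring red/blue/yellow
    q: List[Tuple[int, int]]    # generating points on the w x h canvas


def sample_direction(genotype: Genotype, rng: random.Random) -> str:
    """Draw one direction with probability proportional to the entries of d.

    Assumption: d is treated as a set of relative weights, normalised on the fly.
    """
    return rng.choices(DIRECTIONS, weights=genotype.d, k=1)[0]


def random_genotype(w: int, h: int, n_q: int, rng: random.Random) -> Genotype:
    """Build an arbitrary valid genotype, e.g. to seed an initial population."""
    min_w, max_w = sorted((rng.randrange(w), rng.randrange(w)))
    min_h, max_h = sorted((rng.randrange(h), rng.randrange(h)))
    return Genotype(
        a_min=(min_w, min_h),
        a_max=(max_w, max_h),
        c_max=rng.randint(0, n_q),  # arbitrary illustrative upper bound
        d=[rng.random() for _ in range(4)],
        c=[rng.random() for _ in range(3)],
        q=[(rng.randrange(w), rng.randrange(h)) for _ in range(n_q)],
    )
```

Keeping the probabilistic genes as plain lists of weights means the same structure can be handed unchanged to crossover and mutation operators that work gene by gene.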

Algorithm 2 Phenotype rectangle colouring c(a_min, a_max, c, segments)
    for k = 1, ..., c_max do                               ▷ no more than c_max coloured rectangles
        Select colour i with probability c_i / Σ_{j=1}^{3} c_j
        Randomly select a rectangle r from segments with area bounded by a_min, a_max
        Fill the rectangle r with colour i
    return segments with any colouring applied

3.5.4. Fitness Function

The fitness function is defined as follows:

- Let ratings = {strongly dislike, dislike, indifferent, like, strongly like}
- Let $f^P : P \to \text{ratings}$ be the phenotype fitness function
- Let $f^G : G \to \mathbb{R}$ be the genotype fitness function

Define $f^P$ as $f^P(p)$ = user selected rating of $p$.

Define $f^G_\tau \;\forall \tau$ (generations up to and including $\tau$) as

$$
f^G_\tau(g) =
\begin{cases}
-2 & f^P(m(g)) = \text{strongly dislike} \\
-1 & f^P(m(g)) = \text{dislike} \\
0 & f^P(m(g)) = \text{indifferent} \\
1 & f^P(m(g)) = \text{like} \\
2 & f^P(m(g)) = \text{strongly like}
\end{cases}
$$

That is, the entire fitness value of a phenotype is simply assigned to the individual genotype that expressed it.

3.5.5. Genetic Operator: Evolution

Recall that the evolution algorithm is defined in this report to cover the iterative method used to pass from one generation of individuals to the next.

First, each individual genotype in the current generation emits a phenotype. These phenotypes are then used to assign fitness values to the underlying individuals. A common practice known as elite selection is employed here. Elite selection takes some of the fittest individuals and passes them on to the next generation unchanged.

The rest of the next generation is constructed in the typical manner. A set of unique parents is selected (weighted by fitness), and each set of parents performs crossover, which results in a set of offspring. Finally, each of these offspring may then undergo some mutation - i.e. modification of their genes (helping to escape local optima).
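The rating-to-fitness mapping and the generation step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `RATING_TO_FITNESS`, `genotype_fitness` and `evolve` are hypothetical, and the selection, crossover and mutation operators are passed in as plain functions rather than taken from the report's software.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

G = TypeVar("G")  # genotype type (hypothetical placeholder)
P = TypeVar("P")  # phenotype type (hypothetical placeholder)

# Ordinal user ratings mapped onto the integer fitness scale -2..2.
RATING_TO_FITNESS: Dict[str, int] = {
    "strongly dislike": -2,
    "dislike": -1,
    "indifferent": 0,
    "like": 1,
    "strongly like": 2,
}


def genotype_fitness(genotype: G,
                     express: Callable[[G], P],
                     rate: Callable[[P], str]) -> int:
    """Express one phenotype, ask for its rating, and credit the genotype."""
    return RATING_TO_FITNESS[rate(express(genotype))]


def evolve(population: Sequence[G],
           fitness: Callable[[G], float],
           select: Callable[[Sequence[G], Sequence[float]], Sequence[G]],
           crossover: Callable[[Sequence[G]], Sequence[G]],
           mutate: Callable[[G], G],
           n_elites: int) -> List[G]:
    """One generation step with elitism.

    Each genotype is scored exactly once, since in the interactive setting
    every call to `fitness` may cost one human evaluation.
    """
    n = len(population)
    scores = [fitness(g) for g in population]
    ranked = [g for _, g in sorted(zip(scores, population),
                                   key=lambda pair: pair[0], reverse=True)]
    next_generation = list(ranked[:n_elites])   # elites pass through unchanged
    while len(next_generation) < n:
        parents = select(population, scores)
        for child in crossover(parents):
            next_generation.append(mutate(child))
    return next_generation[:n]                   # truncate if crossover overshoots
```

Passing the operators in as arguments keeps the generation step independent of any particular selection, crossover or mutation scheme, which is convenient when comparing variants that differ only in how fitness is assigned.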

Algorithm 3 evolution({g_i}_{i=1}^{n})
    Express p_i = m(g_i) ∀ i ∈ [1, n]
    Evaluate fitness values at generation τ, f^G_τ(g_i) ∀ i ∈ [1, n]
    Sort {g_i}_{i=1}^{n} by fitness values f^G_τ(g_i) in descending order
    Let next_generation ← {g_1, ..., g_{n_elites}}         ▷ elite selection
    while |next_generation| < n do                          ▷ complete the next generation
        Let parents ← selection({g_i}_{i=1}^{n})
        Let offspring ← crossover(parents)
        for child ∈ offspring do
            Let next_generation ← next_generation ∪ {mutation(child)}
    return next_generation truncated to length n if necessary

3.5.6. Genetic Operator: Selection

The selection method employed in this work is known as roulette selection, or fitness proportionate selection. Critically, each individual is given a selection probability proportional to its fitness value. Selection is performed without replacement so that the same parent cannot pass its genetic material on to the same offspring multiple times.

Algorithm 4 selection({g_i}_{i=1}^{n}) where m = 2
    Let parents ← the empty set
    Scale fitness values s.t. f^G_τ(g_i) ≥ 0               ▷ ensure non-negative values
    for j = 1, ..., m do
        Select g_j with probability f^G_τ(g_j) / Σ_k f^G_τ(g_k); g_j, g_k ∉ parents    ▷ without replacement
        Let parents ← parents ∪ {g_j}
    return parents

3.5.7. Genetic Operator: Crossover

The crossover method employed in this work is two-parent, single-point crossover. A single point in the genotypes' chromosomes is selected uniformly at random. At this point, a single crossover is performed, resulting in two new individuals. The first new individual is constructed from the genes of parent A until the crossover point, and then continued with the genes of parent B. The second new individual is similarly constructed from the genes of parent B until the crossover point, and then continued with the genes of parent A. Figure 3.3 illustrates this process.
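The selection and crossover operators just described can be sketched in Python as below. This is a hedged illustration rather than the report's implementation: the function names are hypothetical, fitness values are made non-negative by subtracting the minimum (one of several possible scalings, assumed here), a uniform fallback is assumed when all remaining weights are zero, and the crossover point is drawn uniformly over the integers 1..|g| instead of rounding a continuous uniform sample.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

G = TypeVar("G")


def roulette_select(population: Sequence[G],
                    fitness: Sequence[float],
                    m: int = 2,
                    rng=random) -> List[G]:
    """Fitness proportionate selection of m distinct parents."""
    shift = min(fitness)
    weights = [f - shift for f in fitness]       # assumed scaling: all weights >= 0
    candidates = list(range(len(population)))
    parents: List[G] = []
    for _ in range(m):
        remaining = [weights[i] for i in candidates]
        if sum(remaining) <= 0:
            pick = rng.choice(candidates)        # assumed fallback: uniform choice
        else:
            pick = rng.choices(candidates, weights=remaining, k=1)[0]
        candidates.remove(pick)                  # without replacement
        parents.append(population[pick])
    return parents


def single_point_crossover(parent_a: Sequence,
                           parent_b: Sequence,
                           rng=random) -> Tuple[list, list]:
    """Two-parent, single-point crossover over equal-length gene sequences."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same number of genes")
    r = rng.randint(1, len(parent_a))            # r == |g| copies the parents unchanged
    child_1 = list(parent_a[:r]) + list(parent_b[r:])
    child_2 = list(parent_b[:r]) + list(parent_a[r:])
    return child_1, child_2
```

Note that subtracting the minimum gives the least fit individual a weight of zero, so it is only chosen when every remaining candidate also has zero weight; a different scaling would change the selective pressure.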

Figure 3.3: Two parent, single-point crossover

Algorithm 5 crossover({g^i}_{i=1}^{m}) where m = 2, l = 2
    Let (g^1_1, ..., g^1_{|g|}) ← g^1                      ▷ decompose g^1
    Let (g^2_1, ..., g^2_{|g|}) ← g^2                      ▷ decompose g^2
    Let r ← U(1, |g|) rounded to nearest integer           ▷ random crossover point
    Let child_1 ← (g^1_1, ..., g^1_r, g^2_{r+1}, ..., g^2_{|g|})    ▷ single-point crossover
    Let child_2 ← (g^2_1, ..., g^2_r, g^1_{r+1}, ..., g^1_{|g|})    ▷ single-point crossover
    return {child_1, child_2}

3.5.8. Genetic Operator: Mutation

A simple probabilistic mutation method is employed such that the expected number of mutated genes in each individual is low and constant; typically this expected value is 1.
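As a rough Python illustration of this mutation scheme, the sketch below mutates each gene with probability 1/|g| and perturbs the chosen value with a Gaussian that is re-sampled until it falls inside the valid range, in the spirit of the report's perturb helper (Algorithm 7). The function names, the `bounds` argument, the encoding of genes as either an `int` or a list of probabilities, and the standard deviation of one third of the range are all assumptions made for this example.

```python
import random
from typing import List, Optional, Sequence, Tuple, Union

Gene = Union[int, List[float]]


def perturb(x: float, lower: float, upper: float, rng=random) -> float:
    """Sample from a Gaussian centred on x, re-sampling until within [lower, upper]."""
    sigma = (upper - lower) / 3.0          # assumed: 3 sigma spans the valid range
    if sigma == 0:
        return x                           # degenerate range: nothing to perturb
    while True:
        y = rng.gauss(x, sigma)
        if lower <= y <= upper:
            return y


def mutate(genes: Sequence[Gene],
           bounds: Sequence[Optional[Tuple[int, int]]],
           rng=random) -> List[Gene]:
    """Mutate each gene independently with probability 1/|g|.

    `bounds[i]` is (low, high) for an integer gene and is ignored for a
    probability-distribution gene (hypothetical encoding for this sketch).
    """
    rate = 1.0 / len(genes)
    mutated: List[Gene] = []
    for gene, bound in zip(genes, bounds):
        if rng.random() >= rate:           # leave this gene untouched
            mutated.append(list(gene) if isinstance(gene, list) else gene)
        elif isinstance(gene, int):        # bounded integer gene
            low, high = bound
            mutated.append(int(round(perturb(gene, low, high, rng))))
        else:                              # probability-distribution gene
            probs = list(gene)
            k = rng.randrange(len(probs))  # pick one event uniformly at random
            probs[k] = perturb(probs[k], 0.0, 1.0, rng)
            total = sum(probs)
            mutated.append([p / total for p in probs] if total > 0 else probs)
    return mutated
```

With a per-gene rate of 1/|g|, the expected number of mutated genes per individual is one regardless of genotype length, which matches the behaviour described above.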

Algorithm 6 mutation(g)
    Let (g_1, ..., g_{|g|}) ← g                            ▷ decompose g
    for i = 1, ..., |g| do
        if U(0, 1) < 1/|g| then                            ▷ with constant mutation probability per gene
            if g_i is a bounded integer in the range [a, b] then
                g_i ← perturb(g_i, a, b) rounded to nearest integer
            else if g_i is a probability distribution then
                Let x be one of the possible events in g_i selected uniformly at random
                Let p_x ← perturb(p_x, 0, 1)               ▷ mutate the probability of x
                Let p_j ← p_j / Σ_{p_k ∈ g_i} p_k ∀ p_j ∈ g_i    ▷ renormalise
    return (g_1, ..., g_{|g|}) (with any perturbations applied)

The helper function perturb referenced in the mutation operator fits a Gaussian distribution to the current value of the variable to mutate. This distribution is sampled until a valid value is retrieved - this becomes the new value. While the expected outcome of this operation is for the input value to remain unchanged, there is random mutation introduced about this value.

Algorithm 7 perturb(x, lower, upper)
    Let σ ← (upper − lower) / 3                            ▷ 3σ spans the range of x
    Let y ← upper + 1
    while y ∉ [lower, upper] do                            ▷ re-sample if necessary
        y ∼ N(µ = x, σ²)                                   ▷ Gaussian distribution: mean µ, variance σ²
    return y

Note: σ is selected such that three standard deviations span the range of valid values. If the mean is centred, the valid values lie within 1.5σ either side of it, so there is roughly an 87% chance that this algorithm terminates after one iteration. The worst case is that x is either lower or upper, in which case half of the distribution lies outside the valid range: there is a 99.7%/2 ≈ 49.85% chance of terminating after one iteration and a 1 − 49.85% ≈ 50% chance of not terminating. In the worst case, the expected number of iterations is therefore approximately two.

3.5.9. Invalid Phenotypes

This combination of crossover and mutation can produce genotypes that are able to express phenotypes that violate some desirable constraints from section 2.2. This issue will be revisited later in the report with possible remedies suggested as potential future work.
The invalid-phenotype limitation is not resolved in this report.

Proposed Epigenetic Models

In biology, epigenetics refers to stable heritable traits which cannot be explained by changes in DNA sequence. The term often refers to changes in gene methylation which affect gene activity, and also includes prions. As seen in biology, epigenetic changes do not fit well with modern views of Darwin's theory of evolution (all inheritance is via DNA), nor Lamarck's (which would require phenotypic changes to modify the DNA). Baldwin's theory comes close, and assumes that evolutionary pressure will favour individuals with the capability to learn during their lifetime [12].

In genetic algorithms with a single chromosome and a direct phenotype to genotype matching, there is no place for epigenetics. That is, we could not meaningfully differentiate two parts of a data structure representing algorithmic genetic information and call one DNA and the other not-DNA (and hence epigenetic). In this project, we have a chromosome with a probabilistic process to create a phenotype from a genotype, thus potentially allowing many genotypes to generate the same phenotype, and a single genotype to generate many phenotypes. So, by analogy with Baldwin's theory, we consider the process of search over genome-to-phenotype pairs as epigenetic, if it is used as part of the evolutionary selection. This is not exactly the same as an individual learning over a lifetime; rather, it is abstractly analogous, and is potentially useful to speed up interactive evolutionary algorithms.

In a (simulated) biological system which has both short term and long term adaptation mechanisms for individuals in a population, the long term process would maintain the primary inheritable information. I propose that the function of the short term mechanism is to adapt individuals to their current environment at a fitness cost related to the distance between the simplest expression of their primary inheritable information, and the actual expression. Thus, short term environmental changes can be accommodated by the short term mechanism. If the short term continues long enough, some of the changes become incorporated into the primary inheritable information due to selection pressure.
This prepares individuals in the population to be fitter if the environmental change continues in the same direction further into the future. In terms of biology, the long term process is Darwinian selection using DNA, while the short term process is similar or analogous to epigenetic changes. In this project, there is a chromosome with a probabilistic process to create a phenotype from a genotype, thus potentially allowing many genotypes to generate the same phenotype, and a single genotype to generate many phenotypes. The search over genotype-to-phenotype pairs, which assigns fitness from a phenotype to all of the genotypes that could have created it, is an abstract model of an epigenetic mechanism as proposed above. It has the potential to speed up interactive evolutionary algorithms, or algorithms where fitness evaluation is particularly costly.

Tom Gedeon's proposition for epigenetic models [13]

Gedeon's proposition forms the foundation for both epigenetic extensions to the genetic algorithm investigated in this work. These models are built on top of the existing framework from section 3.5 and are constrained to a redefinition of the fitness functions. While section 3.5 employed the symbol $f^G_\tau$ for the fitness function, the parameter $\tau$ was unused. It was defined in that way so that the fitness function could be extended to be temporal in nature in these epigenetic models without having to redefine any of the algorithms from previous sections. The main implication of this construction is that the fitness of an individual may change over time.

Epigenetics: Exact Phenotype Matches

The fitness function is defined as follows:

Let ratings = {strongly dislike, dislike, indifferent, like, strongly like}
Let $\mathrm{rating} : P \to \mathrm{ratings}$ be the phenotype rating function
Let $f^P : P \to \mathbb{Z}$ be the phenotype fitness function
Let $f^G_\tau : G \to \mathbb{R}$ be the genotype fitness function at time $\tau$

Define $f^P$ as

$$f^P(p) = \begin{cases} -2 & \mathrm{rating}(p) = \text{strongly dislike} \\ -1 & \mathrm{rating}(p) = \text{dislike} \\ 0 & \mathrm{rating}(p) = \text{indifferent} \\ 1 & \mathrm{rating}(p) = \text{like} \\ 2 & \mathrm{rating}(p) = \text{strongly like} \end{cases}$$

Define $f^G_\tau$ as, $\forall p \in P$ observed and rated at time $t \le \tau$,

$$f^G_\tau(g) = \frac{\sum_{p} P\left(m^{*}(g) = p\right) \cdot f^P(p) \cdot \frac{1}{k \lVert m^{*}(g) - p \rVert + 1}}{\sum_{p} P\left(m^{*}(g) = p\right) \cdot \mathrm{MAX\_FITNESS} \cdot \frac{1}{k \lVert m^{*}(g) - p \rVert + 1}}$$

Where $\mathrm{MAX\_FITNESS} = \max\{f^P(p)\} = 2$

This fitness function accumulates the fitness value of all previously observed phenotypes. These fitness values are weighted proportionally to the expression difficulty (or probability) and inversely proportionally to the distance between each observed phenotype and that which was actually expressed by the given genotype. Note: the denominator is used to normalise the fitness function such that genotype to phenotype transitions with high probability do not dominate the summation.

Epigenetics: Inexact Phenotype Matches

Generalising further, we may consider inexact matches, i.e. not just phenotypes that can be directly expressed by a genotype, but those close to phenotypes directly expressible. A measure of this closeness is included. Define $f^G_\tau(g)$ as

$$f^G_\tau(g) = \frac{\sum_{p} P\left((m^{*}(g))^{(j)} = p^{(j)}\right) \cdot f^P(p) \cdot \frac{1}{k_1 j + 1} \cdot \frac{1}{k_2 \lVert m^{*}(g) - p' \rVert + 1}}{\sum_{p} P\left((m^{*}(g))^{(j)} = p^{(j)}\right) \cdot \mathrm{MAX\_FITNESS} \cdot \frac{1}{k_1 j + 1} \cdot \frac{1}{k_2 \lVert m^{*}(g) - p' \rVert + 1}}$$

$\forall p \in P$ observed and rated at time $t \le \tau$
Where $p'$ is given by $\min j \in \mathbb{N}$ s.t. $(m^{*}(g))^{(j)} = p^{(j)}$

Where $p^{(j)}$ is the phenotype $p$, with line segment co-ordinates as multiples of $j$
Where $p'$ is the nearest (possibly exact) match to $p$
Where $\mathrm{MAX\_FITNESS} = \max\{f^P(p)\} = 2$

The $p^{(j)}$ construction is best illustrated with an example showing various values of $j$. As $j$ increases, the granularity of valid plotting coordinates decreases. With sufficient $j$, lines begin to collapse on top of each other. The purpose of this construction is to force (with sufficiently high $j$) all phenotypes to collapse to all other phenotypes. The degree of granularity reduction required forms a distance metric between phenotypes; the greater the granularity decrease required for equivalence, the greater the distance between two phenotypes. A sufficiently high value of $j$ causes all phenotypes to degenerate to the empty canvas, thus ensuring that every pair of phenotypes has a finite distance.

[Figure panels: (a) Example $p$; (b) $p^{(10)}$; (c) $p^{(40)}$; (d) $p^{(80)}$; (e) $p^{(160)}$]

Computing Epigenetic Fitness Functions

From a computability perspective, it would be ideal to invert the function $m^{*}$. That is, we would like to compute $(m^{*})^{-1}$ and enumerate $\{g_i\}$ s.t. $P\left(g_i = (m^{*})^{-1}(p)\right) > 0$ for any given phenotype $p$. However, this inverse does not exist. The randomised nature of $m^{*}$ means it is a one-to-many function. It is also easy to see that multiple genotypes can produce the same phenotype. Since $m^{*}$ is a many-to-many function, it is not invertible.

Enumerating all possible genotypes that could have produced a given phenotype superficially appears intractable. However, recall that the context of this is evolutionary computing. While the set of genotypes that could have generated a given phenotype will impact how we spread a fitness value over the search space, what really matters is the accumulated fitness value of genotypes as they occur during selection. Thus, consider the deterministic function $m$ corresponding to $m^{*}$, which also acts on a multi-variate random variable representing the random choices made in the computation of $m^{*}$.
Let $m : G \times X \to P$ where $X$ is a discrete and finite multi-variate random variable. Observe that while computing $f^G_\tau(g)\ \forall g \in G$ is computationally difficult, we only require the fitness values for genotypes in the current population. Thus, we may accumulate the set of all observed and rated phenotypes $\{p_i\}$ at time $\tau$ and compute $f^G_\tau(g)$ on-the-fly by evaluating $m(g, x)\ \forall x \in X$ and comparing each result to all $\{p_i\}$. While this algorithm involves identifying any match between $m(g, x)$ and $\{p_i\}$ at each evaluation of $f^G_\tau$, a phenotype is simply a collection of unique line and rectangle co-ordinates. A suitable use of hash-map data structures reduces each such match test to an expected $O(1)$ lookup.
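The hash-map idea can be sketched as follows. This is only an illustration under stated assumptions: a phenotype is encoded as a `frozenset` of hypothetical `(x1, y1, x2, y2)` segment tuples, and the class name `RatedHistory` is invented for this example. Because a `frozenset` is hashable and order-independent, two phenotypes with the same co-ordinates map to the same dictionary key.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

Segment = Tuple[int, int, int, int]  # hypothetical encoding: (x1, y1, x2, y2)
Phenotype = FrozenSet[Segment]

# Phenotype fitness values f^P for the five ratings defined earlier.
RATING_FITNESS = {
    "strongly dislike": -2,
    "dislike": -1,
    "indifferent": 0,
    "like": 1,
    "strongly like": 2,
}


def as_phenotype(segments: Iterable[Segment]) -> Phenotype:
    """Canonical, hashable form of a collection of unique co-ordinates."""
    return frozenset(segments)


class RatedHistory:
    """Observed and rated phenotypes, keyed by their co-ordinate sets."""

    def __init__(self) -> None:
        self._fitness: Dict[Phenotype, int] = {}

    def record(self, segments: Iterable[Segment], rating: str) -> None:
        self._fitness[as_phenotype(segments)] = RATING_FITNESS[rating]

    def lookup(self, segments: Iterable[Segment]) -> Optional[int]:
        # Expected constant-time dictionary lookup once the key is hashed.
        return self._fitness.get(as_phenotype(segments))


history = RatedHistory()
history.record([(0, 0, 5, 0), (0, 0, 0, 5)], "like")
```

Hashing a phenotype still costs time proportional to its number of segments, so the constant-time claim applies to the lookup among historical phenotypes rather than to building the key.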

This general algorithmic approach may be extended to the epigenetic model with inexact matches. Specifically, for each historical phenotype, and for each reachable phenotype enumerated during the $f^G_\tau$ computation, the translated phenotype $p^{(j)}$ must be computed $\forall j \in [1, \max(w, h)]$. This is required to identify inexact matches and the smallest such translation factor $j$ (as used by the fitness function). The upper bound on $j$ comes from the fact that any further loss of resolution results in a blank canvas for all starting phenotypes. That is, if the canvas has fewer pixel locations than our granularity, we cannot plot anything. So, each observed and rated phenotype $p$ now corresponds to the set $\{p^{(j)}\}\ \forall j \in [1, \max(w, h)]$.

The following pseudo-code provides algorithms to compute these fitness functions.

Algorithm 8 Epigenetic (exact matches) fitness at generation $\tau$, $f^G_\tau(g)$
    Let $p$ be the phenotype already expressed by $g$
    Let $H \leftarrow \{p_i\}$ be the historical phenotypes already encountered $\forall i \in [1, n\tau]$
    Let $R \leftarrow \{q_j \leftarrow m(g, x)\ \forall x \in X\}$ be all phenotypes reachable from $g$
    Let $a \leftarrow 0$, $b \leftarrow 0$
    for $(q_j, p_i) \in R \times H$ do
        if $q_j = p_i$ then
            Let $a \leftarrow a + \frac{P(m^{*}(g) = q_j) \cdot f^P(p_i)}{k \lVert p - q_j \rVert + 1}$
            Let $b \leftarrow b + \frac{P(m^{*}(g) = q_j) \cdot \mathrm{MAX\_FITNESS}}{k \lVert p - q_j \rVert + 1}$
    return $\frac{a}{b}$

The asymptotic complexity of this algorithm is $O(n\,|X|\,\tau)$; that is, linear in $\tau$.
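A minimal Python sketch of Algorithm 8 follows, under several assumptions that are not taken from the report: the reachable set is passed as a dictionary from phenotype to its total expression probability, the history is a dictionary from phenotype to $f^P$, the distance is the size of the symmetric difference between co-ordinate sets, and a genotype with no matching history is given fitness 0.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Segment = Tuple[int, int, int, int]  # hypothetical encoding: (x1, y1, x2, y2)
Phenotype = FrozenSet[Segment]
MAX_FITNESS = 2


def symmetric_difference_distance(p: Phenotype, q: Phenotype) -> float:
    """Hypothetical distance: number of segments present in only one phenotype."""
    return float(len(p ^ q))


def exact_match_fitness(
    expressed: Phenotype,
    reachable: Dict[Phenotype, float],  # q -> P(m*(g) = q), assumed precomputed
    history: Dict[Phenotype, int],      # p_i -> f^P(p_i)
    distance: Callable[[Phenotype, Phenotype], float] = symmetric_difference_distance,
    k: float = 1.0,
) -> float:
    a = 0.0
    b = 0.0
    for q, prob in reachable.items():
        if q in history:  # exact match via hash lookup
            weight = prob / (k * distance(expressed, q) + 1.0)
            a += weight * history[q]
            b += weight * MAX_FITNESS
    # Assumption: no historical match yields a neutral fitness of zero.
    return a / b if b else 0.0


p1 = frozenset({(0, 0, 10, 0)})
p2 = frozenset({(0, 0, 0, 10)})
fitness = exact_match_fitness(p1, {p1: 0.5, p2: 0.5}, {p1: 2, p2: -1})
```

Iterating over the reachable set and probing the history dictionary visits each reachable phenotype once, which avoids the explicit $R \times H$ product while producing the same sums.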
Algorithm 9 Epigenetic (inexact matches) fitness at generation $\tau$, $f^G_\tau(g)$
    Let $p$ be the phenotype already expressed by $g$
    Let $H \leftarrow \{p_i\}$ be the historical phenotypes already encountered $\forall i \in [1, n\tau]$
    Let $R \leftarrow \{q_j \leftarrow m(g, x)\ \forall x \in X\}$ be all phenotypes reachable from $g$
    Let $a \leftarrow 0$, $b \leftarrow 0$
    for $p_i \in H$ do
        for $l = 1, \ldots, \max(w, h)$ do
            for $q_j \in R$ do
                if $q_j^{(l)} = p_i^{(l)}$ then    ▷ inexact match on multiple-of-$l$ coordinate system
                    Let $a \leftarrow a + \frac{P(m^{*}(g) = q_j) \cdot f^P(p_i)}{(k_1 l + 1)(k_2 \lVert p - q_j^{(l)} \rVert + 1)}$
                    Let $b \leftarrow b + \frac{P(m^{*}(g) = q_j) \cdot \mathrm{MAX\_FITNESS}}{(k_1 l + 1)(k_2 \lVert p - q_j^{(l)} \rVert + 1)}$
                    Proceed to next $p_i$
    return $\frac{a}{b}$

Accounting for inexact phenotype matches in this way increases the complexity only by a factor of $\max(w, h)$, which is constant for a fixed canvas. That is, computing $f^G_\tau(g)$ for inexact matches has asymptotic complexity $O(n \max(w, h)\,|X|\,\tau)$, which is still linear in $\tau$.
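The granularity construction and Algorithm 9 can be sketched in Python as below. Several details are assumptions made for the sake of a runnable example: co-ordinates are snapped down to multiples of the granularity, segments that collapse to a single point are dropped (so that a coarse enough grid yields the empty canvas), the distance function is supplied by the caller, and a genotype with no match is given fitness 0.

```python
from typing import Callable, Dict, FrozenSet, Optional, Tuple

Segment = Tuple[int, int, int, int]  # hypothetical encoding: (x1, y1, x2, y2)
Phenotype = FrozenSet[Segment]
MAX_FITNESS = 2


def coarsen(p: Phenotype, j: int) -> Phenotype:
    """p^(j): snap every co-ordinate down to a multiple of j (assumed rule)."""
    snapped = {tuple((c // j) * j for c in seg) for seg in p}
    # Assumption: segments that collapse to a single point are not drawn.
    return frozenset(s for s in snapped if (s[0], s[1]) != (s[2], s[3]))


def match_granularity(p: Phenotype, q: Phenotype, max_dim: int) -> Optional[int]:
    """Smallest j in [1, max_dim] at which p and q coincide, if any."""
    for j in range(1, max_dim + 1):
        if coarsen(p, j) == coarsen(q, j):
            return j
    return None


def inexact_match_fitness(
    expressed: Phenotype,
    reachable: Dict[Phenotype, float],  # q -> P(m*(g) = q), assumed precomputed
    history: Dict[Phenotype, int],      # p_i -> f^P(p_i)
    distance: Callable[[Phenotype, Phenotype], float],
    max_dim: int,                       # max(w, h)
    k1: float = 1.0,
    k2: float = 1.0,
) -> float:
    a = 0.0
    b = 0.0
    for p_i, f_p in history.items():
        matched = False
        for l in range(1, max_dim + 1):
            p_l = coarsen(p_i, l)
            for q, prob in reachable.items():
                q_l = coarsen(q, l)
                if q_l == p_l:  # inexact match on the multiple-of-l grid
                    weight = prob / ((k1 * l + 1.0) * (k2 * distance(expressed, q_l) + 1.0))
                    a += weight * f_p
                    b += weight * MAX_FITNESS
                    matched = True
                    break
            if matched:
                break  # proceed to the next historical phenotype
    # Assumption: no match at any granularity yields a neutral fitness of zero.
    return a / b if b else 0.0


line = frozenset({(0, 0, 100, 0)})
shifted = frozenset({(0, 3, 100, 3)})
granularity = match_granularity(line, shifted, 200)
```

In this example the two horizontal lines first coincide on the multiple-of-4 grid, so their distance under the granularity metric is 4.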

4. Software Architecture

The software constructed to support this project provided a flexible framework in which to implement various genetic algorithms to solve the Mondrian-like problem. In particular, all three implementations described in this paper were incorporated: Maroney (section 3.5), Shen (appendix E) and Smith (appendix F).

This software presents an HTTP server through which a user can perform experiments. All data is recorded and stored on the hosting server for later analysis. The user is expected to connect to this HTTP server via any modern web browser; standards-based and portable client-side technologies are employed. The high-level architectural design of the software is described in the image below.

[Diagram: the Darwindrian logic, HTTP server, web browser and worker pool, connected by the numbered interactions 1 (start HTTP server), 2 (start web browser), 3a-3d, 4a-4f and 5a-5d]

Figure 4.1: Architectural diagram of software artefacts developed in this project

4.1. Software Operation

The main program is named darwindrian, as was used in the works of [6] and [7]. This name is a play on words, combining Darwin (for the evolutionary computing aspects of this work) and Mondrian (for the artist that inspired the computer generated artwork model).

Upon launching the darwindrian application, a simple HTTP server is launched (1) and, after a brief pause, the default local web browser is launched to open the associated darwindrian homepage (2). Note: while the system hosting the HTTP server will also launch a local web browser by default, the server will accept connections from any routable device. A separate computer may be used to access and perform the experiments remotely.

The HTTP server acts largely as a mechanism through which to pass data between darwindrian and the web browser. It also maintains the state of the current genetic algorithm experiment. This software has been designed as a single-user application. To run multiple experiments concurrently, host multiple darwindrian instances on separate port numbers and connect to them individually with separate web browser sessions.

Once running, the first page presented to the user allows them to set the hyper-parameters and other settings of an experiment. Steps 3a, 3b, 3c and 3d in the architectural diagram above illustrate this exchange. A screen shot of this first page is included below.

Figure 4.2: Screen shot of experimental hyper-parameter selection
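The start-up sequence in steps (1) and (2) can be sketched with the Python standard library as below. This is not the darwindrian source: the handler, its placeholder page and the function name `start` are hypothetical, and only the general pattern (serve in a background thread, then open the default browser) is illustrated.

```python
import threading
import webbrowser
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HomeHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that serves a single placeholder page."""

    def do_GET(self) -> None:
        body = b"darwindrian placeholder page"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass  # keep the console quiet in this sketch


def start(port: int = 8080, open_browser: bool = True) -> ThreadingHTTPServer:
    # Step 1: bind on all interfaces so that remote clients can also connect.
    server = ThreadingHTTPServer(("", port), HomeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # Step 2: open the homepage in the default local web browser.
    if open_browser:
        webbrowser.open(f"http://localhost:{server.server_port}/")
    return server


# Port 0 asks the operating system for any free port; the browser is skipped here.
server = start(port=0, open_browser=False)
```

Running several instances on different ports, as suggested above for concurrent experiments, would amount to calling such a start function once per port in separate processes.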

Once these hyper-parameter selections are submitted, a genetic algorithm instance is created on the server, ready for interactive evolution. The user will then be redirected (3d) to the main evolution interface, a web page with the current generation of Mondrian-like images and associated rating forms. A screen shot of this main interactive evolutionary computing page is included below.

Figure 4.3: Screen shot of interactive evolutionary feedback

Essentially all of the experimental time will be spent on this page, with the user rating one generation of individuals at a time until they have reached the pre-established number of generations or they click on the See Results button to exit early. Alternatively, an experiment may be halted without producing any results with the Start Again button.

The user will typically first request the next generation with the See Next Generation button (4a). The HTTP server then relays this request (4b) to the darwindrian logic, where the next generation is computed with the evolution algorithm. Each individual genotype in this new generation then expresses a Mondrian-like phenotype which is rendered as an image file. These images, embedded in the main evolution web page, are then served to the user (4e, 4f). These steps (4a-4f) will iterate a number of times.

Steps 4c and 4d only apply to the epigenetic models, not the basic genetic model. As described in section 3.6, the epigenetic fitness functions require an enumeration of all phenotypes reachable from the given genotype. Due to the combinatorial complexity of the Mondrian-like phenotype expression logic, this is computationally expensive. To reduce the impact on the user experience, the genotype to phenotype enumeration has been implemented in an optimised C++ stand-alone executable utilising a dynamic programming algorithm [14] to avoid redundant calculations during the search tree expansion. Further, the enumeration for each individual genotype is independent of all others and so can be computed in parallel. A worker pool is created and an enumeration for each individual genotype is queued to the pool. This allows the work to be distributed over a number of processing cores local to the computer hosting darwindrian. The web form is pre-populated with a suggestion of allocating all but one of the local processing cores to this worker pool; however, this can be changed by the test subject.

Finally, this enumeration is independent of the ultimate user rating assigned to the singly expressed Mondrian-like phenotype. Thus, the enumerations begin processing immediately while the user is presented with the current generation's images for rating. This concurrency leverages the relative slowness of human actions to mask the computational cost of the genotype to phenotype enumerations required in the epigenetic models.

Once the steps (4a-4f, with or without steps 4c and 4d for the epigenetic models) have iterated the requisite number of times, the experiment will conclude. At this point, the user will be redirected to request the results view (5a). The HTTP server will again relay this request (5b) to darwindrian, which will compute a series of performance measures and serialise all experimental data for any subsequent analysis. The darwindrian application then produces a high-level summary of the experiment and its results, along with a sample of a Mondrian that was produced by a genotype with the highest observed fitness value. A screen shot of this results page is included below.

Figure 4.4: Screen shot of experimental results
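Returning to the worker pool used in steps 4c and 4d, the overlap between user rating and background enumeration can be sketched as follows. The function `enumerate_phenotypes` is a hypothetical stand-in for the call to the external enumeration program, whose name and interface are not given here; threads are assumed to be sufficient because the heavy computation would run in that separate process.

```python
import os
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, List, Optional, Sequence, Tuple


def enumerate_phenotypes(genotype: Sequence[int]) -> List[int]:
    """Hypothetical stand-in for launching the external enumeration program."""
    return sorted(set(genotype))


def start_enumerations(
    population: Sequence[Sequence[int]], workers: Optional[int] = None
) -> Tuple[ThreadPoolExecutor, Dict[int, Future]]:
    """Queue one enumeration per genotype and return immediately."""
    if workers is None:
        # Suggested default from the text: all but one local processing core.
        workers = max(1, (os.cpu_count() or 2) - 1)
    pool = ThreadPoolExecutor(max_workers=workers)
    futures = {i: pool.submit(enumerate_phenotypes, g) for i, g in enumerate(population)}
    return pool, futures


def collect(pool: ThreadPoolExecutor, futures: Dict[int, Future]) -> Dict[int, List[int]]:
    """Block until every enumeration has finished, e.g. once ratings are submitted."""
    results = {i: f.result() for i, f in futures.items()}
    pool.shutdown()
    return results


pool, futures = start_enumerations([[3, 1, 3], [2, 2], [5]], workers=2)
# ... the user rates the current generation while the enumerations run ...
results = collect(pool, futures)
```

Submitting the jobs before the user starts rating is what hides the enumeration cost; collecting the results is deferred until the fitness values are actually needed.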

The graph in the middle of the results screen shot illustrates a key performance measure, namely, the average phenotype fitness per generation. This graph gives a rough view of the average user rating per Mondrian-like image over the generations. The Mondrian-like image on the right was expressed from one of the individual genotypes with the highest fitness value (there may be several). In a sense, this image represents one of the best solutions identified in the experiment. The underlying data for all test subjects that produced both the performance measure graph and the sample Mondrian-like image are maintained for subsequent analysis. The aggregate results of this analysis are captured and discussed in section 5.4.

This software was designed to support one final experiment that captures the user's subjective comparison between the final generation of an experiment under each of the three models supported (the genetic model and the epigenetic models with exact and inexact phenotype matches). It is assumed that a user will perform three experiments in sequence, one for each model. At this point, they will return to the main page (see figure 4.2) and click on the Final Evaluation button. This final experiment retrieves all Mondrian-like phenotypes from the last generation for each of the three models tested. These images are then randomly shuffled and presented in an arbitrary order.
The user is asked to rate all images, the results of which are serialised back into the experimental result directories.

Software Synopsis

The darwindrian software has a single entry point. Its synopsis is defined by:

    usage: darwindrian.py [-h] [--port PORT] [--nobrowser]

    optional arguments:
      -h, --help   show this help message and exit
      --port PORT  web server port (default 8080)
      --nobrowser  disable launch of local web browser

When darwindrian is executed without any arguments, it creates an HTTP server listening on the default network port and locally launches a web browser to connect to that HTTP server. Both of these behaviours can be modified with command-line arguments. To have darwindrian listen on an alternative port, use the --port PORT flag. To stop the launch of a local web browser, use the --nobrowser flag. You may wish to do this, for example, if you intend to connect to darwindrian from a network-connected client instead of locally.
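A command-line interface matching this synopsis could be declared with Python's standard `argparse` module, as in the sketch below. The function name `build_parser` is an assumption for illustration; only the flags, the default port and the help strings come from the synopsis above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser reproducing the documented darwindrian.py flags."""
    parser = argparse.ArgumentParser(prog="darwindrian.py")
    parser.add_argument(
        "--port",
        type=int,
        default=8080,
        help="web server port (default 8080)",
    )
    parser.add_argument(
        "--nobrowser",
        action="store_true",
        help="disable launch of local web browser",
    )
    return parser


# Equivalent to running: darwindrian.py --port 9000 --nobrowser
args = build_parser().parse_args(["--port", "9000", "--nobrowser"])
```

With no arguments the parser yields port 8080 and `nobrowser` set to false, which corresponds to the default behaviour described above.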


INVARIANT SUBSETS OF THE SEARCH SPACE AND THE UNIVERSALITY OF A GENERALIZED GENETIC ALGORITHM INVARIANT SUBSETS OF THE SEARCH SPACE AND THE UNIVERSALITY OF A GENERALIZED GENETIC ALGORITHM BORIS MITAVSKIY Abstract In this paper we shall give a mathematical description of a general evolutionary heuristic

More information

Using Evolutionary Techniques to Hunt for Snakes and Coils

Using Evolutionary Techniques to Hunt for Snakes and Coils Using Evolutionary Techniques to Hunt for Snakes and Coils Abstract The snake-in-the-box problem is a difficult problem in mathematics and computer science that deals with finding the longest-possible

More information

The Effects of Coarse-Graining on One- Dimensional Cellular Automata Alec Boyd UC Davis Physics Deparment

The Effects of Coarse-Graining on One- Dimensional Cellular Automata Alec Boyd UC Davis Physics Deparment The Effects of Coarse-Graining on One- Dimensional Cellular Automata Alec Boyd UC Davis Physics Deparment alecboy@gmail.com Abstract: Measurement devices that we use to examine systems often do not communicate

More information

Analysis of Crossover Operators for Cluster Geometry Optimization

Analysis of Crossover Operators for Cluster Geometry Optimization Analysis of Crossover Operators for Cluster Geometry Optimization Francisco B. Pereira Instituto Superior de Engenharia de Coimbra Portugal Abstract We study the effectiveness of different crossover operators

More information

ECE521 Lecture 7/8. Logistic Regression

ECE521 Lecture 7/8. Logistic Regression ECE521 Lecture 7/8 Logistic Regression Outline Logistic regression (Continue) A single neuron Learning neural networks Multi-class classification 2 Logistic regression The output of a logistic regression

More information

Comparison of Modern Stochastic Optimization Algorithms

Comparison of Modern Stochastic Optimization Algorithms Comparison of Modern Stochastic Optimization Algorithms George Papamakarios December 214 Abstract Gradient-based optimization methods are popular in machine learning applications. In large-scale problems,

More information

Local and Stochastic Search

Local and Stochastic Search RN, Chapter 4.3 4.4; 7.6 Local and Stochastic Search Some material based on D Lin, B Selman 1 Search Overview Introduction to Search Blind Search Techniques Heuristic Search Techniques Constraint Satisfaction

More information

Fundamentals of Genetic Algorithms

Fundamentals of Genetic Algorithms Fundamentals of Genetic Algorithms : AI Course Lecture 39 40, notes, slides www.myreaders.info/, RC Chakraborty, e-mail rcchak@gmail.com, June 01, 2010 www.myreaders.info/html/artificial_intelligence.html

More information

Parallel/Distributed Evolutionary Computation The influence of spatial interaction in evolutionary behaviour

Parallel/Distributed Evolutionary Computation The influence of spatial interaction in evolutionary behaviour Parallel/Distributed Evolutionary Computation The influence of spatial interaction in evolutionary behaviour Garrett Camp {camp@enel.ucalgary.ca} CPSC605 Simple Genetic Algorithms Based on darwin's theory

More information

Scaling Up. So far, we have considered methods that systematically explore the full search space, possibly using principled pruning (A* etc.).

Scaling Up. So far, we have considered methods that systematically explore the full search space, possibly using principled pruning (A* etc.). Local Search Scaling Up So far, we have considered methods that systematically explore the full search space, possibly using principled pruning (A* etc.). The current best such algorithms (RBFS / SMA*)

More information

On the errors introduced by the naive Bayes independence assumption

On the errors introduced by the naive Bayes independence assumption On the errors introduced by the naive Bayes independence assumption Author Matthijs de Wachter 3671100 Utrecht University Master Thesis Artificial Intelligence Supervisor Dr. Silja Renooij Department of

More information

The Cross Entropy Method for the N-Persons Iterated Prisoner s Dilemma

The Cross Entropy Method for the N-Persons Iterated Prisoner s Dilemma The Cross Entropy Method for the N-Persons Iterated Prisoner s Dilemma Tzai-Der Wang Artificial Intelligence Economic Research Centre, National Chengchi University, Taipei, Taiwan. email: dougwang@nccu.edu.tw

More information

Forecasting & Futurism

Forecasting & Futurism Article from: Forecasting & Futurism December 2013 Issue 8 A NEAT Approach to Neural Network Structure By Jeff Heaton Jeff Heaton Neural networks are a mainstay of artificial intelligence. These machine-learning

More information

Transactions on Information and Communications Technologies vol 18, 1998 WIT Press, ISSN

Transactions on Information and Communications Technologies vol 18, 1998 WIT Press,   ISSN GIS in the process of road design N.C. Babic, D. Rebolj & L. Hanzic Civil Engineering Informatics Center, University ofmaribor, Faculty of Civil Engineering, Smetanova 17, 2000 Maribor, Slovenia. E-mail:

More information

Cell-based Model For GIS Generalization

Cell-based Model For GIS Generalization Cell-based Model For GIS Generalization Bo Li, Graeme G. Wilkinson & Souheil Khaddaj School of Computing & Information Systems Kingston University Penrhyn Road, Kingston upon Thames Surrey, KT1 2EE UK

More information

Avida-ED Quick Start User Manual

Avida-ED Quick Start User Manual Avida-ED Quick Start User Manual I. General Avida-ED Workspace Viewer chooser Lab Bench Freezer (A) Viewer chooser buttons Switch between lab bench views (B) Lab bench Three lab bench options: 1. Population

More information

Keywords: tropical cyclone eye fix; remote sensing data; Genetic algorithm.

Keywords: tropical cyclone eye fix; remote sensing data; Genetic algorithm. Volume 4, Issue 5, May 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Optimum Analysis

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

AP Curriculum Framework with Learning Objectives

AP Curriculum Framework with Learning Objectives Big Ideas Big Idea 1: The process of evolution drives the diversity and unity of life. AP Curriculum Framework with Learning Objectives Understanding 1.A: Change in the genetic makeup of a population over

More information

A Simple Haploid-Diploid Evolutionary Algorithm

A Simple Haploid-Diploid Evolutionary Algorithm A Simple Haploid-Diploid Evolutionary Algorithm Larry Bull Computer Science Research Centre University of the West of England, Bristol, UK larry.bull@uwe.ac.uk Abstract It has recently been suggested that

More information

Artificial Intelligence Methods (G5BAIM) - Examination

Artificial Intelligence Methods (G5BAIM) - Examination Question 1 a) According to John Koza there are five stages when planning to solve a problem using a genetic program. What are they? Give a short description of each. (b) How could you cope with division

More information

Metaheuristics and Local Search. Discrete optimization problems. Solution approaches

Metaheuristics and Local Search. Discrete optimization problems. Solution approaches Discrete Mathematics for Bioinformatics WS 07/08, G. W. Klau, 31. Januar 2008, 11:55 1 Metaheuristics and Local Search Discrete optimization problems Variables x 1,...,x n. Variable domains D 1,...,D n,

More information

Evolving a New Feature for a Working Program

Evolving a New Feature for a Working Program Evolving a New Feature for a Working Program Mike Stimpson arxiv:1104.0283v1 [cs.ne] 2 Apr 2011 January 18, 2013 Abstract A genetic programming system is created. A first fitness function f 1 is used to

More information

CRISP: Capture-Recapture Interactive Simulation Package

CRISP: Capture-Recapture Interactive Simulation Package CRISP: Capture-Recapture Interactive Simulation Package George Volichenko Carnegie Mellon University Pittsburgh, PA gvoliche@andrew.cmu.edu December 17, 2012 Contents 1 Executive Summary 1 2 Introduction

More information

Probability and Information Theory. Sargur N. Srihari

Probability and Information Theory. Sargur N. Srihari Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal

More information

6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008

6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008 MIT OpenCourseWare http://ocw.mit.edu 6.047 / 6.878 Computational Biology: Genomes, etworks, Evolution Fall 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

A A A A B B1

A A A A B B1 LEARNING OBJECTIVES FOR EACH BIG IDEA WITH ASSOCIATED SCIENCE PRACTICES AND ESSENTIAL KNOWLEDGE Learning Objectives will be the target for AP Biology exam questions Learning Objectives Sci Prac Es Knowl

More information

Evolutionary Multiobjective. Optimization Methods for the Shape Design of Industrial Electromagnetic Devices. P. Di Barba, University of Pavia, Italy

Evolutionary Multiobjective. Optimization Methods for the Shape Design of Industrial Electromagnetic Devices. P. Di Barba, University of Pavia, Italy Evolutionary Multiobjective Optimization Methods for the Shape Design of Industrial Electromagnetic Devices P. Di Barba, University of Pavia, Italy INTRODUCTION Evolutionary Multiobjective Optimization

More information

The University of Birmingham School of Computer Science MSc in Advanced Computer Science. Behvaiour of Complex Systems. Termite Mound Simulator

The University of Birmingham School of Computer Science MSc in Advanced Computer Science. Behvaiour of Complex Systems. Termite Mound Simulator The University of Birmingham School of Computer Science MSc in Advanced Computer Science Behvaiour of Complex Systems Termite Mound Simulator John S. Montgomery msc37jxm@cs.bham.ac.uk Lecturer: Dr L. Jankovic

More information

Essential knowledge 1.A.2: Natural selection

Essential knowledge 1.A.2: Natural selection Appendix C AP Biology Concepts at a Glance Big Idea 1: The process of evolution drives the diversity and unity of life. Enduring understanding 1.A: Change in the genetic makeup of a population over time

More information

The Role of Crossover in Genetic Algorithms to Solve Optimization of a Function Problem Falih Hassan

The Role of Crossover in Genetic Algorithms to Solve Optimization of a Function Problem Falih Hassan The Role of Crossover in Genetic Algorithms to Solve Optimization of a Function Problem Falih Hassan ABSTRACT The genetic algorithm is an adaptive search method that has the ability for a smart search

More information

Lecture 15: Genetic Algorithms

Lecture 15: Genetic Algorithms Lecture 15: Genetic Algorithms Dr Roman V Belavkin BIS3226 Contents 1 Combinatorial Problems 1 2 Natural Selection 2 3 Genetic Algorithms 3 31 Individuals and Population 3 32 Fitness Functions 3 33 Encoding

More information

Evolutionary Algorithms

Evolutionary Algorithms Evolutionary Algorithms a short introduction Giuseppe Narzisi Courant Institute of Mathematical Sciences New York University 31 January 2008 Outline 1 Evolution 2 Evolutionary Computation 3 Evolutionary

More information

DETECTING THE FAULT FROM SPECTROGRAMS BY USING GENETIC ALGORITHM TECHNIQUES

DETECTING THE FAULT FROM SPECTROGRAMS BY USING GENETIC ALGORITHM TECHNIQUES DETECTING THE FAULT FROM SPECTROGRAMS BY USING GENETIC ALGORITHM TECHNIQUES Amin A. E. 1, El-Geheni A. S. 2, and El-Hawary I. A **. El-Beali R. A. 3 1 Mansoura University, Textile Department 2 Prof. Dr.

More information

Theory, Concepts and Terminology

Theory, Concepts and Terminology GIS Workshop: Theory, Concepts and Terminology 1 Theory, Concepts and Terminology Suggestion: Have Maptitude with a map open on computer so that we can refer to it for specific menu and interface items.

More information

5 ProbabilisticAnalysisandRandomized Algorithms

5 ProbabilisticAnalysisandRandomized Algorithms 5 ProbabilisticAnalysisandRandomized Algorithms This chapter introduces probabilistic analysis and randomized algorithms. If you are unfamiliar with the basics of probability theory, you should read Appendix

More information

Local Search (Greedy Descent): Maintain an assignment of a value to each variable. Repeat:

Local Search (Greedy Descent): Maintain an assignment of a value to each variable. Repeat: Local Search Local Search (Greedy Descent): Maintain an assignment of a value to each variable. Repeat: I I Select a variable to change Select a new value for that variable Until a satisfying assignment

More information

Chapter 9: The Perceptron

Chapter 9: The Perceptron Chapter 9: The Perceptron 9.1 INTRODUCTION At this point in the book, we have completed all of the exercises that we are going to do with the James program. These exercises have shown that distributed

More information

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008 Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:

More information

Probability and Statistics

Probability and Statistics Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 4: IT IS ALL ABOUT DATA 4a - 1 CHAPTER 4: IT

More information

RNA evolution and Genotype to Phenotype maps

RNA evolution and Genotype to Phenotype maps RNA evolution and Genotype to Phenotype maps E.S. Colizzi November 8, 2018 Introduction Biological evolution occurs in a population because 1) different genomes can generate different reproductive success

More information

MULTIPLE CHOICE QUESTIONS DECISION SCIENCE

MULTIPLE CHOICE QUESTIONS DECISION SCIENCE MULTIPLE CHOICE QUESTIONS DECISION SCIENCE 1. Decision Science approach is a. Multi-disciplinary b. Scientific c. Intuitive 2. For analyzing a problem, decision-makers should study a. Its qualitative aspects

More information

Big Idea 1: The process of evolution drives the diversity and unity of life.

Big Idea 1: The process of evolution drives the diversity and unity of life. Big Idea 1: The process of evolution drives the diversity and unity of life. understanding 1.A: Change in the genetic makeup of a population over time is evolution. 1.A.1: Natural selection is a major

More information

Valley Central School District 944 State Route 17K Montgomery, NY Telephone Number: (845) ext Fax Number: (845)

Valley Central School District 944 State Route 17K Montgomery, NY Telephone Number: (845) ext Fax Number: (845) Valley Central School District 944 State Route 17K Montgomery, NY 12549 Telephone Number: (845)457-2400 ext. 18121 Fax Number: (845)457-4254 Advance Placement Biology Presented to the Board of Education

More information

Evolutionary Robotics

Evolutionary Robotics Evolutionary Robotics Previously on evolutionary robotics Evolving Neural Networks How do we evolve a neural network? Evolving Neural Networks How do we evolve a neural network? One option: evolve the

More information

Chapter 9. Non-Parametric Density Function Estimation

Chapter 9. Non-Parametric Density Function Estimation 9-1 Density Estimation Version 1.2 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least

More information

Evolutionary Computation: introduction

Evolutionary Computation: introduction Evolutionary Computation: introduction Dirk Thierens Universiteit Utrecht The Netherlands Dirk Thierens (Universiteit Utrecht) EC Introduction 1 / 42 What? Evolutionary Computation Evolutionary Computation

More information

M E R C E R W I N WA L K T H R O U G H

M E R C E R W I N WA L K T H R O U G H H E A L T H W E A L T H C A R E E R WA L K T H R O U G H C L I E N T S O L U T I O N S T E A M T A B L E O F C O N T E N T 1. Login to the Tool 2 2. Published reports... 7 3. Select Results Criteria...

More information

Family Trees for all grades. Learning Objectives. Materials, Resources, and Preparation

Family Trees for all grades. Learning Objectives. Materials, Resources, and Preparation page 2 Page 2 2 Introduction Family Trees for all grades Goals Discover Darwin all over Pittsburgh in 2009 with Darwin 2009: Exploration is Never Extinct. Lesson plans, including this one, are available

More information

2012 Assessment Report. Mathematics with Calculus Level 3 Statistics and Modelling Level 3

2012 Assessment Report. Mathematics with Calculus Level 3 Statistics and Modelling Level 3 National Certificate of Educational Achievement 2012 Assessment Report Mathematics with Calculus Level 3 Statistics and Modelling Level 3 90635 Differentiate functions and use derivatives to solve problems

More information

AP Biology Curriculum Framework

AP Biology Curriculum Framework AP Biology Curriculum Framework This chart correlates the College Board s Advanced Placement Biology Curriculum Framework to the corresponding chapters and Key Concept numbers in Campbell BIOLOGY IN FOCUS,

More information

Genetic Engineering and Creative Design

Genetic Engineering and Creative Design Genetic Engineering and Creative Design Background genes, genotype, phenotype, fitness Connecting genes to performance in fitness Emergent gene clusters evolved genes MIT Class 4.208 Spring 2002 Evolution

More information

chapter 12 MORE MATRIX ALGEBRA 12.1 Systems of Linear Equations GOALS

chapter 12 MORE MATRIX ALGEBRA 12.1 Systems of Linear Equations GOALS chapter MORE MATRIX ALGEBRA GOALS In Chapter we studied matrix operations and the algebra of sets and logic. We also made note of the strong resemblance of matrix algebra to elementary algebra. The reader

More information

An artificial chemical reaction optimization algorithm for. multiple-choice; knapsack problem.

An artificial chemical reaction optimization algorithm for. multiple-choice; knapsack problem. An artificial chemical reaction optimization algorithm for multiple-choice knapsack problem Tung Khac Truong 1,2, Kenli Li 1, Yuming Xu 1, Aijia Ouyang 1, and Xiaoyong Tang 1 1 College of Information Science

More information

14 Random Variables and Simulation

14 Random Variables and Simulation 14 Random Variables and Simulation In this lecture note we consider the relationship between random variables and simulation models. Random variables play two important roles in simulation models. We assume

More information

a (b + c) = a b + a c

a (b + c) = a b + a c Chapter 1 Vector spaces In the Linear Algebra I module, we encountered two kinds of vector space, namely real and complex. The real numbers and the complex numbers are both examples of an algebraic structure

More information