arxiv: v2 [cs.ne] 26 Nov 2014


How Crossover Speeds Up Building-Block Assembly in Genetic Algorithms

Dirk Sudholt
Department of Computer Science, University of Sheffield, United Kingdom

Abstract

We re-investigate a fundamental question: how effective is crossover in Genetic Algorithms in combining building blocks of good solutions? Although this has been discussed controversially for decades, we are still lacking a rigorous and intuitive answer. We provide such answers for royal road functions and ONEMAX, where every bit is a building block. For the latter we show that using crossover makes every (µ+λ) Genetic Algorithm at least twice as fast as the fastest evolutionary algorithm using only standard bit mutation, up to small-order terms and for moderate µ and λ. Crossover is beneficial because it effectively turns fitness-neutral mutations into improvements by combining the right building blocks at a later stage. Compared to mutation-based evolutionary algorithms, this makes multi-bit mutations more useful. Introducing crossover changes the optimal mutation rate on ONEMAX from $1/n$ to $(1+\sqrt{5})/2 \cdot 1/n \approx 1.618/n$. This holds both for uniform crossover and k-point crossover. Experiments and statistical tests confirm that our findings apply to a broad class of building-block functions.

Keywords: Genetic algorithms, crossover, recombination, mutation rate, runtime analysis, theory.

1 Introduction

Ever since the early days of genetic algorithms (GAs), researchers have wondered when and why crossover is an effective search operator. It has been folklore that crossover is useful if it can combine building blocks, i.e., schemata of high fitness, to give better solutions [30]. But, as Watson and Jansen [50] put it, there has been a considerable difficulty in demonstrating this rigorously and intuitively.

Many attempts at understanding crossover have been made in the past. Mitchell, Forrest, and Holland [30] presented so-called royal road functions as an example where, supposedly, genetic algorithms outperform other search algorithms due to the use of crossover.
Royal roads divide a bit string into disjoint blocks. Each block makes a positive contribution to the fitness in case all bits therein are set to 1. Blocks thus represent schemata, and all-ones configurations are building blocks of optimal solutions. However, the same authors later concluded that simple randomized hill climbers performed better than GAs [7, 3].

The role of crossover has been studied from multiple angles, including algebra [42], Markov chain models [49], infinite population models and dynamical systems (see [7, Chapter 6] for an overview), and statistical mechanics (see, e.g., [5, Chapter ]).

Also in biology the role of crossover is far from settled. In population genetics, exploring the advantages of recombination, or sexual reproduction, is a famous open question [5] and has been called "the queen of problems in evolutionary biology" by

Graham Bell [6] and others. Evolutionary processes were found to be harder to analyze than those using only asexual reproduction as they represent quadratic dynamical systems [2, 40]. Recent work in population genetics has focussed on studying the "speed of adaptation", which describes the efficiency of evolution, in a similar vein to research in evolutionary computation [52, 53]. Furthermore, a new theory of mixability has been proposed recently from the perspective of theoretical computer science [28, 29], arguing that recombination favours individuals that are good "mixers", that is, individuals that create good offspring when being recombined with others.

Several researchers recently and independently reported empirical observations that using crossover improves the performance of evolutionary algorithms (EAs) on the simple function $\text{ONEMAX}(x) = \sum_{i=1}^{n} x_i$ [24, 4], but were unable to explain why. The fact that even settings as simple as ONEMAX are not well understood demonstrates the need for a solid theory and serves as motivation for this work.

Runtime analysis has become a major area of research that can give rigorous evidence and prove theorems [35, 3, 8]. However, studies so far have eluded the most fundamental setting of building-block functions. Crossover was proven to be superior to mutation only on constructed artificial examples like $\text{Jump}_k$ [20, 23] and "Real Royal Road" functions [2, 43], the H-IFF problem [8], coloring problems inspired by the Ising model from physics [6, 44] (see footnote 1), computing unique input-output sequences for finite state machines [26], selected problems from multi-objective optimization [39], and the all-pairs shortest path problem [, 48, 34]. H-IFF [8] and the Ising model on trees [44] consist of hierarchical building blocks. But none of the above papers addresses single-level building blocks in a setting as simple as royal roads.

Watson and Jansen [50] presented a constructed building-block function and proved exponential performance gaps between EAs using only mutation and a GA.
However, the definition of the internal structure of building blocks is complicated and artificial, and they used a tailored multi-deme GA to get the necessary diversity. With regard to the question on how GAs combine building blocks, their approach does not give the intuitive explanation one is hoping for.

This paper presents such an intuitive explanation, supported by rigorous analyses. We consider royal roads and other functions composed of building blocks, such as monotone polynomials. $\text{ONEMAX}(x) = \sum_{i=1}^{n} x_i$ is a special case where every bit is a building block. We give rigorous proofs for ONEMAX and show how the main proof arguments transfer to broader classes of building-block functions. Experiments support the latter. Our main results are as follows.

1. We show in Section 3 that on ONEMAX every (µ+λ) GA with uniform crossover and standard bit mutation is at least twice as fast as every evolutionary algorithm (EA) that only uses standard bit mutations (up to small-order terms). More precisely, the dominating term in the expected number of function evaluations decreases from $e \cdot n \ln n$ to $e/2 \cdot n \ln n$. This holds provided that the parent population and offspring population sizes µ and λ are moderate, so that the inertia of a large population does not slow down exploitation. The reason for this speedup is that the GA can store a neutral mutation (a mutation not altering the parent's fitness) in the population, along with the respective parent. It can then use crossover to combine the good building blocks between these two individuals, improving the current best fitness.

Footnote 1: For bipartite graphs, the problem is equivalent to the classical Graph Coloring problem with 2 colors.

In other words, crossover can capitalize on mutations that have both beneficial and disruptive effects on building blocks.

2. The use of uniform crossover leads to a shift in the optimal mutation rate on ONEMAX. We demonstrate this in Section 4 for a simple "greedy" (2+1) GA that always selects parents among the current best individuals. While for mutation-based EAs $1/n$ is the optimal mutation rate [55], the greedy (2+1) GA has an optimal mutation rate of $(1+\sqrt{5})/2 \cdot 1/n \approx 1.618/n$ (ignoring small-order terms). This is because introducing crossover makes neutral mutations more useful and larger mutation rates increase the chance of a neutral mutation. Optimality is proved by means of a matching lower bound on the expected optimization time of the greedy (2+1) GA that applies to all mask-based crossover operators (where each bit value is taken from either parent). Using the optimal mutation rate, the expected number of function evaluations is $1.19\, n \ln n \pm O(n \log\log n)$.

3. These results are not limited to uniform crossover or the absence of linkage. Section 5 shows that the same results hold for GAs using k-point crossover, for arbitrary k, under slightly stronger conditions on µ and λ, if the crossover probability $p_c$ is set to an appropriately small value.

4. The reasoning for ONEMAX carries over to other functions with a clear building block structure. Experiments in Section 6 reveal similar performance differences as on ONEMAX for royal road functions and random polynomials with unweighted, positive coefficients. This is largely confirmed by statistical tests. There is evidence that findings also transfer to weighted building-block functions like linear functions, provided that the population can store solutions with different fitness values and different building blocks until crossover is able to combine them. This is not the case for the greedy (2+1) GA, but a simple (5+1) GA is significantly faster on random linear functions than the optimal mutation-based EA for this class of functions, the (1+1) EA [55].

The first result, the analysis for uniform crossover, is remarkably simple and intuitive.
It gives direct insight into the working principles of GAs. Its simplicity also makes it very well suited for teaching purposes.

This work extends a preliminary conference paper [46] with parts of the results, where results were restricted to one particular GA, the greedy (2+1) GA. This extended version presents a general analytical framework that applies to all (µ+λ) GAs, subject to mild conditions, and includes the greedy (2+1) GA as a special case. To this end, we provide tools for analyzing parent and offspring populations in (µ+λ) GAs, which we believe are of independent interest.

Moreover, results for k-point crossover have been improved. The leading constant in the upper bound for k-point crossover in [46] was by an additive term of $\frac{2c}{3+3c}$ larger than that for uniform crossover, for mutation rates of $c/n$. This left open the question whether k-point crossover is as effective as uniform crossover for assembling building blocks in ONEMAX. Here we provide a new and refined analysis, which gives an affirmative answer, under mild conditions on the crossover probability.

1.1 Related Work

Kötzing, Sudholt, and Theile [23] considered the search behaviour of an idealized GA on ONEMAX, to highlight the potential benefits of crossover under ideal circumstances.

If a GA was able to recombine two individuals with equal fitness that result from independent evolutionary lineages, the fitness gain can be of order $\Omega(\sqrt{n})$. The idealized GA would therefore be able to optimize ONEMAX in expected time $O(\sqrt{n})$ [23]. However, this idealization cannot reasonably be achieved in realistic EAs with common search operators, hence the result should be regarded as an academic study on the potential benefit of crossover.

A related strand of research deals with the analysis of the Simple GA on ONEMAX. The Simple GA is one of the best known and best researched GAs in the field. It uses a generational model where parents are selected using fitness-proportional selection and the generated offspring form the next population. Neumann, Oliveto and Witt [33] showed that the Simple GA without crossover with high probability cannot optimize ONEMAX in less than exponential time. The reason is that the population typically contains individuals of similar fitness, and then fitness-proportional selection is similar to uniform selection. Oliveto and Witt [38] extended this result to uniform crossover: the Simple GA with uniform crossover and population size $\mu \le n^{1/8-\varepsilon}$, $\varepsilon > 0$, still needs exponential time on ONEMAX. It even needs exponential time to reach a solution of fitness larger than $(1+c) \cdot n/2$ for an arbitrary constant $c > 0$. In [37] the same authors relaxed their condition on the population size to $\mu \le n^{1/4-\varepsilon}$. Their work does not exclude that crossover is advantageous, particularly since under the right circumstances crossover may lead to a large increase in fitness (cf. [23]). But if there is an advantage, it is not noticeable as the Simple GA with crossover still fails badly on ONEMAX.

One year after [46] was published, Doerr, Doerr, and Ebel [9] presented a groundbreaking result: they designed an EA that was proven to optimise ONEMAX (and any simple transformation thereof) in time $O(n\sqrt{\log n})$.
This is a spectacular result as all black-box search algorithms using only unbiased unary operators (operators modifying one individual only, and not exhibiting any inherent search bias) need time $\Omega(n \log n)$ as shown by Lehre and Witt [27]. So their EA shows that crossover can lower the expected running time by more than a constant factor. They call their algorithm a (1+(λ, λ)) EA: starting with one parent, it first creates λ offspring by mutation, with a random and potentially high mutation rate. Then it selects the best mutant, and crosses it λ times with the original parent, using parameterized uniform crossover (the probability of taking a bit from the first parent is not always 1/2, but a parameter of the algorithm). This leads to a number of $O(n\sqrt{\log n})$ expected function evaluations, which can be further decreased to $O(n)$ with a scheme adapting λ according to the current fitness.

The (1+(λ, λ)) EA from [9] is very cleverly designed to work efficiently on ONEMAX and similar functions. It uses a non-standard EA design because of its two phases of environmental selection. Other differences are that mutation is performed before crossover, and mutation is not fully independent for all offspring: the number of flipping bits is a random variable determined as for standard bit mutations, but the same number of flipping bits is then used in all offspring. The focus of this work is different as our goal is to understand how standard EAs operate, and how crossover can be used to speed up building-block assembly in commonly used (µ+λ) EAs.

2 Preliminaries

We measure the performance of the algorithm with respect to the number of function evaluations performed until an optimum is found, and refer to this as optimization time. For steady-state algorithms this equals the number of generations (apart from the initialization), and for EAs with offspring populations such as (µ+λ) EAs or (µ+λ) GAs the optimization time is by a factor of λ larger than the number of generations. Note

that the number of generations needed to optimize a fitness function can often be easily decreased by using offspring populations or parallel evolutionary algorithms [25]. But this significantly increases the computational effort within one generation, so the number of function evaluations is a fairer and widely used measure.

Looking at function evaluations is often motivated by the fact that this operation dominates the execution time of the algorithm. Then the number of function evaluations is a reliable measure for wall clock time. However, the wall clock time might increase when introducing crossover as an additional search operator. Also when increasing the mutation rate, more pseudo-random numbers might be required. Jansen and Zarges [22] point out a case where this effect leads to a discrepancy between the number of function evaluations and wall clock time. This concern must be taken seriously when aiming at reducing wall clock time. However, each implementation must be checked individually in this respect [22]. Therefore, we keep this concern in mind, but still use the number of function evaluations in the following.

3 Uniform Crossover Makes (µ+λ) EAs Twice as Fast

We show that, under mild conditions, every (µ+λ) GA is at least twice as fast as its counterpart without crossover. For the latter, that is, evolutionary algorithms using only standard bit mutation, the author recently proved the following lower bound on the running time of a very broad class of mutation-based EAs [47]. It covers all possible selection mechanisms, parent or offspring populations, and even parallel evolutionary algorithms. We slightly rephrase this result.

Theorem 1 (Sudholt [47]). Let $n \ge 2$. Every EA that uses only standard bit mutation with mutation rate p to create new solutions has expected optimization time at least

$$\frac{\min\{\ln n, \ln(1/(p^2 n))\} - \ln\ln n - 3}{p(1-p)^n}$$

on ONEMAX and every other function with a unique optimum, if $2^{-n/3} \le p \le 1/(\sqrt{n}\log n)$. If $p = c/n$, $c > 0$ constant, this is at least

$$\frac{e^c}{c} \cdot n \ln n \cdot (1 - o(1)).$$
In fact, for ONEMAX the author proved that among all evolutionary algorithms that start with one random solution and only use standard bit mutations the expected number of function evaluations is minimized by the simple (1+1) EA [47, Theorem 3]. Also the mutation rate $p = 1/n$ is the best possible choice for ONEMAX, leading to a lower bound of

$$e n \ln n - e n \ln\ln n - 3en.$$

For the special case of $p = 1/n$, Doerr, Fouz, and Witt [0] recently improved the above bound towards $e n \ln n - O(n)$.

We show that for a range of (µ+λ) EAs, as defined in the following, introducing uniform crossover can cut the dominant term of the running time in half, for the standard mutation rate $p = 1/n$.

The only requirement on the parent selection mechanism is that selection does not favor inferior solutions over fitter ones. Formally, for maximizing a fitness function f,

$$\forall x, y \colon\; f(x) \ge f(y) \;\Rightarrow\; \mathrm{Prob}(\text{select } x) \ge \mathrm{Prob}(\text{select } y). \qquad (1)$$

This in particular implies that equally fit solutions are selected with the same probability. Condition (1) is satisfied for all common selection mechanisms: uniform selection, fitness-proportional selection, tournament selection, cut selection, and rank-based mechanisms.

The class of (µ+λ) EAs covered in this work is defined in Algorithm 1. All (µ+λ) EAs therein create λ offspring through crossover and mutation, or just mutation, and then pick the best out of the µ previous search points and the λ new offspring.

Algorithm 1: Scheme of a (µ+λ) GA with mutation rate p and uniform crossover with crossover probability $p_c$ for maximizing $f\colon \{0,1\}^n \to \mathbb{R}$.

    Initialize population P of size µ ∈ ℕ u.a.r.
    while true do
        Let P' = ∅.
        for i = 1, ..., λ do
            With probability p_c do
                Select x_1, x_2 with an operator respecting (1).
                Let y := uniform crossover(x_1, x_2).
            otherwise do
                Select y with an operator respecting (1).
            end
            Flip each bit in y independently with probability p.
            Add y to P'.
        end
        Let P contain the µ best individuals from P ∪ P'; break ties towards
        including individuals with the fewest duplicates in P ∪ P'.
    end

In the case of ties, we pick solutions that have the fewest duplicates among the considered search points. This strategy has already been used by Jansen and Wegener [2] in their groundbreaking work on Real Royal Roads; it ensures a sufficient degree of diversity whenever the population contains different search points of the same fitness.

Before stating the main result of this section, we provide two lemmas showing how to analyse population dynamics. Both lemmas are of independent interest and may prove useful in other studies of population-based EAs.

The following lemma estimates the expected time until individuals with fitness at least i take over the whole population. It generalizes Lemma 3 in [45], which in turn goes back to Witt's analysis of the (µ+1) EA [54]. Note that the lemma applies to arbitrary fitness functions, arbitrary values for µ and λ, and arbitrary crossover operators; it merely relies on fundamental and universal properties of cut selection and standard bit mutations.

Lemma 2.
Consider any (µ+λ) GA implementing Algorithm 1, with any crossover operator, on any n-bit fitness function. Assume the current population contains at least one individual of fitness i. The expected number of function evaluations needed for the (µ+λ) GA before all individuals in its current population have fitness at least i is at most

$$O\left(\frac{(\mu+\lambda)\log\mu}{(1-p_c)(1-p)^n}\right).$$

This holds for any tie-breaking rule used in the environmental selection.

Proof. Call an individual fit if it has fitness at least i. We now estimate the expected number of generations until the population is taken over by fit individuals, which we call the expected takeover time. As fit individuals are always preferred to non-fit individuals in the environmental selection, the expected takeover time equals the expected number of generations until µ fit individuals have been created, starting with one fit individual.

For each offspring being created, there is a chance that the (µ+λ) GA will simply create a clone of a fit individual. This happens if, during the creation of an offspring, the (µ+λ) GA decides not to perform crossover, it selects a fit individual as parent to be mutated, and mutation does not flip any bit. The probability for this event is at least

$$(1-p_c)(1-p)^n \cdot \frac{\text{number of fit individuals in population}}{\mu}$$

since each fit individual is selected as parent with probability at least $1/\mu$.

Now we divide the run of the (µ+λ) GA into phases in order to get a lower bound on the number of fit individuals at certain time steps. The j-th phase, $0 \le j \le \lceil \log_5 \mu \rceil - 1$, starts with the first offspring creation in the first generation where the number of fit individuals is at least $5^j$. It ends in the first generation where this number is increased to $\min\{5^{j+1}, \mu\}$. Let $T_j$ describe the random number of generations spent in the j-th phase.

Starting with a new generation with at least $5^j$ fit individuals in the parent population, we now consider a phase of $8\mu/((1-p_c)(1-p)^n)$ offspring creations, disregarding generation bounds. Let $N_j$ denote the random number of new fit offspring created in the phase, then

$$E(N_j) \ge \frac{8\mu}{(1-p_c)(1-p)^n} \cdot (1-p_c)(1-p)^n \cdot \frac{5^j}{\mu} = 8 \cdot 5^j$$

and by classical Chernoff bounds (see, e.g., [32, Chapter 4])

$$\mathrm{Prob}(N_j < 4 \cdot 5^j) \le e^{-E(N_j)/8} \le e^{-5^j} \le e^{-1}.$$

If $N_j < 4 \cdot 5^j$ the phase is called unsuccessful and we consider another phase of $8\mu/((1-p_c)(1-p)^n)$ offspring creations. The expected waiting time for a successful phase is at most $1/(1-e^{-1})$ and the expected number of offspring creations until $N_j \ge 4 \cdot 5^j$ is at most $8\mu/((1-p_c)(1-p)^n(1-e^{-1}))$.
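Since phases start at generation bounds, we may need to account for up to λ further offspring creations in between phases. This implies

$$E(T_j) \le \frac{8\mu}{(1-p_c)(1-p)^n(1-e^{-1})} + \lambda$$

and the expected takeover time is at most

$$\sum_{j=0}^{\lceil\log_5\mu\rceil-1} E(T_j) \le \lceil\log_5\mu\rceil \cdot \left(\frac{8\mu}{(1-p_c)(1-p)^n(1-e^{-1})} + \lambda\right) = O\left(\frac{(\mu+\lambda)\log\mu}{(1-p_c)(1-p)^n}\right).$$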
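The scheme of Algorithm 1, whose population dynamics Lemma 2 describes, can be sketched in Python as follows. This is a minimal illustration and not the author's implementation: it assumes uniform parent selection (one of the mechanisms satisfying condition (1)), fixes ONEMAX as the default fitness function, stops once an all-ones string is found, and adds an evaluation budget. All function and parameter names are hypothetical.

```python
import random
from collections import Counter


def onemax(x):
    """ONEMAX: number of 1-bits in the bit string x."""
    return sum(x)


def uniform_crossover(x1, x2, rng):
    """Take each bit from either parent with probability 1/2."""
    return [a if rng.random() < 0.5 else b for a, b in zip(x1, x2)]


def mutate(y, p, rng):
    """Standard bit mutation: flip each bit independently with probability p."""
    return [1 - b if rng.random() < p else b for b in y]


def mu_plus_lambda_ga(n, mu, lam, p, p_c, rng, f=onemax, max_evals=10**6):
    """Sketch of a (mu+lambda) GA; returns (function evaluations, best fitness).

    Assumption: parents are selected uniformly at random, and the run stops
    when fitness n is reached (the optimum of ONEMAX) or the budget is spent.
    """
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(mu)]
    evals = mu
    while max(f(x) for x in pop) < n and evals < max_evals:
        offspring = []
        for _ in range(lam):
            if rng.random() < p_c:
                y = uniform_crossover(rng.choice(pop), rng.choice(pop), rng)
            else:
                y = list(rng.choice(pop))
            offspring.append(mutate(y, p, rng))
        evals += lam
        union = pop + offspring
        # Keep the mu best; among equal fitness prefer fewer duplicates.
        duplicates = Counter(tuple(x) for x in union)
        union.sort(key=lambda x: (-f(x), duplicates[tuple(x)]))
        pop = union[:mu]
    return evals, max(f(x) for x in pop)
```

For example, `mu_plus_lambda_ga(50, 2, 1, 1/50, 0.5, random.Random(1))` runs one small experiment. Averaging the returned evaluation counts over many seeds gives an empirical estimate that can be compared against the bounds in this section.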

We also provide the following simple but handy lemma, which relates success probabilities for created offspring to the expected number of function evaluations needed to complete a generation where such an event has first happened.

Lemma 3. Consider any (µ+λ) GA implementing Algorithm 1, and assume that in each offspring creation there is a probability at least q that some specific event occurs. Then the expected number of function evaluations to complete a generation where this event first occurs is at most

$$\lambda + \frac{1}{q}.$$

Proof. The expected number of trials for an event with probability q to occur is $1/q$. To complete the generation, at most λ further function evaluations are required.

Now we are able to prove the main result of this section.

Theorem 4. The expected optimization time of every (µ+λ) GA implementing Algorithm 1 with $0 < p_c < 1$ constant, mutation probability $0 < p < 1$ and $\mu \ge 2$ on ONEMAX is at most

$$\frac{\ln(n^2 p + n) + 1 + p}{p(1-p)^{n-1}(1+np)} + O\left(\frac{(\mu+\lambda)\, n \log\mu}{(1-p)^n}\right). \qquad (2)$$

If $p = c/n$, $c > 0$ constant, and $\mu, \lambda = o(\log n/\log\log n)$, this bound simplifies to

$$\frac{n \ln n}{c \cdot e^{-c} \cdot (1+c)} \cdot (1 + o(1)). \qquad (3)$$

Both statements hold for arbitrary initial populations.

The main difference between the upper bound for (µ+λ) GAs and the lower bound for all mutation-based EAs is an additional factor of $1 + pn$ in the denominator of the upper bound. This is a factor of 2 for $p = 1/n$ and an even larger gain for larger mutation rates.

For the default value of $p = 1/n$, this shows that introducing crossover makes EAs at least twice as fast as the fastest EA using only standard bit mutation. It also implies that introducing crossover makes EAs at least twice as fast as their counterparts without crossover (i.e., where $p_c = 0$).

Proof of Theorem 4. Bound (3) can be derived from (2) using $(1 - 1/x)^{x-1} \ge 1/e$ for $x > 1$ to estimate

$$\left(1 - \frac{c}{n}\right)^{n-1} \ge \left(1 - \frac{c}{n}\right)^{n} = \left(\left(1 - \frac{c}{n}\right)^{n/c - 1}\right)^{c} \cdot \left(1 - \frac{c}{n}\right)^{c} \ge e^{-c}\left(1 - \frac{c}{n}\right)^{c} = e^{-c}\,(1 - O(1/n))$$

as well as $\ln(n^2 p + n) + 1 + p = \ln(cn + n) + 1 + c/n = \ln n + O(1)$. Note that $(\mu+\lambda)\, n \log\mu = o(n \log n)$ by conditions on µ, λ, hence this and all other small-order terms are absorbed in the term $o(1)$.

In order to prove the general bound (2), we consider canonical fitness levels, i.e., the i-th fitness level contains all search points with fitness i. We estimate the time spent on each level i, i.e., when the best fitness in the current population is i. For each fitness level we consider three cases. The first case applies when the population contains individuals on fitness levels less than i. The second case is when the population only contains copies of a single individual on level i. The third case occurs when the population contains more than one individual on level i; then the population contains different "building blocks" that can be recombined effectively by crossover.

All these cases capture the typical behaviour of a (µ+λ) GA, albeit some of these cases, and even whole fitness levels, may be skipped. We obtain an upper bound on its expected optimization time by summing up expected times the (µ+λ) GA may spend in all cases and on all fitness levels.

Case i.1: The population contains an individual on level i and at least one individual on a lower fitness level. A sufficient condition for leaving this case is that all individuals in the population obtain fitness at least i. Since the (µ+λ) GA never accepts worsenings, the case is left for good. The time for all individuals reaching fitness at least i has already been estimated in Lemma 2. Applying this lemma to all fitness levels i, the overall time spent in all cases i.1 is at most

$$n \cdot O\left(\frac{(\mu+\lambda)\log\mu}{(1-p_c)(1-p)^n}\right) = O\left(\frac{(\mu+\lambda)\, n \log\mu}{(1-p)^n}\right).$$

Case i.2: The population contains µ copies of the same individual x on level i. In this case, each offspring created by the (µ+λ) GA will be a standard mutation of x. This is obvious for offspring where the (µ+λ) GA decides not to use crossover. If crossover is used, the (µ+λ) GA will pick $x_1 = x_2 = x$, create $y = x$ by crossover, and hence perform a mutation on x.

The (µ+λ) GA leaves this case for good if either a better search point is created or if it creates another search point with i ones. In the latter case we will create a population with two different individuals on level i. Note that due to the choice of the tie-breaking rule in the environmental selection, the (µ+λ) GA will always maintain at least two individuals on level i, unless an improvement with larger fitness is found.

The probability of creating a better search point in one mutation is at least $(n-i) \cdot p(1-p)^{n-1}$ as there are $n-i$ suitable 1-bit flips. The probability of creating a different search point on level i is at least $i(n-i) \cdot p^2(1-p)^{n-2}$ as it is sufficient to flip one of i 1-bits, to flip one of $n-i$ 0-bits, and not to flip any other bit. The probability of either event happening in one offspring creation is thus at least

$$(n-i) \cdot p(1-p)^{n-1} + i(n-i) \cdot p^2(1-p)^{n-2} \ge p(1-p)^{n-1}(n-i)(1+ip).$$
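By Lemma 3, the expected number of function evaluations in Case i.2 is at most

$$\lambda + \frac{1}{p(1-p)^{n-1}(n-i)(1+ip)}.$$

The expected number of function evaluations made in all cases i.2 is hence at most

$$\sum_{i=0}^{n-1}\left(\lambda + \frac{1}{p(1-p)^{n-1}(n-i)(1+ip)}\right) = \lambda n + \frac{1}{p(1-p)^{n-1}} \sum_{i=0}^{n-1} \frac{1}{(n-i)(1+ip)}. \qquad (4)$$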

The last sum can be estimated as follows. Separating a boundary summand and bounding the remaining sum by an integral,

$$\sum_{i=0}^{n-1} \frac{1}{(n-i)(1+ip)} \le \int_{0}^{n-1} \frac{1}{(n-i)(1+ip)}\,\mathrm{d}i + \frac{1+p}{1+pn}.$$

We use a standard integral identity from [] to simplify the integral and get

$$\left[\frac{1}{1+pn}\ln\left(\frac{1+ip}{n-i}\right)\right]_{0}^{n-1} + \frac{1+p}{1+pn} = \frac{\ln(pn+1-p) + \ln n + 1 + p}{1+pn} \le \frac{\ln(n^2 p + n) + 1 + p}{1+pn}.$$

Plugging this into (4) yields that the expected time in all cases i.2 is at most

$$\lambda n + \frac{\ln(n^2 p + n) + 1 + p}{p(1-p)^{n-1}(1+np)}.$$

Case i.3: The population only contains individuals on level i, not all of which are identical. In this case we can rely on crossover recombining two different individuals on level i. As they both have different "building blocks", i.e., different bits are set to 1, there is a good chance that crossover will generate an offspring with a higher number of 1-bits.

The probability of performing a crossover with two different parents in one offspring creation is at least

$$p_c \cdot \frac{\mu-1}{\mu^2}$$

as in the worst case the population contains $\mu - 1$ copies of one particular individual.

Assuming two different parents are selected for crossover, let these have Hamming distance 2d and let X denote the number of 1-bits among these 2d positions in the offspring. Note that X is binomially distributed with parameters 2d and 1/2 and its expectation is d. We estimate the probability of getting a surplus of 1-bits as this leads to an improvement in fitness. This estimate holds for any $d \in \mathbb{N}$. Since $\mathrm{Prob}(X < d) = \mathrm{Prob}(X > d)$, we have

$$\mathrm{Prob}(X > d) = \frac{1}{2}\bigl(1 - \mathrm{Prob}(X = d)\bigr) = \frac{1}{2}\left(1 - 2^{-2d}\binom{2d}{d}\right) \ge \frac{1}{4}.$$

Mutation keeps all 1-bits with probability at least $(1-p)^n$. Together, the probability of increasing the current best fitness in one offspring creation is at least

$$p_c \cdot \frac{\mu-1}{\mu^2} \cdot \frac{(1-p)^n}{4}.$$

By Lemma 3, the expected number of function evaluations in Case i.3 is at most

$$\lambda + \frac{4\mu^2}{p_c(\mu-1)(1-p)^n}.$$
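The crossover estimate in Case i.3 rests on the claim that a Binomial(2d, 1/2) variable X exceeds its mean d with probability at least 1/4. The following Python sketch checks this claim exactly with rational arithmetic, under the assumption that uniform crossover sets each of the 2d differing positions to 1 independently with probability 1/2. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def prob_surplus(d):
    """Exact Prob(X > d) for X ~ Binomial(2d, 1/2), summed term by term."""
    favourable = sum(comb(2 * d, k) for k in range(d + 1, 2 * d + 1))
    return Fraction(favourable, 4**d)


def prob_surplus_by_symmetry(d):
    """The same probability via symmetry: (1 - Prob(X = d)) / 2."""
    return (1 - Fraction(comb(2 * d, d), 4**d)) / 2
```

The worst case is d = 1, where the two expressions both give exactly 1/4; for larger d the probability grows towards 1/2, so the factor 1/4 in the proof is a safe lower bound for all Hamming distances.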

The total expected time spent in all cases i.3 is hence at most

$$\lambda n + \frac{4\mu^2 n}{p_c(\mu-1)(1-p)^n} = \lambda n + \frac{O(\mu n)}{(1-p)^n}$$

as $p_c = \Omega(1)$. Summing up all expected times yields a total time bound of

$$\frac{\ln(n^2 p + n) + 1 + p}{p(1-p)^{n-1}(1+np)} + \frac{O(\mu n) + O((\mu+\lambda)\, n \log\mu)}{(1-p)^n} + 2\lambda n = \frac{\ln(n^2 p + n) + 1 + p}{p(1-p)^{n-1}(1+np)} + O\left(\frac{(\mu+\lambda)\, n \log\mu}{(1-p)^n}\right).$$

Remark 1 (On conditions for µ and λ). The second statement of Theorem 4 requires $\mu, \lambda = o(\log n/\log\log n)$ in order to establish the upper bound in (3). This condition seems necessary as for larger values of µ and λ the inertia of a large population slows down exploitation, at least in the absence of crossover. Note that not all EAs covered by Theorem 4 (after removing crossover) optimize ONEMAX in time $O(n \log n)$. Witt [54] showed that a (µ+1) EA with uniform parent selection has an expected optimization time of $\Omega(\mu n + n \log n)$ on ONEMAX. For $\mu = \omega(\log n)$, this lower bound is $\omega(n \log n)$. Jansen, De Jong, and Wegener [9] showed that a (1+λ) EA needs time $\omega(n \log n)$ on ONEMAX if $\lambda = \omega((\log n)(\log\log n)/(\log\log\log n))$. Badkobeh, Lehre, and Sudholt [4] showed that every black-box algorithm creating λ offspring, using only standard bit mutation or other unary unbiased operators, needs time $\omega(n \log n)$ on ONEMAX for $\lambda = \omega((\log n)(\log\log n))$. This indicates that the threshold in the condition $\mu, \lambda = o(\log n/\log\log n)$ is tight up to polynomials of $\log\log n$.

Remark 2 (On conditions for $p_c$). Theorem 4 assumes $0 < p_c < 1$ constant, which reflects the most common choices in applications of EAs. The theorem can be extended towards smaller or larger values as follows. If $p_c = o(1)$ the upper bound on the time spent in Cases i.3 increases as it contains a factor of $1/p_c$. The other cases remain unaffected, and if $(\mu+\lambda)\log\mu/p_c = o(\log n)$ we still get the upper bound from (3). For high crossover probabilities, that is, $p_c = 1 - o(1)$ or $p_c = 1$, only Cases i.1 need to be revisited.
The time in those cases was derived from Lemma 2, which can be adapted as follows: the probability for increasing the number of fit individuals is at least

$$p_c(1-p)^n \cdot \frac{(\text{number of fit individuals in population})^2}{2\mu^2}$$

as it suffices to select two fit individuals and generate an average or above-average number of 1-bits in the offspring, which happens with probability at least 1/2. The time bound from Lemma 2 then becomes

$$O\left(\frac{\mu^2 + \lambda\log\mu}{(1-p)^n}\right)$$

and the time bound in Theorem 4 becomes

$$\frac{\ln(n^2 p + n) + 1 + p}{p(1-p)^{n-1}(1+np)} + O\left(\frac{(\mu^2 + \lambda\log\mu)\, n}{(1-p)^n}\right).$$

For $p = c/n$, $c > 0$ constant, and $\mu, \lambda = o(\sqrt{\log n})$, this also establishes the upper bound from (3).
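The estimate of the sum from Cases i.2 in the proof of Theorem 4 can be sanity-checked numerically. The Python sketch below assumes the bound in the form $\sum_{i=0}^{n-1} \frac{1}{(n-i)(1+ip)} \le \frac{\ln(n^2 p + n) + 1 + p}{1+pn}$, which is the numerator appearing in (2) as reconstructed here. The function names are hypothetical, and a numerical check of this kind illustrates the inequality but does not replace the proof.

```python
import math


def case_i2_sum(n, p):
    """Exact value of the sum over all fitness levels i = 0, ..., n-1."""
    return sum(1.0 / ((n - i) * (1 + i * p)) for i in range(n))


def case_i2_bound(n, p):
    """Closed-form upper bound (ln(n^2 p + n) + 1 + p) / (1 + p n)."""
    return (math.log(n * n * p + n) + 1 + p) / (1 + p * n)


def slack(n, c):
    """Bound minus sum for mutation rate p = c/n (non-negative if the bound holds)."""
    p = c / n
    return case_i2_bound(n, p) - case_i2_sum(n, p)
```

Calling `slack(n, c)` for a range of n and c shows how much is lost in the estimate. The slack stays bounded by a constant divided by $1 + c$, which is why it only contributes to the small-order terms of (3).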

It is remarkable that the waiting time for successful crossovers in Cases i.3 is only of order $O(\mu+\lambda)$. For small values of µ and λ, e.g. $\mu, \lambda = O(1)$, the time spent in all Cases i.3 is $O(n)$, which is negligible compared to the overall time bound of order $\Theta(n \log n)$. This shows how effective crossover is in recombining building blocks.

Also note that the proof of Theorem 4 is relatively simple as it uses only elementary arguments and, along with Lemmas 2 and 3, it is fully self-contained. The analysis therefore lends itself to teaching purposes on the behavior of evolutionary algorithms and the benefits of crossover.

Our analysis has revealed that fitness-neutral mutations, that is, mutations creating a different search point of the same fitness, can help to escape from the case of a population with identical individuals. Even though these mutations do not immediately yield an improvement in terms of fitness, they increase the diversity in the population. Crossover is very efficient in exploiting this gained diversity by combining two different search points at a later stage. This means that crossover can capitalize on mutations that have both beneficial and disruptive effects on building blocks.

An interesting consequence is that this affects the optimal mutation rate on ONEMAX. For EAs using only standard bit mutations Witt [55] recently proved that $1/n$ is the optimal mutation rate for the (1+1) EA on all linear functions. Recall that the (1+1) EA is the optimal mutation-based EA (in the sense of Theorem 1) on ONEMAX [47]. For mutation-based EAs on ONEMAX neutral mutations are neither helpful nor detrimental. With crossover neutral mutations now become helpful. Increasing the mutation rate increases the likelihood of neutral mutations. In fact, we can easily derive better upper bounds from Theorem 4 for slightly larger mutation rates, thanks to the additional term $1 + pn$ in the denominator of the upper bound. The dominant term in (3),

$$\frac{n \ln n}{c \cdot e^{-c} \cdot (1+c)},$$

is minimized for c being the golden ratio $c = (\sqrt{5}+1)/2 \approx 1.618$. This leads to the following.

Corollary 5.
The asymptotically best running time bound from Theorem 4 is obtained for $p = (1+\sqrt{5})/(2n)$. For this choice the dominant term in (3) becomes

$$\frac{e^{(\sqrt{5}+1)/2}}{\sqrt{5}+2} \cdot n \ln n \approx 1.19\, n \ln n.$$

4 The Optimal Mutation Rate

Corollary 5 gives the mutation rate that yields the best upper bound on the running time that can be obtained with the proof of Theorem 4. However, it does not establish that this mutation rate is indeed optimal for any GA. After all, we cannot exclude that another mutation rate leads to a smaller expected optimization time.

In the following, we show for a simple (2+1) GA (Algorithm 2) that the upper bound from Theorem 4 is indeed tight up to small-order terms, which establishes $p = (1+\sqrt{5})/(2n)$ as the optimal mutation rate for that (2+1) GA. Proving lower bounds on expected optimization times is often a notoriously hard task, hence we restrict ourselves to a simple "bare-bones" GA that captures the characteristics of GAs covered by Theorem 4 and is easy to analyze. The latter is achieved by fixing as many parameters as possible.
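The choice of the golden ratio in Corollary 5 can be checked with a few lines of Python. The sketch below evaluates the leading constant $e^c/(c(1+c))$ of the dominant term in (3) for a mutation rate $p = c/n$. Setting the derivative of $c - \ln c - \ln(1+c)$ to zero gives $1 - 1/c - 1/(1+c) = 0$, that is, $c^2 - c - 1 = 0$, whose positive root is the golden ratio. The function name and the grid search are illustrative choices and not part of the original analysis.

```python
import math


def leading_constant(c):
    """Constant in front of n ln n in bound (3) for mutation rate p = c/n."""
    return math.exp(c) / (c * (1 + c))


GOLDEN_RATIO = (1 + math.sqrt(5)) / 2


def grid_minimizer(lo=0.1, hi=5.0, steps=49000):
    """Coarse grid search for the c minimizing the leading constant."""
    grid = [lo + k * (hi - lo) / steps for k in range(steps + 1)]
    return min(grid, key=leading_constant)
```

With the standard rate c = 1 the constant is e/2, roughly 1.36, while the golden ratio gives about 1.19, matching the value stated in Corollary 5.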

As the upper bound from Theorem 4 grows with µ and λ, we pick the smallest possible values: $\mu = 2$ and $\lambda = 1$. The parent selection is made as simple as possible: we select parents uniformly at random from the current best individuals in the population. In other words, if we define the parent population as the set of individuals that have a positive probability to be chosen as parents, the parent population only contains individuals of the current best fitness. We call this parent selection greedy because it is a greedy strategy to choose the current best search points as parents.

In the context of the proof of Theorem 4 greedy parent selection implies that Cases i.1 are never reached as the parent population never spans more than one fitness level. So the time spent in these cases is 0. This also allows us to eliminate one further parameter by setting $p_c = 1$, as lower values for $p_c$ were only beneficial in Cases i.1. Setting $p_c = 1$ minimizes our estimate for the time spent in Cases i.3. So Theorem 4 extends towards this GA (see also Remark 2).

We call the resulting GA "greedy (2+1) GA" because its main characteristic is the greedy parent selection. The greedy (2+1) GA is defined in Algorithm 2 (see footnote 2).

Algorithm 2: Greedy (2+1) GA with mutation rate p for maximizing $f\colon \{0,1\}^n \to \mathbb{R}$.

    Initialize population P of size 2 u.a.r.
    while true do
        Select x_1, x_2 u.a.r. from {x ∈ P | ∀y ∈ P: f(x) ≥ f(y)}.
        Let y := crossover(x_1, x_2).
        Flip each bit in y independently with probability p.
        Let P contain the 2 best individuals from P ∪ {y}; break ties towards
        including individuals with the fewest duplicates in P ∪ {y}.
    end

The following result applies to the greedy (2+1) GA using any kind of mask-based crossover. A mask-based crossover is a recombination operator where each bit value is taken from either parent; that is, it is not possible to introduce a bit value which is not represented in any parent. All common crossovers are mask-based crossovers: uniform crossover, including parameterized uniform crossover, as well as k-point crossovers for any k.
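Algorithm 2 and the notion of a mask-based crossover can be sketched in Python as follows. This is an illustrative sketch rather than the author's code: it fixes ONEMAX as the fitness function, stops when the all-ones string is found or an evaluation budget is exhausted, and uses uniform crossover as the default mask-based operator. All names are hypothetical.

```python
import random
from collections import Counter


def onemax(x):
    """ONEMAX: number of 1-bits in the bit string x."""
    return sum(x)


def uniform_crossover(x1, x2, rng):
    """Mask-based crossover: every bit is copied from one of the two parents."""
    return [a if rng.random() < 0.5 else b for a, b in zip(x1, x2)]


def greedy_2_plus_1_ga(n, p, rng, crossover=uniform_crossover, max_evals=10**6):
    """Sketch of the greedy (2+1) GA on ONEMAX; returns function evaluations."""
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(2)]
    evals = 2
    while max(onemax(x) for x in pop) < n and evals < max_evals:
        best = max(onemax(x) for x in pop)
        # Greedy parent selection: only current best individuals are parents.
        parents = [x for x in pop if onemax(x) == best]
        y = crossover(rng.choice(parents), rng.choice(parents), rng)
        y = [1 - b if rng.random() < p else b for b in y]
        evals += 1
        union = pop + [y]
        # Keep the 2 best; among equal fitness prefer fewer duplicates.
        duplicates = Counter(tuple(x) for x in union)
        union.sort(key=lambda x: (-onemax(x), duplicates[tuple(x)]))
        pop = union[:2]
    return evals
```

Passing a different `crossover` function, for instance a k-point crossover or a bit-wise OR, gives other members of the class of mask-based operators. Comparing average evaluation counts for `p = 1/n` and `p = 1.618/n` over many seeds is one way to observe the shift in the optimal mutation rate empirically.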
The following result even includes biased operators like a bit-wise OR, which induces a tendency to increase the number of 1-bits.

Theorem 6. Consider the greedy (2+1) GA with mutation rate $0 < p \le 1/(\sqrt{n} \log n)$ using an arbitrary mask-based crossover operator. Its expected optimization time on ONEMAX is at least

$$\frac{\min\{\ln n, \ln(1/(p^2 n))\} - O(\log \log n)}{p (1-p)^n \cdot \left(1 + \max_k \left\{\frac{(pn)^k}{k!\,k!}\right\}\right)}.$$

Before giving the proof, note that for $p = c/n$ with $0 < c \le 4$ constant, $\max_k \left\{\frac{(pn)^k}{k!\,k!}\right\} = pn$, as for $0 < pn \le 4$ and $i \in \mathbb{N}$

$$\frac{(pn)^{i+1}}{(i+1)!\,(i+1)!} = \frac{pn}{(i+1)^2} \cdot \frac{(pn)^i}{i!\,i!} \le \frac{(pn)^i}{i!\,i!},$$

hence a maximum is attained for k = 1. Then the lower bound from Theorem 6 is

$$\frac{n \ln n}{c e^{-c} (1 + c)} - O(n \log \log n).$$

Footnote 2. Note that in [46] the greedy (2+1) GA was defined slightly differently as there duplicate genotypes are always rejected. Algorithm 2 is equivalent to the greedy (2+1) GA from [46] for the following reasons. If the current population contains two different individuals of equal fitness and a duplicate of one of the parents is created, both algorithms reject a duplicate genotype. If the population contains two individuals of different fitness, both behave as if the population only contained the fitter individual.

This matches the upper bound (3) up to small-order terms, showing for the greedy (2+1) GA that the new term $1 + c$ in the denominator of the bound from Theorem 4 was not a coincidence. For $p > 4/n$, the lower bound is at least $(e + \Omega(1))\, n \ln n$. Together, this establishes the optimal mutation rate for the greedy (2+1) GA on ONEMAX.

Theorem 7. For the greedy (2+1) GA with uniform crossover on ONEMAX, mutation rate $p = (1+\sqrt{5})/(2n)$ minimizes the expected number of function evaluations, up to small-order terms.

For the proof of Theorem 6 we use the following lower-bound technique based on fitness levels by the author [47].

Theorem 8 (Sudholt [47]). Consider a partition of the search space into non-empty sets $A_1, \dots, A_m$. For a search algorithm A we say that it is in $A_i$ or on level i if the best individual created so far is in $A_i$. If there are $\chi$, $u_i$, $\gamma_{i,j}$ for $i < j$ where
1. the probability of traversing from level i to level j in one step is at most $u_i \cdot \gamma_{i,j}$ for all $i < j$,
2. $\sum_{j=i+1}^{m} \gamma_{i,j} = 1$ for all i, and
3. $\gamma_{i,j} \ge \chi \sum_{k=j}^{m} \gamma_{i,k}$ for all $i < j$ and some $0 \le \chi \le 1$,
then the expected hitting time of $A_m$ is at least

$$\sum_{i=1}^{m-1} \mathrm{Prob}(A \text{ starts in } A_i) \cdot \chi \sum_{j=i}^{m-1} \frac{1}{u_j}. \qquad (5)$$

Proof of Theorem 6. We prove a lower bound for the following sped-up GA instead of the original greedy (2+1) GA. Whenever it creates a new offspring with the same fitness, but a different bit string as the current best individual, we assume the following. The algorithm automatically performs a crossover between the two. Also, we assume that this crossover leads to the best possible offspring in the sense that all bits where both parents differ are set to 1, i.e., the algorithm performs a bit-wise OR. That is, if both search points have i 1-bits and Hamming distance 2k, then the resulting offspring has i + k 1-bits.

Due to our assumptions, at the end of each generation there is always a single best individual. For this reason we can model the algorithm by a Markov chain representing the current best fitness. The analysis follows a lower bound for EAs on ONEMAX [47, Theorem 9].
As in [47] we consider the following fitness-level partition that focuses only on the very last fitness values. Let $\ell = n - \min\{n/\log n,\ 1/(p^2 n \log n)\}$. Let $A_i = \{x \mid |x|_1 = i\}$ for $i > \ell$ and let $A_\ell$ contain all remaining search points. We know from [47] that the GA is initialized in $A_\ell$ with probability at least $1 - 1/\log n$ if n is large enough.

The probability $p_{i,i+k}$ that the sped-up GA makes a transition from fitness i to fitness i + k equals

$$p_{i,i+k} = \mathrm{Prob}(k \text{ more 0-bits than 1-bits flip}) + \mathrm{Prob}(k \text{ 0-bits and } k \text{ 1-bits flip}).$$

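According to [47, Lemma 2], for the considered fitness levels $i > \ell$ the former probability is bounded by

$$p^k (1-p)^{n-k} \cdot \frac{(n-i)^k}{k!} \cdot \left(1 + \frac{3}{5} \cdot \frac{i(n-i)p^2}{(1-p)^2}\right).$$

The latter probability is bounded by

$$\mathrm{Prob}(k \text{ 0-bits flip}) \cdot \mathrm{Prob}(k \text{ 1-bits flip}) \le \frac{(n-i)^k}{k!}\, p^k (1-p)^{n-i-k} \cdot \frac{i^k}{k!}\, p^k (1-p)^{i-k} \le \frac{(n-i)^k}{k!} \cdot p^k (1-p)^{n} \cdot \frac{(np)^k (1-p)^{-2k}}{k!}.$$

Together, $p_{i,i+k}$ is at most

$$(p(n-i))^k (1-p)^n \left(1 + \frac{3\, i(n-i)p^2}{5 (1-p)^{2+k}} + \frac{(np)^k (1-p)^{-2k}}{k!\,k!}\right).$$

We need to find variables $u_i$ and $\gamma_{i,i+k}$ along with some $0 \le \chi \le 1$ such that all conditions of Theorem 8 are fulfilled. Define

$$u'_i := p (1-p)^n (n-i) \left(1 + \frac{3}{5} \cdot \frac{i(n-i)p^2}{(1-p)^3} + \max_k \frac{(np)^k (1-p)^{-2}}{k!\,k!}\right) \quad \text{and} \quad \gamma'_{i,i+k} := \left(\frac{p(n-i)}{(1-p)^2}\right)^{k-1}.$$

Observe that, for every $k \in \mathbb{N}$,

$$u'_i \gamma'_{i,i+k} \ge p^k (1-p)^n (n-i)^k \left(1 + \frac{3}{5} \cdot \frac{i(n-i)p^2}{(1-p)^{1+2k}} + \frac{(pn)^k (1-p)^{-2k}}{k!\,k!}\right) \ge p^k (1-p)^n (n-i)^k \left(1 + \frac{3\, i(n-i)p^2}{5 (1-p)^{2+k}} + \frac{(np)^k (1-p)^{-2k}}{k!\,k!}\right) \ge p_{i,i+k}.$$

In order to fulfill the second condition in Theorem 8, we consider the following normalized variables:

$$u_i := u'_i \sum_{j=i+1}^{n} \gamma'_{i,j} \quad \text{and} \quad \gamma_{i,j} := \frac{\gamma'_{i,j}}{\sum_{j=i+1}^{n} \gamma'_{i,j}}.$$

As $u_i \gamma_{i,j} = u'_i \gamma'_{i,j} \ge p_{i,j}$, this proves the first condition of Theorem 8.

Following the proof of Theorem 9 in [47], it is easy to show that for $\chi := 1 - \frac{1}{(1-p)^2 \log n}$ we get $\gamma'_{i,j} \ge \chi \sum_{k=j}^{n} \gamma'_{i,k}$ for all i, j with $j > i$ (the calculations in [47] carry over by replacing $1 - p$ with $(1-p)^2$). This establishes the third and last condition. As $\gamma'_{i,j} \ge \chi \sum_{k=j}^{n} \gamma'_{i,k}$ is equivalent to $\gamma_{i,j} \ge \chi \sum_{k=j}^{n} \gamma_{i,k}$, we get

$$\sum_{j=i+1}^{n} \gamma'_{i,j} \le \frac{\gamma'_{i,i+1}}{\chi} = \frac{1}{\chi},$$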

which implies, using $(n-i)\,i\,p^2 \le (n-\ell)\,n\,p^2 \le 1/\log n$ [47, (2)] as well as $1 + x \le 1/(1-x)$ for $x < 1$,

$$u_i \le \frac{p (1-p)^n (n-i)}{\chi} \left(1 + \frac{3}{5} \cdot \frac{i(n-i)p^2}{(1-p)^3} + \max_k \frac{(np)^k (1-p)^{-2}}{k!\,k!}\right) \le \frac{p (1-p)^{n-3} (n-i)}{\chi} \left(1 + \frac{3}{5 \log n} + \max_k \frac{(np)^k}{k!\,k!}\right) \le \frac{p (1-p)^{n-3} (n-i)}{\chi} \left(1 + \frac{3}{5 \log n}\right) \left(1 + \max_k \frac{(np)^k}{k!\,k!}\right).$$

Invoking Theorem 8 and recalling that the first fitness level is reached with probability at least $1 - 1/\log n$, we get a lower bound of

$$\left(1 - \frac{1}{\log n}\right) \chi \sum_{i=\ell}^{n-1} \frac{1}{u_i} \ge \frac{\left(1 - \frac{1}{\log n}\right) \chi^2 (1-p)^3}{\left(1 + \frac{3}{5 \log n}\right) \left(1 + \max_k \frac{(np)^k}{k!\,k!}\right)} \sum_{i=\ell}^{n-1} \frac{1}{p (1-p)^n (n-i)} \ge \frac{1 - O(1/\log n)}{1 + \max_k \frac{(np)^k}{k!\,k!}} \sum_{i=\ell}^{n-1} \frac{1}{p (1-p)^n (n-i)},$$

where in the last step we used that all factors $\chi$, $\left(1 + \frac{3}{5 \log n}\right)^{-1}$, and $(1-p)^3$ are $1 - O(1/\log n)$ and $(1 - c/\log n)^d \ge 1 - cd/\log n$ for any positive constants c, d. Bounding $\sum_{i=\ell}^{n-1} \frac{1}{n-i} \ge \ln(\min\{n, 1/(p^2 n)\}) - \ln \log n$ as in [47] and absorbing all small-order terms in the $-O(\log \log n)$ term from the statement gives the claimed bound.

We also ran experiments to see whether the outcome matches our inspection of the dominating terms in the running time bounds for realistic problem dimensions. We chose n = 1000 bits and recorded the average optimization time over 1000 runs. The mutation rate p was set to c/n with $c \in \{0.1, 0.2, \dots, 4\}$. The result is shown in Figure 1. One can see that for every mutation rate the greedy (2+1) GA has a lower average optimization time than the (1+1) EA. As predicted, the performance difference becomes larger as the mutation rate increases. The optimal mutation rates for both algorithms match with minimal average optimization times. Note that the variance and standard deviation were also much lower for the GA at higher mutation rates. Preliminary runs for n = 100 and n = 10000 bits gave very similar results. More experiments and statistical tests are given in Section 6.

5 k-Point Crossover

The k-point crossover operator picks k cutting points from $\{1, \dots, n-1\}$ uniformly at random without replacement. These cutting points divide both parents into segments that are then assembled from alternating parents. That is, for parents x, y and cutting

points $\ell_1 < \ell_2 < \dots < \ell_k$ the offspring will be

$$x_1 \dots x_{\ell_1}\ y_{\ell_1+1} \dots y_{\ell_2}\ x_{\ell_2+1} \dots x_{\ell_3}\ y_{\ell_3+1} \dots y_{\ell_4} \dots,$$

the suffix being $y_{\ell_k+1} \dots y_n$ if k is odd and $x_{\ell_k+1} \dots x_n$ if k is even.

[Figure 1 plot: number of evaluations against the mutation rate; series: (1+1) EA and (2+1) GA + uniform.]
Figure 1: Average optimization times for the (1+1) EA and the greedy (2+1) GA with uniform crossover on ONEMAX with n = 1000 bits. The mutation rate p is set to c/n with $c \in \{0.1, 0.2, \dots, 4\}$. The thin lines show mean ± standard deviation.

For uniform crossover we have seen that populations containing different search points of equal fitness are beneficial as uniform crossover can easily combine the good building blocks. This holds regardless of the Hamming distance between these different individuals, and the position of bits where individuals differ.

The (µ+λ) GA with k-point crossover is harder to analyse as there the probability of crossover creating an improvement depends on the Hamming distance of parents and the position of differing bits. Consider parents that differ in two bits, where these bit positions are quite close. Then 1-point crossover has a high probability of taking both bits from the same parent. In order to recombine those building blocks, the cutting point has to be chosen in between the two bit positions. A similar effect occurs for 2-point crossover also if the two bit positions are on opposite ends of the bit string.

The following lemma gives a lower bound on the probability that k-point crossover combines the right building blocks on ONEMAX, if two parents are equally fit and differ in two bits. The lemma and its proof may be of independent interest.

Lemma 9. Consider two search points x, y with $x_i = 1$, $x_{i+d} = 0$, $y_i = 0$, $y_{i+d} = 1$ for $1 \le i < i + d \le n$ and $x_s = y_s$ for $s \notin \{i, i+d\}$. The probability of k-point crossover of x and y, for any $k \le N$, where $N := n - 1 \ge 4$ is the number of possible cutting points, creating an offspring with a larger number of 1-bits is at least

$$\frac{d(N-d)}{N(N-1)}$$

and exactly d/N for k = 1.
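The k-point crossover operator described in this section can be sketched as follows. The function name and the use of Python lists for bit strings are assumptions of this illustration. A cutting point a splits a parent after its a-th bit, so the prefix is always taken from x.

```python
import random


def k_point_crossover(x, y, k, rng):
    """k-point crossover: pick k distinct cutting points from {1, ..., n-1}
    and assemble the offspring from alternating parents, starting with x."""
    n = len(x)
    cuts = sorted(rng.sample(range(1, n), k))
    child = []
    take_from_x = True
    prev = 0
    for cut in cuts + [n]:
        source = x if take_from_x else y
        child.extend(source[prev:cut])  # copy the segment (prev, cut]
        take_from_x = not take_from_x
        prev = cut
    return child
```

With `x = [0] * 10` and `y = [1] * 10`, every cutting point shows up as a switch between 0s and 1s in the offspring, which makes the alternation easy to inspect by eye.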

Proof. We identify cutting points with bits such that cutting point a results in two strings $x_1 \dots x_a$ and $x_{a+1} \dots x_n$. We say that a cutting point a separates i and i + d if $a \in \{i, \dots, i+d-1\}$. Note that the prefix is always taken from x. The claim now follows from showing that the number of separating cutting points is odd with the claimed probability.

Let $X_{N,d,k}$ be the random variable that describes the number of cutting points separating i and i + d. This variable follows a hypergeometric distribution Hyp(N, d, k), illustrated by the following urn model with red and white balls. The urn contains N balls, d of which are red. We draw k balls uniformly at random, without replacement. Then $X_{N,d,k}$ describes the number of red balls drawn. We define the probability of $X_{N,d,k}$ being odd, for $d \le N$ and $k \le N$, as

$$P(N, d, k) := \sum_{x=1,\ x \text{ odd}}^{k} \mathrm{Prob}(X_{N,d,k} = x) = \sum_{x=1,\ x \text{ odd}}^{k} \frac{\binom{d}{x} \binom{N-d}{k-x}}{\binom{N}{k}}.$$

Note that for k = 1 and for k = 2

$$P(N, d, 1) = \frac{\binom{d}{1} \binom{N-d}{0}}{\binom{N}{1}} = \frac{d}{N}, \qquad P(N, d, 2) = \frac{\binom{d}{1} \binom{N-d}{1}}{\binom{N}{2}} = \frac{2d(N-d)}{N(N-1)}.$$

For all $d \le N$ and all $k \le N$ the following recurrence holds. Imagine drawing the first cutting point separately. With probability d/N, the cutting point is a separating cutting point, and then we need an even number of further separating cutting points among the remaining k − 1 cutting points, drawn from a random variable $X_{N-1,d-1,k-1}$. With the remaining probability (N − d)/N, the number of remaining separating cutting points must be odd, and this number is drawn from a random variable $X_{N-1,d,k-1}$. Hence

$$P(N, d, k) = \frac{d}{N} \left(1 - P(N-1, d-1, k-1)\right) + \frac{N-d}{N}\, P(N-1, d, k-1). \qquad (6)$$

Assume for an induction that for all $2 \le k' < k$

$$\frac{d(N-d)}{N(N-1)} \le P(N, d, k') \le 1 - \frac{d(N-d)}{N(N-1)}. \qquad (7)$$

This is true for k = 2 as, using $3d(N-d) \le 3(N/2)^2 \le N(N-1)$ for $N \ge 4$,

$$P(N, d, 2) = \frac{2d(N-d)}{N(N-1)} = \frac{3d(N-d)}{N(N-1)} - \frac{d(N-d)}{N(N-1)} \le 1 - \frac{d(N-d)}{N(N-1)}.$$

For k > 2, combining (6) and (7) yields

$$P(N, d, k) = \frac{d}{N} \left(1 - P(N-1, d-1, k-1)\right) + \frac{N-d}{N}\, P(N-1, d, k-1) \ge \frac{d}{N} \cdot \frac{(d-1)(N-d)}{(N-1)(N-2)} + \frac{N-d}{N} \cdot \frac{d(N-d-1)}{(N-1)(N-2)} = \frac{d(N-d)(d-1+N-d-1)}{N(N-1)(N-2)} = \frac{d(N-d)}{N(N-1)}.$$

The upper bound follows similarly:

$$P(N, d, k) \le \frac{d}{N} \left(1 - \frac{(d-1)(N-d)}{(N-1)(N-2)}\right) + \frac{N-d}{N} \left(1 - \frac{d(N-d-1)}{(N-1)(N-2)}\right) = 1 - \frac{d(N-d)(d-1+N-d-1)}{N(N-1)(N-2)} = 1 - \frac{d(N-d)}{N(N-1)}.$$

By induction, the claim follows.

In the setting of Lemma 9, the probability of k-point crossover creating an improvement depends on the distance between the two differing bits. Fortunately, for search points that result from a mutation of one another, this distance has a favourable distribution. This is made precise in the following lemma.

Lemma 10. Let x' result from x by a mutation flipping one 1-bit and one 0-bit, where the positions i, j of these bits are chosen uniformly among all 1-bits and 0-bits, respectively. Then for $d := |i - j|$ the random variable $\min\{d, n - d\}$ stochastically dominates the uniform distribution on $\{1, \dots, n/4\}$.

Proof. We first show the following. For any fixed index i and any integer $z < n/2$ there are exactly two positions j such that $\min\{d, n - d\} = z$. If $i \in \{1, \dots, n\}$ and $z \in \mathbb{N}$ are fixed, the only values for j that result in either $|i - j| = z$ or $n - |i - j| = z$ are $i + z$, $i - z$, $i + z - n$, and $i - z + n$. Note that at most two of these values are in $\{1, \dots, n\}$. Hence, there are at most 2 feasible values for j for every $d \in \mathbb{N}$.

Let $\ell$ denote the number of 1-bits in x. If $\ell \ge n/2$, we assume that first the 0-bit is chosen uniformly at random, and then consider the uniform random choice of a corresponding 1-bit. Without loss of generality assume $x_i = 0$ and $x_j = 1$. When i has been fixed and j is chosen uniformly at random, a worst case distribution for $\min\{|i - j|, n - |i - j|\}$ is attained when the 1-bits are distributed such that for each $d \le \ell/2$ both feasible bit positions are 1. The worst case is hence a uniform distribution on $\{1, \dots, \ell/2\}$, which stochastically dominates the uniform distribution on $\{1, \dots, n/4\}$.

The case $\ell < n/2$ is symmetrical: exchanging the roles of $x_j$ and $x_i$ as well as the roles of zeros and ones yields a uniform distribution on the set $\{1, \dots, (n - \ell)/2\}$ as worst case, which again stochastically dominates the uniform distribution on $\{1, \dots, n/4\}$.
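The quantities in Lemma 9 are easy to check exactly for small N. The sketch below computes the parity probability P(N, d, k) from the hypergeometric distribution with exact rational arithmetic and compares it with the closed forms for k = 1 and k = 2, the recurrence (6), and the lower bound d(N−d)/(N(N−1)). The function names are assumptions of this illustration, and the check deliberately covers only 1 ≤ d ≤ N − 1 and 1 ≤ k ≤ N − 1; for k = N all cutting points are chosen, so the parity is deterministic.

```python
from fractions import Fraction
from math import comb


def prob_odd(N, d, k):
    """P(N, d, k): probability that an odd number of the k drawn balls is red,
    when drawing without replacement from N balls of which d are red."""
    odd = sum(comb(d, x) * comb(N - d, k - x) for x in range(1, k + 1, 2))
    return Fraction(odd, comb(N, k))


def lemma9_lower_bound(N, d):
    """Lower bound d(N-d) / (N(N-1)) from Lemma 9."""
    return Fraction(d * (N - d), N * (N - 1))


def check_lemma9(max_N=12):
    """Exhaustively check closed forms, recurrence (6) and the lower bound."""
    for N in range(4, max_N + 1):
        for d in range(1, N):
            assert prob_odd(N, d, 1) == Fraction(d, N)
            assert prob_odd(N, d, 2) == Fraction(2 * d * (N - d), N * (N - 1))
            for k in range(1, N):
                assert prob_odd(N, d, k) >= lemma9_lower_bound(N, d)
            for k in range(2, N):
                rhs = (Fraction(d, N) * (1 - prob_odd(N - 1, d - 1, k - 1))
                       + Fraction(N - d, N) * prob_odd(N - 1, d, k - 1))
                assert prob_odd(N, d, k) == rhs
    return True
```

Calling `check_lemma9()` runs through all parameter combinations up to N = 12 and raises an `AssertionError` if any identity or bound fails.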
Taken together, Lemma 9 and Lemma 10 indicate that k-point crossover has a good chance of finding improvements through recombining the right building blocks. However, this is based on the population containing potential parents of equal fitness that only differ in two bits. The following analysis shows that the population is likely to contain such a favourable pair of parents. However, such a pair might get lost again if other individuals of the same fitness are being created, after all duplicates have been removed from the population. For parents that differ in more than 2 bits, Lemma 9 does not apply, hence we do not have an estimate of how likely such a crossover will find an improvement.

In order to avoid this problem, we consider a more detailed tie-breaking rule. As before, individuals with fewer duplicates are being preferred. In case there are still ties

after considering the number of duplicates, the (µ+λ) GA will retain older individuals. This refined tie-breaking rule is shown in Algorithm 3. As will be shown in the remainder, it implies that once a favourable pair of parents with Hamming distance 2 has been created, this pair will never get lost.

Algorithm 3: Refined tie-breaking rule (dup-old).
Let P contain the µ best individuals from $P \cup P'$; break ties towards including individuals with the fewest duplicates in $P \cup P'$. If there are still ties, break them towards including older individuals.

[Figure 2 plot: number of evaluations against the mutation rate; series: (2+1) GA + 1-point, dup-rnd; (2+1) GA + 2-point, dup-rnd; (2+1) GA + 1-point, dup-old; (2+1) GA + 2-point, dup-old.]
Figure 2: Average optimization times on ONEMAX with n = 1000 bits over 1000 runs for the greedy (2+1) GA with 1- and 2-point crossover using different tie-breaking rules if individuals are tied with regard to fitness and the number of duplicates. dup-rnd breaks these ties randomly, whereas dup-old (Algorithm 3) prefers older individuals. The mutation rate p is set to c/n with $c \in \{0.1, 0.2, \dots, 4\}$.

This tie-breaking rule, called dup-old, differs from the one used for the experiments in Figure 1 and those in Section 6. There, we broke ties uniformly at random in case individuals are tied with respect to both fitness and the number of duplicates. We call the latter rule dup-rnd. Experiments for the greedy (2+1) GA comparing tie-breaking rules dup-old and dup-rnd over 1000 runs indicate that performance differences are very small, see Figure 2 (see footnote 3).

Note, however, that on functions with plateaus, like royal road functions, retaining the older individuals prevents the (µ+λ) GA from performing random walks on the plateau, once the population has spread such that there are no duplicates of any individual. In this case we expect that performance will deteriorate when breaking ties towards older individuals.

With the refined tie-breaking rule, the performance of (µ+λ) GAs is as follows.
Footnote 3. Even though differences are small, one-sided Mann-Whitney U tests reveal some statistically significant differences: for 1-point crossover dup-rnd is significantly faster than dup-old on a significance level of 0.001 for mutation rates at least 2.4/n with two exceptions, 2.8/n and 3.6/n, with p-values still below [...]. Contrarily, dup-old was significantly faster for 2-point crossover for mutation rates in the range of 0.8/n to 3/n.
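The dup-old rule of Algorithm 3 amounts to a three-level sort of the candidates in $P \cup P'$. The sketch below is one possible reading, under the following assumptions that are not spelled out in the text: genotypes are stored as tuples, every candidate carries a birth time (smaller means older), and duplicates are counted once over the whole union before selection.

```python
from collections import Counter


def select_dup_old(candidates, mu, fitness):
    """Survivor selection with the dup-old tie-breaking rule.

    candidates: list of (genotype, birth_time) pairs from the union of parents
                and offspring; genotypes are tuples of bits.
    Returns the mu selected pairs, ranked by
      1. higher fitness,
      2. fewer duplicates of the genotype among all candidates,
      3. older individuals (smaller birth time).
    """
    counts = Counter(genotype for genotype, _ in candidates)
    ranked = sorted(
        candidates,
        key=lambda c: (-fitness(c[0]), counts[c[0]], c[1]),
    )
    return ranked[:mu]
```

For example, with µ = 2 and three distinct candidates of equal fitness born at times 0, 1 and 2, the two oldest survive, so an established pair of parents is not displaced by a newer individual of the same fitness.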

Theorem 11. The expected optimization time of every (µ+λ) GA implementing Algorithm 1 with tie-breaking rule dup-old from Algorithm 3, $2 \le \mu = O(1)$, $\lambda < \mu$, $p_c = o(1)$ and $p_c = \omega(1/\log n)$, $p = c/n$ for some constant $c > 0$, and k-point crossover with any $k$ [...] $2$, on ONEMAX is at most

$$\frac{n \ln n}{c e^{-c} (1 + c)} \cdot (1 + o(1)).$$

This bound equals the upper bound (3) for (µ+λ) GAs with uniform crossover. It improves upon the previous upper bound for the greedy (2+1) GA from [46, Theorem 8], whose dominant term was larger by an additive term of $\frac{2c}{3+3c}\, n \ln n$. The reason was that for the (2+1) GA favourable parents could get lost, which is now prevented by the dup-old tie-breaking rule and conditions on $p_c$.

The conditions $p_c = o(1)$ as well as $\mu, \lambda = O(1)$ are useful because they allow us to estimate the probability that a single good individual takes over the whole population with copies of itself.

In the remainder of this section we work towards proving Theorem 11 and assume that $n \ge n_0$ for some $n_0$ chosen such that all asymptotic statements that require a large enough value of n hold true. For $n < n_0$ there is nothing to prove, as the statement holds trivially for bounded n.

We again estimate the time spent on each fitness level i, i.e., when the best fitness in the current population is i. To this end, we focus on the higher fitness levels $i \ge n - n/\log n$ where the probability of creating an offspring on the same level can be estimated nicely. The time for reaching these higher fitness levels only constitutes a small-order term, compared to the claimed running time bound. The following lemma proves this claim in a more general setting than needed for the proof of Theorem 11. In particular, it holds for arbitrary tie-breaking rules and crossover operators.

Lemma 12. For every (µ+λ) GA implementing Algorithm 1 with $\mu, \lambda = O(1)$, $p_c = 1 - \Omega(1)$, and $p = c/n$ for a constant $c > 0$, using any initialization and any crossover operator, the expected time until a fitness level $i \ge n - n/\log n$ is reached for the first time is $o(n \log n)$.

A proof is given in the appendix. In the remainder of the section we focus on higher fitness levels $i \ge n - n/\log n$ and specify the different cases on each such fitness level.
The cases i.1, i.2, and i.3 are similar to the ones for uniform crossover, with additional conditions on the similarity of individuals in Cases i.2 and i.3. We also have an additional error state that accounts for undesirable and unexpected behavior. We pessimistically assume that the error state cannot be left towards other cases on level i.

Case i.1: The population contains an individual on level i and at least one individual on a lower fitness level.
Case i.2: The population contains µ copies of an individual x on level i.
Case i.3: The population contains two search points x, y with current best fitness i, where y resulted from a mutation of x and the Hamming distance of x and y is 2.
Case i.error: An error state reached from any Case i.* when the best fitness is i and none of the prior cases applies.

The difference to the analysis of uniform crossover is that in Case i.2 we rely on the population collapsing to copies of a single individual. This helps to estimate the probability of creating a favourable parent-offspring pair in Case i.3 as the (µ+λ) GA effectively only performs mutations of x while being in Case i.2.

Lemma 13. Consider any (µ+λ) GA as defined in Theorem 11, with parameters $2 \le \mu = O(1)$, $\lambda < \mu$, $p_c = o(1)$ and $p_c = \omega(1/\log n)$, $p = c/n$ for some constant $c > 0$. The total expected
