arxiv: v2 [cs.ne] 26 Nov 2014


How Crossover Speeds Up Building-Block Assembly in Genetic Algorithms

Dirk Sudholt
Department of Computer Science, University of Sheffield, United Kingdom

Abstract

We re-investigate a fundamental question: how effective is crossover in Genetic Algorithms in combining building blocks of good solutions? Although this has been discussed controversially for decades, we are still lacking a rigorous and intuitive answer. We provide such answers for royal road functions and ONEMAX, where every bit is a building block. For the latter we show that using crossover makes every (µ+λ) Genetic Algorithm at least twice as fast as the fastest evolutionary algorithm using only standard bit mutation, up to small-order terms and for moderate µ and λ. Crossover is beneficial because it effectively turns fitness-neutral mutations into improvements by combining the right building blocks at a later stage. Compared to mutation-based evolutionary algorithms, this makes multi-bit mutations more useful. Introducing crossover changes the optimal mutation rate on ONEMAX from $1/n$ to $(\sqrt{5}+1)/2 \cdot 1/n \approx 1.618/n$. This holds both for uniform crossover and k-point crossover. Experiments and statistical tests confirm that our findings apply to a broad class of building-block functions.

Keywords: Genetic algorithms, crossover, recombination, mutation rate, runtime analysis, theory.

1 Introduction

Ever since the early days of genetic algorithms (GAs), researchers have wondered when and why crossover is an effective search operator. It has been folklore that crossover is useful if it can combine building blocks, i.e., schema of high fitness, to give better solutions [30]. But, as Watson and Jansen [50] put it, there has been a considerable difficulty in demonstrating this rigorously and intuitively.

Many attempts at understanding crossover have been made in the past. Mitchell, Forrest, and Holland [30] presented so-called royal road functions as an example where, supposedly, genetic algorithms outperform other search algorithms due to the use of crossover.
Royal roads divide a bit string into disjoint blocks. Each block makes a positive contribution to the fitness in case all bits therein are set to 1. Blocks thus represent schemata, and all-ones configurations are building blocks of optimal solutions. However, the same authors later concluded that simple randomized hill climbers performed better than GAs [7, 3].

The role of crossover has been studied from multiple angles, including algebra [42], Markov chain models [49], infinite population models and dynamical systems (see [7, Chapter 6] for an overview) and statistical mechanics (see, e.g., [5, Chapter ]).

Also in biology the role of crossover is far from settled. In population genetics, exploring the advantages of recombination, or sexual reproduction, is a famous open question [5] and has been called "the queen of problems in evolutionary biology" by

Graham Bell [6] and others. Evolutionary processes were found to be harder to analyze than those using only asexual reproduction as they represent quadratic dynamical systems [2, 40]. Recent work in population genetics has focussed on studying the "speed of adaptation", which describes the efficiency of evolution, in a similar vein to research in evolutionary computation [52, 53]. Furthermore, a new theory of mixability has been proposed recently from the perspective of theoretical computer science [28, 29], arguing that recombination favours individuals that are good "mixers", that is, individuals that create good offspring when being recombined with others.

Several researchers recently and independently reported empirical observations that using crossover improves the performance of evolutionary algorithms (EAs) on the simple function ONEMAX$(x) = \sum_{i=1}^{n} x_i$ [24, 4], but were unable to explain why. The fact that even settings as simple as ONEMAX are not well understood demonstrates the need for a solid theory and serves as motivation for this work.

Runtime analysis has become a major area of research that can give rigorous evidence and prove theorems [35, 3, 8]. However, studies so far have eluded the most fundamental setting of building-block functions. Crossover was proven to be superior to mutation only on constructed artificial examples like $\mathrm{Jump}_k$ [20, 23] and "Real Royal Road" functions [2, 43], the H-IFF problem [8], coloring problems inspired by the Ising model from physics [6, 44], computing unique input-output sequences for finite state machines [26], selected problems from multi-objective optimization [39], and the all-pairs shortest path problem [, 48, 34]. H-IFF [8] and the Ising model on trees [44] consist of hierarchical building blocks. But none of the above papers addresses single-level building blocks in a setting as simple as royal roads.

Watson and Jansen [50] presented a constructed building-block function and proved exponential performance gaps between EAs using only mutation and a GA.
However, the definition of the internal structure of building blocks is complicated and artificial, and they used a tailored multi-deme GA to get the necessary diversity. With regard to the question of how GAs combine building blocks, their approach does not give the intuitive explanation one is hoping for.

This paper presents such an intuitive explanation, supported by rigorous analyses. We consider royal roads and other functions composed of building blocks, such as monotone polynomials. ONEMAX$(x) = \sum_{i=1}^{n} x_i$ is a special case where every bit is a building block. We give rigorous proofs for ONEMAX and show how the main proof arguments transfer to broader classes of building-block functions. Experiments support the latter.

Our main results are as follows.

1. We show in Section 3 that on ONEMAX every (µ+λ) GA with uniform crossover and standard bit mutation is at least twice as fast as every evolutionary algorithm (EA) that only uses standard bit mutations (up to small-order terms). More precisely, the dominating term in the expected number of function evaluations decreases from $e \cdot n \ln n$ to $e/2 \cdot n \ln n$. This holds provided that the parent population and offspring population sizes µ and λ are moderate, so that the inertia of a large population does not slow down exploitation. The reason for this speedup is that the GA can store a neutral mutation (a mutation not altering the parent's fitness) in the population, along with the respective parent. It can then use crossover to combine the good building blocks between these two individuals, improving the current best fitness.

Footnote 1: For bipartite graphs, the problem is equivalent to the classical Graph Coloring problem with 2 colors.

In other words, crossover can capitalize on mutations that have both beneficial and disruptive effects on building blocks.

2. The use of uniform crossover leads to a shift in the optimal mutation rate on ONEMAX. We demonstrate this in Section 4 for a simple "greedy" (2+1) GA that always selects parents among the current best individuals. While for mutation-based EAs $1/n$ is the optimal mutation rate [55], the greedy (2+1) GA has an optimal mutation rate of $(1+\sqrt{5})/2 \cdot 1/n \approx 1.618/n$ (ignoring small-order terms). This is because introducing crossover makes neutral mutations more useful and larger mutation rates increase the chance of a neutral mutation. Optimality is proved by means of a matching lower bound on the expected optimization time of the greedy (2+1) GA that applies to all mask-based crossover operators (where each bit value is taken from either parent). Using the optimal mutation rate, the expected number of function evaluations is $1.19\, n \ln n \pm O(n \log\log n)$.

3. These results are not limited to uniform crossover or the absence of linkage. Section 5 shows that the same results hold for GAs using k-point crossover, for arbitrary k, under slightly stronger conditions on µ and λ, if the crossover probability $p_c$ is set to an appropriately small value.

4. The reasoning for ONEMAX carries over to other functions with a clear building-block structure. Experiments in Section 6 reveal similar performance differences as on ONEMAX for royal road functions and random polynomials with unweighted, positive coefficients. This is largely confirmed by statistical tests. There is evidence that findings also transfer to weighted building-block functions like linear functions, provided that the population can store solutions with different fitness values and different building blocks until crossover is able to combine them. This is not the case for the greedy (2+1) GA, but a simple (5+1) GA is significantly faster on random linear functions than the optimal mutation-based EA for this class of functions, the (1+1) EA [55].

The first result, the analysis for uniform crossover, is remarkably simple and intuitive.
It gives direct insight into the working principles of GAs. Its simplicity also makes it very well suited for teaching purposes.

This work extends a preliminary conference paper [46] with parts of the results, where results were restricted to one particular GA, the greedy (2+1) GA. This extended version presents a general analytical framework that applies to all (µ+λ) GAs, subject to mild conditions, and includes the greedy (2+1) GA as a special case. To this end, we provide tools for analyzing parent and offspring populations in (µ+λ) GAs, which we believe are of independent interest.

Moreover, results for k-point crossover have been improved. The leading constant in the upper bound for k-point crossover in [46] was larger than that for uniform crossover by an additive term of $\frac{2c}{3+3c}$, for mutation rates of $c/n$. This left open the question whether k-point crossover is as effective as uniform crossover for assembling building blocks in ONEMAX. Here we provide a new and refined analysis, which gives an affirmative answer, under mild conditions on the crossover probability.

1.1 Related Work

Kötzing, Sudholt, and Theile [23] considered the search behaviour of an idealized GA on ONEMAX, to highlight the potential benefits of crossover under ideal circumstances.

If a GA was able to recombine two individuals with equal fitness that result from independent evolutionary lineages, the fitness gain can be of order $\Omega(\sqrt{n})$. The idealized GA would therefore be able to optimize ONEMAX in expected time $O(\sqrt{n})$ [23]. However, this idealization cannot reasonably be achieved in realistic EAs with common search operators, hence the result should be regarded as an academic study on the potential benefit of crossover.

A related strand of research deals with the analysis of the Simple GA on ONEMAX. The Simple GA is one of the best known and best researched GAs in the field. It uses a generational model where parents are selected using fitness-proportional selection and the generated offspring form the next population. Neumann, Oliveto and Witt [33] showed that the Simple GA without crossover with high probability cannot optimize ONEMAX in less than exponential time. The reason is that the population typically contains individuals of similar fitness, and then fitness-proportional selection is similar to uniform selection. Oliveto and Witt [38] extended this result to uniform crossover: the Simple GA with uniform crossover and population size $\mu \le n^{1/8-\varepsilon}$, $\varepsilon > 0$, still needs exponential time on ONEMAX. It even needs exponential time to reach a solution of fitness larger than $(1+c)\, n/2$ for an arbitrary constant $c > 0$. In [37] the same authors relaxed their condition on the population size to $\mu \le n^{1/4-\varepsilon}$. Their work does not exclude that crossover is advantageous, particularly since under the right circumstances crossover may lead to a large increase in fitness (cf. [23]). But if there is an advantage, it is not noticeable as the Simple GA with crossover still fails badly on ONEMAX.

One year after [46] was published, Doerr, Doerr, and Ebel [9] presented a groundbreaking result: they designed an EA that was proven to optimise ONEMAX (and any simple transformation thereof) in time $O(n\sqrt{\log n})$.
This is a spectacular result as all black-box search algorithms using only unbiased unary operators (operators modifying one individual only, and not exhibiting any inherent search bias) need time $\Omega(n \log n)$ as shown by Lehre and Witt [27]. So their EA shows that crossover can lower the expected running time by more than a constant factor. They call their algorithm a (1+(λ, λ)) EA: starting with one parent, it first creates λ offspring by mutation, with a random and potentially high mutation rate. Then it selects the best mutant, and crosses it λ times with the original parent, using parameterized uniform crossover (the probability of taking a bit from the first parent is not always 1/2, but a parameter of the algorithm). This leads to a number of $O(n\sqrt{\log n})$ expected function evaluations, which can be further decreased to $O(n)$ with a scheme adapting λ according to the current fitness.

The (1+(λ, λ)) EA from [9] is very cleverly designed to work efficiently on ONEMAX and similar functions. It uses a non-standard EA design because of its two phases of environmental selection. Other differences are that mutation is performed before crossover, and mutation is not fully independent for all offspring: the number of flipping bits is a random variable determined as for standard bit mutations, but the same number of flipping bits is then used in all offspring.

The focus of this work is different as our goal is to understand how standard EAs operate, and how crossover can be used to speed up building-block assembly in commonly used (µ+λ) EAs.

2 Preliminaries

We measure the performance of the algorithm with respect to the number of function evaluations performed until an optimum is found, and refer to this as optimization time. For steady-state algorithms this equals the number of generations (apart from the initialization), and for EAs with offspring populations such as (µ+λ) EAs or (µ+λ) GAs the optimization time is by a factor of λ larger than the number of generations. Note

that the number of generations needed to optimize a fitness function can often be easily decreased by using offspring populations or parallel evolutionary algorithms [25]. But this significantly increases the computational effort within one generation, so the number of function evaluations is a fairer and widely used measure.

Looking at function evaluations is often motivated by the fact that this operation dominates the execution time of the algorithm. Then the number of function evaluations is a reliable measure for wall clock time. However, the wall clock time might increase when introducing crossover as an additional search operator. Also when increasing the mutation rate, more pseudo-random numbers might be required. Jansen and Zarges [22] point out a case where this effect leads to a discrepancy between the number of function evaluations and wall clock time. This concern must be taken seriously when aiming at reducing wall clock time. However, each implementation must be checked individually in this respect [22]. Therefore, we keep this concern in mind, but still use the number of function evaluations in the following.

3 Uniform Crossover Makes (µ+λ) EAs Twice as Fast

We show that, under mild conditions, every (µ+λ) GA is at least twice as fast as its counterpart without crossover. For the latter, that is, evolutionary algorithms using only standard bit mutation, the author recently proved the following lower bound on the running time of a very broad class of mutation-based EAs [47]. It covers all possible selection mechanisms, parent or offspring populations, and even parallel evolutionary algorithms. We slightly rephrase this result.

Theorem 1 (Sudholt [47]). Let $n \ge 2$. Every EA that uses only standard bit mutation with mutation rate $p$ to create new solutions has expected optimization time at least
$$\frac{\min\{\ln n,\ \ln(1/(p^2 n))\} - \ln\ln n - 3}{p(1-p)^n}$$
on ONEMAX and every other function with a unique optimum, if $2^{-n^{1/3}} \le p \le 1/(\sqrt{n}\log n)$. If $p = c/n$, $c > 0$ constant, this is at least
$$\frac{e^c}{c} \cdot n \ln n \cdot (1 - o(1)).$$
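To get a feel for the size of this bound, the following minimal sketch evaluates the expression from Theorem 1 numerically. The function name `mutation_only_lower_bound` is a hypothetical helper introduced here for illustration only, and it does not check whether $p$ lies in the range required by the theorem.

```python
import math


def mutation_only_lower_bound(n: int, p: float) -> float:
    """Evaluate the lower bound of Theorem 1 for problem size n and mutation rate p.

    Hypothetical helper for illustration; the caller is responsible for
    choosing p within the range stated in the theorem.
    """
    numerator = (
        min(math.log(n), math.log(1.0 / (p * p * n)))
        - math.log(math.log(n))
        - 3.0
    )
    return numerator / (p * (1.0 - p) ** n)


if __name__ == "__main__":
    n = 10**6
    for c in (0.5, 1.0, 2.0):
        bound = mutation_only_lower_bound(n, c / n)
        # The ratio to n ln n tends to e^c / c, but only slowly, because of
        # the lower-order terms -ln ln n - 3 in the numerator.
        print(c, bound / (n * math.log(n)), math.exp(c) / c)
```

For $p = 1/n$ the computed value lies between $e \cdot n(\ln n - \ln\ln n - 3)$ and $e \cdot n \ln n$, in line with the explicit bound for this mutation rate.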
In fact, for ONEMAX the author proved that among all evolutionary algorithms that start with one random solution and only use standard bit mutations the expected number of function evaluations is minimized by the simple (1+1) EA [47, Theorem 3]. Also the mutation rate $p = 1/n$ is the best possible choice for ONEMAX, leading to a lower bound of
$$e\, n \ln n - e\, n \ln\ln n - 3en.$$
For the special case of $p = 1/n$, Doerr, Fouz, and Witt [0] recently improved the above bound towards $e\, n \ln n - O(n)$.

We show that for a range of (µ+λ) EAs, as defined in the following, introducing uniform crossover can cut the dominant term of the running time in half, for the standard mutation rate $p = 1/n$.

The only requirement on the parent selection mechanism is that selection does not favor inferior solutions over fitter ones. Formally, for maximizing a fitness function $f$,
$$\forall x, y\colon\ f(x) \ge f(y) \ \Rightarrow\ \mathrm{Prob}(\text{select } x) \ge \mathrm{Prob}(\text{select } y). \qquad (1)$$

This in particular implies that equally fit solutions are selected with the same probability. Condition (1) is satisfied for all common selection mechanisms: uniform selection, fitness-proportional selection, tournament selection, cut selection, and rank-based mechanisms.

The class of (µ+λ) EAs covered in this work is defined in Algorithm 1. All (µ+λ) EAs therein create λ offspring through crossover and mutation, or just mutation, and then pick the best out of the µ previous search points and the λ new offspring.

Algorithm 1: Scheme of a (µ+λ) GA with mutation rate $p$ and uniform crossover with crossover probability $p_c$ for maximizing $f\colon \{0,1\}^n \to \mathbb{R}$.

    Initialize population P of size µ ∈ N u. a. r.
    while true do
        Let P' = ∅.
        for i = 1, ..., λ do
            With probability p_c do
                Select x_1, x_2 with an operator respecting (1).
                Let y := uniform crossover(x_1, x_2).
            otherwise do
                Select y with an operator respecting (1).
            end
            Flip each bit in y independently with probability p.
            Add y to P'.
        end
        Let P contain the µ best individuals from P ∪ P'; break ties towards
        including individuals with the fewest duplicates in P ∪ P'.
    end

In the case of ties, we pick solutions that have the fewest duplicates among the considered search points. This strategy has already been used by Jansen and Wegener [2] in their groundbreaking work on Real Royal Roads; it ensures a sufficient degree of diversity whenever the population contains different search points of the same fitness.

Before stating the main result of this section, we provide two lemmas showing how to analyse population dynamics. Both lemmas are of independent interest and may prove useful in other studies of population-based EAs.

The following lemma estimates the expected time until individuals with fitness at least $i$ take over the whole population. It generalizes Lemma 3 in [45], which in turn goes back to Witt's analysis of the (µ+1) EA [54]. Note that the lemma applies to arbitrary fitness functions, arbitrary values for µ and λ, and arbitrary crossover operators; it merely relies on fundamental and universal properties of cut selection and standard bit mutations.

Lemma 2.
Consider any (µ+λ) GA implementing Algorithm 1, with any crossover operator, on any $n$-bit fitness function. Assume the current population contains at least one individual of fitness $i$. The expected number of function evaluations needed for the (µ+λ) GA before all individuals in its current population have fitness at least $i$ is at most
$$\frac{O((\mu+\lambda)\log\mu)}{(1-p_c)(1-p)^n}.$$

This holds for any tie-breaking rule used in the environmental selection.

Proof. Call an individual fit if it has fitness at least $i$. We now estimate the expected number of generations until the population is taken over by fit individuals, which we call the expected takeover time. As fit individuals are always preferred to non-fit individuals in the environmental selection, the expected takeover time equals the expected number of generations until µ fit individuals have been created, starting with one fit individual.

For each offspring being created, there is a chance that the (µ+λ) GA will simply create a clone of a fit individual. This happens if, during the creation of an offspring, the (µ+λ) GA decides not to perform crossover, it selects a fit individual as parent to be mutated, and mutation does not flip any bit. The probability for this event is at least
$$(1-p_c)(1-p)^n \cdot \frac{\text{number of fit individuals in population}}{\mu}$$
since each fit individual is selected as parent with probability at least $1/\mu$.

Now we divide the run of the (µ+λ) GA into phases in order to get a lower bound on the number of fit individuals at certain time steps. The $j$-th phase, $0 \le j \le \lceil \log_5 \mu \rceil - 1$, starts with the first offspring creation in the first generation where the number of fit individuals is at least $5^j$. It ends in the first generation where this number is increased to $\min\{5^{j+1}, \mu\}$. Let $T_j$ describe the random number of generations spent in the $j$-th phase.

Starting with a new generation with at least $5^j$ fit individuals in the parent population, we now consider a phase of $8\mu/((1-p_c)(1-p)^n)$ offspring creations, disregarding generation bounds. Let $N_j$ denote the random number of new fit offspring created in the phase, then
$$E(N_j) \ge \frac{8\mu}{(1-p_c)(1-p)^n} \cdot (1-p_c)(1-p)^n \cdot \frac{5^j}{\mu} = 8 \cdot 5^j$$
and by classical Chernoff bounds (see, e.g., [32, Chapter 4])
$$\mathrm{Prob}(N_j < 4 \cdot 5^j) \le e^{-E(N_j)/8} \le e^{-5^j} \le e^{-1}.$$
If $N_j < 4 \cdot 5^j$ the phase is called unsuccessful and we consider another phase of $8\mu/((1-p_c)(1-p)^n)$ offspring creations. The expected waiting time for a successful phase is at most $1/(1-e^{-1})$ and the expected number of offspring creations until $N_j \ge 4 \cdot 5^j$ is at most $8\mu/((1-p_c)(1-p)^n(1-e^{-1}))$.
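Since phases start at generation bounds, we may need to account for up to λ further offspring creations in between phases. This implies
$$E(T_j) \le \frac{8\mu}{(1-p_c)(1-p)^n(1-e^{-1})} + \lambda$$
and the expected takeover time is at most
$$\sum_{j=0}^{\lceil\log_5\mu\rceil - 1} E(T_j) \le \lceil\log_5\mu\rceil \cdot \left(\frac{8\mu}{(1-p_c)(1-p)^n(1-e^{-1})} + \lambda\right) = \frac{O((\mu+\lambda)\log\mu)}{(1-p_c)(1-p)^n}.$$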
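The scheme in Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's implementation: uniform parent selection is used as one operator respecting condition (1), ties in the environmental selection are broken by repeatedly discarding a worst individual whose genotype currently has the most copies (one possible reading of the "fewest duplicates" rule), and all function names as well as the `seed` and `max_evals` parameters are hypothetical.

```python
import random
from collections import Counter


def onemax(x):
    """Number of 1-bits in the bit string x."""
    return sum(x)


def uniform_crossover(x1, x2, rng):
    """Take each bit from either parent with probability 1/2."""
    return tuple(a if rng.random() < 0.5 else b for a, b in zip(x1, x2))


def mutate(x, p, rng):
    """Standard bit mutation: flip each bit independently with probability p."""
    return tuple(b ^ 1 if rng.random() < p else b for b in x)


def select_survivors(pool, mu, f):
    """Keep mu individuals: drop worst-fitness ones, most-duplicated genotype first."""
    pool = list(pool)
    while len(pool) > mu:
        worst = min(f(x) for x in pool)
        counts = Counter(x for x in pool if f(x) == worst)
        victim = max(counts, key=counts.get)
        pool.remove(victim)
    return pool


def mu_plus_lambda_ga(n, mu, lam, p, pc, f=onemax, optimum=None, seed=0,
                      max_evals=10**6):
    """Run the (mu+lambda) GA scheme; return (counted evaluations, final population).

    Only the mu initial individuals and each created offspring are counted as
    function evaluations; repeated calls to f inside the selection are a
    simplification of this sketch and are not counted.
    """
    rng = random.Random(seed)
    optimum = n if optimum is None else optimum
    pop = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(mu)]
    evals = mu
    while max(f(x) for x in pop) < optimum and evals < max_evals:
        offspring = []
        for _ in range(lam):
            if rng.random() < pc:
                # Uniform parent selection (assumption; it respects condition (1)).
                y = uniform_crossover(rng.choice(pop), rng.choice(pop), rng)
            else:
                y = rng.choice(pop)
            offspring.append(mutate(y, p, rng))
        evals += lam
        pop = select_survivors(pop + offspring, mu, f)
    return evals, pop


if __name__ == "__main__":
    evals, pop = mu_plus_lambda_ga(n=50, mu=2, lam=1, p=1 / 50, pc=0.5)
    print(evals, max(onemax(x) for x in pop))
```

Setting `pc=0` in this sketch gives a purely mutation-based (µ+λ) EA, which makes it easy to compare both variants empirically on small instances.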

We also provide the following simple but handy lemma, which relates success probabilities for created offspring to the expected number of function evaluations needed to complete a generation where such an event has first happened.

Lemma 3. Consider any (µ+λ) GA implementing Algorithm 1, and assume that in each offspring creation there is a probability at least $q$ that some specific event occurs. Then the expected number of function evaluations to complete a generation where this event first occurs is at most
$$\lambda - 1 + \frac{1}{q}.$$

Proof. The expected number of trials for an event with probability $q$ to occur is $1/q$. To complete the generation, at most $\lambda - 1$ further function evaluations are required.

Now we are able to prove the main result of this section.

Theorem 4. The expected optimization time of every (µ+λ) GA implementing Algorithm 1 with $0 < p_c < 1$ constant, mutation probability $0 < p < 1$ and $\mu \ge 2$ on ONEMAX is at most
$$\frac{\ln(n^2 p + n) + 1 + p}{p(1-p)^n (1+np)} + \frac{O((\mu+\lambda)\, n \log\mu)}{(1-p)^n}. \qquad (2)$$
If $p = c/n$, $c > 0$ constant, and $\mu, \lambda = o(\log n/\log\log n)$, this bound simplifies to
$$n \ln n \cdot \frac{e^c}{c \cdot (1+c)} \cdot (1 + o(1)). \qquad (3)$$
Both statements hold for arbitrary initial populations.

The main difference between the upper bound for (µ+λ) GAs and the lower bound for all mutation-based EAs is an additional factor of $1 + np$ in the denominator of the upper bound. This is a factor of 2 for $p = 1/n$ and an even larger gain for larger mutation rates. For the default value of $p = 1/n$, this shows that introducing crossover makes EAs at least twice as fast as the fastest EA using only standard bit mutation. It also implies that introducing crossover makes EAs at least twice as fast as their counterparts without crossover (i.e., where $p_c = 0$).

Proof of Theorem 4. Bound (3) can be derived from (2) using $(1 - 1/x)^{x-1} \ge 1/e$ for $x > 1$ to estimate
$$\left(1 - \frac{c}{n}\right)^{n} = \left(1 - \frac{c}{n}\right)^{(n/c - 1)\cdot c} \cdot \left(1 - \frac{c}{n}\right)^{c} \ge e^{-c}\left(1 - \frac{c^2}{n}\right) = e^{-c}(1 - O(1/n))$$
as well as $\ln(n^2 \cdot c/n + n) = \ln(n) + O(1)$. Note that $(\mu+\lambda)\log\mu = o(\log n)$ by conditions on µ, λ, hence this and all other small-order terms are absorbed in the term $o(1)$.

In order to prove the general bound (2), we consider canonical fitness levels, i.e., the $i$-th fitness level contains all search points with fitness $i$. We estimate the time spent on each level $i$, i.e., when the best fitness in the current population is $i$. For each fitness level we consider three cases. The first case applies when the population contains individuals on fitness levels less than $i$. The second case is when the population only contains copies of a single individual on level $i$. The third case occurs when the population contains more than one individual on level $i$; then the population contains different "building blocks" that can be recombined effectively by crossover.

All these cases capture the typical behaviour of a (µ+λ) GA, albeit some of these cases, and even whole fitness levels, may be skipped. We obtain an upper bound on its expected optimization time by summing up expected times the (µ+λ) GA may spend in all cases and on all fitness levels.

Case i.1: The population contains an individual on level $i$ and at least one individual on a lower fitness level. A sufficient condition for leaving this case is that all individuals in the population obtain fitness at least $i$. Since the (µ+λ) GA never accepts worsenings, the case is left for good. The time for all individuals reaching fitness at least $i$ has already been estimated in Lemma 2. Applying this lemma to all fitness levels $i$, the overall time spent in all cases i.1 is at most
$$\frac{O((\mu+\lambda)\, n \log\mu)}{(1-p_c)(1-p)^n} = \frac{O((\mu+\lambda)\, n \log\mu)}{(1-p)^n}.$$

Case i.2: The population contains µ copies of the same individual $x$ on level $i$. In this case, each offspring created by the (µ+λ) GA will be a standard mutation of $x$. This is obvious for offspring where the (µ+λ) GA decides not to use crossover. If crossover is used, the (µ+λ) GA will pick $x_1 = x_2 = x$, create $y = x$ by crossover, and hence perform a mutation on $x$.

The (µ+λ) GA leaves this case for good if either a better search point is created or if it creates another search point with $i$ ones. In the latter case we will create a population with two different individuals on level $i$. Note that due to the choice of the tie-breaking rule in the environmental selection, the (µ+λ) GA will always maintain at least two individuals on level $i$, unless an improvement with larger fitness is found.

The probability of creating a better search point in one mutation is at least $(n-i) \cdot p(1-p)^{n-1}$ as there are $n-i$ suitable 1-bit flips. The probability of creating a different search point on level $i$ is at least $i(n-i) \cdot p^2(1-p)^{n-2}$ as it is sufficient to flip one of $i$ 1-bits, to flip one of $n-i$ 0-bits, and not to flip any other bit. The probability of either event happening in one offspring creation is thus at least
$$(n-i) \cdot p(1-p)^{n-1} + i(n-i) \cdot p^2(1-p)^{n-2} \ge p(1-p)^n (n-i)(1+ip).$$
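By Lemma 3, the expected number of function evaluations in Case i.2 is at most
$$\lambda - 1 + \frac{1}{p(1-p)^n (n-i)(1+ip)}.$$
The expected number of function evaluations made in all cases i.2 is hence at most
$$\sum_{i=0}^{n-1}\left(\lambda - 1 + \frac{1}{p(1-p)^n (n-i)(1+ip)}\right) = n(\lambda-1) + \frac{1}{p(1-p)^n}\sum_{i=0}^{n-1}\frac{1}{(n-i)(1+ip)}. \qquad (4)$$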

The last sum can be estimated as follows. Separating the summand for $i = n-1$,
$$\sum_{i=0}^{n-1}\frac{1}{(n-i)(1+ip)} = \frac{1}{1+(n-1)p} + \sum_{i=0}^{n-2}\frac{1}{(n-i)(1+ip)} \le \frac{1+p}{1+np} + \int_{0}^{n-1}\frac{1}{(n-i)(1+ip)}\, di.$$
We use equation  in [] to simplify the integral and get
$$\frac{1+p}{1+np} + \left[\frac{1}{1+np}\ln\left(\frac{1+ip}{n-i}\right)\right]_{0}^{n-1} = \frac{1+p}{1+np} + \frac{\ln(np+1-p) + \ln n}{1+np} \le \frac{\ln(n^2 p + n) + 1 + p}{1+np}.$$
Plugging this into (4) yields that the expected time in all cases i.2 is at most
$$n(\lambda-1) + \frac{\ln(n^2 p + n) + 1 + p}{p(1-p)^n(1+np)}.$$

Case i.3: The population only contains individuals on level $i$, not all of which are identical. In this case we can rely on crossover recombining two different individuals on level $i$. As they both have different "building blocks", i.e., different bits are set to 1, there is a good chance that crossover will generate an offspring with a higher number of 1-bits.

The probability of performing a crossover with two different parents in one offspring creation is at least
$$p_c \cdot \frac{\mu-1}{\mu^2}$$
as in the worst case the population contains $\mu - 1$ copies of one particular individual. Assuming two different parents are selected for crossover, let these have Hamming distance $2d$ and let $X$ denote the number of 1-bits among these $2d$ positions in the offspring. Note that $X$ is binomially distributed with parameters $2d$ and $1/2$ and its expectation is $d$. We estimate the probability of getting a surplus of 1-bits as this leads to an improvement in fitness. This estimate holds for any $d \in \mathbb{N}$. Since $\mathrm{Prob}(X < d) = \mathrm{Prob}(X > d)$, we have
$$\mathrm{Prob}(X > d) = \frac{1}{2}\left(1 - \mathrm{Prob}(X = d)\right) = \frac{1}{2}\left(1 - 2^{-2d}\binom{2d}{d}\right) \ge \frac{1}{4}.$$
Mutation keeps all 1-bits with probability at least $(1-p)^n$. Together, the probability of increasing the current best fitness in one offspring creation is at least
$$p_c \cdot \frac{\mu-1}{\mu^2} \cdot \frac{(1-p)^n}{4}.$$
By Lemma 3, the expected number of function evaluations in Case i.3 is at most
$$\lambda - 1 + \frac{4\mu^2}{p_c(\mu-1)(1-p)^n}.$$
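The estimate $\mathrm{Prob}(X > d) \ge 1/4$ used in Case i.3 is easy to check with exact arithmetic. The following small sketch (the function name `prob_surplus_of_ones` is a hypothetical helper) evaluates the closed form above for a few values of $d$; the bound is attained with equality for $d = 1$, which corresponds to two parents at Hamming distance 2.

```python
from fractions import Fraction
from math import comb


def prob_surplus_of_ones(d: int) -> Fraction:
    """Exact P(X > d) for X ~ Bin(2d, 1/2), via (1 - P(X = d)) / 2."""
    prob_equal = Fraction(comb(2 * d, d), 2 ** (2 * d))
    return (1 - prob_equal) / 2


if __name__ == "__main__":
    for d in range(1, 6):
        print(d, prob_surplus_of_ones(d))
```

Since $\mathrm{Prob}(X = d)$ decreases in $d$, the success probability of a crossover grows with the Hamming distance of the parents and approaches $1/2$.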

The total expected time spent in all cases i.3 is hence at most
$$n\left(\lambda - 1 + \frac{4\mu^2}{p_c(\mu-1)(1-p)^n}\right) = n(\lambda-1) + \frac{O(\mu n)}{(1-p)^n}$$
as $p_c = \Omega(1)$. Summing up all expected times yields a total time bound of
$$\frac{\ln(n^2 p + n) + 1 + p}{p(1-p)^n(1+np)} + \frac{O(\mu n)}{(1-p)^n} + \frac{O((\mu+\lambda)\, n\log\mu)}{(1-p)^n} + 2n(\lambda-1) = \frac{\ln(n^2 p + n) + 1 + p}{p(1-p)^n(1+np)} + \frac{O((\mu+\lambda)\, n\log\mu)}{(1-p)^n}.$$

Remark 1 (On conditions for µ and λ). The second statement of Theorem 4 requires $\mu, \lambda = o(\log n/\log\log n)$ in order to establish the upper bound in (3). This condition seems necessary as for larger values of µ and λ the inertia of a large population slows down exploitation, at least in the absence of crossover. Note that not all EAs covered by Theorem 4 (after removing crossover) optimize ONEMAX in time $O(n\log n)$. Witt [54] showed that a (µ+1) EA with uniform parent selection has an expected optimization time of $\Omega(\mu n + n\log n)$ on ONEMAX. For $\mu = \omega(\log n)$, this lower bound is $\omega(n\log n)$. Jansen, De Jong, and Wegener [9] showed that a (1+λ) EA needs time $\omega(n\log n)$ on ONEMAX if $\lambda = \omega((\log n)(\log\log n)/(\log\log\log n))$. Badkobeh, Lehre, and Sudholt [4] showed that every black-box algorithm creating λ offspring, using only standard bit mutation or other unary unbiased operators, needs time $\omega(n\log n)$ on ONEMAX for $\lambda = \omega((\log n)(\log\log n))$. This indicates that the threshold in the condition $\mu, \lambda = o(\log n/\log\log n)$ is tight up to polynomials of $\log\log n$.

Remark 2 (On conditions for $p_c$). Theorem 4 assumes $0 < p_c < 1$ constant, which reflects the most common choices in applications of EAs. The theorem can be extended towards smaller or larger values as follows. If $p_c = o(1)$ the upper bound on the time spent in Cases i.3 increases as it contains a factor of $1/p_c$. The other cases remain unaffected, and if $(\mu+\lambda)\log\mu/p_c = o(\log n)$ we still get the upper bound from (3). For high crossover probabilities, that is, $p_c = 1 - o(1)$ or $p_c = 1$, only Cases i.1 need to be revisited.
The time in those cases was derived from Lemma 2, which can be adapted as follows: the probability for increasing the number of fit individuals is at least
$$p_c (1-p)^n \cdot \frac{(\text{number of fit individuals in population})^2}{2\mu^2}$$
as it suffices to select two fit individuals and generate an average or above-average number of 1-bits in the offspring, which happens with probability at least $1/2$. The time bound from Lemma 2 then becomes
$$\frac{O((\mu^2+\lambda)\log\mu)}{(1-p)^n}$$
and the time bound in Theorem 4 becomes
$$\frac{\ln(n^2 p + n) + 1 + p}{p(1-p)^n(1+np)} + \frac{O((\mu^2+\lambda)\, n\log\mu)}{(1-p)^n}.$$
For $p = c/n$, $c > 0$ constant, and $\mu, \lambda = o(\sqrt{\log n}/\log\log n)$, this also establishes the upper bound from (3).

It is remarkable that the waiting time for successful crossovers in Cases i.3 is only of order $O(\mu+\lambda)$. For small values of µ and λ, e.g. $\mu, \lambda = O(1)$, the time spent in all Cases i.3 is $O(n)$, which is negligible compared to the overall time bound of order $\Theta(n\log n)$. This shows how effective crossover is in recombining building blocks.

Also note that the proof of Theorem 4 is relatively simple as it uses only elementary arguments and, along with Lemmas 2 and 3, it is fully self-contained. The analysis therefore lends itself to teaching purposes on the behavior of evolutionary algorithms and the benefits of crossover.

Our analysis has revealed that fitness-neutral mutations, that is, mutations creating a different search point of the same fitness, can help to escape from the case of a population with identical individuals. Even though these mutations do not immediately yield an improvement in terms of fitness, they increase the diversity in the population. Crossover is very efficient in exploiting this gained diversity by combining two different search points at a later stage. This means that crossover can capitalize on mutations that have both beneficial and disruptive effects on building blocks.

An interesting consequence is that this affects the optimal mutation rate on ONEMAX. For EAs using only standard bit mutations Witt [55] recently proved that $1/n$ is the optimal mutation rate for the (1+1) EA on all linear functions. Recall that the (1+1) EA is the optimal mutation-based EA (in the sense of Theorem 1) on ONEMAX [47]. For mutation-based EAs on ONEMAX neutral mutations are neither helpful nor detrimental. With crossover neutral mutations now become helpful. Increasing the mutation rate increases the likelihood of neutral mutations. In fact, we can easily derive better upper bounds from Theorem 4 for slightly larger mutation rates, thanks to the additional term $1 + np$ in the denominator of the upper bound. The dominant term in (3),
$$n \ln n \cdot \frac{e^c}{c \cdot (1+c)},$$
is minimized for $c$ being the golden ratio $c = (\sqrt{5}+1)/2 \approx 1.618$. This leads to the following.

Corollary 5.
The asymptotically best running time bound from Theorem 4 is obtained for $p = (1+\sqrt{5})/(2n)$. For this choice the dominant term in (3) becomes
$$\frac{e^{(\sqrt{5}+1)/2}}{\sqrt{5}+2} \cdot n\ln n \approx 1.19\, n\ln n.$$

4 The Optimal Mutation Rate

Corollary 5 gives the mutation rate that yields the best upper bound on the running time that can be obtained with the proof of Theorem 4. However, it does not establish that this mutation rate is indeed optimal for any GA. After all, we cannot exclude that another mutation rate leads to a smaller expected optimization time.

In the following, we show for a simple (2+1) GA (Algorithm 2) that the upper bound from Theorem 4 is indeed tight up to small-order terms, which establishes $p = (1+\sqrt{5})/(2n)$ as the optimal mutation rate for that (2+1) GA. Proving lower bounds on expected optimization times is often a notoriously hard task, hence we restrict ourselves to a simple "bare-bones" GA that captures the characteristics of GAs covered by Theorem 4 and is easy to analyze. The latter is achieved by fixing as many parameters as possible.
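The value of $c$ in Corollary 5 can be re-derived in one line: the logarithm of the leading constant is $c - \ln c - \ln(1+c)$, whose derivative $1 - 1/c - 1/(1+c)$ vanishes exactly when $c^2 - c - 1 = 0$, that is, for $c = (1+\sqrt{5})/2$. As a sanity check, the following sketch compares this closed form with a coarse grid search; the function name `leading_constant` and the grid resolution are choices made here for illustration.

```python
import math


def leading_constant(c: float) -> float:
    """Leading constant e^c / (c (1 + c)) of the n ln n term in bound (3)."""
    return math.exp(c) / (c * (1.0 + c))


golden_ratio = (math.sqrt(5.0) + 1.0) / 2.0

# Coarse grid search over c in (0, 4] as a numerical cross-check.
grid = [k / 1000.0 for k in range(1, 4001)]
c_best = min(grid, key=leading_constant)

if __name__ == "__main__":
    print("grid minimizer:", c_best, "closed form:", golden_ratio)
    print("constant at golden ratio:", leading_constant(golden_ratio))
    print("constant at c = 1:", leading_constant(1.0))
    print("mutation-only constant e:", math.e)
```

The three constants printed at the end correspond to the upper bound at the best mutation rate, the upper bound at the standard rate $p = 1/n$ (which equals $e/2$), and the leading constant $e$ of the mutation-only lower bound.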

As the upper bound from Theorem 4 grows with µ and λ, we pick the smallest possible values: $\mu = 2$ and $\lambda = 1$. The parent selection is made as simple as possible: we select parents uniformly at random from the current best individuals in the population. In other words, if we define the parent population as the set of individuals that have a positive probability to be chosen as parents, the parent population only contains individuals of the current best fitness. We call this parent selection greedy because it is a greedy strategy to choose the current best search points as parents.

In the context of the proof of Theorem 4 greedy parent selection implies that Cases i.1 are never reached as the parent population never spans more than one fitness level. So the time spent in these cases is 0. This also allows us to eliminate one further parameter by setting $p_c = 1$, as lower values for $p_c$ were only beneficial in Cases i.1. Setting $p_c = 1$ minimizes our estimate for the time spent in Cases i.3. So Theorem 4 extends towards this GA (see also Remark 2).

We call the resulting GA "greedy (2+1) GA" because its main characteristic is the greedy parent selection. The greedy (2+1) GA is defined in Algorithm 2 (see Footnote 2).

Algorithm 2: Greedy (2+1) GA with mutation rate $p$ for maximizing $f\colon \{0,1\}^n \to \mathbb{R}$.

    Initialize population P of size 2 u. a. r.
    while true do
        Select x_1, x_2 u. a. r. from {x ∈ P | ∀y ∈ P: f(x) ≥ f(y)}.
        Let y := crossover(x_1, x_2).
        Flip each bit in y independently with probability p.
        Let P contain the 2 best individuals from P ∪ {y}; break ties towards
        including individuals with the fewest duplicates in P ∪ {y}.
    end

The following result applies to the greedy (2+1) GA using any kind of mask-based crossover. A mask-based crossover is a recombination operator where each bit value is taken from either parent; that is, it is not possible to introduce a bit value which is not represented in any parent. All common crossovers are mask-based crossovers: uniform crossover, including parameterized uniform crossover, as well as k-point crossovers for any k.
The following result even includes biased operators like a bit-wise OR, which induces a tendency to increase the number of 1-bits.

Theorem 6. Consider the greedy (2+1) GA with mutation rate $0 < p \le 1/(\sqrt{n} \log n)$ using an arbitrary mask-based crossover operator. Its expected optimization time on ONEMAX is at least

$$\frac{\min\{\ln n,\ \ln(1/(p^2 n))\} - O(\log\log n)}{\left(1 + \max_k \left\{\frac{(pn)^k}{k!\,k!}\right\}\right) \cdot p\,(1-p)^n}.$$

Before giving the proof, note that for $p = c/n$ with $0 < c \le 4$ constant, $\max_k \left\{\frac{(pn)^k}{k!\,k!}\right\} = pn$ as for $0 < pn \le 4$ and $i \in \mathbb{N}$

$$\frac{(pn)^{i+1}}{(i+1)!\,(i+1)!} = \frac{pn}{(i+1)^2} \cdot \frac{(pn)^i}{i!\,i!} \le \frac{(pn)^i}{i!\,i!},$$

hence a maximum is attained for k = 1. Then the lower bound from Theorem 6 is

$$\frac{n \ln n}{c \cdot e^{-c} \cdot (1+c)} - O(n \log\log n).$$

² Note that in [46] the greedy (2+1) GA was defined slightly differently as there duplicate genotypes are always rejected. Algorithm 2 is equivalent to the greedy (2+1) GA from [46] for the following reasons. If the current population contains two different individuals of equal fitness and a duplicate of one of the parents is created, both algorithms reject a duplicate genotype. If the population contains two individuals of different fitness, both behave as if the population only contained the fitter individual.
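The two facts used above, that the maximum over k is attained at k = 1 whenever $pn \le 4$ and that the resulting leading constant is smallest near $c \approx 1.618$, can be checked numerically. The sketch below assumes the lower bound has the form stated in Theorem 6 and lets $n \to \infty$ with $p = c/n$, so that $p(1-p)^n$ is replaced by $(c/n)e^{-c}$; the helper names and the cut-off `k_max` are hypothetical.

```python
import math


def max_term(c, k_max=50):
    """Maximum over 1 <= k <= k_max of c^k / (k! * k!)."""
    return max(c**k / (math.factorial(k) ** 2) for k in range(1, k_max + 1))


def leading_constant(c):
    """Coefficient of n ln n in the lower bound for p = c/n, as n grows."""
    return math.exp(c) / (c * (1 + max_term(c)))


if __name__ == "__main__":
    for c in (0.5, 1.0, 1.618, 2.0, 4.0):
        print(c, max_term(c), round(leading_constant(c), 4))
```

On a grid of step 0.1 the constant is smallest at c = 1.6, and at c = 4 it evaluates to roughly 2.73, slightly above e.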

This matches the upper bound (3) up to small order terms, showing for the greedy (2+1) GA that the new term 1 + c in the denominator of the bound from Theorem 4 was not a coincidence. For $p > 4/n$, the lower bound is at least $(e + \Omega(1))\, n \ln n$. Together, this establishes the optimal mutation rate for the greedy (2+1) GA on ONEMAX.

Theorem 7. For the greedy (2+1) GA with uniform crossover on ONEMAX, mutation rate $p = (1+\sqrt{5})/(2n)$ minimizes the expected number of function evaluations, up to small-order terms.

For the proof of Theorem 6 we use the following lower-bound technique based on fitness levels by the author [47].

Theorem 8 (Sudholt [47]). Consider a partition of the search space into non-empty sets $A_1, \dots, A_m$. For a search algorithm $\mathcal{A}$ we say that it is in $A_i$ or on level i if the best individual created so far is in $A_i$. If there are $\chi, u_i, \gamma_{i,j}$ for i < j where
1. the probability of traversing from level i to level j in one step is at most $u_i \cdot \gamma_{i,j}$ for all i < j,
2. $\sum_{j=i+1}^{m} \gamma_{i,j} = 1$ for all i, and
3. $\gamma_{i,j} \ge \chi \sum_{k=j}^{m} \gamma_{i,k}$ for all i < j and some $0 \le \chi \le 1$,
then the expected hitting time of $A_m$ is at least

$$\sum_{i=1}^{m-1} \mathrm{Prob}(\mathcal{A} \text{ starts in } A_i) \cdot \chi \sum_{j=i}^{m-1} \frac{1}{u_j}. \qquad (5)$$

Proof of Theorem 6. We prove a lower bound for the following sped-up GA instead of the original greedy (2+1) GA. Whenever it creates a new offspring with the same fitness, but a different bit string as the current best individual, we assume the following. The algorithm automatically performs a crossover between the two. Also, we assume that this crossover leads to the best possible offspring in the sense that all bits where both parents differ are set to 1, i. e., the algorithm performs a bit-wise OR. That is, if both search points have i 1-bits and Hamming distance 2k, then the resulting offspring has i + k 1-bits.

Due to our assumptions, at the end of each generation there is always a single best individual. For this reason we can model the algorithm by a Markov chain representing the current best fitness. The analysis follows a lower bound for EAs on ONEMAX [47, Theorem 9].
As in [47] we consider the following fitness-level partition that focuses only on the very last fitness values. Let $\ell = n - \min\{n/\log n,\ 1/(p^2 n \log n)\}$. Let $A_i = \{x \mid |x|_1 = i\}$ for $i > \ell$ and $A_\ell$ contain all remaining search points. We know from [47] that the GA is initialized in $A_\ell$ with probability at least $1 - 1/\log n$ if n is large enough.

The probability $p_{i,i+k}$ that the sped-up GA makes a transition from fitness i to fitness i + k equals

$$p_{i,i+k} = \mathrm{Prob}(k \text{ more 0-bits than 1-bits flip}) + \mathrm{Prob}(k \text{ 0-bits and } k \text{ 1-bits flip}).$$

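According to [47, Lemma 2], for the considered fitness levels $i > \ell$ the former probability is bounded by

$$p^k (1-p)^{n-k} \cdot \frac{(n-i)^k}{k!} \cdot \left(1 + \frac{3}{5} \cdot \frac{i(n-i)p^2}{(1-p)^2}\right).$$

The latter probability is bounded by

$$\mathrm{Prob}(k \text{ 0-bits flip}) \cdot \mathrm{Prob}(k \text{ 1-bits flip}) \le \frac{(n-i)^k}{k!}\, p^k (1-p)^{n-i-k} \cdot \frac{i^k}{k!}\, p^k (1-p)^{i-k} \le \frac{(n-i)^k}{k!} \cdot p^k (1-p)^n \cdot \frac{(pn)^k}{(1-p)^{2k}\, k!}.$$

Together, $p_{i,i+k}$ is at most

$$p^k (n-i)^k (1-p)^n \left( \frac{1 + \frac{3}{5} \cdot i(n-i)p^2}{(1-p)^{2+k}\, k!} + \frac{(pn)^k}{(1-p)^{2k}\, k!\,k!} \right).$$

We need to find variables $u_i$ and $\gamma_{i,i+k}$ along with some $0 \le \chi \le 1$ such that all conditions of Theorem 8 are fulfilled. Define

$$u'_i := p\,(1-p)^n (n-i) \left( \frac{1 + \frac{3}{5} \cdot i(n-i)p^2}{(1-p)^3} + \frac{1}{(1-p)^2} \max_k \frac{(pn)^k}{k!\,k!} \right)$$

and

$$\gamma'_{i,i+k} := \left( \frac{p\,(n-i)}{(1-p)^2} \right)^{k-1}.$$

Observe that, for every $k \in \mathbb{N}$,

$$u'_i \gamma'_{i,i+k} \ge p^k (n-i)^k (1-p)^n \left( \frac{1 + \frac{3}{5} \cdot i(n-i)p^2}{(1-p)^{1+2k}} + \frac{(pn)^k}{(1-p)^{2k}\, k!\,k!} \right) \ge p^k (n-i)^k (1-p)^n \left( \frac{1 + \frac{3}{5} \cdot i(n-i)p^2}{(1-p)^{2+k}\, k!} + \frac{(pn)^k}{(1-p)^{2k}\, k!\,k!} \right) \ge p_{i,i+k}.$$

In order to fulfill the second condition in Theorem 8, we consider the following normalized variables: $u_i := u'_i \cdot \sum_{j=i+1}^{n} \gamma'_{i,j}$ and $\gamma_{i,j} := \gamma'_{i,j} / \sum_{j=i+1}^{n} \gamma'_{i,j}$. As $u_i \gamma_{i,j} = u'_i \gamma'_{i,j} \ge p_{i,j}$, this proves the first condition of Theorem 8.

Following the proof of Theorem 9 in [47], it is easy to show that for $\chi := 1 - \frac{1}{(1-p)^2 \log n}$ we get $\gamma'_{i,j} \ge \chi \sum_{k=j}^{m} \gamma'_{i,k}$ for all i, j with j > i (the calculations in [47] carry over by replacing $1-p$ with $(1-p)^2$). This establishes the third and last condition. As $\gamma'_{i,j} \ge \chi \sum_{k=j}^{m} \gamma'_{i,k}$ is equivalent to $\gamma_{i,j} \ge \chi \sum_{k=j}^{m} \gamma_{i,k}$, we get

$$\sum_{j=i+1}^{n} \gamma'_{i,j} \le \frac{\gamma'_{i,i+1}}{\chi} = \frac{1}{\chi},$$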

which implies, using $i(n-i)p^2 \le n(n-\ell)p^2 \le 1/\log n$ [47, (2)] as well as $1 + x \le 1/(1-x)$ for x < 1,

$$\begin{aligned}
u_i &\le \frac{p\,(1-p)^n (n-i)}{\chi} \left( \frac{1 + \frac{3}{5} \cdot i(n-i)p^2}{(1-p)^3} + \frac{1}{(1-p)^2} \max_k \frac{(pn)^k}{k!\,k!} \right) \\
&\le \frac{p\,(1-p)^{n-3} (n-i)}{\chi} \left( 1 + \frac{3}{5 \log n} + \max_k \frac{(pn)^k}{k!\,k!} \right) \\
&\le \frac{p\,(1-p)^{n-3} (n-i)}{\chi} \left( 1 + \frac{3}{5 \log n} \right) \left( 1 + \max_k \frac{(pn)^k}{k!\,k!} \right) \\
&\le \frac{p\,(1-p)^{n-3} (n-i)}{\chi \left( 1 - \frac{3}{5 \log n} \right)} \left( 1 + \max_k \frac{(pn)^k}{k!\,k!} \right).
\end{aligned}$$

Invoking Theorem 8 and recalling that the first fitness level is reached with probability at least $1 - 1/\log n$, we get a lower bound of

$$\begin{aligned}
\left(1 - \frac{1}{\log n}\right) \chi \sum_{i=\ell}^{n-1} \frac{1}{u_i}
&\ge \left(1 - \frac{1}{\log n}\right) \chi^2 \left(1 - \frac{3}{5 \log n}\right) \frac{(1-p)^3}{1 + \max_k \frac{(pn)^k}{k!\,k!}} \sum_{i=\ell}^{n-1} \frac{1}{p\,(1-p)^n (n-i)} \\
&\ge \frac{1 - O(1/\log n)}{1 + \max_k \frac{(pn)^k}{k!\,k!}} \sum_{i=\ell}^{n-1} \frac{1}{p\,(1-p)^n (n-i)},
\end{aligned}$$

where in the last step we used that all factors $\chi$, $1 - \frac{3}{5 \log n}$, and $1 - p$ are $1 - O(1/\log n)$ and $(1 - c/\log n)^d \ge 1 - cd/\log n$ for any positive constants c, d. Bounding $\sum_{i=\ell}^{n-1} \frac{1}{n-i} \ge \ln(\min\{n,\ 1/(p^2 n)\}) - \ln\log n$ as in [47] and absorbing all small-order terms in the $-O(\log\log n)$ term from the statement gives the claimed bound.

We also ran experiments to see whether the outcome matches our inspection of the dominating terms in the running time bounds for realistic problem dimensions. We chose n = 1000 bits and recorded the average optimization time over 1000 runs. The mutation rate p was set to c/n with $c \in \{0.1, 0.2, \dots, 4\}$. The result is shown in Figure 1. One can see that for every mutation rate the greedy (2+1) GA has a lower average optimization time than the (1+1) EA. As predicted, the performance difference becomes larger as the mutation rate increases. The optimal mutation rates for both algorithms match with minimal average optimization times. Note that also the variance/standard deviation was much lower for the GA for higher mutation rates. Preliminary runs for n = 100 and n = 10000 bits gave very similar results. More experiments and statistical tests are given in Section 6.

Figure 1: Average optimization times for the (1+1) EA and the greedy (2+1) GA with uniform crossover on ONEMAX with n = 1000 bits. The mutation rate p is set to c/n with $c \in \{0.1, 0.2, \dots, 4\}$. The thin lines show mean ± standard deviation. (Plot: number of evaluations against mutation rate.)

k-point Crossover

The k-point crossover operator picks k cutting points from $\{1, \dots, n-1\}$ uniformly at random without replacement. These cutting points divide both parents into segments that are then assembled from alternating parents. That is, for parents x, y and cutting points $\ell_1 < \ell_2 < \dots < \ell_k$ the offspring will be

$$x_1 \dots x_{\ell_1}\; y_{\ell_1+1} \dots y_{\ell_2}\; x_{\ell_2+1} \dots x_{\ell_3}\; y_{\ell_3+1} \dots y_{\ell_4} \dots$$

the suffix being $y_{\ell_k+1} \dots y_n$ if k is odd and $x_{\ell_k+1} \dots x_n$ if k is even.

For uniform crossover we have seen that populations containing different search points of equal fitness are beneficial as uniform crossover can easily combine the good building blocks. This holds regardless of the Hamming distance between these different individuals, and the position of bits where individuals differ. The (µ+λ) GA with k-point crossover is harder to analyse as there the probability of crossover creating an improvement depends on the Hamming distance of parents and the position of differing bits.

Consider parents that differ in two bits, where these bit positions are quite close. Then 1-point crossover has a high probability of taking both bits from the same parent. In order to recombine those building blocks, the cutting point has to be chosen in between the two bit positions. A similar effect occurs for 2-point crossover also if the two bit positions are on opposite ends of the bit string.

The following lemma gives a lower bound on the probability that k-point crossover combines the right building blocks on ONEMAX, if two parents are equally fit and differ in two bits. The lemma and its proof may be of independent interest.

Lemma 9. Consider two search points x, y with $x_i = 1$, $x_{i+d} = 0$, $y_i = 0$, $y_{i+d} = 1$ for $1 \le i < i+d \le n$ and $x_s = y_s$ for $s \notin \{i, i+d\}$. The probability of k-point crossover of x and y, for any $k \le N$, where $N := n-1 \ge 4$ is the number of possible cutting points, creating an offspring with a larger number of 1-bits is at least

$$\frac{d(N-d)}{N(N-1)}$$

and exactly d/N for k = 1.
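The k-point crossover operator described at the start of this section, and the exact value d/N that Lemma 9 states for k = 1, can be illustrated with a small sketch. The function names are hypothetical, positions are 1-based as in the text, and cutting point a is taken to lie between positions a and a+1; the prefix is always copied from the first parent.

```python
import random
from fractions import Fraction


def k_point_crossover(x, y, cuts):
    """Assemble the offspring from alternating parents, starting with x.

    `cuts` holds cutting points from {1, ..., n-1}; cutting point a lies
    between positions a and a+1 (1-based).
    """
    child, take_from_x, cut_set = [], True, set(cuts)
    for pos in range(1, len(x) + 1):
        child.append(x[pos - 1] if take_from_x else y[pos - 1])
        if pos in cut_set:          # switch parent after passing a cutting point
            take_from_x = not take_from_x
    return child


def random_k_point_crossover(x, y, k, rng=random):
    """Draw k distinct cutting points u. a. r. without replacement and recombine."""
    cuts = rng.sample(range(1, len(x)), k)
    return k_point_crossover(x, y, cuts)


def one_point_improvement_probability(x, y):
    """Exact probability that 1-point crossover of (x, y) has more 1-bits than x."""
    n = len(x)
    better = sum(1 for a in range(1, n)
                 if sum(k_point_crossover(x, y, [a])) > sum(x))
    return Fraction(better, n - 1)
```

For n = 8, i = 3 and d = 4 (so N = 7), enumerating all seven cutting points gives an improvement probability of 4/7, in line with the value d/N for k = 1.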

Proof. We identify cutting points with bits such that cutting point a results in two strings $x_1 \dots x_a$ and $x_{a+1} \dots x_n$. We say that a cutting point a separates i and i + d if $a \in \{i, \dots, i+d-1\}$. Note that the prefix is always taken from x. The claim now follows from showing that the number of separating cutting points is odd with the claimed probability.

Let $X_{N,d,k}$ be the random variable that describes the number of cutting points separating i and i + d. This variable follows a hypergeometric distribution Hyp(N, d, k), illustrated by the following urn model with red and white balls. The urn contains N balls, d of which are red. We draw k balls uniformly at random, without replacement. Then $X_{N,d,k}$ describes the number of red balls drawn. We define the probability of $X_{N,d,k}$ being odd, for $d \le N$ and $k \le N$, as

$$P(N,d,k) := \sum_{x=1,\ x \text{ odd}}^{k} \mathrm{Prob}(X_{N,d,k} = x) = \sum_{x=1,\ x \text{ odd}}^{k} \frac{\binom{d}{x} \binom{N-d}{k-x}}{\binom{N}{k}}.$$

Note that for k = 1 and for k = 2

$$P(N,d,1) = \frac{\binom{d}{1} \binom{N-d}{0}}{\binom{N}{1}} = \frac{d}{N}, \qquad P(N,d,2) = \frac{\binom{d}{1} \binom{N-d}{1}}{\binom{N}{2}} = \frac{2d(N-d)}{N(N-1)}.$$

For all $d \le N$ and all $k \le N$ the following recurrence holds. Imagine drawing the first cutting point separately. With probability d/N, the cutting point is a separating cutting point, and then we need an even number of further separating cutting points among the remaining k − 1 cutting points, drawn from a random variable $X_{N-1,d-1,k-1}$. With the remaining probability (N − d)/N, the number of remaining separating cutting points must be odd, and this number is drawn from a random variable $X_{N-1,d,k-1}$. Hence

$$P(N,d,k) = \frac{d}{N} \left(1 - P(N-1, d-1, k-1)\right) + \frac{N-d}{N}\, P(N-1, d, k-1). \qquad (6)$$

Assume for an induction that for all $2 \le k' < k$

$$\frac{d(N-d)}{N(N-1)} \le P(N,d,k') \le 1 - \frac{d(N-d)}{N(N-1)}. \qquad (7)$$

This is true for k' = 2 as, using $3d(N-d) \le 3(N/2)^2 \le N(N-1)$ for $N \ge 4$,

$$P(N,d,2) = \frac{2d(N-d)}{N(N-1)} = \frac{3d(N-d) - d(N-d)}{N(N-1)} \le 1 - \frac{d(N-d)}{N(N-1)}.$$

For k > 2, combining (6) and (7) yields

$$\begin{aligned}
P(N,d,k) &= \frac{d}{N} \left(1 - P(N-1, d-1, k-1)\right) + \frac{N-d}{N}\, P(N-1, d, k-1) \\
&\ge \frac{d}{N} \cdot \frac{(d-1)(N-d)}{(N-1)(N-2)} + \frac{N-d}{N} \cdot \frac{d(N-d-1)}{(N-1)(N-2)} \\
&= \frac{d(N-d)\,(d - 1 + N - d - 1)}{N(N-1)(N-2)} = \frac{d(N-d)}{N(N-1)}.
\end{aligned}$$

The upper bound follows similarly:

$$\begin{aligned}
P(N,d,k) &\le \frac{d}{N} \left(1 - \frac{(d-1)(N-d)}{(N-1)(N-2)}\right) + \frac{N-d}{N} \left(1 - \frac{d(N-d-1)}{(N-1)(N-2)}\right) \\
&= 1 - \frac{d(N-d)\,(d - 1 + N - d - 1)}{N(N-1)(N-2)} = 1 - \frac{d(N-d)}{N(N-1)}.
\end{aligned}$$

By induction, the claim follows.

In the setting of Lemma 9, the probability of k-point crossover creating an improvement depends on the distance between the two differing bits. Fortunately, for search points that result from a mutation of one another, this distance has a favourable distribution. This is made precise in the following lemma.

Lemma 10. Let x' result from x by a mutation flipping one 1-bit and one 0-bit, where the positions i, j of these bits are chosen uniformly among all 1-bits and 0-bits, respectively. Then for $d := |i - j|$ the random variable $\min\{d, n-d\}$ stochastically dominates the uniform distribution on $\{1, \dots, n/4\}$.

Proof. We first show the following. For any fixed index i and any integer z < n/2 there are exactly two positions j such that $\min\{d, n-d\} = z$. If $i \in \{1, \dots, n\}$ and $z \in \mathbb{N}$ are fixed, the only values for j that result in either $|i-j| = z$ or $n - |i-j| = z$ are $i+z$, $i-z$, $i+z-n$, and $i-z+n$. Note that at most two of these values are in $\{1, \dots, n\}$. Hence, there are at most 2 feasible values for j for every $d \in \mathbb{N}$.

Let $\ell$ denote the number of 1-bits in x. If $\ell \ge n/2$, we assume that first the 0-bit is chosen uniformly at random, and then consider the uniform random choice of a corresponding 1-bit. Without loss of generality assume $x_i = 0$ and $x_j = 1$. When i has been fixed and j is chosen uniformly at random, a worst case distribution for $\min\{|i-j|, n-|i-j|\}$ is attained when the 1-bits are distributed such that for each $d \le \ell/2$ both feasible bit positions are 1. The worst case is hence a uniform distribution on $\{1, \dots, \ell/2\}$, which stochastically dominates the uniform distribution on $\{1, \dots, n/4\}$.

The case $\ell < n/2$ is symmetrical: exchanging the roles of $x_j$ and $x_i$ as well as the roles of zeros and ones yields a uniform distribution on the set $\{1, \dots, (n-\ell)/2\}$ as worst case, which again stochastically dominates the uniform distribution on $\{1, \dots, n/4\}$.
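Both lemmas lend themselves to an exact sanity check on small instances. The sketch below is illustrative only and uses hypothetical helper names: `odd_probability` evaluates the parity probability P(N, d, k) of the hypergeometric variable from the proof of Lemma 9 (not the improvement probability itself), `recurrence_rhs` evaluates the right-hand side of recurrence (6), and `min_distance_cdf` computes the exact distribution of min{d, n − d} from Lemma 10 for a given bit string, using 0-based indices.

```python
from fractions import Fraction
from math import comb


def odd_probability(N, d, k):
    """P(N, d, k): probability that a Hyp(N, d, k) random variable is odd."""
    odd = sum(comb(d, x) * comb(N - d, k - x) for x in range(1, k + 1, 2))
    return Fraction(odd, comb(N, k))


def recurrence_rhs(N, d, k):
    """Right-hand side of recurrence (6); needs 1 <= d <= N-1 and 2 <= k <= N-1."""
    return (Fraction(d, N) * (1 - odd_probability(N - 1, d - 1, k - 1))
            + Fraction(N - d, N) * odd_probability(N - 1, d, k - 1))


def min_distance_cdf(x):
    """Exact CDF of min{d, n-d} when one 1-bit and one 0-bit are picked u. a. r."""
    n = len(x)
    ones = [i for i, b in enumerate(x) if b == 1]
    zeros = [j for j, b in enumerate(x) if b == 0]
    counts = {}
    for i in ones:
        for j in zeros:
            m = min(abs(i - j), n - abs(i - j))
            counts[m] = counts.get(m, 0) + 1
    pairs = len(ones) * len(zeros)
    return lambda z: Fraction(sum(c for m, c in counts.items() if m <= z), pairs)
```

For small N one can loop over 1 ≤ d ≤ N − 1 and 2 ≤ k ≤ N − 1 and compare `odd_probability(N, d, k)` with the two bounds in (7) and with `recurrence_rhs(N, d, k)`. The degenerate case k = N is left out here, since then every cutting point is chosen and the parity is determined by d alone. Stochastic dominance in Lemma 10 corresponds to `min_distance_cdf(x)(z)` being at most z/(n/4) for every z.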
Taken together, Lemma 9 and Lemma 10 indicate that k-point crossover has a good chance of finding improvements through recombining the right building blocks. However, this is based on the population containing potential parents of equal fitness that only differ in two bits. The following analysis shows that the population is likely to contain such a favourable pair of parents. However, such a pair might get lost again if other individuals of the same fitness are being created, after all duplicates have been removed from the population. For parents that differ in more than 2 bits, Lemma 9 does not apply, hence we do not have an estimate of how likely such a crossover will find an improvement.

In order to avoid this problem, we consider a more detailed tie-breaking rule. As before, individuals with fewer duplicates are being preferred. In case there are still ties after considering the number of duplicates, the (µ+λ) GA will retain older individuals. This refined tie-breaking rule is shown in Algorithm 3. As will be shown in the remainder, it implies that once a favourable pair of parents with Hamming distance 2 has been created, this pair will never get lost.

Algorithm 3: Refined tie-breaking rule (dup-old).
Let P contain the µ best individuals from $P \cup P'$; break ties towards including individuals with the fewest duplicates in $P \cup P'$. If there are still ties, break them towards including older individuals.

Figure 2: Average optimization times on ONEMAX with n = 1000 bits over 1000 runs for the greedy (2+1) GA with 1- and 2-point crossover using different tie-breaking rules if individuals are tied with regard to fitness and the number of duplicates. dup-rnd breaks these ties randomly, whereas dup-old (Algorithm 3) prefers older individuals. The mutation rate p is set to c/n with $c \in \{0.1, 0.2, \dots, 4\}$. (Plot: number of evaluations against mutation rate; one curve each for 1-point and 2-point crossover under dup-rnd and dup-old.)

This tie-breaking rule, called dup-old, differs from the one used for the experiments in Figure 1 and those in Section 6. There, we broke ties uniformly at random in case individuals are tied with respect to both fitness and the number of duplicates. We call the latter rule dup-rnd. Experiments for the greedy (2+1) GA comparing tie-breaking rules dup-old and dup-rnd over 1000 runs indicate that performance differences are very small, see Figure 2.³

Note, however, that on functions with plateaus, like royal road functions, retaining the older individuals prevents the (µ+λ) GA from performing random walks on the plateau, once the population has spread such that there are no duplicates of any individual. In this case we expect that performance will deteriorate when breaking ties towards older individuals.

With the refined tie-breaking rule, the performance of (µ+λ) GAs is as follows.
³ Even though differences are small, one-sided Mann-Whitney U tests reveal some statistically significant differences: for 1-point crossover dup-rnd is significantly faster than dup-old on a significance level of 0.001 for mutation rates at least 2.4/n with two exceptions, 2.8/n and 3.6/n, whose p-values were still small. Contrarily, dup-old was significantly faster for 2-point crossover for mutation rates in the range of 0.8/n to 3/n.

Theorem 11. The expected optimization time of every (µ+λ) GA implementing Algorithm 1 with tie-breaking rule dup-old from Algorithm 3, $2 \le \mu = O(1)$, $\lambda < \mu$, $p_c = o(1)$ and $p_c = \omega(1/\log n)$, $p = c/n$ for some constant c > 0, and k-point crossover with any $k \ge 2$, on ONEMAX is at most

$$\frac{n \ln n}{c \cdot e^{-c} \cdot (1+c)} \cdot (1 + o(1)).$$

This bound equals the upper bound (3) for (µ+λ) GAs with uniform crossover. It improves upon the previous upper bound for the greedy (2+1) GA from [46, Theorem 8], whose dominant term was by an additive term of $\frac{2c}{3+3c} \cdot n \ln n$ larger. The reason was that for the (2+1) GA favourable parents could get lost, which is now prevented by the dup-old tie-breaking rule and conditions on $p_c$.

The conditions $p_c = o(1)$ as well as $\mu, \lambda = O(1)$ are useful because they allow us to estimate the probability that a single good individual takes over the whole population with copies of itself.

In the remainder of this section we work towards proving Theorem 11 and assume that $n \ge n_0$ for some $n_0$ chosen such that all asymptotic statements that require a large enough value of n hold true. For $n < n_0$ there is nothing to prove, as the statement holds trivially for bounded n.

We again estimate the time spent on each fitness level i, i. e., when the best fitness in the current population is i. To this end, we focus on the higher fitness levels $i \ge n - n/\log n$ where the probability of creating an offspring on the same level can be estimated nicely. The time for reaching these higher fitness levels only constitutes a small-order term, compared to the claimed running time bound. The following lemma proves this claim in a more general setting than needed for the proof of Theorem 11. In particular, it holds for arbitrary tie-breaking rules and crossover operators.

Lemma 12. For every (µ+λ) GA implementing Algorithm 1 with $\mu, \lambda = O(1)$, $p_c = 1 - \Omega(1)$, and $p = c/n$ for a constant c > 0, using any initialization and any crossover operator, the expected time until a fitness level $i \ge n - n/\log n$ is reached for the first time is $o(n \log n)$.

A proof is given in the appendix. In the remainder of the section we focus on higher fitness levels $i \ge n - n/\log n$ and specify the different cases on each such fitness level.
The cases i.1, i.2, and i.3 are similar to the ones for uniform crossover, with additional conditions on the similarity of individuals in Cases i.2 and i.3. We also have an additional error state that accounts for undesirable and unexpected behavior. We pessimistically assume that the error state cannot be left towards other cases on level i.

Case i.1: The population contains an individual on level i and at least one individual on a lower fitness level.
Case i.2: The population contains µ copies of an individual x on level i.
Case i.3: The population contains two search points x, y with current best fitness i, where y resulted from a mutation of x and the Hamming distance of x and y is 2.
Case i.error: An error state reached from any Case i.* when the best fitness is i and none of the prior cases applies.

The difference to the analysis of uniform crossover is that in Case i.2 we rely on the population collapsing to copies of a single individual. This helps to estimate the probability of creating a favourable parent-offspring pair in Case i.3 as the (µ+λ) GA effectively only performs mutations of x while being in Case i.2.

Lemma 13. Consider any (µ+λ) GA as defined in Theorem 11, with parameters $2 \le \mu = O(1)$, $\lambda < \mu$, $p_c = o(1)$ and $p_c = \omega(1/\log n)$, $p = c/n$ for some constant c > 0. The total expected


More information

Problem Set 4 Due Oct, 12

Problem Set 4 Due Oct, 12 EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

Chapter 6 Principles of Data Reduction

Chapter 6 Principles of Data Reduction Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a

More information

Lecture 11: Pseudorandom functions

Lecture 11: Pseudorandom functions COM S 6830 Cryptography Oct 1, 2009 Istructor: Rafael Pass 1 Recap Lecture 11: Pseudoradom fuctios Scribe: Stefao Ermo Defiitio 1 (Ge, Ec, Dec) is a sigle message secure ecryptio scheme if for all uppt

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22 CS 70 Discrete Mathematics for CS Sprig 2007 Luca Trevisa Lecture 22 Aother Importat Distributio The Geometric Distributio Questio: A biased coi with Heads probability p is tossed repeatedly util the first

More information

Axioms of Measure Theory

Axioms of Measure Theory MATH 532 Axioms of Measure Theory Dr. Neal, WKU I. The Space Throughout the course, we shall let X deote a geeric o-empty set. I geeral, we shall ot assume that ay algebraic structure exists o X so that

More information

Runtime Analysis of the (µ+1) EA on Simple Pseudo-Boolean Functions

Runtime Analysis of the (µ+1) EA on Simple Pseudo-Boolean Functions Rutime Aalysis of the (µ+1) EA o Simple Pseudo-Boolea Fuctios Carste Witt FB Iformatik, LS 2, Uiversität Dortmud, 44221 Dortmud, Germay carste.witt@cs.ui-dortmud.de November 14, 2005 Abstract Although

More information

Beurling Integers: Part 2

Beurling Integers: Part 2 Beurlig Itegers: Part 2 Isomorphisms Devi Platt July 11, 2015 1 Prime Factorizatio Sequeces I the last article we itroduced the Beurlig geeralized itegers, which ca be represeted as a sequece of real umbers

More information

THE ASYMPTOTIC COMPLEXITY OF MATRIX REDUCTION OVER FINITE FIELDS

THE ASYMPTOTIC COMPLEXITY OF MATRIX REDUCTION OVER FINITE FIELDS THE ASYMPTOTIC COMPLEXITY OF MATRIX REDUCTION OVER FINITE FIELDS DEMETRES CHRISTOFIDES Abstract. Cosider a ivertible matrix over some field. The Gauss-Jorda elimiatio reduces this matrix to the idetity

More information

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1 EECS564 Estimatio, Filterig, ad Detectio Hwk 2 Sols. Witer 25 4. Let Z be a sigle observatio havig desity fuctio where. p (z) = (2z + ), z (a) Assumig that is a oradom parameter, fid ad plot the maximum

More information

IP Reference guide for integer programming formulations.

IP Reference guide for integer programming formulations. IP Referece guide for iteger programmig formulatios. by James B. Orli for 15.053 ad 15.058 This documet is iteded as a compact (or relatively compact) guide to the formulatio of iteger programs. For more

More information

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as

More information

Polynomial identity testing and global minimum cut

Polynomial identity testing and global minimum cut CHAPTER 6 Polyomial idetity testig ad global miimum cut I this lecture we will cosider two further problems that ca be solved usig probabilistic algorithms. I the first half, we will cosider the problem

More information

Kinetics of Complex Reactions

Kinetics of Complex Reactions Kietics of Complex Reactios by Flick Colema Departmet of Chemistry Wellesley College Wellesley MA 28 wcolema@wellesley.edu Copyright Flick Colema 996. All rights reserved. You are welcome to use this documet

More information

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4. 4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad

More information

arxiv: v1 [cs.ne] 4 Sep 2017

arxiv: v1 [cs.ne] 4 Sep 2017 Theoretical Aalysis of Stochastic Search Algorithms Per Kristia Lehre School of Computer Sciece, Uiversity of Birmigham, Birmigham, UK Pietro S. Oliveto Departmet of Computer Sciece, Uiversity of Sheffield,

More information

Reinforcement Learning Based Dynamic Selection of Auxiliary Objectives with Preserving of the Best Found Solution

Reinforcement Learning Based Dynamic Selection of Auxiliary Objectives with Preserving of the Best Found Solution Reiforcemet Learig Based Dyamic Selectio of Auxiliary Objectives with Preservig of the Best Foud Solutio arxiv:1704.07187v1 [cs.ne] 24 Apr 2017 Abstract Efficiecy of sigle-objective optimizatio ca be improved

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece 1, 1, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio

More information

PRACTICE PROBLEMS FOR THE FINAL

PRACTICE PROBLEMS FOR THE FINAL PRACTICE PROBLEMS FOR THE FINAL Math 36Q Fall 25 Professor Hoh Below is a list of practice questios for the Fial Exam. I would suggest also goig over the practice problems ad exams for Exam ad Exam 2 to

More information

Frequentist Inference

Frequentist Inference Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for

More information

arxiv: v1 [cs.ne] 2 Nov 2017

arxiv: v1 [cs.ne] 2 Nov 2017 Noame mauscript No. will be iserted by the editor) Ruig Time Aalysis of the +)-EA for OeMax ad LeadigOes uder Bit-wise Noise Chao Qia Chao Bia Wu Jiag Ke Tag Received: date / Accepted: date arxiv:7.00956v

More information

On forward improvement iteration for stopping problems

On forward improvement iteration for stopping problems O forward improvemet iteratio for stoppig problems Mathematical Istitute, Uiversity of Kiel, Ludewig-Mey-Str. 4, D-24098 Kiel, Germay irle@math.ui-iel.de Albrecht Irle Abstract. We cosider the optimal

More information

Optimally Sparse SVMs

Optimally Sparse SVMs A. Proof of Lemma 3. We here prove a lower boud o the umber of support vectors to achieve geeralizatio bouds of the form which we cosider. Importatly, this result holds ot oly for liear classifiers, but

More information

Notes for Lecture 11

Notes for Lecture 11 U.C. Berkeley CS78: Computatioal Complexity Hadout N Professor Luca Trevisa 3/4/008 Notes for Lecture Eigevalues, Expasio, ad Radom Walks As usual by ow, let G = (V, E) be a udirected d-regular graph with

More information

The coalescent coalescence theory

The coalescent coalescence theory The coalescet coalescece theory Peter Beerli September 1, 009 Historical ote Up to 198 most developmet i populatio geetics was prospective ad developed expectatios based o situatios of today. Most work

More information

(b) What is the probability that a particle reaches the upper boundary n before the lower boundary m?

(b) What is the probability that a particle reaches the upper boundary n before the lower boundary m? MATH 529 The Boudary Problem The drukard s walk (or boudary problem) is oe of the most famous problems i the theory of radom walks. Oe versio of the problem is described as follows: Suppose a particle

More information

Section 1.1. Calculus: Areas And Tangents. Difference Equations to Differential Equations

Section 1.1. Calculus: Areas And Tangents. Difference Equations to Differential Equations Differece Equatios to Differetial Equatios Sectio. Calculus: Areas Ad Tagets The study of calculus begis with questios about chage. What happes to the velocity of a swigig pedulum as its positio chages?

More information

Seunghee Ye Ma 8: Week 5 Oct 28

Seunghee Ye Ma 8: Week 5 Oct 28 Week 5 Summary I Sectio, we go over the Mea Value Theorem ad its applicatios. I Sectio 2, we will recap what we have covered so far this term. Topics Page Mea Value Theorem. Applicatios of the Mea Value

More information

Distribution of Random Samples & Limit theorems

Distribution of Random Samples & Limit theorems STAT/MATH 395 A - PROBABILITY II UW Witer Quarter 2017 Néhémy Lim Distributio of Radom Samples & Limit theorems 1 Distributio of i.i.d. Samples Motivatig example. Assume that the goal of a study is to

More information

Statistics 511 Additional Materials

Statistics 511 Additional Materials Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability

More information

Optimization Methods MIT 2.098/6.255/ Final exam

Optimization Methods MIT 2.098/6.255/ Final exam Optimizatio Methods MIT 2.098/6.255/15.093 Fial exam Date Give: December 19th, 2006 P1. [30 pts] Classify the followig statemets as true or false. All aswers must be well-justified, either through a short

More information

Polynomials with Rational Roots that Differ by a Non-zero Constant. Generalities

Polynomials with Rational Roots that Differ by a Non-zero Constant. Generalities Polyomials with Ratioal Roots that Differ by a No-zero Costat Philip Gibbs The problem of fidig two polyomials P(x) ad Q(x) of a give degree i a sigle variable x that have all ratioal roots ad differ by

More information

CS 330 Discussion - Probability

CS 330 Discussion - Probability CS 330 Discussio - Probability March 24 2017 1 Fudametals of Probability 11 Radom Variables ad Evets A radom variable X is oe whose value is o-determiistic For example, suppose we flip a coi ad set X =

More information

STAT 350 Handout 19 Sampling Distribution, Central Limit Theorem (6.6)

STAT 350 Handout 19 Sampling Distribution, Central Limit Theorem (6.6) STAT 350 Hadout 9 Samplig Distributio, Cetral Limit Theorem (6.6) A radom sample is a sequece of radom variables X, X 2,, X that are idepedet ad idetically distributed. o This property is ofte abbreviated

More information

Lecture 5: April 17, 2013

Lecture 5: April 17, 2013 TTIC/CMSC 350 Mathematical Toolkit Sprig 203 Madhur Tulsiai Lecture 5: April 7, 203 Scribe: Somaye Hashemifar Cheroff bouds recap We recall the Cheroff/Hoeffdig bouds we derived i the last lecture idepedet

More information

lim za n n = z lim a n n.

lim za n n = z lim a n n. Lecture 6 Sequeces ad Series Defiitio 1 By a sequece i a set A, we mea a mappig f : N A. It is customary to deote a sequece f by {s } where, s := f(). A sequece {z } of (complex) umbers is said to be coverget

More information

Sequences I. Chapter Introduction

Sequences I. Chapter Introduction Chapter 2 Sequeces I 2. Itroductio A sequece is a list of umbers i a defiite order so that we kow which umber is i the first place, which umber is i the secod place ad, for ay atural umber, we kow which

More information

1 Approximating Integrals using Taylor Polynomials

1 Approximating Integrals using Taylor Polynomials Seughee Ye Ma 8: Week 7 Nov Week 7 Summary This week, we will lear how we ca approximate itegrals usig Taylor series ad umerical methods. Topics Page Approximatig Itegrals usig Taylor Polyomials. Defiitios................................................

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1. Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio

More information

Basics of Probability Theory (for Theory of Computation courses)

Basics of Probability Theory (for Theory of Computation courses) Basics of Probability Theory (for Theory of Computatio courses) Oded Goldreich Departmet of Computer Sciece Weizma Istitute of Sciece Rehovot, Israel. oded.goldreich@weizma.ac.il November 24, 2008 Preface.

More information

Lecture 14: Graph Entropy

Lecture 14: Graph Entropy 15-859: Iformatio Theory ad Applicatios i TCS Sprig 2013 Lecture 14: Graph Etropy March 19, 2013 Lecturer: Mahdi Cheraghchi Scribe: Euiwoog Lee 1 Recap Bergma s boud o the permaet Shearer s Lemma Number

More information

f X (12) = Pr(X = 12) = Pr({(6, 6)}) = 1/36

f X (12) = Pr(X = 12) = Pr({(6, 6)}) = 1/36 Probability Distributios A Example With Dice If X is a radom variable o sample space S, the the probablity that X takes o the value c is Similarly, Pr(X = c) = Pr({s S X(s) = c} Pr(X c) = Pr({s S X(s)

More information

Running Time Analysis of the (1+1)-EA for OneMax and LeadingOnes under Bit-wise Noise

Running Time Analysis of the (1+1)-EA for OneMax and LeadingOnes under Bit-wise Noise Ruig Time Aalysis of the +-EA for OeMax ad LeadigOes uder Bit-wise Noise Chao Qia Uiversity of Sciece ad Techology of Chia Hefei 3007, Chia chaoqia@ustc.edu.c Wu Jiag Uiversity of Sciece ad Techology of

More information

4. Partial Sums and the Central Limit Theorem

4. Partial Sums and the Central Limit Theorem 1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems

More information

1 Review of Probability & Statistics

1 Review of Probability & Statistics 1 Review of Probability & Statistics a. I a group of 000 people, it has bee reported that there are: 61 smokers 670 over 5 960 people who imbibe (drik alcohol) 86 smokers who imbibe 90 imbibers over 5

More information

Mathematical Induction

Mathematical Induction Mathematical Iductio Itroductio Mathematical iductio, or just iductio, is a proof techique. Suppose that for every atural umber, P() is a statemet. We wish to show that all statemets P() are true. I a

More information