Reinforcement Learning Based Dynamic Selection of Auxiliary Objectives with Preserving of the Best Found Solution


Irina Petrova, Arina Buzdalova
ITMO University, 49 Kronverkskiy av., Saint-Petersburg, Russia
irenepetrova@yandex.com, abuzdalova@gmail.com

arXiv: v1 [cs.NE] 24 Apr 2017

Abstract

Efficiency of single-objective optimization can be improved by introducing some auxiliary objectives. Ideally, auxiliary objectives should be helpful. However, in practice, objectives may be efficient on some optimization stages but obstructive on others. In this paper we propose a modification of the EA+RL method, which dynamically selects optimized objectives using reinforcement learning. The proposed modification prevents the loss of the best found solution. We analysed the proposed modification and compared it with the EA+RL method and Random Local Search on the XdivK, Generalized OneMax and LeadingOnes problems. The proposed modification outperforms the EA+RL method on all problem instances. It also outperforms the single-objective approach on most problem instances. We also provide a detailed analysis of how different components of the considered algorithms influence the efficiency of optimization. In addition, we present a theoretical analysis of the proposed modification on the XdivK problem.

I. INTRODUCTION

Consider single-objective optimization of a target objective by an evolutionary algorithm (EA). Commonly, the efficiency of an EA is measured in one of two ways. In the first one, efficiency is defined as the number of fitness function evaluations needed to reach the optimum. In the second one, efficiency is computed as the maximum target objective value obtained within a fixed number of evaluations. In this work we use the first way.

Efficiency of the target objective optimization can be increased by introducing some auxiliary objectives [1]–[5]. Ideally, auxiliary objectives should be helpful [2]. However, in practice, objectives can be generated automatically and may be efficient on some optimization stages but obstructive on others [6], [7]. We call such objectives non-stationary. One of the approaches to deal with such objectives is dynamic selection of the best objective at the current stage of optimization. The objectives may be selected randomly [8]. A better method is EA+RL, which uses reinforcement learning (RL) [9], [10]. It was theoretically shown for a number of optimization problems that EA+RL works efficiently with stationary objectives [11], [12]. However, theoretical analysis of EA+RL with non-stationary objectives showed that EA+RL does not ignore an obstructive objective on the XdivK problem [13]. Selection of an inefficient auxiliary objective causes the loss of the best found solution, and the algorithm needs a lot of steps to find a good solution again. Also, EA+RL can get stuck in local optima while solving the Generalized OneMax problem with obstructive objectives [14].

In this paper we propose a modified version of EA+RL and analyse it theoretically and experimentally on the XdivK, Generalized OneMax and LeadingOnes problems. The rest of the paper is organized as follows. First, EA+RL and the model problems with non-stationary objectives are described. Second, a modification of the EA+RL method is proposed. Then we experimentally analyse the considered methods. Finally, we provide a discussion and a theoretical explanation of the achieved results.

II. PRELIMINARIES

In this section we describe the EA+RL method. We also define the model problems and non-stationary objectives used in this study.

A. EA+RL method

In reinforcement learning (RL), an agent applies an action to an environment. Then the environment returns a numerical reward and a representation of its state, and the process repeats. The goal of the agent is to maximize the total reward [9].

In the EA+RL method, the EA is treated as an environment, and selection of an objective to be optimized corresponds to an action. The agent selects an objective, the EA generates a new population using this objective and returns some reward to the agent. The reward depends on the difference of the best target objective value in two subsequent generations. We consider maximization problems.
So the higher the newly obtained target value, the higher the reward.

In the recent theoretical analysis of EA+RL with non-stationary objectives, random local search (RLS) is used instead of an EA, and the population consists of a single individual [13]. Individuals are represented as bit strings, and flip-one-bit mutation is used. If the values of the selected objective computed on the new individual and on the current one are equal, the new individual is accepted. The reinforcement learning algorithm used is Q-learning. Therefore, the EA+RL algorithm in this case is called RLS + Q-learning. The pseudocode of RLS + Q-learning is presented in Algorithm 1. The current state s is defined as the target objective value of the current individual. The reward is calculated as the difference of the target objective values in two subsequent generations. In Q-learning, the efficiency of selecting an objective h in a state s is measured by the value Q(s, h), which is updated dynamically after each selection as shown in line 13 of the pseudocode, where α and γ are the learning rate and the discount factor, respectively.

Algorithm 1 RLS + Q-learning Algorithm
 1: Individual $y \leftarrow$ a random bit string
 2: Construct set $H$ of auxiliary objectives and target objective
 3: $Q(s, h) \leftarrow 0$ for each state $s$ and objective $h \in H$
 4: while (Optimum of the target objective $t$ is not found) do
 5:   Current state $s \leftarrow t(y)$
 6:   Individual $y' \leftarrow$ mutate $y$ (flip a random bit)
 7:   Objective $h$: $Q(s, h) = \max_{h' \in H} Q(s, h')$ {If Q-values are equal, objectives are selected equiprobably}
 8:   if $h(y') \ge h(y)$ then
 9:     $y \leftarrow y'$
10:   end if
11:   New state $s' \leftarrow t(y)$
12:   Reward $r \leftarrow s' - s$
13:   $Q(s, h) \leftarrow Q(s, h) + \alpha (r + \gamma \max_{h' \in H} Q(s', h') - Q(s, h))$
14: end while

B. Model problems

In this paper we consider three model problems which were used in studies of EA+RL [12]–[14]. In all considered problems, an individual is a bit string of length $n$. Let $x$ be the number of bits in an individual which are set to one. Then the objective ONEMAX is equal to $x$ and the objective ZEROMAX is equal to $n - x$.

One of the considered problems is GENERALIZED ONEMAX, denoted as OM_d. The target objective of this problem, called OM_d, is calculated as the number of bits in an individual of length $n$ that match a given bit mask. The bit mask has $d$ 0-bits and $n - d$ 1-bits.

Another problem is XDIVK. The target objective is calculated as $\lfloor x/k \rfloor$, where $x$ is the number of ones, $k$ is a constant, $k < n$ and $k$ divides $n$.

The last considered problem is LEADINGONES. The target objective of this problem is equal to the length of the maximal prefix consisting of bits set to one.

C. Non-stationary objectives

We used the two following non-stationary auxiliary objectives for all the considered problems. Each of these auxiliary objectives can be equal to either ONEMAX or ZEROMAX at different stages of optimization. More precisely, consider the auxiliary objectives $h_1$ and $h_2$ defined in (1). The parameter $p$ is called a switch point, because at this point the auxiliary objectives change their properties. ONEMAX and ZEROMAX are denoted as OM and ZM, respectively.

$$h_1(x) = \begin{cases} \mathrm{OM}, & x \le p \\ \mathrm{ZM}, & p < x \end{cases} \qquad h_2(x) = \begin{cases} \mathrm{ZM}, & x \le p \\ \mathrm{OM}, & p < x \end{cases} \tag{1}$$

In the XDIVK problem, an objective which is currently equal to ONEMAX allows distinguishing individuals with the same value of the target objective and gives preference to the individual with a higher $x$ value. Such an individual is more likely to produce a descendant with a higher target objective value. In the LEADINGONES problem, ONEMAX is helpful because it has the same optimum but the running time of optimizing ONEMAX is lower [15]. Therefore, in the LEADINGONES and XDIVK problems the objective which is equal to ONEMAX at the current stage of optimization is helpful. In both these problems, the goal is to obtain the individual of all ones, so ZEROMAX is obstructive.

In the OM_d problem, both objectives may be obstructive or neutral. For example, if for some $i$ the $i$-th bit of the OM_d bit mask is set to 1, and the mutation operator flips the $i$-th bit of an individual from 0 to 1, the ZEROMAX objective is obstructive, because it would not accept this individual. In the inverse case, ONEMAX is obstructive.

III. MODIFIED EA+RL

In this section we propose a modification of the EA+RL method which prevents EA+RL from losing the best found solution. In the EA+RL method, if the newly generated individual is better than the existing one according to the selected objective, the new individual is accepted. However, if the selected objective is obstructive, the new individual may be worse than the existing individual in terms of the target objective. In this case the EA loses the individual with the best target objective value.
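To make the loss of the best solution concrete, the following minimal Python sketch implements the model problems of Section II-B and the objectives from (1), then evaluates one mutation on XDIVK. The function names and the example values (n = 6, k = 3, p = 1) are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch; all function names and example values are assumptions.

def one_max(y):
    """ONEMAX: number of bits set to one."""
    return sum(y)

def zero_max(y):
    """ZEROMAX: number of bits set to zero, i.e. n - x."""
    return len(y) - sum(y)

def om_d(y, mask):
    """Generalized OneMax: number of positions that match the bit mask."""
    return sum(1 for bit, m in zip(y, mask) if bit == m)

def x_div_k(y, k):
    """XdivK: floor(x / k), where x is the number of ones."""
    return sum(y) // k

def leading_ones(y):
    """LeadingOnes: length of the maximal prefix of ones."""
    count = 0
    for bit in y:
        if bit != 1:
            break
        count += 1
    return count

def h1(y, p):
    """Objective h1 from (1): ONEMAX up to the switch point p, ZEROMAX after."""
    return one_max(y) if sum(y) <= p else zero_max(y)

def h2(y, p):
    """Objective h2 from (1): ZEROMAX up to the switch point p, ONEMAX after."""
    return zero_max(y) if sum(y) <= p else one_max(y)

# One mutation on XdivK with n = 6, k = 3, p = 1 (assumed example values).
k, p = 3, 1
y = [1, 1, 1, 0, 0, 0]        # x = 3, target value 1
y_new = [1, 1, 0, 0, 0, 0]    # a 1-bit was flipped: x = 2, target value 0

# h1 currently equals ZEROMAX (x > p), so it is obstructive here.
accepted_by_ea_rl = h1(y_new, p) >= h1(y, p)
accepted_by_modified = accepted_by_ea_rl and x_div_k(y_new, k) >= x_div_k(y, k)
print(accepted_by_ea_rl, accepted_by_modified)  # True False
```

In this example EA+RL accepts the mutant and the target value drops from 1 to 0, while the additional target check rejects it.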
In the modified EA+RL, if the newly generated individual is better than the existing one according to the selected objective, but is worse according to the target objective, the new individual is rejected. As in the recent theoretical works, we use RLS as the optimization problem solver and apply Q-learning to select objectives. The pseudocode of the modified RLS + Q-learning is presented in Algorithm 2.

Algorithm 2 Modified RLS + Q-learning Algorithm
 1: Individual $y \leftarrow$ a random bit string
 2: Construct set $H$ of auxiliary objectives and target objective
 3: $Q(s, h) \leftarrow 0$ for each state $s$ and objective $h \in H$
 4: while (Optimum of the target objective $t$ is not found) do
 5:   Calculate current state $s$
 6:   Save target fitness: $f \leftarrow t(y)$
 7:   Individual $y' \leftarrow$ mutate $y$ (flip a random bit)
 8:   Objective $h$: $Q(s, h) = \max_{h' \in H} Q(s, h')$ {If Q-values are equal, objectives are selected equiprobably}
 9:   if $h(y') \ge h(y)$ and $t(y') \ge t(y)$ then
10:     $y \leftarrow y'$
11:   end if
12:   Calculate new state $s'$
13:   Calculate reward $r$
14:   $Q(s, h) \leftarrow Q(s, h) + \alpha (r + \gamma \max_{h' \in H} Q(s', h') - Q(s, h))$
15: end while

To motivate the approach to reward calculation in the new method, we need to describe how the agent learns which objective should be selected in the EA+RL method. If the agent selects an obstructive objective and the EA loses the individual with the best target value, the best target value in the new generation decreases. So the agent receives a negative reward for this objective and will not select this objective in the same state later. However, when the properties of the auxiliary objectives change, the obstructive objective may become helpful. If the properties change within one RL state, the objective which became helpful will not be selected, because the agent previously received a negative reward for it. And inversely, the objective which became obstructive could be selected because it was helpful earlier. As was shown in [13], EA+RL needs a lot of steps to get out of this trap. For this reason, in the new method we consider two versions of the modified EA+RL with different ways of reward calculation.

In the first version of the modified EA+RL, if the new individual is better than the current one according to the selected objective, and its target objective value is lower, the new individual is rejected but the agent receives a negative reward as if the new individual was accepted. So the agent learns as in EA+RL. We call this algorithm the modification of EA+RL with learning on mistakes. In this case, in line 13 of Algorithm 2 the reward is calculated as presented in Algorithm 3.

In the second version, the agent in the same situation receives zero reward because the individual in the population is not changed. Thereby, the agent does not learn if the action was inefficient and learns only if the target objective value was increased. We call this algorithm the modification of EA+RL without learning on mistakes. In this case, in line 13 of Algorithm 2 the reward is calculated as $t(y) - f$.

Algorithm 3 Reward calculation in the modified EA+RL with learning on mistakes (Algorithm 2, line 13)
 1: $r \leftarrow t(y) - f$
 2: if $h(y') \ge h(y)$ then
 3:   $r \leftarrow t(y') - f$
 4: end if

In the existing theoretical works on EA+RL, the RL state is defined as the target objective value [11], [14]. Denote it as the target state.
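As an illustration, Algorithm 2 together with both reward variants can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `rls_q_learning`, its flags (`preserve_best`, `learn_on_mistakes`, `single_state`), the step limit and the example instance (XdivK with n = 12, k = 3, p = 6) are all hypothetical. Setting `preserve_best=False` gives the acceptance rule of Algorithm 1.

```python
import random
from collections import defaultdict


def rls_q_learning(n, target, optimum, objectives, *, preserve_best=True,
                   learn_on_mistakes=False, single_state=False,
                   alpha=0.5, gamma=0.5, epsilon=0.0,
                   max_steps=10**6, rng=None):
    """Return (number of mutated individuals evaluated, final individual)."""
    rng = rng or random.Random()
    y = [rng.randint(0, 1) for _ in range(n)]
    q = defaultdict(float)                       # Q(s, h), initially 0
    actions = range(len(objectives))
    # Target state: target value of y. Single state: one constant state.
    state_of = (lambda ind: 0) if single_state else target
    steps = 0
    while target(y) < optimum and steps < max_steps:
        steps += 1
        s, f = state_of(y), target(y)            # Algorithm 2, lines 5-6
        y_new = y[:]
        y_new[rng.randrange(n)] ^= 1             # line 7: flip one random bit
        if rng.random() < epsilon:               # optional epsilon-greedy step
            h = rng.choice(actions)
        else:                                    # line 8: greedy, random ties
            best = max(q[s, a] for a in actions)
            h = rng.choice([a for a in actions if q[s, a] == best])
        ok_by_h = objectives[h](y_new) >= objectives[h](y)
        if ok_by_h and (not preserve_best or target(y_new) >= f):
            y = y_new                            # lines 9-11
        if learn_on_mistakes and ok_by_h:
            r = target(y_new) - f                # Algorithm 3
        else:
            r = target(y) - f                    # reward without learning
        s_new = state_of(y)                      # line 12
        q_next = max(q[s_new, a] for a in actions)
        q[s, h] += alpha * (r + gamma * q_next - q[s, h])   # line 14
    return steps, y


# Assumed example instance: XdivK with n = 12, k = 3, switch point p = 6.
n, k, p = 12, 3, 6

def x_div_k(y):
    return sum(y) // k

def h1(y):
    return sum(y) if sum(y) <= p else n - sum(y)

def h2(y):
    return n - sum(y) if sum(y) <= p else sum(y)

steps, best = rls_q_learning(n, x_div_k, n // k, [x_div_k, h1, h2],
                             rng=random.Random(1))
print(steps, x_div_k(best))
```

The example call runs the modification without learning on mistakes with the target state; passing `learn_on_mistakes=True`, `single_state=True` or a non-zero `epsilon` selects the other configurations discussed in the paper.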
However, if the individual with the best target objective value is preserved, the algorithm will never return to the state where it received a positive reward. So the agent never knows which objective is helpful. It can only learn that an objective is obstructive if the agent received a negative reward for it. Therefore, in the present work we consider two definitions of a state. The first one is the target state. The second one is the single state, which is the same during the whole optimization process. This state is used to investigate the efficiency of the proposed method when the agent has learned which objective is good.

IV. THEORETICAL ANALYSIS

Previously, it was shown that the EA+RL method gets stuck in local optima on XdivK with non-stationary objectives [13]. Below we present a theoretical analysis of the running time of the proposed EA+RL modification without learning on mistakes on this problem. The target state is used.

To compute the expected running time of the algorithm, we construct the Markov chain that represents the corresponding optimization process [11], [13]. Recall that RL states are determined by the target objective value. Markov states correspond to the number of 1-bits in an individual. Therefore, an RL state includes $k$ Markov states with different numbers of 1-bits.

The Markov chain for the XDIVK problem is shown in Fig. 1. The labels on transitions have the format $F, M$, where $F$ is a fitness function that can be chosen for this transition and $M$ is the corresponding effect of mutation. A transition probability is computed as the sum over all $f \in F$ of the product of the probabilities of selecting $f$ and of the corresponding mutation $m$. Further we describe each Markov state and explain why the transitions have such labels.

Fig. 1. Markov chain for the EA+RL modification without learning on mistakes on XDIVK. [Figure: Markov states dk−1, dk, dk+1, dk+2, ..., dk+k grouped into the RL states d−1, d and d+1, with the switch point p = dk; below the switch point h1 = OneMax and h2 = ZeroMax, above it h1 = ZeroMax and h2 = OneMax; transitions are labelled with the objectives t, h1, h2 and the bit flips 0→1 and 1→0.]

Consider the number of ones equal to $dk$, where $d$ is a constant. So the agent is in the RL state $d$ and the Markov state is $dk$. Since the agent has no experience in the state $d$, the objectives are selected equiprobably. If the agent selects the target objective or an objective which is currently equal to ONEMAX, and the mutation operator inverts a 1-bit, the new individual has $dk - 1$ 1-bits and is worse than the current individual according to the selected objective. So the new individual will not be accepted by the EA. The same situation occurs if the selected objective is equal to ZEROMAX and the mutation operator inverts a 0-bit. In the case of selection of the ZEROMAX objective and inversion of a 1-bit, the new individual is better than the current one according to the selected objective. However, the target objective value of the new individual is less than the target objective value of the current individual. Therefore, the new individual is not selected for the next generation. If ONEMAX or the target objective is selected and a 0-bit is inverted, the new individual is accepted.

The transitions in the Markov states $dk$ and $dk + 1$ differ from each other when a 1-bit is inverted and the agent selects the target objective or the objective which is equal to ZEROMAX. In this case, in the state $dk + 1$ the new individual is equal to or better than the current one according to the selected objective. So the new individual is accepted. However, the new individual contains fewer 1-bits than the current one, so the algorithm moves to the Markov state $dk$. Transitions in the states $dk+2, \ldots, dk+k-1$ are the same as transitions in the state $dk+1$.

From the Markov chain of the algorithm, we can see that the transitions and, as a consequence, the performance of the algorithm do not depend on the number and positions of switch points.

To analyse the running time of the considered algorithm, we also need to construct the Markov chain for RLS without auxiliary objectives (see Fig. 2). This Markov chain is constructed analogously to the Markov chain described above.
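The transition rules described above can be turned into exact numbers with a short first-step computation. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, the three objectives (target, one currently equal to ONEMAX, one currently equal to ZEROMAX) are assumed to be selected equiprobably in every state, and the process is assumed to start from Markov state 0.

```python
from fractions import Fraction


def transition_probs(i, n, k, with_objectives):
    """Return (p_up, p_down) for Markov state i, the number of 1-bits."""
    zero = Fraction(n - i, n)        # probability that a 0-bit is flipped
    one = Fraction(i, n)             # probability that a 1-bit is flipped
    at_dk = (i % k == 0)             # losing a 1-bit would lower the target
    if with_objectives:
        # Up: a 0-bit is flipped and the target or the ONEMAX-like
        # objective is selected (2 of 3 objectives).
        p_up = Fraction(2, 3) * zero
        # Down: a 1-bit is flipped and the target or the ZEROMAX-like
        # objective is selected; impossible in state dk (best is preserved).
        p_down = Fraction(0) if at_dk else Fraction(2, 3) * one
    else:
        # RLS with the target objective only.
        p_up = zero
        p_down = Fraction(0) if at_dk else one
    return p_up, p_down


def expected_running_time(n, k, with_objectives):
    """Expected number of transitions from Markov state 0 to state n."""
    total = Fraction(0)
    prev = Fraction(0)               # E(i-1 -> i)
    for i in range(n):
        p_up, p_down = transition_probs(i, n, k, with_objectives)
        # First-step analysis: E = 1 + p_down * (prev + E) + p_stay * E,
        # which gives E = (1 + p_down * prev) / p_up.
        cur = (1 + p_down * prev) / p_up
        total += cur
        prev = cur
    return total


rls_time = expected_running_time(6, 3, with_objectives=False)
mod_time = expected_running_time(6, 3, with_objectives=True)
print(rls_time, mod_time, mod_time / rls_time)   # 273/5 819/10 3/2
```

For n = 6 and k = 3 the ratio of the two expected running times is exactly 3/2. The switch point does not appear anywhere in the computation, which matches the observation that the transitions do not depend on it.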
The expected running time of the EA+RL modification without learning on mistakes for XDIVK with non-stationary objectives is equal to the number of fitness function evaluations needed to get from the Markov state 0 to the Markov state $n$. Each transition in the Markov chain corresponds to one fitness function evaluation of the mutated individual. So the expected running time is equal to the expected number of transitions in the Markov chain. Denote the expected running time of the algorithm as $T(n)$:

$$T(n) = \sum_{i=0}^{n-1} E(i \to i+1), \tag{2}$$

where $E(i \to i+1)$ is the expected number of transitions needed to reach the Markov state $i + 1$ from the state $i$. Consider two cases for the state $i$. The first is $i = dk$, where $d$ is a constant. The expected number of transitions needed to reach the state $dk + 1$ from the state $dk$ is evaluated as $z_{dk} = E(dk \to dk+1)$:

$$z_{dk} = \frac{2}{3} \cdot \frac{n - dk}{n} \cdot 1 + \left( \frac{2}{3} \cdot \frac{dk}{n} + \frac{1}{3} \right) (1 + z_{dk}) \tag{3}$$

Fig. 2. Markov chain for RLS without auxiliary objectives on XDIVK. [Figure: Markov states dk−1, dk, dk+1, dk+2, ..., dk+k; transitions are labelled with the target objective t and the bit flips 0→1 and 1→0.]

From (3) we obtain that $z_{dk}$ is evaluated as:

$$z_{dk} = \frac{3n}{2(n - dk)} \tag{4}$$

The second case is $i = dk + t$, where $1 \le t \le k - 1$. The expected number of transitions needed to reach the state $dk + t + 1$ from the state $dk + t$ is evaluated as $z_{dk+t} = E(dk + t \to dk + t + 1)$:

$$z_{dk+t} = \frac{2(n - dk - t)}{3n} + \frac{2(dk + t)}{3n} (1 + z_{dk+t-1} + z_{dk+t}) + \left( \frac{dk + t}{3n} + \frac{n - dk - t}{3n} \right) (1 + z_{dk+t}) \tag{5}$$

From (5) we obtain that $z_{dk+t}$ is evaluated as:

$$z_{dk+t} = z_{dk+t-1} \cdot \frac{dk + t}{n - dk - t} + \frac{3n}{2(n - dk - t)} \tag{6}$$

To estimate the efficiency of the analysed EA+RL modification, let us calculate the expected running time of RLS without auxiliary objectives. The evaluation approach is similar to the one presented above for the EA+RL modification. The total running time is calculated by (2). Analogously, we consider two cases: $i = dk$ and $i = dk + t$. The expected number of transitions needed to reach the state $dk + 1$ from the state $dk$ is evaluated as $a_{dk} = E(dk \to dk+1)$:

$$a_{dk} = \frac{n - dk}{n} \cdot 1 + \frac{dk}{n} (1 + a_{dk}) \tag{7}$$

From (7) we obtain that $a_{dk}$ is evaluated as:

$$a_{dk} = \frac{n}{n - dk} \tag{8}$$

The expected number of transitions needed to reach the state $dk + t + 1$ from the state $dk + t$ is evaluated as $a_{dk+t} = E(dk + t \to dk + t + 1)$:

$$a_{dk+t} = \frac{n - dk - t}{n} \cdot 1 + \frac{dk + t}{n} (1 + a_{dk+t-1} + a_{dk+t}) \tag{9}$$

From (9) we obtain that $a_{dk+t}$ is evaluated as:

$$a_{dk+t} = a_{dk+t-1} \cdot \frac{dk + t}{n - dk - t} + \frac{n}{n - dk - t} \tag{10}$$

From (4) and (8) we obtain that $z_{dk} = \frac{3}{2} a_{dk}$. From the equations (2), (6), (10), using mathematical induction, we obtain that the running time of the EA+RL modification without learning on mistakes for XDIVK with non-stationary objectives is 1.5 times greater than the running time of RLS without auxiliary objectives. Therefore, the EA+RL modification without learning on mistakes has asymptotically the same running time as RLS, which is bounded by $\Omega(n^k)$ and $O(n^{k+1})$ [11]. This shows that the EA+RL modification without learning on mistakes can deal with non-stationary auxiliary objectives, unlike EA+RL.

V. EXPERIMENTAL ANALYSIS OF MODIFIED EA+RL

In this section we experimentally evaluate the efficiency of both proposed EA+RL modifications on the optimization problems defined in Section II-B.

A. Description of experiments

We analysed the two versions of the proposed modification of EA+RL and the EA+RL method on OM_d, XDIVK and LEADINGONES. The non-stationary objectives described in (1) were used for all the problems. For XDIVK we analysed two cases of the switch point position. The first case is the worst case [13], when the switch point is in the end of optimization, $p = n - k + 1$. In the second case, the switch point is in the middle of optimization, $p = n/2$.

For each algorithm we analysed two state definitions: the single state and the target state. We also studied the application of the ε-greedy strategy. In this strategy the agent selects the objective with the maximum expected reward with probability 1 − ε, and with probability ε the agent selects a random objective. This strategy gives the agent an opportunity to select the objective which was inefficient but became efficient after the switch point.
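The ε-greedy rule described above can be sketched in a few lines of Python. The function name `select_objective` and the example Q-values are illustrative assumptions; ties between equal Q-values are broken uniformly at random, as in the pseudocode of the algorithms.

```python
import random


def select_objective(q_values, epsilon, rng):
    """Return the index of the objective chosen by the epsilon-greedy rule.

    q_values holds Q(s, h) for the current state s, one entry per objective.
    """
    if rng.random() < epsilon:
        # Exploration: any objective, including one with a negative Q-value.
        return rng.randrange(len(q_values))
    best = max(q_values)
    # Exploitation: a maximal Q-value, ties broken equiprobably.
    return rng.choice([i for i, q in enumerate(q_values) if q == best])


# Assumed example: objective 0 earlier received a negative reward.
rng = random.Random(0)
q_values = [-0.5, 0.0, 0.25]
greedy_picks = [select_objective(q_values, 0.0, rng) for _ in range(1000)]
explore_picks = [select_objective(q_values, 0.1, rng) for _ in range(10000)]
print(greedy_picks.count(0), explore_picks.count(0) > 0)
```

With ε = 0 the objective with the negative Q-value is never selected again in this state, while with ε = 0.1 it is still tried occasionally, which is what allows the agent to notice that it became efficient.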
The obtained numbers of fitness function evaluations needed to reach the optimum were averaged over 1000 runs. We used the Q-learning algorithm with the same parameters as in [13]. The learning rate was set to α = 0.5 and the discount factor was equal to γ = 0.5. The ε-greedy strategy was used with ε = 0.1.

B. Discussion of experiment results

Table I presents the results of the experiments. The first column contains parameter values for the considered problems. The next column contains the results of RLS without auxiliary objectives. The next three columns correspond to the results of the EA+RL modification with learning on mistakes (modified EA+RL, learning). The following three columns correspond to the results of the EA+RL modification without learning on mistakes (modified EA+RL, no learning). The last three columns correspond to the results of the EA+RL method (EA+RL). Each algorithm was analysed on the single state (ss, ε = 0 and ss, ε = 0.1) and the target state (ts, ε = 0 and ts, ε = 0.1). None of the algorithms reached the optimum using the single state and ε = 0, so these results are not presented. When the optimum was not reached within $10^9$ iterations, the corresponding result is marked as inf.

We can see from Table I that the modification of EA+RL with learning on mistakes using the single state and ε = 0.1 is the most efficient algorithm on LEADINGONES, OM_d and XDIVK with switch point in the middle. On LEADINGONES and XDIVK with switch point in the middle this algorithm ignores an inefficient objective and selects an efficient one, so the achieved results are better than the results of RLS without objectives. On OM_d this modification ignores obstructive objectives and achieves the same results as RLS. On XDIVK with switch point in the end, the best results are achieved using the modification of EA+RL without learning on mistakes. For each problem, we picked the best configuration of each algorithm and compared them by the Mann-Whitney test with Bonferroni correction.
The algorithms were statistically distinguishable with p-value less than […].

Below we analyse how different components of the considered methods influence optimization performance. More precisely, we consider the influence of the best individual preservation, learning which objective is obstructive, state definition and ε-greedy strategy.

TABLE I
AVERAGED NUMBER OF RUNS NEEDED TO REACH THE GLOBAL OPTIMUM
Columns: Parameters | RLS | modified EA+RL, learning (ss, ε = 0.1; ts, ε = 0; ts, ε = 0.1) | modified EA+RL, no learning (ss, ε = 0.1; ts, ε = 0; ts, ε = 0.1) | EA+RL (ss, ε = 0.1; ts, ε = 0; ts, ε = 0.1)
Row groups: LeadingOnes (n); OM_d (n, d); XdivK, switch point in the end (n, k); XdivK, switch point in the middle (n, k)
[The numeric entries of the table are not recoverable from the transcription.]

1) Influence of learning which objective is obstructive: We can see from the results that the modification of EA+RL without learning is outperformed by the version with learning on LEADINGONES and XDIVK with switch point in the middle. Therefore, learning on mistakes is useful because it allows the agent to remember that the objective is obstructive and not to select it further. However, on the XDIVK problem with switch point in the end, the best results are achieved using the modification of EA+RL without learning on mistakes. Below we explain why learning on mistakes is not always efficient.

If some objective becomes helpful, the agent will not select this objective in the same state because it obtained a negative reward for it. So there are two ways for the agent to learn that the obstructive objective became helpful. The first way is to select this objective with probability ε and receive a positive reward. The second way is to move to a new state, where the agent has not learned which objective is efficient. This is impossible when using the single state.

Consider what actions the agent should take to move to a new state if the target state is used. The only way to move to a new state is to increase the target objective value. The agent could select the target objective or the objective which was efficient but became obstructive. So the target objective value can be increased only if the new individual has a higher target objective value and the selected objective is the target one. In some optimization problems it is not always possible to increase the target objective in one iteration of the algorithm. For example, consider the XDIVK problem. Let the number of 1-bits be $dk$, so the RL state is $d$. To move to the state $d+1$, the algorithm needs to mutate $k$ 0-bits. Let the switch point $p$ be equal to $dk + l$, where $0 < l < k$. Then if the number of 1-bits is greater than $dk + l$, the algorithm can increase the number of 1-bits only if a 0-bit is mutated and the target objective is selected. However, whatever bit is mutated and whatever objective is selected, the target objective value stays unchanged until an individual with $dk + k$ 1-bits is obtained. So the agent does not recognize whether its action is good or bad, because the reward is equal to zero. Therefore, to increase the target objective value the algorithm needs a lot of steps. In addition, if the switch point is in the end of optimization, the probability of mutating a 0-bit and selecting the target objective at the same time is low. The worst case is when the switch point is equal to $n - k + 1$, because the agent needs to select the target objective and increase the number of 1-bits during $k - 1$ iterations. The modification of EA+RL without learning on mistakes does not have this drawback, because, as was shown in the theoretical analysis, it does not depend on the number of switch points and their positions.

To conclude, the modification of EA+RL with learning performs better than the proposed modification without learning, except in the case when the switch point occurs at a time when it is hard to generate good solutions (stagnation).

2) Influence of best individual preservation: Thanks to preserving the individual with the best target value, the proposed modification of EA+RL achieves the optimum on the LEADINGONES and OM_d problems, unlike the EA+RL method. However, we can see that on XDIVK with switch point in the middle EA+RL achieves better results despite the possibility of losing the best individual. This is explained by the fact that EA+RL learns which objective is obstructive, and for this problem it is more important than preserving the best individual. The inverse situation is observed for XDIVK with switch point in the end, where learning does not help to achieve the optimum faster even when the best individual is preserved (see Section V-B1).

To conclude, preservation of the best individual improves the EA+RL method. However, learning which objective is efficient has a greater impact on the algorithm performance.

3) Influence of state definition: To begin with, let us separately consider RL efficiency when using the target state and the single state. Then we will discuss how the state definition influences the performance of the two proposed modifications of EA+RL.

Consider the target state. The algorithm moves to a new state if the target objective value is increased and, consequently, a positive reward is obtained. If the target objective value cannot be decreased, the algorithm never returns to the state where it received a positive reward. Therefore, the agent never knows which action is efficient if the target state is used.

The single state, contrary to the target state, allows the agent to learn which objective is helpful.
However, when the objective which was efficient becomes obstructive, the agent continues to select this objective. Re-learning that the objective which was helpful became obstructive is more difficult when using the single state than if the target state is used (see Section V-B1). None of the algorithms using the single state with ε = 0 reached the optimum. So without the ε-greedy strategy the agent could not re-learn. Also, EA+RL without preservation of the best obtained individual does not achieve the optimum using the single state.

Consider the influence of state definition in the EA+RL modification with learning. When solving LEADINGONES, the positive effect of the ability to learn which objective is efficient is more important than the negative effect of difficult re-learning. We can see different influences of these negative and positive effects when solving the XDIVK problem with switch point in the middle. When k becomes bigger, it is more important to select an efficient objective during the k iterations where the agent does not receive a reward and cannot determine which objective is better. So we can see that for k = 3 the single state is much better than the target state. On the XDIVK problem with switch point in the end, the results obtained using the single state and the target state with ε = 0.1 are the same. These results are better than the results obtained using the target state with ε = 0. Therefore, the impact of ε is greater than the impact of state definition.

In the EA+RL modification without learning the agent does not receive a negative reward, so re-learning when using the single state is harder. Re-learning can occur only if a positive reward is obtained when a better solution is generated. In XDIVK, many iterations are needed to increase the target objective value (see Section V-B1). Therefore, the results on XDIVK for the target state are better than the results for the single state. In LEADINGONES, generating an individual with a higher target objective value is not so difficult. Therefore, using the single state the algorithm obtains better results than using the target state.
To conclude, in the modification of EA+RL with learning on mistakes the single state is better in most cases. In the modification of EA+RL without learning the single state is worse than the target state only on XDIVK, where re-learning is difficult. We can also note that EA+RL does not achieve the optimum using the single state.

4) Influence of ε-greedy strategy: Consider the influence of the ε value on the performance of the considered algorithms when using the target state. In this state the agent can only learn that an objective is obstructive, if a negative reward is received after applying this objective. So this objective will not be selected in the same state even if it becomes helpful. A non-zero ε allows the agent to select this objective.

In the modification of EA+RL without learning on mistakes the agent can obtain only a non-negative reward. So the situation described above is impossible in this modification. Therefore, the ε value does not influence the efficiency of this modification. In the modification of EA+RL with learning on mistakes a non-zero ε is helpful only on the XDIVK problem with switch point in the end. On the other problems the results obtained with a non-zero ε are worse than the results obtained with ε = 0.

When using the single state, ε allows the agent to select the objective which was obstructive, as if the target state was used. However, if the single state is used, the agent learns not only that an objective is inefficient, but also that an objective is efficient. Therefore, ε also allows the agent not to select the objective which was efficient but became obstructive. This results in reaching the optimum when using a non-zero ε in the single state.

To conclude, if the single state is used, the ε value has to be non-zero. If the target state is used, a non-zero ε allows achieving better results only if re-learning is very difficult, such as in XDIVK with switch point in the end.

VI. CONCLUSION

We proposed a modification of the EA+RL method which preserves the best found solution. We considered two versions of the proposed modification.
I the first versio, called the modificatio of EA+RL without learig o mistakes, the RL aget lears oly whe the algorithm fids a better solutio.

In the second version, called the modification of EA+RL with learning on mistakes, the RL agent also learns when the algorithm obtains an inefficient solution. We considered two auxiliary objectives which change their efficiency at a switch point. We experimentally analysed the two proposed modifications and the EA+RL method on OM_d, LEADINGONES, XDIVK with the switch point in the middle of optimization and XDIVK with the switch point at the end. Two RL state definitions were considered: the single state and the target state. We also considered how the ε-greedy exploration strategy influences the performance of the algorithm.

Both proposed modifications reached the optimum on OM_d and LEADINGONES, unlike the EA+RL method. The modification of EA+RL with learning on mistakes using the single state and ε = 0.1 achieved the best results among the considered objective selection methods on all problems except XDIVK with the switch point at the end. On LEADINGONES and XDIVK with the switch point in the middle, this algorithm was able to select an efficient objective and obtain better results than RLS. On OM_d without helpful objectives, this modification ignored obstructive objectives and achieved the same results as RLS. Therefore, keeping the best individual and using ε-greedy exploration in the single state seems to be the most promising reinforcement-based objective selection approach.

We theoretically proved that the lower and upper bounds on the running time of the modification of EA+RL without learning on mistakes on the XDIVK problem are $\Omega(n^k)$ and $O(n^{k+1})$, respectively. The asymptotic running time of RLS on XDIVK without auxiliary objectives is the same. This means that the modification of EA+RL without learning on mistakes on the XDIVK problem ignores the objective which is currently obstructive. We also proved that the performance of this modification is independent of the number of switch points and their positions, while the performance of the modification with learning on mistakes depends on these factors. In particular, the modification without learning on mistakes achieves the best results on XDIVK with the switch point at the end. This is an especially difficult case because increasing the target objective value at the end of optimization requires many iterations.

VII. ACKNOWLEDGMENTS

This work was supported by RFBR according to the research project No. mol_a.

REFERENCES

[1] C. Segura, C. A. C. Coello, G. Miranda, and C. León, "Using multiobjective evolutionary algorithms for single-objective optimization," 4OR, vol. 3, no. 11.
[2] J. D. Knowles, R. A. Watson, and D. Corne, "Reducing local optima in single-objective problems by multi-objectivization," in Proceedings of the First International Conference on Evolutionary Multi-Criterion Optimization. Springer-Verlag, 2001.
[3] F. Neumann and I. Wegener, "Can single-objective optimization profit from multiobjective optimization?" in Multiobjective Problem Solving from Nature, ser. Natural Computing Series. Springer Berlin Heidelberg, 2008.
[4] D. Brockhoff, T. Friedrich, N. Hebbinghaus, C. Klein, F. Neumann, and E. Zitzler, "On the effects of adding objectives to plateau functions," IEEE Transactions on Evolutionary Computation, vol. 13, no. 3.
[5] J. Handl, S. C. Lovell, and J. D. Knowles, "Multiobjectivization by decomposition of scalar cost functions," in Parallel Problem Solving from Nature PPSN X, ser. Lecture Notes in Computer Science. Springer, 2008, no. 5199.
[6] M. Buzdalov and A. Buzdalova, "Adaptive selection of helper-objectives for test case generation," in 2013 IEEE Congress on Evolutionary Computation, vol. 1, 2013.
[7] D. F. Lochtefeld and F. W. Ciarallo, "Helper-objective optimization strategies for the Job-Shop scheduling problem," Applied Soft Computing, vol. 11, no. 6.
[8] M. T. Jensen, "Helper-objectives: Using multi-objective evolutionary algorithms for single-objective optimisation: Evolutionary computation combinatorial optimization," Journal of Mathematical Modelling and Algorithms, vol. 3, no. 4.
[9] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA, USA: MIT Press.
[10] A. Buzdalova and M. Buzdalov, "Increasing efficiency of evolutionary algorithms by choosing between auxiliary fitness functions with reinforcement learning," in Proceedings of the International Conference on Machine Learning and Applications, vol. 1, 2012.
[11] M. Buzdalov and A. Buzdalova, "OneMax helps optimizing XdivK: Theoretical runtime analysis for RLS and EA+RL," in Proceedings of Genetic and Evolutionary Computation Conference Companion. ACM, 2014.
[12] M. Buzdalov and A. Buzdalova, "Can OneMax help optimizing LeadingOnes using the EA+RL method?" in Proceedings of IEEE Congress on Evolutionary Computation, 2015.
[13] I. Petrova, A. Buzdalova, and G. Korneev, "Runtime analysis of random local search with reinforcement based selection of non-stationary auxiliary objectives: initial study," in Proceedings of 22nd International Conference on Soft Computing MENDEL 2016, Czech Republic, 2016.
[14] A. Buzdalova, I. Petrova, and M. Buzdalov, "Runtime analysis of different approaches to select conflicting auxiliary objectives in the generalized OneMax problem," in Proceedings of IEEE Symposium Series on Computational Intelligence, 2016.
[15] A. Auger and B. Doerr, Theory of Randomized Search Heuristics: Foundations and Recent Developments. River Edge, NJ, USA: World Scientific Publishing Co., Inc.


The Rand and block distances of pairs of set partitions The Rad ad block distaces of pairs of set partitios Frak Ruskey 1 ad Jeifer Woodcock 1 Dept. of Computer Sciece, Uiversity of Victoria, CANADA Abstract. The Rad distaces of two set partitios is the umber

More information

Randomized Algorithms I, Spring 2018, Department of Computer Science, University of Helsinki Homework 1: Solutions (Discussed January 25, 2018)

Randomized Algorithms I, Spring 2018, Department of Computer Science, University of Helsinki Homework 1: Solutions (Discussed January 25, 2018) Radomized Algorithms I, Sprig 08, Departmet of Computer Sciece, Uiversity of Helsiki Homework : Solutios Discussed Jauary 5, 08). Exercise.: Cosider the followig balls-ad-bi game. We start with oe black

More information

Complex Stochastic Boolean Systems: Generating and Counting the Binary n-tuples Intrinsically Less or Greater than u

Complex Stochastic Boolean Systems: Generating and Counting the Binary n-tuples Intrinsically Less or Greater than u Proceedigs of the World Cogress o Egieerig ad Computer Sciece 29 Vol I WCECS 29, October 2-22, 29, Sa Fracisco, USA Complex Stochastic Boolea Systems: Geeratig ad Coutig the Biary -Tuples Itrisically Less

More information

Benchmark Fitness Landscape Analysis

Benchmark Fitness Landscape Analysis Bechmark Fitess Ladscape Aalysis Galia Merkuryeva, Vitalijs Bolshakovs Departmet of Modellig ad Simulatio Riga Techical Uiversity Riga, Latvia e-mail: galia.merkurjeva@rtu.lv; vitalijs.bolsakovs@rtu.lv

More information

arxiv: v1 [cs.sc] 2 Jan 2018

arxiv: v1 [cs.sc] 2 Jan 2018 Computig the Iverse Melli Trasform of Holoomic Sequeces usig Kovacic s Algorithm arxiv:8.9v [cs.sc] 2 Ja 28 Research Istitute for Symbolic Computatio RISC) Johaes Kepler Uiversity Liz, Alteberger Straße

More information

De Bruijn Sequences for the Binary Strings with Maximum Specified Density

De Bruijn Sequences for the Binary Strings with Maximum Specified Density De Bruij Sequeces for the Biary Strigs with Maximum Specified Desity Joe Sawada 1, Brett Steves 2, ad Aaro Williams 2 1 jsawada@uoguelph.ca School of Computer Sciece, Uiversity of Guelph, CANADA 2 brett@math.carleto.ca

More information

Ma 530 Introduction to Power Series

Ma 530 Introduction to Power Series Ma 530 Itroductio to Power Series Please ote that there is material o power series at Visual Calculus. Some of this material was used as part of the presetatio of the topics that follow. What is a Power

More information

Algorithm of Superposition of Boolean Functions Given with Truth Vectors

Algorithm of Superposition of Boolean Functions Given with Truth Vectors IJCSI Iteratioal Joural of Computer Sciece Issues, Vol 9, Issue 4, No, July ISSN (Olie: 694-84 wwwijcsiorg 9 Algorithm of Superpositio of Boolea Fuctios Give with Truth Vectors Aatoly Plotikov, Aleader

More information

11 Hidden Markov Models

11 Hidden Markov Models Hidde Markov Models Hidde Markov Models are a popular machie learig approach i bioiformatics. Machie learig algorithms are preseted with traiig data, which are used to derive importat isights about the

More information

Drift analysis and average time complexity of evolutionary algorithms

Drift analysis and average time complexity of evolutionary algorithms Artificial Itelligece 127 (2001) 57 85 Drift aalysis ad average time complexity of evolutioary algorithms Ju He a,xiyao b, a Departmet of Computer Sciece, Norther Jiaotog Uiversity, Beijig 100044, PR Chia

More information