Genetic Algorithms and permutation problems: a comparison of recombination operators for neural net structure specification

Peter J.B. Hancock
Centre for Cognitive and Computational Neuroscience
Departments of Psychology and Computing Science
University of Stirling, Scotland, FK9 4LA

Abstract

The specification of neural net architectures by genetic algorithm is thought to be hampered by difficulties with crossover. This is the "permutation" or "competing conventions" problem: similar nets may have the hidden units defined in different orders, so that they have very dissimilar genetic strings, preventing successful recombination of building blocks. Previous empirical tests of a number of recombination operators using a simulated net-building task indicated the superiority of one that sorts hidden unit definitions by overlap prior to crossover. However, simple crossover also fared well, suggesting that the permutation problem is not serious in practice. This is supported by an observed reduction in performance when the permutation problem is removed. The GA is shown to be able to resolve the permutations, so that the advantages of an increase in the number of maxima outweigh the difficulties of recombination.

1 Introduction

It is well established that the performance of Backprop-trained neural nets is strongly dependent on their internal structure, one reason being over-fitting of the training data if there are too many weights. It is quite possible to halve the error rate on a test set by appropriately pruning connections (Hancock, 1992b). Unfortunately, there are few established guidelines for deciding a priori which connections are important for a given task. Genetic algorithms offer one possible method of exploring the space of connectivities. Miller et al (1989) suggest a classification of methods for coding nets on a genetic string. Strong codings specify each connection individually, while weak codings are more like growth rules, that may specify many units and connections simultaneously, perhaps stochastically. This paper is concerned with the precise definition of relatively small nets, so strong coding, also known as direct coding, is appropriate.

The simplest such coding uses one bit per connection. Miller et al (1989) use a complete connection matrix, allowing any unit to connect to any other. While this achieved their aim of freeing the GA of their preconceptions about likely patterns of connectivity, it does allow nets with recurrent connections to be specified, which causes problems for a training algorithm such as Backprop. In this work a more limited model is considered, that of a net with a single hidden layer, and forward connections only between adjacent layers. The genetic string then consists of a simple concatenation of hidden unit definitions, each specified by one bit for each input and output unit.

With such a net specification there is a potentially severe problem, known variously as the competing conventions, permutation, isomorphism and the structural/functional mapping problem. With a purely feed-forward net, there is no significance to the order of the hidden units. This means that the genetic coding is redundant, in the sense that many different strings will code for the same net. There will be n! possible codings of a net with n hidden units. If crossover attempts to combine two strings that have a different order, the result is liable to contain two copies of some units, and none of others. Figure 1 illustrates the potential problem with a net that has a 2-dimensional input, such as an image, and hidden units that have localised receptive fields (RFs). Two good parents may combine to give two poor children. The same potential problem affects efforts to train net weights by GA. Note, however, that this paper is concerned only with the specification of the connectivity of a net, and not the weights associated with the connections. Once the specified net has been built, weights are assumed to be initialised randomly prior to training with an algorithm such as Backprop.

Figure 1: Recombination of nets with localised receptive fields. The parent nets each have 4 similar hidden units, but defined in different orders on the genetic string, so unit 2 on parent 1 is most like unit C on parent 2. Here, there is a single crossover point, between the second and third unit definitions (Parent 1: 1234; Parent 2: ABCD; Child 1: AB34; Child 2: 12CD). The resulting nets each have two copies of similar hidden units, and do not cover the input space.

While this problem is widely recognised (e.g. Belew et al (1990)), there have been few attempts to solve it. Indeed, a number of workers have simply removed crossover from the algorithm, e.g. (Bornholdt & Graudenz, 1992; de Garis, 1990; Nolfi et al., 1990), which seems drastic given the centrality of recombination to the GA model. Whitley et al (1991) report successful results for training the weights of nets by GA, using what they term a genetic hill-climber, which is essentially mutation driven. This is indicated by a decrease in the number of evaluations required as the population size is reduced to one.
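To make the failure mode of Figure 1 concrete, here is a minimal Python sketch (mine, not the paper's code) of the direct encoding and a single unit-boundary crossover; the 4-bit receptive-field masks and all names are hypothetical. Two parents containing the same four units in different orders produce children with duplicated and missing units:

```python
# Minimal sketch of the direct encoding: a hidden unit is a bitmask over the
# inputs it connects to; a genome is the concatenation of unit definitions.
def make_genome(units):
    return [bit for unit in units for bit in unit]

def one_point_unit_crossover(p1, p2, unit_len, cross_unit):
    """Single crossover point at a unit boundary, as in Figure 1."""
    cut = cross_unit * unit_len
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

# Four hypothetical 4-bit receptive-field masks.
u1, u2, u3, u4 = (1,0,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1)
parent1 = make_genome([u1, u2, u3, u4])   # order 1, 2, 3, 4
parent2 = make_genome([u3, u4, u1, u2])   # same units, permuted order
c1, c2 = one_point_unit_crossover(parent1, parent2, unit_len=4, cross_unit=2)
# c1 now holds u1,u2,u1,u2 and c2 holds u3,u4,u3,u4: each child duplicates
# two receptive fields and omits two, so neither covers the input space.
```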

Recently, Radcliffe (1993) has suggested a matching recombination operator that attempts to overcome the permutation problem. Preliminary tests on this operator, and an extension proposed by the author, indicated that both tended to perform less well than a simple form of crossover (Hancock, 1992c). This paper explores further the causes and implications of this observation.

2 Addressing the permutation problem

The permutation problem stems from the possibility that units which are equivalent in the net are defined in different places on two parent strings. Ideally, we should like a method of identifying the equivalent units prior to crossover. Montana and Davis (1989) suggested such a method for use in the context of training net weights by GA. They matched hidden units by their responses to a number of trial inputs applied to the net. Radcliffe (1993) has suggested a method that applies to net structure definition. This treats net specifications as a multiset, where the elements are the possible hidden unit connectivities. The hidden layer is a multiset, because it is possible to have more than one copy of a unit with given connectivity (in the limit, all the units might be identical). Radcliffe suggests searching through the units defined in each parent to match those which are identical. Such units are transmitted to the child string with a high probability. A limitation of this algorithm is that it only matches units which are identical: if they differ in one connection they are treated just as if they differ in every connection. An extension was therefore proposed, which matches units on the basis of the number of connections they have in common (Hancock, 1992a; Hancock, 1992c).

Since the unit definitions take the form of binary strings, Hamming distance might seem an appropriate measure of distance. However, the aim of the matching process is to identify units that might play a similar role in the final trained net. Suppose we are specifying the connections to hidden units from 8 inputs. It seems unlikely that a unit specified by 11000000 would be at all similar to one specified by 00000011, but, because of the zeros in common, Hamming distance shows them to be quite close, and relatively more so as the number of unconnected inputs increases. A more plausible measure of overlap is given by counting only connections in common, for units defined by binary strings k and l:

overlap = (inputs in common) / (total inputs connected) = |k AND l| / |k OR l|   (1)

Hancock (1992c) compared Radcliffe's operator, and a modified version that matched unit definitions on overlap, with a simple crossover, which will be referred to as Uniform. In Uniform, multiple cross points are allowed, occurring at the boundary of unit definitions with (fairly high) probability p_u, and within unit definitions with (lower) probability p_b. The results, on a simulated net-building task (see section 3), showed that Uniform often performed better than either of the more complex operators. However, a modified version of Uniform, referred to as Sort, was best. The recombination phase of this is just like Uniform, but the strings are sorted prior to crossover. An overlap matrix is built and the two parent strings are then reordered such that units in equivalent positions are paired in order of decreasing similarity. This allows crossover to select between equivalent units in the two parents, as estimated by the overlap measure.
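As a concrete reading of equation (1) and the Sort reordering just described, the following Python sketch (mine, not the paper's code) computes the overlap of two binary unit definitions and greedily pairs the most similar units of two parents. The greedy pairing is an assumption about how "paired in order of decreasing similarity" might be implemented, and it assumes equal numbers of units in the two parents:

```python
def overlap(k, l):
    """Equation (1): connections in common over total connections used,
    i.e. |k AND l| / |k OR l| for binary unit definitions k and l."""
    both = sum(a & b for a, b in zip(k, l))
    either = sum(a | b for a, b in zip(k, l))
    return both / either if either else 0.0

def sort_for_crossover(units1, units2):
    """Sketch of the Sort operator's reordering: build the overlap matrix,
    then repeatedly pair the most similar remaining units so that
    equivalent units occupy equivalent string positions."""
    pairs = sorted(((overlap(u, v), i, j)
                    for i, u in enumerate(units1)
                    for j, v in enumerate(units2)), reverse=True)
    used1, used2, order = set(), set(), []
    for score, i, j in pairs:
        if i not in used1 and j not in used2:
            used1.add(i); used2.add(j)
            order.append((i, j))
    # Reorder both parents so matched units line up, best matches first.
    return ([units1[i] for i, _ in order], [units2[j] for _, j in order])
```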

Note that the relative complexity of the various operators is not an issue when it comes to specifying neural nets: the evaluation time of any practical net will far outweigh the generation time taken by any of these algorithms. The next sections present some more detailed comparisons of these operators. Since Radcliffe's operator and the author's modification of it again performed relatively badly, their results are omitted for clarity: they may be found in (Hancock, 1992a).

3 Testing the operators

Ideally, the various operators should be compared by the full process of specifying, training and testing nets, using a variety of data sets. Since the performance of both nets and GAs is inherently stochastic (because of the random starting weights for Backprop-trained nets), several runs are required for statistically significant results. At the time of writing, such runs are in progress, and are expected to be so for several more cpu-months. Some faster method of comparison is clearly desirable. The method used here is a simulated net-building task. Suppose we know the ideal connectivity for a given task. Then we can set the GA the problem of matching that design, with an evaluation given by some measure of the distance from the target net. As with a real net, the units may be defined in any order on the genetic string. The evaluation function again builds an overlap matrix, this time between test and target unit definitions. In the simplest case the final evaluation is simply a sum of the individual unit match scores, minus an arbitrary penalty of one per unit defined in excess of the number in the target net. No specific penalty is required for having too few units, since such nets will be penalised by being unable to match all the target units. This match may be evaluated in a fraction of a second, allowing detailed comparisons between the algorithms to be made. The method also allows the operators to be compared on different types of net. It might be, for instance, that one operator fares best if the receptive fields of the hidden units do not overlap, but another is better when they do, or when there are multiple copies of the same hidden unit type.

For the purposes of this paper, the simulated evaluation has another advantage over the real thing: it allows the effects of the permutation problem to be assessed. If the unit matching procedure is removed from the evaluation algorithm, the task is reduced to a simple bit matching of a target string. Results are shown below for Uniform, labelled NP.

There are a number of potentially significant differences between this method and the evaluation of real nets.

1. The connections of a real net typically differ in importance, whereas the overlap measure treats all equally.

2. There are likely to be local minima with real net designs.

3. It is likely that a real net design will show a significant degree of epistasis, i.e. an interdependence between the various units and connections. The value of one connection may only become realised in the presence of one or more others.

Note that, at the string level, the simulated problem already is epistatic, because of the permutation effect: the value of a given bit depends on which unit it ends up being part of.

4. Linked to this is the possibility of deception: one configuration for a particular unit may be good on its own, but a different configuration much more so only when combined with another particular unit type. For example, one large receptive field might be improved on by two smaller ones, but be better than either small one on its own.

5. The evaluation of real nets will be noisy.

All of these factors may be added in to the evaluation procedure, albeit not with the subtlety that is likely to be present in a real net. The initial evaluation algorithm used the same measure of overlap to compare target and test unit definitions as the Sort operator uses for its matching. It was felt that this might give it undue advantage, since it is specifically attempting to propagate the same similarities that would then be used to evaluate the result. Two other overlap measures were therefore tried. The first simply replaced the measure from equation 1 with Hamming distance. The same penalty of 1 per unit defined in excess of the target number was applied. The second matched each pair of target and test units purely on the basis of the fraction of the target connections present in the test unit. A perfect match for any target unit would therefore be obtained by switching on all the connections in the test unit. However, the test string was then penalised for every connection in excess of the total in the target string. If this penalty was too large, the GA tended to respond by reducing the number of units specified in the test string, since this is the quickest way to reduce the total number of connections. A similar effect might be produced with real nets if the cpu time penalty component of the evaluation function was too large. The penalty used in the tests reported here was 0.2 (1 - target_total/test_total). This replaces the penalty for excess units used in the other two procedures. The maximum score in this case is 1.0, while in the other two it is equal to the number of target units.

The three evaluation operators may be summarised as follows; in each case the summation is over the pairs of test and target unit definitions after they have been matched:

1. E1: F = sum_{i=1}^{n_target} overlap_i - (n_test - n_target)

2. E2: F = sum_{i=1}^{n_target} H_i - (n_test - n_target)

3. E3: F = sum_{i=1}^{n_target} |test_i AND target_i| / target_total - 0.2 (1 - target_total/test_total)

where F is the performance, H_i is the Hamming match score, and the over-size penalty only applies if the test size is bigger than the target size.

Noise was introduced by addition of a Gaussian random variable to the final score for a string. The results below used a standard deviation of 5% of the maximum evaluation score, which is similar in size to the evaluation noise of Backprop-trained nets found in (Hancock & Smith, 1991).
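The following sketch shows how evaluation E1 might look in code, reusing the overlap function from the earlier sketch. The greedy matching stands in for whatever assignment the original evaluation used, and the optional Gaussian noise follows the 5% figure quoted above; all names are hypothetical:

```python
import random

def evaluate_E1(test_units, target_units, noise_sd=0.0):
    """Sketch of E1: greedily match test units to target units by overlap,
    sum the matched overlap scores, and subtract one per test unit in
    excess of the target count (no penalty for having too few units)."""
    pairs = sorted(((overlap(u, v), i, j)
                    for i, u in enumerate(test_units)
                    for j, v in enumerate(target_units)), reverse=True)
    used_test, used_target, score = set(), set(), 0.0
    for s, i, j in pairs:
        if i not in used_test and j not in used_target:
            used_test.add(i); used_target.add(j)
            score += s
    score -= max(0, len(test_units) - len(target_units))
    # Noisy evaluation: add Gaussian noise, e.g. sd = 5% of the maximum score.
    return score + random.gauss(0.0, noise_sd)
```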

Epistasis and a form of deception were introduced by giving the GA two target strings. The first was scored as usual. For the second, the target units were grouped, with the score for the group being given by the product of the scores of the constituent units. This is epistatic because the score for one unit depends upon the match of others within the same group. It can be made deceptive by increasing the value of the group so that it exceeds the combined score of the equivalent individual units in the first string. The building blocks given by the easy task of matching the units of the first string do not combine to give the better score obtainable by the harder job of matching the second string. Although not necessarily formally deceptive in the sense defined by Goldberg (1987), such an evaluation function may be expected to cause problems for the GA. In the experiments reported below, the units were gathered in four groups of three. Since each unit scores 1 if correct, the maximum score for a group, being the product of three unit scores, is also 1. The sum of the group scores was multiplied by 9, to give a maximum evaluation of 36, compared with 12 for matching the first string. The rather high bonus for finding the second string was set so as to ensure that it was in fact found (by all but one of the operators, see below). There are two points of interest in comparing the recombination operators: are there differences in the frequency of finding the second string, and, having found it, are there differences in the rate of convergence on it? At lower bonus values, all the operators sometimes failed to find the second string. Despite averaging over 10 runs, no significant differences could be found between the operators in their ability to find the second string. The bonus was therefore increased to test convergence, since stochastic variations in the number of failures might cause differences in the average performance bigger than those produced by the differences in convergence.

3.1 Target nets

While a variety of target net designs have been used, results from just three are reported here. These are:

1. Net-1 A net with 10 hidden units and 30 inputs. Each hidden unit receives input from three adjacent inputs, such that RFs do not overlap. Target string length l = 300.

2. Net-2 A net with 10 hidden units and 18 inputs, arranged in a 6x3 matrix. The hidden unit RFs tile all the 2x2 squares of the input matrix and are therefore heavily overlapping. Target string length l = 180.

3. Net-3 A net with 12 hidden units and 10 inputs. The input connections of 10 hidden units were generated at random, with a connection probability of 0.3. Two of these units were then duplicated. This was the design used for the deceptive problem: a second target being produced in the same way, using a different random number seed. Target string length l = 120.

Note that some combinations of net and evaluation method ought to be quite straightforward: being soluble in a single pass by the simple algorithm of flipping each bit in turn and keeping the change if it gives an improvement, as sketched below. This requires just l evaluations. Such an algorithm will probably fail in the presence of noise or deception, but in the absence of these can solve problems using E2 or E3. E1, the overlap measure, may require more than one pass, because a target and test unit pair which do not overlap will have a score of zero, which is unaffected by changing any bits not set in the target.
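The single-pass bit-flipping algorithm just mentioned is simple enough to state directly. This version is a sketch, with the evaluation passed in as a callable and the string represented as a list of bits:

```python
def single_pass_hillclimb(string, evaluate):
    """Flip each bit once in turn, keeping any change that improves the
    evaluation: l trial evaluations for a string of length l (plus one
    evaluation of the starting point)."""
    best = evaluate(string)
    for i in range(len(string)):
        string[i] ^= 1                  # flip bit i
        trial = evaluate(string)
        if trial > best:
            best = trial                # keep the improvement
        else:
            string[i] ^= 1              # revert the flip
    return string, best
```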

4 Results

There are three different operators to compare, with three target nets and three evaluation procedures. To each evaluation, noise and/or deception may be added. In addition to regular GA parameters such as population and generation size, selection scaling factor and mutation rate, each of the recombination operators has parameters such as crossover probability p_u. It is clearly not possible to report detailed results for all the possible combinations. A number of extensive runs were made, stepping through the key parameter values to get a feel for their interactions. The results were used to guide the parameter settings used in the runs that are reported. The main points will be summarised.

All the experiments used rank-based selection, with a geometric scaling factor. In the absence of noise, the selection pressure could be very high, with a scaling factor as low as 0.85, which gives over half the reproduction opportunities to the top five strings. This underlines the essential simplicity of the task. For all the experiments reported below, the scaling factor was set at 0.96, a value more like those used in earlier work with real nets, reported in (Hancock & Smith, 1991). The mutation probability p_m has a marked effect on convergence rates, the optimal value being of the order of the inverse of the string length. Uniform and Sort are not sensitive to variations in p_b, the probability of crossover between unit definition boundaries. It was set at 0.1. For Uniform, p_u, the probability of crossover at unit boundaries, should be 0.5, corresponding to picking each unit from either parent at random. For Sort, p_u was best set to 1.0, which implies picking from the two parents alternately. This odd finding provoked much checking of code, and was tested for many combinations of target net and evaluation procedure, with consistent results. It appears that maximal mixing of the two parents confers some real advantage, but quite how is unclear.

Figures 2 to 4 show the results for the main sequence of tests of the three target net designs, with the three evaluation procedures. All used a population size of 100, with a generation size of 10 (a generation gap size of 0.1, in the terminology of DeJong (1975)) for the runs without noise, and 100 for those with. The results shown are for the best individual in the population, averaged over 20 runs from different starting populations. When noise was added to the evaluation, the results are shown for the true evaluation of the string, without the noise. Note that, in order to maximise distinguishability of the lines, the axes of the graphs are not uniform. Figure 2 is for Net-1 (with non-overlapping RFs). Mutation rate p_m was set at 0.01. Figure 3 is for Net-2 (with overlapping RFs), with p_m 0.02. Figure 4 is for Net-3 (random connectivity), with deception. The third method of evaluation is not shown because it is not compatible with the deceptive problem. The evaluation rates the whole string at once, while the deception works at the level of individual units. Mutation rate for this problem was set at 0.01.

The results agree with those in Hancock (1992c), with Sort out-performing Uniform.
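Rank-based selection with a geometric scaling factor, as used in all these experiments, can be sketched as follows (my reconstruction; the exact normalisation in the original may differ). With s = 0.85 the top five ranks receive just over half the selection probability, matching the remark above; the runs reported use the gentler s = 0.96:

```python
import random

def rank_select(population, fitnesses, s=0.96):
    """Pick one string: the individual of rank r (0 = fittest) is selected
    with probability proportional to s**r."""
    ranked = sorted(range(len(population)), key=lambda i: -fitnesses[i])
    weights = [s ** r for r in range(len(ranked))]
    chosen = random.choices(ranked, weights=weights, k=1)[0]
    return population[chosen]
```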

Figure 2: Performance of the various recombination algorithms on the 10 hidden unit, non-overlapping receptive field problem Net-1. (Six panels: E1, E1+noise, E2, E2+noise, E3, E3+noise; lines: Sort, Uniform, NP.)

Figure 3: Net-2: Overlapping RF problem. (Six panels: E1, E1+noise, E2, E2+noise, E3, E3+noise; lines: Sort, Uniform, NP.)

Figure 4: Net-3: Deceptive problem. The failure of NP is discussed in section 5. (Panels: E1, E1+noise, E2, E2+noise; lines: Sort, Uniform, NP.)

Sometimes the difference is quite marked, for instance for Net-1 in the presence of noise, figure 2. Note that the gently asymptotic curves can hide quite significant differences in the number of evaluations required to reach a given performance: often a factor of two, which might be several days of cpu time with real net evaluations!

What may be surprising is the effect of removing the permutation "problem": the results are quite consistently worse! The explanation for this is pointed to by the consistently poorer starting evaluation for NP in all the results. When a string is compared to the target in NP, it has to match each unit with whatever is in that position on the string. When the matching algorithm is added to the evaluation, there are n! ways for each string to be arranged to match the target. With 10 hidden units, there are about 3.6 x 10^6 extra maxima, while the search space remains constant, defined by the string length. On average, therefore, the initial population with a permutation scores significantly better than NP. The permutation problem is that the strings corresponding to these maxima are all in different orders, so a simple GA should have difficulty combining them.

These results suggest that the benefits outweigh the drawbacks, and in most cases Uniform is able to keep its advantage over NP. That there is indeed a permutation problem is confirmed by the consistently superior performance of Sort. This reaps the advantages of the permutations, but then seeks to reorder the unit definitions so that they may be recombined advantageously. It does not appear to be the case that the original evaluation procedure, E1, was unduly favourable to Sort: the results from the three evaluation procedures are broadly similar.

5 Solving the permutation problem

The most unexpected result here was that permutations are apparently more of a help than a hindrance. That permutations should give an initially better evaluation seems clear enough; the surprise is that Uniform is able to maintain its advantage over NP. It appears that Uniform is able to resolve the permutation problem in practice. An obvious question is how: does it solve the permutation and then get on with the target problem, or do both concurrently? It is possible to observe it in action, by counting how many times a given target string unit appears in each of the possible positions in the population of test strings. If it is in the same position in (nearly) every string, the GA has (effectively) solved that bit of the permutation problem. The test unit definition does not have to be fully correct, or the same in every string, merely closer in each case to the same target unit than to any other. Displaying the data for all the unit definitions gives an indecipherable graph, so figure 5 shows the first and last unit to be solved, averaged over five runs, using Net-1 and E1. This and many other runs (not shown) indicate that solution of the permutations is fairly gradual, and certainly does not precede improvement on the target problem. Resolving the permutations appears to be a semi-sequential process, with each position becoming fixed independently, apart from the last two, which must go together. As positions become fixed, the size of the permutation problem is rapidly reduced. However, the results do not suggest that the process of solving it accelerates. Figure 5 indicates that the final pair of positions becomes fixed more gradually than the first one. This is probably because there is more effective competition between the remaining permuted strings. What evidently happens is that the whole population gradually gets fixed in the order of the best few individuals.

The average initial score is about 3 out of 10. If it is supposed that this results from having got three units correct, and seven with no score (obviously not the case), then it is possible to calculate the probability of combining two random individuals and producing an offspring with four or more units correct. It is in the order of 0.2, quite high enough for the GA to make progress. Monte Carlo simulations would be required to estimate the probabilities for the real problem. However, it is evident that the permutation problem is not as severe as had been thought. It is not necessary to solve it all at once: provided there is a reasonable chance of bringing good alleles together, the GA will do the rest.
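The measurement described above, counting for each target unit how often its closest test unit occupies each string position across the population, might be coded as follows (a sketch reusing the overlap function from section 2; it assumes as many test positions as target units, and all names are hypothetical):

```python
def position_counts(population, target_units, unit_len):
    """counts[t][p]: how many strings have target unit t best matched by
    the unit in position p. A column close to the population size means
    that part of the permutation is effectively solved."""
    n = len(target_units)
    counts = [[0] * n for _ in range(n)]
    for genome in population:
        units = [genome[p * unit_len:(p + 1) * unit_len] for p in range(n)]
        for t, target in enumerate(target_units):
            best = max(range(n), key=lambda p: overlap(units[p], target))
            counts[t][best] += 1
    return counts
```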

Figure 5: Uniform operator solving the permutation problem. In addition to performance of the best string, the number of strings that have the same unit definition in the same location on the string are shown, for the first and last units to be resolved.

A sceptic might suppose that crossover is not contributing to the solution, and that the permutation problem is overcome by a mixture of selection and mutation alone, i.e. the system is acting as a "genetic hill-climber". This is easily checked: figure 6 shows Uniform, using Net-1 and E1, with and without crossover enabled, at three mutation rates. Mutation alone evidently can solve the problem, but even with this very simple problem the addition of crossover causes a marked improvement.

Resolving the permutations is aided by high selection pressure: by increasing the dominance of the top-ranked string, it is better able to enforce its order on the population. It was therefore thought that deceptive problems would pose a challenge, since too high a rate of convergence would appear to reduce the chances of finding the deceptive solution. As figure 4 shows, NP was actually worse than Uniform, failing to solve the deceptive problem in three out of four cases. The reason is the same as before: when many permutations can be tried there is a better chance of finding a combination of units that scores well on the harder task. If the value of the second string is increased sufficiently, NP will also find it every time. NP is successful when using Hamming distance, E2, in the presence of evaluation noise. The noise appears to inhibit convergence long enough to allow the better strings to emerge.
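The crossover on/off comparison of figure 6 amounts to toggling recombination in an otherwise fixed loop. A skeletal loop under that reading, reusing rank_select from the earlier sketch, might look like this (again a sketch, not the paper's code; the steady-state replacement scheme is my assumption):

```python
def run_ga(init_pop, evaluate, crossover, mutate, generations,
           use_crossover=True, s=0.96):
    """With use_crossover=False the loop degenerates to selection plus
    mutation alone: the 'genetic hill-climber' the sceptic proposes."""
    pop = [list(g) for g in init_pop]
    fits = [evaluate(g) for g in pop]
    for _ in range(generations):
        p1, p2 = rank_select(pop, fits, s), rank_select(pop, fits, s)
        child = crossover(p1, p2) if use_crossover else list(p1)
        child = mutate(child)
        worst = min(range(len(pop)), key=lambda i: fits[i])
        pop[worst], fits[worst] = child, evaluate(child)  # replace worst
    best = max(range(len(pop)), key=lambda i: fits[i])
    return pop[best], fits[best]
```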

Figure 6: Uniform on Net-1 with E1, with and without crossover enabled. (Lines: crossover probability 0 and 0.5, each at three mutation rates.)

The situation can be changed by increasing the population size and reducing selection pressure. Figure 7 shows the performance of Uniform, with and without a permutation problem, on a simple net matching task with 12 hidden units. With low selection pressure, a scaling factor of 0.995, NP is able to overtake Uniform, because the latter is unable to resolve the permutations fast enough. If the selection pressure is increased, Uniform does better.

6 Conclusions

The initial aim of this work was to compare Radcliffe's proposed method for overcoming the permutation problem with a proposed extension. A simple crossover operator was intended as a baseline. This paper has explored the finding that the simple crossover often worked better than the more sophisticated recombination algorithms. It appears that, in practice, the permutation or competing conventions problem is not as severe as had been supposed. With the population size and selection pressures used here, a GA is quite capable of resolving the permutations, even with deceptive problems. The increased number of ways of solving the problem outweighs the difficulties of bringing the building blocks together. That the GA is not working purely by mutation and selection was demonstrated by showing that performance improves when crossover is enabled. Resolution of the permutation problem is assisted by sorting the strings appropriately before crossover. On the evidence presented here, Sort offers a useful improvement over simple crossover.

The obvious question is the extent to which these findings apply to real nets, as opposed to the simulated problem used here.

Figure 7: Results for Uniform, with and without permutation problem, on a net with 12 hidden units, using E1 and two levels of selection pressure (scaling factors 0.99 and 0.995). Population size was 1000.

There are two aspects to this: is the proposed measure of overlap useful for identifying similar hidden units, and will it be as easy to resolve the permutations with a real, noisy net to evaluate? The Sort operator is currently being tested on real nets. The extent of the permutation problem may be assessed by comparing the performance of GAs with and without crossover enabled. Menczer and Parisi (1990) report that adding crossover at a probability of 0.25 improves a GA used for optimising the weights of a net: further testing seems appropriate.

Acknowledgements

I thank Nick Radcliffe and Leslie Smith for helpful discussions, and the referees for useful comments. This work was partly funded by grant no GR/F97393 from the UK Science and Engineering Research Council Image Interpretation Initiative.

References

Belew, R.K., McInerney, J., & Schraudolph, N.N. 1990. Evolving networks: using the genetic algorithm with connectionist learning. Tech. rept., CSE, UCSD.

Bornholdt, S., & Graudenz, D. 1992. General asymmetric neural networks and structure design by genetic algorithms. Neural Networks, 5, 327-334.

de Garis, H. 1990. Genetic Programming: Modular neural evolution for Darwin machines. In: Proceedings of IJCNN Washington, Jan. 1990.

DeJong, K.A. 1975. An analysis of the behavior of a class of genetic adaptive systems. Ph.D. thesis, University of Michigan. Dissertation Abstracts International 36(10), 5140B.

Goldberg, D.E. 1987. Simple genetic algorithms and the minimal deceptive problem. Pages 74-88 of: Davis, L. (ed), Genetic Algorithms and Simulated Annealing. Pitman, London.

Hancock, P.J.B. 1992a. Coding strategies for genetic algorithms and neural nets. Ph.D. thesis, Department of Computing Science and Mathematics, University of Stirling.

Hancock, P.J.B. 1992b. Pruning neural nets by genetic algorithm. Pages 991-994 of: Aleksander, I., & Taylor, J.G. (eds), Proceedings of the International Conference on Artificial Neural Networks, Brighton. Elsevier.

Hancock, P.J.B. 1992c. Recombination operators for the design of neural nets by genetic algorithm. Pages 441-450 of: Männer, R., & Manderick, B. (eds), Parallel Problem Solving from Nature 2. Elsevier, North Holland.

Hancock, P.J.B., & Smith, L.S. 1991. GANNET: Genetic design of a neural net for face recognition. Pages 292-296 of: Schwefel, H-P., & Männer, R. (eds), Parallel Problem Solving from Nature. Lecture Notes in Computer Science 496, Springer Verlag.

Menczer, F., & Parisi, D. 1990. 'Sexual' reproduction in neural networks. Tech. rept. PCIA, C.N.R. Rome.

Miller, G.F., Todd, P.M., & Hegde, S.U. 1989. Designing neural networks using Genetic Algorithms. Pages 379-384 of: Schaffer, J.D. (ed), Proceedings of the Third International Conference on Genetic Algorithms. Morgan Kaufmann.

Montana, D.J., & Davis, L. 1989. Training feedforward neural networks using Genetic Algorithms. Pages 762-767 of: Proceedings of the Eleventh IJCAI.

Nolfi, S., Parisi, D., Vallar, G., & Burani, C. 1990. Recall of sequences of items by a neural network. In: Touretzky, D., Hinton, G., & Sejnowski, T. (eds), Proceedings of the 1990 Connectionist Models Summer School. Morgan Kaufmann.

Radcliffe, N. 1993. Genetic set recombination and its application to neural network topology optimisation. Neural Computing and Applications, 1, 67-90.

Whitley, D., Dominic, S., & Das, R. 1991. Genetic reinforcement learning with multilayer neural networks. Pages 562-569 of: Belew, R.K., & Booker, L.B. (eds), Proceedings of the Fourth International Conference on Genetic Algorithms. Morgan Kaufmann.


More information

Does the Wake-sleep Algorithm Produce Good Density Estimators?

Does the Wake-sleep Algorithm Produce Good Density Estimators? Does the Wake-sleep Algorithm Produce Good Density Estimators? Brendan J. Frey, Geoffrey E. Hinton Peter Dayan Department of Computer Science Department of Brain and Cognitive Sciences University of Toronto

More information

Genetically Generated Neural Networks II: Searching for an Optimal Representation

Genetically Generated Neural Networks II: Searching for an Optimal Representation Boston University OpenBU Cognitive & Neural Systems http://open.bu.edu CAS/CNS Technical Reports 1992-02 Genetically Generated Neural Networks II: Searching for an Optimal Representation Marti, Leonardo

More information

Approximate Optimal-Value Functions. Satinder P. Singh Richard C. Yee. University of Massachusetts.

Approximate Optimal-Value Functions. Satinder P. Singh Richard C. Yee. University of Massachusetts. An Upper Bound on the oss from Approximate Optimal-Value Functions Satinder P. Singh Richard C. Yee Department of Computer Science University of Massachusetts Amherst, MA 01003 singh@cs.umass.edu, yee@cs.umass.edu

More information

Atomic Masses and Molecular Formulas *

Atomic Masses and Molecular Formulas * OpenStax-CNX module: m44278 1 Atomic Masses and Molecular Formulas * John S. Hutchinson This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 1 Introduction

More information

cells [20]. CAs exhibit three notable features, namely massive parallelism, locality of cellular interactions, and simplicity of basic components (cel

cells [20]. CAs exhibit three notable features, namely massive parallelism, locality of cellular interactions, and simplicity of basic components (cel I. Rechenberg, and H.-P. Schwefel (eds.), pages 950-959, 1996. Copyright Springer-Verlag 1996. Co-evolving Parallel Random Number Generators Moshe Sipper 1 and Marco Tomassini 2 1 Logic Systems Laboratory,

More information

Boxlets: a Fast Convolution Algorithm for. Signal Processing and Neural Networks. Patrice Y. Simard, Leon Bottou, Patrick Haner and Yann LeCun

Boxlets: a Fast Convolution Algorithm for. Signal Processing and Neural Networks. Patrice Y. Simard, Leon Bottou, Patrick Haner and Yann LeCun Boxlets: a Fast Convolution Algorithm for Signal Processing and Neural Networks Patrice Y. Simard, Leon Bottou, Patrick Haner and Yann LeCun AT&T Labs-Research 100 Schultz Drive, Red Bank, NJ 07701-7033

More information

CPSC 340: Machine Learning and Data Mining. Regularization Fall 2017

CPSC 340: Machine Learning and Data Mining. Regularization Fall 2017 CPSC 340: Machine Learning and Data Mining Regularization Fall 2017 Assignment 2 Admin 2 late days to hand in tonight, answers posted tomorrow morning. Extra office hours Thursday at 4pm (ICICS 246). Midterm

More information

Haploid & diploid recombination and their evolutionary impact

Haploid & diploid recombination and their evolutionary impact Haploid & diploid recombination and their evolutionary impact W. Garrett Mitchener College of Charleston Mathematics Department MitchenerG@cofc.edu http://mitchenerg.people.cofc.edu Introduction The basis

More information

This material is based upon work supported under a National Science Foundation graduate fellowship.

This material is based upon work supported under a National Science Foundation graduate fellowship. TRAINING A 3-NODE NEURAL NETWORK IS NP-COMPLETE Avrim L. Blum MIT Laboratory for Computer Science Cambridge, Mass. 02139 USA Ronald L. Rivest y MIT Laboratory for Computer Science Cambridge, Mass. 02139

More information

Novel determination of dierential-equation solutions: universal approximation method

Novel determination of dierential-equation solutions: universal approximation method Journal of Computational and Applied Mathematics 146 (2002) 443 457 www.elsevier.com/locate/cam Novel determination of dierential-equation solutions: universal approximation method Thananchai Leephakpreeda

More information

Eects of domain characteristics on instance-based learning algorithms

Eects of domain characteristics on instance-based learning algorithms Theoretical Computer Science 298 (2003) 207 233 www.elsevier.com/locate/tcs Eects of domain characteristics on instance-based learning algorithms Seishi Okamoto, Nobuhiro Yugami Fujitsu Laboratories, 1-9-3

More information

Week Cuts, Branch & Bound, and Lagrangean Relaxation

Week Cuts, Branch & Bound, and Lagrangean Relaxation Week 11 1 Integer Linear Programming This week we will discuss solution methods for solving integer linear programming problems. I will skip the part on complexity theory, Section 11.8, although this is

More information

squashing functions allow to deal with decision-like tasks. Attracted by Backprop's interpolation capabilities, mainly because of its possibility of g

squashing functions allow to deal with decision-like tasks. Attracted by Backprop's interpolation capabilities, mainly because of its possibility of g SUCCESSES AND FAILURES OF BACKPROPAGATION: A THEORETICAL INVESTIGATION P. Frasconi, M. Gori, and A. Tesi Dipartimento di Sistemi e Informatica, Universita di Firenze Via di Santa Marta 3-50139 Firenze

More information

The error-backpropagation algorithm is one of the most important and widely used (and some would say wildly used) learning techniques for neural

The error-backpropagation algorithm is one of the most important and widely used (and some would say wildly used) learning techniques for neural 1 2 The error-backpropagation algorithm is one of the most important and widely used (and some would say wildly used) learning techniques for neural networks. First we will look at the algorithm itself

More information

Reinforcement Learning: Part 3 Evolution

Reinforcement Learning: Part 3 Evolution 1 Reinforcement Learning: Part 3 Evolution Chris Watkins Department of Computer Science Royal Holloway, University of London July 27, 2015 2 Cross-entropy method for TSP Simple genetic style methods can

More information

Rule table φ: Lattice: r = 1. t = 0. t = 1. neighborhood η: output bit: Neighborhood η

Rule table φ: Lattice: r = 1. t = 0. t = 1. neighborhood η: output bit: Neighborhood η Evolving Cellular Automata with Genetic Algorithms: A Review of Recent Work Melanie Mitchell Santa Fe Institute 1399 Hyde Park Road Santa Fe, NM 8751 mm@santafe.edu James P. Crutcheld 1 Santa Fe Institute

More information

Remaining energy on log scale Number of linear PCA components

Remaining energy on log scale Number of linear PCA components NONLINEAR INDEPENDENT COMPONENT ANALYSIS USING ENSEMBLE LEARNING: EXPERIMENTS AND DISCUSSION Harri Lappalainen, Xavier Giannakopoulos, Antti Honkela, and Juha Karhunen Helsinki University of Technology,

More information

Sliding Mode Control: A Comparison of Sliding Surface Approach Dynamics

Sliding Mode Control: A Comparison of Sliding Surface Approach Dynamics Ben Gallup ME237 Semester Project Sliding Mode Control: A Comparison of Sliding Surface Approach Dynamics Contents Project overview 2 The Model 3 Design of the Sliding Mode Controller 3 4 Control Law Forms

More information

Chapter 8: Introduction to Evolutionary Computation

Chapter 8: Introduction to Evolutionary Computation Computational Intelligence: Second Edition Contents Some Theories about Evolution Evolution is an optimization process: the aim is to improve the ability of an organism to survive in dynamically changing

More information

w1 w2 w3 Figure 1: Terminal Cell Example systems. Through a development process that transforms a single undivided cell (the gamete) into a body consi

w1 w2 w3 Figure 1: Terminal Cell Example systems. Through a development process that transforms a single undivided cell (the gamete) into a body consi Evolving Robot Morphology and Control Craig Mautner Richard K. Belew Computer Science and Engineering Computer Science and Engineering University of California, San Diego University of California, San

More information

Cryptanalysis of Akelarre Niels Ferguson Bruce Schneier DigiCash bv Counterpane Systems Kruislaan E Minnehaha Parkway 1098 VA Amsterdam, Nethe

Cryptanalysis of Akelarre Niels Ferguson Bruce Schneier DigiCash bv Counterpane Systems Kruislaan E Minnehaha Parkway 1098 VA Amsterdam, Nethe Cryptanalysis of Akelarre Niels Ferguson Bruce Schneier DigiCash bv Counterpane Systems Kruislaan 9 0 E Minnehaha Parkway 098 VA Amsterdam, Netherlands Minneapolis, MN 559, USA niels@digicash.com schneier@counterpane.com

More information

Evolutionary Computation

Evolutionary Computation Evolutionary Computation - Computational procedures patterned after biological evolution. - Search procedure that probabilistically applies search operators to set of points in the search space. - Lamarck

More information

the subset partial order Paul Pritchard Technical Report CIT School of Computing and Information Technology

the subset partial order Paul Pritchard Technical Report CIT School of Computing and Information Technology A simple sub-quadratic algorithm for computing the subset partial order Paul Pritchard P.Pritchard@cit.gu.edu.au Technical Report CIT-95-04 School of Computing and Information Technology Grith University

More information

Forecasting & Futurism

Forecasting & Futurism Article from: Forecasting & Futurism December 2013 Issue 8 A NEAT Approach to Neural Network Structure By Jeff Heaton Jeff Heaton Neural networks are a mainstay of artificial intelligence. These machine-learning

More information

Second-order Learning Algorithm with Squared Penalty Term

Second-order Learning Algorithm with Squared Penalty Term Second-order Learning Algorithm with Squared Penalty Term Kazumi Saito Ryohei Nakano NTT Communication Science Laboratories 2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 69-2 Japan {saito,nakano}@cslab.kecl.ntt.jp

More information

Ways to make neural networks generalize better

Ways to make neural networks generalize better Ways to make neural networks generalize better Seminar in Deep Learning University of Tartu 04 / 10 / 2014 Pihel Saatmann Topics Overview of ways to improve generalization Limiting the size of the weights

More information

Model Theory Based Fusion Framework with Application to. Multisensor Target Recognition. Zbigniew Korona and Mieczyslaw M. Kokar

Model Theory Based Fusion Framework with Application to. Multisensor Target Recognition. Zbigniew Korona and Mieczyslaw M. Kokar Model Theory Based Framework with Application to Multisensor Target Recognition Abstract In this work, we present a model theory based fusion methodology for multisensor waveletfeatures based recognition

More information

Computational statistics

Computational statistics Computational statistics Combinatorial optimization Thierry Denœux February 2017 Thierry Denœux Computational statistics February 2017 1 / 37 Combinatorial optimization Assume we seek the maximum of f

More information

Contents 1 Introduction 4 2 Go and genetic programming 4 3 Description of the go board evaluation function 4 4 Fitness Criteria for tness : : :

Contents 1 Introduction 4 2 Go and genetic programming 4 3 Description of the go board evaluation function 4 4 Fitness Criteria for tness : : : Go and Genetic Programming Playing Go with Filter Functions S.F. da Silva November 21, 1996 1 Contents 1 Introduction 4 2 Go and genetic programming 4 3 Description of the go board evaluation function

More information

Optimizing Stochastic and Multiple Fitness Functions

Optimizing Stochastic and Multiple Fitness Functions Optimizing Stochastic and Multiple Fitness Functions Joseph L. Breeden SFI WORKING PAPER: 1995-02-027 SFI Working Papers contain accounts of scientific work of the author(s) and do not necessarily represent

More information

Representation and Hidden Bias II: Eliminating Defining Length Bias in Genetic Search via Shuffle Crossover

Representation and Hidden Bias II: Eliminating Defining Length Bias in Genetic Search via Shuffle Crossover Representation and Hidden Bias II: Eliminating Defining Length Bias in Genetic Search via Shuffle Crossover Abstract The traditional crossover operator used in genetic search exhibits a position-dependent

More information

Lecture 22. Introduction to Genetic Algorithms

Lecture 22. Introduction to Genetic Algorithms Lecture 22 Introduction to Genetic Algorithms Thursday 14 November 2002 William H. Hsu, KSU http://www.kddresearch.org http://www.cis.ksu.edu/~bhsu Readings: Sections 9.1-9.4, Mitchell Chapter 1, Sections

More information