Interaction-Detection Metric with Differential Mutual Complement for Dependency Structure Matrix Genetic Algorithm


Kai-Chun Fan, Jui-Ting Lee, Tian-Li Yu, Tsung-Yu Ho

TEIL Technical Report, April 2010
Taiwan Evolutionary Intelligence Laboratory (TEIL)
Department of Electrical Engineering, National Taiwan University
No. 1, Sec. 4, Roosevelt Rd., Taipei, Taiwan

Abstract

The dependency structure matrix genetic algorithm (DSMGA), one of the estimation of distribution algorithms (EDAs), builds its models via dependency structure matrix clustering techniques. Previous research has shown that DSMGA can effectively solve nearly decomposable problems. DSMGA utilizes an entropy-based metric to detect the interactions among genes. The efficiency of DSMGA and other model-building GAs greatly depends on their interaction-detection metrics. This paper investigates several commonly used metrics and proposes a new interaction-detection metric which aims at what GAs really need. The proposed metric, namely the differential mutual complement, is based on both the disruption and reproduction effects of the crossover operator on significant schemata. Empirical results show that DSMGA with the proposed metric outperforms the existing metrics in terms of population size and sensitivity to the threshold. The new metric is shown to perform well with DSMGA and can be expected to work with other EDAs.

1 Introduction

Genetic algorithms (GAs) (Holland, 1975; Goldberg, 1989) are stochastic search and optimization methods which require only the quality information of solution candidates. The simple GA (SGA) is a specific class of GAs and has been widely used for optimization in many different fields. However, SGA does not learn problem structures and can be deceived; the trap function (Goldberg, 1987) is an example where SGA fails. It has been shown that linkage learning (Goldberg, Korb, & Deb, 1989; Harik, 1996) is essential for GA success on problems with strong linkages. Linkage-learning algorithms can identify building blocks (BBs) among genes, where BBs are, roughly speaking, groups of interacting genes. Further discussion of BBs can be found in (Holland, 1975; Goldberg, 1989; Goldberg, 2002).

Since the importance of BBs was recognized, many different linkage-learning techniques have been developed, including uni-metric and multi-metric methods. Uni-metric methods utilize fitness alone as the metric to detect interactions and solve problems. The most typical example is the linkage learning genetic algorithm (LLGA) (Harik, 1997). However, to date, LLGA can only solve problems with fewer than 80 BBs. The limited success of LLGA raises the question of whether there is an intrinsic limitation to uni-metric linkage learning; the answer is yet unknown. Due to the difficulty of developing uni-metric methods, many GAs utilize another metric for linkage learning. One fast-growing development in this category is the model-building GAs, or estimation of distribution algorithms (EDAs). Because of their powerful problem-solving abilities, many EDAs have been developed, such as ecGA (Harik, 1999), BMDA (Pelikan & Mühlenbein, 1999), EBNA (Etxeberria & Larrañaga, 1999), BOA (Pelikan, Goldberg, & Cantú-Paz, 1999), hBOA (Pelikan & Goldberg, 2001), and D5 (Tsuji, Munetomo, & Akama, 2004). EDAs utilize interaction-detection metrics, metrics that detect the interactions among genes, to construct a model and recognize BBs within problems.

These common interaction-detection metrics are nonlinearity (Munetomo & Goldberg, 1999), simultaneity (Aporntewan & Chongstitvatana, 2003), and entropy-based metrics (Shannon, 1948). This paper adopts the dependency structure matrix genetic algorithm (DSMGA) (Yu, Goldberg, Sastry, Lima, & Pelikan, 2009), one of the EDAs, to discuss interaction-detection metrics for two reasons. First, most EDAs adopt a designated metric that cannot be replaced with another; in contrast, DSMGA easily accommodates different interaction-detection metrics, which makes it a convenient platform for comparing metrics. Second, DSMGA adds the concept of a threshold onto these existing metrics. The threshold in DSMGA is used to determine whether genes interact with each other: a pair of genes is considered interacting if and only if the calculated metric is greater than the threshold. Given an appropriate threshold, DSMGA can decompose problems into sub-problems efficiently. Even though the threshold greatly influences the performance of DSMGA, its value is problem-specific and difficult to calculate. Hence, to improve the efficiency of DSMGA, the purpose of this paper is to construct a new interaction-detection metric that is less sensitive to the threshold and is expected to be closer to what GAs really need.

The proposed metric, namely the differential mutual complement (DMC), detects interactions among genes and builds a linkage-learning model for DSMGA. The idea is to keep good schemata from being disrupted by the crossover operator and to ensure the growth of good schemata. This paper compares the proposed metric with the other common interaction-detection metrics: nonlinearity, simultaneity, and entropy-based. However, nonlinearity is not included in the experiments due to its poor performance in many cases; the reason is detailed in Section 2.2.1. Hence, the paper mainly compares DMC with the simultaneity and entropy-based metrics. Among these metrics, entropy is the most commonly used due to the satisfactory performance it provides. However, entropy originally represents an absolute limit on the best possible lossless compression of any communication, and has less meaning in GAs. Therefore, it is questionable whether the entropy-based metric is what GAs really need. DMC, the new metric proposed in this paper, is more meaningful for GAs than entropy, because it is based on both the disruption and reproduction effects of the crossover operator on significant schemata. This paper compares the DMC and mutual information (Cover & Thomas, 1991) metrics, where mutual information is the loss of entropy in the case of two genes, in terms of population size and sensitivity to the threshold.

The rest of this paper is structured as follows. Section 2 introduces the construction of DSMGA and shows its clustering mechanism. Section 3 presents the idea of the new interaction-detection metric. Section 4 uses empirical results to show the advantage of the proposed metric. Section 5 concludes this paper.

2 Background

This section discusses related work on DSMGA: an introduction to DSMGA, common interaction-detection metrics, and the importance of the threshold. Section 2.1 discusses DSM clustering briefly; a more detailed discussion can be found in (Yu, Goldberg, Sastry, Lima, & Pelikan, 2009). Section 2.2 lists three common metrics: nonlinearity, simultaneity, and entropy-based.
Finally, choosing an appropriate threshold plays a role in DSMGA, and its difficulty depends on the adopted metric. This issue is discussed in Section 2.3.

2.1 Introduction to DSMGA

DSMGA uses dependency structure matrix (DSM) techniques (McCord & Eppinger, 1993; Pimmler & Eppinger, 1994) to decompose problems. A DSM is a matrix that contains the pairwise interaction information of every pair of components in a system. The objective of DSM clustering is to transform this pairwise interaction information into higher-order interaction information. Figure 1 is an example of a DSM. The sign X in the first row and third column marks the interaction between node A and node C. The diagonal entries have no meaning, since a node's relation with itself carries no information. DSM clustering reorders the nodes of Figure 1, and the reordering result is shown in Figure 2. It is clear that DSM clustering decomposes the problem into three parts: {B, D, G}, {A, C, E, H}, and {F}; this result is one possible clustering into decomposed groups.

To implement DSM clustering, DSMGA uses the minimal description length (MDL) principle (Barron, Rissanen, & Yu, 1998; Lutz, 2002; Rissanen, 1978; Rissanen, 1998) to decide a suitable clustering. The MDL-based clustering objective function is written as

$f_{DSM}(M) = \left( n_M \log n_c + \log n_c \sum_{i=1}^{n_M} |M_i| \right) + (|S_1| + |S_2|)(2 \log n_c + 1)$. (1)

In the above equation, $f_{DSM}(M)$ is the description length of the specific model $M$ given by DSM clustering; $n_M$ is the total number of modules; $n_c$ is the total number of components; $|M_i|$ is the number of components within the $i$-th module; and $S_1$ and $S_2$ are the two mismatch sets, i.e., the interactions described incorrectly by the clustering. Briefly speaking, the MDL-based clustering objective function is the sum of the model description length and the mismatched-data description length.
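Equation (1) is straightforward to evaluate for a candidate clustering. Below is a minimal sketch, assuming modules are given as lists of component indices and that the mismatch-set sizes |S_1| and |S_2| have already been counted against the DSM; the function name and input format are illustrative, not DSMGA's actual implementation.

```python
import math

def mdl_description_length(clusters, n_c, s1_size, s2_size):
    """Description length f_DSM(M) of a DSM clustering, following Eq. (1).

    clusters: list of modules, each a list of component indices.
    n_c:      total number of components in the DSM.
    s1_size:  |S_1|, interactions in the DSM not described by the clustering.
    s2_size:  |S_2|, interactions implied by the clustering but absent from the DSM.
    """
    n_m = len(clusters)
    log_nc = math.log2(n_c)            # each component identifier costs log2(n_c) bits
    model_bits = n_m * log_nc + log_nc * sum(len(m) for m in clusters)
    mismatch_bits = (s1_size + s2_size) * (2 * log_nc + 1)
    return model_bits + mismatch_bits
```

A clustering with a lower description length is preferred, so a search over candidate clusterings can use this value directly as its objective.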

Figure 1: An example of a DSM. The X signs show the dependence between two components, such as A and C; a blank entry means the two components are independent, such as A and D; the black diagonal entries are meaningless, relating each component to itself. The example shows the matrix before clustering.

Figure 2: The example DSM after clustering by reordering. The eight components A to H are clustered into three groups. Component F is fully independent of the other components.

2.2 Interaction-Detection Metrics

Interaction-detection metrics are essential for constructing DSMs. Many different metrics have been used for interaction detection in GAs. This section describes three commonly used metrics: nonlinearity, simultaneity, and entropy.

2.2.1 Nonlinearity

Nonlinearity is adopted by LINC and LIMD (Munetomo & Goldberg, 1999). When focusing on only two genes $x_i$ and $x_j$, as in the case of DSMGA, the detection metric can be written as

$|f_{x_i=0,x_j=0} + f_{x_i=1,x_j=1} - f_{x_i=1,x_j=0} - f_{x_i=0,x_j=1}|$. (2)

The basic idea is that if $x_i$ and $x_j$ do not interact with each other, the fitness difference from $x_i = 0$ to $x_i = 1$ should not be affected by the value of $x_j$. However, this paper does not experiment with the nonlinearity metric due to its poor performance: nonlinearity differs the most from the other metrics because it emphasizes disruptions more than needed when the proportion of the correct schema is much greater than those of the other three schemata. To show how this can happen, define the fitness as a power-law of the OneMax problem, $f(x) = (\sum_i x_i)^t$, where $x_i$ is binary and $t$ is an order parameter. The problem can be solved by a bit-wise hill climber; however, the nonlinearity metric detects strong interactions among all genes when $t$ is large. Nonlinearity is far from what GAs really need and also depends on the selection scheme. Hence this paper does not investigate this metric further.
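To make the power-law OneMax argument concrete, the sketch below computes the nonlinearity of Equation (2) using schema-average fitness, which is one reasonable instantiation of the f values (the original LINC/LIMD formulation perturbs individual strings; the function names here are illustrative):

```python
from itertools import product

def nonlinearity(f00, f01, f10, f11):
    # Eq. (2): zero iff the fitness change from x_i = 0 to x_i = 1
    # does not depend on the value of x_j.
    return abs(f00 + f11 - f01 - f10)

def powerlaw_onemax_nonlinearity(l, t):
    # Schema-average fitness of f(x) = (sum_i x_i)^t with two loci fixed,
    # averaged over all settings of the remaining l - 2 bits.
    def avg_fitness(a, b):
        values = [(a + b + sum(bits)) ** t
                  for bits in product((0, 1), repeat=l - 2)]
        return sum(values) / len(values)
    return nonlinearity(avg_fitness(0, 0), avg_fitness(0, 1),
                        avg_fitness(1, 0), avg_fitness(1, 1))

# Although a bit-wise hill climber solves this problem, the detected
# "interaction" between any two genes grows with the order t:
for t in (1, 2, 3, 4):
    print(t, powerlaw_onemax_nonlinearity(l=8, t=t))
```

For t = 1 (plain OneMax) the metric is exactly zero, as expected; for larger t it grows steadily even though no gene pair needs to be transferred together.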

2.2.2 Simultaneity

Simultaneity is adopted in the work of Aporntewan and Chongstitvatana (2003). The idea is to consider the schema disruptions when 00 crosses with 11 or 10 crosses with 01. The detection metric can be expressed as

$P_{x_i=0,x_j=0} P_{x_i=1,x_j=1} + P_{x_i=0,x_j=1} P_{x_i=1,x_j=0}$, (3)

where $P$ is the proportion of the event in the current population. For simplicity, $P_{x_i=a,x_j=b}$ is abbreviated as $P_{ab}$, so Equation (3) can be written as

$P_{00} P_{11} + P_{01} P_{10}$, (4)

whose form is similar to that of the DMC metric discussed in Section 3. However, simultaneity and DMC have completely different meanings in GAs. The empirical results in Section 4 show that simultaneity is not suitable for DSMGA.

2.2.3 Entropy-based

Entropy-based metrics (Shannon, 1948) are the most commonly used for interaction detection in GAs; ecGA (Harik, 1999), BMDA (Pelikan & Mühlenbein, 1999), EBNA (Etxeberria & Larrañaga, 1999), and BOA (Pelikan, Goldberg, & Cantú-Paz, 1999) are typical examples. The idea is to measure the certainty of $x_j$ given the information of $x_i$. The metric can be written as

$\sum_{x_i, x_j} p_{x_i,x_j} \log \frac{p_{x_i,x_j}}{p_{x_i} p_{x_j}}$. (5)

In the case of two genes, the loss in entropy is the mutual information. Hence we can use the mutual information metric to detect interactions among genes. Mutual information is defined as the Kullback-Leibler distance (Kullback & Leibler, 1951) between the joint distribution and the product distribution:

$I(X;Y) = D(p(x,y) \,\|\, p(x)p(y)) = \sum_x \sum_y p(x,y) \log \frac{p(x,y)}{p(x)p(y)}$, (6)

where $D$ is the Kullback-Leibler distance, $X$ and $Y$ are two random variables, and $x$ and $y$ are the outcomes of these two random variables, respectively. For two binary genes, the mutual information ranges between 0 and 1: the value 1 means that the pair is completely dependent, and the value 0 means that the pair is completely independent. Note that if $X$ and $Y$ are independent, $p(x,y) = p(x)p(y)$, and hence $I(X;Y) = 0$. More detailed discussions can be found in (Cover & Thomas, 1991; Yu, Goldberg, Sastry, Lima, & Pelikan, 2009).

Previous research (Yu, Goldberg, Sastry, Lima, & Pelikan, 2009) adopts the entropy-based metric to solve problems efficiently. However, whether there exists a more meaningful metric for GAs remains an open question, which is discussed in Section 3.
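Both Equation (4) and Equation (6) can be estimated directly from the current population. Below is a minimal sketch, assuming individuals are 0/1 sequences and using log base 2 so that two binary genes yield a mutual information in [0, 1]; the helper names are illustrative:

```python
import math

def pairwise_proportions(population, i, j):
    # Joint proportions P_ab of the allele pair (x_i, x_j) in the population.
    n = len(population)
    p = {(a, b): 0.0 for a in (0, 1) for b in (0, 1)}
    for x in population:
        p[(x[i], x[j])] += 1.0 / n
    return p

def simultaneity(p):
    # Eq. (4): P00*P11 + P01*P10
    return p[(0, 0)] * p[(1, 1)] + p[(0, 1)] * p[(1, 0)]

def mutual_information(p):
    # Eq. (6) with log base 2; marginals are obtained by summing the joint.
    px = {a: p[(a, 0)] + p[(a, 1)] for a in (0, 1)}
    py = {b: p[(0, b)] + p[(1, b)] for b in (0, 1)}
    mi = 0.0
    for (a, b), pab in p.items():
        if pab > 0:                     # 0 * log(0) is treated as 0
            mi += pab * math.log2(pab / (px[a] * py[b]))
    return mi
```

Either value can then be compared against the threshold to produce the binary interaction entry of the DSM.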

2.3 Interaction-Detection Threshold

DSMGA utilizes the mutual information metric to detect interactions and aims to divide problems into separate parts with an appropriate threshold. Because problems can hardly be decomposed without a given threshold, an appropriate threshold is needed for DSMGA to decompose problems properly. To alleviate the computational burden, after the sampled mutual information is calculated, it is transferred into the binary domain, where the metric indicates whether or not a pair of genes interact. Once an appropriate threshold is chosen, a pair of genes is considered interacting if and only if the calculated mutual information is greater than the threshold. An appropriate threshold can be decided by investigating the distribution of mutual information. However, it is difficult to evaluate because it is problem-specific: the appropriate threshold changes from problem to problem. The usual method of finding a threshold is to guess its value and test whether it is appropriate for the given problem. Hence, the purpose of this paper is to construct a new interaction-detection metric with a larger range of appropriate thresholds, making it more suitable for solving problems. The difficulty of choosing appropriate thresholds for the mutual information metric, together with its weak meaning for GAs, motivates the new metric discussed in the next section.

Figure 3: The outcomes of crossover when two genes are crossed. Take 11 as an example: 11 is disrupted only when crossed with its complement 00; it is not disrupted when crossed with 01, 10, or 11. On the other hand, more 11 is reproduced when 01 is crossed with its complement 10.

3 Differential Mutual Complement

Although DSMGA with the entropy-based metric works well in previous research, the meaning of the entropy-based metric in GAs is still unclear. This section proposes DMC as a new metric for pairwise interaction detection and explains the concept behind it.

3.1 Mutual: Pairwise Interaction Detection

The computational cost of determining the exact dependency model scales exponentially with problem size, so a proper mechanism is required for building an approximate model. Since it is impractical to explore the whole search space of multivariate interactions, most EDAs construct their interaction models starting from pairwise interactions between only two variables. It takes $O(\ell^k)$ to compute all dependencies among $k$ variables in a problem of size $\ell$, so it makes sense that most EDAs start with only pairwise interactions, limiting the cost to $O(\ell^2)$. For example, ecGA builds its marginal product models by merging subsets of variables according to the minimum description length metric, and it needs to detect pairwise dependencies at the very beginning of model building. Therefore, to keep the computational cost reasonable, DMC detects interactions between two variables mutually.

3.2 Complement: Effects Between Complements

The main idea of DMC focuses on how crossover operates on two schemata which both have two defined alleles. Define $H_{i=x,j=y}$ as the schema where the $i$-th allele is $x$ and the $j$-th allele is $y$, abbreviated as $H_{xy}$ for simplicity. Suppose that the $i$-th gene and the $j$-th gene are crossed over during recombination. Take $H_{11}$ as an example: $H_{11}$ is disrupted only when crossed with its complement $H_{00}$; it is not disrupted when crossed with $H_{01}$, $H_{10}$, or $H_{11}$. On the other hand, more $H_{11}$ is reproduced when $H_{01}$ is crossed with its complement $H_{10}$. Figure 3 shows all behaviors between the four different schemata with two defined alleles.

3.3 Differential: Disruption and Reproduction

Without loss of generality, we focus on the behavior of crossing the schema $H_{11}$ and assume $H_{11}$ to be a local optimum. By observing the behavior of crossing the schema $H_{11}$, we infer the proportion of $H_{11}$ in the population after crossover:

$P'_{11} = P_{11} - \frac{1}{2} P_{00} P_{11} + \frac{1}{2} P_{01} P_{10}$, (7)

where $P_{11}$ is the proportion of $H_{11}$ in the population and $P'_{11}$ is the proportion of $H_{11}$ after crossover. This equation is deduced by treating the crossover operator as uniform crossover with exchange probability $\frac{1}{2}$: the second term, $\frac{1}{2} P_{00} P_{11}$, is the disruption of $H_{11}$ made by crossover with probability $\frac{1}{2}$, and likewise the third term, $\frac{1}{2} P_{01} P_{10}$, is the reproduction of $H_{11}$ made by crossover with probability $\frac{1}{2}$.
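Equation (7) can be checked numerically. The sketch below draws parent pairs from an assumed two-locus schema distribution, applies uniform crossover with exchange probability 1/2, and compares the measured proportion of $H_{11}$ with the prediction; the distribution is made up for illustration:

```python
import random

def simulate_p11(p, n=200_000, seed=1):
    """Monte Carlo check of Eq. (7): sample two parents from the schema
    distribution p = {(a, b): proportion} and apply uniform crossover
    to the two loci, then measure the resulting proportion of H_11."""
    rng = random.Random(seed)
    schemata, weights = zip(*p.items())
    hits = 0
    for _ in range(n):
        (a1, b1), (a2, b2) = rng.choices(schemata, weights, k=2)
        # each gene of the child comes from either parent with prob. 1/2
        child = (a1 if rng.random() < 0.5 else a2,
                 b1 if rng.random() < 0.5 else b2)
        hits += child == (1, 1)
    return hits / n

p = {(0, 0): 0.4, (0, 1): 0.15, (1, 0): 0.15, (1, 1): 0.3}
predicted = p[(1, 1)] - 0.5 * p[(0, 0)] * p[(1, 1)] + 0.5 * p[(0, 1)] * p[(1, 0)]
print(predicted, simulate_p11(p))  # the two values should agree closely
```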

To combine the disruption and reproduction effects between complements, Equation (7) is rewritten as

$P'_{11} = P_{11} - \frac{1}{2} (P_{00} P_{11} - P_{01} P_{10})$. (8)

To ensure enough growth of the assumed local optimum $H_{11}$, $H_{11}$ has to be protected whenever $P_{00} P_{11} - P_{01} P_{10}$ is large. Therefore, this paper takes $P_{00} P_{11} - P_{01} P_{10}$ from Equation (8) as a metric, namely DMC, for interaction detection. The metric considers the disruption and reproduction effects of mutual complements, and it is a differential form: the value of $P_{00} P_{11}$ is the disruption quantity of $H_{11}$ (as well as $H_{00}$); likewise, the value of $P_{01} P_{10}$ is the reproduction quantity of $H_{11}$ (as well as $H_{00}$). The difference of these two terms can be regarded as the overall disruption of $H_{11}$ made by crossover, and a threshold on this value detects whether $H_{11}$ will be disrupted severely. When the value of DMC exceeds the threshold, DSMGA binds the genes together into a BB and utilizes BB-wise crossover; once the BB is identified, the local optimum $H_{11}$ is protected from disruption. In other words, if DSMGA did not protect the potentially good schema when the DMC metric exceeds the threshold, the schema would be disrupted severely by crossover. As a result, by using the DMC metric with a specific threshold, DSMGA can preserve potentially good schemata from disruption.

3.4 Generalization

To further generalize the DMC metric rather than only assuming $H_{11}$ to be a local optimum (it can be $H_{01}$ or $H_{10}$, too), the protection of $H_{01}$ and $H_{10}$ is taken into consideration in a similar way. The proportion of $H_{01}$ after crossover is

$P'_{01} = P_{01} - \frac{1}{2} (P_{01} P_{10} - P_{00} P_{11})$. (9)

To protect the best schema from disruption in any case, we modify the DMC metric as

$\mathrm{DMC} = \begin{cases} P_{00} P_{11} - P_{01} P_{10}, & \text{if } P_{00} \text{ or } P_{11} \text{ is the largest;} \\ P_{01} P_{10} - P_{00} P_{11}, & \text{if } P_{01} \text{ or } P_{10} \text{ is the largest.} \end{cases}$ (10)

Because the best schema is affected by the most frequently observed schema, denoted $H$, each case of $H$ has to be considered. There are three cases of $H$ after selection. In the case of generally hill-climbable problems such as the OneMax problem, $H$ after selection is the best schema; problems in this case can be solved even without interaction detection. In the case of deceptive problems such as the trap problem, $H$ after selection is the complement of the best schema; the best schema in this case will be disrupted severely by crossover, so it is necessary to bind $H$ into a BB. In the third case, $H$ after selection differs from the best schema by one bit; for example, while the best schema is $H_{11}$, $H$ is $H_{01}$ or $H_{10}$. The best schema in this case is not easily disrupted by $H$, as Figure 3 shows; even if $H$ is identified as a BB, the best schema will not be disrupted severely. Considering each case, it makes sense to use $P_{00} P_{11} - P_{01} P_{10}$ when $P_{00}$ or $P_{11}$ is the largest, to protect $H_{11}$ and $H_{00}$; likewise, it makes sense to use $P_{01} P_{10} - P_{00} P_{11}$ when $P_{01}$ or $P_{10}$ is the largest, to protect $H_{01}$ and $H_{10}$.

3.5 Range of DMC

Without loss of generality, this paper takes $P_{00} P_{11} - P_{01} P_{10}$ to calculate the range of DMC. The maximum of DMC is trivial: when $P_{11}$ and $P_{00}$ are both equal to 0.5, DMC attains its maximum, 0.25. To obtain the minimum of DMC, $P_{01} P_{10}$ is maximized first, so $P_{01}$ is set equal to $P_{10}$. The condition for using $P_{00} P_{11} - P_{01} P_{10}$ is that $P_{00}$ or $P_{11}$ is the largest; assume $P_{00}$ is the largest. To obtain the minimum of DMC, $P_{00}$ then has to be minimized. From the above, $P_{00} = P_{01} = P_{10} = \frac{1}{3}$ is derived, so the minimum of DMC is $-\frac{1}{9}$. Therefore, the range of DMC is $[-\frac{1}{9}, 0.25]$.
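Equation (10) reduces to a few lines of code. A minimal sketch, reusing the pairwise proportions from Section 2.2 (function names are illustrative):

```python
def dmc(p):
    """Differential mutual complement, Eq. (10).

    p maps the four two-locus schemata (a, b) to their proportions after
    selection. The sign of the protected term follows whichever schema
    is most frequent."""
    d = p[(0, 0)] * p[(1, 1)] - p[(0, 1)] * p[(1, 0)]
    largest = max(p, key=p.get)
    return d if largest in ((0, 0), (1, 1)) else -d

# A deceptive, trap-like distribution after selection: the complement 00
# of the best schema 11 dominates, so DMC is large and the gene pair
# should be bound into one building block.
print(dmc({(0, 0): 0.55, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.25}))  # 0.1275
```

At $P_{00} = P_{11} = 0.5$ the function returns the maximum 0.25, and at $P_{00} = P_{01} = P_{10} = \frac{1}{3}$ it returns the minimum $-\frac{1}{9}$, matching the range derived above.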

Figure 4: Problem size $\ell = m \times k$, where $k = 3$ and $m = 60$, for MI. $(\theta_{best}, n_{best}) = (0.017, 1194)$.

Figure 5: Problem size $\ell = m \times k$, where $k = 3$ and $m = 60$, for Simultaneity. $(\theta_{best}, n_{best}) = (0.1257, 3553)$.

With this modification, the DMC metric can be utilized to protect the best schema from disruption after crossover. Moreover, the DMC metric is physically meaningful for GAs, and how it works is clear. So far, this paper has proposed a new metric designed for GAs that is practical to use with them. The next section compares the DMC metric with the MI metric.

4 Empirical Results

This section compares the performance of DSMGA with three interaction-detection metrics: mutual information, the simultaneity metric, and DMC. To simplify the exposition, MI, Simultaneity, and DMC in this section denote DSMGA adopting mutual information, the simultaneity metric, and DMC, respectively, as its interaction-detection metric. First, we discuss the relations between thresholds and population sizes. Then, we analyze the sensitivity of the metrics. Finally, we discuss the number of function evaluations as the measure of resource usage.

We adopt the (m, k)-trap (Goldberg, 1987) as the test problem, where $k$ is the order of the traps (the problem difficulty) and $m$ is the total number of substructures. The $k$-bit trap in this paper is given by

$f^k_{trap}(u) = \begin{cases} 0.9 \left(1 - \frac{u}{k-1}\right), & \text{if } u < k; \\ 1.0, & \text{if } u = k, \end{cases}$ (11)

where $u$ is the number of 1s. For example, the 3-bit trap ($k = 3$) is given by $f^3_{trap}(0) = 0.9$, $f^3_{trap}(1) = 0.45$, $f^3_{trap}(2) = 0.0$, and $f^3_{trap}(3) = 1.0$. The reason for using the (m, k)-trap is to emphasize problem decomposition: if the problem is not properly decomposed, any hill-climbing method tends to give a wrong solution.
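For reference, Equation (11) extends to the concatenated (m, k)-trap in a few lines; a minimal sketch with an illustrative function name:

```python
def trap_fitness(bits, k):
    """(m, k)-trap fitness of Eq. (11): the string is split into m blocks
    of k bits, and each block contributes a k-bit trap on u, its number
    of 1s. Deceptive: every block rewards moving toward all-0s except at
    the isolated optimum of all-1s."""
    total = 0.0
    for start in range(0, len(bits), k):
        u = sum(bits[start:start + k])
        total += 1.0 if u == k else 0.9 * (1.0 - u / (k - 1))
    return total

# 3-bit trap values, matching the example in the text:
print([round(1.0 if u == 3 else 0.9 * (1 - u / 2), 2) for u in range(4)])
# [0.9, 0.45, 0.0, 1.0]
```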

Figure 6: Problem size $\ell = m \times k$, where $k = 3$ and $m = 60$, for DMC. $(\theta_{best}, n_{best}) = (0.038, 1150)$.

Usually, both the appropriate thresholds and the population sizes sufficient to solve a problem properly in polynomial time are unknown and difficult to estimate. Therefore, we sweep the thresholds within the range of each metric, namely $[0, 1]$ for MI, $[0, 0.25]$ for Simultaneity, and $[-\frac{1}{9}, 0.25]$ for DMC, and then search for the corresponding population size with the bisection method for each threshold. The relations between thresholds and population sizes, denoted as $(\theta, n)$, mean that for each threshold $\theta$, $n$ is the minimum population size required to solve the problem properly in polynomial time. The best threshold is the threshold with the smallest corresponding population size, and we call that population size the best population size; the pair is denoted $(\theta_{best}, n_{best})$. Figures 4 to 6 show examples of the relations between thresholds and population sizes for MI, Simultaneity, and DMC with $k = 3$ and $m = 60$; the x-axis shows the threshold values and the y-axis the population sizes. The best threshold based on MI is located at $(\theta_{best}, n_{best}) = (0.017, 1194)$, that based on Simultaneity at $(\theta_{best}, n_{best}) = (0.1257, 3553)$, and that based on DMC at $(\theta_{best}, n_{best}) = (0.038, 1150)$.

4.1 Sensitivity to the Threshold

This section discusses the sensitivity of MI, Simultaneity, and DMC. A metric is less sensitive here if, given a new problem with various $m$ and $k$, available thresholds based on the metric can be found more easily, where the available thresholds are the thresholds that solve the problem successfully with limited population sizes (discussed in detail below). We consider sensitivity as the evaluation indicator for a new metric: before a metric has been theoretically analyzed, we may guess thresholds or use self-adaptive methods to find available ones, and such methods incur errors. If the adopted metric is sensitive, those methods can hardly be used to solve problems because of the difficulty of landing within the range of available thresholds. Conversely, if a metric is less sensitive, it has better adaptability to the problem at hand. In other words, the larger the range of available thresholds, the lower the sensitivity. Therefore, we evaluate MI, Simultaneity, and DMC by observing their sensitivity.

To analyze sensitivity, a limit on population sizes is needed to define the available thresholds: population sizes are limited to be less than an availability rate times the best population size. An availability rate of 110% is in common use, but the experiments in this paper cannot show the results clearly at that rate, so this paper adopts 150% as the availability rate. In Figure 4, for example, the best population size of MI is 1194, so the search space of population sizes ranges from 1194 up to about 1776, and the available thresholds are those whose required population size falls within this range. In the case of Simultaneity, the population sizes range from 3553 up to about 5563; in the case of DMC, the available thresholds are likewise determined by the population sizes, which range upward from 1150.
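The threshold sweep just described can be organized as follows: for each threshold on a grid, bisection finds the minimal population size that still solves the problem reliably. A minimal sketch, where `solves(n)` is a hypothetical callback that runs DSMGA several times with population size n at the fixed threshold and reports whether all runs succeed:

```python
def minimal_population_size(solves, n_low=10, n_high=1000):
    """Bisection search for the smallest n with solves(n) True.

    solves: hypothetical callback; runs the GA a few times with
            population size n and returns True on reliable success."""
    while not solves(n_high):                  # first find a working upper bound
        n_low, n_high = n_high, n_high * 2
    while n_high - n_low > max(1, n_high // 20):   # stop at roughly 5% precision
        mid = (n_low + n_high) // 2
        if solves(mid):
            n_high = mid                       # mid suffices; shrink from above
        else:
            n_low = mid                        # mid fails; shrink from below
    return n_high

# Sweeping a threshold grid then yields the (theta, n) curves of Figures 4-6:
# curve = [(theta, minimal_population_size(lambda n: run_dsmga(theta, n)))
#          for theta in thresholds]            # run_dsmga is hypothetical
```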
To compare the sensitivity of MI, Simultaneity, and DMC more explicitly, we normalize the empirical results and combine them. The range of Simultaneity is normalized from $[0, 0.25]$ to $[0, 1]$ and the range of DMC from $[-\frac{1}{9}, 0.25]$ to $[0, 1]$; since MI already ranges from 0 to 1, its results need not be normalized. We then align the best thresholds of MI and Simultaneity with the best threshold of DMC by translating the threshold values, leaving the corresponding population sizes unchanged. Figure 7 shows the result of normalizing Figures 4, 5, and 6 and combining them together.
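The normalization and alignment just described amount to an affine map on the threshold axis. A small sketch, with illustrative names:

```python
def normalize_and_align(curve, lo, hi, theta_best, base_theta_best_scaled):
    """Map thresholds from [lo, hi] onto [0, 1], then translate so the best
    threshold coincides with the base metric's (already scaled) best
    threshold. Population sizes are left unchanged.

    curve: list of (theta, n) pairs measured for one metric."""
    scale = lambda t: (t - lo) / (hi - lo)
    shift = base_theta_best_scaled - scale(theta_best)
    return [(scale(theta) + shift, n) for theta, n in curve]

# e.g. Simultaneity lives on [0, 0.25] and DMC on [-1/9, 0.25]; MI is
# already on [0, 1] and only needs the translation step.
```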

Figure 7: Problem size $k = 3$ and $m = 60$, for MI, Simultaneity, and DMC. The range of available thresholds based on DMC is the largest.

Figure 8: Problem size $k = 4$ and $m = 60$, for MI, Simultaneity, and DMC. The range of available thresholds based on DMC is the largest. Note that no threshold based on Simultaneity that solves this problem was found in the experiments.

Figure 9: Problem size $k = 5$ and $m = 60$, for MI, Simultaneity, and DMC. The range of available thresholds based on DMC is the largest. Note that no threshold based on Simultaneity that solves this problem was found in the experiments.

Figure 10: Problem order $k = 3$, for MI, Simultaneity, and DMC. The range of available thresholds for DMC is the largest.

Figure 11: Problem order $k = 4$, for MI, Simultaneity, and DMC. The range of available thresholds for DMC is the largest. Note that no thresholds based on Simultaneity that solve the problems in either case, $m = 30$ or $m = 60$, were found.

We compare MI, Simultaneity, and DMC in Figures 7 to 9. Figure 7 shows the difference in the ranges of available thresholds for the three metrics: the range for DMC is much larger than those for the other metrics. When the problem difficulty increases, the range of available thresholds for MI shrinks more rapidly than that for DMC. Moreover, in Figures 8 and 9, no thresholds based on Simultaneity that solve these problems, with $k \geq 4$, were found. Simultaneity performs worse because, being designed on a greedy idea, it makes locating an appropriate threshold difficult; in other words, when $k \geq 4$, the range of thresholds that can solve the problem may be too small to be found. The results imply that DMC is less sensitive than both MI and Simultaneity on problems of various difficulty.

So far we have discussed the sensitivity of the metrics with various problem difficulty $k$ and constant $m$. However, both $k$ and $m$ affect the available thresholds simultaneously, and real-world problems are usually large, so finding the available thresholds without any information may be a waste of effort. To deal with such problems, solving simplified problems of similar difficulty is advisable: we can find the available thresholds for the probable difficulty with small $m$ first, and then scale back up to the original size. Hence, the sensitivity to the change of problem size, with various $m$ and constant difficulty $k$, also plays a role in evaluating metrics. Again, we normalize the empirical results and combine them through the same method as before, this time taking the results of DMC on problems with $m = 30$ as the base and translating the other conditions (DMC with $m = 60$, MI with $m = 30$, MI with $m = 60$, Simultaneity with $m = 30$, and Simultaneity with $m = 60$) to align with the base after normalization. Figures 10 to 12 show the differences between MI, Simultaneity, and DMC with various $m$. The ranges of available thresholds based on MI and Simultaneity decrease much more rapidly as $m$ increases. As mentioned before, in Figures 11 and 12, no thresholds based on Simultaneity that solve the problems with $k \geq 4$ were found in the experiments. Once again, DMC is shown to be less sensitive than MI and Simultaneity. The results imply that deriving an available threshold for DMC from empirical experience is simpler than for MI and Simultaneity.

Figure 12: Problem order $k = 5$, for MI, Simultaneity, and DMC. The range of available thresholds for DMC is the largest. Note that no thresholds based on Simultaneity that solve the problems in either case, $m = 30$ or $m = 60$, were found.

Table 1: The entries give $N_{fe}$ in the form (population size x convergence time) for DSMGA adopting MI, Simultaneity, and DMC at the best thresholds; smaller is better. When $k = 3$, MI works better because of less convergence time. When $k = 4$, the difference between MI and DMC decreases. When $k = 5$, DMC wins over MI due to the critical difference in required population size. Simultaneity performs worst in all experiments because of its huge population size at $k = 3$ and the failure to locate appropriate thresholds for $k \geq 4$.

                MI     Simultaneity    DMC
k = 3, m = 60   ...    ...             ...
k = 4, m = 60   ...    N/A             ...
k = 5, m = 60   ...    N/A             ...

Table 2: The entries give $N_{fe}$ in the form (population size x convergence time) for MI and DMC-m at the best thresholds; smaller is better. DMC-m is modified from DMC by increasing the population size at the best threshold to equal that of MI. When $k = 3$, MI works better because of less convergence time. When $k \geq 4$, DMC-m works better than MI. This table omits Simultaneity because of its 100% failure rate.

                MI     DMC-m
k = 3, m = 60   ...    ...
k = 4, m = 60   ...    ...
k = 5, m = 60   ...    ...

4.2 Function Evaluation Reduction

This section focuses on the number of function evaluations, denoted $N_{fe}$, of DSMGA adopting MI, Simultaneity, and DMC. We take the product of population size and convergence time (number of generations) as the indicator of $N_{fe}$. Table 1 shows both the population size and the convergence time of DSMGA adopting MI, Simultaneity, and DMC with various $k$ and constant $m$. Simultaneity performs worst in all experiments because of its huge population size at $k = 3$ and the failure to locate appropriate thresholds for $k \geq 4$. On easier problems, with order $k = 3$, DMC works worse than MI, although the best population size of DMC is smaller. The results show that as $k$ increases, DMC tends toward a lower $N_{fe}$. If we further increase the population sizes for DMC at the best thresholds to equal those for MI, the convergence time decreases rapidly; Table 2 shows the empirical results.

Figure 13: $N_{fe}$ of DSMGA adopting MI, DMC, and DMC-m at the best thresholds. The $N_{fe}$ of DSMGA with DMC and DMC-m becomes lower than that of DSMGA with MI.

Figure 13 shows the $N_{fe}$ of MI, DMC, and DMC-m (DMC with its population size increased to equal that of MI), ignoring Simultaneity because of its lack of success in the experiments. The empirical results imply that as the difficulty increases, the $N_{fe}$ of DSMGA with DMC becomes lower than that of DSMGA with MI or Simultaneity.

5 Conclusions

Linkage learning is essential for GAs to solve nearly decomposable problems successfully. EDAs use interaction-detection metrics to recognize BBs within decomposable problems; the common metrics are nonlinearity, simultaneity, and entropy-based metrics. Previous research has shown that EDAs with entropy-based metrics perform efficiently, but leaves open the question of whether entropy-based metrics are what GAs really need. That question motivated us to develop another metric.

This paper proposed a new interaction-detection metric, named the differential mutual complement (DMC). Designed from observations of the behavior of crossover operators, the new metric is closer to what GAs really need. The idea of DMC is to keep schemata from being disrupted and to protect potentially good schemata. Moreover, the proposed metric is physically meaningful for GAs, and the way it works is clear. Another advantage of the proposed metric is lower sensitivity to the threshold, which plays a role in the efficiency of DSMGA. Therefore, DMC can be more suitable for solving problems.

Independent pairs and dependent pairs form two distributions over the range of the proposed metric; finding appropriate thresholds that separate these distributions may therefore be a route to designing self-adaptive methods. Besides, owing to the meaningfulness of the proposed metric for GAs, whether DMC can be supported theoretically will be addressed in future work. DMC has shown some promising characteristics and is more meaningful for GAs than other existing metrics. Nevertheless, further investigation is still needed to check how close DMC is to what GAs really need. Imagine an optimal metric that indicates an interaction between two genes if and only if GAs need to transfer the information of these two genes together during recombination. Then we should be able to find a problem which fails GAs with any other metric; only GAs with this optimal metric would work on it. In addition, finding this metric might also give a hint about the limited success of uni-metric methods such as LLGA, by comparing the difference between this metric and fitness, the only metric used for linkage learning in uni-metric methods.

References

Aporntewan, C., & Chongstitvatana, P. (2003). Building-block identification by simultaneity matrix. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2003).

Barron, A., Rissanen, J., & Yu, B. (1998). The MDL principle in coding and modeling. IEEE Transactions on Information Theory, vol. 44, no. 6.

Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. New York: Wiley.

Etxeberria, R., & Larrañaga, P. (1999). Global optimization using Bayesian networks. Proceedings of the Second Symposium on Artificial Intelligence Adaptive Systems.

Goldberg, D. E. (1987). Simple genetic algorithms and the minimal, deceptive problem. Genetic Algorithms and Simulated Annealing.

Goldberg, D. E. (1989). Genetic algorithms in search, optimization and machine learning. Reading, MA: Addison-Wesley.

Goldberg, D. E. (2002). The design of innovation: Lessons from and for competent genetic algorithms. Norwell, MA: Kluwer Academic Publishers.

Goldberg, D. E., Korb, B., & Deb, K. (1989). Messy genetic algorithms: Motivation, analysis, and first results. Complex Systems, vol. 3.

Harik, G. R. (1996). Learning linkage. Foundations of Genetic Algorithms 4.

Harik, G. R. (1997). Learning gene linkage to efficiently solve problems of bounded difficulty using genetic algorithms. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Harik, G. R. (1999). Linkage learning via probabilistic modeling in the ECGA. IlliGAL Report No. 99010, University of Illinois at Urbana-Champaign, Urbana, IL.

Holland, J. H. (1975). Adaptation in natural and artificial systems. Ann Arbor, MI: University of Michigan Press.

Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, vol. 22.

Lutz, R. (2002). Recovering high-level structure of software systems using a minimum description length principle. Proceedings of the 13th Irish International Conference on Artificial Intelligence and Cognitive Science (AICS).

McCord, K., & Eppinger, S. D. (1993). Managing the integration problem in concurrent engineering. Working Paper 3594, Cambridge, MA: MIT Sloan School of Management.

Munetomo, M., & Goldberg, D. E. (1999). Identifying linkage groups by nonlinearity/non-monotonicity detection. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-1999).

Pelikan, M., & Goldberg, D. E. (2001). Escaping hierarchical traps with competent genetic algorithms. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001).

Pelikan, M., Goldberg, D. E., & Cantú-Paz, E. (1999). BOA: The Bayesian optimization algorithm. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-1999).

Pelikan, M., & Mühlenbein, H. (1999). The bivariate marginal distribution algorithm. Advances in Soft Computing: Engineering Design and Manufacturing.

Pimmler, T., & Eppinger, S. D. (1994). Integration analysis of product decompositions. Proceedings of the ASME International Conference on Design Theory and Methodology, DE-68.

Rissanen, J. (1978). Modeling by shortest data description. Automatica, vol. 14.

Rissanen, J. (1998). Hypothesis selection and testing by the MDL principle. The Computer Journal, vol. 42.

Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, vol. 27.

Tsuji, M., Munetomo, M., & Akama, K. (2004). Modeling dependencies of loci with string classification according to fitness differences. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2004).

Yu, T.-L., Goldberg, D. E., Sastry, K., Lima, C. F., & Pelikan, M. (2009). Dependency structure matrix, genetic algorithms, and effective recombination. Evolutionary Computation, vol. 17, no. 4.


Shortening Picking Distance by using Rank-Order Clustering and Genetic Algorithm for Distribution Centers Shortening Picking Distance by using Rank-Order Clustering and Genetic Algorithm for Distribution Centers Rong-Chang Chen, Yi-Ru Liao, Ting-Yao Lin, Chia-Hsin Chuang, Department of Distribution Management,

More information

An Information Geometry Perspective on Estimation of Distribution Algorithms: Boundary Analysis

An Information Geometry Perspective on Estimation of Distribution Algorithms: Boundary Analysis An Information Geometry Perspective on Estimation of Distribution Algorithms: Boundary Analysis Luigi Malagò Department of Electronics and Information Politecnico di Milano Via Ponzio, 34/5 20133 Milan,

More information

Minimum message length estimation of mixtures of multivariate Gaussian and von Mises-Fisher distributions

Minimum message length estimation of mixtures of multivariate Gaussian and von Mises-Fisher distributions Minimum message length estimation of mixtures of multivariate Gaussian and von Mises-Fisher distributions Parthan Kasarapu & Lloyd Allison Monash University, Australia September 8, 25 Parthan Kasarapu

More information

Using Kernel PCA for Initialisation of Variational Bayesian Nonlinear Blind Source Separation Method

Using Kernel PCA for Initialisation of Variational Bayesian Nonlinear Blind Source Separation Method Using Kernel PCA for Initialisation of Variational Bayesian Nonlinear Blind Source Separation Method Antti Honkela 1, Stefan Harmeling 2, Leo Lundqvist 1, and Harri Valpola 1 1 Helsinki University of Technology,

More information

The Behaviour of the Akaike Information Criterion when Applied to Non-nested Sequences of Models

The Behaviour of the Akaike Information Criterion when Applied to Non-nested Sequences of Models The Behaviour of the Akaike Information Criterion when Applied to Non-nested Sequences of Models Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School of Population Health

More information

CS242: Probabilistic Graphical Models Lecture 4B: Learning Tree-Structured and Directed Graphs

CS242: Probabilistic Graphical Models Lecture 4B: Learning Tree-Structured and Directed Graphs CS242: Probabilistic Graphical Models Lecture 4B: Learning Tree-Structured and Directed Graphs Professor Erik Sudderth Brown University Computer Science October 6, 2016 Some figures and materials courtesy

More information

Analysis of Epistasis Correlation on NK Landscapes. Landscapes with Nearest Neighbor Interactions

Analysis of Epistasis Correlation on NK Landscapes. Landscapes with Nearest Neighbor Interactions Analysis of Epistasis Correlation on NK Landscapes with Nearest Neighbor Interactions Missouri Estimation of Distribution Algorithms Laboratory (MEDAL University of Missouri, St. Louis, MO http://medal.cs.umsl.edu/

More information

Pengju

Pengju Introduction to AI Chapter04 Beyond Classical Search Pengju Ren@IAIR Outline Steepest Descent (Hill-climbing) Simulated Annealing Evolutionary Computation Non-deterministic Actions And-OR search Partial

More information

The Volume of Bitnets

The Volume of Bitnets The Volume of Bitnets Carlos C. Rodríguez The University at Albany, SUNY Department of Mathematics and Statistics http://omega.albany.edu:8008/bitnets Abstract. A bitnet is a dag of binary nodes representing

More information

Gaussian EDA and Truncation Selection: Setting Limits for Sustainable Progress

Gaussian EDA and Truncation Selection: Setting Limits for Sustainable Progress Gaussian EDA and Truncation Selection: Setting Limits for Sustainable Progress Petr Pošík Czech Technical University, Faculty of Electrical Engineering, Department of Cybernetics Technická, 66 7 Prague

More information

Computational Complexity and Genetic Algorithms

Computational Complexity and Genetic Algorithms Computational Complexity and Genetic Algorithms BART RYLANDER JAMES FOSTER School of Engineering Department of Computer Science University of Portland University of Idaho Portland, Or 97203 Moscow, Idaho

More information

CSC 4510 Machine Learning

CSC 4510 Machine Learning 10: Gene(c Algorithms CSC 4510 Machine Learning Dr. Mary Angela Papalaskari Department of CompuBng Sciences Villanova University Course website: www.csc.villanova.edu/~map/4510/ Slides of this presenta(on

More information

Final Exam, Machine Learning, Spring 2009

Final Exam, Machine Learning, Spring 2009 Name: Andrew ID: Final Exam, 10701 Machine Learning, Spring 2009 - The exam is open-book, open-notes, no electronics other than calculators. - The maximum possible score on this exam is 100. You have 3

More information

1. Computação Evolutiva

1. Computação Evolutiva 1. Computação Evolutiva Renato Tinós Departamento de Computação e Matemática Fac. de Filosofia, Ciência e Letras de Ribeirão Preto Programa de Pós-Graduação Em Computação Aplicada 1.6. Aspectos Teóricos*

More information

Lecture 1 : Data Compression and Entropy

Lecture 1 : Data Compression and Entropy CPS290: Algorithmic Foundations of Data Science January 8, 207 Lecture : Data Compression and Entropy Lecturer: Kamesh Munagala Scribe: Kamesh Munagala In this lecture, we will study a simple model for

More information

A Genetic Algorithm Approach for Doing Misuse Detection in Audit Trail Files

A Genetic Algorithm Approach for Doing Misuse Detection in Audit Trail Files A Genetic Algorithm Approach for Doing Misuse Detection in Audit Trail Files Pedro A. Diaz-Gomez and Dean F. Hougen Robotics, Evolution, Adaptation, and Learning Laboratory (REAL Lab) School of Computer

More information

LTI Systems, Additive Noise, and Order Estimation

LTI Systems, Additive Noise, and Order Estimation LTI Systems, Additive oise, and Order Estimation Soosan Beheshti, Munther A. Dahleh Laboratory for Information and Decision Systems Department of Electrical Engineering and Computer Science Massachusetts

More information

CS 630 Basic Probability and Information Theory. Tim Campbell

CS 630 Basic Probability and Information Theory. Tim Campbell CS 630 Basic Probability and Information Theory Tim Campbell 21 January 2003 Probability Theory Probability Theory is the study of how best to predict outcomes of events. An experiment (or trial or event)

More information

3. If a choice is broken down into two successive choices, the original H should be the weighted sum of the individual values of H.

3. If a choice is broken down into two successive choices, the original H should be the weighted sum of the individual values of H. Appendix A Information Theory A.1 Entropy Shannon (Shanon, 1948) developed the concept of entropy to measure the uncertainty of a discrete random variable. Suppose X is a discrete random variable that

More information

Local Search & Optimization

Local Search & Optimization Local Search & Optimization CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2017 Soleymani Artificial Intelligence: A Modern Approach, 3 rd Edition, Chapter 4 Outline

More information

A New Approach to Estimating the Expected First Hitting Time of Evolutionary Algorithms

A New Approach to Estimating the Expected First Hitting Time of Evolutionary Algorithms A New Approach to Estimating the Expected First Hitting Time of Evolutionary Algorithms Yang Yu and Zhi-Hua Zhou National Laboratory for Novel Software Technology Nanjing University, Nanjing 20093, China

More information

The Decision List Machine

The Decision List Machine The Decision List Machine Marina Sokolova SITE, University of Ottawa Ottawa, Ont. Canada,K1N-6N5 sokolova@site.uottawa.ca Nathalie Japkowicz SITE, University of Ottawa Ottawa, Ont. Canada,K1N-6N5 nat@site.uottawa.ca

More information

Evolutionary Computation

Evolutionary Computation Evolutionary Computation - Computational procedures patterned after biological evolution. - Search procedure that probabilistically applies search operators to set of points in the search space. - Lamarck

More information

Distinguishing Causes from Effects using Nonlinear Acyclic Causal Models

Distinguishing Causes from Effects using Nonlinear Acyclic Causal Models JMLR Workshop and Conference Proceedings 6:17 164 NIPS 28 workshop on causality Distinguishing Causes from Effects using Nonlinear Acyclic Causal Models Kun Zhang Dept of Computer Science and HIIT University

More information

A Modification of Linfoot s Informational Correlation Coefficient

A Modification of Linfoot s Informational Correlation Coefficient Austrian Journal of Statistics April 07, Volume 46, 99 05. AJS http://www.ajs.or.at/ doi:0.773/ajs.v46i3-4.675 A Modification of Linfoot s Informational Correlation Coefficient Georgy Shevlyakov Peter

More information

Self-Organization by Optimizing Free-Energy

Self-Organization by Optimizing Free-Energy Self-Organization by Optimizing Free-Energy J.J. Verbeek, N. Vlassis, B.J.A. Kröse University of Amsterdam, Informatics Institute Kruislaan 403, 1098 SJ Amsterdam, The Netherlands Abstract. We present

More information

DEEP LEARNING CHAPTER 3 PROBABILITY & INFORMATION THEORY

DEEP LEARNING CHAPTER 3 PROBABILITY & INFORMATION THEORY DEEP LEARNING CHAPTER 3 PROBABILITY & INFORMATION THEORY OUTLINE 3.1 Why Probability? 3.2 Random Variables 3.3 Probability Distributions 3.4 Marginal Probability 3.5 Conditional Probability 3.6 The Chain

More information

What Makes a Problem Hard for a Genetic Algorithm? Some Anomalous Results and Their Explanation

What Makes a Problem Hard for a Genetic Algorithm? Some Anomalous Results and Their Explanation What Makes a Problem Hard for a Genetic Algorithm? Some Anomalous Results and Their Explanation Stephanie Forrest Dept. of Computer Science University of New Mexico Albuquerque, N.M. 87131-1386 Email:

More information

Structure learning in human causal induction

Structure learning in human causal induction Structure learning in human causal induction Joshua B. Tenenbaum & Thomas L. Griffiths Department of Psychology Stanford University, Stanford, CA 94305 jbt,gruffydd @psych.stanford.edu Abstract We use

More information

ECS 120 Lesson 24 The Class N P, N P-complete Problems

ECS 120 Lesson 24 The Class N P, N P-complete Problems ECS 120 Lesson 24 The Class N P, N P-complete Problems Oliver Kreylos Friday, May 25th, 2001 Last time, we defined the class P as the class of all problems that can be decided by deterministic Turing Machines

More information

Simulation of the Evolution of Information Content in Transcription Factor Binding Sites Using a Parallelized Genetic Algorithm

Simulation of the Evolution of Information Content in Transcription Factor Binding Sites Using a Parallelized Genetic Algorithm Simulation of the Evolution of Information Content in Transcription Factor Binding Sites Using a Parallelized Genetic Algorithm Joseph Cornish*, Robert Forder**, Ivan Erill*, Matthias K. Gobbert** *Department

More information

On the errors introduced by the naive Bayes independence assumption

On the errors introduced by the naive Bayes independence assumption On the errors introduced by the naive Bayes independence assumption Author Matthijs de Wachter 3671100 Utrecht University Master Thesis Artificial Intelligence Supervisor Dr. Silja Renooij Department of

More information

Lecture 9 Evolutionary Computation: Genetic algorithms

Lecture 9 Evolutionary Computation: Genetic algorithms Lecture 9 Evolutionary Computation: Genetic algorithms Introduction, or can evolution be intelligent? Simulation of natural evolution Genetic algorithms Case study: maintenance scheduling with genetic

More information

Probability and Information Theory. Sargur N. Srihari

Probability and Information Theory. Sargur N. Srihari Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal

More information

Adaptive Generalized Crowding for Genetic Algorithms

Adaptive Generalized Crowding for Genetic Algorithms Carnegie Mellon University From the SelectedWorks of Ole J Mengshoel Fall 24 Adaptive Generalized Crowding for Genetic Algorithms Ole J Mengshoel, Carnegie Mellon University Severinio Galan Antonio de

More information

Information Theory. Coding and Information Theory. Information Theory Textbooks. Entropy

Information Theory. Coding and Information Theory. Information Theory Textbooks. Entropy Coding and Information Theory Chris Williams, School of Informatics, University of Edinburgh Overview What is information theory? Entropy Coding Information Theory Shannon (1948): Information theory is

More information

Haploid & diploid recombination and their evolutionary impact

Haploid & diploid recombination and their evolutionary impact Haploid & diploid recombination and their evolutionary impact W. Garrett Mitchener College of Charleston Mathematics Department MitchenerG@cofc.edu http://mitchenerg.people.cofc.edu Introduction The basis

More information

MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION

MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION THOMAS MAILUND Machine learning means different things to different people, and there is no general agreed upon core set of algorithms that must be

More information

6.867 Machine learning, lecture 23 (Jaakkola)

6.867 Machine learning, lecture 23 (Jaakkola) Lecture topics: Markov Random Fields Probabilistic inference Markov Random Fields We will briefly go over undirected graphical models or Markov Random Fields (MRFs) as they will be needed in the context

More information

Local Search & Optimization

Local Search & Optimization Local Search & Optimization CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2018 Soleymani Artificial Intelligence: A Modern Approach, 3 rd Edition, Chapter 4 Some

More information

Variable Dependence Interaction and Multi-objective Optimisation

Variable Dependence Interaction and Multi-objective Optimisation Variable Dependence Interaction and Multi-objective Optimisation Ashutosh Tiwari and Rajkumar Roy Department of Enterprise Integration, School of Industrial and Manufacturing Science, Cranfield University,

More information

Introduction to Optimization

Introduction to Optimization Introduction to Optimization Blackbox Optimization Marc Toussaint U Stuttgart Blackbox Optimization The term is not really well defined I use it to express that only f(x) can be evaluated f(x) or 2 f(x)

More information

Info-Clustering: An Information-Theoretic Paradigm for Data Clustering. Information Theory Guest Lecture Fall 2018

Info-Clustering: An Information-Theoretic Paradigm for Data Clustering. Information Theory Guest Lecture Fall 2018 Info-Clustering: An Information-Theoretic Paradigm for Data Clustering Information Theory Guest Lecture Fall 208 Problem statement Input: Independently and identically drawn samples of the jointly distributed

More information

GENETIC ALGORITHM FOR CELL DESIGN UNDER SINGLE AND MULTIPLE PERIODS

GENETIC ALGORITHM FOR CELL DESIGN UNDER SINGLE AND MULTIPLE PERIODS GENETIC ALGORITHM FOR CELL DESIGN UNDER SINGLE AND MULTIPLE PERIODS A genetic algorithm is a random search technique for global optimisation in a complex search space. It was originally inspired by an

More information