Estimation-of-Distribution Algorithms. Discrete Domain.


Estimation-of-Distribution Algorithms. Discrete Domain. Petr Pošík

Contents:
- Introduction to EDAs: Genetic Algorithms and Epistasis; Genetic Algorithms; GA vs EDA; Content of the lectures
- Motivation Example: Selection, Modeling, Sampling; UMDA Behaviour for the OneMax problem; What about a different fitness?; UMDA behaviour on concatenated traps; What can be done about traps?; Good news!
- Discrete EDAs: Overview
- EDAs without interactions
- Pairwise Interactions: From single bits to pairwise models; Example with pairwise dependencies: dependency tree; Example of dependency tree learning; Dependency tree: probabilities; EDAs with pairwise interactions; Summary
- Multivariate Interactions: ECGA; ECGA: Evaluation metric; BOA; BOA: Learning the structure
- Scalability Analysis: Test functions; Scalability analysis; OneMax; Non-dec. Equal Pairs; Decomp. Equal Pairs; Non-dec. Sliding XOR; Decomp. Sliding XOR; Decomp. Trap; Model structure during evolution
- Conclusions: Summary; Suggestions for discrete EDAs

Introduction to EDAs

Genetic Algorithms and Epistasis

From the lecture on epistasis: x_best = 11...1, f(x_best) = 40.
- GA works: no dependencies (f 40x1bitOneMax).
- GA fails: dependencies exist, and the GA is not able to work with them (f 8x5bitTrap).
- GA works again: dependencies exist, and the GA knows about them (f 8x5bitTrap).

[Figure: best/average fitness and the population mean (xmean) over generations for the three cases above.]

Genetic Algorithms

Algorithm 1: Genetic Algorithm
1 begin
2   Initialize the population.
3   while termination criteria are not met do
4     Select parents from the population.
5     Cross over the parents, create offspring.
6     Mutate offspring.
7     Incorporate offspring into the population.

The select-crossover-mutate approach: conventional GA operators are not adaptive, and cannot (or usually do not) discover and use the interactions among solution components.

What does an interaction mean?
- We would like to create a new offspring by mutation, and we would like the offspring to have better, or at least the same, quality as the parent.
- If we must modify x_i together with x_j to reach this goal (if it is not possible to improve the solution by modifying either x_i or x_j alone), then x_i interacts with x_j.

The goal of recombination operators is to intensify the search in areas which contained good individuals in previous iterations. To do so, they must be able to take the interactions into account. Why not directly describe the distribution of good individuals?

GA vs EDA

Algorithm 1: Genetic Algorithm
1 begin
2   Initialize the population.
3   while termination criteria are not met do
4     Select parents from the population.
5     Cross over the parents, create offspring.
6     Mutate offspring.
7     Incorporate offspring into the population.

The select-crossover-mutate approach.

Algorithm 2: Estimation-of-Distribution Algorithm
1 begin
2   Initialize the population.
3   while termination criteria are not met do
4     Select parents from the population.
5     Learn a model of their distribution.
6     Sample new individuals.
7     Incorporate offspring into the population.

The select-model-sample approach.

Explicit probabilistic model:
- a principled way of working with dependencies
- adaptation ability (different behavior in different stages of evolution)

Names:
- EDA: Estimation-of-Distribution Algorithm
- PMBGA: Probabilistic Model-Building Genetic Algorithm
- IDEA: Iterated Density Estimation Algorithm

Content of the lectures:
1. EDAs for discrete domains (e.g. binary): motivation example; models without interactions; pairwise interactions; higher-order interactions.
2. EDAs for the real domain (vectors of real numbers): evolution strategies; histograms; Gaussian distribution and its mixtures.
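
To make the contrast concrete, here is a minimal Python sketch of the generic select-model-sample loop of Algorithm 2. The function names, the truncation selection, and the full-replacement policy are illustrative assumptions, not prescribed by the slides; concrete EDAs differ mainly in the learn and sample steps.

```python
import random

def eda(fitness, D, pop_size, generations, learn, sample):
    """Generic select-model-sample loop (Algorithm 2 above)."""
    pop = [[random.randint(0, 1) for _ in range(D)] for _ in range(pop_size)]
    for _ in range(generations):
        # Select parents (here: truncation selection, keep the better half).
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]
        # Learn a model of their distribution, then sample new individuals.
        model = learn(parents)
        pop = [sample(model) for _ in range(pop_size)]  # full replacement
    return max(pop, key=fitness)
```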

Motivation Example

Example: the 5-bit OneMax (CountOnes) problem:
f_5bitOneMax(x) = Σ_{d=1}^{5} x_d
Optimum: 11111, fitness: 5.

Algorithm: Univariate Marginal Distribution Algorithm (UMDA)
- Population size: 6
- Tournament selection: t = 2
- Model: a vector of probabilities p = (p_1, ..., p_D); each p_d is the probability of observing 1 at the dth element.
- Model learning: compute p from the selected individuals.
- Model sampling: generate 1 at the dth position with probability p_d (independently of the other positions).

Selection, Modeling, Sampling

One generation on a population of six 5-bit strings (fitness values in parentheses):
- Old population: fitnesses (3), (1), (4), (4), (1), (2)
- Tournaments: (4) vs. (4), (4) vs. (4), (4) vs. (1), (2) vs. (1), (1) vs. (1), (1) vs. (3)
- Selected parents: fitnesses (4), (4), (4), (2), (1), (3)
- Probability vector p: the relative frequency of 1s in each of the 5 columns of the selected parents
- Offspring sampled from p: fitnesses (2), (2), (4), (3), (3), (4)

The sketch below runs this whole procedure on a random population.
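
A minimal sketch of one UMDA generation, matching the selection, modeling, and sampling steps above (the helper names are mine; tournament opponents are drawn without replacement for simplicity):

```python
import random

def onemax(x):
    return sum(x)

def umda_generation(pop, fitness, t=2):
    """One UMDA generation: selection, model learning, model sampling."""
    n, D = len(pop), len(pop[0])
    # Tournament selection: the fitter of t randomly drawn individuals wins.
    parents = [max(random.sample(pop, t), key=fitness) for _ in range(n)]
    # Model learning: p[d] = relative frequency of 1s at position d.
    p = [sum(x[d] for x in parents) / n for d in range(D)]
    # Model sampling: each position is generated independently.
    return [[1 if random.random() < p[d] else 0 for d in range(D)]
            for _ in range(n)]

pop = [[random.randint(0, 1) for _ in range(5)] for _ in range(6)]
pop = umda_generation(pop, onemax)
```

With the learning and sampling steps split out, this is exactly the learn/sample pair one could plug into the generic loop sketched earlier.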

UMDA Behaviour on the OneMax problem

[Figure: probability vector entries over generations.]

- 1s are better than 0s on average, so selection increases the proportion of 1s.
- Recombination preserves and combines 1s; the ratio of 1s increases over time.
- If we have many 1s in the population, we cannot miss the optimum.

The number of evaluations needed for reliable convergence:
- UMDA: O(D ln D)
- Hill-Climber: O(D ln D)
- GA with uniform crossover: approx. O(D ln D)
- GA with 1-point crossover: a bit slower

UMDA behaves similarly to a GA with uniform crossover!

What about a different fitness?

For the OneMax function, UMDA works well; all the bits probably eventually converge to the right value. Will UMDA be similarly successful for other fitness functions? Well, no. :-(

Problem: concatenated 5-bit traps
f = f_trap(x_1, x_2, x_3, x_4, x_5) + f_trap(x_6, x_7, x_8, x_9, x_10)
The trap function is defined as
f_trap(x) = 5 if u(x) = 5, and 4 − u(x) otherwise,
where u(x) is the so-called unitation function and returns the number of 1s in x (it is actually the OneMax function).

[Figure: trap(x) as a function of the number of 1s in the chromosome, u(x).]
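
The trap definitions above translate directly into code; the helper names are mine:

```python
def u(x):
    """Unitation: the number of 1s in x (i.e. the OneMax value)."""
    return sum(x)

def trap5(x):
    """The 5-bit trap: deceptive everywhere except at the optimum 11111."""
    return 5 if u(x) == 5 else 4 - u(x)

def concatenated_traps(x):
    """f = f_trap(x1..x5) + f_trap(x6..x10) + ...; len(x) divisible by 5."""
    return sum(trap5(x[i:i + 5]) for i in range(0, len(x), 5))
```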

UMDA behaviour on concatenated traps

Traps: the optimum is 11...1. But the average value of f_trap over the strings 0**** is 2, while over the strings 1**** it is only 1.375. One-dimensional probabilities lead the GA the wrong way! An exponentially increasing population size is needed, otherwise the GA will not find the optimum reliably.

[Figure: probability vector entries over generations.]

What can be done about traps?

The f_trap function is deceptive:
- Statistics over 1**** and 0**** do not lead us to the right solution.
- The same holds for statistics over 11*** and 00***, 111** and 000**, 1111* and 0000*.
- This is harder than the needle-in-the-haystack problem: a regular haystack simply does not provide any information about where to search for the needle; the f_trap haystack actively lies to you, pointing you to the wrong part of the haystack.

But: f_trap(00000) < f_trap(11111), so 11111 will be better than 00000 on average. 5-bit statistics should work for 5-bit traps in the same way as 1-bit statistics work for the OneMax problem!

Model learning: build a model for each 5-tuple of bits, i.e. compute p(00000), p(00001), ..., p(11111).
Model sampling: each 5-tuple of bits is generated independently; generate 00000 with probability p(00000), 00001 with probability p(00001), ...
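
A minimal sketch of this block-wise marginal model, assuming the chromosome length is a multiple of the block size k (function names are mine):

```python
import random
from collections import Counter
from itertools import product

def learn_block_model(parents, k=5):
    """Learn one joint distribution per k-bit block of the chromosome."""
    n, D = len(parents), len(parents[0])
    model = []
    for start in range(0, D, k):
        counts = Counter(tuple(x[start:start + k]) for x in parents)
        model.append({block: counts[block] / n
                      for block in product((0, 1), repeat=k)})
    return model

def sample_block_model(model):
    """Each k-bit block is generated independently from its joint distribution."""
    x = []
    for dist in model:
        blocks, probs = zip(*dist.items())
        x.extend(random.choices(blocks, weights=probs)[0])
    return x
```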

Good news!

Good statistics work great!

[Figure: relative frequency of 11111 over generations.]

What shall we do next? If we were able to find good statistics with a small overhead, and use them in the UMDA framework, we would be able to solve order-k separable problems using O(D^2) evaluations... and there are many problems of this type. The solution of this problem is closely related to so-called linkage learning, i.e. discovering and using statistical dependencies among variables.

Number of evaluations:
- UMDA with 5-bit building blocks: O(D ln D) (WOW!)
- Hill-Climber: O(D^k ln D), k = 5
- GA with uniform crossover: approx. O(2^D)
- GA with 1-point crossover: similar to uniform crossover

Discrete EDAs

Discrete EDAs: Overview
1. Overview:
   (a) Univariate models (without interactions)
   (b) Bivariate models (pairwise dependencies)
   (c) Multivariate models (higher-order interactions)
2. Conclusions

EDAs without interactions

- Population-based incremental learning (PBIL), Baluja, 1994
- Univariate marginal distribution algorithm (UMDA), Mühlenbein and Paaß, 1996
- Compact genetic algorithm (cGA), Harik, Lobo, Goldberg, 1998

Similarities: all of them use a vector of probabilities.
Differences: PBIL and cGA do not use a population (only the vector p); UMDA does. PBIL and cGA use different rules for the adaptation of p.

Advantages: simplicity; speed; simple simulation of large populations.
Limitations: reliably solves only order-1 decomposable problems.

EDAs with Pairwise Interactions

From single bits to pairwise models

How can we describe two positions together?

Using the joint probability distribution p(A, B): number of free parameters: 3.
p(0,0), p(0,1), p(1,0), p(1,1), which must sum to 1.

Using statistical dependence, p(A, B) = p(B|A) p(A): number of free parameters: 3.
p(B=1 | A=0), p(B=1 | A=1), p(A=1).

Question: what is the number of parameters in the case of models over three variables A, B, C with various dependency structures? [Figure: three example structures over A, B, C.]
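
For answering such counting questions, a small helper (mine, not from the slides) that counts the free parameters of a directed model over binary variables may be useful; each variable contributes one free probability per configuration of its parents:

```python
def n_free_params(parents):
    """Free parameters of a directed model over binary variables;
    parents[i] lists the parent indices of variable i."""
    return sum(2 ** len(pa) for pa in parents)

# A and B independent: 1 + 1 = 2 parameters.
print(n_free_params([[], []]))
# B depends on A: p(A=1), p(B=1|A=0), p(B=1|A=1) -> 3 parameters,
# the same as the full joint over (A, B): 2*2 - 1 = 3.
print(n_free_params([[], [0]]))
```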

Example with pairwise dependencies: a dependency tree

- Nodes: binary variables (loci of the chromosome)
- Edges: dependencies among variables

Features:
- Each node depends on at most one other node.
- The graph does not contain cycles.
- The graph is connected.

Learning the structure of a dependency tree:
1. Score the edges using mutual information:
   I(X, Y) = Σ_{x,y} p(x, y) log [ p(x, y) / (p(x) p(y)) ]
2. Use any algorithm to determine the maximum spanning tree of the graph, e.g. Prim (1957):
   (a) Start building the tree from any node.
   (b) Repeatedly add the node that is connected to the tree by the edge with the maximum score.

Example of dependency tree learning

[Figure: step-by-step growth of the maximum spanning tree over the variables x_1, ..., x_7.]
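
A minimal sketch of this structure-learning step: estimate the mutual information of every pair of positions from the selected parents, then run Prim's algorithm on the complete MI-weighted graph (function names are mine):

```python
import math
from collections import Counter

def mutual_information(data, i, j):
    """Estimate I(X_i, X_j) from a sample of binary chromosomes."""
    n = len(data)
    pij = Counter((x[i], x[j]) for x in data)
    pi = Counter(x[i] for x in data)
    pj = Counter(x[j] for x in data)
    return sum((c / n) * math.log((c / n) / (pi[a] / n * pj[b] / n))
               for (a, b), c in pij.items())

def dependency_tree(data):
    """Prim's algorithm on the complete graph weighted by mutual
    information; returns parent[i] for each variable (None = root)."""
    D = len(data[0])
    mi = [[mutual_information(data, i, j) for j in range(D)] for i in range(D)]
    parent = [None] * D
    in_tree, rest = {0}, set(range(1, D))
    while rest:
        # Add the outside node connected to the tree by the best edge.
        _, i, j = max((mi[i][j], i, j) for i in in_tree for j in rest)
        parent[j] = i
        in_tree.add(j)
        rest.remove(j)
    return parent
```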

Dependency tree: probabilities

[Figure: an example dependency tree.]

Probability and its number of parameters:
- p(x_1 = 1): 1
- p(x_4 = 1 | X_1): 2
- p(x_5 = 1 | X_4): 2
- p(x_2 = 1 | X_4): 2
- p(x_3 = 1 | X_2): 2
- Whole model: 9

EDAs with pairwise interactions
- MIMIC (sequences): Mutual Information Maximization for Input Clustering, de Bonet et al., 1997
- COMIT (trees): Combining Optimizers with Mutual Information Trees, Baluja and Davies, 1997
- BMDA (forests): Bivariate Marginal Distribution Algorithm, Pelikan and Mühlenbein, 1998
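
A sketch of ancestral sampling from such a tree model (1 parameter for the root, 2 per non-root node, matching the 9-parameter count above); the parameter layout is an illustrative assumption:

```python
import random

def sample_tree(parent, p_root, p_cond):
    """Ancestral sampling from a dependency tree.

    parent[i]    : index of the parent of variable i (None for the root)
    p_root       : p(root = 1)                           -> 1 parameter
    p_cond[i][v] : p(x_i = 1 | x_parent(i) = v), v in {0, 1} -> 2 per node
    """
    D = len(parent)
    x = [None] * D
    todo = set(range(D))
    while todo:
        for i in list(todo):
            if parent[i] is None:
                x[i] = 1 if random.random() < p_root else 0
            elif x[parent[i]] is not None:
                x[i] = 1 if random.random() < p_cond[i][x[parent[i]]] else 0
            else:
                continue  # parent not sampled yet, try again later
            todo.remove(i)
    return x
```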

Summary

Advantages: still simple; still fast; can learn something about the structure.
Limitations: reliably solves only order-2 decomposable problems.

EDAs with Multivariate Interactions

ECGA

Extended Compact GA, Harik, 1999.

Marginal Product Model (MPM):
- Variables are treated in groups.
- Variables in different groups are considered statistically independent.
- Each group is modeled by its joint probability distribution.
- The algorithm adaptively searches for the groups during evolution.

Ideal group configurations:
- OneMax: [1][2][3][4][5][6][7][8][9][10]
- 5-bit traps: [1 2 3 4 5][6 7 8 9 10]

Learning the structure:
1. Evaluation metric: Minimum Description Length (MDL).
2. Search procedure: greedy (see the sketch below):
   (a) Start with each variable belonging to its own group.
   (b) Perform the join of two groups which improves the score the most.
   (c) Finish if no join improves the score.
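
A minimal sketch of the greedy group-merging search, written against an abstract score callback (lower is better), for instance the MDL metric defined on the next slide; the function names are mine:

```python
def greedy_mpm_search(n_vars, score):
    """ECGA's greedy search: start from singleton groups and repeatedly
    perform the merge of two groups that improves `score` the most;
    stop when no merge improves it."""
    groups = [[i] for i in range(n_vars)]
    best = score(groups)
    while len(groups) > 1:
        merges = [(score([g for k, g in enumerate(groups) if k not in (i, j)]
                         + [groups[i] + groups[j]]), i, j)
                  for i in range(len(groups))
                  for j in range(i + 1, len(groups))]
        new_best, i, j = min(merges)
        if new_best >= best:
            break
        best = new_best
        groups = ([g for k, g in enumerate(groups) if k not in (i, j)]
                  + [groups[i] + groups[j]])
    return groups
```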

ECGA: Evaluation metric

Minimum description length: minimize the number of bits needed to store the model and the data encoded using the model:
DL(Model, Data) = DL_Model + DL_Data

Model description length: each group g has |g| dimensions, i.e. 2^|g| frequencies; each of them can take on values up to N:
DL_Model = Σ_{g∈G} (2^|g| − 1) log N

Data description length using the model: defined using the entropy of the marginal distributions (X_g is the |g|-dimensional random vector, x_g is its realization):
DL_Data = N Σ_{g∈G} h(X_g) = −N Σ_{g∈G} Σ_{x_g} p(X_g = x_g) log p(X_g = x_g)

BOA

Bayesian Optimization Algorithm: Pelikan, Goldberg, Cantú-Paz, 1999.
- Model: Bayesian network (BN).
- Conditional dependencies (instead of groups).
- Sequences, trees, and forests are special cases of a BN.
- [Figure: Bayesian network learned for the trap function.]

The same model was used independently in:
- Estimation of Bayesian Network Algorithm (EBNA), Etxeberria et al., 1999
- Learning Factorized Distribution Algorithm (LFDA), Mühlenbein et al., 1999
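
The MDL metric above translates into a short score function over a candidate grouping and the selected parents (a sketch; bits are counted with log base 2, and the helper name is mine):

```python
import math
from collections import Counter

def mdl_score(groups, data):
    """MDL score of a marginal product model (lower is better).

    groups : partition of variable indices, e.g. [[0,1,2,3,4], [5,6,7,8,9]]
    data   : list of binary chromosomes (the selected parents)
    """
    N = len(data)
    dl_model = sum((2 ** len(g) - 1) * math.log2(N) for g in groups)
    dl_data = 0.0
    for g in groups:
        counts = Counter(tuple(x[i] for i in g) for x in data)
        # -sum c*log(c/N) equals N times the entropy of the group's marginal.
        dl_data += -sum(c * math.log2(c / N) for c in counts.values())
    return dl_model + dl_data
```

Plugged into the greedy search sketched earlier as score = lambda gs: mdl_score(gs, parents), this yields the whole ECGA model-building step.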

BOA: Learning the structure
1. Evaluation metric: the Bayesian-Dirichlet metric, or the Bayesian information criterion (BIC).
2. Search procedure: greedy:
   (a) Start with a graph with no edges (the univariate marginal product model).
   (b) Perform the one of the following operations which improves the score the most: add an edge; delete an edge; reverse an edge.
   (c) Finish if no operation improves the score.

BOA solves order-k decomposable problems in less than O(D^2) evaluations: n_evals = O(D^1.55) to O(D^2).

Scalability Analysis

Test functions

OneMax:
f_Dx1bitOneMax(x) = Σ_{d=1}^{D} x_d

Trap:
f_DbitTrap(x) = D if u(x) = D, and D − 1 − u(x) otherwise

Equal Pairs:
f_DbitEqualPairs(x) = 1 + Σ_{d=2}^{D} f_EqualPair(x_{d−1}, x_d), where
f_EqualPair(x_1, x_2) = 1 if x_1 = x_2, and 0 otherwise

Sliding XOR:
f_DbitSlidingXOR(x) = 1 + f_AllEqual(x) + Σ_{d=3}^{D} f_XOR(x_{d−2}, x_{d−1}, x_d), where
f_AllEqual(x) = 1 if x = (1, ..., 1) or x = (0, ..., 0), and 0 otherwise,
f_XOR(x_1, x_2, x_3) = 1 if x_1 XOR x_2 = x_3, and 0 otherwise

Concatenated short basis functions:
f_NxKbitBasisFunction(x) = Σ_{k=1}^{N} f_BasisFunction(x_{K(k−1)+1}, ..., x_{Kk})
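
The test functions above are simple to implement; a sketch with names of my choosing (0-based indexing shifts the sums accordingly):

```python
def f_equal_pairs(x):
    """1 + the number of adjacent positions with x_{d-1} = x_d."""
    return 1 + sum(x[d - 1] == x[d] for d in range(1, len(x)))

def f_sliding_xor(x):
    """1 + f_AllEqual(x) + the number of positions with
    x_{d-2} XOR x_{d-1} = x_d."""
    all_equal = 1 if len(set(x)) == 1 else 0
    return 1 + all_equal + sum((x[d - 2] ^ x[d - 1]) == x[d]
                               for d in range(2, len(x)))

def concatenated(f_basis, K, x):
    """Sum of f_basis over consecutive non-overlapping K-bit blocks."""
    return sum(f_basis(x[k:k + K]) for k in range(0, len(x), K))
```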

Test functions (cont.)
1. f 40x1bitOneMax: order-1 decomposable function, no interactions.
2. f 1x40bitEqualPairs: non-decomposable function; weak interactions: the optimal setting of each bit depends on the value of the preceding bit.
3. f 8x5bitEqualPairs: order-5 decomposable function.
4. f 1x40bitSlidingXOR: non-decomposable function; stronger interactions: the optimal setting of each bit depends on the values of the 2 preceding bits.
5. f 8x5bitSlidingXOR: order-5 decomposable function.
6. f 8x5bitTrap: order-5 decomposable function; the interactions in each 5-bit block are very strong, and the basis function is deceptive.

Scalability analysis

Facts:
- Using a small population size, population-based optimizers can solve only easy problems.
- Increasing the population size, the optimizers can solve increasingly harder problems...
- ...but using a too-large population is a waste of resources.

Scalability analysis:
- determines the optimal (smallest) population size with which the algorithm solves the given problem reliably (reliably: the algorithm finds the optimum in 24 out of 25 runs);
- for each problem complexity, the optimal population size is determined e.g. using the bisection method (see the sketch below);
- studies the influence of the problem complexity (dimensionality) on the optimal population size and on the number of evaluations needed.
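
A sketch of the bisection idea, assuming a predicate solves(n) that reports whether the algorithm with population size n finds the optimum reliably (e.g. in 24 of 25 runs); the initial bound and the stopping tolerance are illustrative choices:

```python
def minimal_popsize(solves, low=16):
    """Bisection for the smallest reliable population size."""
    # Double the upper bound until the algorithm succeeds.
    high = low
    while not solves(high):
        high *= 2
    low = high // 2
    # Bisect to roughly 10% precision.
    while high - low > max(1, low // 10):
        mid = (low + high) // 2
        if solves(mid):
            high = mid
        else:
            low = mid
    return high
```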

Scalability on the OneMax function

[Figure: number of evaluations vs. dimension for UMDA, COMIT, ECGA, and BOA.]

Scalability on the non-decomposable Equal Pairs function

[Figure: number of evaluations vs. dimension for UMDA, COMIT, ECGA, and BOA.]

Scalability on the decomposable Equal Pairs function

[Figure: number of evaluations vs. dimension for UMDA, COMIT, ECGA, and BOA.]

Scalability on the non-decomposable Sliding XOR function

[Figure: number of evaluations vs. dimension for UMDA, COMIT, ECGA, and BOA.]

Scalability on the decomposable Sliding XOR function

[Figure: number of evaluations vs. dimension for UMDA, COMIT, ECGA, and BOA.]

Scalability on the decomposable Trap function

[Figure: number of evaluations vs. dimension for UMDA, COMIT, ECGA, and BOA.]

Model structure during evolution

"During the evolution, the model structure becomes increasingly precise, and at the end of the evolution the model structure describes the problem structure exactly." NO! That's not true! Why?
- In the beginning, the distribution patterns are not very discernible, and models similar to the uniform distribution are used.
- In the end, the population converges and contains many copies of the same individual (or of a few individuals). No interactions among variables can be learned: the model structure is wrong (all bits independent), but the model describes the position of the optimum very precisely.
- The model with the best-matching structure is found somewhere in the middle of the evolution.

Even though the right structure is never found during the evolution, the problem can be solved successfully.

Conclusions

Summary

Models:
- Bayesian networks are general models of a joint probability distribution.
- High-dimensional models are hard to train.
- High-dimensional models are very flexible.

Advantages: reliably solves problems decomposable into subproblems of bounded order.
Limitations: does not solve problems decomposable into subproblems of logarithmic size (hierarchical problems).

Suggestions for discrete EDAs
- For simple problems: PBIL, UMDA, cGA. They behave similarly to simple GAs.
- For harder problems: MIMIC, COMIT, BMDA. They are able to account for bivariate dependencies.
- For hard problems: BOA, ECGA, EBNA, LFDA. They can take more general dependencies into account.
- For even harder problems (e.g. problems with a hierarchical structure): hBOA (hierarchical BOA).
