Fundamentals of Metaheuristics

1 Fundamentals of Metaheuristics Part I - Basic concepts and Single-State Methods A seminar for Neural Networks Simone Scardapane Academic year

2 ABOUT THIS SEMINAR
The seminar is divided into three main parts of 2 hours each:
1. Part I (this one): Basic concepts and single-state algorithms (Simulated Annealing and Tabu Search).
2. Part II: Population algorithms (Evolutionary Methods and Particle Swarm Optimization).
3. Part III: Advanced topics, including multi-objective optimization and parallelization.

5 REFERENCE MATERIAL
Slides are self-contained (as much as possible!). If you want to expand on the subject: Essentials of Metaheuristics (Sean Luke), available online at sean/book/metaheuristics/. Part of this seminar is built on it. Selected papers are listed at the end of each Part.

6 PROJECTS
Metaheuristic optimization is a really vast field, which leaves room for a great range of small projects. Contact me at simonescardapane [at] gmail [dot] com if you're interested. You can also find me at the ISPAMM lab at DIET.

7 TABLE OF CONTENTS
INTRODUCTION
  Definition: what are metaheuristics?
  Applicability
BASIC CONCEPTS
  Hill Climbing
  Random Search
  Exploration vs. Exploitation
  Model representation
SIMULATED ANNEALING
  General Algorithm
  Cooling functions
  Variations
TABU SEARCH
  Algorithm Description
  Feature-based Tabu Search

11 WHAT IS A METAHEURISTIC?
A metaheuristic is an algorithm for global approximation that employs a certain degree of randomness (it belongs to the wider class of stochastic optimizers) and makes as few assumptions as possible.
Metaheuristics are also known as black box optimizers.
They are divided into single-state methods (only one solution is analyzed at a time) and population methods.

12 DRAWBACKS Some of the main downsides of metaheuristics are: 1. No guarantee on global convergence 2. Hard to study in general 3. Difficult to choose the right one

15 WHEN CAN YOU USE THEM
To apply a metaheuristic, only two elements are needed:
1. A representation of your hypothesis space H.
2. The capability of assessing the goodness (or badness) of a hypothesis h ∈ H. This is equivalent to having an objective function f(h).
There is no need for an explicit representation of the objective function, nor for any constraints on H.

16 WHEN SHOULD YOU USE THEM
Metaheuristics are useful only when at least one of the following statements is true:
1. There is no explicit representation of the objective function (e.g. a soccer robot player).
2. It is hard (or impossible) to compute first- and second-order derivatives.
3. H is too vast to be searched thoroughly.
In any other case, using a metaheuristic is overkill.

17 WHY SHOULD YOU CARE
Metaheuristics have a lot in common with machine learning, so it should not be surprising that they also have many possible applications there. For example, in the case of neural networks, you can use a metaheuristic as a tool for:
1. Learning the topology of a network.
2. Pruning a network to improve performance and generalization capabilities.
3. Learning the weights as an alternative to backpropagation.

22 HILL CLIMBING
Maybe the simplest possible algorithm:
1. Choose a starting hypothesis h.
2. Randomly tweak h to get a similar hypothesis h_2 = tweak(h). Sometimes h_2 is known as the neighbour of h.
3. If f(h_2) < f(h), keep h_2; otherwise keep h.
4. Repeat steps 2-3 until f(h) stops improving, or a certain amount of time has passed.
This is for minimizing f. Maximizing f(h) is equivalent to minimizing -f(h): you just change the sign.
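
The Hill Climbing steps above can be sketched in a few lines of Python. This is only a minimal illustration: the names `hill_climb`, `gaussian_tweak` and `shifted_square`, the Gaussian tweak and the fixed iteration budget are assumptions made for the example, not part of the seminar.

```python
import random


def hill_climb(f, h, tweak, max_iters=10_000, rng=None):
    """Minimize f starting from hypothesis h (illustrative sketch)."""
    rng = rng or random.Random()
    f_h = f(h)
    for _ in range(max_iters):
        h2 = tweak(h, rng)      # step 2: a neighbour of h
        f_h2 = f(h2)
        if f_h2 < f_h:          # step 3: keep h2 only if it is strictly better
            h, f_h = h2, f_h2
    return h, f_h


def gaussian_tweak(h, rng, sigma=0.1):
    """Hypothetical tweak for real-valued hypotheses: add small Gaussian noise."""
    return h + rng.gauss(0.0, sigma)


def shifted_square(x):
    """Toy objective with a single minimum at x = 3."""
    return (x - 3.0) ** 2


h, score = hill_climb(shifted_square, 0.0, gaussian_tweak, rng=random.Random(0))
```

On this toy objective there is only one minimum, so the search cannot get stuck; the stopping rule here is a fixed budget rather than "until f(h) stops improving".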

24 LOCAL MINIMA Note the similarities with gradient descent. Hill Climbing is highly sensitive to local minima! A possible solution is random restarts. Then, with infinite time, it will converge to the global optimum.
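
A possible sketch of Hill Climbing with random restarts follows. The function names, the one-dimensional toy objective `bumpy` and all parameter values are assumptions chosen for illustration; the number of restarts is finite here, so there is still no guarantee of finding the global optimum.

```python
import math
import random


def hill_climb_1d(f, h, rng, steps=500, sigma=0.1):
    """Plain Hill Climbing on a real-valued hypothesis (minimization)."""
    f_h = f(h)
    for _ in range(steps):
        h2 = h + rng.gauss(0.0, sigma)
        f_h2 = f(h2)
        if f_h2 < f_h:
            h, f_h = h2, f_h2
    return h, f_h


def random_restarts(f, low, high, restarts=20, seed=0):
    """Run Hill Climbing from several random starting points, keep the best."""
    rng = random.Random(seed)
    best_h, best_f = None, math.inf
    for _ in range(restarts):
        start = rng.uniform(low, high)      # fresh random starting hypothesis
        h, f_h = hill_climb_1d(f, start, rng)
        if f_h < best_f:
            best_h, best_f = h, f_h
    return best_h, best_f


def bumpy(x):
    """Toy objective with several local minima."""
    return x * x + 10.0 * math.sin(3.0 * x)


best_h, best_f = random_restarts(bumpy, -5.0, 5.0)
```

Each restart ends in whatever basin its starting point falls into; only the comparison across restarts lets the search leave a poor local minimum behind.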

25 BIAS AND ASSUMPTIONS
What are the assumptions of Hill Climbing? The only one is smoothness: similar solutions behave in a similar way. This makes sense: what happens when this assumption is violated?

26 RANDOM SEARCH Without a-priori knowledge, the only possibility is Random Search. There is actually a large family of random search algorithms.
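
As an illustration, the simplest member of this family samples hypotheses independently and remembers the best one seen so far. The sketch below assumes a user-supplied `sample` function that draws a hypothesis at random; both that name and the uniform sampling in the example are assumptions.

```python
import random


def random_search(f, sample, iters=1000, seed=0):
    """Pure Random Search for minimization: no tweak, no neighbourhood."""
    rng = random.Random(seed)
    best_h = sample(rng)
    best_f = f(best_h)
    for _ in range(iters - 1):
        h = sample(rng)         # every candidate is drawn from scratch
        f_h = f(h)
        if f_h < best_f:
            best_h, best_f = h, f_h
    return best_h, best_f


best_h, best_f = random_search(
    lambda x: (x - 3.0) ** 2,               # toy objective
    lambda rng: rng.uniform(-10.0, 10.0),   # uniform sampling over [-10, 10]
)
```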

27 EXPLORATION VS. EXPLOITATION Why are these basic concepts? Almost every metaheuristic is an intelligent combination of Hill Climbing and Random Search. Why is that? They need to balance between an exploitive behaviour and an explorative one.

31 THE CLASSICAL DILEMMA
Exploitation means using the current knowledge to find a (possibly sub-optimal) solution.
Exploration means taking a chance in the hypothesis space.
Random Search is purely explorative, while Hill Climbing is purely exploitative. We will see many variations throughout the seminar.

36 HYPOTHESIS SPACE SELECTION
How can you represent your hypothesis? There is a vast range of possibilities:
1. Real numbers, integers, vectors,
2. Strings,
3. Trees, graphs,
4. Rules...
For each possible choice, there are many neighbouring choices (different tweak functions).

38 ALGORITHM SELECTION
The third step is the choice of the metaheuristic itself. However, remember that choosing the right space is as important as choosing the right metaheuristic. Sadly, this is much more an art than a science, and depends on experience and intuition. That's why we're here, by the way.

42 A-PRIORI INFORMATION
In case you possess additional information about your target solution, as a final step you can customize your algorithm. There are various techniques:
1. You can use a multi-objective function (see Part III).
2. You can include some bias in the tweak function.
3. Other possibilities given by the metaheuristic itself (we discuss these case-by-case).

43 SIMULATED ANNEALING
Same idea as Hill Climbing, however:
1. If f(h_2) ≤ f(h), we keep h_2 as in Hill Climbing.
2. If f(h_2) > f(h), we still keep it with probability
$p(h, h_2, t) = e^{-\frac{f(h_2) - f(h)}{t}}$
Here t is known as the temperature, and the rule by which it decreases over time as the schedule. This is also known as the Metropolis criterion.
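
A minimal sketch of the acceptance rule and of the resulting loop is given below, reading the probability as exp(-(f(h_2) - f(h)) / t) for a minimization problem. The function names, the exponential cooling with factor `alpha`, the tracking of the best hypothesis seen so far and the toy objective are all assumptions made for the example.

```python
import math
import random


def accept_probability(f_h, f_h2, t):
    """Metropolis criterion for minimization."""
    if f_h2 <= f_h:
        return 1.0                      # better or equal: always keep h2
    if t <= 0.0:
        return 0.0                      # zero temperature: plain Hill Climbing
    return math.exp(-(f_h2 - f_h) / t)  # worse: keep h2 with this probability


def simulated_annealing(f, h, tweak, t0=1.0, alpha=0.99, iters=5000, seed=0):
    """Simulated Annealing sketch; returns the best hypothesis seen."""
    rng = random.Random(seed)
    f_h = f(h)
    best_h, best_f = h, f_h
    t = t0
    for _ in range(iters):
        h2 = tweak(h, rng)
        f_h2 = f(h2)
        if rng.random() < accept_probability(f_h, f_h2, t):
            h, f_h = h2, f_h2
            if f_h < best_f:
                best_h, best_f = h, f_h
        t *= alpha                      # assumed schedule: exponential cooling
    return best_h, best_f


def bumpy(x):
    """Toy objective with several local minima."""
    return x * x + 10.0 * math.sin(3.0 * x)


best_h, best_f = simulated_annealing(
    bumpy, 4.0, lambda h, rng: h + rng.gauss(0.0, 0.5)
)
```

Note that the current hypothesis may get worse during the run, which is why this sketch keeps a separate record of the best one.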

44 EXPLANATION
Inspired by the annealing process in metallurgy. The temperature decreases with time, down to 0, thus decreasing the probability of accepting a worse hypothesis. If the cooling schedule is slow enough, the algorithm is proven to converge to the global optimum (a result that is not useful in practice). The proof uses Markov chains.

45 HEAT AND NOISE
The heat works as noise injected into Hill Climbing to avoid getting trapped in local minima. Thanks to Rohit Ray and for the image.

46 EXAMPLES OF COOLING FUNCTION
1. $t_k = \alpha^k t_0$. This results in an exponential cooling schedule.
2. $t_k = t_0 - \alpha k$, also known as a linear schedule.
3. $t_k = t_0 \left(1 - \frac{k}{K}\right)^{\alpha}$.
See [6] for a small review of classical strategies.
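
The three schedules can be written as small functions of the iteration index k. The sketch reads them as t_k = alpha^k t_0, t_k = t_0 - alpha k and t_k = t_0 (1 - k/K)^alpha; the function names are hypothetical, and clamping the linear schedule at zero is an added assumption so that the temperature never becomes negative.

```python
def exponential_cooling(t0, alpha, k):
    """t_k = alpha^k * t0, with 0 < alpha < 1."""
    return (alpha ** k) * t0


def linear_cooling(t0, alpha, k):
    """t_k = t0 - alpha * k, clamped at 0 (the clamp is an assumption)."""
    return max(t0 - alpha * k, 0.0)


def polynomial_cooling(t0, alpha, k, K):
    """t_k = t0 * (1 - k/K)^alpha, reaching 0 at the final iteration K."""
    return t0 * (1.0 - k / K) ** alpha


# Temperatures at a few iterations, for t0 = 10 and a horizon of K = 10
schedule = [
    (k,
     exponential_cooling(10.0, 0.5, k),
     linear_cooling(10.0, 1.0, k),
     polynomial_cooling(10.0, 2.0, k, 10))
    for k in (0, 5, 10)
]
```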

47 EXAMPLES OF COOLING FUNCTION /2
Figure: Comparison of three different cooling strategies.

48 GENERAL BEHAVIOUR Simulated Annealing has a strong explorative behaviour in the beginning, but switches to a strongly exploitative one toward the end. Sometimes the choice of the scheduling function is not trivial. For this reason, methods have been devised for adjusting it: see for example the Thermodynamic Simulated Annealing.

49 SIMULATED ANNEALING IN MATLAB
A widely used implementation is available in the Global Optimization Toolbox. Usage: simulannealbnd(objectivefunction, StartingPoint). Many options are available: see the documentation on the MathWorks website. By default it implements a reannealing technique (it raises the temperature at certain points).

50 VARIATIONS
Some variations of Simulated Annealing use a probabilistic criterion even if the new hypothesis is better. In Threshold Accepting (for minimization), the new solution is kept if
$f(h_2) - f(h) < Q(k)$
where Q(k) is the threshold value at iteration k, generally taken as a monotonically decreasing function. This eliminates the need for a random number generator in the acceptance step.
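
A sketch of Threshold Accepting for minimization follows, assuming the acceptance test f(h_2) - f(h) < Q(k) with a geometrically decreasing threshold Q(k) = q0 * decay^k. The names, the threshold schedule and the toy objective are illustrative assumptions. The random generator is used only by the tweak; the acceptance decision itself is deterministic.

```python
import math
import random


def threshold_accepting(f, h, tweak, q0=1.0, decay=0.995, iters=5000, seed=0):
    """Threshold Accepting sketch; returns the best hypothesis seen."""
    rng = random.Random(seed)   # needed by the tweak only
    f_h = f(h)
    best_h, best_f = h, f_h
    q = q0
    for _ in range(iters):
        h2 = tweak(h, rng)
        f_h2 = f(h2)
        if f_h2 - f_h < q:      # accept anything not worse by q or more
            h, f_h = h2, f_h2
            if f_h < best_f:
                best_h, best_f = h, f_h
        q *= decay              # assumed monotonically decreasing threshold
    return best_h, best_f


def bumpy(x):
    """Toy objective with several local minima."""
    return x * x + 10.0 * math.sin(3.0 * x)


best_h, best_f = threshold_accepting(
    bumpy, 4.0, lambda h, rng: h + rng.gauss(0.0, 0.5)
)
```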

51 TABU SEARCH
Again, same as Hill Climbing, but with a small twist. It keeps track of the last N visited elements and marks them as taboo. When candidate hypotheses are evaluated, taboo elements are discarded.
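
One common way to realize this idea on a discrete space is sketched below: at each step the search moves to the best neighbour that is not taboo, even if it is worse than the current hypothesis. This is one variant among several, not necessarily the one intended in the seminar; the function names and the toy `landscape` are assumptions.

```python
from collections import deque


def tabu_search(f, h, neighbours, tabu_size=10, iters=200):
    """Tabu Search sketch for minimization on a discrete space."""
    tabu = deque([h], maxlen=tabu_size)     # last N visited elements
    best_h, best_f = h, f(h)
    for _ in range(iters):
        candidates = [n for n in neighbours(h) if n not in tabu]
        if not candidates:
            break                           # every neighbour is taboo
        h = min(candidates, key=f)          # best non-taboo neighbour
        tabu.append(h)                      # oldest entry drops out when full
        f_h = f(h)
        if f_h < best_f:
            best_h, best_f = h, f_h
    return best_h, best_f


# Toy problem: hypotheses are indices into a list of costs.
landscape = [5, 3, 4, 6, 2, 1, 0, 3]


def cost(i):
    return landscape[i]


def index_neighbours(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]


best_h, best_f = tabu_search(cost, 0, index_neighbours)
```

Starting from index 0, plain Hill Climbing would stop at index 1 (cost 3), whose neighbours are both worse. Because index 0 is taboo, the search is pushed forward through the worse indices 2 and 3 and eventually reaches index 6.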

52 TABOO AND MEMORY
Despite its simplicity, it is known for being efficient in many practical applications. It tries to be as exploitative as possible without being trapped in local minima. It is like having a primitive form of memory. Many variations exist; we look at two here.

53 CONTINUOUS FUNCTIONS
If the function is continuous, we may also want to discard elements that are sufficiently similar to a taboo one. The similarity measure depends on the problem; for example, an L_p norm can be used:
$L_p(h, h_2) = \left( \sum_i |h_{2,i} - h_i|^p \right)^{1/p}$
(However, with continuous spaces it would be better to use other metaheuristics.)
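
A small sketch of this similarity test: a candidate is treated as taboo when its L_p distance to any stored element is below a radius. The names `lp_distance` and `is_taboo`, and the `radius` parameter, are assumptions introduced for the example.

```python
def lp_distance(h, h2, p=2.0):
    """L_p distance between two hypotheses given as equal-length sequences."""
    return sum(abs(b - a) ** p for a, b in zip(h, h2)) ** (1.0 / p)


def is_taboo(h, tabu_list, radius, p=2.0):
    """True if h lies within `radius` of any element of the tabu list."""
    return any(lp_distance(h, t, p) < radius for t in tabu_list)


visited = [(0.0, 0.0), (2.0, 2.0)]
near = is_taboo((0.05, 0.0), visited, radius=0.1)   # close to (0, 0)
far = is_taboo((1.0, 1.0), visited, radius=0.1)     # far from both
```

The radius is problem-dependent: too small and the list has no effect, too large and whole regions of the space become unreachable.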

54 FEATURE-BASED TABU SEARCH
If the search space is too vast, we may want to mark as taboo not a single hypothesis, but the change made to it. A classical example is the Traveling Salesman Problem, where the hypothesis is a path on the current graph. Whenever you delete the edge from A to B, you mark it as taboo. Then, for a given number of iterations, the edge cannot be added again. Some categorize this as intermediate-term memory, as opposed to the classical tabu list (short-term memory).
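
The bookkeeping for the Traveling Salesman example could look like the sketch below. It assumes tours are closed and given as lists of city labels, edges are undirected, and a deleted edge stays taboo for `tenure` iterations; all names are hypothetical.

```python
def tour_edges(tour):
    """Undirected edges of a closed tour, each as a frozenset of two cities."""
    n = len(tour)
    return {frozenset((tour[i], tour[(i + 1) % n])) for i in range(n)}


def update_tabu(tabu_until, old_tour, new_tour, iteration, tenure):
    """Mark every edge deleted by the move old_tour -> new_tour as taboo."""
    for edge in tour_edges(old_tour) - tour_edges(new_tour):
        tabu_until[edge] = iteration + tenure


def is_move_allowed(tabu_until, old_tour, new_tour, iteration):
    """Reject a move that adds back an edge which is still taboo."""
    added = tour_edges(new_tour) - tour_edges(old_tour)
    return all(tabu_until.get(edge, -1) < iteration for edge in added)


tabu_until = {}
old_tour = ["A", "B", "C", "D"]
new_tour = ["A", "C", "B", "D"]     # this move deletes edges A-B and C-D
update_tabu(tabu_until, old_tour, new_tour, iteration=0, tenure=3)

undo_now = is_move_allowed(tabu_until, new_tour, old_tour, iteration=1)
undo_later = is_move_allowed(tabu_until, new_tour, old_tour, iteration=4)
```

Undoing the move would add back A-B and C-D, so it is rejected at iteration 1 and allowed again at iteration 4, once the tenure has expired. The memory stores a handful of edges rather than whole tours.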

55 SELECTED BIBLIOGRAPHY I
[1] Rachid Chelouah and Patrick Siarry. Tabu search applied to global optimization. European Journal of Operational Research, 123(2).
[2] Fred Glover and Manuel Laguna. Tabu Search. Kluwer Academic Publishers, Norwell, MA, USA.
[3] L. Ingber. Simulated Annealing: Practice versus Theory. Mathematical and Computer Modelling, 18(11):29-57.
[4] W. G. Macready and D. H. Wolpert, II. Bandit problems and the exploration/exploitation tradeoff. Trans. Evol. Comp, 2(1):2-22, April 1998.

56 SELECTED BIBLIOGRAPHY II
[5] Debasis Mitra, Fabio Romeo, and Alberto S. Vincentelli. Convergence and Finite-Time Behavior of Simulated Annealing. Advances in Applied Probability, 18(3).
[6] Yaghout Nourani and Bjarne Andresen. A comparison of simulated annealing cooling strategies. Journal of Physics A: Mathematical and General, 31(41):8373, 1998.


More information

Simulated Annealing. Local Search. Cost function. Solution space

Simulated Annealing. Local Search. Cost function. Solution space Simulated Annealing Hill climbing Simulated Annealing Local Search Cost function? Solution space Annealing Annealing is a thermal process for obtaining low energy states of a solid in a heat bath. The

More information

Artificial Intelligence Heuristic Search Methods

Artificial Intelligence Heuristic Search Methods Artificial Intelligence Heuristic Search Methods Chung-Ang University, Jaesung Lee The original version of this content is created by School of Mathematics, University of Birmingham professor Sandor Zoltan

More information

Reinforcement Learning Active Learning

Reinforcement Learning Active Learning Reinforcement Learning Active Learning Alan Fern * Based in part on slides by Daniel Weld 1 Active Reinforcement Learning So far, we ve assumed agent has a policy We just learned how good it is Now, suppose

More information

Enumeration Schemes for Words Avoiding Permutations

Enumeration Schemes for Words Avoiding Permutations Enumeration Schemes for Words Avoiding Permutations Lara Pudwell November 27, 2007 Abstract The enumeration of permutation classes has been accomplished with a variety of techniques. One wide-reaching

More information

Basics of reinforcement learning

Basics of reinforcement learning Basics of reinforcement learning Lucian Buşoniu TMLSS, 20 July 2018 Main idea of reinforcement learning (RL) Learn a sequential decision policy to optimize the cumulative performance of an unknown system

More information

Local Search and Optimization

Local Search and Optimization Local Search and Optimization Outline Local search techniques and optimization Hill-climbing Gradient methods Simulated annealing Genetic algorithms Issues with local search Local search and optimization

More information

( ) ( ) ( ) ( ) Simulated Annealing. Introduction. Pseudotemperature, Free Energy and Entropy. A Short Detour into Statistical Mechanics.

( ) ( ) ( ) ( ) Simulated Annealing. Introduction. Pseudotemperature, Free Energy and Entropy. A Short Detour into Statistical Mechanics. Aims Reference Keywords Plan Simulated Annealing to obtain a mathematical framework for stochastic machines to study simulated annealing Parts of chapter of Haykin, S., Neural Networks: A Comprehensive

More information

Lecture 21: Spectral Learning for Graphical Models

Lecture 21: Spectral Learning for Graphical Models 10-708: Probabilistic Graphical Models 10-708, Spring 2016 Lecture 21: Spectral Learning for Graphical Models Lecturer: Eric P. Xing Scribes: Maruan Al-Shedivat, Wei-Cheng Chang, Frederick Liu 1 Motivation

More information

Simulated Annealing. 2.1 Introduction

Simulated Annealing. 2.1 Introduction Simulated Annealing 2 This chapter is dedicated to simulated annealing (SA) metaheuristic for optimization. SA is a probabilistic single-solution-based search method inspired by the annealing process in

More information

12. LOCAL SEARCH. gradient descent Metropolis algorithm Hopfield neural networks maximum cut Nash equilibria

12. LOCAL SEARCH. gradient descent Metropolis algorithm Hopfield neural networks maximum cut Nash equilibria Coping With NP-hardness Q. Suppose I need to solve an NP-hard problem. What should I do? A. Theory says you re unlikely to find poly-time algorithm. Must sacrifice one of three desired features. Solve

More information

Administration. Registration Hw3 is out. Lecture Captioning (Extra-Credit) Scribing lectures. Questions. Due on Thursday 10/6

Administration. Registration Hw3 is out. Lecture Captioning (Extra-Credit) Scribing lectures. Questions. Due on Thursday 10/6 Administration Registration Hw3 is out Due on Thursday 10/6 Questions Lecture Captioning (Extra-Credit) Look at Piazza for details Scribing lectures With pay; come talk to me/send email. 1 Projects Projects

More information

Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm

Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm Michail G. Lagoudakis Department of Computer Science Duke University Durham, NC 2778 mgl@cs.duke.edu

More information

AN INTRODUCTION TO NEURAL NETWORKS. Scott Kuindersma November 12, 2009

AN INTRODUCTION TO NEURAL NETWORKS. Scott Kuindersma November 12, 2009 AN INTRODUCTION TO NEURAL NETWORKS Scott Kuindersma November 12, 2009 SUPERVISED LEARNING We are given some training data: We must learn a function If y is discrete, we call it classification If it is

More information

Ant Colony Optimization: an introduction. Daniel Chivilikhin

Ant Colony Optimization: an introduction. Daniel Chivilikhin Ant Colony Optimization: an introduction Daniel Chivilikhin 03.04.2013 Outline 1. Biological inspiration of ACO 2. Solving NP-hard combinatorial problems 3. The ACO metaheuristic 4. ACO for the Traveling

More information

Discrete evaluation and the particle swarm algorithm

Discrete evaluation and the particle swarm algorithm Volume 12 Discrete evaluation and the particle swarm algorithm Tim Hendtlass and Tom Rodgers Centre for Intelligent Systems and Complex Processes Swinburne University of Technology P. O. Box 218 Hawthorn

More information

Unit 8: Introduction to neural networks. Perceptrons

Unit 8: Introduction to neural networks. Perceptrons Unit 8: Introduction to neural networks. Perceptrons D. Balbontín Noval F. J. Martín Mateos J. L. Ruiz Reina A. Riscos Núñez Departamento de Ciencias de la Computación e Inteligencia Artificial Universidad

More information

Σ N (d i,p z i,p ) 2 (1)

Σ N (d i,p z i,p ) 2 (1) A CLASSICAL ALGORITHM FOR AVOIDING LOCAL MINIMA D Gorse and A Shepherd Department of Computer Science University College, Gower Street, London WC1E 6BT, UK J G Taylor Department of Mathematics King s College,

More information

Neural Networks for Machine Learning. Lecture 11a Hopfield Nets

Neural Networks for Machine Learning. Lecture 11a Hopfield Nets Neural Networks for Machine Learning Lecture 11a Hopfield Nets Geoffrey Hinton Nitish Srivastava, Kevin Swersky Tijmen Tieleman Abdel-rahman Mohamed Hopfield Nets A Hopfield net is composed of binary threshold

More information

Advanced computational methods X Selected Topics: SGD

Advanced computational methods X Selected Topics: SGD Advanced computational methods X071521-Selected Topics: SGD. In this lecture, we look at the stochastic gradient descent (SGD) method 1 An illustrating example The MNIST is a simple dataset of variety

More information

Lecture 5: Logistic Regression. Neural Networks

Lecture 5: Logistic Regression. Neural Networks Lecture 5: Logistic Regression. Neural Networks Logistic regression Comparison with generative models Feed-forward neural networks Backpropagation Tricks for training neural networks COMP-652, Lecture

More information

Connections between score matching, contrastive divergence, and pseudolikelihood for continuous-valued variables. Revised submission to IEEE TNN

Connections between score matching, contrastive divergence, and pseudolikelihood for continuous-valued variables. Revised submission to IEEE TNN Connections between score matching, contrastive divergence, and pseudolikelihood for continuous-valued variables Revised submission to IEEE TNN Aapo Hyvärinen Dept of Computer Science and HIIT University

More information

MODULE -4 BAYEIAN LEARNING

MODULE -4 BAYEIAN LEARNING MODULE -4 BAYEIAN LEARNING CONTENT Introduction Bayes theorem Bayes theorem and concept learning Maximum likelihood and Least Squared Error Hypothesis Maximum likelihood Hypotheses for predicting probabilities

More information

Support Vector Machines: Training with Stochastic Gradient Descent. Machine Learning Fall 2017

Support Vector Machines: Training with Stochastic Gradient Descent. Machine Learning Fall 2017 Support Vector Machines: Training with Stochastic Gradient Descent Machine Learning Fall 2017 1 Support vector machines Training by maximizing margin The SVM objective Solving the SVM optimization problem

More information

Relationship between Least Squares Approximation and Maximum Likelihood Hypotheses

Relationship between Least Squares Approximation and Maximum Likelihood Hypotheses Relationship between Least Squares Approximation and Maximum Likelihood Hypotheses Steven Bergner, Chris Demwell Lecture notes for Cmpt 882 Machine Learning February 19, 2004 Abstract In these notes, a

More information

Stochastic processes. MAS275 Probability Modelling. Introduction and Markov chains. Continuous time. Markov property

Stochastic processes. MAS275 Probability Modelling. Introduction and Markov chains. Continuous time. Markov property Chapter 1: and Markov chains Stochastic processes We study stochastic processes, which are families of random variables describing the evolution of a quantity with time. In some situations, we can treat

More information

Chapter 4 Beyond Classical Search 4.1 Local search algorithms and optimization problems

Chapter 4 Beyond Classical Search 4.1 Local search algorithms and optimization problems Chapter 4 Beyond Classical Search 4.1 Local search algorithms and optimization problems CS4811 - Artificial Intelligence Nilufer Onder Department of Computer Science Michigan Technological University Outline

More information

A pruning pattern list approach to the permutation flowshop scheduling problem

A pruning pattern list approach to the permutation flowshop scheduling problem A pruning pattern list approach to the permutation flowshop scheduling problem Takeshi Yamada NTT Communication Science Laboratories, 2-4 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-02, JAPAN E-mail :

More information

Computational Graphs, and Backpropagation

Computational Graphs, and Backpropagation Chapter 1 Computational Graphs, and Backpropagation (Course notes for NLP by Michael Collins, Columbia University) 1.1 Introduction We now describe the backpropagation algorithm for calculation of derivatives

More information

Development of Stochastic Artificial Neural Networks for Hydrological Prediction

Development of Stochastic Artificial Neural Networks for Hydrological Prediction Development of Stochastic Artificial Neural Networks for Hydrological Prediction G. B. Kingston, M. F. Lambert and H. R. Maier Centre for Applied Modelling in Water Engineering, School of Civil and Environmental

More information

AI Programming CS F-20 Neural Networks

AI Programming CS F-20 Neural Networks AI Programming CS662-2008F-20 Neural Networks David Galles Department of Computer Science University of San Francisco 20-0: Symbolic AI Most of this class has been focused on Symbolic AI Focus or symbols

More information

Dreem Challenge report (team Bussanati)

Dreem Challenge report (team Bussanati) Wavelet course, MVA 04-05 Simon Bussy, simon.bussy@gmail.com Antoine Recanati, arecanat@ens-cachan.fr Dreem Challenge report (team Bussanati) Description and specifics of the challenge We worked on the

More information

Max Margin-Classifier

Max Margin-Classifier Max Margin-Classifier Oliver Schulte - CMPT 726 Bishop PRML Ch. 7 Outline Maximum Margin Criterion Math Maximizing the Margin Non-Separable Data Kernels and Non-linear Mappings Where does the maximization

More information

NEAREST NEIGHBOR CLASSIFICATION WITH IMPROVED WEIGHTED DISSIMILARITY MEASURE

NEAREST NEIGHBOR CLASSIFICATION WITH IMPROVED WEIGHTED DISSIMILARITY MEASURE THE PUBLISHING HOUSE PROCEEDINGS OF THE ROMANIAN ACADEMY, Series A, OF THE ROMANIAN ACADEMY Volume 0, Number /009, pp. 000 000 NEAREST NEIGHBOR CLASSIFICATION WITH IMPROVED WEIGHTED DISSIMILARITY MEASURE

More information

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)

More information

Markov Chain Monte Carlo. Simulated Annealing.

Markov Chain Monte Carlo. Simulated Annealing. Aula 10. Simulated Annealing. 0 Markov Chain Monte Carlo. Simulated Annealing. Anatoli Iambartsev IME-USP Aula 10. Simulated Annealing. 1 [RC] Stochastic search. General iterative formula for optimizing

More information

Computational Intelligence in Product-line Optimization

Computational Intelligence in Product-line Optimization Computational Intelligence in Product-line Optimization Simulations and Applications Peter Kurz peter.kurz@tns-global.com June 2017 Restricted use Restricted use Computational Intelligence in Product-line

More information

CS 570: Machine Learning Seminar. Fall 2016

CS 570: Machine Learning Seminar. Fall 2016 CS 570: Machine Learning Seminar Fall 2016 Class Information Class web page: http://web.cecs.pdx.edu/~mm/mlseminar2016-2017/fall2016/ Class mailing list: cs570@cs.pdx.edu My office hours: T,Th, 2-3pm or

More information

Reservoir Computing and Echo State Networks

Reservoir Computing and Echo State Networks An Introduction to: Reservoir Computing and Echo State Networks Claudio Gallicchio gallicch@di.unipi.it Outline Focus: Supervised learning in domain of sequences Recurrent Neural networks for supervised

More information

ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale Learning

ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale Learning ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale Learning Topics Summary of Class Advanced Topics Dhruv Batra Virginia Tech HW1 Grades Mean: 28.5/38 ~= 74.9%

More information

Week Cuts, Branch & Bound, and Lagrangean Relaxation

Week Cuts, Branch & Bound, and Lagrangean Relaxation Week 11 1 Integer Linear Programming This week we will discuss solution methods for solving integer linear programming problems. I will skip the part on complexity theory, Section 11.8, although this is

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Neural Networks Varun Chandola x x 5 Input Outline Contents February 2, 207 Extending Perceptrons 2 Multi Layered Perceptrons 2 2. Generalizing to Multiple Labels.................

More information

Statistical Computing (36-350)

Statistical Computing (36-350) Statistical Computing (36-350) Lecture 19: Optimization III: Constrained and Stochastic Optimization Cosma Shalizi 30 October 2013 Agenda Constraints and Penalties Constraints and penalties Stochastic

More information

HOPFIELD neural networks (HNNs) are a class of nonlinear

HOPFIELD neural networks (HNNs) are a class of nonlinear IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 52, NO. 4, APRIL 2005 213 Stochastic Noise Process Enhancement of Hopfield Neural Networks Vladimir Pavlović, Member, IEEE, Dan Schonfeld,

More information