CS 331: Artificial Intelligence
Local Search II


Local Beam Search

Keeps track of k states rather than just 1. Start with k randomly generated states. (The slides walk through a Travelling Salesman Problem example with k = 2.)

Local Beam Search Example

Generate all successors of all k states. None of these is a goal state, so we continue: select the best k successors from the complete list, and repeat the process until a goal is found.
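A minimal sketch of this loop in Python, to make the bookkeeping concrete. It is illustrative only: random_state, successors, value, and is_goal are assumed, problem-specific callables that the slides do not define.

    def local_beam_search(random_state, successors, value, is_goal, k=2, max_iters=1000):
        # Start with k randomly generated states.
        beam = [random_state() for _ in range(k)]
        for _ in range(max_iters):
            # Generate all successors of all k states, pooled into one list.
            candidates = [s for state in beam for s in successors(state)]
            if not candidates:
                break
            goals = [s for s in candidates if is_goal(s)]
            if goals:
                return goals[0]                 # stop as soon as a goal appears
            # Select the best k successors from the complete list.
            beam = sorted(candidates, key=value, reverse=True)[:k]
        return max(beam, key=value)             # best state found if no goal in time

Note that the new beam is chosen from the pooled list of all successors, so a promising state can claim more than one slot; this is exactly what distinguishes it from independent random restarts.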

Local Beam Search

How is this different from k random restarts in parallel?
- Random-restart search: each search runs independently of the others.
- Local beam search: useful information is passed among the k parallel search threads. E.g. if one state generates good successors while the other k-1 states all generate bad ones, then the more promising states are expanded.

Disadvantage: all k states can become stuck in a small region of the state space. To fix this, use stochastic beam search, which doesn't pick the best k successors: it chooses k successors at random, with the probability of choosing a given successor being an increasing function of its value.

Genetic Algorithms

Like natural selection, in which an organism creates offspring according to its fitness for the environment. Essentially a variant of stochastic beam search that combines two parent states (just like sexual reproduction). Over time, the population comes to contain individuals with high fitness.

Definitions

Fitness function: the evaluation function, in GA terminology.
Population: k randomly generated states (called individuals).
Individual: a string over a finite alphabet. Each position in the string is a gene; the string as a whole is a chromosome. Example:

    4 8 1 5 1 6 2 3 4 2

Selection: pick two random individuals for reproduction.
Crossover: mix the two parent strings at a randomly chosen crossover point. Here, with the crossover point after the fourth gene:

    Parents:    4 8 1 5 | 1 6 2 3 4 2       5 4 1 7 | 3 7 3 6 1 7
    Offspring:  4 8 1 5 | 3 7 3 6 1 7       5 4 1 7 | 1 6 2 3 4 2

Mutation: randomly change a location in an individual's string, with a small independent probability:

    4 8 1 5 1 6 2 3 4 2  ->  4 8 1 5 1 6 0 3 4 2

Randomness aids in avoiding small local extrema.
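Crossover and mutation are easy to state in code. A hedged Python sketch; the alphabet argument and the 0.05 mutation rate are illustrative assumptions, not values from the slides.

    import random

    def crossover(p1, p2):
        # Split both parents at one random point and swap the tails.
        point = random.randrange(1, len(p1))
        return p1[:point] + p2[point:], p2[:point] + p1[point:]

    def mutate(individual, alphabet, rate=0.05):
        # Independently rewrite each gene with small probability.
        return [random.choice(alphabet) if random.random() < rate else g
                for g in individual]

    p1 = [4, 8, 1, 5, 1, 6, 2, 3, 4, 2]
    p2 = [5, 4, 1, 7, 3, 7, 3, 6, 1, 7]
    print(crossover(p1, p2))    # with point = 4, reproduces the offspring above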

GA Overview

    Population = initial population
    Iterate until some individual is fit enough, or enough time has elapsed:
        NewPopulation = empty
        For 1 to size(Population):
            Select a pair of parents (P1, P2) using Selection(Population, FitnessFunction)
            Child = Crossover(P1, P2)
            With small random probability, Mutate(Child)
            Add Child to NewPopulation
        Population = NewPopulation
    Return the individual in Population with the best FitnessFunction value

Note that this pseudocode produces only one child per pair of parents; as before, we could also use a variant that produces two children.

Lots of Variants

Variant 1 (culling): individuals below a certain threshold are removed.
Variant 2: selection with probability proportional to fitness:

    P(X selected) = Eval(X) / Σ_{Y ∈ Population} Eval(Y)

Example: 8-Queens

Fitness function: the number of nonattacking pairs of queens (28 is the value for a solution). Represent an 8-queens state as an 8-digit string in which each digit gives the position of the queen in its column, e.g.:

    3 2 7 5 2 4 1 1

Example: 8-Queens (Fitness Function)

[Figure: the population, each individual's fitness value, and its probability of selection (proportional to fitness score).]
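The pseudocode above translates almost line for line into Python. A sketch for 8-queens, under assumed parameter choices (the population size, mutation rate, and generation limit are not specified on the slides); it usually finds a solution, but like any GA it offers no guarantee.

    import random

    def fitness(state):
        # Number of nonattacking pairs of queens; state[i] is the row of
        # the queen in column i. A solution scores 8*7/2 = 28.
        attacking = sum(1
                        for i in range(8) for j in range(i + 1, 8)
                        if state[i] == state[j] or abs(state[i] - state[j]) == j - i)
        return 28 - attacking

    def select_pair(population):
        # Variant 2: probability of selection proportional to fitness.
        weights = [fitness(s) for s in population]
        return random.choices(population, weights=weights, k=2)

    def genetic_algorithm(pop_size=20, mutation_rate=0.1, max_gens=5000):
        population = [[random.randint(1, 8) for _ in range(8)] for _ in range(pop_size)]
        for _ in range(max_gens):
            best = max(population, key=fitness)
            if fitness(best) == 28:                         # fit enough: a solution
                return best
            new_population = []
            for _ in range(pop_size):
                p1, p2 = select_pair(population)
                point = random.randrange(1, 8)
                child = p1[:point] + p2[point:]             # one child per pair
                if random.random() < mutation_rate:         # move one random queen
                    child[random.randrange(8)] = random.randint(1, 8)
                new_population.append(child)
            population = new_population
        return max(population, key=fitness)     # best found if time runs out

    print(genetic_algorithm())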

Example: 8-Queens (Selection)

[Figure: the sampled parent pairs.] Notice that 3 2 7 5 2 4 1 1 is selected twice, while 3 2 5 4 3 2 1 3 is not selected at all.

Example: 8-Queens (Crossover)

With the crossover point after the third digit:

    3 2 7 | 5 2 4 1 1   crossed with   2 4 7 | 4 8 5 5 2   yields   3 2 7 | 4 8 5 5 2

Example: 8-Queens (Mutation)

Mutation corresponds to randomly selecting a queen and randomly moving it within its column.

Implementation Details of Genetic Algorithms

Initially the population is diverse, and crossover produces big changes from the parents. Over time the individuals become quite similar, and crossover no longer produces such big changes. Crossover is the big advantage: it preserves a big block of genes that have evolved independently to perform useful functions. E.g. putting the first 3 queens in positions 2, 4, and 6 is a useful block.

Schemas

A schema is a substring in which some of the positions can be left unspecified, e.g. 246*****. Instances are the strings that match the schema. If the average fitness of the instances of a schema is above the mean, then the number of instances of the schema within the population will grow over time.

Schemas matter when contiguous blocks provide a consistent benefit: genetic algorithms work best when schemas correspond to meaningful components of a solution.
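Checking schema membership is mechanical. A small illustrative helper (the population mixes strings from the slides with one made-up instance of the schema; the * wildcard follows the 246***** notation above):

    def matches(schema, individual):
        # True if the individual instantiates the schema; '*' matches any gene.
        return all(s == '*' or s == g for s, g in zip(schema, individual))

    population = ["32752411", "24748552", "32543213", "24613578"]
    instances = [ind for ind in population if matches("246*****", ind)]
    print(instances)    # ['24613578']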

The Fine Print

The representation of each state is critical to the performance of the GA. There are lots of parameters to tweak, but if you get them right, GAs can work well. Theoretical results are limited (skeptics say it's just a big hack).

And remember... [xkcd comic, from http://www.xkcd.com/534/]

Discrete Environments

Hillclimbing pseudocode (a runnable version follows below):

    X <- initial configuration
    Iterate:
        E <- Eval(X)
        N <- Neighbors(X)
        For each Xi in N:
            Ei <- Eval(Xi)
        E* <- highest Ei, X* <- the Xi with highest Ei
        If E* > E:
            X <- X*
        Else:
            Return X

In discrete state spaces, the number of neighbors is finite. What if there is a continuum of possible moves, leading to an infinite number of neighbors?

Local Search in Continuous State Spaces

Almost all real-world problems involve continuous state spaces. To perform local search in continuous state spaces, you need techniques from calculus. The main technique for finding a minimum is called gradient descent (or gradient ascent, if you want to find a maximum).

Gradient Descent

What is the gradient of a function f(x)? It is usually written as ∇f(x) = ∂f(x)/∂x. The gradient ∇f(x) points in the direction of the steepest slope, and its magnitude |∇f(x)| tells you how big the steepest slope is.
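Returning to the hillclimbing pseudocode above, here is the promised runnable Python version; eval_fn and neighbors are assumed, problem-specific callables.

    def hillclimb(x, eval_fn, neighbors):
        # Greedy ascent: move to the best neighbor until none improves on x.
        while True:
            best = max(neighbors(x), key=eval_fn, default=None)
            if best is None or eval_fn(best) <= eval_fn(x):
                return x            # local maximum: no strictly better neighbor
            x = best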

Gradient Descent

Suppose we want to find a local minimum of a function f(x). We use the gradient descent rule:

    x <- x - α ∇f(x)

where α is the learning rate, usually a small number like 0.05. Suppose instead we want a local maximum of f(x); then we use the gradient ascent rule:

    x <- x + α ∇f(x)

Gradient Descent Examples

[Figures taken from Wolfram MathWorld, showing gradient descent on f(x) = x^3 - 2x^2 + 2.]

Question of the Day

Why not just calculate the global optimum by solving ∇f(x) = 0? You may not be able to solve this equation in closed form. Even when you can't solve it globally, you can still compute the gradient locally, which is all gradient descent needs.

Multivariate Gradient Descent

What happens if your function is multivariate, e.g. f(x1, x2, x3)? Then

    ∇f(x1, x2, x3) = ( ∂f/∂x1, ∂f/∂x2, ∂f/∂x3 )

and the gradient descent rule updates every coordinate:

    x1 <- x1 - α ∂f/∂x1
    x2 <- x2 - α ∂f/∂x2
    x3 <- x3 - α ∂f/∂x3

Multivariate Gradient Ascent

[Figures taken from the Wikipedia entry on gradient descent.]

More About the Learning Rate

If α is too large, gradient descent overshoots the optimum point. If α is too small, gradient descent requires too many steps and takes a very long time to converge.
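A numeric sketch of the rule on the one-dimensional example above. The starting point, step limit, and stopping tolerance are arbitrary illustrative choices; α = 0.05 is the slide's suggested value.

    def gradient_descent(grad, x, alpha=0.05, steps=10000, tol=1e-8):
        # Repeatedly apply x <- x - alpha * grad(x) until the step is tiny.
        for _ in range(steps):
            step = alpha * grad(x)
            x -= step
            if abs(step) < tol:
                break
        return x

    # f(x) = x^3 - 2x^2 + 2 has f'(x) = 3x^2 - 4x and a local minimum at x = 4/3.
    print(gradient_descent(lambda x: 3 * x**2 - 4 * x, x=2.0))   # ~1.3333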

Weaknesses of Gradient Descent

1. It can be very slow to converge to a local optimum, especially if the curvature differs greatly in different directions.
2. Good results depend on the value of the learning rate α.
3. What if the function f(x) isn't differentiable at x?

What You Should Know

- Be able to formulate a problem as a genetic algorithm.
- Understand what crossover and mutation do and why they are important.
- Know the differences between hillclimbing, simulated annealing, local beam search, and genetic algorithms.
- Understand how gradient descent works, including its strengths and weaknesses.
- Understand how to derive the gradient descent rule.