Local search and agents


Artificial Intelligence: Local search and agents
Instructor: Fabrice Popineau
[These slides adapted from Stuart Russell, Dan Klein and Pieter Abbeel @ai.berkeley.edu]

Local search algorithms
In many optimization problems the path is irrelevant; the goal state itself is the solution. Then the state space is a set of complete configurations: find a configuration satisfying constraints (e.g., the n-queens problem), or find an optimal configuration (e.g., the travelling salesperson problem). In such cases we can use iterative improvement algorithms: keep a single current state and try to improve it. Constant space, suitable for online as well as offline search.

Heuristic for n-queens problem
Goal: n queens on the board with no conflicts, i.e., no queen attacking another. States: n queens on the board, one per column. Heuristic value function: number of conflicts.

Hill-climbing algorithm

function HILL-CLIMBING(problem) returns a state
    current ← MAKE-NODE(problem.INITIAL-STATE)
    loop do
        neighbor ← a highest-valued successor of current
        if neighbor.VALUE ≤ current.VALUE then return current.STATE
        current ← neighbor

Like climbing Everest in thick fog with amnesia
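
As a concrete illustration, here is a minimal Python sketch of steepest-ascent hill climbing on n-queens; the list-of-rows board encoding and the helper names (conflicts, hill_climb) are illustrative choices, not from the slides:

import random

def conflicts(state):
    # Number of attacking pairs; state[c] is the row of the queen in column c.
    n = len(state)
    return sum(1 for a in range(n) for b in range(a + 1, n)
               if state[a] == state[b] or abs(state[a] - state[b]) == b - a)

def hill_climb(state):
    # Steepest ascent: value(state) = -conflicts(state); stop at a local maximum.
    while True:
        best, best_val = state, -conflicts(state)
        for col in range(len(state)):
            for row in range(len(state)):
                if row != state[col]:
                    neighbor = state[:col] + [row] + state[col + 1:]
                    if -conflicts(neighbor) > best_val:
                        best, best_val = neighbor, -conflicts(neighbor)
        if best_val <= -conflicts(state):   # no uphill neighbor: done (possibly stuck)
            return state
        state = best

start = [random.randrange(8) for _ in range(8)]
solution = hill_climb(start)
print(solution, conflicts(solution))        # 0 conflicts only if we didn't get stuck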

Global and local maxima
Random restarts: find the global optimum (duh). Random sideways moves: escape from shoulders, but loop forever on flat local maxima.

Hill-climbing on the 8-queens problem
No sideways moves: succeeds with probability p ≈ 0.14. Average number of moves per trial: 4 when succeeding, 3 when getting stuck. Expected total number of moves needed: 3(1−p)/p + 4 ≈ 22 moves.
Allowing 100 sideways moves: succeeds with probability p ≈ 0.94. Average number of moves per trial: 21 when succeeding, 65 when getting stuck. Expected total number of moves needed: 65(1−p)/p + 21 ≈ 25 moves.
Moral: algorithms with knobs to twiddle are irritating
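
Where do those expected totals come from? With success probability p per independent trial, we expect (1−p)/p failed trials before the first success. A quick check in Python (the numbers are the slide's; the function name is mine):

def expected_moves(p, moves_success, moves_fail):
    # Expected failures before a success is (1-p)/p, each costing moves_fail moves.
    return moves_fail * (1 - p) / p + moves_success

print(expected_moves(0.14, 4, 3))    # ~22.4: no sideways moves
print(expected_moves(0.94, 21, 65))  # ~25.1: up to 100 sideways moves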

Hill Climbing Quiz
Starting from X, where do you end up? Starting from Y, where do you end up? Starting from Z, where do you end up?

Simulated annealing
Resembles the annealing process used to cool metals slowly to reach an ordered (low-energy) state. Basic idea: allow bad moves occasionally, depending on temperature. High temperature ⇒ more bad moves allowed; shake the system out of its local minimum. Gradually reduce temperature according to some schedule. Sounds pretty flaky, doesn't it? Theorem: simulated annealing finds the global optimum with probability 1 for a slow enough cooling schedule.

Simulated annealing algorithm

function SIMULATED-ANNEALING(problem, schedule) returns a state
    current ← MAKE-NODE(problem.INITIAL-STATE)
    for t = 1 to ∞ do
        T ← schedule(t)
        if T = 0 then return current
        next ← a randomly selected successor of current
        ΔE ← next.VALUE − current.VALUE
        if ΔE > 0 then current ← next
        else current ← next only with probability e^(ΔE/T)
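
A direct Python transcription of this pseudocode, with a toy maximization problem attached (the linear cooling schedule and the toy problem are assumptions for the demo, not part of the algorithm):

import math, random

def simulated_annealing(initial, successors, value, schedule, max_t=10**6):
    current = initial
    for t in range(1, max_t):
        T = schedule(t)
        if T <= 0:
            return current
        nxt = random.choice(successors(current))
        dE = value(nxt) - value(current)
        # Always take uphill moves; take downhill moves with probability e^(dE/T).
        if dE > 0 or random.random() < math.exp(dE / T):
            current = nxt
    return current

# Toy demo: maximize -(x^2) over the integers, starting far from the optimum.
best = simulated_annealing(
    50,
    lambda x: [x - 1, x + 1],
    lambda x: -(x * x),
    lambda t: max(0.0, 1.0 - t / 10000))   # linear schedule reaches T = 0 at t = 10000
print(best)                                 # typically at or near 0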

Simulated Annealing
Theoretical guarantee: the stationary distribution is p(x) ∝ e^(E(x)/kT); if T is decreased slowly enough, the algorithm will converge to an optimal state! Is this an interesting guarantee? Sounds like magic, but reality is reality: the more downhill steps you need to escape a local optimum, the less likely you are to ever make them all in a row. People think hard about ridge operators which let you jump around the space in better ways.

Local beam search
Basic idea: K copies of a local search algorithm, initialized randomly. For each iteration: generate ALL successors from the K current states, then choose the best K of these to be the new current states. Why is this different from K local searches run in parallel? The searches communicate! "Come over here, the grass is greener!" What other well-known algorithm does this remind you of? Evolution! (Stochastic beam search: choose the K successors randomly, with a bias towards good ones.)
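
A short Python sketch of the pooled-successors step that makes the K searches communicate (the function names and the fixed iteration cap are assumptions; it also assumes every state has at least one successor):

def local_beam_search(k, random_state, successors, value, iters=100):
    states = [random_state() for _ in range(k)]
    for _ in range(iters):
        # Pool ALL successors of ALL current states, then keep the best k:
        # this is where the k searches 'communicate'.
        pool = [s2 for s in states for s2 in successors(s)]
        pool.sort(key=value, reverse=True)
        states = pool[:k]
    return max(states, key=value)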

Genetic Algorithms
Genetic algorithms use a natural selection metaphor: keep the best N hypotheses at each step (selection), based on a fitness function. Also have pairwise crossover operators, with optional mutation to give variety. Possibly the most misunderstood, misapplied (and even maligned) technique around.

Example: N-Queens
Why does crossover make sense here? When wouldn't it make sense? What would mutation be? What would a good fitness function be?
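
One concrete set of answers, as a Python sketch for n-queens (the encoding, operators, and parameters are illustrative assumptions, not the slides' choices): crossover splices column prefixes, mutation moves one queen within its column, and fitness counts non-attacking pairs.

import random

def fitness(state):
    # Number of NON-attacking pairs; higher is better, max is n(n-1)/2.
    n = len(state)
    return sum(1 for a in range(n) for b in range(a + 1, n)
               if state[a] != state[b] and abs(state[a] - state[b]) != b - a)

def crossover(x, y):
    c = random.randrange(1, len(x))        # splice point between columns
    return x[:c] + y[c:]

def mutate(state, rate=0.1):
    if random.random() < rate:             # move one queen within its column
        state = state[:]
        state[random.randrange(len(state))] = random.randrange(len(state))
    return state

def genetic_nqueens(n=8, pop_size=100, generations=1000):
    pop = [[random.randrange(n) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == n * (n - 1) // 2:
            break                          # found a conflict-free board
        parents = pop[:pop_size // 2]      # selection: keep the fitter half
        children = [mutate(crossover(*random.sample(parents, 2)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

print(genetic_nqueens())

Crossover makes sense here because the column assignments are loosely coupled: a good left half and a good right half often combine into a good board. It wouldn't make sense if the encoding scattered related decisions across the string.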

Searching in the real world
Nondeterminism: actions have unpredictable effects. Modified problem formulation to allow multiple outcomes; solutions are now contingency plans; new algorithm to find them: AND-OR search; may need plans with loops!
Partial observability: the percept is not the whole state. New concept: belief state = set of states the agent could be in; modified formulation for search in belief-state space; add an observation model. A simple and general agent design handles both nondeterminism and partial observability.

The erratic vacuum world
If the square is dirty, Suck sometimes cleans up dirt in the adjacent square as well (e.g., state 1 could go to 5 or 7). If the square is clean, Suck may dump dirt on it by accident (e.g., state 4 could go to 4 or 2).

Problem formulation
Results(s,a) returns a set of states: Results(1,Suck) = {5,7}; Results(4,Suck) = {2,4}; Results(1,Right) = {2}. Everything else is the same as before.
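
The same transition model in Python, using a structured state (location, set of dirty squares) instead of the slides' numbered states; the encoding is an assumption for illustration:

def erratic_results(state, action):
    # Results(s, a): the SET of possible successor states.
    loc, dirt = state                      # loc in {'A', 'B'}; dirt is a frozenset
    other = 'B' if loc == 'A' else 'A'
    if action == 'Right':
        return {('B', dirt)}
    if action == 'Left':
        return {('A', dirt)}
    if action == 'Suck':
        if loc in dirt:                    # dirty: clean here, maybe next door too
            return {(loc, dirt - {loc}), (loc, dirt - {loc, other})}
        return {(loc, dirt), (loc, dirt | {loc})}   # clean: may deposit dirt

s1 = ('A', frozenset({'A', 'B'}))          # numbered state 1: both squares dirty
print(erratic_results(s1, 'Suck'))         # the analogues of states 5 and 7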

Contingent solutions
From state 1, does [Suck] solve the problem? Not necessarily! What about [Suck, Right, Suck]? Not necessarily! [Suck; if state = 5 then [Right, Suck] else []]. This is a contingent solution (a.k.a. a branching or conditional plan). Great! So, how do we find such solutions?

AND-OR search trees
OR-node: the agent chooses the action; at least one branch must be solved. AND-node: nature chooses the outcome; all branches must be solved.

AND-OR search made easy
AND-OR search: call OR-SEARCH on the root node. OR-SEARCH(node): succeeds if AND-SEARCH succeeds on the outcome set for some action. AND-SEARCH(set of nodes): succeeds if OR-SEARCH succeeds on ALL nodes in the set.

AND-OR search

function AND-OR-GRAPH-SEARCH(problem) returns a conditional plan, or failure
    return OR-SEARCH(problem.INITIAL-STATE, problem, [])

function OR-SEARCH(state, problem, path) returns a conditional plan, or failure
    if problem.GOAL-TEST(state) then return the empty plan
    if state is on path then return failure
    for each action in problem.ACTIONS(state) do
        plan ← AND-SEARCH(RESULTS(state, action), problem, [state | path])
        if plan ≠ failure then return [action | plan]
    return failure

AND-OR search contd.

function AND-SEARCH(states, problem, path) returns a conditional plan, or failure
    for each sᵢ in states do
        planᵢ ← OR-SEARCH(sᵢ, problem, path)
        if planᵢ = failure then return failure
    return [if s₁ then plan₁ else if s₂ then plan₂ else ... if sₙ₋₁ then planₙ₋₁ else planₙ]
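
A runnable Python version of these functions, applied to the erratic vacuum model sketched earlier (reusing erratic_results; the nested plan representation [action, {state: subplan}] is my own choice, not the slides'):

def and_or_search(problem):
    return or_search(problem.initial, problem, [])

def or_search(state, problem, path):
    if problem.goal_test(state):
        return []                                  # the empty plan
    if state in path:
        return None                                # repeated state: failure
    for action in problem.actions(state):
        plan = and_search(problem.results(state, action), problem, [state] + path)
        if plan is not None:
            return [action] + plan
    return None

def and_search(states, problem, path):
    subplans = {}
    for s in states:
        plan = or_search(s, problem, path)
        if plan is None:
            return None
        subplans[s] = plan                         # 'if s then plan(s)'
    return [subplans]

class ErraticVacuum:
    initial = ('A', frozenset({'A', 'B'}))
    def actions(self, s): return ['Suck', 'Right', 'Left']
    def results(self, s, a): return erratic_results(s, a)
    def goal_test(self, s): return not s[1]        # goal: no dirt anywhere

print(and_or_search(ErraticVacuum()))
# Prints the contingent plan [Suck; if dirt remains then [Right, Suck] else []].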

Slippery vacuum world
Sometimes movement fails. There is no guaranteed contingent solution! There is a cyclic solution: [Suck, L1: Right, if State = 5 then L1 else Suck], where L1 is a label. Modify AND-OR-GRAPH-SEARCH: when it finds a repeated state, add a label and try adding a branch back to the label instead of returning failure. A cyclic plan is a cyclic solution if every leaf is a goal state and, from every point in the plan, there is a path to a leaf.

What does nondeterminism really mean?
Example: your hotel key card doesn't open the door of your room. Explanation 1: you didn't put it in quite right. This is nondeterminism; keep trying. Explanation 2: something is wrong with the key. This is partial observability; get a new key. Explanation 3: it isn't your room. This is embarrassing; get a new brain. A nondeterministic model is appropriate when outcomes don't depend deterministically on some hidden state.

Partial observability

Extreme partial observability: sensorless worlds
Vacuum world with known geometry, but unknown dirt and agent location, and no sensors at all! Belief state: the set of all environment states the agent could be in; more generally, what the agent knows given all percepts to date. [Figure: belief-state transitions under the actions Right, Suck, Suck, Left]

Sensorless problem formulation
The underlying physical problem has Actionsₚ, Resultₚ, Goal-Testₚ, and Step-Costₚ.
Initial state: a belief state b (a set of physical states s); N physical states ⇒ 2^N belief states.
Goal test: every element s of b satisfies Goal-Testₚ(s).
Actions: union of Actionsₚ(s) for each s in b. This is OK if performing an illegal action has no effect.
Transition model: deterministic: Result(b,a) = {Resultₚ(s,a) : s ∈ b}; nondeterministic: Result(b,a) = union of Resultsₚ(s,a) for each s in b.
Step-Cost(b,a,b′) = Step-Costₚ(s,a,s′) for any s in b.
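
In Python, the lift to belief-state space is a few lines over the physical problem's functions (written for the nondeterministic case; the deterministic one is the special case of singleton outcome sets; erratic_results is reused from the earlier sketch):

def bs_result(b, a, results_p):
    # Union of the physical outcome sets over every state we might be in.
    return frozenset(s2 for s in b for s2 in results_p(s, a))

def bs_goal_test(b, goal_test_p):
    # EVERY possible physical state must satisfy the physical goal test.
    return all(goal_test_p(s) for s in b)

# Sensorless vacuum: start knowing nothing, i.e., all 8 physical states.
b0 = frozenset((loc, frozenset(d)) for loc in 'AB'
               for d in [(), ('A',), ('B',), ('A', 'B')])
b1 = bs_result(b0, 'Right', erratic_results)
print(len(b0), len(b1))     # 8, then 4: Right localizes the agent to square B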

Search in sensorless belief-state space
Everything works exactly as before! Solutions are still action sequences! Some opportunities for improvement: if any s in b is unsolvable, then b is unsolvable; if b is a superset of some b′ already in the tree, discard b; if b is a superset of b′ and b has a solution, then b′ has the same solution.

What use are sensorless problems?
They correspond to many real-world robotic manipulation problems. A part-orienting conveyor consists of a sequence of slanted guides that orient the part correctly no matter what its initial orientation. It's a lot cheaper and more reliable than using a camera and robot arm!

Partial observability: formulation
A partially observable problem formulation has to say what the agent can observe. Deterministic: Percept(s) is the percept received in physical state s. Nondeterministic: Percepts(s) is the set of possible percepts received in s. Fully observable: Percept(s) = s. Sensorless: Percept(s) = null. Local-sensing vacuum world: Percept(s1) = [A, Dirty]; Percept(s3) = [A, Dirty].

Partial observability: belief-state transition model
b′ = Predict(b,a) updates the belief state just for the action; identical to the transition model for sensorless problems. Possible-Percepts(b′) is the set of percepts that could come next: the union of Percept(s) for every s in b′. Update(b′,p) is the new belief state if percept p is received: just the states s in b′ for which p = Percept(s). Results(b,a) contains Update(Predict(b,a),p) for each p in Possible-Percepts(Predict(b,a)).
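
The same three functions in Python, with a local-sensing percept model for the vacuum world (the percept encoding is an assumption matching the slide's [A, Dirty] examples; erratic_results is reused from the earlier sketch):

def percept(s):
    loc, dirt = s                          # the agent senses its own square only
    return (loc, 'Dirty' if loc in dirt else 'Clean')

def predict(b, a, results_p):
    return frozenset(s2 for s in b for s2 in results_p(s, a))

def possible_percepts(b):
    return {percept(s) for s in b}

def update(b, p):
    return frozenset(s for s in b if percept(s) == p)

def belief_results(b, a, results_p):
    # One branch of the new belief state per percept that could arrive.
    bp = predict(b, a, results_p)
    return {p: update(bp, p) for p in possible_percepts(bp)}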

Example: Results(b₀, Right)
[Figure: b′ = Predict(b₀, Right); Possible-Percepts(b′) = {[B,Dirty], [B,Clean]}; Results(b₀, Right) = {Update(b′,[B,Dirty]), Update(b′,[B,Clean])}]

Maintaining belief state in an agent
The percept p is given by the environment. Repeat after me: b ← Update(Predict(b,a), p). This is the predict-update cycle, also known as monitoring, filtering, or state estimation. Localization and mapping are two special cases.
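
A small demo of the cycle's two halves, reusing the sketches above (assumed encodings as before): starting from full uncertainty and taking Right, each possible percept then cuts the predicted belief state in half.

b0 = frozenset((loc, frozenset(d)) for loc in 'AB'
               for d in [(), ('A',), ('B',), ('A', 'B')])
for p, b1 in belief_results(b0, 'Right', erratic_results).items():
    print(p, '->', b1)    # b1 = Update(Predict(b0, Right), p): 2 states each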

Summary
Nondeterminism requires contingent plans; AND-OR search finds them. Sensorless problems require ordinary plans; search in belief-state space to find them. General partial observability induces nondeterminism for percepts: AND-OR search in belief-state space, with the predict-update cycle for belief-state transitions.

Next Time: Adversarial Search!