Chapter 17 Planning Based on Model Checking

1 Lecture slides for Automated Planning: Theory and Practice
Chapter 17 Planning Based on Model Checking
Dana S. Nau
CMSC 722, AI Planning
University of Maryland, Spring 2008

2 Motivation
Actions with multiple possible outcomes
Action failures » e.g., gripper drops its load
Exogenous events » e.g., road closed
Like Markov Decision Processes (MDPs), but without probabilities attached to the outcomes
Useful if accurate probabilities aren't available, or if probability calculations would introduce inaccuracies
[Figure: blocks a, b, c, and the action "grasp block c" with its intended outcome and an unintended outcome]

3 Nondeterministic Systems
Nondeterministic system: a triple Σ = (S, A, γ)
S = finite set of states
A = finite set of actions
γ: S × A → 2^S
Like in the previous chapter, the book doesn't commit to any particular representation
It only deals with the underlying semantics: draw the state-transition graph explicitly
Like in the previous chapter, a policy is a function from states into actions
π: S → A
At certain points we will also have nondeterministic policies
π: S → 2^A, or equivalently, π ⊆ S × A
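
To make the semantics concrete, here is a minimal Python sketch of a nondeterministic system and a policy. It is not from the book; the dictionary encoding and the particular transition relation (including the outcomes chosen for the nondeterministic action move(r1,l1,l4)) are illustrative assumptions modeled on the robot example of the next slides.

    # Sigma = (S, A, gamma): gamma maps (state, action) pairs to the set of
    # possible successor states, i.e. gamma: S x A -> 2^S.
    gamma = {
        ("s1", "move(r1,l1,l2)"): {"s2"},
        ("s2", "move(r1,l2,l3)"): {"s3"},
        ("s3", "move(r1,l3,l4)"): {"s4"},
        ("s5", "move(r1,l5,l4)"): {"s4"},
        ("s1", "move(r1,l1,l4)"): {"s1", "s4"},  # assumed nondeterministic outcomes
    }
    S = {"s1", "s2", "s3", "s4", "s5"}
    A = {a for (_, a) in gamma}

    # A deterministic policy pi: S -> A; states absent from the dict have
    # no action assigned.
    pi1 = {"s1": "move(r1,l1,l2)",
           "s2": "move(r1,l2,l3)",
           "s3": "move(r1,l3,l4)"}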

4 Example
Robot r1 starts at location l1
Objective is to get r1 to location l4
[Figure: states s1-s5, one per location l1-l5, with move actions such as move(r1,l2,l1); the action move(r1,l1,l4) has more than one possible outcome]
π1 = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4))}
π2 = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
π3 = {(s1, move(r1,l1,l4))}

7 Execution Structures
Execution structure for a policy π: the graph of all of π's execution paths
Notation: Σπ = (Q, T), where Q ⊆ S and T ⊆ S × S
π1 = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4))}
π2 = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
π3 = {(s1, move(r1,l1,l4))}
[Figure: the execution structures of π1, π2, and π3]

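The execution structure can be computed mechanically. Here is a small sketch, reusing the gamma and pi1 dictionaries from the earlier sketch: it collects every state and transition reachable from the initial state when each state's assigned action is applied and all of its nondeterministic outcomes are followed.

    def execution_structure(gamma, pi, s0):
        # Sigma_pi = (Q, T): states and transitions reachable from s0
        # when actions are chosen according to the policy dict pi.
        Q, T = {s0}, set()
        frontier = [s0]
        while frontier:
            s = frontier.pop()
            if s not in pi:              # no action assigned: a leaf of Sigma_pi
                continue
            for s_next in gamma.get((s, pi[s]), set()):
                T.add((s, s_next))
                if s_next not in Q:
                    Q.add(s_next)
                    frontier.append(s_next)
        return Q, T

For instance, execution_structure(gamma, pi1, "s1") yields Q = {s1, s2, s3, s4} and T = {(s1,s2), (s2,s3), (s3,s4)}.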

13 Types of Solutions
Weak solution: at least one execution path reaches a goal
Strong solution: every execution path reaches a goal
Strong-cyclic solution: every fair execution path reaches a goal
Don't stay in a cycle forever if there's a state-transition out of it
[Figure: three small state-transition diagrams over states s0-s2 and actions a0-a3, one illustrating each kind of solution]
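
On a finite execution structure these three conditions can be tested directly. The sketch below (reusing execution_structure from above) is one way to do it; rendering "every fair execution path reaches a goal" as "a goal remains reachable from every reachable state" is this sketch's reading of the fairness condition.

    def _reachable(T, start):
        # all states reachable from `start` via the transitions in T
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for (x, y) in T:
                if x == u and y not in seen:
                    seen.add(y)
                    stack.append(y)
        return seen

    def _has_cycle(T, nodes):
        # depth-first search for a cycle in the subgraph induced by `nodes`
        adj = {}
        for (u, v) in T:
            if u in nodes and v in nodes:
                adj.setdefault(u, []).append(v)
        color = {n: 0 for n in nodes}    # 0 = unvisited, 1 = on stack, 2 = done
        def dfs(u):
            color[u] = 1
            for v in adj.get(u, []):
                if color[v] == 1 or (color[v] == 0 and dfs(v)):
                    return True
            color[u] = 2
            return False
        return any(color[n] == 0 and dfs(n) for n in nodes)

    def classify(gamma, pi, s0, goals):
        Q, T = execution_structure(gamma, pi, s0)
        weak = bool(_reachable(T, s0) & goals)
        # strong-cyclic: from every reachable state, a goal is still reachable
        strong_cyclic = all(_reachable(T, s) & goals for s in Q)
        # strong: additionally no cycle among non-goal states, so no
        # execution path can avoid the goal forever
        strong = strong_cyclic and not _has_cycle(T, Q - goals)
        return weak, strong_cyclic, strong

With the earlier toy gamma, classify(gamma, pi1, "s1", {"s4"}) reports π1 as weak, strong-cyclic, and strong, while π3 = {"s1": "move(r1,l1,l4)"} comes out weak but not strong, since its unintended outcome loops at s1.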

14 Finding Strong Solutions
Backward breadth-first search
StrongPreImg(S) = {(s,a) : γ(s,a) ≠ ∅, γ(s,a) ⊆ S}
all state-action pairs for which all of the successors are in S
PruneStates(π,S) = {(s,a) ∈ π : s ∉ S}
S is the set of states we've already solved; keep only the state-action pairs for other states
MkDet(π')
π' is a policy that may be nondeterministic; remove some state-action pairs if necessary, to get a deterministic policy
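
Here is a compact Python sketch of these subroutines and of the backward-search loop they plug into. It is one reading of the Strong-Plan scheme, not the book's code; policies are encoded as sets of (state, action) pairs, and gamma is the dict from the earlier sketches.

    def strong_preimg(gamma, S):
        # all state-action pairs for which all of the successors are in S
        return {(s, a) for (s, a), succs in gamma.items() if succs and succs <= S}

    def prune_states(pi, S):
        # keep only the pairs for states that are not already solved
        return {(s, a) for (s, a) in pi if s not in S}

    def mk_det(pi):
        # keep one action per state, dropping the rest
        det = {}
        for (s, a) in sorted(pi):
            det.setdefault(s, a)
        return det

    def strong_plan(gamma, S0, Sg):
        pi, pi_new = None, set()             # None plays the role of "failure"
        while pi != pi_new and not S0 <= Sg | {s for (s, _) in pi_new}:
            pi = pi_new
            solved = Sg | {s for (s, _) in pi_new}
            pi_new = pi_new | prune_states(strong_preimg(gamma, solved), solved)
        if S0 <= Sg | {s for (s, _) in pi_new}:
            return mk_det(pi_new)
        return None                          # failure

On the toy gamma above, strong_plan(gamma, {"s1"}, {"s4"}) returns the policy π2, matching the derivation the next slides walk through by hand.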

17 Example
π = failure, π' = ∅
S_π' = ∅, S_g ∪ S_π' = {s4}
π'' = PreImage = {(s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
π ← π' = ∅
π' ← π' ∪ π'' = {(s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}

19 Example
π = ∅
π' = {(s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
S_π' = {s3, s5}, S_g ∪ S_π' = {s3, s4, s5}
PreImage = {(s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4)), (s4, move(r1,l4,l3)), (s4, move(r1,l4,l5))}
π'' = PruneStates(PreImage, S_g ∪ S_π') = {(s2, move(r1,l2,l3))}
π ← π' = {(s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
π' ← π' ∪ π'' = {(s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}

21 Example
π = {(s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
π' = {(s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
S_π' = {s2, s3, s5}, S_g ∪ S_π' = {s2, s3, s4, s5}
π'' = {(s1, move(r1,l1,l2))}
π ← π' = {(s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
π' ← π' ∪ π'' = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}

23 Example
π = {(s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
π' = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
S_π' = {s1, s2, s3, s5}, S_g ∪ S_π' = {s1, s2, s3, s4, s5}
S_0 ⊆ S_g ∪ S_π', so the search succeeds
MkDet(π') = π'

24 Finding Weak Solutions
Weak-Plan is just like Strong-Plan except for this:
WeakPreImg(S) = {(s,a) : γ(s,a) ∩ S ≠ ∅}
at least one successor is in S
[Figure: the Strong-Plan pseudocode with StrongPreImg replaced by WeakPreImg]
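
In the sketch given after slide 14, the corresponding change is one function (again an illustrative reading, not the book's code):

    def weak_preimg(gamma, S):
        # keep a pair as soon as at least one successor is in S
        return {(s, a) for (s, a), succs in gamma.items() if succs & S}

Substituting weak_preimg for strong_preimg in strong_plan gives a weak_plan; on the toy gamma it picks up (s1, move(r1,l1,l4)) in the very first preimage, just as the next slides show.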

26 Example
π = failure, π' = ∅
S_π' = ∅, S_g ∪ S_π' = {s4}
π'' = PreImage = {(s1, move(r1,l1,l4)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
π ← π' = ∅
π' ← π' ∪ π'' = {(s1, move(r1,l1,l4)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}

27 Example
π = ∅
π' = {(s1, move(r1,l1,l4)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
S_π' = {s1, s3, s5}, S_g ∪ S_π' = {s1, s3, s4, s5}
S_0 ⊆ S_g ∪ S_π', so the search succeeds
MkDet(π') = π'

28 Finding Strong-Cyclic Solutions
Begin with a universal policy π' that contains all state-action pairs
Repeatedly eliminate state-action pairs that take us to bad states:
PruneOutgoing removes state-action pairs that go to states not in S_g ∪ S_π
» PruneOutgoing(π,S) = π − {(s,a) ∈ π : γ(s,a) ⊄ S ∪ S_π}
PruneUnconnected removes states from which it is impossible to get to S_g
» with π' = ∅, compute the fixpoint of π' ← π ∩ WeakPreImg(S_g ∪ S_π')
Once the policy stops changing:
If it's a solution, return a deterministic subpolicy
Else return failure

29 Finding Strong-Cyclic Solutions
Once the policy stops changing:
If it's a solution, return a deterministic subpolicy
Else return failure
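
Below is a sketch of the two pruning steps and the outer loop, in the same set-of-pairs encoding and reusing weak_preimg and mk_det from the earlier sketches. It is one reading of the scheme on these slides; the RemoveNonProgress step used in the examples that follow is sketched as well, since without it MkDet could keep an action that cycles without ever approaching the goal.

    def states_of(pi):
        return {s for (s, _) in pi}

    def prune_outgoing(gamma, pi, S):
        # drop pairs with a possible successor outside S union S_pi
        allowed = S | states_of(pi)
        return {(s, a) for (s, a) in pi if gamma[(s, a)] <= allowed}

    def prune_unconnected(gamma, pi, S):
        # least fixpoint of  pi' <- pi  intersect  WeakPreImg(S union S_pi'),
        # i.e. the part of pi from which S remains reachable
        pi_new = set()
        while True:
            pi_next = pi & weak_preimg(gamma, S | states_of(pi_new))
            if pi_next == pi_new:
                return pi_new
            pi_new = pi_next

    def remove_nonprogress(gamma, pi, S):
        # keep a pair only at the stage where its state first becomes
        # goal-connected, so every kept action can make progress toward S
        pi_new = set()
        while True:
            layer = pi & weak_preimg(gamma, S | states_of(pi_new))
            pi_next = pi_new | {(s, a) for (s, a) in layer
                                if s not in S | states_of(pi_new)}
            if pi_next == pi_new:
                return pi_new
            pi_new = pi_next

    def strong_cyclic_plan(gamma, S0, Sg):
        pi = set(gamma)                  # the universal policy: all applicable pairs
        while True:
            pruned = prune_unconnected(gamma, prune_outgoing(gamma, pi, Sg), Sg)
            if pruned == pi:
                break
            pi = pruned
        if not S0 <= Sg | states_of(pi):
            return None                  # failure
        return mk_det(remove_nonprogress(gamma, pi, Sg))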

30 Example 1
π ← π' ← {(s,a) : a is applicable to s}
[Figure: the robot domain extended with a state s6; the goal is at(r1,l6)]

34 Example 1
π ← π' ← {(s,a) : a is applicable to s}
PruneOutgoing(π',S_g) = π'
PruneUnconnected(π',S_g) = π'
RemoveNonProgress(π') = the subpolicy shown in the figure
MkDet of that subpolicy = either
{(s1, move(r1,l1,l4)), (s4, move(r1,l4,l6))}
or
{(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s4, move(r1,l4,l6)), (s5, move(r1,l5,l4))}
[Figure: the domain with state s6; the goal is at(r1,l6)]

35 Example 2
π ← π' ← {(s,a) : a is applicable to s}
[Figure: a second version of the domain with state s6 and goal at(r1,l6)]

39 Example 2
π ← π' ← {(s,a) : a is applicable to s}
PruneOutgoing(π',S_g) = π'
PruneUnconnected(π',S_g) = the subpolicy shown in the figure
π' ← that subpolicy
[Figure: the second domain; the goal is at(r1,l6)]

41 Example 2
π ← π'
PruneOutgoing(π',S_g) = π'
PruneUnconnected(π',S_g) = π'
so π = π' and the loop exits
RemoveNonProgress(π') = the subpolicy shown in the figure
MkDet of that subpolicy = the same subpolicy
[Figure: the second domain; the goal is at(r1,l6)]

42 Planning for Extended Goals
Here, "extended" means temporally extended
Constraints that apply to some sequence of states
Examples:
» move to l3, and then to l5
» keep going back and forth between l3 and l5
[Figure: the robot domain, with actions such as move(r1,l2,l1)]

43 Planning for Extended Goals
Context: the internal state of the controller
Plan: (C, c0, act, ctxt)
C = a set of execution contexts
c0 = the initial context
act: S × C → A
ctxt: S × C × S → C
Section 17.3 extends the ideas in Sections 17.1 and 17.2 to deal with extended goals
We'll skip the details
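
To see how such a plan runs, here is a sketch of plan execution under these signatures. The dict encodings, the use of random.choice to stand in for the environment picking one of γ's outcomes, and the reading of ctxt as mapping (current state, current context, next state) to the next context are all illustrative assumptions.

    import random

    def run_plan(gamma, act, ctxt, s0, c0, n_steps):
        # act: (state, context) -> action;  ctxt: (state, context, next state) -> context
        s, c = s0, c0
        trace = [(s, c)]
        for _ in range(n_steps):
            a = act[(s, c)]
            s_next = random.choice(sorted(gamma[(s, a)]))  # one nondeterministic outcome
            c = ctxt[(s, c, s_next)]
            s = s_next
            trace.append((s, c))
        return trace

For the goal "move to l3, and then to l5", the contexts could be c1 (still heading for l3) and c2 (heading for l5), with ctxt switching from c1 to c2 whenever the new state satisfies at(r1,l3).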
