AUTONOMOUS SYSTEMS. Task Planning. Pedro U. Lima M. Isabel Ribeiro Luis Custódio
1 AUTONOMOUS SYSTEMS Task Planning Pedro U. Lima M. Isabel Ribeiro Luis Custódio Institute for Systems and Robotics Instituto Superior Técnico Lisbon, Portugal March 2007 Revised by Pedro U. Lima in November 2015
2 Outline 1. Planning Problem 2. Logic 3. Logic-Based Planning: Situation Calculus, STRIPS 4. Plan Representation and Modeling: Petri Net Task Models 5. Plan Analysis 6. Planning Under Uncertainty 7. Markov Decision Processes (MDP) 8. Dynamic Programming Solution of MDPs 9. Reinforcement Learning Solution of MDPs
3 Planning Planning consists of determining the sequence of actions that enables reaching the goal(s) of an agent. Robot Task Planning consists of determining the appropriate sequence of actions to move a robot from the current world situation to a world situation that satisfies its preferences.
4 Planning Robot Task Planning [Courtesy of JSK Lab U. Tokyo, Japan]
5 Logic Logic can be seen as a language to represent knowledge about the world and about a particular problem to be solved. Syntactic System: an alphabet (the set of accepted symbols) plus formation rules (rules establishing how symbols can be aggregated so as to build formulas/sentences) define the LANGUAGE; a set of rules that establish how to derive formulas from other formulas defines the INFERENCE RULES.
6 Logic Semantic System: assigns a meaning to the language formulas, linking the world (semantics: facts) to the language (syntax: formulas).
7 Logic Syntactic System vs Semantic System. Language rules: g + r + e + e + n → green; the semantics associates a color to the word green. Arithmetic rules: if x, y are expressions representing numbers, then x > y is a formula over numbers; the fact is true when the number represented by x is greater than the number represented by y.
8 Logic Typically, one deals only with the aspects of the world relevant to the problem, through a conceptualization of reality. Objects and their relations are defined. Functions: given a set of objects, a function establishes which object is related to the object(s) in the set and how, e.g., left_room(kitchen). Relations: given a set of objects, a relation establishes whether that set is related in a certain way, e.g., on(laptop, table).
9 Logic The concept of interpretation establishes the link between the language elements and the elements of the conceptualization of reality (objects, functions and relations). Given a formula written in the defined language, its interpretation is called a proposition. A proposition is true iff it correctly describes the world, based on the adopted conceptualization of reality. A formula is satisfied iff there is an interpretation that associates it with a true proposition.
10 Logic A fact is a true proposition for a given (conceptualized) world state. The initial known facts compose the initial knowledge base. Inference is the process of obtaining new propositions (conclusions) from the knowledge base. To ensure that a reached conclusion is satisfied by the adopted interpretation, only a conclusion satisfied for all the interpretations that satisfy the starting propositions (premises) is accepted. This way we guarantee that, should the premises be satisfied, so is the conclusion, irrespective of the interpretation. Inference Rule Ex. (Modus Ponens): Premises: on(a,b); IF on(a,b) THEN above(a,b). Conclusion: above(a,b).
11 Logic Propositional Logic: facts. Predicate Logic: objects, functions and relations; variables; quantifiers.
12 Propositional vs Predicate Logic Example [Map: rooms S1, S2, S3; door P1 connects S1 and S2, door P2 connects S2 and S3; robot R in S1, box B in S3] World Model (KB). Propositional Logic: robot_inroom_s1; box_inroom_s3; door_p1_connects_rooms_s1_s2; door_p2_connects_rooms_s2_s3. Predicate Logic: inroom(<OBJ>, <ROOM>) with <OBJ> ← robot, <ROOM> ← S1 and <OBJ> ← box, <ROOM> ← S3; connects(<DOOR>, <ROOM1>, <ROOM2>) with <DOOR> ← P1, <ROOM1> ← S1, <ROOM2> ← S2 and <DOOR> ← P2, <ROOM1> ← S2, <ROOM2> ← S3.
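The predicate-logic world model above can be coded directly. The following is a hypothetical sketch (names and query syntax are illustrative, not from the slides): facts are stored as tuples and queried by pattern matching, with variables written as strings starting with `?`.

```python
# Hypothetical sketch: the predicate-logic KB above as a set of ground facts.
KB = {
    ("inroom", "robot", "S1"),
    ("inroom", "box", "S3"),
    ("connects", "P1", "S1", "S2"),
    ("connects", "P2", "S2", "S3"),
}

def query(pattern):
    """Return variable bindings for every fact matching the pattern.

    Pattern elements starting with '?' are variables; others must match
    the fact element exactly.
    """
    results = []
    for fact in KB:
        if len(fact) != len(pattern):
            continue
        binding = {}
        ok = True
        for p, f in zip(pattern, fact):
            if p.startswith("?"):
                # bind the variable, or check consistency with a prior binding
                if binding.setdefault(p, f) != f:
                    ok = False
                    break
            elif p != f:
                ok = False
                break
        if ok:
            results.append(binding)
    return results

print(query(("inroom", "box", "?room")))  # [{'?room': 'S3'}]
```

The propositional version would need one opaque symbol per fact (e.g., `box_inroom_s3`), so a query like "which room is the box in?" could not be expressed at all; the variable mechanism is exactly what predicate logic adds.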
13 Situation Calculus Logic handles the truth of propositions, not action execution: logic cannot tell which action should be executed; at most it can suggest the possible actions. Time and change are not adequately handled by basic logic (propositional, predicate). Idea: the world state is represented by a proposition set; the set is changed according to received perceptions and executed actions; the world evolution is described by diachronic rules, which express how the world changes (representation of change). Situation Calculus attempts to solve the problems associated with representation and reasoning under change. It is based on predicate logic and describes the world as a sequence of situations, each of which represents a world state.
14 Situation Calculus One situation is generated from another situation by executing an action. An argument is added to each property (represented by a predicate) that may change, denoting the situation where the property is satisfied. Ex: localization(agent, (1,1), S0); localization(agent, (1,2), S1). To represent passing from one situation to another, the following function is used: Result(action, situation): A × Σ → Σ. Ex: Result(go_ahead, S0) = S1
15 Situation Calculus Effect Axioms: pre-conditions (to execute the action) → predicate (whose logical value changes after the action is executed). State action effects to describe the change(s) due to the action effect(s), e.g.: ∀x ∀s Present(x, s) ∧ Portable(x) → Hold(x, Result(pickup, s)); ∀x ∀s Hold(x, s) → ¬Hold(x, Result(release, s))
16 Situation Calculus Frame Axioms: predicate (logical value in current situation) ∧ conditions (for no change) → predicate (in the situation following the action). One needs to state what does not change due to the action execution, e.g.: ∀a ∀x ∀s Hold(x, s) ∧ (a ≠ release) → Hold(x, Result(a, s)); ∀a ∀x ∀s ¬Hold(x, s) ∧ (a ≠ pickup ∨ ¬(Present(x, s) ∧ Portable(x))) → ¬Hold(x, Result(a, s))
17 Situation Calculus Successor State Axioms merge effect and frame axioms: predicate true in the next situation ↔ [one action makes it true ∨ (it was true in the previous situation ∧ no action made it false)]. E.g.: ∀a ∀x ∀s Hold(x, Result(a, s)) ↔ [(a = pickup ∧ Present(x, s) ∧ Portable(x)) ∨ (Hold(x, s) ∧ a ≠ release)]; ∀a ∀x ∀s ¬Hold(x, Result(a, s)) ↔ [(a = release) ∨ (¬Hold(x, s) ∧ (a ≠ pickup ∨ ¬(Present(x, s) ∧ Portable(x))))]
18 Situation Calculus Example (Blocks World) Initial Situation: A on B on C on the table. Final Situation: C on B on A. Action Sequence? Predicates: On(x, y, s), ClearTop(x, s), Block(x). Objects: A, B, C, M (blocks and table). Action: PutOn(x, y). Effect Axioms: ∀x ∀y ∀s Block(x) ∧ (Block(y) ∨ y = M) ∧ ClearTop(x, s) ∧ ClearTop(y, s) → On(x, y, Result(PutOn(x,y), s)); ∀x ∀y ∀w ∀s Block(x) ∧ (Block(y) ∨ y = M) ∧ ClearTop(x, s) ∧ ClearTop(y, s) ∧ On(x, w, s) → ClearTop(w, Result(PutOn(x,y), s))
19 Situation Calculus Example (Blocks World) Initial Situation: A on B on C on the table. Final Situation: C on B on A. Predicates: On(x, y, s), ClearTop(x, s), Block(x). Objects: A, B, C, M (blocks and table). Action: PutOn(x, y). Frame Axioms: ∀x ∀y ∀z ∀s On(x, y, s) ∧ (a ≠ PutOn(x, z)) → On(x, y, Result(a, s)); ∀x ∀y ∀s ClearTop(y, s) ∧ (a ≠ PutOn(x, y)) → ClearTop(y, Result(a, s))
20 Situation Calculus Example (Blocks World) Initial Situation: A on B on C on the table. Final Situation: C on B on A. Predicates: On(x, y, s), ClearTop(x, s), Block(x). Objects: A, B, C, M (blocks and table). Action: PutOn(x, y). Resulting Successor State Axioms: ∀x ∀y ∀z ∀a ∀s On(x, y, Result(a, s)) ↔ [(a = PutOn(x, y) ∧ On(x, z, s) ∧ ClearTop(x, s) ∧ ClearTop(y, s) ∧ Block(x) ∧ (Block(y) ∨ y = M)) ∨ (a ≠ PutOn(x, z) ∧ On(x, y, s))]; ∀x ∀y ∀z ∀a ∀s ClearTop(z, Result(a, s)) ↔ [(a = PutOn(x, y) ∧ On(x, z, s) ∧ ClearTop(x, s) ∧ ClearTop(y, s) ∧ Block(x) ∧ (Block(y) ∨ y = M)) ∨ (a ≠ PutOn(x, z) ∧ ClearTop(z, s))]
21 Situation Calculus Example (Blocks World) Initial Situation: A on B on C on the table. Final Situation: C on B on A. Predicates: On(x, y, s), ClearTop(x, s), Block(x). Objects: A, B, C, M (blocks and table). Action: PutOn(x, y). Initial State: Block(A) ∧ Block(B) ∧ Block(C) ∧ On(C, M, s0) ∧ On(B, C, s0) ∧ On(A, B, s0) ∧ ClearTop(A, s0) ∧ ClearTop(M, s0). Goal State: ∃s Block(A) ∧ Block(B) ∧ Block(C) ∧ On(A, M, s) ∧ On(B, A, s) ∧ On(C, B, s) ∧ ClearTop(C, s) ∧ ClearTop(M, s)
22 Situation Calculus Example (Blocks World) Effect Axioms: E1) ∀x ∀y ∀s Block(x) ∧ (Block(y) ∨ y = M) ∧ ClearTop(x, s) ∧ ClearTop(y, s) → On(x, y, Result(PutOn(x,y), s)); E2) ∀x ∀y ∀w ∀s Block(x) ∧ (Block(y) ∨ y = M) ∧ ClearTop(x, s) ∧ ClearTop(y, s) ∧ On(x, w, s) → ClearTop(w, Result(PutOn(x,y), s)). Frame Axioms: N1) ∀x ∀y ∀z ∀s On(x, y, s) ∧ (a ≠ PutOn(x, z)) → On(x, y, Result(a, s)); N2) ∀x ∀y ∀s ClearTop(y, s) ∧ (a ≠ PutOn(x, y)) → ClearTop(y, Result(a, s)). Initial Situation (s0): Block(A); Block(B); Block(C); On(C, M, s0); On(B, C, s0); On(A, B, s0); ClearTop(A, s0); ClearTop(M, s0). In s0, axiom E1) is applicable with x=A and y=M: Block(A) ∧ (Block(M) ∨ M = M) ∧ ClearTop(A, s0) ∧ ClearTop(M, s0) → On(A, M, Result(PutOn(A,M), s0)). If s1 = Result(PutOn(A,M), s0) then On(A, M, s1). In s0, axiom E2) is applicable with x=A, y=M and w=B, hence ClearTop(B, s1). In s0, axiom N1) is applicable with a=PutOn(A,M), x=C and y=M, hence On(C, M, s1). In s0, axiom N1) is applicable with a=PutOn(A,M), x=B and y=C, hence On(B, C, s1). In s0, axiom N1) is not applicable with a=PutOn(A,M), x=A and y=B (i.e., On(A, B, s1) is false). In s0, axiom N2) is applicable with a=PutOn(A,M) and y=A, hence ClearTop(A, s1). Situation s1: On(A, M, s1); On(C, M, s1); On(B, C, s1); ClearTop(B, s1); ClearTop(A, s1); ClearTop(M, s1)
23 Situation Calculus Example (Blocks World) [Effect axioms E1, E2, frame axioms N1, N2 and initial situation s0 as on the previous slide] In s1, axiom E1) is applicable with x=B and y=A: Block(B) ∧ (Block(A) ∨ A = M) ∧ ClearTop(B, s1) ∧ ClearTop(A, s1) → On(B, A, Result(PutOn(B,A), s1)). If s2 = Result(PutOn(B,A), s1) then On(B, A, s2). In s1, axiom E2) is applicable with x=B, y=A and w=C, hence ClearTop(C, s2). In s1, axiom N1) is applicable with a=PutOn(B,A), x=C and y=M, hence On(C, M, s2). In s1, axiom N1) is applicable with a=PutOn(B,A), x=A and y=M, hence On(A, M, s2). In s1, axiom N1) is not applicable with a=PutOn(B,A), x=B and y=C (i.e., On(B, C, s2) is false). In s1, axiom N2) is applicable with a=PutOn(B,A) and y=B, hence ClearTop(B, s2). In s1, axiom N2) is not applicable with a=PutOn(B,A) and y=A (i.e., ClearTop(A, s2) is false). Situation s2: On(A, M, s2); On(C, M, s2); On(B, A, s2); ClearTop(C, s2); ClearTop(B, s2); ClearTop(M, s2)
24 Situation Calculus Example (Blocks World) [Effect axioms E1, E2, frame axioms N1, N2 and initial situation s0 as on the previous slides] In s2, axiom E1) is applicable with x=C and y=B: Block(C) ∧ (Block(B) ∨ B = M) ∧ ClearTop(C, s2) ∧ ClearTop(B, s2) → On(C, B, Result(PutOn(C,B), s2)). If s3 = Result(PutOn(C,B), s2) then On(C, B, s3). In s2, axiom N1) is applicable with a=PutOn(C,B), x=B and y=A, hence On(B, A, s3). In s2, axiom N1) is applicable with a=PutOn(C,B), x=A and y=M, hence On(A, M, s3). In s2, axiom N1) is not applicable with a=PutOn(C,B), x=C and y=M (i.e., On(C, M, s3) is false). In s2, axiom N2) is applicable with a=PutOn(C,B) and y=C, hence ClearTop(C, s3). In s2, axiom N2) is not applicable with a=PutOn(C,B) and y=B (i.e., ClearTop(B, s3) is false). Situation s3: On(C, B, s3); On(B, A, s3); On(A, M, s3); ClearTop(C, s3); ClearTop(M, s3)
25 Situation Calculus Example (Blocks World) Initial Situation a b c Action Sequence (plan) [ PutOn(A, M), PutOn(B, A), PutOn(C, B) ] Final Situation c b a
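The derivation above can be replayed programmatically. The sketch below (illustrative only, not from the slides) encodes the blocks-world state as a map from each block to its support and applies the E1/E2 effect axioms as state updates while executing the plan [PutOn(A,M), PutOn(B,A), PutOn(C,B)].

```python
# Illustrative sketch: executing the blocks-world plan with the effect
# axioms coded as state updates. 'M' is the table; the state maps each
# block to what it rests on.
def clear(state, y):
    # y is clear if no block rests on it; the table M is always clear.
    return y == "M" or all(support != y for support in state.values())

def put_on(state, x, y):
    # Preconditions of E1/E2: both x and y (block or table) must be clear.
    assert clear(state, x) and clear(state, y), "preconditions violated"
    new_state = dict(state)
    new_state[x] = y      # E1: On(x, y) becomes true
    return new_state      # E2: x's former support becomes clear implicitly

s = {"C": "M", "B": "C", "A": "B"}   # initial situation: A on B on C on M
for x, y in [("A", "M"), ("B", "A"), ("C", "B")]:
    s = put_on(s, x, y)
print(s)  # {'C': 'B', 'B': 'A', 'A': 'M'}  -> C on B on A on M
```

Each `put_on` call corresponds to one application of E1 (with E2 implicit in the representation), and the frame axioms N1/N2 are enforced for free because the unchanged entries of the dictionary simply persist.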
26 Complexity of Planning Problem The problem is intractable in the general case. Simplifying assumptions: the agent knows everything that is relevant for the planning problem; the agent knows how its available actions can change the world state from one state to another; the planning agent is in control of the world: the only state changes are the result of its deliberate actions; the agent's preferred world states are constant during a planning episode. Based on these assumptions, a typical approach is: first formulate the plan, then execute it.
27 Extensions of Planning Problem The real world surrounding the robot does not meet most of the simplifying assumptions, especially in dynamic, uncertain environments. EXTENSIONS: conditional planning: handles uncertainty by enumerating the possible states that may arise after the execution of an action and provides alternative courses of action for each of them; plan monitoring and repair: during plan execution, progress is monitored and, when deviations from the predicted nominal conditions occur, the plan execution halts and a revised plan is created; continual planning: in dynamic environments, one may allow the context and/or the agent's preferences to change, and plan revision is an ongoing process rather than one triggered by failures of the nominal plan. Planning is not made in too much detail into the future, and it is interleaved with execution.
28 Basic Planning Problem Formulation A possible formulation of the Planning problem is (LaValle, 1996): 1. A nonempty state space, X, which is a finite or countably infinite set of states. 2. For each state, x ∈ X, a finite action space, U(x). 3. A state transition function, f, which produces a state, f(x, u) ∈ X, for every x ∈ X and u ∈ U(x). The state transition equation is derived from f as x′ = f(x, u). 4. An initial state, x_I ∈ X. 5. A goal set, X_G ⊆ X.
29 Basic Planning Problem Formulation Represent the planning problem as a directed state transition graph: the set of vertices is the state space, X; a directed edge from x ∈ X to x′ ∈ X exists in the graph if there exists an action u ∈ U(x) such that x′ = f(x, u); the initial state and goal set are designated as special vertices in the graph. [Figure: example graph with states X1 (object on the left), X2 (object in hand), X3 (object on ground), X4 (object on the right) and actions u1 = pick, u2 = push, u3 = release]
30 Basic Planning Problem Formulation Based on this formulation, several problem-solving algorithms are available to find a feasible plan (i.e., one that leads from the initial state to one of the goal states, not necessarily optimal). Examples: breadth-first, depth-first, best-first, A*, ... Algorithms to solve Discrete Optimal Planning problems also exist, typically based on Dynamic Programming: find the sequence of actions that leads to the goal set and optimizes some criterion, such as distance traversed or energy spent; costs are associated with actions.
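Breadth-first search over the formulation above can be sketched in a few lines. The toy instance at the bottom is hypothetical (loosely based on the object-manipulation graph of the previous slide); `U(x)` and `f(x, u)` are passed in as functions so the state space stays implicit.

```python
# Minimal sketch: BFS over the (X, U(x), f, x_I, X_G) formulation.
from collections import deque

def bfs_plan(x_init, is_goal, actions, f):
    """Return a feasible (shortest) action sequence from x_init to a goal."""
    frontier = deque([(x_init, [])])
    visited = {x_init}
    while frontier:
        x, plan = frontier.popleft()
        if is_goal(x):
            return plan
        for u in actions(x):
            x_next = f(x, u)
            if x_next not in visited:
                visited.add(x_next)
                frontier.append((x_next, plan + [u]))
    return None  # no feasible plan exists

# Hypothetical transition table: (state, action) -> next state.
trans = {("left", "pick"): "hand", ("hand", "release"): "ground",
         ("hand", "push"): "right", ("ground", "pick"): "hand"}
plan = bfs_plan("left",
                lambda x: x == "right",
                lambda x: [u for (s, u) in trans if s == x],
                lambda x, u: trans[(x, u)])
print(plan)  # ['pick', 'push']
```

Swapping the FIFO queue for a stack gives depth-first search; adding a priority queue keyed on accumulated cost (plus a heuristic) gives best-first or A*.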
31 Logic-Based Planning ADVANTAGES build compact representations for discrete planning problems, when their regularity allows such compression convenient for producing output that logically explains the steps involved to arrive at some goal DISADVANTAGES difficult to generalize to enable concepts such as modeling uncertainty, unpredictability, sensing errors, and game theory to be incorporated into planning
32 Logic-Based Planning It is possible to convert the logic-based formulation into the graph-based formulation: e.g., the set of literals may be encoded as a binary string by imposing a linear ordering on the instances and predicates, and using 1s for true and 0s for false. This way, even optimal solutions can be found, if we associate costs with actions. [Table: binary encoding of states x1, x2, ... over the literals obj_on_the_left, obj_on_the_right, obj_in_hand, obj_on_ground]
33 Logic-Based Planning However, the problem dimension may become intractable, even for a small number of predicates and instances: e.g., with a constant number k of arguments per predicate, the state space dimension is 2^(|P|·|I|^k), where |P| is the number of predicates and |I| the number of instances per predicate argument. Example: 4 predicates (|P| = 4) with 1 argument (k = 1), namely left(<obj>), right(<obj>), inhand(<obj>), ground(<obj>), and 3 objects (|I| = 3), namely bolt, nut, bin, giving 2^12 states.
34 Logic-Based Planning A STRIPS-like Planning formulation is (LaValle, 1996): 1. A nonempty set, I, of instances. 2. A nonempty set, P, of predicates, which are binary-valued (partial) functions of one or more instances. Each application of a predicate to a specific set of instances is called a positive literal if the predicate is true or a negative literal if it is false. 3. A nonempty set, O, of operators, each of which has: 1) preconditions, which is a set of positive and negative literals that must hold for the operator to apply, and 2) effects, which is a set of positive and negative literals that are the result of applying the operator. 4. An initial set, S, which is expressed as a set of positive literals. All literals not appearing in S are assumed to be negative. 5. A goal set, G, which is expressed as a set of both positive and negative literals.
35 Logic-Based Planning STRIPS (Stanford Research Institute Problem Solver) (Fikes, Nilsson, 1971) Example: a mobile robot should move a box from room S3 to S2. [Map: rooms S1, S2, S3; door P1 connects S1 and S2, door P2 connects S2 and S3; robot R in S1, box B in S3] World Model (KB): inroom(robot, room_s1); inroom(box, room_s3); connects(door_p1, room_s1, room_s2); connects(door_p2, room_s2, room_s3). Goal: inroom(box, room_s2). Plan (Action Sequence): move(robot, room_s1, room_s3); search(box); push(box, room_s3, room_s2, door_p2)
36 Logic-Based Planning STRIPS (Stanford Research Institute Problem Solver) (Fikes, Nilsson, 1971) Tasks are specified as well-formed formulas or wffs (predicate calculus). The planning system attempts to find an action sequence that modifies the world model so as to make the wff TRUE. To generate a plan, the effect of each action is modeled.
37 Logic-Based Planning STRIPS (Stanford Research Institute Problem Solver) (Fikes, Nilsson, 1971) An operator (an action over the world model) maps world model S_i (a clause set) into world model S_{i+1} (a clause set), and consists of pre-conditions, clauses to add and clauses to remove. Example: applying the robot's move operator to {inroom(robot, room_s1); inroom(box, room_s3); connects(door_p1, room_s1, room_s2); connects(door_p2, room_s2, room_s3)} yields {inroom(robot, room_s2); inroom(box, room_s3); connects(door_p1, room_s1, room_s2); connects(door_p2, room_s2, room_s3)}. Planning loop: 1. Is the goal clause in the current world model? YES: success. NO: 2. Search the operator list for one whose pre-conditions are satisfied and that, when applied to the current world model, produces a new world model where the goal is closer to being satisfied. 3. Go to 1.
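The STRIPS loop above is easy to sketch with clause sets as Python sets. This is a hedged, simplified illustration: the operators below are hypothetical reductions of the slide's example (the `search(box)` step is omitted, and step 2's "closer to the goal" heuristic is replaced by taking the first applicable operator).

```python
# Hedged sketch of the STRIPS loop: operators with preconditions and
# add/delete clause lists, applied forward until the goal clause holds.
def applicable(state, op):
    return op["pre"] <= state           # preconditions are a subset of state

def apply_op(state, op):
    return (state - op["delete"]) | op["add"]

ops = [
    {"name": "move(robot,S1,S3)",
     "pre": {"inroom(robot,S1)"},
     "add": {"inroom(robot,S3)"}, "delete": {"inroom(robot,S1)"}},
    {"name": "push(box,S3,S2,P2)",
     "pre": {"inroom(robot,S3)", "inroom(box,S3)"},
     "add": {"inroom(box,S2)", "inroom(robot,S2)"},
     "delete": {"inroom(box,S3)", "inroom(robot,S3)"}},
]

state = {"inroom(robot,S1)", "inroom(box,S3)"}
goal = "inroom(box,S2)"
plan = []
while goal not in state:                               # step 1: goal reached?
    op = next(o for o in ops if applicable(state, o))  # step 2: pick operator
    state = apply_op(state, op)
    plan.append(op["name"])
print(plan)  # ['move(robot,S1,S3)', 'push(box,S3,S2,P2)']
```

The add/delete-list mechanics shown here are exactly what replaces the explicit frame axioms of situation calculus: everything not deleted or added simply carries over.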
38 Logic-Based Planning STRIPS and Situation Calculus STRIPS OPERATOR move(robot, room_s1, room_s2): Pre-conditions: inroom(robot, room_s1) ∧ connects(door_p1, room_s1, room_s2). Effects: Add: inroom(robot, room_s2); Delete: inroom(robot, room_s1). Situation Calculus: ∀a ∀s room(s2) → { inroom(robot, s2, Result(a, s)) ↔ [ (room(s1) ∧ a = move(robot, s1, s2) ∧ inroom(robot, s1, s)) ∨ (inroom(robot, s2, s) ∧ ∀x (room(x) → a ≠ move(robot, s2, x))) ] }
39 Plan Representation and Task Modeling How to represent and determine the right plan? [Figure: behavior-switching finite state machine for a soccer robot, with behaviors Standby, GetClose2Ball, TakeBall2Goal, Score, ClearBall, GoEmptySpot and GoHome, switched by events and predicates such as saw_ball, lost_ball, unreachable_ball, undribbable, no_ball, obstacle, success, unreachable_posture, can_shoot_safely and ShouldIGo]
40 Plan Representation and Task Modeling to design a plan that meets some specifications we need a model of the robot task the plan is supposed to carry out a model enables performance analysis, formal verification (model checking) robot tasks are discrete event systems (DES) event-driven (not time-driven) discrete state space (not continuous)
41 Plan Representation and Task Modeling [Figure: a time-driven system has a continuous state trajectory x(t) evolving over continuous time; an event-driven system has a discrete state space, with the state jumping among values x0, x1, x2 when events e1, e2, e3 occur at times t1, t2, t3, t4]
42 Plan Representation and Task Modeling DES: State Machines / Finite State Automata
43 Petri Nets [Figure: example Petri net with places p1, p2, p3, p4 and transitions t1, t2, t3]
44 Petri Nets Def.: A Petri net (PN) graph or structure is a weighted bipartite graph (P, T, A, w), where: P = {p1, p2, ..., pn} is the finite set of places; T = {t1, t2, ..., tm} is the finite set of transitions; A ⊆ (P × T) ∪ (T × P) is the set of arcs from places to transitions (pi, tj) and from transitions to places (tj, pi); w: A → {1, 2, 3, ...} is the weight function on the arcs. Set of input places to tj ∈ T: I(tj) = {pi ∈ P : (pi, tj) ∈ A}. Set of output places from tj ∈ T: O(tj) = {pi ∈ P : (tj, pi) ∈ A}
45 Petri Nets Def.: A marked Petri net is a five-tuple (P, T, A, w, x), where (P, T, A, w) is a Petri net graph and x is a marking of the set of n places P; x = [x(p1), x(p2), ..., x(pn)] ∈ N^n is the row vector associated with x.
46 Petri Nets Def. (PN dynamics): The state transition function, f: N^n × T → N^n, of Petri net (P, T, A, w, x) is defined for transition tj ∈ T iff x(pi) ≥ w(pi, tj) for all pi ∈ I(tj); tj is then said to be enabled. If f(x, tj) is defined, the new state is x′ = f(x, tj), where x′(pi) = x(pi) − w(pi, tj) + w(tj, pi), i = 1, ..., n.
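The enabling and firing rules of the definition above translate directly into code. The net below is hypothetical (the arc structure of the slides' figure is not fully recoverable from the transcription), but the functions implement exactly x′(pi) = x(pi) − w(pi, tj) + w(tj, pi).

```python
# Sketch of PN dynamics: t is enabled iff every input place p_i holds at
# least w(p_i, t) tokens; firing removes w(p_i, t) tokens from each input
# place and adds w(t, p_i) tokens to each output place. Net is hypothetical.
W_in  = {("p1", "t1"): 1, ("p2", "t2"): 1, ("p3", "t2"): 1, ("p3", "t3"): 1}
W_out = {("t1", "p2"): 1, ("t1", "p3"): 1, ("t2", "p4"): 1, ("t3", "p3"): 1}

def enabled(x, t):
    return all(x[p] >= w for (p, tj), w in W_in.items() if tj == t)

def fire(x, t):
    assert enabled(x, t), f"{t} is not enabled"
    x = dict(x)
    for (p, tj), w in W_in.items():    # consume tokens from input places
        if tj == t:
            x[p] -= w
    for (tj, p), w in W_out.items():   # produce tokens in output places
        if tj == t:
            x[p] += w
    return x

x = {"p1": 1, "p2": 0, "p3": 0, "p4": 0}
x = fire(x, "t1")   # t1 consumes the token in p1, marks p2 and p3
print(x)  # {'p1': 0, 'p2': 1, 'p3': 1, 'p4': 0}
```

Note that `fire` returns a fresh marking instead of mutating the old one, which makes it convenient for enumerating the reachability set later.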
47-51 Petri Nets [Figures: token game on the example net (places p1-p4, transitions t1-t3), showing the marking evolve as enabled transitions fire]
52 Plan Representation and Modeling Petri Net Models of Robotic Tasks (Lima et al, 1998) (Milutinovic, Lima, 2002)(Costelha, Lima, 2012) Places with tokens represent resources available primitive actions running State is distributed over the places with tokens (PN marking) Events assigned to transitions and represent uncontrolled changes of state (e.g., caused by other agents or simply by the environment dynamics) controlled decisions to start a primitive action Transition fires when it is enabled and the labeling event occurs (note: the labeling event may be replaced by input/output arcs to places representing the reading of sensors for modeling and analysis)
53 Plan Representation and Modeling PN model of a multi-task single robot [Figure: Petri net whose places include behaviors such as move2post, teardown_pole, back2track, standby, following_track, look_ahead, look_left, look_right, following_resuming_point and check_if_track, linked by events such as detected_post, reached_post, pole_down, track_found, track_not_found, found_interrupt and no_resuming_point]
54 Plan Representation and Modeling (Lima et al, 1998) A Tool for Robotic Task Design and Distributed Execution. Further developments in (Milutinovic, Lima, 2002). [Figure: ball-catching Petri net with places p1 (standby), p2 (locating_ball), p3 (moving2ball), p4 (catching_ball) plus resource places vision_ready2locate_ball and robot_ready2move, and transitions t1 (start), t2 (new_frame), t3 (ball_located), t4 (ready2catch), t5 (ball_catched)]
55 Petri Nets Def. (Labeled Petri net): A labeled Petri net N is an eight-tuple N = (P, T, A, w, E, l, x0, Xm) where: (P, T, A, w) is a PN graph; E is the event set for transition labeling; l: T → E is the transition labeling function; x0 ∈ N^n is the initial state; Xm ⊆ N^n is the set of marked states. Def. (Languages generated and marked): L(N) := {l(s) ∈ E* : s ∈ T* and f(x0, s) is defined}; Lm(N) := {l(s) ∈ L(N) : s ∈ T* and f(x0, s) ∈ Xm}
56-60 Plan Representation and Modeling Petri Nets (PN) Language Model. Petri net N with event set E = {s, nf, bl, r2c, bc} and labeling l(t1) = s, l(t2) = nf, l(t3) = bl, ...; initial state x0 and set of marked states Xm = {x0, ...} as in the figure. [Figure: the ball-catching net, with places standby, locating_ball, moving2ball, catching_ball, vision_ready2locate_ball, robot_ready2move and transitions t1 (start), t2 (new_frame), t3 (ball_located), t4 (ready2catch), t5 (ball_catched)] As transitions fire from x0, the marking (state) x evolves and the generated string grows: ε, then s, then s nf, then s nf bl, ..., each in L. Generated and Marked Languages: L(N) = {ε, s, s nf, s nf bl, ...}; Lm(G) = {ε, s, s nf, ..., s nf bl r2c bc} ⊆ L(G)
61 Plan Representation and Modeling Monitoring algorithms check the value of predicates over world state variables. Event occurrence means that a logical function of the predicates changed from true to false or vice-versa. Examples of events: found_ball: see(ball) = false → true; lost_ball: see(ball) = true → false; see_ball AND closest_player2ball = false → true
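The monitoring scheme above amounts to edge detection on predicate values between consecutive monitoring cycles. A minimal sketch (class and event names are hypothetical, not from the slides):

```python
# Sketch: an event fires when a monitored predicate's truth value changes
# between consecutive monitoring cycles (false->true or true->false edge).
class EdgeDetector:
    def __init__(self):
        self.prev = {}   # last observed value of each predicate

    def step(self, predicates):
        """predicates: dict mapping predicate name -> bool.
        Returns the list of events detected this cycle."""
        events = []
        for name, value in predicates.items():
            old = self.prev.get(name, value)  # first sample: no edge
            if not old and value:
                events.append(f"found_{name}")   # false -> true edge
            elif old and not value:
                events.append(f"lost_{name}")    # true -> false edge
            self.prev[name] = value
        return events

d = EdgeDetector()
d.step({"ball": False})         # first cycle, no edge yet
print(d.step({"ball": True}))   # ['found_ball']
print(d.step({"ball": False}))  # ['lost_ball']
```

A compound event such as "see_ball AND closest_player2ball" is obtained by monitoring the conjunction itself as one derived predicate, so the edge is detected on the logical function rather than on its operands.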
62 Plan Representation and Modeling PN markings represent world states. A plan to carry out a task is the sequence of primitive actions in a sequence of markings (world states). Plans are conditional, as resource places in markings represent logical pre-conditions for the execution of the next primitive action. Example: primitive actions set X = {GetCloseToBall, TakeBallToGoal, Score}. Plan: GetCloseToBall . TakeBallToGoal . Score
63 Plan Representation and Modeling Event sequences (i.e., strings) are an equivalent representation of plans. A language is the set of all possible plans for a robot. Different language classes are equivalent to machine types used to represent and execute the task (Finite State Automata, PNs, ...). Of course, larger classes have an increased modeling power (e.g., PN languages vs regular/finite state machine languages). Do not confuse this with modeling elegance: it is more natural to program with a rule-based system than with a state machine, but it is not necessarily more powerful (compare C with assembly).
64 Plan Representation and Modeling Abstraction Levels in Discrete Event Systems. Untimed: event sequences e1, e2, ..., ek, ...; models are FSA and PN, with state sequence x0, x1, ..., xk, ... Timed: a time (duration) is associated to events/transitions; models are Timed FSA and Timed PN, with state x(t). Stochastic Timed: a stochastic time (duration) is associated to events/transitions; models are STA and SPN, with state x(t) and probability p(x(t)).
65 Stochastic DES STOCHASTIC TIMED AUTOMATA (STA) inter-event time is stochastically distributed (typical case: exponential pdf) STOCHASTIC PETRI NET (SPN) inter-event time is stochastically distributed (typical case: exponential pdf) stochastic inter-event time assigned to transitions SPN with exponential timed transitions is equivalent to a Markov Chain
66 Controllable vs Uncontrollable Events in PNs (Costelha and Lima, 2012) Conflict between transitions enabled by different predicates (whose value is not controlled by the robot): uncertain action effects; e.g., the probability that robot_does_not_see_ball happens before getting close to the ball is λ2 / (λ2 + λ3). Conflict between controllable events (associated to commands to start Dribble2Goal or Kick2Goal): random switch, where the probability of choosing Dribble2Goal is p5 and the probability of choosing Kick2Goal is p7, i.e., a probabilistic policy.
67 PN Hierarchical Plan Representation (Costelha and Lima, 2012)
68 Generalized Stochastic Petri Net Closed Loop Robot Plan / Environment (Costelha and Lima, 2012) stochastic transitions; immediate transitions
69 Plan Qualitative Analysis (Formal Verification) Qualitative view/models enable answering analysis questions such as: will bad behaviors occur? will unsafe states be avoided? will we attempt to use more resources than those available? Qualitative view/models enable designing supervisors for specifications such as: eliminate substrings corresponding to bad behaviors avoid blocking ensure bounded usage of resources
70 Plan Qualitative Analysis (Formal Verification) Safety properties: for all executions the system avoids a bad set of events, or a set of bad strings is never generated or marked; e.g., the robot does not enter a room where holes in the ground exist (any sequence including traversing a door leading to the room must be disabled from happening). Blocking properties: deadlocks or livelocks; e.g., a robot that can only move forward enters a corridor with a dead end.
71 Plan Qualitative Analysis Def. (Boundedness): Place pi ∈ P in PN N with initial state x0 is said to be k-bounded, or k-safe, if x(pi) ≤ k for all states x ∈ R(N), i.e., for all reachable states. E.g., the robot cannot be called for a transportation task a 2nd time while it is performing the same task (the place corresponding to the robot performing the transportation task should be 1-bounded, or safe). Def. (Conservation): A PN N with initial state x0 is said to be conservative with respect to γ = [γ1, γ2, ..., γn] if Σ_{i=1}^{n} γi x(pi) = constant for all reachable states. E.g., a robot with only one tool can never use 2 tools simultaneously during the performance of the whole task.
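Both properties above can be checked mechanically on small nets by enumerating the reachability set R(N). The sketch below does so for a hypothetical two-place cycle p1 → t1 → p2 → t2 → p1 with one token, verifying conservation with respect to γ = [1, 1] and, as a by-product, 1-boundedness.

```python
# Sketch: check conservation (gamma . x constant) and k-boundedness by
# enumerating reachable markings of a hypothetical net p1 -t1-> p2 -t2-> p1.
from collections import deque

W_in  = {("p1", "t1"): 1, ("p2", "t2"): 1}
W_out = {("t1", "p2"): 1, ("t2", "p1"): 1}
places, transitions = ["p1", "p2"], ["t1", "t2"]

def fire(x, t):
    # Return the successor marking, or None if t is not enabled in x.
    if any(x[places.index(p)] < w for (p, tj), w in W_in.items() if tj == t):
        return None
    x = list(x)
    for (p, tj), w in W_in.items():
        if tj == t:
            x[places.index(p)] -= w
    for (tj, p), w in W_out.items():
        if tj == t:
            x[places.index(p)] += w
    return tuple(x)

x0, gamma = (1, 0), [1, 1]
reachable, frontier = {x0}, deque([x0])
while frontier:                       # breadth-first reachability analysis
    x = frontier.popleft()
    for t in transitions:
        xn = fire(x, t)
        if xn is not None and xn not in reachable:
            reachable.add(xn)
            frontier.append(xn)

weighted = {sum(g * xi for g, xi in zip(gamma, x)) for x in reachable}
print(len(weighted) == 1)              # conservative w.r.t. gamma
print(max(max(x) for x in reachable))  # smallest k for which all places are k-bounded
```

The same enumeration underlies liveness checking on the next slide; for unbounded nets the reachability set is infinite and coverability techniques are needed instead, which is beyond this sketch.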
72 Plan Qualitative Analysis Def. (Liveness): A PN N with initial state x0 is said to be live if there always exists some sample path such that any transition can eventually fire from any state reached from x0. Liveness levels: a transition in a PN may be Dead or L0-live, if the transition can never fire from this state; L1-live, if there is some firing sequence from x0 such that the transition can fire at least once; L2-live, if the transition can fire at least k times for some given positive integer k; L3-live, if there exists some infinite firing sequence in which the transition appears infinitely often; L4-live, if the transition is L1-live for every possible state reached from x0. This property is related to the reachability of given states, and to the repeatability of system states (e.g., error recovery and returning to the initial state).
73 Plan Qualitative Analysis Def. (Liveness): A PN N with initial state x0 is said to be live if there always exists some sample path such that any transition can eventually fire from any state reached from x0. Liveness levels, robot task examples: Dead or L0-live: robot in a deadlock situation; L1-live: after the robot picks an object it will not be able to pick it again later; L2-live: robot can only perform an action sequence with a finite number of steps (e.g., release as many objects as those it picked before, until its transported bin is empty); L3-live: robot keeps repeating the same action sequence forever; L4-live: robot can always return to the same state and repeat the same operation.
74 Plan Quantitative Analysis
Stochastic views/models enable answering analysis questions such as:
- what is the probability of success of a task plan?
- given a probability of success for the plan, how many steps (actions) will it take to accomplish the task?
Stochastic views/models also enable designing controllers for specifications such as:
- given some allowed number of steps for a plan, determine the plan that maximizes the probability of success
- given some desired probability of success, determine the plan that minimizes the number of required actions, or the accumulated action cost
75 Markov Decision Process (MDP)
A Markov Chain is a stochastic process X(t) with discrete state space which satisfies the Markov property:
Pr{ x_{t+1} = x_j | x_t = x_i, x_{t−1} = x_k, ..., x_0 } = Pr{ x_{t+1} = x_j | x_t = x_i } = p_ij
where the p_ij are the transition probabilities. Adding actions to a Markov Chain makes the transition probabilities depend on the action taken.
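A minimal sketch of the Markov property in practice: simulating a hypothetical 2-state chain and estimating one transition probability from a long sample path (the matrix `P` and all numbers below are assumed for illustration, not taken from the slides).

```python
import random

random.seed(0)

# Hypothetical 2-state Markov chain; each row of P sums to 1.
P = [[0.9, 0.1],
     [0.4, 0.6]]

def step(i):
    # Sample the next state given only the current state i (Markov property).
    return 0 if random.random() < P[i][0] else 1

# Estimate p_01 from transition counts along one sample path.
x, visits0, moves01 = 0, 0, 0
for _ in range(100_000):
    nxt = step(x)
    if x == 0:
        visits0 += 1
        moves01 += (nxt == 1)
    x = nxt

p01_hat = moves01 / visits0
print(round(p01_hat, 2))  # ≈ 0.1, the true p_01
```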
76 Markov Decision Process (MDP)
A Markov Chain with
- transition probabilities dependent on actions (u),
- rewards r associated to each (state x, action u) pair,
- an associated cost/performance function,
is known as a Markov Decision Process (MDP):
Pr{ x_{t+1} = x', r_{t+1} = r | x_t, u_t, r_t, x_{t−1}, u_{t−1}, ..., r_1, x_0, u_0 } = Pr{ x_{t+1} = x', r_{t+1} = r | x_t, u_t }
(diagram: "object on the table" --grasp, 1.0--> "object grasped"; pickup/release arcs with probability 0.5 leading to "object on the floor"; no rewards included in the diagram)
77 Planning as Solving MDPs
- Conflict between transitions enabled by different predicates (whose values are not controlled by the robot) models uncertain action effects — e.g., the probability that the robot stops seeing the ball before getting close to it is λ_2 / (λ_2 + λ_3).
- Conflict between controllable events (associated to the commands that start Dribble2Goal or Kick2Goal) is resolved by a random switch: the probability of choosing Dribble2Goal is p_5, and the probability of choosing Kick2Goal is 1 − p_5 — a probabilistic policy.
The resulting GSPN is equivalent to an MDP.
78 STOCHASTIC PETRI NETS — PN Stochastic Timed Model
Def.: A Stochastic PN is a 6-tuple (P, T, A, w, x, F) where (P, T, A, w, x) is a marked PN, and F: R[x_0] × T → ℝ is a function that associates to each transition t, in each reachable marking x, a random variable.
Def.: A Generalized Stochastic PN is a 7-tuple (P, T = T_0 ∪ T_D, A, w, x, F, S) where (P, T, A, w, x) is a marked PN, and F: R[x_0] × T_D → ℝ is a function that associates to each timed transition t ∈ T_D, in each reachable marking x, a random variable. Each t ∈ T_0 has zero firing time in all reachable x. S is a set (possibly empty) of elements called random switches, which associate probability distributions to subsets of conflicting immediate transitions.
79 EXPONENTIAL TIMED PETRI NETS
For Exponential Timed PNs, in the two previous definitions, F: R[x_0] × T_D → ℝ is a function that associates to each transition t_j ∈ T_D, in each reachable marking x, an exponential random variable with rate λ_j(x). The transitions in T_D are known as exponential transitions, and λ_j(x) is referred to as the firing rate of t_j in x.
80 EXPONENTIAL TIMED PETRI NETS
Theorem: The marking process of an exponential timed Petri net is a continuous-time Markov Chain (CTMC).
- State space of the equivalent CTMC: the reachability set R[x_0] of the exponential timed Petri net.
- Computation of the transition rate from state x_i to state x_j ≠ x_i:
  q_ij = Σ_{t_k ∈ T_ij} λ_k(x_i)
  where T_ij is the subset of T_D of transitions enabled in x_i such that the firing of any transition in T_ij leaves the CTMC in x_j.
- Diagonal entries: q_ii = −Σ_{j ≠ i} q_ij.
81 GENERALIZED SPN (GSPN)
When there is conflict in state x_i, if T_i is the set of enabled transitions in x_i, the probability of firing t_j ∈ T_i is determined as follows:
- if T_i is composed of exponential transitions only: λ_j(x_i) / Σ_{t_k ∈ T_i} λ_k(x_i)
- if T_i includes one single immediate transition, that is the one that will fire
- if T_i includes two or more immediate transitions, a probability mass function is specified over them by an element of S. The subset of immediate transitions plus the switching distribution is called a random switch.
82 GSPN AND EQUIVALENT CTMC
To ensure the existence of a unique steady-state probability vector (ρ_1, ..., ρ_s) for the marking process of a GSPN with s tangible markings, the following simplifying assumptions are made:
1. The GSPN is bounded, i.e., its reachability set is finite.
2. Firing rates do not depend on time parameters, ensuring that the equivalent MC is homogeneous.
3. The GSPN model is proper and deadlock-free, i.e., the initial marking is reachable with non-zero probability from any marking in the reachability set, and there is no absorbing marking (this assumption can be lifted).
83 EXAMPLE: GSPN AND EQUIVALENT CTMC
(figure: GSPN with places p_1 ... p_6 labeled p.ontable(obj), p.grasped(obj), a.pickingup_obj, a.observing_table, a.carrying_obj, a.depositing_obj; exponential transitions t_1, t_2, t_5 with rates λ_1, λ_2, λ_5; immediate transitions sel_pickup_obj, sel_carry_obj, sel_deposit_obj forming random switches with probabilities q_3 + q_4 = 1)
84 EXAMPLE: GSPN AND EQUIVALENT CTMC
(figure: marking graph of the example — tangible markings connected through vanishing markings by the transitions t_1 ... t_6)
85 EXAMPLE: GSPN AND EQUIVALENT CTMC
(figure: embedded Markov chain (EMC) of the example, over tangible and vanishing markings; arc probabilities λ_1/(λ_1+λ_2), λ_2/(λ_1+λ_2), q_3, q_4 and 1)
86 EXAMPLE: GSPN AND EQUIVALENT CTMC
(figure: reduced embedded Markov chain (REMC) over the tangible markings only; arc probabilities combine the exponential-race probabilities λ_1/(λ_1+λ_2) and λ_2/(λ_1+λ_2) with the random-switch probabilities q_3 and q_4)
MDP: the random switch probabilities can be manipulated to achieve the optimal decision.
87 GSPN, REMC AND PERFORMANCE MEASURES
The PNs of the robot controller and of the world model must be connected in closed loop. The closed-loop PN can then be analyzed w.r.t., e.g.:
1. Probability that a particular condition C holds: Pr(C) = Σ_{j ∈ S_1} ρ_j, with S_1 = { j ∈ {1,...,s} : C is satisfied in x_j }
2. Probability that place p_i has exactly k tokens: Pr(p_i, k) = Σ_{j ∈ S_2} ρ_j, with S_2 = { j ∈ {1,...,s} : x_j(p_i) = k }
3. Expected number of tokens in a place p_i: ET[p_i] = Σ_{k=1}^{K} k · Pr(p_i, k), where K is the maximum number of tokens p_i may contain in any reachable marking
(ρ_j is the steady-state probability of marking j)
88 GSPN, REMC AND PERFORMANCE MEASURES
4. Throughput rate of an exponential transition t_j: TR(t_j) = Σ_{i ∈ S_3} ρ_i λ_j(x_i) υ_ij, with S_3 = { i ∈ {1,...,s} : t_j enabled in x_i }, where υ_ij is the probability that t_j fires among all transitions enabled in x_i
5. Throughput rates of immediate transitions can be computed from those of the exponential transitions and from the structure of the model
6. Mean waiting time in a place p_i:
   WAIT(p_i) = ET[p_i] / Σ_{t_j ∈ IN(p_i)} TR(t_j) = ET[p_i] / Σ_{t_j ∈ OUT(p_i)} TR(t_j)
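The measures above reduce to simple sums over the steady-state vector ρ. A minimal sketch, with an assumed ρ, assumed token counts and an assumed input throughput (none of these numbers come from the slides):

```python
# Hypothetical steady-state probabilities rho_j for s = 3 tangible markings,
# and the token count of a place p in each marking x_j.
rho = [0.5, 0.3, 0.2]
tokens_p = [0, 1, 2]          # x_j(p) for j = 1..3

# Measure 2: Pr(p has exactly k tokens) = sum of rho_j over markings
# with x_j(p) = k.
def pr_tokens(k):
    return sum(r for r, n in zip(rho, tokens_p) if n == k)

# Measure 3: expected number of tokens in p.
K = max(tokens_p)
expected = sum(k * pr_tokens(k) for k in range(1, K + 1))

# Measure 6: mean waiting time in p, given the total throughput of the
# input transitions of p (an assumed value here).
throughput_in = 0.8
wait = expected / throughput_in

print(round(expected, 3))  # → 0.7  (0.3*1 + 0.2*2)
print(round(wait, 3))      # → 0.875
```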
89 Markov Decision Process (MDP)
Given:
- States x
- Actions u
- Transition probabilities p(x' | u, x)
- Reinforcement r_t / expected payoff function r(x, u)
Wanted:
- Policy π(x) that maximizes the future expected (discounted) reward
90 MDP Rewards and Policies
A policy (fully observable case) is a map of states onto actions: π : x_t → u_t
Expected discounted cumulative reward / payoff:
  R_T = E[ Σ_{τ=0}^{T} γ^τ r_{t+τ+1} ], 0 < γ ≤ 1
- T = 0: greedy policy
- T > 0: finite-horizon case, typically no discount
- T = ∞: infinite-horizon case, finite reward if the discount factor is < 1
91 Markov Decision Process (MDP)
(diagram: agent–environment loop — the agent applies action u_t to the environment, whose stochastic state dynamics produce state x_{t+1} and reinforcement r_{t+1}; x_t ∈ X, u_t ∈ U(x_t), r_{t+1} ∈ R)
Goal: choose the action sequence that maximizes
  R_T = E[ Σ_{τ=0}^{T} γ^τ r_{t+τ+1} ], 0 ≤ γ ≤ 1
T may go to infinity, as long as γ < 1.
92 MDP Ex.: Recycling robot
(figure: state-transition diagram with states Battery High and Battery Low; each arc is labeled (transition probability, expected reward) for the action taken:
- search_trash from High: (α, R_search_trash) staying High; (1−α, R_search_trash) to Low
- search_trash from Low: (β, R_search_trash) staying Low; (1−β, −3) to High — the robot has to be rescued because its battery is depleted
- wait: (1, R_wait) in either state
- recharge_battery from Low: (1, 0) to High)
R_search_trash > R_wait > 0: the rewards count the number of cans collected while performing the corresponding tasks.
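The diagram can be written down as a transition model p(x' | x, u) and an expected-reward table r(x, u). A hedged sketch — the numeric values of α, β and of the rewards below are assumptions for illustration only:

```python
# Recycling-robot MDP as data. alpha, beta and the reward values are
# assumed numbers for illustration, not taken from the slides.
alpha, beta = 0.9, 0.8
R_SEARCH, R_WAIT, R_RESCUE = 2.0, 1.0, -3.0

P = {  # P[(x, u)] = {x': probability}
    ("high", "search"):  {"high": alpha, "low": 1 - alpha},
    ("high", "wait"):    {"high": 1.0},
    ("low", "search"):   {"low": beta, "high": 1 - beta},  # 1-beta: rescued
    ("low", "wait"):     {"low": 1.0},
    ("low", "recharge"): {"high": 1.0},
}

R = {  # expected immediate reward r(x, u)
    ("high", "search"):  R_SEARCH,
    ("high", "wait"):    R_WAIT,
    ("low", "search"):   beta * R_SEARCH + (1 - beta) * R_RESCUE,
    ("low", "wait"):     R_WAIT,
    ("low", "recharge"): 0.0,
}

# Sanity check: every action's next-state distribution sums to 1.
for dist in P.values():
    assert abs(sum(dist.values()) - 1.0) < 1e-12
print("ok")
```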
93 Policies
- Expected cumulative payoff of policy π:
  R^π_T(x_t) = E[ Σ_{τ=0}^{T} γ^τ r_{t+τ+1} | x_t, u_{t+τ} = π(x_{t+τ}) ]
- Bellman equation for continuous action and state spaces (the policy may be deterministic or probabilistic; V is the expected payoff):
  V^π_T(x) = E{ R^π_T | x_t = x } = ∫ π(x,u) ∫ p(x'|u,x) [ r(x,u) + γ V^π_{T−1}(x') ] dx' du
- Bellman equation for discrete action and state spaces:
  V^π_T(x) = E{ R^π_T | x_t = x } = Σ_u π(x,u) Σ_{x'} p(x'|u,x) [ r(x,u) + γ V^π_{T−1}(x') ]
- Optimal value function: V*_T(x) = max_π V^π_T(x)
94 Policies (cont'd)
- Expected cumulative payoff of policy π:
  R^π_T(x_t) = E[ Σ_{τ=0}^{T} γ^τ r_{t+τ+1} | x_t, u_{t+τ} = π(x_{t+τ}) ]
- Optimal policy: π* = argmax_π R^π_T(x_t)
- 1-step optimal policy: π_0(x) = argmax_u r(x, u)
- Value function of the 1-step optimal policy: V_0(x) = max_u r(x, u)
95 2-step Policies
- Optimal policy: π_1(x) = argmax_u [ r(x,u) + γ Σ_{x'} V_0(x') p(x'|u,x) ]
- Value function: V_1(x) = max_u [ r(x,u) + γ Σ_{x'} V_0(x') p(x'|u,x) ]
96 T-step Policies
- Optimal policy: π_T(x) = argmax_u [ r(x,u) + γ Σ_{x'} V_{T−1}(x') p(x'|u,x) ]
- Value function: V_T(x) = max_u [ r(x,u) + γ Σ_{x'} V_{T−1}(x') p(x'|u,x) ]
97 Infinite Horizon
- Bellman equation for the optimal value function, infinite horizon:
  V*(x) = max_u [ r(x,u) + γ Σ_{x'} V*(x') p(x'|u,x) ]
- The optimal value function is the fixed point of this equation, and it induces the optimal policy.
- Necessary and sufficient condition: the induced policy is optimal iff the value function satisfies the above equation.
98 Value Iteration
1. for all x do
     V̂(x) ← r_min
   endfor
2. repeat until convergence
     for all x do
       V̂(x) ← max_u [ r(x,u) + γ Σ_{x'} V̂(x') p(x'|u,x) ]
     endfor
   endrepeat
3. π(x) = argmax_u [ r(x,u) + γ Σ_{x'} V̂(x') p(x'|u,x) ]
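The loop above can be rendered in a few lines of Python. This is a minimal sketch on a hypothetical 2-state MDP (states, actions, `P`, `R` and γ are all assumed values):

```python
# Value iteration on a small hypothetical MDP: states 0/1, actions "a"/"b".
# P[x][u] = list of (prob, next_state); R[x][u] = expected reward r(x, u).
P = {0: {"a": [(0.8, 0), (0.2, 1)], "b": [(1.0, 1)]},
     1: {"a": [(1.0, 1)], "b": [(0.5, 0), (0.5, 1)]}}
R = {0: {"a": 2.0, "b": 0.0},
     1: {"a": 0.0, "b": 2.0}}
gamma = 0.9

V = {x: 0.0 for x in P}                  # step 1: V_hat(x) <- r_min (0 here)
for _ in range(1000):                    # step 2: repeat until convergence
    V_new = {x: max(R[x][u] + gamma * sum(p * V[x2] for p, x2 in P[x][u])
                    for u in P[x])
             for x in P}
    if max(abs(V_new[x] - V[x]) for x in P) < 1e-9:
        V = V_new
        break
    V = V_new

# Step 3: greedy policy induced by the converged value function.
policy = {x: max(P[x], key=lambda u: R[x][u] +
                 gamma * sum(p * V[x2] for p, x2 in P[x][u]))
          for x in P}
print(policy)  # → {0: 'a', 1: 'b'}
```

For this particular choice of `P` and `R`, both states converge to V*(x) = 20.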
99 Value Iteration for Motion Planning
100 Reinforcement Learning
The previous (DP) methods to solve MDPs assume full knowledge of p(x'|u,x) and r(x,u).
Dynamic Programming (DP):
- To determine V* for |X| = N, a system of N non-linear equations must be solved.
- Well-established mathematical method, but a complete model of the environment is required (P and R known).
- Often faces the curse of dimensionality [Bellman, 1957].
Alternative approaches, if we do not know p(x'|u,x) and r(x,u):
Monte Carlo:
- Similar to DP, but P and R unknown; P and R are determined from the average of several trial-and-error runs.
- Inappropriate for a step-by-step incremental approximation of V*.
Temporal Differences:
- Knowledge of P and R is not required.
- Step-by-step incremental approximation of V; the mathematical analysis is more complex.
- Example: Q-learning.
101 Value Functions
- State value for policy π:
  V^π(x) = E_π[ Σ_{k=0}^{∞} γ^k r_{t+k+1} | x_t = x ]
  the expected value of starting in state x and following policy π thereafter.
- (State, action) value for policy π:
  Q^π(x,u) = E_π[ Σ_{k=0}^{∞} γ^k r_{t+k+1} | x_t = x, u_t = u ]
  the expected value of starting in state x, carrying out action u, and following policy π thereafter.
102 Value Functions (cont'd)
Relation between the state value and the Q function for the optimal policy — Q* is such that its value is the maximum discounted cumulative reward that can be achieved starting from state x and applying action u as the first action:
  Q*(x,u) = E{ r_{t+1} + γ V*(x_{t+1}) | x_t = x, u_t = u }
  V*(x) = max_{u'} Q*(x,u')
  Q*(x,u) = E{ r_{t+1} + γ max_{u'} Q*(x_{t+1},u') | x_t = x, u_t = u }
103 Value Functions (cont'd)
Bellman equations for V and Q (discrete action and state spaces, deterministic policy):
  V^π_T(x) = Σ_{x'} p(x'|u,x) [ r(x,u) + γ V^π_{T−1}(x') ], with u = π(x)
  Q_T(x,u) = Σ_{x'} p(x'|u,x) [ r(x,u) + γ max_{u'} Q_{T−1}(x',u') ]
with V*_T(x) = max_π V^π_T(x) for all x, and Q*_T(x,u) = max_π Q^π_T(x,u) for all (x,u).
The solutions are unique, and the equations are also satisfied by the optimal functions.
104 Q-Learning — Algorithm
Initialize Q(x,u) randomly or arbitrarily
Repeat forever (for each episode or trial):
  Initialize x
  Repeat (for each step n of the episode):
    Choose action u for x
    Execute action u and observe r and x'
    Update Q for the n_xu-th visit to (x,u):
      Q_{n_xu+1}(x,u) ← Q_{n_xu}(x,u) + α_{n_xu} [ r(x,u) + γ max_{u'} Q_{n_xu}(x',u') − Q_{n_xu}(x,u) ]
    x ← x'
  until x is final
A constant α allows adaptability to slow environment changes, but does not guarantee convergence — convergence is only possible with a temporal decay of α, under given circumstances.
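A minimal sketch of the update rule above on a hypothetical 1-D corridor task (the states, rewards, α, γ and ε below are assumed values; the goal is treated as absorbing, so its future value is 0):

```python
import random

random.seed(1)

# Q-learning on a hypothetical corridor: states 0..3, state 3 is the goal.
# Actions move left/right; reaching the goal yields reward 100, else 0.
N, GOAL = 4, 3
ACTIONS = (-1, +1)
alpha, gamma, eps = 0.5, 0.9, 0.2

Q = {(x, a): 0.0 for x in range(N) for a in ACTIONS}

def step(x, a):
    x2 = min(max(x + a, 0), N - 1)        # walls clip the movement
    return x2, (100.0 if x2 == GOAL else 0.0)

for _ in range(500):                      # episodes
    x = 0
    while x != GOAL:
        # epsilon-greedy action choice
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a_: Q[(x, a_)])
        x2, r = step(x, a)
        # goal is absorbing: no future reward beyond it
        best_next = 0.0 if x2 == GOAL else max(Q[(x2, a_)] for a_ in ACTIONS)
        Q[(x, a)] += alpha * (r + gamma * best_next - Q[(x, a)])
        x = x2

print(round(Q[(2, +1)], 1))   # ≈ 100: one step from the goal
print(round(Q[(1, +1)], 1))   # ≈ gamma * 100 = 90
```

The learned values reproduce the discounting pattern of the grid example further below: each extra step from the goal multiplies the value by γ.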
105 Q-Learning Algorithm — Convergence
Should each pair (x,u) be visited an infinite number of times, with 0 ≤ α_{n_xu} < 1,
  Σ_{i=1}^{∞} α_{n_xu(i)} = ∞ and Σ_{i=1}^{∞} α²_{n_xu(i)} < ∞,
then
  Pr{ lim_{n→∞} Q̂_{n_xu}(x,u) = Q(x,u) } = 1, ∀x,u.
106 Action Selection: Exploration vs Exploitation
- Exploration: less promising actions, which may nevertheless lead to good results, are tested.
- Exploitation: takes advantage of already-tested actions which are more promising, i.e., which have a larger Q(x,u).
- ε-greedy: at each step n, picks the best action so far with probability 1−ε, for small ε, and with probability ε picks, in a uniformly distributed random fashion, one of the other actions.
- softmax: at each step n, picks the action to be executed according to a Gibbs or Boltzmann distribution:
  π_n(x,u) = e^{Q_n(x,u)/τ} / Σ_{u'∈U(x)} e^{Q_n(x,u')/τ}
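Both selection rules fit in a few lines; the Q-values, ε and τ below are assumed for illustration:

```python
import math
import random

random.seed(0)

# Hypothetical Q-values for one state over three actions.
Q = {"a": 1.0, "b": 2.0, "c": 0.5}

def epsilon_greedy(Q, eps=0.1):
    # With probability 1-eps exploit the best action; with probability eps
    # explore uniformly among the other actions.
    best = max(Q, key=Q.get)
    if random.random() < eps:
        return random.choice([u for u in Q if u != best])
    return best

def softmax(Q, tau=1.0):
    # Gibbs/Boltzmann distribution: pi(u) = exp(Q(u)/tau) / sum_u' exp(Q(u')/tau)
    z = sum(math.exp(q / tau) for q in Q.values())
    r, acc = random.random(), 0.0
    for u, q in Q.items():
        acc += math.exp(q / tau) / z
        if r <= acc:
            return u
    return u  # numerical safety for rounding at the upper end

counts = {"a": 0, "b": 0, "c": 0}
for _ in range(10_000):
    counts[softmax(Q, tau=1.0)] += 1
print(max(counts, key=counts.get))  # "b" has the largest Q, so it is chosen most
```

Lowering τ makes softmax behave more greedily; raising it flattens the distribution toward uniform exploration.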
107 Q-Learning — an Example
(figure: grid world with goal state G; panels show the immediate rewards r(x,u), the optimal state values V*(x) — with value 100 at the goal — and the learned Q_n(x,u) estimates; α = 1)
108 Q-Learning — an Example (cont'd)
(figure: the same grid world after further learning steps, showing r(x,u), V*(x) and the updated Q_n(x,u) estimates)
More informationLogic in AI Chapter 7. Mausam (Based on slides of Dan Weld, Stuart Russell, Subbarao Kambhampati, Dieter Fox, Henry Kautz )
Logic in AI Chapter 7 Mausam (Based on slides of Dan Weld, Stuart Russell, Subbarao Kambhampati, Dieter Fox, Henry Kautz ) 2 Knowledge Representation represent knowledge about the world in a manner that
More informationGrundlagen der Künstlichen Intelligenz
Grundlagen der Künstlichen Intelligenz Reinforcement learning Daniel Hennes 4.12.2017 (WS 2017/18) University Stuttgart - IPVS - Machine Learning & Robotics 1 Today Reinforcement learning Model based and
More informationInference in first-order logic. Production systems.
CS 1571 Introduction to AI Lecture 17 Inference in first-order logic. Production systems. Milos Hauskrecht milos@cs.pitt.edu 5329 Sennott Square Sentences in Horn normal form Horn normal form (HNF) in
More informationECE276B: Planning & Learning in Robotics Lecture 16: Model-free Control
ECE276B: Planning & Learning in Robotics Lecture 16: Model-free Control Lecturer: Nikolay Atanasov: natanasov@ucsd.edu Teaching Assistants: Tianyu Wang: tiw161@eng.ucsd.edu Yongxi Lu: yol070@eng.ucsd.edu
More informationLinear-time Temporal Logic
Linear-time Temporal Logic Pedro Cabalar Department of Computer Science University of Corunna, SPAIN cabalar@udc.es 2015/2016 P. Cabalar ( Department Linear oftemporal Computer Logic Science University
More informationPetri Nets (for Planners)
Petri (for Planners) B. Bonet, P. Haslum... from various places... ICAPS 2011 & Motivation Petri (PNs) is formalism for modelling discrete event systems Developed by (and named after) C.A. Petri in 1960s
More informationReinforcement Learning
1 Reinforcement Learning Chris Watkins Department of Computer Science Royal Holloway, University of London July 27, 2015 2 Plan 1 Why reinforcement learning? Where does this theory come from? Markov decision
More informationToday s s Lecture. Applicability of Neural Networks. Back-propagation. Review of Neural Networks. Lecture 20: Learning -4. Markov-Decision Processes
Today s s Lecture Lecture 20: Learning -4 Review of Neural Networks Markov-Decision Processes Victor Lesser CMPSCI 683 Fall 2004 Reinforcement learning 2 Back-propagation Applicability of Neural Networks
More informationInternet Monetization
Internet Monetization March May, 2013 Discrete time Finite A decision process (MDP) is reward process with decisions. It models an environment in which all states are and time is divided into stages. Definition
More informationLogical Agents. Santa Clara University
Logical Agents Santa Clara University Logical Agents Humans know things Humans use knowledge to make plans Humans do not act completely reflexive, but reason AI: Simple problem-solving agents have knowledge
More informationFinal Exam December 12, 2017
Introduction to Artificial Intelligence CSE 473, Autumn 2017 Dieter Fox Final Exam December 12, 2017 Directions This exam has 7 problems with 111 points shown in the table below, and you have 110 minutes
More informationReinforcement Learning. Machine Learning, Fall 2010
Reinforcement Learning Machine Learning, Fall 2010 1 Administrativia This week: finish RL, most likely start graphical models LA2: due on Thursday LA3: comes out on Thursday TA Office hours: Today 1:30-2:30
More informationLecture 25: Learning 4. Victor R. Lesser. CMPSCI 683 Fall 2010
Lecture 25: Learning 4 Victor R. Lesser CMPSCI 683 Fall 2010 Final Exam Information Final EXAM on Th 12/16 at 4:00pm in Lederle Grad Res Ctr Rm A301 2 Hours but obviously you can leave early! Open Book
More informationDecision Problems with TM s. Lecture 31: Halting Problem. Universe of discourse. Semi-decidable. Look at following sets: CSCI 81 Spring, 2012
Decision Problems with TM s Look at following sets: Lecture 31: Halting Problem CSCI 81 Spring, 2012 Kim Bruce A TM = { M,w M is a TM and w L(M)} H TM = { M,w M is a TM which halts on input w} TOTAL TM
More informationLogic. Propositional Logic: Syntax
Logic Propositional Logic: Syntax Logic is a tool for formalizing reasoning. There are lots of different logics: probabilistic logic: for reasoning about probability temporal logic: for reasoning about
More informationElements of Reinforcement Learning
Elements of Reinforcement Learning Policy: way learning algorithm behaves (mapping from state to action) Reward function: Mapping of state action pair to reward or cost Value function: long term reward,
More informationReinforcement Learning. George Konidaris
Reinforcement Learning George Konidaris gdk@cs.brown.edu Fall 2017 Machine Learning Subfield of AI concerned with learning from data. Broadly, using: Experience To Improve Performance On Some Task (Tom
More informationIntelligent Agents. Formal Characteristics of Planning. Ute Schmid. Cognitive Systems, Applied Computer Science, Bamberg University
Intelligent Agents Formal Characteristics of Planning Ute Schmid Cognitive Systems, Applied Computer Science, Bamberg University Extensions to the slides for chapter 3 of Dana Nau with contributions by
More informationInference Methods In Propositional Logic
Lecture Notes, Artificial Intelligence ((ENCS434)) University of Birzeit 1 st Semester, 2011 Artificial Intelligence (ENCS434) Inference Methods In Propositional Logic Dr. Mustafa Jarrar University of
More informationFinal Exam December 12, 2017
Introduction to Artificial Intelligence CSE 473, Autumn 2017 Dieter Fox Final Exam December 12, 2017 Directions This exam has 7 problems with 111 points shown in the table below, and you have 110 minutes
More informationLogical Agent & Propositional Logic
Logical Agent & Propositional Logic Berlin Chen Department of Computer Science & Information Engineering National Taiwan Normal University References: 1. S. Russell and P. Norvig. Artificial Intelligence:
More informationFirst-Order Logic First-Order Theories. Roopsha Samanta. Partly based on slides by Aaron Bradley and Isil Dillig
First-Order Logic First-Order Theories Roopsha Samanta Partly based on slides by Aaron Bradley and Isil Dillig Roadmap Review: propositional logic Syntax and semantics of first-order logic (FOL) Semantic
More informationPropositional Logic: Syntax
Logic Logic is a tool for formalizing reasoning. There are lots of different logics: probabilistic logic: for reasoning about probability temporal logic: for reasoning about time (and programs) epistemic
More information