State Space Search Problems
Lecture Heuristic Search
Stefan Edelkamp
1 Overview
Different state space formalisms, including labelled, implicit, and explicit weighted graph representations
Alternative formalisms: Production Systems, Propositional Action Planning
Brief introduction to (LTL) Model Checking
Proof that general state space problem solving is undecidable
Examples of single-agent challenges: the $(n^2-1)$-Puzzle and known extensions to it, Rubik's Cube, Sokoban, Atomix, and Wusel
Application areas: Route Planning and Multiple Sequence Alignment
2 State Space Problems
A state space problem is a quadruple $P = \langle \mathcal{S}, \mathcal{O}, I, \mathcal{G} \rangle$, where
$\mathcal{S}$ is the set of states,
$I \in \mathcal{S}$ is the initial state,
$\mathcal{G} \subseteq \mathcal{S}$ is the set of goal states, and
$\mathcal{O}$ is the set of operators $O : \mathcal{S} \to \mathcal{S}$ that transform states into states.
3 Solution
A solution $\pi = (O_1, \ldots, O_k)$ is an ordered sequence of operators $O_i \in \mathcal{O}$, $i \in \{1, \ldots, k\}$, that transforms the initial state $I$ into one of the goal states $G \in \mathcal{G}$.
That is, there exists a sequence of states $S_i \in \mathcal{S}$, $i \in \{0, \ldots, k\}$, with $S_0 = I$, $S_k = G$, and $S_i$ being the outcome of applying $O_i$ to $S_{i-1}$, $i \in \{1, \ldots, k\}$.
The solution path with minimal $k$ is called the optimal solution.
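4 Labelled Representation
A labelled state space problem is a quintuple $P = \langle \mathcal{S}, \mathcal{O}, I, \mathcal{G}, \Sigma \rangle$, where the set of operators is subdivided by the labels $\Sigma$.
There is a function $\delta : \mathcal{S} \times \Sigma \to \mathcal{S}$ such that $\mathcal{O} = \{(S, S') \mid \exists \sigma \in \Sigma,\ \delta(S, \sigma) = S'\}$.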
5 Weighted Problems and Dead-Ends
A weighted state space problem is a tuple $P = \langle \mathcal{S}, \mathcal{O}, I, \mathcal{G}, w \rangle$, where $w$ is a cost function $w : \mathcal{O} \to \mathbb{R}^+$.
The cost of a path $(O_1, \ldots, O_n)$ is defined as $\sum_{i=1}^{n} w(O_i)$.
We call a solution optimal if it has minimum cost among all feasible solutions.
A problem is reversible if for each operator $O \in \mathcal{O}$ there exists an operator $O^{-1} \in \mathcal{O}$ such that $O(O^{-1}(S)) = S$ and $O^{-1}(O(S)) = S$.
In a reversible problem, if the goal is reachable, then it is reachable from each encountered state.
A state space problem has a dead-end $C \in \mathcal{S}$ if $C$ is reachable and the problem $\langle \mathcal{S}, \mathcal{O}, C, \mathcal{G} \rangle$ with $C$ as the initial state is unsolvable.
6 Explicit State Space Graph
A state space problem graph $G = (V, E, s, T)$ for the state space problem $P = \langle \mathcal{S}, \mathcal{O}, I, \mathcal{G} \rangle$ is defined by
$V = \mathcal{S}$ as the set of nodes,
$s = I$ as the initial node,
$T = \mathcal{G}$ as the set of goal nodes, and
$E \subseteq V \times V$ as the set of edges that connect nodes to nodes, with $(u, v) \in E$ if and only if there exists an $O \in \mathcal{O}$ with $O(u) = v$.
Additionally, a weight function $w : E \to \mathbb{R}$ can be defined.
The graph has uniform weight if $w(u, v)$ is constant for all $(u, v) \in E$.
7 Implicit State Space Graph
In an implicit state space graph we have
an initial node $s \in V$,
a set of goal nodes determined by a predicate $\mathit{goal} : V \to \mathbb{B} = \{0, 1\}$, and
a node expansion function $\mathit{expand} : V \to 2^V$.
To distinguish the node expansion procedure from the successor set itself, we write $\Gamma$ for the latter.
The function $\mathit{expand}$ can often be decomposed as $\mathit{expand}(u) = \{\mathit{domove}(u, a) \mid a \in \Sigma\}$.
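The implicit representation is all a search algorithm needs: a start node, a goal predicate, and an expansion function. The following is a minimal sketch, assuming breadth-first search as the traversal (the slide does not prescribe one); the function name `breadth_first_search` and the toy number problem are illustrative only.

```python
from collections import deque


def breadth_first_search(start, goal, expand):
    """Return a path with the fewest edges from start to a goal node, or None.

    goal   -- predicate on nodes (the slide's goal: V -> {0, 1})
    expand -- function returning the successors of a node
    """
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if goal(u):
            path = []
            while u is not None:    # follow parent links back to start
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in expand(u):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None


# Toy problem (hypothetical): nodes are integers, moves are +1 and *2.
path = breadth_first_search(
    1,
    lambda u: u == 10,
    lambda u: [u + 1, u * 2] if u < 10 else [],
)
```

Only the nodes that the search actually touches are ever generated, which is the point of keeping the graph implicit.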
8 $(n^2-1)$-Puzzle
The Eight-, Fifteen-, and Twenty-Four-Puzzle
[Figure: an Eight-Puzzle board with rows 1 2 3 / 8 _ 4 / 7 6 5, a Fifteen-Puzzle board with tiles 1 to 15, and a Twenty-Four-Puzzle board with tiles 1 to 24]
Half of the $(n^2)!$ possible states are reachable: roughly $10^5$ reachable states in the Eight-Puzzle, $10^{13}$ states in the Fifteen-Puzzle, and $10^{25}$ states in the Twenty-Four-Puzzle.
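The orders of magnitude follow directly from the "half of $(n^2)!$" statement. A small sketch to reproduce them (the helper name `reachable_states` is an assumption for illustration):

```python
from math import factorial


def reachable_states(n):
    """Number of reachable states of the (n^2 - 1)-Puzzle: half of (n^2)!."""
    return factorial(n * n) // 2


for n, name in [(3, "Eight"), (4, "Fifteen"), (5, "Twenty-Four")]:
    print(f"{name}-Puzzle: {reachable_states(n):.2e} reachable states")
```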
State Space Representation
$\mathcal{S}$ is cast as the set of vector representations, e.g. for the Eight-Puzzle: $(1, 2, 3, 8, 0, 4, 7, 6, 5)$, where 0 denotes the blank.
The initial state $I \in \mathcal{S}$ is provided by the user; there is a single goal state $G \in \mathcal{G}$ with value $i$ at index $i + 1$, $1 \le i \le n^2 - 1$.
Operators $O \in \mathcal{O}$ are defined as follows: if the blank is at index $j$, swap it with the tile in direction
U: index $j - n$, unless the blank is top-most,
D: index $j + n$, unless the blank is bottom-most,
L: index $j - 1$, unless the blank is left-most, or
R: index $j + 1$, unless the blank is right-most.
In a labelled representation we may assign $\Sigma = \{U, D, L, R\}$.
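The four operators translate almost literally into code. This is a sketch under the assumption of 0-based indices (the slide counts from 1) and a tuple of length $n^2$ with 0 as the blank; the function name `expand` mirrors the expansion function of the implicit graph but is otherwise illustrative.

```python
def expand(state, n):
    """Return (label, successor) pairs for an (n^2 - 1)-Puzzle state.

    state is a tuple of length n*n in row-major order, 0 marks the blank.
    """
    j = state.index(0)              # position of the blank (0-based)
    row, col = divmod(j, n)
    moves = []
    if row > 0:                     # U: index j - n, unless blank top-most
        moves.append(("U", j - n))
    if row < n - 1:                 # D: index j + n, unless blank bottom-most
        moves.append(("D", j + n))
    if col > 0:                     # L: index j - 1, unless blank left-most
        moves.append(("L", j - 1))
    if col < n - 1:                 # R: index j + 1, unless blank right-most
        moves.append(("R", j + 1))
    successors = []
    for label, k in moves:
        s = list(state)
        s[j], s[k] = s[k], s[j]     # swap the blank with the neighbouring tile
        successors.append((label, tuple(s)))
    return successors


succ = dict(expand((1, 2, 3, 8, 0, 4, 7, 6, 5), 3))
```

With the blank in the centre all four labels apply; in a corner only two successors are generated.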
9 General Sliding Tile
Donkey-, Century-, and Dad's-Puzzle as well as the Harlekin- and Man-and-Bottle-Puzzle
Donkey-Puzzle: 65,880 states
Dad-Puzzle: 18,504 states
Century-Puzzle: 109,260 states
Harlekin-Puzzle: 176,250 states
Man-and-Bottle-Puzzle: 143,100 states
Representations
1. Order of placements according to reference points, e.g. Donkey-Puzzle: 2x1, blank, blank, 2x1, 2x2, 2x1, 2x1, 1x2, 1x1, 1x1, 1x1, and 1x1. The number of configurations is bounded by $\binom{s}{f_1, \ldots, f_k} = \frac{s!}{f_1! \cdots f_k!}$.
2. Unique normal form that stores, for every piece type, a sorted array of reference points (e.g. according to a row-wise order).
Successor set generation: place pebbles around each empty square to signal if and in which direction an object may move.
Pebbling and unpebbling can be performed in time O(#blanks).
Move execution can be tested in O(#pieces) time.
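The multinomial bound is easy to evaluate. The sketch below assumes the piece counts read off the Donkey-Puzzle placement order listed above (four 2x1, one 1x2, one 2x2, four 1x1, two blanks, so $s = 12$); the helper name `multinomial` is illustrative.

```python
from math import factorial


def multinomial(counts):
    """s! / (f_1! * ... * f_k!) with s = f_1 + ... + f_k."""
    result = factorial(sum(counts))
    for f in counts:
        result //= factorial(f)     # exact: every partial quotient is an integer
    return result


# Assumed piece counts for the Donkey-Puzzle:
# 4 x (2x1), 1 x (1x2), 1 x (2x2), 4 x (1x1), 2 blanks
donkey_bound = multinomial([4, 1, 1, 4, 2])
```

The bound only counts orderings of piece types, so it over-approximates the number of states that are actually reachable.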
10 Wusel
Connected $n$-polycube, i.e. Wusel is always one piece
In motion, a group of cubes is simultaneously moved along one of the three cube axes
The group itself constitutes a 3D connected component
Cubes connected to the ground can only be moved upwards
Wusel tumbles if the projection of the center of mass is outside the convex hull
Successor Generation
Wusel is represented as a sorted array of 3D coordinate values, avoiding repeated positions
Naive successor generation considers all possible groups in an array of size $n \cdot 2^n$, denoting whether the cube with index $i$, $0 \le i \le n - 1$, is contained in partition $j$, $1 \le j \le 2^n - 1$
Initialization by converting the binary encoding of the numbers $0$ to $2^n - 1$
Move execution is checked $O(2^n)$ times and takes $O(n)$ time, for $O(n 2^n)$ total time
After the move executions, the result state is validated by a combination of $O(n)$ connectivity and $O(n \log n)$ gravity checks
11 Rubik's Cube
Invented by E. Rubik
Each face can be rotated by 90, 180, or 270 degrees
$8! \cdot 3^8 \cdot 12! \cdot 2^{12} / 12 \approx 43 \cdot 10^{18}$ possible cube configurations
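The configuration count can be checked with exact integer arithmetic. A short sketch (the variable name is illustrative):

```python
from math import factorial

# 8 corner cubies (8! placements, 3 orientations each),
# 12 edge cubies (12! placements, 2 orientations each),
# divided by 12 because only one in twelve assemblies is reachable.
cube_configurations = factorial(8) * 3**8 * factorial(12) * 2**12 // 12

print(f"{cube_configurations:.3e}")
```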
Branching Factor Reduction
Six faces: initial branching factor of $6 \cdot 3 = 18$
Rule: Never rotate the same face twice in a row. This reduces the branching factor to $5 \cdot 3 = 15$ after the first move.
Rule: If two opposite faces are rotated consecutively, take only one order. Arbitrarily label one a first face and the other a second face.
1. After a first face is twisted, there are three possible twists of each of the remaining five faces, resulting in a branching factor of 15.
2. After a second face is twisted, however, we can only twist four remaining faces, excluding the face just twisted and its corresponding first face, for a branching factor of 12.
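To see the effect of the two rules, one can count the move sequences they admit at small depths. The sketch below assumes the face names U/D, L/R, F/B and arbitrarily labels U, L, F as first faces; `allowed` and `count_sequences` are hypothetical helper names.

```python
# first face -> its opposite (second) face; the labelling is arbitrary
OPPOSITE = {"U": "D", "L": "R", "F": "B"}
FIRST_OF = {second: first for first, second in OPPOSITE.items()}
FACES = list(OPPOSITE) + list(FIRST_OF)
TWISTS = 3  # 90, 180, or 270 degrees


def allowed(prev, face):
    """Apply both pruning rules to the face twisted after prev."""
    if prev is None:
        return True
    if face == prev:                    # never rotate the same face twice in a row
        return False
    if FIRST_OF.get(prev) == face:      # after a second face, skip its first face
        return False
    return True


def count_sequences(depth, prev=None):
    """Number of move sequences of the given length that obey the rules."""
    if depth == 0:
        return 1
    return sum(
        TWISTS * count_sequences(depth - 1, face)
        for face in FACES
        if allowed(prev, face)
    )


counts = [count_sequences(d) for d in (1, 2, 3)]
```

At depth 2 the nine first-face twists contribute 15 continuations each and the nine second-face twists 12 each, matching the two cases in the list above.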
12 Sokoban
Rules:
$n$ balls located in a maze have to be moved onto $n$ corresponding goal fields
The man, controlled by the puzzle solver, traverses the board and pushes a ball onto an adjacent empty square
For the example instance: minimum number of ball pushes 90; minimum number of man movements 230
Dead-Ends
Problem DECIDE is just the task of solving the puzzle
Problem PUSHES additionally asks to minimize the number of ball pushes, whereas
Problem MOVES requests an optimal number of man movements
Examples of dead-end positions in Sokoban are
balls that lie at a boundary of the maze that does not include a goal field, and
four balls placed next to each other in the form of a square, so that the man cannot move any of them
Many dead-end positions can be identified as local patterns.
13 Atomix
[Figure: Atomix board with hydrogen (H), carbon (C), nitrogen (N), and oxygen (O) atoms, and the target molecule]
Rules:
The player selects one atom at a time and pushes it in one direction
The atom keeps moving until it hits an obstacle or another atom
The game is won when the atoms form the molecule depicted beside the board
14 Route Planning
In Route Planning, the shortest path between a start location $s$ and a target location $t$ has to be found in a distance graph with edge weights $w : E \to \mathbb{R}^+$
Given a layout function $L : V \to \mathbb{R}^2$, nodes can be identified with their layout coordinates
Edge weights $w((u, v))$ might be derived from straight-line distances $\|u - v\|_2 = \sqrt{(v_1 - u_1)^2 + (v_2 - u_2)^2}$, or given in terms of actual travel distance or time
Maps are often very large and hence stored on external storage devices, so the route planning problem can be treated as an implicit state space problem
In practice: on-line travel information systems with a set of $(s, t)$ queries
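As an illustration of straight-line edge weights, here is a minimal sketch that answers a single $(s, t)$ query. It assumes an undirected road graph held in memory and uses Dijkstra's algorithm, which the slide does not name; the names `euclidean`, `shortest_path_length`, and the three-node map are hypothetical.

```python
import heapq
from math import hypot


def euclidean(layout, u, v):
    """Straight-line distance ||u - v||_2 between the layout points of u and v."""
    (u1, u2), (v1, v2) = layout[u], layout[v]
    return hypot(v1 - u1, v2 - u2)


def shortest_path_length(layout, edges, s, t):
    """Dijkstra's algorithm on an undirected graph with Euclidean edge weights."""
    adjacent = {}
    for u, v in edges:                  # assumption: roads can be used both ways
        adjacent.setdefault(u, []).append(v)
        adjacent.setdefault(v, []).append(u)
    dist = {s: 0.0}
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            return d
        if d > dist.get(u, float("inf")):
            continue                    # stale queue entry
        for v in adjacent.get(u, []):
            nd = d + euclidean(layout, u, v)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return None                         # t is not reachable from s


layout = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (3.0, 0.0)}
direct = shortest_path_length(layout, [("a", "b"), ("a", "c"), ("c", "b")], "a", "b")
detour = shortest_path_length(layout, [("a", "c"), ("c", "b")], "a", "b")
```

For maps that do not fit in memory, `adjacent` would be replaced by an expansion function that loads neighbours on demand, in line with the implicit view described above.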
15 DNA Sequence Alignment
$k$ strings, representing DNA sequences over the alphabet $\Sigma = \{A, C, G, T\}$
The strings have to be aligned (written one above the other) such that letters in the same column preferably match
Gaps may be introduced in either sequence in order to shift the remaining letters into better alignment
E.g. $k = 2$, with a cost function that charges cost 1 for a mismatch and cost 2 for a gap: a minimal alignment of the two sequences $S_1 =$ ACGTACGTACGT and $S_2 =$ ATGTCGTCACGT with cost 5 has the form
ACGTACGT_ACGT
ATGT_CGTCACGT
Dynamic Programming Approach
The problem is closely related to the problem of computing the Edit Distance of $k$ strings.
The three main edit operations (in 2D) are noop (match, cost 0), change (mismatch, cost 1), and wait-$l$ (introduce a gap in string $l$, cost 2).
In 2D the entry $T[x, y]$ holds the cost of aligning string $S_1[1..x]$ with $S_2[1..y]$. It is computed as the minimum of the values
$T[x, y - 1] + 2$,
$T[x - 1, y] + 2$, and
$T[x - 1, y - 1]$ if $S_1[x] = S_2[y]$, or $T[x - 1, y - 1] + 1$ if $S_1[x] \ne S_2[y]$.
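The recurrence can be filled in row by row. The sketch below assumes the boundary conditions $T[x, 0] = 2x$ and $T[0, y] = 2y$ (aligning a prefix against the empty string takes only gaps), which the slide does not state explicitly; the function name `alignment_cost` is illustrative. The test strings are the two rows of the example alignment with the gap symbols removed.

```python
def alignment_cost(s1, s2, mismatch=1, gap=2):
    """Minimal pairwise alignment cost via the table T[x][y]."""
    m, n = len(s1), len(s2)
    T = [[0] * (n + 1) for _ in range(m + 1)]
    for x in range(1, m + 1):           # assumed boundary: only gaps
        T[x][0] = x * gap
    for y in range(1, n + 1):
        T[0][y] = y * gap
    for x in range(1, m + 1):
        for y in range(1, n + 1):
            # Python strings are 0-based, so S_1[x] is s1[x - 1]
            change = 0 if s1[x - 1] == s2[y - 1] else mismatch
            T[x][y] = min(
                T[x][y - 1] + gap,          # gap in s1
                T[x - 1][y] + gap,          # gap in s2
                T[x - 1][y - 1] + change,   # match or mismatch
            )
    return T[m][n]


cost = alignment_cost("ACGTACGTACGT", "ATGTCGTCACGT")
```

The table has $(|S_1| + 1)(|S_2| + 1)$ entries; for $k$ strings the same idea needs a $k$-dimensional table, which is what motivates the search formulation.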
Search Approach
An alignment can be conveniently depicted as a path between two opposite corners in a search graph structured as a $k$-dimensional grid:
if there is no gap in either string, the path moves diagonally down and right;
a gap in the vertical (horizontal) string is represented as a horizontal (vertical) move right (down), since a letter is consumed in only one of the strings.
The alignment graph is directed and acyclic, where a (non-border) vertex has incoming edges from the left, top, and top-left adjacent vertices, and outgoing edges to the right, bottom, and bottom-right vertices.
16 Action Planning
Action planning refers to a world description in predicate logic, where a number of predicates $AP$ describe what can be true or false in each state of the world
By applying operations in a world, we arrive at another world where different atoms might be true or false
For example, in a blocks world a robot might try to reach a target state by operators that stack and unstack blocks, or put them on the table
Usually, only a few atoms are affected by an operator, and most of them remain the same
Strips Planning
A propositional planning problem (in STRIPS notation) is a finite state space problem $P = \langle \mathcal{S}, \mathcal{O}, I, \mathcal{G} \rangle$, where
$\mathcal{S} \subseteq 2^{AP}$ is the set of states,
$I \in \mathcal{S}$ is the initial state,
$\mathcal{G} \subseteq \mathcal{S}$ is the set of goal states, and
$\mathcal{O}$ is the set of operators that transform states into states.
Operators $O = (P, A, D) \in \mathcal{O}$ have propositional preconditions $P$ and propositional effects $(A, D)$, where $P \subseteq AP$ is the precondition list, $A \subseteq AP$ is the add list, and $D \subseteq AP$ is the delete list.
Given a state $S$ with $P \subseteq S$, its successor $S' = O(S)$ is defined as $S' = (S \setminus D) \cup A$.
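Because states are sets of atoms, operator application reduces to set operations. A minimal sketch, assuming atoms are plain strings; the `Operator` type, the `apply` function, and the blocks-world atoms are hypothetical names chosen for illustration.

```python
from typing import FrozenSet, NamedTuple, Optional


class Operator(NamedTuple):
    pre: FrozenSet[str]       # precondition list P
    add: FrozenSet[str]       # add list A
    delete: FrozenSet[str]    # delete list D


def apply(state: FrozenSet[str], op: Operator) -> Optional[FrozenSet[str]]:
    """Return (S \\ D) | A if P is a subset of S, otherwise None."""
    if not op.pre <= state:
        return None           # operator is not applicable in this state
    return (state - op.delete) | op.add


# Hypothetical blocks-world operator: pick block a up from block b.
unstack_a_b = Operator(
    pre=frozenset({"on(a,b)", "clear(a)", "handempty"}),
    add=frozenset({"holding(a)", "clear(b)"}),
    delete=frozenset({"on(a,b)", "clear(a)", "handempty"}),
)

state = frozenset({"on(a,b)", "clear(a)", "ontable(b)", "handempty"})
successor = apply(state, unstack_a_b)
```

Atoms not mentioned in the add or delete list, such as `ontable(b)`, carry over unchanged, which reflects the observation that most atoms are unaffected by an operator.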
17 Model Checking
Let $AP$ be a set of atomic propositions.
A Kripke structure $M$ over $AP$ is a quadruple $M = \langle S, I, R, L \rangle$, where
$S$ is a finite set of states,
$I \subseteq S$ is the set of initial states,
$R \subseteq S \times S$ is a (total) transition relation, and
$L : S \to 2^{AP}$ is the state labelling function.
A path in the model $M$ is a sequence of states $\pi = S_0, S_1, \ldots$, and $\pi^i$ denotes the suffix of $\pi$ starting at $S_i$.
Model Checking Problem
Given: a Kripke structure $M$ and a temporal formula $f$
Task: find the set of states in $S$ that satisfy $f$, and check whether the set of initial states is contained in this state set
In this case we write $M \models f$ for short
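Linear Temporal Logic
LTL formulas have the form Always $f$, $A f$ for short, where $f$ is a path formula:
if $p \in AP$, then $p$ is a path formula;
if $f$ and $g$ are path formulas, so are $\neg f$, $f \vee g$, $f \wedge g$, $X f$, $F f$, $G f$, and $f\,U\,g$.
Semantics:
for the next time operator $X$ we have $M, \pi \models X f \Leftrightarrow M, \pi^1 \models f$,
for the until operator $g\,U\,f$ we have $M, \pi \models g\,U\,f \Leftrightarrow \exists\, k \ge 0 : M, \pi^k \models f \wedge \forall\, 0 \le j < k : M, \pi^j \models g$,
for the eventually operator we have $M, \pi \models F f \Leftrightarrow \exists\, k \ge 0 : M, \pi^k \models f$,
for the globally operator we have $M, \pi \models G f \Leftrightarrow \forall\, k \ge 0 : M, \pi^k \models f$.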
Example
1. The LTL formula $A(G\,p)$ means: along every path, $p$ will hold forever.
2. The LTL formula $A(F\,p)$ means: along every path, there is some state in which $p$ will hold.
3. The LTL formula $A(FG\,p)$ means: along every path, there is some state from which $p$ will hold forever.
Propositional Planning as Model Checking
Theorem: Any STRIPS planning problem can be modelled as an LTL model checking problem.
Proof (Sketch): Achieving any propositional goal $g \in 2^{AP}$ can be expressed in the form of a counter-example to the temporal formula $f = A(G\,\neg g)$ in LTL. If the problem is solvable, the LTL model checker will return a counter-example, which in fact is a solution path for the STRIPS planning problem.
On the other hand, several model checking problems can be modelled as state space problems $P = \langle \mathcal{S}, \mathcal{O}, I, \mathcal{G} \rangle$.
The class of model checking problems that fit into the representation of a state space problem with a witnessing goal set are the ones with so-called safety properties.
18 NP-hard Problems
For all NP(-complete) decision problems $L$ we have a non-deterministic Turing machine $M$ that recognizes $L$ in polynomial time
Well-known instances are Boolean satisfiability (SAT), number partitioning (PARTITION), bin packing (BPP), as well as graph problems like graph partitioning (GPP) and vertex cover (VERTEX-COVER)
Already in the early work of Garey and Johnson, hundreds of NP-complete problems were identified
Simulation and Modeling
A deterministic Turing machine may simulate all possible computations of $M$ in exponential time.
Therefore NP problems are state space problems, with
$\mathcal{S}$ being the set of configurations of $M$,
$\mathcal{O}$ being the set of transitions to one successor configuration,
$I$ being the start configuration of $M$, and
$G \in \mathcal{G}$ being its end configuration.
19 Production Systems
Another classical AI representation for a search problem is a production system.
A production system is a state space problem whose states are strings in $\Sigma^*$ and whose operators are given in the form of grammar inference rules $\alpha \to \beta$, with $\alpha, \beta \in \Sigma^*$ for some fixed alphabet $\Sigma$.
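Successor generation in a production system amounts to rewriting one occurrence of a left-hand side. A minimal sketch, assuming rules are given as `(alpha, beta)` string pairs; the function name `successors` and the toy rules are illustrative.

```python
def successors(word, rules):
    """Return all strings reachable from word by one rule application.

    Each rule (alpha, beta) replaces one occurrence of alpha by beta.
    """
    result = set()
    for alpha, beta in rules:
        start = word.find(alpha)
        while start != -1:
            result.add(word[:start] + beta + word[start + len(alpha):])
            start = word.find(alpha, start + 1)   # also try overlapping matches
    return result


# Toy rules over the alphabet {a, b} (hypothetical)
next_words = successors("aab", [("ab", "ba"), ("a", "b")])
```

Plugged into a generic search as the expansion function, this yields a semi-decision procedure only: the search may run forever when the goal string is not derivable.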
Undecidability
Theorem (Post 1943): The problem of solving a general production system for an arbitrary start and goal state is not decidable.
Proof: Reduction from the halting problem for Turing machines.
States are configurations of the Turing machine $M$, i.e. words of $\{B\}^+ \times \Gamma^* \times Q \times \Gamma^* \times \{B\}^+$.
The initial state is $B q_0 B$ and the goal state is $B q_e B$.
Depending on the value of the direction $d$, we assign to each transition $(a, q) \to (b, q', d)$ the rules $w c q a w' \to s$, $s \in \{w c q' b w',\ w c b q' w',\ w q' c b w'\}$, with $w \in \{B\}^+ \Gamma^*$ and $w' \in \Gamma^* \{B\}^+$.