Computing Bounded Path Decompositions in Logspace


Shiva Kintali and Sinziana Munteanu
Department of Computer Science, Princeton University, Princeton, NJ 08540-5233
kintali@cs.princeton.edu, munteanu@princeton.edu

Abstract

The complexity of Graph Isomorphism (GI) is a long-standing open problem. GI is not known to be solvable in polynomial time and is unlikely to be NP-complete [BHZ87, Sch88]. Polynomial-time algorithms are known for GI of several special classes of graphs. One such interesting class is graphs excluding a fixed minor: any family of graphs that is closed under taking minors falls into this class. Important examples of such families include bounded-genus graphs, bounded-pathwidth graphs and bounded-treewidth graphs. Recent papers proved the logspace-completeness of GI of planar graphs and of graphs of treewidth at most three, thus settling their complexity [ADK08, DLN+09, DNTW09]. These algorithms are based on computing certain kinds of decompositions in logspace. The best known upper bound for GI of bounded-treewidth graphs is LogCFL [DTW10], which implies the same upper bound for graphs of bounded pathwidth. One bottleneck towards improving these upper bounds is computing bounded-width tree/path decompositions in logspace. Elberfeld, Jakoby and Tantau [EJT10] presented a logspace algorithm to compute tree decompositions of bounded-treewidth graphs. In this paper, we present a logspace algorithm to compute path decompositions of bounded-pathwidth graphs, thus removing a bottleneck towards settling the complexity of GI of bounded-pathwidth graphs. Prior to our work, the best known upper bound for computing such decompositions was polynomial time [BK96].

Keywords: graph isomorphism, logspace, path decompositions, pathwidth, tree decompositions, treewidth.

1 Introduction

The Graph Isomorphism problem is to decide whether two given graphs are isomorphic. The complexity of Graph Isomorphism (GI) is a long-standing open problem. GI is not known to be solvable in polynomial time and is unlikely to be NP-complete [BHZ87, Sch88]. Polynomial-time algorithms are known for GI of several special classes of graphs. One such interesting class is graphs excluding a fixed minor: any family of graphs that is closed under taking minors falls into this class. Important examples of such families include bounded-genus graphs, bounded-pathwidth graphs and bounded-treewidth graphs.

Wagner [Wag37] introduced simplicial tree decompositions. The notions of treewidth and tree decomposition were introduced (under different names) by Halin [Hal76]. Robertson and Seymour [RS86] reintroduced these concepts in their seminal work on graph minors. Roughly speaking, the treewidth of an undirected graph measures how close the graph is to being a tree.

Definition 1.1. (Tree decomposition) A tree decomposition of a graph $G(V, E)$ is a pair $D = (\{X_i \mid i \in I\}, T(I, F))$, where $\{X_i \mid i \in I\}$ is a collection of subsets of $V$ (called bags) and $T(I, F)$ is a tree, such that:

(TD-1) $\bigcup_{i \in I} X_i = V$;
(TD-2) for every edge $(u, v) \in E$, there is an $i \in I$ with $u, v \in X_i$;
(TD-3) for all $i, j, k \in I$, if $j$ is on the path from $i$ to $k$ in $T$, then $X_i \cap X_k \subseteq X_j$.

The width of a tree decomposition $D$ is $\max_{i \in I} |X_i| - 1$. The treewidth of a graph $G$, denoted by $tw(G)$, is the minimum width over all tree decompositions of $G$. The elements of $I$ are called the nodes of the tree $T$.

Besides playing a crucial role in structural graph theory, treewidth has also proved very useful in algorithmic graph theory. Several problems that are NP-hard on general graphs are solvable in polynomial time (some even in linear time) on graphs of bounded treewidth (also known as partial k-trees). These problems include Hamiltonian cycle, graph coloring, vertex cover and many more.
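For concreteness (an illustration only, not part of the paper's logspace machinery; the function names are hypothetical), the three conditions of Definition 1.1 can be checked directly. TD-3 is equivalent to the standard requirement that, for every vertex $v$, the tree nodes whose bags contain $v$ induce a connected subtree of $T$:

```python
def is_tree_decomposition(n, graph_edges, bags, tree_edges):
    """Check TD-1..TD-3 for a graph on vertices 0..n-1, bags {node: set},
    and a decomposition tree given by tree_edges."""
    if set().union(*bags.values()) != set(range(n)):            # TD-1
        return False
    for u, v in graph_edges:                                    # TD-2
        if not any(u in b and v in b for b in bags.values()):
            return False
    adj = {i: set() for i in bags}
    for i, j in tree_edges:
        adj[i].add(j); adj[j].add(i)
    for v in range(n):                                          # TD-3: the nodes
        nodes = {i for i, b in bags.items() if v in b}          # containing v must
        seen, stack = set(), [next(iter(nodes))]                # span a connected
        while stack:                                            # subtree of T
            i = stack.pop()
            if i in seen:
                continue
            seen.add(i)
            stack.extend(adj[i] & nodes)
        if seen != nodes:
            return False
    return True

def width(bags):
    return max(len(b) for b in bags.values()) - 1

# A 4-cycle has treewidth 2; bags {0,1,2} and {0,2,3} on a two-node tree:
bags = {0: {0, 1, 2}, 1: {0, 2, 3}}
print(is_tree_decomposition(4, [(0, 1), (1, 2), (2, 3), (3, 0)], bags, [(0, 1)]))  # True
print(width(bags))  # 2
```
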
Courcelle's theorem [Cou90] states that for every monadic second-order (MSO) formula $\varphi$ and every constant $k$ there is a linear-time algorithm that decides whether a given logical structure of treewidth at most $k$ satisfies $\varphi$. Recently, Elberfeld, Jakoby and Tantau [EJT10] proved a logspace version of Courcelle's theorem, thus showing that several MSO properties of partial k-trees can be decided in logspace. Unfortunately, it is not known how to formulate GI in MSO logic.

The study of GI of bounded-treewidth graphs has a long history. Bodlaender [Bod90] proved that GI of bounded-treewidth graphs can be decided in polynomial time. Lindell [Lin92] showed that trees can be canonized in logspace. Using a clever implementation of the Weisfeiler-Lehman algorithm, Grohe and Verbitsky [GV06] presented a $\mathsf{TC}^1$ upper bound. Recent papers proved the logspace-completeness of GI of planar graphs and of graphs of treewidth at most three, thus settling their complexity [ADK08, DLN+09, DNTW09]. These algorithms are based on computing certain kinds of decompositions in logspace. Using an algorithm of Wanke [Wan94] (to compute bounded-width tree decompositions in LogCFL) and Lindell's logspace tree canonization algorithm [Lin92], Das, Toran, and Wagner [DTW10] presented a LogCFL upper bound. Currently this is the best known upper bound.

One of the bottlenecks in the algorithm of Das, Toran, and Wagner is computing bounded-width tree decompositions in logspace. Recently, Elberfeld, Jakoby and Tantau [EJT10] removed this bottleneck. It is still not clear how to design a logspace algorithm for GI of partial k-trees. This motivates the study of GI on special cases of partial k-trees. Bounded-pathwidth graphs are a natural subset of bounded-treewidth graphs. Pathwidth is defined as follows.

Definition 1.2. (Path decomposition) A path decomposition of a graph $G(V, E)$ is a sequence $P = (Y_1, Y_2, \dots, Y_s)$ of subsets of $V$ such that:

(PD-1) $\bigcup_{1 \le i \le s} Y_i = V$;
(PD-2) for every edge $(u, v) \in E$, there is an $i$, $1 \le i \le s$, with $u, v \in Y_i$;
(PD-3) for all $i, j, k$ with $1 \le i < j < k \le s$, $Y_i \cap Y_k \subseteq Y_j$.

The width of a path decomposition $P$ is $\max_{1 \le i \le s} |Y_i| - 1$. The pathwidth of a graph $G$, denoted by $pw(G)$, is the minimum width over all path decompositions of $G$.

The best known upper bound for GI of bounded-pathwidth graphs is LogCFL, implied by the algorithm of Das, Toran, and Wagner [DTW10]. Computing bounded-width path decompositions in logspace is a natural first step towards improving this upper bound. In this paper, we present a logspace algorithm to compute path decompositions of bounded-pathwidth graphs. Prior to our work, the best known upper bound for computing such decompositions was polynomial time [BK96]. Our main theorem is as follows:

Theorem 1.3. For all constants $k, l \ge 1$, there exists a logspace algorithm that, given a graph $G$ of treewidth $l$, decides whether the pathwidth of $G$ is at most $k$, and if so, finds a path decomposition of $G$ of width $k$ in logspace.

2 Overview of Our Algorithm

Let $G(V, E)$ be the input graph. We first use an algorithm of Elberfeld, Jakoby and Tantau [EJT10] to compute a tree decomposition $D'$ of $G$ in logspace.

Theorem 2.1.
(Elberfeld, Jakoby and Tantau [EJT10]) For every constant $k \ge 1$, there is a logspace algorithm that, on input of any graph $G$, determines whether $tw(G) \le k$ and, if so, outputs a width-$k$ rooted tree decomposition $D' = (\{X'_i \mid i \in I'\}, T'(I', F'))$ of $G$. Moreover, $T'$ is a binary tree of height $O(\log n)$.

In Section 4, we present a logspace algorithm to convert $D'$ into a nice tree decomposition $D = (\{X_i \mid i \in I\}, T(I, F))$ (see Definition 4.1). The elements of $T$ are called the nodes of $D$. Let root be the root node of $D$. Every node of $D$ has at most two children. Let $v \in V(G)$ and $p \in I$. If $p$ has one child, we denote it by $q$. If $p$ has two children, we denote them by $q$ and $r$; we distinguish $q$ and $r$ by the lexicographic ordering of their labels. Each node of $D$ is of one of the following types: start, forget, introduce or join node (see Definition 4.2).

In Section 5, we give a logspace algorithm to compute the full set of characteristics at the root, denoted by $FS(root)$. In Section 5.1, we show how to compute $FS(p)$, given the full sets of characteristics at the children of $p$. This is done separately based on the type of the node $p$. These procedures, ComputeStartFS, ComputeForgetFS, ComputeIntroduceFS and

ComputeJoinFS, are the same as the corresponding polynomial-time procedures of [BK96]. We show that they can be implemented in logspace. In Section 5.2, we give a logspace algorithm to compute $FS(root)$. We first give a logspace relabeling procedure that reduces the space required to store a bag from logarithmic to constant (Definition 5.7). This involves changing the label of each vertex to its rank in an ordering of the vertices of the bag. We then give the logspace procedure ComputeFS(root) that computes $FS(root)$ (Theorem 5.10). This uses Lindell's logspace tree traversal algorithm, the logspace procedures of the previous section and the relabeling technique.

In Section 5.3, we give a logspace algorithm to compute a unique characteristic at each node. For each descendant $p'$ of $p$, we define the descendant characteristic $c(p, p', C_p)$ of $p$ at $p'$ (Definition 5.11) and then give a procedure ComputeDescendantCharacteristic$(p, p', C_p)$ for computing it in logspace (Theorem 5.12). The unique characteristic at each node will be the descendant characteristic of the lexicographically smallest characteristic in $FS(root)$ (Definition 5.13).

In Section 6, we give a logspace algorithm for computing the bag list at the root, an auxiliary data structure useful for computing the path decompositions. We first define the characteristic path decomposition at $p$, $CP(p)$, a path decomposition with certain properties (Definition 6.1), and then define the bag list $b(p)$ to be the list of the numbers of bags in $CP(p)$ between any two bags corresponding to consecutive elements in the typical list of the unique characteristic at $p$ (Definition 6.2). In Section 6.1, we show how to compute $b(p)$, given the bag lists at the children of $p$. This is done separately depending on the type of the node $p$. The procedures for achieving this are ComputeStartBagList, ComputeJoinBagList, ComputeForgetBagList and ComputeIntroduceBagList.
In Section 6.2, we give a logspace algorithm that computes $b(root)$ (Theorem 6.4). This uses the observation that it is enough to consider, one at a time, the paths from each of the leaves to the root: $b(root)$ is the sum of lists, each corresponding to one of the paths from a leaf to the root and computable only from the nodes on that path (Theorem 6.3).

In Section 7, we give a logspace algorithm for computing the endpoints of a vertex at the root, a structure that, if specified for each vertex, uniquely determines a path decomposition. By the properties of a path decomposition, all bags containing $v$ in $CP(p)$ form a subpath. Hence, $CP(p)$ is completely determined by specifying, for each vertex, the first and last bags that contain it. We define $l(p, v)$ and $r(p, v)$ to be the left and right endpoints of $v$, the indices of the first and last bags containing $v$ (Definition 7.1). The algorithms of this section compute the left endpoints; by symmetry, we also obtain logspace algorithms for computing the right endpoints. In Section 7.1, we give a logspace algorithm for computing $l(p, v)$ given the left endpoints of $v$ at the children of $p$. This is done separately depending on the type of the node $p$. The procedures for achieving this are ComputeStartLeftEndpoint, ComputeForgetLeftEndpoint, ComputeIntroduceLeftEndpoint and ComputeJoinLeftEndpoint. In Section 7.2, we give a logspace algorithm that computes $l(root, v)$ (Theorem 7.5). This uses the observation that it is enough to consider, one at a time, the paths from each of the leaves to the root: $l(root, v)$ is the minimum of indices, each corresponding to one of the paths from a leaf to the root and computable only from the nodes on that path (Theorem 7.4).

To compute a path decomposition of $G$, start from $D'$ and apply successively the logspace algorithms of Sections 4-7. Since the class of logspace-computable functions is closed under composition, we obtain the desired logspace algorithm.
It is easy to see that our algorithms imply Theorem 1.3. In Section 8, we prove that the language path-width-$k$ = $\{G \mid pw(G) \le k\}$ is L-complete.

This complements a result of Elberfeld, Jakoby and Tantau [EJT10] showing that the language tree-width-$k$ = $\{G \mid tw(G) \le k\}$ is L-complete.

3 Preliminaries

In this section, we present some required definitions from Bodlaender and Kloks' paper [BK96].

3.1 Treewidth and Pathwidth

All the graphs in this paper are simple and undirected, unless otherwise specified. $G(V, E)$ denotes a graph, where $V = V(G)$ is the vertex set of $G$ and $E = E(G)$ is the edge set of $G$. Let $n = |V|$ denote the number of vertices of $G$. For a subset $V' \subseteq V(G)$, $G[V']$ denotes the subgraph of $G$ induced by the vertices of $V'$. Recall the definitions of treewidth and pathwidth from Section 1.

Definition 3.1. (Rooted tree decomposition) A rooted tree decomposition is a tree decomposition $D = (\{X_i \mid i \in I\}, T(I, F))$ where $T$ is a rooted tree.

Definition 3.2. (Minimal path decomposition) A path decomposition $P = (Y_1, Y_2, \dots, Y_s)$ is called minimal if for all $i$, $1 \le i < s$, $Y_i \ne Y_{i+1}$.

Definition 3.3. (Subgraph rooted at a node) Let $D = (\{X_i \mid i \in I\}, T(I, F))$ be a rooted tree decomposition of a graph $G(V, E)$. For each node $i$ of $T$, let $T_i$ be the subtree of $T$ rooted at node $i$ and let $V_i = \bigcup_{j \in T_i} X_j$. We call $G_i = G[V_i]$ the subgraph of $G$ rooted at $i$.

Definition 3.4. (Partial path decomposition) A partial path decomposition rooted at node $i \in I$ is a path decomposition of $G_i$, the subgraph of $G$ rooted at $i$.

3.2 Interval Model, Typical Lists and Characteristics

Definition 3.5. (Interval model) Let $P = (Y_1, Y_2, \dots, Y_s)$ be a partial path decomposition rooted at node $i$. We define a path decomposition of the bag $X_i$ to be $(Z_1, Z_2, \dots, Z_s) = (Y_1 \cap X_i, Y_2 \cap X_i, \dots, Y_s \cap X_i)$. Let $1 = r_1 < r_2 < \dots < r_{t+1} = s + 1$ be defined by

$\forall_{1 \le j \le t}\ \forall_{r_j \le k < r_{j+1}}\ [Z_k = Z_{r_j}]$ and $\forall_{1 \le j < t}\ [Z_{r_j} \ne Z_{r_{j+1}}]$.

The interval model for $P$ at $i$ is the sequence $Z = (Z_{r_j})_{1 \le j \le t}$.

Definition 3.6. (List of a partial path decomposition) Let $P = (Y_1, Y_2, \dots, Y_s)$ be a partial path decomposition with interval model $Z = (Z_{r_j})_{1 \le j \le t}$.
Let the list of $P$ be $[y] = (y^1, y^2, \dots, y^t)$, with $y^j = (|Y_{r_j}|, |Y_{r_j+1}|, \dots, |Y_{r_{j+1}-1}|)$ for $1 \le j \le t$. Note that $[y]$ is a list of integer sequences. For any integer sequence $z$, let $l(z)$ be the number of integers in $z$, and for any list $[y] = (y^1, y^2, \dots, y^t)$, let $l(y) = l(y^1) + l(y^2) + \dots + l(y^t)$.

Definition 3.7. (Typical sequence, Typical list) Let $a = (a_1, a_2, \dots, a_n)$ be an integer sequence. Define the typical sequence $\tau(a)$ to be the sequence obtained by repeatedly applying the following operations, until it is no longer possible:

if there exists $1 \le i < n$ with $a_i = a_{i+1}$, replace $(a_1, a_2, \dots, a_n)$ by $(a_1, a_2, \dots, a_i, a_{i+2}, \dots, a_n)$;

if there exists $1 \le i < n - 1$ with $a_i < a_{i+1} < a_{i+2}$ or $a_i > a_{i+1} > a_{i+2}$, replace $(a_1, a_2, \dots, a_n)$ by $(a_1, a_2, \dots, a_i, a_{i+2}, \dots, a_n)$.

Let $[y] = (y^1, y^2, \dots, y^t)$ be a list of integer sequences. Define the typical list $\tau[y]$ to be the list $\tau[y] = (\tau(y^1), \tau(y^2), \dots, \tau(y^t))$.

Definition 3.8. (Characteristic) Let $D = (\{X_i \mid i \in I\}, T(I, F))$ be a tree decomposition. Let $i \in I$ and let $P$ be a partial path decomposition at $i$, with interval model $Z$ and typical list $\tau[y]$. Define the characteristic of $P$ at $i$ to be $C = (Z, \tau[y])$.

Definition 3.9. (Extensions) Let $a = (a_1, a_2, \dots, a_n)$ be an integer sequence. Define $E(a)$, the set of extensions of $a$, to be

$E(a) = \{ (a'_1, a'_2, \dots, a'_m) \mid \exists\, 1 = t_1 < t_2 < \dots < t_{n+1} = m + 1\ \forall_{1 \le i \le n}\ \forall_{t_i \le j < t_{i+1}}\ [a'_j = a_i] \}.$

In other words, an extension of $a$ is obtained from $a$ by repeating each of its elements one or more times. A sequence $y_1$ precedes another sequence $y_2$ if they have extensions of the same length such that each integer in the extension of $y_1$ is at most the integer at the same index in the extension of $y_2$.

Definition 3.10. (Precedence) Let $a$ and $b$ be two integer sequences. Define $a \preceq b$ iff there exist $e^a = (e^a_1, e^a_2, \dots, e^a_m) \in E(a)$ and $e^b = (e^b_1, e^b_2, \dots, e^b_m) \in E(b)$ such that for all $1 \le i \le m$, $e^a_i \le e^b_i$. Let $P_1$ and $P_2$ be two partial path decompositions rooted at the same node with the same interval model, and let the corresponding lists be $[y_1] = (y_1^1, y_1^2, \dots, y_1^t)$ and $[y_2] = (y_2^1, y_2^2, \dots, y_2^t)$. Define $P_1 \preceq P_2$ iff for all $1 \le j \le t$, $y_1^j \preceq y_2^j$.

Definition 3.11. (Full set of characteristics) For $i \in I$, $FS(i)$ is a full set of characteristics at $i$ iff for every partial path decomposition $P$ at $i$ of width at most $k$ there is a partial path decomposition $P'$ such that $P' \preceq P$ and the characteristic of $P'$ is in $FS(i)$.

4 Nice Tree Decompositions

Let $G$ be a graph of treewidth at most $k$. Let $D' = (\{X'_i \mid i \in I'\}, T'(I', F'))$ be a width-$k$ rooted tree decomposition of $G$ obtained by using Theorem 2.1. In this section, we present a logspace algorithm to transform $D'$ into a nice tree decomposition $D$.

Definition 4.1.
(Nice tree decomposition) A rooted tree decomposition $D = (\{X_i \mid i \in I\}, T(I, F))$ is called a nice tree decomposition if the following conditions are satisfied:

(NTD-1) every node of $T$ has at most two children;
(NTD-2) if a node $i$ has two children $j$ and $k$, then $X_i = X_j = X_k$;
(NTD-3) if a node $i$ has one child $j$, then exactly one of the following holds: $|X_i| = |X_j| + 1$ and $X_j \subset X_i$, or $|X_i| = |X_j| - 1$ and $X_i \subset X_j$.

Definition 4.2. (Start, Join, Forget and Introduce nodes) Let $D = (\{X_i \mid i \in I\}, T(I, F))$ be a nice tree decomposition and let $p \in I$. Node $p$ is of one of the following types:

Start: a leaf node,

Forget: a node with one child $q$, with $X_p \subset X_q$ and $|X_p| = |X_q| - 1$,

Introduce: a node with one child $q$, with $X_q \subset X_p$ and $|X_p| = |X_q| + 1$,

Join: a node with two children $q$ and $r$, with $X_p = X_q = X_r$.

Notice that if $i$ is a start node of a nice tree decomposition $D = (\{X_i \mid i \in I\}, T(I, F))$ such that $X_i = \{v_1, v_2, \dots, v_s\}$, we can add, for all $1 \le t < s$, the bags $X_{i,t} = \{v_1, v_2, \dots, v_t\}$ to $\{X_i \mid i \in I\}$ and a path of edges $i, (i, s-1), (i, s-2), \dots, (i, 1)$ to $T$, and obtain a nice tree decomposition in which the start node $i$ is replaced by the start node $(i, 1)$ with one vertex. Moreover, this adds at most $k$ to the depth of $T$, so if $T$ is balanced, the resulting tree is also balanced. Therefore, we may assume that all start nodes of a nice tree decomposition have one vertex.

Theorem 4.3. Let $k \ge 1$ be a constant and let $\mathcal{T} = (X, T)$ be a width-$k$ tree decomposition of $G(V, E)$ returned by the algorithm of [EJT10]. There exists a logspace algorithm that converts $\mathcal{T} = (X, T)$ into a nice tree decomposition $\mathcal{T}' = (X', T')$. Moreover, the depth of $T'$ is at most $4k$ times the depth of $T$.

Proof. There are three properties of a nice tree decomposition (see Definition 4.1): NTD-1, NTD-2 and NTD-3. Note that $\mathcal{T}$ satisfies NTD-1. We first enforce NTD-2 by a logspace tree traversal. Let $i$ be the current node, with children $i_1$ and $i_2$ if it has two children. If NTD-2 holds for $i$, output $i$ with the same list of children. Otherwise, output $i$ with children $j$ and $k$, where $j, k \notin I$ are added at this step to $I$. Define $X_j = X_k = X_i$, and let the child of $j$ be $i_1$ and the child of $k$ be $i_2$. Observe that NTD-1 and NTD-2 hold for the output nodes $i$, $j$, $k$, and thus at the end of the traversal the output has properties NTD-1 and NTD-2. The local changes simply subdivide an edge in two, so the depth of the tree at most doubles. Since all input bags are also output bags and the local changes are edge subdivisions, the output is a width-$k$ tree decomposition.
Moreover, each input node is transformed into at most 3 output nodes, so the output has polynomial size. We continue by showing the existence of a logspace-computable function $f_3$ whose input is a polynomial-size width-$k$ tree decomposition with properties NTD-1 and NTD-2, and whose output is a polynomial-size width-$k$ tree decomposition with properties NTD-1, NTD-2 and NTD-3, whose tree has depth at most $2k$ times the depth of the tree of the input tree decomposition.

Perform a logspace tree traversal. Let $i$ be the current node. If NTD-3 holds for $i$, output $i$ with its input list of children. Otherwise, store $i$ on the work tape and continue the tree traversal until the top of the traversal stack is a node $j$ for which NTD-3 holds or $X_i \ne X_j$. In the first case, output $i$ with the list of children of $j$. In the second case, let $X_k = X_i \cap X_j$, $X_i \setminus X_k = \{v_{i_1}, v_{i_2}, \dots, v_{i_s}\}$ and $X_j \setminus X_k = \{v_{j_1}, v_{j_2}, \dots, v_{j_t}\}$. Add to the output a path $i, i'_{s-1}, i'_{s-2}, \dots, i'_1, k, j'_1, j'_2, \dots, j'_{t-1}, j$, where $X_{i'_r} = X_k \cup \{v_{i_1}, v_{i_2}, \dots, v_{i_r}\}$ for $1 \le r \le s-1$, $X_{j'_u} = X_k \cup \{v_{j_1}, v_{j_2}, \dots, v_{j_u}\}$ for $1 \le u \le t-1$, and the nodes $i'_r, j'_u \notin I$, $1 \le r \le s-1$, $1 \le u \le t-1$, are added at this step to $I$. Observe that NTD-1, NTD-2 and NTD-3 hold for the output node $i$ and the new nodes, and thus at the end of the traversal these properties hold for all nodes. Node $i$ is transformed into $|X_i| + |X_j| \le 2k$ nodes, and so the output has polynomial size and its depth is at most $2k$ times the depth of the input tree. All input bags are output bags and the local changes are edge subdivisions, so the output is a width-$k$ tree decomposition. Since the class of logspace-computable functions is closed under composition, we have the desired result.
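The NTD-3 step of the proof can be sketched concretely (a hypothetical helper for illustration, assuming an arbitrary fixed order on the vertices): for one tree edge with bags $X_i$ and $X_j$, it builds the chain of bags that first forgets the vertices of $X_i \setminus X_k$ one at a time and then introduces those of $X_j \setminus X_k$, where $X_k = X_i \cap X_j$.

```python
def subdivide_edge(Xi, Xj):
    """Return the chain of bags between Xi and Xj used to enforce NTD-3:
    drop the vertices of Xi \\ Xj one at a time, then add those of Xj \\ Xi.
    Consecutive bags differ by exactly one vertex, as NTD-3 requires."""
    Xk = Xi & Xj
    chain = [set(Xi)]
    cur = set(Xi)
    for v in sorted(Xi - Xk):          # forget steps: the bag shrinks by 1
        cur = cur - {v}
        chain.append(set(cur))
    for v in sorted(Xj - Xk):          # introduce steps: the bag grows by 1
        cur = cur | {v}
        chain.append(set(cur))
    return chain

print(subdivide_edge({1, 2, 3}, {2, 4}))  # [{1, 2, 3}, {2, 3}, {2}, {2, 4}]
```
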

5 Computing Characteristics

Throughout this section, let $k \ge 1$ be a constant and let $D = (\{X_i \mid i \in I\}, T(I, F))$ be a nice tree decomposition of width $k$. Let $p, q, r \in I$ be nodes of $T$.

5.1 Computing the Characteristic from the Characteristics of Children

5.1.1 Start Node

Let $p \in I$ be a start node with $X_p = \{v\}$. Note that

$FS(p) = \{ ((\{v\}), [(1)]),\ ((\{v\}, \emptyset), [(1), (0)]),\ ((\emptyset, \{v\}), [(0), (1)]),\ ((\emptyset, \{v\}, \emptyset), [(0), (1), (0)]) \}.$

Define the procedure ComputeStartFS$(p)$ to compute the set $FS(p)$. It is easy to see that ComputeStartFS$(p)$ is a logspace-computable function.

5.1.2 Forget Node

Let $p$ be a forget node and $q$ its child, with $X_q \setminus X_p = \{v\}$. In this section, we show how to compute $FS(p)$ from $FS(q)$. Let $C_q = (Z, \tau[y])$ be a characteristic at $q$ with $Z = (Z_j)_{1 \le j \le t}$. Since our algorithm only retains $\tau[y]$, we may assume w.l.o.g. that $[y]$ is chosen so that $[y] = \tau[y]$. Define $Z \setminus \{v\} = (Z_j \setminus \{v\})_{1 \le j \le t}$ and let $Z'$ be the minimal path decomposition obtained by removing the consecutive duplicate bags of $Z \setminus \{v\}$. Define $[y']$ as follows (where juxtaposition denotes concatenation of integer sequences):

if $Z \setminus \{v\}$ does not contain any consecutive duplicate bags, then $[y'] = [y]$;

if $Z \setminus \{v\}$ contains one pair of consecutive duplicate bags, say $Z_{j_1} \setminus \{v\} = Z_{j_1+1} \setminus \{v\}$ for some $1 \le j_1 < t$, then $[y'] = (y^1, y^2, \dots, y^{j_1-1}, y^{j_1} y^{j_1+1}, y^{j_1+2}, \dots, y^t)$;

if $Z \setminus \{v\}$ contains two pairs of consecutive duplicate bags, but no three consecutive duplicate bags, say $Z_{j_1} \setminus \{v\} = Z_{j_1+1} \setminus \{v\}$ and $Z_{j_2} \setminus \{v\} = Z_{j_2+1} \setminus \{v\}$ for some $1 \le j_1 < j_2 < t$, then $[y'] = (y^1, y^2, \dots, y^{j_1-1}, y^{j_1} y^{j_1+1}, y^{j_1+2}, \dots, y^{j_2-1}, y^{j_2} y^{j_2+1}, y^{j_2+2}, \dots, y^t)$;

if $Z \setminus \{v\}$ contains three consecutive duplicate bags, say $Z_{j_1} \setminus \{v\} = Z_{j_1+1} \setminus \{v\} = Z_{j_1+2} \setminus \{v\}$ for some $1 \le j_1 \le t - 2$, then $[y'] = (y^1, y^2, \dots, y^{j_1-1}, y^{j_1} y^{j_1+1} y^{j_1+2}, y^{j_1+3}, \dots, y^t)$.

Then $(Z', \tau[y'])$ is a characteristic at $p$, and $FS(p) = \{f_{forget}(p, C_q) \mid C_q \in FS(q)\}$, where $f_{forget}(p, C_q) = (Z', \tau[y'])$. We define the procedure ComputeForgetFS$(p, C_q)$ to compute $(Z', \tau[y'])$ from the input $p, C_q$.
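The forget construction above can be sketched as follows (an illustration with hypothetical names, not the paper's logspace implementation; bags are Python sets, and list entries are concatenated whenever their bags merge). Applying $\tau$ to each entry of the resulting list then yields $\tau[y']$:

```python
def forget(Z, ylist, v):
    """Remove v from every bag of the interval model Z; when consecutive bags
    become equal, merge them and concatenate the matching entries of ylist."""
    newZ, newy = [], []
    for bag, y in zip(Z, ylist):
        bag = bag - {v}
        if newZ and newZ[-1] == bag:
            newy[-1] = newy[-1] + y      # bags merged: concatenate the sequences
        else:
            newZ.append(bag)
            newy.append(list(y))
    return newZ, newy

Z = [{1, 2}, {2, 3}, {3}]
ylist = [[2], [2, 1], [1]]
print(forget(Z, ylist, 2))  # ([{1}, {3}], [[2], [2, 1, 1]])
```
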
We can compute $FS(p)$ by iterating over all the characteristics in $FS(q)$.

Theorem 5.1. ComputeForgetFS$(p, C_q)$ is a logspace-computable function.

Proof. Let $C_q = (Z, \tau[y])$ with $Z = (Z_j)_{1 \le j \le t}$. ComputeForgetFS$(p, C_q)$ works as follows.

Step 1: Computing the interval model $Z'$. First, we can compute $Z \setminus \{v\}$ in logspace: iterate over the bags of $Z$ and, for each current bag $Z_j$, iterate over its vertices, outputting each vertex $u$ iff $u \ne v$. Since $Z$ can be stored in logspace, this procedure runs in logspace. Next, we can eliminate the consecutive duplicate bags of $Z \setminus \{v\}$ in logspace. Iterate over the bags of $Z \setminus \{v\}$, always outputting $Z_1 \setminus \{v\}$. Let the current bag be $Z_j \setminus \{v\}$ and the last bag output be $Z_{j'} \setminus \{v\}$. We can determine in logspace whether $Z_j \setminus \{v\} = Z_{j'} \setminus \{v\}$ using the algorithm of Lemma 5.2. If $Z_j \setminus \{v\} \ne Z_{j'} \setminus \{v\}$, output $Z_j \setminus \{v\}$ and store $j$ as the new $j'$; otherwise, continue the iteration over the bags of $Z \setminus \{v\}$.

Step 2: Computing the typical list $\tau[y']$. Observe that the above procedure computes the indices $j$ such that $Z_j \setminus \{v\} = Z_{j+1} \setminus \{v\}$. Hence, we can iterate over the sequences of $[y]$ and, using the definition of $[y']$, compute $[y']$ in logspace. Since the length of any sequence of $[y']$ is constant, we can apply the algorithm of Lemma 5.3 to compute $\tau[y']$.

Lemma 5.2. Let $Y_1, Y_2 \subseteq X_p$ with $|Y_1| \le k + 1$ and $|Y_2| \le k + 1$. There exists a logspace algorithm deciding whether $Y_1 = Y_2$.

Proof. Iterate over the vertices of $Y_1$ and, for each current vertex $v$, iterate over the vertices of $Y_2$, comparing each of them with $v$. Since $Y_1$ and $Y_2$ have a constant number of vertices, each with a logarithmic-size label, this can be done in logspace. If $v$ is not found in $Y_2$, conclude that $Y_1 \ne Y_2$ and terminate. If we reach the end of the iteration without terminating prematurely, then $Y_1 \subseteq Y_2$. Now repeat the above procedure for the pair $(Y_2, Y_1)$. If this also does not terminate prematurely, we have $Y_2 \subseteq Y_1$, and therefore we conclude that $Y_1 = Y_2$ and terminate.

Lemma 5.3. Let $a = (a_1, a_2, \dots, a_n)$ be a sequence of integers with $n \le c(k)$, where $c(k)$ is a constant depending on $k$. There exists a logspace algorithm computing $\tau(a)$.

Proof. Iterate over the integers of $a$. Output $a_1$ and proceed to the next element. Let $a_i$ be the current integer and $a_{i'}$ the last integer output. If $i = n$, output $a_i$. Otherwise, output $a_i$ iff one of the following holds:

$a_i > a_{i'}$ and $a_i > a_{i+1}$;
$a_i < a_{i'}$ and $a_i < a_{i+1}$.

From the definition of a typical sequence, we see that this procedure correctly outputs the typical sequence of $a$. Since each of the integers of $a$ can be stored in logspace, this is a logspace algorithm.
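The one-pass procedure of Lemma 5.3 can be sketched as follows (an illustration under the stated assumptions, not the paper's code; the final comparison additionally guards against a trailing duplicate, so that the output matches Definition 3.7):

```python
def typical_sequence(a):
    """One-pass sketch of Lemma 5.3: keep a_1 and a_n, plus every element that
    is a strict turning point relative to the last element output and the next
    element of the sequence."""
    out = [a[0]]
    for i in range(1, len(a)):
        if i == len(a) - 1:
            if a[i] != out[-1]:        # output the last integer (skip a duplicate)
                out.append(a[i])
        elif (a[i] > out[-1] and a[i] > a[i + 1]) or \
             (a[i] < out[-1] and a[i] < a[i + 1]):
            out.append(a[i])           # strict local extremum: keep it
    return out

print(typical_sequence([1, 1, 2, 3, 2, 2, 4]))  # [1, 3, 2, 4]
```

Note that duplicate and monotone middle elements are never output, exactly the elements removed by the two operations of Definition 3.7.
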
5.1.3 Introduce Node

Let $p$ be an introduce node and $q$ its child, with $X_p \setminus X_q = \{v\}$. In this section, we show how to compute $FS(p)$ from $FS(q)$.

Definition 5.4. (Split) Let $a = (a_1, a_2, \dots, a_n)$ be an integer sequence. For $1 \le i \le n$, define $((a_1, a_2, \dots, a_i), (a_{i+1}, a_{i+2}, \dots, a_n))$ to be the split of type I of $a$ at $i$, and define $((a_1, a_2, \dots, a_i), (a_i, a_{i+1}, \dots, a_n))$ to be the split of type II of $a$ at $i$. Define $(a^1, a^2)$ to be a split of $a$ if there exists $1 \le i \le n$ such that $(a^1, a^2)$ is the split of type I or II of $a$ at $i$.

Let $C_q = (Z, \tau[y])$ be a characteristic at $q$ with $Z = (Z_j)_{1 \le j \le t}$, and assume w.l.o.g. that $[y] = \tau[y]$. Define the procedure ComputeIntroduceFS$(p, C_q)$ as follows. First compute the set $S(Z)$ of all minimal path decompositions $Z' = (Z'_j)_{1 \le j \le t'}$ of $G[X_p]$ such that, by removing the consecutive duplicate bags of $Z' \setminus \{v\}$, we obtain $Z$, where $Z' \setminus \{v\} = (Z'_j \setminus \{v\})_{1 \le j \le t'}$. For every $Z' \in S(Z)$, do the following. Let $l$ and $r$ be the smallest and largest indices such that $v \in Z'_l$ and $v \in Z'_r$, respectively. Define the set of typical lists $S'(Z')$ as follows:

if t' = t, then S'(Z') = {(y_1, y_2, ..., y_{l-1}, y_l + 1, y_{l+1} + 1, ..., y_r + 1, y_{r+1}, ..., y_t)},
if t' = t + 1 and Z'_l \ {v} = Z'_{l-1}, then S'(Z') = {(y_1, y_2, ..., y_{l-2}, y'_{l-1}, y'_l + 1, y_l + 1, ..., y_{r-1} + 1, y_r, ..., y_t) | (y'_{l-1}, y'_l) is a split of y_{l-1}},
if t' = t + 1 and Z'_r \ {v} = Z'_{r+1}, then S'(Z') = {(y_1, y_2, ..., y_{l-1}, y_l + 1, ..., y_{r-1} + 1, y'_r + 1, y'_{r+1}, y_{r+1}, ..., y_t) | (y'_r, y'_{r+1}) is a split of y_r},
if t' = t + 2 and l < r, then S'(Z') = {(y_1, y_2, ..., y_{l-2}, y'_{l-1}, y'_l + 1, y_l + 1, ..., y_{r-2} + 1, y'_r + 1, y'_{r+1}, y_r, ..., y_t) | (y'_{l-1}, y'_l) is a split of y_{l-1} and (y'_r, y'_{r+1}) is a split of y_{r-1}},
if t' = t + 2 and l = r, then S'(Z') = {(y_1, y_2, ..., y_{l-2}, y'_{l-1}, y'_l + 1, y'_{l+1}, y_l, ..., y_t) | (y'_{l-1}, ȳ) is a split of y_{l-1} and (y'_l, y'_{l+1}) is a split of ȳ}.
For all [y'] ∈ S'(Z'), output (Z', [y']).
The output of ComputeIntroduceFS(p, C_q) is a set of characteristics at p and moreover, FS(p) is the union over all C_q ∈ FS(q) of the outputs of ComputeIntroduceFS(p, C_q). Hence, to show that FS(p) can be computed in logspace from FS(q), it is sufficient to show the following.
Theorem 5.5. ComputeIntroduceFS(p, C_q) is a logspace computable function.
Proof. Let C_q = (Z, T[y]) with Z = (Z_j)_{1≤j≤t}. ComputeIntroduceFS(p, C_q) works as follows.
Step 1: Computing the Minimal Path Decompositions for G[X_p]
By Lemma B.1, the number of bags in a minimal path decomposition for G[X_p] is bounded by a constant, call it c. We first see that we can iterate in logspace the sequences of integers y = (y_1, y_2, ..., y_s) such that s ≤ c and, for all 1 ≤ j ≤ s, y_j ≤ k + 1. Since s is a constant and each y_j is bounded by a constant, we can store y in constant space. We can iterate these sequences in logspace by storing only the last computed sequence and at the next step considering the next sequence in lexicographical order.
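The lexicographic iteration used in Step 1 only ever stores the current sequence; a minimal Python sketch of the successor step (the name and the value range {0, ..., max_val}, standing in for the bag sizes 0, ..., k + 1, are ours):

```python
def next_sequence(y, max_val):
    """Lexicographic successor of a fixed-length sequence over {0, ..., max_val},
    or None if y is already the last one. Storing only the current sequence
    mirrors the logspace iteration described in Step 1."""
    y = list(y)
    for i in reversed(range(len(y))):
        if y[i] < max_val:
            y[i] += 1      # bump the rightmost position that can grow
            return y
        y[i] = 0           # carry: reset and move left
    return None
```

Starting from the all-zero sequence and repeatedly applying the successor enumerates every sequence of the given length exactly once, in lexicographic order.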
For each such sequence y, we see that we can iterate in logspace the sequences of bags Y = (Y_1, Y_2, ..., Y_s) such that, for all 1 ≤ j ≤ s, Y_j ⊆ X_p and |Y_j| = y_j. Since s is constant and, for 1 ≤ j ≤ s, Y_j contains a constant number of vertices, each with a logarithmic label, Y can be stored in logspace. We can iterate these sequences in logspace by storing only the last computed sequence and at the next step computing the one following it in lexicographical order.
For each such sequence Y, we shall now prove that we can decide in logspace whether Y is a path decomposition. We check in logspace each of the three defining properties (see Definition 1.2) of a path decomposition.
PD-1: Iterate all vertices in X_p. Since X_p has a constant number of vertices, each with a logarithmic label, this can be done in logspace. For each current vertex u, iterate the vertices in each of the bags in Y, comparing each of them with u. Since Y can be stored in logspace, this can be done in logspace. Decide that Y has property PD-1 iff there exists 1 ≤ j ≤ s such that u ∈ Y_j.
PD-2: Iterate in logspace all edges of G[X_p]. For each current edge (u, w), iterate the vertices in each of the bags in Y, comparing each of them with u and w. Decide that Y has property PD-2 iff there exists 1 ≤ j ≤ s such that u, w ∈ Y_j.

PD-3: Iterate in logspace the vertices in X_p. For each current vertex u, iterate the vertices in each of the bags in Y, comparing each of them with u. During this procedure, store a boolean variable found that is set to true once a bag containing u has been seen, and a boolean variable ended that is set to true once, with found = true, a bag not containing u is seen. Observe that found = true iff u belongs to one of the bags iterated so far. If the current bag contains u and ended = true, stop the iteration and conclude that Y does not have property PD-3. If none of the bags causes the termination of the procedure, conclude that Y has property PD-3.
For each path decomposition Y, we can decide in logspace whether Y is minimal. Iterate in logspace consecutive pairs of bags of Y. For each current pair of bags (Y_j, Y_{j+1}), apply the logspace algorithm of Lemma 5.2 to determine whether they are equal. If they are found equal, terminate the procedure and conclude that Y is not minimal. If no pair of bags causes the termination of the procedure, conclude that Y is indeed a minimal path decomposition.
Applying successively the above procedures yields a logspace algorithm that computes all minimal path decompositions for G[X_p].
Step 2: Computing the Interval Models in S(Z)
For each minimal path decomposition Z', we want to determine in logspace the sequence of bags obtained by removing the consecutive duplicate bags of Z' \ {v} = (Z'_j \ {v})_{1≤j≤t'}. This can be done using the algorithm from the Computing the Interval Model step of the ComputeForgetFS(p, C_q) procedure. Let the sequence of bags obtained by this procedure be Z''. We now check whether Z'' = Z. Determine the number of bags in Z''. Because Z' has a constant number of bags, so does Z'' and hence the number of bags in Z'' can be determined in logspace. If the number of bags in Z'' is not t, conclude that Z'' ≠ Z and hence Z' ∉ S(Z). Otherwise, iterate pairs of bags at the same indices in Z'' and Z. For each current pair of bags (Z''_j, Z_j), determine in logspace whether they are distinct using the algorithm of Lemma 5.2.
Conclude that Z' ∉ S(Z) iff there exists j such that Z''_j ≠ Z_j.
Step 3: Computing the Typical Lists in S'(Z')
Let Z' ∈ S(Z) and let l, r be the smallest and largest indices such that v ∈ Z'_l and v ∈ Z'_r, resp. We now see that l can be computed in logspace. Iterate the bags in Z'. For each current bag Z'_j, iterate its vertices, comparing each of them with v. If v ∈ Z'_j, set l = j and terminate the procedure. Moreover, r can be computed in logspace. Iterate pairs of consecutive bags in Z'. For each current pair of bags (Z'_j, Z'_{j+1}), check whether v ∈ Z'_j and v ∉ Z'_{j+1}. If so, set r = j and terminate the procedure.
We first see that, for all 1 ≤ j ≤ t, we can compute all splits of y_j in logspace. Let y_j = (y_j^1, y_j^2, ..., y_j^{l(y_j)}). For each 1 ≤ j' < l(y_j), output ((y_j^1, y_j^2, ..., y_j^{j'}), (y_j^{j'+1}, y_j^{j'+2}, ..., y_j^{l(y_j)})) and ((y_j^1, y_j^2, ..., y_j^{j'}), (y_j^{j'}, y_j^{j'+1}, ..., y_j^{l(y_j)})). By Lemma B.2, [y] can be stored in logspace and hence, for every 1 ≤ j ≤ t, y_j and l(y_j) can be stored in logspace. Thus, this procedure can be done in logspace.
If t' = t, iterate the sequences of [y]. For each current sequence y_j, output y_j if j < l or j > r, and output y_j + 1 otherwise. If t' = t + 1 and Z'_l \ {v} = Z'_{l-1}, iterate all possible splits of y_{l-1}. For each current split (y'_{l-1}, y'_l), iterate the sequences of [y]. For each current sequence y_j, output y_j if j < l - 1 or j ≥ r, output y_j + 1 if l ≤ j < r, and if j = l - 1, output y'_{l-1}, y'_l + 1. The other cases are treated similarly. Since [y] can be stored in constant space, this is a logspace

procedure.
5.1.4 Join Node
Let p be a join node and q, r be its children such that X_p = X_q = X_r. In this section, we see how to compute FS(p) from FS(q) and FS(r). Let C_q = (Z_q, T[y_q]) ∈ FS(q) and C_r = (Z_r, T[y_r]) ∈ FS(r) be characteristics at q and r, resp. We define the procedure ComputeJoinFS(p, C_q, C_r) as follows. If C_q and C_r do not have the same interval model, let the procedure have no output and terminate. Otherwise, let Z = Z_q = Z_r with Z = (Z_j)_{1≤j≤t}. Let the procedure output all characteristics of the form (Z, T[y_p]), where for all 1 ≤ j ≤ t, y_p^j = e_q^j + e_r^j - |Z_j|, for some e_q^j ∈ E(T(y_q^j)), e_r^j ∈ E(T(y_r^j)), with l(e_q^j) = l(e_r^j) ≤ l(T(y_q^j)) + l(T(y_r^j)) and max(y_p^j) ≤ k + 1.
The output of ComputeJoinFS(p, C_q, C_r) is a set of characteristics at p and moreover, we can compute FS(p) by iterating over all characteristics C_q ∈ FS(q) and C_r ∈ FS(r) and taking the union of the output characteristics of ComputeJoinFS(p, C_q, C_r).
Theorem 5.6. ComputeJoinFS(p, C_q, C_r) is a logspace computable function.
Proof. Let C_q = (Z_q, T[y_q]) and C_r = (Z_r, T[y_r]). ComputeJoinFS(p, C_q, C_r) works as follows. Using the algorithm of Lemma 5.2, we can determine in logspace whether Z_q = Z_r. If Z_q ≠ Z_r, do not output anything and terminate. Otherwise, let Z = Z_q = Z_r, with Z = (Z_j)_{1≤j≤t}. For all 1 ≤ j ≤ t, repeat the following procedure to compute the set of typical lists S_j = {T(y_p^j) | y_p^j = e_q^j + e_r^j - |Z_j| for some e_q^j ∈ E(T(y_q^j)), e_r^j ∈ E(T(y_r^j)), with l(e_q^j) = l(e_r^j) ≤ l(T(y_q^j)) + l(T(y_r^j)) and max(y_p^j) ≤ k + 1}.
We now see that, for 1 ≤ j ≤ t, we can iterate in logspace all extensions e_q^j ∈ E(T(y_q^j)) with l(e_q^j) ≤ l(T(y_q^j)) + l(T(y_r^j)). By Lemma B.2, l(T(y_q^j)) and l(T(y_r^j)) are constants and hence l(e_q^j) is also at most a constant. Iterate all nondecreasing sequences of integers (j_1, j_2, ..., j_l) with l ≤ l(T(y_q^j)) + l(T(y_r^j)), j_1 = 1 and j_l = l(T(y_q^j)). Since l is at most a constant and, for all 1 ≤ j' ≤ l, j_{j'} can be stored in logspace, the sequence (j_1, j_2, ..., j_l) can be stored in logspace. We can store the last sequence output, and then iterate over its successors until finding one with the desired properties, and hence this iteration can be done in logspace. For each current sequence (j_1, j_2, ..., j_l), output ((T(y_q^j))_{j_1}, (T(y_q^j))_{j_2}, ..., (T(y_q^j))_{j_l}). Observe that this procedure outputs all desired extensions. For each such extension e_q^j, we can similarly iterate all extensions e_r^j ∈ E(T(y_r^j)) with l(e_r^j) = l(e_q^j). For each current pair of extensions (e_q^j, e_r^j), compute y_p^j = e_q^j + e_r^j - |Z_j|. Since Z_j has a constant number of vertices, l(e_q^j) = l(e_r^j) is a constant and each of the integers of e_q^j and e_r^j can be stored in logspace, we can compute y_p^j in logspace. Using the algorithm of Lemma 5.3, we can compute T(y_p^j) in logspace. Add T(y_p^j) to S_j iff max(y_p^j) ≤ k + 1. Hence, we can compute S_1, S_2, ..., S_t in logspace. Now iterate all typical lists (T(y_p^1), T(y_p^2), ..., T(y_p^t)) such that for all 1 ≤ j ≤ t, T(y_p^j) ∈ S_j. Since each typical list can be stored in logspace and t is constant, we can store each element of the iteration in logspace. Each time, output the typical list following in lexicographical order the last typical list that was output. Hence, this iteration can be done in logspace. Observe that this produces the desired output.
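Reading an extension of a typical sequence as "each element repeated one or more times, in order", and the combination rule as the pointwise sum e_q^j + e_r^j - |Z_j| described above, the extend-and-combine step might be sketched as follows (function names are ours; the logspace algorithm enumerates these objects one at a time instead of materialising lists):

```python
def extensions(t, max_len):
    """All extensions of a typical sequence t of length at most max_len:
    every element of t repeated one or more times, in order."""
    out = []
    def rec(i, cur):
        if i == len(t):
            out.append(cur)
            return
        reps = 1
        # leave room for at least one copy of each remaining element
        while len(cur) + reps + (len(t) - i - 1) <= max_len:
            rec(i + 1, cur + [t[i]] * reps)
            reps += 1
    rec(0, [])
    return out

def join_candidates(tq, tr, bag_size, k):
    """Pointwise sums e_q + e_r - |Z_j| over equal-length extensions of the
    two typical sequences, kept only when no entry exceeds k + 1."""
    limit = len(tq) + len(tr)
    cands = []
    for eq in extensions(tq, limit):
        for er in extensions(tr, limit):
            if len(eq) == len(er):
                y = [a + b - bag_size for a, b in zip(eq, er)]
                if max(y) <= k + 1:
                    cands.append(y)
    return cands
```

The subtraction of |Z_j| compensates for the shared bag being counted on both sides of the join.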

5.2 Computing the full set of characteristics at the root
To compute the full set of characteristics at the root, we will perform a logspace depth-first search tree traversal [Lin92]. Throughout this section, let D = ({X_i | i ∈ I}, T = (I, F)) be the balanced width-k nice tree decomposition of G = (V, E) output by Theorem 4.3.
Definition 5.7. (Relabeling) Let V = {v_1, v_2, ..., v_n} and X_p = {v_{p_1}, v_{p_2}, ..., v_{p_s}} with 1 ≤ p_1 < p_2 < ... < p_s ≤ n and let C = (Z, T[y]) ∈ FS(p) with Z = (Z_j)_{1≤j≤t}. For 1 ≤ r ≤ s, define the relabeling of v_{p_r} at p to be R(p, v_{p_r}) = r. Define the relabeling of Z at p to be R(p, Z) = Z', where Z' = (Z'_j)_{1≤j≤t} and Z'_j = {R(p, v) | v ∈ Z_j}. Define the relabeling of C at p to be R(p, C) = (R(p, Z), T[y]). Define the relabeling of FS(p) at p to be R(p, FS(p)) = {R(p, C) | C ∈ FS(p)}. Define the relabeling of null to be R(p, null) = null. For 1 ≤ r ≤ s, define the inverse relabeling of r at p to be R^{-1}(p, r) = v_{p_r}. Define the inverse relabeling of Z' at p, with Z' = R(p, Z) for some interval model Z at p, to be R^{-1}(p, Z') = Z. Define the inverse relabeling of C' at p, with C' = R(p, C) for some characteristic C at p, to be R^{-1}(p, C') = C. Define the inverse relabeling of R(p, FS(p)) at p to be R^{-1}(p, R(p, FS(p))) = FS(p). Define the inverse relabeling of null to be R^{-1}(p, null) = null.
Theorem 5.8. FS(p) uses (2k + 3)^{2k+3} (k + 1)^{(2k+3)^2+1} [(2k + 3)(k + 1) log n + (2k + 3)^2 log(k + 1)] space and R(p, FS(p)) uses (2k + 3)^{2k+3} (k + 1)^{(2k+3)^2+1} [(2k + 3)(k + 1) log k + (2k + 3)^2 log(k + 1)] space.
Proof. The number of different characteristics is at most the product of the number of different interval models, which by Lemma B.1 is at most (2k + 3)^{2k+3}, and the number of different typical lists, which by Lemma B.2 is at most (k + 1)^{(2k+3)^2+1}. So the number of different characteristics is at most (2k + 3)^{2k+3} (k + 1)^{(2k+3)^2+1}. Let C = (Z, T[y]) ∈ FS(p) with Z = (Z_j)_{1≤j≤t}.
By Lemma B.1, t ≤ 2k + 3 and since |Z_j| ≤ k + 1 and each vertex has a (log n)-bit label, Z uses (2k + 3)(k + 1) log n space. Each vertex in each of the bags of R(p, Z) has a (log k)-bit label and hence R(p, Z) uses (2k + 3)(k + 1) log k space. By Lemma B.2, l(T[y]) ≤ (2k + 3)^2 and since each of the integers in T[y] is at most k + 1, T[y] uses (2k + 3)^2 log(k + 1) space. Thus, C uses (2k + 3)(k + 1) log n + (2k + 3)^2 log(k + 1) space and R(p, C) uses (2k + 3)(k + 1) log k + (2k + 3)^2 log(k + 1) space.
Lemma 5.9. R(p, FS(p)) and R^{-1}(p, R(p, FS(p))) are logspace computable functions.
Proof. Since the class of logspace computable functions is closed under composition, to show that R(p, FS(p)) is a logspace function, it is enough to show that for any v ∈ X_p, R(p, v) is a logspace function. Iterate the vertices of X_p, keeping a counter recording the number of vertices seen so far whose labels are at most the label of v. Let R(p, v) be the value of the counter at the end of the iteration. To show that R^{-1}(p, R(p, FS(p))) is a logspace function, it is enough to show that for any 1 ≤ r ≤ |X_p|, R^{-1}(p, r) is a logspace function. Iterate the vertices of X_p. Output the current vertex v iff R(p, v) = r.
Theorem 5.10. There exists a logspace algorithm ComputeFS(root) that computes FS(root).

Proof. The desired algorithm is Algorithm 2, with its pop function overridden by the function of Algorithm 1, where the functions pushchar and popchar are the stack push and pop functions of charstack, a stack of relabelings of sets of characteristics, and R_p is a relabeled full set of characteristics, initialized with null.
Algorithm 1
function pop
    if t has no children then
        R_p ← R(t, ComputeStartFS(t))
    else if t has one child then
        if t is a forget node then
            R_p ← R(t, ComputeForgetFS(t, R^{-1}(p, R_p)))
        else
            R_p ← R(t, ComputeIntroduceFS(t, R^{-1}(p, R_p)))
        end if
    else
        let q be a child of t, different from p
        R_p ← R(t, ComputeJoinFS(t, R^{-1}(p, R_p), R^{-1}(q, popchar)))
    end if
    if nextsibling(t) ≠ null then
        pushchar(R_p)
    end if
    p ← t, t ← parent(t), lastmove ← pop
Functionally, the algorithm is equivalent to a stack-based depth-first search traversal, with R_p = R(p, FS(p)) and charstack = (R(q_1, FS(q_1)), R(q_2, FS(q_2)), ..., R(q_t, FS(q_t))), where R(q_1, FS(q_1)) is the top of the stack and, for all 1 ≤ j ≤ t, q_j was popped off the traversal stack, is lexicographically smaller than its sibling, and its parent is in the traversal stack. Moreover, q_1 was popped off more recently than any q_j with 1 < j ≤ t. Let t be the node that is currently popped off the traversal stack. A node is popped off charstack only when t has two children and hence the relabeling of its smaller child is the top of charstack. A node is pushed on charstack only when it is equal to t and is the smaller child of its parent. So every pop operation updates charstack as desired. Assume that at the beginning of a call of pop, R_p is R(p, FS(p)). We shall see that before t is updated, R_p is updated to R(t, FS(t)). If t has no children, this clearly holds. If t has one child q, then p = q and since R_p = R(p, FS(p)), the claim holds. If t has two children q < r, then q = p and R_p = R(p, FS(p)) = R(q, FS(q)).
Moreover, the top of charstack is R(r, FS(r)) and hence the claim holds in this case as well. At the end of pop, we have p ← t and thus R_p is updated to R(p, FS(p)), as desired. Since the last node that is popped off the traversal stack is the root, the algorithm presented computes R(root, FS(root)). Output R^{-1}(root, R(root, FS(root))). By Theorem 5.8, R_p and each of the items of charstack use constant space. Since T is balanced, charstack has a logarithmic number of elements and hence uses logspace. The functions parent and nextsibling are logspace computable (Lemma A.2), the procedures ComputeStartFS, ComputeForgetFS, ComputeIntroduceFS and ComputeJoinFS are logspace functions, and the functions R and R^{-1} are also logspace functions by Lemma 5.9. The class of logspace computable functions is closed under composition and therefore, the given algorithm is a logspace algorithm.
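The bookkeeping behind this traversal (only the value of the subtree just finished, plus a stack of values of finished first siblings whose parent is still open) can be illustrated with the following Python sketch over an explicit tree; all names are ours, the `children` map is a hypothetical interface, and the precomputed post-order list stands in for Algorithm 2's traversal stack:

```python
def evaluate_with_sibling_stack(children, root, leaf_val, unary_op, join_op):
    """Post-order evaluation of a tree with at most two children per node,
    keeping only the last finished value (the role of R_p) and a stack of
    values of finished first children (the role of charstack)."""
    # explicit-stack post-order: the order in which nodes are "popped"
    order, stack = [], [(root, False)]
    while stack:
        node, done = stack.pop()
        if done:
            order.append(node)
            continue
        stack.append((node, True))
        for c in reversed(children.get(node, ())):
            stack.append((c, False))
    parent = {c: p for p, cs in children.items() for c in cs}
    char_stack, r = [], None
    for node in order:
        kids = children.get(node, ())
        if len(kids) == 0:
            val = leaf_val(node)
        elif len(kids) == 1:
            val = unary_op(node, r)        # r holds the only child's value
        else:
            first = char_stack.pop()       # first child's value, saved earlier
            val = join_op(node, first, r)  # r holds the second child's value
        p = parent.get(node)
        # push only when a sibling of `node` still has to be processed
        if p is not None and len(children[p]) == 2 and children[p][0] == node:
            char_stack.append(val)
        r = val
    return r
```

With `leaf_val`, `unary_op` and `join_op` counting nodes, the sketch returns the size of the tree; the point is that at any moment only `r` and the sibling stack are live, matching the constant-size items and logarithmic stack depth argued above.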

5.3 Computing a unique characteristic at each node
We now see how to compute a unique characteristic at each node that is the same as the output of the logspace functions of Section 5.1 applied to the unique characteristics at the children of this node. We first show how, given any characteristic at a node, we can compute a unique characteristic for its children and then generalize this to any of its descendants. Let C_p ∈ FS(p) and let q ∈ I be a child of p. Let p' ∈ I be a descendant of p in T.
Definition 5.11. (Descendant characteristic) Define the descendant characteristic c(p, q, C_p) of C_p at p for q as follows:
if p is a forget node, then c(p, q, C_p) is such that ComputeForgetFS(p, c(p, q, C_p)) = C_p,
if p is an introduce node, then c(p, q, C_p) is the lexicographically smallest characteristic in {C_q | C_q ∈ FS(q), C_p ∈ ComputeIntroduceFS(p, C_q)},
if p is a join node, then let r be its other child. If q is lexicographically smaller than r, let (C'_q, C'_r) be the lexicographically smallest pair of characteristics in {(C_q, C_r) | C_q ∈ FS(q), C_r ∈ FS(r), C_p ∈ ComputeJoinFS(p, C_q, C_r)}. Otherwise, let (C'_r, C'_q) be the lexicographically smallest pair of characteristics in {(C_r, C_q) | C_q ∈ FS(q), C_r ∈ FS(r), C_p ∈ ComputeJoinFS(p, C_q, C_r)}. Then c(p, q, C_p) = C'_q.
Let p = p_0, p_1, ..., p_{t-1}, p_t = p' be the nodes in order on the path in T between p and p'. Let C_{p_0} = C_p and, for all 0 ≤ j < t, C_{p_{j+1}} = c(p_j, p_{j+1}, C_{p_j}). Define the descendant characteristic of C_p at p for p' to be c(p, p', C_p) = C_{p_t}.
Theorem 5.12. There exists a logspace algorithm ComputeDescendantCharacteristic(p, p', C_p) that outputs the descendant characteristic c(p, p', C_p).
Proof. We first show that the statement holds when p' is q, a child of p. Theorem 5.10 applied to the subtree of T rooted at q gives FS(q) in logspace. If p is a forget node, iterate the characteristics in FS(q). Output the current characteristic C_q iff ComputeForgetFS(p, C_q) = C_p.
If p is an introduce node, iterate the characteristics in FS(q), storing a characteristic C_min, initialized with null, where null is defined to be lexicographically larger than any characteristic. If the current characteristic is C_q such that C_p ∈ ComputeIntroduceFS(p, C_q) and C_q is lexicographically smaller than C_min, replace C_min by C_q. Return the value of C_min at the end of the iteration. If p is a join node, let r be the other child of p. Suppose wlog that q is lexicographically smaller than r. Similarly to FS(q), we can compute FS(r) in logspace. Iterate the characteristics in FS(q) and for each of them, iterate the characteristics in FS(r), storing a characteristic C_min, initialized with null. If the current characteristics are C_q ∈ FS(q), C_r ∈ FS(r) such that C_p ∈ ComputeJoinFS(p, C_q, C_r) and C_q is lexicographically smaller than C_min, replace C_min by C_q. Return the value of C_min at the end of the iteration. The functions ComputeForgetFS, ComputeIntroduceFS, ComputeJoinFS are logspace functions by Theorems 5.1, 5.5 and 5.6, respectively. By Theorem 5.8, any full set of characteristics uses logspace. Thus, c(p, q, C_p) is a logspace function.
For any ancestor p'' ∈ I of p', define c(p'', p') to be the child of p'' that is an ancestor of p'. Iterate the nodes on the path from p' to p'', storing the current node x, initialized with p', and the child y of x on the path, initialized with null. At each step, update y by x and x by parent(x). Stop the iteration when x = p'' and output y. By Lemma A.2, parent is a logspace function and thus c(p'', p') is a logspace function.
Iterate the nodes on the path from p to p', storing the current node x, initialized with p, and a characteristic C at x, initialized with C_p. Replace C by c(x, c(x, p'), C) and then x by c(x, p'). Stop when x = p' and output the current value of C. By the above, c(x, p') and c(x, c(x, p'), C) are logspace functions. Therefore, since the composition of logspace computable functions is a logspace computable function, c(p, p', C_p) is a logspace function.
Definition 5.13. (Unique characteristic) Let C_root ∈ FS(root) be the lexicographically smallest characteristic in the full set of characteristics at the root. Define the unique characteristic at p to be the descendant characteristic c(root, p, C_root).
6 Computing the Bag List
In this section, we give a logspace algorithm for computing the bag list, an auxiliary data structure useful for computing the path decompositions. We first show how to compute the bag list at a node given the bag lists at its children and then show how to compute it at the root. Throughout this section, let k ≥ 1 be a constant and G = (V, E) be a graph of pathwidth k. Let D = ({X_i | i ∈ I}, T = (I, F)) be a width-k nice tree decomposition of G. Let p ∈ I.
Definition 6.1. (Characteristic path decomposition) Let the characteristic path decomposition CP(p) = (Y_1, Y_2, ..., Y_s) be the width-k path decomposition at p output by the algorithm of [BK96] given D and choosing as the characteristic at the root the unique characteristic of Definition 5.13. From the algorithm of [BK96], CP(p) is minimal and from the proof of Theorem B.1, any minimal path decomposition of G = (V, E) has at most 2|V| + 1 bags. Hence, the number of bags in CP(p) can be stored in logspace.
We now define the bag list and see that it can be stored in logspace. Each of the integers in the typical list of the unique characteristic is the cardinality of some bag in the characteristic path decomposition. We define the bag list as the list of the numbers of bags in the characteristic path decomposition between any two bags corresponding to consecutive elements in the typical list. In this section, we first define formally the notion of bag list and then show that the bag list at a node can be computed in logspace given the bag lists at its children.
Definition 6.2. (Bag list) Let C_p = (Z, T[y]) be the unique characteristic at p, with Z = (Z_j)_{1≤j≤t}. For all 1 ≤ j ≤ t, let l_j = l(T(y^j)) and let T(y^j) = (|Y_{i^j_1}|, |Y_{i^j_2}|, ..., |Y_{i^j_{l_j}}|). For 1 ≤ j ≤ t, define b(p)^j = (i^j_2 - i^j_1 - 1, i^j_3 - i^j_2 - 1, ..., i^j_{l_j} - i^j_{l_j-1} - 1, i^{j+1}_1 - i^j_{l_j} - 1), where i^{t+1}_1 = s + 1. Define b(p) = (b(p)^1, b(p)^2, ..., b(p)^t) to be the bag list at p.
Notice that b(p) can be stored in logspace. For any 1 ≤ j ≤ t and 1 ≤ i ≤ l_j, we have b(p)^j_i ≤ s and since s can be stored in logspace, b(p)^j_i can also be stored in logspace. Moreover, l(b(p)) = l(T[y]) and by Theorem B.2, l(T[y]) is at most a constant depending on k. So, b(p) contains at most a constant (depending on k) number of integers, each of which can be stored in logspace.
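Given the (1-indexed) positions i^j_1 < i^j_2 < ... < i^j_{l_j} of the typical-sequence bags inside CP(p), each entry of b(p)^j is just the count of bags strictly between two consecutive positions. A small Python sketch of Definition 6.2 (names are ours; `end` plays the role of i^{j+1}_1, with s + 1 used for the last block):

```python
def gap_list(positions, end):
    """Numbers of bags strictly between consecutive typical-sequence
    positions, as in Definition 6.2: entry m is positions[m+1] - positions[m] - 1,
    with `end` appended as the closing boundary."""
    pts = list(positions) + [end]
    return [pts[m + 1] - pts[m] - 1 for m in range(len(pts) - 1)]
```

For instance, typical-sequence bags at positions 1, 4, 5 with the next block starting at 9 give the gaps (2, 0, 3).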

6.1 Computing the Bag List from the Bag Lists of the Children
In this section, we see how to compute in logspace the bag list at a node, given the bag lists at its children.
6.1.1 Start Node
Let p be a start node and b(p) be its bag list. Let C_p be the unique characteristic at p. Observe that:
if C_p = (({v}), [(1)]), then b(p) = [(0)],
if C_p = (({v}, {}), [(1), (0)]), then b(p) = [(0), (0)],
if C_p = (({}, {v}), [(0), (1)]), then b(p) = [(0), (0)],
if C_p = (({}, {v}, {}), [(0), (1), (0)]), then b(p) = [(0), (0), (0)].
Define the procedure ComputeStartBagList(p) to compute the bag list b(p). It is easy to see that ComputeStartBagList(p) is a logspace computable function.
6.1.2 Forget Node
Let p be a forget node and q be its child. Let C_p = (Z', T[y']), C_q = (Z, T[y]) be the unique characteristics at p, q, resp. Since the algorithm only retains T[y], we may assume wlog that [y] = T[y]. In this section, we see how to compute b(p) from b(q).
We now define the procedure ComputeForgetBagList(b(q)) that computes b(p) given b(q). First define a helper procedure that, given two consecutive sequences in b(q), outputs the integer sequence corresponding to the typical sequence obtained by concatenating the two consecutive sequences in T[y] corresponding to the input sequences of b(q). We then extend it to accept three consecutive bag sequences. For 1 ≤ j ≤ t - 1, consider the shorthand notation w = b(q)^j ∘ b(q)^{j+1}. Define µ(w) as follows:
if T(y^j ∘ y^{j+1}) = y^j ∘ y^{j+1}, then µ(w) = w,
otherwise, let i_1 be the minimum index such that T(y^j ∘ y^{j+1})_{i_1} = (y^j ∘ y^{j+1})_{i_1} and T(y^j ∘ y^{j+1})_{i_1+1} = (y^j ∘ y^{j+1})_{i_1+i_2}, for some i_2 > 1. Let µ(w) = (w_1, w_2, ..., w_{i_1-1}, w_{i_1} + w_{i_1+1} + ... + w_{i_1+i_2-1} + i_2 - 1, w_{i_1+i_2}, ..., w_{l(w)}).
For 1 ≤ j ≤ t - 2, consider the shorthand notation w = b(q)^j ∘ b(q)^{j+1} ∘ b(q)^{j+2}.
Define µ(w) as follows:
if T(y^j ∘ y^{j+1} ∘ y^{j+2}) = y^j ∘ y^{j+1} ∘ y^{j+2}, let µ(w) = w,
if there exists a unique i_2 > 1 for which there exists i such that T(y^j ∘ y^{j+1} ∘ y^{j+2})_{i+1} = (y^j ∘ y^{j+1} ∘ y^{j+2})_{i+i_2}, let i_1 be the minimum such i and let µ(w) = (w_1, w_2, ..., w_{i_1-1}, w_{i_1} + w_{i_1+1} + ... + w_{i_1+i_2-1} + i_2 - 1, w_{i_1+i_2}, ..., w_{l(w)}),
if there exist distinct i_2 > 1 and i_4 > 1 for which there exist i and i' such that T(y^j ∘ y^{j+1} ∘ y^{j+2})_{i+1} = (y^j ∘ y^{j+1} ∘ y^{j+2})_{i+i_2} and T(y^j ∘ y^{j+1} ∘ y^{j+2})_{i'+1} = (y^j ∘ y^{j+1} ∘ y^{j+2})_{i'+i_2+i_4}, let i_1, i_3 be the minimum such i, i', resp., and let µ(w) = (w_1, w_2, ..., w_{i_1-1}, w_{i_1} + w_{i_1+1} + ... + w_{i_1+i_2-1} + i_2 - 1, w_{i_1+i_2}, ..., w_{i_3+i_2-1}, w_{i_3+i_2} + w_{i_3+i_2+1} + ... + w_{i_3+i_2+i_4-1} + i_4 - 1, w_{i_3+i_2+i_4}, ..., w_{l(w)}).
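Under this reading of the helper, each collapse replaces a run of i_2 consecutive block counts by their sum plus i_2 - 1 (the boundary bags that get merged away). A hypothetical Python helper for one such collapse, with 1-indexed arguments matching the notation above (the name and indexing convention are ours):

```python
def merge_block(w, i1, i2):
    """Collapse the run of i2 entries starting at 1-indexed position i1
    into a single entry: their sum plus the i2 - 1 merged boundaries."""
    lo, hi = i1 - 1, i1 - 1 + i2          # convert to 0-indexed half-open range
    return w[:lo] + [sum(w[lo:hi]) + (i2 - 1)] + w[hi:]
```

For example, merging the run of length 2 starting at position 2 of (2, 0, 3, 1) yields (2, 4, 1): the counts 0 and 3 combine with the one absorbed boundary bag.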