A Note on Johnson, Minkoff and Phillips' Algorithm for the Prize-Collecting Steiner Tree Problem

Paulo Feofiloff, Cristina G. Fernandes, Carlos E. Ferreira, José Coelho de Pina

September 2014

Abstract

The primal-dual scheme has been used to provide approximation algorithms for many problems. Goemans and Williamson gave a $(2 - \frac{1}{n-1})$-approximation for the Prize-Collecting Steiner Tree Problem that runs in $O(n^3 \log n)$ time. Johnson, Minkoff and Phillips proposed a faster implementation of Goemans and Williamson's algorithm. We give a proof that the approximation ratio of this implementation is exactly 2 (in the worst case).

1 Introduction

Consider a graph $G = (V,E)$, a function $c$ from $E$ into the set $\mathbb{Q}_{\ge}$ of non-negative rationals and a function $\pi$ from $V$ into $\mathbb{Q}_{\ge}$. The Prize-Collecting Steiner Tree (PCST) Problem asks for a tree $T$ in $G$ such that

$$\sum_{e \in E_T} c_e + \sum_{v \in V \setminus V_T} \pi_v$$

is minimum. (We denote by $V_T$ and $E_T$, respectively, the vertex and edge sets of a subgraph $T$.) The rooted variant of the PCST problem requires $T$ to contain a given root vertex.

Goemans and Williamson [2, 3] used a primal-dual scheme to derive a $(2 - \frac{1}{n-1})$-approximation for the rooted variant of PCST, where $n := |V|$. By trying all possible choices for the root, they obtained a $(2 - \frac{1}{n-1})$-approximation for the unrooted PCST. The resulting algorithm runs in time $O(n^3 \log n)$.

Johnson, Minkoff and Phillips [4] proposed a modification of the algorithm that runs the primal-dual scheme only once, resulting in a running time of $O(n^2 \log n)$. They claimed that their algorithm, which we refer to as JMP, achieves an approximation ratio of $2 - \frac{1}{n-1}$. Unfortunately, their claim does not hold.

This note does two things. First, it proves that the JMP algorithm is a 2-approximation (the proof involves some non-trivial technical details). Second, it shows an example where the approximation ratio achieved by the algorithm is exactly 2, thus disproving the claimed $2 - \frac{1}{n-1}$ ratio.
This paper was originally published in 2006. A revised version, published in 2009 on arxiv.org, made explicit the statement that the algorithm is a Lagrangean-preserving 2-approximation. The present version corrects some minor mistakes and improves the presentation.

Departamento de Ciência da Computação, Instituto de Matemática e Estatística, Universidade de São Paulo, Rua do Matão 1010, São Paulo/SP, Brazil. {pf,cris,cef,coelho}@ime.usp.br. Research supported in part by PRONEX/CNPq 66407/997-4 (Brazil).
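To make the objective concrete, here is a minimal Python sketch that evaluates the cost of the tree edges plus the prizes of the vertices left outside the tree. The dictionary-based encoding, the function name `pcst_cost`, and the three-vertex path instance are illustrative assumptions and do not come from the paper.

```python
def pcst_cost(edge_cost, prize, tree_vertices, tree_edges):
    """Return c(T) + pi(V \\ V_T) for a candidate tree T."""
    # Cost paid for the edges that the tree uses.
    edge_part = sum(edge_cost[e] for e in tree_edges)
    # Prize forfeited for every vertex the tree does not span.
    penalty_part = sum(p for v, p in prize.items() if v not in tree_vertices)
    return edge_part + penalty_part


# Hypothetical instance: the path a - b - c.
edge_cost = {("a", "b"): 2, ("b", "c"): 3}
prize = {"a": 1, "b": 4, "c": 1}

# The single-vertex tree {b} pays no edge cost but forfeits the prizes of a and c.
cost_single = pcst_cost(edge_cost, prize, {"b"}, [])
# The full path pays both edges and forfeits nothing.
cost_full = pcst_cost(edge_cost, prize, {"a", "b", "c"}, list(edge_cost))
```

On this instance the single vertex is cheaper than the spanning path, which illustrates why a solution may leave vertices out.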

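2 Notation and preliminaries

For any subset $F$ of $E$, let $c(F) := \sum_{e \in F} c_e$. For any subset $X$ of $V$, let $\pi(X) := \sum_{v \in X} \pi_v$ and let $\overline{X} := V \setminus X$. If $T$ is a subgraph of $G$, we shall abuse notation and write $\pi(T)$ and $\pi(\overline{T})$ to mean $\pi(V_T)$ and $\pi(\overline{V_T})$ respectively. Similarly, we shall write $c(T)$ to mean $c(E_T)$. Hence, the goal of problem PCST$(G,c,\pi)$ is to find a tree $T$ in $G$ such that $c(T)+\pi(\overline{T})$ is minimum.

A collection $\mathcal{L}$ of nonempty subsets of $V$ is laminar if, for any two elements $L_1$ and $L_2$ of $\mathcal{L}$, either $L_1 \cap L_2 = \emptyset$ or $L_1 \subseteq L_2$ or $L_1 \supseteq L_2$. For any subset $X$ of $V$, let $\mathcal{L}[X] := \{L \in \mathcal{L} : L \subseteq X\}$ and $\mathcal{L}_X := \{L \in \mathcal{L} : L \supseteq X\}$. For every $L$ in $\mathcal{L}$ that is not in $\mathcal{L}[X] \cup \mathcal{L}[\overline{X}] \cup \mathcal{L}_X$, the sets $L \cap X$, $L \setminus X$ and $X \setminus L$ are all nonempty. For any subgraph $T$ of $G$, we shall abuse notation and write $\mathcal{L}[T]$, $\mathcal{L}[\overline{T}]$, and $\mathcal{L}_T$ in place of $\mathcal{L}[V_T]$, $\mathcal{L}[\overline{V_T}]$, and $\mathcal{L}_{V_T}$ respectively.

The union of all sets in $\mathcal{L}$ shall be denoted by $\bigcup\mathcal{L}$. The set of all maximal elements of $\mathcal{L}$ shall be denoted by $\mathcal{L}^*$. If $\mathcal{L}$ is laminar, the elements of $\mathcal{L}^*$ are pairwise disjoint. If, in addition, $\bigcup\mathcal{L} = V$ then $\mathcal{L}^*$ is a partition of $V$.

For any laminar collection $\mathcal{L}$ of subsets of $V$ and any edge $e$ of $G$, let $\mathcal{L}(e) := \{L \in \mathcal{L} : e \in \delta_G L\}$, where $\delta_G L$ stands for the set of edges of $G$ with one end in $L$ and the other in $\overline{L}$.

Let $y$ be a function from $\mathcal{L}$ into $\mathbb{Q}_{\ge}$. For any subcollection $\mathcal{L}'$ of $\mathcal{L}$, let $y(\mathcal{L}') := \sum_{L \in \mathcal{L}'} y_L$. We say that $y$ respects $c$ if

$$y(\mathcal{L}(e)) \le c_e \qquad (1)$$

for each $e$ in $E$. We say an edge $e$ is tight for $y$ if equality holds in (1). We say $y$ respects $\pi$ if

$$y(\mathcal{L}[X]) \le \pi(X) \qquad (2)$$

for each $X$ in $\mathcal{L}$. We say that $y$ saturates an element $X$ of $\mathcal{L}$ if equality holds in (2).

The following lemma summarizes the effect of the two "respects" constraints on $y$:

Lemma 2.1. Let $\mathcal{L}$ be a laminar collection of subsets of $V$ and $y$ a function from $\mathcal{L}$ into $\mathbb{Q}_{\ge}$. If $y$ respects $c$ and $\pi$ then $y(\mathcal{L} \setminus \mathcal{L}_T) \le c(T)+\pi(\overline{T})$ for any tree $T$ of $G$.

Proof. Collection $\mathcal{L}$ is the union of $\mathcal{L}_T$, $\mathcal{N}$, and $\mathcal{M}$, where $\mathcal{N}$ is $\mathcal{L}[\overline{T}]$ and $\mathcal{M}$ is the set of all $L$ in $\mathcal{L}$ such that both $L \cap V_T$ and $V_T \setminus L$ are nonempty. Hence, $y(\mathcal{L} \setminus \mathcal{L}_T) = y(\mathcal{N})+y(\mathcal{M})$.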
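Since $T$ is connected, $\delta_T L \neq \emptyset$ for each $L$ in $\mathcal{M}$, and therefore

$$y(\mathcal{M}) \le \sum_{L \in \mathcal{M}} |\delta_T L|\, y_L = \sum_{e \in E_T} y(\mathcal{L}(e)) \le \sum_{e \in E_T} c_e = c(T).$$

Next, observe that

$$y(\mathcal{N}) = \sum_{L \in \mathcal{N}^*} y(\mathcal{L}[L]) \le \sum_{L \in \mathcal{N}^*} \pi(L) \le \pi(\overline{T}),$$

where $\mathcal{N}^*$ is the set of maximal elements of $\mathcal{N}$. The lemma follows from these inequalities.

The JMP algorithm finds a tree $T$ such that $c(T)+2\pi(\overline{T}) \le 2\,y(\mathcal{L} \setminus \mathcal{L}_X)$ for any nonempty set $X$ of vertices. Such a tree 2-approximates a solution of our problem, as the following corollary of Lemma 2.1 shows.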
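As a sanity check on the lemma above, the following Python sketch verifies the two "respects" conditions and the resulting lower bound on a tiny instance. The instance, the choice of $y$, and all function names are hypothetical and chosen only for illustration; sets are encoded as `frozenset` objects.

```python
from itertools import combinations

def is_laminar(family):
    """Any two sets of the family are disjoint or nested."""
    return all(a.isdisjoint(b) or a <= b or b <= a
               for a, b in combinations(family, 2))

def load_on_edge(y, edge):
    """y(L(e)): sum of y_L over sets L containing exactly one end of e."""
    u, v = edge
    return sum(val for L, val in y.items() if (u in L) != (v in L))

def load_inside(y, X):
    """y(L[X]): sum of y_L over sets L contained in X."""
    return sum(val for L, val in y.items() if L <= X)

def respects_c(y, cost):
    return all(load_on_edge(y, e) <= c for e, c in cost.items())

def respects_pi(y, prize):
    return all(load_inside(y, L) <= sum(prize[v] for v in L) for L in y)

def dual_bound(y, tree_vertices):
    """y(L \\ L_T): sum of y_L over sets L that do not contain V_T."""
    return sum(val for L, val in y.items() if not tree_vertices <= L)

def objective(cost, prize, tree_vertices, tree_edges):
    """c(T) + pi(V \\ V_T)."""
    return (sum(cost[e] for e in tree_edges)
            + sum(p for v, p in prize.items() if v not in tree_vertices))

# Hypothetical instance: the path a - b - c with a hand-picked y.
cost = {("a", "b"): 2, ("b", "c"): 3}
prize = {"a": 1, "b": 4, "c": 1}
y = {frozenset("a"): 1, frozenset("b"): 1, frozenset("c"): 1,
     frozenset("ab"): 1}
trees = [({"b"}, []),
         ({"a", "b"}, [("a", "b")]),
         ({"a", "b", "c"}, [("a", "b"), ("b", "c")])]
# Pairs (lower bound, objective value), one per candidate tree.
bounds = [(dual_bound(y, vs), objective(cost, prize, vs, es))
          for vs, es in trees]
```

For each of the three trees the first entry of the pair is at most the second, as the lemma predicts; the check covers only this instance and is not a proof.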

Corollary 2.2. Let $\mathcal{L}$ be a laminar collection of subsets of $V$ and $y$ a function from $\mathcal{L}$ into $\mathbb{Q}_{\ge}$ that respects $c$ and $\pi$. For any tree $T$ in $G$, if $c(T)+2\pi(\overline{T}) \le 2\,y(\mathcal{L} \setminus \mathcal{L}_X)$ for every nonempty subset $X$ of $V$, then $c(T)+2\pi(\overline{T}) \le 2\,(c(O)+\pi(\overline{O}))$, where $O$ is an (optimal) solution of problem PCST$(G,c,\pi)$.

Proof. We may assume $V_O$ is nonempty, whence $c(T)+2\pi(\overline{T}) \le 2\,y(\mathcal{L} \setminus \mathcal{L}_O)$. By Lemma 2.1, $y(\mathcal{L} \setminus \mathcal{L}_O) \le c(O)+\pi(\overline{O})$. The claimed inequality follows.

3 Johnson, Minkoff and Phillips' algorithm

Before we state the JMP algorithm, a few more definitions are needed. Let $\mathcal{L}$ be a laminar collection of subsets of $V$ such that $\bigcup\mathcal{L} = V$. We say that an edge is internal to $\mathcal{L}^*$ if both of its ends are in the same element of $\mathcal{L}^*$. All other edges are external to $\mathcal{L}^*$. For any external edge, there are two elements of $\mathcal{L}^*$ containing its ends. We call these two elements the extremes of the edge in $\mathcal{L}^*$.

Given a forest $F$ in $G$ and a subset $L$ of $V$, we say that $F$ is $L$-connected if $V_F \cap L = \emptyset$ or the induced subgraph $F[V_F \cap L]$ is connected. In other words, $F$ is $L$-connected if the following property holds: for any two vertices $u$ and $v$ of $F$ in $L$, there exists a path from $u$ to $v$ in $F$ and that path never leaves $L$. If $F$ spans $G$ (as is the case during the first phase of the algorithm), the condition "$F[V_F \cap L]$ is connected" can, of course, be replaced by "$F[L]$ is connected". For any collection $\mathcal{L}$ of subsets of $V$, we shall say that $F$ is $\mathcal{L}$-connected if $F$ is $L$-connected for each $L$ in $\mathcal{L}$.

For any collection $\mathcal{S}$ of subsets of $V$, we say a tree $T$ has no bridge in $\mathcal{S}$ if $|\delta_T S| \neq 1$ (whence $\delta_T S = \emptyset$ or $|\delta_T S| \ge 2$) for all $S$ in $\mathcal{S}$. We say that a tree $T$ in $G$ is wrapped in $\mathcal{S}$ if $V_T \subseteq S$ for some $S$ in $\mathcal{S}$.

The JMP algorithm receives $G$, $c$, $\pi$ and returns a tree $T$ in $G$ such that $c(T)+2\pi(\overline{T}) \le 2\,(c(O)+\pi(\overline{O}))$, where $O$ is a solution of PCST$(G,c,\pi)$. (The factor 2 multiplying $\pi$ is a bonus; because of it, the JMP algorithm is said to be a Lagrangean-preserving 2-approximation [1].)
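The three structural definitions of this section (connectedness relative to a vertex set, "no bridge", and "wrapped") can be read as simple predicates. The sketch below is one possible Python rendering under assumed encodings: a forest is given by a vertex set and a list of edges, and all function names are hypothetical.

```python
def boundary_edges(edges, S):
    """delta_T S: edges with exactly one end in S."""
    return [(u, v) for u, v in edges if (u in S) != (v in S)]

def has_no_bridge(tree_edges, family):
    """True if no set of the family is crossed by exactly one tree edge."""
    return all(len(boundary_edges(tree_edges, S)) != 1 for S in family)

def is_wrapped(tree_vertices, family):
    """True if some set of the family contains every tree vertex."""
    return any(tree_vertices <= S for S in family)

def is_set_connected(forest_vertices, forest_edges, L):
    """True if V_F and L are disjoint or F[V_F & L] is connected."""
    inside = forest_vertices & L
    if not inside:
        return True
    # Adjacency restricted to the vertices of the forest that lie in L.
    adj = {v: set() for v in inside}
    for u, v in forest_edges:
        if u in inside and v in inside:
            adj[u].add(v)
            adj[v].add(u)
    # Depth-first search from an arbitrary vertex of the induced subgraph.
    start = next(iter(inside))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for w in adj[x]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == inside

# Hypothetical tree: the path a - b - c - d.
tree_vertices = {"a", "b", "c", "d"}
tree_edges = [("a", "b"), ("b", "c"), ("c", "d")]
```

For this path, the set {a, b} induces a connected piece while {a, c} does not; the set {a, b} is crossed by the single edge bc and is therefore a bridge, whereas {b, c} is crossed by two edges.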
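The algorithm has two phases, the second one operating on the output of the first:

Phase I: Each iteration in phase I starts with a spanning forest $F$ in $G$, a laminar collection $\mathcal{L}$ of subsets of $V$ such that $\bigcup\mathcal{L} = V$, a subcollection $\mathcal{S}$ of $\mathcal{L}$, and a function $y$ from $\mathcal{L}$ into $\mathbb{Q}_{\ge}$ such that the following invariants hold:

(i1) $F$ is $\mathcal{L}$-connected;
(i2) $y$ respects $c$ and $\pi$;
(i3) each edge of $F$ is tight for $y$;
(i4) $y$ saturates every element of $\mathcal{S}$;
(i5) no element of $\mathcal{L}^* \setminus \mathcal{S}$ is the union of elements of $\mathcal{S}$;
(i6) for any $\mathcal{L}$-connected tree $T$ in $G$ that has no bridge in $\mathcal{S}$ and is not wrapped in $\mathcal{S}$, $\sum_{e \in E_T} y(\mathcal{L}(e)) + 2\,y(\mathcal{L}[\overline{T}]) \le 2\,y(\mathcal{L} \setminus \mathcal{L}_X)$ for any nonempty subset $X$ of $V$.

The first iteration starts with $F = (V, \emptyset)$, $\mathcal{L} = \{\{v\} : v \in V\}$, $\mathcal{S} = \emptyset$, and $y = 0$. Each iteration consists of the following: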

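Case I.1: $|\mathcal{L}^* \setminus \mathcal{S}| > 1$.
For $\varepsilon$ in $\mathbb{Q}_{\ge}$, let $y^\varepsilon$ be the function defined as follows: $y^\varepsilon_L = y_L + \varepsilon$ if $L \in \mathcal{L}^* \setminus \mathcal{S}$ and $y^\varepsilon_L = y_L$ otherwise. Let $\varepsilon$ be the largest number in $\mathbb{Q}_{\ge}$ such that the function $y^\varepsilon$ respects $c$ and $\pi$.

Subcase I.1.A: $y^\varepsilon$ saturates some element $L$ of $\mathcal{L}^* \setminus \mathcal{S}$.
Start a new iteration with $\mathcal{S} \cup \{L\}$ and $y^\varepsilon$ in the roles of $\mathcal{S}$ and $y$ respectively. (The forest $F$ and the collection $\mathcal{L}$ do not change.)

Subcase I.1.B: some edge $e$ external to $\mathcal{L}^*$ is tight for $y^\varepsilon$ and has at least one of its extremes in $\mathcal{L}^* \setminus \mathcal{S}$.
Let $L_1$ and $L_2$ be the extremes of $e$ in $\mathcal{L}^*$. Set $y^\varepsilon_{L_1 \cup L_2} := 0$ and start a new iteration with $F + e$, $\mathcal{L} \cup \{L_1 \cup L_2\}$, and $y^\varepsilon$ in the roles of $F$, $\mathcal{L}$, and $y$ respectively. (The collection $\mathcal{S}$ does not change.)

Case I.2: $|\mathcal{L}^* \setminus \mathcal{S}| = 1$.
This is the end of phase I. Start phase II.

Phase II: During this phase, the collections $\mathcal{L}$ and $\mathcal{S}$ and the function $y$ remain unchanged. Let $M$ be the only element of $\mathcal{L}^* \setminus \mathcal{S}$. Each iteration begins with a subgraph $T$ of $F$ such that

(i7) $T$ is an $\mathcal{L}$-connected tree;
(i8) $M \setminus V_T$ admits a partition into elements of $\mathcal{S}$.

The first iteration begins with $T = F[M]$. Each iteration does the following:

Case II.1: $|\delta_T Z| = 1$ for some $Z$ in $\mathcal{S}$.
Start a new iteration with $T - Z$ in place of $T$.

Case II.2: $|\delta_T Z| \neq 1$ for each $Z$ in $\mathcal{S}$.
Return $T$ and stop.

4 Analysis of the algorithm

Suppose, for the moment, that invariants (i1) to (i8) are correct. At the end of phase II, $T$ is a tree by virtue of (i7). Since each edge of $T$ is tight (due to (i3)),

$$c(T) = \sum_{e \in E_T} c_e = \sum_{e \in E_T} y(\mathcal{L}(e)).$$

On the other hand, some subcollection $\mathcal{Z}$ of $\mathcal{S}$ partitions $\overline{V_T}$, because $\mathcal{L}^* \cap \mathcal{S}$ is a partition of $\overline{M}$ and, by (i8), there is a partition of $M \setminus V_T$ into elements of $\mathcal{S}$. Since $y$ saturates the elements of $\mathcal{Z}$ (due to (i4)),

$$\pi(\overline{T}) = \sum_{S \in \mathcal{Z}} \pi(S) = \sum_{S \in \mathcal{Z}} y(\mathcal{L}[S]) \le y(\mathcal{L}[\overline{T}]).$$

Now,

$$c(T)+2\pi(\overline{T}) \le \sum_{e \in E_T} y(\mathcal{L}(e)) + 2\,y(\mathcal{L}[\overline{T}]) \le 2\,y(\mathcal{L} \setminus \mathcal{L}_X) \qquad (3)$$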

for every nonempty subset $X$ of $V$, by virtue of invariant (i6). The applicability of (i6) is justified as follows. Tree $T$ is $\mathcal{L}$-connected due to (i7) and has no bridge in $\mathcal{S}$ since we are in Case II.2. Moreover, $T$ is not wrapped in $\mathcal{S}$ as this would imply, by (i8), that $M$ is the union of elements of $\mathcal{S}$, contradicting (i5).

Corollary 2.2 can now be applied to inequality (3) to show that $c(T)+2\pi(\overline{T}) \le 2\,(c(O)+\pi(\overline{O}))$ for any solution $O$ of PCST$(G,c,\pi)$. This proves the following corrected version of Johnson, Minkoff and Phillips' Theorem 3. [4]:

Theorem 4.1. The JMP algorithm is a 2-approximation for the PCST problem.

The example in Figure 1 shows that the approximation ratio of the JMP algorithm can be arbitrarily close to 2, regardless of the size of the graph. So, Theorem 4.1 is tight.

Figure 1: (a) An instance of the PCST. (b) The solution produced by the JMP algorithm when $\rho > 0$. Its cost is 4. (c) The optimal solution, consisting of vertex $u$ alone, has cost $2+\rho$. (d) A similar instance of arbitrary size consists of a long path.

5 Proofs of the invariants

Invariants (i1) to (i4) obviously hold at the beginning of each iteration of phase I. We must only verify the other four invariants. The verification of (i6) depends on the following lemma:

Lemma 5.1. Let $\mathcal{P}$ be a partition of $V$ and $(\mathcal{A},\mathcal{B})$ a bipartition of $\mathcal{P}$. Let $T$ be a tree in $G$. If $T$ is $\mathcal{P}$-connected, has no bridge in $\mathcal{B}$, and is not wrapped in $\mathcal{B}$, then

$$\sum_{A \in \mathcal{A}} |\delta_T A| + 2\,|\mathcal{A}[\overline{T}]| \le 2\,|\mathcal{A}| - 2.$$

Proof. Let us say that two elements of $\mathcal{P}$ are adjacent if some edge of $T$ has these two elements as extremes. This adjacency relation defines a graph $H$ having $\mathcal{P}$ as set of vertices. Since $T$ is $\mathcal{P}$-connected, the edges of $H$ are in one-to-one correspondence with the edges of $T$ external to $\mathcal{P}$. Hence, the degree of any vertex $P$ of $H$ is exactly $|\delta_T P|$, and therefore $\sum_{P \in \mathcal{P}} |\delta_T P| = 2\,|E_H|$. Since $T$ is connected, $H$ has $1 + |\mathcal{P}[\overline{T}]|$ components (all are singletons, except at most one). Since $T$ has no cycles and is $\mathcal{P}$-connected, $H$ is a forest.
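Hence, the number of edges of $H$ is the number of vertices minus the number of components: $|E_H| = |\mathcal{P}| - 1 - |\mathcal{P}[\overline{T}]|$. Therefore

$$\sum_{P \in \mathcal{P}} |\delta_T P| = 2\,|\mathcal{P}| - 2\,|\mathcal{P}[\overline{T}]| - 2. \qquad (4)$$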

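Now consider the vertices of $H$ that are in $\mathcal{B}$. Since $T$ has no bridge in $\mathcal{B}$ and is not wrapped in $\mathcal{B}$, each $B$ in $\mathcal{B}$ is such that either $|\delta_T B| \ge 2$ or $B \subseteq \overline{V_T}$. Hence $\sum_{B \in \mathcal{B}} |\delta_T B| \ge 2\,|\mathcal{B} \setminus \mathcal{B}[\overline{T}]|$, and therefore

$$\sum_{B \in \mathcal{B}} |\delta_T B| \ge 2\,|\mathcal{B}| - 2\,|\mathcal{B}[\overline{T}]|. \qquad (5)$$

The difference between (4) and (5) is

$$\sum_{A \in \mathcal{A}} |\delta_T A| + 2\,|\mathcal{A}[\overline{T}]| \le 2\,|\mathcal{A}| - 2,$$

from which the claimed inequality follows.

Proof of (i6). Clearly, (i6) holds at the beginning of the first iteration. Now assume that it holds at the beginning of some iteration where Case I.1 occurs.

Suppose, first, that Subcase I.1.A occurs. Let $\mathcal{S}' := \mathcal{S} \cup \{L\}$, let $X$ be a nonempty subset of $V$, and let $T$ be an $\mathcal{L}$-connected tree that has no bridge in $\mathcal{S}'$ and is not wrapped in $\mathcal{S}'$. Of course $T$ has no bridge in $\mathcal{S}$ and is not wrapped in $\mathcal{S}$, and so we may assume that

$$\sum_{e \in E_T} y(\mathcal{L}(e)) + 2\,y(\mathcal{L}[\overline{T}]) \le 2\,y(\mathcal{L} \setminus \mathcal{L}_X). \qquad (6)$$

We must show next that (6) holds when $y^\varepsilon$ is substituted for $y$. Lemma 5.1, with $\mathcal{P} := \mathcal{L}^*$, $\mathcal{A} := \mathcal{L}^* \setminus \mathcal{S}$, and $\mathcal{B} := \mathcal{L}^* \cap \mathcal{S}$, implies $\sum_{A \in \mathcal{A}} |\delta_T A| + 2\,|\mathcal{A}[\overline{T}]| \le 2\,|\mathcal{A}| - 2$. Since $|\mathcal{A}_X| \le 1$ (the elements of $\mathcal{A}$ are pairwise disjoint and $X$ is nonempty),

$$\sum_{A \in \mathcal{A}} |\delta_T A|\,\varepsilon + 2\,|\mathcal{A}[\overline{T}]|\,\varepsilon \le 2\,|\mathcal{A} \setminus \mathcal{A}_X|\,\varepsilon.$$

The addition of this inequality to (6) produces

$$\sum_{e \in E_T} y^\varepsilon(\mathcal{L}(e)) + 2\,y^\varepsilon(\mathcal{L}[\overline{T}]) \le 2\,y^\varepsilon(\mathcal{L} \setminus \mathcal{L}_X)$$

since $y^\varepsilon$ differs from $y$ only on $\mathcal{A}$. Hence, (i6) is true at the beginning of the next iteration.

Now suppose Subcase I.1.B occurs. Let $\mathcal{L}' := \mathcal{L} \cup \{L_1 \cup L_2\}$, let $X$ be any nonempty subset of $V$, and let $T$ be an $\mathcal{L}'$-connected tree that has no bridge in $\mathcal{S}$ and is not wrapped in $\mathcal{S}$. Of course $T$ is $\mathcal{L}$-connected, and so we may assume that

$$\sum_{e \in E_T} y(\mathcal{L}(e)) + 2\,y(\mathcal{L}[\overline{T}]) \le 2\,y(\mathcal{L} \setminus \mathcal{L}_X). \qquad (7)$$

We must show next that (7) is true when $y^\varepsilon$ and $\mathcal{L}'$ are substituted for $y$ and $\mathcal{L}$ respectively. Lemma 5.1, with $\mathcal{P} := \mathcal{L}^*$, $\mathcal{A} := \mathcal{L}^* \setminus \mathcal{S}$, and $\mathcal{B} := \mathcal{L}^* \cap \mathcal{S}$, implies $\sum_{A \in \mathcal{A}} |\delta_T A| + 2\,|\mathcal{A}[\overline{T}]| \le 2\,|\mathcal{A}| - 2$. Since $|\mathcal{A}_X| \le 1$,

$$\sum_{A \in \mathcal{A}} |\delta_T A|\,\varepsilon + 2\,|\mathcal{A}[\overline{T}]|\,\varepsilon \le 2\,|\mathcal{A} \setminus \mathcal{A}_X|\,\varepsilon.$$

The addition of this inequality to (7) produces

$$\sum_{e \in E_T} y^\varepsilon(\mathcal{L}'(e)) + 2\,y^\varepsilon(\mathcal{L}'[\overline{T}]) \le 2\,y^\varepsilon(\mathcal{L}' \setminus \mathcal{L}'_X)$$

since $y^\varepsilon$ differs from $y$ only in $\mathcal{A}$ and $y^\varepsilon_{L_1 \cup L_2} = 0$. Hence, (i6) is true at the beginning of the next iteration.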
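The counting lemma used in the proof of (i6) can be checked numerically on small instances. The Python sketch below computes both sides of its inequality for a hypothetical six-vertex example; the instance, the function names, and the decision to verify connectedness relative to the partition by hand are all assumptions made for illustration.

```python
def boundary_size(tree_edges, S):
    """|delta_T S|: number of tree edges with exactly one end in S."""
    return sum(1 for u, v in tree_edges if (u in S) != (v in S))

def hypotheses_hold(tree_vertices, tree_edges, B):
    """No bridge in B and not wrapped in B (P-connectedness is checked by hand)."""
    no_bridge = all(boundary_size(tree_edges, part) != 1 for part in B)
    not_wrapped = not any(tree_vertices <= part for part in B)
    return no_bridge and not_wrapped

def lemma_sides(tree_vertices, tree_edges, A):
    """Return (lhs, rhs) of: sum |delta_T A| + 2 |A[T-bar]| <= 2 |A| - 2."""
    # Parts of A that miss the tree entirely form A[T-bar].
    outside = sum(1 for part in A if part.isdisjoint(tree_vertices))
    lhs = sum(boundary_size(tree_edges, part) for part in A) + 2 * outside
    rhs = 2 * len(A) - 2
    return lhs, rhs

# Hypothetical instance on V = {1, ..., 6} with the tree 1 - 2 - 3 - 4.
tree_vertices = {1, 2, 3, 4}
tree_edges = [(1, 2), (2, 3), (3, 4)]
# Partition {1}, {2,3}, {4}, {5}, {6}; the part {2,3} induces the edge 2 - 3.
A = [{1}, {4}, {6}]
B = [{2, 3}, {5}]
ok = hypotheses_hold(tree_vertices, tree_edges, B)
lhs, rhs = lemma_sides(tree_vertices, tree_edges, A)

# Swapping the roles so that {1} lands in B creates a bridge; the bound may fail.
bad_A, bad_B = [{2, 3}], [{1}, {4}, {5}, {6}]
bad_ok = hypotheses_hold(tree_vertices, tree_edges, bad_B)
bad_lhs, bad_rhs = lemma_sides(tree_vertices, tree_edges, bad_A)
```

In the first bipartition the hypotheses hold and the inequality is satisfied with equality; in the second, the set {1} is a bridge and the inequality fails, which suggests the "no bridge" hypothesis cannot simply be dropped.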

Proof of (i5). Obviously (i5) holds at the beginning of the first iteration. Now consider an iteration where Case I.1 occurs. If Subcase I.1.A occurs, then (i5) remains trivially true at the beginning of the next iteration. Suppose now that Subcase I.1.B occurs. Adjust notation so that $L_1 \notin \mathcal{S}$. Since (i5) holds at the beginning of the current iteration, $L_1$ is not the union of elements of $\mathcal{S}$. Hence, $L_1 \cup L_2$ is not the union of elements of $\mathcal{S}$. Therefore, (i5) is true at the beginning of the next iteration.

Proof of (i7). Suppose we are at the beginning of the first iteration of phase II. Let $L$ be an element of $\mathcal{L}$ such that $L \cap V_T \neq \emptyset$. Since $V_T = M \in \mathcal{L}^*$, we have $L \subseteq V_T$ and therefore $T[V_T \cap L] = T[L] = F[L]$. Since $F[L]$ is connected by virtue of (i1), so is $T[V_T \cap L]$. This argument shows that $T$ is $\mathcal{L}$-connected. In particular, $T$ is $M$-connected and therefore $T$ is a tree. Hence, (i7) holds at the beginning of the first iteration.

Now suppose (i7) holds at the beginning of some iteration where Case II.1 occurs. Let $L$ be an element of $\mathcal{L}$ and let $u$ and $v$ be vertices in $L \cap (V_T \setminus Z)$. Let $P$ be the unique path from $u$ to $v$ in $T$. We may assume that $P$ never leaves $L$. Moreover, $P$ never enters $Z$, given that $|\delta_T Z| = 1$. Hence, $T - Z$ is $\mathcal{L}$-connected. For the same reason, $T - Z$ is a tree. Hence (i7) holds at the beginning of the next iteration.

Proof of (i8). At the beginning of the first iteration of phase II, (i8) holds because $V_T = M$. Now consider an iteration where Case II.1 occurs. We may assume that there is a partition $\mathcal{U}$ of $M \setminus V_T$ into elements of $\mathcal{S}$. If $Z \subseteq V_T$ then $\mathcal{U} \cup \{Z\}$ is a partition of $M \setminus (V_T \setminus Z)$ into elements of $\mathcal{S}$. Otherwise, $Z$ includes some of the elements of $\mathcal{U}$ and is disjoint from all the others. Hence, $\{Z\} \cup \{U \in \mathcal{U} : U \cap Z = \emptyset\}$ is a partition of $M \setminus (V_T \setminus Z)$ into elements of $\mathcal{S}$. This shows that (i8) holds at the beginning of the next iteration.

References

[1] A. Archer, M. Bateni, M. Hajiaghayi, and H. Karloff. Improved approximation algorithms for Prize-Collecting Steiner Tree and TSP. In Proceedings of the 50th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2009.

[2] M.X. Goemans and D.P. Williamson. A general approximation technique for constrained forest problems. SIAM Journal on Computing, 24(2):296–317, 1995.

[3] D.S. Hochbaum, editor. Approximation Algorithms for NP-Hard Problems. PWS Publishing, 1997.

[4] D.S. Johnson, M. Minkoff, and S. Phillips. The prize collecting Steiner tree problem: theory and practice. In Symposium on Discrete Algorithms, pages ,

