Worst Case Performance of Approximation Algorithm for Asymmetric TSP

Anna Palbom
Department of Numerical Analysis and Computer Science
Royal Institute of Technology
S-100 44 Stockholm, Sweden
E-mail: annap@nada.kth.se

August 19, 2003

Abstract

In 1982 Frieze, Galbiati and Maffioli [3] invented their famous algorithm for approximating the TSP tour in an asymmetric graph. They show that the algorithm approximates the TSP tour within a factor of $\log n$. We construct a family of graphs for which the algorithm (with some implementation details specified by us) gives an approximation in $\Omega(\log n)$. This shows that the analysis by Frieze et al. is tight up to a constant factor and can hopefully give a deeper understanding of the problem.

1 Introduction

The Travelling Salesman Problem (TSP) is one of the most famous and well-studied NP-problems.

Definition 1. The Asymmetric Travelling Salesman Problem (Asymmetric TSP) is the following minimisation problem: Given a collection of cities and a matrix whose entries are interpreted as the non-negative distance from one city to another, find the shortest tour starting and ending in the same city and visiting every city exactly once.

TSP was proven to be NP-hard already by Karp [4] in 1972. This means that an efficient algorithm for TSP is highly unlikely; hence it is interesting to investigate algorithms that compute approximate solutions. However, Sahni and Gonzalez [7] showed that in the case of general distance functions it is NP-hard to find a tour with length within exponential factors of the optimum.

Definition 2. Let $P$ be an NP minimisation problem. For an instance $x$ of $P$ let $\mathrm{opt}(x)$ be the optimal value. A solution $y$ with weight $w(y)$ is $c$-approximate if it is feasible and $w(y)/\mathrm{opt}(x) \le c$.

When the distance function is symmetric and constrained to satisfy the triangle inequality there exists a folklore 2-approximation algorithm, i.e., an algorithm that finds a tour no longer than two times the length of the optimal tour.
For this case the best known approximation algorithm is a factor 3/2-approximation algorithm due to Christofides [2]. The above algorithms apply only to symmetric distance functions. The asymmetric case is much less understood. In 1982 Frieze, Galbiati and Maffioli [3] invented their famous algorithm for asymmetric graphs, which approximates the TSP tour within a factor of $\log n$. There is only a minuscule lower bound: Papadimitriou and Vempala [6] recently proved that it is NP-hard to approximate the minimum TSP tour within a factor less than $220/219 - \varepsilon$, for any constant $\varepsilon > 0$. Obviously there is room for huge improvements, either through a better algorithm or through a higher lower bound.

Despite a lot of research effort during the last twenty years, the only algorithmic improvement is by Bläser in 2003 [1]. He improves the algorithm by Frieze et al. and proves a factor $0.999 \log n$. Hence, any new insight regarding the asymmetric TSP is important. One way to achieve such insight is to identify potential hard instances for the known approximation algorithms. The algorithm by Bläser is more complicated than the original algorithm due to Frieze et al. and is hence more difficult to understand. Therefore, we study the original algorithm in this paper. By constructing an explicit family of graphs we establish that the analysis of the algorithm is tight up to a constant factor (Theorem 1). We apply the algorithm by Bläser on the graphs and see that, with certain assumptions, it gives the same approximation as the original algorithm.

The main idea of the algorithm by Frieze et al. is: Find a minimum cycle cover in the graph by linear programming. Choose one node in every cycle and form a subgraph with the same distance function as in the original graph. Find a minimum cycle cover in the subgraph. Continue recursively until there is only one cycle in the cycle cover. The union of the cycle covers forms a strongly connected graph. We can replace edges in the union with a shortcut edge in the original complete graph to obtain a TSP tour. Since the graph respects the triangle inequality the TSP tour has weight less than or equal to the sum of the cycle covers.

The description of the algorithm by Frieze et al. [3] leaves some implementation details unspecified.
The algorithm chooses one arbitrary node in every cycle to be in the subgraph, and shortcuts are made in arbitrary order. In our analysis of the worst case performance we make the following assumptions:

1. The first node in every cycle is chosen for the subgraph.
2. The shortcuts are made in a certain specified order.

We show that the analysis of the algorithm by Frieze et al. is tight and our main result is:

Theorem 1. With the above assumptions the ATSP approximation algorithm of Frieze et al. [3] approximates the optimum within $\Omega(\log n)$ for certain classes of graphs.

We have constructed four classes of graphs which give the algorithm by Frieze et al. a worst case performance. For simplicity, graphs in the first class are symmetric and can thus be 3/2-approximated with Christofides' algorithm [2]. For this class we also, in addition to the above assumptions, have to assume that a cycle cover with small cycles is preferred by the ATSP algorithm. But the class makes the notation clear and we use it to explain the algorithms and the ideas. A slight modification gives another class of symmetric graphs where the minimal cycle cover is unique. Graphs in the third class are asymmetric and again the algorithm is assumed to prefer small cycles in the cycle cover. A combination of the second and third class gives the last class. Here the graphs are asymmetric and the minimal cycle cover is unique. The fourth class gives Theorem 1.

For asymmetric graphs with ratio $\alpha$ between edges in opposite directions, Frieze et al. [3] give another, data dependent, algorithm which gives a $3\alpha/2$-approximation of the TSP tour:

Definition 3. For a complete asymmetric graph the maximum ratio between edges in opposite directions is denoted by $\alpha$, where

$$\alpha = \max\left\{ \frac{d(y, x)}{d(x, y)} : x, y \in V,\ x \neq y \right\}.$$

The idea of the algorithm is to make the graph symmetric and then use the algorithm for symmetric graphs by Christofides [2]. We will not show a worst case behaviour of this algorithm; we just make sure that the class of graphs which has a worst case behaviour of their ATSP algorithm is not guaranteed to be solved by the data dependent algorithm.

2 Notations and conventions

All graphs in this paper have $n = 2^m$ nodes placed in a circle as in Figure 1. When an algorithm operates on an arbitrary node the ordering is modulo $2^m$. For example, the node before $v_0$ is $v_{2^m-1}$. The difference in index for two nodes $v_i$ and $v_j$ is $\min\{|i-j|,\ n-|i-j|\}$.

Definition 4. For an $m$-bit integer $x$ the function $z_m(x) = \max\{k \in \mathbb{Z}_m : 2^k \text{ divides } x\}$.

With this definition $z_m(0) = m-1$ since all numbers divide zero. The function is useful to describe sets; for example, all elements of $\{2, 6, 10, 14, \ldots\}$ have $z_m(x) = 1$.

2.1 Constructing a distance function

For a strongly connected graph $G = (V, E)$, let the distance between two nodes, $d(v_i, v_j)$, be the weight of the shortest path in $G$ from $v_i$ to $v_j$. The distance function obeys the triangle inequality, since the shortest path from $v_i$ to $v_j$ is no longer than any path via a given node $v_k$, so $d(v_i, v_j) \le d(v_i, v_k) + d(v_k, v_j)$.

2.2 Some subgraphs

A cycle cover is a set of node disjoint cycles that covers all nodes.

Definition 5. A cycle cover for a directed graph $G = (V, E)$ is a subgraph of $G$ such that for each node $v \in V$, $\mathrm{indegree}(v) = \mathrm{outdegree}(v) = 1$.

Definition 6. A cycle cover where every cycle has exactly two nodes is called a 2-cover.

The approximation algorithm produces several cycle covers which we add together. All these cycle covers form a spanning cactus.

Definition 7. [8] A strongly connected, asymmetric graph where each edge is contained in at most (and thus, in exactly) one simple directed cycle is called a directed cactus.

Definition 8. A spanning cactus for an asymmetric graph $G$ is a subgraph of $G$ that is a directed cactus and connects all vertices in $G$.

Notations: Throughout the paper $T$ is a TSP tour and $\mathrm{opt}(G)$ is the minimum TSP tour in the graph $G$. $C$ is a cycle cover or a cactus. A cycle is denoted by the nodes in it; for example $(v_i, v_j, v_k)$ is the directed cycle from $v_i$ to $v_j$ to $v_k$ and back to $v_i$.
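The two index functions of this section can be sketched in a few lines of Python. This is only an illustration: the names `z` and `index_difference` are hypothetical, and `z` takes $m$ as an explicit second argument instead of a subscript.

```python
def z(x: int, m: int) -> int:
    """Largest k in {0, ..., m-1} such that 2**k divides x (Definition 4)."""
    k = 0
    # Every power of two divides 0, so for x = 0 the loop stops at k = m - 1.
    while k < m - 1 and x % 2 ** (k + 1) == 0:
        k += 1
    return k


def index_difference(i: int, j: int, n: int) -> int:
    """Difference in index of v_i and v_j on a circle of n nodes."""
    diff = abs(i - j) % n
    return min(diff, n - diff)


print([z(x, 4) for x in (2, 6, 10, 14)])  # all equal to 1
print(z(0, 4))                            # m - 1 = 3
print(index_difference(0, 15, 16))        # neighbours on the circle: 1
```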

[Figure 1: The graph $G^{D1}_{16}$ which induces the first distance function, $D^1_{16}$. The 16 nodes $v_0, \ldots, v_{15}$ are placed in a circle. All edges have weight one.]

3 The approximation algorithm

The first section is mainly the description of the algorithm given by Frieze et al. [3]. The description leaves some implementation details unspecified, which we specify in the second section. An intuitive description of the algorithm is given in the introduction. The main algorithm is ATSP; it calls the procedure ASSIGN, which returns a minimum cycle cover, and the procedure TOUR, which makes the shortcuts.

3.1 Original description

Procedure ASSIGN(G, D)
Input:  A graph G = (V, E)
        A cost function D : E → Q+
Output: A cycle cover S ⊆ E
Finds a set S ⊆ E of minimum cost such that every node in V has in- and out-degree equal to one.
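To make the specification of ASSIGN concrete, the following sketch finds a minimum cycle cover by brute force over all permutations without fixed points. This is an assumption made purely for illustration: exhaustive search is not how the original algorithm computes the cover, and it is usable only on very small graphs. The names `assign_bruteforce` and `cycles_of` are hypothetical, and the example cost matrix uses the difference in index from Section 2.

```python
from itertools import permutations


def assign_bruteforce(dist):
    """Return (cost, succ) of a minimum-cost cycle cover.

    dist[i][j] is the cost of edge (i, j); succ[i] is the node that follows
    node i in its cycle, so every node has in- and out-degree one.
    """
    n = len(dist)
    best_cost, best_succ = None, None
    for succ in permutations(range(n)):
        if any(succ[i] == i for i in range(n)):
            continue  # a node may not be its own successor
        cost = sum(dist[i][succ[i]] for i in range(n))
        if best_cost is None or cost < best_cost:
            best_cost, best_succ = cost, succ
    return best_cost, best_succ


def cycles_of(succ):
    """Split a successor map into its node-disjoint cycles."""
    seen, cycles = set(), []
    for start in range(len(succ)):
        if start in seen:
            continue
        cycle, v = [], start
        while v not in seen:
            seen.add(v)
            cycle.append(v)
            v = succ[v]
        cycles.append(tuple(cycle))
    return cycles


n = 8
dist = [[min(abs(i - j), n - abs(i - j)) for j in range(n)] for i in range(n)]
cost, succ = assign_bruteforce(dist)
print(cost, cycles_of(succ))
```

Several covers share the minimum cost on this instance, so which one is printed depends on the enumeration order.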

For the next procedure we need some notation: In a spanning cactus $C$ the following holds for every node $v \in V$:

1. $\mathrm{indegree}(v) = \mathrm{outdegree}(v) = d(v)$.
2. The deletion of node $v$ from $C$ leaves $d(v)$ connected components.

Observe that these properties imply that for each connected component $C_i$ obtained by deleting $v$ there are nodes $u_i, w_i \in C_i$ such that $(u_i, v)$ and $(v, w_i)$ are in $C$.

Procedure TOUR(G, C)
Input:  A graph G = (V, E)
        A spanning cactus C ⊆ E
Output: A TSP tour C
begin
  while there exists a node v ∈ V with d(v) > 1 do
  begin
    C ← C ∪ (u_1, w_2);
    C ← C \ {(u_1, v), (v, w_2)};
  end
end

Procedure ATSP(G_0, D_0)
Input:  A graph G_0 = (V, E)
        A cost function D_0 : E → Q+
Output: A TSP tour T
begin
 1  C ← ∅;
 2  D ← D_0;
 3  G ← G_0;
 4  k ← 2;
 5  while k ≠ 1 do
 6  begin
 7    {P_1, P_2, ..., P_h} ← ASSIGN(G, D);
 8    V' ← ∅;
 9    for i = 1 until h do
10    begin
11      choose a node v_i of P_i;
12      V' ← V' ∪ {v_i};
13      C ← C ∪ {P_i};
14    end
15    Let G be the complete subgraph of G_0 induced by V' and D the induced cost matrix of G;
16    k ← h;
17  end
18  T ← TOUR(G_0, C);
19  return T;
end

To get an upper bound for the algorithm, Frieze et al. make the following analysis: In the worst case all cycles $\{P_1, P_2, \ldots, P_h\}$ in the cycle cover have length two. Then ASSIGN is called $\log n$ times. The weight of every cycle cover is less than or equal to $\mathrm{opt}(G)$. Thus the spanning cactus $C$ has weight at most $\mathrm{opt}(G) \log n$. Since the graph obeys the triangle inequality, the tour $T$ found in TOUR is shorter than or equal to $\mathrm{opt}(G) \log n$. The approximation is then $O(\log n)$.

3.2 Our specification

In order to analyse the algorithm we need to specify the arbitrary choices in TOUR and ASSIGN:

1. In the procedure ATSP at line 11, a node from every cycle is chosen to be in the subgraph. Here we choose the node with lowest index in every cycle.
2. The shortcuts made by the procedure TOUR are in arbitrary order. We replace the procedure with our procedure SHORTCUT, which is simply TOUR but with a specified order.

The union of all cycle covers forms a spanning cactus $C$. SHORTCUT makes a depth-first search in $C$ starting at the node $v_0$ and connects the nodes in the order they are found. If there are several edges from a node, edges to nodes with larger index are traversed first.

Procedure SHORTCUT(G, C, s)
Input:  A graph G = (V, E)
        A cactus C ⊆ E
        A start node s
Output: A TSP tour T
begin
  global T ← ∅;
  global set of visited nodes U ← ∅;
  t ← DEPTH-FIRST(G, C, s);
  T ← T ∪ (t, s);
  return T;
end

Procedure DEPTH-FIRST(G, C, s)
Input:  A graph G = (V, E)
        A cactus C ⊆ E
        The present node s
Output: The last node, t, in the TSP tour T

begin
  t ← s;
  U ← U ∪ {s};
  for each v ∈ V such that ((s, v) ∈ C and v ∉ U) do
  (take edges in decreasing order with respect to index of the node v)
  begin
    T ← T ∪ (t, v);
    t ← DEPTH-FIRST(G, C, v);
  end
  return t;
end

In the procedure ASSIGN any minimum cycle cover may be returned, depending on the linear program. For our first simple example the procedure is assumed to choose 2-covers even though there are other, larger cycles of equal weight. This is not an intuitive choice but it gives results comparable with the other classes of graphs. For the last family of graphs the 2-cover is the unique minimal cycle cover.

Lemma 1. The tour produced by the procedure SHORTCUT is a TSP tour.

Proof. Since the spanning cactus $C$ is strongly connected, a depth-first search will visit all nodes and all nodes will be in the tour. We only add edges to nodes that are not already visited, which makes the tour loop free. In the last step the procedure returns to the start node and returns a TSP tour.

If all cycles in the spanning cactus contain exactly two nodes, then every cycle can be seen as a symmetric edge and the cactus is a spanning tree.

Lemma 2. The TSP tour produced by SHORTCUT on a spanning cactus $C$, with exactly two nodes in every cycle, can be produced by TOUR on the same graph.

Proof. Let $C$ be the initial cactus, $C_T$ be the set of edges modified by TOUR, and $T$ the tour produced by SHORTCUT. The last node in the TSP tour produced by SHORTCUT is denoted by $t$. Initially $C_T = C$ and $T = \emptyset$, and at the end we want $C_T = T$. In every step we want $T \subseteq C_T$ and the edges not in $T$, traversed by SHORTCUT before visiting the node $t$, to be removed from $C_T$. This is trivially true when the algorithms start. We simulate one recursion in SHORTCUT and then show that TOUR (with a certain choice of nodes) can ensure that $T \subseteq C_T$ and that the right edges are removed from $C_T$. In one step in SHORTCUT there are three different possibilities:

1. $s = t = v_i$, $(v_i, v_j), (v_i, v_k) \in C$ and $v_j$ is a leaf. SHORTCUT first adds $(v_i, v_j)$ to $T$ and sets $t \leftarrow v_j$, then adds $(v_j, v_k)$ to $T$ and sets $t \leftarrow v_k$. TOUR can, since $d(v_i) \ge 2$, modify $C_T \leftarrow C_T \cup (v_j, v_k)$ and $C_T \leftarrow C_T \setminus \{(v_j, v_i), (v_i, v_k)\}$.
2. $s = t = v_i$, $(v_i, v_j), (v_j, v_k) \in C$ and $v_j$ is an internal node. Then SHORTCUT adds $(v_j, v_k)$ to $T$ and sets $t \leftarrow v_k$. TOUR can, since $d(v_j) \ge 2$, modify $C_T \leftarrow C_T \cup (v_k, v_i)$ and $C_T \leftarrow C_T \setminus \{(v_k, v_j), (v_j, v_i)\}$.
3. $t \neq s = v_i$ and $(v_i, v_j) \in C$. Then SHORTCUT adds $(t, v_j)$ to $T$ and sets $t \leftarrow v_j$. In TOUR there are edges $e_1, e_2, \ldots, e_i \in C_T$ which form a path from $t$ to $v_i$. Remove them stepwise and replace them with $(t, v_i)$.

In each case $T \subseteq C_T$ and the edges traversed by SHORTCUT, before $t$, which are not in $T$ are removed from $C_T$ by TOUR. In the last step SHORTCUT returns to the root $v_r$ and the edge $(t, v_r)$ is added to $T$. In TOUR there are edges $e_1, e_2, \ldots, e_i \in C_T$ which form a path from $t$ to $v_r$. Remove them stepwise and replace them with $(t, v_r)$. Now $T = C_T$ and the tour produced by SHORTCUT can be produced by TOUR.

4 A worst case approximation

We construct a simple family of graphs and show a worst case performance of the approximation algorithm by Frieze et al. The graphs are symmetric and obey the triangle inequality. Since the graphs are symmetric they can be approximated within 3/2 by the algorithm due to Christofides [2]. But since the family gives a simple notation we use it to describe the algorithm and the main ideas.

4.1 Constructing the graph

The distance function is induced by a graph (Figure 1) with the following definition:

Definition 9. The first distance function, $D^1_n$, is induced by a graph $G^{D1}_n$ with $n = 2^m$ nodes arranged in a circle. Neighbour nodes are connected by edges of weight one, $d(v_{i-1}, v_i) = d(v_i, v_{i-1}) = 1$.

Definition 10. A graph $G^1_n$ is a complete, asymmetric graph with $n = 2^m$ nodes ordered as in $G^{D1}_n$ and with the distance function $D^1_n$.

In $D^1_n$ the distance between two nodes is the difference in index: $d(v_i, v_j) = \min\{|i-j|,\ n-|i-j|\}$. More intuitively, the distance between two neighbours is one, edges which jump two steps have length two, and so on. The edges are directed even though they have the same value in both directions: $(x, y) \neq (y, x)$ but $d(x, y) = d(y, x)$. The minimum tour, $\mathrm{opt}(G^1_n)$, is of course to traverse the nodes in clockwise order (or counter-clockwise) and has length $n$.

4.2 The spanning cactus

The algorithm by Frieze et al. recursively finds a minimum cycle cover in the graph.

Lemma 3. In a complete, asymmetric graph, the union of all cycle covers recursively produced by the algorithm ATSP forms a spanning cactus.

Proof. Every node is in the first cycle cover and hence all nodes are in the union. From the last cycle there is a path to and from every node in the graph. Hence the graph is strongly connected. By the construction every edge is in exactly one cycle and the union is a spanning cactus.

To get an intuitive understanding of the algorithm ATSP and the worst case behaviour of $G^1_n$ we use a graph with $n = 2^4 = 16$ nodes as an example (Figure 2). In the first call to ASSIGN (line 7) the minimum cycle cover can consist of one large cycle or eight cycles of length two. Both have weight 16. Assume that the 2-cover is chosen. Then (in line 13) $C = \{(v_0, v_1), (v_2, v_3), (v_4, v_5), (v_6, v_7), (v_8, v_9), (v_{10}, v_{11}), (v_{12}, v_{13}), (v_{14}, v_{15})\}$, and we choose the first node in every cycle to be in $V'$ (line 12) for the next call to ASSIGN. In our example $V' = \{v_0, v_2, v_4, v_6, v_8, v_{10}, v_{12}, v_{14}\}$. Now the shortest length between any nodes in $G$ (line 15) is two. Again ASSIGN can return one large cycle or four cycles of length two. Both have weight 16 and we assume that the 2-cover is returned. Proceed in the same way until there is just one cycle in the cycle cover.

[Figure 2: The Worst Case Spanning Cactus (W-cactus) in the graph $G^1_{16}$. Each cycle is symbolised with a line.]

Definition 11. In the procedure ATSP and a graph $G_{n,0}$ with $n$ nodes, the subgraph after the first call to ASSIGN is denoted by $G_{n,1}$, and the subgraph after the $i$:th call is denoted by $G_{n,i}$.

In an asymmetric graph $G_{n,0}$ with $n = 2^m$ nodes ordered in a circle we produce cycle covers with a certain structure in the following way:

1. Make a cycle cover by connecting every pair of nodes $v_i$ and $v_{i-2^j}$ in $G_{n,j}$ such that $z(i) = j$.
2. Select the first node in every cycle to the subgraph $G_{n,j+1}$.
3. Repeat from 1. until there is only one node in $G_{n,m}$.

Definition 12. If the edges in every 2-cycle above have equal length, the union of the cycle covers is called a Worst Case Spanning Cactus or a W-cactus.

For $n = 16$ nodes the W-cactus looks like in Figure 2.

Lemma 4. In $G^1_n$ the spanning cactus produced by the algorithm ATSP, with our specification and the assumption that small cycles are preferred in the cycle cover, is a W-cactus with weight $n \log n$.

Proof. At first there are $n$ nodes. The smallest length of an edge is $d(v_i, v_{i-1}) = 2^0 = 1$. If we connect every pair of nodes $(v_i, v_{i-1})$ with odd index $i$ ($z(i) = 0$), the 2-cover has weight $n$, which is minimal since every cycle cover consists of $n$ edges with length at least one. Choose the first node in every cycle (nodes with even index and $z(i) \ge 1$) to be in $G^1_{n,1}$. Then $G^1_{n,1}$ has $n/2$ nodes and the shortest length is $d(v_i, v_{i-2}) = 2^1$. The 2-cover of nearest neighbours is again minimal and has weight $n$. If we always choose the first node in every cycle in $G^1_{n,i}$ to be in $G^1_{n,i+1}$, the cycle cover in call $k$ to ASSIGN has $n/2^{k-1}$ nodes and the smallest length is $d(v_i, v_{i-2^{k-1}}) = 2^{k-1}$. The 2-cover is minimal in every recursion and has weight $n$. The graph $G^1_n$ is symmetric and the spanning cactus produced is a W-cactus. The recursion is repeated $\log n$ times, which gives a total weight of $n \log n$.

The cycle connecting every node in clockwise order has weight $n$ and thus the 2-cover is a minimal but not a unique solution. We can make several observations about the W-cactus:

Lemma 5. Edges in a W-cactus are of the form $(v_i, v_{i-2^k})$ or $(v_{i-2^k}, v_i)$ where $z_m(i) = k$.

Proof. The first cycle cover connects neighbour nodes and produces edges with $k = 0$, that is, $(v_i, v_{i-1})$ and $(v_{i-1}, v_i)$ where $z_m(i) = k = 0$. We choose the first node in every cycle, and $G^1_{n,1}$ consists of nodes with even index ($z_m(i) \ge 1$). The next 2-cover gives edges for $k = 1$. At the $j$:th recursion, edges with $k = j - 1$ are produced. In the last recursion there is one cycle, $(v_{2^{m-1}}, v_0)$ and $(v_0, v_{2^{m-1}})$, which corresponds to $k = m - 1$.

Lemma 6. If a cycle $(v_{i-2^k}, v_i)$ is in the W-cactus, then there is no other cycle $(v_s, v_t)$ in the W-cactus with $\min\{|s-t|,\ n-|s-t|\} > 2^k$ and $s, t \in [i - 2^k + 1,\ i + 2^k - 1]$.

Proof. If $k = m - 1$ there is no 2-cycle with longer edges. Hence assume that $k < m - 1$. By Lemma 5, $z_m(i) = k$ and no larger cycle can start at $v_i$. The nearest node in a larger cycle in the W-cactus is $v_{i \pm 2^k}$, since $z_m(i \pm 2^k) \ge k + 1$. Hence any node in a longer 2-cycle is in $[i + 2^k,\ i - 2^k]$, which is the complement of $[i - 2^k + 1,\ i + 2^k - 1]$.

Since every cycle in the cycle cover is a 2-cycle of edges with equal weight, each cycle can be treated as an undirected edge. The union of the cycle covers can with this view be seen as a spanning, undirected tree. Since every node is in at least one cycle the tree is spanning, and by the construction the cover is cycle-free.

Lemma 7. A W-cactus has $n - 1$ cycles.

Proof. A spanning tree with $n$ nodes trivially has $n - 1$ edges.
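The construction in Definition 11 and the counts in Lemmas 4 and 7 can be checked on small instances with the sketch below. The helper names `w_cactus` and `d1` are hypothetical; each 2-cycle is listed once as a pair of node indices together with the recursion level that created it.

```python
def w_cactus(m: int) -> list[tuple[int, int, int]]:
    """Return the 2-cycles of the W-cactus on n = 2**m nodes.

    A cycle (a, b, j) joins v_a and v_b = v_{a + 2**j}; it is produced at
    recursion level j.
    """
    n = 2 ** m
    cycles = []
    for j in range(m):
        step = 2 ** j
        # The nodes left at level j are the multiples of 2**j.  A node v_i
        # with z(i) = j (an odd multiple of 2**j) is paired with v_{i - 2**j}.
        for i in range(step, n, 2 * step):
            cycles.append((i - step, i, j))
    return cycles


def d1(i: int, j: int, n: int) -> int:
    """First distance function D^1_n: the difference in index."""
    diff = abs(i - j)
    return min(diff, n - diff)


m = 4
n = 2 ** m
cactus = w_cactus(m)
# Each 2-cycle consists of two directed edges of equal length.
weight = sum(2 * d1(a, b, n) for a, b, _ in cactus)
print(len(cactus), weight)  # n - 1 = 15 cycles, weight n log n = 64
```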

[Figure 3: The TSP tour after some steps with the procedure SHORTCUT on a W-cactus with $n = 2^4 = 16$ nodes. Straight lines represent 2-cycles in the W-cactus and bold arrows represent edges in the TSP tour.]

4.3 The TSP tour

In the graph $G^1_n$ the ATSP algorithm, in the worst case, produces a W-cactus of weight $n \log n$. The goal is to show that the TSP tour produced by the algorithm has weight larger than half the weight of the W-cactus. The TSP tour, $T$, consists of $n$ edges, which by Lemma 7 is one more than the number of cycles in the W-cactus. If for every cycle in the W-cactus there is one distinct, associated edge in the TSP tour with length at least as large as the edges in the cycle, we are done.

We use the same example of a W-cactus with $n = 16$ nodes to describe the procedure SHORTCUT. Initially the TSP tour $T = \emptyset$ and the start node $s = t = v_0$. After the first step in SHORTCUT, $T = \{(v_0, v_8)\}$. After some steps the graph looks like Figure 3. Then $U = \{v_0, v_8, v_{12}, v_{14}, v_{15}, v_{13}, v_{10}, v_{11}\}$, $T = \{(v_0, v_8), (v_8, v_{12}), (v_{12}, v_{14}), (v_{14}, v_{15}), (v_{15}, v_{13}), (v_{13}, v_{10}), (v_{10}, v_{11})\}$ and $t = v_{11}$. In the next step $T \leftarrow T \cup (v_{11}, v_9)$. Since SHORTCUT considers the index of nodes and not the distance function, it always gives the same TSP tour for the same structure of the spanning cactus.
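The walk-through above can be reproduced with a direct transcription of SHORTCUT and DEPTH-FIRST. The sketch below is an illustration under our own naming (`w_cactus_neighbours`, `shortcut`, `d1` are hypothetical); it treats every 2-cycle as an undirected tree edge, as the text does, and weighs the resulting tour with the first distance function.

```python
def w_cactus_neighbours(m):
    """Adjacency lists of the W-cactus on 2**m nodes, one entry per 2-cycle."""
    n = 2 ** m
    adj = {v: [] for v in range(n)}
    for j in range(m):
        step = 2 ** j
        for i in range(step, n, 2 * step):
            adj[i - step].append(i)
            adj[i].append(i - step)
    return adj


def shortcut(adj, start=0):
    """Depth-first search that visits neighbours with larger index first."""
    tour = []        # edges of the TSP tour T, in the order they are added
    visited = set()  # the set U of visited nodes

    def depth_first(s):
        t = s
        visited.add(s)
        for v in sorted(adj[s], reverse=True):
            if v not in visited:
                tour.append((t, v))
                t = depth_first(v)
        return t

    t = depth_first(start)
    tour.append((t, start))  # final edge back to the start node
    return tour


def d1(i, j, n):
    """First distance function D^1_n: the difference in index."""
    diff = abs(i - j)
    return min(diff, n - diff)


m = 4
n = 2 ** m
tour = shortcut(w_cactus_neighbours(m))
tour_weight = sum(d1(a, b, n) for a, b in tour)
print(tour[:8])
print(tour_weight, n * m / 2)  # tour weight against half the W-cactus weight
```

For $n = 16$ the first eight edges agree with the set $T$ listed above followed by $(v_{11}, v_9)$, and the complete tour has weight 40, compared with 32 for half the W-cactus.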

Definition 13. The TSP tour produced by the procedure SHORTCUT on a W-cactus with the start node $s = v_0$ is called the Worst Case TSP Tour (W-tour).

Lemma 8. For each cycle $(v_i, v_j)$ in the W-cactus there is one distinct edge $(v_p, v_q)$ in the W-tour associated to the cycle, where $\min\{|p-q|,\ n-|p-q|\} \ge \min\{|i-j|,\ n-|i-j|\}$.

Proof. Think of the W-cactus as a spanning tree with its root in $v_0$. Take an arbitrary cycle $(v_i, v_j)$ with $\min\{|i-j|,\ n-|i-j|\} = 2^k$. By Lemma 5 the difference in index for a cycle in the W-cactus is always a power of two. Assume that $v_i$ is at the higher level in the tree. The edge $(v_p, v_j)$ to $v_j$ in the W-tour is the edge associated to the cycle. (Since the tour is Hamiltonian there is always such an edge.) The edge to the root is an extra edge in the tour, not associated to any cycle. When SHORTCUT comes to $v_i$ there are two possible cases:

1. $s = t = v_i$: This gives $(v_p, v_j) = (v_i, v_j)$, so one edge in the cycle is in the tour. This is, among others, the case at the root when $s = v_0$.
2. $s \neq t$: By Lemma 6 we know that $t = v_p$ with $p \in [j + 2^k,\ i]$, thus $\min\{|p-j|,\ n-|p-j|\} \ge 2^k$.

Thus there is one edge associated to every cycle fulfilling the Lemma.

Lemma 9. The W-tour in a graph $G^1_n$ has weight larger than $\frac{n \log n}{2}$.

Proof. By Lemma 8 we know that every cycle $(v_i, v_j)$ in the W-cactus has one corresponding edge $(v_s, v_t)$ in the W-tour where $\min\{|s-t|,\ n-|s-t|\} \ge \min\{|i-j|,\ n-|i-j|\}$. With the distance function $D^1_n$, $d(v_i, v_j) = \min\{|i-j|,\ n-|i-j|\}$. Thus the weight of the W-tour is at least half the weight of the W-cactus. The W-cactus has weight $n \log n$, which gives the W-tour a weight larger than $(n \log n)/2$.

Theorem 2. The TSP approximation algorithm, ATSP, by Frieze et al. gives with our specification on the graph $G^1_n$ an approximation in $\Omega(\log n)$. It is assumed that the algorithm prefers small cycles in the cycle cover.

Proof. The optimum TSP tour is $\mathrm{opt}(G^1_n) = n$. The ATSP algorithm gives in the worst case, by Lemma 9, a TSP tour $T$ of weight larger than $(n \log n)/2$. Thus

$$T > \frac{\mathrm{opt}(G^1_n) \log n}{2}$$

and $T$ is approximated in $\Omega(\log n)$.

5 More classes of worst case graphs

The previous class of graphs has two main disadvantages: it is symmetric and the minimal cycle cover is not unique. In this section we construct classes of graphs which do not have these disadvantages. We will only make a formal analysis of the last class and just show properties for the other classes that will be used in that analysis.

5.1 A class of graphs with unique minimum cycle cover

By modifying the previous class of graphs we get a class with a unique minimum cycle cover. The class is symmetric and can thus be 3/2-approximated by the algorithm by Christofides [2]. The minimum TSP tour, the cycle covers and the TSP tour found by the algorithm are identical in this class and in the last class. Therefore we analyse them carefully here and refer to the result in the final analysis. The distance function is defined by a graph (Figure 4) and has the following definition:

Definition 14. The second distance function, $D^2_n$, is induced by a symmetric graph $G^{D2}_n$ with $n = 2^m$ nodes arranged in a circle. Two neighbour nodes are connected by an edge. The weight of an edge is $d(v_{i-1}, v_i) = 1 + z_m(i)\varepsilon$ where $0 < \varepsilon < 1/\log n$ and $z$ is defined in Definition 4.

[Figure 4: The graph $G^{D2}_{16}$ inducing the second distance function, $D^2_{16}$. The 16 nodes are placed in a circle and each edge $(v_{i-1}, v_i)$ is labelled with its weight $1 + z_m(i)\varepsilon$.]

An intuitive way to understand the second distance function is that every second edge has weight one and each of the remaining edges has at least one extra $\varepsilon$-weight added. This makes the 2-cover minimal. Nodes with even index are in the subgraph. Now every second edge has weight $2 + \varepsilon$ and each of the remaining edges has at least one extra $\varepsilon$-weight added, and the 2-cover is again the unique minimal cycle cover. In every recursion every second edge has at least one extra $\varepsilon$-weight added and the 2-cover is always unique.

Definition 15. A graph $G^2_n$ is a complete, asymmetric graph with $n = 2^m$ nodes ordered as in $G^{D2}_n$ and has the distance function $D^2_n$.

We need some properties of this class of graphs to analyse the final class of graphs.

Lemma 10. If there is an edge $(v_i, v_j)$ in the graph $G^{D2}_n$ then the shortest length from $v_i$ to $v_j$ is $d(v_i, v_j)$.

Proof. The only thing that might be a problem is that for an edge $(v_{i-1}, v_i)$ the path anti-clockwise around the circle is shorter than the edge. The largest edge has weight less than 2: $d(v_{n-1}, v_0) = 1 + (m-1)\varepsilon < 1 + (m-1)\frac{1}{m} < 2$. A path in the anti-clockwise direction uses $n - 1$ edges of weight at least one, so its weight is at least $n - 1 > 2$ for $n \ge 4$, and hence the edge is always the shortest path.

Lemma 11. In the graph $G^2_n$ edges $(v_{i-2^j}, v_i)$, $(v_i, v_{i-2^j})$ with $j \le z(i) = k$ have length $2^j + (2^j - 1 + k - j)\varepsilon$.

Proof. The proof is made by induction. If $j = 0$ the Lemma is trivially identical to the definition of the distance function $D^2_n$. Suppose the Lemma is true when $j = p$, where $p < z(i)$. What happens for $j = p + 1$? Look at the edge $(v_{i-2^{p+1}}, v_i)$ with $z(i) = k$. The length is

$$d(v_{i-2^{p+1}}, v_i) = d(v_{i-2^{p+1}}, v_{i-2^p}) + d(v_{i-2^p}, v_i) = 2^p + (2^p - 1 + p - p)\varepsilon + 2^p + (2^p - 1 + k - p)\varepsilon = 2^{p+1} + (2^{p+1} - 1 + k - (p+1))\varepsilon.$$

The difference in index between the nodes, $2^j$, is at most half the circle since $j \le z(i)$, and hence the shortest path is over the node $v_{i-2^p}$. We have used the fact that $z(i - 2^p) = p$ since $p < z(i)$. Since the distance function is symmetric the edge $(v_i, v_{i-2^{p+1}})$ has the same length. By induction the Lemma is true for every $j$.

Lemma 12. The algorithm ATSP, with our specification, returns a W-cactus as spanning cactus on the graph $G^2_n$ and the cactus has weight $n \log n + \left(\frac{n}{2}(\log n - 2) + 1\right)\varepsilon$.

Proof. At first $G^2_{n,0}$ consists of all $n$ nodes $v_i$, with $z(i) \ge 0$. Every second edge has weight one and each of the remaining edges has at least one extra $\varepsilon$-weight added, so the 2-cover is the unique minimal cycle cover. If the first node in every cycle is put in the subgraph, $G^2_{n,1}$ consists of the nodes $v_i$ with even index $i$, that is, $z(i) \ge 1$. Suppose $G^2_{n,j}$ consists of all nodes $v_i$ with $z(i) \ge j$ and that the cycle covers in $G^2_{n,r}$ for $r < j$ form a subgraph of the W-cactus. By Lemma 11 an edge between neighbours in $G^2_{n,j}$ has weight $2^j + (2^j - 1 + k - j)\varepsilon$, so every second edge (those with $k = j$) has weight $2^j + (2^j - 1)\varepsilon$ and each of the remaining edges has at least one extra $\varepsilon$-weight added. The 2-cover is minimal. Select the first node in every cycle to be in $G^2_{n,j+1}$. Then the cycle cover in $G^2_{n,j}$ is a subgraph of the W-cactus and nodes $v_i$ in $G^2_{n,j+1}$ have $z(i) \ge j + 1$. By induction the 2-cover is minimal for every subgraph and it forms a W-cactus.
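The closed form in Lemma 11 can be checked by brute force on small circles. The sketch below is only an illustration: `z`, `d2` and `lemma11_holds` are hypothetical names, and $\varepsilon$ is fixed to the concrete rational value $1/(2m)$, which is one arbitrary choice satisfying $0 < \varepsilon < 1/\log n$.

```python
from fractions import Fraction


def z(x, m):
    """Largest k in {0, ..., m-1} such that 2**k divides x (Definition 4)."""
    k = 0
    while k < m - 1 and x % 2 ** (k + 1) == 0:
        k += 1
    return k


def d2(i, j, m, eps):
    """Shortest-path distance between v_i and v_j induced by G^{D2}_n."""
    n = 2 ** m

    def clockwise(a, b):
        total, t = Fraction(0), a
        while t != b:
            t = (t + 1) % n
            total += 1 + z(t, m) * eps  # weight of the edge (v_{t-1}, v_t)
        return total

    # The inducing graph is symmetric, so both ways around the circle count.
    return min(clockwise(i, j), clockwise(j, i))


def lemma11_holds(m):
    """Compare d2 with 2^j + (2^j - 1 + k - j) eps for all i and j <= z(i)."""
    n = 2 ** m
    eps = Fraction(1, 2 * m)  # assumed sample value below 1 / log n
    for i in range(n):
        k = z(i, m)
        for j in range(k + 1):
            expected = 2 ** j + (2 ** j - 1 + k - j) * eps
            if d2((i - 2 ** j) % n, i, m, eps) != expected:
                return False
    return True


print(all(lemma11_holds(m) for m in range(2, 6)))
```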

If there are no $\varepsilon$-weights the cactus has weight $n \log n$. The $\varepsilon$-weights contribute with

$$\sum_{i=1}^{\log n - 1} \frac{n(2^i - 1)\varepsilon}{2^{i+1}} = \sum_{i=1}^{\log n - 1} \frac{n\varepsilon}{2} - \sum_{i=2}^{\log n} \frac{n\varepsilon}{2^i} = \varepsilon\frac{n}{2}(\log n - 1) - \varepsilon\left(\frac{n}{2} - 1\right) = \left(\frac{n}{2}(\log n - 2) + 1\right)\varepsilon.$$

The total weight is then

$$n \log n + \left(\frac{n}{2}(\log n - 2) + 1\right)\varepsilon.$$

The edges $(v_i, v_j)$ in the W-cactus have minimal weight:

Lemma 13. In a graph $G^2_n$ an edge $(v_i, v_{i-2^k})$ with $z(i) = k$ has minimal length: $d(v_i, v_{i-2^k}) = 2^k + (2^k - 1)\varepsilon \le d(v_j, v_{j-2^k})$ for all $j$.

Proof. By Lemma 11 all edges $(v_i, v_{i-2^k})$ with $z(i) = k$ have equal length $2^k + (2^k - 1)\varepsilon$. Look at the edge $(v_{2^k}, v_0)$, that is, $i = 2^k$. Since the edges grow in clockwise order there is no shorter edge $(v_j, v_{j-2^k})$ in the graph.

The algorithm has produced a W-cactus as spanning cactus. Since neither the procedure TOUR nor SHORTCUT considers the distance function, they return a W-tour as TSP tour.

Lemma 14. In the graph $G^2_n$ a W-tour has weight greater than $\frac{n \log n}{2} + \left(\frac{n}{4}(\log n - 2) + \frac{1}{2}\right)\varepsilon$.

Proof. From Lemma 8 we know that the edge $(v_s, v_j)$ in the W-tour associated to every cycle $(v_i, v_j)$ in the W-cactus has $\min\{|s-j|,\ n-|s-j|\} \ge \min\{|i-j|,\ n-|i-j|\}$. Since the edges in the W-cactus are minimal by Lemma 13, $d(v_s, v_j) \ge d(v_i, v_j)$. Thus for every cycle there is an associated edge with length at least as large as the edges in the cycle, and the W-tour has weight at least half of the W-cactus. By Lemma 12 the W-cactus has weight $n \log n + \left(\frac{n}{2}(\log n - 2) + 1\right)\varepsilon$.

5.2 A class of asymmetric graphs

The third class is a class of asymmetric graphs where the minimal cycle cover is not unique. The class has the same minimum TSP tour and W-cactus as the first class. The construction is an example of how to construct asymmetric graphs, but the analysis of the graphs is not needed for our final analysis, so we omit it. The distance function is defined by:

Definition 16. The third distance function, $D^3_n$, is induced by a graph $G^{D3}_n$ with $n = 2^m$ nodes arranged in a circle. Every edge $(v_i, v_{i+1})$ in $G^{D3}_n$ has weight one. Edges $(v_i, v_{i-2^k})$ with $k = z_m(i) < m - 1$ have length $d(v_i, v_{i-2^k}) = 2^k$.

Definition 17. In a graph $G^x_n$ with $x \in \{1, 2, 3, 4\}$ an edge $(v_i, v_{i-2^k})$ with $z(i) = k$ is called an Anti-Clockwise edge or AC-edge.

The aim is to construct an asymmetric distance function which has a W-cactus and a minimal TSP tour identical with those of the distance function $D^1_n$. To get the same minimal TSP tour, all edges in the clockwise direction (in the circle) are set to one. That induces all edges in the clockwise direction in the W-cactus to get the same length as in $D^1_n$. Edges in the W-cactus in the anti-clockwise direction, AC-edges, are set to the length they have in $D^1_n$. When we look at some edges with induced length, for example $(v_0, v_{n-1})$ and $(v_{n-1}, v_0)$, they show an asymmetric behaviour (Figure 5), as required.
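The asymmetry of the third distance function can be made visible by computing the induced distances for $n = 16$. The sketch below builds $G^{D3}_n$ from Definition 16 and runs the standard Floyd-Warshall all-pairs shortest path algorithm on it; the function names `z` and `induced_d3` are hypothetical, and Floyd-Warshall is our choice of method, not something prescribed by the text.

```python
def z(x, m):
    """Largest k in {0, ..., m-1} such that 2**k divides x (Definition 4)."""
    k = 0
    while k < m - 1 and x % 2 ** (k + 1) == 0:
        k += 1
    return k


def induced_d3(m):
    """All-pairs distances induced by the graph G^{D3}_n with n = 2**m."""
    n = 2 ** m
    inf = float("inf")
    d = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for i in range(n):
        d[i][(i + 1) % n] = 1        # clockwise edge of weight one
        k = z(i, m)
        if k < m - 1:
            d[i][i - 2 ** k] = 2 ** k  # AC-edge (v_i, v_{i - 2^k})
    # Floyd-Warshall: allow every node in turn as an intermediate stop.
    for mid in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][mid] + d[mid][j] < d[i][j]:
                    d[i][j] = d[i][mid] + d[mid][j]
    return d


d = induced_d3(4)
print(d[0][15], d[15][0])  # opposite directions between v_0 and v_15
print(d[4][3], d[2][1])    # two further induced anti-clockwise distances
```

For $n = 16$ this gives $d(v_0, v_{15}) = 15$ against $d(v_{15}, v_0) = 1$, and induced lengths 7 and 3 for $(v_4, v_3)$ and $(v_2, v_1)$.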

Figure 5: A graph with distances induced by the graph $G^{D3}_{16}$. The solid lines are edges in $G^{D3}_{16}$ and dashed lines are edges with induced distances. For simplicity weights equal to one are omitted and only some induced edges are shown.

5.3 A class of asymmetric graphs with unique minimum cycle cover

In this section we combine the last two distance functions into a distance function which is asymmetric and has a unique minimum cycle cover. Figure 6 shows a graph with distances induced by the following distance function:

Definition 18. The fourth distance function, $D^4_n$, is induced by a graph $G^{D4}_n$ with $n = 2^m$ nodes arranged in a circle. Edges $(v_{i-1}, v_i)$ have weight $d(v_{i-1}, v_i) = 1 + z_m(i)\epsilon$, where $\epsilon < 1/\log n$. Edges $(v_i, v_{i-2^k})$ with $z_m(i) = k < m-1$ have length $d(v_i, v_{i-2^k}) = 2^k + (2^k - 1)\epsilon$.

Definition 19. A graph $G^4_n$ is a complete, asymmetric graph with $n = 2^m$ nodes ordered as in $G^{D4}_n$ and the distance function $D^4_n$.

Edges $(v_i, v_{i-2^k})$ with $z_m(i) = k = m-1$ get the induced length $2^k + (2^k - 1)\epsilon$, i.e., the formula in Definition 18 holds for all $k$. This means in particular that $d(v_0, v_{2^{m-1}}) = d(v_{2^{m-1}}, v_0) = 2^{m-1} + (2^{m-1} - 1)\epsilon$. We do not want redundancy in the definition and have left these edges undefined, but we use the formula in Definition 18 for all $k$ in the proofs below.

Figure 6: A graph with distances induced by the distance function $D^4_{16}$. The solid lines are edges in $G^{D4}_{16}$ and dashed lines are edges with induced distances. For simplicity weights equal to one are omitted and only some induced edges are shown. (Labels visible in the figure include $1+3\epsilon$, $4+3\epsilon$, $7+4\epsilon$ and $15+11\epsilon$.)

Lemma 15. If there is an edge $(v_i, v_j)$ in the graph $G^{D4}_n$ then the shortest length from $v_i$ to $v_j$ is $d(v_i, v_j)$.

Proof. Look at the edges $(v_i, v_{i\pm 1})$. The largest such edge has weight less than 2: $d(v_{n-1}, v_0) = 1 + z(n/2)\epsilon < 1 + (m-1)\frac{1}{m} < 2$. A path over at least two edges has length at least two (since the shortest edge has weight one). Hence the edge is always the shortest path.

An AC-edge $(v_i, v_{i-2^k})$, $z_m(i) = k$, has length $2^k + (2^k - 1)\epsilon$. There are two alternative ways to get from $v_i$ to $v_{i-2^k}$ in $G^{D4}_n$. One is clockwise around the circle, and that path has length $2^m + (2^m - 2)\epsilon - (2^k + (2^k - 1)\epsilon) \geq 2^k + (2^k - 1)\epsilon$ since $k < m$. The other is to use an AC-edge with $k' > k$. But that AC-edge has length larger than $2^k + (2^k - 1)\epsilon$, and thus the shortest path from $v_i$ to $v_{i-2^k}$ is the edge $(v_i, v_{i-2^k})$.
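Definition 18 and Lemma 15 can be checked numerically on small instances. The following Python sketch builds the generating graph, computes the induced distances with the standard Floyd-Warshall algorithm (not a method taken from the paper), and tests that every generating edge is its own shortest path. The function names, the choice $\epsilon = 1/(2m)$ and the convention $z_m(0) = m-1$ (read off from the weight $1 + (m-1)\epsilon$ used for the edge $(v_{n-1}, v_0)$ in the proof of Lemma 15) are assumptions of this sketch.

```python
from fractions import Fraction


def z(i, m):
    """Trailing zero bits of i; z(0) = m - 1 is an assumed convention."""
    return m - 1 if i == 0 else (i & -i).bit_length() - 1


def generating_graph(m, eps):
    """Edges of the circle graph inducing D^4_n for n = 2**m (needs m >= 2)."""
    n = 2 ** m
    edges = {}
    for i in range(n):
        k = z(i, m)
        # Clockwise edge (v_{i-1}, v_i) with weight 1 + z_m(i) * eps.
        edges[((i - 1) % n, i)] = 1 + k * eps
        # AC-edge (v_i, v_{i-2^k}); Definition 18 only defines it for k < m - 1.
        if k < m - 1:
            edges[(i, (i - 2 ** k) % n)] = 2 ** k + (2 ** k - 1) * eps
    return n, edges


def induced_distances(n, edges):
    """All-pairs shortest paths (Floyd-Warshall) over exact fractions."""
    inf = float("inf")
    d = [[0 if a == b else edges.get((a, b), inf) for b in range(n)]
         for a in range(n)]
    for c in range(n):
        for a in range(n):
            for b in range(n):
                if d[a][c] + d[c][b] < d[a][b]:
                    d[a][b] = d[a][c] + d[c][b]
    return d


def check_lemma_15(m):
    """True if all generating edges are shortest paths and the k = m - 1 case fits."""
    eps = Fraction(1, 2 * m)  # any eps < 1/log n = 1/m would do here
    n, edges = generating_graph(m, eps)
    d = induced_distances(n, edges)
    # Lemma 15: no detour beats a generating edge.
    edges_ok = all(d[a][b] == w for (a, b), w in edges.items())
    # Definition 19: the formula of Definition 18 also holds for k = m - 1.
    half = 2 ** (m - 1)
    expected = half + (half - 1) * eps
    return edges_ok and d[0][half] == expected and d[half][0] == expected
```

For example, `check_lemma_15(4)` runs the check on the 16-node instance shown in Figure 6. Exact fractions are used instead of floats so that equalities between path lengths are not blurred by rounding.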

Lemma 16. An edge $(v_i, v_{i-1})$ with $z_m(i) = k$ in $G^4_n$ has an induced length of $2^{k+1} - 1 + (2^{k+1} - (k+2))\epsilon$.

Proof. The induced length is the length of the shortest path in the graph $G^{D4}_n$ from $v_i$ to $v_{i-1}$. For $k \geq 1$ there is no edge $(v_i, v_{i-1})$ in $G^{D4}_n$ and there are three possible paths:

1. Using the AC-edge from $v_i$ gives a path of length (shifting indices so that $i = 2^k$)
$$d(v_i, v_{i-2^k}) + \sum_{j=i-2^k}^{i-2} d(v_j, v_{j+1}) = d(v_{2^k}, v_0) + \sum_{j=0}^{2^k-2} d(v_j, v_{j+1}) = 2^k + (2^k - 1)\epsilon + 2^k - 1 + (2^k - (k+1))\epsilon = 2^{k+1} - 1 + (2^{k+1} - (k+2))\epsilon.$$

2. Going in clockwise direction around the circle gives a path of length
$$2^m + (2^m - 2)\epsilon - (1 + k\epsilon) \geq 2^{k+1} - 1 + (2^{k+1} - (k+2))\epsilon,$$
with equality only for $k = m-1$.

3. Using a larger AC-edge gives a path with length larger than
$$d(v_{2^{k+1}}, v_0) = 2^{k+1} + (2^{k+1} - 1)\epsilon > 2^{k+1} - 1 + (2^{k+1} - (k+2))\epsilon.$$

Thus using the AC-edge from the node $v_i$ gives a shortest path and it has the desired length.

Lemma 17. Opposite edges $(v_i, v_{i-2^k})$ and $(v_{i-2^k}, v_i)$ with $z_m(i) = k$ in the graph $G^4_n$ have length $2^k + (2^k - 1)\epsilon$.

Proof. The shortest path in $G^{D4}_n$ from $v_{i-2^k}$ to $v_i$ with $z_m(i) = k$ runs clockwise in the circle and has length $2^k + (2^k - 1)\epsilon$. In the other direction $d(v_i, v_{i-2^k}) = 2^k + (2^k - 1)\epsilon$ by the definition of the distance function.

Thus all edges in $G^4_n$ have the desired length. The graph has a clear asymmetric behaviour and the ratio between edges in different directions is linear in $n$.

Lemma 18. In a graph $G^4_n$ the maximum ratio, $\alpha$, of edges in different directions is greater than $n/2$ if $\epsilon < 1/\log n$ and $n \geq 4$.

Proof. The edge $(v_0, v_{n-1})$ has by Lemma 16 length $2^m - 1 + (2^m - (m+1))\epsilon$. The edge $(v_{n-1}, v_0)$ has by definition and Lemma 15 length $1 + \epsilon(m-1)$. The ratio of the two decreases in $\epsilon$, so for $\epsilon < 1/\log n = 1/m$
$$\frac{2^m - 1 + \epsilon(2^m - m - 1)}{1 + \epsilon(m-1)} > \frac{n - 1 + (n - m - 1)/m}{1 + (m-1)/m} = \frac{mn + n - 2m - 1}{2m - 1} \geq \frac{n}{2}$$
if $m \geq 2$, since the last inequality is equivalent to $3n \geq 4m + 2$.

For such a ratio, the analysis by Frieze et al. of their data dependent algorithm only guarantees an approximation in $O(n)$. Lemma 18 thus shows that a graph $G^4_n$ is at least not proven to be easily approximated by the data dependent algorithm.
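The path in case 1 of the proof of Lemma 16 can be summed mechanically. The Python sketch below keeps each weight as a pair $(a, b)$ standing for $a + b\epsilon$, so the results for $n = 16$ can be compared directly with the labels $7+4\epsilon$ and $15+11\epsilon$ of Figure 6. It then evaluates the ratio of Lemma 18 for a concrete $\epsilon$. The helper names and the convention $z_m(0) = m-1$ are assumptions of this example.

```python
from fractions import Fraction


def z(i, m):
    """Trailing zero bits of i; z(0) = m - 1 is an assumed convention."""
    return m - 1 if i == 0 else (i & -i).bit_length() - 1


def backward_length(i, m):
    """Pair (a, b) = a + b*eps for: AC-edge from v_i, then clockwise to v_{i-1}."""
    n = 2 ** m
    k = z(i, m)
    # AC-edge (v_i, v_{i-2^k}); for k = m - 1 this is its induced length.
    a, b = 2 ** k, 2 ** k - 1
    # Clockwise edges (v_{j-1}, v_j), each of weight 1 + z_m(j) * eps.
    for j in range(i - 2 ** k + 1, i):
        a += 1
        b += z(j % n, m)
    return a, b


def direction_ratio(m, eps):
    """d(v_0, v_{n-1}) / d(v_{n-1}, v_0) for a concrete eps < 1/m."""
    a, b = backward_length(0, m)
    return (a + b * eps) / (1 + (m - 1) * eps)
```

Calling `backward_length(4, 4)` and `backward_length(0, 4)` gives the pairs for nodes with $k = 2$ and $k = 3$ in the 16-node graph, and `direction_ratio(m, Fraction(1, 2 * m))` can be compared with $n/2 = 2^{m-1}$.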

Lemma 19. In a graph with distance function $D^4_n$, the optimum TSP tour has weight $n + (n-2)\epsilon$.

Proof. The candidate for the minimum TSP tour, $opt(G^4_n)$, traverses the nodes in clockwise order. Every edge has weight at least one, which gives a weight of $n$. The extra weight is
$$\sum_{i=1}^{\log n} \frac{n\epsilon}{2^i} - \epsilon = n\epsilon \sum_{i=1}^{\log n} \frac{1}{2^i} - \epsilon = (n-2)\epsilon$$
and the total weight is $n + (n-2)\epsilon$.

Is the tour minimal? There are $n$ edges in $G^4_n$ of weight one. Only half of them can be in a TSP tour since they come in pairs of opposite direction. There are $n/2$ further edges of weight less than two; all remaining edges have length greater than 2. The TSP tour consists of the $n$ shortest edges possible in a TSP tour and is hence minimal.

Lemma 20. In the graph $G^4_n$ the algorithm ATSP, with our specification, produces a W-cactus as spanning cactus and it has weight $n\log n + (\frac{n}{2}(\log n - 2) + 1)\epsilon$.

Proof. Lemma 12 shows that the W-cactus is the union of the cycle covers produced by the ATSP algorithm on the graph $G^2_n$. Edges in $D^4_n$ have lengths at least as large as in $D^2_n$: $d_4(v_i, v_j) \geq d_2(v_i, v_j)$ for all $i, j$. Hence a minimum cycle cover in $D^2_n$ is minimum in $G^4_n$ as well, if it exists. Lemma 11 and Lemma 17 show that AC-edges in $G^2_n$ and $G^4_n$ have the same length and hence all minimum cycle covers exist in $G^4_n$. By Lemma 12 the W-cactus in $G^2_n$ has weight $n\log n + (\frac{n}{2}(\log n - 2) + 1)\epsilon$.

Now we have a W-cactus as spanning cactus. The procedure SHORTCUT makes a TSP tour from the cactus. Since the procedure does not consider the distance function we get a W-tour as TSP tour.

Lemma 21. In the graph $G^4_n$ the algorithm ATSP, with our specification, produces a W-tour as an approximation of a TSP tour and the tour has weight greater than $\frac{n\log n}{2} + (\frac{n}{4}(\log n - 2) + \frac{1}{2})\epsilon$.

Proof. By Lemma 20 the algorithm ATSP produces a W-cactus of weight $n\log n + (\frac{n}{2}(\log n - 2) + 1)\epsilon$ in the graph $G^4_n$. The procedure SHORTCUT gives by Definition 13 a W-tour as TSP tour from a W-cactus independently of the distance function.
By Lemma 13 the weight of the W-tour in $G^2_n$ is at least half of the weight of the W-cactus in the same graph. Since edges in $G^4_n$ have weight at least as high as in $G^2_n$, the weight of the W-tour in $G^4_n$ is at least as high.

Theorem 3. The approximation algorithm ATSP by Frieze et al., with our specifications, gives on a graph $G^4_n$ a TSP tour, $T$, such that
$$\frac{T}{opt(G^4_n)} > \frac{\log n}{2 + 2\epsilon}$$
or an approximation in $\Omega(\log n)$.

Proof. The optimum TSP tour has by Lemma 19 weight $n + (n-2)\epsilon$. The tour found by the algorithm ATSP has by Lemma 21 weight greater than
$$\frac{n\log n}{2} + \left(\frac{n}{4}(\log n - 2) + \frac{1}{2}\right)\epsilon.$$
Thus
$$\frac{T}{opt(G^4_n)} > \frac{\log n + (\frac{1}{2}(\log n - 2) + 1/n)\epsilon}{2 + 2(1 - 2/n)\epsilon} > \frac{\log n}{2 + 2(1 - 2/n)\epsilon} > \frac{\log n}{2 + 2\epsilon}$$
and $T$ is an approximation in $\Omega(\log n)$.

Theorem 3 is our main result and proves Theorem 1.

6 Conclusions and future work

The graphs $G^4_n$ form a family of asymmetric graphs for which the approximation algorithm for asymmetric TSP by Frieze et al., with our specifications, shows a worst case behaviour. The algorithm returns by Theorem 3 a TSP tour of weight greater than $(opt(G^4_n)\log n)/(2 + 2\epsilon)$, $\epsilon > 0$. The analysis of the algorithm by Frieze et al. shows that the algorithm gives a TSP tour with weight less than or equal to $opt(G^4_n)\log n$. We show that the analysis of the algorithm by Frieze et al. is tight up to a factor of 1/2.

One improvement of the algorithm might be to make the choices data dependent. It would also be interesting to investigate the average behaviour of the algorithm with random choices.

The ratio $\alpha$ between edges in different directions is by Lemma 18 greater than $n/2$, and the data dependent approximation algorithm by Frieze et al. is then only proven to give an approximation better than $3\alpha/2 = 3n/4$, or $O(n)$. When we simply apply the data dependent algorithm to the fourth class of graphs, $G^4_n$, there are two possible outcomes. The algorithm converts the asymmetric graph to a symmetric one, uses the algorithm due to Christofides [2] and in the end arbitrarily chooses the direction of the found TSP tour. The undirected TSP tour is around the circle. If the direction is chosen to be clockwise the algorithm finds the optimum TSP tour of weight $n$ (up to $\epsilon$ terms). If the direction on the other hand is chosen to be anti-clockwise the directed cycle has weight $n\log n$, which is the same as for the original approximation algorithm.
Thus with one choice assumed to be bad the data dependent algorithm approximates the asymmetric TSP tour within a factor of $\log n$. The expected approximation is $\log n/2$ over the choice of orientation of the tour.

The new algorithm by Bläser is a development of the algorithm by Frieze et al. and is more complicated. Hence it is more difficult to understand and to prove results on its worst case performance. The algorithm by Bläser shows that there are algorithms with approximation ratio $c\log n$ where $c < 1$. When we apply the algorithm by Bläser to the fourth class of graphs, $G^4_n$, it can return different TSP tours. It turns out that it is possible to specify Bläser's algorithm in such a way that it returns the same TSP tour as the algorithm by Frieze et al. Hence, we get a lower bound of $\frac{1}{2 + 2\epsilon}\log n$ which is, again, optimal up to a constant factor.

The question by Karp [5] as to whether there is a polynomial time heuristic for which the approximation ratio of asymmetric TSP is bounded by a constant is still open. Hopefully the understanding of worst case performance graphs helps in developing new ideas for improved approximation algorithms.
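As a rough numerical illustration of the lower bound restated above, the following Python sketch plugs the weights of Lemma 19 and Lemma 21 into the ratio of Theorem 3. It assumes the readings $opt = n + (n-2)\epsilon$, $T > \frac{n\log n}{2} + (\frac{n}{4}(\log n - 2) + \frac{1}{2})\epsilon$ and the bound $\log n/(2 + 2\epsilon)$, with logarithms to base 2. The function names and the choice $\epsilon = 1/(2\log n)$ are illustrative only.

```python
from fractions import Fraction


def optimum_weight(m, eps):
    """Weight of the clockwise tour, n + (n - 2) * eps, for n = 2**m."""
    n = 2 ** m
    return n + (n - 2) * eps


def w_tour_lower_bound(m, eps):
    """Lower bound on the W-tour weight, as read from Lemma 21."""
    n = 2 ** m
    return Fraction(n * m, 2) + (Fraction(n, 4) * (m - 2) + Fraction(1, 2)) * eps


def ratio_lower_bound(m, eps):
    """Lower bound on T / opt that follows from the two lemmas."""
    return w_tour_lower_bound(m, eps) / optimum_weight(m, eps)


def theorem_bound(m, eps):
    """The bound log n / (2 + 2 * eps), as read from Theorem 3."""
    return m / (2 + 2 * eps)


# Print n, the ratio implied by the lemmas, and the bound of Theorem 3.
for m in (2, 4, 8, 16):
    eps = Fraction(1, 2 * m)
    print(2 ** m, float(ratio_lower_bound(m, eps)), float(theorem_bound(m, eps)))
```

The first printed ratio should stay above the second for every $n \geq 4$, and both grow linearly in $\log n$, which is the $\Omega(\log n)$ behaviour claimed for this family of graphs.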

7 Acknowledgements

I thank Lars Engebretsen for great inspiration in developing the ideas and support in writing the paper, and Mikael Goldmann for help with writing the paper. Lena Folke and Simon Wigzell have helped me correct the language; thank you! (All mistakes that still remain are due to me.)

References

[1] Markus Bläser. A new approximation algorithm for the asymmetric TSP with triangle inequality. Proceedings 14th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA), 638-645, 2003.
[2] Nicos Christofides. Worst case analysis of a new heuristic for the traveling salesman problem. Tech. Rep. 388, GSIA, Carnegie Mellon University, 1976.
[3] Alan M. Frieze, Giulia Galbiati and Francesco Maffioli. On the Worst-Case Performance of Some Algorithms for the Asymmetric Traveling Salesman Problem. Networks 12:23-39, 1982.
[4] Richard M. Karp. Reducibility among combinatorial problems. In Raymond E. Miller and James W. Thatcher, editors, Complexity of Computer Computations, 85-103, Plenum Press, New York, 1972.
[5] Richard M. Karp. The fast approximate solution of hard combinatorial problems. Proceedings 6th South Eastern Conference on Combinatorics, Graph Theory and Computing, 15-1, Utilitas Mathematica, Winnipeg, 1975.
[6] Christos H. Papadimitriou and Santosh Vempala. On the approximability of the traveling salesman problem. Manuscript, May 2002. http://www.cs.berkeley.edu/~christos/tsp.ps
[7] Sartaj K. Sahni and Teofilo Gonzalez. P-complete approximation problems. Journal of the Assoc. Comput. Mach. 23, 3, 555-565, 1976.
[8] Günter Schaar. Remarks on Hamiltonian properties of powers of digraphs. Discrete Applied Mathematics 51:181-186, 1994.