Approximation Algorithms for Spanner Problems and Directed Steiner Forest

Piotr Berman^a, Arnab Bhattacharyya^{b,1}, Konstantin Makarychev^c, Sofya Raskhodnikova^{a,2}, Grigory Yaroslavtsev^{a,2}

^a Pennsylvania State University, University Park, PA 16802, USA.
^b Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
^c IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA.

Abstract

We present an $O(\sqrt{n} \log n)$-approximation algorithm for the problem of finding the sparsest spanner of a given directed graph $G$ on $n$ vertices. A spanner of a graph is a sparse subgraph that approximately preserves distances in the original graph. More precisely, given a graph $G = (V, E)$ with nonnegative edge lengths $d : E \to \mathbb{R}_{\ge 0}$ and a stretch $k \ge 1$, a subgraph $H = (V, E_H)$ is a $k$-spanner of $G$ if for every edge $(s, t) \in E$, the graph $H$ contains a path from $s$ to $t$ of length at most $k \cdot d(s, t)$. The previous best approximation ratio was $\tilde{O}(n^{2/3})$, due to Dinitz and Krauthgamer (STOC '11).

We also improve the approximation ratio for the important special case of directed 3-spanners with unit edge lengths from $\tilde{O}(\sqrt{n})$ to $O(n^{1/3} \log n)$. The best previously known algorithms for this problem are due to Berman, Raskhodnikova and Ruan (FSTTCS '10) and Dinitz and Krauthgamer. The approximation ratio of our algorithm almost matches Dinitz and Krauthgamer's lower bound for the integrality gap of a natural linear programming relaxation. Our algorithm directly implies an $O(n^{1/3} \log n)$-approximation for the 3-spanner problem on undirected graphs with unit lengths. An easy $O(\sqrt{n})$-approximation algorithm for this problem has been the best known for decades.

Finally, we consider the Directed Steiner Forest problem: given a directed graph with edge costs and a collection of ordered vertex pairs, find a minimum-cost subgraph that contains a path between every prescribed pair. We obtain an approximation ratio of $O(n^{2/3+\epsilon})$ for any constant $\epsilon > 0$, which improves the $O(n^{\epsilon} \cdot \min(n^{4/5}, m^{2/3}))$ ratio due to Feldman, Kortsarz and Nutov (SODA '09).
^1 Arnab Bhattacharyya is supported by the National Science Foundation grants CCF-1065125 and CCF-0728645.
^2 Sofya Raskhodnikova and Grigory Yaroslavtsev are supported by the National Science Foundation (NSF/CCF CAREER award 0845701). Grigory Yaroslavtsev is also supported by a University Graduate Fellowship and a College of Engineering Fellowship.

A preliminary version of this paper appeared in the proceedings of ICALP 2011 [BBM+11].

Preprint submitted to Elsevier, October 19, 2013

1. Introduction

A spanner of a graph is a sparse subgraph that approximately preserves distances in the original graph. This notion was first used by Awerbuch [Awe85] and explicitly introduced by Peleg and Schäffer [PS89].

Definition 1.1 ($k$-spanner, [Awe85, PS89]). Given a graph $G = (V, E)$ with nonnegative edge lengths $d : E \to \mathbb{R}_{\ge 0}$ and a real number $k \ge 1$, a subgraph $H = (V, E_H)$ is a $k$-spanner of $G$ if for all edges $(s, t) \in E$, the graph $H$ contains a path from $s$ to $t$ of length at most $k \cdot d(s, t)$. The parameter $k$ is called the stretch.

Spanners have numerous applications, such as efficient routing [Cow01, CW04, PU89b, RTZ08, TZ01], simulating synchronized protocols in unsynchronized networks [PU89a], parallel, distributed and streaming algorithms for approximating shortest paths [Coh98, Coh00, Elk01, FKM+08], algorithms for distance oracles [BS06, TZ05], property testing, property reconstruction and key management in access control hierarchies (see [BGJ+09, BGJ+12, JR11], the survey in [Ras10] and references therein).

We study the computational problem of finding the sparsest $k$-spanner of a given directed graph $G$, that is, a $k$-spanner of $G$ with the smallest number of edges. We refer to this problem as Directed $k$-Spanner and distinguish between the case of unit edge lengths (i.e., $d(e) = 1$ for all $e \in E$) and arbitrary edge lengths. The Undirected $k$-Spanner problem refers to the task of finding the sparsest $k$-spanner of a given undirected graph. The natural reduction from Undirected $k$-Spanner to Directed $k$-Spanner preserves the approximation ratio.

Our main results are an algorithm with approximation ratio $O(\sqrt{n} \log n)$ for Directed $k$-Spanner with arbitrary edge lengths and an algorithm with approximation ratio $O(n^{1/3} \log n)$ for Directed 3-Spanner with unit edge lengths, where $n$ is the number of nodes in the input graph $G$. Our approximation guarantee for Directed 3-Spanner almost matches the integrality gap of $\Omega(n^{1/3-\epsilon})$ by Dinitz and Krauthgamer [DK11] for a natural linear programming relaxation of the problem.
Our result also directly implies the same approximation ratio for the Undirected 3-Spanner problem with unit edge lengths. Our techniques also apply to the Directed Steiner Forest problem. Our result for this problem is discussed in Section 1.3.

1.1. Relation to Previous Work

Directed $k$-Spanner with unit edge lengths has been extensively studied. Note that in this case, we can assume that $k$ is a positive integer. For $k = 2$, the problem has been completely resolved: Kortsarz and Peleg [KP94] and Elkin and Peleg [EP01] gave an $O(\log n)$-approximation, and Kortsarz [Kor01] proved that this approximation ratio cannot be improved unless P=NP. Elkin and Peleg [EP05] gave an $\tilde{O}(|E|^{1/3})$-approximation for Directed 3-Spanner, which is an $\tilde{O}(n^{2/3})$-approximation for dense graphs with $\Theta(n^2)$ edges. For general $k \ge 3$, Bhattacharyya et al. [BGJ+09] presented an $\tilde{O}(n^{1-1/k})$-approximation; then Berman, Raskhodnikova and Ruan [BRR10] improved it to $\tilde{O}(n^{1-1/\lceil k/2 \rceil})$, and recently Dinitz and Krauthgamer [DK11] gave an $\tilde{O}(n^{2/3})$-approximation, presenting the first algorithm with approximation ratio independent of $k$. For the special cases of $k = 3$ and $k = 4$, Berman, Raskhodnikova and Ruan's algorithm gives an $\tilde{O}(\sqrt{n})$-approximation. Dinitz and Krauthgamer also gave an $\tilde{O}(\sqrt{n})$-approximation for the case $k = 3$, using different techniques than in [BRR10]. Thus, our algorithms improve on [BRR10] for all $k \ge 3$, where $k \ne 4$, and on [DK11] for all $k \ge 3$.

Dinitz and Krauthgamer's algorithms also work for Directed $k$-Spanner with arbitrary edge lengths. For this case, one can no longer assume that $k$ is an integer. Dinitz and Krauthgamer achieved an $\tilde{O}(n^{2/3})$-approximation for all $k > 1$ and $\tilde{O}(\sqrt{n})$ for $k = 3$ for arbitrary edge lengths. We improve this approximation ratio to $\tilde{O}(\sqrt{n})$ for all $k > 1$.

In contrast to the directed case, a simple approximation algorithm for Undirected $k$-Spanner has been known for decades. For all integer $k \ge 3$ and for all undirected graphs $G$ with arbitrary edge lengths, a $k$-spanner can be constructed in polynomial time by a greedy algorithm proposed by Althöfer, Das, Dobkin, Joseph and Soares [ADD+93]. It follows from the Moore bound for irregular graphs by Alon, Hoory and Linial [AHL02] that the graph constructed by this greedy algorithm has $O(n^{1+\lceil k/2 \rceil^{-1}})$ edges. Since a $k$-spanner of a connected graph must have at least $n - 1$ edges, an approximation ratio of $O(n^{\lceil k/2 \rceil^{-1}})$ follows. Our result improves the ratio for Undirected 3-Spanner from $O(\sqrt{n})$ to $\tilde{O}(n^{1/3})$ in the case of unit-length edges.

Elkin and Peleg [EP00, EP07], improving on [Kor01], showed that it is quasi-NP-hard to approximate Directed $k$-Spanner, even when restricted to unit edge lengths, with ratio better than $2^{\log^{1-\epsilon} n}$ for $k \in (3, n^{1-\delta})$ and all $\delta, \epsilon \in (0, 1)$. For Undirected $k$-Spanner with unit-length edges, such a strong hardness result does not hold since the problem is $O(1)$-approximable when $k = \Omega(\log n)$.

1.2. Our Techniques

Our algorithms operate by combining two graphs: the first obtained from randomized rounding of a fractional solution to a linear programming relaxation of the problem and the second obtained by growing shortest-path trees from randomly selected vertices. The idea of combining a linear programming approach with sampling of shortest-path trees to solve Directed $k$-Spanner first appeared in [BGJ+09]. Dinitz and Krauthgamer [DK11] used the same approach in their main algorithm (for arbitrary stretch $k$), but with a novel, flow-based linear program (LP). In this paper, we propose alternative randomized LP rounding schemes that lead to better approximation ratios. Sampling and randomized rounding have been previously used by Kortsarz and Peleg [KP98] to construct undirected low-degree 2-spanners. In that work, the sampling step selects uniformly random edges, and the LP is different from ours.

We also give new LP relaxations of Directed $k$-Spanner, slightly simpler than that in [DK11], although they describe the same polytope. Our LP relaxation for the general case is stated in terms of antispanners, a graph object dual to spanners. An antispanner for an edge $(s, t)$ is a set of edges whose removal from the graph destroys all paths of stretch at most $k$ from $s$ to $t$. As in [DK11], our LP has a polynomial number of variables and an exponential number of constraints. We use the ellipsoid algorithm with a randomized separation oracle to solve it. In the case of unit edge lengths, we present a different LP that has an extra advantage: it has a polynomial number of constraints and thus can be solved quickly without using the ellipsoid algorithm. We apply two different rounding schemes to the fractional solution of this LP: one for general stretch, another for stretch $k = 3$.

We note, however, that our method would yield the same approximation ratios with the LP of Dinitz and Krauthgamer [DK11] and, in the case of 3-spanners for graphs with unit edge lengths, with their rounding method as well. Dinitz and Krauthgamer gave a separate algorithm for Directed 3-Spanner that uses randomized rounding, but does not combine it with sampling. By combining with sampling, we obtain an algorithm with a better approximation ratio for the case of unit lengths. Our rounding method allows for simpler analysis.

1.3. Directed Steiner Forest

Finally, we apply our techniques to the Directed Steiner Forest (DSF) problem, a fundamental network design problem on directed graphs. In this problem, the input is a directed graph $G = (V, E)$ with edge costs and a collection $D \subseteq V \times V$ of vertex pairs. The goal is to find a minimum-cost subgraph of $G$ that contains a path from $s$ to $t$ for every pair $(s, t) \in D$. DSF is an NP-hard problem and is known [DK99] to be quasi-NP-hard to approximate with ratio better than $2^{\log^{1-\epsilon} n}$ for all $\epsilon \in (0, 1)$. DSF is also known [FKN09] to be as hard as MAX-REP, a basic problem used for hardness reductions, for which the current best approximation ratio is $O(n^{1/3})$ [CHK11].

Prior to this work, the best known approximation ratio for DSF, independent of the size of $D$, was $O(n^{\epsilon} \cdot \min(n^{4/5}, m^{2/3}))$ due to Feldman, Kortsarz and Nutov [FKN09]. Their algorithm has the same structure as the algorithms for Directed $k$-Spanner in [BGJ+09, DK11]: it combines two graphs obtained, respectively, by sampling and solving an LP.
In addition, the LP relaxation they formulate is closely related to that developed by Dinitz and Krauthgamer, with edge costs replaced by edge lengths. Our technique for the spanner problem also applies to the DSF problem, yielding an improved approximation ratio of $O(n^{2/3+\epsilon})$ for any fixed $\epsilon > 0$.

1.4. Organization

In Section 2, we explain the general outline of our algorithms, introduce antispanners and show how to find an $\tilde{O}(n^{1/2})$-approximate solution to Directed $k$-Spanner in polynomial time. In Section 3, we present a more efficient algorithm for the special case when all the edges of the graph are of unit length. In Section 4, we show the $\tilde{O}(n^{1/3})$-approximation for Directed 3-Spanner with unit-length edges. Finally, Section 5 describes the $O(n^{2/3+\epsilon})$-approximation for Directed Steiner Forest. In Section 6, we give a conclusion and directions for future work.
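To make Definition 1.1 concrete before the algorithms are developed, the following is a minimal Python sketch of a checker for the $k$-spanner property. It is only an illustration: the function names (`dijkstra`, `is_k_spanner`), the vertex numbering `0..n-1`, and the dictionary representation of edge lengths are assumptions of this sketch, not notation from the paper.

```python
import heapq


def dijkstra(n, adj, src):
    """Shortest-path distances from src; adj[u] is a list of (v, length)."""
    dist = [float("inf")] * n
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def is_k_spanner(n, edges, spanner_edges, k):
    """Check Definition 1.1.

    edges and spanner_edges map directed edges (s, t) to nonnegative lengths;
    spanner_edges is assumed to be a subset of edges.
    """
    adj = [[] for _ in range(n)]
    for (u, v), w in spanner_edges.items():
        adj[u].append((v, w))
    dist_from = {}  # cache one Dijkstra run per source vertex
    for (s, t), w in edges.items():
        if s not in dist_from:
            dist_from[s] = dijkstra(n, adj, s)
        if dist_from[s][t] > k * w:
            return False  # edge (s, t) is stretched by more than k
    return True
```

For instance, in a unit-length graph with edges (0,1), (1,2) and (0,2), dropping the edge (0,2) leaves a 2-spanner, because the path 0, 1, 2 has length 2.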

2. An $\tilde{O}(\sqrt{n})$-approximation for Directed $k$-Spanner

Our first result is stated in the following theorem.

Theorem 2.1. There is a polynomial time randomized algorithm for Directed $k$-Spanner with expected approximation ratio $O(\sqrt{n} \log n)$.

All algorithms in this paper have the same structure. They break the problem into two parts and obtain separate solutions to each part: one by random sampling and the other by randomized rounding of a solution to a linear program. We start by explaining how we break Directed $k$-Spanner into two parts. In Section 2.1, we describe how to obtain a solution to the first part using random sampling. Section 2.2 describes our randomized rounding scheme for Directed $k$-Spanner. In Section 2.3, we introduce antispanners, a graph object used to formulate and analyze our linear programming relaxations. In Section 2.4, we formulate our linear programming relaxation and separation oracle, and finish the description and analysis of the algorithm, completing the proof of Theorem 2.1.

Let $G = (V, E)$ be a directed graph with edge lengths $d : E \to \mathbb{R}_{\ge 0}$, given as an input to our algorithm, and let OPT be the size of its sparsest $k$-spanner. We assume that $G$ is weakly connected. Otherwise, our algorithm should be executed for each weakly connected component separately.

Definition 2.1 (Local graph $G^{s,t}$). For an edge $(s, t) \in E$, let $G^{s,t} = (V^{s,t}, E^{s,t})$ be the subgraph of $G$ induced by the vertices that belong to paths from $s$ to $t$ of length at most $k \cdot d(s, t)$.

We classify edges according to the sizes of their local graphs.

Definition 2.2 (Thick and thin edges). Let $\beta$ be a parameter in $[1, n]$. If $|V^{s,t}| \ge n/\beta$, the corresponding edge $(s, t)$ is thick, and otherwise, it is thin. The set of all thin edges is denoted by $\mathcal{E}$.

In Sections 2.1 to 3, we set $\beta = \sqrt{n}$ and in Section 4, $\beta = n^{1/3}$.

Definition 2.3. A set $E' \subseteq E$ settles an edge $(s, t) \in E$ if $(V, E')$ satisfies the $k$-spanner property for this edge, i.e., it contains a path of length at most $k \cdot d(s, t)$ from $s$ to $t$.

Our algorithm must find a small subset of edges that settles all edges in $E$.
To accomplish this, it finds two subsets of edges, $E'$ and $E''$, such that $E'$ settles all thick edges and $E''$ settles all thin edges. The output of the algorithm is $(V, E' \cup E'')$.

2.1. Sampling

The following procedure uses random sampling to construct an edge set $E'$ that settles all thick edges. Recall that an in-arborescence is a directed rooted tree where all edges are oriented towards the root; an out-arborescence is defined similarly.

Algorithm 1 Sample($\beta$)
 1: $E' \leftarrow \emptyset$, $S \leftarrow \emptyset$;
 2: for $i = 1$ to $\beta \ln n$ do
 3:   $v \leftarrow$ a uniformly random element of $V$;
 4:   $T_v^{in} \leftarrow$ a shortest path in-arborescence rooted at $v$;
 5:   $T_v^{out} \leftarrow$ a shortest path out-arborescence rooted at $v$;
 6:   $E' \leftarrow E' \cup T_v^{in} \cup T_v^{out}$, $S \leftarrow S \cup \{v\}$; //Set $S$ is used only in the analysis.
 7: end for
 8: Add all unsettled thick edges to $E'$;
 9: return $E'$.

Lemma 2.2. Algorithm 1, in polynomial time, computes a set $E'$ that settles all thick edges and has expected size at most $3\beta \ln n \cdot \mathrm{OPT}$.

Proof. After the execution of the for-loop in Algorithm 1, $|E'| \le 2(n-1)\beta \ln n \le 2\beta \ln n \cdot \mathrm{OPT}$. The last inequality holds because $\mathrm{OPT} \ge n - 1$ for weakly connected graphs $G$.

If some vertex $v$ from a set $V^{s,t}$ appears in the set $S$ of vertices selected by Sample, then $T_v^{in}$ and $T_v^{out}$ contain shortest paths from $s$ to $v$ and from $v$ to $t$, respectively. Thus, both paths are contained in $E'$. Since $v \in V^{s,t}$, the sum of lengths of these two paths is at most $k \cdot d(s, t)$. Therefore, if $S \cap V^{s,t} \ne \emptyset$, then the edge $(s, t)$ is settled. For a thick edge $(s, t)$, the set $S \cap V^{s,t}$ is empty with probability at most $(1 - 1/\beta)^{\beta \ln n} \le e^{-\ln n} = 1/n$. Thus, the expected number of unsettled thick edges added to $E'$ in Step 8 of Sample is at most $|E|/n \le n - 1 \le \mathrm{OPT}$.

Step 8 ensures that the set $E'$, returned by the algorithm, settles all thick edges. Computing shortest path in- and out-arborescences and determining whether an edge is thick can be done in polynomial time.

2.2. Randomized Rounding

To obtain a set $E''$ that settles all thin edges, each of our algorithms solves a linear program and rounds the resulting fractional solution. The LP is a relaxation of Directed $k$-Spanner for the set of all thin edges. It has a variable $x_e$ and a constraint $x_e \ge 0$ for each edge $e \in E$. The variable $x_e$ in the corresponding optimal $\{0,1\}$-solution indicates whether the edge $e$ is present in the smallest spanner for all thin edges. The following randomized rounding procedure is used in our algorithms for Directed $k$-Spanner, both for arbitrary and for unit lengths. As an input it gets a fractional vector $\{\hat{x}_e\}$ with nonnegative entries.
Algorithm 2 RandomizedSelection($\hat{x}_e$)
 1: $E'' \leftarrow \emptyset$;
 2: for each edge $e \in E$ do
 3:   Add $e$ to $E''$ with probability $\min(\sqrt{n} \ln n \cdot \hat{x}_e, 1)$;
 4: end for
 5: return $E''$.
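Algorithms 1 and 2 can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: the helper names (`sp_tree`, `is_thick`, `settles`, `sample`, `randomized_selection`) and the dictionary-based graph representation are assumptions of the sketch. It also assumes, as the proof of Lemma 2.2 does, that a vertex $v$ belongs to $V^{s,t}$ exactly when the shortest distance from $s$ to $v$ plus the shortest distance from $v$ to $t$ is at most $k \cdot d(s, t)$.

```python
import heapq
import math
import random


def sp_tree(n, edges, root, reverse=False):
    """Dijkstra from root. Returns (dist, tree_edges) of a shortest-path
    out-arborescence (reverse=False) or in-arborescence (reverse=True)."""
    adj = [[] for _ in range(n)]
    for (u, v), w in edges.items():
        if reverse:
            adj[v].append((u, w))
        else:
            adj[u].append((v, w))
    dist, parent = [math.inf] * n, [None] * n
    dist[root] = 0.0
    heap = [(0.0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v], parent[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    tree = {(v, parent[v]) if reverse else (parent[v], v)
            for v in range(n) if parent[v] is not None}
    return dist, tree


def is_thick(n, edges, s, t, k, beta):
    """Definition 2.2: |V^{s,t}| >= n / beta."""
    dist_s, _ = sp_tree(n, edges, s)
    dist_t, _ = sp_tree(n, edges, t, reverse=True)
    bound = k * edges[(s, t)]
    return sum(1 for v in range(n) if dist_s[v] + dist_t[v] <= bound) >= n / beta


def settles(n, edges, chosen, s, t, k):
    """Definition 2.3 for the edge set `chosen`."""
    dist, _ = sp_tree(n, {e: edges[e] for e in chosen}, s)
    return dist[t] <= k * edges[(s, t)]


def sample(n, edges, k, beta, rng):
    """Algorithm 1: union of random shortest-path arborescences."""
    chosen = set()
    for _ in range(math.ceil(beta * math.log(n))):
        v = rng.randrange(n)
        chosen |= sp_tree(n, edges, v, reverse=True)[1]
        chosen |= sp_tree(n, edges, v)[1]
    for (s, t) in edges:  # Step 8: add every thick edge that is still unsettled
        if is_thick(n, edges, s, t, k, beta) and not settles(n, edges, chosen, s, t, k):
            chosen.add((s, t))
    return chosen


def randomized_selection(n, x_hat, rng):
    """Algorithm 2: keep e with probability min(sqrt(n) ln n * x_hat[e], 1)."""
    scale = math.sqrt(n) * math.log(n)
    return {e for e, x in x_hat.items() if rng.random() < min(scale * x, 1.0)}
```

A call such as `sample(n, edges, k, math.sqrt(n), random.Random())` corresponds to Sample($\sqrt{n}$); the per-edge Dijkstra runs are chosen for brevity, not efficiency.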

The following claim shows that if the sum of values assigned by $\{\hat{x}_e\}$ to edges in some $A \subseteq E$ is at least 1 then $E''$ intersects $A$ with high probability.

Claim 2.3. Let $A \subseteq E$. If Algorithm 2 receives a fractional vector $\{\hat{x}_e\}$ with nonnegative entries satisfying $\sum_{e \in A} \hat{x}_e \ge 1$, the probability that it outputs a set $E''$ disjoint from $A$ is at most $\exp(-\sqrt{n} \ln n)$.

Proof. If $A$ contains an edge $e$ such that $\hat{x}_e \ge (\sqrt{n} \ln n)^{-1}$, then $e \in E''$ with probability 1. That is, $E''$ is never disjoint from $A$. Otherwise, for all edges $e \in A$, the probability that $e \in E''$ is exactly $\sqrt{n} \ln n \cdot \hat{x}_e$. The probability that no edges of $A$ are in $E''$ is, therefore,

$$\prod_{e \in A} \left(1 - \sqrt{n} \ln n \cdot \hat{x}_e\right) \le \exp\Big(-\sqrt{n} \ln n \sum_{e \in A} \hat{x}_e\Big) \le \exp(-\sqrt{n} \ln n).$$

The first inequality above follows from the fact that $1 - x \le \exp(-x)$ for $x \ge 0$. The second one holds because $\sum_{e \in A} \hat{x}_e \ge 1$.

2.3. Antispanners

In this section, we introduce antispanners, a graph object used in the description of our LP for Directed $k$-Spanner and crucial in the analysis of the parts of our algorithms that settle thin edges. After giving the definition, we show how to construct minimal antispanners (in Claim 2.4) and give an upper bound on their number (in Claim 2.5).

For a given edge $(s, t)$, we define an antispanner to be a subset of edges of $G$ such that if we remove this subset of edges from $G$, the length of the shortest path from $s$ to $t$ becomes larger than $k \cdot d(s, t)$.

Definition 2.4 (Antispanner). A set $A \subseteq E$ is an antispanner for an edge $(s, t) \in E$ if $(V, E \setminus A)$ contains no path from $s$ to $t$ of length at most $k \cdot d(s, t)$. If no proper subset of an antispanner $A$ for $(s, t)$ is an antispanner for $(s, t)$ then $A$ is minimal. The set of all minimal antispanners for all thin edges is denoted by $\mathcal{A}$.

The edge set of a $k$-spanner of $G$ must intersect all antispanners for all edges of $G$. In other words, it has to be a hitting set for all minimal antispanners. Specifically, a set $E''$ that settles all thin edges must be a hitting set for all minimal antispanners in $\mathcal{A}$. We now prove that if a set $E''$ does not settle some thin edge, then we can efficiently find a minimal antispanner $A \in \mathcal{A}$ disjoint from $E''$.

Claim 2.4. There exists a polynomial time algorithm that, given a set of edges $E'' \subseteq E$ that does not settle some thin edge, outputs a minimal antispanner $A \in \mathcal{A}$ for some thin edge, such that $A \subseteq E \setminus E''$.

Proof. The algorithm first finds a thin edge $(s, t)$ with no directed path from $s$ to $t$ of length at most $k \cdot d(s, t)$ in $E''$. Recall that all paths from $s$ to $t$ of length at most $k \cdot d(s, t)$ in $G$ lie in the local graph $G^{s,t} = (V^{s,t}, E^{s,t})$. (See Definition 2.1.) Therefore, $E^{s,t} \setminus E''$ is an antispanner for $(s, t)$. The algorithm sets $A = E^{s,t} \setminus E''$ and then greedily deletes edges $e$ from $A$ while $A \setminus \{e\}$ is an antispanner, that is, while $(V^{s,t}, E^{s,t} \setminus A)$ contains no paths of length at most $k \cdot d(s, t)$ from $s$ to $t$. When no more such edges can be deleted, the algorithm returns $A$.
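The greedy procedure in the proof of Claim 2.4 can be sketched as below. The names `dist_from` and `minimal_antispanner` and the data layout are hypothetical, and the thin edge $(s, t)$ that is not settled by the given set is assumed to have been found already. The sketch also assumes that membership in $V^{s,t}$ is decided by comparing the sum of shortest distances from $s$ and to $t$ with $k \cdot d(s, t)$. A single pass over the candidate edges suffices because the antispanner property is monotone: every superset of an antispanner is again an antispanner.

```python
import heapq
import math


def dist_from(n, edge_set, lengths, src, reverse=False):
    """Dijkstra distances from src using only edges in edge_set."""
    adj = [[] for _ in range(n)]
    for (u, v) in edge_set:
        if reverse:
            adj[v].append((u, lengths[(u, v)]))
        else:
            adj[u].append((v, lengths[(u, v)]))
    dist = [math.inf] * n
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def minimal_antispanner(n, lengths, chosen, s, t, k):
    """Return a minimal antispanner for (s, t) disjoint from `chosen`.

    Assumes `chosen` does not settle the edge (s, t).
    """
    bound = k * lengths[(s, t)]
    all_edges = set(lengths)
    ds = dist_from(n, all_edges, lengths, s)
    dt = dist_from(n, all_edges, lengths, t, reverse=True)
    local_v = {v for v in range(n) if ds[v] + dt[v] <= bound}
    local_e = {(u, v) for (u, v) in all_edges if u in local_v and v in local_v}

    def is_antispanner(candidate):
        # No s-t path of length <= bound survives in the local graph.
        return dist_from(n, local_e - candidate, lengths, s)[t] > bound

    anti = local_e - set(chosen)
    assert is_antispanner(anti), "chosen already settles (s, t)"
    for e in sorted(anti):  # greedy deletion, one pass
        if is_antispanner(anti - {e}):
            anti = anti - {e}
    return anti
```

On the unit-length graph with edges (0,1), (0,2), (2,1), stretch $k = 2$ and an empty chosen set, the sketch returns the antispanner {(0,1), (2,1)} for the edge (0,1).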

Minimize $\sum_{e \in E} x_e$ (1)
subject to:
  $\sum_{e \in A} x_e \ge 1 \quad \forall A \in \mathcal{A}$ (2)
  $x_e \ge 0 \quad \forall e \in E$ (3)

Figure 1: Linear program for the arbitrary-length case, LP-A. Here $\mathcal{A}$ is the set of all minimal antispanners for thin edges.

Next, we give an upper bound on the number of minimal antispanners for thin edges.

Claim 2.5. $|\mathcal{A}| \le |E| \cdot (n/\beta)^{n/\beta}$. In particular, if $\beta = \sqrt{n}$, then $|\mathcal{A}| \le \sqrt{n}^{\sqrt{n}+4}$.

Proof. Fix a thin edge $(s, t)$ and a minimal antispanner $A$ for $(s, t)$. Let $T_A$ be an out-arborescence (shortest-path tree) rooted at $s$ in the graph $(V^{s,t}, E^{s,t} \setminus A)$. Denote by $d_{T_A}(u)$ the distance from $s$ to $u$ in the tree $T_A$. If $T_A$ contains no directed path from $s$ to $u$, let $d_{T_A}(u) = \infty$. We show that

$$A = \{(u, v) \in E^{s,t} : d_{T_A}(u) + d(u, v) < d_{T_A}(v)\},$$

and thus $T_A$ uniquely determines $A$ for a given thin edge $(s, t)$.

Consider an edge $(u, v) \in A$, and let $A'$ denote $A \setminus \{(u, v)\}$. Since the antispanner $A$ is minimal, the graph $(V, E \setminus A')$ contains a path from $s$ to $t$ of length at most $k \cdot d(s, t)$. This path must lie in $(V^{s,t}, E^{s,t} \setminus A')$ and must contain the edge $(u, v)$. Thus, the distance from $s$ to $t$ in the graph $(V^{s,t}, E^{s,t} \setminus A')$ is at most $k \cdot d(s, t)$ and is strictly less than $d_{T_A}(t)$. Hence, $T_A$ is not a shortest-path tree in the graph $(V^{s,t}, E^{s,t} \setminus A')$. Therefore, $d_{T_A}(u) + d(u, v) < d_{T_A}(v)$.

If $(u, v) \in E^{s,t}$ satisfies the condition $d_{T_A}(u) + d(u, v) < d_{T_A}(v)$, then $(u, v) \notin E^{s,t} \setminus A$; otherwise, $T_A$ would not be a shortest-path tree. Hence, $(u, v) \in A$.

We now count the number of out-arborescences rooted at $s$ in $(V^{s,t}, E^{s,t} \setminus A)$. For every vertex $u \in V^{s,t}$, we may choose the parent vertex in at most $|V^{s,t}|$ possible ways (if a vertex is not reachable from $s$, we choose it as its own parent). Thus, the total number of trees is at most $|V^{s,t}|^{|V^{s,t}|} \le (n/\beta)^{n/\beta}$. Since there are at most $|E|$ thin edges, the claim follows.

2.4. LP, Separation Oracle and Overall Algorithm

In this section, we describe a randomized algorithm for constructing a small subset of edges $E'' \subseteq E$ that settles all thin edges. First, we formulate an LP relaxation of this problem.
Then we describe how to solve it using the ellipsoid method with a separation oracle (Section 2.4.1). Finally, in Section 2.4.2, we summarize the resulting algorithm for Directed $k$-Spanner and complete the proof of Theorem 2.1.

A set $E''$ that settles all thin edges must intersect all minimal antispanners for all thin edges. This condition can be expressed using linear program LP-A (see Fig. 1). LP-A has a variable $x_e$ for each edge $e \in E$ and a constraint (2) for each minimal antispanner $A$ for every thin edge. Recall that $\mathcal{A}$ is the set of all minimal antispanners for thin edges. In the integral solution $\{x_e^{int}\}$ corresponding to a $k$-spanner with an edge set $E'' \subseteq E$, we set $x_e^{int} = 1$ if $e \in E''$ and $x_e^{int} = 0$ otherwise. All constraints in (2) are satisfied for $\{x_e^{int}\}$ since $E''$ intersects every antispanner. The value of the objective function $\sum_e x_e^{int}$ is equal to the size of $E''$. Hence, LP-A is a relaxation of Directed $k$-Spanner.

For ease of presentation, we assume that we have guessed OPT, the size of the sparsest spanner. (We can try all values in $\{n - 1, \ldots, n^2\}$ for OPT and output the sparsest spanner found in all iterations.) We replace the objective function (1) with the constraint

$$\sum_{e \in E} x_e \le \mathrm{OPT}, \qquad (4)$$

and from now on refer to the resulting linear program as LP-A.

2.4.1. Separation Oracle

LP-A has a polynomial number of variables and, by Claim 2.5, a number of constraints that is exponential in $\tilde{O}(\sqrt{n})$. We solve it using the ellipsoid algorithm with a separation oracle. Our separation oracle receives a fractional vector $\{\hat{x}_e\}$ satisfying (3) and (4). If $\{\hat{x}_e\}$ is a feasible solution to LP-A, then the separation oracle outputs a set $E''$ of size at most $2\,\mathrm{OPT} \cdot \sqrt{n} \ln n$, which settles thin edges. Otherwise, it outputs either a set $E''$ with the same guarantee or a violated constraint from (2) for some antispanner $A$. The separation oracle can also fail with small probability. If this happens during an execution of the ellipsoid algorithm, we output the input graph with all its edges as a $k$-spanner.

Algorithm 3 SeparationOracle($\hat{x}_e$)
 1: //Sample a random set of edges $E''$, picking each $e \in E$
    //with probability $\min(\hat{x}_e \sqrt{n} \ln n, 1)$ (see Algorithm 2).
    $E'' \leftarrow$ RandomizedSelection($\hat{x}_e$)
 2: if $E''$ settles all thin edges then
 3:   if $|E''| \le 2\,\mathrm{OPT} \cdot \sqrt{n} \ln n$ then return $E''$;
 4:   else fail;
 5: else
 6:   Find an antispanner $A \subseteq E \setminus E''$ from $\mathcal{A}$ using the algorithm from Claim 2.4.
 7:   if $\sum_{e \in A} \hat{x}_e < 1$ then return violated constraint $\sum_{e \in A} x_e \ge 1$;
 8:   else fail.
 9: end if

The separation oracle is described in Algorithm 3. Next we analyze the probability that the separation oracle fails.

Lemma 2.6. The probability that the separation oracle fails during an execution of the ellipsoid algorithm is exponentially small in $n$.

Proof.
The separation oracle can fail for two reasons:

1. The size of the sampled set $E''$ is too large.
2. The minimal antispanner $A$ found by the oracle does not correspond to a violated constraint.

To analyze the probability of the first event, note that the expected size of $E''$ is at most $\sqrt{n} \ln n \cdot \sum_{e \in E} \hat{x}_e \le \mathrm{OPT} \cdot \sqrt{n} \ln n$. By the Chernoff bound, $\Pr[|E''| > 2\,\mathrm{OPT} \cdot \sqrt{n} \ln n] \le \exp(-c \cdot \mathrm{OPT} \cdot \sqrt{n} \ln n) = \exp(-\Omega(n \sqrt{n} \ln n))$. Thus, the probability that the separation oracle fails because $|E''| > 2\,\mathrm{OPT} \cdot \sqrt{n} \ln n$ is exponentially small in $n$.

To analyze the probability of the second event, consider one call to the separation oracle. Fix a minimal antispanner $A$ satisfying $\sum_{e \in A} \hat{x}_e \ge 1$. Claim 2.3 shows that the probability that $E''$ is disjoint from $A$ is at most $\exp(-\sqrt{n} \ln n)$. Claim 2.5 demonstrates that $|\mathcal{A}| \le \sqrt{n}^{\sqrt{n}+4}$. Therefore, by a union bound, the probability that there is a minimal antispanner $A \in \mathcal{A}$ satisfying $\sum_{e \in A} \hat{x}_e \ge 1$ and also disjoint from $E''$ is at most

$$\sqrt{n}^{\sqrt{n}+4} \cdot \exp(-\sqrt{n} \ln n) = \exp\left(-\tfrac{1}{2} \sqrt{n} \ln n + 2 \ln n\right).$$

Thus, the probability that the separation oracle fails during one call because $\sum_{e \in A} \hat{x}_e \ge 1$ is exponentially small in $n$.

Since the number of iterations of the ellipsoid algorithm is polynomial in $n$, a union bound over all iterations gives that the overall probability that the separation oracle fails during an execution of the ellipsoid algorithm is exponentially small in $n$.

Lemma 2.6 implies, in particular, that when the separation oracle is given a feasible solution to LP-A, it fails to output a set $E''$ only with exponentially small probability. Since $E''$ is obtained by running Algorithm 2, we obtain the following corollary that will be used in Section 3.

Corollary 2.7. Given a feasible solution to LP-A, Algorithm 2 with all but exponentially small probability produces a set $E''$ that settles thin edges and has size at most $2\,\mathrm{OPT} \cdot \sqrt{n} \ln n$.

2.4.2. Overall Algorithm for Directed $k$-Spanner

Proof of Theorem 2.1. We settle thick edges by running Sample($\sqrt{n}$), according to Lemma 2.2. We settle thin edges by running the ellipsoid algorithm as described in the beginning of Section 2.4 and in Section 2.4.1.
If the separation oracle fails, which, by Lemma 2.6, happens with exponentially small probability, we output a spanner containing all edges $E$. Thus, the expected size of the set $E''$ is at most $2\,\mathrm{OPT} \cdot \sqrt{n} \ln n + o(1)$, and the resulting approximation ratio of the algorithm is $O(\sqrt{n} \ln n)$. The ellipsoid algorithm terminates in polynomial time, so the overall running time is polynomial.

3. LP and Rounding for Graphs with Unit-Length Edges

In this section, we describe how to settle all thin edges, and thus prove Theorem 2.1, for the case of unit-length edges. Our motivation for presenting this special case is two-fold. First, we show that for the unit-length case, one can directly formulate a polynomial-sized LP relaxation, and this makes the approximation algorithm more efficient. Second, we also use the LP from this section to present a better algorithm for 3-spanners in Section 4.

Our LP for the case of unit lengths, LP-U, is stated in terms of local layered graphs, which we introduce next.

Definition 3.1 (Layered expansion). Given a directed graph $G = (V, E)$, its layered expansion is a directed graph $\bar{G} = (\bar{V}, \bar{E})$, satisfying the following:

1. Let $\bar{V} = \{v_i : v \in V \text{ and } i \in \mathbb{Z}_{\ge 0}\}$, where $v_i$ denotes the $i$-th copy of $v$. The set of all the $i$-th copies of nodes in $V$ is the $i$-th layer of $\bar{V}$.
2. Let $L = \{(u, u) : u \in V\}$ be the set of loops. Define the $i$-th copy of an edge $e = (u, v)$ to be $e_i = (u_i, v_{i+1})$, and the $i$-th copy of a loop $e = (u, u)$ to be $e_i = (u_i, u_{i+1})$. Let $\bar{E} = \{e_i : e \in E \cup L \text{ and } i \in \mathbb{Z}_{\ge 0}\}$.

Layered expansion $\bar{G}$ contains a path from $s_0$ to $t_k$ if and only if $G$ contains a path from $s$ to $t$ of length at most $k$.

A local layered graph for a thin edge $(s, t)$ is defined next. It consists of all paths in the layered expansion $\bar{G}$ that correspond to paths from $s$ to $t$ of length at most $k$ in the original graph $G$ or, in other words, to paths in the local graph $G^{s,t}$, defined in Definition 2.1.

Definition 3.2 (Local layered graph). For a thin edge $(s, t)$ and $k \ge 1$, the local layered graph is a subgraph $\bar{G}^{s,t} = (\bar{V}^{s,t}, \bar{E}^{s,t})$ of $\bar{G}$ with a source $\bar{s} = s_0$ and a sink $\bar{t} = t_k$, such that $\bar{G}^{s,t}$ contains all nodes and edges on paths from $\bar{s}$ to $\bar{t}$.

Our algorithm solves the linear program LP-U defined in Figure 2. Recall that $\mathcal{E}$ denotes the set of thin edges. LP-U has variables of two types: $x_e$, where $e \in E$, and $f^{s,t}_{e_i}$, where $(s, t) \in \mathcal{E}$ and $e_i \in \bar{E}^{s,t}$. A variable $x_e$ represents whether the edge $e$ is included in the $k$-spanner. We think of a path from $s$ to $t$ of length at most $k$ in $G$ as a unit flow from $\bar{s}$ to $\bar{t}$ in $\bar{G}^{s,t}$. A variable $f^{s,t}_{e_i}$ represents flow along the edge $e_i$ in $\bar{G}^{s,t}$. We denote the sets of incoming and outgoing edges for a vertex $v_i \in \bar{G}^{s,t}$ by $\mathrm{In}(v_i)$ and $\mathrm{Out}(v_i)$, respectively.
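Definitions 3.1 and 3.2 translate into a short construction. The Python sketch below is illustrative only: the function names are hypothetical, a layered node $v_i$ is represented as the pair `(v, i)`, and the expansion is truncated to layers $0, \ldots, k$, which is an assumption of the sketch (the definition uses all nonnegative $i$, but only these layers can lie on a path from $s_0$ to $t_k$).

```python
from collections import defaultdict, deque


def layered_expansion(vertices, edges, k):
    """Arcs between layers 0..k; loops (u, u) let a path wait at a vertex."""
    arcs = set()
    for i in range(k):
        for (u, v) in edges:
            arcs.add(((u, i), (v, i + 1)))   # i-th copy of edge (u, v)
        for u in vertices:
            arcs.add(((u, i), (u, i + 1)))   # i-th copy of loop (u, u)
    return arcs


def reachable(arcs, source):
    """Set of nodes reachable from source by breadth-first search."""
    succ = defaultdict(list)
    for a, b in arcs:
        succ[a].append(b)
    seen = {source}
    queue = deque([source])
    while queue:
        a = queue.popleft()
        for b in succ[a]:
            if b not in seen:
                seen.add(b)
                queue.append(b)
    return seen


def local_layered_graph(vertices, edges, s, t, k):
    """Nodes and arcs lying on some path from s_0 to t_k (Definition 3.2)."""
    arcs = layered_expansion(vertices, edges, k)
    forward = reachable(arcs, (s, 0))
    backward = reachable({(b, a) for (a, b) in arcs}, (t, k))
    nodes = forward & backward
    return nodes, {(a, b) for (a, b) in arcs if a in nodes and b in nodes}
```

If $G$ has no path from $s$ to $t$ with at most $k$ edges, the returned node set is empty, which matches the equivalence stated after Definition 3.1.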
Given $\{\hat{x}_e\}$, a fractional solution of LP-U, we construct the set $E''$ by first running Algorithm 2 and then adding all unsettled thin edges.

Lemma 3.1. The algorithm described above, in polynomial time, computes a set $E''$ that settles all thin edges and has expected size at most $2 \sqrt{n} \ln n \cdot \mathrm{OPT} + o(1)$.

Proof. We prove, in Claim 3.2, that in a fractional optimal solution $\{\hat{x}_e\} \cup \{\hat{f}^{s,t}_{e_i}\}$ to LP-U, the vector $\{\hat{x}_e\}$ is a fractional solution to LP-A. Then we apply Corollary 2.7 to get the desired bound on the expected size of $E''$. At the end, we argue that the algorithm runs in polynomial time.

Claim 3.2. In a fractional optimal solution $\{\hat{x}_e\} \cup \{\hat{f}^{s,t}_{e_i}\}$ to LP-U, the vector $\{\hat{x}_e\}$ is a fractional solution to LP-A.

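Minimize $\sum_{e \in E} x_e$ subject to:

  Flow requirement: $\sum_{e_0 \in \mathrm{Out}(s_0)} f^{s,t}_{e_0} \ge 1 \quad \forall (s, t) \in \mathcal{E}$
  Flow conservation: $\sum_{e_{i-1} \in \mathrm{In}(v_i)} f^{s,t}_{e_{i-1}} - \sum_{e_i \in \mathrm{Out}(v_i)} f^{s,t}_{e_i} = 0 \quad \forall (s, t) \in \mathcal{E},\ \forall v_i \in \bar{V}^{s,t} \setminus \{\bar{s}, \bar{t}\}$
  Capacity constraints: $x_e - \sum_{i=0}^{k-1} f^{s,t}_{e_i} \ge 0 \quad \forall (s, t) \in \mathcal{E},\ \forall e \in E$
  $x_e \ge 0 \quad \forall e \in E$
  $f^{s,t}_{e_i} \ge 0 \quad \forall (s, t) \in \mathcal{E},\ \forall e_i \in \bar{E}^{s,t}$

Figure 2: Linear program for the unit-length case, LP-U.

Proof. First, we argue that LP-U is a relaxation of Directed $k$-Spanner for the unit-length case or, in other words, that an optimal solution to this program has value at most OPT. Let $H$ be a sparsest $k$-spanner of $G$. Assign $x_e = 1$ if $e$ is in $H$ and $x_e = 0$ otherwise. For each thin edge $(s, t)$, consider a simple path from $s$ to $t$ in $H$ of length $\ell$, where $\ell \le k$. Set $f^{s,t}_{e_i}$ to 1 if either $e$ is the $i$-th edge on that path or $i \in \{\ell + 1, \ldots, k\}$ and $e_i = (t_{i-1}, t_i)$; otherwise, set it to 0. Since the resulting assignment is a feasible solution to LP-U, the optimal solution to this program has value $\sum_{e \in E} \hat{x}_e \le \mathrm{OPT}$.

Next, we argue that if $\{\hat{x}_e\} \cup \{\hat{f}^{s,t}_{e_i}\}$ is a feasible solution to LP-U then $\{\hat{x}_e\}$ satisfies the antispanner constraints for LP-A, given in (2). Consider a thin edge $(s, t)$ and a minimal antispanner $A \in \mathcal{A}$ for $(s, t)$. Let $\bar{A} = \{e_i : e \in A \text{ and } e_i \in \bar{E}^{s,t}\}$ be the set of copies of the edges in $A$ in the local layered graph. Let $\bar{S} \subseteq \bar{V}^{s,t}$ be the set of nodes that can be reached from $\bar{s}$ in $(\bar{V}^{s,t}, \bar{E}^{s,t} \setminus \bar{A})$ and $\bar{T} = \bar{V}^{s,t} \setminus \bar{S}$ be the set of the remaining nodes. Since $A$ is an antispanner for $(s, t)$, node $\bar{t}$ is in $\bar{T}$, and thus $(\bar{S}, \bar{T})$ is an $(\bar{s}, \bar{t})$ cut in $\bar{G}^{s,t}$. Note that only edges from $\bar{A}$ can cross the cut because for an edge $(u_i, v_{i+1}) \notin \bar{A}$, if $u_i$ is reachable from $\bar{s}$ then so is $v_{i+1}$. For a fractional solution $\{\hat{x}_e\} \cup \{\hat{f}^{s,t}_{e_i}\}$ to LP-U,

$$\sum_{e \in A} \hat{x}_e \ \ge\ \sum_{e \in A} \sum_{i=0}^{k-1} \hat{f}^{s,t}_{e_i} \ =\ \sum_{e_i \in \bar{A}} \hat{f}^{s,t}_{e_i} \ \ge\ \sum_{e_i \in \mathrm{cut}(\bar{S}, \bar{T})} \hat{f}^{s,t}_{e_i} \ =\ \sum_{e_0 \in \mathrm{Out}(s_0)} \hat{f}^{s,t}_{e_0} \ \ge\ 1. \qquad (5)$$

The first inequality above follows from the capacity constraints in LP-U, the following equality holds by definition of $\bar{A}$, the second inequality holds because $\bar{A}$ contains the edges in the cut $(\bar{S}, \bar{T})$, the last equality follows from the flow conservation, and the last inequality is the flow requirement.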
We proved that in a fractional optimal solution $\{\hat{x}_e\} \cup \{\hat{f}^{s,t}_{e_i}\}$ to LP-U, the vector $\{\hat{x}_e\}$ satisfies constraints (2) and (4) of LP-A. Since constraints (3) are also in LP-U, vector $\{\hat{x}_e\}$ is a fractional solution to LP-A.
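The first half of the proof of Claim 3.2 (a spanner path yields a feasible integral flow) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the representation of a layered arc as a pair of `(vertex, layer)` nodes, and in particular the 0-based convention that the edge leaving the $i$-th vertex of the path is routed between layers $i$ and $i+1$, with the remaining layers filled by loop copies at $t$. Loops are not edges of $E$, so the sketch treats them as uncapacitated and assumes $G$ itself has no self-loops.

```python
from collections import defaultdict


def path_to_flow(path, k):
    """Unit flow in the layered graph induced by a path s = p_0, ..., p_l = t."""
    ell = len(path) - 1
    assert ell <= k, "path is longer than the stretch allows"
    flow = {}
    for i in range(ell):
        flow[((path[i], i), (path[i + 1], i + 1))] = 1
    t = path[-1]
    for i in range(ell, k):
        flow[((t, i), (t, i + 1))] = 1  # wait at t using loop copies
    return flow


def satisfies_lp_u(flow, x, s, t, k):
    """Check the LP-U constraints for one pair (s, t) and integral x."""
    # Flow requirement: at least one unit leaves s_0.
    if sum(f for (a, _), f in flow.items() if a == (s, 0)) < 1:
        return False
    # Flow conservation at every node other than s_0 and t_k.
    balance = defaultdict(int)
    for (a, b), f in flow.items():
        balance[b] += f
        balance[a] -= f
    if any(bal != 0 for node, bal in balance.items()
           if node not in ((s, 0), (t, k))):
        return False
    # Capacity: total flow over all copies of e is at most x_e.
    used = defaultdict(int)
    for ((u, _), (v, _)), f in flow.items():
        if u != v:  # skip loop copies
            used[(u, v)] += f
    return all(x.get(e, 0) >= total for e, total in used.items())
```

For the path 0, 1, 2 with $k = 3$, the flow uses the copies of (0,1) and (1,2) in layers 0 and 1 and then one loop copy at vertex 2, so it is feasible exactly when both path edges have $x_e = 1$.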

By Claim 3.2, vector $\{\hat{x}_e\}$ is a fractional solution to LP-A. Corollary 2.7 says that, given such a solution, Algorithm 2 with all but exponentially small probability produces a set $E''$ that settles thin edges and has size at most $2\,\mathrm{OPT} \cdot \sqrt{n} \ln n$. After we add all unsettled thin edges, the expected size of the resulting set $E''$ is at most $2\,\mathrm{OPT} \cdot \sqrt{n} \ln n + o(1)$.

It remains to argue that the described algorithm takes polynomial time. To write down LP-U, we only need to know $V$, $E$, $k$ and the set of thin edges, $\mathcal{E}$. The first three are inputs to the algorithm, and $\mathcal{E}$ can be computed in polynomial time. LP-U can be written down and solved in polynomial time because it has $O(|E|^2 \cdot k) = O(n^5)$ variables and constraints.

Proof of Theorem 2.1 for the case of unit lengths. We run Algorithm 1 to get $E'$. We construct $E''$ by running Algorithm 2 and adding all unsettled thin edges. Let the edge set of our $k$-spanner be $E' \cup E''$. By Lemmas 2.2 and 3.1, $E'$ settles all thick edges, $E''$ settles all thin edges, the expected size of $E' \cup E''$ is $O(\sqrt{n} \ln n \cdot \mathrm{OPT})$, and the resulting algorithm runs in polynomial time, as required.

4. An $\tilde{O}(n^{1/3})$-Approximation for Directed 3-Spanner with Unit-Length Edges

In this section, we show an improved approximation for the special case of Directed 3-Spanner with unit-length edges. The algorithm follows the general strategy explained in Section 2. The LP rounding scheme here is different from that presented in Section 2.2 and used in the two algorithms for Directed $k$-Spanner in Sections 2 and 3.

We note that Algorithm 2 from [DK11] with $\rho = \Theta(n^{1/3})$ could also be used to prove our result. The rounding scheme we present is simpler and allows for simpler analysis. As in [DK11], we use random variables for vertices instead of edges to guide the edge selection process. Intuitively, this allows us to introduce positive correlations in the selection of edges adjacent to the same vertex. Because the correlations are local, the improvement in approximation deteriorates for larger values of $k$. To simplify the analysis, instead of threshold rounding (as in the previous sections) we use Poisson random variables.
Theorem 4.1. There is a polynomial time randomized algorithm for Directed 3-Spanner for graphs with unit edge lengths with expected approximation ratio $O(n^{1/3} \log n)$.

Proof. We define thick and thin edges as in Definition 2.2, with $\beta = n^{1/3}$, and run Sample($n^{1/3}$). By Lemma 2.2, the resulting edge set $E'$ settles all thick edges and has expected size at most $3 n^{1/3} \ln n \cdot \mathrm{OPT}$.

Then we obtain an optimal solution $\{x_e\} \cup \{f^{s,t}_{e_i}\}$ of the linear program LP-U from Fig. 2. Our rounding scheme is stated in Algorithm 4. It consists of two stages: first, we round $\{x_e\}$ to obtain a new solution $\{\hat{x}_e\}$, where every assignment $\hat{x}_e$ is an integer multiple of $n^{-2/3}$; second, we round $\{\hat{x}_e\}$ to obtain an edge set $E''$ that settles all thin edges with high probability.

In the first step we sample a random variable from the Poisson distribution for every edge. Recall that a Poisson random variable $X$ with mean $\lambda$ is supported on the nonnegative integers and has probability mass function
\[
\Pr[X = k] = \frac{\lambda^k e^{-\lambda}}{k!}, \qquad k \in \mathbb{Z}_{\ge 0}.
\]
The only properties of the Poisson distribution that we use in the analysis are the concentration bound stated in the Appendix, integrality of the support, and the fact that the sum of independent Poisson random variables is again a Poisson random variable.

Algorithm 4 Randomized3SpannerSelection($x_e$)
 1: $E'' \leftarrow \emptyset$;
    // Obtain a new solution $\{\hat{x}_e\}$, where each coordinate $\hat{x}_e$ is a multiple of $n^{-2/3}$:
 2: for each edge $e \in E$ do
 3:    $P_e \leftarrow$ sample from the Poisson distribution with mean $\lambda_e = 6n^{2/3} x_e$;
 4:    $\hat{x}_e \leftarrow P_e\, n^{-2/3}$;
 5: end for
    // Round $\{\hat{x}_e\}$ to get $E''$:
 6: for each vertex $u \in V$ do
 7:    $r_u \leftarrow$ uniform sample from $(0, 1)$;
 8: end for
 9: for each edge $e = (u, v) \in E$ do
10:    if $\min(r_u, r_v) \le \hat{x}_e\,\alpha n^{1/3}\ln n$ then add $e$ to $E''$;
11:    // $\alpha > 1$ is an absolute constant
12: end for
13: return $E''$.

Lemma 4.2 below analyzes the first stage. Then Lemmas 4.3 and 4.4 analyze the set $E''$ produced by the second stage. Lemma 4.3 bounds the expected size of $E''$ by $O(\mathrm{OPT}\cdot n^{1/3}\ln n)$. Lemma 4.4 shows that $E''$ settles a given thin edge with probability at least $1 - 1/n$. Consequently, the expected number of unsettled thin edges is at most $|E|/n \le n - 1 \le \mathrm{OPT}$, and they can be added to the solution without affecting the approximation ratio. This completes the proof of Theorem 4.1.

It remains to prove the lemmas that were used in the proof of Theorem 4.1. Recall that $\bar{s}$ and $\bar{t}$ denote the source and the sink of the local layered graph of an edge $(s, t)$, as in Definition 3.2.

Lemma 4.2. Given a feasible solution $\{x_e\} \cup \{f\}$ of LP-U of cost LP, Algorithm 4 on lines 2-5 computes a vector $\{\hat{x}_e\}$ of cost at most $20\,\mathrm{LP}$ (i.e., satisfying $\sum_e \hat{x}_e \le 20\,\mathrm{LP}$) such that all $\hat{x}_e$ are integer multiples of $n^{-2/3}$. Moreover, for every thin edge $(s, t)$ and cut $(\bar{S}, \bar{T})$ in the local layered graph $\bar{G}_{s,t}$ with $\bar{s}, s_1 \in \bar{S}$ and $t_2, \bar{t} \in \bar{T}$, the vector $\{\hat{x}_e\}$ satisfies
\[
\sum_{(u,v) \in E_{s,t}\,:\,(u_i, v_{i+1}) \in \bar{S} \times \bar{T}} \hat{x}_{(u,v)} \ge 1. \tag{6}
\]

This stage of the algorithm succeeds with probability $1 - \exp(-cn^{2/3})$ for some constant $c > 0$.

Proof. For every edge $e$, we independently sample a Poisson random variable $P_e$ with mean $\lambda_e = 6n^{2/3} x_e$, and set $\hat{x}_e = P_e\, n^{-2/3}$. Since the support of the Poisson distribution is the nonnegative integers, all $\hat{x}_e$ are integer multiples of $n^{-2/3}$. We need to verify that $\{\hat{x}_e\}$ satisfies (6) and that its cost is bounded by $20\,\mathrm{LP}$.

Fix a thin edge $(s, t)$ and a cut $(\bar{S}, \bar{T})$ in $\bar{G}_{s,t}$ with $\bar{s}, s_1 \in \bar{S}$ and $t_2, \bar{t} \in \bar{T}$. Let $A = \{(u, v) \in E_{s,t} : (u_i, v_{i+1}) \in \bar{S} \times \bar{T}\}$. We will show that $A$ is an antispanner for $(s, t)$. For every path $p = s \to u \to v \to t$ of length 3 in $G_{s,t}$, one of the edges on the path $\bar{s} \to u_1 \to v_2 \to \bar{t}$ crosses the cut $(\bar{S}, \bar{T})$ and, consequently, one of the edges of $p$ belongs to $A$. Similarly, for every path $p = s \to u \to t$ of length 2 (respectively, path $p = s \to t$ of length 1), one of the edges on the path $\bar{s} \to u_1 \to t_2 \to \bar{t}$ (respectively, path $\bar{s} \to s_1 \to t_2 \to \bar{t}$) crosses the cut $(\bar{S}, \bar{T})$, and one of the edges of $p$ belongs to $A$. Therefore, $A$ is an antispanner for $(s, t)$.

By Claim 3.2, if $\{x_e\} \cup \{f\}$ is a feasible solution to LP-U then $\{x_e\}$ satisfies the antispanner constraints for LP-A, given in (2). That is, $\sum_{e \in A} x_e \ge 1$. Next, we bound $\sum_{e \in A} \hat{x}_e = n^{-2/3} \sum_{e \in A} P_e$. The sum $\sum_{e \in A} P_e$ is distributed as a Poisson random variable with mean $\lambda_A = \sum_{e \in A} \lambda_e \ge 6n^{2/3}$. By Lemma A.1 in the Appendix,
\[
\Pr\Big[\sum_{e \in A} \hat{x}_e < 1\Big] = \Pr\Big[\sum_{e \in A} P_e < n^{2/3}\Big] \le \Pr\Big[\sum_{e \in A} P_e \le \frac{6}{e}\, n^{2/3}\Big] \le \exp(-6n^{2/3}/4).
\]
Since $(s, t)$ is a thin edge, $|\bar{V}_{s,t} \setminus \{\bar{s}, \bar{t}\}| \le 2n^{2/3}$, and the number of cuts $(\bar{S}, \bar{T})$ in $\bar{V}_{s,t}$ separating $\bar{s}$ and $\bar{t}$ is at most $2^{2n^{2/3}} = \exp(\ln 4 \cdot n^{2/3})$. Hence, by a union bound, (6) holds for all such cuts simultaneously with probability at least $1 - e^{-cn^{2/3}}$, where $c = 6/4 - \ln 4 > 0$. By another union bound, the previous statement holds for all thin edges $(s, t)$ simultaneously if the constant $c$ is replaced by $c/2 = \frac{1}{2}(6/4 - \ln 4)$.

Finally, observe that the cost of $\{\hat{x}_e\}$ is $n^{-2/3} \sum_{e \in E} P_e$. The sum $\sum_{e \in E} P_e$ is a Poisson random variable with mean $6n^{2/3} \sum_{e \in E} x_e = 6n^{2/3}\,\mathrm{LP}$. By Lemma A.1,
\[
\Pr\Big[\sum_{e \in E} P_e \ge 20 n^{2/3}\,\mathrm{LP}\Big] \le \Pr\Big[\sum_{e \in E} P_e \ge 6e \cdot n^{2/3}\,\mathrm{LP}\Big] \le \exp(-6n^{2/3}\,\mathrm{LP}) \le \exp(-6n^{2/3}).
\]
Thus, the probability that the cost of $\{\hat{x}_e\}$ exceeds $20\,\mathrm{LP}$ is exponentially small.

Lemma 4.3 (Analog of Lemma 4.1 in [DK11]). $\mathbb{E}[|E''|] = O(\mathrm{OPT}\cdot n^{1/3}\ln n)$.

Proof. By a union bound, the probability that an edge $e$ belongs to $E''$ is at most $2\hat{x}_e\,\alpha n^{1/3}\ln n$. Therefore, since $\alpha$ is a constant,
\[
\mathbb{E}[|E''|] \le \sum_{e \in E} 2\hat{x}_e\,\alpha n^{1/3}\ln n = O(\mathrm{OPT}\cdot n^{1/3}\ln n).
\]

Lemma 4.4 (Analog of Lemma 4.2 in [DK11]). If $(s, t)$ is a thin edge for which condition (6) holds, then $E''$ settles $(s, t)$ with probability at least $1 - 1/n$.

Proof. Fix a thin edge $(s, t)$. Let $\bar{E}'' = \{(u_i, v_{i+1}) \in \bar{E}_{s,t} : (u, v) \in E''\} \cup \{(\bar{s}, s_1), (t_2, \bar{t})\}$ be the set of copies of the edges in $E''$ in the local layered graph $\bar{G}_{s,t}$. We show that, with probability at least $1 - 1/n$, there is a path from $\bar{s}$ to $\bar{t}$ in $(\bar{V}_{s,t}, \bar{E}'')$ and, consequently, the edge $(s, t)$ is settled.

Let $\{\hat{f}_e\}$ be the maximum flow from $\bar{s}$ to $\bar{t}$ in the graph $\bar{G}_{s,t}$ with capacities $\hat{x}_e$ set to $\hat{x}_{(u,v)}$ on edges $e = (u_i, v_{i+1})$ for $u \ne v$, infinite capacities ($\hat{x}_e = \infty$) on edges $e \in \{(\bar{s}, s_1), (t_2, \bar{t})\}$, and zero capacities ($\hat{x}_e = 0$) on edges $e = (u_i, u_{i+1})$ for $e \notin \{(\bar{s}, s_1), (t_2, \bar{t})\}$. Note that this flow may be different from the flow $\{f\}$ obtained by LP-U. By (6), the capacity of the minimum cut between $\bar{s}$ and $\bar{t}$ is at least 1. Thus, the value of the flow $\{\hat{f}_e\}$ is at least 1.

In the simplest case, the flow is routed along $n^{2/3}$ disjoint paths of capacity $n^{-2/3}$ each. The probability that a given path $\bar{s} \to u_1 \to v_2 \to \bar{t}$ belongs to $(\bar{V}_{s,t}, \bar{E}'')$ is at least
\[
\Pr[\,r_u \le \alpha n^{-1/3}\ln n \text{ and } r_v \le \alpha n^{-1/3}\ln n\,] \ge (\alpha n^{-1/3}\ln n)^2.
\]
The probability that at least one path belongs to $\bar{E}''$ is at least $1 - (1 - \alpha^2 n^{-2/3}\ln^2 n)^{n^{2/3}} > 1 - 1/n$. In the general case, however, we need a more involved analysis.

To analyze the general case, we partition the set $V_{s,t}$ into two disjoint sets $S$ and $T$ such that at least $1/4$ units of the flow $\{\hat{f}_e\}$ are routed along the paths $\bar{s} \to u_1 \to v_2 \to \bar{t}$, where $u \in S$ and $v \in T$. To see that such a partition exists, randomly add every vertex in $V_{s,t} \setminus \{s, t\}$ to $S$ or $T$ with probability $1/2$. Add $s$ to $S$ and $t$ to $T$. Then for every path $\bar{s} \to u_1 \to v_2 \to \bar{t}$ (where $u \ne v$), $\Pr[u \in S \text{ and } v \in T] \ge 1/4$, so the expected contribution of every path to the new flow is at least $1/4$ of the original flow over the path. Because the total new flow from $\bar{s}$ to $\bar{t}$ can be represented as a sum of flows over such paths, the expected flow routed from $\bar{s}$ to $\bar{t}$ through $S$ and $T$ (as described above) is at least $1/4$. That is, for at least one partition $(S, T)$ the flow is at least $1/4$. Fix this partition.
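The averaging argument for the existence of the partition $(S, T)$ can be checked on a toy instance. The sketch below is an illustration under simplifying assumptions, not part of the paper: the flow is given as a hypothetical dictionary `path_flow` of values on paths $\bar{s} \to u_1 \to v_2 \to \bar{t}$ with $u \ne v$, the special treatment of $s$ and $t$ is omitted, and partitions are enumerated by brute force (exponential time, for demonstration only).

```python
from itertools import product


def best_partition(path_flow):
    """path_flow maps (u, v), u != v, to the flow on the path s -> u_1 -> v_2 -> t.

    Returns the best side S, the flow it retains, and the average retained
    flow over all partitions (the expectation under a uniform random partition).
    """
    vertices = sorted({w for path in path_flow for w in path})
    best_side, best_value, total = None, -1.0, 0.0
    for bits in product((0, 1), repeat=len(vertices)):
        side_s = {w for w, b in zip(vertices, bits) if b == 0}
        # A path survives only if u landed in S and v landed in T.
        value = sum(f for (u, v), f in path_flow.items()
                    if u in side_s and v not in side_s)
        total += value
        if value > best_value:
            best_side, best_value = side_s, value
    return best_side, best_value, total / 2 ** len(vertices)


flows = {("a", "b"): 0.25, ("b", "a"): 0.25, ("a", "c"): 0.25, ("c", "b"): 0.25}
side_s, retained, average = best_partition(flows)
```

On this instance the total flow is 1, the average retained flow is exactly $1/4$, and the best partition retains $1/2$, consistent with the claim that some partition retains at least the average.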
Let $\{f_e\}$ be the maximum flow in $\bar{G}_{s,t}$ (with the same capacities as above) routed from $\bar{s}$ to $\bar{t}$ through $S$ and $T$, such that all $f_e$ are multiples of $n^{-2/3}$. Such a flow exists because all capacities are multiples of $n^{-2/3}$. Observe that $(u_i, v_{i+1}) \in \bar{E}''$ if $\min(r_u, r_v) \le \hat{x}_{(u,v)}\,\alpha n^{1/3}\ln n$ and, consequently, also if $\min(r_u, r_v) \le f_{(u_i,v_{i+1})}\,\alpha n^{1/3}\ln n$, since $f_{(u_i,v_{i+1})} \le \hat{x}_{(u,v)}$. Consider the following two cases:

1. $f_{(\bar{s},u_1)} \ge n^{-1/3}$ for some vertex $u \in S$.
2. $f_{(\bar{s},u_1)} < n^{-1/3}$ for all vertices $u \in S$.

Case 1. Fix a vertex $u \in S$ for which $f_{(\bar{s},u_1)} \ge n^{-1/3}$. We will show that with probability at least $1 - 1/n$ there is a path from $\bar{s}$ to $\bar{t}$ via $u_1$ in $(\bar{V}_{s,t}, \bar{E}'')$. The edge $(\bar{s}, u_1)$ always belongs to $\bar{E}''$ because
\[
\alpha\,\hat{x}_{(\bar{s},u_1)}\, n^{1/3}\ln n \ge \alpha f_{(\bar{s},u_1)}\, n^{1/3}\ln n > 1 \ge r_u.
\]
Consider an arbitrary path $u_1 \to v_2 \to \bar{t}$. Note that $f_{(v_2,\bar{t})} \ge f_{(u_1,v_2)}$, since all flow from $u_1$ to $v_2$ must be routed to $\bar{t}$ along the edge $(v_2, \bar{t})$. Thus, if $r_v \le \alpha f_{(u_1,v_2)}\, n^{1/3}\ln n$, then $(u_1, v_2) \in \bar{E}''$ and $(v_2, \bar{t}) \in \bar{E}''$. Therefore, if there is no path from $u_1$ to $\bar{t}$ in $\bar{E}''$,

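then $r_v > \alpha f_{(u_1,v_2)}\, n^{1/3}\ln n$ for all $v \in T$. This happens with probability at most
\[
\prod_{v \in T} \max\big(1 - \alpha f_{(u_1,v_2)}\, n^{1/3}\ln n,\; 0\big) \le \prod_{v \in T} \exp\big(-\alpha f_{(u_1,v_2)}\, n^{1/3}\ln n\big) = \exp\Big(-\sum_{v \in T} f_{(u_1,v_2)}\,\alpha n^{1/3}\ln n\Big) \le \exp\big(-f_{(\bar{s},u_1)}\,\alpha n^{1/3}\ln n\big) \le \exp(-\ln n) = \frac{1}{n}.
\]
Therefore, with probability at least $1 - 1/n$, there is a path from $\bar{s}$ to $\bar{t}$ in $(\bar{V}_{s,t}, \bar{E}'')$ and, consequently, the edge $(s, t)$ is settled.

Case 2. For every $u \in S$, define a random variable $F_{u_1}$:
\[
F_{u_1} = \sum_{v \in T\,:\, r_u \le \alpha f_{(u_1,v_2)} n^{1/3}\ln n} f_{(u_1,v_2)}.
\]
This random variable gives a lower bound on the amount of flow that can be routed along the edges of $\bar{E}''$ from the source $\bar{s}$ to the set of copies of nodes in $T$ through the vertex $u_1$. (Recall that $\bar{E}''$ is a random set.)

Claim 4.5.
\[
\Pr_{r_u : u \in S}\Big[\sum_{u \in S} F_{u_1} \ge \frac{\alpha n^{-1/3}\ln n}{8}\Big] \ge 1 - \frac{1}{2n}.
\]

Proof. The value of $F_{u_1}$ depends only on $r_u$, and hence all random variables $F_{u_1}$ are independent. If $f_{(u_1,v_2)} > 0$ then $f_{(u_1,v_2)} \ge n^{-2/3}$ because $f_{(u_1,v_2)}$ is a multiple of $n^{-2/3}$. Therefore, for all nodes $u \in S$ and $v \in T$ with positive flow $f_{(u_1,v_2)}$,
\[
\Pr_{r_u}\big[\,r_u \le \alpha f_{(u_1,v_2)}\, n^{1/3}\ln n\,\big] = \min\big(\alpha f_{(u_1,v_2)}\, n^{1/3}\ln n,\; 1\big) \ge \alpha n^{-2/3} \cdot n^{1/3}\ln n = \alpha n^{-1/3}\ln n.
\]
This implies that for all nodes $u \in S$ and $v \in T$,
\[
f_{(u_1,v_2)} \Pr_{r_u}\big[\,r_u \le \alpha f_{(u_1,v_2)}\, n^{1/3}\ln n\,\big] \ge f_{(u_1,v_2)}\,\alpha n^{-1/3}\ln n.
\]
Therefore,
\[
\mathbb{E}\Big[\sum_{u \in S} F_{u_1}\Big] = \sum_{u \in S}\sum_{v \in T} f_{(u_1,v_2)} \Pr_{r_u}\big[\,r_u \le \alpha f_{(u_1,v_2)}\, n^{1/3}\ln n\,\big] \ge \Big(\sum_{u \in S}\sum_{v \in T} f_{(u_1,v_2)}\Big)\alpha n^{-1/3}\ln n = \Big(\sum_{u \in S} f_{(\bar{s},u_1)}\Big)\alpha n^{-1/3}\ln n \ge \frac{\alpha}{4}\, n^{-1/3}\ln n.
\]
Now we use the assumption that $f_{(\bar{s},u_1)} < n^{-1/3}$ for all $u \in S$. By flow conservation, it implies that all $F_{u_1}$ are bounded from above by $n^{-1/3}$. By the Hoeffding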

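inequality (footnote 3) applied with $\epsilon = 1/2$ and $c = n^{-1/3}$,
\[
\Pr_{r_u}\Big[\sum_{u \in S} F_{u_1} \ge \frac{\alpha n^{-1/3}\ln n}{8}\Big] \ge 1 - \Pr_{r_u}\Big[\sum_{u \in S} F_{u_1} < \frac{1}{2}\,\mathbb{E}\Big[\sum_{u \in S} F_{u_1}\Big]\Big] \ge 1 - \exp\Big(-\frac{\alpha \ln n}{32}\Big) \ge 1 - \frac{1}{2n}.
\]

Next, we condition on the event that $\sum_{u \in S} F_{u_1} \ge \alpha n^{-1/3}\ln n/8$, and bound the conditional probability that there exists a path from $\bar{s}$ to $\bar{t}$.

Claim 4.6. For any fixed $\{r_u\}_{u \in S}$ such that $\sum_{u \in S} F_{u_1} \ge \alpha n^{-1/3}\ln n/8$, we have
\[
\Pr_{r_v : v \in T}\big[\text{there is no path } \bar{s} \to u_1 \to v_2 \to \bar{t} \text{ in } \bar{E}''\big] \le \frac{1}{2n}.
\]

Proof. For every $v \in T$, let
\[
F_{v_2} = \sum_{u_1\,:\, r_u \le \alpha f_{(u_1,v_2)} n^{1/3}\ln n} f_{(u_1,v_2)}.
\]
If for some $\tilde{v} \in T$ we have $F_{\tilde{v}_2} > 0$, then for some $\tilde{u} \in S$, $r_{\tilde{u}} \le \alpha f_{(\tilde{u}_1,\tilde{v}_2)}\, n^{1/3}\ln n$ and $r_{\tilde{u}} \le \alpha f_{(\bar{s},\tilde{u}_1)}\, n^{1/3}\ln n$, and, hence, the path $\bar{s} \to \tilde{u}_1 \to \tilde{v}_2$ belongs to $\bar{E}''$. Also,
\[
f_{(\tilde{v}_2,\bar{t})} = \sum_{u \in S} f_{(u_1,\tilde{v}_2)} \ge F_{\tilde{v}_2}.
\]
Now for a fixed $\{r_u\}_{u \in S}$ and a vertex $\tilde{v} \in T$, we bound the probability that $(\tilde{v}_2, \bar{t}) \in \bar{E}''$ from below by
\[
\Pr_{r_{\tilde{v}}}\big[\,r_{\tilde{v}} \le \alpha f_{(\tilde{v}_2,\bar{t})}\, n^{1/3}\ln n\,\big] \ge \min\big(\alpha f_{(\tilde{v}_2,\bar{t})}\, n^{1/3}\ln n,\; 1\big) \ge \min\big(\alpha F_{\tilde{v}_2}\, n^{1/3}\ln n,\; 1\big).
\]
Note that we have a lower bound on the sum of the $F_{v_2}$'s:
\[
\sum_{v \in T} F_{v_2} = \sum_{u \in S} F_{u_1} \ge \frac{\alpha n^{-1/3}\ln n}{8}.
\]
Thus, we can use the same argument as in Claim 2.3 to bound the overall failure probability:
\[
\Pr_{r_v}\big[(v_2, \bar{t}) \notin \bar{E}'' \text{ for all } v \in T \text{ with } F_{v_2} > 0\big] \le \prod_{v \in T} \exp\big(-\alpha F_{v_2}\, n^{1/3}\ln n\big) \le \exp(-\alpha^2 \ln^2 n/8) < \frac{1}{2n}.
\]

Footnote 3: Here we use the following variant of Hoeffding's inequality. Let $X_1, \ldots, X_n$ be independent random variables taking values in $[0, c]$. Let $S_n = \sum_i X_i$ and let $\mu = \mathbb{E}[S_n]$. Then, for every positive $\epsilon$, $\Pr[S_n \le (1 - \epsilon)\mu] \le e^{-\frac{1}{2}\epsilon^2 \mu/c}$. For reference see, e.g., [HMRR98], Theorem 2.3(c) on page 200.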
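For readers who want the arithmetic behind the exponent $\alpha\ln n/32$ in Claim 4.5, the following worked computation plugs $\epsilon = 1/2$, $c = n^{-1/3}$ and the bound $\mu \ge \frac{\alpha}{4} n^{-1/3}\ln n$ into the Hoeffding variant of footnote 3. The closing condition on $\alpha$ is our own reading of what "absolute constant" needs to satisfy here; the paper does not state a specific value.

```latex
\begin{align*}
\Pr\Big[\sum_{u \in S} F_{u_1} \le \tfrac{1}{2}\mu\Big]
  &\le \exp\Big(-\frac{\epsilon^2 \mu}{2c}\Big)
   = \exp\Big(-\frac{\mu}{8\, n^{-1/3}}\Big) \\
  &\le \exp\Big(-\frac{(\alpha/4)\, n^{-1/3} \ln n}{8\, n^{-1/3}}\Big)
   = \exp\Big(-\frac{\alpha \ln n}{32}\Big)
   = n^{-\alpha/32}, \\
n^{-\alpha/32} \le \frac{1}{2n}
  &\iff \alpha \ge 32\Big(1 + \frac{\ln 2}{\ln n}\Big),
  \quad\text{e.g. } \alpha \ge 64 \text{ suffices for all } n \ge 2.
\end{align*}
```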

By Claims 4.5 and 4.6, the probability that there exists a path $\bar{s} \to u_1 \to v_2 \to \bar{t}$ is at least $(1 - 1/(2n))^2 > 1 - 1/n$.

5. An $O(n^{2/3+\epsilon})$-Approximation for Directed Steiner Forest

Let us first recall the Directed Steiner Forest (DSF) problem. Given a directed graph $G = (V, E)$, a cost function $c : E \to \mathbb{R}^+$ and a set $D \subseteq V \times V$ of ordered pairs, the goal is to find a min-cost subgraph $H$ of $G$ that contains a path from $s$ to $t$ for every $(s, t) \in D$. In contrast to spanners, there is no restriction on the paths used to connect pairs, but the objective to be optimized depends on arbitrary edge costs.

Theorem 5.1. For any fixed $\epsilon > 0$, there is a polynomial time randomized algorithm for Directed Steiner Forest with expected approximation ratio $O(n^{2/3+\epsilon})$.

Our algorithm for DSF builds on the algorithm of Feldman, Kortsarz and Nutov [FKN09] for the problem. We describe their algorithm and most of their analysis, using notation compatible with previous sections of this paper, and show where we make our improvement.

As mentioned in [FKN09], one can assume without loss of generality that $D \subseteq S \times T$ for two disjoint subsets $S$ and $T$ of $V$ and that the costs are metric. Let $\tau$ denote our guess for the value of OPT. We start from $\tau = 1$ and repeatedly double our guess each time we find it is too small. Thus, it suffices to give the approximation guarantee for the iteration when $\mathrm{OPT} \le \tau \le 2 \cdot \mathrm{OPT}$. The algorithm has two parameters: $\beta$ and $\ell$. We set $\beta = n^{1/3}$ and $\ell = \tau/n^{2/3}$ below.

Let us adapt some terminology from the previous sections to this new setting.

Definition 5.1 (Thick and thin pairs). For a pair $(s, t) \in D$, let $G_{s,t} = (V_{s,t}, E_{s,t})$ be the subgraph of $G$ induced by the vertices on paths from $s$ to $t$ of cost at most $\ell$. A pair $(s, t) \in D$ is thick if $|V_{s,t}| \ge n/\beta$ and it is thin otherwise.

Definition 5.2. A set $E' \subseteq E$ settles a pair $(s, t) \in D$ if the subgraph $(V, E')$ contains a path from $s$ to $t$.

The high-level structure of the algorithm is the same as for the spanner problem. We will describe how to find in polynomial time two sets $E', E'' \subseteq E$ such that $E'$ settles all the thick pairs and $E''$ settles all the thin pairs.
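Definition 5.1 can be made concrete with a short Python sketch. It relies on an assumption that the text does not spell out: a vertex $v$ is treated as lying on an $(s, t)$-path of cost at most $\ell$ whenever $d(s, v) + d(v, t) \le \ell$, where $d$ is the shortest-path cost. The function names and the input format (a list of `(u, v, cost)` triples) are hypothetical.

```python
import heapq


def dijkstra(adj, source):
    """Shortest-path costs from source; adj maps u to a list of (v, cost)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def classify_pairs(n, edges, pairs, beta, ell):
    """Split pairs into (thick, thin) following Definition 5.1."""
    fwd, bwd = {}, {}
    for u, v, c in edges:
        fwd.setdefault(u, []).append((v, c))
        bwd.setdefault(v, []).append((u, c))  # reversed graph, for costs to t
    thick, thin = [], []
    for s, t in pairs:
        from_s = dijkstra(fwd, s)
        to_t = dijkstra(bwd, t)
        # Assumed membership test for V_{s,t}: d(s, v) + d(v, t) <= ell.
        v_st = {v for v in from_s if v in to_t and from_s[v] + to_t[v] <= ell}
        (thick if len(v_st) >= n / beta else thin).append((s, t))
    return thick, thin
```

Running two Dijkstra computations per pair keeps the classification polynomial, which is all the surrounding argument requires.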
The thick pairs can be settled by random sampling, just as in Section 2.1. For $p = O((\log n)/(n/\beta))$, if each vertex is selected with probability $p$ to lie in a set $R$, then for every thick pair $(s, t) \in D$, $R \cap V_{s,t} \ne \emptyset$ with high probability. Let the set $E'$ be constructed by adding, for each $u \in R$, $s \in S$, $t \in T$, the edges of a path from $s$ to $u$ of cost at most $\ell$ if one exists and the edges of a path from $u$ to $t$ of cost at most $\ell$ if one exists. The expected number of thick pairs still not settled is at most $|D|/n^2 \le 1$. Thus, we can add the edges of a minimum-cost path from $s$ to $t$ for any unsettled thick pair $(s, t)$ and still have that the expected cost of $E'$ is $O(n \cdot pn \cdot \ell + \tau) = \tilde{O}(n\ell\beta + \tau) = \tilde{O}(n^{2/3}\tau)$, where we use $\tau$ as an upper bound on the cost of a minimum-cost $(s, t)$-path.

\[
\begin{aligned}
\text{Minimize } & \sum_{e \in E} c(e)\, x_e \quad \text{subject to:} && && (7)\\
& \sum_{(s,t) \in D} y_{s,t} \ge |D|/2, \\
& \sum_{P \in \Pi(s,t)\,:\, e \in P} f_P \le x_e && \forall (s, t) \in D,\; e \in E, \\
& \sum_{P \in \Pi(s,t)} f_P = y_{s,t} && \forall (s, t) \in D, \\
& 0 \le y_{s,t},\, f_P,\, x_e \le 1 && \forall (s, t) \in D,\; P \in \Pi,\; e \in E.
\end{aligned}
\]
Figure 3: Linear program LP-DSF for the case $|D| - |C| > |D|/2$

We remove the settled thick pairs from $D$, so that it only consists of the unsettled thin pairs. Next, we construct an edge set $E''$ that settles all the thin pairs. Define the density of a subset of $E$ to be the ratio between the total cost of the subset and the number of pairs in $D$ settled by it. We show how to efficiently construct a subset $K$ with expected density $O(n^{2/3+\epsilon}) \cdot \tau/|D|$. This allows us to compute the set $E''$: starting from $D$ unsettled thin pairs and $E'' = \emptyset$, find $K$ of expected density $O(n^{2/3+\epsilon}) \cdot \tau/|D|$, add the edges in $K$ to $E''$, remove the settled pairs from $D$, and repeat. As shown in Theorem 2.1 of [FKN09], this greedy procedure produces a subset $E''$ of expected cost $O(n^{2/3+\epsilon}) \cdot \tau$ that settles all the thin pairs, completing the proof of Theorem 5.1.

The edge set $K$ is produced by constructing two sets $K_1$ and $K_2$ and letting $K$ be the set of smaller density. We guarantee that one of $K_1$ and $K_2$ has expected density $O(n^{2/3+\epsilon}) \cdot \tau/|D|$. Whether the guarantee is provided for $K_1$ or $K_2$ depends upon which one of the two cases below holds. Suppose $H$ is an optimal solution with cost $\tau$ (we ignore the factor of 2 for simplicity). Let $C$ be the set of pairs $(s, t) \in D$ for which the minimum cost of an $(s, t)$-path in $H$ is at least $\ell$; that is, these are the costly pairs to settle. The two cases are: $|C| \ge |D|/2$ and $|C| < |D|/2$.

Case 1: $|C| \ge |D|/2$. This case relies on a result of Chekuri, Even, Gupta and Segev [CEGS11]. Define a junction tree to be the union of an ingoing tree and an outgoing tree (not necessarily disjoint) rooted at the same vertex. Chekuri et al. [CEGS11] show an $O(n^{\epsilon})$-approximation for the minimum density junction tree of a graph. Fortunately, there exists a junction tree of density at most $\tau^2/(|C| \cdot \ell)$. To see why, take the paths in $H$ connecting the pairs in $C$. The sum of the costs of all such paths is at least $|C| \cdot \ell$.
If we denote the maximum number of these paths that any edge belongs to as $\mu$, then the sum of the costs of the paths is at most $\mu \cdot \tau$, and thus there exists an edge which belongs to $\mu \ge |C| \cdot \ell/\tau$ paths. Therefore, there must be a junction tree $K_1$ which contains this edge and connects at least $|C| \cdot \ell/\tau$ pairs in $D$. $K_1$ has density at most $\tau/(|C| \cdot \ell/\tau) = \tau^2/(|C| \cdot \ell)$. Thus, when $|C| \ge |D|/2$, the algorithm of [CEGS11] (deterministically) returns a junction tree of density $O(n^{\epsilon} \cdot \tau/\ell \cdot \tau/|D|) = O(n^{2/3+\epsilon}) \cdot \tau/|D|$.
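The last equality in Case 1 follows by substituting the parameter choice $\ell = \tau/n^{2/3}$ and the case assumption $|C| \ge |D|/2$. The short computation below spells this out; it only uses quantities defined in the text.

```latex
\begin{align*}
\frac{\tau^2}{|C| \cdot \ell}
  \;\le\; \frac{2\tau^2}{|D| \cdot \ell}
  \;=\; \frac{2\tau^2}{|D| \cdot \tau / n^{2/3}}
  \;=\; 2\, n^{2/3} \cdot \frac{\tau}{|D|},
\qquad\text{so}\qquad
O(n^{\epsilon}) \cdot \frac{\tau^2}{|C| \cdot \ell}
  \;=\; O(n^{2/3+\epsilon}) \cdot \frac{\tau}{|D|}.
\end{align*}
```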

Case 2: $|D| - |C| > |D|/2$. In this case, we attempt to find a subgraph that connects many pairs of $D$ using low-cost edges. Consider the problem of connecting at least $|D|/2$ pairs from $D$ using paths of cost at most $\ell$ while minimizing the total cost of the edges. For $(s, t) \in D$, let $\Pi(s, t)$ be the set of $(s, t)$-paths of cost at most $\ell$, and let $\Pi = \bigcup_{(s,t) \in D} \Pi(s, t)$. We can formulate an LP relaxation for this problem, LP-DSF, shown in Figure 3, which closely resembles the LP used by [DK11] for Directed $k$-Spanner. Each edge $e$ has a capacity $x_e$, each path $P \in \Pi$ carries $f_P$ units of flow, and $y_{s,t}$ is the total flow through all paths from $s$ to $t$. Also, the total flow through all paths in $\Pi$ should be at least $|D|/2$. It is clear that LP-DSF is a relaxation of the problem of connecting at least $|D|/2$ pairs in $D$ while minimizing the cost of the edges. Feldman, Kortsarz and Nutov [FKN09] show that in polynomial time, we can find a solution $\{\hat{x}_e\} \cup \{\hat{y}_{s,t}\}$ such that $\sum_{e \in E} c(e)\,\hat{x}_e$ is within a $(1 + \epsilon)$ factor of OPT, the optimal value of LP-DSF, for any fixed $\epsilon > 0$.

Our improvement comes in the analysis of the rounding algorithm for LP-DSF. Suppose $\{\hat{x}_e\} \cup \{\hat{y}_{s,t}\}$ is a feasible solution to LP-DSF. Let $K_2$ be the edge set obtained by selecting each edge in $E$ with probability $\min((8n\ln n)/\beta \cdot \hat{x}_e,\; 1)$.

Lemma 5.2. With probability at least $1 - 1/n^2$, the set $K_2$ settles every thin pair $(s, t)$ with $\hat{y}_{s,t} \ge 1/4$.

Proof. We reinterpret Definition 2.4 in terms of edge costs instead of lengths. More precisely, define a set $A \subseteq E$ to be an antispanner for a pair $(s, t) \in D$ if $(V, E \setminus A)$ contains no path from $s$ to $t$ of cost at most $\ell$. By exactly the same argument as in Claim 2.5, the set of all minimal antispanners for thin pairs is of size at most $n^2 (n/\beta)^{n/\beta}$. For every thin pair $(s, t) \in D$ with $\hat{y}_{s,t} \ge 1/4$, if $A$ is an antispanner for $(s, t)$, then $\sum_{e \in A} \hat{x}_e \ge \sum_{P \in \Pi(s,t)} \hat{f}_P \ge 1/4$, where $\hat{f}_P$ is the value of the variable $f_P$ in LP-DSF that corresponds to the solution $\{\hat{x}_e\} \cup \{\hat{y}_{s,t}\}$. So, the probability that $K_2$ is disjoint from $A$ is at most $\exp(-(n\ln n)/\beta)$, by the same argument as in Claim 2.3.
Thus, by the bound on the total number of antispanners of thin pairs from above, the union bound, and Claim 2.4, it follows that with high probability, $K_2$ settles every thin pair $(s, t)$ with $\hat{y}_{s,t} \ge 1/4$.

We add to $K_2$ a minimum-cost path between any pair $(s, t)$ with $\hat{y}_{s,t} \ge 1/4$ that is still not settled. In expectation, the number of such pairs is at most $|D|/n^2 \le 1$, so that the total expected cost (footnote 4) of $K_2$ is at most $(16n\ln n)/\beta \cdot \tau$. A simple argument shows that the number of pairs $(s, t)$ in $D$ for which $\hat{y}_{s,t} < 1/4$ is at most $2|D|/3$; assuming the opposite makes the total amount of flow between all pairs strictly less than $|D|/2$. So, the expected density of $K_2$ is at most
\[
\Big(\frac{16n\ln n}{\beta} \cdot \tau\Big) \Big/ \big(|D| - 2|D|/3\big) = \frac{48n\ln n}{\beta} \cdot \frac{\tau}{|D|} = \tilde{O}(n^{2/3}) \cdot \tau/|D|.
\]

Footnote 4: This is where we save over [FKN09]. The cost of their comparable $K_2$ is $O(n^2/\beta^2 \cdot \tau)$.
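The independent rounding that defines $K_2$ is simple enough to sketch directly. The Python snippet below is a minimal illustration, assuming the LP solution is available as a dictionary `x_hat` from edges to values in $[0, 1]$; the function name and this input format are hypothetical, and solving LP-DSF itself is outside the scope of the sketch.

```python
import math
import random


def round_k2(n, x_hat, beta, rng=None):
    """Select each edge e independently with probability min(8 n ln n / beta * x_hat[e], 1)."""
    rng = rng or random.Random()
    boost = 8.0 * n * math.log(n) / beta
    # rng.random() lies in [0, 1): probability-1 edges are always kept,
    # probability-0 edges are never kept.
    return {e for e, value in x_hat.items() if rng.random() < min(boost * value, 1.0)}


k2 = round_k2(8, {("a", "b"): 1.0, ("b", "c"): 0.0, ("c", "d"): 0.5}, beta=2.0,
              rng=random.Random(1))
```

In this toy call the boosted probabilities of the first and third edges are capped at 1 and that of the second is 0, so the outcome does not depend on the random seed. By linearity of expectation, the expected cost of the selected edges is at most $(8n\ln n)/\beta \cdot \sum_e c(e)\,\hat{x}_e$, which is the quantity the density bound above builds on.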