A DETERMINISTIC APPROXIMATION ALGORITHM FOR THE DENSEST K-SUBGRAPH PROBLEM


ALAIN BILLIONNET, FRÉDÉRIC ROUPIN

CEDRIC, CNAM-IIE, 18 allée Jean Rostand, 91025 Evry cedex, France.
e-mails: {billionnet,roupin}@iie.cnam.fr

Abstract. In the Densest k-Subgraph Problem (DSP), we are given an undirected weighted graph G = (V, E) with n vertices (v_1, ..., v_n). We seek a subset of k vertices (k ∈ {1, ..., n}) which maximizes the number of edges having both endpoints in the subset. This problem is NP-hard even for bipartite graphs, and no polynomial-time algorithm with a fixed performance guarantee is known for the general case. Several authors have proposed randomized approximation algorithms for particular cases (especially when k = n/c, c > 1), but derandomization techniques are not easy to apply here because of the cardinality constraint, and they can have a high computational cost. In this paper we present a deterministic max(d, 8/(9c))-approximation algorithm for the Densest k-Subgraph Problem, where d is the density of G. The complexity of our algorithm is only that of linear programming. This result is obtained by using particular optimal solutions of a linear program associated with the classical 0-1 quadratic formulation of DSP.

Keywords: Approximation, Quadratic programming, Linear programming, Complexity.

1. Introduction

Consider an undirected weighted graph G = (V, E) with n vertices (v_1, ..., v_n), an integer k ∈ {1, ..., n}, and a non-negative weight w_ij on each edge [v_i, v_j]. The Heaviest k-Subgraph Problem (HSP) consists in determining a subset S of k vertices such that the total edge weight of the subgraph induced by S is maximized. HSP is also known under the names of p-dispersion problem, p-defence-sum problem (Krarup et al. 2002), k-cluster problem, and densest

k-subgraph problem (DSP), when all the edge weights are equal to 1. HSP is NP-hard even for unweighted bipartite graphs (by a polynomial reduction from Clique (Corneil and Perl 1984)). For unweighted graphs, Clique reduces polynomially to DSP: there is a clique of size k in G if and only if the optimal value of DSP is equal to k(k−1)/2. Strong negative results have been obtained about the approximability of the Clique problem; unfortunately, there is no direct way to use them for the HSP problem. Recall that an r-approximation algorithm for a maximization problem (Π) returns, for any instance I of (Π), a feasible solution of value v_a such that v_a / v* ≥ r, where v* is the optimal value of I. The parameter 0 < r ≤ 1 is known as the approximation ratio of the algorithm. Many approximation results are known for HSP (Kortsarz and Peleg 1993, Asahiro et al. 2000, Feige et al. 2001b, Jäger and Srivastav 2005), but no approximation algorithm with a fixed ratio bound has been found to date, and the question of whether such an algorithm exists is open. However, HSP admits an approximation ratio of at least max(k/n, n^{ε−1/3}) for all k and some ε > 0 (Kortsarz and Peleg 1993, Feige et al. 2001a). A well-known greedy heuristic (Asahiro et al. 2000) returns a feasible solution of value v_a, and upper and lower bounds are known on its worst-case ratio R = v*/v_a (that is, R = 1/r):

    (1/2 + n/(2k))² − O(n^{−1/3}) ≤ R ≤ (1/2 + n/(2k))² + O(1/n)    for n/3 < k ≤ n

    2(n/k − 1) − O(n/k²) ≤ R ≤ 2(n/k − 1) + O(n/k²)    for k ≤ n/3

Let c = n/k. For c = 2, R is less than or equal to 9/4 + O(1/n); for c = 3, we obtain 4 + O(n/k²); and for c = 4, we get 6 + O(n/k²). For some particular cases, there are better approximation results. For HSP, when the weights satisfy the triangle inequality, there exists a 2-approximation algorithm (Hassin et al. 1997). All the previous results for HSP obviously apply to DSP (the Densest k-Subgraph Problem).

RC-486

On the other hand, many randomized algorithms have been proposed for HSP and DSP. Some of them use the optimal solution of the natural linear relaxation of a quadratic formulation of DSP (Han et al. 2000). Others are based on semidefinite relaxations (Feige et al. 2001b, Han et al. 2002, Jäger and Srivastav 2005). Unfortunately, to the best of our knowledge, no derandomization has been explicitly presented for these algorithms. Note that the task is complicated here by the cardinality constraint: for such problems, as noted in (Feige et al. 2001c) for the max-bisection problem, the impact on the worst-case ratio of "repairing" the solution is very difficult to analyse. In (Arora et al. 1999), starting from a 0-1 quadratic formulation, such a successful analysis leads to a polynomial-time approximation scheme (PTAS) for DSP in everywhere-dense graphs (graphs whose minimum degree is Ω(n)). This last result also holds when G is a dense graph (the number of edges is Ω(n²)) and k = Ω(n). But these hypotheses on the density are crucial.

Our contribution is a simple deterministic combinatorial algorithm which uses as input an optimal solution of a linear relaxation of a quadratic formulation of DSP. The total complexity of our algorithm is only that of linear programming, and its worst-case approximation ratio is max(d, 8/(9c)), where d = 2|E|/(n(n−1)) is the density of G. Indeed, we do not use derandomization techniques, which are not easy to apply here and can increase the overall complexity dramatically. In Section 2, we recall the randomized approaches based on linear and semidefinite programming for HSP and DSP. In Section 3, we present our deterministic algorithm. First, we characterize some optimal solutions of a classical linear relaxation of the 0-1 quadratic formulation of DSP. Second, we compare the value of these solutions under the linear objective function and under the quadratic objective function. Moreover, we study the integrality gap of the quadratic formulation. Using these results, we design a new deterministic approximation algorithm for DSP. In Section 4, we apply our algorithm to a small instance of DSP, and then compare our worst-case ratio with previous ones. Section 5 is a conclusion.

2. Randomized approaches

First, note that in DSP the probability that an edge joins two given vertices is d = 2|E|/(n(n−1)); thus the expected number of edges of a random subgraph with k vertices of G equals d·k(k−1)/2. For HSP, there is a simple deterministic O(n²) algorithm which achieves the expected value of a random solution. This greedy heuristic is used for instance in (Asahiro et al. 2000): just remove n − k vertices from G one by one, choosing at each step the vertex v_i for which Σ_{[v_i,v_j]∈E} w_ij (i.e., the degree, for DSP) is minimal in the remaining graph. It can be seen as a deterministic version of the simplest randomized algorithm for HSP. Using this greedy procedure, one can prove (Srivastav and Wolf 1998):

Theorem 2.1. The HSP problem in a weighted graph of n vertices has a k(k−1)/(n(n−1))-approximation polynomial-time algorithm.

Corollary 2.2. The DSP problem in a graph of n vertices has a d-approximation polynomial-time algorithm, where d = 2|E|/(n(n−1)) is the density of G.

Now, we recall more sophisticated randomized approaches and discuss derandomization issues. For HSP, the authors of (Feige et al. 2001b) prove that the ratio associated with the randomized algorithm proposed in (Srivastav and Wolf 1998) is incorrect for the particular semidefinite program proposed in that paper; but, following similar ideas, they present a randomized approximation algorithm using a tighter semidefinite relaxation. The exact expression of their ratio uses constants derived in the approximation algorithms of (Goemans and Williamson 1995) for max-k-cut and max-dicut. This ratio was the first to beat the target c when c is close to 2. Semidefinite relaxations for HSP have also been proposed in (Han et al. 2002), by adding a new constraint to the semidefinite program given in (Srivastav and Wolf 1998), and more recently in (Jäger and Srivastav 2005), where worst-case ratios have been improved for 0.04 < c < 0.68.
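As an illustration, the greedy removal procedure of Theorem 2.1 can be sketched as follows. This is a minimal sketch for the unweighted case (DSP); the graph encoding and the function name are ours, not the paper's:

```python
def greedy_dsp(adj, k):
    """Greedy heuristic for DSP: repeatedly delete a vertex of minimum
    degree in the remaining graph until only k vertices are left.

    adj: dict mapping each vertex to the set of its neighbours.
    Returns the set of k surviving vertices.
    """
    remaining = set(adj)
    # Degrees restricted to the remaining graph.
    deg = {v: len(adj[v] & remaining) for v in remaining}
    while len(remaining) > k:
        v = min(remaining, key=lambda u: deg[u])  # vertex of minimum degree
        remaining.remove(v)
        for u in adj[v] & remaining:
            deg[u] -= 1
    return remaining
```

On a weighted instance (HSP) one would minimize the total weight of the incident edges instead of the degree; with this incremental degree update the procedure runs in O(n²) time, as stated above.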
Unfortunately, all these approaches require solving a semidefinite program and derandomizing the algorithm by the method of conditional probabilities (Srivastav 2001), which is a polynomial process but has a very high computational cost here because of the random hyperplane

approach. For instance, in (Hariharan and Mahajan 1999) the authors report a complexity of O(n^30) for the 3-Vertex Coloring problem. Moreover, for our problem one still has to repair the final solution so that it satisfies the cardinality constraint. In (Feige et al. 2001b), this is done when c is close to 2 by using the greedy heuristic of Theorem 2.1. The same difficulty arises in the linear programming approach when one uses a randomized rounding algorithm. Indeed, consider the following formulation of DSP:

    (P)   max f(x) = Σ_{i<j, [v_i,v_j]∈E} x_i x_j   s.t.   Σ_{i=1}^{n} x_i = k,   x ∈ {0,1}^n

where x_i = 1 if and only if vertex v_i is put into the subset. Let (P̄) be the continuous relaxation of (P), obtained by replacing the constraint x ∈ {0,1}^n by x ∈ [0,1]^n. Note that (P) can be formulated as:

    (PL)  max f_L(x) = Σ_{i<j, [v_i,v_j]∈E} min(x_i, x_j)   s.t.   Σ_{i=1}^{n} x_i = k,   x ∈ {0,1}^n

It is easy to formulate (P̄L), the continuous relaxation of (PL) (x ∈ [0,1]^n in place of x ∈ {0,1}^n), as a linear program by adding |E| new variables z_ij and 2|E| constraints z_ij ≤ x_i and z_ij ≤ x_j. A randomized algorithm based on (P̄L) is cited in (Feige et al. 2001a) and attributed to Goemans. Its expected worst-case ratio equals c, and it can be roughly stated as follows. First, compute (x*, z*), an optimal solution of (P̄L). Second, randomly round each of the n fractional components of x* to 1, component i with probability √(x*_i). The probability that an edge [v_i, v_j] is selected is then at least z*_ij; thus the expected number of selected edges is at least the optimal value of (P̄L). One can verify that the expected number of selected vertices is less than √(nk). Getting from √(nk) vertices down to k (by randomly choosing k of them) loses a factor (k/√(nk))² = 1/c in the ratio, namely, gives the ratio c. Note that there are two randomized steps in this algorithm. A derandomization of the first one (the randomized rounding) can be achieved by using the results presented in the seminal work of (Raghavan and Thompson 1987, Raghavan 1988).
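The rounding step of the algorithm attributed to Goemans can be sketched as follows. This is our own minimal illustration: it assumes a fractional optimum x* of the LP relaxation is already available (from any LP solver), and the trim/pad step is our simplification of the final "repair" to exactly k vertices:

```python
import math
import random

def round_sqrt(x_star, k, rng):
    """Randomized rounding of a fractional LP solution: keep vertex i
    with probability sqrt(x*_i).  By Cauchy-Schwarz the expected number
    of kept vertices is at most sqrt(n * k); the selection is then
    trimmed (or padded) at random to exactly k vertices.
    """
    kept = [i for i, xi in enumerate(x_star) if rng.random() < math.sqrt(xi)]
    rng.shuffle(kept)
    if len(kept) > k:                 # trim at random down to k vertices
        kept = kept[:k]
    else:                             # pad with arbitrary unused vertices
        rest = [i for i in range(len(x_star)) if i not in kept]
        kept += rest[: k - len(kept)]
    return sorted(kept)

# Example: n = 8 vertices, k = 4, uniform fractional solution x*_i = k/n.
rng = random.Random(0)
sol = round_sqrt([0.5] * 8, 4, rng)
```

The function always returns exactly k distinct vertices; only the quality of the returned subset is random.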
The 0-1 solution obtained here will be such that Σ_i x_i ≤ √(nk) + O(√(n log n)) with

high probability. In (Han et al. 2000) the authors do not derandomize the initial step, but they propose to choose the final k vertices greedily by using Theorem 2.1 (instead of taking them randomly). This leads to an approximation ratio of the form (1/c)(1 − ε(k, n)), where the loss term ε(k, n) vanishes only as k and n grow. Thus, even for n sufficiently large, a factor 1 − ε(k, n) is lost in the ratio 1/c. For instance, to get an approximation ratio equal to 0.95/c, k must be larger than 57038. Moreover, the additional error O(√(n log n)) on the cardinality constraint is not taken into account, since the initial randomized rounding step is not derandomized. All these results illustrate the difficulty of obtaining efficient deterministic versions of randomized algorithms for HSP and DSP with the same worst-case ratio.

3. A simple deterministic algorithm to approximate DSP

In this section, we assume that the graph G = (V, E) is unweighted and that k ≥ 3, since otherwise the optimal solution of DSP is trivial. The main result (Theorem 3.5) is obtained by starting from an optimal solution x^L of (P̄L). First, we build x^{Lβ}, a particular optimal solution of (P̄L) (Lemma 3.1). Second, starting from x^{Lβ} we obtain x^Q, a particular feasible solution of (P̄), and finally x^P, a feasible solution of (P) (the formulation of DSP). At each step, we are able to bound the gap between the values of these different solutions. The complexity of our approximation algorithm is only that of linear programming, because the other steps can be implemented in O(n²) and no derandomization is required. Moreover, our worst-case ratio max(d, 8/(9c)) is valid for any k. To start, we characterize some optimal solutions of a special mathematical program in order to obtain particular optimal solutions of (P̄L).
Consider G = (V, E), a graph with n vertices {v_1, ..., v_n}, and (Π), the following continuous linear program:

    (Π)   max h(x) = Σ_{i=1}^{n} C_i x_i + Σ_{i<j, [v_i,v_j]∈E} min(x_i, x_j)   s.t.   Σ_{i=1}^{n} x_i = k,   x ∈ [0,1]^n

where C_i, for all i ∈ {1, ..., n}, is a non-negative coefficient.

Lemma 3.1. Let x* be an optimal solution of (Π). If 0 < x*_i < 1 for all i ∈ {1, ..., n}, then x̄ with x̄_i = k/n (i = 1, ..., n) is also an optimal solution of (Π).

Proof. Consider

    Δ = h(x*) − h(x̄) = Σ_{i=1}^{n} C_i (x*_i − k/n) + Σ_{i<j, [v_i,v_j]∈E} (min(x*_i, x*_j) − k/n).

Since x* is an optimal solution of (Π), we have Δ ≥ 0. Now consider x°, a feasible solution of (Π), defined by x°_i = x*_i + ρ(x*_i − k/n) (i = 1, ..., n), where ρ is a positive real such that 0 < x*_i + ρ(x*_i − k/n) < 1 (such a ρ exists since 0 < x*_i < 1). We get

    h(x°) − h(x*) = Σ_{i<j, [v_i,v_j]∈E} [min(x*_i + ρ(x*_i − k/n), x*_j + ρ(x*_j − k/n)) − min(x*_i, x*_j)] + Σ_{i=1}^{n} C_i ρ(x*_i − k/n).

By examining the two cases x*_i ≤ x*_j and x*_i > x*_j, one can see that h(x°) − h(x*) = ρΔ. Since x* is optimal, ρΔ ≤ 0, hence Δ ≤ 0, and finally Δ = 0; this implies the optimality of x̄.

A consequence of Lemma 3.1 is that from any optimal solution of (P̄L), one can obtain another optimal solution whose fractional components are all equal.

Lemma 3.2. For DSP, if x^{Lβ} is an optimal solution of (P̄L) whose fractional components are all equal to the same value β, then f(x^{Lβ}) ≥ (k/n) f_L(x^{Lβ}).

Proof. We define X_0 = {i : x^{Lβ}_i = 0}, X_1 = {i : x^{Lβ}_i = 1}, X_β = {i : x^{Lβ}_i = β}, E_{1,β} = {{i, j} : [v_i, v_j] ∈ E; (x^{Lβ}_i = 1, x^{Lβ}_j = β) or (x^{Lβ}_i = β, x^{Lβ}_j = 1)}, E_{β,β} = {{i, j} : [v_i, v_j] ∈ E; x^{Lβ}_i = β, x^{Lβ}_j = β}, and E_{1,1} = {{i, j} : [v_i, v_j] ∈ E; x^{Lβ}_i = 1, x^{Lβ}_j = 1}. Let n_0, n_1, n_β, e_{1,β}, e_{β,β}, e_{1,1} be respectively the cardinalities of X_0, X_1, X_β, E_{1,β}, E_{β,β} and E_{1,1}. Here we get β = (k − n_1)/(n − n_1 − n_0) (since Σ_{i=1}^{n} x^{Lβ}_i = k and all fractional components are equal). First, if n_1 = 0, one can see that f(x^{Lβ}) = β f_L(x^{Lβ}) = (k/(n − n_0)) f_L(x^{Lβ}) ≥ (k/n) f_L(x^{Lβ}) (for all (i, j), x^{Lβ}_i x^{Lβ}_j = β·min(x^{Lβ}_i, x^{Lβ}_j) since x^{Lβ}_i ∈ {0, β}). Otherwise, consider the

Figure 1. (x°) is built from (x^{Lβ}).

following feasible solution x° of (P̄L) (see Figure 1): x°_i = x^{Lβ}_i (i ∈ X_0), x°_i = x^{Lβ}_i − ε_1 (i ∈ X_1), x°_i = x^{Lβ}_i + ε (i ∈ X_β), where ε and ε_1 are two positive real numbers satisfying β + ε ≤ 1 − ε_1, and ε_1 n_1 = ε n_β (to satisfy the constraint Σ_{i=1}^{n} x°_i = k). Note that it is always possible to find such numbers by taking them small enough. We have h(x°) − h(x^{Lβ}) = e_{β,β} ε − e_{1,1} ε_1 + e_{1,β} ε ≤ 0 (recall that h is the objective function of (Π), here with all C_i = 0, and that x^{Lβ} is optimal), which implies e_{β,β} (n_1/n_β) ε_1 − e_{1,1} ε_1 ≤ 0, thus e_{1,1} ≥ e_{β,β} n_1/n_β. We know that f(x^{Lβ}) = e_{β,β} β² + e_{1,1} + e_{1,β} β and f_L(x^{Lβ}) = e_{β,β} β + e_{1,1} + e_{1,β} β. Thus we get

    f(x^{Lβ}) / f_L(x^{Lβ}) ≥ (e_{1,1} + e_{β,β} β²) / (e_{1,1} + e_{β,β} β) ≥ (n_1/n_β + β²) / (n_1/n_β + β) = (n_1 + (k − n_1) β) / k

(recall that β = (k − n_1)/n_β). Note that

    (n_1 + (k − n_1) β) / k − k / (n_1 + n_β) = (k − (n_1 + n_β))² n_1 / (n_β k (n_1 + n_β)) ≥ 0,

yielding f(x^{Lβ}) / f_L(x^{Lβ}) ≥ k/(n_1 + n_β) ≥ k/n.

In Algorithm 1, we build from any feasible solution x of (P̄) a particular feasible solution x^Q of (P̄) such that f(x^Q) ≥ f(x). Then we obtain in Algorithm 2 a feasible solution x^P of (P) from x^Q. Finally, we prove in Lemma 3.3 that the gap between f(x^Q) and f(x^P) can be bounded.

Algorithm 1. Let x be any feasible solution of (P̄). Consider two vertices v_i and v_j of G such that there is no edge between them, and x_i = α, x_j = β with 0 < α < 1 and 0 < β < 1 (if such a pair of vertices does not exist, then stop).

Consider the following program, obtained from (P̄) by fixing all the variables x_r (r = 1, ..., n; r ≠ i, j) to their values in the feasible solution x:

    max L(x_i, x_j) = A x_i + B x_j + C   s.t.   x_i + x_j = α + β,   x_i ∈ [0,1], x_j ∈ [0,1]

where A, B and C are non-negative constants depending on the values of the x_r (r = 1, ..., n; r ≠ i, j). This linear program has an obvious optimal solution. Indeed, if A ≥ B then (min(1, α + β), α + β − min(1, α + β)) is optimal, and if A < B then (α + β − min(1, α + β), min(1, α + β)) is optimal. In both cases, at least one variable becomes integral. By iterating this procedure at most n times, we obtain V_1, V_2, and a feasible solution x^Q of (P̄) such that (see Figure 2):

- x^Q_i = 0 or 1 for all i ∈ V_1;
- 0 < x^Q_i < 1 for all i ∈ V_2;
- the edge [v_i, v_j] belongs to E for all i, j ∈ V_2 (i ≠ j);
- V_1 ∪ V_2 = V and V_1 ∩ V_2 = ∅.

Figure 2. Particular feasible solution x^Q of (P̄) such that f(x^Q) ≥ f(x).

Notation. p and q are the respective cardinalities of V_1 and V_2, and s is the number of variables set to one in V_1. Moreover, we define T_1 = {(i, j) ∈ V_1 × V_1 : i < j, [v_i, v_j] ∈ E}, T_2 = {(i, j) ∈ V_2 × V_2 : i < j, [v_i, v_j] ∈ E}, and T_{1,2} = {(i, j) ∈ V_1 × V_2 : i < j, [v_i, v_j] ∈ E}.

Algorithm 2. Let x be any feasible solution of (P̄). Consider the particular feasible solution x^Q of (P̄) obtained from x by Algorithm 1, and build x^P, a feasible solution of (P), in the following way:

We can write

    f(x^Q) = Σ_{(i,j)∈T_1} x^Q_i x^Q_j + Σ_{(i,j)∈T_2} x^Q_i x^Q_j + Σ_{(i,j)∈T_{1,2}} x^Q_i x^Q_j.

Now, fix all the variables of V_1 to their current values, and consider the following restricted program:

    max Σ_{i∈V_2} C_i x_i + Σ_{(i,j)∈T_2} x_i x_j + Σ_{(i,j)∈T_1} x^Q_i x^Q_j   s.t.   Σ_{i∈V_2} x_i = k − s,   x_i ∈ {0,1} (i ∈ V_2)

where C_i = Σ_{j∈Γ(i)∩V_1} x^Q_j. Since the vertices of V_2 form a clique, the optimal value is clearly equal to Σ_{i∈V_2[k−s]} C_i + (k − s)(k − s − 1)/2 + Σ_{(i,j)∈T_1} x^Q_i x^Q_j, where V_2[k−s] is the set of the indices of the k − s variables x_i (i ∈ V_2) having the k − s greatest C_i. Thus an optimal solution is x_i = 1 for i ∈ V_2[k−s] and x_i = 0 otherwise. Finally, the feasible solution x^P of (P) is given by x^P_i = x^Q_i for i ∈ V_1, x^P_i = 1 for i ∈ V_2[k−s], and x^P_i = 0 otherwise.

Lemma 3.3. Let x be any feasible solution of (P̄). One can obtain in polynomial time a feasible solution x^P of (P) (DSP: the formulation of HSP in unweighted graphs) such that f(x) − f(x^P) ≤ q/8, where q is obtained by applying Algorithm 1 to x.

Proof. First, we use Algorithms 1 and 2 to obtain x^Q and x^P from x. Recall that Σ_{i∈V_2} x^Q_i = k − s. We have:

    f(x^Q) − f(x^P) = Σ_{(i,j)∈T_2} x^Q_i x^Q_j + Σ_{(i,j)∈T_{1,2}} x^Q_i x^Q_j − Σ_{i∈V_2[k−s]} C_i − (k − s)(k − s − 1)/2.

Note that

    Σ_{(i,j)∈T_2} x^Q_i x^Q_j ≤ ((k − s)/q)² q(q − 1)/2   [A]   and   Σ_{i∈V_2[k−s]} C_i ≥ Σ_{(i,j)∈T_{1,2}} x^Q_i x^Q_j   [B].

Thus we can write:

    f(x^Q) − f(x^P) ≤ ((k − s)/q)² q(q − 1)/2 − (k − s)(k − s − 1)/2.

Let x = k − s. The maximum value of ((q − 1)/(2q)) x² − x(x − 1)/2 as a function of x is q/8 (attained at x = q/2); thus f(x^Q) − f(x^P) ≤ q/8, and since f(x^Q) ≥ f(x) the result follows.
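The two procedures (Algorithms 1 and 2) can be sketched as follows for the unweighted case. This is our own minimal illustration; the function names and the graph encoding (an edge list of index pairs) are ours. `algorithm1` performs the pairwise shifts between non-adjacent fractional vertices (each shift makes at least one variable integral without decreasing f), and `algorithm2` the greedy completion:

```python
def quad_value(x, edges):
    """f(x) = sum of x_i * x_j over the edges [v_i, v_j]."""
    return sum(x[i] * x[j] for i, j in edges)

def build_adj(edges, n):
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j); adj[j].add(i)
    return adj

def algorithm1(x, edges, n, eps=1e-9):
    """While two non-adjacent vertices both have fractional values, shift
    weight between them (keeping x_i + x_j constant) toward the vertex
    whose neighbourhood carries more weight."""
    adj = build_adj(edges, n)
    x = list(x)
    while True:
        frac = [i for i in range(n) if eps < x[i] < 1 - eps]
        pair = next(((i, j) for i in frac for j in frac
                     if i < j and j not in adj[i]), None)
        if pair is None:
            return x
        i, j = pair
        a = sum(x[r] for r in adj[i])   # linear coefficient A of x_i
        b = sum(x[r] for r in adj[j])   # linear coefficient B of x_j
        s = x[i] + x[j]
        hi = min(1.0, s)                # push the better variable up to hi
        if a >= b:
            x[i], x[j] = hi, s - hi
        else:
            x[i], x[j] = s - hi, hi

def algorithm2(x, edges, n, k, eps=1e-9):
    """Keep the integral part of x; among the fractional vertices (which
    form a clique), set to 1 the k - s ones with the largest total weight
    toward vertices already at 1 (ties broken arbitrarily)."""
    adj = build_adj(edges, n)
    v2 = [i for i in range(n) if eps < x[i] < 1 - eps]
    s = sum(1 for i in range(n) if x[i] >= 1 - eps)
    c = {i: sum(x[r] for r in adj[i] if r not in v2) for i in v2}
    chosen = set(sorted(v2, key=lambda i: -c[i])[: k - s])
    return [1 if (x[i] >= 1 - eps or i in chosen) else 0 for i in range(n)]
```

Both steps preserve the cardinality constraint Σ x_i = k, and together they turn any feasible solution of the relaxation into a 0-1 solution, as in the analysis above.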

Corollary 3.4. Let x be any feasible solution of (P̄) of value greater than 1. One can obtain in polynomial time a feasible solution x^P of (P) (DSP: the formulation of HSP in unweighted graphs) such that f(x^P) ≥ f(x)/2.

Proof. First, we use Algorithms 1 and 2 to obtain x^Q and x^P from x. Since f(x^Q) ≥ f(x), to obtain the claimed result it suffices to prove that ρ = f(x^P)/f(x^Q) ≥ 1/2. We have:

    ρ = (a + b + (k − s)(k − s − 1)/2) / (a + Σ_{(i,j)∈T_2} x^Q_i x^Q_j + Σ_{(i,j)∈T_{1,2}} x^Q_i x^Q_j)

where a = Σ_{(i,j)∈T_1} x^Q_i x^Q_j and b = Σ_{i∈V_2[k−s]} C_i. By using [A] and [B] we obtain:

    ρ ≥ (a + b + (k − s)(k − s − 1)/2) / (a + b + ((k − s)/q)² q(q − 1)/2) = (a + b + (k − s)(k − s − 1)/2) / (a + b + (k − s)² (q − 1)/(2q))

and thus ρ ≥ ((k − s)(k − s − 1)/2) / ((k − s)² (q − 1)/(2q)) = ((k − s − 1) q) / ((k − s)(q − 1)). If k − s ≥ 2 then ρ ≥ q/(2(q − 1)) ≥ 1/2, thus f(x^P) ≥ f(x^Q)/2; and if k − s = 1, then f(x^P) ≥ f(x^Q) − (q − 1)/(2q) ≥ f(x^Q) − 1/2. Since f(x^Q) > 1, we get ρ ≥ 1/2.

One direct consequence of Corollary 3.4 is that the continuous relaxation (P̄) is as hard to approximate as (P): if one of them admits an ε-approximation algorithm, then so does the other. Now, we are going to apply Lemmas 3.1, 3.2 and 3.3 to obtain a deterministic 8/(9c)-approximation algorithm for DSP, the unweighted case of HSP, where c = n/k.

Algorithm 3. Let G = (V, E) be an unweighted graph with n vertices, k ≥ 3, and n = ck. Consider the following algorithm, which builds a feasible solution x^P of (P):

(1) Solve (P̄L) to obtain x^L.
(2) Build x^{Lβ} from x^L by using Lemma 3.1.
(3) Obtain x^Q from x^{Lβ} by using Algorithm 1.
(4) Obtain x^P from x^Q by using Algorithm 2.

Recall that q is equal to the number of fractional components of x^Q (see Notation, Algorithm 1).

Theorem 3.5. If n = ck then DSP admits a deterministic polynomial approximation algorithm that gives a feasible solution x^P of (P) such that f(x*) ≤ c (f(x^P) + q/8), where x* is an optimal solution of (P). Moreover, the complexity of this algorithm is that of linear programming.

Proof. We apply Algorithm 3 to (P). We have f_L(x^L) ≥ f(x*). x^{Lβ} is an optimal solution of (P̄L) whose fractional components are all equal (Lemma 3.1). Considering that x^{Lβ} is a feasible solution of (P̄), we get f(x^{Lβ}) ≥ (1/c) f_L(x^{Lβ}) (Lemma 3.2). Finally, we have f(x^Q) ≥ f(x^{Lβ}) (Algorithm 1) and f(x^Q) − f(x^P) ≤ q/8 (Lemma 3.3). Thus x^P is a feasible solution of (P) (Algorithm 2) such that f(x*) ≤ c (f(x^P) + q/8). Algorithms 1 and 2 can be implemented in O(n²), and thus the complexity bottleneck of Algorithm 3 is solving (P̄L).

Corollary 3.6. If n = ck then DSP admits a polynomial approximation algorithm that gives a feasible solution x^a of (P) such that f(x*)/f(x^a) ≤ (9/8) c, where x* is an optimal solution of (P).

Proof. We can assume that 2 ≤ q ≤ k. Indeed, q ≠ 1 because Σ_{i=1}^{n} x_i = k (there are at least two fractional components, or none); if q ≥ k then we already have an optimal solution (a clique of size k); and obviously if q = 0 then f(x*) ≤ c f(x^P). Recall that Theorem 3.5 implies f(x*) ≤ c (f(x^P) + q/8) [C], where x^P is the feasible solution obtained by Algorithm 3. Now, we consider two cases.

First, if f(x^P) ≥ q then [C] implies f(x*)/f(x^P) ≤ (9/8) c, and we set x^a = x^P.

Second, if f(x^P) < q [D], then [C] implies f(x*) ≤ c (q + q/8). Now consider two sub-cases:

* If q ≥ 3, then we take as x^a the feasible solution consisting of the q vertices of the clique completed by k − q arbitrary vertices. We obtain:

    f(x*)/f(x^a) ≤ c (q + q/8) / (q(q − 1)/2) = (9/4) c / (q − 1) ≤ (9/8) c.

* If q = 2, then [D] and [C] imply f(x*) ≤ c (1 + 2/8) = (5/4) c [E] (since f(x^P) is an integer, f(x^P) < q = 2 means f(x^P) ≤ 1).
If there exist two adjacent edges in G, then (since k ≥ 3) we take as x^a any feasible solution of

value 2, and thus [E] implies f(x*)/f(x^a) ≤ (5/8) c. Otherwise (there are no adjacent edges in G), the optimal solution is obvious: simply take as many edges as possible (the degree of each vertex is at most one).

4. Application to a small example and comparisons

In this section, we apply our algorithm to a small example; then we give, for several values of c, the worst-case ratios of our algorithm and compare them with those previously obtained.

Let us apply Algorithm 3 to the instance of DSP described in Figure 3, where n = 8 and k = 5. By solving (P̄L) (step 1) we get x^L = (5/8, 5/8, 5/8, 5/8, 5/8, 5/8, 5/8, 5/8) and f_L(x^L) = 10. Here we have x^{Lβ} = x^L (step 2). Then (step 3, Algorithm 1), by considering successively pairs of non-adjacent vertices whose components are both fractional, we get a sequence of feasible solutions of (P̄) in which at least one more component becomes integral at each iteration, and finally x^Q = (0, 1, 1, 1, 0, 7/8, 1/2, 5/8) (the sets V_1 and V_2 are indicated in Figure 3).

Figure 3. Application of Algorithm 3: a small example.

Finally (step 4), we get x^P from x^Q by Algorithm 2: here C_6 = 3, C_7 = 2, C_8 = 1, thus x^P = (0, 1, 1, 1, 0, 1, 1, 0) and f(x^P) = 9. In this example, x^P is optimal (no integer solution can reach the value 10).

Now, we give in Table 1 our deterministic worst-case ratios when n = ck. "Naive" is the ratio obtained in Theorem 2.1. This ratio was improved in (Asahiro et al. 2000) by a tighter analysis of the greedy algorithm (column "Tighter"). "SDP Rand" is the expected ratio obtained by using the randomized algorithm based on semidefinite programming in (Jäger and Srivastav 2005). "PL Rand" is the expected ratio achieved by the randomized Goemans algorithm (see Section

2). "PL" is our ratio, which uses linear programming (see Theorem 3.5 and Corollary 3.6).

            Naive              Tighter           SDP Rand   PL Rand   PL
  Ratio     n(n−1)/(k(k−1))    R                             c         (9/8)c
  c = 2     4                  9/4 + O(1/n)      1.606      2         2.25
  c = 3     9                  4 + O(n/k²)       2.7        3         3.375
  c = 4     16                 6 + O(n/k²)       2.943      4         4.5

Table 1. Worst-case ratios for DSP.

As expected, using linear programming improves on the ratio R, which is obtained by using only a combinatorial algorithm (Theorem 2.1). The semidefinite approach provides a better (expected) ratio, but it involves a higher computational cost (moreover, it is a randomized algorithm). Let us point out that, although our worst-case ratio 8/(9c) is slightly less than 1/c, the complexity bottleneck of our deterministic approximation algorithm is only that of solving the linear program (P̄L) (see Section 3). This should be compared with the randomized approaches, since a complete derandomization has a high computational cost and/or loses a significant part of the ratio 1/c for arbitrary k and n (see Section 2). For instance, the expected worst-case ratio of the partially derandomized algorithm of (Han et al. 2000) is better than our deterministic worst-case ratio only when k is larger than 448 (its loss factor must be at least 8/9), and only for very large n.

5. Conclusion

We have proposed a simple deterministic approximation algorithm for the Densest k-Subgraph Problem, whose complexity is only that of linear programming. From a practical point of view, our algorithm is very easy to implement and can provide good solutions for large instances of DSP. However, we conjecture that using such a linear programming approach one cannot obtain a worst-case ratio that beats the target c. Recall that some randomized algorithms based on semidefinite programming have succeeded in achieving better ratios (Feige et al. 2001b, Jäger and Srivastav 2005), but one has to solve a semidefinite program and to derandomize the algorithm to get the performance guarantee.
Hence, a question remains: is there an efficient deterministic polynomial-time 1/c-approximation algorithm for DSP?

References

Asahiro, Y., Iwama, K., Tamaki, H. and Tokuyama, T. 2000, Greedily finding a dense subgraph, Journal of Algorithms, vol. 34, no. 2, pp. 203-221.

Arora, S., Karger, D. and Karpinski, M. 1999, Polynomial time approximation schemes for dense instances of NP-hard problems, Journal of Computer and System Sciences, vol. 58, no. 1, pp. 193-210.

Carlson, R. and Nemhauser, G. 1966, Clustering to minimize interaction costs, Operations Research, vol. 14, pp. 52-58.

Corneil, D.G. and Perl, Y. 1984, Clustering and domination in perfect graphs, Discrete Applied Mathematics, vol. 9, no. 1, pp. 27-39.

Feige, U., Kortsarz, G. and Peleg, D. 2001a, The dense k-subgraph problem, Algorithmica, vol. 29, no. 3, pp. 410-421.

Feige, U. and Langberg, M. 2001b, Approximation algorithms for maximization problems arising in graph partitioning, Journal of Algorithms, vol. 41, no. 2, pp. 174-211.

Feige, U. and Langberg, M. 2001c, The RPR² rounding technique for semidefinite programs, ICALP 2001, Crete, Greece, Lecture Notes in Computer Science no. 2076, pp. 213-224, Springer, Berlin.

Frieze, A. and Jerrum, M. 1997, Improved approximation algorithms for MAX k-CUT and MAX BISECTION, Algorithmica, vol. 18, pp. 67-81.

Goemans, M.X. and Williamson, D.P. 1995, Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming, Journal of the ACM, vol. 42, no. 6, pp. 1115-1145.

Han, Q., Ye, Y. and Zhang, J. 2000, Approximation of Dense-k-Subgraph, Working Paper, Department of Management Sciences, Henry B. Tippie College of Business, The University of Iowa, Iowa City, IA 52242, USA.

Han, Q., Ye, Y. and Zhang, J. 2002, An improved rounding method and semidefinite programming relaxation for graph partition, Mathematical Programming, vol. 92, no. 3, pp. 509-535.

Mahajan, S. and Hariharan, R. 1999, Derandomizing approximation algorithms based on semidefinite programming, SIAM Journal on Computing, vol. 28, no. 5, pp. 1641-1663.

Hassin, R., Rubinstein, S. and Tamir, A. 1997, Approximation algorithms for maximum dispersion, Operations Research Letters, vol. 21, pp. 133-137.

Jäger, G. and Srivastav, A. 2005, Improved approximation algorithms for maximum graph partitioning problems, Journal of Combinatorial Optimization, vol. 10, no. 2, pp. 133-167.

Kortsarz, G. and Peleg, D. 1993, On choosing a dense subgraph, 34th Annual IEEE Symposium on Foundations of Computer Science, Palo Alto, California, pp. 692-701.

Krarup, J., Pisinger, D. and Plastria, F. 2002, Discrete location problems with push-pull objectives, Discrete Applied Mathematics, vol. 123, pp. 363-378.

Raghavan, P. and Thompson, C. 1987, Randomized rounding: a technique for provably good algorithms and algorithmic proofs, Combinatorica, vol. 7, pp. 365-374.

Raghavan, P. 1988, Probabilistic construction of deterministic algorithms: approximating packing integer programs, Journal of Computer and System Sciences, vol. 37, no. 2, pp. 130-143.

Sahni, S. and Gonzalez, T. 1976, P-complete approximation problems, Journal of the ACM, vol. 23, pp. 555-565.

Srivastav, A. 2001, Derandomization in combinatorial optimization, in Handbook of Randomized Computing, Volume II, Chapter 18, pp. 731-842, Pardalos, Rajasekaran, Reif, Rolim (eds.), Kluwer Academic Publishers.

Srivastav, A. and Wolf, K. 1998, Finding dense subgraphs with semidefinite programming, International Workshop on Approximation 1998, Lecture Notes in Computer Science vol. 1444, pp. 181-191. An erratum: Srivastav, A. and Wolf, K. 1999, Erratum on "Finding dense subgraphs with semidefinite programming", Preprint, Mathematisches Seminar, Universität zu Kiel, 1999.