Finding Dense Subgraphs in G(n, 1/2)

Atish Das Sarma¹, Amit Deshpande², and Ravi Kannan²

¹ Georgia Institute of Technology, atish@cc.gatech.edu
² Microsoft Research-Bangalore, {amitdesh, kannan}@microsoft.com
(Work done while at Microsoft Research.)

Abstract. Finding the largest clique in random graphs is a well known hard problem. It is known that a random graph G(n, 1/2) almost surely has a clique of size about 2 log n. A simple greedy algorithm finds a clique of size log n, and it is a long-standing open problem to find a clique of size (1 + ε) log n in randomized polynomial time. In this paper, we study the generalization of finding the largest subgraph of any given edge density. We show that a simple modification of the greedy algorithm finds a subset of 2 log n vertices with induced edge density at least 0.951. We also show that almost surely there is no subset of 2.784 log n vertices whose induced edge density is at least 0.951.

1 Introduction

Finding the largest clique is a notoriously hard problem, even on random graphs. It is known that the clique number of a random graph G(n, 1/2) is almost surely either k or k + 1, where k ≈ 2 log n − 2 log log n − 1 (Section 4.5 in [1], also [2]). However, a simple greedy algorithm finds a clique of size only log n (1 + o(1)), with high probability, and finding larger cliques (of size even (1 + ε) log n) in randomized polynomial time has been a long-standing open problem [3]. In this paper, we study the following generalization: given a random graph G(n, 1/2), find the largest subgraph with edge density at least 1 − δ. We show that a simple modification of the greedy algorithm finds a subset of 2 log n vertices whose induced subgraph has edge density at least 0.951, with high probability. To complement this, we show that almost surely there is no subset of 2.784 log n vertices whose induced subgraph has edge density 0.951 or more.

We use G(n, p) to denote a random graph on n vertices where each pair of vertices appears as an edge independently with probability p. We use V to denote its set of vertices and E to denote its set of edges. Moreover, given two subsets S ⊆ V and T ⊆ V, we use E(S, T) to denote the set of edges with one endpoint in S and another endpoint in T. The density of the subgraph induced by the vertices in S is given by

    density(S) = |E(S, S)| / C(|S|, 2).

Therefore, the expected density of G(n, 1/2) is 1/2 and the density of any clique is 1.
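As a concrete illustration of these definitions (the sketch and the helper names sample_gnp and density are ours, not part of the paper), the following Python snippet samples G(n, 1/2) and computes the induced edge density of a vertex subset:

    import random
    from itertools import combinations

    def sample_gnp(n, p=0.5, seed=None):
        """Sample G(n, p): each of the C(n, 2) vertex pairs is an edge independently with probability p."""
        rng = random.Random(seed)
        return {frozenset(pair) for pair in combinations(range(n), 2) if rng.random() < p}

    def density(S, edges):
        """density(S) = |E(S, S)| / C(|S|, 2): the fraction of pairs inside S that are edges."""
        S = list(S)
        pairs = len(S) * (len(S) - 1) // 2
        inside = sum(1 for pair in combinations(S, 2) if frozenset(pair) in edges)
        return inside / pairs if pairs else 0.0

    if __name__ == "__main__":
        edges = sample_gnp(512, seed=0)
        print(round(density(range(512), edges), 3))   # close to 1/2, the expected density of G(n, 1/2)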

In Section 2 we describe our algorithm for finding subgraphs of density 1 − δ. We give a bound on the largest subgraph of density 1 − δ in the following Section 3. Finally, in Section 4, we present some open problems.

2 Algorithm for finding a large subgraph of density 1 − δ

In this section, we describe our algorithm and give a relationship between the size of the subgraph obtained by the algorithm and its density. In particular, we show that the algorithm can be used to obtain a subset of 2 log n vertices of density 0.951, with high probability.

Greedy Algorithm to pick a dense subgraph:

Input: a random graph G(n, 1/2) and δ > 0.
Output: a subset S ⊆ V of size 2 log n.

1. Partition the vertices into disjoint sets V = V_1 ∪ V_2 ∪ ... ∪ V_{2 log n}, each of size n/(2 log n).
2. Initialize S_0 = ∅.
3. For k = 0 to 2 log n − 1 do:
   (a) Pick v_{k+1} ∈ V_{k+1} that has the maximum number of edges to S_k, i.e.,
       v_{k+1} = argmax_{v ∈ V_{k+1}} |E(v, S_k)|.
   (b) S_{k+1} = S_k ∪ {v_{k+1}}.
4. Return S = S_{2 log n}.

Notice that the algorithm first partitions all nodes into random subsets of the same size, and then picks one vertex from each partition. This partitioning is necessary to argue about independence in our analysis of choosing vertices greedily. In the analysis below, H(δ) is the standard notation for the Shannon entropy function, which is −δ log δ − (1 − δ) log(1 − δ).
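A runnable Python sketch of this greedy procedure is given below (our illustration, not the paper's code; the function name, the choice n = 2048, and the rounding of 2 log n to an integer are ours). It samples G(n, 1/2), partitions the vertices at random, and greedily adds one vertex per part:

    import math
    import random

    def greedy_dense_subgraph(n, seed=None):
        """Greedy algorithm above: pick one vertex per random part, maximizing edges into the current set S."""
        rng = random.Random(seed)
        # Sample G(n, 1/2) as a symmetric 0/1 adjacency structure.
        adj = [bytearray(n) for _ in range(n)]
        for u in range(n):
            for v in range(u + 1, n):
                if rng.random() < 0.5:
                    adj[u][v] = adj[v][u] = 1

        K = round(2 * math.log2(n))              # number of parts, 2 log n
        verts = list(range(n))
        rng.shuffle(verts)                       # random partition V = V_1 ∪ ... ∪ V_K
        parts = [verts[i::K] for i in range(K)]  # each part has roughly n / (2 log n) vertices

        S = []
        for part in parts:
            # v_{k+1} = argmax over the next part of |E(v, S_k)|
            S.append(max(part, key=lambda v: sum(adj[v][w] for w in S)))

        inside = sum(adj[u][w] for i, u in enumerate(S) for w in S[i + 1:])
        return S, inside / (len(S) * (len(S) - 1) / 2)

    if __name__ == "__main__":
        _, dens = greedy_dense_subgraph(2048, seed=1)
        print(round(dens, 3))   # induced density of the chosen 2 log n vertices; the 0.951 guarantee is asymptotic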

The following lemma gives a lower bound on the number of edges we can expect to add to our subgraph, for the k-th vertex added by the algorithm.

Lemma 1. For any k ≥ 1 and any δ_k that satisfies

    H(δ_k) ≥ 1 − (1/k) log( n / (4 log n ln(2 log n)) ),

we have

    Pr[ |E(v_{k+1}, S_k)| ≥ (1 − δ_k)k ] ≥ 1 − 1/(2 log n)².

Proof. We know by previous results that, as long as k < log n, the vertex added has all k edges to S_k, so consider k ≥ log n. The algorithm has n/(2 log n) vertices to choose from, and we bound the probability that at least one of them has at least (1 − δ_k)k edges to S_k. Fix v ∈ V_{k+1}. The probability that v has at least (1 − δ_k)k edges to S_k is

    Pr[ |E(v, S_k)| ≥ (1 − δ_k)k ] = Σ_{t ≥ (1 − δ_k)k} C(k, t) 2^{−k} ≥ 2^{(H(δ_k) + o(1) − 1)k},

where H(δ) = −δ log δ − (1 − δ) log(1 − δ) is the Shannon entropy (here log is taken with base 2). Using independence of these events for different v ∈ V_{k+1}, we get

    Pr[ |E(v, S_k)| < (1 − δ_k)k for every v ∈ V_{k+1} ] ≤ (1 − 2^{(H(δ_k) + o(1) − 1)k})^{n/(2 log n)}
        ≤ exp( −2^{(H(δ_k) + o(1) − 1)k} · n/(2 log n) ) ≤ exp( −2 ln(2 log n) ) = 1/(2 log n)².

Therefore,

    Pr[ |E(v_{k+1}, S_k)| ≥ (1 − δ_k)k ] ≥ 1 − 1/(2 log n)².

We now give a union bound over all additions of vertices, using the previous lemma.

Lemma 2. Pr[ |E(S, S)| ≥ Σ_{k=0}^{2 log n − 1} (1 − δ_k)k ] → 1 as n → ∞.

Proof. Since V_1, V_2, ..., V_{2 log n} are disjoint, using independence and Lemma 1 we get

    Pr[ |E(S, S)| ≥ Σ_{k=0}^{2 log n − 1} (1 − δ_k)k ] ≥ Π_{k=0}^{2 log n − 1} Pr[ |E(v_{k+1}, S_k)| ≥ (1 − δ_k)k ]
        ≥ (1 − 1/(2 log n)²)^{2 log n} ≥ e^{−1/log n} → 1,

using the fact that the product has 2 log n factors. The point is that we are picking exactly one vertex from each vertex set/partition, and hence do not lose any randomness or independence of the edges.

This now gives us a bound on the minimum number of edges one can expect, w.h.p., in the chosen set of vertices. We are not able to express, in a closed form, the size of a subgraph obtainable using this algorithm for a specific density. Therefore, we state the best density one can guarantee w.h.p. for size 2 log n. This is stated as a theorem below, which we prove subsequently.
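Before turning to the theorem: the analysis takes δ_k as small as Lemma 1 allows, namely the smallest δ_k with H(δ_k) ≥ 1 − (1/k) log(n/(4 log n ln(2 log n))). The short sketch below (ours; the bisection tolerance, the choice n = 2^20, and the helper names are assumptions for illustration) computes that schedule numerically:

    import math

    def H(d):
        """Binary entropy H(d) = -d log2(d) - (1 - d) log2(1 - d), with H(0) = 0."""
        if d <= 0.0 or d >= 1.0:
            return 0.0
        return -d * math.log2(d) - (1 - d) * math.log2(1 - d)

    def H_inv(y, tol=1e-12):
        """Smallest d in [0, 1/2] with H(d) >= y, found by bisection (H is increasing on [0, 1/2])."""
        lo, hi = 0.0, 0.5
        while hi - lo > tol:
            mid = (lo + hi) / 2
            lo, hi = (mid, hi) if H(mid) < y else (lo, mid)
        return hi

    def delta_schedule(n):
        """Smallest delta_k admissible in Lemma 1, for k = 1, ..., 2 log n - 1."""
        log_n = math.log2(n)
        log_m = math.log2(n / (4 * log_n * math.log(2 * log_n)))   # log of n / (4 log n ln(2 log n))
        return [H_inv(max(0.0, 1 - log_m / k)) for k in range(1, round(2 * log_n))]

    if __name__ == "__main__":
        deltas = delta_schedule(2 ** 20)
        print(round(deltas[-1], 3))   # largest delta in the schedule (it is 0 for every k <= log m)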

Theorem 1. Our algorithm produces a subset S ⊆ V of size 2 log n such that density(S) ≥ 0.951, almost surely.

Proof. From Lemma 2 we have that, almost surely,

    |E(S, S)| ≥ Σ_{k=0}^{2 log n − 1} (1 − δ_k)k = Σ_{k=0}^{2 log n − 1} k − Σ_{k=log m}^{2 log n − 1} k · H^{-1}(1 − (log m)/k),    (1)

where m = n/(4 log n ln(2 log n)). Here we use the fact that we can choose δ_k = 0 for the first log m steps. Now let 2 log n − 1 = (1 + α) log m. Then

    Σ_{k=log m}^{(1+α) log m} k · H^{-1}(1 − (log m)/k) = Σ_{t=0}^{α log m} (log m + t) · H^{-1}(1 − (log m)/(log m + t))
        ≤ (log m)² ∫_{x=0}^{α} (1 + x) · H^{-1}(1 − 1/(1 + x)) dx.    (2)

Now using Equations (1) and (2) we have

    density(S) = |E(S, S)| / C(2 log n, 2)
        ≥ 1 − (2/(1 + α)²)(1 + o(1)) ∫_{0}^{α} (1 + x) · H^{-1}(1 − 1/(1 + x)) dx ≥ 0.951,

using

    α log m = 2 log n − 1 − log m = 2 log n − 1 − log n + log(4 log n ln(2 log n)) = log n (1 + o(1)),

so that α = 1 + o(1), and computing an upper bound on the integral numerically.
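The numerical bound behind the constant 0.951 can be reproduced with a few lines of Python (our sketch; the midpoint rule, the step count, and setting α = 1 in place of 1 + o(1) are assumptions of this illustration):

    import math

    def H(d):
        """Binary entropy with log base 2; H(0) = 0."""
        return 0.0 if d <= 0.0 or d >= 1.0 else -d * math.log2(d) - (1 - d) * math.log2(1 - d)

    def H_inv(y, tol=1e-12):
        """Smallest d in [0, 1/2] with H(d) >= y (bisection)."""
        lo, hi = 0.0, 0.5
        while hi - lo > tol:
            mid = (lo + hi) / 2
            lo, hi = (mid, hi) if H(mid) < y else (lo, mid)
        return hi

    def density_bound(alpha=1.0, steps=4000):
        """1 - (2 / (1 + alpha)^2) * integral_0^alpha (1 + x) H^{-1}(1 - 1/(1 + x)) dx, via the midpoint rule."""
        h = alpha / steps
        integral = 0.0
        for i in range(steps):
            x = (i + 0.5) * h
            integral += (1 + x) * H_inv(1 - 1 / (1 + x)) * h
        return 1 - (2 / (1 + alpha) ** 2) * integral

    if __name__ == "__main__":
        print(round(density_bound(), 3))   # about 0.951, matching Theorem 1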

3 Upper bound on the largest subgraph of density 1 − δ

In this section, we upper bound the size of the largest subgraph of density 1 − δ in G(n, 1/2).

Theorem 2. A random graph G(n, 1/2) has no subgraph of size

    k ≥ (2 log n + 2 log e) / (1 − H(δ) − o(1)) + 1

and density at least 1 − δ, almost surely. In particular, there is no subgraph of size 2.784 log n and density at least 0.951, almost surely.

Proof. For every S ⊆ V of size k, define an indicator random variable X_S as follows:

    X_S = 1 if S induces a subgraph of density at least 1 − δ, and X_S = 0 otherwise.

Thus E[X_S] = Pr[density(S) ≥ 1 − δ] ≤ 2^{(H(δ) + o(1) − 1) C(k, 2)}. By linearity of expectation, a bound on the expected number of subgraphs of size k and density at least 1 − δ follows:

    E[ Σ_{S : |S| = k} X_S ] = Σ_{S : |S| = k} E[X_S]
        ≤ C(n, k) · 2^{(H(δ) + o(1) − 1) C(k, 2)}
        ≤ (en/k)^k · 2^{(H(δ) + o(1) − 1) k(k − 1)/2}
        = ( (en/k) · 2^{−(1 − H(δ) − o(1))(k − 1)/2} )^k
        → 0, as n → ∞,

using k ≥ (2 log n + 2 log e)/(1 − H(δ) − o(1)) + 1. Therefore, by Markov's inequality we have

    Pr[ Σ_{S : |S| = k} X_S ≥ 1 ] ≤ E[ Σ_{S : |S| = k} X_S ] → 0, as n → ∞.

In other words, almost surely there is no subset of k vertices that induces a subgraph of density at least 1 − δ.
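As a quick numerical sanity check of the constant in Theorem 2 (a back-of-the-envelope script of ours that ignores the o(1) and the additive lower-order terms), the leading coefficient 2/(1 − H(δ)) at δ = 0.049 comes out near the stated 2.784:

    import math

    def H(d):
        """Binary entropy with log base 2."""
        return -d * math.log2(d) - (1 - d) * math.log2(1 - d)

    if __name__ == "__main__":
        delta = 0.049                                  # density at least 1 - delta = 0.951
        print(round(2 / (1 - H(delta)), 3))            # about 2.786, consistent with the 2.784 log n threshold

        # Size bound vs. the 2 log n achieved by the greedy algorithm, at a concrete n (lower-order
        # terms still matter at this scale; the ratio tends to about 1.39 as n grows).
        n = 2 ** 20
        bound = (2 * math.log2(n) + 2 * math.log2(math.e)) / (1 - H(delta)) + 1
        print(round(bound / (2 * math.log2(n)), 2))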

Notice that for density 0.951, the gap/ratio between the largest subgraph that exists and the largest subgraph that we can find is smaller than in the case of cliques. This is interesting, although not entirely unexpected, as for density 0.5 the whole graph can be output. This ratio for density 0.951 is, however, significantly smaller than 2; it is 2.784/2 = 1.392.

4 Conclusions

For a concrete open problem, is there a polynomial time algorithm that outputs a subgraph of density 1 − ε and size 2 log n, for any choice of ε > 0? Are there simple algorithms that beat the density bound of 0.951 for subgraphs of size 2 log n? If not, what is the maximum density obtainable for a subgraph of size 2 log n? Spectral techniques could be tried.

Is there an O(n^{log n}) time algorithm that finds the largest clique in G(n, 1/2)? We thank an anonymous reviewer for pointing out a solution to this: simply enumerate all cliques of all sizes, generating them from smaller ones to bigger ones; once all cliques of size k have been found, try to add to each of them every vertex and see if it can be expanded. The expected number of cliques of size k is C(n, k) 2^{−C(k, 2)} < n^k 2^{−k(k−1)/2}, and with high probability the number of cliques of each size does not exceed this expectation by much. The function n^k 2^{−k(k−1)/2} attains its maximum around k = log n, and its value for this k is roughly n^{0.5 log n}, showing that the above algorithm will run in time n^{(0.5 + o(1)) log n} and find the largest clique with high probability over the input graphs. Can one do better than this?

References

1. Noga Alon and Joel Spencer, The probabilistic method, second ed., Wiley Interscience, 2000.
2. Béla Bollobás, Random graphs, Academic Press, New York, 1985.
3. Richard Karp, The probabilistic analysis of some combinatorial search algorithms, 1976, pp. 1-19.
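To make the reviewer's enumeration idea concrete, here is a toy Python sketch (ours, not part of the paper; it is only practical for very small n, since the running time is quasi-polynomial):

    import random
    from itertools import combinations

    def largest_clique_by_enumeration(n, seed=None):
        """Enumerate cliques level by level: extend every size-k clique by every possible vertex."""
        rng = random.Random(seed)
        adj = [set() for _ in range(n)]
        for u, v in combinations(range(n), 2):      # sample G(n, 1/2)
            if rng.random() < 0.5:
                adj[u].add(v)
                adj[v].add(u)

        level = {frozenset([v]) for v in range(n)}  # all cliques of the current size
        largest = frozenset()
        while level:
            largest = next(iter(level))             # any clique of the current (largest so far) size
            level = {clique | {v}
                     for clique in level
                     for v in range(n)
                     if v not in clique and clique <= adj[v]}
        return largest

    if __name__ == "__main__":
        print(len(largest_clique_by_enumeration(64, seed=2)))   # clique number of this small sample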