Lecture 1 : Probabilistic Method

IITM-CS6845: Theory, Jan 04, 01
Lecturer: N.S.Narayanaswamy
Scribe: R.Krithika

The probabilistic method is a technique for dealing with combinatorial problems by introducing randomness. Though the method relies on probability theory, it can be used to make deterministic statements. The probabilistic method toolkit typically includes, but is not limited to, Markov's Inequality, Linearity of Expectation, the Lovász Local Lemma, Concentration Inequalities and Subadditivity of Probabilities (the union bound). This lecture describes examples that illustrate the application of some of these varied tools.

1 Max Cut

A cut is a partition of the vertices of a graph into two disjoint sets. An edge is a crossing edge if its end points are in different sets of the partition. The cut-set of a cut is the set of crossing edges. The term cut sometimes refers to the cut-set rather than to the partition.

Max Cut: Given a graph $G$, partition $V(G)$ into $V_0$ and $V_1$ such that the number of edges crossing the cut is maximum.

Max Cut is known to be NP-hard by a reduction from Max 2-SAT [GJ79]. A simple polynomial-time randomized algorithm achieves an approximation factor of $1/2$: for each vertex $v$, flip an unbiased coin to decide which set of the partition $v$ belongs to. Consider this random partition.

Tools: Linearity of expectation, Independence of events

Let $X$ denote the cardinality of the associated cut. For each edge $e \in E(G)$, define the indicator random variable

$$X_e = \begin{cases} 1 & \text{if } e \text{ is a crossing edge} \\ 0 & \text{otherwise} \end{cases} \qquad\qquad X = \sum_{e \in E(G)} X_e$$
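The random partition described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the vertex labelling $0, \ldots, n-1$ and the edge-list representation are assumptions made here, not part of the lecture, and the graph is assumed to have no self-loops. The exhaustive average over all $2^n$ partitions is included purely as a sanity check for small graphs.

```python
import itertools
import random


def cut_size(edges, side):
    """Count the crossing edges, i.e. edges whose endpoints are on different sides."""
    return sum(1 for u, v in edges if side[u] != side[v])


def random_cut(n, edges, rng=random):
    """Place each vertex 0..n-1 in V_0 or V_1 with an independent fair coin flip."""
    side = [rng.randrange(2) for _ in range(n)]
    return side, cut_size(edges, side)


def average_cut_over_all_partitions(n, edges):
    """Average cut size over all 2^n assignments (exponential time, small n only)."""
    total = sum(cut_size(edges, bits)
                for bits in itertools.product((0, 1), repeat=n))
    return total / 2 ** n


if __name__ == "__main__":
    # Hypothetical example graph: a 4-cycle with one chord, so m = 5.
    example_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    print(random_cut(4, example_edges))
    print(average_cut_over_all_partitions(4, example_edges))
```

On the example graph each edge crosses the cut in exactly half of the 16 assignments, so the exhaustive average is $5/2$.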

By linearity of expectation,

$$E(X) = E\Big[\sum_{e \in E(G)} X_e\Big] = \sum_{e \in E(G)} E[X_e] = \sum_{e \in E(G)} \Pr(X_e = 1)$$

Note that the coin tosses that assign the vertices to the sets are independent. The probability that an edge $e = \{u, v\}$ is a crossing edge is given by

$$\Pr(X_e = 1) = \Pr(u \in V_0 \wedge v \in V_1) + \Pr(v \in V_0 \wedge u \in V_1) = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$$

Thus,

$$E(X) = \sum_{e \in E(G)} \frac{1}{2} = \frac{m}{2}$$

Since the size of a maximum cardinality cut is at most $m$, the number of edges in $G$, the algorithm described is a $1/2$-approximation for Max Cut in expectation. Note that this algorithm can be derandomized with the method of conditional probabilities to obtain a deterministic polynomial-time $1/2$-approximation algorithm.

Further Study

The output of the polynomial-time randomized algorithm $A$ described in this section can be viewed as a function of $n$ random bits. The $i$-th bit $b_i$ of the $n$-bit string $b \in \{0, 1\}^n$ takes the value $j \in \{0, 1\}$, denoting that vertex $v_i$ is assigned to set $V_j$ in the random partition considered. $A$ can be derandomized as follows: run $A$ for each choice of $n$-bit string and output the largest value of the cut. By our expectation argument, it follows that

$$\frac{\sum_{b_1 \cdots b_n \in \{0,1\}^n} A(b_1 \cdots b_n)}{2^n} = \frac{m}{2}$$

so the largest cut found has at least $m/2$ edges. However, the algorithm is no longer polynomial, as there are $2^n$ choices of the $n$-bit string to consider. Interestingly, it can be shown that there exists a set $S \subseteq \{0, 1\}^n$ such that $|S| = O(n^2)$ and

$$\frac{\sum_{b_1 \cdots b_n \in S} A(b_1 \cdots b_n)}{|S|} = \frac{m}{2}$$

Such existential combinatorics and the corresponding explicit constructions are interesting to consider.

2 Randomized Quick Sort

In the classical average-case analysis of quick sort, we rely on the assumption that all permutations of the input array are equally likely. However, in engineering applications, the input distribution is rarely known a priori. Instead of assuming a distribution of inputs, we impose a distribution by introducing randomness into the algorithm. We associate the notion of rank with each element of the input array $A$. That is, $r_i$ is the $i$-th smallest element among the elements of $A$. For simplicity, we assume $A$ has distinct elements.

Choose an index $i$ from $\{1, \ldots, n\}$ uniformly at random, where $n$ is the number of elements in $A$. Now $A[i]$ is designated as the pivot and the classical quick sort algorithm proceeds. Randomness is introduced at every choice of the pivot. The parameter of interest for the analysis is the expected number of comparisons $X$ made in a run of quick sort on $A$.

Tools: Linearity of expectation

Define the indicator random variable

$$X_{ij} = \begin{cases} 1 & \text{if } r_i \text{ and } r_j \text{ are compared} \\ 0 & \text{otherwise} \end{cases} \qquad\qquad X = \sum_{i=1}^{n} \sum_{j > i} X_{ij}$$

The following invariants are crucial in the analysis of randomized quick sort.

1. Once an element $x$ has been selected as a pivot in a call to partition, $x$ does not participate in any comparison in any other call to partition.
2. In any call to partition, every comparison involves the pivot of that call.
3. Any pair of elements is compared at most once.
4. Two elements $r_i$ and $r_j$ are compared if and only if the first element to be chosen as pivot from $r_i, r_{i+1}, \ldots, r_j$ is either $r_i$ or $r_j$.

Each of the $j - i + 1$ elements $r_i, \ldots, r_j$ is equally likely to be the first of them chosen as a pivot, so invariant 4 gives $\Pr(X_{ij} = 1) = \frac{2}{j - i + 1}$. By linearity of expectation,

$$E(X) = \sum_{i=1}^{n} \sum_{j > i} E[X_{ij}] = \sum_{i=1}^{n} \sum_{j > i} \Pr(X_{ij} = 1) = \sum_{i=1}^{n} \sum_{j > i} \frac{2}{j - i + 1} \le \sum_{i=1}^{n} \sum_{k=1}^{n-i+1} \frac{2}{k} \le 2 n H_n$$

where $H_n$ denotes the $n$-th Harmonic number. Thus, $E(X) = O(n \log n)$.
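A minimal sketch of randomized quick sort that counts comparisons can make the bound $E(X) \le 2 n H_n$ concrete. The function names are hypothetical, the input is assumed to contain distinct elements as in the analysis, and a call on $t$ elements is charged $t - 1$ comparisons (one per non-pivot element, as in a standard partition routine), even though the list comprehensions below scan the list twice.

```python
import random


def quicksort_comparisons(items, rng=random):
    """Sort distinct items with uniformly random pivots.

    Returns (sorted_list, comparisons), charging one comparison for each
    non-pivot element in every call to partition.
    """
    if len(items) <= 1:
        return list(items), 0
    pivot = items[rng.randrange(len(items))]  # pivot index chosen uniformly at random
    smaller = [x for x in items if x < pivot]
    larger = [x for x in items if x > pivot]
    comparisons = len(items) - 1
    left, c_left = quicksort_comparisons(smaller, rng)
    right, c_right = quicksort_comparisons(larger, rng)
    return left + [pivot] + right, comparisons + c_left + c_right


def harmonic(n):
    """The n-th Harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


if __name__ == "__main__":
    rng = random.Random(0)
    data = list(range(200))
    rng.shuffle(data)
    trials = 100
    average = sum(quicksort_comparisons(data, rng)[1] for _ in range(trials)) / trials
    print("average comparisons:", average, "bound 2nH_n:", 2 * 200 * harmonic(200))
```

The empirical average varies with the seed, but it should sit below $2 n H_n$, since that bound holds for the expectation.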

Further Study

The next inevitable question is whether there exists a family $S$ of permutations of $n$ elements such that the average number of comparisons over runs of quick sort on this set is $O(n \log n)$. That is, we would like to come up with a set $S \subseteq S_n$, where $S_n$ is the set of all permutations of $n$ elements, such that

$$\frac{\sum_{\sigma \in S} \mathrm{QuickSort}(\sigma)}{|S|} = O(n \log n)$$

If $|S| = O(n^{O(1)})$, we could use derandomization to obtain a polynomial-time algorithm achieving the desired average-case behaviour. Though this technique does not yield a sorting algorithm faster than $O(n \log n)$, the mathematical artifacts involved are of independent interest.

3 Min Cut

Consider a connected, undirected multigraph $G$ on $n$ vertices. By the property of a cut, deleting the edges of a cut disconnects $G$. A min cut is a cut of minimum cardinality.

Min Cut: Given a graph $G$, find a minimum set of edges whose deletion disconnects $G$.

By the max-flow min-cut theorem, there are polynomial-time algorithms to solve Min Cut, notably the Edmonds-Karp algorithm. Non-flow based algorithms are also known, but they are quite complicated. In this section, we study a simple randomized non-flow based algorithm for finding a min cut.

Given $G$, pick an edge $e$ uniformly at random and contract it. The resultant graph is denoted by $G/e$. Observe that no edge contraction reduces the size of a min cut. Further, every cut in $G/e$ is a cut in $G$ too. Repeat this contraction step till $G$ has only 2 vertices, say $x$ and $y$. Note that each contraction reduces the number of vertices of $G$ by one. Eliminate the self-loops created as a result of a contraction. Output the cut $(X, Y)$, where $X$ and $Y$ are the sets of vertices merged into $x$ and $y$, respectively.

We will now analyse the probability with which this algorithm outputs a min cut. We denote the graph obtained after $i$ contractions by $G_i$.

Tools: Method of conditional probabilities

Consider a min cut $C$ of $G$. Let $|C| = k$. Since the size of a min cut is at most the min degree $\delta$ of $G$, every vertex has degree at least $k$ and so $|E(G)| \ge \frac{nk}{2}$. We will now estimate the probability with which $C$ survives the sequence of contractions during the execution of the algorithm.

$$\Pr(\text{an edge } e \in C \text{ is contracted}) \le \frac{k}{nk/2} = \frac{2}{n}$$

Also,

$$\Pr(C \text{ survives after the first contraction}) \ge 1 - \frac{2}{n}$$

Further, since the random choices made in successive contractions are independent,

$$\Pr(C \text{ survives after 2 contractions}) \ge \Big(1 - \frac{2}{n}\Big) \Pr(C \text{ survives in } G_1) \ge \Big(1 - \frac{2}{n}\Big)\Big(1 - \frac{2}{n-1}\Big)$$

where the second factor is the probability that $C$ survives the contraction performed in $G_1$, given that it survived the first one. $G_1$ has $n - 1$ vertices and its min cut still has size at least $k$, so the same argument applies. Continuing in this way,

$$\Pr(C \text{ survives after the execution of the algorithm}) \ge \Big(1 - \frac{2}{n}\Big)\Big(1 - \frac{2}{n-1}\Big) \cdots \Big(1 - \frac{2}{n - (n-3)}\Big)$$

$$\Pr(C \text{ is output by the algorithm}) \ge \prod_{i=1}^{n-2} \Big(1 - \frac{2}{n - i + 1}\Big) = \frac{2}{n(n-1)}$$

Thus, the probability of the algorithm discovering a min cut is at least $\frac{2}{n^2}$. The correctness guarantee can be improved by running the algorithm multiple times and outputting the cut of minimum cardinality found. With $n^2$ runs of the algorithm, the probability that a min cut is not output is at most $\big(1 - \frac{2}{n^2}\big)^{n^2} < \frac{1}{e}$. Further executions of the algorithm make the failure probability arbitrarily small at the cost of increasing the run-time.

Interesting Observations and Further Study

In Max Cut and Randomized Quick Sort, the random variable of interest was split into sub random variables, each taking a significantly smaller range of values than the original random variable. The analysis is thus localised by linearity of expectation and independence of events. However, in the analysis of the algorithm for Min Cut, there is extensive dependence among events. In such situations, the method of conditional probabilities proves to be an asset to the analysis.

Further, note that Quick Sort runs in $O(n \log n)$ time with high probability, whereas in Min Cut the optimum solution is obtained with high probability. The sorting algorithm always produces the correct solution and the randomness is in its run-time, while in the Min Cut algorithm it is the correctness of the solution that is random. These algorithms exemplify two different types of randomized algorithms, namely Las Vegas and Monte Carlo algorithms.
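The contraction algorithm of this section can be sketched as follows. This is a hedged illustration rather than the lecture's own implementation: the function names and the union-find bookkeeping are choices made here, vertices are assumed to be labelled $0, \ldots, n-1$, and the input multigraph is assumed to be connected. Picking uniformly among the edges that are not self-loops corresponds to picking a uniformly random edge of the contracted multigraph.

```python
import random


def contract_to_two(n, edges, rng=random):
    """One run of the contraction algorithm; returns the size of the cut found.

    Assumes a connected multigraph on vertices 0..n-1 given as a list of
    (u, v) pairs; parallel edges appear as repeated pairs.
    """
    parent = list(range(n))  # union-find forest tracking merged vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    live = [(a, b) for a, b in edges if a != b]  # edges that are not self-loops
    supervertices = n
    while supervertices > 2:
        u, v = live[rng.randrange(len(live))]  # uniformly random remaining edge
        parent[find(u)] = find(v)              # contract it
        supervertices -= 1
        # Eliminate the self-loops created by the contraction.
        live = [(a, b) for a, b in live if find(a) != find(b)]
    return len(live)


def repeated_min_cut(n, edges, runs, rng=random):
    """Run the contraction algorithm several times and keep the smallest cut."""
    return min(contract_to_two(n, edges, rng) for _ in range(runs))


if __name__ == "__main__":
    # Hypothetical example: two triangles joined by one edge, so the min cut is 1.
    example = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    print(repeated_min_cut(6, example, runs=36, rng=random.Random(1)))
```

A single run always returns the size of some cut, so it never underestimates the min cut; only repetition drives down the chance of missing it.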

4 Ramsey Numbers

Ramsey theory deals with finding order amongst apparent chaos. Given a setting where a mathematical structure may appear, Ramsey theory strives to identify conditions on this setting under which this mathematical structure of interest must appear. In other words, it is an attempt to ascertain that complete disorder is an impossibility and that any large structure will necessarily contain an orderly substructure.

Ramsey Number $R(k, l)$: $R(k, l)$ is the least integer $n$ such that any edge coloring of $K_n$ using 2 colors (say, red and blue) has either a red $K_k$ or a blue $K_l$. Equivalently, $R(k, l)$ is the least integer $n$ such that any graph on at least $n$ vertices has a clique of size $k$ or an independent set of size $l$.

To prove bounds on the diagonal Ramsey numbers $\{R(k, k) \mid k = 1, 2, \ldots\}$, we identify possible relations between $n$ and $k$.

Tools: Union bound, Independence of events

Lemma 1. If $k$ and $n$ are positive integers satisfying $\binom{n}{k} 2^{1 - \binom{k}{2}} < 1$, then $R(k, k) > n$.

Proof. Consider a clique $G$ on $n$ vertices. Color the edges of $G$ independently and uniformly at random with 2 colors, red and blue. Consider a set $K$ of $k$ vertices in $G$.

$$\Pr(G[K] \text{ is monochromatic}) = 2^{1 - \binom{k}{2}}$$

By the union bound,

$$\Pr(\exists K \subseteq V(G),\ |K| = k \text{ such that } G[K] \text{ is monochromatic}) \le \binom{n}{k} 2^{1 - \binom{k}{2}}$$

Since $\binom{n}{k} 2^{1 - \binom{k}{2}} < 1$, with positive probability no set $K$ is monochromatic, so there is a 2-edge coloring of $G$ producing neither a red $K_k$ nor a blue $K_k$. Equivalently, there are at least 2 graphs on $n$ vertices having neither a $K_k$ nor an independent set of size $k$. These two graphs $G_1$ and $G_2$ are the spanning subgraphs of $G$ such that $E(G_1) = \{e \in E(G) \mid e \text{ is red}\}$ and $E(G_2) = \{e \in E(G) \mid e \text{ is blue}\}$, respectively. Observe that $G_1$ and $G_2$ are complements of each other. Hence, $R(k, k) > n$.

Lemma 2. For any $k \ge 3$, $R(k, k) > 2^{k/2 - 1}$.
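As a numeric aside on Lemma 1, its hypothesis $\binom{n}{k} 2^{1 - \binom{k}{2}} < 1$ can be evaluated exactly with integer arithmetic, which gives a concrete lower bound on $R(k, k)$ for small $k$. The sketch below is illustrative only; the function names are hypothetical, and the condition is rewritten as $2\binom{n}{k} < 2^{\binom{k}{2}}$ to avoid floating point.

```python
from math import comb


def lemma1_condition(n, k):
    """True when C(n, k) * 2^(1 - C(k, 2)) < 1, tested as 2*C(n, k) < 2^C(k, 2)."""
    return 2 * comb(n, k) < 2 ** comb(k, 2)


def largest_n(k):
    """Largest n >= k satisfying the hypothesis of Lemma 1 (requires k >= 3)."""
    if k < 3:
        raise ValueError("the condition already fails at n = k when k < 3")
    n = k
    # C(n, k) is increasing in n for n >= k, so a linear scan finds the threshold.
    while lemma1_condition(n + 1, k):
        n += 1
    return n


if __name__ == "__main__":
    for k in range(3, 11):
        print(k, largest_n(k))  # Lemma 1 then gives R(k, k) > largest_n(k)
```

For instance, $\binom{6}{4} = 15 < 32$ while $\binom{7}{4} = 35 \ge 32$, so the condition with $k = 4$ holds up to $n = 6$ and Lemma 1 yields $R(4, 4) > 6$.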

Proof. Consider a clique $G$ on $n$ vertices where $n \le 2^{k/2 - 1}$. Color the edges of $G$ independently and uniformly at random with 2 colors, red and blue. Consider a set $K$ of $k$ vertices in $G$.

$$\Pr(K \text{ is monochromatic}) = 2^{1 - \binom{k}{2}}$$

By the union bound,

$$\Pr(\exists K \subseteq V(G),\ |K| = k \text{ such that } K \text{ is monochromatic}) \le \binom{n}{k} 2^{1 - \binom{k}{2}}$$

Also,

$$\binom{n}{k} 2^{1 - \binom{k}{2}} \le n^k \, 2^{1 - \frac{k(k-1)}{2}} \le 2^{\frac{k^2}{2} - k} \cdot 2^{1 - \frac{k^2}{2} + \frac{k}{2}} = 2^{1 - \frac{k}{2}}$$

Since $k \ge 3$, $\frac{k}{2} > 1$. Thus, $\Pr(\exists K \subseteq V(G),\ |K| = k \text{ such that } K \text{ is monochromatic}) < 1$. It follows that there exists a coloring producing no monochromatic $K_k$ in $G$. Hence, $R(k, k) > 2^{k/2 - 1}$ for $k \ge 3$.

Further Study

Better bounds for $R(k, k)$ and constructive Ramsey numbers are natural directions for a deeper study. As a generalization, multicolour Ramsey numbers are other interesting objects worth exploring.

5 Tournaments with Property $S_k$

A tournament $T(V, E)$ is a complete directed graph. That is, for any distinct $v_i, v_j \in V(T)$, exactly one of $(v_i, v_j)$ and $(v_j, v_i)$ is in $E(T)$. $T$ is said to have property $S_k$ if for every set $X$ of $k$ vertices, there exists a vertex $v_l \in V(T)$ such that $X \subseteq N(v_l)$. We refer to such a set $X$ as being dominated by $v_l$.

Lemma 3. If $\binom{n}{k} \big(1 - \frac{1}{2^k}\big)^{n-k} < 1$, then there exists a tournament on $n$ vertices with property $S_k$.

Proof. Consider a random tournament $T$ on $n$ vertices. By a random tournament, we mean that for each $1 \le i < j \le n$, one of $(v_i, v_j)$ and $(v_j, v_i)$ is placed in $E(T)$ with equal probability, independently of the other pairs.

Tools: Union bound, Independence of events

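Pick a set $X \subseteq V(T)$ of size $k$. For a vertex $v_i \notin X$,

$$\Pr(X \text{ is dominated by } v_i) = \frac{1}{2^k} \quad \text{and} \quad \Pr(X \text{ is not dominated by } v_i) = 1 - \frac{1}{2^k}$$

The events for the $n - k$ vertices outside $X$ involve disjoint sets of edges and are therefore independent, so

$$\Pr(X \text{ is not dominated}) = \Big(1 - \frac{1}{2^k}\Big)^{n-k}$$

By the union bound,

$$\Pr(\exists X \subseteq V(T),\ |X| = k \text{ such that } X \text{ is not dominated}) \le \binom{n}{k} \Big(1 - \frac{1}{2^k}\Big)^{n-k}$$

As $\binom{n}{k} \big(1 - \frac{1}{2^k}\big)^{n-k} < 1$, with positive probability every set of $k$ vertices is dominated. It follows that there exists a tournament on $n$ vertices with property $S_k$.

Further Study

Having proved the existence of a tournament $T$ satisfying $S_k$ under appropriate conditions, it is interesting to consider the construction of $T$. Exciting non-trivial constructions of such tournaments are known [AS92] and are natural directions for deeper study.

6 Hypergraph Coloring

A $k$-uniform hypergraph $H$ is a pair $(V, E)$ where $V$ is the set of vertices and $E \subseteq \binom{V}{k}$ is the set of hyperedges. $H$ is 2-colorable if its vertices can be colored with 2 colors such that no edge is monochromatic.

Lemma 4. For any $k$-uniform hypergraph $H$ on $n$ vertices, if $|E(H)| < 2^{k-1}$, then $H$ is 2-colorable.

Proof. Consider a uniformly random 2-coloring of the vertices of an arbitrary $k$-uniform hypergraph $H$ with $|E(H)| < 2^{k-1}$.

Tools: Union bound, Independence of events

Consider an edge $e \in E(H)$.

$$\Pr(e \text{ is monochromatic}) = \frac{2}{2^k} = 2^{1-k}$$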

By the union bound and the fact that $|E(H)| < 2^{k-1}$,

$$\Pr(\exists e \in E(H) \text{ such that } e \text{ is monochromatic}) \le |E(H)| \cdot 2^{1-k} < 2^{k-1} \cdot 2^{1-k} = 1$$

Thus, the probability that all edges in $H$ are non-monochromatic is non-zero. Hence, there exists a 2-coloring for $H$.

Further Study

Better bounds, in particular lower bounds, are the natural next areas of study.

References

[AS92] Alon, N. and Spencer, J.H. The Probabilistic Method. Wiley, New York, 1992.
[GJ79] Garey, M.R. and Johnson, D.S. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman and Company, 1979.