IITM-CS6845: Theory (Jan 04, 2012)
Lecture 1: Probabilistic Method
Lecturer: N.S.Narayanaswamy
Scribe: R.Krithika

The probabilistic method is a technique for dealing with combinatorial problems by introducing randomness. Though the method relies on probability theory, it can be used to make deterministic statements. The probabilistic method toolkit typically includes Markov's Inequality, Linearity of Expectation, the Lovász Local Lemma, Concentration Inequalities and Subadditivity of Probabilities (the union bound), to name a few. This lecture describes examples that illustrate the application of some of these varied tools.

1 Max Cut

A cut is a partition of the vertices of a graph into two disjoint sets. An edge is a crossing edge if its end points are in different sets of the partition. The cut-set of a cut is the set of crossing edges. The term "cut" can sometimes refer to the cut-set instead of the partition.

Max Cut: Given a graph G, partition V(G) into V_0 and V_1 such that the number of edges crossing the cut is maximum.

Max Cut is known to be NP-hard by a reduction from Max 2-SAT [GJ79]. A simple polynomial-time randomized algorithm achieves an approximation factor of 1/2: for each vertex v, flip an unbiased coin to decide which set of the partition v belongs to. Consider this random partition.

Tools: Linearity of expectation, Independence of events

Let X denote the cardinality of the associated cut. For each edge e ∈ E(G), define the indicator random variable

    X_e = 1 if e is a crossing edge, and X_e = 0 otherwise.

Then X = Σ_{e ∈ E(G)} X_e.
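The random partition above is straightforward to implement; here is a minimal Python sketch (the function name and the edge-list input format are illustrative assumptions, not from the lecture):

```python
import random

def random_cut(n, edges):
    """Assign each of the n vertices to V_0 or V_1 by a fair coin flip
    and return the number of crossing edges of the resulting cut."""
    side = [random.randrange(2) for _ in range(n)]  # side[v] is 0 or 1
    return sum(1 for u, v in edges if side[u] != side[v])
```

As the analysis below shows, the expected number of crossing edges returned is m/2, where m is the number of edges.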
By linearity of expectation,

    E[X] = E[Σ_{e ∈ E(G)} X_e] = Σ_{e ∈ E(G)} E[X_e] = Σ_{e ∈ E(G)} Pr(X_e = 1)

Note that the coin tosses assigning vertices to sets are independent. The probability that an edge e = {u, v} is a crossing edge is given by

    Pr(X_e = 1) = Pr(u ∈ V_0 ∧ v ∈ V_1) + Pr(v ∈ V_0 ∧ u ∈ V_1) = 1/4 + 1/4 = 1/2

Thus,

    E[X] = Σ_{e ∈ E(G)} 1/2 = m/2

Since the size of a maximum cardinality cut is at most m, the number of edges in G, the algorithm described is a 1/2-approximation for Max Cut. Note that this algorithm can be derandomized with the method of conditional probabilities to obtain a deterministic polynomial-time 1/2-approximation algorithm.

Further Study

The output of the polynomial-time randomized algorithm A described in this section can be viewed as a function of n random bits. The i-th bit b_i in this n-bit string b ∈ {0, 1}^n takes the value j ∈ {0, 1}, denoting that vertex v_i is assigned to set V_j in the random partition considered. A can be derandomized as follows: run A for each choice of n-bit string and output the largest value of the cut. By our expectation argument, it follows that

    (1/2^n) Σ_{b_1⋯b_n ∈ {0,1}^n} A(b_1⋯b_n) = m/2

However, the algorithm is no longer polynomial as there are 2^n choices of the n-bit string to consider. Interestingly, it can be shown that there exists a set S ⊆ {0, 1}^n such that |S| = O(n^2) and

    (1/|S|) Σ_{(b_1⋯b_n) ∈ S} A(b_1⋯b_n) = m/2

Such existential combinatorics and the corresponding explicit constructions are interesting to consider.

2 Randomized Quick Sort

In the classical average-case analysis of quick sort, we rely on the assumption that all permutations of the input array are equally likely. However, in engineering applications, the
input distribution is rarely known a priori. Instead of assuming a distribution on the inputs, we impose a distribution by introducing randomness into the algorithm.

We associate the notion of rank with each element of the input array A: r_i denotes the i-th smallest element among the elements of A. For simplicity, we assume A has distinct elements. Choose an index i from {1, 2, ..., n}, where n is the number of elements in A. This choice is made uniformly at random. Now, A[i] is designated as the pivot and the classical quick sort algorithm proceeds. At every choice of the pivot, randomness is introduced. The parameter of interest for the analysis is the expected number of comparisons X made in a run of quick sort on A.

Tools: Linearity of expectation

Define the indicator random variable

    X_ij = 1 if r_i and r_j are compared, and X_ij = 0 otherwise.

Then X = Σ_{i=1}^{n} Σ_{j>i} X_ij.

Here are crucial invariants used in the analysis of randomized quick sort.

1. Once an element x has been selected as a pivot in a call to partition, x does not participate in any comparison in any other call to partition.
2. In any call to partition, every comparison involves the pivot of that call.
3. Any pair of elements is compared at most once.
4. Two elements r_i and r_j are compared if and only if the first element to be chosen as pivot from r_i, r_{i+1}, ..., r_j is either r_i or r_j.

By invariant 4, since each of the j − i + 1 elements r_i, ..., r_j is equally likely to be the first among them chosen as a pivot, Pr(X_ij = 1) = 2/(j − i + 1). By linearity of expectation,

    E[X] = Σ_{i=1}^{n} Σ_{j>i} E[X_ij] = Σ_{i=1}^{n} Σ_{j>i} Pr(X_ij = 1) = Σ_{i=1}^{n} Σ_{j>i} 2/(j − i + 1) ≤ Σ_{i=1}^{n} Σ_{k=1}^{n−i+1} 2/k ≤ 2nH_n

where H_n denotes the n-th Harmonic number. Thus, E[X] = O(n log n).
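As a sketch, randomized quick sort can be written as follows (Python; this out-of-place version illustrates the uniformly random pivot choice, though the classical algorithm analysed above partitions in place):

```python
import random

def randomized_quicksort(a):
    """Sort a list using quick sort with a uniformly random pivot.
    The expected number of comparisons is O(n log n) for every input
    order, since the distribution is imposed by the algorithm itself."""
    if len(a) <= 1:
        return list(a)
    pivot = a[random.randrange(len(a))]  # pivot chosen uniformly at random
    smaller = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    larger = [x for x in a if x > pivot]
    return randomized_quicksort(smaller) + equal + randomized_quicksort(larger)
```

Whatever pivots the random choices produce, the output is always the sorted list; only the running time is random (a Las Vegas algorithm, as discussed later in these notes).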
Further Study

The next inevitable question is to determine whether there exists a family S of permutations of n elements such that the average number of comparisons over runs of quick sort on this set is O(n log n). That is, we would like to come up with a set S ⊆ S_n, where S_n is the set of all permutations of n elements, such that

    (1/|S|) Σ_{σ ∈ S} QuickSort(σ) = O(n log n)

where QuickSort(σ) denotes the number of comparisons made on input σ. If |S| = O(n^{O(1)}), we could use derandomization to obtain a polynomial-time algorithm achieving the desired average-case behaviour. Though this technique does not yield a sorting algorithm faster than O(n log n), the mathematical artifacts involved are of independent interest.

3 Min Cut

Consider a connected, undirected multigraph G on n vertices. By the property of a cut, deleting the edges of a cut disconnects G. A min cut is a cut of minimum cardinality.

Min Cut: Given a graph G, find a minimum set of edges whose deletion disconnects G.

By the max-flow min-cut theorem, there are polynomial-time algorithms to solve Min Cut, notably the Edmonds-Karp algorithm. Non-flow-based algorithms are also known, but they are quite complicated. In this section, we study a simple randomized non-flow-based algorithm for finding a min cut.

Given G, pick an edge e uniformly at random and contract it. The resultant graph is denoted G/e. Observe that no edge contraction reduces the size of a min cut. Further, every cut in G/e is a cut in G too. Repeat this contraction step till G has only 2 vertices, say x and y. Note that each contraction reduces the number of vertices of G by one. Eliminate the self-loops created as a result of a contraction. Output the cut as X and Y, where X and Y correspond to the sets of vertices merged into x and y, respectively.

We will now analyse the probability with which this algorithm outputs a min cut. We denote the graph obtained after i contractions by G_i.

Tools: Method of conditional probabilities

Consider a min cut C of G. Let |C| = k.
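Before the analysis, here is a sketch of the contraction algorithm (Python; the edge-list multigraph representation and the union-find bookkeeping are implementation assumptions):

```python
import random

def karger_min_cut(n, edge_list):
    """One run of the random contraction algorithm on a connected
    multigraph with vertices 0..n-1 given as a list of edges (u, v).
    Returns the size of the cut it finds; as shown below, this is a
    min cut with probability at least 2/(n(n-1))."""
    leader = list(range(n))  # union-find: super-vertex containing each vertex

    def find(v):
        while leader[v] != v:
            leader[v] = leader[leader[v]]  # path halving
            v = leader[v]
        return v

    edges = list(edge_list)
    vertices = n
    while vertices > 2:
        # eliminate self-loops created by earlier contractions
        edges = [(u, v) for u, v in edges if find(u) != find(v)]
        u, v = edges[random.randrange(len(edges))]  # uniform random edge
        leader[find(u)] = find(v)                   # contract it
        vertices -= 1
    # edges still crossing the two remaining super-vertices form the cut
    return sum(1 for u, v in edges if find(u) != find(v))
```

Parallel edges are deliberately kept, since the contraction argument is stated for multigraphs; only self-loops are discarded.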
Since the size of a min cut is at most the minimum degree δ of G, every vertex has degree at least k, and hence |E(G)| ≥ nk/2. We will now estimate the probability with which C survives the sequence of contractions during the execution of the algorithm.

    Pr(an edge e ∈ C is contracted) ≤ k/(nk/2) = 2/n

Thus,

    Pr(C survives the first contraction) ≥ 1 − 2/n

Since G_1 has n − 1 vertices and its min cut still has size at least k, conditioning on C surviving the first contraction gives

    Pr(C survives the first 2 contractions) ≥ (1 − 2/(n−1)) · Pr(C survives in G_1)

Further,

    Pr(C survives the execution of the algorithm) ≥ (1 − 2/n)(1 − 2/(n−1)) ⋯ (1 − 2/(n−(n−3)))

    Pr(C is output by the algorithm) ≥ Π_{i=1}^{n−2} (1 − 2/(n − i + 1)) = 2/(n(n−1))

Thus, the probability of the algorithm discovering a min cut is at least 2/(n(n−1)) ≥ 2/n². The correctness guarantee can be increased by running the algorithm multiple times and outputting the minimum cardinality set found. With n²/2 runs of the algorithm, the probability that a min cut is not output is at most (1 − 2/n²)^{n²/2} < 1/e. Further executions of the algorithm make the failure probability arbitrarily small at the cost of increased run-time.

Interesting Observations and Further Study

In Max Cut and Randomized Quick Sort, the random variable of interest was split into sub-random variables taking a significantly smaller range of values compared to the original random variable. Thus the analysis is localised by linearity of expectation and independence of events. However, in the analysis of the algorithm for Min Cut, there is extensive dependence among events. In such situations, the method of conditional probabilities proves to be an asset in the analysis. Further, note that Quick Sort runs in O(n log n) time with high probability, whereas in Min Cut, the optimum solution is obtained with high probability. The sorting algorithm always produces the correct solution; the randomness is in the run-time. In the Min Cut algorithm, by contrast, the correctness of the solution is random. These algorithms exemplify two different types of randomized algorithms, namely, Las Vegas and Monte Carlo algorithms.
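The amplification argument for the Min Cut algorithm can also be checked numerically; a small sketch (the function name is an assumption, and only the per-run success lower bound 2/n² is used):

```python
import math

def failure_bound(n, runs):
    """Upper bound on Pr(no run outputs a min cut): each independent
    run succeeds with probability at least 2/n^2."""
    return (1 - 2 / n**2) ** runs

# With n^2/2 runs the failure bound drops below 1/e:
for n in range(2, 40, 2):
    assert failure_bound(n, n * n // 2) < 1 / math.e
```

This is the standard (1 − x)^t < e^{−xt} calculation: with x = 2/n² and t = n²/2, the exponent is exactly −1.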
4 Ramsey Numbers

Ramsey theory deals with finding order amongst apparent chaos. Given a setting where a mathematical structure may appear, Ramsey theory strives to identify conditions on this setting under which the structure of interest must appear. In other words, it is an attempt to ascertain that complete disorder is an impossibility and any large structure will necessarily contain an orderly substructure.

Ramsey Number R(k, l): R(k, l) is the least integer n such that any edge coloring of K_n using 2 colors (say, red and blue) has either a red K_k or a blue K_l. Equivalently, R(k, l) is the least integer n such that any graph on at least n vertices has a clique of size k or an independent set of size l.

To prove bounds on the diagonal Ramsey numbers {R(k, k) : k = 1, 2, ...}, we identify possible relations between n and k.

Tools: Union bound, Independence of events

Lemma 1. If k and n are positive integers satisfying (n choose k) · 2^{1 − (k choose 2)} < 1, then R(k, k) > n.

Proof. Consider a clique G on n vertices. Color the edges of G uniformly at random with 2 colors, red and blue. Consider a set K of k vertices in G. Then

    Pr(G[K] is monochromatic) = 2 · 2^{−(k choose 2)} = 2^{1 − (k choose 2)}

By the union bound,

    Pr(∃ K ⊆ V(G), |K| = k, such that G[K] is monochromatic) ≤ (n choose k) · 2^{1 − (k choose 2)}

Since (n choose k) · 2^{1 − (k choose 2)} < 1, it follows that there is a 2-edge-coloring of G producing neither a red K_k nor a blue K_k. Equivalently, there are at least 2 graphs on n vertices having neither a K_k nor an independent set of size k. These two graphs G_1 and G_2 are the spanning subgraphs of G with E(G_1) = {e ∈ E(G) : e is red} and E(G_2) = {e ∈ E(G) : e is blue}, respectively. Observe that G_1 and G_2 are complements of each other. Hence, R(k, k) > n.

Lemma 2. For any k ≥ 3, R(k, k) > 2^{k/2 − 1}.
Proof. Consider a clique G on n vertices where n ≤ 2^{k/2 − 1}. Color the edges of G uniformly at random with 2 colors, red and blue. Consider a set K of k vertices in G. As before,

    Pr(G[K] is monochromatic) = 2^{1 − (k choose 2)}

By the union bound,

    Pr(∃ K ⊆ V(G), |K| = k, such that G[K] is monochromatic) ≤ (n choose k) · 2^{1 − (k choose 2)}

Also, since (n choose k) ≤ n^k,

    (n choose k) · 2^{1 − (k choose 2)} ≤ n^k · 2^{1 − k(k−1)/2} ≤ 2^{k(k/2 − 1)} · 2^{1 − k(k−1)/2} = 2^{1 − k/2}

Since k ≥ 3, 2^{1 − k/2} < 1. Thus, Pr(∃ K ⊆ V(G), |K| = k, such that G[K] is monochromatic) < 1. It follows that there exists a coloring producing no monochromatic K_k in G. Hence, R(k, k) > 2^{k/2 − 1} for k ≥ 3.

Further Study

Better bounds for R(k, k) and explicit constructions witnessing Ramsey lower bounds are natural directions for a deeper study. As a generalization, multicolour Ramsey numbers are other interesting objects worth studying.

5 Tournaments with Property S_k

A tournament T(V, E) is a complete directed graph. That is, for any v_i, v_j ∈ V(T), exactly one of (v_i, v_j) and (v_j, v_i) is in E(T). T is said to have property S_k if for every set X of k vertices, there exists a vertex v_l ∈ V(T) such that X ⊆ N(v_l). We refer to such a set X as being dominated by v_l.

Lemma 3. If (n choose k) · (1 − 1/2^k)^{n−k} < 1, then there exists a tournament on n vertices with property S_k.

Proof. Consider a random tournament T on n vertices. By a random tournament, we mean that for each 1 ≤ i < j ≤ n, with equal probability exactly one of (v_i, v_j) and (v_j, v_i) is in E(T).

Tools: Union bound, Independence of events
Pick a set X ⊆ V(T) of size k. For a vertex v_i outside X,

    Pr(X is dominated by v_i) = 1/2^k and Pr(X is not dominated by v_i) = 1 − 1/2^k

Since the outcomes for distinct vertices are independent,

    Pr(X is not dominated) = (1 − 1/2^k)^{n−k}

By the union bound,

    Pr(∃ X ⊆ V(T), |X| = k, such that X is not dominated) ≤ (n choose k) · (1 − 1/2^k)^{n−k}

As (n choose k) · (1 − 1/2^k)^{n−k} < 1, it follows that there exists a tournament on n vertices with property S_k.

Further Study

Having proved the existence of a tournament T satisfying S_k under appropriate conditions, it is interesting to consider the construction of such a T. Exciting non-trivial constructions of such tournaments are known [AS92] and are natural directions for deeper study.

6 Hypergraph Coloring

A k-uniform hypergraph H is a pair (V, E) where V is the set of vertices and E ⊆ (V choose k) is the set of hyperedges. H is 2-colorable if its vertices can be colored with 2 colors such that no edge is monochromatic.

Lemma 4. For any k-uniform hypergraph H on n vertices, if |E(H)| < 2^{k−1}, then H is 2-colorable.

Proof. Consider a uniformly random 2-coloring of the vertices of an arbitrary k-uniform hypergraph H.

Tools: Union bound, Independence of events

Consider an edge e ∈ E(H). Then

    Pr(e is monochromatic) = 2/2^k = 2^{1−k}
By the union bound and the fact that |E(H)| < 2^{k−1},

    Pr(∃ e ∈ E(H) such that e is monochromatic) < 2^{k−1} · 2^{1−k} = 1

Thus, the probability that no edge of H is monochromatic is non-zero. Hence, there exists a 2-coloring for H.

Further Study

Better bounds, in particular lower bounds, are the natural next areas of study.

References

[AS92] Alon, N. and Spencer, J.H. The Probabilistic Method. Wiley, New York, 1992.

[GJ79] Garey, M.R. and Johnson, D.S. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman and Company, 1979.