15-859: Information Theory and Applications in TCS    Spring 2013
Lecture 14: Graph Entropy    March 19, 2013
Lecturer: Mahdi Cheraghchi    Scribe: Euiwoong Lee

1 Recap

- Bregman's bound on the permanent
- Shearer's Lemma
- Number of triangles in a graph with $\ell$ edges

2 Motivation and Definition of Graph Entropy

So far in this course, we have learned two aspects of coding theory: source coding and channel coding. Graph entropy can be thought of as a combinatorial extension of source coding. Suppose that we are given a source which emits one symbol $x \in V$. The source coding theorem says that if symbols are i.i.d. and the number of symbols is large, it is possible to achieve rate $H(X)$, and this is the best one can hope for. This result is based on the requirement that whenever we have two sequences of symbols $(x_1, \ldots, x_t)$ and $(y_1, \ldots, y_t)$ which differ in at least one symbol, the encoder must assign different codewords to them; otherwise at least one of them cannot be recovered. What happens if we relax this strict requirement and allow some confusion (i.e., it is okay to use the same codeword for certain pairs of strings)? As the requirement is relaxed, we might hope for a better rate. Graph entropy studies this question by representing such requirements by graphs.

2.1 1-symbol Case

We still have a source that emits a symbol in $V$, and a graph $G = (V, E)$ such that $\{a, b\} \in E$ if $a$ and $b$ must be distinguished. This graph represents the requirement that for any encoder $\mathrm{Enc} : V \to \{0,1\}^R$,
\[ \{a, b\} \in E \implies \mathrm{Enc}(a) \neq \mathrm{Enc}(b). \]
How small can $R$ be in this setting? This setting is exactly the well-studied graph (vertex) coloring problem, where the goal is to color each vertex so that no edge has both endpoints of the same color (each color corresponds to a codeword). Let $\chi(G)$ be the minimum number of colors needed for $G$. Then the best achievable $R$ is $\log \chi(G)$. If $G = K_n$, which means every symbol must be distinguished, $\chi(G) = n$ and $R_{\mathrm{OPT}} = \log n$.
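As an illustration (not part of the original notes), here is a minimal Python sketch of this correspondence, assuming a fixed-length binary encoder so that a single symbol costs $\lceil \log_2 \chi(G) \rceil$ bits; `chromatic_number` is a hypothetical brute-force helper, only sensible for very small graphs.

```python
from itertools import product
from math import ceil, log2

def chromatic_number(n, edges):
    """Smallest k such that vertices 0..n-1 admit a proper k-coloring.
    Brute force over all colorings -- only meant for very small graphs."""
    for k in range(1, n + 1):
        for coloring in product(range(k), repeat=n):
            if all(coloring[a] != coloring[b] for a, b in edges):
                return k
    return n  # unreachable: n colors always suffice

# 5-cycle: chi = 3, so an error-free encoder needs ceil(log2 3) = 2 bits for one symbol.
c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
chi_c5 = chromatic_number(5, c5)
print(chi_c5, ceil(log2(chi_c5)))          # 3 2

# Complete graph K_4: every pair must be distinguished, chi = 4, rate log2 4 = 2 bits.
k4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]
chi_k4 = chromatic_number(4, k4)
print(chi_k4, ceil(log2(chi_k4)))          # 4 2
```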
2.2 Multi-symbol Case

We now assume that the source emits $t$ i.i.d. symbols, each according to a distribution $p$ on $V$.

Definition 2.1. $(x_1, \ldots, x_t)$ is distinguishable from $(y_1, \ldots, y_t)$ if $\exists i \in [t]$ such that $\{x_i, y_i\} \in E$.

Let $G^t = (V^t, E^t)$ where
\[ V^t = \{(v_1, \ldots, v_t) : v_i \in V\}, \qquad \{(v_1, \ldots, v_t), (w_1, \ldots, w_t)\} \in E^t \iff \exists i \text{ such that } \{v_i, w_i\} \in E. \]
We can see that $(v_1, \ldots, v_t)$ and $(w_1, \ldots, w_t)$ are distinguishable exactly when $\{(v_1, \ldots, v_t), (w_1, \ldots, w_t)\} \in E^t$. Let $p^t(v_1, \ldots, v_t) = \prod_{i \in [t]} p(v_i)$ be the probability of $(v_1, \ldots, v_t)$. As in the original source coding theorem, we might decide to ignore a small fraction of vertices according to this distribution and color the rest of the graph with a small number of colors. Asymptotically, we take $t \to \infty$ and allow an error parameter $\epsilon$. If $\epsilon = 0$ (i.e., an error-free code), the best achievable rate is
\[ \lim_{t \to \infty} \frac{\log \chi(G^t)}{t}. \]
If $\epsilon > 0$, we define the entropy of $G$ as the best achievable rate allowing error $\epsilon$, namely
\[ H(G, p) = \lim_{t \to \infty} \; \min_{U \subseteq V^t : \, p^t(U) \ge 1 - \epsilon} \; \frac{\log \chi(G^t(U))}{t}, \]
where $G^t(U)$ is the subgraph of $G^t$ induced by $U$. Körner, who introduced this definition, proved that:

1. The limit exists.
2. The limit is independent of $\epsilon \in (0, 1)$.
3. $H(G, p) = \min_{(X, Y)} I(X; Y)$, where $X \in V$ is a random vertex whose marginal distribution is $p$, and $Y \subseteq V$ is a random independent set of vertices such that $X \in Y$ always. ($Y$ is an independent set if for all $v, v' \in Y$, $\{v, v'\} \notin E$.)

Note that 3 implies 1 and 2. One rough intuition is that any coloring of $G$ partitions $V$ into independent sets, and as we use fewer colors, the size of each independent set becomes larger. A coloring naturally defines a joint distribution $(X, Y)$: pick $X \in V$ according to $p$, and let $Y$ be the set of vertices with the same color as $X$. Then $I(X; Y) = H(X) - H(X \mid Y)$ gets smaller as the size of $Y$ increases, so this roughly explains how coloring is related to $I(X; Y)$.

3 Examples of Graph Entropy

From now on, $p$ is the uniform distribution on $V$. In this case define $H(G)$ to be $H(G, \mathrm{uniform})$. To prove an upper bound on $H(G)$, it is enough to find a joint distribution $(X, Y)$ such that $I(X; Y)$ is small, as the sketch below illustrates.
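Not part of the original notes: a minimal Python sketch of the coloring construction under the uniform distribution. For a proper coloring with classes $C_1, \ldots, C_r$, taking $Y$ to be the class containing $X$ gives $I(X; Y) = \log_2 n - \sum_c \frac{|C_c|}{n} \log_2 |C_c|$, which is an upper bound on $H(G)$; `coloring_bound` is a hypothetical helper name.

```python
from math import log2

def coloring_bound(n, color_classes):
    """Upper bound on H(G) (uniform p) from a proper coloring of G:
    take Y = the color class containing X, so that
    I(X;Y) = H(X) - H(X|Y) = log2(n) - sum_c (|c|/n) * log2(|c|)."""
    assert sum(len(c) for c in color_classes) == n
    return log2(n) - sum(len(c) / n * log2(len(c)) for c in color_classes)

# 5-cycle 0-1-2-3-4-0 with independent color classes {0,2}, {1,3}, {4}:
print(coloring_bound(5, [{0, 2}, {1, 3}, {4}]))   # ~ 1.52, so H(C_5) <= 1.52
# Complete graph K_4: every class is a singleton, the bound is log2(4) = 2.
print(coloring_bound(4, [{0}, {1}, {2}, {3}]))
```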
3.1 Empty Graph

In a graph with no edges, $Y$ can always be $V$, regardless of $X$. Then
\[ H(G) \le I(X; Y) \le H(Y) = 0. \]
Since $H(G) \ge 0$ by definition, $H(G) = 0$.

3.2 Complete Graph

In the complete graph $K_n$, given $X$, $Y$ has to be $\{X\}$, since it is the only independent set that contains $X$. This unique distribution gives
\[ H(G) = I(X; Y) = H(X) - H(X \mid Y) = H(X) = \log n. \]

3.3 Bipartite and r-partite Graphs

Suppose we have a complete bipartite graph $K_{m,n}$ with parts $A$ and $B$ such that $|A| = m$, $|B| = n$. Given $X$, we take $Y = A$ if $X \in A$, and $Y = B$ if $X \in B$. Using this joint distribution,
\[ H(G) \le I(X; Y) = H(X) - H(X \mid Y) = \log(m+n) - \frac{m}{m+n} \log m - \frac{n}{m+n} \log n = h\!\left(\frac{m}{m+n}\right), \]
where $h$ is the binary entropy function. On the other hand, for any valid joint distribution $(X, Y)$, we have $Y \subseteq A$ if $X \in A$, and $Y \subseteq B$ if $X \in B$ (every independent set of $K_{m,n}$ lies entirely within $A$ or within $B$). Therefore,
\[ H(X \mid Y) \le \Pr[X \in A] \log |A| + \Pr[X \in B] \log |B| = \frac{m}{m+n} \log m + \frac{n}{m+n} \log n, \]
and hence $I(X; Y) = H(X) - H(X \mid Y) \ge h\!\left(\frac{m}{m+n}\right)$. This shows that $H(G) \ge h\!\left(\frac{m}{m+n}\right)$, and therefore $H(G) = h\!\left(\frac{m}{m+n}\right)$.

Generally, if we have the complete $r$-partite graph where $V = [n] \times [r]$ and $E = \{\{(i, j), (k, l)\} : j \neq l\}$, following the same argument we can conclude that $H(G) = \log r$. The complete bipartite graph with $m = n$ is a special case with $H(G) = h(1/2) = \log 2 = 1$.
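A quick numerical sanity check (not in the original notes) of the two closed forms above, where `h` below is the binary entropy function in bits:

```python
from math import log2

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

# H(K_{m,n}) = h(m / (m + n)); the balanced case m = n gives exactly 1 bit.
for m, n in [(1, 3), (2, 3), (5, 5)]:
    bound = log2(m + n) - m / (m + n) * log2(m) - n / (m + n) * log2(n)  # I(X;Y) of the construction
    print(m, n, bound, h(m / (m + n)))                                   # the two columns agree

# Complete r-partite graph with r parts of size s: Y = "the part containing X"
# gives I(X;Y) = log2(r*s) - log2(s) = log2(r), matching H(G) = log2(r).
r, s = 4, 10
print(log2(r * s) - log2(s), log2(r))
```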
4 Properties of Graph Entropy

4.1 Subadditivity

Lemma 4.1. Let $G_1 = (V, E_1)$, $G_2 = (V, E_2)$ and $G = (V, E_1 \cup E_2)$. Then $H(G) \le H(G_1) + H(G_2)$.

Proof. Take a joint distribution $(X, Y_1, Y_2)$ such that
\[ H(G_1) = I(X; Y_1), \qquad H(G_2) = I(X; Y_2), \]
and, conditioned on $X$, $Y_1$ and $Y_2$ are independent. (Such a coupling exists: sample $X$ according to $p$, then sample $Y_1$ and $Y_2$ independently from their respective optimal conditional distributions given $X$.) The set $Y_1 \cap Y_2$ is independent in $G$, and it contains $X$. Therefore, $(X, Y_1 \cap Y_2)$ is a valid distribution for $G$, and
\begin{align*}
H(G) &\le I(X; Y_1 \cap Y_2) \\
&\le I(X; Y_1, Y_2) && \text{(data processing inequality)} \\
&= H(Y_1, Y_2) - H(Y_1, Y_2 \mid X) \\
&= H(Y_1, Y_2) - H(Y_1 \mid X) - H(Y_2 \mid X) && \text{($Y_1 \perp Y_2$ conditioned on $X$)} \\
&\le H(Y_1) + H(Y_2) - H(Y_1 \mid X) - H(Y_2 \mid X) \\
&= I(X; Y_1) + I(X; Y_2) = H(G_1) + H(G_2).
\end{align*}

4.2 Monotonicity

Lemma 4.2. Let $G = (V, E)$ and $F = (V, E')$ with $E \subseteq E'$. Then $H(G) \le H(F)$.

Proof. Since $G$ has fewer edges (less strict requirements) than $F$, every independent set of $F$ is also independent in $G$, so the distribution $(X, Y)$ achieving $H(F)$ is feasible for $H(G)$, and $H(G) \le I(X; Y) = H(F)$.
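As a side remark (not in the original notes), subadditivity can be tight. On $V = \{1, 2, 3, 4\}$, the complete graph $K_4$ is the union of the two complete bipartite graphs $G_1$ (with parts $\{1,2\}$ and $\{3,4\}$) and $G_2$ (with parts $\{1,3\}$ and $\{2,4\}$); using the values from Section 3,
\[ \log 4 = H(K_4) \le H(G_1) + H(G_2) = h(1/2) + h(1/2) = 2. \]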
4.3 Disjoint Union

Lemma 4.3. Let $G_1, \ldots, G_k$ be the connected components of $G$ and $\rho_i := \frac{|V(G_i)|}{|V(G)|}$. Then
\[ H(G) = \sum_{i \in [k]} \rho_i H(G_i). \]

Proof. First we show that $H(G) \ge \sum_i \rho_i H(G_i)$. Take a joint distribution $(X, Y)$ such that $H(G) = I(X; Y)$, and let $Y_i = Y \cap V(G_i)$. Define $l : V(G) \to [k]$ such that $l(x) = i$ iff $x \in V(G_i)$. Then
\begin{align*}
H(G) &= I(X; Y_1, \ldots, Y_k) \\
&= I(X, l(X); Y_1, \ldots, Y_k) && \text{($X$ determines $(X, l(X))$)} \\
&= I(l(X); Y_1, \ldots, Y_k) + I(X; Y_1, \ldots, Y_k \mid l(X)) && \text{(chain rule)} \\
&\ge \sum_{i \in [k]} \Pr[l(X) = i] \, I(X; Y_1, \ldots, Y_k \mid l(X) = i) && \text{(keep only the second term)} \\
&= \sum_{i \in [k]} \rho_i \left( I(X; Y_i \mid l(X) = i) + I(X; Y_1, \ldots, Y_{i-1}, Y_{i+1}, \ldots, Y_k \mid l(X) = i, Y_i) \right) && \text{(chain rule)} \\
&\ge \sum_{i \in [k]} \rho_i \, I(X; Y_i \mid l(X) = i) && \text{(ignore the second term)} \\
&\ge \sum_{i \in [k]} \rho_i H(G_i) && \text{(definition of $H(G_i)$)}
\end{align*}
which completes the proof that $H(G) \ge \sum_i \rho_i H(G_i)$.

For the other direction, let $p_i$ be a joint distribution $(X, Y_i)$ that achieves $H(G_i) = I(X; Y_i)$. We define a joint distribution $(X, Y)$ as follows:

1. Pick $Y_1, \ldots, Y_k$ independently, with each $Y_i$ drawn according to the $Y$-marginal of $p_i$.
2. Pick $i \in [k]$ with probability $\rho_i$.
3. Sample $X$ according to $p_i(X \mid Y_i)$.

Finally, let $Y = Y_1 \cup \cdots \cup Y_k$ (equivalently, keep the tuple $(Y_1, \ldots, Y_k)$); this is an independent set of $G$ containing $X$, since there are no edges between different components, and the marginal distribution of $X$ is uniform on $V(G)$, since conditioned on $i$ it is uniform on $V(G_i)$. We want to show that $I(X; Y) = \sum_i \rho_i H(G_i)$. We are going to reuse the chain of (in)equalities above; we only need to check that the three inequalities indeed hold with equality.

1. We chose $i = l(X)$ independently of $Y_1, \ldots, Y_k$, so $I(l(X); Y_1, \ldots, Y_k) = 0$ and the first inequality holds with equality.
2. Our choice of $X$ only depends on $i$ and $Y_i$, so $I(X; Y_1, \ldots, Y_{i-1}, Y_{i+1}, \ldots, Y_k \mid l(X) = i, Y_i) = 0$ and the second inequality holds with equality.
3. By the choice of $p_i$, $I(X; Y_i \mid l(X) = i) = H(G_i)$ for each $i$.

Therefore, $H(G) \le I(X; Y) = \sum_i \rho_i H(G_i)$. Together with the lower bound above, we conclude that $H(G) = \sum_i \rho_i H(G_i)$.
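To close, here is a minimal numeric sketch (not part of the original notes) of Lemma 4.3 under the uniform distribution, plugging in the component entropies computed in Section 3; `disjoint_union_entropy` is a hypothetical helper.

```python
from math import log2

def disjoint_union_entropy(components):
    """H(G) = sum_i rho_i * H(G_i) for a graph whose connected components are
    G_1, ..., G_k, with rho_i = |V(G_i)| / |V(G)| (Lemma 4.3, uniform distribution).
    `components` is a list of (number of vertices, graph entropy) pairs."""
    total = sum(n for n, _ in components)
    return sum(n / total * hg for n, hg in components)

# Disjoint union of K_2 and K_3 on 5 vertices (recall H(K_n) = log2 n):
print(disjoint_union_entropy([(2, log2(2)), (3, log2(3))]))   # (2/5)*1 + (3/5)*log2(3) ~ 1.35

# A perfect matching on 2m vertices is m disjoint copies of K_2, so H(G) = 1 for every m.
print(disjoint_union_entropy([(2, 1.0)] * 6))
```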