Lecture 16: Monotone Formula Lower Bounds via Graph Entropy


15-859: Information Theory and Applications in TCS, CMU, Spring 2013
Lecture 16, March 26, 2013
Lecturer: Mahdi Cheraghchi. Scribe: Shashank Singh

1 Recap

Graph Entropy: Given $G = (V, E)$, we define $H(G) = \min I(X; Y)$ over joint distributions $(X, Y)$, where $X \in V$ is uniformly random and $X \in Y \subseteq V$, with $Y$ an independent set of $G$. We showed that Graph Entropy obeys the following:

Sub-additivity: $H(G_1 \cup G_2) \le H(G_1) + H(G_2)$.

Monotonicity: If $G_1 \subseteq G_2$, then $H(G_1) \le H(G_2)$.

Disjoint Union: If $G_1, \dots, G_k$ are the connected components of $G$, then
$$H(G) = \sum_{i} \frac{|V(G_i)|}{|V(G)|} H(G_i).$$

Last time, we applied Graph Entropy to lower bound the size of a covering of a graph by bipartite graphs and the size of a perfect family of hash functions.

2 Monotone Formula Lower Bounds via Graph Entropy

Today we examine an application of Graph Entropy to Circuit Complexity.

2.1 Monotone Boolean Functions

Definition 1 A boolean function is one mapping $\{0, 1\}^n \to \{0, 1\}$.

Remark 2 We can equivalently consider boolean functions as mapping $P([n]) \to \{0, 1\}$, using the obvious bijection between $\{0, 1\}^n$ and $P([n])$. Boolean functions are represented by (not necessarily unique) boolean formulae, or trees in which leaves are variables and internal nodes are logical connectives. We use these representations interchangeably.

Definition 3 A boolean function $f : P([n]) \to \{0, 1\}$ is monotone if $S \subseteq T \in P([n])$ implies $f(S) \le f(T)$. Furthermore, if $f$ is a monotone boolean function, then the min-terms of $f$ of size $i$ are
$$(f)_i = \{S \in P([n]) : |S| = i,\ f(S) = 1,\ \text{and } \forall T \subsetneq S,\ f(T) = 0\}, \qquad (f) = \bigcup_{i=1}^{n} (f)_i.$$
Furthermore, a boolean formula is monotone if it contains only AND and OR connectives.
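Definition 3 can be checked by brute force on small inputs. The following is a minimal sketch, not part of the lecture: the helper names `subsets`, `is_monotone`, and `min_terms` are hypothetical, and a boolean function is assumed to be given as a Python callable on frozensets of $\{1, \dots, n\}$.

```python
from itertools import combinations


def subsets(n):
    """Yield every subset of {1, ..., n} as a frozenset."""
    universe = range(1, n + 1)
    for size in range(n + 1):
        for combo in combinations(universe, size):
            yield frozenset(combo)


def is_monotone(f, n):
    """Check that S <= T (as sets) implies f(S) <= f(T), over all pairs."""
    all_sets = list(subsets(n))
    return all(f(s) <= f(t) for s in all_sets for t in all_sets if s <= t)


def min_terms(f, n):
    """Return {i: set of min-terms of size i} for a monotone f.

    For monotone f it suffices to test the subsets obtained by deleting
    one element: if all of those evaluate to 0, every proper subset does.
    The empty set is skipped because sizes range over 1..n in Definition 3.
    """
    terms = {}
    for s in subsets(n):
        if s and f(s) == 1 and all(f(s - {x}) == 0 for x in s):
            terms.setdefault(len(s), set()).add(s)
    return terms
```

For instance, calling `min_terms(lambda s: int(len(s) >= 2), 3)` lists the min-terms of the majority function on three inputs, which are the three pairs.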

Example 4 The following are monotone boolean functions (here the two inputs $x, y$ are indexed by 0 and 1):

1. OR: $x \vee y = 0 \iff x = y = 0$. The min-terms of OR are $(\mathrm{OR})_1 = \{\{0\}, \{1\}\}$, $(\mathrm{OR})_i = \emptyset$ for $i \ne 1$.

2. AND: $x \wedge y = 1 \iff x = y = 1$. The min-terms of AND are $(\mathrm{AND})_2 = \{\{0, 1\}\}$, $(\mathrm{AND})_i = \emptyset$ for $i \ne 2$.

3. $\mathrm{MAJ}_3$: $\mathrm{MAJ}_3(x_1, x_2, x_3) = (x_1 \wedge x_2) \vee (x_1 \wedge x_3) \vee (x_2 \wedge x_3)$.

Figure 1: Tree representations of AND, OR, and $\mathrm{MAJ}_3$

Proposition 5 A boolean function is monotone iff it can be represented by a monotone boolean formula.

Proof: Clearly, monotone boolean formulae compute monotone boolean functions. Conversely, let $f$ be a monotone boolean function. Then, $\forall W \in P([n])$, $f(W) = 1$ if and only if $\exists S \in (f)$ with $S \subseteq W$. It follows that $f$ is defined uniquely by $(f)$ as follows:
$$f(x_1, \dots, x_n) = \bigvee_{S \in (f)} \bigwedge_{j \in S} x_j, \qquad \forall (x_1, \dots, x_n) \in \{0, 1\}^n.$$
Note that this formula, called the Disjunctive Normal Form (DNF) of $f$, is also represented by a binary tree, since many-input logic gates can be simulated by (linearly many) two-input gates.

2.2 Size of a Boolean Function and Threshold Functions

Definition 6 The size $\mathrm{size}(\varphi)$ of a formula $\varphi$ is the number of nodes in the tree representation of $\varphi$.

The size of a boolean function $f$ is
$$\mathrm{size}(f) = \min_{\varphi \text{ computing } f} \mathrm{size}(\varphi).$$
That is, $\mathrm{size}(f)$ is the number of nodes in the smallest tree computing $f$.

Definition 7 For $k \in [n]$, the threshold function $Th^n_k : P([n]) \to \{0, 1\}$ is defined for $S \in P([n])$ by
$$Th^n_k(S) = \begin{cases} 1 & \text{if } |S| \ge k, \\ 0 & \text{else.} \end{cases}$$

Example 8 Threshold functions generalize AND, OR, and MAJ:

$\mathrm{AND} = Th^n_n$, with $\mathrm{size}(\mathrm{AND}) = 2n - 1$.
$\mathrm{OR} = Th^n_1$, with $\mathrm{size}(\mathrm{OR}) = 2n - 1$.
$\mathrm{MAJ} = Th^n_{n/2}$.

It can be shown that MAJ is the most complex threshold, in that it maximizes $\mathrm{size}(Th^n_k)$ over $k$.

2.3 Bounding the Size of Threshold Functions

Consider the problem of bounding $\mathrm{size}(Th^n_k)$. For general $k$, the bound $\mathrm{size}(Th^n_k) \le O(n^{5.3})$ due to (Valiant, 1984) is known, based on a probabilistic construction which we do not give here. We analyze the case $k = 2$, for which the following upper bound is easy to demonstrate:

Claim 9 $\mathrm{size}(Th^n_2) \le O(n^2)$.

Proof: $(Th^n_2)_2 = \{\{i, j\} \in P([n]) : i \ne j\}$. Furthermore, $\forall i \ne 2$, $(Th^n_2)_i = \emptyset$. Thus, the DNF of $Th^n_2$ is
$$Th^n_2(x_1, \dots, x_n) = \bigvee_{\{i, j\} \in P([n]),\ i \ne j} (x_i \wedge x_j).$$
Since an AND or OR of $m$ inputs has size $O(m)$ and the DNF has $\binom{n}{2}$ terms of two variables each, this gives $\mathrm{size}(Th^n_2) \le O(n^2)$.

Remark 10 Consider the following Divide and Conquer construction: divide the input string $x = (x_1, \dots, x_n) \in \{0, 1\}^n$ into $y = (x_1, \dots, x_{n/2})$ and $z = (x_{n/2+1}, \dots, x_n)$. Then, we have the recursive formula
$$Th^n_2(x) = Th^{n/2}_2(y) \vee Th^{n/2}_2(z) \vee \left(Th^{n/2}_1(y) \wedge Th^{n/2}_1(z)\right).$$

This recurrence gives an upper bound: defining $S_n = \mathrm{size}(Th^n_2)$, the recurrence gives $S_n \le 2 S_{n/2} + O(n)$, since clearly $\mathrm{size}(Th^n_1) \le O(n)$. The solution of this standard recurrence (think mergesort) is $S_n \le O(n \log n)$.

Exercise: Refine this bound to $\mathrm{size}(Th^n_2) \le 2n \lceil \log n \rceil - 1$. The lower bound we now give shows this is tight.

We now apply Graph Entropy to prove the lower bound $\mathrm{size}(Th^n_2) \ge 2n \log n - 1$, following (Newman, Ragde, and Wigderson 1990). In order to use graph entropy, we define a graph $G_f$ for a boolean function $f$.

Definition 11 $G_f = (V, E)$, where $V = [n]$ and $E = (f)_2$. Note that $n$ is from $Th^n_2$ and is not necessarily the number of variables in $f$.

Example 12 $G_{Th^n_2} = K_n$. For a single variable $x_i$, $G_{x_i}$ is the empty graph on $n$ vertices.
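The Divide and Conquer construction of Remark 10 and the graph of Definition 11 can be tried out on small $n$. The sketch below makes several assumptions that are not in the notes: formulas are nested tuples, variables are indexed from 0, $Th_1$ is built as a chain of two-input ORs, and the constant sub-formula $Th_2$ on a single variable is simply dropped. All function names are hypothetical.

```python
from itertools import combinations


def or_chain(lo, hi):
    """Formula for Th_1 on variables lo..hi-1: a chain of two-input ORs."""
    tree = ("var", lo)
    for i in range(lo + 1, hi):
        tree = ("or", tree, ("var", i))
    return tree


def th2_formula(lo, hi):
    """Divide-and-conquer monotone formula for Th_2 on variables lo..hi-1."""
    n = hi - lo
    if n < 2:
        raise ValueError("Th_2 needs at least two variables")
    mid = lo + n // 2
    # Cross term: at least one 1 in each half.
    tree = ("and", or_chain(lo, mid), or_chain(mid, hi))
    # Recursive terms: two 1s inside the same half (skipped for 1-variable halves,
    # where Th_2 is the constant 0).
    for a, b in ((lo, mid), (mid, hi)):
        if b - a >= 2:
            tree = ("or", tree, th2_formula(a, b))
    return tree


def size(tree):
    """Number of nodes (leaves plus gates) in the formula tree."""
    return 1 if tree[0] == "var" else 1 + size(tree[1]) + size(tree[2])


def evaluate(tree, x):
    """Evaluate the formula on a 0/1 tuple x."""
    if tree[0] == "var":
        return x[tree[1]]
    a, b = evaluate(tree[1], x), evaluate(tree[2], x)
    return (a and b) if tree[0] == "and" else (a or b)


def graph_edges(tree, n):
    """E(G_f) = (f)_2: pairs {i, j} with f({i, j}) = 1 and f({i}) = f({j}) = 0."""
    def f(subset):
        return evaluate(tree, tuple(int(i in subset) for i in range(n)))
    return {frozenset((i, j)) for i, j in combinations(range(n), 2)
            if f({i, j}) == 1 and f({i}) == 0 and f({j}) == 0}
```

Under these assumptions, `size(th2_formula(0, n))` can be compared against $2n \log n - 1$ for $n$ a power of two, and `graph_edges` should return every pair, matching $G_{Th^n_2} = K_n$ in Example 12.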

It helps now to have a few lemmas about how graph entropy evolves with AND and OR operations.

Lemma 13 Suppose $f = g \vee h$. Then, $G_f \subseteq G_g \cup G_h$, and hence $H(G_f) \le H(G_g) + H(G_h)$.

Proof: Suppose $e = \{i, j\} \in E(G_f)$. Then, $1 = f(e) = g(e) \vee h(e)$; without loss of generality, $g(e) = 1$. By construction of $G_f$, $e \in (f)_2$, so $f(\{i\}) = f(\{j\}) = 0$. Since $g \le f$ pointwise, $g(\{i\}) = g(\{j\}) = 0$, so $e \in (g)_2 = E(G_g)$.

It would be nice if we also had this property for AND, but it doesn't hold, as the following example shows:

Example 14 Suppose $g(x_1, x_2) = x_1$, $h(x_1, x_2) = x_2$, and $f = g \wedge h$. Then, $\{1, 2\} \in E(G_f)$, but $\{1, 2\} \notin E(G_g), E(G_h)$.

Thus, we need a weaker statement: $G_{g \wedge h} \subseteq G_g \cup G_h \cup T_{g,h}$, where $T_{g,h}$ is defined as follows.

Lemma 15 Let $f = g \wedge h$, and regard $(g)_1$ and $(h)_1$ as sets of vertices. Let $T_{g,h}$ be the subgraph of $G_f$ formed by the edges in $((g)_1 \setminus (h)_1) \times ((h)_1 \setminus (g)_1)$, that is, the edges with one endpoint in each of the two sets. Then $G_f \subseteq G_g \cup G_h \cup T_{g,h}$.

Proof: Let $e = \{i, j\} \in E(G_f) \setminus (E(G_g) \cup E(G_h))$, and let $f', g', h' : \{0, 1\}^2 \to \{0, 1\}$ denote the restrictions of $f, g, h$, respectively, to $e$ (since the formulae are monotone, we can think of this as setting the other coordinates to 0). Then, we have
$$f'(x_i, x_j) = x_i \wedge x_j, \qquad g'(x_i, x_j) \ne x_i \wedge x_j, \qquad h'(x_i, x_j) \ne x_i \wedge x_j, \qquad f' = g' \wedge h'.$$
By inspection, the possible cases are
$$\{g' = x_i,\ h' = x_j\} \quad \text{or} \quad \{g' = x_j,\ h' = x_i\}$$
(here, we make the simplifying assumption that $f', g', h'$ are non-constant functions; these cases can be analyzed separately), which in turn implies that $e \in ((g)_1 \setminus (h)_1) \times ((h)_1 \setminus (g)_1)$.

Remark 16 Since $(g)_1 \setminus (h)_1$ and $(h)_1 \setminus (g)_1$ are disjoint, $T_{g,h}$ is bipartite. We showed in a previous lecture that this implies $H(T_{g,h}) \le 1$. Using subadditivity and the facts
$$H(G_{g \vee h}) \le H(G_g) + H(G_h),$$
$$H(G_{g \wedge h}) \le H(G_g) + H(G_h) + 1,$$
$$H(G_{x_i}) = 0,$$
$$H(G_{Th^n_2}) = H(K_n) = \log n,$$
we see that any monotone formula for $Th^n_2$ has at least $\log n$ AND gates, and hence $\mathrm{size}(Th^n_2) \ge \log n$.
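The two containments above ($G_{g \vee h} \subseteq G_g \cup G_h$ and $G_{g \wedge h} \subseteq G_g \cup G_h \cup T_{g,h}$) can be sanity-checked exhaustively for $n = 3$. This is only an illustrative sketch under stated assumptions: functions are stored as truth tables keyed by frozensets, $(f)_1$ is taken as the set of indices $i$ with $f(\{i\}) = 1$ and $f(\emptyset) = 0$, and every helper name is hypothetical.

```python
from itertools import combinations, product

N = 3
VERTICES = range(1, N + 1)
SUBSETS = [frozenset(c) for k in range(N + 1) for c in combinations(VERTICES, k)]


def monotone_tables():
    """All monotone functions P([N]) -> {0, 1}, as dicts from subset to value."""
    tables = []
    for bits in product((0, 1), repeat=len(SUBSETS)):
        t = dict(zip(SUBSETS, bits))
        if all(t[a] <= t[b] for a in SUBSETS for b in SUBSETS if a <= b):
            tables.append(t)
    return tables


def singles(t):
    """(f)_1 as a set of indices: i with f({i}) = 1 and f(empty set) = 0."""
    if t[frozenset()] == 1:
        return set()
    return {i for i in VERTICES if t[frozenset({i})] == 1}


def edges(t):
    """E(G_f) = (f)_2: pairs whose value is 1 while both singletons give 0."""
    return {frozenset({i, j}) for i, j in combinations(VERTICES, 2)
            if t[frozenset({i, j})] == 1
            and t[frozenset({i})] == 0 and t[frozenset({j})] == 0}


def combine(g, h, op):
    """Pointwise combination: op=max gives g OR h, op=min gives g AND h."""
    return {s: op(g[s], h[s]) for s in SUBSETS}


def t_edges(f, g, h):
    """Edges of G_f with one endpoint in (g)_1 - (h)_1 and one in (h)_1 - (g)_1."""
    a = singles(g) - singles(h)
    b = singles(h) - singles(g)
    return {e for e in edges(f) if len(e & a) == 1 and len(e & b) == 1}


def check_all():
    """Check both containments for every pair of monotone functions on [N]."""
    tables = monotone_tables()
    for g in tables:
        for h in tables:
            f_or = combine(g, h, max)
            f_and = combine(g, h, min)
            assert edges(f_or) <= edges(g) | edges(h)
            assert edges(f_and) <= edges(g) | edges(h) | t_edges(f_and, g, h)
    return len(tables)
```

Running `check_all()` raises an `AssertionError` if either containment fails for some pair and otherwise returns the number of monotone functions examined. A brute-force check like this is not a proof; it only covers the three-variable case.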

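We can get an even tighter lower bound by tightening the upper bound on $H(T_{g,h})$: Observe that, while $V(T_{g,h}) = [n]$, every edge of $T_{g,h}$ has both endpoints in the symmetric difference $(g)_1 \,\triangle\, (h)_1$, so all other vertices are isolated. By the disjoint union property (isolated vertices contribute zero entropy, and the bipartite part has entropy at most 1), this implies
$$H(T_{g,h}) \le \frac{|(g)_1 \,\triangle\, (h)_1|}{n}.$$
Let's define a potential function:

Definition 17 $\mu(f) = H(G_f) + \dfrac{|(f)_1|}{n}$.

Claim 18 For both $f = g \vee h$ and $f = g \wedge h$, $\mu(f) \le \mu(g) + \mu(h)$.

Proof: Case 1: $f = g \vee h$. Assuming no gate computes a constant function, $(f)_1 = \{i : f(\{i\}) = 1\} = (g)_1 \cup (h)_1$. Thus,
$$\mu(f) = H(G_f) + \frac{|(f)_1|}{n} \le H(G_g) + H(G_h) + \frac{|(g)_1| + |(h)_1|}{n} = \mu(g) + \mu(h).$$

Case 2: $f = g \wedge h$. This time, $(f)_1 = (g)_1 \cap (h)_1$. Thus,
$$\begin{aligned}
\mu(f) &= H(G_f) + \frac{|(f)_1|}{n} \\
&\le H(G_g) + H(G_h) + H(T_{g,h}) + \frac{|(g)_1 \cap (h)_1|}{n} \\
&\le H(G_g) + H(G_h) + \frac{|(g)_1 \,\triangle\, (h)_1|}{n} + \frac{|(g)_1 \cap (h)_1|}{n} \\
&\le H(G_g) + H(G_h) + \frac{|(g)_1| + |(h)_1|}{n} \\
&= \mu(g) + \mu(h).
\end{aligned}$$

Note that each leaf has $\mu(x_i) = 1/n$ and the root has $\mu(Th^n_2) = \log n$. Hence, by subadditivity and the preceding claim, there must be at least $n \log n$ leaves. Since each gate has two inputs and one output, it follows that there are at least $n \log n - 1$ internal nodes, for a total lower bound of $\mathrm{size}(Th^n_2) \ge 2n \log n - 1$.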
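To see how the final bound compares with the earlier upper bounds, the arithmetic can be tabulated for small $n$. This is a rough sketch under assumptions not stated in the notes: the leaf count of the Divide and Conquer formula is taken to satisfy $L(1) = 0$, $L(n) = L(\lfloor n/2 \rfloor) + L(\lceil n/2 \rceil) + n$ (constant sub-formulas dropped, each $Th_1$ using one leaf per variable), the DNF is counted as a binary tree with $n(n-1)$ leaves, and logarithms are base 2. The function names are hypothetical.

```python
import math


def dc_leaves(n):
    """Leaves of the divide-and-conquer formula for Th_2 on n variables."""
    if n <= 1:
        return 0  # Th_2 on one variable is constant, so it contributes no leaves
    return dc_leaves(n // 2) + dc_leaves(n - n // 2) + n


def lower_bound_leaves(n):
    """Smallest integer number of leaves allowed by the bound n * log2(n)."""
    return math.ceil(n * math.log2(n))


def tree_size(leaves):
    """Nodes in a binary formula tree with the given number of leaves."""
    return 2 * leaves - 1


def dnf_size(n):
    """Size of the DNF of Th_2: one two-variable AND per pair, joined by ORs."""
    return tree_size(n * (n - 1))


def compare(n):
    """Return (lower bound, divide-and-conquer size, DNF size) for Th_2 on n inputs."""
    return (tree_size(lower_bound_leaves(n)),
            tree_size(dc_leaves(n)),
            dnf_size(n))
```

When $n$ is a power of two, the first two entries returned by `compare(n)` coincide, which is the sense in which the lower bound is tight; the DNF size grows quadratically by comparison.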