A Master Theorem for Discrete Divide and Conquer Recurrences

Similar documents
A Master Theorem for Discrete Divide and Conquer Recurrences. Dedicated to PHILIPPE FLAJOLET

A Master Theorem for Discrete Divide and Conquer Recurrences

Variable-to-Variable Codes with Small Redundancy Rates

Minimax Redundancy for Large Alphabets by Analytic Methods

Digital Trees and Memoryless Sources: from Arithmetics to Analysis

A Master Theorem for Discrete Divide and Conquer Recurrences

String Complexity. Dedicated to Svante Janson for his 60 Birthday

Multiple Choice Tries and Distributed Hash Tables

Analytic Information Theory: From Shannon to Knuth and Back. Knuth80: Piteaa, Sweden, 2018 Dedicated to Don E. Knuth

SIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding

Probabilistic analysis of the asymmetric digital search trees

A One-to-One Code and Its Anti-Redundancy

Analytic Pattern Matching: From DNA to Twitter. AxA Workshop, Venice, 2016 Dedicated to Alberto Apostolico

Tunstall Code, Khodak Variations, and Random Walks

Chapter 1 Divide and Conquer Polynomial Multiplication Algorithm Theory WS 2015/16 Fabian Kuhn

Chapter 1 Divide and Conquer Algorithm Theory WS 2016/17 Fabian Kuhn

Divide and Conquer Algorithms

Divide and Conquer. Andreas Klappenecker

Divide and Conquer Algorithms

Chapter 2: Source coding

Coding on Countably Infinite Alphabets

CSE 421 Algorithms. T(n) = at(n/b) + n c. Closest Pair Problem. Divide and Conquer Algorithms. What you really need to know about recurrences

On universal types. Gadiel Seroussi Information Theory Research HP Laboratories Palo Alto HPL September 6, 2004*

Recurrence Relations

RENEWAL THEORY IN ANALYSIS OF TRIES AND STRINGS: EXTENDED ABSTRACT

Source Coding. Master Universitario en Ingeniería de Telecomunicación. I. Santamaría Universidad de Cantabria

Average Redundancy for Known Sources: Ubiquitous Trees in Source Coding

Existence of a Limit on a Dense Set, and. Construction of Continuous Functions on Special Sets

Divide and Conquer. Recurrence Relations

On Universal Types. Gadiel Seroussi Hewlett-Packard Laboratories Palo Alto, California, USA. University of Minnesota, September 14, 2004

On the Average Path Length of Complete m-ary Trees

Data structures Exercise 1 solution. Question 1. Let s start by writing all the functions in big O notation:

CPS 616 DIVIDE-AND-CONQUER 6-1

Notes for Lecture 3... x 4

Lecture 4 : Adaptive source coding algorithms

arxiv: v22 [math.gm] 20 Sep 2018

#A5 INTEGERS 18A (2018) EXPLICIT EXAMPLES OF p-adic NUMBERS WITH PRESCRIBED IRRATIONALITY EXPONENT

7 Asymptotics for Meromorphic Functions

MULTI-RESTRAINED STIRLING NUMBERS

Counting Markov Types

THE REAL NUMBERS Chapter #4

Notes for Recitation 14

CSE101: Design and Analysis of Algorithms. Ragesh Jaiswal, CSE, UCSD

In 1826, Abel proved the following result for real power series. Let f(x) =

Design and Analysis of Algorithms

Data Compression Techniques

Search Algorithms. Analysis of Algorithms. John Reif, Ph.D. Prepared by

CS 5321: Advanced Algorithms - Recurrence. Acknowledgement. Outline. Ali Ebnenasir Department of Computer Science Michigan Technological University

MOMENTS OF HYPERGEOMETRIC HURWITZ ZETA FUNCTIONS

Applications. More Counting Problems. Complexity of Algorithms

Midterm 2 for CS 170

Recurrence Relations

Jumping Sequences. Steve Butler 1. (joint work with Ron Graham and Nan Zang) University of California Los Angelese

CS 5321: Advanced Algorithms Analysis Using Recurrence. Acknowledgement. Outline

3F1 Information Theory, Lecture 3

MA008/MIIZ01 Design and Analysis of Algorithms Lecture Notes 3

Stochastic Processes (Week 6)

EE5139R: Problem Set 4 Assigned: 31/08/16, Due: 07/09/16

Lecture 4: Exponential family of distributions and generalized linear model (GLM) (Draft: version 0.9.2)

Algorithms, Design and Analysis. Order of growth. Table 2.1. Big-oh. Asymptotic growth rate. Types of formulas for basic operation count

Digital search trees JASS

Data Structures in Java

Divide and Conquer. CSE21 Winter 2017, Day 9 (B00), Day 6 (A00) January 30,

Divide and Conquer. Andreas Klappenecker. [based on slides by Prof. Welch]

Divide and Conquer: Polynomial Multiplication Version of October 1 / 7, 24201

Chapter 2 Date Compression: Source Coding. 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code

Regularity of the density for the stochastic heat equation

Coding for Discrete Source

Divide and Conquer. Maximum/minimum. Median finding. CS125 Lecture 4 Fall 2016

Design and Analysis of Algorithms

Analytic Information Theory and Beyond

Information Theory with Applications, Math6397 Lecture Notes from September 30, 2014 taken by Ilknur Telkes

Markov Chains, Stochastic Processes, and Matrix Decompositions

Chapter 5: Data Compression

A design paradigm. Divide and conquer: (When) does decomposing a problem into smaller parts help? 09/09/ EECS 3101

7.11 A proof involving composition Variation in terminology... 88

EECS 229A Spring 2007 * * (a) By stationarity and the chain rule for entropy, we have

Part II. Number Theory. Year

Chapter 3 Source Coding. 3.1 An Introduction to Source Coding 3.2 Optimal Source Codes 3.3 Shannon-Fano Code 3.4 Huffman Code

Lecture 3: Probability Measures - 2

Divide and Conquer. Arash Rafiey. 27 October, 2016

3F1 Information Theory, Lecture 3

An Operator Theoretical Approach to Nonlocal Differential Equations

The Moments of the Profile in Random Binary Digital Trees

Notes for Lecture 3... x 4

CUTS OF LINEAR ORDERS

arxiv: v6 [math.nt] 12 Sep 2017

Outline. Part 5. Computa0onal Complexity (3) Compound Interest 2/15/12. Recurrence Rela0ons: An Overview. Recurrence Rela0ons: Formal Defini0on

SIGNAL COMPRESSION Lecture Shannon-Fano-Elias Codes and Arithmetic Coding

CS173 Running Time and Big-O. Tandy Warnow

PROBABILISTIC BEHAVIOR OF ASYMMETRIC LEVEL COMPRESSED TRIES

Asymptotic and Exact Poissonized Variance in the Analysis of Random Digital Trees (joint with Hsien-Kuei Hwang and Vytas Zacharovas)

ECE 587 / STA 563: Lecture 5 Lossless Compression

Chapter Four Gelfond s Solution of Hilbert s Seventh Problem (Revised January 2, 2011)

Chapter 2. Recurrence Relations. Divide and Conquer. Divide and Conquer Strategy. Another Example: Merge Sort. Merge Sort Example. Merge Sort Example

ECE 587 / STA 563: Lecture 5 Lossless Compression

Kolmogorov complexity

Divide-and-Conquer Algorithms and Recurrence Relations. Niloufar Shafiei

A Note on a Problem Posed by D. E. Knuth on a Satisfiability Recurrence

From the Discrete to the Continuous, and Back... Philippe Flajolet INRIA, France

Transcription:

A Master Theorem for Discrete Divide and Conquer Recurrences Wojciech Szpankowski Department of Computer Science Purdue University W. Lafayette, IN 47907 January 20, 2011 NSF CSoI SODA, 2011 Research supported by NSF Science & Technology Center, and Humboldt Foundation.

Outline 1. Divide and Conquer 2. Example: Boncelet s Algorithm 3. Continuous Relaxation of the Recurrence 4. Master Theorem 5. Examples 6. Sketch of Proof.

Divide and Conquer Divide and Conquer: A divide and conquer algorithm splits the input into several smaller subproblems, solving each subproblem separately, and then knitting together to solve the original problem. Complexity: A problem of size n is divided into m 2 subproblems of size p j n + δ j and p j n+δ j and each subproblem contributesb j,b j fraction to the final solution; there is a cost a n associated with combining subproblems. Total Cost: The total cost T(n) satisfies the discrete divide and conquer recurrence: T(n) = a n + m b j T ( p j n + δ j ) + m ) b j ( p T j n + δ j (n 2) j=1 j=1 where 0 p j < 1 (e.g., m i=1 p i = 1).

Example: Boncelet s Algorithm Boncelet s algorithm is a variable-to-fixed data compression scheme: 1. A variable-to-fixed length encoder partitions a source string over an m- ary alphabet into variable-length phrases. 2. Each phrase belongs to a given dictionary. 3. A dictionary is represented by a complete parsing tree. 4. The dictionary entries correspond to the leaves of the parsing tree.

Example: Boncelet s Algorithm Boncelet s algorithm is a variable-to-fixed data compression scheme: 1. A variable-to-fixed length encoder partitions a source string over an m- ary alphabet into variable-length phrases. 2. Each phrase belongs to a given dictionary. 3. A dictionary is represented by a complete parsing tree. 4. The dictionary entries correspond to the leaves of the parsing tree. Example: A binary string (m = 2) with symbol probabilities p 1 and p 2. The expected phrase length d(n) satisfies: d(n) = 1 + p 1 d( p 1 n + δ ) + p 2 d( p 2 n δ )

Continuous Relaxation We relax the discrete nature of the recurrence and consider a continuous version: m T(x) = a(x) + b j T(p j x)), x > 1, b j = 0. j=1 Akra and Bazzi (1998) proved that T(x) = Θ ( ( x x s 0 1 + 1 where s 0 is a unique real root of j b jp j s 0 = 1. )) a(u) u s 0 +1du

Continuous Relaxation We relax the discrete nature of the recurrence and consider a continuous version: m T(x) = a(x) + b j T(p j x)), x > 1, b j = 0. j=1 Akra and Bazzi (1998) proved that ( T(x) = Θ x s 0 ( x 1 + 1 where s 0 is a unique real root of j b jp j s 0 = 1. )) a(u) u s 0 +1du Indeed, by taking Mellin transform of the relaxed recurrence: t(s) = 0 T(x)x s 1 dx we find (for some a(s) and g(s)) t(s) = a(s) + g(s) 1 m j=1 b jp s. i An application of the Wiener-Ikehara theorem leads to T(x) Cx s 0 with C = a( s 0) + g( s 0 ) j b jp s 0 j log(1/p j ).

Discrete Divide & Conquer Recurrence by Dirichlet Series For a sequence c(n) define the Dirichlet series as C(s) = n=1 c(n) n s provided it exists for R(s) > σ c for some σ c. Theorem 1 (Perron-Mellin Formula). For all σ > σ c and all x > 0 n<x c(n) + c( x ) [[x Z]] = lim 2 T 1 2πi σ+it σ it C(s) xs s ds. where [[P]] is 1 if P is a true proposition and 0 otherwise.

Discrete Divide & Conquer Recurrence by Dirichlet Series For a sequence c(n) define the Dirichlet series as C(s) = n=1 c(n) n s provided it exists for R(s) > σ c for some σ c. Theorem 1 (Perron-Mellin Formula). For all σ > σ c and all x > 0 n<x c(n) + c( x ) [[x Z]] = lim 2 T 1 2πi σ+it σ it C(s) xs s ds. where [[P]] is 1 if P is a true proposition and 0 otherwise. Example: Define c(n) = T(n + 2) T(n + 1). Then T(n) = T(2) + lim T 1 2πi c+it c it T(s) (n 3 2 )s s ds for some c > σ T with T(s) = n=1 T(n + 2) T(n + 1) n s. where R(s) > σ T.

Assumptions Let a n be a nondecreasing sequence. Define Ã(s) = n=1 a n+2 a n+1 n s which is postulated to exists for R(s) > σ a. Example. Define a n = n σ (logn) α. Then Γ(α + 1) Γ(α + 1) Ã(s) = σ + (s σ) α+1 (s σ) + F(s), α where F(s) is analytic for R(s) > σ 1 and Γ(s) is the gamma function. Define s 0 to be the unique real root of m (b j + b j )p j s = 1. j=1 Other zeros depend on the relation among log(1/p 1 ),...,log(1/p m ).

Rationally and Irrationally Related Numbers Definition 1. (i) log(1/p 1 ),...,log(1/p m ) are rationally related if log(1/p 1 ),...,log(1/p m ) are integer multiples of L, that is, log(1/p j ) = n j L, n j Z, (1 j m). (ii) Otherwise log(1/p 1 ),...,log(1/p m ) are irrationally related. Example. If m = 1, then we are always in the rationally related case. For m = 2, if log(1/p 1 )/log(1/p 2 ) = m/n, (m,n integers), then rationally related. Lemma 1. (i) If log(1/p 1 ),...,log(1/p m ) are irrationally related, then s 0 is the only solution on R(s) = s 0. (ii) If log(1/p 1 ),...,log(1/p m ) are rationally related, then there are infinitely many solutions s k = s 0 + 2πik L (k Z) where log(1/p j ) are all integer multiples of L.

Evaluation of T(n): A Bird View We estimate T(n) by the Cauchy residue theorem. T(s) = Ã(s)+B(s) 1 m j=1 (b j +b j )ps j, T(n) = 1 2πi c+i T(s) (n 3 2 )s c i s ds

Main Master Theorem Theorem 2 (DISCRETE MASTER THEOREM). Let a n = Cn σa (logn) α with min{σ,α} 0. (i) If log(1/p 1 ),...,log(1/p m ) are irrationally related, then T(n) = C 1 + o(1) if σ a 0 and s 0 < 0, C 2 logn + C 2 + o(1) if σ a < s 0 = 0, C 3 (logn) α+1 (1 + +o(1)) if σ a = s 0 = 0 C 4 n s 0 (1 + o(1)) if σ a < s 0 and s 0 > 0, C 5 n s 0(logn) α+1 (1 + o(1)) if σ a = s 0 > 0 and α 1, C 5 n s 0loglogn (1 + o(1)) if σ a = s 0 > 0 and α = 1, C 6 (logn) α (1 + o(1)) if σ a = 0 and s 0 < 0, C 7 n σa (logn) α (1 + o(1)) if σ a > s 0 and σ a > 0. (ii) If log(1/p 1 ),...,log(1/p m ) are rationally related, then T(n) behaves as in the irrationally related case with the following two exceptions: T(n) = { C2 logn + Ψ 2 (logn) + o(1) if σ a < s 0 = 0, Ψ 4 (logn)n s 0 (1 + o(1)) if σ a < s 0 and s 0 > 0, where C 2 is positive and Ψ 2 (t),ψ 4 (t) are periodic functions with period L (with usually countably many discontinuities).

Extensions and Remarks 1. We can handle any a n sequence with Dirichlet series Ã(s): Ã(s) = g 0 (s) ( ) β0 log 1 s σa (s σ a ) α 0 + ( ) βj J log 1 s σa g j (s) (s σ a ) α j j=1 + F(s), F(s) is analytic, g 0 (σ a ) 0, β j non-negative integers, and α 0 real. Then the solutiont(n) of the divide and conquer recurrence has the form: T(n) C n σ (logn) α (loglogn) β or T(n) Ψ(logn)n s 0 σ = max{σ,s 0 }), depending whether logp 1,...logp m are irrationally or rationally related. 2. The periodic function Ψ(t) has the following building blocks λ t n 1 λ B n t logn L λ 1 +1 where λ > 1 andb n is such that n 1 B nλ (logn)/l converges absolutely. This function is discontinuous at t = {logn/l}, where{x} = x x denotes the fractional part of a real number x.

Examples Example 1. Irrationally Related; Case 4: T(n) = 2T( n/2 ) + 3T( n/6 ) + nlogn Here σ a = 1 since a n = nlogn. The equation 2 2 s + 3 6 s = 1 has the (real) solution s 0 = 1.402... > 1, and finally log(1/2)/log(1/6) are irrationally related. Thus by our Master Theorem Case 4 for some constant C > 0 T(n) Cn s 0

Examples Example 2. Irrationally Related; Case 6: Consider the recurrence T(n) = 2T( n/2 ) + 8 n2 T( 3n/4 ) + 9 logn. Here σ a = s 0 = 2, and we deal with irrationally related case. Furthermore, Ã(s) = slog 1 s 2 + G(s) for G(s) analytic for R(s) > 1. By Master Theorem Case 6 T(n) Cn 2 loglogn. Example 3. Rationally Related (m = 1); Case 3: Next consider T(n) = T( n/2 ) + logn. Here σ a = s 0 = 0, and we have rational case (m = 1). Since Ã(s) = 1 s + G(s) we conclude T(n) C(logn) 2.

Examples Example 4: Karatsuba algorithm: Rationally Related (m = 1): T(n) = 3T( n/2 ) + n Here, s 0 = (log3)/(log2) = 1.5849... and s 0 > σ a = 1. Thus for some periodic function Ψ(t). T(n) = Ψ(logn)n log3 log2 (1 + o(1))

Examples Example 5. Rationally Related (m = 1). The recurrence T(n) = 1 2 T( n/2 ) + 1 n is not covered by our Master Theorem but our methodology still works. Here σ a = s 0 = 1 < 0. It follows that T(n) = C logn n + Ψ(logn) ( ) 1 + o n n for a periodic function Ψ(t). Example 6: Mergesort. Rationally Related. The mergesort recurrences are T(n) = T( n/2 ) + T( n/2 ) + n 1, Y (n) = Y( n/2 ) + Y ( n/2 ) + n/2. Here σ a = s 0 = 1 and we deal with the rationally related case. By our Master Theorem (cf. Flajolet & Golin, 1994) T(n) = Y (n) = 1 nlogn + nψ(logn) + o(n), log2 1 nlogn + nψ(logn) + o(n). 2log2

Boncelet s Algorithm Revisited Let a sequence X be generated by a memoryless source over alphabet A of size m with symbol probabilities p i, i A. Using the Boncelet s parsing tree, we parse X into phrases {v 1,...v n } of length l(v 1 ),..., l(v n ) with phrase probabilities P(v 1 ),...,P(v n ). Phrase Length and its Probability Generating Function: Let D n denote the phrase length and define the probability generating function as C(n,y) = E[y Dn ] It satisfies the following discrete divide and conquer recurrence: C(n,y) = y m p i C([p i n + δ i ],y) i=1 The expected phrase length d(n) = E[D n ] = C (n,1) satisfies the following discrete divide and conquer recurrence: d(n) = 1 + m p i d([p i n + δ i ]) i=1 with d(0) = = d(m 1) = 0.

Main Results for Boncelet s Algorithm Theorem 3. Consider anm-ary memoryless source with probabilitiesp i > 0 and the entropy rate H = m i=1 p ilog(1/p i ). (i) If log(1/p 1 ),...log(1/p m ) are irrationally related, then where d(n) = 1 H logn α H + o(1), α = E (0) H H 2 2H, H 2 = m i=1 p ilog 2 p i, and E (0) is the derivative at s = 0 of a Dirichlet series E(s) arises from the discrete nature of the recurrence. (ii) If log(1/p 1 ),...log(1/p m ) are rationally related, then d(n) = 1 H logn α + Ψ(logn) H + O(n η ) for some η > 0, where Ψ(t) is a periodic function of bounded variation that has usually an infinite number of discontinuities.

Redundancy of the Boncelet s Algorithm Corollary 1. Let R n denote the redundancy of the Boncelet code: R n = logn E[D n ] H = logn d(n) H. (i) If log(1/p 1 ),...log(1/p m ) are irrationally related, then R n = Hα ( ) 1 logn + o. logn (ii) If log(1/p 1 ),...log(1/p m ) are rationally related, then ( ) Hα + Ψ(logn) 1 R n = + o. logn logn Tunstall Code Redundancy: R T n = H logn ( logh H 2 2H ) ( ) 1 + o logn for irrational case; in the rational case there is aperiodic function. Example. Consider p = 1/3 and q = 2/3. Then one computes α = E (0) H H 2 2H 0.322 while for the Tunstall code logh H 2 2H 0.0496.

Limiting Distribution for the Phrase length Theorem 4. Consider a memoryless source generating a sequence of length n parsed by the Boncelet algorithm. If (p 1,...,p m ) is not the uniform distribution, then the phrase length D n satisfies the central limit law, that is, D n 1 H logn (H2 ) N(0,1), logn H 3 1 H where N(0,1) denotes the standard normal distribution, and VarD n E[D n ] = logn H + O(1), H 1 ) logn 3 H ( H2 for n.

Sketch of Proof 1. From the recurrence we have T(s) m = Ã(s) + b j j=1 n=1 T ( p j (n + 2) + δ j ) T ( p j (n + 1) + δ j ) n s. But defining k + 2 δj n = for some integer k, we arrive at p j 2 n=1 T ( p j (n + 2) + δ j ) T ( p j (n + 1) + δ j ) n s = G j (s)+ k=1 T(k + 2) T(k + 1) ( k+2 δj ) s. p 2 j for an explicit (and simple) analytic function G j (s).

Sketch of Proof Continuation 2. We now compare the last sum to p s j T(s) and obtain k=1 T(k + 2) T(k + 1) ( k+2 δj ) s = p 2 j k=1 T(k + 2) T(k + 1) (k/p j ) s = p s j T(s) E j (s), where E j (s) = (T(k + 2) T(k + 1)) 1 (k/p j ) 1 ( s k+2 δj p 2 j k=1 ) s. 3. Defining E(s) = m b j E j (s) and G(s) = j=1 m b j G j (s) j=1 we finally obtain our final formula T(s) = Ã(s) + G(s) E(s) 1 m j=1 b. j p s j

Sketch of Proof Binary Boncelet s Algorithm 1. Let s 0 (y) be the real zero of y(p s+1 + q s+1 ) = 1, q = 1 p. 2. Define C(s,y) = n=1 which from the basic recurrence becomes C(n + 2,y) C(n + 1,y) n s. C(s,y) = (y 1) E(s,y) 1 y(p s+1 + q s+1 ), where E(s,y) converges for R(s) > s 0 (y) 1 and satisfies E(0,y) = 0 and E(s,1) = 0. 3. By Mellin-Perron formula and residue theorem we can prove that C(n,y) = (1 + O(y 1))n s 0 (y) (1 + o(1)) where s 0 (y) = y 1 ( H + H2 2H 1 ) (y 1) 2 + O((y 1) 3 ). 3 H

That s It THANK YOU