The distribution of run lengths in integer compositions

Similar documents
A conjecture on the alphabet size needed to produce all correlation classes of pairs of words

Pattern avoidance in compositions and multiset permutations

arxiv:math/ v1 [math.co] 2 Oct 2002

- 1 - ENUMERATION OF STRINGS. A. M. Odlyzko AT&T Bell Laboratories Murray Hill, New Jersey USA ABSTRACT

The subword complexity of a class of infinite binary words

The q-exponential generating function for permutations by consecutive patterns and inversions

Counting permutations by cyclic peaks and valleys

Words restricted by patterns with at most 2 distinct letters

arxiv:math/ v2 [math.co] 30 Jan 2003

Congruence successions in compositions

Counting Permutations by their Rigid Patterns

The Symbolic Goulden Jackson Cluster Method

ECO-generation for some restricted classes of compositions

Counting Palindromes According to r-runs of Ones Using Generating Functions

Equivalence Classes of Motzkin Paths Modulo a Pattern of Length at Most Two

Counting Palindromic Binary Strings Without r-runs of Ones

Avoiding Large Squares in Infinite Binary Words

Counting Palindromes According to r-runs of Ones Using Generating Functions

A combinatorial determinant

Toufik Mansour 1. Department of Mathematics, Chalmers University of Technology, S Göteborg, Sweden

REPRESENTING POSITIVE INTEGERS AS A SUM OF LINEAR RECURRENCE SEQUENCES

arxiv: v1 [math.co] 30 Mar 2010

Asymptotics for minimal overlapping patterns for generalized Euler permutations, standard tableaux of rectangular shapes, and column strict arrays

Enumerating Binary Strings

BIJECTIVE PROOFS FOR FIBONACCI IDENTITIES RELATED TO ZECKENDORF S THEOREM

Waiting Time Distributions for Pattern Occurrence in a Constrained Sequence

ON MULTI-AVOIDANCE OF RIGHT ANGLED NUMBERED POLYOMINO PATTERNS

On the number of matchings of a tree.

Compositions, Bijections, and Enumerations

The sigma-sequence and occurrences of some patterns, subsequences and subwords

ON CYCLIC STRINGS WITHOUT LONG CONSTANT BLOCKS

BOUNDS ON ZIMIN WORD AVOIDANCE

Counting Peaks and Valleys in a Partition of a Set

Dynamic Programming. Shuang Zhao. Microsoft Research Asia September 5, Dynamic Programming. Shuang Zhao. Outline. Introduction.

Enumeration Schemes for Words Avoiding Permutations

Integer Sequences Related to Compositions without 2 s

q-counting hypercubes in Lucas cubes

Permutation Statistics and q-fibonacci Numbers

Numeration Systems. S. E. Payne General Numeration Systems 2. 3 Combinatorial Numeration Numeration with forbidden substrings 14

arxiv: v1 [math.nt] 4 Mar 2014

arxiv: v1 [math.co] 3 Nov 2014

Optimal Ramp Schemes and Related Combinatorial Objects

A fast algorithm for the Kolakoski sequence

On Minimal Words With Given Subword Complexity

Introduction to Techniques for Counting

ABSTRACT 1. INTRODUCTION

Compositions and samples of geometric random variables with constrained multiplicities

Quadratic-backtracking Algorithm for String Reconstruction from Substring Compositions

Binomial Coefficient Identities/Complements

Recursive Definitions

Refining the Stern Diatomic Sequence

Some statistics on permutations avoiding generalized patterns

MULTI-RESTRAINED STIRLING NUMBERS

Algebraic aspects of some Riordan arrays related to binary words avoiding a pattern

The discrepancy of permutation families

Linear algebra I Homework #1 due Thursday, Oct Show that the diagonals of a square are orthogonal to one another.

Pattern correlation matrices and their properties

Bordered Conjugates of Words over Large Alphabets

Bijective proofs for Fibonacci identities related to Zeckendorf s Theorem

Restricted Motzkin permutations, Motzkin paths, continued fractions, and Chebyshev polynomials

Self-dual interval orders and row-fishburn matrices

COUNTING STAIRCASES IN INTEGER COMPOSITIONS

A detailed derivation of the transfer matrix method formula M. Klazar.... a ik 1,j = A i,i1 A i1,i 2

Gray Codes and Overlap Cycles for Restricted Weight Words

ENUMERATING DISTINCT CHESSBOARD TILINGS

A Note on Extended Binomial Coefficients

Topics in Algorithms. 1 Generation of Basic Combinatorial Objects. Exercises. 1.1 Generation of Subsets. 1. Consider the sum:

A New Shuffle Convolution for Multiple Zeta Values

q xk y n k. , provided k < n. (This does not hold for k n.) Give a combinatorial proof of this recurrence by means of a bijective transformation.

A Catalan-Hankel Determinant Evaluation

REFINED RESTRICTED PERMUTATIONS AVOIDING SUBSETS OF PATTERNS OF LENGTH THREE

Permutation Patterns and Continued Fractions

Matrix compositions. Emanuele Munarini. Dipartimento di Matematica Politecnico di Milano

COMPOSITIONS WITH A FIXED NUMBER OF INVERSIONS

arxiv: v1 [math.co] 17 Dec 2007

ON THE ASYMPTOTIC DISTRIBUTION OF PARAMETERS IN RANDOM WEIGHTED STAIRCASE TABLEAUX

Parity Theorems for Statistics on Domino Arrangements

Isomorphisms between pattern classes

Compositions of n with no occurrence of k. Phyllis Chinn, Humboldt State University Silvia Heubach, California State University Los Angeles

ASYMPTOTIC ENUMERATION OF PERMUTATIONS AVOIDING GENERALIZED PATTERNS

THE ALTERNATING GREEDY EXPANSION AND APPLICATIONS TO COMPUTING DIGIT EXPANSIONS FROM LEFT-TO-RIGHT IN CRYPTOGRAPHY

De Bruijn Sequences for the Binary Strings with Maximum Density

ROSENA R.X. DU AND HELMUT PRODINGER. Dedicated to Philippe Flajolet ( )

Expected Number of Distinct Subsequences in Randomly Generated Binary Strings

Pattern Avoidance in Ordered Set Partitions

An Efficient Algorithm for Deriving Summation Identities from Mutual Recurrences

arxiv: v1 [math.co] 25 May 2011

Block intersection polynomials

arxiv:math/ v1 [math.co] 10 Nov 1998

Almost all trees have an even number of independent sets

A MATRIX GENERALIZATION OF A THEOREM OF FINE

k-protected VERTICES IN BINARY SEARCH TREES

Hankel Operators plus Orthogonal Polynomials. Yield Combinatorial Identities

Online Computation of Abelian Runs

Consecutive Patterns: From Permutations to Column-ConvexAugust Polyominoes 10, 2010 and Back 1 / 36

Counting. Mukulika Ghosh. Fall Based on slides by Dr. Hyunyoung Lee

MATH 31 - ADDITIONAL PRACTICE PROBLEMS FOR FINAL

CSL361 Problem set 4: Basic linear algebra

q-pell Sequences and Two Identities of V. A. Lebesgue

Transcription:

The distribution of run lengths in integer compositions Herbert S Wilf University of Pennsylvania Philadelphia PA 19104-6395, USA wilf@mathupennedu Submitted: Mar 7, 2010; Accepted: Oct 9, 2011; Published: Oct 17, 2011 From W to Z: Mazel tov on your 60 th! Abstract We find explicitly the generating function for the number of compositions of n that avoid all words on a given list of forbidden subwords, in the case where the forbidden words are pairwise letter-disjoint From this we get the gf for compositions of n with no k consecutive parts equal, as well as the number with m parts and no consecutive k parts being equal, which generalizes corresponding results for Carlitz compositions 1 Introduction A composition of an integer n is a representation n = a 1 +a 2 + +a k in which the parts a i are positive integers, and where the order of the parts is important Thus 9 = 1+1+1+4+2 is one of the compositions of n = 9, and 9 = 4 + 1 + 1 + 2 + 1 is another A run in a composition is a maximal string of identical parts The composition 28 = 3 + 5 + 5 + 5 + 3 + 3 + 4 has run lengths of 1,3,2,1, for example In this note we find (the generating function of) C(n, k, r), the number of compositions of n into k parts, whose runs all have lengths < r (see Theorem 3 below), by using recent generalizations of the Guibas-Odlyzko method for counting words that avoid a list of subwords In their 1981 paper [5], Guibas and Odlyzko gave an elegant solution to the following counting problem Given an alphabet A, and a list L of words over that alphabet, the list This paper was posted on the arxiv [11] on June 29, 2009 the electronic journal of combinatorics 18(2) (2011), #P23 1

being reduced in the sense that no word on the list is a subword of any other How many words of length n do not contain any of the words in L as a subword? Other solutions of this problem have been given by the cluster method of Goulden and Jackson [3], and by Zeilberger s [12] method of counting words that avoid mistakes The results of [5] have recently been extended by AN Myers [10] to the situation wherein the letters of the alphabet are assigned weights, the weight of a word is the sum of the weights of its letters, and one is to find the number of words of weight n that avoid the members of the list L This allows us to solve problems involving compositions of integers as well as problems that do not involve compositions Finally, Myers s results have been complemented by Heubach and Kitaev [6] to provide the number of words of length k and weight n that avoid the members of the list L, though their theorems are restricted to the alphabet {1, 2,, n} and therefore apply almost exclusively to integer compositions The above theorems present the generating function for the desired numbers of words as the first component of the solution vector of a system of linear, simultaneous equations, or, by using Cramer s rule, as a ratio of two determinants The main point of this note is the following The easy case in these word problems is the case in which every pair of distinct words on the forbidden list L has correlation 0, in a sense to be explained below, or equivalently, for every pair x, y of distinct words on that list, no suffix of x is also a prefix of y In that situation, the matrix of coefficients of the system of linear equations that expresses the answer to the question has a very simple form It consists of a nonzero first row and first column and main diagonal, all other entries being 0 s For a matrix of that form it is easy to write out the solution of the governing system of linear equations simply and explicitly We will do that below and then provide some examples We will, for example, find the generating function for C(n, k, j), the number of compositions of n into k parts the lengths of whose runs is at most j, a run in a composition being a maximal contiguous block of identical parts 2 The main theorem Let X and Y be two words over a given alphabet We define the correlation c XY of X on Y, as follows Write the word X above the word Y, aligned so that the rightmost letter of X is above the rightmost letter of Y Fix some integer j 0 Shift Y j places to the left, so the rightmost letter of Y is now under the (j + 1)st letter of X, counting from the right Examine the subword of X that now overlaps with Y This is the maximal prefix of X that has letters of the shifted Y below it If that subword of X is identical with the subword of Y that lies below it, take c j = 1, else take c j = 0 the electronic journal of combinatorics 18(2) (2011), #P23 2

Having done this for all j, the correlation of X on Y is the binary vector c 0 c 1 c 2 For example, if X = 110 and Y = 1011 then c XY = 011 and c Y X = 0010, in which we have written the bits of the c s in the order c 0 c 1 c m 1 Let each letter u of the alphabet be assigned a weight w(u), and let the weight of a word be the sum of the weights of its letters Finally, if X is an m-letter word X = a 0 a 1 a m 1, define the correlation polynomial c XY (x, q) of X on Y to be c XY (x, q) = c 0 + c 1 x w(a m 1) q + c 2 x w(a m 2a m 1 ) q 2 + + c m 1 x w(a 1a 2 a m 1 ) q m 1 (1) The main result of [6], which extends the main result of [10], which in turn extends the main result of [5], is the following Theorem 1 (Heubach, Kitaev) Let L = {S 1,,S k } be a list of integer compositions, such that no composition on the list is contained in any other Let F(x, q) = σ xw(σ) q l(σ), the sum being extended over all compositions of all integers that avoid every word on the list L, where l(σ) is the length of the word (number of parts of) σ and w(σ) is the sum of the parts of σ Then F(x, q) is the component x 1 of the solution vector of the following system of linear equations: (1 + q) x w(s1) q l(s 1) c 11 (x, q) c 1k (x, q) x w(sk) q l(s k) c k1 (x, q) c kk (x, q) 3 The easy case x 1 x 2 x k+1 = 0 We now specialize to the case where c ij (x, q) = 0 for all i j, 1 i, j k, k being the length of the forbidden word list L The coefficient matrix entries in the equations (2) then all vanish except for those in the first row, the first column, and the main diagonal For any such matrix, B, say, the first entry of the solution vector of the equations Bx = (, 0,,0) T is easily verified to be x 1 = b b 11 b 21 b 12 b 22 b k+1,1 1,k+1 b k+1,k+1 If we apply this result to the equations (2) we obtain Theorem 2 Let L = {S 1,,S k } be a list of integer compositions, such that no word on the list is contained in any other Suppose further that all correlation polynomials c ij (x, q) = 0, for i j Let F(x, q) = σ xw(σ) q l(σ), the sum being extended over all compositions σ that avoid every word on the list L, where l(σ) is the length of the word σ Then F(x, q) = (1 + q) + () k x w(s j ) q l(s j ) j=1 c j,j (x,q) 0 (2) (3) the electronic journal of combinatorics 18(2) (2011), #P23 3

4 Carlitz compositions and beyond 41 Carlitz compositions We apply the results of the previous section to finding the distribution function of the lengths of the runs of integer compositions Again, a run in a composition is a maximal string of identical parts The composition 28=3+5+5+5+3+3+4 has run lengths of 1,3,2,1, for example A Carlitz composition is one all of whose runs have length 1 That is, a Carlitz composition is one in which no two consecutive parts are equal These compositions have been extensively studied in recent years, both exactly and asymptotically [1, 8, 9] The machinery of the easy case above counts Carlitz compositions of n, as follows The list L of forbidden subwords is L = {11, 22, 33, 44, } A Carlitz composition is evidently one that avoids this list, and also evidently, this list belongs to the easy case, ie, the off-diagonal correlation polynomials all vanish Thus we can use Theorem 2 The word S j is jj, and its weight is w(s j ) = 2j The correlation polynomials c Si S j vanish for all i j, while for i = j we have by (1), c Sj S j (x, y) = 1 + x j q If C(n, k) is the number of Carlitz compositions of n into k parts, we now have from equation (3), C(n, k)x n q k n,k = (1 + q) + ()q 2 x 2j j 1 1+qx j = 1 + qx + qx 2 + (q + 2q 2 )x 3 + (q + 2q 2 + q 3 )x 4 + (q + 4q 2 + 2q 3 )x 5 + This generating function has previously been found, in somewhat different form, by Knopfmacher and Prodinger [8] 42 Beyond Now we find the distribution function of the maximum run length in compositions of n that have k parts Let C(n, k, r) denote the number of compositions of n into k parts that have no run of length r Note that C(n, k, 2) counts Carlitz compositions of n with k parts To find C(n, k, r) we use the list L = {1 r, 2 r, 3 r, } of forbidden words, where, eg, 1 r is a string of r 1 s Then again the list L qualifies for the easy case, since the correlations all vanish off of the diagonal while on the diagonal, c Sj S j (x, y) = 1 + x j q + x 2j q 2 + + x j(r 1) q r 1 = 1 qr x rj 1 qx j We now have from equation (3), the electronic journal of combinatorics 18(2) (2011), #P23 4

Theorem 3 The number C(n, k, r) of compositions of n into k parts that have no run of length r has the generating function C(n, k, r)x n q k = n,k 1 q + (4) 1 x qr x rj (1 qx j ) j 1 1 q r x rj The average length of the longest run in a composition of n has been found to be log 2 n, by Grabner et al [4], using the method of iid geometric random variables The asymptotic behavior of the number of compositions of n whose runs are all of lengths < k has been found by Gafni [2] References [1] L Carlitz, Restricted compositions, The Fibonacci Quart, 14 (1976), 254264 [2] Ayla Gafni, Longest Run of Equal Parts in a Random Integer Composition, arxiv:math/09075553 [mathco] July 31, 2009 [3] I P Goulden and D M Jackson, An inversion theorem for cluster decompositions of sequences with distinguished subsequences, J London Math Soc 20 (1979), no 3, 567 576 [4] P Grabner, A Knopfmacher and H Prodinger, Combinatorics of geometrically distributed random variables: Run statistics, Theoretical Computer Science 297 (2003), 261 270 [5] LJ Guibas and AM Odlyzko, String overlaps, pattern matching, and nontransitive games, J Combinatorial Theory, Ser A, 30 (1981), 183 208 [6] Silvia Heubach and Sergey Kitaev, Avoiding substrings in compositions, arxiv:math/09035135 [mathco] [7] S Heubach and T Mansour, Enumeration of 3-letter patterns in compositions, arxiv:math/0603285v1 [mathco] [8] Arnold Knopfmacher and Helmut Prodinger, On Carlitz compositions, European J Combin 19 (1998), no 5, 579 589 [9] Guy Louchard and Helmut Prodinger, Probabilistic analysis of Carlitz compositions, Discrete Math Theor Comput Sci 5 (2002), no 1, 71 95 [10] Amy N Myers, Forbidden substrings on weighted alphabets, Australasian J Math, to appear [11] Herbert S Wilf, The distribution of longest run lengths in integer compositions, arxiv:math/09065196 [mathco] June 29, 2009 [12] Doron Zeilberger, Enumeration of words by their number of mistakes Discrete Math 34 (1981), no 1, 89 91 the electronic journal of combinatorics 18(2) (2011), #P23 5