Asymptotic distribution of two-protected nodes in ternary search trees

Cecilia Holmgren, Svante Janson

March 21, 2014; revised October 5, 2014

Abstract

We study protected nodes in m-ary search trees, by putting them in the context of generalised Pólya urns. We show that the number of two-protected nodes (the nodes that are neither leaves nor parents of leaves) in a random ternary search tree is asymptotically normal. The methods apply in principle to m-ary search trees with larger m as well, although the size of the matrices used in the calculations grows rapidly with m; we conjecture that the method yields an asymptotically normal distribution for all $m \le 26$. The one-protected nodes, and their complement, i.e., the leaves, are easier to analyze. By using a simpler Pólya urn (similar to the one that has earlier been used to study the total number of nodes in m-ary search trees), we prove normal limit laws for the number of one-protected nodes and the number of leaves for all $m \le 26$.

Keywords: Random trees, Pólya urns, Normal limit laws, M-ary search trees.
MSC 2010 subject classifications: Primary 60C05; secondary 05C05, 60F05, 68P05.

1 Introduction

There are many recent studies of so-called protected nodes in various classes of random trees, see e.g. [,, 6, 8,, 8, 9]. A node is protected (more precisely, two-protected) if it is not a leaf and none of its children is a leaf. In this paper we consider the number of protected nodes in m-ary search trees (see Section 1.1.2 for definitions), by putting them in the context of generalised Pólya urns. The following result is our main theorem. We let $\xrightarrow{d}$ denote convergence in distribution and denote a normal distribution with mean $\mu$ and variance $\sigma^2$ by $N(\mu, \sigma^2)$.

Theorem 1.1. Let $Z_n$ be the number of protected nodes in a ternary search tree with n keys. Then
$$\frac{Z_n - \frac{57}{700}\, n}{\sqrt{n}} \xrightarrow{d} N\Bigl(0, \frac{1692302314867}{43692253605000}\Bigr).$$

Department of Mathematics, Stockholm University, 4 8 Stockholm, Sweden. Supported in part by the Swedish Research Council.
Department of Mathematics, Uppsala University, SE-7 Uppsala, Sweden. Supported in part by the Knut and Alice Wallenberg Foundation.

For a binary search tree, we obtain by the same method a new proof of the following result, which has earlier been obtained by different methods, first by Mahmoud and Ward [8] (using generating functions), and later in [] (using fringe trees).

Theorem 1.2. Let $Y_n$ be the number of protected nodes in a binary search tree with n keys. Then
$$\frac{Y_n - \frac{11}{30}\, n}{\sqrt{n}} \xrightarrow{d} N\Bigl(0, \frac{29}{225}\Bigr).$$

Remark 1.3. Theorems 1.1 and 1.2 imply that $Z_n/n \xrightarrow{p} 57/700$ and $Y_n/n \xrightarrow{p} 11/30$. It follows from [2, Theorem 3.21] that moreover, for the sequence of search trees generated by an infinite sequence of i.i.d. keys, $Z_n/n \to 57/700$ a.s. and $Y_n/n \to 11/30$ a.s. (Similarly, convergence almost surely holds in the other limit theorems below too.) Since also $0 \le Z_n/n \le 1$ and $0 \le Y_n/n \le 1$, the dominated convergence theorem implies that $Z_n/n$ and $Y_n/n$ converge in $L^1$ to $57/700$ and $11/30$, respectively; in particular, $E(Z_n)/n \to 57/700$ and $E(Y_n)/n \to 11/30$. We conjecture that also the variances (and higher moments) converge in Theorems 1.1 and 1.2.

The methods apply to larger m too, at least in principle, see Sections 1.1.3 and 5. Similarly, we may consider the one-protected nodes, i.e. the non-leaves. These are easier to analyze than the two-protected nodes and, using a minor variation of a Pólya urn earlier used to study the total number of nodes [5, 2, 6], we prove in Sections 4 and 5.2 normal limit laws for the number of one-protected nodes and the number of leaves in an m-ary search tree for all $m \le 26$.

1.1 Protected nodes in m-ary search trees described as generalised Pólya urns

1.1.1 A generalised Pólya urn

A (generalised) Pólya urn process is defined as follows, see e.g. [2] or [6]. There are balls of q types (or colours) $1, \dots, q$, and for each n a random vector $X_n = (X_{n,1}, \dots, X_{n,q})$, where $X_{n,i}$ is the number of balls of type i in the urn at time n. The urn starts with a given vector $X_0$. For each type i, there is an activity (or weight) $a_i \ge 0$, where $a_i \in \mathbb{R}$, and a random vector $\xi_i = (\xi_{i1}, \dots, \xi_{iq})$, where $\xi_i \in \mathbb{Z}^q$. The urn evolves according to a discrete time Markov process.
At each time $n \ge 1$, one ball is drawn at random from the urn, with the probability of any ball proportional to its activity. Thus, the drawn ball has type i with probability $a_i X_{n-1,i} / \sum_j a_j X_{n-1,j}$. If the drawn ball has type i, it is replaced together with $\Delta X^{(i)}_{n,j}$ balls of type j, $j = 1, \dots, q$, where the random vector $\Delta X^{(i)}_n = (\Delta X^{(i)}_{n,1}, \dots, \Delta X^{(i)}_{n,q})$ has the same distribution as $\xi_i$ and is independent of everything else that has happened so far. (We allow $\Delta X^{(i)}_{n,i} = -1$, which means that the drawn ball is not replaced.) We let A denote the $q \times q$ matrix
$$A := (a_j\, E\, \xi_{ji})_{i,j=1}^{q}. \tag{1.1}$$
The matrix A with its eigenvalues and eigenvectors is central for proving limit theorems. The basic assumptions in [2] are the following. We say that a type i is dominating, if every other type j can be found with positive probability at some time in an urn started with a single ball of type i.

(A1) For each type i, there is an integer $d_i \ge 1$, such that $X_{0,i}$ and all $\xi_{ji}$ a.s. are divisible by $d_i$, $\xi_{ij} \ge 0$ for $j \ne i$ (i.e., balls of other types than the drawn ball are never removed) and $\xi_{ii} \ge -d_i$.

(A2) $E(\xi_{ij}^2) < \infty$ for all $i, j \in \{1, \dots, q\}$.
(A3) The largest real eigenvalue $\lambda_1$ of A is positive.
(A4) The largest real eigenvalue $\lambda_1$ is simple.
(A5) There exists a dominating type i with $X_{0,i} > 0$, i.e., we start with at least one ball of a dominating type.
(A6) $\lambda_1$ is an eigenvalue of the submatrix of A given by the dominating types.

Furthermore, [2] says that the process becomes essentially extinct if at some time there are no balls of any dominating type left. We will also use the following simplifying assumption.

(A7) With probability 1, the urn never becomes essentially extinct.

Condition (A1) is stated here somewhat more generally than in [2], where $d_i = 1$ is assumed, but the general case follows by replacing $X_{n,i}$ by $X_{n,i}/d_i$; see [2, Remark 4.2]. In the Pólya urns used in this paper, it is easily seen (from the definitions using trees) that every type with non-zero activity is dominating. If we remove rows and columns corresponding to the types with activity 0 from A, then the removed columns are identically 0, so the set of non-zero eigenvalues of A is not changed. The remaining matrix is irreducible, and using the Perron–Frobenius theorem, it is easy to verify all conditions (A1)–(A6), see [2, Lemma 2.1]. Furthermore, in our urns there will always be a ball of positive activity, so essential extinction is impossible.

Before stating the results that we use, we need some notation. With a vector v we mean a column vector, and we write $v'$ for its transpose (a row vector). More generally, we denote the transpose of a matrix A by $A'$. By an eigenvector of A we mean a right eigenvector; a left eigenvector is the same as the transpose of an eigenvector of the matrix $A'$. If u and v are vectors then $u'v$ is a scalar while $uv'$ is a $q \times q$ matrix of rank 1. We also use the notation $u \cdot v$ for $u'v$. We let $\lambda_1$ denote the largest real eigenvalue of A. (This exists by our assumptions and the Perron–Frobenius theorem.)
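The drawing rule of Section 1.1.1 translates almost directly into code. The following is a minimal simulation sketch and is not taken from the paper; the function name `urn_step` and the two-colour toy replacement rule are illustrative assumptions only.

```python
import random


def urn_step(counts, activities, draw_replacement, rng):
    """Perform one draw of a generalised Polya urn.

    counts[i]                -- current number of balls of type i (the vector X_n)
    activities[i]            -- activity a_i >= 0 of type i
    draw_replacement(i, rng) -- returns one sample of the vector xi_i
    """
    # A ball of type i is drawn with probability a_i X_i / sum_j a_j X_j.
    weights = [a * x for a, x in zip(activities, counts)]
    i = rng.choices(range(len(counts)), weights=weights)[0]
    xi = draw_replacement(i, rng)
    # The drawn ball stays in the urn and xi is added (xi[i] = -1 removes it).
    return [x + d for x, d in zip(counts, xi)]


def toy_replacement(i, rng):
    # Hypothetical toy rule (not from the paper): add one ball of the other colour.
    return [0, 1] if i == 0 else [1, 0]


rng = random.Random(2014)
counts = [1, 1]
for _ in range(1000):
    counts = urn_step(counts, [1, 1], toy_replacement, rng)
print(counts, sum(counts))  # the total is 2 + 1000, since every draw adds one ball
```

Passing the replacement rule as a function keeps the step itself independent of any particular urn, so the same loop can be reused with the tree-based replacement rules described later.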
Let $a = (a_1, \dots, a_q)'$ denote the (column) vector of activities, and let $u_1$ and $v_1$ denote left and right eigenvectors of A corresponding to the largest eigenvalue $\lambda_1$, i.e., vectors satisfying
$$u_1' A = \lambda_1 u_1', \qquad A v_1 = \lambda_1 v_1.$$
We assume that $v_1$ and $u_1$ are normalized such that
$$a \cdot v_1 = a' v_1 = v_1' a = 1, \qquad u_1 \cdot v_1 = u_1' v_1 = v_1' u_1 = 1, \tag{1.2}$$
see [2, equations (2.2)–(2.3)]. We write $v_1 = (v_{11}, \dots, v_{1q})'$. We define $P_{\lambda_1} = v_1 u_1'$ and $P_I = I_q - P_{\lambda_1}$, where $I_q$ is the $q \times q$ identity matrix. (Thus $P_{\lambda_1}$ is the one-dimensional projection onto the eigenspace corresponding to $\lambda_1$ such that $P_{\lambda_1}$ commutes with the matrix A, see [2, equation (2.2)]; note that $P_{\lambda_1}$ typically is not orthogonal.) We define the matrices
$$B_i := E(\xi_i \xi_i'), \tag{1.3}$$
$$B := \sum_{i=1}^{q} v_{1i}\, a_i\, B_i, \tag{1.4}$$
$$\Sigma_I := \int_0^\infty P_I\, e^{sA} B\, e^{sA'} P_I'\, e^{-\lambda_1 s}\, ds, \tag{1.5}$$
where we recall that $e^{tA} = \sum_{j=0}^{\infty} t^j A^j / j!$.

It is proved in [2] that, under assumptions (A1)–(A7), $X_n$ is asymptotically normal if $\operatorname{Re} \lambda \le \lambda_1/2$ for each eigenvalue $\lambda \ne \lambda_1$; more precisely, if $\operatorname{Re} \lambda < \lambda_1/2$ for each such $\lambda$, then $n^{-1/2}(X_n - n\mu) \xrightarrow{d} N(0, \Sigma)$ for some $\mu = (\mu_1, \dots, \mu_q)$ and $\Sigma = (\sigma_{i,j})_{i,j=1}^{q}$. (If $\operatorname{Re} \lambda = \lambda_1/2$ for some such $\lambda$, then $X_n$ is still asymptotically normal, however with another normalisation.) The asymptotic covariance matrix $\Sigma$ may be calculated in different ways; we use the following results from [2], which apply under different additional assumptions.

Theorem 1.4 ([2, Theorem 3.22 and Lemma 5.4]). Assume (A1)–(A7) and that we have normalized as in (1.2). Also assume that $\operatorname{Re} \lambda < \lambda_1/2$ for each eigenvalue $\lambda \ne \lambda_1$. Suppose that $a \cdot E(\xi_i) = m$ for some $m > 0$ and every i. Then, as $n \to \infty$, $n^{-1/2}(X_n - n\mu) \xrightarrow{d} N(0, \Sigma)$, with $\mu = \lambda_1 v_1$ and covariance matrix $\Sigma$ equal to $m \Sigma_I$, with $\Sigma_I$ as in (1.5).

Theorem 1.5 ([2, Theorem 3.22 and Lemma 5.3]). Assume (A1)–(A7), and that we have normalized as in (1.2). Also assume that $\operatorname{Re} \lambda < \lambda_1/2$ for each eigenvalue $\lambda \ne \lambda_1$. Suppose that the matrix A is diagonalisable, and that $\{u_i'\}_{i=1}^{q}$ and $\{v_i\}_{i=1}^{q}$ are dual bases of left and right eigenvectors, respectively, i.e., $u_i' A = \lambda_i u_i'$, $A v_i = \lambda_i v_i$ and $u_i \cdot v_j = \delta_{ij}$ (where $\delta_{ij}$ is the Kronecker delta). Then, as $n \to \infty$, $n^{-1/2}(X_n - n\mu) \xrightarrow{d} N(0, \Sigma)$, with $\mu = \lambda_1 v_1$ and covariance matrix $\Sigma$ equal to
$$\Sigma = \sum_{j,k=2}^{q} \frac{u_j' B u_k}{\lambda_1 - \lambda_j - \lambda_k}\, v_j v_k', \tag{1.6}$$
with the matrix B as in (1.4).

1.1.2 M-ary search trees

We recall the definition of m-ary search trees, see e.g. [4] or [7]. An m-ary search tree, where $m \ge 2$, is constructed recursively from a sequence of n keys (numbers). We assume that the keys are i.i.d. uniform random numbers in $[0, 1]$. (Only the order of the keys matters, so alternatively, we may assume that the keys form a uniformly random permutation of $\{1, \dots, n\}$.) Each node may contain up to $m - 1$ keys. We start with a tree containing just an empty root.
The first $m - 1$ keys are put in the root, and are placed in increasing order from left to right; they divide the set of real numbers into m intervals $J_1, \dots, J_m$. When the root is full (after the first $m - 1$ keys are added), it gets m children that are initially empty, and each further key is passed to one of the children depending on which interval it belongs to; a key in $J_i$ is passed to the i:th child. (The binary search tree is the simplest case, where a key is passed to the left or right child depending on whether it is smaller or larger than the key in the root.) The procedure repeats recursively in the subtrees until all keys are added to the tree.

Nodes that contain at least one key are called internal, while empty nodes are called external. We regard the m-ary search tree as consisting only of the internal nodes; the external nodes are places for potential additions, and are useful when discussing the tree (e.g. below), but are not really part of the tree. Thus, a leaf is an internal node that has no internal children, but it may have external children. (It will have external children if it is full, but not otherwise.) Similarly, a protected node is an internal node that is not a leaf, and has no child that is a leaf. (It may have external nodes as children.)

We say that a node with $i \le m - 2$ keys has $i + 1$ gaps, while a full node has no gaps. It is easily seen that an m-ary search tree with n keys has $n + 1$ gaps; the gaps correspond to the intervals of real numbers between the keys (and $\pm\infty$), and a new key has the same probability $1/(n+1)$ of belonging to any of the gaps. Thus the evolution of the m-ary search tree may be described by choosing a gap uniformly at random at each step. Equivalently, the probability that the next key is added to a node is proportional to the number of gaps at that node.

Pólya urns have been used in some earlier studies, e.g. [5, 2], to describe the number of nodes in m-ary search trees containing i keys where $0 \le i \le m - 1$; then a node containing i keys is called a node of type i and thus the generalised Pólya urn has m different types. It has been shown that for this process, when $m \le 26$ the numbers of nodes of the different types have an asymptotic multivariate normal distribution, but this does not hold for larger m. (The reason is that the condition $\operatorname{Re} \lambda < \lambda_1/2$ for $\lambda \ne \lambda_1$ on the eigenvalues of the matrix A in (1.1) holds only if $m \le 26$.) Since the number of nodes in the whole tree is a linear combination of these numbers, this implies in particular that the distribution of the random number of nodes in an m-ary search tree containing n keys is asymptotically normal for $m \le 26$. In this Pólya urn, with one ball representing each node, the activity of a ball is the number of gaps, i.e., $i + 1$ for a ball of type $i \le m - 2$, and 0 for a ball of type $m - 1$.
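The construction and the definitions of leaves, protected nodes and gaps can be made concrete with a short sketch. The code below is an illustration and is not from the paper; the class and function names (`Node`, `build`, `count_gaps`, `count_protected`) are assumptions, keys are assumed distinct, and at least one key is assumed when counting.

```python
import bisect
import random


class Node:
    def __init__(self, m):
        self.keys = []               # at most m - 1 keys, kept sorted
        self.children = [None] * m   # None marks an external (empty) node


def insert(root, key, m):
    node = root
    while len(node.keys) == m - 1:           # full node: pass the key to a child
        i = bisect.bisect_left(node.keys, key)
        if node.children[i] is None:
            node.children[i] = Node(m)
        node = node.children[i]
    bisect.insort(node.keys, key)


def build(keys, m):
    root = Node(m)
    for key in keys:
        insert(root, key, m)
    return root


def internal_nodes(root):
    stack = [root]
    while stack:
        node = stack.pop()
        if node.keys:
            yield node
            stack.extend(c for c in node.children if c is not None)


def is_leaf(node):
    return all(c is None for c in node.children)


def count_gaps(root, m):
    total = 0
    for node in internal_nodes(root):
        if len(node.keys) < m - 1:
            total += len(node.keys) + 1                      # i + 1 gaps
        else:
            total += sum(c is None for c in node.children)   # one gap per external child
    return total


def count_protected(root):
    # Protected: not a leaf, and no (internal) child is a leaf.
    return sum(
        1
        for node in internal_nodes(root)
        if not is_leaf(node)
        and not any(c is not None and is_leaf(c) for c in node.children)
    )


rng = random.Random(1)
keys = rng.sample(range(10_000), 500)
tree = build(keys, 3)
print(count_gaps(tree, 3), count_protected(tree))  # the first number is n + 1 = 501
```

Counting the gaps of a full node through its external children reflects the convention in the text that a full node itself has no gaps, while each external node (a node with 0 keys) carries exactly one.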
Alternatively, see [2], we can use a Pólya urn where each ball represents a gap; thus a node with i keys corresponds to $i + 1$ balls for $0 \le i \le m - 2$, and these balls are all given type i. (Full nodes are ignored.) This is thus an urn with $m - 1$ types, all with activities 1.

1.1.3 Protected nodes and generalised Pólya urns

We will see that it is possible to use a generalised Pólya urn also to study protected nodes in an m-ary search tree, although the urn consists of quite a few different types.

Description of the types in the Pólya urn. Given an m-ary search tree T with n keys together with its external nodes, erase all edges that connect two internal non-leaves. This yields a forest of small trees, where (assuming $n \ge m$) each tree has a root that is a non-leaf in T while all other nodes are leaves or external nodes in T. We regard these small trees as the balls in our generalised Pólya urn. The type of a ball (tree) is the type of the tree as an unordered tree, i.e., up to permutations of the children. The type of a tree in the urn is thus described by the numbers $k_i$, $i = 0, \dots, m - 1$, of children of the root with i keys; each of these children is an external node ($i = 0$) or a leaf ($i \ge 1$), and it has itself children only when $i = m - 1$, when it has m external children; thus the type is uniquely determined by $k_0, \dots, k_{m-1}$, and we can label the type by $(k_0, \dots, k_{m-1})$. Since the root of any of the small trees has m children (including external ones) in the original tree T, we have $\sum_{i=0}^{m-1} k_i \le m$ (with the remainder $m - \sum_{i=0}^{m-1} k_i$ equal to the number of erased edges to children in the original tree T that are non-leaves). Furthermore, the case $k_0 = m$ is excluded, since the root of the small tree is a non-leaf in T. The total number of types is thus one less than the number of compositions of m into $m + 1$ non-negative parts, i.e., $\binom{2m}{m} - 1$. The activity in the Pólya urn of one of these types is the number of gaps that it contains.

[Figure 1: The different types characterizing protected and unprotected nodes in binary search trees: Type 1 (0,2), Type 2 (1,1), Type 3 (0,1), Type 4 (1,0), Type 5 (0,0). Type 4 and type 5 are the only ones that include protected nodes.]

The root has no gaps, so a tree with type $(k_0, \dots, k_{m-1})$ has activity $\sum_{i=0}^{m-1} (i + 1) k_i$. Moreover, if we add a new key to a leaf, it is still a leaf, so in the Pólya urn, this corresponds to replacing a tree by another tree where we have increased by 1 the number of keys of one of the children of the root. The same holds if we add a key to an external node that is a child of the root. However, if we add a key to an external node that is a child of a leaf, then that leaf becomes a non-leaf, so the edge from it to the root is erased and the tree is split into two (one of which always has the type $(m - 1, 1, 0, \dots, 0)$). See Section 2 for examples. Note that in general, a small tree may be transformed in several different ways when we add a new key, depending on which gap it goes into. Hence, the additions $\xi_i$ in the Pólya urn will be random.

A protected node in T is a non-leaf, and is therefore a root in one of the small trees. Moreover, it must not have any child that is a leaf, so all its children are external nodes. Thus, the number of protected nodes in T equals the number of balls in the urn that have types $(k_0, 0, \dots, 0)$ with $0 \le k_0 \le m - 1$.

2 Protected nodes in binary search trees and Pólya urns

In this section we demonstrate the technique of using the Pólya urn defined above to study the number of protected nodes, by applying it to the simplest case $m = 2$, the binary search tree. This gives us a new proof of Theorem 1.2; for earlier proofs, see [8] and []. For a binary tree, the number of types in the Pólya urn defined above is $\binom{4}{2} - 1 = 5$. We show the different types in Figure 1, with a numbering that will be used below. (For convenience we omit the external nodes in the figures. We use dotted lines for edges attached to external nodes.)
With our characterization of the types in Section 1.1.3, the types $i \in \{1, \dots, 5\}$ correspond to (0, 2), (1, 1), (0, 1), (1, 0) and (0, 0), respectively. Let $X_n = (X_{n,1}, X_{n,2}, X_{n,3}, X_{n,4}, X_{n,5})$, where $X_{n,i}$ is the number of balls of type i in the urn corresponding to n keys (i.e., the number of trees that correspond to type i in our forest). Recall that we assume that $n \ge m = 2$; the initial conditions are $X_{2,2} = 1$ and $X_{2,i} = 0$ for $i \ne 2$.

In a binary search tree, each leaf contains one key, so it has two external children, whereas other internal nodes have either 1 or 0 external children. There is one gap at each external node, and no gaps at any internal node. As explained in Section 1.1.2, each gap (i.e. external node) has activity 1. When a ball is drawn from the urn (i.e., a new key is added to the tree), as explained in general in Section 1.1.3, a key is either added to an external node that is a child of the root (we return a ball of another type), or to an external node that is a child of a leaf (we return two balls). Figures 2–5 show the transitions in the Pólya urn when a ball of type i for $i \in \{1, 2, 3, 4\}$ is drawn (where the types are shown in Figure 1), so that the drawn ball is replaced by a new set of balls. (As said above, this set could depend on which of the nodes in the drawn type the key is added to, see Figure 3.)

[Figure 2: Adding a key to type 1. Type 1 is replaced by Type 2 + Type 3.]
[Figure 3: Adding a key to type 2. With probability 1/3, Type 2 is replaced by Type 1; with probability 2/3, Type 2 is replaced by Type 2 + Type 4.]
[Figure 4: Adding a key to type 3. Type 3 is replaced by Type 2 + Type 5.]
[Figure 5: Adding a key to type 4. Type 4 is replaced by Type 3.]

The activities of the different types depend on their number of gaps; the total activities for the types 1, 2, 3, 4, 5 are 4, 3, 2, 1, 0, respectively; thus $a = (4, 3, 2, 1, 0)'$. From the transitions that are shown in Figures 2–5, we easily obtain the matrix $A = (a_j\, E\, \xi_{ji})_{i,j=1}^{5}$ in (2.1):
$$A = \begin{pmatrix} -4 & 1 & 0 & 0 & 0 \\ 4 & -1 & 2 & 0 & 0 \\ 4 & 0 & -2 & 1 & 0 \\ 0 & 2 & 0 & -1 & 0 \\ 0 & 0 & 2 & 0 & 0 \end{pmatrix}. \tag{2.1}$$
To do the matrix operations in this paper we use computer algebra (in our case Mathematica).
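The step from the transitions in Figures 2–5 to the matrix (2.1) can be sketched in a few lines of exact arithmetic. This is an illustration in Python rather than the Mathematica computation used in the paper; the name `intensity_matrix` and the encoding of the transitions as (probability, change vector) pairs are assumptions of this sketch.

```python
from fractions import Fraction as F

# Activities of types 1..5, i.e. (0,2), (1,1), (0,1), (1,0), (0,0).
a = [4, 3, 2, 1, 0]

# transitions[j] lists (probability, change vector xi) for a drawn ball of type j+1,
# read off from Figures 2-5; the removal of the drawn ball is included as -1.
transitions = [
    [(F(1), [-1, 1, 1, 0, 0])],
    [(F(1, 3), [1, -1, 0, 0, 0]), (F(2, 3), [0, 0, 0, 1, 0])],
    [(F(1), [0, 1, -1, 0, 1])],
    [(F(1), [0, 0, 1, -1, 0])],
    [],  # type 5 has activity 0 and is never drawn
]


def intensity_matrix(a, transitions):
    """Return A with entries A[i][j] = a_j * E(xi_{j,i}), as in (1.1)."""
    q = len(a)
    A = [[F(0)] * q for _ in range(q)]
    for j in range(q):
        for prob, xi in transitions[j]:
            for i in range(q):
                A[i][j] += a[j] * prob * xi[i]
    return A


A = intensity_matrix(a, transitions)
for row in A:
    print([int(x) for x in row])
```

Each column of the printed matrix, weighted by the activities, sums to the activity of the drawn type, which reflects the fact that every added key creates exactly one new gap.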

The eigenvalues of A are 1, 0, −2, −3, −4. Corresponding right eigenvectors of A are
$$v_1 = \tfrac{1}{30}(1, 5, 3, 5, 6)', \quad v_2 = (0, 0, 0, 0, 1)', \quad v_3 = \tfrac{1}{3}(1, 2, -3, -4, 3)', \quad v_4 = \tfrac{1}{2}(1, 1, -3, -1, 2)', \quad v_5 = \tfrac{1}{5}(1, 0, -2, 0, 1)', \tag{2.2}$$
and corresponding left eigenvectors of A are
$$u_1 = (4, 3, 2, 1, 0)', \quad u_2 = (-1, -1, 0, 0, 1)', \quad u_3 = (2, 0, 1, -1, 0)', \quad u_4 = (-4, 1, -2, 1, 0)', \quad u_5 = (11, -3, 3, -1, 0)'. \tag{2.3}$$
Since the eigenvalues of the matrix A are distinct it follows automatically that $u_i \cdot v_j = 0$ for $i \ne j$ (recalling that $\{u_i'\}_{i=1}^{q}$ and $\{v_i\}_{i=1}^{q}$ are the left and right eigenvectors of A, respectively). Note that we have scaled the eigenvectors so that $u_i \cdot v_i = 1$ and (1.2) hold. Note also that $u_1$ is equal to the activity vector a. This is a consequence of the fact that the total activity always increases by 1 when we draw a ball from the urn, and thus $a \cdot E\, \xi_i = 1$ for each i, see [2, Lemma 5.4].

It is easy to see that we can apply Theorem 1.5 for this generalised Pólya urn. Note that it is obvious that the matrix A is diagonalisable since all eigenvalues are simple. From Theorem 1.5 we obtain that $X_n = (X_{n,1}, X_{n,2}, X_{n,3}, X_{n,4}, X_{n,5})$ has asymptotically a multivariate normal distribution. Let $Y_n$ be equal to the number of protected nodes in the binary search tree with n keys. Since type 4 and type 5 each contain exactly one protected node, while the other types contain no protected nodes,
$$Y_n = X_{n,4} + X_{n,5}.$$
Thus, Theorem 1.5 implies that
$$n^{-1/2}(Y_n - n\mu_Y) \xrightarrow{d} N(0, \sigma_Y^2) \tag{2.4}$$
with parameters $\mu_Y = \mu_4 + \mu_5$ and
$$\sigma_Y^2 = \sigma_{4,4} + \sigma_{4,5} + \sigma_{5,4} + \sigma_{5,5}. \tag{2.5}$$
Since $\lambda_1 = 1$, Theorem 1.5 implies, using $v_1$ in (2.2), that
$$\mu_Y = \mu_4 + \mu_5 = \frac{5}{30} + \frac{6}{30} = \frac{11}{30}. \tag{2.6}$$
Thus, to show Theorem 1.2 it remains to calculate the sum in (2.5). To calculate the matrix B in (1.4) we need to calculate $B_i = E(\xi_i \xi_i')$ in (1.3). In all cases except for $B_2$ these are deterministic and equal to $\xi_i \xi_i'$. We only show how to obtain $B_2$ (since the other cases are simpler). As shown in Figure 3, when adding a key to type 2 we can either add it to the leaf or to the external node.
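In case we add it to the external node (which happens with probability 1/3) a tree of type 2 is replaced by a tree of type 1; this change corresponds to the column vector $(1, -1, 0, 0, 0)'$. If the key is instead added to the leaf (which happens with probability 2/3) a tree of type 2 is replaced by another tree of type 2 (the change of type 2 is 0) and a tree of type 4; this change corresponds to the column vector $(0, 0, 0, 1, 0)'$. Hence
$$B_2 = \frac{1}{3}(1, -1, 0, 0, 0)'(1, -1, 0, 0, 0) + \frac{2}{3}(0, 0, 0, 1, 0)'(0, 0, 0, 1, 0) = \frac{1}{3}\begin{pmatrix} 1 & -1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}. \tag{2.7}$$
By calculating the $B_i$'s we obtain the matrix B in (1.4) as
$$B = \begin{pmatrix} \frac{3}{10} & -\frac{3}{10} & -\frac{2}{15} & 0 & 0 \\ -\frac{3}{10} & \frac{1}{2} & -\frac{1}{15} & 0 & \frac{1}{5} \\ -\frac{2}{15} & -\frac{1}{15} & \frac{1}{2} & -\frac{1}{6} & -\frac{1}{5} \\ 0 & 0 & -\frac{1}{6} & \frac{1}{2} & 0 \\ 0 & \frac{1}{5} & -\frac{1}{5} & 0 & \frac{1}{5} \end{pmatrix}. \tag{2.8}$$
From (1.6) in Theorem 1.5 it follows that the covariance matrix $\Sigma$ for the asymptotic multivariate normal distribution of $X_n = (X_{n,1}, X_{n,2}, X_{n,3}, X_{n,4}, X_{n,5})$ is given by
$$\Sigma = \begin{pmatrix} \frac{43}{1575} & -\frac{67}{2520} & -\frac{113}{12600} & -\frac{29}{2520} & \frac{1}{1400} \\ -\frac{67}{2520} & \frac{23}{420} & -\frac{1}{42} & -\frac{13}{1260} & \frac{71}{2520} \\ -\frac{113}{12600} & -\frac{1}{42} & \frac{443}{6300} & -\frac{1}{30} & -\frac{59}{1800} \\ -\frac{29}{2520} & -\frac{13}{1260} & -\frac{1}{30} & \frac{181}{1260} & -\frac{11}{504} \\ \frac{1}{1400} & \frac{71}{2520} & -\frac{59}{1800} & -\frac{11}{504} & \frac{13}{450} \end{pmatrix}. \tag{2.9}$$
Thus, it follows that
$$\sigma_Y^2 = \sigma_{4,4} + \sigma_{4,5} + \sigma_{5,4} + \sigma_{5,5} = \frac{181}{1260} - \frac{11}{504} - \frac{11}{504} + \frac{13}{450} = \frac{29}{225}. \tag{2.10}$$
Thus, the proof of Theorem 1.2 is completed.

3 Protected nodes in ternary search trees and Pólya urns

We now proceed by analyzing the number of protected nodes in ternary search trees, by using the Pólya urn in Section 1.1.3 (described for general m-ary search trees) when $m = 3$. The 19 different types we get are shown in Figure 6 (with a numbering that will be used below). From our characterization of the types in Section 1.1.3, for example type 2 corresponds to (0, 1, 2). Note that type 17, type 18 and type 19 contain one protected node each, while the other types contain no protected nodes. To determine the matrix A we proceed (as for the binary search tree) to find the transitions when a ball (in our case one of the 19 trees in our forest) of type i is chosen.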
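As a cross-check on the binary case of Section 2, the covariance formula (1.6) can be evaluated in exact rational arithmetic. The sketch below uses Python instead of the Mathematica computation mentioned in the paper; the eigenvectors are entered in one admissible dual scaling, and the variable names (`transitions`, `Sigma`, `var_Y`) are assumptions of this illustration.

```python
from fractions import Fraction as F

a = [4, 3, 2, 1, 0]
lam = [1, 0, -2, -3, -4]

# Dual bases of right (v) and left (u) eigenvectors, one admissible scaling.
v = [
    [F(x, 30) for x in (1, 5, 3, 5, 6)],
    [F(x) for x in (0, 0, 0, 0, 1)],
    [F(x, 3) for x in (1, 2, -3, -4, 3)],
    [F(x, 2) for x in (1, 1, -3, -1, 2)],
    [F(x, 5) for x in (1, 0, -2, 0, 1)],
]
u = [
    [4, 3, 2, 1, 0],
    [-1, -1, 0, 0, 1],
    [2, 0, 1, -1, 0],
    [-4, 1, -2, 1, 0],
    [11, -3, 3, -1, 0],
]

# (probability, change vector) for each drawn type, as in Figures 2-5.
transitions = [
    [(F(1), [-1, 1, 1, 0, 0])],
    [(F(1, 3), [1, -1, 0, 0, 0]), (F(2, 3), [0, 0, 0, 1, 0])],
    [(F(1), [0, 1, -1, 0, 1])],
    [(F(1), [0, 0, 1, -1, 0])],
    [],
]


def dot(x, y):
    return sum(p * q for p, q in zip(x, y))


# B = sum_i v_{1i} a_i E(xi_i xi_i'), see (1.3)-(1.4).
B = [[F(0)] * 5 for _ in range(5)]
for i in range(5):
    for prob, xi in transitions[i]:
        w = v[0][i] * a[i] * prob
        for r in range(5):
            for c in range(5):
                B[r][c] += w * xi[r] * xi[c]

# Sigma = sum_{j,k >= 2} u_j' B u_k / (lam_1 - lam_j - lam_k) * v_j v_k', see (1.6).
Sigma = [[F(0)] * 5 for _ in range(5)]
for j in range(1, 5):
    for k in range(1, 5):
        Bu_k = [dot(row, u[k]) for row in B]
        coeff = dot(u[j], Bu_k) / (lam[0] - lam[j] - lam[k])
        for r in range(5):
            for c in range(5):
                Sigma[r][c] += coeff * v[j][r] * v[k][c]

mu_Y = v[0][3] + v[0][4]
var_Y = Sigma[3][3] + Sigma[3][4] + Sigma[4][3] + Sigma[4][4]
print(mu_Y, var_Y)  # 11/30 29/225
```

Working with `Fraction` throughout avoids any rounding, so the result can be compared directly with the constants in Theorem 1.2.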

[Figure 6: The different types characterizing protected and unprotected nodes in ternary search trees: Type 1 (0,0,3), Type 2 (0,1,2), Type 3 (0,2,1), Type 4 (1,0,2), Type 5 (0,3,0), Type 6 (0,0,2), Type 7 (1,1,1), Type 8 (0,1,1), Type 9 (1,2,0), Type 10 (2,0,1), Type 11 (0,2,0), Type 12 (1,0,1), Type 13 (2,1,0), Type 14 (0,0,1), Type 15 (1,1,0), Type 16 (0,1,0), Type 17 (2,0,0), Type 18 (1,0,0), Type 19 (0,0,0). Type 17, type 18 and type 19 are the only ones that include protected nodes.]

Figure 7 illustrates the different situations for how a new key could be added to a ball (a tree) of type 2. All the other cases are similar, and we leave these cases as an exercise to the reader. From the different transitions for changing a tree of type i we get the matrix A for ternary search trees in Figure 8. The example in Figure 7 gives the second column of A.

[Figure 7: The two possibilities for adding a key to a node in a tree of type 2 of a ternary search tree. With probability 1/4, Type 2 is replaced by Type 1; with probability 3/4, Type 2 is replaced by Type 8 + Type 13.]

The tree of type 2 has activity 8. If it is drawn, and the new key is added to the node with only one key, which happens with probability $\frac{2}{8}$, then a tree of type 2 is replaced with a tree of type 1. If the new key is instead added to one of the nodes containing two keys, which happens with probability $\frac{6}{8}$, then the tree of type 2 is replaced by a tree of type 8 and one tree of type 13. Thus, the second column of the matrix A for the ternary search tree is given by
$$8 \cdot \bigl(\tfrac{2}{8}, -1, 0, 0, 0, 0, 0, \tfrac{6}{8}, 0, 0, 0, 0, \tfrac{6}{8}, 0, 0, 0, 0, 0, 0\bigr)'.$$
In this way we obtain A in Figure 8.
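The remaining columns, left as an exercise above, follow mechanically from the three kinds of transitions described in Section 1.1.3. The following sketch generates all 19 columns for $m = 3$; it is an illustration only, with the type list entered in the numbering of Figure 6 and with hypothetical helper names (`activity`, `transitions`, `build_A`).

```python
M = 3

# Types (k0, k1, k2) in the numbering of Figure 6.
TYPES = [
    (0, 0, 3), (0, 1, 2), (0, 2, 1), (1, 0, 2), (0, 3, 0), (0, 0, 2), (1, 1, 1),
    (0, 1, 1), (1, 2, 0), (2, 0, 1), (0, 2, 0), (1, 0, 1), (2, 1, 0), (0, 0, 1),
    (1, 1, 0), (0, 1, 0), (2, 0, 0), (1, 0, 0), (0, 0, 0),
]
INDEX = {t: i for i, t in enumerate(TYPES)}
# Small tree split off when a full leaf gets an internal child: type (m-1, 1, 0, ..., 0).
SPLIT_OFF = (M - 1, 1) + (0,) * (M - 2)


def activity(t):
    # A child of the root holding i keys contributes i + 1 gaps.
    return sum((i + 1) * k for i, k in enumerate(t))


def transitions(t):
    """Yield (number of gaps, resulting types) for a drawn tree of type t."""
    k = list(t)
    for i in range(M - 1):          # key lands in a child with i < m - 1 keys
        if k[i]:
            new = k.copy()
            new[i] -= 1
            new[i + 1] += 1
            yield (i + 1) * k[i], [tuple(new)]
    if k[M - 1]:                    # key lands below a full leaf: the tree splits
        new = k.copy()
        new[M - 1] -= 1
        yield M * k[M - 1], [tuple(new), SPLIT_OFF]


def build_A():
    q = len(TYPES)
    A = [[0] * q for _ in range(q)]
    for j, t in enumerate(TYPES):
        A[j][j] -= activity(t)              # the drawn tree is removed
        for gaps, results in transitions(t):
            for r in results:
                A[INDEX[r]][j] += gaps      # a_j * probability = number of gaps used
    return A


A = build_A()
print([row[1] for row in A])  # second column: 8 * (2/8, -1, 0, ..., 6/8, ..., 6/8, ...)
```

Since the activity times the probability of a transition is just the number of gaps leading to it, the matrix has integer entries and no fractions are needed at this stage.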

$$A = \left(\begin{array}{*{19}{r}}
-9 & 2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -8 & 4 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -7 & 0 & 6 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -7 & 0 & 0 & 2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -6 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
9 & 0 & 0 & 0 & 0 & -6 & 0 & 2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & -6 & 0 & 4 & 2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 6 & 0 & 0 & 0 & 0 & 0 & -5 & 0 & 0 & 4 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -5 & 0 & 0 & 0 & 2 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -5 & 0 & 0 & 2 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -4 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 6 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -4 & 0 & 0 & 2 & 0 & 0 & 0 & 0 \\
9 & 6 & 3 & 6 & 0 & 6 & 3 & 3 & 0 & 3 & 0 & 3 & -4 & 3 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 6 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -3 & 0 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -3 & 0 & 2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -2 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 0 & 0 & 0 & 0 & 0 & 0 & -2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 0 & 0 & 0 & 0 & 0
\end{array}\right)$$

Figure 8: The transition matrix A for the Pólya urn defined in Section 1.1.3 in the case of the ternary search tree.

The activities of the different types are given by the vector $a = (9, 8, 7, 7, 6, 6, 6, 5, 5, 5, 4, 4, 4, 3, 3, 2, 2, 1, 0)'$. These correspond to the number of gaps for the different types. The eigenvalues of the matrix A are 1, 0, −2, −3, −3, −4, −4, −4, −4, −5, −5, −5, −6, −6, −6, −7, −7, −8, −9. The eigenspace belonging to the eigenvalue −4 (which has algebraic multiplicity 4) has dimension 3. Since the dimension of the eigenspace belonging to the eigenvalue −4 is not equal to the algebraic multiplicity, the matrix A is not diagonalisable. (However, all other eigenspaces have full dimension.) Hence, we cannot apply Theorem 1.5. However, Theorem 1.4 can be applied since $a \cdot E(\xi_i) = 1$ for each i (this follows since we always add exactly one key when a tree of type i is chosen). From Theorem 1.4 we obtain that the vector $X_n = (X_{n,1}, \dots, X_{n,19})$, where $X_{n,i}$ is the number of balls of type i (in our case the number of trees that correspond to type i in our forest obtained from the ternary search tree), has asymptotically a multivariate normal distribution. Let $Z_n$ be the number of protected nodes in the ternary search tree with n keys.
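The obstruction to diagonalisability can be checked independently with exact Gaussian elimination: the dimension of an eigenspace is 19 minus the rank of $A - \lambda I$. The sketch below rebuilds A from the transition rules of Section 1.1.3 so that it is self-contained; it is an illustration in Python, not the Mathematica computation of the paper, and `rank` and `eigenspace_dim` are hypothetical helper names.

```python
from fractions import Fraction as F

# Types (k0, k1, k2) in the numbering of Figure 6.
TYPES = [
    (0, 0, 3), (0, 1, 2), (0, 2, 1), (1, 0, 2), (0, 3, 0), (0, 0, 2), (1, 1, 1),
    (0, 1, 1), (1, 2, 0), (2, 0, 1), (0, 2, 0), (1, 0, 1), (2, 1, 0), (0, 0, 1),
    (1, 1, 0), (0, 1, 0), (2, 0, 0), (1, 0, 0), (0, 0, 0),
]
INDEX = {t: i for i, t in enumerate(TYPES)}


def build_A():
    A = [[0] * 19 for _ in range(19)]
    for j, (k0, k1, k2) in enumerate(TYPES):
        A[j][j] -= k0 + 2 * k1 + 3 * k2                 # activity of the drawn tree
        if k0:                                          # external child of the root
            A[INDEX[(k0 - 1, k1 + 1, k2)]][j] += k0
        if k1:                                          # leaf holding one key
            A[INDEX[(k0, k1 - 1, k2 + 1)]][j] += 2 * k1
        if k2:                                          # full leaf: the tree splits
            A[INDEX[(k0, k1, k2 - 1)]][j] += 3 * k2
            A[INDEX[(2, 1, 0)]][j] += 3 * k2
    return A


def rank(matrix):
    """Rank by Gauss-Jordan elimination over the rationals."""
    rows = [[F(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def eigenspace_dim(A, lam):
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    return n - rank(shifted)


A = build_A()
print(eigenspace_dim(A, -4))  # 3, smaller than the algebraic multiplicity 4
```

Exact rational elimination is used instead of floating-point eigenvalue routines because a defective eigenvalue is numerically delicate, whereas a rank computation over the rationals is unambiguous.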
Since type 17, type 18 and type 19 each contain exactly one protected node, while the other types contain no protected nodes,
$$Z_n = X_{n,17} + X_{n,18} + X_{n,19}. \tag{3.1}$$
Thus, Theorem 1.4 implies that
$$n^{-1/2}(Z_n - n\mu_Z) \xrightarrow{d} N(0, \sigma_Z^2), \tag{3.2}$$
with parameters $\mu_Z = \mu_{17} + \mu_{18} + \mu_{19}$ and, writing $\Sigma = (\sigma_{i,j})_{i,j=1}^{19}$,
$$\sigma_Z^2 = \sum_{i=17}^{19} \sum_{j=17}^{19} \sigma_{i,j}. \tag{3.3}$$
Using the normalization in (1.2), we see that
$$v_1 = \tfrac{1}{2100}(1, 5, 9, 9, 6, 7, 36, 20, 42, 42, 15, 30, 126, 28, 48, 35, 42, 45, 84)' \tag{3.4}$$
and that $u_1 = (9, 8, 7, 7, 6, 6, 6, 5, 5, 5, 4, 4, 4, 3, 3, 2, 2, 1, 0)'$. (As in the binary case, $u_1 = a$ since $a \cdot E\, \xi_i = 1$ for each i, see [2, Lemma 5.4].) Since $\lambda_1 = 1$, Theorem 1.4 and (3.4) yield
$$\mu_Z = \mu_{17} + \mu_{18} + \mu_{19} = \frac{42}{2100} + \frac{45}{2100} + \frac{84}{2100} = \frac{57}{700}. \tag{3.5}$$
Thus, to show Theorem 1.1 it remains to calculate the sum in (3.3). Since we want to determine the matrix $\Sigma_I$ in (1.5) we need to determine the matrices $P_I$ and B. We have $P_I = I_{19} - v_1 u_1'$, which is a $19 \times 19$ matrix that is shown in (A.1) in the appendix. To calculate the matrix B in (1.4) we need to calculate $B_i = E(\xi_i \xi_i')$ in (1.3). We only describe how to get $B_2$ since the other cases are analogous. From Figure 7 (and the explanation of that figure above) it is easy to see that
$$B_2 = \tfrac{1}{4}\, b_1 b_1' + \tfrac{3}{4}\, b_2 b_2',$$
where $b_1 = (1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)'$ and $b_2 = (0, -1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0)'$. Note that $B_2$ is a $19 \times 19$ matrix. The matrix B is shown in (A.2) in the appendix. Now we can use Mathematica to evaluate the integral in (1.5), which yields $\Sigma_I$. Finally, $\Sigma = \Sigma_I$ by Theorem 1.4 with $m = 1$. This matrix is given last in the appendix. By (3.1) and (3.3), we only need the submatrix
$$\Sigma_p = \begin{pmatrix} \sigma_{17,17} & \sigma_{17,18} & \sigma_{17,19} \\ \sigma_{18,17} & \sigma_{18,18} & \sigma_{18,19} \\ \sigma_{19,17} & \sigma_{19,18} & \sigma_{19,19} \end{pmatrix} \tag{3.6}$$
of $\Sigma$, whose entries are obtained from the integral in (1.5). Summing the $\sigma_{i,j}$ in (3.6), which is equivalent to calculating $(1, 1, 1)\, \Sigma_p\, (1, 1, 1)'$, we find
$$\sigma_Z^2 = \sum_{i=17}^{19} \sum_{j=17}^{19} \sigma_{i,j} = \frac{1692302314867}{43692253605000},$$
which completes the proof of Theorem 1.1.

a) An external node which is not a child of a leaf. b) A leaf containing one key. c) A leaf containing two keys and its three external children. d) An internal node with two keys which is not a leaf.

Figure 9: The different types characterizing leaves and non-leaves in ternary search trees.

4 Leaves in ternary search trees

Recall that a leaf is an internal node without internal children, i.e., a node that contains at least one key and has no children except possibly external ones. The proof of Theorem. yields also the following theorem. (The corresponding result for a binary search tree was considered already by Devroye [5] using two different methods, one of them a Pólya urn as here.)

Theorem 4.1. Let $L_n$ be the number of leaves in a ternary search tree. Then,
\[ \frac{L_n - \frac{3}{10} n}{\sqrt{n}} \xrightarrow{d} N\Bigl(0, \frac{89}{2100}\Bigr). \]

First proof. Counting the number of leaves (of the original ternary search tree) in each type in Figure 6, we see that the number of leaves in a subtree of type $i$, $i = 1, \dots, 19$, is given by the vector
\[ l = (3, 3, 3, 2, 3, 2, 2, 2, 2, 1, 2, 1, 1, 1, 1, 1, 0, 0, 0). \tag{4.1} \]
Hence, $L_n = l \cdot X_n$. By the proof of Theorem., the vector $X_n$ has asymptotically a multivariate normal distribution, and it follows that
\[ n^{-1/2}(L_n - n\mu_L) \xrightarrow{d} N(0, \sigma_L^2) \tag{4.2} \]
with, using (.4) and (4.1),
\[ \mu_L = l \cdot v_1 = \frac{3}{10}, \tag{4.3} \]
and, using the covariance matrix $\Sigma$ shown in the appendix,
\[ \sigma_L^2 = l' \Sigma\, l = \frac{89}{2100}. \tag{4.4} \]

However, it is also possible to show Theorem 4.1 using a much simpler Pólya urn process, where we only need to consider four different types. We again chop up the ternary search tree into small trees, now using the following types of trees.

Type 1 is an external node which is not a child of a leaf.
Type 2 is a node containing one key.
Type 3 is a leaf containing two keys together with its three external children.
Type 4 is an internal node containing two keys which is not a leaf (i.e., it has less than three external children).

The types are shown in Figure 9. Note that all nodes in the ternary search tree belong to exactly one such subtree. A ball of type 1 has activity 1; when it is drawn it is replaced by one ball of type 2. A ball of type 2 has activity 2; when it is drawn it is replaced by one ball of type 3. A ball of type 3 has activity 3; when it is drawn it is replaced by one ball of type 2, two balls of type 1 and one ball of type 4. A ball of type 4 has activity 0 and is thus never drawn. The types that contain leaves are type 2 and type 3.

To simplify we can study another urn using the gaps as balls. Type 1 has one gap, type 2 has two gaps, type 3 has three gaps and type 4 has 0 gaps. We label each gap with the type it belongs to; thus the gaps have only three types. The gaps evolve as an urn with three types, with all activities 1 and the matrix $A$ in (.) given by
\[ A = \begin{pmatrix} -1 & 0 & 2 \\ 2 & -2 & 2 \\ 0 & 3 & -3 \end{pmatrix}. \tag{4.5} \]
Since we consider the gaps (with activity 1) it is obvious that all columns add to 1 (since we always add one ball to the urn). The eigenvalues of $A$ are $1, -3, -4$. Theorem.5 shows that $(X_{n,1}, X_{n,2}, X_{n,3})$ has asymptotically a multivariate normal distribution, where $X_{n,i}$ is the number of balls of type $i$ in the Pólya urn, i.e., the number of gaps of type $i$. Note that the number of subtrees of Types 1–3 thus is $(X_{n,1}, X_{n,2}/2, X_{n,3}/3)$, which thus also is asymptotically multivariate normal. Since the number of leaves is $L_n = X_{n,2}/2 + X_{n,3}/3$, it follows that $L_n$ has asymptotically a normal distribution (4.2). To find the parameters $\mu_L$ and $\sigma_L^2$, we note that right eigenvectors of $A$ corresponding to the eigenvalues $1, -3, -4$ are
\[ v_1 = \frac{1}{10}\begin{pmatrix} 3 \\ 4 \\ 3 \end{pmatrix}, \qquad v_2 = \frac{1}{2}\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \qquad v_3 = \frac{1}{5}\begin{pmatrix} 2 \\ 1 \\ -3 \end{pmatrix}, \tag{4.6} \]
and corresponding left eigenvectors of $A$ are
\[ u_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \qquad u_2 = \begin{pmatrix} 3 \\ -3 \\ 1 \end{pmatrix}, \qquad u_3 = \begin{pmatrix} -2 \\ 3 \\ -2 \end{pmatrix}. \tag{4.7} \]
Note that we have scaled the eigenvectors so that $u_i \cdot v_j = \delta_{ij}$ and (.2) holds. We have $a = (1, 1, 1)$. Since type 2 has two gaps and one leaf and type 3 has three gaps and one leaf, it follows that
\[ \mu_L = \frac{\mu_2}{2} + \frac{\mu_3}{3} = \frac{1}{10}\,(3, 4, 3) \cdot \Bigl(0, \frac12, \frac13\Bigr) = \frac{3}{10}, \]
corresponding to (4.3). By calculating $B$, we get from Theorem.5 that the covariance matrix $\Sigma$ is given by
\[ \Sigma = \begin{pmatrix} \frac{479}{2100} & -\frac{7}{150} & -\frac{127}{700} \\[2pt] -\frac{7}{150} & \frac{32}{75} & -\frac{19}{50} \\[2pt] -\frac{127}{700} & -\frac{19}{50} & \frac{393}{700} \end{pmatrix}. \tag{4.8} \]
We thus obtain
\[ \sigma_L^2 = \Bigl(0, \frac12, \frac13\Bigr)\, \Sigma\, \Bigl(0, \frac12, \frac13\Bigr)' = \frac{89}{2100} \tag{4.9} \]
(corresponding to (4.4)), which completes the proof of Theorem 4.1 with the simpler Pólya urn model.

5 Higher m

5.1 The Pólya urn defined in Section..

The Pólya urn defined in Section.. can be used for any given $m$, although the size of the matrices used in the calculations grows rapidly with $m$. (For $m = 4$ we have 69 types; for $m = 10$ we would have 184755.) However, the central condition $\operatorname{Re} \lambda < \lambda_1/2$ is not satisfied for large $m$. We do not know any general formula for the eigenvalues of the matrix $A$, but some of them are given as follows.

Lemma 5.1. Let $m \ge 2$. Then every root of the polynomial
\[ \varphi_m(\lambda) := \prod_{i=1}^{m-1} (\lambda + i) - m! \tag{5.1} \]
is an eigenvalue of the matrix $A$ for the Pólya urn in Section...

Proof. Let $M := \binom{2m}{m} - 1$ be the number of types, and let as above $X_n \in \mathbb{Z}^M_{\ge 0}$ be the composition of the Pólya urn described in Section... Furthermore, let $V_{i,n}$ be the number of nodes containing exactly $i$ keys (thus $V_{0,n}$ is the number of external nodes), and consider the vector $W_n = (W_{1,n}, \dots, W_{m-1,n})$ where $W_{i,n} = i V_{i-1,n}$; thus $W_{i,n}$ is the total number of gaps at nodes with $i$ gaps. The random vector $W_n$ can also be described by a Pólya urn, see e.g. [12, Example 7.8] and [16, Section 8..]; we denote the activity vector and the matrix (.) for this urn by $a_W = (1, \dots, 1)$ and $A_W$, where the $(m-1) \times (m-1)$ matrix $A_W$ has elements $a_{i,i} = -i$ for $i \in \{1, \dots, m-1\}$, $a_{i,i-1} = i$ for $i \in \{2, \dots, m-1\}$, $a_{1,m-1} = m$ and all other elements $a_{i,j} = 0$, i.e.,
\[ A_W = \begin{pmatrix}
-1 & 0 & 0 & \cdots & 0 & m \\
2 & -2 & 0 & \cdots & 0 & 0 \\
0 & 3 & -3 & \cdots & 0 & 0 \\
0 & 0 & 4 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & m-1 & -(m-1)
\end{pmatrix}. \tag{5.2} \]

As is well-known, the matrix $A_W$ has characteristic polynomial $\varphi_m(\lambda)$, see e.g. [12, Example 7.8] or [16, Section 8..]. Since the vector $X_n$ determines the number of nodes with different numbers of keys, there is a linear map $T : \mathbb{R}^M \to \mathbb{R}^{m-1}$ such that $W_n = T X_n$. $T$ is determined by the description of the types in Section.., and it is easily seen that $T$ is onto. Furthermore, starting the urns with an arbitrary (deterministic) non-zero vector $X_0 \in \mathbb{Z}^M_{\ge 0}$ and $W_0 = T X_0$, the urn dynamics yield
\[ \mathbb{E}(X_1 - X_0) = \frac{A X_0}{a \cdot X_0}, \tag{5.3} \]
\[ \mathbb{E}(W_1 - W_0) = \frac{A_W W_0}{a_W \cdot W_0}. \tag{5.4} \]
Consequently, since also $a \cdot X_0 = a_W \cdot W_0$,
\[ T A X_0 = (a \cdot X_0)\, T\, \mathbb{E}(X_1 - X_0) = (a_W \cdot W_0)\, \mathbb{E}(W_1 - W_0) = A_W W_0 = A_W T X_0, \]
and thus $T A = A_W T$. Suppose that $\lambda$ is a root of $\varphi_m(\lambda) = 0$. Then $\lambda$ is an eigenvalue of $A_W$ and thus there exists a left eigenvector $u$ with $u' A_W = \lambda u'$. Consequently,
\[ u' T A = u' A_W T = \lambda u' T, \tag{5.5} \]
so $u' T = (T' u)'$ is a left eigenvector of $A$. Since $T$ is onto, $T'$ is injective and thus $T' u \ne 0$. This shows that $\lambda$ is an eigenvalue of $A$ too.

The largest eigenvalue is $\lambda_1 = 1$ for the matrix $A$, since the total activity increases by 1 at each step, see [12, Lemma 5.4]. Let $\lambda_1, \lambda_2, \dots, \lambda_{m-1}$ be the roots of (5.1) in order of decreasing real parts. It is well-known that $\lambda_1 = 1$ and, moreover, that $\operatorname{Re} \lambda_2 < 1/2$ if and only if $m \le 26$, see [7] and [9]. Consequently, if $m \ge 27$, then Lemma 5.1 shows that $A$ has an eigenvalue $\lambda = \lambda_2 \ne \lambda_1$ with $\operatorname{Re} \lambda_2 > 1/2$, and then $X_n$ is not asymptotically normal. (See [2] for general results suggesting this, and [4] for a rigorous proof in the present case, showing that the total number of internal nodes is not asymptotically normal.) Furthermore, if $\alpha := \operatorname{Re} \lambda_2 > 1/2$, then $(X_n - \mathbb{E} X_n)/n^{\alpha}$ is stochastically bounded, but has no limit in distribution (the distribution oscillates), see [4, 2, 12].
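Both facts used here, that $A_W$ has characteristic polynomial $\varphi_m$ and that $\operatorname{Re}\lambda_2$ crosses $1/2$ between $m = 26$ and $m = 27$, can be explored numerically. The sketch below is illustrative only. It compares $\det(\lambda I - A_W)$ with $\varphi_m(\lambda)$ at integer points using exact arithmetic, and it locates $\lambda_2$ by Newton's method applied to $\sum_{j=1}^{m-1} \log(\lambda + j) = \log m! + 2\pi i$. The logarithmic reformulation, the starting point `0.5 + 2j` and the helper names are assumptions of this sketch, not taken from the text.

```python
import cmath
from fractions import Fraction
from math import factorial, lgamma, pi


def build_A_W(m):
    """Matrix (5.2) for m >= 3, as nested lists of Fractions."""
    d = m - 1
    A = [[Fraction(0)] * d for _ in range(d)]
    for i in range(1, d + 1):
        A[i - 1][i - 1] = Fraction(-i)        # a_{i,i} = -i
        if i >= 2:
            A[i - 1][i - 2] = Fraction(i)     # a_{i,i-1} = i
    A[0][d - 1] = Fraction(m)                 # a_{1,m-1} = m
    return A


def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    n, result = len(M), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            result = -result
        result *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return result


def char_poly_A_W(m, lam):
    """det(lam * I - A_W) evaluated at a rational point lam."""
    A = build_A_W(m)
    d = m - 1
    return det([[(lam if i == j else 0) - A[i][j] for j in range(d)]
                for i in range(d)])


def phi(m, lam):
    """phi_m(lam) from (5.1)."""
    prod = 1
    for i in range(1, m):
        prod *= lam + i
    return prod - factorial(m)


def lambda_2(m, start=0.5 + 2.0j, tol=1e-12, max_iter=100):
    """Root of phi_m in the upper half-plane next to lambda_1 = 1 (Newton)."""
    target = lgamma(m + 1) + 2j * pi          # log m! + 2*pi*i
    lam = start
    for _ in range(max_iter):
        F = sum(cmath.log(lam + j) for j in range(1, m)) - target
        dF = sum(1 / (lam + j) for j in range(1, m))
        step = F / dF
        lam -= step
        if abs(step) < tol:
            break
    return lam


for m in (26, 27):
    print(m, lambda_2(m).real)
```

Because both sides of the comparison are polynomials of degree $m - 1$, agreement at more than $m - 1$ points establishes the identity for that particular $m$.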
Some exceptional linear combinations of the variables $X_{n,i}$ are asymptotically normal also in such cases [2], but we conjecture that for any $m \ge 27$, the number of protected nodes is not one of these exceptional cases and that it has the same non-normal behaviour as just described for the number of internal nodes. On the other hand, if $m \le 26$, although $A$ has a much larger dimension than $A_W$, and thus presumably many more eigenvalues, we conjecture that all additional eigenvalues also have $\operatorname{Re} \lambda < 1/2$, so that Theorem.4 applies, showing that the number of protected vertices is asymptotically normal, with asymptotic variance linear in $n$, just as for $m = 2$ and $3$ in Theorems.2 and.. (This conjecture has been verified for $m \le 6$ by Heimbürger [10].)

5.2 One-protected nodes and leaves in m-ary search trees

As mentioned in Section, the number of one-protected nodes and the number of leaves (the complement of the one-protected nodes) are easier to analyze than the two-protected nodes, and we prove normal limit laws for all $m$-ary search trees where $m \le 26$. In these cases we can use a Pólya urn that is similar to the Pólya urn that has earlier been used to study the total number of internal nodes in an $m$-ary search tree, see e.g. Mahmoud [15] and [16, Section 8..] or [12, Example 7.8].

We can generalise the study of the number of leaves in a ternary search tree in Section 4 to arbitrary $m \ge 2$. (For $m = 2$, there are minor modifications in the formulas below; we leave these to the reader. As mentioned above, the case $m = 2$ was considered by Devroye [5].) We have in general $m + 1$ types, defined in analogy with Figure 9: Type 1 is as before, Type $i$ with $2 \le i \le m - 1$ is a leaf with $i - 1$ keys, Type $m$ is a leaf with $m - 1$ keys together with its $m$ external children, and Type $m + 1$ is an internal non-leaf. Let $\widetilde V_{i,n} = V_{i,n}$ be the number of nodes containing exactly $i$ keys for $i \in \{1, \dots, m - 2\}$; let $\widetilde V_{0,n}$ be the number of nodes containing 0 keys (external nodes) that are not children of leaves; let $\widetilde V_{m-1,n}$ be the number of nodes containing $m - 1$ keys that are leaves (i.e., they have only external children); finally, let $\widetilde V_{m,n}$ be the number of internal nodes that are not leaves (all containing $m - 1$ keys).

We consider again another, slightly simpler, urn with the balls representing the gaps, giving them types $1, \dots, m$, and consider the vector $\widetilde W_n = (\widetilde W_{1,n}, \dots, \widetilde W_{m,n})$ where $\widetilde W_{i,n} = i \widetilde V_{i-1,n}$ is the total number of gaps of type $i$. The random vector $\widetilde W_n$ can be described by a Pólya urn, with all activities 1. We denote the $m \times m$ matrix (.) for this urn by $A_L$. It is a minor modification of the matrix $A_W$ described in Section 5.1, see (5.2); the entries of $A_L$ are given by $a_{i,i} = -i$ for $i \in \{1, \dots, m\}$, $a_{i,i-1} = i$ for $i \in \{2, \dots, m\}$, $a_{1,m} = m - 1$, $a_{2,m} = 2$, and all other entries $a_{i,j} = 0$. I.e.,
\[ A_L = \begin{pmatrix}
-1 & 0 & 0 & \cdots & 0 & 0 & m-1 \\
2 & -2 & 0 & \cdots & 0 & 0 & 2 \\
0 & 3 & -3 & \cdots & 0 & 0 & 0 \\
0 & 0 & 4 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & m-1 & -(m-1) & 0 \\
0 & 0 & 0 & \cdots & 0 & m & -m
\end{pmatrix}. \tag{5.6} \]
We can easily calculate the characteristic polynomial of $A_L$ and find that it is
\[ \varphi^L_m(\lambda) = (m + \lambda)\,\varphi_m(\lambda), \tag{5.7} \]
where $\varphi_m(\lambda)$ is the characteristic polynomial of $A_W$ in (5.1). Thus, $A_L$ has the same eigenvalues as $A_W$, plus the additional eigenvalue $\lambda = -m$. Since $\varphi_m$ has only simple roots [4, Section.], and $-m$ is not one of them, also $\varphi^L_m$ has only simple roots. Hence, $A_L$ has $m$ distinct eigenvalues, and is thus diagonalisable. The largest eigenvalue of $A_L$ is $\lambda_1 = 1$ (as for $A_W$) and this eigenvalue corresponds to the right and left eigenvectors
\[ v_1 = \frac{1}{H_m - 1}\begin{pmatrix} \frac{m-1}{2(m+1)} \\ \frac13 \\ \frac14 \\ \vdots \\ \frac1m \\ \frac{1}{m+1} \end{pmatrix}, \qquad u_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \\ \vdots \\ 1 \\ 1 \end{pmatrix}, \tag{5.8} \]
where we have normalized so that (.2) holds ($H_m$ denotes the $m$th harmonic number). Let $L_n$ be the number of leaves in an $m$-ary search tree with $n$ keys. Then
\[ L_n = \sum_{i=1}^{m-1} \widetilde V_{i,n} = \sum_{k=2}^{m} \frac{1}{k}\, \widetilde W_{k,n}. \tag{5.9} \]

Theorem 5.2. Suppose that $m \le 26$. Let $L_n$ be the number of leaves in an $m$-ary search tree. Then,
\[ \frac{L_n - \mu_L n}{\sqrt{n}} \xrightarrow{d} N(0, \sigma_L^2), \tag{5.10} \]
where
\[ \mu_L = \frac{1}{H_m - 1} \sum_{k=2}^{m} \frac{1}{k(k+1)} = \frac{1}{H_m - 1} \cdot \frac{m-1}{2(m+1)}, \tag{5.11} \]
and $\sigma_L^2$ can be evaluated as
\[ \sigma_L^2 = \sum_{i,j=2}^{m} \frac{\sigma_{ij}}{ij}, \tag{5.12} \]
where $(\sigma_{ij})_{i,j=1}^{m}$ is given by (.6).

Proof. As said above, for $m \le 26$, $\operatorname{Re} \lambda < \lambda_1/2 = 1/2$ for all eigenvalues $\lambda \ne \lambda_1$ of $A_W$, and thus also of $A_L$. Furthermore, $A_L$ is diagonalisable. Hence, Theorem.5 applies and shows asymptotic normality of $\widetilde W_n$. The result follows by (5.9), using $v_1$ in (5.8).

Remark 5.3. Theorem 5.2 implies that $\mathbb{E}(L_n)/n \to \mu_L$, by the same argument as in Remark..

For $m \ge 27$, we expect the same non-normal asymptotic behaviour as for the number of internal nodes [4, 2], see Section 5.1.

For the one-protected nodes we can use the first Pólya urn described above for the leaves, with $m + 1$ types. For the leaves we could simplify by considering the gaps and use a Pólya urn with $m$ types, with all activities 1. However, now we also need to consider type $m + 1$, which has 0 gaps. So in the analysis of the one-protected nodes we use the urn with $m + 1$ different types (as explained in the beginning of this subsection) where types $i \in \{1, \dots, m\}$ have activities $1, 2, \dots, m$ and type $m + 1$ has activity 0. In this Pólya urn, the one-protected nodes correspond to type $m + 1$. All other types correspond to leaves or external nodes. Theorem.5 implies the following result (the proof is analogous to the proof of Theorem 5.2).

Theorem 5.4. Suppose that $m \le 26$. Let $Q_n$ be the number of one-protected nodes in an $m$-ary search tree. Then,
\[ \frac{Q_n - \mu_Q n}{\sqrt{n}} \xrightarrow{d} N(0, \sigma_Q^2), \tag{5.13} \]
where
\[ \mu_Q = \frac{1}{(H_m - 1)(m + 1)}, \tag{5.14} \]
and $\sigma_Q^2$ can be evaluated as
\[ \sigma_Q^2 = \sigma_{m+1,m+1}, \tag{5.15} \]
where $(\sigma_{ij})_{i,j=1}^{m+1}$ is given by (.6).
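The formulas in Theorems 5.2 and 5.4 can be checked exactly for small $m$. The sketch below builds $A_L$ from (5.6), verifies that the vector in (5.8) is a right eigenvector for the eigenvalue 1, and confirms the means (5.11) and (5.14). For $m = 3$ it also evaluates the covariance matrix of the gap urn. That last step assumes the spectral form $\Sigma = \sum_{j,k \ge 2} \frac{u_j' B u_k}{1 - \lambda_j - \lambda_k}\, v_j v_k'$ with $B = \sum_i v_{1,i}\, \xi_i \xi_i'$, valid for a diagonalisable matrix with $\lambda_1 = 1$ and all activities 1. The formula is not written out in this section and is an assumption of the sketch, as are the eigenvector normalisations and helper names.

```python
from fractions import Fraction


def harmonic(m):
    return sum(Fraction(1, k) for k in range(1, m + 1))


def build_A_L(m):
    """Matrix (5.6) for m >= 3, as nested lists of Fractions."""
    A = [[Fraction(0)] * m for _ in range(m)]
    for i in range(1, m + 1):
        A[i - 1][i - 1] = Fraction(-i)        # a_{i,i} = -i
        if i >= 2:
            A[i - 1][i - 2] = Fraction(i)     # a_{i,i-1} = i
    A[0][m - 1] = Fraction(m - 1)             # a_{1,m} = m - 1
    A[1][m - 1] = Fraction(2)                 # a_{2,m} = 2
    return A


def v1(m):
    """Right eigenvector (5.8) for the eigenvalue 1, entries summing to 1."""
    c = 1 / (harmonic(m) - 1)
    return [c * Fraction(m - 1, 2 * (m + 1))] + [c / (k + 1) for k in range(2, m + 1)]


def mu_L(m):   # (5.11)
    return Fraction(m - 1, 2 * (m + 1)) / (harmonic(m) - 1)


def mu_Q(m):   # (5.14)
    return 1 / ((m + 1) * (harmonic(m) - 1))


def mat_vec(A, x):
    return [sum(p * q for p, q in zip(row, x)) for row in A]


def quad(c, S, d):
    """Bilinear form c' S d."""
    return sum(c[r] * S[r][s] * d[s] for r in range(len(c)) for s in range(len(d)))


def gap_covariance_m3():
    """Covariance of the three-type gap urn (m = 3), assumed spectral formula."""
    F = Fraction
    lam = [F(1), F(-3), F(-4)]
    U = [[F(1), F(1), F(1)], [F(3), F(-3), F(1)], [F(-2), F(3), F(-2)]]   # left
    V = [[F(3, 10), F(4, 10), F(3, 10)],
         [F(1, 2), F(0), F(-1, 2)],
         [F(2, 5), F(1, 5), F(-3, 5)]]                                    # right
    A = build_A_L(3)
    # All activities are 1 and replacements are deterministic,
    # so xi_i is simply column i of A.
    xi = [[A[r][i] for r in range(3)] for i in range(3)]
    B = [[sum(V[0][i] * xi[i][r] * xi[i][s] for i in range(3)) for s in range(3)]
         for r in range(3)]
    Sigma = [[F(0)] * 3 for _ in range(3)]
    for j in (1, 2):
        for k in (1, 2):
            weight = quad(U[j], B, U[k]) / (1 - lam[j] - lam[k])
            for r in range(3):
                for s in range(3):
                    Sigma[r][s] += weight * V[j][r] * V[k][s]
    return Sigma


Sigma3 = gap_covariance_m3()
leaf_weights = [Fraction(0), Fraction(1, 2), Fraction(1, 3)]   # L_n = X_2/2 + X_3/3
sigma_L2 = quad(leaf_weights, Sigma3, leaf_weights)
print(mu_L(3), mu_Q(3), sigma_L2)
```

For $m = 3$ the two means coincide, and their sum $\mu_L + \mu_Q = 1/(2(H_m - 1))$ is the density of internal nodes for every $m$.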

This urn can obviously also be used to study the number of leaves (the types $2 \le i \le m$ correspond to the leaves), giving another proof of Theorem 5.2. (Note that $\sigma_{ij}$ refers to different urns and thus has different meanings in Theorems 5.2 and 5.4.) Moreover, we can study $L_n$ and $Q_n$ together and obtain joint asymptotic normality for $m \le 26$; the covariance $\sigma_{LQ}$ of the limit variables in (5.10) and (5.13) equals $\sum_{i=2}^{m} \sigma_{i,m+1}$ with $(\sigma_{ij})_{i,j=1}^{m+1}$ as in Theorem 5.4. In particular, this implies the well-known asymptotic normality of the total number of internal nodes $I_n = L_n + Q_n$, see e.g. [7, 4,, 4, 5, 9, 6].

Example 5.5. For a binary search tree ($m = 2$), a straightforward calculation of the covariance matrix $\Sigma = (\sigma_{ij})_{i,j=1}^{3}$ in Theorem 5.4 yields
\[ \Sigma = \begin{pmatrix} \frac{8}{45} & -\frac{4}{45} & \frac{4}{45} \\[2pt] -\frac{4}{45} & \frac{2}{45} & -\frac{2}{45} \\[2pt] \frac{4}{45} & -\frac{2}{45} & \frac{2}{45} \end{pmatrix}. \tag{5.16} \]
Hence
\[ \sigma_{L,2}^2 = (0, 1, 0)\, \Sigma\, (0, 1, 0)' = \sigma_{22} = \frac{2}{45}, \tag{5.17} \]
as shown by Devroye [5]. Similarly, $\sigma_{Q,2}^2 = \sigma_{33} = \frac{2}{45}$ and $\sigma_{LQ,2} = \sigma_{23} = -\frac{2}{45}$. (We have $\sigma_{L,2}^2 = \sigma_{Q,2}^2 = -\sigma_{LQ,2}$ since the total number of internal nodes $L_n + Q_n = I_n = n$ is deterministic when $m = 2$.)

Example 5.6. For a ternary search tree ($m = 3$), similarly (cf. (4.8) for the corresponding urn using the gaps as in Theorem 5.2)
\[ \Sigma = \begin{pmatrix}
\frac{479}{2100} & -\frac{7}{300} & -\frac{127}{2100} & \frac{101}{1400} \\[2pt]
-\frac{7}{300} & \frac{8}{75} & -\frac{19}{300} & \frac{1}{100} \\[2pt]
-\frac{127}{2100} & -\frac{19}{300} & \frac{131}{2100} & -\frac{43}{1400} \\[2pt]
\frac{101}{1400} & \frac{1}{100} & -\frac{43}{1400} & \frac{9}{350}
\end{pmatrix}. \tag{5.18} \]
Hence, cf. (4.4) and (4.9),
\[ \sigma_{L,3}^2 = (0, 1, 1, 0)\, \Sigma\, (0, 1, 1, 0)' = \frac{89}{2100}, \tag{5.19} \]
\[ \sigma_{Q,3}^2 = (0, 0, 0, 1)\, \Sigma\, (0, 0, 0, 1)' = \frac{9}{350}, \tag{5.20} \]
\[ \sigma_{LQ,3} = (0, 1, 1, 0)\, \Sigma\, (0, 0, 0, 1)' = -\frac{29}{1400}. \tag{5.21} \]
We also obtain the corresponding asymptotic variance $(0, 1, 1, 1)\, \Sigma\, (0, 1, 1, 1)' = \sigma_{L,3}^2 + \sigma_{Q,3}^2 + 2\sigma_{LQ,3} = \frac{2}{75}$ for the number of internal nodes $L_n + Q_n$, as found by Mahmoud and Pittel [17].

Acknowledgements: We would like to thank Hosam M. Mahmoud and Mark D. Ward for valuable discussions. We would also like to thank Johan Björklund for help with drawing of figures.

References

[1] Bóna M., k-protected nodes in binary search trees. Adv. Appl. Math. 53 (2014), 1–11.
[2] Chauvin B. and Pouyanne N., m-ary search trees when m ≥ 27: a strong asymptotics for the space requirement. Random Structures Algorithms 24 (2004), 133–154.
[3] Cheon G.S. and Shapiro L., Protected points in ordered trees. Appl. Math. Lett. 21 (2008), no. 5, 516–520.
[4] Chern H.-H. and Hwang H.-K., Phase changes in random m-ary search trees and generalized quicksort. Random Structures Algorithms 19 (2001), no. 3-4, 316–358.
[5] Devroye L., Limit laws for local counters in random binary search trees. Random Structures Algorithms 2 (1991), no. 3, 303–315.
[6] Devroye L. and Janson S., Protected nodes and fringe subtrees in some random trees. Electronic Communications in Probability 19 (2014), no. 6, 1–10.
[7] Drmota M., Random Trees, Springer, Vienna, 2009.
[8] Du R. and Prodinger H., Notes on protected nodes in digital search trees. Appl. Math. Lett. 25 (2012), no. 6, 1025–1028.
[9] Fill J.A. and Kapur N., Transfer theorems and asymptotic distributional results for m-ary search trees. Random Structures Algorithms 26 (2005), no. 4, 359–391.
[10] Heimbürger A., Asymptotic distribution of two-protected nodes in m-ary search trees. Master thesis, Stockholm University and KTH, 2014.
[11] Holmgren C. and Janson S., Limit laws for functions of fringe trees for binary search trees and recursive trees. Preprint, 2014. arXiv:1406.6883
[12] Janson S., Functional limit theorems for multitype branching processes and generalized Pólya urns. Stoch. Process. Appl. 110 (2004), 177–245.
[13] Lew W. and Mahmoud H.M., The joint distribution of elastic buckets in multiway search trees. SIAM J. Comput. 23 (1994), no. 5, 1050–1074.
[14] Mahmoud H.M., Evolution of Random Search Trees. John Wiley & Sons, New York, 1992.
[15] Mahmoud H.M., The size of random bucket trees via urn models. Acta Inform. 38 (2002), no. 11-12, 813–838.
[16] Mahmoud H.M., Pólya Urn Models. CRC Press, Boca Raton, FL, 2009.
[17] Mahmoud H.M. and Pittel B., Analysis of the space of search trees under the random insertion algorithm. J. Algorithms 10 (1989), no. 1, 52–75.
[18] Mahmoud H.M. and Ward M.D., Asymptotic distribution of two-protected nodes in random binary search trees. Appl. Math. Lett. 25 (2012), no. 12, 2218–2222.
[19] Mansour T., Protected points in k-ary trees. Appl. Math. Lett. 24 (2011), no. 4, 478–480.

A Appendix

(A.1): the $19 \times 19$ matrix $P_I = I_{19} - v_1 u_1'$.

(A.2): the $19 \times 19$ matrix $B$.

The $19 \times 19$ covariance matrix $\Sigma = (\sigma_{i,j})_{i,j=1}^{19}$.