HOMOGENEOUS CUT-AND-PASTE PROCESSES


HARRY CRANE

Abstract. We characterize the class of exchangeable Feller processes on the space of partitions with a bounded number of blocks. This characterization leads to the cut-and-paste representation, which decomposes the jump measure of these processes into a discrete component (cut-and-paste) and a continuous component (relocation), and is akin to the Lévy-Itô decomposition for Lévy processes. The cut-and-paste representation classifies the discontinuities of these processes as either (1) cut-and-paste transitions, during which all indices choose a new block simultaneously, or (2) relocations, during which a single index changes its block membership while the rest stay fixed. In discrete time, the cut-and-paste representation uniquely describes the evolution of exchangeable Feller chains by a product of independent and identically distributed random stochastic matrices.

Date: May 10.
Mathematics Subject Classification. Primary 60J25; secondary 60G09, 60J35.
Key words and phrases. exchangeable random partition; de Finetti's theorem; Lévy-Itô decomposition; paintbox process; coalescent process; interacting particle system; Feller process; random matrix product.

1. Introduction

We study exchangeable Markov chains and Markov processes on $\mathcal{P}_{[\infty]:k}$, the subspace of partitions of $\mathbb{N}$ with at most $k < \infty$ blocks. In contrast to fragmentation and coagulation processes, which evolve on the entire space of partitions of $\mathbb{N}$, processes on $\mathcal{P}_{[\infty]:k}$ have not been widely studied in the probability literature. However, such processes arise naturally in DNA sequencing applications, as we elaborate shortly. Moreover, it has been shown (Crane [8, 9]) that the dynamics of processes on $\mathcal{P}_{[\infty]:k}$ differ from the fragmentation-coalescence processes studied, for example, by Bertoin [4], Kingman [15], Pitman [16] and Berestycki [2]. Most notably, processes on $\mathcal{P}_{[\infty]:k}$ can admit jumps involving (informally) simultaneous fragmentation and coagulation events. We call such events cut-and-paste events, and we show that all of the jumps of a discrete-time exchangeable Feller chain on $\mathcal{P}_{[\infty]:k}$ are of cut-and-paste type. In continuous time, a Feller process can experience infinitely many jumps in arbitrarily small time intervals; in this case, the jumps are characterized by cut-and-paste events (discrete component) and relocation events (continuous component), in which a single index changes its block membership while the rest of the partition remains unchanged.

Since the number of blocks is bounded, it is convenient to work on the space $[k]^{\mathbb{N}}$ of $k$-colorings of $\mathbb{N}$, where we regard $[k] := \{1, \ldots, k\}$ as a finite set of distinct colors. Of course, each $x = x_1 x_2 \cdots \in [k]^{\mathbb{N}}$ maps to a unique partition of $\mathbb{N}$ through the equivalence relation $i \sim j \Leftrightarrow x_i = x_j$, and any such partition has at most $k$ blocks. Along the way to characterizing exchangeable Feller processes on $\mathcal{P}_{[\infty]:k}$, we first characterize exchangeable Feller processes on $[k]^{\mathbb{N}}$. Ultimately, we obtain a Lévy-Itô type

representation of the characteristic measure $\chi$ governing the jumps of an infinitely exchangeable $\mathcal{P}_{[\infty]:k}$-valued process, written

(1)    $\chi = c\rho + \mu_\Sigma$,

where $c \ge 0$ is a unique constant, $\rho$ is the relocation measure and $\mu_\Sigma$ is the cut-and-paste measure, determined by some unique measure $\Sigma$ on the space of $k \times k$ stochastic matrices. Further explanation of the components of (1) is given in due course but, for now, we point out that (1) resembles analogous representations of the characteristic measure $\mu$ of exchangeable coagulation processes, for which

(2)    $\mu = c\kappa + \varrho_\nu$,

and homogeneous fragmentation processes, which have

(3)    $\mu = c\epsilon + \varrho_\nu$;

see Bertoin [5]. In both (2) and (3), $\varrho_\nu$ is Kingman's paintbox measure [14], $c \ge 0$ is a unique constant and $\kappa$ (respectively $\epsilon$) is the Kingman measure (resp. erosion measure), which represents the continuous component of $\mu$. In each of (1), (2) and (3), the first term on the right-hand side corresponds to the continuous component of these processes, in the sense that the corresponding behavior of the projection into the simplex is continuous. In contrast to the measures $\mu$ in (2) and (3), which are exchangeable measures on $\mathcal{P}$ (partitions of $\mathbb{N}$), $\chi$ in (1) is a measure on the space $\mathcal{M}_{[\infty]:k}$ of paintbox arrays, which we define in section 3.4. We point out that our representation (1) is neither a refinement nor a special case of (2) and (3): in a strict sense, processes on $\mathcal{P}_{[\infty]:k}$ behave differently than those on $\mathcal{P}$. That said, a certain subclass of exchangeable coalescents can be phrased in our terms and, therefore, can be represented by (1). On the contrary, homogeneous fragmentations are not covered under any circumstances. In section 7.1, we discuss these relationships to coagulation and fragmentation processes in more detail.

In addition to the decomposition of the characteristic measure of exchangeable partition processes, the cut-and-paste measure $\mu_\Sigma$ connects the cut-and-paste process to products of independent and identically distributed random matrices; see, e.g., Bougerol and Lacroix [6]. In particular, a discrete-time exchangeable Feller chain admits a canonical construction by repeated iteration of i.i.d. random matrices. This representation has been used by Crane and Lalley [10] to study the convergence to equilibrium of exchangeable Feller chains on $[k]^{\mathbb{N}}$ and $\mathcal{P}_{[\infty]:k}$. The key ingredients to the proofs in [10] are the discrete-time cut-and-paste representation (Theorems 2.1 and 2.3) and the classical Furstenberg-Kesten theorem from random matrix theory [12].

1.1. Applications to DNA sequencing. The theory of exchangeable partitions and partition-valued processes is largely motivated by applications in population genetics, e.g. [13]. Similarly, though in a different context, the processes we consider (on partitions with a bounded number of blocks) arise naturally in genetics. For example, given a sample $u_1, \ldots, u_n$ of individuals, let $X_1 \cdots X_n \in \{A, C, G, T\}^n$ be a string of DNA nucleotides at a particular chromosomal site, for which $X_i$ denotes the nucleotide of individual $i = 1, \ldots, n$. Furthermore, if we observe a DNA sequence $(X^i_m, m \ge 0)$ for each individual $i = 1, \ldots, n$, then we obtain a sequence $(X_m, m \ge 0)$ in $\{A, C, G, T\}^{[n]}$. By forgetting labels (in this case nucleotides), we obtain a sequence of set partitions; see Table 1. Assuming the indices $m \ge 0$ correspond to adjacent sites along a chromosome, the induced sequence $(X_m, m \ge 0)$ corresponds to a contiguous location along the chromosome, and we expect that observations at nearby sites are dependent.

individuals / sites
u1:  A  A  T  C  C  G  A
u2:  A  T  T  C  G  G  A
u3:  T  T  T  G  G  C  T

Table 1. An array of DNA sequences for individuals $u_1, u_2, u_3$ (rows) at seven sites (columns). From this array, we obtain a sequence in $\{A, C, G, T\}^{[3]}$: (AAT, ATT, TTT, CCG, CGG, GGC, AAT, ...), which can be easily translated into $[k]^{[3]}$ for $k = 4$. If we ignore labels and consider only the induced partition sequence, we obtain the sequence (12|3, 1|23, 123, 12|3, 1|23, 12|3, 12|3, ...) of partitions of the set {1, 2, 3}.

Therefore, it is natural to model the observed sequence $(X_m, m \ge 0)$ as a sample from an exchangeable Markov chain on the infinite space $[k]^{\mathbb{N}}$ or $\mathcal{P}_{[\infty]:k}$, which reflects the assumption of sampling from a large (possibly infinite) population. The need for an exchangeable model is clear, as the labels assigned to individuals should not affect inferences in any substantive way. Furthermore, the Feller property is desirable as it allows us to model the observed sequence (a sequence of finite colorings or partitions) as a Markov chain; in general, the restriction of a Markov chain to $[k]^{[n]}$ (or $\mathcal{P}_{[n]:k}$) need not preserve the Markov property. This paper is devoted to understanding how exchangeability and sampling consistency restrict the evolution of such processes and, hence, the class of corresponding models.
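For illustration, the passage from the columns of Table 1 to the induced partition sequence can be made concrete with a short Python sketch. The function name and the block notation 12|3 are our own conventions, not from the text; the partition is obtained from the relation that positions $i$ and $j$ are equivalent when their nucleotides agree.

    def partition_of(column):
        # Map a string such as 'AAT' to the set partition of {1, ..., n}
        # induced by i ~ j iff column[i-1] == column[j-1]; blocks are
        # listed by least element, so 'AAT' yields [[1, 2], [3]], i.e. 12|3.
        blocks = {}
        for position, symbol in enumerate(column, start=1):
            blocks.setdefault(symbol, []).append(position)
        return sorted(blocks.values())

    sites = ["AAT", "ATT", "TTT", "CCG", "CGG", "GGC", "AAT"]
    for s in sites:
        print(s, "->", "|".join("".join(str(i) for i in b) for b in partition_of(s)))
    # AAT -> 12|3, ATT -> 1|23, TTT -> 123, CCG -> 12|3, ...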

2. Summary of main theorems and organization of the paper

We begin with a description of a transition procedure on the space $\{0, 1\}^{\mathbb{N}}$ of infinite binary sequences. These examples capture the general characterization of exchangeable Feller processes on $[k]^{\mathbb{N}}$ and $\mathcal{P}_{[\infty]:k}$, and may be helpful to keep in mind when considering the general case.

Suppose we have two coins with success probabilities $p_0$ and $p_1$, respectively, and suppose $x = x_1 x_2 \cdots \in \{0, 1\}^{\mathbb{N}}$ is some configuration of 0s and 1s. We can randomly generate a new element $X' \in \{0, 1\}^{\mathbb{N}}$ as follows. Independently of $x$ and of each other, let $Y^0 := Y^0_1 Y^0_2 \cdots$ be an i.i.d. sequence of Bernoulli($p_0$) random variables (i.e. $P\{Y^0_1 = 1\} = p_0$) and $Y^1 := Y^1_1 Y^1_2 \cdots$ be an i.i.d. sequence of Bernoulli($p_1$) random variables (i.e. $P\{Y^1_1 = 1\} = p_1$). Given $x$, $Y^0$ and $Y^1$, we define $X'$ coordinate-by-coordinate by putting $X'_i := Y^{x_i}_i$ for each $i \in \mathbb{N}$. We summarize this procedure in Table 2.

Table 2 (array entries omitted in this transcription). Illustration of the described scheme: $x$ denotes the current (fixed) state, $Y^0$ and $Y^1$ are i.i.d. random sequences of Bernoulli random variables with success probabilities $p_0$ and $p_1$, respectively, and $X'$ is obtained by putting $X'_i$ equal to $Y^j_i$ on the event $x_i = j$. The middle two rows, labeled $Y^0$ and $Y^1$, determine a paintbox array $M := (Y^0, Y^1)$ (as defined in section 3.4), and the corresponding operation $x \mapsto X'$ illustrates the action of $M$ on $[k]^{\mathbb{N}}$, i.e. $X' = M(x)$.

The above prescription is tantamount to independently flipping a coin for each $i \in \mathbb{N}$: if $x_i = 0$, we flip the $p_0$-coin; otherwise, if $x_i = 1$, we flip the $p_1$-coin. If a head is flipped, we put a 1 at the corresponding coordinate of $X'$; if a tail is flipped, we put a 0. To generalize this procedure, we can consider first drawing a random pair of coins with success probabilities $(p_0, p_1)$ from some joint distribution on $[0, 1] \times [0, 1]$ and, conditional on this outcome, performing the above procedure.

Suppose the limiting frequencies

$f_0(x) := \lim_{n\to\infty} n^{-1} \sum_{i=1}^n \mathbf{1}\{x_i = 0\}$  and  $f_1(x) := \lim_{n\to\infty} n^{-1} \sum_{i=1}^n \mathbf{1}\{x_i = 1\}$

exist. Then, by the law of large numbers, the limiting frequencies $f_0(X')$ and $f_1(X')$ exist almost surely and satisfy

$\begin{pmatrix} p_1 & p_0 \\ 1 - p_1 & 1 - p_0 \end{pmatrix} \begin{pmatrix} f_1(x) \\ f_0(x) \end{pmatrix} = \begin{pmatrix} f_1(X') \\ f_0(X') \end{pmatrix}.$

The left action of the above $2 \times 2$ matrix on the 2-simplex is a key observation of our theory. Note that this matrix is column stochastic (each column sums to one) and it leaves the simplex invariant under left multiplication.

In continuous time, we can consider a positive constant $r_n > 0$ and, given $x \in [k]^{[n]}$, let $T$ be an exponential holding time with rate $r_n$ and $T'$ an exponential holding time with rate $n$. If $T \le T'$, we obtain $X'$ by the above coin-tossing procedure for some $(p_0, p_1)$; otherwise, if $T' < T$, we obtain $X'$ by choosing an index $i \in [n]$ uniformly at random, changing the label of $x_i$ (from either 0 to 1 or 1 to 0), and leaving all other coordinates unchanged; we call this a relocation event. This procedure can be carried out for any $n \in \mathbb{N}$, in which case we observe that the holding time in state $x$ shrinks to 0 as $n \to \infty$. Indeed, we can also allow $r_n \to \infty$ and maintain a consistent system of exchangeable Markov processes on $[k]^{\mathbb{N}}$. Consequently, the limit process experiences infinitely many jumps in any arbitrarily small time interval.

Both of the above formulations have been specified in the case $k = 2$. In our main theorems, we generalize to any $k \in \mathbb{N}$ and show that any exchangeable Feller process on $[k]^{\mathbb{N}}$ (and, by corollary, $\mathcal{P}_{[\infty]:k}$) is a mixture of an analogous procedure in either discrete or continuous time.
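The two-coin procedure is straightforward to simulate on a finite restriction. The following Python sketch (the truncation level $n$ and the particular values of $p_0$, $p_1$ are illustrative assumptions) performs one transition and checks the first row of the displayed matrix identity, $f_1(X') \approx p_1 f_1(x) + p_0 f_0(x)$, up to sampling error.

    import random

    rng = random.Random(2014)

    def coin_step(x, p0, p1):
        # One cut-and-paste transition on {0,1}^[n]: independently for each
        # coordinate, flip the p0-coin if x_i = 0 and the p1-coin if x_i = 1,
        # recording 1 on heads and 0 on tails.
        return [1 if rng.random() < (p1 if xi == 1 else p0) else 0 for xi in x]

    n, p0, p1 = 200_000, 0.2, 0.7           # truncation level and coin biases
    x = [1 if rng.random() < 0.4 else 0 for _ in range(n)]
    y = coin_step(x, p0, p1)

    f1 = lambda z: sum(z) / len(z)          # empirical frequency of label 1
    print(f1(y), p1 * f1(x) + p0 * (1 - f1(x)))   # f1(X') vs p1 f1(x) + p0 f0(x)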

2.1. Notation. Before summarizing our main theorems, some notation is needed. In this section, in favor of developing intuition, we often state notation and terminology with only a heuristic definition. Precise definitions are delayed to section 3.

We break our summary into two parts, corresponding to discrete- and continuous-time processes. The key difference between the two is that the behavior of discrete-time chains is determined by a finite measure on $\mathcal{S}_k$ ($k \times k$ column-stochastic matrices), whereas the continuous-time processes are determined by a (possibly infinite) measure on $\mathcal{M}_{[\infty]:k}$ (paintbox arrays), which is in turn determined by a (possibly infinite) measure $\Sigma$ on $\mathcal{S}_k$, finite constants $c_{ij} \ge 0$, $i \ne j$, and the relocation measures $\rho_{ij}$, $1 \le i \ne j \le k$, which we explain shortly.

The space $\mathcal{M}_{[\infty]:k}$ of paintbox arrays was mentioned in section 1, and an example of $M \in \mathcal{M}_{[\infty]:k}$ and its induced action on $[k]^{\mathbb{N}}$ is given in Table 2 for the case $k = 2$. Somewhat more formally, let $M := (M^j_i, 1 \le i \le k, j \ge 1)$ satisfy $(M^j_i, j \ge 1) \in [k]^{\mathbb{N}}$ for every $i = 1, \ldots, k$. Then $M$ is a paintbox array. (We may also write $M^j_i = M(i, j)$ when convenient.) For $x \in [k]^{\mathbb{N}}$, the image $x' := M(x)$ is obtained by putting $x'_j = M^j_i$ for each $j \in \mathbb{N}$, where $i = x_j$. For example, if $x_n = i$, then the label of $n$ in $x' := M(x)$ is the label of $n$ in $M_i$, i.e. $x'_n = M^n_i = M(x_n, n)$.

A random paintbox array is called exchangeable if, regarded as a $k$-tuple of labeled partitions, its distribution is invariant under independent action by the symmetric group on each of its elements. That is, any $M \in \mathcal{M}_{[\infty]:k}$ determines a $k$-tuple $(M_1, \ldots, M_k) \in [k]^{\mathbb{N}} \times \cdots \times [k]^{\mathbb{N}}$ ($k$ times). A random element $M \in \mathcal{M}_{[\infty]:k}$ is exchangeable if, for any $k$-tuple $(\sigma_1, \ldots, \sigma_k)$ of finite permutations of $\mathbb{N}$,

(4)    $(M_1, \ldots, M_k) =_{\mathcal{L}} (M^{\sigma_1}_1, \ldots, M^{\sigma_k}_k)$,

where $M^{\sigma_i}_i$ is the image of $M_i$ by $\sigma_i$, defined in (13), and $=_{\mathcal{L}}$ denotes equality in law. By immediate corollary to de Finetti's theorem and Kingman's paintbox representation, the law of any exchangeable random paintbox array is determined by a probability measure $\Sigma$ on $\mathcal{S}_k$. We call the measure $\Sigma$ induces on $\mathcal{M}_{[\infty]:k}$ a cut-and-paste measure and denote it by $\mu_\Sigma$.

2.2. Discrete-time cut-and-paste chains: random matrix products. Throughout this section, we deal exclusively with discrete-time Markov chains, from which the analog for continuous-time Markov chains is easily deduced by a random scaling of time. Let $\chi$ be a probability measure on $\mathcal{M}_{[\infty]:k}$ and let $X_0 \in [k]^{\mathbb{N}}$ be an exchangeable initial state. We define $X^* := (X^*_m, m \ge 0)$ by $X^*_0 = X_0$ and, for $m \ge 1$,

(5)    $X^*_m = (M_m \circ M_{m-1} \circ \cdots \circ M_1)(X_0)$,

where $M_1, M_2, \ldots$ is an i.i.d. sequence of paintbox arrays governed by $\chi$ and independent of $X_0$. We call $\chi$ the directing measure of $X^*$.

Theorem 2.1. Let $X = (X_m, m \ge 0)$ be an exchangeable Feller chain on $[k]^{\mathbb{N}}$. Then there exists a unique probability measure $\Sigma$ on $\mathcal{S}_k$ such that $X =_{\mathcal{L}} X^*$, where $X^*$ is the Markov chain in (5) with directing measure $\mu_\Sigma$.

Remark 2.2. We call $\Sigma$ the cut-and-paste measure of $X$, $\mu_\Sigma$ the characteristic measure of $X$, and $X$ a cut-and-paste chain. We define $\mu_\Sigma$ in (15).

Any exchangeable sequence $X \in [k]^{\mathbb{N}}$ projects to a unique vector $|X|$ in the $(k-1)$-dimensional simplex $\Delta_k$, where $|X| := (|X^{-1}(i)|, 1 \le i \le k)^T$ is the vector of limiting frequencies of each class $i = 1, \ldots, k$. (Here, we write $X^{-1}(i) := \{n \in \mathbb{N} : X_n = i\}$.) For any Markov chain $X$, we write $|X| := (|X_m|, m \ge 0)$ to denote its induced chain on $\Delta_k$, if it exists. For any measure $\Sigma$ on $\mathcal{S}_k$ and $\Phi_0 \in \Delta_k$, we construct a Markov chain $\Phi := (\Phi_m, m \ge 0)$ on $\Delta_k$ by letting $\Phi_0$ be its initial state and

(6)    $\Phi_m := S_m S_{m-1} \cdots S_1 \Phi_0$,  $m \ge 1$,

where $S_1, S_2, \ldots$ are i.i.d. stochastic matrices with law $\Sigma$ and independent of $\Phi_0$. (The operation in (6) is the usual left multiplication of real-valued matrices.)
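The induced chain (6) on the simplex can be sketched directly. In the following Python sketch, $\Sigma$ is taken, as an illustrative assumption only, to sample each column of $S$ uniformly from the simplex; any sampler of column-stochastic matrices could be substituted.

    import random

    rng = random.Random(1)

    def random_column_stochastic(k):
        # Draw a k x k column-stochastic matrix: each column is uniform on
        # the simplex (normalized exponentials). This stands in for a draw
        # S ~ Sigma and is an illustrative choice only.
        cols = []
        for _ in range(k):
            w = [rng.expovariate(1.0) for _ in range(k)]
            total = sum(w)
            cols.append([wi / total for wi in w])
        return [[cols[j][i] for j in range(k)] for i in range(k)]  # S[i][j]

    def left_multiply(S, phi):
        # (6): Phi_m = S_m Phi_{m-1}; column-stochasticity keeps phi in the simplex.
        k = len(phi)
        return [sum(S[i][j] * phi[j] for j in range(k)) for i in range(k)]

    k, phi = 3, [1.0, 0.0, 0.0]
    for _ in range(50):
        phi = left_multiply(random_column_stochastic(k), phi)
    print(phi, sum(phi))                 # a point of the simplex; sums to 1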

Theorem 2.3. Let $X$ be an exchangeable Feller chain on $[k]^{\mathbb{N}}$. Then $|X|$ exists almost surely and is a Markov chain on $\Delta_k$. In particular, $|X| =_{\mathcal{L}} \Phi$ for $\Phi$ constructed in (6) from directing measure $\Sigma$, the cut-and-paste measure of $X$ from Theorem 2.1.

We now consider the projection of $X$ into $\mathcal{P}_{[\infty]:k}$ by ignoring labels. Formally, we write $B_\infty : [k]^{\mathbb{N}} \to \mathcal{P}_{[\infty]:k}$ to denote this operation: $X \mapsto \{X^{-1}(1), \ldots, X^{-1}(k)\} \setminus \{\emptyset\}$. Given an exchangeable probability measure $\chi$ on $\mathcal{M}_{[\infty]:k}$, let $X^*$ be the chain constructed in (5). We write $\Pi^* := (\Pi^*_m, m \ge 0)$ to denote the projection of $X^*$ into $\mathcal{P}_{[\infty]:k}$ by removing labels, i.e. $\Pi^*_m := B_\infty(X^*_m)$ for every $m \ge 0$.

Theorem 2.4. Let $\Pi$ be an exchangeable Feller chain on $\mathcal{P}_{[\infty]:k}$. Then $\Pi =_{\mathcal{L}} \Pi^*$, where $\Pi^*$ is the projection of some cut-and-paste chain on $[k]^{\mathbb{N}}$ characterized by a unique row-column exchangeable cut-and-paste measure $\Sigma$.

Remark 2.5. We call a Markov chain $\Pi$ on $\mathcal{P}_{[\infty]:k}$ a homogeneous cut-and-paste chain if $\Pi =_{\mathcal{L}} \Pi^*$ for some $\Pi^*$ as defined above. As in Theorem 2.1, we call $\Sigma$ its cut-and-paste measure.

2.3. Continuous-time cut-and-paste processes: Lévy-Itô decomposition. In this part, we extend the theory from section 2.2 to include infinite measures on $\mathcal{M}_{[\infty]:k}$. The processes we consider here have the feature that they can experience infinitely many jumps in arbitrarily small time intervals but, by sampling consistency, each finite restriction can make only finitely many jumps in bounded intervals. Throughout this section, we assume $X := (X(t), t \ge 0)$ is an exchangeable Feller process on $[k]^{\mathbb{N}}$ (equipped with the product discrete topology).

There is a natural restriction of any $M \in \mathcal{M}_{[\infty]:k}$ to a map $M_{[n]} : [k]^{[n]} \to [k]^{[n]}$ by simply ignoring all coordinates of $M$ larger than $n$. This should be clear from Table 2 and the surrounding discussion. Let $\chi$ be an exchangeable measure on $\mathcal{M}_{[\infty]:k}$ satisfying

(7)    $\chi(\{\mathrm{id}_k\}) = 0$  and  $\chi(\{M \in \mathcal{M}_{[\infty]:k} : M_{[n]} \ne \mathrm{id}_{k,n}\}) < \infty$ for all $n \in \mathbb{N}$,

where $\mathrm{id}_k$ is the identity map $[k]^{\mathbb{N}} \to [k]^{\mathbb{N}}$ and $\mathrm{id}_{k,n}$ is the identity map $[k]^{[n]} \to [k]^{[n]}$. We use $\chi$ as the intensity of a Poisson point process on $\mathcal{M}_{[\infty]:k}$; from this Poisson process, we construct a Markov process on $[k]^{\mathbb{N}}$ by building a compatible collection of finite state space Markov chains. In this context, the second half of (7) is needed to ensure that the finite state processes have càdlàg paths almost surely.

We construct a Markov process $X^* = (X^*(t), t \ge 0)$ on $[k]^{\mathbb{N}}$ as follows. Let $\mathbf{M} := \{(t, M^t)\} \subseteq \mathbb{R}^+ \times \mathcal{M}_{[\infty]:k}$ be a Poisson point process with intensity $dt \otimes \chi$ (where $dt$ is Lebesgue measure on $\mathbb{R}^+$) and let $X_0$ be an exchangeable initial state. Then, for each $n \in \mathbb{N}$, we construct $X^*_{[n]} := (X^*_{[n]}(t), t \ge 0)$ on $[k]^{[n]}$ by putting $X^*_{[n]}(0) = X_{0,[n]} := X^1_0 \cdots X^n_0$, the restriction of $X_0$ to its first $n$ coordinates, and

(8)    if $t > 0$ is an atom time of $\mathbf{M}$ such that $M^t_{[n]} \ne \mathrm{id}_{k,n}$, then we put $X^*_{[n]}(t) = M^t_{[n]}(X^*_{[n]}(t-))$; otherwise, we put $X^*_{[n]}(t) = X^*_{[n]}(t-)$.
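A minimal sketch of the construction (8) for a single finite restriction, under the simplifying assumption of a finite characteristic measure so that the Poisson point process can be simulated atom by atom. Here the measure is concentrated on cut-and-paste maps generated from one fixed matrix $S$; this choice, the rate, and the helper names are illustrative assumptions, not the general construction.

    import random

    rng = random.Random(7)

    k, n = 2, 6
    S = [[0.8, 0.3],                     # column stochastic: S[i][j] is the
         [0.2, 0.7]]                     # probability that color j becomes i

    def sample_map():
        # A paintbox array restricted to [k]^[n]: rows are i.i.d. colorings
        # with P{M[i][j] = l} = S[l][i] (colors 0 and 1 here).
        return [[0 if rng.random() < S[0][i] else 1 for _ in range(n)]
                for i in range(k)]

    def simulate_restriction(x0, total_rate, horizon):
        # (8) with a finite intensity: atoms arrive at rate total_rate, and
        # at each atom time t we put X(t) = M^t(X(t-)).
        t, x, path = 0.0, list(x0), [(0.0, tuple(x0))]
        while True:
            t += rng.expovariate(total_rate)
            if t > horizon:
                return path
            M = sample_map()
            x = [M[c][j] for j, c in enumerate(x)]   # coordinate j, color c
            path.append((round(t, 3), tuple(x)))

    for t, state in simulate_restriction([0, 1, 0, 0, 1, 1], 1.0, 3.0):
        print(t, state)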

By construction, $(X^*_{[n]}, n \in \mathbb{N})$ is a compatible collection of finite state space Markov chains; hence, $(X^*_{[n]}, n \in \mathbb{N})$ determines a unique process on $[k]^{\mathbb{N}}$, which we denote $X^*$. It is easy to show (Proposition 5.2) that any such $X^*$ is an exchangeable Feller process. We call $\chi$ the characteristic measure of $X^*$.

Theorem 2.6. Let $X$ be an exchangeable Feller process on $[k]^{\mathbb{N}}$. Then there exists an exchangeable measure $\chi$ satisfying (7) such that $X =_{\mathcal{L}} X^*$, where $X^*$ is constructed as in (8) from the Poisson point process $\mathbf{M}$ with intensity $dt \otimes \chi$. Moreover, the measure $\chi$ can be expressed as

(9)    $\chi = \mu_\Sigma + \sum_{1 \le i \ne j \le k} c_{ij}\,\rho_{ij}$,

for a unique measure $\Sigma$ on $\mathcal{S}_k$ satisfying (10), unique constants $c_{ij} \ge 0$, $i \ne j$, and $\rho_{ij}$ defined in (36). Specifically,

(i) the restriction of $\chi$ to $\{M \in \mathcal{M}_{[\infty]:k} : |M|_k \ne I_k\}$ is a cut-and-paste measure, $\mathbf{1}\{|M|_k \ne I_k\}\,\chi(dM) = \mu_\Sigma(dM)$; and
(ii) the restriction of $\chi$ to $\{M \in \mathcal{M}_{[\infty]:k} : |M|_k = I_k\}$ is a relocation measure, $\mathbf{1}\{|M|_k = I_k\}\,\chi(dM) = \sum_{i \ne j} c_{ij}\,\rho_{ij}$.

In the above statement, $I_k$ denotes the $k \times k$ identity matrix. The expression (9) is of Lévy-Itô type (e.g. Bertoin [3]), where the $\rho_{ij}$ are the relocation measures, which determine the continuous component of $\chi$, and $\mu_\Sigma$ describes the jump component of $\chi$. To guarantee that $\chi$ satisfies (7), $\Sigma$ must satisfy

(10)    $\Sigma(\{I_k\}) = 0$  and  $\int_{\mathcal{S}_k} (1 - \|S\|)\,\Sigma(dS) < \infty$,

where $\|S\| := \min\{S_{11}, \ldots, S_{kk}\}$ for any $S \in \mathcal{S}_k$. We give a detailed explanation of the above theorem, and its implications, in section 2.4.

As in discrete time, $|X| := (|X(t)|, t \ge 0)$ denotes the projection of $X$ into $\Delta_k$ by taking asymptotic frequencies.

Theorem 2.7. Let $X$ be an exchangeable Feller process on $[k]^{\mathbb{N}}$. Then $|X|$ exists almost surely and is a Feller process on $\Delta_k$.

The corresponding result for exchangeable Feller processes $\Pi$ on $\mathcal{P}_{[\infty]:k}$ reads similarly to Theorem 2.4, with the modification that the characteristic measure is given by (9). We call Feller processes on $\mathcal{P}_{[\infty]:k}$ with infinite transition rates homogeneous cut-and-paste processes.

Theorem 2.8. Let $\Pi$ be an exchangeable Feller process on $\mathcal{P}_{[\infty]:k}$. Then there exists a unique row-column exchangeable measure $\Sigma$ on $\mathcal{S}_k$ satisfying (10) and a unique constant $c \ge 0$ such that $\Pi =_{\mathcal{L}} B_\infty(X)$, where $X$ is a cut-and-paste process on $[k]^{\mathbb{N}}$ with characteristic measure

(11)    $\chi = \mu_\Sigma + c\rho$,

where $\rho := \sum_{i \ne j} \rho_{ij}$ is the relocation measure.

Remark 2.9. The measures (9) and (11) are, in general, infinite and so, like the general fragmentation and coagulation processes studied in [4, 16], the homogeneous cut-and-paste process experiences an instantaneous transition out of its initial state. In fact, cut-and-paste processes can experience instantaneous transitions at all times.

2.4. Discussion of main theorems. In discrete time, Theorem 2.1 admits the following description of the transitions of an exchangeable cut-and-paste chain on $[k]^{\mathbb{N}}$. At each time, we simultaneously draw $k$ $k$-sided dice from a distribution $\Sigma$ (the probability that die $j$ shows $i$ pips occupies the $(i, j)$ entry of $S \sim \Sigma$) and, for each index presently having label $j \in [k]$, we roll die $j$ to determine its label at the next time. So, in discrete time, all indices choose new labels at each time.

Though in some sense obvious, Theorem 2.1 leads to a more surprising observation (Theorem 2.3) regarding the sequence induced by $X$ on the $k$-simplex $\Delta_k$. As a result of the above $k$-dice construction, this induced chain is governed by a product of i.i.d. random matrices, whose entries coincide with the probabilities of the random dice drawn at each time. This is the generalization of the recipe outlined at the beginning of this section (and illustrated in Table 2).

In continuous time, we obtain the Lévy-Itô decomposition (9), whereby the discontinuities of $X$ are of two types only:

- cut-and-paste move: as in discrete time, all indices choose new labels independently from some randomly chosen transition probability matrix;
- single index flip: a single index labeled $i \in [k]$ changes its label to $j \ne i$, while all other indices remain unchanged.

The single index flips are governed by the second term in (9), where $\rho_{ij}$ is the $(i, j)$-relocation measure. In words, the $(i, j)$-relocation measure assigns unit mass to every single index flip from label $i$ to label $j$. The constants $c_{ij} \ge 0$ act as the rate at which any index labeled $i$ flips to $j$. All other types of discontinuity are ruled out as a result of infinite exchangeability. For instance, if double index flips were permitted, i.e. a pair of indices change their labels simultaneously while all other labels remain unchanged, then the finite restrictions of $X$ could not be càdlàg. To see this, note that exchangeability demands that each double site flip of $(n, n')$, $n \ne n'$, must occur at the same rate; therefore, if this rate were positive, say $r > 0$, then each index $n \in \mathbb{N}$ would have infinite flipping rate, since $\sum_{n' \ne n} r = \infty$. In a similar way, the technical condition (10) prevents cut-and-paste moves from bunching up and causing finite restrictions to make instantaneous jumps.

Once the theorems are observed for exchangeable processes on $[k]^{\mathbb{N}}$, the conclusions for $\mathcal{P}_{[\infty]:k}$-valued processes are nearly immediate by realizing that the projection of $X$ into $\mathcal{P}_{[\infty]:k}$ preserves the Markov property only if the transition law of $X$ treats the labels $[k]$ symmetrically, which amounts to row-column exchangeability of the characteristic measure $\chi$ in (9). By the nature of the decomposition (9), we see that the continuous component of any exchangeable Feller process on $[k]^{\mathbb{N}}$ (and $\mathcal{P}_{[\infty]:k}$) is purely deterministic.

2.5. Examples. We now provide two examples of exchangeable cut-and-paste processes (one in discrete time and one in continuous time) and one example of an exchangeable family of Markov chains that are not Feller (the Ehrenfest walk on the hypercube). In Examples 2.10 and 2.11, both processes induce a reversible Feller process on $\mathcal{P}_{[\infty]:k}$ (in fact, the $[k]^{\mathbb{N}}$-valued chain in Example 2.10 is also reversible). The stationary distribution in the first example is a version of the Pitman-Ewens two-parameter generalization of the Ewens sampling formula [11]. On the other hand, the purely continuous process in Example 2.11 evolves deterministically and settles down to a (somewhat trivial) equilibrium measure. Example 2.12 illustrates why single index flips are not admitted by discrete-time Feller chains.

Example 2.10 (A reversible discrete-time chain). Fix $\alpha > 0$, let $\xi = \mathrm{Dirichlet}(\alpha, \ldots, \alpha)$ be the $k$-parameter symmetric Dirichlet measure on $\Delta_k$ and put $\Sigma := \xi \otimes \cdots \otimes \xi$, the $k$-fold product measure of $\xi$. It is known (see e.g. Pitman [17]) that the paintbox process directed by $\xi$ is determined by the finite-dimensional distributions

$\varrho^{(n)}_\xi(\pi) := \dfrac{k!}{(k - \#\pi)!}\,\dfrac{\prod_{b \in \pi} \alpha^{\uparrow \#b}}{(k\alpha)^{\uparrow n}}$,  $\pi \in \mathcal{P}_{[n]:k}$,

where $\#\pi$ denotes the number of blocks of $\pi$, $\#b$ denotes the cardinality of $b \subseteq [n]$ and $\alpha^{\uparrow j} := \alpha(\alpha + 1) \cdots (\alpha + j - 1)$. On $[k]^{[n]}$, we define transition probabilities

$P_n(x, x') := \prod_{i=1}^k \dfrac{\prod_{j=1}^k (\alpha/k)^{\uparrow n_{ij}(x, x')}}{\alpha^{\uparrow n_i(x)}}$,  $x, x' \in [k]^{[n]}$,

where $n_{ij}(x, x') := \#\{l \in [n] : x_l = i \text{ and } x'_l = j\}$ and $n_i(x) := \#\{l \in [n] : x_l = i\}$. The above transition probability is reversible with respect to

$\lambda^{(n)}_\xi(x) = \dfrac{\prod_{i=1}^k \alpha^{\uparrow n_i(x)}}{(k\alpha)^{\uparrow n}}$,  $x \in [k]^{[n]}$.

For every $n \in \mathbb{N}$, the projection of this process to $\mathcal{P}_{[n]:k}$ is Markovian and has unique stationary distribution $\varrho^{(n)}_\xi$ as given above. Furthermore, its induced chain on the simplex has stationary distribution $\xi$, which is a special case of the Poisson-Dirichlet distribution. One feature of this family is that it is in the class of self-similar homogeneous cut-and-paste chains (section 6.4), for which $\Sigma$ is described as a $k$-fold product of some measure on $\Delta_k$. See [8] for a more in-depth study of this family of chains.

Example 2.11 (A purely continuous process). Let $k = 2$, $c_{12}, c_{21} > 0$ and $\Sigma = 0$. Then the associated process on $[k]^{\mathbb{N}}$ evolves continuously by a constant interchange of mass between the labels 1 and 2. In this case, the equilibrium distribution on the simplex is degenerate at

$\left( \dfrac{c_{21}}{c_{12} + c_{21}},\ \dfrac{c_{12}}{c_{12} + c_{21}} \right)$.

The projection into $\mathcal{P}_{[\infty]:k}$ is Markov only if $c_{12} = c_{21}$, in which case the projection into the simplex is degenerate at $(1/2, 1/2)$. In equilibrium, there is a constant and equal flow of mass between the two blocks at every instant of time.

Example 2.12 (Non-example: Ehrenfest chain on $\{0,1\}^{[n]}$). The process in Example 2.11 can be thought of as a continuous-time process on the infinite-dimensional hypercube $\{0,1\}^{\mathbb{N}}$. In this example, we highlight that the family of discrete-time Ehrenfest chains on the hypercubes $\{0,1\}^{[n]}$, $n \in \mathbb{N}$, are not Feller and, thus, not covered by our theory. On $\{0,1\}^{[n]}$, an Ehrenfest chain $X_{[n]}$ evolves by choosing a coordinate $1, \ldots, n$ uniformly at random and then flipping a fair coin to decide its value at the next time; all other coordinates remain unchanged. In the language of section 2.4, all transitions of this chain are single index flips in discrete time. These chains are clearly exchangeable; however, they are not consistent, since the probability that $X_{[n]}$ remains in the same state $x \in \{0,1\}^{[n]}$ after a transition is $1/2$, whereas the projection of an Ehrenfest chain $X_{[n+1]}$ on $\{0,1\}^{[n+1]}$ into $\{0,1\}^{[n]}$ remains in the same state with probability $(n + 2)/(2n + 2) \ne 1/2$. Therefore, the system $(X_{[n]}, n \in \mathbb{N})$ is not consistent.
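Returning to Example 2.10, the chain is easy to simulate on a finite restriction via the $k$-dice description of section 2.4, under the reading (an assumption of this sketch) that the $k$ factors of $\Sigma = \xi \otimes \cdots \otimes \xi$ give the $k$ dice, each an independent symmetric Dirichlet draw. Parameter values and the function names are illustrative.

    import random

    rng = random.Random(42)

    def dirichlet(alpha, k):
        # Symmetric Dirichlet(alpha, ..., alpha) via normalized Gammas.
        g = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
        total = sum(g)
        return [gi / total for gi in g]

    def cut_and_paste_step(x, alpha, k):
        # One transition on [k]^[n] (colors 0, ..., k-1 here): draw k
        # independent Dirichlet 'dice', then roll die c for every
        # coordinate currently colored c.
        dice = [dirichlet(alpha, k) for _ in range(k)]
        return [rng.choices(range(k), weights=dice[c])[0] for c in x]

    k, alpha = 3, 0.5
    x = [rng.randrange(k) for _ in range(20)]
    print(x)
    print(cut_and_paste_step(x, alpha, k))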

The rest of the paper is organized in five sections. In section 3, we lay out the necessary definitions and notation; in section 4, we prove Theorems 2.1 and 2.3 for discrete-time chains on $[k]^{\mathbb{N}}$; in section 5, we prove Theorems 2.6 and 2.7 for continuous-time processes on $[k]^{\mathbb{N}}$; in section 6, we deduce Theorems 2.4 and 2.8 and corollaries for partition-valued processes and their induced processes on the ranked $k$-simplex; finally, in section 7, we make some concluding remarks on the equilibrium measures of cut-and-paste processes as well as the relationship of cut-and-paste processes to coagulation and fragmentation processes.

3. Preliminaries

3.1. Partitions and colorings. For fixed $k \in \mathbb{N}$, a $k$-coloring of $[n] := \{1, \ldots, n\}$ is a sequence $x = x_1 \cdots x_n$ in $[k]^{[n]}$, which may be regarded as the projection of an infinite sequence $x_1 x_2 \cdots \in [k]^{\mathbb{N}}$ to its first $n$ coordinates. Any $x \in [k]^{[n]}$ projects to a unique equivalence relation $\sim_\pi$, and hence partition, of $[n]$ by the relation

(12)    $i \sim_\pi j \Leftrightarrow x_i = x_j$.

From this projection, any $x \in [k]^{\mathbb{N}}$ determines a unique partition of $\mathbb{N}$. To be precise, a partition $\pi$ of $[n]$ is a collection $\{\pi_1, \pi_2, \ldots, \pi_r\}$ of non-empty, disjoint subsets (blocks) satisfying $\bigcup_{j=1}^r \pi_j = [n]$. Equivalently, we can regard $\pi$ as the equivalence relation $\sim_\pi$ where, for any $i, j \in [n]$, $i \sim_\pi j$ if and only if $i$ and $j$ are in the same block of $\pi$. For $n \in \mathbb{N}$, we write $\mathcal{P}_{[n]}$ to denote the set of all partitions of $[n]$ and $\mathcal{P}$ to denote the set of partitions of $\mathbb{N}$. For fixed $k \in \mathbb{N}$, $\mathcal{P}_{[n]:k}$ (respectively $\mathcal{P}_{[\infty]:k}$) denotes the subcollection of partitions of $[n]$ (resp. $\mathbb{N}$) with at most $k$ blocks.

Any $\pi \in \mathcal{P}$ is determined by the compatible sequence $(\pi_{[1]}, \pi_{[2]}, \ldots)$ of its finite restrictions, where, for each $n \in \mathbb{N}$, $\pi_{[n]} := \{\pi_j \cap [n], j \ge 1\} \setminus \{\emptyset\}$. Similarly, we write $x_{[n]} := x_1 \cdots x_n$ to denote the restriction of $x \in [k]^{\mathbb{N}}$ to its first $n$ coordinates. Furthermore, any injective map $\varphi : [m] \to [n]$, $m \le n$, naturally associates each $\pi \in \mathcal{P}_{[n]}$ to $\pi^\varphi \in \mathcal{P}_{[m]}$ and each $x \in [k]^{[n]}$ to $x^\varphi \in [k]^{[m]}$, where

$i \sim_{\pi^\varphi} j \Leftrightarrow \varphi(i) \sim_\pi \varphi(j)$

and

(13)    $x^\varphi := x_{\varphi(1)} \cdots x_{\varphi(m)}$.

At times, it is convenient to express these projections as maps on either $\mathcal{P}_{[n]}$ or $[k]^{[n]}$, in which case we make the convenient abuse of notation and write $R_m$ to denote restriction to $[m]$ on both partitions and $k$-colorings, i.e. $R_m \pi := \pi_{[m]}$ and $R_m x := x_{[m]}$. On the other hand, for $\varphi : [m] \to [n]$, we write $\varphi^{**} : \mathcal{P}_{[n]} \to \mathcal{P}_{[m]}$ to denote the image of a partition by $\varphi$, $\varphi^{**}(\pi) := \pi^\varphi$, and $\varphi^* : [k]^{[n]} \to [k]^{[m]}$ to denote the image of a $k$-coloring by $\varphi$, $\varphi^*(x) := x^\varphi$. For $x \in [k]^{[n]}$, let $B_n(x)$ denote the partition of $[n]$ induced by $x$ through (12); then $B := (B_n, n \in \mathbb{N})$ establishes a natural correspondence between $[k]^{\mathbb{N}}$ and $\mathcal{P}_{[\infty]:k}$ in the sense that, for every $m \le n$ and every injective map $\varphi : [m] \to [n]$, $B_m \circ \varphi^* = \varphi^{**} \circ B_n$. This correspondence is useful in section 6.
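The maps of this subsection are elementary to implement, and doing so makes the commutation relation $B_m \circ \varphi^* = \varphi^{**} \circ B_n$ tangible. A Python sketch (representing blocks as lists sorted by least element is our own illustrative convention):

    def induced_partition(x):
        # B_n of (12): i ~ j iff x_i = x_j; blocks listed by least element.
        blocks = {}
        for i, c in enumerate(x, start=1):
            blocks.setdefault(c, []).append(i)
        return sorted(blocks.values())

    def image_coloring(x, phi):
        # phi*(x) = x^phi := x_{phi(1)} ... x_{phi(m)}, as in (13).
        return [x[p - 1] for p in phi]

    def image_partition(pi, phi):
        # phi**(pi) = pi^phi: i ~ j iff phi(i) ~ phi(j) in pi.
        label = {}
        for b_index, b in enumerate(pi):
            for element in b:
                label[element] = b_index
        return induced_partition([label[p] for p in phi])

    x = [1, 2, 2, 1, 3]                  # a 3-coloring of [5]
    phi = [2, 5, 3]                      # an injection [3] -> [5]
    print(induced_partition(image_coloring(x, phi)))    # B_m(phi*(x))
    print(image_partition(induced_partition(x), phi))   # phi**(B_n(x)): equal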

The projective nature of both $[k]^{\mathbb{N}}$ and $\mathcal{P}_{[\infty]:k}$ endows both spaces with the natural product discrete topology induced, for example, by the ultrametric $d$ defined by

(14)    $d(\lambda, \lambda') := 2^{-n(\lambda, \lambda')}$, for $\lambda, \lambda'$ both in $[k]^{\mathbb{N}}$ or $\mathcal{P}_{[\infty]:k}$,

where $n(\lambda, \lambda') := \max\{n \in \mathbb{N} : \lambda_{[n]} = \lambda'_{[n]}\}$. Under (14), both $[k]^{\mathbb{N}}$ and $\mathcal{P}_{[\infty]:k}$ are compact, separable, and therefore Polish, metric spaces. Throughout this paper, we equip $[k]^{\mathbb{N}}$ with the sigma field $\sigma\langle \bigcup_{n=1}^\infty [k]^{[n]} \rangle$ and $\mathcal{P}_{[\infty]:k}$ with the sigma field $\sigma\langle \bigcup_{n=1}^\infty \mathcal{P}_{[n]:k} \rangle$.

3.2. Exchangeability. For each $n \in \mathbb{N}$, let $\mathfrak{S}_n$ denote the symmetric group of permutations of $[n]$, i.e. one-to-one maps $[n] \to [n]$. A permutation of $\mathbb{N}$ is called finite if it fixes all but finitely many elements of $\mathbb{N}$. An infinite sequence $X := (X_1, X_2, \ldots)$ of random variables is called exchangeable if its law is invariant under finite permutations of its indices; in other words, for each $n \in \mathbb{N}$, $(X_{\sigma(1)}, \ldots, X_{\sigma(n)}) =_{\mathcal{L}} (X_1, \ldots, X_n)$ for every $\sigma \in \mathfrak{S}_n$. By de Finetti's theorem, see e.g. Aldous [1], the law of any exchangeable sequence $X \in [k]^{\mathbb{N}}$ is determined by a unique measure $\nu$ on the $(k-1)$-dimensional simplex

$\Delta_k := \{(s_1, \ldots, s_k)^T : s_i \ge 0,\ \sum_{i=1}^k s_i = 1\}$.

In particular, conditional on $s \sim \nu$, let $X_1, X_2, \ldots$ be independent and identically distributed (i.i.d.) according to $P_s\{X = j\} = s_j$, $j = 1, \ldots, k$. We write $\lambda_\nu$ to denote the unconditional distribution of $X$ constructed in this way, and we call $\nu$ the directing measure of $X$.

Any exchangeable $[k]$-valued sequence $X$ projects to an exchangeable random partition $\Pi := B_\infty(X)$. We write $\Pi \sim \varrho_\nu$ to denote the law of $\Pi$ generated in this way. This construction of $\Pi$ can be viewed as a special case of Kingman's paintbox construction [14] for exchangeable random partitions of $\mathbb{N}$, and so we call $\varrho_\nu$ a paintbox measure and $\Pi \sim \varrho_\nu$ a paintbox process directed by $\nu$.

A consequence of the paintbox representation is the almost sure existence of the asymptotic frequency, or associated mass partition, of every exchangeable random partition. To be specific, for $X \in [k]^{\mathbb{N}}$ exchangeable we write $|X| := (|X^{-1}(1)|, \ldots, |X^{-1}(k)|) \in \Delta_k$, where $X^{-1}(i) := \{j \in \mathbb{N} : X_j = i\}$ and, for any $S \subseteq \mathbb{N}$,

$|S| := \lim_{n\to\infty} \dfrac{\#(S \cap [n])}{n}$

denotes its asymptotic frequency, if it exists. Likewise, if the asymptotic frequency of each of its blocks exists, a partition $\pi \in \mathcal{P}_{[\infty]:k}$ has asymptotic frequency $|\pi| := (|\pi_1|, \ldots, |\pi_k|)$, the asymptotic frequencies of its blocks listed in decreasing order of size, which is an element of the ranked $k$-simplex

$\Delta_k^{\downarrow} := \{(s_1, \ldots, s_k) : s_1 \ge \cdots \ge s_k \ge 0,\ \sum_i s_i = 1\}$.

Remark 3.1. To avoid measurability concerns, we can add a point $\partial$ to both $\Delta_k$ and $\Delta_k^{\downarrow}$ and put $|x| = \partial$ (resp. $|\pi| = \partial$) whenever the asymptotic frequency of $x \in [k]^{\mathbb{N}}$ (resp. $\pi \in \mathcal{P}_{[\infty]:k}$) does not exist. We equip $\Delta_k$ and $\Delta_k^{\downarrow}$, respectively, with the sigma fields generated by the maps $|\cdot| : [k]^{\mathbb{N}} \to \Delta_k \cup \{\partial\}$ and $|\cdot| : \mathcal{P}_{[\infty]:k} \to \Delta_k^{\downarrow} \cup \{\partial\}$. Beyond this point, issues of measurability never arise, and so neither does the above formalism.
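A sketch of sampling from $\lambda_\nu$ and observing that the empirical frequencies recover $|X|$ (Python; the choice $\nu = \mathrm{Dirichlet}(\alpha, \ldots, \alpha)$ is an illustrative assumption):

    import random

    rng = random.Random(9)

    def sample_paintbox_sequence(n, k, alpha=1.0):
        # Draw s from nu (here a symmetric Dirichlet(alpha), an illustrative
        # choice), then X_1, ..., X_n i.i.d. with P_s{X = j} = s_j.
        g = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
        total = sum(g)
        s = [gi / total for gi in g]
        X = [rng.choices(range(k), weights=s)[0] for _ in range(n)]
        return s, X

    s, X = sample_paintbox_sequence(100_000, k=4)
    freq = [X.count(j) / len(X) for j in range(4)]  # approximates |X|, which is s a.s.
    print([round(v, 3) for v in s])
    print([round(v, 3) for v in freq])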

3.3. Exchangeable Markov processes. For $T$ either $\mathbb{Z}^+$ or $\mathbb{R}^+$, let $\mathbf{\Lambda} := (\Lambda(t), t \in T)$ be a random collection in either $[k]^{\mathbb{N}}$ or $\mathcal{P}_{[\infty]:k}$. We say $\mathbf{\Lambda}$ is Markovian if, for every $t, t' \ge 0$, the conditional law of $\Lambda(t + t')$, given $\mathcal{F}_t := \sigma\langle \Lambda(s), s \le t \rangle$, depends only on $\Lambda(t)$ and $t'$. Specifically, we distinguish between processes with finitely many jumps in finite intervals (Markov chains) and those which admit infinitely many jumps in arbitrarily small intervals (Markov processes); when speaking generally, we use the terminology and notation of Markov processes as a catch-all. To aid the exposition, we regard $T$ as time; when time is discrete, we indicate time by subscripts indexed by $m$, e.g. $\Lambda_m$, and when time is continuous, we index time by $t$ and write $\Lambda(t)$.

Throughout the paper, we focus on (infinitely) exchangeable Markov processes, which, on $[k]^{\mathbb{N}}$ and $\mathcal{P}_{[\infty]:k}$, we define as a collection $\mathbf{\Lambda} := (\Lambda(t), t \in T)$ for which

- $\mathbf{\Lambda}$ is exchangeable; and
- $\mathbf{\Lambda}$ is consistent (under selection), or Feller; that is, for each $n \in \mathbb{N}$, the restriction $\mathbf{\Lambda}_{[n]} := (R_n \Lambda(t), t \in T)$ is a Markov chain with càdlàg paths.

Notationally, $\mathbf{\Lambda}$ is infinitely exchangeable if, for every $m \le n$, $\varphi^\circ \mathbf{\Lambda}_{[n]} =_{\mathcal{L}} \mathbf{\Lambda}_{[m]}$ for every injection $\varphi : [m] \to [n]$, where $\varphi^\circ$ coincides with either $\varphi^*$ or $\varphi^{**}$ as appropriate. We stress that, by our definition, an exchangeable Markov process determines a consistent system of finite state space Markov chains through the restriction maps. Note that, in general, a Markov process on $[k]^{\mathbb{N}}$ or $\mathcal{P}_{[\infty]:k}$ need not be consistent, as the restriction maps are many-to-one. However, in this paper, we only treat exchangeable Markov processes which are also consistent and so, for convenience, we incorporate consistency into our definition of exchangeability. Under the product discrete topology, exchangeability and consistency of a Markov process are equivalent to exchangeability and the Feller property, and so we use the terms consistency and Feller interchangeably.

Remark 3.2 (Notation). Throughout the paper, we use the letter $x$ to denote a $k$-coloring, $X$ a random $k$-coloring and $\mathbf{X}$ a random collection of $k$-colorings. On the other hand, we use the letter $\pi$ to denote a partition, $\Pi$ a random partition and $\mathbf{\Pi}$ a random collection of partitions. Above, when discussing definitions and notation pertaining to both $[k]^{\mathbb{N}}$ and $\mathcal{P}_{[\infty]:k}$, we have used $\lambda$, $\Lambda$ and $\mathbf{\Lambda}$, as appropriate.

3.4. Random functionals and paintbox arrays. For $k \in \mathbb{N}$, we write $\mathcal{S}_k$ to denote the space of $k \times k$ (column) stochastic matrices, i.e. each $S \in \mathcal{S}_k$ satisfies

- $S(i, j) = S_{ij} \ge 0$ for all $1 \le i, j \le k$; and
- $S_{1j} + \cdots + S_{kj} = 1$ for every $j = 1, \ldots, k$.

Any $S \in \mathcal{S}_k$ can serve as a transition probability matrix for a time-homogeneous Markov chain on $[k]$ by defining the probability of a transition from $j$ to $i$ by $S_{ij}$.

For $n, k \in \mathbb{N}$, a paintbox array $M \in \mathcal{M}_{[n]:k}$ is a $k$-tuple $(M_1, \ldots, M_k)$ of $k$-colorings of $[n]$, which we write as an array $M := (M^j_i, 1 \le i \le k, 1 \le j \le n)$ with each row corresponding to the sequence $M_i := M^1_i \cdots M^n_i$, $i = 1, \ldots, k$; we therefore call $M_i$ the $i$th row of $M$. Any $M \in \mathcal{M}_{[n]:k}$ can be regarded as the restriction of an infinite paintbox array $M := (M_1, \ldots, M_k) \in \mathcal{M}_{[\infty]:k}$, which is a $k$-tuple of $k$-colorings of $\mathbb{N}$.

Written as an array, $M := (M^j_i, 1 \le i \le k, j \ge 1)$ restricts uniquely to $M_{[n]} \in \mathcal{M}_{[n]:k}$ defined by $M_{[n]} := (M^j_i, 1 \le i \le k, 1 \le j \le n)$.

For any $S \in \mathcal{S}_k$ and $n \in \mathbb{N}$, we define a probability measure on $\mathcal{M}_{[n]:k}$ by letting $(M^j_i, 1 \le i \le k, 1 \le j \le n)$ be independent with distribution

$P_S\{M^j_i = l\} = S_{li}$,  $l = 1, \ldots, k$.

We write $\mu^{(n)}_S$ to denote the probability measure on $\mathcal{M}_{[n]:k}$ determined by this procedure. For each $n \in \mathbb{N}$, $\mu^{(n)}_S$ is a product measure and, hence, the measures $(\mu^{(n)}_S, n \in \mathbb{N})$ are consistent and determine a unique measure $\mu_S$ on $\mathcal{M}_{[\infty]:k}$. Since $\mathcal{S}_k \subseteq \Delta_k \times \cdots \times \Delta_k$ ($k$ times), we endow $\mathcal{S}_k$ with the product of sigma algebras generated by the asymptotic frequency map $|\cdot| : [k]^{\mathbb{N}} \to \Delta_k$. Given a measure $\Sigma$ on $\mathcal{S}_k$, we write $\mu_\Sigma$ for the mixture of the $\mu_S$ measures:

(15)    $\mu_\Sigma(\cdot) := \int_{\mathcal{S}_k} \mu_S(\cdot)\,\Sigma(dS)$.

We call $\mu_\Sigma$ a cut-and-paste measure.

Recall from section 2.1 that a measure $\chi$ on $\mathcal{M}_{[\infty]:k}$ is exchangeable if $\chi$-almost every $M \in \mathcal{M}_{[\infty]:k}$ satisfies (4), i.e. $\chi$ is invariant under independent action by a finite permutation on each of its rows. From the above definition, $\mu_\Sigma$ is exchangeable for every measure $\Sigma$ on $\mathcal{S}_k$.

For any $M \in \mathcal{M}_{[\infty]:k}$, we define its asymptotic frequency $|M|_k$ as the $k \times k$ (column) stochastic matrix with $(i, j)$-entry

(16)    $\lim_{n\to\infty} n^{-1} \sum_{l=1}^n \mathbf{1}\{M^l_j = i\}$,

the limiting proportion of $i$s in the sequence $M_j := M^1_j M^2_j \cdots$. This quantity exists almost surely whenever $M$ is exchangeable in the sense of (4).

Given $M \in \mathcal{M}_{[\infty]:k}$ and $x \in [k]^{\mathbb{N}}$, we write $x' = M(x)$ to denote the sequence for which $x'_m = M^m_i$ on the event $x_m = i$. We call $M : [k]^{\mathbb{N}} \to [k]^{\mathbb{N}}$ a cut-and-paste mapping. The name cut-and-paste is motivated by the following scheme. For $x \in [k]^{\mathbb{N}}$, let $x^{-1}(i) := \{j \in \mathbb{N} : x_j = i\}$ for each $i = 1, \ldots, k$, and envision $\mathbb{N}$ partitioned into $k$ labeled classes $x^{-1}(1), \ldots, x^{-1}(k)$ (some of which may be empty). Then $M$ acts on these classes by

- cutting into sub-classes: each class of $x$ is further partitioned into $k$ sub-classes, each labeled $1, \ldots, k$, according to $M$; and
- pasting commonly labeled sub-classes: for every $i = 1, \ldots, k$, each sub-class labeled $i$ is merged into a single class labeled $i$.

The image $x'$ is obtained by assigning $x'_j = i$ if $j$ is in a sub-class labeled $i$ after the cutting phase. We equip $\mathcal{M}_{[\infty]:k}$ with the sigma field $\sigma\langle \bigcup_{n \in \mathbb{N}} \mathcal{M}_{[n]:k} \rangle$.
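Both $\mu^{(n)}_S$ and the cut-and-paste action are simple to simulate. The following Python sketch (the matrix $S$ is an arbitrary illustrative choice, and colors run over $0, \ldots, k-1$) samples $M \sim \mu^{(n)}_S$ and applies it to a coloring as in $x' = M(x)$:

    import random

    rng = random.Random(3)

    def sample_paintbox_array(S, n):
        # M ~ mu_S^(n): entries independent with P_S{M[i][j] = l} = S[l][i].
        k = len(S)
        return [[rng.choices(range(k), weights=[S[l][i] for l in range(k)])[0]
                 for _ in range(n)] for i in range(k)]

    def act(M, x):
        # x' = M(x): coordinate j of current color c receives color M[c][j].
        return [M[c][j] for j, c in enumerate(x)]

    S = [[0.6, 0.1, 0.2],    # column stochastic; S[l][i] is the chance that
         [0.3, 0.8, 0.3],    # a coordinate colored i is recolored l
         [0.1, 0.1, 0.5]]
    x = [0, 1, 1, 2, 0, 2, 1, 0]
    M = sample_paintbox_array(S, len(x))
    print(act(M, x))
    # Cut-and-paste reading: class x^{-1}(i) is cut into sub-classes by row
    # M_i, and sub-classes sharing a label are pasted into one class.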

4. Discrete-time cut-and-paste chains: random matrix products

In this section, we assume $\mathbf{X} := (X_m, m \ge 0)$ determines a consistent system of exchangeable Markov chains on $[k]^{\mathbb{N}}$. That is, for every $n \in \mathbb{N}$, $\mathbf{X}_{[n]} := (R_n X_m, m \ge 0)$ is a Markov chain on $[k]^{[n]}$ and, for every $\varphi : [m] \to [n]$, $m \le n$, $\varphi^* \mathbf{X}_{[n]} =_{\mathcal{L}} \mathbf{X}_{[m]}$.

Generically, we denote the transition probabilities of $\mathbf{X}$ by the family $(P(x, \cdot), x \in [k]^{\mathbb{N}})$ of conditional distributions on $[k]^{\mathbb{N}}$ where, for every $m \ge 0$, $P[X_{m+1} \in \cdot \mid \mathcal{F}_m] = P(X_m, \cdot)$. The assumption that $(\mathbf{X}_{[n]}, n \in \mathbb{N})$ is a consistent system of Markov chains implies that each $\mathbf{X}_{[n]}$ has a well-defined transition kernel $P_n(\cdot, \cdot)$ given by

(17)    $P_n(x, x') := P(\tilde{x}, \{x'' \in [k]^{\mathbb{N}} : x''_{[n]} = x'\})$,  $x, x' \in [k]^{[n]}$,

for every $\tilde{x} \in [k]^{\mathbb{N}}$ agreeing with $x$ in the first $n$ coordinates. In fact, the assumption of exchangeability of $\mathbf{X}$ implies that the finite-dimensional transition probabilities (17) are compatible in the sense that, for every $m \le n$ and injection $\varphi : [m] \to [n]$,

(18)    $P_m(x, x') = P_n(\tilde{x}, (\varphi^*)^{-1}(x'))$,  $x, x' \in [k]^{[m]}$,

for every $\tilde{x} \in (\varphi^*)^{-1}(x)$. In this section, we focus on the law governing the transitions of $\mathbf{X}$, and so we implicitly assume that the initial state $X_0$ is exchangeable. Recall the definition of $X^*$ in (5). We obtain the following preliminary fact about exchangeable measures on $\mathcal{M}_{[\infty]:k}$.

Proposition 4.1. Let $\chi$ be an exchangeable probability measure on $\mathcal{M}_{[\infty]:k}$ and let $X^*$ be the Markov chain in (5) with directing measure $\chi$. Then $X^*$ is an exchangeable Markov chain on $[k]^{\mathbb{N}}$.

Proof. The Markov property of $X^*$ is obvious, since $M_1, M_2, \ldots$ is an i.i.d. sequence which is independent of the initial state. Exchangeability follows by exchangeability of $\chi$ and the identity $\sigma^*(M(x)) = M^\sigma(x^\sigma)$ for all finite permutations $\sigma : \mathbb{N} \to \mathbb{N}$. (Here, $M^\sigma$ denotes the image by applying $\sigma$ to each row of $M$, i.e. $M^\sigma := (M^{\sigma(j)}_i, 1 \le i \le k, j \ge 1)$.) Finally, consistency of the system of finite restrictions $(X^*_{[n]}, n \in \mathbb{N})$ follows by the definition of the restriction $M \mapsto M_{[n]}$ of $M \in \mathcal{M}_{[\infty]:k}$ to $\mathcal{M}_{[n]:k}$.

In fact, the converse of Proposition 4.1 is also true. That is, any exchangeable Markov chain $\mathbf{X}$ can be constructed as in (5) from an i.i.d. sequence $M_1, M_2, \ldots$ drawn from some exchangeable probability measure $\chi$ on $\mathcal{M}_{[\infty]:k}$.

Theorem 4.2. Let $\mathbf{X}$ be an exchangeable Markov chain on $[k]^{\mathbb{N}}$. Then there exists an exchangeable probability measure $\chi$ on $\mathcal{M}_{[\infty]:k}$ such that $\mathbf{X} =_{\mathcal{L}} X^*$, for $X^*$ constructed as in (5) from directing measure $\chi$.

Our proof of Theorem 4.2 relies on some preliminary propositions and lemmas. For fixed $k \in \mathbb{N}$, we define $Z_k \in [k]^{\mathbb{N}}$ by

(19)    $Z^{(z-1)k+i}_k = i$ for every $z \in \mathbb{N}$, $i \in [k]$;

that is, $Z_k := Z^1_k Z^2_k \cdots$ is an infinite repeating sequence of the pattern $12\cdots k$, which naturally partitions $\mathbb{N}$ into classes

(20)    $Z^{-1}_k(i) := \{(z-1)k + i : z \in \mathbb{N}\}$,  $i = 1, \ldots, k$.

For any $n \in \mathbb{N}$, let $Z_{k,n} := R_{nk} Z_k$ denote the restriction of $Z_k$ to $[nk]$; that is,

(21)    $Z^{-1}_{k,n}(i) := \{i, k + i, \ldots, (n-1)k + i\}$,  $i = 1, \ldots, k$.

Remark 4.3. For each $i = 1, \ldots, k$, the image of $Z^{-1}_{k,n}(i)$ by the map $x \mapsto \lceil x/k \rceil$ is simply $[n]$. Also, for any $x \in [k]^{[n]}$, $\lceil \varphi_x(\cdot)/k \rceil$ is the identity $[n] \to [n]$.
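The reference coloring $Z_{k,n}$ of (19)-(21) in code (a Python sketch; colors are 1-based as in the text):

    def Z(k, n):
        # Z_{k,n}, the restriction to [nk] of the repeating pattern 1 2 ... k:
        # position (z-1)k + i carries color i, as in (19).
        return [i for _ in range(n) for i in range(1, k + 1)]

    print(Z(3, 4))     # [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
    # Z^{-1}(i) = {i, k+i, ..., (n-1)k+i}, matching (21).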

For any $x \in [k]^{[nk]}$, we define $\hat{x} \in \mathcal{M}_{[n]:k}$ as the paintbox array with entries

(22)    $\hat{x}^j_i = x_{(j-1)k+i}$,  $1 \le i \le k$, $1 \le j \le n$.

If it is not clear from context, we write $\hat{x}^{k,n} := \hat{x}$ to specify the dimensions of the induced paintbox array. For any $n \in \mathbb{N}$, the map $x \mapsto \hat{x}^{k,n}$ from $[k]^{[nk]}$ to $\mathcal{M}_{[n]:k}$ is a bijection, and so any measure on $[k]^{[nk]}$ induces a measure on $\mathcal{M}_{[n]:k}$ and vice versa. Moreover, as any measure on $[k]^{\mathbb{N}}$ induces a system of consistent measures on $([k]^{[nk]}, n \in \mathbb{N})$, such a measure also induces a measure on $\mathcal{M}_{[\infty]:k}$.

For $M \in \mathcal{M}_{[n]:k}$, let $\check{M} \in [k]^{[nk]}$ denote the inverse image of $M$ under the bijection (22), and let $(P_n(\cdot, \cdot), n \in \mathbb{N})$ be the finite-dimensional transition probabilities of an exchangeable Feller chain on $[k]^{\mathbb{N}}$. We define a probability measure $\chi_n$ on $\mathcal{M}_{[n]:k}$, for each $n \in \mathbb{N}$, by

(23)    $\chi_n(M) := P_{nk}(Z_{k,n}, \check{M})$,  $M \in \mathcal{M}_{[n]:k}$.

Clearly, $\chi_n$ is a probability measure on $\mathcal{M}_{[n]:k}$ for every $n \in \mathbb{N}$. Exchangeability and consistency of $(\chi_n, n \in \mathbb{N})$ is induced by exchangeability of $\mathbf{X}$ as follows. Let $\varphi = (\varphi_1, \ldots, \varphi_k) : [m]^k \to [n]^k$ be an injective map. From $\varphi$, we define $\bar{\varphi} : [mk] \to [nk]$ by $(z-1)k + r \mapsto (\varphi_r(z) - 1)k + r$. Then we have

(24)    $\widehat{\bar{\varphi}^*(x)}^{\,k,m} = \varphi^*(\hat{x}^{\,k,n})$ for every $x \in [k]^{[nk]}$,

where we define $\varphi^*(M) := (\varphi^*_1(M_1), \ldots, \varphi^*_k(M_k))$ to be the componentwise application of the associated projections $\varphi^*_1, \ldots, \varphi^*_k$ to the corresponding rows of $M$. In the following proposition, we also use the fact that $\bar{\varphi}^*(Z_{k,n}) = Z_{k,m}$ for $\bar{\varphi} : [mk] \to [nk]$ as defined above.

Proposition 4.4. The collection $(\chi_n, n \in \mathbb{N})$ defined in (23) is a consistent collection of measures on $(\mathcal{M}_{[n]:k}, n \in \mathbb{N})$ which satisfy

(25)    $\chi_m = \chi_n \circ (\varphi^*)^{-1}$ for every $m \le n$ and every injection $\varphi : [m]^k \to [n]^k$.

In particular, $(\chi_n, n \in \mathbb{N})$ determines a unique exchangeable probability measure $\chi$ on $\mathcal{M}_{[\infty]:k}$.

Proof. Let $1 \le m \le n$, $M \in \mathcal{M}_{[m]:k}$ and write $x := \check{M}$. Then, for any injection $\varphi : [m]^k \to [n]^k$, we have

(26)    $\chi_n \circ (\varphi^*)^{-1}(M) := P_{nk}(Z_{k,n}, (\bar{\varphi}^*)^{-1}(x))$
(27)    $= P_{mk}(Z_{k,m}, x) = P_{mk}(Z_{k,m}, \check{M}) =: \chi_m(M)$.

Note that (26) follows from (24), while (27) follows from $\bar{\varphi}^*(Z_{k,n}) = Z_{k,m}$ and (18). We conclude uniqueness of $\chi$ by Kolmogorov's extension theorem.

The next two lemmas are needed to prove Theorem 4.2.

Lemma 4.5. Let $n \in \mathbb{N}$. For every $x \in [k]^{[n]}$, there exists an injection $\varphi_x : [n] \to [nk]$ such that $\varphi^*_x(Z_{k,n}) = x$.

Proof. Take any $x \in [k]^{[n]}$ and write $x(i) = x_i$ whenever convenient. Then $\varphi_x : [n] \to [nk]$, $i \mapsto (i-1)k + x_i$, is a one-to-one map and the projection $\varphi^*_x(Z_{k,n})$ satisfies

$[\varphi^*_x(Z_{k,n})](i) = Z_{k,n}(\varphi_x(i)) = Z_{k,n}((i-1)k + x_i) = x_i$

for all $i \in [n]$. Hence, $\varphi^*_x(Z_{k,n}) = x$.
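A sketch verifying that (22) is a bijection and that $\varphi_x$ of Lemma 4.5 satisfies $\varphi^*_x(Z_{k,n}) = x$ (Python; the concrete values of $x$, $y$, $k$, $n$ are arbitrary illustrative choices):

    def hat(x, k, n):
        # (22): the paintbox array with hat(x)[i][j] = x_{(j-1)k+i}; indices
        # are 1-based in the text, 0-based in Python lists.
        return [[x[(j - 1) * k + (i - 1)] for j in range(1, n + 1)]
                for i in range(1, k + 1)]

    def unhat(M, k, n):
        # The inverse of (22), recovering x in [k]^[nk].
        return [M[i - 1][j - 1] for j in range(1, n + 1) for i in range(1, k + 1)]

    k, n = 3, 4
    x = [2, 1, 3, 3, 2, 1, 1, 1, 2, 3, 2, 2]     # an element of [k]^[nk]
    assert unhat(hat(x, k, n), k, n) == x        # (22) is a bijection

    Zkn = [i for _ in range(n) for i in range(1, k + 1)]
    y = [1, 3, 2, 2]                             # an element of [k]^[n]
    phi = [(i - 1) * k + y[i - 1] for i in range(1, n + 1)]   # Lemma 4.5
    assert [Zkn[p - 1] for p in phi] == y        # phi_x*(Z_{k,n}) = x
    print("checks pass")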

Remark 4.6. For the rest of the paper, we reserve the symbol $\varphi_x$ to denote the injection $[n] \to [nk]$ associated to $x \in [k]^{[n]}$, as defined in the preceding lemma.

Lemma 4.7. For every $n \in \mathbb{N}$ and every $x, x' \in [k]^{[n]}$, we have the identity

$\{M \in \mathcal{M}_{[n]:k} : M(x) = x'\} = \{\hat{x}'' : x'' \in (\varphi^*_x)^{-1}(x')\}$.

Proof. Fix $n \in \mathbb{N}$, $x, x' \in [k]^{[n]}$ and take any $x'' \in (\varphi^*_x)^{-1}(x')$. We put $M := \hat{x}''$ and write $x''(i) := x''_i$, $i \in \mathbb{N}$, when convenient. Then, for every $i \in [n]$ and $r \in [k]$,

$x'_i = r \Leftrightarrow x''(\varphi_x(i)) = r \Leftrightarrow x''((i-1)k + x_i) = r \Leftrightarrow M(x_i, i) = r \Leftrightarrow [M(x)]_i = r$,

and we have $M(x) = x'$. Conversely, assume $M(x) = x'$ and let $x'' = \check{M}$. Reading the above series of statements from bottom to top, we have

$[M(x)]_i = r \Leftrightarrow x''(\varphi_x(i)) = r \Leftrightarrow [\varphi^*_x(x'')]_i = r$

for all $i \in [n]$, establishing our claim.

Proof of Theorem 4.2. Let $\chi$ be the unique exchangeable measure characterized by the transition probability measure of $\mathbf{X}$ through (23), and let $X^*$ be the Markov chain defined in (5) with directing measure $\chi$. We now show that $\mathbf{X} =_{\mathcal{L}} X^*$. By definition, $X^*_0 = X_0$ and hence both initial states have the same law. It remains to show that the transition probabilities are the same. In this direction, note that for $M \sim \chi$, the restriction $M_{[n]}$ to $\mathcal{M}_{[n]:k}$ has distribution $\chi_n$ as in (23). Now, by Lemmas 4.5 and 4.7 and Proposition 4.4, we have, for every $x, x' \in [k]^{[n]}$,

$P\{M_{[n]}(x) = x'\} = \chi_n(\{M' \in \mathcal{M}_{[n]:k} : M'(x) = x'\}) = P_{nk}(Z_{k,n}, (\varphi^*_x)^{-1}(x')) = P_n(x, x')$.

This completes the proof.

4.1. Asymptotic frequency. In (16), we defined the asymptotic frequency of any $M \in \mathcal{M}_{[\infty]:k}$ as the stochastic matrix $|M|_k := (S_{ij}, 1 \le i, j \le k)$. For $1 \le i, j \le k$, $S_{ij}$ is the limiting frequency of the label $i$ in $M_j := M^1_j M^2_j \cdots$:

$S_{ij} := \lim_{n\to\infty} n^{-1} \sum_{l=1}^n \mathbf{1}\{M^l_j = i\}$.

Lemma 4.8. Let $M \in \mathcal{M}_{[\infty]:k}$ be distributed according to an exchangeable probability measure $\chi$ on $\mathcal{M}_{[\infty]:k}$. Then $M$ possesses asymptotic frequencies almost surely.

Proof. By (4), the sequences $M_1, \ldots, M_k$ of $M \sim \chi$ are each exchangeable $k$-colorings of $\mathbb{N}$ and, by de Finetti's theorem, almost surely possess asymptotic frequencies $|M_1|, \ldots, |M_k|$. It follows that $M$ possesses asymptotic frequency $|M|_k$ almost surely.

We have the following characterization of exchangeable probability measures on $\mathcal{M}_{[\infty]:k}$.

Proposition 4.9. Let $\chi$ be an exchangeable probability measure on $\mathcal{M}_{[\infty]:k}$. Then there exists a unique probability measure $\Sigma$ on $\mathcal{S}_k$ such that $\chi = \mu_\Sigma$.

Proof. Let $M \sim \chi$. Then, by (4) and Lemma 4.8, the conditional law of $M$ given $|M|_k$ is $\mu_{|M|_k}$ (defined in (15)); hence, the unconditional law of $M$ must satisfy

$\chi(dM) = \int_{\mathcal{M}_{[\infty]:k}} \mu_{|M'|_k}(dM)\,\chi(dM') = \int_{\mathcal{S}_k} \mu_S(dM)\,\chi_k(dS) = \mu_{\chi_k}(dM)$,

where $\chi_k$ denotes the image measure of $\chi$ by the map $M \mapsto |M|_k$. Uniqueness of $\Sigma = \chi_k$ is a consequence of the fact that any other $\Sigma'$ for which $\chi = \mu_{\Sigma'}$ on measurable subsets of $\mathcal{M}_{[\infty]:k}$ can, necessarily, only differ from $\Sigma$ on $\chi_k$-null sets.

Proof of Theorem 2.1. This is an immediate consequence of Theorem 4.2 and Proposition 4.9.

Remark 4.10. We call $\Sigma$ the cut-and-paste measure of $\mathbf{X}$, and we call $\mathbf{X}$ an (exchangeable) cut-and-paste chain directed by (or with characteristic measure) $\mu_\Sigma$.

In fact, the process induced by $\mathbf{X}$ on the simplex through $|\mathbf{X}| := (|X_m|, m \ge 0)$ is a Markov chain with a representation as in (6). Let $\mathbf{X}$ be an exchangeable cut-and-paste chain directed by $\mu_\Sigma$ and construct a Markov chain $\Phi$ on $\Delta_k$ as in (6). It is intuitive both that $\Phi$ is a Markov chain on $\Delta_k$ and that $\Phi$ should have the same law as $|\mathbf{X}|$, provided $|\mathbf{X}|$ exists. We can now make this claim formal.

Lemma 4.11. Let $M, M'$ be independent random paintbox arrays on $\mathcal{M}_{[\infty]:k}$ such that $|M|_k$ and $|M'|_k$ exist almost surely. Then $|M \circ M'|_k$ exists and equals $|M|_k |M'|_k$ almost surely.

Proof. By definition, the $(i, j)$-entry of $|M \circ M'|_k$ is

$S_{ij} := \lim_{n\to\infty} n^{-1} \sum_{m=1}^n \sum_{l=1}^k \mathbf{1}\{M^m_l = i\}\,\mathbf{1}\{M'^m_j = l\}$.

Writing $M^{-1}_l(i) := \{m \in \mathbb{N} : M^m_l = i\}$ and $M'^{-1}_j(l) := \{m \in \mathbb{N} : M'^m_j = l\}$, the asymptotic frequencies $|M^{-1}_l(i)|$ and $|M'^{-1}_j(l)|$ exist almost surely for every $i, j, l \in [k]$ and we have

(28)    $S_{ij} = \lim_{n\to\infty} \dfrac{\#\big(\bigcup_{l=1}^k (M^{-1}_l(i) \cap M'^{-1}_j(l)) \cap [n]\big)}{n} = \sum_{l=1}^k \lim_{n\to\infty} \dfrac{\#(M^{-1}_l(i) \cap M'^{-1}_j(l) \cap [n])}{n} = \sum_{l=1}^k |M^{-1}_l(i) \cap M'^{-1}_j(l)| = \sum_{l=1}^k |M^{-1}_l(i)|\,|M'^{-1}_j(l)|$ a.s. (by independence of $M$ and $M'$),

where the first equality in (28) follows from the fact that $M'^{-1}_j(l) \cap M'^{-1}_j(l') = \emptyset$ for $l \ne l'$, and we interchange the limit and the sum in the succeeding step by an easy application of the bounded convergence theorem. Hence, $|M \circ M'|_k = |M|_k |M'|_k$ almost surely. Since $|M|_k$ and $|M'|_k$ are assumed to exist almost surely, it follows that $|M \circ M'|_k$ exists almost surely.

Proof of Theorem 2.3. To prove this, we use the representation of $\mathbf{X}$ from Theorem 2.1. In particular, let $\Sigma$ be the cut-and-paste measure governing the transitions of $\mathbf{X}$. Let $M_1, M_2, \ldots$ be an i.i.d. sequence of paintbox arrays with distribution $\mu_\Sigma$ and let $X_0$ be taken from the initial distribution of $\mathbf{X}$, which we have assumed to be exchangeable. We construct $X^*$ from $X_0, M_1, M_2, \ldots$ as in (5). By Lemma 4.8, each of $M_1, M_2, \ldots$ possesses asymptotic frequencies almost surely and, by Theorem 2.1, $|M_1|_k, |M_2|_k, \ldots$ are i.i.d. with distribution $\Sigma$. By Lemma 4.11, for every $m \ge 0$,

$|X^*_m| = |(M_m \circ M_{m-1} \circ \cdots \circ M_1)(X_0)| = |M_m|_k |M_{m-1}|_k \cdots |M_1|_k |X_0|$ a.s.

It follows that $|X^*| := (|(M_m \circ M_{m-1} \circ \cdots \circ M_1)(X_0)|, m \ge 0) =_{\mathcal{L}} \Phi$, where $\Phi$ is defined in (6) from the i.i.d. sequence $S_1, S_2, \ldots$ with law $\Sigma$.

We delay the proof of Theorem 2.4 and discussion of exchangeable Markov chains on $\mathcal{P}_{[\infty]:k}$ until section 6. We point out that the above cut-and-paste chains can be embedded in continuous time in the usual way by adding a random, exponentially distributed holding time between successive states. When the jump rates are finite, we obtain the same representation as for the discrete-time chains above. We make no further mention of these chains, as the corresponding theorems are easily deduced from those in the next section, in which we treat general exchangeable Feller processes on $[k]^{\mathbb{N}}$.
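Lemma 4.11 is easy to check numerically: for large $n$, the empirical frequency matrix of the composition is close to the product of the empirical frequency matrices. The following Python sketch samples both arrays from $\mu_S$-type laws (an illustrative choice; any exchangeable arrays with frequencies would do):

    import random

    rng = random.Random(5)

    def sample_array(S, n):
        # Rows are i.i.d. colorings with P{M[i][j] = l} = S[l][i] (law mu_S).
        k = len(S)
        return [[rng.choices(range(k), weights=[S[l][i] for l in range(k)])[0]
                 for _ in range(n)] for i in range(k)]

    def freq(M, k):
        # Empirical |M|_k: entry (i, j) is the frequency of color i in row M_j,
        # cf. (16).
        n = len(M[0])
        return [[sum(1 for l in range(n) if M[j][l] == i) / n for j in range(k)]
                for i in range(k)]

    def compose(M, Mp):
        # Array of M o M' (apply M' first): coordinate j of row i is M[M'[i][j]][j].
        return [[M[Mp[i][j]][j] for j in range(len(Mp[0]))] for i in range(len(Mp))]

    def matmul(A, B):
        k = len(A)
        return [[sum(A[i][l] * B[l][j] for l in range(k)) for j in range(k)]
                for i in range(k)]

    k, n = 2, 200_000
    M = sample_array([[0.7, 0.4], [0.3, 0.6]], n)
    Mp = sample_array([[0.2, 0.5], [0.8, 0.5]], n)
    print(freq(compose(M, Mp), k))          # approximately |M|_k |M'|_k:
    print(matmul(freq(M, k), freq(Mp, k)))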

5. Continuous-time cut-and-paste processes: Lévy-Itô decomposition

So far, we have characterized the transition measure of infinitely exchangeable Markov chains on $[k]^{\mathbb{N}}$. These are collections $\mathbf{X}$ of random $k$-colorings indexed by the non-negative integers, which we have shown to admit the representation given in (5). In this section, we show that the cut-and-paste representation holds more generally for exchangeable processes admitting possibly infinitely many jumps in arbitrarily small time intervals. The result is a decomposition of the characteristic measure of the process in the vein of the Lévy-Itô decomposition for Lévy processes.

Let $\mathbf{X} := (X(t), t \ge 0)$ be an exchangeable Feller process on $[k]^{\mathbb{N}}$. In particular, for any continuous $g : [k]^{\mathbb{N}} \to \mathbb{R}$, the semigroup $(P_t, t \ge 0)$ of $\mathbf{X}$ satisfies $\lim_{t \downarrow 0} P_t g(x) = g(x)$ for all $x \in [k]^{\mathbb{N}}$, and $x \mapsto P_t g(x)$ is continuous for all $t \ge 0$. By the Feller property and exchangeability, the finite restriction $\mathbf{X}_{[n]} := (R_n X(t), t \ge 0)$ to $[k]^{[n]}$ is a càdlàg exchangeable finite state space Markov chain, for every $n \in \mathbb{N}$. Since each restriction $\mathbf{X}_{[n]}$ is a finite state space Markov chain, it is characterized by its jump rates

(29)    $Q_n(x, x') := \lim_{t \downarrow 0} t^{-1} P(\mathbf{X}_{[n]}(t) = x' \mid \mathbf{X}_{[n]}(0) = x)$,  $x \ne x' \in [k]^{[n]}$,

which, for every $n \in \mathbb{N}$, satisfy $Q_n(x, [k]^{[n]} \setminus \{x\}) < \infty$ for all $x \in [k]^{[n]}$ and are exchangeable in the sense that, for every $\sigma \in \mathfrak{S}_n$,

(30)    $Q_n(x, x') = Q_n(x^\sigma, x'^\sigma)$,  $x \ne x' \in [k]^{[n]}$.

We now proceed in much the same way as in the proof of Theorem 4.2, with the modification that the total jump rate of $\mathbf{X}$ can now be infinite. Let $\chi$ be an exchangeable measure on $\mathcal{M}_{[\infty]:k}$ which satisfies (7), and let $\mathbf{M}$ be a Poisson point process on $\mathbb{R}^+ \times \mathcal{M}_{[\infty]:k}$ with intensity $dt \otimes \chi$, which we use to construct $X^* := (X^*(t), t \ge 0)$ with jump measure $\chi$, as in (8).

Proposition 5.1. The collection $(X^*_{[n]}, n \in \mathbb{N})$ constructed by (8) is a consistent system of exchangeable Markov chains.

Proof. Each $X^*_{[n]}$ is clearly a Markov chain by assumption (7) and the memoryless property of the exponential distribution. Moreover, by construction, $R_m X^*_{[n]} = X^*_{[m]}$ for every $m \le n$, and so $(X^*_{[n]}, n \in \mathbb{N})$ is compatible. Exchangeability is now a consequence of Proposition 4.1 since, for each $n \in \mathbb{N}$, the image of $\chi$ by $M \mapsto M_{[n]}$ is a finite exchangeable measure on $\mathcal{M}_{[n]:k}$. It follows that $(X^*_{[n]}, n \in \mathbb{N})$ determines a unique exchangeable Markov process on $[k]^{\mathbb{N}}$.

We write $X^*$ to denote the unique Markov process determined by the system $(X^*_{[n]}, n \in \mathbb{N})$.

Proposition 5.2. The process $X^*$, with exchangeable jump measure $\chi$ satisfying (7), is an exchangeable Feller process.

Proof. Under the metric (14), every $M \in \mathcal{M}_{[\infty]:k}$ determines a Lipschitz continuous map $[k]^{\mathbb{N}} \to [k]^{\mathbb{N}}$. The Feller property is an easy consequence of this fact combined with condition (7).

Corollary 5.3. Every exchangeable measure $\chi$ on $\mathcal{M}_{[\infty]:k}$ satisfying (7) determines the jump rates of an exchangeable Feller process on $[k]^{\mathbb{N}}$.


More information

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9 MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended

More information

Bayesian nonparametrics

Bayesian nonparametrics Bayesian nonparametrics 1 Some preliminaries 1.1 de Finetti s theorem We will start our discussion with this foundational theorem. We will assume throughout all variables are defined on the probability

More information

Randomized Algorithms

Randomized Algorithms Randomized Algorithms Prof. Tapio Elomaa tapio.elomaa@tut.fi Course Basics A new 4 credit unit course Part of Theoretical Computer Science courses at the Department of Mathematics There will be 4 hours

More information

Stochastic Histories. Chapter Introduction

Stochastic Histories. Chapter Introduction Chapter 8 Stochastic Histories 8.1 Introduction Despite the fact that classical mechanics employs deterministic dynamical laws, random dynamical processes often arise in classical physics, as well as in

More information

Lecture 9 Classification of States

Lecture 9 Classification of States Lecture 9: Classification of States of 27 Course: M32K Intro to Stochastic Processes Term: Fall 204 Instructor: Gordan Zitkovic Lecture 9 Classification of States There will be a lot of definitions and

More information

Isomorphisms between pattern classes

Isomorphisms between pattern classes Journal of Combinatorics olume 0, Number 0, 1 8, 0000 Isomorphisms between pattern classes M. H. Albert, M. D. Atkinson and Anders Claesson Isomorphisms φ : A B between pattern classes are considered.

More information

Discrete Probability Refresher

Discrete Probability Refresher ECE 1502 Information Theory Discrete Probability Refresher F. R. Kschischang Dept. of Electrical and Computer Engineering University of Toronto January 13, 1999 revised January 11, 2006 Probability theory

More information

Mean-field dual of cooperative reproduction

Mean-field dual of cooperative reproduction The mean-field dual of systems with cooperative reproduction joint with Tibor Mach (Prague) A. Sturm (Göttingen) Friday, July 6th, 2018 Poisson construction of Markov processes Let (X t ) t 0 be a continuous-time

More information

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory Part V 7 Introduction: What are measures and why measurable sets Lebesgue Integration Theory Definition 7. (Preliminary). A measure on a set is a function :2 [ ] such that. () = 2. If { } = is a finite

More information

Markov Chains. As part of Interdisciplinary Mathematical Modeling, By Warren Weckesser Copyright c 2006.

Markov Chains. As part of Interdisciplinary Mathematical Modeling, By Warren Weckesser Copyright c 2006. Markov Chains As part of Interdisciplinary Mathematical Modeling, By Warren Weckesser Copyright c 2006 1 Introduction A (finite) Markov chain is a process with a finite number of states (or outcomes, or

More information

SMSTC (2007/08) Probability.

SMSTC (2007/08) Probability. SMSTC (27/8) Probability www.smstc.ac.uk Contents 12 Markov chains in continuous time 12 1 12.1 Markov property and the Kolmogorov equations.................... 12 2 12.1.1 Finite state space.................................

More information

Lecture Notes 7 Random Processes. Markov Processes Markov Chains. Random Processes

Lecture Notes 7 Random Processes. Markov Processes Markov Chains. Random Processes Lecture Notes 7 Random Processes Definition IID Processes Bernoulli Process Binomial Counting Process Interarrival Time Process Markov Processes Markov Chains Classification of States Steady State Probabilities

More information

Chapter 7. Markov chain background. 7.1 Finite state space

Chapter 7. Markov chain background. 7.1 Finite state space Chapter 7 Markov chain background A stochastic process is a family of random variables {X t } indexed by a varaible t which we will think of as time. Time can be discrete or continuous. We will only consider

More information

25.1 Ergodicity and Metric Transitivity

25.1 Ergodicity and Metric Transitivity Chapter 25 Ergodicity This lecture explains what it means for a process to be ergodic or metrically transitive, gives a few characterizes of these properties (especially for AMS processes), and deduces

More information

2.1 Elementary probability; random sampling

2.1 Elementary probability; random sampling Chapter 2 Probability Theory Chapter 2 outlines the probability theory necessary to understand this text. It is meant as a refresher for students who need review and as a reference for concepts and theorems

More information

Tree sets. Reinhard Diestel

Tree sets. Reinhard Diestel 1 Tree sets Reinhard Diestel Abstract We study an abstract notion of tree structure which generalizes treedecompositions of graphs and matroids. Unlike tree-decompositions, which are too closely linked

More information

MATHS 730 FC Lecture Notes March 5, Introduction

MATHS 730 FC Lecture Notes March 5, Introduction 1 INTRODUCTION MATHS 730 FC Lecture Notes March 5, 2014 1 Introduction Definition. If A, B are sets and there exists a bijection A B, they have the same cardinality, which we write as A, #A. If there exists

More information

ON MULTI-AVOIDANCE OF RIGHT ANGLED NUMBERED POLYOMINO PATTERNS

ON MULTI-AVOIDANCE OF RIGHT ANGLED NUMBERED POLYOMINO PATTERNS INTEGERS: ELECTRONIC JOURNAL OF COMBINATORIAL NUMBER THEORY 4 (2004), #A21 ON MULTI-AVOIDANCE OF RIGHT ANGLED NUMBERED POLYOMINO PATTERNS Sergey Kitaev Department of Mathematics, University of Kentucky,

More information

Stochastic processes. MAS275 Probability Modelling. Introduction and Markov chains. Continuous time. Markov property

Stochastic processes. MAS275 Probability Modelling. Introduction and Markov chains. Continuous time. Markov property Chapter 1: and Markov chains Stochastic processes We study stochastic processes, which are families of random variables describing the evolution of a quantity with time. In some situations, we can treat

More information

Dynamics of the evolving Bolthausen-Sznitman coalescent. by Jason Schweinsberg University of California at San Diego.

Dynamics of the evolving Bolthausen-Sznitman coalescent. by Jason Schweinsberg University of California at San Diego. Dynamics of the evolving Bolthausen-Sznitman coalescent by Jason Schweinsberg University of California at San Diego Outline of Talk 1. The Moran model and Kingman s coalescent 2. The evolving Kingman s

More information

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra. DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1

More information

Lecture 10. Theorem 1.1 [Ergodicity and extremality] A probability measure µ on (Ω, F) is ergodic for T if and only if it is an extremal point in M.

Lecture 10. Theorem 1.1 [Ergodicity and extremality] A probability measure µ on (Ω, F) is ergodic for T if and only if it is an extremal point in M. Lecture 10 1 Ergodic decomposition of invariant measures Let T : (Ω, F) (Ω, F) be measurable, and let M denote the space of T -invariant probability measures on (Ω, F). Then M is a convex set, although

More information

arxiv: v1 [math.co] 5 Apr 2019

arxiv: v1 [math.co] 5 Apr 2019 arxiv:1904.02924v1 [math.co] 5 Apr 2019 The asymptotics of the partition of the cube into Weyl simplices, and an encoding of a Bernoulli scheme A. M. Vershik 02.02.2019 Abstract We suggest a combinatorial

More information

ON COMPOUND POISSON POPULATION MODELS

ON COMPOUND POISSON POPULATION MODELS ON COMPOUND POISSON POPULATION MODELS Martin Möhle, University of Tübingen (joint work with Thierry Huillet, Université de Cergy-Pontoise) Workshop on Probability, Population Genetics and Evolution Centre

More information

Course 311: Michaelmas Term 2005 Part III: Topics in Commutative Algebra

Course 311: Michaelmas Term 2005 Part III: Topics in Commutative Algebra Course 311: Michaelmas Term 2005 Part III: Topics in Commutative Algebra D. R. Wilkins Contents 3 Topics in Commutative Algebra 2 3.1 Rings and Fields......................... 2 3.2 Ideals...............................

More information

18.175: Lecture 30 Markov chains

18.175: Lecture 30 Markov chains 18.175: Lecture 30 Markov chains Scott Sheffield MIT Outline Review what you know about finite state Markov chains Finite state ergodicity and stationarity More general setup Outline Review what you know

More information

The decomposability of simple orthogonal arrays on 3 symbols having t + 1 rows and strength t

The decomposability of simple orthogonal arrays on 3 symbols having t + 1 rows and strength t The decomposability of simple orthogonal arrays on 3 symbols having t + 1 rows and strength t Wiebke S. Diestelkamp Department of Mathematics University of Dayton Dayton, OH 45469-2316 USA wiebke@udayton.edu

More information

Stochastic Processes

Stochastic Processes qmc082.tex. Version of 30 September 2010. Lecture Notes on Quantum Mechanics No. 8 R. B. Griffiths References: Stochastic Processes CQT = R. B. Griffiths, Consistent Quantum Theory (Cambridge, 2002) DeGroot

More information

The Chromatic Number of Ordered Graphs With Constrained Conflict Graphs

The Chromatic Number of Ordered Graphs With Constrained Conflict Graphs The Chromatic Number of Ordered Graphs With Constrained Conflict Graphs Maria Axenovich and Jonathan Rollin and Torsten Ueckerdt September 3, 016 Abstract An ordered graph G is a graph whose vertex set

More information

Module 1. Probability

Module 1. Probability Module 1 Probability 1. Introduction In our daily life we come across many processes whose nature cannot be predicted in advance. Such processes are referred to as random processes. The only way to derive

More information

Section 9.2 introduces the description of Markov processes in terms of their transition probabilities and proves the existence of such processes.

Section 9.2 introduces the description of Markov processes in terms of their transition probabilities and proves the existence of such processes. Chapter 9 Markov Processes This lecture begins our study of Markov processes. Section 9.1 is mainly ideological : it formally defines the Markov property for one-parameter processes, and explains why it

More information

Foundations of Mathematics MATH 220 FALL 2017 Lecture Notes

Foundations of Mathematics MATH 220 FALL 2017 Lecture Notes Foundations of Mathematics MATH 220 FALL 2017 Lecture Notes These notes form a brief summary of what has been covered during the lectures. All the definitions must be memorized and understood. Statements

More information

EXCHANGEABLE COALESCENTS. Jean BERTOIN

EXCHANGEABLE COALESCENTS. Jean BERTOIN EXCHANGEABLE COALESCENTS Jean BERTOIN Nachdiplom Lectures ETH Zürich, Autumn 2010 The purpose of this series of lectures is to introduce and develop some of the main aspects of a class of random processes

More information

Chapter 2 Linear Transformations

Chapter 2 Linear Transformations Chapter 2 Linear Transformations Linear Transformations Loosely speaking, a linear transformation is a function from one vector space to another that preserves the vector space operations. Let us be more

More information

IRRATIONAL ROTATION OF THE CIRCLE AND THE BINARY ODOMETER ARE FINITARILY ORBIT EQUIVALENT

IRRATIONAL ROTATION OF THE CIRCLE AND THE BINARY ODOMETER ARE FINITARILY ORBIT EQUIVALENT IRRATIONAL ROTATION OF THE CIRCLE AND THE BINARY ODOMETER ARE FINITARILY ORBIT EQUIVALENT MRINAL KANTI ROYCHOWDHURY Abstract. Two invertible dynamical systems (X, A, µ, T ) and (Y, B, ν, S) where X, Y

More information

Chapter 16 focused on decision making in the face of uncertainty about one future

Chapter 16 focused on decision making in the face of uncertainty about one future 9 C H A P T E R Markov Chains Chapter 6 focused on decision making in the face of uncertainty about one future event (learning the true state of nature). However, some decisions need to take into account

More information

LECTURE 1. 1 Introduction. 1.1 Sample spaces and events

LECTURE 1. 1 Introduction. 1.1 Sample spaces and events LECTURE 1 1 Introduction The first part of our adventure is a highly selective review of probability theory, focusing especially on things that are most useful in statistics. 1.1 Sample spaces and events

More information

µ X (A) = P ( X 1 (A) )

µ X (A) = P ( X 1 (A) ) 1 STOCHASTIC PROCESSES This appendix provides a very basic introduction to the language of probability theory and stochastic processes. We assume the reader is familiar with the general measure and integration

More information

Statistical Inference on Large Contingency Tables: Convergence, Testability, Stability. COMPSTAT 2010 Paris, August 23, 2010

Statistical Inference on Large Contingency Tables: Convergence, Testability, Stability. COMPSTAT 2010 Paris, August 23, 2010 Statistical Inference on Large Contingency Tables: Convergence, Testability, Stability Marianna Bolla Institute of Mathematics Budapest University of Technology and Economics marib@math.bme.hu COMPSTAT

More information

1 Stochastic Dynamic Programming

1 Stochastic Dynamic Programming 1 Stochastic Dynamic Programming Formally, a stochastic dynamic program has the same components as a deterministic one; the only modification is to the state transition equation. When events in the future

More information

the time it takes until a radioactive substance undergoes a decay

the time it takes until a radioactive substance undergoes a decay 1 Probabilities 1.1 Experiments with randomness Wewillusethetermexperimentinaverygeneralwaytorefertosomeprocess that produces a random outcome. Examples: (Ask class for some first) Here are some discrete

More information

. Find E(V ) and var(v ).

. Find E(V ) and var(v ). Math 6382/6383: Probability Models and Mathematical Statistics Sample Preliminary Exam Questions 1. A person tosses a fair coin until she obtains 2 heads in a row. She then tosses a fair die the same number

More information

On Backward Product of Stochastic Matrices

On Backward Product of Stochastic Matrices On Backward Product of Stochastic Matrices Behrouz Touri and Angelia Nedić 1 Abstract We study the ergodicity of backward product of stochastic and doubly stochastic matrices by introducing the concept

More information

Notes on Measure Theory and Markov Processes

Notes on Measure Theory and Markov Processes Notes on Measure Theory and Markov Processes Diego Daruich March 28, 2014 1 Preliminaries 1.1 Motivation The objective of these notes will be to develop tools from measure theory and probability to allow

More information

CONSTRAINED PERCOLATION ON Z 2

CONSTRAINED PERCOLATION ON Z 2 CONSTRAINED PERCOLATION ON Z 2 ZHONGYANG LI Abstract. We study a constrained percolation process on Z 2, and prove the almost sure nonexistence of infinite clusters and contours for a large class of probability

More information

n [ F (b j ) F (a j ) ], n j=1(a j, b j ] E (4.1)

n [ F (b j ) F (a j ) ], n j=1(a j, b j ] E (4.1) 1.4. CONSTRUCTION OF LEBESGUE-STIELTJES MEASURES In this section we shall put to use the Carathéodory-Hahn theory, in order to construct measures with certain desirable properties first on the real line

More information

http://www.math.uah.edu/stat/markov/.xhtml 1 of 9 7/16/2009 7:20 AM Virtual Laboratories > 16. Markov Chains > 1 2 3 4 5 6 7 8 9 10 11 12 1. A Markov process is a random process in which the future is

More information

Near-Potential Games: Geometry and Dynamics

Near-Potential Games: Geometry and Dynamics Near-Potential Games: Geometry and Dynamics Ozan Candogan, Asuman Ozdaglar and Pablo A. Parrilo January 29, 2012 Abstract Potential games are a special class of games for which many adaptive user dynamics

More information

Functional Limit theorems for the quadratic variation of a continuous time random walk and for certain stochastic integrals

Functional Limit theorems for the quadratic variation of a continuous time random walk and for certain stochastic integrals Functional Limit theorems for the quadratic variation of a continuous time random walk and for certain stochastic integrals Noèlia Viles Cuadros BCAM- Basque Center of Applied Mathematics with Prof. Enrico

More information

Theorem 1.7 [Bayes' Law]: Assume that,,, are mutually disjoint events in the sample space s.t.. Then Pr( )

Theorem 1.7 [Bayes' Law]: Assume that,,, are mutually disjoint events in the sample space s.t.. Then Pr( ) Theorem 1.7 [Bayes' Law]: Assume that,,, are mutually disjoint events in the sample space s.t.. Then Pr Pr = Pr Pr Pr() Pr Pr. We are given three coins and are told that two of the coins are fair and the

More information

Random Processes. DS GA 1002 Probability and Statistics for Data Science.

Random Processes. DS GA 1002 Probability and Statistics for Data Science. Random Processes DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Aim Modeling quantities that evolve in time (or space)

More information

Notions such as convergent sequence and Cauchy sequence make sense for any metric space. Convergent Sequences are Cauchy

Notions such as convergent sequence and Cauchy sequence make sense for any metric space. Convergent Sequences are Cauchy Banach Spaces These notes provide an introduction to Banach spaces, which are complete normed vector spaces. For the purposes of these notes, all vector spaces are assumed to be over the real numbers.

More information

COMPLEXITY OF SHORT RECTANGLES AND PERIODICITY

COMPLEXITY OF SHORT RECTANGLES AND PERIODICITY COMPLEXITY OF SHORT RECTANGLES AND PERIODICITY VAN CYR AND BRYNA KRA Abstract. The Morse-Hedlund Theorem states that a bi-infinite sequence η in a finite alphabet is periodic if and only if there exists

More information

DRAFT MAA6616 COURSE NOTES FALL 2015

DRAFT MAA6616 COURSE NOTES FALL 2015 Contents 1. σ-algebras 2 1.1. The Borel σ-algebra over R 5 1.2. Product σ-algebras 7 2. Measures 8 3. Outer measures and the Caratheodory Extension Theorem 11 4. Construction of Lebesgue measure 15 5.

More information

LIMITATION OF LEARNING RANKINGS FROM PARTIAL INFORMATION. By Srikanth Jagabathula Devavrat Shah

LIMITATION OF LEARNING RANKINGS FROM PARTIAL INFORMATION. By Srikanth Jagabathula Devavrat Shah 00 AIM Workshop on Ranking LIMITATION OF LEARNING RANKINGS FROM PARTIAL INFORMATION By Srikanth Jagabathula Devavrat Shah Interest is in recovering distribution over the space of permutations over n elements

More information

Gibbs distributions for random partitions generated by a fragmentation process

Gibbs distributions for random partitions generated by a fragmentation process Gibbs distributions for random partitions generated by a fragmentation process Nathanaël Berestycki and Jim Pitman University of British Columbia, and Department of Statistics, U.C. Berkeley July 24, 2006

More information

The Game of Normal Numbers

The Game of Normal Numbers The Game of Normal Numbers Ehud Lehrer September 4, 2003 Abstract We introduce a two-player game where at each period one player, say, Player 2, chooses a distribution and the other player, Player 1, a

More information

Recap. Probability, stochastic processes, Markov chains. ELEC-C7210 Modeling and analysis of communication networks

Recap. Probability, stochastic processes, Markov chains. ELEC-C7210 Modeling and analysis of communication networks Recap Probability, stochastic processes, Markov chains ELEC-C7210 Modeling and analysis of communication networks 1 Recap: Probability theory important distributions Discrete distributions Geometric distribution

More information

Introduction and Preliminaries

Introduction and Preliminaries Chapter 1 Introduction and Preliminaries This chapter serves two purposes. The first purpose is to prepare the readers for the more systematic development in later chapters of methods of real analysis

More information

A PECULIAR COIN-TOSSING MODEL

A PECULIAR COIN-TOSSING MODEL A PECULIAR COIN-TOSSING MODEL EDWARD J. GREEN 1. Coin tossing according to de Finetti A coin is drawn at random from a finite set of coins. Each coin generates an i.i.d. sequence of outcomes (heads or

More information

P-adic Functions - Part 1

P-adic Functions - Part 1 P-adic Functions - Part 1 Nicolae Ciocan 22.11.2011 1 Locally constant functions Motivation: Another big difference between p-adic analysis and real analysis is the existence of nontrivial locally constant

More information

ECE534, Spring 2018: Solutions for Problem Set #4 Due Friday April 6, 2018

ECE534, Spring 2018: Solutions for Problem Set #4 Due Friday April 6, 2018 ECE534, Spring 2018: s for Problem Set #4 Due Friday April 6, 2018 1. MMSE Estimation, Data Processing and Innovations The random variables X, Y, Z on a common probability space (Ω, F, P ) are said to

More information

Foundations of Nonparametric Bayesian Methods

Foundations of Nonparametric Bayesian Methods 1 / 27 Foundations of Nonparametric Bayesian Methods Part II: Models on the Simplex Peter Orbanz http://mlg.eng.cam.ac.uk/porbanz/npb-tutorial.html 2 / 27 Tutorial Overview Part I: Basics Part II: Models

More information

3. The Voter Model. David Aldous. June 20, 2012

3. The Voter Model. David Aldous. June 20, 2012 3. The Voter Model David Aldous June 20, 2012 We now move on to the voter model, which (compared to the averaging model) has a more substantial literature in the finite setting, so what s written here

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 1 x 2. x n 8 (4) 3 4 2

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 1 x 2. x n 8 (4) 3 4 2 MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS SYSTEMS OF EQUATIONS AND MATRICES Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

More information

Dynamic Programming Lecture #4

Dynamic Programming Lecture #4 Dynamic Programming Lecture #4 Outline: Probability Review Probability space Conditional probability Total probability Bayes rule Independent events Conditional independence Mutual independence Probability

More information

Stochastic Realization of Binary Exchangeable Processes

Stochastic Realization of Binary Exchangeable Processes Stochastic Realization of Binary Exchangeable Processes Lorenzo Finesso and Cecilia Prosdocimi Abstract A discrete time stochastic process is called exchangeable if its n-dimensional distributions are,

More information

1. Quivers and their representations: Basic definitions and examples.

1. Quivers and their representations: Basic definitions and examples. 1 Quivers and their representations: Basic definitions and examples 11 Quivers A quiver Q (sometimes also called a directed graph) consists of vertices and oriented edges (arrows): loops and multiple arrows

More information

WXML Final Report: Chinese Restaurant Process

WXML Final Report: Chinese Restaurant Process WXML Final Report: Chinese Restaurant Process Dr. Noah Forman, Gerandy Brito, Alex Forney, Yiruey Chou, Chengning Li Spring 2017 1 Introduction The Chinese Restaurant Process (CRP) generates random partitions

More information

Lecture 11: Introduction to Markov Chains. Copyright G. Caire (Sample Lectures) 321

Lecture 11: Introduction to Markov Chains. Copyright G. Caire (Sample Lectures) 321 Lecture 11: Introduction to Markov Chains Copyright G. Caire (Sample Lectures) 321 Discrete-time random processes A sequence of RVs indexed by a variable n 2 {0, 1, 2,...} forms a discretetime random process

More information

Linear Algebra March 16, 2019

Linear Algebra March 16, 2019 Linear Algebra March 16, 2019 2 Contents 0.1 Notation................................ 4 1 Systems of linear equations, and matrices 5 1.1 Systems of linear equations..................... 5 1.2 Augmented

More information

Near-Potential Games: Geometry and Dynamics

Near-Potential Games: Geometry and Dynamics Near-Potential Games: Geometry and Dynamics Ozan Candogan, Asuman Ozdaglar and Pablo A. Parrilo September 6, 2011 Abstract Potential games are a special class of games for which many adaptive user dynamics

More information

6.842 Randomness and Computation March 3, Lecture 8

6.842 Randomness and Computation March 3, Lecture 8 6.84 Randomness and Computation March 3, 04 Lecture 8 Lecturer: Ronitt Rubinfeld Scribe: Daniel Grier Useful Linear Algebra Let v = (v, v,..., v n ) be a non-zero n-dimensional row vector and P an n n

More information

Increments of Random Partitions

Increments of Random Partitions Increments of Random Partitions Şerban Nacu January 2, 2004 Abstract For any partition of {1, 2,...,n} we define its increments X i, 1 i n by X i =1ifi is the smallest element in the partition block that

More information

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3 Brownian Motion Contents 1 Definition 2 1.1 Brownian Motion................................. 2 1.2 Wiener measure.................................. 3 2 Construction 4 2.1 Gaussian process.................................

More information

MAGIC010 Ergodic Theory Lecture Entropy

MAGIC010 Ergodic Theory Lecture Entropy 7. Entropy 7. Introduction A natural question in mathematics is the so-called isomorphism problem : when are two mathematical objects of the same class the same (in some appropriately defined sense of

More information

GENERALIZED COVARIATION FOR BANACH SPACE VALUED PROCESSES, ITÔ FORMULA AND APPLICATIONS

GENERALIZED COVARIATION FOR BANACH SPACE VALUED PROCESSES, ITÔ FORMULA AND APPLICATIONS Di Girolami, C. and Russo, F. Osaka J. Math. 51 (214), 729 783 GENERALIZED COVARIATION FOR BANACH SPACE VALUED PROCESSES, ITÔ FORMULA AND APPLICATIONS CRISTINA DI GIROLAMI and FRANCESCO RUSSO (Received

More information