The Poisson-Dirichlet Distribution: Constructions, Stochastic Dynamics and Asymptotic Behavior Shui Feng McMaster University June 26-30, 2011. The 8th Workshop on Bayesian Nonparametrics, Veracruz, Mexico. Typeset by FoilTEX 1
Part I: Definitions and Models
  GEM Distribution, Poisson-Dirichlet Distribution and Dirichlet Process
  An Urn Model
  Dirichlet Distribution Derivation
  Subordinator Representation
  Gamma-Dirichlet Algebra

Part II: Stochastic Dynamics
  Wright-Fisher Model
  Infinitely-Many-Alleles Model
  Fleming-Viot Process
  Dynamical Analogue of the Gamma-Dirichlet Algebra

Part III: Asymptotics
  Sampling Formula and Large Sample Approximation
  Large θ Approximation
Part I: Definitions and Models

1. GEM Distribution, Poisson-Dirichlet Distribution and Dirichlet Process

Definition 1.1 For $0 \le \alpha < 1$, $\theta > -\alpha$, let $U_1, U_2, \ldots$ be independent with $U_i \sim \mathrm{Beta}(1-\alpha, \theta + i\alpha)$. Set
$$V_1 = U_1, \qquad V_n = (1-U_1)\cdots(1-U_{n-1})\,U_n, \quad n \ge 2.$$
The law of $(V_1, V_2, \ldots)$ is called the GEM distribution, denoted by $GEM(\alpha, \theta)$.

Definition 1.2 The law of the descending order statistics of $V_1, V_2, \ldots$ is called the two-parameter Poisson-Dirichlet distribution, denoted by $PD(\alpha, \theta)$. The case $\alpha = 0$ corresponds to Kingman's Poisson-Dirichlet distribution.
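The stick-breaking recipe of Definition 1.1 translates almost line by line into code. The following is a minimal sketch, not part of the original slides: the function names `sample_gem` and `ranked` are illustrative, and only the first `n_terms` weights are generated, so the ranked output is a truncated approximation of a $PD(\alpha, \theta)$ draw rather than an exact one.

```python
import random

def sample_gem(alpha, theta, n_terms, rng):
    """Return the first n_terms weights (V_1, ..., V_n) of one GEM(alpha, theta) draw."""
    if not (0 <= alpha < 1 and theta > -alpha):
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    weights = []
    remaining = 1.0  # (1 - U_1) ... (1 - U_{i-1}): the part of the stick not yet assigned
    for i in range(1, n_terms + 1):
        u = rng.betavariate(1.0 - alpha, theta + i * alpha)  # U_i ~ Beta(1 - alpha, theta + i alpha)
        weights.append(remaining * u)                        # V_i
        remaining *= 1.0 - u
    return weights

def ranked(weights):
    """Descending order statistics of the (truncated) weight sequence."""
    return sorted(weights, reverse=True)
```

For example, `ranked(sample_gem(0.5, 1.0, 200, random.Random(0)))` gives the 200 generated weights in decreasing order; the mass left in `remaining` is what the truncation discards.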
Definition 1.3 Let $S$ be a Polish space and let $\xi_1, \xi_2, \ldots$ be iid with common diffuse law $\nu_0$ on $S$. Independently, let $(P_1(\alpha,\theta), P_2(\alpha,\theta), \ldots)$ follow the two-parameter Poisson-Dirichlet distribution. The random measure on $S$
$$\Xi_{\alpha,\theta,\nu_0}(dx) = \sum_{i=1}^{\infty} P_i(\alpha,\theta)\,\delta_{\xi_i}(dx)$$
is called the two-parameter Dirichlet process.

Let
$$\Delta_\infty = \Big\{x = (x_1, x_2, \ldots) \in [0,1]^\infty : 0 \le x_i \le 1,\ \sum_{i=1}^{\infty} x_i \le 1\Big\},$$
$$\nabla_\infty = \{(x_1, x_2, \ldots) \in \Delta_\infty : x_1 \ge x_2 \ge \cdots \ge 0\},$$
$$M_1(S) = \text{the set of all probabilities on } S.$$
Then the GEM distribution is a probability on $\Delta_\infty$, $PD(\alpha, \theta)$ is a probability on $\nabla_\infty$, and the Dirichlet process is a probability on $M_1(S)$.
2. An Urn Model

Consider an urn that initially contains a black ball of mass $\theta$. Balls are drawn from the urn successively with probabilities proportional to their masses. When a black ball is drawn, it is returned to the urn together with a black ball of mass $\alpha$ and a ball of a new color with mass $1 - \alpha$. If a non-black ball is drawn, it is returned to the urn with one additional ball of mass one with the same color. Colors are labelled $1, 2, 3, \ldots$ in the order of appearance.
For each $n \ge 1$, let $C_i(n)$ denote the number of non-black balls with label $i$, $1 \le i \le n$, after $n$ draws. Then, as $n \to \infty$,
$$\Big(\frac{C_1(n)}{n}, \frac{C_2(n)}{n}, \ldots, \frac{C_n(n)}{n}, 0, 0, \ldots\Big) \to (V_1, V_2, \ldots) \quad \text{in distribution.}$$
Similarly, let $C_{[1]}(n) \ge C_{[2]}(n) \ge \cdots$ denote the descending order statistics of $C_1(n), C_2(n), \ldots$. Then
$$\Big(\frac{C_{[1]}(n)}{n}, \frac{C_{[2]}(n)}{n}, \ldots, \frac{C_{[n]}(n)}{n}, 0, 0, \ldots\Big) \to (P_1(\alpha,\theta), P_2(\alpha,\theta), \ldots) \quad \text{in distribution.}$$
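The urn scheme can be simulated directly. The sketch below is illustrative only (the name `simulate_urn` is not from the slides) and assumes $\theta > 0$, so that the initial black ball has positive mass. It uses the bookkeeping implied by the rules: after $k$ colors have appeared the black mass is $\theta + k\alpha$, and a color carried by $c$ balls has mass $c - \alpha$.

```python
import random

def simulate_urn(n_draws, alpha, theta, rng):
    """Return [C_1(n), ..., C_k(n)]: number of balls of each color after n_draws draws."""
    counts = []  # counts[i] = number of non-black balls carrying label i + 1
    for _ in range(n_draws):
        k = len(counts)
        black_mass = theta + k * alpha    # theta plus one mass-alpha black ball per color
        total_mass = theta + sum(counts)  # black mass plus sum over colors of (count - alpha)
        u = rng.random() * total_mass
        if k == 0 or u < black_mass:
            counts.append(1)              # black drawn: a new color enters with mass 1 - alpha
            continue
        u -= black_mass
        chosen = k - 1                    # fallback guards against floating-point rounding
        for i, c in enumerate(counts):
            u -= c - alpha                # mass of color i + 1
            if u < 0:
                chosen = i
                break
        counts[chosen] += 1               # same-color ball of mass one is added
    return counts
```

Dividing the returned counts by `n_draws` gives the frequency vector $(C_1(n)/n, C_2(n)/n, \ldots)$; sorting it in decreasing order gives the ranked version.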
3. Dirichlet Distribution Derivation

For any $n \ge 2$, let $(X^n_1, \ldots, X^n_n)$ be a Dirichlet$(a_1, \ldots, a_n)$ random vector with order statistics $(X^n_{[1]}, \ldots, X^n_{[n]})$. Assume
$$\max\{a_1, \ldots, a_n\} \to 0, \qquad \sum_{i=1}^{n} a_i \to \theta, \qquad n \to \infty.$$
Then
$$(X^n_{[1]}, \ldots, X^n_{[n]}, 0, 0, \ldots) \to (P_1(0,\theta), P_2(0,\theta), \ldots) \quad \text{in distribution.}$$
This derivation explains the name Poisson-Dirichlet and works for the case of $\alpha = 0$.
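One concrete choice satisfying the two assumptions is the symmetric case $a_i = \theta/n$, for which $\max a_i = \theta/n \to 0$ and $\sum a_i = \theta$. The sketch below (the function name is illustrative, not from the slides) draws the ranked Dirichlet vector by normalizing independent Gamma variables, a standard way to sample a Dirichlet distribution.

```python
import random

def ranked_symmetric_dirichlet(n, theta, rng):
    """Ranked draw from Dirichlet(theta/n, ..., theta/n) via normalized Gamma variables."""
    gammas = [rng.gammavariate(theta / n, 1.0) for _ in range(n)]  # Gamma(shape=theta/n, scale=1)
    total = sum(gammas)
    return sorted((g / total for g in gammas), reverse=True)
```

For large `n` the leading coordinates of the output serve as an approximate draw of $(P_1(0,\theta), P_2(0,\theta), \ldots)$.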
4. Subordinator Representation

Definition 4.1 A process $\{\tau_s : s \ge 0\}$ is called a subordinator if it has stationary, independent, and non-negative increments with $\tau_0 = 0$.

Definition 4.2 A subordinator $\{\tau_s : s \ge 0\}$ has no drift if for any $\lambda \ge 0$, $s \ge 0$,
$$E[e^{-\lambda \tau_s}] = \exp\Big\{-s \int_0^\infty (1 - e^{-\lambda x})\,\Lambda(dx)\Big\},$$
where $\Lambda$ is the Lévy measure on $[0, \infty)$.

Example 4.1 Poisson process $\{N_s, s \ge 0\}$ with parameter $c > 0$:
$$E[e^{-\lambda N_s}] = \exp\{-cs(1 - e^{-\lambda})\}.$$
The corresponding Lévy measure is $\Lambda(dx) = c\,\delta_1(dx)$.
Example 4.2 The subordinator $\{\gamma_s : s \ge 0\}$ is called a Gamma subordinator if its Lévy measure is
$$\Lambda(dx) = x^{-1} e^{-x}\,dx, \quad x > 0.$$
In this case,
$$E[e^{-\lambda \gamma_s}] = \exp\Big\{-s \int_0^\infty (1 - e^{-\lambda x})\,x^{-1} e^{-x}\,dx\Big\} = \frac{1}{(1+\lambda)^s}.$$

Example 4.3 Stable subordinator $\{\rho_s : s \ge 0\}$ with index $\alpha \in (0, 1)$. Its Lévy measure is
$$\Lambda(dx) = \alpha x^{-(1+\alpha)}\,dx, \quad x > 0.$$
In this case,
$$E[e^{-\lambda \rho_s}] = \exp\{-s\,\Gamma(1-\alpha)\,\lambda^\alpha\}.$$
Example 4.4 The subordinator $\{\varrho_s : s \ge 0\}$ is a generalized Gamma process with scale parameter one (Brix (99), Lijoi, Mena and Prünster (07)) if its Lévy measure is
$$\Lambda(dx) = \Gamma(1-\alpha)^{-1} x^{-(1+\alpha)} e^{-x}\,dx, \quad x > 0.$$
In this case,
$$E[e^{-\lambda \varrho_s}] = \exp\Big\{-\frac{s}{\alpha}\big((\lambda+1)^\alpha - 1\big)\Big\}.$$

In general, let $\{\tau_s : s \ge 0\}$ be a drift-free subordinator with Lévy measure $\Lambda$ satisfying

(i) $\Lambda(0, \infty) = +\infty$,
(ii) $\int_0^\infty (x \wedge 1)\,\Lambda(dx) < +\infty$.

Remark: It follows from Campbell's theorem that condition (ii) guarantees that for every $t > 0$, $\tau_t < \infty$ almost surely.
Let $V_1(\tau_t) \ge V_2(\tau_t) \ge \cdots$ denote the jump sizes of $\{\tau_s : s \ge 0\}$ up to time $t$, ranked by size. Then the sequence is infinite due to (i), and
$$\tau_t = \sum_{i=1}^{\infty} V_i(\tau_t) < \infty$$
due to (ii). Set
$$U_i(\tau_t) = \frac{V_i(\tau_t)}{\tau_t}.$$
Then $(U_1(\tau_t), U_2(\tau_t), \ldots)$ forms a random discrete probability.
It follows from direct calculation that the Lévy measures of the Gamma subordinator, the stable subordinator and the standard generalized Gamma process satisfy both (i) and (ii).

Theorem 1. (Kingman (75), Pitman and Yor (97))

(1) For $\theta > 0$, the law of $(U_1(\gamma_\theta), U_2(\gamma_\theta), \ldots)$ is $PD(0, \theta)$;

(2) For $0 < \alpha < 1$, the law of $(U_1(\rho_s), U_2(\rho_s), \ldots)$ is the same for all positive $s$ and is $PD(\alpha, 0)$;

(3) For $0 < \alpha < 1$, $\theta > 0$, the law of $(U_1(\sigma_{\alpha,\theta}), U_2(\sigma_{\alpha,\theta}), \ldots)$ is $PD(\alpha, \theta)$, where
$$\sigma_{\alpha,\theta} = \varrho_{\gamma(\theta/\alpha)/\Gamma(1-\alpha)} =: \varrho\Big(\frac{\gamma(\theta/\alpha)}{\Gamma(1-\alpha)}\Big).$$
Example 4.5 (Brownian Motion). Let $B_t$ be the standard Brownian motion. Define $Z = \{t \ge 0 : B_t = 0\}$ and let $L_t$ denote the local time of $B_t$ at zero. Set
$$\tau_s = \inf\{t \ge 0 : L_t > s\}.$$
Then $\{\tau_s : s \ge 0\}$ is a stable subordinator with index $1/2$ and $Z$ is the closure of the range of $\{\tau_s, s \ge 0\}$. Thus the length of an excursion interval of $B_t$ corresponds to a jump size of the stable subordinator.

For any $t > 0$, let $V_1(t) > V_2(t) > \cdots$ be the ranked sequence of the excursion lengths up to time $t$, including the meander (the last interval) length. Then the law of
$$\Big(\frac{V_1(t)}{t}, \frac{V_2(t)}{t}, \ldots\Big)$$
is $PD(1/2, 0)$.
5. Gamma-Dirichlet Algebra

The focus here will be on the case of $\alpha = 0$. A more thorough treatment of the general case can be found in James (03, 05a, 05b).

Let $S$ be a Polish space, $\nu_0$ a probability on $S$, and $\theta$ and $\beta$ any two positive numbers.

Definition: The Gamma process with shape parameter $\theta\nu_0$ and scale parameter $\beta$ is a random measure on $S$ given by
$$\Upsilon^\beta_{\theta,\nu_0}(\cdot) = \beta \sum_{i=1}^{\infty} \gamma_i\,\delta_{\xi_i}(\cdot),$$
where $\gamma_1 > \gamma_2 > \cdots$ are the points of the inhomogeneous Poisson point process on $(0, \infty)$ with mean measure $\theta x^{-1} e^{-x}\,dx$, and independently, $\xi_1, \xi_2, \ldots$ are i.i.d. with common distribution $\nu_0$.
Denote the law of $\Upsilon^\beta_{\theta,\nu_0}$ by $\Gamma^\beta_{\theta,\nu_0}$. The corresponding Laplace functional has the form
$$\int_{M(S)} e^{-\langle \mu, g\rangle}\,\Gamma^\beta_{\theta,\nu_0}(d\mu) = \exp\{-\theta\,\langle \nu_0, \log(1 + \beta g)\rangle\},$$
where $M(S)$ is the set of all non-negative finite measures on $S$, and $g(s) > -1/\beta$ for all $s \in S$.
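Set
$$\sigma = \sum_{i=1}^{\infty} \gamma_i, \qquad P_i = \frac{\gamma_i}{\sigma},\ i = 1, 2, \ldots, \qquad X_{\theta,\nu_0}(\cdot) = \sum_{i=1}^{\infty} P_i\,\delta_{\xi_i}(\cdot).$$
The law of $(P_1, P_2, \ldots)$ is the Poisson-Dirichlet distribution with parameter $\theta$, $X_{\theta,\nu_0}(\cdot)$ is equal in distribution to the Dirichlet process $\Xi_{0,\theta,\nu_0}$, with law denoted by $\Pi_{\theta,\nu_0}$, and
$$X_{\theta,\nu_0}(\cdot) = \frac{\Upsilon^\beta_{\theta,\nu_0}(\cdot)}{\Upsilon^\beta_{\theta,\nu_0}(S)}. \qquad (1)$$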
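Algebraic Relations

1. Additive property: for independent $\Upsilon^\beta_{\theta_1,\nu_1}$ and $\Upsilon^\beta_{\theta_2,\nu_2}$,
$$\Upsilon^\beta_{\theta_1,\nu_1} + \Upsilon^\beta_{\theta_2,\nu_2} \overset{d}{=} \Upsilon^\beta_{\theta_1+\theta_2,\ \frac{\theta_1}{\theta_1+\theta_2}\nu_1 + \frac{\theta_2}{\theta_1+\theta_2}\nu_2},$$
where $\overset{d}{=}$ denotes equality in distribution.

2. Mixing: for independent $\eta \sim \mathrm{Beta}(\theta_1, \theta_2)$, $X_{\theta_1,\nu_1}$, and $X_{\theta_2,\nu_2}$,
$$\eta X_{\theta_1,\nu_1} + (1-\eta) X_{\theta_2,\nu_2} \overset{d}{=} X_{\theta_1+\theta_2,\ \frac{\theta_1}{\theta_1+\theta_2}\nu_1 + \frac{\theta_2}{\theta_1+\theta_2}\nu_2}.$$

3. Markov-Krein identity:
$$\int_{M_1(S)} (1 + \lambda\langle \nu, f\rangle)^{-\theta}\,\Pi_{\theta,\nu_0}(d\nu) = \exp\{-\theta\,\langle \nu_0, \log(1 + \lambda f)\rangle\}.$$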
Formal Hamiltonian

Consider an abstract space $\Omega$ with a formal reference probability measure $P$ (uniform or invariant in some sense). The formal Hamiltonian $H(\omega)$ is a function associated with another probability $Q$ such that
$$Q(d\omega) = Z^{-1} \exp\{-H(\omega)\}\,P(d\omega).$$
For each $\mu \in M(S)$, set
$$\hat\mu(\cdot) = \frac{\mu(\cdot)}{\mu(S)} \in M_1(S).$$
Let
$$\phi(x) = x\log x - (x - 1), \quad x \ge 0,$$
and for any $\nu_1, \nu_2 \in M_1(S)$,
$$\mathrm{Ent}(\nu_1 \mid \nu_2) = \begin{cases} \int_S \log\dfrac{d\nu_1}{d\nu_2}\,d\nu_1, & \nu_1 \ll \nu_2, \\ +\infty, & \text{else.} \end{cases}$$
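Hamiltonian for the Gamma process $\Gamma^\beta_{\theta,\nu_0}$ (Handa(01)):
$$H_g(\mu) = \theta\,\mathrm{Ent}(\nu_0 \mid \hat\mu) + \frac{\mu(S)}{\beta}\,\phi\Big(\frac{\beta\theta}{\mu(S)}\Big) = \text{angular component} + \text{radial component},$$
where $\hat\mu = \mu/\mu(S)$.

Hamiltonian for the Dirichlet process $\Pi_{\theta,\nu_0}$ (Dawson and F(98), Handa(01)):
$$H_d(\nu) = \theta\,\mathrm{Ent}(\nu_0 \mid \nu), \quad \nu \in M_1(S).$$
For $\beta\theta = 1$, we have
$$H_g(\mu)\big|_{\mu(S)=1} = H_d(\hat\mu).$$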
Quasi-invariant

Gamma Process $\Gamma^\beta_{\theta,\nu_0}$ (Tsilevich, Vershik and Yor (01)): Let $B_+(S)$ be the collection of positive Borel measurable functions on $S$ with strictly positive lower bound. For each $f$ in $B_+(S)$, set
$$T_f(\mu)(dx) = f(x)\,\mu(dx), \quad \mu \in M(S),$$
and let $T_f(\Gamma^\beta_{\theta,\nu_0})$ denote the image law of $\Gamma^\beta_{\theta,\nu_0}$ under $T_f$. Then $T_f(\Gamma^\beta_{\theta,\nu_0})$ and $\Gamma^\beta_{\theta,\nu_0}$ are mutually absolutely continuous and
$$\frac{d\,T_f(\Gamma^\beta_{\theta,\nu_0})}{d\,\Gamma^\beta_{\theta,\nu_0}}(\mu) = \exp\big\{-\big[\theta\,\langle \nu_0, \log f\rangle + \langle \mu, \beta^{-1}(f^{-1} - 1)\rangle\big]\big\}.$$
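Quasi-invariant (cont.)

Dirichlet Process $\Pi_{\theta,\nu_0}$ (Handa (01)): For each $f$ in $B_+(S)$, set
$$T_f(\nu)(dx) = \frac{f(x)\,\nu(dx)}{\langle \nu, f\rangle}, \quad \nu \in M_1(S),$$
where $\langle \nu, f\rangle$ denotes the integration of $f$ with respect to $\nu$. Let $T_f(\Pi_{\theta,\nu_0})$ denote the image law of $\Pi_{\theta,\nu_0}$ under $T_f$. Then $T_f(\Pi_{\theta,\nu_0})$ and $\Pi_{\theta,\nu_0}$ are mutually absolutely continuous and
$$\frac{d\,T_f(\Pi_{\theta,\nu_0})}{d\,\Pi_{\theta,\nu_0}}(\nu) = \exp\big\{-\theta\big[\langle \nu_0, \log f\rangle + \log\langle \nu, f^{-1}\rangle\big]\big\}.$$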
Part II: Stochastic Dynamics

1. Wright-Fisher Model

Consider a population of $2N$ individuals. Each individual is one of two types: $A$ or $a$. The population evolves under the influence of mutation and random sampling.

Mutation: the two types mutate to each other at rate $u$.

Random sampling: each individual in the next generation is sampled at random, with replacement, from the current population.
Wright-Fisher Model (cont.)

The distribution of types of the new generation is thus binomial with parameters $2N$ and $p$, where $p$ is the proportion of type $A$ individuals in the population after the mutation.

Let $X_N(t)$ denote the proportion of type $A$ individuals in the population at time (generation) $t$. Then $X_N(t)$ is the two-type Wright-Fisher Markov chain.

Diffusion approximation: Count the time in units of $2N$ generations and scale the mutation rate by a factor of $1/2N$. The scaling limit of the proportion of type $A$ individuals then follows the SDE
$$dx_t = \frac{\theta}{4}(1 - 2x_t)\,dt + \sqrt{x_t(1 - x_t)}\,dB_t, \qquad \theta = 4Nu.$$
The unique stationary distribution of the SDE is $\mathrm{Beta}(\theta/2, \theta/2)$.
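The two-type chain described above (mutation, then binomial resampling) can be sketched as follows. This is an illustration rather than the author's code: the function name and arguments are hypothetical, `n_pop` stands for $2N$, and mutation is modelled as each individual switching type with probability `u` per generation, which is one reading of "mutate to each other at rate $u$".

```python
import random

def wright_fisher_path(n_pop, u, x0, generations, rng):
    """Proportion of type A over time in a population of n_pop = 2N individuals."""
    x = x0
    path = [x]
    for _ in range(generations):
        # Proportion of type A after symmetric mutation (assumed per-generation probability u).
        p = x * (1.0 - u) + (1.0 - x) * u
        # Binomial(2N, p) resampling, written as a sum of independent Bernoulli(p) draws.
        n_a = sum(1 for _j in range(n_pop) if rng.random() < p)
        x = n_a / n_pop
        path.append(x)
    return path
```

A call such as `wright_fisher_path(200, 0.005, 0.5, 1000, random.Random(0))` returns the trajectory $X_N(0), \ldots, X_N(1000)$; under the scaling in the text this corresponds to $\theta = 4Nu = 2$.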
Finite Type Wright-Fisher Model

Finite type models: the Wright-Fisher model allows the number of types to be any finite number $K$.

Mutation: symmetric parent independent, i.e., type $i$ to type $j$ with rate $u/K$.

Sampling: multinomial sampling.

Finite type diffusion approximation:
$$dx_i(t) = b_i(x(t))\,dt + \sum_{j=1}^{K-1} \sigma_{ij}(x(t))\,dB_j(t)$$
with
$$b_i(x(t)) = \frac{\theta}{2}\Big(\frac{1}{K} - x_i(t)\Big),$$
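and
$$\sum_{l=1}^{K-1} \sigma_{il}(x(t))\,\sigma_{jl}(x(t)) = x_i(t)\big(\delta_{ij} - x_j(t)\big).$$
The unique stationary distribution is Dirichlet$(\theta/K, \ldots, \theta/K)$.

The generator of the diffusion has the form
$$\frac{1}{2}\sum_{i,j=1}^{K} x_i(\delta_{ij} - x_j)\,\frac{\partial^2}{\partial x_i\,\partial x_j} + \frac{\theta}{2}\sum_{i=1}^{K}\Big(\frac{1}{K} - x_i\Big)\frac{\partial}{\partial x_i}.$$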
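2. Infinitely-Many-Alleles Model

For any $n \ge 1$, let
$$\phi_1(x) = 1, \qquad \phi_n(x) = \sum_{i=1}^{\infty} x_i^n, \quad n \ge 2,$$
for $x = (x_1, x_2, \ldots)$ a ranked sequence $x_1 \ge x_2 \ge \cdots \ge 0$ with $\sum_i x_i \le 1$, and
$$D_0 = \text{algebra generated by } \{\phi_n : n \ge 1\}.$$
For $f \in D_0$, set
$$L_{\alpha,\theta} f(x) = \frac{1}{2}\Big[\sum_{i,j=1}^{\infty} x_i(\delta_{ij} - x_j)\,\frac{\partial^2 f}{\partial x_i\,\partial x_j} - \sum_{i=1}^{\infty} (\theta x_i + \alpha)\,\frac{\partial f}{\partial x_i}\Big].$$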
Theorem 2. (Petrov(09))

(1) The generator $L_{\alpha,\theta}$ defined on $D_0$ is closable in $C(\nabla_\infty)$, the space of continuous functions on the space $\nabla_\infty$ of ranked sequences $x_1 \ge x_2 \ge \cdots \ge 0$ with $\sum_i x_i \le 1$. The closure, also denoted by $L_{\alpha,\theta}$ for notational simplicity, generates a unique $\nabla_\infty$-valued diffusion process $X_{\alpha,\theta}(t)$, the two-parameter infinite-allele diffusion process;

(2) The process $X_{\alpha,\theta}(t)$ is reversible with respect to $PD(\alpha, \theta)$.

Remarks:

(a) Theorem 2 is a generalization of the results in Ethier and Kurtz (81) and Ethier (92), where $\alpha = 0$. Ruggiero and Walker (09) provide an alternate derivation of the infinite-allele diffusion process. The same model is studied in F and Sun (10) using techniques from the theory of Dirichlet forms.

(b) Mimicking the derivation of the Poisson-Dirichlet distribution from the Dirichlet distributions, the case of $\alpha = 0$ can be derived from the finite type Wright-Fisher diffusions by letting $K$ tend to infinity.
3. Fleming-Viot Process

Let $S$ be a compact metric space, $C(S)$ be the set of continuous functions on $S$, and $\nu_0$ a diffuse probability in $M_1(S)$. Consider the operator $A$ of the form
$$Af(x) = \frac{\theta}{2}\int_S (f(y) - f(x))\,\nu_0(dy), \quad f \in C(S).$$
Define
$$D = \{u : u(\mu) = f(\langle \mu, \phi\rangle),\ f \in C_b^\infty(\mathbb{R}),\ \phi \in C(S),\ \mu \in M_1(S)\},$$
where $\langle \mu, \phi\rangle$ is the integration of $\phi$ with respect to $\mu$ and $C_b^\infty(\mathbb{R})$ denotes the set of all bounded, infinitely differentiable functions on $\mathbb{R}$.
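The Fleming-Viot process with neutral parent independent mutation (FV-process) is a purely atomic measure-valued Markov process with generator
$$\mathcal{A}u(\mu) = \Big\langle \mu(\cdot), A\,\frac{\delta u(\mu)}{\delta\mu(\cdot)}\Big\rangle + \frac{f''(\langle \mu, \phi\rangle)}{2}\,\langle \phi, \phi\rangle_\mu, \quad u \in D$$
$$= \text{mutation} + \text{sampling},$$
where
$$\frac{\delta u(\mu)}{\delta\mu(x)} = \lim_{\varepsilon \to 0+} \varepsilon^{-1}\{u((1-\varepsilon)\mu + \varepsilon\delta_x) - u(\mu)\},$$
$$\langle \phi, \psi\rangle_\mu = \langle \mu, \phi\psi\rangle - \langle \mu, \phi\rangle\langle \mu, \psi\rangle,$$
and $\delta_x$ stands for the Dirac measure at $x \in S$.

The FV-process is reversible with the law $\Pi_{\theta,\nu_0}$ of the Dirichlet process as the reversible measure (Ethier (90)).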
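4. Dynamical Analogue of the Gamma-Dirichlet Algebra

The measure-valued branching diffusion with immigration $\{Y_t\}$:
$$L = \frac{1}{2}\Big\{\Big\langle \theta\nu_0 - \lambda\mu, \frac{\delta}{\delta\mu(x)}\Big\rangle + \Big\langle \mu, \frac{\delta^2}{\delta\mu(x)^2}\Big\rangle\Big\}$$
$$= \text{immigration and drift} + \text{branching},$$
where $\theta > 0$, $\lambda = 1/\beta > 0$, $\nu_0 \in M_1(S)$.

Let $N_t$ be the standard Poisson process with rate one and set
$$C(\lambda, t) = \lambda^{-1}(e^{\lambda t/2} - 1),$$
$$q_n^{a,\lambda}(t) = P\{N_{a/C(\lambda,t)} = n\}, \quad a > 0,\ n = 0, 1, 2, \ldots.$$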
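Transition function (Ethier and Griffiths (93)):
$$P_1(t, \mu, \cdot) = q_0^{\mu(S),\lambda}(t)\,\Gamma^{C(-\lambda,t)}_{\theta,\nu_0}(\cdot) + \sum_{n=1}^{\infty} q_n^{\mu(S),\lambda}(t) \int_{S^n} \Big(\frac{\mu}{\mu(S)}\Big)^n(dx_1 \times \cdots \times dx_n)\,\Gamma^{C(-\lambda,t)}_{n+\theta,\ \frac{n}{\theta+n}\eta_n + \frac{\theta}{\theta+n}\nu_0}(\cdot),$$
where $\eta_n = \frac{1}{n}\sum_{i=1}^{n} \delta_{x_i}$.

The process is reversible with reversible measure $\Gamma^\beta_{\theta,\nu_0}$.

Marginal distribution: Given $Y_0 = \mu$ and $N_{\mu(S)/C(\lambda,t)} = n$, it follows that for any $t > 0$,
$$Y_t \overset{d}{=} \Upsilon^{C(-\lambda,t)}_{n,\eta_n} + \Upsilon^{C(-\lambda,t)}_{\theta,\nu_0}.$$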
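Transition function of the FV-process (Ethier and Griffiths (93)):
$$P_2(t, \nu, \cdot) = d^\theta_0(t)\,\Pi_{\theta,\nu_0}(\cdot) + \sum_{n=1}^{\infty} d^\theta_n(t) \int_{S^n} \nu^n(dx_1 \times \cdots \times dx_n)\,\Pi_{n+\theta,\ \frac{n}{\theta+n}\eta_n + \frac{\theta}{\theta+n}\nu_0}(\cdot),$$
where $\{d^\theta_n(t) : t > 0,\ n = 0, 1, \ldots\}$ is the marginal distribution of a pure death process $\{D^\theta_t, t \ge 0\}$ taking values in $\{\infty, 0, 1, \ldots\}$ with death rates
$$\Big\{\frac{n(n + \theta - 1)}{2} : n = 0, 1, \ldots\Big\}$$
and entrance boundary $\infty$.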
Coefficients: The pure death process $\{D^\theta_t : t > 0\}$ is the embedded chain of Kingman's coalescent, and $d^\theta_n(t)$ is the probability of having $n$ different families at time $t$, or $n$ lines of descent beginning at generation zero.

The time-changed Poisson process $N_{\mu(S)/C(\lambda,t)}$ is a time-inhomogeneous pure death process with death rate $n/(2C(-\lambda, t))$ from state $n \ge 0$ at time $t > 0$. It represents the number of non-immigrant individuals in the population at time $t$.

Comparison between $P_1(t, \mu, \cdot)$ and $P_2(t, \nu, \cdot)$: the two expansions share a termwise Gamma-Dirichlet structure, pairing
$$\Gamma^{C(-\lambda,t)}_{n+\theta,\ \frac{n}{\theta+n}\eta_n + \frac{\theta}{\theta+n}\nu_0}(\cdot) \quad \text{with} \quad \Pi_{n+\theta,\ \frac{n}{\theta+n}\eta_n + \frac{\theta}{\theta+n}\nu_0}(\cdot).$$
Part III: Asymptotics

1. Sampling Formula and Large Sample Approximation

For any $n \ge 1$, a random sample of size $n$ from the two-parameter Poisson-Dirichlet population consists of $n$ positive integer-valued random variables $X_1, \ldots, X_n$ which, given $(P_1(\alpha,\theta), \ldots) = (x_1, \ldots)$, are iid with common distribution $(x_1, x_2, \ldots)$. Set
$$\mathcal{A}_n = \Big\{(a_1, \ldots, a_n) : a_k \text{ is a nonnegative integer},\ k = 1, \ldots, n;\ \sum_{i=1}^{n} i\,a_i = n\Big\},$$
$$A_k = \text{the number of values appearing in the sample exactly } k \text{ times.}$$

An example: $n = 3$. If one value appears in the sample once and another appears twice, then $A_1 = 1$, $A_2 = 1$, $A_3 = 0$. If three distinct values appear in the sample, then $A_1 = 3$, $A_2 = 0$, $A_3 = 0$. If there is only one value in the sample, then $A_1 = 0$, $A_2 = 0$, $A_3 = 1$.
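The map from a sample to $(A_1, \ldots, A_n)$ is a "count of counts". A minimal sketch follows; the function name `allelic_partition` is illustrative and not from the slides.

```python
from collections import Counter

def allelic_partition(sample):
    """Return (A_1, ..., A_n), where A_k is the number of values seen exactly k times."""
    n = len(sample)
    multiplicities = Counter(sample)                    # value -> number of appearances
    counts_of_counts = Counter(multiplicities.values()) # k -> number of values appearing k times
    return tuple(counts_of_counts.get(k, 0) for k in range(1, n + 1))
```

The three cases of the $n = 3$ example correspond to samples such as `[7, 3, 3]`, `[1, 2, 3]` and `[5, 5, 5]`.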
An equivalent way is to partition $[0, 1]$ into a random countable union of disjoint subintervals with interval lengths given by $(P_1(\alpha,\theta), P_2(\alpha,\theta), \ldots)$. Pick $n$ points independently and uniformly from the unit interval. Then $A_k$ is the number of intervals that contain exactly $k$ points.

Theorem 3. (Ewens (72), Pitman (92)) The random vector $\mathbf{A}_n = (A_1, \ldots, A_n)$ is an $\mathcal{A}_n$-valued random variable with distribution given by the well-known Pitman sampling formula:
$$P\{A_i = a_i,\ i = 1, \ldots, n\} = \frac{n!}{\theta_{(n)}} \prod_{l=0}^{k-1} (\theta + l\alpha) \prod_{j=1}^{n} \frac{\big((1-\alpha)_{(j-1)}\big)^{a_j}}{(j!)^{a_j}\,(a_j!)},$$
where $a_i \ge 0$, $\sum_{j=1}^{n} j\,a_j = n$, and $k = \sum_{j=1}^{n} a_j$. The notation $x_{(m)}$ is the ascending factorial defined by $x(x+1)\cdots(x+m-1)$, with $x_{(0)} = 1$.

The case of $\alpha = 0$ is the well-known Ewens sampling formula.
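A direct way to get a feel for the Pitman sampling formula is to evaluate it and check that the probabilities sum to one over all admissible $(a_1, \ldots, a_n)$. The sketch below does this by brute-force enumeration; the function names are illustrative, and it assumes $\theta \ne 0$ so that the factor $n!/\theta_{(n)}$ can be computed as written.

```python
from math import factorial

def rising(x, m):
    """Ascending factorial x_(m) = x (x + 1) ... (x + m - 1), with x_(0) = 1."""
    out = 1.0
    for i in range(m):
        out *= x + i
    return out

def pitman_probability(a, alpha, theta):
    """P{A_1 = a[0], ..., A_n = a[n-1]} according to the Pitman sampling formula."""
    n = sum(j * aj for j, aj in enumerate(a, start=1))
    k = sum(a)  # number of distinct values in the sample
    prob = factorial(n) / rising(theta, n)
    for l in range(k):
        prob *= theta + l * alpha
    for j, aj in enumerate(a, start=1):
        prob *= rising(1.0 - alpha, j - 1) ** aj / (factorial(j) ** aj * factorial(aj))
    return prob

def allelic_partitions(n):
    """Yield every (a_1, ..., a_n) of nonnegative integers with sum_j j * a_j = n."""
    def fill(j, remaining, prefix):
        if j > n:
            if remaining == 0:
                yield tuple(prefix)
            return
        for aj in range(remaining // j + 1):
            yield from fill(j + 1, remaining - j * aj, prefix + [aj])
    yield from fill(1, n, [])
```

Setting `alpha = 0` gives the Ewens sampling formula; for instance, with $n = 2$ the configuration $(a_1, a_2) = (0, 1)$ has probability $(1-\alpha)/(\theta+1)$.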
Large Sample Approximation

An important random variable is
$$K_n(\alpha, \theta) = \text{total number of different values in the sample},$$
where the special case $K_n(0, \theta)$ is a sufficient statistic for $\theta$.

Theorem 4. (Korwar and Hollander(73))
$$K_n(0, \theta) \overset{d}{=} \eta_1 + \cdots + \eta_n,$$
where $\eta_1, \eta_2, \ldots, \eta_n$ are independent and $\eta_i$ has the Bernoulli distribution with success probability $\frac{\theta}{\theta + i - 1}$, and
$$\lim_{n \to \infty} \frac{K_n(0, \theta)}{\log n} = \theta \quad \text{almost surely.}$$
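The Bernoulli representation in Theorem 4 makes $K_n(0, \theta)$ easy to simulate and gives its mean in closed form, $E[K_n(0,\theta)] = \sum_{i=1}^{n} \theta/(\theta + i - 1)$. A small sketch, with illustrative function names:

```python
import random

def expected_distinct_values(n, theta):
    """E[K_n(0, theta)] = sum_{i=1}^{n} theta / (theta + i - 1)."""
    return sum(theta / (theta + i - 1) for i in range(1, n + 1))

def sample_distinct_values(n, theta, rng):
    """One draw of K_n(0, theta) as a sum of independent Bernoulli variables."""
    return sum(1 for i in range(1, n + 1) if rng.random() < theta / (theta + i - 1))
```

Comparing `expected_distinct_values(n, theta)` with `theta * math.log(n)` for increasing `n` illustrates the logarithmic growth stated in the theorem.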
Theorem 5. (Fluctuation Theorem)

(1) (Hansen(94) (fixed $\theta$), and Goncharov(44) ($\theta = 1$))
$$\frac{K_n(0, \theta) - \theta\log n}{\sqrt{\theta\log n}} \Rightarrow N(0, 1), \quad n \to \infty.$$

(2) (Pitman(06))
$$\frac{K_n(\alpha, \theta)}{n^\alpha} \to S_{\alpha,\theta} \quad \text{almost surely,}$$
where $S_{\alpha,\theta}$ is related to the Mittag-Leffler distribution.

(3) (Arratia, Barbour and Tavaré(92)) When $\alpha = 0$,
$$(A_1, \ldots, A_n, 0, \ldots) \Rightarrow (Y_1, Y_2, \ldots), \quad n \to \infty,$$
where $Y_1, Y_2, \ldots$ are independent Poisson random variables and $Y_i$ has mean $\theta/i$.
Large Deviations

Large deviations for a family of probability measures $\{P_\lambda : \lambda \in \text{index set}\}$ on a space $E$ are estimates of the following type:
$$P_\lambda\{G\} \approx \exp\{-a(\lambda) \inf_{x \in G} I(x)\},$$
where $a(\lambda)$ is called the large deviation speed, and the nonnegative function $I(\cdot)$ is the rate function. It describes the most likely event among all unlikely events.

Theorem 6. (F and Hoppe(98))

(a) For an appropriate subset $A$ of $[0, \infty)$, we have
$$P\{K_n(0, \theta)/\log n \in A\} \approx \exp\{-\log n \inf_{x \in A} I(x)\},$$
where
$$I(x) = x\log\frac{x}{\theta} - x + \theta.$$
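(b) For $0 < \alpha < 1$ and an appropriate subset $A$ of $[0, \infty)$,
$$P\{K_n(\alpha, \theta)/n \in A\} \approx \exp\{-n \inf_{x \in A} I(x)\},$$
where
$$I(x) = \sup_{\lambda}\{\lambda x - \Lambda_\alpha(\lambda)\}$$
and
$$\Lambda_\alpha(\lambda) = \begin{cases} -\log\big[1 - (1 - e^{-\lambda})^{1/\alpha}\big], & \text{if } \lambda > 0, \\ 0, & \text{else.} \end{cases}$$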
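The rate function in part (a) of Theorem 6, $I(x) = x\log(x/\theta) - x + \theta$, vanishes exactly at $x = \theta$, which is the almost sure limit of $K_n(0,\theta)/\log n$, and is positive elsewhere. A small sketch for evaluating it is given below; the function name is illustrative, and returning `math.inf` for negative arguments is a convention adopted here, not something stated on the slide.

```python
import math

def rate_function_kn(x, theta):
    """I(x) = x log(x / theta) - x + theta on [0, infinity), with I(0) = theta."""
    if x < 0:
        return math.inf  # convention: outside the state space [0, infinity)
    if x == 0:
        return theta     # x log(x / theta) tends to 0 as x decreases to 0
    return x * math.log(x / theta) - x + theta
```

For example, `rate_function_kn(theta, theta)` is zero for any `theta > 0`, while values of `x` far from `theta` are penalized at logarithmic speed in `n`.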
2. Large θ Approximation

Law of Large Numbers

Note that the parameter $\theta$ is the scaled population mutation rate in the case $\alpha = 0$. In general, it is related in a certain way to the size of a population. The limiting procedure of $\theta$ tending to infinity is thus associated with the behavior of a population when the population size tends to infinity.

WLLN:
$$\lim_{\theta \to \infty} (P_1(\alpha,\theta), P_2(\alpha,\theta), \ldots) = (0, 0, \ldots).$$
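Fluctuations

Consider a sequence of random variables $\zeta_1 \ge \zeta_2 \ge \cdots$ such that for each $r \ge 1$, $1 \le k \le r$, in terms of probability densities,
$$\zeta_1 \sim e^{-u - e^{-u}}, \quad \text{i.e., } \zeta_1 \text{ has the Gumbel distribution,}$$
$$\zeta_k \sim \frac{1}{(k-1)!}\exp\{-(ku + e^{-u})\},$$
$$(\zeta_1, \ldots, \zeta_r) \sim \exp\{-(u_1 + \cdots + u_r) - e^{-u_r}\}, \quad u_1 \ge u_2 \ge \cdots \ge u_r.$$
Set
$$\beta(\alpha, \theta) = \log\theta - (1 + \alpha)\log\log\theta - \log\Gamma(1 - \alpha).$$

Theorem 7. (Handa (09)) For each $r \ge 1$,
$$(\theta P_1(\alpha,\theta) - \beta(\alpha,\theta), \ldots, \theta P_r(\alpha,\theta) - \beta(\alpha,\theta)) \Rightarrow (\zeta_1, \ldots, \zeta_r)$$
as $\theta$ tends to infinity.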
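Large Deviations

Theorem 8. (F(07), F and Gao (10)) For any appropriate subset $B$ of the space of ranked sequences $x_1 \ge x_2 \ge \cdots \ge 0$ with $\sum_i x_i \le 1$, we have
$$PD(\alpha, \theta)\{B\} \approx \exp\{-\theta \inf_{x \in B} I(x)\},$$
where
$$I(x) = \begin{cases} \log\dfrac{1}{1 - \sum_{i=1}^{\infty} x_i}, & \sum_{i=1}^{\infty} x_i < 1, \\ +\infty, & \text{else.} \end{cases}$$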
Dirichlet Process

Theorem 9. (LLN) As $\theta$ tends to infinity, the Dirichlet process
$$\Xi_{\alpha,\theta,\nu_0}(dx) = \sum_{i=1}^{\infty} P_i(\alpha,\theta)\,\delta_{\xi_i}(dx)$$
converges in probability to $\nu_0$ in the space $M_1(S)$.

For $S = [0, 1]$, treat $\Xi_{\alpha,\theta,\nu_0}([0, t])$ as a random process in time $t$ and $\nu_0([0, t])$ as a function of $t$. Let $\hat B(t)$ denote the Brownian bridge over $[0, 1]$.
Theorem 10. (CLT, James(08)) The random process
$$\sqrt{\theta}\,\big[\Xi_{\alpha,\theta,\nu_0}([0, t]) - \nu_0([0, t])\big]$$
converges in distribution to
$$\sqrt{1 - \alpha}\;\hat B(\nu_0([0, t]))$$
as $\theta$ tends to infinity.
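Theorem 11. (Large Deviations, F(07)) For an appropriate subset $C$ of $M_1(S)$,
$$\Pi_{\alpha,\theta,\nu_0}\{C\} \approx \exp\{-\theta \inf_{\mu \in C} I(\mu)\},$$
where
$$I(\mu) = \sup_{f > 0,\ f \in C(S)} \Big\{\frac{1}{\alpha}\log\langle \nu_0, f^\alpha\rangle + 1 - \langle \mu, f\rangle\Big\},$$
and when $\alpha = 0$, $I(\mu)$ becomes the relative entropy of $\nu_0$ with respect to $\mu$.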
References

R. Arratia, A.D. Barbour and S. Tavaré (1992). Poisson process approximations for the Ewens sampling formula. Ann. Appl. Probab. 2, 519-535.

A. Brix (1999). Generalized gamma measures and shot-noise Cox processes. Adv. Appl. Probab. 31, 929-953.

D.A. Dawson and S. Feng (1998). Large deviations for the Fleming-Viot process with neutral mutation and selection. Stoch. Proc. Appl. 77, 207-232.

D.A. Dawson and S. Feng (2006). Asymptotic behavior of Poisson-Dirichlet distribution for large mutation rate. Ann. Appl. Probab. 16, No. 2, 562-582.

S.N. Ethier (1990). The infinitely-many-neutral-alleles diffusion model with ages. Adv. Appl. Probab. 22, 1-24.

S.N. Ethier (1992). Eigenstructure of the infinitely-many-neutral-alleles diffusion model. J. Appl. Probab. 29, 487-498.

S.N. Ethier and R.C. Griffiths (1993). The transition function of a Fleming-Viot process. Ann. Probab. 21, No. 3, 1571-1590.

S.N. Ethier and T.G. Kurtz (1981). The infinitely-many-neutral-alleles diffusion model. Adv. Appl. Probab. 13, 429-452.

W.J. Ewens (1972). The sampling theory of selectively neutral alleles. Theor. Pop. Biol. 3, 87-112.

S. Feng (2007). Large deviations associated with Poisson-Dirichlet distribution and Ewens sampling formula. Ann. Appl. Probab. 17, Nos. 5/6, 1570-1595.

S. Feng (2009). Poisson-Dirichlet distribution with small mutation rate. Stoch. Proc. Appl. 119, 2082-2094.

S. Feng and F. Gao (2010). Asymptotic results for the two-parameter Poisson-Dirichlet distribution. Stoch. Proc. Appl. 120, 1159-1177.

S. Feng and F.M. Hoppe (1998). Large deviation principles for some random combinatorial structures in population genetics and Brownian motion. Ann. Appl. Probab. 8, No. 4, 975-994.

S. Feng and W. Sun (2010). Some diffusion processes associated with two parameter Poisson-Dirichlet distribution and Dirichlet process. Probab. Theory Relat. Fields 148, No. 3-4, 501-525.

V.L. Goncharov (1944). Some facts from combinatorics. Izvestia Akad. Nauk. SSSR, Ser. Mat. 8, 3-48.

R.C. Griffiths (1979). On the distribution of allele frequencies in a diffusion model. Theor. Pop. Biol. 15, 140-158.

K. Handa (2001). Quasi-invariant measures and their characterization by conditional probabilities. Bull. Sci. Math. 125, No. 6-7, 583-604.

K. Handa (2009). The two-parameter Poisson-Dirichlet point process. Bernoulli 15, No. 4, 1082-1116.

J.C. Hansen (1990). A functional central limit theorem for the Ewens sampling formula. J. Appl. Probab. 27, 28-43.

L.F. James (2003). Bayesian calculus for gamma processes with applications to semiparametric intensity models. Sankhyā 65, No. 1, 179-206.

L.F. James (2005a). Bayesian Poisson process partition with an application to Bayesian Lévy moving averages. Ann. Statist. 33, No. 4, 1771-1799.

L.F. James (2005b). Functionals of Dirichlet processes, the Cifarelli-Regazzini identity and beta-gamma processes. Ann. Statist. 33, No. 2, 647-660.

L.F. James (2008). Large sample asymptotics for the two-parameter Poisson-Dirichlet processes. In Bertrand Clarke and Subhashis Ghosal, eds, Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh, 187-199, Institute of Mathematical Statistics Collections, Vol. 3, Beachwood, Ohio.

P. Joyce, S.M. Krone, and T.G. Kurtz (2002). Gaussian limits associated with the Poisson-Dirichlet distribution and the Ewens sampling formula. Ann. Appl. Probab. 12, No. 1, 101-124.

J.F.C. Kingman (1975). Random discrete distributions. J. Roy. Statist. Soc. B 37, 1-22.

R.M. Korwar and M. Hollander (1973). Contributions to the theory of Dirichlet processes. Ann. Probab. 1, 705-711.

A. Lijoi, R.H. Mena, and I. Prünster (2007). Controlling the reinforcement in Bayesian non-parametric mixture models. J. Roy. Statist. Soc. B 69, 715-740.

L.A. Petrov (2009). Two-parameter family of infinite-dimensional diffusions on the Kingman simplex. Funct. Anal. Appl. 43, No. 4, 279-296.

J. Pitman (1992). The two-parameter generalization of Ewens' random partition structure. Technical Report 345, Dept. Statistics, University of California, Berkeley.

J. Pitman (2006). Combinatorial Stochastic Processes, Ecole d'Été de Probabilités de Saint-Flour, Lecture Notes in Math., Vol. 1875, Springer-Verlag, Berlin.

J. Pitman and M. Yor (1997). The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator. Ann. Probab. 25, No. 2, 855-900.

M. Ruggiero and S.G. Walker (2009). Countable representation for infinite-dimensional diffusions derived from the two-parameter Poisson-Dirichlet process. Electron. Comm. Probab. 14, 501-517.

N.V. Tsilevich, A. Vershik, and M. Yor (2001). An infinite-dimensional analogue of the Lebesgue measure and distinguished properties of the gamma process. J. Funct. Anal. 185, No. 1, 274-296.
THANK YOU!