ESTIMATING THE FREQUENCY OF THE OLDEST ALLELE - A BAYESIAN APPROACH


ESTIMATING THE FREQUENCY OF THE OLDEST ALLELE - A BAYESIAN APPROACH by Paul Joyce. TECHNICAL REPORT No. 171, July 1989. Department of Statistics, GN-22, University of Washington, Seattle, Washington, USA

Abstract

Consider an age-ordered population $Z_1, Z_2, \ldots$, where $Z_i$ is the frequency of the $i$th oldest allele and $\sum_i Z_i = 1$. From this population consider an age-ordered sample of size $n$ with $l$ alleles and the frequencies in age-order denoted by $M = (l; m_1, m_2, \ldots, m_l)$. We calculate the posterior distribution and posterior moments of the population frequency of the oldest allele, $Z_1$, given the sample $M$, assuming that the population is at stationarity and follows the neutral infinite alleles model. We also calculate the posterior distribution of $Z_1$ given a partition of $n$ genes with no age information in the sample. These results are used to determine Bayes estimators for the population frequency of the oldest type, and the analysis is extended to include the posterior distribution of $Z_1, Z_2, \ldots, Z_k$ given $M$ for any $k$.

Footnote (partly illegible in the source): [...] G.E.M. distribution, infinite [...]; NSF BSR [...]

1 Introduction

Consider a sample of $n$ genes drawn at a locus from a large population which has evolved according to the assumptions of the neutral infinite-alleles model. The population is viewed as a realization of a random sequence of allele frequencies listed in age-order. It is this particular realization that we wish to learn more about. In particular, we use the information obtained in the sample to estimate the proportion of the population that is of the oldest type. However, we do not wish to base our inference entirely on information obtained from the sample. The infinite-alleles assumption is prior knowledge that should be used together with the sample to make our estimate. So the problem at hand naturally lends itself to a Bayesian approach.

The above problem is in some sense a follow-up to a question asked by Watterson and Guess (1977): Is the most frequent allele the oldest? The answer is that, under the assumptions of the infinite-alleles model, the oldest allele is the most frequent with probability equal to its expected frequency. So it seems natural to ask: what is the frequency of the oldest allele, given a sample?

Several authors have studied ages of alleles. Watterson (1976a) was the [...] used [...] to answer questions about [...] reversibility arguments [...]. Donnelly [...] were to discover [...]

[...] according to [...] simpler and more informative. The culmination of the theory that has been developed in the last decade concerning age-ordering of alleles can be summed up by the following distribution, called the G.E.M. Let $V_1, V_2, \ldots$ be i.i.d. Beta$(1, \theta)$, and let

$$Z_1 = V_1, \qquad Z_i = (1 - V_1)(1 - V_2) \cdots (1 - V_{i-1}) V_i, \quad i > 1. \tag{1.1}$$

$Z_i$ represents the allele frequency of the $i$th oldest type taken from a stationary infinite-alleles diffusion model. The distribution (1.1) (in the context of population genetics) was first discovered by Griffiths (1982) (unpublished). Donnelly and Tavaré (1986) showed that (1.1) arises as the limiting distribution of an age-ordered sample of size $n$ taken from a coalescent process with ages, as $n \to \infty$. An infinite coalescent with ages was constructed by Donnelly and Tavaré (1987), which has equilibrium distribution given by (1.1). Hoppe (1987) uses the G.E.M. to derive the Ewens sampling formula, giving an alternative proof of a theorem of Watterson (1976b). Donnelly and Joyce (1990a) showed that (1.1) [is the] limit of [...] population of age-ordered alleles from a class of stationary [...] as [...] $\to \infty$.
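The stick-breaking construction in (1.1) can be simulated directly. The following Python sketch draws the first $k$ age-ordered frequencies; the function names `sample_gem` and `mean_z1`, the seed handling, and the truncation at $k$ terms are illustrative choices, not part of the report.

```python
import random


def sample_gem(theta, k, rng=None):
    """Draw the first k age-ordered frequencies Z_1, ..., Z_k of (1.1).

    Hypothetical helper: truncating at k terms is a convenience for
    simulation, the G.E.M. itself is an infinite sequence.
    """
    rng = rng or random.Random()
    remaining = 1.0  # running product of (1 - V_i) over the sticks broken so far
    z = []
    for _ in range(k):
        v = rng.betavariate(1.0, theta)  # V_j ~ Beta(1, theta)
        z.append(remaining * v)          # Z_j = V_j * prod_{i<j} (1 - V_i)
        remaining *= 1.0 - v
    return z


def mean_z1(theta, reps=20000, seed=0):
    """Monte Carlo estimate of E(Z_1), which equals 1 / (1 + theta)."""
    rng = random.Random(seed)
    return sum(sample_gem(theta, 1, rng)[0] for _ in range(reps)) / reps
```

Since $Z_1 = V_1$ is Beta$(1, \theta)$, the simulated mean of the oldest allele's frequency should be close to $1/(1+\theta)$, which gives a quick sanity check on the sampler.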

[...] to (1.1). A summary of recent [...] about [...]. Since we are interested in Bayesian inference, we view (1.1) as a prior distribution for our population frequencies. Fortunately, distributions like (1.1) are often used as priors in Bayesian statistics, and some known results are listed in the appendix. In Bayesian terms, prior to sampling, $Z_1$ represents the frequency of the oldest allele. We aim to calculate the posterior distribution of $Z_1$ given the data. The posterior mean $E(Z_1 \mid \text{data})$ is our Bayes estimator for the population frequency of the oldest type. We use the posterior distribution to assess our error for that estimation.

2 Data

As was mentioned in the introduction, distributions like (1.1) are often used as priors in Bayesian inference problems. What sets our population problem apart from other problems is the type of information we have available in our sample. The point is best illustrated by an example.

Example. Consider a population consisting of an infinite number of types, listed in age order. Let $x_i$ be the proportion of the $i$th type. Suppose a sample of size 5, $X_1, X_2, \ldots, X_5$, taken from the population yields the following: $X_1 = 2$, $X_2 = 3$, $X_3 = 2$, $X_4 = 5$, $X_5 = 3$. Let

$$T_j = \sum_{i=1}^{5} I\{X_i = j\},$$

and let $T = (T_1, T_2, \ldots)$; then $T = (0, 2, 2, 0, 1, 0, 0, \ldots)$. The $X$'s tell you the label of each of the individuals sampled. $T$ tells you which labels have been sampled and which have not. The distribution of $T$ is, of course, multinomial. In the context of age ordering, $T$ tells you (in the above example) that the second oldest type in the population has 2 representatives in the sample, the third oldest also 2, and the 5th oldest has 1 representative in the sample. However, a geneticist could not hope to be privy to that much information. At best the geneticist would be able to list the types in age-order relative to [...]. For [...] problem his sample would look like [...] 2 representatives, [...] 1 [...]. It is even more [...]

[...] $a_i$ represent the number of alleles in the sample with $i$ representatives; for the above example $a_1 = 1$, $a_2 = 2$, [...] integer 5 (see Kingman (1978a,b)). The data for our purposes will come in the form of a partition, or will be relatively age-ordered. Our inference must be based on data of this form. The above example should serve to motivate the following definitions.

Let $X = (X_1, X_2, \ldots, X_n)$ be an i.i.d. sample of size $n$ taken from a population described by (1.1). Define $T_j$ to be

$$T_j = \sum_{i=1}^{n} I\{X_i = j\}, \tag{2.1}$$

and $T = (T_1, T_2, \ldots)$. Let $i_1 = \min\{j : T_j \neq 0\}$ and $i_k = \min\{j : j > i_{k-1},\ T_j \neq 0\}$. Let

$$L = \sum_{j=1}^{\infty} I\{T_j \neq 0\}.$$

Define $M_k = T_{i_k}$ for $k = 1, 2, \ldots, L$, and let

$$M = (L; M_1, M_2, \ldots, M_L). \tag{2.2}$$

Thus $L$ is the number of alleles in the sample and $M$ is the age-ordered sample. Define

$$A_i = \sum_{j=1}^{\infty} I\{T_j = i\}, \qquad A = (A_1, A_2, \ldots, A_n).$$

$A$ is a partition vector, or a partition of the integer $n$: $\sum_{j=1}^{n} j A_j = n$.
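The three summaries of a sample defined above (the fully age-ordered counts $T$, the relatively age-ordered sample $M$, and the partition $A$) can be computed from the labels of the worked example. The sketch below is only illustrative; the function names are hypothetical, and $T$ is truncated at the largest label observed.

```python
from collections import Counter


def age_counts(labels):
    """T_j of (2.1): number of sampled genes with population label j (1-based)."""
    counts = Counter(labels)
    return [counts.get(j, 0) for j in range(1, max(labels) + 1)]


def relative_age_ordered(t):
    """M of (2.2): (L; M_1, ..., M_L), the nonzero entries of T in age order."""
    nonzero = [c for c in t if c != 0]
    return (len(nonzero), nonzero)


def partition(t, n):
    """A = (A_1, ..., A_n): A_i is the number of alleles with i representatives."""
    sizes = Counter(c for c in t if c != 0)
    return [sizes.get(i, 0) for i in range(1, n + 1)]


# The sample from the example: X_1 = 2, X_2 = 3, X_3 = 2, X_4 = 5, X_5 = 3.
x = [2, 3, 2, 5, 3]
t = age_counts(x)             # [0, 2, 2, 0, 1]
m = relative_age_ordered(t)   # (3, [2, 2, 1])
a = partition(t, len(x))      # [1, 2, 0, 0, 0]
```

Passing from $T$ to $M$ discards the positions of the zeros, and passing from $M$ to $A$ discards the order as well, which is the loss of information the example is meant to illustrate.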

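Let $a = (a_1, a_2, \ldots, a_n)$, $a_i \in \mathbb{N}$, be such that $\sum_{j=1}^{n} j a_j = n$. Let

$$\mathcal{A}(a) = \Big\{(t_1, t_2, \ldots) : t_i \in \mathbb{N},\ \sum_{i=1}^{\infty} t_i = n,\ \sum_{j=1}^{\infty} I\{t_j = i\} = a_i\Big\}.$$

Let $m = (l; m_1, m_2, \ldots, m_l)$ be such that $\sum_{i=1}^{l} m_i = n$. Define

$$\mathcal{C}_m = \Big\{(t_1, t_2, \ldots) : t_i \in \mathbb{N},\ \sum_{i=1}^{\infty} t_i = n,\ t_{i_j} = m_j,\ j = 1, 2, \ldots, l\Big\}.$$

Note that $Z = (Z_1, Z_2, \ldots)$ defined by (1.1) is a random vector on the infinite simplex $\Delta$ defined by

$$\Delta = \Big\{(x_1, x_2, \ldots) : x_i \geq 0,\ \sum_{i=1}^{\infty} x_i = 1\Big\}.$$

Let $\mu$ be the distribution of $Z$ given by (1.1); then by definition

$$P(A = a) = \int_{\Delta} \sum_{t \in \mathcal{A}(a)} \frac{n!}{t_1!\, t_2! \cdots}\, x_1^{t_1} x_2^{t_2} \cdots \, d\mu(x)$$

and

$$P[L = l; M_1 = m_1, \ldots, M_l = m_l] = \int_{\Delta} \sum_{t \in \mathcal{C}_m} \frac{n!}{t_1!\, t_2! \cdots}\, x_1^{t_1} x_2^{t_2} \cdots \, d\mu(x). \tag{2.4}$$

Computing the above infinite-dimensional integrals seems like an extremely [...] problem. [...] for the G.E.M. distribution $\mu$, the answers are as [follows]:

$$P(A = a) = \frac{n!}{\theta_{(n)}} \prod_{j=1}^{n} \frac{\theta^{a_j}}{j^{a_j}\, a_j!} \tag{2.5}$$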

$$P[L = l; M_1 = m_1, \ldots, M_l = m_l] = \frac{n!\,\theta^{l}}{\theta_{(n)}} \prod_{j=1}^{l} \frac{1}{m_j + m_{j+1} + \cdots + m_l}, \tag{2.6}$$

where $\theta_{(n)} = \theta(\theta + 1) \cdots (\theta + n - 1)$. The right side of Equation (2.5) is the famous Ewens Sampling Formula (see Ewens (1972)), and the right side of (2.6) is an age-ordered version (see Donnelly and Tavaré (1986), Donnelly and Joyce (1990a,b), Ethier (1989)). Note that the function being integrated in (2.5) is symmetric. So the integral in (2.5) would be the same if integration were done with respect to the joint distribution of the order statistics. In the case of the G.E.M. (1.1), the distribution of the order statistics is the well-known Poisson-Dirichlet distribution (see Kingman (1975)). Watterson (1976) was first to relate the Poisson-Dirichlet distribution to the Ewens sampling formula and so can be credited with the first proof of (2.5). Actually, Watterson's calculation uses finite-dimensional Dirichlets rather than the Poisson-Dirichlet directly. A theorem of Kingman (1977) shows that this is equivalent to showing (2.5) directly. In Theorem 3 of Hoppe (1987) a proof of (2.5) is given by using the G.E.M. distribution. In Theorem 10 of Donnelly and Joyce (1990a) (2.6) is [...] (2.6) by summing [...]
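A short exact-arithmetic sketch may help in checking the two sampling formulas against each other. It assumes the Ewens formula in its standard form, $P(A = a) = \frac{n!}{\theta_{(n)}} \prod_j \theta^{a_j} / (j^{a_j} a_j!)$, and, as a labeled assumption, takes the age-ordered probability to be $\frac{n!\,\theta^{l}}{\theta_{(n)}} \prod_{j=1}^{l} (m_j + \cdots + m_l)^{-1}$, since the display for (2.6) is not legible in this transcription. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, prod


def rising(theta, n):
    """theta_(n) = theta (theta + 1) ... (theta + n - 1), with theta_(0) = 1."""
    return prod((theta + i for i in range(n)), start=Fraction(1))


def ewens(a, theta):
    """Ewens sampling formula for the partition a = (a_1, ..., a_n)."""
    theta = Fraction(theta)
    n = sum(j * aj for j, aj in enumerate(a, start=1))
    p = Fraction(factorial(n)) / rising(theta, n)
    for j, aj in enumerate(a, start=1):
        p *= theta ** aj / (Fraction(j) ** aj * factorial(aj))
    return p


def age_ordered(m, theta):
    """Assumed age-ordered form for m = (m_1, ..., m_l), oldest allele first."""
    theta = Fraction(theta)
    n = sum(m)
    p = Fraction(factorial(n)) * theta ** len(m) / rising(theta, n)
    for j in range(len(m)):
        p /= sum(m[j:])  # m_j + m_{j+1} + ... + m_l
    return p


# Summing the age-ordered probabilities over every distinct age order of the
# block sizes {1, 2, 2} should recover the Ewens probability of the partition.
theta = Fraction(3, 2)
total = sum(age_ordered(m, theta) for m in set(permutations([1, 2, 2])))
```

Under these assumptions, `total` agrees exactly with `ewens([1, 2, 0, 0, 0], theta)`, which reflects the fact that forgetting the age order of an age-ordered sample leaves only its partition.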

3 Posterior distribution

[We] now [calculate the posterior] distribution of $Z_1$ (the frequency of the oldest allele in the population) given an age-ordered sample $M$. This distribution depends on [$M$] only through the frequency of the oldest allele in [the sample]. The prior distribution of $Z_1$ is Beta$(1, \theta)$. It follows from Theorem A.2 of the appendix that the distribution of $Z_1$ given a sample $X$ ($X$ lists the population label for each member of the sample) is also a beta distribution. However, as was pointed out in the example of the last section, we condition on the relatively age-ordered sample $M$ defined by (2.2). There are two cases to be considered. Either the oldest allele in the sample is also the oldest allele in the population, or the oldest allele in the population has no representatives in the sample. By viewing $M$ we cannot determine which of the two situations is the truth. So we condition on each possibility. Thus the posterior distribution of the frequency of the oldest allele in the population, $Z_1$, given $M$ is a mixture of two Beta distributions. The theorem below formalizes this.

Theorem 3.1 Let $Z_1, Z_2, \ldots$ be the population frequencies in age-order of a population described by (1.1). Let $M$ be an age-ordered sample of size $n$. Then the posterior density of $Z_1$ given $M = m$ is

$$f_{Z_1|M}(s \mid m) = \frac{\theta}{n+\theta}\cdot\frac{(1-s)^{n+\theta-1}}{\mathrm{Beta}(1,\ n+\theta)} + \frac{n}{n+\theta}\cdot\frac{s^{m_1}(1-s)^{n-m_1+\theta-1}}{\mathrm{Beta}(m_1+1,\ n-m_1+\theta)}. \tag{3.1}$$

Proof. Let $X = (X_1, \ldots, X_n)$ be a sample of size $n$ from the population given by (1.1), [listing the] population labels for [each] allele [in] the sample. Recall from (2.1) that $T_1$ is the number of genes in the sample which are of the oldest type in the population. It follows from Corollary A.2 of the appendix that the posterior density of $Z_1$ given $T_1 = t_1$ is

$$f_{Z_1|T_1}(s \mid t_1) = \frac{s^{t_1}(1 - s)^{n - t_1 + \theta - 1}}{\mathrm{Beta}(t_1 + 1,\ n - t_1 + \theta)}. \tag{3.2}$$

So it follows that

$$f_{Z_1|M}(s \mid m) = \sum_{j=0}^{n} f_{Z_1|T_1}(s \mid j)\, P(T_1 = j \mid M = m). \tag{3.3}$$

However, the oldest allele in the population is either the oldest allele in the sample or is not represented in the sample. This implies

$$P(T_1 = 0 \mid M = m) + P(T_1 = m_1 \mid M = m) = 1. \tag{3.4}$$

Thus we need only show that

$$P(T_1 = 0 \mid M = m) = \frac{\theta}{n + \theta}.$$

[...]

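[...] by [...] we see [...] (3.6)

Thus, writing $y_j = \sum_{i > j} t_i$ and summing over the vectors $t$ that yield $M = m$ and have $t_1 = 0$,

$$P(T_1 = 0 \mid M = m) = \frac{\theta\,\mathrm{Beta}(1,\ n + \theta) \displaystyle\sum_{\{t\,:\,t_1 = 0\}} \frac{n!}{t_2!\, t_3! \cdots} \prod_{j \geq 2} \theta\,\mathrm{Beta}(t_j + 1,\ y_j + \theta)}{P(M = m)} = \frac{\theta\,\mathrm{Beta}(1,\ n + \theta)\, P(M = m)}{P(M = m)} \quad \text{by (3.6)}$$

$$= \frac{\theta}{n + \theta}. \tag{3.7}$$

Note that

$$P(T_1 = m_1 \mid M = m) = 1 - P(T_1 = 0 \mid M = m) = \frac{n}{n + \theta}. \tag{3.8}$$

So [...]. It is interesting to see [that $P(T_1 = 0 \mid M = m)$] reduces to $\frac{\theta}{n + \theta}$ [...]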

[...] $P(T_1 = 0)$ [...] is (1.1), [...]

$$P(T_1 = 0 \mid M = m) = P(T_1 = 0) = \frac{\theta}{\theta + n}. \tag{3.9}$$

Thus the event $\{T_1 = 0\}$ is independent of the sample $M$. We mentioned earlier that by viewing the sample $M$ it is impossible to tell whether or not the oldest allele in the population appears in the sample. In fact, viewing $M$ doesn't even give us a hint. The information in $M$ is independent of whether or not the oldest allele appears in the sample. This fact serves as a reminder that there is a lot less information in a relatively age-ordered sample $M$ than in a totally age-ordered sample $T$. There are alternative ways to prove (3.9). The central result of Watterson and Guess (1977) and Kelly (1979) is that the probability that a particular allele is the oldest is equal to its proportion of the population. Thus, the chance that the oldest allele does not appear in the sample, $P(T_1 = 0)$, is the chance that a randomly selected individual is of a type that does not appear in the sample.

Corollary 3.1 If $Z_1$ is the frequency of the oldest allele in the population whose distribution is given by (1.1) and $M$ is an age-ordered sample of size $n$, then

$$E(Z_1 \mid M) = \frac{n}{n + \theta}\cdot\frac{M_1 + 1}{n + \theta + 1} + \frac{\theta}{n + \theta}\cdot\frac{1}{n + \theta + 1}.$$

[...] an age-ordered sample.
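The two-component mixture described in this section is straightforward to evaluate numerically. The sketch below assumes the weights $\theta/(n+\theta)$ and $n/(n+\theta)$ derived in the proof, attached to Beta$(1, n+\theta)$ and Beta$(m_1+1, n-m_1+\theta)$ respectively; the function names are hypothetical, and the Beta density is written out with `math.lgamma` so that no external library is needed.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_pdf(s, a, b):
    """Density of Beta(a, b) at a point s strictly between 0 and 1."""
    return exp((a - 1) * log(s) + (b - 1) * log(1 - s) - log_beta(a, b))


def posterior_density_given_m(s, m1, n, theta):
    """Mixture posterior density of Z_1 given an age-ordered sample.

    m1 is the count of the oldest allele in the sample, n the sample size.
    """
    w_absent = theta / (n + theta)  # P(T_1 = 0 | M = m)
    w_present = n / (n + theta)     # P(T_1 = m_1 | M = m)
    return (w_absent * beta_pdf(s, 1, n + theta)
            + w_present * beta_pdf(s, m1 + 1, n - m1 + theta))


def posterior_mean_given_m(m1, n, theta):
    """Posterior mean of Z_1, i.e. the Bayes estimator under squared error loss."""
    return (theta + n * (m1 + 1)) / ((n + theta) * (n + theta + 1))
```

For example, with $n = 5$, $m_1 = 2$ and $\theta = 1.5$ the estimator is $16.5 / 48.75 \approx 0.338$, somewhat below the sample frequency $2/5$ because of the chance that the oldest population allele was not sampled at all.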

[...] minimizes [...] squared error loss [...] $Z_1$. So it follows [...] $n$ [...] our estimator is consistent, [...] $E(Z_1 \mid M) =$ [...]

4 Conditioning on a Partition

[Suppose] our sample of $n$ [genes] contains no information about [...]. As we mentioned in the example of Section 2, $A_i$ is the number of alleles in [the sample that] have $i$ representatives. Recall from (2.1) that $T_j$ is the number of genes in the sample that are of the $j$th oldest type in the population, and that

$$A_i = \sum_{j=1}^{\infty} I\{T_j = i\} \quad \text{and} \quad A = (A_1, A_2, \ldots, A_n).$$

Recall that $\sum_{i=1}^{n} i A_i = n$, and that $\sum_{i=1}^{n} A_i$ is the number of alleles present in the sample. We wish to calculate the posterior distribution of $Z_1$ given $A = a$.

Suppose we are given a sample with $k$ alleles present. If no age information is available then it is possible that any given one of the alleles present in the sample could be of the oldest type in the population, or the oldest type in the population is not represented in the sample. It is these $k + 1$ possibilities that we must condition on. While the age-ordered sample gave a mixture of two Betas as posterior for [$Z_1$], the sample without [age information gives] a posterior distribution [that] is a mixture of $k + 1$ Betas. [...] us consider a sample of [...] use a particular [...]

In [...] an [...] oldest in the population, we calculate [the probability that] an individual chosen at random from [the] population is of a given [allelic type with] $i$ representatives in the sample. Let $N$ denote the population size. With probability $n/N$ the randomly chosen individual will belong to the sample, and if this happens the probability that it is of the given allelic type is $i/n$. With probability $(N - n)/N$ the randomly chosen individual will be outside the sample, and if this happens the probability that it is of the given allelic type is $i/(n + \theta)$ (see Kelly (1979), Watterson and Guess (1977)). Thus the probability we are seeking is

$$\frac{n}{N}\cdot\frac{i}{n} + \frac{N - n}{N}\cdot\frac{i}{n + \theta} = \frac{i(\theta + N)}{N(\theta + n)}.$$

This is the argument used in Theorem 7.6 of Kelly (1979). Now letting $N \to \infty$ we see that the probability that a particular allele with $i$ representatives in the sample is the oldest in the population is $i/(n + \theta)$. So the probability that any allele with $i$ representatives in the sample is the oldest in the population is

$$\frac{i\, a_i}{n + \theta}, \quad i > 0. \tag{4.1}$$

[...] is a statement [about the G.E.M.] (1.1), but [the] argument we [used] to [...] never mentions [the] G.E.M. (1.1). [...] book (1979) predates [the] use of [the] G.E.M. (1.1) [in] population genetics. The argument works because we now know that the G.E.M. is the limiting distribution, as the population size [...], of the [...] alleles model (see Donnelly and Tavaré (1986, 1987), Donnelly and Joyce (1990), Ethier (1990)). However, it is interesting to see that (4.1) can be arrived at by direct calculation using the G.E.M. explicitly. For this reason we give an alternative proof of (4.1).

Lemma 4.1 Consider a sample of size $n$ taken from a population described by (1.1). Let $A$ be the partition associated with the sample, and let $T_1$ be the number of genes in the sample that are of the oldest type in the population. Then for $j > 0$

$$P(T_1 = j \mid A = a) = \frac{j\, a_j}{n + \theta}.$$

Proof. It follows by (3.5) that

$$P(T_1 = j \mid A = a) = [\ldots] \tag{4.2}$$

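[...] $t_i$ is [...] $j$th oldest [...] number of [...]. Note that it follows by summing equation (3.5) [that the] Ewens sampling formula (2.5), which we will now denote by $P_n$ (to make explicit the dependence on [sample] size), can be written as

$$P_n(a_1, a_2, \ldots, a_n) := P(A = a) = \sum_{t \in \mathcal{A}(a)} P(T = t), \tag{4.3}$$

where the sum runs over all vectors $t$ of counts whose partition is $a$. We now rewrite (4.2) to be

$$[\ldots]\ \frac{\displaystyle\sum \frac{(n - j)!}{t_2!\, t_3! \cdots} \prod_{k \geq 2} \theta\,\mathrm{Beta}(t_k + 1,\ y_k + \theta)}{P_n(a_1, a_2, \ldots, a_n)}.$$

We now use the Ewens Sampling Formula given by (2.5) to reduce the above equation to [...]. [...] an urn model. Hoppe ([...]) [...] distribution is [...]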
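The reduction used in the proof of Lemma 4.1 can be checked in exact arithmetic. The sketch below rests on two assumptions that are spelled out because the intermediate displays are not legible here: first, that $P(T = t) = \frac{n!}{t_1!\, t_2! \cdots} \prod_j \theta\,\mathrm{Beta}(t_j + 1, y_j + \theta)$ with $y_j = \sum_{i > j} t_i$, which follows from (1.1) by integrating out the independent Beta$(1, \theta)$ variables; second, that summing over the remaining coordinates yields the Ewens probability of the partition with one block of size $j$ removed. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, prod


def rising(theta, n):
    """theta_(n) = theta (theta + 1) ... (theta + n - 1), with theta_(0) = 1."""
    return prod((theta + i for i in range(n)), start=Fraction(1))


def ewens(a, theta):
    """Ewens sampling formula for the partition a = (a_1, ..., a_n)."""
    n = sum(j * aj for j, aj in enumerate(a, start=1))
    p = Fraction(factorial(n)) / rising(theta, n)
    for j, aj in enumerate(a, start=1):
        p *= theta ** aj / (Fraction(j) ** aj * factorial(aj))
    return p


def theta_beta(t, y, theta):
    """theta * Beta(t + 1, y + theta) for integers t, y >= 0, computed exactly."""
    denom = Fraction(1)
    for k in range(t + 1):
        denom *= y + theta + k
    return theta * factorial(t) / denom


def p_t1_given_partition(j, a, theta):
    """P(T_1 = j | A = a) for j >= 1 via the ratio of Ewens probabilities."""
    theta = Fraction(theta)
    n = sum(i * ai for i, ai in enumerate(a, start=1))
    if a[j - 1] == 0:
        return Fraction(0)
    reduced = list(a)
    reduced[j - 1] -= 1           # remove one block of size j
    reduced = reduced[: n - j]    # what is left is a partition of n - j
    numerator = comb(n, j) * theta_beta(j, n - j, theta) * ewens(reduced, theta)
    return numerator / ewens(a, theta)
```

With $a = (1, 2, 0, 0, 0)$ and $\theta = 3/2$ this returns $2/13$ for $j = 1$ and $8/13$ for $j = 2$, matching $j a_j / (n + \theta)$.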

[...] 4.1 [...] contexts [...] Hoppe urn [...] is in Donnelly (1987).

Theorem 4.1 Consider a sample of size $n$ taken from a population described by (1.1). Let $A$ be the partition associated with the sample. Let $Z_1$ be the frequency of the oldest allele in the population. Then the posterior density of $Z_1$ given $A = a$ is

$$f_{Z_1|A}(s \mid a) = \theta(1 - s)^{n + \theta - 1} + \sum_{j=1}^{n} \frac{j\, a_j}{n + \theta}\cdot\frac{s^{j}(1 - s)^{n - j + \theta - 1}}{\mathrm{Beta}(j + 1,\ n - j + \theta)}. \tag{4.4}$$

Proof. Note that

$$f_{Z_1|A}(s \mid a) = \sum_{j=0}^{n} f_{Z_1|T_1}(s \mid j)\, P(T_1 = j \mid A = a).$$

From Lemma 4.1 we note that

$$P(T_1 = 0 \mid A = a) = 1 - \sum_{j=1}^{n} \frac{j\, a_j}{n + \theta} = \frac{\theta}{n + \theta}.$$

So the result follows from Lemma 4.1 and equation (3.2). $\Box$

Corollary 4.1 [...] of [...] allele in a sample, $A$, [...] we can [...]
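The mixture in Theorem 4.1 has one Beta component for each block size present in the partition, plus one for the case where the oldest allele was not sampled. The following sketch computes the mixture weights and the resulting posterior mean; it assumes the weights $\theta/(n+\theta)$ and $j a_j/(n+\theta)$ from Lemma 4.1 and uses the mean $(j+1)/(n+\theta+1)$ of Beta$(j+1, n-j+\theta)$. The function names are hypothetical.

```python
def oldest_allele_weights(a, theta):
    """Map j -> P(T_1 = j | A = a), where j = 0 means 'oldest allele not sampled'."""
    n = sum(j * aj for j, aj in enumerate(a, start=1))
    weights = {0: theta / (n + theta)}
    for j, aj in enumerate(a, start=1):
        if aj:
            weights[j] = j * aj / (n + theta)
    return weights


def posterior_mean_given_partition(a, theta):
    """E(Z_1 | A = a) as a weighted average of the Beta component means."""
    n = sum(j * aj for j, aj in enumerate(a, start=1))
    weights = oldest_allele_weights(a, theta)
    # Beta(j + 1, n - j + theta) has mean (j + 1) / (n + theta + 1); j = 0 covers Beta(1, n + theta).
    return sum(w * (j + 1) / (n + theta + 1) for j, w in weights.items())


weights = oldest_allele_weights([1, 2, 0, 0, 0], 1.5)
estimate = posterior_mean_given_partition([1, 2, 0, 0, 0], 1.5)
```

For the partition $a = (1, 2, 0, 0, 0)$ with $\theta = 1.5$ the weights are $1.5/6.5$, $1/6.5$ and $4/6.5$, and the posterior mean is $15.5/48.75 \approx 0.318$. The estimate depends on the sample only through the block sizes, which is why it cannot single out the oldest allele as $n$ grows.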

[...] of Donnelly and [...] it was shown that as $n$ [tends] to infinity, [...] for all $k$. So it follows from the above and (4.5) that

$$E(Z_1 \mid A) \to \sum_{i=1}^{\infty} Z_i^2 \quad \text{as } n \to \infty, \tag{4.6}$$

where $Z_i$ has distribution given by (1.1). So we see that the estimator in Corollary 4.1 is not consistent. Since the above estimator is invariant under different labelings of the sample, it is not surprising that its limit is a symmetric function of the population. The poor asymptotic property of the estimator in Corollary 4.1 is very significant, and can be used to argue that one has no business estimating the frequency of the oldest allele from a sample with no age information. However, the posterior distribution (4.4) has a multi-modal density; that is, the graph of (4.4) can have many local maxima. So the posterior mean, $E(Z_1 \mid A)$, is not a good summary of (4.4). In fact a point estimate is not a very [...] thing to be looking at. Yet the posterior density still [tells] you more about [...] you [...] before sampling. It [tells] you [...] is [...] near [...]

[...] sample. The posterior distribution quantifies this statement.

5 The Posterior G.E.M.

We now extend our [analysis] to include [the] joint distribution of the population frequencies of [the] $k$ oldest alleles $Z_1, Z_2, \ldots, Z_k$, given an age-ordered sample $M$. Let us first concentrate on the case $k = 2$. The posterior distribution for $(Z_1, Z_2)$ given a sample $X$, which lists the population labels for each member [of the] sample, depends only on $T_1$, the number of [genes] in the sample that are of the oldest type in the population, and $T_2$, the number of genes in the sample that are of the second oldest type in the population. The joint distribution of $(Z_1, Z_2)$ given $X$ will be a generalized Dirichlet with density

$$\frac{\theta_{(n+1)}}{\theta_{(n - t_1 - t_2)}}\cdot\frac{\theta + n - t_1}{t_1!\, t_2!}\cdot\frac{s_1^{t_1}\, s_2^{t_2}\, (1 - s_1 - s_2)^{n - t_1 - t_2 + \theta - 1}}{1 - s_1}. \tag{5.1}$$

(This follows from Theorem A.3 of the appendix.) However, if we view a relatively age-ordered sample $M$ we do not know $T_1$ or $T_2$. There are four possibilities for $(T_1, T_2)$ given $M$. They are

$T_1 \neq 0,\ T_2 \neq 0$; $\quad T_1 \neq 0,\ T_2 = 0$; $\quad T_1 = 0,\ T_2 \neq 0$; $\quad T_1 = 0,\ T_2 = 0$.

[...] unique structure of [the] G.E.M. [...] is [the probability that the oldest] in [the] sample is [the oldest] in [the] population and [the] second oldest in the sample is second oldest in the population, conditioned on knowing the sample. Yet for the G.E.M. distribution age-ordering is equivalent to size-biasing. (For a proof see [...] and [...] (1977); for a complete description of size-biasing see Donnelly and Joyce (1989).) Thus [this probability] is just the chance that 1) a randomly selected individual is of a type that appears in the sample, and 2) after deleting that type from the population, another randomly chosen individual is of a type that appears in the remaining sample. [...] (1.1) [...] remaining population is G.E.M. (1.1). This is Theorem 1 [...] is $\frac{n}{n + \theta}$ [...]

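[...] statement (Hoppe (1986)) that the probability of [...] is $\frac{n - M_1}{n - M_1 + \theta}$. Thus

$$P(T_1 \neq 0,\ T_2 \neq 0 \mid M) = \frac{n}{n + \theta}\cdot\frac{n - M_1}{n - M_1 + \theta}.$$

Similarly

$$P(T_1 = 0,\ T_2 \neq 0 \mid M) = \frac{\theta}{n + \theta}\cdot\frac{n}{n + \theta},$$

$$P(T_1 \neq 0,\ T_2 = 0 \mid M) = \frac{n}{n + \theta}\cdot\frac{\theta}{n - M_1 + \theta}, \tag{5.2}$$

$$P(T_1 = 0,\ T_2 = 0 \mid M) = \frac{\theta^2}{(n + \theta)(n + \theta)}.$$

The posterior distribution of $(Z_1, Z_2)$ given $M$ has density [...]. [...] (5.2) [the] problem is completed. [We have] now outlined [the] procedure to calculate [...] $= m$ [in the] case $k =$ [...]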
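The four cases for $(T_1, T_2)$ can be tabulated with a few lines of Python. This sketch assumes the size-biasing argument of this section: the oldest population allele is present in the sample with probability $n/(n+\theta)$; if it is present, the second oldest is present with probability $(n - M_1)/(n - M_1 + \theta)$; if it is absent, the sample still has $n$ genes drawn from a G.E.M. population, so the second oldest is present with probability $n/(n+\theta)$. The function name and the dictionary keys are hypothetical.

```python
def two_oldest_case_probabilities(m1, n, theta):
    """Probabilities of the four cases for (T_1, T_2) given an age-ordered sample.

    m1 is the count of the oldest allele in the sample; n is the sample size.
    Keys read (oldest population allele, second oldest population allele).
    """
    present1 = n / (n + theta)
    absent1 = theta / (n + theta)
    rest = n - m1  # genes left once the oldest sampled allele is removed
    return {
        ("present", "present"): present1 * rest / (rest + theta),
        ("present", "absent"): present1 * theta / (rest + theta),
        ("absent", "present"): absent1 * n / (n + theta),
        ("absent", "absent"): absent1 * theta / (n + theta),
    }


cases = two_oldest_case_probabilities(m1=2, n=5, theta=1.5)
```

The four values sum to one, and when the sample contains a single allele ($M_1 = n$) the first case has probability zero, as it should.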

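[To apply the] same procedure [in] the [general case] we must [establish] some notation. [Let $\varphi_i(t)$ be] defined by

$$\varphi_i(t) = \begin{cases} \theta & \text{if } t_i = 0, \\ n - t_0 - \cdots - t_{i-1} & \text{if } t_i \neq 0. \end{cases}$$

[For] a fixed $m = (l; m_1, \ldots, m_l)$ where $\sum_{i=1}^{l} m_i = n$, $m_i > 0$ for $i > 0$, define [...]

Theorem 5.1 The joint posterior distribution for the first $k$ oldest alleles in the population, $\mathbf{Z}_k = (Z_1, Z_2, \ldots, Z_k)$, given an age-ordered sample $M = m$ has the following density

$$\sum_{t} \prod_{i=1}^{k} \frac{\varphi_i(t)}{n - \sum_{j=0}^{i-1} t_j + \theta}\ [\ldots] \tag{5.3}$$

where [...] $= 1 - s_1 - \cdots$ [...] $= n - t_1 - t_2 - \cdots - t_i$ [...]

Proof. [...]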

[...] expression

$$\prod_{i=1}^{k} \frac{[\ldots]}{n - [\ldots]} \tag{5.5}$$

is one of the appropriate probabilities which we outlined the procedure for calculating in the case $k = 2$; the general case (5.5) follows by induction. $\Box$

Let us now define a sequence of random variables $Z_1', Z_2', \ldots$ where the joint distribution of $(Z_1', Z_2', \ldots, Z_k')$ is given by (5.3). The Kolmogorov existence theorem (Theorem 3.1 of Billingsley (1977)) guarantees that $Z_1', Z_2', \ldots$ defines a probability measure on $\Delta$, which we denote by $\mu'$. So $\mu'$ is the posterior G.E.M., which gives us the age-ordered frequencies of a population after viewing an age-ordered sample.

A Appendix

Connor and Mosimann (1969) defined a generalized Dirichlet distribution, for which [the] distributions of [the] G.E.M. are a special example. Like the Dirichlet, [the] posterior [of] a generalized Dirichlet is again generalized Dirichlet with a change in the parameters.

Theorem A.1 Let $\{U_i\}$ be a sequence of independent random variables. Let $U_i$ have distribution Beta$(a_i, b_i)$. Define $Q_1 = U_1$ and

$$Q_i = U_i \prod_{j=1}^{i-1} (1 - U_j), \quad i > 1. \tag{A.1}$$

The joint distribution of $(Q_1, Q_2, \ldots, Q_{m-1})$ is called a generalized Dirichlet and has density given by

$$[\ldots] \tag{A.2}$$

Proof. The proof follows immediately from transformation of variables. $\Box$

Note that in [the] special case $a_i = b_{i-1} - b_i$ the generalized Dirichlet [...]

Theorem A.2 [...]

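[...] are independent with posterior distribution Beta$(T_i + a_i,\ Y_i + b_i)$, where

$$T_i = \sum_{j=1}^{n} I\{X_j = i\} \quad \text{and} \quad Y_i = \sum_{j > i} T_j.$$

Proof.

$$P(X_1 = x_1, \ldots, X_n = x_n,\ U_i \leq u_i,\ i = 1, 2, \ldots) = E\Big\{P(X_1 = x_1, \ldots, X_n = x_n \mid U_i,\ i = 1, 2, \ldots) \prod_{l=1}^{\infty} I\{U_l \leq u_l\}\Big\} \tag{A.3}$$

$$= E\Big\{\prod_{j=1}^{\infty} (1 - U_j)^{\sum_{i=1}^{n} I\{j < x_i\}}\, U_j^{\sum_{i=1}^{n} I\{j = x_i\}} \prod_{l=1}^{\infty} I\{U_l \leq u_l\}\Big\}.$$

Let $t_j = \sum_{i=1}^{n} I\{x_i = j\}$, and let $y_j = \sum_{i=1}^{n} I\{x_i > j\} = \sum_{i > j} t_i$. Now we use the fact that the $\{U_i\}$ are independent and [distributed as] Beta$(a_i, b_i)$ to rewrite (A.3) as [...]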

So by [...] $\Box$

Corollary A.1 Let $T_i$ [be as de]fined in Theorem A.2. Let $T = (T_1, T_2, \ldots)$. Then $P(T = t) =$ [...]

Proof. The result follows immediately from equation (A.5). $\Box$

Corollary A.2 The posterior distribution of $Q_1$ given $T_1$ has Beta density given by

$$f_{Q_1|T_1}(s \mid t) = \frac{s^{t + a_1 - 1}(1 - s)^{n - t + b_1 - 1}}{\mathrm{Beta}(t + a_1,\ n - t + b_1)}.$$

Proof. It is a special case of Theorem A.2. We state it separately because of its importance in the paper. $\Box$

Theorem A.3 Let $T_i$, $Q_i$, $X_i$ be as defined in Theorem A.2. The joint posterior distribution of [...] has a generalized Dirichlet distribution where $a_i$ is replaced by $a_i +$ [...]

Proof. [...] an immediate consequence [...]

References

Billingsley, P. (1977), Probability and Measure, Wiley, New York.

Connor, R.J. and Mosimann, J.E. (1969), Concepts of independence for proportions with a generalization of the Dirichlet distribution. J. Am. Statist. Assoc. 64.

Donnelly, P. (1986), Partition structures, Polya urns, the Ewens sampling formula, and the ages of alleles. Theoret. Population Biol. 30.

Donnelly, P. and Joyce, P. (1989), Continuity and weak convergence of ranked and size-biased permutations on the infinite simplex. Stochastic Processes Appl. 31.

Donnelly, P. and Joyce, P. (1990a), Consistent ordered sampling distributions: characterization and convergence. Adv. Appl. Probab. (to appear).

Donnelly, P. and Joyce, P. (1990b), Weak convergence of population [...] process to [...] with ages (submitted).

Donnelly, P. and Tavaré, S. (1986), The ages of alleles and a coalescent. Adv. Appl. Probab. 18.

Donnelly, P. and Tavaré, S. (1987), The population genealogy of the infinitely-many neutral alleles model, J. Math. Biol. 25.

Ethier, S. (1989), The infinitely-many-neutral-alleles diffusion model [...] Adv. Appl. Probab. 22, to appear.

Ethier, S. (1990), The distribution of the frequencies of age-ordered alleles in a diffusion model (submitted).

Ewens, W.J. (1972), The sampling theory of selectively neutral alleles, Theoret. Population Biol. 3.

Ewens, W.J. (1989), Population genetics theory - the past and the future. Mathematical and Statistical Problems in Evolution, S. Lessard, ed. University of Montreal Press, Montreal, to appear.

Griffiths, R.C. (c. 1982), Unpublished [...]

Hoppe, F.M. (1984), Polya-like urns and the Ewens sampling formula, J. Math. Biol. 20.

Hoppe, F.M. (1986), Size-biased filtering of Poisson-Dirichlet samples with an application to partition structures in genetics, J. Appl. Probab. 23.

Hoppe, F.M. (1987), The sampling theory of neutral alleles and an urn model in population genetics, J. Math. Biol. 25.

Joyce, P., Age-ordered distributions associated with some neutral population genetics models. Unpublished Ph.D. [thesis], University of Utah.

Kelly, F.P. (1977), Exact results for the Moran neutral allele model, Adv. Appl. Probab. 9.

Kelly, F.P. (1979), Reversibility and Stochastic Networks, Wiley, New York.

Kingman, J.F.C. (1977), The population structure associated with the Ewens sampling formula, Theoret. Population Biol. 11.

Kingman, J.F.C. (1978a), Random partitions in population genetics, Proc. Roy. Soc. London Ser. A 361.

Kingman, J.F.C. (1978b), The representation of partition structures, J. London Math. Soc. 18.

[...] C. ([...]), Diversity as a concept and [...]

Watterson, G.A. (1976a), Reversibility and the age of an allele I. Moran's infinitely-many neutral alleles model, Theoret. Population Biol. 10.

Watterson, G.A. (1976b), The stationary distribution of the infinitely-many neutral alleles diffusion model, J. Appl. Probab. 13.

Watterson, G.A. and Guess, H.A. (1977), Is the most frequent allele the oldest? Theoret. Population Biol. 11.


More information

1981] 209 Let A have property P 2 then [7(n) is the number of primes not exceeding r.] (1) Tr(n) + c l n 2/3 (log n) -2 < max k < ir(n) + c 2 n 2/3 (l

1981] 209 Let A have property P 2 then [7(n) is the number of primes not exceeding r.] (1) Tr(n) + c l n 2/3 (log n) -2 < max k < ir(n) + c 2 n 2/3 (l 208 ~ A ~ 9 ' Note that, by (4.4) and (4.5), (7.6) holds for all nonnegatíve p. Substituting from (7.6) in (6.1) and (6.2) and evaluating coefficients of xm, we obtain the following two identities. (p

More information

Krzysztof Burdzy University of Washington. = X(Y (t)), t 0}

Krzysztof Burdzy University of Washington. = X(Y (t)), t 0} VARIATION OF ITERATED BROWNIAN MOTION Krzysztof Burdzy University of Washington 1. Introduction and main results. Suppose that X 1, X 2 and Y are independent standard Brownian motions starting from 0 and

More information

. Get closed expressions for the following subsequences and decide if they converge. (1) a n+1 = (2) a 2n = (3) a 2n+1 = (4) a n 2 = (5) b n+1 =

. Get closed expressions for the following subsequences and decide if they converge. (1) a n+1 = (2) a 2n = (3) a 2n+1 = (4) a n 2 = (5) b n+1 = Math 316, Intro to Analysis subsequences. Recall one of our arguments about why a n = ( 1) n diverges. Consider the subsequences a n = ( 1) n = +1. It converges to 1. On the other hand, the subsequences

More information

NEW CONSTRUCTION OF THE EAGON-NORTHCOTT COMPLEX. Oh-Jin Kang and Joohyung Kim

NEW CONSTRUCTION OF THE EAGON-NORTHCOTT COMPLEX. Oh-Jin Kang and Joohyung Kim Korean J Math 20 (2012) No 2 pp 161 176 NEW CONSTRUCTION OF THE EAGON-NORTHCOTT COMPLEX Oh-Jin Kang and Joohyung Kim Abstract The authors [6 introduced the concept of a complete matrix of grade g > 3 to

More information

1 A simple example. A short introduction to Bayesian statistics, part I Math 217 Probability and Statistics Prof. D.

1 A simple example. A short introduction to Bayesian statistics, part I Math 217 Probability and Statistics Prof. D. probabilities, we ll use Bayes formula. We can easily compute the reverse probabilities A short introduction to Bayesian statistics, part I Math 17 Probability and Statistics Prof. D. Joyce, Fall 014 I

More information

A Simple Proof of the Stick-Breaking Construction of the Dirichlet Process

A Simple Proof of the Stick-Breaking Construction of the Dirichlet Process A Simple Proof of the Stick-Breaking Construction of the Dirichlet Process John Paisley Department of Computer Science Princeton University, Princeton, NJ jpaisley@princeton.edu Abstract We give a simple

More information

CS281B / Stat 241B : Statistical Learning Theory Lecture: #22 on 19 Apr Dirichlet Process I

CS281B / Stat 241B : Statistical Learning Theory Lecture: #22 on 19 Apr Dirichlet Process I X i Ν CS281B / Stat 241B : Statistical Learning Theory Lecture: #22 on 19 Apr 2004 Dirichlet Process I Lecturer: Prof. Michael Jordan Scribe: Daniel Schonberg dschonbe@eecs.berkeley.edu 22.1 Dirichlet

More information

Maximum union-free subfamilies

Maximum union-free subfamilies Maximum union-free subfamilies Jacob Fox Choongbum Lee Benny Sudakov Abstract An old problem of Moser asks: how large of a union-free subfamily does every family of m sets have? A family of sets is called

More information

On The Mutation Parameter of Ewens Sampling. Formula

On The Mutation Parameter of Ewens Sampling. Formula On The Mutation Parameter of Ewens Sampling Formula ON THE MUTATION PARAMETER OF EWENS SAMPLING FORMULA BY BENEDICT MIN-OO, B.Sc. a thesis submitted to the department of mathematics & statistics and the

More information

CLOSED-FORM ASYMPTOTIC SAMPLING DISTRIBUTIONS UNDER THE COALESCENT WITH RECOMBINATION FOR AN ARBITRARY NUMBER OF LOCI

CLOSED-FORM ASYMPTOTIC SAMPLING DISTRIBUTIONS UNDER THE COALESCENT WITH RECOMBINATION FOR AN ARBITRARY NUMBER OF LOCI Adv. Appl. Prob. 44, 391 407 (01) Printed in Northern Ireland Applied Probability Trust 01 CLOSED-FORM ASYMPTOTIC SAMPLING DISTRIBUTIONS UNDER THE COALESCENT WITH RECOMBINATION FOR AN ARBITRARY NUMBER

More information

z E z *" I»! HI UJ LU Q t i G < Q UJ > UJ >- C/J o> o C/) X X UJ 5 UJ 0) te : < C/) < 2 H CD O O) </> UJ Ü QC < 4* P? K ll I I <% "fei 'Q f

z E z * I»! HI UJ LU Q t i G < Q UJ > UJ >- C/J o> o C/) X X UJ 5 UJ 0) te : < C/) < 2 H CD O O) </> UJ Ü QC < 4* P? K ll I I <% fei 'Q f I % 4*? ll I - ü z /) I J (5 /) 2 - / J z Q. J X X J 5 G Q J s J J /J z *" J - LL L Q t-i ' '," ; i-'i S": t : i ) Q "fi 'Q f I»! t i TIS NT IS BST QALITY AVAILABL. T Y FRNIS T TI NTAIN A SIGNIFIANT NBR

More information

Bayesian Nonparametrics: Dirichlet Process

Bayesian Nonparametrics: Dirichlet Process Bayesian Nonparametrics: Dirichlet Process Yee Whye Teh Gatsby Computational Neuroscience Unit, UCL http://www.gatsby.ucl.ac.uk/~ywteh/teaching/npbayes2012 Dirichlet Process Cornerstone of modern Bayesian

More information

Stochastic Processes, Kernel Regression, Infinite Mixture Models

Stochastic Processes, Kernel Regression, Infinite Mixture Models Stochastic Processes, Kernel Regression, Infinite Mixture Models Gabriel Huang (TA for Simon Lacoste-Julien) IFT 6269 : Probabilistic Graphical Models - Fall 2018 Stochastic Process = Random Function 2

More information

The Game of Normal Numbers

The Game of Normal Numbers The Game of Normal Numbers Ehud Lehrer September 4, 2003 Abstract We introduce a two-player game where at each period one player, say, Player 2, chooses a distribution and the other player, Player 1, a

More information

A comparison of two popular statistical methods for estimating the time to most recent common ancestor (TMRCA) from a sample of DNA sequences

A comparison of two popular statistical methods for estimating the time to most recent common ancestor (TMRCA) from a sample of DNA sequences Indian Academy of Sciences A comparison of two popular statistical methods for estimating the time to most recent common ancestor (TMRCA) from a sample of DNA sequences ANALABHA BASU and PARTHA P. MAJUMDER*

More information

LIMITS FOR QUEUES AS THE WAITING ROOM GROWS. Bell Communications Research AT&T Bell Laboratories Red Bank, NJ Murray Hill, NJ 07974

LIMITS FOR QUEUES AS THE WAITING ROOM GROWS. Bell Communications Research AT&T Bell Laboratories Red Bank, NJ Murray Hill, NJ 07974 LIMITS FOR QUEUES AS THE WAITING ROOM GROWS by Daniel P. Heyman Ward Whitt Bell Communications Research AT&T Bell Laboratories Red Bank, NJ 07701 Murray Hill, NJ 07974 May 11, 1988 ABSTRACT We study the

More information

Programming Assignment 4: Image Completion using Mixture of Bernoullis

Programming Assignment 4: Image Completion using Mixture of Bernoullis Programming Assignment 4: Image Completion using Mixture of Bernoullis Deadline: Tuesday, April 4, at 11:59pm TA: Renie Liao (csc321ta@cs.toronto.edu) Submission: You must submit two files through MarkUs

More information

Size-biased sampling and discrete nonparametric Bayesian inference

Size-biased sampling and discrete nonparametric Bayesian inference Journal of Statistical Planning and Inference 128 (2005) 123 148 www.elsevier.com/locate/jspi Size-biased sampling and discrete nonparametric Bayesian inference Andrea Ongaro Dipartimento di Statistica,

More information

chapter 5 INTRODUCTION TO MATRIX ALGEBRA GOALS 5.1 Basic Definitions

chapter 5 INTRODUCTION TO MATRIX ALGEBRA GOALS 5.1 Basic Definitions chapter 5 INTRODUCTION TO MATRIX ALGEBRA GOALS The purpose of this chapter is to introduce you to matrix algebra, which has many applications. You are already familiar with several algebras: elementary

More information

Lecture 2. We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales.

Lecture 2. We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales. Lecture 2 1 Martingales We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales. 1.1 Doob s inequality We have the following maximal

More information

A noninformative Bayesian approach to domain estimation

A noninformative Bayesian approach to domain estimation A noninformative Bayesian approach to domain estimation Glen Meeden School of Statistics University of Minnesota Minneapolis, MN 55455 glen@stat.umn.edu August 2002 Revised July 2003 To appear in Journal

More information

Language as a Stochastic Process

Language as a Stochastic Process CS769 Spring 2010 Advanced Natural Language Processing Language as a Stochastic Process Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu 1 Basic Statistics for NLP Pick an arbitrary letter x at random from any

More information

Bounds for pairs in partitions of graphs

Bounds for pairs in partitions of graphs Bounds for pairs in partitions of graphs Jie Ma Xingxing Yu School of Mathematics Georgia Institute of Technology Atlanta, GA 30332-0160, USA Abstract In this paper we study the following problem of Bollobás

More information

Biology 9 Springer-Verlag 1986

Biology 9 Springer-Verlag 1986 J. Math. Biol. (1986) 24:353-360,Journal of Mathematical Biology 9 Springer-Verlag 1986 A model of weak selection in the infinite alleles framework E. D. Rothman and N.. Weber* Department of Statistics,

More information

A new distribution on the simplex containing the Dirichlet family

A new distribution on the simplex containing the Dirichlet family A new distribution on the simplex containing the Dirichlet family A. Ongaro, S. Migliorati, and G.S. Monti Department of Statistics, University of Milano-Bicocca, Milano, Italy; E-mail for correspondence:

More information

SPACES OF MATRICES WITH SEVERAL ZERO EIGENVALUES

SPACES OF MATRICES WITH SEVERAL ZERO EIGENVALUES SPACES OF MATRICES WITH SEVERAL ZERO EIGENVALUES M. D. ATKINSON Let V be an w-dimensional vector space over some field F, \F\ ^ n, and let SC be a space of linear mappings from V into itself {SC ^ Horn

More information

~,. :'lr. H ~ j. l' ", ...,~l. 0 '" ~ bl '!; 1'1. :<! f'~.., I,," r: t,... r':l G. t r,. 1'1 [<, ."" f'" 1n. t.1 ~- n I'>' 1:1 , I. <1 ~'..

~,. :'lr. H ~ j. l' , ...,~l. 0 ' ~ bl '!; 1'1. :<! f'~.., I,, r: t,... r':l G. t r,. 1'1 [<, . f' 1n. t.1 ~- n I'>' 1:1 , I. <1 ~'.. ,, 'l t (.) :;,/.I I n ri' ' r l ' rt ( n :' (I : d! n t, :?rj I),.. fl.),. f!..,,., til, ID f-i... j I. 't' r' t II!:t () (l r El,, (fl lj J4 ([) f., () :. -,,.,.I :i l:'!, :I J.A.. t,.. p, - ' I I I

More information

SOME PROPERTIES OF THIRD-ORDER RECURRENCE RELATIONS

SOME PROPERTIES OF THIRD-ORDER RECURRENCE RELATIONS SOME PROPERTIES OF THIRD-ORDER RECURRENCE RELATIONS A. G. SHANNON* University of Papua New Guinea, Boroko, T. P. N. G. A. F. HORADAIVS University of New Engl, Armidale, Australia. INTRODUCTION In this

More information

A NONINFORMATIVE BAYESIAN APPROACH FOR TWO-STAGE CLUSTER SAMPLING

A NONINFORMATIVE BAYESIAN APPROACH FOR TWO-STAGE CLUSTER SAMPLING Sankhyā : The Indian Journal of Statistics Special Issue on Sample Surveys 1999, Volume 61, Series B, Pt. 1, pp. 133-144 A OIFORMATIVE BAYESIA APPROACH FOR TWO-STAGE CLUSTER SAMPLIG By GLE MEEDE University

More information

Explicit evaluation of the transmission factor T 1. Part I: For small dead-time ratios. by Jorg W. MUller

Explicit evaluation of the transmission factor T 1. Part I: For small dead-time ratios. by Jorg W. MUller Rapport BIPM-87/5 Explicit evaluation of the transmission factor T (8,E) Part I: For small dead-time ratios by Jorg W. MUller Bureau International des Poids et Mesures, F-930 Sevres Abstract By a detailed

More information

On prediction and density estimation Peter McCullagh University of Chicago December 2004

On prediction and density estimation Peter McCullagh University of Chicago December 2004 On prediction and density estimation Peter McCullagh University of Chicago December 2004 Summary Having observed the initial segment of a random sequence, subsequent values may be predicted by calculating

More information

A PECULIAR COIN-TOSSING MODEL

A PECULIAR COIN-TOSSING MODEL A PECULIAR COIN-TOSSING MODEL EDWARD J. GREEN 1. Coin tossing according to de Finetti A coin is drawn at random from a finite set of coins. Each coin generates an i.i.d. sequence of outcomes (heads or

More information

Boolean Inner-Product Spaces and Boolean Matrices

Boolean Inner-Product Spaces and Boolean Matrices Boolean Inner-Product Spaces and Boolean Matrices Stan Gudder Department of Mathematics, University of Denver, Denver CO 80208 Frédéric Latrémolière Department of Mathematics, University of Denver, Denver

More information

A well-quasi-order for tournaments

A well-quasi-order for tournaments A well-quasi-order for tournaments Maria Chudnovsky 1 Columbia University, New York, NY 10027 Paul Seymour 2 Princeton University, Princeton, NJ 08544 June 12, 2009; revised April 19, 2011 1 Supported

More information

Multiple Time Analyticity of a Statistical Satisfying the Boundary Condition

Multiple Time Analyticity of a Statistical Satisfying the Boundary Condition Publ. RIMS, Kyoto Univ. Ser. A Vol. 4 (1968), pp. 361-371 Multiple Time Analyticity of a Statistical Satisfying the Boundary Condition By Huzihiro ARAKI Abstract A multiple time expectation ^(ABjC^)---^^))

More information

Online Supplement to:

Online Supplement to: Online Supplement to: Adaptive Appointment Systems with Patient Preferences Wen-Ya Wang Diwakar Gupta wenya@ie.umn.edu guptad@me.umn.edu Industrial and Systems Engineering Program, University of Minnesota

More information

free pros quantum groups QIT de Finetti -

free pros quantum groups QIT de Finetti - i e free pros QIT de Finetti - @ quantum groups QG INK FP - : FREE DE FINETTI Commun Math Phys 291 473 490 (2009 Digital Object Identifier (DOI 101007/s00220-009-0802-8 Communications in Mathematical Physics

More information

AARMS Homework Exercises

AARMS Homework Exercises 1 For the gamma distribution, AARMS Homework Exercises (a) Show that the mgf is M(t) = (1 βt) α for t < 1/β (b) Use the mgf to find the mean and variance of the gamma distribution 2 A well-known inequality

More information

Atomic Positive Linear Maps in Matrix Algebras

Atomic Positive Linear Maps in Matrix Algebras Publ RIMS, Kyoto Univ. 34 (1998), 591-599 Atomic Positive Linear Maps in Matrix Algebras By Kil-Chan HA* Abstract We show that all of the known generalizations of the Choi maps are atomic maps. 1. Introduction

More information

Introduction to Bayesian Inference

Introduction to Bayesian Inference Introduction to Bayesian Inference p. 1/2 Introduction to Bayesian Inference September 15th, 2010 Reading: Hoff Chapter 1-2 Introduction to Bayesian Inference p. 2/2 Probability: Measurement of Uncertainty

More information

Clustering K-means. Machine Learning CSE546. Sham Kakade University of Washington. November 15, Review: PCA Start: unsupervised learning

Clustering K-means. Machine Learning CSE546. Sham Kakade University of Washington. November 15, Review: PCA Start: unsupervised learning Clustering K-means Machine Learning CSE546 Sham Kakade University of Washington November 15, 2016 1 Announcements: Project Milestones due date passed. HW3 due on Monday It ll be collaborative HW2 grades

More information

A decision theoretic approach to Imputation in finite population sampling

A decision theoretic approach to Imputation in finite population sampling A decision theoretic approach to Imputation in finite population sampling Glen Meeden School of Statistics University of Minnesota Minneapolis, MN 55455 August 1997 Revised May and November 1999 To appear

More information

Nearly Equal Distributions of the Rank and the Crank of Partitions

Nearly Equal Distributions of the Rank and the Crank of Partitions Nearly Equal Distributions of the Rank and the Crank of Partitions William Y.C. Chen, Kathy Q. Ji and Wenston J.T. Zang Dedicated to Professor Krishna Alladi on the occasion of his sixtieth birthday Abstract

More information

A Note on a General Expansion of Functions of Binary Variables

A Note on a General Expansion of Functions of Binary Variables INFORMATION AND CONTROL 19-, 206-211 (1968) A Note on a General Expansion of Functions of Binary Variables TAIC~YASU ITO Stanford University In this note a general expansion of functions of binary variables

More information

The Wright-Fisher Model and Genetic Drift

The Wright-Fisher Model and Genetic Drift The Wright-Fisher Model and Genetic Drift January 22, 2015 1 1 Hardy-Weinberg Equilibrium Our goal is to understand the dynamics of allele and genotype frequencies in an infinite, randomlymating population

More information

arxiv: v2 [math.pr] 26 Aug 2017

arxiv: v2 [math.pr] 26 Aug 2017 Ordered and size-biased frequencies in GEM and Gibbs models for species sampling Jim Pitman Yuri Yakubovich arxiv:1704.04732v2 [math.pr] 26 Aug 2017 March 14, 2018 Abstract We describe the distribution

More information

System reliability using the survival signature

System reliability using the survival signature System reliability using the survival signature Frank Coolen GDRR Ireland 8-10 July 2013 (GDRR 2013) Survival signature 1 / 31 Joint work with: Tahani Coolen-Maturi (Durham University) Ahmad Aboalkhair

More information

Lecture Note II-3 Static Games of Incomplete Information. Games of incomplete information. Cournot Competition under Asymmetric Information (cont )

Lecture Note II-3 Static Games of Incomplete Information. Games of incomplete information. Cournot Competition under Asymmetric Information (cont ) Lecture Note II- Static Games of Incomplete Information Static Bayesian Game Bayesian Nash Equilibrium Applications: Auctions The Revelation Principle Games of incomplete information Also called Bayesian

More information

MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS. Maya Gupta, Luca Cazzanti, and Santosh Srivastava

MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS. Maya Gupta, Luca Cazzanti, and Santosh Srivastava MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS Maya Gupta, Luca Cazzanti, and Santosh Srivastava University of Washington Dept. of Electrical Engineering Seattle,

More information

ON IDENTIFIABILITY AND INFORMATION-REGULARITY IN PARAMETRIZED NORMAL DISTRIBUTIONS*

ON IDENTIFIABILITY AND INFORMATION-REGULARITY IN PARAMETRIZED NORMAL DISTRIBUTIONS* CIRCUITS SYSTEMS SIGNAL PROCESSING VOL. 16, NO. 1,1997, Pp. 8~89 ON IDENTIFIABILITY AND INFORMATION-REGULARITY IN PARAMETRIZED NORMAL DISTRIBUTIONS* Bertrand Hochwald a and Arye Nehorai 2 Abstract. We

More information

Statistics for scientists and engineers

Statistics for scientists and engineers Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3

More information

1 Complex Networks - A Brief Overview

1 Complex Networks - A Brief Overview Power-law Degree Distributions 1 Complex Networks - A Brief Overview Complex networks occur in many social, technological and scientific settings. Examples of complex networks include World Wide Web, Internet,

More information

Lecturer: David Blei Lecture #3 Scribes: Jordan Boyd-Graber and Francisco Pereira October 1, 2007

Lecturer: David Blei Lecture #3 Scribes: Jordan Boyd-Graber and Francisco Pereira October 1, 2007 COS 597C: Bayesian Nonparametrics Lecturer: David Blei Lecture # Scribes: Jordan Boyd-Graber and Francisco Pereira October, 7 Gibbs Sampling with a DP First, let s recapitulate the model that we re using.

More information

Random walk on a polygon

Random walk on a polygon IMS Lecture Notes Monograph Series Recent Developments in Nonparametric Inference and Probability Vol. 50 (2006) 4 c Institute of Mathematical Statistics, 2006 DOI: 0.24/0749270600000058 Random walk on

More information

PAIRS OF SUCCESSES IN BERNOULLI TRIALS AND A NEW n-estimator FOR THE BINOMIAL DISTRIBUTION

PAIRS OF SUCCESSES IN BERNOULLI TRIALS AND A NEW n-estimator FOR THE BINOMIAL DISTRIBUTION APPLICATIONES MATHEMATICAE 22,3 (1994), pp. 331 337 W. KÜHNE (Dresden), P. NEUMANN (Dresden), D. STOYAN (Freiberg) and H. STOYAN (Freiberg) PAIRS OF SUCCESSES IN BERNOULLI TRIALS AND A NEW n-estimator

More information

Computer Vision Group Prof. Daniel Cremers. 14. Clustering

Computer Vision Group Prof. Daniel Cremers. 14. Clustering Group Prof. Daniel Cremers 14. Clustering Motivation Supervised learning is good for interaction with humans, but labels from a supervisor are hard to obtain Clustering is unsupervised learning, i.e. it

More information

a 11 x 1 + a 12 x a 1n x n = b 1 a 21 x 1 + a 22 x a 2n x n = b 2.

a 11 x 1 + a 12 x a 1n x n = b 1 a 21 x 1 + a 22 x a 2n x n = b 2. Chapter 1 LINEAR EQUATIONS 11 Introduction to linear equations A linear equation in n unknowns x 1, x,, x n is an equation of the form a 1 x 1 + a x + + a n x n = b, where a 1, a,, a n, b are given real

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Asymptotic efficiency of simple decisions for the compound decision problem

Asymptotic efficiency of simple decisions for the compound decision problem Asymptotic efficiency of simple decisions for the compound decision problem Eitan Greenshtein and Ya acov Ritov Department of Statistical Sciences Duke University Durham, NC 27708-0251, USA e-mail: eitan.greenshtein@gmail.com

More information

Heronian tetrahedra are lattice tetrahedra

Heronian tetrahedra are lattice tetrahedra Heronian tetrahedra are lattice tetrahedra Susan H. Marshall and Alexander R. Perlis Abstract Extending a similar result about triangles, we show that each Heronian tetrahedron may be positioned with integer

More information

L n = l n (π n ) = length of a longest increasing subsequence of π n.

L n = l n (π n ) = length of a longest increasing subsequence of π n. Longest increasing subsequences π n : permutation of 1,2,...,n. L n = l n (π n ) = length of a longest increasing subsequence of π n. Example: π n = (π n (1),..., π n (n)) = (7, 2, 8, 1, 3, 4, 10, 6, 9,

More information

SOME FORMULAE FOR THE FIBONACCI NUMBERS

SOME FORMULAE FOR THE FIBONACCI NUMBERS SOME FORMULAE FOR THE FIBONACCI NUMBERS Brian Curtin Department of Mathematics, University of South Florida, 4202 E Fowler Ave PHY4, Tampa, FL 33620 e-mail: bcurtin@mathusfedu Ena Salter Department of

More information

PROBABILITY DISTRIBUTIONS. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception

PROBABILITY DISTRIBUTIONS. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception PROBABILITY DISTRIBUTIONS Credits 2 These slides were sourced and/or modified from: Christopher Bishop, Microsoft UK Parametric Distributions 3 Basic building blocks: Need to determine given Representation:

More information

ASYMPTOTIC EXPANSION OF A COMPLEX HYPERGEOMETRIC FUNCTION

ASYMPTOTIC EXPANSION OF A COMPLEX HYPERGEOMETRIC FUNCTION ASYMPTOTIC EXPANSION OF A COMPLEX HYPERGEOMETRIC FUNCTION by Aksel Bertelsen TECHNICAL REPORT No. 145 November 1988 Department ofstatistics, GN-22 University.of Washington Seattle, Washington 98195 USA

More information

P i [B k ] = lim. n=1 p(n) ii <. n=1. V i :=

P i [B k ] = lim. n=1 p(n) ii <. n=1. V i := 2.7. Recurrence and transience Consider a Markov chain {X n : n N 0 } on state space E with transition matrix P. Definition 2.7.1. A state i E is called recurrent if P i [X n = i for infinitely many n]

More information

Stochastic Processes

Stochastic Processes qmc082.tex. Version of 30 September 2010. Lecture Notes on Quantum Mechanics No. 8 R. B. Griffiths References: Stochastic Processes CQT = R. B. Griffiths, Consistent Quantum Theory (Cambridge, 2002) DeGroot

More information

ON THE ERDOS-STONE THEOREM

ON THE ERDOS-STONE THEOREM ON THE ERDOS-STONE THEOREM V. CHVATAL AND E. SZEMEREDI In 1946, Erdos and Stone [3] proved that every graph with n vertices and at least edges contains a large K d+l (t), a complete (d + l)-partite graph

More information

ON THE EVOLUTION OF ISLANDS

ON THE EVOLUTION OF ISLANDS ISRAEL JOURNAL OF MATHEMATICS, Vol. 67, No. 1, 1989 ON THE EVOLUTION OF ISLANDS BY PETER G. DOYLE, Lb COLIN MALLOWS,* ALON ORLITSKY* AND LARRY SHEPP t MT&T Bell Laboratories, Murray Hill, NJ 07974, USA;

More information

Recursive Estimation

Recursive Estimation Recursive Estimation Raffaello D Andrea Spring 08 Problem Set : Bayes Theorem and Bayesian Tracking Last updated: March, 08 Notes: Notation: Unless otherwise noted, x, y, and z denote random variables,

More information

URN MODELS: the Ewens Sampling Lemma

URN MODELS: the Ewens Sampling Lemma Department of Computer Science Brown University, Providence sorin@cs.brown.edu October 3, 2014 1 2 3 4 Mutation Mutation: typical values for parameters Equilibrium Probability of fixation 5 6 Ewens Sampling

More information

THE S 1 -EQUIVARIANT COHOMOLOGY RINGS OF (n k, k) SPRINGER VARIETIES

THE S 1 -EQUIVARIANT COHOMOLOGY RINGS OF (n k, k) SPRINGER VARIETIES Horiguchi, T. Osaka J. Math. 52 (2015), 1051 1062 THE S 1 -EQUIVARIANT COHOMOLOGY RINGS OF (n k, k) SPRINGER VARIETIES TATSUYA HORIGUCHI (Received January 6, 2014, revised July 14, 2014) Abstract The main

More information

STOCHASTIC PROCESSES Basic notions

STOCHASTIC PROCESSES Basic notions J. Virtamo 38.3143 Queueing Theory / Stochastic processes 1 STOCHASTIC PROCESSES Basic notions Often the systems we consider evolve in time and we are interested in their dynamic behaviour, usually involving

More information