MATHEMATICAL ENGINEERING TECHNICAL REPORTS. Polynomial Time Perfect Sampler for Discretized Dirichlet Distribution


MATHEMATICAL ENGINEERING TECHNICAL REPORTS

Polynomial Time Perfect Sampler for Discretized Dirichlet Distribution

Tomomi MATSUI and Shuji KIJIMA

METR 2003 (April 2003)

DEPARTMENT OF MATHEMATICAL INFORMATICS
GRADUATE SCHOOL OF INFORMATION SCIENCE AND TECHNOLOGY
THE UNIVERSITY OF TOKYO
BUNKYO-KU, TOKYO, JAPAN

The METR technical reports are published as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Polynomial Time Perfect Sampler for Discretized Dirichlet Distribution

Tomomi Matsui and Shuji Kijima

Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Bunkyo-ku, Tokyo, Japan.

version: April 4th

Abstract. In this paper, we propose a perfect (exact) sampling algorithm for the discretized Dirichlet distribution. Our algorithm is a monotone coupling from the past (CFTP) algorithm, which is a Las Vegas type randomized algorithm. We propose a new Markov chain whose limit distribution is a discretized Dirichlet distribution. Our algorithm simulates transitions of the chain O(n^3 ln N) times, where n is the dimension (the number of parameters) and 1/N is the grid size for discretization. Thus the obtained bound does not depend on the magnitudes of the parameters. In each transition, we need to sample a random variable according to a discretized beta distribution (2-dimensional Dirichlet distribution). To show the polynomiality, we employ the path coupling method carefully and show that our chain is rapidly mixing.

1 Introduction

In this paper, we propose a polynomial time perfect (exact) sampling algorithm for the discretized Dirichlet distribution. Our algorithm is a monotone coupling from the past (CFTP) algorithm, which is a Las Vegas type randomized algorithm. We propose a new Markov chain for generating random samples according to a discretized Dirichlet distribution. In each transition of our Markov chain, we need a random sample according to a discretized beta distribution (2-dimensional Dirichlet distribution). Our algorithm simulates transitions of the Markov chain O(n^3 ln N) times in expectation, where n is the dimension (the number of parameters) and 1/N is the grid size for discretization. To show the polynomiality, we employ the path coupling method and show that the mixing rate of our chain is bounded by n(n−1)^2 {1 + ln(n⌈N/2⌉)}.
* Supported by the Superrobust Computation Project of the 21st Century COE Program "Information Science and Technology Strategic Core."

Statistical methods are widely studied in bioinformatics, since they are powerful tools for discovering genes causing a common disease from a number of observed data. These methods often use the EM algorithm, Markov chain Monte Carlo

method, the Gibbs sampler, and so on. The Dirichlet distribution appears as the prior and posterior distribution for the multinomial distribution in these methods, since the Dirichlet distribution is the conjugate prior of the parameters of the multinomial distribution [11]. For example, Niu, Qin, Xu, and Liu proposed a Bayesian haplotype inference method [7], which decides phased paternal and maternal individual genotypes probabilistically. Another example is a population structure inferring algorithm by Pritchard, Stephens, and Donnelly [8]. Their algorithm is based on the MCMC method. In these examples, the Dirichlet distribution appears with various dimensions and various parameters. Thus we need an efficient algorithm for sampling from the Dirichlet distribution with arbitrary dimensions and parameters.

One approach to sampling from a continuous Dirichlet distribution is by rejection (see [4] for example). In this approach, the number of required samples from the gamma distribution is equal to the dimension of the Dirichlet distribution. Though we can sample from the gamma distribution by using rejection sampling, the rejection ratio becomes higher as the parameter becomes smaller. Thus it does not seem to be an effective way for small parameters. Another approach is discretization of the domain and adoption of the Metropolis-Hastings algorithm. Recently, Matsui, Motoki and Kamatani proposed a Markov chain for sampling from a discretized Dirichlet distribution [6]. The mixing rate of their chain is polynomially bounded, and so their algorithm is a polynomial time approximate sampler.

Propp and Wilson devised a surprising algorithm, called the monotone CFTP algorithm (or backward coupling), which produces exact samples from the limit distribution [9, 10]. The monotone CFTP algorithm simulates infinite time transitions of a monotone Markov chain in a probabilistically finite time. To employ their result, we propose a new monotone Markov chain whose stationary distribution is a discretized Dirichlet distribution.
We also show that the mixing rate of our chain is bounded by n(n−1)^2 {1 + ln(n⌈N/2⌉)}. This paper is deeply related to the authors' recent paper [5], which proposes a polynomial time perfect sampling algorithm for two-rowed contingency tables. The idea of the algorithm and the outline of the proof described in this paper are similar to those in our paper [5]. The main difference between these two papers is that, when we deal with the uniform distribution on two-rowed contingency tables, the transition probabilities of the Markov chain are predetermined. However, in the case of the discretized Dirichlet distribution, the transition probabilities vary with the magnitudes of the parameters. Thus, to show the monotonicity and polynomiality of the algorithm proposed in this paper, we need the detailed discussions which appear in the Appendix. Showing the two lemmas in the Appendix is easier in the case of the uniform distribution on two-rowed contingency tables.

2 Review of the Coupling From the Past Algorithm

When we simulate an ergodic Markov chain for infinite time, we obtain a sample exactly according to the stationary distribution. Suppose that there exists a chain from the infinite past. If a possible state at the present time of the chain admits evidence of its uniqueness, without respect to the initial state of the chain, then that state is a realization of a random sample exactly from the stationary distribution. This is the key idea of CFTP.

Suppose that we have an ergodic Markov chain MC with finite state space Ω and transition matrix P. The transition rule of the Markov chain X → X' can be described by a deterministic function φ: Ω × [0, 1) → Ω, called an update function, as follows. Given a random number Λ uniformly distributed over [0, 1), the update function φ satisfies Pr(φ(x, Λ) = y) = P(x, y) for any x, y ∈ Ω. We can realize the Markov chain by setting X' = φ(X, Λ). Clearly, the update function corresponding to a given transition matrix P is not unique. The result of the transitions of the chain from time t1 to t2 (t1 < t2) with a sequence of random numbers λ = (λ[t1], λ[t1+1], ..., λ[t2−1]) ∈ [0, 1)^(t2−t1) is denoted by Φ_{t1}^{t2}(x, λ): Ω × [0, 1)^(t2−t1) → Ω, where

  Φ_{t1}^{t2}(x, λ) := φ(φ(⋯φ(x, λ[t1]), ..., λ[t2−2]), λ[t2−1]).

We say that a sequence λ ∈ [0, 1)^T satisfies the coalescence condition when ∃y ∈ Ω, ∀x ∈ Ω, y = Φ_{−T}^{0}(x, λ). With these preparations, the standard Coupling From The Past algorithm is expressed as follows.

CFTP Algorithm [9]
Step 1. Set the starting time period T := 1 to go back, and set λ to be the empty sequence.
Step 2. Generate random real numbers λ[−T], λ[−T+1], ..., λ[−T/2−1] ∈ [0, 1), and insert them at the head of λ in order, i.e., put λ := (λ[−T], λ[−T+1], ..., λ[−1]).
Step 3. Start a chain from each element x ∈ Ω at time period −T, and run each chain to time period 0 according to the update function φ with the sequence of numbers in λ. (Note that every chain uses the common sequence λ.)
Step 4. [Coalescence check] If ∃y ∈ Ω, ∀x ∈ Ω, y = Φ_{−T}^{0}(x, λ), then return y and stop.
Else, update the starting time period T := 2T, and go to Step 2.

Theorem 1 (CFTP Theorem [9]). Let MC be an ergodic finite Markov chain with state space Ω, defined by an update function φ: Ω × [0, 1) → Ω. If the CFTP algorithm terminates with probability 1, then the obtained value is a realization of a random variable exactly distributed according to the stationary distribution.

Theorem 1 gives a probabilistically finite time algorithm for an infinite time simulation. However, the simulation from all states executed in Step 3 is a hard requirement.
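The CFTP procedure above can be sketched in code. The following is a minimal sketch, not the chain of this paper: a toy lazy random walk on {0, 1, 2}, with an update function realizing its transition matrix. The helper names (phi, cftp) are ours, chosen for illustration.

```python
import random

# A toy ergodic Markov chain on states {0, 1, 2}: a lazy random walk.
# The update function phi realizes the transition matrix P(x, y):
# for a fixed random number lam, phi(., lam) is deterministic.
def phi(x, lam):
    # Move down when lam < 1/3, up when lam > 2/3, stay otherwise
    # (clamped at the boundary states 0 and 2).
    if lam < 1.0 / 3.0:
        return max(x - 1, 0)
    if lam > 2.0 / 3.0:
        return min(x + 1, 2)
    return x

def cftp(states, phi, rng):
    """Standard CFTP: double T until the chains started from ALL states,
    driven by the common random sequence lam, coalesce at time 0."""
    T = 1
    lam = []  # lam[0] drives time -T, ..., lam[-1] drives time -1
    while True:
        # Step 2: prepend fresh numbers for periods -T .. -T/2 - 1.
        fresh = [rng.random() for _ in range(T - len(lam))]
        lam = fresh + lam
        # Step 3: run a chain from every state with the common sequence.
        results = set()
        for x in states:
            y = x
            for l in lam:
                y = phi(y, l)
            results.add(y)
        # Step 4: coalescence check.
        if len(results) == 1:
            return results.pop()
        T *= 2

sample = cftp([0, 1, 2], phi, random.Random(1))
```

Replaying the previously generated numbers (rather than regenerating them) at every doubling of T is what makes the output exact.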

Suppose that there exists a partial order ⪰ on the set of states Ω. A transition rule expressed by a deterministic update function φ is called monotone (with respect to ⪰) if ∀λ ∈ [0, 1), ∀x, y ∈ Ω, x ⪰ y ⇒ φ(x, λ) ⪰ φ(y, λ). For ease, we also say that a chain is monotone if the chain has a monotone transition rule.

Theorem 2 (monotone CFTP [9, 3]). Suppose that a Markov chain defined by an update function φ is monotone with respect to a partially ordered set of states (Ω, ⪰), and that ∃x_max, x_min ∈ Ω, ∀x ∈ Ω, x_max ⪰ x ⪰ x_min. Then the CFTP algorithm terminates with probability 1, and a sequence λ ∈ [0, 1)^T satisfies the coalescence condition, i.e., ∃y ∈ Ω, ∀x ∈ Ω, y = Φ_{−T}^{0}(x, λ), if and only if Φ_{−T}^{0}(x_max, λ) = Φ_{−T}^{0}(x_min, λ).

When the given Markov chain satisfies the conditions of Theorem 2, we can modify the CFTP algorithm by substituting Step 4 by Step 4(a):

Step 4(a). If ∃y ∈ Ω, y = Φ_{−T}^{0}(x_max, λ) = Φ_{−T}^{0}(x_min, λ), then return y and stop.

The algorithm obtained by the above modification is called a monotone CFTP algorithm.

3 Perfect Sampling Algorithm

In this paper, we denote the set of integers (non-negative integers, positive integers) by Z (Z_+, Z_++). A Dirichlet random vector P = (P_1, P_2, ..., P_n) with non-negative parameters u_1, ..., u_n is a vector of random variables that admits the probability density function

  Γ(Σ_{i=1}^{n} u_i) / (Π_{i=1}^{n} Γ(u_i)) · Π_{i=1}^{n} p_i^{u_i − 1}

defined on the set {(p_1, p_2, ..., p_n) ∈ R^n : p_1 + ⋯ + p_n = 1, p_1, p_2, ..., p_n > 0}, where Γ(u) is the gamma function. Throughout this paper, we assume that n ≥ 2. For any integer N ≥ n, we discretize the domain with grid size 1/N and obtain a discrete set of integer vectors Ω defined by

  Ω := {(x_1, x_2, ..., x_n) ∈ Z_++^n : x_i > 0 (∀i), x_1 + ⋯ + x_n = N}.

A discretized Dirichlet random vector with non-negative parameters u_1, ..., u_n is a random vector X = (X_1, ..., X_n) ∈ Ω with the distribution

  Pr[X = (x_1, ..., x_n)] := C^{−1} Π_{i=1}^{n} (x_i/N)^{u_i − 1},

where C is the partition function (normalizing constant) defined by C := Σ_{x ∈ Ω} Π_{i=1}^{n} (x_i/N)^{u_i − 1}.
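For small n and N, the distribution above can be computed by brute-force enumeration of Ω. The following Python sketch does so; the function names (states, weight, pmf) are ours, and the u_i − 1 exponents follow the definition above.

```python
def states(n, N):
    """All x in Z_++^n with x_1 + ... + x_n = N (the set Omega)."""
    if n == 1:
        return [(N,)] if N >= 1 else []
    return [(x1,) + rest
            for x1 in range(1, N - n + 2)
            for rest in states(n - 1, N - x1)]

def weight(x, u, N):
    """Unnormalized probability: prod_i (x_i / N)^(u_i - 1)."""
    w = 1.0
    for xi, ui in zip(x, u):
        w *= (xi / N) ** (ui - 1.0)
    return w

def pmf(u, N):
    """Discretized Dirichlet distribution on Omega as a dictionary."""
    xs = states(len(u), N)
    C = sum(weight(x, u, N) for x in xs)  # partition function
    return {x: weight(x, u, N) / C for x in xs}

dist = pmf((2.0, 1.0, 1.0), 6)
```

With u = (1, 1, ..., 1) every exponent is zero, so the distribution is uniform over the C(N−1, n−1) grid points of Ω; this is a convenient sanity check.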

For any integer b ≥ 2, we introduce a set of 2-dimensional integer vectors

  Ω_b := {(Y_1, Y_2) ∈ Z^2 : Y_1, Y_2 > 0, Y_1 + Y_2 = b}

and a distribution function f_{(Y_1, Y_2)}(u_i, u_j): Ω_b → [0, 1] with non-negative parameters u_i, u_j defined by

  f_{(Y_1, Y_2)}(u_i, u_j) := C(u_i, u_j, b)^{−1} Y_1^{u_i − 1} Y_2^{u_j − 1},

where C(u_i, u_j, b) := Σ_{(Y_1, Y_2) ∈ Ω_b} Y_1^{u_i − 1} Y_2^{u_j − 1} is the partition function. We also introduce a vector (g_0^b(u_i, u_j), g_1^b(u_i, u_j), ..., g_{b−1}^b(u_i, u_j)) defined by

  g_k^b(u_i, u_j) := 0 (k = 0), and Σ_{l=1}^{k} C(u_i, u_j, b)^{−1} l^{u_i − 1} (b − l)^{u_j − 1} (k ∈ {1, 2, ..., b−1}).

It is clear that 0 = g_0^b(u_i, u_j) < g_1^b(u_i, u_j) < ⋯ < g_{b−1}^b(u_i, u_j) = 1.

We describe our Markov chain M with state space Ω. Given a current state X ∈ Ω, we generate a random number λ ∈ [1, n). Then the transition X → X' with respect to λ takes place as follows.

Markov chain M
Input: A random number λ ∈ [1, n).
Step 1: Put i := ⌊λ⌋ and b := X_i + X_{i+1}.
Step 2: Let k ∈ {1, 2, ..., b−1} be the unique value satisfying g_{k−1}^b(u_i, u_{i+1}) ≤ λ − ⌊λ⌋ < g_k^b(u_i, u_{i+1}).
Step 3: Put X'_j := k (j = i), b − k (j = i+1), X_j (otherwise).

The update function φ: Ω × [1, n) → Ω of our chain is defined by φ(X, λ) := X', where X' is determined by the above procedure. Clearly, this chain is irreducible and aperiodic. Since the detailed balance equations hold, the stationary distribution of the above Markov chain M is the discretized Dirichlet distribution.

We define two special states X^U, X^L ∈ Ω by

  X^U := (N − n + 1, 1, 1, ..., 1),  X^L := (1, 1, ..., 1, N − n + 1).

Now we describe our algorithm.

Algorithm 1
Step 1. Set the starting time period T := 1 to go back, and let λ be the empty sequence.
Step 2. Generate random real numbers λ[−T], λ[−T+1], ..., λ[−T/2−1] ∈ [1, n) and put λ := (λ[−T], λ[−T+1], ..., λ[−1]).
Step 3. Start two chains from X^U and X^L, respectively, at time period −T, and run them to time period 0 according to the update function φ with the sequence of numbers in λ.

Step 4. [Coalescence check] If ∃Y ∈ Ω, Y = Φ_{−T}^{0}(X^U, λ) = Φ_{−T}^{0}(X^L, λ), then return Y and stop. Else, update the starting time period T := 2T, and go to Step 2.

Theorem 3. With probability 1, Algorithm 1 terminates and returns a state. The state obtained by Algorithm 1 is a realization of a sample exactly according to the discretized Dirichlet distribution.

The above theorem guarantees that Algorithm 1 is a perfect sampling algorithm. We prove the above theorem by showing the monotonicity in the next section.

4 Monotonicity of the Chain

In Section 2, we described two theorems. Thus we only need to show the monotonicity of our chain. First, we introduce a partial order on the state space Ω. For any vector X ∈ Ω, we define the cumulative sum vector c_X ∈ Z_+^{n+1} by

  c_X(i) := 0 (i = 0), and X_1 + X_2 + ⋯ + X_i (i ∈ {1, 2, ..., n}),

where c_X = (c_X(0), c_X(1), ..., c_X(n)). Clearly, there exists a bijection between Ω and the set {c_X : X ∈ Ω}. For any pair of states X, Y ∈ Ω, we say X ⪰ Y if and only if c_X ≥ c_Y componentwise. It is clear that the relation ⪰ is a partial order on Ω. We can see easily that ∀X ∈ Ω, X^U ⪰ X ⪰ X^L.

We say that a state X ∈ Ω covers Y ∈ Ω at j, denoted by X ⋗ Y or X ⋗_j Y, when

  c_X(i) − c_Y(i) = 1 (i = j), and 0 (otherwise).

Note that X ⋗_j Y if and only if

  X_i − Y_i = +1 (i = j), −1 (i = j+1), and 0 (otherwise).

Next, we show a key lemma for proving monotonicity.

Lemma 1. ∀X, Y ∈ Ω, ∀λ ∈ [1, n): X ⋗_j Y ⇒ φ(X, λ) ⪰ φ(Y, λ).

Proof: We denote φ(X, λ) by X' and φ(Y, λ) by Y' for simplicity. For any index i ≠ ⌊λ⌋, it is easy to see that c_{X'}(i) = c_X(i) and c_{Y'}(i) = c_Y(i), and so c_{X'}(i) − c_{Y'}(i) = c_X(i) − c_Y(i) ≥ 0, since X ⪰ Y. In the following, we show that c_{X'}(⌊λ⌋) ≥ c_{Y'}(⌊λ⌋). Clearly from the definition of our Markov chain, X'_{⌊λ⌋} is the unique value k satisfying

  g_{k−1}^{b}(u_{⌊λ⌋}, u_{⌊λ⌋+1}) ≤ λ − ⌊λ⌋ < g_k^{b}(u_{⌊λ⌋}, u_{⌊λ⌋+1}),

where b := X_{⌊λ⌋} + X_{⌊λ⌋+1}. Similarly, Y'_{⌊λ⌋} is the unique value k' satisfying

  g_{k'−1}^{b'}(u_{⌊λ⌋}, u_{⌊λ⌋+1}) ≤ λ − ⌊λ⌋ < g_{k'}^{b'}(u_{⌊λ⌋}, u_{⌊λ⌋+1}),

where b' := Y_{⌊λ⌋} + Y_{⌊λ⌋+1}. We need to consider the following three cases.

Case 1: If ⌊λ⌋ ≠ j − 1 and ⌊λ⌋ ≠ j + 1, then b = b', and so we have X'_{⌊λ⌋} = k = k' = Y'_{⌊λ⌋}; hence c_{X'}(⌊λ⌋) − c_{Y'}(⌊λ⌋) = c_X(⌊λ⌋ − 1) − c_Y(⌊λ⌋ − 1) ≥ 0.

Case 2: Consider the case that ⌊λ⌋ = j − 1. Since X ⋗_j Y, we have b = b' + 1. From the definition of the cumulative sum vector,

  c_{X'}(j−1) − c_{Y'}(j−1) = c_X(j−2) + X'_{j−1} − c_Y(j−2) − Y'_{j−1} = X'_{j−1} − Y'_{j−1}.

Thus, it is enough to show that X'_{j−1} ≥ Y'_{j−1}. Lemma 5 in the Appendix implies the following inequalities,

  0 = g_0^{b'+1} = g_0^{b'} ≤ g_1^{b'+1} ≤ g_1^{b'} ≤ ⋯ ≤ g_k^{b'+1} ≤ g_k^{b'} ≤ g_{k+1}^{b'+1} ≤ ⋯ ≤ g_{b'−1}^{b'+1} ≤ g_{b'−1}^{b'} ≤ g_{b'}^{b'+1} = 1,

which we will call the alternating inequalities (every g is evaluated at (u_{j−1}, u_j)). For example, if the inequalities g_k^{b'+1} ≤ λ − ⌊λ⌋ < g_k^{b'} hold, then X'_{j−1} = k + 1 > k = Y'_{j−1}; and if g_k^{b'} ≤ λ − ⌊λ⌋ < g_{k+1}^{b'+1} hold, then X'_{j−1} = k + 1 = Y'_{j−1}. Thus we have X'_{j−1} − Y'_{j−1} ∈ {0, 1} (see Figure 1). From the above we have that X'_{j−1} ≥ Y'_{j−1}.

Fig. 1. The alternating inequalities: the thresholds g_k^{b'} and g_k^{b'+1}, evaluated at (u_{j−1}, u_j), interleave on the interval [0, 1].

Case 3: Consider the case that ⌊λ⌋ = j + 1. Since X ⋗_j Y, we have b + 1 = b'. From the definition of the cumulative sum vector,

  c_{X'}(j+1) − c_{Y'}(j+1) = c_X(j) + X'_{j+1} − c_Y(j) − Y'_{j+1} = 1 + X'_{j+1} − Y'_{j+1}.

Thus, it is enough to show that 1 + X'_{j+1} ≥ Y'_{j+1}. Lemma 5 also implies the corresponding alternating inequalities

  0 = g_0^{b+1} = g_0^{b} ≤ g_1^{b+1} ≤ g_1^{b} ≤ ⋯ ≤ g_k^{b+1} ≤ g_k^{b} ≤ g_{k+1}^{b+1} ≤ ⋯ ≤ g_{b−1}^{b+1} ≤ g_{b−1}^{b} ≤ g_b^{b+1} = 1,

where every g is evaluated at (u_{j+1}, u_{j+2}) and b' = b + 1. Then it is easy to see that X'_{j+1} − Y'_{j+1} ∈ {−1, 0} (see Figure 2). From the above, we obtain the inequality 1 + X'_{j+1} ≥ Y'_{j+1}.

Fig. 2. The alternating inequalities for Case 3, with g_k^{b} and g_k^{b+1} evaluated at (u_{j+1}, u_{j+2}).

Lemma 2. The Markov chain M defined by the update function φ is monotone with respect to ⪰, i.e., ∀λ ∈ [1, n), ∀X, Y ∈ Ω, X ⪰ Y ⇒ φ(X, λ) ⪰ φ(Y, λ).

Proof: It is easy to see that there exists a sequence of states Z_1, Z_2, ..., Z_r of appropriate length satisfying X = Z_1 ⋗ Z_2 ⋗ ⋯ ⋗ Z_r = Y. Then, applying Lemma 1 repeatedly, we can show that φ(X, λ) = φ(Z_1, λ) ⪰ φ(Z_2, λ) ⪰ ⋯ ⪰ φ(Z_r, λ) = φ(Y, λ).

Lastly, we show the correctness of our algorithm.

Proof of Theorem 3: From Lemma 2, the Markov chain is monotone, and it is clear that (X^U, X^L) is the unique pair of maximum and minimum states. Thus Algorithm 1 is a monotone CFTP algorithm, and Theorems 1 and 2 imply Theorem 3.

5 Expected Running Time

Here, we discuss the running time of our algorithm. In this section, we assume the following.

Condition 1. The parameters are arranged in non-increasing order, i.e., u_1 ≥ u_2 ≥ ⋯ ≥ u_n.
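With monotonicity established, Algorithm 1 needs only the two extreme chains. The following self-contained Python sketch assembles the whole sampler; the function names (g_vector, phi, perfect_sample) are ours, and the u_i − 1 exponents follow Section 3.

```python
import math
import random

def g_vector(ui, uj, b):
    # Thresholds g^b_0, ..., g^b_{b-1} of the discretized beta distribution.
    w = [l ** (ui - 1.0) * (b - l) ** (uj - 1.0) for l in range(1, b)]
    C = sum(w)                       # partition function C(ui, uj, b)
    g, acc = [0.0], 0.0
    for x in w:
        acc += x
        g.append(acc / C)
    g[-1] = 1.0                      # guard against round-off
    return g

def phi(x, lam, u):
    # One transition of the chain M, driven by lam in [1, n).
    i = int(math.floor(lam))         # Step 1: i := floor(lam)
    frac = lam - i                   # fractional part drives Step 2
    b = x[i - 1] + x[i]              # b := X_i + X_{i+1} (1-indexed)
    g = g_vector(u[i - 1], u[i], b)
    k = 1                            # Step 2: g_{k-1} <= frac < g_k
    while not (g[k - 1] <= frac < g[k]):
        k += 1
    xp = list(x)                     # Step 3: redistribute b
    xp[i - 1], xp[i] = k, b - k
    return tuple(xp)

def perfect_sample(u, N, rng):
    """Monotone CFTP (Algorithm 1): exact sample from the discretized
    Dirichlet distribution with parameters u and grid size 1/N."""
    n = len(u)
    x_up = (N - n + 1,) + (1,) * (n - 1)   # X^U, the maximum state
    x_lo = (1,) * (n - 1) + (N - n + 1,)   # X^L, the minimum state
    T = 1
    lam = []                               # lam[0] drives time -T
    while True:
        # Step 2: prepend fresh random numbers for times -T .. -T/2 - 1.
        fresh = [1.0 + (n - 1) * rng.random() for _ in range(T - len(lam))]
        lam = fresh + lam
        # Step 3: run both chains with the COMMON sequence lam.
        top, bot = x_up, x_lo
        for l in lam:
            top, bot = phi(top, l, u), phi(bot, l, u)
        # Step 4(a): monotone coalescence check.
        if top == bot:
            return top
        T *= 2

sample = perfect_sample((2.0, 1.0, 1.0), 12, random.Random(7))
```

As in the generic CFTP sketch, the old random numbers must be replayed, not regenerated, each time T is doubled; only then does the returned state have exactly the stationary distribution.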

The following is the main result of this paper.

Theorem 4. Under Condition 1, the expected number of transitions executed in Algorithm 1 is bounded by O(n^3 ln N), where n is the dimension (the number of parameters) and 1/N is the grid size for discretization.

In the rest of this section, we prove Theorem 4 by estimating the expectation of the coalescence time T ∈ Z_++ defined by

  T := min{t > 0 : ∃y ∈ Ω, ∀x ∈ Ω, y = Φ_{−t}^{0}(x, Λ)}.

Note that T is a random variable. Given a pair of probability distributions ν_1 and ν_2 on the finite state space Ω, the total variation distance between ν_1 and ν_2 is defined by D_TV(ν_1, ν_2) := (1/2) Σ_{x∈Ω} |ν_1(x) − ν_2(x)|. The mixing rate of an ergodic Markov chain is defined by

  τ := max_{x∈Ω} min{t : ∀s ≥ t, D_TV(π, P_x^s) ≤ 1/e},

where π is the stationary distribution and P_x^s is the probability distribution of the chain at time s with initial state x. The Path Coupling Theorem is a useful technique for bounding the mixing rate.

Theorem 5 (Path Coupling [1, 2]). Let MC be a finite ergodic Markov chain with state space Ω. Let G = (Ω, E) be a connected undirected graph with vertex set Ω and edge set E. Let l: E → R be a positive length defined on the edge set. For any pair of vertices x, y of G, the distance between x and y, denoted by d(x, y) (and/or d(y, x)), is the length of a shortest path between x and y, where the length of a path is the sum of the lengths of the edges in the path. Suppose that there exists a joint process (X, Y) → (X', Y') with respect to MC whose marginals are each a faithful copy of MC, and that

  ∃β, 0 < β < 1, ∀{X, Y} ∈ E, E[d(X', Y')] ≤ β d(X, Y).

Then the mixing rate τ of the Markov chain MC satisfies τ ≤ (1 − β)^{−1}(1 + ln(D/d)), where d := min_{x≠y} d(x, y) and D := max_{x,y∈Ω} d(x, y).

The above theorem differs from the original theorem in [1], since the integrality of the edge lengths is not assumed. We dropped the integrality and introduced the minimum distance d. This modification is not essential, and we can show Theorem 5 similarly. Now, we show the polynomiality of our algorithm.
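On a tiny instance, the definitions just given can be exercised directly by building the transition matrix of the chain M and iterating a point mass; this also confirms numerically that the discretized Dirichlet distribution is stationary, as claimed in Section 3. A brute-force Python sketch (our helper names; u_i − 1 exponents as before):

```python
def states(n, N):
    if n == 1:
        return [(N,)]
    return [(x1,) + r for x1 in range(1, N - n + 2)
            for r in states(n - 1, N - x1)]

def pmf(u, N):
    xs = states(len(u), N)
    w = {}
    for x in xs:
        p = 1.0
        for xi, ui in zip(x, u):
            p *= (xi / N) ** (ui - 1.0)
        w[x] = p
    C = sum(w.values())                  # partition function
    return {x: p / C for x, p in w.items()}

def kernel(u, N):
    """Transition probabilities P(x, .) of the chain M: pick i uniformly
    from {1, ..., n-1}, then resample (X_i, X_{i+1}) from the discretized
    beta distribution with parameters (u_i, u_{i+1})."""
    n = len(u)
    P = {}
    for x in states(n, N):
        row = {}
        for i in range(n - 1):
            b = x[i] + x[i + 1]
            w = [k ** (u[i] - 1.0) * (b - k) ** (u[i + 1] - 1.0)
                 for k in range(1, b)]
            C = sum(w)
            for k in range(1, b):
                y = x[:i] + (k, b - k) + x[i + 2:]
                row[y] = row.get(y, 0.0) + w[k - 1] / (C * (n - 1))
        P[x] = row
    return P

def step(nu, P):
    """One step of the chain applied to a distribution nu."""
    out = {}
    for x, px in nu.items():
        for y, pxy in P[x].items():
            out[y] = out.get(y, 0.0) + px * pxy
    return out

def tv(nu1, nu2):
    """Total variation distance between two distributions."""
    keys = set(nu1) | set(nu2)
    return 0.5 * sum(abs(nu1.get(z, 0.0) - nu2.get(z, 0.0)) for z in keys)

u, N = (2.0, 1.0, 1.0), 6
pi = pmf(u, N)
P = kernel(u, N)
nu = {(N - 2, 1, 1): 1.0}        # point mass at the state X^U
dists = []
for s in range(300):
    nu = step(nu, P)
    dists.append(tv(pi, nu))
```

The sequence dists is non-increasing (the kernel is a contraction in total variation), and it drops below 1/e within the mixing time.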
First, we estimate the mixing rate of our chain M by employing the Path Coupling Theorem. In the proof of the following lemma, Condition 1 plays an important role.

Lemma 3. Under Condition 1, the mixing rate τ of our Markov chain M satisfies τ ≤ n(n−1)^2 {1 + ln(n⌈N/2⌉)}.

Proof. Let G = (Ω, E) be an undirected simple graph with vertex set Ω and edge set E defined as follows. A pair of vertices {X, Y} is an edge if and only if (1/2) Σ_{i=1}^{n} |X_i − Y_i| = 1. Clearly, the graph G is connected. For each edge

e = {X, Y} ∈ E, there exists a unique pair of indices {j_1, j_2} ⊆ {1, ..., n}, called the supporting pair of e, satisfying

  |X_i − Y_i| = 1 (i = j_1, j_2), and 0 (otherwise).

We define the length l(e) of an edge e by

  l(e) := (1/(n−1)) Σ_{i=1}^{j−1} (n − i),

where j = max{j_1, j_2} and {j_1, j_2} is the supporting pair of e. Note that 1 ≤ min_{e∈E} l(e) ≤ max_{e∈E} l(e) ≤ n/2. For each pair X, Y ∈ Ω, we define the distance d(X, Y) to be the length of the shortest path between X and Y on G. Clearly, the diameter of G, i.e., max_{X,Y∈Ω} d(X, Y), is bounded by n⌈N/2⌉, since d(X, Y) ≤ (n/2) Σ_{i=1}^{n} (1/2)|X_i − Y_i| ≤ n⌈N/2⌉ for any X, Y ∈ Ω. The definition of the edge length implies that for any edge {X, Y} ∈ E, d(X, Y) = l({X, Y}).

We define a joint process (X, Y) → (X', Y') by (X, Y) → (φ(X, Λ), φ(Y, Λ)) with a uniform real random number Λ ∈ [1, n), where φ is the update function defined in Section 3. Now we show that E[d(X', Y')] ≤ β d(X, Y), with β = 1 − 1/(n(n−1)^2), for any pair {X, Y} ∈ E. In the following, we denote the supporting pair of {X, Y} by {j_1, j_2}. Without loss of generality, we can assume that j_1 < j_2 and X_{j_1} + 1 = Y_{j_1}.

Case 1: When ⌊Λ⌋ = j_2 − 1, we will show that E[d(X', Y') | ⌊Λ⌋ = j_2 − 1] ≤ d(X, Y) − (1/2)(n − j_2 + 1)/(n−1). In the case j_1 = j_2 − 1, X' = Y' with conditional probability 1, and hence d(X', Y') = 0. In the following, we consider the case j_1 < j_2 − 1. Put b := X_{j_2−1} + X_{j_2} and b' := Y_{j_2−1} + Y_{j_2}. Since X_{j_2} = Y_{j_2} + 1, b = b' + 1 holds. From the definition of the update function of our Markov chain, we have the following:

  X'_{j_2−1} = k  if and only if  g_{k−1}^{b'+1}(u_{j_2−1}, u_{j_2}) ≤ Λ − ⌊Λ⌋ < g_k^{b'+1}(u_{j_2−1}, u_{j_2}),
  Y'_{j_2−1} = k  if and only if  g_{k−1}^{b'}(u_{j_2−1}, u_{j_2}) ≤ Λ − ⌊Λ⌋ < g_k^{b'}(u_{j_2−1}, u_{j_2}).

As described in the proof of Lemma 1, the alternating inequalities

  0 = g_0^{b'+1} = g_0^{b'} ≤ g_1^{b'+1} ≤ g_1^{b'} ≤ ⋯ ≤ g_{b'−1}^{b'+1} ≤ g_{b'−1}^{b'} ≤ g_{b'}^{b'+1} = 1

hold, and thus X'_{j_2−1} − Y'_{j_2−1} ∈ {0, 1}. If X'_{j_2−1} = Y'_{j_2−1}, the supporting pair of {X', Y'} is {j_1, j_2}, and so d(X', Y') = d(X, Y). If X'_{j_2−1} ≠ Y'_{j_2−1}, the supporting pair of {X', Y'} is {j_1, j_2 − 1}, and so d(X', Y') = d(X, Y) − (n − j_2 + 1)/(n−1).

Lemma 6 in the Appendix implies that if u_{j_2−1} ≥ u_{j_2}, then

  Pr[X'_{j_2−1} = Y'_{j_2−1} | ⌊Λ⌋ = j_2−1] ≤ 1/2,  Pr[X'_{j_2−1} ≠ Y'_{j_2−1} | ⌊Λ⌋ = j_2−1] ≥ 1/2,

by showing that the inequality

  Pr[X'_{j_2−1} ≠ Y'_{j_2−1} | ⌊Λ⌋ = j_2−1] − Pr[X'_{j_2−1} = Y'_{j_2−1} | ⌊Λ⌋ = j_2−1]
  = Σ_{k=1}^{b'−1} [g_k^{b'} − g_k^{b'+1}] − Σ_{k=1}^{b'} [g_k^{b'+1} − g_{k−1}^{b'}] ≥ 0

holds when u_{j_2−1} ≥ u_{j_2}, where every g is evaluated at (u_{j_2−1}, u_{j_2}) (see Figure 3). Condition 1 is necessary to show Lemma 6. Thus we obtain

  E[d(X', Y') | ⌊Λ⌋ = j_2−1] ≤ (1/2) d(X, Y) + (1/2)(d(X, Y) − (n − j_2 + 1)/(n−1)) = d(X, Y) − (1/2)(n − j_2 + 1)/(n−1).

Fig. 3. The alternating inequalities, with g_k^{b'} and g_k^{b'+1} evaluated at (u_{j_2−1}, u_{j_2}); the intervals on which X'_{j_2−1} ≠ Y'_{j_2−1} have total length at least 1/2 when u_{j_2−1} ≥ u_{j_2}.

Case 2: When ⌊Λ⌋ = j_2, we can show that E[d(X', Y') | ⌊Λ⌋ = j_2] ≤ d(X, Y) + (1/2)(n − j_2)/(n−1) in a similar way to Case 1.

Case 3: When ⌊Λ⌋ ≠ j_2 − 1 and ⌊Λ⌋ ≠ j_2, it is easy to see that the supporting pair {j'_1, j'_2} of {X', Y'} satisfies max{j'_1, j'_2} = max{j_1, j_2}. Thus d(X', Y') = d(X, Y).

The probability of the appearance of Case 1 is equal to 1/(n−1), and that of Case 2 is less than or equal to 1/(n−1). From the above,

  E[d(X', Y')] ≤ d(X, Y) − (n − j_2 + 1)/(2(n−1)^2) + (n − j_2)/(2(n−1)^2) = d(X, Y) − 1/(2(n−1)^2)
  ≤ d(X, Y) (1 − 1/(2(n−1)^2 max_{{X,Y}∈E} d(X, Y))) ≤ (1 − 1/(n(n−1)^2)) d(X, Y).

Since the diameter of G is bounded by n⌈N/2⌉ and the minimum distance is at least 1, Theorem 5 implies that the mixing rate τ satisfies τ ≤ n(n−1)^2 {1 + ln(n⌈N/2⌉)}.

Next, we estimate the coalescence time. When we apply Propp and Wilson's result in [9] straightforwardly, the obtained bound on the coalescence time is not tight. In the following, we use their technique carefully and derive a better bound.
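The monotone coupling underlying the case analysis above can also be stress-tested by brute force: enumerate Ω for a small instance and check that every pair comparable in the cumulative-sum order stays comparable under a common driving number λ. A Python sketch under the same assumptions as the earlier sketches (our helper names):

```python
import math

def g_vector(ui, uj, b):
    w = [l ** (ui - 1.0) * (b - l) ** (uj - 1.0) for l in range(1, b)]
    C = sum(w)
    g, acc = [0.0], 0.0
    for x in w:
        acc += x
        g.append(acc / C)
    g[-1] = 1.0
    return g

def phi(x, lam, u):
    i = int(math.floor(lam))
    frac = lam - i
    b = x[i - 1] + x[i]
    g = g_vector(u[i - 1], u[i], b)
    k = 1
    while not (g[k - 1] <= frac < g[k]):
        k += 1
    xp = list(x)
    xp[i - 1], xp[i] = k, b - k
    return tuple(xp)

def states(n, N):
    if n == 1:
        return [(N,)]
    return [(x1,) + r for x1 in range(1, N - n + 2)
            for r in states(n - 1, N - x1)]

def cumsum(x):
    c, acc = [0], 0
    for xi in x:
        acc += xi
        c.append(acc)
    return c

def dominates(x, y):
    # The partial order: x >= y iff c_x >= c_y componentwise.
    return all(a >= b for a, b in zip(cumsum(x), cumsum(y)))

def is_monotone(u, N, grid=40):
    """Check the monotonicity of phi on a grid of lambda values."""
    n = len(u)
    omega = states(n, N)
    lams = [1.0 + (n - 1) * (t + 0.5) / grid for t in range(grid)]
    for x in omega:
        for y in omega:
            if dominates(x, y):
                for lam in lams:
                    if not dominates(phi(x, lam, u), phi(y, lam, u)):
                        return False
    return True

ok = is_monotone((3.0, 2.0, 1.0), 7)
```

A finite grid of λ values cannot certify monotonicity for every real λ, but any violation it finds would be a genuine counterexample to Lemma 2.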

Lemma 4. Under Condition 1, the coalescence time T of M satisfies E[T] = O(n^3 ln N).

Proof. Let G = (Ω, E) be the undirected graph, and d(X, Y) (X, Y ∈ Ω) the metric on G, both of which are defined in the proof of Lemma 3. We define D := d(X^U, X^L) and τ_0 := n(n−1)^2 (1 + ln D). By using the inequality obtained in the proof of Lemma 3, we have

  Pr[T > τ_0] = Pr[Φ_{−τ_0}^{0}(X^U, Λ) ≠ Φ_{−τ_0}^{0}(X^L, Λ)] = Pr[Φ_0^{τ_0}(X^U, Λ) ≠ Φ_0^{τ_0}(X^L, Λ)]
  ≤ Σ_{X,Y∈Ω} d(X, Y) Pr[X = Φ_0^{τ_0}(X^U, Λ), Y = Φ_0^{τ_0}(X^L, Λ)]  (since d(X, Y) ≥ 1 whenever X ≠ Y)
  = E[d(Φ_0^{τ_0}(X^U, Λ), Φ_0^{τ_0}(X^L, Λ))]
  ≤ (1 − 1/(n(n−1)^2))^{τ_0} d(X^U, X^L) ≤ e^{−(1 + ln D)} D = D/(eD) = 1/e.

By the submultiplicativity of the coalescence time [9], for any k ∈ Z_+, Pr[T > kτ_0] ≤ Pr[T > τ_0]^k ≤ (1/e)^k. Thus

  E[T] = Σ_{t≥0} t Pr[T = t] ≤ τ_0 + τ_0 Pr[T > τ_0] + τ_0 Pr[T > 2τ_0] + ⋯ ≤ τ_0 (1 + 1/e + 1/e^2 + ⋯) = τ_0 / (1 − 1/e) ≤ 2τ_0.

Clearly, D ≤ n⌈N/2⌉; because N ≥ n, we obtain the result that E[T] = O(n^3 ln N).

Lastly, we bound the expected number of transitions executed in Algorithm 1.

Proof of Theorem 4: We denote by T the coalescence time of our chain; clearly T is a random variable. Put K := ⌈log_2 T⌉. Algorithm 1 terminates when it sets the starting time period to 2^K, at the (K+1)-st iteration. Then the total number of simulated transitions is bounded by 2(2 + 4 + ⋯ + 2^K) < 2^{K+2} ≤ 8T, since we need to execute two chains, from both X^U and X^L. Thus the expectation of the total number of transitions of M is bounded by E[8T] = O(n^3 ln N). We can assume Condition 1 by sorting the parameters in O(n ln n) time.

6 Conclusion

In this paper, we proposed a monotone Markov chain whose stationary distribution is a discretized Dirichlet distribution. By employing Propp and Wilson's result on the monotone CFTP algorithm, we constructed a perfect sampling algorithm. We showed that our Markov chain is rapidly mixing by using the path coupling method; the rapidity implies that our perfect sampling algorithm is a polynomial time algorithm. The obtained time complexity does not depend on the magnitudes of the parameters. We can reduce the memory requirement by employing Wilson's read-once algorithm in [12].

Appendix

Lemma 5 is essentially equivalent to the lemma appearing in the Appendix of the paper [6] by Matsui, Motoki and Kamatani, which deals with an approximate sampler for the discretized Dirichlet distribution. Lemma 6 is a new result obtained in this paper.

Lemma 5. ∀b ∈ {2, 3, ...}, ∀u_i, u_j ≥ 0, ∀k ∈ {1, 2, ..., b−1},

  g_k^{b+1}(u_i, u_j) ≤ g_k^{b}(u_i, u_j) ≤ g_{k+1}^{b+1}(u_i, u_j).

Proof: In the following, we show the second inequality. We can show the first inequality in a similar way. We denote C(u_i, u_j, b+1) = C_+ and C(u_i, u_j, b) = C for simplicity. From the definition of g_k^b(u_i, u_j), we obtain

  H(k) := g_{k+1}^{b+1}(u_i, u_j) − g_k^{b}(u_i, u_j)
        = Σ_{l=1}^{k+1} C_+^{−1} l^{u_i−1} (b+1−l)^{u_j−1} − Σ_{l=1}^{k} C^{−1} l^{u_i−1} (b−l)^{u_j−1}
        = C^{−1} Σ_{l=k+1}^{b−1} l^{u_i−1} (b−l)^{u_j−1} − C_+^{−1} Σ_{l=k+2}^{b} l^{u_i−1} (b+1−l)^{u_j−1}
        = Σ_{l=k+1}^{b−1} (b−l)^{u_j−1} [C^{−1} l^{u_i−1} − C_+^{−1} (l+1)^{u_i−1}]
        = Σ_{l=k+1}^{b−1} C^{−1} l^{u_i−1} (b−l)^{u_j−1} h(l),   (1)

where we introduce the function h: {1, 2, ..., b−1} → R defined by

  h(l) := 1 − ((l+1)/l)^{u_i−1} (C/C_+).

Similarly, we can also show that

  H(k) = C_+^{−1} b^{u_j−1} + Σ_{l=1}^{k} (b−l)^{u_j−1} [C_+^{−1} (l+1)^{u_i−1} − C^{−1} l^{u_i−1}]
       = C_+^{−1} b^{u_j−1} − Σ_{l=1}^{k} C^{−1} l^{u_i−1} (b−l)^{u_j−1} h(l)
       ≥ − Σ_{l=1}^{k} C^{−1} l^{u_i−1} (b−l)^{u_j−1} h(l).   (2)

(a) Consider the case that u_i ≥ 1. Since u_i − 1 ≥ 0, the function h(l) is monotone non-decreasing. When h(k+1) ≥ 0 holds, we have 0 ≤ h(k+1) ≤ h(k+2) ≤ ⋯ ≤ h(b−1), and so (1) implies the non-negativity H(k) ≥ 0. If h(k+1) < 0, then the inequalities h(1) ≤ h(2) ≤ ⋯ ≤ h(k) ≤ h(k+1) < 0 hold, and so (2) implies that H(k) ≥ −Σ_{l=1}^{k} C^{−1} l^{u_i−1} (b−l)^{u_j−1} h(l) ≥ 0.

(b) Consider the case that 0 ≤ u_i ≤ 1. Since u_i − 1 ≤ 0, the function h(l) is monotone non-increasing. If the inequality h(b−1) ≥ 0 holds, we have h(1) ≥ h(2) ≥ ⋯ ≥ h(b−1) ≥ 0, and inequality (1) implies the non-negativity H(k) ≥ 0. Thus, we only need to show that

  h(b−1) = 1 − (b/(b−1))^{u_i−1} (C/C_+) ≥ 0.

In the rest of this proof, we substitute u_i − 1 by α_i and u_j − 1 by α_j. We define a function H_0(b, α_i, α_j) by

  H_0(b, α_i, α_j) := (b−1)^{α_i} C_+ − b^{α_i} C.

It is clear that if the condition [∀α_i ≤ 0, ∀α_j ≥ −1, ∀b ∈ {2, 3, 4, ...}, H_0(b, α_i, α_j) ≥ 0] holds, we obtain the required result that h(b−1) ≥ 0 for each b ∈ {2, 3, 4, ...}. Expanding C_+ = Σ_{k=1}^{b} k^{α_i} (b+1−k)^{α_j} and C = Σ_{k=1}^{b−1} k^{α_i} (b−k)^{α_j} and collecting the terms with a common index k, the non-negativity of H_0(b, α_i, α_j) reduces to the non-negativity of a function H_1(b, α_i, α_j, k) for each k ∈ {1, 2, ..., b−1}. Since the ratios of the paired weights are greater than 1 and α_j ≥ −1, we have H_1(b, α_i, α_j, k) ≥ H_1(b, α_i, −1, k). We differentiate the function H_1(b, α_i, −1, k) with respect to α_i. Since (k, b) is a pair of positive integers satisfying k ≤ b − 1, the non-positivity of α_i implies 0 < (1 + 1/k)^{α_i} ≤ (1 + 1/(b−1))^{α_i} and 0 < log (1 + 1/k)^k ≤ log (1 + 1/(b−1))^{b−1}, because the function (1 + 1/x)^x is increasing. Thus the function H_1(b, α_i, −1, k) is monotone non-decreasing with respect to α_i ≤ 0, and so H_1(b, α_i, −1, k) ≥ H_1(b, 0, −1, k). A direct computation shows H_1(b, 0, −1, k) ≥ 0, which completes the proof.
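Lemma 5, and with it the alternating inequalities of Section 4, is easy to confirm numerically for concrete parameters. A brute-force Python check (g_vector as in the earlier sketches, with the u_i − 1 exponents):

```python
def g_vector(ui, uj, b):
    # Thresholds g^b_0, ..., g^b_{b-1} with weights l^(ui-1) (b-l)^(uj-1).
    w = [l ** (ui - 1.0) * (b - l) ** (uj - 1.0) for l in range(1, b)]
    C = sum(w)
    g, acc = [0.0], 0.0
    for x in w:
        acc += x
        g.append(acc / C)
    return g

def lemma5_holds(ui, uj, b, eps=1e-12):
    """Check g^{b+1}_k <= g^b_k <= g^{b+1}_{k+1} for k = 1, ..., b-1."""
    g_b = g_vector(ui, uj, b)
    g_b1 = g_vector(ui, uj, b + 1)
    return all(g_b1[k] <= g_b[k] + eps and g_b[k] <= g_b1[k + 1] + eps
               for k in range(1, b))

ok = lemma5_holds(2.0, 0.5, 10)
```

Checking a grid of parameters, e.g. all (u_i, u_j) in {0, 0.5, 1, 2, 5}^2 and b up to 30, exercises both cases (u_i ≥ 1 and 0 ≤ u_i ≤ 1) of the proof above.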

Lemma 6. ∀b ∈ {2, 3, ...}, ∀u_i ≥ u_j ≥ 0,

  Σ_{k=1}^{b−1} [g_k^{b}(u_i, u_j) − g_k^{b+1}(u_i, u_j)] − Σ_{k=1}^{b} [g_k^{b+1}(u_i, u_j) − g_{k−1}^{b}(u_i, u_j)] ≥ 0.

Proof: We denote C(u_i, u_j, b+1) = C_+ and C(u_i, u_j, b) = C for simplicity. It is not difficult to show the following equalities:

  G := Σ_{k=1}^{b−1} [g_k^{b} − g_k^{b+1}] − Σ_{k=1}^{b} [g_k^{b+1} − g_{k−1}^{b}]
     = 2 Σ_{k=1}^{b−1} g_k^{b} − 2 Σ_{k=1}^{b} g_k^{b+1} + 1
     = 2 C^{−1} Σ_{l=1}^{b−1} l^{u_i−1} (b−l)^{u_j} − 2 C_+^{−1} Σ_{l=1}^{b} l^{u_i−1} (b+1−l)^{u_j} + 1,

where every g is evaluated at (u_i, u_j); in the last line we used the fact that the weight of l appears in g_k^{b} for every k with l ≤ k ≤ b−1, i.e., b − l times (and b + 1 − l times in the sums of g_k^{b+1}). After multiplying by C C_+, expanding the partition functions C and C_+ as sums over a second index, and pairing the terms with indices k > l, G is expressed as

  G = C^{−1} C_+^{−1} Σ_{l=1}^{b−1} Σ_{k=l+1}^{b} (k − l) G_0(k, l, u_i, u_j),

where G_0(k, l, u_i, u_j) collects the four products of the weights k^{u_i−1}, l^{u_i−1}, (b+1−k)^{u_j−1}, (b−l)^{u_j−1} and their counterparts with the roles of u_i and u_j exchanged. Since l < l + 1 ≤ k, it is clear that k − l > 0. Thus, we only need to show that ∀l ∈ {1, ..., b−1}, ∀k ∈ {l+1, ..., b}, u_i ≥ u_j ≥ 0 implies G_0(k, l, u_i, u_j) ≥ 0. Rearranging the four products, it is easy to see that G_0(k, l, u_i, u_j) is non-decreasing with respect to u_i, and so G_0(k, l, u_i, u_j) ≥ G_0(k, l, u_j, u_j). By substituting u_i by u_j in the definition

19 of G 0 k, l, u i, u j, it is easy to see that G 0 k, l, u j, u j = 0. Thus we have the desired result. References. Buley, R., M. Dyer, M.: Path coupling: A technique for proving rapid mixing in Markov chains, 38th Annual Symposium on Foundations of Computer Science, IEEE, San Alimitos, 997, Buley, R.: Randomized Algorithms : Approximation, Generation, and Counting, Springer-Verlag, New York, Dimakos, X. K.: A guide to exact simulation, International Statistical Review, 69 00, pp Durin, R., Eddy, R., Krogh, A., Mitchison, G.: Biological sequence analysis: proailistic models of proteins and nucleic acids, Camridge Univ. Press, Kijima, S. and Matsui, T.,: Polynomial Time Perfect Sampling Algorithm for Two-rowed Contingency Tales, METR 003-5, Mathematical Engineering Technical Reports, University of Tokyo, 003. availale from 6. Matsui, T., Motoki, M., and Kamatani, N.,: Polynomial Time Approximate Sampler for Discretized Dirichlet Distriution, METR 003-0, Mathematical Engineering Technical Reports, University of Tokyo, 003. availale from 7. Niu, T., Qin, Z. S., Xu, X., Liu, J. S.: Bayesian haplotype inference for multiple linked single-nucleotide polymorphisms, Am. J. Hum. Genet., Pritchard, J. K., Stephens, M., Donnely, P.: Inference of population structure using multilocus genotype data, Genetics, Propp, J. and Wilson, D.: Exact sampling with coupled Markov chains and applications to statistical mechanics, Random Structures and Algorithms, 9 996, pp Propp, J. and Wilson, D.: How to get a perfectly random sample from a generic Markov chain and generate a random spanning tree of a directed graph, J. Algorithms, 7 998, pp Roert, C. P.: The Bayesian Choice, Springer-Verlag, New York, 00.. Wilson, D.: How to couple from the past using a read-once source of randomness, Random Structures and Algorithms, 6 000, pp


More information

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. Dijkstra s Algorithm and L-concave Function Maximization

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. Dijkstra s Algorithm and L-concave Function Maximization MATHEMATICAL ENGINEERING TECHNICAL REPORTS Dijkstra s Algorithm and L-concave Function Maximization Kazuo MUROTA and Akiyoshi SHIOURA METR 2012 05 March 2012; revised May 2012 DEPARTMENT OF MATHEMATICAL

More information

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. Deductive Inference for the Interiors and Exteriors of Horn Theories

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. Deductive Inference for the Interiors and Exteriors of Horn Theories MATHEMATICAL ENGINEERING TECHNICAL REPORTS Deductive Inference for the Interiors and Exteriors of Horn Theories Kazuhisa MAKINO and Hirotaka ONO METR 2007 06 February 2007 DEPARTMENT OF MATHEMATICAL INFORMATICS

More information

MATHEMATICAL ENGINEERING TECHNICAL REPORTS

MATHEMATICAL ENGINEERING TECHNICAL REPORTS MATHEMATICAL ENGINEERING TECHNICAL REPORTS Combinatorial Relaxation Algorithm for the Entire Sequence of the Maximum Degree of Minors in Mixed Polynomial Matrices Shun SATO (Communicated by Taayasu MATSUO)

More information

Polynomial-time counting and sampling of two-rowed contingency tables

Polynomial-time counting and sampling of two-rowed contingency tables Polynomial-time counting and sampling of two-rowed contingency tables Martin Dyer and Catherine Greenhill School of Computer Studies University of Leeds Leeds LS2 9JT UNITED KINGDOM Abstract In this paper

More information

MARKOV CHAIN MONTE CARLO

MARKOV CHAIN MONTE CARLO MARKOV CHAIN MONTE CARLO RYAN WANG Abstract. This paper gives a brief introduction to Markov Chain Monte Carlo methods, which offer a general framework for calculating difficult integrals. We start with

More information

The coupling method - Simons Counting Complexity Bootcamp, 2016

The coupling method - Simons Counting Complexity Bootcamp, 2016 The coupling method - Simons Counting Complexity Bootcamp, 2016 Nayantara Bhatnagar (University of Delaware) Ivona Bezáková (Rochester Institute of Technology) January 26, 2016 Techniques for bounding

More information

Approximate Counting and Markov Chain Monte Carlo

Approximate Counting and Markov Chain Monte Carlo Approximate Counting and Markov Chain Monte Carlo A Randomized Approach Arindam Pal Department of Computer Science and Engineering Indian Institute of Technology Delhi March 18, 2011 April 8, 2011 Arindam

More information

INTRODUCTION TO MARKOV CHAIN MONTE CARLO

INTRODUCTION TO MARKOV CHAIN MONTE CARLO INTRODUCTION TO MARKOV CHAIN MONTE CARLO 1. Introduction: MCMC In its simplest incarnation, the Monte Carlo method is nothing more than a computerbased exploitation of the Law of Large Numbers to estimate

More information

Advanced Sampling Algorithms

Advanced Sampling Algorithms + Advanced Sampling Algorithms + Mobashir Mohammad Hirak Sarkar Parvathy Sudhir Yamilet Serrano Llerena Advanced Sampling Algorithms Aditya Kulkarni Tobias Bertelsen Nirandika Wanigasekara Malay Singh

More information

Probability & Computing

Probability & Computing Probability & Computing Stochastic Process time t {X t t 2 T } state space Ω X t 2 state x 2 discrete time: T is countable T = {0,, 2,...} discrete space: Ω is finite or countably infinite X 0,X,X 2,...

More information

Stat 516, Homework 1

Stat 516, Homework 1 Stat 516, Homework 1 Due date: October 7 1. Consider an urn with n distinct balls numbered 1,..., n. We sample balls from the urn with replacement. Let N be the number of draws until we encounter a ball

More information

Bayesian inference with reliability methods without knowing the maximum of the likelihood function

Bayesian inference with reliability methods without knowing the maximum of the likelihood function Bayesian inference with reliaility methods without knowing the maximum of the likelihood function Wolfgang Betz a,, James L. Beck, Iason Papaioannou a, Daniel Strau a a Engineering Risk Analysis Group,

More information

The Incomplete Perfect Phylogeny Haplotype Problem

The Incomplete Perfect Phylogeny Haplotype Problem The Incomplete Perfect Phylogeny Haplotype Prolem Gad Kimmel 1 and Ron Shamir 1 School of Computer Science, Tel-Aviv University, Tel-Aviv 69978, Israel. Email: kgad@tau.ac.il, rshamir@tau.ac.il. Astract.

More information

Online Supplementary Appendix B

Online Supplementary Appendix B Online Supplementary Appendix B Uniqueness of the Solution of Lemma and the Properties of λ ( K) We prove the uniqueness y the following steps: () (A8) uniquely determines q as a function of λ () (A) uniquely

More information

Approximate Counting

Approximate Counting Approximate Counting Andreas-Nikolas Göbel National Technical University of Athens, Greece Advanced Algorithms, May 2010 Outline 1 The Monte Carlo Method Introduction DNFSAT Counting 2 The Markov Chain

More information

Markov Chains and MCMC

Markov Chains and MCMC Markov Chains and MCMC CompSci 590.02 Instructor: AshwinMachanavajjhala Lecture 4 : 590.02 Spring 13 1 Recap: Monte Carlo Method If U is a universe of items, and G is a subset satisfying some property,

More information

An Introduction to Randomized algorithms

An Introduction to Randomized algorithms An Introduction to Randomized algorithms C.R. Subramanian The Institute of Mathematical Sciences, Chennai. Expository talk presented at the Research Promotion Workshop on Introduction to Geometric and

More information

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning for for Advanced Topics in California Institute of Technology April 20th, 2017 1 / 50 Table of Contents for 1 2 3 4 2 / 50 History of methods for Enrico Fermi used to calculate incredibly accurate predictions

More information

Discrete solid-on-solid models

Discrete solid-on-solid models Discrete solid-on-solid models University of Alberta 2018 COSy, University of Manitoba - June 7 Discrete processes, stochastic PDEs, deterministic PDEs Table: Deterministic PDEs Heat-diffusion equation

More information

The Markov Chain Monte Carlo Method

The Markov Chain Monte Carlo Method The Markov Chain Monte Carlo Method Idea: define an ergodic Markov chain whose stationary distribution is the desired probability distribution. Let X 0, X 1, X 2,..., X n be the run of the chain. The Markov

More information

Propp-Wilson Algorithm (and sampling the Ising model)

Propp-Wilson Algorithm (and sampling the Ising model) Propp-Wilson Algorithm (and sampling the Ising model) Danny Leshem, Nov 2009 References: Haggstrom, O. (2002) Finite Markov Chains and Algorithmic Applications, ch. 10-11 Propp, J. & Wilson, D. (1996)

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

Computational statistics

Computational statistics Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated

More information

Lecture 6: September 22

Lecture 6: September 22 CS294 Markov Chain Monte Carlo: Foundations & Applications Fall 2009 Lecture 6: September 22 Lecturer: Prof. Alistair Sinclair Scribes: Alistair Sinclair Disclaimer: These notes have not been subjected

More information

Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC) Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can

More information

Perfect simulation for image analysis

Perfect simulation for image analysis Perfect simulation for image analysis Mark Huber Fletcher Jones Foundation Associate Professor of Mathematics and Statistics and George R. Roberts Fellow Mathematical Sciences Claremont McKenna College

More information

Lecture 7 and 8: Markov Chain Monte Carlo

Lecture 7 and 8: Markov Chain Monte Carlo Lecture 7 and 8: Markov Chain Monte Carlo 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering University of Cambridge http://mlg.eng.cam.ac.uk/teaching/4f13/ Ghahramani

More information

Lecture 9: Counting Matchings

Lecture 9: Counting Matchings Counting and Sampling Fall 207 Lecture 9: Counting Matchings Lecturer: Shayan Oveis Gharan October 20 Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications.

More information

Ch5. Markov Chain Monte Carlo

Ch5. Markov Chain Monte Carlo ST4231, Semester I, 2003-2004 Ch5. Markov Chain Monte Carlo In general, it is very difficult to simulate the value of a random vector X whose component random variables are dependent. In this chapter we

More information

Minimizing a convex separable exponential function subject to linear equality constraint and bounded variables

Minimizing a convex separable exponential function subject to linear equality constraint and bounded variables Minimizing a convex separale exponential function suect to linear equality constraint and ounded variales Stefan M. Stefanov Department of Mathematics Neofit Rilski South-Western University 2700 Blagoevgrad

More information

Convergence Rate of Markov Chains

Convergence Rate of Markov Chains Convergence Rate of Markov Chains Will Perkins April 16, 2013 Convergence Last class we saw that if X n is an irreducible, aperiodic, positive recurrent Markov chain, then there exists a stationary distribution

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov

More information

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. The Symmetric Quadratic Semi-Assignment Polytope

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. The Symmetric Quadratic Semi-Assignment Polytope MATHEMATICAL ENGINEERING TECHNICAL REPORTS The Symmetric Quadratic Semi-Assignment Polytope Hiroo SAITO (Communicated by Kazuo MUROTA) METR 2005 21 August 2005 DEPARTMENT OF MATHEMATICAL INFORMATICS GRADUATE

More information

MCMC: Markov Chain Monte Carlo

MCMC: Markov Chain Monte Carlo I529: Machine Learning in Bioinformatics (Spring 2013) MCMC: Markov Chain Monte Carlo Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2013 Contents Review of Markov

More information

Essential Maths 1. Macquarie University MAFC_Essential_Maths Page 1 of These notes were prepared by Anne Cooper and Catriona March.

Essential Maths 1. Macquarie University MAFC_Essential_Maths Page 1 of These notes were prepared by Anne Cooper and Catriona March. Essential Maths 1 The information in this document is the minimum assumed knowledge for students undertaking the Macquarie University Masters of Applied Finance, Graduate Diploma of Applied Finance, and

More information

The Monte Carlo Method

The Monte Carlo Method The Monte Carlo Method Example: estimate the value of π. Choose X and Y independently and uniformly at random in [0, 1]. Let Pr(Z = 1) = π 4. 4E[Z] = π. { 1 if X Z = 2 + Y 2 1, 0 otherwise, Let Z 1,...,

More information

Representation theory of SU(2), density operators, purification Michael Walter, University of Amsterdam

Representation theory of SU(2), density operators, purification Michael Walter, University of Amsterdam Symmetry and Quantum Information Feruary 6, 018 Representation theory of S(), density operators, purification Lecture 7 Michael Walter, niversity of Amsterdam Last week, we learned the asic concepts of

More information

an introduction to bayesian inference

an introduction to bayesian inference with an application to network analysis http://jakehofman.com january 13, 2010 motivation would like models that: provide predictive and explanatory power are complex enough to describe observed phenomena

More information

Advances and Applications in Perfect Sampling

Advances and Applications in Perfect Sampling and Applications in Perfect Sampling Ph.D. Dissertation Defense Ulrike Schneider advisor: Jem Corcoran May 8, 2003 Department of Applied Mathematics University of Colorado Outline Introduction (1) MCMC

More information

Perfect simulation of repulsive point processes

Perfect simulation of repulsive point processes Perfect simulation of repulsive point processes Mark Huber Department of Mathematical Sciences, Claremont McKenna College 29 November, 2011 Mark Huber, CMC Perfect simulation of repulsive point processes

More information

IN this paper we study a discrete optimization problem. Constrained Shortest Link-Disjoint Paths Selection: A Network Programming Based Approach

IN this paper we study a discrete optimization problem. Constrained Shortest Link-Disjoint Paths Selection: A Network Programming Based Approach Constrained Shortest Link-Disjoint Paths Selection: A Network Programming Based Approach Ying Xiao, Student Memer, IEEE, Krishnaiyan Thulasiraman, Fellow, IEEE, and Guoliang Xue, Senior Memer, IEEE Astract

More information

25.1 Markov Chain Monte Carlo (MCMC)

25.1 Markov Chain Monte Carlo (MCMC) CS880: Approximations Algorithms Scribe: Dave Andrzejewski Lecturer: Shuchi Chawla Topic: Approx counting/sampling, MCMC methods Date: 4/4/07 The previous lecture showed that, for self-reducible problems,

More information

The variance for partial match retrievals in k-dimensional bucket digital trees

The variance for partial match retrievals in k-dimensional bucket digital trees The variance for partial match retrievals in k-dimensional ucket digital trees Michael FUCHS Department of Applied Mathematics National Chiao Tung University January 12, 21 Astract The variance of partial

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).

More information

Luis Manuel Santana Gallego 100 Investigation and simulation of the clock skew in modern integrated circuits. Clock Skew Model

Luis Manuel Santana Gallego 100 Investigation and simulation of the clock skew in modern integrated circuits. Clock Skew Model Luis Manuel Santana Gallego 100 Appendix 3 Clock Skew Model Xiaohong Jiang and Susumu Horiguchi [JIA-01] 1. Introduction The evolution of VLSI chips toward larger die sizes and faster clock speeds makes

More information

6 Markov Chain Monte Carlo (MCMC)

6 Markov Chain Monte Carlo (MCMC) 6 Markov Chain Monte Carlo (MCMC) The underlying idea in MCMC is to replace the iid samples of basic MC methods, with dependent samples from an ergodic Markov chain, whose limiting (stationary) distribution

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate

More information

Stochastic Simulation

Stochastic Simulation Stochastic Simulation Idea: probabilities samples Get probabilities from samples: X count x 1 n 1. x k total. n k m X probability x 1. n 1 /m. x k n k /m If we could sample from a variable s (posterior)

More information

Lecture 28: April 26

Lecture 28: April 26 CS271 Randomness & Computation Spring 2018 Instructor: Alistair Sinclair Lecture 28: April 26 Disclaimer: These notes have not been subjected to the usual scrutiny accorded to formal publications. They

More information

Markov Chains CK eqns Classes Hitting times Rec./trans. Strong Markov Stat. distr. Reversibility * Markov Chains

Markov Chains CK eqns Classes Hitting times Rec./trans. Strong Markov Stat. distr. Reversibility * Markov Chains Markov Chains A random process X is a family {X t : t T } of random variables indexed by some set T. When T = {0, 1, 2,... } one speaks about a discrete-time process, for T = R or T = [0, ) one has a continuous-time

More information

Polynomials Characterizing Hyper b-ary Representations

Polynomials Characterizing Hyper b-ary Representations 1 2 3 47 6 23 11 Journal of Integer Sequences, Vol. 21 (2018), Article 18.4.3 Polynomials Characterizing Hyper -ary Representations Karl Dilcher 1 Department of Mathematics and Statistics Dalhousie University

More information

Theory of Stochastic Processes 8. Markov chain Monte Carlo

Theory of Stochastic Processes 8. Markov chain Monte Carlo Theory of Stochastic Processes 8. Markov chain Monte Carlo Tomonari Sei sei@mist.i.u-tokyo.ac.jp Department of Mathematical Informatics, University of Tokyo June 8, 2017 http://www.stat.t.u-tokyo.ac.jp/~sei/lec.html

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Reminder of some Markov Chain properties:

Reminder of some Markov Chain properties: Reminder of some Markov Chain properties: 1. a transition from one state to another occurs probabilistically 2. only state that matters is where you currently are (i.e. given present, future is independent

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

A Bayesian Approach to Phylogenetics

A Bayesian Approach to Phylogenetics A Bayesian Approach to Phylogenetics Niklas Wahlberg Based largely on slides by Paul Lewis (www.eeb.uconn.edu) An Introduction to Bayesian Phylogenetics Bayesian inference in general Markov chain Monte

More information

Answers and expectations

Answers and expectations Answers and expectations For a function f(x) and distribution P(x), the expectation of f with respect to P is The expectation is the average of f, when x is drawn from the probability distribution P E

More information

Lecture 6 January 15, 2014

Lecture 6 January 15, 2014 Advanced Graph Algorithms Jan-Apr 2014 Lecture 6 January 15, 2014 Lecturer: Saket Sourah Scrie: Prafullkumar P Tale 1 Overview In the last lecture we defined simple tree decomposition and stated that for

More information

ERASMUS UNIVERSITY ROTTERDAM Information concerning the Entrance examination Mathematics level 2 for International Business Administration (IBA)

ERASMUS UNIVERSITY ROTTERDAM Information concerning the Entrance examination Mathematics level 2 for International Business Administration (IBA) ERASMUS UNIVERSITY ROTTERDAM Information concerning the Entrance examination Mathematics level 2 for International Business Administration (IBA) General information Availale time: 2.5 hours (150 minutes).

More information

STA 294: Stochastic Processes & Bayesian Nonparametrics

STA 294: Stochastic Processes & Bayesian Nonparametrics MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a

More information

Machine Learning using Bayesian Approaches

Machine Learning using Bayesian Approaches Machine Learning using Bayesian Approaches Sargur N. Srihari University at Buffalo, State University of New York 1 Outline 1. Progress in ML and PR 2. Fully Bayesian Approach 1. Probability theory Bayes

More information

Decomposition Methods and Sampling Circuits in the Cartesian Lattice

Decomposition Methods and Sampling Circuits in the Cartesian Lattice Decomposition Methods and Sampling Circuits in the Cartesian Lattice Dana Randall College of Computing and School of Mathematics Georgia Institute of Technology Atlanta, GA 30332-0280 randall@math.gatech.edu

More information

CONTINUOUS DEPENDENCE ESTIMATES FOR VISCOSITY SOLUTIONS OF FULLY NONLINEAR DEGENERATE ELLIPTIC EQUATIONS

CONTINUOUS DEPENDENCE ESTIMATES FOR VISCOSITY SOLUTIONS OF FULLY NONLINEAR DEGENERATE ELLIPTIC EQUATIONS Electronic Journal of Differential Equations, Vol. 20022002), No. 39, pp. 1 10. ISSN: 1072-6691. URL: http://ejde.math.swt.edu or http://ejde.math.unt.edu ftp ejde.math.swt.edu login: ftp) CONTINUOUS DEPENDENCE

More information

#A50 INTEGERS 14 (2014) ON RATS SEQUENCES IN GENERAL BASES

#A50 INTEGERS 14 (2014) ON RATS SEQUENCES IN GENERAL BASES #A50 INTEGERS 14 (014) ON RATS SEQUENCES IN GENERAL BASES Johann Thiel Dept. of Mathematics, New York City College of Technology, Brooklyn, New York jthiel@citytech.cuny.edu Received: 6/11/13, Revised:

More information

Machine Learning for Data Science (CS4786) Lecture 24

Machine Learning for Data Science (CS4786) Lecture 24 Machine Learning for Data Science (CS4786) Lecture 24 Graphical Models: Approximate Inference Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016sp/ BELIEF PROPAGATION OR MESSAGE PASSING Each

More information

arxiv: v2 [math.pr] 29 Jul 2012

arxiv: v2 [math.pr] 29 Jul 2012 Appendix to Approximating perpetuities arxiv:203.0679v2 [math.pr] 29 Jul 202 Margarete Knape and Ralph Neininger Institute for Mathematics J.W. Goethe-University 60054 Frankfurt a.m. Germany July 25, 202

More information

Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo. Sampling Methods. Oliver Schulte - CMPT 419/726. Bishop PRML Ch.

Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo. Sampling Methods. Oliver Schulte - CMPT 419/726. Bishop PRML Ch. Sampling Methods Oliver Schulte - CMP 419/726 Bishop PRML Ch. 11 Recall Inference or General Graphs Junction tree algorithm is an exact inference method for arbitrary graphs A particular tree structure

More information

Bayesian Methods with Monte Carlo Markov Chains II

Bayesian Methods with Monte Carlo Markov Chains II Bayesian Methods with Monte Carlo Markov Chains II Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University hslu@stat.nctu.edu.tw http://tigpbp.iis.sinica.edu.tw/courses.htm 1 Part 3

More information

Optimal Routing in Chord

Optimal Routing in Chord Optimal Routing in Chord Prasanna Ganesan Gurmeet Singh Manku Astract We propose optimal routing algorithms for Chord [1], a popular topology for routing in peer-to-peer networks. Chord is an undirected

More information

Lecture 6: Graphical Models: Learning

Lecture 6: Graphical Models: Learning Lecture 6: Graphical Models: Learning 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering, University of Cambridge February 3rd, 2010 Ghahramani & Rasmussen (CUED)

More information

StarAI Full, 6+1 pages Short, 2 page position paper or abstract

StarAI Full, 6+1 pages Short, 2 page position paper or abstract StarAI 2015 Fifth International Workshop on Statistical Relational AI At the 31st Conference on Uncertainty in Artificial Intelligence (UAI) (right after ICML) In Amsterdam, The Netherlands, on July 16.

More information

SC7/SM6 Bayes Methods HT18 Lecturer: Geoff Nicholls Lecture 2: Monte Carlo Methods Notes and Problem sheets are available at http://www.stats.ox.ac.uk/~nicholls/bayesmethods/ and via the MSc weblearn pages.

More information

Coupling. 2/3/2010 and 2/5/2010

Coupling. 2/3/2010 and 2/5/2010 Coupling 2/3/2010 and 2/5/2010 1 Introduction Consider the move to middle shuffle where a card from the top is placed uniformly at random at a position in the deck. It is easy to see that this Markov Chain

More information

arxiv: v1 [cs.fl] 24 Nov 2017

arxiv: v1 [cs.fl] 24 Nov 2017 (Biased) Majority Rule Cellular Automata Bernd Gärtner and Ahad N. Zehmakan Department of Computer Science, ETH Zurich arxiv:1711.1090v1 [cs.fl] 4 Nov 017 Astract Consider a graph G = (V, E) and a random

More information

Conductance and Rapidly Mixing Markov Chains

Conductance and Rapidly Mixing Markov Chains Conductance and Rapidly Mixing Markov Chains Jamie King james.king@uwaterloo.ca March 26, 2003 Abstract Conductance is a measure of a Markov chain that quantifies its tendency to circulate around its states.

More information

18 : Advanced topics in MCMC. 1 Gibbs Sampling (Continued from the last lecture)

18 : Advanced topics in MCMC. 1 Gibbs Sampling (Continued from the last lecture) 10-708: Probabilistic Graphical Models 10-708, Spring 2014 18 : Advanced topics in MCMC Lecturer: Eric P. Xing Scribes: Jessica Chemali, Seungwhan Moon 1 Gibbs Sampling (Continued from the last lecture)

More information

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling 10-708: Probabilistic Graphical Models 10-708, Spring 2014 27 : Distributed Monte Carlo Markov Chain Lecturer: Eric P. Xing Scribes: Pengtao Xie, Khoa Luu In this scribe, we are going to review the Parallel

More information

Sampling Methods (11/30/04)

Sampling Methods (11/30/04) CS281A/Stat241A: Statistical Learning Theory Sampling Methods (11/30/04) Lecturer: Michael I. Jordan Scribe: Jaspal S. Sandhu 1 Gibbs Sampling Figure 1: Undirected and directed graphs, respectively, with

More information

Markov Processes. Stochastic process. Markov process

Markov Processes. Stochastic process. Markov process Markov Processes Stochastic process movement through a series of well-defined states in a way that involves some element of randomness for our purposes, states are microstates in the governing ensemble

More information

Theory of Stochastic Processes 3. Generating functions and their applications

Theory of Stochastic Processes 3. Generating functions and their applications Theory of Stochastic Processes 3. Generating functions and their applications Tomonari Sei sei@mist.i.u-tokyo.ac.jp Department of Mathematical Informatics, University of Tokyo April 20, 2017 http://www.stat.t.u-tokyo.ac.jp/~sei/lec.html

More information

1 Hoeffding s Inequality

1 Hoeffding s Inequality Proailistic Method: Hoeffding s Inequality and Differential Privacy Lecturer: Huert Chan Date: 27 May 22 Hoeffding s Inequality. Approximate Counting y Random Sampling Suppose there is a ag containing

More information

Upper Bounds for Stern s Diatomic Sequence and Related Sequences

Upper Bounds for Stern s Diatomic Sequence and Related Sequences Upper Bounds for Stern s Diatomic Sequence and Related Sequences Colin Defant Department of Mathematics University of Florida, U.S.A. cdefant@ufl.edu Sumitted: Jun 18, 01; Accepted: Oct, 016; Pulished:

More information

Improving the Asymptotic Performance of Markov Chain Monte-Carlo by Inserting Vortices

Improving the Asymptotic Performance of Markov Chain Monte-Carlo by Inserting Vortices Improving the Asymptotic Performance of Markov Chain Monte-Carlo by Inserting Vortices Yi Sun IDSIA Galleria, Manno CH-98, Switzerland yi@idsia.ch Faustino Gomez IDSIA Galleria, Manno CH-98, Switzerland

More information

L 2 Discrepancy of Two-Dimensional Digitally Shifted Hammersley Point Sets in Base b

L 2 Discrepancy of Two-Dimensional Digitally Shifted Hammersley Point Sets in Base b L Discrepancy of Two-Dimensional Digitally Shifted Hammersley Point Sets in Base Henri Faure and Friedrich Pillichshammer Astract We give an exact formula for the L discrepancy of two-dimensional digitally

More information

Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA

Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals Satoshi AOKI and Akimichi TAKEMURA Graduate School of Information Science and Technology University

More information

CS281A/Stat241A Lecture 22

CS281A/Stat241A Lecture 22 CS281A/Stat241A Lecture 22 p. 1/4 CS281A/Stat241A Lecture 22 Monte Carlo Methods Peter Bartlett CS281A/Stat241A Lecture 22 p. 2/4 Key ideas of this lecture Sampling in Bayesian methods: Predictive distribution

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Chapter 5 Markov Chain Monte Carlo MCMC is a kind of improvement of the Monte Carlo method By sampling from a Markov chain whose stationary distribution is the desired sampling distributuion, it is possible

More information

INTRODUCTION TO MCMC AND PAGERANK. Eric Vigoda Georgia Tech. Lecture for CS 6505

INTRODUCTION TO MCMC AND PAGERANK. Eric Vigoda Georgia Tech. Lecture for CS 6505 INTRODUCTION TO MCMC AND PAGERANK Eric Vigoda Georgia Tech Lecture for CS 6505 1 MARKOV CHAIN BASICS 2 ERGODICITY 3 WHAT IS THE STATIONARY DISTRIBUTION? 4 PAGERANK 5 MIXING TIME 6 PREVIEW OF FURTHER TOPICS

More information

Solving Homogeneous Trees of Sturm-Liouville Equations using an Infinite Order Determinant Method

Solving Homogeneous Trees of Sturm-Liouville Equations using an Infinite Order Determinant Method Paper Civil-Comp Press, Proceedings of the Eleventh International Conference on Computational Structures Technology,.H.V. Topping, Editor), Civil-Comp Press, Stirlingshire, Scotland Solving Homogeneous

More information

Inference in Bayesian Networks

Inference in Bayesian Networks Andrea Passerini passerini@disi.unitn.it Machine Learning Inference in graphical models Description Assume we have evidence e on the state of a subset of variables E in the model (i.e. Bayesian Network)

More information

INTRODUCTION TO MCMC AND PAGERANK. Eric Vigoda Georgia Tech. Lecture for CS 6505

INTRODUCTION TO MCMC AND PAGERANK. Eric Vigoda Georgia Tech. Lecture for CS 6505 INTRODUCTION TO MCMC AND PAGERANK Eric Vigoda Georgia Tech Lecture for CS 6505 1 MARKOV CHAIN BASICS 2 ERGODICITY 3 WHAT IS THE STATIONARY DISTRIBUTION? 4 PAGERANK 5 MIXING TIME 6 PREVIEW OF FURTHER TOPICS

More information