MATHEMATICAL ENGINEERING TECHNICAL REPORTS. Polynomial Time Perfect Sampler for Discretized Dirichlet Distribution
MATHEMATICAL ENGINEERING TECHNICAL REPORTS
Polynomial Time Perfect Sampler for Discretized Dirichlet Distribution
Tomomi MATSUI and Shuji KIJIMA
METR, April 2003
DEPARTMENT OF MATHEMATICAL INFORMATICS, GRADUATE SCHOOL OF INFORMATION SCIENCE AND TECHNOLOGY, THE UNIVERSITY OF TOKYO, BUNKYO-KU, TOKYO, JAPAN
The METR technical reports are published as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.
Polynomial Time Perfect Sampler for Discretized Dirichlet Distribution
Tomomi Matsui and Shuji Kijima
Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Bunkyo-ku, Tokyo, Japan.
version: April 4th

Abstract. In this paper, we propose a perfect (exact) sampling algorithm for the discretized Dirichlet distribution. Our algorithm is a monotone coupling from the past (CFTP) algorithm, which is a Las Vegas type randomized algorithm. We propose a new Markov chain whose limit distribution is a discretized Dirichlet distribution. Our algorithm simulates transitions of the chain O(n^3 ln N) times, where n is the dimension (the number of parameters) and 1/N is the grid size for discretization. Thus the obtained bound does not depend on the magnitudes of the parameters. In each transition, we need to sample a random variable according to a discretized beta distribution (2-dimensional Dirichlet distribution). To show the polynomiality, we employ the path coupling method carefully and show that our chain is rapidly mixing.

1 Introduction

In this paper, we propose a polynomial time perfect (exact) sampling algorithm for the discretized Dirichlet distribution. Our algorithm is a monotone coupling from the past (CFTP) algorithm, which is a Las Vegas type randomized algorithm. We propose a new Markov chain for generating random samples according to a discretized Dirichlet distribution. In each transition of our Markov chain, we need a random sample according to a discretized beta distribution (2-dimensional Dirichlet distribution). Our algorithm simulates transitions of the Markov chain O(n^3 ln N) times in expectation, where n is the dimension (the number of parameters) and 1/N is the grid size for discretization. To show the polynomiality, we employ the path coupling method and show that the mixing rate of our chain is bounded by n(n^2 − 1)(1 + ln((n + 1)(N − n)/2)).
Statistical methods are widely studied in bioinformatics, since they are powerful tools to discover genes causing a common disease from a number of observed data. These methods often use the EM algorithm, Markov chain Monte Carlo
(Footnote: Supported by the Superrobust Computation Project of the 21st Century COE Program "Information Science and Technology Strategic Core.")
method, the Gibbs sampler, and so on. The Dirichlet distribution appears as a prior and posterior distribution for the multinomial distribution in these methods, since the Dirichlet distribution is the conjugate prior of the parameters of the multinomial distribution [11]. For example, Niu, Qin, Xu, and Liu proposed a Bayesian haplotype inference method [7], which decides phased paternal and maternal individual genotypes probabilistically. Another example is a population structure inferring algorithm by Pritchard, Stephens, and Donnelly [8]. Their algorithm is based on the MCMC method. In these examples, the Dirichlet distribution appears with various dimensions and various parameters. Thus we need an efficient algorithm for sampling from the Dirichlet distribution with arbitrary dimensions and parameters. One approach to sampling from a continuous Dirichlet distribution is by rejection (see [4] for example). In this approach, the number of required samples from the gamma distribution is equal to the dimension of the Dirichlet distribution. Though we can sample from the gamma distribution by using rejection sampling, the rejection ratio becomes higher as the parameter becomes smaller. Thus it does not seem an effective way for small parameters. Another approach is a discretization of the domain and adoption of the Metropolis-Hastings algorithm. Recently, Matsui, Motoki and Kamatani proposed a Markov chain for sampling from a discretized Dirichlet distribution [6]. The mixing rate of their chain is bounded by a polynomial in n and N, and so their algorithm is a polynomial time approximate sampler. Propp and Wilson devised a surprising algorithm, called the monotone CFTP algorithm (or backward coupling), which produces exact samples from the limit distribution [9, 10]. The monotone CFTP algorithm simulates infinite time transitions of a monotone Markov chain in a probabilistically finite time. To employ their result, we propose a new monotone Markov chain whose stationary distribution is a discretized Dirichlet distribution.
We also show that the mixing rate of our chain is bounded by n(n^2 − 1)(1 + ln((n + 1)(N − n)/2)). This paper is deeply related to the authors' recent paper [5], which proposes a polynomial time perfect sampling algorithm for two-rowed contingency tables. The idea of the algorithm and the outline of the proof described in this paper are similar to those in our paper [5]. The main difference between these two papers is that, when we deal with the uniform distribution on two-rowed contingency tables, the transition probabilities of the Markov chain are predetermined. However, in the case of the discretized Dirichlet distribution, the transition probabilities vary with respect to the magnitudes of the parameters. Thus, to show the monotonicity and polynomiality of the algorithm proposed in this paper, we need the detailed discussions which appear in the Appendix. Showing the two lemmas in the Appendix is easier in the case of the uniform distribution on two-rowed contingency tables.
2 Review of the Coupling From the Past Algorithm

When we simulate an ergodic Markov chain for infinite time, we can gain a sample exactly according to the stationary distribution. Suppose that there exists a chain from the infinite past. Then a possible state at the present time of the chain, for which we can have evidence of uniqueness without respect to an initial state of the chain, is a realization of a random sample exactly from the stationary distribution. This is the key idea of CFTP. Suppose that we have an ergodic Markov chain MC with finite state space Ω and transition matrix P. The transition rule of the Markov chain X ↦ X′ can be described by a deterministic function φ : Ω × [0, 1) → Ω, called an update function, as follows. Given a random number Λ uniformly distributed over [0, 1), the update function φ satisfies Pr(φ(x, Λ) = y) = P(x, y) for any x, y ∈ Ω. We can realize the Markov chain by setting X′ = φ(X, Λ). Clearly, the update function corresponding to a given transition matrix P is not unique. The result of the transitions of the chain from time t_1 to t_2 (t_1 < t_2) with a sequence of random numbers λ = (λ[t_1], λ[t_1 + 1], ..., λ[t_2 − 1]) ∈ [0, 1)^{t_2 − t_1} is denoted by Φ_{t_1}^{t_2}(x, λ) : Ω × [0, 1)^{t_2 − t_1} → Ω, where Φ_{t_1}^{t_2}(x, λ) := φ(φ(· · · φ(x, λ[t_1]), ..., λ[t_2 − 2]), λ[t_2 − 1]). We say that a sequence λ ∈ [0, 1)^T satisfies the coalescence condition when ∃y ∈ Ω, ∀x ∈ Ω, y = Φ_{−T}^{0}(x, λ). With these preparations, the standard Coupling From The Past algorithm is expressed as follows.

CFTP Algorithm [9]
Step 1. Set the starting time period T := 1 to go back, and set λ to be the empty sequence.
Step 2. Generate random real numbers λ[−T], λ[−T + 1], ..., λ[−⌊T/2⌋ − 1] ∈ [0, 1), and insert them at the head of λ in order, i.e., put λ := (λ[−T], λ[−T + 1], ..., λ[−1]).
Step 3. Start a chain from each element x ∈ Ω at time period −T, and run each chain to time period 0 according to the update function φ with the sequence of numbers in λ. Note that every chain uses the common sequence λ.
Step 4. [Coalescence check] If ∃y ∈ Ω, ∀x ∈ Ω, y = Φ_{−T}^{0}(x, λ), then return y and stop.
Else, update the starting time period T := 2T, and go to Step 2.

Theorem 1 (CFTP Theorem [9]). Let MC be an ergodic finite Markov chain with state space Ω, defined by an update function φ : Ω × [0, 1) → Ω. If the CFTP algorithm terminates with probability 1, then the obtained value is a realization of a random variable exactly distributed according to the stationary distribution.

Theorem 1 gives a probabilistically finite time algorithm for infinite time simulation. However, the simulation from all states executed in Step 3 is a hard requirement.
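To make Steps 1-4 concrete, the following is a small illustrative Python sketch (our own code, not from the paper) that runs the CFTP algorithm literally, starting one trajectory from every state with common randomness. The toy update function phi_walk is a hypothetical monotone lazy random walk on {0, 1, 2, 3} whose stationary distribution is uniform; all names here are ours.

```python
import random

def cftp_all_states(states, phi, rng=random.random):
    """Steps 1-4 of the CFTP algorithm: run a chain from EVERY state with a
    common random sequence from time -T to 0, doubling T until coalescence."""
    lam, T = [], 1
    while True:
        # Step 2: extend the randomness further into the past (reuse the old part)
        lam = [rng() for _ in range(T - len(lam))] + lam
        # Step 3: run every state forward through the same sequence
        finals = set()
        for x in states:
            for u in lam:
                x = phi(x, u)
            finals.add(x)
        # Step 4: coalescence check
        if len(finals) == 1:
            return finals.pop()   # exact sample from the stationary distribution
        T *= 2                    # else go back twice as far

def phi_walk(x, u):
    """Toy monotone update function: a lazy reflecting random walk on {0,...,3}."""
    return min(x + 1, 3) if u < 0.5 else max(x - 1, 0)
```

Because the walk is a symmetric birth-death chain, its stationary distribution is uniform, and repeated calls of cftp_all_states return each of the four states with frequency about 1/4.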
Suppose that there exists a partial order ⪰ on the set of states Ω. A transition rule expressed by a deterministic update function φ is called monotone (with respect to ⪰) if ∀λ ∈ [0, 1), ∀x, y ∈ Ω, x ⪰ y ⇒ φ(x, λ) ⪰ φ(y, λ). For ease, we also say that a chain is monotone if the chain has a monotone transition rule.

Theorem 2 (monotone CFTP [9, 3]). Suppose that a Markov chain defined by an update function φ is monotone with respect to a partially ordered set of states (Ω, ⪰), and ∃x_max, x_min ∈ Ω, ∀x ∈ Ω, x_max ⪰ x ⪰ x_min. Then the CFTP algorithm terminates with probability 1, and a sequence λ ∈ [0, 1)^T satisfies the coalescence condition, i.e., ∃y ∈ Ω, ∀x ∈ Ω, y = Φ_{−T}^{0}(x, λ), if and only if Φ_{−T}^{0}(x_max, λ) = Φ_{−T}^{0}(x_min, λ).

When the given Markov chain satisfies the conditions of Theorem 2, we can modify the CFTP algorithm by substituting Step 4 by Step 4a:
Step 4a. If ∃y ∈ Ω, y = Φ_{−T}^{0}(x_max, λ) = Φ_{−T}^{0}(x_min, λ), then return y and stop.
The algorithm obtained by the above modification is called a monotone CFTP algorithm.

3 Perfect Sampling Algorithm

In this paper, we denote the set of integers, non-negative integers, and positive integers by Z, Z_+, and Z_{++}, respectively. A Dirichlet random vector P = (P_1, P_2, ..., P_n) with positive parameters u_1, ..., u_n is a vector of random variables that admits the probability density function
  (Γ(u_1 + · · · + u_n) / (Γ(u_1) · · · Γ(u_n))) p_1^{u_1 − 1} p_2^{u_2 − 1} · · · p_n^{u_n − 1}
defined on the set {(p_1, p_2, ..., p_n) ∈ R^n | p_1 + · · · + p_n = 1, p_1, p_2, ..., p_n > 0}, where Γ(u) is the gamma function. Throughout this paper, we assume that n ≥ 2. For any integer N ≥ n, we discretize the domain with grid size 1/N and obtain a discrete set of integer vectors Ω defined by
  Ω := {(x_1, x_2, ..., x_n) ∈ Z_{++}^n | x_i > 0 (∀i), x_1 + · · · + x_n = N}.
A discretized Dirichlet random vector with non-negative parameters u_1, ..., u_n is a random vector X = (X_1, ..., X_n) ∈ Ω with the distribution
  Pr[X = (x_1, ..., x_n)] := C^{−1} Π_{i=1}^{n} (x_i/N)^{u_i},
where C is the partition function (normalizing constant) defined by C := Σ_{x ∈ Ω} Π_{i=1}^{n} (x_i/N)^{u_i}.
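As an illustration (our own code, not part of the paper), the discretized Dirichlet distribution above can be tabulated by brute force on very small instances; the function name is ours.

```python
import itertools
from math import prod

def discretized_dirichlet_pmf(N, u):
    """Tabulate Pr[X = x] = C^-1 * prod_i (x_i / N)^{u_i} over
    Omega = {x with positive integer entries, x_1 + ... + x_n = N}."""
    n = len(u)
    # enumerate all compositions of N into n positive parts (tiny instances only)
    states = [x for x in itertools.product(range(1, N), repeat=n) if sum(x) == N]
    weight = {x: prod((xi / N) ** ui for xi, ui in zip(x, u)) for x in states}
    C = sum(weight.values())          # partition function (normalizing constant)
    return {x: w / C for x, w in weight.items()}
```

For example, N = 5 and u = (0, 0) give the uniform distribution on the four states (1,4), (2,3), (3,2), (4,1).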
For any integer ℓ ≥ 2, we introduce a set of 2-dimensional integer vectors Ω_ℓ := {(Y_1, Y_2) ∈ Z^2 | Y_1, Y_2 > 0, Y_1 + Y_2 = ℓ} and a distribution function f_{(u_i, u_j)} : Ω_ℓ → [0, 1] with non-negative parameters u_i, u_j defined by
  f_{(u_i, u_j)}(Y_1, Y_2) := C(u_i, u_j, ℓ)^{−1} Y_1^{u_i} Y_2^{u_j},
where C(u_i, u_j, ℓ) := Σ_{(Y_1, Y_2) ∈ Ω_ℓ} Y_1^{u_i} Y_2^{u_j} is the partition function. We also introduce a vector (g_0^ℓ(u_i, u_j), g_1^ℓ(u_i, u_j), ..., g_{ℓ−1}^ℓ(u_i, u_j)) defined by
  g_k^ℓ(u_i, u_j) := 0 (k = 0),
  g_k^ℓ(u_i, u_j) := C(u_i, u_j, ℓ)^{−1} Σ_{l=1}^{k} l^{u_i} (ℓ − l)^{u_j} (k ∈ {1, 2, ..., ℓ − 1}).
It is clear that 0 = g_0^ℓ(u_i, u_j) < g_1^ℓ(u_i, u_j) < · · · < g_{ℓ−1}^ℓ(u_i, u_j) = 1.
We describe our Markov chain M with state space Ω. Given a current state X ∈ Ω, we generate a random number λ ∈ [1, n). Then the transition X ↦ X′ with respect to λ takes place as follows.

Markov chain M
Input: A random number λ ∈ [1, n).
Step 1: Put i := ⌊λ⌋ and ℓ := X_i + X_{i+1}.
Step 2: Let k ∈ {1, 2, ..., ℓ − 1} be the unique value satisfying g_{k−1}^ℓ(u_i, u_{i+1}) ≤ λ − ⌊λ⌋ < g_k^ℓ(u_i, u_{i+1}).
Step 3: Put X′_j := k if j = i; ℓ − k if j = i + 1; X_j otherwise.

The update function φ : Ω × [1, n) → Ω of our chain is defined by φ(X, λ) := X′, where X′ is determined by the above procedure. Clearly, this chain is irreducible and aperiodic. Since the detailed balance equations hold, the stationary distribution of the above Markov chain M is the discretized Dirichlet distribution. We define two special states X^U, X^L ∈ Ω by X^U := (N − n + 1, 1, 1, ..., 1) and X^L := (1, 1, ..., 1, N − n + 1). Now we describe our algorithm.

Algorithm
Step 1. Set the starting time period T := 1 to go back, and let λ be the empty sequence.
Step 2. Generate random real numbers λ[−T], λ[−T + 1], ..., λ[−⌊T/2⌋ − 1] ∈ [1, n), and put λ := (λ[−T], λ[−T + 1], ..., λ[−1]).
Step 3. Start two chains from X^U and X^L, respectively, at time period −T, and run them to time period 0 according to the update function φ with the sequence of numbers in λ.
Step 4. [Coalescence check] If ∃Y ∈ Ω, Y = Φ_{−T}^{0}(X^U, λ) = Φ_{−T}^{0}(X^L, λ), then return Y and stop. Else, update the starting time period T := 2T, and go to Step 2.

Theorem 3. With probability 1, Algorithm terminates and returns a state. The state obtained by Algorithm is a realization of a sample exactly according to the discretized Dirichlet distribution.

The above theorem guarantees that Algorithm is a perfect sampling algorithm. We prove the above theorem by showing the monotonicity in the next section.

4 Monotonicity of the Chain

In Section 2, we described two theorems. Thus we only need to show the monotonicity of our chain. First, we introduce a partial order on the state space Ω. For any vector X ∈ Ω, we define the cumulative sum vector c_X ∈ Z_+^{n+1} by c_X(i) := 0 (i = 0) and c_X(i) := X_1 + X_2 + · · · + X_i (i ∈ {1, 2, ..., n}), where c_X = (c_X(0), c_X(1), ..., c_X(n)). Clearly, there exists a bijection between Ω and the set {c_X | X ∈ Ω}. For any pair of states X, Y ∈ Ω, we say X ⪰ Y if and only if c_X ≥ c_Y. It is clear that the relation ⪰ is a partial order on Ω. We can see easily that ∀X ∈ Ω, X^U ⪰ X ⪰ X^L. We say that a state X ∈ Ω covers Y ∈ Ω at j, denoted by X ⋗ Y or X ⋗_j Y, when c_X(i) − c_Y(i) = 1 for i = j and 0 otherwise. Note that X ⋗_j Y if and only if X_i − Y_i = +1 (i = j), −1 (i = j + 1), and 0 (otherwise).

Next, we show a key lemma for proving monotonicity.

Lemma 1. ∀X, Y ∈ Ω, ∀λ ∈ [1, n), X ⋗_j Y ⇒ φ(X, λ) ⪰ φ(Y, λ).

Proof: We denote φ(X, λ) by X′ and φ(Y, λ) by Y′ for simplicity. For any index i ≠ ⌊λ⌋, it is easy to see that c_{X′}(i) = c_X(i) and c_{Y′}(i) = c_Y(i), and so c_{X′}(i) − c_{Y′}(i) = c_X(i) − c_Y(i) ≥ 0, since X ⪰ Y. In the following, we show that c_{X′}(⌊λ⌋) ≥ c_{Y′}(⌊λ⌋). Clearly from the definition of our Markov chain, X′_{⌊λ⌋} is the unique value k satisfying
  g_{k−1}^ℓ(u_{⌊λ⌋}, u_{⌊λ⌋+1}) ≤ λ − ⌊λ⌋ < g_k^ℓ(u_{⌊λ⌋}, u_{⌊λ⌋+1}),
where ℓ := X_{⌊λ⌋} + X_{⌊λ⌋+1}. Similarly, Y′_{⌊λ⌋} is the unique value k̄ satisfying g_{k̄−1}^{ℓ̄}(u_{⌊λ⌋}, u_{⌊λ⌋+1}) ≤ λ − ⌊λ⌋ < g_{k̄}^{ℓ̄}(u_{⌊λ⌋}, u_{⌊λ⌋+1}), where ℓ̄ := Y_{⌊λ⌋} + Y_{⌊λ⌋+1}. We need to consider the following three cases.

Case 1: If ⌊λ⌋ ≠ j − 1 and ⌊λ⌋ ≠ j + 1, then ℓ = ℓ̄, and so we have X′_{⌊λ⌋} = k = k̄ = Y′_{⌊λ⌋}.

Case 2: Consider the case that ⌊λ⌋ = j − 1. Since X ⋗_j Y, we have ℓ = ℓ̄ + 1. From the definition of the cumulative sum vector,
  c_{X′}(j − 1) − c_{Y′}(j − 1) = c_{X′}(j − 2) + X′_{j−1} − c_{Y′}(j − 2) − Y′_{j−1} = c_X(j − 2) + X′_{j−1} − c_Y(j − 2) − Y′_{j−1} = X′_{j−1} − Y′_{j−1}.
Thus it is enough to show that X′_{j−1} ≥ Y′_{j−1}. Lemma 5 in the Appendix implies the following inequalities (all with parameters (u_{j−1}, u_j)):
  0 = g_0^{ℓ̄+1} = g_0^{ℓ̄} ≤ g_1^{ℓ̄+1} ≤ g_1^{ℓ̄} ≤ g_2^{ℓ̄+1} ≤ · · · ≤ g_k^{ℓ̄+1} ≤ g_k^{ℓ̄} ≤ g_{k+1}^{ℓ̄+1} ≤ · · · ≤ g_{ℓ̄−1}^{ℓ̄+1} ≤ g_{ℓ̄−1}^{ℓ̄} = g_{ℓ̄}^{ℓ̄+1} = 1,
which we call the alternating inequalities. For example, if the inequalities g_{k−1}^{ℓ̄+1} ≤ λ − ⌊λ⌋ < g_{k−1}^{ℓ̄} ≤ g_k^{ℓ̄+1} hold, then X′_{⌊λ⌋} = k > k − 1 = Y′_{⌊λ⌋}; and if g_{k−1}^{ℓ̄+1} ≤ g_{k−1}^{ℓ̄} ≤ λ − ⌊λ⌋ < g_k^{ℓ̄+1} hold, then X′_{⌊λ⌋} = k = Y′_{⌊λ⌋}. Thus we have X′_{j−1} − Y′_{j−1} ∈ {0, 1} (see Figure 1). From the above, we have X′_{j−1} ≥ Y′_{j−1}.

Fig. 1. A figure of the alternating inequalities. In the figure, g_k and g_k^+ denote g_k^{ℓ̄}(u_{j−1}, u_j) and g_k^{ℓ̄+1}(u_{j−1}, u_j), respectively.

Case 3: Consider the case that ⌊λ⌋ = j + 1. Since X ⋗_j Y, we have ℓ + 1 = ℓ̄. From the definition of the cumulative sum vector,
  c_{X′}(j + 1) − c_{Y′}(j + 1) = c_X(j) + X′_{j+1} − c_Y(j) − Y′_{j+1} = 1 + X′_{j+1} − Y′_{j+1}.
Thus, it is enough to show that 1 + X′_{j+1} − Y′_{j+1} ≥ 0. Lemma 5 also implies the alternating inequalities (with parameters (u_{j+1}, u_{j+2}))
  0 = g_0^{ℓ+1} = g_0^ℓ ≤ g_1^{ℓ+1} ≤ g_1^ℓ ≤ g_2^{ℓ+1} ≤ · · · ≤ g_{ℓ−1}^{ℓ+1} ≤ g_{ℓ−1}^ℓ = g_ℓ^{ℓ+1} = 1.
Then it is easy to see that X′_{j+1} − Y′_{j+1} ∈ {−1, 0} (see Figure 2). From the above, we obtain the inequality 1 + X′_{j+1} − Y′_{j+1} ≥ 0. □

Fig. 2. A figure of the alternating inequalities. In the figure, g_k and g_k^+ denote g_k^ℓ(u_{j+1}, u_{j+2}) and g_k^{ℓ+1}(u_{j+1}, u_{j+2}), respectively.

Lemma 2. The Markov chain M defined by the update function φ is monotone with respect to ⪰, i.e., ∀λ ∈ [1, n), ∀X, Y ∈ Ω, X ⪰ Y ⇒ φ(X, λ) ⪰ φ(Y, λ).

Proof: It is easy to see that there exists a sequence of states Z_1, Z_2, ..., Z_r of appropriate length satisfying X = Z_1 ⋗ Z_2 ⋗ · · · ⋗ Z_r = Y. Then, applying Lemma 1 repeatedly, we can show that φ(X, λ) = φ(Z_1, λ) ⪰ φ(Z_2, λ) ⪰ · · · ⪰ φ(Z_r, λ) = φ(Y, λ). □

Lastly, we show the correctness of our algorithm.

Proof of Theorem 3: From Lemma 2, the Markov chain is monotone, and it is clear that (X^U, X^L) is the unique pair of maximum and minimum states. Thus Algorithm is a monotone CFTP algorithm, and Theorems 1 and 2 imply Theorem 3. □

5 Expected Running Time

Here, we discuss the running time of our algorithm. In this section, we assume the following.

Condition 1. The parameters are arranged in non-increasing order, i.e., u_1 ≥ u_2 ≥ · · · ≥ u_n.
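Before turning to the analysis, the chain M of Section 3 and the monotone CFTP loop of Theorem 2 can be put together in code. The following is an illustrative Python sketch under our own naming, not the authors' implementation.

```python
import random

def g_vector(ui, uj, ell):
    """Cumulative sums g_0 = 0 < g_1 < ... < g_{ell-1} = 1 of the discretized
    beta distribution f(Y1, Y2) = C^-1 * Y1^ui * Y2^uj on Y1 + Y2 = ell."""
    w = [l ** ui * (ell - l) ** uj for l in range(1, ell)]
    C = sum(w)
    g = [0.0]
    for wl in w:
        g.append(g[-1] + wl / C)
    return g

def phi(x, lam, u):
    """Update function of chain M: lam in [1, n) selects the pair (i, i+1),
    and its fractional part drives a heat-bath update of that pair."""
    i = int(lam)                      # i in {1, ..., n-1}
    frac = lam - i
    ell = x[i - 1] + x[i]
    g = g_vector(u[i - 1], u[i], ell)
    # unique k with g[k-1] <= frac < g[k]; the default guards float rounding
    k = next((k for k in range(1, ell) if frac < g[k]), ell - 1)
    y = list(x)
    y[i - 1], y[i] = k, ell - k
    return tuple(y)

def monotone_cftp(N, u, rng=random.random):
    """Monotone CFTP: run only the top chain X^U and the bottom chain X^L with
    common randomness, doubling the starting time until they coalesce."""
    n = len(u)
    xU = (N - n + 1,) + (1,) * (n - 1)        # maximum state
    xL = (1,) * (n - 1) + (N - n + 1,)        # minimum state
    lam, T = [], 1
    while True:
        lam = [1 + (n - 1) * rng() for _ in range(T - len(lam))] + lam
        top, bot = xU, xL
        for l in lam:
            top, bot = phi(top, l, u), phi(bot, l, u)
        if top == bot:
            return top        # exact sample from the discretized Dirichlet
        T *= 2
```

For n = 2 the two chains coalesce after a single step, and with u = (0, 0) the returned states are uniform over Ω, which gives a quick sanity check of the sampler.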
The following is the main result of this paper.

Theorem 4. Under Condition 1, the expected number of transitions executed in Algorithm is bounded by O(n^3 ln N), where n is the dimension (number of parameters) and 1/N is the grid size for discretization.

In the rest of this section, we prove Theorem 4 by estimating the expectation of the coalescence time T ∈ Z_{++} defined by T := min{t > 0 | ∃y ∈ Ω, ∀x ∈ Ω, y = Φ_{−t}^{0}(x, Λ)}. Note that T is a random variable. Given a pair of probability distributions ν_1 and ν_2 on the finite state space Ω, the total variation distance between ν_1 and ν_2 is defined by D_TV(ν_1, ν_2) := (1/2) Σ_{x ∈ Ω} |ν_1(x) − ν_2(x)|. The mixing rate of an ergodic Markov chain is defined by τ := max_{x ∈ Ω} min{t | ∀s ≥ t, D_TV(π, P_x^s) ≤ 1/e}, where π is the stationary distribution and P_x^s is the probability distribution of the chain at time s with initial state x. The Path Coupling Theorem is a useful technique for bounding the mixing rate.

Theorem 5 (Path Coupling [1, 2]). Let MC be a finite ergodic Markov chain with state space Ω. Let G = (Ω, E) be a connected undirected graph with vertex set Ω and edge set E. Let l : E → R be a positive length defined on the edge set. For any pair of vertices {x, y} of G, the distance between x and y, denoted by d(x, y) (and/or d(y, x)), is the length of a shortest path between x and y, where the length of a path is the sum of the lengths of the edges in the path. Suppose that there exists a joint process (X, Y) ↦ (X′, Y′) with respect to MC, whose marginals are each a faithful copy of MC, and that ∃β, 0 < β < 1, ∀{X, Y} ∈ E, E[d(X′, Y′)] ≤ β d(X, Y). Then the mixing rate τ of the Markov chain MC satisfies τ ≤ (1 − β)^{−1}(1 + ln(D/d)), where d := min_{x ≠ y ∈ Ω} d(x, y) and D := max_{x, y ∈ Ω} d(x, y).

The above theorem differs from the original theorem in [1], since the integrality of the edge lengths is not assumed. We drop the integrality and introduce the minimum distance d. This modification is not essential, and we can show Theorem 5 similarly. Now, we show the polynomiality of our algorithm.
First, we estimate the mixing rate of our chain M by employing the Path Coupling Theorem. In the proof of the following lemma, Condition 1 plays an important role.

Lemma 3. Under Condition 1, the mixing rate τ of our Markov chain M satisfies τ ≤ n(n^2 − 1)(1 + ln((n + 1)(N − n)/2)).

Proof. Let G = (Ω, E) be an undirected simple graph with vertex set Ω and edge set E defined as follows. A pair of vertices {X, Y} is an edge if and only if (1/2) Σ_{i=1}^{n} |X_i − Y_i| = 1. Clearly, the graph G is connected. For each edge
e = {X, Y} ∈ E, there exists a unique pair of indices {j_1, j_2} ⊆ {1, ..., n}, called the supporting pair of e, satisfying |X_i − Y_i| = 1 for i ∈ {j_1, j_2} and 0 otherwise. We define the length l(e) of an edge e by l(e) := (1/n) Σ_{i=1}^{j} (n − i + 1) = j − j(j − 1)/(2n), where j = max{j_1, j_2} and {j_1, j_2} is the supporting pair of e. Note that 1 ≤ min_{e ∈ E} l(e) ≤ max_{e ∈ E} l(e) ≤ (n + 1)/2. For each pair X, Y ∈ Ω, we define the distance d(X, Y) to be the length of the shortest path between X and Y on G. Clearly, the diameter of G, i.e., max_{X, Y ∈ Ω} d(X, Y), is bounded by (n + 1)(N − n)/2, since d(X, Y) ≤ ((n + 1)/2) Σ_{i=1}^{n} (1/2)|X_i − Y_i| ≤ ((n + 1)/2)(N − n) for any X, Y ∈ Ω. The definition of the edge lengths implies that for any edge {X, Y} ∈ E, d(X, Y) = l({X, Y}).
We define a joint process (X, Y) ↦ (X′, Y′) by (X, Y) ↦ (φ(X, Λ), φ(Y, Λ)) with a uniform real random number Λ ∈ [1, n), where φ is the update function defined in Section 3. Now we show that E[d(X′, Y′)] ≤ β d(X, Y), β = 1 − 1/(n(n^2 − 1)), for any pair {X, Y} ∈ E. In the following, we denote the supporting pair of {X, Y} by {j_1, j_2}. Without loss of generality, we can assume that j_1 < j_2 and X_{j_1} + 1 = Y_{j_1}.

Case 1: When ⌊Λ⌋ = j_2 − 1, we will show that E[d(X′, Y′) | ⌊Λ⌋ = j_2 − 1] ≤ d(X, Y) − (n − j_2 + 1)/(2n). In case j_1 = j_2 − 1, X′ = Y′ with conditional probability 1; hence d(X′, Y′) = 0. In the following, we consider the case j_1 < j_2 − 1. Put ℓ := X_{j_2−1} + X_{j_2} and ℓ̄ := Y_{j_2−1} + Y_{j_2}. Since X_{j_2} = Y_{j_2} + 1, ℓ = ℓ̄ + 1 holds. From the definition of the update function of our Markov chain, we have the following:
  X′_{j_2−1} = k ⇔ [g_{k−1}^{ℓ̄+1}(u_{j_2−1}, u_{j_2}) ≤ Λ − ⌊Λ⌋ < g_k^{ℓ̄+1}(u_{j_2−1}, u_{j_2})],
  Y′_{j_2−1} = k ⇔ [g_{k−1}^{ℓ̄}(u_{j_2−1}, u_{j_2}) ≤ Λ − ⌊Λ⌋ < g_k^{ℓ̄}(u_{j_2−1}, u_{j_2})].
As described in the proof of Lemma 1, the alternating inequalities
  0 = g_0^{ℓ̄+1} = g_0^{ℓ̄} ≤ g_1^{ℓ̄+1} ≤ g_1^{ℓ̄} ≤ g_2^{ℓ̄+1} ≤ · · · ≤ g_{ℓ̄−1}^{ℓ̄+1} ≤ g_{ℓ̄−1}^{ℓ̄} = g_{ℓ̄}^{ℓ̄+1} = 1
hold. Thus we have X′_{j_2−1} − Y′_{j_2−1} ∈ {0, 1}. If X′_{j_2−1} = Y′_{j_2−1}, the supporting pair of {X′, Y′} is {j_1, j_2}, and so d(X′, Y′) = d(X, Y). If X′_{j_2−1} ≠ Y′_{j_2−1}, the supporting pair of {X′, Y′} is {j_1, j_2 − 1}, and so d(X′, Y′) = d(X, Y) − (n − j_2 + 1)/n.
Lemma 6 in the Appendix implies that if u_{j_2−1} ≥ u_{j_2}, then Pr[X′_{j_2−1} = Y′_{j_2−1} | ⌊Λ⌋ = j_2 − 1] ≤ 1/2 and Pr[X′_{j_2−1} ≠ Y′_{j_2−1} | ⌊Λ⌋ = j_2 − 1] ≥ 1/2, by showing that the inequality
  Pr[X′_{j_2−1} ≠ Y′_{j_2−1} | ⌊Λ⌋ = j_2 − 1] − Pr[X′_{j_2−1} = Y′_{j_2−1} | ⌊Λ⌋ = j_2 − 1]
  = Σ_{k=1}^{ℓ̄−1} [g_k^{ℓ̄} − g_k^{ℓ̄+1}] − Σ_{k=1}^{ℓ̄−1} [g_k^{ℓ̄+1} − g_{k−1}^{ℓ̄}] ≥ 0
holds when u_{j_2−1} ≥ u_{j_2} (see Figure 3). Condition 1 is necessary to show Lemma 6. Thus we obtain that E[d(X′, Y′) | ⌊Λ⌋ = j_2 − 1] ≤ (1/2) d(X, Y) + (1/2)(d(X, Y) − (n − j_2 + 1)/n) = d(X, Y) − (n − j_2 + 1)/(2n).

Fig. 3. A figure of the alternating inequalities. In the figure, g_k and g_k^+ denote g_k^{ℓ̄}(u_{j_2−1}, u_{j_2}) and g_k^{ℓ̄+1}(u_{j_2−1}, u_{j_2}), respectively.

Case 2: When ⌊Λ⌋ = j_2, we can show that E[d(X′, Y′) | ⌊Λ⌋ = j_2] ≤ d(X, Y) + (n − j_2)/(2n) in a similar way to Case 1.

Case 3: When ⌊Λ⌋ ≠ j_2 − 1 and ⌊Λ⌋ ≠ j_2, it is easy to see that the supporting pair {j′_1, j′_2} of {X′, Y′} satisfies max{j′_1, j′_2} = max{j_1, j_2}. Thus d(X′, Y′) = d(X, Y).

The probability of the appearance of Case 1 is equal to 1/(n − 1), and that of Case 2 is less than or equal to 1/(n − 1). From the above,
  E[d(X′, Y′)] ≤ d(X, Y) − (1/(n − 1))·(n − j_2 + 1)/(2n) + (1/(n − 1))·(n − j_2)/(2n)
  = d(X, Y) − 1/(2n(n − 1))
  ≤ (1 − 1/(2n(n − 1) max_{{X,Y} ∈ E} d(X, Y))) d(X, Y)
  ≤ (1 − 1/(n(n^2 − 1))) d(X, Y),
since max_{{X,Y} ∈ E} d(X, Y) ≤ (n + 1)/2. Since the diameter of G is bounded by (n + 1)(N − n)/2 and the minimum distance d is at least 1, Theorem 5 implies that the mixing rate τ satisfies τ ≤ n(n^2 − 1)(1 + ln((n + 1)(N − n)/2)). □

Next, we estimate the coalescence time. When we apply Propp and Wilson's result in [9] straightforwardly, the obtained bound on the coalescence time is not tight. In the following, we use their technique carefully and derive a better bound.
Lemma 4. Under Condition 1, the coalescence time T of M satisfies E[T] = O(n^3 ln N).

Proof. Let G = (Ω, E) be the undirected graph and d(X, Y), X, Y ∈ Ω, be the metric on G, both of which are defined in the proof of Lemma 3. We define D := d(X^U, X^L) and τ_0 := n(n^2 − 1)(1 + ln D). By using the inequality obtained in the proof of Lemma 3, we have
  Pr[T > τ_0] = Pr[Φ_{−τ_0}^{0}(X^U, Λ) ≠ Φ_{−τ_0}^{0}(X^L, Λ)] = Pr[Φ_{0}^{τ_0}(X^U, Λ) ≠ Φ_{0}^{τ_0}(X^L, Λ)]
  ≤ E[d(Φ_{0}^{τ_0}(X^U, Λ), Φ_{0}^{τ_0}(X^L, Λ))] ≤ (1 − 1/(n(n^2 − 1)))^{τ_0} d(X^U, X^L) ≤ e^{−(1 + ln D)} D = 1/e,
where the first inequality holds since every positive distance is at least 1. By the submultiplicativity of the coalescence time [9], for any k ∈ Z_+, Pr[T > kτ_0] ≤ Pr[T > τ_0]^k ≤ (1/e)^k. Thus
  E[T] = Σ_{t=0}^{∞} t Pr[T = t] ≤ τ_0 + τ_0 Pr[T > τ_0] + τ_0 Pr[T > 2τ_0] + · · · ≤ τ_0 (1 + 1/e + 1/e^2 + · · ·) = τ_0/(1 − 1/e) ≤ 2τ_0.
Clearly, D ≤ (n + 1)(N − n)/2 ≤ N^2 because N ≥ n. Then we obtain the result that E[T] = O(n^3 ln N). □

Lastly, we bound the expected number of transitions executed in Algorithm.

Proof of Theorem 4: We denote by T the coalescence time of our chain. Clearly T is a random variable. Put K := ⌈log_2 T⌉. Algorithm terminates by the time it sets the starting time period to 2^K at the (K + 1)-st iteration. Since we need to execute two chains, from both X^U and X^L, the total number of simulated transitions is bounded by 2(2^0 + 2^1 + · · · + 2^K) < 2^{K+2} ≤ 8T. Thus the expectation of the total number of transitions of M is bounded by E[8T] = O(n^3 ln N). We can assume Condition 1 by sorting the parameters in O(n ln n) time. □

6 Conclusion

In this paper, we proposed a monotone Markov chain whose stationary distribution is a discretized Dirichlet distribution. By employing Propp and Wilson's result on the monotone CFTP algorithm, we can construct a perfect sampling algorithm. We showed that our Markov chain is rapidly mixing by using the path coupling method. The rapidity implies that our perfect sampling algorithm is a polynomial time algorithm. The obtained time complexity does not depend on the magnitudes of the parameters. We can reduce the memory requirement by employing Wilson's read-once algorithm in [12].
Appendix

Lemma 5 is essentially equivalent to the lemma appearing in the Appendix of the paper [6] by Matsui, Motoki and Kamatani, which deals with an approximate sampler for the discretized Dirichlet distribution. Lemma 6 is a new result obtained in this paper.

Lemma 5. ∀ℓ ∈ {2, 3, ...}, ∀u_i, u_j ≥ 0, ∀k ∈ {1, 2, ..., ℓ − 1}, g_k^{ℓ+1}(u_i, u_j) ≤ g_k^ℓ(u_i, u_j) ≤ g_{k+1}^{ℓ+1}(u_i, u_j).

Proof: In the following, we show the second inequality; we can show the first inequality in a similar way. We denote C(u_i, u_j, ℓ + 1) = C_{ℓ+1} and C(u_i, u_j, ℓ) = C_ℓ for simplicity. From the definition of g_k^ℓ(u_i, u_j), we obtain that
  H(k) := g_{k+1}^{ℓ+1}(u_i, u_j) − g_k^ℓ(u_i, u_j)
  = Σ_{l=1}^{k+1} C_{ℓ+1}^{−1} l^{u_i}(ℓ + 1 − l)^{u_j} − Σ_{l=1}^{k} C_ℓ^{−1} l^{u_i}(ℓ − l)^{u_j}
  = Σ_{l=k+1}^{ℓ−1} C_ℓ^{−1} l^{u_i}(ℓ − l)^{u_j} − Σ_{l=k+2}^{ℓ} C_{ℓ+1}^{−1} l^{u_i}(ℓ + 1 − l)^{u_j}
  = Σ_{l=k+1}^{ℓ−1} C_ℓ^{−1} C_{ℓ+1}^{−1} l^{u_i}(ℓ − l)^{u_j} h(l),     (1)
where the last equality follows by shifting the index of the second sum, and the function h : {1, 2, ..., ℓ − 1} → R is defined by h(l) := C_{ℓ+1} − (1 + 1/l)^{u_i} C_ℓ. Similarly, by shifting the index of the first sum instead, we can also show that
  H(k) = C_{ℓ+1}^{−1} ℓ^{u_j} − Σ_{l=1}^{k} C_ℓ^{−1} C_{ℓ+1}^{−1} l^{u_i}(ℓ − l)^{u_j} h(l).     (2)
Since u_i ≥ 0, the factor (1 + 1/l)^{u_i} is non-increasing in l, and so the function h(l) is monotone non-decreasing. Consider the case that h(k + 1) ≥ 0. Then 0 ≤ h(k + 1) ≤ h(k + 2) ≤ · · · ≤ h(ℓ − 1), and (1) implies the non-negativity H(k) ≥ 0. Consider the case that h(k + 1) < 0. Then the inequalities h(1) ≤ h(2) ≤ · · · ≤ h(k) < 0 hold, and (2) implies that H(k) ≥ C_{ℓ+1}^{−1} ℓ^{u_j} > 0. In both cases H(k) ≥ 0, which is the desired inequality. □
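Lemma 5 can also be checked numerically on small instances. The following sketch (our own code, not part of the paper) tabulates the vectors g^ℓ and g^{ℓ+1} and verifies the alternating inequalities g_k^{ℓ+1} ≤ g_k^ℓ ≤ g_{k+1}^{ℓ+1}.

```python
def g_vector(ui, uj, ell):
    """Cumulative sums g_0 = 0 < ... < g_{ell-1} = 1 of f(l, ell - l)
    proportional to l^ui * (ell - l)^uj."""
    w = [l ** ui * (ell - l) ** uj for l in range(1, ell)]
    C = sum(w)
    g = [0.0]
    for wl in w:
        g.append(g[-1] + wl / C)
    return g

def alternating_ok(ui, uj, ell, tol=1e-12):
    """Check g^{ell+1}_k <= g^{ell}_k <= g^{ell+1}_{k+1} for k = 1, ..., ell - 1."""
    g_small, g_big = g_vector(ui, uj, ell), g_vector(ui, uj, ell + 1)
    return all(g_big[k] <= g_small[k] + tol and g_small[k] <= g_big[k + 1] + tol
               for k in range(1, ell))
```

Looping over a grid of non-negative parameter values and small ℓ returns True throughout, in agreement with the lemma.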
Lemma 6. ∀ℓ ∈ {2, 3, ...}, ∀u_i ≥ u_j ≥ 0,
  Σ_{k=1}^{ℓ−1} [g_k^ℓ(u_i, u_j) − g_k^{ℓ+1}(u_i, u_j)] − Σ_{k=1}^{ℓ−1} [g_k^{ℓ+1}(u_i, u_j) − g_{k−1}^ℓ(u_i, u_j)] ≥ 0.

Proof: We denote C(u_i, u_j, ℓ + 1) = C_{ℓ+1} and C(u_i, u_j, ℓ) = C_ℓ for simplicity, and denote the left hand side of the claimed inequality by G. Using g_0^ℓ = 0, g_{ℓ−1}^ℓ = 1 and g_ℓ^{ℓ+1} = 1, it is not difficult to show the following equalities:
  G = 2 Σ_{k=1}^{ℓ−1} g_k^ℓ(u_i, u_j) − 2 Σ_{k=1}^{ℓ} g_k^{ℓ+1}(u_i, u_j) + 1
  = 2 C_ℓ^{−1} Σ_{l=1}^{ℓ−1} l^{u_i} (ℓ − l)^{u_j + 1} − 2 C_{ℓ+1}^{−1} Σ_{l=1}^{ℓ} l^{u_i} (ℓ + 1 − l)^{u_j + 1} + 1
  = C_ℓ^{−1} C_{ℓ+1}^{−1} Σ_{k=1}^{ℓ−1} Σ_{l=1}^{ℓ} (kl)^{u_i} ((ℓ − k)(ℓ + 1 − l))^{u_j} (2l − 2k − 1),
where the second equality follows by exchanging the order of summation in the definition of g, and the third by expanding C_ℓ and C_{ℓ+1} into sums and collecting terms. Observe that the factor (2l − 2k − 1) changes its sign under the exchange of the index pair (k, l) with (ℓ − k, ℓ + 1 − l), while the bases kl and (ℓ − k)(ℓ + 1 − l) are exchanged. Grouping the terms of the double sum accordingly (with some care at the boundary indices), G can be written as a combination, with non-negative coefficients, of functions G_0(k, l, u_i, u_j) in which the exponents u_i and u_j are exchanged between products of the indices. Each function G_0(k, l, u_i, u_j) is non-decreasing with respect to u_i, and so G_0(k, l, u_i, u_j) ≥ G_0(k, l, u_j, u_j). By substituting u_i by u_j in the definition
of G_0(k, l, u_i, u_j), it is easy to see that G_0(k, l, u_j, u_j) = 0. Thus we have the desired result. □

References

1. Bubley, R., Dyer, M.: Path coupling: a technique for proving rapid mixing in Markov chains. In: 38th Annual Symposium on Foundations of Computer Science, IEEE, San Alimitos, 1997.
2. Bubley, R.: Randomized Algorithms: Approximation, Generation, and Counting. Springer-Verlag, New York, 2001.
3. Dimakos, X. K.: A guide to exact simulation. International Statistical Review, 69 (2001).
4. Durbin, R., Eddy, S., Krogh, A., Mitchison, G.: Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.
5. Kijima, S., Matsui, T.: Polynomial Time Perfect Sampling Algorithm for Two-rowed Contingency Tables. METR 2003-5, Mathematical Engineering Technical Reports, University of Tokyo, 2003.
6. Matsui, T., Motoki, M., Kamatani, N.: Polynomial Time Approximate Sampler for Discretized Dirichlet Distribution. METR 2003, Mathematical Engineering Technical Reports, University of Tokyo, 2003.
7. Niu, T., Qin, Z. S., Xu, X., Liu, J. S.: Bayesian haplotype inference for multiple linked single-nucleotide polymorphisms. Am. J. Hum. Genet.
8. Pritchard, J. K., Stephens, M., Donnelly, P.: Inference of population structure using multilocus genotype data. Genetics.
9. Propp, J., Wilson, D.: Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures and Algorithms, 9 (1996).
10. Propp, J., Wilson, D.: How to get a perfectly random sample from a generic Markov chain and generate a random spanning tree of a directed graph. J. Algorithms, 27 (1998).
11. Robert, C. P.: The Bayesian Choice. Springer-Verlag, New York, 2001.
12. Wilson, D.: How to couple from the past using a read-once source of randomness. Random Structures and Algorithms, 16 (2000).
More informationMATHEMATICAL ENGINEERING TECHNICAL REPORTS
MATHEMATICAL ENGINEERING TECHNICAL REPORTS Combinatorial Relaxation Algorithm for the Entire Sequence of the Maximum Degree of Minors in Mixed Polynomial Matrices Shun SATO (Communicated by Taayasu MATSUO)
More informationPolynomial-time counting and sampling of two-rowed contingency tables
Polynomial-time counting and sampling of two-rowed contingency tables Martin Dyer and Catherine Greenhill School of Computer Studies University of Leeds Leeds LS2 9JT UNITED KINGDOM Abstract In this paper
More informationMARKOV CHAIN MONTE CARLO
MARKOV CHAIN MONTE CARLO RYAN WANG Abstract. This paper gives a brief introduction to Markov Chain Monte Carlo methods, which offer a general framework for calculating difficult integrals. We start with
More informationThe coupling method - Simons Counting Complexity Bootcamp, 2016
The coupling method - Simons Counting Complexity Bootcamp, 2016 Nayantara Bhatnagar (University of Delaware) Ivona Bezáková (Rochester Institute of Technology) January 26, 2016 Techniques for bounding
More informationApproximate Counting and Markov Chain Monte Carlo
Approximate Counting and Markov Chain Monte Carlo A Randomized Approach Arindam Pal Department of Computer Science and Engineering Indian Institute of Technology Delhi March 18, 2011 April 8, 2011 Arindam
More informationINTRODUCTION TO MARKOV CHAIN MONTE CARLO
INTRODUCTION TO MARKOV CHAIN MONTE CARLO 1. Introduction: MCMC In its simplest incarnation, the Monte Carlo method is nothing more than a computerbased exploitation of the Law of Large Numbers to estimate
More informationAdvanced Sampling Algorithms
+ Advanced Sampling Algorithms + Mobashir Mohammad Hirak Sarkar Parvathy Sudhir Yamilet Serrano Llerena Advanced Sampling Algorithms Aditya Kulkarni Tobias Bertelsen Nirandika Wanigasekara Malay Singh
More informationProbability & Computing
Probability & Computing Stochastic Process time t {X t t 2 T } state space Ω X t 2 state x 2 discrete time: T is countable T = {0,, 2,...} discrete space: Ω is finite or countably infinite X 0,X,X 2,...
More informationStat 516, Homework 1
Stat 516, Homework 1 Due date: October 7 1. Consider an urn with n distinct balls numbered 1,..., n. We sample balls from the urn with replacement. Let N be the number of draws until we encounter a ball
More informationBayesian inference with reliability methods without knowing the maximum of the likelihood function
Bayesian inference with reliaility methods without knowing the maximum of the likelihood function Wolfgang Betz a,, James L. Beck, Iason Papaioannou a, Daniel Strau a a Engineering Risk Analysis Group,
More informationThe Incomplete Perfect Phylogeny Haplotype Problem
The Incomplete Perfect Phylogeny Haplotype Prolem Gad Kimmel 1 and Ron Shamir 1 School of Computer Science, Tel-Aviv University, Tel-Aviv 69978, Israel. Email: kgad@tau.ac.il, rshamir@tau.ac.il. Astract.
More informationOnline Supplementary Appendix B
Online Supplementary Appendix B Uniqueness of the Solution of Lemma and the Properties of λ ( K) We prove the uniqueness y the following steps: () (A8) uniquely determines q as a function of λ () (A) uniquely
More informationApproximate Counting
Approximate Counting Andreas-Nikolas Göbel National Technical University of Athens, Greece Advanced Algorithms, May 2010 Outline 1 The Monte Carlo Method Introduction DNFSAT Counting 2 The Markov Chain
More informationMarkov Chains and MCMC
Markov Chains and MCMC CompSci 590.02 Instructor: AshwinMachanavajjhala Lecture 4 : 590.02 Spring 13 1 Recap: Monte Carlo Method If U is a universe of items, and G is a subset satisfying some property,
More informationAn Introduction to Randomized algorithms
An Introduction to Randomized algorithms C.R. Subramanian The Institute of Mathematical Sciences, Chennai. Expository talk presented at the Research Promotion Workshop on Introduction to Geometric and
More informationApril 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning
for for Advanced Topics in California Institute of Technology April 20th, 2017 1 / 50 Table of Contents for 1 2 3 4 2 / 50 History of methods for Enrico Fermi used to calculate incredibly accurate predictions
More informationDiscrete solid-on-solid models
Discrete solid-on-solid models University of Alberta 2018 COSy, University of Manitoba - June 7 Discrete processes, stochastic PDEs, deterministic PDEs Table: Deterministic PDEs Heat-diffusion equation
More informationThe Markov Chain Monte Carlo Method
The Markov Chain Monte Carlo Method Idea: define an ergodic Markov chain whose stationary distribution is the desired probability distribution. Let X 0, X 1, X 2,..., X n be the run of the chain. The Markov
More informationPropp-Wilson Algorithm (and sampling the Ising model)
Propp-Wilson Algorithm (and sampling the Ising model) Danny Leshem, Nov 2009 References: Haggstrom, O. (2002) Finite Markov Chains and Algorithmic Applications, ch. 10-11 Propp, J. & Wilson, D. (1996)
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is
More informationComputational statistics
Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated
More informationLecture 6: September 22
CS294 Markov Chain Monte Carlo: Foundations & Applications Fall 2009 Lecture 6: September 22 Lecturer: Prof. Alistair Sinclair Scribes: Alistair Sinclair Disclaimer: These notes have not been subjected
More informationMarkov Chain Monte Carlo (MCMC)
Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can
More informationPerfect simulation for image analysis
Perfect simulation for image analysis Mark Huber Fletcher Jones Foundation Associate Professor of Mathematics and Statistics and George R. Roberts Fellow Mathematical Sciences Claremont McKenna College
More informationLecture 7 and 8: Markov Chain Monte Carlo
Lecture 7 and 8: Markov Chain Monte Carlo 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering University of Cambridge http://mlg.eng.cam.ac.uk/teaching/4f13/ Ghahramani
More informationLecture 9: Counting Matchings
Counting and Sampling Fall 207 Lecture 9: Counting Matchings Lecturer: Shayan Oveis Gharan October 20 Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications.
More informationCh5. Markov Chain Monte Carlo
ST4231, Semester I, 2003-2004 Ch5. Markov Chain Monte Carlo In general, it is very difficult to simulate the value of a random vector X whose component random variables are dependent. In this chapter we
More informationMinimizing a convex separable exponential function subject to linear equality constraint and bounded variables
Minimizing a convex separale exponential function suect to linear equality constraint and ounded variales Stefan M. Stefanov Department of Mathematics Neofit Rilski South-Western University 2700 Blagoevgrad
More informationConvergence Rate of Markov Chains
Convergence Rate of Markov Chains Will Perkins April 16, 2013 Convergence Last class we saw that if X n is an irreducible, aperiodic, positive recurrent Markov chain, then there exists a stationary distribution
More informationIntroduction to Machine Learning CMU-10701
Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov
More informationMATHEMATICAL ENGINEERING TECHNICAL REPORTS. The Symmetric Quadratic Semi-Assignment Polytope
MATHEMATICAL ENGINEERING TECHNICAL REPORTS The Symmetric Quadratic Semi-Assignment Polytope Hiroo SAITO (Communicated by Kazuo MUROTA) METR 2005 21 August 2005 DEPARTMENT OF MATHEMATICAL INFORMATICS GRADUATE
More informationMCMC: Markov Chain Monte Carlo
I529: Machine Learning in Bioinformatics (Spring 2013) MCMC: Markov Chain Monte Carlo Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2013 Contents Review of Markov
More informationEssential Maths 1. Macquarie University MAFC_Essential_Maths Page 1 of These notes were prepared by Anne Cooper and Catriona March.
Essential Maths 1 The information in this document is the minimum assumed knowledge for students undertaking the Macquarie University Masters of Applied Finance, Graduate Diploma of Applied Finance, and
More informationThe Monte Carlo Method
The Monte Carlo Method Example: estimate the value of π. Choose X and Y independently and uniformly at random in [0, 1]. Let Pr(Z = 1) = π 4. 4E[Z] = π. { 1 if X Z = 2 + Y 2 1, 0 otherwise, Let Z 1,...,
More informationRepresentation theory of SU(2), density operators, purification Michael Walter, University of Amsterdam
Symmetry and Quantum Information Feruary 6, 018 Representation theory of S(), density operators, purification Lecture 7 Michael Walter, niversity of Amsterdam Last week, we learned the asic concepts of
More informationan introduction to bayesian inference
with an application to network analysis http://jakehofman.com january 13, 2010 motivation would like models that: provide predictive and explanatory power are complex enough to describe observed phenomena
More informationAdvances and Applications in Perfect Sampling
and Applications in Perfect Sampling Ph.D. Dissertation Defense Ulrike Schneider advisor: Jem Corcoran May 8, 2003 Department of Applied Mathematics University of Colorado Outline Introduction (1) MCMC
More informationPerfect simulation of repulsive point processes
Perfect simulation of repulsive point processes Mark Huber Department of Mathematical Sciences, Claremont McKenna College 29 November, 2011 Mark Huber, CMC Perfect simulation of repulsive point processes
More informationIN this paper we study a discrete optimization problem. Constrained Shortest Link-Disjoint Paths Selection: A Network Programming Based Approach
Constrained Shortest Link-Disjoint Paths Selection: A Network Programming Based Approach Ying Xiao, Student Memer, IEEE, Krishnaiyan Thulasiraman, Fellow, IEEE, and Guoliang Xue, Senior Memer, IEEE Astract
More information25.1 Markov Chain Monte Carlo (MCMC)
CS880: Approximations Algorithms Scribe: Dave Andrzejewski Lecturer: Shuchi Chawla Topic: Approx counting/sampling, MCMC methods Date: 4/4/07 The previous lecture showed that, for self-reducible problems,
More informationThe variance for partial match retrievals in k-dimensional bucket digital trees
The variance for partial match retrievals in k-dimensional ucket digital trees Michael FUCHS Department of Applied Mathematics National Chiao Tung University January 12, 21 Astract The variance of partial
More informationMarkov Chain Monte Carlo
Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).
More informationLuis Manuel Santana Gallego 100 Investigation and simulation of the clock skew in modern integrated circuits. Clock Skew Model
Luis Manuel Santana Gallego 100 Appendix 3 Clock Skew Model Xiaohong Jiang and Susumu Horiguchi [JIA-01] 1. Introduction The evolution of VLSI chips toward larger die sizes and faster clock speeds makes
More information6 Markov Chain Monte Carlo (MCMC)
6 Markov Chain Monte Carlo (MCMC) The underlying idea in MCMC is to replace the iid samples of basic MC methods, with dependent samples from an ergodic Markov chain, whose limiting (stationary) distribution
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate
More informationStochastic Simulation
Stochastic Simulation Idea: probabilities samples Get probabilities from samples: X count x 1 n 1. x k total. n k m X probability x 1. n 1 /m. x k n k /m If we could sample from a variable s (posterior)
More informationLecture 28: April 26
CS271 Randomness & Computation Spring 2018 Instructor: Alistair Sinclair Lecture 28: April 26 Disclaimer: These notes have not been subjected to the usual scrutiny accorded to formal publications. They
More informationMarkov Chains CK eqns Classes Hitting times Rec./trans. Strong Markov Stat. distr. Reversibility * Markov Chains
Markov Chains A random process X is a family {X t : t T } of random variables indexed by some set T. When T = {0, 1, 2,... } one speaks about a discrete-time process, for T = R or T = [0, ) one has a continuous-time
More informationPolynomials Characterizing Hyper b-ary Representations
1 2 3 47 6 23 11 Journal of Integer Sequences, Vol. 21 (2018), Article 18.4.3 Polynomials Characterizing Hyper -ary Representations Karl Dilcher 1 Department of Mathematics and Statistics Dalhousie University
More informationTheory of Stochastic Processes 8. Markov chain Monte Carlo
Theory of Stochastic Processes 8. Markov chain Monte Carlo Tomonari Sei sei@mist.i.u-tokyo.ac.jp Department of Mathematical Informatics, University of Tokyo June 8, 2017 http://www.stat.t.u-tokyo.ac.jp/~sei/lec.html
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationReminder of some Markov Chain properties:
Reminder of some Markov Chain properties: 1. a transition from one state to another occurs probabilistically 2. only state that matters is where you currently are (i.e. given present, future is independent
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationA Bayesian Approach to Phylogenetics
A Bayesian Approach to Phylogenetics Niklas Wahlberg Based largely on slides by Paul Lewis (www.eeb.uconn.edu) An Introduction to Bayesian Phylogenetics Bayesian inference in general Markov chain Monte
More informationAnswers and expectations
Answers and expectations For a function f(x) and distribution P(x), the expectation of f with respect to P is The expectation is the average of f, when x is drawn from the probability distribution P E
More informationLecture 6 January 15, 2014
Advanced Graph Algorithms Jan-Apr 2014 Lecture 6 January 15, 2014 Lecturer: Saket Sourah Scrie: Prafullkumar P Tale 1 Overview In the last lecture we defined simple tree decomposition and stated that for
More informationERASMUS UNIVERSITY ROTTERDAM Information concerning the Entrance examination Mathematics level 2 for International Business Administration (IBA)
ERASMUS UNIVERSITY ROTTERDAM Information concerning the Entrance examination Mathematics level 2 for International Business Administration (IBA) General information Availale time: 2.5 hours (150 minutes).
More informationSTA 294: Stochastic Processes & Bayesian Nonparametrics
MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a
More informationMachine Learning using Bayesian Approaches
Machine Learning using Bayesian Approaches Sargur N. Srihari University at Buffalo, State University of New York 1 Outline 1. Progress in ML and PR 2. Fully Bayesian Approach 1. Probability theory Bayes
More informationDecomposition Methods and Sampling Circuits in the Cartesian Lattice
Decomposition Methods and Sampling Circuits in the Cartesian Lattice Dana Randall College of Computing and School of Mathematics Georgia Institute of Technology Atlanta, GA 30332-0280 randall@math.gatech.edu
More informationCONTINUOUS DEPENDENCE ESTIMATES FOR VISCOSITY SOLUTIONS OF FULLY NONLINEAR DEGENERATE ELLIPTIC EQUATIONS
Electronic Journal of Differential Equations, Vol. 20022002), No. 39, pp. 1 10. ISSN: 1072-6691. URL: http://ejde.math.swt.edu or http://ejde.math.unt.edu ftp ejde.math.swt.edu login: ftp) CONTINUOUS DEPENDENCE
More information#A50 INTEGERS 14 (2014) ON RATS SEQUENCES IN GENERAL BASES
#A50 INTEGERS 14 (014) ON RATS SEQUENCES IN GENERAL BASES Johann Thiel Dept. of Mathematics, New York City College of Technology, Brooklyn, New York jthiel@citytech.cuny.edu Received: 6/11/13, Revised:
More informationMachine Learning for Data Science (CS4786) Lecture 24
Machine Learning for Data Science (CS4786) Lecture 24 Graphical Models: Approximate Inference Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016sp/ BELIEF PROPAGATION OR MESSAGE PASSING Each
More informationarxiv: v2 [math.pr] 29 Jul 2012
Appendix to Approximating perpetuities arxiv:203.0679v2 [math.pr] 29 Jul 202 Margarete Knape and Ralph Neininger Institute for Mathematics J.W. Goethe-University 60054 Frankfurt a.m. Germany July 25, 202
More informationSampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo. Sampling Methods. Oliver Schulte - CMPT 419/726. Bishop PRML Ch.
Sampling Methods Oliver Schulte - CMP 419/726 Bishop PRML Ch. 11 Recall Inference or General Graphs Junction tree algorithm is an exact inference method for arbitrary graphs A particular tree structure
More informationBayesian Methods with Monte Carlo Markov Chains II
Bayesian Methods with Monte Carlo Markov Chains II Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University hslu@stat.nctu.edu.tw http://tigpbp.iis.sinica.edu.tw/courses.htm 1 Part 3
More informationOptimal Routing in Chord
Optimal Routing in Chord Prasanna Ganesan Gurmeet Singh Manku Astract We propose optimal routing algorithms for Chord [1], a popular topology for routing in peer-to-peer networks. Chord is an undirected
More informationLecture 6: Graphical Models: Learning
Lecture 6: Graphical Models: Learning 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering, University of Cambridge February 3rd, 2010 Ghahramani & Rasmussen (CUED)
More informationStarAI Full, 6+1 pages Short, 2 page position paper or abstract
StarAI 2015 Fifth International Workshop on Statistical Relational AI At the 31st Conference on Uncertainty in Artificial Intelligence (UAI) (right after ICML) In Amsterdam, The Netherlands, on July 16.
More informationSC7/SM6 Bayes Methods HT18 Lecturer: Geoff Nicholls Lecture 2: Monte Carlo Methods Notes and Problem sheets are available at http://www.stats.ox.ac.uk/~nicholls/bayesmethods/ and via the MSc weblearn pages.
More informationCoupling. 2/3/2010 and 2/5/2010
Coupling 2/3/2010 and 2/5/2010 1 Introduction Consider the move to middle shuffle where a card from the top is placed uniformly at random at a position in the deck. It is easy to see that this Markov Chain
More informationarxiv: v1 [cs.fl] 24 Nov 2017
(Biased) Majority Rule Cellular Automata Bernd Gärtner and Ahad N. Zehmakan Department of Computer Science, ETH Zurich arxiv:1711.1090v1 [cs.fl] 4 Nov 017 Astract Consider a graph G = (V, E) and a random
More informationConductance and Rapidly Mixing Markov Chains
Conductance and Rapidly Mixing Markov Chains Jamie King james.king@uwaterloo.ca March 26, 2003 Abstract Conductance is a measure of a Markov chain that quantifies its tendency to circulate around its states.
More information18 : Advanced topics in MCMC. 1 Gibbs Sampling (Continued from the last lecture)
10-708: Probabilistic Graphical Models 10-708, Spring 2014 18 : Advanced topics in MCMC Lecturer: Eric P. Xing Scribes: Jessica Chemali, Seungwhan Moon 1 Gibbs Sampling (Continued from the last lecture)
More information27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling
10-708: Probabilistic Graphical Models 10-708, Spring 2014 27 : Distributed Monte Carlo Markov Chain Lecturer: Eric P. Xing Scribes: Pengtao Xie, Khoa Luu In this scribe, we are going to review the Parallel
More informationSampling Methods (11/30/04)
CS281A/Stat241A: Statistical Learning Theory Sampling Methods (11/30/04) Lecturer: Michael I. Jordan Scribe: Jaspal S. Sandhu 1 Gibbs Sampling Figure 1: Undirected and directed graphs, respectively, with
More informationMarkov Processes. Stochastic process. Markov process
Markov Processes Stochastic process movement through a series of well-defined states in a way that involves some element of randomness for our purposes, states are microstates in the governing ensemble
More informationTheory of Stochastic Processes 3. Generating functions and their applications
Theory of Stochastic Processes 3. Generating functions and their applications Tomonari Sei sei@mist.i.u-tokyo.ac.jp Department of Mathematical Informatics, University of Tokyo April 20, 2017 http://www.stat.t.u-tokyo.ac.jp/~sei/lec.html
More information1 Hoeffding s Inequality
Proailistic Method: Hoeffding s Inequality and Differential Privacy Lecturer: Huert Chan Date: 27 May 22 Hoeffding s Inequality. Approximate Counting y Random Sampling Suppose there is a ag containing
More informationUpper Bounds for Stern s Diatomic Sequence and Related Sequences
Upper Bounds for Stern s Diatomic Sequence and Related Sequences Colin Defant Department of Mathematics University of Florida, U.S.A. cdefant@ufl.edu Sumitted: Jun 18, 01; Accepted: Oct, 016; Pulished:
More informationImproving the Asymptotic Performance of Markov Chain Monte-Carlo by Inserting Vortices
Improving the Asymptotic Performance of Markov Chain Monte-Carlo by Inserting Vortices Yi Sun IDSIA Galleria, Manno CH-98, Switzerland yi@idsia.ch Faustino Gomez IDSIA Galleria, Manno CH-98, Switzerland
More informationL 2 Discrepancy of Two-Dimensional Digitally Shifted Hammersley Point Sets in Base b
L Discrepancy of Two-Dimensional Digitally Shifted Hammersley Point Sets in Base Henri Faure and Friedrich Pillichshammer Astract We give an exact formula for the L discrepancy of two-dimensional digitally
More informationMinimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA
Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals Satoshi AOKI and Akimichi TAKEMURA Graduate School of Information Science and Technology University
More informationCS281A/Stat241A Lecture 22
CS281A/Stat241A Lecture 22 p. 1/4 CS281A/Stat241A Lecture 22 Monte Carlo Methods Peter Bartlett CS281A/Stat241A Lecture 22 p. 2/4 Key ideas of this lecture Sampling in Bayesian methods: Predictive distribution
More informationMarkov Chain Monte Carlo
Chapter 5 Markov Chain Monte Carlo MCMC is a kind of improvement of the Monte Carlo method By sampling from a Markov chain whose stationary distribution is the desired sampling distributuion, it is possible
More informationINTRODUCTION TO MCMC AND PAGERANK. Eric Vigoda Georgia Tech. Lecture for CS 6505
INTRODUCTION TO MCMC AND PAGERANK Eric Vigoda Georgia Tech Lecture for CS 6505 1 MARKOV CHAIN BASICS 2 ERGODICITY 3 WHAT IS THE STATIONARY DISTRIBUTION? 4 PAGERANK 5 MIXING TIME 6 PREVIEW OF FURTHER TOPICS
More informationSolving Homogeneous Trees of Sturm-Liouville Equations using an Infinite Order Determinant Method
Paper Civil-Comp Press, Proceedings of the Eleventh International Conference on Computational Structures Technology,.H.V. Topping, Editor), Civil-Comp Press, Stirlingshire, Scotland Solving Homogeneous
More informationInference in Bayesian Networks
Andrea Passerini passerini@disi.unitn.it Machine Learning Inference in graphical models Description Assume we have evidence e on the state of a subset of variables E in the model (i.e. Bayesian Network)
More informationINTRODUCTION TO MCMC AND PAGERANK. Eric Vigoda Georgia Tech. Lecture for CS 6505
INTRODUCTION TO MCMC AND PAGERANK Eric Vigoda Georgia Tech Lecture for CS 6505 1 MARKOV CHAIN BASICS 2 ERGODICITY 3 WHAT IS THE STATIONARY DISTRIBUTION? 4 PAGERANK 5 MIXING TIME 6 PREVIEW OF FURTHER TOPICS
More information