Statistical-Mechanical Approach to Probabilistic Image Processing: Loopy Belief Propagation and Advanced Mean-Field Method


Kazuyuki Tanaka (1) and Noriko Yoshiike (2)
Graduate School of Information Sciences, Tohoku University, Aramaki-aza-Aoba 09, Aoba-ku, Sendai, Japan

Abstract

The framework of Bayesian image restoration for multi-valued images is presented by means of multi-state classical spin systems. Hyperparameters in the probabilistic models are determined so as to maximize the marginal likelihood. A practical algorithm for multi-valued image restoration is described, based on loopy belief propagation in probabilistic inference. Loopy belief propagation is equivalent to the Bethe approximation and to the pair approximation of the cluster variation method in statistical mechanics.

1 Introduction

The Bayesian approach is one of the most powerful frameworks for probabilistic image processing [1, 2, 3, 4]. In the Bayesian approach to image processing, probabilistic models that were originally proposed as physical models of magnetic materials in statistical mechanics are useful as a priori probabilistic models. At first sight, image processing and magnetic materials seem to have nothing to do with each other. However, both are defined on a regular lattice and are composed of many interacting elements. In magnetic materials, many spins are arrayed on a regular lattice and interact locally with each other. In image processing, images are constructed from pixels, and the pixels are arrayed on a regular lattice. In order to process an image for a given purpose, a filter is designed, and many image processing filters have structures in which the output at each pixel is determined from the configuration of the neighboring pixels. Recently, many researchers have investigated the design of new image processing systems by paying attention to this similarity [5, 6].

When we formulate image processing as a probabilistic model, ideas from statistical-mechanical models are very useful. Although such probabilistic models usually contain a massive number of variables and are difficult to treat analytically, approximate algorithms can be constructed by using advanced mean-field methods from statistical mechanics. We expect to pioneer a new paradigm of computation in image processing by using statistical-mechanical techniques originally proposed to study nature in materials science. Advanced mean-field methods are powerful tools for massive probabilistic models and have been applied to probabilistic algorithms [7]. Moreover, some computer scientists are interested in the similarity between advanced mean-field methods and the loopy belief propagation that has been investigated in order to construct probabilistic inference systems in artificial intelligence. An extremum condition for minimizing an approximate free energy in an advanced mean-field method is sometimes equivalent to the update rule of loopy belief propagation [8, 9]. Recently, advanced mean-field methods have also been applied to Bayesian image processing [10, 11].

In Bayesian image processing, we have to determine hyperparameters. Some hyperparameters correspond to interaction parameters or other model parameters. In practice, the hyperparameters are commonly determined so as to maximize a marginal likelihood [4, 12, 13]. In some probabilistic models treated in statistical mechanics, phase transitions occur at a critical point.
We have the first-order and the second-order phase transitions as the familiar types of phase transition. In a first-order phase transition, the first derivative of the free energy is discontinuous at the critical point.

(1) kazu@statp.is.tohoku.ac.jp, URL: kazu/index-e.html
(2) yoshiike@statp.is.tohoku.ac.jp, URL: yoshiike/

In a second-order phase transition, the first derivative of the free energy is continuous everywhere, while the second derivative is discontinuous at the critical point. If the probabilistic model adopted as the a priori probability distribution has a second-order phase transition, the free energy is differentiable at every value of the hyperparameter. On the other hand, if the probabilistic model has a first-order phase transition, the free energy is not differentiable at the critical value of the hyperparameter. The marginal likelihood is expressed in terms of the free energies of the a priori probability distribution and the a posteriori probability distribution. This means that the marginal likelihood is not differentiable at a critical point of the hyperparameter of the a priori probability distribution.

In the present paper, we review the framework of loopy belief propagation in image processing based on a variational principle for the minimization of an approximate free energy of a probabilistic model defined on the square lattice. Bayesian image processing is formulated by using Bayesian statistics and the maximization of a marginal likelihood, adopting the Q-state Potts model and the Q-Ising model as the a priori probability distribution. When treated by advanced mean-field methods, the free energy of the Q-state Potts model for $Q \ge 3$ is not differentiable at a certain value of the hyperparameter. The maximization of the marginal likelihood is achieved by using loopy belief propagation. Part of the present work is a research collaboration with Professor D. M. Titterington of the University of Glasgow and Professor J. Inoue of Hokkaido University.

2 Bayesian Image Analysis and Hyperparameter Estimation

We consider an image on a square lattice $\Omega \equiv \{i\}$ such that each pixel takes one of the gray levels $\mathcal{Q} = \{0, 1, 2, \ldots, Q-1\}$, with $0$ and $Q-1$ corresponding to black and white, respectively. The intensities at pixel $i$ in the original image and the degraded image are regarded as random variables denoted by $F_i$ and $G_i$, respectively. The random fields of intensities in the original and degraded images are then represented by $F \equiv \{F_i\}$ and $G \equiv \{G_i\}$, and the actual original and degraded images are denoted by $f = \{f_i\}$ and $g = \{g_i\}$, respectively. The probability that the original image is $f$, $\Pr\{F=f\}$, is called the a priori probability of the image. By the Bayes formula, the a posteriori probability $\Pr\{F=f \mid G=g\}$ that the original image is $f$ when the given degraded image is $g$ is expressed as
\[
\Pr\{F=f \mid G=g\} = \frac{\Pr\{G=g \mid F=f\}\,\Pr\{F=f\}}{\sum_{z}\Pr\{G=g \mid F=z\}\,\Pr\{F=z\}}, \tag{1}
\]
where the summation $\sum_{z}$ is taken over all possible configurations of images $z = \{z_i\}$. The probability $\Pr\{G=g \mid F=f\}$ is the conditional probability that the degraded image is $g$ when the original image is $f$, and it describes the degradation process. In the present paper, it is assumed that the degraded image $g$ is generated from the original image $f$ by changing the intensity of each pixel to each of the other intensities with the same probability $p$, independently of the other pixels; the probability that the intensity remains unchanged is therefore $1-(Q-1)p$, which constrains $p$ to lie in the range $0 \le p \le 1/Q$. The conditional probability distribution associated with the degradation process when the original image is $f$ is
\[
\Pr\{G=g \mid F=f\} = \Pr\{G=g \mid F=f, p\} = \prod_{i\in\Omega}\Bigl[\,p\,\bigl(1-\delta_{f_i,g_i}\bigr) + \bigl(1-(Q-1)p\bigr)\,\delta_{f_i,g_i}\Bigr], \tag{2}
\]
where $\delta_{a,b}$ is the Kronecker delta.
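As a concrete illustration of the degradation process (2), the following short NumPy sketch (ours, not part of the original formulation) samples a degraded image $g$ from a multi-valued original image $f$; the function name `degrade` and the 64x64 toy image are illustrative choices.

```python
import numpy as np

def degrade(f, p, Q, rng=None):
    """Sample a degraded image g from f under Eq. (2): each pixel keeps its
    intensity with probability 1-(Q-1)p and is switched to each of the other
    Q-1 intensities with probability p, independently of the other pixels."""
    rng = np.random.default_rng() if rng is None else rng
    f = np.asarray(f)
    g = f.copy()
    corrupted = rng.random(f.shape) < (Q - 1) * p      # total corruption probability
    shift = rng.integers(1, Q, size=f.shape)           # uniform over the other Q-1 levels
    g[corrupted] = (f[corrupted] + shift[corrupted]) % Q
    return g

# Toy usage: a 4-valued 64x64 image degraded with p = 0.1, i.e. (Q-1)p = 0.3.
Q = 4
f = np.random.default_rng(0).integers(0, Q, size=(64, 64))
g = degrade(f, p=0.1, Q=Q, rng=np.random.default_rng(1))
print("fraction of corrupted pixels:", np.mean(f != g))   # close to 0.3
```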
Moreover, the a priori probability distribution that the original image is $f$ is assumed to be
\[
\Pr\{F=f\} = \Pr\{F=f \mid \alpha\} = \frac{1}{Z_{\mathrm{PR}}(\alpha)}\exp\Bigl(-\alpha\sum_{\{i,j\}\in B}U(f_i,f_j)\Bigr), \tag{3}
\]
where $B$ is the set of all nearest-neighbor pairs of pixels on the square lattice $\Omega$ and $Z_{\mathrm{PR}}(\alpha)$ is a normalization constant. By substituting Eqs. (2) and (3) into Eq. (1), we obtain
\[
\Pr\{F=f \mid G=g, \alpha, p\} = \frac{1}{Z_{\mathrm{PO}}(\alpha,p)}\prod_{i\in\Omega}\Bigl[\,p\,\bigl(1-\delta_{f_i,g_i}\bigr)+\bigl(1-(Q-1)p\bigr)\,\delta_{f_i,g_i}\Bigr]\exp\Bigl(-\alpha\sum_{\{i,j\}\in B}U(f_i,f_j)\Bigr), \tag{4}
\]
where $Z_{\mathrm{PO}}(\alpha,p)$ is a normalization constant. In the maximum marginal likelihood estimation approach, the hyperparameters $\alpha$ and $p$ are determined so as to maximize the marginal likelihood $\Pr\{G=g \mid \alpha, p\}$, where
\[
\Pr\{G=g \mid \alpha, p\} \equiv \sum_{z}\Pr\{G=g \mid F=z, p\}\,\Pr\{F=z \mid \alpha\}. \tag{5}
\]
We denote the maximizer of the marginal likelihood $\Pr\{G=g \mid \alpha,p\}$ by $(\hat\alpha, \hat p)$. Thus
\[
(\hat\alpha, \hat p) = \arg\max_{\alpha,p}\Pr\{G=g \mid \alpha, p\}. \tag{6}
\]
The marginal likelihood $\Pr\{G=g \mid \alpha,p\}$ is expressed in terms of the free energies $\mathcal{F}_{\mathrm{PO}}(\alpha,p) \equiv -\ln Z_{\mathrm{PO}}(\alpha,p)$ and $\mathcal{F}_{\mathrm{PR}}(\alpha) \equiv -\ln Z_{\mathrm{PR}}(\alpha)$ as follows:
\[
\ln\Pr\{G=g \mid \alpha,p\} = \ln Z_{\mathrm{PO}}(\alpha,p) - \ln Z_{\mathrm{PR}}(\alpha) = -\mathcal{F}_{\mathrm{PO}}(\alpha,p) + \mathcal{F}_{\mathrm{PR}}(\alpha). \tag{7}
\]
Given the estimates $\hat\alpha$ and $\hat p$, the restored image $\hat f = \{\hat f_i\}$ is determined by
\[
\hat f_i = \arg\max_{f_i\in\mathcal{Q}}\Pr\{F_i = f_i \mid G=g, \hat\alpha, \hat p\}. \tag{8}
\]
Here $\Pr\{F_i=f_i \mid G=g,\alpha,p\}$ is the marginal probability defined by
\[
\Pr\{F_i=f_i \mid G=g,\alpha,p\} \equiv \sum_{z}\delta_{f_i,z_i}\Pr\{F=z \mid G=g,\alpha,p\}. \tag{9}
\]
This way of producing a restored image is called maximum posterior marginal estimation.
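The estimation scheme of Eqs. (6)-(8) can be summarized in a few lines of Python. The sketch below is ours; it assumes a hypothetical helper `free_energies(alpha, p)` that returns approximate values of $(\mathcal{F}_{\mathrm{PO}}(\alpha,p), \mathcal{F}_{\mathrm{PR}}(\alpha))$, for instance from the loopy belief propagation described in the next section, and it replaces the numerical search by a crude grid search.

```python
import numpy as np

def log_marginal_likelihood(F_PO, F_PR):
    """Eq. (7): ln Pr{G = g | alpha, p} = -F_PO(alpha, p) + F_PR(alpha)."""
    return -F_PO + F_PR

def estimate_hyperparameters(alphas, ps, free_energies):
    """Grid search for (alpha_hat, p_hat) of Eq. (6); `free_energies` is an
    assumed helper returning the pair (F_PO(alpha, p), F_PR(alpha))."""
    return max(((a, p) for a in alphas for p in ps),
               key=lambda ap: log_marginal_likelihood(*free_energies(*ap)))

def mpm_restore(marginals):
    """Maximum posterior marginal estimate, Eq. (8): marginals[..., q]
    approximates Pr{F_i = q | G = g, alpha_hat, p_hat}; the restored pixel
    is the gray level with the largest marginal probability."""
    return np.argmax(marginals, axis=-1)
```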

3 Loopy Belief Propagation

In the framework given in the previous section, we have to calculate the marginal probability distributions $\Pr\{F_i=f_i \mid G=g,\alpha,p\}$ and the free energies $\mathcal{F}_{\mathrm{PO}}(\alpha,p)$ and $\mathcal{F}_{\mathrm{PR}}(\alpha)$. Since it is hard to calculate the marginal probability distributions and the free energies exactly, we apply loopy belief propagation to the probabilistic models $\Pr\{F=f \mid G=g,\alpha,p\}$ and $\Pr\{F=f \mid \alpha\}$. Next we summarize the simultaneous equations to be satisfied by the marginal probability distributions, namely
\[
P_i(f_i) \equiv \sum_{z}\delta_{f_i,z_i}\,P(z), \tag{10}
\]
in loopy belief propagation for a probability distribution $P(f)$ of the form
\[
P(f) = \frac{\prod_{i\in\Omega}\psi_i(f_i)\prod_{\{i,j\}\in B}\phi_{ij}(f_i,f_j)}{\sum_{z}\prod_{i\in\Omega}\psi_i(z_i)\prod_{\{i,j\}\in B}\phi_{ij}(z_i,z_j)}. \tag{11}
\]
Here $\phi_{ij}(\xi,\xi')$ and $\psi_i(\xi)$ are always positive for any values of $\xi\in\mathcal{Q}$ and $\xi'\in\mathcal{Q}$. The detailed derivation is similar to that given in [4, 14, 15], and here we merely state the results of the variational calculation for an approximate free energy in loopy belief propagation. (In statistical mechanics, loopy belief propagation has often been referred to as the Bethe approximation [16] or the cluster variation method [17, 18, 19].) The probability distribution is rewritten as
\[
P(f) = \frac{1}{Z}\prod_{i\in\Omega}\Phi_i(f_i)\prod_{\{i,j\}\in B}\frac{\Phi_{ij}(f_i,f_j)}{\Phi_i(f_i)\,\Phi_j(f_j)},
\qquad
Z \equiv \sum_{z}\prod_{i\in\Omega}\Phi_i(z_i)\prod_{\{i,j\}\in B}\frac{\Phi_{ij}(z_i,z_j)}{\Phi_i(z_i)\,\Phi_j(z_j)}, \tag{12}
\]
where
\[
\Phi_i(f_i) \equiv \frac{\psi_i(f_i)}{\sum_{\zeta}\psi_i(\zeta)},
\qquad
\Phi_{ij}(f_i,f_j) \equiv \frac{\psi_i(f_i)\,\phi_{ij}(f_i,f_j)\,\psi_j(f_j)}{\sum_{\zeta}\sum_{\zeta'}\psi_i(\zeta)\,\phi_{ij}(\zeta,\zeta')\,\psi_j(\zeta')}. \tag{13}
\]
Now we introduce the Kullback-Leibler divergence $D[Q\|P]$ between a trial probability distribution $Q(f)$ and the given probability distribution $P(f)$:
\[
D[Q\|P] \equiv \sum_{z}Q(z)\ln\frac{Q(z)}{P(z)} = \sum_{z}Q(z)\bigl[\ln Q(z) - \ln P(z)\bigr]. \tag{14}
\]
By using the normalization
\[
\sum_{z}Q(z) = 1, \tag{15}
\]
the Kullback-Leibler divergence $D[Q\|P]$ can be rewritten as
\[
D[Q\|P] = \mathcal{F}[Q] + \ln Z, \tag{16}
\]
where
\[
\mathcal{F}[Q] \equiv \sum_{z}Q(z)\Bigl[\ln Q(z) - \sum_{i\in\Omega}\ln\Phi_i(z_i) - \sum_{\{i,j\}\in B}\bigl(\ln\Phi_{ij}(z_i,z_j) - \ln\Phi_i(z_i) - \ln\Phi_j(z_j)\bigr)\Bigr]. \tag{17}
\]
The Kullback-Leibler divergence $D[Q\|P]$ is equal to zero if $Q(f) = P(f)$. Moreover, it holds that $\mathcal{F}[P] = -\ln Z$. In statistical mechanics, $\mathcal{F}[P] = -\ln Z$ is referred to as the free energy of the probabilistic model given in Eq. (12). We consider a probability distribution $Q(f)$ expressed in the factorized form
\[
Q(f) = \prod_{i\in\Omega}Q_i(f_i)\prod_{\{i,j\}\in B}\frac{Q_{ij}(f_i,f_j)}{Q_i(f_i)\,Q_j(f_j)}, \tag{18}
\]
where
\[
Q_i(f_i) \equiv \sum_{z}\delta_{f_i,z_i}\,Q(z),
\qquad
Q_{ij}(f_i,f_j) \equiv \sum_{z}\delta_{f_i,z_i}\,\delta_{f_j,z_j}\,Q(z). \tag{19}
\]
We remark that $Q_{ij}(f_i,f_j) = Q_{ji}(f_j,f_i)$. By substituting Eq. (18) into the factor $\ln Q(z)$ in Eq. (16) with Eq. (17), the Kullback-Leibler divergence $D[Q\|P]$ can be written as
\[
D[Q\|P] = \sum_{z}Q(z)\Bigl[\sum_{i\in\Omega}\ln Q_i(z_i) + \sum_{\{i,j\}\in B}\bigl(\ln Q_{ij}(z_i,z_j) - \ln Q_i(z_i) - \ln Q_j(z_j)\bigr)
- \sum_{i\in\Omega}\ln\Phi_i(z_i) - \sum_{\{i,j\}\in B}\bigl(\ln\Phi_{ij}(z_i,z_j) - \ln\Phi_i(z_i) - \ln\Phi_j(z_j)\bigr)\Bigr] + \ln Z. \tag{20}
\]
By applying the normalization conditions
\[
\sum_{\zeta}Q_i(\zeta) = 1, \qquad \sum_{\zeta}\sum_{\zeta'}Q_{ij}(\zeta,\zeta') = 1, \tag{21}
\]
and the definitions (19) of the marginal probabilities of $Q(f)$, the expression (20) can be rewritten as
\[
D[Q\|P] = \sum_{i\in\Omega}\sum_{\zeta}Q_i(\zeta)\ln\frac{Q_i(\zeta)}{\Phi_i(\zeta)}
+ \sum_{\{i,j\}\in B}\Bigl[\sum_{\zeta}\sum_{\zeta'}Q_{ij}(\zeta,\zeta')\ln\frac{Q_{ij}(\zeta,\zeta')}{\Phi_{ij}(\zeta,\zeta')}
- \sum_{\zeta}Q_i(\zeta)\ln\frac{Q_i(\zeta)}{\Phi_i(\zeta)}
- \sum_{\zeta}Q_j(\zeta)\ln\frac{Q_j(\zeta)}{\Phi_j(\zeta)}\Bigr] + \ln Z. \tag{22}
\]
Hence, the Kullback-Leibler divergence $D[Q\|P]$ is expressed only in terms of the marginal probabilities $\{Q_i, Q_{ij}\}$:
\[
D[Q\|P] = \mathcal{F}\bigl[\{Q_i, Q_{ij}\}\bigr] + \ln Z, \tag{23}
\]
where
\[
\mathcal{F}\bigl[\{Q_\gamma \mid \gamma\in\Omega\cup B\}\bigr] \equiv \sum_{i\in\Omega}D[Q_i\|\Phi_i]
+ \sum_{\{i,j\}\in B}\bigl(D[Q_{ij}\|\Phi_{ij}] - D[Q_i\|\Phi_i] - D[Q_j\|\Phi_j]\bigr), \tag{24}
\]
and
\[
D[Q_i\|\Phi_i] \equiv \sum_{\zeta}Q_i(\zeta)\ln\frac{Q_i(\zeta)}{\Phi_i(\zeta)},
\qquad
D[Q_{ij}\|\Phi_{ij}] \equiv \sum_{\zeta}\sum_{\zeta'}Q_{ij}(\zeta,\zeta')\ln\frac{Q_{ij}(\zeta,\zeta')}{\Phi_{ij}(\zeta,\zeta')}. \tag{25}
\]
From the definitions of the marginals $\{Q_i, Q_{ij}\}$, the following conditions hold:
\[
Q_i(\zeta) = \sum_{\zeta'}Q_{ij}(\zeta,\zeta'), \qquad j\in c_i, \tag{26}
\]
where $c_i \equiv \{j \mid \{i,j\}\in B\}$ is the set of all nearest-neighbor pixels of $i$. These conditions are often referred to as reducibility conditions in the cluster variation method. From the standpoint of the cluster variation method [4], the marginal probabilities of $P(f)$ are approximately determined so as to minimize the right-hand side of Eq. (23) under the constraint conditions (21) and (26):
\[
\{P_i, P_{ij}\} \simeq \{\tilde Q_i, \tilde Q_{ij}\}
= \arg\min_{\{Q_i,Q_{ij}\}}\Bigl\{ D[Q\|P] \;\Big|\; Q_i(\zeta)=\sum_{\zeta'}Q_{ij}(\zeta,\zeta'),\ j\in c_i;\ \sum_{\zeta}Q_i(\zeta)=1;\ \sum_{\zeta}\sum_{\zeta'}Q_{ij}(\zeta,\zeta')=1 \Bigr\}
= \arg\min_{\{Q_i,Q_{ij}\}}\Bigl\{ \mathcal{F}\bigl[\{Q_\gamma \mid \gamma\in\Omega\cup B\}\bigr] \;\Big|\; Q_i(\zeta)=\sum_{\zeta'}Q_{ij}(\zeta,\zeta'),\ j\in c_i;\ \sum_{\zeta}Q_i(\zeta)=1;\ \sum_{\zeta}\sum_{\zeta'}Q_{ij}(\zeta,\zeta')=1 \Bigr\}. \tag{27}
\]
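For readers who prefer code to formulas, the functional (24)-(25), which is the objective minimized in Eq. (27), can be evaluated directly from a set of trial marginals. The sketch below is ours; the data layout (dictionaries keyed by pixels and by nearest-neighbor pairs) is an arbitrary illustrative choice.

```python
import numpy as np

def kl(q, phi):
    """Kullback-Leibler divergence D[q || phi] of Eq. (25); both arguments
    are probability vectors (or matrices) with matching shapes."""
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / phi[mask])))

def bethe_functional(Q_i, Q_ij, Phi_i, Phi_ij, edges):
    """Approximate free energy functional F[{Q_i, Q_ij}] of Eq. (24).
    Q_i, Phi_i   : dict  pixel i -> length-Q probability vector
    Q_ij, Phi_ij : dict  pair (i, j) -> Q x Q joint probability matrix
    edges        : list of nearest-neighbor pairs (i, j), i.e. the set B."""
    F = sum(kl(Q_i[i], Phi_i[i]) for i in Q_i)
    for (i, j) in edges:
        F += kl(Q_ij[(i, j)], Phi_ij[(i, j)]) \
             - kl(Q_i[i], Phi_i[i]) - kl(Q_i[j], Phi_i[j])
    return F
```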

By introducing Lagrange multipliers to enforce the constraint conditions (21) and (26),
\[
\mathcal{L}\bigl[\{Q_\gamma \mid \gamma\in\Omega\cup B\}\bigr] \equiv \mathcal{F}\bigl[\{Q_\gamma \mid \gamma\in\Omega\cup B\}\bigr]
- \sum_{i\in\Omega}\sum_{j\in c_i}\sum_{\zeta}\lambda_{ij,i}(\zeta)\Bigl(Q_i(\zeta) - \sum_{\zeta'}Q_{ij}(\zeta,\zeta')\Bigr)
- \sum_{i\in\Omega}\lambda_i\Bigl(\sum_{\zeta}Q_i(\zeta)-1\Bigr)
- \sum_{\{i,j\}\in B}\lambda_{ij}\Bigl(\sum_{\zeta}\sum_{\zeta'}Q_{ij}(\zeta,\zeta')-1\Bigr), \tag{28}
\]
and by taking the first variation of $\mathcal{L}\bigl[\{Q_\gamma \mid \gamma\in\Omega\cup B\}\bigr]$ (after a suitable redefinition of the Lagrange multipliers), the marginal probabilities $\{\tilde Q_i(\xi) \mid \xi\in\mathcal{Q}\}$ and $\{\tilde Q_{ij}(\xi,\xi') \mid \xi\in\mathcal{Q},\ \xi'\in\mathcal{Q}\}$ take the form
\[
\tilde Q_i(\xi) = \frac{\psi_i(\xi)^{-1/3}\exp\bigl(\tfrac{1}{3}\sum_{k\in c_i}\lambda_{ik,i}(\xi)\bigr)}{\sum_{\zeta}\psi_i(\zeta)^{-1/3}\exp\bigl(\tfrac{1}{3}\sum_{k\in c_i}\lambda_{ik,i}(\zeta)\bigr)}, \tag{29}
\]
\[
\tilde Q_{ij}(\xi,\xi') = \frac{\phi_{ij}(\xi,\xi')\exp\bigl(\lambda_{ij,i}(\xi)+\lambda_{ij,j}(\xi')\bigr)}{\sum_{\zeta}\sum_{\zeta'}\phi_{ij}(\zeta,\zeta')\exp\bigl(\lambda_{ij,i}(\zeta)+\lambda_{ij,j}(\zeta')\bigr)}. \tag{30}
\]
The Lagrange multipliers $\{\lambda_{ij,i}, \lambda_{ij,j}\}$ are determined so as to satisfy the reducibility conditions (26) together with Eqs. (29) and (30). The approximate value of the free energy $\mathcal{F}[P]$ is then given by
\[
\mathcal{F}[P] \simeq \mathcal{F}\bigl[\{\tilde Q_\gamma \mid \gamma\in\Omega\cup B\}\bigr]. \tag{31}
\]
By replacing the Lagrange multipliers by messages $\mu_{k\to i}(\xi)$ through
\[
\exp\bigl(\lambda_{ij,i}(\xi)\bigr) = \psi_i(\xi)\prod_{k\in c_i\setminus j}\mu_{k\to i}(\xi)
\]
in Eqs. (29) and (30), and by substituting these into Eqs. (26) and (31), we obtain the simultaneous equations for the marginal probabilities $\{P_i(\xi) \mid \xi\in\mathcal{Q}\}$ and the free energy $\mathcal{F} \equiv -\ln Z$ as follows:
\[
P_i(\xi) \simeq \frac{\psi_i(\xi)\prod_{k\in c_i}\mu_{k\to i}(\xi)}{\sum_{\zeta}\psi_i(\zeta)\prod_{k\in c_i}\mu_{k\to i}(\zeta)}, \tag{32}
\]
\[
\mathcal{F}[P] \simeq -\ln Z \simeq 3\sum_{i\in\Omega}\ln\Bigl[\sum_{\zeta}\psi_i(\zeta)\prod_{k\in c_i}\mu_{k\to i}(\zeta)\Bigr]
- \sum_{\{i,j\}\in B}\ln\Bigl[\sum_{\zeta}\sum_{\zeta'}\psi_i(\zeta)\,\phi_{ij}(\zeta,\zeta')\,\psi_j(\zeta')
\prod_{k\in c_i\setminus j}\mu_{k\to i}(\zeta)\prod_{l\in c_j\setminus i}\mu_{l\to j}(\zeta')\Bigr], \tag{33}
\]
where
\[
\mu_{j\to i}(\xi) = \frac{\sum_{\zeta}\phi_{ij}(\xi,\zeta)\,\psi_j(\zeta)\prod_{k\in c_j\setminus i}\mu_{k\to j}(\zeta)}{\sum_{\zeta}\sum_{\zeta'}\phi_{ij}(\zeta,\zeta')\,\psi_j(\zeta')\prod_{k\in c_j\setminus i}\mu_{k\to j}(\zeta')}. \tag{34}
\]
By setting $\psi_i(\xi) = p\,(1-\delta_{\xi,g_i}) + (1-(Q-1)p)\,\delta_{\xi,g_i}$ and $\phi_{ij}(\xi,\xi') = \exp(-\alpha U(\xi,\xi'))$, we obtain the marginal probability $\Pr\{F_i=\xi \mid G=g,\alpha,p\}$ and the free energy $\mathcal{F}_{\mathrm{PO}}(\alpha,p)$ as $P_i(\xi)$ and $\mathcal{F}$, respectively. By setting $\psi_i(\xi) = 1$ and $\phi_{ij}(\xi,\xi') = \exp(-\alpha U(\xi,\xi'))$, we obtain the free energy $\mathcal{F}_{\mathrm{PR}}(\alpha)$ as $\mathcal{F}$. Though these forms may not be so familiar to some physicists, $\ln\mu_{i\to j}(\xi)$ corresponds to the effective field in the conventional Bethe approximation. In probabilistic inference, the quantity $\mu_{i\to j}(\xi)$ is called a message propagated from pixel $i$ to pixel $j$. Equations (34) have the form of fixed-point equations for the messages $\mu_{i\to j}(\xi)$, and in practical numerical calculations we solve them by iterative methods. For various values of the hyperparameters $\alpha$ and $p$, we compute the marginal probability distributions $\Pr\{F_i=f_i \mid G=g,\alpha,p\}$ and the free energies $\mathcal{F}_{\mathrm{PO}}(\alpha,p)$ and $\mathcal{F}_{\mathrm{PR}}(\alpha)$, and we search numerically for the set of values $(\hat\alpha, \hat p)$ that maximizes the marginal likelihood $\Pr\{G=g \mid \alpha,p\}$.
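A minimal implementation of the fixed-point iteration (34), together with the marginals (32) and the approximate free energy (33), might look as follows. This is our own sketch, not the authors' code: the lattice is passed as a neighbor dictionary, one symmetric matrix `phi` is shared by all nearest-neighbor pairs, and boundary pixels are handled by using the local value $|c_i|-1$ in place of the factor 3 in Eq. (33), which is an assumption on our part.

```python
import numpy as np

def loopy_bp(psi, phi, neighbors, n_iter=100, tol=1e-6):
    """Loopy belief propagation for P(f) of Eq. (11).
    psi       : dict  pixel i -> length-Q array psi_i
    phi       : (Q, Q) array, the pairwise factor phi_ij (assumed symmetric)
    neighbors : dict  pixel i -> list of nearest neighbors c_i
    Returns the one-pixel marginals of Eq. (32) and the free energy of Eq. (33)."""
    Q = phi.shape[0]
    mu = {(j, i): np.ones(Q) / Q for i in neighbors for j in neighbors[i]}
    for _ in range(n_iter):
        diff = 0.0
        for (j, i) in list(mu):
            # Eq. (34): message from j to i
            prod = psi[j].copy()
            for k in neighbors[j]:
                if k != i:
                    prod *= mu[(k, j)]
            new = phi @ prod
            new /= new.sum()
            diff = max(diff, np.abs(new - mu[(j, i)]).max())
            mu[(j, i)] = new
        if diff < tol:
            break
    # Eq. (32): one-pixel marginals
    P = {}
    for i in neighbors:
        b = psi[i] * np.prod([mu[(k, i)] for k in neighbors[i]], axis=0)
        P[i] = b / b.sum()
    # Eq. (33): approximate free energy (|c_i| - 1 instead of 3 at boundaries)
    F = 0.0
    for i in neighbors:
        Zi = (psi[i] * np.prod([mu[(k, i)] for k in neighbors[i]], axis=0)).sum()
        F += (len(neighbors[i]) - 1) * np.log(Zi)
    seen = set()
    for i in neighbors:
        for j in neighbors[i]:
            if (j, i) in seen:
                continue
            seen.add((i, j))
            mi = psi[i] * np.prod([mu[(k, i)] for k in neighbors[i] if k != j], axis=0)
            mj = psi[j] * np.prod([mu[(k, j)] for k in neighbors[j] if k != i], axis=0)
            F -= np.log(mi @ phi @ mj)
    return P, F
```

To recover the quantities of Sec. 2 from this sketch, one would call it with psi_i(xi) = p(1-delta_{xi,g_i}) + (1-(Q-1)p) delta_{xi,g_i} and phi = exp(-alpha U) for the a posteriori model, and with psi_i identically 1 for the a priori model, exactly as stated above.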

4 Numerical Experiments

In this section, we report some numerical experiments. The optimal values of the hyperparameters, $(\hat\alpha, \hat p)$, are determined by means of maximum marginal likelihood estimation and the loopy belief propagation described in Secs. 2 and 3. To evaluate the restoration performance quantitatively, fifteen original images $f$ are simulated from the a priori probability distribution (3) for the Q-state Potts model and for the Q-Ising model, respectively. We produce a degraded image $g$ from each original image $f$ by means of the degradation process (2) with $(Q-1)p = 0.3$. By applying the iterative loopy belief propagation algorithm to each degraded image $g$, we obtain estimates of the hyperparameters $p$ and $\alpha$ and the restored image $\hat f$ for each degraded image $g$. For $Q = 4$, one of the numerical experiments for each model is shown in Figs. 1 and 2, respectively. In order to show that the maximization of the marginal likelihood is actually achieved, Figs. 3 and 4 also give the $p$- and $\alpha$-dependences of $\ln\Pr\{G=g \mid \alpha,p\}$ corresponding to the image restorations shown in Figs. 1 and 2, respectively.

Figure 1: Image restoration based on the Q-state Potts model ($Q = 4$). The original image $f$ is generated from the a priori probability distribution (3). (a) Original image $f$ ($\alpha = 1.25$). (b) Degraded image $g$ ($(Q-1)p = 0.3$). (c) Restored image $\hat f$ obtained by loopy belief propagation ($(Q-1)\hat p = 0.2246$, $\hat\alpha = 1.0050$, $d(f,\hat f) = 0.557$, $\Delta\mathrm{SNR} = \ldots$ dB).

Figure 2: Image restoration based on the Q-Ising model ($Q = 4$). The original image $f$ is generated from the a priori probability distribution (3). (a) Original image $f$ ($\alpha = \ldots$). (b) Degraded image $g$ ($(Q-1)p = 0.3$). (c) Restored image $\hat f$ obtained by loopy belief propagation ($(Q-1)\hat p = 0.275$, $\hat\alpha = 0.6202$, $d(f,\hat f) = 0.702$, $\Delta\mathrm{SNR} = 4.647$ dB).

From the fifteen degraded images $g$ and the corresponding restored images $\hat f$, we calculate the 95% confidence intervals for the hyperparameters $\hat p$ and $\hat\alpha$, for the Hamming distance $d(f,\hat f)$,
\[
d(f,\hat f) \equiv \frac{1}{|\Omega|}\sum_{i\in\Omega}\bigl(1-\delta_{f_i,\hat f_i}\bigr), \tag{35}
\]
and for the improvement in the signal-to-noise ratio, $\Delta\mathrm{SNR}$,
\[
\Delta\mathrm{SNR} \equiv 10\log_{10}\frac{\sum_{i\in\Omega}(f_i-g_i)^2}{\sum_{i\in\Omega}(f_i-\hat f_i)^2}\ \ \mathrm{dB}. \tag{36}
\]
These results showed that combining loopy belief propagation with maximum marginal likelihood estimation gives good hyperparameter estimates for the Q-state Potts model and for the Q-Ising model. Although the Q-state Potts model has a first-order phase transition, we can estimate the hyperparameters $\alpha$ and $p$ by maximizing the marginal likelihood.

We then performed numerical experiments based on the artificial images in Figs. 5, 6 and 7. The artificial images $f$ in Figs. 5(a), 6(a) and 7(a) are generated from the 256-valued standard images Mandrill, Home and Girl by thresholding. Degraded images $g$, produced from the original images $f$ with $(Q-1)p = 0.3$, are shown in Figs. 5(b), 6(b) and 7(b). The image restorations obtained by the iterative loopy belief propagation algorithm for the Q-state Potts model are shown in Figs. 5(c), 6(c) and 7(c).
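The two performance measures (35) and (36) are straightforward to compute; the following sketch (ours) assumes the images are stored as integer NumPy arrays of identical shape.

```python
import numpy as np

def hamming_distance(f, f_hat):
    """Eq. (35): fraction of pixels at which f_hat differs from f."""
    return float(np.mean(np.asarray(f) != np.asarray(f_hat)))

def delta_snr(f, g, f_hat):
    """Eq. (36): improvement in the signal-to-noise ratio (in dB) of the
    restored image f_hat over the degraded image g."""
    f, g, f_hat = (np.asarray(a, dtype=float) for a in (f, g, f_hat))
    return float(10.0 * np.log10(np.sum((f - g) ** 2) / np.sum((f - f_hat) ** 2)))
```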

Figure 3: Maximization of the marginal likelihood $\Pr\{G=g \mid \alpha,p\}$ with respect to $p$ and $\alpha$, corresponding to the image restoration shown in Fig. 1. (a) $p$-dependence of the logarithm of the marginal likelihood per pixel, $\frac{1}{|\Omega|}\ln\Pr\{G=g \mid \alpha,p\}$, at fixed $\alpha$. (b) $\alpha$-dependence of the logarithm of the marginal likelihood per pixel, $\frac{1}{|\Omega|}\ln\Pr\{G=g \mid \alpha,p\}$, at fixed $(Q-1)p$.

Figure 4: Maximization of the marginal likelihood $\Pr\{G=g \mid \alpha,p\}$ with respect to $p$ and $\alpha$, corresponding to the image restoration shown in Fig. 2. (a) $p$-dependence of the logarithm of the marginal likelihood per pixel, $\frac{1}{|\Omega|}\ln\Pr\{G=g \mid \alpha,p\}$, at fixed $\alpha$. (b) $\alpha$-dependence of the logarithm of the marginal likelihood per pixel, $\frac{1}{|\Omega|}\ln\Pr\{G=g \mid \alpha,p\}$, at fixed $(Q-1)p$.

The image restorations obtained by the iterative loopy belief propagation algorithm for the Q-Ising model are shown in Figs. 5(d), 6(d) and 7(d). The image Girl has many spatially smooth parts, whereas the image Home has many spatially flat parts and the image Mandrill has many spatially changing parts. Nevertheless, the Q-Ising model gives us better results than the Q-state Potts model.

5 Concluding Remarks

In the present paper, we investigated the loopy belief propagation algorithm for probabilistic image restoration when the Q-state Potts model or the Q-Ising model is adopted as the a priori probability distribution. Our algorithm determines the hyperparameters so as to maximize the marginal likelihood for each degraded image. We are now extending these investigations to a more general loopy belief propagation algorithm based on the cluster variation method [8, 9, 17, 18, 19]. Although the expectation-maximization (EM) algorithm is usually applied to maximize the marginal likelihood, here we maximize it by calculating an approximate value of the marginal likelihood directly. It remains important to construct an approximate algorithm for the maximization of the marginal likelihood by combining the expectation-maximization algorithm [13] with the Bethe approximation or the cluster variation method. On the other hand, we are now considering the application of the probabilistic image processing algorithm to object segmentation for gesture recognition systems [20]. Moreover, we have applied loopy belief propagation not only to image processing but also to probabilistic inference, and have extended it to an algorithm for calculating correlations between any pair of nodes by combining the cluster variation method with linear response theory [21].

Figure 5: Image restoration for an artificial 4-valued image $f$ generated from the 256-valued standard image Mandrill by thresholding ($Q = 4$). (a) Original image $f$. (b) Degraded image $g$ ($(Q-1)p = 0.3$). (c) Restored image $\hat f$ by the Q-state Potts model ($(Q-1)\hat p = \ldots$, $\hat\alpha = 1.0050$, $d(f,\hat f) = \ldots$, $\Delta\mathrm{SNR} = \ldots$ dB). (d) Restored image $\hat f$ by the Q-Ising model ($(Q-1)\hat p = \ldots$, $\hat\alpha = 0.6236$, $d(f,\hat f) = 0.2406$, $\Delta\mathrm{SNR} = 4.08$ dB).

Figure 6: Image restoration for an artificial 4-valued image $f$ generated from the 256-valued standard image Home by thresholding ($Q = 4$). (a) Original image $f$. (b) Degraded image $g$ ($(Q-1)p = 0.3$). (c) Restored image $\hat f$ by the Q-state Potts model ($(Q-1)\hat p = 0.28$, $\hat\alpha = 1.0050$, $d(f,\hat f) = 0.500$, $\Delta\mathrm{SNR} = \ldots$ dB). (d) Restored image $\hat f$ by the Q-Ising model ($(Q-1)\hat p = 0.2827$, $\hat\alpha = 0.7274$, $d(f,\hat f) = 0.249$, $\Delta\mathrm{SNR} = \ldots$ dB).

Acknowledgements

The author is grateful to Professor T. Horiguchi of Tohoku University for valuable discussions. This work was partly supported by a Grant-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

References

[1] D. Geman, Random Fields and Inverse Problems in Imaging, Lecture Notes in Mathematics, no. 1427, pp. 113-193, Springer-Verlag, New York, 1990.

[2] R. Chellappa and A. Jain (eds.), Markov Random Fields: Theory and Applications, Academic Press, New York, 1993.

[3] S. Z. Li, Markov Random Field Modeling in Computer Vision, Springer-Verlag, Tokyo, 1995.

[4] K. Tanaka, Statistical-mechanical approach to image processing, J. Phys. A: Math. Gen., vol. 35, no. 37, pp. R81-R150, 2002.

[5] H. Nishimori, Statistical Physics of Spin Glasses and Information Processing: An Introduction, Oxford University Press, Oxford, 2001.

[6] H. Nishimori and K. Y. M. Wong, Statistical mechanics of image restoration and error-correcting codes, Phys. Rev. E, vol. 60, no. 1, pp. 132-144, 1999.

Figure 7: Image restoration for an artificial 4-valued image $f$ generated from the 256-valued standard image Girl by thresholding ($Q = 4$). (a) Original image $f$. (b) Degraded image $g$ ($(Q-1)p = 0.3$). (c) Restored image $\hat f$ by the Q-state Potts model ($(Q-1)\hat p = 0.222$, $\hat\alpha = 1.03$, $d(f,\hat f) = 0.290$, $\Delta\mathrm{SNR} = \ldots$ dB). (d) Restored image $\hat f$ by the Q-Ising model ($(Q-1)\hat p = 0.2805$, $\hat\alpha = 0.7880$, $d(f,\hat f) = 0.098$, $\Delta\mathrm{SNR} = \ldots$ dB).

[7] M. Opper and D. Saad (eds.), Advanced Mean Field Methods: Theory and Practice, MIT Press, 2001.

[8] J. S. Yedidia, W. T. Freeman and Y. Weiss, Generalized belief propagation, in T. Leen, T. Dietterich and V. Tresp (eds.), Advances in Neural Information Processing Systems 13, MIT Press, 2001.

[9] J. S. Yedidia, W. T. Freeman and Y. Weiss, Constructing free energy approximations and generalized belief propagation algorithms, Mitsubishi Electric Research Laboratories Technical Report, August 2002.

[10] K. Tanaka and T. Morita, Cluster variation method and image restoration problem, Physics Letters A, vol. 203, no. 2-3, pp. 122-128, 1995.

[11] Y. Weiss and W. T. Freeman, Correctness of belief propagation in Gaussian graphical models of arbitrary topology, Neural Computation, vol. 13, no. 10, 2001.

[12] K. Tanaka and J. Inoue, Maximum likelihood hyperparameter estimation for solvable Markov random field model in image restoration, IEICE Transactions on Information and Systems, vol. E85-D, no. 3, 2002.

[13] J. Inoue and K. Tanaka, Dynamics of the maximum likelihood hyper-parameter estimation in image restoration: Gradient descent versus expectation and maximization algorithm, Physical Review E, vol. 65, Article No. 016125, 2002.

[14] K. Tanaka, Theoretical study of hyperparameter estimation by maximization of marginal likelihood in image restoration by means of cluster variation method, IEICE Transactions, vol. J83-A, no. 10, pp. 1148-1160, 2000 (in Japanese); translated in Electronics and Communications in Japan, Part 3, vol. 85, no. 7, pp. 50-62, 2002.

[15] K. Tanaka, Maximum marginal likelihood estimation and constrained optimization in image restoration, Transactions of the Japanese Society for Artificial Intelligence, vol. 16, no. 2, 2001.

[16] C. Domb, On the theory of cooperative phenomena in crystals, Advances in Physics, vol. 9, no. 35, 1960.

[17] T. Morita, General structure of the distribution functions for the Heisenberg model and the Ising model, J. Math. Phys., vol. 13, no. 1, pp. 115-123, 1972.

[18] T. Morita, Cluster variation method and Möbius inversion formula, Journal of Statistical Physics, vol. 59, no. 3/4, 1990.

[19] T. Morita, Cluster variation method for non-uniform Ising and Heisenberg models and spin-pair correlation function, Prog. Theor. Phys., vol. 85, no. 2, 1991.

[20] N. Yoshiike and Y. Takefuji, Object segmentation using maximum neural networks for gesture recognition systems, Neurocomputing, vol. 51, 2003.

[21] K. Tanaka, Probabilistic inference by means of cluster variation method and linear response theory, IEICE Transactions on Information and Systems, vol. E86-D, no. 7, 2003, to appear.
