Modeling heterogeneity in random graphs


1 Modeling heterogeneity in random graphs Catherine MATIAS CNRS, Laboratoire Statistique & Génome, Évry (Soon: Laboratoire de Probabilités et Modèles Aléatoires, Paris) cmatias

2 Outline Introduction to random graphs State space models for random graphs Stochastic Block Model (binary or weighted graphs) Parameter estimation and node clustering Identifiability Parameter estimation Clustering convergence results

3 Data: Biological networks Different network types Protein-protein interaction networks (PPI), Metabolic networks, Gene co-expression networks, Gene regulation networks... Some challenges Analyse large and noisy data sets, Identify structures (topological patterns, cliques, node groups, etc.), Model evolution across time, Compare networks between different species,...

4-8 Some models for random graphs Some existing models, advantages and drawbacks
Erdös-Rényi: simple and mathematically well understood, but too homogeneous; edges are i.i.d. $\mathcal{B}(p)$.
Models based on the degree distribution: scale-free property, but only a partial descriptor of the graph, and costly numerical simulations with fixed-degree models; node degrees are fixed and samples are obtained through a re-wiring algorithm.
Generative processes (like preferential attachment): dynamic model, but depends on parameters (initialisation, stopping rule,...); can we characterize the result? Start from a small graph, then add nodes and connect them with higher probability to nodes with large degrees.
Exponential random graphs: natural from a statistical point of view, but big inference issues; $P_\theta(Y = y) = c(\theta)^{-1} \exp(\theta^\top S(y))$.
In this talk, I am interested in modeling heterogeneity, thus I will focus on model-based clustering methods acting on the nodes.
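As a quick illustration of the first and third models above, the following sketch (assuming Python with numpy and networkx available; all parameter values are arbitrary) simulates an Erdös-Rényi graph and a preferential-attachment graph and compares their degree distributions.

```python
# A minimal sketch (numpy + networkx, arbitrary parameters) contrasting the
# homogeneous Erdos-Renyi model with a preferential-attachment model.
import networkx as nx
import numpy as np

n = 1000
er = nx.erdos_renyi_graph(n, p=0.01, seed=0)    # edges i.i.d. B(p)
pa = nx.barabasi_albert_graph(n, m=5, seed=0)   # preferential attachment

for name, g in [("Erdos-Renyi", er), ("Preferential attachment", pa)]:
    deg = np.array([d for _, d in g.degree()])
    # ER degrees concentrate around n*p; PA degrees are heavy-tailed (scale-free).
    print(f"{name}: mean degree = {deg.mean():.1f}, max degree = {deg.max()}")
```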

9 Outline Introduction to random graphs State space models for random graphs Stochastic Block Model (binary or weighted graphs) Parameter estimation and node clustering Identifiability Parameter estimation Clustering convergence results

10 State space models I Observations
Adjacency matrix: $Y = (Y_{ij})_{1 \le i,j \le n}$ characterizing the relations between $n$ distinct individuals or nodes;
Either binary ($Y_{ij} \in \{0, 1\}$) or weighted ($Y_{ij} \in \mathbb{R}^s$);
Directed or undirected ($Y_{ij} = Y_{ji}$), with or without self-loops ($Y_{ii} = 0$).
[Figure: example of an undirected binary graph with no self-loops.]

11 State space models II State space modeling
There exist latent variables $Z_i$ associated to each node $i$;
The $Z_i$'s are assumed to be i.i.d.;
Conditional on the $Z_i$'s, the observations $(Y_{ij})_{1 \le i,j \le n}$ are assumed to be independent;
The conditional distribution of $Y_{ij}$ only depends on $Z_i, Z_j$.
Different cases depending on the latent state space $\mathcal{Z}$
Finite case: $Z_i \in \{1, \dots, K\}$ corresponds to the stochastic block model (SBM);
Continuous case: $Z_i \in \mathbb{R}^k$ or $[0, 1]$.

12 State space models III Common features Network characteristics are summarized through a low dimensional latent space; the classical EM algorithm is not tractable.

13 Likelihood computation (state space r.g. models) I Preliminary comments
As in any state space model, likelihood computation is not tractable, except for very small sample sizes:
$\log P_\theta(Y) = \log \sum_{z} P_\theta(Y, Z = z)$.
One approach is to consider the latent variables as parameters and compute a conditional likelihood; the counterpart is maximisation issues.
The classical solution is the EM algorithm, but it requires that the distribution of $\{Z_i\}$ conditional on $\{Y_{ij}\}$ is easily computed, which is not the case here!
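To make the intractability concrete, here is a brute-force evaluation of $\log \sum_z P_\theta(Y, Z = z)$ for a toy binary SBM, a minimal sketch in plain numpy with hypothetical parameter values: the sum runs over $K^n$ configurations, so it is only feasible for a handful of nodes.

```python
# A minimal sketch (numpy, hypothetical parameters) of the exact SBM log-likelihood
# log P_theta(Y) = log sum_z P_theta(Y, z): the sum has K**n terms.
import itertools
import numpy as np

def sbm_loglik(Y, pi, gamma):
    n, K = Y.shape[0], len(pi)
    total = 0.0
    for z in itertools.product(range(K), repeat=n):   # all K**n configurations
        term = np.prod([pi[z[i]] for i in range(n)])  # prior of the configuration
        for i in range(n):
            for j in range(i + 1, n):
                g = gamma[z[i], z[j]]
                term *= g ** Y[i, j] * (1 - g) ** (1 - Y[i, j])
        total += term
    return np.log(total)

rng = np.random.default_rng(0)
Y = np.triu(rng.integers(0, 2, (6, 6)), 1)
Y = Y + Y.T                                           # toy symmetric binary graph
print(sbm_loglik(Y, pi=np.array([0.5, 0.5]),
                 gamma=np.array([[0.8, 0.1], [0.1, 0.6]])))
```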

14 Likelihood computation (state space r.g. models) II
[Figures: graphical model of $p(\{Z_i\}, \{Y_{ij}\})$, its moralization, and the conditional $p(\{Z_i\} \mid \{Y_{ij}\})$.]
The conditional distribution of the latent variables given the observed ones is not factorized.

15-17 One example: Latent space models [Handcock et al. 07] Model
$Z_i$ i.i.d. vectors in a latent space $\mathbb{R}^k$. Conditional on $\{Z_i\}$, the $\{Y_{ij}\}$ are independent Bernoulli r.v. with
$\operatorname{log-odds}(Y_{ij} = 1 \mid Z_i, Z_j, U_{ij}, \theta) = \theta_0 + \theta_1^\top U_{ij} - \|Z_i - Z_j\|$,
where $\operatorname{log-odds}(A) = \log[P(A)/(1 - P(A))]$, $\{U_{ij}\}$ is a set of covariate vectors and $\theta$ the parameter vector.
Results [Handcock et al. 07] Two-stage maximum likelihood or MCMC procedures are used to infer the model's parameters. Assuming the $Z_i$ are sampled from a mixture of multivariate normals, one may obtain a clustering of the nodes.
Issues No model selection procedure to infer the effective dimension $k$ of the latent space and the number of clusters.
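A minimal simulation sketch of this latent space model (plain numpy; the Euclidean distance, the single scalar covariate and all parameter values are assumptions made for illustration):

```python
# A minimal sketch (numpy, hypothetical parameters) of a latent space graph:
# nodes get i.i.d. positions Z_i in R^k and edges are Bernoulli with
# log-odds theta0 + theta1 * U_ij - ||Z_i - Z_j||.
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 2
theta0, theta1 = 1.0, 0.5
Z = rng.normal(size=(n, k))                                 # latent positions
U = rng.normal(size=(n, n)); U = (U + U.T) / 2              # symmetric scalar covariate

D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)  # pairwise distances
P = 1.0 / (1.0 + np.exp(-(theta0 + theta1 * U - D)))        # edge probabilities

Y = (rng.random((n, n)) < P).astype(int)
Y = np.triu(Y, 1); Y = Y + Y.T                              # undirected, no self-loops
print("edge density:", Y.sum() / (n * (n - 1)))
```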

18 Outline Introduction to random graphs State space models for random graphs Stochastic Block Model (binary or weighted graphs) Parameter estimation and node clustering Identifiability Parameter estimation Clustering convergence results

19 Mixture model approach Idea: probabilistic model-based clustering Assume that the nodes of the graph belong to unobserved groups that describe their connectivity to the other nodes. Advantages Induces heterogeneity in the data while keeping the model simple; a clustering of the nodes is induced by the model; the model encompasses the community detection framework. Motivation/Justification: Szemerédi regularity lemma [Szemerédi 78] Every large enough graph can be divided into subsets of about the same size so that the edges between different subsets behave almost randomly.

20 Stochastic block model (binary graphs)
[Figure: example graph with $n = 10$ nodes and connectivity parameters $\gamma$; e.g. $Z_5 = 1$, $Y_{12} = 1$, $Y_{15} = 0$.]
Binary case (parametric model with $\theta = (\pi, \gamma)$)
$K$ groups (= colors).
$\{Z_i\}_{1 \le i \le n}$ i.i.d. vectors $Z_i = (Z_{i1}, \dots, Z_{iK}) \sim \mathcal{M}(1, \pi)$, with $\pi = (\pi_1, \dots, \pi_K)$ the group proportions. $Z_i$ not observed (latent).
Observations: presence/absence of an edge $\{Y_{ij}\}_{1 \le i < j \le n}$.
Conditional on the $\{Z_i\}$'s, the r.v. $Y_{ij}$ are independent $\mathcal{B}(\gamma_{Z_i Z_j})$.
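A minimal simulation sketch of this binary SBM (plain numpy; the two-group parameter values are arbitrary illustrations, not estimates from any data set):

```python
# A minimal sketch (numpy, hypothetical parameters) simulating a binary SBM:
# latent labels Z_i ~ M(1, pi), then Y_ij | Z independent Bernoulli(gamma[Z_i, Z_j]).
import numpy as np

rng = np.random.default_rng(2)
n = 100
pi = np.array([0.6, 0.4])                      # group proportions
gamma = np.array([[0.25, 0.02],
                  [0.02, 0.30]])               # connectivity matrix

Z = rng.choice(len(pi), size=n, p=pi)          # latent group of each node
P = gamma[Z[:, None], Z[None, :]]              # edge probability of each pair
Y = (rng.random((n, n)) < P).astype(int)
Y = np.triu(Y, 1); Y = Y + Y.T                 # undirected, no self-loops
print("within-group density:", Y[np.ix_(Z == 0, Z == 0)].mean(),
      "between-group density:", Y[np.ix_(Z == 0, Z == 1)].mean())
```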

21 Stochastic block model (weighted graphs)
[Figure: example graph with $n = 10$ nodes and connectivity parameters $\gamma$; e.g. $Z_5 = 1$, $Y_{12} \in \mathbb{R}$, $Y_{15} = 0$.]
Weighted case (parametric model with $\theta = (\pi, \gamma^{(1)}, \gamma^{(2)})$)
Latent variables: idem.
Observations: weights $Y_{ij}$, where $Y_{ij} = 0$ or $Y_{ij} \in \mathbb{R}^s \setminus \{0\}$.
Conditional on the $\{Z_i\}$'s, the random variables $Y_{ij}$ are independent with distribution
$\mu_{Z_i Z_j}(\cdot) = \gamma^{(1)}_{Z_i Z_j} f(\cdot\,, \gamma^{(2)}_{Z_i Z_j}) + (1 - \gamma^{(1)}_{Z_i Z_j})\, \delta_0(\cdot)$.
(Assumption: $f$ has a continuous cdf at zero.)
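The same simulation sketch, adapted to the weighted case; taking $f$ as a unit-variance Gaussian and all parameter values below are assumptions made for illustration:

```python
# A minimal sketch (numpy, hypothetical parameters) of a weighted SBM where an edge
# is present with probability gamma1[Z_i, Z_j] and, if present, carries a weight
# drawn from f(., gamma2[Z_i, Z_j]) taken here as N(gamma2[Z_i, Z_j], 1).
import numpy as np

rng = np.random.default_rng(3)
n = 100
pi = np.array([0.5, 0.5])
gamma1 = np.array([[0.30, 0.05], [0.05, 0.30]])   # sparsity parameters
gamma2 = np.array([[2.0, -1.0], [-1.0, 1.0]])     # means of the weight densities

Z = rng.choice(len(pi), size=n, p=pi)
present = rng.random((n, n)) < gamma1[Z[:, None], Z[None, :]]
weights = rng.normal(loc=gamma2[Z[:, None], Z[None, :]], scale=1.0)
Y = np.where(present, weights, 0.0)
Y = np.triu(Y, 1); Y = Y + Y.T                    # undirected, no self-loops
print("fraction of non-zero weights:", (Y != 0).mean())
```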

22 SBM clustering vs other clusterings SBM clustering The node clustering induced by the model reflects a common connectivity behaviour; many clustering methods instead try to group nodes that belong to the same clique (e.g. community detection).
[Figure: toy example contrasting the SBM clusters with a clustering based on cliques.]

23 Particular cases and generalisations Particular cases
Affiliation model: the connectivity matrix $\gamma$ has only 2 parameters,
$\gamma = \begin{pmatrix} \alpha & \beta & \cdots & \beta \\ \beta & \alpha & \ddots & \vdots \\ \vdots & \ddots & \ddots & \beta \\ \beta & \cdots & \beta & \alpha \end{pmatrix}$
Affiliation with $\alpha > \beta$ = community detection (cliques clustering).
Generalisations
Overlapping groups [Latouche et al. 11, Airoldi et al. 08] for binary graphs;
Adding covariates [Zanghi et al. 10b];
Latent block models (LBM), for array data.

24 From SBM to LBM A graph is encoded through its adjacency matrix. Clustering the nodes corresponds to a simultaneous and identical clustering of the rows and columns. Generalise this to non-square array data, without constraining the row and column groups to be identical. This models bipartite graphs.

25 Latent block models I LBM notation
Observations: array $Y_{n,m} := \{Y_{ij}\}_{1 \le i \le n, 1 \le j \le m}$ with $Y_{ij} \in \mathcal{Y}$;
$K \ge 1$ and $L \ge 1$ the number of row and column groups, respectively;
Group prior distributions $\pi = (\pi_1, \dots, \pi_K)$ over $\mathcal{K} = \{1, \dots, K\}$ and $\rho = (\rho_1, \dots, \rho_L)$ over $\mathcal{L} = \{1, \dots, L\}$, such that $\sum_k \pi_k = \sum_l \rho_l = 1$;
Latent variables $Z^n := (Z_1, \dots, Z_n)$ i.i.d. $\sim \pi$ over $\mathcal{K}$ and $W^m := (W_1, \dots, W_m)$ i.i.d. $\sim \rho$ over $\mathcal{L}$.

26 Latent block models II Two models in the same framework: 2 cases occur
LBM: $\{Z_i\}_{1 \le i \le n}$ and $\{W_j\}_{1 \le j \le m}$ independent.
SBM: $n = m$, $K = L$, $Z_i = W_i$ for all $1 \le i \le n$ and $\pi = \rho$.
Connectivity parameters $\gamma = (\gamma_{kl})_{(k,l) \in \mathcal{K} \times \mathcal{L}}$.
Conditional on $\{Z_i, W_j\}$, the random variables $\{Y_{ij}\}$ are independent, with distribution $Y_{ij} \mid Z_i = k, W_j = l \sim f(\cdot\,; \gamma_{kl})$.
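A minimal simulation sketch of a Bernoulli LBM for a bipartite array (plain numpy; the group counts and parameter values are arbitrary illustrations):

```python
# A minimal sketch (numpy, hypothetical parameters) simulating a Bernoulli LBM:
# row labels Z_i ~ pi, column labels W_j ~ rho, Y_ij | Z, W ~ Bernoulli(gamma[Z_i, W_j]).
import numpy as np

rng = np.random.default_rng(4)
n, m = 80, 120
pi = np.array([0.5, 0.5])                 # K = 2 row groups
rho = np.array([0.3, 0.3, 0.4])           # L = 3 column groups
gamma = np.array([[0.6, 0.1, 0.3],
                  [0.1, 0.7, 0.2]])       # K x L connectivity matrix

Z = rng.choice(len(pi), size=n, p=pi)
W = rng.choice(len(rho), size=m, p=rho)
Y = (rng.random((n, m)) < gamma[Z[:, None], W[None, :]]).astype(int)
print("array density:", Y.mean())
```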

27 Outline Introduction to random graphs State space models for random graphs Stochastic Block Model (binary or weighted graphs) Parameter estimation and node clustering Identifiability Parameter estimation Clustering convergence results

28 Outline Introduction to random graphs State space models for random graphs Stochastic Block Model (binary or weighted graphs) Parameter estimation and node clustering Identifiability Parameter estimation Clustering convergence results

29 Parameter identifiability Problem Obviously, the model may only be identifiable up to a permutation of the group labels. But whether one may uniquely recover the parameter up to label switching is a delicate question. Existing identifiability results Undirected SBM, binary or weighted [Allman et al. 09, Allman et al. 11], Directed and binary SBM [Celisse et al. 12], Overlapping SBM [Latouche et al. 11], Binary LBM [Keribin et al. 13].

30 Outline Introduction to random graphs State space models for random graphs Stochastic Block Model (binary or weighted graphs) Parameter estimation and node clustering Identifiability Parameter estimation Clustering convergence results

31 Parameter estimation I Parameter estimation issue The EM algorithm is not feasible because the latent variables are not independent conditional on the observed ones. Ex (SBM): $P(\{Z_i\}_i \mid \{Y_{ij}\}_{i,j}) \ne \prod_i P(Z_i \mid \{Y_{ij}\}_{i,j})$. Alternatives: Gibbs sampling or variational approximation to EM; composite likelihood approaches for affiliation graphs (binary or weighted) [Ambroise & Matias 10]. About the LBM case Variational methods for binary, Gaussian or Poisson data arrays [Govaert & Nadif 03, Govaert & Nadif 08, Govaert & Nadif 10]; Bayesian framework and Gibbs sampling for binary and Gaussian data [Wyse & Friel 12]; SEM-Gibbs approach (for categorical data) [Keribin et al. 13].

32 Parameter estimation II Model selection The maximum likelihood is not available (thus neither AIC nor BIC); the ICL criterion is used instead [Daudin et al. 08, Keribin et al. 13]; MCMC approach to select the number of LBM groups [Wyse & Friel 12]. Node clustering Automatically performed by the previous algorithms.

33 Variational approximation principle I Likelihood decomposition
$\mathcal{L}_Y(\theta) := \log P(Y; \theta) = \log P(Y, Z; \theta) - \log P(Z \mid Y; \theta)$
and for any distribution $Q$ on $Z$,
$\mathcal{L}_Y(\theta) = \mathbb{E}_Q(\log P(Y, Z; \theta)) + \mathcal{H}(Q) + KL(Q \,\|\, P(Z \mid Y; \theta))$.
EM principle
E-step: maximise the quantity $\mathbb{E}_Q(\log P(Y, Z; \theta^{(t)})) + \mathcal{H}(Q)$ with respect to $Q$. This is equivalent to minimizing $KL(Q \,\|\, P(Z \mid Y; \theta^{(t)}))$ with respect to $Q$.
M-step: keeping now $Q$ fixed, maximize the quantity $\mathbb{E}_Q(\log P(Y, Z; \theta)) + \mathcal{H}(Q)$ with respect to $\theta$ and update the parameter value $\theta^{(t+1)}$ to this maximiser. This is equivalent to maximizing the conditional expectation $\mathbb{E}_Q(\log P(Y, Z; \theta))$ w.r.t. $\theta$.

34 Variational approximation principle II Variational EM
E-step: search for an optimal solution $Q^\star$ within a restricted class of distributions $\mathcal{Q}$, e.g. the class of factorized distributions $Q(Z) = \prod_{i=1}^n Q(Z_i)$:
$Q^\star = \operatorname*{argmin}_{Q \in \mathcal{Q}} KL(Q \,\|\, P(Z \mid Y; \theta^{(t)}))$.
M-step: unchanged, i.e. $\theta^{(t+1)} = \operatorname*{argmax}_\theta \mathbb{E}_{Q^\star}(\log P(Y, Z; \theta))$.
A consequence of $KL \ge 0$ is the lower bound
$\mathcal{L}_Y(\theta) \ge \mathbb{E}_Q(\log P(Y, Z; \theta)) + \mathcal{H}(Q)$,
so that the variational approximation consists in maximizing a lower bound on the log-likelihood. Why does it make sense?

35 Variational approximation in SBM
$\mathcal{Q} = \{Q : Q(Z) = \prod_{i=1}^n Q_i(Z_i)\}$ and $Q_i(Z_i) = \prod_k \tau_{ik}^{Z_{ik}}$;
The minimizer of $KL(Q \,\|\, P(Z \mid Y; \theta^{(t)}))$ in $\mathcal{Q}$ satisfies a fixed point relation
$\tau_{ik} \propto \pi_k \prod_{j \ne i} \prod_{k'} f_{kk'}(Y_{ij})^{\tau_{jk'}}$.
It's a mean field approximation. Once the $\tau_{ik}$ are computed, it's easy to do the M-step (see the sketch below).
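The following is a minimal sketch of the resulting variational EM for a binary SBM (plain numpy, dense O(n²K²) updates with no sparsity tricks; it is an illustrative implementation under these assumptions, not the reference code of the cited papers).

```python
# A minimal sketch (numpy, hypothetical setup) of mean-field variational EM for a
# binary SBM: the E-step iterates tau_ik ∝ pi_k prod_{j!=i, l} f_kl(Y_ij)^tau_jl
# with f_kl the Bernoulli(gamma_kl) density; the M-step re-estimates pi and gamma.
import numpy as np

def vem_sbm(Y, K, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    tau = rng.dirichlet(np.ones(K), size=n)       # variational parameters tau_ik
    mask = 1.0 - np.eye(n)                        # exclude self-pairs
    for _ in range(n_iter):
        # M-step: group proportions and connectivity probabilities
        pi = tau.mean(axis=0)
        gamma = np.clip((tau.T @ (Y * mask) @ tau) / (tau.T @ mask @ tau),
                        1e-6, 1 - 1e-6)
        # E-step: fixed-point update of tau, computed in log scale for stability
        logf = (Y * mask)[:, :, None, None] * np.log(gamma) \
             + ((1 - Y) * mask)[:, :, None, None] * np.log(1 - gamma)
        log_tau = np.log(pi)[None, :] + np.einsum('ijkl,jl->ik', logf, tau)
        log_tau -= log_tau.max(axis=1, keepdims=True)
        tau = np.exp(log_tau)
        tau /= tau.sum(axis=1, keepdims=True)
    return pi, gamma, tau

# Example usage on a simulated binary adjacency matrix Y (as in the earlier sketch):
# pi_hat, gamma_hat, tau = vem_sbm(Y, K=2); Z_hat = tau.argmax(axis=1)
```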

36 Binary or Weighted Affiliation SBM [Ambroise & Matias 10] I Models (binary/weighted)
$\{Z_i\}_{1 \le i \le n}$ i.i.d. latent vectors $Z_i = (Z_{i1}, \dots, Z_{iK}) \sim \mathcal{M}(1, \pi)$;
Conditional on the $\{Z_i\}$'s, the $Y_{ij}$ are independent;
Binary case: $Y_{ij} \sim \mathcal{B}(\gamma_{in})$ if $Z_i = Z_j$, and $Y_{ij} \sim \mathcal{B}(\gamma_{out})$ if $Z_i \ne Z_j$.
Weighted case: $Y_{ij} \sim p_{in} f(\cdot\,, \gamma_{in}) + (1 - p_{in})\,\delta_0(\cdot)$ if $Z_i = Z_j$, and $Y_{ij} \sim p_{out} f(\cdot\,, \gamma_{out}) + (1 - p_{out})\,\delta_0(\cdot)$ if $Z_i \ne Z_j$.

37 Binary or Weighted Affiliation SBM [Ambroise & Matias 10] II Composite likelihood idea - Weighted case
The present edges $Y_{ij} \ne 0$ follow a mixture distribution:
$Y_{ij} \mid Y_{ij} \ne 0 \sim \{\sum_{q=1}^{Q} \pi_q^2\, p_{in}\}\, f(Y_{ij}; \gamma_{in}) + \{\sum_{q \ne l} \pi_q \pi_l\, p_{out}\}\, f(Y_{ij}; \gamma_{out})$ (up to normalization).
Parameters of a mixture of two continuous distributions are in general identifiable. We form a composite log-likelihood
$\mathcal{L}^c_X(\theta) = \frac{1}{n(n-1)} \sum_{i<j} \log[\alpha_{in} f(Y_{ij}; \gamma_{in}) + \alpha_{out} f(Y_{ij}; \gamma_{out})]$.
If this converges to $\mathbb{E}[\log(\alpha_{in} f(Y_{ij}; \gamma_{in}) + \alpha_{out} f(Y_{ij}; \gamma_{out}))]$, then we can estimate the parameters with $\hat\theta = \operatorname*{argmax}_\theta \mathcal{L}^c_X(\theta)$.
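A minimal sketch of this composite likelihood idea (numpy + scipy; it assumes Gaussian weight densities $f$ with unit variance, restricts the criterion to the non-zero weights, and uses arbitrary starting values, so it illustrates the principle rather than reproducing the cited procedure):

```python
# A minimal sketch (numpy + scipy, hypothetical Gaussian densities) of the composite
# likelihood idea: fit a two-component mixture to the non-zero edge weights.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_composite_loglik(params, w):
    a, mu_in, mu_out = params                       # a = logit of the mixture weight
    alpha_in = 1.0 / (1.0 + np.exp(-a))
    dens = alpha_in * norm.pdf(w, mu_in, 1.0) + (1 - alpha_in) * norm.pdf(w, mu_out, 1.0)
    return -np.mean(np.log(dens + 1e-300))

def fit_weighted_affiliation(Y):
    # Y is assumed to be a weighted adjacency matrix, e.g. simulated as above
    w = Y[np.triu_indices_from(Y, k=1)]
    w = w[w != 0]                                   # keep the present edges only
    res = minimize(neg_composite_loglik, x0=np.array([0.0, 1.0, -1.0]),
                   args=(w,), method="Nelder-Mead")
    return res.x                                    # (logit alpha_in, mu_in, mu_out)
```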

38 Binary or Weighted Affiliation SBM [Ambroise & Matias 10] III Moment methods idea - Binary case
The same idea does not apply directly in the binary case, because $Y_{ij}$ is a mixture of Bernoulli distributions, which is not identifiable! However, mixtures of 3-variate Bernoulli distributions are identifiable (in many cases). Develop the same methodology with
$\mathcal{L}^c_X(\pi, \alpha, \beta) = \frac{1}{n(n-1)(n-2)} \sum_{(i,j,k) \in I_3} \log P(Y_{ij}, Y_{ik}, Y_{jk})$.
For the two approaches to be valid, we need to know whether the composite log-likelihoods converge.

39 Binary or Weighted Affiliation SBM [Ambroise & Matias 10] IV Notation
$\mathbf{i} = (i_1, \dots, i_k)$ a $k$-tuple of nodes,
$Y^{\mathbf{i}} = (Y_{i_1 i_2}, \dots, Y_{i_1 i_k}, Y_{i_2 i_3}, \dots, Y_{i_{k-1} i_k})$ the vector of $p = \binom{k}{2}$ r.v. induced by the nodes $\mathbf{i}$,
$g : \mathcal{Y}^p \to \mathbb{R}^s$ a function, $\hat m_g = \frac{(n-k)!}{n!} \sum_{\mathbf{i} \in I_k} g(Y^{\mathbf{i}})$ and $m_g = \mathbb{E}(g(Y^{(1,\dots,k)}))$.
Theorem For any $k, s \ge 1$ and $p = \binom{k}{2}$ and any measurable function $g : \mathcal{Y}^p \to \mathbb{R}^s$ such that $\mathbb{E}(\|g(Y^{(1,\dots,k)})\|^2) < +\infty$, the estimator $\hat m_g$ is consistent, $\hat m_g \to m_g$ almost surely as $n \to \infty$, as well as asymptotically normal, $\sqrt{n}(\hat m_g - m_g) \to \mathcal{N}(0, \Sigma_g)$ as $n \to \infty$.

40 Outline Introduction to random graphs State space models for random graphs Stochastic Block Model (binary or weighted graphs) Parameter estimation and node clustering Identifiability Parameter estimation Clustering convergence results

41 Convergence issues Why does the variational approximation work? The variational approximation appears to be efficient, both for LBM and SBM. Variational approximation does not converge unless the true posterior $p(Z \mid Y; \gamma)$ is degenerate [Gunawardana & Byrne 05]. Remaining issues What is the (asymptotic) behaviour of the groups posterior distribution? Is it degenerate? Is the variational approximation somehow equivalent to the EM approach? Does maximum likelihood converge in this setting anyway?

42 Maximum likelihood and variational approach Results from [Celisse et al. 12] in the SBM case Variational EM is asymptotically equivalent to classical EM for SBM. Maximum likelihood is convergent in this setup (as the sample size increases).

43 Convergence of the groups posterior distribution (LBM or SBM) Results from [Mariadassou & Matias 13] In general, the groups posterior distribution converges to a Dirac mass (when $n, m \to \infty$). However, when there exist equivalent configurations (= node groups inducing the same likelihood), the posterior converges to a mixture of Dirac masses located at these configurations. In some cases (in particular affiliation), the number of equivalent configurations is larger than the number of label switching configurations. When there are equivalent configurations, the posterior converges to a Dirac mass at the configuration with the largest prior.

44 Equivalent configurations in SBM or LBM
Label switching corresponds to $P_{(\sigma(\pi), \sigma(\gamma))} = P_{(\pi, \gamma)}$ for any permutation $\sigma$ of $\{1, \dots, K\}$;
In classical mixtures, identifiability requires that $\gamma_q \ne \gamma_{q'}$ for any $q \ne q'$;
In SBM or LBM, one may have $\gamma_{ql} = \gamma_{q'l}$ for some $q \ne q'$;
Then, if the matrix $\gamma$ has symmetries, we may have $\sigma(\gamma) = \gamma$ with the model still identifiable if $\pi$ has non-equal entries. Namely $P_{(\pi, \sigma(\gamma))} = P_{(\pi, \gamma)}$;
As a consequence, the ratio between the posterior distributions at $(Z^n, W^m)$ and $\sigma^{-1}(Z^n, W^m)$ does not depend on the data:
$P_{(\pi,\gamma)}(Z^n, W^m \mid Y_{n,m}) \propto \pi(Z^n, W^m) P_\gamma(Y_{n,m} \mid Z^n, W^m) = \pi(Z^n, W^m) P_{\sigma(\gamma)}(Y_{n,m} \mid Z^n, W^m) \propto \frac{\pi(Z^n, W^m)}{\pi(\sigma^{-1}(Z^n, W^m))}\, P_{(\pi,\gamma)}(\sigma^{-1}(Z^n, W^m) \mid Y_{n,m})$.

45 Conclusions Modeling data SBM are natural and powerful models for handling network data. Many variants exist, with overlapping groups or covariates. Data may be binary or weighted, sparse or not, directed or not...; natural generalisation of SBM for matrix data: LBM are handled in the same way. Model-based clustering of the nodes of the graph (or the rows/columns of the array), which encompasses community detection approaches. Theoretical results Convergence results are difficult to obtain but some exist. Variational EM approximations provide good practical results but tend to depend on the initialisation: there is room for improvement!

46 References I
[Airoldi et al. 08] E.M. Airoldi, D.M. Blei, S.E. Fienberg and E.P. Xing. Mixed membership stochastic blockmodels. J. Mach. Learn. Res., 9, 2008.
[Allman et al. 09] E.S. Allman, C. Matias and J.A. Rhodes. Identifiability of parameters in latent structure models with many observed variables. Ann. Statist., 37(6A), 2009.
[Allman et al. 11] E.S. Allman, C. Matias and J.A. Rhodes. Parameter identifiability in a class of random graph mixture models. J. Statist. Planning and Inference, 141(5), 2011.

47 References II
[Ambroise & Matias 10] C. Ambroise and C. Matias. New consistent and asymptotically normal estimators for random graph mixture models. Journal of the Royal Statistical Society: Series B, 74(1):3-35.
[Celisse et al. 12] A. Celisse, J.-J. Daudin and L. Pierre. Consistency of maximum-likelihood and variational estimators in the stochastic block model. Electron. J. Statist., 6, 2012.
[Daudin et al. 08] J.-J. Daudin, F. Picard and S. Robin. A mixture model for random graphs. Stat. Comput., 18(2), 2008.

48 References III
[Gassiat et al. 13] E. Gassiat, A. Cleynen and S. Robin. Finite state space non parametric hidden Markov models are in general identifiable. arXiv preprint, 2013.
[Govaert & Nadif 03] G. Govaert and M. Nadif. Clustering with block mixture models. Pattern Recognition, 36(2), 2003.
[Govaert & Nadif 08] G. Govaert and M. Nadif. Block clustering with Bernoulli mixture models: Comparison of different approaches. Computational Statistics and Data Analysis, 52(6), 2008.

49 References IV
[Govaert & Nadif 10] G. Govaert and M. Nadif. Latent block model for contingency table. Communications in Statistics - Theory and Methods, 39(3), 2010.
[Gunawardana & Byrne 05] Gunawardana and Byrne. Convergence theorems for generalized alternating minimization procedures. JMLR, 6, 2005.
[Handcock et al. 07] M.S. Handcock, A.E. Raftery and J.M. Tantrum. Model-based clustering for social networks. J. R. Statist. Soc. A, 170(2), 2007.
[Hoff et al. 02] P.D. Hoff, A.E. Raftery and M.S. Handcock. Latent space approaches to social network analysis. J. Amer. Statist. Assoc., 97(460), 2002.

50 References V
[Keribin et al. 13] C. Keribin, V. Brault, G. Celeux and G. Govaert. Estimation and selection for the latent block model on categorical data. INRIA Research Report 8264, 2013.
[Latouche et al. 11] P. Latouche, E. Birmelé and C. Ambroise. Overlapping stochastic block models with application to the French political blogosphere. Annals of Applied Statistics, 5(1), 2011.
[Mariadassou & Matias 13] M. Mariadassou and C. Matias. Convergence of the groups posterior distribution in latent or stochastic block models. To appear in Bernoulli. HAL preprint, 2013.

51 References VI
[Szemerédi 78] E. Szemerédi. Regular partitions of graphs. Problèmes combinatoires et théorie des graphes (Colloq. Internat. CNRS, Univ. Orsay, Orsay, 1976), Colloq. Internat. CNRS, 260, 1978.
[Wyse & Friel 12] J. Wyse and N. Friel. Block clustering with collapsed latent block models. Stat. Comput., 22, 2012.
[Zanghi et al. 10b] H. Zanghi, S. Volant and C. Ambroise. Clustering based on random graph model embedding vertex features. Pattern Recognition Letters, 31(9), 2010.
