Variational inference for the Stochastic Block-Model


Variational inference for the Stochastic Block-Model
S. Robin, AgroParisTech / INRA
Workshop on Random Graphs, April 2011, Lille

Stochastic block model (SBM)

Modelling network heterogeneity

Latent variable models allow to capture the underlying structure of a network.

General setting for binary graphs (Bollobás et al. (2007)):
- a latent (unobserved) variable $Z_i$ is associated with each node: $\{Z_i\}$ i.i.d. $\sim \pi$;
- the edges $X_{ij}$ are independent conditionally on the $Z_i$'s: $\{X_{ij}\}$ independent $\mid \{Z_i\}$: $X_{ij} \sim \mathcal{B}[\gamma(Z_i, Z_j)]$.

Continuous case (Hoff et al. (2002)) ($\approx$ PCA): $Z_i \in \mathbb{R}^d$, $\mathrm{logit}[\gamma(z, z')] = a - |z - z'|$.

Discrete case (Nowicki and Snijders (2001)) ($\approx$ finite mixture = SBM): $Z_i \in \{1, \dots, K\}$, $\gamma(k, l) = \gamma_{kl}$.

(Weighted) Stochastic Block-Model (SBM)

Discrete-valued latent labels: each node $i$ belongs to class $k$ with probability $\pi_k$:
$$\{Z_i\}_i \text{ i.i.d.}, \quad Z_i \sim \mathcal{M}(1; \pi), \quad \text{where } \pi = (\pi_1, \dots, \pi_K).$$

Observed edges: the $\{X_{ij}\}_{i,j}$ are conditionally independent given the $Z_i$'s:
$$(X_{ij} \mid Z_i = k, Z_j = l) \sim f_{kl}(\cdot)$$
where $f_{kl}(\cdot)$ is some parametric distribution, $f_{kl}(x) = f(x; \gamma_{kl})$; we denote $\gamma = \{\gamma_{kl}\}_{k,l}$. E.g., for a binary graph, $(X_{ij} \mid Z_i = k, Z_j = l) \sim \mathcal{B}(\gamma_{kl})$.

Statistical inference: we want to estimate $\theta = (\pi, \gamma)$ and $P(Z \mid X)$.
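
To fix ideas, here is a minimal simulation sketch of the binary SBM just defined (Python, assuming numpy; the parameter values echo the 2-group scenario used later in the talk):

```python
import numpy as np

def simulate_sbm(n, pi, gamma, seed=None):
    """Simulate a directed binary SBM: Z_i ~ M(1; pi), X_ij ~ B(gamma[Z_i, Z_j]).

    A minimal sketch of the model above; pi holds the K class proportions and
    gamma the K x K connection probabilities.
    """
    rng = np.random.default_rng(seed)
    K = len(pi)
    Z = rng.choice(K, size=n, p=pi)               # latent labels
    P = gamma[np.ix_(Z, Z)]                       # edge probabilities gamma_{Z_i Z_j}
    X = (rng.random((n, n)) < P).astype(int)      # binary adjacency matrix
    np.fill_diagonal(X, 0)                        # no self-loops
    return Z, X

# Example: 2 groups, pi = (.6, .4), within/between probabilities (.8, .2, .5).
Z, X = simulate_sbm(100, pi=np.array([0.6, 0.4]),
                    gamma=np.array([[0.8, 0.2], [0.2, 0.5]]), seed=0)
```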

Variational inference

Maximum likelihood inference

Maximum likelihood estimate: we are looking for
$$\widehat{\theta} = \arg\max_\theta \log P(X; \theta),$$
but $P(X; \theta) = \sum_Z P(X, Z; \theta)$ is not tractable.

Classical strategy: based on the decomposition
$$\log P(X) = E[\log P(X, Z) \mid X] - E[\log P(Z \mid X) \mid X],$$
the EM algorithm aims at retrieving the maximum likelihood estimate via the alternation of two steps:
- E-step: calculation of $P(Z \mid X; \theta)$;
- M-step: maximisation of $E[\log P(X, Z; \theta) \mid X]$ in $\theta$.
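
To see why the E-step is the obstacle for SBM, one can write the likelihood sum explicitly (binary case, with the notation of the previous slides):
$$P(X; \theta) = \sum_{z \in \{1,\dots,K\}^n} \prod_i \pi_{z_i} \prod_{i \ne j} \gamma_{z_i z_j}^{X_{ij}} (1 - \gamma_{z_i z_j})^{1 - X_{ij}},$$
a sum over $K^n$ label configurations, so neither $P(X; \theta)$ nor $P(Z \mid X; \theta)$ can be evaluated exactly beyond very small $n$.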

Case of the Stochastic Block-Model: dependency structure

[Figure: three graphs on the nodes $Z_i, Z_j, Z_k$ and edges $X_{ij}, X_{jk}, X_{ik}$ — the dependency graph of $P(Z)P(X \mid Z)$, its moral graph (Lauritzen (1996)), and the conditional dependency graph of $P(Z \mid X)$.]

The conditional dependency graph of $Z$ is a clique: no factorisation can be hoped for to calculate $P(Z \mid X)$ (unlike hidden Markov random fields).

Variational approximation

As $P(Z \mid X)$ cannot be calculated, we need to find some approximate distribution $Q(Z)$.

Lower bound of the log-likelihood: for any distribution $Q(Z)$, we have (Jaakkola (2000), Wainwright and Jordan (2008))
$$\log P(X) \;\ge\; \log P(X) - KL[Q(Z); P(Z \mid X)] \;=\; E_Q[\log P(X, Z)] + H(Q),$$
where $H(Q)$ is the entropy of $Q$: $H(Q) = -E_Q[\log Q(Z)]$.

This amounts to replacing $P(\cdot \mid X)$ with $Q(\cdot)$ in
$$\log P(X) = E_{P(\cdot \mid X)}[\log P(X, Z)] + H[P(\cdot \mid X)].$$
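
The bound follows in two lines from the definitions above:
$$KL[Q; P(\cdot \mid X)] = E_Q[\log Q(Z)] - E_Q[\log P(Z \mid X)] = -H(Q) - E_Q[\log P(X, Z)] + \log P(X) \;\ge\; 0,$$
so $\log P(X) \ge E_Q[\log P(X, Z)] + H(Q)$, with equality if and only if $Q = P(\cdot \mid X)$.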

Variational EM

Expectation step (pseudo E-step): find the best lower bound of $\log P(X)$, i.e. the best approximation of $P(\cdot \mid X)$, as
$$Q^* = \arg\min_{Q \in \mathcal{Q}} KL[Q(Z); P(Z \mid X)],$$
where $\mathcal{Q}$ is a class of manageable distributions.

Maximisation step (M-step): estimate $\theta$ as
$$\widehat{\theta} = \arg\max_\theta E_{Q^*}[\log P(X, Z; \theta)],$$
which maximises the lower bound of $\log P(X)$.

Approximation of $P(Z \mid X)$ for SBM

We are looking for $Q^* = \arg\min_{Q \in \mathcal{Q}} KL[Q(Z); P(Z \mid X)]$.

We restrict ourselves to the set of factorisable distributions:
$$\mathcal{Q} = \Big\{ Q : Q(Z) = \prod_i Q_i(Z_i) = \prod_i \prod_k \tau_{ik}^{Z_{ik}} \Big\}, \qquad \tau_{ik} \approx \Pr\{Z_i = k \mid X\}.$$

The optimal $\tau_{ik}^*$'s satisfy the fixed-point relation
$$\tau_{ik}^* \propto \pi_k \prod_{j \ne i} \prod_l \big[f_{kl}(X_{ij})\big]^{\tau_{jl}^*},$$
also known as the mean-field approximation in physics (Parisi (1988)).
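
To make the alternation concrete, here is a minimal VEM sketch for the directed binary SBM (an illustration under the factorised class above, not the MixNet implementation; it assumes numpy and a zero-diagonal adjacency matrix such as the one returned by simulate_sbm above):

```python
import numpy as np

def vem_binary_sbm(X, K, n_iter=100, tol=1e-6, seed=0):
    """Variational EM for a directed binary SBM (illustrative sketch).

    pseudo E-step: fixed-point updates of tau[i, k] ~ Pr{Z_i = k | X};
    M-step: closed-form updates of pi and gamma given tau.
    Uses O(n^2 K^2) memory, which is fine for illustration.
    """
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    tau = rng.dirichlet(np.ones(K), size=n)
    off = (~np.eye(n, dtype=bool)).astype(float)     # mask out the diagonal
    for _ in range(n_iter):
        # M-step
        pi = tau.mean(axis=0)
        num = tau.T @ (X * off) @ tau                # E_Q[# edges between groups k, l]
        den = tau.T @ off @ tau                      # E_Q[# pairs between groups k, l]
        gamma = np.clip(num / den, 1e-6, 1 - 1e-6)   # avoid log(0) below
        # pseudo E-step: log tau_ik = log pi_k + sum_{j!=i} sum_l tau_jl log f_kl(X_ij)
        # (both edge directions X_ij and X_ji contribute for a directed graph)
        logf = (X[:, :, None, None] * np.log(gamma)
                + (1 - X)[:, :, None, None] * np.log(1 - gamma)) * off[:, :, None, None]
        log_tau = (np.log(pi) + np.einsum('ijkl,jl->ik', logf, tau)
                   + np.einsum('jilk,jl->ik', logf, tau))
        log_tau -= log_tau.max(axis=1, keepdims=True)
        new_tau = np.exp(log_tau)
        new_tau /= new_tau.sum(axis=1, keepdims=True)
        done = np.abs(new_tau - tau).max() < tol
        tau = new_tau
        if done:
            break
    return tau, pi, gamma
```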

Application to a regulatory network (operon network)

Regulatory network = directed graph where
- nodes = genes (or groups of genes, e.g. operons),
- edges = regulations: $\{i \to j\}$ means "$i$ regulates $j$".

Questions:
- Do some nodes share similar connexion profiles?
- Is there a macroscopic organisation of the network?

SBM analysis

Parameter estimates ($K = 5$): table of $\widehat{\gamma}_{kl}$ (%) and $\widehat{\pi}$ (%). [Numerical values not recovered from the slide.] Meta-graph representation. [Figure.]

Picard et al. (2009)

Consistency (?) of the variational inference

The quality of the inference based on the variational approximation is not yet well understood.

Negative result (Gunawardana and Byrne (2005)): in the general case the VEM algorithm converges to a different optimum than ML, except for degenerate models.

Specific case of graphs:
- specific asymptotic framework: $n^2$ data, i.e. $p = n$ variables per individual;
- the mean-field approximation is asymptotically exact for some models with infinite-range dependency (Opper and Winther (2001): law-of-large-numbers argument).

Concentration of $P(Z \mid X)$ for binary graphs

Let us denote by $g$ the conditional distribution
$$g(z; x) := \Pr\{Z = z \mid X\} = \frac{1}{C} \prod_i \pi_{z_i} \prod_{i \ne j} \gamma_{z_i z_j}^{X_{ij}} [1 - \gamma_{z_i z_j}]^{1 - X_{ij}}.$$

Theorem (Célisse et al. (2011)): under identifiability conditions, and if for all $k, l$: $0 < a < \gamma_{kl} < 1 - a$ and $0 < b < \pi_k$, then
$$\forall t > 0: \quad \Pr\Big\{ \sum_{z \ne z^*} \frac{g(z; X)}{g(z^*; X)} > t \;\Big|\; Z = z^* \Big\} = O(n e^{-\kappa(t) n}).$$

- If the true labels are $z^*$, then $P(Z \mid X)$ concentrates around $z^*$.
- SBM is a degenerate model (in the sense above).
- Ongoing work about the convergence $P(\cdot \mid X) \to \delta_{z^*}$ (Matias (2011)).

Concentration of the degree distribution

Binary graph: binomial distribution of the degrees,
$$K_i \mid \{Z_i = k\} \sim \mathcal{B}(n - 1, \overline{\gamma}_k), \qquad \text{where } \overline{\gamma}_k = \sum_l \pi_l \gamma_{kl}.$$

Normalised degree: $D_i = K_i/(n - 1)$ concentrates around $\overline{\gamma}_k$.

[Figures: empirical distribution of the $D_i$'s for growing graph sizes (up to n = 1000); the group-wise modes separate as n grows.]

A linear-time algorithm based on the gaps between the ordered $D_{(i)}$'s provides consistent estimates of $\pi$ and $\gamma$ (Channarond (2011)).
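
A sketch of the degree-gap idea (a simplified illustration, not Channarond's exact procedure): sort the normalised degrees, cut at the $K - 1$ largest gaps, and read off class proportions and mean degrees.

```python
import numpy as np

def degree_gap_classes(X, K):
    """Cluster nodes by cutting the sorted normalised degrees at the K-1 largest gaps.

    Illustrative sketch of the degree-concentration idea above; X is a binary
    adjacency matrix with zero diagonal, K >= 2.
    """
    n = X.shape[0]
    D = X.sum(axis=1) / (n - 1)                      # normalised degrees D_i
    order = np.argsort(D)
    gaps = np.diff(D[order])                         # gaps between the ordered D_(i)
    cuts = np.sort(np.argsort(gaps)[::-1][:K - 1])   # ranks of the K-1 largest gaps
    labels = np.empty(n, dtype=int)
    labels[order] = np.searchsorted(cuts, np.arange(n))   # class = # cuts below rank
    pi_hat = np.bincount(labels, minlength=K) / n         # estimated proportions
    gbar_hat = np.array([D[labels == k].mean() for k in range(K)])  # mean degrees
    return labels, pi_hat, gbar_hat
```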

Variational Bayes inference

Variational Bayes approximation

Bayesian setting: both $\theta$ and $Z$ are random and unobserved, and we want to retrieve $P(Z, \theta \mid X)$, so we look for
$$Q^* = \arg\min_{Q \in \mathcal{Q}} KL[Q(Z, \theta); P(Z, \theta \mid X)]$$
within $\mathcal{Q} = \{Q : Q(Z, \theta) = Q_Z(Z)\, Q_\theta(\theta)\}$.

VB-EM algorithm: in the exponential family / conjugate prior context,
$$P(X, Z, \theta) \propto \exp\{\phi(\theta)^\intercal [u(X, Z) + \nu]\},$$
the optimal $Q^*(Z, \theta)$ is recovered (Beal and Ghahramani (2003)) by alternating
- pseudo-M: $Q_\theta(\theta) \propto \exp\big(\phi(\theta)^\intercal \{E_{Q_Z}[u(X, Z)] + \nu\}\big)$,
- pseudo-E: $Q_Z(Z) \propto \exp\{E_{Q_\theta}[\phi(\theta)]^\intercal u(X, Z)\}$.

See Latouche et al. (2010) for binary SBM inference.
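
As an illustration, a minimal VB-EM sketch for the directed binary SBM with the standard conjugate choices, a Dirichlet prior on $\pi$ and independent Beta priors on the $\gamma_{kl}$'s (a sketch under these assumed priors, not necessarily the exact specification of Latouche et al. (2010)); scipy's digamma gives the required expectations $E[\log \pi_k]$ and $E[\log \gamma_{kl}]$:

```python
import numpy as np
from scipy.special import digamma

def vbem_binary_sbm(X, K, n_iter=50, seed=0):
    """VB-EM sketch for a directed binary SBM with Dirichlet/Beta conjugate priors."""
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    tau = rng.dirichlet(np.ones(K), size=n)
    off = (~np.eye(n, dtype=bool)).astype(float)
    for _ in range(n_iter):
        # pseudo-M: update the variational posteriors of theta = (pi, gamma)
        alpha = 1.0 + tau.sum(axis=0)                    # Dirichlet(alpha) for pi
        a = 1.0 + tau.T @ (X * off) @ tau                # Beta(a, b) for each gamma_kl
        b = 1.0 + tau.T @ ((1 - X) * off) @ tau
        # pseudo-E: update tau using E_{Q_theta}[log pi_k], E_{Q_theta}[log gamma_kl]
        elog_pi = digamma(alpha) - digamma(alpha.sum())
        elog_g = digamma(a) - digamma(a + b)
        elog_1mg = digamma(b) - digamma(a + b)
        logf = (X[:, :, None, None] * elog_g
                + (1 - X)[:, :, None, None] * elog_1mg) * off[:, :, None, None]
        log_tau = elog_pi + (np.einsum('ijkl,jl->ik', logf, tau)
                             + np.einsum('jilk,jl->ik', logf, tau))
        log_tau -= log_tau.max(axis=1, keepdims=True)
        tau = np.exp(log_tau)
        tau /= tau.sum(axis=1, keepdims=True)
    return tau, alpha, a, b
```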

Operon network: comparison of VEM and VB

VEM estimates for the K = 5 group model lie within the VB approximate 90% credibility intervals (Gazal et al. (2011)).

VB approximate 90% credibility intervals (%), K = 5 group model ($\gamma_{kl}$ rows k = 1..5, columns l = 1..5; last row: $\pi$):

k = 1   [0.02; 0.04]    [0.00; 0.10]   [0.01; 0.08]   [0.00; 0.03]    [0.02; 1.34]
k = 2   [6.14; 7.60]    [0.61; 3.68]   [1.07; 3.50]   [0.05; 0.54]    [0.33; 17.62]
k = 3   [1.20; 1.72]    [0.35; 2.02]   [0.56; 1.92]   [0.03; 0.30]    [0.19; 10.57]
k = 4   [0.01; 0.07]    [0.04; 0.51]   [0.01; 0.20]   [0.76; 1.27]    [0.08; 4.43]
k = 5   [6.35; 12.70]   [4.60; 33.36]  [4.28; 24.37]  [63.56; 81.28]  [5.00; 95.00]
π       [59.65; 74.38]  [2.88; 6.74]   [5.68; 10.77]  [16.02; 24.04]  [0.11; 1.42]

VEM and VB estimates for the K = 5 group model (approximate 90% credibility intervals).

Variational Bayes approximation: simulation study

Little is known about the properties of variational Bayes inference:
- consistency is proved for some incomplete-data models (McGrory and Titterington (2009));
- in practice, VB-EM often under-estimates the posterior variances.

Simulation design: 2-group binary SBM with two scenarios (parameter values recovered from the comparison table below):
$$\pi = (0.6,\ 0.4), \qquad \gamma = \begin{pmatrix} 0.8 & 0.2 \\ 0.2 & \gamma_{22} \end{pmatrix}, \qquad \gamma_{22} = 0.5 \text{ (scenario 1)} / 0.3 \text{ (scenario 2)}.$$

Comparison of 4 methods: EM (when possible), VEM, BP and VB.

Belief propagation (BP) algorithm: based on the expansion
$$E_Q[\log P(X, Z)] = \sum_{i,k} \underbrace{E_Q[Z_{ik}]}_{\tau_{ik}} \log \pi_k + \sum_{i,j} \sum_{k,l} \underbrace{E_Q[Z_{ik} Z_{jl}]}_{\tau_{ijkl}\ \text{(BP)}\ /\ \tau_{ik}\tau_{jl}\ \text{(mean field)}} \log f(X_{ij}; \gamma_{kl}).$$

500 graphs are simulated for each scenario and each graph size.
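
For illustration, the design can be wired up with the sketches given earlier (hypothetical wiring, scenario 2 shown; n = 18 as in the table on the next slide):

```python
import numpy as np

pi = np.array([0.6, 0.4])
gamma2 = np.array([[0.8, 0.2], [0.2, 0.3]])   # scenario 2

# 500 replicates for one graph size, reusing simulate_sbm / vem_binary_sbm above.
estimates = []
for rep in range(500):
    Z, X = simulate_sbm(18, pi, gamma2, seed=rep)
    tau, pi_hat, gamma_hat = vem_binary_sbm(X, K=2)
    estimates.append((pi_hat, gamma_hat))     # defined up to a relabelling of groups
```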

Estimates, standard deviations and likelihood

Comparison on small graphs (n = 18); estimates in %, standard deviations in parentheses. [The log P(X) column of the original table was not recovered.]

             π_1          γ_11         γ_12         γ_22
Scenario 1   60%          80%          20%          50%
  EM         59.1 (13.1)  78.5 (13.5)  20.9 (8.4)   50.9 (15.4)
  VEM        57.7 (16.6)  78.8 (12.4)  22.4 (10.7)  50.3 (14.6)
  BP         57.9 (16.2)  78.9 (12.3)  22.2 (10.5)  50.3 (14.5)
  VB         58.1 (13.3)  78.2 (9.7)   21.6 (7.7)   50.8 (13.3)
Scenario 2   60%          80%          20%          30%
  EM         59.5 (14.1)  78.7 (15.6)  21.2 (8.7)   30.3 (14.3)
  VEM        55.6 (19.0)  80.1 (14.0)  24.0 (11.8)  30.8 (13.8)
  BP         56.6 (17.8)  80.0 (13.6)  23.2 (11.0)  30.8 (13.8)
  VB         58.4 (14.6)  77.9 (12.0)  22.3 (9.3)   32.1 (12.3)

All methods provide similar results; EM achieves the best ones. Belief propagation (BP) does not significantly improve on VEM.

Influence of the graph size

Comparison of VEM (·) and VB (+) in scenario 2 (the most difficult). [Figures: means and standard deviations of the estimates as functions of n; left to right: π_1, γ_11, γ_12, γ_22.]

VB estimates converge more rapidly than VEM estimates; their precision is also better.

VB credibility intervals

Actual coverage level as a function of n, for π_1 (+), γ_11, γ_12 and γ_22. [Figure; the plotting symbols of the last three parameters were lost in transcription.]

For all parameters, the VB posterior credibility intervals achieve the nominal level (90%) as soon as n ≥ 30. The VB approximation seems to work well.

Convergence rate of the VB estimates

Width of the posterior credibility intervals. [Figure: π_1, γ_11, γ_12, γ_22.]

The width decreases as $1/\sqrt{n}$ for $\pi_1$; it decreases as $1/n = 1/\sqrt{n^2}$ for $\gamma_{11}$, $\gamma_{12}$ and $\gamma_{22}$. This is consistent with the penalty of the ICL criterion proposed by Daudin et al. (2008) (see next slide).
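
The two rates mirror the amount of information available for each parameter block: roughly $n$ label draws inform $\pi$, while $O(n^2)$ node pairs inform $\gamma$, so the usual parametric rates give
$$\mathrm{width}(\pi_1) \asymp n^{-1/2}, \qquad \mathrm{width}(\gamma_{kl}) \asymp (n^2)^{-1/2} = n^{-1},$$
matching the $(K-1)\log n$ and $K^2 \log[n(n-1)/2]$ terms of the ICL penalty on the next slide.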

A few more facts about inference

Identifiability: even for binary edges, MixNet (SBM) is identifiable (Allman et al. (2009))... although mixtures of Bernoulli are not.

Model selection: Daudin et al. (2008) propose the penalised criterion
$$ICL(K) = E_{Q^*}[\log P(Z, X)] - \frac{1}{2}\Big\{(K - 1)\log n + K^2 \log[n(n-1)/2]\Big\}.$$

The difference between ICL and BIC is the entropy term $H(Q^*)$... which is almost zero here, due to the concentration of $P(Z \mid X)$.

BIC- and ICL-like criteria are also considered in Latouche et al. (2011b) for SBM, in the context of variational Bayes inference.
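
The criterion is immediate to evaluate once $E_{Q^*}[\log P(Z, X)]$ is available from VEM; a small sketch (the penalty is the one displayed above, stated for $n(n-1)/2$ unordered pairs):

```python
import numpy as np

def icl_penalty(n, K):
    """Penalty term of the ICL criterion of Daudin et al. (2008)."""
    return 0.5 * ((K - 1) * np.log(n) + K**2 * np.log(n * (n - 1) / 2))

def icl(expected_complete_loglik, n, K):
    """ICL(K) = E_Q[log P(Z, X)] - penalty; select the K maximising it."""
    return expected_complete_loglik - icl_penalty(n, K)
```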

Covariates in weighted networks

Accounting for covariates

Understanding the mixture components: observed clusters may be related to exogenous covariates. Model-based clustering (such as SBM) provides a comfortable set-up to account for covariates.

Generalised linear model: in the exponential family context, covariates $y$ can be accounted for via a regression term,
$$g(E X_{ij}) = \mu_{kl} + y_{ij}^\intercal \beta \quad \text{if } Z_{ik} Z_{jl} = 1,$$
where $\beta$ does not depend on the group (Mariadassou et al. (2010)).

Both VEM and VB-EM inference can be performed.
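
For the Poisson-valued network considered next, with log link $g = \log$, this reads
$$X_{ij} \mid Z_{ik} Z_{jl} = 1 \;\sim\; \mathcal{P}\big(\lambda_{kl}\, e^{y_{ij}^\intercal \beta}\big), \qquad \log E X_{ij} = \underbrace{\log \lambda_{kl}}_{\mu_{kl}} + y_{ij}^\intercal \beta.$$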

Tree interaction network

Data: n = 51 tree species; X_ij = number of shared parasites (Vacher et al. (2008)).

Model: $X_{ij} \sim \mathcal{P}(\lambda_{kl})$, where $\lambda_{kl}$ is the mean number of shared parasites.

Results: ICL selects K = 7 groups (T1-T7), which are strongly related with the phylums (Coniferophyta vs Magnoliophyta). [Table of $\widehat{\lambda}_{kl}$ and $\widehat{\pi}_k$: values not recovered. Figure: mean number of interactions; group size and composition.]

Accounting for taxonomic distance

Model: $d_{ij} = d_{taxo}(i, j)$, $X_{ij} \sim \mathcal{P}(\lambda_{kl}\, e^{\beta d_{ij}})$.

Results: $\widehat{\beta} \approx -0.32$ (value recovered from $e^{\widehat{\beta} d} = .298$ at $d = 3.82$): the mean number of shared parasites decreases with the taxonomic distance. [Table of $\widehat{\lambda}_{kl}$ for the resulting K = 4 groups (T'1-T'4), $\widehat{\pi}_k$ and $\widehat{\beta}$: values not recovered. Figure: mean number of interactions; group size and composition, Coniferophyta vs Magnoliophyta.]

Groups are no longer associated with the phylogenetic structure: the mixture now captures the residual heterogeneity of the regression.

Conclusion

Conclusion

Stochastic block-model: a flexible and already widely used mixture model to uncover underlying heterogeneity in networks.

Variational inference:
- efficient and scalable in terms of computation time, as opposed to MCMC (Daudin (2011): n > 2000);
- seems to work well, because the conditional distribution $P(Z \mid X)$ (and therefore $P(Z, \theta \mid X)$) asymptotically belongs to the class $\mathcal{Q}$ within which the optimisation is achieved; this is due to the specific asymptotic framework of networks.

Alternative methods: faster algorithms do exist for large graphs, based on the degree distribution (Channarond (2011)) or on spectral clustering (Rohe et al. (2010)).

Future work

Theoretical properties of variational estimates: although the graph context seems favourable, we still need a better understanding of the properties of variational and variational Bayes inference.

SBM = discrete version of the W-graph: let $\phi : [0,1]^2 \to [0,1]$,
$$\{Z_i\} \text{ i.i.d. } \sim \mathcal{U}[0,1], \qquad \{X_{ij}\} \text{ indep. given } \{Z_i\}: \ X_{ij} \sim \mathcal{B}[\phi(Z_i, Z_j)].$$
- Approximation of the graphon function $\phi(u, v)$ by a step function $\gamma_{kl}$ (SBM);
- model averaging based on optimal variational weights (Volant (2011)).
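
A sketch of the W-graph sampling scheme, together with the step-function graphon that makes SBM a special case (the two-block parameters below are illustrative, not from the talk):

```python
import numpy as np

def simulate_w_graph(n, phi, seed=None):
    """Sample a W-graph: U_i ~ U[0,1], X_ij ~ B[phi(U_i, U_j)]."""
    rng = np.random.default_rng(seed)
    U = rng.random(n)
    P = phi(U[:, None], U[None, :])               # graphon evaluated on all pairs
    X = (rng.random((n, n)) < P).astype(int)
    np.fill_diagonal(X, 0)
    return U, X

def sbm_graphon(pi, gamma):
    """Step-function graphon: phi(u, v) = gamma[k(u), l(v)], blocks of widths pi."""
    cuts = np.cumsum(pi)[:-1]
    return lambda u, v: gamma[np.searchsorted(cuts, u), np.searchsorted(cuts, v)]

U, X = simulate_w_graph(200, sbm_graphon(np.array([0.6, 0.4]),
                                         np.array([[0.8, 0.2], [0.2, 0.5]])))
```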

Acknowledgements

People: A. Célisse, A. Channarond, J.-J. Daudin, S. Gazal, M. Mariadassou, V. Miele, F. Picard, C. Vacher.

Grant: supported by the French Agence Nationale de la Recherche, NeMo project ANR-08-BLAN.

Softwares:
- stand-alone MixNet: stat.genopole.cnrs.fr/software/mixnet/
- R package mixer: cran.r-project.org/web/packages/mixer/index.html
- R package NeMo (network motif detection): in preparation.

References

Airoldi, E. M., Blei, D. M., Fienberg, S. E. and Xing, E. P. (2008). Mixed membership stochastic blockmodels. J. Mach. Learn. Res.
Allman, E., Matias, C. and Rhodes, J. (2009). Identifiability of parameters in latent structure models with many observed variables. Ann. Statist. 37 (6A).
Beal, M. J. and Ghahramani, Z. (2003). The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures. Bayes. Statist.
Bollobás, B., Janson, S. and Riordan, O. (2007). The phase transition in inhomogeneous random graphs. Rand. Struct. Algo. 31 (1).
Daudin, J.-J., Picard, F. and Robin, S. (2008). A mixture model for random graphs. Stat. Comput. 18 (2).
Daudin, J.-J., Pierre, L. and Vacher, C. (2010). Model for heterogeneous random networks using continuous latent variables and an application to a tree-fungus network.
Gazal, S., Daudin, J.-J. and Robin, S. (2011). Accuracy of variational estimates for random graph mixture models. To appear.
Girvan, M. and Newman, M. E. J. (2002). Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99 (12).
Gunawardana, A. and Byrne, W. (2005). Convergence theorems for generalized alternating minimization procedures. J. Mach. Learn. Res.
Hoff, P. D., Raftery, A. E. and Handcock, M. S. (2002). Latent space approaches to social network analysis. J. Amer. Statist. Assoc. 97 (460).
Hofman, J. M. and Wiggins, C. H. (2008). A Bayesian approach to network modularity. Physical Review Letters.
Jaakkola, T. (2000). Tutorial on variational approximation methods. In Advanced Mean Field Methods: Theory and Practice. MIT Press.
Latouche, P., Birmelé, E. and Ambroise, C. (2010). A non-asymptotic BIC criterion for stochastic blockmodels. Submitted; Tech. Report # 17, genome.jouy.inra.fr/ssb/preprint/.
Latouche, P., Birmelé, E. and Ambroise, C. (2011a). Overlapping stochastic block models with application to the French political blogosphere. To appear.
Latouche, P., Birmelé, E. and Ambroise, C. (2011b). Variational Bayesian inference and complexity control for stochastic block models. To appear.
Lauritzen, S. (1996). Graphical Models. Oxford Statistical Science Series, Clarendon Press.
Lovász, L. and Szegedy, B. (2006). Limits of dense graph sequences. J. Combin. Theory, Series B 96 (6).
von Luxburg, U., Belkin, M. and Bousquet, O. (2007). Consistency of spectral clustering. Ann. Statist. 36 (2).
Mariadassou, M., Robin, S. and Vacher, C. (2010). Uncovering structure in valued graphs: a variational approach. Ann. Appl. Statist. 4 (2).
McGrory, C. A. and Titterington, D. M. (2009). Variational Bayesian analysis for hidden Markov models. Austr. & New Zeal. J. Statist. 51 (2).
Newman, M. and Girvan, M. (2004). Finding and evaluating community structure in networks. Phys. Rev. E.
Newman, M. E. J. (2004). Fast algorithm for detecting community structure in networks. Phys. Rev. E 69.
Nowicki, K. and Snijders, T. (2001). Estimation and prediction for stochastic block-structures. J. Amer. Statist. Assoc.
Opper, M. and Winther, O. (2001). From naive mean field theory to the TAP equations. In Advanced Mean Field Methods: Theory and Practice. MIT Press.
Parisi, G. (1988). Statistical Field Theory. Addison Wesley, New York.
Picard, F., Miele, V., Daudin, J.-J., Cottret, L. and Robin, S. (2009). Deciphering the connectivity structure of biological networks using MixNet. BMC Bioinformatics, Suppl 6, S17.
Rohe, K., Chatterjee, S. and Yu, B. (2010). Spectral clustering and the high-dimensional stochastic block model.
Vacher, C., Piou, D. and Desprez-Loustau, M.-L. (2008). Architecture of an antagonistic tree/fungus network: the asymmetric influence of past evolutionary history. PLoS ONE 3 (3), e1740.
Wainwright, M. J. and Jordan, M. I. (2008). Graphical models, exponential families, and variational inference. Found. Trends Mach. Learn. 1 (1-2).

Appendix

Understanding network structure

Networks constitute a natural way to depict interactions between entities. They are now present in many scientific fields (biology, sociology, communication, economics, ...). Most observed networks display a heterogeneous topology that one would like to decipher and better understand.

[Figures: a dolphin social network; a hyperlink network. Newman and Girvan (2004)]

SBM for a binary social network

Zachary data: social binary network of friendship within a sport club.

Model: $X_{ij} \mid Z_i = q, Z_j = l \sim \mathcal{B}(\gamma_{ql})$.

Results: the split is recovered and the role of a few leaders is underlined. [Table of $\widehat{\gamma}_{kl}$ (%) and $\widehat{\pi}_l$: values not recovered. Figure: the clustered network.]

Alternatives, extensions and variations

Algorithmic approaches, looking for communities:
- graph clustering (Girvan and Newman (2002), Newman (2004));
- spectral clustering (von Luxburg et al. (2007)).

Variations around SBM:
- community structure (Hofman and Wiggins (2008)), mixed membership (Airoldi et al. (2008)), overlapping groups (Latouche et al. (2011a));
- continuous versions (Daudin et al. (2010)); SBM = step-function version of W-random graphs (Lovász and Szegedy (2006)).

In this talk:
- variational inference for SBM;
- variational Bayes inference for SBM;
- including covariates.

Approximate posterior distribution $Q_\theta$. [Figure.]

Comparison of classifications and goodness of fit

Accounting for taxonomy deeply modifies the group structure. [Confusion table between the groups T1-T7 (without covariate) and T'1-T'4 (with taxonomic distance): counts not recovered.]

Goodness of fit can be assessed via the predicted intensities $\widehat{X}_{ij}$ or degrees $\widehat{K}_i$. [Figures: observed vs predicted edges $X_{ij}$ and degrees $K_i$.]


More information

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for

More information

Random Effects Models for Network Data

Random Effects Models for Network Data Random Effects Models for Network Data Peter D. Hoff 1 Working Paper no. 28 Center for Statistics and the Social Sciences University of Washington Seattle, WA 98195-4320 January 14, 2003 1 Department of

More information

arxiv: v2 [stat.me] 22 Jun 2016

arxiv: v2 [stat.me] 22 Jun 2016 Statistical clustering of temporal networks through a dynamic stochastic block model arxiv:1506.07464v [stat.me] Jun 016 Catherine Matias Sorbonne Universités, UPMC Univ Paris 06, Univ Paris Diderot, Sorbonne

More information

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make

More information

Parametric Unsupervised Learning Expectation Maximization (EM) Lecture 20.a

Parametric Unsupervised Learning Expectation Maximization (EM) Lecture 20.a Parametric Unsupervised Learning Expectation Maximization (EM) Lecture 20.a Some slides are due to Christopher Bishop Limitations of K-means Hard assignments of data points to clusters small shift of a

More information

Variational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures

Variational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures 17th Europ. Conf. on Machine Learning, Berlin, Germany, 2006. Variational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures Shipeng Yu 1,2, Kai Yu 2, Volker Tresp 2, and Hans-Peter

More information

Model-based cluster analysis: a Defence. Gilles Celeux Inria Futurs

Model-based cluster analysis: a Defence. Gilles Celeux Inria Futurs Model-based cluster analysis: a Defence Gilles Celeux Inria Futurs Model-based cluster analysis Model-based clustering (MBC) consists of assuming that the data come from a source with several subpopulations.

More information

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Devin Cornell & Sushruth Sastry May 2015 1 Abstract In this article, we explore

More information

Expectation Maximization

Expectation Maximization Expectation Maximization Aaron C. Courville Université de Montréal Note: Material for the slides is taken directly from a presentation prepared by Christopher M. Bishop Learning in DAGs Two things could

More information

Variational inference

Variational inference Simon Leglaive Télécom ParisTech, CNRS LTCI, Université Paris Saclay November 18, 2016, Télécom ParisTech, Paris, France. Outline Introduction Probabilistic model Problem Log-likelihood decomposition EM

More information

Learning Gaussian Graphical Models with Unknown Group Sparsity

Learning Gaussian Graphical Models with Unknown Group Sparsity Learning Gaussian Graphical Models with Unknown Group Sparsity Kevin Murphy Ben Marlin Depts. of Statistics & Computer Science Univ. British Columbia Canada Connections Graphical models Density estimation

More information

G8325: Variational Bayes

G8325: Variational Bayes G8325: Variational Bayes Vincent Dorie Columbia University Wednesday, November 2nd, 2011 bridge Variational University Bayes Press 2003. On-screen viewing permitted. Printing not permitted. http://www.c

More information

Probabilistic Graphical Models for Image Analysis - Lecture 4

Probabilistic Graphical Models for Image Analysis - Lecture 4 Probabilistic Graphical Models for Image Analysis - Lecture 4 Stefan Bauer 12 October 2018 Max Planck ETH Center for Learning Systems Overview 1. Repetition 2. α-divergence 3. Variational Inference 4.

More information

Lecture 1b: Linear Models for Regression

Lecture 1b: Linear Models for Regression Lecture 1b: Linear Models for Regression Cédric Archambeau Centre for Computational Statistics and Machine Learning Department of Computer Science University College London c.archambeau@cs.ucl.ac.uk Advanced

More information

Latent Variable Models

Latent Variable Models Latent Variable Models Stefano Ermon, Aditya Grover Stanford University Lecture 5 Stefano Ermon, Aditya Grover (AI Lab) Deep Generative Models Lecture 5 1 / 31 Recap of last lecture 1 Autoregressive models:

More information

Based on slides by Richard Zemel

Based on slides by Richard Zemel CSC 412/2506 Winter 2018 Probabilistic Learning and Reasoning Lecture 3: Directed Graphical Models and Latent Variables Based on slides by Richard Zemel Learning outcomes What aspects of a model can we

More information

PMR Learning as Inference

PMR Learning as Inference Outline PMR Learning as Inference Probabilistic Modelling and Reasoning Amos Storkey Modelling 2 The Exponential Family 3 Bayesian Sets School of Informatics, University of Edinburgh Amos Storkey PMR Learning

More information

Statistical Approaches to Learning and Discovery

Statistical Approaches to Learning and Discovery Statistical Approaches to Learning and Discovery Bayesian Model Selection Zoubin Ghahramani & Teddy Seidenfeld zoubin@cs.cmu.edu & teddy@stat.cmu.edu CALD / CS / Statistics / Philosophy Carnegie Mellon

More information

Mixed Membership Stochastic Blockmodels

Mixed Membership Stochastic Blockmodels Mixed Membership Stochastic Blockmodels Journal of Machine Learning Research, 2008 by E.M. Airoldi, D.M. Blei, S.E. Fienberg, E.P. Xing as interpreted by Ted Westling STAT 572 Intro Talk April 22, 2014

More information

Nonparametric Bayesian Matrix Factorization for Assortative Networks

Nonparametric Bayesian Matrix Factorization for Assortative Networks Nonparametric Bayesian Matrix Factorization for Assortative Networks Mingyuan Zhou IROM Department, McCombs School of Business Department of Statistics and Data Sciences The University of Texas at Austin

More information

Conditional Random Fields and beyond DANIEL KHASHABI CS 546 UIUC, 2013

Conditional Random Fields and beyond DANIEL KHASHABI CS 546 UIUC, 2013 Conditional Random Fields and beyond DANIEL KHASHABI CS 546 UIUC, 2013 Outline Modeling Inference Training Applications Outline Modeling Problem definition Discriminative vs. Generative Chain CRF General

More information

Does Better Inference mean Better Learning?

Does Better Inference mean Better Learning? Does Better Inference mean Better Learning? Andrew E. Gelfand, Rina Dechter & Alexander Ihler Department of Computer Science University of California, Irvine {agelfand,dechter,ihler}@ics.uci.edu Abstract

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 25: Markov Chain Monte Carlo (MCMC) Course Review and Advanced Topics Many figures courtesy Kevin

More information

Dynamic Probabilistic Models for Latent Feature Propagation in Social Networks

Dynamic Probabilistic Models for Latent Feature Propagation in Social Networks Dynamic Probabilistic Models for Latent Feature Propagation in Social Networks Creighton Heaukulani and Zoubin Ghahramani University of Cambridge TU Denmark, June 2013 1 A Network Dynamic network data

More information

Variational inference in the conjugate-exponential family

Variational inference in the conjugate-exponential family Variational inference in the conjugate-exponential family Matthew J. Beal Work with Zoubin Ghahramani Gatsby Computational Neuroscience Unit August 2000 Abstract Variational inference in the conjugate-exponential

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 Exam policy: This exam allows two one-page, two-sided cheat sheets; No other materials. Time: 2 hours. Be sure to write your name and

More information

Mixtures and Hidden Markov Models for genomics data

Mixtures and Hidden Markov Models for genomics data Mixtures and Hidden Markov Models for genomics data M.-L. Martin-Magniette marie laure.martin@agroparistech.fr Researcher at the French National Institut for Agricultural Research Plant Science Institut

More information

Topics in Network Models. Peter Bickel

Topics in Network Models. Peter Bickel Topics in Network Models MSU, Sepember, 2012 Peter Bickel Statistics Dept. UC Berkeley (Joint work with S. Bhattacharyya UC Berkeley, A. Chen Google, D. Choi UC Berkeley, E. Levina U. Mich and P. Sarkar

More information

CS Lecture 19. Exponential Families & Expectation Propagation

CS Lecture 19. Exponential Families & Expectation Propagation CS 6347 Lecture 19 Exponential Families & Expectation Propagation Discrete State Spaces We have been focusing on the case of MRFs over discrete state spaces Probability distributions over discrete spaces

More information

Some Statistical Models and Algorithms for Change-Point Problems in Genomics

Some Statistical Models and Algorithms for Change-Point Problems in Genomics Some Statistical Models and Algorithms for Change-Point Problems in Genomics S. Robin UMR 518 AgroParisTech / INRA Applied MAth & Comput. Sc. Journées SMAI-MAIRCI Grenoble, September 2012 S. Robin (AgroParisTech

More information

Chapter 20. Deep Generative Models

Chapter 20. Deep Generative Models Peng et al.: Deep Learning and Practice 1 Chapter 20 Deep Generative Models Peng et al.: Deep Learning and Practice 2 Generative Models Models that are able to Provide an estimate of the probability distribution

More information

A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation

A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation Yee Whye Teh Gatsby Computational Neuroscience Unit University College London 17 Queen Square, London WC1N 3AR, UK ywteh@gatsby.ucl.ac.uk

More information

Bayesian Inference Course, WTCN, UCL, March 2013

Bayesian Inference Course, WTCN, UCL, March 2013 Bayesian Course, WTCN, UCL, March 2013 Shannon (1948) asked how much information is received when we observe a specific value of the variable x? If an unlikely event occurs then one would expect the information

More information

ECE521 Tutorial 11. Topic Review. ECE521 Winter Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides. ECE521 Tutorial 11 / 4

ECE521 Tutorial 11. Topic Review. ECE521 Winter Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides. ECE521 Tutorial 11 / 4 ECE52 Tutorial Topic Review ECE52 Winter 206 Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides ECE52 Tutorial ECE52 Winter 206 Credits to Alireza / 4 Outline K-means, PCA 2 Bayesian

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project

More information

Properties of Latent Variable Network Models

Properties of Latent Variable Network Models Properties of Latent Variable Network Models Riccardo Rastelli and Nial Friel University College Dublin Adrian E. Raftery University of Washington Technical Report no. 634 Department of Statistics University

More information

Bayesian Learning in Undirected Graphical Models

Bayesian Learning in Undirected Graphical Models Bayesian Learning in Undirected Graphical Models Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London, UK http://www.gatsby.ucl.ac.uk/ Work with: Iain Murray and Hyun-Chul

More information

Naïve Bayes classification

Naïve Bayes classification Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss

More information

Clustering bi-partite networks using collapsed latent block models

Clustering bi-partite networks using collapsed latent block models Clustering bi-partite networks using collapsed latent block models Jason Wyse, Nial Friel & Pierre Latouche Insight at UCD Laboratoire SAMM, Université Paris 1 Mail: jason.wyse@ucd.ie Insight Latent Space

More information

STA 414/2104: Machine Learning

STA 414/2104: Machine Learning STA 414/2104: Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistics! rsalakhu@cs.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 9 Sequential Data So far

More information

p L yi z n m x N n xi

p L yi z n m x N n xi y i z n x n N x i Overview Directed and undirected graphs Conditional independence Exact inference Latent variables and EM Variational inference Books statistical perspective Graphical Models, S. Lauritzen

More information

Clustering. Professor Ameet Talwalkar. Professor Ameet Talwalkar CS260 Machine Learning Algorithms March 8, / 26

Clustering. Professor Ameet Talwalkar. Professor Ameet Talwalkar CS260 Machine Learning Algorithms March 8, / 26 Clustering Professor Ameet Talwalkar Professor Ameet Talwalkar CS26 Machine Learning Algorithms March 8, 217 1 / 26 Outline 1 Administration 2 Review of last lecture 3 Clustering Professor Ameet Talwalkar

More information

Another Walkthrough of Variational Bayes. Bevan Jones Machine Learning Reading Group Macquarie University

Another Walkthrough of Variational Bayes. Bevan Jones Machine Learning Reading Group Macquarie University Another Walkthrough of Variational Bayes Bevan Jones Machine Learning Reading Group Macquarie University 2 Variational Bayes? Bayes Bayes Theorem But the integral is intractable! Sampling Gibbs, Metropolis

More information

Dispersion modeling for RNAseq differential analysis

Dispersion modeling for RNAseq differential analysis Dispersion modeling for RNAseq differential analysis E. Bonafede 1, F. Picard 2, S. Robin 3, C. Viroli 1 ( 1 ) univ. Bologna, ( 3 ) CNRS/univ. Lyon I, ( 3 ) INRA/AgroParisTech, Paris IBC, Victoria, July

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 218 Outlines Overview Introduction Linear Algebra Probability Linear Regression 1

More information

Image segmentation combining Markov Random Fields and Dirichlet Processes

Image segmentation combining Markov Random Fields and Dirichlet Processes Image segmentation combining Markov Random Fields and Dirichlet Processes Jessica SODJO IMS, Groupe Signal Image, Talence Encadrants : A. Giremus, J.-F. Giovannelli, F. Caron, N. Dobigeon Jessica SODJO

More information

Graph Detection and Estimation Theory

Graph Detection and Estimation Theory Introduction Detection Estimation Graph Detection and Estimation Theory (and algorithms, and applications) Patrick J. Wolfe Statistics and Information Sciences Laboratory (SISL) School of Engineering and

More information

Degree-based goodness-of-fit tests for heterogeneous random graph models : independent and exchangeable cases

Degree-based goodness-of-fit tests for heterogeneous random graph models : independent and exchangeable cases arxiv:1507.08140v2 [math.st] 30 Jan 2017 Degree-based goodness-of-fit tests for heterogeneous random graph models : independent and exchangeable cases Sarah Ouadah 1,2,, Stéphane Robin 1,2, Pierre Latouche

More information

Probabilistic Graphical Models

Probabilistic Graphical Models School of Computer Science Probabilistic Graphical Models Variational Inference II: Mean Field Method and Variational Principle Junming Yin Lecture 15, March 7, 2012 X 1 X 1 X 1 X 1 X 2 X 3 X 2 X 2 X 3

More information

Latent Variable Models and EM algorithm

Latent Variable Models and EM algorithm Latent Variable Models and EM algorithm SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic 3.1 Clustering and Mixture Modelling K-means and hierarchical clustering are non-probabilistic

More information

The Variational Bayesian EM Algorithm for Incomplete Data: with Application to Scoring Graphical Model Structures

The Variational Bayesian EM Algorithm for Incomplete Data: with Application to Scoring Graphical Model Structures BAYESIAN STATISTICS 7, pp. 000 000 J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith and M. West (Eds.) c Oxford University Press, 2003 The Variational Bayesian EM

More information

STATS 306B: Unsupervised Learning Spring Lecture 2 April 2

STATS 306B: Unsupervised Learning Spring Lecture 2 April 2 STATS 306B: Unsupervised Learning Spring 2014 Lecture 2 April 2 Lecturer: Lester Mackey Scribe: Junyang Qian, Minzhe Wang 2.1 Recap In the last lecture, we formulated our working definition of unsupervised

More information

Outline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution

Outline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution Outline A short review on Bayesian analysis. Binomial, Multinomial, Normal, Beta, Dirichlet Posterior mean, MAP, credible interval, posterior distribution Gibbs sampling Revisit the Gaussian mixture model

More information

Pattern Recognition and Machine Learning

Pattern Recognition and Machine Learning Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression

More information

Note Set 5: Hidden Markov Models

Note Set 5: Hidden Markov Models Note Set 5: Hidden Markov Models Probabilistic Learning: Theory and Algorithms, CS 274A, Winter 2016 1 Hidden Markov Models (HMMs) 1.1 Introduction Consider observed data vectors x t that are d-dimensional

More information

Bayesian Machine Learning - Lecture 7

Bayesian Machine Learning - Lecture 7 Bayesian Machine Learning - Lecture 7 Guido Sanguinetti Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh gsanguin@inf.ed.ac.uk March 4, 2015 Today s lecture 1

More information

Naïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability

Naïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability Probability theory Naïve Bayes classification Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s height, the outcome of a coin toss Distinguish

More information

Collapsed Variational Dirichlet Process Mixture Models

Collapsed Variational Dirichlet Process Mixture Models Collapsed Variational Dirichlet Process Mixture Models Kenichi Kurihara Dept. of Computer Science Tokyo Institute of Technology, Japan kurihara@mi.cs.titech.ac.jp Max Welling Dept. of Computer Science

More information

Gaussian Mixture Model

Gaussian Mixture Model Case Study : Document Retrieval MAP EM, Latent Dirichlet Allocation, Gibbs Sampling Machine Learning/Statistics for Big Data CSE599C/STAT59, University of Washington Emily Fox 0 Emily Fox February 5 th,

More information

14 : Theory of Variational Inference: Inner and Outer Approximation

14 : Theory of Variational Inference: Inner and Outer Approximation 10-708: Probabilistic Graphical Models 10-708, Spring 2017 14 : Theory of Variational Inference: Inner and Outer Approximation Lecturer: Eric P. Xing Scribes: Maria Ryskina, Yen-Chia Hsu 1 Introduction

More information