Network Event Data over Time: Prediction and Latent Variable Modeling


1 Network Event Data over Time: Prediction and Latent Variable Modeling Padhraic Smyth University of California, Irvine Machine Learning with Graphs Workshop, July 25th, 2010

2 Acknowledgements PhD students: Arthur Asuncion, Chris DuBois, Jimmy Foulds. Funding: National Science Foundation; Office of Naval Research (MURI grant); NDSEG Graduate Fellowship; Yahoo!, Google, IBM, Microsoft, Experian. Padhraic Smyth: MLG Workshop, KDD 2010: 2

3 Resources A survey of statistical network models, A. Goldenberg, A. Zheng, S. Fienberg, E. Airoldi, Foundations and Trends in Machine Learning, 2009. Multiplicative latent factor models for description and prediction of social networks, P. D. Hoff, Computational and Mathematical Organization Theory, 2009. Random effects models for network data, P. D. Hoff, in Dynamic Social Network Modeling and Analysis, 2003. A relational event model for social action, C. E. Butts, Sociological Methodology, 2008. Slides from the 2010 Whistler Summer School on Social Networks. Padhraic Smyth: MLG Workshop, KDD 2010: 3

4 Static Network Data General notation: N actors (node set); we assume the set of actors is known and fixed. Edges between actors (Y): adjacency matrix Y, where y_ij indicates an edge between actor i and actor j; simplest case: binary undirected/directed edges. Covariates/attributes (X): e.g., for each actor (age, text documents, ...) or for each edge (numeric weights, a vector of attributes, text, etc.). Padhraic Smyth: MLG Workshop, KDD 2010: 4

5 Dynamic Network Data Case 1: discrete time. Y_t represents the state of the network at discrete time t; data D = {Y_1, ..., Y_t, ..., Y_T}. Example: actors = students in a school; Y_t = friendships between students in month t, t = 1, ..., 12. Interest is often in network dynamics and evolution, e.g., Markov models for P(Y_{t+1} | Y_t). Padhraic Smyth: MLG Workshop, KDD 2010: 5

6 Carter Butts Padhraic Smyth: MLG Workshop, KDD 2010: 6

7 Dynamic Network Data Case 2: continuous-time network events. y_t is an edge between some pair i and j at time t. Birth-death edges: each y_t has a start and end time. Flipping edges: edges can switch on or off. Instantaneous edges: each y_t is (effectively) instantaneous. Data D = {y_1, ..., y_t, ..., y_T}; in a sense there is no graph. Example: actors = students in a school; y_t = an email between 2 students at time t (would need to allow for multiple recipients). Interest is often in rates and patterns of communication, e.g., Poisson rates for y_ij given the network history up to time t. Padhraic Smyth: MLG Workshop, KDD 2010: 7

8 Enron Data (Figure from Goldenberg et al, 2010) Padhraic Smyth: MLG Workshop, KDD 2010: 8

9 Time 1 Padhraic Smyth: MLG Workshop, KDD 2010: 9

10 Time 1 Padhraic Smyth: MLG Workshop, KDD 2010: 10

11 Time 2 Padhraic Smyth: MLG Workshop, KDD 2010: 11

12 Time 50 Padhraic Smyth: MLG Workshop, KDD 2010: 12

13 Relational Event Model (Butts, 2008) λ(i,j) = Poisson rate of edge generation between actor i and actor j; λ_t(i,j) = a function of network features up to time t. Results in a piecewise-constant inhomogeneous Poisson process: the rates at time t are a function of the network history at time t, and between events the rates are constant. Typical features include: individual actor effects, persistence between pairs, preferential attachment, conversational behavior. Padhraic Smyth: MLG Workshop, KDD 2010: 13

14 Relational Event Model Example: log λ_t(i,j) = log λ_0 + log λ_i + log λ_j + β′ x_t(i,j), where β is a p-dim vector of weights and x_t(i,j) is a p-dim vector of features. Estimation: can be fit with standard regression methods (survival analysis), but the likelihood involves O(N^2) terms for each of T events, so it does not scale well. Nonetheless an interesting model; see Butts (2008) for an application to emergency response communications. Padhraic Smyth: MLG Workshop, KDD 2010: 14

15 Ordinal Version of Relational Events If we don't have time-stamps, but do have the order of the events, we can use the fact that the choice probability can be written as P(i,j) = λ_t(i,j) / Σ_(k,l) λ_t(k,l). Can still learn the model from the sequence of events, with relative rates; the overall network rate λ_0 is unspecified. Padhraic Smyth: MLG Workshop, KDD 2010: 15
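To make the rate parameterization on slide 14 concrete, here is a minimal Python sketch (the function names, toy features, and parameter values are illustrative assumptions, not Butts' code): it computes the piecewise-constant pairwise rates from the log-linear form, then the choice probabilities used in the ordinal version, where the overall rate cancels.

```python
import numpy as np

def log_rate(i, j, log_lam0, actor_log_rates, beta, x):
    # log lambda_t(i,j) = log lambda_0 + log lambda_i + log lambda_j
    #                     + beta' x_t(i,j)
    return (log_lam0 + actor_log_rates[i] + actor_log_rates[j]
            + beta @ x[i, j])

# Toy setup: 4 actors, p = 2 network-history features per pair
# (all values here are made up for illustration).
rng = np.random.default_rng(0)
N, p = 4, 2
actor_log_rates = rng.normal(size=N)
beta = np.array([0.5, -0.2])
x = rng.normal(size=(N, N, p))          # x_t(i, j): features up to time t

rates = np.exp([[log_rate(i, j, -1.0, actor_log_rates, beta, x)
                 for j in range(N)] for i in range(N)])
np.fill_diagonal(rates, 0.0)            # no self-events

# Ordinal version: only the order of events is observed, so work with
# choice probabilities P(i,j) = lambda_t(i,j) / sum of all pair rates;
# the overall network rate cancels in this ratio.
choice_probs = rates / rates.sum()
```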

16 Additional Modeling Aspects As with static networks: actor attributes (e.g., actor age) and edge (event) attributes (e.g., the text of an email). Can also have time-dependent covariates/attributes, e.g., actor attributes changing over time. Network-level external covariates: calendar effects (time of day, day of week, time of year), external events, exogenous time series. Padhraic Smyth: MLG Workshop, KDD 2010: 16

17 Outline of the Talk Begin with statistical models for static data, in particular latent variable models, and review some useful approaches in this area. Look at how to extend these models to temporal data, particularly relational event data, and discuss recent work. [Caveat: only a focus on certain approaches, not exhaustive.] Evaluation and prediction: some general comments. Mostly review, with some new work towards the end. Padhraic Smyth: MLG Workshop, KDD 2010: 17

18 Why Statistical Modeling? Learning: can estimate network properties from data in a principled way. Prediction/querying: reduces to computation of the relevant conditional probabilities and expectations. Noise/missing data: a systematic way to handle real-world noise. Covariates: relatively straightforward to integrate non-network information into the network model. Padhraic Smyth: MLG Workshop, KDD 2010: 18

19 Slide from Dave Hunter Padhraic Smyth: MLG Workshop, KDD 2010: 19

20 Estimation is Hard P(G | θ) = f(G; θ) / normalization constant, where the normalization constant is a sum over all possible graphs. Say binary directed graphs: how many graphs? 2^(n(n-1)); e.g., with n = 50, we have 2^2450 ≈ 10^737 graphs to sum over. MCMC techniques are now the method of choice, but there are many problems with degeneracy of likelihoods, and these are difficult models to fit (e.g., see Robins et al, Social Networks, 2007). Padhraic Smyth: MLG Workshop, KDD 2010: 20
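The count on this slide can be checked directly; a one-liner, working in log space since the number itself overflows any machine integer type worth printing:

```python
from math import log10

n = 50
exponent = n * (n - 1)   # number of possible directed edges
print(f"2^{exponent} ~ 10^{exponent * log10(2):.0f} directed graphs")
# -> 2^2450 ~ 10^737: enumerating them for the normalizer is hopeless
```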

21 An Alternative: Latent Variable Models ERGMs allow us to model edge dependencies in very flexible ways, but the computational penalty is too high. Latent variable models: typically, latent variables are chosen so that edges are conditionally independent given the latent variables, which can lead to much simpler models than full ERGMs. If we can find useful and tractable latent variable representations, this may provide a good alternative to ERGMs. Padhraic Smyth: MLG Workshop, KDD 2010: 21

22 Example: The Latent Space Model (Hoff, Raftery, Handcock, 2002) Idea: embed nodes in a latent K-dimensional Euclidean space; probability of edge (i,j) = f(distance(i,j)); edges are conditionally independent given the K-dim locations. Padhraic Smyth: MLG Workshop, KDD 2010: 22

23 Example: The Latent Space Model (Hoff, Raftery, Handcock, 2002) Idea: embed nodes in a latent K-dimensional Euclidean space; probability of edge (i,j) = f(distance(i,j)); edges are conditionally independent given the K-dim locations. Probability model: z_i = K-dim latent position vector for node i; log-odds(y_ij = 1) = log[ P(y_ij = 1) / (1 − P(y_ij = 1)) ] = −|z_i − z_j| + µ + β′ x_ij, where |z_i − z_j| is the distance between nodes i and j, µ is a network density parameter, and β′ x_ij captures (optional) covariate effects. Padhraic Smyth: MLG Workshop, KDD 2010: 23

24 Example: The Latent Space Model (Hoff, Raftery, Handcock, 2002) Likelihood: P(Y | Z, µ, β) = Π_{i≠j} P(y_ij | z_i, z_j, µ, β), where each term is a logistic function of the log-odds above. Estimation: can maximize the likelihood directly (as a function of Z, µ, β) using gradient methods; can also be Bayesian, use priors, and sample from the posterior density using MCMC; can also introduce block/cluster structure on the nodes (see Handcock, Raftery, and Tantrum, 2007). Computational issues: the product above is over all pairs, O(N^2), giving poor scalability; recent work (Raftery et al, 2010) shows how to ignore many non-edges. Padhraic Smyth: MLG Workshop, KDD 2010: 24
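A minimal sketch of the latent space likelihood, assuming no covariates (function names and toy values are mine, not Hoff et al's code): each edge is Bernoulli with log-odds −|z_i − z_j| + µ, and the O(N^2) cost shows up as the full N x N distance matrix.

```python
import numpy as np

def edge_logits(Z, mu):
    # Log-odds of each edge: -||z_i - z_j|| + mu (covariates omitted)
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    return -D + mu

def log_likelihood(Y, Z, mu):
    # Bernoulli log-likelihood over all ordered pairs i != j:
    # sum of y_ij * logit_ij - log(1 + exp(logit_ij))
    logits = edge_logits(Z, mu)
    ll = Y * logits - np.log1p(np.exp(logits))
    mask = ~np.eye(Y.shape[0], dtype=bool)
    return ll[mask].sum()

# Toy check: 10 nodes with made-up K = 2 latent positions.
rng = np.random.default_rng(1)
Z = rng.normal(size=(10, 2))
probs = 1.0 / (1.0 + np.exp(-edge_logits(Z, mu=0.5)))
Y = (rng.uniform(size=probs.shape) < probs).astype(int)
print(log_likelihood(Y, Z, mu=0.5))
```

Maximizing this quantity over Z and µ with any gradient routine gives the direct MLE route mentioned on the slide.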

25 Figure from Hoff, Raftery, Handcock, 2002 Padhraic Smyth: MLG Workshop, KDD 2010: 25

26 Figure from Hoff, Raftery, Handcock, 2002. Representational issue: is a Euclidean-space embedding a good way to represent network information? (Similar to issues with multidimensional scaling.) Padhraic Smyth: MLG Workshop, KDD 2010: 26

27 Example: Relational Topic Model (Chang and Blei, 2009) Nodes = documents, edges = links between documents. Standard LDA/topic model, but the topics of documents i and j influence p(edge i,j), and edge (i,j) influences the topics of i and j. The model is similar to the latent space model: in latent space, actors are represented by a k-dim location; in relational topics, docs are represented by a k-dim topic distribution; both use logistic-like links for edge probabilities. Padhraic Smyth: MLG Workshop, KDD 2010: 27

28 Relational Topic Model (RTM) [Chang, Blei, 2009] Same setup as LDA, except we have observed network information across the documents (an adjacency matrix). [Plate diagram: two LDA-style plates for documents d and d′ (topic proportions θ_d, topic indicators Z_id, words X_id, topics Φ), joined by a link probability function with parameters η, ν for the link variable y_{d,d′}.] Documents with similar topics are more likely to be linked: topics influence links, and links influence topics. Padhraic Smyth: MLG Workshop, KDD 2010: 28

29 Link probability functions: exponential, logistic, normal CDF, and normal, each a function of η′(z_d ∘ z_d′) + ν, where z_d is the K-dim vector of topic proportions for document d and ∘ denotes the element-wise product. Padhraic Smyth: MLG Workshop, KDD 2010: 29

30 Link Prediction with Wikipedia Movie Pages 'Sholay': Indian film, 45% of words belong to topic 24 (Hindi topic); top 5 most probable movie links in the training set: 'Laawaris', 'Hote Hote Pyaar Ho Gaya', 'Trishul', 'Mr. Natwarlal', 'Rangeela'. 'Cowboy': Western film, 25% of words belong to topic 7 (western topic); top 5: 'Tall in the Saddle', 'The Indian Fighter', 'Dakota', 'The Train Robbers', 'A Lady Takes a Chance'. 'Rocky II': boxing film, 40% of words belong to topic 47 (sports topic); top 5: 'Bull Durham', '2003 World Series', 'Bowfinger', 'Rocky V', 'Rocky IV'. Padhraic Smyth: MLG Workshop, KDD 2010: 30

31 Example: Stochastic Block Model (e.g., Nowicki and Snijders, 2001) Idea: partition the set of nodes into K blocks that are structurally equivalent, and model interactions at the K x K block level instead of the N x N actor level: P(y_ij) = P(y_{k_i, k_j}), with k_i, k_j ∈ {1, ..., K}. Padhraic Smyth: MLG Workshop, KDD 2010: 31

32 Example: Stochastic Block Model (e.g., Nowicki and Snijders, 2001) Idea: partition the set of nodes into K blocks that are structurally equivalent, and model interactions at the K x K block level instead of the N x N actor level: P(y_ij) = P(y_{k_i, k_j}), with k_i, k_j ∈ {1, ..., K}. (Figure from Goldenberg et al, 2010) Padhraic Smyth: MLG Workshop, KDD 2010: 32

33 Example: Stochastic Block Model (e.g., Nowicki and Snijders, 2001) Estimation: 2 sets of parameters: 1. B = the block-level interaction matrix, e.g., a K x K matrix of Bernoulli parameters; 2. Z = N indicator variables, mapping each node to one of the K blocks. (Can use your favorite estimation technique: EM, gradient, MCMC, etc.) See also the Infinite Relational Model (IRM), Kemp et al (2006), which allows one to learn the number of blocks. Padhraic Smyth: MLG Workshop, KDD 2010: 33
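As a concrete illustration of the two parameter sets B and Z, here is a short sketch that samples a network from a stochastic block model (the block count and probabilities below are made up):

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 30, 3
Z = rng.integers(K, size=N)        # indicator: block of each node
B = np.full((K, K), 0.05)          # K x K matrix of Bernoulli parameters
np.fill_diagonal(B, 0.4)           # denser within blocks (assortative)

P = B[Z[:, None], Z[None, :]]      # P(y_ij) depends only on the blocks
Y = (rng.uniform(size=(N, N)) < P).astype(int)
np.fill_diagonal(Y, 0)             # no self-edges
```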

34 Mixed Membership Stochastic Block Model (MMB) (Airoldi et al, 2008) Generalizes the stochastic block model to allow mixed membership. Specifically, replace the N indicator variables with N multinomials z_i, i = 1, ..., N; each multinomial is a distribution over the K blocks. Allows an actor to have multiple memberships, with probabilities z_i1, ..., z_iK. Padhraic Smyth: MLG Workshop, KDD 2010: 34

35 Mixed Membership Stochastic Block Model (MMB) (Airoldi et al, 2008) Generalizes the stochastic block model to allow mixed membership: replace the N indicator variables with N multinomials z_i, i = 1, ..., N, each a distribution over the K blocks, so an actor can have multiple memberships with probabilities z_i1, ..., z_iK. Generative model: for each actor, multinomial z_i ~ Dirichlet; for each possible edge, k_i ~ multinomial(z_i), k_j ~ multinomial(z_j), y_ij ~ B(k_i, k_j). Estimation: the likelihood involves O(N^2) terms; can use variational or MCMC methods. Padhraic Smyth: MLG Workshop, KDD 2010: 35
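The generative model above translates directly into code. A sketch with made-up hyperparameters; note that fresh block indicators k_i, k_j are drawn for every potential edge, which is exactly what distinguishes MMB from the single-membership block model:

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 30, 3
Pi = rng.dirichlet(np.full(K, 0.5), size=N)  # z_i ~ Dirichlet, per actor
B = 0.05 + 0.35 * np.eye(K)                  # block interaction probs

Y = np.zeros((N, N), dtype=int)
for i in range(N):
    for j in range(N):
        if i == j:
            continue
        k_i = rng.choice(K, p=Pi[i])   # sender's block for this edge
        k_j = rng.choice(K, p=Pi[j])   # receiver's block for this edge
        Y[i, j] = int(rng.uniform() < B[k_i, k_j])
```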

36 Mixed Membership Stochastic Blockmodel Stochastic Blockmodel Figures from Airoldi, et al, 2008 Padhraic Smyth: MLG Workshop, KDD 2010: 36

37 Binary Feature Relational Model (Miller, Griffiths, Jordan, 2009) Based on the idea of the Indian Buffet Process (Griffiths and Ghahramani, 2006): represent each object by a set of latent binary features, and learn binary features that explain the observed data well. Non-parametric: an infinite number of features, but in practice, given data, only a finite number are inferred. Motivation: classes defined over a combinatorial number of binary features (e.g., male high school musicians/athletes), different from MMB and different from latent space. Can apply this idea to network data: a latent variable model where p(edge i,j) is a function of i's and j's latent binary features. Padhraic Smyth: MLG Workshop, KDD 2010: 37

38 Relational Binary Feature Model for Networks (Miller, Griffiths, Jordan, 2009) [Figure from Griffiths and Ghahramani, 2006: a binary actors-by-hidden-features matrix.] The presence of an edge between actor i and actor j is (e.g.) a logistic function of a weighted sum of the features they have in common. Estimation: based on MCMC. Padhraic Smyth: MLG Workshop, KDD 2010: 38
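A sketch of the edge probability in the (finite) binary feature model: logistic of z_i′ W z_j, where entry W[k, l] contributes whenever actor i has feature k on and actor j has feature l on. The weight matrix and feature vectors below are made up for illustration, not taken from the paper.

```python
import numpy as np

def edge_prob(z_i, z_j, W):
    # P(edge) = logistic(z_i' W z_j); W[k, l] counts toward the sum
    # only when feature k is on for i and feature l is on for j.
    return 1.0 / (1.0 + np.exp(-(z_i @ W @ z_j)))

rng = np.random.default_rng(4)
K = 5
W = rng.normal(size=(K, K))        # made-up feature-interaction weights
z_i = np.array([1, 0, 1, 0, 1])    # made-up binary feature vectors
z_j = np.array([1, 1, 0, 0, 1])
print(edge_prob(z_i, z_j, W))
```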

39 Predictions on NIPS Coauthorship Data From Miller, Griffiths, Jordan, 2009 Padhraic Smyth: MLG Workshop, KDD 2010: 39

40 A Unified View P(y_ij = 1) = f( g(z_i, z_j) + β′ x_ij + µ ) (see also Hoff, 2009; Airoldi, 2010), where f = the logistic function (for example); z_i, z_j = k x 1 latent vectors for the ith and jth nodes; g = a function that combines the latent vectors, with parameters θ; x_ij = a covariate vector for the pair of nodes; µ = a network density parameter. Padhraic Smyth: MLG Workshop, KDD 2010: 40

41 Examples Log-odds(y_ij = 1) = g(z_i, z_j) + β′ x_ij + µ. Latent space model: z_i, z_j = k x 1 vectors of latent positions in Euclidean space; g(z_i, z_j) = −|z_i − z_j|. Padhraic Smyth: MLG Workshop, KDD 2010: 41

42 Examples Log-odds(y_ij = 1) = g(z_i, z_j) + β′ x_ij + µ. Latent space model: z_i, z_j = k x 1 vectors of latent positions in Euclidean space; g(z_i, z_j) = −|z_i − z_j|. Latent factor model (see Hoff, 2008): z_i = a k x 1 real-valued vector; g(z_i, z_j) = z_i′ W z_j, where W is a k x k diagonal matrix. Padhraic Smyth: MLG Workshop, KDD 2010: 42

43 Examples Log-odds(y_ij = 1) = g(z_i, z_j) + β′ x_ij + µ. Latent space model: z_i, z_j = k x 1 vectors of latent positions in Euclidean space; g(z_i, z_j) = −|z_i − z_j|. Latent factor model (see Hoff, 2008): z_i = a k x 1 real-valued vector; g(z_i, z_j) = z_i′ W z_j, where W is a k x k diagonal matrix. Relational topic model: z_i = the k-dimensional topic distribution (multinomial) for document i; g(z_i, z_j) = a weighted element-wise product of the 2 topic vectors. Padhraic Smyth: MLG Workshop, KDD 2010: 43

44 Examples Log-odds(y_ij = 1) = g(z_i, z_j) + β′ x_ij + µ. Latent class or stochastic blockmodel: z_i = a fixed k-dimensional binary indicator vector, e.g., (0, 0, 1, 0, 0); g(z_i, z_j) = W_{z_i, z_j}, where W is a k x k matrix; the indicators select which element (block) to use. Padhraic Smyth: MLG Workshop, KDD 2010: 44

45 Examples Log-odds(y_ij = 1) = g(z_i, z_j) + β′ x_ij + µ. Latent class or stochastic blockmodel: z_i = a fixed k-dimensional binary indicator vector, e.g., (0, 0, 1, 0, 0); g(z_i, z_j) = W_{z_i, z_j}, where W is a k x k matrix; the indicators select which element (block) to use. Mixed membership stochastic blockmodel (MMB): like latent class, but z_i is sampled from actor i's multinomial. Padhraic Smyth: MLG Workshop, KDD 2010: 45

46 Examples Log-odds(y_ij = 1) = g(z_i, z_j) + β′ x_ij + µ. Latent class or stochastic blockmodel: z_i = a fixed k-dimensional binary indicator vector, e.g., (0, 0, 1, 0, 0); g(z_i, z_j) = W_{z_i, z_j}, where W is a k x k matrix; the indicators select which element (block) to use. Mixed membership stochastic blockmodel (MMB): like latent class, but z_i is sampled from actor i's multinomial. Relational binary feature model (finite version): z_i = a k-dimensional binary vector, e.g., (1, 0, 1, 0, 1); g(z_i, z_j) = z_i′ W z_j, where W is a k x k matrix; the combination of 'on' features determines the pairwise effect. Padhraic Smyth: MLG Workshop, KDD 2010: 46
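The unified view is easy to express in code: one log-odds function with a pluggable g. A sketch (function names are mine, and the weights are made up) showing two of the g's from the slides above:

```python
import numpy as np

def log_odds(z_i, z_j, g, mu=0.0, x_ij=None, beta=None):
    # Unified form: log-odds(y_ij = 1) = g(z_i, z_j) + beta' x_ij + mu
    val = g(z_i, z_j) + mu
    if x_ij is not None:
        val = val + beta @ x_ij
    return val

# Latent space model: g = negative Euclidean distance
g_space = lambda zi, zj: -np.linalg.norm(zi - zj)

# Latent factor model: g = zi' W zj with diagonal W (weights made up)
W = np.diag([1.0, 0.5])
g_factor = lambda zi, zj: zi @ W @ zj

zi, zj = np.array([0.2, -1.0]), np.array([0.5, 0.3])
print(log_odds(zi, zj, g_space, mu=1.0), log_odds(zi, zj, g_factor))
```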

47 Adding Time General static form for latent variable models: log-odds(y_ij = 1) = g(z_i, z_j) + β′ x_ij + µ. One approach is to make the z's time-dependent, i.e., allow the latent features of each actor to change over time. An example: Gaussian linear motion models in z-space: Sarkar and Moore (2005) for actors' latent-space positions; Fu, Song, and Xing (2009) for actors' mixed membership vectors. Padhraic Smyth: MLG Workshop, KDD 2010: 47

48 Event Data and Latent Variables Data = a series of time-stamped binary directed events among actors. 2 processes we want to model: rates (e.g., Poisson) and choices (who connects to whom). A simple approach: each pair i,j (at time t) has a Poisson event rate λ_ij; the global network rate is λ = Σ λ_ij; P(y_ij) = λ_ij / λ, or λ_ij = P(y_ij) × λ. Here P(y_ij) is a multinomial with O(N^2) entries: given that an event will happen, which pair will it be? (Different from the binary y_ij variables we saw before.) Padhraic Smyth: MLG Workshop, KDD 2010: 48
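A sketch of the rate/choice decomposition with toy rates: superposing the pairwise Poisson processes gives a global rate λ, an exponential waiting time, and a multinomial over pairs for which event occurs.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 6
lam = rng.gamma(shape=0.5, scale=1.0, size=(N, N))  # made-up pair rates
np.fill_diagonal(lam, 0.0)

total = lam.sum()              # global network rate lambda
P = lam / total                # multinomial over the O(N^2) pairs

# Simulate the next event: exponential waiting time at the global rate,
# then a multinomial draw for which pair the event involves.
wait = rng.exponential(1.0 / total)
pair = rng.choice(N * N, p=P.ravel())
i, j = divmod(pair, N)
```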

49 Direct Estimation We could predict the likelihood of i and j communicating based directly on i and j's history: a multinomial with O(N^2) entries; can use smoothing to combat sparsity. Problems: data can be extremely sparse for large N, and smoothing is non-informative and does not borrow strength from the graph. Nonetheless this is a useful baseline when evaluating predictions. Historically, few papers evaluate models predictively, and even fewer compare their models to simple baselines. Padhraic Smyth: MLG Workshop, KDD 2010: 49
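The baseline itself is a few lines. A sketch using additive (Laplace) smoothing, which is just one choice of smoothing scheme, picked here for illustration:

```python
import numpy as np

def smoothed_pair_probs(counts, alpha=0.1):
    # counts[i, j] = number of past events from i to j; additive
    # smoothing gives unseen pairs a small nonzero probability.
    sm = counts + alpha
    np.fill_diagonal(sm, 0.0)       # disallow self-events
    return sm / sm.sum()
```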

50 Illustration of Sparsity: Frequency of Events per Pair of Actors International political events data (King, 2003) Padhraic Smyth: MLG Workshop, KDD 2010: 50

51 Mixtures for Relational Events (talk by Chris DuBois, Tuesday) Mixture model over events: first choose an event class k ~ π, k = 1, ..., K; then for event y_ij, sender i ~ φ(sender nodes | k) and receiver j ~ φ(receiver nodes | k). Parameters: π, a K x 1 multinomial giving the relative likelihood of the different event classes; φ(sender nodes | k) and φ(receiver nodes | k), 2K multinomials, each of size N. A simple model, similar to the model proposed by Sinkkonen et al, MLG 2008, and quite similar in spirit to the LDA/topic model for documents. Padhraic Smyth: MLG Workshop, KDD 2010: 51
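A sketch of this generative process (hyperparameters made up): each event draws a class, then a sender and a receiver from that class's multinomials.

```python
import numpy as np

rng = np.random.default_rng(6)
N, K, T = 20, 3, 100
pi = rng.dirichlet(np.ones(K))                 # event-class proportions
phi_send = rng.dirichlet(np.ones(N), size=K)   # sender dist. per class
phi_recv = rng.dirichlet(np.ones(N), size=K)   # receiver dist. per class

events = []
for _ in range(T):
    k = rng.choice(K, p=pi)                    # choose event class
    i = rng.choice(N, p=phi_send[k])           # choose sender
    j = rng.choice(N, p=phi_recv[k])           # choose receiver
    events.append((i, j))
```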

52 Marginal Product Mixture Model (MPMM) [Likelihood equation shown on slide.] Estimation: can use EM or collapsed Gibbs sampling; both are fast, and only need to loop over observed events (can ignore pairs where no events occurred). Extensions: modulate the choice process with a time-varying network rate; different types of events ('actions'); Markov (hidden) dependence on the selection of event class. Padhraic Smyth: MLG Workshop, KDD 2010: 52

53 Comparing MPMM and MMB MMB model: for every pair of actors, sample latent classes for i and j; given the latent classes, sample a binary edge or a count (e.g., Poisson). MPMM model: for every event, select the latent class of the event; given the latent class, sample i and j. Differences: MMB models whole graphs, but not individual events, so dynamics are from graph to graph (e.g., Fu, Song, Xing, 2009); MPMM models individual events, not whole graphs, which allows dynamics at the event level (e.g., Markov dependence of events). And inference in MPMM is much more tractable. Padhraic Smyth: MLG Workshop, KDD 2010: 53

54 Estimation EM: straightforward and fast. MCMC (collapsed Gibbs sampling): also straightforward and fast. Both EM and Gibbs scale linearly in the number of observed events (edges), so the model is easy to apply to large data sets. Padhraic Smyth: MLG Workshop, KDD 2010: 54
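A minimal EM iteration for the MPMM, illustrating why it scales linearly: the E-step touches only the observed (sender, receiver) events, never the O(N^2) pairs with no events. This is a sketch under my own parameterization, not the authors' code.

```python
import numpy as np

def em_step(events, pi, phi_send, phi_recv):
    K, N = phi_send.shape
    counts_k = np.zeros(K)
    counts_send = np.zeros((K, N))
    counts_recv = np.zeros((K, N))
    for (i, j) in events:
        # E-step: responsibility of each class for this event
        r = pi * phi_send[:, i] * phi_recv[:, j]
        r /= r.sum()
        counts_k += r
        counts_send[:, i] += r
        counts_recv[:, j] += r
    # M-step: renormalize the expected counts
    pi = counts_k / counts_k.sum()
    phi_send = counts_send / counts_send.sum(axis=1, keepdims=True)
    phi_recv = counts_recv / counts_recv.sum(axis=1, keepdims=True)
    return pi, phi_send, phi_recv
```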

55 Eckmann Email Data Set 200,000 emails, 2997 individuals, 82 days Padhraic Smyth: MLG Workshop, KDD 2010: 55

56 Email Data (Eckmann) Padhraic Smyth: MLG Workshop, KDD 2010: 56

57 International Relations Data (King, 2003) 40,000 events, 2700 actors, 171 action types Padhraic Smyth: MLG Workshop, KDD 2010: 57

58 Padhraic Smyth: MLG Workshop, KDD 2010: 58

59 Prediction and Evaluation Use future data to evaluate predictive power and compare models. Metrics: log score = the log probability of the events that actually occurred; Brier/MSE-style scores; ranking/ROC scores. Padhraic Smyth: MLG Workshop, KDD 2010: 59
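For instance, the log score on a held-out event stream is simple to compute given any model that outputs a multinomial over pairs (a sketch; P here is assumed to be such a probability matrix):

```python
import numpy as np

def log_score(test_events, P):
    # Sum of log P(i, j) over the held-out events that actually
    # occurred; higher (less negative) is better.
    return float(sum(np.log(P[i, j]) for (i, j) in test_events))
```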

60 Padhraic Smyth: MLG Workshop, KDD 2010: 60

61 Padhraic Smyth: MLG Workshop, KDD 2010: 61

62 Comments on Evaluation Prediction on independent test data is critical: relatively easy to do with dynamic networks, tricky to do with static networks (but see Hoff, 2009). Caveat: for link (or link probability) prediction it can be very difficult to beat relatively simple baselines, e.g., Graph(t+1) = Graph(t), or p(event) = a smoothed estimate based on the historical frequency of that pair. Solution? Ask more interesting questions than just predicting what happens next, e.g.: How likely is it that group A will communicate with group B in the next k days? If we have events with missing information, can we infer the sender/receiver? Can we detect significant shifts/non-stationarity? Padhraic Smyth: MLG Workshop, KDD 2010: 62

63 What Next? Historically, social science applications of network analysis focused on understanding rather than prediction per se; for data miners/computer scientists, predictive modeling plays a much more important role. Key question: what are the important applications/problems that network/graph models can solve that can't be solved by other means? Candidates? E.g., tools for egocentric modeling/analysis/management of personal communication data (email, social media, etc.); change detection; ranking of incoming communication events. Padhraic Smyth: MLG Workshop, KDD 2010: 63

64 Summary Latent variable models are useful for network modeling/prediction: a broad toolbox of building blocks, and may be scalable to large data sets. Latent models for dynamic data show promise. Dynamic network data comes in multiple forms: aggregated/longitudinal data and time-stamped event data (quite different in nature). Models need to be evaluated via prediction on test data. Padhraic Smyth: MLG Workshop, KDD 2010: 64

65 Resources A survey of statistical network models, A. Goldenberg, A. Zheng, S. Fienberg, E. Airoldi, Foundations and Trends in Machine Learning, 2009. Multiplicative latent factor models for description and prediction of social networks, P. D. Hoff, Computational and Mathematical Organization Theory, 2009. Random effects models for network data, P. D. Hoff, in Dynamic Social Network Modeling and Analysis, 2003. A relational event model for social action, C. E. Butts, Sociological Methodology, 2008. Slides from the 2010 Whistler Summer School on Social Networks. Padhraic Smyth: MLG Workshop, KDD 2010: 65
