Bayesian estimation of complex networks and dynamic choice in the music industry


1 Bayesian estimation of complex networks and dynamic choice in the music industry
Stefano Nasini, Víctor Martínez-de-Albéniz
Dept. of Production, Technology and Operations Management, IESE Business School, University of Navarra, Barcelona, Spain

2 Outline
- Multidimensional Gaussian reduction
- The exponential family of distributions
- Numerical results
- Goodness of fit

3 Artist goods: the music broadcasting industry
Artist goods have life cycles that resemble clothing fashion trends, with a time window in which their popularity increases shortly after their premiere and then decreases. This is due to network externalities in individual preferences and opinions.

4 Artist goods: the music broadcasting industry
A data set of songs played on TV channels and radio stations.
[Table: numbers of broadcasting companies, artists, and songs for Germany and the UK; both panels cover 163 weeks of data.]

5 Artist goods: the music broadcasting industry
A song's popularity increases after its premiere and then decreases.
[Figure, four panels: (a) B. Mars, "Just the Way You Are" in Germany; (b) B. Mars, "Locked Out of Heaven" in Germany; (c) B. Mars, "Just the Way You Are" in the UK; (d) B. Mars, "Locked Out of Heaven" in the UK.]

6 Artist goods: the music broadcasting industry
Correlated choices from different broadcasting companies.
[Two tables: Spearman's correlations among the dynamic plays of "Locked Out of Heaven" and of "Just the Way You Are" across BBC 1 Xtra, Capital FM, Kiss 100 FM, Metro Radio, Radio City, and Smooth Radio London.]

7 Artist goods: the music broadcasting industry
Our goal is a joint model that allows:
- Predicting the common life cycle of song diffusion within the music broadcasting industry.
- Detecting the structure of imitation and spillover between radio stations and TV channels, based on the observed correlations.
- Deciding which broadcasting company should launch a song in order to maximize its future number of plays.

8 The music broadcasting industry as a two-mode network
Notation:
- $\mathcal{R}$ := set of individuals (primary layer);
- $\mathcal{S}$ := set of items (secondary layer);
- $\mathcal{T}$ := set of time periods;
- $x_{st} = [x_{s1t}, x_{s2t}, \ldots, x_{s|\mathcal{R}|t}]^T \in \chi$ is the $|\mathcal{R}|$-dimensional connection profile of the $s$-th item at time $t$;
- $E \subseteq \mathcal{R} \times \mathcal{R}$ := set of connections between broadcasting companies.

9 The music broadcasting industry as a two-mode network
Spillover measurements to internalize cross-sectional dependency in the panel:

(i) $G^{(i)}_{hk}(x_{st}; x_{s,t-1}, \ldots, x_{s,t-\tau}) = \frac{1}{|E|\,\tau} \sum_{\ell=0}^{\tau-1} d_\ell \left( (x_{sht})^{u_h} (x_{sk(t-\ell)})^{u_k} \right)^{p}$;

(ii) $G^{(ii)}_{hk}(x_{st}; x_{s,t-1}, \ldots, x_{s,t-\tau}) = \frac{1}{|E|\,\tau} \sum_{\ell=0}^{\tau-1} d_\ell \left\| (x_{sht})^{u_h} - (x_{sk(t-\ell)})^{u_k} \right\|_2^{p}$.
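To make the two measurements concrete, here is a minimal numerical sketch in Python. The play counts, the geometric lag weights, and the default exponents are made up for illustration, and the $1/|E|$ normalization is left to the caller since it only matters when aggregating over pairs:

```python
import numpy as np

def spillover_i(x, h, k, tau, d, p=0.5, u=None):
    # product-type measurement G^(i)_hk, evaluated at the last period of the panel
    R, T = x.shape
    u = np.ones(R) if u is None else u
    t = T - 1
    return sum(d[l] * (x[h, t] ** u[h] * x[k, t - l] ** u[k]) ** p
               for l in range(tau)) / tau

def spillover_ii(x, h, k, tau, d, p=1.0, u=None):
    # distance-type measurement G^(ii)_hk: penalizes disagreement between h and k
    R, T = x.shape
    u = np.ones(R) if u is None else u
    t = T - 1
    return sum(d[l] * abs(x[h, t] ** u[h] - x[k, t - l] ** u[k]) ** p
               for l in range(tau)) / tau

x = np.array([[2., 5, 9, 12, 8, 4],    # toy panel for one song:
              [1., 4, 10, 11, 9, 5],   # rows = companies, columns = weeks
              [0., 1, 2, 2, 1, 1]])
d = 0.5 ** np.arange(3)                # assumed geometrically decaying lag weights
print(spillover_i(x, 0, 1, tau=3, d=d))   # companies 0 and 1 co-move: large value
print(spillover_ii(x, 0, 2, tau=3, d=d))  # companies 0 and 2 diverge: large distance
```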

10 The exponential family of distributions

$$P(x_{st} \mid x_{s,t-1}, \ldots, x_{s,t-\tau}) \;\propto\; h(x_{st}) \exp\left( \alpha_{st} S_{st} + \sum_{r \in \mathcal{R}} \beta_r R_r + \sum_{(h,k) \in E} \gamma_{hk} G_{hk} \right)$$

- $S_{st}$ accounts for the size effect of each item in the secondary layer, for $s \in \mathcal{S}$;
- $R_r$ accounts for the size effect of each individual in the primary layer, for $r \in \mathcal{R}$;
- $G_{hk}$ internalizes the one-mode projection onto the primary layer, for $(h,k) \in E$.

Underlying measure: either $h(x_{st}) = \prod_{r \in \mathcal{R}} \frac{1}{x_{srt}!}$ or $h(x_{st}) = (2\pi)^{-\frac{(\tau+1)|\mathcal{R}|}{2}}$.

11 The exponential family of distributions
The spillover measurement $G_{hk}$ plays an important role. The full conditional of a single component is

$$P(x_{srt} \mid x_{sr't'} \text{ such that } r' \neq r,\, t' < t) \;\propto\; \frac{1}{x_{srt}!} \exp\left( \begin{bmatrix} \alpha_{st} + \beta_r \\ \eta \end{bmatrix}^T \begin{bmatrix} x_{srt} \\ C(x_{srt}) \end{bmatrix} \right),$$

where, for measurement (i),

$$\eta = \frac{1}{|E|\,\tau} \sum_{k \in \mathcal{R}} \sum_{l=1}^{\tau} \gamma_{rk}\, (x_{sk(t-l)})^{p} \quad \text{and} \quad C(x_{srt}) = (x_{srt})^{p},$$

and, for measurement (ii),

$$\eta = \frac{1}{|E|\,\tau} \begin{bmatrix} \gamma_{r1} \\ \vdots \\ \gamma_{rn} \end{bmatrix} \quad \text{and} \quad C(x_{srt}) = \begin{bmatrix} \sum_{l=1}^{\tau} d_l \left\| (x_{srt})^{u_r} - (x_{s1(t-l)})^{u_1} \right\|_2^{p} \\ \vdots \\ \sum_{l=1}^{\tau} d_l \left\| (x_{srt})^{u_r} - (x_{sn(t-l)})^{u_n} \right\|_2^{p} \end{bmatrix}.$$

12 The exponential family of distributions
Bivariate examples of the two spillover measurements, for $\alpha = 1$ and $\gamma = 1$:
measurement (i): $\frac{1}{x!\,y!} \exp\big(\alpha(x+y) + \gamma\,(xy)^{1/2}\big)$; measurement (ii): $\frac{1}{x!\,y!} \exp\big(\alpha(x+y) + \gamma\,|x-y|\big)$.
[Figure: contour plots of the two densities.]

13 Multidimensional Gaussian reduction
Under special conditions:

$$P(x_{st} \mid x_{s,t-1}, \ldots, x_{s,t-\tau}) \;\propto\; h(x_{st}) \exp\left( \alpha_{st} S_{st} + \sum_{r \in \mathcal{R}} \beta_r R_r + \sum_{(h,k) \in E} \gamma_{hk} G_{hk} \right)$$

- $G_{hk}(x_{st}; x_{s,t-1}, \ldots, x_{s,t-\tau}) = \sum_{l=0}^{\tau} d_l \, x_{sht}\, x_{sk(t-l)}$;
- $h(x_{st}) = (2\pi)^{-\frac{(\tau+1)|\mathcal{R}|}{2}}$;

the joint distribution reduces to

$$\begin{bmatrix} X_{st} \\ \vdots \\ X_{s,t-\tau} \end{bmatrix} \sim \mathcal{N}(\mu, \Sigma), \quad \text{where} \quad \mu = \Sigma \begin{bmatrix} \alpha_{st}\, e + \beta \\ \vdots \\ \alpha_{s,t-\tau}\, e + \beta \end{bmatrix} \quad \text{and} \quad \Sigma = -\frac{1}{2} \begin{bmatrix} d_0 \Gamma & \cdots & d_\tau \Gamma \\ \vdots & \ddots & \vdots \\ d_\tau \Gamma & \cdots & d_0 \Gamma \end{bmatrix}^{-1}.$$
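A minimal sketch of this reduction, with toy dimensions and a $\Gamma$ deliberately chosen near $-I$ so that the block-Toeplitz matrix is negative definite and $\Sigma$ is a valid covariance (all numerical values are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
R, tau = 4, 2                        # companies and lags (toy sizes)

# gamma_hk coefficients; kept close to -I so Sigma below is positive definite
Gamma = -np.eye(R) + 0.05 * rng.standard_normal((R, R))
Gamma = (Gamma + Gamma.T) / 2
d = np.array([1.0, 0.3, 0.1])        # lag weights d_0, ..., d_tau

# block-Toeplitz matrix [[d_0 G, d_1 G, d_2 G], [d_1 G, d_0 G, d_1 G], ...]
A = np.block([[d[abs(i - j)] * Gamma for j in range(tau + 1)]
              for i in range(tau + 1)])
Sigma = -0.5 * np.linalg.inv(A)      # Sigma = -(1/2) A^{-1}
assert np.all(np.linalg.eigvalsh(Sigma) > 0)

alpha = np.array([0.5, 0.4, 0.3])    # alpha_{s,t}, alpha_{s,t-1}, alpha_{s,t-2}
beta = 0.1 * rng.standard_normal(R)
mu = Sigma @ np.concatenate([a * np.ones(R) + beta for a in alpha])

draw = rng.multivariate_normal(mu, Sigma)   # one draw of (X_st, ..., X_{s,t-tau})
print(draw.round(2))
```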

14 The exponential family of distributions
Why is our model an extension of the ERGM?

Exponential family: whenever the density of a random variable may be written $f(x) \propto h(x) \exp\{\theta^T C(x)\}$, the family of all such random variables (over all possible $\theta$) is called an exponential family.

Exponential Random Graph Model (ERGM): $P_\theta(X = x) = \frac{\exp\{\theta^T C(x)\}}{Z(\theta)}$, where
- $X$ is a random network on $n$ nodes (a matrix of 0's and 1's);
- $\theta$ is a vector of parameters;
- $C(x)$ is a known vector of graph statistics on $x$.

15 Why it is difficult to find the MLE
The log-likelihood function:
- The model: $P(X = x^{(0)} \mid \theta) = \frac{\exp\{\theta^T C(x^{(0)})\}}{Z(\theta)}$, where $x^{(0)}$ is the observed data set.
- The log-likelihood function is

$$\ell(\theta) = \theta^T C(x^{(0)}) - \log Z(\theta) = \theta^T C(x^{(0)}) - \log \sum_{\text{all possible } x} \exp\{\theta^T C(x)\}.$$

- Even in the simplest case of undirected graphs without self-edges, the number of graphs in the sum is very large ($2^{n(n-1)/2}$ for $n$ nodes).
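The blow-up is easy to see numerically. Here is a brute-force evaluation of $\log Z(\theta)$ for a toy ERGM with edge and triangle statistics (an illustrative choice of statistics, not the ones used in this work):

```python
import itertools
import numpy as np

def log_Z(theta, n):
    """Brute-force log Z(theta) for an ERGM with C(x) = (#edges, #triangles).
    The sum runs over all 2**(n*(n-1)/2) undirected graphs, so this is
    feasible only for tiny n."""
    pairs = list(itertools.combinations(range(n), 2))
    logs = []
    for bits in itertools.product([0, 1], repeat=len(pairs)):
        adj = np.zeros((n, n), dtype=int)
        for (i, j), b in zip(pairs, bits):
            adj[i, j] = adj[j, i] = b
        edges = sum(bits)
        triangles = np.trace(np.linalg.matrix_power(adj, 3)) // 6
        logs.append(theta[0] * edges + theta[1] * triangles)
    m = max(logs)                                   # log-sum-exp for stability
    return m + np.log(sum(np.exp(v - m) for v in logs))

print(log_Z(np.array([-1.0, 0.5]), n=5))   # 2^10 = 1024 graphs: instant
# n = 12 would already mean 2^66 terms -- the sum is hopeless at scale
```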

16 Maximum pseudo-likelihood
Let $x_w$ be a single component of $x$ and $x_{-w}$ the vector of all the remaining components.

The pseudo-likelihood function: approximate the marginal $P(x_w \mid \theta)$ by the conditional $P(x_w \mid x_{-w}; \theta)$, so that

$$\ell(\theta) = \prod_w P(x_w \mid x_{-w}; \theta).$$

Result: the maximum pseudo-likelihood (MPL) estimate. Unfortunately, little is known about the quality of MPL estimates.

17 Pseudo-likelihood for ERGM
Notation: for a network $x$ and a pair $(i,j)$ of nodes,

$$\ell(\theta) = \prod_{(i,j)} P(x_{ij} \mid x_{-ij}; \theta) = \prod_{(i,j)} \frac{\exp\{\theta^T C(x^{(0)})\}}{\exp\{\theta^T C(x_{ij}=1, x_{-ij})\} + \exp\{\theta^T C(x_{ij}=0, x_{-ij})\}} = \frac{\exp\{n(n-1)\,\theta^T C(x^{(0)})\}}{\prod_{(i,j)} \left( \exp\{\theta^T C(x_{ij}=1, x_{-ij})\} + \exp\{\theta^T C(x_{ij}=0, x_{-ij})\} \right)}$$
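Each factor above says that $P(x_{ij}=1 \mid x_{-ij})$ is logistic in the change statistics $C(x_{ij}=1, x_{-ij}) - C(x_{ij}=0, x_{-ij})$, so the MPL estimate can be obtained with an off-the-shelf logistic regression. A sketch with illustrative edge/triangle statistics:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def stats(a):
    # illustrative graph statistics: (#edges, #triangles)
    return np.array([a.sum() / 2, np.trace(np.linalg.matrix_power(a, 3)) / 6])

def mple(adj):
    """MPL estimate via logistic regression on change statistics:
    logit P(x_ij = 1 | x_-ij) = theta^T [C(x_ij=1, x_-ij) - C(x_ij=0, x_-ij)]."""
    n = adj.shape[0]
    X, y = [], []
    for i in range(n):
        for j in range(i + 1, n):
            a1, a0 = adj.copy(), adj.copy()
            a1[i, j] = a1[j, i] = 1
            a0[i, j] = a0[j, i] = 0
            X.append(stats(a1) - stats(a0))   # change statistics of dyad (i,j)
            y.append(adj[i, j])
    # no intercept: the change statistics carry the whole logit
    model = LogisticRegression(penalty=None, fit_intercept=False)  # 'none' on old sklearn
    return model.fit(np.array(X), np.array(y)).coef_.ravel()

rng = np.random.default_rng(1)
A = np.triu((rng.random((10, 10)) < 0.4).astype(int), 1)
A = A + A.T                            # toy symmetric adjacency matrix
print(mple(A))                         # MPL estimates of (theta_edges, theta_triangles)
```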

18 Pseudo-likelihood for our model

$$\ell(\theta) = \prod_{(r,t)} P(x_{srt} \mid x_{sr't'} \text{ such that } r' \neq r,\, t' < t) \;\propto\; \prod_{(r,t)} \frac{1}{x_{srt}!} \exp\left( \begin{bmatrix} \alpha_{st} + \beta_r \\ \eta \end{bmatrix}^T \begin{bmatrix} x_{srt} \\ C(x_{srt}) \end{bmatrix} \right).$$

What is the normalizing constant of the full conditional?

$$Z(\alpha_{st}, \beta_r, \eta) = \sum_{x_{srt} \geq 0} \frac{1}{x_{srt}!} \exp\left( \begin{bmatrix} \alpha_{st} + \beta_r \\ \eta \end{bmatrix}^T \begin{bmatrix} x_{srt} \\ C(x_{srt}) \end{bmatrix} \right)$$

Even the pseudo-likelihood is hard to evaluate for our model.
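Although no closed form is available, each full conditional is one-dimensional, so its normalizing constant can at least be approximated by truncating the series. A minimal sketch, assuming the product-type case $C(x) = x^p$ and made-up parameter values:

```python
from math import exp, lgamma

def log_q(x, ab, eta, p=0.5):
    # unnormalized log full-conditional, with C(x) = x**p (measurement (i))
    return ab * x + eta * x ** p - lgamma(x + 1)

def Z_truncated(ab, eta, p=0.5, xmax=500):
    """Truncated series for Z(alpha_st + beta_r, eta): the 1/x! base measure
    makes the terms vanish quickly, so a modest cutoff suffices for
    moderate ab = alpha_st + beta_r."""
    return sum(exp(log_q(x, ab, eta, p)) for x in range(xmax + 1))

print(Z_truncated(ab=1.0, eta=0.4))   # the series converges long before x = 500
```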


20 Bayesian posterior
Let $\theta = [\alpha_{1t}, \ldots, \alpha_{|\mathcal{S}|t}, \beta_1, \ldots, \beta_{|\mathcal{R}|}, \gamma_{11}, \ldots, \gamma_{|\mathcal{R}||\mathcal{R}|}]^T$ be the vector of natural parameters, $\pi(\theta)$ a prior distribution, and $x^{(0)}$ the observed data set. By applying Bayes' rule we have:

$$P(\theta \mid x^{(0)}) = \frac{P(x^{(0)} \mid \theta)\, \pi(\theta)}{\int_\theta P(x^{(0)} \mid \theta)\, \pi(\theta)\, d\theta} = \frac{P(x_1, \ldots, x_\tau; \theta) \prod_{t=\tau+1}^{w} P(x_t \mid x_{t-1}, \ldots, x_{t-\tau}; \theta)\, \pi(\theta)}{\int_\theta P(x_1, \ldots, x_\tau; \theta) \prod_{t=\tau+1}^{w} P(x_t \mid x_{t-1}, \ldots, x_{t-\tau}; \theta)\, \pi(\theta)\, d\theta} = \frac{P(x_1, \ldots, x_\tau; \theta)\, \pi(\theta) \prod_{t=\tau+1}^{w} \prod_{s=1}^{m} \frac{q_{s,t,\theta}(x_{st})}{Z(\theta)}}{\int_\theta P(x_1, \ldots, x_\tau; \theta)\, \pi(\theta) \prod_{t=\tau+1}^{w} \prod_{s=1}^{m} \frac{q_{s,t,\theta}(x_{st})}{Z(\theta)}\, d\theta}$$

21 Metropolis-Hastings
Since both $P(x^{(0)} \mid \theta)$ and $P(\theta \mid x^{(0)})$ can only be specified up to proportionality, almost all standard MCMC algorithms for $\theta$ cannot be applied. Consider for instance the Metropolis-Hastings acceptance probability:

$$\pi_{\text{accept}}(\theta, \theta') = \min\left\{ 1,\; \frac{P(x^{(0)} \mid \theta')\, \pi(\theta')}{P(x^{(0)} \mid \theta)\, \pi(\theta)} \cdot \frac{Q(\theta \mid \theta')}{Q(\theta' \mid \theta)} \right\} = \min\left\{ 1,\; \frac{Z(\theta)}{Z(\theta')} \cdot \frac{P(x_1, \ldots, x_\tau; \theta') \prod_{t=\tau+1}^{w} \prod_{s=1}^{m} q_{s,t,\theta'}(x_{st})\, \pi(\theta')\, Q(\theta \mid \theta')}{P(x_1, \ldots, x_\tau; \theta) \prod_{t=\tau+1}^{w} \prod_{s=1}^{m} q_{s,t,\theta}(x_{st})\, \pi(\theta)\, Q(\theta' \mid \theta)} \right\},$$

where $Q(\theta' \mid \theta)$ is the proposal distribution. The intractable ratio $Z(\theta)/Z(\theta')$ does not cancel.

22 Specialized MCMC for doubly intractable distributions
Murray et al. proposed an MCMC approach which overcomes this drawback to a large extent, based on simulating the joint distribution of the parameter and sample spaces conditioned on the observed data set $x^{(0)}$, that is to say $P(x, \theta \mid x^{(0)})$.

Algorithm 1: Exchange algorithm of Murray et al.
1: Initialize $\theta$
2: repeat
3:   Draw $\theta'$ from an arbitrary proposal distribution
4:   Draw $x'$ from $P(\cdot \mid \theta')$
5:   Accept $\theta'$ with probability $\min\left\{ 1,\; \frac{P(x' \mid \theta)\, P(x^{(0)} \mid \theta')\, \pi(\theta')}{P(x^{(0)} \mid \theta)\, P(x' \mid \theta')\, \pi(\theta)} \right\}$
6:   Update $\theta$
7: until convergence
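A minimal sketch of the exchange algorithm on a toy one-parameter exponential family where exact simulation from $P(\cdot \mid \theta')$ is available (a Poisson written in natural-parameter form; all names and numerical values are illustrative, not the model of this talk). The point is that $Z(\theta)$ and $Z(\theta')$ cancel in the acceptance ratio, so the intractable constant is never evaluated:

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(42)

# toy exponential family: q(x | theta) = exp(theta * x) / x!, i.e. Poisson with
# rate exp(theta); its constant Z(theta) = exp(exp(theta)) is never computed
def log_q(xs, theta):
    return theta * np.sum(xs) - sum(lgamma(x + 1) for x in xs)

def log_prior(theta):
    return -0.5 * theta ** 2              # N(0, 1) prior, up to a constant

def exchange(x_obs, n_iter=5000, step=0.3):
    theta, chain = 0.0, []
    for _ in range(n_iter):
        theta_p = theta + step * rng.standard_normal()    # symmetric proposal Q
        x_aux = rng.poisson(np.exp(theta_p), len(x_obs))  # exact draw from P(.|theta')
        # the intractable Z(theta) and Z(theta') cancel in this ratio:
        log_a = (log_q(x_aux, theta) + log_q(x_obs, theta_p) + log_prior(theta_p)
                 - log_q(x_obs, theta) - log_q(x_aux, theta_p) - log_prior(theta))
        if np.log(rng.random()) < log_a:
            theta = theta_p
        chain.append(theta)
    return np.array(chain)

x_obs = rng.poisson(np.exp(1.2), 50)      # synthetic data generated with theta = 1.2
print(exchange(x_obs)[1000:].mean())      # posterior mean should sit near 1.2
```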

23 Goodness of fit: graphical illustration
Total number of plays along time by the top-30 songs.
[Figure, two panels: (a) full model; (b) null model ($\gamma = 0$).]

24 Goodness of fit: graphical illustration
Total number of plays along time by the top-30 songs.
[Figure, two panels: (a) total plays along time; (b) market share.]

25 Reducing the dimensionality of the parameter space
Model specification based on structural properties of the music industry. The parameter space is the whole $(|\mathcal{T}||\mathcal{S}| + |\mathcal{R}| + |E|)$-dimensional Euclidean space, while the sample space has dimension $|\mathcal{T}|\,|\mathcal{S}|\,|\mathcal{R}|$. We use two strategies to reduce the dimensionality of the parameter space:
A. Define communities of broadcasting companies, so as to consider only within-group spillover effects $\gamma$;
B. Define a functional form for the effect of the song life cycle $\alpha$.

26 Reducing the dimensionality of the parameter space
A. Reducing the $|E|$ effects $\gamma$:
- pairwise spillover effects $\gamma_{kh}$ between individual companies $h$ and $k$ with the same radio format;
- a common spillover effect $\gamma_{kh}$ between different radio formats, if $h$ and $k$ have different formats.
B. Reducing the $|\mathcal{T}||\mathcal{S}|$ effects $\alpha$: the broadcasting pattern of songs exhibits a time window in which their popularity quickly increases shortly after their premiere and then decreases.

27 Groups of broadcasting companies
We introduce only the effects $\gamma$ associated with TV channels and radio stations of the same format.
[Diagram: within-format versus between-format links among the groups Contemporary and Easy Listening, Top 40 and Urban, Rock music, and TV channels.]

28 The estimated spillover effects
[Two tables of posterior interval estimates for the spillover effects $\gamma$: one between formats (Contemporary, Rock, News, Sport, Top-40, World-Music, and TV channels), and one between individual UK stations (BBC 1 Xtra, Capital FM, Kiss 100 FM, Metro Radio, Radio City, and Smooth Radio London).]

29 Songs' dynamics
Define a functional form for the effect of song dynamics. The attractiveness trajectory of the $s$-th song can be specified by letting $t_0$ be the week in which the song is launched and using a gamma kernel to shape its time dynamics:

$$\alpha_{st} = \delta_s^0 + \delta_s^1 (t - t_0) + \delta_s^2 \log(t - t_0) \quad \text{if } t > t_0,$$

with the song not broadcast before the launch week $t_0$.
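A small sketch of this life-cycle shape (the coefficient values are made up; with $\delta^1 < 0 < \delta^2$, the log-attractiveness rises sharply after the premiere, peaks $-\delta^2/\delta^1$ weeks later, and then decays):

```python
import numpy as np

def alpha_st(t, t0, d0, d1, d2):
    """Gamma-kernel attractiveness: d0 + d1*(t - t0) + d2*log(t - t0) for t > t0.
    Before the launch week the song is not broadcast (encoded as -inf)."""
    dt = np.asarray(t, dtype=float) - t0
    out = np.full_like(dt, -np.inf)
    m = dt > 0
    out[m] = d0 + d1 * dt[m] + d2 * np.log(dt[m])
    return out

weeks = np.arange(0, 60)
a = alpha_st(weeks, t0=0, d0=0.0, d1=-0.15, d2=1.8)
print(weeks[np.argmax(a)])   # peak at week -d2/d1 = 12 after the premiere
```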

30 Songs' life cycle
[Figure: common life cycle of the top-30 songs.]

31 Propagation of the broadcasting decision after the premiere week $t_0$

$$\max \; \frac{1}{T} \, \mathbb{E}\left[ \sum_{t'=1}^{T} \sum_{s \in \mathcal{S}} x_{s,\cdot,t+t'} \;\middle|\; x_{srt} = z_r \ \text{for all } r \in \mathcal{R},\, s \in \mathcal{S} \right]$$

subject to
$\sum_{r \in \mathcal{R}} y_r = 1$;
$z_r \leq \min\{M y_r, \phi\}$ for all $r \in \mathcal{R}$;
$y_r \in \{0,1\},\; z_r \geq 0,\; \phi \geq 0$ for all $r \in \mathcal{R}$.

[Table: eigenvector centrality and expected plays in the following periods under $\phi = 10$ and $\phi = 100$, for the groups Contemporary, Rock, News, Sport, Top-40, World Music, and TV channels.]
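Since $\sum_r y_r = 1$, the launch decision can be sketched by enumeration: seed $\phi$ plays at each candidate company in turn and propagate the expected plays forward. In the sketch below, the influence matrix W and the linear recursion are illustrative stand-ins for the model's conditional expectation, not the exact propagation used in the talk:

```python
import numpy as np

def best_launch(W, base, phi, T=10):
    """Enumerate the launch decision: sum_r y_r = 1 means one company is seeded,
    and z_r <= min(M*y_r, phi) binds at phi for the chosen one. W is a
    hypothetical one-step expected-influence matrix distilled from the
    estimated spillovers gamma."""
    R = W.shape[0]
    scores = []
    for r in range(R):
        x = np.zeros(R)
        x[r] = phi                    # seed phi plays at company r
        total = 0.0
        for _ in range(T):
            x = base + W @ x          # expected plays one week ahead
            total += x.sum()
        scores.append(total)
    return int(np.argmax(scores)), max(scores)

rng = np.random.default_rng(3)
W = 0.1 * rng.random((7, 7))          # toy influence weights among 7 groups
print(best_launch(W, base=np.ones(7), phi=10))
```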

32 Discussion
What are the real achievements of this work?
- We considered a large multidimensional panel of songs broadcast weekly on radio stations and TV channels, and detected a pattern of cross-sectional dependencies based on pairwise imitation.
- A probabilistic model has been proposed to internalize, within a unique framework, both the songs' life cycles and the complex correlation structure.
- A specialized MCMC method has been implemented to estimate the model parameters.
- The out-of-sample goodness of fit has been analyzed, assessing the model's adequacy for the observed data set.

33 Thank you for your attention
Acknowledgements: The research leading to these results has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP7) / ERC Grant Agreement n°.
