Efficient MCMC Samplers for Network Tomography

1 Efficient MCMC Samplers for Network Tomography. Martin Hazelton, Institute of Fundamental Sciences, Massey University. 7 December. Email: m.hazelton@massey.ac.nz. AUT Mathematical Sciences Symposium, Auckland, 7-8 December.

2-8 Networks. [Sequence of illustrative network diagrams.]

9 Networks and Statistics. Network data generates a host of interesting statistical problems. In some cases interest centres on the structure of an abstract network. Biological networks. Social networks.

10 A Social Network. [Diagram of an example social network.]

11 Network Tomography. This talk focuses on inference for flows on physical networks. Road networks. Electronic communication networks. Biological networks. Target for inference often higher dimensional than observed data. Led Vardi (1996) to coin the phrase network tomography for such problems. Vardi, Y. (1996). Network tomography: estimating source-destination traffic intensities from link data. Journal of the American Statistical Association 91.

12-14 Modelling Framework: Traffic Network. System abstracted to network G = (N, A). N is set of nodes. Nodes represent: origins and/or destinations of travel; intersections. A is set of directed links. Links also referred to as arcs.

15-18 Modelling Framework: Routes. Travel is possible only between pre-specified origin-destination (OD) node pairs. E.g. node 1 to node 6 might be only OD pair. Typically travel can be by a variety of routes (paths). Sometimes no route choice. E.g. if (4, 3) is another OD pair.

19-21 Traffic Models. Traffic models typically specified in terms of route flows (volumes). Let x denote vector of route flows for some observational period. Model parameterized by θ.
Canonical example: x ~ Pois(θ) (interpret elementwise, with independence). OD flows z defined by z_i = Σ_{j ~ i} x_j, where j ~ i means route j serves OD pair i.
More sophisticated example: z ~ NegBin (allows for over-dispersion). Route choice probabilities {p_j} defined by random utility models.
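As a concrete illustration of the canonical model, here is a minimal sketch that simulates independent Poisson route flows and aggregates them into OD flows. The three-route network, the rates and the route-to-OD assignment are invented for illustration; only x ~ Pois(θ) and z_i = Σ_{j ~ i} x_j come from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: 3 routes, 2 OD pairs (routes 1 and 2 serve OD pair 1, route 3 serves OD pair 2).
theta = np.array([5.0, 8.0, 3.0])          # Poisson means for the route flows
route_to_od = np.array([0, 0, 1])          # OD pair served by each route (illustrative)

x = rng.poisson(theta)                     # route flows: x ~ Pois(theta), elementwise and independent
z = np.bincount(route_to_od, weights=x)    # OD flows: z_i = sum of x_j over routes j with j ~ i

print("route flows x =", x)
print("OD flows z =", z)
```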

22-23 Data for Model Fitting and Assessment. Traffic model defined by probability function f_X(x|θ) for route flows. Want to conduct inference for θ.

                    Route flows, x              Link counts, y
  Nature            Direct observation on x     Indirect observation on x
  Example sources   Vehicle tracking;           Vehicle counters on road;
                    numberplate matching;       counts at router switch;
                    travel surveys              computer packet tracking
  Cost              High                        Low

Focus on inference from link counts. Relevant unless routing data complete; e.g. Parry & Hazelton (2012). Parry, K. and Hazelton, M.L. (2012). Estimation of origin-destination matrices from link counts and sporadic routing data. Transp. Res. Part B 46.

24 Toy Example. [Diagram: two-link network with observed link counts y_1 = 10 and y_2 = 10.] Assume three OD pairs: (1, 2), (1, 3), (2, 3). Fixed routing, so we have r = 3 routes. For definiteness, suppose x = (x_1, x_2, x_3)^T ~ Pois(θ). Vector of n = 2 observed link counts is y = (y_1, y_2)^T. What does this tell us about latent route flows x and hence about parameter vector θ?

25-26 Link Counts and Route Flows. Fundamental relationship: link count vector y and route flow vector x related by y = Ax, where A = (a_ij) is the link-path incidence matrix defined by a_ij = 1 if link i forms part of route j, and a_ij = 0 otherwise. Just a counting exercise: y_i = Σ_{j=1}^r a_ij x_j. Usually a massively under-determined linear system. Link counts provide only indirect information about x and θ. Inference is a form of linear inverse problem.
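A minimal sketch of the fundamental relationship y = Ax for the toy example (two links, three routes). The incidence matrix is reconstructed from the constraints x_1 + x_2 = y_1 and x_2 + x_3 = y_2 given later in the slides; the route flow vector is invented.

```python
import numpy as np

# Link-path incidence matrix for the toy example: link 1 is used by routes 1 and 2,
# link 2 is used by routes 2 and 3.
A = np.array([[1, 1, 0],
              [0, 1, 1]])

x = np.array([4, 6, 4])    # an illustrative route flow vector
y = A @ x                  # link counts implied by the route flows: y_i = sum_j a_ij x_j
print(y)                   # -> [10 10], consistent with y_1 = y_2 = 10 in the toy example
```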

27 Back to the Toy Example. With y_1 = 10 and y_2 = 10:
y = (y_1, y_2)^T = [1 1 0; 0 1 1] (x_1, x_2, x_3)^T = Ax.

28 Feasible Route Flow Set. For a given link count vector y, the set of feasible route flows is X_y = {x : y = Ax, x ≥ 0}, where the inequality is interpreted component-wise. Set X_y is of fundamental importance for inference.

29 Back to the Toy Example. With y_1 = 10 and y_2 = 10:
X_y = {(x_1, x_2, x_3)^T ∈ Z^3_≥0 : x_1 + x_2 = y_1, x_2 + x_3 = y_2} = {(0, 10, 0)^T, (1, 9, 1)^T, ..., (10, 0, 10)^T}.
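A small sketch that enumerates X_y for the toy example by brute force. This is feasible only because the network is tiny; the point of the talk is that such enumeration is usually impossible.

```python
import numpy as np
from itertools import product

A = np.array([[1, 1, 0],
              [0, 1, 1]])
y = np.array([10, 10])

# Brute-force enumeration of X_y = {x : y = Ax, x >= 0} over a bounding box.
upper = y.max()  # no route flow can exceed the largest link count in this example
X_y = [x for x in product(range(upper + 1), repeat=3)
       if np.array_equal(A @ np.array(x), y)]

print(len(X_y))   # 11 feasible route flow vectors
print(X_y[:3])    # (0, 10, 0), (1, 9, 1), (2, 8, 2), ...
```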

30-31 Likelihood Functions for Traffic Models. Probability function for observed link count data y denoted f_Y(y|θ). Model likelihood given by
L(θ) = f_Y(y|θ) = Σ_x f_{Y|X}(y|x, θ) f_X(x|θ) = Σ_{x ∈ X_y} f_X(x|θ).
Explanation: f_{Y|X}(y|x, θ) is the indicator of the event {y = Ax, x ≥ 0}, i.e. that the route flow is compatible with the observed link counts.
For the canonical Poisson example: if x ~ Pois(θ) then
L(θ) = Σ_{x ∈ X_y} Π_{i=1}^r e^{-θ_i} θ_i^{x_i} / x_i!.
Awkward mathematical form.
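For the toy example the likelihood can be evaluated directly by summing the Poisson probability of every element of X_y. A sketch, reusing the brute-force enumeration above (practical only because X_y is tiny):

```python
import numpy as np
from itertools import product
from scipy.stats import poisson

A = np.array([[1, 1, 0],
              [0, 1, 1]])
y = np.array([10, 10])
X_y = [np.array(x) for x in product(range(y.max() + 1), repeat=3)
       if np.array_equal(A @ np.array(x), y)]

def likelihood(theta):
    # L(theta) = sum over x in X_y of prod_i e^{-theta_i} theta_i^{x_i} / x_i!
    return sum(np.prod(poisson.pmf(x, theta)) for x in X_y)

print(likelihood(np.array([1.0, 9.0, 1.0])))
print(likelihood(np.array([9.0, 1.0, 9.0])))
```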

32 Model Identifiability. Single link count vector y provides limited information about x and hence θ. What about an iid sample, y^(1), ..., y^(n)? Model identifiable from link count data if f_Y(.|θ) = f_Y(.|θ') implies θ = θ'. Nice study for normal models by Singhal & Michailidis (2007). Extends to correlated data cases but places restrictions on network topology and model parameterization. Singhal, H. and Michailidis, G. (2007). Identifiability of flow distributions from link measurements in computer networks. Inverse Problems 23.

33 Identifiability Theory for the Integer-Valued Traffic Case. Vardi (1996) proved identifiability for canonical Poisson traffic model with single route per OD pair. New result generalizes this. Theorem (Hazelton 2015): Assume that the route flows are independent, and that f_X and f_Y have support equal to the non-negative integers. Then if θ is identifiable from independent observations on x, it is also identifiable from independent observations on y. Hazelton, M.L. (2015). Network tomography for integer-valued traffic. Annals of Applied Statistics 9(1). Vardi, Y. (1996). Network tomography: estimating source-destination traffic intensities from link data. Journal of the American Statistical Association 91.

34-36 Learning from the Dependence Structure. Identifiability can be explained in part by information from link count dependence structure. For the toy example, compare two scenarios, each with three observations of (y_1, y_2):
Scenario 1, θ = (0, 10, 0)^T: observations (11, 11), (6, 6), (7, 7). The two link counts are always equal, since all traffic uses route 2.
Scenario 2, θ = (10, 0, 10)^T: observations (8, 6), (9, 12), (10, 11). The two link counts vary independently.

37-38 Identifiability: Open Research Questions. 1. For integer-valued flows, results require independence of route flows, and independence between days. Can either of these independence assumptions be relaxed? If so, in what ways? What are minimal requirements? 2. Proof is constructive in nature, and relies on observation of very low probability events. What are the practical implications of theoretical identifiability from link count data?

39-40 Likelihood-Based Inference. Observe n link count vectors: y^(1), ..., y^(n). Likelihood is L(θ) = Π_{t=1}^n f_Y(y^(t)|θ). Recall that for each single link count vector, f_Y(y|θ) = Σ_{x ∈ X_y} f_X(x|θ). Usually computationally infeasible to enumerate X_y = {x : y = Ax, x ≥ 0}. Hence not feasible to explicitly evaluate f_Y(y|θ) and L(θ). Passing comment: many ad hoc methods of estimation seek a single optimal element of X_y. That is demonstrably sub-optimal.

41 Sampling Based Inference. Crux of the idea: replace enumeration of X_y = {x : y = Ax, x ≥ 0} by a representative sample therefrom. Practical implementations: stochastic EM algorithm for likelihood inference, MCMC algorithms for Bayesian inference. Li (2005) discusses EM and stochastic EM algorithms for trip matrix estimation. Tebaldi & West (1998) is the seminal reference for MCMC Bayesian network tomography. All previous methods have struggled because of the difficulty in sampling from X_y. Li, B. (2005). Bayesian inference for origin-destination matrices of transport networks using the EM algorithm. Technometrics 47(4). Tebaldi, C. & West, M. (1998). Bayesian inference on network traffic using link count data (with discussion). Journal of the American Statistical Association 93.

42 Stochastic EM Algorithm. Iterative algorithm. Suppose θ' is the current value of θ.
Step 1: Compute Q̂(θ|θ') = M^{-1} Σ_{i=1}^M Σ_{t=1}^n log f_X(x_i^(t)|θ), where x_1^(t), ..., x_M^(t) is a random sample from f_{X|Y}(.|y^(t), θ').
Step 2: Maximize Q̂(θ|θ') to give new value of θ.
Algorithm converges to the maximum likelihood estimate of θ. Standard errors available from (approximate) information matrix.
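A rough sketch of the stochastic EM iteration for the canonical Poisson model on the toy network. Because every element of X_y is indexed by x_2 alone, the sample from f_{X|Y} can be drawn exactly here; the link count data, starting value and sample sizes are invented for illustration.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)
Y = np.array([[10, 10], [8, 9], [12, 11]])   # n = 3 hypothetical link count vectors (y_1, y_2)
theta = np.array([5.0, 5.0, 5.0])            # current value theta'
M = 200                                      # Monte Carlo sample size per observation

def sample_x_given_y(y, theta, size, rng):
    # Exact sampling from f_{X|Y}(x | y, theta): X_y is indexed by x_2 in 0..min(y_1, y_2).
    x2 = np.arange(min(y) + 1)
    cand = np.column_stack([y[0] - x2, x2, y[1] - x2])   # all feasible route flow vectors
    logw = poisson.logpmf(cand, theta).sum(axis=1)       # log f_X(x | theta) for each candidate
    w = np.exp(logw - logw.max())
    idx = rng.choice(len(cand), size=size, p=w / w.sum())
    return cand[idx]

for sweep in range(50):
    # Stochastic E-step: sample route flows for each observation given the current theta.
    samples = np.vstack([sample_x_given_y(y, theta, M, rng) for y in Y])
    # M-step: the Poisson value maximizing Q-hat is the mean of the sampled route flows.
    theta = samples.mean(axis=0)

print(theta)
```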

43 Bayesian Inference using MCMC: Metropolis-Hastings Algorithm. Prior: π(θ). Posterior: p(θ|y^(1), ..., y^(n)) ∝ L(θ)π(θ).
MCMC algorithm (for n = 1 case):
1. Generate initial values for θ, x.
2. Sample candidate x' from proposal distribution q. Want q to have support X_y.
3. Accept x' with probability min(1, α), where
α = f_{X|Y}(x'|y, θ) q(x) / [f_{X|Y}(x|y, θ) q(x')] = f_{X,Y}(x', y|θ) q(x) / [f_{X,Y}(x, y|θ) q(x')] = f_X(x'|θ) q(x) / [f_X(x|θ) q(x')].
4. Sample θ from p(θ|x) (usually straightforward).
5. Go to step 2.
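A minimal sketch of steps 2-3 for the toy example, using a symmetric ±1 random walk on the free coordinate x_2 as the proposal; any move that would leave X_y is rejected. The proposal is a choice made here for illustration; the slides do not prescribe one.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)
y = np.array([10, 10])
theta = np.array([1.0, 9.0, 1.0])

def route_flows(x2, y):
    # Every element of X_y has the form x = (y_1 - x_2, x_2, y_2 - x_2).
    return np.array([y[0] - x2, x2, y[1] - x2])

x2 = int(rng.integers(0, min(y) + 1))          # initial feasible state
trace = []
for it in range(5000):
    x2_prop = x2 + int(rng.choice([-1, 1]))    # symmetric random-walk proposal
    if 0 <= x2_prop <= min(y):                 # otherwise the proposal falls outside X_y: reject
        # With a symmetric proposal, alpha reduces to f_X(x' | theta) / f_X(x | theta).
        log_alpha = (poisson.logpmf(route_flows(x2_prop, y), theta).sum()
                     - poisson.logpmf(route_flows(x2, y), theta).sum())
        if np.log(rng.uniform()) < log_alpha:
            x2 = x2_prop
    trace.append(route_flows(x2, y))

print(np.mean(trace, axis=0))                  # estimated posterior mean of x given y and theta
```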

44 Sampling Feasible Route Flow Vectors. Trick is to sample from proposal distribution q with support matching X_y = {x : y = Ax, x ≥ 0}. Can be an iterative process, updating one element of x at a time. Development of an efficient algorithm for such sampling is a 20-year-old problem. Recent geometrical insight from Airoldi & Haas (2011) and Airoldi & Blocker (2013) for continuous flow models, but still no reliable solution for integer-valued traffic. Airoldi, E.M. & Blocker, A.W. (2013). Estimating latent processes on a network from indirect measurements. Journal of the American Statistical Association 108. Airoldi, E.M. & Haas, B. (2011). Polytope samplers for inference in ill-posed inverse problems. In Int. Conf. Artificial Intelligence and Statistics, Vol. 15.

45 Geometry of the Feasible Route Flow Set. X_y is the intersection of the linear manifold {x : y = Ax} with the non-negative orthant {x ≥ 0}. Hence X_y = {x : y = Ax, x ≥ 0} is a convex polytope. If there are r routes and n links then X_y is an (r - n)-dimensional object embedded in r-dimensional space. Have flexibility in representation.

46 Geometry of the Feasible Route Flow Set: Example. Nodes 1, 2 are origins of flow; nodes 3, 4 and 5 are destinations. Hence six routes, indexed (o, d) for o = 1, 2 and d = 3, 4, 5. The link-route incidence matrix A = [A_1 A_2] (entries not reproduced here) is partitioned so that A_1 is invertible, and hence the polytope is represented by the flows on the last two routes, (2, 4) and (2, 5).
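A sketch of the change of representation implied by the partition A = [A_1 A_2]: for a chosen vector of free route flows x_N, the remaining flows are recovered as x_B = A_1^{-1}(y - A_2 x_N), and the point lies in X_y exactly when x_B is non-negative and integer-valued. Since the slide's own incidence matrix is not reproduced above, the code reuses the toy example's 2x3 matrix, with route 3 as the single free coordinate.

```python
import numpy as np

A = np.array([[1, 1, 0],
              [0, 1, 1]])
y = np.array([10, 10])

basic, free = [0, 1], [2]              # partition of the routes so that A_1 is invertible
A1, A2 = A[:, basic], A[:, free]

def basic_flows(x_free):
    # x_B = A_1^{-1} (y - A_2 x_N); feasible iff non-negative and integer-valued.
    return np.linalg.solve(A1, y - A2 @ x_free)

polytope = []
for x3 in range(y.max() + 1):          # scan the single free coordinate
    xB = basic_flows(np.array([x3]))
    if np.all(xB >= 0) and np.allclose(xB, np.round(xB)):
        polytope.append((int(round(xB[0])), int(round(xB[1])), x3))

print(polytope)                        # (0, 10, 0), (1, 9, 1), ..., (10, 0, 10)
```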

47 Geometry of the Feasible Route Flow Set: Example continued. Suppose link counts are y = (20, 30, 20, 10)^T (single day). Polytope is represented by the flows on routes 5, i.e. (2, 4), and 6, i.e. (2, 5). [Plot of the polytope in the (x_5, x_6) plane.]

48-49 Sampling: Geometry of the Feasible Route Flow Set, Example continued. Sampling (e.g. uniformly) from the polytope is not straightforward, because there is no convenient expression for the support. Sampling iteratively in coordinate directions is much simpler. [Plots showing coordinate-direction moves across the polytope.]

50 Sampling for Awkward Geometries: Revised example. Suppose y = (10, 20, 20, 10)^T (left) or y = (10, 20, 19, 9)^T (right). Sampling in coordinate directions is impossible (left) or very slow to converge (right). [Plots of the two polytopes in the (x_5, x_6) plane.]

51-52 Change of Representation. Tebaldi & West (1998) proposed coordinate-direction iterative sampling but failed to appreciate that it won't necessarily work for awkward geometries. But recall there is flexibility in the choice of representation of X_y. Hazelton (2015) proved recently that there is always a convenient representation of X_y for coordinate-direction sampling. Caveat: so long as the link-path matrix A is unimodular. Hazelton, M.L. (2015). Network tomography for integer-valued traffic. Annals of Applied Statistics 9(1). Tebaldi, C. & West, M. (1998). Bayesian inference on network traffic using link count data (with discussion). Journal of the American Statistical Association 93.

53 Change of Representation: Back to the Example. As before, y = (10, 20, 20, 10)^T (left) or y = (10, 20, 19, 9)^T (right). Represent the polytope by the flows on routes 4, i.e. (2, 3), and 6, i.e. (2, 5). Sampling in coordinate directions is then efficient. [Plots of the re-represented polytopes.]

54 Adaptive MCMC Samplers. We can construct a good route flow sampler by allowing adaptive representation of X_y. Strategy is to exclude routes carrying heavy traffic from basis for polytope. Intuitively this maximizes the slack and hence permits longest possible moves. Implement by monitoring estimated route flows and updating basis periodically.

55-57 Example: Road Intersection in Leicester, UK. [Map of the Leicester intersection and the abstracted network.] OD pairs: (o, d) ∈ {1, 2, 3, 5, 6} × {1, 2, 3, 5, 6} with o ≠ d. Traffic counted on all links except link 5.

58-59 Example: Road Intersection in Leicester, UK: Model. Canonical Poisson model: x ~ Pois(θ). Fixed routing, so route flows equal OD flows. Single link count vector observed over 15 mins on a weekday in May: y = (72, 56, 217, 120, 119, 127, 178, 117, 181)^T. Prior information available. Critical: otherwise useful inference impossible. Specified through gamma priors on elements of θ. Employ MCMC algorithm to sample x and θ, switching between sampling x|θ, y and θ|x. Adaptive choice of basis for polytope X_y, with updates at 10,000 and 20,000 iterations.
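The θ|x step is straightforward because the gamma prior is conjugate to the Poisson route flow model: if θ_i ~ Gamma(a_i, b_i) a priori and x_i|θ_i ~ Pois(θ_i), then θ_i|x_i ~ Gamma(a_i + x_i, b_i + 1). A sketch with invented hyperparameters (the slides do not state the prior settings used):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical gamma prior hyperparameters (shape a_i, rate b_i) for each of 20 routes,
# and a current route flow vector x as it would stand inside the MCMC loop.
a = np.full(20, 4.0)
b = np.full(20, 0.1)
x = rng.integers(0, 80, size=20)

def sample_theta_given_x(x, a, b, rng):
    # Conjugate update for a single observation period: theta_i | x_i ~ Gamma(a_i + x_i, b_i + 1).
    return rng.gamma(shape=a + x, scale=1.0 / (b + 1.0))

theta = sample_theta_given_x(x, a, b, rng)
print(theta.round(1))
```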

60 Example: Road Intersection in Leicester, UK: Trace Plots for x. [Trace plots of sampled route flows.]

61 Example: Road Intersection in Leicester, UK: Results. Table of results: route, O-D pair, prior mean, posterior mean and 95% credible interval for each of the 20 routes. Only the 95% credible intervals survive in this transcription: (27.6, 52.7), (12.0, 30.3), (42.5, 76.3), (16.3, 38.4), (12.8, 33.8), (25.9, 53.8), (43.5, 80.2), (1.6, 11.0), (11.5, 31.8), (21.9, 46.3), (26.1, 48.9), (5.2, 18.0), (37.3, 67.0), (5.8, 27.1), (16.1, 39.8), (0.3, 16.0), (48.7, 85.1), (1.5, 9.5), (3.4, 14.3), (27.8, 51.6).

62 Closing Comments: More Open Research Questions. Preceding theory and methods work if A is totally unimodular. A matrix is totally unimodular if all nonsingular square sub-matrices have integer-valued inverses; equivalently, every square sub-matrix has determinant 0, 1 or -1. Most realistic link-route incidence matrices seem to have this property, but one can construct artificial counter-examples. Why do real networks tend to have totally unimodular link-route incidence matrices? Are we likely to find counter-examples in practice?
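A brute-force check of total unimodularity using the determinant characterisation. The cost is exponential in the matrix dimensions, so this is only usable for small incidence matrices such as the toy example's.

```python
import numpy as np
from itertools import combinations

def is_totally_unimodular(A, tol=1e-9):
    # Check that every square sub-matrix has determinant in {-1, 0, 1}.
    m, n = A.shape
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                d = float(np.linalg.det(A[np.ix_(rows, cols)]))
                if abs(d - round(d)) > tol or int(round(d)) not in (-1, 0, 1):
                    return False
    return True

A = np.array([[1, 1, 0],
              [0, 1, 1]])          # toy example link-path incidence matrix
print(is_totally_unimodular(A))   # True
```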

63 For a Copy of These Slides...
