Total positivity in Markov structures
Slide 1/27
Total positivity in Markov structures
Steffen Lauritzen, Department of Mathematical Sciences, Faculty of Science
CRM, Montreal, July 2016
Based on joint work with Shaun Fallat, Kayvan Sadeghi, Caroline Uhler, Nanny Wermuth, and Piotr Zwiernik (arXiv).
Slide 2/27
1. Positive association and multivariate total positivity
2. Multivariate Gaussian MTP$_2$ distributions
3. Conditional independence and Markov properties
4. Totally positive Markov distributions
5. Special instances of total positivity
Slide 3/27: Positive dependence and Simpson's paradox
Two real-valued random variables $X$ and $Y$ are positively associated if
$$\mathrm{Cov}\{f(X), g(Y)\} \ge 0$$
for all non-decreasing $f$ and $g$.
The Yule-Simpson paradox says that $X$ and $Y$ may be positively associated and yet negatively associated conditionally on a third variable $Z$.
Multivariate total positivity (MTP$_2$) ensures that this cannot happen: associations can never change sign due to changes of context. Hence they might be of a causal nature...
Slide 4/27: Multivariate total positivity for functions
Let $f : \mathcal{X} = \prod_{v \in V} \mathcal{X}_v \to \mathbb{R}$, where the $\mathcal{X}_v$ are either discrete or open subsets of $\mathbb{R}$.
Definition. $f$ is multivariate totally positive of order two (MTP$_2$) if
$$f(x) f(y) \le f(x \wedge y) f(x \vee y) \quad \text{for all } x, y \in \mathcal{X}.$$
Here $\wedge$ (minimum) and $\vee$ (maximum) are applied coordinatewise. In the bivariate case, this property is known simply as total positivity or TP$_2$ (Karlin and Rinott, 1980).
A function $g$ is supermodular if
$$g(x \wedge y) + g(x \vee y) \ge g(x) + g(y) \quad \text{for all } x, y \in \mathcal{X}.$$
Thus $g$ is supermodular iff $\exp(g)$ is MTP$_2$.
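For a function on a finite product space, the defining inequality can be checked by brute force. The following is a minimal sketch, assuming the function is stored as a non-negative NumPy array indexed by the ordered states; the helper name `is_mtp2` and the two example tables are illustrative and do not come from the talk.

```python
import itertools
import numpy as np

def is_mtp2(f, tol=1e-12):
    """Check f(x) f(y) <= f(x ^ y) f(x v y) for all pairs of grid points."""
    f = np.asarray(f, dtype=float)
    points = list(itertools.product(*(range(n) for n in f.shape)))
    for x, y in itertools.combinations(points, 2):
        lo = tuple(min(a, b) for a, b in zip(x, y))  # coordinatewise minimum
        hi = tuple(max(a, b) for a, b in zip(x, y))  # coordinatewise maximum
        if f[x] * f[y] > f[lo] * f[hi] + tol:
            return False
    return True

# Hypothetical 2x2 tables: mass on the diagonal versus mass off the diagonal.
concordant = np.array([[0.4, 0.1], [0.1, 0.4]])
discordant = np.array([[0.1, 0.4], [0.4, 0.1]])

print(is_mtp2(concordant))  # True
print(is_mtp2(discordant))  # False
```

The check enumerates all pairs of points, so it is only practical for small state spaces.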
Slide 5/27: Example
For $d = 2$, $x_1 \le x_2$, $y_1 \le y_2$, the condition for MTP$_2$ simply becomes
$$f(x_1, y_2) f(x_2, y_1) \le f(x_1, y_1) f(x_2, y_2),$$
or, alternatively,
$$\det \begin{pmatrix} f(x_1, y_1) & f(x_1, y_2) \\ f(x_2, y_1) & f(x_2, y_2) \end{pmatrix} \ge 0.$$
Slide 6/27: Multivariate total positivity for distributions
For $\mathcal{X} = \prod_{v \in V} \mathcal{X}_v$ as before we adopt a standard base measure $\mu = \otimes_{v \in V} \mu_v$, where $\mu_v$ is counting measure if $\mathcal{X}_v$ is discrete and Lebesgue measure if $\mathcal{X}_v$ is an open subset of $\mathbb{R}$. We then define:
Definition. A distribution $P$ is said to be multivariate totally positive of order two (MTP$_2$) if its density w.r.t. the standard base measure $\mu$ is MTP$_2$.
Introduced and studied by Karlin and Rinott (1980) using results (the FKG inequality) from the fundamental paper by Fortuin et al. (1971).
We shall occasionally say "$X$ is MTP$_2$" instead of "the distribution of $X$ is MTP$_2$".
Slide 7/27: Example
For $d = 2$, let $f$ be the density of a Gaussian distribution with mean zero and covariance matrix
$$\Sigma = \begin{pmatrix} \sigma_{xx} & \sigma_{xy} \\ \sigma_{yx} & \sigma_{yy} \end{pmatrix}.$$
Then
$$f(x_1, y_2) f(x_2, y_1) \le f(x_1, y_1) f(x_2, y_2)$$
if and only if $\sigma_{yx} \ge 0$, since this is equivalent to the mixed terms in the exponents satisfying
$$(x_1 y_2 + x_2 y_1)\, \sigma_{xy} / \det(\Sigma) \le (x_1 y_1 + x_2 y_2)\, \sigma_{xy} / \det(\Sigma),$$
and if $\sigma_{xy} > 0$ this is equivalent to
$$(x_1 y_1 + x_2 y_2) - (x_1 y_2 + x_2 y_1) = (x_2 - x_1)(y_2 - y_1) \ge 0.$$
Slide 8/27: Example
Consider binary $X$ and $Y$ with $p_{ij} = P(X = i, Y = j)$ for $i, j \in \{0, 1\}$. Then $P$ is MTP$_2$ if and only if
$$p_{01} p_{10} \le p_{00} p_{11},$$
i.e. iff the odds-ratio $\theta = p_{00} p_{11} / (p_{01} p_{10})$ satisfies $\theta \ge 1$.
For three MTP$_2$ binary variables $X, Y, Z$ we have, for example,
$$p_{01k} p_{10k} \le p_{00k} p_{11k}, \quad k = 0, 1,$$
and thus the conditional odds-ratios satisfy $\theta_k = p_{00k} p_{11k} / (p_{01k} p_{10k}) \ge 1$.
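As an illustration of the odds-ratio conditions, the sketch below computes the conditional odds-ratios $\theta_k$ and the marginal odds-ratio of $(X, Y)$ for a hypothetical $2 \times 2 \times 2$ table. The table entries are made up for this example; exact fractions are used so the results are not affected by rounding.

```python
from fractions import Fraction as F

# Hypothetical joint table p[i][j][k] = P(X = i, Y = j, Z = k); entries sum to 1.
p = [
    [[F(6, 20), F(1, 20)], [F(2, 20), F(1, 20)]],  # X = 0
    [[F(2, 20), F(1, 20)], [F(2, 20), F(5, 20)]],  # X = 1
]

def conditional_odds_ratio(table, k):
    """theta_k = p_00k p_11k / (p_01k p_10k), the (X, Y) odds-ratio given Z = k."""
    return (table[0][0][k] * table[1][1][k]) / (table[0][1][k] * table[1][0][k])

def marginal_odds_ratio(table):
    """Odds-ratio of (X, Y) after summing Z out."""
    m = [[sum(table[i][j]) for j in (0, 1)] for i in (0, 1)]
    return (m[0][0] * m[1][1]) / (m[0][1] * m[1][0])

theta = [conditional_odds_ratio(p, k) for k in (0, 1)]
print(theta)                   # [Fraction(3, 1), Fraction(5, 1)]
print(marginal_odds_ratio(p))  # 49/9
```

In this table both conditional odds-ratios and the marginal odds-ratio are at least 1, so the sign of the association between $X$ and $Y$ does not depend on whether $Z$ is conditioned on.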
Slide 9/27: Examples of MTP$_2$ distributions
Mostly from Karlin and Rinott (1980):
- Characteristic roots of a Wishart matrix $W$, or of $W_1 W_2^{-1}$, or $W_1 (W_1 + W_2)^{-1}$, where $W_1 \perp\!\!\!\perp W_2$ (Dykstra and Hewett, 1978);
- Ferromagnetic (attractive) Ising models (Lebowitz, 1972);
- Multivariate logistic density (Gumbel, 1961);
- Gaussian free fields (random height landscapes) (Dynkin, 1980);
- Markov chains with TP$_2$ transition densities;
- Order statistics $(X_{(1)}, \ldots, X_{(n)})$ if $X_1, \ldots, X_n$ are i.i.d. with density $f$;
- Gaussian latent tree models as in phylogenetics (Zwiernik, 2015);
- Many other examples...
Slide 10/27: Fundamental properties
A wealth of probability inequalities are satisfied for MTP$_2$ distributions (Karlin and Rinott, 1980). Also:
Proposition. Assume $X$ is MTP$_2$. Then
- If $A \subseteq V$, then the marginal $X_A = (X_v)_{v \in A}$ is MTP$_2$;
- If $C \subseteq V$, then the conditional distribution $\mathcal{L}(X_{V \setminus C} \mid X_C = x_C)$ is MTP$_2$ for almost all $x_C \in \mathcal{X}_C$;
- If $X$ is discrete and $Y$ is obtained from $X$ by collapsing neighboring states, then $Y$ is MTP$_2$;
- If $\varphi = (\varphi_v)_{v \in V}$ are non-decreasing, then $Y = \varphi(X)$ is MTP$_2$.
Slide 11/27: Positive association and MTP$_2$
Proposition. If $X$ is MTP$_2$ and $f$ and $g$ are non-decreasing in each of their arguments, then
$$\mathrm{Cov}\{f(X), g(X)\} \ge 0,$$
i.e. $X$ is positively associated.
Proof. Discrete case by Fortuin et al. (1971). General case by Sarkar (1969).
Slide 12/27: Covariance and independence
Proposition. If $X$ is positively associated and $A, B \subseteq V$ are disjoint, then
$$X_A \perp\!\!\!\perp X_B \iff \mathrm{Cov}(X_u, X_v) = 0 \text{ for all } u \in A, v \in B.$$
Proof. Shown in Lebowitz (1972).
Such a result is usually special for the Gaussian distribution.
So learning MTP$_2$ structure may be based on correlation analysis.
Slide 13/27: Multivariate Gaussian MTP$_2$ distributions
Proposition. Let $X \sim \mathcal{N}_V(0, \Sigma)$. Then $X$ is MTP$_2$ if and only if $K = \Sigma^{-1}$ is a positive definite Minkowski matrix (M-matrix), i.e. iff
$$k_{uv} \le 0 \quad \text{for } u \ne v \text{ and } u, v \in V.$$
Proof. See Bølviken (1982) and Karlin and Rinott (1983).
Since $k_{uv}$ is proportional to the negative partial correlation between $X_u$ and $X_v$, $X$ is MTP$_2$ if and only if all partial correlations are non-negative.
Note also that this is a convex restriction in $K$.
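The M-matrix criterion is easy to check numerically. The sketch below assumes the input is a valid (positive definite) covariance matrix and uses the standard identity $\rho_{uv \cdot \text{rest}} = -k_{uv} / \sqrt{k_{uu} k_{vv}}$ for partial correlations. The function names and the two example matrices are illustrative, not taken from the talk.

```python
import numpy as np

def partial_correlations(sigma):
    """Partial correlations given all other variables: -k_uv / sqrt(k_uu k_vv)."""
    k = np.linalg.inv(np.asarray(sigma, dtype=float))
    d = np.sqrt(np.diag(k))
    rho = -k / np.outer(d, d)
    np.fill_diagonal(rho, 1.0)
    return rho

def is_gaussian_mtp2(sigma, tol=1e-10):
    """True if all off-diagonal entries of K = inv(sigma) are <= 0 (up to tol)."""
    k = np.linalg.inv(np.asarray(sigma, dtype=float))
    off_diagonal = k[~np.eye(k.shape[0], dtype=bool)]
    return bool(np.all(off_diagonal <= tol))

# Hypothetical example 1: K is tridiagonal with non-positive off-diagonal entries.
chain_sigma = np.linalg.inv(np.array([[2.0, -1.0, 0.0],
                                      [-1.0, 2.0, -1.0],
                                      [0.0, -1.0, 2.0]]))

# Hypothetical example 2: all covariances are non-negative, yet variables 1 and 2
# have a negative partial correlation given variable 0.
collider_sigma = np.array([[1.0, 0.5, 0.5],
                           [0.5, 1.0, 0.0],
                           [0.5, 0.0, 1.0]])

print(is_gaussian_mtp2(chain_sigma))     # True
print(is_gaussian_mtp2(collider_sigma))  # False
```

The second example shows that non-negative covariances alone are not sufficient: the condition is on the concentration matrix, not on $\Sigma$.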
Slide 14/27: Mathematics marks
[Table with rows and columns Mechanics, Vectors, Algebra, Analysis, Statistics]
Empirical partial correlations (below the diagonal) and concentrations ($\times 1000$, on and above the diagonal) for 88 examination marks in five mathematical subjects. Essentially MTP$_2$.
Slide 15/27: Mathematics marks under MTP$_2$
Fitting under the MTP$_2$ constraint yields $\hat{K}$, which conforms with the graphical model below.
[Figure: graph on the vertices Vectors, Analysis, Algebra, Mechanics, Statistics]
Slide 16/27: Abstract conditional independence
An independence model $\perp_\sigma$ is a ternary relation over subsets of $V$. It is a semi-graphoid if for disjoint subsets $A, B, C, D$:
(S1) if $A \perp_\sigma B \mid C$ then $B \perp_\sigma A \mid C$ (symmetry);
(S2) if $A \perp_\sigma (B \cup D) \mid C$ then $A \perp_\sigma B \mid C$ and $A \perp_\sigma D \mid C$ (decomposition);
(S3) if $A \perp_\sigma (B \cup C) \mid D$ then $A \perp_\sigma B \mid (C \cup D)$ (weak union);
(S4) if $A \perp_\sigma B \mid C$ and $A \perp_\sigma D \mid (B \cup C)$, then $A \perp_\sigma (B \cup D) \mid C$ (contraction).
Any probabilistic independence model $\perp_P$ is a semi-graphoid.
It is a graphoid if (S1)-(S4) hold and
(S5) if $A \perp_\sigma B \mid (C \cup D)$ and $A \perp_\sigma C \mid (B \cup D)$ then $A \perp_\sigma (B \cup C) \mid D$ (intersection).
If $X$ has a density $f > 0$, its associated independence model $\perp_P$ is a graphoid.
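Slide 17/27: Conditional independence and total positivity
A probability distribution on $\mathcal{X}$ defines an independence model $\perp_P$ by
$$A \perp_P B \mid S \iff X_A \perp\!\!\!\perp X_B \mid X_S.$$
Proposition (Fallat et al. 2016). If $X$ is MTP$_2$, its independence model $\perp_P$ satisfies
(S6) $(A \perp_P B \mid C) \wedge (A \perp_P D \mid C) \implies A \perp_P (B \cup D) \mid C$ (composition);
(S7) $(u \perp_P v \mid C) \wedge (u \perp_P v \mid (C \cup w)) \implies (u \perp_P w \mid C) \vee (v \perp_P w \mid C)$ (singleton transitivity);
(S8) $(A \perp_P B \mid C) \wedge D \subseteq V \setminus (A \cup B) \implies A \perp_P B \mid (C \cup D)$ (upward stability).
These are all fulfilled for separation $\perp_G$ in undirected graphs, but not necessarily for an arbitrary probabilistic independence model $\perp_P$.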
Slide 18/27: Markov properties
Let $P$ be a probability distribution on $\mathcal{X} = \prod_{v \in V} \mathcal{X}_v$. The pairwise independence graph $G(P) = (V, E)$ is defined through the relation
$$uv \notin E \iff u \perp_P v \mid V \setminus \{u, v\}.$$
In other words, $G(P)$ is the smallest graph $G$ such that $P$ is pairwise Markov w.r.t. $G$.
We say that $P$ is globally Markov w.r.t. a graph $G$ if
$$A \perp_G B \mid S \implies A \perp_P B \mid S,$$
where $\perp_G$ is separation in the graph $G$.
Further, we say that $P$ is faithful to $G$ if
$$A \perp_G B \mid S \iff A \perp_P B \mid S,$$
i.e. if the independence models $\perp_P$ and $\perp_G$ are identical.
Slide 19/27: A main result
Theorem (Fallat et al. 2016). Assume the distribution $P$ of $X$ is MTP$_2$ with strictly positive density $f > 0$. Then $P$ is faithful to $G(P)$.
In other words, for MTP$_2$ distributions, the pairwise independence graph yields a complete picture of the independence relations in $P$.
It also implies that if $P$ is faithful to a DAG $\mathcal{D}$ and $P$ is MTP$_2$, then $\mathcal{D}$ must be perfect, i.e. all parents in the DAG are connected. So in this case, the undirected version of the DAG is chordal.
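For a Gaussian distribution the pairwise independence graph can be read off the zero pattern of $K = \Sigma^{-1}$, and any conditional independence $X_u \perp\!\!\!\perp X_v \mid X_S$ can be tested through the inverse of the corresponding sub-matrix of $\Sigma$. The sketch below uses a hypothetical three-variable MTP$_2$ Gaussian to compare two graph separation statements with the matching independence statements; the matrix and helper names are assumptions made for the illustration.

```python
import numpy as np

# Hypothetical concentration matrix on V = {0, 1, 2}; off-diagonal entries are
# <= 0, so the Gaussian distribution with this K is MTP2.
K = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
Sigma = np.linalg.inv(K)

def pairwise_independence_graph(k, tol=1e-10):
    """Edges uv with k_uv != 0, i.e. X_u and X_v dependent given all the rest."""
    n = k.shape[0]
    return {(u, v) for u in range(n) for v in range(u + 1, n) if abs(k[u, v]) > tol}

def conditionally_independent(sigma, u, v, given, tol=1e-10):
    """Gaussian check of X_u indep. X_v given X_S: zero entry in inv(Sigma_{uvS})."""
    idx = [u, v] + list(given)
    k_sub = np.linalg.inv(sigma[np.ix_(idx, idx)])
    return bool(abs(k_sub[0, 1]) < tol)

edges = pairwise_independence_graph(K)
print(edges == {(0, 1), (1, 2)})                     # True: the graph is the path 0 - 1 - 2
print(conditionally_independent(Sigma, 0, 2, [1]))   # True: {1} separates 0 and 2
print(conditionally_independent(Sigma, 0, 2, []))    # False: the empty set does not
```

Both independence statements agree with separation in the path, as faithfulness requires; the sketch checks only these two statements, not the full independence model.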
Slide 20/27: Graph decompositions and total positivity
Consider a chordal graph $G$ and an associated junction tree $\mathcal{T}$ of cliques.
Theorem (Fallat et al. 2016). If all separators $S$ in $\mathcal{T}$ are singletons, a distribution $P$ is MTP$_2$ if and only if all clique marginals $P_C$, $C \in \mathcal{C}$, are MTP$_2$.
Note that this covers trees in particular. If the separators are not singletons, it is easy to construct counterexamples.
And since the MTP$_2$ property is closed under marginalization, this implies that latent tree models with pairwise MTP$_2$ associations are MTP$_2$.
Slide 21/27: Pairwise interaction models
Theorem (Fallat et al. 2016). A distribution of the form
$$p(x) = \frac{1}{Z} \prod_{uv \in E} \psi_{uv}(x_u, x_v),$$
where the $\psi_{uv}$ are positive functions and $Z$ is a normalizing constant, is MTP$_2$ if and only if each $\psi_{uv}$ is an MTP$_2$ function.
This covers, in particular, ferromagnetic Ising models.
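As a small numerical illustration, the sketch below builds Ising-type tables with $\psi_{uv}(x_u, x_v) = \exp(J_{uv} s_u s_v)$, $s = 2x - 1 \in \{-1, +1\}$, and checks the MTP$_2$ inequality by brute force. The parametrization, the coupling values, and the helper names are assumptions chosen for the example; each $\psi_{uv}$ of this form is MTP$_2$ exactly when $J_{uv} \ge 0$.

```python
import itertools
import numpy as np

def ising_table(couplings, n):
    """Joint table p[x] proportional to exp(sum_uv J_uv s_u s_v), with s = 2x - 1."""
    p = np.zeros((2,) * n)
    for x in itertools.product((0, 1), repeat=n):
        s = [2 * xi - 1 for xi in x]
        p[x] = np.exp(sum(j * s[u] * s[v] for (u, v), j in couplings.items()))
    return p / p.sum()

def is_mtp2(p, tol=1e-12):
    """Brute-force check of p(x) p(y) <= p(x ^ y) p(x v y) on a finite grid."""
    points = list(itertools.product(*(range(m) for m in p.shape)))
    for x, y in itertools.combinations(points, 2):
        lo = tuple(min(a, b) for a, b in zip(x, y))
        hi = tuple(max(a, b) for a, b in zip(x, y))
        if p[x] * p[y] > p[lo] * p[hi] + tol:
            return False
    return True

# Hypothetical couplings on the path 0 - 1 - 2.
ferromagnetic = ising_table({(0, 1): 0.5, (1, 2): 0.8}, n=3)
mixed_signs = ising_table({(0, 1): 0.5, (1, 2): -0.8}, n=3)

print(is_mtp2(ferromagnetic))  # True
print(is_mtp2(mixed_signs))    # False
```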
Slide 22/27: Higher order interactions
Let $X = (X_v)_{v \in V}$ take values in $\mathcal{X} = \prod_{v \in V} \mathcal{X}_v$, where each $\mathcal{X}_v$ is finite. Let $\mathcal{D}$ denote the power set of $V$. If $p(x) > 0$ for all $x$, we can expand
$$\log p(x) = \sum_{D \in \mathcal{D}} \theta_D(x), \qquad (1)$$
where the interactions $\theta_D$ depend on $x$ through $x_D$ only.
For uniqueness, we may w.l.o.g. assume $0 \in \mathcal{X}_v$ and require that $\theta_D(x) = 0$ whenever $x_d = 0$ for some $d \in D$.
In the binary case we may use simpler notation by letting $\theta_D(1_D) := \theta_D$ for all $D \in \mathcal{D}$.
Slide 23/27: Higher order interactions
For a fixed pair $u, w \in V$, we define $\gamma_{uw}$ on $\mathcal{X}$ by
$$\gamma_{uw}(x) = \sum_{D : \{u, w\} \subseteq D} \theta_D(x).$$
Proposition (Fallat et al. 2016). Let $P$ be strictly positive. Then $P$ is MTP$_2$ if and only if for all $A \subseteq V$ with $|A| \ge 2$ and any given $u, w \in V$ the function $\gamma_{uw}$ is non-negative, non-decreasing, and supermodular over $\mathcal{X}_A$, where $\mathcal{X}_A$ are those $x$ with support $A$.
Slide 24/27: Binary log-linear expansions
For the binary case, the previous result specializes:
Corollary (Bartolucci and Forcina, 2000). Let $P$ be a binary distribution with
$$\log p(x) = \sum_{D \subseteq \{v : x_v = 1\}} \theta_D.$$
Then $P$ is MTP$_2$ if and only if for all $A$ with $|A| \ge 2$ and all $\{u, w\} \subseteq V$ we have
$$\sum_{D : \{u, w\} \subseteq D \subseteq A} \theta_D \ge 0.$$
Slide 25/27: Causal betweenness
Let $X = (X_1 = 1_A, X_2 = 1_B, X_3 = 1_C)$ be binary indicator functions of events $A$, $B$, $C$. Reichenbach (1956) says $B$ is causally between $A$ and $C$ if $P(C \mid B \cap A) = P(C \mid B)$ and
$$1 > P(C \mid B) > P(C \mid A) > P(C) > 0, \qquad 1 > P(A \mid B) > P(A \mid C) > P(A) > 0.$$
In general, causal betweenness does not imply MTP$_2$: if we let $p_{101} = 0$, $p_{000} = 4/10$, and $p_{ijk} = 1/10$ for the remaining six possibilities, $B$ is causally between $A$ and $C$, but $X$ is not MTP$_2$ since $0 = p_{101} p_{000} < p_{100} p_{001}$.
However, if $P(X = x) > 0$ for all $x$ and $B$ is causally between $A$ and $C$, then $P$ is MTP$_2$.
Conversely, if $P(X = x) > 0$ for all $x$, $P$ is MTP$_2$, and the independence graph of $P$ is the path $A - B - C$, then $B$ is causally between $A$ and $C$. This follows from the faithfulness of $P$.
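The counterexample can be verified with exact arithmetic. The sketch below encodes the table from the slide ($p_{101} = 0$, $p_{000} = 4/10$, all other cells $1/10$), checks Reichenbach's conditions, and confirms the violated MTP$_2$ inequality; the helper functions are illustrative.

```python
from fractions import Fraction
from itertools import product

# Table from the slide: x = (1_A, 1_B, 1_C).
p = {x: Fraction(1, 10) for x in product((0, 1), repeat=3)}
p[(1, 0, 1)] = Fraction(0)
p[(0, 0, 0)] = Fraction(4, 10)

def in_a(x):
    return x[0] == 1

def in_b(x):
    return x[1] == 1

def in_c(x):
    return x[2] == 1

def prob(event):
    """Probability of the set of cells where event(x) is true."""
    return sum(q for x, q in p.items() if event(x))

def cond(event, given):
    """Conditional probability P(event | given)."""
    return prob(lambda x: event(x) and given(x)) / prob(given)

screening_off = cond(in_c, lambda x: in_a(x) and in_b(x)) == cond(in_c, in_b)
chain_c = 1 > cond(in_c, in_b) > cond(in_c, in_a) > prob(in_c) > 0
chain_a = 1 > cond(in_a, in_b) > cond(in_a, in_c) > prob(in_a) > 0
causally_between = screening_off and chain_c and chain_a

# MTP2 would require p_100 p_001 <= p_000 p_101, which fails here.
mtp2_violated = p[(1, 0, 0)] * p[(0, 0, 1)] > p[(0, 0, 0)] * p[(1, 0, 1)]

print(causally_between)  # True
print(mtp2_violated)     # True
```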
Slide 26/27: Some implications for structural learning
- A distribution is signed MTP$_2$ if sign changes $\sigma_v \in \{-1, 1\}$ can be allocated to $X_v$ so that $Y_v = \sigma_v X_v$, $v \in V$, is MTP$_2$;
- The MTP$_2$ restriction is convex in $\log f$, hence lends itself to convex optimization;
- So a potential learning strategy first finds a Chow-Liu tree, then changes signs so associations along edges are positive, and finally optimizes a scoring function (e.g. penalized likelihood) under MTP$_2$ constraints.
To be explored, so watch this space...
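The first two steps of the proposed strategy can be sketched for Gaussian data, where the Chow-Liu tree is a maximum spanning tree with edge weights increasing in $|\rho_{uv}|$. The code below is only an illustration under that Gaussian assumption: the correlation matrix is hypothetical, the function names are made up, and the final constrained optimization step is not shown.

```python
import numpy as np

def chow_liu_tree(corr):
    """Maximum spanning tree on |correlation| (Prim's algorithm), rooted at 0.

    Returns (parent, child) edges in the order in which children are attached.
    """
    n = corr.shape[0]
    weight = np.abs(corr)
    in_tree = [0]
    edges = []
    while len(in_tree) < n:
        best = max(
            ((u, v) for u in in_tree for v in range(n) if v not in in_tree),
            key=lambda e: weight[e],
        )
        edges.append(best)
        in_tree.append(best[1])
    return edges

def sign_flips(corr, edges):
    """Signs sigma_v in {-1, 1} making every tree-edge correlation non-negative."""
    sigma = np.ones(corr.shape[0], dtype=int)
    for parent, child in edges:  # the parent's sign is already fixed at this point
        sigma[child] = sigma[parent] * (1 if corr[parent, child] >= 0 else -1)
    return sigma

# Hypothetical correlation matrix with mixed signs.
corr = np.array([[1.0, -0.8, 0.5],
                 [-0.8, 1.0, -0.6],
                 [0.5, -0.6, 1.0]])

edges = chow_liu_tree(corr)
signs = sign_flips(corr, edges)
flipped = corr * np.outer(signs, signs)  # correlations of Y_v = sigma_v X_v

print(edges)           # [(0, 1), (1, 2)]
print(signs.tolist())  # [1, -1, 1]
```

After the flips, the correlations along the tree edges are positive; whether the flipped distribution is actually MTP$_2$ still has to be decided by the constrained fit.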
Slide 27/27
There are many more things to be said...
Thank you!
References
Bartolucci, F. and Forcina, A. (2000). A likelihood ratio test for MTP$_2$ within binary variables. Ann. Statist., 28(4).
Bølviken, E. (1982). Probability inequalities for the multivariate normal with non-negative partial correlations. Scand. J. Statist., 9.
Dykstra, R. L. and Hewett, J. E. (1978). Positive dependence of the roots of a Wishart matrix. Ann. Statist., 6(1).
Dynkin, E. (1980). Markov processes and random fields. Bull. Amer. Math. Soc., 3(3).
Fallat, S., Lauritzen, S., Sadeghi, K., Uhler, C., Wermuth, N., and Zwiernik, P. (2016). Total positivity in Markov structures. Ann. Statist., to appear. arXiv.
Fortuin, C. M., Kasteleyn, P. W., and Ginibre, J. (1971). Correlation inequalities on some partially ordered sets. Comm. Math. Phys., 22(2).
Gumbel, E. J. (1961). Bivariate logistic distributions. J. Amer. Statist. Assoc., 56(294).
Karlin, S. and Rinott, Y. (1980). Classes of orderings of measures and related correlation inequalities. I. Multivariate totally positive distributions. J. Multiv. Anal., 10(4).
Karlin, S. and Rinott, Y. (1983). M-matrices as covariance matrices of multinormal distributions. Linear Algebra Appl., 52.
Lebowitz, J. L. (1972). Bounds on the correlations and analyticity properties of ferromagnetic Ising spin systems. Comm. Math. Phys., 28(4).
Reichenbach, H. (1956). The Direction of Time. University of California Press, Berkeley, CA.
Sarkar, T. K. (1969). Some lower bounds of reliability. Tech. Report No. 124, Department of Operations Research and Department of Statistics, Stanford University.
Zwiernik, P. (2015). Semialgebraic Statistics and Latent Tree Models. Number 146 in Monographs on Statistics and Applied Probability. Chapman & Hall.
Markovian Combination 1/45 London Math Society Durham Symposium on Mathematical Aspects of Graphical Models. June 30 - July, 2008 Markovian Combination of Decomposable Model Structures: MCMoSt Sung-Ho
More informationCausal Models with Hidden Variables
Causal Models with Hidden Variables Robin J. Evans www.stats.ox.ac.uk/ evans Department of Statistics, University of Oxford Quantum Networks, Oxford August 2017 1 / 44 Correlation does not imply causation
More informationChapter 9: Relations Relations
Chapter 9: Relations 9.1 - Relations Definition 1 (Relation). Let A and B be sets. A binary relation from A to B is a subset R A B, i.e., R is a set of ordered pairs where the first element from each pair
More informationBayesian (conditionally) conjugate inference for discrete data models. Jon Forster (University of Southampton)
Bayesian (conditionally) conjugate inference for discrete data models Jon Forster (University of Southampton) with Mark Grigsby (Procter and Gamble?) Emily Webb (Institute of Cancer Research) Table 1:
More informationThe Minimum Rank, Inverse Inertia, and Inverse Eigenvalue Problems for Graphs. Mark C. Kempton
The Minimum Rank, Inverse Inertia, and Inverse Eigenvalue Problems for Graphs Mark C. Kempton A thesis submitted to the faculty of Brigham Young University in partial fulfillment of the requirements for
More informationIntelligent Systems:
Intelligent Systems: Undirected Graphical models (Factor Graphs) (2 lectures) Carsten Rother 15/01/2015 Intelligent Systems: Probabilistic Inference in DGM and UGM Roadmap for next two lectures Definition
More informationVariational Inference (11/04/13)
STA561: Probabilistic machine learning Variational Inference (11/04/13) Lecturer: Barbara Engelhardt Scribes: Matt Dickenson, Alireza Samany, Tracy Schifeling 1 Introduction In this lecture we will further
More informationBayesian Machine Learning - Lecture 7
Bayesian Machine Learning - Lecture 7 Guido Sanguinetti Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh gsanguin@inf.ed.ac.uk March 4, 2015 Today s lecture 1
More informationON STRONGLY PRIME IDEALS AND STRONGLY ZERO-DIMENSIONAL RINGS. Christian Gottlieb
ON STRONGLY PRIME IDEALS AND STRONGLY ZERO-DIMENSIONAL RINGS Christian Gottlieb Department of Mathematics, University of Stockholm SE-106 91 Stockholm, Sweden gottlieb@math.su.se Abstract A prime ideal
More information11 : Gaussian Graphic Models and Ising Models
10-708: Probabilistic Graphical Models 10-708, Spring 2017 11 : Gaussian Graphic Models and Ising Models Lecturer: Bryon Aragam Scribes: Chao-Ming Yen 1 Introduction Different from previous maximum likelihood
More informationAlgebraic Representations of Gaussian Markov Combinations
Submitted to the Bernoulli Algebraic Representations of Gaussian Markov Combinations M. SOFIA MASSA 1 and EVA RICCOMAGNO 2 1 Department of Statistics, University of Oxford, 1 South Parks Road, Oxford,
More information1 Undirected Graphical Models. 2 Markov Random Fields (MRFs)
Machine Learning (ML, F16) Lecture#07 (Thursday Nov. 3rd) Lecturer: Byron Boots Undirected Graphical Models 1 Undirected Graphical Models In the previous lecture, we discussed directed graphical models.
More informationProbabilistic Graphical Models
Probabilistic Graphical Models Lecture 4 Learning Bayesian Networks CS/CNS/EE 155 Andreas Krause Announcements Another TA: Hongchao Zhou Please fill out the questionnaire about recitations Homework 1 out.
More informationThe intersection axiom of
The intersection axiom of conditional independence: some new [?] results Richard D. Gill Mathematical Institute, University Leiden This version: 26 March, 2019 (X Y Z) & (X Z Y) X (Y, Z) Presented at Algebraic
More informationarxiv: v4 [math.st] 11 Jul 2017
UNIFYING MARKOV PROPERTIES FOR GRAPHICAL MODELS arxiv:1608.05810v4 [math.st] 11 Jul 2017 By Steffen Lauritzen and Kayvan Sadeghi University of Copenhagen and University of Cambridge Several types of graphs
More informationRegression models for multivariate ordered responses via the Plackett distribution
Journal of Multivariate Analysis 99 (2008) 2472 2478 www.elsevier.com/locate/jmva Regression models for multivariate ordered responses via the Plackett distribution A. Forcina a,, V. Dardanoni b a Dipartimento
More informationPart IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015
Part IA Probability Theorems Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.
More informationCausality in Econometrics (3)
Graphical Causal Models References Causality in Econometrics (3) Alessio Moneta Max Planck Institute of Economics Jena moneta@econ.mpg.de 26 April 2011 GSBC Lecture Friedrich-Schiller-Universität Jena
More informationARTICLE IN PRESS. Journal of Multivariate Analysis ( ) Contents lists available at ScienceDirect. Journal of Multivariate Analysis
Journal of Multivariate Analysis ( ) Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva Marginal parameterizations of discrete models
More informationCapturing Independence Graphically; Undirected Graphs
Capturing Independence Graphically; Undirected Graphs COMPSCI 276, Spring 2014 Set 2: Rina Dechter (Reading: Pearl chapters 3, Darwiche chapter 4) 1 Constraint Networks Example: map coloring Variables
More informationGraphical Model Inference with Perfect Graphs
Graphical Model Inference with Perfect Graphs Tony Jebara Columbia University July 25, 2013 joint work with Adrian Weller Graphical models and Markov random fields We depict a graphical model G as a bipartite
More informationFisher Information in Gaussian Graphical Models
Fisher Information in Gaussian Graphical Models Jason K. Johnson September 21, 2006 Abstract This note summarizes various derivations, formulas and computational algorithms relevant to the Fisher information
More informationLearning from Sensor Data: Set II. Behnaam Aazhang J.S. Abercombie Professor Electrical and Computer Engineering Rice University
Learning from Sensor Data: Set II Behnaam Aazhang J.S. Abercombie Professor Electrical and Computer Engineering Rice University 1 6. Data Representation The approach for learning from data Probabilistic
More informationGaussian Graphical Models: An Algebraic and Geometric Perspective
Gaussian Graphical Models: An Algebraic and Geometric Perspective Caroline Uhler arxiv:707.04345v [math.st] 3 Jul 07 Abstract Gaussian graphical models are used throughout the natural sciences, social
More information