Decomposable and Directed Graphical Gaussian Models
1 Decomposable and Directed Graphical Gaussian Models

Graphical Models and Inference, Lecture 13, Michaelmas Term 2009
November 26, 2009
2 The Wishart distribution

The Wishart distribution is the sampling distribution of the matrix of sums of squares and products. More precisely: a random $d \times d$ matrix $W$ has a $d$-dimensional Wishart distribution with parameter $\Sigma$ and $n$ degrees of freedom if

$$W \stackrel{\mathcal{D}}{=} \sum_{\nu=1}^{n} X^{\nu}(X^{\nu})^{\top}$$

where $X^{\nu} \sim \mathcal{N}_d(0, \Sigma)$. We then write $W \sim \mathcal{W}_d(n, \Sigma)$.

The Wishart is the multivariate analogue of the $\chi^2$: $\mathcal{W}_1(n, \sigma^2) = \sigma^2 \chi^2(n)$.

If $W \sim \mathcal{W}_d(n, \Sigma)$, its mean is $E(W) = n\Sigma$.
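As a quick illustration (not from the slides), the defining sum of outer products can be simulated and its average compared with $n\Sigma$; all names and the tolerance below are illustrative choices.

```python
# Minimal simulation sketch of the Wishart definition above.
import numpy as np

rng = np.random.default_rng(0)
d, n, reps = 3, 10, 20000
A = rng.standard_normal((d, d))
Sigma = A @ A.T + d * np.eye(d)          # an arbitrary positive definite Sigma

W_mean = np.zeros((d, d))
for _ in range(reps):
    X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)  # rows are X^nu
    W_mean += X.T @ X / reps             # W = sum_nu X^nu (X^nu)^T

print(np.allclose(W_mean, n * Sigma, atol=2.0))              # E(W) = n Sigma
```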
3 Basic properties

If $W_1$ and $W_2$ are independent with $W_i \sim \mathcal{W}_d(n_i, \Sigma)$, then $W_1 + W_2 \sim \mathcal{W}_d(n_1 + n_2, \Sigma)$.

If $A$ is an $r \times d$ matrix and $W \sim \mathcal{W}_d(n, \Sigma)$, then $AWA^{\top} \sim \mathcal{W}_r(n, A\Sigma A^{\top})$.

For $r = 1$ we get that when $W \sim \mathcal{W}_d(n, \Sigma)$ and $\lambda \in \mathbb{R}^d$,

$$\lambda^{\top} W \lambda \sim \sigma_{\lambda}^2 \chi^2(n), \quad \text{where } \sigma_{\lambda}^2 = \lambda^{\top} \Sigma \lambda.$$
4 The Wishart density

If $W \sim \mathcal{W}_d(n, \Sigma)$, where $\Sigma$ is regular, then $W$ is regular with probability one if and only if $n \geq d$. When $n \geq d$ the Wishart distribution has density

$$f_d(w \mid n, \Sigma) = c(d, n)^{-1} (\det \Sigma)^{-n/2} (\det w)^{(n-d-1)/2} e^{-\operatorname{tr}(\Sigma^{-1} w)/2}$$

for $w$ positive definite, and $0$ otherwise. The Wishart constant $c(d, n)$ is

$$c(d, n) = 2^{nd/2} (2\pi)^{d(d-1)/4} \prod_{i=1}^{d} \Gamma\{(n + 1 - i)/2\}.$$
5 Gaussian graphical models

Consider $X = (X_v, v \in V) \sim \mathcal{N}_V(0, \Sigma)$ with $\Sigma$ regular and $K = \Sigma^{-1}$. The concentration matrix of the conditional distribution of $(X_\alpha, X_\beta)$ given $X_{V \setminus \{\alpha, \beta\}}$ is

$$K_{\{\alpha,\beta\}} = \begin{pmatrix} k_{\alpha\alpha} & k_{\alpha\beta} \\ k_{\beta\alpha} & k_{\beta\beta} \end{pmatrix}.$$

Hence

$$\alpha \perp\!\!\!\perp \beta \mid V \setminus \{\alpha, \beta\} \iff k_{\alpha\beta} = 0.$$

Thus the dependence graph $G(K)$ of a regular Gaussian distribution is given by $\alpha \not\sim \beta \iff k_{\alpha\beta} = 0$.
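A small numerical sketch of this equivalence (the covariance matrix below is an illustrative choice, a Gaussian chain 1 - 2 - 3 in which $\sigma_{13} = \sigma_{12}\sigma_{23}$): inverting $\Sigma$ exposes the zero at the missing edge.

```python
# Sketch: reading the dependence graph off K = Sigma^{-1}.
import numpy as np

# Covariance of a Gaussian chain 1 - 2 - 3 (illustrative values)
Sigma = np.array([[1.00, 0.50, 0.25],
                  [0.50, 1.00, 0.50],
                  [0.25, 0.50, 1.00]])
K = np.linalg.inv(Sigma)
print(np.round(K, 10))   # k_13 = 0: X1 _||_ X3 | X2, so edge 1-3 is absent
```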
6 Definition and likelihood function

$\mathcal{S}(G)$ denotes the symmetric matrices $A$ with $a_{\alpha\beta} = 0$ unless $\alpha \sim \beta$, and $\mathcal{S}^+(G)$ their positive definite elements.

A Gaussian graphical model for $X$ specifies $X$ as multivariate normal with $K \in \mathcal{S}^+(G)$ and otherwise unknown.

The likelihood function based on a sample of size $n$ is

$$L(K) \propto (\det K)^{n/2} e^{-\operatorname{tr}(KW)/2},$$

where $W$ is the Wishart matrix of sums of squares and products, $W \sim \mathcal{W}_V(n, \Sigma)$ with $\Sigma^{-1} = K \in \mathcal{S}^+(G)$.
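In code, the log of this likelihood (up to an additive constant) is short; a sketch with illustrative names:

```python
# Sketch: log-likelihood, up to a constant, of a concentration matrix K
# given the matrix W of sums of squares and products and sample size n.
import numpy as np

def log_likelihood(K, W, n):
    sign, logdet = np.linalg.slogdet(K)
    assert sign > 0, "K must be positive definite"
    return 0.5 * (n * logdet - np.trace(K @ W))
```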
7 Canonical parametrization

Define the matrices $A^u$, $u \in V \cup E$, as those with elements

$$a^u_{ij} = \begin{cases} 1 & \text{if } u \in V \text{ and } i = j = u, \\ 1 & \text{if } u \in E \text{ and } u = \{i, j\}, \\ 0 & \text{otherwise.} \end{cases}$$

Then, as $K \in \mathcal{S}(G)$,

$$K = \sum_{v \in V} k_v A^v + \sum_{e \in E} k_e A^e \qquad (1)$$

and hence

$$\operatorname{tr}(KW) = \sum_{v \in V} k_v \operatorname{tr}(A^v W) + \sum_{e \in E} k_e \operatorname{tr}(A^e W).$$
8 Likelihood equations

Hence we can identify the family as a (regular and canonical) exponential family with $-\operatorname{tr}(A^u W)/2$, $u \in V \cup E$, as canonical sufficient statistics. This yields the likelihood equations

$$\operatorname{tr}(A^u W) = n \operatorname{tr}(A^u \hat{\Sigma}), \quad u \in V \cup E,$$

which can also be expressed as

$$n\hat{\sigma}_{vv} = w_{vv}, \quad v \in V, \qquad n\hat{\sigma}_{\alpha\beta} = w_{\alpha\beta}, \quad \{\alpha, \beta\} \in E,$$

or, equivalently,

$$n\hat{\Sigma}_{cc} = w_{cc} \quad \text{for all cliques } c \in \mathcal{C}(G).$$

We should remember the model restriction $\hat{\Sigma}^{-1} \in \mathcal{S}^+(G)$.
9 Iterative proportional scaling

For $K \in \mathcal{S}^+(G)$ and $c \in \mathcal{C}$, define the operation of adjusting the $c$-marginal as follows. Let $a = V \setminus c$ and

$$T_c K = \begin{pmatrix} n(w_{cc})^{-1} + K_{ca}(K_{aa})^{-1}K_{ac} & K_{ca} \\ K_{ac} & K_{aa} \end{pmatrix}. \qquad (2)$$

The $c$-marginal covariance $\Sigma_{cc}$ corresponding to the adjusted concentration matrix becomes

$$\Sigma_{cc} = \{(T_c K)^{-1}\}_{cc} = \left\{ n(w_{cc})^{-1} + K_{ca}(K_{aa})^{-1}K_{ac} - K_{ca}(K_{aa})^{-1}K_{ac} \right\}^{-1} = w_{cc}/n,$$

hence $T_c K$ does indeed adjust the marginal. From (2) it is seen that the pattern of zeros in $K$ is preserved under the operation $T_c$, and it can also be seen to stay positive definite.
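A sketch of $T_c$ as a function (index sets are zero-based and all names illustrative); the final lines confirm numerically that the adjusted $c$-marginal covariance equals $w_{cc}/n$.

```python
# Sketch of the marginal-adjustment operation T_c in (2).
import numpy as np

def adjust_marginal(K, W, c, n):
    """Replace the (c,c) block of K by n*(w_cc)^{-1} + K_ca K_aa^{-1} K_ac."""
    c = np.asarray(c)
    a = np.setdiff1d(np.arange(K.shape[0]), c)
    K = K.copy()
    Kca, Kaa = K[np.ix_(c, a)], K[np.ix_(a, a)]
    K[np.ix_(c, c)] = (n * np.linalg.inv(W[np.ix_(c, c)])
                       + Kca @ np.linalg.solve(Kaa, Kca.T))
    return K

rng = np.random.default_rng(1)
X = rng.standard_normal((20, 4)); W = X.T @ X
K1 = adjust_marginal(np.eye(4), W, [0, 1], n=20)
print(np.allclose(np.linalg.inv(K1)[:2, :2], W[:2, :2] / 20))  # True
```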
10 Convergence of IPS

Next we choose any ordering $(c_1, \ldots, c_k)$ of the cliques in $G$. Choose further $K_0 = I$ and define for $r = 0, 1, \ldots$

$$K_{r+1} = (T_{c_1} \circ \cdots \circ T_{c_k}) K_r.$$

Then we have: consider a sample from a covariance selection model with graph $G$. Then $\hat{K} = \lim_{r \to \infty} K_r$, provided the maximum likelihood estimate $\hat{K}$ of $K$ exists.
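Putting the pieces together, a compact (and purely illustrative) implementation of the full IPS iteration; for the 4-cycle used below, the non-edges of $G$ stay exactly zero in $\hat{K}$ because $T_c$ only touches clique blocks.

```python
# Sketch of iterative proportional scaling: K_{r+1} = (T_c1 ... T_ck) K_r.
import numpy as np

def ips(W, n, cliques, max_cycles=200, tol=1e-10):
    d = W.shape[0]
    K = np.eye(d)                                   # K_0 = I
    for _ in range(max_cycles):
        K_old = K.copy()
        for c in map(np.asarray, cliques):          # one cycle T_c1 ... T_ck
            a = np.setdiff1d(np.arange(d), c)
            Kca, Kaa = K[np.ix_(c, a)], K[np.ix_(a, a)]
            K[np.ix_(c, c)] = (n * np.linalg.inv(W[np.ix_(c, c)])
                               + Kca @ np.linalg.solve(Kaa, Kca.T))
        if np.abs(K - K_old).max() < tol:
            break
    return K

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 4)); W = X.T @ X
K_hat = ips(W, n=50, cliques=[[0, 1], [1, 2], [2, 3], [0, 3]])  # 4-cycle
print(np.round(K_hat, 4))        # entries (0,2) and (1,3) remain zero
```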
11 Decomposable models

If the graph $G$ is chordal, we say that the graphical model is decomposable. In this case, the IPS algorithm converges in a finite number of steps, as in the discrete case. We also have the familiar factorization of densities

$$f(x \mid \Sigma) = \frac{\prod_{C \in \mathcal{C}} f(x_C \mid \Sigma_C)}{\prod_{S \in \mathcal{S}} f(x_S \mid \Sigma_S)^{\nu(S)}} \qquad (3)$$

where $\nu(S)$ is the number of times $S$ appears as an intersection between neighbouring cliques of a junction tree for $\mathcal{C}$.
12 Relations for trace and determinant

Using the factorization (3) we can for example match the expressions for the trace and determinant of $\Sigma$:

$$\operatorname{tr}(KW) = \sum_{C \in \mathcal{C}} \operatorname{tr}(K_C W_C) - \sum_{S \in \mathcal{S}} \nu(S) \operatorname{tr}(K_S W_S)$$

and further

$$\det \Sigma = \{\det(K)\}^{-1} = \frac{\prod_{C \in \mathcal{C}} \det(\Sigma_C)}{\prod_{S \in \mathcal{S}} \{\det(\Sigma_S)\}^{\nu(S)}}.$$

These are some of many relations that can be derived using the decomposition property of chordal graphs.
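A numerical sketch of the determinant identity for the chain 1 - 2 - 3 (cliques $\{1,2\}$, $\{2,3\}$, separator $\{2\}$); the covariance below is an illustrative choice satisfying the model, i.e. $\sigma_{13} = \sigma_{12}\sigma_{23}/\sigma_{22}$.

```python
# Sketch: det(Sigma) = det(Sigma_C1) * det(Sigma_C2) / det(Sigma_S).
import numpy as np

Sigma = np.array([[2.00, 0.80, 0.32],
                  [0.80, 1.00, 0.40],
                  [0.32, 0.40, 1.50]])
C1, C2, S = [0, 1], [1, 2], [1]
lhs = np.linalg.det(Sigma)
rhs = (np.linalg.det(Sigma[np.ix_(C1, C1)])
       * np.linalg.det(Sigma[np.ix_(C2, C2)])
       / np.linalg.det(Sigma[np.ix_(S, S)]))
print(np.isclose(lhs, rhs))  # True
```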
13 Maximum likelihood estimates

The same factorization clearly holds for the maximum likelihood estimates:

$$f(x \mid \hat{\Sigma}) = \frac{\prod_{C \in \mathcal{C}} f(x_C \mid \hat{\Sigma}_C)}{\prod_{S \in \mathcal{S}} f(x_S \mid \hat{\Sigma}_S)^{\nu(S)}}. \qquad (4)$$

Moreover, it follows from the general likelihood equations that $\hat{\Sigma}_A = W_A / n$ whenever $A$ is complete. Exploiting this, we can obtain an explicit formula for the maximum likelihood estimate in the case of a chordal graph.
14 An explicit formula for the MLE

For a $d \times e$ matrix $A = \{a_{\gamma\mu}\}_{\gamma \in d, \mu \in e}$ we let $[A]^V$ denote the matrix obtained from $A$ by filling up with zero entries to obtain full dimension $V \times V$, i.e.

$$\left([A]^V\right)_{\gamma\mu} = \begin{cases} a_{\gamma\mu} & \text{if } \gamma \in d, \mu \in e, \\ 0 & \text{otherwise.} \end{cases}$$

The maximum likelihood estimate exists if and only if $n \geq |C|$ for all $C \in \mathcal{C}$. Then the following simple formula holds for the maximum likelihood estimate of $K$:

$$\hat{K} = n \left\{ \sum_{C \in \mathcal{C}} \left[(w_C)^{-1}\right]^V - \sum_{S \in \mathcal{S}} \nu(S) \left[(w_S)^{-1}\right]^V \right\}.$$
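The formula translates directly into code; a sketch with illustrative names, where `separators` carries the multiplicities $\nu(S)$.

```python
# Sketch of K_hat = n { sum_C [(w_C)^{-1}]^V - sum_S nu(S) [(w_S)^{-1}]^V }.
import numpy as np

def pad(M, idx, d):
    """[M]^V: embed M into a d x d matrix of zeros at rows/columns idx."""
    out = np.zeros((d, d))
    out[np.ix_(idx, idx)] = M
    return out

def mle_concentration(W, n, cliques, separators):
    """separators is a list of (S, nu(S)) pairs."""
    d = W.shape[0]
    K = np.zeros((d, d))
    for C in cliques:
        K += pad(np.linalg.inv(W[np.ix_(C, C)]), C, d)
    for S, nu in separators:
        K -= nu * pad(np.linalg.inv(W[np.ix_(S, S)]), S, d)
    return n * K
```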
15 Example: mathematics marks

[Graph on the five variables 1:Mechanics, 2:Vectors, 3:Algebra, 4:Analysis, 5:Statistics.]

This graph is chordal with cliques $\{1, 2, 3\}$ and $\{3, 4, 5\}$, with separator $S = \{3\}$ having $\nu(\{3\}) = 1$.
16 Example continued

Since one degree of freedom is lost by subtracting the average, we get in this example

$$\hat{K} = 87 \begin{pmatrix}
w^{11}_{[123]} & w^{12}_{[123]} & w^{13}_{[123]} & 0 & 0 \\
w^{21}_{[123]} & w^{22}_{[123]} & w^{23}_{[123]} & 0 & 0 \\
w^{31}_{[123]} & w^{32}_{[123]} & w^{33}_{[123]} + w^{33}_{[345]} - 1/w_{33} & w^{34}_{[345]} & w^{35}_{[345]} \\
0 & 0 & w^{43}_{[345]} & w^{44}_{[345]} & w^{45}_{[345]} \\
0 & 0 & w^{53}_{[345]} & w^{54}_{[345]} & w^{55}_{[345]}
\end{pmatrix}$$

where $w^{ij}_{[123]}$ is the $ij$th element of the inverse of

$$W_{[123]} = \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix}$$

and so on.
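A numerical sketch of this example on simulated (illustrative) data, with zero-based cliques $\{0,1,2\}$ and $\{2,3,4\}$, separator $\{2\}$, and $n = 87$ effective degrees of freedom; the check confirms that the clique marginals of $\hat{\Sigma} = \hat{K}^{-1}$ equal $w_{CC}/87$, as the likelihood equations require.

```python
# Sketch: the explicit MLE for the marks graph on simulated data.
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((88, 5))
X = X - X.mean(axis=0)                 # one degree of freedom lost
W = X.T @ X
n, C1, C2 = 87, [0, 1, 2], [2, 3, 4]

K_hat = np.zeros((5, 5))
for C in (C1, C2):
    K_hat[np.ix_(C, C)] += np.linalg.inv(W[np.ix_(C, C)])
K_hat[2, 2] -= 1.0 / W[2, 2]           # separator {3}, nu = 1
K_hat *= n

Sigma_hat = np.linalg.inv(K_hat)
for C in (C1, C2):                     # clique marginals match W / n
    print(np.allclose(Sigma_hat[np.ix_(C, C)], W[np.ix_(C, C)] / n))
```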
17 Recursive structural equation systems

Consider a directed acyclic graph $\mathcal{D}$ and associate with every vertex $v$ a random variable $X_v$. Consider now the equation system

$$X_v \leftarrow \alpha_v X_{\mathrm{pa}(v)} + \beta_v + U_v, \quad v \in V, \qquad (5)$$

where $U_v$, $v \in V$, are independent random disturbances with $U_v \sim \mathcal{N}(0, \sigma_v^2)$. Such an equation system is known as a recursive structural equation system.

Structural equation systems are used heavily in the social sciences and in economics. The term structural refers to the fact that the equations are assumed to be stable under intervention, so that fixing a value of $x_v$ would change the system only by removing the line in the equation system (5) defining $x_v$.
18 The directed Markov property

A recursive structural equation system defines a multivariate Gaussian distribution which satisfies the directed Markov property of $\mathcal{D}$, since the joint density becomes

$$f(x \mid \alpha, \sigma) = \prod_v (2\pi)^{-1/2} \sigma_v^{-1} e^{-\frac{(x_v - \alpha_v x_{\mathrm{pa}(v)} - \beta_v)^2}{2\sigma_v^2}} = (2\pi)^{-|V|/2} \left( \prod_v \sigma_v^{-1} \right) e^{-\sum_v \frac{(x_v - \alpha_v x_{\mathrm{pa}(v)} - \beta_v)^2}{2\sigma_v^2}},$$

from which the joint concentration matrix $K$ can easily be derived.
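One convenient way to derive $K$ in code (a sketch, not from the slides): writing the centred system as $X = AX + U$ with $A_{vw}$ the coefficient of $X_w$ in the equation for $X_v$ and $D = \operatorname{diag}(\sigma_v^2)$, the exponent above is $-\tfrac{1}{2} x^{\top}(I-A)^{\top} D^{-1} (I-A) x$, so $K = (I-A)^{\top} D^{-1} (I-A)$.

```python
# Sketch: concentration matrix of a recursive structural equation system.
import numpy as np

def sem_concentration(A, sigma2):
    """A[v, w]: coefficient of X_w in the equation for X_v (parents only);
    sigma2[v]: Var(U_v). Returns K = (I - A)^T D^{-1} (I - A)."""
    I = np.eye(A.shape[0])
    return (I - A).T @ np.diag(1.0 / np.asarray(sigma2)) @ (I - A)
```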
19 A simple example

Consider the system

$$X_1 \leftarrow U_1, \qquad X_2 \leftarrow U_2, \qquad X_3 \leftarrow \alpha_{31} X_1 + U_3, \qquad X_4 \leftarrow \alpha_{42} X_2 + \alpha_{43} X_3 + U_4.$$

The quadratic expression in the exponent becomes

$$\frac{x_1^2}{\sigma_1^2} + \frac{x_2^2}{\sigma_2^2} + \frac{(x_3 - \alpha_{31} x_1)^2}{\sigma_3^2} + \frac{(x_4 - \alpha_{42} x_2 - \alpha_{43} x_3)^2}{\sigma_4^2}.$$
20 Concentration matrix of the example

Expanding the squares and identifying terms yields the concentration matrix

$$K = \begin{pmatrix}
\frac{1}{\sigma_1^2} + \frac{\alpha_{31}^2}{\sigma_3^2} & 0 & -\frac{\alpha_{31}}{\sigma_3^2} & 0 \\
0 & \frac{1}{\sigma_2^2} + \frac{\alpha_{42}^2}{\sigma_4^2} & \frac{\alpha_{42}\alpha_{43}}{\sigma_4^2} & -\frac{\alpha_{42}}{\sigma_4^2} \\
-\frac{\alpha_{31}}{\sigma_3^2} & \frac{\alpha_{42}\alpha_{43}}{\sigma_4^2} & \frac{1}{\sigma_3^2} + \frac{\alpha_{43}^2}{\sigma_4^2} & -\frac{\alpha_{43}}{\sigma_4^2} \\
0 & -\frac{\alpha_{42}}{\sigma_4^2} & -\frac{\alpha_{43}}{\sigma_4^2} & \frac{1}{\sigma_4^2}
\end{pmatrix}.$$
21 Covariance matrix of the example

The covariance matrix can in principle be found by inverting the above. However, it is easier to express $X$ in terms of the $U$s as

$$X_1 = U_1, \qquad X_2 = U_2, \qquad X_3 = \alpha_{31} U_1 + U_3, \qquad X_4 = \alpha_{43}\alpha_{31} U_1 + \alpha_{42} U_2 + \alpha_{43} U_3 + U_4$$

and then calculate the covariances directly to obtain

$$\Sigma = \begin{pmatrix}
\sigma_1^2 & 0 & \alpha_{31}\sigma_1^2 & \alpha_{43}\alpha_{31}\sigma_1^2 \\
0 & \sigma_2^2 & 0 & \alpha_{42}\sigma_2^2 \\
\alpha_{31}\sigma_1^2 & 0 & \sigma_3^2 + \alpha_{31}^2\sigma_1^2 & \alpha_{43}\alpha_{31}^2\sigma_1^2 + \alpha_{43}\sigma_3^2 \\
\alpha_{43}\alpha_{31}\sigma_1^2 & \alpha_{42}\sigma_2^2 & \alpha_{43}\alpha_{31}^2\sigma_1^2 + \alpha_{43}\sigma_3^2 & \omega_4^2
\end{pmatrix},$$

where $\omega_4^2 = \alpha_{43}^2 \alpha_{31}^2 \sigma_1^2 + \alpha_{42}^2 \sigma_2^2 + \alpha_{43}^2 \sigma_3^2 + \sigma_4^2$.
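A numerical spot-check of the two displays above, with illustrative parameter values; $K$ is built as $(I-A)^{\top} D^{-1} (I-A)$ and inverted.

```python
# Sketch: verify the example's Sigma against the inverse of its K.
import numpy as np

a31, a42, a43 = 0.7, -0.3, 0.5
s1, s2, s3, s4 = 1.0, 2.0, 0.5, 1.5       # the variances sigma_v^2
A = np.zeros((4, 4))
A[2, 0], A[3, 1], A[3, 2] = a31, a42, a43
K = (np.eye(4) - A).T @ np.diag([1/s1, 1/s2, 1/s3, 1/s4]) @ (np.eye(4) - A)
Sigma = np.linalg.inv(K)

print(np.isclose(Sigma[0, 2], a31 * s1))                    # alpha31 sigma1^2
print(np.isclose(Sigma[2, 3], a43 * a31**2 * s1 + a43 * s3))
print(np.isclose(Sigma[3, 3],
                 a43**2 * a31**2 * s1 + a42**2 * s2 + a43**2 * s3 + s4))
```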
22 More general systems

Systems of structural equations of the type considered are called recursive, in contrast to feedback systems of equations where directed cycles in the corresponding graph are allowed. It is also customary to allow correlations between the disturbance terms. This leads to violation of the directed Markov property, so other types of graph and Markov properties must be considered to deal correctly with such models.

This type of model and analysis goes back to the geneticist Sewall Wright, who coined the term path analysis for the calculus of effects based on this kind of model. This is one of the early precursors of modern graphical modelling.
23 Maximum likelihood estimation

A directed graphical Gaussian model generated by a linear recursive structural equation system is equivalent to a corresponding undirected graphical model if and only if $\mathcal{D}$ is perfect, in which case its skeleton $\sigma(\mathcal{D})$ is chordal.

Still, even for non-perfect DAGs the MLE is easy to find for linear recursive structural equation systems: for each $v$, a linear regression of the data $X_v^{\nu}$, $\nu = 1, \ldots, N$, onto the parents $X_{\mathrm{pa}(v)}^{\nu}$ is performed, and the corresponding regression coefficients $\hat{\alpha}_v$ and residual variance estimates $\hat{\sigma}_v^2$ are calculated in the usual way.
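A sketch of this vertex-wise regression recipe (data, DAG encoding, and names are illustrative); residual variances use the MLE divisor $1/N$.

```python
# Sketch: MLE for a recursive SEM by one regression per vertex.
import numpy as np

def fit_recursive_sem(X, parents):
    """X: N x |V| centred data matrix; parents[v]: list of parent columns.
    Returns per-vertex regression coefficients and residual variances."""
    N = X.shape[0]
    alpha, sigma2 = {}, {}
    for v, pa in parents.items():
        if pa:
            coef, *_ = np.linalg.lstsq(X[:, pa], X[:, v], rcond=None)
            resid = X[:, v] - X[:, pa] @ coef
        else:
            coef, resid = np.array([]), X[:, v]
        alpha[v], sigma2[v] = coef, resid @ resid / N
    return alpha, sigma2

# the DAG of the simple example (zero-based): 2 <- 0, 3 <- 1, 3 <- 2
rng = np.random.default_rng(4)
X = rng.standard_normal((200, 4))
alpha, sigma2 = fit_recursive_sem(X - X.mean(0),
                                  {0: [], 1: [], 2: [0], 3: [1, 2]})
print(alpha[3], sigma2[3])
```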