Markov random fields. The Markov property


Markov random fields

The Markov property

Discrete time: $\mathcal{L}(X_k \mid X_{k-1}, X_{k-2}, \ldots) = \mathcal{L}(X_k \mid X_{k-1})$.

A time-symmetric version: $\mathcal{L}(X_k \mid X_{-k}) = \mathcal{L}(X_k \mid X_{k-1}, X_{k+1})$, where $X_{-k}$ denotes all variables other than $X_k$.

A more general version: let $A$ be a set of indices $>k$ and $B$ a set of indices $<k$. Then $X_A \perp X_B \mid X_k$.

These are all equivalent.

On a spatial grid

Let $\delta i$ be the set of neighbors of location $i$. The Markov assumption is
$$P(Z_i = z_i \mid Z_{-i} = z_{-i}) = P(Z_i = z_i \mid Z_{\delta i} = z_{\delta i}) = p_i(z_i \mid z_{\delta i}).$$
Equivalently, for $i \not\sim j$,
$$Z_i \perp Z_j \mid Z_{-\{i,j\}}.$$
The $p_i$ are called local characteristics. They are stationary if $p_i = p$.

A potential assigns a number $V_A(z)$ to every subconfiguration $z_A$ of a configuration $z$. (There are lots of them!)

Graphical models

Neighbors are nodes connected with edges. (Figure: an example graph on nodes 1 through 5.) Given node 2, nodes 1 and 4 are independent.

Gibbs measure

The energy $U$ corresponding to a potential $V$ is $U(z) = \sum_A V_A(z)$. The corresponding Gibbs measure is
$$P(z) = \frac{\exp(-U(z))}{C}, \qquad C = \sum_{z'} \exp(-U(z')),$$
where $C$ is called the partition function.

Nearest neighbor potentials

A set of points is a clique if all its members are neighbours. A potential is a nearest neighbor potential if $V_A(z) = 0$ whenever $A$ is not a clique.
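To make the partition function concrete, here is a minimal brute-force sketch in Python (not from the lecture; all names and the choice of potential are my own): it enumerates every configuration of a tiny binary field, computes the energy from pair potentials on the cliques, and normalizes by $C$.

```python
import itertools
import numpy as np

# Tiny 2x2 field with values in {-1, +1}; the cliques are the four
# horizontally/vertically adjacent pairs of sites.
sites = [(0, 0), (0, 1), (1, 0), (1, 1)]
pairs = [((0, 0), (0, 1)), ((1, 0), (1, 1)),
         ((0, 0), (1, 0)), ((0, 1), (1, 1))]

def energy(z, v1=0.5):
    """U(z) = sum over cliques of the pair potential V_{ij}(z) = -v1 * z_i * z_j."""
    return -v1 * sum(z[i] * z[j] for i, j in pairs)

# Enumerate all 2^4 configurations to compute the partition function C.
configs = [dict(zip(sites, vals))
           for vals in itertools.product([-1, 1], repeat=len(sites))]
weights = np.array([np.exp(-energy(z)) for z in configs])
C = weights.sum()                  # the partition function
probs = weights / C                # the Gibbs measure P(z) = exp(-U(z)) / C
print(f"C = {C:.3f}; probabilities sum to {probs.sum():.6f}")
```

Brute-force enumeration is only feasible for toy examples; the intractability of $C$ for realistic fields is exactly what motivates the simulation methods later in the lecture.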

Markov random field

Any nearest neighbour potential induces a Markov random field:
$$p_i(z_i \mid z_{-i}) = \frac{P(z)}{\sum_{z'} P(z')} = \frac{\exp\left(-\sum_{C \text{ clique}} V_C(z)\right)}{\sum_{z'} \exp\left(-\sum_{C \text{ clique}} V_C(z')\right)},$$
where $z'$ ranges over configurations that agree with $z$ except possibly at $i$, so $V_C(z) = V_C(z')$ for any clique $C$ not including $i$, and those terms cancel from the ratio.

The Hammersley-Clifford theorem

Assume $P(z) > 0$ for all $z$. Then $P$ is an MRF on a finite graph with respect to a neighbourhood system $\Delta$ iff $P$ is a Gibbs measure corresponding to a nearest neighbour potential. Does a given nearest neighbour potential correspond to a unique $P$?
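The cancellation can be checked numerically. A small sketch (again with invented names): for a nearest-neighbour pair potential on a 4-site chain, the local characteristic at site 0, computed from the full joint, does not change when the non-neighbour sites are varied.

```python
import itertools
import numpy as np

# Chain graph 0-1-2-3: the cliques are adjacent pairs, so site 0's
# only neighbour is site 1; sites 2 and 3 are non-neighbours of site 0.
pairs = [(0, 1), (1, 2), (2, 3)]
v1 = 0.7

def weight(z):
    """Unnormalized Gibbs weight exp(-U(z)) with V_{ij}(z) = -v1 * z_i * z_j."""
    return np.exp(v1 * sum(z[i] * z[j] for i, j in pairs))

def p0(z):
    """Local characteristic p_0(z_0 | z_1, z_2, z_3) computed from the joint."""
    return weight(z) / sum(weight((s,) + z[1:]) for s in (-1, 1))

# Varying the non-neighbour sites 2 and 3 leaves the conditional unchanged.
for z2, z3 in itertools.product([-1, 1], repeat=2):
    print(f"z2={z2:+d}, z3={z3:+d}: p0(+1 | z1=+1, ...) = {p0((1, 1, z2, z3)):.4f}")
```

All four printed conditionals are identical, as the Markov property requires.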

The Ising model

A model for ferromagnetic spin (values +1 or -1). Take a stationary nearest-neighbour pair potential with $V(i,j) = V(j,i)$, $V(i,i) = V(0,0) = v_0$, and $V(0, e_N) = V(0, e_E) = v_1$, where $e_N$ and $e_E$ are unit steps north and east. Then
$$\operatorname{logit} P(Z_i = 1 \mid Z_{-i} = z_{-i}) = 2\left(v_0 + v_1 \left(z_{i+e_N} + z_{i-e_N} + z_{i+e_E} + z_{i-e_E}\right)\right),$$
so
$$L(v) = \frac{\exp(t_0 v_0 + t_1 v_1)}{C(v)}, \qquad t_0 = \sum_i z_i, \quad t_1 = \sum_{i \sim j} z_i z_j.$$

Interpretation

$v_0$ is related to the external magnetic field (if it is strong, the field will tend to have the same sign as the external field). $v_1$ corresponds to inverse temperature, so it will be large at low temperatures.
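The conditional logit above translates directly into a single-site Gibbs sampler. Below is a minimal Python sketch (not the lecture's code; function names and parameter values are my own) that simulates an Ising field on a torus, one way to produce simulated Ising fields like those referred to on the next slide.

```python
import numpy as np

rng = np.random.default_rng(1)

def ising_gibbs(n=64, v0=0.0, v1=0.4, sweeps=200):
    """Single-site Gibbs sampler for the Ising model on an n-by-n torus,
    using logit P(Z_i = +1 | rest) = 2 * (v0 + v1 * (sum of 4 neighbours))."""
    z = rng.choice([-1, 1], size=(n, n))
    for _ in range(sweeps):
        for i in range(n):
            for j in range(n):
                s = (z[(i - 1) % n, j] + z[(i + 1) % n, j] +
                     z[i, (j - 1) % n] + z[i, (j + 1) % n])
                p = 1.0 / (1.0 + np.exp(-2.0 * (v0 + v1 * s)))
                z[i, j] = 1 if rng.random() < p else -1
    return z

field = ising_gibbs()
# Mean spin is near 0 below the critical coupling and near +/-1 above it.
print("mean spin:", field.mean())
```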

Phase transition

At very low temperature there is a tendency toward spontaneous magnetization. For the Ising model, the boundary conditions can affect the distribution of $Z_0$, the value at the origin. In fact, there is a critical temperature (equivalently, a critical value of $v_1$) such that for temperatures below it the boundary conditions are felt. Thus there can be different probabilities at the origin depending on the values on an arbitrarily distant boundary!

Simulated Ising fields

(Figure: simulated realizations of Ising fields; not reproduced in this transcription.)

Tomato disease

Data on spotted wilt from the Waite Institute, 1929: 16 plots in a 4x4 Latin square, each with 6 rows of 15 plants. Occurrence of the viral disease was recorded 23 days after planting.

Exponential tilting

$$\log\left(\frac{L(v)}{L(u)}\right) = t_0(v_0 - u_0) + t_1(v_1 - u_1) - \log\left(\frac{C(v)}{C(u)}\right),$$
$$\frac{C(v)}{C(u)} = E_u\left(\exp\left(T_0(v_0 - u_0) + T_1(v_1 - u_1)\right)\right).$$
Simulate from the model at $u$ and estimate the expectation by an average.
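A sketch of this importance-sampling idea (the sampler, grid size, and run lengths are illustrative assumptions, not the lecture's settings): draw fields from the model at $u$ by Gibbs sampling, record the sufficient statistics, and average the exponentially tilted weights to estimate $C(v)/C(u)$.

```python
import numpy as np

rng = np.random.default_rng(7)

def sample_stats(v0, v1, n=16, burn=100, draws=200):
    """Gibbs-sample Ising fields at (v0, v1) on an n-by-n torus and record
    the sufficient statistics t0 = sum_i z_i and t1 = sum_{i~j} z_i z_j."""
    z = rng.choice([-1, 1], size=(n, n))
    stats = []
    for sweep in range(burn + draws):
        for i in range(n):
            for j in range(n):
                s = (z[(i - 1) % n, j] + z[(i + 1) % n, j] +
                     z[i, (j - 1) % n] + z[i, (j + 1) % n])
                p = 1.0 / (1.0 + np.exp(-2.0 * (v0 + v1 * s)))
                z[i, j] = 1 if rng.random() < p else -1
        if sweep >= burn:
            t0 = z.sum()
            t1 = (z * np.roll(z, 1, 0)).sum() + (z * np.roll(z, 1, 1)).sum()
            stats.append((t0, t1))
    return np.array(stats, dtype=float)

# Estimate log C(v)/C(u) as the log of the average tilted weight under u.
u, v = np.array([0.0, 0.3]), np.array([0.0, 0.35])
t = sample_stats(*u)
print("estimated log C(v)/C(u):", np.log(np.mean(np.exp(t @ (v - u)))))
```

The estimate degrades quickly as $v$ moves away from $u$, since the tilted weights then have heavy tails; in practice one chooses $u$ close to the target.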

Fitting the tomato data

$t_0 = -834$, $t_1 = 2266$. Condition on the boundary and simulate 100,000 draws from $u = (0, 0.5)$. The MLE is $\hat v = (0, 0.52)$. The simulated values of $t_0$ are half positive and half negative (phase transition).

The auto-models

Let $Q(z) = \log(P(z)/P(0))$. Besag's auto-models are defined by
$$Q(z) = \sum_{i=1}^n z_i G_i(z_i) + \sum_{i=1}^n \sum_{j \sim i} \beta_{ij} z_i z_j.$$
When $z_i \in \{0, 1\}$ and $G_i(z_i) = \alpha_i$ we get the autologistic model. When $z_i G_i(z_i) = \alpha_i z_i - \log(z_i!)$ and $\beta_{ij} \le 0$ we get the auto-Poisson model.

Coding schemes

In order to estimate parameters, it can be easier not to use all the data. Besag suggested a coding scheme in which one uses only the data at points which are conditionally independent given all the other data (on a square lattice with nearest-neighbour structure, for example, the sites of one colour in a checkerboard colouring).

Pseudolikelihood

Another approximate approach is to write down a function of the data which is the product of the local characteristics $p_i(z_i \mid z_{-i})$, i.e., acting as if the neighborhoods of each point were independent. This yields an estimating equation, but not an optimal one; in cases of high dependence it tends to be biased.
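As a sketch of maximum pseudolikelihood for the $\pm 1$ autologistic (Ising) model, assuming the torus neighbourhood and the logistic conditional from earlier (the data here are a random stand-in, and the names are my own): the objective is the sum of conditional log probabilities, maximized numerically.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)

def neighbour_sum(z):
    """Sum of the four nearest neighbours at each site (torus boundary)."""
    return (np.roll(z, 1, 0) + np.roll(z, -1, 0) +
            np.roll(z, 1, 1) + np.roll(z, -1, 1))

def neg_log_pl(theta, z):
    """Negative log pseudolikelihood -sum_i log p_i(z_i | z_{neighbours}),
    with logit P(Z_i = +1 | rest) = 2 * (v0 + v1 * neighbour sum)."""
    v0, v1 = theta
    eta = 2.0 * (v0 + v1 * neighbour_sum(z))
    # log p_i = log sigmoid(z_i * eta); logaddexp keeps this numerically stable
    return np.logaddexp(0.0, -z * eta).sum()

z = rng.choice([-1, 1], size=(32, 32))   # stand-in for an observed field
fit = minimize(neg_log_pl, x0=np.zeros(2), args=(z,))
print("maximum pseudolikelihood estimate (v0, v1):", fit.x)
```

Note that no partition function appears: each factor is a one-dimensional conditional, which is what makes pseudolikelihood cheap.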

Recall the Gaussian formula

If
$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}, \begin{pmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{pmatrix} \right)$$
then
$$(Y \mid X) \sim N\left(\mu_Y + \Sigma_{YX}\Sigma_{XX}^{-1}(X - \mu_X),\ \Sigma_{YY} - \Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XY}\right).$$
Let $Q = \Sigma^{-1}$ be the precision matrix. Then the conditional precision matrix is
$$\left(\Sigma_{YY} - \Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XY}\right)^{-1} = Q_{YY}.$$

Gaussian MRFs

We want a setup in which $Z_i \perp Z_j \mid Z_{-\{i,j\}}$ whenever $i$ and $j$ are not neighbors. Using the Gaussian formula, we see that the condition is met iff $Q_{ij} = 0$. Typically the precision matrix of a GMRF is sparse where the covariance matrix is not. This allows fast computation of likelihoods, simulation, etc.
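A quick numerical check of the block identity (a sketch; the partitioning into a 2-dimensional $X$ and a 3-dimensional $Y$ is arbitrary): invert a random covariance matrix and compare the inverse of the conditional covariance of $Y$ given $X$ with the corresponding block of $Q$.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random positive definite covariance for (X, Y); X = 2 coords, Y = 3 coords.
A = rng.normal(size=(5, 5))
Sigma = A @ A.T + 5 * np.eye(5)
Q = np.linalg.inv(Sigma)                 # the precision matrix

Sxx, Sxy = Sigma[:2, :2], Sigma[:2, 2:]
Syx, Syy = Sigma[2:, :2], Sigma[2:, 2:]
cond_cov = Syy - Syx @ np.linalg.inv(Sxx) @ Sxy   # Var(Y | X)

# The inverse of the conditional covariance is exactly the Y-block of Q.
print(np.allclose(np.linalg.inv(cond_cov), Q[2:, 2:]))   # True
```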

An AR(1) process

Let $X_t = \phi X_{t-1} + \varepsilon_t$. The lag-$k$ autocorrelation is $\phi^k$. For the stationary process with unit innovation variance, the precision matrix is tridiagonal with $Q_{ij} = -\phi$ for $|i - j| = 1$, $Q_{11} = Q_{nn} = 1$, and $Q_{ii} = 1 + \phi^2$ elsewhere. Thus $\Sigma$ has $n^2$ non-zero elements, while $Q$ has $n + 2(n-1) = 3n - 2$. Using the Gaussian formula we see that
$$\mu_{i \mid -i} = \frac{\phi}{1 + \phi^2}\left(x_{i-1} + x_{i+1}\right), \qquad Q_{i \mid -i} = 1 + \phi^2.$$

Conditional autoregression

Suppose that
$$Z_i \mid Z_{-i} \sim N\left(\mu_i + \sum_{j \ne i} \beta_{ij}(z_j - \mu_j),\ \kappa_i^{-1}\right).$$
This is called a Gaussian conditional autoregressive model. WLOG $\mu_i = 0$. If also $\kappa_i \beta_{ij} = \kappa_j \beta_{ji}$, these conditional distributions correspond to a multivariate joint Gaussian distribution with mean 0 and precision $Q$ with $Q_{ii} = \kappa_i$ and $Q_{ij} = -\kappa_i \beta_{ij}$, provided $Q$ is positive definite. If the $\beta_{ij}$ are nonzero only when $i \sim j$, we have a GMRF.
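The following sketch builds this tridiagonal precision matrix and checks the sparsity count and the lag-1 correlation numerically ($n$ and $\phi$ are arbitrary choices):

```python
import numpy as np

n, phi = 200, 0.8

# Tridiagonal precision of a stationary AR(1) with unit innovation variance:
# Q_11 = Q_nn = 1, Q_ii = 1 + phi^2 in the interior, Q_{i,i+1} = -phi.
main = np.r_[1.0, np.full(n - 2, 1.0 + phi**2), 1.0]
Q = (np.diag(main)
     + np.diag(np.full(n - 1, -phi), 1) + np.diag(np.full(n - 1, -phi), -1))

Sigma = np.linalg.inv(Q)
print("nonzeros in Q:", np.count_nonzero(Q), "=", 3 * n - 2)
print("lag-1 autocorrelation:",
      Sigma[0, 1] / np.sqrt(Sigma[0, 0] * Sigma[1, 1]))   # phi, up to rounding
```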

Likelihood calculation

The Cholesky decomposition of a positive definite square matrix $A$ is a lower triangular matrix $L$ such that $A = LL^T$. To solve $Ay = b$, first solve $Lv = b$ (forward substitution), then $L^T y = v$ (backward substitution). If a precision matrix $Q = LL^T$, then $\log\det(Q) = 2 \sum_i \log L_{i,i}$. The quadratic form in the likelihood is $w^T u$, where $u = Qw$ and $w = z - \mu$. Note that
$$u_i = Q_{i,i} w_i + \sum_{j: j \sim i} Q_{i,j} w_j.$$

Simulation

Let $x \sim N(0, I)$, solve $L^T v = x$, and set $z = \mu + v$. Then $E(z) = \mu$ and
$$\operatorname{Var}(z) = (L^T)^{-1} I L^{-1} = (LL^T)^{-1} = Q^{-1}.$$
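A minimal sketch of both computations, using a dense Cholesky factor for clarity (for a large sparse $Q$ one would use a sparse factorization, e.g. the routines in GMRFlib; the function names here are my own):

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(4)

def gmrf_loglik(z, mu, Q):
    """Gaussian log density of N(mu, Q^{-1}) via the Cholesky factor Q = L L^T."""
    L = np.linalg.cholesky(Q)
    w = z - mu
    logdet = 2.0 * np.log(np.diag(L)).sum()   # log det(Q)
    quad = w @ (Q @ w)                        # w^T Q w; sparse Q makes this cheap
    return 0.5 * (logdet - quad - len(z) * np.log(2.0 * np.pi))

def gmrf_sample(mu, Q):
    """Draw z ~ N(mu, Q^{-1}): x ~ N(0, I), solve L^T v = x, set z = mu + v."""
    L = np.linalg.cholesky(Q)
    v = solve_triangular(L.T, rng.standard_normal(len(mu)), lower=False)
    return mu + v

# Try it on the AR(1) precision matrix from the previous sketch.
n, phi = 100, 0.8
Q = (np.diag(np.r_[1.0, np.full(n - 2, 1.0 + phi**2), 1.0])
     + np.diag(np.full(n - 1, -phi), 1) + np.diag(np.full(n - 1, -phi), -1))
mu = np.zeros(n)
z = gmrf_sample(mu, Q)
print("log-likelihood of the draw:", gmrf_loglik(z, mu, Q))
```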

Spatial covariance

Whittle (1963) noted that the solution to the stochastic differential equation
$$\left(\frac{1}{\phi^2} - \Delta\right)^{(\nu+1)/2} Z(s) = \varepsilon(s),$$
with $\varepsilon$ white noise, has covariance function
$$\operatorname{Cov}\left(Z(s), Z(s+h)\right) \propto \left(\frac{\|h\|}{\phi}\right)^{\nu} K_{\nu}\left(\frac{\|h\|}{\phi}\right),$$
the Matérn covariance. Rue and Tjelmeland (2003) show that one can approximate a Gaussian random field on a grid by a GMRF.

Unequal spacing

Lindgren and Rue show how one can use finite element methods to approximate the solution to the SDE (even on a manifold such as a sphere) on a triangulation of a set of possibly unequally spaced points.
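For reference, a sketch that evaluates the Matérn covariance in the parameterization above; the slide states only a proportionality, so the normalizing constant $2^{1-\nu}/\Gamma(\nu)$ used here is the standard choice, an assumption on my part.

```python
import numpy as np
from scipy.special import gamma, kv

def matern(h, phi=1.0, nu=1.0, sigma2=1.0):
    """Matern covariance sigma2 * 2^(1-nu)/Gamma(nu) * (h/phi)^nu * K_nu(h/phi)."""
    h = np.atleast_1d(np.asarray(h, dtype=float))
    out = np.full(h.shape, sigma2)            # the limit as h -> 0
    pos = h > 0
    r = h[pos] / phi
    out[pos] = sigma2 * 2.0**(1.0 - nu) / gamma(nu) * r**nu * kv(nu, r)
    return out

print(matern([0.0, 0.5, 1.0, 2.0], phi=1.0, nu=1.0))
```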

Covariance approximation

(Figure not reproduced in this transcription.)

Precipitation in the Sahel

The African Sahel region (south of the Sahara) suffered severe drought from the 1970s through the 1990s. There is recent evidence of recovery. Data are from GHCN at 550 stations, 1982-1996 (monthly, aggregated to annual).

Model

Precipitation is determined by a latent (hidden) process, modelled as a Gaussian process on the sphere. This is approximated by a GMRF $Z(s_i, t_j)$ on a discrete set of points.

Details

The precipitation is transformed as
$$Y(s,t) = P(s,t)^{1/3}\left(1 - 0.13\,P(s,t)^{1/3}\right)$$
and modelled as $Y(s,t) = Z(s,t) + \varepsilon$. The mean is a linear combination of basis functions (B-splines). The temporal structure is AR(1). Fitting is by MCMC.

Results

(Figures not reproduced in this transcription.)

Reserved data

Model check by reserving 10 stations.