Spatial Statistics with Image Analysis. Lecture L11. Home assignment 3. Lecture 11. Johan Lindström. December 5, 2016.

Size: px

Start display at page:

Download "Spatial Statistics with Image Analysis. Lecture L11. Home assignment 3. Lecture 11. Johan Lindström. December 5, 2016."

Camron Robinson
5 years ago
Views:

1 HA3 MRF:s Simulation Estimation Spatial Statistics with Image Analysis Lecture 11 Johan Lindström December 5, 2016 Lecture L11 Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 1/22 HA3 MRF:s Simulation Estimation Home assignment 3 General MRF:s Gibbs distributions Gibbs sampling Block updating Parallel updating Convergence Parameter estimation Notation for discrete MRF:s Pseudo-likelihood Example Home assignment 3 Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 2/22 HA3 MRF:s Simulation Estimation 1. Pixel classification using MRFs In classification it is reasonable to assume that the class belonging of one pixel is influenced by it s neighbours. Use a MRF to classify pixels in fmri data. 2. Spatio-Temporal modelling One way of modelling spatio-temporal data is using a set of temporal basis functions with spatially varying regression coefficients. Use a set of GMRFs to model the spatial variation in regression coefficients for fmri data. 3. Corrupted data (Classification and Gibbs-sampling Previously we have reconstructed missing pixel values. A more realistic setting is that the pixels might not be missing, but rather replaced by incorrect values. Written and oral presentation. The report is due 17:00 on the day before the presentation. Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 3/22

2 HA3 MRF:s Simulation Estimation Gibbs General Markov random fields We want models, defined for locations s i, which fulfil the Markov condition p ( x i {xj : j i} = p ( x i {xj : j N i } for the neighbours N i Gibbs distributions Originaly from mathematical physics: p(x = 1 exp ( W(x/(kT, Z where W is the energy of a state x, T is a temperature, k is some physical constant, and Z is an (unknown normalising constant. Gibbs distributions Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 4/22 HA3 MRF:s Simulation Estimation Gibbs Neighbourhood system N i = {s j ; s i is a neighbour of s i }. Cliques C. Potentials V C (x; θ. Parameters θ. Density/probability function ( p (x θ = 1 exp V C (x; θ Z θ Conditional distributions p ( x i x j, j N i, θ = 1 exp V C (x; θ Z i,θ C C;s i C Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 5/22 HA3 MRF:s Simulation Estimation Gibbs Gibbs Normalizing constant The normalizing constant Z θ sums over possible combination of x. For a 3 3 image with x (i,j {0, 1} there exists 2 9 possible combinations. For a image there exists possible combinations. Thus in reality not possible to compute Z θ Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 6/22

3 HA3 MRF:s Simulation Estimation Gibbs Example: Integer valued fields Integer pixels, when only first and second order cliques are used: V {i} (x = α k, when x i = k V {i,j} (x = β kl, when x i = k and x j = l. Simple case prefering equal neighbours: { { β β kl = k, if k = l β, if k = l or β kl = 0, otherwise. 0, otherwise. If each pixel takes a value in the set {1, 2,..., K}, the space of possible outcomes for N pixels x is Λ = {1, 2,..., K} N Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 7/22 HA3 MRF:s Simulation Estimation Gibbs Integer valued fields Joint distribution The global probability function for the simple integer field model is: N p(x exp α xi + 1 N βi(x 2 j = x i i=1 j N i i=1 The conditional distribution for a pixel x i given the rest of the field is p ( x i xj, j i = p (x p ( {x j, j i} = p (x K k=1 p ( x i = k, {x j, j i} Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 8/22 HA3 MRF:s Simulation Estimation Gibbs Integer valued fields Conditional distributions The conditional distribution can be found by simply removing all terms in the global probability function that do not depend on x i : p ( x i xj, j N i = exp α xi + β I(x j = x i j Ni K exp I(x j = k k=1 α k + β j Ni If β > 0 we obtain fields where neighbouring pixels strive to be equal. Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 9/22

4 Gibbs sampling Sampling from the distribution p(x is hard. However the conditional distributions p ( x i xj, j i = p ( x i xj, j N i are straight forward to obtain and simluate from. Gibbs sampling says that we can simulate from a complex distribution p(x by repeatedly simulating from the conditionals p ( x i xj, j i. Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 10/22 Block-update Gibbs-sampling Sequential Gibbs-sampling is often slow. Can we use the Markov property better? Compute the conditional distribution for an entire block of neighbouring pixels, allowing us to update larger sets of blocks. But, computing the conditional for a large enough block may be difficult. Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 11/22 Parallel Gibbs-sampling Can we update more than one pixel at the same time, without having them interfere with each other? In the 4-neighbour system, each pixel is conditionally independent of its diagonal neighbours, given its 4-neighbours. Looping over conditionally independent pixels will give exactly the same result as updating them simultaneously. Partition the pixels into conditionally independent sets, and update all pixels in each set in parallel. Loop over these sets, forwards and backwards, or choose a set at random. Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 12/22

Block-update Gibbs-sampling p (x white x red p (x white x red, x grey, x yellow p (x red x white p (x red x grey, x yellow, x white Convergence diagnostics p ( x grey x yellow, x white, x red p ( x

5 Block-update Gibbs-sampling p (x white x red p (x white x red, x grey, x yellow p (x red x white p (x red x grey, x yellow, x white Convergence diagnostics p ( x grey x yellow, x white, x red p ( x yellow x white, x red, x grey Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 13/22 How can we tell whether the simulation has converged to the correct distribution or not? How many iterations are required? This is a difficult problem, and no clear answers exists. Various heuristics can be applied: Plot the trajectory of the (unnormalised log-likelihood Plot the trajectory of various functionals of the field, such as the relative amount of the different classes. When the trajectories have oscillated around a constant long enough, the simulation is assumed to have converged. Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 14/22 Convergence diagnostics (cont. A simulated field (4-neighbours, α = 0, β = 1 and the corresponding trajectory of the log-likelihood: x Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 15/22

6 HA3 MRF:s Simulation Estimation Notation Pseudo Example Parameter estimation for discrete MRF:s Maximum likelihood: The normalising factor Zθ depends on the parameters. Zθ is typically troublesome or impossible to calculate. This makes direct Maximum likelihood estimation hard/impossible. Pseudo-likelihood estimation: The normalisation constants Zu,θ for the conditional distributions are usually simple to compute. Maximise the product of all the full conditional distributions θ = arg max p ( x xj i, j N i ; θ θ with respect to the parameters θ. It can be shown that Pseudo-likelihood estimates share many of the properties fond in maximum likelihood estimation. i Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 16/22 HA3 MRF:s Simulation Estimation Notation Pseudo Example Useful notation for a discrete MRF To simplify the notation, we rewrite the MRF using a multi-dimensional indicator field: z ik = I(x i = k f ik = j N i z jk P(x i = k = E(z ik For the discrete model we have P(x i = k x j, j N i, α, β = E(z ik z j, j N i, α, β = = exp (α k + βf ik l exp (α l + βf il = E(z ik f i, α, β Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 17/22 HA3 MRF:s Simulation Estimation Notation Pseudo Example Pseudo-likelihood estimation Identifiability For the discrete model the log-pseudo-likelihood is log PL z (α, β = α k z ik + β z ik f ik k i k i ( log exp(α k + βf ik i α k might not be identifiable. Take: k a k = exp(α k, and require that k a k = 1 PL z is not guaranteed to be bounded for large β. Introduce a prior on β β N (0, σ 2 Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 18/22

HA3 MRF:s Simulation Estimation Notation Pseudo Example C6: Simulated fields and parameter estimation (β; α = 0 0.25 1 5 Estimated value 1 0.75 0.

se FMSN20/MASM25 L11 19/22 HA3 MRF:s Simulation Estimation Notation Pseudo Example Pseudo-likelihood estimation Example Given a 4 4-image, 1 0 0 1 0 0 0 0 0 1 1 0 1 1 0 0 and a 4-neighbourhood we

343 for the other set of conditionally independent pixels Johan Lindström - johanl@maths.lth.

7 HA3 MRF:s Simulation Estimation Notation Pseudo Example C6: Simulated fields and parameter estimation (β; α = Estimated value True value Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 19/22 HA3 MRF:s Simulation Estimation Notation Pseudo Example Pseudo-likelihood estimation Example Given a 4 4-image, and a 4-neighbourhood we want to estimate β (assuming α = 0. The pseudo-likelihood for the conditionally independent pixels, is L = e 7β (1 + e 2β 4 (1 + e β 4 with maximum for ˆβ (ˆβ for the other set of conditionally independent pixels Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 20/22 HA3 MRF:s Simulation Estimation Notation Pseudo Example Pseudo-likelihood estimation For the estimation the set of points, I, over which we compute the approximate likelihood log PL z (α, β = α k z ik + β z ik f ik k i I k i I ( log exp(α k + βf ik i I can be selected as either a set of conditionally independent points (as in the example above leading to one estimate for each set; or as all points in the field. k Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 21/22

8 HA3 MRF:s Simulation Estimation Notation Pseudo Example Pseudo-likelihood estimation Conditionally independent points: Called coding-method. Final estimate obtained by averaging each individual estimate. Lower bias. All points: Called pseudo-likelihood. Nice asymptotic properties (consistent, Normal. Lower variance. Johan Lindström - johanl@maths.lth.se FMSN20/MASM25 L11 22/22

Spatial Statistics with Image Analysis. Outline. A Statistical Approach. Johan Lindström 1. Lund October 6, 2016

Spatial Statistics Spatial Examples More Spatial Statistics with Image Analysis Johan Lindström 1 1 Mathematical Statistics Centre for Mathematical Sciences Lund University Lund October 6, 2016 Johan Lindström