Marginal density

If the unknown is of the form $x = (x_1, x_2)$, in which the target of investigation is $x_1$, a marginal posterior density
\[
\pi(x_1 \mid y) = \int \pi(x_1, x_2 \mid y)\, dx_2 = \int \pi(x_2)\, \pi(x_1 \mid y, x_2)\, dx_2
\]
needs to be formed. In other words, all variables except those of primary interest are integrated out of the posterior density.
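As a minimal numerical illustration (not from the original notes; the density and all names below are illustrative), the marginalization can be approximated with quadrature, here for a two-dimensional Gaussian using Matlab's trapz:

% Sketch: marginalizing an (unnormalized) 2D density over x2 by quadrature.
% The density is an illustrative Gaussian, not the posterior of any example.
x1 = linspace(-4, 4, 200);
x2 = linspace(-4, 4, 200);
[X1, X2] = meshgrid(x1, x2);
post = exp(-0.5*(X1.^2 + 0.25*X2.^2));  % unnormalized pi(x1, x2 | y)
marg = trapz(x2, post, 1);              % integrate out x2 (along rows)
marg = marg / trapz(x1, marg);          % normalize pi(x1 | y)
plot(x1, marg);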
Marginal density: Example 2 (Matlab)

The goal is, as in Example 1, to locate an electric point source within a unit disk $D$ centered at the origin using sensors lying on the boundary. In this case, the charge $q$ of the source is modeled as a Gaussian random variable with mean 1 and standard deviation $\nu$, and the voltage experienced by the $i$-th sensor is of the form $y_i = q/d_i$. Find and visualize the marginal posterior $\pi(x \mid y)$ of the location $x$. Use the formula
\[
\int_{-\infty}^{\infty} \exp\Big(cx - \frac{1}{2} b x^2\Big)\, dx = \sqrt{\frac{2\pi}{b}}\, \exp\Big(\frac{c^2}{2b}\Big), \qquad b > 0.
\]
Marginal density: Example 2 (Matlab), continued

Solution. The likelihood follows from the likelihood of Example 1 by substituting $1/d_i$ with $q/d_i$, yielding
\[
\pi(y \mid x, q) \propto \exp\Big(-\frac{1}{2\sigma^2} \sum_{i=1}^n (y_i - q/d_i)^2\Big).
\]
The marginal density is given by
\[
\pi(x \mid y) = \int \pi(q)\, \pi(x \mid y, q)\, dq \propto \int \exp\Big(-\frac{1}{2\nu^2}(q - 1)^2\Big) \exp\Big(-\frac{1}{2\sigma^2} \sum_{i=1}^n (y_i - q/d_i)^2\Big)\, dq
\]
Marginal density: Example 2, continued

\[
= \int \exp\Big(-\frac{1}{2\nu^2}(q-1)^2 - \frac{1}{2\sigma^2} \sum_{i=1}^n (y_i - q/d_i)^2\Big)\, dq
= \int \exp\Big(-\frac{1}{2}\Big(\frac{1}{\nu^2} + \sum_{i=1}^n \frac{1}{\sigma^2 d_i^2}\Big) q^2 + \Big(\frac{1}{\nu^2} + \sum_{i=1}^n \frac{y_i}{\sigma^2 d_i}\Big) q + C\Big)\, dq.
\]
If $b = 1/\nu^2 + \sum_{i=1}^n 1/(\sigma^2 d_i^2)$ and $c = 1/\nu^2 + \sum_{i=1}^n y_i/(\sigma^2 d_i)$, it follows from $\int \exp(cx - \frac{1}{2} b x^2)\, dx = \sqrt{2\pi/b}\, \exp(c^2/(2b))$ that the marginal density is of the form
Marginal density: Example 2, continued

\[
\pi(x \mid y) \propto \sqrt{\frac{2\pi}{1/\nu^2 + \sum_{i=1}^n 1/(\sigma^2 d_i^2)}}\, \exp\left(\frac{\big(1/\nu^2 + \sum_{i=1}^n y_i/(\sigma^2 d_i)\big)^2}{2\big(1/\nu^2 + \sum_{i=1}^n 1/(\sigma^2 d_i^2)\big)}\right).
\]
In the following visualizations, the exact particle location was $x = (r, \phi) = (0.5, 0.5)$ and the charge was chosen to be $q = 0.5$. The prior and likelihood standard deviations were given the values $\nu = 0.1, 1$ and $\sigma = 0.1, 0.2$. The results show that $\nu = 0.1$ is a rather low prior standard deviation, since with that value the marginal posterior density is not peaked where the particle is located. The difference between the prior mean $q = 1$ and the exact value $q = 0.5$ is also large compared to the choice $\nu = 0.1$. The value $\nu = 1$, on the other hand, leads to more spread-out results. Likewise, the likelihood standard deviation $\sigma = 0.2$ leads to more spread-out densities than $\sigma = 0.1$.
Marginal density: Example 2 (Matlab), continued

[Figure: marginal posterior densities for n = 3, 4, 5 sensors with σ = 0.1 and σ = 0.2, ν = 0.1. Particle and sensor locations are indicated by the purple and red circles, respectively.]
Marginal density: Example 2 (Matlab), continued

[Figure: marginal posterior densities for n = 3, 4, 5 sensors with σ = 0.1 and σ = 0.2, ν = 1. Particle and sensor locations are indicated by the purple and red circles, respectively.]
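A sketch of how densities like those visualized above could be computed; the equispaced sensor placement, the data simulation, and all variable names are assumptions of this sketch rather than the course's original script:

% Sketch: marginal posterior pi(x | y) of Example 2 on a grid over the
% unit disk. Assumed setup: n sensors equispaced on the boundary, exact
% source at polar coordinates (r, phi) = (0.5, 0.5), charge q = 0.5.
n = 4; sigma = 0.1; nu = 1; q = 0.5;
ang = 2*pi*(0:n-1)'/n;                  % assumed sensor angles
sx = cos(ang); sy = sin(ang);           % sensor positions on the boundary
px = 0.5*cos(0.5); py = 0.5*sin(0.5);   % exact source location
d0 = sqrt((sx - px).^2 + (sy - py).^2);
y = q./d0 + sigma*randn(n, 1);          % simulated noisy voltages

t = linspace(-1, 1, 201);
[X, Y] = meshgrid(t, t);
logP = -inf(size(X));
for k = 1:numel(X)
    if X(k)^2 + Y(k)^2 > 1, continue; end         % outside the disk
    dk = sqrt((sx - X(k)).^2 + (sy - Y(k)).^2);   % distances to sensors
    b = 1/nu^2 + sum(1./(sigma^2*dk.^2));
    c = 1/nu^2 + sum(y./(sigma^2*dk));
    logP(k) = 0.5*log(2*pi/b) + c^2/(2*b);        % log of the formula above
end
P = exp(logP - max(logP(:)));           % normalize to avoid overflow
imagesc(t, t, P); axis xy; axis equal tight;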
Estimates

Estimates are often necessary in order to get an idea of the possible realizations of $X$. One of the most popular statistical estimates is the maximum a posteriori (MAP) estimate, which maximizes the posterior density, i.e.,
\[
x_{\mathrm{MAP}} = \arg\max_{x \in \mathbb{R}^n} \pi(x \mid y).
\]
Another common point estimate is the conditional mean (CM) of the unknown $X$, defined as
\[
x_{\mathrm{CM}} = E\{x \mid y\} = \int_{\mathbb{R}^n} x\, \pi(x \mid y)\, dx.
\]
The task of finding MAP or CM constitutes an optimization or integration problem, respectively.
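For a one-dimensional posterior evaluated on a grid, both estimates can be approximated directly; a minimal sketch with an illustrative two-peaked density (not from the original notes), for which MAP and CM clearly differ:

% Sketch: grid-based MAP and CM estimates for an illustrative 1D posterior.
x = linspace(0, 10, 1000);
post = exp(-0.5*((x - 3)/0.8).^2) + 0.5*exp(-0.5*((x - 6)/1.5).^2);
post = post / trapz(x, post);           % normalize the density
[~, imax] = max(post);
x_map = x(imax)                         % maximizer of the posterior
x_cm  = trapz(x, x.*post)               % center of probability mass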
MAP vs. CM estimates

MAP is the (global) maximizer of the posterior and CM is the center of the posterior probability mass. CM is considered to be, in general, more robust than MAP, as the maximizer (point estimate) of a posterior density can be more sensitive to noise (small changes) in the data than the center of probability mass (integral estimate).
Estimates

If $X$ is a Gaussian random variable, then MAP coincides with CM. A typical spread estimator is the conditional covariance $\mathrm{cov}(x \mid y) \in \mathbb{R}^{n \times n}$, defined as
\[
\mathrm{cov}(x \mid y) = \int_{\mathbb{R}^n} (x - x_{\mathrm{CM}})(x - x_{\mathrm{CM}})^T\, \pi(x \mid y)\, dx.
\]
A Bayesian credibility set $D_p$ including $p\%$ of the posterior probability mass can be estimated through the integral
\[
\mu(D_p \mid y) = \int_{D_p} \pi(x \mid y)\, dx = \frac{p}{100}, \qquad \pi(x \mid y)\big|_{x \in \partial D_p} = \text{constant}.
\]
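Continuing the 1D grid sketch above, the conditional variance and an approximate 90% credibility set can be obtained by thresholding the density at a level enclosing the desired mass; the construction below is an illustrative approximation, not from the original notes:

% Sketch (continues the 1D grid example): conditional variance and an
% approximate 90% credibility set D_p bounded by a constant density level.
var_cond = trapz(x, (x - x_cm).^2 .* post);   % cov(x | y) in one dimension
dx = x(2) - x(1);
[psorted, order] = sort(post, 'descend');
mass = cumsum(psorted)*dx;                    % accumulated probability mass
k = find(mass >= 0.90, 1);                    % smallest set holding 90%
Dp = sort(x(order(1:k)));                     % grid points of the 90% set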
Estimates: Example 3

Given a forward model $Y = AX + N$, where $A \in \mathbb{R}^{m \times n}$ is a constant matrix and $N$ is a Gaussian distributed zero-mean ($E(N) = 0$) noise vector with a diagonal covariance matrix $C = \sigma^2 I$, find
a) the likelihood $\pi(y \mid x)$,
b) the posterior density $\pi(x \mid y)$ corresponding to the Gaussian prior $\pi(x) \propto \exp\big(-\frac{1}{2\alpha^2} x^T x\big)$,
c) the maximizer of the posterior (MAP).
Estimates: Example 3, continued

Solution. a) The distribution of $N = Y - AX$ is zero-mean Gaussian with the diagonal covariance matrix $C = \sigma^2 I$, meaning that
\[
\pi(y \mid x) = \pi(n) \propto \exp\Big(-\frac{1}{2\sigma^2}(y - Ax)^T (y - Ax)\Big).
\]
b) The posterior density is given by
\[
\pi(x \mid y) \propto \pi(x)\, \pi(y \mid x) \propto \exp\Big(-\frac{1}{2\alpha^2} x^T x\Big) \exp\Big(-\frac{1}{2\sigma^2}(y - Ax)^T (y - Ax)\Big) = \exp\Big(-\frac{1}{2\alpha^2} x^T x - \frac{1}{2\sigma^2}(y - Ax)^T (y - Ax)\Big).
\]
Estimates: Example 3, continued

Solution. c) The maximizer of the posterior density, i.e. $x_{\mathrm{MAP}}$, minimizes the argument of the exponential function taken with the opposite sign, meaning that
\[
x_{\mathrm{MAP}} = \arg\min_x \Big(\frac{1}{2\alpha^2} x^T x + \frac{1}{2\sigma^2}(y - Ax)^T (y - Ax)\Big).
\]
The derivative of the quadratic form needs to be zero, that is,
\[
\frac{1}{\sigma^2} A^T A\, x_{\mathrm{MAP}} + \frac{1}{\alpha^2} x_{\mathrm{MAP}} - \frac{1}{\sigma^2} A^T y = 0.
\]
This is equivalent to
\[
x_{\mathrm{MAP}} = \big[A^T A + (\sigma^2/\alpha^2) I\big]^{-1} A^T y,
\]
that is, the Tikhonov regularized solution of $Ax = y$ with the regularization parameter $\sigma^2/\alpha^2$, i.e. the likelihood variance $\sigma^2$ divided by the prior variance $\alpha^2$.
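The closed-form MAP estimate of part c) amounts to a single regularized linear solve; a minimal sketch with simulated data (dimensions and parameter values are illustrative):

% Sketch: MAP estimate of Example 3 as a Tikhonov-regularized solution.
m = 50; n = 20; sigma = 0.05; alpha = 1;
A = randn(m, n);                        % illustrative forward matrix
x_true = randn(n, 1);
y = A*x_true + sigma*randn(m, 1);       % data with zero-mean Gaussian noise
x_map = (A'*A + (sigma^2/alpha^2)*eye(n)) \ (A'*y);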
Gaussian priors

A Gaussian $n$-variate random variable $X$ with mean $\bar{x} \in \mathbb{R}^n$ and (symmetric and positive definite) covariance matrix $\Gamma \in \mathbb{R}^{n \times n}$ is denoted by $X \sim N(\bar{x}, \Gamma)$. The probability density of $X$ is given by
\[
\pi(x) = \frac{1}{\sqrt{(2\pi)^n \det(\Gamma)}} \exp\Big(-\frac{1}{2}(x - \bar{x})^T \Gamma^{-1} (x - \bar{x})\Big).
\]
When a Gaussian density is used as a prior, structural prior information about the unknown $x$ can be encoded into the covariance matrix $\Gamma$. Due to the positive definiteness, there exists a factorization of the form $\Gamma^{-1} = W^T W$, in which $W$ is invertible and can be, for example, the (upper) triangular Cholesky factor $W = U = L^T$.
Gaussian priors

The matrix $W$ is called a whitening matrix, since $Z = W(X - \bar{x})$ is Gaussian white noise: it has zero mean and identity covariance matrix, $Z \sim N(0, I)$. (A random vector whose components are mutually independent and identically distributed is called white noise.) This can be verified through a straightforward calculation as follows:
\[
\pi(x) \propto \exp\Big(-\frac{1}{2}(x - \bar{x})^T \Gamma^{-1} (x - \bar{x})\Big) = \exp\Big(-\frac{1}{2}(x - \bar{x})^T W^T W (x - \bar{x})\Big) = \exp\Big(-\frac{1}{2} z^T z\Big) \propto \pi(z).
\]
Hence, a realization $x$ can be obtained by first drawing a realization $z$ and, after that, applying the formula $x = W^{-1} z + \bar{x}$.
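A realization can be drawn exactly as the formula states; a minimal sketch assuming a simple tridiagonal precision matrix $\Gamma^{-1}$ (chosen only so the example is self-contained) with its Cholesky factor as $W$:

% Sketch: drawing x ~ N(xbar, Gamma) via a whitening matrix W, where
% Gamma^{-1} = W'*W and W is the upper triangular Cholesky factor.
n = 100;
e = ones(n, 1);
Ginv = spdiags([-e, 2.1*e, -e], -1:1, n, n);  % illustrative SPD precision
W = chol(Ginv);                               % Ginv = W'*W
xbar = zeros(n, 1);
z = randn(n, 1);                              % white noise realization z
x = W \ z + xbar;                             % realization of X ~ N(xbar, Gamma)
plot(x);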
Gaussian priors: Example 4 (Matlab)

Assume that $Z \sim N(0, I)$ is a white noise random vector corresponding to a $64 \times 64$ pixel image. Visualize a realization of $X \sim N(0, \Gamma)$ with $\Gamma^{-1} = W^T W$ using the formula $x = W^{-1} z$ in the following four cases:
a) $W = I$, i.e. $x$ is white noise,
b) $W$ is proportional to a discrete approximation of the Laplace operator $\Delta = \partial_1^2 + \partial_2^2$,
c) $W$ is otherwise as in b) but the correlation between pixels close to the center of the image is higher,
d) $W$ is proportional to a discrete approximation of the directional differential operator $\partial_d = d_1 \partial_1 + d_2 \partial_2$ with $d = (d_1, d_2) = (1, 1)$.
Gaussian priors: Example 4 (Matlab), continued

Solution. a) White noise can be generated with a standard Gaussian random number generator (randn in Matlab).
b) $W$ was formed as the standard finite difference approximation of the Laplace operator, i.e.
\[
w_{k_{i,j},\, k_{i,j}} = -4, \qquad w_{k_{i,j},\, k_{i\pm 1,j}} = w_{k_{i,j},\, k_{i,j\pm 1}} = 1,
\]
and $w_{k_{i,j},\, k_{l,n}} = 0$ otherwise. Here, $k_{i,j}$ is the vector index corresponding to pixel $(i, j)$.
Gaussian priors: Example 4 (Matlab), continued

c) $W$ was otherwise the same as in b), but 3 was added to all elements $w_{k_{i,j},\, k_{l,n}}$ for which the centers of pixels $(i, j)$ and $(l, n)$ were both closer than the distance of 10 pixel side-lengths to the center of the image.
d) $W$ corresponding to a differential operator in a given direction $d$ was defined as $W = W^{(1)} \cos(\phi) + W^{(2)} \sin(\phi)$, where $\phi$ is the angle between $d$ and the positive $x_1$-axis,
\[
w^{(1)}_{k_{i,j},\, k_{i,j}} = -1, \quad w^{(1)}_{k_{i,j},\, k_{i,j+1}} = 1, \qquad w^{(2)}_{k_{i,j},\, k_{i,j}} = -1, \quad w^{(2)}_{k_{i,j},\, k_{i-1,j}} = 1,
\]
and otherwise $w^{(1)} = w^{(2)} = 0$. The direction $d$ corresponded to a line with slope one, meaning that $\phi = \pi/4$.
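A sketch of case b): the discrete Laplacian $W$ on the $64 \times 64$ grid can be assembled with Kronecker products (with implicit zero boundary values, which keeps $W$ invertible); this construction is an assumption of the sketch, not necessarily the course's original script:

% Sketch of case b): W proportional to the discrete 2D Laplacian on a
% 64 x 64 pixel image; a realization is obtained as x = W \ z.
N = 64;
e = ones(N, 1);
D = spdiags([e, -2*e, e], -1:1, N, N);   % 1D second-difference matrix
I = speye(N);
W = kron(I, D) + kron(D, I);             % 5-point stencil: -4 center, 1 neighbors
z = randn(N^2, 1);                       % white noise z ~ N(0, I)
x = W \ z;                               % realization of X ~ N(0, Gamma)
imagesc(reshape(x, N, N)); axis image; colormap gray;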
Gaussian priors: Example 4 (Matlab), continued

[Figure: realizations of X corresponding to cases a)-d).]