Group Importance Sampling for particle filtering and MCMC


Luca Martino, Víctor Elvira, Gustau Camps-Valls
Image Processing Laboratory, Universitat de València (Spain). IMT Lille Douai, CRIStAL (UMR 9189), Villeneuve d'Ascq (France).

Abstract

Importance Sampling (IS) is a well-known Monte Carlo technique that approximates integrals involving a posterior distribution by means of weighted samples. In this work, we study the assignment of a single weighted sample that compresses the information contained in a population of weighted samples. Part of the theory that we present as Group Importance Sampling (GIS) has already been employed implicitly in different works in the literature. The analysis provided here yields several theoretical and practical consequences. For instance, we discuss the application of GIS within the Sequential Importance Resampling (SIR) framework and show that Independent Multiple Try Metropolis (I-MTM) schemes can be interpreted as a standard Metropolis-Hastings algorithm, following the GIS approach. We also introduce two novel Markov Chain Monte Carlo (MCMC) techniques based on GIS. The first one, the Group Metropolis Sampling (GMS) method, produces a Markov chain of sets of weighted samples; all these sets are then employed for obtaining a unique global estimator. The second one is the Distributed Particle Metropolis-Hastings (DPMH) technique, where different parallel particle filters are jointly used to drive an MCMC algorithm: different resampled trajectories are compared and then tested with a proper acceptance probability. The novel schemes are tested in different numerical experiments, such as learning the hyperparameters of Gaussian Processes (GPs), the localization problem in a sensor network and the tracking of the Leaf Area Index (LAI), and are compared with several benchmark Monte Carlo techniques. Three descriptive Matlab demos are also provided.

Keywords: Importance Sampling, Markov Chain Monte Carlo (MCMC), Particle Filtering, Particle Metropolis-Hastings, Multiple Try Metropolis, Bayesian Inference

1 Introduction

Monte Carlo methods are state-of-the-art tools for approximating complicated integrals involving multidimensional distributions, which is often required in science and technology [6, 22, 23, 38]. The most popular classes of MC methods are Importance Sampling (IS) techniques and Markov chain Monte Carlo (MCMC) algorithms [22, 38]. IS schemes produce a random discrete approximation of the posterior distribution by a population of weighted samples [6, 27, 23, 38]. MCMC techniques generate a Markov chain (i.e., a sequence of correlated samples) with a pre-established target probability density function (pdf) as invariant density [22, 23]. In this work, we introduce the theory and practice of the Group Importance Sampling (GIS) scheme, where the information contained in different sets of weighted samples is compressed by using only one particle (properly selected) and one suitable weight. This general idea supports the validity of different Monte Carlo algorithms in the literature: interacting parallel particle filters [5, 33, 37], particle island schemes and related techniques [42, 43, 44], particle filters for model selection [4, 32, 41], and nested Sequential Monte Carlo (SMC) methods [34, 35, 40] are some examples. We point out some consequences of the application of GIS in Sequential Importance Resampling (SIR) schemes, allowing partial resampling procedures and the use of different marginal likelihood estimators.
Then, we show that Independent Multiple Try Metropolis (I-MTM) techniques and the Particle Metropolis-Hastings (PMH) algorithm can be interpreted as a classical Independent Metropolis-Hastings (I-MH) method through the application of GIS. Furthermore, we present two novel techniques based on GIS. One example is the Group Metropolis Sampling (GMS) algorithm, which generates a Markov chain of sets of weighted samples. All these resulting sets of samples are

jointly employed for obtaining a unique particle approximation of the target distribution. On the one hand, GMS can be considered an MCMC method, since it produces a Markov chain of sets of samples. On the other hand, GMS can also be seen as an iterated importance sampler where different estimators are finally combined in order to build a unique IS estimator; this combination is obtained dynamically through random repetitions given by MCMC-type acceptance tests. GMS is closely related to Multiple Try Metropolis (MTM) techniques and Particle Metropolis-Hastings (PMH) algorithms [2, 3, 7, 10, 13, 29], as we discuss below. The GMS algorithm can also be seen as an extension of the method in [8] for recycling auxiliary samples in an MCMC method. We also introduce the Distributed PMH (DPMH) technique, where the outputs of several parallel particle filters are compared by an MH-type acceptance function. The proper design of DPMH is a direct application of GIS. The benefit of DPMH is twofold: different types of particle filters (for instance, with different proposal densities) can be jointly employed, and the computational effort can be distributed over several machines, speeding up the resulting algorithm. As with the standard PMH method, DPMH is useful for filtering and smoothing the estimation of the trajectory of a variable of interest in a state-space model. Furthermore, the marginal version of DPMH can be used for the joint estimation of dynamic and static parameters. When the approximation of only one specific moment of the posterior is required, the DPMH output, like that of GMS, can be expressed as a chain of IS estimators. The novel schemes are tested in three different numerical experiments: hyperparameter tuning for Gaussian Processes, localization in a sensor network, and filtering of the Leaf Area Index (LAI). The comparisons with other benchmark Monte Carlo methods show the benefits of the proposed algorithms.

The paper has the following structure. Section 2 recalls some background material. The basis of the GIS theory is introduced in Section 3. The applications of GIS in particle filtering and Multiple Try Metropolis algorithms are discussed in Section 4. In Section 5, we introduce the novel techniques based on GIS. Section 6 provides the numerical results and in Section 7 we discuss some conclusions.

2 Problem statement and background

In many applications, the goal is to infer a variable of interest, x = x_{1:D} = [x_1, x_2, ..., x_D] ∈ X ⊆ R^{Dη}, where x_d ∈ R^η for all d = 1, ..., D, given a set of related observations or measurements, y ∈ R^{d_y}. In the Bayesian framework, all the statistical information is summarized by the posterior probability density function (pdf), i.e.,

    π̄(x) = p(x|y) = l(y|x) g(x) / Z(y),    (1)

where l(y|x) is the likelihood function, g(x) is the prior pdf and Z(y) is the marginal likelihood (a.k.a. Bayesian evidence). In general, Z = Z(y) is unknown and difficult to estimate, so we assume that we are only able to evaluate the unnormalized target function,

    π(x) = l(y|x) g(x).    (2)

The computation of integrals involving π̄(x) = (1/Z) π(x) is often intractable. We consider the Monte Carlo approximation of these complicated integrals involving the target π̄(x) and an integrable function h(x) with respect to π̄, e.g.,

    I = E_π̄[h(x)] = ∫_X h(x) π̄(x) dx,    (3)

where X denotes the support of π̄(x). The basic Monte Carlo (MC) procedure consists in drawing N independent samples from the target pdf, i.e., x_1, ..., x_N ~ π̄(x), so that Î_N = (1/N) Σ_{n=1}^N h(x_n) is an estimator of I [23, 38]. However, in general, direct methods for drawing samples from π̄(x) do not exist, so that alternative procedures are required. Below, we describe one of them.

¹ Three descriptive Matlab demos are also provided online.

2.1 Importance Sampling

Let us consider the use of a simpler proposal pdf, q(x), and rewrite the integral I in Eq. (3) as

    I = E_π̄[h(x)] = (1/Z) E_q[h(x) w(x)] = (1/Z) ∫_X h(x) (π(x)/q(x)) q(x) dx,    (4)

where w(x) = π(x)/q(x) : X → R. This suggests an alternative procedure. Indeed, we can draw N samples x_1, ..., x_N from q(x),² and then assign to each sample the unnormalized weight

    w_n = w(x_n) = π(x_n)/q(x_n),    n = 1, ..., N.    (5)

If Z is known, an unbiased IS estimator [23, 38] is defined as Î_N = (1/(NZ)) Σ_{n=1}^N w_n h(x_n), where x_n ~ q(x). If Z is unknown, defining the normalized weights w̄_n = w_n / Σ_{i=1}^N w_i, n = 1, ..., N, an alternative, asymptotically unbiased IS estimator is given by

    Ī_N = Σ_{n=1}^N w̄_n h(x_n).    (6)

Both Î_N and Ī_N are consistent estimators of I in Eq. (3) [23, 38]. Moreover, an unbiased estimator of the marginal likelihood, Z = ∫_X π(x) dx, is given by Ẑ = (1/N) Σ_{i=1}^N w_i. More generally, the pairs {x_i, w_i}_{i=1}^N can be used to build a particle approximation of the posterior distribution,

    π̂(x | x_{1:N}) = Σ_{n=1}^N w̄_n δ(x − x_n)    (7)
                   = (1/(NẐ)) Σ_{n=1}^N w_n δ(x − x_n),    (8)

where δ(x) denotes the Dirac delta function. Table 1 summarizes the main notation of the work. Note that the words "sample" and "particle" are used as synonyms throughout this work. Moreover, Table 14 lists the main acronyms used.

2.2 Concept of proper weighting

The standard IS weights in Eq. (5) are broadly used in the literature. However, the definition of a properly weighted sample can be extended, as suggested in [38, 23] and in [5]. More specifically, given a set of samples, they are properly weighted with respect to the target π̄ if, for any integrable function h,

    E_q[w(x_n) h(x_n)] = c E_π̄[h(x_n)],    ∀n ∈ {1, ..., N},    (9)

where c is a constant value, independent of the index n, and the expectation on the left hand side is performed, in general, w.r.t. the joint pdf of w(x) and x, i.e., q(w, x). Namely, the weight w(x), conditioned on a given value of x, could itself be a random variable. Thus, in order to obtain consistent estimators, one can design any joint pdf q(w, x) as long as the condition in Eq. (9) is fulfilled. In the following, we use the general definition in Eq. (9) for designing proper weights and proper summary samples assigned to different sets of samples.

² We assume that q(x) > 0 for all x where π(x) ≠ 0, and that q(x) has heavier tails than π(x).
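To make the standard IS construction of Eqs. (5)-(8) concrete, the following minimal Python sketch draws from a Gaussian proposal and computes the unnormalized weights, the marginal likelihood estimator Ẑ and the self-normalized estimator of Eq. (6). The target, proposal and h are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Unnormalized target pi(x): a two-component Gaussian mixture (illustrative choice).
    return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

N = 5000
mu_q, sig_q = 0.0, 4.0                        # proposal q(x) = N(x; 0, 4^2)
x = rng.normal(mu_q, sig_q, size=N)           # x_n ~ q(x)
log_q = -0.5 * ((x - mu_q) / sig_q) ** 2 - np.log(sig_q * np.sqrt(2 * np.pi))
w = np.exp(log_target(x) - log_q)             # unnormalized weights, Eq. (5)
Z_hat = w.mean()                              # marginal likelihood estimator
w_bar = w / w.sum()                           # normalized weights
I_bar = np.sum(w_bar * x)                     # self-normalized estimator of E[h(x)], h(x) = x, Eq. (6)
print(Z_hat, I_bar)
```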

Table 1: Main notation of the work.

    x = [x_1, ..., x_D]      variable of interest, x ∈ X ⊆ R^{Dη}, with x_d ∈ R^η for all d
    π̄(x)                     normalized posterior pdf, π̄(x) = p(x|y)
    π(x)                     unnormalized posterior function, π(x) ∝ π̄(x)
    π̂(x | x_{1:N})           particle approximation of π̄(x) using the set of samples x_{1:N} = {x_n}_{n=1}^N
    x̃                        resampled particle, x̃ ~ π̂(x | x_{1:N}) (note that x̃ ∈ {x_1, ..., x_N})
    w_n = w(x_n)             unnormalized standard IS weight of the particle x_n
    w̄_n = w̄(x_n)             normalized weight associated with x_n
    w̃_m = w̃(x̃_m)             unnormalized proper weight associated with the resampled particle x̃_m
    W_m                      summary weight of the m-th set S_m
    Ī_N                      standard self-normalized IS estimator using N samples
    Ĩ_N                      self-normalized estimator using N samples, based on GIS theory
    Z                        marginal likelihood; normalizing constant of π(x)
    Ẑ, Z̄                     estimators of the marginal likelihood Z

3 Group Importance Sampling: weighting a set of samples

Let us consider M sets of weighted samples, S_1 = {x_{1,n}, w_{1,n}}_{n=1}^{N_1}, S_2 = {x_{2,n}, w_{2,n}}_{n=1}^{N_2}, ..., S_M = {x_{M,n}, w_{M,n}}_{n=1}^{N_M}, where x_{m,n} ~ q_m(x), i.e., a different proposal pdf is used for each set S_m, and in general N_i ≠ N_j for i ≠ j, i, j ∈ {1, ..., M}. In some applications and within different Monte Carlo schemes, it is convenient (and often required) to compress the statistical information contained in each set using a pair formed by a summary sample, x̃_m, and a summary weight, W_m, m = 1, ..., M, in such a way that the expression

    Ĩ_M = (1 / Σ_{j=1}^M W_j) Σ_{m=1}^M W_m h(x̃_m)    (10)

is still a consistent estimator of I, for a generic integrable function h(x). Thus, although the compression is lossy, we still have a suitable particle approximation π̃ of the target π̄, as shown below. In some cases, it is possible to store all the sets S_m, m = 1, ..., M, and only the concept of summary weight is needed (as an example, see the algorithm described in Section 5.1). In other scenarios, it is convenient to employ both concepts of summary particle and summary weight: for instance, in a distributed framework where it is necessary to restrict the communication with the central node (see Figure 3). In the following, we denote the importance weight of the n-th sample in the m-th group as w_{m,n} = w(x_{m,n}) = π(x_{m,n}) / q_m(x_{m,n}), the m-th marginal likelihood estimator as

    Ẑ_m = (1/N_m) Σ_{n=1}^{N_m} w_{m,n},    (11)

and the normalized weights within a set as w̄_{m,n} = w_{m,n} / Σ_{j=1}^{N_m} w_{m,j} = w_{m,n} / (N_m Ẑ_m), for n = 1, ..., N_m and m = 1, ..., M.

Definition 1. A resampled particle, i.e.,

    x̃_m ~ π̂_m(x) = π̂(x | x_{m,1:N_m}) = Σ_{n=1}^{N_m} w̄_{m,n} δ(x − x_{m,n}),    (12)

is a summary particle x̃_m for the m-th group. Note that x̃_m is selected within {x_{m,1}, ..., x_{m,N_m}} according to the probability mass function (pmf) defined by the w̄_{m,n}, n = 1, ..., N_m.

It is possible to use Liu's definition of proper weighting in order to assign a proper importance weight to a resampled particle [25], as stated in the following theorem.

Theorem 1. Let us consider a resampled particle x̃_m ~ π̂_m(x) = π̂(x | x_{m,1:N_m}). A proper unnormalized weight, in the sense of Eq. (9), for this resampled particle is w̃_m = Ẑ_m, defined in Eq. (11).

The proof is given in Appendix A. Note that two (or more) particles, x̃'_m and x̃''_m, resampled with replacement from the same set, and hence from the same approximation, x̃'_m, x̃''_m ~ π̂_m(x), have the same weight w̃(x̃'_m) = w̃(x̃''_m) = Ẑ_m, as depicted in Figure 1. Note also that the classical importance weight cannot be computed for a resampled particle, as explained in Appendix A and pointed out in [2, 25, 34], [27, App. C].

Figure 1: Example of the generation (one run) and proper weighting of three particles, x̃'_m, x̃''_m and x̃'''_m, resampled with replacement from the m-th group, where N_m = 4 and Ẑ_m = (1/4) Σ_{n=1}^4 w_{m,n}; all three resampled particles receive the same weight Ẑ_m.

Definition 2. The summary weight for the m-th group of samples is W_m = N_m w̃_m = N_m Ẑ_m.

Particle approximation. Figure 2 represents graphically an example of GIS with M = 2, N_1 = 4 and N_2 = 3. Given the M summary pairs {x̃_m, w̃_m}_{m=1}^M in a common computational node, we can obtain the following particle approximation of π̄(x), i.e.,

    π̃(x | x̃_{1:M}) = (1 / Σ_{j=1}^M N_j Ẑ_j) Σ_{m=1}^M N_m Ẑ_m δ(x − x̃_m),    (13)

involving M weighted samples in this case (see Appendix B). For a given function h(x), the corresponding GIS estimator in Eq. (10) becomes

    Ĩ_M = (1 / Σ_{j=1}^M N_j Ẑ_j) Σ_{m=1}^M N_m Ẑ_m h(x̃_m).    (14)

It is a consistent estimator of I, as we show in Appendix B. The expression in Eq. (14) can be interpreted as a standard IS estimator where w̃(x̃_m) = Ẑ_m is a proper weight of a resampled particle [25], and we give more importance to resampled particles belonging to sets with greater cardinality. See DEMO-2.

Combination of estimators. If we are interested only in computing the integral I for a specific function h(x), we can summarize the statistical information by the pairs {Ī_{N_m}^{(m)}, W_m}, where

    Ī_{N_m}^{(m)} = Σ_{n=1}^{N_m} w̄_{m,n} h(x_{m,n})    (15)

is the m-th partial IS estimator obtained by using the N_m samples in S_m.
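As an illustration of Definitions 1-2 and Theorem 1, the following Python sketch compresses M groups of weighted samples into summary pairs {x̃_m, W_m} and evaluates the GIS estimator of Eq. (14). The Gaussian target, the proposals and the group sizes are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

def pi_u(x):
    # Unnormalized target: standard Gaussian kernel (illustrative).
    return np.exp(-0.5 * x ** 2)

def q_pdf(x, mu, sig):
    return np.exp(-0.5 * ((x - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))

# M groups, each with its own proposal q_m(x) = N(x; mu_m, sig_m^2) and its own size N_m.
proposals = [(-1.0, 2.0), (0.5, 3.0), (2.0, 1.5)]
sizes = [400, 300, 500]

x_tilde, W = [], []
for (mu, sig), N_m in zip(proposals, sizes):
    x = rng.normal(mu, sig, size=N_m)              # x_{m,n} ~ q_m(x)
    w = pi_u(x) / q_pdf(x, mu, sig)                # standard IS weights w_{m,n}
    Z_hat = w.mean()                               # \hat{Z}_m, Eq. (11)
    idx = rng.choice(N_m, p=w / w.sum())           # resampling step, Definition 1
    x_tilde.append(x[idx])                         # summary particle \tilde{x}_m
    W.append(N_m * Z_hat)                          # summary weight W_m = N_m \hat{Z}_m, Definition 2

x_tilde, W = np.array(x_tilde), np.array(W)
I_tilde = np.sum(W * x_tilde) / W.sum()            # GIS estimator of E[x], Eq. (14)
print(I_tilde)                                     # close to 0, the mean of the target
```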

Figure 2: Graphical representation of GIS. In this case, M = 2 groups of N_1 = 4 and N_2 = 3 weighted samples are summarized, each with one resampled particle and one summary weight W_m = N_m Ẑ_m, m = 1, 2.

Given all the S = Σ_{j=1}^M N_j weighted samples in the M sets, the complete estimator Ī_S in Eq. (6) can be written as a convex combination of the M partial IS estimators Ī_{N_m}^{(m)}, i.e.,

    Ī_S = (1 / Σ_{j=1}^M N_j Ẑ_j) Σ_{m=1}^M Σ_{n=1}^{N_m} w_{m,n} h(x_{m,n})    (16)
        = (1 / Σ_{j=1}^M N_j Ẑ_j) Σ_{m=1}^M N_m Ẑ_m Σ_{n=1}^{N_m} w̄_{m,n} h(x_{m,n})    (17)
        = Σ_{m=1}^M W̄_m Ī_{N_m}^{(m)},    where W̄_m = W_m / Σ_{j=1}^M W_j.    (18)

The equation above shows that the summary weight W_m measures the importance of the m-th estimator Ī_{N_m}^{(m)}, which gives another interpretation of the proper weighting of the group of samples S_m. This suggests another valid compression scheme.

Remark 1. In order to approximate only one specific moment I of π̄(x), we can summarize the m-th group with the pair {Ī_{N_m}^{(m)}, W_m}, so that all the M partial estimators can be combined following Eq. (18). In this case, there is no loss of information w.r.t. storing all the weighted samples. However, the approximation of other moments of π̄(x) is then not possible.

Figures 3-4 depict graphical representations of the two possible approaches for GIS. All the previous considerations have theoretical and practical consequences for the application of different Monte Carlo schemes, as we highlight hereafter. A short numerical sketch of the moment-specific combination in Eq. (18) is given below.
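In this sketch (illustrative Gaussian target and proposals, assumed for the example), each group only reports the pair {Ī_{N_m}^{(m)}, W_m}, and the convex combination of Eq. (18) recovers, up to round-off, the estimator built from all S samples, as stated in Remark 1.

```python
import numpy as np

rng = np.random.default_rng(9)

def pi_u(x):                                      # unnormalized target (illustrative)
    return np.exp(-0.5 * x ** 2)

def q_pdf(x, mu, sig):
    return np.exp(-0.5 * ((x - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))

groups = [(-1.0, 2.0, 300), (1.0, 3.0, 200)]      # (mu_m, sig_m, N_m), assumed values
all_x, all_w, I_m, W_m = [], [], [], []
for mu, sig, N in groups:
    x = rng.normal(mu, sig, size=N)
    w = pi_u(x) / q_pdf(x, mu, sig)
    all_x.append(x); all_w.append(w)
    I_m.append(np.sum(w / w.sum() * x))           # partial estimator, Eq. (15), with h(x) = x
    W_m.append(N * w.mean())                      # summary weight W_m = N_m \hat{Z}_m

I_m, W_m = np.array(I_m), np.array(W_m)
I_combined = np.sum(W_m / W_m.sum() * I_m)        # convex combination, Eq. (18)
x_all, w_all = np.concatenate(all_x), np.concatenate(all_w)
I_full = np.sum(w_all / w_all.sum() * x_all)      # estimator using all S samples
print(I_combined, I_full)                         # identical up to round-off
```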

Figure 3: Graphical overview of GIS in a parallel/distributed framework. (a) The central node obtains all the pairs {x̃_m, W_m}_{m=1}^M and provides π̃(x | x̃_{1:M}) or Ĩ_M. Note that only M particles, x̃_m ∈ R^D, and M scalar weights, W_m ∈ R, are transmitted, instead of S samples and S weights, with S = Σ_{m=1}^M N_m. (b) Alternatively, if we are interested only in a specific moment of the target, we can transmit the pairs {Ī_{N_m}^{(m)}, W_m}_{m=1}^M and then combine them as in Eq. (18).

Figure 4: Possible outputs of the different GIS compression schemes. On the left, {S_m}_{m=1}^M: no compression is applied. In the center, {Ī_{N_m}^{(m)}, W_m}_{m=1}^M: we can perfectly reconstruct the estimator Ī_S in Eq. (6), where S = Σ_{m=1}^M N_m, but we cannot approximate other moments. Using {x̃_m, W_m}_{m=1}^M, we always obtain a lossy compression, but any moment of π̄(x) can be approximated, as shown in Eqs. (13)-(14).

4 Application of GIS in other Monte Carlo schemes

In this section, we discuss the application of GIS within other well-known Monte Carlo schemes. First of all, we consider the use of GIS in Sequential Importance Resampling (SIR) methods, a.k.a. standard particle filters. Then, we show that Independent Multiple Try Metropolis (I-MTM) schemes and the Particle Metropolis-Hastings (PMH) algorithm can be interpreted as a classical Metropolis-Hastings method when the GIS approach is taken into account.

4.1 Application in particle filtering

In Section 2.1, we described the IS procedure in a batch way, i.e., generating directly a D-dimensional vector x' ~ q(x) and then computing the weight π(x')/q(x'). This procedure can be performed sequentially if the target density is factorized.

In this case, the method is known as Sequential Importance Sampling (SIS) and, along with the use of the resampling procedure, it is the basis of particle filtering. Below, we describe the SIS method.

4.1.1 Sequential Importance Sampling (SIS)

Let us recall that x = x_{1:D} = [x_1, x_2, ..., x_D] ∈ X ⊆ R^{Dη}, where x_d ∈ R^η for all d = 1, ..., D, and let us consider a target pdf π̄(x) factorized as

    π̄(x) = (1/Z) π(x) = (1/Z) γ_1(x_1) Π_{d=2}^D γ_d(x_d | x_{1:d−1}),    (19)

where γ_1(x_1) is a marginal pdf and the γ_d(x_d | x_{1:d−1}) are conditional pdfs. We can also consider a proposal pdf decomposed in the same fashion, q(x) = q_1(x_1) Π_{d=2}^D q_d(x_d | x_{1:d−1}). In a batch IS scheme, given the n-th sample x_n = x^{(n)}_{1:D} ~ q(x), we assign the importance weight

    w(x_n) = π(x_n)/q(x_n) = [γ_1(x^{(n)}_1) γ_2(x^{(n)}_2 | x^{(n)}_1) ⋯ γ_D(x^{(n)}_D | x^{(n)}_{1:D−1})] / [q_1(x^{(n)}_1) q_2(x^{(n)}_2 | x^{(n)}_1) ⋯ q_D(x^{(n)}_D | x^{(n)}_{1:D−1})]    (20)
            = Π_{d=1}^D β^{(n)}_d,    (21)

where β^{(n)}_1 = γ_1(x^{(n)}_1)/q_1(x^{(n)}_1) and β^{(n)}_d = γ_d(x^{(n)}_d | x^{(n)}_{1:d−1}) / q_d(x^{(n)}_d | x^{(n)}_{1:d−1}), with d = 2, ..., D. Let us also denote the joint pdf of [x_1, ..., x_d] as π̄_d(x_{1:d}) = (1/Z_d) π_d(x_{1:d}) = (1/Z_d) γ_1(x_1) Π_{j=2}^d γ_j(x_j | x_{1:j−1}), so that π̄_D(x_{1:D}) = π̄(x) and Z_D = Z. Thus, we can draw samples by generating sequentially each component, x^{(n)}_d ~ q_d(x_d | x^{(n)}_{1:d−1}), d = 1, ..., D, so that x_n = x^{(n)}_{1:D} ~ q(x) = q_1(x_1) Π_{d=2}^D q_d(x_d | x_{1:d−1}), and compute recursively the corresponding IS weight as in Eq. (21), i.e., w^{(n)}_d = w^{(n)}_{d−1} β^{(n)}_d. The SIS technique is also recovered from Table 2 by setting η = 0.

Remark 2. In SIS, there are two possible formulations of the estimator of the marginal likelihood Z_d = ∫_{R^{dη}} π_d(x_{1:d}) dx_{1:d}:

    Ẑ_d = (1/N) Σ_{n=1}^N w^{(n)}_d = (1/N) Σ_{n=1}^N Π_{j=1}^d β^{(n)}_j,    (22)

    Z̄_d = Π_{j=1}^d [ Σ_{n=1}^N w̄^{(n)}_{j−1} β^{(n)}_j ].    (23)

In SIS, both estimators are equivalent, Z̄_d = Ẑ_d. See Appendix C for further details.

4.1.2 Sequential Importance Resampling (SIR)

The expression in Eq. (21) suggests a recursive procedure for generating the samples and computing the importance weights, as shown in Steps 2a and 2b of Table 2. In Sequential Importance Resampling (SIR), a.k.a. standard particle filtering, resampling steps are incorporated during the recursion, as in step 2(c)ii of Table 2 [1, 2]. In general, the resampling steps are applied only at certain iterations in order to avoid path degeneracy, taking into account an approximation ÊSS of the Effective Sample Size (ESS) [9, 26]. If ÊSS is smaller than a pre-established threshold, the particles are resampled. Two examples of ESS approximation are ÊSS = 1 / Σ_{n=1}^N (w̄^{(n)}_d)² and ÊSS = 1 / max_n w̄^{(n)}_d, where w̄^{(n)}_d = w^{(n)}_d / Σ_{i=1}^N w^{(i)}_d. Note that, in both cases,

    1 ≤ ÊSS ≤ N.    (24)

Hence, the condition for adaptive resampling can be expressed as ÊSS < ηN, where η ∈ [0, 1]. SIS is obtained for η = 0 and SIR for η ∈ (0, 1]. When η = 1, resampling is applied at each iteration, and in this case SIR is often called the bootstrap particle filter [1, 2, 3]. If η = 0, no resampling is applied; we only apply Steps 2a and 2b, and we have the SIS method described above, which after D iterations is completely equivalent to the batch IS approach, since w_n = w(x_n) = w^{(n)}_D, where x_n = x^{(n)}_{1:D}.

Partial resampling. In Table 2, we have considered that only a subset of R ≤ N particles is resampled. In this case, step 2(c)iii, including the GIS weighting, is strictly required in order to provide final properly weighted samples and hence consistent estimators. The partial resampling procedure is an alternative approach to prevent the loss of particle diversity [25]. In the classical description of SIR [39], we have R = N (i.e., all the particles are resampled) and the weight recursion proceeds by setting the unnormalized weights of the resampled particles to any common value. Since all the N particles have been resampled, the choice of this value has no impact on the weight recursion nor on the estimation of I.

Marginal likelihood estimators. Even in the case R = N, i.e., when all the particles are resampled as in the standard SIR method, without the GIS weighting only the formulation Z̄_d in Eq. (23) provides a consistent estimator of Z, since it involves the normalized weights w̄^{(n)}_d instead of the unnormalized ones, w^{(n)}_d.

Remark 3. If the GIS weighting is applied in SIR, both formulations Ẑ_d and Z̄_d in Eqs. (22)-(23) provide consistent estimators of Z and they are equivalent, Ẑ_d = Z̄_d (as in SIS). See the exhaustive discussion in Appendix C.

Table 2: SIR with partial resampling.

1. Choose the number of particles N, the number R ≤ N of particles to be resampled, the initial particles x^{(n)}_0, n = 1, ..., N, an ESS approximation ÊSS [26] and a constant value η ∈ [0, 1].
2. For d = 1, ..., D:
   (a) Propagation: Draw x^{(n)}_d ~ q_d(x_d | x^{(n)}_{1:d−1}), for n = 1, ..., N.
   (b) Weighting: Compute the weights
           w^{(n)}_d = w^{(n)}_{d−1} β^{(n)}_d = Π_{j=1}^d β^{(n)}_j,    n = 1, ..., N,    (25)
       where β^{(n)}_d = γ_d(x^{(n)}_d | x^{(n)}_{1:d−1}) / q_d(x^{(n)}_d | x^{(n)}_{1:d−1}).
   (c) If ÊSS < ηN, then:
       i. Select randomly a set of particles S_d = {x^{(j_r)}_{1:d}}_{r=1}^R, where R ≤ N, j_r ∈ {1, ..., N} for all r, and j_r ≠ j_k for r ≠ k.
       ii. Resampling: Resample R times within the set S_d according to the probabilities w̄^{(j_r)}_d = w^{(j_r)}_d / Σ_{k=1}^R w^{(j_k)}_d, obtaining {x̃^{(j_r)}_{1:d}}_{r=1}^R. Then, set x^{(j_r)}_{1:d} = x̃^{(j_r)}_{1:d}, for r = 1, ..., R.
       iii. GIS weighting: Compute Ẑ_S = (1/R) Σ_{r=1}^R w^{(j_r)}_d and set w^{(j_r)}_d = Ẑ_S for all r = 1, ..., R.
3. Return {x_n = x^{(n)}_{1:D}, w_n = w^{(n)}_D}_{n=1}^N.
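The following Python sketch is a minimal bootstrap particle filter (R = N) on an assumed toy linear-Gaussian state-space model, not taken from the paper. It illustrates step 2(c)iii of Table 2: after resampling, every resampled particle keeps the proper weight Ẑ_S, so the two marginal likelihood formulations of Remark 3 coincide.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed toy model: x_d = 0.9 x_{d-1} + u_d,  y_d = x_d + v_d.
D, N, sig_u, sig_v = 50, 500, 1.0, 0.5
x_true = np.zeros(D)
for d in range(1, D):
    x_true[d] = 0.9 * x_true[d - 1] + sig_u * rng.normal()
y = x_true + sig_v * rng.normal(size=D)

def lik(y_d, x):                                  # incremental weight beta_d (bootstrap proposal)
    return np.exp(-0.5 * ((y_d - x) / sig_v) ** 2) / (sig_v * np.sqrt(2 * np.pi))

x = np.zeros(N)                                   # particles
w = np.ones(N)                                    # unnormalized weights w_d^{(n)}
logZ_bar = 0.0                                    # running estimator \bar{Z}_d of Eq. (23)
for d in range(D):
    x = 0.9 * x + sig_u * rng.normal(size=N)      # propagation with the prior as proposal
    beta = lik(y[d], x)                           # beta_d^{(n)}
    logZ_bar += np.log(np.sum(w / w.sum() * beta))
    w = w * beta                                  # Eq. (25)
    if 1.0 / np.sum((w / w.sum()) ** 2) < 0.5 * N:    # adaptive resampling, ESS test
        Z_hat_S = w.mean()
        x = x[rng.choice(N, size=N, p=w / w.sum())]
        w = np.full(N, Z_hat_S)                   # GIS weighting: resampled particles keep \hat{Z}_S

print(np.exp(logZ_bar), w.mean())                 # \bar{Z}_D and \hat{Z}_D agree (Remark 3)
```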

GIS in Sequential Monte Carlo (SMC). The ideas of summary sample and summary weight have been implicitly used in different SMC schemes proposed in the literature, for instance, for the communication among parallel particle filters [5, 33, 37] and in the particle island methods [42, 43, 44]. GIS also appears indirectly in particle filtering for model selection [4, 32, 41] and in the so-called Nested Sequential Monte Carlo techniques [34, 35, 40].

4.2 Multiple Try Metropolis schemes as a standard Metropolis-Hastings method

The Metropolis-Hastings (MH) method is one of the most popular MCMC algorithms [22, 23, 38]. It generates a Markov chain {x_t}_{t=1}^T with π̄(x) as invariant density. Considering a proposal pdf q(x) independent of the previous state x_{t−1}, the corresponding Independent MH (IMH) scheme is formed by the steps in Table 3 [38].

Table 3: The Independent Metropolis-Hastings (IMH) algorithm.

1. Choose an initial state x_0.
2. For t = 1, ..., T:
   (a) Draw a sample v' ~ q(x).
   (b) Accept the new state, x_t = v', with probability
           α(x_{t−1}, v') = min[1, π(v') q(x_{t−1}) / (π(x_{t−1}) q(v'))] = min[1, w(v') / w(x_{t−1})],    (26)
       where w(x) = π(x)/q(x) is the standard importance weight. Otherwise, set x_t = x_{t−1}.
3. Return {x_t}_{t=1}^T.

Observe that α(x_{t−1}, v') = min[1, w(v')/w(x_{t−1})] in Eq. (26) involves the ratio between the importance weight of the sample v' proposed at the t-th iteration and the importance weight of the previous state x_{t−1}. Furthermore, note that at each iteration only one new sample v' is generated and compared with the previous state x_{t−1} through the acceptance probability α(x_{t−1}, v') (in order to obtain the next state x_t). The Particle Metropolis-Hastings (PMH) method [2] and the alternative version of the Independent Multiple Try Metropolis technique [28] (denoted as I-MTM2) are jointly described in Table 4.³ They are two MCMC algorithms where at each iteration several candidates {v_1, ..., v_N} are generated. After computing the IS weights w(v_n), one candidate v_j is selected among the N possible values, i.e., j ∈ {1, ..., N}, by applying a resampling step according to the probability mass

    w̄_n = w(v_n) / Σ_{i=1}^N w(v_i) = w(v_n) / (N Ẑ),    n = 1, ..., N.

Then, the selected sample v_j is tested with a proper probability α(x_{t−1}, v_j) given in Eq. (27). The difference between PMH and I-MTM2 lies in the procedure employed for the generation of the N candidates and for the construction of the weights. PMH employs a sequential approach, whereas I-MTM2 uses a standard batch approach [28]. Namely, PMH generates sequentially the components v_{j,d} of the candidates, v_j = [v_{j,1}, ..., v_{j,D}], and computes the weights recursively as shown in Section 4.1. Since resampling steps are often used, the resulting candidates v_1, ..., v_N are correlated, whereas in I-MTM2 they are independent. I-MTM2 coincides with PMH if the candidates are generated sequentially but without applying resampling steps, so that I-MTM2 can be considered a special case of PMH. Note that w̃(v_j) = Ẑ and w̃(x_{t−1}) = Ẑ_{t−1} are the GIS weights of the resampled particles v_j and x_{t−1}, respectively, as stated in Definition 1 and Theorem 1.⁴

³ PMH is used for filtering and smoothing a variable of interest in state-space models (see, for instance, Figure 3). The Particle Marginal MH (PMMH) algorithm [2] is an extension of PMH employed in order to infer both dynamic and static variables. PMMH is described in Appendix D.

Table 4: PMH and I-MTM2 techniques.

1. Choose an initial state x_0 and Ẑ_0.
2. For t = 1, ..., T:
   (a) Draw N particles v_1, ..., v_N from q(x) and weight them with the proper importance weights w(v_n), n = 1, ..., N, using a sequential approach (PMH) or a batch approach (I-MTM2). Thus, denoting Ẑ = (1/N) Σ_{n=1}^N w(v_n), we obtain the particle approximation π̂(x | v_{1:N}) = (1/(NẐ)) Σ_{n=1}^N w(v_n) δ(x − v_n).
   (b) Draw v_j ~ π̂(x | v_{1:N}).
   (c) Set x_t = v_j and Ẑ_t = Ẑ, with probability
           α(x_{t−1}, v_j) = min[1, Ẑ / Ẑ_{t−1}].    (27)
       Otherwise, set x_t = x_{t−1} and Ẑ_t = Ẑ_{t−1}.
3. Return {x_t}_{t=1}^T.

Hence, considering the GIS theory, we can write

    α(x_{t−1}, v_j) = min[1, Ẑ / Ẑ_{t−1}] = min[1, w̃(v_j) / w̃(x_{t−1})],    (28)

which has the form of the acceptance function of the classical IMH method in Table 3. Therefore, the PMH and I-MTM2 algorithms can also be summarized as in Table 5.

Remark 4. The PMH and I-MTM2 algorithms take the form of the classical IMH method employing the equivalent proposal pdf q̃(x) in Eq. (29) (depicted in Figure 5; see also Appendix A), and using the GIS weight w̃(x') of a resampled particle x' ~ q̃(x) within the acceptance function α(x_{t−1}, x').

Figure 5: (Left) Graphical representation of the generation of one sample x' from the equivalent proposal pdf q̃(x) in Eq. (29). (Right) Example of the equivalent density q̃(x) (solid line) with N = 2; the target pdf π̄(x) and the proposal pdf q(x) are shown with dashed lines. See DEMO-3.

⁴ Note that the number of candidates per iteration is constant (N), so that W_t / W_{t−1} = (N w̃(v_j)) / (N w̃(x_{t−1})) = w̃(v_j) / w̃(x_{t−1}).
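A minimal Python sketch of one run of the I-MTM2 scheme of Table 4 (batch candidate generation, with an assumed Gaussian-mixture target and Gaussian proposal; this is not the authors' Matlab demo) shows the Ẑ-ratio acceptance of Eqs. (27)-(28) in action.

```python
import numpy as np

rng = np.random.default_rng(3)

def pi_u(x):                        # unnormalized target (illustrative): mixture of two Gaussians
    return 0.5 * np.exp(-0.5 * (x - 3) ** 2) + 0.5 * np.exp(-0.5 * (x + 3) ** 2)

mu_q, sig_q = 0.0, 5.0              # independent proposal q(x)
def q_pdf(x):
    return np.exp(-0.5 * ((x - mu_q) / sig_q) ** 2) / (sig_q * np.sqrt(2 * np.pi))

T, N = 5000, 10
chain = np.empty(T)
x_prev, Z_prev = 0.0, 1e-300        # initial state x_0 and tiny \hat{Z}_0 (forces acceptance at t = 1)
for t in range(T):
    v = rng.normal(mu_q, sig_q, size=N)             # N candidates (batch approach: I-MTM2)
    w = pi_u(v) / q_pdf(v)                          # importance weights
    Z_hat = w.mean()                                # group estimator \hat{Z}
    j = rng.choice(N, p=w / w.sum())                # resampling step: select one candidate
    if rng.uniform() < min(1.0, Z_hat / Z_prev):    # acceptance, Eqs. (27)-(28)
        x_prev, Z_prev = v[j], Z_hat
    chain[t] = x_prev

print(chain.mean())                 # close to 0 for this symmetric mixture
```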

Table 5: Alternative description of PMH and I-MTM2.

1. Choose an initial state x_0.
2. For t = 1, ..., T:
   (a) Draw x' ~ q̃(x), where
           q̃(x) = ∫_{X^N} [ Π_{i=1}^N q(v_i) ] π̂(x | v_{1:N}) dv_{1:N}    (29)
       is the equivalent proposal pdf associated with a resampled particle [2, 25].
   (b) Set x_t = x' with probability
           α(x_{t−1}, x') = min[1, w̃(x') / w̃(x_{t−1})].    (30)
       Otherwise, set x_t = x_{t−1}.
3. Return {x_t}_{t=1}^T.

5 Novel Techniques

In this section, we provide two examples of novel MCMC algorithms based on GIS. First, we introduce a Metropolis-type method producing a chain of sets of weighted samples. Secondly, we present a PMH technique driven by M parallel particle filters. In the first scheme, we exploit the concept of summary weight and all the weighted samples are stored. In the second one, both concepts of summary weight and summary particle are used. The consistency of the resulting estimators and the ergodicity of both schemes are ensured.

5.1 Group Metropolis Sampling

Here, we describe an MCMC procedure that yields a sequence of sets of weighted samples. All the samples are then employed for a joint particle approximation of the target distribution. The Group Metropolis Sampling (GMS) algorithm is outlined in Table 6. Figures 6(a)-(b) give two graphical representations of GMS outputs (with N = 4 in both cases). Note that the GMS algorithm uses the idea of summary weight for comparing sets. Given the generated sets S_t = {x_{n,t}, ρ_{n,t}}_{n=1}^N, for t = 1, ..., T, GMS provides the global particle approximation

    π̂(x | x_{1:N,1:T}) = (1/T) Σ_{t=1}^T Σ_{n=1}^N [ρ_{n,t} / Σ_{i=1}^N ρ_{i,t}] δ(x − x_{n,t})    (31)
                        = (1/T) Σ_{t=1}^T Σ_{n=1}^N ρ̄_{n,t} δ(x − x_{n,t}).    (32)

Thus, the estimator of a specific moment of the target is

    Ĩ_{NT} = (1/T) Σ_{t=1}^T Σ_{n=1}^N ρ̄_{n,t} h(x_{n,t}) = (1/T) Σ_{t=1}^T Ĩ^{(t)}_N.    (33)

If the N candidates, v_1, ..., v_N, and the associated weights, w_1, ..., w_N, are built sequentially by a particle filtering method, we have a Particle GMS (PGMS) algorithm (see Section 6.3), and marginal versions can also be considered (see Appendix D).

Table 6: Group Metropolis Sampling.

1. Build an initial set S_0 = {x_{n,0}, ρ_{n,0}}_{n=1}^N and compute Ẑ_0 = (1/N) Σ_{n=1}^N ρ_{n,0}.
2. For t = 1, ..., T:
   (a) Draw N samples, v_1, ..., v_N ~ q(x), following a sequential or a batch procedure.
   (b) Compute the weights w_n = π(v_n)/q(v_n), n = 1, ..., N; define S' = {v_n, w_n}_{n=1}^N and compute Ẑ' = (1/N) Σ_{n=1}^N w_n.
   (c) Set S_t = {x_{n,t} = v_n, ρ_{n,t} = w_n}_{n=1}^N, i.e., S_t = S', and Ẑ_t = Ẑ', with probability
           α(S_{t−1}, S') = min[1, Ẑ' / Ẑ_{t−1}].    (34)
       Otherwise, set S_t = S_{t−1} and Ẑ_t = Ẑ_{t−1}.
3. Return {S_t}_{t=1}^T, or {Ĩ^{(t)}_N}_{t=1}^T where Ĩ^{(t)}_N = Σ_{n=1}^N [ρ_{n,t} / Σ_{i=1}^N ρ_{i,t}] h(x_{n,t}).

Relation with IMH. The acceptance probability α in Eq. (34) is the extension of the acceptance probability of IMH in Eq. (26), considering the proper GIS weighting of a set of weighted samples. Note that, in this version of GMS, all the sets contain the same number of samples.

Relation with MTM methods. GMS is strictly related to Multiple Try Metropolis (MTM) schemes [7, 30, 31, 28] and Particle Metropolis-Hastings (PMH) techniques [2, 28]. The main difference is that GMS uses no resampling step at each iteration for generating a summary sample; indeed, GMS keeps the entire set. However, considering a sequential or a batch procedure for generating the N tries at each iteration, we can recover an MTM (or PMH) chain from the GMS output by applying one resampling step when S_t ≠ S_{t−1},

    x̃_t = ṽ_t ~ Σ_{n=1}^N ρ̄_{n,t} δ(x − x_{n,t}),   if S_t ≠ S_{t−1},
    x̃_t = x̃_{t−1},                                   if S_t = S_{t−1},    (35)

for t = 1, ..., T. Namely, {x̃_t}_{t=1}^T is the chain obtained by one run of the MTM (or PMH) technique. Figure 6(b) provides a graphical representation of an MTM chain recovered from GMS outputs.

Ergodicity. As discussed above, (a) the sample generation, (b) the acceptance probability function and hence (c) the dynamics of GMS exactly coincide with the corresponding steps of PMH or MTM (with a sequential or batch particle generation, respectively). Hence, the ergodicity of the chain is ensured [7, 31, 2, 28]. Indeed, we can recover the MTM (or PMH) chain as shown in Eq. (35).

Recycling samples. The GMS algorithm can be seen as a method for recycling the auxiliary weighted samples in MTM schemes (or in PMH schemes, if the candidates are generated by SIR). In [8], the authors show how to recycle the samples rejected in one run of a standard MH method and include them in a unique consistent estimator. GMS can be considered an extension of this technique where N candidates are drawn at each iteration.
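The following Python sketch implements one run of GMS as in Table 6 with batch candidate generation and builds the global estimator of Eq. (33). The target and proposal are illustrative assumptions, not the experiments of Section 6.

```python
import numpy as np

rng = np.random.default_rng(4)

def pi_u(x):                                     # unnormalized target (illustrative): N(1, 1.5^2) kernel
    return np.exp(-0.5 * ((x - 1.0) / 1.5) ** 2)

mu_q, sig_q = 0.0, 4.0
def q_pdf(x):
    return np.exp(-0.5 * ((x - mu_q) / sig_q) ** 2) / (sig_q * np.sqrt(2 * np.pi))

T, N = 3000, 20
x_set = rng.normal(mu_q, sig_q, size=N)          # initial set S_0 (step 1)
rho = pi_u(x_set) / q_pdf(x_set)
Z_prev = rho.mean()

I_t = []                                         # partial estimators \tilde{I}^{(t)}_N, with h(x) = x
for t in range(T):
    v = rng.normal(mu_q, sig_q, size=N)          # step 2(a): batch generation
    w = pi_u(v) / q_pdf(v)                       # step 2(b)
    Z_new = w.mean()
    if rng.uniform() < min(1.0, Z_new / Z_prev): # step 2(c): accept or reject the whole set
        x_set, rho, Z_prev = v, w, Z_new
    I_t.append(np.sum(rho / rho.sum() * x_set))

print(np.mean(I_t))                              # global GMS estimator \tilde{I}_{NT}, Eq. (33); true mean is 1
```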

Iterated IS. GMS can also be interpreted as an iterated importance sampling scheme where an IS approximation based on N samples is built at each iteration and compared with the previous IS approximation. This procedure is iterated T times and all the accepted IS estimators Ĩ^{(t)}_N are finally combined to provide a unique global approximation based on NT samples. Note that the temporal combination of the IS estimators is obtained dynamically through the random repetitions due to the rejections in the MH test. Hence, the complete procedure for weighting the samples generated by GMS can be interpreted as the composition of two weighting schemes: (a) an importance sampling approach building {ρ̄_{n,t}}_{n=1}^N, and (b) the possible random repetitions due to the rejections in the MH test.

Consistency of the GMS estimator. Recovering the MTM chain {x̃_t}_{t=1}^T as in Eq. (35), the estimator Ĩ_T = (1/T) Σ_{t=1}^T h(x̃_t) is consistent, i.e., Ĩ_T converges almost surely to I = E_π̄[h(x)] as T → ∞, since {x̃_t}_{t=1}^T is an ergodic chain [38]. Note that E_π̂[h(x̃_t) | S_t] = Σ_{n=1}^N ρ̄_{n,t} h(x_{n,t}) = Ĩ^{(t)}_N for S_t ≠ S_{t−1}, where π̂(x | x_{1:N,t}) = Σ_{n=1}^N ρ̄_{n,t} δ(x − x_{n,t}). If S_t = S_{t−1}, then E_π̂[h(x̃_t) | S_t] = E_π̂[h(x̃_{t−1}) | S_{t−1}] = Ĩ^{(t−1)}_N and, since Ĩ^{(t)}_N = Ĩ^{(t−1)}_N, we again have E_π̂[h(x̃_t) | S_t] = Ĩ^{(t)}_N. Therefore,

    E[Ĩ_T | S_{1:T}] = (1/T) Σ_{t=1}^T E_π̂[h(x̃_t) | S_t]    (36)
                     = (1/T) Σ_{t=1}^T Ĩ^{(t)}_N = Ĩ_{NT}.    (37)

Thus, the GMS estimator Ĩ_{NT} in Eq. (33) can be expressed as Ĩ_{NT} = E[Ĩ_T | S_{1:T}], where S_{1:T} are all the weighted samples obtained by GMS, and it is consistent for T → ∞ since Ĩ_T is consistent. Furthermore, for fixed T, Ĩ_{NT} is consistent as N → ∞ by standard IS arguments [23].

Figure 6: (a) Chain of sets S_t = {x_{n,t}, ρ_{n,t}}_{n=1}^N generated by the GMS method (graphical representation with N = 4). (b) Graphical example of GMS outputs, S_t, S_{t+1}, S_{t+2} and S_{t+3}, where S_{t+2} = S_{t+1}. The weights of the samples are denoted by the size of the circles. A possible recovered MTM chain is also depicted with a solid line, where the states are x̃_τ with τ = t, t+1, t+2, t+3 and x̃_{t+2} = x̃_{t+1}.

5.2 Distributed Particle Metropolis-Hastings algorithm

The PMH algorithm is an MCMC technique particularly designed for filtering and smoothing a dynamic variable in a state-space model [2, 28] (see, for instance, Figure 3). In PMH, different trajectories obtained by different runs of a particle filter (see Section 4.1) are compared according to suitable MH-type acceptance probabilities, as shown in Table 4. In this section, we show how several parallel particle filters (for instance, each one considering a different proposal pdf) can drive a PMH-type technique. The classical PMH method uses a single factorized proposal pdf, q(x) = q_1(x_1) Π_{d=2}^D q_d(x_d | x_{1:d−1}), employed in a single SIR method in order to generate new candidates before the MH-type test (see Table 4). Let us consider the problem of tracking a variable of interest x = [x_1, ..., x_D] ∈ R^{Dη}, where the target pdf π̄ can be factorized as π̄(x) ∝ π_1(x_1) Π_{d=2}^D π_d(x_d | x_{1:d−1}). We assume that M independent processing units are available, jointly with a central node, as shown in Fig. 7. We use M parallel particle filters, each one with a different proposal

pdf, q_m(x) = q_{m,1}(x_1) Π_{d=2}^D q_{m,d}(x_d | x_{1:d−1}), one per processor. Then, after one run of the M parallel particle filters, we obtain M particle approximations π̂_m(x). Since we aim to reduce the communication cost with the central node (see Figs. 3 and 7), we consider that each machine only transmits the pair {Ẑ_m, x̃_m}, where x̃_m ~ π̂_m(x) (we set N_1 = ... = N_M = N, for simplicity). Applying the GIS theory, it is then straightforward to outline the method, called the Distributed Particle Metropolis-Hastings (DPMH) technique, shown in Table 7.

Table 7: Distributed Particle Metropolis-Hastings algorithm.

1. Choose an initial state x_0 and Ẑ_{m,0}, for m = 1, ..., M.
2. For t = 1, ..., T:
   (a) (Parallel processors) Draw N particles v_{m,1}, ..., v_{m,N} from q_m(x) and weight them with IS weights w(v_{m,n}), n = 1, ..., N, using a particle filter (or a batch approach), for each m = 1, ..., M. Thus, denoting Ẑ_m = (1/N) Σ_{n=1}^N w(v_{m,n}), we obtain the M particle approximations π̂_m(x) = π̂(x | v_{m,1:N}) = (1/(NẐ_m)) Σ_{n=1}^N w(v_{m,n}) δ(x − v_{m,n}).
   (b) (Parallel processors) Draw x̃_m ~ π̂(x | v_{m,1:N}), for m = 1, ..., M.
   (c) (Central node) Resample x̃ ∈ {x̃_1, ..., x̃_M} according to the pmf W̄_m = Ẑ_m / Σ_{j=1}^M Ẑ_j, m = 1, ..., M, i.e., draw x̃ ~ π̃(x | x̃_1, ..., x̃_M).
   (d) (Central node) Set x_t = x̃ and Ẑ_{m,t} = Ẑ_m, for m = 1, ..., M, with probability
           α(x_{t−1}, x̃) = min[1, Σ_{m=1}^M Ẑ_m / Σ_{m=1}^M Ẑ_{m,t−1}].    (38)
       Otherwise, set x_t = x_{t−1} and Ẑ_{m,t} = Ẑ_{m,t−1}, for m = 1, ..., M.

The method in Table 7 has the structure of a Multiple Try Metropolis (MTM) algorithm using different proposal pdfs [7, 31]. More generally, in step 2a, the scheme described above can even employ different kinds of particle filtering algorithms. In step 2b, M resampling steps are performed in total, one per processor. Then, one resampling step is performed in the central node (step 2c). Finally, the resampled particle is accepted as the new state with probability α in Eq. (38).

Ergodicity. The ergodicity of DPMH is ensured since it can be interpreted as a standard PMH method considering the single particle approximation⁵

    π̂(x | v_{1:M,1:N}) = Σ_{m=1}^M [Ẑ_m / Σ_{j=1}^M Ẑ_j] π̂(x | v_{m,1:N}) = Σ_{m=1}^M W̄_m π̂(x | v_{m,1:N}),    (39)

from which we resample once, i.e., draw x̃ ~ π̂(x | v_{1:M,1:N}). The proper weight of this resampled particle is Ẑ = (1/M) Σ_{m=1}^M Ẑ_m, so that the acceptance function of the equivalent classical PMH method is α(x_{t−1}, x̃) = min[1, Ẑ/Ẑ_{t−1}] = min[1, Σ_{m=1}^M Ẑ_m / Σ_{m=1}^M Ẑ_{m,t−1}], where Ẑ_{t−1} = (1/M) Σ_{m=1}^M Ẑ_{m,t−1} (see Table 4).

⁵ This particle approximation can be interpreted as being obtained by a single particle filter that splits the particles into M disjoint sets and then applies the partial resampling described in Section 4.1, i.e., performing resampling steps within the sets. See also Eq. (62).
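A compact Python sketch of one DPMH run is given below. Here each "processor" is a batch IS stage on an assumed one-dimensional target with its own Gaussian proposal (purely illustrative, not a full particle filter), and it returns only the pair {Ẑ_m, x̃_m} used by the central node, as in Table 7.

```python
import numpy as np

rng = np.random.default_rng(5)

def pi_u(x):                                     # unnormalized target (illustrative): N(2, 1) kernel
    return np.exp(-0.5 * (x - 2.0) ** 2)

proposals = [(-2.0, 3.0), (0.0, 5.0), (4.0, 2.0)]   # M = 3 different proposals q_m (assumed)
M, N, T = len(proposals), 50, 2000

def local_filter(mu, sig):
    """One 'processor': draws N particles and returns the summary pair (Z_hat_m, x_tilde_m)."""
    v = rng.normal(mu, sig, size=N)
    q = np.exp(-0.5 * ((v - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))
    w = pi_u(v) / q
    x_tilde = v[rng.choice(N, p=w / w.sum())]    # step 2(b): local resampling
    return w.mean(), x_tilde

chain = np.empty(T)
x_prev, Zsum_prev = 0.0, 1e-300
for t in range(T):
    Z_hat, x_tilde = zip(*(local_filter(mu, sig) for mu, sig in proposals))   # steps 2(a)-(b)
    Z_hat = np.array(Z_hat)
    x_cand = np.array(x_tilde)[rng.choice(M, p=Z_hat / Z_hat.sum())]          # step 2(c)
    if rng.uniform() < min(1.0, Z_hat.sum() / Zsum_prev):                     # step 2(d), Eq. (38)
        x_prev, Zsum_prev = x_cand, Z_hat.sum()
    chain[t] = x_prev

print(chain[T // 10:].mean())                    # approaches the target mean (2 here)
```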

Using partial IS estimators. If we are interested in approximating only one moment of the target pdf, as shown in Figures 3-4, at each iteration we can transmit the M partial estimators Ī^{(m)}_N and combine them in the central node as in Eq. (18), obtaining Ĩ_{NM} = (1 / Σ_{j=1}^M Ẑ_j) Σ_{m=1}^M Ẑ_m Ī^{(m)}_N. Then, a sequence of estimators, Ĩ^{(t)}_{NM}, is created according to the acceptance probability α in Eq. (38). Finally, we obtain the global estimator

    Ĩ_{NMT} = (1/T) Σ_{t=1}^T Ĩ^{(t)}_{NM}.    (40)

This scheme is depicted in Figure 7(b).

Benefits. One advantage of the DPMH scheme is that the generation of samples can be parallelized (i.e., for a fixed computational cost, DPMH allows the use of M processors in parallel) and the communication to the central node requires the transfer of only M particles, x̃_m, and M weights, Ẑ_m, instead of NM particles and NM weights. Figure 7 provides a general sketch of DPMH; its marginal version is described in Appendix D. Another benefit of DPMH is that different types of particle filters can be jointly employed; for instance, different proposal pdfs can be used.

Special cases and extensions. The classical PMH method is a special case of the proposed algorithm of Table 7 when M = 1. If the partial estimators are transmitted to the central node, as shown in Figure 7(b), DPMH coincides with PGMS when M = 1. Adaptive versions of DPMH can be designed in order to select automatically the best proposal pdf among the M densities, based on the weights W̄_m = Ẑ_m / Σ_{j=1}^M Ẑ_j, m = 1, ..., M. For instance, Figure 12(b) shows that DPMH is able to detect the best scale parameters within the M used values.

Figure 7: Graphical representation of the Distributed Particle Metropolis-Hastings (DPMH) method, (a) for estimating a generic moment, or (b) for estimating a specific moment of the target.

6 Numerical Experiments

In this section, we test the novel techniques in several experimental scenarios and three different applications: hyperparameter estimation for Gaussian Processes (D = 2, η = 1), a localization problem jointly with the tuning of the sensor network (D = 8, η = 1), and the online filtering of a remote sensing variable called Leaf Area Index (LAI; D = 365, η = 1). We compare the novel algorithms with different benchmark methods, such as an adaptive MH algorithm, MTM and PMH techniques, parallel MH chains with random walk proposal pdfs, IS schemes, and the Adaptive Multiple Importance Sampling (AMIS) method.

6.1 Hyperparameter tuning for Gaussian Process (GP) regression models

We test the proposed GMS approach for the estimation of the hyperparameters of a Gaussian process (GP) regression model [4, 36]. Let us assume observed data pairs {y_j, z_j}_{j=1}^P, with y_j ∈ R and z_j ∈ R^L. We also denote the corresponding P×1 output vector as y = [y_1, ..., y_P]^⊤ and the L×P input matrix as Z = [z_1, ..., z_P]. We address the regression problem of inferring the unknown function f which links the variables y and z. Thus, the assumed model is y = f(z) + e, where e ~ N(e; 0, σ²), and f(z) is a realization of a GP [36]. Hence f(z) ~ GP(μ(z), κ(z, r)), where μ(z) = 0, z, r ∈ R^L, and we consider the kernel function

    κ(z, r) = exp( − Σ_{l=1}^L (z_l − r_l)² / (2δ²) ).    (41)

Given these assumptions, the vector f = [f(z_1), ..., f(z_P)]^⊤ is distributed as p(f | Z, δ, κ) = N(f; 0, K), where 0 is a P×1 null vector and K_{ij} := κ(z_i, z_j), for all i, j = 1, ..., P, is a P×P matrix. Therefore, the vector containing all the hyperparameters of the model is x = [δ, σ], i.e., all the parameters of the kernel function in Eq. (41) and the standard deviation σ of the observation noise. In this experiment, we focus on the marginal posterior density of the hyperparameters, π̄(x | y, Z, κ) ∝ π(x | y, Z, κ) = p(y | x, Z, κ) p(x), which can be evaluated analytically, but we cannot compute integrals involving it [36]. Considering a uniform prior p(x) within [0, 20]², and since p(y | x, Z, κ) = N(y; 0, K + σ²I), we have

    log[π(x | y, Z, κ)] = −(1/2) y^⊤ (K + σ²I)^{−1} y − (1/2) log[det(K + σ²I)],    for x ∈ [0, 20]²,

where clearly K depends on δ [36] (a minimal sketch of evaluating this log-posterior is given below). The moments of this marginal posterior cannot be computed analytically. Then, in order to compute the Minimum Mean Square Error (MMSE) estimator x̂ = [δ̂, σ̂], i.e., the expected value E[X] with X ~ π̄(x | y, Z, κ), we approximate E[X] via Monte Carlo quadrature.
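For reference, the following Python sketch evaluates this log-posterior (an illustrative implementation, not the authors' code); the synthetic data are generated with the values used later in this section (δ = 3, σ = 10, L = 1, P = 200, z_j ~ U([0, 10])).

```python
import numpy as np

def log_posterior(x, y, Z):
    """Unnormalized log-posterior of x = [delta, sigma] with a uniform prior on [0, 20]^2 (sketch)."""
    delta, sigma = x
    if not (0.0 < delta <= 20.0 and 0.0 < sigma <= 20.0):
        return -np.inf
    d2 = np.sum((Z[:, :, None] - Z[:, None, :]) ** 2, axis=0)   # squared distances; Z is (L, P)
    C = np.exp(-0.5 * d2 / delta ** 2) + sigma ** 2 * np.eye(Z.shape[1])   # K + sigma^2 I, Eq. (41)
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * y @ np.linalg.solve(C, y) - 0.5 * logdet

# Toy usage with synthetic data generated from the GP model described in the text.
rng = np.random.default_rng(6)
P = 200
Z = rng.uniform(0, 10, size=(1, P))
d2 = np.sum((Z[:, :, None] - Z[:, None, :]) ** 2, axis=0)
f = rng.multivariate_normal(np.zeros(P), np.exp(-0.5 * d2 / 3.0 ** 2) + 1e-6 * np.eye(P))
y = f + 10.0 * rng.normal(size=P)
print(log_posterior(np.array([3.0, 10.0]), y, Z))
```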
More specifically, we apply the novel GMS technique and compare it with an MTM sampler, an MH scheme with a longer chain and a static IS method. For all these methodologies, we consider the same number of target evaluations, denoted as E, in order to provide a fair comparison. We generate P = 200 data pairs, {y_j, z_j}_{j=1}^P, according to the GP model above, setting δ = 3, σ = 10, L = 1, and drawing z_j ~ U([0, 10]). We keep these data fixed over the different runs, and the corresponding posterior pdf is shown in Figure 9(b). We computed the ground truth x* = [δ*, σ*], with σ* = 9.28, using an exhaustive and costly grid approximation, in order to compare the different techniques. For the GMS, MTM and MH schemes, we consider the same adaptive Gaussian proposal pdf q_t(x | μ_t, λ²I) = N(x | μ_t, λ²I), with λ = 5, where μ_t is adapted by taking the arithmetic mean of the outputs after a training period, t ≥ 0.2T, in the same fashion as [24, 8] (μ_0 = [1, 1]). First, we test both techniques fixing T = 20 and varying the number of tries N. Then, we set N = 100 and vary the number of iterations T. Figure 8 (log-log plot) shows the Mean Square Error (MSE) in the approximation of x*, averaged over 10³ independent runs. Observe that GMS always outperforms the corresponding MTM scheme. These results confirm the advantage of recycling the auxiliary samples drawn at each iteration during an MTM run. In Figure 9(a), we show the MSE obtained by GMS keeping the number of target evaluations fixed, E = NT = 10³, and varying N ∈ {1, 2, 10, 20, 50, 100, 250, 10³}; as a consequence, T ∈ {10³, 500, 100, 50, 20, 10, 4, 1}. Note that the case N = 1, T = 10³, corresponds to an adaptive MH (A-MH) method with a longer chain, whereas the case N = 10³, T = 1, corresponds to a static IS scheme (both with the same number of posterior evaluations, E = NT = 10³). We observe that GMS always provides a smaller MSE than the static IS approach. Moreover, GMS outperforms A-MH except in the two cases where T ∈ {1, 4}.

6.2 Localization of a target and tuning of the sensor network

We consider the problem of positioning a target in R² using range measurements in a wireless sensor network [1, 20]. We also assume that the measurements are contaminated by noise with different unknown power, one per sensor. This situation is common in several practical scenarios. Indeed, even if the sensors have the same construction features, the noise perturbation of each sensor can vary with time and depends on the location of the sensor.

Figure 8: MSE (log-log scale; averaged over 10³ independent runs) obtained with the MTM and GMS algorithms (using the same proposal pdf and the same values of N and T): (a) as a function of N with T = 20, and (b) as a function of T with N = 100.

Figure 9: (a) MSE (log-log scale; averaged over 10³ independent runs) of GMS (circles) versus the number of candidates N ∈ {1, 2, 10, 20, 50, 100, 250, 10³}, keeping fixed the total number of posterior evaluations E = NT = 1000, so that T ∈ {1000, 500, 100, 50, 20, 10, 4, 1}. The MSE values of the extreme cases N = 1, T = 1000, and N = 1000, T = 1, are depicted with dashed lines. In the first case, GMS coincides with an adaptive MH scheme (due to the adaptation of the proposal, in this example) with a longer chain. The second one represents a static IS scheme (clearly, using the same proposal as GMS). We can observe the benefit of the dynamic combination of IS estimators obtained by GMS. (b) Posterior density π̄(x | y, Z, κ).

This occurs owing to different causes: manufacturing defects, obstacles in the reception, different

physical environmental conditions (such as humidity and temperature), etc. Moreover, in general, these conditions change over time; hence, it is necessary that the central node of the network be able to re-estimate the noise powers jointly with the position of the target (and other parameters of the model, if required) whenever a new block of observations is processed. More specifically, let us denote the target position by the random vector Z = [Z_1, Z_2]. The position of the target is then a specific realization Z = z*. The range measurements are obtained from N_S = 6 sensors located at h_1 = [3, −8]^⊤, h_2 = [8, 10]^⊤, h_3 = [−4, −6]^⊤, h_4 = [−8, 1]^⊤, h_5 = [10, 0]^⊤ and h_6 = [0, 10]^⊤, as shown in Figure 10(a). The observation models are given by

    Y_j = −20 log(||z* − h_j||) + B_j,    j = 1, ..., N_S,    (42)

where the B_j are independent Gaussian random variables with pdfs N(b_j; 0, λ_j²), j = 1, ..., N_S. We denote by λ = [λ_1, ..., λ_{N_S}] the vector of standard deviations. Given the position of the target, z* = [2.5, 1]^⊤, and setting λ = [1, 2, 1, 0.5, 3, 1] (since N_S = 6), we generate N_O = 20 observations from each sensor according to the model in Eq. (42). Then, we obtain a measurement matrix Y = [y_{k,1}, ..., y_{k,N_S}] ∈ R^{d_Y}, where d_Y = N_O N_S = 120 and k = 1, ..., N_O. We consider a uniform prior U(R_z) over the position [z_1, z_2] and a uniform prior over each λ_j, so that λ has prior U(R_λ) with R_λ = [0, 20]^{N_S}. Thus, the posterior pdf is

    π(x | Y) = π(z, λ | Y) ∝ l(Y | z_1, z_2, λ_1, ..., λ_{N_S}) [ Π_{i=1}^2 p(z_i) ] [ Π_{j=1}^{N_S} p(λ_j) ]    (43)
             = [ Π_{k=1}^{N_O} Π_{j=1}^{N_S} (1/√(2πλ_j²)) exp( −(1/(2λ_j²)) (y_{k,j} + 20 log(||z − h_j||))² ) ] I_z(R_z) I_λ(R_λ),    (44)

where x = [z, λ] is the vector of parameters, of dimension D = 2 + N_S = 8, that we wish to infer, and I_c(R) is an indicator function equal to 1 if c ∈ R and 0 otherwise. Our goal is to compute the Minimum Mean Square Error (MMSE) estimator, i.e., the expected value of the posterior π̄(x | Y) = π̄(z, λ | Y).
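A minimal Python sketch of evaluating the log of this posterior, Eqs. (43)-(44), is given below. It is illustrative only: the sensor positions, target position and noise levels used here follow the reconstruction given above, and the prior region R_z is not enforced since its limits are not specified in this excerpt.

```python
import numpy as np

H = np.array([[3., -8.], [8., 10.], [-4., -6.], [-8., 1.], [10., 0.], [0., 10.]])  # sensor positions h_j
N_S, N_O = H.shape[0], 20

def log_posterior(x, Y):
    """x = [z_1, z_2, lam_1, ..., lam_6]; Y has shape (N_O, N_S)."""
    z, lam = x[:2], x[2:]
    if np.any(lam <= 0) or np.any(lam > 20):      # uniform prior on lambda over [0, 20]^{N_S}
        return -np.inf
    mean = -20.0 * np.log(np.linalg.norm(z - H, axis=1))   # model of Eq. (42)
    resid = Y - mean                               # (N_O, N_S) residuals
    return np.sum(-0.5 * (resid / lam) ** 2) - N_O * np.sum(np.log(lam))

# Generate synthetic data with the reconstructed ground truth and evaluate the log-posterior there.
rng = np.random.default_rng(7)
z_true = np.array([2.5, 1.0])
lam_true = np.array([1.0, 2.0, 1.0, 0.5, 3.0, 1.0])
Y = -20.0 * np.log(np.linalg.norm(z_true - H, axis=1)) + lam_true * rng.normal(size=(N_O, N_S))
print(log_posterior(np.concatenate([z_true, lam_true]), Y))
```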
Since the MMSE estimator cannot be computed analytically, we apply Monte Carlo methods to approximate it. We compare GMS, the corresponding MTM scheme, the Adaptive Multiple Importance Sampling (AMIS) technique [9], and N parallel MH chains with a random walk proposal pdf. For all of them we consider Gaussian proposal densities. For GMS and MTM, we set q_t(x | μ_t, σ²I) = N(x | μ_t, σ²I), where μ_t is adapted using the empirical mean of the generated samples after a training period, t ≥ 0.2T [24, 8], with μ_0 ~ U([1, 5]^D) and σ = 1. For AMIS, we have q_t(x | μ_t, C_t) = N(x | μ_t, C_t), where μ_t is adapted as previously described (with μ_0 ~ U([1, 5]^D)) and C_t is also adapted, using the empirical covariance matrix, starting from C_0 = 4I. We also test the use of N parallel Metropolis-Hastings (MH) chains (including the case N = 1, i.e., a single chain), with a Gaussian random-walk proposal pdf, q_n(μ_{n,t} | μ_{n,t−1}, σ²I) = N(μ_{n,t} | μ_{n,t−1}, σ²I), with μ_{n,0} ~ U([1, 5]^D) for all n and σ = 1. We fix the total number of evaluations of the posterior density to E = NT = 10⁴. Note that, generally, the evaluation of the posterior is the most costly step in MC algorithms (however, AMIS has the additional cost of re-weighting all the samples at each iteration according to the deterministic mixture procedure [6, 9, 5]). We recall that T denotes the total number of iterations and N the number of samples drawn from each proposal at each iteration. We consider x* = [z*, λ*] as the ground truth and compute the Mean Square Error (MSE) of the estimates obtained with the different algorithms. The results, averaged over 500 independent runs, are provided in Tables 8, 9 and 10 and in Figure 10(b).
Note that GMS outperforms AMIS for each pair {N, T} (keeping E = NT = 10⁴ fixed), and GMS also provides smaller MSE values than the N parallel MH chains (the case N = 1 corresponds to a single longer chain). Figure 10(b) shows the MSE versus N, maintaining E = NT = 10⁴, for GMS and the corresponding MTM method. This figure again confirms the advantage of recycling the samples in an MTM scheme.

Table 8: Results of GMS: MSE for different pairs {N, T}, with E = NT = 10⁴.

Table 9: Results of AMIS [9]: MSE for different pairs {N, T}, with E = NT = 10⁴.

Table 10: Results of N parallel MH chains with random-walk proposal pdf: MSE for different pairs {N, T}, with E = NT = 10⁴.

6.3 Filtering and Smoothing of the Leaf Area Index (LAI)

We consider the problem of estimating the Leaf Area Index (LAI), denoted as x_d ∈ R⁺ (where d ∈ N⁺ also represents a temporal index), in a specific region at a latitude of 42º N [17]. Since x_d > 0, we consider Gamma prior pdfs over the evolution of the LAI and Gaussian perturbations for the in-situ received measurements, y_d. More specifically, we assume the following state-space model (formed by propagation and measurement equations),

    g_d(x_d | x_{d−1}) = G(x_d; x_{d−1}/b, b) = (1/c_d) x_d^{(x_{d−1} − b)/b} exp(−x_d / b),
    l_d(y_d | x_d)     = N(y_d; x_d, λ²) = (1/√(2πλ²)) exp(−(y_d − x_d)² / (2λ²)),    (45)

for d = 2, ..., D, with initial probability g_1(x_1) = G(x_1; 1, 1), where b, λ > 0 and c_d > 0 is a normalizing constant. Note that the expected value of the Gamma pdf above is x_{d−1} and its variance is b x_{d−1}.

First Experiment. Considering the parameters of the model as known, the posterior pdf is

    π̄(x | y) ∝ l(y | x) g(x)    (46)
              = [ Π_{d=1}^D l_d(y_d | x_d) ] [ ( Π_{d=2}^D g_d(x_d | x_{d−1}) ) g_1(x_1) ],    (47)

with x = x_{1:D} ∈ R^D. To generate the ground truth (i.e., the trajectory x* = x*_{1:D} = [x*_1, ..., x*_D]), we simulate the temporal evolution of the LAI over one year (i.e., D = 365) by using a double logistic function (as suggested in the literature [17]), i.e.,

    x*_d = a_1 + a_2 ( 1/(1 + exp(a_3 (d − a_4))) + 1/(1 + exp(a_5 (d − a_6))) − 1 ).    (48)
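To close this (truncated) section, a short Python sketch of the state-space model in Eq. (45) and of the double-logistic ground truth in Eq. (48) is given below. The parameter values (a_1, ..., a_6, b, λ) are assumptions made for illustration, since they are not recoverable from this excerpt.

```python
import numpy as np
from scipy.stats import gamma, norm

rng = np.random.default_rng(8)
D = 365
# Assumed double-logistic parameters (illustrative only).
a1, a2, a3, a4, a5, a6 = 0.5, 4.0, -0.1, 120.0, 0.1, 280.0
d = np.arange(1, D + 1)
x_true = a1 + a2 * (1 / (1 + np.exp(a3 * (d - a4)))
                    + 1 / (1 + np.exp(a5 * (d - a6))) - 1)      # ground-truth trajectory, Eq. (48)

b, lam = 0.1, 0.2                        # assumed model parameters b and lambda
y = x_true + lam * rng.normal(size=D)    # in-situ measurements, l_d(y_d | x_d) = N(y_d; x_d, lam^2)

def log_transition(x_d, x_prev):
    # g_d(x_d | x_{d-1}) = Gamma(x_d; shape = x_{d-1}/b, scale = b), Eq. (45)
    return gamma.logpdf(x_d, a=x_prev / b, scale=b)

def log_likelihood(y_d, x_d):
    return norm.logpdf(y_d, loc=x_d, scale=lam)

# These two densities are exactly what a particle filter (and hence PGMS or DPMH) needs
# in order to weight propagated particles when tracking the LAI trajectory.
print(log_transition(x_true[10], x_true[9]), log_likelihood(y[10], x_true[10]))
```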


More information

Kernel adaptive Sequential Monte Carlo

Kernel adaptive Sequential Monte Carlo Kernel adaptive Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) December 7, 2015 1 / 36 Section 1 Outline

More information

Radar Sensor Management for Detection and Tracking

Radar Sensor Management for Detection and Tracking Raar Sensor Management for Detection an Tracking Krüger White, Jason Williams, Peter Hoffensetz Defence Science an Technology Organisation PO Box 00, Einburgh, SA Australia Email: Kruger.White@sto.efence.gov.au,

More information

Generalization of the persistent random walk to dimensions greater than 1

Generalization of the persistent random walk to dimensions greater than 1 PHYSICAL REVIEW E VOLUME 58, NUMBER 6 DECEMBER 1998 Generalization of the persistent ranom walk to imensions greater than 1 Marián Boguñá, Josep M. Porrà, an Jaume Masoliver Departament e Física Fonamental,

More information

Proof of SPNs as Mixture of Trees

Proof of SPNs as Mixture of Trees A Proof of SPNs as Mixture of Trees Theorem 1. If T is an inuce SPN from a complete an ecomposable SPN S, then T is a tree that is complete an ecomposable. Proof. Argue by contraiction that T is not a

More information

An introduction to Sequential Monte Carlo

An introduction to Sequential Monte Carlo An introduction to Sequential Monte Carlo Thang Bui Jes Frellsen Department of Engineering University of Cambridge Research and Communication Club 6 February 2014 1 Sequential Monte Carlo (SMC) methods

More information

Robust Low Rank Kernel Embeddings of Multivariate Distributions

Robust Low Rank Kernel Embeddings of Multivariate Distributions Robust Low Rank Kernel Embeings of Multivariate Distributions Le Song, Bo Dai College of Computing, Georgia Institute of Technology lsong@cc.gatech.eu, boai@gatech.eu Abstract Kernel embeing of istributions

More information

A Course in Machine Learning

A Course in Machine Learning A Course in Machine Learning Hal Daumé III 12 EFFICIENT LEARNING So far, our focus has been on moels of learning an basic algorithms for those moels. We have not place much emphasis on how to learn quickly.

More information

On a limit theorem for non-stationary branching processes.

On a limit theorem for non-stationary branching processes. On a limit theorem for non-stationary branching processes. TETSUYA HATTORI an HIROSHI WATANABE 0. Introuction. The purpose of this paper is to give a limit theorem for a certain class of iscrete-time multi-type

More information

u!i = a T u = 0. Then S satisfies

u!i = a T u = 0. Then S satisfies Deterministic Conitions for Subspace Ientifiability from Incomplete Sampling Daniel L Pimentel-Alarcón, Nigel Boston, Robert D Nowak University of Wisconsin-Maison Abstract Consier an r-imensional subspace

More information

Hybrid Fusion for Biometrics: Combining Score-level and Decision-level Fusion

Hybrid Fusion for Biometrics: Combining Score-level and Decision-level Fusion Hybri Fusion for Biometrics: Combining Score-level an Decision-level Fusion Qian Tao Raymon Velhuis Signals an Systems Group, University of Twente Postbus 217, 7500AE Enschee, the Netherlans {q.tao,r.n.j.velhuis}@ewi.utwente.nl

More information

Analyzing Tensor Power Method Dynamics in Overcomplete Regime

Analyzing Tensor Power Method Dynamics in Overcomplete Regime Journal of Machine Learning Research 18 (2017) 1-40 Submitte 9/15; Revise 11/16; Publishe 4/17 Analyzing Tensor Power Metho Dynamics in Overcomplete Regime Animashree Ananumar Department of Electrical

More information

Semiclassical analysis of long-wavelength multiphoton processes: The Rydberg atom

Semiclassical analysis of long-wavelength multiphoton processes: The Rydberg atom PHYSICAL REVIEW A 69, 063409 (2004) Semiclassical analysis of long-wavelength multiphoton processes: The Ryberg atom Luz V. Vela-Arevalo* an Ronal F. Fox Center for Nonlinear Sciences an School of Physics,

More information

Topic 7: Convergence of Random Variables

Topic 7: Convergence of Random Variables Topic 7: Convergence of Ranom Variables Course 003, 2016 Page 0 The Inference Problem So far, our starting point has been a given probability space (S, F, P). We now look at how to generate information

More information

Concentration of Measure Inequalities for Compressive Toeplitz Matrices with Applications to Detection and System Identification

Concentration of Measure Inequalities for Compressive Toeplitz Matrices with Applications to Detection and System Identification Concentration of Measure Inequalities for Compressive Toeplitz Matrices with Applications to Detection an System Ientification Borhan M Sananaji, Tyrone L Vincent, an Michael B Wakin Abstract In this paper,

More information

A Modification of the Jarque-Bera Test. for Normality

A Modification of the Jarque-Bera Test. for Normality Int. J. Contemp. Math. Sciences, Vol. 8, 01, no. 17, 84-85 HIKARI Lt, www.m-hikari.com http://x.oi.org/10.1988/ijcms.01.9106 A Moification of the Jarque-Bera Test for Normality Moawa El-Fallah Ab El-Salam

More information

Sparse Reconstruction of Systems of Ordinary Differential Equations

Sparse Reconstruction of Systems of Ordinary Differential Equations Sparse Reconstruction of Systems of Orinary Differential Equations Manuel Mai a, Mark D. Shattuck b,c, Corey S. O Hern c,a,,e, a Department of Physics, Yale University, New Haven, Connecticut 06520, USA

More information

State estimation for predictive maintenance using Kalman filter

State estimation for predictive maintenance using Kalman filter Reliability Engineering an System Safety 66 (1999) 29 39 www.elsevier.com/locate/ress State estimation for preictive maintenance using Kalman filter S.K. Yang, T.S. Liu* Department of Mechanical Engineering,

More information

NOTES ON EULER-BOOLE SUMMATION (1) f (l 1) (n) f (l 1) (m) + ( 1)k 1 k! B k (y) f (k) (y) dy,

NOTES ON EULER-BOOLE SUMMATION (1) f (l 1) (n) f (l 1) (m) + ( 1)k 1 k! B k (y) f (k) (y) dy, NOTES ON EULER-BOOLE SUMMATION JONATHAN M BORWEIN, NEIL J CALKIN, AND DANTE MANNA Abstract We stuy a connection between Euler-MacLaurin Summation an Boole Summation suggeste in an AMM note from 196, which

More information

Generalized Multiple Importance Sampling

Generalized Multiple Importance Sampling Generalized Multiple Importance Sampling Víctor Elvira, Luca Martino, David Luengo 3, and Mónica F Bugallo 4 Télécom Lille France, Universidad de Valencia Spain, 3 Universidad Politécnica de Madrid Spain,

More information

Non-Linear Bayesian CBRN Source Term Estimation

Non-Linear Bayesian CBRN Source Term Estimation Non-Linear Bayesian CBRN Source Term Estimation Peter Robins Hazar Assessment, Simulation an Preiction Group Dstl Porton Down, UK. probins@stl.gov.uk Paul Thomas Hazar Assessment, Simulation an Preiction

More information

Tractability results for weighted Banach spaces of smooth functions

Tractability results for weighted Banach spaces of smooth functions Tractability results for weighte Banach spaces of smooth functions Markus Weimar Mathematisches Institut, Universität Jena Ernst-Abbe-Platz 2, 07740 Jena, Germany email: markus.weimar@uni-jena.e March

More information

ORTHOGONAL PARALLEL MCMC METHODS FOR SAMPLING AND OPTIMIZATION

ORTHOGONAL PARALLEL MCMC METHODS FOR SAMPLING AND OPTIMIZATION ORTHOGONAL PARALLEL MCMC METHODS FOR SAMPLING AND OPTIMIZATION L Martino, V Elvira, D Luengo, J Corander, F Louzada Institute of Mathematical Sciences and Computing, Universidade de São Paulo, São Carlos

More information

Introduction to Markov Processes

Introduction to Markov Processes Introuction to Markov Processes Connexions moule m44014 Zzis law Gustav) Meglicki, Jr Office of the VP for Information Technology Iniana University RCS: Section-2.tex,v 1.24 2012/12/21 18:03:08 gustav

More information

Separation of Variables

Separation of Variables Physics 342 Lecture 1 Separation of Variables Lecture 1 Physics 342 Quantum Mechanics I Monay, January 25th, 2010 There are three basic mathematical tools we nee, an then we can begin working on the physical

More information

Gaussian processes with monotonicity information

Gaussian processes with monotonicity information Gaussian processes with monotonicity information Anonymous Author Anonymous Author Unknown Institution Unknown Institution Abstract A metho for using monotonicity information in multivariate Gaussian process

More information

arxiv: v4 [math.pr] 27 Jul 2016

arxiv: v4 [math.pr] 27 Jul 2016 The Asymptotic Distribution of the Determinant of a Ranom Correlation Matrix arxiv:309768v4 mathpr] 7 Jul 06 AM Hanea a, & GF Nane b a Centre of xcellence for Biosecurity Risk Analysis, University of Melbourne,

More information

The Press-Schechter mass function

The Press-Schechter mass function The Press-Schechter mass function To state the obvious: It is important to relate our theories to what we can observe. We have looke at linear perturbation theory, an we have consiere a simple moel for

More information

TIME-DELAY ESTIMATION USING FARROW-BASED FRACTIONAL-DELAY FIR FILTERS: FILTER APPROXIMATION VS. ESTIMATION ERRORS

TIME-DELAY ESTIMATION USING FARROW-BASED FRACTIONAL-DELAY FIR FILTERS: FILTER APPROXIMATION VS. ESTIMATION ERRORS TIME-DEAY ESTIMATION USING FARROW-BASED FRACTIONA-DEAY FIR FITERS: FITER APPROXIMATION VS. ESTIMATION ERRORS Mattias Olsson, Håkan Johansson, an Per öwenborg Div. of Electronic Systems, Dept. of Electrical

More information

The derivative of a function f(x) is another function, defined in terms of a limiting expression: f(x + δx) f(x)

The derivative of a function f(x) is another function, defined in terms of a limiting expression: f(x + δx) f(x) Y. D. Chong (2016) MH2801: Complex Methos for the Sciences 1. Derivatives The erivative of a function f(x) is another function, efine in terms of a limiting expression: f (x) f (x) lim x δx 0 f(x + δx)

More information

Relative Entropy and Score Function: New Information Estimation Relationships through Arbitrary Additive Perturbation

Relative Entropy and Score Function: New Information Estimation Relationships through Arbitrary Additive Perturbation Relative Entropy an Score Function: New Information Estimation Relationships through Arbitrary Aitive Perturbation Dongning Guo Department of Electrical Engineering & Computer Science Northwestern University

More information

Equilibrium in Queues Under Unknown Service Times and Service Value

Equilibrium in Queues Under Unknown Service Times and Service Value University of Pennsylvania ScholarlyCommons Finance Papers Wharton Faculty Research 1-2014 Equilibrium in Queues Uner Unknown Service Times an Service Value Laurens Debo Senthil K. Veeraraghavan University

More information

26.1 Metropolis method

26.1 Metropolis method CS880: Approximations Algorithms Scribe: Dave Anrzejewski Lecturer: Shuchi Chawla Topic: Metropolis metho, volume estimation Date: 4/26/07 The previous lecture iscusse they some of the key concepts of

More information

THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE

THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE Journal of Soun an Vibration (1996) 191(3), 397 414 THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE E. M. WEINSTEIN Galaxy Scientific Corporation, 2500 English Creek

More information

Track Initialization from Incomplete Measurements

Track Initialization from Incomplete Measurements Track Initialiation from Incomplete Measurements Christian R. Berger, Martina Daun an Wolfgang Koch Department of Electrical an Computer Engineering, University of Connecticut, Storrs, Connecticut 6269,

More information

Influence of weight initialization on multilayer perceptron performance

Influence of weight initialization on multilayer perceptron performance Influence of weight initialization on multilayer perceptron performance M. Karouia (1,2) T. Denœux (1) R. Lengellé (1) (1) Université e Compiègne U.R.A. CNRS 817 Heuiasyc BP 649 - F-66 Compiègne ceex -

More information

arxiv: v1 [math.co] 29 May 2009

arxiv: v1 [math.co] 29 May 2009 arxiv:0905.4913v1 [math.co] 29 May 2009 simple Havel-Hakimi type algorithm to realize graphical egree sequences of irecte graphs Péter L. Erős an István Miklós. Rényi Institute of Mathematics, Hungarian

More information

SYNCHRONOUS SEQUENTIAL CIRCUITS

SYNCHRONOUS SEQUENTIAL CIRCUITS CHAPTER SYNCHRONOUS SEUENTIAL CIRCUITS Registers an counters, two very common synchronous sequential circuits, are introuce in this chapter. Register is a igital circuit for storing information. Contents

More information

Error Floors in LDPC Codes: Fast Simulation, Bounds and Hardware Emulation

Error Floors in LDPC Codes: Fast Simulation, Bounds and Hardware Emulation Error Floors in LDPC Coes: Fast Simulation, Bouns an Harware Emulation Pamela Lee, Lara Dolecek, Zhengya Zhang, Venkat Anantharam, Borivoje Nikolic, an Martin J. Wainwright EECS Department University of

More information

Florian Heiss; Viktor Winschel: Estimation with Numerical Integration on Sparse Grids

Florian Heiss; Viktor Winschel: Estimation with Numerical Integration on Sparse Grids Florian Heiss; Viktor Winschel: Estimation with Numerical Integration on Sparse Gris Munich Discussion Paper No. 2006-15 Department of Economics University of Munich Volkswirtschaftliche Fakultät Luwig-Maximilians-Universität

More information

Improving Estimation Accuracy in Nonrandomized Response Questioning Methods by Multiple Answers

Improving Estimation Accuracy in Nonrandomized Response Questioning Methods by Multiple Answers International Journal of Statistics an Probability; Vol 6, No 5; September 207 ISSN 927-7032 E-ISSN 927-7040 Publishe by Canaian Center of Science an Eucation Improving Estimation Accuracy in Nonranomize

More information

d dx But have you ever seen a derivation of these results? We ll prove the first result below. cos h 1

d dx But have you ever seen a derivation of these results? We ll prove the first result below. cos h 1 Lecture 5 Some ifferentiation rules Trigonometric functions (Relevant section from Stewart, Seventh Eition: Section 3.3) You all know that sin = cos cos = sin. () But have you ever seen a erivation of

More information

Math Notes on differentials, the Chain Rule, gradients, directional derivative, and normal vectors

Math Notes on differentials, the Chain Rule, gradients, directional derivative, and normal vectors Math 18.02 Notes on ifferentials, the Chain Rule, graients, irectional erivative, an normal vectors Tangent plane an linear approximation We efine the partial erivatives of f( xy, ) as follows: f f( x+

More information

Parameter estimation: A new approach to weighting a priori information

Parameter estimation: A new approach to weighting a priori information Parameter estimation: A new approach to weighting a priori information J.L. Mea Department of Mathematics, Boise State University, Boise, ID 83725-555 E-mail: jmea@boisestate.eu Abstract. We propose a

More information

Capacity Analysis of MIMO Systems with Unknown Channel State Information

Capacity Analysis of MIMO Systems with Unknown Channel State Information Capacity Analysis of MIMO Systems with Unknown Channel State Information Jun Zheng an Bhaskar D. Rao Dept. of Electrical an Computer Engineering University of California at San Diego e-mail: juzheng@ucs.eu,

More information

inflow outflow Part I. Regular tasks for MAE598/494 Task 1

inflow outflow Part I. Regular tasks for MAE598/494 Task 1 MAE 494/598, Fall 2016 Project #1 (Regular tasks = 20 points) Har copy of report is ue at the start of class on the ue ate. The rules on collaboration will be release separately. Please always follow the

More information

State observers and recursive filters in classical feedback control theory

State observers and recursive filters in classical feedback control theory State observers an recursive filters in classical feeback control theory State-feeback control example: secon-orer system Consier the riven secon-orer system q q q u x q x q x x x x Here u coul represent

More information

Tutorial on Maximum Likelyhood Estimation: Parametric Density Estimation

Tutorial on Maximum Likelyhood Estimation: Parametric Density Estimation Tutorial on Maximum Likelyhoo Estimation: Parametric Density Estimation Suhir B Kylasa 03/13/2014 1 Motivation Suppose one wishes to etermine just how biase an unfair coin is. Call the probability of tossing

More information

arxiv: v1 [hep-lat] 19 Nov 2013

arxiv: v1 [hep-lat] 19 Nov 2013 HU-EP-13/69 SFB/CPP-13-98 DESY 13-225 Applicability of Quasi-Monte Carlo for lattice systems arxiv:1311.4726v1 [hep-lat] 19 ov 2013, a,b Tobias Hartung, c Karl Jansen, b Hernan Leovey, Anreas Griewank

More information

How to Minimize Maximum Regret in Repeated Decision-Making

How to Minimize Maximum Regret in Repeated Decision-Making How to Minimize Maximum Regret in Repeate Decision-Making Karl H. Schlag July 3 2003 Economics Department, European University Institute, Via ella Piazzuola 43, 033 Florence, Italy, Tel: 0039-0-4689, email:

More information

This module is part of the. Memobust Handbook. on Methodology of Modern Business Statistics

This module is part of the. Memobust Handbook. on Methodology of Modern Business Statistics This moule is part of the Memobust Hanbook on Methoology of Moern Business Statistics 26 March 2014 Metho: Balance Sampling for Multi-Way Stratification Contents General section... 3 1. Summary... 3 2.

More information

Schrödinger s equation.

Schrödinger s equation. Physics 342 Lecture 5 Schröinger s Equation Lecture 5 Physics 342 Quantum Mechanics I Wenesay, February 3r, 2010 Toay we iscuss Schröinger s equation an show that it supports the basic interpretation of

More information

Node Density and Delay in Large-Scale Wireless Networks with Unreliable Links

Node Density and Delay in Large-Scale Wireless Networks with Unreliable Links Noe Density an Delay in Large-Scale Wireless Networks with Unreliable Links Shizhen Zhao, Xinbing Wang Department of Electronic Engineering Shanghai Jiao Tong University, China Email: {shizhenzhao,xwang}@sjtu.eu.cn

More information

Inter-domain Gaussian Processes for Sparse Inference using Inducing Features

Inter-domain Gaussian Processes for Sparse Inference using Inducing Features Inter-omain Gaussian Processes for Sparse Inference using Inucing Features Miguel Lázaro-Greilla an Aníbal R. Figueiras-Vial Dep. Signal Processing & Communications Universia Carlos III e Mari, SPAIN {miguel,arfv}@tsc.uc3m.es

More information

A. Exclusive KL View of the MLE

A. Exclusive KL View of the MLE A. Exclusive KL View of the MLE Lets assume a change-of-variable moel p Z z on the ranom variable Z R m, such as the one use in Dinh et al. 2017: z 0 p 0 z 0 an z = ψz 0, where ψ is an invertible function

More information

Jointly continuous distributions and the multivariate Normal

Jointly continuous distributions and the multivariate Normal Jointly continuous istributions an the multivariate Normal Márton alázs an álint Tóth October 3, 04 This little write-up is part of important founations of probability that were left out of the unit Probability

More information

Math 342 Partial Differential Equations «Viktor Grigoryan

Math 342 Partial Differential Equations «Viktor Grigoryan Math 342 Partial Differential Equations «Viktor Grigoryan 6 Wave equation: solution In this lecture we will solve the wave equation on the entire real line x R. This correspons to a string of infinite

More information

Computational statistics

Computational statistics Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated

More information

Cascaded redundancy reduction

Cascaded redundancy reduction Network: Comput. Neural Syst. 9 (1998) 73 84. Printe in the UK PII: S0954-898X(98)88342-5 Cascae reunancy reuction Virginia R e Sa an Geoffrey E Hinton Department of Computer Science, University of Toronto,

More information

The total derivative. Chapter Lagrangian and Eulerian approaches

The total derivative. Chapter Lagrangian and Eulerian approaches Chapter 5 The total erivative 51 Lagrangian an Eulerian approaches The representation of a flui through scalar or vector fiels means that each physical quantity uner consieration is escribe as a function

More information

Kernel Sequential Monte Carlo

Kernel Sequential Monte Carlo Kernel Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) * equal contribution April 25, 2016 1 / 37 Section

More information

Left-invariant extended Kalman filter and attitude estimation

Left-invariant extended Kalman filter and attitude estimation Left-invariant extene Kalman filter an attitue estimation Silvere Bonnabel Abstract We consier a left-invariant ynamics on a Lie group. One way to efine riving an observation noises is to make them preserve

More information

Collapsed Gibbs and Variational Methods for LDA. Example Collapsed MoG Sampling

Collapsed Gibbs and Variational Methods for LDA. Example Collapsed MoG Sampling Case Stuy : Document Retrieval Collapse Gibbs an Variational Methos for LDA Machine Learning/Statistics for Big Data CSE599C/STAT59, University of Washington Emily Fox 0 Emily Fox February 7 th, 0 Example

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

A Novel Decoupled Iterative Method for Deep-Submicron MOSFET RF Circuit Simulation

A Novel Decoupled Iterative Method for Deep-Submicron MOSFET RF Circuit Simulation A Novel ecouple Iterative Metho for eep-submicron MOSFET RF Circuit Simulation CHUAN-SHENG WANG an YIMING LI epartment of Mathematics, National Tsing Hua University, National Nano evice Laboratories, an

More information

Predictive Control of a Laboratory Time Delay Process Experiment

Predictive Control of a Laboratory Time Delay Process Experiment Print ISSN:3 6; Online ISSN: 367-5357 DOI:0478/itc-03-0005 Preictive Control of a aboratory ime Delay Process Experiment S Enev Key Wors: Moel preictive control; time elay process; experimental results

More information

ensembles When working with density operators, we can use this connection to define a generalized Bloch vector: v x Tr x, v y Tr y

ensembles When working with density operators, we can use this connection to define a generalized Bloch vector: v x Tr x, v y Tr y Ph195a lecture notes, 1/3/01 Density operators for spin- 1 ensembles So far in our iscussion of spin- 1 systems, we have restricte our attention to the case of pure states an Hamiltonian evolution. Toay

More information

Self-normalized Martingale Tail Inequality

Self-normalized Martingale Tail Inequality Online-to-Confience-Set Conversions an Application to Sparse Stochastic Banits A Self-normalize Martingale Tail Inequality The self-normalize martingale tail inequality that we present here is the scalar-value

More information

Optimal Variable-Structure Control Tracking of Spacecraft Maneuvers

Optimal Variable-Structure Control Tracking of Spacecraft Maneuvers Optimal Variable-Structure Control racking of Spacecraft Maneuvers John L. Crassiis 1 Srinivas R. Vaali F. Lanis Markley 3 Introuction In recent years, much effort has been evote to the close-loop esign

More information

Lower Bounds for Local Monotonicity Reconstruction from Transitive-Closure Spanners

Lower Bounds for Local Monotonicity Reconstruction from Transitive-Closure Spanners Lower Bouns for Local Monotonicity Reconstruction from Transitive-Closure Spanners Arnab Bhattacharyya Elena Grigorescu Mahav Jha Kyomin Jung Sofya Raskhonikova Davi P. Wooruff Abstract Given a irecte

More information

Introduction to Machine Learning

Introduction to Machine Learning How o you estimate p(y x)? Outline Contents Introuction to Machine Learning Logistic Regression Varun Chanola April 9, 207 Generative vs. Discriminative Classifiers 2 Logistic Regression 2 3 Logistic Regression

More information

Chapter 6: Energy-Momentum Tensors

Chapter 6: Energy-Momentum Tensors 49 Chapter 6: Energy-Momentum Tensors This chapter outlines the general theory of energy an momentum conservation in terms of energy-momentum tensors, then applies these ieas to the case of Bohm's moel.

More information

05 The Continuum Limit and the Wave Equation

05 The Continuum Limit and the Wave Equation Utah State University DigitalCommons@USU Founations of Wave Phenomena Physics, Department of 1-1-2004 05 The Continuum Limit an the Wave Equation Charles G. Torre Department of Physics, Utah State University,

More information