Journées de Statistique Marne La Vallée, January 2005 Contraction properties of Feynman-Kac semigroups Pierre DEL MORAL, Laurent MICLO Lab. J. Dieudonné, Nice Univ., LATP Univ. Provence, Marseille 1
Notations (E, E) measurable space P(E) = {µ probability on E} B b (E) = {f : E R bounded + E-measurable} (µ, f) P(E) B b (E) µ(f) = f(x)µ(dx) M(x, dx ) Markov kernel on E µm(dx ) = µ(dx)m(x, dx ) and M(f)(x) = M(x, dx )f(x ) and, with X n Markov chain M(x, dy) M n (x, dz) = M n 1 (x, dy)m(y, dz) = P x (X n dz) 2
Feynman-Kac models X n Markov chain M n (x, dy) on (E, E) Potential function G n : E [0, ) Feynman-Kac measures ( test funct. f : E R) η n (f) = γ n (f)/γ n (1) with γ n (f) = E η0 (f(x n ) 0 p<n G p (X p )) Ex.: Updated models γ n (f) = γ n (fg n ) η n (f) = γ n (f)/ γ n (1) = η n (fg n )/η n (G n ). G n = e βv n η n = n-th marginal path. measure Q n = 1 Z n exp { β 0 p<n V p(x p )} dp n G n = 1 A η n =Law(X n 0 p < n X p A) Note: Z n = γ n (1) and γ n (f) = η n (f) 0 p<n η p(g p ) 1 n log Z n = 1 n 0 p<n log η p(g p ) 3
Feynman-Kac semigroups Unnormalized models γ n = γ p Q p,n linear semigroup Q p,n (f)(x) = E p,x (f(x n ) p q<n G q (X q )) Normalized models η n = Φ p,n (η p ) nonlinear semigroup Φ p,n = Φ n 1,n Φ p,p+1 with Φ p,p+1 (η) = Ψ p (η)m p+1 and Ψ p (η)(dx) = 1 η(g p ) G p(x) η(dx) Pb.: Asymptotic stability, contraction properties, functional entropy inequalities, decays estimates... 4
Applications/Motivations Particle physics: Markov evolution X n Absorbing medium G(x) = e V (x) [0, 1] X c n Ec = E {c} absorption Xc n exploration X c n+1 Absorption/killing: Xc n = X c n, with proba G(Xc n ); otherwise the particle is killed and Xc n = c. A = {x : G(x) = 0} Hard obstacles T = inf {n 0 ; Xc n = c} Absorption time X c T +n = Xc T +n = c Feynman-Kac models (G, X n ) : γ n = Law(X c n ; T n) and η n = Law(X c n T n) 5
Genetics/Stoch. algo ( particle methods = local perturbation/stoch. linearization) ξ n E N selection ξn E N mutation ξ n+1 E N iid Φ n 1,n ( 1 N 1 i N δ ξ i n 1 ) Selection transition ξ n = ( ξi n ) 1 i N iid Ψ n ( 1 N 1 j N δ ξ j ) = n 1 1 j N G n (ξ j n) 1 k N G n(ξ k n )δ ξ j n Mutation transition ξi n ξn+1 i M n+1( ξi n,.) Particle occupation measures: η N n = 1 N N δ ξ i n N η n and γ N n (.) = η N n (.) i=1 0 p<n η N p (G p ) N γ n (.)
Advanced signal processing ( filtering/hidden Markov chains/bayesian methodology) Signal process: X n = Markov chain E, η 0 = Law(X 0 ) Observation/Sensor eq.: Y n = H n (X n, V n ) F with P(H n (x n, V n ) dy n ) = g n (x n, y n ) λ n (dy n ) Example: Y n = h n (X n ) + V n F = R, with Gaussian noise V n = N (0, 1) P(h n (x n ) + V n dy n ) = (2π) 1/2 e 1 2 (y n h n (x n )) 2 dy n = exp [h n (x n )y n h 2 n (x n)/2] }{{} g n (x n,y n ) N (0, 1)(dy n ) }{{} λ n (dy n ) Prediction/filtering Feynman-Kac representation G n (x n ) = g n (x n, y n ) η n = Law(X n 0 p < n Y p = y p ) and η n = Law(X n 0 p n Y p = y p )
Statistics: ( Sequential MCMC and Feynman-Kac-Metropolis models) Metropolis potential [π target measure]+[(k, L) pair Markov transitions] G(y 1, y 2 ) = π(dy 2)L(y 2, dy 1 ) π(dy 1 )K(y 1, dy 2 ) Th.: (Time reversal formula), [A. Doucet, P.DM; (Séminaire Probab. 2003)] In addition : E L π (f n(y n, Y n 1..., Y 0 ) Y n = y) = EK y (f n(y 0, Y 1,..., Y n ) { 0 p<n G(Y p, Y p+1 )}) E K y ({ 0 p<n G(Y p, Y p+1 )}) FK-Metropolis n-marginal: lim n η n = π (cv. decays π) Nonhomogeneous models: (π n, L n, K n ) π n (dy) e β nv (y) λ(dy), cooling schedule β n, mutation s.t. π n = π n K n, and Law(X 0 ) = π 0 G n (y 1, y 2 ) = exp [ (β n+1 β n )V (y 1 )] = η n = π n
Spectral analysis (time homogeneous models) G M(h) = λ h (> 0), G = λh/m(h) = γ n (fm(h)) Ψ h (η 0 )M n h (f) with Then Ψ h (η 0 )(dx) = 1 η 0 (h) h(x) η 0(dx) and M h (x, dy) = 1 M(h)(x) M(x, dy)h(y) Ψ M(h) ( η n ) = Ψ h (η 0 )M n h (f) = Law(Y n) with Y n Markov chain M h Φ 0,n = Φ n contractive! η = Φ(η ) If M µ-reversible then: M(Gh) = λh (> 0) η = Ψ h (µ) and η (G) = λ Note: (H) m : M m (x,.) ɛ M m (y,.) and G n (x) r G n (y) h = dη /dµ [ɛ/r m, r m /ɛ] 6
h-relative entropy, h convex, h(ax, ay) = ah(x, y) R { }, h(1, 1) = 0 H(η, µ) = h(dη, dµ) = g(dη/dµ) dµ with g(x) = h(x, 1) Ex.: Total variation and L p -norms: g(t) t 1 p Boltzmann or Shannon-Kullback entropy: g(t) = t log t Havrda-Charvat entropy order p > 1: g(t) = 1 p 1 (tp 1) Kakutani-Hellinger integrals order α (0, 1): g(t) = t t α 7
Dobushin s contraction coefficient ηm µm tv β(m) = def. sup M(x,.) M(y,.) tv = sup x,y η,µ η µ tv Th. [L.Miclo, M. Ledoux, P.DM (PTRF 2003)] H(µM, ηm) β(m)h(µ, η) Ex.: M(x,.) ɛm(y,.) = β(m) (1 ɛ) = H(µM, ηm) (1 ɛ)h(µ, η) Corollary (Filtering with a wrong initial condition η 0 η n η n ) [ n ] E(Ent( η n η n)) E(Ent(η n η n)) p=1 β(m p ) Ent(η 0 η 0 ) 8
Feynman-Kac semigroups Φ p,n (µ)(f) = µq p,nf µq p,n 1 = µ(g p,np p,n (f)) µ(g p,n ) with G p,n = def. Q p,n (1) and P p,n (f) = def. Q p,n (f) Q p,n (1) Note: β(p p,n ) = sup µ,η Φ p,n (η) Φ p,n (µ) tv and let g p,n = def. sup G p,n (x)/g p,n (y) x,y Th. h(x, y) suff. regular. α h (t) s.t. H(Φ p,n (µ), Φ p,n (η)) α h (g p,n ) β(p p,n ) H(µ, η) Ex.: α h (t) = t (total variation norm and Boltzmann entropy), α h (t) = t 1+p (Havrda-Charvat and Kakutani-Hellinger integrals of order p, α h (t) = t 3 (L 2 -norms),... 9
Contraction estimates (H) m : (M m (x,.) ɛ M m (y,.) and G n (x) r G n (y) ) Lemma Th. (H) m = p 0 sup g p,n r m /ɛ and β(p p,p+nm ) (1 ɛ 2 /r m 1 ) n n p Φ p,p+nm (η) Φ p,p+nm (µ) tv (1 ɛ 2 /r m 1 ) n ( = (1 ɛ 2 ) n = (H) 1 ) and H(Φ p,p+nm (µ), Φ p,p+nm (η)) α h (r m /ɛ) (1 ɛ 2 /r m 1 ) n H(µ, η) Extensions: nonhomogeneous models, continuous time FK semigroups,... 10
Some application model areas Asymptotic stability of optimal filters. Uniform estimates for particle approximation models (w.r.t time parameter). Spectral analysis of FK-Schrödinger s.g., new probab./particle interpretations. Long time behavior of Feynman-Kac-Metropolis models. Stability properties of infinite population genetic models. 11