Non parametric modeling of multivariate extremes with Dirichlet mixtures

Anne Sabourin, Institut Mines-Télécom, Télécom ParisTech, CNRS-LTCI.
Joint work with Philippe Naveau (LSCE, Saclay), Anne-Laure Fougères (ICJ, Lyon 1), Benjamin Renard (IRSTEA, Lyon).
March 26th, 2013, Rare and Extreme workshop, Aber Wrach.

Air quality
Five air pollutants recorded daily in Leeds (UK).
Health issue: what is the probability of a joint (simultaneous) exceedance of alert thresholds? More generally, what is the probability of regions far from the origin?

Censored multivariate extremes: floods in the 'Gardons' (joint work with Benjamin Renard)
Daily streamflow at 4 neighbouring sites (St Jean du Gard, Mialet, Anduze, Alès).
Joint distribution of extremes? Probability of simultaneous floods?
Historical data: censored data; few clean data.
[Figure: Gard river, Neppel et al. (2010).]

Multivariate extremes
Random vectors $Y = (Y_1, \dots, Y_d)$, with $Y_j \ge 0$.
Margins: $Y_j \sim F_j$, $1 \le j \le d$ (any).
Standardization (unit Fréchet margins): $X_j = -1 / \log F_j(Y_j)$.
Joint extremes: what is the distribution of $X$ above large thresholds, i.e. $P(X \in A \mid X \in A_0)$ for $A \subset A_0$, $0 \notin A_0$, with $A_0$ 'far from the origin'?
[Figure: extremal region $A_0$ in the $(X_1, X_2)$ plane, with thresholds $u_1$ and $u_2$ marked on the axes.]

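Polar decomposition
Polar coordinates: $R = \sum_{j=1}^{d} X_j$ ($L^1$ norm); $W = X / R$.
$W$ lies on the simplex $S_d = \{ w : w_j \ge 0, \ \sum_j w_j = 1 \}$.
Characterizing $P(X \in A \mid X \in A_0)$ amounts to characterizing $P(R > r, \ W \in B \mid R > r_0)$.
[Figure: a point $x$ in the plane (1st component, 2nd component) and its angle $w$, in a set $B$ of the simplex $S_d$.]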
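The standardization and the polar decomposition can be sketched as follows. This is a minimal illustration, not code from the talk: the margins $F_j$ are replaced by empirical ranks (an assumption), and the names `to_unit_frechet`, `polar_decomposition` and `extreme_angles` are hypothetical.

```python
import math

def to_unit_frechet(column):
    """Rank-transform one margin to an approximately unit Frechet scale.

    Assumption: F_j is unknown and replaced by rank / (n + 1), which keeps
    X_j = -1 / log F_j(Y_j) finite for every observation.
    """
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return [-1.0 / math.log(r / (n + 1)) for r in ranks]

def polar_decomposition(x):
    """Return (R, W): R is the L1 norm of x, W = x / R lies on the simplex."""
    r = sum(x)
    return r, [xj / r for xj in x]

def extreme_angles(data, r0):
    """Keep the (R, W) pairs of the standardized observations with R > r0."""
    std_columns = [to_unit_frechet(list(col)) for col in zip(*data)]
    pairs = [polar_decomposition(row) for row in zip(*std_columns)]
    return [(r, w) for r, w in pairs if r > r0]

# Toy data: 4 observations in dimension 2 (made-up numbers).
data = [(1.0, 5.0), (2.0, 3.0), (10.0, 0.5), (4.0, 4.0)]
selected = extreme_angles(data, r0=5.0)
```

Only the angles `W` of the retained points would then be used to fit the angular measure.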

Fundamental result, angular distribution
Radial homogeneity (under the hypothesis of regular variation): $P(R > rt, \ W \in B \mid R \ge t) \xrightarrow[t \to \infty]{} \frac{1}{r} H(B)$.
Above large thresholds $r_0$, $R \perp W$; $H$ (+ margins) rules the joint distribution.
[Figure: two panels showing points $x$ in the $(X_1, X_2)$ plane and their angles $w$.]
Only one condition for a genuine $H$, the moments constraint: $\int w \, \mathrm{d}H(w) = (\frac{1}{d}, \dots, \frac{1}{d})$. Center of mass = center of the simplex.
Few constraints: a non parametric family!

Estimating the angular measure: a non parametric problem
Non parametric estimation (empirical likelihood, Einmahl et al., 2001, Einmahl, Segers, 2009, Guillotte et al., 2011): no explicit expression for the asymptotic variance, Bayesian inference with $d = 2$ only, nothing for censored data.
Compromise: a mixture of countably many parametric models. Infinite-dimensional model + easier Bayesian inference (handling parameters).
Dirichlet mixture model (Boldi, Davison, 2007; S., Naveau, 2013).
How to deal with the moments constraint on $H$ to generate parameters / define a prior?
Do MCMC methods work in moderate dimension ($d = 5$)?
Does it still work with censored data?

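Dirichlet distribution
For $w \in S_d$: $\mathrm{diri}(w \mid \mu, \nu) = \frac{\Gamma(\nu)}{\prod_{i=1}^{d} \Gamma(\nu \mu_i)} \prod_{i=1}^{d} w_i^{\nu \mu_i - 1}$.
$\mu \in S_d$: location parameter (point on the simplex), the 'center'; $\nu > 0$: concentration parameter.
[Figure: Dirichlet density on the simplex, axes $w_1$, $w_2$, $w_3$.]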
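The density in the $(\mu, \nu)$ parametrization is the usual Dirichlet density with shape parameters $\nu \mu_i$. A minimal sketch, computed on the log scale for stability; the function name `diri_density` is hypothetical.

```python
import math

def diri_density(w, mu, nu):
    """Dirichlet density at w on the simplex, with shape parameters nu * mu_i."""
    log_norm = math.lgamma(nu) - sum(math.lgamma(nu * m) for m in mu)
    log_kernel = sum((nu * m - 1.0) * math.log(wi) for wi, m in zip(w, mu))
    return math.exp(log_norm + log_kernel)

# d = 2, mu = (0.3, 0.7), nu = 10 is the Beta(3, 7) density in w_1.
value = diri_density((0.3, 0.7), (0.3, 0.7), 10.0)

# Midpoint-rule check that the d = 2 density integrates to one over w_1.
n_grid = 2000
integral = sum(
    diri_density(((i + 0.5) / n_grid, 1.0 - (i + 0.5) / n_grid), (0.3, 0.7), 10.0)
    for i in range(n_grid)
) / n_grid
```

With $\mu$ at the center of the simplex and $\nu = d$, all shape parameters equal 1 and the density is uniform on $S_d$.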

Dirichlet mixture model (Boldi, Davison, 2007)
$\mu = \mu_{\cdot,1:k}$, $\nu = \nu_{1:k}$, $p = p_{1:k}$, $\psi = (\mu, p, \nu)$,
$h_\psi(w) = \sum_{m=1}^{k} p_m \, \mathrm{diri}(w \mid \mu_{\cdot,m}, \nu_m)$.
Moments constraint on $(\mu, p)$: $\sum_{m=1}^{k} p_m \, \mu_{\cdot,m} = (\frac{1}{d}, \dots, \frac{1}{d})$.
Weakly dense family ($k \in \mathbb{N}$) in the space of admissible angular measures.
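A small sketch of the mixture density and of a numerical check of the moments constraint. The helper names (`mixture_density`, `center_of_mass`, `satisfies_moment_constraint`) and the example parameters are illustrative assumptions, not taken from the talk.

```python
import math

def diri_density(w, mu, nu):
    """Dirichlet density at w with shape parameters nu * mu_i."""
    log_norm = math.lgamma(nu) - sum(math.lgamma(nu * m) for m in mu)
    log_kernel = sum((nu * m - 1.0) * math.log(wi) for wi, m in zip(w, mu))
    return math.exp(log_norm + log_kernel)

def mixture_density(w, mus, nus, ps):
    """h_psi(w) = sum_m p_m * diri(w | mu_m, nu_m)."""
    return sum(p * diri_density(w, mu, nu) for p, mu, nu in zip(ps, mus, nus))

def center_of_mass(mus, ps):
    """Weighted barycenter sum_m p_m * mu_m of the mixture centers."""
    d = len(mus[0])
    return [sum(p * mu[i] for p, mu in zip(ps, mus)) for i in range(d)]

def satisfies_moment_constraint(mus, ps, tol=1e-9):
    """True if the weights sum to one and the barycenter is (1/d, ..., 1/d)."""
    d = len(mus[0])
    centered = all(abs(c - 1.0 / d) < tol for c in center_of_mass(mus, ps))
    return abs(sum(ps) - 1.0) < tol and centered

# d = 3, k = 2: weights chosen so that the barycenter is the simplex center.
mus = [(0.6, 0.2, 0.2), (0.2, 0.4, 0.4)]
nus = [10.0, 20.0]
ok = satisfies_moment_constraint(mus, [1 / 3, 2 / 3])
not_ok = satisfies_moment_constraint(mus, [0.5, 0.5])
h_value = mixture_density((0.3, 0.3, 0.4), mus, nus, [1 / 3, 2 / 3])
```

The example shows why the constraint is awkward: once the centers are fixed, the weights are no longer free.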

Bayesian inference with non censored data
Moments constraint: a barycenter constraint on $(\mu, p)$.
Prior construction? Parameter generation for MCMC sampling? Difficult for dimension > 2.
Re-parametrization (S., Naveau, 2013): work with an unconstrained parameter.
Weak posterior consistency.
MCMC with reversible jumps, manageable in moderate dimension ($d \approx 5$).

Re-parametrization (Sabourin, Naveau, 2013)
How to build a prior on $(p, \mu)$? Constraint on the center of mass: $\sum_j p_j \, \mu_{\cdot,j} = (\frac{1}{d}, \dots, \frac{1}{d})$.
Sequential construction: use the associativity properties of the barycenter.
Intermediate variables: partial centers of mass, determined by eccentricity parameters $(e_1, \dots, e_{k-1}) \in (0, 1)^{k-1}$.
Deduce the last $\mu_{\cdot,k}$ from the first ones: no more constraints!

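Re-parametrization: intermediate variables $(\gamma_1, \dots, \gamma_{k-1})$, partial barycenters
$\gamma_m$: barycenter of the kernels 'following' $\mu_{\cdot,m}$, that is $\mu_{\cdot,m+1}, \dots, \mu_{\cdot,k}$:
$\gamma_m = \left( \sum_{j>m} p_j \right)^{-1} \sum_{j>m} p_j \, \mu_{\cdot,j}$.
In particular, $\gamma_0$ is the center of the simplex.
[Figure: example with $k = 4$, starting from $\gamma_0$.]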

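$\gamma_1$ on a line segment: eccentricity parameter $e_1 \in (0, 1)$
Draw $(\mu_{\cdot,1} \in S_d, \ e_1 \in (0, 1))$. Then $\gamma_1$ is defined by $\frac{\| \gamma_0 \gamma_1 \|}{\| \gamma_0 I_1 \|} = e_1$, and $p_1 = \frac{\| \gamma_0 \gamma_1 \|}{\| \mu_{\cdot,1} \gamma_1 \|}$.
[Figure: example with $k = 4$, showing $\mu_1$, $\gamma_0$, $\gamma_1$ and the segment endpoint $I_1$.]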

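$\gamma_2$ on a line segment: eccentricity parameter $e_2 \in (0, 1)$
Draw $(\mu_{\cdot,2}, e_2)$. Then $\gamma_2$ is defined by $\frac{\| \gamma_1 \gamma_2 \|}{\| \gamma_1 I_2 \|} = e_2$, which in turn gives $p_2$.
[Figure: example with $k = 4$, showing $\mu_1$, $\mu_2$, $\gamma_0$, $\gamma_1$, $\gamma_2$, $I_1$ and $I_2$.]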

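Last density kernel = last center $\mu_{\cdot,k}$
Draw $(\mu_{\cdot,3}, e_3)$: this gives $\gamma_3$ and $p_3$. Then $\mu_{\cdot,4} = \gamma_3$, which gives $p_4$.
[Figure: example with $k = 4$, showing $\mu_1, \dots, \mu_4$, $\gamma_0$, $\gamma_1$, $\gamma_2$ and $I_1$, $I_2$, $I_3$.]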

Summary
Given $(\mu_{\cdot,1:k-1}, e_{1:k-1})$, one obtains $(\mu_{\cdot,1:k}, p_{1:k})$.
The density $h$ may thus be parametrized by $\theta = (\mu_{\cdot,1:k-1}, e_{1:k-1}, \nu_{1:k})$, which lives in an unconstrained 'rectangle'.
[Figure: the complete construction for $k = 4$.]
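The sequential construction can be sketched as below. Two points are assumptions, since the slides only show them graphically: $I_m$ is taken to be the point where the half-line from $\mu_{\cdot,m}$ through $\gamma_{m-1}$ leaves the simplex, and each weight $p_m$ is the length ratio multiplied by the mass not yet allocated. The function name `reparam_to_mixture` is hypothetical.

```python
def reparam_to_mixture(mus_free, eccs):
    """Map (mu_1..mu_{k-1}, e_1..e_{k-1}) to (mu_1..mu_k, p_1..p_k).

    The returned centers and weights satisfy sum_m p_m * mu_m = (1/d, ..., 1/d).
    """
    d = len(mus_free[0])
    gamma = [1.0 / d] * d      # gamma_0: center of the simplex
    rho = 1.0                  # mass not yet allocated to a kernel
    centers, weights = [], []
    for mu, e in zip(mus_free, eccs):
        direction = [g - m for g, m in zip(gamma, mu)]
        # Largest step t such that gamma + t * direction stays in the simplex;
        # that boundary point plays the role of I_m (assumption).
        steps = [-g / dj for g, dj in zip(gamma, direction) if dj < 0]
        if not steps:
            raise ValueError("mu coincides with the current partial barycenter")
        s = e * min(steps)     # gamma_m = gamma_{m-1} + s * direction
        p = rho * s / (1.0 + s)  # ratio |gamma_{m-1} gamma_m| / |mu_m gamma_m|
        gamma = [g + s * dj for g, dj in zip(gamma, direction)]
        centers.append(list(mu))
        weights.append(p)
        rho -= p
    centers.append(gamma)      # last center = last partial barycenter
    weights.append(rho)        # last weight = remaining mass
    return centers, weights

# Example with d = 3 and k = 4 (made-up inputs).
mus_free = [(0.7, 0.2, 0.1), (0.1, 0.6, 0.3), (0.2, 0.2, 0.6)]
eccs = [0.5, 0.3, 0.8]
centers, weights = reparam_to_mixture(mus_free, eccs)
```

Any input in the 'rectangle' yields admissible centers and weights, which is what makes prior specification and MCMC proposals straightforward.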

Bayesian model
New parameter: $\theta_k = (\mu_{\cdot,1:k-1}, e_{1:k-1}, \nu_{1:k})$.
Unconstrained parameter space, a union of product spaces ('rectangles'): $\Theta = \bigcup_{k=1}^{\infty} \Theta_k$, with $\Theta_k = (S_d)^{k-1} \times [0, 1)^{k-1} \times (0, \infty]^{k}$.
Inference: Gibbs + reversible jumps.
Restriction (numerical convenience): $k \le 15$, $\nu < \nu_{\max}$, etc.
'Reasonable' prior: 'flat' and rotation invariant. Balanced weights and uniformly scattered centers.

MCMC sampling: Metropolis-within-Gibbs, reversible jumps
Three transition types for the Markov chain:
Classical (Gibbs): one $\mu_{\cdot,m}$, one $e_m$ or one $\nu_m$ is modified. Proposals of new Dirichlet centers depend on the data.
Trans-dimensional (Green, 1995): one component $(\mu_{\cdot,k}, e_k, \nu_{k+1})$ is added or deleted. Trans-dimensional moves are natural. Additional components again depend on the data.
'Shuffle': permutation of the indices of the original mixture, re-allocating mass from old components to new ones.
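To illustrate what a 'classical' move looks like, here is a toy Metropolis update of a single concentration parameter $\nu$. It is a deliberately simplified stand-in, not the sampler of the talk: there is one Dirichlet kernel with known center, a flat prior on $\log \nu$ and a Gaussian random walk on $\log \nu$ (all assumptions), and the function names are hypothetical.

```python
import math
import random

def sample_dirichlet(mu, nu, rng):
    """Draw one angle from diri(. | mu, nu) by normalizing Gamma(nu * mu_i) draws."""
    g = [rng.gammavariate(nu * m, 1.0) for m in mu]
    total = sum(g)
    return [x / total for x in g]

def log_likelihood(nu, mu, log_sums, n):
    """Log-likelihood of n angles under one kernel diri(. | mu, nu).

    log_sums[i] is the sum over observations of log w_i (sufficient statistic).
    """
    log_norm = math.lgamma(nu) - sum(math.lgamma(nu * m) for m in mu)
    return n * log_norm + sum((nu * m - 1.0) * s for m, s in zip(mu, log_sums))

def metropolis_nu(angles, mu, nu_init, n_iter, step, rng):
    """Random-walk Metropolis on log(nu); returns the chain and acceptance rate."""
    n = len(angles)
    log_sums = [sum(math.log(w[i]) for w in angles) for i in range(len(mu))]
    log_nu = math.log(nu_init)
    current = log_likelihood(nu_init, mu, log_sums, n)
    chain, accepted = [], 0
    for _ in range(n_iter):
        proposal = log_nu + rng.gauss(0.0, step)
        candidate = log_likelihood(math.exp(proposal), mu, log_sums, n)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            log_nu, current = proposal, candidate
            accepted += 1
        chain.append(math.exp(log_nu))
    return chain, accepted / n_iter

rng = random.Random(0)
mu = (0.2, 0.3, 0.5)
angles = [sample_dirichlet(mu, 20.0, rng) for _ in range(200)]
chain, rate = metropolis_nu(angles, mu, nu_init=5.0, n_iter=2000, step=0.2, rng=rng)
```

In the full model, such a move would be one step inside a Gibbs sweep over all $\mu_{\cdot,m}$, $e_m$ and $\nu_m$, alternating with the trans-dimensional and shuffle moves.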

Results in the re-parametrized version
Theoretically (asymptotics):
Posterior consistency: for all $U$ weakly open in $\Theta$ containing $\theta_0$, $\pi_n(U) = \pi(U \mid \mathrm{data}_{1:n}) \xrightarrow[n \to \infty]{} 1$.
Ergodicity of the Markov chain: $\frac{1}{T} \sum_{t=1}^{T} g(\theta_t) \xrightarrow[T \to \infty]{} E_{\pi_n}(g)$.
Empirically: convergence checks.
Better coverage of credible sets ($d = 5$, bivariate margins, simulated data).
[Figure: two panels, angular density plotted against X2/(X2+X5).]

Results in the re-parametrized version (continued)
Empirically: as good (in dimension 2) as the bivariate non parametric model of Guillotte et al. (2011), on data simulated from logistic, asymmetric logistic and Dirichlet models.
[Figure: three panels of $H$ on $[0, 1]$. Solid line: Dirichlet mixture (DM); dotted line: alternative non parametric model.]

Inference with censored data
[Figure: scatter plot of streamflow at Anduze (m3/s) against streamflow at St Jean (m3/s).]
Existing literature: Ledford & Tawn, 1996: censoring at threshold. GEV models. Explicit expression for the censored likelihood.

Issues
Censored data are not points but segments or boxes in $\mathbb{R}^d$.
Angles $W_i$ undefined.
Intervals overlapping the threshold: extreme data or not?
Censored likelihood: the density $\frac{\mathrm{d}r}{r^2} \, \mathrm{d}H(w)$ is integrated over boxes.

Undetermined data (overlapping threshold)
[Figure: $(X_1, X_2)$ plane with the perception threshold and the extreme threshold, distinguishing determined data from overlapping data.]
Treating 'undetermined' data as missing introduces bias!

Undetermined data (overlapping threshold)
[Figure: $(X_1, X_2)$ plane with the perception threshold, the extreme threshold, overlapping data, and the regions $R$ and $R^c$.]
Data in region $R$, hence not in region $R^c$: this gives a well-defined likelihood in a Poisson model.

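Poisson model
$\{ (\frac{t}{n}, \frac{X_t}{n}), \ 1 \le t \le n \} \sim \mathrm{PRM}(\mathrm{Leb} \times \mu)$ on $[0, 1] \times A_{u,n}$.
$A_{u,n}$: fixed failure region. Overlapping regions: $A_{i,n}^c$; their complements: $A_{i,n}$.
$\mu$: 'exponent measure', with Dirichlet mixture angular component: $\frac{\mathrm{d}\mu}{\mathrm{d}r \, \mathrm{d}w}(r, w) = \frac{d}{r^2} \, h(w)$.
Likelihood of overlapping data: $P\left[ N\left( [\frac{t_1}{n}, \frac{t_2}{n}] \times \frac{1}{n} A_i \right) = 0 \right] = \exp[ -(t_2 - t_1) \, \mu(A_i) ]$.
[Figure: rescaled plane $(X_1 / n, X_2 / n)$ with thresholds $u_1 / n$ and $u_2 / n$, the failure region and the overlapping regions.]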
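As a worked illustration of the void probability, take the region $A = \{x : x_j > u_j \text{ for some } j\}$. Integrating the density $\frac{d}{r^2} h(w)$ over $r > \min_j u_j / w_j$ gives $\mu(A) = d \int \max_j (w_j / u_j) \, h(w) \, \mathrm{d}w$, which can be estimated by Monte Carlo. The choice of region, the estimator and all function names are assumptions made for this sketch.

```python
import math
import random

def sample_dirichlet(mu, nu, rng):
    """Draw one angle from diri(. | mu, nu) by normalizing Gamma(nu * mu_i) draws."""
    g = [rng.gammavariate(nu * m, 1.0) for m in mu]
    total = sum(g)
    return [x / total for x in g]

def sample_mixture_angle(mus, nus, ps, rng):
    """Draw one angle from the Dirichlet mixture with weights ps."""
    m = rng.choices(range(len(ps)), weights=ps)[0]
    return sample_dirichlet(mus[m], nus[m], rng)

def exponent_measure_outside_box(u, mus, nus, ps, n_draws, rng):
    """Monte Carlo estimate of mu({x : x_j > u_j for some j}).

    Uses the identity mu(A) = d * E_h[max_j W_j / u_j].
    """
    d = len(u)
    total = 0.0
    for _ in range(n_draws):
        w = sample_mixture_angle(mus, nus, ps, rng)
        total += max(wj / uj for wj, uj in zip(w, u))
    return d * total / n_draws

def void_probability(t1, t2, measure):
    """P[no point in the region during (t1, t2)] = exp(-(t2 - t1) * mu(A))."""
    return math.exp(-(t2 - t1) * measure)

# d = 2 with a uniform angular density (mu = (1/2, 1/2), nu = 2):
# E[max(W, 1 - W)] = 3/4, so mu(A) = 2 * 0.75 / 3 = 0.5 for u = (3, 3).
rng = random.Random(42)
estimate = exponent_measure_outside_box(
    (3.0, 3.0), [(0.5, 0.5)], [2.0], [1.0], n_draws=20000, rng=rng
)
prob = void_probability(0.0, 2.0, estimate)
```

For the boxes arising from censored observations the integral has no such shortcut, which is the motivation for the data augmentation step.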

'Censored' likelihood and data augmentation
Data augmentation: generate the missing components under univariate conditional distributions. One more Gibbs step, no more numerical integration.
$Z_{j,1:r} \sim [X_{\text{missing}} \mid X_{\text{obs}}, \theta]$.
Augmentation data: $Z_j = [X_{\text{censored}} \mid X_{\text{observed}}, \theta]$.
Dirichlet kernels give explicit univariate conditionals, hence exact sampling of censored data on the censored interval.
[Figure: $(x_1, x_2)$ plane with thresholds $u_1 / n$ and $u_2 / n$, the extremal region, a censored interval and the augmentation data.]
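Exact sampling on a censored interval can be done by inverting the conditional distribution function. The sketch below shows the mechanism only: it uses a unit Fréchet law as a stand-in for the true univariate conditional under the Dirichlet mixture, which is not reproduced here, and the function names are hypothetical.

```python
import math
import random

def frechet_cdf(x):
    """Unit Frechet distribution function F(x) = exp(-1 / x), for x > 0."""
    return math.exp(-1.0 / x)

def sample_truncated_frechet(lower, upper, rng):
    """Exact draw from a unit Frechet law conditioned on lower <= X <= upper.

    Requires 0 < lower < upper < infinity.
    """
    if not (0.0 < lower < upper < math.inf):
        raise ValueError("need 0 < lower < upper < infinity")
    f_lo, f_hi = frechet_cdf(lower), frechet_cdf(upper)
    u = f_lo + (f_hi - f_lo) * rng.random()   # uniform on [F(lower), F(upper))
    x = -1.0 / math.log(u)                    # inverse of the Frechet CDF
    return min(max(x, lower), upper)          # guard against rounding error

rng = random.Random(1)
draws = [sample_truncated_frechet(2.0, 5.0, rng) for _ in range(1000)]
```

Within the Gibbs sampler, one such draw per censored coordinate would replace the numerical integration of the likelihood over the censoring box.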

Simulated data (Dirichlet, $d = 4$, $k = 3$ components), same censoring as the real data
Pairwise plot and angular measure density (true / posterior predictive).
[Figure: pairwise plot of S3 against S4; angular density $h$ against X3/(X3+X4).]

Angular predictive density for Gardons data
[Figure: six panels of the predictive angular density $h$, one per pair of sites, plotted against St Jean/(St Jean+Mialet), St Jean/(St Jean+Anduze), St Jean/(St Jean+Ales), Mialet/(Mialet+Anduze), Mialet/(Mialet+Ales) and Anduze/(Anduze+Ales).]

Conclusion
Bayesian Dirichlet model for multivariate large excesses: 'non' parametric, suitable for moderate dimension, adaptable to censored data.
Two R packages: DiriXtremes (MCMC algorithm for Dirichlet mixtures) and DiriCens (implementation with censored data).
Towards high dimension (GCM grid, spatial fields): impose a reasonable (sparse) structure on the Dirichlet parameters?
Possible application: use the posterior sample to simulate regional extremes?

References
Boldi, M.-O. and Davison, A. C. (2007). A mixture model for multivariate extremes. JRSS: Series B (Statistical Methodology), 69(2):217-229.
Gómez, G., Calle, M. L., and Oller, R. (2004). Frequentist and Bayesian approaches for interval-censored data. Statistical Papers, 45(2):139-173.
Ledford, A. and Tawn, J. (1996). Statistics for near independence in multivariate extreme values. Biometrika, 83(1):169-187.
Neppel, L., Renard, B., Lang, M., Ayral, P., Coeur, D., Gaume, E., Jacob, N., Payrastre, O., Pobanz, K., and Vinet, F. (2010). Flood frequency analysis using historical data: accounting for random and systematic errors. Hydrological Sciences Journal - Journal des Sciences Hydrologiques, 55(2):192-208.
Sabourin, A. and Naveau, P. (2013). Bayesian Dirichlet mixture model for multivariate extremes: a re-parametrization. Computational Statistics and Data Analysis.
Schnedler, W. (2005). Likelihood estimation for censored random vectors. Econometric Reviews, 24(2):195-217.
Van Dyk, D. and Meng, X. (2001). The art of data augmentation. Journal of Computational and Graphical Statistics, 10(1):1-50.