Two practical tools for rainfall weather generators
Philippe Naveau (naveau@lsce.ipsl.fr)
Laboratoire des Sciences du Climat et de l'Environnement (LSCE), Gif-sur-Yvette, France
FP7-ACQWA, GIS-PEPER, MIRACLE & ANR-McSim, MOPERA
31 May 2012
Univariate modeling (joint work with P. Ribereau and A. Hannart)
Weather station of Guipavas (near Brest), 1996-2011
Spring hourly precipitation: 192 observations (16 years x 3 months x 4 weeks)
Our goal: modeling moderate and heavy precipitation
Thresholding: the Generalized Pareto Distribution (GPD)
$$\bar H_\xi(x/\sigma) = \left(1 + \xi\,\frac{x}{\sigma}\right)_+^{-1/\xi}$$
Vilfredo Pareto (1848-1923): born in France and trained as an engineer in Italy, he turned to the social sciences and ended his career in Switzerland. He formulated the power-law distribution (or "Pareto's Law") as a model for how income or wealth is distributed across society.
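As a small illustration of the GPD formula, the sketch below evaluates the survival function and inverts it to obtain a quantile (the quantity behind a return level). The function names `gpd_survival` and `gpd_quantile` are illustrative choices, not taken from the talk, and the case $\xi = 0$ is handled through the usual exponential limit.

```python
import math

def gpd_survival(x, sigma, xi):
    """Survival function (1 + xi*x/sigma)_+^(-1/xi), with the exponential limit at xi = 0."""
    if x <= 0:
        return 1.0
    if xi == 0.0:
        return math.exp(-x / sigma)
    base = 1.0 + xi * x / sigma
    if base <= 0.0:
        # Beyond the finite upper end point, which exists only when xi < 0.
        return 0.0
    return base ** (-1.0 / xi)

def gpd_quantile(p, sigma, xi):
    """Value x such that the survival function at x equals 1 - p (0 < p < 1)."""
    if xi == 0.0:
        return -sigma * math.log(1.0 - p)
    return sigma / xi * ((1.0 - p) ** (-xi) - 1.0)
```

For instance, `gpd_quantile(0.99, sigma, xi)` is the level exceeded by 1% of the threshold excesses under this model.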
From bounded to heavy tails
[Figure: GP densities and simulated samples (plotted against Index) for ξ = -0.5, ξ = 0.0 and ξ = 0.5]
Modeling moderate and heavy precipitation
Desiderata:
- a GP tail for extreme rainfall
- Gamma-like behavior for moderate precipitation (Katz et al., 2002)
- Very few parameters
- Rapid and efficient estimation schemes
- Easy to obtain return levels
- Simple simulation algorithms
- No threshold selection
Frigessi et al. (2002), Vrac and N. (2007)
$$c\,\Big[\big(1 - p_{\mu,\tau}(x)\big) \times \text{light-tailed density} + p_{\mu,\tau}(x)\, h_\xi(x/\sigma)/\sigma\Big],$$
where $h_\xi$ is a GP density with $\xi > 0$ and the weight function $p_{\mu,\tau}(\cdot)$ is
$$p_{\mu,\tau}(x) = \frac{1}{2} + \frac{1}{\pi}\arctan\left(\frac{x-\mu}{\tau}\right).$$
Desiderata:
(+) a GP tail for extreme rainfall
(+) Gamma-like behavior for moderate precipitation
(-) Very few parameters
Inference rapid and efficient
(-) Easy to obtain return levels
(+) Simple simulation schemes
(+/-) No threshold selection
(-) One component has to be heavy-tailed
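To make the role of the weight function concrete, here is a minimal sketch of the mixture before normalisation. The normalising constant $c$ is deliberately left out, and the unit-rate exponential used as the light-tailed component is only a hypothetical stand-in (the slide does not fix that density); all function names are illustrative.

```python
import math

def mixing_weight(x, mu, tau):
    """Weight p_{mu,tau}(x) = 1/2 + arctan((x - mu)/tau)/pi, increasing from 0 to 1."""
    return 0.5 + math.atan((x - mu) / tau) / math.pi

def gp_density(x, sigma, xi):
    """Scaled GP density h_xi(x/sigma)/sigma, for xi > 0 and x >= 0."""
    if x < 0:
        return 0.0
    return (1.0 + xi * x / sigma) ** (-1.0 / xi - 1.0) / sigma

def unnormalised_mixture(x, light_density, mu, tau, sigma, xi):
    """(1 - p) * light-tailed density + p * GP density; the constant c is omitted."""
    p = mixing_weight(x, mu, tau)
    return (1.0 - p) * light_density(x) + p * gp_density(x, sigma, xi)

def exp_density(x):
    """Hypothetical light-tailed component: unit-rate exponential density."""
    return math.exp(-x) if x >= 0 else 0.0
```

The weight equals 1/2 at $x = \mu$ and tends to 1 for large $x$, so the GP component dominates the upper tail while the light-tailed component drives the bulk.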
Hybrid Pareto, Carreau and her co-authors (2006; 2009; 2011)
Let $\alpha$ be the junction point (or threshold), let $f(y;\mu,\sigma)$ be the Gaussian density function with parameters $\mu$ and $\sigma$, and let $g(y-\alpha;\xi,\beta)$ be the Generalized Pareto density with parameters $\xi > 0$ and $\beta$. The hybrid Pareto density function is given by:
$$h(y;\xi,\mu,\sigma) = \begin{cases}
\dfrac{1}{\gamma}\,\dfrac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\dfrac{(y-\mu)^2}{2\sigma^2}\right) & \text{if } y \le \alpha,\\[2ex]
\dfrac{1}{\gamma}\,\dfrac{1}{\beta}\left(1+\xi\,\dfrac{y-\alpha}{\beta}\right)^{-1/\xi-1} & \text{if } y > \alpha \text{ and } \xi \neq 0,\\[2ex]
\dfrac{1}{\gamma}\,\dfrac{1}{\beta}\,e^{-(y-\alpha)/\beta} & \text{if } y > \alpha \text{ and } \xi = 0,
\end{cases}$$
where $\gamma$ is the appropriate constant such that the density integrates to one.
The smoothness constraints at $\alpha$ (on the density, $f(\alpha;\mu,\sigma) = g(0;\xi,\beta)$, and on its derivative) make $\alpha$ and $\beta$ functions of the free parameters. Using the Lambert W function:
$$\alpha = \mu + \sigma\sqrt{W\left(\frac{(1+\xi)^2}{2\pi}\right)}, \qquad \beta = \frac{\sigma(1+\xi)}{\sqrt{W\left(\frac{(1+\xi)^2}{2\pi}\right)}}, \qquad \gamma = 1 + \frac{1}{2}\left[1 + \operatorname{Erf}\left(\sqrt{\frac{1}{2}\,W\left(\frac{(1+\xi)^2}{2\pi}\right)}\right)\right].$$
[Figure 2: Left panel: hybrid Pareto density with parameters ξ = 0.4, µ = 0 and σ = 1. Right panel: hybrid Pareto log-density for various tail parameters ξ; in all cases µ = 0 and σ = 1.]
Desiderata:
(+) a GP tail for extreme rainfall
(-) Gamma-like behavior for moderate precipitation
(+) Very few parameters (3)
(+) MLE inference
(+) Easy to obtain return levels
(+) Simple simulation schemes
(+) No threshold selection
(-) Strong dependence among parameters
(-) Negative values allowed
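The hybrid Pareto construction ties the junction point and the GP scale to the three free parameters through the Lambert W function. The sketch below computes $\alpha$, $\beta$ and $\gamma$ under the standard closed forms $\alpha = \mu + \sigma\sqrt{W}$, $\beta = \sigma(1+\xi)/\sqrt{W}$ and $\gamma = 1 + \tfrac12[1+\operatorname{Erf}(\sqrt{W/2})]$ with $W = W((1+\xi)^2/(2\pi))$. The Newton solver `lambert_w0` is a simple stand-in written for this example, not a library routine.

```python
import math

def lambert_w0(z, tol=1e-12):
    """Principal branch of Lambert W for z >= 0: solves w * exp(w) = z by Newton steps."""
    w = math.log1p(z)  # starting value with w*exp(w) >= z, so the iterates decrease to the root
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w

def hybrid_pareto_constants(xi, mu, sigma):
    """Junction point alpha, GP scale beta and normalising constant gamma."""
    w = lambert_w0((1.0 + xi) ** 2 / (2.0 * math.pi))
    alpha = mu + sigma * math.sqrt(w)
    beta = sigma * (1.0 + xi) / math.sqrt(w)
    gamma = 1.0 + 0.5 * (1.0 + math.erf(math.sqrt(w / 2.0)))
    return alpha, beta, gamma
```

With ξ = 0.4, µ = 0 and σ = 1, the Gaussian density evaluated at the returned `alpha` matches `1 / beta`, which is the continuity constraint at the junction.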
Going back to Extreme Value Theory (second order condition)
Falk M., Hüsler J., Reiss R.-D. (2010):
$$\frac{1}{\sigma}\,h_\xi(x/\sigma)\left[1 + O\left(\bar H_\xi^{\delta}(x/\sigma)\right)\right], \quad \text{where } \delta > 0.$$
An Extended Generalized Pareto:
$$f_\theta(x) = (1 + 1/\delta)\,\frac{1}{\sigma}\,h_\xi(x/\sigma)\left[1 - \bar H_\xi^{\delta}(x/\sigma)\right], \quad \text{where } \delta > 0$$
(links with skewed densities).
[Figure: three pdfs, density(x) against x. Grey = GP(0, 1.8, 0.5), dashed = gamma(1.2, 1.6), solid = egp(1, 1, 0.5)]
Properties
Simulations: with $B_\delta \sim \text{Beta}(1/\delta, 2)$,
$$X = \frac{\sigma}{\xi}\left(B_\delta^{-\xi/\delta} - 1\right)$$
Quantiles:
$$x_p = \frac{\sigma}{\xi}\left[\left(G_\delta^{-1}(1-p)\right)^{-\xi/\delta} - 1\right] \quad \text{with} \quad G_\delta(u) = c_\delta\,u^{1/\delta}\left(1 - \frac{u}{1+\delta}\right)$$
Probability weighted moments $\mu_s = E\left[X \bar F_\theta^{\,s}(X)\right]$:
$$\mu_0 = \tilde\mu_0\,\frac{(2+\delta)\,\big(1 - \xi/(2+\delta)\big)}{(1+\delta)\,\big(1 - \xi/(1+\delta)\big)}, \quad \mu_1 = \tilde\mu_1 \ldots \quad \text{and} \quad \mu_2 = \tilde\mu_2 \ldots$$
where $\tilde\mu_s$ denotes the corresponding GPD moment.
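The simulation and quantile formulas translate almost directly into code. The sketch below assumes $\xi > 0$, takes $G_\delta$ to be the Beta(1/δ, 2) distribution function (so that $c_\delta = 1 + 1/\delta$), and inverts it by bisection. The function names are illustrative.

```python
import math
import random

def beta_cdf(v, delta):
    """CDF of Beta(1/delta, 2) on [0, 1]: (1 + 1/delta) * v**(1/delta) * (1 - v/(1 + delta))."""
    return (1.0 + 1.0 / delta) * v ** (1.0 / delta) * (1.0 - v / (1.0 + delta))

def beta_quantile(q, delta, tol=1e-12):
    """Inverse of beta_cdf by bisection (the CDF is increasing on [0, 1])."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, delta) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def egpd_quantile(p, sigma, xi, delta):
    """x_p = sigma/xi * (G^{-1}(1 - p)**(-xi/delta) - 1), for xi > 0."""
    b = beta_quantile(1.0 - p, delta)
    return sigma / xi * (b ** (-xi / delta) - 1.0)

def egpd_cdf(x, sigma, xi, delta):
    """P(X <= x) = 1 - G(Hbar(x/sigma)**delta), for xi > 0."""
    if x <= 0:
        return 0.0
    gp_survival = (1.0 + xi * x / sigma) ** (-1.0 / xi)
    return 1.0 - beta_cdf(gp_survival ** delta, delta)

def egpd_sample(n, sigma, xi, delta, rng=random):
    """Draw n values via X = sigma/xi * (B**(-xi/delta) - 1), B ~ Beta(1/delta, 2)."""
    draws = []
    for _ in range(n):
        b = max(rng.betavariate(1.0 / delta, 2.0), 1e-300)  # guard against an exact zero
        draws.append(sigma / xi * (b ** (-xi / delta) - 1.0))
    return draws
```

A seeded generator such as `random.Random(0)` can be passed as `rng` to make simulations reproducible.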
Inference of $\mu_k = E\left[X \bar F_\theta^{\,k}(X)\right]$ for finite samples
U-statistics (Lee, 1990; Hoeffding, 1948; Landwehr et al., 1979):
$$\hat\mu_k = \binom{n}{k}^{-1} \sum_{(n,k)} \frac{1}{k}\,\min(X_{i_1}, \ldots, X_{i_k})$$
Variance and other properties (Furrer & N., 2007):
$$\operatorname{var}(\hat\mu_k) = \binom{n}{k}^{-1} \sum_{i=1}^{k} \binom{k}{i}\binom{n-k}{k-i}\operatorname{var}(W_{ik}),$$
with $P\{W_{ik} > x\} = \bar F^{\,i}\big(w^{-1}(x)\big)$, where $w(y) = \int_0^y \bar F^{\,k-i}(z)\,dz$.
Special case GPD (Furrer & N., 2007; Hosking & Wallis, 1987):
$$\operatorname{var}(\hat\mu_{k-1}) = \binom{n}{k}^{-1} \sum_{i=1}^{k} \binom{k}{i}\binom{n-k}{k-i}\,\frac{i\,\sigma^2}{k^2\,(k-\xi)^2\,(2k - i - 2\xi)}$$
Also explicit but complex expressions for the extended GPD.
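The U-statistic averages the minimum over all subsets of size k, scaled by 1/k. The sketch below gives a direct version and an equivalent one based on order statistics (the i-th smallest value is the minimum of exactly C(n-i, k-1) subsets). Since the expected minimum of k independent copies equals k times $E[X \bar F^{k-1}(X)]$, the statistic built on subsets of size k targets the weighted moment of order k-1 under that reading. Function names are illustrative.

```python
from itertools import combinations
from math import comb

def pwm_ustat(sample, k):
    """Average over all k-subsets of min(subset)/k (direct enumeration, small samples only)."""
    total = sum(min(subset) for subset in combinations(sample, k))
    return total / (k * comb(len(sample), k))

def pwm_ustat_fast(sample, k):
    """Same statistic from sorted data: xs[j] is the minimum of comb(n-1-j, k-1) subsets."""
    xs = sorted(sample)
    n = len(xs)
    total = sum(x * comb(n - 1 - j, k - 1) for j, x in enumerate(xs))
    return total / (k * comb(n, k))
```

With k = 1 both functions return the sample mean; the sorted version avoids enumerating subsets and is the practical choice for samples of realistic size.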
Guipavas hourly precipitation (Spring)
[Figure: density of Spring precipitation, values from 0 to 80]
[Figure: Guipavas hourly precipitation (Spring), Expected against Observed]
Summary of our extended GPD
- A distribution in compliance with EVT
- Very few parameters for the full precipitation range
- PWM inference, but MLE and Bayesian inference are possible
- A simple block for constructing mixtures (see Julie)
- Work in progress (more simulations and testing the inference scheme)
Clustering of maxima (joint work with E. Bernard, M. Vrac and O. Mestre)
Météo-France: weekly maxima of hourly precipitation
228 points (19 years x 3 months x 4 weeks)
Our goal: clustering 92 grid points with respect to spatial dependence
Hourly precipitation for 92 stations, 1992-2011 (Olivier Mestre)
Applying the kmeans algorithm to maxima (five clusters)
[Figure: two maps for Fall, kmeans applied to precipitation (PRECIP) and kmeans applied to log(precip)]
The GEV parameters
[Figure: maps for Fall of the GEV scale (color scale from 1.4 to 3.7) and of the GEV shape (color scale from 0.06 to 0.59)]
Limits of kmeans clustering
- Clustering mixes intensity and dependence among maxima
- Difficult interpretation of clusters (the mean of maxima is not a maximum)
- How to find an appropriate metric for maxima?
A central question P [M(x) < u, M(y) < v] = G(u, v) =??
Max-stability in the univariate case with a unit-Fréchet margin
$$F^t(tu) = F(u), \quad \text{if } F(u) = \exp(-1/u)$$
Max-stability in the multivariate case with unit-Fréchet margins
$$G^t(tu, tv) = G(u, v), \quad \text{for } F_X(u) = F_Y(u) = F(u) = \exp(-1/u)$$
Max-stable vector (de Haan, Resnick, and others)
If one assumes that we have unit-Fréchet margins, then
$$-\log G(u, v) = 2\int_0^1 \max\left(\frac{w}{u}, \frac{1-w}{v}\right) dA(w),$$
where $A(\cdot)$ is a distribution function on $[0, 1]$ such that $\int_0^1 w\,dA(w) = 0.5$.
θ = Extremal coefficient
$$P[M(x) < u, M(y) < u] = G(u, u) = F(u)^{\theta}$$
Interpretation:
- Independence ⇒ θ = 2
- M(x) = M(y) ⇒ θ = 1
- Similar to correlation coefficients for Gaussian but...
- No characterization of the full bivariate dependence
Geostatistics: Variograms
$$\frac{1}{2}\,E\left|Z(y) - Z(x)\right|^2$$
- Complex non-parametric structure
- Finite if light tails
- Capture all spatial structure if {Z(x)} is a Gaussian field, but not well adapted for extremes
[Figure: semivariance against distance]
A Different Variogram (Cooley, Poncet and N., 2005)
$$d(x, y) = \frac{1}{2}\,E\left|F_y(M(y)) - F_x(M(x))\right|$$
Why does it work? Let $a = F_y(M(y))$ and $b = F_x(M(x))$, so that $Ea = Eb = 1/2$. Then
$$\frac{1}{2}|a - b| = \max(a, b) - \frac{1}{2}(a + b)$$
and, by max-stability,
$$E\max(a, b) = E\,F\big(\max(M(y), M(x))\big) = \frac{\theta}{1+\theta}.$$
Madogram d(x, y) and extremal coefficient θ
$$\theta = \frac{1 + 2d(x, y)}{1 - 2d(x, y)}$$
The madogram d(x, y) gives the extremal coefficient θ.
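An empirical version of the madogram can be sketched by replacing each margin with its rank-based distribution function, which is why the distance does not depend on the marginal laws. The plug-in below uses ranks divided by n + 1 (one common convention, assumed here) and then applies $\theta = (1 + 2d)/(1 - 2d)$. Function names are illustrative, and ties are broken arbitrarily by the sort.

```python
def empirical_cdf_values(series):
    """Rank of each observation divided by n + 1 (ties broken by position)."""
    n = len(series)
    order = sorted(range(n), key=lambda t: series[t])
    values = [0.0] * n
    for rank, t in enumerate(order, start=1):
        values[t] = rank / (n + 1.0)
    return values

def madogram(maxima_x, maxima_y):
    """Empirical d(x, y) = (1/2) * mean |F_x(M(x)) - F_y(M(y))| over paired maxima."""
    fx = empirical_cdf_values(maxima_x)
    fy = empirical_cdf_values(maxima_y)
    return 0.5 * sum(abs(a - b) for a, b in zip(fx, fy)) / len(fx)

def extremal_coefficient(d):
    """theta = (1 + 2d) / (1 - 2d)."""
    return (1.0 + 2.0 * d) / (1.0 - 2.0 * d)
```

Two identical series give d = 0 and θ = 1, while d = 1/6 corresponds to θ = 2 (independence). On small samples the plug-in estimate of θ can fall outside [1, 2].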
Clustering algorithm
AIM: Partitioning the data into groups reflecting the degree of dependence of maxima; use [...] as discriminatory criteria
HOW? PAM algorithm: Partitioning Around Medoids
See: Kaufman, L. and Rousseeuw, P.J. (1990)
PAM algorithm
1. Choose K initial medoids
2. Assign every point to its closest medoid (use pairwise distance d_ij)
3. Recompute medoids
4. And so on...
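A PAM-style iteration (choose K initial medoids, assign every point to its closest medoid, recompute the medoids, repeat until nothing changes) can be sketched on a precomputed matrix of pairwise distances d_ij. This is a simplified alternating scheme written for illustration, not the full build-and-swap procedure of Kaufman and Rousseeuw (1990); the function name `k_medoids` and the default choice of the first K points as initial medoids are assumptions.

```python
def k_medoids(dist, k, initial=None, max_iter=100):
    """Alternate assignment and medoid update on a symmetric distance matrix `dist`."""
    n = len(dist)
    medoids = list(initial) if initial is not None else list(range(k))

    def assign(current):
        # Each point joins the cluster of its closest medoid.
        return [min(range(k), key=lambda c: dist[i][current[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                new_medoids.append(medoids[c])  # keep the old medoid if a cluster empties
                continue
            # The new medoid is the member with the smallest total distance to the others.
            best = min(members, key=lambda m: sum(dist[m][j] for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)

# Toy example: six points on a line, distances are absolute differences.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
toy_dist = [[abs(a - b) for b in points] for a in points]
toy_medoids, toy_labels = k_medoids(toy_dist, 2)
```

Because the medoids are always observed points, a cluster centre computed from maxima remains an observed series of maxima, and any distance (for instance a madogram-based one) can be supplied through `dist`.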
[Figure (Fall): PAM with K = 2 and kmeans with K = 2]
[Figure (Fall): PAM with K = 5 and kmeans with K = 5]
[Figure (Fall): PAM with K = 10 and kmeans with K = 10]
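Clustering validation: Silhouette coefficient
For each point i, with quantities $a_i$ and $b_i$:
$$s_i = \frac{b_i - a_i}{\max(a_i, b_i)}$$
- $a_i \ll b_i$, $s_i \approx 1$: well classified
- $a_i \approx b_i$, $s_i \approx 0$: neutral
- $a_i \gg b_i$, $s_i \approx -1$: badly classified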
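The silhouette coefficient can be computed from the same kind of distance matrix used for clustering. The sketch below assumes the usual definitions, which are not spelled out on the slide: $a_i$ is the mean distance from point i to the other members of its own cluster, and $b_i$ is the smallest mean distance from point i to the members of any other cluster. Points in singleton clusters are given $s_i = 0$ by convention. Function names are illustrative.

```python
def silhouette_values(dist, labels):
    """Silhouette s_i = (b_i - a_i) / max(a_i, b_i) for each point."""
    n = len(dist)
    clusters = sorted(set(labels))
    values = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            values.append(0.0)  # singleton cluster: silhouette set to 0 by convention
            continue
        a_i = sum(dist[i][j] for j in own) / len(own)
        b_i = min(
            sum(dist[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in clusters
            if c != labels[i]
        )
        denom = max(a_i, b_i)
        values.append((b_i - a_i) / denom if denom > 0 else 0.0)
    return values

def average_silhouette_width(dist, labels):
    """Mean of the silhouette values, used to compare numbers of clusters."""
    values = silhouette_values(dist, labels)
    return sum(values) / len(values)

# Toy example: two well-separated groups on a line.
sil_points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
sil_dist = [[abs(a - b) for b in sil_points] for a in sil_points]
sil_width = average_silhouette_width(sil_dist, [0, 0, 0, 1, 1, 1])
```

Running this for several values of K and keeping the K with the largest average width is one way to choose the number of clusters.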
Choosing the number of clusters
[Figure: average silhouette width (0.00 to 0.12) against the number of PAM clusters (0 to 30)]
Summary on clustering of maxima
- Classical clustering algorithms are not in compliance with EVT
- The madogram provides a convenient distance that is marginal free
- PAM applied with the madogram preserves maxima and gives interpretable results
- Need to go further in terms of applications (e.g., regional frequency analysis)
- Dimension reduction for rainfall generators
- Caution with gridded data (e.g., ECA&D)
Collaborators
Extended GPD: Pierre Ribereau (IFSA, Lyon) and Alexis Hannart (CNRS, Argentina)
Clustering of maxima: Elsa Bernard (ENS, Paris), Mathieu Vrac (LSCE) and Olivier Mestre (ENM, Toulouse)