1 EXTREMAL MODELS AND ENVIRONMENTAL APPLICATIONS Rick Katz Institute for Study of Society and Environment National Center for Atmospheric Research Boulder, CO USA email: rwk@ucar.edu Home page: www.isse.ucar.edu/hp_rick/ Lecture: www.isse.ucar.edu/hp_rick/pdf/envdavos.pdf
2 Quote In order to apply any theory we have to suppose that the data are homogeneous, i. e. that no systematical change of climate and no important change in the basin have occurred within the observation period and that no such changes will take place in the period for which extrapolations are made. Emil Gumbel (Ann. Math. Stat., 1941)
3 Outline (1) Historical Perspective (2) Characteristics of Environmental Extremes (3) Penultimate Approximations in Extreme Value Theory (4) Interpretation of Tail Behavior (5) Complex Extreme Events (6) Risk Communication under Climate Change (7) Resources
4 (1) Historical Perspective Use of Extremal Model in Environmental Applications -- Stationarity Prevailing paradigm Many early applications dealt with environmental variables -- Engineering design Estimating return level of floods (Dams, flood plain definition) Estimating return level of high winds (Construction)
5 Non-Stationarity -- Lack of use of extremal models (until quite recently) -- Trends in frequency & intensity of extremes Global climate change Other sources (e. g., local urban effects such as heat island ) -- Other covariates Annual & diurnal cycles Geophysical variables (e. g., El Niño, North Atlantic Oscillation)
6 (2) Characteristics of Environmental Extremes Tail Behavior -- Shape parameter ξ of GEV or GP distribution -- Bounded upper tail (i. e., ξ < 0) Daily minimum & maximum temperature Hourly or daily wind speed Pollutant concentration Sea level Wave height
7 -- Heavy upper tail (i. e., ξ > 0) Precipitation: Typical estimates for daily totals 0.1 < ξ < 0.15 Streamflow: Estimates of ξ tend to be higher than for precipitation (Effect of integrating precipitation across water basin?) Economic damage from extreme events: Estimates of ξ tend to be higher yet Economic damage from hurricanes (estimate of ξ 0.5)
8
9 Scaling / Aggregation -- Apparent conflict with extreme value theory Precipitation extremes: Shape parameter tends to decrease with aggregation over time (e. g., hourly vs. daily total amounts) Example: Regional analysis of precipitation extremes in Texas Extreme value theory: Shape parameter should be invariant with respect to aggregation
10
11 Cycles -- Diurnal & Annual Phase tends to be roughly consistent with that for overall distribution Amplitude / shape possibly differs: Issue if standardize entire data set to remove annual cycles in mean and standard deviation Physical appeal: Extremal model which reflects process through which extremes occur
12 -- Fort Collins, CO daily precipitation (1900 1999) Point process approach: Threshold u = 0.395 in Time scale parameter h = 1/365.25 Annual cycles in location & scale parameters of GEV distribution Location parameter: μ(t) = μ 0 + μ 1 sin(2πt / T) + μ 2 cos(2πt / T) Scale parameter: log σ(t) = σ 0 + σ 1 sin(2πt / T) + σ 2 cos(2πt / T)
13
14 Trends -- Example (Urban heat island) Summer minimum of daily temperature at Phoenix, AZ (i. e., block minima) min{x 1, X 2,..., X T } = max{ X 1, X 2,..., X T } Assume summer minimum temperature in year t has GEV distribution with location & scale parameters: μ(t) = μ 0 + μ 1 t, log σ(t) = σ 0 + σ 1 t, ξ(t) = ξ, t = 1, 2,... Likelihood ratio test for μ 1 = 0 (P-value < 10 5 ) Likelihood ratio test for σ 1 = 0 (P-value 0.366)
15
16 Physically-based covariates -- Example [Arctic Oscillation (AO)] Winter maximum of daily maximum temperature at Port Jervis, NY (i. e., block maxima) Z denotes winter index of AO Given Z = z, assume conditional distribution of winter max. temp. is GEV distribution with parameters: μ(z) = μ 0 + μ 1 z, log σ(z) = σ 0 + σ 1 z, ξ(z) = ξ Likelihood ratio test for μ 1 = 0 (P-value < 0.001) Likelihood ratio test for σ 1 = 0 (P-value 0.635)
17
18 Dependence -- Temporal Asymptotic dependence at high levels: Evidence for Temperature (But not necessarily for precipitation) -- Spatial Similar to temporal (Temperature & Precipitation) -- Multivariate Insufficient compelling examples to draw general conclusions
19 -- Example of temporal dependence Fort Collins, CO daily precipitation in July Let X t denote precipitation amount on tth day Given precipitation amount on previous day X t 1 = x, assume conditional distribution of excess X t u is GP with parameters: log σ(x) = σ 0 + σ 1 x, ξ(x) = ξ Note: Consider all non-zero values of X t 1 (i.e., x > 0), not just high values
20
21
22 (3) Penultimate Approximations in Extreme Value Theory Ultimate Extreme Value Theory -- GEV distribution as limiting distribution of maxima X 1, X 2,..., X T with common distribution function F M T = Max{ X 1, X 2,..., X T } Penultimate Extreme Value Theory -- Suppose F in domain of attraction of Gumbel type (i. e., ξ = 0) -- Still preferable in nearly all cases to use GEV as approximate distribution for maxima (i. e., act as if ξ 0)
23 -- Expression for shape parameter ξ T Hazard rate (or failure rate ): H(x) = F'(x) / [1 F(x)] Then one choice of shape parameter is: ξ T = (1/H)' (x) x=u(t) where the characteristic largest value u(t) = F 1 (1 1/T) Here ξ T 0 as block size T
24 Example: Exponential Distribution -- Exact exponential upper tail (unit scale parameter) 1 F(x) = exp( x ), x > 0 -- Shape parameter for penultimate approximation is Hazard rate: H(x) = 1, x > 0, Shape parameter: ξ T = 0, T = 1, 2,... So no benefit to penultimate approximation
25 Example: Normal Distribution -- Fisher & Tippett (1928) proposed Weibull type of GEV as penultimate approximation F Normal distribution (with zero mean & unit variance) Hazard rate: H(x) x (for large x) Characteristic largest value: u(t) (2 log T ) 1/2 (for large T) Penultimate approximation is Weibull type with ξ T 1 / (2 log T ) For instance: ξ 30 0.15, ξ 365 0.085
26 Example: Stretched Exponential Distribution -- Traditional form of Weibull distribution (Bounded below) Note: Weibull extremal type is reflected version 1 F(x) = exp( x c ), x > 0, c > 0 where c is shape parameter (unit scale parameter) -- Shape parameter for penultimate approximation is: ξ T (1 c) / (c log T ) (i) Superexponential (c > 1) ξ T 0 as T (ii) Subexponential (c < 1) ξ T 0 as T
27 (4) Interpretation of Tail Behavior Apparent Upper Bound -- Wind speed Weibull type of GEV advocated as penultimate approximation (in engineering applications) -- Maximum possible hurricane intensity Estimate with possible trend due to global warming -- Thermostat hypothesis Hypothesized upper bound on sea surface temperature in tropical oceans (Implications for impact of global warming on coral reefs)
28 GPD Fit estimated upper bound threshold sst 28 29 30 31 0 1000 2000 3000 4000 5000 6000 Time
29 Apparent Heavy Tail -- Precipitation (i) Penultimate approximation Fréchet type of GEV can be obtained with F stretched exponential distribution (Shape parameter c < 1) (ii) Physical argument Wilson & Toumi (2005) gave heuristic argument for universal shape parameter of c = 2/3 for stretched exponential distribution for extreme high precipitation
30 -- Simulation experiment Generate observations with stretched exponential distribution (with shape parameter c = 2/3) Use block size of T = 100 to simulate maxima M 100 (Corresponds to daily precipitation occurrence rate about 27%, ignoring variation in number of wet days) Penultimate approximation: Should produce GEV shape parameter of ξ 100 0.11 Fitted GEV distribution (40,000 replications): Obtained estimate of ξ 100 0.10
31 -- Aggregation Issue Apparent decrease in shape parameter of GEV or GP distribution Stretched exponential should be capable of resolving (at least qualitatively) Simulation experiment: Sum of two independent stretched exponentials (each with c = 2/3) Use block size of T = 100 to simulate maxima M 100 Fitted GEV dist. (40,000 replications): Estimate ξ 100 0.065 But note that precipitation really a random sum
32 -- Example of fitting stretched exponential Threshold selection difficult: Should not be too low Should not be too high -- Alternative approach (Avoid threshold selection) Mixture of exponential (i. e., c = 1) & stretched exponential (c = 2/3) distributions Fit to all nonzero precipitation amounts, not just high values Extension of mixture of two exponential distributions (commonly used in hydrology to fit precipitation totals)
33
34 -- Chance mechanism (Mixture of exponential distributions) Suppose Y has exponential distribution with scale parameter σ*: Pr{Y > y σ* } = exp[ (y / σ*)] Further assume that the rate parameter ν = 1 / σ* varies according to a gamma distribution with shape parameter α (unit scale): f ν (ν; α) = [Γ(α)] 1 ν α 1 exp( ν ), α > 0 Then unconditional distribution of Y is heavy-tailed: Pr{Y > y } = (1 + y ) α (i.e., exact Pareto distribution with shape parameter ξ = 1/α) So mixture of light tails induces heavy tail
35 Example: Let rate parameter of exponential distribution have gamma distribution with shape parameter α = 2 Then mixture distribution is Pareto with shape parameter ξ = 0.5 Simulation algorithm: (i) Generate pseudo random number from gamma distribution (shape parameter α) (ii) Generate pseudo random number from exponential distribution [with rate parameter ν as generated in step (i)] Any realistic environmental / geophysical examples?
36 (5) Complex Extreme Events Environmental Extremes -- Many events have complex structure (e. g., spells) Definition of Heat wave / Hot spell -- Recall runs de-clustering algorithm -- More complex definition (e. g., multiple thresholds) Lack of Use of Extremal Models -- e. g., paper by Meehl & Tebaldi (Science, 2004)
37
38 Cluster Statistics -- Exploit information used in declustering (Ordinarily, just retain cluster maxima) Cluster duration Cluster maxima Other measures of cluster intensity?
39 Covariate approach -- Advantage Requires only univariate extreme value theory (not multivariate) -- Let Y 1, Y 2,..., Y k denote excesses over threshold within given cluster / spell Model conditional distribution of Y 2 given Y 1 as GP distribution with scale parameter depending on Y 1 : log σ(y) = σ 0 + σ 1 y, given Y 1 = y Similar model for conditional distribution of Y 3 given Y 2 (etc.)
40 (6) Risk Communication Under Climate Change Case of Stationarity (i. e., unchanging climate) -- Interpretation of return level x(p) with return period T Pr{X > x(p)} = p, where p = 1/T (i) Length of time T for which expected number of events = 1 1 = Expected no. events = T p, so T = 1/p (ii) Expected waiting time (assume temporal independence) Waiting time W has geometric distribution: Pr{W = k } = (1 p) k 1 p, k = 1, 2,..., E(W) = 1/p = T
41
42 Case of Non-Stationarity (i. e., climate change) -- Retain one of these two interpretations Let p t (u) = Pr{X t > u}, t = 1, 2,... (i) Expected number of events -- Let N T (u) denote number of events during t = 1, 2,..., T E[N T (u)] = p 1 (u) + p 2 (u) + + p T (u) Given specified value of return period T: Set E[N T (u)] = 1 & solve for return level u Then u satisfies p 1 (u) + p 2 (u) + + p T (u) = 1 In other words, average of T probabilities is 1/T
43 (ii) Expected waiting time (assume temporal independence) -- Let W(u) denote first time t that X t > u Pr{W(u) = k } = { t=1,k 1 [1 p t (u)] } p k (u), k = 1, 2,... Given specified value of return period T: Set E[W(u)] = T Solve for return level u Not possible to obtain analytical expression for E[W(u)] (or for threshold u )
44 Example (Mercer Creek, WA) -- Period of rapid urbanization (starting about 1970) Effects on runoff within watershed Expected no. Expected Return period Return level of events waiting time 20 yr 657.8 cfs 1 31.1 yr 20 yr 429.5 cfs 5.6 20 yr 10 yr 573.7 cfs 1 29.2 yr 10 yr 337.3 cfs 4.8 10 yr
45
46 Equivalence of Definitions -- Stationarity Two definitions of return level / return period are equivalent (assuming temporal independence) -- Non-Stationarity Natural extensions of two definitions are far from equivalent -- Another possibility Effective return period or effective return level (i. e., varies from one time period to next)
47 (7) Resources Statistics of Weather and Climate Extremes -- Application of statistics of extremes to weather & climate www.isse.ucar.edu/extremevalues/extreme.html Extremes Toolkit (extremes) -- Open source software in R with GUIs Developed by Eric Gilleland S functions from Stuart Coles www.isse.ucar.edu/extremevalues/evtk.html
48 Reference -- Reiss & Thomas book (Third Edition, 2007) Statistical Analysis of Extreme Values with Applications to Insurance, Finance, Hydrology and Other Fields www.xtremes.de/xtremes/book/index_book.htm Chapter 15 ( Environmental Sciences ): Co-authored by me Contains some of material in this talk