A STATISTICAL APPROACH TO OPERATIONAL ATTRIBUTION
Richard L. Smith
Department of Statistics and Operations Research
University of North Carolina
Chapel Hill, NC 27599-3260, USA
rls@email.unc.edu
IDAG Meeting, Boulder, Colorado
January 28, 2010
(From a presentation by Myles Allen)
POINT PROCESS APPROACH TO EXTREME VALUES
Homogeneous case: an exceedance y > u at time t has probability
$$\frac{1}{\psi}\left(1+\xi\,\frac{y-\mu}{\psi}\right)_+^{-1/\xi-1}\exp\left\{-\left(1+\xi\,\frac{u-\mu}{\psi}\right)_+^{-1/\xi}\right\}\,dy\,dt$$
Illustration of point process model.
Inhomogeneous case: time-dependent threshold $u_t$ and parameters $\mu_t, \psi_t, \xi_t$.
An exceedance $y > u_t$ at time t has probability
$$\frac{1}{\psi_t}\left(1+\xi_t\,\frac{y-\mu_t}{\psi_t}\right)_+^{-1/\xi_t-1}\exp\left\{-\left(1+\xi_t\,\frac{u_t-\mu_t}{\psi_t}\right)_+^{-1/\xi_t}\right\}\,dy\,dt$$
Estimation by maximum likelihood.
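As a rough illustration of what maximum likelihood estimation involves here, the sketch below evaluates a negative log-likelihood for the point process model with a time-varying location and threshold. It uses the standard Poisson-process form (integrated intensity above the threshold, minus the log intensity at each exceedance), which may differ in detail from the fitting code behind these slides. The function name `pp_nll`, the argument `dt`, and the choice to hold ψ and ξ constant are assumptions made for the example.

```python
import math

def pp_nll(y, u, mu, psi, xi, dt=1.0):
    """Negative log-likelihood of a point process model (illustrative sketch).

    y, u, mu : equal-length sequences (observation, threshold, location per step)
    psi      : scale (> 0), held constant here by assumption
    xi       : shape (non-zero), held constant here by assumption
    dt       : length of one time step, in the time unit the parameters refer to
    """
    nll = 0.0
    for y_t, u_t, mu_t in zip(y, u, mu):
        base_u = 1.0 + xi * (u_t - mu_t) / psi
        if base_u <= 0.0:
            # Threshold outside the support: rejected (returns inf) for simplicity.
            return math.inf
        # Integrated intensity above the threshold over this time step.
        nll += dt * base_u ** (-1.0 / xi)
        if y_t > u_t:
            base_y = 1.0 + xi * (y_t - mu_t) / psi
            if base_y <= 0.0:
                return math.inf  # exceedance impossible under these parameters
            # Minus the log intensity at the observed exceedance.
            nll += math.log(psi) + (1.0 / xi + 1.0) * math.log(base_y)
    return nll

# Toy usage with made-up numbers: two time steps, one exceedance of u = 1.
example = pp_nll(y=[0.5, 2.0], u=[1.0, 1.0], mu=[0.0, 0.0], psi=1.0, xi=0.5)
```

In practice this function would be passed to a numerical optimizer, with `mu` built from trend coefficients, to obtain the maximum likelihood estimates.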
HEATWAVE ANALYSIS: FIRST APPROACH
Data:
- 5 model runs from CCSM, 1871-2100, including anthropogenic forcing
- 2 model runs from UKMO, 1861-2000, including anthropogenic forcing
- 1 model run from UKMO, 2001-2100, including anthropogenic forcing
- 2 control runs from CCSM, 230 + 500 years
- 2 control runs from UKMO, 341 + 81 years
All model data have been calculated for the grid box 30-50°N, 10°W-40°E, as annual average temperatures over June-August.
Expressed as anomalies from 1961-1990, similar to Stott, Stone and Allen (2004).
[Figure: CLIMATE MODEL RUNS: ANOMALIES FROM 1961-1990. Temperature against year (1900-2100) for the CCSM anthropogenic, UKMO anthropogenic, CCSM control and UKMO control runs, with the observed value marked.]
Method:
- Fit POT models with various trend terms to the anthropogenic model runs, 1861-2010
- Also fit a trend-free model to the control runs (µ = 0.176, log ψ = -1.068, ξ = -0.068)
POT MODELS 1861-2010, u = 1

Model                       p    NLLH     NLLH+p
Gumbel                      2    349.6    351.6
GEV                         3    348.6    351.6
GEV, lin µ                  4    315.5    319.5
GEV, quad µ                 5    288.1    293.1
GEV, cubic µ                6    287.7    293.7
GEV, quart µ                7    285.1    292.1
GEV, quad µ, lin log ψ      6    287.9    293.9
GEV, quad µ, quad log ψ     7    287.0    294.9

Fitted model: $\mu_t = \beta_0 + \beta_1 t + \beta_2 t^2$, with ψ, ξ constant

            β0       β1       β2          log ψ    ξ
Estimate    0.187    0.030    0.000215    0.047    0.212
S.E.        0.335    0.0054   0.00003     0.212    0.067
[Figure: CLIMATE MODEL RUNS: ANOMALIES FROM 1961-1990. Temperature against year (1900-2000) for the CCSM anthropogenic, UKMO anthropogenic, CCSM control and UKMO control runs, with the observed value and the fitted 90% and 99% quantiles.]
We now estimate the probabilities of crossing various thresholds in 2003, expressing the answer as N = 1/(exceedance probability):
- Threshold 2.3: N = 3024 (control), N = 29.1 (anthropogenic)
- Threshold 2.6: N = 14759 (control), N = 83.2 (anthropogenic)
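The control-run figures can be approximately reproduced from the trend-free fit. The sketch below assumes the control estimates are µ = 0.176, log ψ = -1.068 and ξ = -0.068 (the minus signs are an assumption, chosen because they give values close to the quoted N) and that the exceedance probability for one year is 1 - exp{-(1 + ξ(y - µ)/ψ)_+^(-1/ξ)}. The function names are illustrative only.

```python
import math

def exceedance_probability(y, mu, psi, xi):
    """Probability that the annual value exceeds y under the fitted model."""
    base = 1.0 + xi * (y - mu) / psi
    if base <= 0.0:
        # (.)_+ convention: for xi < 0, y lies above the upper endpoint.
        return 0.0 if xi < 0 else 1.0
    return 1.0 - math.exp(-base ** (-1.0 / xi))

def return_period(y, mu, psi, xi):
    """N = 1 / (exceedance probability); infinite if the probability is zero."""
    p = exceedance_probability(y, mu, psi, xi)
    return math.inf if p == 0.0 else 1.0 / p

# Control-run estimates; the negative signs are an assumption (see lead-in).
MU, LOG_PSI, XI = 0.176, -1.068, -0.068
PSI = math.exp(LOG_PSI)

n_23 = return_period(2.3, MU, PSI, XI)  # compare with N = 3024 on the slide
n_26 = return_period(2.6, MU, PSI, XI)  # compare with N = 14759 on the slide
print(round(n_23), round(n_26))
```

The results differ slightly from the slide values because the parameter estimates are only quoted to three decimal places.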
Conclusions from This Analysis
- The probabilities are less extreme than those reported by Stott et al. (2004).
- However, note that we're comparing model output with an observed extreme event in 2003. The analysis assumes the models are well calibrated!
- In fact Stott et al. combined their analysis with a detection and attribution analysis of the historical averages over the observed region, which amounts to making a location-scale transformation to get better agreement between model output and data.
- Another thing I don't really like about this approach is that it relies on parametric models for the trend. I'd prefer an approach that used the GCM data directly to characterize the shape of the trend, either with or without anthropogenic forcing.
[Figure, two panels: HadCRUT3v monthly means over 40-50N, 0-15E, 1900-2009; pcmdi.ipcc4.ukmo_hadcm3.20c3m.run1.monthly.tas_a1. Temperature anomaly against year, 1900-2000.]
[Figure, two panels: HadCRUT3v monthly means over 30-50N, 0-40E, 1900-2009; pcmdi.ipcc4.ukmo_hadcm3.20c3m.run1.monthly.tas_a1. Temperature anomaly against year, 1900-2000.]
HEATWAVE ANALYSIS: SECOND APPROACH
We computed monthly temperature anomalies in the region of interest from the University of East Anglia database.
We also downloaded runs from 42 climate models (31 anthropogenic, 11 control) from PCMDI.
Map of study region (0-15°E, 40-50°N)
[Figure, four panels: HadCRUT3v JJA means over 40-50N, 0-15E, 1850-2008; HadCRUT3v monthly means over 40-50N, 0-15E, 1900-2008; 31 twentieth century model runs; 11 control model runs. Temperature anomaly or model anomaly against year.]
[Figure, two panels: Anomalies Adjusted for Anthropogenic GCM Mean; Anomalies Adjusted for Control GCM Mean. Adjusted anomaly against year, 1900-2000.]
- Let $\mu_t$ be the trend value in month t (estimated from models, once from anthropogenic GCMs and once from control GCMs)
- Look for extremes in values of $Y_t - \mu_t$ (plotted)
- Fix a high threshold, fit the point process model, compute diagnostics
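The adjustment step can be sketched as follows. The slides do not say how the trend is estimated from the model runs, so this example assumes the simplest choice, a plain average across runs for each month; the function names and the toy numbers are hypothetical.

```python
def adjusted_anomalies(obs, model_runs):
    """Subtract a model-based trend from the observations.

    obs        : list of observed anomalies Y_t
    model_runs : list of runs, each a list of the same length as obs
    The trend is taken to be the across-run mean (an assumption).
    """
    n_runs = len(model_runs)
    trend = [sum(run[t] for run in model_runs) / n_runs for t in range(len(obs))]
    return [y_t - mu_t for y_t, mu_t in zip(obs, trend)]

def exceedances(values, threshold):
    """Return (time index, value) pairs for values above a fixed threshold."""
    return [(t, v) for t, v in enumerate(values) if v > threshold]

# Toy data only, not taken from HadCRUT3v or any GCM.
obs = [1.0, 2.0, 5.0]
runs = [[0.0, 1.0, 2.0], [2.0, 1.0, 2.0]]
adjusted = adjusted_anomalies(obs, runs)   # trend is [1.0, 1.0, 2.0]
extremes = exceedances(adjusted, 2.0)
```

The same two functions would be applied twice, once with the anthropogenic runs and once with the control runs, before fitting the point process model to the exceedances.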
However, these model fits all have ξ ≈ -0.4, which implies a finite upper endpoint for the fitted distribution, and when we try to calculate the probability associated with the June or August 2003 events, the answer is 0!
Possible Bayesian resolution of this issue: if we treat ξ as a random variable rather than a fixed constant, we will not get zero probabilities for predicted values.
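A toy calculation shows why treating ξ as random helps. With a fixed negative ξ the fitted distribution has upper endpoint µ - ψ/ξ, so any event beyond it gets probability exactly 0. Averaging the tail probability over uncertain values of ξ gives a positive answer as soon as some of those values put the endpoint above the event. All numbers below are hypothetical, and the normal draws are only a stand-in for a real posterior sample of ξ.

```python
import math
import random

def tail_prob(y, mu, psi, xi):
    """Pr{Y > y} = (1 + xi (y - mu) / psi)_+^(-1/xi) for non-zero xi."""
    base = 1.0 + xi * (y - mu) / psi
    if base <= 0.0:
        # For xi < 0 this means y is above the upper endpoint mu - psi / xi.
        return 0.0 if xi < 0 else 1.0
    return base ** (-1.0 / xi)

# Hypothetical values, chosen only to illustrate the effect.
MU, PSI, Y_EVENT = 0.0, 1.0, 3.0
XI_HAT, XI_SD = -0.4, 0.1

# Plug-in answer: the endpoint is 0 - 1 / (-0.4) = 2.5 < 3, so this is 0.
plug_in = tail_prob(Y_EVENT, MU, PSI, XI_HAT)

# Stand-in for posterior draws of xi (a real analysis would use MCMC output).
rng = random.Random(0)
draws = [rng.gauss(XI_HAT, XI_SD) for _ in range(10000)]
draws = [xi for xi in draws if abs(xi) > 1e-12]  # the formula needs xi != 0

# Predictive probability: average the tail probability over the draws.
predictive = sum(tail_prob(Y_EVENT, MU, PSI, xi) for xi in draws) / len(draws)
```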
However, there are some other issues with this analysis arising from the use of the UEA database.
- Data are anomalies from 1961-1990.
- Also, there are many more stations during 1961-1990 than during other time periods (especially earlier, but the number of stations has dropped off since 1990 as well).
- This could be an issue, since averages over a large number of stations presumably have fewer extreme events than averages over a smaller group of stations.
- Taken together, these points probably explain the lack of extreme events during the 1961-1990 period! (But genuine climatic factors may be present as well!)
- Possible remedy: compute average temperatures directly from GHCN data.
MULTI-MODEL ENSEMBLES
- Tebaldi et al. (2005)
- Smith et al. (2009)
- Buser (2009)
A POSSIBLE HIERARCHICAL MODEL FOR A MULTI-MODEL ENSEMBLE APPROACH TO THE OPERATIONAL ATTRIBUTION PROBLEM
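GCM Data. Let $X_{k,j,t}$ be the result of run j of model k in month t, $1 \le t \le T$. Assume
$$X_{k,j,t} = A_k \mu_t + B_k + \lambda_k^{-1/2}\,\epsilon_{k,j,t}, \tag{1}$$
$$\epsilon_{k,j,t} = \phi_1\,\epsilon_{k,j,t-1} + \sqrt{1-\phi_1^2}\;\eta_{k,j,t}, \tag{2}$$
$$\eta_{k,j,t} \sim N[0, 1] \quad \text{(independent)}. \tag{3}$$
The likelihood of $\{X_{k,j,t},\ t = 1, \ldots, T\}$ given $(A_k, B_k, \lambda_k, \phi_1, \mu_1, \ldots, \mu_T)$ is computed as
$$\epsilon_{k,j,t} = \lambda_k^{1/2}\,(X_{k,j,t} - A_k \mu_t - B_k), \tag{4}$$
$$L_{X_{k,j}} = \phi(\epsilon_{k,j,1}) \prod_{t=2}^{T} \frac{1}{\sqrt{1-\phi_1^2}}\;\phi\!\left(\frac{\epsilon_{k,j,t} - \phi_1\,\epsilon_{k,j,t-1}}{\sqrt{1-\phi_1^2}}\right), \tag{5}$$
where $\phi(\cdot)$ without a subscript denotes the standard normal density.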
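To make equations (1) to (5) concrete, the sketch below simulates one run from the assumed model and evaluates the log of (5) for it. It takes the first residual to be N(0, 1), matching the leading factor in (5). The function names, the linear toy trend and the parameter values are all assumptions made for the example.

```python
import math
import random

def std_normal_logpdf(z):
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * z * z

def simulate_run(mu, A, B, lam, phi1, rng):
    """Simulate X_t = A mu_t + B + lam^(-1/2) eps_t with AR(1) residuals."""
    s = math.sqrt(1.0 - phi1 ** 2)
    eps, x = [], []
    for t, mu_t in enumerate(mu):
        eta = rng.gauss(0.0, 1.0)
        e = eta if t == 0 else phi1 * eps[-1] + s * eta  # equation (2)
        eps.append(e)
        x.append(A * mu_t + B + e / math.sqrt(lam))      # equation (1)
    return x, eps

def standardized_residuals(x, mu, A, B, lam):
    """Equation (4): recover eps_t from the data and the parameters."""
    return [math.sqrt(lam) * (x_t - A * mu_t - B) for x_t, mu_t in zip(x, mu)]

def log_likelihood(x, mu, A, B, lam, phi1):
    """Log of equation (5) for a single run."""
    eps = standardized_residuals(x, mu, A, B, lam)
    s = math.sqrt(1.0 - phi1 ** 2)
    ll = std_normal_logpdf(eps[0])
    for t in range(1, len(eps)):
        ll += std_normal_logpdf((eps[t] - phi1 * eps[t - 1]) / s) - math.log(s)
    return ll

# Hypothetical settings: a linear trend over 600 months and arbitrary parameters.
rng = random.Random(1)
mu = [0.001 * t for t in range(600)]
A, B, lam, phi1 = 1.2, 0.3, 4.0, 0.6
x, eps_true = simulate_run(mu, A, B, lam, phi1, rng)
eps_back = standardized_residuals(x, mu, A, B, lam)
ll_true = log_likelihood(x, mu, A, B, lam, phi1)
ll_wrong = log_likelihood(x, mu, A, B, lam, -phi1)
```

Applying (4) to the simulated data returns the simulated residuals, which is a quick check that (1) and (4) are consistent with each other.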
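Observational Data. If $Y_t$ is the observed temperature anomaly in month t, we assume
$$\Pr\{Y_t > y\} \approx \left(1+\xi\,\frac{y-\mu_t^*}{\psi}\right)_+^{-1/\xi}, \quad y \to \infty,$$
$$\mu_t^* = \mu_t + \beta_1 \cos\frac{2\pi t}{12} + \beta_2 \sin\frac{2\pi t}{12}.$$
In practice, approximate by GEV and threshold, so the likelihood component associated with $Y_t$ (conditional on $\mu_t, \beta_1, \beta_2, \psi, \xi$) is
$$L_{Y_t} = \begin{cases}
\exp\left\{-\left(1+\xi\,\dfrac{u_t-\mu_t^*}{\psi}\right)_+^{-1/\xi}\right\} & \text{if } y_t \le u_t, \\[2ex]
\dfrac{1}{\psi}\left(1+\xi\,\dfrac{y_t-\mu_t^*}{\psi}\right)_+^{-1/\xi-1}\exp\left\{-\left(1+\xi\,\dfrac{y_t-\mu_t^*}{\psi}\right)_+^{-1/\xi}\right\} & \text{if } y_t > u_t.
\end{cases}$$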
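The censored likelihood can be written as a short function: observations at or below the threshold contribute the GEV distribution function at the threshold, and observations above it contribute the GEV density. The sketch below is one possible implementation under that reading; the function names are illustrative, and the location passed in is the trend with the seasonal harmonic terms already added.

```python
import math

def seasonal_location(mu_t, t, beta1, beta2):
    """Trend value plus one annual harmonic (period of 12 months)."""
    angle = 2.0 * math.pi * t / 12.0
    return mu_t + beta1 * math.cos(angle) + beta2 * math.sin(angle)

def log_lik_obs(y_t, u_t, loc, psi, xi):
    """Log of the censored-GEV likelihood component for one month (xi != 0)."""
    def base(v):
        return 1.0 + xi * (v - loc) / psi

    if y_t <= u_t:
        b = base(u_t)
        if b <= 0.0:
            # (.)_+ convention: the GEV distribution function at u_t is 1
            # for xi < 0 (log 1 = 0) and 0 for xi > 0 (log 0 = -inf).
            return 0.0 if xi < 0 else -math.inf
        return -b ** (-1.0 / xi)

    b = base(y_t)
    if b <= 0.0:
        return -math.inf  # y_t lies outside the support: density is zero
    return -math.log(psi) - (1.0 / xi + 1.0) * math.log(b) - b ** (-1.0 / xi)

# Toy values only: location 0, scale 1, shape 0.5, threshold 1.
below = log_lik_obs(0.5, 1.0, 0.0, 1.0, 0.5)
above = log_lik_obs(2.0, 1.0, 0.0, 1.0, 0.5)
loc_march = seasonal_location(1.0, 3, 0.5, 0.25)
```

Summing `log_lik_obs` over all months gives the observational part of the log-likelihood.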
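Prior distributions.
$A_k \sim G(a_A, b_A)$ (independent for $k = 1, \ldots, m$),
$B_k \sim N(\nu_B, \tau_B^{-1})$ (independent for $k = 1, \ldots, m$),
$\lambda_k \sim G(a_\lambda, b_\lambda)$ (independent for $k = 1, \ldots, m$),
$\nu_B \sim U(-\infty, \infty)$,
$a_A, b_A, a_\lambda, b_\lambda, \tau_B \sim G(a^*, b^*)$ (independent),
$\phi_1 \sim U(-1, 1)$,
$\beta_1 \sim U(-\infty, \infty)$,
$\beta_2 \sim U(-\infty, \infty)$,
$\psi^{-1} \sim G(a^*, b^*)$,
$\xi \sim U(-1, 1)$.
Here $a^*, b^*$ are small values such as $a^* = b^* = 0.01$.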
For $\mu = (\mu_1, \mu_2, \ldots, \mu_T)$, we propose a pairwise difference prior (cf. Besag et al., 1995). This can take many forms depending on the assumed smoothness of the process, but one form that seems suitable for a fairly smooth process is
$$\pi(\mu \mid \kappa) \propto \kappa^{T/2} \exp\left\{-\frac{\kappa}{2} \sum_{t=2}^{T-1} \left(\mu_{t-1} - 2\mu_t + \mu_{t+1}\right)^2\right\}, \qquad \kappa \sim G(a^*, b^*).$$
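The prior penalizes squared second differences, so a straight-line trend incurs no penalty and a sharp kink incurs a large one. A minimal sketch of the log prior density, up to an additive constant, is given below; the function name is hypothetical.

```python
import math

def log_pairwise_difference_prior(mu, kappa):
    """Log prior for mu given kappa, up to an additive constant."""
    T = len(mu)
    # Sum of squared second differences over the interior time points.
    penalty = sum(
        (mu[t - 1] - 2.0 * mu[t] + mu[t + 1]) ** 2 for t in range(1, T - 1)
    )
    return 0.5 * T * math.log(kappa) - 0.5 * kappa * penalty

# A linear sequence has zero second differences, so only the kappa term remains.
lp_linear = log_pairwise_difference_prior([0.0, 1.0, 2.0, 3.0], kappa=1.0)

# A single spike has second differences 1, -2, 1, so the penalty is 6.
lp_spike = log_pairwise_difference_prior([0.0, 0.0, 1.0, 0.0, 0.0], kappa=2.0)
```

Larger values of κ enforce a smoother trend, which is why κ is given its own prior rather than being fixed.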
Next steps.
- Combine likelihood and prior components into a gigantic joint distribution for all the unknowns
- Compute conditional distributions of the unobserved parameters given the (real and climate model) observations
- Construct posterior predictive probabilities for the observed extreme events
- Repeat this under different combinations of models (especially anthropogenic versus non-anthropogenic) and compare the results (similar to Stott, Stone and Allen)
REFERENCES
Besag, Green, Higdon and Mengersen (1995), Bayesian Computation and Stochastic Systems. Statistical Science 10, 3-41.
Buser, C.M. (2009), Bayesian Statistical Methods for the Analysis of Multi-Model Climate Predictions. ETH Dissertation.
Smith, R.L., Tebaldi, C., Nychka, D. and Mearns, L.O. (2009), Bayesian Modeling of Uncertainty in Ensembles of Climate Models. JASA 104, 97-116.
Stott, P.A., Stone, D.A. and Allen, M.R. (2004), Human contribution to the European heatwave of 2003. Nature 432, 610-614.
Tebaldi, C., Smith, R.L., Nychka, D. and Mearns, L.O. (2005), Quantifying uncertainty in projections of regional climate change: A Bayesian approach to the analysis of multi-model ensembles. Journal of Climate 18, 1524-1540 (corrigendum p. 3405).