Extreme Precipitation: An Application Modeling N-Year Return Levels at the Station Level

Similar documents
Downscaling Extremes: A Comparison of Extreme Value Distributions in Point-Source and Gridded Precipitation Data

DOWNSCALING EXTREMES: A COMPARISON OF EXTREME VALUE DISTRIBUTIONS IN POINT-SOURCE AND GRIDDED PRECIPITATION DATA

Richard L. Smith Department of Statistics and Operations Research University of North Carolina Chapel Hill, NC

Zwiers FW and Kharin VV Changes in the extremes of the climate simulated by CCC GCM2 under CO 2 doubling. J. Climate 11:

HIERARCHICAL MODELS IN EXTREME VALUE THEORY

DOWNSCALING EXTREMES: A COMPARISON OF EXTREME VALUE DISTRIBUTIONS IN POINT-SOURCE AND GRIDDED PRECIPITATION DATA

RISK AND EXTREMES: ASSESSING THE PROBABILITIES OF VERY RARE EVENTS

arxiv: v1 [stat.ap] 8 Oct 2010

RISK ANALYSIS AND EXTREMES

Generalized additive modelling of hydrological sample extremes

Statistics of extremes in climate change

ANALYZING SEASONAL TO INTERANNUAL EXTREME WEATHER AND CLIMATE VARIABILITY WITH THE EXTREMES TOOLKIT. Eric Gilleland and Richard W.

Bayesian Inference for Clustered Extremes

Statistics for extreme & sparse data

Extreme Value Analysis and Spatial Extremes

Bayesian Modelling of Extreme Rainfall Data

PREPRINT 2005:38. Multivariate Generalized Pareto Distributions HOLGER ROOTZÉN NADER TAJVIDI

Overview of Extreme Value Theory. Dr. Sawsan Hilal space

MULTIVARIATE EXTREMES AND RISK

Multivariate generalized Pareto distributions

STATISTICAL METHODS FOR RELATING TEMPERATURE EXTREMES TO LARGE-SCALE METEOROLOGICAL PATTERNS. Rick Katz

Investigation of an Automated Approach to Threshold Selection for Generalized Pareto

Sharp statistical tools Statistics for extremes

EVA Tutorial #2 PEAKS OVER THRESHOLD APPROACH. Rick Katz

APPLICATION OF EXTREMAL THEORY TO THE PRECIPITATION SERIES IN NORTHERN MORAVIA

EXTREMAL MODELS AND ENVIRONMENTAL APPLICATIONS. Rick Katz

A Conditional Approach to Modeling Multivariate Extremes

Regional Estimation from Spatially Dependent Data

A class of probability distributions for application to non-negative annual maxima

What Can We Infer From Beyond The Data? The Statistics Behind The Analysis Of Risk Events In The Context Of Environmental Studies

Bayesian Point Process Modeling for Extreme Value Analysis, with an Application to Systemic Risk Assessment in Correlated Financial Markets

100-year and 10,000-year Extreme Significant Wave Heights How Sure Can We Be of These Figures?

A STATISTICAL APPROACH TO OPERATIONAL ATTRIBUTION

A comparison of U.S. precipitation extremes under two climate change scenarios

Modelação de valores extremos e sua importância na

Estimating changes in temperature extremes from millennial-scale climate simulations using generalized extreme value (GEV) distributions

Lecture 2 APPLICATION OF EXREME VALUE THEORY TO CLIMATE CHANGE. Rick Katz

of the 7 stations. In case the number of daily ozone maxima in a month is less than 15, the corresponding monthly mean was not computed, being treated

Emma Simpson. 6 September 2013

Financial Econometrics and Volatility Models Extreme Value Theory

Overview of Extreme Value Analysis (EVA)

Multivariate generalized Pareto distributions

Uncertainties in extreme surge level estimates from observational records

MFM Practitioner Module: Quantitiative Risk Management. John Dodson. October 14, 2015

Automated, Efficient, and Practical Extreme Value Analysis with Environmental Applications

Semi-parametric estimation of non-stationary Pickands functions

Precipitation Extremes in the Hawaiian Islands and Taiwan under a changing climate

Bayesian nonparametrics for multivariate extremes including censored data. EVT 2013, Vimeiro. Anne Sabourin. September 10, 2013

FORECAST VERIFICATION OF EXTREMES: USE OF EXTREME VALUE THEORY

Threshold estimation in marginal modelling of spatially-dependent non-stationary extremes

Physics and Chemistry of the Earth

A New Class of Tail-dependent Time Series Models and Its Applications in Financial Time Series

Bivariate generalized Pareto distribution

Bayesian trend analysis for daily rainfall series of Barcelona

On the Estimation and Application of Max-Stable Processes

SEVERE WEATHER UNDER A CHANGING CLIMATE: LARGE-SCALE INDICATORS OF EXTREME EVENTS

Abstract: In this short note, I comment on the research of Pisarenko et al. (2014) regarding the

GENERALIZED LINEAR MODELING APPROACH TO STOCHASTIC WEATHER GENERATORS

NON STATIONARY RETURN LEVELS FOR RAINFALL IN SPAIN (IP)

EXTREMAL QUANTILES OF MAXIMUMS FOR STATIONARY SEQUENCES WITH PSEUDO-STATIONARY TREND WITH APPLICATIONS IN ELECTRICITY CONSUMPTION ALEXANDR V.

Extreme Values on Spatial Fields p. 1/1

Estimation of Gutenberg-Richter seismicity parameters for the Bundaberg region using piecewise extended Gumbel analysis

PENULTIMATE APPROXIMATIONS FOR WEATHER AND CLIMATE EXTREMES. Rick Katz

High Rainfall Events and Their Changes in the Hawaiian Islands

CLIMATE EXTREMES AND GLOBAL WARMING: A STATISTICIAN S PERSPECTIVE

Fitting the generalized Pareto distribution to data using maximum goodness-of-fit estimators

STATISTICAL MODELS FOR QUANTIFYING THE SPATIAL DISTRIBUTION OF SEASONALLY DERIVED OZONE STANDARDS

Multi-resolution models for large data sets

Future extreme precipitation events in the Southwestern US: climate change and natural modes of variability

Peaks-Over-Threshold Modelling of Environmental Data

The Behavior of Multivariate Maxima of Moving Maxima Processes

Probabilistic assessment of regional climate change: a Bayesian approach to combining. predictions from multi-model ensembles

R.Garçon, F.Garavaglia, J.Gailhard, E.Paquet, F.Gottardi EDF-DTG

Extremes and Atmospheric Data

Global Climate Models and Extremes

On the Application of the Generalized Pareto Distribution for Statistical Extrapolation in the Assessment of Dynamic Stability in Irregular Waves

Journal of Environmental Statistics

New Classes of Multivariate Survival Functions

Construction of confidence intervals for extreme rainfall quantiles

Max-stable Processes for Threshold Exceedances in Spatial Extremes

Extreme Value Theory and Applications

Accounting for Choice of Measurement Scale in Extreme Value Modeling

Extremes Analysis for Climate Models. Dan Cooley Department of Statistics

GENERALIZED LINEAR MODELING APPROACH TO STOCHASTIC WEATHER GENERATORS

Spatial and Temporal Characteristics of Heavy Precipitation Events over Canada

Spatial Extremes in Atmospheric Problems

Extremes of Severe Storm Environments under a Changing Climate

Bayesian spatial hierarchical modeling for temperature extremes

Latent process modelling of threshold exceedances in hourly rainfall series

STOCHASTIC MODELING OF ENVIRONMENTAL TIME SERIES. Richard W. Katz LECTURE 5

Two practical tools for rainfall weather generators

Bayesian Spatial Modeling of Extreme Precipitation Return Levels

A review of methods to calculate extreme wind speeds

Applications of Tail Dependence II: Investigating the Pineapple Express. Dan Cooley Grant Weller Department of Statistics Colorado State University

Research Article Extreme Precipitation Changes in the Semiarid Region of Xinjiang, Northwest China

High-frequency data modelling using Hawkes processes

Estimation of 100-year-return-period temperatures in France in a non-stationary climate : results from observations and IPCC scenarios

Efficient Estimation of Distributional Tail Shape and the Extremal Index with Applications to Risk Management

The Spatial Variation of the Maximum Possible Pollutant Concentration from Steady Sources

Weather generators for studying climate change

Transcription:

Extreme Precipitation: An Application Modeling N-Year Return Levels at the Station Level Presented by: Elizabeth Shamseldin Joint work with: Richard Smith, Doug Nychka, Steve Sain, Dan Cooley Statistics Group, IMAGe National Center for Atmospheric Research March 22, 2006

Outline Motivating Question Techniques Theory Results Conclusion and Future Work 1

Motivating Question The question under investigation is whether regional climate model return level estimations can be used to obtain return level predictions at the station level. 2

Outline of Techniques Relationship of grid cell data to the n-year return levels at point locations is explored Tail of the Generalized Extreme Value distribution (GEV) is fit to the grid cell data above a given threshold Similar to the Peaks Over Threshold method However, here the method used for parameter estimation is a Point Process approach Leads directly to the GEV parameters 3

Techniques GEV Parameter estimates are used to generate n-year return levels n-year returns at the point (station) locations are estimated the same way Various models are explored to predict point location n-year return from grid cell n-year return 4

Theory The Generalized Extreme Value or GEV distribution is defined by the equation ( Pr{Y y} = exp 1 + ξ y µ ) 1/ξ (1) ψ + In practice, however, fitting (1) directly to sample annual maxima has a number of disadvantages. Many of the observational series are short (less than 25 years) Most observational series contain missing values - it is unclear how to define the annual maximum when a significant fraction of the daily values are missing. 5

Theory To take account of these deficiencies, an alternative class of methods has been developed, often known as the Peaks Over Thresholds (POT) approach. For the distribution of the excess values, a common family of probability density functions is the Generalized Pareto Distribution (GPD), introduced by Pickands (1975), and given by Pr{X u + x X > u} = 1 ( 1 + ξ x ) 1/ξ σ +. (2) One drawback in the POT model is that the parameters are directly tied to the threshold value, u. A third approach, the point process approach (Smith 1989, 2003, Coles 2001), although operationally very similar to the POT approach, uses a representation of the probability distribution that leads directly to the GEV parameters (µ, ψ, ξ). 6

Theory The point process approach is similar to the peaks over threshold method in that all observations over a given threshold are considered. However, in the POT model the parameters are directly tied to the threshold value whereas in the PP approach, the parameters are not tied to the threshold value. Here we estimate the tail of the GEV. Instead of conditionally modeling the tail as the GPD does, we are directly modeling the tail values over a given threshold. The parameters for the GEV obtained through the point process approach lead directly to the GEV parameters. 7

Theory The Point Process method considers N peaks, Y 1...Y N, observed at times T 1...T N. Pairs are viewed as points in the space [0, T ] (u, ) (u=threshold), which form a nonhomogeneous Poisson process with intensity measure: λ(t, y) = 1 ψ ( (y µ) 1 + ξ ψ ) 1 ξ 1 + 8

Theory By standard formulae for a Poisson Process, the likelihood is of the form: L(µ, ψ, ξ) = = N i=1 N i=1 λ(t i, Y i ) exp { T λ(t i, X i ) exp T 0 ( u λ(t, y)dtdy 1 + ξ u µ ψ } ) 1/ξ +. (3) 9

Theory In practice we work with the negative log likelihood, l = log L, which leads to ( ) 1 N ( l(µ, ψ, ξ) = N log ψ + ξ + 1 log 1 + ξ Y ) i µ (4) ψ + T ( 1 + ξ u µ ψ i=1 ) 1/ξ where T is the length of the observation period in years and the (...) + symbols essentially mean that the expression are only evaluated if 1 + ξ u µ ψ > 0 and 1 + ξy i µ ψ > 0 for each i (if these constraints are violated, L is automatically set to 0). + + 10

Theory The basic method of estimation is therefore to choose the parameters (µ, ψ, ξ) to maximize (3) or equivalently to minimize (4). This is performed by numerical nonlinear optimization. In practice it is convenient to replace (µ, ψ, ξ) by (θ 1, θ 2, θ 3 ) where θ 1 = µ, θ 2 = log ψ, θ 3 = ξ (defining θ 2 to be log ψ rather than ψ itself makes the algorithm more numerically stable, and has the advantage that we don t have to build the constraint ψ > 0 explicitly into the optimization procedure). 11

Theory The PP approach should produce equivalent parameter values as the POT approach, provided the threshold is high enough for the model to fit the data. Provided the model fits the data, the parameters are independent of the threshold (adjusting for estimation error). The ideal threshold can be determined by considering where the parameter values stabilize. In the GPD approach, the scale parameter still varies with threshold, and is not necessarily the same as the scale parameter of the PP approach. 12

Theory The n-year return values can be directly obtained using the estimated GEV parameters obtained from the PP approach. Define y n by the equation: ( 1 + ξ y n µ ψ ) 1/ξ = 1 n which leads to the formula: y n = µ + ψ nξ 1 ξ if ξ 0, µ + ψ log n if ξ = 0. (5) 13

Data Point-source observational data from NCDC, originally obtained from Dr. Pavel Groisman Covers period 1950-1999 over 5873 stations. The data are daily rainfall values; units are tenths of a millimeter. Grid-cell data are from NCEP Covering period 1948 2003 with no missing data on 288 2.5 o grid cells, converted to the same units as the NCDC data. Rainfall values are considered over the four seasons Threshold values are determined by the 95 th and 97 th percentiles Clustering method is used to define peaks 14

Results Comparisons of R and Fortran Codes for NCDC Rainfall Data 50 year Return Values 50 year Return Values (Outliers Removed) 4000 4000 3000 3000 Fortran code 2000 Fortran code 2000 1000 1000 0 0 0 10000 20000 30000 40000 50000 R code 0 1000 2000 3000 4000 R code The parameter estimates obtained through Richard Smith s point process algorithm using the Quasi-Newton method are directly comparable to estimates obtained through the pp.f it function, using the Nelder-Mead method, in the R software package ismev for the point process approach. 15

GEV Model Fit - Mu Parameter: 95 Grid - Point vs 97 Grid - Point 16

GEV Model Fit - LogPsi Parameter: 95 Grid - Point vs 97 Grid - Point 17

GEV Model Fit - Xi Parameter: 95 Grid - Point vs 97 Grid - Point 18

100-Year Return: Point vs Grid - LogPoint vs Grid 19

100-Year Return: LogPoint + Elevation vs Grid 100-Year Return w/ 97% Threshold: LogPoint + Elevation vs Grid 20

50-Year Return: LogPoint + Elevation vs Grid 50-Year Return w/ 97% Threshold: LogPoint + Elevation vs Grid 21

Joint Modeling of 50 and 100-Year Returns: 95% Threshold Joint Modeling of 50 and 100-Year Returns: 97% Threshold 22

Model Results and Comparison Univariate Regression Predicting Point Locations 100-Year Return Level Using Grid Cell 100-Year Return WINTER - 100 OBS AIC Intercept X1 X2 Point Grid 4207 62011-85.7749 2.1267 log(point) Grid 4207 4724 5.1957 0.0032 log(pt) Grid+Elev 4207 4681 5.3775 0.0029-0.000110 W: log(pt) Grid 4026 3479 5.1250 0.0033 W: log(pt) Grid+Elev 4025 3425 5.3186 0.0030-0.000124 97 W: log(pt) Grid 4003 3343 5.1253 0.0033 97 W: log(pt) Grid+Elev 4013 3356 5.3074 0.0030-0.000113 SPRING - 100 OBS AIC Intercept X1 X2 Point Grid 4339 64296-96.0847 2.4299 log(point) Grid 4339 3776 5.4950 0.0029 log(pt) Grid+Elev 4339 3332 5.9362 0.0022-0.000253 W: log(pt) Grid 4132 2324 5.4042 0.0031 W: log(pt) Grid+Elev 4138 1924 5.8960 0.0023-0.000258 97 W: log(pt) Grid 4150 2761 5.4572 0.0030 97 W: log(pt) Grid+Elev 4145 2282 5.9736 0.0021-0.000281 23

Conclusion and Future Work Modeling the tail of the GEV Distribution appears to produce stable estimates as indicated across seasons and 95% and 97% thresholds 100 and 50-year return levels are successfully modeled by season at the point (station) level using grid-level return values and station elevation Model coefficients are consistent within seasons across 95% and 97% thresholds Future work includes possible consideration of quadratic models Future work includes plans to test grid-point models on CCSM data 24

References Coles, S.G. (2001) An Introduction to Statistical Modeling of Extreme Values. Springer Verlag, New York. Davison, A.C. and Smith, R.L. (1990), Models for exceedances over high thresholds (with discussion). J.R. Statist. Soc., 52, 393-442. Fisher, R.A. and Tippett, L.H.C. (1928), Limiting forms of the frequency distributions of the largest or smallest member of a sample. Proc. Camb. Phil. Soc. 24, 180190. Gumbel, E.J. (1958), Statistics of Extremes. Press. Columbia University Katz, R.W., Parlange, M.B. and Naveau, P. (2002), Statistics of extremes in hydrology. Advances in Water Resources (fill in full details of reference) 25

Kharin, V.V. and Zwiers, F.W. (2000), Changes in the extremes in an ensemble of transient climate simulations with a coupled atmosphereocean GCM. Journal of Climate 13, 37603788. Leadbetter, M.R., Lindgren, G. and Rootz?en, H. (1983), Extremes and Related Properties of Random Sequences and Series. Springer Verlag, New York. Pickands, J. (1975), Statistical inference using extreme order statistics. Ann. Statist. 3, 119131. Smith, R.L. (1989), Extreme value analysis of environmental time series: An application to trend detection in ground-level ozone (with discussion). Statistical Science 4, 367-393. Smith, R.L. (1990), Extreme value theory. In Handbook of Applicable Mathematics 7, ed. W. Ledermann, John Wiley, Chichester. Chapter 14, pp. 437-471.

Smith, R.L. (2003), Statistics of extremes, with applications in environment, insurance and finance. Chapter 1 of Extreme Values in Finance, Telecommunications and the Environment, edited by B. Finkenstadt and H. Rootzen, Chapman and Hall/CRC Press, London, pp. 178. http://www.unc.edu/depts/statistics/postscript/rs/semstat Smith, R.L. and Weissman, I. (1994), Estimating the extremal index. J.R. Statist. Soc. B 56, 515528. Zwiers, F.W. and Kharin, V.V. (1998), Changes in the extremes of the climate simulated by CCC GCM2 under CO2 doubling. Journal of Climate 11, 22002222.