Using statistical methods to analyse environmental extremes.

Using statistical methods to analyse environmental extremes
Emma Eastoe, Department of Mathematics and Statistics, Lancaster University
December 16, 2008

Focus of talk
- Discuss statistical models used to predict the size and/or times of unusually large (extreme) events;
- Show how these models can be adapted to incorporate the physical structure of the data;
- Two examples: air pollution and river flow.
Emma Eastoe (Maths & Statistics) Statistical models for extremes December 16, 2008 2 / 27

Unusually large events
- How high should a sea wall be built so that it is breached (on average) only once every 100 years? (Coastal engineer.)
- How high should a dam be built so that it floods (on average) only one year in every 10000? (Engineer, water company.)
- On how many days a year will ozone levels exceed safety levels? (Health worker, traffic planner.)
- Where should an oil rig be located to be best protected from the largest waves? (Oil company, engineer.)

Background

Background 1
- Data: time series of daily observations for the last n years;
- Assume data are independent and identically distributed (IID), i.e. data are an independent random sample from a probability distribution with constant parameters, e.g. Normal(µ, σ), Exponential(λ).
[Figure: time series of Data against Time]

Background 2
- Interest is in predicting unusually large events;
- Fit a statistical model based only on data from the upper tail of the underlying distribution, i.e. ignore small/medium-sized data;
- Peaks over threshold (POT): select a high constant threshold u and model the rate and size of threshold exceedances.
[Figure: time series of Data against Time]

Background 3
Mathematical results support the use of the following models for the rate and size of threshold exceedances:
- Rate: a Poisson process with parameter λ;
- Size: the two-parameter generalised Pareto distribution (GPD).
Use the fitted models to calculate the N-year return level, i.e. the level exceeded on average once every N years, for N much larger than the length n of the data set.
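The slide does not write out the return level formula, so the following is only a sketch under stated assumptions: the GPD is taken in its usual scale/shape form (`sigma`, `xi`), `rate` is the mean number of threshold exceedances per year, and the N-year return level x solves rate × P(X > x | X > u) = 1/N. The function name and arguments are illustrative, not taken from the talk.

```python
import math


def return_level(u, sigma, xi, rate, n_years):
    """N-year return level for a Poisson-GPD threshold model (sketch).

    u       : threshold
    sigma   : GPD scale parameter (assumed > 0)
    xi      : GPD shape parameter
    rate    : mean number of threshold exceedances per year
    n_years : return period N, in years
    """
    m = rate * n_years  # expected number of exceedances in N years
    if abs(xi) < 1e-12:
        # Limit xi -> 0: exceedance sizes are exponential
        return u + sigma * math.log(m)
    return u + (sigma / xi) * (m ** xi - 1.0)


# Illustrative (made-up) parameter values
print(return_level(u=10.0, sigma=2.0, xi=0.5, rate=4.0, n_years=25))
```

The formula is only meaningful when rate × N exceeds 1, i.e. when the return level lies above the threshold.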

What can go wrong?
The model assumption that data are IID is usually too simplistic:
- Series displays trends;
- Series is correlated at short lags.
[Figure: three time series of Data against Time: (a) IID, (b) Trend, (c) Correlated]

Part 1

Data with trend
- The mathematical theory which supports the threshold model only extends to a few special cases of trend;
- Modellers use the POT model but allow the Poisson process and GPD parameters to vary with time and/or covariates;
- Parametric (GLM) or non-parametric (LOESS, GAM) models;
- Numerical difficulties with model fitting;
- Discuss an alternative approach, better motivated by theory and easier to fit.

Ozone 1
- Surface-level ozone data, centre of Reading;
- Summer peaks, winter troughs;
- Responds to changes in precursors (e.g. NO, NO₂) as well as sunshine, temperature and wind speed/direction.
[Figure: Ozone (µg m⁻³) against Time, 1998 to 2001]

Ozone 2
First remove trends from the full data set $\{Y_t\}$ using covariates $\{x_t\}$, then model the extremes, which should be IID.
- Model trends in mean $\mu$ and scale $\sigma$ by supposing that the transformed data $\{Y_t^{\lambda}\}$ are a sample from a Normal$(\mu(x_t), \sigma(x_t))$ distribution;
- Model mean and variance as linear functions of covariates, e.g. $\mu(x_t) = \mu^{\top} x_t = \mu_0 + \mu_1 x_{1,t} + \dots + \mu_p x_{p,t}$, where $\mu$ is a vector of regression coefficients;
- Standardise the data by $Z_t = \dfrac{Y_t^{\lambda} - \mu(x_t)}{\sigma(x_t)}$.
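The standardisation step can be sketched as follows. This is a minimal illustration, not the fitting procedure from the talk: the coefficient vectors `mu_coef` and `sigma_coef` are assumed to have been estimated already, the design matrix `X` is assumed to carry a leading column of ones for the intercept, and the scale itself (rather than the variance) is taken to be linear in the covariates purely for simplicity. All names are hypothetical.

```python
import numpy as np


def standardise(y, X, lam, mu_coef, sigma_coef):
    """Return Z_t = (Y_t**lam - mu(x_t)) / sigma(x_t).

    y          : observations Y_t, shape (n,)
    X          : design matrix with a leading column of ones, shape (n, p + 1)
    lam        : power-transform exponent (e.g. 0.5 for a square root)
    mu_coef    : regression coefficients for the mean (assumed already fitted)
    sigma_coef : regression coefficients for the scale (assumed already fitted)
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    mu = X @ np.asarray(mu_coef, dtype=float)
    sigma = X @ np.asarray(sigma_coef, dtype=float)
    if np.any(sigma <= 0):
        raise ValueError("scale must be positive for every covariate value")
    return (y ** lam - mu) / sigma


# Toy data: square-root transform, one covariate, constant scale
X = np.column_stack([np.ones(3), [0.0, 1.0, 2.0]])
z = standardise([4.0, 9.0, 16.0], X, lam=0.5,
                mu_coef=[1.0, 1.0], sigma_coef=[2.0, 0.0])
print(z)
```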

Ozone 3
To model the extremes:
- Select a high constant threshold $u_Z$, e.g. the 99% quantile of the standardised series $\{Z_t\}$;
- Model the rate and size of threshold exceedances of $u_Z$ using Poisson process and GPD models with constant parameters;
- Could include covariates in the POT model parameters, especially if extremes might respond in a different way to covariates;
- The effective threshold now varies in time: $u(x_t) = [u_Z \sigma(x_t) + \mu(x_t)]^{1/\lambda}$.
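A short sketch of the threshold step, assuming the standardised series and the fitted mean and scale at each time point are already available as arrays (`z`, `mu`, `sigma` are hypothetical names). It picks the constant threshold on the standardised scale with `np.quantile` and maps it back to the original scale with the effective-threshold formula.

```python
import numpy as np


def effective_threshold(z, mu, sigma, lam, prob=0.99):
    """Return (u_Z, u(x_t)) for a constant threshold on the standardised scale.

    z     : standardised series Z_t
    mu    : fitted mean mu(x_t) at each time point
    sigma : fitted scale sigma(x_t) at each time point
    lam   : power-transform exponent used before standardising
    prob  : quantile level used to choose u_Z
    """
    u_z = float(np.quantile(np.asarray(z, dtype=float), prob))
    base = u_z * np.asarray(sigma, dtype=float) + np.asarray(mu, dtype=float)
    # A fractional power needs a non-negative base
    if np.any(base < 0):
        raise ValueError("u_Z * sigma + mu must be non-negative")
    return u_z, base ** (1.0 / lam)


# Toy example: prob=1.0 simply picks the largest standardised value
u_z, u_t = effective_threshold([-1.0, 0.0, 1.0], mu=[3.0, 5.0],
                               sigma=[1.0, 2.0], lam=0.5, prob=1.0)
print(u_z, u_t)
```

Because the mean and scale change with the covariates, a single value of u_Z produces a different threshold on the original scale at each time point.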

Ozone 4
Modelled mean $\mu(x_t)$ and variance $\sigma^2(x_t)$ of the square root of ozone.
[Figure: three panels against Time, 1998 to 2001: (a) Mean, (b) Standardised ozone $Z_t$, (c) Effective threshold]

Return levels
From this model, some estimated return levels (with 95% confidence intervals) are:

Return period              5-yr         10-yr        100-yr
Return level (µg m⁻³)      218          234          289
95% confidence interval    (197, 247)   (208, 271)   (239, 387)

These seem realistic:
- The 5-year return level is exceeded on only three days over the four years of observed data (at the end of July/beginning of August);
- Neither the 10- nor the 100-year return level is exceeded at all.

Part 2

River flow 1
Time series of daily flows at Kingston on the River Thames (1883-2006):
- What is the best statistical model for the maximum annual flow at this site?
- Could extract annual maxima and model these, but this leads to loss of information...
- Instead use models for the number of events in a year and the size of the peak flow in an event...
- What is the most appropriate model for the annual number of events?
- How does the model used for this affect the implied annual maxima distribution?

River flow 2
- Data show strong correlation at short time lags;
- Select a high threshold u to identify associated flow events;
- Independent events begin with a threshold exceedance and end after m consecutive non-exceedances.
[Figure: Flow (m³/s) against Time, Aug 19 to Aug 29]
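The event definition above can be sketched as a simple pass over the series. This is one possible reading of the rule (an event is closed once m consecutive values at or below the threshold have been seen); the function name and the returned tuple layout are illustrative choices.

```python
def decluster(flows, u, m):
    """Identify independent events in a series using a runs rule.

    An event starts at a threshold exceedance (value > u) and ends once
    m consecutive non-exceedances have been observed.
    Returns a list of (start_index, last_exceedance_index, peak_value).
    """
    events = []
    start = None      # index of the first exceedance in the current event
    last_exc = None   # index of the most recent exceedance
    for t, y in enumerate(flows):
        if y > u:
            if start is None:
                start = t
            last_exc = t
        elif start is not None and t - last_exc >= m:
            # t - last_exc counts the consecutive non-exceedances so far
            events.append((start, last_exc, max(flows[start:last_exc + 1])))
            start = None
    if start is not None:
        # Series ended while an event was still open
        events.append((start, last_exc, max(flows[start:last_exc + 1])))
    return events


flows = [1, 5, 1, 6, 1, 1, 7, 1]
print(decluster(flows, u=4, m=2))
```

With m = 2 the exceedances at positions 1 and 3 are merged into one event because only a single non-exceedance separates them; with m = 1 they would be treated as separate events.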

Homogeneous Poisson model
- Let $N_i$ be the number of events in year i;
- Could assume the $N_i$ are an independent random sample from a Binomial(365, p) distribution;
- The number of observations is large and the probability p of an event is small, so use the Poisson approximation;
- Assume that the $N_i$ are an independent random sample from a Poisson(λ) distribution;
- Interpretation: λ is the mean number of events per year.
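To see how close the Poisson approximation is, the two probability mass functions can be compared directly. The value λ = 3.3 below is purely illustrative, and p is set to λ/365 so that both distributions have the same mean.

```python
import math


def binomial_pmf(k, n, p):
    """P(N = k) for N ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(N = k) for N ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


lam = 3.3            # illustrative mean number of events per year
n_days = 365
p = lam / n_days     # matching daily event probability

# Largest pointwise difference between the two pmfs over k = 0..20
max_gap = max(abs(binomial_pmf(k, n_days, p) - poisson_pmf(k, lam))
              for k in range(21))
print(max_gap)
```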

Over-dispersion
- A consequence of this model choice is that the between-year variance in the number of events is equal to the mean number of events per year;
- However, previous studies have shown that the between-year variance is larger than the mean for most rivers in the UK;
- Reason: unobserved ("missing") covariates, e.g. precipitation, antecedent soil conditions?
- This extra variation between years cannot be captured by the homogeneous Poisson model.

Latent variables (or random effects)
Build a model for the missing covariates (latent variables).
1. Denote by $\gamma_i$ the latent variable for year i and assume these are an independent random sample from a Gamma(1/α, 1/α) distribution;
2. Assume the $N_i$ are independent in time and follow a Poisson distribution with mean varying from year to year, $\lambda_i = \lambda \gamma_i$, $\lambda > 0$.
Under this model:
- The mean number of events per year is λ;
- The between-year variance has increased from λ to λ(1 + λα).
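The mean and variance stated above can be checked by simulation. The sketch below assumes Gamma(1/α, 1/α) means shape 1/α and rate 1/α (so NumPy's `scale` argument is α, giving the latent variable mean 1 and variance α); the values of λ and α are illustrative only.

```python
import numpy as np


def simulate_annual_counts(lam, alpha, n_years, rng):
    """Simulate N_i ~ Poisson(lam * gamma_i), gamma_i ~ Gamma(1/alpha, rate 1/alpha)."""
    # NumPy parameterises the gamma by shape and scale; scale = 1 / rate = alpha
    gamma = rng.gamma(shape=1.0 / alpha, scale=alpha, size=n_years)
    return rng.poisson(lam * gamma)


rng = np.random.default_rng(2008)
lam, alpha = 3.3, 0.26   # illustrative values
counts = simulate_annual_counts(lam, alpha, n_years=200_000, rng=rng)

print("simulated mean    :", counts.mean())
print("simulated variance:", counts.var())
print("lam * (1 + lam * alpha):", lam * (1 + lam * alpha))
```

The variance follows from Var(N) = E[λγ] + Var(λγ) = λ + λ²α, which is the λ(1 + λα) quoted on the slide.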

R. Thames, Kingston
- Average of 3.3 events per year and between-year variance of 5.7;
- Fitting the latent variable model gives parameter estimates of $\hat{\lambda} = 3.3$ (2.9, 3.8) and $\hat{\alpha} = 0.26$ (0.14, 0.43).
[Figure: observed (dots) and estimated mean (line) number of events per year against Year, 1880 to 2000]
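As a rough check, not shown on the slide, the rounded point estimates can be substituted into the latent variable model's variance λ(1 + λα) and compared with the empirical between-year variance:

```latex
\hat{\lambda}\,(1 + \hat{\lambda}\hat{\alpha})
  = 3.3 \times (1 + 3.3 \times 0.26)
  = 3.3 \times 1.858
  \approx 6.13
```

This is close to the observed between-year variance of 5.7, whereas the homogeneous Poisson model would imply a variance equal to the mean, 3.3. The comparison uses rounded estimates, so it is indicative only.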

Annual maximum
- 3 to 500 year return levels;
- Homogeneous Poisson process (black);
- Latent variables with α = 0.5 (red), α = 1 (green) and α = 5 (blue).
[Figure: return level against $-\log(-\log(1 - 1/n))$]
- At higher return levels we have averaged over all values of the latent variables;
- Bigger α implies more years with zero events, hence lower short period return levels.
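The slide does not give the annual maximum distribution explicitly. A sketch of what the two count models imply, under the added assumptions that event peaks are IID with distribution function G, independent of the number of events, and that a year with no events counts as not exceeding any level: the homogeneous Poisson model gives P(M ≤ x) = exp{−λ(1 − G(x))}, and averaging over the gamma latent variable gives (1 + αλ(1 − G(x)))^(−1/α). The function name is hypothetical.

```python
import math


def annual_max_cdf(g, lam, alpha=0.0):
    """P(annual maximum <= x), given g = G(x), the peak-size cdf at x.

    alpha = 0 : homogeneous Poisson model for the number of events
    alpha > 0 : Poisson counts with Gamma(1/alpha, 1/alpha) latent variable
    """
    s = lam * (1.0 - g)   # expected number of event peaks above x
    if alpha == 0.0:
        return math.exp(-s)
    # Gamma mixture of exp(-s * gamma), i.e. the gamma Laplace transform at s
    return (1.0 + alpha * s) ** (-1.0 / alpha)


lam = 3.3  # illustrative mean number of events per year
for alpha in (0.0, 0.5, 1.0, 5.0):
    # g = 0 gives the probability of a year with no events at all
    print(alpha, annual_max_cdf(0.0, lam, alpha))
```

The printed probabilities of a zero-event year increase with α, which matches the slide's remark that bigger α implies more years with zero events.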

Extension - within year variability
- Could model the point process $\delta_{ij}$, where $\delta_{ij} = 1$ if there is an event peak on day j of year i, and $\delta_{ij} = 0$ if there is no event peak on day j of year i;
- Inhomogeneous Poisson process model with rate parameter $\lambda_{ij} = \gamma_i \exp\{\beta^{\top} x_{ij}\}$, where $\beta$ are regression coefficients, $x_{ij}$ are covariates and $\gamma_i$ are as earlier;
- Use latent variables as a diagnostic for covariate selection;
- Distribution of annual maxima by simulation.
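A possible simulation of the daily event indicators is sketched below. It is an illustration under several assumptions: the covariate array `X` has shape (years, days, covariates), each day's count is drawn as Poisson with rate λ_ij and then reduced to a 0/1 indicator (a reasonable approximation when daily rates are small), and the latent variables are Gamma(1/α, 1/α) with mean 1. All names and values are hypothetical.

```python
import numpy as np


def simulate_event_indicators(X, beta, alpha, rng):
    """Simulate delta_ij with daily rate lambda_ij = gamma_i * exp(beta' x_ij).

    X     : covariates, shape (n_years, n_days, p)
    beta  : regression coefficients, shape (p,)
    alpha : variance of the yearly latent variable gamma_i
    """
    n_years = X.shape[0]
    gamma = rng.gamma(shape=1.0 / alpha, scale=alpha, size=n_years)
    rate = gamma[:, None] * np.exp(X @ beta)   # shape (n_years, n_days)
    # Indicator of at least one event peak on each day
    return (rng.poisson(rate) > 0).astype(int)


rng = np.random.default_rng(27)
n_years, n_days = 2000, 365
# Intercept-only covariates, chosen so the mean is about 3.3 events per year
X = np.ones((n_years, n_days, 1))
beta = np.array([np.log(3.3 / n_days)])
delta = simulate_event_indicators(X, beta, alpha=0.26, rng=rng)
annual_counts = delta.sum(axis=1)
print(annual_counts.mean(), annual_counts.var())
```

Attaching a simulated peak size to each indicator and taking the yearly maximum would then give a simulated annual maximum distribution, as the last bullet suggests.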

R. Nairn at Firhall
- Covariates: baseflow and 3-month aggregated rainfall.
[Figure: number of events per year against Year, 1880 to 2000. Left: covariates only; right: covariates and latent variables]
- Over-estimation of the mean number of events in the covariates-only model.

Conclusions
- Statistical models for extreme values provide a useful way to predict unusual events;
- These models can be adapted to a variety of applications;
- And can be constructed to incorporate known physical structure or relationships;
- Any questions, comments, possible applications are welcome...

Thank-you!