Statistics for extreme & sparse data


Statistics for extreme & sparse data University of Bath December 6, 2018


The Problem Climate Change = Bad!

4 key problems: Volcanic eruption/catastrophic event prediction. Windstorms: how to work out potential losses should one occur. Weather extreme indices. Parametric insurance applications.

Data Windstorm data: http://www.europeanwindstorms.org/ Volcanic eruption data: https://www.ngdc.noaa.gov/nndc/servlet/showdatasets?dataset=102557&search_look=50&display_look=50 Climate data (station & gridded): https://www.metoffice.gov.uk/climate/uk/data Climate index data: https://www.climdex.org/datasets.html

Key issues Dealing with very sparse data both in space and in time. Ground monitoring stations are often very scattered and coverage of many locations is very poor. These events are extreme in nature so tend to be spread out in time.

Source 1: Data from Monitoring Stations

Source 2: Numerical Model Output

Combining data sources? Why?: We need to accurately assess exposure (e.g. to a volcanic eruption) to estimate financial or health risks. Types of data: directly measured data (e.g. pollutant concentrations) from monitoring networks, and numerical model output, which provides estimates of the average level in grid cells. Directly measured data offer finer spatial resolution but suffer considerable missingness. Numerical modelling involves numerically solving complex systems of differential equations capturing various physical processes; its output spans large spatial domains with no missingness.


What Are the Benefits of Data Fusion? 1 Advantages: Fusion can improve assessment of exposure at high resolution. It allows us to combine observational data on the current state of the atmosphere with a short-range numerical forecast to obtain initial conditions for a numerical atmospheric model. 2 Disadvantages: Many methods proposed in atmospheric data assimilation are ad hoc and algorithmic, and don't address the change of support problem.

Change of Support Problem Spatial data can be collected at point locations or associated with areal units. The spatial support is the volume/shape/size/orientation associated with each spatial measurement. The change of support problem concerns inference about the values of a variable at a level different from that at which it has been observed. It involves studying the statistical properties of a spatial process as we change the spatial support (Banerjee, Carlin and Gelfand, 2004).


Methods of Combining Sources Kriging, a method of interpolation in which the interpolated values are modelled by a Gaussian process, is often used to interpolate between sites. It is not ideal when data are sparse! If one is interested in inference at the grid level, upscaling can be used to make area-level inferences from point data. Conversely, downscaling can be used when observations are at an area level (e.g. gridded data) and point-level inferences are wanted. We'll focus primarily on downscaling.

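The kriging step above can be sketched in one dimension. This is a minimal illustration of simple kriging with an exponential covariance; the sites, sill and range values are invented for illustration, not taken from any of the datasets above.

```python
import math

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def simple_krige(sites, values, s0, mean=0.0, sill=1.0, rng=50.0):
    # Simple kriging at location s0 with known mean and an
    # exponential covariance C(h) = sill * exp(-h / rng).
    cov = lambda a, b: sill * math.exp(-abs(a - b) / rng)
    K = [[cov(si, sj) for sj in sites] for si in sites]
    k0 = [cov(si, s0) for si in sites]
    w = solve(K, k0)  # kriging weights
    return mean + sum(wi * (vi - mean) for wi, vi in zip(w, values))

sites = [0.0, 30.0, 100.0]      # hypothetical 1D station locations
values = [1.2, 0.4, -0.8]       # hypothetical observations
pred = simple_krige(sites, values, 30.0)
```

With no nugget effect the predictor interpolates exactly: predicting at an observed site returns the observed value, which also hints at why kriging struggles when sites are sparse, since predictions far from any site collapse back to the mean.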

Downscaling Two types of downscaling: Dynamical downscaling: output from Global Climate Models is used to drive a regional numerical model at higher spatial resolution. A physical/numerical approach! Statistical downscaling: a statistical relationship is established from observations between large-scale variables (e.g. atmospheric surface pressure) and a local variable (e.g. a site's wind speed). The relationship is then applied to GCM output to obtain the local variables.
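The statistical route can be sketched in its simplest form: regress the local variable on one large-scale predictor from observations, then apply the fitted relationship to coarse model output. All numbers here are illustrative, and real statistical downscaling would use richer predictors and models.

```python
def fit_linear(x, y):
    # Ordinary least squares for local = a + b * large_scale,
    # estimated from paired observations.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def downscale(gcm_values, a, b):
    # Apply the observed relationship to coarse GCM output.
    return [a + b * g for g in gcm_values]

# Hypothetical observed pairs (large-scale predictor, local variable):
a, b = fit_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
local = downscale([10.0], a, b)
```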

Downscaling in practice A tutorial on working with spatial data & downscaling can be seen at: http://www.wmo.int/pages/prog/wcp/agm/meetings/korea2016/documents/tutorial_statistical_downscaling.pdf Another good source is this paper by Shaddick, Thomas et al. (2018), which explains how to fit it within a Bayesian framework: https://pubs.acs.org/doi/10.1021/acs.est.8b02864


Potential options Two heavy-tailed models that allow for large observations arise out of Extreme Value Theory: 1 Generalised Extreme Value (GEV) distribution: the limit distribution of sample maxima, the largest observation in each period of observation. 2 Generalised Pareto Distribution (GPD): the limit distribution of losses which exceed a high enough threshold. Given count data for the events exceeding a given threshold, one can also use Poisson regression to investigate whether the frequency of these events is changing.

GEV distribution for block maxima We may wish to study a series of block maxima M_n = max{X_1, ..., X_n}, where X_1, ..., X_n is a sequence of iid random variables with a common distribution function. M_n represents the maximum (or, with a change of sign, the minimum) of the process over n units of time. The generalised extreme value distribution can be used as a model for these block maxima: G(z) = exp{ -[1 + ξ(z - µ)/σ]^(-1/ξ) }, (1) defined on {z : 1 + ξ(z - µ)/σ > 0}. The model has a location parameter, µ, a scale parameter, σ, and a shape parameter, ξ.
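The GEV distribution function (1) and the block-maxima construction can be sketched directly; the ξ → 0 case is the Gumbel limit, and the example series below is invented.

```python
import math

def gev_cdf(z, mu=0.0, sigma=1.0, xi=0.0):
    # G(z) = exp{ -[1 + xi*(z - mu)/sigma]^(-1/xi) },
    # with the xi -> 0 (Gumbel) limit handled explicitly.
    s = (z - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        return 0.0 if xi > 0 else 1.0  # outside the support
    return math.exp(-t ** (-1.0 / xi))

def block_maxima(series, block_size):
    # Split a series into consecutive blocks and take each block's maximum.
    return [max(series[i:i + block_size])
            for i in range(0, len(series) - block_size + 1, block_size)]

maxima = block_maxima([1.0, 5.0, 2.0, 4.0, 3.0, 6.0], 2)
```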

GPD: Is the magnitude of events changing? Given peak data X, for a sufficiently large threshold u, the distribution of (X - u) conditional on X > u may be approximated by: H(y) = 1 - (1 + ξy/σ̃)^(-1/ξ), defined on {y : y > 0 and 1 + ξy/σ̃ > 0}, where σ̃ = σ + ξ(u - µ). This family of distributions is known as the generalised Pareto family. The sizes of threshold exceedances may be approximated by a member of this family.
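The exceedance construction and the GPD distribution function H can be sketched as follows; the threshold and parameter values are illustrative only.

```python
import math

def gpd_cdf(y, sigma, xi=0.0):
    # H(y) = 1 - (1 + xi*y/sigma)^(-1/xi) for exceedance size y > 0;
    # the xi -> 0 limit is the exponential distribution.
    if y <= 0:
        return 0.0
    if abs(xi) < 1e-12:
        return 1.0 - math.exp(-y / sigma)
    t = 1.0 + xi * y / sigma
    if t <= 0.0:
        return 1.0  # beyond the upper endpoint (xi < 0)
    return 1.0 - t ** (-1.0 / xi)

def exceedances(series, u):
    # Sizes of the peaks over the threshold u.
    return [x - u for x in series if x > u]

excess = exceedances([3.0, 12.0, 7.0, 20.0], 10.0)
```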

Poisson regression: Is the frequency of these events changing? We can fit a generalised linear model. We have count data for the number of peaks over threshold in each year, so we can assume a Poisson distribution: Y_i ~ Poisson(µ_i), with the natural log link g(µ) = log(µ), giving log(µ_i) = β_0 + β_1 x_{i1} + β_2 x_{i2} + ... + β_k x_{ik} = x_i^T β.
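A minimal sketch of fitting this GLM with a single covariate by Newton-Raphson on the Poisson log-likelihood; the toy counts below are invented. With a binary covariate the MLE has a closed form (the fitted means equal the group means), which makes the fit easy to check.

```python
import math

def poisson_glm(x, y, iters=25):
    # Fit log(mu_i) = b0 + b1 * x_i by Newton-Raphson: score vector g,
    # Fisher information h, and a 2x2 solve per step.
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start near the data
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (-h01 * g0 + h00 * g1) / det
    return b0, b1

# Hypothetical annual exceedance counts for two periods (x = 0 early, 1 late):
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [2, 3, 2, 1, 5, 7, 6, 6]
b0, b1 = poisson_glm(x, y)
```

Here exp(b1) estimates the multiplicative change in event frequency between the two periods.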

How about both? A point process representation. We want something that looks at both the size and the number of exceedances, in the style of the GPD approach. The point process model describes both the magnitude of threshold exceedances and the rate at which the threshold u is exceeded. It is parameterised by three parameters: location, scale and shape.

What are extreme weather indices? These are usually values derived from daily temperature & precipitation data. There is a focus on changes in extremes, rather than the usual changes in means. Examples include the number of days during which temperature exceeds its long-term 90th percentile, or the annual count of days when rain exceeds 10mm.


How can we calculate them? Two ways: Station-level index to gridded index: calculate the index at the station level (e.g. by calculating threshold exceedances, possibly via the GPD or contour sets), then grid these indices using techniques such as cubic spline interpolation. Grid to index: simulate from the posterior distribution having fitted a spatial model, then calculate indices as before. Problem! Stations are not evenly distributed across the global land area, which makes it difficult to accurately compute global averages. A simple average of data from all available stations would be biased toward areas of higher station density.

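A minimal sketch of the two ingredients above: a station-level index (here, the annual count of days with rain over 10mm) and an area-weighted average over grid cells, which avoids the bias of a naive station average toward densely monitored regions. The daily rainfall series and the cell weights are invented.

```python
def days_over_10mm(daily_mm):
    # Station-level index: annual count of days when rain exceeds 10mm.
    return sum(1 for r in daily_mm if r > 10.0)

def grid_average(cell_indices, cell_weights):
    # Area-weighted average of per-cell index values; cell_weights would
    # be grid-cell areas (or cosine-of-latitude weights) in practice.
    total_w = sum(cell_weights)
    return sum(v * w for v, w in zip(cell_indices, cell_weights)) / total_w

idx = days_over_10mm([0.0, 12.0, 5.0, 10.0, 30.0])
avg = grid_average([2.0, 4.0], [1.0, 3.0])
```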

Multiple potential outputs of interest to insurers & the general population. Can be considered as rare event prediction; see the extreme value distributions covered earlier. Predicting when an event is likely to occur based on behaviour prior to the event: when to raise the alarm? How to be aware of risk to life? Very sparse data, e.g. volcanic eruptions, where some volcanoes are very well studied (such as Vesuvius) and many others aren't. Can we apply knowledge across?

Changepoint Analysis Model We often wish to identify times when the probability distribution of a stochastic process or time series changes. Generally, we want to determine whether a change (or changes) has occurred, and to identify the times of any such changes. Definition: Given a dataset X_{1:n} = {X_1, X_2, ..., X_n} of real-valued observations, we wish to find an ideal set of K non-overlapping intervals within which the observations are homogeneous. For K segments, the changepoint model expresses the distribution of X given a segmentation S_{1:n} = {S_1, S_2, ..., S_n} as: P_θ(X_{1:n} | S_{1:n}) = ∏_{i=1}^n β_{S_i}(X_i) = ∏_{s=1}^K ∏_{i: S_i = s} β_s(X_i)

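The segmentation idea can be sketched for the simplest case: a single changepoint in the mean, found by exhaustive search over split points with a within-segment sum-of-squares cost (an illustrative choice of cost, not the β densities above). Binary segmentation applies this search recursively to find multiple changepoints.

```python
def sse(xs):
    # Within-segment sum of squared deviations from the segment mean.
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def best_changepoint(xs, min_seg=2):
    # Exhaustive search for the split minimising total within-segment cost.
    best_tau, best_cost = None, float("inf")
    for tau in range(min_seg, len(xs) - min_seg + 1):
        cost = sse(xs[:tau]) + sse(xs[tau:])
        if cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau, best_cost

tau, cost = best_changepoint([0.0, 0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 5.0])
```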

Hidden Markov Models Definition: A Hidden Markov Model (HMM) is a statistical Markov model in which the system being modelled is assumed to be a Markov process with unobserved, or hidden, states. It is a particular kind of (dynamic!) Bayesian network. A typical HMM is given by the joint probability distribution: P(X_{1:n}, S_{1:n}) = P(S_1) P(X_1 | S_1) ∏_{i=2}^n P(S_i | S_{i-1}) P(X_i | S_i)
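The joint distribution above can be evaluated, summed over all hidden state paths, with the forward algorithm in O(n K^2) time. A minimal sketch with invented two-state probabilities:

```python
def hmm_likelihood(obs, pi, A, E):
    # Forward algorithm: alpha[s] = P(X_{1:t}, S_t = s), updated per step.
    K = len(pi)
    alpha = [pi[s] * E[s][obs[0]] for s in range(K)]
    for x in obs[1:]:
        alpha = [sum(alpha[r] * A[r][s] for r in range(K)) * E[s][x]
                 for s in range(K)]
    return sum(alpha)

pi = [0.5, 0.5]                 # initial state distribution (invented)
A = [[0.7, 0.3], [0.3, 0.7]]    # transition matrix (invented)
E = [[0.9, 0.1], [0.2, 0.8]]    # emission probabilities (invented)
lik = hmm_likelihood([0, 1], pi, A, E)
```

For the two-observation sequence the forward recursion agrees with brute-force summation over the four hidden paths.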

HMM Example Figure: HMM topology with n = 5; S_i are the hidden states and X_i the observed states, i = 1, ..., 5.

HMMs for changepoint analysis We can view changepoint analysis as an HMM where: the data are the observations; the unknown segmentations are the hidden states. HMM adaptations can then identify changepoints as the observations where a switch in hidden states is most likely to occur. HMM approaches are also useful in inferential procedures such as posterior estimation.
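The switch-in-hidden-states idea can be sketched with the Viterbi algorithm: decode the most likely state path, then read changepoints off where the path switches. The two-state probabilities below are invented.

```python
def viterbi(obs, pi, A, E):
    # Most likely hidden state path (max-product dynamic programming).
    K = len(pi)
    delta = [pi[s] * E[s][obs[0]] for s in range(K)]
    psi = []
    for x in obs[1:]:
        step, new_delta = [], []
        for s in range(K):
            best_r = max(range(K), key=lambda r: delta[r] * A[r][s])
            step.append(best_r)
            new_delta.append(delta[best_r] * A[best_r][s] * E[s][x])
        psi.append(step)
        delta = new_delta
    path = [max(range(K), key=lambda s: delta[s])]
    for step in reversed(psi):
        path.append(step[path[-1]])
    path.reverse()
    return path

def changepoints(path):
    # Indices where the decoded hidden state switches.
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]

pi = [0.5, 0.5]
A = [[0.8, 0.2], [0.2, 0.8]]    # sticky transitions (invented)
E = [[0.9, 0.1], [0.1, 0.9]]    # emissions (invented)
path = viterbi([0, 0, 1, 1], pi, A, E)
```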

Alternative: Treat it as an optimal stopping problem! Suppose we have a continuously observable process behaving as a standard Brownian motion up to time τ1, and as a Brownian motion with known drift afterwards. At a stopping time τ2 some observable event occurs, e.g. a volcanic eruption.


Alternative: Treat it as an optimal stopping problem! Løkka (2007) investigates the problem of detecting changes in a system's behaviour prior to the event occurring, taking into account the information given by non-occurrence of the observable event, and also considers when it is favourable to raise the alarm before this event. Løkka (2007) shows that this problem can be reduced to a one-dimensional optimal stopping problem and derives an explicit solution.

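Løkka's explicit solution is specific to that Brownian setup; as a generic stand-in for "raising the alarm" online once a drift appears, a one-sided CUSUM rule can be sketched (this is not Løkka's rule, and the reference value k and alarm threshold h below are invented):

```python
def cusum_alarm(xs, k, h):
    # One-sided CUSUM: S_t = max(0, S_{t-1} + x_t - k); raise the alarm
    # at the first index where S_t >= h. Returns None if never triggered.
    s = 0.0
    for i, x in enumerate(xs):
        s = max(0.0, s + x - k)
        if s >= h:
            return i
    return None

# Increments with zero drift for 10 steps, then drift 1 per step:
alarm = cusum_alarm([0.0] * 10 + [1.0] * 10, k=0.5, h=2.0)
```

The detection delay after the change at index 10 reflects the usual trade-off: raising h reduces false alarms but lengthens the delay.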

How to include historical data All will be revealed at PSS on Thursday!

References V. J. Berrocal, A. E. Gelfand & D. M. Holland (2012) Space-Time Data Fusion Under Error in Computer Model Output: An Application to Modeling Air Quality. Biometrics 68, 837-848. S. Banerjee, B. P. Carlin & A. E. Gelfand (2004) Hierarchical Modeling and Analysis for Spatial Data. Boca Raton, FL: Chapman & Hall/CRC.

References T. M. Luong, V. Perduca & G. Nuel (2012) Hidden Markov Model Applications in Change-Point Analysis. arXiv:1212.1778. R. Killick & I. A. Eckley (2014) changepoint: An R Package for Changepoint Analysis. Journal of Statistical Software 58(3), 1-19. A. Løkka (2007) Detection of Disorder Before an Observable Event. Stochastics: An International Journal of Probability and Stochastic Processes 79(3-4).

References Alexander, L. V., et al. (2006) Global observed changes in daily climate extremes of temperature and precipitation. Journal of Geophysical Research: Atmospheres 111, D5. https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2005jd006290