Physically-Based Statistical Models of Extremes arising from Extratropical Cyclones



Lancaster University
STOR603: PhD Proposal

Physically-Based Statistical Models of Extremes arising from Extratropical Cyclones

Author: Paul Sharkey
Supervisors: Jonathan Tawn, Jenny Wadsworth, Simon Brown

Abstract

Extratropical cyclones are low pressure weather systems in the mid-latitudes that are associated with strong winds and heavy rainfall. Extreme value theory is a statistical field that has often been used to analyse extreme rainfall accumulations and wind speeds, but without incorporating the physical characteristics of extratropical cyclones that generate these extremes. These features are generally spatially heterogeneous and non-stationary in time, so this presents a unique modelling challenge from both a statistical and climatological perspective. This report gives an outline of the various methods that will be used to tackle these problems during PhD research, which aims to combine numerous aspects from extreme value analysis and atmospheric science. The goal is to present a framework that is a statistically consistent representation of extratropical cyclones that incorporates various aspects of cyclone evolution, movement and structure that can be used to predict certain future aspects of cyclone behaviour.

August 30, 2014

Contents

1 Introduction
2 Univariate extreme value theory
    Block maxima approach
    Threshold methods
    Modelling extremes of stationary processes
    Modelling extremes of non-stationary processes
    Generalised additive models
    Random effects
    Simulation study
    Simulating a Poisson process
    Model fitting
3 Bivariate extreme value theory
    Measures of dependence
    Testing dependence of simulated data
    Componentwise block maxima
    Threshold methods
    Conditional approach
4 Extratropical cyclones
    Formation
    Airmasses
    The Norwegian Cyclone Model
    The Shapiro-Keyser Model
    Key features
5 Statistical modelling of extratropical cyclones
    Exploratory data analysis
    Data availability
    Covariate analysis
    Dependence structure
6 Further Work
    Short-term goals
    Data collection
    Spatial extremes
    Random effects
    Long-term goals

1 Introduction

The prevalence of extratropical cyclones in the mid-latitudes is a dominant feature of the weather landscape affecting the United Kingdom. The UK has come to expect a consistent pattern of temperate summers and mild winters on a yearly basis. However, in recent years, this country has been a focus of extreme weather events. This is exemplified by major flood events (as recently as February 2014 in Devon and Cornwall), damaging windstorms (Cyclone Christian, 2013) and other events that have caused mass infrastructural damage, transport chaos and, in some instances, even human fatalities. The ongoing threat of weather systems associated with extratropical cyclones is of great concern to the Met Office and its clients. Accurate modelling and forecasting of extreme weather events related to these cyclones is essential to minimise potential damage, to aid the design of appropriate defence mechanisms that protect human life, and to limit the economic difficulties such an event may cause.

Such events usually manifest as strong winds and heavy rainfall, both of which are examples of extreme events. In this context, an extreme event is one that is very rare, with the consequence that datasets of extreme observations are usually quite small. The statistical field of extreme value theory is focused on modelling such rare events, with the aim of extrapolating the physical process of interest from the observed data to unobserved levels. This allows a rigorous statistical modelling procedure to be followed in spite of the data constraints.

Features of extratropical cyclones have already been analysed using extreme value methods. Wind speed and rainfall accumulations are commonly used datasets for extreme value analysis. However, many of these statistical models fail to incorporate the large-scale characteristics of extratropical cyclones that shape the rate of occurrence and magnitude of extreme wind and rain.
In particular, the physical processes and atmospheric dynamics that drive the evolution and movement of cyclonic behaviour are largely ignored, and as a result, existing models are divorced from the atmospheric activity that is generating the extremes. This is further complicated by the fact that the features of cyclonic behaviour that have the most damaging consequences can be small-scale in nature. Sting jets, for example, produce localised gusts that can cause mass damage on short time scales. However, many of these small-scale features are difficult to observe through both empirical and model-generated measurements, with the consequence that they are difficult to model in practice. It is essential that these physical characteristics of cyclone behaviour are further investigated and incorporated into an extreme value model in order to produce a more realistic, consistent statistical representation of the underlying physical processes. The goal of this PhD research is to develop such a model.

In addition to the atmospheric complexity of extratropical cyclones, model specification is complicated by the fact that these weather systems feature irregularly occurring phenomena with rates and magnitudes that are spatially heterogeneous and non-stationary in time. Previous extreme value analyses have focused on modelling variables such as rainfall at a single location. From a modelling perspective, such univariate methods will not be sufficient to capture the full picture of extremal behaviour associated with extratropical cyclones. It is inevitable that any model for estimating extreme weather will have to incorporate a dependence structure. This is intuitive considering that, for example, the risks of extreme winds at two locations are related if they are situated on the same storm track. Extending this to incorporate spatial variability in the model is crucial due to the clearly spatially heterogeneous behaviour of weather systems.

The impact of covariates will also be a significant area of research. Some covariates such as the North Atlantic Oscillation (NAO) index (see Section 6) are known to be associated with the behaviour of weather systems in the North Atlantic region. However, to develop a model that fully represents the physical processes of the cyclones, an investigation must be carried out to discover relevant covariates related to the structure of the cyclone itself. Spatial modelling is essential as covariates which may impact the overall extreme behaviour of the cyclone may not be significant when analysing remote sites. Random effects modelling over space may also be necessary if the form of the covariate is not yet clear. Other factors to consider would be seasonality and long-term climate change. It is hoped that the complete model will provide a more robust assessment of how the risk of extreme events arising from extratropical cyclones will change over time.

A major short-term component of the PhD research will focus on gathering meaningful data relevant to modelling extremal behaviour of extratropical cyclones. Naturally, the extreme value methods used in such a model will depend on the relevance and quality of the data obtained. With any analysis involving raw weather data, it is natural to expect inaccuracies and gaps in the data due to factors such as unreliable recording instruments.
Measurements recorded at irregularly distributed sites may also distort the spatial structure of the data somewhat. With this in mind, a data assimilation scheme known as reanalysis is used to generate weather observations over fixed time intervals. Observational data are combined with prior information from a forecast model to produce estimates of the state of weather systems. Examples of such reanalysis projects include ERA-40 and ERA-Interim, which are introduced in Section 4. These reanalyses often comprise a system of millions of observations, but it is important not to equate these datasets with reality. These datasets are generated with spatial and temporal resolutions that are constrained by limits of computational power. Model bias may also cause spurious variability and trends to appear in the data. This is further discussed in Section 4.

The structure of the report is as follows. Section 2 consists of an overview of extreme value methods in a univariate context, discussing block maxima and threshold exceedance approaches for modelling the tails of a distribution. In addition, extensions of these models to stationary processes are described. Various methods of incorporating non-stationary components into an extreme value model are also addressed, which is likely to be key to modelling extreme cyclonic behaviour. Section 3 introduces the notion of dependence modelling, describing methods of determining the joint risk of extreme events over multiple variables of interest. Section 4 describes the physical context of the problem in greater detail. An overview is presented of the physical processes that drive and shape the evolution and movement of extratropical cyclones, which is key to developing a statistical representation of the physics that generate extremes from these weather systems. A brief description and exploratory analysis of reanalysis datasets is presented in Section 5. Lastly, Section 6 describes the future direction of PhD research, detailing potential short-term and long-term avenues of interest.

2 Univariate extreme value theory

In practical terms, the importance of analysing and predicting extreme events creates a necessity for a statistically rigorous model of the tail of the distribution of interest. Often of interest are events that occur perhaps once every 100 or 200 years, such as a particularly damaging flood event. However, by definition, observations in the tails are scarce (see Figure 1), and so information regarding unobserved scenarios must be gained using observed data. Extreme value theory focuses essentially on using asymptotic models to extrapolate from observed to unobserved levels.

Many problems arise from estimating the tails using standard modelling approaches. As data are concentrated towards the centre of the distribution, parameter estimates and model fit are driven by these central values. In addition, different models that fit the body of the data well can have very different extrapolations. These issues create the need for a tail model that is not compromised by having to be fitted to the body of the distribution simultaneously.

Figure 1: Density of a normal distribution with few observations in the tails

2.1 Block maxima approach

Consider a set of observations from independent and identically distributed (IID) random variables $X_1, \dots, X_n$ with an unknown distribution function $F$. Define $M_{X,n} = \max(X_1, \dots, X_n)$ to be the maximum of this sequence of random variables. The distribution function of the sample maxima can then be expressed as:

\[
P(M_{X,n} \le x) = P(X_1 \le x, \dots, X_n \le x) = P(X_1 \le x) \cdots P(X_n \le x) = \{F(x)\}^n.
\]

An analogous result for minima can be obtained by defining:

\[
m_{X,n} = \min(X_1, \dots, X_n) = -\max(-X_1, \dots, -X_n) = -M_{-X,n}.
\]

This report focuses on application of extreme value analysis to sample maxima. Results for minima can be obtained using the above identity. Henceforth, $M_n$ and $m_n$ will be used in place of $M_{X,n}$ and $m_{X,n}$ respectively.

The formula for the distribution of maxima is unhelpful in practice as the distributional form of $F$ is typically unknown. One approach is to search for families of models for which the expression $F^n$ converges for the tails of the distribution of $F$. However, $M_n \to x^F$ as $n \to \infty$, where $x^F = \sup\{x : F(x) < 1\}$. In other words, the distribution of $M_n$ degenerates to a point mass on the upper end point of $F$. A method of overcoming this difficulty is to obtain a linear renormalisation of $M_n$ to give a non-degenerate limit distribution. Let $M_n^*$ be defined as:

\[
M_n^* = \frac{M_n - b_n}{a_n},
\]

for sequences of constants $a_n > 0$ and $b_n$, which stabilise the location and scale of $M_n^*$ as $n$ increases, avoiding the issues that arise with the distribution of $M_n$. The Extremal Types Theorem (Leadbetter et al., 1983) states that given appropriate choices of these normalising constants, as $n \to \infty$:

\[
P\left(\frac{M_n - b_n}{a_n} \le x\right) \to G(x),
\]

where $G$ is non-degenerate and is of the same type as one of the following distributions:

\[
\text{Gumbel:}\quad G(x) = \exp\{-\exp(-x)\}, \quad -\infty < x < \infty;
\]

\[
\text{Fréchet:}\quad G(x) = \begin{cases} 0 & x \le 0 \\ \exp\{-x^{-\alpha}\} & x > 0,\ \alpha > 0; \end{cases}
\]

\[
\text{Negative Weibull:}\quad G(x) = \begin{cases} \exp\{-(-x)^{\alpha}\} & x < 0,\ \alpha > 0 \\ 1 & x \ge 0. \end{cases}
\]

The Unified Extremal Types Theorem (UETT) unites these distributions under one parameterisation, the Generalised Extreme Value (GEV) distribution, with distribution function

\[
G(x) = \exp\left\{-\left[1 + \xi\left(\frac{x - \mu}{\sigma}\right)\right]_+^{-1/\xi}\right\},
\]

where $x_+ = \max(x, 0)$ and $\sigma > 0$. The parameters $\mu$, $\sigma$ and $\xi$ are interpreted as the location, scale and shape parameters respectively. The limiting distribution of the normalised maxima $M_n^*$ as $n \to \infty$ is of the same type as a GEV distribution for some value of $\xi$. A Gumbel distribution corresponds to $\xi = 0$, with the feature of an exponential upper tail. A Fréchet distribution corresponds to $\xi > 0$, with a heavy upper tail. A Negative Weibull distribution, for which $\xi < 0$, has the property of a finite upper end point.

Substantial research has gone into the characterisation of the domains of attraction of extreme value limits. Essentially, this involves characterising the set of distributions $F$ for which the normalised maxima converge to an extreme value limit. Alternatively, given a distribution $F$, it involves evaluating the form of the normalising sequences $a_n$ and $b_n$ such that the distribution of normalised maxima converges. The reciprocal hazard function $h$ is defined by:

\[
h(x) = \frac{1 - F(x)}{f(x)}, \quad x_F < x < x^F,
\]

where $f(x)$ is the density function, and $x_F$ and $x^F$ are the lower and upper end points of the distribution respectively. Expressions for $a_n$, $b_n$ and the shape parameter $\xi$ can be formulated as follows:

- $h'(y) \to \xi$ as $y \to x^F$, assuming $h$ is differentiable.
- $b_n$ is such that $1 - F(b_n) = 1/n$.
- $a_n = h(b_n)$.

The GEV distribution is used to model the distribution of maxima. This asymptotic model is used to approximate the distribution of extreme values for finitely many observations $n$, provided $M_n$ is constructed by taking the maximum of sufficiently many observations. The procedure involves partitioning the data into blocks and analysing the maximum observation from each block. The choice of block size is critical for model performance.
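The convergence result and the reciprocal hazard recipe above can be checked numerically in a case where everything is available in closed form. The following is a minimal Python sketch, not part of the report's own software; the function names are illustrative assumptions. For Exp(1) data, $h(x) = 1$, so $\xi = 0$, $b_n = \log n$ and $a_n = 1$, and the exact distribution $\{F(a_n x + b_n)\}^n$ of the normalised maximum should approach the Gumbel limit.

```python
import math

def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """GEV distribution function G(x); xi = 0 is treated as the Gumbel limit."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower end point (xi > 0)
        # or above the upper end point (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def normalised_max_cdf_exp(x, n):
    """P((M_n - b_n) / a_n <= x) for Exp(1) data, with a_n = 1 and b_n = log n."""
    y = x + math.log(n)
    if y <= 0.0:
        return 0.0
    return (1.0 - math.exp(-y)) ** n

def max_gap(n):
    """Largest absolute difference from the Gumbel limit on a grid over [-3, 6]."""
    grid = [k / 10.0 for k in range(-30, 61)]
    return max(abs(normalised_max_cdf_exp(x, n) - gev_cdf(x)) for x in grid)

for n in (10, 100, 10000):
    print(n, max_gap(n))
```

The printed gap shrinks as the block size `n` grows, which is the sense in which the GEV model is an approximation for finite blocks.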
The structure and size of the dataset may indicate natural choices for block size. For example, a rainfall dataset containing 150 years of observations may be partitioned into annual blocks. However, the block size must be large enough for the limit model approximation to hold and small enough to obtain a desirably small estimation variance. The applicability of the GEV distribution is also determined by the flatness of the derivative of the reciprocal hazard.

Numerical methods are required to obtain the maximum likelihood estimates of $\theta = (\mu, \sigma, \xi)$. For more details on the asymptotic properties of these maximum likelihood estimates, see Smith (1985). In practical applications, interest lies in estimating levels that are exceeded with a sufficiently small probability. The return period of level $z$ is defined as the expected waiting time until the level $z$ is next exceeded. The $T$-year return level is defined as the level for which the expected waiting time between exceedances is $T$ years. The $1/p$ return level $z_p$ is the $1 - p$ quantile of the GEV distribution for $0 < p < 1$. By the invariance property, the maximum likelihood estimates can be substituted for the parameters of the GEV distribution to give an MLE for $z_p$, defined as:

\[
\hat{z}_p = \begin{cases} \hat{\mu} - \dfrac{\hat{\sigma}}{\hat{\xi}}\left[1 - \{-\log(1 - p)\}^{-\hat{\xi}}\right] & \text{for } \hat{\xi} \ne 0 \\ \hat{\mu} - \hat{\sigma} \log\{-\log(1 - p)\} & \text{for } \hat{\xi} = 0. \end{cases}
\]

Because of asymptotic normality, the delta method can be used to determine the uncertainty of these estimates. However, this approximation performs poorly when considering return levels corresponding to long return periods that fall beyond the scope of the data. Profile likelihood-based confidence intervals provide a more accurate representation of uncertainty when a strong degree of extrapolation is required.

2.2 Threshold methods

While the block maxima approach is simple to use and interpret, one of its drawbacks is its failure to capture the full behaviour of the tail of a distribution. The model is limited to analysing data selected as the maximum of a pre-selected block, despite the strong possibility of there being other observations in the same block that may be characterised as extreme (see Figure 2). Threshold methods account for the extra tail information in these observations by analysing data above a pre-determined level $u$. This leads to a more efficient modelling procedure.

Let $X_1, X_2, \dots, X_n$ be a sequence of independent and identically distributed random variables, with common marginal distribution function $F$.
Considering some high threshold $u$, the behaviour of extreme events can be characterised by the conditional probability:

\[
P(X > u + y \mid X > u) = \frac{1 - F(u + y)}{1 - F(u)}, \quad y > 0.
\]

If the block maxima are found to follow a GEV distribution, then for large $u$, the distribution function of $Y_u \mid Y_u > 0$, where $Y_u = X - u$, is approximately

\[
H(y) = 1 - \left(1 + \frac{\xi y}{\sigma_u}\right)_+^{-1/\xi}, \quad y > 0.
\]

It follows that $Y_u \mid Y_u > 0$ follows a Generalised Pareto (GP) distribution (Pickands, 1975) with scale parameter $\sigma_u$ and shape parameter $\xi$. Complete and outline proofs of this result can be found in Leadbetter et al. (1983) and Coles (2001) respectively.
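As an illustration of the conditional probability above, the sketch below uses a case where the GP form holds exactly rather than only in the limit. The choice of distribution and all function names are assumptions made for this example. If $X$ has survival function $x^{-\alpha}$ for $x > 1$, then $P(X > u + y \mid X > u) = (1 + y/u)^{-\alpha}$, which is the GP survival function with $\xi = 1/\alpha$ and $\sigma_u = u/\alpha$.

```python
import math

def gp_survival(y, sigma, xi):
    """P(Y > y) for Y ~ GP(sigma, xi) and y >= 0."""
    if abs(xi) < 1e-12:
        return math.exp(-y / sigma)  # exponential limit as xi -> 0
    t = 1.0 + xi * y / sigma
    return t ** (-1.0 / xi) if t > 0.0 else 0.0

def pareto_survival(x, alpha):
    """P(X > x) for a Pareto variable with survival x^(-alpha) on x > 1."""
    return x ** (-alpha) if x > 1.0 else 1.0

def conditional_excess_survival(y, u, alpha):
    """P(X > u + y | X > u), computed directly from the definition."""
    return pareto_survival(u + y, alpha) / pareto_survival(u, alpha)

alpha = 4.0  # illustrative tail index, so xi = 0.25
for u in (2.0, 5.0, 20.0):
    for y in (0.1, 1.0, 10.0):
        direct = conditional_excess_survival(y, u, alpha)
        via_gp = gp_survival(y, sigma=u / alpha, xi=1.0 / alpha)
        print(u, y, direct, via_gp)
```

For other distributions in a GEV domain of attraction the two columns would agree only approximately, with the agreement improving as `u` increases.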

Figure 2: Scatterplot of rainfall accumulations in southwest England (1956-62), showing the data used in the block maxima and threshold exceedance approaches. For the latter, a threshold of $u = 30$ is selected.

Threshold models are alternatively characterised by limiting results from the theory of point processes. Assuming that $F$ is in the domain of attraction of a GEV$(0, 1, \xi)$ distribution and the required normalising constants are $a_n$ and $b_n$, a sequence of point processes $P_n$ can be constructed on $[0, 1] \times \mathbb{R}$ by

\[
P_n = \left\{\left(\frac{i}{n + 1}, \frac{X_i - b_n}{a_n}\right);\ i = 1, \dots, n\right\}
\]

and its behaviour examined as $n \to \infty$. The limit process is non-degenerate as the distribution of the normalised maxima is non-degenerate. Large points of the process are retained in the limit process while small points are normalised to the same value $b_l$, with

\[
b_l = \lim_{n \to \infty} \frac{x_F - b_n}{a_n}.
\]

Under these conditions on $P_n$, on the set $[0, 1] \times (b_l, \infty)$,

\[
P_n \to P \quad \text{as } n \to \infty,
\]

where $P$ is a non-homogeneous Poisson process with intensity function

\[
\lambda(t, x) = (1 + \xi x)_+^{-1 - 1/\xi}.
\]

For a proof of this limit result, the reader is referred to Kallenberg (1983). This result motivates the idea that the behaviour of all threshold exceedances is determined asymptotically by the characteristics of $a_n$, $b_n$ and $\xi$, as with the block maxima approach. However, with the same number of parameters to estimate and a greater availability of extreme data, the threshold approach offers potential efficiency gains. Most importantly, the result motivates the use of the GP distribution as a conditional limit model for excesses of a high threshold.

The focus lies on the distribution of threshold exceedances in the process $P_n$. For any fixed $v > b_l$, let $u_n(v) = a_n v + b_n$. Then as $u_n(v) \to x^F$, letting $x > 0$:

\[
\begin{aligned}
P(X_i > a_n x + u_n(v) \mid X_i > u_n(v)) &= P\left(\frac{X_i - b_n}{a_n} > x + v \,\middle|\, \frac{X_i - b_n}{a_n} > v\right) \\
&= P(\text{a given point in } P_n > x + v \mid \text{a given point in } P_n > v) \\
&\to P(\text{a given point in } P > x + v \mid \text{a given point in } P > v) \\
&= \frac{(1 + \xi(x + v))_+^{-1/\xi}}{(1 + \xi v)_+^{-1/\xi}} \\
&= \left(1 + \frac{\xi x}{\sigma_v}\right)_+^{-1/\xi},
\end{aligned}
\]

where $\sigma_v = 1 + \xi v$. Hence the limiting distribution of a scaled excess

\[
\frac{X_i - u_n(v)}{a_n} \,\Big|\, X_i > u_n(v)
\]

is a generalised Pareto distribution, GP$(\sigma_v, \xi)$. This motivates the use of a GP model as an approximate distribution of excesses above a threshold, $Y_u \mid Y_u > 0$, such that

\[
P(Y_u < y \mid Y_u > 0) = 1 - \left(1 + \frac{\xi y}{\sigma_u}\right)_+^{-1/\xi}, \quad y > 0.
\]

One of the underlying issues in modelling threshold exceedance data is the choice of threshold. The GP distribution has a threshold stability property. This states that if $Y_u \mid Y_u > 0 \sim \text{GP}(\sigma_u, \xi)$ for some high threshold $u$, then for a higher threshold $v \ge u$,

\[
Y_v \mid Y_v > 0 \sim \text{GP}(\sigma_u + \xi(v - u), \xi).
\]

Thus, $\xi$ is invariant to threshold choice, but $\sigma_u$ is not.
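The threshold stability property can be illustrated by simulation. The sketch below is a hedged example with hypothetical function names and parameter values. It draws GP$(\sigma_u, \xi)$ excesses by inverse transform sampling, then compares the sample mean excess above a further level $d = v - u$ with the value implied by GP$(\sigma_u + \xi d, \xi)$, using the standard fact that a GP$(\sigma, \xi)$ variable has mean $\sigma / (1 - \xi)$ when $\xi < 1$.

```python
import random

def simulate_gp(n, sigma, xi, rng):
    """Inverse-transform sampling from GP(sigma, xi); assumes xi != 0."""
    # Solving H(y) = U gives y = (sigma / xi) * ((1 - U)^(-xi) - 1).
    return [sigma / xi * ((1.0 - rng.random()) ** (-xi) - 1.0) for _ in range(n)]

def mean_excess(sample, d):
    """Average of (y - d) over the sample points that exceed d."""
    excesses = [y - d for y in sample if y > d]
    return sum(excesses) / len(excesses)

rng = random.Random(1)   # fixed seed so the sketch is reproducible
sigma_u, xi = 1.0, 0.2   # illustrative values, not taken from the report
sample = simulate_gp(200_000, sigma_u, xi, rng)

for d in (0.0, 0.5, 1.0, 2.0):
    predicted = (sigma_u + xi * d) / (1.0 - xi)  # mean of GP(sigma_u + xi * d, xi)
    print(d, round(mean_excess(sample, d), 3), round(predicted, 3))
```

The predicted mean excess is linear in `d` with slope $\xi / (1 - \xi)$, so the scale grows with the threshold while the shape stays fixed.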

The GP model of the excess variable $Y_u$ is conditional on having observed a threshold exceedance. To obtain a model for the original variable $X$, the rate parameter $\phi_u$ is included in the model, that is

\[
\phi_u = P(Y_u > 0) = P(X > u),
\]

the probability of observing an excess over the threshold $u$. This is estimated simply as the proportion of data that exceed $u$.

The asymptotic approximation of a GP model may not be valid if the threshold is too low, while a threshold that is too high will reduce the size of the dataset, which leads to greater parameter uncertainty. An ideal threshold choice is based on this trade-off between bias and variance. While there are no exact methods for threshold selection, graphical techniques are available to guide selection based on properties of the GP distribution. Such methods include mean residual life plots and parameter stability plots (Coles, 2001). The former is based on the idea that if a GP model is a good fit, then the sample mean excess over a threshold should be linear with respect to the threshold. The latter is based on the idea that $\xi$ and a reparameterised scale parameter $\sigma^* = \sigma_u - \xi u$ are constant with respect to threshold.

The point process framework provides an alternative method to formulate extreme value limit results that unifies the block maxima and threshold exceedance approaches. Let $X_1, X_2, \dots, X_n$ be a series of independent and identically distributed random variables, and let

\[
N_n = \left\{\left(\frac{i}{n + 1}, X_i\right) : i = 1, \dots, n\right\}.
\]

Then for sufficiently large $u$, on regions of the form $(0, 1) \times [u, \infty)$, $N_n$ is approximately a Poisson process, with intensity measure on $A = [t_1, t_2] \times (x, \infty)$ given by

\[
\Lambda(A) = (t_2 - t_1)\left[1 + \xi\left(\frac{x - \mu}{\sigma}\right)\right]^{-1/\xi}.
\]

Assuming the limit process is a reasonable approximation to the behaviour of $N_n$ on $A$, an appropriate likelihood can be derived and maximum likelihood estimates of the parameters $(\mu, \sigma, \xi)$ evaluated.
Multiplying the intensity measure by a factor $n_y$, the number of years of observation, means that the parameters of the point process likelihood will correspond to the GEV distribution of annual maxima. However, because the point process model makes use of all data that are extreme, inferences are likely to be more accurate than estimates based on a direct fit of the GEV distribution to the annual maximum data. The shape parameter of the point process model is equal to that of the threshold exceedance model, while the scale parameters are related through the identity

\[
\sigma_u = \sigma + \xi(u - \mu).
\]

The point process model is advantageous in its parameterisation in terms of the GEV parameters, which are invariant to threshold. This is beneficial when adapting the model to account for non-stationarity by modelling the parameters as functions of covariates. In addition, because the parameters are not threshold-dependent, the model can be adapted to include time-varying thresholds.
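The link between the point process and threshold exceedance parameterisations can be made concrete with a short Python sketch. It is illustrative only: the function names and data values are hypothetical, $\xi \ne 0$ is assumed, and the parameter-free constant $n \log n_y$ is kept in the point process log-likelihood so that the comparison is exact. With the intensity measure scaled by $n_y$, the log-likelihood splits into a Poisson term for the number of exceedances, with mean $n_y [1 + \xi(u - \mu)/\sigma]^{-1/\xi}$, and a GP term for the excesses with scale $\sigma_u = \sigma + \xi(u - \mu)$.

```python
import math

def pp_loglik(mu, sigma, xi, exceedances, u, n_years):
    """Point process log-likelihood on (0, 1) x [u, inf), scaled by n_years."""
    t_u = 1.0 + xi * (u - mu) / sigma if sigma > 0.0 else 0.0
    if t_u <= 0.0:
        return -math.inf
    loglik = -n_years * t_u ** (-1.0 / xi)  # minus the integrated intensity
    for x in exceedances:
        t_x = 1.0 + xi * (x - mu) / sigma
        if t_x <= 0.0:
            return -math.inf
        # Log intensity at each observed point (includes the constant log n_years).
        loglik += math.log(n_years) - math.log(sigma) - (1.0 / xi + 1.0) * math.log(t_x)
    return loglik

def gp_loglik(sigma_u, xi, excesses):
    """GP log-likelihood for threshold excesses y = x - u."""
    return sum(-math.log(sigma_u) - (1.0 / xi + 1.0) * math.log(1.0 + xi * y / sigma_u)
               for y in excesses)

def poisson_count_loglik(n, rate):
    """Poisson log-likelihood for n events, omitting the constant -log(n!)."""
    return -rate + n * math.log(rate)

# Hypothetical values chosen for illustration only.
mu, sigma, xi, u, n_years = 30.0, 8.0, 0.1, 40.0, 10
exceedances = [41.2, 43.5, 47.9, 55.0, 62.3]

sigma_u = sigma + xi * (u - mu)
rate = n_years * (1.0 + xi * (u - mu) / sigma) ** (-1.0 / xi)
lhs = pp_loglik(mu, sigma, xi, exceedances, u, n_years)
rhs = poisson_count_loglik(len(exceedances), rate) + gp_loglik(
    sigma_u, xi, [x - u for x in exceedances])
print(lhs, rhs)
```

The two printed values agree up to rounding, which reflects that both models describe the same exceedances and differ only in parameterisation.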

2.3 Modelling extremes of stationary processes

The models described in the previous sections assume that the random variables of interest are independently and identically distributed. However, in practice, such an assumption is unrealistic. Rainfall data, as an example, exhibit a high degree of temporal dependence: a day of torrential rain is more likely to follow a day of rain than a day of sunshine. Hence, there is a need for a statistically rigorous model that accounts for short-range and long-range temporal dependence between extreme observations.

Rather than independence, the assumption of stationarity is made. A process $\{X_t\}$ is said to be a stationary process if the joint distributions of $(X_{t_1}, \dots, X_{t_k})$ and $(X_{t_1 + \tau}, \dots, X_{t_k + \tau})$ are the same for any $k$, $t_1, \dots, t_k$ and $\tau$. There is a need to limit the amount of long-range dependence between extreme observations. The Asymptotic Independence of Maxima (AIM) condition (O'Brien, 1987) ensures that separated groups of extreme observations become independent when their separation and level are sufficiently large. Let $M_{i,j} = \max(X_i, \dots, X_j)$ and $u_n = a_n x + b_n$ for normalising sequences $a_n$, $b_n$ and any real number $x$. Under the AIM$(u_n)$ condition, there exists a sequence $q_n$ of positive integers with $q_n = o(n)$ such that, over all $i$ and $j$,

\[
\max \left| P(M_{1,i} \le u_n,\ M_{i + q_n,\, i + q_n + j} \le u_n) - P(M_{1,i} \le u_n) P(M_{1,j} \le u_n) \right| \to 0 \quad \text{as } n \to \infty. \tag{1}
\]

The Unified Extremal Types Theorem for stationary sequences says that if this condition holds, if normalising sequences $a_n$ and $b_n$ exist, and if

\[
P\left(\frac{M_n - b_n}{a_n} \le x\right) \to H(x) \quad \text{as } n \to \infty,
\]

where $H$ is non-degenerate, then $H$ is a member of the GEV family of distributions. A measure of short-range extremal dependence, the extremal index $\theta \in (0, 1]$, is defined by

\[
\theta = \lim_{n \to \infty} P(M_{2, p_n} \le u_n \mid X_1 > u_n),
\]

where $p_n = o(n)$. Essentially, $\theta$ represents the limiting probability that the observations following an exceedance of the threshold $u_n$ all fall below that threshold.
Hence, values of $\theta$ close to 1 correspond to weaker dependence, while values closer to 0 correspond to stronger dependence. For an IID process, $\theta = 1$. Provided equation (1) holds and $\theta$ exists, then

\[
H(x) = \{G(x)\}^{\theta},
\]

where $G(x)$ is the limiting distribution under the IID assumption. More details on the extremal index can be found in Leadbetter (1983).

In the threshold exceedance approach, a cluster is defined as a set of points exceeding a threshold $u$ that occur within a short time period of one another. The expected number of exceedances of the threshold $u$ per cluster is $\theta^{-1}$. Cluster maxima are independent and can be modelled using a GP distribution or point process method. Values within the cluster are dependent. Numerous methods have been proposed for identifying independent clusters of extreme values. A selection of these methods can be found in Smith and Weissman (1994), Ledford and Tawn (2003) and Ferro and Segers (2003).

2.4 Modelling extremes of non-stationary processes

Because non-stationarity is a prevalent feature of many physical processes modelled using extreme value methods, a model framework is required that incorporates this feature in a statistically precise manner. Non-stationarity can manifest in a number of ways, the most common being trend and seasonal effects. Rainfall, for example, tends to exhibit a seasonal pattern due to worsening winter weather conditions.

Traditional methods have focused on modelling non-stationary margins directly through the model parameters. In this way, the parameters become functions of covariates, which are easily estimated using the likelihood framework and standard model selection techniques such as the likelihood ratio test. Constraints are imposed such that the scale parameters in both the GEV and GP approaches are positive and that the rate parameter in the GP approach lies in $(0, 1)$. Conditional and marginal return levels can then be evaluated for extrapolation. For a comprehensive overview of this procedure, see Coles (2001). Alternatives to this approach include nonparametric fitting (Hall and Tajvidi, 2000) and preprocessing (Eastoe and Tawn, 2009), which removes the non-identical margins before applying the traditional approach.

Generalised additive models

The traditional approach is conveniently implemented in a linear framework. A generalised additive modelling approach expresses the model parameters as linearly dependent on smooth functions of covariates.
Chavez-Demoulin and Davison (2005) fit a non-homogeneous Poisson process model with parameters $\lambda(t)$, $\xi(t)$ and $\sigma(t)$ such that

\[
\begin{aligned}
\lambda(t) &= \exp\{x^T \alpha + f(t)\}, \\
\xi(t) &= x^T \beta + g(t), \\
\sigma(t) &= \exp\{x^T \gamma + s(t)\},
\end{aligned}
\]

where $\alpha$, $\beta$ and $\gamma$ are parameter vectors and $f$, $g$ and $s$ are smooth functions. Here, the smooth terms are functions of time $t$, but this can be extended to include other covariates. Estimating the rate $\lambda$ involves the use of penalised likelihood estimation. The Poisson process log-likelihood is given by

\[
l_\lambda = \sum_{j=1}^{n} \log \lambda(t_j) - \int_0^{t_0} \lambda(t)\,dt,
\]

which is approximated by

\[
\hat{l}_\lambda = \sum_{k=1}^{m} c_k \log \lambda(k\delta) - \delta \sum_{k=1}^{m} \lambda(k\delta).
\]
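The discretised log-likelihood can be compared with the exact one in a simple case. The following Python sketch rests on assumptions the text does not spell out: $c_k$ is taken to be the number of events falling in the $k$-th interval of width $\delta = t_0 / m$, the rate is the log-linear form $\lambda(t) = \exp(a + bt)$ with no smooth term, and the event times are made up for illustration.

```python
import math

def exact_loglik(events, t0, a, b):
    """Sum of log lambda(t_j) minus the integral of lambda over [0, t0]."""
    integral = (math.exp(a + b * t0) - math.exp(a)) / b  # closed form for b != 0
    return sum(a + b * t for t in events) - integral

def discretised_loglik(events, t0, a, b, m):
    """Approximation using m bins of width delta, evaluated at k * delta."""
    delta = t0 / m
    counts = [0] * m
    for t in events:
        # Index of the bin ((k - 1) * delta, k * delta] containing t.
        k = min(max(math.ceil(t / delta), 1), m)
        counts[k - 1] += 1
    total = 0.0
    for k in range(1, m + 1):
        log_rate = a + b * k * delta
        total += counts[k - 1] * log_rate - delta * math.exp(log_rate)
    return total

events = [0.5, 2.1, 3.3, 7.8, 9.0]  # hypothetical event times
t0, a, b = 10.0, 0.0, 0.1

exact = exact_loglik(events, t0, a, b)
for m in (100, 10000):
    print(m, exact, discretised_loglik(events, t0, a, b, m))
```

The approximation error shrinks roughly in proportion to the bin width, so the choice of `m` trades accuracy against the cost of evaluating the rate on the grid.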

The roughness penalised log-likelihood is defined by

\[
l^*_\lambda = l_\lambda + \rho_\lambda R_\lambda,
\]

where $R_\lambda$ is a parameter roughness penalty. There are numerous ways to define this penalty, one being

\[
R_\lambda = \frac{1}{2} \int_a^b f''(t)^2\,dt. \tag{2}
\]

The value of the roughness coefficient $\rho_\lambda$ is selected using cross-validation to provide good predictive performance. Similarly, the Generalised Pareto model is used to estimate the size of threshold exceedances by maximising the roughness penalised GP likelihood

\[
l^*_{\xi,\sigma} = l_{\xi,\sigma} + \rho_\xi R_\xi + \rho_\sigma R_\sigma,
\]

where $R_\xi$ and $R_\sigma$ are parameter roughness penalties for GP shape and scale respectively, defined similarly to equation (2). Roughness coefficients $\rho_\xi$ and $\rho_\sigma$ are evaluated using cross-validation.

Jonathan et al. (2014b) presented a similar method for estimating a threshold function $\phi$ above which observations are deemed to be extreme. This is done using quantile regression (Koenker, 2005). In particular, estimating $\phi$ requires minimising the quantile regression lack of fit criterion

\[
l_\phi = \tau \sum_{i:\, r_i \ge 0}^{n} |r_i| + (1 - \tau) \sum_{i:\, r_i < 0}^{n} |r_i|,
\]

for residuals $r_i = z_i - \phi_i$, where $\tau$ is the non-exceedance probability given any combination of covariates. The smoothness of the quantile function is regulated by penalising lack of fit for parameter roughness $R_\phi$, that is, by minimising the revised penalised criterion

\[
l^*_\phi = l_\phi + \rho_\phi R_\phi.
\]

A spline modelling approach is used to evaluate the parameter roughness penalties; see Chavez-Demoulin and Davison (2005) and Jonathan et al. (2014b) for more details. Spline representations are also useful in non-stationary conditional extremes modelling based on the approach of Heffernan and Tawn (2004) (see Section 3.4). Penalised likelihood optimisation is performed using a backfitting algorithm (see, for example, Davison (2003)).

Random effects

It is often found that not all of the observed variability in the model parameters is accounted for by traditional regression models or the generalised additive approach. One way to account for this extra variation is to incorporate a random effect term into the model parameters. This is particularly useful when no covariate data are available, or even as part of an investigation into identifying possible covariates that could be of benefit to model fit.

Previous work in the extreme value literature has rarely focused on incorporating random effects into a statistical model. Eastoe and Tawn (2010) include an annual random effect component in the rate parameter of a Poisson process model for the occurrence of flood events. Under a homogeneous Poisson process, the index of dispersion is

\[
D = \frac{\mathrm{Var}(N)}{E(N)} = 1,
\]

where $N$ is the number of events. However, it has been seen that major flood events are overdispersed, that is, $D > 1$, and consequently, extra variation between years cannot be captured by the homogeneous Poisson process model. This is due to the lack of explanatory variables in the model, and thus, the model is incapable of capturing the non-identical margins in the data. Let $N_i$ be the number of events in year $i$. Then consider the following hierarchical model:

\[
N_i \sim \text{Poisson}(\lambda \gamma_i); \qquad \gamma_i \sim \text{Gamma}(1/\alpha, 1/\alpha),
\]

where $\lambda > 0$, $\alpha > 0$ and the $\gamma_i$ are independent and identically distributed. In this model, $E(N_i) = \lambda$ but $\mathrm{Var}(N_i) = \lambda(1 + \lambda\alpha)$, so the index of dispersion is $D = 1 + \lambda\alpha$. Assuming the event peaks are Generalised Pareto distributed, the annual maxima $M_i$ have an extended Generalised Logistic distribution; see Eastoe and Tawn (2010) for further details. The model is extended to include within-year variability and covariates:

\[
\begin{aligned}
N_i &\sim \text{Poisson}(\lambda_i), \quad \text{where } \lambda_i = \sum_{j=1}^{365} \lambda_{ij}, \\
\lambda_{ij} &= \gamma_i\, g(\beta x_{ij}), \\
\gamma_i &\sim \text{Gamma}(1/\alpha, 1/\alpha),
\end{aligned}
\]

where $N_i$ is the number of counts in year $i$, $\lambda_{ij}$ is the probability of there being a peak event on day $j$ in year $i$, $g(\beta x_{ij})$ is a function of covariates and $\gamma_i$ is the random effect. The parameter $\alpha$ quantifies any extra annual variability in the rate which is not explained by the regression part of the model. Thus, the $\gamma_i$ can be interpreted as covariates, on the annual scale, that are unobserved. The model can be further extended to include year-to-year dependence in the random effects.
Such year-to-year dependence, in turn, introduces dependence into the distributions of the counts $N_i$ and the annual maxima $M_i$. See Eastoe and Tawn (2010) for more information on model extensions and a detailed overview of the MCMC procedure used for inference. The random effect component of this model can be interpreted as approximating additional, unobserved covariates. This is useful in that estimates of these random effects can be used in the identification of suitable covariates for the model. This can be helpful for learning about weather extremes in the sense that intense or fast-moving storms may be influenced by an unobserved variable, whose structure can be analysed and matched with a climate process whose properties are known and similar to the estimate of the random effect. A major component of future research will be to uncover ways to extend the concept of random effects to univariate and multivariate modelling of extremal behaviour in extratropical cyclones (see Section 6).

2.5 Simulation study

In this section, a study is presented which illustrates the application of the threshold-based point process model introduced in Section 2.2 to simulated data. An approach for simulating a Poisson process is introduced. This is followed by an overview of the model fitting procedure, implemented using software developed by the author in R and verified using the simulation procedure. In addition, an example is presented to illustrate how likelihood methods can be used to test for nonstationarity in the data using the traditional approach.

2.5.1 Simulating a Poisson process

Consider a two-dimensional non-homogeneous Poisson process with intensity $\lambda(t, x)$ on the set $A = [0, \tau] \times (u, \infty)$, for some finite $u$. For a non-stationary limiting point process, the intensity function $\lambda$ is of the form

$$\lambda(t, x) = \frac{1}{\sigma_t}\left\{1 + \xi_t\left(\frac{x - \mu_t}{\sigma_t}\right)\right\}^{-1/\xi_t - 1}, \qquad (3)$$

for covariate-dependent parameters $\theta = (\mu_t, \sigma_t, \xi_t)$. Let $N(A)$ be the number of points of the Poisson process in the set $A$. A key property of a Poisson process is that $N(A) \sim \text{Poisson}(\Lambda(A))$, where $\Lambda(A)$ is the integrated intensity function

$$\Lambda(A) = \int_0^{\tau}\int_u^{\infty} \lambda(t, x)\,\mathrm{d}x\,\mathrm{d}t. \qquad (4)$$

The density of points in the set $A$ at the point $(t, x)$ is defined as

$$f(t, x) = \frac{\lambda(t, x)}{\Lambda(A)}, \quad \text{for } t \in [0, \tau],\ x \in [u, \infty).$$

Simulating a Poisson process on $A$ corresponds to simulating $N(A) = n_u$ points from this bivariate density.

Recall from probability theory that to simulate from this bivariate distribution, $t$ is first simulated from the marginal $f(t)$, which can be expressed as

$$f(t) = \int_u^{\infty} f(t, x)\,\mathrm{d}x = \frac{\left\{1 + \xi_t\left(\frac{u - \mu_t}{\sigma_t}\right)\right\}^{-1/\xi_t}}{\int_0^{\tau}\left\{1 + \xi_t\left(\frac{u - \mu_t}{\sigma_t}\right)\right\}^{-1/\xi_t}\mathrm{d}t}.$$

A probability integral transform can be used to achieve this. Defining $u \sim U(0, 1)$ (here a uniform variate, not the threshold), the following equation holds:

$$u = F(t) = \int_0^t f(s)\,\mathrm{d}s. \qquad (5)$$

Simulations of $t$ can be found by solving for $t$ in equation (5) using a standard equation solver algorithm. Then for realisation $T = t$, simulate $x$ from the conditional $X \mid T = t$, which is a GP distribution. The set of vectors $\{(t_i, x_i) : i = 1, \ldots, n_u\}$ then represents a two-dimensional Poisson process, with parameters depending on covariates.

2.5.2 Model fitting

Standard maximum likelihood techniques are used to compute parameter estimates for $\theta = (\mu, \sigma, \xi)$. The likelihood function for a Poisson process is defined as

$$L(\theta) = \left\{\prod_{i=1}^{n} \lambda(t_i, x_i)\right\}\exp\{-\Lambda(A)\},$$

where $\lambda(t, x)$ and $\Lambda(A)$ are defined by equations (3) and (4) respectively. Hence, the log-likelihood to be maximised can be expressed as

$$l(\theta) = \sum_{i=1}^{n}\log\left[\frac{1}{\sigma_{t_i}}\left\{1 + \xi_{t_i}\left(\frac{x_i - \mu_{t_i}}{\sigma_{t_i}}\right)\right\}^{-1/\xi_{t_i} - 1}\right] - \int_0^{\tau}\left\{1 + \xi_t\left(\frac{u - \mu_t}{\sigma_t}\right)\right\}^{-1/\xi_t}\mathrm{d}t.$$

Numerical techniques are required to solve this optimisation problem. Difficulties arise, however, in the numerical estimation of the integral component of the log-likelihood function. A common numerical method used to compute this integral is a simple Monte Carlo estimator, specifically

$$\hat{I} = \frac{1}{n}\sum_{i=1}^{n}\left\{1 + \xi_{t_i}\left(\frac{u - \mu_{t_i}}{\sigma_{t_i}}\right)\right\}^{-1/\xi_{t_i}},$$

where $t_i$ is the time of exceedance $i$ and $n$ is the number of exceedances. This is sufficient for the case of stationarity in the data. The absence of covariates in the stationary case means that time points should be uniformly distributed in the data (see Figure 3), and hence, the Monte Carlo estimator should be a valid approximation of the integral. Now, consider the case of a non-stationary trend in the data. For simplicity, assume a linear time trend in $\mu_t$.
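The simulation scheme and the difficulty with the integral can be sketched in a few lines of Python. This is an illustrative sketch rather than the author's R implementation: the time distribution is inverted on a numerical grid instead of with an equation solver, the parameter values are chosen for illustration, and `TAU`, `U` and all function names are assumptions introduced here.

```python
import bisect
import random

# Illustrative parameter values; TAU and U are assumptions made for this sketch only.
MU0, MU1, SIGMA, XI = 100.0, 30.0, 15.0, 0.05
TAU, U = 1.0, 40.0

def g(t):
    """Integrand of Lambda(A): {1 + xi (u - mu_t) / sigma}^(-1/xi), with mu_t = mu0 + mu1 * t."""
    return (1.0 + XI * (U - MU0 - MU1 * t) / SIGMA) ** (-1.0 / XI)

def grid_integral(m):
    """Estimate Lambda(A) by averaging g over m uniform grid midpoints in (0, TAU)."""
    h = TAU / m
    return h * sum(g((i + 0.5) * h) for i in range(m))

def sample_poisson(mean, rng):
    """Draw one Poisson variate by counting unit-rate exponential arrivals in [0, mean]."""
    count, total = 0, rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count

def simulate_process(rng, m=2000):
    """Simulate points (t, x) of the process on [0, TAU] x (U, infinity)."""
    h = TAU / m
    cum = [0.0]                                   # cumulative integral of g on the grid
    for i in range(m):
        cum.append(cum[-1] + h * g((i + 0.5) * h))
    total = cum[-1]                               # approximates Lambda(A)
    points = []
    for _ in range(sample_poisson(total, rng)):
        # Time: invert the piecewise-linear approximation of F(t) at a uniform draw.
        target = rng.random() * total
        j = min(bisect.bisect_right(cum, target) - 1, m - 1)
        t = (j + (target - cum[j]) / (cum[j + 1] - cum[j])) * h
        # Size: X | T = t is generalised Pareto above U with scale sigma + xi (u - mu_t).
        scale = SIGMA + XI * (U - MU0 - MU1 * t)
        x = U + scale / XI * ((1.0 - rng.random()) ** (-XI) - 1.0)
        points.append((t, x))
    return points

rng = random.Random(2014)
points = simulate_process(rng)
n = len(points)
lam_grid = grid_integral(1000)                          # evaluated on a uniform grid
lam_exceed = TAU / n * sum(g(t) for t, _ in points)     # evaluated at exceedance times
print(n, round(lam_grid, 1), round(lam_exceed, 1))
```

Under an increasing trend, the exceedance times cluster where the integrand is largest, so averaging the integrand at those times tends to overstate the integral, whereas the uniform grid does not depend on where the points fall.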
Simulating such a Poisson process gives the plots shown in Figure 4. An adjustment must be made in the formulation of the integral to account for the non-stationarity component, which in this case causes the density of observations to increase with respect to time. The assumption of uniformly distributed time points is therefore no longer valid. The Monte Carlo estimator of the integral is therefore evaluated at user-specified uniform intervals over the time period, which accounts for the trend component of $\mu_t$ in the correct way. In particular, the new estimator $\tilde{I}$ of the integral can be expressed as

$$\tilde{I} = \frac{1}{m}\sum_{i=1}^{m}\left\{1 + \xi_{s_i}\left(\frac{u - \mu_{s_i}}{\sigma_{s_i}}\right)\right\}^{-1/\xi_{s_i}},$$

where $s_i$ represents the $i$th component of the uniform grid in $(0, \tau)$ and $m$ represents the number of intervals specified on the grid.

Figure 3: A scatterplot of a simulated stationary Poisson process with parameters $\theta = (100, 15, 0.05)$ alongside a histogram illustrating the uniformity of the time points.

Figure 4: A scatterplot of a simulated non-stationary Poisson process with parameters $\sigma = 15$, $\xi = 0.05$ as before, but with $\mu_t$ increasing linearly in $t$, alongside a histogram illustrating the clear non-uniformity of the time points.

Because the Poisson process of interest is generated by the user, this provides an opportunity to test the logic of the model fitting arguments outlined in this section. A Poisson process with parameters $\theta = (\mu_0 + \mu_1 t, \sigma, \xi)$ is generated with $\mu_0 = 100$, $\mu_1 = 30$, $\sigma = 15$, $\xi =$

Performing maximum likelihood estimation on the model parameters using the arguments outlined above, parameter estimates and standard errors can be found in Table 1.

Table 1: Table of parameter estimates and standard errors for simulated Poisson process
                  µ0                µ1               σ                ξ
Estimate
Standard Error
95% CI            (61.55, 111.86)   (26.82, 33.57)   (12.16, 30.29)   (0.04, 0.17)

The four model parameters fall within the 95% confidence intervals of their corresponding estimates, calculated using the delta method, suggesting that these estimates fall within a reasonable margin of error. Developing this point process methodology is important when analysing weather extremes. As discussed in Section 2.2, the point process model has numerous advantages over both the block maxima and GP threshold approaches. With weather extremes, using this approach and incorporating more data into the extreme value model increases efficiency and reduces variability in the parameter estimates. In particular, reduced variability is an advantage when modelling extremes of physical processes that are highly variable by their very nature.

3 Bivariate extreme value theory

In many physical applications, the extremal behaviour of one variable may not be sufficient to represent the complexity of the underlying processes. It is therefore prudent to consider the joint extremal properties of multiple variables in order to better approximate this complexity. When using multivariate extreme value theory to model the behaviour of natural systems, it is important to account for spatial and temporal dependence. For example, when analysing rainfall data from two nearby locations, intuition suggests that extreme rainfall at one site could be dependent on extreme rainfall at the other. This dependence may also have a temporal component in the case where the same weather system impacts two locations at two different time points. Dependence must also be modelled when considering the extremes of two variables.
In weather data, rainfall and wind speed are often affected by the same storm event. By introducing a multivariate dependence structure to the modelling procedure, there is scope to produce a model that is a more accurate representation of the physical process than a univariate approach can provide. It is expected that this will manifest in an incorporation of dependence over space, but also perhaps dependence between different physical features of extratropical cyclones (see Section 4) that can be captured in Met Office data. In this section, the concept of bivariate extreme value theory is introduced, which can be easily extended to the multivariate setting.

3.1 Measures of dependence

Consider random variables $(X, Y)$ whose joint distribution function is defined by $F(x, y) = P(X \le x, Y \le y)$. Since this function contains a complete description of dependence between $X$ and $Y$, a common method of exploring this further is to remove the effect of the marginal distributions by transforming the variables onto common margins. The copula function describing dependence between $X$ and $Y$ is given by the function $C$ such that

$$F(x, y) = C\{F_X(x), F_Y(y)\},$$

where $F_X(x) = F(x, \infty)$ and $F_Y(y) = F(\infty, y)$ denote the marginal distributions of $X$ and $Y$. Analysis of dependence concepts can sometimes be more mathematically convenient on certain marginal scales. Scales frequently used for transformation include Uniform, Gumbel, Fréchet and Laplace marginal distributions (see Figure 5). In each case, the copula allows an analysis of dependence between the two variables.

Figure 5: A bivariate normal distribution with $\rho = 0.6$, transformed to Uniform, Fréchet and Gumbel margins.

The study of copulas leads to the formulation of a summary measure of dependence. Assuming a common marginal distribution, this is given by the quantity $\chi$ where

$$\chi = \lim_{z \to z^*} P(Y > z \mid X > z), \qquad (6)$$

where $z^*$ is the upper end point of the distribution. Intuitively, this can be interpreted as the probability of one variable being extreme given that the other is extreme. This leads to the concept of asymptotic dependence, a property where the realisations of the tail components of a random vector occur simultaneously with a high probability. When this scenario is unlikely, the variables are said to be asymptotically independent. To explore how this idea relates to $\chi$, consider the following. By a probability integral transformation, $(X, Y)$ can be transformed to standard Uniform margins $(U, V)$ and equation (6) can be rewritten as

$$\chi = \lim_{u \to 1} P(V > u \mid U > u).$$

Then using the laws of conditional probability and the inclusion-exclusion formula, the following result holds:

$$P(V > u \mid U > u) \sim 2 - \frac{\log C(u, u)}{\log u}. \qquad (7)$$

Hence, defining

$$\chi(u) = 2 - \frac{\log P(U < u, V < u)}{\log P(U < u)}, \qquad (8)$$

it follows that $\chi = \lim_{u \to 1} \chi(u)$. In practice, analysis will often lead to estimates of $\chi = 0$, suggesting asymptotic independence. This result merely captures the behaviour of variables that occur simultaneously and hence there is a need for a second measure that summarises the degree of finite dependence under asymptotic independence. Defining the joint survivor function as $\bar{F}(x, y) = P(X > x, Y > y)$, with $\bar{C}(u, u) = P(U > u, V > u)$ on Uniform margins, the same reasoning as in equations (7) and (8) can be applied. Define

$$\bar{\chi}(u) = \frac{2\log(1 - u)}{\log \bar{C}(u, u)} - 1, \quad \text{for } 0 \le u \le 1,$$

where $-1 < \bar{\chi}(u) \le 1$. Then $\bar{\chi} = \lim_{u \to 1} \bar{\chi}(u)$. For a complete summary of extremal dependence, the pair $(\chi, \bar{\chi})$ is required. The combination $(\chi > 0, \bar{\chi} = 1)$ corresponds to asymptotic dependence, where the value of $\chi$ determines the strength of the dependence. Asymptotic independence, in contrast, is given by the combination $(\chi = 0, \bar{\chi} < 1)$, where $\bar{\chi}$ signifies the strength of finite dependence within this class. Having defined measures of extremal dependence based on limiting values of dependence functions, it is necessary to relate these quantities to the bivariate extreme value theory and modelling procedure. Ledford and Tawn (1996) formulated a flexible model that provided a smooth link between the bounding cases of perfect dependence and perfect independence.
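The functions $\chi(u)$ and $\bar{\chi}(u)$ defined above can be estimated empirically by replacing the probabilities with sample proportions after a rank transformation to Uniform margins. The sketch below is one simple way of doing this under that assumption; the function names, the level $u = 0.95$ and the simulated examples are illustrative choices, and the estimator breaks down if no observations fall in the joint upper tail.

```python
import math
import random

def to_uniform(values):
    """Rank-transform a sample to approximately Uniform(0, 1) margins."""
    n = len(values)
    order = sorted(range(n), key=values.__getitem__)
    ranks = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank / (n + 1)
    return ranks

def chi_and_chibar(x, y, u):
    """Empirical chi(u) and chi-bar(u) at level u, with 0 < u < 1."""
    us, vs = to_uniform(x), to_uniform(y)
    n = len(us)
    p_below = sum(a < u for a in us) / n
    p_both_below = sum(a < u and b < u for a, b in zip(us, vs)) / n
    p_above = sum(a > u for a in us) / n
    p_both_above = sum(a > u and b > u for a, b in zip(us, vs)) / n
    chi = 2.0 - math.log(p_both_below) / math.log(p_below)
    chibar = 2.0 * math.log(p_above) / math.log(p_both_above) - 1.0
    return chi, chibar

rng = random.Random(7)
n = 20000
base = [rng.gauss(0.0, 1.0) for _ in range(n)]
other = [rng.gauss(0.0, 1.0) for _ in range(n)]
rho = 0.6
bvn_y = [rho * a + math.sqrt(1.0 - rho**2) * b for a, b in zip(base, other)]

chi_dep, chibar_dep = chi_and_chibar(base, base, 0.95)    # perfect dependence
chi_ind, chibar_ind = chi_and_chibar(base, other, 0.95)   # independence
chi_bvn, chibar_bvn = chi_and_chibar(base, bvn_y, 0.95)   # bivariate normal
print(chi_dep, chibar_dep, chi_ind, chibar_ind, chi_bvn, chibar_bvn)
```

Perfectly dependent data give both measures equal to 1, while independent data give values near 0 for both; in practice the estimates would be plotted over a range of $u$ approaching 1.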

Consider a pair of random variables $(X, Y)$, with unit Fréchet margins. The joint survivor function of $(X, Y)$ satisfies the asymptotic condition

$$P(X > z, Y > z) \sim L(z) z^{-1/\eta}, \quad \text{for large } z, \qquad (9)$$

where $L(z)$ is a slowly varying function as $z \to \infty$. The parameter $\eta \in (0, 1]$ is the coefficient of tail dependence. If $\eta = 1$ and $L(z) \to c$ as $z \to \infty$, with $0 < c \le 1$, then $(\chi = c, \bar{\chi} = 1)$, and the variables are asymptotically dependent of degree $c$. If $\eta < 1$, then it can be shown that $\bar{\chi} = 2\eta - 1$ and $\chi = 0$, and thus, $(X, Y)$ are asymptotically independent. The parameter $\eta$ has been identified as a pivotal parameter in the characterisation of extremal dependence. Inference on $\eta$ can be made by defining $T = \min(X, Y)$ such that

$$P(T > z) = P(X > z, Y > z) \sim L(z) z^{-1/\eta}, \quad z \to \infty.$$

Thus $\eta$ is the shape parameter of the variable $T$ and so standard univariate techniques can be used to estimate $\eta$. For example, Ledford and Tawn (1996) use a point process model to analyse the extremal behaviour of the structure variable $T$. These techniques are easily extended to the multivariate case where the number of variables is greater than two.

3.1.1 Testing dependence of simulated data

This numerical investigation aims to analyse the extremal dependence properties of simulated data through estimation of the parameter $\eta$. Following the procedure for estimating $\eta$ outlined in Section 3.1, analysis is performed on data simulated from the bivariate normal distribution (BVN) and the bivariate logistic distribution (BVE).

Bivariate Normal distribution

The bivariate normal distribution has the form

$$(X, Y) \sim \text{BVN}\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right),$$

where $\rho$ is a dependence measure between two variables $X$ and $Y$. A value of $\rho = 0$ indicates independence, while values of $\rho = -1, 1$ correspond to perfect negative and positive dependence respectively. Data can be simulated from a bivariate normal distribution by simulating $\tilde{X} \sim N(0, 1)$ and $\tilde{Y} \sim N(0, 1)$, then setting

$$X = \tilde{X}, \qquad Y = \rho\tilde{X} + \tilde{Y}(1 - \rho^2)^{1/2}.$$

First, 1000 samples of size are simulated.
After transformation to Fréchet margins, $\eta$ is estimated at a 90% threshold and the estimates are averaged over the samples. It can be shown that the true value of $\eta$ for a bivariate normal distribution is $(1 + \rho)/2$, which in this case means that $\eta = 0.9$. The mean value of $\eta$ from the estimation procedure is $\hat{\eta} =$ , with 95% confidence bounds of [0.735, 0.965]. This strongly indicates that $\eta < 1$, corresponding to asymptotic independence.
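A single replicate of this kind of experiment can be sketched as follows. Note the substitutions: the Hill estimator is used here as a simple stand-in for the likelihood-based estimation of the shape parameter of $T$ described in the text, the known normal margins are used for the transformation to Fréchet scale, and the sample size `n` is an arbitrary illustrative choice. The value $\rho = 0.8$ is the one implied by $\eta = (1 + \rho)/2 = 0.9$.

```python
import math
import random

def normal_to_frechet(x):
    """Map a standard normal value to the unit Frechet scale, z = -1 / log Phi(x)."""
    upper_tail = 0.5 * math.erfc(x / math.sqrt(2.0))     # 1 - Phi(x)
    return -1.0 / math.log1p(-upper_tail)

def simulate_bvn(n, rho, rng):
    """Simulate n pairs via X = X~, Y = rho * X~ + Y~ * sqrt(1 - rho^2)."""
    pairs = []
    for _ in range(n):
        x_tilde, y_tilde = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        pairs.append((x_tilde, rho * x_tilde + y_tilde * math.sqrt(1.0 - rho**2)))
    return pairs

def hill_eta(pairs, quantile=0.9):
    """Hill estimate of eta from T = min(X, Y) on Frechet margins, capped at 1."""
    t = sorted(min(normal_to_frechet(x), normal_to_frechet(y)) for x, y in pairs)
    threshold = t[int(len(t) * quantile)]
    excesses = [math.log(v / threshold) for v in t if v > threshold]
    return min(sum(excesses) / len(excesses), 1.0)

rng = random.Random(42)
n = 20000                                  # hypothetical sample size
eta_bvn = hill_eta(simulate_bvn(n, 0.8, rng))
eta_ind = hill_eta(simulate_bvn(n, 0.0, rng))
print(f"rho = 0.8: eta estimate {eta_bvn:.3f}; rho = 0: eta estimate {eta_ind:.3f}")
```

At a 90% threshold the Hill estimate carries some bias, since the Fréchet and bivariate normal tails are only approximately of Pareto form at that level, so the estimates should be read as rough indications of $\eta = 0.9$ and $\eta = 0.5$ respectively.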

Bivariate Logistic distribution

Similarly, 1000 samples of size are simulated from a bivariate logistic distribution, with distribution function

$$F(x, y) = \exp\left\{-(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha}\right\},$$

where $x, y > 0$ and $\alpha \in (0, 1]$. Independence corresponds to $\alpha = 1$ and perfect dependence corresponds to $\alpha \to 0$. For the purpose of this simulation, a value of $\alpha = 0.75$ is selected. As in the previous example, the estimate of $\eta$ is evaluated at a 90% threshold and averaged over the samples. The estimate of $\eta$ from this procedure is $\hat{\eta} = 0.963$, with 95% confidence bounds of [0.872, 1], which is in the range of the true value of $\eta$ for the bivariate logistic distribution, $\eta = 1$. All bivariate extreme value distributions, like the logistic distribution, are asymptotically dependent (other than in the case of exact independence), and this estimation procedure supports this result.

3.2 Componentwise block maxima

In the bivariate case, the componentwise block maxima approach is suitable where, for example, only the annual maximum data are available from two locations. Hence, this method is the bivariate extension to the approach introduced in Section 2.1, though extensions to more than two variables are possible. Consider the maxima of a pair of random variables $(X, Y)$, and define $M_{X,n} = \max\{X_1, \ldots, X_n\}$ and $M_{Y,n} = \max\{Y_1, \ldots, Y_n\}$, with $M_n = (M_{X,n}, M_{Y,n})$. Assume $(X, Y)$ have Fréchet marginal distributions. The limiting distribution of the normalised vector $M_n/n$ is non-degenerate, that is

$$P(M_{X,n}/n \le x, M_{Y,n}/n \le y) = \{F(nx, ny)\}^n \to G(x, y) \quad \text{as } n \to \infty,$$

where $G$ has the form

$$G(x, y) = \exp\{-V(x, y)\},$$

where

$$V(x, y) = 2\int_0^1 \max\left(\frac{w}{x}, \frac{1 - w}{y}\right)\mathrm{d}H(w)$$

and $H$ is a distribution function on $[0, 1]$ satisfying the mean constraint

$$\int_0^1 w\,\mathrm{d}H(w) = 1/2.$$

The family of distributions that arise from this limiting result is termed the class of bivariate extreme value distributions. Although this result provides a complete summary of bivariate extreme value distributions, the class of possible limits is wide.
One method is to use parametric sub-families of distributions for $H$, leading to sub-families of distributions for $G$. One standard class is the logistic family:

$$G(x, y) = \exp\{-(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha}\}, \quad x > 0,\ y > 0,$$

where $0 < \alpha \le 1$. $V$ is said to be homogeneous of order $-1$ as for any constant $a > 0$, $V(a^{-1}x, a^{-1}y) = aV(x, y)$. Using the homogeneity property, it can be shown that for the class of bivariate extreme value distributions $\chi = 2 - V(1, 1)$. For the logistic family, $\chi = 2 - 2^{\alpha}$. This gives the parameter $\alpha$ an interpretation as a measure of dependence. When $\alpha = 1$, this corresponds to $\chi = 0$, that is, asymptotic independence. The quantity $V(1, 1)$ can also be used to define another measure relating to extremal dependence. Assuming common Fréchet marginal distributions, the extremal coefficient $\theta$ is defined such that

$$P(X < x, Y < x) = P(X < x)^{\theta}, \quad 1 \le \theta \le 2.$$

Consider a pair of bivariate logistic random variables with Fréchet margins. Then

$$P(X \le u, Y \le u) = \exp\{-V(1, 1)/u\} = [\exp\{-1/u\}]^{V(1,1)} = P(X \le u)^{V(1,1)}.$$

Thus, $\theta = V(1, 1)$, with $\theta = 1$ corresponding to perfect dependence and $\theta = 2$ corresponding to independence. As in the univariate case, the theoretical results for componentwise block maxima are applied to a bivariate extreme value modelling procedure, where the asymptotic arguments are assumed to be exact for a large number of observations $n$. This allows estimation of model parameters $\chi$ and $\bar{\chi}$. Because $\chi(u)$ is constant for any member of this family of distributions, evidence of non-constancy is indicative of a lack of model fit. Likelihood-based methods are commonly used for parameter estimation, while nonparametric methods are also available. As in the univariate case, the componentwise block maxima approach suffers from a failure to capture the full extremal behaviour of the process by only considering the maxima. By considering a threshold-based model, improvements in efficiency and flexibility can be gained.

3.3 Threshold methods

There are substantial efficiency gains when considering more general point process characterisations of extremal behaviour. Let $(X_1, Y_1), (X_2, Y_2), \ldots$
be an independent series of realisations of the random vector $(X, Y)$ with standard Fréchet marginal distributions. Define a sequence of point processes $P_n$ such that

$$P_n = \left\{\left(\frac{X_i}{n}, \frac{Y_i}{n}\right) : i = 1, \ldots, n\right\}.$$

As $n \to \infty$, $P_n \to P$, where $P$ is a Poisson process. The intensity function $\lambda$ of the limiting process is stated upon transformation of the coordinate system to radial and angular components $(R, W)$:

$$\lambda(\mathrm{d}r \times \mathrm{d}w) = \frac{\mathrm{d}r}{r^2}\,2\,\mathrm{d}H(w),$$

where $H$ is the dependence measure of the associated componentwise block maxima vector. The joint tail probability of events follows immediately from this result. If $A \subset B$, where $B = \{(x, y) : x > x_0, y > y_0\}$ for large $x_0$ and $y_0$, then

$$P\{(X, Y) \in tA\} \approx \frac{1}{t}P\{(X, Y) \in A\}.$$

If $(X, Y)$ have Gumbel margins, obtained by taking the logarithm of Fréchet margins, then this translates to

$$P\{(X, Y) \in t + A\} \approx \exp(-t)P\{(X, Y) \in A\}.$$

Difficulties arise in this formulation due to the degeneracy of $H$ in the case of asymptotic independence, which can be avoided by extending the limit process to account for the degree of dependence within the class of asymptotically independent distributions. Following Ledford and Tawn (1997), define another sequence of point processes $P_n$ such that

$$P_n = \left\{\left(\frac{X_i}{n^{\eta}}, \frac{Y_i}{n^{\eta}}\right) : i = 1, \ldots, n\right\},$$

where $\eta$ is the coefficient of tail dependence corresponding to $(X, Y)$ (see equation (9)). $P_n \to P$ as $n \to \infty$, with intensity function defined on the transformed coordinate system as

$$\lambda(\mathrm{d}r \times \mathrm{d}w) = \frac{\mathrm{d}r}{r^{(1+\eta)/\eta}}\,\mathrm{d}H(w).$$

In this case, for $A \subset C$, where $C = \{(x, y) : x + y > r_0,\ w_0 \le x/(x + y) \le 1 - w_0\}$, for large $r_0$ and small $w_0 > 0$, it follows that

$$P\{(X, Y) \in tA\} \approx \frac{1}{t^{1/\eta}}P\{(X, Y) \in A\}. \qquad (10)$$

In Gumbel margins, similarly,

$$P\{(X, Y) \in t + A\} \approx \exp(-t/\eta)P\{(X, Y) \in A\}. \qquad (11)$$

Inference for the limiting Poisson process model as a reasonable approximation for the distribution of observations above a high threshold can proceed in a number of ways. Parametric estimation can proceed with the formulation of a likelihood defined relative to the corresponding intensity function. Alternative nonparametric procedures have also been proposed (de Haan and de Ronde, 1998). The aim of these methods is to map nonparametric estimates of probabilities within a set $A$ of observed data to sets $t + A$, $tA$ that may contain no observed data. In particular, this extrapolation, using equations (10) and (11), allows the formulation of probabilities of extreme events.
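The extrapolation step in equation (11) amounts to a single rescaling of a probability estimated within the data. As a hedged illustration, the sketch below applies it to two independent Gumbel variables, for which $\eta = 1/2$ and the exact joint exceedance probabilities are available for comparison. The joint-exceedance set $A = \{x > v, y > v\}$, the values of `v` and `t`, and the function names are assumptions made for this example.

```python
import math

def extrapolate_gumbel(p_a, t, eta):
    """Equation (11): P{(X, Y) in t + A} is approximated by exp(-t / eta) * P{(X, Y) in A}."""
    return math.exp(-t / eta) * p_a

def gumbel_survivor(x):
    """P(X > x) for a standard Gumbel variable, 1 - exp(-exp(-x))."""
    return -math.expm1(-math.exp(-x))

v, t, eta = 4.0, 3.0, 0.5                         # hypothetical level, shift and eta
p_a = gumbel_survivor(v) ** 2                     # exact P(X > v, Y > v) under independence
p_shifted_exact = gumbel_survivor(v + t) ** 2     # exact P(X > v + t, Y > v + t)
p_shifted_approx = extrapolate_gumbel(p_a, t, eta)
print(p_shifted_exact, p_shifted_approx)
```

In an application, `p_a` would be an empirical proportion and `eta` an estimate, so the accuracy of the extrapolated probability depends on both.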

3.4 Conditional approach

A key issue with the standard multivariate extreme value approach concerns the estimation of probabilities for sets $A$ that are not simultaneously extreme in each component, as arises when the random variables of interest are asymptotically independent. This is evident as the empirical estimate of the mapped probability is likely to be 0 since the mapped data are unlikely to fall in the sets $t + A$, $tA$. The conditional approach of Heffernan and Tawn (2004) is applicable whether the variables are asymptotically independent or asymptotically dependent. For the purpose of this report, the bivariate case is considered. Consider a pair of random variables $(X, Y)$ transformed onto Gumbel margins and the limiting behaviour of the conditional distribution of $Y$ given $X$, that is, $P(Y \le y \mid X = x)$. Assume that there exist normalising functions $a(x)$ and $b(x)$, both $\mathbb{R}_+ \to \mathbb{R}$, which can be chosen such that, for all fixed $z$ and for any sequence of $x$-values such that $x \to \infty$,

$$\lim_{x \to \infty} P(Z \le z \mid X = x) = G(z), \qquad (12)$$

for

$$Z = \frac{Y - a(x)}{b(x)},$$

where the limit distribution $G$ is non-degenerate. Using this assumption, the result follows that, conditionally on $X > u$, as $u \to \infty$ the variables $X - u$ and $Z$ are independent in the limit. To illustrate this, let $x = u + y$ with $y > 0$ fixed; then

$$P(Z \le z, X - u = y \mid X > u) = P(Y \le a(u + y) + b(u + y)z \mid X = u + y)\,\frac{f_X(u + y)}{P(X > u)} \to G(z)\exp(-y) \quad \text{as } u \to \infty.$$

The limiting marginal distributions of $X - u$ and $Z$ are exponential and $G$ respectively. As in the univariate case, the normalising functions $a(x)$ and $b(x)$ must be defined in terms of characteristics of the conditional distribution of $Y \mid X$. The limit distribution $G$ is unique up to type, so the normalising functions are identified up to a constant. The following result from Heffernan and Tawn (2004) describes how these functions can be derived analytically. Suppose that a pair of random variables $(X, Y)$ has an absolutely continuous joint density.
If the functions $a(x)$ and $b(x) > 0$ satisfy the property (12), then these functions satisfy the following properties up to type:

$$\lim_{x \to \infty} F\{a(x) \mid x\} = p,$$

where $p$ is a constant in the range $(0, 1)$, and

$$b(x) = h\{a(x) \mid x\}^{-1},$$

where $F$ is the conditional distribution and $h$ is defined as the conditional hazard function. A detailed proof is given in Heffernan and Tawn (2004), along with some theoretical examples. In Keef et al. (2013), a number of problems with the Heffernan and Tawn (2004) approach are identified that have been found to limit the utility of the method. Complications arise with modelling variables with some components that are positively associated and others negatively associated. There are also issues with parameter identifiability and inferences that are inconsistent with the marginal distributions. In order to overcome the first problem, the variables of interest are transformed onto Laplace margins rather than Gumbel margins, such that:

$$X_L = \begin{cases} \log\{2F_X(X)\} & \text{for } X < F_X^{-1}(0.5); \\ -\log\{2[1 - F_X(X)]\} & \text{for } X \ge F_X^{-1}(0.5), \end{cases}$$

and

$$Y_L = \begin{cases} \log\{2F_Y(Y)\} & \text{for } Y < F_Y^{-1}(0.5); \\ -\log\{2[1 - F_Y(Y)]\} & \text{for } Y \ge F_Y^{-1}(0.5). \end{cases}$$

Then $(X_L, Y_L)$ has Laplace marginal distributions with

$$P(X_L < x) = \begin{cases} \exp(x)/2 & \text{if } x < 0; \\ 1 - \exp(-x)/2 & \text{if } x \ge 0, \end{cases}$$

and

$$P(Y_L < y) = \begin{cases} \exp(y)/2 & \text{if } y < 0; \\ 1 - \exp(-y)/2 & \text{if } y \ge 0. \end{cases}$$

A Laplace transformation captures the exponential tails of a Gumbel distribution while also having a symmetry property that allows for negatively associated variables to be incorporated into the model parsimoniously. Using these transformations motivates the use of a single class of normalising functions of the form

$$a(x) = \alpha x \quad \text{and} \quad b(x) = x^{\beta},$$

with $(\alpha, \beta) \in [-1, 1] \times (-\infty, 1)$. This is a unified class representing both positively and negatively associated variables. Values of $\alpha = 1$, $\beta = 0$ indicate asymptotic dependence; otherwise the variables are asymptotically independent. For asymptotically independent distributions, values of $0 < \alpha \le 1$ correspond to positive extremal dependence and $-1 \le \alpha < 0$ to negative extremal dependence. In Jonathan et al.
(2014a), an extension to this model is introduced facilitating general nonstationary conditional extremes inference using spline representations of model parameters with respect to covariates, similar to the approach described in Section

4 Extratropical cyclones

In recent years, the occurrence of extratropical cyclones in the North Atlantic Ocean has caused incidents of severe weather over Western Europe. For example, in October 2013, Cyclone Christian caused 18 fatalities, widespread damage and mass destruction across Western Europe. This catastrophic event was estimated to have cost the insurance industry an aggregated total of €1.094 billion in the countries affected. While the Met Office was successful in forecasting this extreme weather event, there remains a need to relate the complexities of the physical features of extratropical cyclones to a statistical context. This is due to the underlying stochasticity in cyclonic behaviour that makes it a mortal and economic threat. This section introduces the key ideas behind the evolution of extratropical cyclones based on long-standing physical models, while providing a brief overview of the application of Extreme Value Theory to characterising the extreme behaviour of this weather phenomenon.

Figure 6: The tracks in the mid-latitudes in the winter of 1987/1988, in which The Great Storm caused mass damage in France, Great Britain and Ireland.

4.1 Formation

Extratropical cyclones are low pressure weather systems that occur in the middle latitudes of the earth, and are primarily associated with stormy weather with strong winds and heavy precipitation. A cyclone centre moves along a path known as a track (see Figure 6), which is often the centre of a region affected by severe cyclonic weather. Many factors affect the movement of the track, and consequently, the complexity of quantifying the intensity and severity of extratropical cyclones makes their occurrence difficult to predict. Extreme behaviour in extratropical cyclones is usually identified in data by unusually high observations of wind speed, rainfall accumulations and/or vorticity. In order to develop statistical methods to estimate these quantities, a knowledge of the physical processes behind the formation and development of extratropical cyclones is required.

4.1.1 Airmasses

An airmass is defined as a large body of air whose physical properties are approximately uniform horizontally over a large area of space. Airmasses can be characterised by their temperature, simply as hot or cold. Airmasses move away from their source region because of the differences in temperature between the poles and the equator. Because of this, warm and cold airmasses tend to move and interact, potentially altering their properties as a consequence. A key component of this movement is the jet stream, a narrow band of air where wind speed is at its maximum. It has been found that many day-to-day weather variations are associated with the formation and movement of boundaries, or fronts, between different airmasses.

4.1.2 The Norwegian Cyclone Model

The Norwegian Cyclone Model (Bjerknes and Solberg, 1922) led the meteorological research into the behaviour of weather systems at fronts between airmasses. These fronts were classified into four types:

Cold front - Cold air advancing onto warm air.
Warm front - Warm air advancing onto cold air.
Stationary front - Neither airmass advances.
Occluded front - A cold front overtaking a warm front.

An extratropical cyclone forms when the interface between the warm and cold airmasses develops into a wave form with its apex located at the centre of the low-pressure area (see Figure 7). Precipitation and wind are characteristic of the locations of the warm and cold fronts. This idealised model is generally characteristic of oceanic extratropical cyclone formation, but analyses following cyclone formation over land found substantial departures from the Norwegian model.

4.1.3 The Shapiro-Keyser Model

The ideology behind the Norwegian model was born from analysis of surface weather maps over Europe in a time before routine air observations began.
In recent years, due to inconsistencies between data and the Norwegian model, revisions have been made to the original configuration, such as the Shapiro-Keyser model (Shapiro and Keyser, 1990). As with the Norwegian cyclone model, an incipient cyclone develops cold and warm fronts, but in this case, the cold front moves roughly perpendicular to the warm front such that the fronts never meet, the so-called T-bone (see Figure 7). This is followed by seclusion, the mature phase of the cyclone life-cycle, which may result in hurricane winds and torrential rain. Not all extratropical cyclones originate as frontal waves. Some begin as tropical cyclones before moving into the mid-latitudes, where different types of behaviour have been observed; see Barry and Chorley (2009) for a detailed overview.

Figure 7: From Schultz et al. (1998): (a) Norwegian cyclone model: (I) incipient frontal cyclone, (II) and (III) narrowing warm sector, (IV) occlusion; (b) Shapiro-Keyser cyclone model: (I) incipient frontal cyclone, (II) frontal fracture, (III) frontal T-bone and bent-back front, (IV) frontal T-bone and warm seclusion.

4.2 Key features

As discussed in Section 4.1.1, the motion of an extratropical cyclone is steered essentially by a jet stream. In the North Atlantic, the variation in the strength and location of the jet stream is related to the North Atlantic Oscillation (NAO) (Hurrell et al., 2003), which essentially measures the degree to which tracks shift to the north or south of Western Europe. The NAO, roughly speaking, is the pressure gradient between two large scale pressure cells over the Atlantic Ocean, the Icelandic low and the Azores high. A positive NAO index brings strong westerly winds, pushing the track of precipitation further north, resulting in cool summers and mild, wet winters in Northern Europe. A negative NAO index brings cold, dry winters to Northern Europe and cyclonic activity with warm temperatures to the Mediterranean region. This project aims to explore the dependence structure between the various measures of cyclonic intensity and the NAO. A key issue with modelling extratropical cyclones is that some of their small-scale features can potentially have the most damaging effects. This is apparent in the occurrence of sting jets. Sting jets are a sequence of forceful winds that occur at the tail of the head of the cyclone (see Figure 8) in a localised region sometimes spanning only 50 kilometres across (Baker, 2009). This results in gusts that are generally stronger than those located on the warm and cold fronts. They originate in the upper air before accelerating downwards at high speeds. The Great Storm of 1987 is an example of a cyclone with a sting jet.
Challenges to consider when modelling sting jets include the lack of historical data due to the rarity of the phenomenon. However, it is vital to be able to account for the influence of sting jets due to the threat that they pose. This project aims to explore ways to incorporate small-scale features, such as sting jets, into the model without compromising model validity.

Figure 8: From Baker et al. (2014): A conceptual picture of a sting jet cyclone, featuring warm and cold air conveyor belts and sting jet component.

As with any weather system, a key area of interest is the effect of climate change on the location and severity of extratropical cyclones. The consensus among climate models is that tracks will shift slightly poleward in response to increases in greenhouse gases, in line with the change in jet streams. In addition, it is predicted that while there will be a reduction in total storm numbers, there will be a higher occurrence of intense cyclone activity (Ulbrich et al., 2009).

4.3 Statistical modelling of extratropical cyclones

Extreme Value Theory has been used in recent years to model intense extratropical cyclone events. Because of the complexity of the cyclone system, authors have used different features of the weather system when performing statistical inference on the process. Lionello et al. (2008) fit a GEV distribution to monthly pressure minima derived from three different climate models over the entire North Atlantic domain. In two scenarios, it is projected that North Atlantic regions will suffer worsening winters and milder summers, which is consistent with the predicted effects of the northward shift of tracks caused by climate change. Della-Marta and Pinto (2009) fit a GP distribution to extreme wind intensity measurements in three non-overlapping regions of Europe. By this model, the frequency of cyclone occurrence is predicted to increase in Western Europe but remain the same in North Atlantic regions. Each model is formulated without accounting for the spatial and temporal variation in the extremes. The model of Sienz et al.
(2010) uses a GP distribution to fit a tail model to sufficiently extreme values of geopotential height, mean horizontal gradient, cyclone depth and relative vorticity, each a measure of cyclone intensity. This model incorporates trends in time and NAO, finding that the probability of extreme cyclonic activity in the North Atlantic increases in months with a positive NAO phase. However, this model fails to account for dependence 29

32 between measures of cyclone intensity. Spatial variability is also ignored. Bonazzi et al. (2012) uses a bivariate extreme value copula (see Section 3.1) to analyse the tail dependence of wind intensity between pairs of locations. Four nodes over Europe are defined, and the measure of dependence χ is used as a probability of a storm hitting node B given that it hits node A. The tail dependence exhibits stronger coupling in the zonal direction, which is consistent with the dominant west-east track of extratropical cyclones. Economou et al. (2014) specify a Bayesian hierarchical procedure (outlined in Davison et al. (2012)) in which a point process conditional model on pressure minima is used due to the model parameters being invariant to threshold, unlike the GP distribution. The model includes spatial random effects and time-dependent covariates in the model parameters, extending the work of Cooley and Sain (2010). A Bayesian hierarchical framework is preferred due to its flexibility compared to max-stable processes and the natural inclusion of physical mechanisms in the model (see Section 6 for a thorough model description). 5 Exploratory data analysis 5.1 Data availability From a statistical perspective, the methods selected to model extreme behaviour of extratropical cyclones largely depend on the quality of available data. Practical difficulties arise with weather data collection. For example, data collected at one site is not necessarily reflective of the behaviour of the weather system in a spatial grid centred at that site. To counter this issue, reanalysis is used to produce reliable datasets for climate modelling and research. Reanalysis data are produced with a sequential data assimilation scheme, advancing forward in time cycles of a pre-determined length. In each cycle, available observations are combined with prior information from a forecast model to estimate the evolving state of the global atmosphere and its underlying surface (Dee et al., 2011). 
Variational analysis is performed on the basic upper-air atmospheric fields and other factors such as soil moisture, soil temperature, snow and ocean waves. These analyses are then used to initialise a short-range model forecast, which provides the prior state estimates needed for the next analysis cycle. The strength of this data assimilation means that global datasets are readily available with consistent spatial and temporal resolutions. In recent years, model resolution and bias correction techniques have steadily improved. Reanalysis also incorporates millions of observations into a single system, a volume that would be impossible for an individual to collect and analyse separately. Despite these clear advantages, care must be taken not to equate reanalysis datasets with reality. For example, the data assimilation system can introduce spurious variability and trends into output due to model and observational bias.

Examples of reanalysis projects include ERA-40, ERA-Interim, MERRA and JRA. This report focuses on ERA-40 and ERA-Interim data. ERA-Interim was designed to address several problems that arose from the ERA-40 project, such as the representation of the hydrological cycle and technical issues such as data selection and quality control techniques. Hence, ERA-Interim outputs have been used in analysis stretching back to 1979, before which ERA-40 measurements are still incorporated into analyses. The difference in data structure between the two projects is illustrated in the example in Figure 9. This shows the effect of the new reanalysis project on data output after

Figure 9: A collection of wind speed measurements (knots) from the ERA-40 project (red) and the ERA-Interim project (green).

This report will consist of analyses featuring two datasets from the ERA-Interim reanalysis. The first dataset contains track data over 34 consecutive springs. The identification and tracking of the cyclones is performed following the approach used in Hoskins and Hodges (2002), which uses relative vorticity to identify and track the cyclones at 6-hourly intervals. The second dataset contains monthly maxima of wind and rain over gridded regions in the United Kingdom in the period . Wind speed is measured in knots and rainfall is measured in millimetres. Regression and interpolation are used to generate values on a regular 25 km × 25 km grid, taking into account factors such as position, terrain shape and coastal influence, among others. A visual representation of this dataset is shown in Figure 10. This dataset is accompanied by measures of the NAO index (see Section 4.2) for each month of interest.

The objective of this section is to explore the data and apply methods of extreme value analysis to these datasets. Firstly, covariate methods are applied to the gridded data to establish evidence of non-stationarity in the physical processes, using information from the track data and the NAO. Then, a further analysis is performed on the spatial dependence structure in the gridded data.
5.2 Covariate analysis

A key feature of this PhD project will be to identify covariates linked to extremal behaviour in intensity and movement of extratropical cyclones. In this report, an exploratory study is presented analysing the effect of three covariates on monthly maxima of wind and rain in the UK. The value of NAO for each month is readily available for analysis. Here, two other covariates are presented which are constructed from the track data. For the purpose of this investigation, analysis is performed at one location, specifically the grid box containing the city of Lancaster.

Figure 10: Maximum wind speed in knots (left) and rainfall accumulations in millimetres (right) in gridded regions over the UK in December 2012.

Firstly, the data are filtered to retain only the tracks that pass within a certain radius of the UK. The first covariate $d$ is defined, for every month, as the minimum distance between the centre of the grid box (for which latitude and longitude data are available) and the nearest track. This covariate is selected because, intuitively, one would expect extreme wind speed and rainfall accumulations to increase as the distance to the storm centre decreases. The second covariate $v$ is the corresponding vorticity value at the point of minimum distance on the track. Because these values are only available at 6-hourly intervals, linear interpolation is used to obtain the value of interest. Again, one would expect extreme wind speed and rainfall accumulations to increase as the vorticity at the nearest track increases. A visual representation of these covariates is shown in Figure 11.

Another factor to consider is the effect of seasonality on the extremal behaviour of wind and rain. Naturally, one would expect to see more intense storms in the winter months than in any other season. In any case, it may not be valid to assume a constant effect throughout the year. As shown in Figure 12, median rainfall and wind speed are higher in autumn and winter than in spring and summer, with winter rainfall being more variable. The plots therefore indicate that a seasonal component is necessary in the model.
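As a rough illustration of how the covariates d and v defined earlier in this section could be computed from track data, the sketch below measures the great-circle distance from a grid-box centre to points along each track and linearly interpolates vorticity between consecutive 6-hourly positions. It is a minimal sketch and not the procedure used to produce the results in this report: the function names, the sub-sampling of each track segment, the approximate coordinates for Lancaster and the example tracks are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def closest_approach(site, track, steps=60):
    """Return (d, v) for one track.

    site  : (lat, lon) of the grid-box centre, in degrees
    track : list of (lat, lon, vorticity) at consecutive 6-hourly times
    steps : number of sub-intervals per 6-hour segment (hypothetical choice)
    """
    best_d, best_v = math.inf, None
    for (lat0, lon0, v0), (lat1, lon1, v1) in zip(track, track[1:]):
        for k in range(steps + 1):
            w = k / steps  # fraction of the way along the segment
            lat = lat0 + w * (lat1 - lat0)
            lon = lon0 + w * (lon1 - lon0)
            dist = haversine_km(site[0], site[1], lat, lon)
            if dist < best_d:
                # Vorticity is interpolated linearly with the same weight.
                best_d, best_v = dist, v0 + w * (v1 - v0)
    return best_d, best_v


def monthly_covariates(site, tracks):
    """Minimum distance over all tracks in a month, with its vorticity."""
    return min((closest_approach(site, t) for t in tracks), key=lambda dv: dv[0])


# Hypothetical example: approximate Lancaster coordinates and two made-up tracks.
lancaster = (54.05, -2.80)
tracks = [
    [(58.0, -20.0, 5.0), (59.0, -10.0, 7.0), (60.0, 0.0, 6.0)],
    [(52.0, -15.0, 4.0), (53.0, -5.0, 8.0), (55.0, 3.0, 6.0)],
]
d, v = monthly_covariates(lancaster, tracks)
print(f"d = {d:.1f} km, v = {v:.2f}")
```

Interpolating latitude and longitude linearly is itself an approximation to the true path between 6-hourly positions; a finer `steps` value only refines the search along that assumed path.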
However, for the purpose of this exploratory investigation, extremal behaviour in spring months is explored as an isolated study. Since monthly maxima are being analysed, a GEV model is suitable. For the purpose of this analysis, a constant shape parameter $\xi$ is assumed.

Figure 11: A map of the UK showing the location of interest (in red), the minimum distance d between this location and a track and the corresponding vorticity value v, obtained by linear interpolation of the vorticity data over 6-hourly intervals.

The initial model for both variables is taken to be $\mathrm{GEV}(\mu_t, \sigma_t, \xi_t)$, where

$$\mu_t = \mu_0 + \mu_1 d + \mu_2 v + \mu_3\,\mathrm{NAO},$$
$$\sigma_t = \exp(\sigma_0 + \sigma_1 d + \sigma_2 v + \sigma_3\,\mathrm{NAO}),$$
$$\xi_t = \xi.$$

A backward selection likelihood ratio test procedure is implemented. Backward selection involves reducing the number of parameters in the model by one each time until all remaining covariates are statistically significant. Given the log-likelihood $\ell_0$ of the null model $H_0$ and the log-likelihood $\ell_A$ of the alternative model $H_A$, the likelihood ratio statistic $D$ is defined as

$$D = 2(\ell_A - \ell_0).$$

Additional parameters always increase the maximised log-likelihood; the test checks that the increase is large enough to justify them, which guards against overfitting. Under $H_0$, the test statistic approximately follows a $\chi^2_f$ distribution, where $f$ is the difference in the number of parameters between the null and alternative models.

Figure 12: Boxplots of monthly maximum rainfall accumulations (top) and wind speeds (bottom) over all seasons in the period

By this process, the following model is deemed the best fit to the wind speed data:

$$\mu_t = \mu_0 + \mu_1 d, \qquad \sigma_t = \sigma, \qquad \xi_t = \xi,$$

with parameter estimates and standard errors summarised in Table 2. Figure 13 shows the effect of values of covariate $d$ on return levels. The graph shows that the expected time interval between extreme observations becomes longer as the distance between the location and the track increases. Hence, the probability of extreme wind speeds is larger closer to the track, which makes intuitive sense.

Parameter        µ0             µ1               σ              ξ
Estimate
Standard Error
95% CI           (7.21, 9.65)   (-1.79, -0.33)   (2.89, 3.97)   (-0.29, -0.01)

Table 2: Table of parameter estimates, standard errors and 95% confidence intervals for the GEV wind speed model
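The likelihood ratio comparison and the dependence of return levels on d can be illustrated with a short sketch. The code below is an illustration under stated assumptions rather than the analysis code behind Table 2: it handles only the case f = 1 (one parameter removed per step, as in the backward selection described above), it uses the standard GEV quantile formula, and the log-likelihoods and parameter values are hypothetical numbers chosen for illustration, not the fitted estimates.

```python
import math


def lr_test_one_df(loglik_null, loglik_alt):
    """Likelihood ratio statistic D = 2(l_A - l_0) and its chi-squared(1) p-value."""
    d_stat = 2.0 * (loglik_alt - loglik_null)
    # With one degree of freedom, P(chi2_1 > D) = erfc(sqrt(D / 2)).
    p_value = math.erfc(math.sqrt(max(d_stat, 0.0) / 2.0))
    return d_stat, p_value


def gev_return_level(mu, sigma, xi, p):
    """Level exceeded by a block (here monthly) maximum with probability p."""
    y = -math.log(1.0 - p)
    if abs(xi) < 1e-12:  # Gumbel limit as xi -> 0
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Hypothetical maximised log-likelihoods for two nested models.
d_stat, p_value = lr_test_one_df(loglik_null=-250.0, loglik_alt=-246.5)
print(f"D = {d_stat:.2f}, p = {p_value:.4f}")

# Hypothetical parameters for the model mu_t = mu0 + mu1 * d with constant sigma and xi.
mu0, mu1, sigma, xi = 8.4, -1.0, 3.4, -0.15
for dist in (0.5, 2.0, 4.0):  # hypothetical values of the covariate d
    level = gev_return_level(mu0 + mu1 * dist, sigma, xi, p=0.01)
    print(f"d = {dist}: level exceeded with probability 0.01 per month = {level:.2f}")
```

With a negative coefficient on d, the return level for a fixed exceedance probability falls as d grows, which is the pattern described for Figure 13.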

Figure 13: Return levels of wind speed corresponding to the best fitting GEV model, conditional on the minimum (black), mean (green) and maximum (red) values of the covariate d.

Applying a GEV model to the rainfall data, following the same procedure as before, gives a constant model as the model of best fit. While this is inconsistent with the results of the wind speed analysis, it is important to consider the effect of using monthly data. As the track data show, storms tend to last for only a few days, and several large storms can occur in one month, so monthly maxima will not capture the full picture of extreme cyclone activity. As explained in Section 6, daily maxima will be available at the PhD stage, which should result in an analysis that is more representative of the extremal characteristics of wind and rain. Further analysis will also aim to incorporate information from other locations in the UK. While covariates are not significant in this analysis, a more systematic relationship is expected once a broader spatial model is developed. A brief introduction to spatial extremes modelling is given in Section 6.

5.3 Dependence structure

Because an analysis of wind and rain at one point is not sufficient to represent the extremal behaviour of the overarching weather system, a preliminary analysis of the dependence structure between locations is presented here. In particular, 10 grid-boxes in a west-east direction and 10 grid-boxes in a north-south direction are chosen and the dependence within the two sets is compared (see Figure 14). The aim is to investigate whether prevailing westerly winds over the UK have any effect on dependence between locations. Because the spatial resolution of the wind measurements is such that neighbouring grid-boxes in the west-east direction have identical values for wind speed, this analysis will focus on the rain variable only.

Figure 14: The grid-boxes chosen for a dependence study in the west-east direction (red) and north-south direction (yellow) (image courtesy of Google Maps)

An initial check concerns how dependence changes as the distance between grid-boxes increases. To illustrate this, the Kendall tau correlation coefficient (Kendall, 1938) is calculated for three pairs of locations in the west-east direction, (1, 2), (1, 4) and (1, 10), which result in respective correlation coefficients of 0.681, and (see Figure 15).

Figure 15: Rainfall accumulations in locations (1, 2) (black), (1, 4) (red) and (1, 10) (green).

To analyse the pattern of extremal dependence, the dependence parameter $\eta$ is estimated for pairwise locations in both the west-east and north-south grid-boxes. The estimation procedure follows the outline in Section 3.1. Figure 16 shows the estimates of $\eta$ plotted against distance between locations. Distance is defined in terms of number of grid-boxes.

Figure 16: Estimates of dependence parameter η changing with respect to distance for the north-south grid-boxes (left) and the west-east grid-boxes (right), with estimates testing negative for asymptotic dependence shown in blue.

For each estimate of $\eta$, a hypothesis test is performed to check whether the parameter estimate is significantly different from $\eta = 1$, the case of asymptotic dependence. As one would expect, the evidence for asymptotic dependence tends to weaken as the distance between locations increases. Also of interest is the generally stronger asymptotic dependence between locations in the west-east direction: analysis of the locations in the north-south direction yields more rejected hypotheses, that is, more evidence of asymptotic independence. This indicates that the prevailing westerly winds have some impact on the dependence structure between locations. Section 6 gives a brief overview of how this way of thinking can be extended to the spatial extremes methodology, in particular to a class of processes that can incorporate both asymptotic independence and asymptotic dependence in the model.

6 Further Work

The previous chapters have introduced univariate and multivariate extreme value theory, followed by an overview of the physical processes which define the evolution and movement of extratropical cyclones and a preliminary data analysis. As outlined in Section 1, the goal of the PhD research is to apply methods from extreme value theory to develop a consistent model that accounts for the physical characteristics of these storms. This section features an outline of the short-term and long-term plans to achieve this.

6.1 Short-term goals

Data collection

One immediate objective is to explore the Met Office datasets further to gain more insight into the extreme behaviour of extratropical cyclones. This would include incorporating seasonality into the model to account for weather variations at different times of the year. Using monthly maxima as in Section 5.2 is problematic when dealing with limited datasets, as not enough data may be available to justify an asymptotically motivated model. Further analysis will take place using daily maximum wind speed and rainfall accumulations.

It is important to consider the change of structure shown in Figure 9, caused by the introduction of a new reanalysis scheme, and its effect on extreme value analysis. It is clear from the plot that at least a location or scale change has occurred in the data. As a preliminary theory as to how to incorporate this change into the model, consider the following. Assume that pre-change random variables $Z_1, \dots, Z_n$ are IID with mean $a$ and variance $b^2$ and post-change random variables $\tilde{Z}_1, \dots, \tilde{Z}_n$ are IID with mean $c$ and variance $d^2$. Assume that other features of the distribution, such as skewness, are not subject to change. Then, in theory, the normalised $Z$ should have the same distribution as the normalised $\tilde{Z}$, that is:

$$\frac{Z - a}{b} \overset{d}{=} \frac{\tilde{Z} - c}{d}.$$

Rearranging this expression gives

$$\tilde{Z} \overset{d}{=} \frac{d}{b}(Z - a) + c.$$

Assuming that the maximum $M_n$ of $Z_1, \dots, Z_n$ follows a $\mathrm{GEV}(\mu, \sigma, \xi)$ distribution, it would be interesting to discover the equivalent distribution of the maximum $\tilde{M}_n$ of $\tilde{Z}_1, \dots, \tilde{Z}_n$:

$$\begin{aligned}
P(\tilde{M}_n \le x) &= P(\tilde{Z}_1 \le x, \dots, \tilde{Z}_n \le x) \\
&= P\left(\frac{d}{b}(Z_1 - a) + c \le x, \dots, \frac{d}{b}(Z_n - a) + c \le x\right) \\
&= P\left(Z_1 \le \frac{b(x - c)}{d} + a, \dots, Z_n \le \frac{b(x - c)}{d} + a\right) \\
&= P\left(M_n \le \frac{b(x - c)}{d} + a\right) \\
&= \exp\left\{-\left[1 + \frac{\xi}{\sigma}\left(\frac{b(x - c)}{d} + a - \mu\right)\right]^{-1/\xi}\right\} \\
&= \exp\left\{-\left[1 + \frac{\xi b}{\sigma d}\left(x - c + \frac{da}{b} - \frac{d\mu}{b}\right)\right]^{-1/\xi}\right\}.
\end{aligned}$$

This gives the extreme value parameters of the post-change maxima in terms of the parameters of the pre-change maxima:

$$\tilde{\xi} = \xi, \qquad \tilde{\sigma} = \beta\sigma, \qquad \tilde{\mu} = \alpha + \beta\mu,$$

where $\alpha = c - da/b$ and $\beta = d/b$. Exploring whether this formulation can be applied to the Met Office datasets will be a task in the short term. This will allow the change in structure to be incorporated into the model, producing more meaningful parameter estimates, which may otherwise be distorted by this change.

Any future analysis should account for spatial variability, in contrast to the analysis in Section 5.2, which dealt with a single location. It would be interesting to discover, in particular, whether covariate significance changes with location. With increased spatial and temporal resolution, more information will be available to draw more definitive conclusions. It would also be interesting to source some pure observational data from the Met Office to which the Poisson process methodology developed in Section 2.5 can be applied.

In addition, sourcing more data from the Met Office with regard to the physical processes controlling the evolution and movement of extratropical cyclones should aid the discovery of covariates related to extreme wind and rainfall. As a starting point, the procedure to obtain the covariate $d$ in Section 5.2 could be made more exact by using cubic splines to interpolate over the tracks. Immediate analysis will focus on quantifying covariates related to the speed and age of the cyclone at a given point in space. As well as data exploration, further discussion and collaboration with the Met Office is necessary to determine likely factors linked to extreme wind speed and rainfall accumulations.

Spatial extremes

Because the physical processes associated with extratropical cyclones are spatial in extent, spatial modelling of extremes will be necessary to capture the full extremal properties of these weather systems. In the case where few observations are available, incorporating data from nearby locations into the model can help in efficient parameter estimation.
There are a number of methods in the extreme value literature to approach this problem.

Bayesian hierarchical models

Bayesian hierarchical modelling is a common approach for specifying extreme value models over continuous space. In this setting, dependence is introduced by integration over spatial latent variables or processes. Hence, spatial variation can be introduced in the parameters (Davison et al., 2012). Following the example in Coles (2001), consider observations of annual maxima $X(r_i)$ at a set of locations $r_1, \dots, r_k \in \mathcal{R}$. A simple model is

$$X(r_i) \mid (\mu_1, \dots, \mu_k, \sigma, \xi) \sim \mathrm{GEV}(\mu_i, \sigma, \xi),$$

independently for $r_1, \dots, r_k$, where $\mu_1, \dots, \mu_k$ are the realisations of a smoothly varying random process $\mu(r)$ observed at $r_1, \dots, r_k$ respectively. Because $\mu(r)$ varies smoothly, nearby values of the $\mu(r_i)$ are more likely to be similar than distant values. Hence, values of $X(r)$ are more likely to be similar at nearby locations. An important assumption in the Bayesian hierarchical modelling approach is that of conditional independence of the extremes.

An advantage of this approach is its flexibility in terms of incorporating spatial random effects and covariates into the model. To illustrate this, the model of Economou et al. (2014) is presented, which extends the work of Cooley and Sain (2010) to the extreme behaviour of extratropical cyclones. In particular, pressure minima are taken as a variable representing cyclonic intensity. As in Cooley and Sain (2010), a Poisson process model is used because of its strengths with regard to model flexibility and efficiency. Let $X(s, t)$ be the depth in grid cell $s \in S$ at time $t \in T$, where $S$ and $T$ are the domains of space and time respectively. The model is described as follows:

$$X(s, t) \mid \theta_\psi(s), \beta_2(s) \sim \mathrm{PP}(\mu(s, t), \sigma(s, t), \xi(s)),$$
$$\mu(s, t) = \beta^{\mu}_0 + \beta^{\mu}_1 z_1(t) + \beta_2(s) z_2(t) + \theta_\mu(s),$$
$$\log \sigma(s, t) = \beta^{\sigma}_0 + \beta^{\sigma}_1 z_1(t) + \theta_\sigma(s),$$
$$\xi(s) = \beta^{\xi}_0 + \theta_\xi(s),$$

for $\psi = \mu, \sigma, \xi$, where $z_1$ is the latitude of the occurrence, $z_2$ is the NAO index and $\theta_\mu(s)$, $\theta_\sigma(s)$ and $\theta_\xi(s)$ are spatial random effects. Spatial random effects define spatial variability in $\mu$, $\log(\sigma)$ and $\xi$ across the cells after allowing for covariates. These random effects are modelled using an intrinsic autoregressive (IAR) spatial model; for more details see Cooley and Sain (2010). In this model, the NAO parameter also varies with space. Prior distributions for the model parameters are chosen and an MCMC procedure is used to sample from the posterior distributions.

It would be interesting to apply this formulation to other measures of cyclonic intensity, specifically measures of wind speed and rainfall accumulations. Covariates obtained from the physical structure of the cyclone could be incorporated in further work and used to investigate whether any underlying spatial variability exists.
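To make the structure of these linear predictors concrete, the sketch below evaluates μ(s, t), σ(s, t) and ξ(s) for a single grid cell. It is only an illustration of how covariates and spatial random effects enter the parameters: the coefficient values, the random effects and the function name are hypothetical, and no prior specification or MCMC sampling is attempted.

```python
import math


def pp_parameters(beta, theta, z1, z2):
    """Evaluate (mu, sigma, xi) for one grid cell s at one time t.

    beta  : regression coefficients; beta["nao"] plays the role of beta_2(s)
    theta : spatial random effects theta_mu(s), theta_sigma(s), theta_xi(s)
    z1    : latitude of the occurrence
    z2    : NAO index
    """
    mu = beta["mu0"] + beta["mu1"] * z1 + beta["nao"] * z2 + theta["mu"]
    log_sigma = beta["sigma0"] + beta["sigma1"] * z1 + theta["sigma"]
    xi = beta["xi0"] + theta["xi"]
    # The log link keeps the scale parameter positive.
    return mu, math.exp(log_sigma), xi


# Hypothetical coefficients and random effects for a single cell.
beta = {"mu0": 20.0, "mu1": 0.1, "nao": 1.5, "sigma0": 1.0, "sigma1": 0.01, "xi0": -0.1}
theta = {"mu": 0.8, "sigma": -0.05, "xi": 0.02}

for nao in (-1.0, 0.0, 1.0):
    mu, sigma, xi = pp_parameters(beta, theta, z1=55.0, z2=nao)
    print(f"NAO = {nao:+.1f}: mu = {mu:.2f}, sigma = {sigma:.2f}, xi = {xi:.2f}")
```

In this form the NAO index shifts only the location parameter, by an amount that differs between cells through β_2(s), while the scale and shape respond to latitude and to the cell-specific random effects.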
Max-stable models

A common approach to modelling spatial extremes is to use max-stable processes. Max-stable processes are the extension of multivariate extreme value theory to the infinite-dimensional setting. Consider a random process $X(r)$ having continuous sample paths. If, for sequences of continuous functions $a_n(r) > 0$ and $b_n(r)$, as $n \to \infty$,

$$\left\{ \max_{i=1,\dots,n} \frac{X_i(r) - b_n(r)}{a_n(r)} \right\}_{r \in \mathbb{R}^d} \overset{d}{\to} \{Z(r)\}_{r \in \mathbb{R}^d},$$

where the $X_i$ are independent replications of $X$ and $Z$ is non-degenerate, then $Z(r)$ is a max-stable process. By this result, max-stable processes are the limits of pointwise maxima in the same way that the GEV family is the limit distribution of block maxima. There are two specific ways of characterising max-stable processes.

Smith model

First, consider

$$Z(r) = \max_i \{\xi_i f(s_i, r)\},$$

where $\{(\xi_i, s_i) : i \ge 1\}$ are the points of a Poisson process on $(0, \infty) \times S$ with intensity measure $\xi^{-2}\,d\xi \times \nu(ds)$ and $f$ is a probability density function on $S$. The process $Z$ is a max-stable process with unit Fréchet margins. A physical interpretation of this characterisation is described in Smith (1991). The set $S$ can be regarded as a space of storm centres, and $\nu$ is a measure which represents the distribution of storms over $S$. Each $\xi_i$ then represents the intensity of a storm and the function $f$ determines the profile of the storm. Hence, $\xi_i f(s_i, r)$ represents the size of the storm at position $r$ from a storm of size $\xi_i$ centred at location $s_i$. Letting $S = \mathbb{R}^d$ and $f$ be a multivariate normal density with zero mean and covariance matrix $\Sigma$ gives the Smith model (Smith, 1991). Based on these assumptions, the joint distribution function at two sites $r_1$ and $r_2$ is given by

$$P(Z(r_1) \le z_1, Z(r_2) \le z_2) = \exp\left\{ -\frac{1}{z_1}\Phi\left(\frac{a}{2} + \frac{1}{a}\log\frac{z_2}{z_1}\right) - \frac{1}{z_2}\Phi\left(\frac{a}{2} + \frac{1}{a}\log\frac{z_1}{z_2}\right) \right\},$$

where $\Phi$ is the standard normal distribution function and $a$, defined by $a^2 = (r_1 - r_2)^T \Sigma^{-1} (r_1 - r_2)$, is the Mahalanobis distance between $r_1$ and $r_2$.

Schlather model

Another useful characterisation of max-stable processes is presented in Schlather (2002). Let $\{\xi_i\}_{i=1}^{\infty}$ be the points of a Poisson process on $\mathbb{R}_+$ with intensity measure $\xi^{-2}\,d\xi$. Let $\{W_i\}_{i=1}^{\infty}$ be independent replications of a positive random process $W$ having continuous sample paths such that $E[W(r)] = 1$ for all $r \in \mathbb{R}^d$. Then Schlather defines

$$Z(r) = \max_i \xi_i W_i(r).$$

$Z(r)$ is a stationary max-stable process on $\mathbb{R}^d$ with unit Fréchet margins. The Schlather model defines $W(r) = \sqrt{2\pi}\,\max\{0, \varepsilon(r)\}$, where $\varepsilon$ is a standard Gaussian process.
This leads to the bivariate distribution function

$$P(Z(r_1) \le z_1, Z(r_2) \le z_2) = \exp\left[ -\frac{1}{2}\left(\frac{1}{z_1} + \frac{1}{z_2}\right)\left(1 + \sqrt{1 - \frac{2\{1 + \rho(h)\} z_1 z_2}{(z_1 + z_2)^2}}\right) \right],$$

where $\rho(h)$ is the correlation function of the Gaussian process at lag $h = r_1 - r_2$.

Because of the strong spatial component of physical processes associated with extratropical cyclones, an expansive review of the methodology of spatial extremes is required in the short term. A useful task would be to run simulations of the Smith and Schlather models using the SpatialExtremes package in R. Further work will also involve investigations into other max-stable processes, including the Brown-Resnick process (Kabluchko et al., 2009) and the process used in Davison and Gholamrezaee (2012) to fit models to extreme temperature data.
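Before running full simulations, the bivariate distribution functions of the Smith and Schlather models can be checked numerically. The sketch below evaluates both and recovers the extremal coefficient θ from the relation P(Z(r_1) ≤ z, Z(r_2) ≤ z) = exp(−θ/z). It assumes that a is the Mahalanobis distance between the two sites and that ρ is the correlation of the underlying Gaussian process at the relevant lag. The function names and input values are illustrative, and the code is plain Python rather than a use of the SpatialExtremes package.

```python
import math


def std_normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def smith_cdf(z1, z2, a):
    """Bivariate distribution function of the Smith model, a > 0."""
    t1 = std_normal_cdf(a / 2.0 + math.log(z2 / z1) / a) / z1
    t2 = std_normal_cdf(a / 2.0 + math.log(z1 / z2) / a) / z2
    return math.exp(-(t1 + t2))


def schlather_cdf(z1, z2, rho):
    """Bivariate distribution function of the Schlather model."""
    inside = 1.0 - 2.0 * (1.0 + rho) * z1 * z2 / (z1 + z2) ** 2
    root = math.sqrt(max(inside, 0.0))  # guard against rounding below zero
    return math.exp(-0.5 * (1.0 / z1 + 1.0 / z2) * (1.0 + root))


def extremal_coefficient(cdf_at_z, z):
    """Solve P(Z1 <= z, Z2 <= z) = exp(-theta / z) for theta."""
    return -z * math.log(cdf_at_z)


z = 5.0  # any positive level gives the same theta
for a in (0.5, 2.0, 6.0):  # hypothetical Mahalanobis distances
    theta = extremal_coefficient(smith_cdf(z, z, a), z)
    print(f"Smith, a = {a}: theta = {theta:.3f}")
for rho in (0.9, 0.5, 0.0):  # hypothetical correlations
    theta = extremal_coefficient(schlather_cdf(z, z, rho), z)
    print(f"Schlather, rho = {rho}: theta = {theta:.3f}")
```

Setting z_1 = z_2 in the two formulas gives θ = 2Φ(a/2) for the Smith model and θ = 1 + √((1 − ρ)/2) for the Schlather model, so the printed values can be compared against these closed forms.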

A key point to note is that the Smith and Schlather models are justified asymptotically for modelling spatial extremes under asymptotic dependence and perfect independence. In practice, however, it may be difficult to detect whether a dataset should be modelled using a model for asymptotic dependence or one for asymptotic independence. An ideal model is one that is asymptotically dependent over short distances, with weakening dependence as the distance increases, and asymptotically independent over longer distances, which would be consistent with findings from the dependence analysis in Section 5.3.

Wadsworth and Tawn (2012) present a hybrid spatial extremes model, which is a mixture model of asymptotically dependent and asymptotically independent processes. Let $X(x)$ be a max-stable process, with extremal coefficient function $\theta(h)$ defined by

$$P(X(x_1) < z, X(x_2) < z) = \exp\{-\theta(h)/z\},$$

where $h$ is the distance between locations $x_1$ and $x_2$. Let $Y(x)$ be an asymptotically independent spatial process, with coefficient of tail dependence function $\eta(h)$ defined by

$$P(Y(x_1) > z, Y(x_2) > z) = L(z; h)\, z^{-1/\eta(h)}.$$

Assuming each process has unit Fréchet margins, then for $a \in [0, 1]$,

$$H(x) = \max\{a X(x), (1 - a) Y(x)\}$$

is a spatial process with unit Fréchet margins and bivariate joint survivor function

$$P(H(x_1) > z, H(x_2) > z) = \frac{a(2 - \theta(h))}{z} + \frac{(1 - a)^{1/\eta(h)}}{z^{1/\eta(h)}} + O(z^{-2})$$

as $z \to \infty$. If there exists a finite $h^{*} = \inf\{h : \theta(h) = 2\}$, then the process $H(x)$ is asymptotically dependent up to distance $h^{*}$ and asymptotically independent for longer distances. While the model is flexible, the freedom of choice for $a$ could result in a model lacking in structure. This PhD project aims to impose constraints on the mixing probability $a$ by incorporating physical knowledge of the extratropical cyclone structure over multiple sites.
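The behaviour of this survivor function can be explored numerically. The sketch below evaluates only its two leading terms, taking the slowly varying function L to be identically 1 and ignoring the O(z^{-2}) remainder (both simplifying assumptions), and divides by the unit Fréchet marginal survivor probability to approximate the conditional probability P(H(x_1) > z | H(x_2) > z). The function name and parameter values are hypothetical.

```python
import math


def hybrid_chi(z, a, theta, eta):
    """Approximate P(H(x1) > z | H(x2) > z) from the two leading terms.

    a     : mixing parameter in [0, 1]
    theta : extremal coefficient of the max-stable component at this lag
    eta   : coefficient of tail dependence of the other component at this lag
    """
    dependent_part = a * (2.0 - theta) / z
    independent_part = ((1.0 - a) / z) ** (1.0 / eta)
    marginal = -math.expm1(-1.0 / z)  # P(H > z) for unit Frechet margins
    return (dependent_part + independent_part) / marginal


# Hypothetical values: a short lag (theta < 2) and a long lag (theta = 2).
for label, theta in (("short lag", 1.5), ("long lag", 2.0)):
    values = [hybrid_chi(z, a=0.5, theta=theta, eta=0.6) for z in (10.0, 1e3, 1e6)]
    print(label, [round(v, 4) for v in values])
```

When θ(h) < 2 the ratio settles near a(2 − θ(h)) as z grows, which corresponds to asymptotic dependence; when θ(h) = 2 only the second term remains and the ratio decays towards zero.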
By incorporating asymptotically independent processes into the model, one hopes to capture the full picture of extreme cyclonic behaviour over the North Atlantic.

Random effects

As discussed in Section 2.4.2, random effects modelling has rarely been used in the extreme value literature. However, the model of Eastoe and Tawn (2010) shows that this method can be useful in the absence of available data relating to covariates, and even as a diagnostic tool for covariate selection. In light of this, short-term research will focus on extending this principle to models for size of exceedance, through random effects and/or covariates in the parameters of a GP model. Because weather extremes are associated with high variability due to the complexity of atmospheric processes, it is expected that random effects will account for some of the extra variation not captured by the regression model. In addition, the distribution of the random effect may help to identify new covariates by matching its behaviour with that of a known climate process. By obtaining data related to this climate process, model fit can be improved and a more parsimonious representation of weather extremes can be achieved. It is hoped that this could be further extended to the parameters of a GEV distribution. Investigating temporally and spatially dependent random effects may also be of interest in the long term. In the immediate term, however, it will be necessary to create a Bayesian modelling framework in order to carry out initial investigations.

6.2 Long-term goals

As discussed in Section 1, the long-term aim of this PhD research is to develop a statistical model of extremes arising from extratropical cyclones that is a valid representation of the physical processes that generate these extremes. While short-term goals will focus mainly on developing extreme value methodology for tackling this problem, in the long term it will be necessary to investigate further the structure of the cyclone from a climate science perspective. This should generate an increased understanding of the physics that drives and shapes the evolution and movement of extratropical cyclones. While the details in Section 4 are a good starting point, a broader knowledge is necessary in order to develop a model that is fully representative of these atmospheric processes. For example, the UK floods of 2007 (see Figure 17) were possible due to slack large-scale flow and the resulting slow-moving cyclonic activity. In addition, sting jets (see Section 4) can cause widespread damage but are difficult to observe in datasets and to model in practice. For now, it is envisaged that simulated sting jet events can be generated by ensemble-type forecasting (Zhu, 2005) to improve their representation in the data. Part of this investigation will involve examining whether there are larger-scale factors influencing the occurrence of sting jets, which may prove easier to include in the model.
The long-term aim is to identify these phenomena on weather charts and incorporate their effects into a physically and statistically consistent extreme value model. This will involve exploring wind and rain generating processes separately and jointly, using data where available to draw conclusions regarding their effects on extreme behaviour of these variables. It would also be interesting to explore any possible differences in the extremal behaviour in the wind and rain associated with the different types of fronts and using some theory similar to that described in Section to account for these differences in a parsimonious way. Ways of incorporating macro-characteristics of the storm into the model, such as speed or shape of the cyclone, will also need to be explored.

From a data perspective, the numerical models discussed in Section 5 give a complete set of observations on which to build a model. A vital long-term goal will be to calibrate model output with observational site data. Once an understanding of the spatial extremes methodology has been developed in the short term, the long-term aim is to apply these techniques to Met Office datasets in order to obtain a more accurate representation of the spatial variability of extreme wind speeds and rainfall accumulations. It is hoped that model output will be of a high resolution in space and time, which may involve pooling information from the many different reanalysis projects introduced in Section 5. The extreme features of cyclones could be further modelled by extending the spatial extremes methodology to the multivariate setting, where extreme rainfall accumulations and wind speeds can be modelled jointly over space. This would involve exploring new methods for incorporating covariates into a spatial and multivariate framework. Spatial random effects modelling could be used to reveal a dependence on covariates that are insignificant in analysis of a single site. Track data could be used to model the spatial occurrence of the storm in the North Atlantic, which could be related to the impact of extreme wind speeds and rainfall.

Figure 17: Rainfall accumulations (mm) in the UK during the flood events of July 2007.

In the long term, it is hoped that a combined model will be developed that captures both the broad spatial and temporal aspects of extratropical cyclones through a framework that combines components of multivariate and spatial extreme value theory, incorporating non-stationarity in the form of known covariates and random effects. It is hoped that this spatio-temporal model can provide a more robust assessment of future risk associated with extratropical cyclones. Estimates of future risk can be improved by jointly assessing changes in intensity, frequency and spatial distribution, with the aim of improving the signal-to-noise ratio of any future changes. In the long term, the aim is that this study will provide tools that allow the significantly improved risk assessments required by policy makers and clients of the Met Office for both current and future risk of wind and rainfall extremes arising from extratropical cyclones.

References

Baker, L. (2009). Sting jets in severe northern European wind storms. Weather, 64(6):

Baker, L., Gray, S., and Clark, P. (2014). Idealised simulations of sting-jet cyclones. Quarterly Journal of the Royal Meteorological Society, 140(678):

Barry, R. G. and Chorley, R. J. (2009). Atmosphere, Weather and Climate. Routledge.

Bjerknes, J. and Solberg, H. (1922). Life Cycle of Cyclones and the Polar Front Theory of Atmospheric Circulation. Grondahl.
