On the Application of the Generalized Pareto Distribution for Statistical Extrapolation in the Assessment of Dynamic Stability in Irregular Waves

Bradley Campbell 1, Vadim Belenky 1, Vladas Pipiras 2
1. David Taylor Model Basin (NSWCCD), USA
2. University of North Carolina at Chapel Hill, USA

* Corresponding author: Bradley Campbell; research field: dynamic stability, stochastic dynamics of ships. E-mail: bradley.campbell@navy.mil. The authors wish to thank Dr. Pat Purtell and Dr. Ki-Han Kim at the Office of Naval Research for their support of this work. Participation of Dr. Pipiras was facilitated by the summer faculty program supported by ONR and managed by Dr. Jack Price of David Taylor Model Basin.

Marine Technology Centre, UTM

Abstract: Stability failures of intact ships can be characterized as the exceedance of some critical level of roll, pitch, or acceleration. The events that need to be characterized for a probabilistic assessment are generally so rare that they cannot be observed in a reasonable number of model test runs or simulations. The Peaks Over Threshold (POT) method is a promising technique for the assessment of these rare stability failures. POT methods model the tail of the distribution of peaks as a Generalized Pareto Distribution, which is formally derived from the Generalized Extreme Value distribution. Using the Generalized Pareto Distribution in a POT framework allows the probability of these rare events (with confidence intervals) to be assessed through statistical extrapolation.

Key words: Dynamic Stability, Capsize, Extreme Event, Direct Assessment, Statistical Extrapolation, Probabilistic Assessment

1. Introduction

In the assessment of the dynamic stability of ships, probabilistic frameworks are generally employed to quantify, in some way, the risk of stability failure. For intact ships a stability failure can be characterized by the exceedance of some high level of the roll, pitch, or acceleration of the ship. As the large amplitude motion of a ship can be a highly nonlinear process, the assumption of a Gaussian process does not hold. Since the rich set of tools accompanying a Gaussian process is not applicable, other approaches are needed in order to characterize the nature of the extreme events.

In severe cases (i.e. very high sea states) descriptive statistics may suffice if enough failures can be observed in physical model tests or Monte Carlo simulations. For seaways where extreme events are sufficiently rare but the risk is still not negligible, the number of model test runs or Monte Carlo simulations (given the requirements of a hydrodynamic code for this task [1]) becomes intractable. Inferential statistics provide ways of dealing with these types of cases through techniques of statistical extrapolation and extreme value theory.

2. Peaks Over Threshold Methods

The extreme value theorem (sometimes referred to as the Fisher-Tippett-Gnedenko theorem) states that the largest value of a set of independent and identically distributed (IID) data in a fixed time period, T, will (for large values of T) be distributed according to the Generalized Extreme Value (GEV) distribution [2]. While this theorem is extremely valuable, its asymptotic nature (with respect to the size of the time window) means it makes poor use of available data, since only the largest value in each window is retained.

The second extreme value theorem (the Pickands-Balkema-de Haan theorem) states that the distribution of exceedances of a sufficiently high threshold can be approximated using the Generalized Pareto distribution (GPD). Use of the GPD, which is derived from the GEV distribution [2], [3], also relies on the peaks satisfying the IID condition.

3. Peaks Over Threshold For Dynamic Stability Assessment

The general framework for the use of POT methods for stability assessments is discussed in [4], [5], and [6]. In order to ensure the IID requirement is satisfied, the peaks of the piece-wise linear envelope, rather than the peaks of the signal, are used. The theoretical envelope (derived through a Hilbert transform) is not used, as its peaks are not always a subset of the signal peaks. As the level is increased, the exceedance rates computed using envelope peaks and signal peaks approach each other, but for intermediate thresholds the IID condition is met only by the envelope peaks. The envelope peak extraction process is illustrated in Figure 1.

Figure 1. Envelope Peak Selection

The exceedance rate of level a is then given by:

$$\lambda_a = \lambda_\mu \, F(x > a \mid x > \mu) \qquad (1)$$

where a is the level of interest, μ is the threshold, λ_μ is the rate of exceedance of threshold μ, and F(x > a | x > μ) is the conditional probability that a will be exceeded given that μ has been exceeded. It is this conditional probability that needs to be computed accurately for this type of method to work, since the level of interest (defined as a stability failure) may be higher than any peak observed during a model test or set of simulations, as shown by the sample histogram and GPD fit in Figure 2.

Figure 2. Histogram and GPD Fit of Ship Motions Data

4. Modeling the Distribution of Peaks Over Threshold

4.1 Definition of the Probability Density Function

As stated in section 2, the GPD is used to model the conditional distribution of peaks over the threshold. The probability density function of the GPD has three parameters: location (μ), shape (k) and scale (σ). The location parameter is generally taken as the assumed threshold. The probability density function is given by equation (2).

$$f(x) = \frac{1}{\sigma}\left(1 + k\,\frac{x - \mu}{\sigma}\right)^{-\left(1 + \frac{1}{k}\right)}, \quad k \neq 0 \qquad (2)$$

The associated cumulative distribution function is given by equation (3).

$$F(x) = 1 - \left(1 + k\,\frac{x - \mu}{\sigma}\right)^{-\frac{1}{k}}, \quad k \neq 0 \qquad (3)$$

In the limit where the shape parameter k is zero, the GPD reduces to the exponential distribution. The tail of a normal process will behave in this way. When k is positive the tail is said to be heavy, and higher levels become more likely than they would be under the exponential distribution. Conversely, when k is negative the tail is said to be light, and higher levels have a smaller probability of exceedance.
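As a minimal sketch of how equations (1) and (3) combine, the Python functions below evaluate the conditional probability of exceedance and the resulting exceedance rate. The function names and the example numbers are illustrative assumptions rather than part of the original work, and the formula assumed is the standard GPD form in which k > 0 gives a heavy tail and k < 0 a bounded one.

```python
import math


def gpd_conditional_exceedance(a, mu, k, sigma):
    """P(x > a | x > mu) for a GPD with threshold mu, shape k and scale sigma."""
    if sigma <= 0.0:
        raise ValueError("scale sigma must be positive")
    if a <= mu:
        return 1.0                      # levels at or below the threshold are always exceeded
    z = (a - mu) / sigma
    if abs(k) < 1e-12:
        return math.exp(-z)             # exponential limit for k -> 0
    base = 1.0 + k * z
    if base <= 0.0:
        return 0.0                      # light tail (k < 0): level lies beyond the support
    return base ** (-1.0 / k)


def exceedance_rate(a, mu, k, sigma, rate_mu):
    """Rate of exceeding level a, given the rate rate_mu of exceeding threshold mu."""
    return rate_mu * gpd_conditional_exceedance(a, mu, k, sigma)


if __name__ == "__main__":
    # Hypothetical numbers: threshold 10, scale 4, threshold upcrossing rate 0.01 per second.
    for shape in (-0.2, 0.0, 0.2):
        print(shape, exceedance_rate(25.0, 10.0, shape, 4.0, 0.01))
```

Running the loop at the bottom shows the qualitative behaviour described above: for the same threshold, scale and level, the computed rate grows as k moves from negative to positive values.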

4.2 Light Tails

For cases where the tail is light (k < 0), the GPD has a right bound, x_B, which is given by:

$$x_B = \mu - \frac{\sigma}{k} \qquad (4)$$

Above x_B the probability of exceedance is identically 0. It is important to note that the derivative of the cumulative distribution function (CDF) gets very steep in the vicinity of x_B. This means that small changes in x lead to large changes in the probability of exceedance. The practical implication is that the confidence interval on the predicted exceedance rate can be very large near x_B.

4.2 Parameter Estimation

The parameters of the GPD are estimated using the Maximum Likelihood Estimation (MLE) method. The MLE method is based on the assumption that the observed data are the most likely data. The likelihood function, L, for a probability density function, f, with parameter set θ, is given by:

$$L(\theta) = \prod_{i=1}^{n} f(x_i \mid \theta) \qquad (5)$$

where the x_i values are the observed data. The value of L is maximized with respect to the parameters. In practice the natural logarithm of the likelihood function is used, because the product operator becomes a summation operator and certain algebraic simplifications ease the calculations. The estimates of the distribution parameters, k and σ, from the MLE method are approximately normally distributed.

4.3 Confidence Interval of the Distribution Parameters

The confidence interval on the distribution parameters, k and σ, may be calculated using the delta method. The delta method assumes the parameters are normally distributed and that the ML estimator is a deterministic function of random arguments. The ML estimator is linearized and the variances of the parameters are computed by first computing the Fisher information matrix, M_F, a 2x2 matrix of the negative second partial derivatives of the log-likelihood function. The covariance matrix is the inverse of M_F, and the variances of the parameters are the diagonal elements of the covariance matrix. The off-diagonal terms give the covariance of the parameter estimates.
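The estimation and covariance steps described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the compass (pattern) search and the finite-difference Hessian are stand-ins chosen to keep the example dependency-free and are not the authors' numerical method, all function names are hypothetical, and the data in the usage note are synthetic.

```python
import math
import random


def gpd_nll(params, excesses):
    """Negative log-likelihood of excesses y = x - mu for shape k and scale sigma."""
    k, sigma = params
    if sigma <= 0.0:
        return math.inf
    n = len(excesses)
    if abs(k) < 1e-8:                          # exponential limit, k -> 0
        return n * math.log(sigma) + sum(excesses) / sigma
    total = 0.0
    for y in excesses:
        t = k * y / sigma
        if t <= -1.0:                          # observation outside the support
            return math.inf
        total += math.log1p(t)
    return n * math.log(sigma) + (1.0 + 1.0 / k) * total


def compass_search(f, x0, step=0.25, tol=1e-7, max_evals=20000):
    """Derivative-free local minimizer (assumed stand-in for a proper optimizer)."""
    x, fx, evals = list(x0), f(x0), 1
    while step > tol and evals < max_evals:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                ft = f(trial)
                evals += 1
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step *= 0.5                        # refine the mesh when no move helps
    return x


def observed_information(f, x, rel_step=1e-3):
    """Central-difference Hessian of f at x (observed Fisher information for an NLL)."""
    n = len(x)
    h = [rel_step * max(1.0, abs(v)) for v in x]

    def shifted(i, j, si, sj):
        p = list(x)
        p[i] += si * h[i]
        p[j] += sj * h[j]
        return f(p)

    return [[(shifted(i, j, 1, 1) - shifted(i, j, 1, -1)
              - shifted(i, j, -1, 1) + shifted(i, j, -1, -1)) / (4.0 * h[i] * h[j])
             for j in range(n)] for i in range(n)]


def fit_gpd(peaks, mu):
    """Return (k_hat, sigma_hat, cov) for the peaks exceeding the threshold mu."""
    excesses = [x - mu for x in peaks if x > mu]

    def nll(p):
        return gpd_nll(p, excesses)

    start = [0.0, sum(excesses) / len(excesses)]   # exponential fit as starting point
    k_hat, sigma_hat = compass_search(nll, start)
    m = observed_information(nll, [k_hat, sigma_hat])
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    cov = [[m[1][1] / det, -m[0][1] / det],        # inverse of the 2x2 matrix
           [-m[1][0] / det, m[0][0] / det]]
    return k_hat, sigma_hat, cov


def sample_gpd(n, k, sigma, rng):
    """Synthetic GPD excesses by inverse transform (k != 0), for illustration only."""
    return [sigma / k * ((1.0 - rng.random()) ** (-k) - 1.0) for _ in range(n)]
```

A possible usage is `k_hat, sigma_hat, cov = fit_gpd([1.0 + y for y in sample_gpd(2000, 0.2, 1.0, random.Random(0))], mu=1.0)`, after which normal-approximation 95% intervals follow as `k_hat ± 1.96 * math.sqrt(cov[0][0])` and `sigma_hat ± 1.96 * math.sqrt(cov[1][1])`. For strongly light tails the finite-difference steps can leave the support of the distribution, so a real implementation would need more care there.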
The confidence interval on the parameters is obtained by assuming that the parameter estimates are the mean values of a normal distribution with variances taken from the covariance matrix. The profile log-likelihood method is another method for determining the confidence intervals on the distribution parameters [7]. The profile log-likelihood method has the advantage that the resulting confidence intervals need not be symmetric, but the computational cost is much higher. For the calculation of the parameter confidence intervals, this computational burden likely does not yield much gain.

4.4 Threshold Selection

A critical part of using the GPD effectively is selecting an appropriate threshold. Tanaka and Takara provide an overview of several commonly used methods and compare their performance [8]. Typical methods of threshold selection are graphical in nature, as many applications only deal with one dataset. These methods include:

- Shape parameter plot (see Figure 3 - Top)
- Modified scale parameter plot (see Figure 3 - Bottom)
- Mean excess plot

In order to be useful for the probabilistic assessment of ship stability failure, the threshold selection method must be automated. The shape and modified scale parameter plots can be easily automated, while the mean excess plot (sometimes called the mean residual life plot) is a little more difficult to automate in a sensible fashion. For the shape and modified scale parameter plots, the main idea is that above the minimum threshold, these values should be (statistically) constant with respect to the threshold.

Figure 3. Sample Shape Parameter Plot (Top) and Modified Scale Parameter Plot (Bottom)

Additionally, Reiss and Thomas [9] suggest two related alternative methods based on minimizing the difference between the shape parameter at a given threshold and the mean or median of the shape parameter for all of the thresholds above it. All of these methods give a lower bound on the threshold choice. The selected threshold must therefore be at least as high as the highest lower bound from this set of methods. Additionally, given the sensitivity of the probability near x_B when the tail is light, selection of the threshold with the highest shape parameter can help shrink the size of the confidence interval to some extent.

5. Extrapolation of the Conditional Probability of Exceedance

The conditional probability of exceedance is computed using equation (3). This value of the probability is based on the mean value of the parameter estimates. As equation (3) is a non-linear function and can be treated as a deterministic function with random arguments, the mean probability will not be equal to the probability computed using the mean of the parameter estimates.

There are several techniques to compute the confidence interval on the probability estimate. The CDF of the extrapolated probability of exceedance, F_P, may be computed using equation (6). The confidence interval would then be assessed from the quantiles of the CDF.

$$F_P(z) = \begin{cases} \displaystyle\iint_{\sigma \le \sigma_{\lim}(k;\, z)} f(k, \sigma)\, d\sigma\, dk, & z \ge 0 \\ 0, & z < 0 \end{cases} \qquad (6)$$

where

$$\sigma_{\lim}(k;\, z > 0) = \begin{cases} \dfrac{k\,(c - a)}{1 - z^{-k}}, & k \neq 0 \\[2ex] \dfrac{c - a}{\ln(z)}, & k = 0 \end{cases}$$

Here f(k, σ) denotes the joint density of the parameter estimates, a is the level of interest and c is the threshold (denoted μ above).

The parameter space contained by the parameter confidence intervals may contain area where the computed probability is zero. If this is the case, then there is a discontinuity in the CDF of the probability of exceedance. This is visible in Figure 4, where F_P(0) is the amount of area where the probability of exceedance is zero. In this case the lower bound of the confidence interval on the probability of exceedance will be zero.

Figure 4. CDF of the Probability of Exceedance

Another method to be considered is an indirect method using the profile log-likelihood method mentioned earlier. The confidence intervals are developed for the return level and then mapped to the corresponding probability. This indirect use of the profile log-likelihood method seems to be more accurate than the CDF-based technique, based on investigations using data sampled from a parent GPD. Issues still arise near the right bound in the case of a light tail.

6. Validation Considerations

Some work has been done on the validation of statistical extrapolation methods for use in ship dynamics. Smith discussed some initial validation results [10]. Generally these types of methods fare well, though more work needs to be done in this area.

7. Conclusions

Peaks Over Threshold methods can be very effective in the prediction of large ship motions or stability failures for intact ships. The Generalized Pareto distribution has some behaviors which need to be understood, particularly for light-tailed processes, in order to make proper use of it. The study of light-tailed processes and the behavior of the confidence interval for the probability of exceedance have been given some treatment in the present work, but have not been studied as deeply as heavy-tailed processes and return levels in available literature.

Acknowledgments

The authors also wish to thank Dr. A. Reed and T. Smith (David Taylor Model Basin, NSWCCD), Prof. P. Spanos (Rice University) and Prof. M. R. Leadbetter (University of North Carolina) for many fruitful discussions.

References

[1] R. F. Beck and A. M. Reed, "Modern Computational Methods for Ships in a Seaway," SNAME Transactions, vol. 109, pp. 1-48.
[2] M. R. Leadbetter, G. Lindgren and H. Rootzen, Extremes and Related Properties of Random Sequences and Processes, New York: Springer-Verlag, 1983.
[3] J. Pickands, "Statistical Inference Using Extreme Order Statistics," The Annals of Statistics, vol. 3, no. 1, pp. 119-131, 1975.
[4] B. Campbell and V. Belenky, "Assessment of Short-Term Risk with Monte-Carlo Method," in Proceedings of the 11th International Ship Stability Workshop, Wageningen, NL, 2010.
[5] B. Campbell and V. Belenky, "Statistical Extrapolation for Evaluation of Probability of Large Roll," in Proceedings of the 11th International Symposium on Practical Design of Ships and Other Floating Structures, Rio de Janeiro, Brazil, 2010.
[6] V. Belenky, K. Weems, C. Bassler, M. Dipper, B. Campbell and K. Spyrou, "Approaches to Rare Events in Stochastic Dynamics of Ships," Probabilistic Engineering Mechanics, vol. 28, pp. 30-38, 2012.
[7] S. Coles, An Introduction to Statistical Modeling of Extreme Values, London: Springer-Verlag, 2001.
[8] S. Tanaka and K. Takara, "A Study on Threshold Selection in POT Analysis of Extreme Floods," in The Extremes of Extremes: Extraordinary Floods, Reykjavik, 2002.
[9] R.-D. Reiss and M. Thomas, Statistical Analysis of Extreme Values, Boston: Birkhäuser, 1997, pp. 137-138.
[10] T. C. Smith, "Example of Validation of Statistical Extrapolation," in Proceedings of the 14th International Ship Stability Workshop, Kuala Lumpur, Malaysia, 2014.