Frequency Analysis & Probability Plots


Note Packet #14
Frequency Analysis & Probability Plots
CEE 3710, October 20, 2017

Frequency Analysis
- Process by which engineers estimate the magnitude of design events (e.g., the 100-year flood) or assess the risk associated with various outcomes/events
- Based on the use of sample data to hypothesize a probability model and infer characteristics of the population of interest
- Works with any probability distribution

Motivation: You need to design a levee to withstand the 100-year flood ($X_{0.99}$). Given the sample data $\{x_1, x_2, \ldots, x_n\}$ below, corresponding to the magnitudes of n = 50 annual maximum flood flows, what is $\hat{X}_{0.99}$?

[Figure: time series of Annual Maximum Discharge (cfs) versus Year, 1960 to 2010]

(1) Compute sample moments and descriptive statistics

[Figure: histogram of Annual Maximum Discharge (cfs), Frequency versus discharge bin]

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = 1503.9 \text{ cfs}$$

$$s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2} = 81.0 \text{ cfs}$$

(2) Select an appropriate model (probability distribution) of annual maximum flood flows

Considerations:
- What does the data look like?
- Is the data skewed?
- Are the variables strictly positive?
- Continuous or discrete?

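(3) Fit the selected model to the data using the method of moments (MOM) to estimate the distribution parameters (point estimates of parameters)

$$\mu_X = \int x\, f_X(x)\, dx = \bar{x} = 1503.9 \text{ cfs}$$

$$\sigma_X = \left[\int \left(x - \mu_X\right)^2 f_X(x)\, dx\right]^{1/2} = s_x = 81.0 \text{ cfs}$$

If X ~ Lognormal:

$$f_X(x) = \frac{1}{x\,\sigma_Y\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\ln(x) - \mu_Y}{\sigma_Y}\right)^2\right] \quad \text{for } x > 0; \ 0 \text{ otherwise}$$

$$\sigma_Y = \left[\ln\left(1 + \frac{\sigma_X^2}{\mu_X^2}\right)\right]^{1/2} = 0.577$$

$$\mu_Y = \ln[\mu_X] - \frac{1}{2}\sigma_Y^2 = 7.161$$

[Figure: histogram of Annual Maximum Discharge (cfs), Frequency versus discharge bin]

(4) Assess goodness of fit: how well does the model represent the data?
- How good are our parameter estimates? (How good is our estimate of the 100-year event?)
- Compute confidence intervals
- Construct a quantile-quantile plot

(5) Compute $\hat{X}_{0.99}$ (or other values of interest)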
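The lognormal method-of-moments formulas in step (3), and the percentile calculation in step (5), can be sketched in a few lines. This is only an illustration: the function names `lognormal_mom` and `lognormal_quantile` are hypothetical, and the inputs (mean 100, standard deviation 50) are made-up values, not the flood record from the slides.

```python
import math
from statistics import NormalDist


def lognormal_mom(mean_x, std_x):
    """Method-of-moments parameters (mu_Y, sigma_Y) of Y = ln(X)."""
    sigma_y = math.sqrt(math.log(1.0 + (std_x / mean_x) ** 2))
    mu_y = math.log(mean_x) - 0.5 * sigma_y ** 2
    return mu_y, sigma_y


def lognormal_quantile(p, mu_y, sigma_y):
    """Quantile x_p of the fitted lognormal: exp(mu_Y + z_p * sigma_Y)."""
    z_p = NormalDist().inv_cdf(p)  # standard normal quantile
    return math.exp(mu_y + z_p * sigma_y)


# Illustrative (hypothetical) sample moments, not the slide's flood data.
mean_x, std_x = 100.0, 50.0
mu_y, sigma_y = lognormal_mom(mean_x, std_x)
x_99 = lognormal_quantile(0.99, mu_y, sigma_y)
print(f"mu_Y = {mu_y:.3f}, sigma_Y = {sigma_y:.3f}, x_0.99 = {x_99:.1f}")
```

Substituting the sample mean and standard deviation of an annual maximum series for `mean_x` and `std_x` would give the corresponding estimate of the 100-year event under the lognormal assumption.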

General Procedure:
1. Obtain a sample of size n; compute sample moments and descriptive statistics
2. Hypothesize the underlying probability density function (pdf) of the population
3. Apply the method of moments and compute the parameters of the assumed pdf (i.e., fit the probability model to the data)
4. Assess the fit of the probability model by graphing the fitted cumulative distribution function (cdf) relative to the sample data (empirical cdf, probability plot, or quantile-quantile plot)
5. Use the fitted cdf to obtain percentiles (design events) or probabilities associated with outcomes of interest

In these graphs, the smooth line/curve corresponds to the probability model (representation of the population), and the dots/points correspond to the observed sample data.

Empirical Cumulative Distribution Function (CDF)
- Representation of the cumulative distribution function based on the relative magnitude of the observations in a sample of size n
- Obtained by graphing the plotting positions versus the ranked observations
- Plotting position ($p_i$): provides an estimate of the cumulative probability associated with the observation of rank i, $x_{(i)}$: $p_i = i/(n+1)$
- In other words, $p_i = P[X \le x_{(i)}]$, and thus $x_{(i)}$ represents an empirical percentile (or quantile)

Example: Construct an empirical CDF for the following sample data: {90, 105, 65, 135, 95, 115, 80, 73, 76, 88}

 i   x_(i)   p_i
 1     65   0.091
 2     73   0.182
 3     76   0.273
 4     80   0.364
 5     88   0.455
 6     90   0.545
 7     95   0.636
 8    105   0.727
 9    115   0.818
10    135   0.909

[Figure: Empirical CDF, p_i versus x]

Note: Construction of the empirical CDF does not require consideration of the form of the underlying probability distribution for the random variable/population; however, we can assess the goodness of fit of a probability distribution by plotting the assumed/fitted cdf (model) on the same figure as the empirical cdf (observed).
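The empirical CDF construction can be sketched as follows, using the ten-value sample from the example and the plotting position i/(n+1). The function name `empirical_cdf` is a hypothetical helper chosen for this illustration.

```python
def empirical_cdf(sample):
    """Return (rank i, ranked observation x_(i), plotting position p_i) tuples."""
    ranked = sorted(sample)
    n = len(ranked)
    # Plotting position p_i = i / (n + 1) for the observation of rank i
    return [(i, x, i / (n + 1)) for i, x in enumerate(ranked, start=1)]


sample = [90, 105, 65, 135, 95, 115, 80, 73, 76, 88]
table = empirical_cdf(sample)

print(" i  x_(i)    p_i")
for i, x, p in table:
    print(f"{i:2d}  {x:5d}  {p:5.3f}")
```

Plotting the third column against the second gives the empirical CDF points; no distribution has been assumed at this stage.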

Example: Use the method of moments to fit a normal distribution to the data above, and then assess how well it represents the data by plotting the fitted CDF relative to the empirical CDF.

 i   x_(i)   p_i     z_pi    x̂_(i)
 1     65   0.091  -1.335    63.8
 2     73   0.182  -0.908    72.9
 3     76   0.273  -0.605    79.4
 4     80   0.364  -0.349    84.8
 5     88   0.455  -0.114    89.8
 6     90   0.545   0.114    94.6
 7     95   0.636   0.349    99.6
 8    105   0.727   0.605   105.0
 9    115   0.818   0.908   111.5
10    135   0.909   1.335   120.6

[Figure: Empirical CDF vs. Fitted Normal CDF, p_i versus x; series: Sample Data, Fitted Normal]

Example: Use the method of moments to fit a lognormal distribution to the data, and then assess how well it represents the data by plotting the fitted CDF relative to the empirical CDF.

 i   x_(i)   p_i     z_pi    x̂_(i)
 1     65   0.091  -1.335    66.3
 2     73   0.182  -0.908    73.1
 3     76   0.273  -0.605    78.3
 4     80   0.364  -0.349    83.0
 5     88   0.455  -0.114    87.5
 6     90   0.545   0.114    92.2
 7     95   0.636   0.349    97.3
 8    105   0.727   0.605   103.1
 9    115   0.818   0.908   110.5
10    135   0.909   1.335   121.7

[Figure: Empirical CDF vs. Fitted Lognormal CDF, p_i versus x; series: Sample Data, Fitted LN]
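The fitted-quantile columns in the two tables above can be reproduced with a short sketch. It assumes the normal quantile is computed as x̄ + z·s and the lognormal quantile as exp(μ_Y + z·σ_Y), with μ_Y and σ_Y from the method-of-moments formulas given earlier in the packet; the variable names are illustrative.

```python
import math
from statistics import NormalDist, mean, stdev

sample = [90, 105, 65, 135, 95, 115, 80, 73, 76, 88]
ranked = sorted(sample)
n = len(ranked)

# Sample moments (stdev uses the n - 1 divisor)
x_bar = mean(sample)
s_x = stdev(sample)

# Plotting positions and the matching standard normal quantiles
p = [i / (n + 1) for i in range(1, n + 1)]
z = [NormalDist().inv_cdf(p_i) for p_i in p]

# Normal fit by MOM: mu = x_bar, sigma = s_x
normal_fit = [x_bar + z_i * s_x for z_i in z]

# Lognormal fit by MOM: parameters of Y = ln(X) from real-space moments
sigma_y = math.sqrt(math.log(1.0 + (s_x / x_bar) ** 2))
mu_y = math.log(x_bar) - 0.5 * sigma_y ** 2
lognormal_fit = [math.exp(mu_y + z_i * sigma_y) for z_i in z]

print(" i  x_(i)    p_i    z_pi  normal  lognormal")
for i in range(n):
    print(f"{i + 1:2d}  {ranked[i]:5d}  {p[i]:5.3f}  {z[i]:6.3f}"
          f"  {normal_fit[i]:6.1f}  {lognormal_fit[i]:9.1f}")
```

Plotting `normal_fit` or `lognormal_fit` against `p` gives the smooth fitted CDF, and plotting `ranked` against `p` gives the empirical CDF points on the same axes.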

[Figure: Empirical CDF vs. Fitted Normal CDF, p_i versus x; series: Sample Data, Fitted Normal]

[Figure: Empirical CDF vs. Fitted Lognormal CDF, p_i versus x; series: Sample Data, Fitted LN]

Example: Use the method of moments to fit a Gumbel distribution to the data, and then assess how well it represents the data by plotting the fitted CDF relative to the empirical CDF.

 i   x_(i)   p_i    x̂_(i)
 1     65   0.091    68.1
 2     73   0.182    73.8
 3     76   0.273    78.3
 4     80   0.364    82.4
 5     88   0.455    86.6
 6     90   0.545    90.9
 7     95   0.636    95.8
 8    105   0.727   101.6
 9    115   0.818   109.3
10    135   0.909   121.6

[Figure: Empirical CDF vs. Fitted Gumbel CDF, p_i versus x; series: Sample Data, Fitted Gumbel]
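A sketch of the Gumbel fit follows. The slides do not show the parameterization, so this assumes the common form F(x) = exp[-exp(-(x - ξ)/α)], whose mean is ξ + 0.5772α and whose variance is π²α²/6; the symbols `xi` and `alpha` are labels chosen here, not taken from the packet. Under that assumption the fitted quantiles match the table above to the printed precision.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant

sample = [90, 105, 65, 135, 95, 115, 80, 73, 76, 88]
ranked = sorted(sample)
n = len(ranked)

x_bar = mean(sample)
s_x = stdev(sample)

# Method of moments for the assumed Gumbel form:
#   variance = pi^2 * alpha^2 / 6   ->  alpha = s * sqrt(6) / pi
#   mean     = xi + gamma * alpha   ->  xi = x_bar - gamma * alpha
alpha = s_x * math.sqrt(6.0) / math.pi
xi = x_bar - EULER_GAMMA * alpha

# Fitted quantile at each plotting position: x_p = xi - alpha * ln(-ln p)
p = [i / (n + 1) for i in range(1, n + 1)]
gumbel_fit = [xi - alpha * math.log(-math.log(p_i)) for p_i in p]

print(" i  x_(i)    p_i  gumbel")
for i in range(n):
    print(f"{i + 1:2d}  {ranked[i]:5d}  {p[i]:5.3f}  {gumbel_fit[i]:6.1f}")
```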

Quantile-Quantile (Q-Q) Plots
- Constructed by plotting the ranked observations ($x_{(i)}$) against the fitted percentiles, or quantiles ($\hat{x}_{(i)}$)
- Observed or empirical quantiles vs. modeled or fitted quantiles
- Sample data should fall approximately on a straight (1:1) line if the fitted distribution adequately describes the true population

[Figure: Q-Q plot with axes $\hat{x}_{(i)}$ and $x_{(i)}$]
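The Q-Q construction can be sketched generically: pair each ranked observation with the fitted quantile at its plotting position. The helper name `qq_pairs` is hypothetical, and the normal fit to the ten-value sample is used only as an example of a fitted distribution; the printed difference column is an informal aid added here, not a statistic from the slides.

```python
from statistics import NormalDist, mean, stdev


def qq_pairs(sample, inverse_cdf):
    """Pair each fitted quantile x_hat_(i) with the ranked observation x_(i)."""
    ranked = sorted(sample)
    n = len(ranked)
    return [(inverse_cdf(i / (n + 1)), x) for i, x in enumerate(ranked, start=1)]


sample = [90, 105, 65, 135, 95, 115, 80, 73, 76, 88]

# Example fitted model: normal distribution with MOM parameters
fitted = NormalDist(mu=mean(sample), sigma=stdev(sample))
pairs = qq_pairs(sample, fitted.inv_cdf)

print("fitted  observed  observed - fitted")
for x_hat, x_obs in pairs:
    # Points near the 1:1 line have differences close to zero
    print(f"{x_hat:6.1f}  {x_obs:8d}  {x_obs - x_hat:+17.1f}")
```

Any other fitted distribution can be checked the same way by passing its inverse CDF as `inverse_cdf`.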

[Figure: three Q-Q plot panels, each with an axis labeled $\hat{x}_{(i)}$]

Probability Plots
- Sample data is plotted so that the observations should fall approximately on a straight line if the selected distribution describes the true population; however, unlike Q-Q plots, assessment of the selected distribution (model) does not depend on estimated parameters
- Can be created with special commercially available probability papers for some distributions (normal, lognormal, Gumbel), or with the general technique developed here (easy with a spreadsheet)
- Constructed by plotting the ranked observations ($x_{(i)}$) against standardized percentiles

Example: Reconsider the sample data above. Use a probability plot to assess how well the normal distribution fit using the method of moments represents the sample data.

 i   x_(i)   p_i     z_pi
 1     65   0.091  -1.335
 2     73   0.182  -0.908
 3     76   0.273  -0.605
 4     80   0.364  -0.349
 5     88   0.455  -0.114
 6     90   0.545   0.114
 7     95   0.636   0.349
 8    105   0.727   0.605
 9    115   0.818   0.908
10    135   0.909   1.335

Example: Reconsider the sample data above. Use a probability plot to assess how well the lognormal distribution fit using the method of moments represents the sample data.

 i   x_(i)   ln(x_(i))   p_i     z_pi
 1     65     4.174     0.091  -1.335
 2     73     4.290     0.182  -0.908
 3     76     4.330     0.273  -0.605
 4     80     4.382     0.364  -0.349
 5     88     4.477     0.455  -0.114
 6     90     4.500     0.545   0.114
 7     95     4.554     0.636   0.349
 8    105     4.654     0.727   0.605
 9    115     4.745     0.818   0.908
10    135     4.905     0.909   1.335

Example: Reconsider the sample data above. Use a probability plot to assess how well the Gumbel distribution fit using the method of moments represents the sample data.

 i   x_(i)   p_i    -ln(-ln(p_i))
 1     65   0.091     -0.875
 2     73   0.182     -0.533
 3     76   0.273     -0.262
 4     80   0.364     -0.012
 5     88   0.455      0.238
 6     90   0.545      0.501
 7     95   0.636      0.794
 8    105   0.727      1.144
 9    115   0.818      1.606
10    135   0.909      2.351
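The plotting coordinates for the three probability plots can be generated as in the sketch below. It uses the standard normal quantile z for the normal and lognormal plots (with ln x on the other axis for the lognormal) and the reduced variate -ln(-ln p) for the Gumbel plot, as in the tables above; the variable names are illustrative. No fitted parameters are needed to build these coordinates.

```python
import math
from statistics import NormalDist

sample = [90, 105, 65, 135, 95, 115, 80, 73, 76, 88]
ranked = sorted(sample)
n = len(ranked)

# Plotting positions p_i = i / (n + 1)
p = [i / (n + 1) for i in range(1, n + 1)]

# Standardized percentiles for each distribution family
z = [NormalDist().inv_cdf(p_i) for p_i in p]            # normal, lognormal
gumbel_y = [-math.log(-math.log(p_i)) for p_i in p]     # Gumbel reduced variate

# (standardized percentile, plotted observation) pairs
normal_plot = list(zip(z, ranked))
lognormal_plot = [(z_i, math.log(x)) for z_i, x in zip(z, ranked)]
gumbel_plot = list(zip(gumbel_y, ranked))

print(" i  x_(i)  ln(x_(i))    z_pi  -ln(-ln(p_i))")
for i in range(n):
    print(f"{i + 1:2d}  {ranked[i]:5d}  {math.log(ranked[i]):9.3f}"
          f"  {z[i]:6.3f}  {gumbel_y[i]:13.3f}")
```

If the points in one of these coordinate sets fall close to a straight line, the corresponding distribution family is a plausible model for the data.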