REPRESENTATION OF DAILY RAINFALL DISTRIBUTIONS USING NORMALIZED RAINFALL CURVES

INTERNATIONAL JOURNAL OF CLIMATOLOGY, VOL. 16, 1157-1163 (1996) 551.521.I l(4)

IAN T. JOLLIFFE AND PETER B. HOPE
Department of Mathematical Sciences, University of Aberdeen, Edward Wright Building, Dunbar Street, Aberdeen AB9 2TY, UK
email: itj@maths.abdn.ac.uk

Received 22 February 1995
Accepted 12 December 1995

ABSTRACT

For daily rainfall the normalized rainfall curve (NRC) provides a plot of the cumulative percentage of rain days (y) against the cumulative percentage of rain amount (x). It is not immediately clear whether the equations that have been widely used to represent the relationship between x and y correspond to a valid probability distribution for daily rainfall amounts. We show that such distributions exist, but that they are truncated, i.e. not all positive values of rainfall have non-zero probability. In practice, the truncation may not be too important, but the form of the equations relating x and y is also determined for a number of well-known probability distributions that are not truncated. The fit of the corresponding NRCs to some published rainfall data is examined.

KEY WORDS: daily rainfall; exponential distribution; gamma distributions; normalized rainfall curves; Weibull distributions

1. INTRODUCTION

Normalized rainfall curves (NRCs) provide a graphical means of representing the observed frequency distribution of daily rainfall amounts at any given rainfall station. The cumulative percentage (or proportion) of rain days, y, is plotted against the cumulative percentage (or proportion) of rainfall amounts, x. The NRCs have been used to display rainfall distributions by a number of authors. Here we focus on their use in the paper by Ananthakrishnan and Soman (1989), subsequently referred to as AS. A number of further references can be found in AS.
In AS the relationship between x and y is estimated by

$$x = y \exp[-b(100 - y)^c], \tag{1}$$

for suitably chosen constants b and c. This relationship is an adaptation of an earlier expression

$$x = a y \exp(by), \tag{2}$$

where a and b are constants with a = exp(-b), which dates back to Olascoago (1950). Equation (1) is shown by AS to give a good fit to data from various Indian rainfall stations. It is therefore of interest to investigate to what probability distributions for rainfall amounts equation (1) corresponds.

In section 2 we demonstrate that equations (1) and (2) correspond to probability distributions that are truncated in the following sense. There is a lower bound $l_1$ to the possible values for rainfall amounts when equation (1) is valid, and upper and lower bounds $u_2$ and $l_2$ for rainfall amounts in the case of equation (2). Although it appears that this truncation may often not be of practical importance, it is desirable in some circumstances to have probability distributions that allow all non-zero values of x. For this reason we investigate, in section 3, the form of NRCs for some standard probability distributions. The results are mixed: results can be derived for the exponential distribution and for some Weibull distributions, and we show how the corresponding NRCs can be fitted to some of AS's Indian data in section 4. However, there seems to be no readily available analytical solution for the equation relating x and y for general gamma distributions, a class of distributions that is often used to model rainfall data.

CCC 0899-8418/96/101157-07 © 1996 by the Royal Meteorological Society

2. PROBABILITY DISTRIBUTIONS CORRESPONDING TO NRCs

Using the notation of AS, let $r_1, r_2, \ldots, r_N$ denote non-zero rainfall amounts observed on N days at a rainfall station, arranged in ascending order, and let $R = \sum_{i=1}^{N} r_i$ be the total observed rainfall. Then $x_k$ and $y_k$ are defined as

$$x_k = \frac{1}{R}\sum_{i=1}^{k} r_i, \qquad y_k = k/N, \qquad k = 1, 2, \ldots, N.$$

The plotted values $(x_k, y_k)$, k = 1, 2, ..., N, give the observed NRC for the station. Note that AS express $x_k$ and $y_k$ as percentages (i.e. they multiply both quantities defined above by 100). It is more convenient in what follows, and completely equivalent, to treat them as proportions.

Suppose that the rainfall amounts are observations on a random variable U, which has probability density function (p.d.f.) f(u), u > 0, and cumulative distribution function (c.d.f.) $F(u) = \int_0^u f(v)\,dv$. Let u be a particular value for U, and consider the relationship between y, the probability of a value of U no greater than u, and x, the proportion of total rainfall accounted for by falls no greater than u. We have

$$y = \int_0^u f(v)\,dv = F(u) \tag{3}$$

and

$$x = \frac{\int_0^u v f(v)\,dv}{\int_0^\infty v f(v)\,dv}. \tag{4}$$

In equation (4) the denominator represents the overall mean rainfall, $\mu$, and the numerator is the contribution to the mean of falls not exceeding u. From equation (3), $u = F^{-1}(y)$, and substituting in equation (4) gives

$$x = \frac{1}{\mu}\int_0^{F^{-1}(y)} v f(v)\,dv. \tag{5}$$

Note that the right-hand side of equation (5) can be written as

$$\frac{1}{\mu}\int_0^{y} F^{-1}(w)\,dw,$$

where w = F(v), so dw = f(v)dv and $v = F^{-1}(w)$. Hence, differentiating both sides of equation (5) with respect to y gives

$$\frac{dx}{dy} = \frac{F^{-1}(y)}{\mu} \tag{6}$$

and

$$\frac{d^2x}{dy^2} = \frac{1}{\mu f[F^{-1}(y)]}. \tag{7}$$

If we assume that rainfall can take all positive values we have $F^{-1}(0) = 0$ and, as $\mu$ is finite, it follows that

$$\frac{dx}{dy} = 0 \quad \text{when } y = 0. \tag{8}$$
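The observed NRC defined at the start of this section can be computed directly from a list of daily amounts. The following is a minimal sketch in Python, working in proportions rather than percentages; the function name `empirical_nrc` and the sample amounts are illustrative assumptions and are not taken from AS.

```python
from itertools import accumulate

def empirical_nrc(rain_amounts):
    """Return (x, y) lists for the observed NRC, both as proportions.

    x[k-1] is the share of total rainfall from the k smallest rain days,
    y[k-1] = k/N is the share of rain days, for k = 1..N.
    """
    r = sorted(a for a in rain_amounts if a > 0)  # rain days only, ascending
    n = len(r)
    total = sum(r)                                # total observed rainfall
    x = [s / total for s in accumulate(r)]        # cumulative rainfall share
    y = [k / n for k in range(1, n + 1)]          # cumulative rain-day share
    return x, y

# Hypothetical daily amounts in mm (the zero is a dry day and is dropped)
sample = [0.5, 12.0, 3.2, 1.1, 40.3, 7.4, 0.0, 2.6]
x, y = empirical_nrc(sample)
```

Because the amounts are sorted in ascending order, each x value is no larger than the matching y value, so the plotted curve lies on or below the diagonal.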

If equation (2) holds, x = ay exp(by) and

$$\frac{dx}{dy} = a(1 + by)\exp(by),$$

which cannot be zero when y = 0 (unless a = 0). Similarly, if equation (1) holds then equation (8) cannot be satisfied. At first sight, this implies that equations (1) and (2) do not correspond to any probability distribution, but this is not the case if we relax the requirement that all positive values of v are possible. From equation (1), expressed in terms of proportions rather than percentages as $x = y\exp[-b(1 - y)^c]$, we find

$$\frac{dx}{dy} = \left[1 + bcy(1 - y)^{c-1}\right]\exp[-b(1 - y)^c] \tag{9}$$

and

$$\frac{d^2x}{dy^2} = bc(1 - y)^{c-2}\left[2 - (c + 1)y + bcy(1 - y)^c\right]\exp[-b(1 - y)^c]. \tag{10}$$

The corresponding results for equation (2) are obtained by setting c = 1 in equations (9) and (10). Equation (6) can be rewritten as

$$F^{-1}(y) = \mu\frac{dx}{dy}, \tag{11}$$

and y = 0 corresponds to the smallest possible value of the random variable for which the distribution function is F. Similarly, y = 1 corresponds to the largest possible value of that random variable. When y = 0, we have from equation (9) that $\mu(dx/dy) = \mu\exp(-b)$, and for y = 1, $\mu(dx/dy) = \mu(1 + b)$. Hence equation (1) implies that rainfall is restricted to the range $[\mu\exp(-b), \mu(1 + b)]$. If equation (2) is used, the same lower bound applies, but there is no upper bound.

Equations (6), (7), (9), and (10) do not allow us to find a formula for the corresponding probability distribution f(u). However, substituting equation (6) into (7) gives

$$f\left(\frac{dx}{dy}\right) = \left[\frac{d^2x}{dy^2}\right]^{-1}, \tag{12}$$

where we have taken $\mu = 1$ because the shape of the NRC, which is based on percentages or proportions, is independent of $\mu$. By letting y run from 0 to 1 in small steps, and calculating corresponding values of dx/dy and $d^2x/dy^2$ from equations (9) and (10), we can use equation (12) to plot the function f(u), which is monotone decreasing over all values of u in the range [exp(-b), (1 + b)].

3. NRCs FOR SOME STANDARD PROBABILITY DISTRIBUTIONS

We shall see that in the examples of section 4 the limits on the range of rainfall amounts implied by equations (1) and (2) may not be too restrictive. However, it is of interest to investigate the form of NRCs for some standard probability distributions that have been used to model rainfall amounts and which allow all possible positive values. Equation (5) expresses x in terms of y, but with the inverse distribution function $F^{-1}(y)$ appearing as a limit of integration it is not in a very useful form. For some probability distributions it is possible to use equation (5), or its equivalent form equation (6), to obtain an explicit expression for x in terms of y. For example, consider the exponential distribution with p.d.f.

$$f(u) = \frac{1}{\mu}\exp(-u/\mu), \qquad u > 0.$$

For this distribution, $F(u) = 1 - e^{-u/\mu}$, so if y = F(u), then $F^{-1}(y) = -\mu\ln(1 - y)$. Substituting into equation (6) gives

$$\frac{dx}{dy} = -\ln(1 - y),$$

leading to

$$x = y + (1 - y)\ln(1 - y).$$

The exponential distribution is a special case, with $\alpha = 1$ and $\beta = 1/\mu$, of the family of gamma distributions with p.d.f.

$$f_\alpha(u) = \frac{\beta^\alpha}{\Gamma(\alpha)}u^{\alpha-1}\exp(-\beta u), \qquad u > 0,$$

and this family has often been used to model daily rainfall amounts; see, for example, Stern and Coe (1984) and references therein. For gamma distributions, equation (5) can be written

$$x = F_{\alpha+1}[F_\alpha^{-1}(y)],$$

where $F_\alpha(u) = \int_0^u f_\alpha(v)\,dv$. However, it appears to be impossible to derive explicit expressions relating x to y for general members of this family. The expression for

$$F_\alpha(u) = \int_0^u \frac{\beta^\alpha}{\Gamma(\alpha)}v^{\alpha-1}\exp(-\beta v)\,dv$$

cannot be written in closed form and so cannot be inverted to give $F_\alpha^{-1}(y)$, which is required in equation (6).

The reason for the popularity of gamma distributions in fitting rainfall data is that they provide a flexible family of positively skewed distributions. As rainfall distributions are usually highly skewed, they can often be well-fitted by gamma distributions. An alternative family of distributions that can accommodate a similar range of distributional shapes is the Weibull family, with p.d.f.

$$f(u) = \gamma\delta u^{\gamma-1}\exp(-\delta u^\gamma), \qquad u > 0. \tag{13}$$

As with the gamma family, the exponential distribution arises as a special case, when $\gamma = 1$ and $\delta = 1/\mu$. It is possible to solve equation (6) for some members of the Weibull family. First, note that the solution of equation (6) will not depend on $\delta$, so in finding the form of NRCs we can take $\delta = 1$ without loss of generality. We have $F(u) = 1 - \exp(-u^\gamma)$, leading to $F^{-1}(y) = [-\ln(1 - y)]^{1/\gamma}$. When $\gamma = 1/2$, it follows that

$$x = \tfrac{1}{2}\left[2y + 2(1 - y)\ln(1 - y) - (1 - y)[\ln(1 - y)]^2\right],$$

and when $\gamma = 1/3$,

$$x = \tfrac{1}{6}\left[6y + 6(1 - y)\ln(1 - y) - 3(1 - y)[\ln(1 - y)]^2 + (1 - y)[\ln(1 - y)]^3\right].$$

Other cases when $\gamma^{-1}$ is an integer can be evaluated similarly, and expressions can also be found in some other situations.
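The closed forms for the exponential case and for the Weibull cases with shape 1/2 and 1/3 can be checked against a direct numerical evaluation of the integral form of equation (5). The sketch below is only an illustration: the function names are invented here, the Weibull shape parameter is called `shape` to avoid a clash with `math.gamma`, the scale is fixed at 1, and the midpoint rule with 20000 steps is an arbitrary choice.

```python
import math

def nrc_exponential(y):
    """Closed-form NRC for the exponential distribution (Weibull shape 1)."""
    return y + (1 - y) * math.log(1 - y)

def nrc_weibull_half(y):
    """Closed-form NRC for the Weibull distribution with shape 1/2."""
    L = math.log(1 - y)
    return 0.5 * (2 * y + 2 * (1 - y) * L - (1 - y) * L ** 2)

def nrc_weibull_third(y):
    """Closed-form NRC for the Weibull distribution with shape 1/3."""
    L = math.log(1 - y)
    return (6 * y + 6 * (1 - y) * L - 3 * (1 - y) * L ** 2 + (1 - y) * L ** 3) / 6

def nrc_numeric(y, shape, steps=20000):
    """Midpoint-rule estimate of (1/mu) * integral from 0 to y of F^{-1}(w) dw,
    with F^{-1}(w) = (-ln(1 - w))**(1/shape) and mu = Gamma(1 + 1/shape)."""
    mu = math.gamma(1 + 1 / shape)
    h = y / steps
    total = sum((-math.log1p(-(i + 0.5) * h)) ** (1 / shape) for i in range(steps))
    return total * h / mu

# Compare closed form and numerical integral at a few values of y
for y in (0.1, 0.5, 0.9):
    print(y,
          nrc_exponential(y) - nrc_numeric(y, 1.0),
          nrc_weibull_half(y) - nrc_numeric(y, 0.5),
          nrc_weibull_third(y) - nrc_numeric(y, 1 / 3))
```

The printed differences should all be very close to zero. At y = 0.5 the exponential curve gives x of about 0.153, which is near the value 0.154 quoted for Mangalore in section 4.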
For example, when $\gamma = 2/3$,

$$x = 2\Phi\left(\sqrt{-2\ln(1 - y)}\right) - 1 - \frac{2}{\sqrt{\pi}}(1 - y)[-\ln(1 - y)]^{1/2}\left[1 - \tfrac{2}{3}\ln(1 - y)\right],$$

where $\Phi$ denotes the c.d.f. of the normal (Gaussian) distribution with mean zero and variance one.

4. EXAMPLES

Equation (1) is fitted by AS to data from a number of Indian rainfall stations. In order to make the distribution of rainfall fairly homogeneous across each data set studied, different calendar months are analysed separately. As the coefficient of variation (CV) of the data increases, so the NRC based on those data moves towards the top left (north-west) of the diagram, and by changing the constants a, b, and c in equations (1) and (2) a variety of NRCs can be well-fitted by these equations. Similar families of curves can also be generated by varying the parameter $\gamma$ in the Weibull family (equation (13)). This is illustrated in Figures 1 and 2.

It is suggested by AS that the NRCs are defined uniquely by their CVs. This certainly holds for the Weibull cases that we have examined: both the curve and the CV depend only on $\gamma$ and not on $\delta$. For equations (1) and (2), a CV cannot be calculated readily, because equations (5) and (6) do not lead to explicit expressions for the mean and variance of the corresponding probability distributions.

Among their data sets, AS take measurements for Mangalore in July and for Bombay in September as special test cases, because they represent extremes in terms of the CV. Mangalore in July has a CV of 96 per cent, whereas Bombay in September has a CV of 224 per cent. To assess the fit of equations (1) and (2) to those data, AS find values of y corresponding to

$$x = 0.03, 0.05, 0.10, 0.20, 0.30, \ldots, 0.80, 0.90 \tag{14}$$

from the data set for Mangalore, and corresponding to

$$x = 0.02, 0.03, 0.10, 0.20, \ldots, 0.80, 0.90 \tag{15}$$

for Bombay. The x value in the data, x*, corresponding to y = 0.50 is also included in this set, and equals 0.154 for Mangalore and 0.055 for Bombay. For each value of y thus found, x is calculated using equations (1) and (2), with estimated values of a, b, and c, and the x values calculated are compared with the values in equations (14) and (15). For equation (1), b and c are estimated by fitting the NRC exactly at (x*, 0.5) and (0.5, y*), where y* is the value of y corresponding to x = 0.5. To estimate b in equation (2), the NRC is fitted exactly at (0.5, y*).
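The exact-fit conditions described above determine the constants in closed form once the curves are written in proportions. The sketch below assumes the proportion forms x = y exp[-b(1 - y)^c] for equation (1) and x = y exp[-b(1 - y)] for equation (2), the latter using a = exp(-b). The function names are illustrative, the value y* = 0.81 is a hypothetical input rather than a figure from AS, and the resulting constants need not match those reported by AS, who work in percentages.

```python
import math

def fit_eq2_b(y_star):
    """b such that x = y*exp(-b*(1 - y)) passes through (x, y) = (0.5, y_star)."""
    return math.log(2 * y_star) / (1 - y_star)

def fit_eq1_bc(x_star, y_star):
    """(b, c) such that x = y*exp(-b*(1 - y)**c) passes through
    (x_star, 0.5) and (0.5, y_star); needs x_star < 0.5 < y_star."""
    A = math.log(0.5 / x_star)   # equals b * 0.5**c
    B = math.log(2 * y_star)     # equals b * (1 - y_star)**c
    c = math.log(A / B) / math.log(0.5 / (1 - y_star))
    b = A / 0.5 ** c
    return b, c

def eq1(y, b, c):
    """Equation (1) in proportion form."""
    return y * math.exp(-b * (1 - y) ** c)

x_star = 0.154   # x at y = 0.5 for Mangalore, as quoted in the text
y_star = 0.81    # hypothetical y at x = 0.5 (assumption, not from AS)
b1, c1 = fit_eq1_bc(x_star, y_star)
b2 = fit_eq2_b(y_star)
print(eq1(0.5, b1, c1), eq1(y_star, b1, c1))   # reproduces x_star and 0.5
print(eq1(y_star, b2, 1.0))                     # equation (2) is the c = 1 case
```

Fitting two constants to two points leaves no freedom for the rest of the curve, which is why the quality of fit is then judged at the remaining x values of equations (14) and (15).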
Tables I and II and Figures 1 and 2 give information on the fit of the equations used by AS for these data sets. These also include corresponding information when NRCs based on exponential and Weibull distributions are fitted to data from Mangalore and Bombay respectively. Tables I and II present the mean, standard deviations, and root-mean-square error ($= [(\text{mean})^2 + (\text{standard deviation})^2]^{1/2}$) of the differences between the fitted x values and those in equation (14).

[Figure 1. Fitted Bombay NRC using equation (2); fitted Bombay NRC using equation (1); fitted Mangalore NRC using equation (2); observed NRCs for Bombay (upper curve) and Mangalore (lower curve). Horizontal axis: x, from 0.0 to 1.0.]

[Figure 2. Fitted Weibull NRC, $\gamma = 1/2$; fitted Weibull NRC, $\gamma = 2/3$; fitted Weibull NRC, $\gamma = 1$ (exponential distribution); observed NRCs for Bombay (upper curve) and Mangalore (lower curve). Horizontal axis: x, from 0.0 to 1.0.]

Table I. Means, standard deviations, and root-mean-square errors (RMSE) of differences between observed and fitted x values in NRC curves for Mangalore July rainfall

                      Mean     Standard deviation    RMSE
Equation (2)          0.09     1.46                  1.46
Equation (1)          0.37     0.77                  0.85
Exponential          -0.61     0.59                  0.85

Table II. Means, standard deviations, and root-mean-square errors (RMSE) of differences between observed and fitted x values in NRC curves for Bombay September rainfall

                      Mean     Standard deviation    RMSE
Equation (2)         -1.34     6.09                  6.23
Equation (1)         -0.09     0.67                  0.67
Weibull (γ = 1/2)    -2.26     1.18                  2.55
Weibull (γ = 2/3)    10.66     4.70                 11.65
Weibull (γ = 1/3)   -14.99     8.66                 17.30

For Mangalore, equation (1) gives a better fit than equation (2), but fitting the curve corresponding to an exponential distribution is as good as equation (1). Note that the CV for the exponential distribution is 100 per cent, close to that of the Mangalore data, which is 96.4 per cent. For Bombay, equation (1) is much better than equation (2), whereas the Weibull with $\gamma = 1/2$ gives an intermediate fit. It is likely that there are Weibull distributions with fits to the NRC that are competitive with that of equation (2). No attempt has been made to optimize the Weibull fit because of the difficulty of finding an expression for x in terms of y for general values of $\gamma$.

In terms of finding the best-fitting Weibull distribution to the NRC of a data set the CV clearly gives some guidance. The CV for a Weibull with $\gamma = 1/2$ is 224 per cent, precisely that of the Bombay data, while the CVs for Weibull distributions with $\gamma = 2/3$ and $\gamma = 1/3$ are 154 per cent and 436 per cent respectively. These distributions give much worse fits (see Table II), which also suggests that to achieve a mean difference (bias) as close to zero as possible $\gamma$ should be taken slightly above 1/2.

The values of b fitted by AS using equation (1) are 3.10 and 15.15 for Mangalore and Bombay respectively. The upper and lower bounds for rainfall values implied by these NRCs are $[\mu\exp(-b), \mu(1 + b)]$. The mean $\mu$ is unknown, but if it is estimated by the sample mean, we find a range of rainfall values of [1.6 mm, 145.1 mm] and [$4 \times 10^{-6}$ mm, 229 mm] for Mangalore and Bombay respectively.
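Several of the quantities quoted in this section follow from short formulas and can be recomputed. The sketch below is illustrative only: the function names are invented here, the Weibull CV uses the standard moment expression in terms of the gamma function, the RMSE identity assumes the standard deviation was computed with divisor n, and the Mangalore mean is not given in the text, so it is back-solved from the quoted upper bound of 145.1 mm.

```python
import math

def rmse_from_mean_sd(mean, sd):
    """RMSE as sqrt(mean**2 + sd**2), the form used in Tables I and II."""
    return math.hypot(mean, sd)

def weibull_cv(shape):
    """Coefficient of variation of a Weibull distribution; depends on shape only."""
    m1 = math.gamma(1 + 1 / shape)   # first moment for unit scale
    m2 = math.gamma(1 + 2 / shape)   # second moment for unit scale
    return math.sqrt(m2 / m1 ** 2 - 1)

def truncation_range(mean, b):
    """Rainfall range [mean*exp(-b), mean*(1 + b)] implied by the fitted NRC."""
    return mean * math.exp(-b), mean * (1 + b)

print(rmse_from_mean_sd(-2.26, 1.18))            # Table II, Weibull shape 1/2 row
print(weibull_cv(0.5), weibull_cv(1 / 3))        # sqrt(5) and sqrt(19)
print(weibull_cv(2 / 3))                         # roughly 1.55

mangalore_mean = 145.1 / (1 + 3.10)              # assumption: back-solved, in mm
print(truncation_range(mangalore_mean, 3.10))    # lower bound near 1.6 mm
```

The CV values for shapes 1/2 and 1/3 correspond to the 224 and 436 per cent quoted above, and the value for shape 2/3 is close to the quoted 154 per cent.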

If equation (2) is fitted, there is no upper bound, but the lower bound is 2.8 mm for Mangalore and 0.005 mm for Bombay. Although the truncation of values for Bombay seems to be of little practical importance, there is a real possibility of observing values for Mangalore outside the allowable range, using either equation (1) or (2).

5. DISCUSSION

Normalized rainfall curves provide a useful means of representing rainfall distributions. However, the choice of formulae for NRCs is problematic. Empirical formulae (1) and (2) are found to correspond to probability distributions for which the p.d.f. cannot be written down explicitly; nor can their means or variances. Furthermore, the distributions are truncated in the sense that rainfall values above and below certain thresholds have zero probability of occurrence. This truncation may not be too serious in practice, but it is nevertheless more realistic, for example when simulating rainfall amounts, to work with NRCs that correspond to standard probability distributions. Unfortunately, although the p.d.f., means and variances can be written down for families of distributions such as gamma and Weibull, there are difficulties in finding explicit expressions for NRCs for such probability distributions, although some progress is possible for Weibull distributions. It appears that, as yet, there is no ideal solution to the problem of modelling NRCs.

ACKNOWLEDGEMENTS

We are grateful to two anonymous referees, whose insights and comments helped considerably in our understanding of the relationships between the NRCs of equations (1) and (2) and corresponding probability distributions.

REFERENCES

Ananthakrishnan, R. and Soman, M. K. 1989. Statistical distribution of daily rainfall and its association with the coefficient of variation of rainfall series, Int. J. Climatol., 9, 485-500.
Olascoago, M. J. 1950. Some aspects of Argentinian rainfall, Tellus, 2, 312-318.
Stern, R. D. and Coe, R. 1984. A model fitting analysis of daily rainfall data, J. Roy. Statist. Soc. Ser. A, 147, 1-34 (including discussion).