Chapter 2. Basic concepts of probability

Demetris Koutsoyiannis
Department of Water Resources and Environmental Engineering, Faculty of Civil Engineering, National Technical University of Athens, Greece

Summary

This chapter aims to serve as a reminder of basic concepts of probability theory, rather than a systematic and complete presentation of the theory. The text follows Kolmogorov's axiomatic foundation of probability and defines and discusses concepts such as random variables, distribution functions, independent and dependent events, conditional probability, expected values, moments and L-moments, joint, marginal and conditional distributions, stochastic processes, stationarity, ergodicity, the central limit theorem, and the normal, χ² and Student distributions. Although the presentation is general and abstract, several examples with analytical and numerical calculations, as well as practical discussions, are given, which focus on geophysical, and particularly hydrological, processes.

2.1 Axiomatic foundation of probability theory

For the understanding and the correct use of probability, it is very important to insist on the definitions and clarification of its fundamental concepts. Such concepts may differ from other, more familiar, arithmetic and mathematical concepts, and this may create confusion or even collapse of our cognitive construction, if we do not base it on concrete foundations. For instance, in our everyday use of mathematics, we expect that all quantities are expressed by numbers and that the relationship between two quantities is expressed by the notion of a function, which to a numerical input quantity associates (maps) another numerical quantity, a unique output. Probability too does such a mapping, but the input quantity is not a number but an event, which mathematically can be represented as a set. Probability is then a quantified likelihood that the specific event will happen. This type of representation was proposed by Kolmogorov (1956).* There are other probability systems different from Kolmogorov's axiomatic system, according to which the input is not a set. Thus, in Jaynes (2003)† the input of the mapping is a logical proposition and probability is a quantification of the plausibility of the proposition. The two systems are conceptually different but the differences mainly rely on

* Here we cite the English translation, second edition, whilst the original publication was in German in 1933.
† Jaynes's book that we cite here was published after his death in 1998.

interpretation rather than on the mathematical results. Here we will follow Kolmogorov's system.

Kolmogorov's approach to probability theory is based on the notion of measure, which maps sets onto numbers. The objects of probability theory, the events, to which probability is assigned, are thought of as sets. For instance, the outcome of a roulette spin, i.e. the pocket in which the ball eventually falls on the wheel, is one of 37 (in a European roulette; 38 in an American one) pockets numbered 0 to 36 and coloured black or red (except 0, which is coloured green). Thus all sets {0}, {1}, …, {36} are events (also called elementary events). But they are not the only ones. All possible subsets of Ω, including the empty set Ø, are events. The set Ω := {0, 1, …, 36} is an event too. Because any possible outcome is contained in Ω, the event Ω occurs in any case and it is called the certain event. The sets ODD := {1, 3, 5, …, 35}, EVEN := {2, 4, 6, …, 36}, RED := {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}, and BLACK := Ω − RED − {0} are also events (in fact, betable).

While events are represented as sets, in probability theory there are some differences from set theory in terminology and interpretation, which are shown in Table 2.1.

Table 2.1 Terminology correspondence in set theory and probability theory (adapted from Kolmogorov, 1956)

Set theory | Events
A = Ø | Event A is impossible
A = Ω | Event A is certain
AB = Ø (or A ∩ B = Ø; disjoint sets) | Events A and B are incompatible (mutually exclusive)
AB⋯N = Ø | Events A, B, …, N are incompatible
X := AB⋯N (or X := A ∩ B ∩ ⋯ ∩ N) | Event X is defined as the simultaneous occurrence of A, B, …, N
X := A + B + ⋯ + N (or X := A ∪ B ∪ ⋯ ∪ N) | Event X is defined as the occurrence of at least one of the events A, B, …, N
X := A − B | Event X is defined as the occurrence of A and, at the same time, the non-occurrence of B
Ā = Ω − A (the complementary of A) | The opposite event Ā, consisting of the non-occurrence of A
B ⊆ A (B is a subset of A) | From the occurrence of event B follows the inevitable occurrence of event A

Based on Kolmogorov's (1956) axiomatization, probability theory is based on three fundamental concepts and four axioms. The concepts are:

1. A non-empty set Ω, sometimes called the basic set, sample space or the certain event, whose elements ω are known as outcomes or states.

2. A set Σ known as σ-algebra or σ-field whose elements E are subsets of Ω, known as events. Ω and Ø are both members of Σ and, in addition, (a) if E is in Σ then the complement Ω − E is in Σ; (b) the union of countably many sets in Σ is also in Σ.

3. A function P called probability that maps events to real numbers, assigning each event E (member of Σ) a number between 0 and 1.

The triplet (Ω, Σ, P) is called a probability space. The four axioms, which define properties of P, are:

I. Non-negativity: For any event A, P(A) ≥ 0.  (2.1.I)
II. Normalization: P(Ω) = 1.  (2.1.II)
III. Additivity: For any events A, B with AB = Ø, P(A + B) = P(A) + P(B).  (2.1.III)
IV. Continuity at zero: If A₁ ⊇ A₂ ⊇ ⋯ ⊇ Aₙ ⊇ ⋯ is a decreasing sequence of events, with A₁A₂⋯Aₙ⋯ = Ø, then lim_{n→∞} P(Aₙ) = 0.  (2.1.IV)

In the case that Σ is finite, axiom IV follows from axioms I–III; in the general case, however, it should be put as an independent axiom.

2.2 Random variables

A random variable X is a function that maps outcomes to numbers, i.e. quantifies the sample space Ω. More formally, a real single-valued function X(ω), defined on the basic set Ω, is called a random variable if, for each choice of a real number a, the set {X < a} of all ω for which the inequality X(ω) < a holds true belongs to Σ.

With the notion of the random variable we can conveniently express events using basic mathematics. In most cases this is done almost automatically. For instance, in the roulette case a random variable that takes values 0 to 36 is intuitively assumed when we deal with a roulette experiment. We must be attentive that a random variable is not a number but a function. Intuitively, we could think of a random variable as an object that represents simultaneously all possible states and only them. A particular value x that a random variable may take in a random experiment, else known as a realization of the variable, is a number. Usually we denote a random variable by an upper case letter, e.g. X, and its realization by a lower case letter, e.g. x. The two should not be confused. For example, if X represents the rainfall depth expressed in millimetres for a given rainfall episode (in this case Ω is the set of all possible rainfall depths), then {X ≤ 100} represents an event in the probability notion (a subset of Ω and a member of Σ, not to be confused with a physical event or episode) and has a probability P{X ≤ 100}.* If x is a realization of X, then x ≤ 100 is not an event but a relationship between the two numbers x and 100, which can be either true or false. In this respect it has no meaning to write P{x ≤ 100}. Furthermore, if we consider the two variables X and Y it is meaningful to write P{X ≤ Y} (i.e. {X ≤ Y} represents an event), but there is no meaning in the expression P{x ≤ y}.

* The consistent notation here would be P({X ≤ 100}). However, we simplified it dropping the parentheses; we will follow this simplification throughout this text. Some texts follow another convention, i.e., they drop the curly brackets.
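To make the triplet (Ω, Σ, P) concrete, here is a minimal Python sketch (an illustration, assuming equally likely pockets, an assumption the chapter's discussion of events does not itself need) that represents the roulette events of section 2.1 as sets and checks the axioms numerically:

```python
from fractions import Fraction

# Sample space of a European roulette: pockets 0 to 36
OMEGA = frozenset(range(37))

def P(event):
    """Probability as a counting measure on the finite set OMEGA:
    maps an event (a subset of OMEGA) to a number in [0, 1]."""
    assert event <= OMEGA, "an event must be a subset of OMEGA"
    return Fraction(len(event), len(OMEGA))

ODD   = frozenset(range(1, 37, 2))
EVEN  = frozenset(range(2, 37, 2))
RED   = frozenset({1, 3, 5, 7, 9, 12, 14, 16, 18, 19,
                   21, 23, 25, 27, 30, 32, 34, 36})
BLACK = OMEGA - RED - {0}

print(P(OMEGA))               # 1: normalization (axiom II)
print(P(ODD), P(RED))         # 18/37 each: non-negative (axiom I)
# ODD and EVEN are incompatible (disjoint), so additivity (axiom III) applies:
assert P(ODD | EVEN) == P(ODD) + P(EVEN)    # 36/37
```

Since Ω is finite here, Σ can be taken as the set of all subsets of Ω, and axiom IV follows from the first three, as noted above.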

2.3 Distribution function

The distribution function is a function of the real variable x defined by

F_X(x) := P{X ≤ x}  (2.2)

where X is a random variable.* Clearly, F_X(x) maps numbers (x values) to numbers (probabilities). The random variable to which this function refers (is associated) is not an argument of the function; it is usually denoted as a subscript of F (or even omitted if there is no risk of confusion). Typically F(x) has some mathematical expression depending on some parameters β_i. The domain of F(x) is not identical to the range of the random variable X; rather, it is always the set of real numbers. The distribution function is a non-decreasing function obeying the relationship

F(−∞) = 0 ≤ F(x) ≤ F(+∞) = 1  (2.3)

For its non-decreasing attitude, in the English literature the distribution function is also known as the cumulative distribution function (cdf), though "cumulative" is not necessary here. In hydrological applications the distribution function is also known as the non-exceedence probability. Correspondingly, the quantity

F*_X(x) := P{X > x} = 1 − F_X(x)  (2.4)

is known as the exceedence probability; it is a non-increasing function and obeys

F*(−∞) = 1 ≥ F*(x) ≥ F*(+∞) = 0  (2.5)

The distribution function is always continuous on the right; however, if the basic set Ω is finite or countable, F(x) is discontinuous on the left at all points x_i that correspond to outcomes ω_i, and it is constant in between consecutive points. In other words, the distribution function in these cases is staircase-like and the random variable is called discrete. If F(x) is continuous, then the random variable is called continuous. A mixed case with a continuous part and a discrete part is also possible; in this case the distribution function has some discontinuities on the left, without being staircase-like. The derivative of the distribution function

f_X(x) := dF_X(x)/dx  (2.6)

* In the original Kolmogorov's writing, F(x) is defined as P{X < x}; however, replacing < with ≤ makes the handling of the distribution function more convenient and has prevailed in later literature.

is called the probability density function (sometimes abbreviated as pdf). In continuous variables this function is defined everywhere, but this is not the case in discrete variables, unless we use Dirac's δ functions. The basic properties of f(x) are

f(x) ≥ 0,  ∫_{−∞}^{+∞} f(x) dx = 1  (2.7)

Obviously, the probability density function does not represent a probability; therefore it can take values higher than 1. Its relationship with probability is described by the following equation:

f(x) = lim_{Δx→0} P{x ≤ X ≤ x + Δx} / Δx  (2.8)

The distribution function can be calculated from the density function by the following relationship, inverse of (2.6):

F_X(x) = ∫_{−∞}^{x} f_X(ξ) dξ  (2.9)

For continuous random variables, the inverse function F_X^{−1}( ) of F_X(x) exists. Consequently, the equation u = F_X(x) has a unique solution for x, that is x_u = F_X^{−1}(u). The value x_u, which corresponds to a specific value u of the distribution function, is called the u-quantile of the variable X.

2.3.1 An example of the basic concepts of probability

For clarification of the basic concepts of probability theory, we give the following example from hydrology. We are interested in the mathematical description of the possibilities that a certain day in a specific place and time of the year is wet or dry. These are the outcomes or states of our problem, so the basic set or sample space is

Ω = {wet, dry}

The field Σ contains all possible events, i.e.,

Σ = {Ø, {wet}, {dry}, Ω}

To fully define probability on Σ it suffices to define the probability of either one of the states, say P(wet). In fact this is not easy; usually it is done by induction, and it needs a set of observations to be available and concepts of statistics theory (see chapter 3) to be applied. For the time being, let us arbitrarily assume that P{wet} = 0.2. The remaining probabilities are obtained by applying the axioms. Clearly, P(Ω) = 1 and P(Ø) = 0. Since wet and dry are incompatible, P{wet} + P{dry} = P({wet} + {dry}) = P(Ω) = 1, so P{dry} = 0.8. We define a random variable X based on the rule

X(dry) = 0,  X(wet) = 1

We can now easily determine the distribution function of X. For any x < 0,

F_X(x) = P{X ≤ x} = 0 (because X cannot take negative values). For 0 ≤ x < 1,

F_X(x) = P{X ≤ x} = P{X = 0} = 0.8

Finally, for x ≥ 1,

F_X(x) = P{X ≤ x} = P{X = 0} + P{X = 1} = 1

The graphical depiction of the distribution function is shown in Fig. 2.1. The staircase-like shape reflects the fact that the random variable X is discrete.

If this mathematical model is to represent a physical phenomenon, we must have in mind that all probabilities depend on a specific location and a specific time of the year. So the model cannot be a global representation of the wet and dry state of a day. The model as formulated here is extremely simplified, because it does not make any reference to the succession of dry or wet states in different days. This is not an error; it simply diminishes the predictive capacity of the model. A better model would describe separately the probability of a wet day following a wet day, a wet day following a dry day (we anticipate that the latter should be smaller than the former), etc. We will discuss this case in section 2.4.2.

Fig. 2.1 Distribution function of a random variable representing the dry or wet state of a given day at a certain area and time of the year.
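The staircase distribution function of this example translates directly into a piecewise function. The following sketch (an added illustration, using the assumed value P{wet} = 0.2 of the example) evaluates F_X at selected points:

```python
def F(x, p_wet=0.2):
    """Distribution function of X, where X(dry) = 0, X(wet) = 1:
    a staircase with jumps at x = 0 and x = 1."""
    if x < 0:
        return 0.0              # X cannot take negative values
    elif x < 1:
        return 1.0 - p_wet      # P{X <= x} = P{X = 0} = 0.8
    else:
        return 1.0              # P{X = 0} + P{X = 1} = 1

for x in (-1.0, 0.0, 0.5, 1.0, 2.0):
    print(x, F(x))              # 0.0, 0.8, 0.8, 1.0, 1.0
```

Note that F is continuous on the right, as required by definition (2.2): F(0) already equals 0.8 and F(1) equals 1.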

2.4 Independent and dependent events, conditional probability

Two events A and B are called independent (or stochastically independent) if

P(AB) = P(A) P(B)  (2.10)

Otherwise A and B are called (stochastically) dependent. The definition can be extended to many events. Thus, the events A₁, A₂, …, are independent if

P(A_{i₁} A_{i₂} ⋯ A_{iₙ}) = P(A_{i₁}) P(A_{i₂}) ⋯ P(A_{iₙ})  (2.11)

for any finite set of distinct indices i₁, i₂, …, iₙ.

The handling of probabilities of independent events is thus easy. However, this is a special case, because usually natural events are dependent. In the handling of dependent events the notion of conditional probability is vital. By definition (Kolmogorov, 1956), the conditional probability of the event A given B (i.e. under the condition that the event B has occurred) is the quotient

P(A|B) := P(AB) / P(B)  (2.12)

Obviously, if P(B) = 0, this conditional probability cannot be defined, while for independent A and B, P(A|B) = P(A). From (2.12) it follows that

P(AB) = P(A|B) P(B) = P(B|A) P(A)  (2.13)

and

P(B|A) = P(A|B) P(B) / P(A)  (2.14)

The latter equation is known as the Bayes theorem. It is easy to prove that the generalization of (2.13) for dependent events takes the forms

P(A₁A₂ ⋯ Aₙ) = P(Aₙ | A₁ ⋯ Aₙ₋₁) P(Aₙ₋₁ | A₁ ⋯ Aₙ₋₂) ⋯ P(A₂ | A₁) P(A₁)  (2.15)

P(A₁A₂ ⋯ Aₙ | B) = P(Aₙ | A₁ ⋯ Aₙ₋₁B) ⋯ P(A₂ | A₁B) P(A₁ | B)  (2.16)

which are known as the chain rules. It is also easy to prove (homework) that if A and B are mutually exclusive, then

P(A + B | C) = P(A | C) + P(B | C)  (2.17)

P(C | A + B) = [P(C | A) P(A) + P(C | B) P(B)] / [P(A) + P(B)]  (2.18)

2.4.1 Some examples on independent events

a. Based on the example of section 2.3.1, calculate the probability that two consecutive days are wet, assuming that the events in the two days are independent.

Let A := {wet} be the event that a day is wet and Ā = {dry} the complementary event that a day is dry. As in section 2.3.1, we assume that p := P(A) = 0.2 and q := P(Ā) = 0.8. Since we are interested in two consecutive days, our basic set will be Ω = {A₁A₂, A₁Ā₂, Ā₁A₂, Ā₁Ā₂}, where indices 1 and 2 correspond to the first and second day, respectively. By the independence assumption, the required probability will be

P(A₁A₂) = P(A₁) P(A₂) = 0.04 =: p²

For completeness we also calculate the probabilities of all other events, which are:

P(A₁Ā₂) = P(Ā₁A₂) = pq = 0.16,  P(Ā₁Ā₂) = q² = 0.64

As anticipated, the sum of the probabilities of all events is 1.

b. Calculate the probability that two consecutive days are wet if it is known that one day is wet.

Knowing that one day is wet means that the event Ā₁Ā₂ should be excluded (has not occurred), or that the composite event A₁A₂ + A₁Ā₂ + Ā₁A₂ has occurred. Thus, we seek the probability

P₂ := P(A₁A₂ | A₁A₂ + A₁Ā₂ + Ā₁A₂)

which, according to the definition of conditional probability, is

P₂ = P(A₁A₂ (A₁A₂ + A₁Ā₂ + Ā₁A₂)) / P(A₁A₂ + A₁Ā₂ + Ā₁A₂) = P(A₁A₂) / P(A₁A₂ + A₁Ā₂ + Ā₁A₂)

Considering that all combinations of events are mutually exclusive, we obtain

P₂ = P(A₁A₂) / [P(A₁A₂) + P(A₁Ā₂) + P(Ā₁A₂)] = p² / (p² + 2pq) = p / (p + 2q) = 1/9 ≈ 0.11

c. Calculate the probability that two consecutive days are wet if it is known that the first day is wet.

Even though it may seem that this question is identical to the previous one, in fact it is not. In the previous question we knew that one day is wet, without knowing which one exactly. Here we have additional information, that the wet day is the first one. This information alters the probabilities, as we will verify immediately. Now we know that the composite event A₁A₂ + A₁Ā₂ has occurred (events Ā₁A₂ and Ā₁Ā₂ should be excluded). Consequently, the probability sought is

P₃ := P(A₁A₂ | A₁A₂ + A₁Ā₂)

which according to the definition of conditional probability is

P₃ = P(A₁A₂) / P(A₁A₂ + A₁Ā₂)

or

P₃ = P(A₁A₂) / [P(A₁A₂) + P(A₁Ā₂)] = p² / (p² + pq) = p / (p + q) = p = 0.2

It is not a surprise that this is precisely the probability that one day is wet, as in section 2.3.1.
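All three calculations above can be checked by brute-force enumeration of the four elementary two-day outcomes; a minimal sketch, assuming as before p = 0.2:

```python
from itertools import product

p = 0.2                                   # P{wet}; P{dry} = 1 - p
prob2 = {}                                # joint probabilities under independence
for day1, day2 in product(("wet", "dry"), repeat=2):
    p1 = p if day1 == "wet" else 1 - p
    p2 = p if day2 == "wet" else 1 - p
    prob2[(day1, day2)] = p1 * p2

def P(condition):
    """Probability of the composite event selected by `condition`."""
    return sum(pr for outcome, pr in prob2.items() if condition(outcome))

p_both  = P(lambda o: o == ("wet", "wet"))   # example a
p_one   = P(lambda o: "wet" in o)            # at least one day wet
p_first = P(lambda o: o[0] == "wet")         # first day wet

print(p_both)                # 0.04 = p**2
print(p_both / p_one)        # 0.111... = 1/9  (example b)
print(p_both / p_first)      # 0.2 = p         (example c)
```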

With these examples we demonstrated two important things: (a) that the prior information we have in a problem may introduce dependence in events that are initially assumed independent; and, more generally, (b) that probability is not an objective and invariant quantity, characteristic of physical reality, but a quantity that depends on our knowledge or information on the examined phenomenon. This should not seem strange, as it is always the case in science. For instance, the location and velocity of a moving particle are not absolute objective quantities; they depend on the observer's coordinate system. The dependence of probability on the given information, or its subjectivity, should not be taken as ambiguity; there was nothing ambiguous in calculating the above probabilities, based on the information given each time.

2.4.2 An example on dependent events

The independence assumption in problem 2.4.1a is obviously a poor representation of the physical reality. To make a more realistic model, let us assume that the probabilities of today being wet (A₂) or dry (Ā₂) depend on the state yesterday (A₁ or Ā₁). It is reasonable to assume that the following inequalities hold:

P(A₂|A₁) > P(A₂) = p,  P(Ā₂|Ā₁) > P(Ā₂) = q
P(A₂|Ā₁) < P(A₂) = p,  P(Ā₂|A₁) < P(Ā₂) = q

The problem now is more complicated than before. Let us arbitrarily assume that

P(A₂|A₁) = 0.4,  P(A₂|Ā₁) = 0.15

Since the event A₂ + Ā₂ is certain, we can calculate

P(Ā₂|A₁) = 1 − P(A₂|A₁) = 0.6

Similarly,

P(Ā₂|Ā₁) = 1 − P(A₂|Ā₁) = 0.85

As the event A₁ + Ā₁ is certain (i.e. P(A₁ + Ā₁) = 1), using (2.18) we can write

P(A₂) = P(A₂|A₁) P(A₁) + P(A₂|Ā₁) P(Ā₁)  (2.19)

It is reasonable to assume that the unconditional probabilities do not change after one day, i.e. that P(A₂) = P(A₁) = p and P(Ā₂) = P(Ā₁) = q = 1 − p. Thus, (2.19) becomes

p = 0.4 p + 0.15 (1 − p)

from which we find p = 0.2 and q = 0.8. (Here we have deliberately chosen the values of P(A₂|A₁) and P(A₂|Ā₁) such as to find the same p and q as in 2.4.1a.)

Now we can proceed to the calculation of the probability that both days are wet:

P(A₁A₂) = P(A₂|A₁) P(A₁) = 0.4 × 0.2 = 0.08 > p² = 0.04

For completeness we also calculate the probabilities of all other events, which are:

P(Ā₁A₂) = P(A₂|Ā₁) P(Ā₁) = 0.15 × 0.8 = 0.12,  P(A₁Ā₂) = P(Ā₂|A₁) P(A₁) = 0.6 × 0.2 = 0.12
P(Ā₁Ā₂) = P(Ā₂|Ā₁) P(Ā₁) = 0.85 × 0.8 = 0.68 > q² = 0.64

Thus, the dependence resulted in higher probabilities of consecutive events that are alike. This corresponds to a general natural behaviour that is known as persistence (see also chapter 4).
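The dependence structure assumed in this example is that of a two-state Markov chain. A short sketch (an added illustration with the transition probabilities assumed above) solves the consistency condition (2.19) for p and reproduces the joint probabilities:

```python
# Transition probabilities: trans[yesterday][today], with dry = 0, wet = 1
trans = {0: {0: 0.85, 1: 0.15},
         1: {0: 0.60, 1: 0.40}}

# Equation (2.19) with P(A2) = P(A1) = p:  p = 0.4 p + 0.15 (1 - p)
p = trans[0][1] / (1 - trans[1][1] + trans[0][1])
q = 1 - p
print(p, q)                                  # 0.2, 0.8

# Joint probabilities of (yesterday, today)
marginal = {0: q, 1: p}
joint = {(i, j): marginal[i] * trans[i][j] for i in (0, 1) for j in (0, 1)}
print(joint[(1, 1)], p**2)                   # 0.08 > 0.04: persistence
print(joint[(0, 0)], q**2)                   # 0.68 > 0.64
```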

2.5 Expected values and moments

If X is a continuous random variable and g(X) is an arbitrary function of X, then we define as the expected value or mean of g(X) the quantity

E[g(X)] := ∫_{−∞}^{+∞} g(x) f_X(x) dx  (2.20)

Correspondingly, for a discrete random variable taking on the values x₁, x₂, …,

E[g(X)] := Σ_i g(x_i) P(X = x_i)  (2.21)

For certain types of functions g(X) we obtain very commonly used statistical parameters, as specified below:

1. For g(X) = X^r, where r = 0, 1, 2, …, the quantity

m_X^{(r)} := E[X^r]  (2.22)

is called the rth moment (or the rth moment about the origin) of X. For r = 0, obviously the moment is 1.

2. For g(X) = X, the quantity

m_X := E[X]  (2.23)

(that is, the first moment) is called the mean of X. An alternative, commonly used, symbol for E[X] is µ_X.

3. For g(X) = (X − m_X)^r, where r = 0, 1, 2, …, the quantity

µ_X^{(r)} := E[(X − m_X)^r]  (2.24)

is called the rth central moment of X. For r = 0 and 1 the central moments are respectively 1 and 0. The central moments are related to the moments about the origin by

µ_X^{(r)} = m_X^{(r)} − C(r, 1) m_X^{(r−1)} m_X + ⋯ + (−1)^j C(r, j) m_X^{(r−j)} m_X^j + ⋯ + (−1)^{r−1} (r − 1) m_X^r  (2.25)

where C(r, j) := r! / [j! (r − j)!] denotes the binomial coefficient. These take the following forms for small r:

µ_X^{(2)} = m_X^{(2)} − m_X²
µ_X^{(3)} = m_X^{(3)} − 3 m_X^{(2)} m_X + 2 m_X³  (2.26)
µ_X^{(4)} = m_X^{(4)} − 4 m_X^{(3)} m_X + 6 m_X^{(2)} m_X² − 3 m_X⁴

and can be inverted to read:

m_X^{(2)} = µ_X^{(2)} + m_X²
m_X^{(3)} = µ_X^{(3)} + 3 µ_X^{(2)} m_X + m_X³  (2.27)
m_X^{(4)} = µ_X^{(4)} + 4 µ_X^{(3)} m_X + 6 µ_X^{(2)} m_X² + m_X⁴

4. For g(X) = (X − m_X)², the quantity

σ_X² := µ_X^{(2)} = E[(X − m_X)²] = E[X²] − m_X²  (2.28)

(that is, the second central moment) is called the variance of X. The variance is also denoted as Var[X]. Its square root, denoted as σ_X or StD[X], is called the standard deviation of X.

The above families of moments are the classical ones, having been used for more than a century. More recently, other types of moments have been introduced, and some of them are already in wide use in hydrology. We will discuss two families.

5. For g(X) = X [F_X(X)]^r, where r = 0, 1, 2, …, the quantity

β_X^{(r)} := E{X [F_X(X)]^r} = ∫_{−∞}^{+∞} x [F_X(x)]^r f_X(x) dx = ∫_0^1 x(u) u^r du  (2.29)

where x(u) := F_X^{−1}(u), is called the rth probability weighted moment of X (Greenwood et al., 1979). All probability weighted moments have dimensions identical to those of X (this is not the case with the other moments described earlier).

6. For g(X) = X P*_{r−1}(F_X(X)), where r = 1, 2, … and P*_r(u) is the rth shifted Legendre polynomial, i.e.,

P*_r(u) := Σ_{k=0}^{r} p*_{r,k} u^k,  with  p*_{r,k} := (−1)^{r+k} C(r, k) C(r + k, k) = (−1)^{r+k} (r + k)! / [(k!)² (r − k)!]

the quantity

λ_X^{(r)} := E[X P*_{r−1}(F_X(X))] = ∫_0^1 x(u) P*_{r−1}(u) du  (2.30)

is called the rth L-moment of X (Hosking, 1990). Similar to the probability weighted moments, the L-moments have dimensions identical to those of X. The L-moments are related to the probability weighted moments by

λ_X^{(r)} = Σ_{k=0}^{r−1} p*_{r−1,k} β_X^{(k)}  (2.31)

which for the most commonly used r takes the specific forms

λ_X^{(1)} = β_X^{(0)} (= m_X)
λ_X^{(2)} = 2β_X^{(1)} − β_X^{(0)}
λ_X^{(3)} = 6β_X^{(2)} − 6β_X^{(1)} + β_X^{(0)}  (2.32)
λ_X^{(4)} = 20β_X^{(3)} − 30β_X^{(2)} + 12β_X^{(1)} − β_X^{(0)}

In all the above quantities the index X may be omitted if there is no risk of confusion.

The first four moments, central moments and L-moments are widely used in hydrological statistics, as they have a conceptual or geometrical meaning that is easily comprehensible. Specifically, they describe the location, dispersion, skewness and kurtosis of the distribution, as explained below. Alternatively, other statistical parameters with similar meaning are also used, which are also explained below.

2.5.1 Location parameters

Essentially, the mean describes the location of the centre of gravity of the shape defined by the probability density function and the horizontal axis (Fig. 2.2a). It is also equivalent to the static moment of this shape about the vertical axis (given that the area of the shape equals 1). Often, the following types of location parameters are also used:

1. The mode, or most probable value, x_p, is the value of x for which the density f(x) becomes maximum, if the random variable is continuous, or, for discrete variables, the probability becomes maximum. If f(x) has one, two or many maxima, we say that the distribution is unimodal, bi-modal or multi-modal, respectively.

2. The median, x_0.5, is the value for which P{X ≤ x_0.5} = P{X ≥ x_0.5} = 1/2, if the random variable is continuous (analogously we can define it for a discrete variable). Thus, a vertical line at the median separates the shape of the density function into two equivalent parts, each having an area of 1/2.

Generally, the mean, the mode and the median are not identical unless the density has a symmetrical and unimodal shape.
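Equations (2.31)–(2.32) are directly computable. The sketch below (an added illustration; it uses the standard unbiased sample estimator of the probability weighted moments, anticipating the estimation concepts of chapter 3) computes the first four sample L-moments of a data set:

```python
from math import comb
import random

def sample_l_moments(data):
    """First four sample L-moments, via the unbiased sample probability
    weighted moments b_r and the combinations (2.32)."""
    x = sorted(data)
    n = len(x)
    # b_r = (1/n) * sum over ranks of [C(j, r) / C(n-1, r)] * x[j], j = 0..n-1
    b = [sum(comb(j, r) * x[j] for j in range(n)) / (n * comb(n - 1, r))
         for r in range(4)]
    return (b[0],
            2*b[1] - b[0],
            6*b[2] - 6*b[1] + b[0],
            20*b[3] - 30*b[2] + 12*b[1] - b[0])

random.seed(1)
sample = [random.expovariate(1 / 20) for _ in range(100_000)]
l1, l2, l3, l4 = sample_l_moments(sample)
print(l1, l2, l3 / l2, l4 / l2)   # roughly 20, 10, 0.333, 0.167 (cf. section 2.5.5)
```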

2.5.2 Dispersion parameters

The variance of a random variable, and its square root, the standard deviation, which has the same dimensions as the random variable, provide a measure of the scatter or dispersion of the probability density around the mean. Thus, a small variance shows a concentrated distribution (Fig. 2.2b). The variance cannot be negative. The lowest possible value is zero, and this corresponds to a variable that takes one value only (the mean) with absolute certainty. Geometrically, it is equivalent to the moment of inertia about the vertical axis passing from the centre of gravity of the shape defined by the probability density function and the horizontal axis.

Fig. 2.2 Demonstration of the shape characteristics of the probability density function in relation to various parameters of the distribution function: (a) effect of the mean (curves differing only in their means); (b) effect of the standard deviation (curves differing only in their standard deviations); (c) effect of the coefficient of skewness (a symmetric curve and positively and negatively skewed curves with the same mean and standard deviation); (d) effect of the coefficient of kurtosis (curves differing only in their coefficients of kurtosis).

Alternative measures of dispersion are provided by the so-called interquartile range, defined as the difference x_0.75 − x_0.25, i.e. the difference of the 0.75 and 0.25 quantiles (or upper and lower quartiles) of the random variable (they define an area in the density function equal to 0.5), as well as by the second L-moment. The latter is well justified, as it can be shown that

the second L-moment is half the expected value of the difference between any two random realizations of the random variable. If the random variable is positive, as happens with most hydrological variables, two dimensionless parameters are also used as measures of dispersion. These are called the coefficient of variation and the L coefficient of variation, and are defined, respectively, by:

C_v := σ_X / m_X,  τ_X^{(2)} := λ_X^{(2)} / λ_X^{(1)}  (2.33)

2.5.3 Skewness parameters

The third central moment and the third L-moment are used as measures of skewness. A zero value indicates that the density is symmetric. This can be easily verified from the definition of the third central moment. Furthermore, the third L-moment indicates how much the average of the smallest and the largest of three random realizations of the variable exceeds, on average, the middle realization; more precisely, the third L-moment is 2/3 of this expected difference. Clearly then, in a symmetric distribution the distances of the middle value to the smallest and largest ones will be equal to each other, and thus the third L-moment will be zero. If the third central or L-moment is positive or negative, we say that the distribution is positively or negatively skewed, respectively (Fig. 2.2c). In a positively skewed unimodal distribution the following inequality holds: x_p < x_0.5 < m_X; the reverse holds for a negatively skewed distribution. More convenient measures of skewness are the following dimensionless parameters, named the coefficient of skewness and the L coefficient of skewness, respectively:

C_s := µ_X^{(3)} / σ_X³,  τ_X^{(3)} := λ_X^{(3)} / λ_X^{(2)}  (2.34)

2.5.4 Kurtosis parameters

The term kurtosis describes the peakedness of the probability density function around its mode. Quantification of this property is provided by the following dimensionless coefficients, based on the fourth central moment and the fourth L-moment, respectively:

C_k := µ_X^{(4)} / σ_X⁴,  τ_X^{(4)} := λ_X^{(4)} / λ_X^{(2)}  (2.35)

These are called the coefficient of kurtosis and the L coefficient of kurtosis. Reference values for kurtosis are provided by the normal distribution (see below), which has C_k = 3 and τ^{(4)} = 0.1226. Distributions with kurtosis greater than the reference values are called leptokurtic (acute, sharp) and typically have fat tails, so that more of the variance is due to infrequent extreme deviations, as opposed to frequent, modestly sized deviations. Distributions with kurtosis less than the reference values are called platykurtic (flat; Fig. 2.2d).

2.5.5 A simple example of a distribution function and its moments

We assume that the daily rainfall depth during the rain days, X, expressed in mm, for a certain location and time period, can be modelled by the exponential distribution, i.e.,

F(x) = 1 − e^{−x/λ},  x ≥ 0

where λ = 20 mm. We will calculate the location, dispersion, skewness and kurtosis parameters of the distribution. Taking the derivative of the distribution function, we calculate the probability density function:

f(x) = (1/λ) e^{−x/λ},  x ≥ 0

Both the distribution and the density functions are plotted in Fig. 2.3. To calculate the mean, we apply (2.20) for g(X) = X:

m = E[X] = ∫_0^∞ x f(x) dx = (1/λ) ∫_0^∞ x e^{−x/λ} dx

After algebraic manipulations:

m = λ = 20 mm

In a similar manner we find that for any r

m^{(r)} = E[X^r] = r! λ^r

and finally, applying (2.26),

µ^{(2)} = λ² = 400 mm²,  µ^{(3)} = 2λ³ = 16 000 mm³,  µ^{(4)} = 9λ⁴ = 1 440 000 mm⁴

Fig. 2.3 Probability density function and probability distribution function of the exponential distribution, modelling the daily rainfall depth at a hypothetical site and time period.

The mode is apparently zero (see Fig. 2.3). The inverse of the distribution function is calculated as follows:

u = F(x_u) = 1 − e^{−x_u/λ}  ⇒  x_u = −λ ln(1 − u)

Thus, the median is x_0.5 = −λ ln 0.5 = 13.9 mm. We verify that the inequality x_p < x_0.5 < m, which characterizes positively skewed distributions, holds.

The standard deviation is σ = λ = 20 mm and the coefficient of variation C_v = 1. This is a very high value, indicating high dispersion. The coefficient of skewness is calculated from (2.34):

C_s = 2λ³ / λ³ = 2

This verifies the positive skewness of the distribution, as also shown in Fig. 2.3. More specifically, we observe that the density function has an inverse-J shape, in contrast to other, more familiar densities (e.g. in Fig. 2.2) that have a bell shape. The coefficient of kurtosis is calculated from (2.35):

C_k = 9λ⁴ / λ⁴ = 9

Its high value shows that the distribution is leptokurtic, as also depicted in Fig. 2.3.

We proceed now to the calculation of the probability weighted and L-moments, as well as of other parameters based on these. From (2.29) we find

β^{(r)} = ∫_0^1 x(u) u^r du = −λ ∫_0^1 ln(1 − u) u^r du = [λ/(r + 1)] Σ_{i=1}^{r+1} 1/i  (2.36)

(This was somewhat tricky to calculate.) This results in

β^{(0)} = λ,  β^{(1)} = 3λ/4,  β^{(2)} = 11λ/18,  β^{(3)} = 25λ/48  (2.37)

Then, from (2.32), we find the first four L-moments and the three L-moment dimensionless coefficients as follows:

λ^{(1)} = β^{(0)} = λ = 20 mm (= m)
λ^{(2)} = 2(3λ/4) − λ = λ/2 = 10 mm
λ^{(3)} = 6(11λ/18) − 6(3λ/4) + λ = λ/6 = 3.33 mm
λ^{(4)} = 20(25λ/48) − 30(11λ/18) + 12(3λ/4) − λ = λ/12 = 1.67 mm

τ^{(2)} = λ^{(2)}/λ^{(1)} = 0.5,  τ^{(3)} = λ^{(3)}/λ^{(2)} = 0.333,  τ^{(4)} = λ^{(4)}/λ^{(2)} = 0.167

Despite the very dissimilar values in comparison to those of the classical moments, the results indicate the same behaviour, i.e., that the distribution is positively skewed and leptokurtic. In the following chapters we will utilize both classical and L-moments in several hydrological problems.
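The analytical values of this example can be cross-checked numerically by integrating over the quantile domain, exactly as in (2.29) and (2.36). A minimal sketch with numpy (an added illustration; the discretization of u is of course an approximation):

```python
import numpy as np

lam = 20.0                                    # scale parameter lambda, in mm
n = 10**6
u = (np.arange(n) + 0.5) / n                  # midpoints of n equal cells in (0, 1)
x = -lam * np.log(1 - u)                      # x(u), the inverse of F

def integral(values):
    """Midpoint-rule integral over u in (0, 1): each cell has width 1/n."""
    return values.mean()

m = integral(x)                               # mean, ~20 mm
mu2 = integral((x - m)**2)                    # ~400 mm^2
mu3 = integral((x - m)**3)                    # ~16 000 mm^3
print(m, mu2**0.5 / m, mu3 / mu2**1.5)        # ~20, Cv ~ 1, Cs ~ 2

beta = [integral(x * u**r) for r in range(4)]          # PWMs, cf. (2.36)
l2 = 2*beta[1] - beta[0]
l3 = 6*beta[2] - 6*beta[1] + beta[0]
l4 = 20*beta[3] - 30*beta[2] + 12*beta[1] - beta[0]
print(beta[0], l2, l3 / l2, l4 / l2)          # ~20, ~10, ~0.333, ~0.167
```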

2.5.6 Time scale and distribution shape

In the above example we saw that the distribution of a natural quantity such as rainfall, which is very random and simultaneously takes only nonnegative values, at a fine timescale, such as daily, exhibits high variation, strongly positive skewness and an inverted-J shape of the probability density function, which means that the most probable value (mode) is zero. Clearly, rainfall cannot be negative, so its distribution cannot be symmetric. It happens that the main body of rainfall values are close to zero, but a few values are extremely high (with low probability), which creates the distribution tail to the right. As we will see in other chapters, the distribution tails are even longer (or fatter, stronger, heavier) than described by this simple exponential distribution. In the exponential distribution, as demonstrated above, all moments (for any arbitrarily high but finite value of r) exist, i.e. take finite values. This is not, however, the case in long-tail distributions, whose moments above a certain rank r* diverge, i.e. are infinite.

As we proceed from fine to coarser scales, e.g. from the daily toward the annual scale, aggregating more and more daily values, all moments increase, but the standard deviation increases at a smaller rate in comparison to the mean, so the coefficient of variation decreases. In a similar manner, the coefficients of skewness and kurtosis decrease. Thus, the distributions tend to become more symmetric and the density functions take a more bell-shaped pattern. As we will see below, there are theoretical reasons for this behaviour for coarse timescales, which are related to the central limit theorem (see below). A more general theoretical explanation of the observed natural behaviours, both in fine and coarse timescales, is offered by the principle of maximum entropy (Koutsoyiannis, 2005a, b).

2.6 Change of variable

In hydrology we often prefer to use in our analyses, instead of the variable X that naturally describes a physical phenomenon (such as the rainfall depth in the example above), another variable Y which is a one-to-one mathematical transformation of X, e.g. Y = g(X). If X is modelled as a random variable, then Y should be a random variable, too. The event {Y ≤ y} is identical to the event {X ≤ g⁻¹(y)}, where g⁻¹ is the inverse function of g. Consequently, the distribution functions of X and Y are related by

F_Y(y) = P{Y ≤ y} = P{X ≤ g⁻¹(y)} = F_X(g⁻¹(y))  (2.38)

In the case that the variables are continuous and the function g differentiable, it can be shown that the density function of Y is given from that of X by

f_Y(y) = f_X(g⁻¹(y)) / g′(g⁻¹(y))  (2.39)

where g′ is the derivative of g. The application of (2.39) is elucidated in the following examples.

2.6.1 Example 1: the standardized variable

Very often the following transformation of a natural variable X is used:

Z = (X − m_X) / σ_X

This is called the standardized variable; it is dimensionless and, as we will prove below, it has (a) zero mean, (b) unit standard deviation, and (c) third and fourth central moments equal to the coefficients of skewness and kurtosis of X, respectively. From (2.38), setting g⁻¹(Z) = σ_X Z + m_X, we directly obtain

F_Z(z) = F_X(g⁻¹(z)) = F_X(σ_X z + m_X)

Given that g′(x) = 1/σ_X, from (2.39) we obtain

f_Z(z) = f_X(g⁻¹(z)) / g′(g⁻¹(z)) = σ_X f_X(σ_X z + m_X)

Besides, from (2.20) we get

E[Z] = E[g(X)] = ∫_{−∞}^{+∞} [(x − m_X)/σ_X] f_X(x) dx = (1/σ_X) [∫_{−∞}^{+∞} x f_X(x) dx − m_X ∫_{−∞}^{+∞} f_X(x) dx] = (m_X − m_X)/σ_X

and finally

m_Z = E[Z] = 0

This entails that the moments about the origin and the central moments of Z are identical. Thus, the rth moment is

E[Z^r] = E[g(X)^r] = ∫_{−∞}^{+∞} [(x − m_X)/σ_X]^r f_X(x) dx = (1/σ_X^r) ∫_{−∞}^{+∞} (x − m_X)^r f_X(x) dx = µ_X^{(r)} / σ_X^r

and finally

µ_Z^{(r)} = m_Z^{(r)} = µ_X^{(r)} / σ_X^r
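A quick numerical confirmation of these properties (an added sketch; the exponential variable of section 2.5.5 is used only as a convenient test case):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=20.0, size=10**6)   # a sample of X

z = (x - x.mean()) / x.std()                  # realizations of Z = (X - m_X)/sigma_X

def skew(a):
    """Coefficient of skewness: third central moment over sigma cubed."""
    return ((a - a.mean())**3).mean() / a.std()**3

print(z.mean(), z.std())   # ~0 and ~1
print(skew(x), skew(z))    # both ~2: standardization leaves C_s unchanged
```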

2.6.2 Example 2: the exponential transformation and the Pareto distribution

Assuming that the variable X has the exponential distribution, as in the example of section 2.5.5, we will study the distribution of the transformed variable Y = e^X. The density and distribution of X are

f_X(x) = (1/λ) e^{−x/λ},  F_X(x) = 1 − e^{−x/λ}

and our transformation has the properties

Y = g(X) = e^X,  g⁻¹(Y) = ln Y,  g′(X) = e^X

where X ≥ 0 and Y ≥ 1. From (2.38) we obtain

F_Y(y) = F_X(g⁻¹(y)) = F_X(ln y) = 1 − e^{−ln y/λ} = 1 − y^{−1/λ}

and from (2.39)

f_Y(y) = f_X(g⁻¹(y)) / g′(g⁻¹(y)) = (1/λ) e^{−ln y/λ} / e^{ln y} = (1/λ) y^{−(1/λ + 1)}

The latter can be more easily derived by taking the derivative of F_Y(y). This specific distribution is known as the Pareto distribution. The rth moment of this distribution is

m_Y^{(r)} = E[Y^r] = ∫_1^∞ y^r f_Y(y) dy = (1/λ) ∫_1^∞ y^{r − 1/λ − 1} dy = 1/(1 − rλ) for r < 1/λ,  and ∞ for r ≥ 1/λ

This clearly shows that only a finite number of moments (those of order r < 1/λ) exist for this distribution, which means that the Pareto distribution has a long tail.
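The derived Pareto distribution is easy to verify by simulation: generate X from the exponential distribution, set Y = e^X, and compare the empirical exceedance probability with 1 − F_Y(y) = y^{−1/λ}. A sketch (an added illustration; a small λ = 0.25 is chosen so that some moments exist, since for the λ = 20 of section 2.5.5 even the mean would be infinite):

```python
import numpy as np

rng = np.random.default_rng(2)
lam = 0.25                        # 1/lam = 4, so moments of order r < 4 exist
x = rng.exponential(scale=lam, size=10**6)
y = np.exp(x)                     # Y = e^X is Pareto distributed on [1, +inf)

for y0 in (1.5, 2.0, 4.0):
    print(y0, (y > y0).mean(), y0 ** (-1 / lam))   # empirical vs theoretical

print(y.mean(), 1 / (1 - 1 * lam))   # first moment: 1/(1 - r*lam) with r = 1
```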

2.7 Joint, marginal and conditional distributions

In the above sections, concepts of probability pertaining to the analysis of a single variable have been described. Often, however, the simultaneous modelling of two (or more) variables is necessary. Let the couple of random variables (X, Y) represent two sample spaces (Ω₁, Ω₂), respectively. The intersection of the two events {X ≤ x} and {Y ≤ y}, denoted as {X ≤ x} ∩ {Y ≤ y} = {X ≤ x, Y ≤ y}, is an event of the sample space Ω = Ω₁ × Ω₂. Based on the latter event, we can define the joint probability distribution function of (X, Y) as a function of the real variables (x, y):

F_XY(x, y) := P{X ≤ x, Y ≤ y}  (2.40)

The subscripts X, Y can be omitted if there is no risk of ambiguity. If F is differentiable, then the function

f_XY(x, y) := ∂²F_XY(x, y) / (∂x ∂y)  (2.41)

is the joint probability density function of the two variables. Obviously, the following equation holds:

F_XY(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f_XY(ξ, ω) dω dξ  (2.42)

The functions

F_X(x) := F_XY(x, +∞) = lim_{y→∞} F_XY(x, y),  F_Y(y) := F_XY(+∞, y) = lim_{x→∞} F_XY(x, y)  (2.43)

are called the marginal probability distribution functions of X and Y, respectively. Also, the marginal probability density functions can be defined, from f_XY:

f_X(x) = ∫_{−∞}^{+∞} f_XY(x, y) dy,  f_Y(y) = ∫_{−∞}^{+∞} f_XY(x, y) dx  (2.44)

Of particular interest are the so-called conditional probability distribution function and conditional probability density function of X for a specified value y of Y; these are given by

F_{X|Y}(x|y) = ∫_{−∞}^{x} f_XY(ξ, y) dξ / f_Y(y),  f_{X|Y}(x|y) = f_XY(x, y) / f_Y(y)  (2.45)

respectively. Switching X and Y, we obtain the conditional functions of Y.

2.7.1 Expected values – moments

The expected value of any given function g(X, Y) of the random variables (X, Y) is defined by

E[g(X, Y)] := ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} g(x, y) f_XY(x, y) dy dx  (2.46)

The quantity E[X^p Y^q] is called the (p + q) moment of X and Y. Likewise, the quantity E[(X − m_X)^p (Y − m_Y)^q] is called the (p + q) central moment of X and Y. The most common of the latter case is the 1 + 1 moment, i.e.,

σ_XY := E[(X − m_X)(Y − m_Y)] = E[XY] − m_X m_Y  (2.47)

known as the covariance of X and Y and also denoted as Cov[X, Y]. Dividing this by the standard deviations σ_X and σ_Y, we define the correlation coefficient

ρ_XY := σ_XY / (σ_X σ_Y) = Cov[X, Y] / √(Var[X] Var[Y])  (2.48)

which is dimensionless, with values −1 ≤ ρ_XY ≤ 1. As we will see later, this is an important parameter for the study of the correlation of two variables.

The conditional expected value of a function g(X) for a specified value y of Y is defined by

E[g(X)|Y = y] ≡ E[g(X)|y] := ∫_{−∞}^{+∞} g(x) f_{X|Y}(x|y) dx  (2.49)

An important quantity of this type is the conditional expected value of X:

E[X|Y = y] ≡ E[X|y] := ∫_{−∞}^{+∞} x f_{X|Y}(x|y) dx  (2.50)

Likewise, the conditional expected value of Y is defined. The conditional variance of X for a given y is defined as

Var[X|y] := E[(X − E[X|y])² | y] = ∫_{−∞}^{+∞} (x − E[X|y])² f_{X|Y}(x|y) dx  (2.51)

or

Var[X|y] = E[X²|y] − (E[X|y])²  (2.52)

Both E[X|Y = y] = E[X|y] =: η(y) and Var[X|Y = y] = Var[X|y] =: υ(y) are functions of the real variable y, rather than constants. If we do not specify in the condition the value y of the random variable Y, then the quantities E[X|Y] = η(Y) and Var[X|Y] = υ(Y) become functions of the random variable Y. Hence, they are random variables themselves and they have their own expected values, i.e.,

E[E[X|Y]] = ∫_{−∞}^{+∞} E[X|y] f_Y(y) dy,  E[Var[X|Y]] = ∫_{−∞}^{+∞} Var[X|y] f_Y(y) dy  (2.53)

It is easily shown that E[E[X|Y]] = E[X].

2.7.2 Independent variables

The random variables (X, Y) are called independent if for any couple of values (x, y) the following equation holds:

F_XY(x, y) = F_X(x) F_Y(y)  (2.54)

The following equation also holds:

f_XY(x, y) = f_X(x) f_Y(y)  (2.55)

and is equivalent to (2.54). The additional equations

E[XY] = E[X] E[Y],  ρ_XY = 0  (2.56)

E[X|y] = E[X],  E[Y|x] = E[Y]  (2.57)

are simple consequences of (2.54), but are not sufficient conditions for the variables (X, Y) to be independent. Two variables (X, Y) for which (2.56) holds are called uncorrelated.

2.7.3 Sums of variables

A consequence of the definition of the expected value (equation (2.46)) is the relationship

E[c₁ g₁(X, Y) + c₂ g₂(X, Y)] = c₁ E[g₁(X, Y)] + c₂ E[g₂(X, Y)]  (2.58)

where c₁ and c₂ are any constant values, whereas g₁ and g₂ are any functions. Apparently, this property can be extended to any number of functions g_i. Applying (2.58) for the sum of two variables, we obtain

E[X + Y] = E[X] + E[Y]  (2.59)

Likewise,

E[(X − m_X + Y − m_Y)²] = E[(X − m_X)²] + E[(Y − m_Y)²] + 2 E[(X − m_X)(Y − m_Y)]  (2.60)

which results in

Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y]  (2.61)

The probability distribution function of the sum Z = X + Y is generally difficult to calculate. However, if X and Y are independent, then it can be shown that

f_Z(z) = ∫_{−∞}^{+∞} f_X(z − w) f_Y(w) dw  (2.62)

The latter integral is known as the convolution integral of f_X(x) and f_Y(y).

2.7.4 An example of correlation of two variables

We study a lake with an area of 10 km², lying on an impermeable subsurface. The inflow to the lake during the month of April, composed of rainfall and catchment runoff, is modelled as a random variable with mean 4.0 × 10⁶ m³ and standard deviation 1.5 × 10⁶ m³. The evaporation from the surface of the lake, which is the only outflow, is also modelled as a random variable with mean 90 mm and standard deviation 20 mm. Assuming that inflow and outflow are stochastically independent, we seek to find the statistical properties of the water level change in April, as well as the correlation of this quantity with inflow and outflow.

Initially, we express the inflow in the same units as the outflow. To this aim we divide the inflow volume by the lake area, thus calculating the corresponding change in water level. The mean is 4.0 × 10⁶ / (10 × 10⁶) = 0.4 m = 400 mm and the standard deviation 1.5 × 10⁶ / (10 × 10⁶) = 0.15 m = 150 mm. We denote by X and Y the inflow and outflow in April, respectively, and by Z the water level change in the same month. Apparently,

Z = X − Y  (2.63)

We are given the quantities

m_X = E[X] = 400 mm,  Var[X] = 150² = 22 500 mm²
m_Y = E[Y] = 90 mm,  Var[Y] = 20² = 400 mm²

and we have assumed that the two quantities are independent, so that their covariance is Cov[X, Y] = 0 (see (2.56)) and their correlation ρ_XY = 0. Combining (2.63) and (2.58) we obtain

E[Z] = E[X − Y] = E[X] − E[Y]  or  m_Z = m_X − m_Y  (2.64)

or m_Z = 310 mm. Subtracting (2.63) and (2.64) side by side, we obtain

Z − m_Z = (X − m_X) − (Y − m_Y)  (2.65)

and squaring both sides we find

(Z − m_Z)² = (X − m_X)² + (Y − m_Y)² − 2 (X − m_X)(Y − m_Y)

which, by taking expected values on both sides, results in the following equation (similar to (2.61) except in the sign of the last term):

Var[Z] = Var[X] + Var[Y] − 2 Cov[X, Y]  (2.66)

Since Cov[X, Y] = 0, (2.66) gives

σ_Z² = 22 500 + 400 = 22 900 mm²

and σ_Z = 151.3 mm. Multiplying both sides of (2.65) by (X − m_X) and then taking expected values, we find

E[(Z − m_Z)(X − m_X)] = E[(X − m_X)²] − E[(X − m_X)(Y − m_Y)]

or

Cov[Z, X] = Var[X] − Cov[X, Y]  (2.67)

in which the last term is zero. Thus,

σ_ZX = 22 500 mm²

Consequently, the correlation coefficient of X and Z is

ρ_XZ = σ_ZX / (σ_Z σ_X) = 22 500 / (151.3 × 150) = 0.99

Likewise,

Cov[Z, Y] = Cov[X, Y] − Var[Y]  (2.68)

The first term of the right-hand side is zero, and thus

σ_ZY = −400 mm²

Consequently, the correlation coefficient of Y and Z is

ρ_YZ = σ_ZY / (σ_Z σ_Y) = −400 / (151.3 × 20) = −0.13

The positive value of ρ_XZ manifests the fact that the water level increases with the increase of inflow (positive correlation of X and Z). Conversely, the negative correlation of Y and Z (ρ_YZ < 0) corresponds to the fact that the water level decreases with the increase of outflow.

The large, close to one, value of ρ_XZ, in comparison to the much lower (in absolute value) value of ρ_YZ, reflects the fact that in April the change of water level depends primarily on the inflow and secondarily on the outflow, given that the former is greater than the latter and also has greater variability (standard deviation).
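A Monte Carlo cross-check of this example (an added sketch; normal marginal distributions are assumed purely for illustration, as the example itself specifies only means and standard deviations):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10**6
x = rng.normal(400.0, 150.0, n)   # inflow, expressed as water level change, mm
y = rng.normal(90.0, 20.0, n)     # evaporation, mm; independent of x
z = x - y                         # water level change, mm

print(z.mean(), z.std())          # ~310 and ~151.3 mm
print(np.corrcoef(x, z)[0, 1])    # ~0.99
print(np.corrcoef(y, z)[0, 1])    # ~-0.13
```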

2.7.5 An example of dependent discrete variables

Further to the example of section 2.4.2, we introduce the random variables X and Y to quantify the events (wet or dry day) of today and yesterday, respectively. Values of X or Y equal to 0 and 1 correspond to a day being dry and wet, respectively. We use the values of the conditional probabilities (also called transition probabilities) of section 2.4.2, which with the current notation are:

π_{1|1} := P{X = 1 | Y = 1} = 0.4,  π_{0|1} := P{X = 0 | Y = 1} = 0.6
π_{1|0} := P{X = 1 | Y = 0} = 0.15,  π_{0|0} := P{X = 0 | Y = 0} = 0.85

The unconditional or marginal probabilities, as found in section 2.4.2, are

p₁ := P{X = 1} = P{Y = 1} = 0.2,  p₀ := P{X = 0} = P{Y = 0} = 0.8

and the joint probabilities, again as found in section 2.4.2, are

p₁₁ := P{X = 1, Y = 1} = 0.08,  p₀₁ := P{X = 0, Y = 1} = 0.12
p₁₀ := P{X = 1, Y = 0} = 0.12,  p₀₀ := P{X = 0, Y = 0} = 0.68

It is reminded that the marginal probabilities of X were assumed equal to those of Y, which resulted in time symmetry (p₀₁ = p₁₀). It can be easily shown (homework) that the conditional quantities π_{i|j} can be determined from the joint p_{ij} and vice versa, and that the marginal quantities p_i can be determined from either of the two series. Thus, from the set of the ten above quantities only two are independent (e.g. π_{1|1} and π_{1|0}), and all others can be calculated from these two. The marginal moments of X and Y are

E[X] = E[Y] = 0 p₀ + 1 p₁ = p₁ = 0.2,  E[X²] = E[Y²] = 0² p₀ + 1² p₁ = p₁ = 0.2
Var[X] = E[X²] − E[X]² = 0.2 − 0.2² = 0.16 = Var[Y]

and the 1 + 1 joint moment is

E[XY] = 0·0 p₀₀ + 0·1 p₀₁ + 1·0 p₁₀ + 1·1 p₁₁ = p₁₁ = 0.08

so that the covariance is

Cov[X, Y] = E[XY] − E[X] E[Y] = 0.08 − 0.2² = 0.04

and the correlation coefficient

ρ_XY = Cov[X, Y] / √(Var[X] Var[Y]) = 0.04 / 0.16 = 0.25

If we know that yesterday was a dry day, the moments for today are calculated from (2.49)–(2.52), replacing the integrals with sums and the conditional density with the conditional probabilities π_{i|j}:

E[X | Y = 0] = 0 π_{0|0} + 1 π_{1|0} = π_{1|0} = 0.15,  E[X² | Y = 0] = π_{1|0} = 0.15
Var[X | Y = 0] = 0.15 − 0.15² = 0.128

Likewise,

E[X | Y = 1] = 0 π_{0|1} + 1 π_{1|1} = π_{1|1} = 0.4,  E[X² | Y = 1] = π_{1|1} = 0.4
Var[X | Y = 1] = 0.4 − 0.4² = 0.24

We observe that in the first case, Var[X | Y = 0] < Var[X]. This can be interpreted as a decrease of uncertainty for the event of today, caused by the information that we have for yesterday. However, in the second case Var[X | Y = 1] > Var[X]. Thus, the information that yesterday was wet increases the uncertainty for today. However, on the average, the information about yesterday results in a reduction of uncertainty. This can be expressed mathematically by E[Var[X|Y]], defined in (2.53), which is a weighted average of the two Var[X | Y = j]:

E{Var[X|Y]} := Var[X | Y = 0] p₀ + Var[X | Y = 1] p₁ = 0.128 × 0.8 + 0.24 × 0.2 = 0.15 < 0.16 = Var[X]
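The conditional moments and the average reduction of uncertainty can be recomputed directly from the transition probabilities; a minimal sketch:

```python
# Conditional probabilities pi[j][i] = P{X = i | Y = j}, with dry = 0, wet = 1
pi = {0: {0: 0.85, 1: 0.15},
      1: {0: 0.60, 1: 0.40}}
p_marginal = {0: 0.8, 1: 0.2}     # P{Y = 0}, P{Y = 1}

def conditional_mean_var(j):
    """E[X | Y = j] and Var[X | Y = j], with sums replacing the
    integrals of (2.49)-(2.52)."""
    m = sum(i * pi[j][i] for i in (0, 1))
    v = sum(i**2 * pi[j][i] for i in (0, 1)) - m**2
    return m, v

for j in (0, 1):
    print(j, conditional_mean_var(j))   # (0.15, 0.1275) and (0.4, 0.24)

e_var = sum(conditional_mean_var(j)[1] * p_marginal[j] for j in (0, 1))
print(e_var)                            # 0.15 < Var[X] = 0.16
```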

2.8 Many variables

All the above theoretical analyses can be easily extended to more than two random variables. For instance, the distribution function of the n random variables X₁, X₂, …, Xₙ is

F_{X₁,…,Xₙ}(x₁, …, xₙ) := P{X₁ ≤ x₁, …, Xₙ ≤ xₙ}  (2.69)

and is related to the n-dimensional probability density function f_{X₁,…,Xₙ} by

F_{X₁,…,Xₙ}(x₁, …, xₙ) = ∫_{−∞}^{x₁} ⋯ ∫_{−∞}^{xₙ} f_{X₁,…,Xₙ}(ξ₁, …, ξₙ) dξₙ ⋯ dξ₁  (2.70)

The variables X₁, X₂, …, Xₙ are independent if for any x₁, …, xₙ the following holds true:

F_{X₁,…,Xₙ}(x₁, …, xₙ) = F_{X₁}(x₁) ⋯ F_{Xₙ}(xₙ)  (2.71)

The expected values and moments are defined in a similar manner as in the case of two variables, and the property (2.58) is generalized for functions g_i of many variables.

2.9 The concept of a stochastic process

An arbitrarily (usually infinitely) large family of random variables X(t) is called a stochastic process (Papoulis, 1991). To each one of them there corresponds an index t, which takes values from an index set T. Most often, the index set refers to time. The time t can be either discrete (when T is the set of integers) or continuous (when T is the set of real numbers); thus we have, respectively, a discrete-time or a continuous-time stochastic process. Each of the random variables X(t) can be either discrete (e.g. the wet or dry state of a day) or continuous (e.g. the rainfall depth); thus we have, respectively, a discrete-state or a continuous-state stochastic process. Alternatively, a stochastic process may be denoted as X_t instead of X(t); the notation X_t is more frequent for discrete-time processes. The index set can also be a vector space, rather than the real line or the set of integers; this is the case, for instance, when we assign a random variable (e.g. rainfall depth) to each geographical location (a two-dimensional vector space) or to each location and time instance (a three-dimensional vector space). Stochastic processes with a multidimensional index set are also known as random fields.

A realization x(t) of a stochastic process X(t), which is a regular (numerical) function of the time t, is known as a sample function. Typically, a realization is observed at countable time instances (not in continuous time, even in a continuous-time process). This sequence of observations is also called a time series. Clearly then, a time series is a sequence of numbers, whereas a stochastic process is a family of random variables. Unfortunately, a large body of literature does not make this distinction and confuses stochastic processes with time series.

2.9.1 Distribution function

The distribution function of the random variable X(t), i.e.,

F(x; t) := P{X(t) ≤ x}  (2.72)

is called the first order distribution function of the process. Likewise, the second order distribution function is

F(x₁, x₂; t₁, t₂) := P{X(t₁) ≤ x₁, X(t₂) ≤ x₂}  (2.73)

and the nth order distribution function

F(x₁, …, xₙ; t₁, …, tₙ) := P{X(t₁) ≤ x₁, …, X(tₙ) ≤ xₙ}  (2.74)

A stochastic process is completely determined if we know the nth order distribution function for any n. The nth order probability density function of the process is derived by taking the derivatives of the distribution function with respect to all x_i.

2.9.2 Moments

The moments are defined in the same manner as in sections 2.5 and 2.7.1. Of particular interest are the following:

1. The process mean, i.e. the expected value of the variable X(t):

m(t) := E[X(t)] = ∫_{−∞}^{+∞} x f(x; t) dx  (2.75)

2. The process autocovariance, i.e. the covariance of the random variables X(t₁) and X(t₂):

C(t₁, t₂) := Cov[X(t₁), X(t₂)] = E[(X(t₁) − m(t₁))(X(t₂) − m(t₂))]  (2.76)

The process variance (the variance of the variable X(t)) is Var[X(t)] = C(t, t). Consequently, the process autocorrelation (the correlation coefficient of the random variables X(t₁) and X(t₂)) is

ρ(t₁, t₂) := Cov[X(t₁), X(t₂)] / √(Var[X(t₁)] Var[X(t₂)]) = C(t₁, t₂) / √(C(t₁, t₁) C(t₂, t₂))  (2.77)

2.9.3 Stationarity

As implied by the above notation, in the general setting the statistics of a stochastic process, such as the mean and autocovariance, depend on time and thus vary with time. However, the case where these statistical properties remain constant in time is most interesting. A process with this property is called a stationary process. More precisely, a process is called strict-sense stationary if all its statistical properties are invariant to a shift of time origin. That is, the distribution function of any order of X(t + τ) is identical to that of X(t). A process is called wide-sense stationary if its mean is constant and its autocovariance depends only on time differences, i.e.

E[X(t)] = µ,  E[(X(t) − µ)(X(t + τ) − µ)] = C(τ)  (2.78)

A strict-sense stationary process is also wide-sense stationary, but the inverse is not true. A process that is not stationary is called nonstationary. In a nonstationary process one or more statistical properties depend on time. A typical case of a nonstationary process is a cumulative process whose mean is proportional to time. For instance, let us assume that the rainfall intensity Ξ(t) at a geographical location and time of the year is a stationary process, with a mean µ. Let us further denote X(t) the rainfall depth collected in a large container (a cumulative raingauge) at time t and assume that at the time origin, t = 0, the container is empty. It is easy then to understand that E[X(t)] = µt. Thus X(t) is a nonstationary process.

We should stress that stationarity and nonstationarity are properties of a process, not of a sample function or time series. There is some confusion in the literature about this, as a lot of studies assume that a time series is stationary or not, or can reveal whether the process is stationary or not. As a general rule, to characterise a process nonstationary, it suffices to show that some statistical property is a deterministic function of time (as in the above example of the raingauge), but this cannot be straightforwardly inferred merely from a time series. Stochastic processes describing periodic phenomena, such as those affected by the annual cycle of Earth, are clearly nonstationary. For instance, the daily temperature at a mid-latitude location could not be regarded as a stationary process. It is a special kind of a nonstationary

.9 The concept o a stochastic process 7 () t : E[ () t ] x ( x t) m ; dt (.75). The process autocovariance, i.e. the covariance o the random variables (t ) and (t ): C ( t, t ): Cov[ ( t ) ( t )] E[ ( ( t ) m( t ))( ( t ) m( t ))] (.76), The process variance (the variance o the variable (t)), is Var[(t)] C(t, t). Consequently, the process autocorrelation (the correlation coeicient o the random variables (t ) and (t )) is [ ( t ), ( t )] [ ( t )] Var[ ( t )] C( t, t ) ( t, t ) C( t, t ) Cov ρ ( t, t ): (.77) Var C.9.3 Stationarity As implied by the above notation, in the general setting, the statistics o a stochastic process, such as the mean and autocovariance, depend on time and thus vary with time. However, the case where these statistical properties remain constant in time is most interesting. A process with this property is called stationary process. More precisely, a process is called strict-sense stationary i all its statistical properties are invariant to a shit o time origin. That is, the distribution unction o any order o (t + τ) is identical to that o (t). A process is called wide-sense stationary i its mean is constant and its autocovariance depends only on time dierences, i.e. E[(t)] µ, Ε[((τ) µ) ((t + τ) µ)] C(τ) (.78) A strict-sense stationary process is also wide-sense stationary but the inverse is not true. A process that is not stationary is called nonstationary. In a nonstationary process one or more statistical properties depend on time. A typical case o a nonstationary process is a cumulative process whose mean is proportional to time. or instance, let us assume that the rainall intensity Ξ(t) at a geographical location and time o the year is a stationary process, with a mean µ. Let us urther denote (t) the rainall depth collected in a large container (a cumulative raingauge) at time t and assume that at the time origin, t, the container is empty. It is easy then to understand that E[(t)] µ t. Thus (t) is a nonstationary process. We should stress that stationarity and nonstationarity are properties o a process, not o a sample unction or time series. There is some conusion in the literature about this, as a lot o studies assume that a time series is stationary or not, or can reveal whether the process is stationary or not. As a general rule, to characterise a process nonstationary, it suices to show that some statistical property is a deterministic unction o time (as in the above example o the raingauge), but this cannot be straightorwardly inerred merely rom a time series. Stochastic processes describing periodic phenomena, such as those aected by the annual cycle o Earth, are clearly nonstationary. or instance, the daily temperature at a mid-latitude location could not be regarded as a stationary process. It is a special kind o a nonstationary