Separation of year and site effects by generalized linear models in regionalization of annual floods


WATER RESOURCES RESEARCH, VOL. 37, NO. 4, PAGES , APRIL 2001

Robin T. Clarke
Instituto de Pesquisas Hidráulicas, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil

Copyright 2001 by the American Geophysical Union. Paper number 2000WR /01/2000WR

Abstract. This paper explores the utility of generalized linear models (GLMs) and generalized linear mixed models (GLMMs) for regionalization and record augmentation of flood data. Because both models allow the separation of site and year effects, each gives an estimate of a site's mean annual flood free from effects of the particular years in which floods were recorded. In addition, a GLM gives a measure of the extent to which a short flood record can be regarded as representative of a longer period. The way in which GLMs and GLMMs are formulated effectively unites two problems that have largely been regarded as separate: namely, regional regression and data augmentation. GLMM model structure implicitly includes correlation between flood records at neighboring sites, and both models contain facilities for testing whether time trends exist in flood data, for testing whether such trends are regionally uniform when they are found to exist, and for testing which climatological and/or physiographic variables are useful for information transfer. By appropriate selection of probability distribution and link function, biases are avoided that arise where log transformed data are back transformed to the scale on which flood flows are recorded. The paper illustrates GLMs and GLMMs by fitting them to flood records of variable length from 12 sites in the Ibicui drainage basin in southern Brazil. While the models are extremely flexible and are easily fitted for gamma-distributed data (and to data following certain other probability distributions), the assumptions on which they are based are arguably more complex than in more familiar methods, requiring careful checking procedures.

1. Introduction

A central problem in hydrology, particularly in developing countries, is the transfer ("regionalization") of information about hydrological regimes, in particular, of flow characteristics, from sites with long flow records to sites where records are short or nonexistent. The broad range of activities embraced by hydrological regionalization includes the transfer of information about flood characteristics to a point on a drainage network where knowledge of the frequency and magnitude of flood flows is required for planning purposes. Regionalization of hydrologic data has a long history, extending back for >40 years [Matalas and Benson, 1961; Benson, 1962; Matalas and Gilroy, 1968; Thomas and Benson, 1970; Hardison, 1971].

For sites where flow records are absent, a widely used technique [e.g., Mosley and McKerchar, 1993; Stedinger et al., 1993] for the transfer of information about flood characteristics from P sites with continuous records of discharge, not necessarily of the same length, to a site without records is the following. Let $\bar{Q}_j$ be the mean annual flood at station j (j = 1...P), $A_j$ be drainage area, and $S_j$ be channel slope, where (A, S, ...) is a set of physiographic and climatic variables thought to determine the magnitudes of the $\bar{Q}_j$; let the suffix K denote the site where no flow record exists. Then (1) the "regional regression" $\ln \bar{Q}_j = \alpha + \beta_1 \ln A_j + \beta_2 \ln S_j + \dots + \varepsilon_j$ is calculated from the P sets $\{\bar{Q}_j, A_j, S_j, \dots\}$; (2) this regression is used to estimate $\ln \bar{Q}_K$ using $\ln A_K$, $\ln S_K$, ...; and (3) $\exp[\ln \bar{Q}_K]$ is taken as an estimate of $\bar{Q}_K$. Analogous regressions may be postulated for flood characteristics other than $\bar{Q}$. There are various problems with this approach, some of which are mentioned in section 2.

The case where flow records exist but are short has historically been often treated as a separate problem, generally called "record extension" or, where it is necessary only to improve estimates of mean and variance at the short-record site, "record augmentation" [Stedinger et al., 1993]. On the basis of normal theory (or at least assuming that the logarithm of annual flood (ln Q) is normally distributed [National Environmental Research Council (NERC), 1975]), Matalas and Jacobs [1964] derived the augmented estimates of mean and variance of observations at the short-record site, using a longer record correlated with it. Their results were extended to the multivariate case [Moran, 1974], again assuming normal theory. Corresponding results for nonnormal cases do not appear to exist, and back transformation from the log to linear scales introduces bias (see section 2). As an alternative to the record extension and augmentation procedure, Bayes' theorem has been used to combine an estimate of $\bar{Q}_K$ obtained from regional regression with the estimate of $\bar{Q}_K$ obtained from a short record available at site K [NERC, 1975].

This paper presents an alternative approach to those described above, based on the use of generalized linear models (GLMs) [McCullagh and Nelder, 1989], in which site and year effects are separated. One advantage of this approach is that since each year effect is calculated, it is possible to assess the degree to which a short record is "representative" of a longer period of record. However, there are also other advantages as well as some disadvantages, as set out below. The paper shows that the flexibility of GLM structure includes both regional regression and data augmentation as particular cases. Furthermore, biases due to back transformation from log to linear scales are avoidable, a wide range of probability distributions for annual flood data can be used (although there are distinct advantages to be gained by using a distribution from the exponential family as defined below), and standard GLM hypothesis-testing procedures make it easy to test whether trends exist in annual flood sequences.

In concluding this section, we note that although the title and content of this paper refer to annual flood data, the general approach is equally valid for mean annual flow, annual rainfall, and other variables of hydrological interest, although there will be additional points to consider if, say, records contain a marked serial correlation structure.

2. Problems With Regional Regression and Data Augmentation

Among problems associated with regional regression, at least in the way it is frequently used in some developing countries, are the following.

1. No allowance is made for the fact that the $\bar{Q}_j$ are usually of variable precision since, in general, the P stations will have records of differing lengths. The difficulty can be partially resolved [Tasker, 1980] by a weighted multiple regression using number of years of record as a weighting function. However, since headwater catchments often tend to attract human settlement later than downstream areas, a weighted regression may give insufficient weight to records from upstream sites where hydrological records are shorter.

2. Here $\exp[\ln \bar{Q}_K]$ is an underestimate of the true mean annual flood at the site K that is of interest. It can be shown that if $E[\bar{Q}_j] = \mu_j$, the mean annual flood at site j, and the coefficient of variation C is small enough for terms of order higher than $C^2$ to be neglected, then $E[\ln \bar{Q}_j] = \ln \mu_j - C^2/2$ approximately, where $C^2$ is $E[(\bar{Q}_j - \mu_j)^2]/\mu_j^2$. This bias will be introduced wherever $\ln \bar{Q}_j$ (or $\ln Q_{ij}$, the log of the annual flood in year i at site j) enters the analysis.

3. No allowance is made for the correlation between annual floods at different stations, arising where in a particular year a long period of heavy rainfall over an extensive area results in large floods at many flow-recording stations within it (or, conversely, where an extended drought results in annual maximum discharges which are low at many sites). This problem has been widely studied [Matalas and Benson, 1961; Kuczera, 1983; Stedinger and Tasker, 1985, 1986a, 1986b; Hosking and Wallis, 1988].

4. No allowance is made for the fact that since the means $\bar{Q}_j$ are derived from periods of (usually) different length, each mean will be influenced by the particular climatic conditions pertaining during the period of observation. A short record may include by chance a relatively high proportion of wet or dry years, so being "unrepresentative" of long-run climatic conditions. This should be distinguished from the problem of spatial correlation mentioned in point 3, although the two are related. Short records can include sequences of years that are wetter or drier than average, whether or not spatial correlation exists between floods recorded at different sites. The possible unrepresentative nature of short records is one justification for the device, used later in the paper, to separate year effects from site effects and to estimate them separately.

Although GLMs and their extended form generalized linear mixed models (GLMMs) may be relatively unfamiliar to engineers concerned with practical details of flood data analysis, these models can be very easily fitted by present-day statistical packages when the probability distribution of floods belongs to the exponential family (defined below); computing time rarely takes more than a few seconds. The following sections explore and illustrate the utility of these models for the analysis and regionalization of flood data.

3. Data

GLM and GLMM characteristics are illustrated using data from the River Ibicui, a tributary in southern Brazil of the River Uruguai, which joins the La Plata drainage system. Annual maximum mean daily discharges ("annual floods") were available from the 12 gauging stations shown in Table 1. The duration of records extends from 9 to 41 years. The 12 drainage basins range in area from 380 to 35,935 km², through almost 2 orders of magnitude. It could be argued that records from small and large basins should be analyzed separately. The reasons for not doing so are (1) that the work reported here formed part of a planning study for development of the Ibicui basin as a whole, so that there was an administrative requirement to analyze records from all available sites jointly, and (2) that with such a limited number of sites a breakdown into groups classified by area would result in some classes with very few sites.

Table 1. Details of 12 Flow-Recording Stations in the River Ibicui Drainage Area (Southern Brazil)

Code   River             Station               Area, km²
       Toropi            Vila Clara            2,
       Toropi            Ponte Toropi          3,323
       Santa Maria       Rosário do Sul        12,210
       Jaguari           Jaguari               2,
       Jaguarzinho       Ernesto Alves         921
       Jaguari           Passo do Loreto       4,574
       Ibicui            Jacaquá               27,260
       Arroio Miracatú   Ponte de Miracatú     380
       Ibicui            Manoel Viana          28,
       Itú               Passo da Cachoeira    2,451
       Ibirapuitã        Alegrete              5,776
       Ibicui            Passo Mariano Pinto   35,935

4. Preliminary Analysis

Classical statistical models commonly require constant variance either in the scale of measurement or in a scale to which data can be transformed [McCullagh and Nelder, 1989]. The log transform used in the regional regression model, given above, not only secures linearity in the model parameters (the coefficients of $\ln A_j$, $\ln S_j$, ...) but also goes some way toward achieving homogeneous variance, as the multiple regression model requires, although the log transform of data introduces bias when results are back transformed. In their account of GLMs, of which classical models such as regional regression are particular cases, McCullagh and Nelder [1989, chapter 8] discuss models for data in which the variances of data groups show a quadratic relation when plotted against their means: that is, models for data with constant coefficient of variation (CV). The form of the relation between mean and variance is (together with considerations of statistical independence of data and asymmetry of their distribution) one of the determinants leading to the concept of quasi-likelihood [McCullagh and Nelder, 1989, chapter 9], used where there is no theory available on the random mechanism by which the data are generated. Explicit formulation of a likelihood function is questionable in such circumstances.

Table 2. Mean Annual Floods With Standard Errors, Variances of Mean Annual Floods, and Coefficients of Variation for 12 Sites in River Ibicui Drainage Basin

Station               Years(a)   Mean (± SE), m³ s⁻¹   Variance, m⁶ s⁻² (×10³)   Coefficient of Variation, %
Vila Clara                                                                        38.4
Ponte Toropi
Rosário do Sul
Jaguari
Ernesto Alves
Passo do Loreto
Jacaquá
Ponte de Miracatú
Manoel Viana
Passo da Cachoeira
Alegrete
Passo Mariano Pinto

(a) Numbers of complete years (not necessarily consecutive) of flow record from which annual maximum mean daily discharges were derived.

The starting point in the analysis of the Ibicui data was therefore a plot of the relation between site means and variances. For the sites listed in Table 1, the means, variances, and CV of annual flood are shown in Table 2. However, these statistics are of variable precision, some being based upon many more years of record than others; therefore it was necessary to take account of the very different numbers of years from which means and variances were calculated. Figure 1 shows the station variances together with a fitted quadratic curve, the fit taking account of the different number of years of record at each station by use of weighted regression. Ignoring a nonsignificant constant and linear term, the fitted curve is variance = C(mean)² with C = [...] with $R^2$ = 97.8% (residual d.f. = 10). Clearly, the quadratic curve

Figure 1. (top) Quadratic relation between variance of annual flood (vertical axis) and mean annual flood (horizontal axis) for 12 gauge sites in the River Ibicui drainage basin, southern Brazil. (bottom) Plot of coefficients of variation (CV) against mean annual flood for the same basins.
[Figure 1 panel titles: "Variances (o) of annual floods, and fitted quadratic curve" (top); "Coefficients of variation within 12 gauge sites" (bottom). Horizontal axis: mean annual flood, cumecs.]
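The weighted fit of variance against squared mean described in section 4 can be sketched in a few lines. This is only an illustration under stated assumptions, not the author's computation: the function name `fit_constant_cv` is hypothetical, years of record are assumed to be the regression weights, and the numbers are made up (they are not the Table 2 values).

```python
import numpy as np

def fit_constant_cv(means, variances, n_years):
    """Weighted least-squares fit of variance = C * mean**2.

    Only the quadratic term is fitted (no constant or linear term).
    Weights are assumed to be the number of years of record at each site.
    """
    x = np.asarray(means, dtype=float) ** 2
    y = np.asarray(variances, dtype=float)
    w = np.asarray(n_years, dtype=float)
    # Closed-form weighted least squares for a single slope through the origin.
    return float(np.sum(w * x * y) / np.sum(w * x * x))

# Made-up site statistics, generated so that variance = 0.2 * mean**2 exactly.
means = np.array([100.0, 400.0, 900.0, 2500.0])
variances = 0.2 * means ** 2
n_years = np.array([9, 20, 30, 41])

c_hat = fit_constant_cv(means, variances, n_years)
cv_hat = np.sqrt(c_hat)  # implied constant coefficient of variation
print(c_hat, cv_hat)
```

If the fitted C is stable and the CV shows no trend with the mean, a constant-CV model of the kind considered in the paper is plausible.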

shows a strong relation between variance and mean; however, Figure 1 (bottom) shows that when the CV is plotted against mean annual flood, the quadratic trend is no longer evident. CV values are consistent with the hypothesis of zero correlation with mean annual flood. The conclusion is that a model assuming constant coefficient of variation is appropriate for the data.

5. A GLM With Gamma-Distributed Annual Floods

A quadratic relation between site variance and mean suggests a gamma distribution for annual flood data [McCullagh and Nelder, 1989]. Denoting the annual flood by random variable Q with observed values q, this distribution is $G(\mu, \nu)$ given by

$$G(\mu, \nu) = \frac{1}{\Gamma(\nu)} \left(\frac{\nu q}{\mu}\right)^{\nu} \exp\left(-\frac{\nu q}{\mu}\right) d(\ln q), \qquad q \ge 0,\ \nu > 0,\ \mu > 0. \qquad (1)$$

In this form the mean and variance of the gamma distribution are $\mu$ and $\mu^2/\nu$, respectively; the CV is $1/\nu^{1/2}$. With the annual floods at the 12 sites arrayed in a table with 12 columns (sites) and 41 rows (years) containing many missing values, the GLM for the annual flood $Q_{ij}$ in year i at site j is defined as follows: (1) $E[Q_{ij}] = \mu_{ij}$; (2) $Q_{ij}$ is distributed as $G(\mu_{ij}, \nu)$; (3) $\eta_{ij} = x_{ij}^T \beta$, where $x_{ij}$ is a vector of explanatory variables (such as $A_j$, $S_j$, or some measure of precipitation intensity $P_{ij}$ in year i at station j) and $\beta$ is a vector of coefficients (discussion of the form of $\eta_{ij}$, which is termed the linear predictor, is given below); (4) $\eta_{ij} = g(\mu_{ij})$, where $g(\cdot)$ is a known, monotonic differentiable "link" function relating the expected value $E[Q_{ij}]$ to the linear predictor $\eta_{ij}$. McCullagh and Nelder [1989] point out that the log link function $g(\mu_{ij}) = \ln(\mu_{ij})$ achieves linearity without the need to abandon the original scale of measurement by transforming the data. They show that with the combination of log link function and quadratic variance-mean relationship, fitting the GLM is equivalent to assuming that $Q_{ij}$ has the gamma distribution with constant shape parameter $\nu$, independent of the mean, in the same sense that ordinary least squares arises as maximum likelihood from the normal distribution.

Since the model assumes that the shape parameter $\nu$ is constant between sites, it was necessary to test the hypothesis that $\nu$ did not vary significantly from site to site. A two-parameter gamma distribution of the form (1) was therefore fitted to the data from each site separately, and the 12 estimates of the shape parameter $\nu$ were as shown in Table 3. The pooled estimate of $\nu$, averaged over all 12 sites with weights equal to the inverses of their variances, was [...]. Since $\nu^{-1} = \mathrm{CV}^2$, the square of the coefficient of variation of the gamma distribution, we have $\mathrm{CV}^2 = 1/4.927 = 0.203$ with approximate (large sample) standard error ±0.017, agreeing reasonably well with the value C = [...] reported above, found when variance is regressed on the squared mean. It cannot be ruled out that the fact that $\nu$ can be taken as constant for the Ibicui data may be just a fortunate coincidence, despite the fact that the 12 sites have drainage basins ranging over almost 2 orders of magnitude (from 380 to nearly 36,000 km²). However, even where the hypothesis of constant $\nu$ must be rejected, procedures similar to that of Aitkin [1987] are available to model such heterogeneity in terms of one or two additional parameters: for example, by writing $\nu_j = \nu_0(1 + \theta A_j)$ if $\nu$ is thought to depend on area A and $\nu_0$ and $\theta$ are constant.

The informal argument above can be supplemented by a formal likelihood ratio test. When the gamma parameters $\mu$ and $\nu$ are estimated at each site separately, the log likelihood function at site j is

$$\sum_i \left\{ \hat{\nu}_j \left[ \ln\left(\hat{\nu}_j Q_{ij}/\hat{\mu}_j\right) - Q_{ij}/\hat{\mu}_j \right] - \ln\left[\Gamma(\hat{\nu}_j)\right] \right\},$$

where the estimates $\hat{\nu}_j$ are as shown in Table 3 and $\hat{\mu}_j$ are the raw means. Summed over j, the station suffix, the total log likelihood is [...]. Under the null hypothesis that all $\nu_j$ are equal to $\nu$, say, the log likelihood is reduced to [...], where the pooled $\hat{\nu}$ = [...], not too different from the pooled value of [...] shown in Table 3. The reduction in log likelihood, with 11 d.f., is distributed approximately as $\chi^2$ with 11 d.f. and is not statistically significant. This confirms that there is no evidence of differences between the 12 shape parameters.

In classical linear modeling the residual sum of squares (RSS) gives a measure of goodness of model fit; in GLMs the corresponding measure of goodness of fit is the deviance D, which reduces to RSS if a normal distribution is substituted for $G(\mu_{ij}, \nu)$ in point 2 above. Table 3 shows the deviances D obtained from likelihood considerations and defined for the gamma distribution [McCullagh and Nelder, 1989] by

$$D = -2 \sum_i \sum_j \left\{ \ln\left(q_{ij}/\hat{\mu}_{ij}\right) - \left(q_{ij} - \hat{\mu}_{ij}\right)/\hat{\mu}_{ij} \right\}. \qquad (2)$$

The deviance is interpreted as a measure of goodness of fit between the observed data $Q_{ij}$ and the fitted values generated by the model, with large deviance indicating poor fit. More precisely, for the data $Q_{ij}$ from station j, deviance is calculated as the difference between two log likelihoods relevant to different models of the data: (1) the log likelihood when (in this case) a gamma distribution with $\mu_{ij}$ different for each item of data is fitted, so that as many parameters are fitted as there are data items, giving no deviations (this gives the maximum achievable log likelihood) and (2) the log likelihood obtained when (in this case) a single gamma distribution with mean $\mu_j$ is

Table 3. Analysis of Records From Each Station Individually: Estimates of the Shape Parameter $\nu$ From 12 Sites and Deviance Measures of Goodness of Gamma Distribution Fit

Site                  $\hat{\nu}$   SE($\hat{\nu}$)   Deviance(a)
Vila Clara
Ponte Toropi
Rosário do Sul
Jaguari
Ernesto Alves
Passo do Loreto
Jacaquá
Ponte de Miracatú
Manoel Viana
Passo da Cachoeira
Alegrete
Passo Mariano Pinto
Pooled(b)

(a) All with 3 d.f.
(b) Pooled estimate of $\nu$, obtained by weighting individual values inversely as their variances.
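The gamma deviance of equation (2) is straightforward to evaluate once fitted means are available. The sketch below is illustrative only: the function name `gamma_deviance` and the flood values are assumptions, and the deviance is computed in the unscaled form written in (2). It contrasts the saturated fit (one mean per observation) with a single-mean fit at one station.

```python
import numpy as np

def gamma_deviance(q, mu_hat):
    """Unscaled gamma deviance, following the form of equation (2)."""
    q = np.asarray(q, dtype=float)
    mu_hat = np.asarray(mu_hat, dtype=float)
    return float(-2.0 * np.sum(np.log(q / mu_hat) - (q - mu_hat) / mu_hat))

# Made-up annual floods for one station (m^3/s); not data from the paper.
floods = np.array([820.0, 1150.0, 640.0, 1390.0])

# Saturated model: each fitted mean equals its observation, so no deviations remain.
d_saturated = gamma_deviance(floods, floods)

# Single-mean model: every year is fitted by the station's raw mean.
d_single = gamma_deviance(floods, np.full_like(floods, floods.mean()))

print(d_saturated, d_single)
```

Because ln(x) is never greater than x - 1, each term of the sum is nonpositive, so the deviance is nonnegative and is zero only when every fitted mean reproduces its observation.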

fitted to all the data from station j. The log likelihood for case 2 will be smaller than the log likelihood for case 1, and the deviance measures the difference. Associated with each of the deviances in Table 3 is a number of degrees of freedom, three at each station; since the deviance is distributed approximately as $\chi^2$, the deviances in Table 3 can be regarded as approximate $\chi^2$ variates with mean 3 (= d.f.) and variance 6 (= 2 d.f.). Significant values of $\chi^2$ would indicate that the gamma distribution gave a poor fit since the deviance measure of discrepancy is then large and statistically significant; only 1 of the 12 deviances in Table 3 is statistically significant at P < 5%; this is not far from what would be expected where 12 independent significance tests are made at the 5% level, showing that the gamma gave a good fit at all stations. As a final justification of the two-parameter gamma distribution, attempts were made to fit a three-parameter gamma at each site. When this distribution was fitted by maximum likelihood, the iterative calculation failed to converge at any of the 12 sites.

6. GLMs in Which the Linear Predictor $\eta$ Separates Site and Year Effects

We now consider appropriate forms for the linear predictor $\eta_{ij} = x_{ij}^T \beta$. The expected value $E[Q_{ij}] = \mu_{ij}$ of the annual flood in year i at site j depends both upon the year of measurement and the site at which it was observed. To make this explicit, the linear predictor of the GLM can be written in the form

$$\eta_{ij} = \mu + a_i + s_j, \qquad (3)$$

where $\mu$ is a regional mean annual flood, $a_i$ is a component specific to the ith year of record, and $s_j$ is a component specific to the jth gauging site. This is of the general form $\eta_{ij} = x_{ij}^T \beta$, in which $a_i$ and $s_j$ are elements of the vector $\beta$ of parameters, with the elements of $x_{ij}^T$ equal to 0 or 1: 0 for data $Q_{ij}$ not in year i or from site j and 1 otherwise. In (3), the mean $\mu$ is a constant, but $a_i$ and $s_j$ may be regarded either as constants or as random variables, depending on the purpose of the analysis. Various cases can be distinguished as follows.

1. The first case is $a_i$ constant and $s_j$ constant. This is broadly analogous to the "record augmentation" procedures mentioned above. The fixed quantities $\mu$, $a_i$, and $s_j$ are estimated by fitting the GLM; then by using the year effects $a_i$ the degree to which a short period of record is representative can be assessed. Summed over all years (41 years for the Ibicui data), the total of the $a_i$ will be zero; if, therefore, the sum of $a_i$ for a much shorter record is negative (positive), annual floods in those years will be lower (higher) than the full period of record requires for that site. Assuming a log link function, the corrected mean for the site j will be estimated as $\exp(\hat{\mu} + \hat{s}_j)$. A variation of the model in (3) is obtained by putting $a_i = \beta_0 f_0(t_i) + \beta_1 f_1(t_i) + \dots + \beta_k f_k(t_i)$ with $f_0(t_i)$, $f_1(t_i)$, ..., as known functions of time. The linear predictor of the GLM is then still of the same form, and the model can be used to explore whether time trends exist in annual floods. In its simplest form, $f_0(t_i) = 1$, $f_1(t_i) = t_i$ to explore a linear trend. If a significant trend of any kind were found, it would obviously be inappropriate to try to estimate flood frequencies using methods that assume stationarity in annual flood sequences. If a general, basin-wide time trend in annual floods were detected, yet another variant of the basic GLM model could be adapted to test whether the trend varied significantly between sub-basins by introducing dummy variables [e.g., Clarke, 1994] into the vector of explanatory variables.

2. The second case is $a_i$ constant and $s_j$ of the form $\beta_0 f_0(A_j, S_j, \dots) + \beta_1 f_1(A_j, S_j, \dots) + \dots$, with $f_0(\cdot)$, $f_1(\cdot)$, ..., known functions of physical or climatological explanatory variables, such as area $A_j$, channel slope $S_j$, etc. This model is broadly analogous to the "regional regression" model mentioned earlier but avoids the bias introduced by back transformation from logs to the original scale of measurement. The simplest cases are $f_0 = 1$, $f_1 = A_j$, $f_2 = S_j$, ..., and $f_0 = 1$, $f_1 = \ln A_j$, $f_2 = \ln S_j$, .... If $\beta_0$, $\beta_1$, ..., are estimated, then the model can be used to estimate $E[Q_K]$ at a site K without records. It is also possible to include explanatory variables on the right-hand side of (3) having a different value in each cell of the years times sites table. Where the annual flood is a consequence of snowmelt, for example, the variable $Z_{ij}$ might be a measure of the water equivalent of snowpack shortly before it melted; the site term $s_j$ then would be of the form $\beta_0 f_0(Z_{ij}, \dots) + \beta_1 f_1(Z_{ij}, \dots) + \dots$.

3. The third case is $a_i$ a random variable and $s_j$ constant. This corresponds to the case in which the effects $a_i$ of the years are regarded as a random sample from a population of year effects, the purpose of the analysis being to estimate mean annual flood at each gauge site, averaged over the population of years of which the years of record are a sample. The year effect $a_i$ is taken to have zero mean and variance $\sigma_a^2$, and attention is focused on the estimation of $\mu + s_j$, the mean value for gauge j. Fitting the GLM yields estimates of the constants $\mu$ and $s_j$ and the variance $\sigma_a^2$. Since annual floods $Q_{ij}$ and $Q_{ik}$ in the same year i but at different sites j and k contain the same random component $a_i$, the GLM builds in the correlation between them; with the log link function used in this paper, $\mathrm{cov}[Q_{ij}, Q_{ik}] = \exp(2\mu + s_j + s_k)\,\sigma_a^2$. Inclusion of both constant and random effects in addition to the constant $\mu$ in the linear predictor $\eta_{ij}$ makes the GLM into a GLMM. Thus the site terms $s_j$ would also be regarded as random if it were required to explore the spatial structure of annual flood magnitudes within a region, using a correlation function $\rho(s_j, s_k)$ in terms of a distance measure between sites j and k.

7. Separation of Site and Year Effects in the Ibicui Data

To recapitulate, analysis of the Ibicui data shows that (1) annual floods at the 12 sites can be taken as gamma-distributed with shape parameter $\nu$, (2) a GLM with log link function avoids log transformation of annual flood data and the consequent bias when data are back transformed to the scale of measurement, and (3) the linear predictor of the GLM can separate year and site effects, with site effects described, if and when appropriate, in terms of basin characteristics. For the Ibicui data the gamma-distributed $Q_{ij}$ for year i at site j has $E[Q_{ij}] = \mu_{ij}$, and we now consider the three cases (1) $\eta_{ij} = \mu + a_i + s_j$, with $a_i$ and $s_j$ constant, as in (3); (2) $\eta_{ij} = \mu + a_i + \beta A_j$, a simplified form of case (2) in section 6; and (3) $\eta_{ij} = \mu + a_i + s_j$, with $a_i$ random and $s_j$ fixed. Cases 1 and 2 are GLMs; case 3 is a GLMM.

1. For case 1, linear predictor of form $\eta_{ij} = \mu + a_i + s_j$: year effects $a_i$ and site effects $s_j$ are constant. For the effects of year ($a_i$) and station ($s_j$) both constant, Table 4 shows the fitted site means $\exp(\hat{\mu} + \hat{s}_j)$. The raw means are also reproduced for comparison. It can be seen that while some of the larger differences between the raw and fitted means occur where records are short, this is not always the case. It is of interest to calculate the regressions of both sets of means on drainage basin area $A_j$; the regression of the raw means gives $r^2$ = 93.3% on 10 d.f. with a residual standard deviation of ±275 m³ s⁻¹; using the estimates $\exp(\hat{\mu} + \hat{s}_j)$, $r^2$ = 94.5%, with a residual standard deviation of ±266 m³ s⁻¹, the gain in terms of explained variance being very slight in this instance.

Table 4. Raw Means of Annual Maximum Floods and Estimates of Site Means $\exp(\hat{\mu} + \hat{s}_j)$ Given by a Log Link Function, $Q_{ij}$ With Gamma Distribution, Shape Parameter $\nu$ Constant Over Sites

Station               Raw Mean, m³ s⁻¹   $\exp(\hat{\mu} + \hat{s}_j)$   Difference   Record Length
Vila Clara
Ponte Toropi
Rosário do Sul
Jaguari
Ernesto Alves
Passo do Loreto
Jacaquá
Ponte de Miracatú
Manoel Viana
Passo da Cachoeira
Alegrete
Passo Mariano Pinto

2. For case 2, linear predictor of form $\eta_{ij} = \mu + a_i + \beta A_j$: year effects $a_i$ and site effects for basin j taken as proportional to drainage basin area $A_j$. Standard GLM fitting procedures give the estimate of $\beta$ as $\hat{\beta}$ = [...] ± 0.0₅251, and an approximate t test gives t = [...] on 239 d.f., showing the expected high significance. The residual deviance, given in (2) above, is D = [...] on 239 d.f., showing good evidence of model fit. The year effects $a_i$ (not presented in section 6 for reasons of space) are shown in Figure 2. As where constants are estimated in an analysis of variance table, these effects are constrained to sum to zero over the 41 years of record; for any record shorter than the full 41 years, the difference between zero and the sum of the year effects $a_i$ for the period of record measures how far the site's flood record is representative of the period as a whole. As explained above, the year effects can also be used to detect time trends in annual flood records, which may occur in regions where land use or climate regime has changed. Although Figure 2 shows no visual evidence of time trends, it can be seen that the standard errors, shown as error bars, are larger in the early years of record, as a consequence of the fewer data then available. Thus for any drainage area $A_c$ the annual flood in any year L, say, would be estimated by $\hat{\mu}_{Lc} = \exp(\hat{\mu} + \hat{a}_L + \hat{\beta} A_c)$. If the fitted values $\hat{\mu}_{ic}$ are obtained for each year and each site, including entries for which data are missing, the averages for the 12 stations are shown in Table 5. Clearly, the differences between the two sets of means are now much greater than in the previous model, but if the two sets of means are each regressed on drainage area, the raw means (as before) gave $r^2$ = 93.3% with residual standard deviation ±275 m³ s⁻¹, while the values of exp($\hat{\mu}$ +

Figure 2. Plot of year effects, free from effects of the 12 stations within the Ibicui drainage basin. Standard errors of year effects are also shown; those in earlier years are larger as a consequence of fewer data.
[Figure 2 panel title: "12 Ibicui stations: annual effects across stations and standard errors". Horizontal axis: Year.]
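The representativeness measure described in section 7, the sum of the fitted year effects over the years a site was actually gauged, can be sketched as follows. The function name `record_offset`, the five-year period, and the effect values are assumptions for illustration; in practice the year effects would come from a fitted GLM and would be constrained to sum to zero over the full period.

```python
import numpy as np

def record_offset(year_effects, recorded):
    """Sum of year effects over the years in which a site has a record.

    year_effects: fitted a_i for every year of the full period (sum to zero).
    recorded:     booleans, True where the site has an annual flood that year.
    """
    a = np.asarray(year_effects, dtype=float)
    mask = np.asarray(recorded, dtype=bool)
    if not np.isclose(a.sum(), 0.0):
        raise ValueError("year effects must sum to zero over the full period")
    return float(a[mask].sum())

# Made-up year effects for a five-year period (log scale under a log link).
year_effects = [0.3, -0.1, -0.2, 0.4, -0.4]
# Hypothetical short record covering only the first and fourth years.
recorded = [True, False, False, True, False]

offset = record_offset(year_effects, recorded)
print(offset)
```

A positive value indicates that the recorded years were, on balance, wetter than the full period, so the raw mean of a short record would tend to overstate the site mean; a negative value indicates the opposite.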

$\hat{a}_i + \hat{\beta} A_K$) gave $r^2$ = 96.3% with residual standard deviation ±242 m³ s⁻¹. The reduction in standard deviation might be considered useful rather than dramatic.

Table 5. Comparison of Raw Means With Means of Fitted $\exp(\hat{\mu} + \hat{a}_i + \hat{\beta} A_K)$

Station               Raw Mean, m³ s⁻¹   $\exp(\hat{\mu} + \hat{a}_i + \hat{\beta} A_K)$
Vila Clara
Ponte Toropi
Rosário do Sul
Jaguari
Ernesto Alves
Passo do Loreto
Jacaquá
Ponte de Miracatú
Manoel Viana
Passo da Cachoeira
Alegrete
Passo Mariano Pinto

3. For case 3, linear predictor of form $\eta_{ij} = \mu + a_i + s_j$: year effects $a_i$ are random and gauging station effects $s_j$ are fixed. The GLMM model has 13 parameters: 12 site effects $s_j$ subject to the constraint $\sum_j s_j = 0$, which reduces the 12 to 11, the variance $\sigma_a^2$ of the year effects, and the common shape parameter $\nu$. Their estimates are calculated by iteratively weighted nonlinear least squares [McCullagh and Nelder, 1989], and convergence was achieved after five iterations. The estimate of $\sigma_a^2$ was [...]. The 12 site means $\exp(\hat{\mu} + \hat{s}_j)$ are shown in Table 6, and the estimate of the shape parameter $\nu$ was [...]. If the estimated site means are regressed upon drainage area, the percentage variance accounted for is $r^2$ = 94.3% with 10 d.f., the residual standard deviation being ±250 m³ s⁻¹. This represents a small improvement on the $r^2$ = 93.3%, with residual standard deviation ±275 m³ s⁻¹, obtained when the raw means are regressed on drainage area.

8. Discussion

This paper suggests that the GLMs and GLMM discussed above constitute an alternative approach to regionalization and information transfer between sites where annual floods are recorded. As applied to the Ibicui data, the basis of the method is the two-parameter gamma distribution $G(\mu_{ij}, \nu)$, with shape parameter $\nu$ constant from site to site (although the method could be adapted to deal with nonconstant $\nu$). The gamma mean $\mu_{ij}$ varies from site to site and from year to year and is related to a linear predictor of the form $\eta_{ij} = x_{ij}^T \beta$, where $\beta$ is a vector of parameters and $x_{ij}$ is a vector of explanatory variables. By suitable choices for the variables in $x_{ij}$, the site and year effects present in each annual flood can be disentangled, and/or the relation between site effects and basin characteristics can be established for regionalization purposes. Use of a log link function in the GLM to relate $\mu_{ij}$ to $\eta_{ij}$ has the desirable consequence that log transformation of flood data for regionalization purposes is avoided. While the usual regional regression relates the expected value $E[\ln Q]$ to basin characteristics, the GLM relates $\ln E[Q]$ to them.

The constant $\nu$ used in the analysis of the Ibicui data has an analogy in the commonly used index flood method of flood frequency analysis, in which it is assumed that the distributions of floods at different sites in a region are the same except for a scale or index flood parameter which reflects the size, rainfall, and runoff characteristics of each drainage basin. This index flood is usually taken as the mean annual flood at each site, and the data $Q_{ij}/\bar{Q}_j$ are then pooled over sites. In the gamma distribution given in (1) above, the random variable Q representing the annual flood is also scaled by dividing by the mean $\mu$.

The paper also suggests that the separation of site and year effects is desirable because (1) it allows mean annual flood at a site with short record to be estimated free from the effects of the particular sequence of years in which floods were recorded; (2) it provides a direct measure of how far the sequence of years in the record is representative of the longest period of record at any site, when all records of annual flood can be assumed stationary and free from trend; (3) by modeling the year effects $a_i$ in terms of known functions of time the existence of time trends in flood records can be explored and their significance tested (the estimation of long-term flood frequencies at sites where trends exist being a fruitless exercise); and (4) where time trends in flood data are shown to exist, the extent to which they are spatially homogeneous can be explored by introducing dummy variables [e.g., Clarke, 1994].

Fitting GLMs and GLMMs to annual flood data requires an iterative calculation to maximize a log likelihood function. Provided that the probability distribution of annual floods belongs to the exponential family (for each member of which a set of sufficient statistics exist for the distribution parameters $\theta$) with general form [Cox and Hinkley, 1974] $f(q; \theta) = \exp\{a(\theta)b(q) + c(\theta) + d(q)\}$ and to which the gamma, normal, binomial, Poisson, and inverse normal distributions belong, the calculation is relatively trouble-free and is achieved by iterative weighted least squares [McCullagh and Nelder, 1989]. There is no reason why other well-known distributions, such as the generalized extreme value (GEV) distribution, should not be used instead of $G(\mu, \nu)$, but this has not been explored in the present paper. It is conjectured that problems of failure to converge and convergence to nonunique optima are likely to be encountered where distributions not from the exponential family are used.

Since the data analyzed in the paper are sequences of annual floods, the usual assumption has been made that no serial correlation exists between elements of the time series that constitute the columns of the years times site matrix of annual flood records. In fact, the GLMs used in this paper require the

Table 6.
Comparison of Raw Means With Fitted Station Effects: Year Effects Random Raw Mean, Estimates of exp m 3 S -1 (.L + Sj) Vila Clara Ponte Toropi Rosfirio do Sul Jaguari Ernesto Alves Passo do Loreto Jacaqufi Ponte de Miracatfi Manoel Viana Passo da Cachoeira Alegrete Passo Mariano Pinto

8 986 CLARKE: GLM IN REGIONALIZATION OF ANNUAL FLOODS stronger assumption of statistical independence between an- Anderson, R. J., N. F. Ribeiro, and H. F. Diaz, An analysis of flooding nual floods in successive years, but this assumption is almost in the Parana/Paraguay River Basin, The World Bank Latin Am. Tech. Dep., Washington, D.C., universal in flood frequency studies. This paper notes, how- Benson, M. A., Evolution of methods for evaluating the occurrence of ever, that GLMs may also be appropriate for information floods, U.S. Geol. Surv. Water Supply Pap., 1580-A, 30 pp., transfer and regionalization of other hydrological variables, for Clarke, R. T., Statistical Modelling in Hydrology, John Wiley, New which the assumption of statistical independence may be less York, Cox, D. R., and D. V. Hinckley, Theoretical Statistics, Chapman and tenable; it may be expected that mean annual flows, for exam- Hall, New York, ple, show some degree of serial correlation, particularly in Fahrmeir, L., and G. Tutz, Multivariate Statistical Modelling Based on basins with large annual carryover storage, such as those of the Generalized Linear Models, Springer Ser. in Stat., Springer-Verlag, Amazon and la Plata. Extension of GLM and GLMM to the New York, analysis of correlated data and to the analysis of correlated Hardison, C. H., Prediction error of regression estimates of streamflow characteristics at ungauged sites, U.S. Geol. Surv. Prof. Pap., 750-C, data is, at present, an active field of research. C228-C236, Hosking, J. R. M., and J. R. Wallis, The effect of intersite dependence 9. Conclusions on regional flood frequency analysis, Water Resour. Res., 24, , This paper explores the use of generalized linear models Kuczera, G., Effect of sampling uncertainty and spatial correlation on (GLMs) and generalized linear mixed models (GLMMs) as an an empirical Bayes procedure for combining site and regional information, J. 
Hydrol., 65(4), , alternative to existing procedures for information transfer of Laraque, A., J. C. Olivry, D. Orange, and B. Marieu, Variation in space flood characteristics by regional regression and record aug- and time of rainfall and hydrological regimes in central Africa from mentation. The following conclusions are made. (1) GLMs and the beginning of the century (in Portuguese), in XII Symposium of GLMMs allow the effects of site and year effects (both of the Brazilian Water Resources Association, Anais vol. 3, pp , Assoc. Bras. de Recursos Humanos, S o Paulo, Brazil, which enter the observed annual flood Qi at a site j in year i) Lettenmaier, D. P., J. R. Wallis, and E. F. Wood, Effect of regional to be separated, with the desirable consequence that the mean heterogeneity on flood frequency estimation, Water Resour. Res., 23, annual flood at a site with short record can be adjusted for the , particular sequence of years in which floods were recorded. (2) Matalas, N. C., and M. A. Benson, Effects of interstation correlation By the same token, calculation of the year effects gives a on regression analysis, J. Geophys. Res., 66(10), , Matalas, N. C., and E. J. Gilroy, Some comments on regionalization in quantitative measure of how far a short flood record can be hydrologic studies, Water Resour. Res., 4, , considered representative of flood characteristics determined Matalas, N. C., and B. Jacobs, A correlation procedure for augmenting over a longer record. (3) GLMs and GLMMs combine into a hydrological data, U.S. Geol. Surv. Prof. Pap., 434-E, El-E7, single formulation two problems that have usually been treated McCullagh, P., and J. A. Nelder, Generalized Linear Models, 2nd ed., Chapman and Hall, New York, separately in flood frequency analysis: namely, regional regres- Moran, M. A., On estimators obtained from a sample augmented by sion and record augmentation. (4) GLMMs implicitly allow for multiple regression, Water Resour. 
Res., 10, 81-85, the inclusion of seasonal correlation between recorded floods: Mosley, M.P., and A. I. McKerchar, Streamflow, in Handbook of that is, the tendency for annual floods to be large (small) at Hydrology, edited by D. R. Maidment, Chap. 8, pp , McGraw-Hill, New York, neighboring sites in wet (dry) years. (5) The statistical structure Natural Environmental Research Council (NERC), NERC flood studof both GLMs and GLMMs allows formal tests to be made for ies report, vol. 1, chap. 3 and 4, London, detecting time trends in annual flood records and for assessing Stedinger, J. R., and G. D. Tasker, Regional hydrologic analysis, 1, which climatological and physiographic variables are useful for Ordinary, weighted, and generalized least squares compared, Water regional information transfer. (6) Appropriate choice of GLM Resour. Res., 21, , Stedinger, J. R., and G. D. Tasker, Correction to "Regional Hydrolink function and probability distribution avoids biases intrologic Analysis, 1, Ordinary, Weighted, and Generalized Least duced by back transformation from the log scale, used in re- Squares Compared," Water Resour. Res., 22, 844, 1986a. gional regression, to the scale of flood measurement (i.e., m 3 Stedinger, J. R., and G. D. Tasker, Regional hydrologic analysis, 2, s- ). (7) For the data used in this paper to explore GLMs and Model-error estimators, estimation of sigma and log-pearson type 3 distributions, Water Resour. Res., 22, , 1986b. GLMMs, two-parameter gamma distributions with constant Stedinger, J. R., R. M. Vogel, and E. Foufoula-Georgiou, Frequency shape parameter, were appropriate. This may not generally be analysis of extreme events, in Handbook of Hydrology, edited by the case. While the models can be adapted to nonhomoge- D. R. Maidment, Chap. 18, pp , McGraw-Hill, New neous, if the nonhomogeneity can be parsimoniously de- York, scribed by one or two parameters, GLMs and GLMMs would Tasker, G. 
D., Hydrologic regression with weighted least squares, Water Resour. Res., 16, , have little to offer where nonhomogeneity cannot be so mod- Thomas, D. M., and M. A. Benson, Generalization of streamflow eled. characteristics from drainage-basin characteristics, U.S. Geol. Surv. Water Supply Pap., 1975, 55 pp., Acknowledgment. The author thanks the referees for some constructive suggestions. References Airkin, M., Modelling variance heterogeneity in normal regression using GLIM, Appl. Stat., 36, , R. T. Clarke, Instituto de Pesquisas Hidrfiulicas, UFRGS, Caixa Postale 15029, Avenida Bonto Goncalves 9500, Porto Alegre, RS CEP , Brazil. (clarke@if. ufrgs.br) (Received August 21, 2000; revised November 13, 2000; accepted November 13, 2000.)
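The fixed-effects formulation discussed above (gamma errors, log link, linear predictor $\eta_{ij} = \mu + a_i + s_j$ with sum-to-zero constraints) can be illustrated with a short sketch. This is not the author's code or data: the record layout, the parameter values, and the function names `effect_coding`, `design_matrix`, and `fit_gamma_log_glm` are assumptions made purely for illustration, and the floods are simulated. The sketch relies on the fact that, for a gamma GLM with log link, the working weights in iterative weighted least squares are all equal, so each iteration reduces to an ordinary least-squares fit to the working response.

```python
import numpy as np

def effect_coding(levels, k):
    """Sum-to-zero coding: k - 1 columns; the last level is -1 in every column."""
    X = np.zeros((len(levels), k - 1))
    for row, lev in enumerate(levels):
        if lev < k - 1:
            X[row, lev] = 1.0
        else:
            X[row, :] = -1.0
    return X

def design_matrix(year, site, n_years, n_sites):
    """Columns: intercept mu, year effects a_i, site effects s_j."""
    return np.hstack([np.ones((len(year), 1)),
                      effect_coding(year, n_years),
                      effect_coding(site, n_sites)])

def fit_gamma_log_glm(y, X, n_iter=200, tol=1e-10):
    """Iterative weighted least squares for a gamma GLM with log link.
    The weights (d mu / d eta)^2 / V(mu) equal 1 here, so every step is
    an unweighted least-squares fit to the working response z."""
    beta, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)  # starting values
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu                           # working response
        new_beta, *_ = np.linalg.lstsq(X, z, rcond=None)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta

# Hypothetical layout (assumed): 30 years, 12 sites, staggered record starts.
rng = np.random.default_rng(0)
n_years, n_sites, shape = 30, 12, 4.0
mu_true = 6.0
a_true = rng.normal(0.0, 0.3, n_years)
a_true -= a_true.mean()                      # year effects sum to zero
s_true = rng.normal(0.0, 0.8, n_sites)
s_true -= s_true.mean()                      # site effects sum to zero
start = np.array([0, 0, 0, 0, 5, 5, 10, 10, 15, 15, 20, 20])

# One row per recorded (year, site) pair; years before `start` are missing.
pairs = [(i, j) for j in range(n_sites) for i in range(start[j], n_years)]
year, site = np.array(pairs).T

# Simulated annual floods: gamma with mean mu_ij and constant shape.
mean_true = np.exp(mu_true + a_true[year] + s_true[site])
q = rng.gamma(shape, mean_true / shape)

X = design_matrix(year, site, n_years, n_sites)
beta_hat = fit_gamma_log_glm(q, X)

mu_hat = beta_hat[0]
s_free = beta_hat[n_years:]                  # the n_sites - 1 free site effects
s_hat = np.append(s_free, -s_free.sum())     # last effect from the constraint

raw_mean = np.array([q[site == j].mean() for j in range(n_sites)])
adjusted_mean = np.exp(mu_hat + s_hat)       # site means free of year effects

for j in range(n_sites):
    print(f"site {j:2d}  raw mean {raw_mean[j]:9.1f}  exp(mu + s_j) {adjusted_mean[j]:9.1f}")
```

In this sketch the raw mean of a short record reflects whichever years happened to be recorded, whereas `adjusted_mean` is computed after the year effects have been estimated from all sites jointly. Regression on drainage area, random year effects, and estimation of the shape parameter are deliberately left out.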

The effects of errors in measuring drainage basin area on regionalized estimates of mean annual flood: a simulation study

The effects of errors in measuring drainage basin area on regionalized estimates of mean annual flood: a simulation study Predictions in Ungauged Basins: PUB Kick-off (Proceedings of the PUB Kick-off meeting held in Brasilia, 20 22 November 2002). IAHS Publ. 309, 2007. 243 The effects of errors in measuring drainage basin

More information

Estimating time trends in Gumbel-distributed data by means of generalized linear models

Estimating time trends in Gumbel-distributed data by means of generalized linear models WATER RESOURCES RESEARCH, VOL. 38, NO. 7, 1111, 10.1029/2001WR000917, 2002 Estimating time trends in Gumbel-distributed data by means of generalized linear models Robin T. Clarke Instituto de Pesquisas

More information

How Significant is the BIAS in Low Flow Quantiles Estimated by L- and LH-Moments?

How Significant is the BIAS in Low Flow Quantiles Estimated by L- and LH-Moments? How Significant is the BIAS in Low Flow Quantiles Estimated by L- and LH-Moments? Hewa, G. A. 1, Wang, Q. J. 2, Peel, M. C. 3, McMahon, T. A. 3 and Nathan, R. J. 4 1 University of South Australia, Mawson

More information

Bayesian GLS for Regionalization of Flood Characteristics in Korea

Bayesian GLS for Regionalization of Flood Characteristics in Korea Bayesian GLS for Regionalization of Flood Characteristics in Korea Dae Il Jeong 1, Jery R. Stedinger 2, Young-Oh Kim 3, and Jang Hyun Sung 4 1 Post-doctoral Fellow, School of Civil and Environmental Engineering,

More information

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 16 Introduction

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 16 Introduction Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 16 Introduction ReCap. Parts I IV. The General Linear Model Part V. The Generalized Linear Model 16 Introduction 16.1 Analysis

More information

The use of L-moments for regionalizing flow records in the Rio Uruguai basin: a case study

The use of L-moments for regionalizing flow records in the Rio Uruguai basin: a case study Regionalization in Ifylwltm (Proceedings of the Ljubljana Symposium, April 1990). IAHS Publ. no. 191, 1990. The use of L-moments for regionalizing flow records in the Rio Uruguai basin: a case study ROBM

More information

Regional Estimation from Spatially Dependent Data

Regional Estimation from Spatially Dependent Data Regional Estimation from Spatially Dependent Data R.L. Smith Department of Statistics University of North Carolina Chapel Hill, NC 27599-3260, USA December 4 1990 Summary Regional estimation methods are

More information

Generalized Linear Models (GLZ)

Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the

More information

BOOTSTRAPPING WITH MODELS FOR COUNT DATA

BOOTSTRAPPING WITH MODELS FOR COUNT DATA Journal of Biopharmaceutical Statistics, 21: 1164 1176, 2011 Copyright Taylor & Francis Group, LLC ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543406.2011.607748 BOOTSTRAPPING WITH MODELS FOR

More information

Design Flood Estimation in Ungauged Catchments: Quantile Regression Technique And Probabilistic Rational Method Compared

Design Flood Estimation in Ungauged Catchments: Quantile Regression Technique And Probabilistic Rational Method Compared Design Flood Estimation in Ungauged Catchments: Quantile Regression Technique And Probabilistic Rational Method Compared N Rijal and A Rahman School of Engineering and Industrial Design, University of

More information

Daily Rainfall Disaggregation Using HYETOS Model for Peninsular Malaysia

Daily Rainfall Disaggregation Using HYETOS Model for Peninsular Malaysia Daily Rainfall Disaggregation Using HYETOS Model for Peninsular Malaysia Ibrahim Suliman Hanaish, Kamarulzaman Ibrahim, Abdul Aziz Jemain Abstract In this paper, we have examined the applicability of single

More information

Rainfall variability and uncertainty in water resource assessments in South Africa

Rainfall variability and uncertainty in water resource assessments in South Africa New Approaches to Hydrological Prediction in Data-sparse Regions (Proc. of Symposium HS.2 at the Joint IAHS & IAH Convention, Hyderabad, India, September 2009). IAHS Publ. 333, 2009. 287 Rainfall variability

More information

Regional Frequency Analysis of Extreme Climate Events. Theoretical part of REFRAN-CV

Regional Frequency Analysis of Extreme Climate Events. Theoretical part of REFRAN-CV Regional Frequency Analysis of Extreme Climate Events. Theoretical part of REFRAN-CV Course outline Introduction L-moment statistics Identification of Homogeneous Regions L-moment ratio diagrams Example

More information

LOGISTIC REGRESSION Joseph M. Hilbe

LOGISTIC REGRESSION Joseph M. Hilbe LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of

More information

Flood frequency analysis at ungauged sites in the KwaZulu-Natal Province, South Africa

Flood frequency analysis at ungauged sites in the KwaZulu-Natal Province, South Africa Flood frequency analysis at ungauged sites in the KwaZulu-Natal Province, South Africa TR Kjeldsen 1 *, JC Smithers * and RE Schulze 1 Environment & Resources DTU, Technical University of Denmark, Building

More information

Generalized Linear Models: An Introduction

Generalized Linear Models: An Introduction Applied Statistics With R Generalized Linear Models: An Introduction John Fox WU Wien May/June 2006 2006 by John Fox Generalized Linear Models: An Introduction 1 A synthesis due to Nelder and Wedderburn,

More information

A Report on a Statistical Model to Forecast Seasonal Inflows to Cowichan Lake

A Report on a Statistical Model to Forecast Seasonal Inflows to Cowichan Lake A Report on a Statistical Model to Forecast Seasonal Inflows to Cowichan Lake Prepared by: Allan Chapman, MSc, PGeo Hydrologist, Chapman Geoscience Ltd., and Former Head, BC River Forecast Centre Victoria

More information

1. Evaluation of Flow Regime in the Upper Reaches of Streams Using the Stochastic Flow Duration Curve

1. Evaluation of Flow Regime in the Upper Reaches of Streams Using the Stochastic Flow Duration Curve 1. Evaluation of Flow Regime in the Upper Reaches of Streams Using the Stochastic Flow Duration Curve Hironobu SUGIYAMA 1 ABSTRACT A stochastic estimation of drought evaluation in the upper reaches of

More information

Generalized Linear Models

Generalized Linear Models York SPIDA John Fox Notes Generalized Linear Models Copyright 2010 by John Fox Generalized Linear Models 1 1. Topics I The structure of generalized linear models I Poisson and other generalized linear

More information

Model Estimation Example

Model Estimation Example Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

More information

Diagnostics can identify two possible areas of failure of assumptions when fitting linear models.

Diagnostics can identify two possible areas of failure of assumptions when fitting linear models. 1 Transformations 1.1 Introduction Diagnostics can identify two possible areas of failure of assumptions when fitting linear models. (i) lack of Normality (ii) heterogeneity of variances It is important

More information

Generalized Linear. Mixed Models. Methods and Applications. Modern Concepts, Walter W. Stroup. Texts in Statistical Science.

Generalized Linear. Mixed Models. Methods and Applications. Modern Concepts, Walter W. Stroup. Texts in Statistical Science. Texts in Statistical Science Generalized Linear Mixed Models Modern Concepts, Methods and Applications Walter W. Stroup CRC Press Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint

More information

Modeling of peak inflow dates for a snowmelt dominated basin Evan Heisman. CVEN 6833: Advanced Data Analysis Fall 2012 Prof. Balaji Rajagopalan

Modeling of peak inflow dates for a snowmelt dominated basin Evan Heisman. CVEN 6833: Advanced Data Analysis Fall 2012 Prof. Balaji Rajagopalan Modeling of peak inflow dates for a snowmelt dominated basin Evan Heisman CVEN 6833: Advanced Data Analysis Fall 2012 Prof. Balaji Rajagopalan The Dworshak reservoir, a project operated by the Army Corps

More information

Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/

Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/ Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/28.0018 Statistical Analysis in Ecology using R Linear Models/GLM Ing. Daniel Volařík, Ph.D. 13.

More information

11. Generalized Linear Models: An Introduction

11. Generalized Linear Models: An Introduction Sociology 740 John Fox Lecture Notes 11. Generalized Linear Models: An Introduction Copyright 2014 by John Fox Generalized Linear Models: An Introduction 1 1. Introduction I A synthesis due to Nelder and

More information

1990 Intergovernmental Panel on Climate Change Impacts Assessment

1990 Intergovernmental Panel on Climate Change Impacts Assessment 1990 Intergovernmental Panel on Climate Change Impacts Assessment Although the variability of weather and associated shifts in the frequency and magnitude of climate events were not available from the

More information

TREND AND VARIABILITY ANALYSIS OF RAINFALL SERIES AND THEIR EXTREME

TREND AND VARIABILITY ANALYSIS OF RAINFALL SERIES AND THEIR EXTREME TREND AND VARIABILITY ANALYSIS OF RAINFALL SERIES AND THEIR EXTREME EVENTS J. Abaurrea, A. C. Cebrián. Dpto. Métodos Estadísticos. Universidad de Zaragoza. Abstract: Rainfall series and their corresponding

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

GENERALIZED LINEAR MODELING APPROACH TO STOCHASTIC WEATHER GENERATORS

GENERALIZED LINEAR MODELING APPROACH TO STOCHASTIC WEATHER GENERATORS GENERALIZED LINEAR MODELING APPROACH TO STOCHASTIC WEATHER GENERATORS Rick Katz Institute for Study of Society and Environment National Center for Atmospheric Research Boulder, CO USA Joint work with Eva

More information

Extreme Rain all Frequency Analysis for Louisiana

Extreme Rain all Frequency Analysis for Louisiana 78 TRANSPORTATION RESEARCH RECORD 1420 Extreme Rain all Frequency Analysis for Louisiana BABAK NAGHAVI AND FANG XIN Yu A comparative study of five popular frequency distributions and three parameter estimation

More information

Trends in floods in small Norwegian catchments instantaneous vs daily peaks

Trends in floods in small Norwegian catchments instantaneous vs daily peaks 42 Hydrology in a Changing World: Environmental and Human Dimensions Proceedings of FRIEND-Water 2014, Montpellier, France, October 2014 (IAHS Publ. 363, 2014). Trends in floods in small Norwegian catchments

More information

Examination of homogeneity of selected Irish pooling groups

Examination of homogeneity of selected Irish pooling groups Hydrol. Earth Syst. Sci., 15, 819 830, 2011 doi:10.5194/hess-15-819-2011 Author(s) 2011. CC Attribution 3.0 License. Hydrology and Earth System Sciences Examination of homogeneity of selected Irish pooling

More information

Assessment of rainfall and evaporation input data uncertainties on simulated runoff in southern Africa

Assessment of rainfall and evaporation input data uncertainties on simulated runoff in southern Africa 98 Quantification and Reduction of Predictive Uncertainty for Sustainable Water Resources Management (Proceedings of Symposium HS24 at IUGG27, Perugia, July 27). IAHS Publ. 313, 27. Assessment of rainfall

More information

University of East London Institutional Repository:

University of East London Institutional Repository: University of East London Institutional Repository: http://roar.uel.ac.uk This paper is made available online in accordance with publisher policies. Please scroll down to view the document itself. Please

More information

GLM models and OLS regression

GLM models and OLS regression GLM models and OLS regression Graeme Hutcheson, University of Manchester These lecture notes are based on material published in... Hutcheson, G. D. and Sofroniou, N. (1999). The Multivariate Social Scientist:

More information

Generalized Linear Models 1

Generalized Linear Models 1 Generalized Linear Models 1 STA 2101/442: Fall 2012 1 See last slide for copyright information. 1 / 24 Suggested Reading: Davison s Statistical models Exponential families of distributions Sec. 5.2 Chapter

More information

PRELIMINARY DRAFT FOR DISCUSSION PURPOSES

PRELIMINARY DRAFT FOR DISCUSSION PURPOSES Memorandum To: David Thompson From: John Haapala CC: Dan McDonald Bob Montgomery Date: February 24, 2003 File #: 1003551 Re: Lake Wenatchee Historic Water Levels, Operation Model, and Flood Operation This

More information

TABLE OF CONTENTS. 3.1 Synoptic Patterns Precipitation and Topography Precipitation Regionalization... 11

TABLE OF CONTENTS. 3.1 Synoptic Patterns Precipitation and Topography Precipitation Regionalization... 11 TABLE OF CONTENTS ABSTRACT... iii 1 INTRODUCTION... 1 2 DATA SOURCES AND METHODS... 2 2.1 Data Sources... 2 2.2 Frequency Analysis... 2 2.2.1 Precipitation... 2 2.2.2 Streamflow... 2 2.3 Calculation of

More information

Linear, Generalized Linear, and Mixed-Effects Models in R. Linear and Generalized Linear Models in R Topics

Linear, Generalized Linear, and Mixed-Effects Models in R. Linear and Generalized Linear Models in R Topics Linear, Generalized Linear, and Mixed-Effects Models in R John Fox McMaster University ICPSR 2018 John Fox (McMaster University) Statistical Models in R ICPSR 2018 1 / 19 Linear and Generalized Linear

More information

for explaining hydrological losses in South Australian catchments by S. H. P. W. Gamage The Cryosphere

for explaining hydrological losses in South Australian catchments by S. H. P. W. Gamage The Cryosphere Geoscientific Model Development pen Access Geoscientific Model Development pen Access Hydrology and Hydrol. Earth Syst. Sci. Discuss., 10, C2196 C2210, 2013 www.hydrol-earth-syst-sci-discuss.net/10/c2196/2013/

More information

Dear Editor, Response to Anonymous Referee #1. Comment 1:

Dear Editor, Response to Anonymous Referee #1. Comment 1: Dear Editor, We would like to thank you and two anonymous referees for the opportunity to revise our manuscript. We found the comments of the two reviewers very useful, which gave us a possibility to address

More information

Sources of uncertainty in estimating suspended sediment load

Sources of uncertainty in estimating suspended sediment load 136 Sediment Budgets 2 (Proceedings of symposium S1 held during the Seventh IAHS Scientific Assembly at Foz do Iguaçu, Brazil, April 2005). IAHS Publ. 292, 2005. Sources of uncertainty in estimating suspended

More information

Regionalization for one to seven day design rainfall estimation in South Africa

Regionalization for one to seven day design rainfall estimation in South Africa FRIEND 2002 Regional Hydrology: Bridging the Gap between Research and Practice (Proceedings of (he fourth International l-'riknd Conference held at Cape Town. South Africa. March 2002). IAI IS Publ. no.

More information

INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY. Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc.

INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY. Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc. INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc. Course Contents Introduction to Random Variables (RVs) Probability Distributions

More information

Reprinted from MONTHLY WEATHER REVIEW, Vol. 109, No. 12, December 1981 American Meteorological Society Printed in I'. S. A.

Reprinted from MONTHLY WEATHER REVIEW, Vol. 109, No. 12, December 1981 American Meteorological Society Printed in I'. S. A. Reprinted from MONTHLY WEATHER REVIEW, Vol. 109, No. 12, December 1981 American Meteorological Society Printed in I'. S. A. Fitting Daily Precipitation Amounts Using the S B Distribution LLOYD W. SWIFT,

More information

Regional Flood Estimation for NSW: Comparison of Quantile Regression and Parameter Regression Techniques

Regional Flood Estimation for NSW: Comparison of Quantile Regression and Parameter Regression Techniques 21st International Congress on Modelling and Simulation, Gold Coast, Australia, 29 Nov to 4 Dec 2015 www.mssanz.org.au/modsim2015 Regional Flood Estimation for NSW: Comparison of Quantile Regression and

More information

Overview of a Changing Climate in Rhode Island

Overview of a Changing Climate in Rhode Island Overview of a Changing Climate in Rhode Island David Vallee, Hydrologist in Charge, National Weather Service Northeast River Forecast Center, NOAA Lenny Giuliano, Air Quality Specialist, Rhode Island Department

More information

Chapter 22: Log-linear regression for Poisson counts

Chapter 22: Log-linear regression for Poisson counts Chapter 22: Log-linear regression for Poisson counts Exposure to ionizing radiation is recognized as a cancer risk. In the United States, EPA sets guidelines specifying upper limits on the amount of exposure

More information

A review: regional frequency analysis of annual maximum rainfall in monsoon region of Pakistan using L-moments

A review: regional frequency analysis of annual maximum rainfall in monsoon region of Pakistan using L-moments International Journal of Advanced Statistics and Probability, 1 (3) (2013) 97-101 Science Publishing Corporation www.sciencepubco.com/index.php/ijasp A review: regional frequency analysis of annual maximum

More information

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010 1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

WINFAP 4 QMED Linking equation

WINFAP 4 QMED Linking equation WINFAP 4 QMED Linking equation WINFAP 4 QMED Linking equation Wallingford HydroSolutions Ltd 2016. All rights reserved. This report has been produced in accordance with the WHS Quality & Environmental

More information

On the modelling of extreme droughts

On the modelling of extreme droughts Modelling and Management of Sustainable Basin-scale Water Resource Systems (Proceedings of a Boulder Symposium, July 1995). IAHS Publ. no. 231, 1995. 377 _ On the modelling of extreme droughts HENRIK MADSEN

More information
