Extending clustered point process-based rainfall models to a non-stationary climate
Jo Kaczmarska (1, 2), Valerie Isham (2), Paul Northrop (2)
(1) Risk Management Solutions; (2) Department of Statistical Science, University College London
Berlin Workshop on Weather Generators, September 2017
Introduction/Motivation
Requirement: Ability to use Poisson cluster models to simulate rainfall in a changing climate.
Method: A non-parametric kernel-based approach relates the parameters of the model to large-scale atmospheric variables, such as sea-level pressure and temperature, which can be simulated from a climate model.
Focus of research: Sub-hourly resolution at a single site.
Structure of Presentation
1. Point process-based rainfall model used here
2. Fitting methodology (Generalised Method of Moments)
   - Current approach: discrete covariate (month)
   - Proposed approach: smooth seasonality and add or replace with continuous covariates
3. Empirical Study and Results
Original Bartlett-Lewis Rectangular Pulse model
Summary of parameters:
- λ: rate of storm arrival
- γ: rate of storm death
- β: rate of cell arrival
- η: rate of cell death
- µ_X: mean cell intensity
[Figure: schematic of rainfall intensity against time, showing storm arrivals at rate λ, storm duration Exp(γ), cell arrivals at rate β with a cell at the storm origin, cell inter-arrival times Exp(β), and the individual cells.]
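The model structure above can be illustrated with a short simulation. This is a minimal sketch, not the authors' code: it assumes exponential cell lifetimes (rate η) and exponentially distributed cell intensities with mean µ_X (the slide gives only the mean), and it ignores storms that start before time 0. The function names `simulate_blrp` and `aggregate` are hypothetical.

```python
import random

def simulate_blrp(lam, gamma, beta, eta, mu_x, t_end, seed=None):
    """Simulate the rain cells of a Bartlett-Lewis rectangular pulse process.

    Storm origins fall in [0, t_end]. Returns a list of
    (start, end, intensity) tuples, one per cell.
    """
    rng = random.Random(seed)
    cells = []
    t = 0.0
    while True:
        t += rng.expovariate(lam)              # next storm origin (Poisson, rate lam)
        if t > t_end:
            break
        storm_end = t + rng.expovariate(gamma)  # storm lifetime ~ Exp(gamma)
        cell_start = t                          # first cell sits at the storm origin
        while cell_start < storm_end:
            duration = rng.expovariate(eta)     # cell lifetime ~ Exp(eta)
            # Assumption: exponential intensity with mean mu_x.
            intensity = rng.expovariate(1.0 / mu_x)
            cells.append((cell_start, cell_start + duration, intensity))
            cell_start += rng.expovariate(beta)  # cell inter-arrival ~ Exp(beta)
    return cells

def aggregate(cells, t_end, step):
    """Accumulate rainfall depth in consecutive intervals of length `step`."""
    n = int(t_end / step)
    depths = [0.0] * n
    for start, end, x in cells:
        first = int(start / step)
        last = min(int(end / step), n - 1)
        for i in range(first, last + 1):
            lo = max(start, i * step)
            hi = min(end, (i + 1) * step)
            if hi > lo:
                depths[i] += x * (hi - lo)      # rectangular pulse: intensity x time
    return depths
```

For example, `aggregate(simulate_blrp(0.02, 0.1, 0.3, 2.0, 1.0, 1000.0, seed=1), 1000.0, 1.0)` gives 1000 hourly depths if rates are per hour and intensities in mm/h; these parameter values are purely illustrative.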
GMM fitting with discrete covariate
With a separate model for each calendar month, m, we solve:

$$\hat\theta_m = \arg\min_{\theta_m} \left[ \left\{ \frac{1}{\sum_{t=1}^{n} I(m_t = m)} \sum_{t=1}^{n} I(m_t = m)\,[T(Y_t) - \tau(\theta_m)] \right\}^{T} W_m \left\{ \frac{1}{\sum_{t=1}^{n} I(m_t = m)} \sum_{t=1}^{n} I(m_t = m)\,[T(Y_t) - \tau(\theta_m)] \right\} \right]$$

where:
- Y_t is the rainfall data in observation month t and T(Y_t) is the vector of statistics for that month, with τ(θ_m) the vector of expected values for calendar month m.
- θ has 5 components, and T and τ have 13 (mean hourly rainfall, plus the coefficient of variation, skewness and lag-1 autocorrelation at timescales of 5 minutes and 1, 6 and 24 hours).
- W_m is the weights matrix for month m; we use the diagonal matrix of inverse sample variances of the 13 properties.
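As an illustration of how the 13 properties in T(Y_t) might be computed from one month of 5-minute depths, here is a minimal sketch. It assumes simple (1/n) moment estimators and an arbitrary ordering of the vector; neither is specified on the slide, and the function names are hypothetical.

```python
import math

def aggregate_series(x, k):
    """Sum consecutive blocks of k values (k=12 turns 5-min depths into hourly)."""
    return [sum(x[i:i + k]) for i in range(0, len(x) - k + 1, k)]

def summary_stats(x):
    """Return (mean, coefficient of variation, skewness, lag-1 autocorrelation).

    Uses 1/n moment estimators (an assumption). Fails for an all-zero
    or constant series, where the mean or variance is zero.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev) / n
    sd = math.sqrt(var)
    cv = sd / mean
    skew = sum(d ** 3 for d in dev) / n / sd ** 3
    ac1 = sum(dev[i] * dev[i + 1] for i in range(n - 1)) / n / var
    return mean, cv, skew, ac1

def monthly_statistics(five_min):
    """13-vector: mean hourly rainfall, then (CV, skewness, lag-1 ac)
    at 5 min, 1 h, 6 h and 24 h. The ordering is illustrative only."""
    hourly = aggregate_series(five_min, 12)
    stats = [sum(hourly) / len(hourly)]
    for k in (1, 12, 72, 288):      # 5 min, 1 h, 6 h, 24 h in 5-min steps
        _, cv, skew, ac1 = summary_stats(aggregate_series(five_min, k))
        stats.extend([cv, skew, ac1])
    return stats
```

The diagonal of W_m would then hold the inverse sample variance of each of these 13 entries across the observation months assigned to calendar month m.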
What if our covariate is continuous rather than discrete?
We could partition our continuous covariate into a number of discrete ordered bins and fit a separate model for each bin, as per the existing method. But is there a better way?
Motivating Example: Scatterplot smoothing
[Figure: two scatterplots of y against x (x from 0 to 1), one smoothed using bins and one using Gaussian kernel weights.]
Kernel Smoothing
The example showed a local mean, or Nadaraya-Watson, estimate using Gaussian kernel weights. It is given by:

$$\hat f(x_0) = \frac{\sum_{j=1}^{n} K_h(x_j - x_0)\, y_j}{\sum_{j=1}^{n} K_h(x_j - x_0)}$$

which must be calculated for each required value of x_0. Here

$$K_h(X_t - x_0) = \frac{1}{h} K\!\left(\frac{X_t - x_0}{h}\right)$$

are local weights, and h determines the width of the local neighbourhood.
Properties of the kernel function, K:
- integrates to 1
- peaks at zero
- decreases as |X_t - x_0| increases.
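The local mean estimate can be written in a few lines. The following is a minimal sketch with a Gaussian kernel; the function names are illustrative rather than taken from the talk.

```python
import math

def gaussian_kernel(u):
    """Standard Gaussian density: integrates to 1 and peaks at zero."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_weight(x, x0, h):
    """Local weight K_h(x - x0) = K((x - x0) / h) / h."""
    return gaussian_kernel((x - x0) / h) / h

def nadaraya_watson(xs, ys, x0, h):
    """Kernel-weighted local mean of ys at x0.

    Raises ZeroDivisionError if x0 is so far from every x (relative to h)
    that all weights underflow to zero.
    """
    weights = [kernel_weight(x, x0, h) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)
```

Calling `nadaraya_watson(xs, ys, x0, h)` over a grid of `x0` values traces out the smoothed curve; a larger `h` averages over a wider neighbourhood.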
Methodology - Local Mean GMM
Recall the formula with the discrete covariate, month:

$$\hat\theta_m = \arg\min_{\theta_m} \left[ \left\{ \frac{1}{\sum_{t=1}^{n} I(m_t = m)} \sum_{t=1}^{n} I(m_t = m)\,[T(Y_t) - \tau(\theta_m)] \right\}^{T} W_m \left\{ \frac{1}{\sum_{t=1}^{n} I(m_t = m)} \sum_{t=1}^{n} I(m_t = m)\,[T(Y_t) - \tau(\theta_m)] \right\} \right]$$

Now, for a continuous covariate, we replace the indicator functions with kernel weights to get parameters at X = x_0:

$$\hat\theta(x_0) = \arg\min_{\theta_{x_0}} \left[ \left\{ \frac{1}{n} \sum_{t=1}^{n} K_h(X_t - x_0)\,[T(Y_t) - \tau(\theta_{x_0})] \right\}^{T} W_{x_0} \left\{ \frac{1}{n} \sum_{t=1}^{n} K_h(X_t - x_0)\,[T(Y_t) - \tau(\theta_{x_0})] \right\} \right]$$

We use intervals of a month; the covariate will typically be a mean monthly value.
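The kernel-weighted objective can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: W is taken to be diagonal, the kernel is Gaussian, `tau` is any user-supplied function returning the expected statistics for a parameter value, and the grid search in `local_gmm_fit` is a crude stand-in for a proper numerical optimiser. All names are hypothetical.

```python
import math

def kernel_weight(x, x0, h):
    """Gaussian K_h(x - x0)."""
    u = (x - x0) / h
    return math.exp(-0.5 * u * u) / (h * math.sqrt(2.0 * math.pi))

def local_gmm_objective(theta, x0, X, T, tau, w_diag, h):
    """Kernel-weighted GMM objective at covariate value x0.

    X: covariate value for each observation month.
    T: list of statistic vectors T(Y_t), one per observation month.
    w_diag: diagonal of the weights matrix W.
    """
    n = len(X)
    expected = tau(theta)
    g = [0.0] * len(expected)
    for x_t, t_stats in zip(X, T):
        k = kernel_weight(x_t, x0, h)
        for j, (obs, exp_j) in enumerate(zip(t_stats, expected)):
            g[j] += k * (obs - exp_j) / n      # (1/n) sum of K_h * [T - tau]
    return sum(w * gj * gj for w, gj in zip(w_diag, g))

def local_gmm_fit(x0, X, T, tau, w_diag, h, candidates):
    """Pick the candidate theta with the smallest objective (grid search)."""
    return min(candidates,
               key=lambda th: local_gmm_objective(th, x0, X, T, tau, w_diag, h))
```

In a toy case where `tau` is linear in a single parameter and the statistics are mutually consistent, the minimiser coincides with the Nadaraya-Watson local mean, which is one way to sanity-check an implementation.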
The different timescales
Time intervals in the summation - months:
- Short enough that it is reasonable to treat the series within each interval as stationary.
- Long enough for small-sample biases in the properties to be negligible.
- Long enough to permit treatment of the properties as independent between the intervals.
Timescales for the properties - 5 min to 24 hours:
- To provide information about the observed rainfall structure at both cell and storm levels.
- Including a range of levels (especially sub-hourly) helps with parameter identification.
Data
Practical application:
- 5-min rainfall data from Bochum, Germany, 1931 to 1999;
- Monthly NCEP reanalysis data (available from Jan 1948, grid point: lat 52.5N, long 7.5E) plus NAO index.
Combined data gives 624 monthly observations.

Practical Issues
- Numerical optimisation
- Extend to local linear? ("design-adaptive")
- Calculation of weighting matrix
- Choice of bandwidth: variance versus bias
- Extension for multiple covariates: How many? Which ones? Bandwidth matrix
Single covariate: effect of the bandwidth
[Figure: fitted log λ, log µ_X, log β, log η and log γ plotted against temperature (deg C), in three columns for h = 0.5, h = 1.5 and h = 5.]
The choice of bandwidth, h, affects the smoothness of the curves. A higher h gives a smoother, flatter curve, with lower variance but higher bias.
We are assuming a global bandwidth, i.e. the same h across the whole curve.
Optimal bandwidth - temperature
1. Visualisation
2. A variant of cross-validation: 25 hold-out samples, splitting our 624 observation months randomly each time into a training sample of 399 and a test sample of 225.
For each split, find the h that minimises the sum of weighted squared residuals over the test set, with parameters derived from the training set.
The median optimal h was 1.28, the mean 1.35.
[Figure: scaled prediction error against bandwidth, and density of the optimal bandwidth.]
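The hold-out procedure can be sketched as below. This is a hedged illustration: `fit` and `tau` are placeholders for the local GMM fit and the model's expected statistics, the weights are assumed diagonal, and the candidate bandwidths form a fixed grid. The helper `toy_local_mean_fit` is a stand-in used only to make the example runnable; it is not the rainfall model.

```python
import math
import random

def holdout_optimal_bandwidths(X, T, fit, tau, w_diag, bandwidths,
                               n_splits=25, n_train=399, seed=0):
    """For each random split, return the bandwidth with the smallest
    weighted squared prediction error on the held-out months."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    best = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        X_tr = [X[i] for i in train]
        T_tr = [T[i] for i in train]
        errors = []
        for h in bandwidths:
            err = 0.0
            for i in test:
                # Parameters come from the training set, evaluated at the
                # test month's covariate value.
                pred = tau(fit(X[i], X_tr, T_tr, h))
                err += sum(w * (obs - p) ** 2
                           for w, obs, p in zip(w_diag, T[i], pred))
            errors.append(err)
        best.append(bandwidths[errors.index(min(errors))])
    return best

def toy_local_mean_fit(x0, X_train, T_train, h):
    """Toy stand-in for the local GMM fit: Gaussian-kernel local mean
    of each statistic (pair it with tau = identity)."""
    weights = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in X_train]
    total = sum(weights)
    n_stats = len(T_train[0])
    return [sum(w * t[j] for w, t in zip(weights, T_train)) / total
            for j in range(n_stats)]
```

The median or mean of the returned list (for example via `statistics.median`) then summarises the optimal bandwidth across splits, in the spirit of the values quoted above.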
Multiple Covariates
Now require a bandwidth matrix, H (controls size and direction of smoothing).
Diagonal H gives product kernels: smoothing by different amounts in the directions of the co-ordinate axes.

$$K_H(X_t - x_0) = K_{h_1}(X_{t1} - x_{01})\, K_{h_2}(X_{t2} - x_{02}) \cdots K_{h_3}(X_{t3} - x_{03})$$

[Figure: three panels, (1) to (3), each on axes from -2 to 2.]
Curse of dimensionality
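With a diagonal bandwidth matrix, the multivariate weight is just a product of univariate kernel weights, one per covariate. A minimal sketch, assuming Gaussian kernels and hypothetical function names:

```python
import math

def gaussian_kernel_h(u, h):
    """Univariate Gaussian K_h(u) = K(u / h) / h."""
    z = u / h
    return math.exp(-0.5 * z * z) / (h * math.sqrt(2.0 * math.pi))

def product_kernel(x_t, x0, bandwidths):
    """Product-kernel weight for covariate vector x_t at target point x0.

    bandwidths holds one h per covariate, i.e. the diagonal of H.
    """
    w = 1.0
    for xt_j, x0_j, h_j in zip(x_t, x0, bandwidths):
        w *= gaussian_kernel_h(xt_j - x0_j, h_j)
    return w
```

For instance, `product_kernel((1015.0, 10.0), (1013.0, 9.0), (2.0, 1.75))` weights a month with sea-level pressure 1015 mb and temperature 10 deg C relative to a target of 1013 mb and 9 deg C; the covariate values here are made up for illustration. Because each extra covariate multiplies in another factor below its peak, few months receive appreciable weight in higher dimensions, which is the curse of dimensionality noted above.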
Choosing covariates: preliminary analysis
Columns: mean; coefficient of variation (CV), skewness (skew) and lag-1 autocorrelation (ac1), each at 5 min and 6 hour timescales.

Covariate           mean    CV 5 min   CV 6 hour  skew 5 min skew 6 hour   ac1 5 min  ac1 6 hour
slp               -0.538       0.304       0.526       0.121       0.385       0.046      -0.013
geo200             0.030       0.608       0.414       0.525       0.364      -0.470      -0.305
geo500            -0.096       0.629       0.499       0.512       0.420      -0.422      -0.280
geo700            -0.195       0.627       0.550       0.485       0.451      -0.372      -0.258
temp               0.170       0.577       0.275       0.546       0.265      -0.542      -0.344
thick              0.085       0.592       0.370       0.527       0.331      -0.484      -0.307
rhum               0.140      -0.586      -0.417      -0.465      -0.318       0.413       0.240
rhum700            0.169      -0.580      -0.434      -0.448      -0.328       0.385       0.229
shum               0.243       0.538       0.224       0.528       0.238      -0.534      -0.343
shum700            0.289       0.479       0.195       0.467       0.217      -0.478      -0.295
uwind              0.374      -0.204      -0.384      -0.092      -0.275      -0.068      -0.022
vwind              0.213      -0.356      -0.368      -0.232      -0.286       0.124       0.084
nao                0.069      -0.045      -0.094      -0.030      -0.087      -0.036      -0.012
nao(winter)        0.372      -0.182      -0.235       0.070      -0.207      -0.164       0.016
Model Comparison: Optimal covariates
[Figure: bar chart of the scaled error statistic for covariate sets none, month, sm.month, temp and slp/temp; bar values shown: 100.0, 86.1, 87.2, 83.0, 74.3.]
We compare prediction errors with different covariates (weights based on global variances).
For multiple covariates, the optimal bandwidth calculation is limited to re-scaling the diagonal matrix of the univariate bandwidths, i.e. we assume that their relative differences stay the same.
Optimal pair of covariates: slp and temp
[Figure: fitted λ, µ_X, β, γ and η plotted against sea level pressure (mb) and against temperature (deg C).]
Bandwidths: sea-level pressure: 2.0; temperature: 1.75
Comparison of fit v current approach
[Figure: four panels of mean rate (mm/h) against month, sea level pressure (mb), temperature (deg C) and zonal wind (m/s).]
Model with covariate month v model with 3 optimal covariates.
Comparison of fit: Other Properties
Breakdown of weighted prediction errors by component (mean over 15 test samples).
[Figure: stacked bars of scaled prediction errors (0 to 100) for covariate sets none, month, temp, slp/temp and slp/temp/uwind, broken down by component: mean; coeffv5m, coeffv1h, coeffv6h, coeffv24h; skew5m, skew1h, skew6h, skew24h; ac1.5m, ac1.1h, ac1.6h, ac1.24h.]
Interannual Variability
[Figure: two panels of mean hourly rainfall (mm) by year, 1950 to 2000: observed v fitted (by calendar month) and observed v fitted (NCEP covariates).]
Simulated distributions of mean annual rainfall (expressed in mm per hour) for Bochum. The bands show the 5th, 10th, 25th, 50th, 75th, 90th and 95th percentiles, and the thick black line shows the observed values.
Summary
Local mean GMM appears to offer a useful new approach to fitting point-process rainfall models. With just 2 or 3 covariates, we can produce a model with better explanatory power than the current approach, and produce simulations that reflect future climate change scenarios.
Key advantages:
- Can relate various properties of rainfall to covariates, not just the mean or probability of occurrence.
- Extends existing methodology, so it can be used with other versions of the model (including spatial-temporal versions).
- Framework allows estimation of standard errors.
Limitations: Future developments?
- Parameter identifiability is still an issue
- Interannual variability is improved, but this doesn't help the fit to extremes very much
- Same level of smoothing for every parameter
- Boundary bias and high variance at boundaries
- Curse of dimensionality
Options:
- Penalised splines: allow additive covariate effects and different levels of smoothing for different parameters. Initial tests with a single covariate gave similar results.
- Addressing extremes: adding other properties and/or covariates, or combining with other model versions?
Acknowledgements Richard Chandler (University College London) Valerie Isham (University College London) Paul Northrop (University College London) Christian Onof (Imperial College London) EPSRC (UK Engineering and Physical Sciences Research Council)
Useful References
Research presented here:
- Kaczmarska, J. M., Isham, V. S. & Northrop, P. J. (2015), Local generalised method of moments: an application to point process-based rainfall models, Environmetrics 26 (4), 312-325.
Point process-based models / GMM:
- Onof, C., Chandler, R. E., Kakou, A., Northrop, P., Wheater, H. S. & Isham, V. (2000), Rainfall modelling using Poisson-cluster processes: a review of developments, Stochastic Environmental Research and Risk Assessment 14, 384-411.
- Jesus, J. & Chandler, R. E. (2011), Estimating functions and the generalized method of moments, Interface Focus 1 (6), 871-885.
Local fitting:
- Fan, J. & Gijbels, I. (1996), Local Polynomial Modelling and its Applications, Chapman and Hall.
- Lewbel, A. (2007), A local generalized method of moments estimator, Economics Letters 94.
- Carroll, R. J., Ruppert, D. & Welsh, A. H. (1998), Local estimating equations, Journal of the American Statistical Association 93 (441), 214-227.