Statistical Models for Rainfall with Applications to April 21, 2008
Overview
The idea: Insure farmers against a cause of crop failure, such as drought, instead of crop failure itself. This reduces moral hazard and adverse selection.
(Rainfall, Other Factors) → Crop Yield → Income.
Statistically speaking: $\text{Crop Yield}_i = f(\text{Rainfall}_i) + \epsilon_i$, $\text{Income}_i = f(\text{Crop Yield}_i)$.
Stabilizing Annual Income
Without Insurance: Net Income = Yield ($)
With Insurance: Net Income = Yield ($) + Insurance Payout - Cost of Insurance
[Figure: Income ($) by Year, in two panels: Without Insurance and With Insurance.]
Contract Design
$\text{Net Income}_i = \text{Yield}_i + \text{Insurance Payout}_i - \text{Cost of Insurance} = f(R_i) + \epsilon_i + g(r_i) - c$
$\text{Payout} = a \cdot (\text{rainfall}) + b$, $a < 0$
[Figure: Minimizing Variance. Panels: Payout ($) against Rainfall (mm); SD of Net Income ($) against Slope = a.]
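The variance-minimizing choice of the slope can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the slides: the yield function is taken to be linear in rainfall (`yield_per_mm` dollars per mm), the rainfall and noise values are made up, and the cost of insurance is assumed to equal the average payout. The names `sd_net_income`, `yield_per_mm`, and `slopes` are hypothetical.

```python
import random
import statistics

def sd_net_income(a, rainfall, noise, yield_per_mm=1.0, b=1000.0):
    """SD of net income under a linear payout a * rainfall + b.

    Assumptions (not from the slides): yield is linear in rainfall and
    the cost of insurance equals the average payout.
    """
    payouts = [a * r + b for r in rainfall]
    cost = statistics.fmean(payouts)
    net = [yield_per_mm * r + e + p - cost
           for r, e, p in zip(rainfall, noise, payouts)]
    # b shifts payout and cost equally, so it does not affect the SD.
    return statistics.pstdev(net)

rng = random.Random(0)
n_years = 500
rainfall = [rng.gammavariate(9.0, 100.0) for _ in range(n_years)]  # mm, made up
noise = [rng.gauss(0.0, 150.0) for _ in range(n_years)]            # $, made up

slopes = [-2.0 + 0.05 * k for k in range(41)]  # grid for a, from -2.0 to 0.0
sds = [sd_net_income(a, rainfall, noise) for a in slopes]
best_a = slopes[sds.index(min(sds))]
print(f"slope minimizing SD of net income: a = {best_a:.2f}")
```

Under these assumptions the payout offsets the rainfall-driven part of the yield, so the minimizing slope should land near `-yield_per_mm`, and the remaining spread comes from the noise term alone.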
The Data
[Figure: Daily Rainfall (mm), Lilongwe, Malawi, 1961-1965.]
For most data sets, we have between 5 and 45 years of daily rainfall data.
Wet Day Indicator: $X = \{X_{mtd}\} = \mathbf{1}\{Y_{mtd} > 0\}$, the indicator of a wet day on day $d$ of month $m$ in year $t$.
Rainfall: $Y = \{Y_{mtd}\}$ denotes the amount of rainfall on day $d$ of month $m$ in year $t$.
We fit a model whose likelihood factors into two parts: the likelihood of the frequency of wet days, $X$, and the likelihood of the intensity of rainfall on wet days, $Y$:
$P(X, Y \mid \theta) = P(X \mid \theta_1)\, P(Y \mid \theta_2)$.
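The factorization means the two parts of the model see different slices of the same record: the frequency part uses only the wet/dry sequence, and the intensity part uses only the amounts on wet days. A minimal sketch of that split follows; the function name `split_rainfall` and the sample values are hypothetical.

```python
def split_rainfall(daily_mm):
    """Split a daily rainfall record into the two pieces the model uses."""
    wet = [1 if y > 0 else 0 for y in daily_mm]   # X: wet-day indicator
    amounts = [y for y in daily_mm if y > 0]      # Y on wet days only
    return wet, amounts

daily_mm = [0.0, 3.2, 0.0, 0.0, 12.5, 0.4, 0.0]   # made-up values, in mm
wet, amounts = split_rainfall(daily_mm)
print(wet)      # [0, 1, 0, 0, 1, 1, 0]
print(amounts)  # [3.2, 12.5, 0.4]
```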
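The model for rainfall intensity ($Y$)
$Y_{mtd} \mid X_{mtd} = 1 \sim \text{Gamma}(\alpha_{mt}, \beta_{mt})$,
$\begin{pmatrix} \log(\alpha_{mt}) \\ \log(\beta_{mt}) \end{pmatrix} \sim N(\mu_m, \Sigma_m)$,
$p(\mu_m, \Sigma_m) \propto |\Sigma_m|^{-3/2}$,
for $t = 1, \ldots, 45$, and $d = 1, \ldots, n_d$.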
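To make the hierarchy concrete, the sketch below draws one month-year pair $(\log \alpha_{mt}, \log \beta_{mt})$ from a bivariate normal and then draws wet-day amounts from the resulting gamma distribution. It is a forward simulation only, not the fitting procedure. Two things are assumptions: $\beta_{mt}$ is treated as a rate parameter (the slides do not say whether it is a rate or a scale), and the values of `mu` and `sigma` are made up. The function names are hypothetical.

```python
import math
import random

def draw_log_params(mu, sigma, rng):
    """Draw (log alpha, log beta) from a bivariate normal via a 2x2 Cholesky factor."""
    (s11, s12), (_, s22) = sigma
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 ** 2)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2

def simulate_month(mu, sigma, n_wet_days, rng):
    """Simulate rainfall amounts for the wet days of one month in one year."""
    log_alpha, log_beta = draw_log_params(mu, sigma, rng)
    alpha, beta = math.exp(log_alpha), math.exp(log_beta)
    # random.gammavariate takes (shape, scale); beta is assumed to be a rate,
    # so the scale passed in is 1 / beta.
    amounts = [rng.gammavariate(alpha, 1.0 / beta) for _ in range(n_wet_days)]
    return alpha, beta, amounts

rng = random.Random(42)
mu = (math.log(0.8), math.log(0.1))     # made up: alpha / beta = 8 mm at the mean of the logs
sigma = ((0.04, 0.01), (0.01, 0.04))    # made-up covariance matrix
alpha, beta, amounts = simulate_month(mu, sigma, n_wet_days=12, rng=rng)
print(f"alpha={alpha:.3f}, beta={beta:.3f}, "
      f"mean amount={sum(amounts) / len(amounts):.1f} mm")
```

Working on the log scale keeps both gamma parameters positive, and the off-diagonal entry of `sigma` lets the two parameters move together from year to year.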
Illustration
[Figure: Rainfall Intensity by Month (mm), JAN through DEC.]
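The model for rainfall frequency ($X$)
$X_{mtd} \mid X_{mt(d-1)}$ follows a two-state Markov chain with transition matrix
$P_{mt} = \begin{pmatrix} p_{1mt} & 1 - p_{1mt} \\ 1 - p_{2mt} & p_{2mt} \end{pmatrix}$,
$p_{jmt} \sim \text{Beta}(\mu_{jm}, s_{jm})$,
$\mu_{jm} \sim U(0, 1)$,
$s_{jm} \sim U(0, \infty)$,
for $d = 2, \ldots, n_d$, $m = 1, \ldots, 12$, $t = 1, \ldots, 45$, and $j = 1, 2$.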
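A forward simulation of the wet-day chain might look like the sketch below. Several choices here are assumptions rather than facts from the slides: state 1 is taken to be "dry" and state 2 "wet", and $\text{Beta}(\mu, s)$ is read as a beta distribution with mean $\mu$ and concentration $s$, that is, shape parameters $\mu s$ and $(1 - \mu) s$. The function names and numeric values are hypothetical.

```python
import random

def beta_from_mean(mu, s, rng):
    """Draw from a beta distribution with mean mu and concentration s (assumed parametrization)."""
    return rng.betavariate(mu * s, (1.0 - mu) * s)

def simulate_wet_days(p_stay_dry, p_stay_wet, n_days, rng, first=0):
    """Simulate a 0/1 wet-day sequence from a two-state first-order Markov chain."""
    x = [first]
    for _ in range(n_days - 1):
        stay = p_stay_wet if x[-1] == 1 else p_stay_dry
        # Keep yesterday's state with probability `stay`, otherwise switch.
        x.append(x[-1] if rng.random() < stay else 1 - x[-1])
    return x

rng = random.Random(7)
p_stay_dry = beta_from_mean(0.7, 20.0, rng)   # made-up hyperparameters
p_stay_wet = beta_from_mean(0.6, 20.0, rng)
x = simulate_wet_days(p_stay_dry, p_stay_wet, n_days=31, rng=rng)
print(x)
```

Because each month-year pair gets its own pair of persistence probabilities, wet and dry spells can be longer in some years than in others.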
Fit the model www.r-project.org
Model Checking - Posterior Predictive Checks
1. 1 data set = 45 years of daily rainfall. Measure $\hat\theta_{\text{data}}$. Example: $\hat\theta_{\text{data}}$ = SD of annual rainfall.
2. Simulate 500 copies of the data (45 years of daily rainfall each). Measure $\hat\theta_{\text{rep}}$ for each copy, rep = 1, ..., 500.
3. Plot the distribution of $\hat\theta_{\text{rep}}$ (a histogram), then plot the observed value $\hat\theta_{\text{data}}$ and see whether it falls within the distribution of simulated statistics.
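The three steps above can be sketched in a few lines. The simulator here is a deliberately simple stand-in (independent wet days with gamma amounts and fixed, made-up parameters), not the model from these slides; a real check would draw parameters from the fitted posterior for each replicate. The "observed" record is also simulated as a placeholder, and each year is shortened to 30 days to keep the run fast. All function names are hypothetical.

```python
import random
import statistics

def simulate_years(n_years, n_days, p_wet, shape, scale, rng):
    """Stand-in simulator: returns one rainfall total per simulated year."""
    return [
        sum(rng.gammavariate(shape, scale)
            for _ in range(n_days) if rng.random() < p_wet)
        for _ in range(n_years)
    ]

def ppc_pvalue(theta_data, theta_reps):
    """Share of replicated statistics at least as large as the observed one."""
    return sum(t >= theta_data for t in theta_reps) / len(theta_reps)

rng = random.Random(1)

# Step 1: the statistic on the data (a simulated placeholder for a real record).
observed = simulate_years(45, 30, 0.4, 0.8, 10.0, rng)
theta_data = statistics.stdev(observed)

# Step 2: the same statistic on 500 replicated data sets.
theta_reps = [
    statistics.stdev(simulate_years(45, 30, 0.4, 0.8, 10.0, rng))
    for _ in range(500)
]

# Step 3: locate the observed value within the replicated distribution.
p = ppc_pvalue(theta_data, theta_reps)
print(f"theta_data = {theta_data:.1f}, tail share = {p:.3f}")
```

A tail share close to 0 or 1 would suggest that the model does not reproduce this feature of the data.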
Posterior Predictive Check
Could this model have produced this data?
[Figure: Interannual Variability, histogram with horizontal axis in mm.]
Interpreting Posterior Predictive Checks
Interpret it like a p-value: even when the model is correct, about 1 out of 20 values of $\hat\theta_{\text{data}}$ will lie in the 5% tail of the distribution of $\hat\theta_{\text{rep}}$ just by random chance.
When a bad result is good: ($\hat\theta_{\text{data}} = Y_{mt1} - Y_{(m-1)t(31)}$)
This model exaggerates the difference in rainfall between the last day of one month and the first day of the next month.
[Figure: twelve histograms, one per month (AUG through JUL), of mean.last.sim[, i]; vertical axis: Frequency.]
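The Frequency Domain
The rainfall-generating process is smooth through time:
$Y_t \sim \text{Gamma}(\alpha_t, \beta_t)$, for $t = 1, \ldots, (45 \times 365)$,
$\alpha_t \sim N\left(X_t \eta_\alpha + a \sin\left(2\pi \tfrac{1}{365} t\right) + b \cos\left(2\pi \tfrac{1}{365} t\right), \sigma^2_\alpha\right)$,
$\beta_t \sim N\left(X_t \eta_\beta + c \sin\left(2\pi \tfrac{1}{365} t\right) + d \cos\left(2\pi \tfrac{1}{365} t\right), \sigma^2_\beta\right)$.
Non-informative priors on $\eta_\alpha$, $\eta_\beta$, $a$, $b$, $c$, $d$.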
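The sine and cosine terms give the mean of each parameter a smooth annual cycle instead of a separate level per month. The sketch below evaluates only that mean function. The covariate term $X_t \eta_\alpha$ is collapsed into a single scalar `intercept` as a simplifying assumption, and the coefficient values are made up; `seasonal_mean` is a hypothetical name.

```python
import math

def seasonal_mean(t, intercept, a, b, period=365):
    """Mean function: intercept + a*sin(2*pi*t/period) + b*cos(2*pi*t/period)."""
    angle = 2.0 * math.pi * t / period
    return intercept + a * math.sin(angle) + b * math.cos(angle)

intercept, a, b = 1.0, 0.3, 0.4           # made-up values
means = [seasonal_mean(t, intercept, a, b) for t in range(1, 366)]
amplitude = math.hypot(a, b)              # size of the swing around the intercept

print(f"amplitude = {amplitude:.2f}")
print(f"range over one year: {min(means):.3f} to {max(means):.3f}")
```

The pair $(a, b)$ jointly sets the amplitude $\sqrt{a^2 + b^2}$ and the timing of the seasonal peak, and the mean repeats exactly every 365 days, so day $t$ and day $t + 365$ share the same value.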
Thanks: Dan Osgood (IRI) & colleagues, Andy Robertson (IRI)
Thanks.
www.stat.columbia.edu/~shirley