Uncertainties in modelling. Luca Vezzaro (luve@env.dtu.dk). Modelling and Control of Environmental Systems, Padova, 14th January 2015.
Introduction: Where are the uncertainties? Why worry about them? Theoretical classification of uncertainty: How can it be described? Uncertainty assessment: How do we consider uncertainty in modelling? Small examples: How can we use uncertainty?
Wrong (?) model prediction, 12 October 2014: the warning was not sent because the model did not predict the flood, a model that had never been wrong in the past.
13-14 October 2012: high resolution, 24-48 hours in advance! [Figure: 14 October 2012.] From http://www.climalteranti.it/2013/03/25/la-gestione-deirischi-in-un-clima-mutato-parte-iii-le-criticita/
How weather prediction models work (from BBC Two, The Code). [Figure: an ensemble of forecasts from the initial state, shown after 1 day, 3 days and 1 week.]
Uncertainty affects all mathematical models. [Figure: models ordered by complexity, from the force equation (INPUT, MODEL, OUTPUT; $F = m \cdot a$) to a weather forecast (INPUT, chained SUBMODELs, OUTPUT).]
Let's go to the carnival. What time should we leave Padua? [Figure: train.]
Padova-Venice transport model: $t = x / v$. What about stops on the way? $t = x / v + t_{stop} \cdot n_{stops}$, where $x$ and $n_{stops}$ are fixed, while $v$ and $t_{stop}$ can vary.
Padova-Venice transport model: $t = x / v + t_{stop} \cdot n_{stops}$, with $x = 40$ km and $n_{stops} = 6$; $v = ?$, $t_{stop} = ?$ What are the values of the model parameters? Observations: $t_{obs,1} = 50$ min, $t_{obs,2} = 55$ min.
Let's guess the parameter values ($x = 40$ km, $n_{stops} = 6$, $t_{obs,1} = 50$ min, $t_{obs,2} = 55$ min):
v [km/h]   t_stop [min]   t [min]
50         2              60.0
55         2              55.6
65         2              48.9
50         1              54.0
55         1              49.6
50         1.5            57.0
55         1.5            52.6
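The guessing exercise can be sketched in a few lines of Python. This is only an illustration: the function name `travel_time` and the "distance to the closest observation" misfit are assumptions, and the distance is set to x = 40 km, the value that reproduces the tabulated travel times.

```python
def travel_time(v_kmh, t_stop_min, x_km=40.0, n_stops=6):
    """Travel time in minutes: t = x / v + t_stop * n_stops."""
    return 60.0 * x_km / v_kmh + t_stop_min * n_stops


# Parameter guesses from the table (v in km/h, t_stop in min).
guesses = [(50, 2), (55, 2), (65, 2), (50, 1), (55, 1), (50, 1.5), (55, 1.5)]
observed = [50.0, 55.0]  # t_obs,1 and t_obs,2 in minutes

table = []
for v, t_stop in guesses:
    t = travel_time(v, t_stop)
    # Distance to the closest observation: a crude, assumed measure of fit.
    misfit = min(abs(t - obs) for obs in observed)
    table.append((v, t_stop, round(t, 1), round(misfit, 1)))
    print(f"v={v} km/h  t_stop={t_stop} min  t={t:.1f} min  misfit={misfit:.1f} min")
```

Several different (v, t_stop) pairs land close to the observations, which is exactly why guessing alone cannot tell which parameter matters most.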
Which parameter should we spend time on? Which ones are really important? Is there something more important to look at?
How models are applied. [Figure: in reality, the system under study becomes an optimized system by experimenting; in virtual reality, a model of the system gives a solution for the system by simulating. Labels on the links: MEASUREMENTS, WICKED PROBLEM, PARAMETERS, NUMERICAL ERRORS.]
Why is it important to assess uncertainty? Uncertainty can be used as a wild card: to argue for precaution (or for not acting at all), for or against global warming, or to justify actions (like invading a country). "As we know, there are known knowns. There are things we know we know. We also know there are known unknowns. That is to say we know there are some things we do not know. But there are also unknown unknowns, the ones we don't know we don't know." (Donald Rumsfeld, former U.S. Secretary of Defense)
Can we trust models? Yes, but we need to address uncertainty! "All models are wrong, but some are useful" (Box & Draper, 1987).
Theoretical classification of uncertainty: How can it be described?
What is uncertainty? What is often referred to as "uncertainty" actually hides important technical distinctions (EEA, 2001). Uncertainty is any departure from the unachievable ideal of complete determinism. It is a three-dimensional concept (e.g. Walker et al., 2003): location, level and nature.
Location of uncertainty (1st dimension). Generic locations: context (e.g. boundaries, framing), model structure, input data, calibration data, parameter uncertainty, model output (conclusion).
Location of uncertainty: context vs. model uncertainty. Context uncertainty is a wrong definition of the system: the wrong question was answered!
Let's go to the carnival. [Figure: train.] The carnival is in Verona.
Wetland modelling: you develop a model that is very good at predicting oxygen and ammonia during rain events, but you are interested in nutrient loads on a yearly basis.
Location of uncertainty: context vs. model uncertainty, input data. Model structure is the description of the internal relationships dominating the system. Model structure uncertainty: there are competing models describing the system. There is not a single way to model your system.
Let's go to the carnival: bus, regional train or high-speed train? There are many ways to get to Venice.
Uncertainty in environmental models: location. The location of uncertainty can be the model structure, i.e. the relation between the output of the model, the input $u$ and the state variable $x$. E.g. removal of stormwater pollutants: two competing expressions for $dL/dt$, both built from $K_a$, $K_{rem}$, $C_w$ and $L$, one using $P$ and $m$ and the other using $V$. Is removal proportional to intensity or to runoff volume?
Wetland modelling: conceptual models (tanks in series)? Hydrodynamic models (Saint-Venant)? Stochastic models? ...
Uncertainty in environmental models: location. Model parameters: parameter values can vary within wide ranges, e.g. the sorption capacity for benzene (soil/water coefficient $k_d$). Reported values [L/kg]: 81, 85, 370, 320, 280, 100, 92, 59, 38, 54, 44, 38, 31, 79, 50, 4, 31, 614, 337, 95, 89, 83, 85, 54, 26, 13, 22. Source: Hølten-Lützhøf et al., 2008.
Location of uncertainty: data and parameter uncertainty. Measurements of global sea temperature (source: Zecca and Chiari, 2009): no model is able to simulate this peak, because during WWII the measurements were biased.
Location of uncertainty: data and parameter uncertainty. Rainfall-runoff model. [Figure labels: delay in the model's response; simulated but not measured; measured but not simulated.]
Location of uncertainty: data and parameter uncertainty. Input uncertainty. [Figure: the modelled catchment and the rain gauge. Source: Google Earth, 2010.]
Level of uncertainty (2nd dimension): statistical uncertainty, scenario uncertainty, recognised ignorance, total ignorance.
Statistical uncertainty: there exist solid grounds for the assignment of a discrete probability to each of a well-defined set of outcomes. We have a well-known functional relationship and an adequate combination of (i) number of parameters and (ii) amount and character of data. [Figure: probability density with μ = 0 and σ = 0.5.]
Scenario uncertainty: we can describe a set of outcomes to be expected, but we cannot associate probabilities with them very well. Assumptions; various plausible scenarios; unverified "what if?" questions; ambiguous results. [Figure: same axes, with the distribution unknown (???).]
Statistical and scenario uncertainty. [Figure: variable X against time, showing an estimation with its confidence interval and confidence level, and Scenarios 1 to 3 over the planning horizon.] Scenarios don't need to be related to time (they can relate to other variables). People usually don't talk about the uncertainties in the scenarios.
Ignorance: we do not know the essential functional relationships. There exist neither grounds for the assignment of probabilities, nor even the basis for the definition of a complete set of outcomes. More information may become known later through research, but little is known for the time being. Recognized ignorance: we know that we don't know! Total ignorance: we don't know what we don't know.
Level of uncertainty:
Level                     Uncertainty   Probability
Statistical uncertainty   known         known
Scenario uncertainty      known         unknown
Recognised ignorance      unknown       unknown
Total ignorance
That's where we are working so far.
The nature of uncertainty (3rd dimension). Stochastic uncertainty: variability and inherent indeterminism in the underlying natural phenomena. You cannot avoid it. Epistemic uncertainty: imperfection of our knowledge of the system. It can be reduced, but... Ambiguity: no matter how much energy you spend, you will find contrasting (but equally good) results.
Three dimensions of uncertainty. Location of uncertainty: context, model structure, input data, parameters. Level of uncertainty: statistical uncertainty, scenario uncertainty, recognized ignorance, total ignorance. Nature of uncertainty: can it be reduced or not (epistemic or aleatoric)?
How to classify uncertainty: location, level, nature. [Figure from Warmink et al., 2010.]
Uncertainty matrix: try to identify the uncertainties in a model. Rows (location of uncertainty): context, model structure, input data, parameters. Columns: level of uncertainty (statistical, scenario, ignorance) and nature (reducible, not reducible).
Uncertainty assessment: How do we consider uncertainty in modelling?
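Uncertainty in environmental models. General formulation of a dynamic model $m_k$: $O(x,t) = m_k(\theta, u, x, t)$, with $O$ the observed value, $t$ time, $x$ space, $u$ the input and $\theta$ the parameters. But we are living in the real world (which is not perfect): $O(x,t) + \varepsilon_O(x,t) = m_k(\theta, \varepsilon_\theta, u, \varepsilon_u, x, t) + \varepsilon_m(\theta, \varepsilon_\theta, u, \varepsilon_u, x, t) + \varepsilon_r$, where $\varepsilon_O$ is the observation error, $\varepsilon_\theta$ the parameter error, $\varepsilon_u$ the input error, $\varepsilon_m$ the model error and $\varepsilon_r$ the random error.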
The big question: which model factors are affecting my output? For $y = f(x(t), \theta)$ the factors are the inputs (e.g. rain), the system attributes and the model parameters. Sensitivity analysis: to find what is relevant. Uncertainty analysis: to quantify how uncertain.
Addressing the model uncertainties. Without observations to compare with: Sensitivity Analysis (SA), identification of sources of uncertainty. With observations to compare with: Uncertainty Analysis (UA), quantification of uncertainty.
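Padova-Venice transport model: sensitivity index $S_i = \frac{\Delta y / y}{\Delta \theta_i / \theta_i}$ (magnitudes reported below).
v [km/h]   t_stop [min]   t [min]
50         2              60.0
55         2              55.6
S_v = 0.72
v [km/h]   t_stop [min]   t [min]
50         2              60.0
50         1              54.0
S_tstop = 0.11 (taking t_stop = 1 min as the base case)
v [km/h]   t_stop [min]   t [min]
100        10             84.0
100        15             114.0
S_tstop = 0.71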
Sensitivity analysis: One-At-a-Time method. Traditionally, One-At-a-Time (OAT) methods are used: I change the parameter by $\Delta\theta$, I run the model and obtain $\Delta y$, and I calculate the sensitivity index $S$. Not all the parameter space is explored! What about interactions? [Figure: around the point $(\theta_1, \theta_2)$, the parameter space explored by the steps $\Delta\theta_1$ and $\Delta\theta_2$ compared with the actual parameter space.]
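The OAT procedure can be sketched as follows for the Padova-Venice model. The relative form of the index, (Δy / y) / (Δθ / θ), is an assumption: it is the form that reproduces the magnitudes of about 0.72, 0.11 and 0.71 quoted for the transport example. The helper name `oat_sensitivity` is hypothetical, and x = 40 km and 6 stops are taken from the transport model slide.

```python
def travel_time(theta, x_km=40.0, n_stops=6):
    """Travel time in minutes for a parameter dict with keys 'v' and 't_stop'."""
    return 60.0 * x_km / theta["v"] + theta["t_stop"] * n_stops


def oat_sensitivity(model, theta, name, new_value):
    """Relative OAT index (dy / y) / (dtheta / theta) at the base point theta."""
    y0 = model(theta)
    perturbed = dict(theta, **{name: new_value})  # change one parameter only
    dy = model(perturbed) - y0
    dtheta = new_value - theta[name]
    return (dy / y0) / (dtheta / theta[name])


s_v = oat_sensitivity(travel_time, {"v": 50, "t_stop": 2}, "v", 55)
s_stop_slow = oat_sensitivity(travel_time, {"v": 50, "t_stop": 1}, "t_stop", 2)
s_stop_fast = oat_sensitivity(travel_time, {"v": 100, "t_stop": 10}, "t_stop", 15)
print(f"S_v = {s_v:.2f}, S_tstop = {s_stop_slow:.2f} (slow train), {s_stop_fast:.2f} (fast train)")
```

The index for t_stop changes a lot between the two base points, which illustrates why a single OAT experiment says little about the rest of the parameter space. The index for v is negative because a faster train shortens the trip.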
Global Sensitivity Analysis (GSA): some examples. Variance decomposition identifies the direct and indirect influence of each factor: the output variance $\sigma^2$ is split into the variance due to the variation of each parameter $\theta_i$, the variance due to the variation of the input and (?) the variance due to model structural uncertainty. Very good overview of the model behaviour. Heavy computational requirements. Can also be applied to models with many submodels.
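Sobol sensitivity indices. First-order index: $S_i = \sigma^2_{\theta_i} / \sigma^2_y$, the variance due to the variation of parameter $\theta_i$ divided by the output variance. Total sensitivity index: $S_{T,i} = 1 - \sigma^2_{\sim\theta_i} / \sigma^2_y$, where $\sigma^2_{\sim\theta_i}$ is the variance due to the variation of all parameters except $\theta_i$. [Figure: three parameter samples, each giving a variance $\sigma^2$: a base sample, a sample where only $\theta_1$ changes, and a sample where all parameters except $\theta_1$ change.]
Base sample    Only θ1 changed    All except θ1 changed
θ1 θ2 θ3       θ1 θ2 θ3           θ1 θ2 θ3
6  8  9        1  8  9            6  3  7
8  1  7        0  1  7            8  2  5
7  8  5        5  8  5            7  4  9
6  3  1        8  3  1            6  2  3
6  4  6        2  4  6            6  1  3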
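A minimal Monte Carlo sketch of the two indices, applied to the Padova-Venice model, is given below. The uniform ranges for v and t_stop, the sample size and the seed are illustrative assumptions. The estimators are the commonly used Saltelli (first-order) and Jansen (total) forms, not necessarily the ones used to produce the slides.

```python
import random
import statistics


def travel_time(v, t_stop, x_km=40.0, n_stops=6):
    return 60.0 * x_km / v + t_stop * n_stops


# Assumed uniform ranges (illustrative only): v in km/h, t_stop in min.
RANGES = [(50.0, 65.0), (1.0, 2.0)]


def sample(rng, n):
    return [[rng.uniform(lo, hi) for lo, hi in RANGES] for _ in range(n)]


def sobol_indices(model, n=20000, seed=1):
    rng = random.Random(seed)
    a, b = sample(rng, n), sample(rng, n)  # two independent parameter samples
    y_a = [model(*row) for row in a]
    y_b = [model(*row) for row in b]
    mean = statistics.fmean(y_a + y_b)
    var = statistics.fmean((y - mean) ** 2 for y in y_a + y_b)  # output variance
    first, total = [], []
    for i in range(len(RANGES)):
        # Sample A with column i taken from B: only parameter i changes.
        y_ab = [model(*(ra[:i] + [rb[i]] + ra[i + 1:])) for ra, rb in zip(a, b)]
        # First-order variance (Saltelli form); outputs are centred to reduce noise.
        v_i = statistics.fmean((yb - mean) * (yab - ya) for ya, yb, yab in zip(y_a, y_b, y_ab))
        # Total variance (Jansen form).
        vt_i = 0.5 * statistics.fmean((ya - yab) ** 2 for ya, yab in zip(y_a, y_ab))
        first.append(v_i / var)
        total.append(vt_i / var)
    return first, total


first, total = sobol_indices(travel_time)
print("first-order:", [round(s, 2) for s in first])
print("total:      ", [round(s, 2) for s in total])
```

Because this model is a sum of one term in v and one term in t_stop, there are no interactions, so the first-order and total indices should nearly coincide and the first-order indices should add up to about 1.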
Global Sensitivity Analysis (GSA) if thousands of simulations are too many: the Morris method. The OAT method is applied at several points of the parameter space (trajectories), which gives many first-order sensitivity indices for each parameter, summarised by their average $\mu$ and variance $\sigma^2$. This gives information about sensitive factors and about factors with interactions or non-linear effects. Limited computational requirements: for $k$ parameters, $(k+1)$ times the size of the parameter sample (a couple of dozen) simulations. [Figure: Trajectory 1 and Trajectory 2, with steps $\Delta\theta_1$, $\Delta\theta_2$, $\Delta\theta_3$ in the ($\theta_1$, $\theta_2$, $\theta_3$) space. Source: Pujol, 2009.]
Morris method: results analysis. Information about sensitive factors and about factors with interactions or non-linear effects.
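The idea of repeating OAT steps along several trajectories can be sketched as below. This is a simplified variant, not the original Morris design: the start points are drawn at random in the unit hypercube instead of on a grid of levels, all steps are positive, and the summary uses the mean of absolute effects (often written μ*) together with the standard deviation σ. The toy model and all names are hypothetical.

```python
import random
import statistics


def morris(model, k, n_traj=20, delta=0.25, seed=1):
    """Simplified Morris screening on the unit hypercube [0, 1]^k."""
    rng = random.Random(seed)
    effects = [[] for _ in range(k)]
    runs = 0
    for _ in range(n_traj):
        # Random start point, leaving room for one +delta step per parameter.
        x = [rng.uniform(0.0, 1.0 - delta) for _ in range(k)]
        y = model(x)
        runs += 1
        for i in rng.sample(range(k), k):  # move one parameter at a time, random order
            x[i] += delta
            y_new = model(x)
            runs += 1
            effects[i].append((y_new - y) / delta)  # elementary effect of parameter i
            y = y_new
    mu_star = [statistics.fmean(abs(e) for e in ee) for ee in effects]
    sigma = [statistics.pstdev(ee) for ee in effects]
    return mu_star, sigma, runs


def toy(x):
    # Hypothetical model: x[0] acts linearly, x[1] and x[2] interact, x[3] has no effect.
    return 2.0 * x[0] + x[1] * x[2] + 0.0 * x[3]


mu_star, sigma, runs = morris(toy, k=4)
for i, (m, s) in enumerate(zip(mu_star, sigma)):
    print(f"theta_{i + 1}: mu* = {m:.2f}, sigma = {s:.2f}")
print("model runs:", runs)
```

A large μ* with σ near zero points to a sensitive factor with a linear effect, a non-zero σ points to interactions or non-linearity, and both near zero points to a factor that can be ignored. The cost is (k + 1) runs per trajectory.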
Regression approaches: run the model at different points of the parameter space, assume a linear response and estimate the regression parameters $\alpha$, $\beta$, $\gamma$. The regression parameters are proportional to the sensitivity indices.
Example of GSA: a stormwater quality model. 9 model factors: 4 parameters for the hydraulic submodel, 3 parameters for the quality submodel and 2 input error factors. Influence on two outputs: mass discharged and concentration at the outlet. Sample size = 10000 parameter sets.
GSA results: 1st-order sensitivity indices (what you get from OAT). Deposition rate and wash-off rate are relevant; all the other factors are not. With OAT I would stop here.
GSA results: look at the interactions. Two correlated parameters. The hydraulic submodel affects the quality estimation.
Sensitivity analysis step by step: 1. Define a prior distribution for each model parameter. 2. Generate N parameter sets. 3. Run the model for the N parameter sets (model outputs out_i). 4. Evaluate the performance of the N simulations (e.g. RMSE of out_i). 5. Analyze the results.
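The five steps can be sketched with the Padova-Venice model and its two observed travel times. The uniform prior ranges, the number of sets and the seed are assumptions made only for this illustration.

```python
import math
import random


def travel_time(v, t_stop, x_km=40.0, n_stops=6):
    return 60.0 * x_km / v + t_stop * n_stops


observed = [50.0, 55.0]  # observed travel times in minutes


def rmse(simulated, observations):
    return math.sqrt(sum((simulated - o) ** 2 for o in observations) / len(observations))


rng = random.Random(0)
n_sets = 1000

# Steps 1-2: assumed uniform priors, N random parameter sets (v in km/h, t_stop in min).
parameter_sets = [(rng.uniform(40.0, 70.0), rng.uniform(0.5, 3.0)) for _ in range(n_sets)]

# Steps 3-4: run the model for each set and score each run with the RMSE.
results = [(rmse(travel_time(v, ts), observed), v, ts) for v, ts in parameter_sets]

# Step 5: analyse, here simply by ranking the runs from best to worst.
results.sort()
best_rmse, best_v, best_ts = results[0]
print(f"best run: v = {best_v:.1f} km/h, t_stop = {best_ts:.2f} min, RMSE = {best_rmse:.2f} min")
```

With one simulated value and two different observations, the RMSE cannot go below 2.5 min, so even the best run leaves a residual error.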
Identification of sources of uncertainty with NO measurements: Global Sensitivity Analysis. Which factors should I focus on?
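Uncertainty Analysis: how to estimate the term $\varepsilon$ in $O(x,t) + \varepsilon_O(x,t) = m_k(\theta, \varepsilon_\theta, u, \varepsilon_u, x, t) + \varepsilon_m(\theta, \varepsilon_\theta, u, \varepsilon_u, x, t) + \varepsilon_r$, so that the result can be given as +/- XX%? Various approaches are available: Bayesian and pseudo-Bayesian approaches, grey-box modelling, etc.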
Grey Box method: the error is modelled as a stochastic term. New parameters, defining the error structure, are added to the model. The uncertainty of the model is given by the simulation with the parameters which give the best predictions.
Grey Box method. [Figure: point prediction with uncertainty bounds; parameters estimated through calibration.]
Statistical methods: Bayes' theorem. $P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$, where $P(A|B)$ is the posterior probability, $P(B|A)$ the conditional probability of B given A, $P(A)$ the prior probability of A and $P(B)$ the prior probability of B. Example from Wikipedia: suppose there is a school having 60% boys and 40% girls as students, so with A = "the student is a girl", $P(A) = 0.4$. The girl students wear trousers or skirts in equal numbers, so with B = "the student wears trousers", $P(B|A) = 0.5$; the boys all wear trousers, $P(B|A') = 1$. An observer sees a (random) student from a distance; all they can see is that this student is wearing trousers. What is the probability this student is a girl? $P(A|B) = 0.25$.
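The school example can be checked numerically. The variable names are illustrative; the only step not spelled out in the example is the total probability of seeing trousers, P(B) = P(B|A) P(A) + P(B|A') P(A').

```python
p_girl = 0.4                  # P(A): prior probability that the student is a girl
p_trousers_given_girl = 0.5   # P(B|A)
p_trousers_given_boy = 1.0    # P(B|A')

# P(B): total probability of seeing a student in trousers.
p_trousers = p_trousers_given_girl * p_girl + p_trousers_given_boy * (1.0 - p_girl)

# Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B).
p_girl_given_trousers = p_trousers_given_girl * p_girl / p_trousers
print(f"P(B) = {p_trousers:.2f}, P(A|B) = {p_girl_given_trousers:.2f}")
```

Seeing trousers lowers the probability of "girl" from the prior 0.4 to 0.25, because trousers are more typical of boys in this school.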
Statistical methods: Bayes' theorem translated into modelling terms. $P(\theta|y) = L(y|\theta)\,P(\theta) / K$, where $P(\theta)$ is the prior probability of the parameters $\theta$, $P(\theta|y)$ the posterior probability of the parameters $\theta$ given the observations $y$, $L(y|\theta)$ the likelihood function and $K$ a normalising constant. The likelihood is a measure of how well the model output fits the observations (e.g. based on the squared error).
Bayesian approaches, $P(\theta|y) = L(y|\theta)\,P(\theta) / K$: 1. Estimate the likelihood ($L_1$, $L_2$, $L_3$, ...). 2. Estimate the posterior distribution $P(\theta|y)$. 3. Use the posterior distribution to obtain uncertainty bounds.
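The three steps can be sketched on a grid for a single parameter of the Padova-Venice model, the speed v. Everything beyond the model and the two observations is an assumption made for illustration: t_stop is fixed at 1.5 min, the prior on v is uniform between 40 and 80 km/h, and the observation error is Gaussian with a standard deviation of 3 min.

```python
import math


def travel_time(v, t_stop=1.5, x_km=40.0, n_stops=6):
    return 60.0 * x_km / v + t_stop * n_stops


observed = [50.0, 55.0]  # observed travel times in minutes
sigma = 3.0              # assumed standard deviation of the observation error (min)

grid = [40.0 + 0.1 * i for i in range(401)]  # candidate speeds, 40 to 80 km/h
prior = [1.0 / len(grid)] * len(grid)        # assumed uniform prior P(theta)


def likelihood(v):
    """Step 1: Gaussian likelihood L(y|theta), up to a constant factor."""
    return math.prod(math.exp(-0.5 * ((obs - travel_time(v)) / sigma) ** 2) for obs in observed)


# Step 2: posterior P(theta|y) = L(y|theta) P(theta) / K.
unnormalised = [likelihood(v) * p for v, p in zip(grid, prior)]
k = sum(unnormalised)  # normalising constant K
posterior = [u / k for u in unnormalised]


def quantile(q):
    """Smallest grid value whose cumulative posterior probability reaches q."""
    cumulative = 0.0
    for v, p in zip(grid, posterior):
        cumulative += p
        if cumulative >= q:
            return v
    return grid[-1]


# Step 3: uncertainty bounds from the posterior distribution.
v_map = grid[posterior.index(max(posterior))]
lower, upper = quantile(0.025), quantile(0.975)
print(f"most probable v = {v_map:.1f} km/h, 95% interval = [{lower:.1f}, {upper:.1f}] km/h")
```

A grid works here only because there is a single parameter; with many parameters the posterior is usually explored by sampling instead.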
Statistical methods: the Generalized Likelihood Uncertainty Estimation (GLUE). A soft version of the Bayesian approach (pseudo-Bayesian). It weights the models according to the goodness of their predictions. Based on the equifinality principle: observations can be simulated by many different parameter sets with similar goodness.
Pseudo-Bayesian approaches (GLUE): 1. Estimate the informal likelihood ($L_1$, $L_2$, $L_3$, ...). 2. Find the behavioural parameter sets. 3. Use the behavioural sets to obtain uncertainty bounds. [Figure: $P(\theta|y)$ against $\theta$.]
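A minimal GLUE-style sketch for the Padova-Venice model follows. The informal likelihood (inverse of the mean squared error), the acceptance threshold (RMSE below 3 min), the prior ranges and the 5% and 95% weighted quantiles are all subjective choices made for this illustration, which is the kind of subjectivity GLUE is known for.

```python
import random


def travel_time(v, t_stop, x_km=40.0, n_stops=6):
    return 60.0 * x_km / v + t_stop * n_stops


observed = [50.0, 55.0]  # observed travel times in minutes


def informal_likelihood(simulated):
    """Step 1: inverse mean squared error (an assumed informal likelihood)."""
    mse = sum((simulated - o) ** 2 for o in observed) / len(observed)
    return 1.0 / mse


rng = random.Random(0)
runs = []
for _ in range(5000):
    v, t_stop = rng.uniform(40.0, 70.0), rng.uniform(0.5, 3.0)  # assumed prior ranges
    t = travel_time(v, t_stop)
    runs.append((informal_likelihood(t), t, v, t_stop))

# Step 2: behavioural sets are those above an assumed threshold (RMSE below 3 min).
threshold = 1.0 / 3.0 ** 2
behavioural = [r for r in runs if r[0] >= threshold]
total_weight = sum(r[0] for r in behavioural)


def weighted_quantile(q):
    """Step 3: likelihood-weighted quantile of the predicted travel time."""
    cumulative = 0.0
    for weight, t, _, _ in sorted(behavioural, key=lambda r: r[1]):
        cumulative += weight / total_weight
        if cumulative >= q:
            return t
    return max(r[1] for r in behavioural)


lower, upper = weighted_quantile(0.05), weighted_quantile(0.95)
speeds = [r[2] for r in behavioural]
print(f"{len(behavioural)} behavioural sets, v from {min(speeds):.1f} to {max(speeds):.1f} km/h")
print(f"prediction bounds: {lower:.1f} to {upper:.1f} min")
```

The behavioural sets typically cover a wide range of speeds, each compensated by a different stop time. This is equifinality in practice: very different parameter sets reproduce the observations about equally well.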
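Informal likelihoods: how do I evaluate my model performance? $L(M(\theta_i) \mid Y_T, Z_T) = \left(1/\sigma_i^2\right)^N$; $L(M(\theta_i) \mid Y_T, Z_T) = \left(1 - \sigma_i^2/\sigma_{obs}^2\right)^N$; $L(M(\theta_i) \mid Y_T, Z_T) = \exp\left(-N \sum_{j=1}^{N_{obs}} \left(M_i(t_j) - Z_T(t_j)\right)^2\right)$. All are based on the squared error: $\sigma_i^2 = \frac{1}{N_{obs}} \sum_{j=1}^{N_{obs}} \left(M_i(t_j) - Z_T(t_j)\right)^2$. New ways of considering observation errors.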
Statistical methods: Bayes vs. GLUE. The two approaches are equally complex. Bayes is more mathematically elegant and safe. GLUE is more relaxed (fewer assumptions), but its results are subjective. Mathematicians are still fighting, so we still have to see who will win. E.g. four methods gave similar uncertainty bounds (from Dotto et al., 2012).
Comparison of the approaches: how do I choose my method? An example of comparison of different approaches (from Dotto et al., 2012). Values are listed in the order GLUE / SCEM-UA / AMALGAM / BAYES.
Coverage of observations (flow module): + / + / + / +
Computational requirements (flow module): + / ++ / + / +
Coverage of observations (TSS module): + / + / + / +
Computational requirements (TSS module): - / ++ / + / -
Ability to identify ill-posed models (correlated parameters, poor efficiency and low coverage): + / + / + / +
Ability to identify sensitive parameters: + / + / + / +
Availability: software package / code for use in MATLAB and OCTAVE / software package / software package
Required programming skills: min / min / max / max
Limitations: subjectivity of acceptance threshold / subjectivity of acceptance threshold / subjectivity of acceptance threshold / knowledge about error distribution
Simple recipe for UA: 1. Choose the parameters that are uncertain. 2. Choose a distribution for these parameters (e.g. uniform, triangular, Gaussian). 3. Generate a sample of parameters. 4. Choose an (in)formal likelihood measure for evaluating the parameters. 5. Run the model with the generated sample. Then, with Bayes: estimate the parameter posterior distribution; with GLUE: select the parameter sets with the best performance.
What do the results of UA look like? Confidence bounds. Bayes: confidence bounds (e.g. 95% probability that your results lie within the bands). GLUE: prediction bounds (e.g. the best parameters give predictions within the interval).
What do the results of UA look like? Information about parameter intervals. Flat distribution = non-sensitive parameter; peaked distribution = sensitive parameter; correlation between parameters; parameter intervals (which you can use afterwards for predictions).
Small examples: How can we use uncertainty?
Uncertainty in decision making: everyday picnic planning in Denmark. [Figure: forecasts for the next 48 hours and for 10-15 days.]
Uncertainty in decision making: everyday picnic planning in Denmark. [Figure: forecasts for the next 48 hours and for 10-15 days, labelled "highly uncertain rain" and "high probability of rain".]
Model for stormwater treatment: identification of the most important parameters. Multi-CSTR model (can simulate different units) with over 12 parameters (related and not related to PP). What are the major sources of uncertainty? Analysis performed on 2 systems with different removal (pond and biofilter) and 6 substances with different environmental fate (sorbing, biodegradable, volatile, no dominant processes): heavy metals (Cu, Zn), pyrene, glyphosate, benzene and IPBC.
Model for stormwater treatment: identification of the most important parameters. Pond: the most important factors are related to TSS (settling/resuspension). Biofilter: the majority of the important factors are related to the hydraulic submodel (evapotranspiration). There is no need to spend a lot of money on measuring micropollutants!
Discharged MP loads: relative reduction. [Figure: copper, zinc, fluoranthene.]
Compliance with water quality criteria: the ELV is exceeded between XX and YY times per year. [Figure: EMC against return period.]
Uncertainty in control of urban drainage networks: didactical example. [Figure: detention basins and treatment plant.]
Example: Real Time Control. Objective: maximize storage. [Figure: West Town and East Town.]
Example: Model Predictive Control. Objective: maximize future storage, based on the model forecast (without uncertainty). [Figure: West Town and East Town.]
Example: Risk-based Model Predictive Control? Rainfall evolution is uncertain.
Example: Risk-based Model Predictive Control. Objective: minimize CSO risk. [Figure: risk of overflow and targets for West Town and East Town, if we do not consider uncertainty and if we consider uncertainty.]
Conclusion. Models are an essential tool for managing environmental systems. But they are all wrong (though some are useful)! Acknowledging the models' limits, failures and uncertainties is essential. Managing model uncertainties is a complex issue, which a lot of people struggle to understand. When you present your model result, don't give a number, give an interval!
Thank you for listening! For any questions: luve@env.dtu.dk