Basic Verification Concepts


Barbara Brown, National Center for Atmospheric Research, Boulder, Colorado, USA (bgb@ucar.edu)

Basic concepts - outline
- What is verification?
- Why verify?
- Identifying verification goals
- Forecast goodness
- Designing a verification study
- Types of forecasts and observations
- Matching forecasts and observations
- Statistical basis for verification
- Comparison and inference
- Verification attributes
- Miscellaneous issues

What is verification?
Verify: 1: to confirm or substantiate in law by oath; 2: to establish the truth, accuracy, or reality of <verify the claim> (synonym: see CONFIRM)
- Verification is the process of comparing forecasts to relevant observations
- Verification is one aspect of measuring forecast goodness
- Verification measures the quality of forecasts (as opposed to their value)

Why verify?
Purposes for verification (traditional definition):
- Administrative
- Scientific
- Economic

Why verify?
- Administrative purpose: monitoring performance; choice of model or model configuration (has the model improved?)
- Scientific purpose: identifying and correcting model flaws; forecast improvement
- Economic purpose: improved decision making; feeding decision models or decision support systems

Why verify?
What are some other reasons to verify hydrometeorological forecasts?
- Help operational forecasters understand model biases and select models for use in different conditions
- Help users interpret forecasts (e.g., what does a temperature forecast of 0 degrees really mean?)
- Identify forecast weaknesses, strengths, and differences

Identifying verification goals
What questions do we want to answer? Examples:
- In what locations does the model have the best performance?
- Are there regimes in which the forecasts are better or worse?
- Is the probability forecast well calibrated (i.e., reliable)?
- Do the forecasts correctly capture the natural variability of the weather?
- Other examples?

Identifying verification goals (cont.)
- What forecast performance attribute should be measured? This is related to the question as well as to the type of forecast and observation
- Choices of verification statistics/measures/graphics should match the type of forecast and the attribute of interest, and should measure the quantity of interest (i.e., the quantity represented in the question)

Forecast goodness
Depends on the quality of the forecast AND on the user and his/her application of the forecast information.

Good forecast or bad forecast?
[Schematic: a forecast object F displaced from the observed object O.]
Many verification approaches would say that this forecast has NO skill and is very inaccurate.

Good forecast or bad forecast?
[Schematic: the same forecast F and observation O shown over a watershed.]
If I'm a water manager for this watershed, it's a pretty bad forecast.

Good forecast or bad forecast?
[Schematic: forecast F and observation O relative to a flight route between points A and B.]
- If I'm an aviation traffic strategic planner, it might be a pretty good forecast
- Different users have different ideas about what makes a forecast good
- Different verification approaches can measure different types of goodness

Forecast goodness
- Forecast quality is only one aspect of forecast goodness
- Forecast value is related to forecast quality through complex, non-linear relationships
- In some cases, improvements in forecast quality (according to certain measures) may result in a degradation in forecast value for some users!
- However, some approaches to measuring forecast quality can help understand goodness. Examples: diagnostic verification approaches; new features-based approaches; use of multiple measures to represent more than one attribute of forecast performance

Simple guide for developing verification studies
- Consider the users of the forecasts and of the verification information
- What aspects of forecast quality are of interest for the user?
- Develop verification questions to evaluate those aspects/attributes
Exercise: What verification questions and attributes would be of interest to (a) operators of an electric utility, (b) a city emergency manager, (c) a mesoscale model developer?

Simple guide for developing verification studies
- Identify observations that represent the event being forecast, including the element (e.g., temperature, precipitation), temporal resolution, spatial resolution and representation, and thresholds, categories, etc.
- Identify multiple verification attributes that can provide answers to the questions of interest
- Select measures and graphics that appropriately measure and represent the attributes of interest
- Identify a standard of comparison that provides a reference level of skill (e.g., persistence, climatology, old model)

Types of forecasts and observations
- Continuous: temperature, rainfall amount, 500 mb height
- Categorical, dichotomous: rain vs. no rain, strong winds vs. no strong winds, night frost vs. no frost; often formulated as yes/no
- Categorical, multi-category: cloud amount category, precipitation type; may result from subsetting continuous variables into categories

Types of forecasts and observations
- Probabilistic: the observation can be dichotomous (e.g., precipitation occurrence, yes/no), multi-category (e.g., precipitation type), or continuous (e.g., temperature distribution). The forecast can be a single probability value (for dichotomous events), multiple probabilities (a discrete probability distribution for multiple categories), or a continuous distribution. For dichotomous or multiple categories, probability values may be limited to certain values (e.g., multiples of 0.1)
- Ensemble: multiple iterations of a continuous or categorical forecast; may be transformed into a probability distribution; observations may be continuous, dichotomous, or multi-category
[Figure: probability of freezing temperatures; from U. Washington.]
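
As a minimal illustration of transforming an ensemble into a probability forecast for a dichotomous event (the ensemble values below are made up for illustration, not data from the tutorial), in R:

    # Hypothetical ensemble of 2-m temperature forecasts (deg C) at one location
    ens_members <- c(1.2, -0.4, 0.3, -1.1, 0.0, 2.5, -0.2, 0.8)
    # Probability of the dichotomous event "freezing" = fraction of members at or below 0 deg C
    prob_freezing <- mean(ens_members <= 0)
    prob_freezing   # 0.5 for this illustrative ensemble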

Matching forecasts and observations
- May be the most difficult part of the verification process!
- Many factors need to be taken into account
- Identifying observations that represent the forecast event (example: precipitation accumulation over an hour at a point)
- For a gridded forecast there are many options for the matching process: point-to-grid (match obs to the closest gridpoint) or grid-to-point (interpolate? take the largest value?)

Matching forecasts and observations
Point-to-grid and grid-to-point: the matching approach can impact the results of the verification.

Matching forecasts and observations
Example: a rain gauge observation of 10 lies within a forecast grid whose surrounding gridpoint values are 20, 0, 20, and 20, with the value 0 at the nearest gridpoint. Two approaches:
- Match the rain gauge to the nearest gridpoint: Obs = 10, Fcst = 0
- Interpolate the grid values to the rain gauge location (crude assumption: equal weight to each gridpoint): Obs = 10, Fcst = 15
Differences in results associated with matching: representativeness error.
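
A minimal R sketch of the two matching approaches in this example, with the gridpoint values taken from the figure above:

    obs_gauge <- 10                                  # rain gauge observation
    nearest_value <- 0                               # forecast value at the gridpoint closest to the gauge
    surrounding_values <- c(20, 0, 20, 20)           # forecast values at the gridpoints around the gauge
    fcst_point_to_grid <- nearest_value              # match obs to the closest gridpoint: Fcst = 0
    fcst_grid_to_point <- mean(surrounding_values)   # equal-weight interpolation to the gauge: Fcst = 15
    c(obs = obs_gauge, nearest = fcst_point_to_grid, interpolated = fcst_grid_to_point)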

Matching forecasts and observations
Final point: it is not advisable to use the model analysis as the verification observation. Why not? Issue: non-independence!!

Statistical basis for verification
- Joint, marginal, and conditional distributions are useful for understanding the statistical basis for forecast verification
- These distributions can be related to specific summary and performance measures used in verification
- Specific attributes of interest for verification are measured by these distributions

Statistical basis for verification
Basic (marginal) probability: p_x = Pr(X = x) is the probability that a random variable X will take on the value x.
Example: X = gender of tutorial lecturer. What is an estimate of Pr(X = female)?
Answer: female lecturers (3): B. Brown, B. Casati, E. Ebert; male lecturers (4): S. Mason, P. Nurmi, M. Pocernich, L. Wilson; total number of lecturers: 7. Pr(X = female) = 3/7 = 0.43.

Basic probability
Joint probability: p_{x,y} = Pr(X = x, Y = y) is the probability that both events x and y occur.
Example: What is the probability that a lecturer is female and has dark hair?
Answer: 3 of 7 lecturers are female, and 2 of the 3 female lecturers have dark hair. Thus Pr(X = female, Y = dark hair) = 2/7, where X is gender and Y is hair color.

Basic probability
Conditional probability: Pr(X = x | Y = y) or Pr(Y = y | X = x) is the probability associated with one attribute given the value of a second attribute.
Examples:
- What is the probability that hair color is dark, given that a lecturer is female? Pr(Y = dark hair | X = female)
- What is the probability that a lecturer is male, given that hair color is dark? Pr(X = male | Y = dark hair)
(Note: gray counts as light.)
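
These three kinds of probability can be computed directly from the counts in the lecturer example; a small R sketch, using only the counts given above:

    n_lecturers   <- 7     # total lecturers
    n_female      <- 3     # female lecturers
    n_female_dark <- 2     # female lecturers with dark hair
    p_female            <- n_female / n_lecturers          # marginal:    Pr(X = female) = 3/7
    p_female_and_dark   <- n_female_dark / n_lecturers     # joint:       Pr(X = female, Y = dark) = 2/7
    p_dark_given_female <- p_female_and_dark / p_female    # conditional: Pr(Y = dark | X = female) = 2/3
    round(c(p_female, p_female_and_dark, p_dark_given_female), 3)

The last line also illustrates the general relationship Pr(Y = y | X = x) = Pr(X = x, Y = y) / Pr(X = x).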

What does this have to do with verification?
- Verification can be represented as the process of evaluating the joint distribution of forecasts and observations, p(f, x)
- All of the information regarding the forecasts, the observations, and their relationship is represented by this distribution
- Furthermore, the joint distribution can be factored into two pairs of conditional and marginal distributions:
  p(f, x) = p(F = f | X = x) p(X = x)
  p(f, x) = p(X = x | F = f) p(F = f)

Decompositions of the joint distribution
Many forecast verification attributes can be derived from the conditional and marginal distributions.
- Likelihood-base rate decomposition: p(f, x) = p(F = f | X = x) p(X = x)  [likelihood x base rate]
- Calibration-refinement decomposition: p(f, x) = p(X = x | F = f) p(F = f)  [calibration x refinement]
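
A sketch of both factorizations in R, for a small, hypothetical set of dichotomous forecasts f and observations x (the vectors below are made up for illustration):

    f <- c(1, 1, 0, 0, 1, 0, 0, 1, 0, 0)   # hypothetical yes/no forecasts
    x <- c(1, 0, 0, 0, 1, 1, 0, 1, 0, 0)   # hypothetical yes/no observations
    p_joint     <- prop.table(table(f, x))               # p(f, x)
    p_x         <- prop.table(table(x))                  # base rate p(x)
    p_f         <- prop.table(table(f))                  # refinement p(f)
    p_f_given_x <- prop.table(table(f, x), margin = 2)   # likelihood p(f | x)
    p_x_given_f <- prop.table(table(f, x), margin = 1)   # calibration p(x | f)
    # Both products recover the joint distribution (the differences below are zero):
    max(abs(p_joint - sweep(p_f_given_x, 2, p_x, "*")))  # likelihood x base rate
    max(abs(p_joint - sweep(p_x_given_f, 1, p_f, "*")))  # calibration x refinement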

Graphical representation of distributions
Joint distributions: scatter plots, density plots, 3-D histograms, contour plots

Graphical representation of distributions
Marginal distributions: stem and leaf plots, histograms, box plots, cumulative distributions, quantile-quantile plots

Graphical representation of distributions
Conditional distributions: stem and leaf plots, conditional quantile plots, conditional boxplots

Exercise: Stem and leaf plots
Probability forecasts (Tampere)

Date (2003)   Observed rain?   Forecast (probability)
Jan 1         No               0.3
Jan 2         No               0.1
Jan 3         No               0.1
Jan 4         No               0.2
Jan 5         No               0.2
Jan 6         No               0.1
Jan 7         Yes              0.4
Jan 8         Yes              0.7
Jan 9         Yes              0.7
Jan 12        No               0.2
Jan 13        Yes              0.2
Jan 14        Yes              1.0
Jan 15        Yes              0.7

Stem and leaf plots: Marginal and conditional
Worksheet: (1) the marginal distribution of the Tampere probability forecasts, with one row for each forecast probability value (0.0, 0.1, ..., 1.0); (2) the conditional distributions of the Tampere probability forecasts, with the same probability rows shown separately for Obs precip = No and Obs precip = Yes.
Instructions: mark X's in the appropriate cells, representing the forecast probability values for Tampere. The resulting plots are one simple way to look at marginal and conditional distributions.

R exercise: Representation of distributions
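
One possible way to set up this exercise in R, using the Tampere forecasts and observations from the table above (the plotting calls are only suggestions):

    tampere <- data.frame(
      obs_rain = c("No","No","No","No","No","No","Yes","Yes","Yes","No","Yes","Yes","Yes"),
      prob     = c(0.3, 0.1, 0.1, 0.2, 0.2, 0.1, 0.4, 0.7, 0.7, 0.2, 0.2, 1.0, 0.7)
    )
    table(tampere$prob)                       # marginal distribution of the forecast probabilities
    table(tampere$prob, tampere$obs_rain)     # forecast probabilities split by observed outcome
    stem(tampere$prob)                        # stem-and-leaf plot of the marginal distribution
    boxplot(prob ~ obs_rain, data = tampere,  # conditional boxplots, one per observed category
            ylab = "Forecast probability")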

Comparison and inference: skill scores
- A skill score is a measure of relative performance (e.g., how much more accurate are Laurie Wilson's temperature predictions than climatology, or than my temperature predictions?)
- Provides a comparison to a standard
- Generic skill score definition: SS = (M - M_ref) / (M_perf - M_ref), where M is the verification measure for the forecasts, M_ref is the measure for the reference forecasts, and M_perf is the measure for perfect forecasts
- Positively oriented (larger is better)
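
A minimal R sketch of this definition, assuming a negatively oriented accuracy measure (mean squared error, for which a perfect forecast gives M_perf = 0) and made-up forecast and observation values:

    skill_score <- function(m, m_ref, m_perf = 0) (m - m_ref) / (m_perf - m_ref)
    mse <- function(fcst, obs) mean((fcst - obs)^2)
    obs       <- c(2.1, 0.0, 5.4, 1.2, 3.3)            # hypothetical observations
    fcst      <- c(1.8, 0.4, 4.9, 1.5, 2.9)            # hypothetical forecasts
    clim_fcst <- rep(mean(obs), length(obs))           # climatology as the reference forecast
    skill_score(mse(fcst, obs), mse(clim_fcst, obs))   # > 0: more accurate than the reference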

Comparison and inference
Uncertainty in scores and measures should be estimated whenever possible!
- Uncertainty arises from sampling variability, observation error, representativeness error, and other sources
- Erroneous conclusions can be drawn regarding improvements in forecasting systems and models
- Confidence intervals and hypothesis tests can be parametric (i.e., depending on a statistical model) or non-parametric (e.g., derived from re-sampling procedures, often called bootstrapping)
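
A minimal sketch of the non-parametric (bootstrap) approach in R: resample forecast-observation pairs with replacement and recompute the score to obtain an approximate confidence interval (the data below are hypothetical):

    set.seed(1)
    obs  <- c(2.1, 0.0, 5.4, 1.2, 3.3, 0.8, 4.1, 2.6)   # hypothetical observations
    fcst <- c(1.8, 0.4, 4.9, 1.5, 2.9, 1.0, 3.6, 2.2)   # hypothetical forecasts
    boot_mse <- replicate(1000, {
      idx <- sample(length(obs), replace = TRUE)        # resample pairs with replacement
      mean((fcst[idx] - obs[idx])^2)                    # recompute MSE on the resample
    })
    quantile(boot_mse, c(0.025, 0.975))                 # approximate 95% confidence interval for MSE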

Verification attributes
- Verification attributes measure different aspects of forecast quality
- They represent a range of characteristics that should be considered
- Many can be related to the joint, conditional, and marginal distributions of forecasts and observations

Verification attribute examples
- Bias (marginal distributions)
- Correlation: overall association (joint distribution)
- Accuracy: differences (joint distribution)
- Calibration: measures conditional bias (conditional distributions)
- Discrimination: degree to which forecasts discriminate between different observations (conditional distributions)
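
For continuous forecasts, several of these attributes reduce to simple summary statistics; a sketch in R with hypothetical values:

    obs  <- c(2.1, 0.0, 5.4, 1.2, 3.3)   # hypothetical observations
    fcst <- c(1.8, 0.4, 4.9, 1.5, 2.9)   # hypothetical forecasts
    mean(fcst) - mean(obs)               # bias: compares the marginal means
    cor(fcst, obs)                       # correlation: overall association
    mean(abs(fcst - obs))                # accuracy: mean absolute error
    mean((fcst - obs)^2)                 # accuracy: mean squared error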

Desirable characteristics of verification measures
- Statistical validity
- Properness (probability forecasts): the best score is achieved when the forecast is consistent with the forecaster's best judgments, and hedging is penalized. Example: Brier score
- Equitability: constant and random forecasts should receive the same score. Examples: Gilbert skill score (2x2 case); Gerrity score. No scores achieve equitability in a more rigorous sense (e.g., most scores are sensitive to bias and event frequency)
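
The Brier score mentioned above is simply the mean squared difference between the forecast probability and the observed outcome (0 or 1); computed here in R for the Tampere forecasts from the earlier exercise:

    prob <- c(0.3, 0.1, 0.1, 0.2, 0.2, 0.1, 0.4, 0.7, 0.7, 0.2, 0.2, 1.0, 0.7)   # forecast probabilities
    obs  <- c(0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1)                             # 1 = rain observed
    mean((prob - obs)^2)   # Brier score: 0 is perfect, lower is better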

Miscellaneous issues
- In order to be verified, forecasts must be formulated so that they are verifiable! Corollary: all forecasts should be verified; if something is worth forecasting, it is worth verifying
- Stratification and aggregation: aggregation can help increase sample sizes and statistical robustness, but it can also hide important aspects of performance; the most common regime may dominate results and mask variations in performance. Thus it is very important to stratify results into meaningful, homogeneous sub-groups

Verification issues (cont.)
Observations:
- There is no such thing as truth!!
- Observations generally are more "true" than a model analysis (at least they are relatively more independent)
- Observational uncertainty should be taken into account in whatever way possible (e.g., how well do adjacent observations match each other?)

Stem and leaf plots: Marginal and conditional (answer)

Marginal distribution of Tampere probability forecasts (bins with no X's omitted):
0.1  X X X
0.2  X X X X
0.3  X
0.4  X
0.7  X X X
1.0  X

Conditional distribution, Obs precip = No:
0.1  X X X
0.2  X X X
0.3  X

Conditional distribution, Obs precip = Yes:
0.2  X
0.4  X
0.7  X X X
1.0  X