Verification of nowcasts and short-range forecasts, including aviation weather


Verification of nowcasts and short-range forecasts, including aviation weather Barbara Brown NCAR, Boulder, Colorado, USA WMO WWRP 4th International Symposium on Nowcasting and Very-short-range Forecast 2016 (WSN16) Hong Kong; July 2016

Goals: To understand where we are going, it's helpful to understand where we have been and what we have learned.
- Evolution of verification of short-range forecasts
- Challenges
- Observations and uncertainty
- User-relevant approaches

Early verification: the Finley period, 1880s (see Murphy's paper on "The Finley Affair"; WAF, 11, 1996). Focused on contingency table statistics and the development of many of the common measures still used today: Gilbert (ETS), Peirce (Hanssen-Kuipers), Heidke, etc.

                Observed: Yes     Observed: No
  Forecast Yes  hits              false alarms
  Forecast No   misses            correct negatives

These methods are still the backbone of many verification efforts (e.g., warnings). Important notes: many categorical scores are not independent! At least three metrics are needed to fully characterize the bivariate distribution of forecasts and observations.
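The scores named on this slide can be sketched in a few lines of Python from the four contingency-table counts. The function name is illustrative; the demo counts are the commonly quoted totals from Finley's 1884 tornado forecasts discussed in Murphy (1996).

```python
# Sketch of the classic 2x2 contingency-table scores; names are illustrative.

def categorical_scores(hits, false_alarms, misses, correct_negatives):
    """Return a dict of common categorical verification measures."""
    n = hits + false_alarms + misses + correct_negatives
    pod = hits / (hits + misses)                    # probability of detection
    far = false_alarms / (hits + false_alarms)      # false alarm ratio
    bias = (hits + false_alarms) / (hits + misses)  # frequency bias
    # Gilbert skill score (ETS): remove hits expected by chance
    hits_random = (hits + misses) * (hits + false_alarms) / n
    ets = (hits - hits_random) / (hits + misses + false_alarms - hits_random)
    # Peirce (Hanssen-Kuipers) skill score: POD minus POFD
    pofd = false_alarms / (false_alarms + correct_negatives)
    pss = pod - pofd
    return {"POD": pod, "FAR": far, "bias": bias, "ETS": ets, "PSS": pss}

# Finley's 1884 tornado forecasts (commonly quoted counts):
scores = categorical_scores(hits=28, false_alarms=72,
                            misses=23, correct_negatives=2680)
print({k: round(v, 3) for k, v in scores.items()})
```

Note how the slide's warning plays out here: POD, FAR, bias, ETS, and PSS all derive from the same four counts, so no single score tells the whole story.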

Early years continued: continuous measures. Focus on squared-error statistics: mean squared error, correlation, bias. Note: little recognition before Murphy of the non-independence of these measures. Extension to probabilistic forecasts: the Brier Score (1950), well before the prevalence of probability forecasts! Note: reliance on squared-error statistics means we are optimizing toward the average, not toward extremes! Development of NWP measures: the S1 score and anomaly correlation, still relied on for monitoring and comparing the performance of NWP systems. (Are these still the best measures for this purpose?)
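A minimal sketch of these squared-error statistics and the Brier score, using plain Python lists; function names are illustrative.

```python
# Bias, MSE, correlation for continuous forecasts, plus the Brier score
# for probability forecasts of a binary event. Names are illustrative.
from math import sqrt

def continuous_stats(fcst, obs):
    n = len(fcst)
    bias = sum(f - o for f, o in zip(fcst, obs)) / n        # mean error
    mse = sum((f - o) ** 2 for f, o in zip(fcst, obs)) / n  # mean squared error
    fm, om = sum(fcst) / n, sum(obs) / n
    cov = sum((f - fm) * (o - om) for f, o in zip(fcst, obs)) / n
    sf = sqrt(sum((f - fm) ** 2 for f in fcst) / n)
    so = sqrt(sum((o - om) ** 2 for o in obs) / n)
    return bias, mse, cov / (sf * so)

def brier_score(prob, event):
    """Brier (1950): mean squared error of probability forecasts;
    event is 1 if observed, 0 if not. 0 is perfect, 1 is worst."""
    return sum((p - e) ** 2 for p, e in zip(prob, event)) / len(prob)

bias, mse, corr = continuous_stats([2.0, 3.0, 5.0], [1.0, 3.0, 6.0])
bs = brier_score([0.9, 0.1, 0.7], [1, 0, 1])
```

As the slide warns, all of these are squared-error (or second-moment) quantities, so minimizing them pulls forecasts toward the conditional average.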

The Renaissance: the Allan Murphy era. Expanded methods for probabilistic forecasts; decompositions of scores led to more meaningful interpretations of verification results (attribute diagram). Initiation of ideas of meta-verification: equitability, propriety. A statistical framework for forecast verification: the joint distribution of forecasts and observations and its factorizations placed verification in a statistical context. Dimensionality of the forecast problem: d = n_f * n_x - 1.
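The score decompositions mentioned here can be illustrated with Murphy's decomposition of the Brier score into reliability, resolution, and uncertainty, which underlies the attribute diagram. A sketch, grouping forecasts by their distinct probability values (names are illustrative):

```python
# Murphy's Brier-score decomposition: BS = reliability - resolution + uncertainty.
# Exact when forecasts are grouped by their distinct issued probabilities.
from collections import defaultdict

def brier_decomposition(prob, event):
    n = len(prob)
    groups = defaultdict(list)
    for p, e in zip(prob, event):
        groups[p].append(e)
    obar = sum(event) / n  # climatological event frequency
    reliability = sum(len(es) * (p - sum(es) / len(es)) ** 2
                      for p, es in groups.items()) / n
    resolution = sum(len(es) * (sum(es) / len(es) - obar) ** 2
                     for es in groups.values()) / n
    uncertainty = obar * (1 - obar)
    return reliability, resolution, uncertainty

prob = [0.2, 0.2, 0.8, 0.8, 0.8]
event = [0, 1, 1, 1, 0]
rel, res, unc = brier_decomposition(prob, event)
bs = sum((p - e) ** 2 for p, e in zip(prob, event)) / len(prob)
assert abs(bs - (rel - res + unc)) < 1e-9  # decomposition is exact
```

Each term answers a different question (calibration, discrimination, inherent difficulty), which is exactly the "more meaningful interpretation" the slide credits to the decomposition idea.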

"Forecasts contain no intrinsic value. They acquire value through their ability to influence the decisions made by users of the forecasts. Forecast quality is inherently multifaceted in nature; however, forecast verification has tended to focus on one or two aspects of overall forecasting performance, such as accuracy and skill." Allan H. Murphy, "What is a good forecast? An essay on the nature of goodness in weather forecasting," Weather and Forecasting, 8, 1993.

The Murphy era cont.: connections between forecast quality and value. Evaluation of cost/loss decision-making situations in the context of improved forecast quality; the non-linear nature of quality-value relationships. From Murphy, 1993 (Weather and Forecasting).
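The cost/loss framework can be sketched with the standard static decision model (see also Richardson 2000): a user pays cost C to protect, or suffers loss L if an unprotected event occurs. With alpha = C/L, climatological frequency s, and the forecast system's hit rate H and false-alarm rate F, the value relative to climatology and a perfect forecast is computed below; names are illustrative.

```python
# Static cost/loss value model: 0 = no better than climatology, 1 = perfect.

def relative_value(alpha, s, hit_rate, false_alarm_rate):
    """Economic value relative to climatology (0) and a perfect forecast (1)."""
    e_clim = min(alpha, s)            # best fixed action: always or never protect
    e_perf = s * alpha                # protect only when the event occurs
    e_fcst = (false_alarm_rate * alpha * (1 - s)   # needless protection
              + hit_rate * s * alpha               # correct protection
              + (1 - hit_rate) * s)                # misses incur the full loss
    return (e_clim - e_fcst) / (e_clim - e_perf)

# The same forecast quality yields very different value for different users:
for alpha in (0.05, 0.2, 0.5):
    print(alpha, round(relative_value(alpha, s=0.2, hit_rate=0.8,
                                      false_alarm_rate=0.1), 3))
```

Running this shows value peaking near alpha = s and even going negative for some users, which is the non-linear quality-value relationship the slide refers to.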

Murphy era cont.: development of the idea of diagnostic verification, also called distribution-oriented verification. Focus on measuring or representing attributes of performance rather than relying on summary measures. A revolutionary idea: instead of relying on a single measure of overall performance, ask questions about performance and measure the attributes that are able to answer those questions. Example: use of conditional quantile plots to examine conditional biases in forecasts.
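The conditional-quantile idea can be sketched by binning cases on the forecast value and summarizing the observations within each bin; a full conditional quantile plot shows several quantiles per bin, but the median suffices to show the principle. Function name and bin width are illustrative.

```python
# Conditional medians: median observed value given the (binned) forecast,
# exposing conditional biases that a single summary score hides.
import statistics

def conditional_medians(fcst, obs, bin_width=5.0):
    """Median observed value, conditioned on binned forecast values."""
    bins = {}
    for f, o in zip(fcst, obs):
        center = round(f / bin_width) * bin_width
        bins.setdefault(center, []).append(o)
    return {c: statistics.median(v) for c, v in sorted(bins.items())}

# Forecasts near 10 verify near 10, but forecasts near 20 verify near 17:
# a conditional high bias at larger forecast values.
print(conditional_medians([10, 11, 9, 20, 21, 19], [10, 10, 9, 17, 18, 16]))
```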

The Modern era: a new focus on the evaluation of ensemble forecasts, with development of new methods specific to ensembles (rank histogram, CRPS) and greater understanding of the limitations of methods. Meta-verification: evaluation of sampling uncertainty in verification measures. Approaches to evaluate multiple attributes simultaneously (note: this is actually an extension of Murphy's attribute-diagram idea to other types of measures), e.g., performance diagrams, Taylor diagrams.
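The two ensemble tools named here can be sketched directly from their definitions: the rank histogram counts where each observation falls among its sorted ensemble members, and the CRPS of an ensemble can be computed from a standard identity. Names are illustrative; ties between observation and members are not handled specially here.

```python
# Rank histogram and empirical CRPS for ensemble forecasts.
import bisect

def rank_histogram(ensembles, obs):
    """Count where each observation ranks among its sorted ensemble members.
    A flat histogram suggests a statistically reliable ensemble."""
    nbins = len(ensembles[0]) + 1
    counts = [0] * nbins
    for members, o in zip(ensembles, obs):
        counts[bisect.bisect_left(sorted(members), o)] += 1
    return counts

def crps_empirical(members, o):
    """CRPS of one ensemble forecast via the identity
    CRPS = mean(|x_i - o|) - 0.5 * mean(|x_i - x_j|)."""
    m = len(members)
    term1 = sum(abs(x - o) for x in members) / m
    term2 = sum(abs(x - y) for x in members for y in members) / (2 * m * m)
    return term1 - term2

counts = rank_histogram([[1, 3, 5], [2, 4, 6]], obs=[0.5, 7.0])  # both extremes
crps = crps_empirical([1.0, 3.0, 5.0], 2.0)
```

In the small demo both observations fall outside their ensembles, populating the outermost ranks, the classic signature of an under-dispersive ensemble.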

[Performance diagram for rain, snow, freezing rain, and ice pellets, showing bias and the overforecast/underforecast regions relative to the perfect score. Credit: J. Wolff, NCAR]

The Modern era cont.: development of an international verification community (workshops, textbooks; the WMO Joint Working Group on Forecast Verification Research). Evaluation approaches for special kinds of forecasts: extreme events (extremal dependency scores; from Ferro and Stephenson 2011, Weather and Forecasting) and NWP measures. Extension of diagnostic verification ideas: spatial verification methods and feature-based evaluations (e.g., of time series). Movement toward user-relevant approaches.

Spatial verification methods were inspired by the limited diagnostic information available from traditional approaches for evaluating NWP predictions:
- It is difficult to distinguish differences between forecasts
- The double-penalty problem: forecasts that appear good by the eye test fail by traditional measures, often due to small offsets in spatial location
- Smoother forecasts often win, even if less useful
- Traditional scores don't say what went wrong, or was good, about a forecast
Many new approaches have been developed over the last 15 years, and they are starting to be applied in climate model evaluation as well.

New spatial verification approaches:
- Neighborhood: successive smoothing of forecasts/obs; gives credit to "close" forecasts
- Object- and feature-based: evaluate attributes of identifiable features
- Scale separation: measure scale-dependent error
- Field deformation: measure distortion and displacement (phase error) for the whole field; how should the forecast be adjusted to make the best match with the observed field?
http://www.ral.ucar.edu/projects/icp/
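The neighborhood idea can be sketched with the Fractions Skill Score (Roberts and Lean 2008): compare fractional event coverage in square neighborhoods rather than demanding point-by-point matches. A pure-Python sketch on nested lists; names and the toy fields are illustrative.

```python
# Fractions Skill Score: a neighborhood spatial verification method.

def fractions(field, threshold, radius):
    """Fraction of grid points exceeding threshold in a (2r+1)^2 window."""
    ny, nx = len(field), len(field[0])
    binary = [[1 if v >= threshold else 0 for v in row] for row in field]
    out = []
    for i in range(ny):
        row = []
        for j in range(nx):
            cells = [binary[a][b]
                     for a in range(max(0, i - radius), min(ny, i + radius + 1))
                     for b in range(max(0, j - radius), min(nx, j + radius + 1))]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out

def fss(fcst, obs, threshold, radius):
    pf = fractions(fcst, threshold, radius)
    po = fractions(obs, threshold, radius)
    num = sum((f - o) ** 2 for rf, ro in zip(pf, po) for f, o in zip(rf, ro))
    den = sum(f * f + o * o for rf, ro in zip(pf, po) for f, o in zip(rf, ro))
    return 1 - num / den  # 1 = perfect; typically rises with neighborhood size

# A one-column spatial offset is the double-penalty case: FSS is 0 at the
# grid scale but perfect once the neighborhood spans the displacement.
obs  = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
fcst = [[1, 0, 0], [1, 0, 0], [1, 0, 0]]
print(fss(fcst, obs, threshold=1, radius=0), fss(fcst, obs, threshold=1, radius=2))
```

This is exactly the "credit for close forecasts" behavior the slide describes: the score as a function of neighborhood size shows the scale at which a forecast becomes skillful.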

Example applications: SWFDP, South Africa (from Landman and Marx 2015 presentation); US Weather Prediction Center; Ebert and Ashrit (2015): CRA.

Object-based extreme rainfall evaluation: 6-hr accumulated precipitation, near-peak (90th percentile) intensity difference (P90 Fcst - P90 Obs), with positive values indicating overforecast and negative values underforecast. The high-resolution deterministic model does fairly well; the high-resolution ensemble mean underpredicts; the mesoscale deterministic model underpredicts; and the mesoscale ensemble underpredicts the most.

MODE Time Domain: adding the time dimension. MODE-TD allows evaluation of timing errors, storm volume, storm velocity, initiation, decay, etc., comparing observed and modeled features. Application of MODE-TD to WRF prediction of an MCS in 2007 (Credit: A. Prein, NCAR). MODE and MODE-TD are available through the Model Evaluation Tools (http://www.dtcenter.org/met/users/).

Meta-evaluation of spatial methods: what are the capabilities of the new methods? The initial intercomparison (2005-2011) considered method capabilities for precipitation in the High Plains of the US (https://www.ral.ucar.edu/projects/icp/). MesoVICT (Mesoscale Verification in Complex Terrain; 2013-???) considers how spatial methods: transfer to other regions with complex terrain (the Alpine region) and to other parameters, e.g., wind (speed and direction); work with forecast ensembles; and incorporate observation uncertainty (analysis ensemble).

MesoVICT: three tiers.
- Tier 1 (core): deterministic precipitation + VERA analysis + JDC obs; 6 cases, minimum 1
- Tier 2a: ensemble wind + VERA analysis + JDC obs
- Tier 2b: other variables, ensemble + VERA ensemble + JDC obs
- Tier 3: complex terrain; sensitivity tests to method parameters
Mesoscale model forecasts from MAP D-PHASE; precipitation and wind; deterministic and ensemble; verification with VERA.

Challenges:
- Observation limitations: representativeness, biases
- Measuring and incorporating uncertainty information. Sampling uncertainty: methods are available but not typically applied. Observation uncertainty: few methods available; it is not clear how to do this in general
- User-relevant verification: evaluating forecasts in the context of user applications and decision making

Observation limitations: observations are still often the limiting factor in verification. Example: aviation weather. Observations can be characterized by:
- Sparseness: difficult, especially for many aviation variables (e.g., icing, turbulence, precipitation type)
- Representativeness: how to evaluate analysis products that provide nowcasts at locations with no observations?
- Biases: observations of extreme conditions (e.g., icing, turbulence) are biased against where the event occurs (pilot avoidance)!
Verification methods must take these attributes into account (e.g., in the choice of verification measures).

Example: precipitation type. POD of snow precipitation-type forecasts from two models, plotted against lead time and verified against both mPING and METAR observations. mPING provides crowd-sourced precipitation-type reports; human-generated observations have biases (e.g., in the types observed). The type of observation impacts the verification results. Credit: J. Wolff (NCAR).

Conceptual model of forecast quality and value (Morss et al. 2008, BAMS).

User-relevant verification: levels of user relevance.
1. Making traditional verification methods useful for a range of users (e.g., a variety of thresholds)
2. Developing and applying specific methods for particular users (e.g., particular statistics; user-relevant variables)
3. Applying meaningful diagnostic (e.g., spatial) methods that are relevant for a particular user's question
4. Connecting economic and other value directly with forecast performance
Most verification studies are at Levels 1 and 2, with some approaching 3 and very few actually at Level 4. Some examples follow.

Applications of object-based approaches. Example: evaluation of jet cores, highs, and lows (using the MODE object-based approach) for model acceptance testing. Courtesy Marion Mittermaier, UK Met Office.

User approach to ensemble evaluation: translate ensemble information into user-relevant information and evaluate on the basis of the impact variable. Ideal: user-specific information for many users; more general, user-relevant information for others. Example (Steiner): translate convective ensembles into probability maps of aircraft capacity, e.g., the predicted chance of a 30% capacity loss in the E-W direction 9 h ahead. Courtesy M. Steiner.

Examples of user-based forecast verification and value studies, looking at the relationship between quality and value. Keith (2003; Weather and Forecasting): the value of ceiling forecasts for fuel savings; a cost/loss evaluation of alternate-airport fuel loading needs. Keith (2005; unpublished): an average of $23K is saved per flight using probabilistic forecasts, implying savings of approximately $50M per year in operating costs due to a more optimal balance between false alarms and misses.

Comments on user-relevant verification: moving toward user-relevant verification will increase both the usefulness and the quality of forecasts, and will benefit developers as well as users. Many of the steps toward user relevance (e.g., user-specified stratifications and thresholds) are easy to achieve; others require major multi-disciplinary efforts. Verification practitioners (people who do verification) should endeavor as much as possible to understand the needs of the forecast users. Much is left to be explored!

Challenge: develop the best new user-relevant verification method. Sponsored by the WMO/WWRP JWGFVR (Verification Working Group) and the High Impact Weather, Sub-seasonal to Seasonal, and Polar Prediction projects. Focus: all applications of weather/climate/hydro forecasts; metrics can be quantitative scores or diagnostics. Criteria for being selected as best: originality, user relevance, simplicity, robustness, resistance to hedging. Desirable characteristics: (i) a clear statistical foundation; (ii) applicability to a broader set of problems.

Challenge: develop the best new user-relevant verification method. Deadline for submission: 31 Oct 2016. Prize: an invited keynote talk at the 7th International Verification Methods Workshop in May 2017 (Berlin). Contact verifchallenge@ucar.edu for more information, and see the website at http://www.wmo.int/pages/prog/arep/wwrp/new/FcstVerChallenge.html

Summary: much progress has been made in the last few decades, advancing the capabilities and impacts of forecast evaluation. Many new approaches have been developed, examined, and applied, and they are providing opportunities for more meaningful evaluations of both weather and climate forecasts; we are thinking beyond contingency tables. Thoughtfulness in selecting and implementing verification approaches will pay off in more meaningful results: optimize forecasts for what we care about. But still more challenges lie ahead.

Remaining challenges (some examples):
- Expansion of user-relevant metrics: providing a breadth of information to users
- Sorting out how to incorporate uncertainty appropriately: spatial/temporal, measurement/observation, sampling
- Improving communication: developing ways to communicate forecast-quality information to the general public and to specific users