MET+ Tutorial. MET+ Team


Goals: give new users an overview of MET and METViewer; update current users on changes for METv6.1 and the latest METViewer upgrades; highlight interesting research by current MET users; and introduce MET+. Held at NCAR in conjunction with the WRF and WRF-Chem Tutorials, Jan 31 - Feb 2.

MET+ Team: MET is a verification toolkit designed for flexible yet systematic evaluation, supported for the community via the DTC. NCAR Engineers: John Halley Gotway, Julie Prestopnik, Randy Bullock, Tatiana Burek, Minna Win, Howard Soh, George McCabe. Statisticians: Tressa Fowler, Barb Brown, Eric Gilleland. Scientists: Tara Jensen, Kathryn Newman, Jamie Wolff, Michelle Harrold, Tina Kalb, Dan Adriaansen. ESRL Engineers: Bonny Strong, Jim Frimmel, Kirk Holub, Randy Pierce, Molly Smith. Scientists: Jeff Hamilton, Isidora Jankov, Jeff Beck, Tanya Peevey, Man Zhang.

MET capabilities: over 70 traditional statistics using both point and gridded datasets; multiple interpolation methods; computation of confidence intervals; able to read GRIB1, GRIB2, and CF-compliant NetCDF; applied to many spatial and temporal scales; users both in the US (30%) and internationally (70%). Object-based and spatial methods address questions such as: bad forecast, or good forecast with a displacement error? Other capabilities include geographical representation of errors, such as the 90th percentile of the difference between two models.

MET flowchart example for accumulated precipitation: 3-h accumulation QPE is combined by PCP-Combine into a 12-h accumulation QPE to match the 12-h accumulation QPF, and both are passed to the MET statistics tools (MODE, MODE-TD, Wavelet-Stat, Grid-Stat, Point-Stat, Ensemble-Stat, and Series-Analysis over multiple runs in time), whose output feeds the METViewer database and display.

METViewer components: the METViewer database and display ingests MET output and can also ingest VSDB output.

METViewer: built on Java, Apache/Tomcat, MySQL, and the R statistics package. Pick your variable, pick your model, pick your stratifications, configure the plot area (modify colors, line types, confidence intervals, names, etc.), and hit Generate Plot. Many plot options: time series, box plots, bar graphs, histograms, rank histograms, ROC, reliability, ensemble spread-skill, performance diagrams, and Taylor diagrams. METViewer can also read VSDB output. At NCEP it is installed on the development side of IDP and is therefore only available internally at this time; local initial adopters include Perry, Jacob, Tracey, Binbin, and Ying.

MET+ Unified Package: Python wrappers around MET and METViewer that are simple to set up and run, with automated plotting of 2D fields and statistics and communication between MET and Python algorithms (Cython). The initial system targets global deterministic forecasting, with plans to generalize across scales when possible to quickly spin up ensemble, high-resolution, and global components. Components include the MET Python wrappers, spatial plots, statistics plots, METViewer, and a GitHub repository.

Schedule, Wed Jan 31: 09:00 Welcome and Intro; 09:20 Basic VX Concepts; 09:50 Contingency Tables; 10:30 Break; 10:45 Continuous Stats; 11:30 Statistical Significance (box plots, CIs, pairwise differences); 12:00 Lunch; 01:00 MET Download and Use Cases; 01:30 Data Types and Pre-processing; 02:00 Point-Stat and Interpolation; 02:30 Break; 02:45 Practical Session - Pre-processing and Point-Stat

4 Thurs Feb 1 Schedule 08:00 Stat-Analysis 08:30 Masking and Regridding 09:15 Grid-Stat and Interpretation 10:00 Break 10:15 Practical Session - Grid-Stat and Stat-Analysis 12:00 Lunch 01:00 Series-Analysis 01:35 Ensemble-Stat 02:00 Probabilistic Verification 02:30 Break 02:45 Practical Session - Series-Analysis; Ensembles and Probabilities Fri Feb 2 Schedule 08:00 MODE and MODE-TD 08:45 MODE Customization and Output 09:30 MET-TC 10:10 Break 10:20 Practical - MODE, MTD and MET-TC 12:00 Lunch 01:00 METViewer 01:20 Containers 01:45 MET+ python wrappers 02:30 Wrap-up 02:35 Break 02:50 Practical - METViewer and MET+ Where to get help Resources MET User s Guide: ers/docs/overview.php Verification Methods FAQ: ojects/verification/ Verification Discussion Group: Subscribe at ilman/listinfo/vx-discuss Copyright 2017, University Corporation for Atmospheric Research, all rights reserved 16

5 Basic concepts - outline Basic Verification Concepts Tressa L. Fowler National Center for Atmospheric Research Boulder Colorado USA What is verification? Why verify? Identifying verification goals Forecast goodness Designing a verification study Types of forecasts and observations Matching forecasts and observations Verification attributes Miscellaneous issues Questions to ponder: Who? What? When? Where? Which? Why? Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 18 How do you do verification? What is verification? Using MET is the easy part, scientifically speaking. Good verification depends mostly on what you do before and after MET. What do you want to know? Good forecasts. Good observations. Well matched. Appropriate selection of methods Thorough and correct interpretation of results. Verification is the process of comparing forecasts to relevant observations Verification is one aspect of measuring forecast goodness Verification measures the quality of forecasts (as opposed to their value) For many purposes a more appropriate term is evaluation Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 19 Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 20

6 Why verify? Why verify? Purposes of verification (traditional definition) Administrative purpose Monitoring performance Choice of model or model configuration (has the model improved?) Scientific purpose Identifying and correcting model flaws Forecast improvement Economic purpose Improved decision making Feeding decision models or decision support systems What are some other reasons to verify weather forecasts? Help operational forecasters understand model biases and select models for use in different conditions Help users interpret forecasts (e.g., What does a temperature forecast of 0 degrees really mean? ) Identify forecast weaknesses, strengths, differences Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 21 Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 22 Identifying verification goals Identifying verification goals (cont.) What questions do we want to answer? Examples: In what locations does the model have the best performance? Are there regimes in which the forecasts are better or worse? Is the probability forecast well calibrated (i.e., reliable)? Do the forecasts correctly capture the natural variability of the weather? Other examples? What forecast performance attribute should be measured? Related to the question as well as the type of forecast and observation Choices of verification statistics, measures, graphics Should match the type of forecast and the attribute of interest Should measure the quantity of interest (i.e., the quantity represented in the question) Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 23 Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 24

7 Forecast goodness Good forecast or bad forecast? Depends on the quality of the forecast AND F O The user and his/her application of the forecast information Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 25 Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 26 Good forecast or Bad forecast? Good forecast or Bad forecast? If I m a water manager for this watershed, it s a pretty bad forecast F O A F O Flight Route O B If I m an aviation traffic strategic planner Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 27 Different users have different ideas about what makes a forecast good Copyright 2015, University Corporation for Atmospheric Research, all rights reserved It might be a pretty good forecast Different verification approaches can measure different types of goodness 28

8 Forecast goodness Forecast quality is only one aspect of forecast goodness Forecast value is related to forecast quality through complex, non-linear relationships In some cases, improvements in forecast quality (according to certain measures) may result in a degradation in forecast value for some users! However - Some approaches to measuring forecast quality can help understand goodness Examples Diagnostic verification approaches New features-based approaches Use of multiple measures to represent more than one attribute of forecast performance Examination of multiple thresholds Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 29 Basic guide for developing verification studies Consider the users of the forecasts of the verification information What aspects of forecast quality are of interest for the user? Typically (always?) need to consider multiple aspects Develop verification questions to evaluate those aspects/attributes Exercise: What verification questions and attributes would be of interest to operators of an electric utility? a city emergency manager? a mesoscale model developer? aviation planners? Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 30 Basic guide for developing verification studies Identify observations that represent the event being forecast, including the Element (e.g., temperature, precipitation) Temporal resolution Spatial resolution and representation Thresholds, categories, etc. Observations are not truth We can t know the complete truth. Observations generally are more true than a model analysis (at least they are relatively more independent) Observational uncertainty should be taken into account in whatever way possible In other words, how well do adjacent observations match each other? Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 31 Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 32

9 Observations might be garbage if Basic guide for developing verification studies Not Independent (of forecast or each other) Biased Space Time Instrument Sampling Reporting Measurement errors Not enough of them Identify multiple verification attributes that can provide answers to the questions of interest Select measures and graphics that appropriately measure and represent the attributes of interest Identify a standard of comparison that provides a reference level of skill (e.g., persistence, climatology, old model) Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 33 Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 34 Types of forecasts, observations Continuous Temperature Rainfall amount 500 mb height Categorical Dichotomous Rain vs. no rain Strong winds vs. no strong wind Night frost vs. no frost Often formulated as Yes/No Multi-category Cloud amount category Precipitation type May result from subsetting continuous variables into categories Ex: Temperature categories of 0-10, 11-20, 21-30, etc. Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 35 Types of forecasts, observations Probabilistic Observation can be dichotomous, multi-category, or continuous Precipitation occurrence Dichotomous (Yes/No) Precipitation type Multi-category Temperature distribution - Continuous Forecast can be Single probability value (for dichotomous events) Multiple probabilities (discrete probability distribution for multiple categories) Continuous distribution For dichotomous or multiple categories, probability values may be limited to certain values (e.g., multiples of 0.1) Ensemble Multiple iterations of a continuous or categorical forecast May be transformed into a probability distribution Observations may be continuous, dichotomous or multi-category Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 2-category precipitation forecast (PoP) for US ECMWF 2-m temperature meteogram for Helsinki 36

10 Matching forecasts and observations Matching forecasts and observations May be the most difficult part of the verification process! Many factors need to be taken into account - Identifying observations that represent the forecast event Example: Precipitation accumulation over an hour at a point - For a gridded forecast there are many options for the matching process Point-to-grid Match obs to closest gridpoint Grid-to-point Interpolate? Take largest value? Point-to-Grid and Grid-to-Point Matching approach can impact the results of the verification Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 37 Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 38 Matching forecasts and observations Example: Two approaches: Match rain gauge to nearest gridpoint or Interpolate grid values to rain gauge location Crude assumption: equal weight to each gridpoint Differences in results associated with matching: Representativeness difference Will impact most verification scores Obs=10 Fcst= Obs=10 Fcst=15 20 Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 39 Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 40
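The choice of matching method matters in practice. Below is a minimal Python sketch (not MET code; the grid values and point location are invented) showing how nearest-gridpoint matching and bilinear interpolation give different matched forecast values for the same observation location:

import numpy as np

# Matching a point observation to a gridded forecast by (a) nearest grid
# point and (b) bilinear interpolation. Values are made up for illustration.
fcst = np.array([[10.0, 20.0],    # forecast values at grid-cell corners
                 [ 5.0, 15.0]])
x, y = 0.25, 0.75                  # obs location in grid-cell coordinates [0, 1]

# (a) Nearest-neighbor match: take the closest corner value.
nearest = fcst[int(round(y)), int(round(x))]

# (b) Bilinear interpolation: weight the four corners by proximity.
bilinear = (fcst[0, 0] * (1 - x) * (1 - y) + fcst[0, 1] * x * (1 - y) +
            fcst[1, 0] * (1 - x) * y       + fcst[1, 1] * x * y)

print(f"nearest = {nearest:.2f}, bilinear = {bilinear:.2f}")
# The same observation is compared against different forecast values,
# so the choice of matching method changes the verification statistics.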

Matching forecasts and observations, final point: it is not advisable to use the model analysis as the verification observation. Why not? Non-independence!

Comparison and inference: uncertainty in scores and measures should be estimated whenever possible. Uncertainty arises from sampling variability, observation error, representativeness differences, and other sources; otherwise, erroneous conclusions can be drawn regarding improvements in forecasting systems and models. Methods for confidence intervals and hypothesis tests may be parametric (i.e., depending on a statistical model) or non-parametric (e.g., derived from resampling procedures, often called bootstrapping).

Verification attributes measure different aspects of forecast quality and represent a range of characteristics that should be considered. Many can be related to the joint, conditional, and marginal distributions of forecasts and observations. Example tornado contingency table: forecast yes / observed yes = 30, forecast yes / observed no = 70, forecast no / observed yes = 20, forecast no / observed no = 2680; forecast yes total = 100, observed yes total = 50, grand total = 2800. Joint: the probability of two events in conjunction, e.g. Pr(Tornado forecast AND Tornado observed) = 30/2800 = 0.01. Conditional: the probability of one variable given that the second is already determined, e.g. Pr(Tornado forecast | Tornado observed) = 30/50 = 0.60. Marginal: the probability of one variable without regard to the other, e.g. Pr(Yes forecast) = 100/2800 = 0.04 and Pr(Yes observed) = 50/2800 = 0.02.
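As a quick worked check of these definitions, the following Python snippet recomputes the joint, conditional, and marginal probabilities from the 2x2 counts quoted above (a sketch for illustration only):

# Joint, conditional, and marginal probabilities from the tornado table.
a, b, c, d = 30, 70, 20, 2680           # hits, false alarms, misses, correct negatives
n = a + b + c + d                        # 2800 total cases

joint = a / n                            # Pr(fcst yes AND obs yes) = 30/2800, about 0.01
conditional = a / (a + c)                # Pr(fcst yes | obs yes)   = 30/50   = 0.60
marginal_fcst = (a + b) / n              # Pr(fcst yes)             = 100/2800, about 0.04
marginal_obs = (a + c) / n               # Pr(obs yes)              = 50/2800, about 0.02

print(joint, conditional, marginal_fcst, marginal_obs)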

12 Verification attribute examples Miscellaneous issues Bias - (Marginal distributions) Correlation - Overall association (Joint distribution) Accuracy - Differences (Joint distribution) Calibration - Measures conditional bias (Conditional distributions) Discrimination - Degree to which forecasts discriminate between different observations (Conditional distribution) In order to be verified, forecasts must be formulated so that they are verifiable! Corollary: All forecasts should be verified if something is worth forecasting, it is worth verifying Stratification and aggregation Aggregation can help increase sample sizes and statistical robustness but can also hide important aspects of performance Most common regime may dominate results, mask variations in performance. Thus it is very important to stratify results into meaningful, homogeneous sub-groups Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 45 Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 46 Some key things to think about Some key things to think about Who wants to know? What does the user care about? kind of parameter are we evaluating? What are its characteristics (e.g., continuous, probabilistic)? thresholds are important (if any)? forecast resolution is relevant (e.g., site-specific, areaaverage)? are the characteristics of the obs (e.g., quality, uncertainty)? are appropriate methods? Why do we need to verify it? How do you need/want to present results (e.g., stratification/aggregation)? Which methods and metrics are appropriate? methods are required (e.g., bias, event frequency, sample size) Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 47 Copyright 2015, University Corporation for Atmospheric Research, all rights reserved 48

Resources: Verification Methods FAQ: /verification/; Verification Discussion Group: subscribe at listinfo/vx-discuss.

Categorical Verification. Tina Kalb, with contributions from Tara Jensen, Matt Pocernich, Eric Gilleland, Tressa Fowler, Barbara Brown and others.

Finley Tornado Data (1884). Forecast answering the question: will there be a tornado? (YES/NO). Observation answering the question: did a tornado occur? (YES/NO). Answers fall into one of two categories, so forecasts and observations are binary and the counts form a contingency table: forecast yes / observed yes = 28, forecast yes / observed no = 72, forecast no / observed yes = 23, forecast no / observed no = 2680; totals: 100 forecast yes, 2703 forecast no, 51 observed yes, 2752 observed no, 2803 cases in all.

A success? Percent Correct = (28 + 2680)/2803 = 96.6%! But what if the forecaster had never forecast a tornado? Then Percent Correct = (0 + 2752)/2803 = 98.2%! A success, maybe: accuracy is not the most informative statistic, but the contingency table concept is good.

The 2x2 contingency table: forecast yes / observed yes = Hit; forecast yes / observed no = False Alarm; forecast no / observed yes = Miss; forecast no / observed no = Correct Negative; plus the row totals (forecast yes, forecast no), column totals (observed yes, observed no), and grand total. Example: Accuracy = (Hits + Correct Negatives)/Total. MET supports both 2x2 and NxN contingency tables.
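The accuracy paradox above can be reproduced in a few lines. The Python sketch below uses the Finley counts (28 hits, 72 false alarms, 23 misses, 2680 correct negatives) and compares percent correct for the actual forecasts against a never-forecast strategy:

# Percent correct: Finley's forecasts versus "never forecast a tornado".
hits, false_alarms, misses, correct_negs = 28, 72, 23, 2680
n = hits + false_alarms + misses + correct_negs           # 2803 forecasts

pc_finley = (hits + correct_negs) / n                      # about 0.966
# If tornadoes are never forecast, every observed tornado becomes a miss
# and every false alarm becomes a correct negative.
pc_never = (0 + (correct_negs + false_alarms)) / n         # about 0.982

print(f"Finley: {pc_finley:.1%}   never-forecast: {pc_never:.1%}")
# The "never" strategy scores higher on accuracy, which is why accuracy
# alone is not informative for rare events.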

15 Common Notation (however not universal notation) What if data are not binary? Forecast Observed Yes No Total Yes a b a+b No c d c+d Total a+c b+d n Example: Accuracy = (a+d)/n Examples: Temperature < 0 C Precipitation > 1 inch CAPE > 1000 J/kg Ozone > 20 µg/m³ Winds at 80 m > 24 m/s 500 mb HGTS < 5520 m Radar Reflectivity > 40 dbz MSLP < 990 hpa LCL < 1000 ft Cloud Droplet Concentration > 500/cc Hint: Pick a threshold that is meaningful to your end-user Contingency Table for Freezing Temps (i.e. T<=0 C) Alternative Perspective on Contingency Table Forecast Observed <= 0C > 0C Total <= 0C a b a+b > 0C c d c+d Total a+c b+d n Correct Negatives False Alarms Misses Forecast = yes Observed = yes Another Example: Base Rate (aka sample climatology) = (a+c)/n Hits

Conditioning to form a statistic: considers the probability of one event given another event. Notation: p(X | Y=1) is the probability of X occurring given Y=1, or in other words Y=yes. Conditioning on the forecast provides information about how your forecast is performing, but gives an apples-to-oranges comparison when comparing stats from two models. Conditioning on the observations provides information about the ability of the forecast to discriminate between event and non-event (also called a conditional probability or likelihood), and gives an apples-to-apples comparison when comparing stats from two models.

Conditioning on forecasts (forecast = yes, f=1): p(x=1 | f=1) = a/(a+b) = fraction of hits; p(x=0 | f=1) = b/(a+b) = false alarm ratio.

Conditioning on observations (observed = yes, x=1): p(f=1 | x=1) = a/(a+c) = hit rate; p(f=0 | x=1) = c/(a+c) = fraction of misses.

What's considered good? Conditioning on the forecast: fraction of hits p(x=1 | f=1) = a/(a+b), close to 1; false alarm ratio p(x=0 | f=1) = b/(a+b), close to 0. Conditioning on the observations: hit rate p(f=1 | x=1) = a/(a+c), close to 1 [aka Probability of Detection Yes (PODy)]; fraction of misses p(f=0 | x=1) = c/(a+c), close to 0.

Examples of categorical scores (most based on conditioning): Hit Rate (PODy) = a/(a+c); False Alarm Ratio (FAR) = b/(a+b); PODn = d/(b+d) = 1 - POFD; False Alarm Rate, i.e. Probability of False Detection (POFD) = b/(b+d); (Frequency) Bias (FBIAS) = (a+b)/(a+c); Threat Score or Critical Success Index (CSI) = a/(a+b+c).

Examples of CTC calculations for the Finley data: Threat Score = 28/(28+72+23) = 0.228; Probability of Detection = 28/(28+23) = 0.55; False Alarm Ratio = 72/(28+72) = 0.72.

Relationships among scores: CSI is a nonlinear function of POD and FAR, 1/CSI = 1/POD + 1/(1-FAR) - 1; CSI also depends on the base rate (event frequency) and the bias, and can be written as CSI = POD/(1 + Bias - POD). Very different combinations of FAR and POD can lead to the same CSI value.
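A short Python sketch computing these categorical scores from the a, b, c, d counts, using the Finley table above, and checking the stated CSI relationship numerically:

# Categorical scores from a 2x2 contingency table (Finley counts).
a, b, c, d = 28, 72, 23, 2680           # hits, false alarms, misses, correct negatives

pod   = a / (a + c)                      # hit rate / PODy,              about 0.55
far   = b / (a + b)                      # false alarm ratio,            about 0.72
pofd  = b / (b + d)                      # probability of false detection
fbias = (a + b) / (a + c)                # frequency bias,               about 1.96
csi   = a / (a + b + c)                  # threat score / CSI,           about 0.23

# CSI rewritten in terms of POD and FAR (success ratio = 1 - FAR):
csi_from_pod_far = 1.0 / (1.0 / pod + 1.0 / (1.0 - far) - 1.0)
assert abs(csi - csi_from_pod_far) < 1e-12

print(f"POD={pod:.2f} FAR={far:.2f} POFD={pofd:.4f} Bias={fbias:.2f} CSI={csi:.2f}")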

HMT Performance Diagram: POD, 1-FAR (aka Success Ratio), CSI, and frequency bias shown on the same plot, with equal lines of CSI and equal lines of frequency bias; the best scores plot toward the upper right. Dots: scores aggregated over lead time. Colors: different thresholds (9-km ensemble mean 6-h precip > 0.1, 0.5, 1.0, and 2.0 in.). Results: decreasing skill with higher thresholds across multiple metrics. References: Roberts et al. (2011), Roebber (WAF, 2009), Wilson (presentation, 2008).

Skill Scores: how do you compare the skill of easy-to-predict events with difficult-to-predict events? A skill score provides a single value to summarize performance relative to a reference forecast, the best naive guess (persistence or climatology). The reference forecast must be comparable, and a perfect forecast implies that the object can be perfectly observed. Generic skill score: SS = (A - A_ref)/(A_perf - A_ref), where A is any measure, ref is the reference, and perf is perfect. For example, MSESS = 1 - MSE/MSE_climo, where MSE is the mean square error. A skill score is interpreted as the fractional improvement over the reference forecast. The reference could be climatology, persistence, your baseline forecast, etc.; the climatology could be a separate forecast or a gridded sample climatology. SS is typically positively oriented with 1 as optimal (a small numerical sketch follows below).

Commonly used skill scores: Gilbert Skill Score, based on the CSI corrected for the number of hits expected by chance; Heidke Skill Score, based on accuracy corrected for the number of hits expected by chance; Hanssen-Kuipers Discriminant (Peirce Skill Score), which measures the ability of the forecast to discriminate between (or correctly classify) events and non-events, H-K = POD - POFD; Brier Skill Score for probabilistic forecasts; Fractional Skill Score for neighborhood methods; Intensity-Scale Skill Score for wavelet methods.
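A minimal numerical sketch of the generic skill score, assuming made-up MSE values for a forecast and a climatological reference (the function name and numbers are illustrative only):

# Generic skill score: fractional improvement over a reference forecast.
def skill_score(a_fcst, a_ref, a_perfect):
    """SS = (A - A_ref) / (A_perfect - A_ref)."""
    return (a_fcst - a_ref) / (a_perfect - a_ref)

# MSE-based skill score: perfect MSE is 0, so SS_MSE = 1 - MSE / MSE_ref.
mse_model, mse_climo = 4.2, 9.0
ss_mse = skill_score(mse_model, mse_climo, 0.0)     # = 1 - 4.2/9.0, about 0.53
print(f"SS_MSE = {ss_mse:.2f}")                     # positive => skill over climatology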

19 Example Thank you! References: Jolliffe and Stephenson (2012): Forecast Verification: a practitioner s guide, Wiley & Sons, 240 pp. Wilks (2011): Statistical Methods in Atmospheric Science, Academic press, 467 pp. Stanski, Burrows, Wilson (1989) Survey of Common Verification Methods in Meteorology WMO Verification working group forecast verification web page, Verification of Continuous Forecasts Presented by Barbara Brown Adapted from presentations created by Barbara Casati and Barbara Brown Exploratory methods Scatter plots Discrimination plots Box plots Statistics Bias Error statistics Robustness Comparisons

20 Scatter-plot: plot of observation versus forecast values Perfect forecast = obs, points should be on the 45 o diagonal Provides information on: bias, outliers, error magnitude, linear association, peculiar behaviours in extremes, misses and false alarms (link to contingency table) Exploratory methods: joint distribution Quantile-quantile plots: OBS quantile versus the corresponding FRCS quantile Exploratory methods: marginal distribution Scatter-plot and qq-plot: example 1 Q: is there any bias? Positive (over-forecast) or negative (under-forecast)? Scatter-plot and qq-plot: example 2 Describe the peculiar behaviour of low temperatures

21 Scatter-plot: example 3 Describe how the error varies as the temperatures grow Does the forecast detect correctly temperatures above 18 degrees? Scatter-plot and Contingency Table Does the forecast detect correctly temperatures below 10 degrees? outlier Example Receiver Operating Characteristic Plot Create with points from PRC line type. Copyright 2015, UCAR, all rights reserved.

Example: box (and whisker) plot. Exploratory methods for the marginal distributions: visual comparison via histograms and box-plots, plus summary statistics. Location: mean = X_bar = (1/n) Σ_{i=1..n} x_i; median = q_0.5. Spread: st dev = sqrt( (1/n) Σ_{i=1..n} (x_i - X_bar)^2 ); Inter-Quartile Range = IQR = q_0.75 - q_0.25. These can be tabulated side by side for OBS and FCST (mean, median, stdev, IQR).

Exploratory methods for conditional distributions: conditional qq-plot, conditional histogram, and conditional box-plot.
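The location and spread statistics above are straightforward to compute with NumPy; the sketch below uses invented forecast and observation samples purely for illustration:

import numpy as np

# Summary statistics for the marginal distributions of obs and forecasts.
obs  = np.array([2.1, 3.4, 5.0, 5.8, 7.2, 8.9, 10.3])
fcst = np.array([1.5, 3.0, 4.2, 6.1, 7.9, 9.5, 12.0])

for name, x in (("OBS", obs), ("FCST", fcst)):
    mean = x.mean()
    median = np.median(x)                               # q_0.5
    stdev = x.std(ddof=0)
    iqr = np.percentile(x, 75) - np.percentile(x, 25)   # q_0.75 - q_0.25
    print(f"{name:5s} mean={mean:5.2f} median={median:5.2f} "
          f"stdev={stdev:5.2f} IQR={iqr:5.2f}")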

Continuous scores: linear bias. Linear bias = Mean Error = (1/n) Σ_{i=1..n} (f_i - o_i) = f_bar - o_bar. Attribute: measures the bias. The mean error is the average of the errors, i.e. the difference between the means. It indicates the average direction of the error: positive bias indicates over-forecast, negative bias indicates under-forecast. It does not indicate the magnitude of the error (positive and negative errors can cancel out). Bias correction: misses (false alarms) improve at the expense of false alarms (misses). Q: if I correct the bias in an over-forecast, do false alarms grow or decrease? And the misses? Good practice rules: the sample used for evaluating a bias correction should be consistent with the sample corrected (e.g. winter separated from summer); for fair validation, cross-validation should be adopted for bias-corrected forecasts.

Mean Absolute Error: MAE = (1/n) Σ_{i=1..n} |f_i - o_i|. Attribute: measures accuracy. Average of the magnitude of the errors; a linear score, so each error has the same weight. It does not indicate the direction of the error, just the magnitude.

Median Absolute Deviation: MAD = median |f_i - o_i|. Attribute: measures accuracy. Median of the magnitude of the errors; very robust, so extreme errors have no effect.

Continuous scores: MSE. MSE = (1/n) Σ_{i=1..n} (f_i - o_i)^2. Attribute: measures accuracy. Average of the squares of the errors: it measures the magnitude of the error, weighted on the squares of the errors; it does not indicate the direction of the error. A quadratic rule puts large weight on large errors: good if you wish to penalize large errors, but sensitive to large values (e.g. precipitation) and outliers, sensitive to large variance (high-resolution models), and it encourages conservative forecasts (e.g. climatology).

Continuous scores: RMSE. RMSE = sqrt(MSE) = sqrt( (1/n) Σ_{i=1..n} (f_i - o_i)^2 ). Attribute: measures accuracy. RMSE is the square root of the MSE: it measures the magnitude of the error while retaining the unit of the variable (e.g. deg C), so it can be plotted, for example, against forecast lead time for Model 1 versus Model 2. It has similar properties to the MSE: it does not indicate the direction of the error, and it is defined with a quadratic rule, hence sensitive to large values, etc. NOTE: RMSE is always larger than or equal to the MAE.

Continuous scores: linear correlation. r_XY = [ (1/n) Σ_{i=1..n} (y_i - y_bar)(x_i - x_bar) ] / [ sqrt((1/n) Σ (y_i - y_bar)^2) sqrt((1/n) Σ (x_i - x_bar)^2) ] = cov(Y,X)/(s_Y s_X), with y = forecast and x = observation. Attribute: measures association. It measures the linear association between forecast and observation; it is the rescaled (non-dimensional) covariance of Y and X and ranges in [-1, 1]. It is not sensitive to the bias. The correlation coefficient alone does not provide information on the inclination of the regression line (it says only whether the line is positively or negatively tilted); the observation and forecast variances are needed, and the slope coefficient of the regression line is given by b = (s_X/s_Y) r_XY. It is not robust (better if data are normally distributed) and not resistant (sensitive to large values and outliers).

Scores for continuous forecasts: the simplest overall measure of performance is the correlation coefficient, r_fx = Cov(f, x)/sqrt(Var(f) Var(x)) = Σ_{i=1..n} (f_i - f_bar)(x_i - x_bar) / [ (n-1) s_f s_x ].
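The continuous scores defined on this and the previous page can be computed directly from matched pairs; a NumPy sketch with invented sample data:

import numpy as np

# Continuous scores from matched forecast/observation pairs.
f = np.array([21.3, 19.8, 25.1, 17.0, 22.4])   # forecasts
o = np.array([20.0, 21.1, 24.0, 18.2, 21.0])   # observations
err = f - o

me   = err.mean()                    # linear bias (direction of error)
mae  = np.abs(err).mean()            # magnitude of error, linear weighting
mse  = (err ** 2).mean()             # magnitude of error, quadratic weighting
rmse = np.sqrt(mse)                  # same units as the variable; RMSE >= MAE
corr = np.corrcoef(f, o)[0, 1]       # linear association, insensitive to bias

print(f"ME={me:.2f} MAE={mae:.2f} MSE={mse:.2f} RMSE={rmse:.2f} r={corr:.2f}")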

Continuous scores: anomaly correlation. Correlation calculated on anomalies, where the anomaly is the difference between what was forecast (observed) and climatology. Centered and uncentered versions exist.

MSE and bias correction: MSE = (f_bar - o_bar)^2 + s_f^2 + s_o^2 - 2 s_f s_o r_fo, and equivalently MSE = ME^2 + var(f - o). MSE is the sum of the squared bias and the error variance, so the bias-corrected MSE is MSE - ME^2.

Continuous skill scores: MAE skill score. SS_MAE = (MAE - MAE_ref)/(MAE_perf - MAE_ref) = 1 - MAE/MAE_ref. Attribute: measures skill. A skill score measures the forecast accuracy with respect to the accuracy of a reference forecast: positive values = skill, negative values = no skill. It is the difference between the score and the reference forecast score, normalized by the score obtained for a perfect forecast minus the reference forecast score (for perfect forecasts MAE = 0). Reference forecasts: persistence (appropriate when the time-correlation > 0.5), sample climatology (information only a posteriori), actual climatology (information a priori).

Continuous skill scores: MSE skill score. SS_MSE = (MSE - MSE_ref)/(MSE_perf - MSE_ref) = 1 - MSE/MSE_ref. Attribute: measures skill. Same definition and properties as the MAE skill score: it measures accuracy with respect to a reference forecast, positive values = skill, negative values = no skill. It is sensitive to sample size (for stability) and to the sample climatology (e.g. extremes), so it needs large samples. Reduction of Variance: the MSE skill score with respect to climatology. If the sample climatology Y = X_bar is used, MSE_clim = s_X^2 and RV = 1 - MSE/s_X^2 = r_XY^2 - [r_XY - s_Y/s_X]^2 - [(Y_bar - X_bar)/s_X]^2, where the terms measure, respectively, the linear correlation, the reliability (related to the regression line slope coefficient b = (s_X/s_Y) r_XY), and the bias.
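The decomposition MSE = ME^2 + var(f - o) can be verified numerically; a short sketch reusing the invented sample pairs from the previous sketch:

import numpy as np

# Numerical check that MSE equals squared bias plus error variance.
f = np.array([21.3, 19.8, 25.1, 17.0, 22.4])
o = np.array([20.0, 21.1, 24.0, 18.2, 21.0])
err = f - o

mse = (err ** 2).mean()
me = err.mean()
var_err = err.var(ddof=0)            # population variance of the error

assert np.isclose(mse, me ** 2 + var_err)
print(f"MSE={mse:.3f} = ME^2 ({me**2:.3f}) + var(f-o) ({var_err:.3f})")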

Continuous skill scores: good practice rules. Use the same climatology for the comparison of different models. When evaluating the Reduction of Variance, the sample climatology always gives a worse skill score than a long-term climatology: always ask which climatology is used to evaluate the skill. If the climatology is calculated by pooling data from many different stations and times of the year, the skill score will be better than if a different climatology is used for each station and month of the year: in the former case the model gets credit for correctly forecasting seasonal trends and the climatologies of specific locations, while in the latter case the specific topographic effects and long-term trends are removed and the discriminating capability of the forecast is better evaluated. Choose the climatology appropriate for your verification purposes. Persistence forecast: use the same time of day to avoid diurnal cycle effects.

Continuous scores of ranks. Problem: continuous scores are sensitive to large values, or not robust. Solution: use the ranks of the variable rather than its actual values (e.g. ranking the temperatures from lowest to highest). The value-to-rank transformation diminishes the effects due to large values, transforms the distribution to a uniform distribution, and removes bias. Rank correlation is the most common rank-based score.

Linear Error in Probability Space: LEPS = (1/n) Σ_{i=1..n} | F_X(y_i) - F_X(x_i) |, where F_X is the cumulative frequency of the observations. The LEPS is a MAE evaluated using the cumulative frequencies of the observations, so errors in the tails of the distribution are penalized less than errors in the centre of the distribution. MAE and LEPS are minimized by the median correction.
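A minimal sketch of LEPS, assuming the empirical cumulative frequency of the observations as F_X (the sample data and the helper function are invented for illustration):

import numpy as np

# LEPS: MAE evaluated on the cumulative (probability) scale of the obs,
# so tail errors are penalized less than errors near the median.
obs  = np.array([1.0, 2.5, 3.0, 4.2, 5.5, 7.1, 9.0])
fcst = np.array([1.5, 2.0, 3.8, 4.0, 6.5, 7.0, 12.0])

def empirical_cdf(sample, values):
    """Fraction of the observed sample less than or equal to each value."""
    return np.searchsorted(np.sort(sample), values, side="right") / len(sample)

leps = np.mean(np.abs(empirical_cdf(obs, fcst) - empirical_cdf(obs, obs)))
print(f"LEPS = {leps:.3f}")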

27 What you can do with MET verification software depends on what type of data you have. The format (grid, point) of your data determines your MET tool(s). The type (continuous, binary) of your data determines the analyses to use within each tool. copyright 2015, UCAR, all rights reserved. copyright 2015, UCAR, all rights reserved. Gridded Forecasts (2D or 3D) Point Observations (2D or 3D) copyright 2015, UCAR, all rights reserved. copyright 2015, UCAR, all rights reserved.

28 Time If your forecasts and observations are not at the same time, you may need to define a time window for your observations. Gridded Observations (2D or 3D) Forecast Time Obs Obs Observation Window copyright 2015, UCAR, all rights reserved. copyright 2015, UCAR, all rights reserved. Matching Grids to Grids Must use some converter to put forecasts and observations on the same grid. (High resolution) Gridded Data for use with Neighborhood Methods observed forecast Example: copygb Intensity threshold exceeded where squares are blue copyright 2015, UCAR, all rights reserved. copyright 2015 UCAR, all rights reserved slide from Mittermaier

29 Gridded data to transform into Objects REAL observed Pixels (traditional Verification) or Pictures (Object Verification)? Forecast 1 Forecast 2 copyright 2015, UCAR, all rights reserved. Humans can pick out which objects exist and go together. In object based verification, we use software to mimic this process. REAL observed Forecast 1 Forecast 2 Examine spatial error field at different scales using wavelets Decompose with Wavelet copyright 2015 UCAR, all rights reserved

30 Data MET Tool Thank you! Gridded Forecasts Gridded Observations Gridded Forecasts Point Observations Tropical Cyclone A decks and B decks (both point observations) Grid stat (traditional or neighborhood) Series Analysis Wavelet Stat MODE Ensemble Tool Point Stat Ensemble Tool MET - TC copyright 2015, UCAR, all rights reserved. References: Jolliffe and Stephenson (2012): Forecast Verification: a practitioner s guide, Wiley & Sons, 240 pp. Wilks (2011): Statistical Methods in Atmospheric Science, Academic press, 467 pp. Stanski, Burrows, Wilson (1989) Survey of Common Verification Methods in Meteorology ses/msgcrs/index.htm WMO Verification working group forecast verification web page, Accounting for Uncertainty Statistical significance, confidence, uncertainty Tressa L. Fowler Observational Model Model parameters Physics Verification scores Sampling Verification statistic is a realization of a random process What if the experiment were re run under identical conditions? Would you get the same answer?

31 Uncertainty estimates are among a long list of important verification practices Well defined questions or goals. Large, representative, (identical?) sample. Consistent, independent observations. Appropriate methods and statistics. Uncertainty estimates. Spatial, temporal, and conditional differences evaluated. User relevant results. Thoroughly tested software. Define question(s) first. Then the confidence interval is around the right statistic. Which model is best? Is my model upgrade an improvement? How frequently are ceilings in the correct category? You can t fix by analysis what you bungled by design. Light, Singer and Willett. Two ways to examine scores Practical vs. statistical significance CI about Actual Scores may be difficult to differentiate model performance differences May not be the same. Why? Failure to use significant figures. Very large sample sizes. Stats assumes independent samples, but weather rarely delivers. Which do you need? Both! CI about Pairwise Differences may allow for better differentiation of model performance Model 2 Diff: Model 1 Model 2 Model 1 SS CIs do not encompass 0

Confidence Intervals (CIs). If we re-run the experiment N times and create N (1-α)100% CIs, then we expect the true value of the parameter to fall inside (1-α)100% of the intervals.

Types of confidence intervals: confidence intervals can be parametric or non-parametric. Bootstrap: available for almost any statistic; more robust to outliers; sensitive to lack of continuity and small samples. Parametric (normal): sensitive to departures from the assumed distribution; often sensitive to outliers; not available for some statistics.

Normal approximation CIs: the estimate plus or minus a standard normal variate times the standard error, theta_hat +/- z_(α/2) * se(theta_hat), is a (1-α)100% normal CI for the population ("true") parameter theta, where theta is the statistic of interest (e.g., the forecast mean), se(theta_hat) is the standard error for the statistic, and z_v is the v-th quantile of the standard normal distribution with v = α/2. A typical value of α is 0.05, so the (1-α)100% interval is referred to as the 95% normal CI.
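A sketch of the normal-approximation CI for a mean error, using SciPy's normal quantile function (the sample errors are invented; in practice MET and METViewer compute these intervals for you):

import numpy as np
from scipy.stats import norm

# (1 - alpha) normal CI for a sample mean error: theta_hat +/- z * se.
errors = np.array([0.8, -1.2, 0.5, 1.9, -0.3, 0.7, 1.1, -0.6, 0.2, 1.4])
alpha = 0.05

theta_hat = errors.mean()                        # statistic of interest (mean error)
se = errors.std(ddof=1) / np.sqrt(errors.size)   # standard error of the mean
z = norm.ppf(1 - alpha / 2)                      # about 1.96 for a 95% interval

ci_lower, ci_upper = theta_hat - z * se, theta_hat + z * se
print(f"ME = {theta_hat:.2f}, 95% CI = [{ci_lower:.2f}, {ci_upper:.2f}]")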

Application of normal approximation CIs. Independence assumption (i.e., iid), temporal and spatial: you should check the validity of the independence assumption; MET accounts for first-order temporal correlation. Normal distribution assumption: you should check the validity of the normal distribution (e.g., qq-plots or other methods); MET does not do this, so it should be done outside of MET. However, MET applies appropriate approaches to the verification statistics. Multiple testing: when computing many confidence intervals, the true significance levels are affected (reduced) by the number of tests that are done. The normal approximation is appropriate for numerous verification measures (examples: mean error, correlation, ACC, BASER, POD, FAR, CSI), and alternative CI estimates are available for other types of variables (examples: forecast/observation variance, GSS, HSS, FBIAS, Brier Score). All approaches expect the sample values to be independent and identically distributed.

(Nonparametric) Bootstrap CIs: based on the empirical distribution (histogram) of the statistic calculated on repeated samples, with the CI bounds taken from its quantiles (e.g. the lower and upper 5% points for a 90% CI). IID Bootstrap Algorithm: 1. Resample with replacement from the sample (forecast and observation pairs) x_1, x_2, ..., x_n. 2. Calculate the verification statistic(s) of interest from the resample in step 1. 3. Repeat steps 1 and 2 many times, say B times, to obtain a sample of the verification statistic(s) theta_1, ..., theta_B. 4. Estimate the (1-α)100% CIs from the sample in step 3.
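The four-step IID bootstrap above translates almost directly into code; a Python sketch for a bootstrap CI on RMSE, with synthetic forecast/observation pairs (illustration only; MET implements this internally):

import numpy as np

rng = np.random.default_rng(0)
f = rng.normal(20.0, 3.0, size=200)          # synthetic forecasts
o = f + rng.normal(0.5, 1.5, size=200)       # synthetic observations (biased)

def rmse(fcst, obs):
    return np.sqrt(np.mean((fcst - obs) ** 2))

B, alpha, stats = 1000, 0.10, []
n = f.size
for _ in range(B):
    idx = rng.integers(0, n, size=n)         # step 1: resample pairs with replacement
    stats.append(rmse(f[idx], o[idx]))       # step 2: recompute the statistic
                                             # step 3: repeat B times
lower, upper = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
print(f"RMSE = {rmse(f, o):.2f}, 90% bootstrap CI = [{lower:.2f}, {upper:.2f}]")  # step 4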

34 Bootstrap CI Considerations METViewer alternatives Number of points impacts speed of bootstrap Grid based typically uses more points than Point based THUS: Bootstrap is quicker with Point based Number of resamples impacts speed of bootstrap Recommended value is 1000 If you need to reduce try to determine where solutions converge to pick your value Bootstrap can be disabled in MET, if concerned about compute speed check status in config file before running Two types of parametric intervals available where appropriate. Accumulate scores (e.g. overall average), find parametric interval. Summarize scores (e.g. find average or median value of all daily POD values), find interval appropriate for average or median. Bootstrap the statistics for each field over time. Measures (between field) uncertainty of the estimates over time, rather than the within field uncertainty. Pairwise difference statistics and intervals (with event equalization). Gives more power to detect differences by eliminating case to case variability. Conclusions Uncertainty estimates are an essential part of good verification evaluations. All estimates are wrong, some estimates are useful. MET and METViewer developers strive to provide the most correct and useful intervals for output statistics. References and further reading Gilleland, E., 2010: Confidence intervals for forecast verification. NCAR Technical Note NCAR/TN-479+STR, 71pp. Available at: Jolliffe and Stephenson (2011): Forecast verification: A practitioner s guide, 2 nd Edition, Wiley & sons JWGFVR (2009): Recommendation on verification of precipitation forecasts. WMO/TD report, no.1485 WWRP Nurmi (2003): Recommendations on the verification of local weather forecasts. ECMWF Technical Memorandum, no. 430 Wilks (2012): Statistical methods in the atmospheric sciences, ch. 7. Academic Press See also Appendix C of MET Documentation:

35 Downloading and Compiling MET MET BASICS, PRE- PROCESSING AND POINT- BASED VERIFICATION Julie Prestopnik Release History MET was first released in 2007 METv6.1: Released December 04, 2017 Major Enhancements: Added specification of point obs using the variable name (vs. GRIB code) Added convert(x) config file option for conversion functions (convert(x)= C_to_F(x);) Added censor_thresh and censor_val config file options for filtering thresholds and replacement values Added new Economic Cost-Loss Value (ECLV) STAT line type derived from CTC or PCT lines Added new Gradient (GRAD) STAT line type for statics derived from gradients, including the S1 score Added new Relative Position (RELP) STAT line type Option to compute 1-D Fourier Decomposition when generating CNT, SL1L2, SAL1L2, VL1L2, and VAL1L2 output lines Added initial support for binned climatologies by adding climo_stdev and climo_cdf_bins config file options Other Enhancements: Support for Gaussian grids and Polar Stereographic CF-Compliant files Shape option added to config files to define a square or circle area Added support for reading BUFR files directly, rather than just PREPBUFR MET Users METv6.1: Released December 04, 2017 Pre-installed on tutorial machines registered MET users from 132 countries 48/27/14/11%: University/Gov t/nonprofit/private 30/16/6/3%: USA/China/India/Brazil On-line and hands-on tutorials

Downloading MET: download a MET release and compile locally; register and download from the MET website. Language: primarily C++ with calls to a Fortran library. Supported platforms and compilers: 1. Linux with GNU compilers; 2. Linux with Portland Group (PGI) compilers; 3. Linux with Intel compilers.

Dependencies. REQUIRED: C++/Fortran compilers (GNU, PGI, Intel); GNU Make utility; Unidata's NetCDF4 library (both NetCDF-C and NetCDF-CXX); HDF5 library (required to support NetCDF4); NCEP's BUFRLIB library; GNU Scientific Library (GSL); Z library (zlib). OPTIONAL: GRIB2 C library with JASPER and PNG libraries; HDF4 and HDF-EOS2 libraries for the MODIS-Regrid tool; Cairo and FreeType libraries for the MODE-Graphics tool. RECOMMENDED: Unified Post-Processor; COPYGB (included with the Unified Post-Processor); wgrib and wgrib2; R statistics and graphics package.

Directory structure: README - installation instructions and release notes; configure (plus supporting files and subdir) - used by the autoconf build process, configures the MET package for installation on a system; bin/ - built MET executables; data/ - map data, colortables, sample input data, GRIB and GRIB2 table files, and default configuration files; doc/ - MET User's Guide; out/ - output generated by the test scripts; scripts/ - test scripts to be run after building MET; src/ - MET source code.

37 Building MET Steps for building MET: 1. Build required/optional libraries. Same family of compilers for MET 2. Download and unpack latest MET patches. 3. autoconf determines available compilers, but can be explicitly set by the user 4. Set environment variables in.cshrc or equivalent file Paths for HDF5,, BUFRLIB, and GSL libraries The compilation of various tools can be turned on/off 5. Configure the installation for your system and run configure The compilation of various tools can be turned on/off 6. Run make install and make test and check for runtime errors. Test scripts run each of the MET tools at least once. Uses sample data distributed with the tarball. Existing MET Builds MET User s Page Download -> Existing MET Builds Cheyenne module use /glade/p/ral/jnt/met/met_releases/cheyenne/modulefiles module load met/6.1 Theia module use /contrib/modulefiles module load met/6.1 MET Flowchart Configuration Files MET tools controlled using command line options and configuration files Well commented and documented in MET User s Guide Easy to modify Distributed with the tarball Configuration files control things such as: Fields/levels to be verified Thresholds to be applied Interpolation methods to be used Verification methods to be applied Regions over which to accumulate statistics README file in data/config area describes various settings

38 Graphics Limited graphics incorporated into MET Options for plotting MET statistical output R, NCL, IDL, GNUPlot, and many others Sample plotting scripts on MET website Future METViewer database/display system R Statistics and Graphics The R Project for Statistical Computing ( Powerful statistical analysis and plotting tools Large and growing user community Freely available and well supported for Linux/Windows/Mac Sample R plotting and analysis scripts posted on the MET website Use R to plot data in the practical sessions Use Case: Point Verification Getting Started with MET A Typical Use Case Julie Prestopnik

39 Use Case: 2-m Temperature Verify 2-m temperature versus PREPBUFR point observations. Initialized with 3-hourly output to 48h Run WRFOUT through UPP to create GRIB Unidata s IDV for display Use Case: Find Observations MET downloads page Select Observation Datasets NCAR CISL Site: DS337.0 Global PREPBUFR point observations Use Case: Get Observations Use Case: Process Observations Register and log in NCAR CISL Site: DS337.0 Select through Pull daily PREPBUFR files Can view by Faceted Browse or Complete File List (shown here) GDAS PREPBUFR 4/day in 6-hr chunks nr = not restricted Run each through pb2nc All observation types Full 6-hr time window Quality marker of 9 Only over CONUS

40 Use Case: Plot 2-m TMP Obs 2-m TMP obs in ADPSFC message type Run plot_point_obs utility to display obs prepbufr.gdas t00z.nc Use Case: Verify 2-m TMP Fcst Run point_stat to compare gridded forecast to point observations. Once/fcst time Obs +/- 5 min Bilinear interp Score over FULL model domain Continuous statistics line (incl RMSE) Use Case: Continuous Statistics Use Case: Time Series of RMSE One STAT output file for each lead time Continuous line type (CNT) including CI s Column 48hr fcst Column 48hr fcst TOTAL 771 MAE FBAR MSE FSTDEV BCMSE OBAR RMSE OSTDEV E PR_CORR E hr RMSE = 2.35 ME E ESTDEV E MBIAS E

41 Use Case: Conclusions Diurnal cycle for surface temperature RMSE best in the morning (perfect RMSE = 0) RMSE worst in the evening Performance degrades as lead time increases Too little data to make strong conclusions This is a single initialization Verify/aggregate over a month, season, or year File Formats and Pre-Processing File Formats Pre-processing Tools Useful Links Supported File Formats FILE FORMATS Forecasts GRIB1 GRIdded Binary file GRIB2 GRIB version 2 disabled by default (--enable-grib2) Output from wrf_interp WRF-ARW utility, CF-Compliant versions 3 and 4, and internal MET format Gridded Analyses Same as Forecast file formats GRIB Stage II/IV, MRMS, URMA, Model Analyses WWMCA World Wide Merged Cloud Analysis TRMM Tropical Rainfall Measuring Mission MODIS Moderate-Resolution Imaging Spectroradiometer Point Observations PREPBUFR binary data assimilation product (NDAS or GDAS) MET specific 11-column, little-r, SURFRAD, WWSIS, Aeronet MADIS Metar, Raob, Profiler, Maritime, Mesonet, or acarsprofiles LIDAR - CALIPSO

42 Data Inventory Tools wgrib dumps GRIB1 headers and data. PRE-PROCESSING TOOLS wgrib2 dumps GRIB2 headers and data. ncdump - dumps headers and data. ncview plots gridded data. GrADS command line interface to produce plots. NCL command line interface to produce plots. IDV gui-driven visualization of many gridded and point datasets Pre-Processing / Reformatting Data Reformating Tools Input Reformat Plot Statistics Gridded Forecast Analysis Obs MODIS Data WWMCA Data Point PrepBufr Point MADIS Point Lidar HDF PCP Combine Regrid Data Plane MODIS Regrid WWMCA Regrid 2NC PB2NC MADIS2NC LIDAR2NC Gen VxMask Shift Data Plane Gridded Point Obs Plot Data Plane WWMCA Plot PS Plot Point Obs MTD Series Analysis MODE Wavelet Stat Grid Stat Ensemble Stat Point Stat PS STAT PS STAT STAT STAT Analysis Plot MODE Field MODE Analysis Stat Analysis PNG STAT MET-TC DLand TC PAIRS TCST TC STAT Land Data File TC DLAND ATCF Track Data PB2NC, 2NC, MADIS2NC, LIDAR2NC Reformat point observations to the format expected by Point-Stat and Ensemble-Stat. MODIS_Regrid, WWMCA_Regrid Regrid HDF MODIS or binary WWMCA observations to the gridded format expected by the MET statistics tools. Regrid_Data_Plane Regrid one or more gridded data fields to user-specified grid. PCP_Combine Add, subtract, or sum precipitation values across multiple gridded data files and write to the gridded format expected by the MET statistics tools. GSI Diag GSI Tools

43 1. PB2NC Tool PB2NC PREPBUFR PB2NC Stands for PREPBUFR to Functionality: Filters and reformats binary PREPBUFR and BUFR point observations into intermediate format. Configuration file specifies: Observation types, variables, locations, elevations, quality marks, and times to retain or derive for use in Point-Stat or Ensemble-Stat. Data formats: Reads PREPBUFR and BUFR using NCEP s BUFRLIB. Writes point as input to Point-Stat or Ensemble- Stat. BUFR is the World Meteorological Organization (WMO) standard binary code for the representation and exchange of observational data. The PREPBUFR format is produced by NCEP for analyses and data assimilation. The system that produces this format: Assembles observations dumped from a number of sources Encodes information about the observational error for each data type background (first guess) interpolation for each data location Performs both rudimentary multi-platform quality control and more complex platform-specific quality control North American and Global datasets Only works with NCEP datasets with embedded tables. Support for external BUFR tables coming soon PB2NC: Usage PB2NC PB2NC: Run PB2NC 171 Usage: pb2nc prepbufr_file netcdf_file config_file [-pbfile prepbufr_file] [-valid_beg time] [-valid_end time] [-nmsg n] [-index] [-dump path] [-log file] [-v level] [-compress level] prepbufr_file netcdf_file config_file -pbfile -valid_beg -valid_end -nmsg -index -dump Input PrepBufr file name Output file name PB2NC configuration file Additional input PrepBufr files Beginning/Ending of valid time window [YYYYMMDD_[HH[MMSS]] Number of PrepBufr messages to process Lists available BUFR variables Dump entire contents o PrepBufr file to file in path Output file for log messages -log -v Level of logging -compress Compression level 172 met-6.1/bin/pb2nc \ ndas.t00z.prepbufr.tm nr \ out/tutorial_pb.nc PB2NCConfig_tutorial -v 2 ==> append : to filename to view the data source BUFR 230ADPUPA UPPER-AIR (RAOB, PIBAL, RECCO, DROPS) REPORTS 231AIRCAR MDCRS ACARS AIRCRAFT REPORTS 232AIRCFT AIREP/PIREP, AMDAR(ASDAR/ACARS), E-ADAS(AMDAR BUFR) ACF233SATWND SATELLITE-DERIVED WIND REPORTS 234PROFLR WIND PROFILER REPORTS 235VADWND VAD (NEXRAD) WIND REPORTS 236SATEMP TOVS SATELLITE DATA (SOUNDINGS, RETRIEVALS, RADIANCES) 237ADPSFC SURFACE LAND (SYNOPTIC, METAR) REPORTS 238SFCSHP SURFACE MARINE (SHIP, BUOY, C-MAN PLATFORM) REPORTS 239SFCBOG MEAN SEA-LEVEL PRESSURE BOGUS REPORTS 240SPSSMI SSM/I RETRIEVAL PRODUCTS (REPROCESSED WIND SPEED, TPW) 241SYNDAT SYNTHETIC TROPICAL CYCLONE BOGUS REPORTS 242ERS1DA ERS SCATTEROMETER DATA (REPROCESSED WIND SPEED) 243GOESND GOES SATELLITE DATA (SOUNDINGS, RETRIEVALS, RADIANCES) 244QKSWND QUIKSCAT SCATTEROMETER DATA (REPROCESSED WIND SPEED) 245MSONET MESONET SURFACE REPORTS (COOPERATIVE NETWORKS) 246GPSIPW GLOBAL POSITIONING SATELLITE- INTEGRATED PRECIP. WATER 247RASSDA RADIO ACOUSTIC SOUNDING SYSTEM (RASS) TEMP PROFILE RPTSM063000BYTCNT What obs are in a PREPBUFR file? >less \ ndas.t00z.prepbufr.tm nr

44 2. 2NC Tool 2NC 2NC: Usage 2NC Stands for to Functionality: Reformat point observations into intermediate format. Multiple input formats supported (11-column, little-r, SURFRAD, WWSIS, and Aeronet). Configuration file optional to define time summaries and message type mappings for little-r. Data formats: Reads various input formats and writes point as input to Point-Stat and Ensemble-Stat. Support for additional standard formats may be added as time and funding allow. Usage: ascii2nc ascii_file netcdf_file [-format ascii_format] [-config file] [-mask_grid string] [-mask_poly file] [-mask_sid file list] [-log file] [-v level] [-compress level] ascii_file netcdf_file -format string -config file -mask_grid string -mask_poly file -mask_sid file list Input file name Output file name met_point, little_r, surfrad, wwsis, aeronet Optional configuration file name Retain points within a named grid or gridded data file. Retain points within a lat/lon polyline. Retain a list of station ID s MET-Point Format 2NC MET-Point Format 2NC 175 Msg STID ValidTime Lat Lon Elev Var Lvl Hgt QC Ob Ob assigns value to variable ADPUPA _ NA 1618 *HGT ADPUPA _ NA *TMP ADPUPA _ NA *DPT ADPUPA _ NA 92 *RH ADPUPA _ *MixRat ADPUPA _ *HGT ADPUPA _ *TMP Msg Message type STID Station ID ValidTime Valid time for observation Lat Latitude [North] Lon Longitude [East] Elev Elevation [m] (Note: currently not used by MET code so can be filled with ) Var Lvl Hgt QC flag Ob GRIB code or variable name (i.e. AccPrecip or 61, MSLP or 2, Temp or 11, etc ) Pressure [mb] or Accumulation Interval [hr] Height above Mean Sea Level [m MSL] (Note: currently not used by MET code so can be filled with ) Quality control flag value Observed value * Use a value of "-9999" to indicate missing data 176 Msg STID ValidTime Lat Lon Elev Var Lvl Hgt QC Ob ADPUPA _ HGT NA 1618 ADPUPA _ TMP NA ADPUPA _ DPT NA ADPUPA _ RH NA 92 ADPUPA _ MIXR ADPUPA _ HGT ADPUPA _ TMP Msg Message type STID Station ID ValidTime Valid time for observation Lat Latitude [North] Lon Longitude [East] Elev Elevation [m] (Note: currently not used by MET code so can be filled with ) Var Lvl Hgt QC flag Ob GRIB code or variable name (i.e. AccPrecip or 61, MSLP or 2, Temp or 11, etc ) Pressure [mb] or Accumulation Interval [hr] Height above Mean Sea Level [m MSL] (Note: currently not used by MET code so can be filled with ) Quality control flag value Observed value * Use a value of "-9999" to indicate missing data

45 2NC: Run met-6.1/bin/ascii2nc sample_obs.txt sample_ascii.nc -v 2 netcdf sample_ascii { dimensions: mxstr = 15 ; hdr_arr_len = 3 ; obs_arr_len = 5 ; nhdr = 5 ; nobs = UNLIMITED ; // (2140 currently) variables: char hdr_typ(nhdr, mxstr) ; hdr_typ:long_name = "message type" ; char hdr_sid(nhdr, mxstr) ; hdr_sid:long_name = "station identification" ; char hdr_vld(nhdr, mxstr) ; hdr_vld:long_name = "valid time" ; hdr_vld:units = "YYYYMMDD_HHMMSS UTC" ; float hdr_arr(nhdr, hdr_arr_len) ; Result of ncdump h Result of ncdump v obs_arr hdr_arr:long_name = "array of observation station header values" ; hdr_arr:_fill_value = f ; hdr_arr:columns = "lat lon elv" ; ; float obs_arr(nobs, obs_arr_len) ; obs_arr:long_name = "array of observation values" ; obs_arr:_fill_value = f ; obs_arr:columns = "hdr_id gc lvl hgt ob" ; obs_arr:hdr_id_long_name = "index of matching header data" ; ; obs_arr = 0, 7, 837, 1618, 1618, 1, 11, 837, 1618, , 2, 17, 837, 1618, , 3, 52, 837, 1618, 92, 4, 53, 837, 1618, , 5, 7, 826, 1724, 1724, 6, 11, 826, 1724, , 7, 17, 826, 1724, , 8, 52, 826, 1724, 84, 9, 53, 826, 1724, , 10, 7, 815.3, 1829, 1829, 11, 11, 815.3, 1829, , 12, 17, 815.3, 1829, , 13, 52, 815.3, 1829, 45, 14, 53, 815.3, 1829, , 15, 7, 815, 1832, 1832, 16, 11, 815, 1832, , 17, 17, 815, 1832, , 18, 52, 815, 1832, 44, 19, 53, 815, 1832, , 20, 7, 784.7, 2134, 2134, 21, 11, 784.7, 2134, , 22, 17, 784.7, 2134, , 23, 52, 784.7, 2134, 47, 2NC 3. MADIS2NC Tool Stands for MADIS to Functionality: Reformat MADIS point observations into intermediate format. No configuration file. Data formats: Reads MADIS METAR, ROAB, Profiler, Maritime, Mesonet, or acarsprofiles types. Writes point as input to Point-Stat or Ensemble- Stat. MADIS2NC MADIS2NC: Usage Usage: madis2nc madis_file out_file -type str [-qc_dd list] [-lvl_dim list] [-rec_beg n] [-rec_end n] [-mask_grid string] [-mask_poly file] [-mask_sid file list] [-log file] [-v level] [-compress level] madis_file out_file -type str -qc_dd list -lvl_dim list -rec_beg n -rec_end n -mask_grid string -mask_poly file -mask_sid file list Input MADIS file name Output file name metar, raob, profiler, maritime, mesonet, or acarsprofiles QC flag values to be accepted (Z,C,S,V,X,Q,K,G,B) Vertical level dimensions to be processed First MADIS record to process Last MADIS record to process Retain points within a named grid or gridded data file. Retain points within a lat/lon polyline. Retain a list of station ID s. MADIS2NC 180 MADIS2NC: Run met-6.1/bin/madis2nc \ profiler_ _1800.nc test.nc -type profiler -v 2 DEBUG 1: Reading MADIS File: profiler_ _1800.nc DEBUG 1: Writing MET File: test.nc DEBUG 2: Processing PROFILER recs = 22 DEBUG 2: Rejected based on QC = 0 DEBUG 2: Rejected based on fill = 1674 DEBUG 2: Retained or derived = 1494 Result of ncdump v obs_arr obs_arr = 0, 33, -9999, 1000, , 0, 34, -9999, 1000, , 0, 33, -9999, 1250, , 0, 34, -9999, 1250, , 0, 33, -9999, 2250, , 0, 34, -9999, 2250, , 0, 33, -9999, 2500, , 0, 34, -9999, 2500, , 0, 33, -9999, 3750, , 0, 34, -9999, 3750, , 1, 33, -9999, 500, , 1, 34, -9999, 500, , 1, 33, -9999, 750, , 1, 34, -9999, 750, , 1, 33, -9999, 1000, , 1, 34, -9999, 1000, , 1, 33, -9999, 1250, , 1, 34, -9999, 1250, , 1, 33, -9999, 1500, , 1, 34, -9999, 1500, , 1, 33, -9999, 1750, , 1, 34, -9999, 1750, , MADIS2NC

46 4. Regrid_Data_Plane Tool Regrid Data Plane Regrid-Data-Plane: Usage Regrid Data Plane 181 Functionality: Stand-alone tool implementing the automated regridding capability of the MET statistics tools. Extract one or more user-specified fields from the input data file. Regrid to the output grid using the specified interpolation method and width. No configuration file. Data formats: Reads any MET supported gridded data file (i.e. GRIB1/2 and flavors of ). Writes gridded as input to the MET statistics tools. 182 Usage: regrid_data_plane input_filename to_grid to_grid output_filename -field string [-method type] [-width n] [-shape type] [-vld_thresh n] -width n [-name list] [-log file] [-v level] [-compress level] input_filename output_filename -field string -method type -shape type -vld_thresh n -name list Input gridded data file name Output grid as a named grid, gridded data file, or grid specification Output file name Input field configuration string (may be used multiple times) Interpolation method Interpolation shape (SQUARE or CIRCLE) Interpolation width Interpolation required valid data ratio Output variable name(s) Regrid-Data-Plane: Run Regrid Data Plane 5. PCP-Combine Tool PCP Combine met-6.1/bin/regrid_data_plane \ in.grb G212 tmp_p500_g212.nc \ -field name= TMP ; level= P500 ; met-6.1/bin/regrid_data_plane \ in.grb gfs.t06z.pgrb2full.0p50.f078 \ surface_winds.nc \ -field name= UGRD ; level= Z10 ; \ -field name= VGRD ; level= Z10 ; \ -field name= WIND ; level= Z10 ; \ -name UWind,VWind,WindSpeed Stands for Precip-Combine Functionality: Mathematically combines precipitation fields across multiple files. Add precipitation over 2 files 2 NMM output files to go from 3-hr to 6-hr accumulation. Sum precipitation over more than 2 files 12 WSR-88D Level II data to go from 5 min accumulation to 1-hr accumulation. Subtract precipitation in 2 files 2 ARW output files to go from 12 hr accumulations to 6 hour accumulation Specify field name on the command line. No configuration file. Data formats: Reads GRIB1, GRIB2, or pinterp or CF compliant format. Writes gridded as input to stats tools

47 PCP-Combine: Usage PCP Combine PCP-Combine: Sum PCP Combine Usage: pcp_combine [-sum sum_args] or [-add add_args] or [-subtract sub_args] [-field string] [-name variable_name] [-log file] [-v level] [-compress level] -sum -add -subtract -field -name Accumulates data over multiple files. Sum_args: (init_time, in_accum, valid_time, out_accum, out_file, -pcpdir path, -pcprx reg_exp) Accumulates data over two files. Add_args: (in_file1, Accum1, in_file2, Accum2, out_file). Subtracts data over two files. Sub_args: (in_file1, Accum1, in_file2, Accum2, out_file). Defines the data to be extracted from the input files. Name of combined variable in output file. Two examples of the sum option 1) Sum two 6-hourly accumulation forecast files into a single 12-hour accumulation forecast. met-6.1/bin/pcp_combine \ -sum _ _ sample_fcst.nc -pcpdir data/ ) Summing 12 1-hourly accumulation observation files into a single 12-hour accumulated observation. met-6.1/bin/pcp_combine \ -sum _ \ _ \ sample_obs.nc -pcpdir data/st2ml PCP-Combine: Add and Subtract PCP Combine PCP-Combine: Example #1 PCP Combine Use -add option for already binned precipitation: Adding two 6-hourly accumulation forecast files into a single 12-hour accumulation forecast. met-6.1/bin/pcp_combine add \ _ grb 6 \ _ grb 6 \ APCP_12_ _ nc Use -subtract option for runtime accumulations: Subtract 36 hour accumulation minus 12 hour accumulation for 24 hours in between. met-6.1/bin/pcp_combine subtract \ nam_ _f036.grib 36 \ nam_ _f012.grib 12 \ nam_ _f036_apcp_24.nc _ hr acc Graphics produced using ncview _ hr acc _ hr acc
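For intuition, the add and subtract options are simple grid-point arithmetic on accumulation fields: the subtract example above recovers the 24-hour accumulation between forecast hours 12 and 36 by differencing the two runtime accumulations. A toy numpy sketch of that arithmetic (values are illustrative only; PCP-Combine also handles the metadata and file I/O for you):

# Conceptual sketch of what "pcp_combine -subtract" computes: the 24-h
# accumulation between forecast hours 12 and 36 is the 36-h total minus the
# 12-h total at every grid point.  Values below are illustrative only.
import numpy as np

apcp_36 = np.array([[10.0, 4.2], [0.0, 7.5]])   # 0-36 h accumulation (mm)
apcp_12 = np.array([[ 3.0, 1.2], [0.0, 2.5]])   # 0-12 h accumulation (mm)

apcp_24 = apcp_36 - apcp_12                      # 12-36 h accumulation
print(apcp_24)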

48 SPECIALIZED SATELLITE PRE-PROCESSING TOOLS 6. MODIS-Regrid Tool Depends on HDF4/HDFEOS libraries. Compilation disabled by default (--enable-modis) Functionality: Reformat MODIS satellite observations into intermediate format. No configuration file. MODIS Regrid Data formats: Reads MODIS level 2 data. Writes gridded as input to the MET statistics tools MODIS-Regrid: Usage MODIS Regrid MODIS-Regrid: Run MODIS Regrid Usage: modis_regrid -data_file path -field name -out path -scale value -offset value -fill value [-units text] [-compress level] modis_file -data_file path -field name -out path -scale value -offset value -fill value -units text modis_file Gridded data file defining output grid Field to process in MODIS file, e.g. temperature Output file name Scale factor to use Offset factor Bad data value Units string to be written to the output file Input file MODIS file name met-6.1/bin/modis_regrid -field Cloud_Fraction \ -data_file grid_file -out t2.nc \ -units percent -scale offset 0 -fill 127 \ ~/modis_regrid_test_data/modisfile

49 7. WWMCA-Regrid Tool WWMCA Regrid WWMCA-Regrid: Usage WWMCA Regrid Functionality: Reformat Air Force binary World Wide Merged Cloud Analysis into intermediate format. No configuration file. Data formats: Reads binary WWMCA files. Writes gridded as input to the MET statistics tools. Usage: wwmca_regrid -out filename -config filename -nh filename [pt_filename] -sh filename [pt_filename] [-log file] [-v level] [-compress level] -out filename -config filename -nh filename -sh filename [pt_filename] Output file name Configuration file name Northern Hemisphere data file Southern Hemisphere data file Pixel time files for the Northern and Southern hemispheres to mask data by pixel age WWMCA-Regrid: Run WWMCA Regrid 8. LIDAR2NC Tool LIDAR2NC met-6.1/bin/wwmca_regrid \ -config WWMCARegridConfig \ -nh WWMCA_TOTAL_CLOUD_PCT_NH_ \ -sh WWMCA_TOTAL_CLOUD_PCT_SH_ \ -out WWMCA_TOTAL_CLOUD_PCT_ _GFS_LATLON.nc \ -v 2 Stands for LIDAR to Depends on HDF4/HDFEOS libraries. Compilation disabled by default (--enable-lidar2nc) Functionality: Reformat LIDAR point observations into intermediate format. No configuration file. Data formats: Reads CALIPSO Lidar data. Writes point as input to Point-Stat or Ensemble- Stat. Support for additional LIDAR formats may be added as time and funding allow

50 LIDAR2NC: Usage LIDAR2NC LIDAR2NC: Run LIDAR2NC 197 Usage: lidar2nc lidar_file -out out_file [-log file] [-v level] [-compress level] lidar_file Input LIDAR HDF file name -out out_file Output file name 198 met-6.1/bin/lidar2nc \ CAL_LID_L2_05kmCLay-Prov-V T ZN.hdf \ -out CAL_LID_L2_05kmCLay-Prov-V T ZN.nc DEBUG 1: Processing Lidar File: data/lidar_data/cal_lid_l2_05kmclay-prov-v t zn.hdf DEBUG 1: Writing MET File: tutorial/out/lidar2nc/cal_lid_l2_05kmclay-prov-v t zn.nc DEBUG 2: Processing Lidar points = 3728 obs_arr = 0, 500, _, 0, 1, 0, 501, , , , 0, 502, , , , 0, 503, , , 0, 0, 504, , , 100, 0, 601, , , 2, 0, 602, , , 0, 0, 603, , , 0, 0, 604, , , 3, 0, 600, , , 2, 0, 601, , , 2, 0, 505, , , , 0, 506, , , , 1, 500, _, 0, 1, 1, 501, , , , 1, 502, , , , 1, 503, , , 0, 1, 504, , , 100, 1, 601, , , 2, 1, 602, , , 0, 1, 603, , , 0, Point-Stat Tool Input Reformat Plot Statistics Analysis MET-TC Point-Stat Tool Gridded Forecast Analysis Obs MODIS Data WWMCA Data Point PrepBufr Point MADIS Point Lidar HDF PCP Combine Regrid Data Plane MODIS Regrid WWMCA Regrid 2NC PB2NC MADIS2NC LIDAR2NC Gen VxMask Shift Data Plane Gridded Point Obs Plot Data Plane WWMCA Plot PS Plot Point Obs MTD Series Analysis MODE Wavelet Stat Grid Stat Ensemble Stat Point Stat PS STAT PS STAT STAT STAT Plot MODE Field MODE Analysis Stat Analysis PNG STAT DLand TC PAIRS TCST TC STAT Land Data File TC DLAND ATCF Track Data GSI Diag GSI Tools

51 Point-Stat: Overview Compare gridded forecasts to point observations. Accumulate matched pairs over a defined area at a single point in time. Verify one or more variables/levels. Analysis tool provided to aggregate through time. Verification methods: Continuous statistics for raw fields. Single and Multi-Category counts and statistics for thresholded fields. Parametric and non-parametric confidence intervals for statistics. Compute partial sums for raw fields and/or the raw matched pair values. Methods for probabilistic forecasts. HiRA spatial verification method. Point-Stat: Input/Output Input Files Gridded forecast file GRIB1 output of Unified Post-Processor (or other) GRIB2 from NCEP (or other) from PCP-Combine, wrf_interp, or CF-compliant Point observation file output of PB2NC, 2NC, MADIS2NC, or LIDAR2NC configuration file Output Files statistics file with all output lines (end with.stat ) Optional files sorted by line type with a header row (ends with _TYPE.txt ) Point-Stat: Usage Usage: point_stat fcst_file obs_file config_file [-point_obs netcdf_file] [-obs_valid_beg time] [-obs_valid_end time] [-outdir path] [-log file] [-v level] fcst_file Gridded forecast file obs_file NC point observation file config_file configuration file -point_obs Additional NC point observation files -obs_valid_beg Beginning of valid time window for matching -obs_valid_end End of valid time window for matching -outdir Output directory to be used -log Optional log file -v Level of logging Point-Stat: Configuration Many configurable parameters only set a few: 2-meter temperature. Threshold temperatures near freezing. Match to obs at the surface. Accumulate stats over all the points in the domain. Match observation to the nearest forecast value. Generate all output line types other than vector and probabilistic. fcst = { message_type = [ "ADPSFC"]; field = [ { name = "TMP"; level = [ "Z2" ]; cat_thresh = [ >273.0, >283.0, >293.0 ]; } ]; }; obs = fcst; mask = { grid = [ "FULL" ]; poly = []; sid = ""; }; interp = { vld_thresh = 1.0; type = [ { method = UW_MEAN; width = 1; } ]; }; output_flag = { fho = BOTH; ctc = BOTH; cts = BOTH; mctc = BOTH; mcts = BOTH; cnt = BOTH; sl1l2 = BOTH; sal1l2 = BOTH; vl1l2 = NONE; val1l2 = NONE; pct = NONE; pstd = NONE; pjc = NONE; prc = NONE; eclv = NONE; mpr = BOTH; };

52 Point-Stat: HiRA Framework Point-Stat: Input High Resolution Assessment (HiRA) verification logic is applied to deterministic forecasts matched to point observations. Evaluate neighborhood fraction of events as a probability forecast. As with all neighborhood methods, allows for some spatial / temporal uncertainty in either model or observation by giving credit for being close. Allows for comparison of deterministic and ensemble forecasts via the same set of probabilistic statistics. Also allows for comparison of models with different grid resolutions via adjustment of neighborhood size. Mittermaier, 2014 Model Forecast White boxes = 0 Colored boxes > 0 Threshold Forecast Blue boxes = event HiRA Proportion 1x1 Neighborhood: 1/1 3x3 Neighborhood: 1/9 5x5 Neighborhood: 4/25 hira = { // Enable or disable flag = TRUE; // Neighborhood sizes (parity logic) width = [ 2, 3, 4, 5 ]; // Probability thresholds cov_thresh = [ ==0.25 ]; // Neighborhood shape shape = SQUARE; }; 2-meter TMP (IDV) 4003 TMP ADPSFC Obs (plot_point_obs) Point-Stat: Run met-6.1/bin/point_stat \ sample_fcst.grb sample_pb.nc \ PointStatConfig_TMPZ2 -outdir out -v 2 DEBUG 1: Default Config File: met-6.1/share/met/data/config/pointstatconfig_default DEBUG 1: User Config File: PointStatConfig_TMPZ2 DEBUG 1: Forecast File: sample_fcst.grb DEBUG 1: Climatology File: none DEBUG 1: Observation File: sample_pb.nc DEBUG 2: DEBUG 2: Reading data for TMP/Z2. DEBUG 2: For TMP/Z2 found 1 forecast levels and 0 climatology levels. DEBUG 2: DEBUG 2: Searching observations from 9396 messages. DEBUG 2: DEBUG 2: Processing TMP/Z2 versus TMP/Z2, for observation type ADPSFC, over region FULL, for interpolation method UW_MEAN(1), using 4003 pairs. DEBUG 2: Computing Categorical Statistics. DEBUG 2: Computing Multi-Category Statistics. DEBUG 2: Computing Continuous Statistics. DEBUG 2: Computing Scalar Partial Sums. DEBUG 2: DEBUG 1: Output file: out/point_stat_360000l_ _120000v.stat DEBUG 1: Output file: out/point_stat_360000l_ _120000v_fho.txt DEBUG 1: Output file: out/point_stat_360000l_ _120000v_ctc.txt DEBUG 1: Output file: out/point_stat_360000l_ _120000v_cts.txt DEBUG 1: Output file: out/point_stat_360000l_ _120000v_mctc.txt DEBUG 1: Output file: out/point_stat_360000l_ _120000v_mcts.txt DEBUG 1: Output file: out/point_stat_360000l_ _120000v_cnt.txt DEBUG 1: Output file: out/point_stat_360000l_ _120000v_sl1l2.txt DEBUG 1: Output file: out/point_stat_360000l_ _120000v_sal1l2.txt DEBUG 1: Output file: out/point_stat_360000l_ _120000v_mpr.txt Point-Stat: Output Types Statistics line types: 16 possible Categorical Single Threshold Contingency table counts and stats (FHO, CTC, CTS, ECLV) Categorical Multiple Thresholds NxN Contingency table counts and stats (MCTC, MCTS) Continuous - raw fields Continuous statistics (CNT) Partial Sums (SL1L2, SAL1L2, VL1L2, VAL1L2) Probabilistic Nx2 Contingency table counts and stats (PCT, PSTD) Continuous statistics and ROC curve (PJC, PRC) Economic Cost/Loss value (ECLV) Matched pairs Raw matched pairs a lot of data! (MPR) 22 header columns common to all line types Remaining columns specific to each line type
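The HiRA proportions in the example above (1/1, 1/9, 4/25) are simply the fraction of forecast grid points in the neighborhood around the observation that exceed the event threshold. A toy numpy sketch of that calculation, using an illustrative 5x5 field and threshold that are not taken from the slide; only odd box widths are used here to keep the centering simple.

# Minimal sketch of the HiRA fractional coverage: for each neighborhood size,
# count the fraction of forecast grid points in the box around the observation
# location that exceed the event threshold.  Toy field, illustrative only.
import numpy as np

fcst = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 5.0, 1.0, 0.0],
    [0.0, 0.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
])
threshold = 1.0
i, j = 2, 2          # grid point nearest the observation

for width in (1, 3, 5):
    half = width // 2
    box = fcst[i - half:i + half + 1, j - half:j + half + 1]
    frac = np.mean(box > threshold)
    print(f"{width}x{width} neighborhood: fraction = {frac:.3f}")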

53 Point-Stat: Sample Output Point-Stat: CTC Output Line 1. STAT file output for sample run: 1 line each for CNT, SL1L2, MCTC, MCTS 3 lines each for FHO, CTC, CTS 4,003 lines for MPR! 2. Additional TXT files for each line type Output file: out/point_stat_360000l_ _120000v.stat Output file: out/point_stat_360000l_ _120000v_fho.txt Output file: out/point_stat_360000l_ _120000v_ctc.txt Output file: out/point_stat_360000l_ _120000v_cts.txt Output file: out/point_stat_360000l_ _120000v_mctc.txt Output file: out/point_stat_360000l_ _120000v_mcts.txt Output file: out/point_stat_360000l_ _120000v_cnt.txt Output file: out/point_stat_360000l_ _120000v_sl1l2.txt Output file: out/point_stat_360000l_ _120000v_sal1l2.txt Output file: out/point_stat_360000l_ _120000v_mpr.txt VERSION V6.1 MODEL WRF DESC NA FCST_LEAD FCST_VALID_BEG _ FCST_VALID_END _ OBS_LEAD OBS_VALID_BEG _ OBS_VALID_END _ FCST_VAR TMP FCST_LEV Z2 OBS_VAR TMP OBS_LEV Z2 OBTYPE ADPSFC VX_MASK FULL INTERP_MTHD UW_MEAN INTERP_PNTS 1 FCST_THRESH > OBS_THRESH > COV_THRESH NA ALPHA NA LINE_TYPE CTC TOTAL 4003 FY_OY (hits) 3111 FY_ON (f.a.) 78 FN_OY (miss) 215 FN_ON (c.n.) 599 Point-Stat: Matched Pairs Matched Pair (MPR) line type contains 1 line for each matched pair. Data overload! TOTAL INDEX OBS_SID OBS_LAT OBS_LON OBS_LVL OBS_ELV FCST OBS OBS_QC NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA GRIDDED VERIFICATION
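The CTS statistics reported by Point-Stat are derived from exactly these CTC counts. As a check on the sample CTC line above, the sketch below recomputes a few of them (POD, FAR, CSI, GSS) in Python from the four counts, using the standard contingency-table definitions.

# Minimal sketch: compute categorical scores from the CTC counts in the sample
# line above (FY_OY=3111, FY_ON=78, FN_OY=215, FN_ON=599).  These correspond
# to statistics Point-Stat writes to the CTS line type.
hits, false_alarms, misses, corr_negs = 3111, 78, 215, 599
total = hits + false_alarms + misses + corr_negs

pod = hits / (hits + misses)                       # probability of detection
far = false_alarms / (hits + false_alarms)         # false alarm ratio
csi = hits / (hits + false_alarms + misses)        # critical success index
hits_random = (hits + misses) * (hits + false_alarms) / total
gss = (hits - hits_random) / (hits + false_alarms + misses - hits_random)

print(f"POD={pod:.3f}  FAR={far:.3f}  CSI={csi:.3f}  GSS={gss:.3f}")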

54 What can Stat Analysis do? Tina Kalb Stat-Analysis Tool Filtering Summarizing Aggregating of Grid-Stat, Point-Stat, Ensemble-Stat & Wavelet-Stat output Questions to MET Help - Can I get Q: Overall statistics for gridded observations compared to forecasts, hours 0-24? A: Using Stat Analysis Tool on Grid- Stat output Q: Long-term statistics at individual sites (e.g., MAE or RMS error, daily forecasts for a month)? A: Using Stat Analysis Tool on Point- Stat output Q: Contingency table statistics aggregated over multiple runs? A: Using Stat Analysis Tool on any output Q: Statistics aggregated for a large number (N) of individual stations in one simultaneous run? A: It would be cumbersome. You would have to configure Stat Analysis Tool to run (N) number of jobs A: OR use METViewer tool. Stat Analysis Tool Statistics MODE Wavelet Stat Grid Stat Ensemble Stat Point Stat PS STAT PS STAT STAT STAT Analysis MODE Analysis Stat Analysis User Defined Display User Graphics Package User Graphics Package For Stat Analysis Tool: MET provides the analysis in output. You provide the graphing / plotting capability. Stat Analysis Jobs Filtering (filter) filters out lines from one or more stat files filters based on user-specified filtering options. Summarizing (summary) Summary information from a single data column Includes mean, standard deviation, min, max, IQR, percentiles (0th, 25th, 50th, 75th, and 90 th ) Customized tool for AFWA (go_index) computes GO Index, performance statistic used primarily by the US Air Force Ramp Computes amount of change from one time to next Changes thresholded to produce contingency table

55 Stat Analysis Jobs Aggregation aggregate - aggregates stat data across multiple time steps or masking regions. Output line type is same as input line type (i.e. SSVAR = SSVAR) aggregate_stat aggregates across multiple times/regions then calculates statistics. Output line is different from input line types. Valid line type combinations include: -line_type -out_line_type FHO, CTC yields CTS MCTC yields MCTS SL1L2, SAL1L2 yields CNT VL1L2, VAL1L2 yields WDIR PCT yields PSTD, PJC, PRC NBRCTC yields NBRCTS MPR yields FHO, CTC, CTS, MCTC, MCTS, CNT, SL1L2, SAL1L2, PCT, PSTD, PJC, PRC Stat Analysis Tool: Usage Usage: stat_analysis -lookin path [-out filename] [-tmp_dir path] [-v level] -config config_file or job at command line options with associated arguments [filter] [summary] [aggregate] [aggregate_stat] [go_index] -lookin Path to *.stat files this can be a directory or a single file name (Use one or more times) -out Output name for file -tmp_dir Folder for temporary files -v Level of logging -config StatAnalysisConfig file filter summary aggregate aggregate_stat go_index See previous 2 slides See previous 2 slides See previous 2 slides See previous 2 slides See previous 2 slides Stat-Analysis: Configuration Stat Analysis Tool: Run job aggregate Many configurable parameters only set a few: 10-m U-component of wind. Aggregate stats over DTC165 and DTC166 regions Accumulate only CTCs calculated using Distance- Weighted Mean interpolation Dump lines included in accumulation Dump aggregation to file -OR - can put it all in the jobs area fcst_var = ["UGRD"]; obs_var = []; fcst_lev = []; obs_lev = []; obtype = []; vx_mask = ["DTC165", "DTC166"]; interp_mthd = ["DW_MEAN"]; jobs = [ "-job filter -line_type CTC -dump_row outdir/job_filter_ctc_ugrd.stat", "-job aggregate -line_type CTC -dump_row outdir/job_aggregate_ctc_ugrd.stat" ]; -OR - jobs = [ "-job filter -line_type CTC dump_row out/job_filter_ctc_ugrd.stat", "-job aggregate -line_type CTC -fcst_var UGRD -vx_mask DTC165 -vx_mask DTC166 -interp_mthd DW_MEAN -dump_row out/job_aggregate_ctc_ugrd.stat" ]; "-job aggregate -line_type CTC -fcst_var UGRD -vx_mask DTC165 -vx_mask DTC166 -interp_mthd DW_MEAN -dump_row out/job_aggregate.stat" Stat Analysis Filter Output in job_aggregate.stat V4.1 WRF _ _ _ _ UGRD Z10 UGRD Z10 ADPSFC DTC165 DW_MEAN 9 >=5.000 >=5.000 NA NA CTC V4.1 WRF _ _ _ _ UGRD Z10 UGRD Z10 ADPSFC DTC166 DW_MEAN 9 >=5.000 >=5.000 NA NA CTC (NOTE: header modified to show only pertinent info) F C S T F C S T OBS Y N Y N OBS Y N Y N
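If you prefer to prototype in Python rather than configure Stat-Analysis jobs, the same kind of filtering and aggregation can be sketched with pandas on the per-line-type text output, since those files carry a header row. A minimal sketch, assuming a hypothetical _ctc.txt file; the file name is an assumption, and the filter values mirror the configuration example above.

# Minimal sketch of what a "-job aggregate -line_type CTC" job does, using
# pandas on a CTC text output file (which has a header row).  The file name
# below is hypothetical; the filters mirror the Stat-Analysis config above.
import pandas as pd

df = pd.read_csv("out/some_run_ctc.txt", sep=r"\s+")

# Keep only the variable and masking regions of interest, then pool the counts.
subset = df[(df["FCST_VAR"] == "UGRD") &
            df["VX_MASK"].isin(["DTC165", "DTC166"])]
pooled = subset[["TOTAL", "FY_OY", "FY_ON", "FN_OY", "FN_ON"]].sum()
print(pooled)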

56 Stat Analysis Tool: Run job aggregate "-job aggregate -line_type CTC -fcst_var UGRD -vx_mask DTC165 -vx_mask DTC166 -interp_mthd DW_MEAN -dump_row out/job_aggregate.stat" Stat Analysis Output in the file specified by out flag (i.e. stat_analysis.out) JOB_LIST: -job aggregate -fcst_var UGRD -vx_mask DTC165 - vx_mask DTC166 -interp_mthd DW_MEAN -line_type CTC -dump_row out/aggregate2.stat COL_NAME: TOTAL FY_OY FY_ON FN_OY FN_ON CTC: F C S T OBS Y N Y N Stat Analysis Tool: Run job aggregate_stat "-job aggregate_stat -line_type CTC out_line_type CTS -fcst_var UGRD - vx_mask DTC165 -vx_mask DTC166 -interp_mthd DW_MEAN -dump_row out/job_aggregate_stat.stat" Aggregate_stat Output (stat_analysis.out continued) COL_NAME: TOTAL BASER BASER_NCL BASER_NCU BASER_BCL BASER_BCU FMEAN FMEAN_NCL FMEAN_NCU FMEAN_BCL FMEAN_BCU ACC ACC_NCL ACC_NCU ACC_BCL ACC_BCU FBIAS FBIAS_BCL FBIAS_BCU PODY PODY_NCL PODY_NCU PODY_BCL PODY_BCU PODN PODN_NCL PODN_NCU PODN_BCL PODN_BCU POFD POFD_NCL POFD_NCU POFD_BCL POFD_BCU FAR FAR_NCL FAR_NCU FAR_BCL FAR_BCU CSI CSI_NCL CSI_NCU CSI_BCL CSI_BCU GSS GSS_BCL GSS_BCU HK HK_NCL HK_NCU HK_BCL HK_BCU HSS HSS_BCL HSS_BCU ODDS ODDS_NCL ODDS_NCU ODDS_BCL ODDS_BCU CTS: NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA F C S T OBS Y N Y N Base Rate: 0.04 Freq Bias: 1.27 PODY: 0.35 FAR: 0.72 CSI: 0.18 GSS: 0.15 Stat Analysis Tool: Run job summary -job summary -fcst_var UGRD -interp_mthd DW_MEAN -line_type CTS -column GSS -dump_row out/job_summary.stat # Description 1 Column Name Summary 2 Total Mean* Includes normal and bootstrap upper and lower confidence limits 8-10 Standard deviation** Includes bootstrap upper and lower confidence limits 11 Minimum value th percentile th percentile Median (50 th percentile) th percentile th percentile Maximum value Summary Output (stat_analysis.out cont.) COL_NAME: TOTAL MEAN MEAN_NCL MEAN_NCU MEAN_BCL MEAN_BCU STDEV STDEV_BCL STDEV_BCU MIN P10 P25 P50 P75 P90 MAX SUMMARY: Use your favorite plotting software Stat Analysis User Graphics Package MET provides the analysis in ascii ouput. You provide the graphing / plotting capability.

57 Stat_Analysis Example User Contributed Plotting Scripts 03/01/ /30/2013 vx_mask = [ EPOCH ] Jobs: aggregate PCT aggregate_stat, PCT to PRC aggregate_stat PCT to PSTD Please feel free to send your contributions to met_help@ucar.edu Gen-Vx-Mask Tool Input Reformat Plot Statistics Analysis MET-TC Gen-Vx-Mask Tool Gridded Forecast Analysis Obs MODIS Data WWMCA Data Point PrepBufr Point MADIS Point Lidar HDF PCP Combine Regrid Data Plane MODIS Regrid WWMCA Regrid 2NC PB2NC MADIS2NC LIDAR2NC Gen VxMask Shift Data Plane Gridded Point Obs Plot Data Plane WWMCA Plot PS Plot Point Obs MTD Series Analysis MODE Wavelet Stat Grid Stat Ensemble Stat Point Stat PS STAT PS STAT STAT STAT Plot MODE Field MODE Analysis Stat Analysis PNG STAT DLand TC PAIRS TCST TC STAT Land Data File TC DLAND ATCF Track Data GSI Diag GSI Tools 228

58 229 Gen-Vx-Mask: Overview Generate Verification Mask Replaces earlier Gen-Poly-Mask and Gen-Circle-Mask tools Purpose: Generate mask once for a domain and use the output many times. Functionality: Generate a 0/1 bitmap mask field to define which grid points are included in statistics. Support multiple masking methods. Run iteratively to define complex masking region. Define mask once prior to running the MET statistics tools. No configuration file. Data formats: Reads gridded data files. Reads formatted lat/lon file. Writes gridded output mask file. Accumulated Precip Accumulated Precip CONUS more points 230 Gen-Vx-Mask: Usage Usage: gen_vx_mask input_file mask_file out_file [-type string] [-input_field string] [-mask_field string] [-complement][-union] [-intersection][-symdiff] [-thresh string] [-height n][-width n] [-value n] [-name string] [-log file] [-v level] [-compress level] input_file mask_file out_file -type string -input_field -mask_field -complement -union -intersection -symdiff -thresh -height, -width -value, -name Defines grid for the mask Defines the spatial masking area Output file name poly, box, circle, track, grid, data, solar_alt, or solar_azi Field for initial value at each grid point (instead of 0) Field for data masking Define complement of the mask Control logic for combining -input_field and current mask Threshold for circle, track, data, solar_alt, and solar_azi types Height and width for box type Output mask value and variable name Gen-Vx-Mask: Types Gen-Vx-Mask: Lat/Lon Types mask_file = Lat/Lon file Polyline (poly) Box Circle Track mask_file = gridded data file Grid Data Lat or Lon mask_file = gridded data file or timestamp Solar Altitude (solar_alt) Solar Azimuth (solar_azi) gen_vx_mask wrfprs_ruc13_12.tm00 MyLatLonPoints.txt \ poly_mask.nc type poly poly poly_mask.nc circle circle thresh le box -width 15 box_mask.nc track track thresh le250 MyLatLonPoints circle_mask_no_thresh.nc circle_mask_with_thresh.nc track_mask_no_thresh.nc track_mask_with_thresh.nc 0.000
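For intuition about the poly masking type, the 0/1 bitmap is just a flag for whether each grid point falls inside the lat/lon polygon. The conceptual sketch below illustrates the idea on a toy lat/lon grid using numpy and matplotlib.path; gen_vx_mask itself handles real grids, projections, and the other masking types, and all values here are illustrative.

# Conceptual sketch of "poly" masking: flag grid points inside a lat/lon
# polygon as 1 and everything else as 0.  Illustration only; gen_vx_mask does
# this (and much more) for you.
import numpy as np
from matplotlib.path import Path

lats = np.linspace(25.0, 50.0, 51)
lons = np.linspace(-110.0, -70.0, 81)
lon2d, lat2d = np.meshgrid(lons, lats)

# Polygon vertices as (lon, lat) pairs -- illustrative values only.
poly = Path([(-100.0, 30.0), (-80.0, 30.0), (-80.0, 45.0), (-100.0, 45.0)])

points = np.column_stack([lon2d.ravel(), lat2d.ravel()])
mask = poly.contains_points(points).reshape(lat2d.shape).astype(int)
print("grid points inside the mask:", mask.sum())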

59 Gen-Vx-Mask: Data File Types Gen-Vx-Mask: Timestamp Types gen_vx_mask wrfprs_ruc13_12.tm00 d01_ _02400.grib \ grid_mask.nc -type grid gen_vx_mask wrfprs_ruc13_12.tm _12 \ solar_alt_mask.nc -type solar_alt data -mask_field name= PRES ; data_mask.nc level= L0 ; grid grid_mask.nc solar_alt solar_alt_mask_no_thresh.nc solar_azi solar_azi_mask_no_thresh.nc data -thresh le90000 data mask with thresh.nc lat lat_mask_no_thresh.nc lat thresh ge25&&le solar_alt -thresh gt0 data_mask_with_thresh.nc solar_azi -thresh le90 solar_azi_mask_with_thresh.nc Gen-Vx-Mask: Storm Following Gen-Vx-Mask: Set Logic Land == 1 Sandy 200km Not Land && Sandy Complex masking definition including storm-following masking -complement -intersection

60 More on Masking and Interpolation Randy Bullock


63 Grid-Stat Tool 251

64 Grid-Stat Tool Input Reformat Plot Statistics Gridded Forecast Analysis Obs MODIS Data WWMCA Data Point PrepBufr Point MADIS Point Lidar HDF GSI Diag PCP Combine Regrid Data Plane MODIS Regrid WWMCA Regrid 2NC PB2NC MADIS2NC LIDAR2NC GSI Tools Gen VxMask Shift Data Plane Gridded Point Obs Plot Data Plane WWMCA Plot PS Plot Point Obs MTD Series Analysis MODE Wavelet Stat Grid Stat Ensemble Stat Point Stat PS STAT PS STAT STAT STAT Analysis Plot MODE Field MODE Analysis Stat Analysis PNG STAT DLand MET-TC TC PAIRS TCST TC STAT Land Data File TC DLAND ATCF Track Data Grid-Stat: Overview Compare gridded forecasts to gridded observations on the same grid. Accumulate matched pairs over a defined area at a single point in time. Verify one or more variables/levels. Analysis tool provided to aggregate through time. Verification methods: Continuous statistics for raw fields. Single and Multi-Category counts and statistics for thresholded fields. Parametric and non-parametric confidence intervals for statistics. Compute partial sums for raw fields. Methods for probabilistic forecasts. Continuous statistics and categorical counts/statistics using neighborhood verification method. Grid-Stat: Common Grid Model Forecast StageIV Analysis Regrid the StageIV Analysis (GRIB) to the model domain: copygb xg" " \ ST h ST h_regrid Automated regridding in configuration file. Practice running copygb in the practical session. Grid-Stat: Input/Output Input Files Gridded forecast and observation files GRIB1 output of Unified Post-Processor (or other) GRIB2 from NCEP (or other) from PCP-Combine, wrf_interp, or CF-compliant configuration file Output Files statistics file with all output lines (end with.stat ) Optional files sorted by line type with a header row (ends with _TYPE.txt ) Optional matched pairs file

65 Grid-Stat: Usage Grid-Stat: Configuration Usage: grid_stat fcst_file obs_file config_file [-outdir path] [-log file] [-v level] fcst_file Gridded forecast file obs_file Gridded observation file config_file configuration file -outdir Output directory to be used -log Optional log file -v Level of logging Many configurable parameters only set a few: Precipitation accumulated over 24 hours. GRIB1 forecast observation Threshold any rain and moderate rain (mm). Accumulate stats over all the points in the domain and just the eastern United States. Compute neighborhood statistics with two sizes. Generate all possible output types, except probabilistic. fcst = { field = [ { name = "APCP"; level = [ "A24" ]; cat_thresh = [ >0.0, >20.0 ]; } ]; }; obs = { field = [ { name = "APCP_24"; level = [ "(*,*)" ]; cat_thresh = [ >0.0, >20.0 ]; } ]; }; mask = { grid = [ "FULL" ]; poly = [ "EAST.poly" ]; }; output_flag = { fho = BOTH; ctc = BOTH; cts = BOTH; mctc = BOTH; mcts = BOTH; nbrhd = { cnt = BOTH; vld_thresh = 1.0; sl1l2 = BOTH; width = [ 3, 5 ]; vl1l2 = BOTH; cov_thresh = [ >=0.5 ]; pct = NONE; } pstd = NONE; pjc = NONE; prc = NONE; nbrctc = BOTH; nbrcts = BOTH; nbrcnt = BOTH; Copyright 2018, University Corporation for Atmospheric Research, }; all rights reserved Grid-Stat: Field name and level GRIB1 and GRIB2 files name = GRIB Abbreviation ; TMP for Temperature, APCP for accumulated precipitation. level = [ string ]; Multiple values expand to multiple vx tasks Level indicator followed by level value. A for accumulation interval in HH[MMSS] format (A06). P for pressure level (P500) or layer (P ). Z for vertical level (Z2 or Z10). L for generic level type (L100). R for a specific GRIB record number (R225). Gridded files name = string ; Defines variable name. level = [ string ]; Defines index into dimensions. For APCP_06(lat,lon) from PCP-Combine output name = APCP_06 ; level = [ (*,*) ]; For TT(Time, num_metgrid_levels, south_north, west_east) from p_interp name = TT ; level = [ (0,0,*,*), (0,1,*,*), (0,2,*,*) ]; Grid-Stat: Config File Defaults MET Statistics tools parse up to 4 configuration files: 1. MET_BASE/config/ConfigConstants defines configuration file constants (e.g. NONE, STAT, BOTH) and should not be modified. 2. MET_BASE/config/ConfigMapData defines default map data for all plots (map data files, line colors, widths, and types for Plot-Point-Obs, Plot-Data-Plane, Wavlet-Stat, and MODE). 3. MET_BASE/config/GridStatConfig_default defines default settings for the specific tool. 4. User-specific configuration file passed on the command line override default settings. NOTE: When running a shared installation of MET, override default settings in the user-specific configuration file rather than modifying the system-wide defaults.

66 Grid-Stat: Run met-6.0/bin/grid_stat \ sample_fcst.grb sample_obs.nc \ GridStatConfig_APCP24 -outdir out -v 2 DEBUG 1: Default Config File: met-6.0/share/met/data/config/gridstatconfig_default DEBUG 1: User Config File: GridStatConfig_APCP24 DEBUG 1: Forecast File: sample_fcst.grb DEBUG 1: Observation File: sample_obs.nc DEBUG 2: DEBUG 2: Processing APCP/A24 versus APCP_A24, for interpolation method UW_MEAN(1), over region FULL, using 6412 pairs DEBUG 2: Computing Categorical Statistics. DEBUG 2: Computing Multi-Category Statistics. DEBUG 2: Computing Continuous Statistics. DEBUG 2: Processing APCP/A24 versus APCPA24, for interpolation method UW_MEAN(1), over region EAST, using 2582 pairs. DEBUG 2: Processing APCP/A24 versus APCPA24, for interpolation method NBRHD(9), raw thresholds of >0.000 and >0.000, over region EAST, using 5829 pairs. DEBUG 2: Computing Neighborhood Categorical Statistics. DEBUG 2: Computing Neighborhood Continuous Statistics. MORE NEIGHBORHOOD VERIFICATION TASKS LISTED DEBUG 2: DEBUG 1: Output file: out/grid_stat_240000l_ _000000v.stat DEBUG 1: Output file: out/grid_stat_240000l_ _000000v_fho.txt DEBUG 1: Output file: out/grid_stat_240000l_ _000000v_ctc.txt DEBUG 1: Output file: out/grid_stat_240000l_ _000000v_cts.txt DEBUG 1: Output file: out/grid_stat_240000l_ _000000v_mctc.txt DEBUG 1: Output file: out/grid_stat_240000l_ _000000v_mcts.txt DEBUG 1: Output file: out/grid_stat_240000l_ _000000v_cnt.txt DEBUG 1: Output file: out/grid_stat_240000l_ _000000v_sl1l2.txt DEBUG 1: Output file: out/grid_stat_240000l_ _000000v_vl1l2.txt DEBUG 1: Output file: out/grid_stat_240000l_ _000000v_nbrctc.txt DEBUG 1: Output file: out/grid_stat_240000l_ _000000v_nbrcts.txt DEBUG 1: Output file: out/grid_stat_240000l_ _000000v_nbrcnt.txt DEBUG 1: Output file: out/grid_stat_240000l_ _000000v_pairs.nc Grid-Stat: Output Types Statistics line types: 17 possible Same as Point-Stat FHO, CTC, CTS, MCTC, MCTS, CNT SL1L2, SAL1L2, VL1L2, VAL1L2 PCT, PSTD, PJC, PRC Neighborhood apply threshold, define neighborhood Neighborhood continuous statistics (NBRCNT) Neighborhood contingency table counts (NBRCTC) Neighborhood contingency table statistics (NBRCTS) 22 header columns common to all line types Remaining columns specific to each line type Grid-Stat: Sample Output Grid-Stat: CTC Output Line 1. STAT file output for sample run: 2 lines each for CNT, MCTC, MCTS, and SL1L2 = 2 verification regions (FULL and EAST) 4 lines each for FHO, CTC, and CTS = 2 regions * 2 thresholds 8 lines each for NBRCNT, NBRCTC, NBRCTS = 2 regions * 2 thresholds * 2 neighborhood sizes 2. Additional TXT files for each line type 3. file containing matched pairs VERSION V6.0 MODEL WRF DESC NA FCST_LEAD FCST_VALID_BEG _ FCST_VALID_END _ OBS_LEAD OBS_VALID_BEG _ OBS_VALID_END _ FCST_VAR APCP_24 FCST_LEV A24 OBS_VAR APCP_24 OBS_LEV A24 OBTYPE MC_PCP VX_MASK EAST INTERP_MTHD UW_MEAN INTERP_PNTS 1 FCST_THRESH > OBS_THRESH > COV_THRESH NA ALPHA NA LINE_TYPE CTC TOTAL 2582 FY_OY (hits) 5 FY_ON (f.a.) 104 FN_OY (miss) 70 FN_ON (c.n.) 2403

67 Grid-Stat: Matched Pairs Forecast, observation, and difference fields for each combination of Variable, level, masking region, and interpolation method (smoothing) Sample output contains 6 fields: FCST, OBS, and DIFF for FULL and EAST FCST_FULL OBS_FULL DIFF_FULL FCST_EAST OBS_EAST DIFF_EAST // // matched // pairs output file // nc_pairs_flag = { latlon = TRUE; raw = TRUE; diff = TRUE; climo = TRUE; weight = FALSE; nbrhd = FALSE; apply_mask = TRUE; } Comparing Different Fields Grid-Stat and Point-Stat may be used to compare two different variables. User must interpret results. Example: Convective Precip vs. Total Precip Configuration file settings: Selecting variable/levels fcst = { field = [ { name = "ACPCP"; level = [ "A24" ]; cat_thresh = [ >0.0 ]; } ]; }; obs = { field = [ { name = "APCP"; level = [ "A24" ]; cat_thresh = [ >0.0 ]; } ]; }; Grid Point Weighting Climatologies // The "grid_weight_flag" specifies how grid weighting should be applied // - "NONE" to disable grid weighting using a constant weight (default). // - "COS_LAT" to define the weight as the cosine of the grid point latitude. // This an approximation for grid box area used by NCEP and WMO. // - "AREA" to define the weight as the true area of the grid box (km^2). Required for anomaly correlation (ANOM_CORR) Monthly 2.5 degree match_day = FALSE Experimental daily 1.0 degree match_day = TRUE Any other reference forecast. Add support for binned climatologies in met-6.1. // // Climatology mean data // climo_mean = { } file_name = [ // List of file names ]; field = [ // Same length as fcst.field ]; regrid = { method = NEAREST; width = 1; vld_thresh = 0.5; } time_interp_method = DW_MEAN; match_day = FALSE; time_step = 21600;
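As a rough illustration of the COS_LAT option, the sketch below applies cosine-of-latitude weights to a handful of grid-point values before averaging; the AREA option works the same way but with true grid-box areas. Values are illustrative only.

# Minimal sketch of "COS_LAT" grid weighting: weight each grid point by the
# cosine of its latitude before averaging, so high-latitude points do not get
# more influence than their area warrants.  Toy values only.
import numpy as np

lats = np.array([20.0, 40.0, 60.0, 80.0])    # degrees
errors = np.array([1.0, 2.0, 3.0, 4.0])      # some per-point statistic

weights = np.cos(np.deg2rad(lats))
weighted_mean = np.sum(weights * errors) / np.sum(weights)

print("unweighted mean:", errors.mean())
print("cos(lat) weighted mean:", round(weighted_mean, 3))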

68 Series-Analysis Tool SERIES ANALYSIS, ENSEMBLES AND PROBABILITY Series-Analysis Tool Series-Analysis: Overview Input Reformat Plot Statistics Gridded Forecast Analysis Obs MODIS Data WWMCA Data Point PrepBufr Point MADIS Point Lidar HDF PCP Combine Regrid Data Plane MODIS Regrid WWMCA Regrid 2NC PB2NC MADIS2NC LIDAR2NC Gen VxMask Shift Data Plane Gridded Point Obs Plot Data Plane WWMCA Plot PS Plot Point Obs MTD Series Analysis MODE Wavelet Stat Grid Stat Ensemble Stat Point Stat PS STAT PS STAT STAT STAT Analysis Plot MODE Field MODE Analysis Stat Analysis PNG STAT DLand MET-TC TC PAIRS TCST TC STAT Land Data File TC DLAND ATCF Track Data Grid-to-grid comparisons on common grid. Grid-Stat and Point-Stat: Compute statistics aggregated over many grid points for a single point in time. Series-Analysis Tool: Compute statistics aggregated through time for each point in the grid. GSI Diag GSI Tools

69 Series-Analysis: Input/Output Input Files Gridded forecast and observation files GRIB1 output of Unified Post-Processor (or other) GRIB2 from NCEP (or other) from PCP-Combine, p_interp, or CF-compliant configuration file Output File file containing one or more statistics computed for each grid point. Series-Analysis: Define Series Define series as: Same field from multiple files. Different fields from the same file. Example: 24hr NAM fcst of 3hr APCP vs StageII Series-Analysis: Usage Usage: series_analysis -fcst file_1... file_n -obs file_1... file_n [-both file_1... file_n] [-paired] -out file -config file [-log file] [-v level] [-compress level] -fcst Gridded forecast files or file list -obs Gridded observation files or file list -both Set fcst and obs to the same list of files (i.e. Grid-Stat pairs) -paired Input fcst and obs files are paired -out Output file name -config configuration file -log Output directory to be used -v Level of logging Series-Analysis: Configuration Precipitation accumulated over 3 hours. fcst and obs Threshold precip at 0.01 and Do not restrict the analysis area in any way. Process 100,000 grid points in each pass. Require 75% of matched pairs in series to be valid. Compute contingency table statistics listed. fcst = { cat_thresh = [ >0.254, >2.540 ]; field = [ { name = "APCP_03"; level = [ "(*,*)" ]; } ]; }; obs = fcst; mask = { grid = ""; poly = ""; }; block_size = ; vld_thresh = 0.75; output_stats = { fho = []; ctc = []; cts = [ "TOTAL", "BASER", "GSS", "FBIAS", "HK", "HSS" ]; mctc = []; mcts = []; cnt = []; sl1l2 = []; sal1l2 = []; pct = []; pstd = []; pjc = []; prc = []; };

70 Series-Analysis: Run met-6.1/bin/series_analysis \ -fcst nam_24hr_fcst_summer \ -obs st2_00z_vld_summer \ -config SeriesAnalysisConfig \ -out series_nam_st2_24hr_fcst_summer.nc -v 2 DEBUG 1: Reading file list: nam_24hr_fcst_summer DEBUG 1: Reading file list: st2_00z_vld_summer DEBUG 1: Default Config File: met-5.0/share/met/data/config/seriesanalysisconfig_default DEBUG 1: User Config File: SeriesAnalysisConfig DEBUG 1: Length of configuration "fcst.field" = 1 DEBUG 1: Length of configuration "obs.field" = 1 DEBUG 1: Length of forecast file list = 92 DEBUG 1: Length of observation file list = 92 DEBUG 1: Series defined by the forecast file list of length 92. DEBUG 2: Computing statistics using a block size of , requiring 10 passes through the 1121 x 881 grid. DEBUG 2: Processing data pass number 1 of 10 for grid points 1 to DEBUG 2: Processing series entry 1 of 92: APCP_03(*,*) versus APCP_03(*,*) DEBUG 2: Found data for APCP_03(*,*) in NAM_4km_03h/ /nam_ _021_024.nc DEBUG 2: Found data for APCP_03(*,*) in ST2_4km_03h/ST2ml h.nc DEBUG 2: Processing data pass number 2 of 10 for grid points to DEBUG 2: Processing data pass number 3 of 10 for grid points to DEBUG 2: Processing data pass number 4 of 10 for grid points to DEBUG 2: Processing data pass number 5 of 10 for grid points to DEBUG 2: Processing data pass number 6 of 10 for grid points to DEBUG 2: Processing data pass number 7 of 10 for grid points to DEBUG 2: Processing data pass number 8 of 10 for grid points to DEBUG 2: Processing data pass number 9 of 10 for grid points to DEBUG 2: Processing data pass number 10 of 10 for grid points to DEBUG 1: Output file: out/series_nam_st2_24hr_fcst_summer.nc Run time approx 30 minutes Series-Analysis: ncdump netcdf series_nam_st2_24hr_fcst_summer { dimensions: lat = 881 ; lon = 1121 ; variables: int n_series ; n_series:long_name = "length of series" ; float series_cts_total_gt0.254(lat, lon) ; series_cts_total_gt0.254:_fillvalue = f ; series_cts_total_gt0.254:name = "TOTAL" ; series_cts_total_gt0.254:long_name = "Total number of matched pairs" ; series_cts_total_gt0.254:fcst_thresh = ">0.254" ; series_cts_total_gt0.254:obs_thresh = ">0.254" ; float series_cts_baser_gt0.254(lat, lon) series_cts_baser_gt0.254:_fillvalue = f ; series_cts_baser_gt0.254:name = "BASER" ; series_cts_baser_gt0.254:long_name = "Base rate" ; series_cts_baser_gt0.254:fcst_thresh = ">0.254" ; series_cts_baser_gt0.254:obs_thresh = ">0.254" ; float series_cts_gss_gt0.254(lat, lon) ; series_cts_gss_gt0.254:_fillvalue = f ; series_cts_gss_gt0.254:name = "GSS" ; series_cts_gss_gt0.254:long_name = "Gilbert Skill Score" ; series_cts_gss_gt0.254:fcst_thresh = ">0.254" ; series_cts_gss_gt0.254:obs_thresh = ">0.254" ; float series_cts_fbias_gt0.254(lat, lon) ; series_cts_fbias_gt0.254:_fillvalue = f ; series_cts_fbias_gt0.254:name = "FBIAS" ; series_cts_fbias_gt0.254:long_name = "Frequency bias" ; series_cts_fbias_gt0.254:fcst_thresh = ">0.254" ; series_cts_fbias_gt0.254:obs_thresh = ">0.254" ; float series_cts_hk_gt0.254(lat, lon) ; series_cts_hk_gt0.254:_fillvalue = f ; series_cts_hk_gt0.254:name = "HK" ; series_cts_hk_gt0.254:long_name = "Hanssen-Kuipers discriminant" ; series_cts_hk_gt0.254:fcst_thresh = ">0.254" ; series_cts_hk_gt0.254:obs_thresh = ">0.254" ; Series-Analysis: Statistics Ensemble_Stat 3hr APCP > mm (0.01 in) 3hr APCP > 2.54 mm (0.1 in) Presenter: Tina Kalb Copyright 2015, University Corporation for Atmospheric Research, all rights reserved copyright 2018, UCAR, all rights reserved

71 Verifying Ensembles & Probability Fcsts with MET Point-Stat and Grid-Stat Tool (probability) Brier Score + Decomposition Reliability Diagrams Receiver Operating Characteristic Diagram + Area Under the Curve Joint/Conditional factorization table Ensemble-Stat Tool Ensemble Mean Fields Probability Fields Rank Histograms Spread-Skill Calculation Ensemble-Stat Tool Statistics MODE Wavelet Stat Grid Stat Ensemble Stat Point Stat PS STAT PS STAT STAT STAT Analysis MODE Analysis Stat Analysis User Defined Display User Graphics Package User Graphics Package Copyright 2018, University Corporation for Atmospheric Research all rights reserved Copyright 2018, University Corporation for Ensemble-Stat Capabilities Reads: Gridded ensemble member files Gridded AND point observations files Calculates: Ensemble Mean, Standard Deviations, Mean + 1 SD fields Ensemble Min, Max, and Range fields Ensemble Valid Data Count field Ensemble Relative Frequency by threshold fields (i.e. probability) Rank and PIT Histograms (if Obs Field Provided) Ensemble Spread-Skill (if Obs Field Provided) Writes: Stat file with Rank Histogram, PIT Histogram, Spread-Skill partial sums, and Point Observation Ranks Gridded field of Observation Ranks to a file Copyright 2018, University Corporation for Atmospheric Research all rights reserved Ensemble Stat Tool: Usage Usage: ensemble_stat n_ens ens_file_1 \... ens_file_n ens_file_list config_file [-grid_obs file] [-point_obs file] [-ssvar_mean file] [-obs_valid_beg time] [-obs_valid_end time] [-outdir path] [-log file] [-v level] pyright 2018, University Corporation for Number of Ensemble members followed by list of ensemble member names OR ens_file_list (the name of an file with names of members) Config file name Name of gridded or point observed file Required if Rank Histograms desired (optional) Specify an ensemble mean model data file for use in calculating ensemble spread-skill (optional) YYYYMMDD[_HH[MMSS]] format to set the beginning and end of the matching observation time window (optional) Set output directory (optional) Outputs log messages to the specified file (optional) Set level of verbosity (optional)

72 Ensemble-Stat: Configuration Many configurable parameters ens = fields to summarize ens_thresh - All members must be available vld_thresh all data in grid must be valid 24hr Accumulated Precip (APCP) Composite Reflectivity (REFC) N-S component of Wind (UGRD) Thresholds used for Ensemble Relative Freq (i.e. probability) GRIB1_ptv = 129; Use GRIB Table 129 instead of Table 2 // // Ensemble product fields to be processed // (i.e. mean, min, max, stdev fields) // ens = { ens_thresh = 1.0; vld_thresh = 1.0; field = [ { name = "APCP"; level = [ "A24" ]; cat_thresh = [ >0.0, >=10.0 ]; }, { name = "REFC"; level = [ "L0" ]; cat_thresh = [ >=35.0 ]; GRIB1_ptv = 129; }, { name = "UGRD"; level = [ "Z10" ]; cat_thresh = [ >=5.0 ]; }, ]; } Ensemble-Stat: Configuration Many configurable parameters only set a few: Fcst specifies fields to be verified ADPSFC message type for point obs 24hr precip for gridded obs field Bin size for spread-skill calculation is 0.1 mm Bin size for probability integral transform statistics is 0.05 mm // Forecast and observation fields to be // verified (i.e. RHIST, PHIST, SSVAR) // fcst = { field = [ { name = "APCP"; level = [ "A24" ]; } ]; } obs = fcst; // Point observation filtering options // May be set separately in each "obs.field" entry // message_type = [ "ADPSFC" ]; sid_exc = []; obs_thresh = [ NA ]; obs_quality = []; duplicate_flag = NONE; obs_summary = NONE; obs_perc_value = 50; skip_const = FALSE; // // Ensemble bin sizes // May be set separately in each "obs.field" entry // ens_ssvar_bin_size = 0.1; ens_phist_bin_size = 0.05; Ensemble-Stat Tool: Run ensemble_stat \ 6 sample_fcst/ /*gep*/d01_ _02400.grib \ config/ensemblestatconfig \ -grid_obs sample_obs/st4/st h \ -point_obs out/ascii2nc/precip24_ nc \ -outdir out/ensemble_stat -v 2 NOTE: You can pass in a path with wildcards to pull out the files you would like to process or you can pass in an filename that contains a list of ensemble members Gridded and Obs fields are included for use in calculating Rank Histogram, PIT Histogram, and Spread-Skill Copyright 2018, University Corporation for Atmospheric Research all rights reserved Ensemble Stat Tool: Run *** Running Ensemble-Stat on APCP using GRIB forecasts, point observations, and gridded observations *** DEBUG 1: Default Config File: /d3/projects/met/met_releases/met-6.0/data/config/ensemblestatconfig_default DEBUG 1: User Config File: config/ensemblestatconfig GSL_RNG_TYPE=mt19937 GSL_RNG_SEED=1 DEBUG 1: Ensemble Files[6]: DEBUG 1:../data/sample_fcst/ /arw-fer-gep1/d01_ _02400.grib DEBUG 1:../data/sample_fcst/ /arw-fer-gep5/d01_ _02400.grib DEBUG 1:../data/sample_fcst/ /arw-sch-gep2/d01_ _02400.grib DEBUG 1:../data/sample_fcst/ /arw-sch-gep6/d01_ _02400.grib DEBUG 1:../data/sample_fcst/ /arw-tom-gep3/d01_ _02400.grib DEBUG 1:../data/sample_fcst/ /arw-tom-gep7/d01_ _02400.grib DEBUG 1: Gridded Observation Files[1]: DEBUG 1:../data/sample_obs/ST4/ST h DEBUG 1: Point Observation Files[1]: DEBUG 1:../out/ascii2nc/precip24_ nc DEBUG 2: DEBUG 2: DEBUG 2: DEBUG 2: Processing ensemble field: APCP/A24 DEBUG 2: DEBUG 2: Processing gridded verification APCP_24/A24 versus APCP_24/A24, for observation type MC_PCP, over region FULL, for interpolation method UW_MEAN(1), using pairs DEBUG 1: Output file: out/ensemble_stat/ensemble_stat_ _120000v.stat DEBUG 1: Output file: out/ensemble_stat/ensemble_stat_ _120000v_rhist.txt DEBUG 1: Output file: out/ensemble_stat/ensemble_stat_ _120000v_phist.txt DEBUG 1: Output file: out/ensemble_stat/ensemble_stat_ _120000v_orank.txt DEBUG 
1: Output file: out/ensemble_stat/ensemble_stat_ _120000v_ssvar.txt DEBUG 1: Output file: out/ensemble_stat/ensemble_stat_ _120000v_ens.nc DEBUG 1: Output file: out/ensemble_stat/ensemble_stat_ _120000v_orank.nc Copyright 2018 University Corporation for

73 Ensemble-Stat: Output Files Ensemble Stat Tool: nc Output Up to 4 txt files and stat file Ranked histogram (CPSS, IGN) Probability integral transform histogram Skill/spread variance e.g. FBAR, OBAR, MSE, RMSE, PR_CORR Relative position netcdf Gridded ensemble mean, standard deviation, min, max, range, frequency orank file (gridded obs rank) ensemble_flag = { mean = TRUE; stdev = TRUE; minus = TRUE; plus = TRUE; min = TRUE; max = TRUE; range = TRUE; vld_count = TRUE; frequency = TRUE; rank = TRUE; weight = FALSE; }; output_flag = { rhist = BOTH; phist = BOTH; orank = BOTH; ssvar = BOTH; relp = BOTH; }; Ensemble Mean Ensemble StdDev Prob > 5 mm Prob > 0 mm Copyright 2018 University Corporation for Ensemble Stat Tool: txt Output Rank Histogram Output from *_rhist.txt VERSION MODEL FCST_LEAD FCST_VALID_BEG FCST_VALID_END OBS_LEAD OBS_VALID_BEG OBS_VALID_END FCST_VAR FCST_LEV OBS_VAR OBS_LEV OBTYPE VX_MASK INTERP_MTHD INTERP_PNTS FCST_THRESH OBS_THRESH COV_THRESH ALPHA LINE_TYPE TOTAL CRPS IGN N_RANK RANK_1 RANK_2 RANK_3 RANK_4 RANK_5 RANK_6 RANK_7 V6.0 WRF _ _ _ _ APCP_24 A24 APCP_24 A24 ADPSFC FULL UW_MEAN 1 NA NA NA NA RHIST CRPS IGN RANK HIST Output from *_phist.txt VERSION MODEL FCST_LEAD FCST_VALID_BEG FCST_VALID_END OBS_LEAD OBS_VALID_BEG OBS_VALID_END FCST_VAR FCST_LEV OBS_VAR OBS_LEV OBTYPE VX_MASK INTERP_MTHD INTERP_PNTS FCST_THRESH OBS_THRESH COV_THRESH ALPHA LINE_TYPE TOTAL BIN_SIZE N_BIN BIN_1 BIN_2 BIN_3 BIN_4 BIN_5 BIN_6 BIN_7 BIN_8 BIN_9 BIN_10 BIN_11 BIN_12 BIN_13 BIN_14 BIN_15 BIN_16 BIN_17 BIN_18 BIN_19 BIN_20 V6.0 WRF _ _ _ _ APCP_24 A24 APCP_24 A24 ADPSFC FULL UW_MEAN 1 NA NA NA NA PHIST Probability integral transform histogram Copyright 2018 University Corporation for Copyright 2014, University Corporation for Atmospheric Copyright Research, 2018, University Corporation all rights for Atmospheric reserved Research, all rights reserved

74 Uses for Output from Ensemble Stat Statistics Analysis User Defined Display MODE Series Analysis Grid Stat Ensemble Stat PS STAT PS STAT STAT MODE Analysis Stat Analysis Verification of ensembles and probabilites Barbara Brown Point Stat STAT Acknowledgments: Tom Hamill, Laurence Wilson, Tressa Fowler Copyright 2018 University Corporation for Copyright UCAR 2018, all rights reserved. How good is this ensemble forecast? Questions to ask before beginning? How were the ensembles constructed? Poor man s ensemble (distinct members) Multi physics (distinct members) Random perturbation of initial conditions (anonymous members) How are your forecasts used? Improved point forecast (ensemble mean) Probability of an event Full distribution Copyright UCAR 2018, all rights reserved. Copyright UCAR 2018, all rights reserved.

75 Approaches to evaluating ensemble forecasts: as individual members, using methods for continuous or categorical forecasts; as probability forecasts, creating probabilities by applying thresholds or statistical post-processing; or as a full distribution, using the individual members or fitting a distribution through post-processing.

Evaluate each member as a separate, deterministic forecast. Why? Because it is easy and important. If members are unique, it might provide useful diagnostics. If members are biased, verification statistics might be skewed. If members have different levels of bias, should you calibrate? Do these results conform to your understanding of how the ensemble members were created?

Verifying a probabilistic forecast: you cannot verify a probabilistic forecast with a single observation. The more data you have for verification (as with other statistics), the more certain you are. Rare events (low probability) require more data to verify. These comments refer to probabilistic forecasts developed by methods other than ensembles as well.

Properties of a perfect probabilistic forecast of a binary event: reliability, resolution, and sharpness (illustrated on the slide by the conditional frequency distributions of forecasts for observed events and observed non-events). Copyright UCAR 2018, all rights reserved.

76 The Brier Score: the mean square error of a probability forecast,

BS = \frac{1}{n} \sum_{i=1}^{n} (f_i - x_i)^2

where n is the number of forecasts, f_i is the forecast probability on occasion i, and x_i is the observation (0 or 1) on occasion i. It weights larger errors more than smaller ones.

Brier Score decomposition: BS can be decomposed into 3 components that represent important properties of the forecasts,

BS = \underbrace{\frac{1}{n} \sum_{i=1}^{I} N_i (f_i - \bar{x}_i)^2}_{\text{Reliability}} - \underbrace{\frac{1}{n} \sum_{i=1}^{I} N_i (\bar{x}_i - \bar{x})^2}_{\text{Resolution}} + \underbrace{\bar{x}(1 - \bar{x})}_{\text{Uncertainty}}

where I is the number of discrete values of f (e.g., f_1 = 0.05, f_2 = 0.10, f_3 = 0.20, etc.), n = \sum_{i=1}^{I} N_i, \bar{x}_i = \frac{1}{N_i} \sum_{k \in N_i} x_k, and \bar{x} = \frac{1}{n} \sum_{i=1}^{I} N_i \bar{x}_i.

Components of the Brier Score: Reliability measures how well the conditional relative frequency of events matches the forecast. Resolution measures how well the forecasts distinguish situations with different frequencies of occurrence. Uncertainty measures the variability in the observations (i.e., the difficulty of the forecast situations). Looking at the Brier Score components is critical to understanding forecast performance.

Brier Skill Score (BSS):

BSS = \frac{RES - REL}{UNC}

BSS is a simple combination of the 3 components of the Brier Score (assumes the sample climatology as the reference forecast). Copyright UCAR 2018, all rights reserved.
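A small numpy sketch of the Brier score and its three-term decomposition, binning by the discrete forecast values as in the formulas above (toy forecast/observation pairs; with exact-value binning the identity BS = REL - RES + UNC holds to rounding):

# Minimal sketch of the Brier score and its reliability / resolution /
# uncertainty decomposition for probability forecasts with discrete values.
import numpy as np

f = np.array([0.1, 0.1, 0.5, 0.5, 0.5, 0.9, 0.9, 0.9, 0.9, 0.1])  # forecasts
x = np.array([0,   0,   1,   0,   1,   1,   1,   1,   0,   1  ])  # obs (0/1)

n = len(f)
bs = np.mean((f - x) ** 2)

xbar = x.mean()
rel = res = 0.0
for fi in np.unique(f):
    idx = (f == fi)
    ni = idx.sum()
    xbar_i = x[idx].mean()
    rel += ni * (fi - xbar_i) ** 2
    res += ni * (xbar_i - xbar) ** 2
rel, res = rel / n, res / n
unc = xbar * (1 - xbar)

print(f"BS={bs:.4f}  REL={rel:.4f}  RES={res:.4f}  UNC={unc:.4f}")
print("check: REL - RES + UNC =", round(rel - res + unc, 4))
print("BSS =", round((res - rel) / unc, 4))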

77 Our friend, the scatterplot. Introducing the attribute diagram (a close relative of the reliability diagram): it is analogous to the scatter plot, and the same intuition holds. Data must be binned! Binning hides how much data is represented by each point. It expresses conditional probabilities. Confidence intervals can illustrate the problems with small sample sizes.

The attribute diagram shows reliability, resolution, and skill.

Reliability Diagram Exercise. Copyright UCAR 2018, all rights reserved.

78 Discrimination: the ability of the forecast system to clearly distinguish situations leading to the occurrence of an event of interest from those leading to the non-occurrence of the event. It depends on the separation of the means of the conditional distributions and the variance within the conditional distributions (the slide shows three pairs of frequency-versus-forecast distributions for observed events and observed non-events: (a) good discrimination, (b) poor discrimination, (c) good discrimination).

Relative Operating Characteristic (ROC): measures the ability of the forecast to discriminate between events and non-events (resolution). Plot hit rate H vs. false alarm rate F using a set of varying probability thresholds to make the yes/no decision.

Interpretation of ROC: close to the upper left corner indicates good resolution; close to the diagonal indicates little skill. The area under the curve ("ROC area") is a useful summary measure of forecast skill: perfect forecast, ROC area = 1; no skill, ROC area = 0.5; ROC skill score ROCS = 2(ROC area - 0.5). The ROC is conditioned on the observations (i.e., given that Y occurred, what was the corresponding forecast?) and is not sensitive to bias, so reliability and ROC diagrams are good companions.

ROC example: ROC diagram for T12 < 5 C at T+72. Shades indicate the different levels of statistical processing applied, as shown in the key. The cross indicates the ROC (FAR, HR) of the ECMWF high-resolution deterministic model. From "Verification of PREVIN site-specific probability forecasts", Met Office. Copyright UCAR 2018, all rights reserved.
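A minimal sketch of building a ROC curve by hand: sweep a set of probability thresholds, convert each to a yes/no forecast, and collect the hit rate and false alarm rate. The toy forecasts and observations below are illustrative only; in MET these curve points come from the PRC line type.

# Minimal sketch of a ROC curve and its area for probability forecasts.
import numpy as np

prob = np.array([0.05, 0.2, 0.4, 0.7, 0.9, 0.1, 0.6, 0.8, 0.3, 0.95])
obs  = np.array([0,    0,   1,   1,   1,   0,   0,   1,   0,   1   ])

thresholds = np.arange(0.0, 1.01, 0.1)
hr, far = [], []
for t in thresholds:
    yes = prob >= t
    hits   = np.sum(yes & (obs == 1))
    misses = np.sum(~yes & (obs == 1))
    fas    = np.sum(yes & (obs == 0))
    cns    = np.sum(~yes & (obs == 0))
    hr.append(hits / (hits + misses))     # hit rate
    far.append(fas / (fas + cns))         # false alarm rate

# Area under the curve by the trapezoid rule (reverse so FAR is ascending).
auc = np.trapz(hr[::-1], far[::-1])
print("ROC area:", round(float(auc), 3))
print("ROC skill score ROCS:", round(2 * (float(auc) - 0.5), 3))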

79 Sharpness is also important. Sharpness measures the specificity of the probabilistic forecast. Given two reliable forecast systems, the one producing the sharper forecasts is preferable. But: you don't want sharp forecasts if they are not reliable; that implies unrealistic confidence. Sharpness is not the same as resolution: sharpness is a property of the forecasts alone, and a measure of sharpness in the Brier score decomposition

BS = \frac{1}{n} \sum_{i=1}^{I} N_i (f_i - \bar{x}_i)^2 - \frac{1}{n} \sum_{i=1}^{I} N_i (\bar{x}_i - \bar{x})^2 + \bar{x}(1 - \bar{x})

would be how populated the extreme N_i's are.

Sharpness for binary probability forecasts: sharpness is based on the distribution (histogram) of frequencies associated with each possible probability, sometimes summarized using the variance of that distribution (the slide shows example histograms: reasonable sharpness, maximum possible sharpness for a perfect forecast, and poor sharpness).

Forecasts of a full distribution: how is it expressed? Discretely, by providing forecasts from all ensemble members; as a parametric distribution, e.g. normal (ensemble mean, spread); or as a smoothed function (kernel smoother). Copyright UCAR 2018, all rights reserved.

80 Rank Histograms Evaluating ensembles Spread skill Continuous Ranked Probability Score: Measures skill using squared error Copyright UCAR 2018, all rights reserved. (analogous to MAE) Ensemble Calibration / Reliability By default, we assume all ensemble forecasts have the same number of members. Comparing forecasts with different number of members is an advanced topic. For a perfect ensemble, the observation comes from the same distribution as the ensemble. Copyright UCAR 2018, all rights reserved. Rank histogram examples Creating rank histograms Rank 1 of 21 Rank 14 of 21 Rank 5 of 21 Rank 3 of 21 Rank histograms are a way to examine the calibration of an ensemble Copyright UCAR 2018, all rights reserved. Copyright UCAR 2018, all rights reserved.
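A minimal sketch of how a rank histogram is built: for each case, find the rank of the observation within the pooled ensemble members, then histogram those ranks over all cases. The synthetic data below draw the observation from the same distribution as the members, so the counts should come out roughly flat.

# Minimal sketch of building a rank histogram (ties ignored for simplicity).
import numpy as np

rng = np.random.default_rng(0)
n_cases, n_members = 500, 5

ens = rng.normal(loc=0.0, scale=1.0, size=(n_cases, n_members))
obs = rng.normal(loc=0.0, scale=1.0, size=n_cases)   # same distribution

# Rank of the observation among the members: 1 .. n_members + 1
ranks = 1 + np.sum(ens < obs[:, None], axis=1)
counts = np.bincount(ranks, minlength=n_members + 2)[1:]
print("rank counts:", counts)   # roughly flat for a well-calibrated ensemble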

81 Rank histogram examples Verifying a continuous expression of a distribution (i.e. normal, Poisson, beta) Probability of any observation occurring is on [0,1] interval. Probability Integral Transformed (PIT) fancy word for how likely is a given forecast Still create a rank histogram using bins of probability of observed events. Rank histograms are a way to examine the calibration of an ensemble Copyright UCAR 2018, all rights reserved. Copyright UCAR 2018, all rights reserved. Verifying a distribution forecast Warnings about rank histograms Assume all samples come from the same climatology! A flat rank histogram can be derived by combining forecasts with offsetting biases See Hamill, T. M., and J. Juras, 2006: Measuring forecast skill: is it real skill or is it the varying climatology? Quart. J. Royal Meteor. Soc., Jan 2007 issue Techniques exist for evaluating flatness, but they mostly require much data. Copyright UCAR 2018, all rights reserved. Copyright UCAR 2018, all rights reserved.

82 Continuous and discrete rank probability scores: measures of accuracy for multiple-category forecasts (e.g., precipitation type), the Rank Probability Score (RPS), and for continuous distributions (e.g., an ensemble distribution), the Continuous Ranked Probability Score (CRPS). Both relate to the Brier score: for a forecast of a binary event, the RPS is equivalent to the Brier score. A good RPS score minimizes the area between the forecast and observed cumulative distributions.

Ignorance score (for multi-category or ensemble forecasts): a local score,

IS = -\frac{1}{n} \sum_{t=1}^{n} \log_2\left( p_{k^*(t)} \right)

where k^*(t) is the category that actually was observed at time t. It is based on information theory and only rewards forecasts with some probability in the correct category. Perfect score: 0 [i.e., \log_2(1) = 0]. Copyright UCAR 2018, all rights reserved.
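A minimal sketch of the CRPS for a single ensemble forecast, using the common ensemble (energy) form CRPS = mean|x_i - y| - 0.5 * mean|x_i - x_j|; the member values and observation below are illustrative only. In MET, Ensemble-Stat reports CRPS on the RHIST line type, as shown earlier.

# Minimal sketch of the ensemble CRPS for one forecast and one observation.
import numpy as np

members = np.array([1.2, 0.8, 1.9, 1.4, 0.6])   # ensemble member values
y = 1.0                                          # verifying observation

term1 = np.mean(np.abs(members - y))
term2 = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
crps = term1 - term2
print("CRPS =", round(crps, 4))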

83 Final comments Know how and why your ensemble is being created. Use a combination of graphics and scores. Areas of more research Verification of spatial forecasts Additional intuitive measures of performance for probability and ensemble forecasts. Measure Attribute evaluated Comments Probability forecasts Brier score Accuracy Based on squared error Resolution Reliability Skill score Sharpness measure Resolution (resolving different categories) Calibration Skill Sharpness Compares forecast category climatologies to overall climatology Skill involves comparison of forecasts Only considers distribution of forecasts Ignores calibration Relative Operating Discrimination Characteristic (ROC) C/L Value Value Ignores calibration Ensemble distribution Rank histogram Calibration Can be misleading Spread-skill Calibration Difficult to achieve CRPS Accuracy Squared difference between forecast and observed distributions Analogous to MAE in limit Copyright UCAR 2018, all rights reserved. log p score Local score, rewards for correct Accuracy category; infinite if observed Copyright UCAR 2018, all rights category reserved. has 0 density Useful references Good overall references for forecast verification: (1) Wilks, D.S., 2006: Statistical Methods in the Atmospheric Sciences (2nd Ed). Academic Press, 627 pp. (2) WMO Verification working group forecast verification web page, (3) Jolliffe and Stephenson Book: Jolliffe, I.T., and D.B. Stephenson, 2012: Forecast Verification. A Practitioner's Guide in Atmospheric Science., 2 nd Edition, Wiley and Sons Ltd. Verification tutorial Eumetcal ( learning modules ) Rank histograms: Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, Spread skill relationships: Whitaker, J.S., and A. F. Loughe, 1998: The relationship between ensemble spread and ensemble mean skill. Mon. Wea. Rev., 126, Brier score, continuous ranked probability score, reliability diagrams: Wilks text again. Relative operating characteristic: Harvey, L. O., Jr, and others, 1992: The application of signal detection theory to weather forecasting behavior. Mon. Wea. Rev., 120, Economic value diagrams: (1)Richardson, D. S., 2000: Skill and relative economic value of the ECMWF ensemble prediction system. Quart. J. Royal Meteor. Soc., 126, (2) Zhu, Y, and others, 2002: The economic value of ensemble based weather forecasts. Bull. Amer. Meteor. Soc., 83, Overestimating skill: Hamill, T. M., and J. Juras, 2006: Measuring forecast skill: is it real skill or is it the varying climatology? Quart. J. Royal Meteor. Soc., Jan 2007 issue. Copyright UCAR 2018, all rights reserved. Probabilities underforecast Essentially no skill Perfect forecast Reliability Diagram Exercise Tends toward mean but has skill Small samples In some bins Reliable forecast of rare event Copyright UCAR 2018, all rights reserved. No resolution Overresolved forecast Typical categorical forecast

84 Verifying Probabilities with MET
(Flowchart: probability fields feed the MET statistics tools - Point Stat, Grid Stat, Ensemble Stat, Series Analysis, MODE - whose STAT and PostScript output goes to Stat Analysis, MODE Analysis, and user-defined display.)
Verifying Probabilities
Probabilistic verification tools: Grid Stat, Point Stat, and Stat Analysis.
Define an Nx2 contingency table using multiple forecast probability thresholds and one observation threshold.
Example:
FCST: probability of precip, bins [0.00, 0.25, 0.50, 0.75, 1.00] (==0.25)
OBS: accumulated precip > 0.00
Verifying Probabilities: Example
Verify the probability of precip against total precip. Configuration file settings:
fcst = {
  field = [ {
    name = "POP"; level = [ "Z0" ];
    //cat_thresh = [ >=0.0, >=0.25, >=0.50, >=0.75, >=1.00 ];
    cat_thresh = [ ==0.25 ];
    prob = TRUE;
  } ];
};
obs = {
  field = [ {
    name = "APCP"; level = [ "A12" ];
    cat_thresh = [ >0.0 ];
  } ];
};
Copyright UCAR 2018, all rights reserved.
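To make the Nx2 table concrete, here is a minimal Python sketch (illustrative only, not the MET implementation; names and shapes are assumptions) that bins forecast probabilities against a single observation threshold:

    import numpy as np

    def prob_contingency_table(fcst_prob, obs_event, prob_edges):
        # fcst_prob:  forecast probabilities, e.g. the POP field
        # obs_event:  boolean array, e.g. APCP > 0.0
        # prob_edges: probability bin edges, e.g. [0.0, 0.25, 0.5, 0.75, 1.0]
        bins = np.digitize(fcst_prob, prob_edges[1:-1])   # bin index per forecast
        n_bins = len(prob_edges) - 1
        table = np.zeros((n_bins, 2), dtype=int)          # column 0: obs no, column 1: obs yes
        for b, o in zip(bins, np.asarray(obs_event).astype(int)):
            table[b, o] += 1
        return table

For the slide's example this would be called with the POP field, events defined by APCP > 0.0, and the bin edges [0.0, 0.25, 0.5, 0.75, 1.0].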

85 Grid Stat: Probability Config.
Many configurable parameters; only a few need to be set:
APCP_24_A24_ENS_FREQ_ge10.000 is the name of the ensemble field in the NetCDF file
prob = TRUE is important
cat_thresh is used for the reliability and ROC curves
Use the 24-hr accumulation in the GRIB file, thresholded at >10 mm
Generate probabilistic statistics

fcst = {
  wind_thresh = [ NA ];
  field = [ {
    name = "APCP_24_A24_ENS_FREQ_ge10.000";
    level = [ "(*,*)" ];
    prob = TRUE;
    cat_thresh = [ >=0.0, >=0.1, >=0.2, >=0.3, >=0.4, >=0.5, >=0.6, >=0.8, >=1.0 ];
    //cat_thresh = [ ==0.1 ];
  } ];
};
obs = {
  wind_thresh = [ NA ];
  field = [ {
    name = "APCP";
    level = [ "A24" ];
    cat_thresh = [ > ];
  } ];
};
output_flag = {
  fho = NONE; ctc = NONE; cts = NONE; mctc = NONE; mcts = NONE;
  cnt = NONE; sl1l2 = NONE; vl1l2 = NONE;
  pct = BOTH; pstd = BOTH; pjc = BOTH; prc = BOTH;
  nbrctc = NONE; nbrcts = NONE; nbrcnt = NONE;
};

Grid Stat for Probability: Run
Output is written to a .stat file and, if desired, to individual text files:
PCT: Probability Contingency Table Counts
PSTD: Probability Contingency Table Scores (Brier Score, Reliability, Resolution, Uncertainty, Area Under ROC)
PJC: Joint/Continuous Statistics of Probabilistic Variables (Calibration, Refinement, Likelihood, Base Rate, Reliability points)
PRC: ROC Curve Points for Probabilistic Variables

Grid Stat Probability: Examples
A teaser, Spatial Methods Application: you can use MODE on probability fields as well. In this case: probability field threshold = 50%; observed field threshold > 12.7 mm (0.5 in).
Copyright 2018 University Corporation for Atmospheric Research, all rights reserved
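For reference, a Grid-Stat run with this kind of configuration looks like the following (the file names are hypothetical; the general usage is grid_stat fcst_file obs_file config_file):

    grid_stat ensemble_stat_APCP_24.nc ST4_APCP_24.grb GridStatConfig_prob -outdir out/grid_stat -v 2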

86 OBJECT-BASED AND CYCLONE VERIFICATION

96 Docker MET and METViewer
Docker (Amazon Web Services): an open-source technology to build and deploy applications inside software containers.
Packages software containing code, runtime, system tools, system libraries, etc.
Enables you to quickly, reliably, and consistently deploy applications.
MET container available on the MET downloads page: met/users/downloads/ docker_container.php
MET and METViewer placed in a Docker container:
1) Set up to work with a suite of test cases for NWP innovation testing (MMET/MERIT)
2) Presented in an AMS Short Course at this meeting (Jan 6th)
See yesterday's talk by Halley Gotway: orial/index.php
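As a sketch of how a containerized MET might be pulled and run (the image name and mount paths here are assumptions, not the official instructions; see the MET downloads page for the supported procedure):

    docker pull dtcenter/met                                   # hypothetical image name
    docker run -it --rm -v $(pwd)/data:/data dtcenter/met /bin/bash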

97 MODE Customization and Output
Verifying with Objects: MODE Example
Matched Object 1, Matched Object 2, Unmatched Object.
ENS FCST: Radius=5, ObjectThresh >0.25, MergingThresh >0.20 (merging; no false alarms).
OBS: Radius=5, ObjectThresh >0.25, MergingThresh >0.20 (matching then merging; misses).
Presenter: Tina Kalb
MODE Input and Usage
Input files:
Gridded forecast and observation: GRIB1, GRIB2 (Unified Post-Processor, NCEP, other), NetCDF (PCP-Combine, wrf_interp, CF-compliant)
Config file
Usage:
mode fcst_file obs_file config_file [-config_merge merge_config_file] [-outdir path] [-log file] [-v level]
fcst_file       Gridded forecast file
obs_file        Gridded observation file
config_file     Configuration file
-config_merge   Second configuration file for fuzzy-engine merging
-outdir         Output directory to be used
-log            Optional log file
-v              Level of logging
copyright 2018, UCAR, all rights reserved
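An example invocation following that usage (the file names are hypothetical):

    mode wrf_apcp_12.grb2 st4_apcp_12.grb2 MODEConfig_APCP_12 -outdir out/mode -v 2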

98 Config File and MODE Output
Config file contents: field names, model description, weights of the object attributes, definition of objects (smoothing radius, intensity threshold, area threshold), matching and/or merging options.
MODE Output:
PostScript: object pictures and definitions, matching/merging strategy, total interest for each object pair, number and area of objects, Median of Maximum Interest (MMI).
Text: attributes of simple objects, paired objects, and clusters (size, shape, position, separation, total interest) plus verification scores (CSI, bias, etc.) for objects.
NetCDF: gridded object fields (view with ncview).
PostScript object pictures: raw forecast and observed fields and the total interest of object pairs; pairs above the dashed line are processed further.
Pages 2 and 3 of PostScript: a band shows which simple objects are merged (aka a cluster); colors show matching between forecast and observations. Clusters merged by the fuzzy logic engine vs. simple objects not merged by the fuzzy logic engine. Unmatched (FY_ON) = false alarm; matched (FY_OY) = hit.
copyright 2018, UCAR, all rights reserved

99 Pages 4 and 5 of PostScript: summary information for clusters in the domain; objects overlapped in two different views. Which do you prefer?
Pages 6 and 7 of PostScript: raw field and double thresholding for the merging process, e.g. convolution threshold (>=25.4 mm) and double-thresholding value (>=22.5 mm).
Summary Score for a Forecast: Median of the Maximum Interest (MMI*)
The interest matrix pairs each observed object (A, B) with each forecast object (1, 2, 3); taking the maximum interest for each observed and forecast object and then the median gives, for example,
MMI = median { 0.90, 0.80, 0.90, 0.80, 0.55 } = 0.80
* Davis et al., 2009: The Method for Object-based Diagnostic Evaluation (MODE) Applied to WRF Forecasts from the 2005 SPC Spring Program. Weather and Forecasting.
copyright 2018, UCAR, all rights reserved
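A minimal Python sketch of that calculation (illustrative only; the interest matrix here is a placeholder):

    import numpy as np

    def mmi(interest_matrix):
        # Median of Maximum Interest: take the maximum interest for every
        # observed object (row) and every forecast object (column), then the median.
        row_max = interest_matrix.max(axis=1)
        col_max = interest_matrix.max(axis=0)
        return np.median(np.concatenate([row_max, col_max]))

    # For the 2x3 example above the maxima are {0.90, 0.80, 0.90, 0.80, 0.55}, giving MMI = 0.80.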

100 Median of the Maximum Interest (MMI): Quilt Plot
MMI as a function of convolution radius (grid squares) and threshold (mm) for a 24-h forecast of 1-h rainfall; each pixel is a MODE run. This graphic is not produced by MET, but R code is available on the MET website.
MODE Output (text files):
Object Attribute file (*_obj.txt): header with field names and object-definition info; object ID and category; simple object attributes (centroid info, length, width, area, etc.); matched pair/composite information (centroid distance, angle difference, symmetric difference, etc.); NAs for non-relevant output.
Contingency Table Stat file (*_cts.txt): header with field names and object-definition info; contingency table counts (hits, false alarms, misses, and correct negatives, in FY_OY/FY_ON/FN_OY/FN_ON notation); contingency table statistics such as BASER, FBIAS, GSS, CSI, PODY, FAR, etc.
copyright 2018, UCAR, all rights reserved

101 Use of MODE Pair Attributes
Centroid Distance: a quantitative measure of the forecast spatial displacement; small is good.
Axis Angle: for non-circular objects, a measure of orientation errors; small is good.
Area Ratio = Fcst Area / Obs Area: an objective measure of whether the areal extent of the forecast is over- or under-predicted; close to 1 is good.
Symmetric Difference: the non-intersecting area of the forecast and observed objects; a summary statistic for how well the two objects match; small is good.
P50 / P90 Intensity: objective measures of the median (50th percentile) and near-peak (90th percentile) intensities within the objects; a ratio close to 1 is good. (Example: Obs P50 = 26.6, P90 = 31.5; Fcst P50 = 29.0, P90 = 33.4.)
Total Interest: a summary statistic derived from the fuzzy logic engine using user-defined interest maps for all of these attributes plus some others; close to 1 is good. In the two example cases shown, the pair with the smaller angle and symmetric differences receives the higher total interest (0.90 versus 0.75).
copyright 2018, UCAR, all rights reserved

102 Scoring MODE Objects
Use the total interest threshold to separate matched objects (hits) from false alarms and misses, then compute traditional categorical statistics (sometimes area-weighted):
critical success index (CSI) = Hits / (Hits + Misses + False Alarms)
bias = (Hits + False Alarms) / (Hits + Misses)
How the NetCDF output could be used: employ a different plotting approach to show matched clusters, and display the actual intensities inside the objects (in this case reflectivity). Plots generated using NCL.
MODE Examples: traditional and El Nino climate
El Nino climate example: compare model versus observations (climate anomalies, not individual forecasts) and quantify the differences in each anomaly type separately. Example object: Fcst Area 6302, Obs Area 4020, Centroid Dist 12.4, Intersection Area 3189, Interest 0.98.
copyright 2018, UCAR, all rights reserved
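A minimal Python sketch of those two object-count scores (illustrative; MODE writes the same statistics to the *_cts.txt file):

    def object_csi(hits, misses, false_alarms):
        # critical success index = Hits / (Hits + Misses + False Alarms)
        return hits / float(hits + misses + false_alarms)

    def object_freq_bias(hits, misses, false_alarms):
        # frequency bias = (Hits + False Alarms) / (Hits + Misses)
        return (hits + false_alarms) / float(hits + misses)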

103 Effect of Radius and Threshold
Increasing the convolution radius and increasing the threshold change the objects that MODE resolves.
MODE Analysis Tool: mode_analysis
MODE_Analysis Usage:
mode_analysis -lookin path -summary or -bycase [-column name] [-dump_row filename] [-out filename] [-log filename] [-v level] [-help] [MODE FILE LIST] [-config config_file] or [MODE LINE OPTIONS]
MODE LINE OPTIONS, object toggles:
-fcst versus -obs: selects lines pertaining to forecast objects or observation objects
-single versus -pair: selects single-object lines or pair lines
-simple versus -cluster: selects simple-object lines or cluster lines
-matched versus -unmatched: selects matched or unmatched simple-object lines
Other options (each option followed by a value): -model, -fcst_thr/-obs_thr, -fcst_var, etc.; -area_min/max, -intersection_area_min/max, etc.; -centroid_x_min/max, -centroid_y_min/max, -axis_ang_min/max, -int10_min/max, -centroid_dist_min/max, -angle_diff_min/max, etc.
MODE Analysis Tool -summary Example
Command line:
mode_analysis -summary \
  -lookin mode_output/wrf4ncep/40km/ge03. \
  -fcst -cluster \
  -area_min 100 \
  -column centroid_lat -column centroid_lon \
  -column area \
  -column axis_ang \
  -column length
Output:
Total mode lines read = 393
Total mode lines kept = 17
Provides summary statistics (N, Min, Max, Mean, StdDev, P10, P25, P50, P75, P90, Sum) for forecast clusters with a minimum area of 100 grid squares, for the specified MODE output columns (centroid_lat, centroid_lon, area, axis_ang, length).
copyright 2018, UCAR, all rights reserved

104 MODE Analysis Tool -bycase Example
Command line:
mode_analysis -bycase -lookin mode_output/wrf4ncep/40km/ge03. -single -simple
Output:
Total mode lines read = 393
Total mode lines kept = 141
Provides, for each case in the directory, tallied information for all simple objects: forecast valid time, matched and unmatched area, and the numbers of matched and unmatched forecast and observed objects (the cases here span Apr 26 through Jun 4).
Example: REFC > 30 dBZ, impact of the smoothing (convolution) radius
As the convolution radius increases (starting at 3 grid squares) the objects smooth out while the pair attributes stay similar, e.g. Total Interest 0.96, Area Ratio 0.57 to 0.53, Centroid Distance 95 to 92 km, P90 Intensity Ratio 1.00 to 1.04; FSS = 0.64.
copyright 2018, UCAR, all rights reserved

105 Example: May 11, 2013, DTC SREF Tests (ARW members)
Figure legend: matched observed object, matched ensemble-mean object, matched forecast object, unmatched forecast object, unmatched observed object; no ensemble mean matched; spread increases with time.
MODE Example: Forecast Analogs
Pair attributes such as area ratio (1.19, 0.81, 1.09), centroid distance, and angle difference (e.g. 34.20) are compared across the analogs.

Model Evaluation Tools Tropical Cyclone (MET-TC)
Kathryn M. Newman
2015 MET tutorial, February 3, 2015, Boulder, CO

106 Introduction
WHAT is MET-TC? A set of tools to aid in TC forecast evaluation and verification. Developed to replicate (and add to) the functionality of the National Hurricane Center (NHC) verification software. A modular set of tools that utilize the MET software framework, allowing additional capabilities and features to be added in future releases.
WHY use MET-TC? Provides tropical cyclone (TC) verification statistics consistent with the operational centers, and makes it easy to parse and subset TC datasets.
Compile & build: must use METv4.1 or METv5.0 for MET-TC. MET-TC specific code and tools:
bin/ : executables for each MET-TC module (tc_dland, tc_pairs, tc_stat)
share/met/config/ : configuration files (TCPairsConfig_default, TCStatConfig_default)
share/met/tc_data/ : static files used in MET-TC (*land.dat, wwpts_us.txt)
doc/ : contains the MET-TC User's Guide
src/tools/tc_utils/ : source code for the three MET-TC modules
scripts/rscripts/ : contains the R script (plot_tcmpr.r) which provides graphics tools for MET-TC
Getting Started: the best track analysis is primarily used as the observational dataset in MET-TC, though any reference dataset in ATCF format may be used. The input files must be in Automated Tropical Cyclone Forecasting System (ATCF) format, and model output must be run through an internal/external vortex tracking algorithm.

107 Observations
Observations are an important consideration for TC verification: the quality and quantity of available observations matters, and they are typically sparse or intermittent. The best track analysis is primarily used as the observational dataset in MET-TC. All operational model aids and the best track analysis can be found on the NHC ftp server: ftp://ftp.nhc.noaa.gov/atcf/archive/
Best track analysis: a subjective assessment of the TC's center location and intensity (every 6 hr) using all available observations. Includes center position, maximum surface winds, minimum center pressure, and quadrant radii of 34/50/64 kt winds. It is a subjectively smoothed representation of the storm's location and intensity over its lifetime. Example BEST lines for BERTHA:
AL, 02, ,, BEST, 0, 132N, 252W, 35, 1006, TS, 34, NEQ, 30, 30, 0, 30, 1012, 170, 30, 45, 0, L, 0,, 0, 0, BERTHA, M, 12, NEQ, 30, 30, 0, 30
AL, 02, ,, BEST, 0, 134N, 265W, 40, 1006, TS, 34, NEQ, 60, 30, 0, 60, 1012, 170, 30, 50, 0, L, 0,, 0, 0, BERTHA, M, 12, NEQ, 30, 30, 30, 30
AL, 02, ,, BEST, 0, 140N, 278W, 40, 1003, TS, 34, NEQ, 60, 30, 0, 60, 1012, 180, 30, 50, 0, L, 0,, 0, 0, BERTHA, D, 12, NEQ, 30, 30, 30, 30
AL, 02, ,, BEST, 0, 148N, 292W, 45, 1000, TS, 34, NEQ, 75, 30, 0, 75, 1012, 180, 30, 55, 0, L, 0,, 0, 0, BERTHA, D, 12, NEQ, 60, 30, 30, 60
AL, 02, ,, BEST, 0, 154N, 308W, 45, 1000, TS, 34, NEQ, 75, 30, 0, 75, 1012, 180, 30, 55, 0, L, 0,, 0, 0, BERTHA, D, 12, NEQ, 120, 120, 60, 90
AL, 02, ,, BEST, 0, 158N, 326W, 45, 1000, TS, 34, NEQ, 75, 30, 0, 75, 1012, 180, 30, 55, 0, L, 0,, 0, 0, BERTHA, D, 12, NEQ, 120, 120, 60, 90
AL, 02, ,, BEST, 0, 163N, 344W, 45, 1000, TS, 34, NEQ, 75, 30, 0, 75, 1012, 180, 30, 55, 0, L, 0,, 0, 0, BERTHA, D, 12, NEQ, 120, 120, 60,
Getting Started: Automated Tropical Cyclone Forecasting System (ATCF) format
First developed at the Naval Research Laboratory (NRL); currently used for NHC operations. Input data must adhere to this format for the MET-TC tools to parse it properly (the first 17 columns must exist; missing values are OK). To ensure proper matching, the input data must contain: basin, cyclone number, initialization time, forecast hour, and model name.
Example ADECK line: AL, 18, , 03, AVNO, 48, 152N, 812W, 25, 1006, XX, 34, NEQ, 0, 0, 0, 0,
The MET-TC User's Guide outlines these 17 columns and the necessary fields; see the ATCF documentation for detailed format information.
Getting Started: model output must be run through an internal/external vortex tracking algorithm. Any algorithm that obtains basic position, maximum wind, and minimum sea level pressure information from model forecasts (in ATCF format) may be used. Fully supported and freely available: the GFDL Vortex Tracker (code and documentation available online).
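As an illustration of how those leading ATCF columns line up, a minimal Python sketch (the dictionary keys are descriptive assumptions, not MET-TC code):

    def parse_atcf_line(line):
        # Split a comma-delimited ATCF record and pull out the fields MET-TC
        # needs in order to match forecast and best-track points.
        cols = [c.strip() for c in line.split(",")]
        return {
            "basin":     cols[0],        # e.g. AL
            "cyclone":   cols[1],        # cyclone number
            "init":      cols[2],        # YYYYMMDDHH initialization / valid time
            "model":     cols[4],        # technique name, e.g. AVNO or BEST
            "fcst_hour": int(cols[5]),   # forecast lead (TAU)
            "lat":       cols[6],        # e.g. 152N (tenths of a degree)
            "lon":       cols[7],        # e.g. 812W
            "vmax":      cols[8],        # maximum wind (kt)
            "mslp":      cols[9],        # minimum sea level pressure (hPa)
        }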

108 MET-TC components
(Flowchart: aland.dat feeds TC_DLAND, which produces the distance-to-land file; ATCF ADECK and BDECK files feed TC_PAIRS, which produces the pair output; TC_STAT produces aggregated statistics. Dark circles represent MET-TC modules; light boxes represent input/output.)
Primary functions of the code: compute pair statistics from ATCF input files; filter pair statistics based on user specifications; compute summary statistics.
TC-dland (TC_DLAND)
Aids in quickly parsing data for filter jobs: verify only over water; threshold verification based on distance to land; exclusion/inclusion of forecasts within a specified window of landfall.
Input: a file containing lon/lat coordinates of all coastlines/islands considered to be a significant landmass (aland.dat, shland.dat, wland.dat).
Output: a gridded field representing the distance to the nearest coastline/island, in NetCDF format.
TC-dland Usage:
tc_dland out_file [-grid_spec] [-noll] [-land file] [-log file] [-v level]
This executable only needs to be run once to establish the file! If running over the AL/EP basins and the NHC land/water determination or 1/10th-degree global coverage is desired, the file is provided in the build.
out_file       Output file containing the computed distances to land
-grid_spec     Overrides the default 1/10th-degree grid
-noll          Skips writing the lat/lon variables to reduce the size of the file
-land file     Overrides the default land data file
-log file      Outputs log messages to the specified file
-v level       Overrides the default level of verbosity (2)
TC-pairs (TC_PAIRS)
Produces pair statistics on independent model input or user-specified consensus forecasts. Matches the forecast with a reference TC dataset (most commonly the best track analysis). Pair generation can be subset based on user-defined filtering criteria. The pair output allows new or additional analyses to be completed without repeating the full verification process.

109 tc_pairs
Input: gridded distance file, forecast/reference in ATCF format. Output: TCST format (header, column-based output).
Usage:
tc_pairs -adeck source -bdeck source -config file [-out base] [-log file] [-v level]
-adeck source   ATCF format file containing the TC model forecast
-bdeck source   ATCF format file containing the TC reference dataset
-config file    Name of the configuration file to be used
-out base       Path and base name of the output files
-log file       Name of the log file associated with the pairs output
-v level        Desired level of verbosity
The configuration file determines the filtering criteria: MODEL, STORM_ID, BASIN, CYCLONE, STORM_NAME, INIT_BEG/INIT_END, INIT_INC/INIT_EXC, VALID_BEG/VALID_END, INIT_HR, INIT_MASK, VALID_MASK, CHECK_DUP, INTERP_12, CONSENSUS, LAG_TIME, BEST_BASELINE, OPER_BASELINE, MATCH_POINTS, DLAND_FILE, VERSION. Take care not to over-subset! Additional filters can be performed with the tc_stat tool.
Example configuration entries:
// Model initialization time windows to include or exclude
init_beg = ""; init_end = "";
init_inc = []; init_exc = [];
// Valid model time window
valid_beg = ""; valid_end = "";
// Model initialization hours
init_hour = [];
// Lat/lon polylines defining masking regions
init_mask = ""; valid_mask = "";
// Specify if the code should check for duplicate ATCF lines when building tracks
check_dup = FALSE;
// Specify whether special processing should be performed for interpolated models
interp12 = REPLACE;
// Specify how consensus forecasts should be defined, e.g.:
// consensus = [ { name = "CON1"; members = [ "MOD1", "MOD2", "MOD3" ]; required = [ TRUE, FALSE, FALSE ]; min_req = 2; } ];
consensus = [];
Output: space-delimited columns with header information, including LEAD, VALID, INIT_MASK, VALID_MASK, LINE_TYPE, TOTAL, INDEX, LEVEL, WATCH_WARN, INITIALS, ALAT, ALON, BLAT, BLON, TK_ERR, X_ERR, Y_ERR, ALTK_ERR, CRTK_ERR. The example TCMPR lines show paired track points for each forecast lead, with storm categories (TD, TS, HU) and watch/warning flags (HUWATCH, HUWARN, TSWARN) where applicable.
TC Metrics
Track Error: great-circle distance between the forecast location and the actual location of the storm center (nmi).
Along-track Error: an indicator of whether a forecasting system is moving a storm too slowly or too quickly.
Cross-track Error: indicates displacement to the right or left of the observed track.
Intensity Error: difference between the forecast and actual intensity (kt).
Graphics courtesy of NCAR TCMT.
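An example invocation following that usage (the ADECK/BDECK file names are hypothetical):

    tc_pairs -adeck adeck/aal092014.dat -bdeck bdeck/bal092014.dat \
             -config TCPairsConfig -out tc_pairs/al092014 -v 2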

110 tc_stat
Provides summary statistics and filtering jobs on TCST output.
Filter job: stratifies the pair output by various conditions and thresholds.
Summary job: produces summary statistics on a specific column of interest.
Input: TCST output from tc_pairs. Output: a TCST output file for either the filter or summary job.
Usage:
tc_stat -lookin source [-out file] [-log file] [-v level] [-config file] [JOB COMMAND LINE]
Configuration file options are applied to every job unless an individual job specifies a configuration option; job-list options will override them.
-lookin source   Location of the TCST files generated by tc_pairs
-out file        Desired name of the output file
-log file        Name of the log file associated with the tc_stat output
-v level         Verbosity level
-config file     Configuration file to be used
Job command line: specify the job list on the command line.
The configuration file filters the TCST output from tc_pairs down to the desired subset over which statistics are computed: AMODEL/BMODEL, STORM_ID, BASIN, CYCLONE, STORM_NAME, INIT_BEG/INIT_END, INIT_INC/INIT_EXC, VALID_BEG/VALID_END, VALID_INC/VALID_EXC, INIT_HR/VALID_HR/LEAD, INIT_MASK/VALID_MASK, LANDFALL, LANDFALL_BEG (END), TRACK_WATCH_WARN, COLUMN_THRESH_NAME (VAL), COLUMN_STR_NAME (VAL), INIT_THRESH_NAME (VAL), INIT_STR_NAME (VAL), WATER_ONLY, RAPID_INTEN, MATCH_POINTS, EVENT_EQUAL, EVENT_EQUAL_LEAD, OUT_INIT_MASK, OUT_VALID_MASK, LINE_TYPE, JOBS [ ], VERSION.
Example configuration entries:
// Stratify by the ADECK and BDECK distances to land
water_only = FALSE;
// Retain only track points with rapid intensification/weakening of the maximum wind speed in the previous time step
rapid_inten = { track = NONE; (NONE, ADECK, BDECK, BOTH) time = 24; exact = TRUE; (exact or max intensity diff) thresh = >=30.0; }
// Retain only track points occurring near landfall, with the retention window defined as a number of seconds offset from the landfall time
landfall = FALSE; landfall_beg = ; landfall_end = 0;
// Retain only track points common to both the ADECK and BDECK tracks
match_points = TRUE;
// Retain only cases common to all models in the dataset
event_equal = TRUE;
// Lead times that must be present for a track to be included in the event equalization logic
event_equal_lead = [ 12, 24, 36 ];
The tc_stat output is similar to the tc_pairs output for a filter job (TCST). For a summary job, the -column option produces summary statistics for the specified column, and the -by option can be used to summarize over each unique entry in the selected column.
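A sketch of a summary job using those options (the file paths are hypothetical and the exact job syntax should be checked against the MET-TC User's Guide):

    tc_stat -lookin tc_pairs/ \
            -job summary -column TK_ERR -by AMODEL,LEAD \
            -out tk_err_summary.tcst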

111 Graphics tools
Graphical capabilities are included in the MET-TC release: plot_tcmpr.r.
Input: TCST tc_pairs output. Output: R graphics and, optionally, tc_stat logs/filter-job TCST output.
Usage: Rscript plot_tcmpr.r -lookin <files> -filter (specify a filter job) -config (run a filter job with a configuration file). A default Rscript configuration file is included in the release.
Graphics tools examples: box plots, mean error with confidence intervals, frequency of superior performance, rank frequency.

METViewer Training
METVIEWER, CONTAINERS, AND MET+

112 Verification Dataflow
METViewer components. Packages: Java, Apache/Tomcat, MySQL, R statistics.
(Dataflow: models and obs/analyses feed MET or the NCEP Vx package, producing neighborhood and MODE objects & attributes and continuous, categorical & probabilistic stats, or VSDB output, which are loaded into METViewer for plotting.)
What can be loaded into METViewer?
STAT files (*.stat) from the MET tools: Grid-Stat, Point-Stat, Ensemble-Stat, MODE.
VSDB files from the NCEP verification packages: Grid-to-Obs, Grid-to-Obs_e*, Grid-to-Grid, Grid-to-Grid_e*. (* Some variables in ensemble VSDB are currently not available for loading but will be soon.)
MET Ensemble and Probability Evaluation
Ensemble characteristics: rank histogram, PIT, CRPS, ignorance score, Eckel spread-skill.
Probability measures: Brier score + decomposition, Brier skill score, ROC and area under ROC, reliability.

113 METViewer Interface Basics
1. Database: click on the down arrow to pick a database.
2. Plot template selection: click on a tab to pick the plot type: Series, Box, Bar, ROC, Reliability, Ensemble Spread-Skill, Performance Diagram, Taylor Diagram, Hist (Rhist, Phist, RELP), ECLV.
3. Y-axis variables: click on the down arrow to pick the variable, then pick the statistic(s).

114 METViewer Interface Basics (continued)
4. Y-axis series: click on the down arrow to choose which lines/boxes you want to include on the Y-axis. Click on Group if you want selections grouped into one series (e.g. leads 0+12 hr in group 1 and 24+36 hr in group 2, etc.). Use Series Variable to define additional series.
5. Fixed Values: your stratifications. For example, select one or more thresholds to be aggregated, or one or more init times, or a date range based on valid times or cycle time. Use Fixed Values to define additional stratifications.
6. X-axis: select the dropdown arrow to choose what to plot against on the X-axis (the independent variable).
7. How to compute statistics: Summary scores are computed for each combination of fixed values and independent variable, then the mean or median is taken. Aggregate statistics accumulates SL1L2 lines or CTC counts prior to calculating the statistic: pick SL1L2 if plotting continuous statistics (RMSE, MAE, etc.), pick CTC (aka FHO) if plotting categorical statistics (ETS, TSS, frequency bias, etc.). None is used if MET output is loaded and the statistics are already computed.

115 Summary vs. Aggregation
8. If the Summary option is selected, choose whether you want the Mean, Median or Sum of the values plotted.
Summary example: RMSE is computed for each date/forecast-lead combination (columns: model, fcst_init_beg, lead, fcst_var, stat_name, stat_value), then the mean or median is taken; e.g. GFS/212 RMSE of T at 0-, 24-, and 48-h leads for initializations from Aug 3 through Aug 9, 2014 at 00 UTC.
Aggregation example (from partial sums): Y1 variable RMSE, Y1 series GFS/212, fixed values of 5 days at 00 UTC, X-axis of 3 lead times; the SL1L2 lines are accumulated before the statistic is computed.
Aggregation and Bootstrapping: bootstrapping for confidence intervals, including setting the confidence level, is available on the Aggregation Statistics page. NOTE: you must also set CIs to true on the series you want them on.
9. Format Series: turn on confidence intervals; change line color, symbol, line types and widths; turn connecting across NAs on or off; change legend info.
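The difference between the two modes can be sketched in a few lines of Python (illustrative only; the SL1L2 partial-sum names TOTAL, FFBAR, OOBAR, FOBAR follow MET's SL1L2 line type, and the data here are placeholders):

    import numpy as np

    def summary_rmse(rmse_per_case):
        # "Summary": statistics were already computed per date/lead; just average them.
        return np.mean(rmse_per_case)

    def aggregated_rmse(sl1l2_lines):
        # "Aggregate": accumulate SL1L2 partial sums first, then compute RMSE once.
        # Each line is (TOTAL, FFBAR, OOBAR, FOBAR) = (count, mean f^2, mean o^2, mean f*o).
        total = sum(l[0] for l in sl1l2_lines)
        ffbar = sum(l[0] * l[1] for l in sl1l2_lines) / total
        oobar = sum(l[0] * l[2] for l in sl1l2_lines) / total
        fobar = sum(l[0] * l[3] for l in sl1l2_lines) / total
        return np.sqrt(ffbar + oobar - 2.0 * fobar)   # since (f-o)^2 = f^2 + o^2 - 2fo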

116 Difference Curves and Statistical Significance
Add difference curve (the button next to it removes the difference curve). You may also want to turn on confidence intervals. Turn on Show Signif to have statistically significant differences highlighted.
10. Formatting the Plot: titles, axes, legends, etc.
Common: staggering points, turning on removal of lag-1 autocorrelation, displaying the number of stats.
X1, X2, Y1, Y2: set the axis, font size, label orientation.
Formatting: plot size, image type, font size for titles, additional lines.
Legend & Caption: set font size, location, text color.
Under Formatting, abline(h=0) tells it to put a horizontal line at 0; abline(h=1) would be a horizontal line at 1; abline(v=0) would be a vertical line at 0; etc.

117 METViewer History
METViewer Save: save plots, XML, data, R scripts, etc. (based on which tab is selected).
METViewer Upload: upload XML scripts from your system.

118 METViewer MODE Interface
Plot Data: choose between Stat and MODE.
Docker MET and METViewer (as described earlier): open-source container technology; the MET container is available on the MET downloads page; MET and METViewer are packaged in a Docker container, set up to work with a suite of test cases for NWP innovation testing (MMET/MERIT) and presented in an AMS Short Course at this meeting (Jan 6th). dex.php
MET+ Overview

119 MET+ Unified Package
Python wrappers around MET and METViewer: simple to set up and run; automated plotting of 2D fields and statistics.
Initial system: global deterministic, with plans to generalize across scales when possible to quickly spin up ensemble, high-resolution and global components. (Diagram: Python wrappers around MET and METViewer in a GitHub repository, producing spatial plots and stats plots.)
What is currently wrapped with Python?
(MET flowchart: input and reformatting tools such as PCP-Combine, Regrid-Data-Plane, MODIS Regrid, WWMCA Regrid, PB2NC, MADIS2NC, LIDAR2NC, GSI tools, Gen-Vx-Mask, Shift-Data-Plane; plotting tools such as Plot-Data-Plane, WWMCA Plot, Plot-Point-Obs; statistics tools such as MTD, Series-Analysis, MODE, Wavelet-Stat, Grid-Stat, Ensemble-Stat, Point-Stat; analysis tools such as MODE-Analysis and Stat-Analysis; and the MET-TC tools TC-DLAND, TC-PAIRS, TC-STAT and plot_tcmpr.R.)
What does wrapped by Python mean?
master_metplus.py reads the METplus config scripts plus your input and writes metplus_final.conf; METplus script 1 runs MET tool 1 on the input to produce output 1; METplus script 2 produces output 2; METplus script 3 runs MET tool 2 to produce output 3. In short: from .conf files to running MET.
Copyright 2017, University Corporation for Atmospheric Research, all rights reserved
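A sketch of what that looks like on the command line (the -c flag and file names are assumptions for illustration; check the METplus documentation for the exact syntax):

    master_metplus.py -c parm/use_cases/feature_relative/feature_relative.conf \
                      -c parm/user_system.conf.theia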

120 What does wrapped by Python mean? (continued)
The control file and config Python scripts live at METplus/parm/use_cases/feature_relative (feature_relative.conf).
In the MET configs (e.g. the Series_Analysis_Config file), environment variables are passed in from the constants file.

121 MET+ Beta - Prerequisites
Python 2.7 (** when we started, this was specified by NCO)
R version 3.2.5 (** only if you are using the plot_tcmpr.r tool)
nco (NetCDF operators)
MET version 6.0 or later installed (** the tool is designed to sit on top of MET and should be version-insensitive after METv6.0)
Basic familiarity with MET
MET+ Beta Installations
Theia: /scratch4/bmc/dtc/tara.jensen/metplus
WCOSS Gyre: /global/noscrub/julie.prestopnik/metplus
WCOSS Surge: /gpfs/hps3/emc/global/noscrub/julie.prestopnik/METplus
Getting Started
Instructions for grabbing the release: Tplus
Grabbing the Release
Instructions for downloading: click on the green download button on the right side.
Instructions for cloning: git clone ETplus
You should now have a METplus directory.

122 Grabbing the Release
Recommended Procedure - User Information
Recommended Procedure - Developer Information
Existing MET Builds: v6.1_existing_met_builds.php

123 Setting up profile - Theia (.cshrc)
set loadmemetplus='yes'
if ( $loadmemetplus == 'yes' ) then
  module use /contrib/modulefiles
  module load met
  module load nco
  module load wgrib2
  module load R
  set METPLUS_PATH=/scratch4/BMC/dtc/Tara.Jensen/METplus
  set MET_PATH=/contrib/met/6.1
  setenv JLOGFILE ${METPLUS_PATH}/logs/metplus_jlogfile
  setenv PYTHONPATH ${METPLUS_PATH}/ush:${METPLUS_PATH}/parm
  setenv PATH ${PATH}:${METPLUS_PATH}/ush:.
endif

Setting up profile - Gyre (WCOSS, /u/user/.bashrc)
loadmetplus='yes'
if [ "$loadmetplus" == 'yes' ]; then
  echo "Loading METplus environment"
  module use /global/noscrub/julie.prestopnik/modulefiles
  module load met/6.1
  module load nco
  module load grib_util/v1.0.3
  module use /usrx/local/dev/modulefiles
  module load python
  export METPLUS_DEMO="/global/noscrub/Julie.Prestopnik/"
  export MET_DEMO="/global/noscrub/Julie.Prestopnik/met/6.1"
  export JLOGFILE="${METPLUS_DEMO}/METplus/logs/metplus_jlogfile"
  export PYTHONPATH="${METPLUS_DEMO}/METplus/ush:${METPLUS_DEMO}/METplus/parm"
  export PATH="${PATH}:${METPLUS_DEMO}/METplus/ush:."
fi

Setting up profile - Surge
loadmetplus='yes'
if [ "$loadmetplus" == 'yes' ]; then
  echo "Loading METplus environment"
  module use /gpfs/hps3/emc/global/noscrub/julie.prestopnik/modulefiles
  module load met/6.1
  module load grib_util/1.0.3
  module use /usrx/local/dev/modulefiles
  module load python
  module load nco-gnu-sandybridge/4.4.4
  export METPLUS_DEMO="/gpfs/hps3/emc/global/noscrub/Julie.Prestopnik/"
  export MET_DEMO="/gpfs/hps3/emc/global/noscrub/Julie.Prestopnik/met/6.1"
  export JLOGFILE="${METPLUS_DEMO}/METplus/logs/metplus_jlogfile"
  export PYTHONPATH="${METPLUS_DEMO}/METplus/ush:${METPLUS_DEMO}/METplus/parm"
  export PATH="${PATH}:${METPLUS_DEMO}/METplus/ush:."
fi

Directory Structure
doc/ - Doxygen documentation
internal_tests/ - developer tests
parm/ - where configs live
README.md - general README
sorc/ - executables
ush/ - python scripts

124 Directory listings: METplus/doc, METplus/parm, METplus/internal_tests, METplus/sorc, METplus/ush
Key to running METplus: the parm dir
Met_config: all MET configuration files with environment variables should reside here.
Metplus_config: three basic files that can be set by a system administrator for all to use.
Use_cases:
feature_relative - three ways of running the feature-relative software
qpf - one example of a GRIB-to-GRIB comparison and lots of GEMPAK examples
track_and_intensity - one example of computing track and intensity scores using the plot_tcmpr.r script
{user}_system.conf.{system_name}: allows the user to override base system settings and write data into a given directory.

125 Suggestions on how to set up the parm dir
Met_config: all MET configuration files with environment variables should reside here.
Metplus_config: common install for a BRANCH; includes paths to commonly used data.
Use_cases: common install for a FUNCTIONAL GROUP; includes paths for the tests you're conducting.
{user}_system.conf.{system_name}: place your variances from the use cases here, including pointing to your output directory or pointing to a different config you are trying, etc.
Three Use Cases
Track and Intensity: use MET-TC to pair up ATCF track files, then plot_tcmpr.r to compute track and intensity errors and plot them.
Feature Relative: use MET-TC to pair up ATCF track files; extract 30-deg by 30-deg tiles from GFS forecast and analysis files for comparison; use Series-Analysis to compute statistics for the stack of tiles over CONUS; use Plot-Data-Plane to generate quick-look plots.
QPF: use Pcp-Combine to accumulate 1-hr QPE into a 3-hr accumulation; use Grid-Stat to compute categorical statistics (an example Pcp-Combine command is sketched below).
Phase I - Left to be done
Moving the MET code base from SVN to GitHub.
Additional scripting to emulate the base global verification.
Scripting to push data to the METViewer server, load the data, and make basic batch plots.
Install and test on Theia and WCOSS.
Python scripting to emulate the base global Vx plots.
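A sketch of the Pcp-Combine step referenced in the QPF use case (file and directory names are hypothetical; the -sum arguments are the initialization time, input accumulation, valid time, and output accumulation):

    pcp_combine -sum 00000000_000000 01 20180201_120000 03 \
                out/ST4_2018020112_A03.nc -pcpdir stage4_hourly/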


More information

Upscaled and fuzzy probabilistic forecasts: verification results

Upscaled and fuzzy probabilistic forecasts: verification results 4 Predictability and Ensemble Methods 124 Upscaled and fuzzy probabilistic forecasts: verification results Zied Ben Bouallègue Deutscher Wetterdienst (DWD), Frankfurter Str. 135, 63067 Offenbach, Germany

More information

A Gaussian Mixture Model Approach to

A Gaussian Mixture Model Approach to A Gaussian Mixture Model Approach to A Parametric, Feature-Based Method Valliappa 1,2 John Kain 2 1 Cooperative Institute of Mesoscale Meteorological Studies University of Oklahoma 2 Radar Research and

More information

Probabilistic seasonal forecast verification

Probabilistic seasonal forecast verification Probabilistic seasonal forecast verification Caio Coelho Centro de Previsão de Tempo e Estudos Climáticos (CPTEC) Instituto Nacional de Pesquisas Espaciais (INPE) Plan of lecture Introduction: Examples

More information

Extracting probabilistic severe weather guidance from convection-allowing model forecasts. Ryan Sobash 4 December 2009 Convection/NWP Seminar Series

Extracting probabilistic severe weather guidance from convection-allowing model forecasts. Ryan Sobash 4 December 2009 Convection/NWP Seminar Series Extracting probabilistic severe weather guidance from convection-allowing model forecasts Ryan Sobash 4 December 2009 Convection/NWP Seminar Series Identification of severe convection in high-resolution

More information

Application and verification of ECMWF products 2009

Application and verification of ECMWF products 2009 Application and verification of ECMWF products 2009 RHMS of Serbia 1. Summary of major highlights ECMWF products are operationally used in Hydrometeorological Service of Serbia from the beginning of 2003.

More information

WRF Modeling System Overview

WRF Modeling System Overview WRF Modeling System Overview Wei Wang & Jimy Dudhia Nansha, Guangdong, China December 2015 What is WRF? WRF: Weather Research and Forecasting Model Used for both research and operational forecasting It

More information

The role of testbeds in NOAA for transitioning NWP research to operations

The role of testbeds in NOAA for transitioning NWP research to operations ECMWF Workshop on Operational Systems November 18, 2013 The role of testbeds in NOAA for transitioning NWP research to operations Ligia Bernardet 1* and Zoltan Toth 1 1 NOAA ESRL Global Systems Division,

More information

Adaptation for global application of calibration and downscaling methods of medium range ensemble weather forecasts

Adaptation for global application of calibration and downscaling methods of medium range ensemble weather forecasts Adaptation for global application of calibration and downscaling methods of medium range ensemble weather forecasts Nathalie Voisin Hydrology Group Seminar UW 11/18/2009 Objective Develop a medium range

More information

Application and verification of ECMWF products 2011

Application and verification of ECMWF products 2011 Application and verification of ECMWF products 2011 National Meteorological Administration 1. Summary of major highlights Medium range weather forecasts are primarily based on the results of ECMWF and

More information

Comparison of NWP based nowcasting (AROME) with classical system. I. Technical implementation. RC - LACE stay report

Comparison of NWP based nowcasting (AROME) with classical system. I. Technical implementation. RC - LACE stay report Comparison of NWP based nowcasting (AROME) with classical system I. Technical implementation RC - LACE stay report by Mirela Pietrisi Supervisors: Christoph Wittmann, Florian Meier and Yong Wang 1 1 Introduction

More information

Performance of the INM short-range multi-model ensemble using high resolution precipitation observations

Performance of the INM short-range multi-model ensemble using high resolution precipitation observations Performance of the INM short-range multi-model ensemble using high resolution precipitation observations CARLOS SANTOS, ALFONS CALLADO, JOSE A. GARCIA-MOYA, DANIEL SANTOS-MUÑOZ AND JUAN SIMARRO. Predictability

More information

Verification of wind forecasts of ramping events

Verification of wind forecasts of ramping events Verification of wind forecasts of ramping events Matt Pocernich Research Application Laboratory - NCAR pocernic@ucar.edu Thanks to Brice Lambi, Seth Linden and Gregory Roux Key Points A single verification

More information

Supplementary Figure 1. Summer mesoscale convective systems rainfall climatology and trends. Mesoscale convective system (MCS) (a) mean total

Supplementary Figure 1. Summer mesoscale convective systems rainfall climatology and trends. Mesoscale convective system (MCS) (a) mean total Supplementary Figure 1. Summer mesoscale convective systems rainfall climatology and trends. Mesoscale convective system (MCS) (a) mean total rainfall and (b) total rainfall trend from 1979-2014. Total

More information

Ensemble Data Assimilation and Uncertainty Quantification

Ensemble Data Assimilation and Uncertainty Quantification Ensemble Data Assimilation and Uncertainty Quantification Jeff Anderson National Center for Atmospheric Research pg 1 What is Data Assimilation? Observations combined with a Model forecast + to produce

More information

WRF Developmental Testbed Center (DTC) Visitor Program Summary on visit achievements by Barbara Casati

WRF Developmental Testbed Center (DTC) Visitor Program Summary on visit achievements by Barbara Casati WRF Developmental Testbed Center (DTC) Visitor Program Summary on visit achievements by Barbara Casati Period of the visit: 31 st March - 25 th April 2008. Hosting laboratory: RAL division at NCAR, Boulder,

More information

9.4. Jennifer L. Mahoney NOAA Research-Forecast Systems Laboratory, Boulder, Colorado

9.4. Jennifer L. Mahoney NOAA Research-Forecast Systems Laboratory, Boulder, Colorado 9.4 NEW VERIFICATION APPROACHES FOR CONVECTIVE WEATHER FORECASTS Barbara G. Brown*, Randy R. Bullock, Christopher A. Davis, John Halley Gotway, Michael B. Chapman, Agnes Takacs, Eric Gilleland, Kevin Manning

More information

Statistically Evaluating High-Resolution Precipitation Forecasts. in the San Francisco Bay Area

Statistically Evaluating High-Resolution Precipitation Forecasts. in the San Francisco Bay Area Statistically Evaluating High-Resolution Precipitation Forecasts in the San Francisco Bay Area A THESIS SUBMITTED TO THE DEPARTMENT OF EARTH & CLIMATE SCIENCE OF SAN FRANCISCO STATE UNIVERSITY IN PARTIAL

More information

Application and verification of ECMWF products 2015

Application and verification of ECMWF products 2015 Application and verification of ECMWF products 2015 Turkish State Meteorological Service Unal TOKA,Yelis CENGIZ 1. Summary of major highlights The verification of ECMWF products has continued as in previous

More information

Numerical Weather Prediction: Data assimilation. Steven Cavallo

Numerical Weather Prediction: Data assimilation. Steven Cavallo Numerical Weather Prediction: Data assimilation Steven Cavallo Data assimilation (DA) is the process estimating the true state of a system given observations of the system and a background estimate. Observations

More information

Exploring ensemble forecast calibration issues using reforecast data sets

Exploring ensemble forecast calibration issues using reforecast data sets NOAA Earth System Research Laboratory Exploring ensemble forecast calibration issues using reforecast data sets Tom Hamill and Jeff Whitaker NOAA Earth System Research Lab, Boulder, CO tom.hamill@noaa.gov

More information

Verification and performance measures of Meteorological Services to Air Traffic Management (MSTA)

Verification and performance measures of Meteorological Services to Air Traffic Management (MSTA) Verification and performance measures of Meteorological Services to Air Traffic Management (MSTA) Background Information on the accuracy, reliability and relevance of products is provided in terms of verification

More information

ECMWF 10 th workshop on Meteorological Operational Systems

ECMWF 10 th workshop on Meteorological Operational Systems ECMWF 10 th workshop on Meteorological Operational Systems 18th November 2005 Crown copyright 2004 Page 1 Monthly range prediction products: Post-processing methods and verification Bernd Becker, Richard

More information

Impact of METOP ASCAT Ocean Surface Winds in the NCEP GDAS/GFS and NRL NAVDAS

Impact of METOP ASCAT Ocean Surface Winds in the NCEP GDAS/GFS and NRL NAVDAS Impact of METOP ASCAT Ocean Surface Winds in the NCEP GDAS/GFS and NRL NAVDAS COAMPS @ Li Bi 1,2 James Jung 3,4 Michael Morgan 5 John F. Le Marshall 6 Nancy Baker 2 Dave Santek 3 1 University Corporation

More information

Verification of ECMWF products at the Deutscher Wetterdienst (DWD)

Verification of ECMWF products at the Deutscher Wetterdienst (DWD) Verification of ECMWF products at the Deutscher Wetterdienst (DWD) DWD Martin Göber 1. Summary of major highlights The usage of a combined GME-MOS and ECMWF-MOS continues to lead to a further increase

More information

Spatial verification of NWP model fields. Beth Ebert BMRC, Australia

Spatial verification of NWP model fields. Beth Ebert BMRC, Australia Spatial verification of NWP model fields Beth Ebert BMRC, Australia WRF Verification Toolkit Workshop, Boulder, 21-23 February 2007 New approaches are needed to quantitatively evaluate high resolution

More information

Stability in SeaWinds Quality Control

Stability in SeaWinds Quality Control Ocean and Sea Ice SAF Technical Note Stability in SeaWinds Quality Control Anton Verhoef, Marcos Portabella and Ad Stoffelen Version 1.0 April 2008 DOCUMENTATION CHANGE RECORD Reference: Issue / Revision:

More information

Towards Operational Probabilistic Precipitation Forecast

Towards Operational Probabilistic Precipitation Forecast 5 Working Group on Verification and Case Studies 56 Towards Operational Probabilistic Precipitation Forecast Marco Turco, Massimo Milelli ARPA Piemonte, Via Pio VII 9, I-10135 Torino, Italy 1 Aim of the

More information

Understanding Weather and Climate Risk. Matthew Perry Sharing an Uncertain World Conference The Geological Society, 13 July 2017

Understanding Weather and Climate Risk. Matthew Perry Sharing an Uncertain World Conference The Geological Society, 13 July 2017 Understanding Weather and Climate Risk Matthew Perry Sharing an Uncertain World Conference The Geological Society, 13 July 2017 What is risk in a weather and climate context? Hazard: something with the

More information

ABSTRACT 3 RADIAL VELOCITY ASSIMILATION IN BJRUC 3.1 ASSIMILATION STRATEGY OF RADIAL

ABSTRACT 3 RADIAL VELOCITY ASSIMILATION IN BJRUC 3.1 ASSIMILATION STRATEGY OF RADIAL REAL-TIME RADAR RADIAL VELOCITY ASSIMILATION EXPERIMENTS IN A PRE-OPERATIONAL FRAMEWORK IN NORTH CHINA Min Chen 1 Ming-xuan Chen 1 Shui-yong Fan 1 Hong-li Wang 2 Jenny Sun 2 1 Institute of Urban Meteorology,

More information

3.6 NCEP s Global Icing Ensemble Prediction and Evaluation

3.6 NCEP s Global Icing Ensemble Prediction and Evaluation 1 3.6 NCEP s Global Icing Ensemble Prediction and Evaluation Binbin Zhou 1,2, Yali Mao 1,2, Hui-ya Chuang 2 and Yuejian Zhu 2 1. I.M. System Group, Inc. 2. EMC/NCEP AMS 18th Conference on Aviation, Range,

More information

P3.1 Development of MOS Thunderstorm and Severe Thunderstorm Forecast Equations with Multiple Data Sources

P3.1 Development of MOS Thunderstorm and Severe Thunderstorm Forecast Equations with Multiple Data Sources P3.1 Development of MOS Thunderstorm and Severe Thunderstorm Forecast Equations with Multiple Data Sources Kathryn K. Hughes * Meteorological Development Laboratory Office of Science and Technology National

More information

152 STATISTICAL PREDICTION OF WATERSPOUT PROBABILITY FOR THE FLORIDA KEYS

152 STATISTICAL PREDICTION OF WATERSPOUT PROBABILITY FOR THE FLORIDA KEYS 152 STATISTICAL PREDICTION OF WATERSPOUT PROBABILITY FOR THE FLORIDA KEYS Andrew Devanas 1, Lydia Stefanova 2, Kennard Kasper 1, Sean Daida 1 1 NOAA/National Wear Service, Key West, Florida, 2 COAPS/Florida

More information

Combining Deterministic and Probabilistic Methods to Produce Gridded Climatologies

Combining Deterministic and Probabilistic Methods to Produce Gridded Climatologies Combining Deterministic and Probabilistic Methods to Produce Gridded Climatologies Michael Squires Alan McNab National Climatic Data Center (NCDC - NOAA) Asheville, NC Abstract There are nearly 8,000 sites

More information

THE ATMOSPHERIC MODEL EVALUATION TOOL: METEOROLOGY MODULE

THE ATMOSPHERIC MODEL EVALUATION TOOL: METEOROLOGY MODULE THE ATMOSPHERIC MODEL EVALUATION TOOL: METEOROLOGY MODULE Robert C. Gilliam 1*, Wyat Appel 1 and Sharon Phillips 2 1 Atmospheric Sciences Modeling Division, National Oceanic and Atmospheric Administration,

More information

WMO Aeronautical Meteorology Scientific Conference 2017

WMO Aeronautical Meteorology Scientific Conference 2017 Session 1 Science underpinning meteorological observations, forecasts, advisories and warnings 1.6 Observation, nowcast and forecast of future needs 1.6.1 Advances in observing methods and use of observations

More information

Five years of limited-area ensemble activities at ARPA-SIM: the COSMO-LEPS system

Five years of limited-area ensemble activities at ARPA-SIM: the COSMO-LEPS system Five years of limited-area ensemble activities at ARPA-SIM: the COSMO-LEPS system Andrea Montani, Chiara Marsigli and Tiziana Paccagnella ARPA-SIM Hydrometeorological service of Emilia-Romagna, Italy 11

More information

Application and verification of ECMWF products 2016

Application and verification of ECMWF products 2016 Application and verification of ECMWF products 2016 RHMS of Serbia 1 Summary of major highlights ECMWF forecast products became the backbone in operational work during last several years. Starting from

More information

Have a better understanding of the Tropical Cyclone Products generated at ECMWF

Have a better understanding of the Tropical Cyclone Products generated at ECMWF Objectives Have a better understanding of the Tropical Cyclone Products generated at ECMWF Learn about the recent developments in the forecast system and its impact on the Tropical Cyclone forecast Learn

More information

Standardized Verification System for Long-Range Forecasts. Simon Mason

Standardized Verification System for Long-Range Forecasts. Simon Mason Standardized Verification System for Long-Range Forecasts Simon Mason simon@iri.columbia.edu MedCOF 2015 Training Workshop Madrid, Spain, 26 30 October 2015 SVS for LRF: Goal Provide verification information

More information