Introduction to Weather Data Cleaning


Speedwell Weather Limited: An Introduction

- Providing weather services since 1999
- Largest private-sector database of world-wide historic weather data
- Major provider of Settlement Data for weather risk contracts

Products:
- Weather data
- Weather forecasts
- Weather derivative pricing and risk-management software
- Weather risk contract settlement services
- Consultancy / weather station installation

Emphasis on quality: we see weather data as a form of financial market data.
We serve clients in the insurance, weather derivatives, banking, energy and agriculture sectors world-wide.
Offices in the United States and the United Kingdom.

Introduction to Weather Data Cleaning

Weather data quality control ("cleaning") is fundamental to our provision of weather data. The availability of reliable ground truth is key to the pricing and settling of parametric weather risk contracts. It is also important in the production of our site-specific downscaled forecasts.

This document shows how we approach the cleaning task.

- We clean over 5,000 sites around the world every day, with dedicated teams in the UK and USA.
- At any one time we provide Settlement Data for dozens of sites where clients have open weather risk, each to a specific Settlement Data Specification.

Please see the separate documentation relating to the provision of Settlement Data.

Data Cleaning

Problem: weather data is not perfect
- Missing values
- Erroneous observations
- Consistency problems
- Multiple data sets claiming to be for the same weather station

Solution (1): Ignore the problem
- Erroneous observations will lead to inaccurate pricing

Solution (2): Use only good stations
- It is difficult to determine what is good without cleaning it
- Greatly limits your ability to trade

Solution (3): Clean / fill the data
- Fill missing values
- Detect and replace erroneous observations
- Confirm the consistency of the data

Never purchase data from anyone unless they can provide satisfactory answers to these questions:
- What is the original source of the data?
- What is the observation convention?
- What are the attributes (lat, lon, elevation)?
- What has been done to the data?
- Which data points have been QC'd, and what were they pre-QC?

Data Cleaning

- The quality of meteorological observations varies significantly, even among G20 national met services
- Missing and erroneous observations are commonplace
- A lot of the weather data available in public archives is stored inconsistently and is of low quality

Fundamentals of a proper data cleaning
(1) Organization
(2) Redundancy
(3) Flexibility
(4) Human interaction
(5) Transparency

Satisfying the above depends on implementing a software systems infrastructure, but data cleaning cannot and SHOULD not be FULLY automated (see 4).

[Figure: part of the Speedwell data cleaning process diagram]

Data Cleaning: Organization

Fundamentals of a proper data cleaning
(1) Organization
  - logical flow
  - data management
(2) Redundancy
(3) Flexibility
(4) Human interaction
(5) Transparency

Process stages:
- Data preparation
- Initial review
- In-depth analysis / data filling
- Manual review
- Data delivery

[Figure: some of the 50+ Speedwell data quality types]

Data Cleaning: Redundancy

Fundamentals of a proper data cleaning
(1) Organization
(2) Redundancy
  - data sources
  - testing
  - estimates
  - delivery
(3) Flexibility
(4) Human interaction
(5) Transparency

A fundamental prerequisite for effective data cleaning is access to a library of weather data that includes nearby sites, allowing plausibility testing for the site being cleaned. Speedwell Weather maintains a very large inventory of weather data for over 50 different weather elements, stored using over 50 data quality types to fully respect differing conventions and sources (e.g. Synoptic / Climate, Cleaned / Raw). This allows us to document data point changes that may occur when national met offices change data records to reflect their internal QC procedures. Speedwell's weather data archive contains daily and hourly data for over 100,000 sites world-wide.

[Figure: example of weather variables stored for a single site]

Data sources: bring in as much as possible and keep what is useful
- Typical processing includes climate data (daily / hourly), synoptic data, METAR, ECMWF forecast data and climatology
- If one source fails there are others

Testing: no one test is appropriate for all situations
- Comparison against itself
- Physical consistency
- Statistical probability
- Comparison against neighbors: observations are compared against the median of a basket of proxies and the MAD (median absolute deviation). If the observation is statistically different from the surrounding stations, it is sent to the filling process.

Estimates (filling): why have one when you can have many?
- Useful for more in-depth manual analysis

Data delivery
- Multiple FTP deliveries
- 24-hour support
- Logging of all deliveries
- Description of data quality and type
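The neighbor comparison described above can be illustrated with a minimal sketch. This is not Speedwell's implementation: the function name `mad_outlier`, the threshold of 3.5, the minimum of three proxies and the 1.4826 scaling factor (which makes the MAD comparable to a standard deviation for normally distributed data) are all assumptions chosen for illustration. The sketch also assumes the proxy values have already been adjusted to be comparable with the site being tested.

```python
from statistics import median


def mad_outlier(observation, proxy_values, threshold=3.5):
    """Flag an observation that differs strongly from a basket of proxies.

    The observation is compared against the median of the proxy values,
    with the spread measured by the median absolute deviation (MAD).
    The threshold and scaling are illustrative assumptions.
    """
    if len(proxy_values) < 3:
        raise ValueError("need at least three proxy values")

    center = median(proxy_values)
    mad = median(abs(value - center) for value in proxy_values)

    # 1.4826 rescales the MAD so it approximates a standard deviation
    # when the proxy values are roughly normally distributed.
    scale = 1.4826 * mad
    if scale == 0:
        # All proxies agree (to within the median); any difference is suspect.
        return observation != center

    return abs(observation - center) / scale > threshold


# Hypothetical daily maximum temperatures (degrees C) at nearby stations.
neighbors = [14.8, 15.1, 15.4, 15.0, 14.6]
for candidate in (15.3, 25.1):
    verdict = "send to filling" if mad_outlier(candidate, neighbors) else "accept"
    print(candidate, verdict)
```

A flagged value is not necessarily wrong; in the process described here it would go on to the filling and manual review stages rather than being replaced automatically.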

Data Cleaning: Flexibility

Fundamentals of a proper data cleaning
(1) Organization
(2) Redundancy
(3) Flexibility
  - consider the situation
  - appropriateness of tests
(4) Human interaction
(5) Transparency

Estimate #1: surrounding station regression using deseasonalized data
Estimate #2: estimates of daily observations from hourly observations (curve fitting)
Estimate #3: estimates of daily observations by manipulating other data types (Synoptic, METAR, half-hourly)
Estimate #4: Day +1 forecasts, which can actually be very good
Estimate #5: climatology, the worst-case scenario
Estimate #6, #7, #8, ...: flexibility allows you to add any appropriate estimates. The possibilities are unlimited:
- satellite-derived values
- installed stations
- reanalysis
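As a rough illustration of Estimate #1 and of keeping several estimates side by side, the sketch below fits a straight line between deseasonalized values (anomalies from each station's climatological normal) at one surrounding station and at the target site, then adds the predicted anomaly back to the target's normal. It is a simplified sketch under stated assumptions: a single neighbor, ordinary least squares, and hypothetical function names (`fit_anomaly_regression`, `regression_estimate`, `first_available`). The fallback order in `first_available` is also an assumption; the slides only indicate that climatology is the worst case.

```python
from typing import Optional, Sequence, Tuple


def fit_anomaly_regression(neighbor_anomalies: Sequence[float],
                           target_anomalies: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit: target_anomaly = intercept + slope * neighbor_anomaly."""
    n = len(neighbor_anomalies)
    if n < 2 or n != len(target_anomalies):
        raise ValueError("need two or more paired anomalies")
    mean_x = sum(neighbor_anomalies) / n
    mean_y = sum(target_anomalies) / n
    sxx = sum((x - mean_x) ** 2 for x in neighbor_anomalies)
    if sxx == 0:
        raise ValueError("neighbor anomalies have no variation")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(neighbor_anomalies, target_anomalies))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def regression_estimate(neighbor_obs: float, neighbor_normal: float,
                        target_normal: float, slope: float,
                        intercept: float) -> float:
    """Deseasonalize the neighbor, predict the target anomaly, re-add the normal."""
    neighbor_anomaly = neighbor_obs - neighbor_normal
    return target_normal + intercept + slope * neighbor_anomaly


def first_available(estimates: Sequence[Tuple[str, Optional[float]]]) -> Tuple[str, float]:
    """Return the first estimate that exists, in the caller's preferred order."""
    for name, value in estimates:
        if value is not None:
            return name, value
    raise ValueError("no estimate available")


# Hypothetical historical anomalies (degrees C) on days when both sites reported.
slope, intercept = fit_anomaly_regression([-2.0, -1.0, 0.0, 1.0, 2.0],
                                          [-1.0, -0.5, 0.0, 0.5, 1.0])
regression_value = regression_estimate(18.0, 15.0, 12.0, slope, intercept)
print(first_available([("regression", regression_value), ("climatology", 12.0)]))
```

Keeping every candidate estimate, rather than only the chosen one, is what makes the later manual review possible: an analyst can see where the estimates agree and where they diverge.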

Data Cleaning: The Human Element and Transparency

Fundamentals of a proper data cleaning
(1) Organization
(2) Redundancy
(3) Flexibility
(4) Human interaction
  - meteorology is complicated
  - introduction of non-automated information
(5) Transparency
  - explanation of the process
  - share what has been cleaned
  - no one likes black boxes
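One way to "share what has been cleaned" is to keep the raw value next to the cleaned value for every data point, together with the reason for any change. The record layout below is purely hypothetical (the class `CleaningRecord`, its field names and the sample values are assumptions, not Speedwell's data format); it is only meant to show that the question "which data points have been QC'd, and what were they pre-QC?" can be answered directly from such a log.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CleaningRecord:
    """One data point with its value before and after cleaning (hypothetical layout)."""
    site_id: str
    date: str                    # ISO date of the observation
    element: str                 # e.g. "TMAX"
    raw_value: Optional[float]   # None when the observation was missing
    cleaned_value: float
    method: str                  # how the cleaned value was obtained
    reason: str                  # why the raw value was kept, filled or replaced


def changed_points(records: List[CleaningRecord]) -> List[CleaningRecord]:
    """Return only the points whose cleaned value differs from the raw value."""
    return [r for r in records if r.raw_value != r.cleaned_value]


# Illustrative records only; the site identifier and values are made up.
log = [
    CleaningRecord("SITE001", "2016-06-01", "TMAX", 21.4, 21.4,
                   "observed", "passed all tests"),
    CleaningRecord("SITE001", "2016-06-02", "TMAX", None, 22.0,
                   "neighbor regression", "missing observation"),
    CleaningRecord("SITE001", "2016-06-03", "TMAX", 42.1, 22.6,
                   "manual review", "failed neighbor comparison"),
]

for record in changed_points(log):
    print(record.date, record.raw_value, "->", record.cleaned_value, record.reason)
```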

T: +44 (0) 1582 465 551
E: datateam@speedwellweather.com
www.speedwellweather.com
Harpenden, UK | Charleston, SC, USA
Copyright 2016 Speedwell Weather Ltd. All rights reserved. Vs 201606B