Bayesian Inference for Wind Field Retrieval

Similar documents
Bayesian analysis of the scatterometer wind retrieval inverse problem: some new approaches

Gaussian Process Approximations of Stochastic Differential Equations

Gaussian Process Approximations of Stochastic Differential Equations

Estimating Conditional Probability Densities for Periodic Variables

A variational radial basis function approximation for diffusion processes

Recent Advances in Bayesian Inference Techniques

PILCO: A Model-Based and Data-Efficient Approach to Policy Search

Remote Sensing Operations

Long term performance monitoring of ASCAT-A

Stability in SeaWinds Quality Control

Pattern Recognition and Machine Learning

STA 4273H: Statistical Machine Learning

Bayesian time series classification

Scatterometer Data-Interpretation: Measurement Space and Inversion *

Forward Problems and their Inverse Solutions

Self-Organization by Optimizing Free-Energy

Probabilistic Graphical Models

EUMETSAT SAF Wind Services

DRAGON ADVANCED TRAINING COURSE IN ATMOSPHERE REMOTE SENSING. Inversion basics. Erkki Kyrölä Finnish Meteorological Institute

Markov Chain Monte Carlo methods

Confidence Estimation Methods for Neural Networks: A Practical Comparison

ASCAT NRT Data Processing and Distribution at NOAA/NESDIS

Bayesian Sampling and Ensemble Learning in Generative Topographic Mapping

Climate data records from OSI SAF scatterometer winds. Anton Verhoef Jos de Kloe Jeroen Verspeek Jur Vogelzang Ad Stoffelen

Development of Stochastic Artificial Neural Networks for Hydrological Prediction

Bayesian Inference: Principles and Practice 3. Sparse Bayesian Models and the Relevance Vector Machine

Gaussian process for nonstationary time series prediction

An Alternative Infinite Mixture Of Gaussian Process Experts

Variational Principal Components

Bagging During Markov Chain Monte Carlo for Smoother Predictions

STA 4273H: Statistical Machine Learning

Machine Learning Techniques for Computer Vision

Principles of Bayesian Inference

Big model configuration with Bayesian quadrature. David Duvenaud, Roman Garnett, Tom Gunter, Philipp Hennig, Michael A Osborne and Stephen Roberts.

Nonparametric Bayesian Methods (Gaussian Processes)

Widths. Center Fluctuations. Centers. Centers. Widths

STA 4273H: Sta-s-cal Machine Learning

Modeling human function learning with Gaussian processes

Tutorial on Probabilistic Programming with PyMC3

Variational Methods in Bayesian Deconvolution

Density Propagation for Continuous Temporal Chains Generative and Discriminative Models

STAT 518 Intro Student Presentation

9 Multi-Model State Estimation

Bayesian Machine Learning - Lecture 7

Bayesian Inference for the Multivariate Normal

Regression with Input-Dependent Noise: A Bayesian Treatment

Learning Gaussian Process Models from Uncertain Data

Deep learning / Ian Goodfellow, Yoshua Bengio and Aaron Courville. - Cambridge, MA ; London, Spis treści

Lecture 9. Time series prediction

EVALUATION OF WINDSAT SURFACE WIND DATA AND ITS IMPACT ON OCEAN SURFACE WIND ANALYSES AND NUMERICAL WEATHER PREDICTION

CS491/691: Introduction to Aerial Robotics

Lecture: Gaussian Process Regression. STAT 6474 Instructor: Hongxiao Zhu

ASCAT B OCEAN CALIBRATION AND WIND PRODUCT RESULTS

Improved characterization of neural and behavioral response. state-space framework

STA 4273H: Statistical Machine Learning

STA 414/2104: Machine Learning

Expectation Propagation in Dynamical Systems

Lecture 2: From Linear Regression to Kalman Filter and Beyond

U-Likelihood and U-Updating Algorithms: Statistical Inference in Latent Variable Models

Tools for Parameter Estimation and Propagation of Uncertainty

Probabilistic Machine Learning

Gaussian Process for Internal Model Control

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016

Large-scale Ordinal Collaborative Filtering

Relative Merits of 4D-Var and Ensemble Kalman Filter

Bayesian hierarchical modelling for data assimilation of past observations and numerical model forecasts

Prediction of Data with help of the Gaussian Process Method

Parameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1

DETECTING PROCESS STATE CHANGES BY NONLINEAR BLIND SOURCE SEPARATION. Alexandre Iline, Harri Valpola and Erkki Oja

Bayesian Deep Learning

The connection of dropout and Bayesian statistics

Kazuhiko Kakamu Department of Economics Finance, Institute for Advanced Studies. Abstract

VIBES: A Variational Inference Engine for Bayesian Networks

CSC 2541: Bayesian Methods for Machine Learning

Probabilities for climate projections

A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models

Markov chain Monte Carlo methods for visual tracking

Machine Learning Overview

A Probabilistic Framework for solving Inverse Problems. Lambros S. Katafygiotis, Ph.D.

The document was not produced by the CAISO and therefore does not necessarily reflect its views or opinion.

A two-season impact study of the Navy s WindSat surface wind retrievals in the NCEP global data assimilation system

An introduction to Bayesian statistics and model calibration and a host of related topics

Cheng Soon Ong & Christian Walder. Canberra February June 2018

REVISION OF THE STATEMENT OF GUIDANCE FOR GLOBAL NUMERICAL WEATHER PREDICTION. (Submitted by Dr. J. Eyre)

COM336: Neural Computing

Mixture Models and EM

Principles of Bayesian Inference

Bayesian System Identification based on Hierarchical Sparse Bayesian Learning and Gibbs Sampling with Application to Structural Damage Assessment

A Search and Jump Algorithm for Markov Chain Monte Carlo Sampling. Christopher Jennison. Adriana Ibrahim. Seminar at University of Kuwait

Introduction to Machine Learning

Neutron inverse kinetics via Gaussian Processes

Communicating uncertainties in sea surface temperature

INFINITE MIXTURES OF MULTIVARIATE GAUSSIAN PROCESSES

Lecture 6: April 19, 2002

COURSE INTRODUCTION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014

Improved Fields of Satellite-Derived Ocean Surface Turbulent Fluxes of Energy and Moisture

Bias correction of satellite data at the Met Office

PACKAGE LMest FOR LATENT MARKOV ANALYSIS

Approximate Inference Part 1 of 2

Transcription:

Bayesian Inference for Wind Field Retrieval Ian T. Nabney 1, Dan Cornford Neural Computing Research Group, Aston University, Aston Triangle, Birmingham B4 7ET, UK Christopher K. I. Williams Division of Informatics, University of Edinburgh, 5 Forrest Hill, Edinburgh EH1 2QL, UK Abstract In many problems in spatial statistics it is necessary to infer a global problem solution by combining local models. A principled approach to this problem is to develop a global probabilistic model for the relationships between local variables and to use this as the prior in a Bayesian inference procedure. We use a Gaussian process with hyper-parameters estimated from Numerical Weather Prediction Models, which yields meteorologically convincing wind fields. We use neural networks to make local estimates of wind vector probabilities. The resulting inference problem cannot be solved analytically, but Markov Chain Monte Carlo methods allow us to retrieve accurate wind fields. Key words: Bayesian inference; surface winds; spatial priors; Gaussian Processes 1 Introduction Satellite borne scatterometers are designed to retrieve surface winds over the oceans. These observations enhance the initial conditions supplied to Numerical Weather Prediction (NWP) models [1] which solve a set of differential equations describing the evolution of the atmosphere in space and time. These initial conditions are especially important over the ocean since other observational data is sparse. This paper addresses the problem of wind field retrieval from scatterometer data using neural networks and probabilistic models alone. 1 Email of corresponding author: i.t.nabney@aston.ac.uk Preprint submitted to Elsevier Preprint 2 July 2007

A scatterometer measures the amount of radiation scattered from a directed radar beam back toward the satellite by the ocean s surface. This backscatter is principally related to the local instantaneous wind stress. Under the assumption of a fixed change in wind speed with height, the surface wind stress can be related to the local 10 m wind vector [2] which is a variable used within NWP models. The relation between the back-scatter (denoted σ o ) 2 and the wind vector, (u,v), at a single location is complex to model [3]. Most previous work has generated local forward (or sensor) models [4] describing the mapping (u, v) σ o, which is one-to-one. The local inverse mapping, σ o (u,v), is a oneto-many mapping, since the noise on σ o makes it very difficult to uniquely determine the wind direction, although the speed is well defined. There are generally two dominant wind direction solutions which correspond to winds whose directions differ by roughly 180. The principal difficulty in wind field retrieval is to correct the wrong wind vector directions. This must be done by considering the global consistency of the retrieved winds. The local inverse problem can be modelled directly using a neural network approach, which produces encouraging results [5]. Our work builds on both approaches, unified within a Bayesian framework for combining local model predictions with a global prior model [6]. The advantage of the Bayesian approach is that it is a principled method for taking account of uncertainty in the models. In earlier work using forward models for wind retrieval, including the systems used operationally by the European Space Agency and the UK Meteorological Office, the wind fields are obtained by heuristic methods which rely on a background (or first guess) forecast wind field from a NWP model [7]. The method proposed here could include this information; however, due to careful development of the prior wind field model, it should be possible to achieve autonomous wind field retrieval. Autonomous retrieval means that wind fields are retrieved using scatterometer observations only, without the aid of a NWP model. This is useful since NWP models are not available to all users of scatterometer data due to their massive computational cost. 2 Modelling Approach The polar orbiting ERS-1 satellite [2] carries a scatterometer which obtains observations of back-scatter in a swathe approximately 500 km wide. The swathe is divided into scenes, which are 500 500 km square regions each containing 2 The vector σ o consists of three individual back-scatter measurements from different viewing angles. In this article we do not make the dependence on incidence angle or view angle explicit. 2

19 19 cells with a cell size of 50 50 km. Thus there is some overlap between cells. We refer to observations at a single cell as a local measurement, while a field is taken to denote a spatially connected set of local measurements from a swathe. Capital letters are used to denote a wind field (U,V ) or a field of scatterometer measurements Σ o. Lower case letters are used to represent a local measurement, and occasionally the subscript i is used to index the cell location. The aim is to obtain p(u,v Σ o ), the conditional probability of the wind field, (U,V ), given the satellite observations, Σ o. Using Bayes theorem: p(u,v Σ o ) = p(σo U,V )p(u,v ) p(σ o ). (1) Once the Σ o have been observed, p(σ o ) is a constant and thus we can write: p(u,v Σ o ) p(σ o U,V )p(u,v ). (2) The normalising constant, p(σ o ), cannot be computed analytically, so we intend to use Markov Chain Monte Carlo (MCMC) [8] methods to sample from the posterior distribution (2), p(u,v Σ o ). The likelihood of the observations is denoted by p(σ o U,V ) and p(u,v ) is the prior wind field model. The next two sections deal with the specification of the models in Equation (2). 2.1 Prior wind-field models The prior wind-field model, p(u, V ), is an informative prior, and will constrain the local solutions to produce consistent and meteorologically plausible wind fields. Our prior is a vector Gaussian Process (GP) based wind-field model that is tuned to NWP assimilated data 3 for the mid-latitude North Atlantic. This ensures that the wind-field model is representative of NWP wind fields, which are the best available estimates of the true wind field. GPs provide a flexible class of models [9] particularly when the variables are distributed in space. The wind field data is assumed to come from a zero mean multi-variate normal distribution, whose variables are the wind vectors at each cell location and whose covariance matrix is a function of the spatial location of the observations and some physically interpretable parameters. A particular form of GP was chosen to incorporate geophysical information in the prior model [10, 11]. 3 Assimilated data means we are using the best guess initial conditions of a NWP model. 3

The probability of a wind field under our model is given by: p(u,v ) = 1 (2π) n 2 Kuv 1 2 exp 1 2 U V K 1 uv U, V where K uv is the joint covariance matrix for U,V. The covariance matrices are determined by appropriate covariance functions which have parameters to represent the variance, characteristic length scales of features, the noise variance and the ratio of divergence to vorticity in the wind fields. Tuning these parameters to their maximum a posteriori probability values using NWP data produces an accurate prior wind-field model which requires no additional NWP data once the parameters are determined. Since there is also a temporal aspect to the problem, different parameter values were determined for each month of the year. Because the GP model defines a field over a continuous space domain it can be used for prediction at unmeasured locations, and can easily cope with missing or removed data. 2.2 Likelihood models There are two approaches to computing the likelihood, based on local forward and inverse models, respectively. 2.2.1 Use of forward models We have developed a probabilistic local forward model, p(σ o i u i,v i ), consisting of a neural network with Gaussian output noise [12, this issue]. The wind vector in a given cell is assumed to completely determine the local observation of σ o i, and thus conditionally on the wind vectors in each cell the σo i values are independent and the likelihood can be written: p(σ o U,V ) = i p(σ o i u i,v i ), (3) where the product is taken over all the cells in the region being considered. This decomposition is only valid for the conditional local model: the observations of σ o i and (u i,v i ) are not jointly independent in this way. Equation (2) can now be written: ( ) p(u,v Σ o ) p(σ o i u i,v i ) p(u,v ), (4) i this being referred to as forward disambiguation. 4

2.2.2 Use of inverse models The inverse mapping has at least two possible values at each location, thus a probabilistic model of the form p(u i,v i σ o i ) is essential to fully model the mapping. A neural network local inverse model that captures the multi-modal nature of p(u i,v i σ o i ) has been developed [13, this issue]. Using Bayes theorem: p(σ o i u i,v i ) = p(u i,v i σ o i )p(σo i ). (5) p(u i,v i ) Having observed the Σ o values, p(σ o i ) is constant and Equation (5) can be written: p(σ o i u i,v i ) p(u i,v i σ o i ). (6) p(u i,v i ) Replacing the product term in Equation (4) using Equation (6) gives: p(u,v Σ o ) ( i p(u i,v i σ o i ) ) p(u,v ), (7) p(u i,v i ) this being referred to as inverse disambiguation. Using Bayes theorem in this form has been called the Scaled-Likelihood method [14] although, to our knowledge, it has not been previously applied in this spatial context. 3 Application [Figure 1 here] We propose a modular design (Fig. 1) for retrieving wind fields in an operational environment. Initially the scatterometer data is pre-processed using an unconditional density model for σ o to remove outliers. (This model is currently under development using a specially modified Generative Topographic Mapping [15] with a conical latent space.) The selected σ o data is then forward propagated through the inverse model to produce a field of local wind predictions p(u i,v i σ o i ). Some problem specific methods (not discussed here) are used to select an initial wind field. Starting from this point the posterior distribution, p(u,v Σ o ), is sampled using either Equation (4) or (7). A MCMC approach using a modified 4 Metropolis algorithm [8] is applied. We also propose to use optimisation methods to find the modes quickly 5. 4 The modifications take advantage of our knowledge of the likely ambiguities. 5 It is known that the local inverse model has dominantly bimodal solutions, and it is suggested that this is also the case for the global solution. Thus a non-linear optimisation algorithm has also been used to find both modes. 5

The problem is complicated by the high dimensionality of p(u,v Σ o ), which is typically of the order of four hundred. Thus good local inverse and forward models are essential to accurate wind field retrieval. This modular scheme makes each part of the modelling process more focussed, while providing an elegant solution to a complex problem. If necessary, individual models can be updated or changed while the other models can be kept fixed. This scheme is autonomous, but NWP predictions could be used during the initialisation or in the prior wind field model. 4 Preliminary Results [Figure 2 here] Fig. 2 shows an example of the results of the application of inverse disambiguation to a real wind field containing a trough. The initialisation routine (which is itself complex) produces an excellent initial wind field (Fig. 2a) with a few poor wind vectors. The most probable wind field (Fig. 2b) can be seen to resemble the NWP winds (Fig. 2c), although the trough is more marked and positioned further south and east. It is these situations where the scatterometer data will improve the estimation of NWP initial conditions, as it is very likely that the scatterometer wind field is more reliable than the NWP forecast wind field, since it is based on observational data. Fig. 2d shows the evolution of the energy (which is defined to be the unnormalised negative log probability of the wind field) during sampling from the posterior, Equation (7). The energy falls very rapidly in the first 50 iterations, and then remains relatively constant as the sampling explores the mode. [Table 1 here] Table 1 shows the results of applying the retrieval procedure to 62 wind fields from the North Atlantic during March 1998. We show the vector root mean square error (VRMSE) compared to the NWP winds, for three different models, our forward and inverse neural network models and the current operational forward model 6. The middle column gives the mean VRMSE for the local models, computed by choosing the locally retrieved wind vector closest to the NWP model. The right hand column gives the VRMSE after using Equation (2) with the appropriate forward or inverse model to retrieve themaximum 6 The operational model is made probabilistic by adding a Gaussian density model to its output with noise variance determined by the results on an independent validation set. 6

a posteriori probability wind field. The local inverse model, greatly improves the retrieval of the wind field (in VRMSE terms). 4.1 Discussion The results presented indicate that the methodology is operationally viable. Fig. 2a shows that there are some cells for which the inverse model does not give a good initialisation. These poor local predictions are believed to result from unreliable σ o measurements and thus the addition of a density model for σ o may alleviate these problems. A comparison of Fig. 2b and 2c shows that the NWP winds are not always reliable, especially where there are rapid spatial changes in direction. However, NWP winds are used to train all the models, and thus careful data selection is vital to achieving reliable training sets for neural network models for wind field retrieval. This factor also increases the noise level in the training data for the local models. The retrieval accuracy in terms of VRMSE can be seen to increase by using the Bayesian framework, for all models. The large improvement in the VRMSE when using the local inverse model is probably related to ability of the direct local inverse model to accurately represent the true errors on the locally retrieved wind vectors. 5 Conclusions We have presented an elegant framework for the retrieval of wind fields using neural network techniques from satellite scatterometer data. The modular approach adopted means that model changes are simple to implement, since each model has a well defined role. We are now developing prior wind field models that include atmospheric fronts [16]. Improving the forward and inverse models using better training data should further improve local retrieval as well as wind field retrieval results. This will be vital when the methods are applied to more complex wind fields than that in Fig. 2. The approach taken will produce reliable scatterometer derived wind fields. The use of Bayesian methods allows the retrieval of more accurate wind fields as well as the assessment of posterior uncertainty in the retrieved winds. 7

Acknowledgements This work is supported by the European Union funded NEUROSAT programme (grant number ENV4 CT96-0314) and also EPSRC grant GR/L03088 Combining Spatially Distributed Predictions from Neural Networks. Thanks to MSc student David Evans for supplying the results of the application of inverse disambiguation. Thanks also to David Offiler for his useful comments during the project. References [1] A C Lorenc, R S Bell, S J Foreman, C D Hall, D L Harrison, M W Holt, D Offiler, and S G Smith. The use of ERS-1 products in operational meteorology. Advances in Space Research, 13:19 27, 1993. [2] D Offiler. The calibration of ERS-1 satellite scatterometer winds. Journal of Atmospheric and Oceanic Technology, 11:1002 1017, 1994. [3] A Stoffelen and D Anderson. Scatterometer data interpretation: Measurement space and inversion. Journal of Atmospheric and Oceanic Technology, 14:1298 1313, 1997. [4] A Stoffelen and D Anderson. Scatterometer data interpretation: Estimation and validation of the transfer function CMOD4. Journal of Geophysical Research, 102:5767 5780, 1997. [5] P Richaume, F Badran, M Crepon, C Mejia, H Roquet, and S Thiria. Neural network wind retrieval from ERS-1 satterometer data. Journal of Geophysical Research, 1998. submitted. [6] A Tarantola. Inverse Problem Theory. Elsevier, London, 1987. [7] A Stoffelen and D Anderson. Ambiguity removal and assimiliation of scatterometer data. Quarterly Journal of the Royal Meteorological Society, 123:491 518, 1997. [8] W.R. Gilks, S. Richardson, and D. J. Spiegelhalter. Markov Chain Monte Carlo in Practice. Chapman and Hall, London, 1996. [9] C K I Williams. Prediction with Gaussian Processes: From linear regression to linear prediction and beyond. In M I Jordan, editor, Learning and Inference in Graphical Models. Kluwer, 1998. [10] D Cornford. Non-zero mean Gaussian Process prior wind field models. Technical Report NCRG/98/020, Neural Computing Research Group, Aston University, Aston Triangle, Birmingham, UK, 1998. Available from: http://www.ncrg.aston.ac.uk/. [11] D Cornford. Random field models and priors on wind. Technical Report NCRG/97/023, Neural Computing Research Group, Aston University, Aston Triangle, Birmingham, UK, 1997. Available from: http://www.ncrg.aston.ac.uk/. 8

[12] G Ramage, D Cornford, and I T Nabney. A neural network sensor model with input noise. Neurocomputing Letters, 1999. submitted. [13] D J Evans, D Cornford, and I T Nabney. Structured neural network modelling of multi-valued functions for wind retrieval from scatterometer measurements. Neurocomputing Letters, 1999. submitted. [14] N Morgan and H A Boulard. Neural networks for statistical recognition of continuous speech. Proceedings of the IEEE, 83:742 770, 1995. [15] C M Bishop, M Svensen, and C K I Williams. GTM: The generative topographic mapping. Neural Computation, 10:215 234, 1998. [16] D Cornford, I T Nabney, and C K I Williams. Adding constrained discontinuities to Gaussian Process models of wind fields. In Advances In Neural Information Processing Systems 11. The MIT Press, 1999. To appear. 9

Table 1 Performance of the wind field retrieval algorithm on ERS-2 data for March 1998, n = 62 scenes, North Atlantic only. Results of both local and wind field retrieval are shown. VRMSE = Vector Root Mean Square Error. Model VRMSE (local) VRMSE (wind field) CMOD4, operational model 2.69 2.57 NN2, our NN forward model 2.79 2.65 MDN, our NN inverse model 2.84 2.28 10

Fig. 1. A schematic of the proposed methodology for operational wind field retrieval. The likelihood and prior models are denoted by L and P respectively. Either the forward or the inverse models can be used to obtain the final wind field. 11

800 800 700 700 y axis (km) 600 500 y axis (km) 600 500 400 10ms 1 20ms 1 300 500 600 700 800 900 1000 1100 x axis (km) (a) 400 300 10ms 1 20ms 1 500 600 700 800 900 1000 1100 x axis (km) (b) y axis (km) 800 700 600 500 Energy 18000 16000 14000 12000 10000 8000 6000 4000 400 10ms 1 20ms 1 300 500 600 700 800 900 1000 1100 x axis (km) (c) 2000 0 2000 0 500 1000 1500 2000 Time (d) Fig. 2. Results of running the MCMC sampling on a wind field from the North Atlantic on the 4th of January 1994. (a) the initialised wind field from the inverse model, (b) the most probable wind field from Equation (7), (c) the true NWP wind field and (d) the evolution of the energies in the Markov chain. The position of the trough is marked by a solid line in (b) and (c). 12