
Statistical Analysis of Biological Data and Time Series

Introduction to Optimal Interpolation and Variational Analysis

Alexander Barth, Aida Alvera-Azcárate, Pascal Joassin, Jean-Marie Beckers, Charles Troupin
a.barth@ulg.ac.be

Revision 1.1

Chapter 1
Theory of optimal interpolation and variational analysis

Contents
1.1 The gridding problem
1.2 Sampling locations
1.3 True field at the sampling locations
1.4 Observation errors
1.5 Data analysis and gridding methodologies
1.6 Objective analysis
1.7 Cressman method
1.8 Optimal interpolation
  1.8.1 Error covariances
  1.8.2 Assumptions
  1.8.3 Analysis
1.9 Variational Inverse Method
  1.9.1 Equivalence between OI and variational analysis
  1.9.2 Error fields by analogy with optimal interpolation
  1.9.3 Diva
  1.9.4 Example
1.10 Summary

1.1 The gridding problem

Gridding is the determination of a field φ(r) on a regular grid of positions r, based on arbitrarily located observations. The position vector r often lies in a 2D, 3D or even 4D space.

The fewer observations are available, the harder the gridding problem is:
- In oceanography, in situ observations are sparse.
- Observations are inhomogeneously distributed in space and time (more observations in coastal zones and in summer).
- The variability of the ocean is the sum of various processes occurring at different spatial and temporal scales.

Figure 1.1: Example of an oceanographic field.

Figure 1.1 shows an idealized square domain with a barrier (e.g. a peninsula or a dike). This field is the true field that we want to reconstruct based on observations. Let's assume that the field represents temperature. The barrier suppresses the exchanges between the two sides of the barrier. The field varies smoothly over some length-scale.

1.2 Sampling locations

Figure 1.2: Sampling locations within the domain.

In regions where a measurement campaign has been carried out, a higher spatial coverage is achieved. Based on the values of the field at the shown locations, we will estimate the true field.

1.3 True field at the sampling locations

Figure 1.3: Values of the true field extracted at the locations of the observations.

- Some information about the position of the structures and fronts is lost.
- No method can provide exactly the true field, but the more information about its structure we include in the analysis, the closer we can get to it.

1.4 Observation errors

Observations are in general affected by different error sources and other problems that need to be taken into account:

1. Instrumental errors (limited precision or possible bias of the sensor).
2. Representativity errors: the observations do not necessarily correspond to the field we want to obtain. For example, we may want a monthly average, but the observations are instantaneous (or averages over a very short period of time).
3. Synopticity errors: the observations are not all taken at the same time.
4. Other error sources: human errors (e.g. permutation of longitude and latitude), transmission errors, malfunctioning of the instrument, wrong decimal separators...

Figure 1.4: Observations with errors. A random perturbation was added to the true values at the sampling locations; this simulates the impact of the different error sources. To simplify matters, each observation was perturbed independently.

1.5 Data analysis and gridding methodologies

Because observations have errors, it is always better to produce a field approximation and never a strict interpolation.

Figure 1.5: Gridded field using linear interpolation. This method is implemented in the function griddata of Matlab and GNU Octave. The right panel shows the true field.

Figure 1.5 shows what would happen if the observations had been interpolated linearly. The domain is decomposed into triangles whose vertices are the locations of the data points, based on a Delaunay triangulation. Within each triangle, the value is interpolated linearly (a minimal sketch is given at the end of this section). The noise on the observations is in general amplified when higher-order schemes, such as cubic interpolation, are used.

A number of methods have traditionally been used to solve the gridding problem:
- Subjective: drawing by hand (figure 1.6)
- Objective: predefined mathematical operations
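As an illustration of the triangulation-based linear interpolation above, here is a minimal sketch using SciPy's griddata, which behaves like the Matlab/Octave function of the same name. The synthetic field, sampling locations and grid resolution are made up for the example and are not taken from the text:

```python
import numpy as np
from scipy.interpolate import griddata

# Synthetic "true" field sampled at scattered locations (illustrative only)
rng = np.random.default_rng(42)
xobs = rng.uniform(0, 10, 200)                        # observation x positions
yobs = rng.uniform(0, 10, 200)                        # observation y positions
fobs = 20 + 2 * np.sin(xobs / 2) * np.cos(yobs / 3)   # sampled field values

# Regular target grid
xi, yi = np.meshgrid(np.linspace(0, 10, 101), np.linspace(0, 10, 101))

# Piecewise-linear interpolation on the Delaunay triangulation of the observations;
# grid points outside the convex hull of the data are set to NaN
fi = griddata((xobs, yobs), fobs, (xi, yi), method="linear")
```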

Figure 1.6: Isohyets (lines of constant precipitation) drawn by hand (from http://www.srh.noaa.gov/hgx/hurricanes/1970s.htm).

1.6 Objective analysis

As opposed to the subjective method, objective analysis techniques use mathematical formulations to infer the value of the field at unobserved locations based on the observations.

- The first guess is removed from the observations.
- Gridding procedure (also called "analysis"):

Gridded field = first guess of the field + weighted sum of the observation anomalies

There are several ways to define the weights, which result in different gridding techniques. We review the most commonly used gridding methods in the following.

1.7 Cressman method

Cressman weights depend only on the distance d_i between the location where the value of the field should be estimated and the location of observation i:

w_i = (R² − d_i²) / (R² + d_i²)  for d_i < R,  and  w_i = 0  for d_i ≥ R,

where R is the search radius.

Figure 1.7: Cressman weights (blue) and Barnes weights (red) as a function of distance.

The search radius R is the typical control parameter and defines the length-scale over which an observation is used. This length-scale can be made to vary in space depending on data coverage and/or physical scales. The parameter is chosen by the users based on their knowledge of the domain and the problem.

Figure 1.8: Gridded field by Cressman weighting. The right panel shows the true field.

The Cressman weighting is a very simple and numerically efficient method. However, it suffers from some limitations, which are apparent in figure 1.8:
- No estimate can be obtained at locations where no observation lies within the search radius R.
- In regions with very few observations, the method can return a discontinuous field.
- The presence of barriers cannot be taken into account easily.
- All observations are assumed to have a similar error variance, since the weighting is based only on distance.
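A minimal sketch of Cressman weighting at a single grid point (the function and variable names are illustrative, not from the original text):

```python
import numpy as np

def cressman_analysis(xg, yg, xobs, yobs, dobs, R):
    """Estimate the field at grid point (xg, yg) as a Cressman-weighted
    average of the observations dobs located at (xobs, yobs).
    Returns NaN when no observation lies within the search radius R."""
    d2 = (xobs - xg) ** 2 + (yobs - yg) ** 2
    # Classical Cressman weights: (R^2 - d^2) / (R^2 + d^2), zero beyond R
    w = np.where(d2 < R ** 2, (R ** 2 - d2) / (R ** 2 + d2), 0.0)
    if w.sum() == 0:
        return np.nan          # no observation within the search radius
    return np.sum(w * dobs) / np.sum(w)
```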

1.8 Optimal interpolation

- The first guess and the observations have errors.
- These errors are characterized by their error covariances.
- Optimal interpolation seeks the optimal combination of the first guess and the observations.

1.8.1 Error covariances

- Error covariances need to be specified for the observations and the first guess.
- Error covariances describe how different variables, or the same variable at different locations, are related on average.
- The error covariance is in general a matrix:
  - the square roots of the diagonal elements are the average errors of the observations and of the first guess;
  - the off-diagonal elements describe the relationships between the variables.
- Covariance is directly related to correlation.

1.8.2 Assumptions

Hypotheses concerning those errors:
- The observations and the first guess are unbiased, i.e. their errors are zero on average.
- The error covariance matrices are known. These matrices are related to the expected magnitude of the errors.
- The error of the first guess is independent of the observation errors.

1.8.3 Analysis

The optimal interpolation (OI) scheme can be derived as the Best Linear Unbiased Estimator (BLUE) of the true state, which has the following properties:
- The estimator is linear in the first guess and the observations.
- The estimator is not biased.
- The estimate has minimal total variance, i.e. no other estimator has an error variance lower than the BLUE estimator.

From these assumptions the optimal weights of the observations can be derived to grid the field. The optimal weights depend on the error covariances. In particular:

- If the expected error of the observations is large compared to that of the first guess, more weight is given to the first guess than to the observations (and vice versa).
- If the variables are strongly correlated spatially, the gridded field is smooth.

See the file diva intro slides.pdf if you want to know how the weights are computed. An error estimate can also be computed by optimal interpolation (a sketch of the analysis step is given after the figures below).

Figure 1.9: Gridded field using optimal interpolation. The right panel shows the true field.

As an illustration, the field obtained by optimal interpolation is shown in figure 1.9.

Figure 1.10: Relative error variance of the gridded field by optimal interpolation.

The error variance of the field is clearly related to the location of the observations. This relative error is bounded between 0 and 1 because it is scaled by the error variance of the first guess.
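The weights themselves are not written out in this text (see diva intro slides.pdf). For readers who want a concrete picture, here is a minimal sketch of the standard BLUE/OI analysis with a Gaussian background error covariance; the variable names, the 1D setting and the covariance shape are assumptions made for the example, not taken from the original:

```python
import numpy as np

def oi_analysis(xgrid, xobs, yobs, xb_grid, xb_obs, L, sig2_b, sig2_o):
    """1D optimal interpolation (BLUE) sketch.
    xgrid, xobs    : positions of the grid points and of the observations
    yobs           : observed values
    xb_grid, xb_obs: first guess at the grid points and at the observation points
    L, sig2_b      : correlation length and error variance of the first guess
    sig2_o         : error variance of the observations (assumed uncorrelated)"""
    def cov(a, b):
        # Gaussian background error covariance between positions a and b
        return sig2_b * np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * L**2))

    B_oo = cov(xobs, xobs)              # background covariance between observation points
    B_go = cov(xgrid, xobs)             # background covariance grid/observation points
    R = sig2_o * np.eye(len(xobs))      # observation error covariance (diagonal)

    # Analysis: first guess + optimally weighted observation increments
    K = B_go @ np.linalg.inv(B_oo + R)  # optimal weights
    xa = xb_grid + K @ (yobs - xb_obs)

    # Analysis error variance at the grid points, relative to the background variance
    rel_err = 1.0 - np.einsum("ij,ij->i", K, B_go) / sig2_b
    return xa, rel_err
```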

1.9 Variational Inverse Method

The variational inverse method (VIM), or variational analysis (Brasseur et al., 1996), aims to find a field with the following characteristics:
- smooth,
- somewhat close to the first guess,
- close to the observed values.

This is achieved by defining a cost function which penalizes a field that does not satisfy those requirements:
- "close" is quantified using the RMS error;
- "smooth" is quantified by averaging the squares of the first and second derivatives of the field (for a slowly varying field those derivatives are small);
- those different requirements are weighted against each other to define what is more important, a smooth field or a field close to the observations (a generic form of such a cost function is written out just before section 1.9.3).

1.9.1 Equivalence between OI and variational analysis

The variational inverse method is identical to the optimal interpolation analysis if the weights are defined as the inverse of the covariance matrices used in optimal interpolation.
- This makes it possible to compute the weights based on the error covariances.
- The error covariances can be obtained from the statistical behaviour of the field.

1.9.2 Error fields by analogy with optimal interpolation

Error fields can be derived since:
- The optimal interpolation analysis is equivalent to VIM if the weights used by VIM are the inverse of the error covariances used by optimal interpolation.
- In the context of optimal interpolation, it can be shown that the error field equals the analysis of the covariance fields.
- The error field of VIM therefore equals the analysis (by VIM) of the covariance fields (the input of the analysis tool is a vector containing the covariance of the data points with the point at which the error estimate is to be calculated).
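For concreteness, a cost function of the kind described in this section can be written in the following generic form (along the lines of Brasseur et al., 1996; the data weights μ_i, the smoothness coefficients α₀, α₁, α₂ and the first guess φ_b are generic symbols used here for illustration, not definitions taken from this text):

\[
J[\varphi] \;=\; \sum_{i=1}^{N_d} \mu_i \left[ d_i - \varphi(\mathbf{r}_i) \right]^2
\;+\; \int_D \left( \alpha_2\, \nabla\nabla\tilde\varphi : \nabla\nabla\tilde\varphi
\;+\; \alpha_1\, \nabla\tilde\varphi \cdot \nabla\tilde\varphi
\;+\; \alpha_0\, \tilde\varphi^2 \right) dD,
\qquad \tilde\varphi = \varphi - \varphi_b .
\]

The first sum penalizes the misfit with respect to the observations d_i; the integral penalizes curvature, gradients and deviations from the first guess, so minimizing J trades smoothness against closeness to the data.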

1.9.3 Diva

Diva is a program written in Fortran and shell scripts performing a variational analysis. The program is available at http://modb.oce.ulg.ac.be/modb/diva.html. Its features include:
- graphical interface or command line;
- 3D fields computed by performing a succession of 2D analyses;
- generalized cross-validation for parameter estimation;
- land and topography taken into account.

1.9.4 Example

Figure 1.11: Field reconstructed using Diva, implementing the variational inverse method. The right panel shows the true field.

The results are similar to the field obtained by optimal interpolation, but now the land-sea boundary is taken into account.

Figure 1.12: Relative error of the field reconstructed using Diva.

The error is lowest near the observations and tends to 1 far away from them. The presence of the barrier also impacts the estimation of the error variance. The error estimate should be interpreted cautiously, since we assumed that the error covariances used are correct.

1.10 Summary

- A first guess is needed.
- Observations have errors (in particular, the observations do not necessarily represent what we want).
- Error covariances are used to quantify the magnitude of the errors.
- A gridded field of minimum expected error: optimal interpolation.
- A smooth gridded field close to the observations: variational inverse method.
- Both approaches are actually the same (but some things are easier to do in one approach than in the other).

Chapter 2
Modelling of error covariances

"Garbage in, garbage out"

2.1 Background error covariance matrix

The error covariance needs to be specified for optimal interpolation and variational analysis. It plays a crucial role in the accuracy of the output. A common test in the gridding problem is to apply the method to a single isolated data point: the resulting field is proportional to the error covariance of the background.

2.1.1 Parametric error covariance

In most optimal interpolation approaches, the error covariance is explicitly modelled based on the decomposition into variance and correlation. For simplicity, the variance of the background field is often assumed constant. The correlation matrix is parameterized using analytical functions:
- the correlation is essentially a function of distance;
- the correlation is one for zero distance;
- generally, the correlation decreases as distance increases.

The correlation function is often a Gaussian (a minimal numerical sketch is given below). The length-scale of the function is an important parameter, since it determines over which distance a data point can be extrapolated. This length-scale is the typical scale of the processes in the field. In the ocean, mesoscale processes have a length-scale of the order of the radius of deformation (figure 2.1). Biogeochemical tracers advected by ocean currents will also have this typical length-scale.
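A minimal sketch of such a parametric covariance, a constant background variance multiplied by a Gaussian correlation of distance (the function name and the default length-scale value are illustrative assumptions):

```python
import numpy as np

def background_covariance(ri, rj, L=50e3, sig2_b=1.0):
    """Parametric background error covariance between two sets of positions.
    ri, rj : arrays of shape (n, 2) and (m, 2) with x/y coordinates in metres
    L      : correlation length-scale (here 50 km, purely illustrative)
    sig2_b : constant background error variance"""
    # Pairwise distances between the two sets of points
    dist = np.linalg.norm(ri[:, None, :] - rj[None, :, :], axis=-1)
    # Gaussian correlation: 1 at zero distance, decreasing with distance
    return sig2_b * np.exp(-dist**2 / (2 * L**2))
```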

Figure 2.1: Gulf Stream and eddy field as measured by satellite altimetry on 28 January 2008 (units are m). The diameters of these eddies are related to the radius of deformation.

Modelling the error covariance matrix directly offers numerous ways to make the error covariance better reflect our expectations of how the variables are correlated:
- The effect of islands and barriers can be taken into account by imposing that the error covariance between two locations is zero if a straight line connecting them intersects a land point.
- Additional fields can be used to model the error covariance. For example, satellite altimetry gives an indication of which water masses are likely to have similar properties. The impact of an observation can be limited to structures of similar sea surface height. The impact of the sea surface height on the covariance for a realistic example is shown in figure 2.2.

Figure 2.2: (a) Sea surface height. (b) Isotropic covariance. (c) Covariance modified by sea surface height.

2.1.2 Error covariance based on differential operators

For variational inverse methods, the concept of error covariance is implicit. Instead of modelling the error covariance directly, the variational approach models its inverse with the help of a differential operator. Additional constraints can be included to modify the implicit covariance function. For example, an advection constraint can be used, which imposes that the field is (approximately) a stationary solution of the advection-diffusion equation. The advection constraint thus aims at modelling the effect of the velocity field on the reconstructed field. Such a constraint is introduced by adding a term to the cost function (a generic form is sketched below). The effect of this constraint is to propagate the information along the direction of the flow (figure 2.3). This option is available in Diva.
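As an illustration only (this exact expression is not given in the text; θ and A are generic symbols for the constraint weight and a diffusion coefficient), an advection constraint of the kind described above can be written as an extra penalty added to the cost function:

\[
J_{\mathrm{adv}}[\varphi] \;=\; \theta \int_D \left( \mathbf{u} \cdot \nabla\varphi \;-\; A\, \nabla^2 \varphi \right)^2 dD ,
\]

so that fields which are close to a stationary solution of the advection-diffusion equation for the velocity field u are only weakly penalized, and information is effectively spread along the flow.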

Figure 2.3: (a) Flow field. (b) Correlation modified by the flow field.

2.2 Observation error covariance

The observation errors are often assumed to be uncorrelated. This may be a valid assumption for the instrumental error, but it is not in general for the representativity error, since this error is spatially correlated. However, it is complicated to use a non-diagonal matrix for the observation error covariance, since it must be inverted. Two pragmatic workarounds are:
- Binning of observations: instead of working with all observations, those which are close are averaged into bins and treated as a single observation.
- Inflating the error variance: the spatial correlation of the observation errors then reduces the weight of the observations in the analysis.

In situ observations have a high vertical and/or temporal resolution (e.g. vertical profiles, moorings) and a low horizontal resolution. Thus the error correlation of the observations is mainly important in the time and vertical dimensions. However, if the 2D analyses of the different vertical levels and time instances are performed independently on vertically and temporally interpolated (or averaged, similar to the binning approach) data, the correlation in those dimensions does not play a role.

2.3 Summary

- The more realistic the error covariances are, the better the analysis will be.
- In optimal interpolation, the error covariances are often analytical functions.
- In the variational inverse method, the error covariances are implicitly defined: the cost function penalizes abrupt changes of the field and large deviations from the first guess.

Key parameters:
- the correlation length of the error of the first guess;
- the error variance of the observations and the error variance of the first guess; their ratio is called the signal-to-noise ratio.

Chapter 3
Parameter optimization and cross-validation

Optimal interpolation and the variational inverse method are only optimal if their underlying error covariance matrices are known. Even after one particular class of error covariance matrices has been adopted, several parameters have to be determined, such as the correlation length and the signal-to-noise ratio. The optimality of either method is obviously lost if some parameters are wrong.

3.1 A priori parameter optimization

By analysing the difference between the background estimate and the observations, useful information on their statistics can be derived. Since the errors of the background estimate and the errors of the observations are supposed to be independent, one can show that the covariance of their difference is the sum of the two covariance matrices. This relationship can be used to estimate some parameters of the error statistics if the following assumptions are true:
- The error variance of the background is everywhere the same.
- The correlation of the background error between two points depends only on the distance between them.
- The observation errors all have the same variance and are uncorrelated.

The shape of the covariance function is determined empirically by the following steps (see the sketch after this list):
- We estimate the error covariance by averaging over a sample of pairs of observations (or all possible pairs of observations).
- The pairs are binned according to their distance.
- The covariance is computed by averaging over all pairs in a given bin.
- This procedure is repeated for all bins.
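A minimal sketch of this binned empirical covariance (the bin width and function name are illustrative assumptions; the anomalies are the observations minus the background at the observation locations):

```python
import numpy as np

def binned_covariance(x, y, anom, nbins=20, max_dist=None):
    """Empirical covariance of observation anomalies as a function of distance.
    x, y : observation positions
    anom : observation minus background at each observation location
    Returns the bin centres and the mean covariance in each distance bin."""
    dx = x[:, None] - x[None, :]
    dy = y[:, None] - y[None, :]
    dist = np.hypot(dx, dy)                  # pairwise distances
    prod = anom[:, None] * anom[None, :]     # pairwise products of anomalies

    # Use each pair only once (upper triangle, including zero-distance self-pairs)
    iu = np.triu_indices(len(x))
    dist, prod = dist[iu], prod[iu]

    if max_dist is None:
        max_dist = dist.max()
    edges = np.linspace(0, max_dist, nbins + 1)
    which = np.digitize(dist, edges) - 1     # bin index of each pair
    cov = np.array([prod[which == k].mean() if np.any(which == k) else np.nan
                    for k in range(nbins)])
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, cov
```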

The analytical function is fitted to obtain the correlation length-scale and the error variance of the first guess σ². The first data bin, with zero radial distance, is not taken into account for the fitting since it includes the observation error variance. The variance in the first bin can instead be used to estimate the observation error variance.

Figure 3.1: Determination of the correlation length-scale and error covariance of the background (the panel shows the data covariance, its error, the data used for fitting, the fitted curve and the relative weight during fitting).

This approach is realized with the tool divafit.

3.2 Validation

The validation consists in using an independent data set to determine the accuracy of the results. This data set may be some observations explicitly set aside for validation purposes, or some newly available data. One can, for example, use satellite SST to validate a climatology obtained from in situ data. Some poorly known parameters can be varied to minimize the RMS error between the analysis and the validation data set. This provides a means to objectively calibrate some parameters. It is crucial for this step that the validation data set is independent of the observations used in the analysis.

Exercise 1: What would be the optimal signal-to-noise ratio if the analysed observations included the validation data set (in the context of optimal interpolation or the variational inverse method, with all other parameters kept constant)?

3.3 Cross-validation

In the cross-validation approach, one data point is excluded from the data set and the analysis is performed without it:
- The analysed field is interpolated at the location of the excluded point.
- This interpolated value is thus not influenced by the excluded data point, so their difference gives an indication of the accuracy of the field at that location.
- This procedure is repeated for all observations.
- The cross-validation RMS error is computed from the squared differences over all observations.

However, this procedure would require an additional analysis for every observation, which is computationally prohibitive for most applications. Craven and Wahba (1979) showed a numerically efficient way to compute the cross-validation RMS error with only one additional analysis.

If a large number of observations is available, a different approach can be adopted to estimate the error of the analysis (a sketch follows this list):
- A small subset of randomly chosen observations is set aside (the cross-validation data set).
- The analysis is performed without the cross-validation data set.
- The result is compared to the observations initially set aside and an RMS error is computed.
- The last two steps are repeated to optimize certain parameters of the analysis.
- Once the optimal parameters are derived, a final analysis is carried out using all data.

The size of the cross-validation data set has to be sufficiently large for the error estimate to be robust, but at the same time it has to be a negligible fraction of the total data set (typically 1%). This explains why a large amount of data is necessary. This approach is used in DINEOF to determine the optimal number of EOFs.

An estimate of the global accuracy of an analysis is already useful information in itself. But by performing the analysis with several sets of parameters, one can compute the cross-validator for each analysis. The set of parameters producing the lowest cross-validator is then the optimal one.
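A minimal sketch of this random-subset cross-validation, used for example to pick a signal-to-noise ratio or correlation length; the function signature, the 1% split and the parameter grid are illustrative assumptions, and any gridding method can be plugged in as analysis_fn:

```python
import numpy as np

def cross_validate(analysis_fn, xobs, yobs, candidate_params, cv_fraction=0.01, seed=0):
    """Pick the parameter value giving the lowest RMS misfit on a withheld subset.
    analysis_fn(xfit, yfit, xeval, param) must return the analysed field at the
    positions xeval using only the observations (xfit, yfit); it stands for any
    gridding method (Cressman, OI, VIM, ...)."""
    rng = np.random.default_rng(seed)
    n = len(xobs)
    ncv = max(1, int(cv_fraction * n))           # size of the cross-validation set
    cv = rng.choice(n, size=ncv, replace=False)  # indices set aside for validation
    fit = np.setdiff1d(np.arange(n), cv)         # indices used in the analysis

    rms = []
    for p in candidate_params:
        # Analyse without the cross-validation data, evaluate at their locations
        pred = analysis_fn(xobs[fit], yobs[fit], xobs[cv], p)
        rms.append(np.sqrt(np.mean((pred - yobs[cv]) ** 2)))

    best = candidate_params[int(np.argmin(rms))]
    return best, rms
```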

3.4 Summary

- Don't skip the validation of your results!
- If you don't have an additional dataset, cross-validation allows you to estimate the error of your analysis:
  - set some data aside;
  - do the analysis;
  - compute the RMS error between the analysis and the data set aside.
- By minimizing the RMS error, you can estimate some parameters of your analysis.

Chapter 4
Real-world example

The following shows an application of Diva to compute three-dimensional fields of temperature and salinity in the North Atlantic by Troupin et al. (2008).

4.1 Data

Data were gathered from various databases in the region 0–60°N, 0–50°W. We processed them to remove duplicates, detect outliers and perform vertical interpolation with the weighted-parabolas method (Reiniger and Ross, 1968).

Figure 4.1: Location of temperature observations at 200 m and 3500 m in winter and summer: (a) winter, 200 m; (b) winter, 3500 m; (c) summer, 200 m; (d) summer, 3500 m.

4.2 Finite-element mesh

The resolution of the minimization problem relies on a highly optimized finite-element technique, which provides a computational cost independent of the number of data points and allows real boundaries (coastlines and bottom) to be taken into account.

Figure 4.2: Meshes generated at 200 m (left) and 3500 m (right).

4.3 Results

Our results are compared with the 1/4° climatology from the World Ocean Atlas 2001 (WOA01 in the following; Boyer et al., 2005).

Figure 4.3: Comparison of the analysed temperature field with Diva (left) and WOA01 (right): (a) Diva, 200 m; (b) WOA01, 200 m; (c) Diva, 3500 m; (d) WOA01, 3500 m.

Acknowledgements

- The National Fund for Scientific Research (F.R.S.-FNRS, Belgium)
- SeaDataNet project
- SESAME, Southern European Seas: Assessing and Modelling Ecosystem changes
- ECOOP, European COastal-shelf sea OPerational observing and forecasting system

Bibliography

Abramowitz, M. and A. Stegun, 1964: Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover Publications, New York.

Alvera-Azcárate, A., A. Barth, M. Rixen, and J.-M. Beckers, 2005: Reconstruction of incomplete oceanographic data sets using Empirical Orthogonal Functions. Application to the Adriatic Sea surface temperature. Ocean Modelling, 9, 325–346, doi:10.1016/j.ocemod.2004.08.001.

Beckers, J.-M. and M. Rixen, 2003: EOF calculation and data filling from incomplete oceanographic datasets. Journal of Atmospheric and Oceanic Technology, 20, 1839–1856.

Boyer, T., S. Levitus, H. Garcia, R. Locarnini, C. Stephens, and J. Antonov, 2005: Objective analyses of annual, seasonal, and monthly temperature and salinity for the world ocean on a 0.25 degree grid. International Journal of Climatology, 25, 931–945.

Brankart, J.-M. and P. Brasseur, 1996: Optimal analysis of in situ data in the western Mediterranean using statistics and cross-validation. Journal of Atmospheric and Oceanic Technology, 13, 477–491.

Brasseur, P., J.-M. Beckers, J.-M. Brankart, and R. Schoenauen, 1996: Seasonal temperature and salinity fields in the Mediterranean Sea: Climatological analyses of a historical data set. Deep-Sea Research, 43, 159–192.

Craven, P. and G. Wahba, 1979: Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik, 31, 377–403.

Girard, D., 1989: A fast Monte-Carlo cross-validation procedure for large least squares problems with noisy data. Numerische Mathematik, 56, 1–23.

Reiniger, R. and C. Ross, 1968: A method of interpolation with application to oceanographic data. Deep-Sea Research, 15, 185–193.

Troupin, C., M. Ouberdous, F. Machin, M. Rixen, D. Sirjacobs, and J.-M. Beckers, 2008: Three-dimensional analysis of oceanographic data with the software DIVA. Geophysical Research Abstracts, 4th EGU General Assembly, volume 10.