Feb 21 and 25: Local weighted least squares: Quadratic loess smoother

An example of weighted least squares fitting of data to a simple model, for the purposes of simultaneous smoothing and interpolation, is the quadratic loess smoother. In one dimension, this can be used to smooth, filter or interpolate a time series of values that may or may not be at regular sampling intervals. An application in two or more dimensions could be to produce a gridded analysis or climatology from a set of data observed at irregularly spaced locations and times, such as a set of shipboard hydrographic observations of temperature, salinity, or other ocean properties.

The model is a local quadratic function, which can be written in terms of coordinates centered at the estimation point, x_o, y_o:

y = a_1 + a_2 (x - x_o) + a_3 (y - y_o) + a_4 (x - x_o)(y - y_o) + a_5 (x - x_o)^2 + a_6 (y - y_o)^2

which has a design matrix with rows

E_i = [ 1   (x_i - x_o)   (y_i - y_o)   (x_i - x_o)(y_i - y_o)   (x_i - x_o)^2   (y_i - y_o)^2 ]

data vector d = [d_1, ..., d_N]^T, and coefficient vector a = [a_1, ..., a_6]^T.

The flexibility to arbitrarily choose a linear model and a set of weights means the least squares fitting approach can be tailored to meet a user's notion of what constitutes a rational model, based on some a priori knowledge of the processes being observed and modeled. An example of this is the Climatology of the Australian Regional Seas (CARS):

Ridgway, K., J. R. Dunn and J. L. Wilkin (2002), Ocean interpolation by 4-dimensional weighted least squares: Application to the waters around Australasia, Journal of Atmospheric and Oceanic Technology, 19, 1357-1375.
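Before moving on, here is a minimal Matlab sketch of assembling the design matrix above for one estimation point and solving the (for now unweighted) least squares problem. The data are synthetic and the variable names are illustrative, not from the course scripts; the loess weights are introduced next.

% Synthetic scattered observations d at locations (xi, yi)
xi = rand(50,1); yi = rand(50,1);
d  = sin(2*pi*xi).*cos(2*pi*yi) + 0.1*randn(50,1);
xo = 0.5; yo = 0.5;                           % estimation point
dx = xi - xo; dy = yi - yo;                   % coordinates centered on (xo, yo)
E  = [ones(50,1) dx dy dx.*dy dx.^2 dy.^2];   % N-by-6 design matrix
a  = E\d;                                     % least squares solution for a_1 ... a_6
Dhat = a(1);                                  % fitted value at (xo, yo) is simply a_1

Because the coordinates are centered on the estimation point, the fitted value at (x_o, y_o) is just the constant term a_1.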

Additional terms in the model can be included provided they have a form that can be expressed with the design matrix, coefficients a_k, and data coordinates (including, e.g., observation times t_i).

Length and time scales of variability

The quadratic loess smoother requires an a priori choice be made for the scales (usually length or time) to apply in the selection of the smoothing weights. The smoother can be interpreted as a filter, since the linear weighting procedure is effectively implemented as a convolution of the weights with the data. It is more general in the sense that the data do not have to be at regular intervals, because the weights are computed simply as a function of the normalized distance r. It has been shown empirically that the effective cutoff frequency f_c of the quadratic loess smoother, when it is interpreted as a filter, is f_c ~ L^-1, where L is the half width (i.e. the normalization scale) used in the loess smoother.

If the loess smoother is to be used to deliberately remove certain scales of variability (i.e. as a filter), then selection of L is straightforward. However, if the objective is to use the smoother to do the best possible job of interpolating gaps in the data, then the smoothing scale should be adapted to the natural length or time scales of variability in the underlying physical process being observed.

The weighting can be viewed in terms of the model equations to which we seek the least squares best solution for the parameters a. We weight the rows of the matrix equation Ea = d by weights w_i, which can be summarized as a weighting matrix W = diag(w), and solve

W E a = W d,   i.e.   Ehat a = W d

>> a = Ehat\(W*d);

The classic quadratic loess smoother uses the tricube weighting function:

w_i = (1 - r_i^3)^3   for r_i < 1,   w_i = 0 otherwise

where r might be normalized in one of two ways:

(1) r is a normalized Cartesian distance with prescribed smoothing scale L:

r = [ ((x - x_o)/L)^2 + ((y - y_o)/L)^2 ]^(1/2)

(2) r is normalized differently for each estimation point x_o, y_o, after finding the distance r*_max that encloses the nearest N data points:

r* = [ (x - x_o)^2 + (y - y_o)^2 ]^(1/2)
r*_sorted = sort(r*)
r*_max = r*_sorted(N)
r = r* / r*_max

Since the data coordinates are transformed to be with respect to the estimation location, x_o, y_o, the final loess estimate is simply a_1. Two-dimensional spatial mapping using a loess filter is demonstrated in the Matlab script jw_lecture_loess2d.m.
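The pieces above combine into a complete local estimate. A minimal sketch of the weighted fit at a single point, using the tricube weights and the Cartesian normalization of option (1); the synthetic data and variable names are illustrative, not those of the course script:

% Quadratic loess estimate at one point (xo, yo)
xi = rand(200,1); yi = rand(200,1);               % data locations
d  = cos(2*pi*xi).*sin(2*pi*yi) + 0.1*randn(200,1);
L  = 0.3;                                         % smoothing half-width
xo = 0.5; yo = 0.5;                               % estimation point
dx = xi - xo; dy = yi - yo;
r  = sqrt((dx/L).^2 + (dy/L).^2);                 % normalized distance, option (1)
w  = (1 - r.^3).^3 .* (r < 1);                    % tricube weights; zero for r >= 1
E  = [ones(size(dx)) dx dy dx.*dy dx.^2 dy.^2];   % local quadratic design matrix
W  = diag(w);
a  = (W*E)\(W*d);                                 % weighted least squares, Ehat\(W*d)
Dhat = a(1);                                      % loess estimate at (xo, yo)

Looping this calculation over a grid of estimation points produces the smoothed, gridded map that the two-dimensional demonstration script illustrates.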

Optimal Interpolation / Objective Analysis / Gauss-Markov smoothing

John's old scanned notes on OI

This brings us to the method of optimal interpolation (OI), also known as objective mapping or Gauss-Markov smoothing. [See Emery and Thomson, section 4.2.] Optimal interpolation estimates the field being observed at an arbitrary location and time through a linear combination of the available data. The weights used are chosen so that the expected error of the estimate is a minimum in the least squares sense, and the estimate itself is unbiased (i.e. has the same mean as the true field). OI is therefore sometimes referred to as the Best Linear Unbiased Estimator (BLUE) of a field. The underlying covariance length and time scales of the data and true field enter into the computation of the linear weights.

Important concepts in optimal interpolation:

- OI produces the best linear unbiased estimate of a field from a set of arbitrarily distributed observations.
- Central to the estimation procedure is knowledge of the underlying covariance between the data and the process being observed (the model-data covariance), and of the data being used with themselves (the data-data covariance).
- The data-data covariance includes an a priori estimate of the uncertainty (error) in the observations. The model-data and data-data covariance patterns should be similar. If the data errors are independent, the error variance simply adds to the diagonal of the data-data covariance matrix. If the data errors are correlated, off-diagonal elements of the data-data covariance matrix would differ from the model-data covariance, but in practice this is seldom if ever considered.
- Frequently, the covariance is assumed to be homogeneous and isotropic, in which case it becomes simply a function of the distance separating the locations of the data and model (grid) points. If valid, assumptions of homogeneity and isotropy facilitate the estimation of the shape of the covariance function by taking an ensemble of data covariances binned according to spatial and/or temporal lags.
- The OI method produces an objective estimate of the expected error in the result.
- The OI technique can be formulated to simultaneously interpolate different but related data types (e.g. winds and geopotential heights) provided there is a linear relationship between the model and data (e.g. geostrophic winds, or ocean currents, computed from geopotential, or sea surface, heights). In the case of geostrophic turbulence, the assumption of isotropy dictates a fixed relationship between the covariance of the individual velocity components and streamfunction. Simultaneously estimating multiple variables has the advantage that known physical constraints (e.g. continuity, geostrophy) can be incorporated into the mapping procedure, thereby producing results that are balanced kinematically and/or dynamically.

Example: Combining altimeter sea surface height observations and velocity observations from sequential satellite imagery: Wilkin, J. L., M. M. Bowen and W. J. Emery (2002), Mapping mesoscale currents by optimal interpolation of satellite radiometer and altimeter data, Ocean Dynamics, 52, 95-103.

John's old scanned notes

Optimal interpolation exploits knowledge of the autocorrelation of a process to determine the relative weight to be given to a set of data (in a weighted sum) to estimate the true field at a certain location (in space and time). The autocorrelation essentially indicates which data are near and which are far from the estimation point.

The problem: Estimate some variable, D, at location(s) x_a, t on the basis of a set of neighboring observations (the data) d at locations x_b, t. The data are assumed to be observations of the true field with some observational error:

d(x,t) = D(x,t) + n(x,t)

The measurement errors n are assumed unbiased, <n> = 0, and uncorrelated with the field being observed, D.

[In practice it is desirable to first remove any well resolved (long space/time scale) deterministic signals from the data, so that the interpolation is applied to a data set with reduced variance. For example, a seasonal cycle or spatial variability of very long wavelength.]

We denote by Dhat(x) the estimate of the true value at location x (and time t), and will compute this as a linear weighted sum of the data:

Dhat(x) = Dbar + (d - dbar)^T w(x) = Dbar + w^T(x) (d - dbar)

where the weights w(x) are not specified (yet), and the dependence on x emphasizes that the weights will be different for every estimation location. The assumption that we have unbiased data implies that the mean of the data, dbar, will be a valid estimate of the mean of the field, Dbar.

The weights w are selected so as to minimize the expected mean square error between the linear weighted estimate, Dhat(x), and the true value of the variable being observed, D(x). (Of course, we don't actually know what this true value is; if we did we probably wouldn't be bothering with all this.) Therefore, we minimize

n^2 = <(D(x) - Dhat(x))^2>    (true minus estimate)

n^2 = <(D - Dbar - (d - dbar)^T w)^2>
    = <(D - Dbar)^2> - w^T <(d - dbar)(D - Dbar)> - <(D - Dbar)(d - dbar)^T> w + w^T <(d - dbar)(d - dbar)^T> w

Here, <(d - dbar)(d - dbar)^T> is the data-data covariance matrix, which we denote as C.

The 2nd through 4th terms are of the form

-A^T B - B^T A + A^T C A,   with A = w and B = <(d - dbar)(D - Dbar)>

and we denote <(D - Dbar)(d - dbar)^T> as the model-data covariance matrix, C_md; this is the covariance of the true field at the estimation location, D(x), with all the data, d (hence it is a row vector the same length as the data), and B = C_md^T.

The identity of completing the square for a simple quadratic algebraic equation finds the constants k_1, k_2 that rearrange

a x^2 + b x + c = a(...)^2 + constant = a(x + k_1)^2 + k_2

When completing the square for the matrix expression above, it can be shown that it rearranges to

A^T C A - B^T A - A^T B = (A - C^-1 B)^T C (A - C^-1 B) - B^T C^-1 B

[You can verify this by expanding the line above and simplifying, noting that C is symmetric, C^T = C, and C C^-1 = I.]

Since C^T = C (because it is a covariance matrix), it follows that

n^2 = <(D - Dbar)^2> + (w - C^-1 C_md^T)^T C (w - C^-1 C_md^T) - C_md C^-1 C_md^T

The second term is quadratic (non-negative), and the expected value of n^2 is minimized by making this term zero. This gives us the optimal weights w:

w - C^-1 C_md^T = 0,   or   w = C^-1 C_md^T

The Best Linear Unbiased Estimate (BLUE) is then

Dhat = Dbar + w^T (d - dbar) = Dbar + C_md C^-1 (d - dbar)
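A toy example helps fix the recipe. The following Matlab sketch (synthetic 1-D data; the Markov covariance, scales and variances are illustrative assumptions, not the course example) computes the optimal weights and the BLUE on a regular grid:

% Toy 1-D optimal interpolation
a  = 0.2; s2 = 1.0; e2 = 0.05;              % covariance scale, signal and error variance
xd = rand(30,1);                            % data locations
d  = sin(2*pi*xd) + sqrt(e2)*randn(30,1);   % noisy observations of a "true" field
xm = linspace(0,1,101)';                    % estimation (model) grid
Cfun = @(r) s2*(1 + r/a).*exp(-r/a);        % Markov covariance function
Cdd  = Cfun(abs(xd - xd')) + e2*eye(30);    % data-data covariance, error on the diagonal
Cmd  = Cfun(abs(xm - xd'));                 % model-data covariance (101-by-30)
w    = Cdd\Cmd';                            % optimal weights w = C^-1 * Cmd', per grid point
Dhat = mean(d) + w'*(d - mean(d));          % BLUE at every grid point

Each column of w holds the weights for one estimation location, exactly as w(x) differed from point to point in the derivation above.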

In practice, the data-data covariance matrix can be very large and expensive to invert. Typically, it has a much larger dimension than the model, which would be a grid of coordinates on which we are computing our climatology or analysis. It is more computationally efficient to compute the product of the inverse covariance with the data directly by solving, in a least squares sense, the problem

C w* = (d - dbar)

by a Matlab matrix left divide:

>> ws = Cdd\(d-dbar);

which gives us the product w* = C^-1 (d - dbar), and the estimate is then calculated as

Dhat = Dbar + C_md w* = Dbar + C_md C^-1 (d - dbar)

However, we would still need the data-data covariance inverse to make a formal estimate of the expected error in the analysis.
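As a concrete illustration of this shortcut, continuing the toy 1-D example above (a sketch, with the same illustrative names):

ws   = Cdd\(d - mean(d));          % w* = Cdd^-1 (d - dbar), a single solve
Dhat = mean(d) + Cmd*ws;           % estimate at every grid point, Dbar + Cmd * w*

The single N-vector solve replaces the N-by-M weight matrix of the previous sketch, which is why it is cheaper when the model grid (M points) is large.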

Optimal interpolation example

The Matlab scripts cov_mercator.m and oi_mercator.m demonstrate fitting a covariance function to a set of synthetic data, and using this function to optimally interpolate to a regular grid. The data used in this example is ocean temperature taken from the French Mercator operational ocean forecast system for the North Atlantic.

cov_mercator: The script cov_mercator.m loads the example Mercator snapshot from a mat file, subsamples the data to a small (3%) subset, and adds some normally distributed random noise to emulate instrument error (or unresolved high frequency physical variability, due e.g. to internal waves in the case of in situ ocean temperature observations). The lon/lat coordinates of the sub-sampled data set are converted to simple x,y coordinates w.r.t. the southwest corner of the data range, and the separation distance r between all data pairs is computed so that a binned lagged covariance as a function of r can be estimated from the data themselves, i.e.

C(r) = <d(x) d(x + r)>

Two functional forms (Gaussian and Markov) for C(r) are fitted to the estimated covariance using Matlab's fminsearch function. Note that the covariance at r = 0 is not used in the fit because it includes the effect of the independent observational, or error, variance.

Gaussian:  C(r) = s^2 exp(-r^2/a^2)
Markov:    C(r) = s^2 (1 + r/a) exp(-r/a)

where s^2 is the signal variance, i.e. the variance of the true field at zero lag. The apparent error variance, e^2, is calculated from the difference between the data variance at r = 0 (i.e. var[data]) and the signal variance as r -> 0 indicated by the y-intercept of the functional fit, i.e. C(0):

e^2 = var[data] - s^2

oi_mercator: Using the fitted Markov covariance function, normalized model-data (C_md) and data-data (C_dd) covariance matrices are computed. The data-data covariance is augmented on the diagonal with the ratio of error to signal variance. The optimal interpolation fit to the data is computed by direct inversion of C_dd, and also by the Matlab matrix-left-divide operation to compute the product C_dd^-1 (d - dbar) directly.
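A sketch of the covariance-fitting step in the spirit of cov_mercator.m (entirely synthetic data; the binning and starting guess are illustrative assumptions, not the course script):

% Fit the Markov form to a binned lagged covariance with fminsearch
xd = rand(500,1);
d  = sin(2*pi*xd) + 0.2*randn(500,1);
dd = d - mean(d);
R  = abs(xd - xd');                          % all pairwise separations
P  = dd*dd';                                 % all pairwise covariance products
edges = 0:0.05:0.5; nb = numel(edges)-1;
Cb = zeros(1,nb); rb = zeros(1,nb);
for k = 1:nb                                 % bin-average by lag, excluding r = 0
    in = R > 0 & R >= edges(k) & R < edges(k+1);
    Cb(k) = mean(P(in)); rb(k) = mean(R(in));
end
markov = @(p,r) p(1)*(1 + r/p(2)).*exp(-r/p(2));   % p = [s2, a]
cost = @(p) sum((markov(p,rb) - Cb).^2);           % misfit to the binned covariance
p  = fminsearch(cost, [var(d) 0.2]);               % fitted signal variance s2 and scale a
e2 = var(d) - p(1);                                % apparent error variance, var[data] - s2

The r = 0 pairs are excluded from the binning for the reason given above: the zero-lag data variance includes the observational error variance, which the functional fit should extrapolate across.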

Expected errors of the OI are calculated and qualitatively compared to the actual residuals of the fit to show that, as expected, approximately 65% of the residuals fall within the expected errors.

If the data set (N points) is large, the matrix sizes may become too large for practical handling in Matlab (or any other language), because the OI problem has to solve matrix inversions or simultaneous equations of dimension N x N. This may be handled by dividing the model grid into subdomains, like tiles, with subsets of the data limited to only those points that fall within the model tile plus a halo region around the tile. The halo region should be at least one covariance scale wide to ensure smoothness at the tile boundaries. The data-data matrix C_dd has to be computed anew for each tile, but computing e.g. order(10) OI operations on order(N/10) data elements may be faster than one OI on order(N) data elements, because the computational effort of the matrix operations scales with N^3.

If there are too many data within a few covariance scales to practically invert C_dd, then it is likely that there are more data than necessary to resolve the mapped field. This indicates that a shorter covariance scale can probably be used. Alternatively, it is probably safe to decimate the data (just use less of it, thereby making N smaller) or average the observations in small bins. For independent errors, the binning step will reduce the expected error (the noise variance) of the binned values, and this information can be carried through the analysis.

Expected errors

A posteriori testing of error estimates can be done to see whether the proportion of residuals within the expected errors is statistically consistent. More in-depth tests would compare the results to a set of independent data, such as from another instrument, or data withheld from the OI itself. See Walker and Wilkin (1998) for an example of checking the validity of error estimates through a Chi-squared test. The expected error is computed in the demonstration script oi_mercator.m.

The expected error in the analysis, or estimate, at location x_k is given by

e^2(x_k) = s^2 - c_md C^-1 c_md^T

where s^2 is the variance of the signal (the true solution) and c_md is the covariance of the model (at location x_k) with the data d, i.e. it is the k-th row of C_md. The vector of all estimated error variances is

e^2 = diag( s^2 I - C_md C^-1 C_md^T )

The optimal interpolation analysis of the data therefore states that our best linear unbiased estimate of the true signal is Dhat +/- e.
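Continuing the toy 1-D example above (same Cdd, Cmd and s2; a sketch, not the oi_mercator.m code), the expected error at every grid point follows in one line:

e2map = s2 - diag(Cmd*(Cdd\Cmd'));     % expected error variance at each grid point
e     = sqrt(max(e2map,0));            % expected RMS error, for the estimate Dhat +/- e

Far from any data, Cmd tends to zero and e2map tends to s2, recovering the limiting behavior discussed below.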

If we were able to make some independent analysis of the error in Dhat, such as when we have actually fabricated the data to test the method as in the oi_mercator.m script, then we expect that for about 68% of the estimates the true value D will fall within Dhat +/- e.

From the equation for e^2 we see that the maximum the expected error variance can be is simply the signal variance. This occurs when there are no data within a few covariance scales of the estimation location and c_md is 0. In this case our best estimate is just the background field, and our uncertainty is the full variance of the signal; basically, the OI is unable to inform us. If we have some data close (in terms of covariance scale) to the estimation location, and c_md is greater than zero, then the expected error is less than the signal variance and the OI has helped us.

If the data error variance is small, the diagonal elements of C^-1 are large, and this further decreases the expected error. So having better quality data improves the skill of the estimate.
