Feb 21 and 25: Local weighted least squares: Quadratic loess smoother
An example of weighted least squares fitting of data to a simple model for the purposes of simultaneous smoothing and interpolation is the quadratic loess smoother. In one dimension, this can be used to smooth, filter or interpolate a time series of values that may or may not be at regular sampling intervals. An application in two or more dimensions could be to produce a gridded analysis or climatology from a set of data observed at irregularly spaced locations and times, such as a set of shipboard hydrographic observations of temperature, salinity, or other ocean properties.

The model is a local quadratic function, which can be written in terms of coordinates centered at the estimation point, $x_o, y_o$:

$$y = a_1 + a_2(x - x_o) + a_3(y - y_o) + a_4(x - x_o)(y - y_o) + a_5(x - x_o)^2 + a_6(y - y_o)^2$$

which has the design matrix with rows

$$E_i = [\,1,\; x_i - x_o,\; y_i - y_o,\; (x_i - x_o)(y_i - y_o),\; (x_i - x_o)^2,\; (y_i - y_o)^2\,]$$

data $d = [d_1, \ldots, d_n]^T$, and coefficients $a = [a_1, \ldots, a_6]^T$.

The flexibility to arbitrarily choose a linear model and a set of weights means the least squares fitting approach can be tailored to meet a user's notion of what constitutes a rational model based on some a priori knowledge of the processes being observed and modeled. An example of this is the Climatology of the Australian Regional Seas (CARS): Ridgway, K., J. R. Dunn and J. L. Wilkin (2002), Ocean interpolation by 4-dimensional weighted least squares: Application to the waters around Australasia, Journal of Atmospheric and Oceanic Technology, 19.
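The construction of the design matrix above can be sketched in code. The following is a minimal NumPy illustration (Python standing in for the Matlab used elsewhere in these notes; the function name and sample coordinates are my own, for illustration only):

```python
import numpy as np

def loess_design_matrix(x, y, x0, y0):
    """Rows of E for the local quadratic model centered on (x0, y0).

    Columns correspond to coefficients a1..a6 of
    a1 + a2*dx + a3*dy + a4*dx*dy + a5*dx**2 + a6*dy**2.
    """
    dx = np.asarray(x, float) - x0
    dy = np.asarray(y, float) - y0
    return np.column_stack([np.ones_like(dx), dx, dy, dx * dy, dx**2, dy**2])

# Six observation locations and an estimation point (x0, y0) = (0.5, 0.5)
xi = np.array([0.0, 1.0, 0.0, 1.0, 0.5, 0.25])
yi = np.array([0.0, 0.0, 1.0, 1.0, 0.5, 0.75])
E = loess_design_matrix(xi, yi, 0.5, 0.5)   # shape (6, 6): one row per datum
```

Each row is a datum's coordinates expressed relative to the estimation point, which is what makes the final estimate reduce to the constant term $a_1$.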
Additional terms in the model can be included provided they have a form that can be expressed with the design matrix, coefficients $a_k$, and data coordinates (including, e.g., observation times $t_i$).

Length and time scales of variability

The quadratic loess smoother requires an a priori choice be made for the scales (usually length or time) to apply in the selection of the smoothing weights. The smoother can be interpreted as a filter, since the linear weighting procedure is effectively implemented as a convolution of the weights with the data. It is more general in the sense that the data do not have to be at regular intervals, because the weights are computed simply as a function of the normalized distance $r$. It has been shown, empirically, that the effective cutoff frequency $f_c$ of the quadratic loess smoother, when it is interpreted as a filter, is $f_c \approx L^{-1}$, where $L$ is the half width (i.e. the normalization scale) used in the loess smoother.

If the loess smoother is to be used to deliberately remove certain scales of variability (i.e. as a filter), then selection of $L$ is straightforward. However, if the objective is to use the smoother to do the best possible job of interpolating gaps in the data, then the smoothing scale should be adapted to the natural length or time scales of variability in the underlying physical process being observed.

The weighting can be viewed in terms of the model equations to which we seek the least squares best solution for the parameters $a$. We weight the rows of the matrix equation $Ea = d$ by weights $w_i$, which can be summarized as a weighting matrix $W = \mathrm{diag}(w)$:

$$\mathrm{diag}(w)\, E\, a = \mathrm{diag}(w)\, d, \quad \text{i.e.} \quad \hat{E} a = W d$$

solved with

>> a = Ehat\(W*d);

The classic quadratic loess smoother uses the weighting function:

$$w = (1 - r^3)^3 \quad \text{for} \quad r < 1$$

(and $w = 0$ otherwise), where $r$ might be normalized one of two ways:

(1) $r$ is a normalized Cartesian distance with prescribed smoothing scale $L$
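As a concrete sketch of the weighted fit (in Python/NumPy rather than the Matlab of the notes; function names are my own): tricubic weights are applied to the rows of $Ea = d$ and the system is solved by least squares, mirroring the `a = Ehat\(W*d)` step. Data lying exactly on a quadratic should be recovered exactly:

```python
import numpy as np

def tricubic_weights(r):
    """Classic loess weight w = (1 - r**3)**3 for r < 1, zero beyond."""
    r = np.asarray(r, float)
    return np.where(r < 1.0, (1.0 - r**3) ** 3, 0.0)

def weighted_fit(E, d, w):
    """Weight the rows of E a = d and solve in the least squares sense."""
    a, *_ = np.linalg.lstsq(E * w[:, None], d * w, rcond=None)
    return a

# 1-D demo with model columns [1, dx, dx**2]
x = np.linspace(-0.5, 0.5, 9)
E = np.column_stack([np.ones_like(x), x, x**2])
d = 1.0 + 2.0 * x + 0.5 * x**2            # exactly quadratic data
w = tricubic_weights(np.abs(x) / 1.0)     # normalized distance, L = 1
a = weighted_fit(E, d, w)                 # recovers [1.0, 2.0, 0.5]
```

Because all the data here fall inside $r < 1$, every row carries positive weight and the quadratic coefficients are recovered to machine precision.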
$$r = \left[ \left( \frac{x - x_o}{L} \right)^2 + \left( \frac{y - y_o}{L} \right)^2 \right]^{1/2}$$

(2) $r$ is normalized differently for each estimation point $x_o, y_o$ after finding the distance $r^*_{max}$ that encloses the nearest $N$ data points:

$$r^* = \left[ (x - x_o)^2 + (y - y_o)^2 \right]^{1/2}, \quad r^*_{max} = \mathrm{sort}(r^*)(N), \quad r = r^* / r^*_{max}$$

Since the data coordinates are transformed to be with respect to the estimation location, $x_o, y_o$, the final loess estimate is simply $a_1$.

Two-dimensional spatial mapping using a loess filter is demonstrated in the Matlab script jw_lecture_loess2d.m.

Optimal Interpolation / Objective Analysis / Gauss-Markov smoothing

John's old scanned notes on OI

This brings us to the method of optimal interpolation (OI), also known as objective mapping or Gauss-Markov smoothing. [See Emery and Thomson, section 4.2.]

Optimal interpolation estimates the field being observed at an arbitrary location and time through a linear combination of the available data. The weights used are chosen so that the expected error of the estimate is a minimum in the least squares sense, and the estimate itself is unbiased (i.e. has the same mean as the true field). OI is therefore sometimes referred to as the Best Linear Unbiased Estimator (BLUE) of a field. The underlying covariance length and time scales of the data and true field enter into the computation of the linear weights.

Important concepts in optimal interpolation:

- OI produces the best linear unbiased estimate of a field from a set of arbitrarily distributed observations.
- Central to the estimation procedure is knowledge of the underlying covariance between the data and the process being observed (the model-data covariance), and of the data being used with themselves (the data-data covariance).
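The adaptive normalization in option (2) can be sketched as follows (a Python stand-in for the Matlab scripts; the function name is illustrative only):

```python
import numpy as np

def adaptive_normalized_distance(xi, yi, x0, y0, N):
    """Normalize distances so the N-th nearest datum falls at r = 1,
    the edge of the tricubic weight's support (option (2) in the notes)."""
    rstar = np.hypot(xi - x0, yi - y0)     # r* = Cartesian distance to each datum
    rmax = np.sort(rstar)[N - 1]           # r*max: distance enclosing the nearest N points
    return rstar / rmax

# Data strung along a line; estimation point at the origin, span of N = 3 points
xi = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
yi = np.zeros(5)
r = adaptive_normalized_distance(xi, yi, 0.0, 0.0, N=3)
# the 3 nearest points have r <= 1; the rest fall outside the weight support
```

This is the variable-span behavior that lets the smoother adapt to uneven data density: in sparse regions the effective smoothing scale grows until it encloses $N$ points.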
- This data-data covariance includes an a priori estimate of the uncertainty (error) in the observations. The model-data and data-data covariance patterns should be similar. If the data errors are independent, the error variance simply adds to the diagonal of the data-data covariance matrix. If the data errors are correlated, off-diagonal elements of the data-data covariance matrix would differ from the model-data covariance, but in practice this is seldom if ever considered.
- Frequently, the covariance is assumed to be homogeneous and isotropic, in which case the covariance becomes simply a function of the distance separating the locations of the data and model (grid) points. If valid, assumptions of homogeneity and isotropy facilitate the estimation of the shape of the covariance function by taking an ensemble of data covariance binned according to spatial and/or temporal lags.
- The OI method produces an objective estimate of the expected error in the result.
- The OI technique can be formulated to simultaneously interpolate different but related data types (e.g. winds and geopotential heights) provided there is a linear relationship between the model and data (e.g. as in the case of geostrophic winds (or ocean currents) computed from geopotential (or sea surface) heights). In the case of geostrophic turbulence, the assumption of isotropy dictates a fixed relationship between the covariance of the individual velocity components and streamfunction. Simultaneously estimating multiple variables has the advantage that known physical constraints (e.g. continuity, geostrophy) can be incorporated into the mapping procedure, thereby producing results that are balanced kinematically and/or dynamically.

Examples: Combining altimeter sea surface height observations and velocity observations from sequential satellite imagery. Wilkin, J. L., M. M. Bowen and W. J. Emery (2002), Mapping mesoscale currents by optimal interpolation of satellite radiometer and altimeter data, Ocean Dynamics, 52.
John's old scanned notes

Optimal interpolation exploits knowledge of the autocorrelation of a process to determine the relative weight to be given to a set of data (in a weighted sum) to estimate the true field at a certain location (in space and time). The autocorrelation essentially indicates which data are near and which are far from the estimation point.
The problem: Estimate some variable, $D$, at location(s) $x_a, t$ on the basis of a set of neighboring observations (the data) $d$ at locations $x_b, t$. The data are assumed to be observations of the true field with some observational error:

$$d(x, t) = D(x, t) + n(x, t)$$

The measurement errors $n$ are assumed unbiased, $\langle n \rangle = 0$, and uncorrelated with the field being observed, $D$.

[In practice it is desirable to remove any well resolved (long space/time scales) deterministic signals from the data first so that the interpolation is being applied to a data set with reduced variance. (For example, a seasonal cycle or spatial variability of very long wavelength.)]

We denote $\hat{D}(x)$ as the estimate of the true value at location $x$ (and time $t$), and will compute this as a linear weighted sum of the data:

$$\hat{D}(x) = \overline{D} + (d - \overline{d})^T w(x) = \overline{D} + w^T(x)(d - \overline{d})$$

where the weights $w(x)$ are not specified (yet), and the dependence on $x$ emphasizes that the weights will be different for every estimation location. The assumption that we have unbiased data implies that the mean of the data, $\overline{d}$, will be a valid estimate of the mean of the field, $\overline{D}$.

The weights $w$ are selected so as to minimize the expected value of the mean square error between the linear weighted estimate, $\hat{D}(x)$, and the true value of the variable being observed, $D(x)$. (Of course, we don't actually know what this true value is; if we did we probably wouldn't be bothering with all this.) Therefore, we minimize:

$$n^2 = \langle (D(x) - \hat{D}(x))^2 \rangle$$

$$n^2 = \langle [\,(D - \overline{D}) - (d - \overline{d})^T w\,]^T [\,(D - \overline{D}) - (d - \overline{d})^T w\,] \rangle$$

$$n^2 = \langle (D - \overline{D})^2 \rangle - w^T \langle (d - \overline{d})(D - \overline{D}) \rangle - \langle (D - \overline{D})(d - \overline{d})^T \rangle w + w^T \langle (d - \overline{d})(d - \overline{d})^T \rangle w$$

Here, $\langle (d - \overline{d})(d - \overline{d})^T \rangle$ is the data-data covariance matrix, which we denote as $C$.
The 2nd through 4th terms are of the form

$$- A B^T - B A^T + A C A^T$$

$$- w^T \langle (d - \overline{d})(D - \overline{D}) \rangle - [\langle (d - \overline{d})(D - \overline{D}) \rangle]^T w + w^T C w$$

and we denote $\langle (D - \overline{D})(d - \overline{d})^T \rangle$ as the model-data covariance matrix, $C_{md}$, i.e. this is the covariance of the true field at the estimation location, $D(x)$, with all the data, $d$ (hence it is a vector the same length as the data).

The identity of completing the square for a simple quadratic algebraic equation finds the constants $k_1, k_2$ that rearrange

$$ax^2 + bx + c = a(\ldots)^2 + \text{constant} = a(x + k_1)^2 + k_2$$

When completing the square for the matrix equation above it can be shown that this rearranges to:

$$A C A^T - B A^T - A B^T = (A - B C^{-1}) C (A - B C^{-1})^T - B C^{-1} B^T$$

[You can verify this by expanding the line above and simplifying by noting that $C$ is symmetric, $C^T = C$, and $C C^{-1} = I$.]

So it follows (because $C$ is a covariance matrix, hence symmetric) that:

$$n^2 = \langle (D - \overline{D})^2 \rangle + (w^T - C_{md} C^{-1})\, C\, (w^T - C_{md} C^{-1})^T - C_{md} C^{-1} C_{md}^T$$

The second term is quadratic, and the expected value of $n^2$ is minimized by making this term zero. This gives us the optimal weights $w$:

$$w^T - C_{md} C^{-1} = 0 \quad \text{or} \quad w^T = C_{md} C^{-1}$$

The Best Linear Unbiased Estimate (BLUE) is then

$$\hat{D} = \overline{D} + w^T (d - \overline{d}) = \overline{D} + C_{md} C^{-1} (d - \overline{d})$$
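A compact numerical sketch of the BLUE formula (Python/NumPy, with an assumed Gaussian covariance function and invented locations, scales, and error variance; none of this is prescribed by the notes):

```python
import numpy as np

def gauss_cov(x1, x2, s2=1.0, a=0.3):
    """Assumed Gaussian covariance between two sets of 1-D locations."""
    return s2 * np.exp(-((x1[:, None] - x2[None, :]) / a) ** 2)

rng = np.random.default_rng(0)
xd = rng.uniform(0.0, 1.0, 20)            # data locations (irregular)
xm = np.linspace(0.0, 1.0, 50)            # estimation (grid) locations
e2 = 0.05                                 # observation error variance

# Noisy observations of a smooth "true" field
d = np.sin(2 * np.pi * xd) + rng.normal(0.0, np.sqrt(e2), xd.size)
dbar = d.mean()

C = gauss_cov(xd, xd) + e2 * np.eye(xd.size)   # data-data covariance + error on diagonal
Cmd = gauss_cov(xm, xd)                        # model-data covariance

# BLUE: Dhat = dbar + Cmd C^{-1} (d - dbar)
Dhat = dbar + Cmd @ np.linalg.solve(C, d - dbar)
```

With 20 data points covering the interval and a covariance scale comparable to the data spacing, the mapped field tracks the underlying sinusoid closely despite the noise.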
In practice, the data-data covariance matrix can be very large and expensive to invert. Typically it has a much larger dimension than the model, which would be a grid of coordinates on which we are computing our climatology or analysis. It is more computationally efficient to compute the product of the inverse and the data directly by solving, in a least squares sense, the problem:

$$C w^* = (d - \overline{d})$$

by a Matlab matrix left divide:

>> ws = Cdd\(d-dbar);

which gives us the product $w^* = C^{-1}(d - \overline{d})$, and the estimate is then calculated as:

$$\hat{D} = \overline{D} + C_{md} C^{-1} (d - \overline{d}) = \overline{D} + C_{md} w^*$$

However, we would still need the data-data covariance inverse to make a formal estimate of the expected error in the analysis.

Optimal interpolation example

The Matlab scripts cov_mercator.m and oi_mercator.m demonstrate fitting a covariance function to a set of synthetic data, and using this function to optimally interpolate to a regular grid. The data used in this example is ocean temperature taken from the French Mercator operational ocean forecast system for the North Atlantic.

cov_mercator: The script cov_mercator.m loads the example Mercator snapshot from a mat file, subsamples the data to a small (3%) subset, and adds some normally distributed random noise to emulate instrument error (or unresolved high frequency physical variability due, e.g., to internal waves in the case of in situ ocean temperature observations).

The lon/lat coordinates of the sub-sampled data set are converted to simple x,y coordinates w.r.t. the southwest corner of the data range, and then the separation distance between all data (r) is computed so that a binned lagged covariance as a function of r
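The equivalence of the explicit inverse and the left-divide route can be checked numerically; `np.linalg.solve` plays the role of Matlab's backslash here (a sketch with a random symmetric positive definite matrix standing in for $C_{dd}$):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
A = rng.standard_normal((n, n))
Cdd = A @ A.T + n * np.eye(n)        # symmetric positive definite "data-data" matrix
rhs = rng.standard_normal(n)         # plays the role of (d - dbar)

w_inv = np.linalg.inv(Cdd) @ rhs     # explicit inverse: avoid for large n
w_solve = np.linalg.solve(Cdd, rhs)  # the Cdd\(d-dbar) analogue: one factorization
```

The two agree to machine precision; the solve route is preferred because it factorizes $C_{dd}$ once rather than forming and multiplying by the full inverse.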
can be computed from the data themselves, i.e. estimate

$$C(r) = \langle d'(\mathbf{x})\, d'(\mathbf{x} + r) \rangle$$

Two functional forms (Gaussian and Markov) for $C(r)$ are fitted to the estimated covariance using Matlab's fminsearch function. Note that the covariance at $r = 0$ is not used in the fit because this includes the effect of the independent observational, or error, variance.

Gaussian: $C(r) = s^2 \exp(-r^2 / a^2)$

Markov: $C(r) = s^2 (1 + r/a) \exp(-r/a)$

where $s^2$ is the signal variance, i.e. the variance of the true field at zero lag. The apparent error variance, $e^2$, is calculated from the difference of the data variance at $r = 0$ (i.e. var[data]) and the signal variance as $r \to 0$ indicated by the y-intercept of the functional fit, i.e. $C(0)$:

$$e^2 = \mathrm{var}[d] - s^2$$

oi_mercator: Using the fitted Markov covariance function, normalized model-data ($C_{md}$) and data-data ($C_{dd}$) covariance matrices are computed. The data-data matrix is augmented on the diagonal with the ratio of error to signal variance. The optimal interpolation fit to the data is computed by direct inversion of $C_{dd}$, and also by the Matlab matrix left divide operation to compute the product $C_{dd}^{-1} d$ directly.
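A hypothetical stand-in for the fminsearch fit: a simple grid search in Python/NumPy over the Markov parameters, applied here to a synthetic binned-lag covariance whose true parameters are known (purely illustrative; the real scripts fit covariances estimated from the Mercator data):

```python
import numpy as np

def markov_cov(r, s2, a):
    """Markov form C(r) = s2 * (1 + r/a) * exp(-r/a)."""
    return s2 * (1.0 + r / a) * np.exp(-r / a)

# Synthetic binned-lag covariance with known parameters; r = 0 is excluded,
# since C(0) also contains the observational error variance
r = np.linspace(0.05, 2.0, 40)
C_obs = markov_cov(r, s2=2.0, a=0.5)

# Coarse grid search standing in for Matlab's fminsearch
s2_grid = np.linspace(0.5, 4.0, 71)    # includes the true s2 = 2.0
a_grid = np.linspace(0.1, 1.5, 71)     # includes the true a = 0.5
sse, s2_fit, a_fit = min((np.sum((markov_cov(r, s2, a) - C_obs) ** 2), s2, a)
                         for s2 in s2_grid for a in a_grid)
```

The grid search recovers the true $(s^2, a)$ because they lie on the grid; a derivative-free optimizer such as fminsearch does the same job without needing a grid.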
Expected errors of the OI are calculated, and qualitatively compared to the actual residuals of the fit to show that, as expected, approximately 68% of the residuals fall within the expected errors.

If the data set ($N$ points) is large, the matrix sizes may get too large for practical handling in Matlab (or any other language) because the OI problem has to solve matrix inversions or simultaneous equation solutions of dimension $N \times N$. This may be handled by dividing the model grid into subdomains, like tiles, with subsets of the data limited to only those points that fall within the model tile plus a halo region around the tile. The halo region should be at least one covariance scale wide to ensure smoothness at the tile boundaries. The data-data matrix $C_{dd}$ will have to be computed anew for each tile, but computing, e.g., order(10) OI operations for order(N/10) data elements may be faster than one OI for order(N) data elements. This is because the computational effort of the matrix operations scales with $N^3$.

If there are too many data within a few covariance scales to practically invert $C_{dd}$, then it is likely that there are more data than necessary to resolve the mapped field. This indicates that a shorter covariance scale can probably be used. Alternatively, it is probably safe to decimate the data (just use fewer data, thereby making $N$ smaller) or average the observations in small bins. For independent errors, the binning step will reduce the expected error (the noise variance) of the binned values, and this information can be carried through the analysis.

Expected errors

A posteriori testing of error estimates can be done to see whether the proportion of residuals within the expected errors is statistically consistent. More in-depth tests would compare the results to a set of independent data, such as from another instrument, or data withheld from the OI itself.
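The claim that bin averaging reduces the noise variance of independent errors by the number of observations per bin can be checked with a quick simulation (illustrative Python; the bin count and noise level are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
noise_var = 1.0
m = 25                                    # independent observations per bin
samples = rng.normal(0.0, np.sqrt(noise_var), size=(10000, m))
binned = samples.mean(axis=1)             # one averaged value per bin

# expected noise variance of the binned values: noise_var / m = 0.04
binned_var = binned.var()
```

The sample variance of the binned values comes out close to `noise_var / m`, which is the reduced error variance that can be carried through the subsequent OI analysis.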
See Walker and Wilkin (1998) for an example of checking the validity of error estimates through a Chi-squared test. The expected error is computed in the demonstration script oi_mercator.m.

The expected error in the analysis, or estimate, is given by

$$e^2(x_k) = s^2 - c_{md} C^{-1} c_{md}^T$$

where $s^2$ is the variance of the signal (the true solution) and $c_{md}$ is the covariance of the model (at location $x_k$) with the data $d$, i.e. it is the $k$-th row of $C_{md}$. The vector of all estimated errors is

$$e^2 = \mathrm{diag}\left( s^2 I - C_{md} C^{-1} C_{md}^T \right)$$

The optimal interpolation analysis of the data therefore states that our best linear unbiased estimate of the true signal is $\hat{D} \pm e$.
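The expected error formula can be sketched directly (Python/NumPy; the Gaussian covariance function, locations, and variances below are invented for illustration):

```python
import numpy as np

def cov(x1, x2, s2=1.0, a=0.2):
    """Assumed Gaussian covariance for this illustration."""
    return s2 * np.exp(-((x1[:, None] - x2[None, :]) / a) ** 2)

def expected_error_var(s2, Cmd, Cdd):
    """e2 = diag(s2*I - Cmd Cdd^{-1} Cmd^T), one value per estimation point."""
    return s2 - np.einsum('ij,ji->i', Cmd, np.linalg.solve(Cdd, Cmd.T))

s2, e2 = 1.0, 0.1
xd = np.array([0.2, 0.25, 0.3])     # data clustered near x = 0.25
xm = np.array([0.25, 5.0])          # one grid point near the data, one far away

Cdd = cov(xd, xd, s2) + e2 * np.eye(xd.size)
Cmd = cov(xm, xd, s2)
err = expected_error_var(s2, Cmd, Cdd)
# err[0] is well below s2 (nearby data help); err[1] reverts to s2 (c_md ~ 0)
```

This reproduces the limiting behavior discussed below: the expected error is bounded above by the signal variance, which it attains when no data lie within a few covariance scales of the estimation point.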
If we were able to make some independent analysis of the error in $\hat{D}$, such as when we actually fabricated the data to test the method as in the oi_mercator.m script, then we expect that for about 68% of the estimates the true value $D$ would fall within $\hat{D} \pm e$.

From the equation for $e^2$ we see that the maximum the expected error can be is simply the signal variance. This occurs when there are no data within a few covariance scales of the estimation location and $c_{md}$ is 0. In this case our best estimate is just the background field and our uncertainty is the full variance of the signal: basically the OI is unable to help inform us. If we have some data close (in terms of covariance scale) to the estimation location, and $c_{md}$ is greater than zero, then the expected error is less than the signal variance and OI has helped us. If the data error variance is small, the diagonal elements of $C^{-1}$ are large, and this would further decrease the expected error. So having better quality data improves the skill of the estimate.
More informationFrequentist-Bayesian Model Comparisons: A Simple Example
Frequentist-Bayesian Model Comparisons: A Simple Example Consider data that consist of a signal y with additive noise: Data vector (N elements): D = y + n The additive noise n has zero mean and diagonal
More informationGaussian Filtering Strategies for Nonlinear Systems
Gaussian Filtering Strategies for Nonlinear Systems Canonical Nonlinear Filtering Problem ~u m+1 = ~ f (~u m )+~ m+1 ~v m+1 = ~g(~u m+1 )+~ o m+1 I ~ f and ~g are nonlinear & deterministic I Noise/Errors
More informationConsider the joint probability, P(x,y), shown as the contours in the figure above. P(x) is given by the integral of P(x,y) over all values of y.
ATMO/OPTI 656b Spring 009 Bayesian Retrievals Note: This follows the discussion in Chapter of Rogers (000) As we have seen, the problem with the nadir viewing emission measurements is they do not contain
More informationDynamic System Identification using HDMR-Bayesian Technique
Dynamic System Identification using HDMR-Bayesian Technique *Shereena O A 1) and Dr. B N Rao 2) 1), 2) Department of Civil Engineering, IIT Madras, Chennai 600036, Tamil Nadu, India 1) ce14d020@smail.iitm.ac.in
More information8. Kalman Filtering. ATM 552 Notes Kalman Filtering: Chapter 8 page 231
AM 552 Notes Kalman Filtering: Chapter 8 page 231 8. Kalman Filtering In this section we will discuss the technique called Kalman filtering. Kalman filtering is a technique for estimating the state of
More informationOn Sampling Errors in Empirical Orthogonal Functions
3704 J O U R N A L O F C L I M A T E VOLUME 18 On Sampling Errors in Empirical Orthogonal Functions ROBERTA QUADRELLI, CHRISTOPHER S. BRETHERTON, AND JOHN M. WALLACE University of Washington, Seattle,
More informationPrinciples of the Global Positioning System Lecture 11
12.540 Principles of the Global Positioning System Lecture 11 Prof. Thomas Herring http://geoweb.mit.edu/~tah/12.540 Statistical approach to estimation Summary Look at estimation from statistical point
More informationChapter 6. Random Processes
Chapter 6 Random Processes Random Process A random process is a time-varying function that assigns the outcome of a random experiment to each time instant: X(t). For a fixed (sample path): a random process
More informationOPTIMAL ESTIMATION of DYNAMIC SYSTEMS
CHAPMAN & HALL/CRC APPLIED MATHEMATICS -. AND NONLINEAR SCIENCE SERIES OPTIMAL ESTIMATION of DYNAMIC SYSTEMS John L Crassidis and John L. Junkins CHAPMAN & HALL/CRC A CRC Press Company Boca Raton London
More informationLecture Notes 1: Vector spaces
Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector
More informationInhomogeneous Background Error Modeling and Estimation over Antarctica with WRF-Var/AMPS
Inhomogeneous Background Error Modeling and Estimation over Antarctica with WRF-Var/AMPS Yann MICHEL 1 Météo-France, CNRM/GMAP 2 NCAR, MMM/DAG 10 th Annual WRF Users Workshop 23 th June 2009 Yann MICHEL
More informationComputational Data Analysis!
12.714 Computational Data Analysis! Alan Chave (alan@whoi.edu)! Thomas Herring (tah@mit.edu),! http://geoweb.mit.edu/~tah/12.714! Introduction to Spectral Analysis! Topics Today! Aspects of Time series
More informationDemonstration and Comparison of of Sequential Approaches for Altimeter Data Assimilation in in HYCOM
Demonstration and Comparison of of Sequential Approaches for Altimeter Data Assimilation in in HYCOM A. Srinivasan, E. P. Chassignet, O. M. Smedstad, C. Thacker, L. Bertino, P. Brasseur, T. M. Chin,, F.
More informationFig.3.1 Dispersion of an isolated source at 45N using propagating zonal harmonics. The wave speeds are derived from a multiyear 500 mb height daily
Fig.3.1 Dispersion of an isolated source at 45N using propagating zonal harmonics. The wave speeds are derived from a multiyear 500 mb height daily data set in January. The four panels show the result
More informationIntelligent Embedded Systems Uncertainty, Information and Learning Mechanisms (Part 1)
Advanced Research Intelligent Embedded Systems Uncertainty, Information and Learning Mechanisms (Part 1) Intelligence for Embedded Systems Ph. D. and Master Course Manuel Roveri Politecnico di Milano,
More information7.6 The Inverse of a Square Matrix
7.6 The Inverse of a Square Matrix Copyright Cengage Learning. All rights reserved. What You Should Learn Verify that two matrices are inverses of each other. Use Gauss-Jordan elimination to find inverses
More informationStatistical Methods in Particle Physics
Statistical Methods in Particle Physics Lecture 10 December 17, 01 Silvia Masciocchi, GSI Darmstadt Winter Semester 01 / 13 Method of least squares The method of least squares is a standard approach to
More informationRegression. Oscar García
Regression Oscar García Regression methods are fundamental in Forest Mensuration For a more concise and general presentation, we shall first review some matrix concepts 1 Matrices An order n m matrix is
More informationComparison of of Assimilation Schemes for HYCOM
Comparison of of Assimilation Schemes for HYCOM Ashwanth Srinivasan, C. Thacker, Z. Garraffo, E. P. Chassignet, O. M. Smedstad, J. Cummings, F. Counillon, L. Bertino, T. M. Chin, P. Brasseur and C. Lozano
More information(Extended) Kalman Filter
(Extended) Kalman Filter Brian Hunt 7 June 2013 Goals of Data Assimilation (DA) Estimate the state of a system based on both current and all past observations of the system, using a model for the system
More informationLecture 3: Statistical sampling uncertainty
Lecture 3: Statistical sampling uncertainty c Christopher S. Bretherton Winter 2015 3.1 Central limit theorem (CLT) Let X 1,..., X N be a sequence of N independent identically-distributed (IID) random
More informationImpact of Argo, SST, and altimeter data on an eddy-resolving ocean reanalysis
Click Here for Full Article GEOPHYSICAL RESEARCH LETTERS, VOL. 34, L19601, doi:10.1029/2007gl031549, 2007 Impact of Argo, SST, and altimeter data on an eddy-resolving ocean reanalysis Peter R. Oke 1 and
More informationThe Kalman Filter ImPr Talk
The Kalman Filter ImPr Talk Ged Ridgway Centre for Medical Image Computing November, 2006 Outline What is the Kalman Filter? State Space Models Kalman Filter Overview Bayesian Updating of Estimates Kalman
More informationKalman Filter. Predict: Update: x k k 1 = F k x k 1 k 1 + B k u k P k k 1 = F k P k 1 k 1 F T k + Q
Kalman Filter Kalman Filter Predict: x k k 1 = F k x k 1 k 1 + B k u k P k k 1 = F k P k 1 k 1 F T k + Q Update: K = P k k 1 Hk T (H k P k k 1 Hk T + R) 1 x k k = x k k 1 + K(z k H k x k k 1 ) P k k =(I
More informationTAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω
ECO 513 Spring 2015 TAKEHOME FINAL EXAM (1) Suppose the univariate stochastic process y is ARMA(2,2) of the following form: y t = 1.6974y t 1.9604y t 2 + ε t 1.6628ε t 1 +.9216ε t 2, (1) where ε is i.i.d.
More informationLeast Squares Regression
E0 70 Machine Learning Lecture 4 Jan 7, 03) Least Squares Regression Lecturer: Shivani Agarwal Disclaimer: These notes are a brief summary of the topics covered in the lecture. They are not a substitute
More informationSpatial Statistics with Image Analysis. Outline. A Statistical Approach. Johan Lindström 1. Lund October 6, 2016
Spatial Statistics Spatial Examples More Spatial Statistics with Image Analysis Johan Lindström 1 1 Mathematical Statistics Centre for Mathematical Sciences Lund University Lund October 6, 2016 Johan Lindström
More informationQuantitative Analysis of Financial Markets. Summary of Part II. Key Concepts & Formulas. Christopher Ting. November 11, 2017
Summary of Part II Key Concepts & Formulas Christopher Ting November 11, 2017 christopherting@smu.edu.sg http://www.mysmu.edu/faculty/christophert/ Christopher Ting 1 of 16 Why Regression Analysis? Understand
More informationData assimilation concepts and methods March 1999
Data assimilation concepts and methods March 1999 By F. Bouttier and P. Courtier Abstract These training course lecture notes are an advanced and comprehensive presentation of most data assimilation methods
More informationBetter Simulation Metamodeling: The Why, What and How of Stochastic Kriging
Better Simulation Metamodeling: The Why, What and How of Stochastic Kriging Jeremy Staum Collaborators: Bruce Ankenman, Barry Nelson Evren Baysal, Ming Liu, Wei Xie supported by the NSF under Grant No.
More informationEnsemble forecasting and flow-dependent estimates of initial uncertainty. Martin Leutbecher
Ensemble forecasting and flow-dependent estimates of initial uncertainty Martin Leutbecher acknowledgements: Roberto Buizza, Lars Isaksen Flow-dependent aspects of data assimilation, ECMWF 11 13 June 2007
More informationPROJECTION METHODS FOR DYNAMIC MODELS
PROJECTION METHODS FOR DYNAMIC MODELS Kenneth L. Judd Hoover Institution and NBER June 28, 2006 Functional Problems Many problems involve solving for some unknown function Dynamic programming Consumption
More informationAssimilation of SWOT simulated observations in a regional ocean model: preliminary experiments
Assimilation of SWOT simulated observations in a regional ocean model: preliminary experiments Benkiran M., Rémy E., Le Traon P.Y., Greiner E., Lellouche J.-M., Testut C.E., and the Mercator Ocean team.
More informationData Assimilation. Matylda Jab lońska. University of Dar es Salaam, June Laboratory of Applied Mathematics Lappeenranta University of Technology
Laboratory of Applied Mathematics Lappeenranta University of Technology University of Dar es Salaam, June 2013 Overview 1 Empirical modelling 2 Overview Empirical modelling Experimental plans for various
More informationRegression Models - Introduction
Regression Models - Introduction In regression models, two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent variable,
More informationNew Fast Kalman filter method
New Fast Kalman filter method Hojat Ghorbanidehno, Hee Sun Lee 1. Introduction Data assimilation methods combine dynamical models of a system with typically noisy observations to obtain estimates of the
More informationLeast Squares Regression
CIS 50: Machine Learning Spring 08: Lecture 4 Least Squares Regression Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may not cover all the
More informationcovariance function, 174 probability structure of; Yule-Walker equations, 174 Moving average process, fluctuations, 5-6, 175 probability structure of
Index* The Statistical Analysis of Time Series by T. W. Anderson Copyright 1971 John Wiley & Sons, Inc. Aliasing, 387-388 Autoregressive {continued) Amplitude, 4, 94 case of first-order, 174 Associated
More informationSTAT 100C: Linear models
STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix
More information