
Constrained data assimilation. W. Carlisle Thacker, Atlantic Oceanographic and Meteorological Laboratory, Miami, Florida 33149, USA

Plan. Range constraints: HYCOM layers have a minimum thickness. Optimal interpolation: no inequality constraints; normally distributed errors. Graphical interpretation: linear regression; an active constraint acts like additional data. Algorithms: with little knowledge of the error statistics, how precise should the solution be? Simple examples: 4 variables, 1 new observation.

Variational formalism. The objective is to reach a compromise x between a background estimate b of the model state and additional information provided by observations d, based on their respective error covariances B and D. The minimum of

J(x) = (1/2)(x - b)^T B^{-1} (x - b) + (1/2)(Hx - d)^T D^{-1} (Hx - d)

defines this compromise:

x - b = B H^T (H B H^T + D)^{-1} (d - H b),

where H selects the model counterparts of d.
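The analysis equation above is a single linear solve. A minimal NumPy sketch, assuming dense matrices; the function name unconstrained_analysis is an illustrative choice, not from the talk:

```python
import numpy as np

def unconstrained_analysis(b, B, H, d, D):
    """Minimum of J(x): x = b + B H^T (H B H^T + D)^{-1} (d - H b)."""
    innovation = d - H @ b               # observation-minus-background misfit
    S = H @ B @ H.T + D                  # innovation error covariance
    return b + B @ H.T @ np.linalg.solve(S, innovation)
```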

Posterior covariance matrix. After assimilation, just as the background state b is updated to get the analysis x, the background error-covariance matrix B can be updated to get an analysis error-covariance matrix:

A = B - B H^T (H B H^T + D)^{-1} H B.
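A companion sketch for the covariance update, again assuming dense matrices and an illustrative function name:

```python
import numpy as np

def posterior_covariance(B, H, D):
    """A = B - B H^T (H B H^T + D)^{-1} H B."""
    S = H @ B @ H.T + D                  # innovation error covariance
    return B - B @ H.T @ np.linalg.solve(S, H @ B)
```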

Example: 2 variables and 1 measurement.

x = (x_1, x_2)^T,  b = (b_1, b_2)^T,  B = [[σ_1^2, σ_12^2], [σ_12^2, σ_2^2]],  d = d_1,  D = σ_d^2.

After assimilation, x_1 is an information-weighted average of the prior estimate b_1 and the new data d_1:

x_1 = (b_1/σ_1^2 + d_1/σ_d^2) / (1/σ_1^2 + 1/σ_d^2),

and the correction to x_2 is a regression estimate based on the correction to x_1:

x_2 - b_2 = (σ_12^2/σ_1^2)(x_1 - b_1).
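A small numerical check that the scalar formulas reproduce the matrix-form analysis; the numbers are invented for illustration and are not from the talk:

```python
import numpy as np

# Invented example: prior (b_1, b_2), covariances, and one observation of x_1.
b = np.array([2.0, 1.5])
B = np.array([[1.0, 0.6],
              [0.6, 1.0]])              # sigma_1^2, sigma_12^2, sigma_2^2
d = np.array([1.2])
D = np.array([[0.25]])                  # sigma_d^2
H = np.array([[1.0, 0.0]])              # only the first variable is observed

# Matrix form of the analysis
x = b + B @ H.T @ np.linalg.solve(H @ B @ H.T + D, d - H @ b)

# Scalar forms from the slide
x1 = (b[0] / B[0, 0] + d[0] / D[0, 0]) / (1.0 / B[0, 0] + 1.0 / D[0, 0])
x2 = b[1] + (B[0, 1] / B[0, 0]) * (x1 - b[0])

print(x, [x1, x2])                      # both routes give [1.36, 1.116]
```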

Simple example: 2 variables, 1 observed. [Figure: contours of the background probability density in the plane of layer thickness for cell 1 (horizontal axis) and cell 2 (vertical axis), showing the regression line and the corrected cell 1 thickness.]

Constraint: layer thickness ≥ 1. [Figure: same layer-thickness plane, with the regression line and the cases of increased and decreased cell 1 thickness marked.] Cell 2 initially has the minimum layer thickness. If the measurement increases the thickness for cell 1, there is no problem. But what if the measurement decreases the thickness for cell 1? The optimum would then lie on the constraint boundary.

Possible prescription. Simply correct by inflating to the minimum thickness. But what if there are more than two variables? Should all layers that are too thin be inflated to the minimum thickness? Should the values of any of the other variables be adjusted? Should some constraint-violating thicknesses be inflated beyond the minimum?

Next example: 3 variables, 1 observed. [Figure: background probability density in the plane of the cell 1 correction, with layer thickness for cell 2 (horizontal axis) and cell 3 (vertical axis).] Observing cell 1 produces a constraint violation for cell 3. The constraint line provides the corrected thickness for cell 3, and enforcing the constraint for cell 3 requires a correction for cell 2. This is another regression problem, which suggests a second pass with the constraints treated as data.

Another case: 3 variables, 1 observed. [Figure: background probability density in the plane of the cell 1 correction, with layer thickness for cell 2 (horizontal axis) and cell 3 (vertical axis).] Thicknesses for both cells 2 and 3 are too small after correcting cell 1. Inflating both to the minimal thickness is not optimal: when cell 2 is at the minimum thickness, the optimum for cell 3 is somewhat larger than the minimum.

Two expensive approaches. Quadratic programming considers all possibilities for enforcing range constraints. Markov chain Monte Carlo handles the non-Gaussian integrals. Any inexpensive approaches? Error covariances are poorly known, so it doesn't pay to be too picky about finding a precise solution to the resulting math problem. Maybe enforcing all violated constraints and accounting for the impact on the other variables is good enough.

If we know which constraints to enforce Inequality constraints that are not violated can be ignored. Those that are violated can be treated like equality constraints and enforced with Lagrange multipliers. A practical strategy is to assume that all violations should be enforced.

Variational formalism revisited. Using Lagrange multipliers y, the constrained minimum of J corresponds to a stationary point of

L(x, y) = J(x_0) + (1/2)(x - x_0)^T A^{-1} (x - x_0) - (Px - c)^T y,

where x_0 is the unconstrained minimum of J(x), A is the posterior error-covariance matrix, and the enforced constraints are Px = c. Stationarity requires that x - x_0 = A P^T y, so the Lagrange multipliers are given by y = (P A P^T)^{-1} (c - P x_0). Thus, the constrained solution is

x - x_0 = A P^T (P A P^T)^{-1} (c - P x_0).

Compare with the unconstrained problem: b ↔ x_0, B ↔ A, H ↔ P, d ↔ c, and D ↔ 0.
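The constrained correction is another linear solve. A minimal sketch, assuming the rows of P and the entries of c hold only the constraints chosen for enforcement; the function name is illustrative:

```python
import numpy as np

def enforce_constraints(x0, A, P, c):
    """Constrained solution x = x0 + A P^T (P A P^T)^{-1} (c - P x0)."""
    y = np.linalg.solve(P @ A @ P.T, c - P @ x0)   # Lagrange multipliers
    return x0 + A @ P.T @ y
```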

Algorithm 1. Correct constraint violations using x - x_0 = A P^T (P A P^T)^{-1} (c - P x_0). The most difficult part is computing the posterior error-covariance matrix A = B - B H^T (H B H^T + D)^{-1} H B. If D is diagonal, then A can be computed sequentially, one observation at a time. Approximate algorithm: following the lead of optimal interpolation, where covariances are not updated after assimilation, use B rather than A.
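A sketch of the sequential route to x and A mentioned above, assuming a diagonal D whose diagonal is passed as D_diag; this is the standard one-observation-at-a-time update, not code from the talk:

```python
import numpy as np

def assimilate_sequentially(b, B, H, d, D_diag):
    """Assimilate observations one at a time (valid when D is diagonal),
    updating both the state and its error covariance after each observation."""
    x, A = b.astype(float), B.astype(float)
    for k in range(len(d)):
        h = H[k]                          # observation-operator row for observation k
        s = h @ A @ h + D_diag[k]         # scalar innovation variance
        g = A @ h / s                     # gain vector
        x = x + g * (d[k] - h @ x)
        A = A - np.outer(g, h @ A)
    return x, A
```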

Algorithm 2. Append the constraints c to d and treat them as error-free data. Append P to H and expand D with zeros. Redo the assimilation. This involves solving a larger (rather than smaller) problem on the second pass, but it is not necessary to know A. It requires only minimal changes to existing data-assimilation codes and gives exactly the same results as Algorithm 1.
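A sketch of the second pass, assuming the violated constraints P x = c have already been identified on the first pass; the function name and the direct dense solve are illustrative choices:

```python
import numpy as np

def second_pass_with_constraints(b, B, H, d, D, P, c):
    """Append the constraints to the observations as error-free data and redo
    the assimilation: H -> [H; P], d -> [d; c], D padded with zeros."""
    m, q = H.shape[0], P.shape[0]
    H2 = np.vstack([H, P])
    d2 = np.concatenate([d, c])
    D2 = np.zeros((m + q, m + q))
    D2[:m, :m] = D                        # zero error assigned to the constraint "data"
    return b + B @ H2.T @ np.linalg.solve(H2 @ B @ H2.T + D2, d2 - H2 @ b)
```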

Four-variable computational example. Exponential correlation with a length of 1 cell width; σ_b^2 = 1 and σ_d^2 = 0.01. [Figure: layer thickness versus cell index, showing the background values and the analyses ignoring constraints, exact, and approximate.] Assimilating the data causes a constraint violation three cells away. Treating the constraint as data fixes the problem, and both algorithms give the same result. Not updating B to A before the second pass works almost as well.
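A sketch of a four-cell setup in the spirit of this example. Only the error statistics (σ_b^2 = 1, σ_d^2 = 0.01, 1-cell exponential correlation) and the minimum thickness of 1 come from the slide; the background and observed thicknesses below are invented so that a single observation of cell 1 produces a violation three cells away:

```python
import numpy as np

n = 4
i = np.arange(n)
B = np.exp(-np.abs(i[:, None] - i[None, :]))        # sigma_b^2 = 1, 1-cell correlation length
D = np.array([[0.01]])                              # sigma_d^2 = 0.01
H = np.array([[1.0, 0.0, 0.0, 0.0]])                # a single observation of cell 1

b = np.array([2.8, 2.3, 1.6, 1.05])                 # assumed background thicknesses
d = np.array([1.4])                                 # assumed observed thickness for cell 1

# Unconstrained analysis and its error covariance
S = H @ B @ H.T + D
x0 = b + B @ H.T @ np.linalg.solve(S, d - H @ b)
A = B - B @ H.T @ np.linalg.solve(S, H @ B)

# Enforce the violated minimum-thickness constraints
violated = x0 < 1.0
P = np.eye(n)[violated]
c = np.ones(int(violated.sum()))

x_exact = x0 + A @ P.T @ np.linalg.solve(P @ A @ P.T, c - P @ x0)   # uses A (Algorithms 1 and 2)
x_approx = x0 + B @ P.T @ np.linalg.solve(P @ B @ P.T, c - P @ x0)  # uses B instead of A

print(x0)                     # with these values, only cell 4 falls below the minimum
print(x_exact, x_approx)
```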

Modified example. Same errors as before, but the observed thickness is smaller. [Figure: layer thickness versus cell index, showing the background values and the analyses ignoring constraints, with one constraint enforced, and with both enforced.] Assimilation causes problems for 2 cells. Correcting only the worst violation fixes the other one, and correcting both gives about the same answer.

Conclusions. Range constraints can be enforced economically. The simplest approach is to enforce the constraints without regard for the implied corrections. The implied corrections can be computed by treating the enforced constraints like error-free data, which should be assimilated on a second pass. Few changes are needed to existing assimilation software. The cost of finding the minimal subset of violations to enforce for an optimal solution is not justified, given the lack of precise knowledge about the errors.