Using Quasi-Newton Methods to Find Optimal Solutions to Problematic Kriging Systems

Steven Lyster
Centre for Computational Geostatistics
Department of Civil & Environmental Engineering
University of Alberta

Solving a kriging system of equations to determine optimal linear estimate weights is straightforward if the left-hand side covariance matrix is strictly positive definite. If the covariance matrix is singular or near-singular, problems are encountered when trying to solve the system directly. Quasi-Newton methods use a recursive algorithm to converge to a local minimum of a given function. In the case of a quadratic function with known first and second derivatives and a positive semidefinite Hessian matrix, quasi-Newton methods are quite efficient and will always converge to a globally optimal solution. Use of these methods could eliminate some types of errors encountered when kriging or simulating a field, particularly when assigning conditioning data to grid nodes.

Introduction

Minimizing the variance of a linear estimate is the basis for most geostatistical estimation. The standard procedure is to express the variance as a function of the estimation weights, take the derivative with respect to the weights, and set the derivative equal to zero. If the second derivative is greater than zero there is a single optimal set of weights at which the variance is a minimum; this is the case if the left-hand side covariance matrix in the kriging system of equations is positive definite (Journel and Huijbregts, 1978).

Solving the kriging equations typically involves a technique such as matrix inversion or Gaussian elimination. These methods work well for most kriging problems. However, if the kriging system is ill-behaved, problems arise when using conventional analytical methods to solve directly for the optimal estimation weights.

Quadratic Form of Estimation Variance

The variance of a linear estimate may be expressed as a quadratic function of the weights:

\sigma_E^2 = \tfrac{1}{2} \lambda^T Q \lambda - b^T \lambda + \sigma_Z^2     (1)

where

Q_{ij} = 2 E\{ Z(u_i) Z(u_j) \},  b_i = 2 E\{ Z(u_i) Z(u) \}

and \lambda is the vector of estimation weights \lambda_j.

If the Hessian matrix of the quadratic function, Q, is positive definite then the variance function has a single strict local minimizer. In this case it is possible to find the optimal estimation weights \lambda^* by solving the kriging system of equations Q\lambda^* = b. In cases where Q is not positive definite, most commonly singular or near-singular (i.e., positive semidefinite), the kriging equations cannot be solved analytically through most methods. Inverting the matrix Q or using a technique such as Gaussian elimination is not possible for singular matrices. There are ways of fixing these problems (Reference??), such as removing one or more data that are causing the singularity, averaging data that occur at the same location, or moving one or more data. These are all ad-hoc solutions; the use of a different optimization algorithm is proposed here to find the optimal weights directly rather than modifying the problem to fit the optimization method.
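To make the quadratic form concrete, the sketch below assembles Q and b from a generic covariance function and evaluates Equation (1) for a trial weight vector. It is a minimal illustration under assumed inputs; the covariance model, data coordinates, and estimation location are hypothetical placeholders rather than the configuration used later in the Example section.

```python
import numpy as np

def build_quadratic(coords, u0, cov, sigma2):
    """Assemble Q and b of Equation (1) for data at coords and estimation
    location u0, given an isotropic covariance function cov(h)."""
    h = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    Q = 2.0 * cov(h)                                    # Q_ij = 2 E{Z(u_i) Z(u_j)}
    b = 2.0 * cov(np.linalg.norm(coords - u0, axis=1))  # b_i  = 2 E{Z(u_i) Z(u)}

    def variance(lam):
        # sigma_E^2 = 0.5 lam'Q lam - b'lam + sigma_Z^2
        return 0.5 * lam @ Q @ lam - b @ lam + sigma2

    return Q, b, variance

# Hypothetical configuration: three data, exponential covariance with unit sill.
cov = lambda h: np.exp(-3.0 * h / 100.0)
coords = np.array([[10.0, 20.0], [40.0, 60.0], [80.0, 30.0]])
Q, b, variance = build_quadratic(coords, np.array([35.0, 35.0]), cov, sigma2=1.0)
print(variance(np.zeros(len(coords))))  # all-zero weights give sigma_Z^2
```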

Quasi-Newton Methods

The standard form of Newton's method recursively uses a quadratic approximation of the objective function to find a minimum (Chong and Zak, 2001). Because the variance is already a quadratic function with a defined Hessian matrix, the standard Newton's algorithm will solve the optimization problem in a single step from any initial point. For a quadratic function, Newton's method takes the form:

\lambda^{(k+1)} = \lambda^{(k)} - F(\lambda^{(k)})^{-1} g^{(k)}

where F(\lambda^{(k)}) is the Hessian of the objective at \lambda^{(k)} and g^{(k)} is the first derivative (gradient) of the objective at \lambda^{(k)}. This then leads to:

\lambda^{(1)} = \lambda^{(0)} - Q^{-1} \left( Q \lambda^{(0)} - b \right) = Q^{-1} b     (2)

Equation 2 is the same as the solution to the kriging system of equations, and requires the matrix Q to be invertible. In the case of a singular matrix this is not true, and so the same problem arises as before.

Quasi-Newton methods are a family of optimization algorithms that use the same basic principle as Newton's method, but utilize an approximation of the inverse Hessian matrix and therefore do not require matrix inversion or solving systems of equations. The procedure for all of these methods is the same (a code sketch follows the list):

1. Select an initial point for the algorithm, \lambda^{(0)}, and an initial approximation of the inverse Hessian matrix, H_0.
2. Find the direction of search, d^{(k)} = -H_k g^{(k)}.
3. Perform a line search to get the minimum objective value in the search direction; for a quadratic function, this simplifies to \alpha_k = - (g^{(k)T} d^{(k)}) / (d^{(k)T} Q d^{(k)}).
4. Get the next estimate of the optimal point, \lambda^{(k+1)} = \lambda^{(k)} + \alpha_k d^{(k)}.
5. Compute the next approximation of the inverse Hessian, H_{k+1}, using \Delta\lambda^{(k)} = \lambda^{(k+1)} - \lambda^{(k)} and \Delta g^{(k)} = g^{(k+1)} - g^{(k)}; H_{k+1} is then found by an equation that depends on the quasi-Newton method used.
6. Set k = k + 1 and go back to step 2. Repeat until some stopping criterion is met.
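The loop below is a minimal sketch of steps 1 to 6 for the quadratic objective of Equation (1), assuming Q and b are already assembled and leaving the inverse-Hessian update as a pluggable function (one of Equations 3 to 5 below). The function name and interface are illustrative, not taken from the paper.

```python
import numpy as np

def quasi_newton_quadratic(Q, b, update, n_iter=None, tol=1e-10):
    """Minimize 0.5 lam'Q lam - b'lam + const by a quasi-Newton method.
    update(H, dlam, dg) returns the next inverse-Hessian approximation."""
    n = len(b)
    lam = np.zeros(n)                       # step 1: initial point (zero vector)
    H = np.eye(n)                           #         initial inverse-Hessian guess
    g = Q @ lam - b                         # gradient of the quadratic objective
    for _ in range(n_iter or n):
        d = -H @ g                          # step 2: search direction
        dQd = d @ Q @ d
        if dQd <= tol:                      # flat direction: nothing left to gain
            break
        alpha = -(g @ d) / dQd              # step 3: exact line search for a quadratic
        lam_new = lam + alpha * d           # step 4: next estimate of the optimum
        g_new = Q @ lam_new - b
        H = update(H, lam_new - lam, g_new - g)  # step 5: inverse-Hessian update
        lam, g = lam_new, g_new             # step 6: iterate
        if np.linalg.norm(g) < tol:         # stopping criterion on the gradient
            break
    return lam
```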

The initial guess for \lambda^{(0)} can theoretically be set to anything and the algorithm will still converge to a minimum point. As long as Q is positive semidefinite a minimum point must exist and the algorithm will converge. The simplest possible selection for \lambda^{(0)} is a zero vector, containing all zeros. Because the optimal linear estimate weights are always quite small numbers (typically less than one), a zero vector should be a close enough approximation for the algorithm to converge in a reasonable amount of time.

For the initial approximation of the inverse Hessian, H_0, the only requirements are that it be real, positive definite, and symmetric. Within these requirements any matrix may be used and still result in convergence. The easiest matrix to use is the identity matrix; this satisfies all of the conditions and simplifies the math as well.

The intermediate approximations of the inverse Hessian, H_k, are found using equations that depend on the particular quasi-Newton method being employed. The three methods discussed here are the rank one correction algorithm, the DFP algorithm, and the BFGS algorithm (Chong and Zak, 2001). The equations for H_{k+1} for each method are:

Rank one:

H_{k+1} = H_k + \frac{(\Delta\lambda^{(k)} - H_k \Delta g^{(k)})(\Delta\lambda^{(k)} - H_k \Delta g^{(k)})^T}{\Delta g^{(k)T} (\Delta\lambda^{(k)} - H_k \Delta g^{(k)})}     (3)

DFP:

H_{k+1} = H_k + \frac{\Delta\lambda^{(k)} \Delta\lambda^{(k)T}}{\Delta\lambda^{(k)T} \Delta g^{(k)}} - \frac{(H_k \Delta g^{(k)})(H_k \Delta g^{(k)})^T}{\Delta g^{(k)T} H_k \Delta g^{(k)}}     (4)

BFGS:

H_{k+1} = H_k + \left( 1 + \frac{\Delta g^{(k)T} H_k \Delta g^{(k)}}{\Delta g^{(k)T} \Delta\lambda^{(k)}} \right) \frac{\Delta\lambda^{(k)} \Delta\lambda^{(k)T}}{\Delta\lambda^{(k)T} \Delta g^{(k)}} - \frac{H_k \Delta g^{(k)} \Delta\lambda^{(k)T} + \left( H_k \Delta g^{(k)} \Delta\lambda^{(k)T} \right)^T}{\Delta g^{(k)T} \Delta\lambda^{(k)}}     (5)
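A direct transcription of Equations (3) to (5) into code, using the \Delta\lambda^{(k)} and \Delta g^{(k)} vectors from step 5; these helpers plug into a driver such as the loop sketched above. The small epsilon guard against near-zero denominators is an added safeguard for the situations discussed below, not part of the published equations.

```python
import numpy as np

EPS = 1e-12  # guard against the near-zero denominators discussed in the text

def rank_one_update(H, dlam, dg):
    """Equation (3): symmetric rank one correction."""
    v = dlam - H @ dg
    denom = dg @ v
    return H if abs(denom) < EPS else H + np.outer(v, v) / denom

def dfp_update(H, dlam, dg):
    """Equation (4): Davidon-Fletcher-Powell update."""
    Hdg = H @ dg
    if abs(dlam @ dg) < EPS or abs(dg @ Hdg) < EPS:
        return H
    return H + np.outer(dlam, dlam) / (dlam @ dg) - np.outer(Hdg, Hdg) / (dg @ Hdg)

def bfgs_update(H, dlam, dg):
    """Equation (5): Broyden-Fletcher-Goldfarb-Shanno update."""
    Hdg = H @ dg
    dgdlam = dg @ dlam
    if abs(dgdlam) < EPS:
        return H
    A = np.outer(Hdg, dlam)
    return (H
            + (1.0 + (dg @ Hdg) / dgdlam) * np.outer(dlam, dlam) / dgdlam
            - (A + A.T) / dgdlam)
```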

The difference between the algorithms is the level of sophistication in the update equations for the inverse Hessian. The rank one correction algorithm is the simplest, and finds a symmetric update approximation. For quadratic objectives with a positive definite Q each iterative approximation will also be positive definite. Computational problems may be encountered in some cases, as the denominator in Equation 3 may approach zero (Chong and Zak, 2001).

The DFP algorithm uses a more robust approach to finding the intermediate matrix approximations. Using this method each matrix is guaranteed to be positive definite. There are no computational divide-by-zero errors when dealing with the DFP algorithm, provided the procedure is stopped once an optimal value has been reached and the delta terms in the denominators begin to approach zero. The main drawback of the DFP algorithm is that it requires accurate line searches to find the search distance for each estimate; this is not a problem here, as an exact line search of a quadratic function may be performed using the equation in step 3.

The BFGS algorithm, though complex in appearance, may often be more efficient than the DFP algorithm (Chong and Zak, 2001). The intermediate matrices are guaranteed to be positive definite using BFGS, and there should be no computational problems encountered. The most obvious difference between the DFP and BFGS algorithms is that BFGS does not require very accurate line searches; this is not an issue in this case.

Which quasi-Newton method to employ is up to the user; given the divide-by-zero potential of the rank one algorithm it may be undesirable. The BFGS algorithm may be too robust, and not efficient enough, for a problem as simple as minimizing an estimation variance.

Example

To demonstrate the use of a quasi-Newton method for solving an unstable kriging system, a simple synthetic problem was created. Figure 1 shows an arrangement of nine data in a 100 m by 100 m square field. A grid with 10 m square blocks was set up for the estimation; the grid is shown in Figure 1. For the purpose of estimation, a linear variogram model with a range of 100 m and a nugget effect of 0.1 was used. The point being estimated was arbitrarily set as (35, 35).

Figure 1: Arrangement of data for the example.
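The script below sketches the spirit of this comparison, reusing the helper functions from the earlier sketches: it duplicates one datum so that Q becomes singular (standing in for two data assigned to the same grid node), confirms that a direct solve fails, and runs the quasi-Newton loop with the DFP update instead. The data coordinates and the exponential covariance are illustrative stand-ins, since Figure 1 and the paper's linear variogram model are not reproduced here.

```python
import numpy as np

# Hypothetical configuration: eight distinct data plus one exact duplicate,
# standing in for the collocated conditioning data of Case 2.
coords = np.array([[10.0, 10.0], [10.0, 80.0], [30.0, 50.0], [50.0, 20.0],
                   [50.0, 50.0], [50.0, 50.0], [70.0, 70.0], [90.0, 30.0],
                   [90.0, 90.0]])
u0 = np.array([35.0, 35.0])
cov = lambda h: np.exp(-3.0 * h / 100.0)   # substitute covariance model

Q, b, variance = build_quadratic(coords, u0, cov, sigma2=1.0)

try:
    lam_exact = np.linalg.solve(Q, b)      # fails: Q has two identical rows
except np.linalg.LinAlgError:
    lam_exact = None

lam_dfp = quasi_newton_quadratic(Q, b, dfp_update)  # DFP handles the singular case
print(lam_dfp[4], lam_dfp[5])              # the collocated data receive equal weights
print(variance(lam_dfp))                   # minimized estimation variance
```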

Two cases were considered: first, solving the kriging system of equations for the grid node at (35, 35) using both matrix inversion and the DFP algorithm; then, the data were assigned to the closest grid nodes and the system was solved again. It may be easily seen from Figure 1 that two of the data were assigned to the same grid node; this created a singular matrix that could not be inverted. Because the data were only moved slightly, the results for the two cases were very similar; this allowed a comparison of the exact solution from the first case to the quasi-Newton solution from the second case. Table 1 shows the results of solving for the minimum estimation variance in both cases, with both the exact and DFP solutions for case one.

Value       Case 1 (Exact)   Case 1 (DFP)   Case 2 (DFP)
λ1           0.2726           0.2726          0.2258
λ2           0.0802           0.0802          0.0616
λ3           0.1802           0.1802          0.2856
λ4           0.1972           0.1972          0.2292
λ5           0.1579           0.1579          0.0953
λ6           0.1261           0.1261          0.0953
λ7           0.0564           0.0564          0.0624
λ8          -0.0116          -0.0116         -0.0055
λ9          -0.024           -0.024          -0.012
Variance     0.2865           0.2865          0.2905

Table 1: Solutions found for minimizing the estimation variance in the example.

For case one, the DFP algorithm produced exactly the same results as solving the kriging system of equations through matrix inversion. This shows that quasi-Newton methods are a valid way of solving the system. When the data were relocated to the grid node locations and the Hessian matrix became singular, the exact solution could no longer be found. The DFP algorithm in this case returned results quite similar to the first case, with the optimal variance being very close to the same value. The variance should be close to the same for both cases, as moving the data a few meters should not completely change the kriging matrices; that this is indeed the result found shows that even when the covariance matrix is singular a quasi-Newton method will return valid results.

Another notable result is that the weights given in case 2 to the collocated conditioning data are identical. This is probably the most ideal way to handle the problem of multiple solutions; with a singular covariance matrix there are many possible solutions. Each of the solutions must have the same sum of weights for the collocated data; beyond this constraint, any combination of weights is possible without affecting the optimality of the estimation variance. That the quasi-Newton solution resulted in the collocated data receiving equal weighting with no explicit control suggests the method will always return this result; this is as unbiased as is possible in a case with many solutions.
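As a quick check of the claim that only the sum of the collocated weights matters, the snippet below (continuing the hypothetical setup of the previous sketch) shifts weight from one duplicate datum to the other and confirms that the objective value is unchanged; it is illustrative only and does not reproduce Table 1.

```python
import numpy as np

# Continuing from the previous sketch: lam_dfp, variance, and the duplicate
# data at indices 4 and 5.
shift = np.zeros_like(lam_dfp)
shift[4], shift[5] = +0.05, -0.05   # move weight between the collocated data

print(variance(lam_dfp))            # optimal estimation variance
print(variance(lam_dfp + shift))    # unchanged: only the sum of the two weights matters
```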

Nine iterations were performed for the DFP algorithm; with nine data, this is the maximum number that should be required to converge to a solution. Table 2 shows the progress of the algorithm for each iteration. From this information it may be seen that the algorithm actually converged to within four decimal places by iteration five in both cases. It is clear that performing the maximum number of iterations is not necessary to find a solution.

Iteration   Variance (Case 1)   Variance (Case 2)
0           1.0000              1.0000
1           0.3325              0.3409
2           0.2892              0.2953
3           0.2871              0.2919
4           0.2866              0.2906
5           0.2865              0.2905
6           0.2865              0.2905
7           0.2865              0.2905
8           0.2865              0.2905
9           0.2865              0.2905

Table 2: Progression of the DFP algorithm in the example through nine iterations when minimizing the estimation variance.

Conclusions

Quasi-Newton methods may be used to find optimal solutions to the minimization of estimation variances without directly solving the kriging system of equations. Optimal solutions are found even when the covariance matrix is not strictly positive definite and analytical solutions are not directly attainable. The stability and efficiency of quasi-Newton methods for very large problems has not been proven; if these techniques are employed to avoid issues with singular matrices and problematic kriging systems, extra attention must be paid to ensure the results are reasonable.

References

E.K.P. Chong and S.H. Zak. An Introduction to Optimization. Second Edition. John Wiley and Sons, Inc., New York, 2001.

A.G. Journel and Ch.J. Huijbregts. Mining Geostatistics. The Blackburn Press, Caldwell, New Jersey, 2003; original printing 1978.