Usin Quasi-Newton Methos to Fin Optimal Solutions to Problematic Kriin Systems Steven Lyster Centre for Computational Geostatistics Department of Civil & Environmental Enineerin University of Alberta Solvin a riin system of equations to etermine optimal linear estimate weihts is straihtforwar if the left-han sie covariance matri is strictly positive efinite. If the covariance matri is sinular or near-sinular, problems are encountere when tryin to irectly solve the system. Quasi-Newton methos use a recursive alorithm to convere to a local minimum of a iven function. In the case of a quaratic function with nown first an secon erivatives an a positive semiefinite essian matri the quasi-newton methos are quite efficient an will always convere to a lobally optimal solution. Use of these methos coul eliminate some types of errors encountere when riin or simulatin a fiel, particularly when assinin conitionin ata to ri noes. Introuction Minimizin the variance of a linear estimate is the basis for most eostatistical estimation. he stanar proceure is to epress the variance as a function of the estimation weihts, then tae the erivative with respect to the weihts an set the erivative equal to zero. If the secon erivative is reater than zero there is a sinle optimal set of weihts at which the variance is a minimum; this is the case if the left han sie covariance matri in the riin system of equations is positive efinite Journel an uijbrets, 1978. Solvin the riin equations typically involves some technique such as matri inversion or Gaussian elimination. hese methos wor well for most riin problems. owever, if the riin system is ill-behave problems arise when usin conventional analytical methos for irectly solvin for optimal estimation weihts. Quaratic Form of Estimation Variance he variance of a linear estimate may be epresse as a quaratic function of the weihts: 2 1 2 σ E Q b σ Z 1 2 Where 121-1
E i j { Z u Z u } Q 2 b 2 E{ Z u Z u } i λ j If the essian matri of the quaratic function, Q, is positive efinite then the variance function has a sinle strict local minimizer. In this case it is possible to fin the optimal estimation weihts * by solvin the riin system of equations Q b. In cases where Q is not positive efinite, most commonly sinular or near sinular ie, positive semiefinite, the riin equations cannot be solve analytically throuh most methos. Invertin the matri Q or usin a technique such as Gaussian elimination is not possible for sinular matrices. here are ways of fiin these problems Reference??, such as removin one or more ata that are causin the sinularity, averain ata that occur at the same location, or movin one or more ata. hese are all a-hoc solutions; the use of a ifferent optimization alorithm is propose here to fin the optimal weihts irectly rather than moifyin the problem to fit the optimization metho. Quasi-Newton Methos he stanar form of Newton s metho recursively uses a quaratic approimation for the objective function to fin a minimum Chon an Za, 2001. Because the variance is a quaratic function alreay with a efine essian matri, the stanar Newton s alorithm will solve the optimization problem in a sinle step from any initial point. For a quaratic function, Newton s metho simplifies as follows: F 1 1 Where: F is the essian of the objective at is the first erivative of the objective at his then leas to: Q Q b Q b 1 1 1 Equation 2 is the same as the solution to the riin system of equations, an requires the matri Q to be invertible. In the case of a sinular matri this is not true, an so the same problem arises as before. Quasi-Newton methos are a family of optimization alorithms that use the same basic principle as Newton s metho, but utilize an approimation for the inverse essian matri an therefore o not require matri inversion or solvin systems of equations. he proceure for all of these methos is the same: 1. Select an initial point for the alorithm, 0, an an initial approimation for the inverse essian matri, 0. 2. Fin the irection of search 2 121-2
121-3 3. Perform a line search to et the minimum objective value in the search irection; for a quaratic function, this simplifies to Q α 4. Get the net estimate for the optimal point, 1 α 5. Compute the net approimation for the inverse essian by usin 1 1 1 is then foun by an equation that epens on the quasi-newton metho use. 6. Set 1 an o bac to step 2. Repeat until some stoppin criterion is met. he initial uess for 0 can theoretically be set to anythin an the alorithm will still convere to a minimum point. As lon as Q is positive semiefinite a minimum point must eist an the alorithm will convere. he simplest possible selection for 0 is a zero vector, containin all zeros. Because the optimal linear estimate weihts are always quite small numbers typically less than one, a zero vector shoul be a close enouh approimation for the alorithm to convere in a reasonable amount of time. For the initial approimation of the inverse essian, 0, the only requirements are that it be real, positive efinite an symmetric. Within these requirements any matri may be use an still result in converence. he easiest matri to use is an ientity matri; this satisfies all of the conitions an simplifies the math as well. he intermeiate approimations for the inverse essian,, are foun by usin equations that epen on the particular quasi-newton metho bein employe. he three metho iscusse here are the ran one equation alorithm, the DFP alorithm, an the BFGS alorithm Chon an Za, 2001. he equations for for each metho are: Ran one: 1 3 DFP: 1 4 BFGS: 1 1 5 he ifference between the alorithms is the level of sophistication in the upate equations for the inverse essian. he ran one equation alorithm is the simplest, an fins a symmetric upate approimation. For quaratic objectives with a positive efinite Q each iterative
approimation will also be positive efinite. Computational problems may be encountere in some cases as the enominator in Equation 3 may approach zero Chon an Za, 2001. he DFP alorithm uses a more robust approach to finin the intermeiate matri approimations. Usin this metho each matri is uarantee to be positive efinite. here are no computational ivie-by-zero errors when ealin with the DFP alorithm, provie the proceure is stoppe once an optimal value has been reache an the elta terms in the enominators bein to approach zero. he main rawbac to the DFP alorithm is that it requires accurate line searches to fin the search istance for each estimate; this is not a problem as an eact line search of a quaratic function may be performe by usin the equation in step 3. he BFGS alorithm, thouh comple in appearance, may often be more efficient than the DFP alorithm Chon an Za, 2001. he intermeiate matrices are uarantee to be positive efinite usin BFGS, an there shoul be no computational problems encountere. he most obvious ifference between the DFP an BFGS alorithms is that BFGS oes not require very accurate line searches; this is not an issue in this case. he quasi-newton methos to employ is up to the user; iven the ivie-by-zero potential of the ran one alorithm it may be unesirable. he BFGS alorithm may be too robust an not efficient enouh for a problem as simple as minimizin an estimation variance. Eample o emonstrate the use of a quasi-newton metho for solvin an unstable riin system, a simple synthetic problem was create. Fiure 1 shows an arranement of nine ata in a 100m by 100m square fiel. A ri with blocs 10m square was set up for the estimation; the ri is shown in Fiure 1. For the purpose of estimation, a linear varioram moel with a rane of 100m an a nuet effect of 0.1 was use. he point bein estimate was arbitrarily set as 35,35. Fiure 1: Arranement of ata for the eample. 121-4
wo cases were consiere: first, solvin the riin system of equations for the ri noe at 35,35 usin both matri inversion an the DFP alorithm; then, the ata were assine to the closest ri noes an the system was solve aain. It may be easily seen from Fiure 1 that two of the ata were assine to the same ri noe; this create a sinular matri that coul not be inverte. Because the ata were only move slihtly the results for the two cases were very similar; this allowe a comparison of the eact solution from the first case to the quasi-newton solution from the secon case. able 1 shows the results of solvin for the minimum estimation variance in both cases, with both the eact an DFP solutions for case one. Case 1 Case 2 Value Eact DFP DFP λ 1 0.2726 0.2726 0.2258 λ 2 0.0802 0.0802 0.0616 λ 3 0.1802 0.1802 0.2856 λ 4 0.1972 0.1972 0.2292 λ 5 0.1579 0.1579 0.0953 λ 6 0.1261 0.1261 0.0953 λ 7 0.0564 0.0564 0.0624 λ 8-0.0116-0.0116-0.0055 λ 9-0.024-0.024-0.012 Variance 0.2865 0.2865 0.2905 able 1: Solutions foun for minimizin the estimation variance in the eample. For case one, the DFP alorithm prouce eactly the same results as solvin the riin system of equations throuh matri inversion. his shows that quasi-newton methos are a vali way of solvin the system. When the ata were relocate to the ri noe locations an the essian matri became sinular, the eact solution coul no loner be foun. he DFP alorithm in this case returne results quite similar to the first case, with the optimal variance bein very close to the same value. he variance shoul be close to the same for both cases as movin the ata a few meters shoul not completely chane the riin matrices; that this is inee the result foun shows that even when the covariance matri is sinular a quasi-newton metho will return vali results. Another notable result is that the weihts iven in case 2 to the collocate conitionin ata are ientical. his is probably the most ieal way to hanle the problem of multiple solutions; with a sinular covariance matri there are many possible solutions. Each of the solutions must have the same sum of weihts for the collocate ata; beyon this constraint, any combination of weihts is possible without affectin the optimality of the estimation variance. hat the quasi-newton solution resulte in the collocate ata receivin equal weihtin with no eplicit control suests the metho will always return this result; this is as unbiase as is possible in a case with many solutions. Nine iterations were performe for the DFP alorithm; with nine ata, this is the maimum number that shoul be require to convere to a solution. able 2 shows the proress of the alorithm for each iteration. From this information it may be seen that the alorithm actually 121-5
convere to within four ecimal places at iteration five in both cases. It is clear that performin the maimum number of iterations is not necessary to fin a solution. Iteration Variance Case 1 Case 2 0 1.0000 1.0000 1 0.3325 0.3409 2 0.2892 0.2953 3 0.2871 0.2919 4 0.2866 0.2906 5 0.2865 0.2905 6 0.2865 0.2905 7 0.2865 0.2905 8 0.2865 0.2905 9 0.2865 0.2905 able 2: Proression of the DFP alorithm in the eample throuh nine iterations when minimizin the estimation variance. Conclusions Quasi-Newton methos may be use to fin optimal solutions to the minimization of estimation variances without irectly solvin the riin system of equations. Optimal solutions are foun even when the covariance matri is not strictly positive efinite an analytical solutions are not attainable irectly. he stability an efficiency of quasi-newton methos for very lare problems has not been prove; if these techniques are employe to avoi issues with sinular matrices an problematic riin systems etra attention must be pai to ensure the results are reasonable. References E.K.P. Chon an S.. Za. An Introuction to Optimization. Secon Eition. John Wiley an Sons, Inc, New Yor, 2001. A.G. Journel, Ch.J. uijbrets. Minin Geostatistics. he Blacburn Press, Calwell, New Jersey, 2003; Oriinal printin 1978. 121-6