Least-square inversion with inexact adjoints. Method of conjugate directions: A tutorial


Published in SEP Report, 92, 253-365 (1996)

Sergey Fomel (e-mail: sergey@sep.stanford.edu)

ABSTRACT

This tutorial describes the classic method of conjugate directions: the generalization of the conjugate-gradient method in iterative least-square inversion. I derive the algebraic equations of the conjugate-direction method from general optimization principles. The derivation explains the magic properties of conjugate gradients. It also justifies the use of conjugate directions in cases when these properties are distorted either by computational errors or by inexact adjoint operators. The extra cost comes from storing a larger number of previous search directions in the computer memory. A simple program and two examples illustrate the method.

INTRODUCTION

This paper describes the method of conjugate directions for solving linear operator equations in Hilbert space. This method is usually described in the numerous textbooks on unconstrained optimization as an introduction to the much more popular method of conjugate gradients. See, for example, Practical optimization by Gill et al. (1995) and its bibliography. The famous conjugate-gradient solver possesses specific properties, well-known from the original works of Hestenes and Stiefel (1952) and Fletcher and Reeves (1964). For linear operators and exact computations, it guarantees finding the solution after, at most, n iterative steps, where n is the number of dimensions in the solution space. The method of conjugate gradients doesn't require explicit computation of the objective function and explicit inversion of the Hessian matrix. This makes it particularly attractive for large-scale inverse problems, such as those of seismic data processing and interpretation. However, it does require explicit computation of the adjoint operator.

Claerbout (1992, 2003) shows dozens of successful examples of the conjugate-gradient application with numerically precise adjoint operators. The motivation for this tutorial is to explore the possibility of using different types of preconditioning operators in the place of adjoints in iterative least-square inversion. For some linear or linearized operators, implementing the exact adjoint may pose a difficult problem.

For others, one may prefer different preconditioners because of their smoothness (Claerbout, 1995a; Crawley, 1995), simplicity (Kleinman and van den Berg, 1991), or asymptotic properties (Sevink and Herman, 1994). In those cases, we could apply the natural generalization of the conjugate-gradient method, which is the method of conjugate directions. The cost difference between those two methods is in the volume of memory storage. In the days when the conjugate-gradient method was invented, this difference looked too large to even consider a practical application of conjugate directions. With the evident increase of computer power over the last 30 years, we can afford to do it now.

I derive the main equations used in the conjugate-direction method from very general optimization criteria, with minimum restrictions implied. The textbook algebra is illustrated with a simple program and two simple examples.

IN SEARCH OF THE MINIMUM

We are looking for the solution of the linear operator equation

    d = A m ,    (1)

where m is the unknown model in the linear model space, d stands for the given data, and A is the forward modeling operator. The data vector d belongs to a Hilbert space with a defined norm and dot product. The solution is constructed by iterative steps in the model space, starting from an initial guess m_0. Thus, at the n-th iteration, the current model m_n is found by the recursive relation

    m_n = m_{n-1} + α_n s_n ,    (2)

where s_n denotes the step direction, and α_n stands for the scaling coefficient. The residual at the n-th iteration is defined by

    r_n = d - A m_n .    (3)

Substituting (2) into (3) leads to the equation

    r_n = r_{n-1} - α_n A s_n .    (4)

For a given step s_n, we can choose α_n to minimize the squared norm of the residual

    ||r_n||^2 = ||r_{n-1}||^2 - 2 α_n (r_{n-1}, A s_n) + α_n^2 ||A s_n||^2 .    (5)

The parentheses denote the dot product, and ||x|| = (x, x)^{1/2} denotes the norm of x in the corresponding Hilbert space. The optimal value of α_n is easily found from equation (5) to be

    α_n = (r_{n-1}, A s_n) / ||A s_n||^2 .    (6)
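To make the line search concrete, here is a minimal C sketch of equations (2), (4), and (6) for dense vectors stored as plain float arrays. The helper names dot, step_length, and update are illustrative only and not part of any particular library.

    #include <stddef.h>

    /* Dot product of two float vectors, accumulated in double precision. */
    static double dot(size_t n, const float *a, const float *b)
    {
        double sum = 0.0;
        size_t i;
        for (i = 0; i < n; i++) sum += (double) a[i] * b[i];
        return sum;
    }

    /* Optimal step length of equation (6): α_n = (r_{n-1}, A s_n) / ||A s_n||^2.
       Returns 0 when A s_n vanishes, leaving the model unchanged. */
    static double step_length(size_t ny, const float *r, const float *As)
    {
        double den = dot(ny, As, As);
        return (den > 0.0) ? dot(ny, r, As) / den : 0.0;
    }

    /* Model and residual updates of equations (2) and (4). */
    static void update(size_t nx, size_t ny, double alpha,
                       const float *s, const float *As, float *m, float *r)
    {
        size_t i;
        for (i = 0; i < nx; i++) m[i] += (float) (alpha * s[i]);   /* m_n = m_{n-1} + α_n s_n   */
        for (i = 0; i < ny; i++) r[i] -= (float) (alpha * As[i]);  /* r_n = r_{n-1} - α_n A s_n */
    }

Accumulating the dot products in double precision mirrors the choice made in the program of Table 1 below.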

Two important conclusions immediately follow from the choice (6) of α_n. First, substituting the value of α_n from formula (6) into equation (4) and taking the dot product of both sides of this equation with A s_n, we can conclude that

    (r_n, A s_n) = 0 ,    (7)

which means that the new residual is orthogonal to the corresponding step in the residual space. This situation is schematically shown in Figure 1. Second, substituting formula (6) into (5), we can conclude that the new residual decreases according to

    ||r_n||^2 = ||r_{n-1}||^2 - (r_{n-1}, A s_n)^2 / ||A s_n||^2    (8)

("Pythagoras's theorem"), unless r_{n-1} and A s_n are orthogonal. These two conclusions are the basic features of optimization by the method of steepest descent. They will help us define an improved search direction at each iteration.

Figure 1: Geometry of the residual in the data space (a scheme).

IN SEARCH OF THE DIRECTION

Let's suppose we have a generator that provides particular search directions at each step. The new direction can be the gradient of the objective function (as in the method of steepest descent), some other operator applied on the residual from the previous step, or, generally speaking, any arbitrary vector in the model space. Let us denote the automatically generated direction by c_n. According to formula (8), the residual decreases as a result of choosing this direction by

    ||r_{n-1}||^2 - ||r_n||^2 = (r_{n-1}, A c_n)^2 / ||A c_n||^2 .    (9)

How can we improve on this result?

First step of the improvement

Assuming n > 1, we can add some amount of the previous step s_{n-1} to the chosen direction c_n to produce a new search direction s_n^{(n-1)}, as follows:

    s_n^{(n-1)} = c_n + β_n^{(n-1)} s_{n-1} ,    (10)

where β_n^{(n-1)} is an adjustable scalar coefficient. According to the fundamental orthogonality principle (7),

    (r_{n-1}, A s_{n-1}) = 0 .    (11)

As follows from equation (11), the numerator on the right-hand side of equation (9) is not affected by the new choice of the search direction:

    (r_{n-1}, A s_n^{(n-1)})^2 = [(r_{n-1}, A c_n) + β_n^{(n-1)} (r_{n-1}, A s_{n-1})]^2 = (r_{n-1}, A c_n)^2 .    (12)

However, we can use transformation (10) to decrease the denominator in (9), thus further decreasing the residual r_n. We achieve the minimization of the denominator

    ||A s_n^{(n-1)}||^2 = ||A c_n||^2 + 2 β_n^{(n-1)} (A c_n, A s_{n-1}) + (β_n^{(n-1)})^2 ||A s_{n-1}||^2    (13)

by choosing the coefficient β_n^{(n-1)} to be

    β_n^{(n-1)} = - (A c_n, A s_{n-1}) / ||A s_{n-1}||^2 .    (14)

Note the analogy between (14) and (6). Analogously to (7), equation (14) is equivalent to the orthogonality condition

    (A s_n^{(n-1)}, A s_{n-1}) = 0 .    (15)

Analogously to (8), applying formula (14) is also equivalent to defining the minimized denominator as

    ||A s_n^{(n-1)}||^2 = ||A c_n||^2 - (A c_n, A s_{n-1})^2 / ||A s_{n-1}||^2 .    (16)

Second step of the improvement

Now let us assume n > 2 and add some amount of the step from the (n-2)-th iteration to the search direction, determining the new direction s_n^{(n-2)}, as follows:

    s_n^{(n-2)} = s_n^{(n-1)} + β_n^{(n-2)} s_{n-2} .    (17)

We can deduce that after the second change, the value of the numerator in equation (9) is still the same:

    (r_{n-1}, A s_n^{(n-2)})^2 = [(r_{n-1}, A c_n) + β_n^{(n-2)} (r_{n-1}, A s_{n-2})]^2 = (r_{n-1}, A c_n)^2 .    (18)

This remarkable fact occurs as the result of transforming the dot product (r_{n-1}, A s_{n-2}) with the help of equation (4):

    (r_{n-1}, A s_{n-2}) = (r_{n-2}, A s_{n-2}) - α_{n-1} (A s_{n-1}, A s_{n-2}) = 0 .    (19)

The first term in (19) is equal to zero according to formula (7); the second term is equal to zero according to formula (15). Thus we have proved the new orthogonality equation

    (r_{n-1}, A s_{n-2}) = 0 ,    (20)

which in turn leads to the numerator invariance (18). The value of the coefficient β_n^{(n-2)} in (17) is defined analogously to (14) as

    β_n^{(n-2)} = - (A s_n^{(n-1)}, A s_{n-2}) / ||A s_{n-2}||^2 = - (A c_n, A s_{n-2}) / ||A s_{n-2}||^2 ,    (21)

where we have again used equation (15). If A s_{n-2} is not orthogonal to A c_n, the second step of the improvement leads to a further decrease of the denominator in (8) and, consequently, to a further decrease of the residual.

Induction

Continuing by induction the process of adding a linear combination of the previous steps to the arbitrarily chosen direction c_n (known in mathematics as the Gram-Schmidt orthogonalization process), we finally arrive at the complete definition of the new step s_n, as follows:

    s_n = s_n^{(1)} = c_n + Σ_{j=1}^{n-1} β_n^{(j)} s_j .    (22)

Here the coefficients β_n^{(j)} are defined by equations

    β_n^{(j)} = - (A c_n, A s_j) / ||A s_j||^2 ,    (23)

which correspond to the orthogonality principles

    (A s_n, A s_j) = 0 ,  1 ≤ j ≤ n-1 ,    (24)

and

    (r_n, A s_j) = 0 ,  1 ≤ j ≤ n .    (25)

It is these orthogonality properties that allowed us to optimize the search parameters one at a time instead of solving the n-dimensional system of optimization equations for α_n and β_n^{(j)}.
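As a concrete illustration of equations (22)-(23), here is a minimal C sketch of the orthogonalization loop, reusing the dot() helper from the earlier sketch. It assumes the previous directions s_j, their images A s_j, and the squared norms ||A s_j||^2 are kept in memory; all names are illustrative.

    /* Orthogonalization of equations (22)-(23). On entry s and As hold c_n
       and A c_n; on exit they hold s_n and A s_n. Ac is a copy of A c_n kept
       fixed, prev[j] and Aprev[j] store s_j and A s_j, and Anorm2[j] stores
       ||A s_j||^2. */
    static void conjugate_direction(size_t nx, size_t ny, size_t nprev,
                                    const float *Ac, float *s, float *As,
                                    float *const *prev, float *const *Aprev,
                                    const double *Anorm2)
    {
        size_t i, j;
        for (j = 0; j < nprev; j++) {
            double beta = -dot(ny, Ac, Aprev[j]) / Anorm2[j];   /* equation (23) */
            for (i = 0; i < nx; i++) s[i]  += (float) (beta * prev[j][i]);
            for (i = 0; i < ny; i++) As[i] += (float) (beta * Aprev[j][i]);
        }
    }

Because A is linear, updating the stored image As alongside s keeps A s_n available without an extra application of the operator, which is the same design choice made in the program of Table 1.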

ALGORITHM

The results of the preceding sections define the method of conjugate directions to consist of the following algorithmic steps:

1. Choose an initial model m_0 and compute the residual r_0 = d - A m_0.

2. At the n-th iteration, choose the initial search direction c_n.

3. If n is greater than 1, optimize the search direction by adding a linear combination of the previous directions, according to equations (22) and (23), and compute the modified step direction s_n.

4. Find the step length α_n according to equation (6). The orthogonality principles (24) and (7) can simplify this equation to the form

    α_n = (r_{n-1}, A c_n) / ||A s_n||^2 .    (26)

5. Update the model m_n and the residual r_n according to equations (2) and (4).

6. Repeat iterations until the residual decreases to the required accuracy or as long as it is practical.

At each of the subsequent steps, the residual is guaranteed not to increase according to equation (8). Furthermore, optimizing the search direction guarantees that the convergence rate doesn't decrease in comparison with (9). The only assumption we have to make to arrive at this conclusion is that the operator A is linear. However, without additional assumptions, we cannot guarantee global convergence of the algorithm to the least-square solution of equation (1) in a finite number of steps.

WHAT ARE ADJOINTS FOR? THE METHOD OF CONJUGATE GRADIENTS

The adjoint operator A^T projects the data space back to the model space and is defined by the dot-product test

    (d, A m) ≡ (A^T d, m)    (27)

for any m and d. The method of conjugate gradients is a particular case of the method of conjugate directions, where the initial search direction c_n is

    c_n = A^T r_{n-1} .    (28)

This direction is often called the gradient, because it corresponds to the local gradient of the squared residual norm with respect to the current model m_{n-1}. Aligning the initial search direction along the gradient leads to the following remarkable simplifications in the method of conjugate directions.
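Equation (27) also suggests a simple numerical check of how exact an adjoint actually is. The sketch below is one possible implementation, not part of the paper's code: forward and adjoint are placeholders for the user's operator pair, and the returned relative discrepancy should be near rounding level for an exact adjoint and noticeably larger for an approximate one.

    #include <stdlib.h>
    #include <math.h>

    typedef void (*operator_t)(size_t nin, const float *in, size_t nout, float *out);

    /* Dot-product test of equation (27) with random vectors. */
    static double dot_test(operator_t forward, operator_t adjoint, size_t nx, size_t ny)
    {
        float *m   = malloc(nx * sizeof(float));
        float *d   = malloc(ny * sizeof(float));
        float *Am  = malloc(ny * sizeof(float));
        float *Atd = malloc(nx * sizeof(float));
        double lhs = 0.0, rhs = 0.0;
        size_t i;

        for (i = 0; i < nx; i++) m[i] = (float) rand() / RAND_MAX;   /* random model */
        for (i = 0; i < ny; i++) d[i] = (float) rand() / RAND_MAX;   /* random data  */

        forward(nx, m, ny, Am);    /* A m   */
        adjoint(ny, d, nx, Atd);   /* A^T d */

        for (i = 0; i < ny; i++) lhs += (double) d[i] * Am[i];    /* (d, A m)   */
        for (i = 0; i < nx; i++) rhs += (double) Atd[i] * m[i];   /* (A^T d, m) */

        free(m); free(d); free(Am); free(Atd);

        /* relative discrepancy: near zero only when the adjoint is exact */
        return fabs(lhs - rhs) / (fabs(lhs) + fabs(rhs));
    }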

Orthogonality of the gradients

The orthogonality principle (25) transforms according to the dot-product test (27) to the form

    (r_{n-1}, A s_j) = (A^T r_{n-1}, s_j) = (c_n, s_j) = 0 ,  1 ≤ j ≤ n-1 .    (29)

Forming the dot product (c_n, c_j) and applying formula (22), we can see that

    (c_n, c_j) = (c_n, s_j - Σ_{i=1}^{j-1} β_j^{(i)} s_i) = (c_n, s_j) - Σ_{i=1}^{j-1} β_j^{(i)} (c_n, s_i) = 0 ,  1 ≤ j ≤ n-1 .    (30)

Equation (30) proves the orthogonality of the gradient directions from different iterations. Since the gradients are orthogonal, after n iterations they form a basis in the n-dimensional space. In other words, if the model space has n dimensions, each vector in this space can be represented by a linear combination of the gradient vectors formed by n iterations of the conjugate-gradient method. This is true as well for the vector m_0 - m, which points from the solution of equation (1) to the initial model estimate m_0. Neglecting computational errors, it takes exactly n iterations to find this vector by successive optimization of the coefficients. This proves that the conjugate-gradient method converges to the exact solution in a finite number of steps (assuming that the model belongs to a finite-dimensional space).

The method of conjugate gradients simplifies formula (26) to the form

    α_n = (r_{n-1}, A c_n) / ||A s_n||^2 = (A^T r_{n-1}, c_n) / ||A s_n||^2 = ||c_n||^2 / ||A s_n||^2 ,    (31)

which in turn leads to the simplification of formula (8), as follows:

    ||r_n||^2 = ||r_{n-1}||^2 - ||c_n||^4 / ||A s_n||^2 .    (32)

If the gradient is not equal to zero, the residual is guaranteed to decrease. If the gradient is equal to zero, we have already found the solution.

Short memory of the gradients

Substituting the gradient direction (28) into formula (23) and applying formulas (4) and (27), we can see that

    β_n^{(j)} = (A c_n, r_j - r_{j-1}) / (α_j ||A s_j||^2) = (c_n, A^T r_j - A^T r_{j-1}) / (α_j ||A s_j||^2) = (c_n, c_{j+1} - c_j) / (α_j ||A s_j||^2) .    (33)

The orthogonality condition (30) and the definition of the coefficient α_j from equation (31) further transform this formula to the form

    β_n^{(n-1)} = ||c_n||^2 / (α_{n-1} ||A s_{n-1}||^2) = ||c_n||^2 / ||c_{n-1}||^2 ,    (34)

    β_n^{(j)} = 0 ,  1 ≤ j ≤ n-2 .    (35)

Equation (35) shows that the conjugate-gradient method needs to remember only the previous step direction in order to optimize the search at each iteration. This is another remarkable property distinguishing that method in the family of conjugate-direction methods.

PROGRAM

The program in Table 1 implements one iteration of the conjugate-direction method. It is based upon Jon Claerbout's cgstep() program (?) and uses an analogous naming convention. Vectors in the data space are denoted by double letters. In addition to the previous steps s_j and their conjugate counterparts A s_j (array s), the program stores the squared norms ||A s_j||^2 (variable beta) to avoid recomputation. For practical reasons, the number of remembered iterations can actually be smaller than the total number of iterations.

EXAMPLES

Example 1: Inverse interpolation

Matthias Schwab has suggested (in a personal communication) an interesting example, in which the cgstep program fails to comply with the conjugate-gradient theory. The inverse problem is a simple one-dimensional data interpolation with a known filter (?). The known portion of the data is a single spike in the middle. One hundred other data points are considered missing. The known filter is the Laplacian (1, -2, 1), and the expected result is a bell-shaped cubic spline. The forward problem is strictly linear, and the exact adjoint is easily computed by reverse convolution. However, the conjugate-gradient program requires significantly more than the theoretically predicted 100 iterations. Figure 2 displays the convergence to the final solution in three different plots. According to the figure, the actual number of iterations required for convergence is about 300. Figure 3 shows the result of a similar experiment with the conjugate-direction solver cdstep. The number of required iterations is reduced to almost the theoretical one hundred. This indicates that the orthogonality of directions implied in the conjugate-gradient method has been distorted by computational errors. The additional cost of correcting these errors with the conjugate-direction solver comes from storing the preceding 100 directions in memory. A smaller number of memorized steps produces smaller improvements.
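The known-filter operator pair of Example 1 can be sketched directly. The following minimal sketch, with illustrative names, performs internal convolution with the Laplacian (1, -2, 1) and builds the exact adjoint by reverse convolution (correlation written in scatter form); such a pair should pass the dot-product test sketched earlier to rounding accuracy. It is only a sketch of the filter pair, not of the complete missing-data regression.

    #include <stddef.h>

    /* Forward: internal convolution of the model with the Laplacian (1, -2, 1).
       The output has nm - 2 samples. */
    static void laplacian_lop(size_t nm, const float *m, float *d)
    {
        size_t i;
        for (i = 0; i + 2 < nm; i++)
            d[i] = m[i] - 2.0f * m[i+1] + m[i+2];
    }

    /* Exact adjoint: correlation with the reversed filter, written in scatter
       form so that (d, A m) = (A^T d, m) holds up to rounding. */
    static void laplacian_adj(size_t nm, float *m, const float *d)
    {
        size_t i;
        for (i = 0; i < nm; i++) m[i] = 0.0f;
        for (i = 0; i + 2 < nm; i++) {
            m[i]   += d[i];
            m[i+1] -= 2.0f * d[i];
            m[i+2] += d[i];
        }
    }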

    void sf_cdstep(bool forget     /* restart flag */,
                   int nx          /* model size */,
                   int ny          /* data size */,
                   float* x        /* current model [nx] */,
                   const float* g  /* gradient [nx] */,
                   float* rr       /* data residual [ny] */,
                   const float* gg /* conjugate gradient [ny] */)
    /*< Step of conjugate-direction iteration.
        The data residual is rr = A x - dat
    >*/
    {
        float *s, *si, *ss;
        double alpha, beta;
        int i, n, ix, iy;

        /* the new step: model-space part s followed by data-space part ss */
        s = sf_floatalloc(nx+ny);
        ss = s+nx;

        for (ix=0; ix < nx; ix++) { s[ix] = g[ix]; }
        for (iy=0; iy < ny; iy++) { ss[iy] = gg[iy]; }

        /* 'steps' is the static list of stored directions, declared elsewhere in cdstep.c */
        sf_llist_rewind (steps);
        n = sf_llist_depth (steps);

        /* orthogonalize against the stored directions, equations (22)-(23) */
        for (i=0; i < n; i++) {
            sf_llist_down (steps, &si, &beta);
            alpha = cblas_dsdot( ny, gg, 1, si+nx, 1) / beta;
            cblas_saxpy(nx+ny, -alpha, si, 1, s, 1);
        }

        beta = cblas_dsdot(ny, s+nx, 1, s+nx, 1);
        if (beta < DBL_EPSILON) return;

        sf_llist_add (steps, s, beta);
        if (forget) sf_llist_chop (steps);

        /* step length of equation (26) and updates (2) and (4) */
        alpha = - cblas_dsdot( ny, rr, 1, ss, 1) / beta;

        cblas_saxpy(nx, alpha, s, 1, x, 1);
        cblas_saxpy(ny, alpha, ss, 1, rr, 1);
    }

Table 1: The source of this program is RSF/api/c/cdstep.c
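As a usage note (not part of the original paper), one possible driver loop around sf_cdstep() is sketched below. Here forward() and precond_adjoint() are hypothetical placeholders for the forward operator A and for whatever adjoint or preconditioner generates the initial direction; memory is the assumed number of remembered directions, and the residual rr must be initialized to A x - dat before the loop.

    /* Hypothetical operator pair, assumed to be supplied by the user:
       forward() applies A, precond_adjoint() applies A^T or any preconditioner
       used in its place. */
    extern void forward(int nx, const float *in, int ny, float *out);
    extern void precond_adjoint(int ny, const float *in, int nx, float *out);

    /* One possible driver: rr must hold A x - dat on entry and is updated in
       place by sf_cdstep(). */
    void cd_solver(int nx, int ny, int niter, int memory,
                   float *x, float *rr, float *g, float *gg)
    {
        int iter;

        for (iter = 0; iter < niter; iter++) {
            precond_adjoint(ny, rr, nx, g);   /* g = B rr: search direction (sign is absorbed by the step length) */
            forward(nx, g, ny, gg);           /* gg = A g: its image in the data space */
            /* once 'memory' directions are stored, start forgetting the oldest */
            sf_cdstep(iter >= memory, nx, ny, x, g, rr, gg);
        }
    }

Setting memory to the total number of iterations corresponds to the long-memory behavior of Figure 3, while a small memory approaches the short-memory cgstep()-like scheme mentioned in the conclusions.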

Figure 2: Convergence of the missing data interpolation problem with the conjugate-gradient solver. Current models are plotted against the number of iterations. The three plots are different displays of the same data.

Figure 3: Convergence of the missing data interpolation problem with the long-memory conjugate-direction solver. Current models are plotted against the number of iterations. The three plots are different displays of the same data.

Example 2: Velocity transform

The next test example is the velocity transform inversion with a CMP gather from the Mobil AVO dataset (Nichols, 1994; Lumley et al., 1994; Lumley, 1994). I use Jon Claerbout's veltran program (Claerbout, 1995b) for anti-aliased velocity transform with rho-filter preconditioning and compare three different pairs of operators for inversion. The first pair is the CMP stacking operator with the migration weighting function w = (t_0/t) √t and its adjoint. The second pair is the pseudo-unitary velocity transform with the weighting proportional to √(s x), where x is the offset and s is the slowness. These two pairs were used in the velocity transform inversion with the iterative conjugate-gradient solver. The third pair uses the weight proportional to x for CMP stacking and s for the reverse operator. Since these two operators are not exact adjoints, it is appropriate to apply the method of conjugate directions for inversion.

The convergence of the three different inversions is compared in Figure 4. We can see that the third method reduces the least-square residual error, though it has a smaller effect than that of the pseudo-unitary weighting in comparison with the uniform one. The results of inversion after 10 conjugate-gradient iterations are plotted in Figures 5 and 6, which are to be compared with the analogous results of Lumley (1994) and Nichols (1994).

Figure 4: Comparison of convergence of the iterative velocity transform inversion. The left plot compares conjugate-gradient inversion with unweighted (uniformly weighted) and pseudo-unitary operators. The right plot compares pseudo-unitary conjugate-gradient and weighted conjugate-direction inversion.

Figure 5: Input CMP gather (left) and its velocity transform counterpart (right) after 10 iterations of conjugate-direction inversion.

Figure 6: The modeled CMP gather (left) and the residual data (right) plotted at the same scale.

CONCLUSIONS

The conjugate-gradient solver is a powerful method of least-square inversion because of its remarkable algebraic properties. In practice, the theoretical basis of conjugate gradients can be distorted by computational errors. In some applications of inversion, we may want to do that on purpose, by applying inexact adjoints in preconditioning. In both cases, a safer alternative is the method of conjugate directions. Jon Claerbout's cgstep() program actually implements a short-memory version of the conjugate-direction method. Extending the length of the memory raises the cost of iterations, but can speed up the convergence.

REFERENCES

Claerbout, J., 1995a, Ellipsoids versus hyperboloids, in SEP-89: Stanford Exploration Project, 201-206.

Claerbout, J., 2003, Image estimation by example: Geophysical soundings image construction: Multidimensional autoregression: Stanford Exploration Project.

Claerbout, J. F., 1992, Earth Soundings Analysis: Processing Versus Inversion: Blackwell Scientific Publications.

Claerbout, J. F., 1995b, Basic Earth Imaging: Stanford Exploration Project.

Crawley, S., 1995, Approximate vs. exact adjoints in inversion, in SEP-89: Stanford Exploration Project, 207-216.

Fletcher, R., and C. M. Reeves, 1964, Function minimization by conjugate gradients: Computer Journal, 7, 149-154.

Gill, P. E., W. Murray, and M. H. Wright, 1995, Practical optimization: Academic Press.

Hestenes, M. R., and E. Stiefel, 1952, Methods of conjugate gradients for solving linear systems: J. Res. NBS, 49, 409-436.

Kleinman, R. E., and P. M. van den Berg, 1991, Iterative methods for solving integral equations: Radio Science, 26, 175-181.

Lumley, D., D. Nichols, and T. Rekdal, 1994, Amplitude-preserved multiple suppression, in SEP-82: Stanford Exploration Project, 25-45.

Lumley, D. E., 1994, Estimating a pseudounitary operator for velocity-stack inversion, in SEP-82: Stanford Exploration Project, 63-78.

Nichols, D., 1994, Velocity-stack inversion using L_p norms, in SEP-82: Stanford Exploration Project, 1-16.

Sevink, A. G. J., and G. C. Herman, 1994, Fast iterative solution of sparsely sampled seismic inverse problems: Inverse Problems, 10, 937-948.