
14 Linear inversion

In Chapter 12 we saw how the parametrization of a continuous model allows us to formulate a discrete linear relationship between data d and model m. With unknown corrections added to the model vector, this linear relationship remains formally the same if we write the physical model parameters as m_1 and the corrections as m_2, but combine both in one vector m:

    A_1 m_1 + A_2 m_2 = A m = d.    (12.1) again

Assuming we have M_1 model parameters and M_2 corrections, this is a system of N equations (data) and M = M_1 + M_2 unknowns. For more than one reason the solution of the system is not straightforward:

- Even if we do not include multiple measurements along the same path, many of the N rows will be dependent.
- Since the data always contain errors, this implies we cannot solve the system exactly, but have to minimize the misfit between Am and d. For this misfit we can define different norms, and we face a choice of options.
- Despite the fact that we have (usually) many more data than unknowns (i.e. N >> M), the system is almost certainly ill-posed in the sense that small errors in d can lead to large errors in m; a parameter m_j may be completely undetermined (A_ij = 0 for all i) if it represents a node that is far away from any raypath.
- We cannot escape making a subjective choice among an infinite set of equally satisfactory solutions by imposing a regularization strategy.
- For large M, the numerical computation of the solution has to be done with an iterative matrix solver which is often halted when a satisfactory fit is obtained. Such efficient shortcuts interfere with the regularization strategy.

We shall deal with each of these aspects in succession. Appendix D introduces some concepts of probability theory that are needed in this chapter.

14.1 Maximum likelihood estimation and least squares

In experimental sciences, the most commonly used misfit criterion is the criterion of least squares, in which we minimize χ² ('chi square') as a function of the model:

    χ²(m) = Σ_{i=1}^N [d_i − Σ_{j=1}^M A_ij m_j]² / σ_i² = min,    (14.1)

where σ_i is the standard deviation in datum d_i; χ² is a direct measure of the data misfit, in which we weigh the misfits inversely with their standard errors σ_i.

For uncorrelated and normally distributed errors, the principle of maximum likelihood leads naturally to the least squares definition of misfit. If there are no sources of bias, the expected value E(d_i) of d_i (the average of infinitely many observations of the same observable) is equal to the correct or error-free value. In practice, we have only one observation for each datum, but we usually have an educated guess at the magnitude of the errors. We almost always use a normal distribution for errors, and assume errors to be uncorrelated, such that the probability density is given by a Gaussian or normal distribution of the form:

    P(d_i) = 1/(σ_i √(2π)) exp[ −(d_i − E(d_i))² / 2σ_i² ].    (14.2)

The joint probability density for the observation of an N-tuple of data with independent errors d = (d_1, d_2, ..., d_N) is found by multiplying the individual probability densities for each datum:

    P(d) = Π_{i=1}^N 1/(σ_i √(2π)) exp[ −(d_i − E(d_i))² / 2σ_i² ].    (14.3)

If we replace the expected values in (14.3) with the predicted values from the model parameters, we obtain again a probability, but now one that is conditional on the model parameters taking the values m_j:

    P(d|m) = Π_{i=1}^N 1/(σ_i √(2π)) exp[ −(d_i − Σ_j A_ij m_j)² / 2σ_i² ].    (14.4)

We usually assume that there are no extra errors introduced by the modelling (e.g. we ignore the approximation errors introduced by linearizations, neglect of anisotropy, or the shortcomings of ray theory etc.). In fact, if such modelling errors are also uncorrelated, unbiased and normally distributed, we can take them into account by including them in σ_i, but this is a big if.
See Tarantola [351] for a much more comprehensive discussion of this issue.
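As a concrete illustration, the χ² misfit of (14.1) and the scaling to a univariant data vector can be sketched as follows. The matrix, noise levels and sizes here are all invented for the example:

```python
import numpy as np

# Hypothetical toy tomographic system: N data, M model parameters.
# A, sigma and the noise realization are all made up for illustration.
rng = np.random.default_rng(0)
N, M = 50, 10
A = rng.normal(size=(N, M))
m_true = rng.normal(size=M)
sigma = np.full(N, 0.5)                       # standard deviation of each datum
d_obs = A @ m_true + rng.normal(scale=sigma)  # data with Gaussian noise

def chi_square(A, d, sigma, m):
    """Equation (14.1): misfits weighted inversely by their standard errors."""
    r = (d - A @ m) / sigma
    return np.sum(r**2)

# Dividing each datum and the corresponding row of A by sigma_i makes the
# data 'univariant' (all sigma_i = 1); chi^2 is then simply the squared
# length of the residual vector.
A_u = A / sigma[:, None]
d_u = d_obs / sigma
m_hat, *_ = np.linalg.lstsq(A_u, d_u, rcond=None)

print(chi_square(A, d_obs, sigma, m_hat))  # of order N for a good fit
```

The least-squares model m_hat minimizes exactly the quantity that `chi_square` evaluates, which is why the scaled (univariant) system can be handed directly to a standard least-squares routine.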

Clearly, one would like to have a model that is associated with a high probability for its predicted data vector. This leads to the definition of the likelihood function L for the model m given the observation of the data d:

    L(m|d) = P(d|m) ∝ exp[ −½ χ²(m) ].

Thus, maximizing the likelihood for a model involves minimizing χ². Since this involves minimizing the sum of squares of data misfit, the method is more generally known as the method of least squares.

The strong point of the method of least squares is that it leads to very efficient methods of solving (12.1). Its major weakness is the reliance on a normal distribution of the errors, which may not always be the case. Because of the quadratic dependence on the misfit, outliers (misfits of several standard deviations) have an influence on the solution that may be out of proportion, which means that errors may dominate in the solution. For a truly normal distribution, large errors have such a low probability of occurrence that we would not worry about this. In practice, however, many data do suffer from outliers. For picked arrival times Jeffreys [146] already observed that the data have a tail-like distribution that deviates from the Gaussian for large deviations from the mean t_m, mainly because a later arrival is misidentified as P or S:

    P(t) = (1 − ε)/(σ√(2π)) e^{−(t − t_m)²/2σ²} + ε g(t),

where the probability density g(t) varies slowly and where ε << 1. A simple method to bring the data distribution closer to normal is to reject outliers with a delay that exceeds the largest delay time to be expected from reasonable effects of lateral heterogeneity. This decision can be made after a first trial inversion: for example, one may reject all data that leave a residual in excess of 3σ after a first inversion attempt.

If we divide all data and the corresponding row of A by their standard deviations, we end up with a data vector that is univariant, i.e. all standard deviations are equal to 1.
Thus, without loss of generality, we may assume that the data are univariant, in which case we see from (14.1) that χ² is simply the squared length of the residual vector r = d − Am. From Figure 14.1 we see that r is then perpendicular to the subspace spanned by all vectors Ay (the range R(A) of A). For if it were not, we could add a δm to m such that Aδm reduces the length of r. Thus, for all y the dot product between r and Ay must be zero:

    r · Ay = (Aᵀr) · y = [Aᵀ(d − Am)] · y = 0,

Fig. 14.1. If the data vector d does not lie in the range R(A) of A, the best we can do is to minimize the length of the residual vector r = d − Am. This implies that r must be perpendicular to any possible vector Ay.

where Aᵀ is the transpose of A (i.e. Aᵀ_ij = A_ji). Since this dot product is 0 for all y, clearly Aᵀ(d − Am) = 0, or:

    AᵀA m = Aᵀd,    (14.5)

which is known as the set of normal equations to solve the least-squares problem.

Chi square is an essential statistical measure of the goodness of fit. In the hypothetical case that we satisfy every datum with a misfit of one standard deviation we find χ² = N; clearly values much higher than N are unwanted because the misfit is higher than could be expected from the knowledge of data errors, and values much lower than N indicate that the model is trying to fit the data errors rather than the general trend in the data. For example, if two very close rays have travel time anomalies differing by only 0.5 s and the standard deviation is estimated to be 0.7 s, we should accept that a smooth model predicts the same anomaly for each, rather than introducing a steep velocity gradient in the 3D model to try to satisfy the difference. Because we want χ² ≈ N, it is often convenient to work with the reduced χ² or χ²_red, which is defined as χ²/N, so that the optimum solution is found for χ²_red ≈ 1.

But how close should χ² be to N? Statistical theory shows that χ² itself has a variance of 2N, or a standard deviation of √(2N). Thus, the true model would with 67% confidence be found in the interval χ² = N ± √(2N). Such theoretical bounds are almost certainly too narrow because our estimates of the standard deviations σ_i are themselves uncertain. For example, if the true σ_i are equal to 0.9 but we use 1.0 to compute χ², our computed χ² itself is in error (i.e. too low) by almost 20%, and a model satisfying this level of misfit is probably not good enough. It is therefore important to obtain accurate estimates of the standard errors, e.g. using (6.2) or (6.12).
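A quick numerical check of the normal equations (14.5) and of the χ² ≈ N criterion, on a synthetic univariant system (all sizes and values invented):

```python
import numpy as np

# Synthetic univariant system (sigma_i = 1 throughout); numbers are invented.
rng = np.random.default_rng(1)
N, M = 200, 15
A = rng.normal(size=(N, M))
d = A @ rng.normal(size=M) + rng.normal(size=N)  # unit-variance noise

# Solve the normal equations (14.5): A^T A m = A^T d.
m_hat = np.linalg.solve(A.T @ A, A.T @ d)

# The residual must be orthogonal to the range of A (Figure 14.1).
r = d - A @ m_hat
print(np.max(np.abs(A.T @ r)))   # ~ 0 up to round-off

# Goodness of fit: chi^2 should lie within roughly N +/- sqrt(2N).
chi2 = r @ r
print(chi2, N, np.sqrt(2 * N))
```

Note that fitting M parameters removes M degrees of freedom, so the expected χ² here is closer to N − M than to N; for tomographic systems with N >> M the distinction is minor.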
Provided one is confident that the estimated standard errors are unbiased, one should still aim for a model that brings χ² very close to N, say to within 20 or 30%. An additional help in deciding how close one wishes to be to a model that fits at a level given by χ² = N is to plot the tradeoff between the model norm and χ²

Fig. 14.2. The L- or tradeoff curve between χ² and model norm ‖m‖², with the extreme points A and B.

(sometimes called the L-curve), shown schematically in Figure 14.2. If the tradeoff curve shows that one could significantly reduce the norm of the model while paying only a small price in terms of an increase in χ² (point A in Figure 14.2), this is an indication that the standard errors in the data have been underestimated. For common data errors do not correlate between nearby stations, but the true delays should correlate even if the Earth's properties vary erratically (because of the overlap in finite-frequency sensitivity). The badly correlating data can only be fit by significantly increasing the norm and complexity of the model, which is what we see happening on the horizontal part of the tradeoff curve. Conversely, if we notice that a significant decrease in χ² can be obtained at the cost of only a minor increase in model norm (point B), this indicates an overestimate of data errors and tells us we may wish to accept a model with χ² < N. If the deviations required are unexpectedly large, this is an indication that the error estimation for the data may need to be revisited.

Depending on where on the L-curve we find that χ² = N, we find that we do or do not have a strong constraint on the norm of the model. If the optimal data fit is obtained close to point B where the L-curve is steep, even large changes in χ² have little effect on the model norm. On the other hand, near point A even large changes in the model give only a small improvement of the data fit. Both A and B represent unwanted situations, since at A we are trying to fit data errors, which leads to erratic features in the model, whereas at B we are damping too strongly. In a well designed tomography experiment, χ² ≈ N near the bend in the L-curve.
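One way to trace such a tradeoff curve numerically is to scan a norm-damping weight ε (anticipating the Tikhonov scheme of Section 14.4) and record χ² and ‖m‖² for each value; the system below is synthetic and the values invented:

```python
import numpy as np

# Trace a schematic L-curve for a synthetic system by scanning a damping
# weight eps; A, d and the noise are all invented for illustration.
rng = np.random.default_rng(2)
N, M = 100, 40
A = rng.normal(size=(N, M))
d = A @ rng.normal(size=M) + rng.normal(size=N)   # univariant noise

chi2_vals, norm_vals = [], []
for eps in np.logspace(-2, 2, 25):
    # Damped least squares: minimize ||Am - d||^2 + eps^2 ||m||^2
    m = np.linalg.solve(A.T @ A + eps**2 * np.eye(M), A.T @ d)
    chi2_vals.append(np.sum((d - A @ m) ** 2))
    norm_vals.append(m @ m)

# Weak damping gives a small chi^2 and a large model norm (point A);
# strong damping the opposite (point B).
print(chi2_vals[0], norm_vals[0])
print(chi2_vals[-1], norm_vals[-1])
```

Plotting `norm_vals` against `chi2_vals` produces the characteristic bend of Figure 14.2, and the value of ε at which χ² crosses N can be read off directly.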
We use the term model norm here in a very general sense: one may wish to inspect the Euclidean ‖m‖² as well as more complicated norms that we shall encounter later in this chapter.

In many cases one inverts different data groups that have uncorrelated errors. For example, Montelli et al. [215] combined travel times from the ISC catalogues with cross-correlation travel times from broadband seismometers. The ISC set, with about 10^6 data, was an order of magnitude larger than the second data set (10^5), and a brute force least-squares inversion would give preference to the short period ISC data in cases where there are systematic incompatibilities. This is easily diagnosed

by computing χ² for the individual data groups. One would wish to weigh the data sets such that each group individually satisfies the optimal χ² criterion, i.e. if χ²_i designates the misfit for data set i with N_i data, one imposes χ²_i ≈ N_i for each data set. This may be accomplished by giving each data set equal weight and minimizing a weighted penalty function:

    P = Σ_i (1/N_i) χ²_i.

Note that this gives a solution that deviates from the maximum likelihood solution, and we should only resort to weighting if we suspect that important conditions are violated, especially those of zero mean, uncorrelated and normally distributed errors. More often, an imbalance for individual χ²_i simply reflects an over- or underestimation of the standard deviations for one particular group of data, and may prompt us to revisit our estimates for prior data errors.

Early tomographic studies often ignored a formal statistical appraisal of the goodness of fit, and merely quoted how much better a 3D tomographic model satisfies the data when compared to a 1D (layered or spherically symmetric) background or starting model, using a quantity named variance reduction, essentially the reduction in the Euclidean norm of the misfit vector. This reduction is as much a function of the fit of the 1D starting model as of the data fit itself (i.e. the same 3D model can have different variance reductions depending on the starting model) and is therefore useless as a statistical measure of quality for the tomographic model.

Exercises

Exercise 14.1 Derive the normal equations by differentiating the expression for χ² with respect to m_k for k = 1, ..., M. Assume univariant data (σ_i = 1).

Exercise 14.2 Why can we not conclude from (14.5) that Am = d?

14.2 Alternatives to least squares

In the parlance of mathematics, the squared Euclidean norm Σ_i |r_i|² is one of a class of Lebesgue norms defined by the power p used in the sum: (Σ_i |r_i|^p)^{1/p}. Thus, the Euclidean norm is also known as the L_2 norm because p = 2.
Of special interest are the L_1 norm (p = 1) and the case p → ∞, which leads to minimizing the maximum among all |r_i|. Instead of simply rejecting outliers, which always requires the choice of a hard bound for acceptance, we may downweight data that show a large misfit in a

Fig. 14.3. (a) The original matrix system Am = d. (b) The eigenvalue problem AᵀAv = λ²v for the least-squares matrix AᵀA.

previous inversion attempt, and repeat the process until it converges. In 1898, the Belgian mathematician Charles Lagrange proposed such an iteratively weighted least-squares solution, by iteratively solving:

    (Am − d)ᵀ W_p (Am − d) = min,

where W_p is a diagonal matrix with elements |r_i|^{p−2} and 0 ≤ p < 2, which are determined from the misfits r_i in datum i after the previous iteration. We can start with an unweighted inversion to find the first r. The choice p = 1 leads to the minimization of the L_1 norm if it converges. The elements of the residual vector r_i vary with each iteration, and convergence is not assured, but the advantage is that the inversion makes use of the very efficient numerical tools available for linear least-squares problems. The method was introduced in geophysics by Scales et al. [304].

14.3 Singular value decomposition

Though the least squares formalism handles the incompatibility problem of data in an overdetermined system, we usually find that AᵀA has a determinant equal to zero, i.e. eigenvalues equal to zero, and its inverse does not exist. Even though in tomographic applications AᵀA is often too large to be diagonalized, we shall analyse the inverse problem using singular values ('eigenvalues' of a non-square matrix), since this formalism gives considerable insight.

Let v_i be an eigenvector of AᵀA with eigenvalue λ_i², so that AᵀAv_i = λ_i²v_i. We may use squared eigenvalues because AᵀA is symmetric and has only non-negative, real eigenvalues. Its eigenvectors are orthogonal. The choice of λ² instead of λ as eigenvalue is for convenience: the notation λ_i² avoids the occurrence of √λ_i later in the development. We can arrange all M eigenvectors as columns in an M × M matrix V and write (see Figure 14.3):

    AᵀA V = V Λ².    (14.6)

The eigenvectors are normalized such that VᵀV = VVᵀ = I.
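These properties are easy to verify numerically. A small synthetic check of (14.6), using an A with dependent rows so that AᵀA is singular (all sizes and values invented):

```python
import numpy as np

# Numerical check of (14.6): A^T A V = V Lambda^2, with orthonormal V.
# A is built rank-deficient, as is typical of tomographic matrices.
rng = np.random.default_rng(3)
N, M, K = 40, 12, 8
A = rng.normal(size=(N, K)) @ rng.normal(size=(K, M))   # rank K < M

lam2, V = np.linalg.eigh(A.T @ A)    # real, non-negative eigenvalues
print(lam2.min())                    # M - K of them are (numerically) zero

# Eigenvector relation (14.6) and orthonormality V^T V = I:
print(np.allclose(A.T @ A @ V, V * lam2))
print(np.allclose(V.T @ V, np.eye(M)))
```

`numpy.linalg.eigh` exploits the symmetry of AᵀA and returns real eigenvalues in ascending order; the M − K zero eigenvalues correspond to the nullspace discussed next.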

With (14.6) we can study the underdetermined nature of the problem Am = d, of which the least-squares solution is given by the system AᵀAm = Aᵀd. The eigenvectors v_i span the M-dimensional model space so m can be written as a linear combination of eigenvectors: m = Vy. Since V is orthonormal, ‖m‖ = ‖y‖ and we can work with y instead of m if we wish to restrict the norm of the model. Using this:

    AᵀA V y = V Λ² y = Aᵀd,

or, multiplying both on the left with Vᵀ and using the orthogonality of V:

    Λ² y = VᵀAᵀd.

Since Λ² is diagonal, this gives y_i (and with that m = Vy) simply by dividing the i-th component of the vector on the right by λ_i². But clearly, any y_i which is multiplied by a zero eigenvalue can take any value without affecting the data fit! We find the minimum norm solution, the solution with the smallest ‖y‖², by setting such components of y to 0. If we rank the eigenvalues λ_1² ≥ λ_2² ≥ ... ≥ λ_K² > 0, followed by the zero eigenvalues, then the last M − K columns of V belong to the nullspace of AᵀA. We truncate the matrices V and Λ to an M × K matrix V_K and a K × K diagonal matrix Λ_K to obtain the minimum norm estimate:

    m̂_min norm = V_K Λ_K⁻² V_Kᵀ Aᵀd.    (14.7)

Note that the inverse of Λ_K exists because we have removed the zero eigenvalues. The orthogonality of the eigenvectors still guarantees V_Kᵀ V_K = I_K, but now V_K V_Kᵀ ≠ I_M.

To see how errors in the data propagate into the model, we use the fact that (14.7) represents a linear transformation of data with a covariance matrix C_d. The posteriori covariance of transformed data Td is equal to T C_d Tᵀ (see Appendix D). In our case we have scaled the data such that C_d = I, so that the posteriori model covariance is:

    C_m̂ = V_K Λ_K⁻² V_Kᵀ Aᵀ I A V_K Λ_K⁻² V_Kᵀ = V_K Λ_K⁻² Λ_K² Λ_K⁻² V_Kᵀ = V_K Λ_K⁻² V_Kᵀ.    (14.8)

Thus the posteriori variance of the estimate for parameter m_i is given by:

    σ²_{m_i} = Σ_{j=1}^K V_ij² / λ_j².    (14.9)

To distinguish data uncertainty from model uncertainty we denote the model standard deviation as σ_{m_i} and the data standard deviation as σ_i.
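The minimum norm estimate (14.7) and the variance formula (14.9) can be sketched directly from the eigendecomposition; the rank-deficient system below is synthetic:

```python
import numpy as np

# Minimum-norm solution (14.7) for a rank-deficient synthetic system,
# built from the eigendecomposition of A^T A (all numbers invented).
rng = np.random.default_rng(4)
N, M, K = 30, 10, 6
A = rng.normal(size=(N, K)) @ rng.normal(size=(K, M))  # rank K < M
d = A @ rng.normal(size=M)

lam2, V = np.linalg.eigh(A.T @ A)       # eigenvalues ascending
order = np.argsort(lam2)[::-1]
lam2, V = lam2[order], V[:, order]
V_K, lam2_K = V[:, :K], lam2[:K]        # discard the nullspace (zero lambda)

m_min = V_K @ ((V_K.T @ (A.T @ d)) / lam2_K)   # equation (14.7)

print(np.linalg.norm(d - A @ m_min))    # fits the data
# Posterior variance (14.9) for univariant data:
var_m = (V_K**2) @ (1.0 / lam2_K)
print(var_m[:3])
```

As a cross-check, `numpy.linalg.lstsq` also returns the minimum-norm least-squares solution for a rank-deficient matrix, and agrees with `m_min`.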

Fig. 14.4. Mappings between the model space (left) and the data space (right). The range of A is indicated by the grey area within the data space. The range of the backprojection Aᵀ is indicated by the grey area in the model space.

This equation makes it clear that removing zero singular values is not sufficient, since the errors blow up as λ_j⁻², rendering the incorporation of small λ_j very dangerous. Dealing with small eigenvalues is known as regularization of the problem. Before we discuss this in more detail, we need to show the connection between the development given here and the theory of singular value decomposition which is more commonly found in the literature.

One way of looking at the system Am = d is to see the components m_i as weights in a summation of the columns of A to fit the data vector d. The columns make up the range of A in the data space (Figure 14.4). Similarly, the rows of A (the columns of Aᵀ) make up the range of the backprojection Aᵀ in the model space. The rest of the model space is the nullspace: if m is in the nullspace, Am = 0. Components in the nullspace do not contribute to the data fit, but add to the norm of m. We find the minimum norm solution by avoiding any components in the nullspace, in other words by selecting a model in the range of Aᵀ:

    m̂ = Aᵀy,

and find y by solving:

    AAᵀy = d.

The determinant of AAᵀ is likely to be zero, so just as in the case of least squares we shall wish to eliminate zero eigenvalues. Let the eigenvectors of AAᵀ be u_i with eigenvalues λ̄_i²:

    AAᵀ U = U Λ̄².    (14.10)

Since AAᵀ is symmetric, the eigenvectors are orthogonal and we can scale them to be orthonormal, such that UᵀU = UUᵀ = I. Multiplying (14.10) on the left by

Fig. 14.5. (a) The full eigenvalue problem for AAᵀ leads to a matrix Λ with small or zero eigenvalues on the diagonal. (b) Removing the zero eigenvalues, truncating U, Λ and V to U_K, Λ_K and V_K, has no effect on A.

Aᵀ and grouping AᵀU, we see that AᵀU is a matrix of eigenvectors of AᵀA:

    AᵀA (AᵀU) = (AᵀU) Λ̄²,

and comparison with (14.6) shows that Aᵀu_i must be a constant times v_i, and λ̄_i = λ_i. We choose the constant to be λ_i, so that:

    AᵀU = VΛ.    (14.11)

Multiplying this on the left by A we obtain:

    AAᵀU = UΛ² = AVΛ,

or, dividing on the right by λ_i for all λ_i ≠ 0, and defining u_i λ_i = Av_i (= 0) with a nullspace eigenvector v_i in case λ_i = 0:

    AV = UΛ.    (14.12)

In the same way, by multiplying (14.12) on the right by Vᵀ we find:

    A = UΛVᵀ,    (14.13)

which is the singular value decomposition of A. Note that in this development we have carefully avoided using the inverse of Λ, so there is no need to truncate it to exclude zero singular values. However, because the tail of the diagonal matrix Λ contains only zeroes, (14.13) is equivalent to the truncated version (Figure 14.5):

    A = UΛVᵀ = U_K Λ_K V_Kᵀ.    (14.14)

Exercises

Exercise 14.3 Show that the choice (14.11) indeed implies that UᵀU = I. Hint: use (14.12).

Exercise 14.4 Show that m̂ = V_K Λ_K⁻¹ U_Kᵀ d is equivalent to m̂_min norm.
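Both the truncation (14.14) and the solution formula of Exercise 14.4 can be checked against a library SVD routine; the rank-deficient matrix here is synthetic:

```python
import numpy as np

# Verify the singular value decomposition (14.13)-(14.14) and the
# minimum-norm solution of Exercise 14.4 on a synthetic rank-deficient A.
rng = np.random.default_rng(5)
N, M, K = 25, 12, 7
A = rng.normal(size=(N, K)) @ rng.normal(size=(K, M))   # rank K
d = A @ rng.normal(size=M)

U, lam, Vt = np.linalg.svd(A, full_matrices=False)
print(lam)            # K nonzero singular values; the rest are ~ 0

# Truncation (14.14): A = U_K Lambda_K V_K^T, unchanged by dropping zeros.
A_K = U[:, :K] * lam[:K] @ Vt[:K, :]
print(np.allclose(A, A_K))

# Exercise 14.4: m_hat = V_K Lambda_K^{-1} U_K^T d is the minimum-norm solution.
m_hat = Vt[:K, :].T @ (U[:, :K].T @ d / lam[:K])
m_ls = np.linalg.lstsq(A, d, rcond=None)[0]
print(np.allclose(m_hat, m_ls))
```

`numpy.linalg.svd` returns the singular values in descending order, so the truncation simply keeps the first K columns of U and rows of Vᵀ.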

14.4 Tikhonov regularization

The truncation to include only nonzero singular values is an example of regularization of the inverse problem. Removing zero λ_i is not sufficient, however, since small singular values may give rise to large modelling errors, as shown by (14.9). This equation tells us that small errors in the data vector may cause very large excursions in model space in the direction of v_k if λ_k << 1. It thus seems wise to truncate V in (14.13) even further, and exclude eigenvectors belonging to small singular values. The price we pay is a small increase in χ², but we are rewarded by a significant reduction in the modelling error.

We could apply a sharp cut-off by choosing K at some nonzero threshold level for the singular values. Less critical to the choice of threshold is a tapered cut-off. We show that the latter approach is equivalent to adding M equations of the form ε_n m_i = 0, with ε_n small, to the tomographic system. Such equations act as artificial data that bias the model parameters towards zero:

    [ A ; ε_n I ] m = [ d ; 0 ].    (14.15)

If the j-th column of A, associated with parameter m_j, has large elements, the addition of one additional constraint ε_n m_j = 0 will have very little influence. But the more m_j is underdetermined by the undamped system, the more the damping will push m_j towards zero. The least squares solution of (14.15) is:

    (AᵀA + ε_n² I) m = Aᵀd.    (14.16)

The advantage of the formulation (14.15) is that it can easily be solved iteratively, without a need for singular value decomposition. But the solution of (14.15) does have a simple representation in terms of singular values, and it is instructive to analyse it with SVD. If v_k is an eigenvector of AᵀA with eigenvalue λ_k², then the damped matrix gives:

    (AᵀA + ε_n² I) v_k = (λ_k² + ε_n²) v_k,    (14.17)

and we see that the damped system has the same eigenvectors, but with raised eigenvalues λ_k² + ε_n² > 0.
The minimum norm solution (14.7) is therefore replaced by:

    m̂_damped = V_K (Λ_K² + ε_n² I)⁻¹ V_Kᵀ Aᵀd,    (14.18)

with the posteriori model variance given by:

    σ²_{m_i} = Σ_{j=1}^K V_ij² / (λ_j² + ε_n²).    (14.19)
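A small synthetic check that the augmented system (14.15), the normal equations (14.16) and the SVD expression (14.18) all give the same damped solution:

```python
import numpy as np

# Check that the damped (Tikhonov) solution computed from the augmented
# system (14.15) matches (14.16) and the SVD form (14.18); synthetic data.
rng = np.random.default_rng(6)
N, M = 40, 15
A = rng.normal(size=(N, M))
d = A @ rng.normal(size=M) + rng.normal(size=N)
eps = 0.5

# Augmented least squares (14.15): append eps*I rows and zero data.
A_aug = np.vstack([A, eps * np.eye(M)])
d_aug = np.concatenate([d, np.zeros(M)])
m_aug = np.linalg.lstsq(A_aug, d_aug, rcond=None)[0]

# Normal equations (14.16).
m_ne = np.linalg.solve(A.T @ A + eps**2 * np.eye(M), A.T @ d)

# SVD form (14.18): components of U^T d scaled by lambda/(lambda^2 + eps^2).
U, lam, Vt = np.linalg.svd(A, full_matrices=False)
m_svd = Vt.T @ ((lam * (U.T @ d)) / (lam**2 + eps**2))

print(np.allclose(m_aug, m_ne), np.allclose(m_ne, m_svd))
```

The SVD form makes the tapered cut-off explicit: each component is multiplied by λ_j/(λ_j² + ε_n²), which is close to 1/λ_j for λ_j >> ε_n but goes smoothly to zero for λ_j << ε_n.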

Since there are no zero eigenvalues, we may set K = M, but of course this maximizes the variance and some truncation may still be needed. For simplicity, we assumed a damping with the same ε_n everywhere on the diagonal. The method is often referred to as Tikhonov regularization, after its original discoverer [364]. Because one adds ε_n² to the diagonal of AᵀA it is also known as ridge regression. Spakman and Nolet [338] vary the damping factor ε_n along the diagonal. When corrections are part of the model, one should vary damping factors such that damping results in corrections that are reasonable in view of the prior uncertainty (for example, one would judge corrections as large as 100 km for hypocentral parameters usually unacceptable and increase ε_n for those corrections).

A comparison of (14.19) with (14.9) shows that damped model errors blow up at most by a factor ε_n⁻¹. Thus, damping reduces the variance of the solution. This comes at a price, however: by discarding eigenvectors, we reduce our ability to shape the model. The small eigenvalues are usually associated with vectors that are strongly oscillating in space: the positive and negative parts cancel upon integration and the resulting integral (12.12) is small. Damping small eigenvalues is thus expected to lead to smoother models. However, even long-wavelength features of the model may be biased towards zero because of regularization. The fact that biased estimations produce smaller variances is a well known phenomenon in statistical estimation, and it is easily misunderstood: one can obtain a very small model parameter m_i with a very small posteriori variance σ_i², yet learn nothing about the model because the bias is of the order of the true m_i. We shall come back to this in the section on resolution, but first investigate a more powerful regularization method, based on Bayesian statistics.

Exercises

Exercise 14.5 Show that the minimization of ‖Am − d‖² + ε_n²‖m‖² leads to (14.16).
Exercise 14.6 In the L-curve for (14.18), indicate where ε_n → 0 and where ε_n → ∞.

14.5 Bayesian inference

The simple Tikhonov regularization by norm damping we introduced in the previous section, while reducing the danger of excessive error propagation, is usually not satisfactory from a geophysical point of view. At first sight, this may seem surprising: for, when the m_i represent perturbations with respect to a background model, the damping towards 0 is defensible if we prefer the model values given by the background model in the absence of any other information. However, if the

information given by the data is unequally distributed, some parts of the model may be damped more than others, introducing an apparent structure in m that may be very misleading. The error estimate (14.19) does not represent the full modelling error because it neglects the bias. In general, we would like the model to have a minimum of unwarranted structure, or detail. Jackson [145] and Tarantola [349], significantly extending earlier work by Franklin [105], introduced the Bayesian method into geophysical inversion to deal with this problem, named after the Reverend Thomas Bayes, a British mathematician whose theorem on joint probabilities is a cornerstone of this inference method. We shall give a brief exposé of Bayesian estimation for the case of N observations in a data vector d_obs.

Let P(m) be the prior probability density for the model m = (m_1, m_2, ..., m_M), e.g. a Gaussian probability of the form:

    P(m) = 1/[(2π)^{M/2} |det C_m|^{1/2}] exp[ −½ mᵀ C_m⁻¹ m ].    (14.20)

Here, C_m is the prior covariance matrix for the model parameters. By 'prior' we mean that we generally have an idea of the allowable variations in the model values, e.g. how much the 3D Earth may differ from a 1D background model without violating more general laws of physics. We may express such knowledge as a prior probability density for the model values. The diagonal elements of C_m are the variances of that prior distribution. The off-diagonal elements reflect the correlation of model parameters; often it helps to think of them as describing the likely smoothness of the model. In a strict Bayesian philosophy such constraints may be subjective. This, however, is not to say that we may impose constraints following the whim of an arbitrary person.
An experienced geophysicist may often develop a very good intuition of the prior uncertainty of model parameters, perhaps because he has done experiments in the laboratory on analogue materials, or because he has experience with tomographic inversions in similar geological provinces. We shall classify such defensible subjective notions to be 'objective' after all.

The random errors in our observations cause the observed data vector d_obs to deviate from the true (i.e. error-free) data. For the data we assume the normal distribution (14.2). Assuming the linear relationship Am = d has no errors (or incorporating those errors into σ_i as discussed before), we find the conditional probability density for the observed data, given a model m:

    P(d|m) = 1/[(2π)^{N/2} |det C_d|^{1/2}] exp[ −½ (Am − d_obs)ᵀ C_d⁻¹ (Am − d_obs) ],    (14.21)

where C_d is the matrix with data covariance, usually taken to be diagonal with entries σ_i² because we have little knowledge about data correlations. Though we have an expression for the data probability P(d|m), for solution of the inverse problem we are more interested in the probability of the model, given the observed data d_obs. This is where Bayes' theorem is useful. It starts from the recognition that the joint probability can be split up in a conditional and marginal probability in two ways, assuming the probabilities for model and data are independent:

    P(m, d_obs) = P(m|d_obs) P(d_obs) = P(d_obs|m) P(m),

from which we find Bayes' theorem:

    P(m|d_obs) = P(d_obs|m) P(m) / P(d_obs).    (14.22)

Using (14.20) and (14.21):

    P(m|d_obs) ∝ exp[ −½ (Am − d_obs)ᵀ C_d⁻¹ (Am − d_obs) − ½ mᵀ C_m⁻¹ m ].

Thus, we obtain the maximum likelihood solution by minimizing:

    (Am − d_obs)ᵀ C_d⁻¹ (Am − d_obs) + mᵀ C_m⁻¹ m = χ²(m) + mᵀ C_m⁻¹ m = min,

or, differentiating with respect to m_i:

    Aᵀ C_d⁻¹ (Am − d_obs) + C_m⁻¹ m = 0.

One sees that this is again a system of normal equations belonging to the damped system:

    [ C_d^{−1/2} A ; C_m^{−1/2} ] m = [ C_d^{−1/2} d_obs ; 0 ].    (14.23)

Of course, if we have already scaled the data to be univariant, the data covariance matrix is C_d = I. This simply shows that we are sooner or later obliged to scale the system with the data uncertainty. The prior smoothness constraint is unlikely to be a hard constraint, and in practice we face again a tradeoff between the data fit and the damping of the model, much as in Figure 14.2. We obtain a manageable flexibility in the tradeoff between smoothness of the model and χ² by scaling C_m^{−1/2} with a scaling factor ε. Varying ε allows us to tweak the model damping until χ² ≈ N. Equation (14.23) is thus usually encountered in the equivalent, simplified

form:

    [ A ; ε C_m^{−1/2} ] m = [ d ; 0 ].    (14.24)

How should one specify C_m? The model covariance essentially tells us how model parameters are correlated. Usually, such correlations are only high for nearby parameters. Thus, C_m smoothes the model when operating on m. Conversely, C_m⁻¹ roughens the model, and (14.24) expresses the penalization of those model elements that dominate after the roughening operation. The simplest roughening operator is the Laplacian ∇², which is zero when a model parameter is exactly the average of its neighbours. If we parametrize the model with tetrahedra or blocks, so that every node has well-defined nearest neighbours, we can minimize the difference between parameter m_i and the average of its neighbours (Nolet [235]):

    ½ Σ_i (1/N_i) Σ_{j∈𝒩_i} (m_i − m_j)² = min,

where 𝒩_i is the set of N_i nearest neighbours of node i. Differentiating with respect to m_k gives M equations:

    m_k − (1/N_k) Σ_{j∈𝒩_k} m_j = 0,    (14.25)

in which we recognize the k-th row of C_m^{−1/2} m in (14.24).

One disadvantage of the system (14.24) is that it often converges much more slowly than the Tikhonov system (14.15) in iterative matrix solvers (VanDecar and Snieder [381]). The reason is that we are simultaneously solving a system arising from a set of integral equations, and the regularization system which involves finite-differencing. Without sacrificing the Bayesian philosophy, it is possible to transform (14.24) to a simple norm damping. Spakman and Nolet [338] introduce m′ = C_m^{−1/2} m. Inserting this into (14.24) we find:

    [ A C_m^{1/2} ; ε I ] m′ = [ d ; 0 ].    (14.26)

Though it is not practical to invert the matrix C_m^{−1/2} that is implicit in (14.25) to find an exact expression for C_m^{1/2}, many explicit smoothers of m may act as an appropriate correlation matrix C_m^{1/2} for regularization purposes. After inversion for m′, the tomographic model is obtained from the smoothing operation m = C_m^{1/2} m′. The system (14.26) has the same form as the Tikhonov regularization (14.15).
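A sketch of the regularized system (14.24), using the nearest-neighbour roughener of (14.25) on a hypothetical 1-D chain of nodes (the matrix, data and damping values are all invented):

```python
import numpy as np

# Sketch of the augmented system (14.24) with the nearest-neighbour
# roughening operator of (14.25) on a 1-D chain of nodes.
rng = np.random.default_rng(7)
M, N = 30, 20
A = rng.normal(size=(N, M))
d = A @ np.sin(np.linspace(0, np.pi, M)) + 0.1 * rng.normal(size=N)

# Roughening matrix R: row k encodes m_k - (1/N_k) * sum of its neighbours,
# playing the role of C_m^{-1/2} in (14.24).
R = np.eye(M)
for k in range(M):
    nbrs = [j for j in (k - 1, k + 1) if 0 <= j < M]
    for j in nbrs:
        R[k, j] = -1.0 / len(nbrs)

def invert(eps):
    """Least-squares solution of the augmented system [A; eps*R] m = [d; 0]."""
    A_aug = np.vstack([A, eps * R])
    d_aug = np.concatenate([d, np.zeros(M)])
    return np.linalg.lstsq(A_aug, d_aug, rcond=None)[0]

m_weak, m_strong = invert(0.1), invert(10.0)
# Stronger damping penalizes the roughened model more heavily:
print(np.linalg.norm(R @ m_weak), np.linalg.norm(R @ m_strong))
```

Because the zero data rows penalize ‖R m‖ rather than ‖m‖, the damping pulls the model towards local averages (smoothness) instead of towards zero amplitude.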

Despite this resemblance, in my own experience the acceleration of convergence is only modest compared to inverting (14.24) directly.

14.6 Information theory

Given the lack of resolution, geophysicists are condemned to accept the fact that there are infinitely many models that all satisfy the data within the error bounds. The Earth is a laboratory, but one that is very different from those in experimental physics, where we are taught to carefully design an experiment so that we have full control. Understandably, we feel unhappy with a wide choice of regularizations, resulting in our inability to come up with a unique outcome of the experiment. The temptation is always to resort to some higher if not metaphysical principle that allows us to choose the 'best' model among the infinite set before we start plotting tomographic cross-sections. It should be recognized that this simply replaces one subjective choice (that of a model) with another (that of a criterion). Though some tomographers religiously adhere to such metaphysical considerations, I readily confess to being an atheist. In my view, such external criteria are simply a matter of taste. As an example, the methods of regularization are related to concepts known from the field of information theory, notably to the concept of information entropy. We shall briefly look into this, but warn the reader that, in the end, there is no panacea for our fall from Paradise.

We start with a simple application of the concept of information entropy: suppose we have only one datum, a delay d_1 measured along a ray of length L. We then have a 1 × M system, or just one equation:

    d_1 = ∫ m(r) ds = Σ_i m_i Δs_i.

As a thought experiment, assume that the segments Δs_i are of equal length Δs, and that we allow only one of them to cause the travel time anomaly. Which one? Information theory looks at this problem in the following way: let P_i be the probability that m_i ≠ 0. By the law of probabilities, Σ_i P_i = 1.
Intuitively, we judge that in the absence of any other information, all P_i should be equal; if not, this would constitute additional information on the m_i. Formally, we may get to this conclusion by defining the information entropy:

   I = − Σ_i P_i ln P_i ,                                              (14.27)

which can be understood if we consider that any P_i = 0 yields a zero contribution to I, thus minimizing the disorder in the solution (note that if any P_i = 1, all others must be 0, again minimizing disorder). We express our desire to have a solution

with minimum unwarranted information as the desire to maximize I, while still satisfying Σ_i P_i = 1. Such problems are solved with the method of Lagrange multipliers. This method recognizes that the unconstrained maximum of I does not satisfy the constraint that Σ_i P_i = 1. So we relax the maximum condition by adding λ(Σ_i P_i − 1) to I and require:

   I + λ(Σ_i P_i − 1) = max.

Since the added term is required to be zero, the function to maximize has not really changed as long as we satisfy that constraint. All we have done is add another dimension, or dependent variable, the Lagrange multiplier λ. We recover the original constraint by maximizing with respect to λ. Taking the derivative with respect to P_i now gives an equation that involves λ:

   ∂/∂P_i [ − Σ_i P_i ln P_i + λ (Σ_i P_i − 1) ] = 0,

or

   − ln P_i − 1 + λ = 0   ⇒   P_i = e^{λ−1} .

We find the Lagrange multiplier from the constraint:

   Σ_i P_i = N e^{λ−1} = 1   ⇒   λ = 1 − ln N,

or

   P_i = e^{ln(1/N)} = 1/N .

Thus, if we impose the criterion of maximum entropy for the information in our model, all m_i are equally likely to contribute. The reasoning does not change much if we allow every m_i to contribute to the anomaly and again maximize (14.27). In the absence of further information, there is no reason to assume that one would contribute more than any other, and all are equal: m_i = d_1/(M Δs) = d_1/L. The smoothest model is the model with the highest information entropy. Such reasoning provides a "higher" principle to justify the damping towards smooth models. Constable et al. [64] named the construction of the smoothest model that satisfies the data with the prescribed tolerance "Occam's inversion", after the fourteenth-century philosopher William of Occam, or Ockham, who advocated the principle that simple explanations are more likely than complicated ones and who applied what came to be known as Occam's razor to eliminate unnecessary presuppositions.
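The Lagrange-multiplier result above is easy to verify numerically. The sketch below (Python/NumPy; the helper `entropy` is ours, not from the text) checks that the uniform distribution P_i = 1/N attains the entropy value ln N, and that no random normalized probability vector exceeds it.

```python
import numpy as np

# Numerical check of the maximum-entropy result: among normalized
# probability vectors of length N, the uniform P_i = 1/N maximizes
# I = -sum(P ln P), with maximum value ln N.
rng = np.random.default_rng(0)
N = 8

def entropy(P):
    P = P[P > 0]                    # 0 * ln 0 -> 0 by convention
    return -np.sum(P * np.log(P))

I_uniform = entropy(np.full(N, 1.0 / N))
print(np.isclose(I_uniform, np.log(N)))   # True

# Random normalized vectors never beat the uniform distribution.
trials = rng.random((1000, N))
trials /= trials.sum(axis=1, keepdims=True)
print(all(entropy(P) <= I_uniform + 1e-12 for P in trials))  # True
```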
However, one should not assume that smooth models are free of presuppositions: in fact, if we apply (14.25) in (14.24) we arbitrarily impose that smooth structures

are more likely than others. Artefacts may be suppressed, but so will sharp boundaries, e.g. the top of a subduction zone. Loris et al. [188], who invert for models that can be expanded with the fewest wavelets of a given wavelet basis, provide a variant on Occam's razor that is in principle able to preserve sharp features while eliminating unwarranted detail.

An interesting connection arises if we assume that sparse model parametrizations are a priori more probable than parametrizations with many basis functions. Assume that the prior model probability P(m) is inversely proportional to the number of basis functions with nonzero coefficients in an exponential fashion: P(m) ∝ e^{−K}, where K is the number of basis functions. If we insert this into Bayes' equation, we find that the maximum likelihood equation becomes:

   ln χ²(m) + K = min,

which is Akaike's [1] criterion for the optimum selection of the number of parameters K, used in seismic tomography by Zollo et al. [422]. Note, however, that this criterion lacks a crucial element: it does not impose any restrictions on the shape of the basis functions. Presumably one could use it by ranking independently defined basis functions in order of increasing roughness, again appealing to William of Occam for his blessing.

14.7 Numerical considerations

With N often running into the millions of data, and M only one order of magnitude smaller than N, the matrix system Am = d is gigantic in size. Some reduction in the number of rows N can be obtained by combining (almost) coincident raypaths into summary rays (see Section 6.1). The correct way to do this is to sum the rows of all N_S data belonging to a summary ray group S into one new average row that replaces them in the matrix:

   (1/N_S) Σ_{i∈S} Σ_{j=1}^{M} A_ij m_j = (1/N_S) Σ_{i∈S} d_i ± σ_S ,  (14.28)

with the variance σ_S² equal to

   σ_S² = (1/N_S²) Σ_{i∈S} σ_i² + σ_0² .

Here, σ_0² is added to account for lateral variations within the summary ray that affect the variance of the sum. Gudmundsson et al. [126] analysed the relationship

between the width of a bundle and the variance in teleseismic P delay times from the ISC catalogue.

Care must be taken in defining the volume that defines the members of the summary ray. Events with a common epicentre but different depth provide important vertical resolution in the earthquake region and should often be treated separately. When using ray theory and large cells to parametrize the model we do not lose much information if we average over large volumes with size comparable to the model cells. But the Fréchet kernels of finite-frequency theory show that the sensitivity narrows down near source and receiver, and summarizing may undo some of the benefits of a finite-frequency approach.

Summary rays are sometimes applied to counteract the effect of dominant ray trajectories on the model (which may lead to strong parameter correlations along the prevailing ray direction) by ignoring the reduction of the error in the average. However, this violates statistical theory if we seek the maximum likelihood solution for normally distributed errors. The uneven distribution of sensitivity is better fought using unstructured grids with adapted resolution, and smoothness damping using a correlation matrix C_m that promotes equal parameter correlation in all directions.

If the parametrization is local, many elements of A are zero. For a least-squares solution, A^T A has lost much of this sparseness, though, so we shall wish to avoid constructing A^T A explicitly. We can obtain a large savings in memory space by only storing the nonzero elements of A. We do this row-wise; surprisingly, the multiplications Am and A^T d can both be done in row-order, using the following row-action algorithms:

   p = Am:                          q = A^T d:
     for i = 1, N                     for i = 1, N
       for j = 1, M                     for j = 1, M
         p_i ← p_i + A_ij m_j             q_j ← q_j + A_ij d_i

where only nonzero elements A_ij should take part. This often leads to complicated bookkeeping.
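A minimal sketch of the row-action algorithms above (Python/NumPy; the storage scheme and helper names are illustrative, not from the text). Only the nonzero elements of each row are stored, and both products are formed in row order, so neither A^T nor A^T A is ever built; the dense products serve only to validate the bookkeeping.

```python
import numpy as np

# Row-wise sparse storage: each row kept as (column indices, values),
# nonzeros only.  Both A m and A^T d are then computed in row order.
rng = np.random.default_rng(1)
N, M = 6, 4
A = np.where(rng.random((N, M)) < 0.4, rng.standard_normal((N, M)), 0.0)

rows = []
for i in range(N):
    (cols,) = np.nonzero(A[i])
    rows.append((cols, A[i, cols]))

def products(rows, m, d):
    p, q = np.zeros(len(d)), np.zeros(len(m))
    for i, (cols, vals) in enumerate(rows):
        p[i] = vals @ m[cols]        # p_i <- p_i + A_ij m_j
        q[cols] += vals * d[i]       # q_j <- q_j + A_ij d_i
    return p, q

m = rng.standard_normal(M)
d = rng.standard_normal(N)
p, q = products(rows, m, d)
print(np.allclose(p, A @ m) and np.allclose(q, A.T @ d))  # True
```

In a production code no dense A exists, so the dense comparison must be replaced by an indirect check on the forward and adjoint routines.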
Claerbout's [60] dot-product test, q · Ap = (A^T q) · p for random vectors p and q, can be used as a first (though not conclusive) test to validate the coding.

Early tomographic efforts in the medical and biological sciences led to a rediscovery of row-action methods (Censor [44]). The early methods, however, had the disadvantage that they introduced an unwanted scaling into the problem that

[Footnote: The explicit computation and use of A^T A is also unwise from the point of view of numerical stability, since its condition number (the measure of the sensitivity of the solution to data errors) is the square of that of A itself. For a discussion of this issue see Numerical Recipes [269].]

interferes with the optimal regularization one wishes to impose (see van der Sluis and van der Vorst [377] for a detailed analysis). Conjugate gradient methods work without implicit scaling. The stablest algorithm known today is LSQR, developed by Paige and Saunders [249] and introduced into seismic tomography by the author [233, 234].

We give a short derivation of LSQR. The main idea of the algorithm is to develop orthonormal bases μ_k in model space, and ρ_k in data space. The first basis vector in data space, ρ_1, is simply in the direction of the data vector: β_1 ρ_1 = d, and μ_1 is the backprojection of ρ_1: α_1 μ_1 = A^T ρ_1. The coefficients α_i and β_i are normalization factors such that |ρ_i| = |μ_i| = 1. We find the second basis vector in data space by mapping μ_1 into data space, and orthogonalizing to ρ_1:

   β_2 ρ_2 = Aμ_1 − (Aμ_1 · ρ_1) ρ_1 = Aμ_1 − α_1 ρ_1 ,

where we used Aμ_1 · ρ_1 = μ_1 · A^T ρ_1 = μ_1 · α_1 μ_1 = α_1. Similarly:

   α_2 μ_2 = A^T ρ_2 − β_2 μ_1 .

Although it would seem that we have to go through more and lengthier orthogonalizations as the basis grows, it turns out that (at least in theory, ignoring roundoff errors) orthogonalization to the previous basis vector only is sufficient. For example, for ρ_3 we find β_3 ρ_3 = Aμ_2 − α_2 ρ_2. Taking the dot product with ρ_1, we find:

   β_3 ρ_3 · ρ_1 = Aμ_2 · ρ_1 − α_2 ρ_2 · ρ_1 = μ_2 · A^T ρ_1 = μ_2 · (α_1 μ_1) = 0,

and ρ_3 is perpendicular to ρ_1. A similar proof by induction can be made for all ρ_k and μ_k in the iterative sequence:

   β_{k+1} ρ_{k+1} = Aμ_k − α_k ρ_k ,                                  (14.29)
   α_{k+1} μ_{k+1} = A^T ρ_{k+1} − β_{k+1} μ_k .                       (14.30)

If we expand the solution after k iterations:

   m_k = Σ_{j=1}^{k} γ_j μ_j ,   then   Σ_{j=1}^{k} γ_j Aμ_j = d,

and with (14.29):

   Σ_{j=1}^{k} γ_j (β_{j+1} ρ_{j+1} + α_j ρ_j) = β_1 ρ_1 .

Taking the dot product of this with ρ_1 yields γ_1 = β_1/α_1, whereas subsequent factors are found by taking the product with ρ_k to give γ_k = −β_k γ_{k−1}/α_k.

14.8 Appendix D: Some concepts of probability theory and statistics

I assume the reader is familiar with discrete probabilities, such as the probability that a flipped coin will come up with head or tail. If added up for all possible outcomes, the sum of all probabilities is 1. This concept of probability cannot directly be applied to variables that can take any value within prescribed bounds. For such variables we use a probability density. The probability density P(X_0) for a random variable X at X_0 is equal to the probability that X is within the interval X_0 ≤ X ≤ X_0 + dX, divided by dX. This can be extended to multiple variables. If P(d) is the probability density for the data in vector d, then the probability that we find the data within a small N-dimensional volume d^N d in data space is given by 0 ≤ P(d) d^N d ≤ 1. We only deal with normalized probability densities, i.e. the integral over all data satisfies:

   ∫ P(d) d^N d = 1 .                                                  (14.31)

Joint probability densities give the probability that two or more random variables take a particular value, e.g. P(m, d). If the distributions for the two variables are independent, the joint probability density is the product of the individual densities:

   P(m, d) = P(m) P(d) .                                               (14.32)

Conversely, one finds the marginal probability density of one of the variables by integrating out the second variable:

   P(m) = ∫ P(m, d) d^N d .                                            (14.33)

The conditional probability density gives the probability of the first variable under the condition that the second variable has a given value, e.g. P(m|d_obs) gives the probability density for model m given an observed set of data in d_obs.

The expectation or expected value E(X) of X is defined as the average over all values of X weighted by the probability density:

   X̄ ≡ E(X) = ∫ P(X) X dX .                                           (14.34)
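As a quick numerical illustration of (14.34), the sketch below (Python/NumPy, not part of the original text) evaluates the integral on a fine grid for a normal density: the quadrature recovers E(X) = μ, and the same sum with (X − X̄)² recovers the variance σ² defined in (14.37) below.

```python
import numpy as np

# Quadrature check of (14.34): for a normal density with mean mu and
# standard deviation sigma, summing P(X) X dX over a fine grid gives
# E(X) = mu, and summing P(X) (X - E(X))^2 dX gives sigma^2.
mu, sigma = 1.5, 0.8
X = np.linspace(mu - 8 * sigma, mu + 8 * sigma, 20001)
dX = X[1] - X[0]
P = np.exp(-((X - mu) ** 2) / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

EX = np.sum(P * X) * dX                  # discrete version of int P(X) X dX
var = np.sum(P * (X - EX) ** 2) * dX
print(np.isclose(EX, mu, rtol=1e-4), np.isclose(var, sigma**2, rtol=1e-4))
```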

The expectation is a linear functional:

   E(aX + bY) = a E(X) + b E(Y) ,                                      (14.35)

and for independent variables it is separable:

   E(XY) = E(X) E(Y) .                                                 (14.36)

The variance is a measure of the spread of X around its expected value:

   σ_X² = E[(X − X̄)²] ,                                               (14.37)

where σ_X itself is known as the standard deviation. The covariance between two random variables X and Y is defined as

   Cov(X, Y) = E[(X − X̄)(Y − Ȳ)] .                                    (14.38)

In the case of an N-tuple of variables this defines an N × N covariance matrix, with the variances on the diagonal. The covariance matrix of a linear combination of variables is found by applying the linearity (14.35). Consider a linear transformation x = T y. Since the spread of a variable does not change if we redefine the average as zero, we can assume that E(x_i) = 0 without loss of generality. Then:

   Cov(x_i, x_j) = E[ (Σ_k T_ik y_k)(Σ_l T_jl y_l) ] = Σ_k Σ_l T_ik T_jl E(y_k y_l)
                 = Σ_k Σ_l T_ik T_jl Cov(y_k, y_l) ,

or, in matrix notation:

   C_x = T C_y T^T .                                                   (14.39)
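The propagation law (14.39) is easily checked by simulation. The sketch below (Python/NumPy, not part of the original text; the particular C_y and T are arbitrary examples) draws samples y with a chosen covariance, transforms them with x = T y, and compares the sample covariance of x with T C_y T^T.

```python
import numpy as np

# Monte-Carlo check of (14.39): sample covariance of x = T y should
# approach T C_y T^T for a large number of samples.
rng = np.random.default_rng(3)
C_y = np.array([[2.0, 0.5],
                [0.5, 1.0]])
T = np.array([[1.0, -1.0],
              [0.5,  2.0],
              [3.0,  0.0]])

y = rng.multivariate_normal(mean=[0.0, 0.0], cov=C_y, size=200_000)
x = y @ T.T                       # x = T y, applied to every sample
C_x = np.cov(x, rowvar=False)     # 3 x 3 sample covariance matrix

print(np.allclose(C_x, T @ C_y @ T.T, rtol=0.05, atol=0.05))  # True
```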


More information

. Using a multinomial model gives us the following equation for P d. , with respect to same length term sequences.

. Using a multinomial model gives us the following equation for P d. , with respect to same length term sequences. S 63 Lecture 8 2/2/26 Lecturer Lillian Lee Scribes Peter Babinski, Davi Lin Basic Language Moeling Approach I. Special ase of LM-base Approach a. Recap of Formulas an Terms b. Fixing θ? c. About that Multinomial

More information

fv = ikφ n (11.1) + fu n = y v n iσ iku n + gh n. (11.3) n

fv = ikφ n (11.1) + fu n = y v n iσ iku n + gh n. (11.3) n Chapter 11 Rossby waves Supplemental reaing: Pelosky 1 (1979), sections 3.1 3 11.1 Shallow water equations When consiering the general problem of linearize oscillations in a static, arbitrarily stratifie

More information

Stable and compact finite difference schemes

Stable and compact finite difference schemes Center for Turbulence Research Annual Research Briefs 2006 2 Stable an compact finite ifference schemes By K. Mattsson, M. Svär AND M. Shoeybi. Motivation an objectives Compact secon erivatives have long

More information

u!i = a T u = 0. Then S satisfies

u!i = a T u = 0. Then S satisfies Deterministic Conitions for Subspace Ientifiability from Incomplete Sampling Daniel L Pimentel-Alarcón, Nigel Boston, Robert D Nowak University of Wisconsin-Maison Abstract Consier an r-imensional subspace

More information

Convergence of Random Walks

Convergence of Random Walks Chapter 16 Convergence of Ranom Walks This lecture examines the convergence of ranom walks to the Wiener process. This is very important both physically an statistically, an illustrates the utility of

More information

Experiment 2, Physics 2BL

Experiment 2, Physics 2BL Experiment 2, Physics 2BL Deuction of Mass Distributions. Last Upate: 2009-05-03 Preparation Before this experiment, we recommen you review or familiarize yourself with the following: Chapters 4-6 in Taylor

More information

Function Spaces. 1 Hilbert Spaces

Function Spaces. 1 Hilbert Spaces Function Spaces A function space is a set of functions F that has some structure. Often a nonparametric regression function or classifier is chosen to lie in some function space, where the assume structure

More information

Introduction to the Vlasov-Poisson system

Introduction to the Vlasov-Poisson system Introuction to the Vlasov-Poisson system Simone Calogero 1 The Vlasov equation Consier a particle with mass m > 0. Let x(t) R 3 enote the position of the particle at time t R an v(t) = ẋ(t) = x(t)/t its

More information

Acute sets in Euclidean spaces

Acute sets in Euclidean spaces Acute sets in Eucliean spaces Viktor Harangi April, 011 Abstract A finite set H in R is calle an acute set if any angle etermine by three points of H is acute. We examine the maximal carinality α() of

More information

1 The Derivative of ln(x)

1 The Derivative of ln(x) Monay, December 3, 2007 The Derivative of ln() 1 The Derivative of ln() The first term or semester of most calculus courses will inclue the it efinition of the erivative an will work out, long han, a number

More information

Quantum Search on the Spatial Grid

Quantum Search on the Spatial Grid Quantum Search on the Spatial Gri Matthew D. Falk MIT 2012, 550 Memorial Drive, Cambrige, MA 02139 (Date: December 11, 2012) This paper explores Quantum Search on the two imensional spatial gri. Recent

More information

In the usual geometric derivation of Bragg s Law one assumes that crystalline

In the usual geometric derivation of Bragg s Law one assumes that crystalline Diffraction Principles In the usual geometric erivation of ragg s Law one assumes that crystalline arrays of atoms iffract X-rays just as the regularly etche lines of a grating iffract light. While this

More information

Admin BACKPROPAGATION. Neural network. Neural network 11/3/16. Assignment 7. Assignment 8 Goals today. David Kauchak CS158 Fall 2016

Admin BACKPROPAGATION. Neural network. Neural network 11/3/16. Assignment 7. Assignment 8 Goals today. David Kauchak CS158 Fall 2016 Amin Assignment 7 Assignment 8 Goals toay BACKPROPAGATION Davi Kauchak CS58 Fall 206 Neural network Neural network inputs inputs some inputs are provie/ entere Iniviual perceptrons/ neurons Neural network

More information

Multi-View Clustering via Canonical Correlation Analysis

Multi-View Clustering via Canonical Correlation Analysis Technical Report TTI-TR-2008-5 Multi-View Clustering via Canonical Correlation Analysis Kamalika Chauhuri UC San Diego Sham M. Kakae Toyota Technological Institute at Chicago ABSTRACT Clustering ata in

More information

A Review of Multiple Try MCMC algorithms for Signal Processing

A Review of Multiple Try MCMC algorithms for Signal Processing A Review of Multiple Try MCMC algorithms for Signal Processing Luca Martino Image Processing Lab., Universitat e València (Spain) Universia Carlos III e Mari, Leganes (Spain) Abstract Many applications

More information

Qubit channels that achieve capacity with two states

Qubit channels that achieve capacity with two states Qubit channels that achieve capacity with two states Dominic W. Berry Department of Physics, The University of Queenslan, Brisbane, Queenslan 4072, Australia Receive 22 December 2004; publishe 22 March

More information

Cascaded redundancy reduction

Cascaded redundancy reduction Network: Comput. Neural Syst. 9 (1998) 73 84. Printe in the UK PII: S0954-898X(98)88342-5 Cascae reunancy reuction Virginia R e Sa an Geoffrey E Hinton Department of Computer Science, University of Toronto,

More information

A note on asymptotic formulae for one-dimensional network flow problems Carlos F. Daganzo and Karen R. Smilowitz

A note on asymptotic formulae for one-dimensional network flow problems Carlos F. Daganzo and Karen R. Smilowitz A note on asymptotic formulae for one-imensional network flow problems Carlos F. Daganzo an Karen R. Smilowitz (to appear in Annals of Operations Research) Abstract This note evelops asymptotic formulae

More information

Hybrid Fusion for Biometrics: Combining Score-level and Decision-level Fusion

Hybrid Fusion for Biometrics: Combining Score-level and Decision-level Fusion Hybri Fusion for Biometrics: Combining Score-level an Decision-level Fusion Qian Tao Raymon Velhuis Signals an Systems Group, University of Twente Postbus 217, 7500AE Enschee, the Netherlans {q.tao,r.n.j.velhuis}@ewi.utwente.nl

More information

Balancing Expected and Worst-Case Utility in Contracting Models with Asymmetric Information and Pooling

Balancing Expected and Worst-Case Utility in Contracting Models with Asymmetric Information and Pooling Balancing Expecte an Worst-Case Utility in Contracting Moels with Asymmetric Information an Pooling R.B.O. erkkamp & W. van en Heuvel & A.P.M. Wagelmans Econometric Institute Report EI2018-01 9th January

More information

THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE

THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE Journal of Soun an Vibration (1996) 191(3), 397 414 THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE E. M. WEINSTEIN Galaxy Scientific Corporation, 2500 English Creek

More information

TAYLOR S POLYNOMIAL APPROXIMATION FOR FUNCTIONS

TAYLOR S POLYNOMIAL APPROXIMATION FOR FUNCTIONS MISN-0-4 TAYLOR S POLYNOMIAL APPROXIMATION FOR FUNCTIONS f(x ± ) = f(x) ± f ' (x) + f '' (x) 2 ±... 1! 2! = 1.000 ± 0.100 + 0.005 ±... TAYLOR S POLYNOMIAL APPROXIMATION FOR FUNCTIONS by Peter Signell 1.

More information

APPROXIMATE SOLUTION FOR TRANSIENT HEAT TRANSFER IN STATIC TURBULENT HE II. B. Baudouy. CEA/Saclay, DSM/DAPNIA/STCM Gif-sur-Yvette Cedex, France

APPROXIMATE SOLUTION FOR TRANSIENT HEAT TRANSFER IN STATIC TURBULENT HE II. B. Baudouy. CEA/Saclay, DSM/DAPNIA/STCM Gif-sur-Yvette Cedex, France APPROXIMAE SOLUION FOR RANSIEN HEA RANSFER IN SAIC URBULEN HE II B. Bauouy CEA/Saclay, DSM/DAPNIA/SCM 91191 Gif-sur-Yvette Ceex, France ABSRAC Analytical solution in one imension of the heat iffusion equation

More information

Some vector algebra and the generalized chain rule Ross Bannister Data Assimilation Research Centre, University of Reading, UK Last updated 10/06/10

Some vector algebra and the generalized chain rule Ross Bannister Data Assimilation Research Centre, University of Reading, UK Last updated 10/06/10 Some vector algebra an the generalize chain rule Ross Bannister Data Assimilation Research Centre University of Reaing UK Last upate 10/06/10 1. Introuction an notation As we shall see in these notes the

More information

Lecture 6 : Dimensionality Reduction

Lecture 6 : Dimensionality Reduction CPS290: Algorithmic Founations of Data Science February 3, 207 Lecture 6 : Dimensionality Reuction Lecturer: Kamesh Munagala Scribe: Kamesh Munagala In this lecture, we will consier the roblem of maing

More information

A Modification of the Jarque-Bera Test. for Normality

A Modification of the Jarque-Bera Test. for Normality Int. J. Contemp. Math. Sciences, Vol. 8, 01, no. 17, 84-85 HIKARI Lt, www.m-hikari.com http://x.oi.org/10.1988/ijcms.01.9106 A Moification of the Jarque-Bera Test for Normality Moawa El-Fallah Ab El-Salam

More information

Time-of-Arrival Estimation in Non-Line-Of-Sight Environments

Time-of-Arrival Estimation in Non-Line-Of-Sight Environments 2 Conference on Information Sciences an Systems, The Johns Hopkins University, March 2, 2 Time-of-Arrival Estimation in Non-Line-Of-Sight Environments Sinan Gezici, Hisashi Kobayashi an H. Vincent Poor

More information

SYNCHRONOUS SEQUENTIAL CIRCUITS

SYNCHRONOUS SEQUENTIAL CIRCUITS CHAPTER SYNCHRONOUS SEUENTIAL CIRCUITS Registers an counters, two very common synchronous sequential circuits, are introuce in this chapter. Register is a igital circuit for storing information. Contents

More information

Hyperbolic Moment Equations Using Quadrature-Based Projection Methods

Hyperbolic Moment Equations Using Quadrature-Based Projection Methods Hyperbolic Moment Equations Using Quarature-Base Projection Methos J. Koellermeier an M. Torrilhon Department of Mathematics, RWTH Aachen University, Aachen, Germany Abstract. Kinetic equations like the

More information

Math 342 Partial Differential Equations «Viktor Grigoryan

Math 342 Partial Differential Equations «Viktor Grigoryan Math 342 Partial Differential Equations «Viktor Grigoryan 6 Wave equation: solution In this lecture we will solve the wave equation on the entire real line x R. This correspons to a string of infinite

More information

The total derivative. Chapter Lagrangian and Eulerian approaches

The total derivative. Chapter Lagrangian and Eulerian approaches Chapter 5 The total erivative 51 Lagrangian an Eulerian approaches The representation of a flui through scalar or vector fiels means that each physical quantity uner consieration is escribe as a function

More information

Calculus and optimization

Calculus and optimization Calculus an optimization These notes essentially correspon to mathematical appenix 2 in the text. 1 Functions of a single variable Now that we have e ne functions we turn our attention to calculus. A function

More information

A simple model for the small-strain behaviour of soils

A simple model for the small-strain behaviour of soils A simple moel for the small-strain behaviour of soils José Jorge Naer Department of Structural an Geotechnical ngineering, Polytechnic School, University of São Paulo 05508-900, São Paulo, Brazil, e-mail:

More information

A New Minimum Description Length

A New Minimum Description Length A New Minimum Description Length Soosan Beheshti, Munther A. Dahleh Laboratory for Information an Decision Systems Massachusetts Institute of Technology soosan@mit.eu,ahleh@lis.mit.eu Abstract The minimum

More information

Construction of the Electronic Radial Wave Functions and Probability Distributions of Hydrogen-like Systems

Construction of the Electronic Radial Wave Functions and Probability Distributions of Hydrogen-like Systems Construction of the Electronic Raial Wave Functions an Probability Distributions of Hyrogen-like Systems Thomas S. Kuntzleman, Department of Chemistry Spring Arbor University, Spring Arbor MI 498 tkuntzle@arbor.eu

More information

G j dq i + G j. q i. = a jt. and

G j dq i + G j. q i. = a jt. and Lagrange Multipliers Wenesay, 8 September 011 Sometimes it is convenient to use reunant coorinates, an to effect the variation of the action consistent with the constraints via the metho of Lagrange unetermine

More information