Surface Reconstruction: GNCs and MFA

Mads Nielsen
DIKU, Universitetsparken 1, DK-2100 Copenhagen, Denmark
malte@diku.dk

September 1994

Abstract

Noise-corrupted signals and images can be reconstructed by minimization of a Hamiltonian. Often this Hamiltonian is a non-convex functional. The solution of minimum energy can then be approximated by the Graduated Non-Convexity (GNC) algorithm developed for the weak membrane by Blake and Zisserman. The GNC approximates the non-convex functional by a convex functional and varies the solution space slowly towards the non-convex functional. In this work, we propose a way of monitoring a functional-relaxation algorithm; in particular, the dependency on initial estimates and the manner of convergence are easily captured. Earlier work used Mean Field Annealing (MFA) to relax the Hamiltonian in the general case. It is often claimed that MFA leads to a GNC algorithm. It is shown that this is not necessarily the case, and especially not the case for earlier MF approximations of the weak membrane. In the case of the weak membrane, MFA might lead to predictable and inexpedient results. Two automatic and proven GNC-generating methods are presented. One uses a Gaussian filtering of the smoothness term and is called Smoothness Focusing (SF). The other uses a Gaussian filtering of the a priori distribution of the derivative in a Maximum A Posteriori estimation scheme, and is called Probability Focusing (PF). The algorithms are experimentally compared to the Blake-Zisserman GNC and shown to be competitive.

Index Terms: Surface Reconstruction, Relaxation Algorithms, Regularization, Mean Field Annealing, Graduated Non-Convexity, Discontinuities.

(Published as Technical Report, INRIA 2353, France. Part of this work was carried out at INRIA, Sophia Antipolis.)

1 Introduction

Regularization is a method of reformulating ill-posed inverse problems as well-posed problems, as done by Tikhonov and Arsenin [1]. This reformulation implies the addition of a stabilizing term,

followed by a global minimization of an energy functional, yielding a unique solution. Applied to the problem of reconstruction of a noise-corrupted surface, the energy functional or Hamiltonian (the terms are used interchangeably) can be expressed as follows:

E[\tilde{s}] = \int_\Omega (\tilde{s} - c)^2 + f[\tilde{s}] \, da

where \tilde{s} is the reconstruction, \Omega is the domain of the reconstruction, c is the measurements, da is an area element of \Omega, and f is the stabilizing functional. Tikhonov uses a stabilizing functional which is a sum of the squares of derivatives of \tilde{s}, and obtains a convex energy functional, making the minimization simple.

Geman and Geman [2] introduced a discontinuity field in the reconstruction and thereby proposed the model of the weak membrane, which is formulated in the continuous case by Mumford and Shah [3]. The discontinuity field is incorporated directly in the stabilizing functional by Blake and Zisserman [4], using what Rangarajan and Chellappa [5] call the adiabatic approximation. The resulting stabilizing term will no longer lead to a convex solution space; therefore we will not use the term regularization, but surface reconstruction, because several solutions of minimum energy might theoretically exist. In general the stabilizing term reflects the a priori knowledge of the surface, and other stabilizing functionals might be appropriate for other reconstruction problems. The stabilizing functional can be interpreted in terms of information theory, as an entropy measure [6], or in terms of Bayesian estimation, using Maximum A Posteriori (MAP) estimation as an M-estimator [7].

Given a noise model P(c|s), where s is the ideal surface which is to be reconstructed, and an a priori distribution of surfaces P(s), we can calculate the a posteriori probability of a given surface using Bayes' formula:

P(s|c) = \frac{P(c|s) P(s)}{P(c)}

where P(c) is a normalizing constant, or, in terms of statistical physics, the partition function. Typically, the a priori distribution is given in terms of the derivative of the surface, which is why we use the notation P(\nabla s). Using a noise model of additive, Gaussian, uncorrelated noise, and no spatial correlation of P(\nabla s), we can find the minus-log-probability function of a surface as

E[\tilde{s}] = \sum (\tilde{s} - c)^2 + f(\nabla \tilde{s})    (1)

where f(\nabla \tilde{s}) = -\log P(\nabla \tilde{s}), and \nabla is a difference operator in the discrete approximation. In terms of statistical physics E is called the Gibbs energy. Because the minus-log function is monotonically decreasing, a minimization of the Gibbs energy corresponds to a maximization of the a posteriori probability. In the following we call the function f in Equation 1 the smoothness function.

The properties of the reconstruction and the convexity of the solution space depend strongly on the smoothness function. Tikhonov [1] used a parabolic smoothness function, Blake and Zisserman [4] used a thresholded parabolic smoothness function, and Nielsen [8] used a Lorentzian estimator. In the following, we will not in general refer to any specific smoothness function. Geman and Geman [2] used Simulated Annealing (SA) to find the minimum of Equation 1, while Blake and Zisserman [4] introduced the deterministic and approximative Graduated Non-Convexity (GNC) algorithm in the case of the weak membrane and showed that it is up to 50 times faster than SA [9]. Geiger and Girosi [10] and Bilbro et al. [11] used the Mean Field Annealing (MFA) formalism to create a deterministic version of SA, and claimed it creates GNC-like algorithms.
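As a concrete illustration of Equation 1, the following minimal Python sketch (not part of the original report; the function names are ours, and the weak-string choice of f anticipates Section 2) evaluates the discrete Gibbs energy of a 1D reconstruction under the additive Gaussian noise model.

    import numpy as np

    def gibbs_energy(s, c, f):
        """Discrete Gibbs energy of Equation 1 for a 1D signal:
        quadratic data term plus the smoothness term f applied to
        first-order differences of the reconstruction."""
        data_term = np.sum((s - c) ** 2)
        smooth_term = np.sum(f(np.diff(s)))   # nabla taken as a difference operator
        return data_term + smooth_term

    # Weak-string smoothness function f(x) = lam * min(x^2, T^2) (cf. Section 2).
    def weak_string(x, lam=1.0, T=1.0):
        return lam * np.minimum(x ** 2, T ** 2)

    c = np.array([0.0, 0.1, -0.1, 1.2, 1.0, 1.1])      # noisy measurements
    print(gibbs_energy(c, c, weak_string))              # energy of the trivial choice s = c
    print(gibbs_energy(np.full_like(c, c.mean()), c, weak_string))  # energy of a flat reconstruction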
This paper concerns how the concept of GNC can be generalized to analytical and non-analytical smoothness functions, which criteria an algorithm has to fulfill to be a GNC, and how previous work

such as MFA is explained in terms of GNC. In Section 2 the Blake and Zisserman GNC is sketched and the criteria for an algorithm to be a GNC algorithm are emphasized. In Section 3 the MFA is sketched, and it is shown that it does not lead to a GNC algorithm in the case of the weak membrane, but to an algorithm which in some cases yields predictable and inexpedient results. In Section 4 two alternatives to the MFA are given in the general case. They are proven to yield GNC algorithms and are fully automatic. All the proofs have been left to appendices in order not to distract the reader who is only interested in the results.

2 Graduated Non-Convexity

The deterministic and approximative approach of GNC implies an approximation of the non-convex Hamiltonian by a convex Hamiltonian. This approximation is slowly varied towards the non-convex Hamiltonian, in the hope that the local minimum which is tracked will converge to the global minimum of the non-convex Hamiltonian. Three snapshots illustrating a GNC algorithm are shown in Figure 1.

[Figure 1: The Graduated Non-Convexity algorithm of the weak membrane in three snapshots (curves: convex approximation, intermediate, final; axes: solution vs. energy).]

In general, a crucial point of the GNC algorithm is to be sure to have an initially convex Hamiltonian [4]. If it is not convex, the final solution will depend upon an initial estimate. Let us assume the minimization algorithm to be of gradient-descent type. We can define the basin of attraction of a minimum as the set of points on characteristic paths ending in the minimum. In general the solution space will in this way be divided into a number of basins of attraction corresponding to the number of minima. If the initial solution space contains more than one basin of attraction, the final result of the GNC will depend on the starting point of the algorithm. In other words, it is necessary to have an a priori idea of the solution in order to find the best solution. If the initial energy functional is non-convex, we cannot talk of a GNC, but of a relaxation algorithm.

The second crucial point of a GNC is the development of non-convexities (when and where they appear and how they move) when the Hamiltonian is varied. It is difficult to gain any intuition or to formalize the criterion of having a good GNC. However, we can say that a non-convexity should not move much in the solution space after it is introduced in the Hamiltonian. If it moves far, it can push the solution far away from the earlier found equilibrium, after detection.

Finally, we want the series of Hamiltonians not to be uniformly convergent towards the true energy functional but to be Γ-convergent towards the true energy functional [12]. A Γ-convergent series can informally be described as a series of functionals where the minimum of the functionals converges towards the minimum of the limiting functional. This is not the same as uniform convergence because of phenomena such as the Gibbs phenomenon, but for discrete approximations uniform convergence will imply Γ-convergence [12], which is why we will be satisfied by uniform convergence.

In this paper we focus primarily on the first crucial point (initial convexity), while the second (concavity motion) will only be used when we explain the behaviour of an algorithm.

In Appendix A it is shown that a reconstruction problem on the form of Equation 1 implies a convex energy functional if and only if for all values of x the second derivative of f(x) is larger than -1/2 in the one-dimensional case. This implies that a GNC which changes only the smoothness function should guarantee that the second derivative is defined for all x and is larger than -1/2 in the initial approximation. In Appendix A it is also shown that in the D-dimensional case the criterion is somewhat more complex, but a convex energy functional is guaranteed if all the eigenvalues of the Hessian matrix of f(x) are larger than -1/(2D). Furthermore, non-convexities are guaranteed if one of the eigenvalues is smaller than -1/2. In the interval [-1/2, -1/(2D)], the exact criterion is given in Appendix A. We see that a lack of convexity arising in one dimension cannot be balanced by the other dimensions. The convexity must exist in all dimensions, and the dimensions must mix in an appropriate manner.

The weak membrane is defined by minimization of the energy in Equation 1, where

f(\nabla\tilde{s}) = \begin{cases} \lambda |\nabla\tilde{s}|^2 & \text{if } |\nabla\tilde{s}| < T \\ \lambda T^2 & \text{otherwise} \end{cases}

and λ is a weighing constant and T can be perceived as a discontinuity threshold control parameter. The energy functional is not convex according to Theorem 1, Appendix A, because f(x) has second derivatives which are minus infinity for |x| = T. Therefore the approximation of a GNC is constructed. The idea of the GNC algorithm is to approximate f by another functional whose second derivative is bounded. This can be done by approximating f by f_1 in the critical region (around |x| = T) by a second-order polynomial with a second derivative larger than -1/(2D). This formulation was proposed by Blake and Zisserman [4] and is illustrated for the 1D case in Figure 2.

[Figure 2: Smoothness term f as a function of the derivative s_x of the solution in the weak membrane. The solid line is the weak membrane, while the dotted line is the starting level of GNC as formulated by Blake and Zisserman.]

If we denote the original (weak-membrane) smoothness function by f_0, we can construct a series of functions f_c which vary continuously as a function of c. When c = 1 the solution space is convex, and when c = 0 the reconstruction corresponds to the weak membrane. The intermediate functions f_c, c ∈ ]0, 1[, can be constructed by letting the interval of approximation shrink to a factor c of the original interval. The claim of Blake and Zisserman is that if we track the local minimum of f_c when slowly varying c from 1 to 0, we obtain a good approximation of the solution of the weak membrane [4]. It is shown [4] that the global minimum cannot always be tracked as the local minimum. A discussion of convergence is given by March [12] and Nielsen [13]. Other approximations of the initial smoothness function which yield a convex energy functional can be constructed. In Section 4 two methods which are generally applicable and totally automatic are presented.
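The convexity criterion just quoted (f'' > -1/2 in 1D, proven in Appendix A) is easy to test numerically. The sketch below is our own illustration, not the Blake-Zisserman construction itself: it estimates the infimum of f'' on a grid and reports whether the corresponding energy of Equation 1 is guaranteed to be convex.

    import numpy as np

    def min_second_derivative(f, x_range=(-5.0, 5.0), n=200001):
        """Grid estimate of inf_x f''(x) by central differences."""
        x = np.linspace(*x_range, n)
        h = x[1] - x[0]
        return ((f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2).min()

    def energy_is_convex_1d(f):
        """1D criterion of Theorem 1 (Appendix A): convex energy if inf f'' > -1/2."""
        return min_second_derivative(f) > -0.5

    lam, T = 1.0, 1.0
    tikhonov = lambda x: lam * x ** 2                         # parabolic stabilizer
    weak_string = lambda x: lam * np.minimum(x ** 2, T ** 2)  # thresholded parabola

    print(energy_is_convex_1d(tikhonov))      # True: quadratic smoothness keeps the energy convex
    print(energy_is_convex_1d(weak_string))   # False: the corners at |x| = T drive f'' towards minus infinity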

3 Mean Field Annealing and GNC

Mean Field Annealing (MFA) is a technique which is a deterministic version of Simulated Annealing. An amount of free energy is added to the molecules, which lets them have many different possible states. Instead of simulating the stochastic behaviour of a molecule when the free energy is removed (i.e. the temperature is decreased), the mean state of the molecule is simulated. Approximation of the distribution by its mean value is called the MF approximation. To calculate the mean state a Gibbs distribution of the states is used:

P(\tilde{s}) = \frac{1}{Z} e^{-E(\tilde{s})/\sigma}

where E is the potential energy of a state, here given by Equation 1, while σ is used for the temperature, which has often been denoted 1/β or kT, but is denoted σ here to gain consistency with later Gaussian scale-space notation. Z is a normalizing constant called the partition function. The mean of the Gibbs distribution can be evaluated when σ tends towards zero, where only the ground state has non-zero probability and also becomes the mean state. For σ approaching infinity, all states become equally probable, which is why the addition of free energy and the MF approximation can be regarded as a relaxation algorithm.

A convex energy functional is not necessarily obtained for high temperatures. This depends on the structure of the state space. In general, the solution space will become totally flat and all states will be equally probable for infinitely high temperatures. If E : D → IR^+ is defined on a domain D with finite measure, then the mean of \tilde{s} will be at the centre of gravity of D for σ infinite. If only the smoothness term of Equation 1 is MF approximated, the data term (which is quadratic and thereby convex) will ensure convexity of the total energy functional, and we obtain a GNC algorithm. The evaluation of the mean of the Gibbs distribution is, however, often non-trivial, as a summation over all possible states (i.e. all possible combinations of reconstruction values in all points) has to be carried out.

In many cases MF approximations can more easily be made, not in the total state space, but in some subspace; e.g. the value of neighbouring points can be exchanged with their MF approximation. This is not the same as the global MFA, and will in general yield another result and might thus violate the condition of initial convexity of a GNC. When only a subspace is approximated, the approximation might depend on the rest of the state space. E.g. in the case of approximating a neighbouring pixel value it is necessary, for each temperature, to minimize the energy of each pixel, which changes the MF approximation of the neighbouring pixels. Whether this optimization is simple (i.e. the energy having a single stationary point) is not in general easily determined, but will depend on the actual properties of the field and the approximation.

In recent years MFA algorithms of the weak membrane have been published [11], [10], [14]. In these works, the discontinuity field introduced by Geman and Geman [2] is substituted by its MF approximation, yielding an expression of the effective energy in a pixel in terms of a functional of the gradient. This MF approximation of the weak membrane yields [11], [10] a smoothness function which has the form

f_\sigma(x) = \lambda T^2 - \sigma \log\left(1 + e^{\lambda(T^2 - |x|^2)/\sigma}\right)

where σ can be interpreted as the temperature. This has the characteristics of being a smoothed version of the weak string. It is claimed [11], [10] that the MFA of the weak membrane is a GNC-like algorithm.
This is not the case, as the solution space might be non-convex no matter how much we "heat". When σ tends to zero, the smoothness function tends to the weak string. When σ tends to infinity, the lower bound on the second derivative tends to a negative limit, which depends on λ.

The limiting bound is approximately -0.6λ, which shows that the MFA is not a GNC for λ > 0.9, no matter the dimensionality of the surface. This analysis is carried out in Appendix C. An illustration of the MFA smoothness function as a function of gradient and temperature is shown in Figure 3, while minus the second derivative with respect to the gradient is shown in Figure 4.

[Figure 3: The MF approximation of the smoothness function of the weak membrane as a function of the gradient x and the temperature σ for the case T = 1, λ = 1.]

In the 1D case, we will analyse the behaviour of the MFA of the weak membrane by investigating the motion of the non-convexities. The positions on the smoothness function where the second derivative reaches its minimum value are approximately a linear function of the square root of the temperature for high temperatures (see Appendix C). This implies that the concavities in the solution space are located at ±k√σ in the gradient space of the solution, where k is a positive constant. If σ is initialised to an arbitrarily high value, such that every derivative in the signal is in the interval [-k√σ, k√σ], no discontinuities will be detected using this σ. As σ is decreased slowly, the concavities traverse towards zero in the gradient space. Because we track the solution as a local minimum, the solution stays in the interval when σ is lowered. This implies that all gradients of the solution are pulled towards zero. They will end in the interval between the two concavities in the function. When σ is decreased to zero, this interval is [-T/√2, T/√2]. This way, all gradients of the resulting signal will be in this interval and no discontinuities are detected.

If the temperature was initialised at a lower value, some discontinuities might have been detected. These discontinuities will not be able to pass the concavities of the energy functional, and will still be outside the interval of the concavities no matter how the temperature is increased or decreased. The phenomenon can be visualized by looking at the graph of minus the second derivative of the MFA smoothness function in Figure 4. While the temperature is changed, the leaps cannot be overcome, and the solution is pushed in front of the concavities.

The consequence of the MFA is that the start temperature defines which discontinuities are detected. The annealing might let these discontinuities move, but the detection is performed by the arbitrary choice of initial temperature and the initial estimate. In Figure 5 the result of MFAs using different start temperatures is shown. The discontinuity threshold is low (T = 1), resulting in a ground state in which all points are detected as discontinuities. By raising the initial temperature the discontinuity detection threshold is raised, and fewer discontinuities are detected, yielding a higher final energy. In the limit we can force the MFA to detect no discontinuities if we start at a sufficiently high temperature.

The above analysis was based upon the positions of the concavities in the energy functional. A concavity might not lead to multiple minima.
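The claim that the lower bound on the second derivative of the MF-approximated smoothness function does not vanish with increasing temperature can be checked numerically. The sketch below is our own and uses the form of f_σ reconstructed in this section with λ = T = 1; it estimates min_x f_σ''(x) for growing σ.

    import numpy as np

    def f_mfa(x, sigma, lam=1.0, T=1.0):
        """MF approximation of the weak-membrane smoothness term (Section 3)."""
        return lam * T ** 2 - sigma * np.log1p(np.exp(lam * (T ** 2 - x ** 2) / sigma))

    def min_second_derivative(sigma, lam=1.0, T=1.0):
        x = np.linspace(-25.0, 25.0, 500001)
        f = f_mfa(x, sigma, lam, T)
        d2 = np.diff(f, 2) / (x[1] - x[0]) ** 2
        return d2.min()

    # The minimum of f_sigma'' stays strictly negative for all temperatures, so for
    # lam = 1 the energy never becomes convex, no matter how much we "heat".
    for sigma in [0.5, 1.0, 5.0, 20.0, 100.0]:
        print(f"sigma = {sigma:6.1f}   min f'' = {min_second_derivative(sigma):.3f}")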

One necessary condition to be fulfilled is that the concavity in a smooth energy functional contains a maximum. This is not the case when the forces from the data and the smoothness term point in the same direction. This means that a "heating" to infinity will not let discontinuities grow towards infinity. The enlargement of the gradient will stop when the reconstruction reaches the initial data.

[Figure 4: Minus the second derivative of the MF approximation of the smoothness function of the weak membrane as a function of the gradient x and the temperature σ for the case T = 1, λ = 1.]

[Figure 5: An MFA reconstruction. For each start temperature T_start the MFA algorithm is run, using the same final temperature close to zero in all executions. The parameters were T = 1 and λ = 10. The initial relaxation parameter was varied according to σ_start = 5 · 10^{T_start/5}.]

The argumentation for using MFA is solely based upon statistical physics. In statistical physics MF theory is not regarded as a good approximation inside the critical regions. The weak membrane is normally situated in a critical region. A critical region is a region where phase transitions are present. If both discontinuity points and non-discontinuity points are present in the optimal state, the weak membrane is in the critical region. If not, another model than the weak string could have been used. It should be mentioned that MF theory is regarded as a better approximation when the dimension of the field is high. This means that the MFA should be a better approximation in the 2D or 3D reconstruction problem. Nevertheless, the MFA on the weak membrane will suffer from the same problems outlined above, independently of the number of dimensions. This can be seen from the fact that the MF approximation in higher dimensions just exchanges the derivative for the gradient magnitude, and that the non-convexities in this way (using Theorem 2) are present on

a sphere in gradient space. This sphere is shrunk, without the possibility of letting the gradients escape, when the temperature is decreased.

4 Focusing as GNC

In this section we present two algorithms based upon Gaussian defocusing as sources of convexity of the energy functional. In Appendix B it is proven that the weak membrane yields a convex Hamiltonian if the smoothness function is convolved with a Gaussian of sufficiently large standard deviation. In fact, it is proven that any smoothness function which differs from one yielding a convex Hamiltonian by only a Lebesgue-integrable function will cause a convex solution space if the smoothness function is filtered with a Gaussian of sufficiently large standard deviation. In this way a convex Hamiltonian can be constructed, and the solution can be found using a simple gradient descent algorithm. When slowly decreasing the standard deviation of the Gaussian towards zero, we can track the solution of the optimization problem by tracking the minimum as a local minimum. When the standard deviation approaches zero, the tracked solution is an approximation of the solution of the original problem. This method is denoted Smoothness Focusing (SF), using the nomenclature of coarse-to-fine strategies in scale space from Bergholm's Edge Focusing [15].

In Appendix B it is also proven that a convex energy functional can be obtained by Gaussian convolution of the a priori distribution of the gradient in a MAP estimation scheme. This corresponds to a Gaussian convolution of the Gibbs distribution of the gradient:

P(\nabla\tilde{s}) = e^{-f(\nabla\tilde{s})}

The corresponding minus-log-probability function yields a convex solution space if the initial probability function has a finite standard deviation. This is not the case for the weak membrane, but it is for many other robust estimators, such as the Lorentzian. The solution can be tracked as a local minimum while the standard deviation of the Gaussian is decreased towards zero. This method is called Probability Focusing (PF).

In both GNC-generating schemes a conservative measure of the needed amount of smoothing is given in Appendix B. This means that we can guarantee that a finite amount of smoothing is needed, and we can initiate the amount of smoothing. The general scheme of constructing a GNC algorithm, which only requires that the smoothness function fulfills the demands of Theorem 3 or Lemma 2 in Appendix B, is presented as follows:

    σ = σ_0
    while σ > σ_1
        Minimize E_σ(\tilde{s}) = Σ (c - \tilde{s})^2 + f_σ(\tilde{s}_x)
        σ = σ / k
    endwhile

where σ is the standard deviation of the Gaussian used for convolution of the smoothness function f or the probability function e^{-f}. This means

f_σ(t) = G_σ(t) * f(t)    or    f_σ(t) = -\log(G_σ(t) * e^{-f(t)})

where G_σ is the Gaussian of standard deviation σ. The choice of k is a trade-off between speed (large k) and precision (small k). A discussion of how to choose this "cooling rate" is given by Blake and Zisserman [4].
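The loop above is straightforward to implement. The following Python sketch is ours, with hypothetical parameter choices; a fixed-step gradient descent stands in for the unspecified inner minimizer. It realizes the SF variant on a 1D signal by smoothing a sampled smoothness function numerically with a Gaussian and halving σ until it is small.

    import numpy as np

    def gaussian_smooth_samples(values, dx, sigma):
        """Numerically convolve sampled f with a Gaussian of std sigma (Smoothness Focusing)."""
        k = np.arange(-int(5 * sigma / dx), int(5 * sigma / dx) + 1) * dx
        g = np.exp(-k ** 2 / (2 * sigma ** 2))
        g /= g.sum()
        return np.convolve(values, g, mode="same")

    def smoothness_focusing(c, f, sigma0=4.0, sigma_min=0.05, step=0.05, iters=300,
                            x_max=20.0, dx=0.01):
        """GNC by Smoothness Focusing: track a local minimum of the smoothed energy
        while the standard deviation of the Gaussian is halved towards zero.
        sigma0 is chosen above the conservative bound of Appendix B."""
        xs = np.arange(-x_max, x_max + dx, dx)
        s = c.copy()                              # start from the data
        sigma = sigma0
        while sigma > sigma_min:
            f_sig = gaussian_smooth_samples(f(xs), dx, sigma)
            df_sig = np.gradient(f_sig, dx)       # derivative of the smoothed smoothness function
            for _ in range(iters):                # plain gradient descent on E_sigma
                grad_s = np.interp(np.diff(s), xs, df_sig)
                g = 2.0 * (s - c)
                g[:-1] -= grad_s                  # d/ds_i     of f(s_{i+1} - s_i)
                g[1:] += grad_s                   # d/ds_{i+1} of f(s_{i+1} - s_i)
                s -= step * g
            sigma /= 2.0
        return s

    lam, T = 1.0, 1.0
    weak_string = lambda x: lam * np.minimum(x ** 2, T ** 2)
    rng = np.random.default_rng(0)
    ideal = np.concatenate([np.zeros(30), 3.0 * np.ones(30)])   # a step edge
    c = ideal + 0.25 * rng.standard_normal(ideal.size)
    print(np.round(smoothness_focusing(c, weak_string), 2))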

4.1 Bayesian interpretation

Reconstruction and the scale-space extensions of the smoothness function can be interpreted in terms of probabilistic theory. The connection is: assume an a priori probability distribution of the solutions is given and the smoothness function f is properly derived from this; then the minimization of the reconstruction energy corresponds to a maximization of the a posteriori probability (see the derivation of Equation 1). Let us assume that we are not able to measure the derivative s_x of the solution precisely, and that the noise model on the derivatives is assumed stationary and Gaussian. The a priori distribution of the observed derivatives will in this case be the convolution of the a priori distribution of the real derivatives and the Gaussian noise model distribution. This gives an interpretation of the PF scheme as an expectation value of the probability (or an MF approximation, but using a Gaussian noise model of the gradient instead of the Gibbs distribution). The SF corresponds to having an uncertain determination of the derivatives and then estimating the expected energy. The expected energy is found by the Gaussian convolution if the noise model of the derivative is Gaussian.

The GNCs by focusing can be perceived as an iterative way to find increasingly certain values of the first derivative. At the first level, the derivatives are inaccurately determined, and we convolve the smoothness function with a Gaussian of large standard deviation. As the minimization of this yields an increasingly certain determination of the derivative of the solution, we can now use a Gaussian with a smaller standard deviation, and so forth.

5 Experiments

We compare empirically the SF and the Blake-Zisserman GNC on the weak membrane. The latter has the advantage that the smoothness function is only changed around the critical points where the second derivative is smaller than -1/2. The Gaussian convolution yields the theoretically satisfying property of being derivable from probability theory. In the following experiments we show that the SF can compete with the Blake-Zisserman formulation. The outcome of the GNC algorithm implemented as done by Blake and Zisserman [4] is compared to the SF. The test problem is the weak membrane, which in scale-space extension (after Gaussian convolution, with σ as scale parameter) yields:

f_\sigma(x) = \lambda T^2 + \frac{\lambda}{2}(\sigma^2 + x^2 - T^2)(\mathrm{erf}(x_-) + \mathrm{erf}(x_+)) - \frac{\lambda\sigma x}{\sqrt{2\pi}}(e^{-x_-^2} - e^{-x_+^2}) - \frac{\lambda\sigma T}{\sqrt{2\pi}}(e^{-x_-^2} + e^{-x_+^2})

where

x_- = \frac{T - x}{\sqrt{2}\,\sigma},  \quad  x_+ = \frac{T + x}{\sqrt{2}\,\sigma},  \quad  \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_{-\infty}^{x} e^{-t^2} dt - 1

The qualitative difference between the Blake-Zisserman approximation and the scale-space approximation is that the scale space not only rounds off the corners, it also increases the value at zero (see Figure 6). Furthermore, the critical value of the second derivative is only reached in single points in the SF, but in intervals of finite size in the Blake-Zisserman formulation (see Figure 7). In these intervals, the energy functional is nearly non-convex, and might not have any gradient at all. This means that a gradient descent algorithm will probably end up at one of the bounds of the interval at random. Whether a gradient in the input is perceived as a discontinuity or not might in this way be random.

[Figure 6: Smoothness function of the derivative of the solution at the starting level of GNC as formulated by Blake and Zisserman and in scale-space extension.]

[Figure 7: Second derivative of the smoothness function of the derivative of the solution at the starting level of GNC as formulated by Blake and Zisserman and in scale-space extension.]

Two types of experiments have been performed: one on a noise-corrupted signal, and one on an ideal and precisely adjusted signal. The latter tests the precise behaviour on certain features, while the former makes an overall judgment. An ideal signal (consisting of an interval of negative gradient, an interval of zero gradient, a step edge, and an interval of zero gradient) has been noise corrupted with stationary Gaussian noise with standard deviation 2.5. The two GNC algorithms have been run on the signal and the result can be seen in Figure 8 and Figure 9. The Blake-Zisserman GNC detects more discontinuities than the

SF GNC. This tendency is general and is present for many other parameter settings. The SF yields a final energy which is 92 percent of the total energy found by the Blake-Zisserman algorithm. This is a general tendency which is emphasized in the next experiment.

[Figure 8: Regularized signal using the weak string approximation by SF. The final energy is E = 38.6 and only 1 discontinuity is detected.]

[Figure 9: Regularized signal using the weak string approximation by the Blake-Zisserman GNC. The final energy is E = 35.6 and 2 discontinuities are detected.]

The two GNC algorithms do not always detect the same discontinuities. It is known [4] that the weak string will detect discontinuities arising from a gradient if the gradient g > T/√2 [17]. In this experiment, the algorithms have been tested on a constant gradient. From one experiment to the

next, the gradient has been increased. In Figure 11, the energy of the solution found by the two algorithms is plotted as a function of the gradient. In regions where the solution does not change the detection of discontinuities, the plot should be a parabola, as illustrated in Figure 10. For each combination of discontinuities a parabola exists. The perfect GNC algorithm would choose the parabola of lowest energy for every gradient value. This is not the case for either of the two algorithms evaluated in this paper.

[Figure 10: Energy of the weak string as a function of the gradient in an interval. Each of the curves corresponds to one combination of discontinuities. The optimal solution is the one of minimum energy.]

The experiment shows that, initially, where the gradient is small, neither of the algorithms detects any discontinuities, and the solutions are thereby identical. From a certain point (around g = 0.6) the Blake-Zisserman GNC detects discontinuities, and thereby leaves the initial parabola. The energy increases relative to the initial parabola, and it can be concluded that the discontinuities have been detected too early. In Figure 12 a zoom-in on the region of differences can be seen. The SF GNC follows the initial parabola until the gradient g is close to 0.8. After this it follows a new parabola. The energy drops from the first to the second parabola, and it can be concluded that discontinuities have been detected too late. The ideal detection gradient is g = 1/√2. In the region of larger gradients (approximately 1.6), the SF GNC results in the worst solution, as too many discontinuities have been detected. This region is, though, of less interest, because it is the region where nearly all points have been detected as discontinuities. This situation is unlikely to appear in a realistic environment. In the interval of gradients 0.6 < g < 0.95 the Blake-Zisserman GNC yields a higher energy than the SF, except in the region 0.75-0.78.

In order to show that the Focusing GNC is a general scheme which can be applied to a broad class of regularization problems, it has also been tested on 2D data using the isotropic regularization introduced by Nielsen [8]. In the isotropic regularization, the smoothness function is the Lorentzian robust estimator, which is

f(x) = \lambda \log(1 + |x|^2)

It should be noticed that this is not finite for infinite |x| and furthermore is not simple to approximate in the Blake-Zisserman way by substituting the function in the critical interval by a second-order polynomial. The Lorentzian estimator implies a convex solution space if λ < 2. In Figures 13 and 14 the reconstruction using λ = 50 and the GNC by PF can be seen.
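A quick numerical check of the Lorentzian convexity bound can be made with the following sketch (ours, not part of the original experiments). The closed-form minimum of f'' is -λ/4, attained at x² = 3, so the 1D criterion of Appendix A gives convexity exactly for λ < 2.

    import numpy as np

    lam = 50.0                                   # value used for the reconstruction in Figures 13-14
    x = np.linspace(-10.0, 10.0, 400001)
    f = lam * np.log1p(x ** 2)                   # Lorentzian robust estimator
    d2 = np.diff(f, 2) / (x[1] - x[0]) ** 2
    print(d2.min())                              # approximately -lam/4, attained near |x| = sqrt(3)
    print("convex" if d2.min() > -0.5 else "non-convex: a GNC such as SF or PF is required")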

[Figure 11: The energy of the weak string as a function of the gradient. For each gradient a signal of length 20 and constant gradient has been constructed. The energy is plotted for the Blake-Zisserman GNC and for the Gauss GNC. λ = T = 1.0 in all computations.]

[Figure 12: The energy of the weak string as a function of the gradient. Computations performed as in Figure 11. Focus is on the region of different energy.]

6 Conclusion

We have given criteria to guarantee convexity of an energy functional containing a quadratic data term and an arbitrary smoothness term. Necessary conditions for the smoothness term to be approximated by a scale-space extension of energy or probability, yielding a convex energy functional, are given. These results have been used for automatic construction of GNC algorithms. In the case of the weak string, the application is straightforward and yields results which are competitive with those of Blake and Zisserman. The Blake-Zisserman GNC has a tendency to over-estimate the number of discontinuities, while the SF GNC has a tendency to under-estimate the number of discontinuities.

[Figure 13: Regularization using the Lorentzian estimator and Smoothness Focusing. (a) is the original data. (b) is the noise-corrupted signal, with SNR = √2 on the step edges. (c) is the reconstruction, and (d) is the normalized residual.]

Earlier, Mean Field Annealing has been used to make deterministic approximations of the process of simulated annealing of the weak string [10], [11]. It is proven that these MFAs do not yield a GNC algorithm of the weak string, as the energy functional might be non-convex even for infinitely high temperatures. The start temperature of the MFA defines the positions of the discontinuities. The higher the temperature, the fewer discontinuities will be detected. No matter how low the discontinuity threshold is in the weak string, it can be matched by a sufficiently high temperature, resulting in detection of no discontinuities by the MFA.

The SF and PF GNCs imply the possibility of automatically applying GNC to any reconstruction which can be formulated as the minimization of the energy of Equation 1. In general, an analytic expression of the energy functional is not needed. This implies the possibility of using an energy functional which is measured as a histogram, and thus only numerically known. In this way a new category of GNC applications is made possible. One application [18] in computer vision is to measure the statistics of the gradient in a scene. At a later time instance, we can, instead of using a priori information directly about the gradient, use the information that the statistics of the scene change slowly. Both SF and PF have been applied with success in this configuration.

[Figure 14: Regularization using the Lorentzian estimator and Smoothness Focusing. (a) is the original data. (b) is the noise-corrupted signal, with SNR = √2 on the step edges. (c) is the reconstruction, and (d) is the normalized residual.]

Appendix A

In this appendix, the necessary and sufficient conditions for convexity of the energy functional are given. The proof is divided into three parts. The first concerns the conditions on the smoothness function to create a convex solution space in the 1D case, the second deals with the D-dimensional case, and the last gives an interpretation of the results of the second. We look into the case where the smoothness term depends only on the first derivative of the solution. The energy can in the discrete formulation be expressed as:

E(\tilde{s}) = \sum_i (\tilde{s}_i - c_i)^2 + f(\tilde{s}_i - \tilde{s}_{i-1})    (2)

where f : IR → IR is dependent on the sampling distance h, and subscript i denotes the function value taken in sample number i.

Theorem 1. If the second derivative of the smoothness function f of the first derivative of the solution can be bounded downwards by -1/2 and the data term is the square distance, the solution space will be convex.

Proof. The solution space is convex if and only if the Hessian matrix H is positive definite.

By definition

H_{ij} = \frac{\partial^2 E}{\partial \tilde{s}_i \partial \tilde{s}_j}

In this case it yields H = 2I + F, where

F = \begin{pmatrix}
f''_2 & -f''_2 & & & \\
-f''_2 & f''_2 + f''_3 & -f''_3 & & \\
 & -f''_3 & \ddots & \ddots & \\
 & & \ddots & f''_{N-1} + f''_N & -f''_N \\
 & & & -f''_N & f''_N
\end{pmatrix}

and f''_i = f''(\tilde{s}_i - \tilde{s}_{i-1}). The criterion for a matrix H to be positive definite is:

\forall \vec{x} : \vec{x}^t H \vec{x} > 0    (3)

By definition the solution space is convex if:

\forall \vec{x} : 2\sum_{i=1}^{N} x_i^2 + \sum_{i=2}^{N} (x_i - x_{i-1})^2 f''_i > 0

As 4\sum (x_i)^2 \geq \sum (x_i - x_{i-1})^2 for all \vec{x}, H is positive definite if

\forall i : f''_i > -1/2

From Theorem 1 we see that we only have to prove the existence of a lower bound on the second derivative of the smoothness term higher than -1/2 to gain convexity of the solution space.

The above results are correct for an integer-sampled signal. If a differently sampled signal were used instead, or a function of higher dimensionality were reconstructed, the limit would change according to the sampling distance h and the dimensionality D. If the smoothness function were a function of a higher-order derivative of the solution, the lower bound would still exist, but take a different value. In the following table the limits corresponding to smoothness functions of different derivatives of the solution are listed. No matter which derivative the smoothness function is a function of, it is still a limit on the second derivative of the smoothness function. The smoothness function is f(s_x^{(n)}), where n denotes the order of differentiation.

    n    Limit
    1    -1/(2 h^2 D)
    2    -1/(9 h^4 D)
    3    -1/(40 h^6 D)
    4    -1/(175 h^8 D)

If this pattern repeats in general, the limit can be expressed as

L(n) = -\frac{2 (n!)^2}{(n+1)\,(2n)!\; h^{2n} D}
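The structure of the Hessian used in the proof of Theorem 1 can be verified directly. The sketch below is ours; the clamped second derivative is an arbitrary example satisfying f'' > -1/2. It assembles H = 2I + F for a test configuration and checks positive definiteness numerically.

    import numpy as np

    def energy_hessian_1d(s, d2f):
        """Hessian H = 2I + F of the energy in Equation 2, where each neighbour pair
        (i-1, i) contributes f''(s_i - s_{i-1}) times [[1, -1], [-1, 1]] to F."""
        n = len(s)
        H = 2.0 * np.eye(n)
        for i in range(1, n):
            fpp = d2f(s[i] - s[i - 1])
            H[i - 1, i - 1] += fpp
            H[i, i] += fpp
            H[i - 1, i] -= fpp
            H[i, i - 1] -= fpp
        return H

    # Example smoothness curvature bounded below by -0.4 > -1/2 (Theorem 1 then guarantees convexity).
    d2f_bounded = lambda x: max(-0.4, 2.0 * (1.0 - x ** 2) / (1.0 + x ** 2) ** 2)

    s = np.concatenate([np.linspace(0.0, 2.0, 20), np.linspace(5.0, 4.0, 20)])  # arbitrary test point
    eigs = np.linalg.eigvalsh(energy_hessian_1d(s, d2f_bounded))
    print(eigs.min() > 0.0)    # True: H is positive definite, as Theorem 1 predicts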

The above results are valid for the one-dimensional problem. In the following we generalize to the D-dimensional case. Let an energy function be given on the following form:

E(\tilde{s}) = \sum_i (\tilde{s}_i - c_i)^2 + f(\tilde{s}_{i+1} - \tilde{s}_{i-1},\; \tilde{s}_{i+n_1} - \tilde{s}_{i-n_1},\; \tilde{s}_{i+n_1 n_2} - \tilde{s}_{i-n_1 n_2},\; \ldots)    (4)

where n_1 is the length of a row, n_2 is the length of a column of the image, etc., and central approximations are used to gain symmetry. In this case we search for the criterion for convexity.

Lemma 1. If the Hessian H of f is in the class \mathcal{H}, then the energy function given in Equation 4 is convex, where

\mathcal{H} = \{ H \in IR^{D \times D} \mid \forall y \in \{ y \in IR^D \mid \|y\|_\infty \leq 1 \} : y^T H y > -1/2 \}

Proof. The solution space is convex if and only if the Hessian matrix \mathbf{H} of the energy function E(\tilde{s}) is positive definite. The Hessian of the energy given in Equation 4 can be written as \mathbf{H} = 2I + F, where F is a band matrix with extra bands at distance n_1, n_1 n_2, ... etc. from the diagonal. The matrix F is composed by addition of submatrices F_i originating from each point of the image. If we remove the rows and columns containing only zeroes, we find a 2D × 2D matrix of the form:

F_i = \begin{pmatrix} H_i & -H_i^{|} \\ -H_i^{-} & H_i^{|-} \end{pmatrix}

where the superscript | means reverse order of columns, the superscript - means reverse order of rows, and H_i is the Hessian matrix of f in the i-th point. The 2D eigenvalues of this matrix are D zeroes and D eigenvalues corresponding to twice the eigenvalues of H_i. We search a class \mathcal{H} of Hessian matrices H_i such that

\forall i : H_i \in \mathcal{H} \;\Rightarrow\; \forall x : x^T \mathbf{H} x > 0    (5)

Let x_i denote the sub-vector of x corresponding to the positions of F_i in F, such that

\forall i : H_i \in \mathcal{H} \;\Rightarrow\; \forall x : \sum_i x_i^T F_i x_i = x^T F x

If we split x_i into halves so that x_i = (x_{1i}, x_{2i}), we can write the criterion from (5) as

\sum_i (x_{1i} - x_{2i})^T H_i (x_{1i} - x_{2i}) > -2|x|^2

Using the formula \sum_i (x_i - x_{i+k})^2 \leq 4|x|^2, and noticing that the worst cases in the different dimensions are not mutually exclusive, we find that this is fulfilled if

\forall i \; \forall y_i \in \{ y \in IR^D \mid \forall j : |y_j| \leq 1 \} : y_i^T H_i y_i > -1/2

This leads to the following class \mathcal{H} of allowed Hessian matrices:

\mathcal{H} = \{ H \in IR^{D \times D} \mid \forall y \in \{ y \in IR^D \mid \|y\|_\infty \leq 1 \} : y^T H y > -1/2 \}

where H is a symmetric matrix.

In general H is a symmetric matrix, and can thereby be diagonalized into H = T^T L T, where L is a diagonal matrix with real eigenvalues \lambda_i. It should be noticed that this class \mathcal{H} is not rotationally symmetric, which is due to the rotationally asymmetrical sampling grid. This asymmetry makes it difficult, in general, to reformulate the criterion directly in terms of H and eliminate y. We can, though, in general find a sufficient and a necessary criterion of convexity.

Theorem 2. The energy function of Equation 4 is convex if all the eigenvalues of the Hessian of f in Equation 4 are larger than -1/(2D), where D is the dimensionality of \tilde{s}, and is non-convex if one of the eigenvalues is smaller than -1/2.

Proof. The criterion of convexity in Lemma 1 is weakened by letting the domain of y, where the constraint on the Hessian shall be fulfilled, be shrunk to {y : |y| ≤ 1}. Because the Hessian is diagonalizable with real eigenvalues and with eigenvectors forming an orthonormal basis, H = T^T L T, we have

\mathcal{H} \subseteq \{ H \in IR^{D \times D} \mid \forall y, |y| \leq 1 : y^T T^T L T y > -1/2 \}
            = \{ H \in IR^{D \times D} \mid \forall x, |x| \leq 1 : x^T L x > -1/2 \}

where the substitution x = T y is used. This relaxed criterion is obviously violated if one of the eigenvalues of the Hessian is less than or equal to -1/2.

The criterion of convexity in Lemma 1 is tightened by letting the domain of y, where the constraint on the Hessian must be fulfilled, be enlarged to {y : |y| ≤ √D}, where D is the dimensionality of \tilde{s}. Because the Hessian is diagonalizable with real eigenvalues and with eigenvectors forming an orthonormal basis, H = T^T L T, we have that

\mathcal{H} \supseteq \{ H \in IR^{D \times D} \mid \forall y, |y| \leq \sqrt{D} : y^T T^T L T y > -1/2 \}
            = \{ H \in IR^{D \times D} \mid \forall x, |x| \leq \sqrt{D} : x^T L x > -1/2 \}
            = \{ H \in IR^{D \times D} \mid \forall x, |x| \leq 1 : x^T L x > -1/(2D) \}

where the substitution x = T y is used. This sharpened criterion is fulfilled if all linear combinations of the eigenvalues with total weight 1 are larger than -1/(2D). This is obviously only fulfilled if all the eigenvalues of the Hessian are larger than -1/(2D).

The above lemmas show that moving into higher dimensionality cannot balance a non-convexity from a lower dimensionality to obtain total convexity. In one dimension the only eigenvalue of the 1×1 Hessian shall be larger than -1/2 to ensure convexity. In 2D both eigenvalues shall be larger than -1/2 to have the possibility of convexity, while additional constraints on the interaction of the dimensions must also be fulfilled to ensure convexity. The result is that the dimensions cannot balance each other in gaining convexity, but might help each other in constructing non-convexities.

Appendix B

In this appendix two related methods to approximate a Hamiltonian to gain convexity are proposed. The first operates on the smoothness term directly, while the second operates on the underlying probability distribution of the derivative(s).

Smoothness focusing

The lower bound on the second derivative can be reached by convolution of the smoothness function f with a Gaussian of adequate standard deviation, if the smoothness function differs from one yielding a convex Hamiltonian by only a Lebesgue-integrable function.

Theorem 3. Let b be any constant, * denote convolution, and G(x; σ) be the Gaussian in x of standard deviation σ. For any function f which, for some ε > 0, can be written

\forall x \in IR : f(x) = g(x) + h(x),  \quad  g''(x) \geq -b + \epsilon,  \quad  \int_{IR} |h(x)| dx = A < \infty

we have

\sigma > \left( \frac{2 A e^{-3/2}}{\epsilon \sqrt{2\pi}} \right)^{1/3}  \;\Rightarrow\;  \forall x : \frac{\partial^2}{\partial x^2}\big(f(x) * G(x; \sigma)\big) > -b

Proof. We have

\frac{\partial^2}{\partial x^2}(G_\sigma * f) = \frac{\partial^2}{\partial x^2}(G_\sigma * g) + \int_{IR} \frac{\partial^2 G}{\partial x^2}(k; \sigma)\, h(x - k)\, dk
  \geq -b + \epsilon - \int_{IR} |h(k)| \sup_{i \in IR} \left| \frac{\partial^2 G}{\partial x^2}(x - i; \sigma) \right| dk
  = -b + \epsilon - \frac{2 e^{-3/2}}{\sqrt{2\pi}\, \sigma^3} A

This is a lower bound on the second derivative of the convolution. This bound should be greater than -b to prove the theorem:

-b + \epsilon - \frac{2 e^{-3/2}}{\sqrt{2\pi}\, \sigma^3} A > -b  \;\Leftrightarrow\;  \sigma > \left( \frac{2 A e^{-3/2}}{\epsilon \sqrt{2\pi}} \right)^{1/3}    (6)

This means that the second derivative of f * G_σ will be larger than -b after convolution with a Gaussian of standard deviation larger than the above stated quantity.

We have now proven that any function which can be described as a function g, whose second derivative is bounded downwards, plus a Lebesgue-integrable function h, can be bounded downwards in the second derivative arbitrarily close to the bound on the second derivative of g. As an example, we can mention the weak membrane, which can be described as a constant function plus a negative parabola in a limited region. We want to limit the second derivative to

be larger than -1/2. For generality we express this as -b. The smoothness function f(x) corresponding to the weak string is f(x) = g(x) + h(x), where

g(x) = \lambda T^2  \quad \text{and} \quad  h(x) = \begin{cases} \lambda (x^2 - T^2) & \text{if } |x| < T \\ 0 & \text{otherwise} \end{cases}

The Lebesgue integral A of |h(x)| yields

A = \int_{IR} |h(x)| dx = \lambda \int_{-T}^{T} (T^2 - x^2) dx = \frac{4 \lambda T^3}{3}

This and Equation 6 (noting that ε = b because g''(x) = 0) result in the following standard deviation of the Gaussian to ensure that the energy functional is convex:

\sigma > 0.72 \left( \frac{\lambda}{b} \right)^{1/3} T \approx 0.91\, \lambda^{1/3}\, T  \quad  (\text{for } b = 1/2)

It should be mentioned that this bound is a conservative measure, and in practice smaller values of σ might yield a convex energy function. Actually, in practice we find the limit for the weak membrane of first-order regularization to be 30% lower in the example used earlier in this paper.

The above-mentioned limit on σ is conservative. Actually, it is so conservative that some functions where A is infinite (h is not Lebesgue integrable) nevertheless have a finite bound on σ. An example is periodic functions. By a Fourier series expansion followed by a scale-space extension of f, one can see that any periodic function can be limited downwards by any negative value of the second derivative.

Lemma 2. Any periodic function f(x) = f(x + l), which can be expressed as a Fourier series F(u), can be limited to have a second derivative larger than any negative limit by scale-space extension.

Proof. The Fourier series expansion can be described as

f(x) = \sum_{u=0}^{\infty} F(u)\, e^{i u \omega_0 x}

where \omega_0 = 2\pi / l. The scale-space extension of the Fourier series yields

f(x; \sigma) = \sum_{u=0}^{\infty} F(u)\, e^{i u \omega_0 x}\, e^{-u^2 \omega_0^2 \sigma^2 / 2}

The second derivative of this is

\frac{\partial^2 f(x; \sigma)}{\partial x^2} = -\sum_{u=1}^{\infty} u^2 \omega_0^2\, F(u)\, e^{i u \omega_0 x}\, e^{-u^2 \omega_0^2 \sigma^2 / 2}

As every term of the sum decays exponentially as a function of σ, the second derivative can be limited to any interval around 0 by scale-space extension with an adequate σ.

In computer vision we might use the angle of a surface normal as a parameter to the smoothness function. Such an angular description is periodic, and thereby we can limit the second derivative to any negative limit by scale-space extension, if the function can be expressed as a Fourier series. The above result for periodic functions is not of practical importance if we use derivatives as the basis of the smoothness measure.
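The conservative bound can be compared with a direct numerical computation. The sketch below is ours, with λ = T = 1 as in the experiments; it convolves the weak-string smoothness function with a Gaussian and measures the minimum second derivative of the result.

    import numpy as np

    def min_d2_smoothed_weak_string(sigma, lam=1.0, T=1.0, dx=0.005, x_max=15.0):
        """Minimum second derivative of G_sigma * f for f(x) = lam * min(x^2, T^2)."""
        x = np.arange(-x_max, x_max + dx, dx)
        f = lam * np.minimum(x ** 2, T ** 2)
        k = np.arange(-int(6 * sigma / dx), int(6 * sigma / dx) + 1) * dx
        g = np.exp(-k ** 2 / (2 * sigma ** 2))
        g /= g.sum()
        f_sigma = np.convolve(f, g, mode="same")
        d2 = np.diff(f_sigma, 2) / dx ** 2
        core = slice(len(k), len(d2) - len(k))    # discard the convolution's boundary region
        return d2[core].min()

    lam, T = 1.0, 1.0
    sigma_bound = 0.91 * lam ** (1.0 / 3.0) * T   # conservative bound quoted above
    print(min_d2_smoothed_weak_string(sigma_bound))        # comfortably above -1/2
    print(min_d2_smoothed_weak_string(0.3 * sigma_bound))  # smaller sigma: drops below -1/2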

Probability focusing

Reconstruction can be formulated as Maximum A Posteriori (MAP) estimation. When the probability of observing a derivative is independent of the other observations, we find that MAP estimation leads to reconstruction where the smoothness term is the minus-log-probability function. We have that

f(x) = -\log p(x)

where p(x) is the density of x. Also the scale-space extension of the probability function directly leads to a convex solution space, if the distribution has a finite standard deviation.

Theorem 4. The Gaussian convolution of a density function with finite standard deviation leads to a minus-log-probability function whose second derivative can be bounded downwards by zero, if the standard deviation of the Gaussian is larger than the standard deviation of the density function.

Proof. Given a probability distribution p(x), we know that

\forall x : p(x) > 0,  \quad  1 = \int_{-\infty}^{\infty} p(x) dx,  \quad  2\sigma^2 > \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - y)^2 p(x) p(y)\, dx\, dy

The first two conditions state that p is a probability distribution, while the third is a reformulation of the standard deviation being less than σ. Let G_σ denote the Gaussian of zero mean and standard deviation σ and let

f_\sigma = -\log(p * G_\sigma)    (7)

be the minus-log-probability of p convolved with a Gaussian. The criterion for f_σ being convex is

\forall x : \frac{\partial^2 f_\sigma}{\partial x^2} > 0
\;\Leftrightarrow\; \forall x : \frac{\partial^2}{\partial x^2} \log(G_\sigma * p) < 0
\;\Leftrightarrow\; \forall x : \left[ \left( \frac{x^2 - \sigma^2}{\sigma^4} G_\sigma \right) * p \right] (G_\sigma * p) - \left[ \left( \frac{-x}{\sigma^2} G_\sigma \right) * p \right]^2 < 0

By expressing the convolution products as integrals and substituting t = x/σ this yields

\forall k : \int G(k - t) p(t) dt \int (t^2 - 1) G(k - t) p(t) dt - \left( \int t\, G(k - t) p(t) dt \right)^2 < 0

where the integrals are taken over the real axis and no index on the Gauss functions means standard deviation 1. We substitute t = s in the rightmost integral, yielding

\forall k : \int_t \int_s (s^2 - 1) G(k - t) G(k - s) p(t) p(s)\, ds\, dt - \int_t \int_s t s\, G(k - t) G(k - s) p(t) p(s)\, ds\, dt < 0
\;\Leftrightarrow\; \forall k : \int_t \int_s (s^2 - t s - 1) G(k - t) G(k - s) p(t) p(s)\, ds\, dt < 0

Noticing that the last factor is symmetric in s and t, we can add the symmetric term, and the inequality will still be valid:

\forall k : \int_t \int_s [(s^2 - ts - 1) + (t^2 - st - 1)]\, G(k - t) G(k - s) p(t) p(s)\, ds\, dt < 0
\;\Leftrightarrow\; \forall k : \int_t \int_s [(s - t)^2 - 2]\, G(k - t) G(k - s) p(t) p(s)\, ds\, dt < 0

This last expression says that, no matter which mean value a Gaussian of standard deviation 1 has, multiplication with a density of standard deviation less than 1 yields a function whose standard deviation is less than 1. This is obviously true, as the Gaussian is less than 1 for every x, and the resulting minimized second-order moment will be smaller.

This proves that we can guarantee that the second derivative of the smoothness function is larger than zero, and not just larger than a negative limit, implying that this is actually a stronger result than what is needed to guarantee a GNC.
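Probability Focusing can likewise be tried numerically. The sketch below is ours; the Lorentzian prior with λ = 4 is a hypothetical example whose density has finite standard deviation well below the chosen σ. It smooths the Gibbs distribution of the gradient with a Gaussian and checks the curvature of the resulting minus-log-probability over the central range.

    import numpy as np

    def pf_smoothness(f_values, x, sigma):
        """Probability Focusing: f_sigma = -log(G_sigma * exp(-f))."""
        dx = x[1] - x[0]
        p = np.exp(-f_values)                                  # Gibbs distribution of the gradient
        k = np.arange(-int(6 * sigma / dx), int(6 * sigma / dx) + 1) * dx
        g = np.exp(-k ** 2 / (2 * sigma ** 2)) * dx / (sigma * np.sqrt(2.0 * np.pi))
        return -np.log(np.convolve(p, g, mode="same"))

    lam = 4.0                                                  # Lorentzian prior; non-convex on its own since lam >= 2
    x = np.linspace(-30.0, 30.0, 12001)
    f = lam * np.log1p(x ** 2)
    f_sigma = pf_smoothness(f, x, sigma=3.0)                   # sigma far above the prior's standard deviation
    d2 = np.diff(f_sigma, 2)[3600:-3600] / (x[1] - x[0]) ** 2  # central range, away from boundary effects
    print(d2.min())    # positive here, in line with Theorem 4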

Appendix C

In this appendix the Mean Field Annealing of the weak membrane, as proposed by Geiger and Girosi [10] and by Bilbro et al. [11], is analysed. In this case the smoothness term is

f_\sigma(x) = \lambda T^2 - \sigma \log\left(1 + e^{\lambda(T^2 - x^2)/\sigma}\right)

We will prove that this has a lower bound on the second derivative which does not vanish when the relaxation parameter σ is increased towards infinity, and that the positions of the minima move towards plus and minus infinity when σ is increased towards infinity.

Lemma 3. The third derivative of f_σ is zero when x = 0 or when the equation

\frac{3\sigma}{2\tanh(y/2)} - \sigma y + \lambda T^2 = 0

is fulfilled, where y = \lambda(T^2 - x^2)/\sigma. This gives solutions in x which are proportional to √σ for high σ and correspond to ±T for low σ.

Proof. Let q = e^{\lambda(T^2 - x^2)/\sigma}, so that f_σ(x) = \lambda T^2 - \sigma\log(1 + q). We find

f_\sigma'(x) = \frac{2\lambda x\, q}{1 + q}

f_\sigma''(x) = \frac{2\lambda q}{1 + q} - \frac{4\lambda^2 x^2 q}{\sigma(1 + q)^2}    (8)

f_\sigma'''(x) = -\frac{12\lambda^2 x\, q}{\sigma(1 + q)^2} + \frac{8\lambda^3 x^3 q (1 - q)}{\sigma^2 (1 + q)^3}

The third derivative is evidently zero if x = 0 or (by division of the third derivative by 4\lambda^2 x q / (\sigma(1+q)^3))

3\sigma(1 + q) + 2\lambda x^2 (q - 1) = 0  \;\Leftrightarrow\;  2\lambda x^2 = \frac{3\sigma(1 + q)}{1 - q}    (9)

Using the above substitution of y we find

\sigma y = \frac{3\sigma}{2} \cdot \frac{e^y + 1}{e^y - 1} + \lambda T^2

which leads to

\frac{3\sigma}{2\tanh(y/2)} - \sigma y + \lambda T^2 = 0    (10)

This is not easily solvable analytically. The solution in y depends on λT²/σ but not on x. The function h(y) = \frac{3}{2\tanh(y/2)} - y consists of two decreasing branches, so that h_-(y) : IR_- → IR and h_+(y) : IR_+ → IR are two invertible functions. This means that Equation 10 always has one negative and one positive root. Furthermore, these solutions are monotonic functions of λT²/σ.

Let us denote by y_a the solution of the above equation when σ/(λT²) = a. When a is increased towards infinity the last term of the equation vanishes, and we numerically find the limiting solution y_∞, where the negative solution is the one that has a meaning in terms of x. When a is decreased towards zero from above, two solutions still exist: y ≈ -3a and y ≈ 1/a + 3/2. Still, the negative solution is the meaningful one in terms of x. When the solution is known in y it can be found in x as

x = \sqrt{T^2 - \frac{\sigma y}{\lambda}}

In the case of high temperature (i.e. large a), we find x ≈ \sqrt{\sigma |y_\infty| / \lambda}, proportional to √σ. In the case of low temperature (i.e. vanishing a) we find x ≈ T\sqrt{1 + 3a^2}, which tends to T. We now know the positions of the minima of the second derivative of f_σ, and will calculate the value in order to prove the lack of convexity of the MFA.

Theorem 5. There exists an x such that the second derivative of f_σ(x) is smaller than or equal to kλ, where

k = \frac{4 y_\infty q_\infty + 2 q_\infty (1 + q_\infty)}{(1 + q_\infty)^2} \approx -0.6

y_∞ is the negative limiting solution defined in the previous lemma, and q_∞ = e^{y_\infty}.

Proof. By substituting y and q into the second derivative of f_σ in Equation 8,

f_\sigma''(x) = \frac{\lambda\big(4 y q + 2 q (1 + q)\big)}{(1 + q)^2} - \frac{4\lambda^2 T^2 q}{\sigma (1 + q)^2}

If we, no matter the actual value of a, use y = y_∞, which always results in two real solutions in x,

f_\sigma''(x) = \frac{\lambda\big(4 y_\infty q_\infty + 2 q_\infty (1 + q_\infty)\big)}{(1 + q_\infty)^2} - \frac{4\lambda^2 T^2 q_\infty}{\sigma (1 + q_\infty)^2} = k\lambda - \frac{4\lambda^2 T^2 q_\infty}{\sigma (1 + q_\infty)^2}

This is always smaller than or equal to kλ because σ > 0.

We have now proven that there always exists an x such that the second derivative of the mean field approximation of the smoothness term of the weak string is smaller than kλ, where k ≈ -0.6. Along with the conditions for convexity of the solution space, we find that the solution space will never be convex if λ > -1/(2k).


More information

1. Method 1: bisection. The bisection methods starts from two points a 0 and b 0 such that

1. Method 1: bisection. The bisection methods starts from two points a 0 and b 0 such that Chapter 4 Nonlinear equations 4.1 Root finding Consider the problem of solving any nonlinear relation g(x) = h(x) in the real variable x. We rephrase this problem as one of finding the zero (root) of a

More information

= w 2. w 1. B j. A j. C + j1j2

= w 2. w 1. B j. A j. C + j1j2 Local Minima and Plateaus in Multilayer Neural Networks Kenji Fukumizu and Shun-ichi Amari Brain Science Institute, RIKEN Hirosawa 2-, Wako, Saitama 35-098, Japan E-mail: ffuku, amarig@brain.riken.go.jp

More information

HOW TO MAKE ELEMENTARY GEOMETRY MORE ROBUST AND THUS, MORE PRACTICAL: GENERAL ALGORITHMS. O. Kosheleva. 1. Formulation of the Problem

HOW TO MAKE ELEMENTARY GEOMETRY MORE ROBUST AND THUS, MORE PRACTICAL: GENERAL ALGORITHMS. O. Kosheleva. 1. Formulation of the Problem Ìàòåìàòè åñêèå ñòðóêòóðû è ìîäåëèðîâàíèå 2014, âûï. XX, ñ. 1?? ÓÄÊ 000.000 HOW TO MAKE ELEMENTARY GEOMETRY MORE ROBUST AND THUS, MORE PRACTICAL: GENERAL ALGORITHMS O. Kosheleva Many results of elementary

More information

How to Pop a Deep PDA Matters

How to Pop a Deep PDA Matters How to Pop a Deep PDA Matters Peter Leupold Department of Mathematics, Faculty of Science Kyoto Sangyo University Kyoto 603-8555, Japan email:leupold@cc.kyoto-su.ac.jp Abstract Deep PDA are push-down automata

More information

Linearly-solvable Markov decision problems

Linearly-solvable Markov decision problems Advances in Neural Information Processing Systems 2 Linearly-solvable Markov decision problems Emanuel Todorov Department of Cognitive Science University of California San Diego todorov@cogsci.ucsd.edu

More information

MODELLING OF FLEXIBLE MECHANICAL SYSTEMS THROUGH APPROXIMATED EIGENFUNCTIONS L. Menini A. Tornambe L. Zaccarian Dip. Informatica, Sistemi e Produzione

MODELLING OF FLEXIBLE MECHANICAL SYSTEMS THROUGH APPROXIMATED EIGENFUNCTIONS L. Menini A. Tornambe L. Zaccarian Dip. Informatica, Sistemi e Produzione MODELLING OF FLEXIBLE MECHANICAL SYSTEMS THROUGH APPROXIMATED EIGENFUNCTIONS L. Menini A. Tornambe L. Zaccarian Dip. Informatica, Sistemi e Produzione, Univ. di Roma Tor Vergata, via di Tor Vergata 11,

More information

Multi-Robotic Systems

Multi-Robotic Systems CHAPTER 9 Multi-Robotic Systems The topic of multi-robotic systems is quite popular now. It is believed that such systems can have the following benefits: Improved performance ( winning by numbers ) Distributed

More information

Contents. 2.1 Vectors in R n. Linear Algebra (part 2) : Vector Spaces (by Evan Dummit, 2017, v. 2.50) 2 Vector Spaces

Contents. 2.1 Vectors in R n. Linear Algebra (part 2) : Vector Spaces (by Evan Dummit, 2017, v. 2.50) 2 Vector Spaces Linear Algebra (part 2) : Vector Spaces (by Evan Dummit, 2017, v 250) Contents 2 Vector Spaces 1 21 Vectors in R n 1 22 The Formal Denition of a Vector Space 4 23 Subspaces 6 24 Linear Combinations and

More information

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2. APPENDIX A Background Mathematics A. Linear Algebra A.. Vector algebra Let x denote the n-dimensional column vector with components 0 x x 2 B C @. A x n Definition 6 (scalar product). The scalar product

More information

2 Garrett: `A Good Spectral Theorem' 1. von Neumann algebras, density theorem The commutant of a subring S of a ring R is S 0 = fr 2 R : rs = sr; 8s 2

2 Garrett: `A Good Spectral Theorem' 1. von Neumann algebras, density theorem The commutant of a subring S of a ring R is S 0 = fr 2 R : rs = sr; 8s 2 1 A Good Spectral Theorem c1996, Paul Garrett, garrett@math.umn.edu version February 12, 1996 1 Measurable Hilbert bundles Measurable Banach bundles Direct integrals of Hilbert spaces Trivializing Hilbert

More information

Unconstrained minimization of smooth functions

Unconstrained minimization of smooth functions Unconstrained minimization of smooth functions We want to solve min x R N f(x), where f is convex. In this section, we will assume that f is differentiable (so its gradient exists at every point), and

More information

October 7, :8 WSPC/WS-IJWMIP paper. Polynomial functions are renable

October 7, :8 WSPC/WS-IJWMIP paper. Polynomial functions are renable International Journal of Wavelets, Multiresolution and Information Processing c World Scientic Publishing Company Polynomial functions are renable Henning Thielemann Institut für Informatik Martin-Luther-Universität

More information

Notes on Iterated Expectations Stephen Morris February 2002

Notes on Iterated Expectations Stephen Morris February 2002 Notes on Iterated Expectations Stephen Morris February 2002 1. Introduction Consider the following sequence of numbers. Individual 1's expectation of random variable X; individual 2's expectation of individual

More information

An exploration of matrix equilibration

An exploration of matrix equilibration An exploration of matrix equilibration Paul Liu Abstract We review three algorithms that scale the innity-norm of each row and column in a matrix to. The rst algorithm applies to unsymmetric matrices,

More information

Slide a window along the input arc sequence S. Least-squares estimate. σ 2. σ Estimate 1. Statistically test the difference between θ 1 and θ 2

Slide a window along the input arc sequence S. Least-squares estimate. σ 2. σ Estimate 1. Statistically test the difference between θ 1 and θ 2 Corner Detection 2D Image Features Corners are important two dimensional features. Two dimensional image features are interesting local structures. They include junctions of dierent types Slide 3 They

More information

R. Schaback. numerical method is proposed which rst minimizes each f j separately. and then applies a penalty strategy to gradually force the

R. Schaback. numerical method is proposed which rst minimizes each f j separately. and then applies a penalty strategy to gradually force the A Multi{Parameter Method for Nonlinear Least{Squares Approximation R Schaback Abstract P For discrete nonlinear least-squares approximation problems f 2 (x)! min for m smooth functions f : IR n! IR a m

More information

Bayesian Paradigm. Maximum A Posteriori Estimation

Bayesian Paradigm. Maximum A Posteriori Estimation Bayesian Paradigm Maximum A Posteriori Estimation Simple acquisition model noise + degradation Constraint minimization or Equivalent formulation Constraint minimization Lagrangian (unconstraint minimization)

More information

Representation and Learning of. Klas Nordberg Gosta Granlund Hans Knutsson

Representation and Learning of. Klas Nordberg Gosta Granlund Hans Knutsson Representation and Learning of Invariance Klas Nordberg Gosta Granlund Hans Knutsson LiTH-ISY-R-55 994-0- Representation and Learning of Invariance Klas Nordberg Gosta Granlund Hans Knutsson Computer Vision

More information

Learning with Ensembles: How. over-tting can be useful. Anders Krogh Copenhagen, Denmark. Abstract

Learning with Ensembles: How. over-tting can be useful. Anders Krogh Copenhagen, Denmark. Abstract Published in: Advances in Neural Information Processing Systems 8, D S Touretzky, M C Mozer, and M E Hasselmo (eds.), MIT Press, Cambridge, MA, pages 190-196, 1996. Learning with Ensembles: How over-tting

More information

4.1 Eigenvalues, Eigenvectors, and The Characteristic Polynomial

4.1 Eigenvalues, Eigenvectors, and The Characteristic Polynomial Linear Algebra (part 4): Eigenvalues, Diagonalization, and the Jordan Form (by Evan Dummit, 27, v ) Contents 4 Eigenvalues, Diagonalization, and the Jordan Canonical Form 4 Eigenvalues, Eigenvectors, and

More information

Separation of Variables in Linear PDE: One-Dimensional Problems

Separation of Variables in Linear PDE: One-Dimensional Problems Separation of Variables in Linear PDE: One-Dimensional Problems Now we apply the theory of Hilbert spaces to linear differential equations with partial derivatives (PDE). We start with a particular example,

More information

Boxlets: a Fast Convolution Algorithm for. Signal Processing and Neural Networks. Patrice Y. Simard, Leon Bottou, Patrick Haner and Yann LeCun

Boxlets: a Fast Convolution Algorithm for. Signal Processing and Neural Networks. Patrice Y. Simard, Leon Bottou, Patrick Haner and Yann LeCun Boxlets: a Fast Convolution Algorithm for Signal Processing and Neural Networks Patrice Y. Simard, Leon Bottou, Patrick Haner and Yann LeCun AT&T Labs-Research 100 Schultz Drive, Red Bank, NJ 07701-7033

More information

Lecture 6 Positive Definite Matrices

Lecture 6 Positive Definite Matrices Linear Algebra Lecture 6 Positive Definite Matrices Prof. Chun-Hung Liu Dept. of Electrical and Computer Engineering National Chiao Tung University Spring 2017 2017/6/8 Lecture 6: Positive Definite Matrices

More information

LECTURE 15 + C+F. = A 11 x 1x1 +2A 12 x 1x2 + A 22 x 2x2 + B 1 x 1 + B 2 x 2. xi y 2 = ~y 2 (x 1 ;x 2 ) x 2 = ~x 2 (y 1 ;y 2 1

LECTURE 15 + C+F. = A 11 x 1x1 +2A 12 x 1x2 + A 22 x 2x2 + B 1 x 1 + B 2 x 2. xi y 2 = ~y 2 (x 1 ;x 2 ) x 2 = ~x 2 (y 1 ;y 2  1 LECTURE 5 Characteristics and the Classication of Second Order Linear PDEs Let us now consider the case of a general second order linear PDE in two variables; (5.) where (5.) 0 P i;j A ij xix j + P i,

More information

and Dagpunar [1988]) general methods for discrete distributions include two table methods (i.e. inversion by sequential or table-aided search and the

and Dagpunar [1988]) general methods for discrete distributions include two table methods (i.e. inversion by sequential or table-aided search and the Rejection-Inversion to Generate Variates from Monotone Discrete Distributions W. H ORMANN and G. DERFLINGER University of Economics and Business Administration Vienna Department of Statistics, Augasse

More information

Hilbert Spaces. Hilbert space is a vector space with some extra structure. We start with formal (axiomatic) definition of a vector space.

Hilbert Spaces. Hilbert space is a vector space with some extra structure. We start with formal (axiomatic) definition of a vector space. Hilbert Spaces Hilbert space is a vector space with some extra structure. We start with formal (axiomatic) definition of a vector space. Vector Space. Vector space, ν, over the field of complex numbers,

More information

No. of dimensions 1. No. of centers

No. of dimensions 1. No. of centers Contents 8.6 Course of dimensionality............................ 15 8.7 Computational aspects of linear estimators.................. 15 8.7.1 Diagonalization of circulant andblock-circulant matrices......

More information

Dynamical Systems. August 13, 2013

Dynamical Systems. August 13, 2013 Dynamical Systems Joshua Wilde, revised by Isabel Tecu, Takeshi Suzuki and María José Boccardi August 13, 2013 Dynamical Systems are systems, described by one or more equations, that evolve over time.

More information

Divisor matrices and magic sequences

Divisor matrices and magic sequences Discrete Mathematics 250 (2002) 125 135 www.elsevier.com/locate/disc Divisor matrices and magic sequences R.H. Jeurissen Mathematical Institute, University of Nijmegen, Toernooiveld, 6525 ED Nijmegen,

More information

Planning With Information States: A Survey Term Project for cs397sml Spring 2002

Planning With Information States: A Survey Term Project for cs397sml Spring 2002 Planning With Information States: A Survey Term Project for cs397sml Spring 2002 Jason O Kane jokane@uiuc.edu April 18, 2003 1 Introduction Classical planning generally depends on the assumption that the

More information

1) The line has a slope of ) The line passes through (2, 11) and. 6) r(x) = x + 4. From memory match each equation with its graph.

1) The line has a slope of ) The line passes through (2, 11) and. 6) r(x) = x + 4. From memory match each equation with its graph. Review Test 2 Math 1314 Name Write an equation of the line satisfying the given conditions. Write the answer in standard form. 1) The line has a slope of - 2 7 and contains the point (3, 1). Use the point-slope

More information

Fundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved

Fundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved Fundamentals of Linear Algebra Marcel B. Finan Arkansas Tech University c All Rights Reserved 2 PREFACE Linear algebra has evolved as a branch of mathematics with wide range of applications to the natural

More information

CHAPTER 4 PRINCIPAL COMPONENT ANALYSIS-BASED FUSION

CHAPTER 4 PRINCIPAL COMPONENT ANALYSIS-BASED FUSION 59 CHAPTER 4 PRINCIPAL COMPONENT ANALYSIS-BASED FUSION 4. INTRODUCTION Weighted average-based fusion algorithms are one of the widely used fusion methods for multi-sensor data integration. These methods

More information

below, kernel PCA Eigenvectors, and linear combinations thereof. For the cases where the pre-image does exist, we can provide a means of constructing

below, kernel PCA Eigenvectors, and linear combinations thereof. For the cases where the pre-image does exist, we can provide a means of constructing Kernel PCA Pattern Reconstruction via Approximate Pre-Images Bernhard Scholkopf, Sebastian Mika, Alex Smola, Gunnar Ratsch, & Klaus-Robert Muller GMD FIRST, Rudower Chaussee 5, 12489 Berlin, Germany fbs,

More information

Error Empirical error. Generalization error. Time (number of iteration)

Error Empirical error. Generalization error. Time (number of iteration) Submitted to Neural Networks. Dynamics of Batch Learning in Multilayer Networks { Overrealizability and Overtraining { Kenji Fukumizu The Institute of Physical and Chemical Research (RIKEN) E-mail: fuku@brain.riken.go.jp

More information

x 3y 2z = 6 1.2) 2x 4y 3z = 8 3x + 6y + 8z = 5 x + 3y 2z + 5t = 4 1.5) 2x + 8y z + 9t = 9 3x + 5y 12z + 17t = 7

x 3y 2z = 6 1.2) 2x 4y 3z = 8 3x + 6y + 8z = 5 x + 3y 2z + 5t = 4 1.5) 2x + 8y z + 9t = 9 3x + 5y 12z + 17t = 7 Linear Algebra and its Applications-Lab 1 1) Use Gaussian elimination to solve the following systems x 1 + x 2 2x 3 + 4x 4 = 5 1.1) 2x 1 + 2x 2 3x 3 + x 4 = 3 3x 1 + 3x 2 4x 3 2x 4 = 1 x + y + 2z = 4 1.4)

More information

Price Competition and Endogenous Valuation in Search Advertising

Price Competition and Endogenous Valuation in Search Advertising Price Competition and Endogenous Valuation in Search Advertising Lizhen Xu Jianqing Chen Andrew Whinston Web Appendix A Heterogeneous Consumer Valuation In the baseline model, we assumed that consumers

More information

Optimal Rejuvenation for. Tolerating Soft Failures. Andras Pfening, Sachin Garg, Antonio Puliato, Miklos Telek, Kishor S. Trivedi.

Optimal Rejuvenation for. Tolerating Soft Failures. Andras Pfening, Sachin Garg, Antonio Puliato, Miklos Telek, Kishor S. Trivedi. Optimal Rejuvenation for Tolerating Soft Failures Andras Pfening, Sachin Garg, Antonio Puliato, Miklos Telek, Kishor S. Trivedi Abstract In the paper we address the problem of determining the optimal time

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 254 Part V

More information

Super-resolution via Convex Programming

Super-resolution via Convex Programming Super-resolution via Convex Programming Carlos Fernandez-Granda (Joint work with Emmanuel Candès) Structure and Randomness in System Identication and Learning, IPAM 1/17/2013 1/17/2013 1 / 44 Index 1 Motivation

More information

1 Introduction It will be convenient to use the inx operators a b and a b to stand for maximum (least upper bound) and minimum (greatest lower bound)

1 Introduction It will be convenient to use the inx operators a b and a b to stand for maximum (least upper bound) and minimum (greatest lower bound) Cycle times and xed points of min-max functions Jeremy Gunawardena, Department of Computer Science, Stanford University, Stanford, CA 94305, USA. jeremy@cs.stanford.edu October 11, 1993 to appear in the

More information

Vector Spaces. Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms.

Vector Spaces. Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms. Vector Spaces Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms. For each two vectors a, b ν there exists a summation procedure: a +

More information

Reproducing Kernel Hilbert Spaces

Reproducing Kernel Hilbert Spaces 9.520: Statistical Learning Theory and Applications February 10th, 2010 Reproducing Kernel Hilbert Spaces Lecturer: Lorenzo Rosasco Scribe: Greg Durrett 1 Introduction In the previous two lectures, we

More information

2 Tikhonov Regularization and ERM

2 Tikhonov Regularization and ERM Introduction Here we discusses how a class of regularization methods originally designed to solve ill-posed inverse problems give rise to regularized learning algorithms. These algorithms are kernel methods

More information

Group Theory. 1. Show that Φ maps a conjugacy class of G into a conjugacy class of G.

Group Theory. 1. Show that Φ maps a conjugacy class of G into a conjugacy class of G. Group Theory Jan 2012 #6 Prove that if G is a nonabelian group, then G/Z(G) is not cyclic. Aug 2011 #9 (Jan 2010 #5) Prove that any group of order p 2 is an abelian group. Jan 2012 #7 G is nonabelian nite

More information

Course Notes: Week 1

Course Notes: Week 1 Course Notes: Week 1 Math 270C: Applied Numerical Linear Algebra 1 Lecture 1: Introduction (3/28/11) We will focus on iterative methods for solving linear systems of equations (and some discussion of eigenvalues

More information

Monte Carlo Methods for Statistical Inference: Variance Reduction Techniques

Monte Carlo Methods for Statistical Inference: Variance Reduction Techniques Monte Carlo Methods for Statistical Inference: Variance Reduction Techniques Hung Chen hchen@math.ntu.edu.tw Department of Mathematics National Taiwan University 3rd March 2004 Meet at NS 104 On Wednesday

More information

Nader H. Bshouty Lisa Higham Jolanta Warpechowska-Gruca. Canada. (

Nader H. Bshouty Lisa Higham Jolanta Warpechowska-Gruca. Canada. ( Meeting Times of Random Walks on Graphs Nader H. Bshouty Lisa Higham Jolanta Warpechowska-Gruca Computer Science University of Calgary Calgary, AB, T2N 1N4 Canada (e-mail: fbshouty,higham,jolantag@cpsc.ucalgary.ca)

More information

Upper and Lower Bounds on the Number of Faults. a System Can Withstand Without Repairs. Cambridge, MA 02139

Upper and Lower Bounds on the Number of Faults. a System Can Withstand Without Repairs. Cambridge, MA 02139 Upper and Lower Bounds on the Number of Faults a System Can Withstand Without Repairs Michel Goemans y Nancy Lynch z Isaac Saias x Laboratory for Computer Science Massachusetts Institute of Technology

More information

, b = 0. (2) 1 2 The eigenvectors of A corresponding to the eigenvalues λ 1 = 1, λ 2 = 3 are

, b = 0. (2) 1 2 The eigenvectors of A corresponding to the eigenvalues λ 1 = 1, λ 2 = 3 are Quadratic forms We consider the quadratic function f : R 2 R defined by f(x) = 2 xt Ax b T x with x = (x, x 2 ) T, () where A R 2 2 is symmetric and b R 2. We will see that, depending on the eigenvalues

More information

Minimum and maximum values *

Minimum and maximum values * OpenStax-CNX module: m17417 1 Minimum and maximum values * Sunil Kumar Singh This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 2.0 In general context, a

More information

RKHS, Mercer s theorem, Unbounded domains, Frames and Wavelets Class 22, 2004 Tomaso Poggio and Sayan Mukherjee

RKHS, Mercer s theorem, Unbounded domains, Frames and Wavelets Class 22, 2004 Tomaso Poggio and Sayan Mukherjee RKHS, Mercer s theorem, Unbounded domains, Frames and Wavelets 9.520 Class 22, 2004 Tomaso Poggio and Sayan Mukherjee About this class Goal To introduce an alternate perspective of RKHS via integral operators

More information

Vector Space Basics. 1 Abstract Vector Spaces. 1. (commutativity of vector addition) u + v = v + u. 2. (associativity of vector addition)

Vector Space Basics. 1 Abstract Vector Spaces. 1. (commutativity of vector addition) u + v = v + u. 2. (associativity of vector addition) Vector Space Basics (Remark: these notes are highly formal and may be a useful reference to some students however I am also posting Ray Heitmann's notes to Canvas for students interested in a direct computational

More information

On Coarse Geometry and Coarse Embeddability

On Coarse Geometry and Coarse Embeddability On Coarse Geometry and Coarse Embeddability Ilmari Kangasniemi August 10, 2016 Master's Thesis University of Helsinki Faculty of Science Department of Mathematics and Statistics Supervised by Erik Elfving

More information

The Great Wall of David Shin

The Great Wall of David Shin The Great Wall of David Shin Tiankai Liu 115 June 015 On 9 May 010, David Shin posed the following puzzle in a Facebook note: Problem 1. You're blindfolded, disoriented, and standing one mile from the

More information

Edges and Scale. Image Features. Detecting edges. Origin of Edges. Solution: smooth first. Effects of noise

Edges and Scale. Image Features. Detecting edges. Origin of Edges. Solution: smooth first. Effects of noise Edges and Scale Image Features From Sandlot Science Slides revised from S. Seitz, R. Szeliski, S. Lazebnik, etc. Origin of Edges surface normal discontinuity depth discontinuity surface color discontinuity

More information

THE LUBRICATION APPROXIMATION FOR THIN VISCOUS FILMS: REGULARITY AND LONG TIME BEHAVIOR OF WEAK SOLUTIONS A.L. BERTOZZI AND M. PUGH.

THE LUBRICATION APPROXIMATION FOR THIN VISCOUS FILMS: REGULARITY AND LONG TIME BEHAVIOR OF WEAK SOLUTIONS A.L. BERTOZZI AND M. PUGH. THE LUBRICATION APPROXIMATION FOR THIN VISCOUS FILMS: REGULARITY AND LONG TIME BEHAVIOR OF WEAK SOLUTIONS A.L. BERTOI AND M. PUGH April 1994 Abstract. We consider the fourth order degenerate diusion equation

More information

Gaussian Processes for Regression. Carl Edward Rasmussen. Department of Computer Science. Toronto, ONT, M5S 1A4, Canada.

Gaussian Processes for Regression. Carl Edward Rasmussen. Department of Computer Science. Toronto, ONT, M5S 1A4, Canada. In Advances in Neural Information Processing Systems 8 eds. D. S. Touretzky, M. C. Mozer, M. E. Hasselmo, MIT Press, 1996. Gaussian Processes for Regression Christopher K. I. Williams Neural Computing

More information

CS 781 Lecture 9 March 10, 2011 Topics: Local Search and Optimization Metropolis Algorithm Greedy Optimization Hopfield Networks Max Cut Problem Nash

CS 781 Lecture 9 March 10, 2011 Topics: Local Search and Optimization Metropolis Algorithm Greedy Optimization Hopfield Networks Max Cut Problem Nash CS 781 Lecture 9 March 10, 2011 Topics: Local Search and Optimization Metropolis Algorithm Greedy Optimization Hopfield Networks Max Cut Problem Nash Equilibrium Price of Stability Coping With NP-Hardness

More information

A MULTIGRID ALGORITHM FOR. Richard E. Ewing and Jian Shen. Institute for Scientic Computation. Texas A&M University. College Station, Texas SUMMARY

A MULTIGRID ALGORITHM FOR. Richard E. Ewing and Jian Shen. Institute for Scientic Computation. Texas A&M University. College Station, Texas SUMMARY A MULTIGRID ALGORITHM FOR THE CELL-CENTERED FINITE DIFFERENCE SCHEME Richard E. Ewing and Jian Shen Institute for Scientic Computation Texas A&M University College Station, Texas SUMMARY In this article,

More information

1/sqrt(B) convergence 1/B convergence B

1/sqrt(B) convergence 1/B convergence B The Error Coding Method and PICTs Gareth James and Trevor Hastie Department of Statistics, Stanford University March 29, 1998 Abstract A new family of plug-in classication techniques has recently been

More information

Optimization of Quadratic Forms: NP Hard Problems : Neural Networks

Optimization of Quadratic Forms: NP Hard Problems : Neural Networks 1 Optimization of Quadratic Forms: NP Hard Problems : Neural Networks Garimella Rama Murthy, Associate Professor, International Institute of Information Technology, Gachibowli, HYDERABAD, AP, INDIA ABSTRACT

More information

THEODORE VORONOV DIFFERENTIABLE MANIFOLDS. Fall Last updated: November 26, (Under construction.)

THEODORE VORONOV DIFFERENTIABLE MANIFOLDS. Fall Last updated: November 26, (Under construction.) 4 Vector fields Last updated: November 26, 2009. (Under construction.) 4.1 Tangent vectors as derivations After we have introduced topological notions, we can come back to analysis on manifolds. Let M

More information

Towards a Mathematical Theory of Super-resolution

Towards a Mathematical Theory of Super-resolution Towards a Mathematical Theory of Super-resolution Carlos Fernandez-Granda www.stanford.edu/~cfgranda/ Information Theory Forum, Information Systems Laboratory, Stanford 10/18/2013 Acknowledgements This

More information

Notes on Time Series Modeling

Notes on Time Series Modeling Notes on Time Series Modeling Garey Ramey University of California, San Diego January 17 1 Stationary processes De nition A stochastic process is any set of random variables y t indexed by t T : fy t g

More information

Lecture 5. 1 Chung-Fuchs Theorem. Tel Aviv University Spring 2011

Lecture 5. 1 Chung-Fuchs Theorem. Tel Aviv University Spring 2011 Random Walks and Brownian Motion Tel Aviv University Spring 20 Instructor: Ron Peled Lecture 5 Lecture date: Feb 28, 20 Scribe: Yishai Kohn In today's lecture we return to the Chung-Fuchs theorem regarding

More information

Linear Diffusion and Image Processing. Outline

Linear Diffusion and Image Processing. Outline Outline Linear Diffusion and Image Processing Fourier Transform Convolution Image Restoration: Linear Filtering Diffusion Processes for Noise Filtering linear scale space theory Gauss-Laplace pyramid for

More information

Contents. 6 Systems of First-Order Linear Dierential Equations. 6.1 General Theory of (First-Order) Linear Systems

Contents. 6 Systems of First-Order Linear Dierential Equations. 6.1 General Theory of (First-Order) Linear Systems Dierential Equations (part 3): Systems of First-Order Dierential Equations (by Evan Dummit, 26, v 2) Contents 6 Systems of First-Order Linear Dierential Equations 6 General Theory of (First-Order) Linear

More information

THE SINGULAR VALUE DECOMPOSITION MARKUS GRASMAIR

THE SINGULAR VALUE DECOMPOSITION MARKUS GRASMAIR THE SINGULAR VALUE DECOMPOSITION MARKUS GRASMAIR 1. Definition Existence Theorem 1. Assume that A R m n. Then there exist orthogonal matrices U R m m V R n n, values σ 1 σ 2... σ p 0 with p = min{m, n},

More information

IE 5531: Engineering Optimization I

IE 5531: Engineering Optimization I IE 5531: Engineering Optimization I Lecture 15: Nonlinear optimization Prof. John Gunnar Carlsson November 1, 2010 Prof. John Gunnar Carlsson IE 5531: Engineering Optimization I November 1, 2010 1 / 24

More information

6 The Fourier transform

6 The Fourier transform 6 The Fourier transform In this presentation we assume that the reader is already familiar with the Fourier transform. This means that we will not make a complete overview of its properties and applications.

More information

Iterative procedure for multidimesional Euler equations Abstracts A numerical iterative scheme is suggested to solve the Euler equations in two and th

Iterative procedure for multidimesional Euler equations Abstracts A numerical iterative scheme is suggested to solve the Euler equations in two and th Iterative procedure for multidimensional Euler equations W. Dreyer, M. Kunik, K. Sabelfeld, N. Simonov, and K. Wilmanski Weierstra Institute for Applied Analysis and Stochastics Mohrenstra e 39, 07 Berlin,

More information

1. Introduction The nonlinear complementarity problem (NCP) is to nd a point x 2 IR n such that hx; F (x)i = ; x 2 IR n + ; F (x) 2 IRn + ; where F is

1. Introduction The nonlinear complementarity problem (NCP) is to nd a point x 2 IR n such that hx; F (x)i = ; x 2 IR n + ; F (x) 2 IRn + ; where F is New NCP-Functions and Their Properties 3 by Christian Kanzow y, Nobuo Yamashita z and Masao Fukushima z y University of Hamburg, Institute of Applied Mathematics, Bundesstrasse 55, D-2146 Hamburg, Germany,

More information

The Best Circulant Preconditioners for Hermitian Toeplitz Systems II: The Multiple-Zero Case Raymond H. Chan Michael K. Ng y Andy M. Yip z Abstract In

The Best Circulant Preconditioners for Hermitian Toeplitz Systems II: The Multiple-Zero Case Raymond H. Chan Michael K. Ng y Andy M. Yip z Abstract In The Best Circulant Preconditioners for Hermitian Toeplitz Systems II: The Multiple-ero Case Raymond H. Chan Michael K. Ng y Andy M. Yip z Abstract In [0, 4], circulant-type preconditioners have been proposed

More information

where (E) is the partition function of the uniform ensemble. Recalling that we have (E) = E (E) (E) i = ij x (E) j E = ij ln (E) E = k ij ~ S E = kt i

where (E) is the partition function of the uniform ensemble. Recalling that we have (E) = E (E) (E) i = ij x (E) j E = ij ln (E) E = k ij ~ S E = kt i G25.265: Statistical Mechanics Notes for Lecture 4 I. THE CLASSICAL VIRIAL THEOREM (MICROCANONICAL DERIVATION) Consider a system with Hamiltonian H(x). Let x i and x j be specic components of the phase

More information

A Stable Finite Dierence Ansatz for Higher Order Dierentiation of Non-Exact. Data. Bob Anderssen and Frank de Hoog,

A Stable Finite Dierence Ansatz for Higher Order Dierentiation of Non-Exact. Data. Bob Anderssen and Frank de Hoog, A Stable Finite Dierence Ansatz for Higher Order Dierentiation of Non-Exact Data Bob Anderssen and Frank de Hoog, CSIRO Division of Mathematics and Statistics, GPO Box 1965, Canberra, ACT 2601, Australia

More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

1 Introduction to information theory

1 Introduction to information theory 1 Introduction to information theory 1.1 Introduction In this chapter we present some of the basic concepts of information theory. The situations we have in mind involve the exchange of information through

More information