(In practice, Householder transformations are used instead of Gram-Schmidt.)
QR Decomposition
When solving an overdetermined system by projection (that is, finding a least squares solution), the following method is often used:
1. Factorize $A = QR$ with $R$ upper triangular and $Q$ orthogonal, i.e. $Q^T Q = 1$.
2. Compute $y = Q^T b$.
3. Solve $Rx = y$ by back substitution, ignoring the rows that do not belong to columns of the original $A$.
$Q$ can be obtained by applying Gram-Schmidt orthogonalization to the columns of $A$ and extending to an orthonormal basis of $\mathbb{R}^n$. $R$ holds the coefficients of the Gram-Schmidt process.
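The three steps can be sketched in plain Python. This is a minimal illustration, not production code: it uses the modified Gram-Schmidt variant, builds only the columns of $Q$ that come from $A$ (so the ignored rows never appear), and assumes that the columns of $A$ are linearly independent. The function names `gram_schmidt_qr` and `least_squares` and the sample data are made up for this example.

```python
import math

def gram_schmidt_qr(A):
    """Reduced QR factorization of an m x n matrix with independent columns.

    Returns (Q, R): Q is a list of n orthonormal columns of length m,
    R is the n x n upper triangular matrix of Gram-Schmidt coefficients.
    """
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    Q = []
    R = [[0.0] * n for _ in range(n)]
    for j, v in enumerate(cols):
        w = list(v)
        for i, q in enumerate(Q):
            # Modified Gram-Schmidt: project the running remainder w.
            R[i][j] = sum(qk * wk for qk, wk in zip(q, w))
            w = [wk - R[i][j] * qk for qk, wk in zip(q, w)]
        R[j][j] = math.sqrt(sum(wk * wk for wk in w))
        Q.append([wk / R[j][j] for wk in w])
    return Q, R

def least_squares(A, b):
    """Least squares solution of A x = b via A = QR."""
    Q, R = gram_schmidt_qr(A)
    n = len(R)
    # y = Q^T b, restricted to the columns that came from A.
    y = [sum(qk * bk for qk, bk in zip(q, b)) for q in Q]
    # Back substitution for R x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(R[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / R[i][i]
    return x

# Fit a line c0 + c1*t through the points (0, 1), (1, 2), (2, 2), (3, 4).
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 2.0, 2.0, 4.0]
x = least_squares(A, b)
print(x)
```

For this data the normal equations give $c_0 = c_1 = 0.9$, which the sketch reproduces up to rounding.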
Eigenvalues
Computing the characteristic polynomial as a determinant is a very unstable process. Instead, eigenvalues are computed by transforming the matrix:
1. The matrix is converted by orthogonal transformations to almost upper triangular form (upper Hessenberg form).
2. The matrix is transformed to upper triangular form. The eigenvalues are the diagonal entries.
This process can be performed by the LAPACK routine sgeev/dgeev.
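To illustrate the second stage, here is a small sketch of the unshifted QR iteration $A \leftarrow RQ$. Using this iteration is an assumption for illustration only; LAPACK relies on a more refined, shifted variant. Each step is the orthogonal similarity transformation $Q^T A Q$, so the eigenvalues are preserved. The sample matrix is symmetric and tridiagonal, hence already in Hessenberg form, so the first stage is skipped. The helper names are hypothetical.

```python
import math

def qr(A):
    """QR factorization of a nonsingular square matrix (modified Gram-Schmidt)."""
    n = len(A)
    Q = [[0.0] * n for _ in range(n)]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        w = [A[i][j] for i in range(n)]
        for k in range(j):
            R[k][j] = sum(Q[i][k] * w[i] for i in range(n))
            w = [w[i] - R[k][j] * Q[i][k] for i in range(n)]
        R[j][j] = math.sqrt(sum(x * x for x in w))
        for i in range(n):
            Q[i][j] = w[i] / R[j][j]
    return Q, R

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def qr_iteration(A, steps=100):
    """Repeat A <- R Q; every step equals the similarity transform Q^T A Q."""
    for _ in range(steps):
        Q, R = qr(A)
        A = matmul(R, Q)
    return A

# Symmetric tridiagonal test matrix with eigenvalues 3 - sqrt(3), 3, 3 + sqrt(3).
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
T = qr_iteration(A)
eigenvalues = sorted(T[i][i] for i in range(3))
print(eigenvalues)
```

The fixed number of steps is a simplification; a real implementation would monitor the subdiagonal entries and use shifts to speed up convergence.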
Nonlinear equations
We are given a function $f : \mathbb{R} \to \mathbb{R}$ and want to find (one or all) $z$ with $f(z) = 0$.
Typically, methods work by iteration: starting at a point $x_0$, they iteratively approximate a zero $z$. If there are several zeroes, it might be necessary to work with several start values.
The three main methods are:
- Bisection
- Newton's method (using tangents)
- Secant method
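As an illustration of the first method, bisection can be sketched as follows. The sketch assumes that $f$ is continuous and changes sign on the start interval $[a, b]$; the function name `bisect` and the tolerance defaults are chosen for this example only.

```python
def bisect(f, a, b, tol=1e-12, max_iter=200):
    """Approximate a zero of f in [a, b], assuming f(a) and f(b) differ in sign."""
    fa, fb = f(a), f(b)
    if fa == 0.0:
        return a
    if fb == 0.0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        mid = 0.5 * (a + b)
        fm = f(mid)
        # Stop on an exact zero or once the interval is short enough.
        if fm == 0.0 or 0.5 * (b - a) < tol:
            return mid
        if fa * fm < 0:
            b, fb = mid, fm   # the sign change is in the left half
        else:
            a, fa = mid, fm   # the sign change is in the right half
    return 0.5 * (a + b)

root = bisect(lambda x: x * x - 2.0, 1.0, 2.0)
print(root)
```

Each step halves the interval, so the method gains roughly one binary digit per iteration and cannot diverge once a sign change is bracketed.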
In general, problems are:
- How to select good start values.
- How to enforce convergence for bad start values.
- How long to iterate.
Quadratic, Cubic, Quartic
We've seen the formula for the solutions of a quadratic equation. Similar formulas exist for equations of degree 3 and 4, but they are numerically unstable. Furthermore, one can show (this is done in an abstract algebra course) that there cannot be such a formula for polynomials of higher degree.
Newton's method
We have that
$$0 = f(z) \approx f(x) + f'(x)(z - x).$$
Solving for $z$ gives the iteration (replace $x$ with the zero of the tangent line):
$$x \leftarrow x - \frac{f(x)}{f'(x)}$$
This method converges if $x_0$ is chosen close enough to $z$ (and $f'$ has no zeroes in the interval; in particular, $z$ is not a double zero of $f$).
If we let $e_k = x_k - z$ denote the error, we obtain:
$$e_{k+1} = x_{k+1} - z = x_k - z - \frac{f(x_k)}{f'(x_k)} = -\frac{f(x_k) - f'(x_k)\,e_k}{f'(x_k)} = \frac{1}{2}\,\frac{f''(\xi_k)}{f'(x_k)}\,e_k^2$$
for some $\xi_k$ in the interval (Taylor approximation for $0 = f(z)$ by a degree 1 polynomial around $x_k$, with remainder term $\frac{1}{2} f''(\xi_k)\,e_k^2$).
As $x_k \to z$ we get approximately
$$e_{k+1} \approx \frac{1}{2}\,\frac{f''(z)}{f'(z)}\,e_k^2,$$
i.e. each step doubles the number of correct digits.
Problem: Bad (or no) convergence if $f'(z) = 0$.
As a stop criterion check:
- Change in step width smaller than some tolerance.
- Given upper limit for the number of iterations.
Generalizations of Newton's method exist for multidimensional systems.
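The iteration together with both stop criteria can be sketched as follows. The function name `newton`, its parameters, and the default tolerance are assumptions made for this example; the derivative is passed in explicitly as `df`.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton iteration x <- x - f(x)/f'(x) starting at x0.

    Stops when the step width drops below tol, or gives up after
    max_iter iterations. Returns (approximate zero, iterations used).
    """
    x = x0
    for k in range(1, max_iter + 1):
        dfx = df(x)
        if dfx == 0.0:
            raise ZeroDivisionError("f'(x) = 0: the tangent line has no zero")
        step = f(x) / dfx
        x -= step
        if abs(step) < tol:          # stop criterion 1: small step width
            return x, k
    # stop criterion 2: upper limit for the number of iterations
    raise RuntimeError("no convergence within max_iter iterations")

root, iterations = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)
print(root, iterations)
```

For $f(x) = x^2 - 2$ and $x_0 = 1$ the errors shrink roughly like $0.09,\ 2 \cdot 10^{-3},\ 2 \cdot 10^{-6},\ 2 \cdot 10^{-12}$, which shows the doubling of correct digits per step.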
Systems of polynomial equations
Consider a system of polynomial equations in several variables:
$$f_1(x_1, \ldots, x_n) = 0$$
$$f_2(x_1, \ldots, x_n) = 0$$
$$\vdots$$
$$f_m(x_1, \ldots, x_n) = 0$$
To solve this system we want to eliminate variables in a similar way as when solving a system of linear equations.
Problem: How to eliminate $x_i y$ versus $yz$? (Monomials in several variables come with no obvious order.)
Convention: For $x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$ write $x^\alpha$.
Gröbner basis approach
We define an ordering (lex ordering) on monomials: $x^\alpha < x^\beta$ if $\alpha < \beta$ lexicographically. (More generally, one can define admissible orderings. One main variant compares the total degrees first.)
This way, we identify in every polynomial $p$ a leading term $\mathrm{lt}(p)$.
If $S = \{p_1, \ldots, p_m\}$ is a set of polynomials, we say that a polynomial $f$ reduces at $S$ if $q = \mathrm{lt}(p_i) \cdot r$ for a monomial $q$ in $f$, some monomial $r$ and some $i$.
The reduction of $f$ at $S$ is the polynomial obtained by subtracting multiples of the $p_i$ until no leading term divides any longer.
Note: In this process the monomials in $f$ become smaller, so the process can have only finitely many steps.
S-polynomial
To define some measure of reduction, we define for two polynomials $p, q$ their S-polynomial as
$$S(p, q) = \frac{l}{\mathrm{lt}(p)}\,p - \frac{l}{\mathrm{lt}(q)}\,q, \quad \text{where } l = \mathrm{lcm}(\mathrm{lt}(p), \mathrm{lt}(q)).$$
Observation 1: Common zeroes of $p$ and $q$ are zeroes of $S(p, q)$.
Observation 2: We can also reduce the S-polynomial at $p$ and $q$ and get a smaller polynomial without losing common zeroes.
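The definition can be sketched in plain Python. The representation is an assumption made for this example: a polynomial in $x, y$ is a dictionary mapping exponent tuples to integer coefficients, so that Python's tuple comparison is exactly the lex ordering with $x > y$. To stay within integers, the sketch cross-multiplies by the leading coefficients, which agrees with $l/\mathrm{lt}(\cdot)$ up to a nonzero constant factor. The sample polynomials are $p = x^2y^3 + 3xy^4$ and $q = 3xy^4 + 2x^3y$.

```python
def leading_term(p):
    """Return (exponents, coefficient) of the lex-largest monomial of p."""
    e = max(p)                      # tuples compare lexicographically
    return e, p[e]

def times_term(p, coeff, shift):
    """Multiply p by the single term coeff * x^shift."""
    return {tuple(a + s for a, s in zip(e, shift)): c * coeff
            for e, c in p.items()}

def subtract(p, q):
    """Return p - q, dropping monomials whose coefficient becomes zero."""
    r = dict(p)
    for e, c in q.items():
        r[e] = r.get(e, 0) - c
        if r[e] == 0:
            del r[e]
    return r

def s_polynomial(p, q):
    (ep, cp), (eq, cq) = leading_term(p), leading_term(q)
    l = tuple(max(a, b) for a, b in zip(ep, eq))    # exponents of the lcm
    p_part = times_term(p, cq, tuple(a - b for a, b in zip(l, ep)))
    q_part = times_term(q, cp, tuple(a - b for a, b in zip(l, eq)))
    return subtract(p_part, q_part)                 # leading terms cancel

p = {(2, 3): 1, (1, 4): 3}          # x^2 y^3 + 3 x y^4
q = {(1, 4): 3, (3, 1): 2}          # 3 x y^4 + 2 x^3 y
S = s_polynomial(p, q)              # 2x * p - y^2 * q

# One reduction step at p: lt(S) = 6 x^2 y^4 = 6y * lt(p), so subtract 6y * p.
S_reduced = subtract(S, times_term(p, 6, (0, 1)))
print(S, S_reduced)
```

Here `S` represents $6x^2y^4 - 3xy^6$ and `S_reduced` represents $-3xy^6 - 18xy^5$.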
Example: $p = x^2y^3 + 3xy^4$, $q = 3xy^4 + 2x^3y$, $\mathrm{lt}(p) = x^2y^3$, $\mathrm{lt}(q) = 2x^3y$.
Then $\mathrm{lcm}(\mathrm{lt}(p), \mathrm{lt}(q)) = 2x^3y^3$ and
$$S(p, q) = 2x \cdot p - y^2 \cdot q = -3xy^6 + 6x^2y^4.$$
We can now reduce $S(p, q)$ at $p$ and get:
$$S(p, q) - 6y \cdot p = -3xy^6 - 18xy^5$$
Buchberger's Algorithm
Given a set $F$ of polynomials, we now iterate this process.
Require: $F = (f_1, \ldots, f_s)$.
Ensure: A set $G = (g_1, \ldots, g_t)$.
begin
  G := F;
  repeat
    G' := G;
    for every pair {p, q}, p ≠ q in G' do
      S := S(p, q);  (S-polynomial)
      S := $\overline{S}^{G'}$;  (reduction modulo G')
      if S ≠ 0 then G := G ∪ {S}; fi;
    end for
  until G = G';
end
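The loop above can be turned into a short runnable sketch. Several choices here are assumptions made for illustration: polynomials are dictionaries from exponent tuples to coefficients, the ordering is lex (Python's tuple comparison), the reduction only cancels leading terms, and the S-polynomial uses the monic lcm. The result is a Gröbner basis but not a reduced one. The sample input is the system $x^2 + y^2 + z^2 - 1$, $x^2 + y^2 - z$, $x - y$ with $x > y > z$.

```python
from fractions import Fraction
from itertools import combinations

def lead(p):
    """Exponents and coefficient of the lex-largest monomial of p."""
    e = max(p)
    return e, p[e]

def sub_multiple(p, q, coeff, shift):
    """Return p - coeff * x^shift * q, dropping zero coefficients."""
    r = dict(p)
    for e, c in q.items():
        key = tuple(a + s for a, s in zip(e, shift))
        val = r.get(key, 0) - coeff * c
        if val == 0:
            r.pop(key, None)
        else:
            r[key] = val
    return r

def reduce_mod(f, G):
    """Cancel the leading term of f against G until none of the lt(g) divides it."""
    f = dict(f)
    changed = True
    while f and changed:
        changed = False
        ef, cf = lead(f)
        for g in G:
            eg, cg = lead(g)
            if all(a <= b for a, b in zip(eg, ef)):      # lt(g) divides lt(f)
                shift = tuple(b - a for a, b in zip(eg, ef))
                f = sub_multiple(f, g, Fraction(cf) / cg, shift)
                changed = True
                break
    return f

def s_poly(p, q):
    (ep, cp), (eq, cq) = lead(p), lead(q)
    l = tuple(max(a, b) for a, b in zip(ep, eq))
    # 0 - (-1/cp) x^(l-ep) p  equals  (1/cp) x^(l-ep) p
    left = sub_multiple({}, p, Fraction(-1) / cp,
                        tuple(a - b for a, b in zip(l, ep)))
    return sub_multiple(left, q, Fraction(1) / cq,
                        tuple(a - b for a, b in zip(l, eq)))

def buchberger(F):
    G = [dict(f) for f in F]
    while True:
        G_old = list(G)                                  # G' := G
        for p, q in combinations(G_old, 2):
            S = reduce_mod(s_poly(p, q), G_old)
            if S:                                        # S != 0
                G.append(S)
        if len(G) == len(G_old):                         # until G = G'
            return G

# Exponent tuples are (deg x, deg y, deg z).
F = [{(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1, (0, 0, 0): -1},
     {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 1): -1},
     {(1, 0, 0): 1, (0, 1, 0): -1}]
G = buchberger(F)
print(len(G))
```

For this input the first pass adds the three remainders $z^2 + z - 1$, $2y^2 + z^2 - 1$ and $2y^2 - z$, and the second pass adds nothing, so the loop stops. Recomputing every pair in each pass is wasteful; practical implementations keep track of pairs that were already treated.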
Gröbner bases
The resulting set $G$ is called a Gröbner basis of $F$. (One can reduce terms against each other and this way get a reduced Gröbner basis.)
Observation: Common zeroes of polynomials in $F$ are common zeroes of polynomials in $G$.
Note: One might get different performance/results for a different ordering of variables.
Theorem: If one can obtain polynomials from $F$ that only involve the last variable, this process will find them. One can thus use a back-substitution approach to solve for common zeroes.
Example
Consider the equations
$$x^2 + y^2 + z^2 = 1, \quad x^2 + y^2 = z, \quad x = y;$$
or, equivalently, the set of polynomials
$$\{x^2 + y^2 + z^2 - 1,\ x^2 + y^2 - z,\ x - y\}.$$
The (reduced) Gröbner basis calculation in Maple proceeds as follows:
> with(Groebner);
> f:=[x^2+y^2+z^2-1,x^2+y^2-z,x-y];
> g:=gbasis(f,plex(x,y,z));
g := [2y^2 - z, z + z^2 - 1, x - y]
We now solve first for $z$, then for $y$ and $x$.
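The back-substitution for this basis can be written out as a short worked computation. It is restricted to real solutions, which is an assumption on top of the text.

```latex
\begin{align*}
z^2 + z - 1 = 0 \quad &\Longrightarrow \quad z = \frac{-1 \pm \sqrt{5}}{2}, \\
2y^2 - z = 0 \quad &\Longrightarrow \quad y = \pm\sqrt{z/2}
  \quad \text{(real only for } z = \tfrac{\sqrt{5}-1}{2} \approx 0.618\text{)}, \\
x - y = 0 \quad &\Longrightarrow \quad x = y = \pm\sqrt{\tfrac{\sqrt{5}-1}{4}} \approx \pm 0.556.
\end{align*}
```

Check: $x^2 + y^2 = 2y^2 = z \approx 0.618$ and $z^2 = 1 - z \approx 0.382$, so $x^2 + y^2 + z^2 = 1$.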
Application
Suppose we want to find the maximum value of the function
$$f(x, y, z) = x^3 + 2xyz - z^2$$
subject to the constraint (points on a sphere)
$$x^2 + y^2 + z^2 = 1.$$
By the method of Lagrange multipliers, we know that $\nabla f = \lambda \nabla g$ at a local maximum or minimum, where $g(x, y, z) = x^2 + y^2 + z^2$ is the constraint function. The three partial derivatives and the constraint give the equations:
$$3x^2 + 2yz = 2x\lambda$$
$$2xz = 2y\lambda$$
$$2xy - 2z = 2z\lambda$$
$$x^2 + y^2 + z^2 = 1$$
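We now compute a Gröbner basis for the ordering $z < y < x < \lambda$ and get:
$$z^7 - \frac{1763}{1152} z^5 + \frac{655}{1152} z^3 - \frac{11}{288} z,$$
$$-\frac{576}{59} z^6 + yz^3 + \frac{1605}{118} z^4 - yz - \frac{453}{118} z^2,$$
$$-\frac{6912}{3835} z^5 + y^2 z + \frac{827}{295} z^3 - \frac{3839}{3835} z,$$
$$-\frac{9216}{3835} z^5 + y^3 + yz^2 + \frac{906}{295} z^3 - y - \frac{2562}{3835} z,$$
$$-\frac{1152}{3835} z^5 + yz^2 - \frac{108}{295} z^3 + xz + \frac{2556}{3835} z,$$
$$-\frac{19584}{3835} z^5 + \frac{1999}{295} z^3 + xy - \frac{6403}{3835} z,$$
$$x^2 + y^2 + z^2 - 1,$$
$$\lambda - \frac{167616}{3835} z^6 + \frac{36717}{590} z^4 - \frac{3}{2} yz - \frac{134419}{7670} z^2 - \frac{3}{2} x$$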
Solving for $z$ yields:
$$z = 0,\ \pm 1,\ \pm\frac{2}{3},\ \pm\frac{\sqrt{11}}{8\sqrt{2}}$$
and from this one can solve, for each $z$-value, for the corresponding $x$ and $y$ values and finally test for maxima/minima.
Observation: This process can be done in an exact way or even using variables as coefficients.
There are many issues with making this process effective, for example using different orderings.
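The $z$-values can be checked exactly with rational arithmetic. After factoring out $z$, the univariate basis polynomial depends only on $w = z^2$, so the sketch below evaluates it at the squares of the nonzero solutions. The helper name `h` is made up for this example.

```python
from fractions import Fraction as Fr

def h(w):
    """z^6 - 1763/1152 z^4 + 655/1152 z^2 - 11/288, written in w = z^2."""
    return w**3 - Fr(1763, 1152) * w**2 + Fr(655, 1152) * w - Fr(11, 288)

# Squares of z = ±1, ±2/3 and ±sqrt(11)/(8 sqrt(2)).
squares = [Fr(1), Fr(4, 9), Fr(11, 128)]
residuals = [h(w) for w in squares]
print(residuals)
```

All three residuals are exactly zero, which illustrates the observation that the whole computation can be carried out without rounding.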