LEAST SQUARES APPROXIMATION

One more approach to approximating a function f(x) on an interval a ≤ x ≤ b is to seek an approximation p(x) with a small average error over the interval of approximation. A convenient definition of the average error of the approximation is

\[
E(p; f) = \left\{ \frac{1}{b - a} \int_a^b \left[ f(x) - p(x) \right]^2 dx \right\}^{1/2}
\tag{1}
\]

This is also called the root-mean-square error (denoted subsequently by RMSE) in the approximation of f(x) by p(x). Note first that choosing p(x) to minimize E(p; f) is equivalent to minimizing

\[
\int_a^b \left[ f(x) - p(x) \right]^2 dx
\]

thus dispensing with the square root and the multiplying fraction (although the minimum values are generally different). The minimization of (1) is called the least squares approximation problem.
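As a quick numerical illustration of definition (1), E(p; f) can be approximated with any standard quadrature rule. The sketch below (the function names are our own) uses composite Simpson's rule to compute the RMSE of the linear Taylor approximation t_1(x) = 1 + x to e^x on [−1, 1].

```python
import math

def rmse(f, p, a, b, n=1000):
    """Approximate E(p; f) from (1) using composite Simpson's rule (n even)."""
    h = (b - a) / n
    s = 0.0
    for i in range(n + 1):
        x = a + i * h
        w = 1 if i in (0, n) else (4 if i % 2 else 2)  # Simpson weights
        s += w * (f(x) - p(x)) ** 2
    integral = s * h / 3
    return math.sqrt(integral / (b - a))

# RMSE of the linear Taylor approximation t1(x) = 1 + x to e^x on [-1, 1]
err = rmse(math.exp, lambda x: 1 + x, -1.0, 1.0)
print(round(err, 3))  # ≈ 0.246
```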
Example. Let f(x) = e^x and p(x) = α_0 + α_1 x, with α_0 and α_1 unknown. Approximate f(x) over [−1, 1]. Choose α_0, α_1 to minimize

\[
g(\alpha_0, \alpha_1) = \int_{-1}^{1} \left( e^x - \alpha_0 - \alpha_1 x \right)^2 dx
\tag{2}
\]

Integrating,

\[
g(\alpha_0, \alpha_1) = \int_{-1}^{1} \left( e^{2x} + \alpha_0^2 + \alpha_1^2 x^2 - 2\alpha_0 e^x - 2\alpha_1 x e^x + 2\alpha_0 \alpha_1 x \right) dx
\]

so that

\[
g(\alpha_0, \alpha_1) = c_1 \alpha_0^2 + c_2 \alpha_1^2 + c_3 \alpha_0 \alpha_1 + c_4 \alpha_0 + c_5 \alpha_1 + c_6
\]

with constants {c_1, ..., c_6}, e.g. c_1 = 2, c_6 = (e^2 − e^{−2})/2. Thus g is a quadratic polynomial in the two variables α_0, α_1. To find its minimum, solve the system

\[
\frac{\partial g}{\partial \alpha_0} = 0, \qquad \frac{\partial g}{\partial \alpha_1} = 0
\]
It is simpler to differentiate (2) directly, obtaining

\[
\frac{\partial g}{\partial \alpha_0} = \int_{-1}^{1} 2\left( e^x - \alpha_0 - \alpha_1 x \right)(-1) \, dx = 0
\]
\[
\frac{\partial g}{\partial \alpha_1} = \int_{-1}^{1} 2\left( e^x - \alpha_0 - \alpha_1 x \right)(-x) \, dx = 0
\]

This simplifies to

\[
2\alpha_0 = \int_{-1}^{1} e^x \, dx = e - e^{-1}, \qquad
\frac{2}{3}\alpha_1 = \int_{-1}^{1} x e^x \, dx = 2e^{-1}
\]

So

\[
\alpha_0 = \frac{e - e^{-1}}{2} \doteq 1.1752, \qquad \alpha_1 = 3e^{-1} \doteq 1.1036
\]
Using these values for α_0 and α_1, we denote the resulting linear approximation by

\[
l_1(x) = \alpha_0 + \alpha_1 x
\]

It is called the best linear approximation to e^x in the sense of least squares. For the maximum error,

\[
\max_{-1 \le x \le 1} \left| e^x - l_1(x) \right| \doteq 0.439
\]
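The closed-form coefficients and the maximum error are easy to check numerically; a minimal sketch (variable names are our own):

```python
import math

# Coefficients of the best linear least squares fit to e^x on [-1, 1]
alpha0 = (math.e - 1 / math.e) / 2   # = (e - 1/e)/2 ≈ 1.1752
alpha1 = 3 / math.e                  # = 3/e ≈ 1.1036

# Maximum error of l1(x) = alpha0 + alpha1*x over a fine grid on [-1, 1]
max_err = max(abs(math.exp(x) - (alpha0 + alpha1 * x))
              for x in (i / 1000 * 2 - 1 for i in range(1001)))
print(round(alpha0, 4), round(alpha1, 4), round(max_err, 3))  # 1.1752 1.1036 0.439
```

The maximum error occurs at the endpoint x = 1, where e − α_0 − α_1 ≈ 0.439.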
Errors in linear approximations of e^x on [−1, 1]:

Approximation           Max Error   RMSE
Taylor t_1(x)           0.718       0.246
Least squares l_1(x)    0.439       0.162
Chebyshev c_1(x)        0.372       0.184
Minimax m_1(x)          0.279       0.190

Figure: The linear least squares approximation to e^x
THE GENERAL CASE

Approximate f(x) on [a, b], and let n ≥ 0. Seek the polynomial

\[
p(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_n x^n
\]

that minimizes the RMSE. Write

\[
g(\alpha_0, \alpha_1, \ldots, \alpha_n) = \int_a^b \left[ f(x) - \alpha_0 - \alpha_1 x - \cdots - \alpha_n x^n \right]^2 dx
\]

and find coefficients α_0, α_1, ..., α_n to minimize this integral. The integral g(α_0, α_1, ..., α_n) is a quadratic polynomial in the n + 1 variables α_0, α_1, ..., α_n. To minimize it, invoke the conditions

\[
\frac{\partial g}{\partial \alpha_i} = 0, \qquad i = 0, 1, \ldots, n
\]

This yields a set of n + 1 equations that must be satisfied by a minimizing set α_0, α_1, ..., α_n for g. Manipulating this set of conditions leads to a simultaneous linear system.
To better understand the form of the linear system, consider the special case [a, b] = [0, 1]. Differentiating

\[
g(\alpha_0, \alpha_1, \ldots, \alpha_n) = \int_0^1 \left[ f(x) - \alpha_0 - \alpha_1 x - \cdots - \alpha_n x^n \right]^2 dx
\]

with respect to each α_i, we obtain

\[
\int_0^1 2\left( f(x) - \alpha_0 - \cdots - \alpha_n x^n \right)(-1) \, dx = 0
\]
\[
\int_0^1 2\left( f(x) - \alpha_0 - \cdots - \alpha_n x^n \right)(-x) \, dx = 0
\]
\[
\vdots
\]
\[
\int_0^1 2\left( f(x) - \alpha_0 - \cdots - \alpha_n x^n \right)(-x^n) \, dx = 0
\]
Then the linear system is

\[
\sum_{j=0}^{n} \frac{\alpha_j}{i + j + 1} = \int_0^1 x^i f(x) \, dx, \qquad i = 0, 1, \ldots, n
\]

We will study the solution of simultaneous linear systems in Chapter 6. There we will see that this linear system (whose coefficient matrix, with entries 1/(i + j + 1), is the Hilbert matrix) is ill-conditioned and is difficult to solve accurately, even for moderately sized values of n such as n = 5. As a consequence, this is not a good approach to solving for a minimizer of g(α_0, α_1, ..., α_n). For a better way to solve the least squares approximation problem, we need Legendre polynomials.
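The ill-conditioning can be seen directly by computing the condition number of the coefficient matrix. A sketch using exact rational arithmetic from Python's standard library (the helper functions are our own, and no pivoting is needed because the Hilbert matrix is positive definite):

```python
from fractions import Fraction

def hilbert(n):
    # Hilbert matrix: entry (i, j) is 1/(i + j + 1)
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]

def inverse(A):
    # Gauss-Jordan elimination in exact rational arithmetic
    n = len(A)
    M = [row[:] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for k in range(n):
        piv = M[k][k]
        M[k] = [v / piv for v in M[k]]
        for r in range(n):
            if r != k and M[r][k]:
                f = M[r][k]
                M[r] = [vr - f * vk for vr, vk in zip(M[r], M[k])]
    return [row[n:] for row in M]

def cond_inf(A):
    # infinity-norm condition number ||A|| * ||A^{-1}||
    norm = lambda B: max(sum(abs(v) for v in row) for row in B)
    return float(norm(A) * norm(inverse(A)))

for n in (3, 6):
    print(n, cond_inf(hilbert(n)))
```

Already for the 6×6 Hilbert matrix (degree n = 5) the condition number exceeds a million, so small errors in the right-hand side integrals are amplified enormously in the computed coefficients.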
LEGENDRE POLYNOMIALS

Define the Legendre polynomials as follows (for x ∈ [−1, 1]):

\[
P_n(x) = \frac{1}{2^n \, n!} \frac{d^n}{dx^n} \left[ \left( x^2 - 1 \right)^n \right], \qquad n = 0, 1, 2, \ldots
\]

For example,

\[
P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = \tfrac{1}{2}\left( 3x^2 - 1 \right),
\]
\[
P_3(x) = \tfrac{1}{2}\left( 5x^3 - 3x \right), \quad P_4(x) = \tfrac{1}{8}\left( 35x^4 - 30x^2 + 3 \right)
\]

The Legendre polynomials have many special properties, and they are widely used in numerical analysis and applied mathematics.
Figure: Legendre polynomials of degrees 1, 2, 3, 4
PROPERTIES

Degree and normalization:

\[
\deg P_n = n, \qquad P_n(1) = 1, \qquad n \ge 0
\]

Triple recursion relation:

\[
P_{n+1}(x) = \frac{2n + 1}{n + 1} \, x P_n(x) - \frac{n}{n + 1} P_{n-1}(x), \qquad n \ge 1
\]

with P_0(x) = 1, P_1(x) = x.

Orthogonality and size:

\[
(P_i, P_j) =
\begin{cases}
0, & i \ne j \\[4pt]
\dfrac{2}{2j + 1}, & i = j
\end{cases}
\]

Here we use the notation

\[
(f, g) = \int_{-1}^{1} f(x) g(x) \, dx
\]

for general functions f(x) and g(x).
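The triple recursion is the standard way to evaluate P_n(x) numerically, since it avoids both the Rodrigues derivative formula and explicit polynomial coefficients. A small sketch (the function name is our own):

```python
def legendre(n, x):
    """Evaluate the Legendre polynomial P_n(x) via the triple recursion."""
    p_prev, p = 1.0, x            # P_0(x) and P_1(x)
    if n == 0:
        return p_prev
    for k in range(1, n):
        # P_{k+1}(x) = ((2k+1) x P_k(x) - k P_{k-1}(x)) / (k+1)
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

print(legendre(4, 0.0))   # P_4(0) = 3/8 = 0.375
print(legendre(4, 1.0))   # P_n(1) = 1 for every n
```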
Zeros: all zeros of P_n(x) are located in [−1, 1], and all are simple roots of P_n(x).

Basis: every polynomial p(x) of degree ≤ n can be written in the form

\[
p(x) = \sum_{j=0}^{n} \beta_j P_j(x)
\]

with the choice of β_0, β_1, ..., β_n uniquely determined from p(x):

\[
\beta_j = \frac{(p, P_j)}{(P_j, P_j)}, \qquad j = 0, 1, \ldots, n
\]
FINDING THE LEAST SQUARES APPROXIMATION

Here we discuss the least squares approximation problem only on the interval [−1, 1]. Approximation problems on other intervals [a, b] can be handled using a linear change of variable. We seek to find a polynomial p(x) of degree ≤ n that minimizes

\[
\int_{-1}^{1} \left[ f(x) - p(x) \right]^2 dx
\]

This is equivalent to minimizing

\[
(f - p, f - p)
\tag{3}
\]

We begin by writing p(x) in the form

\[
p(x) = \sum_{j=0}^{n} \beta_j P_j(x)
\]
Substituting p(x) = Σ β_j P_j(x) into (3), we obtain

\[
g(\beta_0, \beta_1, \ldots, \beta_n) \equiv (f - p, f - p)
= \left( f - \sum_{j=0}^{n} \beta_j P_j, \; f - \sum_{i=0}^{n} \beta_i P_i \right)
\]
\[
= (f, f) + \sum_{j=0}^{n} \left[ \beta_j^2 (P_j, P_j) - 2\beta_j (f, P_j) \right]
\]

Completing the square in each β_j, rewrite this as

\[
g = (f, f) - \sum_{j=0}^{n} \frac{(f, P_j)^2}{(P_j, P_j)}
+ \sum_{j=0}^{n} (P_j, P_j) \left[ \beta_j - \frac{(f, P_j)}{(P_j, P_j)} \right]^2
\]

We see that g(β_0, β_1, ..., β_n) is smallest when

\[
\beta_j = \frac{(f, P_j)}{(P_j, P_j)}, \qquad j = 0, 1, \ldots, n
\]
For this choice of coefficients, the minimum is

\[
g = (f, f) - \sum_{j=0}^{n} \frac{(f, P_j)^2}{(P_j, P_j)}
\]

We call

\[
l_n(x) = \sum_{j=0}^{n} \frac{(f, P_j)}{(P_j, P_j)} P_j(x)
\tag{4}
\]

the least squares approximation of degree n to f(x) on [−1, 1]. If β_n = 0, then its actual degree is less than n.
Example. Approximate f(x) = e^x on [−1, 1]. We use (4) with n = 3:

\[
l_3(x) = \sum_{j=0}^{3} \beta_j P_j(x), \qquad \beta_j = \frac{(f, P_j)}{(P_j, P_j)}
\tag{5}
\]

The inner products (f, P_j) and the resulting coefficients β_j, using (P_j, P_j) = 2/(2j + 1), are as follows.

j           0         1         2         3
(f, P_j)    2.35040   0.73576   0.14313   0.02013
β_j         1.17520   1.10364   0.35781   0.07046

Using (5) and simplifying,

\[
l_3(x) = 0.996294 + 0.997955x + 0.536722x^2 + 0.176139x^3
\]
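The table entries can be reproduced by numerical quadrature, combining the triple recursion for P_j with (P_j, P_j) = 2/(2j + 1). A sketch using composite Simpson's rule (the helper names are our own):

```python
import math

def legendre(n, x):
    # Evaluate P_n(x) via the triple recursion
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def simpson(f, a, b, n=2000):
    # Composite Simpson's rule with n (even) subintervals
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

for j in range(4):
    ip = simpson(lambda x: math.exp(x) * legendre(j, x), -1.0, 1.0)  # (f, P_j)
    beta = ip / (2 / (2 * j + 1))       # beta_j = (f, P_j) / (P_j, P_j)
    print(j, round(ip, 5), round(beta, 5))
```

The exact values are available in closed form, e.g. (f, P_0) = e − e^{−1} and (f, P_1) = 2e^{−1}, and the quadrature reproduces them to well beyond the five digits shown.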
Figure: Error in the cubic least squares approximation to e^x (the error oscillates between roughly −0.0046 and 0.011 on [−1, 1])
The error in various cubic approximations of e^x on [−1, 1]:

Approximation           Max Error   RMSE
Taylor t_3(x)           0.0516      0.0145
Least squares l_3(x)    0.0112      0.00334
Chebyshev c_3(x)        0.00666     0.00384
Minimax m_3(x)          0.00553     0.00388