Chapter 4  Best Approximation

4.1 The General Case

In the previous chapter, we have seen how an interpolating polynomial can be used as an approximation to a given function. We now want to find the best approximation to a given function. This fundamental problem in Approximation Theory can be stated in very general terms. Let V be a Normed Linear Space and W a finite-dimensional subspace of V; then, for a given v ∈ V, find w* ∈ W such that

    ‖v − w*‖ ≤ ‖v − w‖,  for all w ∈ W.

Here w* is called the Best Approximation to v out of the subspace W. Note that the definition of V determines the particular norm to be used and, when using that norm, w* is the vector that is closest to v out of all possible vectors in W. In general, different norms lead to different approximations. In the context of Numerical Analysis, V is usually the set of continuous functions on some interval [a, b], with some selected norm, and W is usually the space of polynomials P_n. The requirement that W is finite-dimensional ensures that we have a basis for W.

The Least Squares Problem

Let f(x) be a given continuous function. Using the 2-norm

    ‖f(x)‖_2 = ( ∫_a^b f²(x) dx )^{1/2},

find p*(x) such that

    ‖f(x) − p*(x)‖_2 ≤ ‖f(x) − p(x)‖_2,
for all p(x) ∈ P_n, the polynomials of degree at most n, and x ∈ [a, b]. This is known as the Least Squares Problem. Best approximations with respect to the 2-norm are called least squares approximations.

4.2 Least Squares Approximation

In the above problem, how do we find p*(x)? The procedure is the same regardless of the subspace used, so let W be any finite-dimensional subspace of dimension (n + 1), with basis vectors φ_0(x), φ_1(x), ..., φ_n(x). Therefore, any member of W can be expressed as

    Ψ(x) = Σ_{i=0}^{n} c_i φ_i(x),  where c_i ∈ R.

The problem is to find the c_i such that ‖f − Ψ‖_2 is minimised. Define

    E(c_0, c_1, ..., c_n) = ∫_a^b (f(x) − Ψ(x))² dx.

We require the minimum of E(c_0, c_1, ..., c_n) over all values c_0, c_1, ..., c_n. A necessary condition for E to have a minimum is

    ∂E/∂c_i = 0 = −2 ∫_a^b (f − Ψ) (∂Ψ/∂c_i) dx = −2 ∫_a^b (f − Ψ) φ_i(x) dx.

This implies

    ∫_a^b f(x) φ_i(x) dx = ∫_a^b Ψ φ_i(x) dx,

or

    ∫_a^b f(x) φ_i(x) dx = Σ_{j=0}^{n} c_j ∫_a^b φ_j(x) φ_i(x) dx.

Hence, the c_i that minimise ‖f(x) − Ψ(x)‖_2 satisfy the system of equations given by

    ∫_a^b f(x) φ_i(x) dx = Σ_{j=0}^{n} c_j ∫_a^b φ_j(x) φ_i(x) dx,  for i = 0, 1, ..., n,    (4.1)

a total of (n + 1) equations in (n + 1) unknowns c_0, c_1, ..., c_n. These equations are often called the Normal Equations.
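As a concrete sketch (not part of the notes), the Normal Equations (4.1) can be assembled and solved numerically: the inner-product integrals are approximated here with a simple trapezoidal rule, and the basis functions are supplied as plain Python callables. The helper names are illustrative choices, not a standard API.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal-rule approximation of the integral of y over the grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def normal_equations_fit(f, basis, a, b, m=20001):
    """Solve the Normal Equations (4.1): find c such that
    sum_j c_j <phi_j, phi_i> = <f, phi_i> for each i = 0, ..., n."""
    x = np.linspace(a, b, m)
    Phi = [phi(x) for phi in basis]  # sampled basis functions
    G = np.array([[trapezoid(pi * pj, x) for pj in Phi] for pi in Phi])
    rhs = np.array([trapezoid(f(x) * pi, x) for pi in Phi])
    return np.linalg.solve(G, rhs)

# Least squares straight line to f(x) = x^2 on [0, 1]: solving (4.1)
# by hand gives c_0 = -1/6, c_1 = 1, i.e. p*(x) = x - 1/6.
c = normal_equations_fit(lambda x: x**2,
                         [lambda x: np.ones_like(x), lambda x: x],
                         0.0, 1.0)
print(c)  # approximately [-1/6, 1]
```

The same routine works for any basis of W; Section 4.3 explains why an orthogonal basis is the better choice in practice.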
Example 4.2.1 Using the Normal Equations (4.1), find the p(x) ∈ P_n that best fits, in a least squares sense, a general continuous function f(x) on the interval [0, 1]; i.e. find p*(x) such that

    ‖f(x) − p*(x)‖_2 ≤ ‖f(x) − p(x)‖_2,

for all p(x) ∈ P_n, polynomials of degree at most n, and x ∈ [0, 1].

Take the basis for P_n as φ_0 = 1, φ_1 = x, φ_2 = x², ..., φ_n = x^n. Then

    ∫_0^1 f(x) x^i dx = Σ_{j=0}^{n} c_j ∫_0^1 x^j x^i dx
                      = Σ_{j=0}^{n} c_j ∫_0^1 x^{i+j} dx
                      = Σ_{j=0}^{n} c_j [ x^{i+j+1} / (i+j+1) ]_0^1
                      = Σ_{j=0}^{n} c_j / (i+j+1).

Or, writing them out:

    i = 0:  ∫_0^1 f dx     = c_0 + c_1/2 + c_2/3 + ⋯ + c_n/(n+1)
    i = 1:  ∫_0^1 x f dx   = c_0/2 + c_1/3 + c_2/4 + ⋯ + c_n/(n+2)
    ...
    i = n:  ∫_0^1 x^n f dx = c_0/(n+1) + c_1/(n+2) + ⋯ + c_n/(2n+1).

Or, in matrix form:

    [ 1        1/2      ...  1/(n+1)  ] [ c_0 ]   [ ∫_0^1 f(x) dx     ]
    [ 1/2      1/3      ...  1/(n+2)  ] [ c_1 ]   [ ∫_0^1 x f(x) dx   ]
    [ ...      ...      ...  ...      ] [ ... ] = [ ...               ]
    [ 1/(n+1)  1/(n+2)  ...  1/(2n+1) ] [ c_n ]   [ ∫_0^1 x^n f(x) dx ]

Does anything look familiar? This is a system Hc = f where H is the Hilbert matrix. This is seriously bad news — this system is famously ILL-CONDITIONED! We will have to find a better way to find p*.
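A quick numerical check (a sketch, not part of the notes) shows how rapidly the conditioning of the Hilbert matrix deteriorates as n grows:

```python
import numpy as np

def hilbert(n):
    """The n-by-n Hilbert matrix H[i, j] = 1 / (i + j + 1), 0-indexed."""
    i = np.arange(n)
    return 1.0 / (i[:, None] + i[None, :] + 1)

# The 2-norm condition number grows roughly exponentially in n: even for
# modest n, solving Hc = f loses almost all accuracy in double precision.
for n in (3, 6, 9, 12):
    print(n, np.linalg.cond(hilbert(n)))
```

A condition number of about 10^k means roughly k decimal digits are lost, so by n ≈ 12 the solution of the Normal Equations in this basis is essentially meaningless in double precision.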
4.3 Orthogonal Functions

In general, it will be hard to solve the Normal Equations, as the Hilbert matrix is ill-conditioned. The previous example is an example of what not to do! Instead, using the same approach as before, choose (if possible) an orthogonal basis φ_i(x) such that

    ∫_a^b φ_i(x) φ_j(x) dx = 0,  i ≠ j.

In this case, the Normal Equations (4.1) reduce to

    ∫_a^b f(x) φ_i(x) dx = c_i ∫_a^b φ_i²(x) dx,  for i = 0, 1, ..., n,    (4.2)

and the coefficients c_i can be determined directly. Also, we can increase n without disturbing the earlier coefficients. Note that any orthogonal set with n elements is linearly independent and hence will always provide a basis for W, an n-dimensional space.

4.3.1 Generalisation of Least Squares

We can generalise the idea of least squares using inner product notation. Suppose we define ‖f‖_2² = ⟨f, f⟩, where ⟨·, ·⟩ is some inner product (e.g., we considered the case ⟨f, g⟩ = ∫_a^b f g dx in Chapter 1). Then the least squares best approximation is the Ψ(x) such that ‖f − Ψ‖_2 is minimised, i.e. we wish to minimise ⟨f − Ψ, f − Ψ⟩. Writing Ψ(x) = Σ_{i=0}^{n} c_i φ_i(x), where the φ_i ∈ P_n form a basis for P_n, and expressing orthogonality as ⟨φ_i, φ_j⟩ = 0 for i ≠ j, then choosing

    c_i = ⟨f(x), φ_i(x)⟩ / ⟨φ_i(x), φ_i(x)⟩

(c.f. equation (4.2)) guarantees that ‖f − Ψ‖_2 ≤ ‖f − p‖_2 for all p ∈ P_n. In other words, Ψ is the best approximation to f out of P_n. (See Tutorial Sheet 4, Question 1 for a derivation of this result.)

Example 4.3.1 Find the least squares, straight line approximation to x^{1/2} on [0, 1]; i.e., find the Ψ(x) ∈ P_1 that best fits x^{1/2} on [0, 1].
First choose an orthogonal basis for P_1: φ_0(x) = 1 and φ_1(x) = x − 1/2. These form an orthogonal basis for P_1 since

    ∫_0^1 φ_0 φ_1 dx = ∫_0^1 (x − 1/2) dx = [ x²/2 − x/2 ]_0^1 = 1/2 − 1/2 = 0.

Now construct Ψ = c_0 φ_0 + c_1 φ_1 = c_0 + c_1 (x − 1/2). To find the Ψ which satisfies ‖f − Ψ‖ ≤ ‖f − p‖, we solve for the c_i as follows.

i = 0:  c_0 = ⟨f, φ_0⟩ / ⟨φ_0, φ_0⟩

    ⟨f, φ_0⟩ = ⟨x^{1/2}, 1⟩ = ∫_0^1 x^{1/2} dx = [ (2/3) x^{3/2} ]_0^1 = 2/3
    ⟨φ_0, φ_0⟩ = ⟨1, 1⟩ = ∫_0^1 1 dx = 1
    ⟹ c_0 = 2/3

i = 1:  c_1 = ⟨f, φ_1⟩ / ⟨φ_1, φ_1⟩

    ⟨f, φ_1⟩ = ⟨x^{1/2}, x − 1/2⟩ = ∫_0^1 x^{1/2} (x − 1/2) dx = ∫_0^1 (x^{3/2} − x^{1/2}/2) dx
             = [ (2/5) x^{5/2} − (1/3) x^{3/2} ]_0^1 = 1/15
    ⟨φ_1, φ_1⟩ = ⟨x − 1/2, x − 1/2⟩ = ∫_0^1 (x − 1/2)² dx = ∫_0^1 (x² − x + 1/4) dx
             = [ x³/3 − x²/2 + x/4 ]_0^1 = 1/12
    ⟹ c_1 = 12/15 = 4/5

Hence, the least squares, straight line approximation to x^{1/2} on [0, 1] is

    Ψ(x) = 2/3 + (4/5)(x − 1/2) = (4/5) x + 4/15.

Example 4.3.2 Show that a truncated Fourier Series is a least squares approximation of f(x), for any f(x), on the interval [−π, π].

Choose W to be the (2n + 1)-dimensional space of functions spanned by the basis

    φ_0 = 1, φ_1 = cos x, φ_2 = sin x, φ_3 = cos 2x, φ_4 = sin 2x, ..., φ_{2n−1} = cos nx, φ_{2n} = sin nx.

This basis forms an orthogonal set of functions, e.g.

    ∫_{−π}^{π} φ_0 φ_1 dx = ∫_{−π}^{π} cos x dx = [ sin x ]_{−π}^{π} = 0,  etc.
Thus, a least squares approximation Ψ(x) of f(x) can be written

    Ψ(x) = c_0 + c_1 cos x + c_2 sin x + ⋯ + c_{2n−1} cos nx + c_{2n} sin nx,

with the c_i given by

    c_0 = ⟨f, φ_0⟩ / ⟨φ_0, φ_0⟩ = (1/2π) ∫_{−π}^{π} f(x) dx,

    c_1 = ⟨f, φ_1⟩ / ⟨φ_1, φ_1⟩ = ∫_{−π}^{π} cos x f(x) dx / ∫_{−π}^{π} cos² x dx = (1/π) ∫_{−π}^{π} cos x f(x) dx,

and so on. The approximation Ψ is the truncated Fourier series for f(x). Hence, a Fourier series is an example of a Least Squares Approximation: a Best Approximation in the least squares sense.

Example 4.3.3 Let x = {x_i}, i = 1, ..., n and y = {y_i}, i = 1, ..., n be a set of data points (x_i, y_i). Find the least squares best straight line fit to these data points.

We define the inner product in this case to be

    ⟨x, y⟩ = Σ_{i=1}^{n} x_i y_i.

Next we let Ψ(x) = c_0 + c_1 (x − x̄), evaluated at the points x_i, with x̄ = (1/n) Σ_{i=1}^{n} x_i. Here φ_0 = {1}, i = 1, ..., n and φ_1 = {x_i − x̄}, i = 1, ..., n. Observe that

    ⟨φ_0, φ_1⟩ = Σ_{i=1}^{n} (x_i − x̄) · 1 = Σ_{i=1}^{n} x_i − n x̄ = n x̄ − n x̄ = 0,

so φ_0, φ_1 are an orthogonal set. Hence, if we calculate c_0 and c_1 as follows,

    c_1 = ⟨y, φ_1⟩ / ⟨φ_1, φ_1⟩ = Σ_{i=1}^{n} y_i (x_i − x̄) / Σ_{i=1}^{n} (x_i − x̄)²,

and (using ⟨φ_0, φ_0⟩ = Σ_{i=1}^{n} 1 = n)

    c_0 = ⟨y, φ_0⟩ / ⟨φ_0, φ_0⟩ = ( Σ_{i=1}^{n} y_i ) / n,

then Ψ(x) is the best linear fit (in a least squares sense) to the data points (x_i, y_i).
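The formulas of Example 4.3.3 can be checked in a few lines. This sketch (the data values are made up for illustration) compares them against NumPy's own least-squares line fit, which minimises the same discrete 2-norm:

```python
import numpy as np

def best_line(x, y):
    """Least-squares straight line Psi(x) = c0 + c1*(x - xbar),
    using the orthogonal basis {1, x - xbar} of Example 4.3.3."""
    xbar = x.mean()
    c1 = np.sum(y * (x - xbar)) / np.sum((x - xbar) ** 2)
    c0 = y.mean()
    return c0, c1, xbar

# Illustrative data (not from the notes).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 4.0, 8.0])
c0, c1, xbar = best_line(x, y)

# np.polyfit solves the same minimisation, so the two lines agree.
slope, intercept = np.polyfit(x, y, 1)
print(c1, slope)                  # equal slopes
print(c0 - c1 * xbar, intercept)  # equal intercepts
```

Writing the line in the shifted form c_0 + c_1(x − x̄) is exactly what makes the two coefficients decouple: no 2×2 system needs to be solved.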
4.3.2 Approximations of Differing Degrees

Consider

    ‖f − Ψ‖_2 ≤ ‖f − p(x)‖_2,  Ψ, p ∈ P_n,

where Ψ = Σ_{i=0}^{n} c_i φ_i(x) and the φ_i(x) form an orthogonal basis for P_n (with φ_i of degree i, so that φ_0, ..., φ_k span P_k). Note that p(x) may be ANY p(x) ∈ P_n, the polynomials of degree at most n. If we choose

    p(x) = Σ_{i=0}^{n−1} c_i φ_i(x),

then p(x) ∈ P_n, and p(x) is the best approximation to f(x) of degree n − 1 (p(x) ∈ P_{n−1}). Now from above we have

    ‖f − Ψ‖_2 ≤ ‖f − Σ_{i=0}^{n−1} c_i φ_i‖_2.

This means that the Least Squares Best approximation from P_n is at least as good as the Least Squares Best approximation from P_{n−1}; i.e. adding more terms (higher degree basis functions) does not make the approximation worse — in fact, it will usually make it better.

4.4 Minimax

In the previous two sections, we have considered the best approximation in situations involving the 2-norm. However, best approximation in terms of the maximum (or infinity) norm,

    ‖f − p*‖_∞ ≤ ‖f − p‖_∞,  p ∈ P_n,

implies that we choose the polynomial that minimises the maximum error over [a, b]. This is a more natural way of thinking about Best Approximation. In such a situation, we call p*(x) the minimax approximation to f(x) on [a, b].

Example 4.4.1 Find the best constant (p ∈ P_0) approximation to f(x) on the interval [a, b].

Let c ∈ P_0; thus we want to minimise ‖f(x) − c‖_∞:

    min over all c { max_{[a, b]} |f(x) − c| }.

Clearly, the c that minimises the maximum error is

    c* = ( max{f} + min{f} ) / 2.

Example 4.4.2 Find the best straight line fit (p ∈ P_1) to f(x) = e^x on the interval [0, 1].
We want to find the straight line fit, hence we let p = mx + c and we look to minimise ‖f(x) − p‖_∞ = ‖e^x − (mx + c)‖_∞; i.e.,

    min over all m, c { max_{[0, 1]} |e^x − (mx + c)| }.

(A sketch of e^x and the line p_1(x) shows the maximum error attained at x = 0, at an interior point θ, and at x = 1.)

Geometrically, the maximum occurs in three places, x = 0, x = θ and x = 1:

    x = 0:  e^0 − (0 + c)     =  E    (i)
    x = θ:  e^θ − (mθ + c)    = −E    (ii)
    x = 1:  e^1 − (m + c)     =  E    (iii)

Also, the error at x = θ has a turning point, so that

    d/dx ( e^x − (mx + c) )|_{x=θ} = e^θ − m = 0  ⟹  m = e^θ,  θ = log_e m.

(i) and (iii) imply

    1 − c = E = e − m − c,  or,  m = e − 1 ≈ 1.7183,  θ = log_e(1.7183).

(ii) and (iii) imply

    −e^θ + mθ + c = e − m − c,  or,  c = (1/2)(e − m + e^θ − mθ) = (1/2)(e − mθ) ≈ 0.8941,

using e^θ = m. Hence the minimax straight line is given by 1.7183 x + 0.8941.

As the above example illustrates, finding the minimax polynomial p_n*(x) for n ≥ 1 is not a straightforward exercise. Also, note that the process involves the evaluation of the error, E in the above example.

4.4.1 Chebyshev Polynomials Revisited

Recall that the Chebyshev polynomials satisfy

    ‖ (1/2^n) T_{n+1}(x) ‖_∞ ≤ ‖ q(x) ‖_∞,

for all q(x) ∈ P_{n+1} such that q(x) = x^{n+1} + ⋯ (i.e. for all monic polynomials of degree n + 1).
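This monic-minimality property is easy to probe numerically. The following sketch (not from the notes) compares the max-norm of 2^{−2} T_3(x) = x³ − (3/4)x with that of randomly chosen monic cubics on [−1, 1]; the sampling scheme is an arbitrary illustrative choice.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 200001)

def sup_norm_monic_cubic(a2, a1, a0):
    """Grid estimate of max |x^3 + a2 x^2 + a1 x + a0| on [-1, 1]."""
    return np.abs(x**3 + a2 * x**2 + a1 * x + a0).max()

cheb = sup_norm_monic_cubic(0.0, -0.75, 0.0)  # 2^{-2} T_3(x): norm 1/4
rng = np.random.default_rng(0)
others = [sup_norm_monic_cubic(*rng.uniform(-2, 2, 3)) for _ in range(200)]
print(cheb)         # ~ 0.25
print(min(others))  # never beats 0.25
```

No random monic cubic gets below 1/4: the theorem says the scaled Chebyshev polynomial is the unique minimiser among all monic polynomials of its degree.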
In particular, if we consider n = 2, then

    ‖ x³ − (3/4) x ‖_∞ ≤ ‖ x³ + a_2 x² + a_1 x + a_0 ‖_∞,

or

    ‖ x³ − (3/4) x ‖_∞ ≤ ‖ x³ − (a_2 x² + a_1 x + a_0) ‖_∞,

for arbitrary constants a_0, a_1, a_2 (the sign of the a_i is immaterial). Hence

    ‖ x³ − (3/4) x ‖_∞ ≤ ‖ x³ − p_2(x) ‖_∞,  p_2(x) ∈ P_2.

This means the p*(x) ∈ P_2 that is the minimax approximation to f(x) = x³ on the interval [−1, 1], i.e. the p*(x) that satisfies

    ‖ x³ − p_2*(x) ‖_∞ ≤ ‖ x³ − p_2(x) ‖_∞,

is p_2*(x) = (3/4) x.

From this example, we can see that the Chebyshev polynomial T_{n+1}(x) can be used to quickly find the best polynomial of degree at most n (in the sense that the maximum error is minimised) to the function f(x) = x^{n+1} on the interval [−1, 1].

Finding the minimax approximation to f(x) = x^{n+1} may seem quite limited. However, in combination with the following results it can be very useful. If p_n*(x) is the minimax approximation to f(x) on [a, b] from P_n, then

1. α p_n*(x) is the minimax approximation to α f(x), where α ∈ R, and

2. p_n*(x) + q_n(x) is the minimax approximation to f(x) + q_n(x), where q_n(x) ∈ P_n.

(See Tutorial Sheet 8 for proofs and an example.)

4.5 Equi-oscillation

From the above examples, we see that the maximum error occurs several times:

In Example 4.4.1: n = 0 — maximum error occurred twice.
In Example 4.4.2: n = 1 — maximum error occurred three times.
In the x³ example of Section 4.4.1: n = 2 — maximum error occurred four times.

In order to find the minimax approximation, we have found p_0, p_1 and p_2 such that the maximum error equi-oscillates.

Definition: A continuous function E(x) is said to equi-oscillate on n points of [a, b] if there exist n points x_i,

    a ≤ x_1 < x_2 < ⋯ < x_n ≤ b,

such that

    |E(x_i)| = max_{a ≤ x ≤ b} |E(x)|,  i = 1, ..., n,

and

    E(x_i) = −E(x_{i+1}),  i = 1, ..., n − 1.

Theorem: For the function f(x), where x ∈ [a, b], and some p_n(x) ∈ P_n, suppose f(x) − p_n(x) equi-oscillates on at least (n + 2) points in [a, b]. Then p_n(x) is the minimax approximation for f(x). (See Phillips & Taylor for a proof.)

The converse of this theorem is also true: if p_n(x) is the minimax polynomial of degree n, then f(x) − p_n(x) equi-oscillates on at least (n + 2) points. The property of equi-oscillation characterises the minimax approximation.

Example 4.5.1 Construct the minimax, straight line approximation to x^{1/2} on [0, 1].

So we wish to find p_1(x) = mx + c such that

    max_{x ∈ [0, 1]} | x^{1/2} − (mx + c) |

is minimised. From the above theorem we know the maximum must occur in n + 2 = 3 places, x = 0, x = θ and x = 1:

    x = 0:  0 − (0 + c)          = −E    (i)
    x = θ:  θ^{1/2} − (mθ + c)   =  E    (ii)
    x = 1:  1 − (m + c)          = −E    (iii)
Also, the error at x = θ has a turning point:

    d/dx ( x^{1/2} − (mx + c) )|_{x=θ} = 0  ⟹  (1/2) θ^{−1/2} − m = 0  ⟹  θ = 1/(4m²).

Combining (i) and (iii):

    −c = 1 − m − c  ⟹  m = 1.

Combining (ii) and (iii):

    θ^{1/2} − (mθ + c) + 1 − (m + c) = 0
    ⟹  1/(2m) − 1/(4m) + 1 − m − 2c = 0
    ⟹  1/2 − 1/4 + 1 − 1 − 2c = 0
    ⟹  c = 1/8.

Hence the minimax straight line approximation to x^{1/2} is given by x + 1/8. On the other hand, the least squares, straight line approximation was (4/5) x + 4/15, making it clear that different norms lead to different approximations!

4.6 Chebyshev Series Again

The property of equi-oscillation characterises the minimax approximation. Suppose we could produce the following series expansion,

    f(x) = Σ_{i=0}^{∞} a_i T_i(x),

for f(x) defined on [−1, 1]. This is called a Chebyshev series. Not such a crazy idea! Put x = cos θ; then

    f(cos θ) = Σ_{i=0}^{∞} a_i T_i(cos θ) = Σ_{i=0}^{∞} a_i cos(iθ),  0 ≤ θ ≤ π,

which is just the Fourier cosine series for the function f(cos θ). Hence, it is a series we could evaluate (using numerical integration if necessary). Now, suppose the series converges rapidly, so that

    |a_{n+1}| ≫ |a_{n+2}| ≫ |a_{n+3}| ≫ ⋯,

so a few terms are a good approximation of the function.
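As a sketch of how such a series could be evaluated numerically (illustrative code, not part of the notes): the a_i are just the Fourier cosine coefficients of f(cos θ), namely a_0 = (1/π) ∫_0^π f(cos θ) dθ and a_i = (2/π) ∫_0^π f(cos θ) cos(iθ) dθ for i ≥ 1, approximated below with the trapezoidal rule. For f(x) = (1 − x²)^{1/2}, so that f(cos θ) = sin θ, the exact values a_0 = 2/π and a_2 = −4/(3π) are easy to check by hand.

```python
import numpy as np

def chebyshev_coeff(f, i, m=200001):
    """Chebyshev-series coefficient a_i of f via the cosine-series formula:
    a_0 = (1/pi) * int_0^pi f(cos t) dt,
    a_i = (2/pi) * int_0^pi f(cos t) cos(i t) dt   (i >= 1),
    with the integral approximated by the trapezoidal rule."""
    t = np.linspace(0.0, np.pi, m)
    y = f(np.cos(t)) * np.cos(i * t)
    integral = np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0
    return integral / np.pi if i == 0 else 2.0 * integral / np.pi

# The clip guards against 1 - x^2 dipping below zero through rounding.
f = lambda x: np.sqrt(np.maximum(0.0, 1.0 - x**2))  # f(cos t) = sin t
print(chebyshev_coeff(f, 0))  # ~  2/pi      ~  0.6366
print(chebyshev_coeff(f, 1))  # ~  0 (odd coefficients vanish)
print(chebyshev_coeff(f, 2))  # ~ -4/(3*pi)  ~ -0.4244
```

The rapid decay of these coefficients for smooth f is what makes a short Chebyshev partial sum such a good (near-minimax) approximation.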
Let Ψ(x) = Σ_{i=0}^{n} a_i T_i(x); then

    f(x) − Ψ(x) = a_{n+1} T_{n+1}(x) + a_{n+2} T_{n+2}(x) + ⋯ ≈ a_{n+1} T_{n+1}(x),

i.e. the error is dominated by the leading term a_{n+1} T_{n+1}(x). Now T_{n+1}(x) equi-oscillates (n + 2) times on [−1, 1]. If f(x) − Ψ(x) were exactly a_{n+1} T_{n+1}(x), then Ψ(x) would be the minimax polynomial of degree n to f(x). Since f(x) − Ψ(x) is only approximately a_{n+1} T_{n+1}(x), Ψ(x) is not the minimax but is a polynomial that is close to the minimax, as long as a_{n+2}, a_{n+3}, ... are small compared to a_{n+1}. The actual error almost equi-oscillates on (n + 2) points.

Example 4.6.1 Find the minimax quadratic approximation to f(x) = (1 − x²)^{1/2} on the interval [−1, 1].

First, we note that if x = cos θ then f(cos θ) = (1 − cos² θ)^{1/2} = sin θ, and the interval x ∈ [−1, 1] becomes θ ∈ [0, π]. The Fourier cosine series for sin θ on [0, π] is given by

    sin θ = 2/π − (4/π) [ cos 2θ / 3 + cos 4θ / 15 + cos 6θ / 35 + ⋯ ].

So with x = cos θ, we have

    (1 − x²)^{1/2} = 2/π − (4/π) [ T_2(x)/3 + T_4(x)/15 + T_6(x)/35 + ⋯ ],  where −1 ≤ x ≤ 1.

Thus let us consider the quadratic

    p_2(x) = 2/π − (4/π) T_2(x)/3 = 2/π − (4/3π)(2x² − 1) = (2/3π)(3 − 2(2x² − 1)) = (2/3π)(5 − 4x²).

The error

    f(x) − p_2(x) ≈ −(4/π) T_4(x)/15,
which oscillates 4 + 1 = 5 times in [−1, 1]. At least n + 2 = 4 equi-oscillation points are required for p_2(x) to be the minimax approximation of (1 − x²)^{1/2}, so we need to see whether the above oscillation points are of equal amplitude. T_4(x) has extreme values when 8x⁴ − 8x² + 1 = ±1, i.e. at x = 0, x = 1, x = −1, x = 1/√2 and x = −1/√2.

    x            (1 − x²)^{1/2} − p_2(x)     |error|
    x = 0        1 − 10/(3π)                 0.0610
    x = ±1/√2    1/√2 − 2/π                  0.0705
    x = ±1       −2/(3π)                     0.2122

So the error oscillates, but not equally. Hence, p_2(x) is not quite the minimax approximation to f(x) = (1 − x²)^{1/2}, but it is a good first approximation. The true minimax quadratic to (1 − x²)^{1/2} is actually (9/8 − x²) = (1.125 − x²), and thus our estimate of (1.0610 − 0.8488 x²) is not bad.

4.7 Economisation of Power Series

Another way of exploiting the properties of Chebyshev polynomials is possible for functions f(x) for which a power series exists. Consider the function f(x) which equals the power series

    f(x) = Σ_{n=0}^{∞} a_n x^n.

Let us assume that we are interested in approximating f(x) with a polynomial of degree m. One such approximation is

    f(x) = Σ_{n=0}^{m} a_n x^n + R_m,

which has error R_m. Can we get a better approximation of degree m than this? Yes! A better approximation may be found by finding a function p_m(x) such that f(x) − p_m(x) equi-oscillates at least m + 2 times in the given interval. Consider the truncated series of degree m + 1:

    f(x) = Σ_{n=0}^{m} a_n x^n + a_{m+1} x^{m+1} + R_{m+1}.

The Chebyshev polynomial of degree m + 1 equi-oscillates m + 2 times, and equals

    T_{m+1}(x) = 2^m x^{m+1} + t_{m−1}(x),
where t_{m−1}(x) contains the terms in the Chebyshev polynomial of degree at most m − 1. Hence, we can write

    x^{m+1} = (1/2^m) ( T_{m+1}(x) − t_{m−1}(x) ).

Substituting for x^{m+1} in our expression for f(x) we get

    f(x) = Σ_{n=0}^{m} a_n x^n + (a_{m+1}/2^m) ( T_{m+1}(x) − t_{m−1}(x) ) + R_{m+1}.

Rearranging, we find a polynomial of degree at most m,

    p_m(x) = Σ_{n=0}^{m} a_n x^n − (a_{m+1}/2^m) t_{m−1}(x).

This polynomial will be a pretty good approximation to f(x) since

    f(x) − p_m(x) = (a_{m+1}/2^m) T_{m+1}(x) + R_{m+1},

which oscillates m + 2 times almost equally, provided R_{m+1} is small. Although p_m(x) is not the minimax approximation to f(x), it is close, and the error

    | (a_{m+1}/2^m) T_{m+1}(x) + R_{m+1} | ≤ |a_{m+1}|/2^m + |R_{m+1}|,

since |T_{m+1}(x)| ≤ 1, is generally a lot less than the error R_m for the truncated power series of degree m. This process is called the Economisation of power series.

Example 4.7.1 The Taylor expansion of sin x is

    sin x = x − x³/3! + x⁵/5! + R_7,

where

    R_7 = (x⁷/7!) d⁷/dx⁷ (sin x)|_{x=θ} = (x⁷/7!)(−cos θ).

For x ∈ [−1, 1], |R_7| ≤ 1/7! ≈ 0.0002. However,

    sin x = x − x³/3! + R_5,

where

    R_5 = (x⁵/5!) d⁵/dx⁵ (sin x)|_{x=θ} = (x⁵/5!)(cos θ),

so |R_5| ≤ 1/5! ≈ 0.0083. The extra term makes a big difference! Now suppose we express x⁵ in terms of Chebyshev polynomials,

    T_5(x) = 16x⁵ − 20x³ + 5x,
so

    x⁵ = ( T_5(x) + 20x³ − 5x ) / 16.

Then

    sin x = x − x³/6 + (1/5!) ( ( T_5(x) + 20x³ − 5x ) / 16 ) + R_7
          = x ( 1 − 1/(16·4!) ) − (x³/6) ( 1 − 1/16 ) + (1/(16·5!)) T_5(x) + R_7.

Now |T_5(x)| ≤ 1 for x ∈ [−1, 1], so if we ignore the term in T_5(x) we obtain

    sin x = x ( 1 − 1/(16·4!) ) − (x³/6) (15/16) + Error,

where

    |Error| ≤ |R_7| + (1/(16·5!)) |T_5(x)| ≤ 0.0002 + 1/1920 = 0.0002 + 0.00052 ≈ 0.0007.

This new cubic has a maximum error of about 0.0007, compared with 0.0083 for x − x³/6.
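The claimed improvement is easy to check numerically; this sketch (not part of the notes) compares the truncated Taylor cubic with the economised cubic on a fine grid over [−1, 1]:

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 200001)

# Truncated Taylor cubic vs the economised cubic derived above.
taylor_cubic = x - x**3 / 6.0
econ_cubic = x * (1.0 - 1.0 / 384.0) - (x**3 / 6.0) * (15.0 / 16.0)

e_taylor = np.abs(np.sin(x) - taylor_cubic).max()
e_econ = np.abs(np.sin(x) - econ_cubic).max()
print(e_taylor)  # ~ 8e-3, close to the bound 1/5! ~ 0.0083
print(e_econ)    # ~ 6e-4, well under the bound 0.0007
```

Both cubics cost the same to evaluate; economisation simply redistributes the error so that it nearly equi-oscillates instead of piling up at the endpoints, buying roughly an order of magnitude in accuracy for free.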