The Weierstrass Approximation Theorem

Recall that the fundamental idea underlying the construction of the real numbers is approximation by the simpler rational numbers. Firstly, numbers are often determined as the unknown roots of some equation, and when we cannot solve the equation explicitly, as is most often the case, we must compute approximate solutions. But even if we write down a real number symbolically, like $\sqrt{2}$, for example, we cannot specify its numerical value completely in general. In this case, we approximate the real number to any desired accuracy using rational numbers with finite decimal expansions.

The situation for functions is completely analogous. In general, functions that are specified as the solutions of differential equations cannot be written down explicitly in terms of known functions. Instead, we must look for good approximations. Moreover, most of the functions that we can write down, i.e., those involving exp, log, sin, and so on, are complicated in the sense that they take on real values that cannot be written down explicitly. To use these functions in practical computations, we must resort to using good approximations of their values. Put it this way: when we press the $e^x$ key on a calculator, we do not get $e^x$; rather, we get a good approximation.

This raises one of the fundamental problems of analysis, which is figuring out how to approximate a given function using simpler functions. In this chapter, we begin the study of this problem by proving a fundamental result which says that any continuous function can be approximated arbitrarily well by polynomials. This is an important result because polynomials are relatively simple. In particular, a polynomial is specified completely by a finite set of coefficients. In other words, the relatively simple polynomials play the same role with respect to continuous functions that rational numbers play with real numbers. The result is due to Weierstrass and it states:

Theorem 36.1 (Weierstrass Approximation Theorem) Assume that $f$ is continuous on a closed bounded interval $I = [a, b]$. Given any $\epsilon > 0$, there is a polynomial $P_n$ with sufficiently high degree $n$ such that
$$|f(x) - P_n(x)| < \epsilon \quad \text{for } a \le x \le b. \tag{36.1}$$

There are many different proofs of this result, but in keeping with our constructivist tendencies, we present a constructive proof based on Bernstein$^1$ polynomials. The motivation for this approach rests in probability theory. We do not have space in this book to develop probability theory, but we describe the connection in an intuitive way. Later in Chapter 37 and Chapter 38, we investigate other polynomial approximations of functions that arise from different considerations.

$^1$ The Russian mathematician Sergi Natanovich Bernstein (1880-1968) studied in France before returning to Russia to work. He proved significant results in approximation theory and probability.

Before beginning, we note that it suffices to prove Theorem 36.1 for the interval $[0, 1]$. The reason is that the arbitrary interval $a \le y \le b$ is mapped to $0 \le x \le 1$ by $x = (a - y)/(a - b)$ and vice versa by $y = (b - a)x + a$. If $g$ is continuous on $[a, b]$, then $f(x) = g((b - a)x + a)$ is continuous on $[0, 1]$. If the polynomial $P_n$ of degree $n$ approximates $f$ to within $\epsilon$ on $[0, 1]$, then the polynomial $\tilde{P}_n(y) = P_n((a - y)/(a - b))$ of degree $n$ approximates $g(y)$ to within $\epsilon$ on $[a, b]$.

36.1 The Binomial Expansion

One ingredient needed to construct the polynomial approximations is an important formula called the binomial expansion. For natural numbers $0 \le m \le n$, we define the binomial coefficient $\binom{n}{m}$, or $n$ choose $m$, by
$$\binom{n}{m} = \frac{n!}{m!\,(n-m)!}.$$

Example 36.1.
$$\binom{4}{2} = \frac{4!}{2!\,2!} = 6, \qquad \binom{6}{1} = \frac{6!}{1!\,5!} = 6, \qquad \binom{3}{0} = \frac{3!}{3!\,0!} = 1.$$

We can interpret $n$ choose $m$ as the number of distinct subsets with $m$ elements that can be chosen from a set of $n$ objects, or the number of combinations of $n$ objects taken $m$ at a time.
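The factorial definition of the binomial coefficient can be checked numerically. The following Python sketch is only an illustration: the helper name `binomial` is our own choice and does not come from the text. It evaluates $n!/(m!\,(n-m)!)$ directly and compares the result with the standard library function `math.comb`.

```python
from math import comb, factorial


def binomial(n: int, m: int) -> int:
    """Binomial coefficient n! / (m! (n - m)!) for integers 0 <= m <= n."""
    if not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n")
    # The quotient is always an integer, so integer division is exact.
    return factorial(n) // (factorial(m) * factorial(n - m))


# The three coefficients evaluated in Example 36.1.
print(binomial(4, 2), binomial(6, 1), binomial(3, 0))

# Agreement with the standard library for all small cases.
assert all(binomial(n, m) == comb(n, m) for n in range(12) for m in range(n + 1))
```

For large arguments `math.comb` is the better choice, since it avoids forming the full factorials.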

Example 36.2. We compute the probability $P$ of getting an ace of diamonds in a poker hand of 5 cards chosen at random from a standard deck of 52 cards. Recall the formula
$$P(\text{event}) = \text{probability of an event} = \frac{\text{number of outcomes in the event}}{\text{total number of possible outcomes}}$$
that holds if all outcomes are equally likely. The total number of 5 card poker hands is $\binom{52}{5}$. Obtaining a good hand amounts to choosing any 4 cards from the remaining 51 cards after getting an ace of diamonds. So there are $\binom{51}{4}$ good hands. This means
$$P = \frac{\binom{51}{4}}{\binom{52}{5}} = \frac{51!}{4!\,47!} \cdot \frac{5!\,47!}{52!} = \frac{5}{52}.$$

It is straightforward (Problem 36.3) to show the following identities,
$$\binom{n}{m} = \binom{n}{n-m}, \qquad \binom{n}{1} = \binom{n}{n-1}, \qquad \binom{n}{n} = \binom{n}{0} = 1. \tag{36.2}$$

An important application of the binomial coefficient is the following theorem.

Theorem 36.2 (Binomial Expansion) For any natural number $n$,
$$(a + b)^n = \sum_{m=0}^{n} \binom{n}{m} a^m b^{n-m}. \tag{36.3}$$

Example 36.3.
$$(a + b)^2 = a^2 + 2ab + b^2$$
$$(a + b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3$$
$$(a + b)^4 = a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4$$

The proof is by induction. For $n = 1$,
$$(a + b)^1 = \binom{1}{0} a + \binom{1}{1} b = a + b.$$
We assume the formula is true for $n - 1$, so that
$$(a + b)^{n-1} = \sum_{m=0}^{n-1} \binom{n-1}{m} a^m b^{n-1-m},$$
and prove it holds for $n$.

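We multiply out
$$(a + b)^n = (a + b)(a + b)^{n-1} = \sum_{m=0}^{n-1} \binom{n-1}{m} a^{m+1} b^{n-1-m} + \sum_{m=0}^{n-1} \binom{n-1}{m} a^m b^{n-m}.$$
Now changing variables in the sum,
$$\sum_{m=0}^{n-1} \binom{n-1}{m} a^{m+1} b^{n-1-m} = \sum_{m=1}^{n-1} \binom{n-1}{m-1} a^m b^{n-m} + a^n b^0,$$
while
$$\sum_{m=0}^{n-1} \binom{n-1}{m} a^m b^{n-m} = a^0 b^n + \sum_{m=1}^{n-1} \binom{n-1}{m} a^m b^{n-m}.$$
Hence,
$$(a + b)^n = a^0 b^n + \sum_{m=1}^{n-1} \left( \binom{n-1}{m} + \binom{n-1}{m-1} \right) a^m b^{n-m} + a^n b^0. \tag{36.4}$$
It is a good exercise (Problem 36.5) to show that
$$\binom{n-1}{m} + \binom{n-1}{m-1} = \binom{n}{m}. \tag{36.5}$$
Using this in (36.4) proves the result.

We use the binomial expansion to derive two other useful formulas. We differentiate both sides of
$$(x + b)^n = \sum_{m=0}^{n} \binom{n}{m} x^m b^{n-m} \tag{36.6}$$
to get
$$n(x + b)^{n-1} = \sum_{m=0}^{n} \binom{n}{m} m\, x^{m-1} b^{n-m}.$$
Setting $x = a$ and multiplying through by $a/n$,
$$a(a + b)^{n-1} = \sum_{m=0}^{n} \frac{m}{n} \binom{n}{m} a^m b^{n-m}. \tag{36.7}$$
Differentiating (36.6) twice (Problem 36.6) gives
$$\left(1 - \frac{1}{n}\right) a^2 (a + b)^{n-2} = \sum_{m=0}^{n} \left( \frac{m^2}{n^2} - \frac{m}{n^2} \right) \binom{n}{m} a^m b^{n-m}. \tag{36.8}$$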
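The identities (36.3), (36.7), and (36.8) can be spot-checked with exact rational arithmetic. The sketch below is a numerical sanity check, not a proof, and it assumes $n \ge 2$; the function name `check_identities` is hypothetical and introduced only for this illustration.

```python
from fractions import Fraction
from math import comb


def check_identities(a: Fraction, b: Fraction, n: int) -> bool:
    """Check (36.3), (36.7), and (36.8) exactly for given a, b and n >= 2."""
    # terms[m] = C(n, m) a^m b^(n - m), the m-th term of the binomial expansion.
    terms = [comb(n, m) * a**m * b ** (n - m) for m in range(n + 1)]
    expansion = sum(terms) == (a + b) ** n
    first = sum(Fraction(m, n) * t for m, t in enumerate(terms)) == a * (a + b) ** (n - 1)
    second = (
        sum(Fraction(m * m - m, n * n) * t for m, t in enumerate(terms))
        == (1 - Fraction(1, n)) * a**2 * (a + b) ** (n - 2)
    )
    return expansion and first and second


print(check_identities(Fraction(2, 3), Fraction(5, 7), 6))
```

Using `Fraction` instead of floating point means the comparisons are exact, so a returned `False` would indicate a genuine error in a formula rather than rounding noise.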

36.2 The Law of Large Numbers

The approximating polynomials used to prove Theorem 36.1 are constructed by taking linear combinations of more elementary polynomials called binomial polynomials. In this section, we explore the properties of the binomial polynomials and their connection to probability.

We set $a = x$ and $b = 1 - x$ in the binomial expansion (36.3) to get
$$1 = (x + (1 - x))^n = \sum_{m=0}^{n} \binom{n}{m} x^m (1 - x)^{n-m}. \tag{36.9}$$
We define the $n + 1$ binomial polynomials of degree $n$ as the terms in the expansion, so
$$p_{n,m}(x) = \binom{n}{m} x^m (1 - x)^{n-m}, \quad m = 0, 1, \ldots, n.$$

Example 36.4.
$$p_{2,0}(x) = \binom{2}{0} x^0 (1 - x)^2 = (1 - x)^2$$
$$p_{2,1}(x) = \binom{2}{1} x^1 (1 - x)^1 = 2x(1 - x)$$
$$p_{2,2}(x) = \binom{2}{2} x^2 (1 - x)^0 = x^2$$

If $0 \le x \le 1$ is the probability of an event $E$, then $p_{n,m}(x)$ is the probability that $E$ occurs exactly $m$ times in $n$ independent trials.

Example 36.5. In particular, consider tossing a coin with probability $x$ that a head (H) occurs and, correspondingly, probability $1 - x$ that a tail (T) occurs. The coin is unfair if $x \ne 1/2$. The probability of the occurrence of a particular sequence of $n$ tosses containing $m$ heads, e.g.,
$$\underbrace{\text{HTTHHTHTHTTHHHTHTHTTT} \cdots \text{T}}_{m \text{ heads in } n \text{ tosses}},$$
is $x^m (1 - x)^{n-m}$ by the multiplication rule for probabilities. There are $\binom{n}{m}$ sequences of $n$ tosses with exactly $m$ heads. By the addition rule for probabilities, $p_{n,m}(x)$ is the probability of getting exactly $m$ heads in $n$ tosses.

The binomial polynomials have several useful properties, some of which follow directly from the connection to probability. For example, we interpret
$$\sum_{m=0}^{n} p_{n,m}(x) = 1 \tag{36.10}$$

as saying that event $E$ with probability $x$ occurs either exactly $0, 1, \ldots,$ or $n$ times in $n$ independent trials with probability 1. Since $p_{n,m}(x) \ge 0$ for $0 \le x \le 1$, (36.10) implies that $0 \le p_{n,m}(x) \le 1$ for $0 \le x \le 1$, as it must since it is a probability.

A couple more useful properties: (36.7) implies
$$\sum_{m=0}^{n} m\, p_{n,m}(x) = nx \tag{36.11}$$
and (36.8) implies
$$\sum_{m=0}^{n} m^2\, p_{n,m}(x) = (n^2 - n)x^2 + nx. \tag{36.12}$$

An important use of the binomial polynomials is an application to the Law of Large Numbers. Suppose we have an event $E$ that has probability $x$ of occurring, such as the unfair coin from Example 36.5. But suppose we don't know the probability. How might we determine $x$? If we conduct a single trial, e.g., flip the coin once, we might see event $E$ or might not. One trial does not give much information for determining $x$. However, if we conduct a large number $n \gg 1$ of trials, then intuition suggests that $E$ should occur approximately $nx$ times out of $n$ trials, at least most of the time.

Example 36.6. The connection between the probability of occurrence in one trial and the frequency of occurrence in many trials is not completely straightforward to determine. Consider coin tossing again. If we flip a fair coin 100,000 times, we expect to see around 50,000 heads most of the time. Of course, we could be very unlucky and get all tails. But the probability of this occurring is
$$\left(\frac{1}{2}\right)^{100000} \approx 10^{-30103}.$$
On the other hand, it is also unlikely that we will see heads in exactly half of the tosses. In fact, one can show that the probability of getting heads exactly half of the time in $n$ tosses is approximately $\sqrt{2/(\pi n)}$ for $n$ large, and therefore also goes to zero as $n$ increases.

A Law of Large Numbers encapsulates in some way the intuitive connection between the probability of an event occurring in one trial and the frequency that the event occurs in a large number of trials. A mathematical expression of this intuition is a little tricky to state, however, as we saw in Example 36.6.
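The two probabilities in Example 36.6 are easy to examine numerically. The sketch below is illustrative only (the function name `prob_half_heads` is our own). It computes the exact probability $\binom{n}{n/2}/2^n$ of getting heads in exactly half of $n$ tosses of a fair coin and prints it next to the Stirling-type estimate $\sqrt{2/(\pi n)}$, which is a standard asymptotic result and is assumed here rather than derived.

```python
from math import comb, log10, pi, sqrt


def prob_half_heads(n: int) -> float:
    """Probability of exactly n/2 heads in n tosses of a fair coin (n even)."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    # comb(n, n/2) favourable sequences out of 2**n equally likely sequences.
    return comb(n, n // 2) / 2**n


# (1/2)**100000 = 10**(-100000 * log10(2)); print the size of the exponent.
print(100000 * log10(2))

for n in (10, 100, 1000, 10000):
    exact = prob_half_heads(n)
    estimate = sqrt(2 / (pi * n))
    print(f"n = {n:5d}  exact = {exact:.6f}  sqrt(2/(pi n)) = {estimate:.6f}")
```

Both columns shrink toward zero as $n$ grows, which is the point of the example: the most likely single outcome still becomes improbable.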
We prove the following version that is originally due to Jacob Bernoulli.

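Theorem 36.3 (Law of Large Numbers) Assume that event $E$ occurs with probability $x$ and let $m$ denote the number of times $E$ occurs in $n$ trials. Let $\epsilon > 0$ and $\delta > 0$ be given. The probability that $m/n$ differs from $x$ by less than $\delta$ is greater than $1 - \epsilon$, i.e.,
$$P\left( \left| \frac{m}{n} - x \right| < \delta \right) > 1 - \epsilon, \tag{36.13}$$
for all $n$ sufficiently large.

Note that we can choose $\epsilon > 0$ and $\delta > 0$ arbitrarily small at the cost of making $n$ possibly very large, hence the name of the theorem. Also note that while this result says that it is likely that event $E$ will occur approximately $xn$ times in $n$ trials, it does not say that event $E$ will occur exactly $xn$ times in $n$ trials, nor does it say that event $E$ must occur approximately $xn$ times in $n$ trials. Thus, this result does not contradict the computations in Example 36.6.

Phrased in terms of the binomial polynomials, we want to show that given $\epsilon, \delta > 0$,
$$\sum_{\substack{0 \le m \le n \\ |m/n - x| < \delta}} p_{n,m}(x) > 1 - \epsilon \tag{36.14}$$
for $n$ sufficiently large. Consider the complementary sum
$$\sum_{\substack{0 \le m \le n \\ |m/n - x| \ge \delta}} p_{n,m}(x) = 1 - \sum_{\substack{0 \le m \le n \\ |m/n - x| < \delta}} p_{n,m}(x),$$
which we estimate simply as
$$\sum_{\substack{0 \le m \le n \\ |m/n - x| \ge \delta}} p_{n,m}(x) \le \frac{1}{\delta^2} \sum_{\substack{0 \le m \le n \\ |m/n - x| \ge \delta}} \left( \frac{m}{n} - x \right)^2 p_{n,m}(x) \le \frac{1}{n^2 \delta^2} S_n,$$
where
$$S_n = \sum_{m=0}^{n} (m - nx)^2 p_{n,m}(x) = \sum_{m=0}^{n} m^2 p_{n,m}(x) - 2nx \sum_{m=0}^{n} m\, p_{n,m}(x) + n^2 x^2 \sum_{m=0}^{n} p_{n,m}(x). \tag{36.15}$$
Using (36.10), (36.11), and (36.12), we find $S_n$ simplifies (Problem 36.9) to
$$S_n = nx(1 - x).$$
Since $x(1 - x) \le 1/4$ for $0 \le x \le 1$, $S_n \le n/4$. Therefore,
$$\sum_{\substack{0 \le m \le n \\ |m/n - x| \ge \delta}} p_{n,m}(x) \le \frac{1}{4n\delta^2} \tag{36.16}$$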

and
$$\sum_{\substack{0 \le m \le n \\ |m/n - x| < \delta}} p_{n,m}(x) \ge 1 - \frac{1}{4n\delta^2}.$$
In particular, for fixed $\epsilon, \delta > 0$, we can ensure that $(4n\delta^2)^{-1} < \epsilon$ by choosing $n > 1/(4\delta^2 \epsilon)$.

36.3 The Modulus of Continuity

In order to prove a strong version of Theorem 36.1, we introduce a useful generalization of Lipschitz continuity. First note that by Theorem 32.11, the continuous function $f$ on $[a, b]$ in Theorem 36.1 is actually uniformly continuous on $[a, b]$. That is, given $\epsilon > 0$ there is a $\delta > 0$ such that $|f(x) - f(y)| < \epsilon$ for all $x, y$ in $[a, b]$ with $|x - y| < \delta$.$^2$ Now a Lipschitz continuous function $f$ with constant $L$ is uniformly continuous because $|f(x) - f(y)| \le L|x - y| < \epsilon$ for all $x, y$ with $|x - y| < \delta = \epsilon/L$. On the other hand, uniformly continuous functions are not necessarily Lipschitz continuous. They do, however, satisfy a generalization of the condition that defines Lipschitz continuity called the modulus of continuity.

$^2$ Uniformity refers to the fact that $\delta$ can be chosen independently of $x$ and $y$.

The generalization is based on the observation that if $f$ is uniformly continuous on a closed, bounded interval $I = [a, b]$, then for any $\delta > 0$, the set of numbers
$$\{ |f(x) - f(y)| \text{ with } x, y \text{ in } I,\ |x - y| < \delta \} \tag{36.17}$$
is bounded. Otherwise, $f$ could not be uniformly continuous (Problem 36.10). But Theorem 32.15 then implies that the set of numbers (36.17) has a least upper bound. Turning this around, we define the modulus of continuity $\omega(f, \delta)$ of a general function $f$ on a general interval $I$ by
$$\omega(f, \delta) = \sup_{\substack{x, y \text{ in } I \\ |x - y| < \delta}} \{ |f(x) - f(y)| \}.$$
Note that $\omega(f, \delta) = \infty$ if the set (36.17) is not bounded. We can guarantee that $\omega(f, \delta)$ is finite if $f$ is uniformly continuous and $I$ is a closed interval, but if $f$ is not uniformly continuous and/or $I$ is open or unbounded, then $\omega(f, \delta)$ might be infinite.

Example 36.7. We know $x^2$ is uniformly continuous on $[0, 1]$. Now consider the difference
$$|x^2 - y^2| = |x - y|\,|x + y|,$$
where $|x - y| < \delta$.

The values of $|x - y|$ increase monotonically from 0 to $\delta$, while the corresponding largest values of $|x + y|$ decrease monotonically from 2 to $2 - \delta$. The largest value of their product occurs when $|x - y| = \delta$ so that
$$\omega(x^2, \delta) = 2\delta - \delta^2.$$

Example 36.8. $\omega(x^{-1}, \delta)$ on $(0, 1)$ is infinite.

Example 36.9. $\omega(\sin(x^{-1}), \delta) = 2$ on $(0, 1)$ since for any $\delta > 0$ we can find $x$ and $y$ within $\delta$ of 0, and hence within $\delta$ of each other, such that $\sin(x^{-1}) = 1$ and $\sin(y^{-1}) = -1$.

Note that the functions in Example 36.8 and Example 36.9 are not uniformly continuous on the indicated intervals. In fact, if $f$ is uniformly continuous on $[a, b]$, then $\omega(f, \delta) \to 0$ as $\delta \to 0$ (Problem 36.14). If $f$ is Lipschitz continuous on $[a, b]$ with constant $L$, then $\omega(f, \delta) \le L\delta$. In this sense, the modulus of continuity is a generalization of the idea of Lipschitz continuity.

36.4 The Bernstein Polynomials

To construct the approximating polynomial, we partition $[0, 1]$ by a uniform mesh with $n + 1$ nodes
$$x_m = \frac{m}{n}, \quad m = 0, \ldots, n.$$
The Bernstein polynomial of degree $n$ for $f$ on $[0, 1]$ is
$$B_n(f, x) = B_n(x) = \sum_{m=0}^{n} f(x_m)\, p_{n,m}(x). \tag{36.18}$$
Note that the degree of $B_n$ is at most $n$.

The reason that the Bernstein polynomials become increasingly accurate approximations as the degree $n$ increases is rather intuitive. The formula for $B_n(x)$ decomposes into two sums,
$$B_n(x) = \sum_{x_m \approx x} f(x_m)\, p_{n,m}(x) + \sum_{|x_m - x| \text{ large}} f(x_m)\, p_{n,m}(x).$$
The first sum converges to $f(x)$ as $n$ becomes large, since we can find nodes $x_m = m/n$ arbitrarily close to $x$ by taking $n$ large.$^3$ The second sum converges to zero by the Law of Large Numbers. This is exactly what we prove below.

$^3$ Recall that any real number can be approximated arbitrarily well by rational numbers.

Before stating a convergence result, we consider a couple of examples.
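Definition (36.18) translates almost directly into code. The following Python sketch evaluates a Bernstein polynomial pointwise from the binomial polynomials; the function names `binomial_poly` and `bernstein` are illustrative choices of ours, and the approach is a plain transcription of the formula rather than a numerically optimized method.

```python
from math import comb, exp


def binomial_poly(n: int, m: int, x: float) -> float:
    """Binomial polynomial p_{n,m}(x) = C(n, m) x^m (1 - x)^(n - m)."""
    return comb(n, m) * x**m * (1 - x) ** (n - m)


def bernstein(f, n: int, x: float) -> float:
    """Evaluate B_n(f, x) from (36.18) using the nodes x_m = m / n, n >= 1."""
    return sum(f(m / n) * binomial_poly(n, m, x) for m in range(n + 1))


# Compare B_n(exp, x) with exp(x) at x = 0.5 for a few degrees.
for n in (1, 2, 3, 10, 100):
    print(n, bernstein(exp, n, 0.5), exp(0.5))
```

The function `f` is only ever sampled at the mesh nodes $m/n$, so the same code applies to any function defined on $[0, 1]$.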

Example 36.10. The Bernstein polynomial $B_n$ for $x^2$ on $[0, 1]$ with $n \ge 2$ is given by
$$B_n(x) = \sum_{m=0}^{n} \left( \frac{m}{n} \right)^2 p_{n,m}(x).$$
By (36.12), this means
$$B_n(x) = \left( 1 - \frac{1}{n} \right) x^2 + \frac{1}{n} x = x^2 + \frac{1}{n} x(1 - x).$$
We see that $B_n(x^2, x) \ne x^2$ and in fact the error
$$|x^2 - B_n(x)| = \frac{1}{n} x(1 - x)$$
decreases like $1/n$ as $n$ increases.

Example 36.11. We compute $B_1$, $B_2$, and $B_3$ for $f(x) = e^x$ on $[0, 1]$,
$$B_1(x) = e^0 (1 - x) + e^1 x = (1 - x) + ex$$
$$B_2(x) = (1 - x)^2 + 2e^{1/2} x(1 - x) + ex^2$$
$$B_3(x) = (1 - x)^3 + 3e^{1/3} x(1 - x)^2 + 3e^{2/3} x^2 (1 - x) + ex^3.$$
We plot these functions in Fig. 36.1.

[Plot of $\exp(x)$, $B_1(x)$, $B_2(x)$, and $B_3(x)$ for $0 \le x \le 1$.]
FIGURE 36.1. The first three Bernstein polynomials for $e^x$.

We prove:

Theorem 36.4 (Bernstein Approximation Theorem) Let $f$ be a continuous function on $[0, 1]$ and $n \ge 1$ a natural number. Then
$$|f(x) - B_n(f, x)| \le \frac{9}{4}\, \omega(f, n^{-1/2}). \tag{36.19}$$

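If $f$ is Lipschitz continuous with constant $L$, then
$$|f(x) - B_n(f, x)| \le \frac{9}{4} L n^{-1/2}. \tag{36.20}$$

Theorem 36.1 follows immediately since for $\epsilon > 0$, we simply choose $n$ sufficiently large so that
$$|f(x) - B_n(f, x)| \le \frac{9}{4}\, \omega(f, n^{-1/2}) < \epsilon.$$

Using (36.10), we write the error as a sum involving the differences between $f(x)$ and the values of $f$ at the nodes:
$$f(x) - B_n(x) = \sum_{m=0}^{n} f(x)\, p_{n,m}(x) - \sum_{m=0}^{n} f(x_m)\, p_{n,m}(x) = \sum_{m=0}^{n} (f(x) - f(x_m))\, p_{n,m}(x).$$
We expect that the differences $f(x) - f(x_m)$ should be small when $x$ is close to $x_m$ by the continuity of $f$. To take advantage of this, for $\delta > 0$, we split the sum into two parts
$$f(x) - B_n(x) = \sum_{\substack{0 \le m \le n \\ |x - x_m| < \delta}} (f(x) - f(x_m))\, p_{n,m}(x) + \sum_{\substack{0 \le m \le n \\ |x - x_m| \ge \delta}} (f(x) - f(x_m))\, p_{n,m}(x). \tag{36.21}$$

The first sum is small by the continuity of $f$, since
$$\left| \sum_{\substack{0 \le m \le n \\ |x - x_m| < \delta}} (f(x) - f(x_m))\, p_{n,m}(x) \right| \le \sum_{\substack{0 \le m \le n \\ |x - x_m| < \delta}} |f(x) - f(x_m)|\, p_{n,m}(x) \le \omega(f, \delta) \sum_{\substack{0 \le m \le n \\ |x - x_m| < \delta}} p_{n,m}(x) \le \omega(f, \delta) \sum_{m=0}^{n} p_{n,m}(x) = \omega(f, \delta).$$

We can get a crude bound on the second sum in (36.21) easily. Since $f$ is continuous on $[0, 1]$ there is a constant $C$ such that $|f(x)| \le C$ for $0 \le x \le 1$. Therefore,
$$\left| \sum_{\substack{0 \le m \le n \\ |x - x_m| \ge \delta}} (f(x) - f(x_m))\, p_{n,m}(x) \right| \le 2C \sum_{\substack{0 \le m \le n \\ |x - x_m| \ge \delta}} p_{n,m}(x) \le \frac{C}{2n\delta^2}$$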

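by (36.16). So we can make the second sum as small as desired by taking $n$ large.

To get a sharper estimate on the second sum in (36.21), we use a trick similar to that used to prove Theorem 19.1. We let $M$ be the largest integer less than or equal to $|x - x_m|/\delta$ and choose $M$ uniformly spaced points $y_1, y_2, \ldots, y_M$ in the interval spanned by $x$ and $x_m$ so that each of the resulting $M + 1$ intervals has length $|x - x_m|/(M + 1) < \delta$. Now, we can write
$$f(x) - f(x_m) = (f(x) - f(y_1)) + (f(y_1) - f(y_2)) + \cdots + (f(y_M) - f(x_m)).$$
Therefore,
$$|f(x) - f(x_m)| \le (M + 1)\, \omega(f, \delta) \le \left( 1 + \frac{|x - x_m|}{\delta} \right) \omega(f, \delta).$$
We use this to estimate the second sum in (36.21),
$$\left| \sum_{\substack{0 \le m \le n \\ |x - x_m| \ge \delta}} (f(x) - f(x_m))\, p_{n,m}(x) \right| \le \omega(f, \delta) \left( \sum_{\substack{0 \le m \le n \\ |x - x_m| \ge \delta}} p_{n,m}(x) + \frac{1}{\delta} \sum_{\substack{0 \le m \le n \\ |x - x_m| \ge \delta}} |x - x_m|\, p_{n,m}(x) \right).$$
Using the fact that $|x - x_m|/\delta \ge M \ge 1$ for the terms in this sum,
$$\left| \sum_{\substack{0 \le m \le n \\ |x - x_m| \ge \delta}} (f(x) - f(x_m))\, p_{n,m}(x) \right| \le \omega(f, \delta) \left( \sum_{\substack{0 \le m \le n \\ |x - x_m| \ge \delta}} p_{n,m}(x) + \frac{1}{\delta^2} \sum_{\substack{0 \le m \le n \\ |x - x_m| \ge \delta}} (x - x_m)^2 p_{n,m}(x) \right)$$
$$\le \omega(f, \delta) \left( \sum_{m=0}^{n} p_{n,m}(x) + \frac{1}{\delta^2} \sum_{m=0}^{n} (x - x_m)^2 p_{n,m}(x) \right) \le \omega(f, \delta) \left( 1 + \frac{1}{4n\delta^2} \right)$$
by (36.11) and (36.12). So
$$\left| \sum_{\substack{0 \le m \le n \\ |x - x_m| \ge \delta}} (f(x) - f(x_m))\, p_{n,m}(x) \right| \le \omega(f, \delta) \left( 1 + \frac{1}{4n\delta^2} \right).$$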
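The last inequality in the chain above relies on a short computation that is only cited there. As a sketch of that step, expanding the square with $x_m = m/n$ and applying (36.10), (36.11), and (36.12) gives:

```latex
\sum_{m=0}^{n} (x - x_m)^2\, p_{n,m}(x)
  = x^2 \sum_{m=0}^{n} p_{n,m}(x)
    - \frac{2x}{n} \sum_{m=0}^{n} m\, p_{n,m}(x)
    + \frac{1}{n^2} \sum_{m=0}^{n} m^2\, p_{n,m}(x)
  = x^2 - 2x^2 + \frac{(n^2 - n)x^2 + nx}{n^2}
  = \frac{x(1 - x)}{n}
  \le \frac{1}{4n}.
```

This is the same quantity as $S_n/n^2$ with $S_n$ from (36.15), and the final bound uses $x(1 - x) \le 1/4$ on $[0, 1]$.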

Putting the estimates on the sums back into (36.21),
$$|f(x) - B_n(x)| \le \omega(f, \delta) \left( 2 + \frac{1}{4n\delta^2} \right).$$
Setting $\delta = n^{-1/2}$ proves the theorem.

36.5 Accuracy and Convergence

We can interpret Theorem 36.4 as saying that the Bernstein polynomials $\{B_n(f, x)\}$ converge uniformly to $f(x)$ on $[0, 1]$ as $n \to \infty$. In other words, the errors of the Bernstein polynomials $B_n$ for a given function $f$ on $[0, 1]$ tend to zero as $n$ increases. This is a strong property; unfortunately, the price is that the convergence is very slow in general.

Example 36.12. To demonstrate how slowly the Bernstein polynomials can converge, we plot the Bernstein polynomial of degree 4 for $\sin(\pi x)$ on $[0, 1]$ in Fig. 36.2.

[Plot of $\sin(\pi x)$ and $B_4(x)$ for $0 \le x \le 1$.]
FIGURE 36.2. A plot of the Bernstein polynomial $B_4(x)$ for $\sin(\pi x)$.

If the error bound in (36.19) is accurate, i.e.,
$$|f(x) - B_n(x)| \approx \frac{9}{4}\, \omega(f, n^{-1/2}) \approx C n^{-1/2}$$
for some constant $C$, then we have to increase $n$ by a factor of 100 in order to see an improvement of 10 (one additional digit of accuracy) in the error. This follows because from the computation
$$\frac{|f(x) - B_{n_1}(x)|}{|f(x) - B_{n_2}(x)|} \approx \frac{n_1^{-1/2}}{n_2^{-1/2}} = 10$$

we need $n_2 = 100 n_1$.

The error can decrease more quickly in some cases. Above, we saw that the error for $x^2$ decreases like $1/n$. But even this is relatively slow compared to some other polynomial approximations and for this reason the Bernstein polynomials are not often encountered in practice.

36.6 Unanswered Questions

We have shown that continuous functions can be approximated by polynomials. But we have not really explained why polynomials are well-suited for approximating functions. In other words, what are the properties of polynomials that make them good approximations? Are there other sets of functions that have similar approximation properties? Atkinson [2], Isaacson and Keller [15], and Rudin [19] have interesting material on these topics.
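The convergence behavior discussed in Section 36.5 can be observed numerically. The sketch below is an illustration under our own assumptions: the helper names are hypothetical, and the maximum error is only estimated on a uniform grid of sample points rather than computed exactly. It measures the error of $B_n$ for $x^2$ and for $\sin(\pi x)$ as the degree increases by factors of 10.

```python
from math import comb, pi, sin


def bernstein(f, n: int, x: float) -> float:
    """Evaluate the Bernstein polynomial B_n(f, x) on [0, 1] for n >= 1."""
    return sum(f(m / n) * comb(n, m) * x**m * (1 - x) ** (n - m) for m in range(n + 1))


def max_error(f, n: int, samples: int = 200) -> float:
    """Largest |f(x) - B_n(f, x)| over a uniform grid of sample points in [0, 1]."""
    grid = [i / samples for i in range(samples + 1)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)


for n in (4, 40, 400):
    err_sq = max_error(lambda x: x * x, n)
    err_sin = max_error(lambda x: sin(pi * x), n)
    print(f"n = {n:3d}  x^2 error = {err_sq:.5f}  sin(pi x) error = {err_sin:.5f}")
```

For $x^2$ the grid maximum is attained at $x = 1/2$, where the error $x(1 - x)/n$ from Example 36.10 equals $1/(4n)$, so each tenfold increase in $n$ gains one digit of accuracy.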

Chapter 36 Problems

36.1. Evaluate $\binom{8}{3}$.

36.2. Explain the claim that $\binom{n}{m}$ gives the number of ways that $n$ objects can be arranged in groups of $m$.

36.3. Prove (36.2).

36.4. Expand $(a + b)^6$.

36.5. Prove (36.5).

36.6. Prove (36.8).

36.7. Verify (36.12).

36.8. Determine a formula for the probability of getting exactly $n/2$ heads when tossing a fair coin $n$ times, where $n$ is even. Make a plot of the formula for $n$ in the range of 1 to 100 and test the claim that it approaches $\sqrt{2/(\pi n)}$ for $n$ large.

36.9. Prove that $S_n$ defined in (36.15) is equal to $S_n = nx(1 - x)$.

Problems 36.10-36.15 have to do with the modulus of continuity. Several of the proofs in this book could be generalized by using the modulus of continuity instead of Lipschitz continuity.

36.10. Prove that if $f$ is uniformly continuous on $[a, b]$, then for any $\delta > 0$ the set of numbers (36.17) is bounded.

36.11. Evaluate (a) $\omega(x^2, \delta)$ on $[0, 2]$ (b) $\omega(1/x, \delta)$ on $[1, 2]$ (c) $\omega(\log(x), \delta)$ on $[1, 2]$.

36.12. Verify Example 36.8.

36.13. Verify Example 36.9.

36.14. Prove that if $f$ is uniformly continuous on $[a, b]$, then $\omega(f, \delta) \to 0$ as $\delta \to 0$.

36.15. Prove that if $f$ has a continuous derivative on $[a, b]$, then $\omega(f, \delta) \le \max_{[a,b]} |f'|\, \delta$.

Computing Bernstein polynomial approximations can be tedious. You might want to use MAPLE©, for example, to do Problems 36.16-36.21.

36.16. Compute formulas for $p_{3,m}$, $m = 0, 1, 2, 3$.

36.17. Verify the computations in Example 36.11.

36.18. Compute the Bernstein polynomials for $x$ on $[0, 1]$.

36.19. Compute and plot the Bernstein polynomials for $\exp(x)$ on $[1, 3]$ of degree 1, 2, and 3.

36.20. (a) Compute a summation formula for the Bernstein polynomial for $x^3$ on $[0, 1]$ for degree 3. (b) Find an explicit formula for the Bernstein polynomial from (a) that does not involve summation. (c) Write down a formula for the error.

36.21. Compute and plot the Bernstein polynomials for $\sin(\pi x)$ on $[0, 1]$ of degree 1, 2, 3, and 4.

We have shown that the Bernstein polynomials approximate a differentiable function, which is continuous of course, uniformly well. In Problem 36.22, we ask you to show that the derivative of the function is also approximated by the derivatives of the function's Bernstein polynomials.

36.22. If $f(x)$ has a continuous first derivative in $[0, 1]$, prove that the derivatives of the Bernstein polynomials $\{P_n'(f, x)\}$ converge uniformly to $f'(x)$ on $[0, 1]$. Hint: First, verify the formulas
$$p_{n,m}' = n(p_{n-1,m-1} - p_{n-1,m}) \quad \text{for } m = 1, \ldots, n - 1,$$
$$p_{n,n}' = n\, p_{n-1,n-1}, \qquad p_{n,0}' = -n\, p_{n-1,0}.$$
Then find a summation formula for the error $f'(x) - P_n'(x)$ and rearrange the sum in terms of $p_{n-1,m}$ for $m = 0, 1, \ldots, n - 1$.

36.23. If $f$ is continuous on $[0, 1]$ and if
$$\int_0^1 f(x)\, x^n \, dx = 0 \quad \text{for } n = 0, 1, 2, \ldots,$$
then prove that $f(x) = 0$ for $0 \le x \le 1$. Hint: This says that the integral of the product of $f$ and any polynomial is zero. Use Theorem 36.1 to first prove that $\int_0^1 f^2(x)\, dx = 0$.

We say that the real numbers $\mathbb{R}$ are separable because any real number can be approximated to arbitrary accuracy by a rational number. The analogous property holds for the space of continuous functions on a closed, bounded interval, which is the content of the theorem we ask you to prove in Problem 36.24.

36.24. Prove the following extension of the Weierstrass Approximation Theorem:

Theorem 36.5 Assume that $f$ is continuous on a closed bounded interval $I = [a, b]$. Given any $\epsilon > 0$, there is a polynomial $P_n$ with rational coefficients with finite decimal expansions and of sufficiently high degree $n$ such that
$$|f(x) - P_n(x)| < \epsilon \quad \text{for } a \le x \le b.$$
Hint: Use Theorem 36.1 to first get an approximate polynomial and then analyze the effect of replacing its coefficients by rational approximations.

http://www.springer.com/978-0-387-95484-4