Approximation by Superpositions of a Sigmoidal Function

Zeitschrift für Analysis und ihre Anwendungen
Journal for Analysis and its Applications
Volume 22 (2003), No. 2, 463-470

G. Lewicki and G. Marino

Abstract. We generalize a result of Gao and Xu [4] concerning the approximation of functions of bounded variation by linear combinations of a fixed sigmoidal function to the class of functions of bounded φ-variation (Theorem 2.7). Also, in the case of one variable, [1: Proposition 1] is improved. Our proofs are similar to that of [4].

Keywords: Hölder continuity property, sigmoidal function, φ-variation, uniform approximation

AMS subject classification: 41A25, 41A30

0. Introduction

Let $g \in L^\infty(\mathbb{R})$, where $\mathbb{R}$ is considered with the Lebesgue measure. Then $g$ is called a sigmoidal function if $\lim_{t \to +\infty} g(t) = 1$ and $\lim_{t \to -\infty} g(t) = 0$. For $n \in \mathbb{N}$ set

$$G_n = \Big\{ \sum_{i=0}^{n} c_i\, g(a_i x + b_i) : a_i, b_i, c_i \in \mathbb{R} \Big\}. \tag{0.1}$$

By a result of Gao and Xu [4], each continuous function $f$ of bounded variation can be approximated, with respect to the uniform norm on the interval $[a, b]$, in the set $G_n$ with the error $C/n$, where $C > 0$ is a constant depending only on $f$. This is an interesting result in comparison with a result of Barron [1], who showed that in the multi-dimensional case, for a certain class of functions, we can get the error $C/\sqrt{n}$ in the $L_2$-norm. For other results concerning this type of approximation see, e.g., [1-3, 5].

The main result of this note is Theorem 1.1, where the approximation of functions satisfying a property (P) is considered. The class of functions satisfying property (P) is larger than the class of functions of bounded variation. In particular, as a consequence of Theorem 1.1, we get Theorem 2.7, which generalizes a result of Gao and Xu [4].

Note that the approximation of functions by superpositions of a sigmoidal function has many applications in neural networks. Usually these problems require multi-dimensional approximation, but we hope that our one-dimensional results help to understand multi-dimensional procedures better.

G. Lewicki: Jagiellonian Univ., Dept. Math., Reymonta 4, PL - 30-059 Kraków, Poland; lewicki@im.uj.edu.pl
G. Marino: Univ. of Calabria, Dept. Math., IT - 87036 Arcavacata di Rende, Cosenza, Italy; gmarino@unical.it
ISSN 0232-2064 / $ 2.50 © Heldermann Verlag Berlin

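1. Main result

Our main result is the following

Theorem 1.1. Let $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous, strictly increasing function such that $\varphi(0) = 0$. Let the function $f \in C_{\mathbb{R}}[a, b]$ satisfy the property

(P) There exists a constant $C > 0$ such that for every $n \in \mathbb{N}$ we can select a partition $a = x_0 < x_1 < \dots < x_n = b$ such that for every $i = 1, \dots, n$, if $x, y \in I_i = [x_{i-1}, x_i]$, then

$$|f(x) - f(y)| \le \varphi^{-1}\Big(\frac{C}{n}\Big) \tag{1.1}$$

and let $g \in L^\infty(\mathbb{R})$ be a fixed sigmoidal function. Then

$$\operatorname{dist}(f, G_n) \le (1 + 8\|g\|_\infty)\, \varphi^{-1}\Big(\frac{C}{n}\Big) \tag{1.2}$$

where the distance is taken with respect to the supremum norm on $[a, b]$, denoted by $\|\cdot\|_{[a,b]}$.

Proof. Take $a_1 < a$ and $b_1 > b$ and let us extend the function $f$ to $f_1 \in C_{\mathbb{R}}[a_1, b_1]$ by putting $f_1(x) = f(a)$ for $x \in [a_1, a]$ and $f_1(x) = f(b)$ for $x \in [b, b_1]$. Observe that $f_1$ also satisfies property (P) with the same constants as $f$. Indeed, if $a = x_0 < x_1 < \dots < x_n = b$ is a partition taken for $f$ from property (P), then $f_1$ with the partition $y_0 = a_1$, $y_i = x_i$ for $i = 1, \dots, n-1$ and $y_n = b_1$ satisfies property (P). Moreover, $\|f\|_{[a,b]} = \|f_1\|_{[a_1,b_1]}$.

Now fix $n \in \mathbb{N}$ and the partition $a_1 = y_0 < y_1 < \dots < y_n = b_1$ constructed as above. Choose $\delta > 0$ with

$$3\delta < \min\big\{ y_{j+1} - y_j,\ |a_1 - a|,\ |b_1 - b| : j = 0, \dots, n-1 \big\} \tag{1.3}$$

and take $\varepsilon > 0$ with

$$4(n-1)\, \|f\|_{[a,b]}\, \varepsilon \le \varphi^{-1}\Big(\frac{C}{n}\Big). \tag{1.4}$$

Select $N \in \mathbb{N}$ such that for any $x \in [a, b]$ and $i = 0, \dots, n$,

$$|g(N(x - y_i)) - 1| < \varepsilon \quad \text{if } x - y_i > \delta, \tag{1.5}$$

$$|g(N(x - y_i))| < \varepsilon \quad \text{if } x - y_i < -\delta, \tag{1.6}$$

which is possible since $\lim_{x \to +\infty} g(x) = 1$ and $\lim_{x \to -\infty} g(x) = 0$. Define for $i = 1, \dots, n$

$$g_i(x) = g(N(x - y_{i-1})) - g(N(x - y_i)) \tag{1.7}$$

and set

$$P_n f(x) = \sum_{i=1}^{n} f_1(y_{i-1})\, g_i(x). \tag{1.8}$$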
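
The construction (1.7)-(1.8) can be tried numerically. The following is a minimal sketch, not part of the paper: it assumes the logistic function as the sigmoidal $g$ (so $\|g\|_\infty = 1$), a uniform partition of $[a, b]$, and a fixed steepness `N` instead of one derived from $\varepsilon$ and $\delta$. The names `sigmoid`, `build_Pn` and `margin` are illustrative only.

```python
import math

def sigmoid(t):
    """Logistic function, written so that exp never overflows (assumed choice of g)."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def build_Pn(f, a, b, n, N, margin=1.0):
    """Return x -> P_n f(x) as in (1.7)-(1.8).

    Assumptions of this sketch: a1 = a - margin, b1 = b + margin, and the
    interior nodes y_1, ..., y_{n-1} are equally spaced in (a, b).
    """
    a1, b1 = a - margin, b + margin
    y = [a1] + [a + i * (b - a) / n for i in range(1, n)] + [b1]

    def f1(x):
        # constant extension of f to [a1, b1]
        return f(min(max(x, a), b))

    coeff = [f1(y[i - 1]) for i in range(1, n + 1)]  # f1(y_{i-1}), i = 1..n

    def Pn(x):
        s = [sigmoid(N * (x - yi)) for yi in y]      # g(N(x - y_i)), i = 0..n
        # g_i(x) = s[i-1] - s[i], see (1.7)
        return sum(coeff[i - 1] * (s[i - 1] - s[i]) for i in range(1, n + 1))

    return Pn

# Usage: f(x) = |x| on [-1, 1] is Lipschitz with L = 1, so with phi = id one
# may take C = L(b - a) = 2 and (1.2) reads dist <= 9 * 2 / n.
n = 20
Pn = build_Pn(abs, -1.0, 1.0, n, N=2000.0)
grid = [-1.0 + 2.0 * k / 1000 for k in range(1001)]
max_err = max(abs(abs(x) - Pn(x)) for x in grid)
print(max_err, "<=", 9 * 2 / n)
```

The sampled error only probes the supremum norm on a finite grid, so it illustrates the bound (1.2) rather than verifies it.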

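Observe that $P_n f \in G_n$. Now we estimate $|f(x) - P_n f(x)|$ for any $x \in [a, b]$. First note that

$$\Big| \sum_{i=1}^{n} g_i(x) - 1 \Big| \le 2\varepsilon \tag{1.9}$$

for any $x \in [a, b]$. Indeed, the sum telescopes:

$$\sum_{i=1}^{n} g_i(x) = g(N(x - a_1)) - g(N(x - y_1)) + \dots + g(N(x - y_{n-1})) - g(N(x - b_1)) = g(N(x - a_1)) - g(N(x - b_1)).$$

Since $x \in [a, b]$, we have $x - a_1 > \delta$ and $x - b_1 < -\delta$, so by (1.5)-(1.6) and the above calculation,

$$\Big| \sum_{i=1}^{n} g_i(x) - 1 \Big| \le |g(N(x - a_1)) - 1| + |g(N(x - b_1))| \le 2\varepsilon$$

as required.

Now fix $x \in [a, b]$ and $j \in \{1, \dots, n\}$ such that $x \in [y_{j-1}, y_j)$. Then, by (1.9),

$$\begin{aligned}
|f(x) - P_n f(x)| &\le \Big| f_1(x) - f_1(x) \sum_{i=1}^{n} g_i(x) \Big| + \Big| f_1(x) \sum_{i=1}^{n} g_i(x) - \sum_{i=1}^{n} f_1(y_{i-1})\, g_i(x) \Big| \\
&\le 2\|f\|_{[a,b]}\, \varepsilon + \sum_{i=1}^{n} |f(x) - f_1(y_{i-1})|\, |g_i(x)| \\
&= 2\|f\|_{[a,b]}\, \varepsilon + \sum_{|i-j| > 1} |f(x) - f_1(y_{i-1})|\, |g_i(x)| + \sum_{|i-j| \le 1} |f(x) - f_1(y_{i-1})|\, |g_i(x)|.
\end{aligned} \tag{1.10}$$

Now we estimate the first sum of (1.10). If $j - i > 1$, then $x - y_{i-1} > \delta$ and $x - y_i > \delta$. Consequently, by (1.5),

$$|g_i(x)| \le |g(N(x - y_{i-1})) - 1| + |g(N(x - y_i)) - 1| \le 2\varepsilon.$$

Analogously, if $j - i < -1$, then $x - y_{i-1} < -\delta$ and $x - y_i < -\delta$. Hence, by (1.6),

$$|g_i(x)| \le |g(N(x - y_{i-1}))| + |g(N(x - y_i))| \le 2\varepsilon.$$

Finally,

$$\sum_{|i-j| > 1} |f_1(x) - f_1(y_{i-1})|\, |g_i(x)| \le 4(n-2)\, \|f\|_{[a,b]}\, \varepsilon. \tag{1.11}$$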

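To estimate the second sum of (1.10) observe that

$$|g_i(x)| \le 2\|g\|_\infty.$$

Also,

$$|f(x) - f_1(y_{j-1})| \le \varphi^{-1}\Big(\frac{C}{n}\Big), \qquad |f(x) - f_1(y_j)| \le \varphi^{-1}\Big(\frac{C}{n}\Big).$$

Consequently,

$$|f(x) - f_1(y_{j-2})| \le |f(x) - f_1(y_{j-1})| + |f_1(y_{j-2}) - f_1(y_{j-1})| \le 2\varphi^{-1}\Big(\frac{C}{n}\Big).$$

Hence

$$\sum_{|i-j| \le 1} |f(x) - f_1(y_{i-1})|\, |g_i(x)| \le 2\|g\|_\infty \sum_{i=j-1}^{j+1} |f(x) - f_1(y_{i-1})| \le 8\|g\|_\infty\, \varphi^{-1}\Big(\frac{C}{n}\Big). \tag{1.12}$$

By (1.4) and (1.10)-(1.12) we get

$$|f(x) - P_n f(x)| \le (1 + 8\|g\|_\infty)\, \varphi^{-1}\Big(\frac{C}{n}\Big).$$

Hence

$$\operatorname{dist}(f, G_n) \le \|f - P_n f\|_{[a,b]} \le (1 + 8\|g\|_\infty)\, \varphi^{-1}\Big(\frac{C}{n}\Big)$$

as required. The proof of Theorem 1.1 is complete.

Remark 1.2. Theorem 1.1 holds true for complex-valued, continuous functions defined on the interval $[a, b]$ satisfying property (P). The proof goes in the same manner.

2. Further results

First let us state the following

Example 2.1. Suppose that $f \in C_{\mathbb{R}}[a, b]$ satisfies the property

$$|f(x) - f(y)| \le \varphi^{-1}(L|x - y|) \tag{2.1}$$

for any $x, y \in [a, b]$ with a constant $L > 0$ depending only on $f$. Let $\varphi$ be as in Theorem 1.1. Fix $n \in \mathbb{N}$ and put $x_i = a + \frac{i}{n}(b - a)$ for $i = 0, \dots, n$. Observe that if $x, y \in I_i = [x_{i-1}, x_i]$, then

$$|f(x) - f(y)| \le \varphi^{-1}(L|x - y|) \le \varphi^{-1}(L|x_{i-1} - x_i|) = \varphi^{-1}\Big(\frac{L(b - a)}{n}\Big).$$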

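Hence (2.1) implies property (P). In particular, if $\varphi(t) = t^p$ for some $p \in [1, +\infty)$, then (2.1) means that $f$ has the Hölder (Lipschitz, if $p = 1$) continuity property with $\alpha = \frac{1}{p}$. In this case, by Theorem 1.1, we get

$$\operatorname{dist}(f, G_n) \le C\, \frac{(L(b - a))^\alpha}{n^\alpha},$$

where one may take $C = 1 + 8\|g\|_\infty$ by (1.2). Observe that this type of estimate holds true for any norm weaker than the supremum norm.

Theorem 2.2. Let $h : \mathbb{R} \to \mathbb{K}$, where $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$, satisfy the Hölder continuity property with $\alpha \in (0, 1]$. Suppose that $\mu$ is a Borel measure on $\mathbb{R}$ and $u : \mathbb{R} \to \mathbb{K}$ is a $\mu$-measurable function such that

$$\int_{\mathbb{R}} |t|^\alpha\, |u(t)|\, d\mu(t) < +\infty. \tag{2.2}$$

Let $E \subset \mathbb{R}$ be a compact set and define $f : E \to \mathbb{K}$ by

$$f(x) = \int_{\mathbb{R}} h(tx)\, u(t)\, d\mu(t). \tag{2.3}$$

Then

$$\operatorname{dist}(f, G_n) \le \frac{C}{n^\alpha},$$

where the distance is taken with respect to the supremum norm on $E$.

Proof. Without loss of generality, we can assume that $E = [a, b]$. First we show that $f$ satisfies the Hölder continuity property with $\alpha$ given by the assumption on $h$. Indeed, with $L$ denoting the Hölder constant of $h$,

$$|f(x) - f(y)| = \Big| \int_{\mathbb{R}} \big(h(tx) - h(ty)\big)\, u(t)\, d\mu(t) \Big| \le L \int_{\mathbb{R}} |tx - ty|^\alpha\, |u(t)|\, d\mu(t) = L|x - y|^\alpha \int_{\mathbb{R}} |t|^\alpha\, |u(t)|\, d\mu(t).$$

By (2.2), the result follows from Example 2.1 and Theorem 1.1.

Example 2.3. Set $h(x) = e^{ix}$ and let $f$ be given by (2.3). Observe that

$$|h(x) - h(y)| \le |\cos x - \cos y| + |\sin x - \sin y| \le 2|x - y|.$$

Hence, for any compact set $E \subset \mathbb{R}$,

$$\operatorname{dist}(f, G_n) \le \frac{C}{n} \tag{2.4}$$

where $C > 0$ is a constant depending on $h$ and $E$ and where the distance is taken with respect to the supremum norm on $E$. Observe that this estimate holds true for any norm weaker than the supremum norm on $E$, in particular in any $L_p$-norm. Hence (2.4) is an essential improvement, in the case of one variable, of a result of Barron [1: Proposition 1]. He showed that, for $h(x) = e^{ix}$ and any $\mu$-measurable function $u$ satisfying (2.2) with $\alpha = 1$,

$$\operatorname{dist}_{L_2}(f, G_n) \le \frac{C_1}{\sqrt{n}}$$

where $C_1 > 0$ is a constant depending only on $E$ and where the distance is taken with respect to the norm in $L_2(E, \mu)$.

To present another application of Theorem 1.1 we need the following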

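Definition 2.4. Let $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$ be as in Theorem 1.1, let $f \in C_{\mathbb{R}}[a, b]$ and set

$$V_\varphi(f)_{[a,b]} = \sup\Big\{ \sum_{j=0}^{n-1} \varphi\big(|f(x_{j+1}) - f(x_j)|\big) : a = x_0 < x_1 < \dots < x_n = b \Big\}. \tag{2.5}$$

We say that $f$ has bounded φ-variation if $V_\varphi(f)_{[a,b]} < +\infty$.

In the sequel, we need two well-known lemmas. The simple proof of the first lemma will be omitted. However, for the sake of completeness we present a proof of the second lemma.

Lemma 2.5. Let $\varphi$ be as in Theorem 1.1 and $f \in C_{\mathbb{R}}[a, b]$. If $a \le a_1 \le a_2$ and $b \ge b_1 \ge b_2$, then

$$V_\varphi(f)_{[a_2,b_2]} \le V_\varphi(f)_{[a_1,b_1]}. \tag{2.6}$$

Moreover, if $c \in (a, b)$, then

$$V_\varphi(f)_{[a,c]} + V_\varphi(f)_{[c,b]} \le V_\varphi(f)_{[a,b]}. \tag{2.7}$$

Lemma 2.6. Let $f \in C_{\mathbb{R}}[a, b]$ have bounded φ-variation. Then for every $n \in \mathbb{N}$ there exists a partition $a = x_0 < x_1 < \dots < x_n = b$ such that

$$V_\varphi(f)_{I_i} \le \frac{1}{n}\, V_\varphi(f)_{[a,b]} \tag{2.8}$$

where $I_i = [x_{i-1}, x_i]$ for $i = 1, \dots, n$.

Proof. For $x \in [a, b]$ set

$$h(x) = V_\varphi(f)_{[a,x]} \tag{2.9}$$

with $h(a) = 0$ and show that $h$ is continuous. For this fix $\varepsilon > 0$. Then we can find $\delta > 0$ such that, for any $w, z \in [0, 2\|f\|_{[a,b]}]$ with $|w - z| < \delta$, we have $|\varphi(w) - \varphi(z)| < \varepsilon$. Also, there exists $\delta_1 > 0$ such that $|f(x) - f(y)| < \delta$ if $|x - y| < \delta_1$. In the case $x \ne a$, since $h$ is increasing, there exist

$$h_-(x) = \lim_{y \to x^-} h(y) \le h(x) \le h_+(x) = \lim_{y \to x^+} h(y). \tag{2.10}$$

Hence to prove the continuity of $h$ it is enough to show that $h_-(x) = h(x) = h_+(x)$. Suppose, on the contrary, that

$$h_-(x) + \varepsilon < h(x) \tag{2.11}$$

for some $\varepsilon > 0$, with $\delta$ and $\delta_1$ chosen as above for this $\varepsilon$. Let $a = z_0 < z_1 < \dots < z_n = x$ be chosen such that

$$\sum_{j=0}^{n-1} \varphi\big(|f(z_{j+1}) - f(z_j)|\big) > h_-(x) + \varepsilon. \tag{2.12}$$

Take $y \in (z_{n-1}, x)$ with $x - y < \delta_1$. Then

$$\big|\, |f(y) - f(z_{n-1})| - |f(x) - f(z_{n-1})|\, \big| \le |f(y) - f(x)| < \delta.$$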

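Hence

$$\varphi\big(|f(y) - f(z_{n-1})|\big) \ge \varphi\big(|f(x) - f(z_{n-1})|\big) - \varepsilon.$$

Consequently,

$$\sum_{j=0}^{n-1} \varphi\big(|f(z_{j+1}) - f(z_j)|\big) \le \sum_{j=0}^{n-2} \varphi\big(|f(z_{j+1}) - f(z_j)|\big) + \varphi\big(|f(y) - f(z_{n-1})|\big) + \varepsilon \le h(y) + \varepsilon,$$

which with (2.12) implies $h(y) > h_-(x)$. This is a contradiction, because $h$ is increasing and $y < x$. The proof of the facts that $h_+(x) = h(x)$ for any $x \in (a, b)$ and $\lim_{y \to a^+} h(y) = h(a) = 0$ goes in a similar manner, so it will be omitted.

Now fix $n \in \mathbb{N}$. Since $h$ is continuous and increasing, there exists a partition

$$a = x_0 < x_1 < \dots < x_n = b \tag{2.13}$$

with

$$h(x_i) = \frac{i}{n}\, V_\varphi(f)_{[a,b]}. \tag{2.14}$$

To end the proof of the lemma observe that, by Lemma 2.5, for $i = 0, \dots, n-1$

$$V_\varphi(f)_{[x_i, x_{i+1}]} \le h(x_{i+1}) - h(x_i) = \frac{1}{n}\, V_\varphi(f)_{[a,b]}.$$

The proof of Lemma 2.6 is complete.

Now suppose that $f \in C_{\mathbb{R}}[a, b]$ has bounded φ-variation. By Lemma 2.6, for any $n \in \mathbb{N}$, $i = 1, \dots, n$ and $x, y \in I_i = [x_{i-1}, x_i]$, where the $x_i$ are given by (2.13),

$$\varphi\big(|f(x) - f(y)|\big) \le V_\varphi(f)_{I_i} \le \frac{1}{n}\, V_\varphi(f)_{[a,b]}.$$

Hence $f$ satisfies property (P) from Theorem 1.1 with $C = V_\varphi(f)_{[a,b]}$. Consequently, applying Theorem 1.1, we can prove

Theorem 2.7. Let $f \in C_{\mathbb{R}}[a, b]$ be a function with bounded φ-variation. Then

$$\operatorname{dist}(f, G_n) \le (1 + 8\|g\|_\infty)\, \varphi^{-1}\Big(\frac{V_\varphi(f)_{[a,b]}}{n}\Big).$$

Remark 2.8. If $\varphi(t) = t^p$ for $p \in [1, +\infty)$, by Theorem 2.7 we get

$$\operatorname{dist}(f, G_n) \le (1 + 8\|g\|_\infty) \Big(\frac{V_\varphi(f)_{[a,b]}}{n}\Big)^{\frac{1}{p}}. \tag{2.15}$$

If $p = 1$, this has been proven by Gao and Xu in [4]. Observe that there exist continuous functions $f$ such that $V_{\mathrm{id}}(f)_{[a,b]} = +\infty$ and $V_{t^p}(f)_{[a,b]} < +\infty$ for any $p \in (1, +\infty)$. Indeed, if we put $f(0) = 0$, $f(\frac{1}{n}) = (-1)^n \frac{1}{n}$ for $n \in \mathbb{N}$ and extend $f$ in a linear way on the intervals $(\frac{1}{n}, \frac{1}{n-1})$, we get a continuous function on $[0, 1]$ satisfying this property. Observe that for such functions it is impossible to estimate the error of approximation by $G_n$ applying the result of Gao and Xu. But it can be done applying (2.15).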
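
The behaviour of the function from Remark 2.8 can be illustrated numerically. The sketch below is an illustration under stated assumptions, not a proof: it evaluates the sum in (2.5) with $\varphi(t) = t^p$ only along the particular partitions $0 < \frac{1}{M} < \dots < \frac{1}{2} < 1$, so it gives lower bounds for $V_{t^p}(f)_{[0,1]}$. For $p = 1$ these sums equal $2H_M - 1$, with $H_M$ the $M$-th harmonic number, and therefore grow without bound; for $p = 2$ they stay bounded. The helper names `f_node` and `node_sum` are hypothetical.

```python
def f_node(m):
    """Value f(1/m) = (-1)^m / m from Remark 2.8."""
    return (-1) ** m / m

def node_sum(p, M):
    """Sum of |f(x_{j+1}) - f(x_j)|^p over the partition 0 < 1/M < ... < 1/2 < 1."""
    values = [0.0] + [f_node(m) for m in range(M, 0, -1)]  # f(0), f(1/M), ..., f(1)
    return sum(abs(v2 - v1) ** p for v1, v2 in zip(values, values[1:]))

# Usage: compare the growth of the partition sums for p = 1 and p = 2.
for M in (10, 100, 1000, 10000):
    print(M, node_sum(1, M), node_sum(2, M))
```

Only the nodes $\frac{1}{m}$ are used because $f$ is linear between them, so for $p \ge 1$ refining a partition inside a linear piece cannot increase the sum.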

References

[1] Barron, A. R.: Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans. Inf. Theory 39 (1993), 930-945.
[2] Cybenko, G.: Approximations by superpositions of a sigmoidal function. Math. Control Signal Systems 2 (1989), 303-314.
[3] Hornik, K., Stinchcombe, M. and H. White: Multilayer feedforward networks are universal approximators. Neural Networks 2 (1989), 359-366.
[4] Gao, B. and Y. Xu: Univariant approximation by superpositions of a sigmoidal function. J. Math. Anal. & Appl. 178 (1993), 221-226.
[5] Jones, L. K.: A simple lemma on greedy approximation in Hilbert space and convergence rate for projections pursuit regression and neural network training. Ann. Statist. 20 (1992), 608-613.

Received 26.06.2002