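Week 10. An Introduction to Wavelet Regression

1. Definition. A wavelet $\psi$ is a function such that $\{\psi_{jk} = 2^{j/2}\psi(2^j\cdot - k);\ j,k\in\mathbb{Z}\}$ is an orthonormal basis for $L^2(\mathbb{R})$. This function is called the mother wavelet, and it can often be constructed from a father wavelet $\varphi$. The father wavelet $\varphi$ is not a wavelet, but we can construct wavelets from it, so it is as important as the mother wavelet.

Example: Haar wavelet (A. Haar, Math. Ann. (1910)):

$$\psi(x) = \begin{cases} 1, & x\in[0,1/2),\\ -1, & x\in[1/2,1),\end{cases}
\qquad
\psi_{jk}(x) = \begin{cases} 2^{j/2}, & x\in\big[2^{-j}k,\ 2^{-j}k+2^{-(j+1)}\big),\\ -2^{j/2}, & x\in\big[2^{-j}k+2^{-(j+1)},\ 2^{-j}k+2^{-j}\big).\end{cases}$$

Let $\varphi(t)$ be the indicator function of the interval $[0,1)$. Let's define

$$V_j = \mathrm{span}\{\varphi_{jk} = 2^{j/2}\varphi(2^j\cdot - k),\ k\in\mathbb{Z}\}.$$

Since $\varphi(x) = \varphi(2x) + \varphi(2x-1)$, the following four properties are satisfied:
(i) $\cdots\subset V_{-2}\subset V_{-1}\subset V_0\subset V_1\subset V_2\subset\cdots$;
(ii) $f(x)\in V_j \iff f(2x)\in V_{j+1}$;
(iii) $\overline{\bigcup_{j\in\mathbb{Z}} V_j} = L^2(\mathbb{R})$;
(iv) there is a function $\varphi$ such that $\{\varphi(\cdot-k);\ k\in\mathbb{Z}\}$ is an orthonormal basis for $V_0$.

An equivalent form of property (iii): Recall that

$$\hat f(\omega) = \int f(x)\exp(-i\omega x)\,dx.$$

It can be shown that, for any $\varphi$, property (iii) is equivalent to $\hat\varphi(0)\neq 0$ and $|\hat\varphi|$ being continuous at $0$. A sketch of the proof is as follows. First, $\overline{\bigcup_{j\in\mathbb{Z}} V_j}$ is invariant under all translations. Second, if $g$ is orthogonal to $\bigcup_{j\in\mathbb{Z}} V_j$, then $g(x)$ is orthogonal to $\varphi_{jk}(x+t)$ for all $t$, and the Plancherel formula implies $\hat g(\omega)\,\hat\varphi(2^{-j}\omega) = 0$ a.e. We know $\hat\varphi\neq 0$ around $0$. Letting $j\to\infty$, we conclude $\hat g(\omega) = 0$ a.e.

From $\varphi$ to $\psi$: Observe that $\psi(x) = \varphi(2x) - \varphi(2x-1)$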
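and $\langle\varphi,\psi\rangle = 0$. If we define

$$W_j = \mathrm{span}\{\psi_{jk} = 2^{j/2}\psi(2^j\cdot - k),\ k\in\mathbb{Z}\},$$

then $V_0\oplus W_0 = V_1$, and more generally $V_j\oplus W_j = V_{j+1}$. We also see that

$$\bigoplus_{j=-\infty}^{J} W_j = V_{J+1}\to L^2(\mathbb{R}),\qquad J\to\infty,$$

in other words, $\{\psi_{jk} = 2^{j/2}\psi(2^j\cdot - k);\ j,k\in\mathbb{Z}\}$ is an orthonormal basis for $L^2(\mathbb{R})$.

2. Multiresolution analysis (MRA): a general framework to construct wavelet functions

More generally, suppose there is $\varphi$ such that $\{\varphi(\cdot-k);\ k\in\mathbb{Z}\}$ is an orthonormal system,

$$\frac{1}{\sqrt2}\,\varphi\!\left(\frac{x}{2}\right) = \sum_k h_k\,\varphi(x-k),$$

and $|\hat\varphi|$ is continuous at $0$ with $\hat\varphi(0)\neq 0$. Let's define

$$V_j = \mathrm{span}\{2^{j/2}\varphi(2^j\cdot - k),\ k\in\mathbb{Z}\}.$$

Then the following four properties are satisfied:
i) $\cdots\subset V_{-2}\subset V_{-1}\subset V_0\subset V_1\subset V_2\subset\cdots$;
ii) $f(x)\in V_j \iff f(2x)\in V_{j+1}$;
iii) $\overline{\bigcup_{j\in\mathbb{Z}} V_j} = L^2(\mathbb{R})$;
iv) there is a function $\varphi$ such that $\{\varphi(\cdot-k);\ k\in\mathbb{Z}\}$ is an orthonormal basis for $V_0$.

This is called an MRA in $L^2(\mathbb{R})$. It is easy to see that property (iv) is equivalent to

$$\delta_{k,0} = \int\varphi(x)\,\overline{\varphi(x-k)}\,dx = \frac{1}{2\pi}\int|\hat\varphi(\omega)|^2 e^{ik\omega}\,d\omega = \frac{1}{2\pi}\int_0^{2\pi}\sum_{l=-\infty}^{\infty}|\hat\varphi(\omega+2\pi l)|^2 e^{ik\omega}\,d\omega$$

for all $k$, i.e., $\sum_{l=-\infty}^{\infty}|\hat\varphi(\omega+2\pi l)|^2 = 1$ a.e.

From $\varphi$ to $\psi$: There is a sequence $\{h_k\}$ such that

$$\frac{1}{\sqrt2}\,\varphi\!\left(\frac{x}{2}\right) = \sum_k h_k\,\varphi(x-k),$$

then

$$\hat\varphi(2\omega) = \frac{1}{\sqrt2}\,\hat\varphi(\omega)\,\hat h(\omega) \triangleq \frac{1}{\sqrt2}\,\hat\varphi(\omega)\sum_k h_k\exp(-ik\omega).$$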
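It is easy to see that $\hat h(0) = \sqrt2$ and

$$|\hat h(\omega)|^2 + |\hat h(\omega+\pi)|^2 = 2,$$

which is due to the identity $\sum_{l=-\infty}^{\infty}|\hat\varphi(\omega+2\pi l)|^2 = 1$. If $\psi$ satisfies

$$\frac{1}{\sqrt2}\,\psi\!\left(\frac{x}{2}\right) = \sum_k g_k\,\varphi(x-k),$$

then

$$\hat\psi(2\omega) = \frac{1}{\sqrt2}\,\hat\varphi(\omega)\,\hat g(\omega) \triangleq \frac{1}{\sqrt2}\,\hat\varphi(\omega)\sum_k g_k\exp(-ik\omega).$$

Define $g_k = (-1)^k h_{1-k}$ (with $h_k$ real), i.e.,

$$\hat g(\omega) = \sum_k(-1)^k h_{1-k}\exp(-ik\omega) = \sum_k(-1)^{1-k}h_k\exp(-i(1-k)\omega) = -e^{-i\omega}\,\overline{\hat h(\omega+\pi)};$$

then $|\hat g(\omega)|^2 + |\hat g(\omega+\pi)|^2 = 2$, so that

$$\sum_{l=-\infty}^{\infty}|\hat\psi(\omega+2\pi l)|^2 = 1,$$

which implies $\{\psi(\cdot-k);\ k\in\mathbb{Z}\}$ is an orthonormal system, and

$$\hat g(\omega)\overline{\hat h(\omega)} + \hat g(\omega+\pi)\overline{\hat h(\omega+\pi)} = 0,$$

which implies $\mathrm{span}\{\psi(\cdot-k);\ k\in\mathbb{Z}\}\perp\mathrm{span}\{\varphi(\cdot-k);\ k\in\mathbb{Z}\}$ because

$$\langle\psi(\cdot-k),\varphi\rangle = \frac{1}{2\pi}\int e^{-ik\omega}\hat\psi(\omega)\overline{\hat\varphi(\omega)}\,d\omega = \frac{1}{4\pi}\int e^{-ik\omega}|\hat\varphi(\omega/2)|^2\,\hat g(\omega/2)\overline{\hat h(\omega/2)}\,d\omega$$
$$= \frac{1}{2\pi}\int_0^{\pi} e^{-2ik\omega}\Big[\hat g(\omega)\overline{\hat h(\omega)} + \hat g(\omega+\pi)\overline{\hat h(\omega+\pi)}\Big]d\omega = 0,$$

and it is easy to see that

$$\mathrm{span}\{\psi(\cdot-k);\ k\in\mathbb{Z}\}\oplus\mathrm{span}\{\varphi(\cdot-k);\ k\in\mathbb{Z}\} = V_1.$$

More generally $V_j\oplus W_j = V_{j+1}$ and

$$\bigoplus_{j=-\infty}^{J} W_j = V_{J+1}\to L^2(\mathbb{R}),\qquad J\to\infty,$$

where $W_j = \mathrm{span}\{\psi_{jk} = 2^{j/2}\psi(2^j\cdot - k);\ k\in\mathbb{Z}\}$.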
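The filter identities above can be checked numerically for the Haar case. The following is a minimal sketch, assuming the Haar filter $h_0 = h_1 = 1/\sqrt2$ and the convention $\hat h(\omega) = \sum_k h_k e^{-ik\omega}$; the function names `transfer` and `check_identities` are hypothetical helpers introduced only for this illustration.

```python
import cmath
import math

# Haar low-pass filter from (1/sqrt 2) phi(x/2) = sum_k h_k phi(x - k).
h = {0: 1 / math.sqrt(2), 1: 1 / math.sqrt(2)}
# High-pass filter g_k = (-1)^k h_{1-k}.
g = {k: (-1) ** k * h[1 - k] for k in (0, 1)}


def transfer(filt, w):
    """Transfer function sum_k filt[k] * exp(-i k w)."""
    return sum(c * cmath.exp(-1j * k * w) for k, c in filt.items())


def check_identities(w):
    """Return three quantities that should equal 2, 2 and 0 at frequency w."""
    hw, hp = transfer(h, w), transfer(h, w + math.pi)
    gw, gp = transfer(g, w), transfer(g, w + math.pi)
    power_h = abs(hw) ** 2 + abs(hp) ** 2
    power_g = abs(gw) ** 2 + abs(gp) ** 2
    cross = gw * hw.conjugate() + gp * hp.conjugate()
    return power_h, power_g, cross


# Evaluate the identities on a grid of frequencies in [0, 2*pi).
grid = [2 * math.pi * i / 64 for i in range(64)]
results = [check_identities(w) for w in grid]
max_err = max(max(abs(ph - 2), abs(pg - 2), abs(cr)) for ph, pg, cr in results)
print("largest deviation from the identities:", max_err)
```

The deviation should be at the level of floating-point rounding. Other real filters could be substituted for `h`, provided their indices are chosen so that `h[1 - k]` is defined for every `k` used in `g`.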
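Construct $\varphi$ (Meyer, Mallat): If $\hat h(\omega)$ is $2\pi$-periodic, $C^1$ near $\omega = 0$, and
(i) $\hat h(0) = \sqrt2$ and $|\hat h(\omega)|^2 + |\hat h(\omega+\pi)|^2 = 2$,
(ii) $\inf_{\omega\in[-\pi/2,\pi/2]}|\hat h(\omega)| > 0$,
then

$$\hat\varphi(\omega) = \prod_{j=1}^{\infty}\frac{1}{\sqrt2}\,\hat h(2^{-j}\omega)$$

is the Fourier transform of a scaling function $\varphi\in L^2$ that generates an MRA.

3. Wavelets on the interval

The Haar wavelet can be modified to give an orthogonal basis for $L^2(0,1)$:

$$\varphi(x);\quad 2^{j/2}\psi(2^j x - k),\quad j = 0,1,2,\ldots;\ k = 0,1,\ldots,2^j-1.$$

Some important developments:
1) Meyer (1985), $C^\infty$ wavelets.
2) Mallat and Meyer (1987), multiresolution analysis, which gives an easy way to construct wavelets and also a fast algorithm (the so-called pyramid algorithm, or Mallat algorithm).
3) Daubechies (1988-1992), compactly supported $C^r$ wavelets ($r$ is a positive number), for example, Daubechies wavelets, Symlets, Coiflets.
4) Cohen, Daubechies and Vial (1993), smooth wavelets on an interval.

4. Good wavelets: vanishing moments

The wavelet $\psi$ is said to have $r$ vanishing moments if

$$\int x^k\psi(x)\,dx = 0,\qquad k = 0,1,2,\ldots,r-1.$$

Thus $\psi$ is orthogonal to all polynomials of degree at most $r-1$. For instance, $r = 1$ for the Haar wavelet.

A function $f$ is said to be $C^\alpha$ ($\alpha > 0$) on the interval $I\subset\mathbb{R}$ if there exists a constant $C$ and, for every $x\in I$, a polynomial $p_x(y)$ of degree $\lfloor\alpha\rfloor$ such that

$$|f(x+y) - p_x(y)|\le C|y|^\alpha,\qquad x+y\in I.$$

Lemma: If $f$ is $C^\alpha$ on $\mathbb{R}$ and $\psi$ has at least $r = \lfloor\alpha\rfloor+1$ vanishing moments, then

$$|\langle f,\psi_{jk}\rangle|\le c_\psi\,C\,2^{-j(\alpha+1/2)}.$$

Proof: Using a change of variable and the vanishing moments property,

$$|\langle f,\psi_{jk}\rangle| = \left|\int 2^{-j/2}\Big[f\big(2^{-j}k + 2^{-j}v\big) - p_{2^{-j}k}\big(2^{-j}v\big)\Big]\psi(v)\,dv\right| \le C\,2^{-j(\alpha+1/2)}\int|v|^\alpha|\psi(v)|\,dv.$$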
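The decay rate in the lemma above can be illustrated with exact Haar coefficients. This is a sketch under stated assumptions: the test function $f(x) = \sqrt{x}$ on $[0,1]$ is our own illustrative choice (it is $C^{1/2}$ with $C = 1$, and the Haar wavelet has $r = 1 = \lfloor 1/2\rfloor + 1$ vanishing moments), and the helper names are hypothetical. For this $f$ the lemma's constant is $c_\psi C = \int_0^1 v^{1/2}\,dv = 2/3$.

```python
import math

ALPHA = 0.5  # Hoelder exponent of f(x) = sqrt(x) on [0, 1] (illustrative choice)


def antiderivative(x):
    """F(x) = (2/3) x^{3/2}, an antiderivative of f(x) = sqrt(x)."""
    return (2.0 / 3.0) * x ** 1.5


def haar_coefficient(j, k):
    """Exact inner product <f, psi_jk> for the Haar wavelet and f(x) = sqrt(x)."""
    left = k * 2.0 ** (-j)
    mid = left + 2.0 ** (-(j + 1))
    right = left + 2.0 ** (-j)
    plus = antiderivative(mid) - antiderivative(left)    # integral over the +1 half
    minus = antiderivative(right) - antiderivative(mid)  # integral over the -1 half
    return 2.0 ** (j / 2) * (plus - minus)


def scaled_max(j):
    """max_k |<f, psi_jk>| rescaled by 2^{j(alpha + 1/2)}."""
    biggest = max(abs(haar_coefficient(j, k)) for k in range(2 ** j))
    return biggest * 2.0 ** (j * (ALPHA + 0.5))


# If the lemma's rate is sharp for this f, the rescaled maxima stay bounded.
ratios = [scaled_max(j) for j in range(1, 11)]
for j, r in enumerate(ratios, start=1):
    print(j, r)
```

The rescaled maxima stay below $2/3$ at every level, in line with the bound $c_\psi C\,2^{-j(\alpha+1/2)}$.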
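The scaling function $\varphi$ is said to have $r$ vanishing moments if

$$\int x^k\varphi(x)\,dx = 0,\qquad k = 1,2,\ldots,r-1.$$

For $k = 0$ recall that $\int\varphi(x)\,dx = 1$.

Lemma: If $f$ is $C^\alpha$ on $\mathbb{R}$ and $\varphi$ has at least $r = \lfloor\alpha\rfloor+1$ vanishing moments, then

$$\big|\langle f,\varphi_{jk}\rangle - 2^{-j/2}f(2^{-j}k)\big|\le c_\varphi\,C\,2^{-j(\alpha+1/2)}.$$

Proof: Using a change of variable and the vanishing moments property,

$$\big|\langle f,\varphi_{jk}\rangle - 2^{-j/2}f(2^{-j}k)\big| = \left|\int 2^{-j/2}\Big[f\big(2^{-j}k + 2^{-j}v\big) - p_{2^{-j}k}\big(2^{-j}v\big)\Big]\varphi(v)\,dv\right| \le C\,2^{-j(\alpha+1/2)}\int|v|^\alpha|\varphi(v)|\,dv.$$

5. Discrete wavelet transformation

Multiresolution analysis (MRA): An MRA is a sequence of closed subspaces $\{V_j;\ j\in\mathbb{Z}\}$ in $L^2(\mathbb{R})$ such that
i) $\cdots\subset V_{-2}\subset V_{-1}\subset V_0\subset V_1\subset V_2\subset\cdots$;
ii) $f(x)\in V_j \iff f(2x)\in V_{j+1}$;
iii) $\overline{\bigcup_{j\in\mathbb{Z}} V_j} = L^2(\mathbb{R})$;
iv) there is a function $\varphi$ such that $\{\varphi(\cdot-k);\ k\in\mathbb{Z}\}$ is an orthonormal basis for $V_0$.

There is a sequence $\{h_k\}$ such that

$$\frac{1}{\sqrt2}\,\varphi\!\left(\frac{x}{2}\right) = \sum_k h_k\,\varphi(x-k),$$

and for the sequence $\{g_k\} = \{(-1)^k h_{1-k}\}$ we have

$$\frac{1}{\sqrt2}\,\psi\!\left(\frac{x}{2}\right) = \sum_k g_k\,\varphi(x-k)$$

and $V_j\oplus W_j = V_{j+1}$. More generally

$$\bigoplus_{j=-\infty}^{J} W_j = V_{J+1}\to L^2(\mathbb{R}),\qquad J\to\infty,$$

where $W_j = \mathrm{span}\{\psi_{jk} = 2^{j/2}\psi(2^j\cdot - k);\ k\in\mathbb{Z}\}$.

Consider $f$ in $V_J$. Then, writing $\tilde\theta_{Jk} = \langle f,\varphi_{Jk}\rangle$ for the scaling coefficients,

$$f = \sum_k\tilde\theta_{Jk}\,\varphi_{Jk} = \sum_k\langle f,\varphi_{Jk}\rangle\,\varphi_{Jk}\approx\sum_k 2^{-J/2}f(2^{-J}k)\,\varphi_{Jk}.$$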
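We see

$$\frac{1}{\sqrt2}\,2^{j/2}\varphi\!\left(\frac{2^j x}{2}\right) = \sum_k h_k\,2^{j/2}\varphi(2^j x - k),$$

i.e.,

$$2^{(j-1)/2}\varphi(2^{j-1}x) = \sum_k h_k\,2^{j/2}\varphi(2^j x - k).$$

Writing $\tilde\theta_{jk} = \langle f,\varphi_{jk}\rangle$ for the scaling coefficients and $\theta_{jk} = \langle f,\psi_{jk}\rangle$ for the wavelet coefficients, this gives

$$\tilde\theta_{j-1,0} = \sum_k h_k\,\tilde\theta_{jk},$$

or more generally,

$$\tilde\theta_{j-1,m} = \sum_k h_k\,\tilde\theta_{j,k+2m}.$$

Similarly,

$$2^{(j-1)/2}\psi(2^{j-1}x) = \sum_k g_k\,2^{j/2}\varphi(2^j x - k),$$

then

$$\theta_{j-1,0} = \sum_k g_k\,\tilde\theta_{jk},$$

or more generally,

$$\theta_{j-1,m} = \sum_k g_k\,\tilde\theta_{j,k+2m}.$$

6. Besov Balls

The Besov sequence norms are defined as follows. Let

$$f = \sum_k\tilde\theta_{j_0,k}\,\varphi_{j_0,k} + \sum_{j\ge j_0}\sum_k\theta_{jk}\,\psi_{jk}.$$

Suppose $\alpha\in\mathbb{R}$ and $0 < p = q\le\infty$, and write $s = \alpha + 1/2 - 1/p$. Then

$$\|\theta\|_{b^\alpha_{p,q}} = \Big(\|\tilde\theta_{j_0,\cdot}\|_{l_p}^p + \sum_{j\ge j_0}2^{jsp}\sum_k|\theta_{jk}|^p\Big)^{1/p},$$

and the general definition for $0 < p, q\le\infty$ is

$$\|\theta\|_{b^\alpha_{p,q}} = \Big(\|\tilde\theta_{j_0,\cdot}\|_{l_p}^q + \sum_{j\ge j_0}2^{jsq}\Big(\sum_k|\theta_{jk}|^p\Big)^{q/p}\Big)^{1/q}.$$

A control of the Besov norm $\|\theta\|_{b^\alpha_{p,q}}\le M$ is equivalent to

$$2^{js}\Big(\sum_k|\theta_{jk}|^p\Big)^{1/p}\le M,$$

or

$$\Big\|\sum_k\theta_{jk}\,\psi_{jk}\Big\|_p\le M'\,2^{-j\alpha}$$

due to the Bernstein-type inequality below, or

$$\big\|(\,\cdots)\big\|_p\le M'\,2^{-j\alpha}.$$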
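The filter recursions at the start of this section and the Besov sequence norm can be sketched together in a few lines. The code below is a minimal illustration, assuming the Haar filters and an input vector of length $2^J$ that plays the role of the finest-level scaling coefficients; the function names `pyramid_step`, `haar_dwt` and `besov_seq_norm` are hypothetical and not taken from any library.

```python
import math

H = (1 / math.sqrt(2), 1 / math.sqrt(2))   # Haar h_0, h_1
G = (1 / math.sqrt(2), -1 / math.sqrt(2))  # g_k = (-1)^k h_{1-k}


def pyramid_step(fine):
    """Map level-j scaling coefficients to (scaling, wavelet) coefficients at level j-1."""
    half = len(fine) // 2
    coarse = [sum(H[k] * fine[2 * m + k] for k in (0, 1)) for m in range(half)]
    detail = [sum(G[k] * fine[2 * m + k] for k in (0, 1)) for m in range(half)]
    return coarse, detail


def haar_dwt(fine, j0=0):
    """Return (scaling coefficients at level j0, {j: wavelet coefficients at level j})."""
    top = len(fine).bit_length() - 1  # len(fine) is assumed to be 2**top
    details = {}
    current = list(fine)
    for j in range(top - 1, j0 - 1, -1):
        current, details[j] = pyramid_step(current)
    return current, details


def besov_seq_norm(scaling, details, alpha, p, q):
    """Besov sequence norm with s = alpha + 1/2 - 1/p (finite p and q only)."""
    s = alpha + 0.5 - 1.0 / p
    total = sum(abs(c) ** p for c in scaling) ** (q / p)
    for j, theta in details.items():
        total += 2.0 ** (j * s * q) * sum(abs(t) ** p for t in theta) ** (q / p)
    return total ** (1.0 / q)


fine = [4.0, 2.0, 5.0, 5.0, 1.0, 1.0, 0.0, 2.0]  # made-up finest-level coefficients
scaling, details = haar_dwt(fine)
print(scaling, details)
print(besov_seq_norm(scaling, details, alpha=1.0, p=1, q=1))
```

Each pyramid step halves the number of scaling coefficients, so the full transform costs a number of operations proportional to the input length. With $\alpha = 0$ and $p = q = 2$ the norm reduces to the $l_2$ norm of all coefficients, which the orthonormal transform preserves.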
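Bernstein-type Inequality. Let $\psi_{jk}$, $k\in K$, be an orthonormal sequence of functions satisfying
(i) $\sum_k|\psi_{jk}(x)|\le a_\infty 2^{j/2}$,
(ii) $\max_k\int|\psi_{jk}|\le a_1 2^{-j/2}$.
Then for all $1\le p\le\infty$, there exist constants $C_1 = a_1^{-1}(a_1/a_\infty)^{1/p}$ and $C_2 = a_\infty(a_1/a_\infty)^{1/p}$ such that for any sequence $\theta = (\theta_k;\ k\in K)$,

$$C_1\,2^{j(1/2-1/p)}\|\theta\|_p\le\Big\|\sum_k\theta_k\,\psi_{jk}\Big\|_p\le C_2\,2^{j(1/2-1/p)}\|\theta\|_p.$$

7. Nearly adaptive rate minimaxity

We observe

$$y_i = \theta_i + z_i,\qquad i = 1,2,\ldots,d,$$

where the $z_i$ are i.i.d. $N(0,1)$. Define the soft thresholding estimator $\hat\theta_\lambda$ as follows:

$$\hat\theta_i = \mathrm{sign}(y_i)\,(|y_i| - \lambda)_+,\qquad i = 1,2,\ldots,d,$$

where $\lambda > 0$. We have proved the following result.

Theorem:

$$E\big\|\hat\theta_{\sqrt{2\log d}} - \theta\big\|^2\le(1 + 2\log d)\Big(1 + \sum_{i=1}^d\theta_i^2\wedge 1\Big).$$

Question: If $\Theta$ is an ellipsoid,

$$\Theta(m, M) = \Big\{\theta:\ \sum_{i=1}^{\infty}a_i^2\theta_i^2\le M^2\Big\},\qquad a_{2k} = a_{2k+1} = (2k)^m,$$

can you prove that

$$\sup_{\theta\in\Theta(m,M)}E\big\|\hat\theta_{\sqrt{2\log n}} - \theta\big\|^2\le C\Big(\frac{\log n}{n}\Big)^{2\alpha/(2\alpha+1)},$$

where $\alpha = m$ is the smoothness of the ellipsoid?

In this lecture, we consider Besov balls

$$b^\alpha_{p,q}(M) = \Big\{\theta:\ \Big(\|\tilde\theta_{j_0,\cdot}\|_{l_p}^q + \sum_{j\ge j_0}2^{j(\alpha+1/2-1/p)q}\Big(\sum_k|\theta_{jk}|^p\Big)^{q/p}\Big)^{1/q}\le M\Big\}.$$

If we apply $\hat\theta_{\sqrt{2\log 2^j}}$ to estimate $(\theta_{jk})_{k=1,2,\ldots,2^j}$ in each resolution, we show

$$\sup_{\theta\in b^\alpha_{p,q}(M)}E\|\hat\theta - \theta\|^2\le C\Big(\frac{\log n}{n}\Big)^{2\alpha/(2\alpha+1)}$$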
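or equivalently

$$\sup_{f\in B^\alpha_{p,q}(M)}E\|\hat f - f\|^2\le C\Big(\frac{\log n}{n}\Big)^{2\alpha/(2\alpha+1)}.$$

However, the linear minimax rate is $n^{-2\alpha'/(2\alpha'+1)}$ with $\alpha' = \alpha + 1/2 - 1/p$ when $p < 2$ and $q\ge 1$. We will assume a condition relating $p$ and $\alpha$ for a technical reason; the argument below uses in particular $\alpha + 1/2 - 1/p > 0$.

Proof of nearly adaptive rate minimaxity

Observe that

$$\tilde y_{j_0,k} = \tilde\theta_{j_0,k} + n^{-1/2}z_{j_0,k},\qquad j = j_0;\ k = 1,2,\ldots,2^{j_0},$$
$$y_{jk} = \theta_{jk} + n^{-1/2}z_{jk},\qquad j = j_0, j_0+1,\ldots;\ k = 1,2,\ldots,2^j,$$

where the $z_{jk}$ are i.i.d. $N(0,1)$. In practice, $j_0 = 4$ or $5$, and we use $\tilde y_{j_0,k}$ to estimate $\tilde\theta_{j_0,k}$; $J$ satisfies $2^J = n$, and we use $0$ to estimate $\theta_{jk}$ for $j\ge J$. Write

$$E\|\hat\theta - \theta\|^2 = \frac{2^{j_0}}{n} + \sum_{j_0\le j<J}E\|\hat\theta_j - \theta_j\|^2 + \sum_{j\ge J}\sum_k\theta_{jk}^2 \le \frac{2^{j_0}}{n} + (1 + 2\log n)\sum_{j_0\le j<J}\Big(\frac1n + \sum_k\theta_{jk}^2\wedge\frac1n\Big) + \sum_{j\ge J}\sum_k\theta_{jk}^2.$$

Under weak assumptions, it is true that

$$\sum_{j\ge J}\sum_k\theta_{jk}^2 = o\big(n^{-2\alpha/(2\alpha+1)}\big),$$

which means that neglecting $y_{jk}$ for $j\ge J$ in practice is fine. Then it is enough to show

$$\sum_{j_0\le j<J}\sum_k\theta_{jk}^2\wedge\frac1n\le C\,n^{-2\alpha/(2\alpha+1)}.$$

Let's start with $p < 2$, which is more exciting. Define $j_*$ such that $2^{j_*} = n^{1/(2\alpha+1)}$; then we have

$$\sum_{j_0\le j<J}\sum_k\theta_{jk}^2\wedge\frac1n = \sum_{j_0\le j<j_*}\sum_k\theta_{jk}^2\wedge\frac1n + \sum_{J>j\ge j_*}\sum_k\theta_{jk}^2\wedge\frac1n \le n^{-2\alpha/(2\alpha+1)} + \sum_{J>j\ge j_*}\sum_k\theta_{jk}^2\wedge\frac1n,$$

since the first sum is at most $\sum_{j<j_*}2^j/n\le 2^{j_*}/n$. From the definition of Besov balls,

$$\|\theta_{j\cdot}\|_p\le M\,2^{-j(\alpha+1/2-1/p)},$$

which implies, using $\theta^2\wedge n^{-1}\le|\theta|^p\,n^{-(1-p/2)}$,

$$\sum_{J>j\ge j_*}\sum_k\theta_{jk}^2\wedge\frac1n\le n^{-1+p/2}\sum_{j\ge j_*}\sum_k|\theta_{jk}|^p = O\Big(n^{-1+p/2}\,2^{-j_*p(\alpha+1/2-1/p)}\Big) = O\big(n^{-2\alpha/(2\alpha+1)}\big).$$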
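The oracle inequality for soft thresholding at $\sqrt{2\log d}$, which drives the proof above, can be checked by simulation in the unit-noise model $y_i = \theta_i + z_i$. The following is a rough Monte Carlo sketch; the sparse mean vector, the number of replications and the seed are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math
import random


def soft_threshold(y, lam):
    """sign(y) * (|y| - lam)_+ ."""
    return math.copysign(max(abs(y) - lam, 0.0), y)


def monte_carlo_risk(theta, reps=2000, seed=0):
    """Monte Carlo estimate of E||theta_hat - theta||^2 at lam = sqrt(2 log d)."""
    rng = random.Random(seed)
    lam = math.sqrt(2 * math.log(len(theta)))
    total = 0.0
    for _ in range(reps):
        for t in theta:
            y = t + rng.gauss(0.0, 1.0)  # unit-variance Gaussian noise
            total += (soft_threshold(y, lam) - t) ** 2
    return total / reps


def oracle_bound(theta):
    """(1 + 2 log d) * (1 + sum_i min(theta_i^2, 1))."""
    d = len(theta)
    return (1 + 2 * math.log(d)) * (1 + sum(min(t * t, 1.0) for t in theta))


theta = [5.0] * 5 + [0.0] * 95  # a sparse mean vector (illustrative choice)
risk = monte_carlo_risk(theta)
bound = oracle_bound(theta)
print("estimated risk:", risk, "oracle bound:", bound)
```

For the noise level $n^{-1/2}$ used in the proof, both the threshold and the term $\theta_i^2\wedge 1$ rescale accordingly, which gives the bound with $\theta_{jk}^2\wedge n^{-1}$.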
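For $p\ge 2$, Jensen's inequality gives

$$\sum_{j\ge j_*}\sum_k\theta_{jk}^2\le\sum_{j\ge j_*}2^{j(1-2/p)}\Big(\sum_k|\theta_{jk}|^p\Big)^{2/p} = O\Big(\sum_{j\ge j_*}2^{j(1-2/p)}\,2^{-2j(\alpha+1/2-1/p)}\Big) = O\big(2^{-2j_*\alpha}\big) = O\big(n^{-2\alpha/(2\alpha+1)}\big).$$

Remark. Linear minimaxity for $p < 2$? Let's consider the case $p = q < 2$; then

$$R_L\big(B^\alpha_{p,q}(M)\big) = R_L\big(B^{\alpha'}_{2,2}(M)\big),$$

where $\alpha' = \alpha + 1/2 - 1/p$, and

$$R_L\big(B^{\alpha'}_{2,2}(M)\big)\asymp n^{-2\alpha'/(2\alpha'+1)}.$$

For example, when $\alpha = p = q = 1$, we have $R_L\big(B^1_{1,1}(M)\big)\asymp n^{-1/2}$, but the optimal minimax rate is $n^{-2/3}$.

Why? Recall that for $p < 2$, in each resolution $j$,

$$R_{L,j} = \inf_c\sum_k E(c\,y_{jk} - \theta_{jk})^2 = \frac{\frac{2^j}{n}\sum_k\theta_{jk}^2}{\frac{2^j}{n} + \sum_k\theta_{jk}^2},$$

$$R_L\big(B^\alpha_{p,q}(M)\big)\approx\frac{2^{j_0}}{n} + \sup_{\theta}\sum_{j\ge j_0}\frac{\frac{2^j}{n}\sum_k\theta_{jk}^2}{\frac{2^j}{n} + \sum_k\theta_{jk}^2}.$$

And the max of $\sum_k\theta_{jk}^2$ under the $l_p$ constraint $\|\theta_{j\cdot}\|_p\le M\,2^{-j(\alpha+1/2-1/p)}$ is $M^2 2^{-2j\alpha'}$ with $\alpha' = \alpha + 1/2 - 1/p$.

8. SURE Estimation

Donoho and Johnstone (1995, SURE). Sketch of the proof. Consider the sequence model

$$y_i = \theta_i + z_i,\qquad i = 1,\ldots,d,$$

where the $z_i$ are independent normal $N(0,1)$ variables. Set $r(\lambda; L) = d^{-1}E\|\hat\theta_\lambda - \theta\|^2$, the normalized risk of soft thresholding at $\lambda$. Then Stein's unbiased estimator of risk gives $r(\lambda; L) = E\,U(\lambda)$ with

$$U(\lambda) = d^{-1}\sum_{i=1}^d\Big[1 - 2I(|y_i|\le\lambda) + y_i^2\,I(|y_i|\le\lambda) + \lambda^2\,I(|y_i| > \lambda)\Big] = 1 - \frac{2}{d}\,\#\{i:\ |y_i|\le\lambda\} + \frac1d\sum_{i=1}^d(|y_i|\wedge\lambda)^2.$$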
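The criterion $U(\lambda)$ above is simple to evaluate and minimize. The sketch below assumes the normalized form of $U$ given in the last display and searches only over $\lambda\in\{0, |y_1|,\ldots,|y_d|\}$; the function names `sure` and `sure_threshold` and the data vector are hypothetical and serve only as an illustration.

```python
def sure(y, lam):
    """Normalized SURE criterion U(lam) for soft thresholding at lam."""
    d = len(y)
    small = sum(1 for v in y if abs(v) <= lam)           # #{i : |y_i| <= lam}
    clipped = sum(min(abs(v), lam) ** 2 for v in y)      # sum_i (|y_i| ^ lam)^2
    return 1.0 - 2.0 * small / d + clipped / d


def sure_threshold(y):
    """Minimize U over the finite candidate set {0, |y_1|, ..., |y_d|}."""
    candidates = [0.0] + [abs(v) for v in y]
    return min(candidates, key=lambda lam: sure(y, lam))


y = [0.1, -0.2, 3.0]  # made-up observations
for lam in [0.0] + sorted(abs(v) for v in y):
    print(lam, sure(y, lam))
print("selected threshold:", sure_threshold(y))
```

Between two consecutive order statistics of $|y_i|$ the count term is constant and the clipped sum is nondecreasing in $\lambda$, which is why a search over the observed absolute values suffices in this sketch.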
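When $|y|_{(i)} < \lambda < |y|_{(i+1)}$, $U(\lambda)$ is an increasing function of $\lambda$. This implies

Proposition:

$$\lambda_s = \arg\min_\lambda U(\lambda)\in\{0, |y_1|,\ldots,|y_d|\},$$

$$E\sup_{0\le\lambda\le\sqrt{2\log d}}|U(\lambda) - r(\lambda; L)|\le C\,\frac{\log^{3/2}d}{d^{1/2}}.$$

9. Functional Estimation

Quadratic functional estimation

Model: Observe the sequence model

$$y_i = \theta_i + n^{-1/2}z_i,$$

where the $z_i$ are i.i.d. $N(0,1)$. The model comes from the white noise model (or many other models):

$$dY(t) = f(t)\,dt + n^{-1/2}\,dB(t),\qquad t\in[0,1].$$

Let $\{\varphi_i(t);\ i = 1,2,\ldots\}$ be an orthonormal basis of $L^2([0,1])$. The white noise problem is then equivalent to the sequence model with $y_i = \langle\varphi_i, Y\rangle$, $\theta_i = \langle\varphi_i, f\rangle$, and $z_i = \langle\varphi_i, B\rangle$ i.i.d. $N(0,1)$.

Assumption: Let $\theta = (\theta_1,\theta_2,\ldots)$. Assume

$$\theta\in\Theta_{\alpha,M} = \Big\{\theta:\ \sum_{i=1}^{\infty}i^{2\alpha}\theta_i^2\le M\Big\}.$$

If the orthonormal basis is the Fourier basis, this assumption corresponds to the periodic Sobolev ball with smoothness $\alpha$. Now we may write the sequence model as $\mathcal{P} = \{P_{\theta,n};\ \theta\in\Theta_{\alpha,M}\}$, where $P_{\theta,n}$ is the joint distribution of independent $y_i\sim N(\theta_i, 1/n)$.

Problem: Estimate $Q = \sum_{i=1}^{\infty}\theta_i^2$ (or $\int f^2$ with $f = \sum_{i=1}^{\infty}\theta_i\varphi_i$), and determine the optimal minimax rate, i.e., the order as $n\to\infty$ of

$$\inf_{\hat Q}\sup_{\theta\in\Theta_{\alpha,M}}E\big(\hat Q - Q\big)^2.$$

Solution:

Step 1: Achieving the optimal rate. Define

$$\hat Q_m = \sum_{i=1}^m\Big(y_i^2 - \frac1n\Big),$$

where $m$ will be specified later. We know

$$E\,\hat Q_m = \sum_{i=1}^m\theta_i^2$$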
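and

$$\mathrm{var}\,\hat Q_m = \sum_{i=1}^m\mathrm{var}\big(y_i^2\big) = \sum_{i=1}^m\mathrm{var}\Big(\frac{z_i^2}{n} + \frac{2}{\sqrt n}\,z_i\theta_i\Big) = \sum_{i=1}^m\mathrm{var}\Big(\frac{z_i^2}{n}\Big) + \sum_{i=1}^m\mathrm{var}\Big(\frac{2}{\sqrt n}\,z_i\theta_i\Big) = \frac{2m}{n^2} + \frac4n\sum_{i=1}^m\theta_i^2.$$

Then

$$\mathrm{bias}^2 = \Big(\sum_{i\ge m+1}\theta_i^2\Big)^2\le M^2m^{-4\alpha}$$

and

$$\mathrm{var}\,\hat Q_m\le\frac{2m}{n^2} + \frac{4M}{n}.$$

Choose $m$ such that

$$\frac{m}{n^2} = M^2m^{-4\alpha},$$

i.e.,

$$m = M^{2/(1+4\alpha)}\,n^{2/(1+4\alpha)}.$$

Thus we have an upper bound on the mean squared error:

$$E\big(\hat Q_m - Q\big)^2\le M^2m^{-4\alpha} + \frac{2m}{n^2} + \frac{4M}{n} = 3M^{2/(1+4\alpha)}\,n^{-8\alpha/(1+4\alpha)} + \frac{4M}{n}.$$

So the optimal minimax convergence rate is $1/n$ when $\alpha\ge 1/4$, or $n^{-8\alpha/(1+4\alpha)}$ when $\alpha < 1/4$.

Step 2: Cannot do better. We take $\mathcal{P}_0 = \{P_{\theta,n},\ \theta = 0\}$ and $\mathcal{P}_1 = \{P_{\theta,n};\ \theta\in\Theta_{sub}\}$ with

$$\Theta_{sub} = \{\theta:\ |\theta_i| = a,\ i\le m;\ \theta_i = 0,\ i\ge m+1\},$$

where $a$ and $m$ will be specified later. Simple calculations give

$$\sum_{i=1}^m i^{2\alpha}\theta_i^2\le m^{2\alpha+1}a^2,\qquad\Big(\sum_{i=1}^m\theta_i^2\Big)^2 = m^2a^4,\qquad\sum_{i=1}^m\theta_i^4 = ma^4.$$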
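Let $m = n^{2/(4\alpha+1)}$ and $a = c\,(1/n)^{(2\alpha+1)/(4\alpha+1)}$. Then

$$\sum_{i=1}^m i^{2\alpha}\theta_i^2\le c^2;\qquad\Big(\sum_{i=1}^m\theta_i^2\Big)^2 = c^4\,(1/n)^{8\alpha/(4\alpha+1)};\qquad\sum_{i=1}^m\theta_i^4 = c^4n^{-2}.$$

Let $c$ be a small constant; then $\Theta_{sub}\subset\Theta_{\alpha,M}$, and the affinity satisfies $\|\mathcal{P}_0\wedge\mathcal{P}_1\|\ge c_1$ for some positive constant $c_1$ (why?). So

$$\sup_{\theta}E\big(\hat Q - Q\big)^2\ge c_2\Big(\sum_{i=1}^m\theta_i^2\Big)^2\|\mathcal{P}_0\wedge\mathcal{P}_1\|\ge c_3\,(1/n)^{8\alpha/(4\alpha+1)}$$

for positive constants $c_2, c_3$. (More details in class.)

10. High dimensional estimation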