Robust Average Derivative Estimation. February 2007 (Preliminary and Incomplete. Do not quote without permission.)


Robust Average Derivative Estimation

Marcia M.A. Schafgans* and Victoria Zinde-Walsh†

February 2007 (Preliminary and Incomplete. Do not quote without permission.)

Abstract. Many important models, such as index models widely used in limited dependent variables, partial linear models and nonparametric demand studies, utilize estimation of average derivatives (sometimes weighted) of the conditional mean function. Asymptotic results in the literature focus on situations where the ADE converges at parametric rates (as a result of averaging); this requires making stringent assumptions on the smoothness of the underlying density, and in practice such assumptions may be violated. We extend the existing theory by relaxing smoothness assumptions and obtain a full range of asymptotic results with both parametric and non-parametric rates. We consider both the possibility of lack of smoothness and lack of precise knowledge of the degree of smoothness, and propose an estimation strategy that produces the best possible rate without a priori knowledge of the degree of density smoothness. The new combined estimator is a linear combination of estimators corresponding to different bandwidth/kernel choices that minimizes the estimated asymptotic mean squared error (AMSE). Estimation of the AMSE and selection of the set of bandwidths and kernels are discussed. Monte Carlo results for the density weighted ADE confirm the good performance of the combined estimator.

*Department of Economics, London School of Economics. Mailing address: Houghton Street, London WC2A 2AE, United Kingdom.
†Department of Economics, McGill University and CIREQ. This work was supported by the Social Sciences and Humanities Research Council of Canada (SSHRC) and by the Fonds québécois de la recherche sur la société et la culture (FQRSC).

1. Introduction

Many important models rely on estimation of average derivatives (ADE) of the conditional mean function (averaged response coefficients); the most widely used such model is the single index model, where the conditional mean function can be represented as a univariate function of a linear combination of conditioning variables. Index representations are ubiquitous in econometric studies of limited dependent variable models, partial linear models and in nonparametric demand analysis. Estimation of coefficients in single index models relies on the fact that averaged derivatives of the conditional mean (or of the conditional mean weighted by some function) are proportional to the coefficients; thus a non-parametric estimator of the derivative of the conditional mean function provides estimates of the coefficients (up to a multiplicative factor). This method does not require assumptions about the functional form of either the density of the data or of the true regression function. Powell, Stock and Stoker (1989) and Robinson (1989) examined density weighted average derivatives, while Härdle and Stoker (1989) investigated the properties of the average derivatives themselves; one important difference is the need to introduce some form of trimming when there is no weighting by the density, since the estimator of the density appears in the denominator of the ADE and may be close to zero. Newey and Stoker (1993) addressed the issue of efficiency related to the choice of weighting function. Horowitz and Härdle (1996) extended the ADE approach to the estimation of coefficients in the single index model in the presence of discrete covariates. Donkers and Schafgans (2005) addressed the lack of identification associated with the estimation of coefficients in single index models for cases where the derivative of the unknown function on average equals zero; they propose an estimator based on the average outer product of derivatives, which resolves this lack of identification while at the same time enabling the estimation of parameters in multiple index models.

In all of the literature on ADE estimation, asymptotic theory was provided for parametric rates of convergence. Even though the estimators are based on a nonparametric kernel estimator of the conditional mean, which depends on the kernel and bandwidth and converges at a nonparametric rate, averaging can produce a parametric convergence rate, thus reducing dependence on the selection of the kernel and bandwidth, which do not appear in the leading term of the AMSE expansion.

However, other terms are sensitive to the bandwidth/kernel choice. Powell and Stoker (1996) address the optimal bandwidth choice for (weighted) average derivative estimation. Further results, including finite sample performance of average derivatives and corrections to improve finite-sample properties, are discussed in Robinson (1995) and Nishiyama and Robinson (2000, 2005). Parametric rates of convergence, and thus all the results in this literature, rely on the assumption of a sufficiently high degree of smoothness of the underlying density.

In this paper we are motivated by a concern about the assumed high degree of density smoothness. There is some empirical evidence that for many variables the density may not be sufficiently smooth and may have shapes that are not consistent with a high level of smoothness: peaks and cusps and even discontinuity of density functions are not uncommon (references). We extend the existing asymptotic results by relaxing assumptions on the density. We show that insufficient smoothness will result in possible asymptotic bias and may easily lead to non-parametric rates. Moreover, the selection of the optimal kernel order and optimal bandwidth in the absence of sufficient smoothness presumes knowledge of the degree of density smoothness. Thus an additional concern for us is the possible uncertainty about the degree of density smoothness. Incorrect assumptions about smoothness may lead to using an estimator that suffers from problems associated with under- or oversmoothing.

To address the problems associated with an incorrect choice of a bandwidth/kernel pair, we construct an estimator that optimally combines estimators for different bandwidths and kernels, to protect against the negative consequences of errors in assumptions about the order of density smoothness. We examine a variety of estimators corresponding to different kernels and bandwidth rates and derive the joint limit process for those estimators. When each estimator is normalized appropriately (with different rates) we obtain a joint Gaussian limit process which possibly exhibits an asymptotic bias and possibly some degeneracy. Any linear combination of such estimators is asymptotically Gaussian, and we are able to select a combination that minimizes the estimated asymptotic MSE; the resulting estimator is what we call the combined estimator.

Kotlyarova and Zinde-Walsh (2006) have shown that the weights in this combination are such that they provide the best rate available among all the rates without a priori knowledge of the degree of smoothness, thus protecting against making a bandwidth/kernel choice that relies on incorrect smoothness assumptions and would yield high asymptotic bias. Performance of the combined estimator relies on good estimators for the asymptotic variances and biases that enter into the combination; a method of estimation that does not depend on our knowledge about the degree of density smoothness is required. Variances can be estimated without much difficulty, e.g. by bootstrap. In Kotlyarova and Zinde-Walsh (2006) a method of estimation of the asymptotic bias of a (possibly) oversmoothed estimator that utilizes asymptotically unbiased undersmoothed estimators is proposed; here we add bootstrapping to improve the properties of this estimator of the asymptotic bias.

Without prior knowledge of smoothness, the bandwidth choices must be such that bandwidths optimal for smooth densities (obtained by rule-of-thumb or by cross-validation) are included to cover the possibility of high smoothness; such choices will correspond to oversmoothing if the density is not sufficiently smooth. Some lower bandwidths, determined e.g. as percentiles of the optimal bandwidth, should also be considered. Our method requires utilization of undersmoothed estimators to determine the asymptotic bias, thus it is important to consider fairly small bandwidths. We select kernels of different orders for the combination; most of the Monte Carlo results, both here and for other combined estimators (for SMS in the binary choice model and for density estimation, Kotlyarova, 2005), are not very sensitive to kernel choices.

The Monte Carlo results here are for the density weighted ADE in a single index model. We demonstrate that even in the case where the smoothness assumptions hold, the combined estimator performs similarly to the optimal ADE estimator and does not exhibit much of an efficiency loss, confirming the results about its being equivalent to the optimal rate estimator. The results in cases where the density is not sufficiently smooth, or while smooth has a shape that gives high values for low-order derivatives (e.g. a trimodal mixture of normals), indicate gains from the combined estimator relative to the optimal ADE estimator.

The paper is organized as follows. In Section 2 we discuss the general set-up and assumptions. In Section 3 we derive the asymptotic properties of the density-weighted ADE under various assumptions about density smoothness, derive the joint asymptotic distribution for several estimators, and define the combined estimator. Section 4 provides the results of a Monte Carlo study analysing, for the Tobit model, the performance of the combined estimator vis-a-vis single bandwidth/kernel based estimators for the density-weighted ADE in cases with different smoothness conditions.

2. General set-up and assumptions

We should have a brief intro to this section maybe? The unknown conditional mean function can be represented as

$$g(x)=E(y|x)=\frac{\int y\,f(y,x)\,dy}{f(x)}=\frac{G(x)}{f(x)},$$

with dependent variable $y\in R$ and explanatory variables $x\in R^k$. The joint density of $(y,x)$ is denoted by $f(y,x)$, the marginal density of $x$ is denoted by $f(x)$, and $G(x)$ denotes the function $\int y\,f(y,x)\,dy$. Since the regression derivative $g'(x)$ can be expressed as

$$g'(x)=\frac{G'(x)}{f(x)}-g(x)\frac{f'(x)}{f(x)},$$

the need to avoid imprecise contributions to the average derivative for observations with low densities emanates from the presence of the density in the denominator. One way of doing this is to employ some weighting function $w(x)$; on the other hand, Fan (1992, 1993) and Fan and Gijbels (1992) avoid weighting by use of regularization, whereby a positive term that shrinks with $n$ is added to the denominator of the estimator. In Härdle and Stoker (1989), trimming on the basis of the density takes the place of the weighting function, that is, they consider $w_N(x)=1(f(x)>b_N)$ where $b_N\to 0$. An alternative is the density weighted average derivative estimator of Powell, Stock and Stoker (1989), PSS, with $w(x)=f(x)$. Here we focus on the PSS estimator.

The nonparametric estimates for the various unknown derivative based functionals make use of kernel smoothing functions. E.g., the nonparametric estimate for the derivative of the density is given by

$$\hat f'_{(K,h)}(x_i)=\frac{1}{N-1}\sum_{j\neq i}\frac{1}{h^{k+1}}K'\!\left(\frac{x_i-x_j}{h}\right),$$

where $K$ is the kernel smoothing function and $h$ is a smoothing parameter that depends on the sample size $N$, with $h\to 0$ as $N\to\infty$.

We now turn to the fundamental assumptions. The first two assumptions are common in this literature, restricting $x$ to be a continuously distributed random variable where no component of $x$ is functionally determined by other components of $x$, imposing a boundary condition allowing for unbounded $x$'s, and requiring differentiability of $f$ and $g$.

Assumption 1. Let $z_i=(y_i,x_i^T)^T$, $i=1,\dots,N$, be a random sample drawn from $f(y,x)$, with $f(y,x)$ the density of $(y,x)$. The underlying measure of $(y,x)$ can be written as $\nu_y\times\nu_x$, where $\nu_x$ is Lebesgue measure. The support $\Omega$ of $f$ is a convex (possibly unbounded) subset of $R^k$ with nonempty interior $\Omega^0$.

Assumption 2. The density function $f(x)$ is continuous in the components of $x$ for all $x\in R^k$, so that $f(x)=0$ for all $x\in\partial\Omega$, where $\partial\Omega$ denotes the boundary of $\Omega$; $f$ is continuously differentiable in the components of $x$ for all $x\in\Omega^0$, and $g$ is continuously differentiable in the components of $x$ for all $x\in\Omega_1$, where $\Omega_1$ differs from $\Omega^0$ by a set of measure 0.

Additional requirements involving the conditional distribution of $y$ given $x$, as well as more smoothness conditions, need to be added. The conditions are slightly amended from how they appear in the literature; in particular, we use the weaker Hölder conditions instead of Lipschitz conditions, in the spirit of weakening smoothness assumptions as much as possible.

Assumption 3. (a) $E(y^2|x)$ is continuous in $x$.
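To make the estimator of the density derivative concrete, here is a minimal numerical sketch of $\hat f'_{(K,h)}$. The product quartic (biweight) kernel and all function names are our own illustrative choices, not part of the paper; any kernel satisfying Assumption 4 below could be substituted.

```python
import numpy as np

def kernel_grad(u):
    # Gradient of a product quartic (biweight) kernel on [-1, 1]^k:
    #   K(u) = prod_d (15/16)(1 - u_d^2)^2 for |u_d| <= 1 (second order, symmetric).
    inside = np.all(np.abs(u) <= 1.0, axis=-1, keepdims=True)
    kd = (15.0 / 16.0) * (1.0 - u ** 2) ** 2             # univariate factors
    kdp = (15.0 / 16.0) * (-4.0) * u * (1.0 - u ** 2)    # their derivatives
    prod_all = np.prod(kd, axis=-1, keepdims=True)
    # d/du_d of the product = K_d'(u_d) * prod_{e != d} K_e(u_e)
    grad = np.where(kd > 0, prod_all / np.where(kd > 0, kd, 1.0) * kdp, 0.0)
    return grad * inside

def density_grad_hat(x, h):
    # Leave-one-out kernel estimate of the density derivative at every x_i:
    #   f'_{(K,h)}(x_i) = (1/((N-1) h^{k+1})) sum_{j != i} K'((x_i - x_j)/h).
    n, k = x.shape
    u = (x[:, None, :] - x[None, :, :]) / h              # (N, N, k) pairwise args
    g = kernel_grad(u)
    g[np.arange(n), np.arange(n), :] = 0.0               # drop the j = i term
    return g.sum(axis=1) / ((n - 1) * h ** (k + 1))
```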

(b) The components of the random vector $g'(x)$ and of the matrix $f'(x)\,[y,\ x^T]$ have finite second moments; $(fg)'$ satisfies a Hölder condition with $0<\alpha\le 1$:

$$\left\|(fg)'(x+\Delta x)-(fg)'(x)\right\|\le\omega_{(fg)'}(x)\,\|\Delta x\|^{\alpha}$$

and $E\!\left(\omega_{(fg)'}(x)\,[1+|y|+\|x\|]\right)<\infty$.

Both the choice of the kernel (its order) and the selection of the bandwidth have played a crucial role in the literature in ensuring that the asymptotic bias for the nonparametric estimates of the derivative based functionals (averages) vanishes sufficiently fast, subject to a high degree of density smoothness. The kernel smoothing function is assumed to satisfy a fairly standard assumption, except for the fact that we allow the kernel to be asymmetric.

Assumption 4. (a) The kernel smoothing function $K(u)$ is a continuously differentiable function with bounded support $[-1,1]^k$.

(b) The kernel function $K(u)$ obeys

$$\int K(u)\,du=1;\qquad \int u_1^{i_1}\cdots u_k^{i_k}\,K(u)\,du=0\ \text{ for } 0<i_1+\dots+i_k<v(K);\qquad \int u_1^{i_1}\cdots u_k^{i_k}\,K(u)\,du\neq 0\ \text{ for } i_1+\dots+i_k=v(K),$$

where $(i_1,\dots,i_k)$ is an index set.

(c) The kernel smoothing function $K(u)$ is differentiable up to the order $v(K)$.

Various further assumptions have been made in the literature concerning the smoothness of the density (higher degrees of differentiability, Lipschitz and boundedness conditions) to ensure parametric rates of convergence. We formalize the degree of density smoothness in terms of the Hölder space of functions. This space, for integer $m\ge 0$ and $0<\alpha\le 1$, is defined as follows. For a set $E\subset R^k$, the space $C^{m+\alpha}(E)$ is a Banach space of bounded and continuous functions which are $m$ times continuously differentiable, with all the $m$th order derivatives satisfying Hölder's condition of order $\alpha$ (see Matematicheskaya Encyclopedia, English ed., M. Hazewinkel):

$$\left\|f^{(m)}(x+\Delta x)-f^{(m)}(x)\right\|\le\omega_{f^{(m)}}(x)\,\|\Delta x\|^{\alpha}\quad\text{for every } x,\ x+\Delta x\in E.$$

Assumption 5. $f\in C^{m+\alpha}(\Omega)$, where $C^{m+\alpha}(\Omega)$ is the Hölder space of functions on $\Omega\subset R^k$ with $m\ge 1$, $0<\alpha\le 1$, and $E\!\left(\omega_{f^{(m)}}(x)\,[1+|y|+\|x\|]\right)<\infty$.

The assumption implies that each component of the derivative of the density satisfies $f'(x)\in C^{m-1+\alpha}(\Omega)$, and thus for every component of the derivative of the density continuous derivatives of order $m-1$ exist (if $m-1=0$ there is just Hölder continuity of the derivative). This permits the following expansion for $c=0,1$, with $c=0$ for the expansion of the density and $c=1$ for the expansion of the derivative of the density function:

$$f^{(c)}(x+\Delta x)=\sum_{p=c}^{m-1}\ \sum_{i_1+\dots+i_k=p-c}\frac{f^{(p)}_{i_1\dots i_k}(x)}{i_1!\cdots i_k!}\,\Delta x^{(i)}+\sum_{i_1+\dots+i_k=m-c}\frac{f^{(m)}_{i_1\dots i_k}(x+\lambda\Delta x)}{i_1!\cdots i_k!}\,\Delta x^{(i)}$$
$$=\sum_{p=c}^{m}\ \sum_{i_1+\dots+i_k=p-c}\frac{f^{(p)}_{i_1\dots i_k}(x)}{i_1!\cdots i_k!}\,\Delta x^{(i)}+\sum_{i_1+\dots+i_k=m-c}\frac{f^{(m)}_{i_1\dots i_k}(x+\lambda\Delta x)-f^{(m)}_{i_1\dots i_k}(x)}{i_1!\cdots i_k!}\,\Delta x^{(i)},\qquad(1)$$

where $\Delta x$ denotes the vector $(\Delta x_1,\dots,\Delta x_k)$, $\Delta x^{(i)}$ the product $\Delta x_1^{i_1}\cdots\Delta x_k^{i_k}$ with the index set $(i_1,\dots,i_k)$, $f^{(p)}_{i_1\dots i_k}(x)$ the corresponding partial derivative of $f$ of order $p$, and $0\le\lambda\le 1$. The first equality is obtained by Taylor expansion (with the remainder term in Lagrange form) and the second equality is obtained by adding and subtracting the terms with $f^{(m)}_{i_1\dots i_k}(x)$. By Assumption 5, $f^{(m)}(x+\lambda\Delta x)-f^{(m)}(x)$ in the last sum satisfies the Hölder inequality, and thus the last sum is $O(\|\Delta x\|^{m-c+\alpha})$.

Lack of smoothness of the density can readily be shown to affect the asymptotic bias of derivative based estimators, since the biases of those estimators can be expressed via the bias of the kernel estimator of the derivative of the density. Let $v$ be the degree of smoothness of the derivative of the density (equal to $m-1+\alpha$ by Assumption 5), and $v(K)$ the order of the kernel.

Define $v^*=\min(v,v(K))$. Provided $v^*=v(K)\le v$, the bias of the estimator of the derivative of the density,

$$E\left(\hat f'_{(K,h)}(x_i)-f'(x_i)\right)=E\left[\int K(u)\left(f'(x_i-hu)-f'(x_i)\right)du\right],$$

is as usual $O(h^{v(K)})$ (by applying the usual $v(K)$th order Taylor expansion of $f'(x_i-hu)$ around $f'(x_i)$). We next show that with $v^*=v<v(K)$ the bias of the derivative vanishes at the lower rate $O(h^{v})$. In the latter case, substituting (1) with $c=1$ and $\Delta x=-hu$ into the bias expression and using the kernel order yields

$$E\int\left[f'(x-hu)-f'(x)\right]K(u)\,du=E\int\sum_{i_1+\dots+i_k=m-1}\frac{(-h)^{m-1}}{i_1!\cdots i_k!}\left[f^{(m)}_{i_1\dots i_k}(x-\lambda hu)-f^{(m)}_{i_1\dots i_k}(x)\right]K(u)\,u^{(i)}\,du=O(h^{m-1+\alpha})=O(h^{v}),\qquad(2)$$

where the latter equality uses the Hölder inequality.¹ If the differentiability conditions typically assumed to ensure that $v>\frac{k+2}{2}$ do not hold, then even for bandwidths such that $Nh^{2v(K)}=o(1)$ the bias does not vanish sufficiently fast. With $v^*=\min(v,v(K))$, all we can state is the rate $O(h^{v^*})$ for the bias:

$$E\int\left[f'(x-hu)-f'(x)\right]K(u)\,du=O(h^{v^*}).$$

3. Average density weighted derivative estimator

The average density weighted derivative, introduced in Powell, Stock and Stoker (1989), is defined as

$$\delta_0=E\left(f(x)g'(x)\right).\qquad(3)$$

Given Assumptions 1–3, (3) can be represented as $\delta_0=-2E\left(f'(x)y\right)$ (see Lemma 2.1 in PSS).

¹ $\left\|\int\left[f'(x-hu)-f'(x)\right]K(u)\,du\right\|\le h^{m-1+\alpha}\,\omega_{f^{(m)}}(x)\int\|K(u)\|\,\|u\|^{m-1+\alpha}\,du\;O(1)$, where Assumption 4(a) implies that $\|K(u)\|$ is bounded (since it is continuous on a closed bounded set), and $\|u\|$ is bounded on the support of $K$; Assumption 5 ensures boundedness of $E\,\omega_{f^{(m)}}(x)$.
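For completeness, the integration-by-parts argument behind the representation $\delta_0=-2E(f'(x)y)$ can be sketched as follows (a one-line sketch under Assumption 2, which gives $f=0$ on $\partial\Omega$; for $k>1$ the same computation applies coordinate-wise):

$$\delta_0=E\left(f(x)g'(x)\right)=\int g'(x)f(x)^2\,dx=\left[g(x)f(x)^2\right]_{\partial\Omega}-2\int g(x)f'(x)f(x)\,dx=-2E\left(g(x)f'(x)\right)=-2E\left(y\,f'(x)\right),$$

where the boundary term vanishes because $f=0$ on $\partial\Omega$, and the last step uses $E(y|x)=g(x)$.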

The estimator of $\delta_0$ proposed by PSS uses the sample analogue, where $f'(x)$ is replaced by a consistent nonparametric estimate, i.e.,

$$\hat\delta_N(K,h)=-\frac{2}{N}\sum_{i=1}^{N}\hat f'_{(K,h)}(x_i)\,y_i,\qquad(4)$$

where

$$\hat f'_{(K,h)}(x_i)=\frac{1}{N-1}\sum_{j\neq i}\frac{1}{h^{k+1}}K'\!\left(\frac{x_i-x_j}{h}\right),$$

$K$ is the kernel smoothing function (which PSS assume to be symmetric) and $h$ is a smoothing parameter that depends on the sample size $N$, with $h\to 0$ as $N\to\infty$. We derive the variance of $\hat\delta_N(K,h)$ without relying on results on $U$-statistics, to accommodate possibly non-symmetric kernels; this is provided in Lemma 1 in the Appendix. We obtain the following expression for this variance:

$$Var(\hat\delta_N(K,h))=\Sigma_1(K)\,N^{-2}h^{-(k+2)}+\Sigma_2\,N^{-1}+O(N^{-2}),\qquad(5)$$

where

$$\Sigma_1(K)=4E\left[y_i^2f(x_i)\,\Sigma(K)+\bar\Sigma(K)\,(gf)(x_i)\,y_i\right],$$
$$\Sigma_2=4\,E\!\left(\left[(g'f)(x_i)-(y_i-g(x_i))f'(x_i)\right]\left[(g'f)(x_i)-(y_i-g(x_i))f'(x_i)\right]^T\right)-4\,\delta_0\delta_0^T$$

(the kernel constants $\Sigma(K)=\int K'(u)K'(u)^T\,du$ and $\bar\Sigma(K)=\int K'(u)K'(-u)^T\,du$ are defined in the Appendix; under symmetry $\bar\Sigma(K)=-\Sigma(K)$). Here $\Sigma_2$, which for sufficiently smooth $f(x)$ coincides with the asymptotic variance of $\sqrt N\,\hat\delta_N(K,h)$ considered in PSS, obtains when $Nh^{k+2}\to\infty$. For a symmetric kernel, $\Sigma_1(K)$ simplifies to $4\,\Sigma(K)\,E\left[\sigma^2(x_i)f(x_i)\right]$, with the conditional variance $\sigma^2(x)=E(y^2|x)-E(y|x)^2$. For this case Powell and Stoker (1996) discuss the rates of the asymptotic variance in (5) with a view to selecting the MSE-optimal bandwidth rate. The asymptotic variance does not depend on the kernel function when the bandwidth satisfies $Nh^{k+2}\to\infty$, but only if we have a certain degree of smoothness of the density: $v>\frac{k+2}{2}$. In the absence of this degree of differentiability (or when oversmoothing), the asymptotic variance (as the asymptotic bias) does depend on the weighting used in the local averaging, possibly yielding a non-parametric rate.

To express the asymptotic bias of the estimator $\hat\delta_N(K,h)$, define

$$A(K,h;x_i)=E_{z_i}\left(\hat f'_{(K,h)}(x_i)\right)-f'(x_i)=\int K(u)\left(f'(x_i-hu)-f'(x_i)\right)du.$$
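A minimal sketch of (4), reusing `density_grad_hat` from the sketch in Section 2 (again, the function names are ours, not the authors'):

```python
import numpy as np

def pss_ade(x, y, h):
    # Density weighted ADE of PSS: delta_hat = -(2/N) sum_i f'_{(K,h)}(x_i) y_i.
    fgrad = density_grad_hat(x, h)           # (N, k) estimated density gradients
    return -2.0 * np.mean(fgrad * y[:, None], axis=0)
```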

Then

$$Bias(\hat\delta_N(K,h))=-2E\left(A(K,h;x_i)\,y_i\right).\qquad(6)$$

As shown in Section 2, $E\,A(K,h;x_i)$ is $O(h^{v^*})$. We assume:

Assumption 6. As $N\to\infty$, $-2h^{-v^*}E\left(A(K,h;x_i)\,y_i\right)\to B(K)$, where $\|B(K)\|<\infty$ holds.

The asymptotic bias of the estimator $\hat\delta_N(K,h)$ can then be written as

$$Bias(\hat\delta_N(K,h))=h^{v^*}B(K)+o(h^{v^*})\qquad(7)$$

and vanishes as $h\to 0$. We note that Assumption 6 could hold as a result of primitive moment assumptions on $y_i$, $f(x_i)$ and $g(x_i)$.

Let $d(N)\asymp O(1)$ denote the case when both $d(N)$ and $1/d(N)$ are $O(1)$ as $N\to\infty$. Assume that $C=\lim_{N\to\infty}Nh^{k+2}$ always exists, with $C\in[0,\infty]$.

Theorem 1. Under Assumptions 1–6:

(a) If the density is sufficiently smooth and the order of the kernel is sufficiently high, $v^*>\frac{k+2}{2}$:

i. choosing $h$: $Nh^{k+2}=o(1)$, $N^2h^{k+2}\to\infty$ provides an unbiased but not efficient estimator:
$$Nh^{(k+2)/2}\left(\hat\delta_N(K,h)-\delta_0\right)\overset{d}{\to}N\left(0,\Sigma_1(K)\right);$$

ii. if $h$: $Nh^{k+2}\to C$, $0<C<\infty$:
$$\sqrt N\left(\hat\delta_N(K,h)-\delta_0\right)\overset{d}{\to}N\left(0,\ C^{-1}\Sigma_1(K)+\Sigma_2\right);$$

iii. when $h$: $Nh^{k+2}\to\infty$, $Nh^{2v^*}=o(1)$, the same result as in PSS, Theorem 3.3, holds:
$$\sqrt N\left(\hat\delta_N(K,h)-\delta_0\right)\overset{d}{\to}N\left(0,\Sigma_2\right);$$

iv. if $h$: $Nh^{k+2}\to\infty$ but $Nh^{2v^*}\asymp O(1)$, a biased asymptotically normal estimator results:
$$\sqrt N\left(\hat\delta_N(K,h)-\delta_0\right)\overset{d}{\to}N\left(B(K),\Sigma_2\right);$$

v. if $h$: $Nh^{2v^*}\to\infty$, the bias dominates:
$$h^{-v^*}\left(\hat\delta_N(K,h)-\delta_0\right)\overset{p}{\to}B(K).$$

(b) For the case $v^*=\frac{k+2}{2}$, (i), (ii) and (v) of part (a) apply.

(c) If either the density is not smooth enough or the order of the kernel is low, $v^*<\frac{k+2}{2}$, the parametric rate cannot be obtained:

i. for $h$: $N^2h^{k+2+2v^*}=o(1)$, $N^2h^{k+2}\to\infty$, in the limit there is normality and no bias:
$$Nh^{(k+2)/2}\left(\hat\delta_N(K,h)-\delta_0\right)\overset{d}{\to}N\left(0,\Sigma_1(K)\right);$$

ii. for $h$: $N^2h^{k+2+2v^*}\asymp O(1)$, $N^2h^{k+2}\to\infty$, in the limit there is normality with asymptotic bias:
$$Nh^{(k+2)/2}\left(\hat\delta_N(K,h)-\delta_0\right)\overset{d}{\to}N\left(B(K),\Sigma_1(K)\right);$$

iii. for $h$: $N^2h^{k+2+2v^*}\to\infty$, the bias dominates:
$$h^{-v^*}\left(\hat\delta_N(K,h)-\delta_0\right)\overset{p}{\to}B(K).$$

Proof. See the Appendix (where the variances and covariances are derived for any kernel, but parts of the normality proof are provided for the case of a symmetric kernel).

Selection of the optimal bandwidth, as minimizing the mean squared error, critically depends on our knowledge of the degree of smoothness of the density. Let $v$ denote the true differentiability (smoothness) of $f'$ and choose the order of our kernel $v(K)\ge[v]$. The $MSE(\hat\delta_N(K,h))$ can then be represented as

$$MSE(\hat\delta_N(K,h))=\Sigma_1(K)\,N^{-2}h^{-(k+2)}+\Sigma_2\,N^{-1}+B(K)B^T(K)\,h^{2v(K)},$$

and the optimal bandwidth yields $h_{opt}=cN^{-2/(2v(K)+k+2)}$, where the problem of efficient estimation is to find an appropriate $c$ (e.g., Powell and Stoker (1996)). If higher order derivatives exist, further improvements in efficiency can be obtained by using a higher order kernel to reduce the bias.
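The orders in this MSE expansion are easy to inspect numerically. A small sketch (scalar stand-ins of our own for $\Sigma_1(K)$, $\Sigma_2$ and $B(K)B^T(K)$) balances the first variance term against the squared bias and reproduces the quoted $N^{-2/(2v(K)+k+2)}$ rate:

```python
import numpy as np

def amse_terms(n, h, k, vK, sigma1=1.0, sigma2=1.0, b=1.0):
    # The three terms of MSE(delta_hat) from the text:
    var_kernel = sigma1 * n ** -2 * h ** -(k + 2)   # Sigma_1(K) N^{-2} h^{-(k+2)}
    var_param  = sigma2 / n                         # Sigma_2 N^{-1}
    bias_sq    = b ** 2 * h ** (2 * vK)             # B(K)B(K)' h^{2 v(K)}
    return var_kernel, var_param, bias_sq

def h_opt(n, k, vK, c=1.0):
    # Setting var_kernel = bias_sq gives h^{2 v(K) + k + 2} ~ N^{-2}, i.e.
    # h_opt = c N^{-2/(2 v(K) + k + 2)}, the rate quoted in the text.
    return c * n ** (-2.0 / (2 * vK + k + 2))
```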

In any case, to ascertain a parametric rate of our limiting distribution for $\hat\delta_N(K,h)$ with the use of the higher order kernel (as long as the density is sufficiently differentiable to have $v(K)\ge[v]$), our bandwidth sequence needs to satisfy $Nh^{2v(K)}\to 0$, and the degree of smoothness of the derivative of the density, $v$, needs to be in excess of $\frac{k+2}{2}$ (with $Nh^{k+2}$ bounded away from zero to guarantee boundedness of the variance of $\sqrt N\,\hat\delta_N(K,h)$). The advantage of being able to assume this high differentiability order is the insensitivity of the limit process to the bandwidth and kernel over a range of choices that satisfy the assumptions (among which $Nh^{k+2}\to\infty$); if the density is not sufficiently smooth, the parametric rate may not be achievable, and bandwidth and kernel choices become crucial in ensuring good performance. Moreover, if the degree of density smoothness is not known, there is no guidance for the choice of kernel and bandwidth: a higher order kernel and larger bandwidth that could be better if there were more smoothness could lead to substantial bias if the density is less smooth.

Without making further assumptions about knowledge of the degree of smoothness of the density, all that is known is that for some rate of $h\to 0$ there is undersmoothing (no asymptotic bias and a limiting Gaussian distribution), and for some slower convergence rate of $h$ there is oversmoothing. An optimal rate may exist, but to determine it, and to identify bandwidths which lead to under- and over-smoothing, precise knowledge of $v$, the density smoothness, is required.

The situation where there is uncertainty about the smoothness of the density was considered in Kotlyarova and Zinde-Walsh (2006), hereafter referred to as KZW. Theorem 1 corresponds to Assumption 1 of that paper and demonstrates that when our Assumptions 1–6 are satisfied the estimator satisfies their Assumption 1. We next establish that Assumption 2 of that paper is satisfied as well. Consider several kernel/bandwidth sequence pairs $(h_{Nj},K_j)$, $j=1,\dots,J$, and the corresponding estimators $\hat\delta_N(K_j,h_{Nj})$. If all satisfy the assumptions of Theorem 1, then there exist corresponding rates $r_{Nj}$ for which the joint limit process of $r_{Nj}\left(\hat\delta_N(K_j,h_{Nj})-\delta_0\right)$ is non-zero Gaussian, possibly degenerate.

Theorem 2. Under the Assumptions of Theorem 1, the joint limit process for the vector with components $r_{Nj}\left(\hat\delta_N(K_j,h_{Nj})-\delta_0\right)$, $j=1,\dots,J$, is Gaussian, with a covariance matrix

such that the covariances between components that correspond to different rates are zero.

Proof. See the Appendix.

Consider a linear combination of the estimators,

$$\bar\delta_N=\sum_j a_j\,\hat\delta_N(K_j,h_{Nj})\quad\text{with}\quad\sum_{j=1}^{J}a_j=1.$$

We can represent the variance of $\bar\delta_N$ as

$$\sum_{t_1,s_1}\sum_{t_2,s_2}a_{t_1,s_1}a_{t_2,s_2}\,Cov\!\left(\hat\delta_N(K_{t_1},h_{s_1}),\hat\delta_N(K_{t_2},h_{s_2})\right)\approx\sum a_{j_1}a_{j_2}\,\Sigma_{j_1j_2},$$

where (see the Appendix)

$$\Sigma_{j_1j_2}=\Sigma_1\!\left(K_{t_1},K_{t_2},h_{s_1}/h_{s_2}\right)N^{-2}\left(h_{s_1}h_{s_2}\right)^{-(k+2)/2}+\Sigma_2\,N^{-1}+O(N^{-2}),$$

with

$$\Sigma_1\!\left(K_{t_1},K_{t_2},h_{s_1}/h_{s_2}\right)=4E\left[y_i^2f(x_i)\,\Sigma\!\left(K_{t_1},K_{t_2},h_{s_1}/h_{s_2}\right)+\bar\Sigma\!\left(K_{t_1},K_{t_2},h_{s_1}/h_{s_2}\right)(gf)(x_i)\,y_i\right]$$

and $\Sigma_2$ as before. The $MSE(\bar\delta_N)=MSE\!\left(\sum_{t,s}a_{t,s}\,\hat\delta_N(K_t,h_s)\right)$ can then be represented as

$$MSE(\bar\delta_N)=\sum a_{j_1}a_{j_2}\left(B_{j_1}B_{j_2}^T+\Sigma_{j_1j_2}\right)$$

with $B_j=B(K_j)$.² To optimally choose the weights $a_j$, we will minimize the trace of the AMSE, as in KZW:³

$$tr\!\left(AMSE(\bar\delta_N)\right)=\sum a_{j_1}a_{j_2}\left(\tilde B_{j_1}^T\tilde B_{j_2}+tr\,\tilde\Sigma_{j_1j_2}\right)=a^TDa,$$

where $\{D\}_{j_1j_2}=\tilde B_{j_1}^T\tilde B_{j_2}+tr\,\tilde\Sigma_{j_1j_2}$,

² Alternatively, though more complicated, we could consider $\tilde\delta_N=-\frac{2}{N}\sum_{i=1}^{N}\sum_{s=1}^{S}w_s(x_i)\,\hat f'_{s,K_s}(x_i)\,y_i$, with $\sum_{s=1}^{S}w_s(x_i)=1$.

³ Note that the MSE only provides a complete ordering when $\bar\delta_N$ is a scalar; using a trace is one way to obtain a complete ordering. Depending on which scalar function of the AMSE is used, the ordering might differ.

$\tilde B_j=B_j/r_N(t_j,s_j)$, and $\tilde\Sigma_{j_1j_2}=\Sigma_{j_1j_2}/\left(r_N(t_{j_1},s_{j_1})\,r_N(t_{j_2},s_{j_2})\right)$. The combined estimator is defined as the linear combination with weights that minimize the estimated $tr(AMSE(\bar\delta_N))$.

KZW discusses the optimal weights that minimize the (consistently) estimated $tr(AMSE(\bar\delta_N))$ subject to $\sum_j a_j=1$; here we summarize the results. After ranking the pairs $(K_{t_j},h_{s_j})$ in declining order of the rates $r_N(t_j,s_j)$, denote by $D^I$ the largest invertible submatrix of $D$ and by $D^{(1)}$ its square submatrix associated with the estimators having the fastest rate of convergence; note that $D^I$ can have entries associated with at most one oversmoothed estimator if it is to be of full rank. Then $a^TD^Ia$ (subject to $\sum_j a_j=1$) is minimized by

$$a_{\lim}^{I}=\left(\frac{\mathbf 1^TD^{(1)-1}}{\mathbf 1^TD^{(1)-1}\mathbf 1},\ 0,\dots,0\right)^T,$$

that is, by weights that assign the minimizing combination to the kernel/bandwidth combinations having the fastest rate of convergence and zero weight to all combinations with a slower rate of convergence. Note that the weights in the limiting linear combination are non-negative for estimators corresponding to $D^I$ (at most one asymptotically biased estimator). If $D^I\neq D$, then $D\setminus D^I=D^{II}$ is of rank one and corresponds to oversmoothed estimators only. $D^{II}$ has dimension more than one (otherwise the only oversmoothed estimator would have been included in $D^I$); note that then there always exist vectors $a_{\lim}^{II}$ such that $a_{\lim}^{II\,T}D^{II}a_{\lim}^{II}=0$ and $\sum_i a_{\lim,i}^{II}=1$; in other words, it is possible to automatically bias-correct by using the combined estimator with weights that are not restricted to be non-negative. Finally, the vector of weights in the combined estimator approaches an optimal linear combination of $a_{\lim}^{I}$ and $a_{\lim}^{II}$.

The combined estimator thus has a trace of the AMSE that converges at a rate no worse than that of the trace of the AMSE of the fastest converging individual estimator. The combined estimator provides a useful mechanism for reducing the uncertainty about the degree of smoothness, and thus about the best rate (bandwidth), and automatically selects the best rate from those available, even though it is not known a priori which of the estimators converges faster.
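A minimal sketch of the weight computation (function names are ours; the estimated biases $\tilde B_j$ and covariances $\tilde\Sigma_{j_1j_2}$ are taken as inputs). It solves $\min_a a^TDa$ subject to $\sum_j a_j=1$, whose closed form is $a=D^{-1}\mathbf 1/(\mathbf 1^TD^{-1}\mathbf 1)$ when $D$ is invertible:

```python
import numpy as np

def combined_estimator(estimates, biases, covs):
    # estimates: list of J ADE vectors; biases: list of J estimated bias vectors;
    # covs[j1][j2]: estimated covariance matrix between estimators j1 and j2.
    J = len(estimates)
    D = np.empty((J, J))
    for j1 in range(J):
        for j2 in range(J):
            # D_{j1 j2} = B_{j1}' B_{j2} + tr(Sigma_{j1 j2})
            D[j1, j2] = biases[j1] @ biases[j2] + np.trace(covs[j1][j2])
    ones = np.ones(J)
    sol = np.linalg.pinv(D) @ ones      # pinv guards against a near-singular D
    a = sol / (ones @ sol)              # weights summing to one
    return a, sum(a[j] * np.asarray(estimates[j]) for j in range(J))
```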

The optimality property of the combined estimator relies on consistent estimation of the biases and covariances.⁴ To provide a consistent estimate for the asymptotic variance that does not rely on the degree of smoothness, we apply the bootstrap:

$$\tilde\Sigma_{j_1j_2}=\widehat{Cov}_B\!\left(\hat\delta_N(K_{t_1},h_{s_1}),\hat\delta_N(K_{t_2},h_{s_2})\right)=\frac{1}{B}\sum_{b=1}^{B}\left(\hat\delta_{b,N}(K_{t_1},h_{s_1})-\hat\delta_N(K_{t_1},h_{s_1})\right)\left(\hat\delta_{b,N}(K_{t_2},h_{s_2})-\hat\delta_N(K_{t_2},h_{s_2})\right)^T.\qquad(8)$$

To obtain a consistent estimator of the biases, we need to assume that for every kernel we consider an undersmoothed bandwidth, yielding an asymptotic bias equal to zero. Let $h_{s_0}$ denote the smallest bandwidth we consider; a consistent estimator for the bias is obtained as

$$\widetilde{Bias}\!\left(\hat\delta_N(K_{t_j},h_{s_j})\right)=\hat\delta_N(K_{t_j},h_{s_j})-\frac{1}{B}\sum_{b=1}^{B}\hat\delta_{b,N}(K_{t_j},h_{s_0}).$$

Alternatively, the bootstrapped averaged estimates at the lowest bandwidth for all the kernels, $i=1,\dots,m$, could be used in the bias estimation:

$$\widetilde{Bias}\!\left(\hat\delta_N(K_{t_j},h_{s_j})\right)=\hat\delta_N(K_{t_j},h_{s_j})-\frac{1}{mB}\sum_{b=1}^{B}\sum_{i=1}^{m}\hat\delta_{b,N}(K_{t_i},h_{s_0}).$$

4. Simulation

In order to illustrate the effectiveness of the combined estimator, we provide a Monte Carlo study where we consider the Tobit model. The Tobit model under consideration is given by

$$y_i=y_i^*\ \text{if}\ y_i^*>0,\qquad y_i=0\ \text{otherwise},\qquad y_i^*=x_i^T\beta+\varepsilon_i,\quad i=1,\dots,n,$$

⁴ Examples in KZW demonstrate that a combined estimator can reduce the AMSE relative to an estimator based on an incorrectly assumed high smoothness level even when the weights are not optimally determined.
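The bootstrap steps can be sketched as follows. Here `ade` stands for any routine computing $\hat\delta_N(K,h)$ for a given kernel/bandwidth pair (e.g. a kernel-indexed version of `pss_ade` above), and the unbiased benchmark uses the averaged undersmoothed estimates, as in the second (alternative) bias formula; all names are our own:

```python
import numpy as np

def bootstrap_bias_cov(x, y, ade, pairs, undersmoothed, B=50, seed=0):
    # pairs: list of (kernel, bandwidth) tuples indexing the J estimators;
    # undersmoothed: indices of the pairs using the smallest bandwidth h_{s0}.
    rng = np.random.default_rng(seed)
    n = len(y)
    full = np.array([ade(x, y, K, h) for K, h in pairs])        # (J, k)
    boot = np.empty((B, len(pairs), full.shape[1]))
    for b in range(B):
        idx = rng.integers(0, n, size=n)                        # resample (x_i, y_i)
        for j, (K, h) in enumerate(pairs):
            boot[b, j] = ade(x[idx], y[idx], K, h)
    centered = boot - boot.mean(axis=0)                         # as in (8)
    covs = np.einsum('bjp,bkq->jkpq', centered, centered) / B   # (J, J, k, k)
    benchmark = boot[:, undersmoothed].mean(axis=(0, 1))        # unbiased anchor
    biases = full - benchmark                                   # (J, k) bias estimates
    return biases, covs
```

The outputs plug directly into `combined_estimator` above.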

where the dependent variable $y_i$ is censored to zero for all observations for which the latent variable $y_i^*$ lies below a threshold, which without loss of generality is set equal to zero. We randomly draw $\{(x_i,\varepsilon_i)\}_{i=1}^n$, where we assume that the errors, drawn independently of the regressors, are standard Gaussian. Consequently, the conditional mean of $y$ given $x$ can be written as

$$g(x)=x^T\beta\,\Phi(x^T\beta)+\phi(x^T\beta),$$

where $\Phi(\cdot)$ and $\phi(\cdot)$ denote the standard normal cdf and pdf respectively. Irrespective of the distributional assumption on $\varepsilon_i$, this is a single index model, as the conditional mean of $y$ given $x$ depends on the data only through the index $x^T\beta$. While MLE obviously offers the asymptotically efficient estimator of $\beta$, the (density weighted) ADE offers a semiparametric estimator of $\beta$ which does not rely on the Gaussianity assumption on $\varepsilon_i$. Under the usual smoothness assumptions, the finite sample properties of the ADE for the Tobit model have been considered in the literature (Nishiyama and Robinson (2005)).

We select two explanatory variables and set $\beta=(1,1)^T$. We make various assumptions about the distribution of the explanatory variables. For the first model, we use two independent standard normal explanatory variables, i.e., $f_1(x_1,x_2)=\phi(x_1)\phi(x_2)$. This density is infinitely differentiable and very smooth; thus, the ADE estimator evaluated at the optimal bandwidth should be a good choice. This model, which we label (s,s), is considered to demonstrate that even in the case where the smoothness assumptions hold, the combined estimator performs similarly to the ADE estimator evaluated at the optimal bandwidth. For the second model we use one standard normal explanatory variable and one mixture of normals, and in the third model both explanatory variables are mixtures of normals. We label these models respectively (s,m) and (m,m). As in the first model, we assume independence of the explanatory variables. Mixtures of normals, while still being infinitely differentiable, do allow behaviour resembling that of nonsmooth densities, e.g., the double claw density and the discrete comb density (see Marron and Wand (1992)). We consider here the trimodal normal mixture given by

$$f_m(x)=0.5\,\phi(x+0.767)+3\,\phi\!\left(\frac{x+0.767-0.8}{0.1}\right)+2\,\phi\!\left(\frac{x+0.767-1.2}{0.1}\right).$$
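A sketch of the data generating process follows. The mixture constants in `trimodal` follow the reconstruction of $f_m$ displayed above (mixture weights 0.5, 0.3, 0.2) and should be treated as placeholders rather than the paper's exact values; the function names are ours:

```python
import numpy as np

def trimodal(rng, n):
    # Draw from the trimodal mixture: weights 0.5, 0.3, 0.2 on
    # N(-0.767, 1), N(0.8 - 0.767, 0.1^2) and N(1.2 - 0.767, 0.1^2).
    comp = rng.choice(3, size=n, p=[0.5, 0.3, 0.2])
    means = np.array([-0.767, 0.8 - 0.767, 1.2 - 0.767])
    scales = np.array([1.0, 0.1, 0.1])
    return means[comp] + scales[comp] * rng.standard_normal(n)

def simulate_tobit(n, beta=(1.0, 1.0), model="mm", seed=0):
    # Tobit DGP: y* = x'beta + eps, eps ~ N(0, 1); y = max(y*, 0).
    # model is "ss", "sm" or "mm": one letter per regressor (s = standard
    # normal, m = trimodal mixture), matching the three designs in the text.
    rng = np.random.default_rng(seed)
    draw = {"s": lambda: rng.standard_normal(n), "m": lambda: trimodal(rng, n)}
    x = np.column_stack([draw[c]() for c in model])
    ystar = x @ np.asarray(beta) + rng.standard_normal(n)
    return x, np.maximum(ystar, 0.0)
```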

So $f_2(x_1,x_2)=\phi(x_1)f_m(x_2)$ and $f_3(x_1,x_2)=f_m(x_1)f_m(x_2)$.⁵ The sample size is set at 1000 and 100 replications are drawn in each case.

⁵ We are planning an analysis using the claw density and the discrete comb density (Marron and Wand (1992)) and are also exploring the selection of a density with a precise order of non-smoothness.

The multivariate kernel function $K(\cdot)$ (on $R^2$) is chosen as the product of two univariate kernel functions. We use a second and a fourth order kernel in our Monte Carlo experiment, where, given that we use two explanatory variables, the highest order satisfies the minimal theoretical requirement for ascertaining a parametric rate subject to the necessary smoothness assumptions. Both are bounded, symmetric kernels which satisfy the assumption that the kernel and its derivative vanish at the boundary. Other simulations will consider the use of asymmetric kernels, which may yield further improvements for the combined estimator.

For each kernel we consider three different bandwidths. The largest bandwidth is chosen on the basis of a generalized cross validation method, where a grid-search algorithm and 50 simulations are used. The cross-validation bandwidth is given by the optimal bandwidth sequence $h_{gcv}=cN^{-1/(2p+2)}$ (see Stone (1982)), with $p$ equalling the order of the kernel (so that here $p=v(K)$). For densities of sufficient smoothness, this bandwidth does not represent the undersmoothing required to ensure asymptotic unbiasedness. When densities are not sufficiently smooth, $v^*=v<v(K)$, $h_{gcv}$ will even correspond to oversmoothing, as we will have $Nh^{2v^*}\to\infty$, providing cases (a)v, (b)iii, or (c)iii in Theorem 1. The smallest bandwidth for each kernel is chosen as $0.5h_{gcv}$; it needs to be sufficiently small so as to ensure the required level of undersmoothing. In addition we take the intermediate bandwidth $0.75h_{gcv}$.

The generalized cross validation method applied is not the one typically applied for nonparametric regression, but is specialized to the derivative of the regression function, $g'(x)$. We use the usual generalized cross validation ($\min\sum_{i=1}^{n}(y_i-\hat g_{-i}(x_i))^2$) to obtain numerical derivatives of $g(x)$, evaluated at a uniform grid of the $x$'s, which we denote as $\tilde g'_{pgcv}(x)$. The optimal bandwidth for the derivative of the regression function is then obtained by minimizing $\left\|\tilde g'_{pgcv}(x)-\hat g'(x)\right\|$. The bandwidth obtained this way yielded smaller bandwidths than the usual cross validation method, which accorded well with selecting the bandwidth by minimizing the mean squared error of the nonparametrically estimated moments (Donkers and Schafgans (2005)), a method which can only be applied in a simulation setting through knowledge of the true data generating process.
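The first (regression) stage of this bandwidth selection can be sketched with a standard leave-one-out criterion for the Nadaraya-Watson fit; the derivative-matching second stage is omitted, and all names and the grid argument are our own choices:

```python
import numpy as np

def loo_cv_bandwidth(x, y, h_grid):
    # Choose h minimizing sum_i (y_i - g_hat_{-i}(x_i))^2, where g_hat_{-i}
    # is the Nadaraya-Watson estimator computed without observation i.
    best = (np.inf, None)
    for h in h_grid:
        u = (x[:, None, :] - x[None, :, :]) / h
        # product quartic kernel weights; clip enforces the [-1, 1] support
        w = np.prod((15.0 / 16.0) * np.clip(1.0 - u ** 2, 0.0, None) ** 2, axis=-1)
        np.fill_diagonal(w, 0.0)                  # leave one out
        denom = w.sum(axis=1)
        ok = denom > 0                            # skip isolated points
        resid = y[ok] - (w @ y)[ok] / denom[ok]
        score = np.mean(resid ** 2)
        if score < best[0]:
            best = (score, h)
    return best[1]
```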

Consistent estimators for the biases and covariances of the density weighted ADE are obtained by bootstrap (with 50 bootstraps) as discussed in the previous section.

In Table 1 we report the Root Mean Squared Errors (RMSE) of the various density weighted average derivatives, together with the average bias and standard deviation (theoretical (T), sample (S), and bootstrapped (B)) of the average derivatives. The first three columns present the results using the 2nd order kernel ($K_2$) for the various bandwidths, the next three columns present the results using the 4th order kernel ($K_4$) for the various bandwidths, and the final column that of the combined estimator. The RMSEs using the different pairs of kernels and bandwidths should be compared with the RMSE of the combined estimator, which optimally chooses the weights.

In all three models we see that the biases and standard deviations of the individual estimators on average behave as expected: as the bandwidth increases, the bias becomes more pronounced and the standard deviation declines. No kernel/bandwidth pair is the best in terms of RMSE among the individual ones for all the models, although $(K_4,h_{gcv})$ is best for (s,s) and (s,m) and close to the best for (m,m). The theoretical standard deviation (using the leading two components of $Var(\hat\delta_N)$ given in (5)) compares very well with the standard deviation based on the bootstrap, where we note the importance of taking the kernel/bandwidth dependent component into account to ensure this close correspondence. The sample standard deviation still reveals a disparity (smaller for (s,s) and (s,m) versus larger for (m,m)), which might be the consequence of having set the number of simulations too low.

Table 1 shows that in terms of the RMSE of the ADE the combined estimator performs better than the individual estimators in all cases.⁶

⁶ For each model, the cross-validated bandwidths for the second and fourth order kernels, for both the nonparametric regression and its derivative, were obtained from the grid search described above.

Table 1: Density weighted ADE estimators. For each of the three models ((s,s), (s,m), (m,m)), the rows report the RMSE, Bias, StdDev(T), StdDev(S) and StdDev(B) of the two components of the density weighted ADE. Columns: $(K_2,0.5h_{gcv})$, $(K_2,0.75h_{gcv})$, $(K_2,h_{gcv})$, $(K_4,0.5h_{gcv})$, $(K_4,0.75h_{gcv})$, $(K_4,h_{gcv})$, Combined.

Where there is a clearly superior individual estimator, it gets a higher weight on average, and, in agreement with the results for the combined estimator, oversmoothed individual estimators get weights of different signs, reflecting the tendency of the combined estimator to balance off the biases. Specifically, the average weights of $\left((K_2,0.75h_{gcv}),(K_2,h_{gcv})\right)$ and $\left((K_4,0.75h_{gcv}),(K_4,h_{gcv})\right)$ have opposite signs in all models; for (s,s) $(K_2,0.75h_{gcv})$ gets a relatively large weight, for (s,m) so does $(K_4,h_{gcv})$, while for (m,m) $(K_2,0.5h_{gcv})$ gets a large weight.⁷

In Table 2 the parameter estimates of the Tobit model are presented. Since the ADE allows for the estimation of $\beta=(\beta_1,\beta_2)^T$ up to scale, we report results for the parameter estimates of $\beta_2$, where $\beta_1$ is standardized to 1. For comparison, the Tobit MLE parameter estimates are reported as well. To ensure comparability with the semiparametric estimates, where $\beta_1$ is standardized to 1, we report $\hat\beta_2^{(t)}/\hat\beta_1^{(t)}$ for the Tobit regressions, where $\hat\beta^{(t)}$ are the Tobit parameter estimates (allowing for the estimation of an intercept $\beta_0$). Again the results are provided for each kernel/bandwidth pair as well as for the combined estimator.

When looking at Table 2, we note that superiority in estimating the ADE does not necessarily translate into better parameter estimates. If we judge performance by the RMSE, no individual estimator can be ranked as the best in all models (and none is ranked above the combined estimator in all the models). The kernel/bandwidth combination which is best for (s,s) and (m,m) is $(K_2,h_{gcv})$, compared to $(K_2,0.5h_{gcv})$ for (s,m). Even though the combined estimator is not ranked best in the RMSE sense in any of the models, its RMSE is relatively closer to the best individual estimator than to the worst individual estimator. The same conclusions can be drawn if we judge the performance on the basis of the absolute deviation of the mean of the individual estimator from the true value 1: in this case, $(K_2,0.75h_{gcv})$ is best for (s,s), compared to $(K_4,0.75h_{gcv})$ for (s,m) and $(K_4,h_{gcv})$ for (m,m); only the individual estimator $(K_4,0.5h_{gcv})$ is, with this criterion, consistently worse than the combined estimator. A loss in efficiency arising from not knowing the distribution of the disturbances occurs as expected, but is within reason; the standard deviation of the combined semiparametric estimator is less than double that of the Tobit MLE for (s,s).

⁷ On average the weights are, for (s,s), (0.34, 0.99, 0.4, 0.03, 0.47, 0.5); for (s,m), (0.30, 0.39, 0.40, 0.05, 0.60, 1.7); and for (m,m), (0.66, 1.58, 1.50, 0.07, 1.5, 1.44).

Table 2: Tobit model: single index parameter estimates ($\hat\beta_2$ with $\beta_1$ standardized to 1). For each of the three models ((s,s), (s,m), (m,m)), the rows report Mean, StdDev(T), StdDev(S) and RMSE. Columns: parametric Tobit MLE, and the semiparametric ADE based estimators $(K_2,0.5h_{gcv})$, $(K_2,0.75h_{gcv})$, $(K_2,h_{gcv})$, $(K_4,0.5h_{gcv})$, $(K_4,0.75h_{gcv})$, $(K_4,h_{gcv})$, Combined.

While the loss in efficiency arising from not knowing the distribution of the disturbances is more severe for (s,m) and (m,m), the potential gain from using the combined estimator over an incorrect kernel/bandwidth combination is greater with non-smooth densities for the explanatory variables.

5. Appendix

The proofs of Theorems 1 and 2 rely on the following Lemmas 1 and 2, correspondingly, where the moments are computed under the general assumptions of this paper. We do not use the theory of $U$-statistics in the following lemma, but obtain the moments by direct computation for symmetric as well as non-symmetric kernels.

Lemma 1. Given Assumptions 1–4, the variance of $\hat\delta_N(K,h)$ can be expressed as

$$Var(\hat\delta_N(K,h))=\Sigma_1\,N^{-2}h^{-(k+2)}+\Sigma_2\,N^{-1}+O(N^{-2}),$$

where

$$\Sigma_1=4E\left[y_i^2f(x_i)\,\Sigma(K)+\bar\Sigma(K)\,g(x_i)f(x_i)\,y_i\right]+o(1),$$
$$\Sigma_2=4\,E\left[\left(g'(x_i)f(x_i)-(y_i-g(x_i))f'(x_i)\right)\left(g'(x_i)f(x_i)-(y_i-g(x_i))f'(x_i)\right)^T\right]-4\,\delta_0\delta_0^T+o(1),$$

for

$$\Sigma(K)=\int K'(u)K'(u)^T\,du,\qquad\bar\Sigma(K)=\int K'(u)K'(-u)^T\,du$$

(under symmetry $\bar\Sigma(K)=-\Sigma(K)$).

Proof. First, recall that $Bias(\hat\delta_N(K,h))=-2E\left(A(K,h;x_i)\,y_i\right)=h^{v^*}B(K)+o(h^{v^*})$, with

$$A(K,h;x_i)=\int K(u)\left(f'(x_i-hu)-f'(x_i)\right)du.\qquad(A.1)$$

To derive an expression for the variance of $\hat\delta_N(K,h)$, we note that

$$Var(\hat\delta_N(K,h))=E\left(\hat\delta_N(K,h)\hat\delta_N(K,h)^T\right)-E\hat\delta_N(K,h)\,E\hat\delta_N(K,h)^T.$$

Let $I(a)=1$ if the expression $a$ is true, zero otherwise. We decompose the first term as follows:

$$E\left(\hat\delta_N(K,h)\hat\delta_N(K,h)^T\right)=4E\left\{\left[\frac{1}{N}\sum_{i=1}^{N}\hat f'_{(K,h)}(x_i)y_i\right]\left[\frac{1}{N}\sum_{i=1}^{N}\hat f'_{(K,h)}(x_i)y_i\right]^T\right\}$$
$$=4\left\{\frac{1}{N}E\left[\hat f'_{(K,h)}(x_i)\hat f'_{(K,h)}(x_i)^Ty_i^2\right]+\frac{N-1}{N}E\left[\hat f'_{(K,h)}(x_{i_1})\hat f'_{(K,h)}(x_{i_2})^Ty_{i_1}y_{i_2}I(i_1\neq i_2)\right]\right\}.\qquad(A.2)$$

The first expectation yields

$$E\left[\hat f'_{(K,h)}(x_i)\hat f'_{(K,h)}(x_i)^Ty_i^2\right]=\frac{1}{(N-1)^2}E\left\{E_{z_i}\left(\left[\sum_{j\neq i}\frac{1}{h^{k+1}}K'\!\left(\frac{x_i-x_j}{h}\right)\right]\left[\sum_{j\neq i}\frac{1}{h^{k+1}}K'\!\left(\frac{x_i-x_j}{h}\right)\right]^T\right)y_i^2\right\}$$
$$=\frac{1}{(N-1)h^{k+2}}E\left[y_i^2\,E_{z_i}\!\left(\frac{1}{h^{k}}K'\!\left(\frac{x_i-x_j}{h}\right)K'\!\left(\frac{x_i-x_j}{h}\right)^T\right)\right]+\frac{N-2}{N-1}E\left[y_i^2\,E_{z_i}\!\left(\frac{1}{h^{k+1}}K'\!\left(\frac{x_i-x_{j_1}}{h}\right)\right)E_{z_i}\!\left(\frac{1}{h^{k+1}}K'\!\left(\frac{x_i-x_{j_2}}{h}\right)\right)^T\right]$$
$$=\frac{1}{(N-1)h^{k+2}}E\left[y_i^2\int K'(u)K'(u)^Tf(x_i-hu)\,du\right]+\frac{N-2}{N-1}E\left[y_i^2\left(f'(x_i)+A(K,h;x_i)\right)\left(f'(x_i)+A(K,h;x_i)\right)^T\right]$$
$$=\frac{1}{(N-1)h^{k+2}}\left\{E\left[y_i^2f(x_i)\right]\Sigma(K)+O(h)\right\}+\frac{N-2}{N-1}\left\{E\left[\left(f'(x_i)y_i\right)\left(f'(x_i)y_i\right)^T\right]+O(h^{v^*})\right\},\qquad(A.3)$$

where for the third and the last equality we use a change of variables in the integration and the independence of $x_{j_1}$ and $x_{j_2}$; by Assumptions 4 and 5 the moments of the additional terms are correspondingly bounded. Further,

$$E\left[\hat f'_{(K,h)}(x_i)\hat f'_{(K,h)}(x_i)^Ty_i^2\right]=\left\{\frac{1}{(N-1)h^{k+2}}\left[E\left(y_i^2f(x_i)\right)\Sigma(K)+O(h)\right]+E\left[\left(f'(x_i)y_i\right)\left(f'(x_i)y_i\right)^T\right]+O(h^{v^*})\right\}\left\{1+O(N^{-1})\right\}.$$

The second expectation yields

$$E\left[\hat f'_{(K,h)}(x_{i_1})\hat f'_{(K,h)}(x_{i_2})^Ty_{i_1}y_{i_2}I(i_1\neq i_2)\right]=\frac{1}{(N-1)^2h^{2(k+1)}}E\left[y_{i_1}y_{i_2}\sum_{j_1\neq i_1}\sum_{j_2\neq i_2}K'\!\left(\tfrac{x_{i_1}-x_{j_1}}{h}\right)K'\!\left(\tfrac{x_{i_2}-x_{j_2}}{h}\right)^T\right]$$
$$=\frac{N-2}{(N-1)^2h^{2(k+1)}}E\left[y_{i_1}y_{i_2}K'\!\left(\tfrac{x_{i_1}-x_{j}}{h}\right)K'\!\left(\tfrac{x_{i_2}-x_{j}}{h}\right)^T\,I(j_1=j_2=j;\ j,i_1,i_2\ \text{distinct})\right]$$
$$+\frac{1}{(N-1)^2h^{2(k+1)}}E\left[y_{i_1}y_{i_2}K'\!\left(\tfrac{x_{i_1}-x_{i_2}}{h}\right)K'\!\left(\tfrac{x_{i_2}-x_{i_1}}{h}\right)^T\,I(j_1=i_2,\ j_2=i_1)\right]$$
$$+\frac{N-2}{(N-1)^2h^{2(k+1)}}E\left[y_{i_1}y_{i_2}K'\!\left(\tfrac{x_{i_1}-x_{i_2}}{h}\right)K'\!\left(\tfrac{x_{i_2}-x_{j_2}}{h}\right)^T\,I(j_1=i_2;\ j_2\neq i_1,i_2,j_1)\right]$$
$$+\frac{N-2}{(N-1)^2h^{2(k+1)}}E\left[y_{i_1}y_{i_2}K'\!\left(\tfrac{x_{i_1}-x_{j_1}}{h}\right)K'\!\left(\tfrac{x_{i_2}-x_{i_1}}{h}\right)^T\,I(j_2=i_1;\ j_1\neq i_1,i_2,j_2)\right]$$
$$+\frac{(N-2)(N-3)}{(N-1)^2h^{2(k+1)}}E\left[y_{i_1}y_{i_2}K'\!\left(\tfrac{x_{i_1}-x_{j_1}}{h}\right)K'\!\left(\tfrac{x_{i_2}-x_{j_2}}{h}\right)^T\,I(j_1,j_2,i_1,i_2\ \text{pairwise distinct})\right].$$

Using the law of iterated expectations, we rewrite

$$E\left[\hat f'_{(K,h)}(x_{i_1})\hat f'_{(K,h)}(x_{i_2})^Ty_{i_1}y_{i_2}I(i_1\neq i_2)\right]\qquad(A.4)$$
$$=\frac{N-2}{(N-1)^2}E\left[E_{z_j}\!\left(y_{i_1}\tfrac{1}{h^{k+1}}K'\!\left(\tfrac{x_{i_1}-x_j}{h}\right)\right)E_{z_j}\!\left(y_{i_2}\tfrac{1}{h^{k+1}}K'\!\left(\tfrac{x_{i_2}-x_j}{h}\right)\right)^T\right]$$
$$+\frac{1}{(N-1)^2h^{k+2}}E\left[y_{i_1}E_{z_{i_1}}\!\left(y_{i_2}\tfrac{1}{h^{k}}K'\!\left(\tfrac{x_{i_1}-x_{i_2}}{h}\right)K'\!\left(\tfrac{x_{i_2}-x_{i_1}}{h}\right)^T\right)\right]$$
$$+\frac{N-2}{(N-1)^2}E\left[E_{z_{i_2}}\!\left(y_{i_1}\tfrac{1}{h^{k+1}}K'\!\left(\tfrac{x_{i_1}-x_{i_2}}{h}\right)\right)\,y_{i_2}\,E_{z_{i_2}}\!\left(\tfrac{1}{h^{k+1}}K'\!\left(\tfrac{x_{i_2}-x_{j_2}}{h}\right)\right)^T\right]$$
$$+\frac{N-2}{(N-1)^2}E\left[E_{z_{i_1}}\!\left(\tfrac{1}{h^{k+1}}K'\!\left(\tfrac{x_{i_1}-x_{j_1}}{h}\right)\right)\,y_{i_1}\,E_{z_{i_1}}\!\left(y_{i_2}\tfrac{1}{h^{k+1}}K'\!\left(\tfrac{x_{i_2}-x_{i_1}}{h}\right)\right)^T\right]$$
$$+\frac{(N-2)(N-3)}{(N-1)^2}E\left[E_{z_{i_1}}\!\left(y_{i_1}\tfrac{1}{h^{k+1}}K'\!\left(\tfrac{x_{i_1}-x_{j_1}}{h}\right)\right)\right]E\left[E_{z_{i_2}}\!\left(y_{i_2}\tfrac{1}{h^{k+1}}K'\!\left(\tfrac{x_{i_2}-x_{j_2}}{h}\right)\right)\right]^T,$$

where for brevity we omit the indicator $I(i_1\neq i_2)$ in the terms of the expression. Details of the derivation follow next. Denote

$$A(K,h;x_i)=E_{z_i}\left(\hat f'_{(K,h)}(x_i)\right)-f'(x_i)=\int K(u)\left(f'(x_i-hu)-f'(x_i)\right)du,$$
$$B(K,h;x_i)=\int K'(u)K'(u)^T\left(f(x_i-hu)-f(x_i)\right)du,$$
$$C(K,h;x_i)=-\int K(u)\left[(gf)'(x_i+hu)-(gf)'(x_i)\right]du,$$
$$D(K,h;x_i)=\int K'(u)K'(-u)^T\left[(gf)(x_i+hu)-(gf)(x_i)\right]du,$$
$$c(x_i)=-(gf)'(x_i),\qquad d(K,x_i)=\bar\Sigma(K)\,(gf)(x_i),$$
$$\Sigma(K)=\int K'(u)K'(u)^T\,du,\qquad\bar\Sigma(K)=\int K'(u)K'(-u)^T\,du$$

(under symmetry $\bar\Sigma(K)=-\Sigma(K)$). Then write, for the terms in (A.4), first

$$E_{z_i}\left[\frac{1}{h^{k+1}}K'\!\left(\frac{x_i-x_j}{h}\right)y_i\right]=f'(x_i)y_i+A(K,h;x_i)y_i.$$

The remaining conditional moments are

$$E_{z_j}\left[\frac{1}{h^{k+1}}K'\!\left(\frac{x_i-x_j}{h}\right)y_i\right]=c(x_j)+C(K,h;x_j),\qquad(A.5)$$
$$E_{z_i}\left[\frac{1}{h^{k}}K'\!\left(\frac{x_j-x_i}{h}\right)K'\!\left(\frac{x_i-x_j}{h}\right)^Ty_j\right]=d(K,x_i)+D(K,h;x_i).\qquad(A.6)$$

Indeed, for (A.5),

$$E_{z_j}\left[\frac{1}{h^{k+1}}K'\!\left(\frac{x_i-x_j}{h}\right)y_i\right]=\frac{1}{h^{k+1}}\int K'\!\left(\frac{x-x_j}{h}\right)(gf)(x)\,dx=\frac{1}{h}\int K'(u)\,(gf)(x_j+hu)\,du$$
$$=-(gf)'(x_j)-\int K(u)\left[(gf)'(x_j+hu)-(gf)'(x_j)\right]du=c(x_j)+C(K,h;x_j)\qquad\text{(integration by parts)}.$$
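The kernel constants and the symmetry claim $\bar\Sigma(K)=-\Sigma(K)$ can be checked numerically. A univariate sketch (quartic kernel; our own code, with simple Riemann-sum quadrature):

```python
import numpy as np

def kernel_constants(kprime, m=20001):
    # Quadrature approximations to Sigma(K) = int K'(u) K'(u) du and
    # Sigma_bar(K) = int K'(u) K'(-u) du over the support [-1, 1] (k = 1 case).
    u = np.linspace(-1.0, 1.0, m)
    du = u[1] - u[0]
    g, g_neg = kprime(u), kprime(-u)
    return np.sum(g * g) * du, np.sum(g * g_neg) * du

# Quartic kernel K(u) = (15/16)(1 - u^2)^2 on [-1, 1]; K is symmetric, so
# K'(-u) = -K'(u) and the two constants come out as (c, -c) for some c > 0.
quartic_prime = lambda u: (15.0 / 16.0) * (-4.0) * u * (1.0 - u ** 2)
sigma, sigma_bar = kernel_constants(quartic_prime)   # sigma_bar == -sigma
```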

For (A.6),

$$E_{z_i}\left[\frac{1}{h^{k}}K'\!\left(\frac{x_j-x_i}{h}\right)K'\!\left(\frac{x_i-x_j}{h}\right)^Ty_j\right]=\frac{1}{h^{k}}\int g(x)\,K'\!\left(\frac{x-x_i}{h}\right)K'\!\left(\frac{x_i-x}{h}\right)^Tf(x)\,dx\qquad\text{(c.o.v. } \tfrac{x-x_i}{h}=u\text{)}$$
$$=\int K'(u)K'(-u)^T(gf)(x_i+hu)\,du=\int K'(u)K'(-u)^T(gf)(x_i)\,du+\int K'(u)K'(-u)^T\left[(gf)(x_i+hu)-(gf)(x_i)\right]du$$
$$=d(K,x_i)+D(K,h;x_i).$$

It is useful to note here that

$$E\left[E_{z_i}\!\left(\frac{1}{h^{k+1}}K'\!\left(\frac{x_i-x_j}{h}\right)y_i\right)\right]=E\left[E_{z_j}\!\left(\frac{1}{h^{k+1}}K'\!\left(\frac{x_i-x_j}{h}\right)y_i\right)\right],$$

i.e. $E\left[f'(x_i)y_i+A(K,h;x_i)y_i\right]=E\left[c(x_j)+C(K,h;x_j)\right]$; indeed, it can easily be verified that $E(f'(x_i)y_i)=E(c(x_j))$.

Using (A.1), (A.5) and (A.6) we can express (A.4) as

$$E\left[\hat f'_{(K,h)}(x_{i_1})\hat f'_{(K,h)}(x_{i_2})^Ty_{i_1}y_{i_2}\right]=\frac{N-2}{(N-1)^2}E\left[\left(c(x_i)+C(K,h;x_i)\right)\left(c(x_i)+C(K,h;x_i)\right)^T\right]$$
$$+\frac{1}{(N-1)^2h^{k+2}}E\left[d(K,x_i)y_i+D(K,h;x_i)y_i\right]$$
$$+\frac{N-2}{(N-1)^2}E\left[\left(c(x_i)+C(K,h;x_i)\right)\left(f'(x_i)y_i+A(K,h;x_i)y_i\right)^T\right]$$
$$+\frac{N-2}{(N-1)^2}E\left[\left(f'(x_i)y_i+A(K,h;x_i)y_i\right)\left(c(x_i)+C(K,h;x_i)\right)^T\right]$$
$$+\frac{(N-2)(N-3)}{(N-1)^2}E\left[f'(x_i)y_i+A(K,h;x_i)y_i\right]E\left[f'(x_i)y_i+A(K,h;x_i)y_i\right]^T.\qquad(A.7)$$

Combining (A.2), (A.3) and (A.7) yields

$$E\left[\hat\delta_N(K,h)\hat\delta_N(K,h)^T\right]=\frac{4}{N(N-1)h^{k+2}}E\left[y_i^2f(x_i)\Sigma(K)+B(K,h;x_i)y_i^2+d(K,x_i)y_i+D(K,h;x_i)y_i\right]$$
$$+\frac{4(N-2)}{N(N-1)}E\left[\left(f'(x_i)y_i+A(K,h;x_i)y_i\right)\left(f'(x_i)y_i+A(K,h;x_i)y_i\right)^T\right]$$
$$+\frac{4(N-2)}{N(N-1)}E\left[\left(c(x_i)+C(K,h;x_i)\right)\left(c(x_i)+C(K,h;x_i)\right)^T\right]$$
$$+\frac{4(N-2)}{N(N-1)}E\left[\left(c(x_i)+C(K,h;x_i)\right)\left(f'(x_i)y_i+A(K,h;x_i)y_i\right)^T\right]$$
$$+\frac{4(N-2)}{N(N-1)}E\left[\left(f'(x_i)y_i+A(K,h;x_i)y_i\right)\left(c(x_i)+C(K,h;x_i)\right)^T\right]$$
$$+\frac{(N-2)(N-3)}{N(N-1)}E\hat\delta_N(K,h)\,E\hat\delta_N(K,h)^T.$$

The final expression (using Assumptions 3–5 repeatedly to show convergence to zero of the expectations of the terms involving the quantities denoted by capitals) is

$$E\left[\hat\delta_N(K,h)\hat\delta_N(K,h)^T\right]=\frac{4}{N^2h^{k+2}}E\left[y_i^2f(x_i)\Sigma(K)+y_i(gf)(x_i)\bar\Sigma(K)\right]+o(\cdot)$$
$$+\frac{4}{N}E\left[y_i^2f'(x_i)f'(x_i)^T+(gf)'(x_i)(gf)'(x_i)^T-y_i(gf)'(x_i)f'(x_i)^T-y_if'(x_i)(gf)'(x_i)^T+o(1)\right]$$
$$+\frac{(N-2)(N-3)}{N(N-1)}E\hat\delta_N(K,h)\,E\hat\delta_N(K,h)^T.$$

Alternatively, we can write the variance expression in the form given in the statement of the Lemma.

Remark 1. For $N\,Var(\hat\delta_N(K,h))$ to converge, we require $Nh^{k+2}\asymp O(1)$ or $Nh^{k+2}\to\infty$. Notice that indeed, given $Nh^{k+2}\to\infty$ (regardless of whether we assume the kernel to be symmetric),

$$N\,Var(\hat\delta_N(K,h))\to 4\left\{E\left[c(x_i)c(x_i)^T\right]+E\left[\left(f'(x_i)c(x_i)^T+c(x_i)f'(x_i)^T\right)y_i\right]+E\left[y_i^2f'(x_i)f'(x_i)^T\right]\right\}-4\,\delta_0\delta_0^T$$
$$=4\,E\left[\left(g'(x_i)f(x_i)-(y_i-g(x_i))f'(x_i)\right)\left(g'(x_i)f(x_i)-(y_i-g(x_i))f'(x_i)\right)^T\right]-4\,\delta_0\delta_0^T=\Sigma_2,$$

as in PSS (1989).

Proof of Theorem 1. Three main situations have to be dealt with in the proof. From Lemma 1 it follows that the variance has two leading parts: one that converges at a parametric rate, $O(N^{-1})$, requiring $Nh^{k+2}\to\infty$; when this condition on the rate of the bandwidth does not hold, the variance converges at the rate $O(N^{-2}h^{-(k+2)})$. The bias converges at the rate $O(h^{v^*})$.

The first situation arises when the rate of the bias dominates the rates of both leading terms in the variance: cases (a)v (correspondingly in (b)) and (c)iii. By standard arguments this situation clearly results in convergence in probability to $B(K)$, as stated in the Theorem.

The second situation refers to the parametric rate of the variance dominating (with or without bias). For this case Theorem 3.3 in PSS applies. Since the proof in PSS is based on the theory of $U$-statistics, we make the additional assumption of symmetry of the kernel function (see the comment in Serfling (1980, p. 7), to which PSS refer in footnote 7, regarding symmetrization; it is not actually clear to me how this will help in the proof for a non-symmetric kernel).

The third situation is when the condition $Nh^{k+2}\to\infty$ is violated; note that if the degree of smoothness $v^*<\frac{k+2}{2}$, this condition, regardless of the kernel order, could hold only in the case when the bias dominates. The possibility $Nh^{k+2}\to 0$ was not examined in the literature previously. We thus need to provide the proof of asymptotic normality for cases (a)i (correspondingly (b)) and (c)i. Consider $Nh^{k+2}\to 0$.

Sketch of proof. We shall say that $x_i$, $x_j$ are close if $|x_i-x_j|<h$; here $|w|$ indicates the maximum of the absolute values of the components of the vector $w$. In the sample $\{x_1,\dots,x_N\}$, denote by $A_s$ the set $\{x_i\mid\text{exactly } s-1 \text{ other } x_j \text{ with } j>i \text{ are close to } x_i\}$. Then $A_1$ is the set of "isolated" $x_i$ that do not have any other close sample points, $A_2$ is the set of points with exactly one close point for a given $h$, etc. Clearly, $\cup_{s=1}^{N}A_s$ represents a partition of the sample.

Step 1 of the proof. We show that a small enough $h$ results in the probability measure of $\cup_{s=3}^{N}A_s$ going to zero fast enough; this implies that most of the non-zero contribution to $\hat\delta$ comes from $A_2$ (since $A_1$ does not add non-zero terms).

Step 2. Consider $A_2$. The contribution from the $x$'s in this set to $\hat\delta$ reduces to the sum (recall the symmetry of the kernel)

$$-\frac{2}{N(N-1)}\sum_{x_i\in A_2}\frac{1}{h^{k+1}}K'\!\left(\frac{x_i-x_j}{h}\right)(y_i-y_j).$$

Since, in view of the result in Step 1, the $x_j$ that is close to $x_i$ is, with high probability, in $A_2$, we consider

$$\hat\delta_{A_2}=-\frac{2}{N(N-1)}\sum_{\substack{x_{i},x_{j}\in A_2\\ i=1,\dots,N;\ j=i+1,\dots,N}}\frac{1}{h^{k+1}}K'\!\left(\frac{x_i-x_j}{h}\right)(y_i-y_j).\qquad(A.8)$$

The terms in (A.8) are i.i.d. (note that where a pair $x_i$, $x_j$ is not in $A_2$, the contribution to the sum is zero). The second moments of these terms were derived in Lemma 1. (Note that for cross-products only the terms of the form

$$\frac{(N-2)(N-3)}{(N-1)^2h^{2(k+1)}}E\left[y_{i_1}y_{i_2}K'\!\left(\tfrac{x_{i_1}-x_{j_1}}{h}\right)K'\!\left(\tfrac{x_{i_2}-x_{j_2}}{h}\right)^TI(j_1\neq j_2\neq i_1\neq i_2)\right]$$

are relevant, since in $A_2$ the terms are independent, or something of that sort, so that the variance will reflect the rate.) To be continued...

Lemma 2. Given Assumptions 1–4, the variance of $\bar\delta_N$ can be represented as

$$Var(\bar\delta_N)=\sum_{t_1,s_1}\sum_{t_2,s_2}a_{t_1,s_1}a_{t_2,s_2}\,Cov\!\left(\hat\delta_N(K_{t_1},h_{s_1}),\hat\delta_N(K_{t_2},h_{s_2})\right)\approx\sum a_{j_1}a_{j_2}\,\Sigma_{j_1j_2}.$$


More information

Consider a function f we ll specify which assumptions we need to make about it in a minute. Let us reformulate the integral. 1 f(x) dx.

Consider a function f we ll specify which assumptions we need to make about it in a minute. Let us reformulate the integral. 1 f(x) dx. Capter 2 Integrals as sums and derivatives as differences We now switc to te simplest metods for integrating or differentiating a function from its function samples. A careful study of Taylor expansions

More information

IEOR 165 Lecture 10 Distribution Estimation

IEOR 165 Lecture 10 Distribution Estimation IEOR 165 Lecture 10 Distribution Estimation 1 Motivating Problem Consider a situation were we ave iid data x i from some unknown distribution. One problem of interest is estimating te distribution tat

More information

EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS

EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS Statistica Sinica 24 2014, 395-414 doi:ttp://dx.doi.org/10.5705/ss.2012.064 EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS Jun Sao 1,2 and Seng Wang 3 1 East Cina Normal University,

More information

Order of Accuracy. ũ h u Ch p, (1)

Order of Accuracy. ũ h u Ch p, (1) Order of Accuracy 1 Terminology We consider a numerical approximation of an exact value u. Te approximation depends on a small parameter, wic can be for instance te grid size or time step in a numerical

More information

Homework 1 Due: Wednesday, September 28, 2016

Homework 1 Due: Wednesday, September 28, 2016 0-704 Information Processing and Learning Fall 06 Homework Due: Wednesday, September 8, 06 Notes: For positive integers k, [k] := {,..., k} denotes te set of te first k positive integers. Wen p and Y q

More information

2.8 The Derivative as a Function

2.8 The Derivative as a Function .8 Te Derivative as a Function Typically, we can find te derivative of a function f at many points of its domain: Definition. Suppose tat f is a function wic is differentiable at every point of an open

More information

Simple Estimators for Monotone Index Models

Simple Estimators for Monotone Index Models Simple Estimators for Monotone Index Models Hyungtaik Ahn Dongguk University, Hidehiko Ichimura University College London, James L. Powell University of California, Berkeley (powell@econ.berkeley.edu)

More information

Bandwidth Selection in Nonparametric Kernel Testing

Bandwidth Selection in Nonparametric Kernel Testing Te University of Adelaide Scool of Economics Researc Paper No. 2009-0 January 2009 Bandwidt Selection in Nonparametric ernel Testing Jiti Gao and Irene Gijbels Bandwidt Selection in Nonparametric ernel

More information

Logistic Kernel Estimator and Bandwidth Selection. for Density Function

Logistic Kernel Estimator and Bandwidth Selection. for Density Function International Journal of Contemporary Matematical Sciences Vol. 13, 2018, no. 6, 279-286 HIKARI Ltd, www.m-ikari.com ttps://doi.org/10.12988/ijcms.2018.81133 Logistic Kernel Estimator and Bandwidt Selection

More information

MVT and Rolle s Theorem

MVT and Rolle s Theorem AP Calculus CHAPTER 4 WORKSHEET APPLICATIONS OF DIFFERENTIATION MVT and Rolle s Teorem Name Seat # Date UNLESS INDICATED, DO NOT USE YOUR CALCULATOR FOR ANY OF THESE QUESTIONS In problems 1 and, state

More information

Fast Exact Univariate Kernel Density Estimation

Fast Exact Univariate Kernel Density Estimation Fast Exact Univariate Kernel Density Estimation David P. Hofmeyr Department of Statistics and Actuarial Science, Stellenbosc University arxiv:1806.00690v2 [stat.co] 12 Jul 2018 July 13, 2018 Abstract Tis

More information

The derivative function

The derivative function Roberto s Notes on Differential Calculus Capter : Definition of derivative Section Te derivative function Wat you need to know already: f is at a point on its grap and ow to compute it. Wat te derivative

More information

Solution. Solution. f (x) = (cos x)2 cos(2x) 2 sin(2x) 2 cos x ( sin x) (cos x) 4. f (π/4) = ( 2/2) ( 2/2) ( 2/2) ( 2/2) 4.

Solution. Solution. f (x) = (cos x)2 cos(2x) 2 sin(2x) 2 cos x ( sin x) (cos x) 4. f (π/4) = ( 2/2) ( 2/2) ( 2/2) ( 2/2) 4. December 09, 20 Calculus PracticeTest s Name: (4 points) Find te absolute extrema of f(x) = x 3 0 on te interval [0, 4] Te derivative of f(x) is f (x) = 3x 2, wic is zero only at x = 0 Tus we only need

More information

SECTION 3.2: DERIVATIVE FUNCTIONS and DIFFERENTIABILITY

SECTION 3.2: DERIVATIVE FUNCTIONS and DIFFERENTIABILITY (Section 3.2: Derivative Functions and Differentiability) 3.2.1 SECTION 3.2: DERIVATIVE FUNCTIONS and DIFFERENTIABILITY LEARNING OBJECTIVES Know, understand, and apply te Limit Definition of te Derivative

More information

Volume 29, Issue 3. Existence of competitive equilibrium in economies with multi-member households

Volume 29, Issue 3. Existence of competitive equilibrium in economies with multi-member households Volume 29, Issue 3 Existence of competitive equilibrium in economies wit multi-member ouseolds Noriisa Sato Graduate Scool of Economics, Waseda University Abstract Tis paper focuses on te existence of

More information

New Distribution Theory for the Estimation of Structural Break Point in Mean

New Distribution Theory for the Estimation of Structural Break Point in Mean New Distribution Teory for te Estimation of Structural Break Point in Mean Liang Jiang Singapore Management University Xiaou Wang Te Cinese University of Hong Kong Jun Yu Singapore Management University

More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics 1

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics 1 Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics 1 By Jiti Gao 2 and Maxwell King 3 Abstract We propose a simultaneous model specification procedure for te conditional

More information

1 Calculus. 1.1 Gradients and the Derivative. Q f(x+h) f(x)

1 Calculus. 1.1 Gradients and the Derivative. Q f(x+h) f(x) Calculus. Gradients and te Derivative Q f(x+) δy P T δx R f(x) 0 x x+ Let P (x, f(x)) and Q(x+, f(x+)) denote two points on te curve of te function y = f(x) and let R denote te point of intersection of

More information

A Simple Matching Method for Estimating Sample Selection Models Using Experimental Data

A Simple Matching Method for Estimating Sample Selection Models Using Experimental Data ANNALS OF ECONOMICS AND FINANCE 6, 155 167 (2005) A Simple Matcing Metod for Estimating Sample Selection Models Using Experimental Data Songnian Cen Te Hong Kong University of Science and Tecnology and

More information

3.4 Worksheet: Proof of the Chain Rule NAME

3.4 Worksheet: Proof of the Chain Rule NAME Mat 1170 3.4 Workseet: Proof of te Cain Rule NAME Te Cain Rule So far we are able to differentiate all types of functions. For example: polynomials, rational, root, and trigonometric functions. We are

More information

Combining functions: algebraic methods

Combining functions: algebraic methods Combining functions: algebraic metods Functions can be added, subtracted, multiplied, divided, and raised to a power, just like numbers or algebra expressions. If f(x) = x 2 and g(x) = x + 2, clearly f(x)

More information

2.3 Product and Quotient Rules

2.3 Product and Quotient Rules .3. PRODUCT AND QUOTIENT RULES 75.3 Product and Quotient Rules.3.1 Product rule Suppose tat f and g are two di erentiable functions. Ten ( g (x)) 0 = f 0 (x) g (x) + g 0 (x) See.3.5 on page 77 for a proof.

More information

INFINITE ORDER CROSS-VALIDATED LOCAL POLYNOMIAL REGRESSION. 1. Introduction

INFINITE ORDER CROSS-VALIDATED LOCAL POLYNOMIAL REGRESSION. 1. Introduction INFINITE ORDER CROSS-VALIDATED LOCAL POLYNOMIAL REGRESSION PETER G. HALL AND JEFFREY S. RACINE Abstract. Many practical problems require nonparametric estimates of regression functions, and local polynomial

More information

4. The slope of the line 2x 7y = 8 is (a) 2/7 (b) 7/2 (c) 2 (d) 2/7 (e) None of these.

4. The slope of the line 2x 7y = 8 is (a) 2/7 (b) 7/2 (c) 2 (d) 2/7 (e) None of these. Mat 11. Test Form N Fall 016 Name. Instructions. Te first eleven problems are wort points eac. Te last six problems are wort 5 points eac. For te last six problems, you must use relevant metods of algebra

More information

Boosting Kernel Density Estimates: a Bias Reduction. Technique?

Boosting Kernel Density Estimates: a Bias Reduction. Technique? Boosting Kernel Density Estimates: a Bias Reduction Tecnique? Marco Di Marzio Dipartimento di Metodi Quantitativi e Teoria Economica, Università di Cieti-Pescara, Viale Pindaro 42, 65127 Pescara, Italy

More information

A Jump-Preserving Curve Fitting Procedure Based On Local Piecewise-Linear Kernel Estimation

A Jump-Preserving Curve Fitting Procedure Based On Local Piecewise-Linear Kernel Estimation A Jump-Preserving Curve Fitting Procedure Based On Local Piecewise-Linear Kernel Estimation Peiua Qiu Scool of Statistics University of Minnesota 313 Ford Hall 224 Curc St SE Minneapolis, MN 55455 Abstract

More information

Polynomial Functions. Linear Functions. Precalculus: Linear and Quadratic Functions

Polynomial Functions. Linear Functions. Precalculus: Linear and Quadratic Functions Concepts: definition of polynomial functions, linear functions tree representations), transformation of y = x to get y = mx + b, quadratic functions axis of symmetry, vertex, x-intercepts), transformations

More information

INTRODUCTION TO CALCULUS LIMITS

INTRODUCTION TO CALCULUS LIMITS Calculus can be divided into two ke areas: INTRODUCTION TO CALCULUS Differential Calculus dealing wit its, rates of cange, tangents and normals to curves, curve sketcing, and applications to maima and

More information

SECTION 1.10: DIFFERENCE QUOTIENTS LEARNING OBJECTIVES

SECTION 1.10: DIFFERENCE QUOTIENTS LEARNING OBJECTIVES (Section.0: Difference Quotients).0. SECTION.0: DIFFERENCE QUOTIENTS LEARNING OBJECTIVES Define average rate of cange (and average velocity) algebraically and grapically. Be able to identify, construct,

More information

Simple and Powerful GMM Over-identi cation Tests with Accurate Size

Simple and Powerful GMM Over-identi cation Tests with Accurate Size Simple and Powerful GMM Over-identi cation ests wit Accurate Size Yixiao Sun and Min Seong Kim Department of Economics, University of California, San Diego is version: August, 2 Abstract e paper provides

More information

Poisson Equation in Sobolev Spaces

Poisson Equation in Sobolev Spaces Poisson Equation in Sobolev Spaces OcMountain Dayligt Time. 6, 011 Today we discuss te Poisson equation in Sobolev spaces. It s existence, uniqueness, and regularity. Weak Solution. u = f in, u = g on

More information

LECTURE 14 NUMERICAL INTEGRATION. Find

LECTURE 14 NUMERICAL INTEGRATION. Find LECTURE 14 NUMERCAL NTEGRATON Find b a fxdx or b a vx ux fx ydy dx Often integration is required. However te form of fx may be suc tat analytical integration would be very difficult or impossible. Use

More information

The Complexity of Computing the MCD-Estimator

The Complexity of Computing the MCD-Estimator Te Complexity of Computing te MCD-Estimator Torsten Bernolt Lerstul Informatik 2 Universität Dortmund, Germany torstenbernolt@uni-dortmundde Paul Fiscer IMM, Danisc Tecnical University Kongens Lyngby,

More information

Integral Calculus, dealing with areas and volumes, and approximate areas under and between curves.

Integral Calculus, dealing with areas and volumes, and approximate areas under and between curves. Calculus can be divided into two ke areas: Differential Calculus dealing wit its, rates of cange, tangents and normals to curves, curve sketcing, and applications to maima and minima problems Integral

More information

1. State whether the function is an exponential growth or exponential decay, and describe its end behaviour using limits.

1. State whether the function is an exponential growth or exponential decay, and describe its end behaviour using limits. Questions 1. State weter te function is an exponential growt or exponential decay, and describe its end beaviour using its. (a) f(x) = 3 2x (b) f(x) = 0.5 x (c) f(x) = e (d) f(x) = ( ) x 1 4 2. Matc te

More information

MA455 Manifolds Solutions 1 May 2008

MA455 Manifolds Solutions 1 May 2008 MA455 Manifolds Solutions 1 May 2008 1. (i) Given real numbers a < b, find a diffeomorpism (a, b) R. Solution: For example first map (a, b) to (0, π/2) and ten map (0, π/2) diffeomorpically to R using

More information

Differential Calculus (The basics) Prepared by Mr. C. Hull

Differential Calculus (The basics) Prepared by Mr. C. Hull Differential Calculus Te basics) A : Limits In tis work on limits, we will deal only wit functions i.e. tose relationsips in wic an input variable ) defines a unique output variable y). Wen we work wit

More information

Section 2: The Derivative Definition of the Derivative

Section 2: The Derivative Definition of the Derivative Capter 2 Te Derivative Applied Calculus 80 Section 2: Te Derivative Definition of te Derivative Suppose we drop a tomato from te top of a 00 foot building and time its fall. Time (sec) Heigt (ft) 0.0 00

More information

5.1 We will begin this section with the definition of a rational expression. We

5.1 We will begin this section with the definition of a rational expression. We Basic Properties and Reducing to Lowest Terms 5.1 We will begin tis section wit te definition of a rational epression. We will ten state te two basic properties associated wit rational epressions and go

More information

Symmetry Labeling of Molecular Energies

Symmetry Labeling of Molecular Energies Capter 7. Symmetry Labeling of Molecular Energies Notes: Most of te material presented in tis capter is taken from Bunker and Jensen 1998, Cap. 6, and Bunker and Jensen 2005, Cap. 7. 7.1 Hamiltonian Symmetry

More information

1. Questions (a) through (e) refer to the graph of the function f given below. (A) 0 (B) 1 (C) 2 (D) 4 (E) does not exist

1. Questions (a) through (e) refer to the graph of the function f given below. (A) 0 (B) 1 (C) 2 (D) 4 (E) does not exist Mat 1120 Calculus Test 2. October 18, 2001 Your name Te multiple coice problems count 4 points eac. In te multiple coice section, circle te correct coice (or coices). You must sow your work on te oter

More information

Math 2921, spring, 2004 Notes, Part 3. April 2 version, changes from March 31 version starting on page 27.. Maps and di erential equations

Math 2921, spring, 2004 Notes, Part 3. April 2 version, changes from March 31 version starting on page 27.. Maps and di erential equations Mat 9, spring, 4 Notes, Part 3. April version, canges from Marc 3 version starting on page 7.. Maps and di erential equations Horsesoe maps and di erential equations Tere are two main tecniques for detecting

More information

Precalculus Test 2 Practice Questions Page 1. Note: You can expect other types of questions on the test than the ones presented here!

Precalculus Test 2 Practice Questions Page 1. Note: You can expect other types of questions on the test than the ones presented here! Precalculus Test 2 Practice Questions Page Note: You can expect oter types of questions on te test tan te ones presented ere! Questions Example. Find te vertex of te quadratic f(x) = 4x 2 x. Example 2.

More information

Exam 1 Review Solutions

Exam 1 Review Solutions Exam Review Solutions Please also review te old quizzes, and be sure tat you understand te omework problems. General notes: () Always give an algebraic reason for your answer (graps are not sufficient),

More information

Math 1210 Midterm 1 January 31st, 2014

Math 1210 Midterm 1 January 31st, 2014 Mat 110 Midterm 1 January 1st, 01 Tis exam consists of sections, A and B. Section A is conceptual, wereas section B is more computational. Te value of every question is indicated at te beginning of it.

More information

Continuity and Differentiability Worksheet

Continuity and Differentiability Worksheet Continuity and Differentiability Workseet (Be sure tat you can also do te grapical eercises from te tet- Tese were not included below! Typical problems are like problems -3, p. 6; -3, p. 7; 33-34, p. 7;

More information

On Local Linear Regression Estimation of Finite Population Totals in Model Based Surveys

On Local Linear Regression Estimation of Finite Population Totals in Model Based Surveys American Journal of Teoretical and Applied Statistics 2018; 7(3): 92-101 ttp://www.sciencepublisinggroup.com/j/ajtas doi: 10.11648/j.ajtas.20180703.11 ISSN: 2326-8999 (Print); ISSN: 2326-9006 (Online)

More information

Fast optimal bandwidth selection for kernel density estimation

Fast optimal bandwidth selection for kernel density estimation Fast optimal bandwidt selection for kernel density estimation Vikas Candrakant Raykar and Ramani Duraiswami Dept of computer science and UMIACS, University of Maryland, CollegePark {vikas,ramani}@csumdedu

More information

Bootstrap prediction intervals for Markov processes

Bootstrap prediction intervals for Markov processes arxiv: arxiv:0000.0000 Bootstrap prediction intervals for Markov processes Li Pan and Dimitris N. Politis Li Pan Department of Matematics University of California San Diego La Jolla, CA 92093-0112, USA

More information

MC3: Econometric Theory and Methods. Course Notes 4

MC3: Econometric Theory and Methods. Course Notes 4 University College London Department of Economics M.Sc. in Economics MC3: Econometric Theory and Methods Course Notes 4 Notes on maximum likelihood methods Andrew Chesher 25/0/2005 Course Notes 4, Andrew

More information

MATH1151 Calculus Test S1 v2a

MATH1151 Calculus Test S1 v2a MATH5 Calculus Test 8 S va January 8, 5 Tese solutions were written and typed up by Brendan Trin Please be etical wit tis resource It is for te use of MatSOC members, so do not repost it on oter forums

More information

5.1 introduction problem : Given a function f(x), find a polynomial approximation p n (x).

5.1 introduction problem : Given a function f(x), find a polynomial approximation p n (x). capter 5 : polynomial approximation and interpolation 5 introduction problem : Given a function f(x), find a polynomial approximation p n (x) Z b Z application : f(x)dx b p n(x)dx, a a one solution : Te

More information

DEPARTMENT MATHEMATIK SCHWERPUNKT MATHEMATISCHE STATISTIK UND STOCHASTISCHE PROZESSE

DEPARTMENT MATHEMATIK SCHWERPUNKT MATHEMATISCHE STATISTIK UND STOCHASTISCHE PROZESSE U N I V E R S I T Ä T H A M B U R G A note on residual-based empirical likeliood kernel density estimation Birte Musal and Natalie Neumeyer Preprint No. 2010-05 May 2010 DEPARTMENT MATHEMATIK SCHWERPUNKT

More information

Section 3: The Derivative Definition of the Derivative

Section 3: The Derivative Definition of the Derivative Capter 2 Te Derivative Business Calculus 85 Section 3: Te Derivative Definition of te Derivative Returning to te tangent slope problem from te first section, let's look at te problem of finding te slope

More information

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA EXAMINATION MODULE 5

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA EXAMINATION MODULE 5 THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA EXAMINATION NEW MODULAR SCHEME introduced from te examinations in 009 MODULE 5 SOLUTIONS FOR SPECIMEN PAPER B THE QUESTIONS ARE CONTAINED IN A SEPARATE FILE

More information

EFFICIENT REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLING

EFFICIENT REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLING Statistica Sinica 13(2003), 641-653 EFFICIENT REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLING J. K. Kim and R. R. Sitter Hankuk University of Foreign Studies and Simon Fraser University Abstract:

More information

A New Diagnostic Test for Cross Section Independence in Nonparametric Panel Data Model

A New Diagnostic Test for Cross Section Independence in Nonparametric Panel Data Model e University of Adelaide Scool of Economics Researc Paper No. 2009-6 October 2009 A New Diagnostic est for Cross Section Independence in Nonparametric Panel Data Model Jia Cen, Jiti Gao and Degui Li e

More information

Lecture 21. Numerical differentiation. f ( x+h) f ( x) h h

Lecture 21. Numerical differentiation. f ( x+h) f ( x) h h Lecture Numerical differentiation Introduction We can analytically calculate te derivative of any elementary function, so tere migt seem to be no motivation for calculating derivatives numerically. However

More information

5 Ordinary Differential Equations: Finite Difference Methods for Boundary Problems

5 Ordinary Differential Equations: Finite Difference Methods for Boundary Problems 5 Ordinary Differential Equations: Finite Difference Metods for Boundary Problems Read sections 10.1, 10.2, 10.4 Review questions 10.1 10.4, 10.8 10.9, 10.13 5.1 Introduction In te previous capters we

More information

Pre-Calculus Review Preemptive Strike

Pre-Calculus Review Preemptive Strike Pre-Calculus Review Preemptive Strike Attaced are some notes and one assignment wit tree parts. Tese are due on te day tat we start te pre-calculus review. I strongly suggest reading troug te notes torougly

More information

POLYNOMIAL AND SPLINE ESTIMATORS OF THE DISTRIBUTION FUNCTION WITH PRESCRIBED ACCURACY

POLYNOMIAL AND SPLINE ESTIMATORS OF THE DISTRIBUTION FUNCTION WITH PRESCRIBED ACCURACY APPLICATIONES MATHEMATICAE 36, (29), pp. 2 Zbigniew Ciesielski (Sopot) Ryszard Zieliński (Warszawa) POLYNOMIAL AND SPLINE ESTIMATORS OF THE DISTRIBUTION FUNCTION WITH PRESCRIBED ACCURACY Abstract. Dvoretzky

More information

LIMITS AND DERIVATIVES CONDITIONS FOR THE EXISTENCE OF A LIMIT

LIMITS AND DERIVATIVES CONDITIONS FOR THE EXISTENCE OF A LIMIT LIMITS AND DERIVATIVES Te limit of a function is defined as te value of y tat te curve approaces, as x approaces a particular value. Te limit of f (x) as x approaces a is written as f (x) approaces, as

More information

Average Rate of Change

Average Rate of Change Te Derivative Tis can be tougt of as an attempt to draw a parallel (pysically and metaporically) between a line and a curve, applying te concept of slope to someting tat isn't actually straigt. Te slope

More information

Function Composition and Chain Rules

Function Composition and Chain Rules Function Composition and s James K. Peterson Department of Biological Sciences and Department of Matematical Sciences Clemson University Marc 8, 2017 Outline 1 Function Composition and Continuity 2 Function

More information

estimate results from a recursive sceme tat generalizes te algoritms of Efron (967), Turnbull (976) and Li et al (997) by kernel smooting te data at e

estimate results from a recursive sceme tat generalizes te algoritms of Efron (967), Turnbull (976) and Li et al (997) by kernel smooting te data at e A kernel density estimate for interval censored data Tierry Ducesne and James E Staord y Abstract In tis paper we propose a kernel density estimate for interval-censored data It retains te simplicity andintuitive

More information

1 Proving the Fundamental Theorem of Statistical Learning

1 Proving the Fundamental Theorem of Statistical Learning THEORETICAL MACHINE LEARNING COS 5 LECTURE #7 APRIL 5, 6 LECTURER: ELAD HAZAN NAME: FERMI MA ANDDANIEL SUO oving te Fundaental Teore of Statistical Learning In tis section, we prove te following: Teore.

More information

Taylor Series and the Mean Value Theorem of Derivatives

Taylor Series and the Mean Value Theorem of Derivatives 1 - Taylor Series and te Mean Value Teorem o Derivatives Te numerical solution o engineering and scientiic problems described by matematical models oten requires solving dierential equations. Dierential

More information