A Flexible Nonparametric Test for Conditional Independence


A Flexible Nonparametric Test for Conditional Independence

Meng Huang (Freddie Mac), Yixiao Sun and Halbert White (UC San Diego)

August 5, 2015

Abstract

This paper proposes a nonparametric test for conditional independence that is easy to implement, yet powerful in the sense that it is consistent and achieves $n^{-1/2}$ local power. The test statistic is based on an estimator of the topological distance between restricted and unrestricted probability measures corresponding to conditional independence or its absence. The distance is evaluated using a family of Generically Comprehensively Revealing (GCR) functions, such as the exponential or logistic functions, which are indexed by nuisance parameters. The use of GCR functions makes the test able to detect any deviation from the null. We use a kernel smoothing method when estimating the distance. An integrated conditional moment (ICM) test statistic based on these estimates is obtained by integrating out the nuisance parameters. We simulate the critical values using a conditional simulation approach. Monte Carlo experiments show that the test performs well in finite samples. As an application, we test an implication of the key assumption of unconfoundedness in the context of estimating the returns to schooling.

We thank Graham Elliott, Dimitris Politis, Patrick Fitzsimmons, Jin Seo Cho, Liangjun Su, James Hamilton, Andres Santos, Brendan Beare, and seminar participants at the University of California San Diego, the Hong Kong University of Science and Technology, the Peking University Guanghua School of Management, and Bates White, LLC for helpful comments and suggestions. We are equally grateful for the constructive comments from Yuichi Kitamura, the co-editor, and two anonymous referees. Special thanks to Liangjun Su for sharing computer programs. Address correspondence to Meng Huang, Freddie Mac, 55 Park Run Dr., McLean, VA; nkuangmeng@gmail.com; or to Yixiao Sun, Department of Economics 0508, University of California, San Diego, La Jolla, CA 92093; yisun@ucsd.edu.

Running head: Nonparametric Test for Conditional Independence

1 Introduction

In this paper, we propose a flexible nonparametric test for conditional independence. Let $X$, $Y$, and $Z$ be three random vectors. The null hypothesis we want to test is that $Y$ is independent of $X$ given $Z$, denoted by $Y \perp X \mid Z$. Intuitively, this means that, given the information in $Z$, $X$ cannot provide additional information useful in predicting $Y$. Dawid (1979) showed that some simple heuristic properties of conditional independence can form a conceptual framework for many important topics in statistical inference: sufficiency and ancillarity, parameter identification, causal inference, prediction sufficiency, data selection mechanisms, invariant statistical models, and a subjectivist approach to model-building.

An important application of conditional independence testing in economics is to test a key assumption identifying causal effects. Suppose we are interested in estimating the effect of $X$ (e.g., schooling) on $Y$ (e.g., income), and that $X$ and $Y$ are related by the equation $Y = \alpha + \beta X + U$, where $U$ (e.g., ability) is an unobserved cause of $Y$ (income) and $\alpha$ and $\beta$ are unknown coefficients, with $\beta$ representing the effect of $X$ on $Y$. (We write a linear structural equation here merely for concreteness.) Since $X$ is typically not randomly assigned and is correlated with $U$ (e.g., unobserved ability will affect both schooling and income), OLS will generally fail to consistently estimate $\beta$. Nevertheless, if, as in Griliches and Mason (1972) and Griliches (1977), we can find a set of covariates $Z$ (e.g., proxies for ability, such as AFQT scores) such that

$$U \perp X \mid Z, \qquad (1)$$

we can estimate $\beta$ consistently by various methods: covariate adjustment, matching, methods using the propensity score such as weighting and blocking, or combinations of these approaches. The assumption in (1) is a key assumption for identifying $\beta$. It is called a conditional exogeneity assumption by White and Chalak (2008). It enforces the ignorability or unconfoundedness condition, also known as selection on observables (Barnow, Cain, and Goldberger, 1980).

Note that the conditional independence assumption in (1) cannot be directly tested, since $U$ is unobservable. But if there are other observable covariates $V$ satisfying certain conditions (see White and Chalak), we have

$$U \perp X \mid Z \ \text{ implies } \ V \perp X \mid Z,$$

so we can test the assumption in (1) by testing its implication, $V \perp X \mid Z$. Section 6 of this paper applies this test in the context of a nonparametric study of returns to schooling.

In the literature, there are many tests for conditional independence when the variables are categorical. However, in economic applications it is common to condition on continuous variables, and there are only a few nonparametric tests for the continuous case. Previous work on testing conditional independence for continuous random variables includes Linton and Gozalo (1997, "LG"), Fernandes and Flores (1999, "FF"), and Delgado and González-Manteiga (2001, "DG"). Su and White ("SW") have several papers (2007, 2008, among others) addressing this question. Although SW's tests are consistent against any deviation from the null, they are only able to detect local alternatives converging to the null at a rate slower than $n^{-1/2}$ and hence suffer from the curse of dimensionality.

We will compare our test with the LG, DG and SW tests in our simulation study. Recently, Song (2009) has proposed a distribution-free conditional independence test of two continuous random variables given a parametric single index that achieves the local $n^{-1/2}$ rate. Specifically, Song (2009) tests the hypothesis $Y \perp X \mid \lambda_\theta(Z)$, where $\lambda_\theta(\cdot)$ is a scalar-valued function known up to a finite-dimensional parameter $\theta$, which must be estimated.

A main contribution here is that our proposed test also achieves $n^{-1/2}$ local power, despite its fully nonparametric nature. In contrast to Song (2009), the conditioning variables can be multidimensional. The test is motivated by a series of papers on consistent specification testing by Bierens (1982, 1990), Bierens and Ploberger (1997), and Stinchcombe and White (1998, "StW"), among others. Whereas Bierens (1982, 1990) and Bierens and Ploberger (1997) construct integrated conditional moment (ICM) tests that essentially compare a restricted parametric model with an unrestricted regression model, the test in this paper follows a suggestion of StW, which is based on estimates of the topological distance between unrestricted and restricted probability measures, corresponding to conditional independence or its absence. This distance is measured indirectly by a family of moments, which are the differences of the expectations under the null and under the alternative for a set of test functions. The chosen test functions make use of Generically Comprehensively Revealing (GCR) functions, such as the logistic or normal cumulative distribution functions (CDFs), and are indexed by a continuous nuisance parameter vector $\theta$. Under the null, all moments are zero. Under the alternative, the moments are nonzero for essentially all choices of $\theta$. This is in contrast with DG (2001), which employs an indicator testing function that is not generically comprehensively revealing.

We estimate these moments by their sample analogs, using kernel smoothing. An ICM test statistic based on these is obtained by integrating out the nuisance parameters. Its limiting null distribution is a functional of a mean zero Gaussian process. We simulate critical values using a conditional simulation approach suggested by Hansen (1996) in a different setting.

Our GCR approach requires bounded random variables. When any random variable is not bounded, we first standardize it on the basis of estimated location and scale parameters and then apply a bounded and invertible transformation to the standardized data. The location and scale parameters can be the mean and standard deviation or other more robust measures such as the median and interquartile range, respectively.

The plan of the paper is as follows. In Section 2, we specify a family of moment conditions which is (essentially) equivalent to the null hypothesis of conditional independence and forms a basis for our test. In Section 3, we establish stochastic approximations of the empirical moment conditions uniformly over the nuisance parameters. We derive the finite-dimensional weak convergence of the empirical moment process. We also provide a bandwidth choice for practical use: a simple plug-in estimator of the MSE-optimal bandwidth. In Section 4, we formally introduce and analyze our ICM test statistic. In Section 5, we report some Monte Carlo results, examining the size and power properties of our test and comparing its performance with that of a variety of other tests in the literature. In Section 6, we study the returns to schooling, using the proposed statistic to test an implication of the key assumption of unconfoundedness. The last section concludes. The appendix contains the proofs of the main results and shows that the estimation errors in the location and scale parameters have no impact on our asymptotic theory.

2 The Null Hypothesis and the Testing Approach

2.1 The Null Hypothesis

Let $X$, $Y$, and $Z$ be three random vectors, with dimensions $d_X$, $d_Y$, and $d_Z$, respectively. Denote $W = (X', Y', Z')' \in \mathbb{R}^d$ with $d = d_X + d_Y + d_Z$. Given an IID sample $\{X_i, Y_i, Z_i\}_{i=1}^{n}$, we want to test the null that $Y$ is independent of $X$ conditional on $Z$, i.e.,

$$H_0: Y \perp X \mid Z, \qquad (2)$$

against the alternative that $Y$ and $X$ are dependent conditional on $Z$, i.e., $H_a: Y \not\perp X \mid Z$.

Let $F_{Y|XZ}(y \mid x, z)$ be the conditional distribution function of $Y$ given $(X, Z) = (x, z)$ and $F_{Y|Z}(y \mid z)$ be the conditional distribution function of $Y$ given $Z = z$. Then we can express the null as

$$F_{Y|XZ}(y \mid x, z) = F_{Y|Z}(y \mid z). \qquad (3)$$

The following three expressions are equivalent to one another and to (3):

$$F_{X|YZ}(x \mid y, z) = F_{X|Z}(x \mid z), \qquad (4)$$
$$F_{XY|Z}(x, y \mid z) = F_{X|Z}(x \mid z)\, F_{Y|Z}(y \mid z), \qquad (5)$$
$$F_{XYZ}(x, y, z)\, f_Z(z) = F_{XZ}(x, z)\, F_{YZ}(y, z), \qquad (6)$$

where we have used the standard notations for distribution functions.

The approach adopted in this paper is inspired by a series of papers on consistent specification testing: Bierens (1982, 1990), Bierens and Ploberger (1997), and StW, among others. The tests in those papers are based on an infinite number of moment conditions indexed by nuisance parameters. Consider, as an example, the conditional mean function $g(x) = E(Y \mid X = x)$. Bierens (1990) tests the hypothesis that the parametric functional form $f(x, \beta)$ is correctly specified in the sense that $g(x) = f(x, \beta_0)$ for some $\beta_0$. The test statistic is based on an estimator of a family of moments $E[(Y - f(X, \beta_0))\, e^{X'\theta}]$ indexed by a nuisance parameter vector $\theta$. Under the null hypothesis of correct specification, these moments are zero for all $\theta$. A lemma in Bierens (1990) shows that the converse essentially holds, due to the properties of the exponential function, making the test capable of detecting all deviations from the null. A similar idea is used in Bierens and Wang (2012) for testing a parametric specification of the conditional distribution function.

StW find that a broader class of functions has this property. They extend Bierens's result by replacing the exponential function in the moment conditions with any GCR function, and by extending the probability measures considered in the Bierens (1990) approach to signed measures. As stated in StW, GCR functions include non-polynomial real analytic functions, e.g., the exponential function, the logistic CDF, sine, and cosine, and also some nonanalytic functions like the normal CDF or its density. Further, they point out that such specification tests are based on estimates of topological distances between a restricted model and an unrestricted model. Following this idea, we can construct a test for conditional independence based on estimates of a topological distance between unrestricted and restricted probability measures corresponding to conditional independence or its absence.

To define the GCR property formally, let $C(F)$ be the set of continuous functions on a compact set $F \subset \mathbb{R}^d$, and let $\mathrm{sp}[H_\varphi]$ be the linear span of a collection of functions $H_\varphi$. We write $\tilde w := (1, w')'$. The definition below is the same as Definition 3.6 in StW.

Definition 1 (StW, Definition 3.6) We say that $H_\varphi = \{H: \mathbb{R}^d \to \mathbb{R} \mid H(w) = \varphi(\tilde w'\theta),\ \theta \in \Theta \subset \mathbb{R}^{1+d}\}$ is generically comprehensively revealing if for all $\Theta$ with non-empty interior, the uniform closure of $\mathrm{sp}[H_\varphi]$ contains $C(F)$ for every compact set $F \subset \mathbb{R}^d$.

Intuitively, GCR functions are a class of functions indexed by $\theta$ whose span comes arbitrarily close to any continuous function, regardless of the choice of $\Theta$, as long as it has nonempty interior. When there is no confusion, we simply call $\varphi$ GCR if the generated $H_\varphi$ is GCR.

We now establish an equivalent hypothesis in the form of a family of moment conditions following StW. Let $P$ be the joint distribution of the random vector $W$, and let $Q$ be the joint distribution of $W$ with $Y \perp X \mid Z$. Thus, $P$ is an unrestricted probability measure, whereas $Q$ is restricted. To be specific, $P$ and $Q$ are defined such that for any event $A$,

$$P(A) = \int_A dF_{XYZ}(x, y, z) = \int_A dF_{XY|Z}(x, y \mid z)\, dF_Z(z) \qquad (7)$$

and

$$Q(A) = \int_A dF_{X|Z}(x \mid z)\, dF_{Y|Z}(y \mid z)\, dF_Z(z). \qquad (8)$$

Note that the measure $P$ will be the same as the measure $Q$ if and only if the null is true:

$$P(A) = \int_A dF_{XY|Z}(x, y \mid z)\, dF_Z(z) \overset{H_0}{=} \int_A dF_{X|Z}(x \mid z)\, dF_{Y|Z}(y \mid z)\, dF_Z(z) = Q(A)$$

for all Borel sets $A$. Testing the null hypothesis is thus equivalent to testing whether there is any deviation of $P$ from $Q$. It should be pointed out that the marginal distribution of $Z$ is the same under $P$ and $Q$ regardless of whether the null is true or not.

Let $E_P$ and $E_Q$ be the expectation operators with respect to the measure $P$ and the measure $Q$. Define

$$\Lambda_\varphi(\theta) \equiv E_P\big[\varphi(\tilde W_i'\theta)\big] - E_Q\big[\varphi(\tilde W_i'\theta)\big],$$

where $\theta = (\theta_0, \theta_1', \theta_2', \theta_3')' \in \Theta \subset \mathbb{R}^{1+d}$ is a vector of nuisance parameters, $\tilde W = (1, W')'$, and $\varphi$ is such that the indicated expectations exist for all $\theta$. Under the null hypothesis, $\Lambda_\varphi(\theta)$ is obviously zero for any choice of $\theta$ and any choice of $\varphi$, including GCR functions.

To construct a powerful test, we want $\Lambda_\varphi(\theta)$ to be nonzero under the alternative. If $\Lambda_\varphi(\theta^*)$ is not zero under some alternative, we say that $\varphi$ can detect that particular alternative for the choice $\theta = \theta^*$. An arbitrary function $\varphi$ may fail to detect some alternatives for some choices of $\theta$. Nevertheless, according to StW, if $W$ is a bounded random vector, the properties of GCR functions imply that they can detect all possible alternatives for essentially all $\theta \in \Theta \subset \mathbb{R}^{1+d}$ with $\Theta$ having non-empty interior. "Essentially all" means that the set of bad $\theta$'s, i.e., the set $\{\theta: \Lambda_\varphi(\theta) = 0 \text{ and } Y \not\perp X \mid Z\}$, has Lebesgue measure zero and is not dense in $\Theta$. Given that any deviation of $P$ from $Q$ can be detected by essentially any choice of $\theta$, testing $H_0: Y \perp X \mid Z$ is equivalent to testing

$$H_0': \Lambda_\varphi(\theta) = 0 \ \text{ for essentially all } \theta \in \Theta \qquad (9)$$

for a GCR function $\varphi$ and a set $\Theta$ with non-empty interior.

The alternative is $H_a': H_0'$ is false.

A straightforward testing approach is to estimate $\Lambda_\varphi(\theta)$ and to see how far the estimate is from zero. However, there are two technical issues. First, the result of StW requires $W$ to be bounded. To achieve the boundedness, we can replace each element $V$ of $W$ by $\phi_V(V)$, where $\phi_V$ is a bounded one-to-one mapping with Borel measurable inverse. Define $\phi_Y(Y) = (\phi_{Y_1}(Y_1), \ldots, \phi_{Y_{d_Y}}(Y_{d_Y}))'$ and define $\phi_X(X)$ and $\phi_Z(Z)$ similarly. Then $Y \perp X \mid Z$ is equivalent to $\phi_Y(Y) \perp \phi_X(X) \mid \phi_Z(Z)$. The equivalence holds because the sigma fields are not affected by the transformation. So it is innocuous to assume that $W$ has a bounded support.

In practice, we recommend choosing a bounded set, say $[a, b]^d$, to closely match the support of the GCR function we use. Depending on whether the random variables have bounded supports or not, we can achieve this using a different transformation. As an example, suppose that the support of $V$ is a bounded interval, say $[v_{\min}, v_{\max}]$ for some known finite constants $v_{\min}$ and $v_{\max}$. Then we can take

$$\phi_V(V) = a + (b - a)\,\frac{V - v_{\min}}{v_{\max} - v_{\min}}, \qquad (10)$$

which obviously meets our requirement. If the end points of the support are not known, we can estimate them by $\hat v_{\min} = \min_{i=1,\ldots,n} V_i$ and $\hat v_{\max} = \max_{i=1,\ldots,n} V_i$ and plug the estimates into (10). Under some mild conditions, $\hat v_{\min}$ and $\hat v_{\max}$ converge to the true endpoints at the fast rate of $1/n$. As a result, we can show that the estimation uncertainty has no effect on our asymptotic results.

When the random variable $V$ has an unbounded support, we first standardize it and then take

$$\phi_V(V) = a + (b - a)\left[\frac{\arctan\big((V - \mu_v)/\sigma_v\big)}{\pi} + 0.5\right], \qquad (11)$$

where $\mu_v$ and $\sigma_v$ are location and scale parameters. For example, $\mu_v$ and $\sigma_v$ can be the mean and standard deviation of $V$. By construction, $\phi_V(V) \in [a, b]$. We can also use other bounded functions such as $\phi_V(V) = a + (b - a)F((V - \mu_v)/\sigma_v)$ for a CDF $F$. The standardization ensures that $\phi_V(V)$ does not reside in a small subset of $[a, b]$. See Bierens and Wang (2012) for more discussion. When $\mu_v$ and $\sigma_v$ are not known, we can estimate them by $\hat\mu_v$ and $\hat\sigma_v$ respectively and plug them into (11). In Section 8.1, we make the transformation explicit and show that under some mild conditions, including the $\sqrt{n}$-consistency of $\hat\mu_v$ and $\hat\sigma_v$, the estimation errors in $\hat\mu_v$ and $\hat\sigma_v$ have no impact on the asymptotic properties of our proposed test. Here, for notational simplicity, we leave this transformation implicit and assume that $P(W \in [a, b]^d) = 1$. Without loss of generality, we let $a = 0$ and $b = 1$ for our theoretical development.
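For illustration, the following is a minimal sketch of the transformations in (10) and (11), assuming the sample mean and standard deviation are used as the location and scale estimates (the median and interquartile range are equally valid choices); the function name is illustrative, not part of the paper.

```python
import numpy as np

def to_bounded(v, a=0.0, b=1.0, bounded_support=False):
    """Map a univariate sample v into [a, b]."""
    v = np.asarray(v, dtype=float)
    if bounded_support:
        # Known or estimated endpoints: linear map, as in equation (10).
        v_min, v_max = v.min(), v.max()
        return a + (b - a) * (v - v_min) / (v_max - v_min)
    # Unbounded support: standardize, then apply the arctan map of equation (11).
    mu, sigma = v.mean(), v.std(ddof=1)
    return a + (b - a) * (np.arctan((v - mu) / sigma) / np.pi + 0.5)
```

Applying `to_bounded` componentwise to $X$, $Y$, and $Z$ produces data supported on $[a, b]^d$, after which the test can be computed exactly as described below.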

The second issue is related to the nonparametric estimation of $\Lambda_\varphi(\theta)$: it involves a nonparametric estimator $\hat f_Z$ of the density $f_Z$ in the denominator of the test statistic, making the analysis of limiting distributions awkward. To avoid this technical issue, we compute the expectations of $\varphi f_Z$ rather than those of $\varphi$, leading to a new distance metric between $P$ and $Q$:

$$\Lambda_{\varphi f}(\theta) = E_P\big[\varphi(\tilde W_i'\theta) f_Z(Z_i)\big] - E_Q\big[\varphi(\tilde W_i'\theta) f_Z(Z_i)\big].$$

Using the change-of-measure technique, we have

$$\Lambda_{\varphi f}(\theta) = C\left\{E_{P^*}\big[\varphi(\tilde W_i'\theta)\big] - E_{Q^*}\big[\varphi(\tilde W_i'\theta)\big]\right\},$$

where $P^*$ and $Q^*$ are probability measures defined according to

$$P^*(A) = \int_A f_Z(z)\, dF_{XY|Z}(x, y \mid z)\, dF_Z(z)/C \quad\text{and}\quad Q^*(A) = \int_A f_Z(z)\, dF_{X|Z}(x \mid z)\, dF_{Y|Z}(y \mid z)\, dF_Z(z)/C \qquad (12)$$

with $C = \int f_Z^2(z)\, dz$ being the normalizing constant. Under the null $H_0: Y \perp X \mid Z$, $P^*$ and $Q^*$ are the same measure, and so $\Lambda_{\varphi f}(\theta) = 0$ for all $\theta$. Under the alternative $H_a: Y \not\perp X \mid Z$, $P^*$ and $Q^*$ are different measures. By definition, if $\varphi$ is GCR, then its revealing property holds for any probability measure (see Definition 3.1 of StW). So under the alternative, we have $\Lambda_{\varphi f}(\theta) \neq 0$ for essentially all $\theta$. The behaviors of $\Lambda_{\varphi f}(\theta)$ under $H_0$ and $H_a$ imply that we can employ $\Lambda_{\varphi f}(\theta)$ in place of $\Lambda_\varphi(\theta)$ to perform our test.

To sum up, when $\varphi$ is a GCR function, $W$ has bounded support, $\Theta$ has non-empty interior, and $\int f_Z^2(z)\, dz < \infty$, a null hypothesis equivalent to conditional independence is

$$H_0': \Lambda_{\varphi f}(\theta) = 0 \ \text{ for essentially all } \theta \in \Theta.$$

That is, the null hypothesis of conditional independence is equivalent to a family of moment conditions indexed by $\theta$. For notational simplicity, we drop the subscript and write $\Lambda(\theta) := \Lambda_{\varphi f}(\theta)$ hereafter.

2.2 Heuristics for Rates

When the probability density functions exist, conditional independence is equivalent to any of the following:

$$f_{Y|XZ}(y \mid x, z) = f_{Y|Z}(y \mid z), \quad f_{X|YZ}(x \mid y, z) = f_{X|Z}(x \mid z), \quad f_{XY|Z}(x, y \mid z) = f_{X|Z}(x \mid z) f_{Y|Z}(y \mid z), \quad f_{XYZ}(x, y, z) f_Z(z) = f_{XZ}(x, z) f_{YZ}(y, z), \qquad (13)$$

where the notation for density functions is self-explanatory. One way to test conditional independence is to compare the densities in a given equation to see if the equality holds. For example, Su and White's (2008) test essentially compares $f_{XYZ} f_Z$ with $f_{XZ} f_{YZ}$. To do that, they estimate $f_{XYZ}$, $f_Z$, $f_{XZ}$, and $f_{YZ}$ nonparametrically, so their test has power against local alternatives only at a rate determined by the slowest of the four nonparametric density estimators, i.e., the rate for $\hat f_{XYZ}$. This rate is slower than $n^{-1/2}$ and hence reflects the curse of dimensionality. The dimension here is $d = d_X + d_Y + d_Z$, which is at least three and could potentially be larger.

To achieve the rate $n^{-1/2}$, we do not compare the density functions directly. Instead, our family of moment conditions indirectly measures the distance between $f_{XYZ} f_Z$ and $f_{XZ} f_{YZ}$, so that for each given $\theta$, the test statistic is based on an estimator of an average that can achieve an $n^{-1/2}$ rate, just as a semiparametric estimator would. To better understand the moment conditions of the equivalent null, we write

$$\Lambda(\theta) = \int \varphi(\tilde w'\theta)\, f_Z(z)\, f_{XYZ}(x, y, z)\, dx\, dy\, dz - \int \varphi(\tilde w'\theta)\, f_{YZ}(y, z)\, f_{XZ}(x, z)\, dx\, dy\, dz.$$

Instead of comparing $f_{XYZ} f_Z$ with $f_{YZ} f_{XZ}$, we now compare their integral transforms. Before the transformation, $f_{XYZ} f_Z$ and $f_{YZ} f_{XZ}$ are functions of $(x, y, z)$, the data points, and those functions can only be estimated at a nonparametric rate slower than $n^{-1/2}$. But their integral transforms are functions of $\theta$. For each $\theta$, the transformation is an average of the data, so semiparametric techniques can be used here to get an $n^{-1/2}$ rate. Essentially, we compare two functions by comparing their weighted averages. The two comparisons are equivalent because of the properties of the chosen test functions. That is, if we choose GCR functions for our test functions, defined on a compact index space $\Theta$ with non-empty interior, and we do not detect any difference between the $P^*$ and $Q^*$ transforms at essentially any point $\theta$, then the transforms must agree, and as a consequence $P^*$ and $Q^*$ must agree. We gain robustness by integrating over many points $\theta$.

2.3 Empirical Moment Conditions

With some abuse of notation, we write $\varphi(\theta_0 + x'\theta_1 + y'\theta_2 + z'\theta_3) \equiv \varphi(x, y, z; \theta)$. Define

$$g_X(x, z; \theta) = E[\varphi(x, Y, z; \theta) \mid Z = z] = \int \varphi(x, y, z; \theta)\, f_{Y|Z}(y \mid z)\, dy. \qquad (14)$$

Then the moment conditions can be rewritten as

$$\Lambda(\theta) = E[\varphi(X, Y, Z; \theta) f_Z(Z)] - E[g_X(X, Z; \theta) f_Z(Z)].$$

The first term of $\Lambda(\theta)$ is a mean of $\varphi f_Z$, where $\varphi$ is known and $f_Z$ can be estimated by a kernel smoothing method. The second term is a mean of $g_X(\cdot) f_Z(\cdot)$, where the function $g_X(x, z; \theta)$ is a conditional expectation that can be estimated by a Nadaraya-Watson estimator:

$$\hat g_X(x, z; \theta) = \frac{\sum_{j=1}^{n} \varphi(x, Y_j, z; \theta)\, K_h(Z_j - z)}{\sum_{j=1}^{n} K_h(Z_j - z)}.$$

Thus we can estimate $\Lambda(\theta)$ by

$$\hat\Lambda_{n,h}(\theta) = \frac{1}{n}\sum_{i=1}^{n} \varphi(\tilde W_i'\theta)\, \hat f_Z(Z_i) - \frac{1}{n}\sum_{i=1}^{n} \hat g_X(X_i, Z_i; \theta)\, \hat f_Z(Z_i) = \frac{1}{n(n-1)}\sum_{i=1}^{n}\sum_{j=1, j\neq i}^{n} \big[\varphi(\tilde W_i'\theta) - \varphi(\tilde W_{i,j}'\theta)\big] K_h(Z_i - Z_j), \qquad (15)$$

where $\hat f_Z(Z_i) = (n-1)^{-1}\sum_{j\neq i} K_h(Z_j - Z_i)$ is a leave-one-out kernel density estimator, $\tilde W_{i,j}'\theta = \theta_0 + X_i'\theta_1 + Y_j'\theta_2 + Z_i'\theta_3$, and $K_h(u)$ is a multivariate kernel function. In this paper, we follow the standard practice and use a product kernel of the form

$$K_h(u) = \frac{1}{h^{d_u}} K\!\left(\frac{u_1}{h}, \ldots, \frac{u_{d_u}}{h}\right) \quad\text{with}\quad K(u_1, \ldots, u_{d_u}) = \prod_{\ell=1}^{d_u} k(u_\ell),$$

where $d_u$ is the dimension of $u$ and $h = h_n$ is the bandwidth that depends on $n$.
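To make the estimator concrete, here is a minimal sketch of $\hat\Lambda_{n,h}(\theta)$ in (15), assuming the standard normal PDF as the GCR function $\varphi$, data already transformed to a bounded range, and, purely for simplicity, a second-order Gaussian product kernel (the simulations in Section 5 use a sixth-order kernel). All names are illustrative, and $X$, $Y$, $Z$ are arrays of shape $(n, d_X)$, $(n, d_Y)$, $(n, d_Z)$.

```python
import numpy as np
from scipy.stats import norm

def lambda_hat(X, Y, Z, theta0, t_x, t_y, t_z, h):
    """U-statistic estimator Lambda_hat_{n,h}(theta) of equation (15)."""
    n, d_z = Z.shape
    base = theta0 + X @ t_x + Z @ t_z              # theta_0 + X_i'theta_1 + Z_i'theta_3
    s_own = base + Y @ t_y                         # index using the own Y_i
    s_cross = base[:, None] + (Y @ t_y)[None, :]   # index with Y_j swapped in (W~_{i,j})
    # Product Gaussian kernel K_h(Z_i - Z_j) = h^{-d_Z} prod_l k((Z_il - Z_jl)/h).
    U = (Z[:, None, :] - Z[None, :, :]) / h
    K = norm.pdf(U).prod(axis=2) / h ** d_z
    np.fill_diagonal(K, 0.0)                       # exclude i == j terms
    diff = norm.pdf(s_own)[:, None] - norm.pdf(s_cross)   # phi(W~_i'theta) - phi(W~_{i,j}'theta)
    return (diff * K).sum() / (n * (n - 1))
```

The double loop implicit in (15) is vectorized into $n \times n$ arrays, so memory grows quadratically in $n$; for large samples one would instead accumulate the sum over $i$ in a loop.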

$\hat\Lambda_{n,h}(\theta)$ is an empirical version of $\Lambda(\theta)$. For each $\theta$, $\hat\Lambda_{n,h}(\theta)$ is a second-order U-statistic. When $\hat\Lambda_{n,h}(\theta)$ is regarded as a process indexed by $\theta$, $\hat\Lambda_{n,h}(\cdot)$ is a U-process. Note that $[\varphi(\tilde W_i'\theta) - \varphi(\tilde W_{i,j}'\theta)] K_h(Z_i - Z_j)$ is not symmetric in $i$ and $j$. To achieve the symmetry so that the theory of U-statistics and U-processes can be applied, we rewrite $\hat\Lambda_{n,h}(\theta)$ as

$$\hat\Lambda_{n,h}(\theta) = \binom{n}{2}^{-1}\sum_{i<j} \psi_{h,\theta}(W_i, W_j; \theta), \qquad (16)$$

where

$$\psi_{h,\theta}(W_i, W_j; \theta) = \frac{1}{2}\Big\{\big[\varphi(\tilde W_i'\theta) - \varphi(\tilde W_{i,j}'\theta)\big] K_h(Z_i - Z_j) + \big[\varphi(\tilde W_j'\theta) - \varphi(\tilde W_{j,i}'\theta)\big] K_h(Z_j - Z_i)\Big\} = \psi_{h,\theta}(W_j, W_i; \theta).$$

Our test statistic is related to what DG (2001, Section 5.1) propose, but no formal proof is given there. DG formulate the null hypothesis as

$$H_0: L(\tau_1, \tau_2, \tau_3) = 0 \ \text{ for all } (\tau_1, \tau_2, \tau_3) \text{ in the support of } (X', Y', Z')', \qquad (17)$$

where

$$L(\tau_1, \tau_2, \tau_3) = E\Big\{\big[\mathbf{1}_{\tau_2}(Y) - E\big(\mathbf{1}_{\tau_2}(Y) \mid Z\big)\big]\, \mathbf{1}_{\tau_1}(X)\, \mathbf{1}_{\tau_3}(Z)\, f_Z(Z)\Big\}$$

and $\mathbf{1}_v(V) = \mathbf{1}\{V \leq v\}$ is the indicator function. The DG statistic is based on the following estimator of $L(\tau_1, \tau_2, \tau_3)$:

$$\hat L_{n,h}(\tau_1, \tau_2, \tau_3) = \frac{1}{n(n-1)}\sum_{i=1}^{n}\sum_{j=1, j\neq i}^{n} \big[\mathbf{1}_{\tau_2}(Y_i) - \mathbf{1}_{\tau_2}(Y_j)\big]\, \mathbf{1}_{\tau_1}(X_i)\, \mathbf{1}_{\tau_3}(Z_i)\, K_h(Z_j - Z_i).$$

Comparing this with $\hat\Lambda_{n,h}(\theta)$ given in (15), we can see that $\hat L_{n,h}(\cdot)$ takes the same form as $\hat\Lambda_{n,h}(\cdot)$. The difference is that we use a GCR function $\varphi(x, y, z; \theta)$ while DG use the indicator function $\mathbf{1}_{\tau_1}(x)\, \mathbf{1}_{\tau_2}(y)\, \mathbf{1}_{\tau_3}(z)$. This has both theoretical and practical implications.

First, from a theoretical point of view, while the indicator function is comprehensively revealing, it is not generically comprehensively revealing. An advantage of using a GCR function is that the alternative hypothesis can be revealed by essentially all $\theta \in \Theta$. This property does not hold for the indicator function; that is, there may exist a region of $(\tau_1, \tau_2, \tau_3)$ with nonempty interior such that the moment condition in (17) holds but $Y \not\perp X \mid Z$.

Second, the GCR approach requires bounded random variables while the DG approach does not. So in our setting, the boundary smoothing bias cannot be avoided and has to be dealt with explicitly and rigorously. DG assume unbounded supports and so they do not have to deal with the boundary problem.

Third, there can be a practical problem when computing some functionals of $\hat L_{n,h}(\tau_1, \tau_2, \tau_3)$. For instance, the Cramér-von Mises statistic in DG (2001) takes the form

$$C_n = \sum_{i=1}^{n} \big[\hat L_{n,h}(X_i, Y_i, Z_i)\big]^2,$$

where by definition

$$\hat L_{n,h}(X_i, Y_i, Z_i) = \frac{1}{n(n-1)}\sum_{k=1}^{n}\sum_{j=1, j\neq k}^{n} \big[\mathbf{1}_{Y_i}(Y_k) - \mathbf{1}_{Y_i}(Y_j)\big]\, \mathbf{1}_{X_i}(X_k)\, \mathbf{1}_{Z_i}(Z_k)\, K_h(Z_j - Z_k).$$

The DG test rejects the null when $C_n$ is larger than an asymptotically valid critical value. When the dimensions of $X$ and $Z$ are large, $X_k \leq X_i$ and $Z_k \leq Z_i$ for $k \neq i$ may never happen, and $\mathbf{1}_{X_i}(X_k)\, \mathbf{1}_{Z_i}(Z_k)$ may never be different from zero, even in large samples. In this case $\hat L_{n,h}(X_i, Y_i, Z_i)$ is zero. This could have adverse effects on the size accuracy and the power property of the test in finite samples. This is supported by our Monte Carlo study.

A desirable property of the DG test is that it is invariant to strictly monotonic transformations of $\{X_i\}$ and $\{Y_i\}$. To see this, let $m_X(X) = (m_{X_1}(X_1), \ldots, m_{X_{d_X}}(X_{d_X}))'$ and $m_Y(Y) = (m_{Y_1}(Y_1), \ldots, m_{Y_{d_Y}}(Y_{d_Y}))'$, where all the univariate functions $m_{X_k}(\cdot)$ and $m_{Y_k}(\cdot)$ are strictly monotonic. Then

$$\big[\mathbf{1}_{Y_i}(Y_k) - \mathbf{1}_{Y_i}(Y_j)\big]\, \mathbf{1}_{X_i}(X_k) = \big[\mathbf{1}_{m_Y(Y_i)}(m_Y(Y_k)) - \mathbf{1}_{m_Y(Y_i)}(m_Y(Y_j))\big]\, \mathbf{1}_{m_X(X_i)}(m_X(X_k)).$$

As a result, $\hat L_{n,h}(X_i, Y_i, Z_i)$ is invariant to strictly monotonic transformations of $\{X_i\}$ and $\{Y_i\}$. This is an appealing property that is not shared by the GCR test.

More broadly, one may argue that a conditional independence test should be invariant to measurable and invertible transformations of all the variables $\{X_i\}$, $\{Y_i\}$, and $\{Z_i\}$. The DG test is not invariant under this stronger notion of invariance. However, tests based on nonparametric estimators of some divergence measures between probability distributions, such as Shannon's entropy metric or Hellinger's distance measure, may be invariant in the stronger sense. An example of such a test is SW (2008), which is based on a weighted Hellinger distance between $f_{XYZ}(x, y, z) f_Z(z)$ and $f_{XZ}(x, z) f_{YZ}(y, z)$. Like the GCR test, the omnibus tests of LG (1997) and SW (2007) are not invariant to measurable and invertible transformations or strictly monotonic transformations.

While transformation invariance is a pleasant property, the invariance requirement is often invoked to reduce the class of tests under consideration so that uniformly most powerful invariant (UMPI) tests can be designed. However, in the present context, without specifying the direction of local alternatives, there is no uniformly most powerful test, even among the class of invariant tests. We avoid choosing a direction in order to hedge against the possibility of having no power in other directions. The lack of a UMPI test makes the invariance requirement not as compelling as in some other settings where a UMPI test exists. We view the GCR test and other tests, invariant or not, as complementary.

For example, while the SW (2008) test, which is invariant, can be powerful in detecting high-frequency alternatives, it suffers from the curse of dimensionality, as discussed above. In contrast, the GCR test, which is not invariant, does not suffer from the curse of dimensionality and is more powerful against low-frequency departures. In addition, the GCR test may be made invariant to strictly monotonic transformations if we convert the data $\{X_i\}$ and $\{Y_i\}$ to ranks before applying the GCR test. A systematic study of this rank-based GCR test is beyond the scope of this paper.

3 Stochastic Approximations and Finite Dimensional Convergence

3.1 Assumptions

In this subsection, we state the assumptions that are required to establish the asymptotic properties of $\hat\Lambda_{n,h}(\theta)$. We start with a definition, which uses the following multi-index notation: for $j = (j_1, \ldots, j_m)$ with $j_\ell$ being nonnegative integers, we denote $|j| = j_1 + j_2 + \cdots + j_m$, $j! = j_1! \cdots j_m!$, $u^j = u_1^{j_1} \cdots u_m^{j_m}$, and $D^j g(u) = \partial^{|j|} g(u)/\partial u_1^{j_1} \cdots \partial u_m^{j_m}$.

Definition 2 $G^{\alpha}(A, \Delta, m)$, $\alpha > 0$, is a class of functions $g_\lambda(\cdot): \mathbb{R}^m \to \mathbb{R}$ indexed by $\lambda \in A$ satisfying the following two conditions: (a) for each $\lambda$, $g_\lambda(\cdot)$ is $b$ times continuously differentiable, where $b$ is the greatest integer that is smaller than $\alpha$; (b) let

$$Q_\lambda(u, v) = \sum_{j: |j|\leq b,\, j\neq 0} \frac{D^j g_\lambda(v)}{j!}\,(u - v)^j$$

be the Taylor series expansion of $g_\lambda(u)$ around $v$ of order $b$ (excluding the constant term); then

$$\sup_{\lambda \in A}\ \big\|g_\lambda(u) - g_\lambda(v) - Q_\lambda(u, v)\big\| \leq \Delta\, \|u - v\|^{\alpha} \ \text{ for some constant } \Delta > 0.$$

In the absence of the index set $A$, we use $G^{\alpha}(\Delta, m)$ to denote the class of functions. In this case, our definition is similar to the corresponding definitions in Robinson (1988) and DG (2001). A sufficient condition for condition (b) is that the partial derivatives of the $b$-th order are uniformly Hölder continuous:

$$\sup_{\lambda \in A}\ \big\|D^j g_\lambda(u) - D^j g_\lambda(v)\big\| \leq \Delta\, \|v - u\|^{\alpha - b} \ \text{ for all } j \text{ such that } |j| = b.$$

We are ready to present our assumptions.

Assumption 1 (IID) (a) $\{W_i \in [0, 1]^d\}_{i=1}^{n}$ is an IID sequence of random variables on the complete probability space $(\Omega, \mathcal{F}, P)$; (b) each element $Z_\ell$ of $Z$ is supported on $[0, 1]$; (c) the distribution of $Z$ admits a density function $f_Z(z)$ with respect to the Lebesgue measure.

Assumption 2 (Smoothness of the Densities) (a) $f_Z(\cdot) \in G^{q+1}(\Delta, d_Z)$ for some integer $q > 1$ and some constant $\Delta > 0$; (b) $D^j f_Z(z) = 0$ for all $|j| \leq q$ and all $z$ on the boundary of $[0, 1]^{d_Z}$; (c) the conditional distribution functions $F_{Y|Z}$, $F_{X|Z}$, and $F_{XY|Z}$ admit the respective densities $f_{Y|Z}(y \mid z)$, $f_{X|Z}(x \mid z)$, and $f_{XY|Z}(x, y \mid z)$ with respect to a finite counting measure, the Lebesgue measure, or their product measure;

(d) as functions of $z$ indexed by $x$, $y$, or $(x, y) \in A$, the densities $f_{X|Z}(x \mid z)$, $f_{Y|Z}(y \mid z)$, and $f_{XY|Z}(x, y \mid z)$ belong to $G^{q+1}(A, \Delta, d_Z)$ with $A = [0,1]^{d_X}$, $[0,1]^{d_Y}$, or $[0,1]^{d_X + d_Y}$, respectively.

Assumption 3 (GCR) (a) $\Theta$ is compact with non-empty interior; (b) $\varphi \in G^{2}(\Delta, 1)$.

Assumption 4 (Kernel Function) The univariate kernel $k(\cdot)$ is a $q$th order symmetric and bounded kernel $k: \mathbb{R} \to \mathbb{R}$ such that (a) $\int k(v)\, dv = 1$ and $\int v^j k(v)\, dv = 0$ for $j = 1, 2, \ldots, q - 1$; (b) $k(v) = O\big((1 + |v|)^{-\Delta}\big)$ for some $\Delta > q^2 + q + 1$.

Assumption 5 (Bandwidth) The bandwidth $h = h_n$ satisfies (a) $n h^{d_Z} \to \infty$ as $n \to \infty$; (b) $\sqrt{n}\, h^{q} = o(1)$, i.e., $h = o(n^{-1/(2q)})$, as $n \to \infty$.

Some discussion of the assumptions is in order. The IID condition in Assumption 1 is maintained for convenience. Analogous results hold under weaker conditions, but we leave explicit consideration of these aside. Assumptions 2(a) and 2(d) are needed to control the smoothing bias. Under Assumptions 1(b) and 2(a), we have $\int f_Z^2(z)\, dz < \infty$, so it is not necessary to state the square integrability of $f_Z(z)$ as a separate assumption. In Assumption 2(d), the smoothness condition is with respect to the conditioning variable. It does not require the marginal distributions of $X$ and $Y$ to be smooth. In fact, $X$ and $Y$ could be either discrete or continuous. In addition, from a technical point of view, we only need to assume that there exists a version of the conditional density functions satisfying Assumption 2(d).

Assumption 2(b) is a technical condition, which helps avoid the boundary bias problem, a well-known problem for density estimation at the boundary. The GCR approach of StW requires the boundedness of the random vectors, so we have to deal with the boundary bias problem. If Assumption 2(b) does not hold, we can transform $Z$ into $Z^* = (\ell(Z_1), \ell(Z_2), \ldots, \ell(Z_{d_Z}))'$, where $\ell: [0,1] \to [0,1]$ is strictly increasing and $q + 1$ times continuously differentiable with inverse $\ell^{-1}$. Now

$$P\{Z^* < z\} = P\{Z_1 < \ell^{-1}(z_1), \ldots, Z_{d_Z} < \ell^{-1}(z_{d_Z})\} = F_Z\big(\ell^{-1}(z_1), \ldots, \ell^{-1}(z_{d_Z})\big),$$

and the density of $Z^*$ is

$$f_{Z^*}(z) = f_Z\big(\ell^{-1}(z)\big)\, \frac{d\ell^{-1}(z_1)}{dz_1} \cdots \frac{d\ell^{-1}(z_{d_Z})}{dz_{d_Z}}.$$

So if $\ell^{(i)}(0) = \ell^{(i)}(1) = 0$ for $i = 1, \ldots, q$, then Assumption 2(b) is satisfied for the transformed random vector $Z^*$, and we can work with $Z^*$ rather than $Z$. We can do so because $Y \perp X \mid Z$ if and only if $Y \perp X \mid Z^*$. An example of $\ell$ is the CDF of a beta distribution:

$$\ell(v) = \frac{1}{B(q+1, q+1)}\int_0^{v} x^{q}(1 - x)^{q}\, dx := \frac{B(v; q+1, q+1)}{B(1; q+1, q+1)},$$

where $B(v; q+1, q+1) = \int_0^{v} x^{q}(1 - x)^{q}\, dx$ is the incomplete beta function.
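As a concrete illustration, this boundary transformation can be applied componentwise with a standard library routine. The sketch below assumes $Z$ has already been mapped into $[0,1]^{d_Z}$; the function name is illustrative.

```python
from scipy.stats import beta

def boundary_transform(Z01, q=6):
    """Apply the Beta(q+1, q+1) CDF componentwise to Z in [0,1]^{d_Z}.

    The first q derivatives of this CDF vanish at 0 and 1, which is the
    condition needed for Assumption 2(b) to hold for the transformed Z*.
    """
    return beta.cdf(Z01, q + 1, q + 1)
```

Since the map is strictly increasing in each component, conditioning on the transformed $Z^*$ is equivalent to conditioning on $Z$, so the test can be applied to $(X, Y, Z^*)$ without changing the hypothesis being tested.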

The idea of using transformations to remove the boundary bias has a long history; see Geenens (2014) and the references therein. However, our idea is different here. In Geenens (2014) and the related literature, in order to reduce the boundary bias, a transformation is employed to map a bounded support into an unbounded support. This approach clearly is not compatible with the GCR framework we use here. Our idea is to transform the data so that the probability mass around the boundary is relatively small. This is a viable approach as long as our focus is not the pdf of the original data per se. This idea may be of independent interest.

Another often used approach to handle the boundary problem is to trim the data when $Z$ is boundedly supported. Let $\mathcal{Z}_\varepsilon$ be a proper subset of the support of $Z$ satisfying $P[Z_i \notin \mathcal{Z}_\varepsilon] = \varepsilon$ for some $\varepsilon$ approaching zero at a certain rate. In this approach, we replace $K_h(Z_i - Z_j)$ in the definition of $\hat\Lambda_{n,h}(\theta)$ by $K_h(Z_i - Z_j)\, \mathbf{1}[Z_i \in \mathcal{Z}_\varepsilon]\, \mathbf{1}[Z_j \in \mathcal{Z}_\varepsilon]$. For example, this approach is used in Li and Fan (2003, page 745) in a different context. Note that the use of transformation or trimming is more of theoretical importance, as they are necessary to achieve the parametric $\sqrt{n}$ rate of convergence of $\hat\Lambda_{n,h}(\theta)$ to $\Lambda(\theta)$. In practice, transformation and trimming may or may not be important. When the probability mass near the boundary is relatively small, they are likely to be unimportant. In this case, we may skip the transformation or trimming and ignore the boundary bias in practice.

Assumption 3(a) is needed only when we attempt to establish the uniformity of some asymptotic properties over $\Theta$. Like Assumption 2, Assumption 3(b) helps control the smoothing bias. It is satisfied by many GCR functions such as $\exp(\cdot)$, the normal PDF, $\sin(\cdot)$, and $\cos(\cdot)$. The conditions on the high order kernel in Assumption 4 are fairly standard. For example, both Robinson (1988) and DG (2001) make a similar assumption. The only difference is that Robinson (1988) and DG (2001) require that $\Delta > q + 1$, while we require the stronger condition that $\Delta > q^2 + q + 1$ in Assumption 4(b). The stronger condition is needed to control the boundary bias, which is absent in Robinson (1988) and DG (2001), as they assume that $Z$ has an unbounded support. Assumption 4(b) is not restrictive. It is satisfied by typical kernels used in practice, as they are either supported on $[-1, 1]$ or have exponentially decaying tails. Assumption 5(a) ensures that the degenerate U-statistic in the Hoeffding decomposition of $\hat\Lambda_{n,h}(\theta)$ is asymptotically negligible. Assumption 5(b) removes the dominating bias of $\hat\Lambda_{n,h}(\theta)$. See Lemmas 3 and 4 below. A necessary condition for Assumption 5 to hold is that $2q > d_Z$.

3.2 Stochastic Approximations

To establish the asymptotic properties of $\hat\Lambda_{n,h}(\theta)$, we develop some stochastic approximations, using the theory of U-statistics and U-processes pioneered by Hoeffding (1948). Let $\bar\psi_{h,\theta}(w; \theta) = E\,\psi_{h,\theta}(w, W_j; \theta)$. Using Hoeffding's H-decomposition, we can decompose $\hat\Lambda_{n,h}(\theta)$ as

$$\hat\Lambda_{n,h}(\theta) = \Lambda_h(\theta) + H_{n,h}(\theta) + R_{n,h}(\theta),$$

where

$$\Lambda_h(\theta) = E\,\psi_{h,\theta}(W_j, W_i; \theta) = E\,\bar\psi_{h,\theta}(W_i; \theta), \qquad (18)$$
$$H_{n,h}(\theta) = \frac{2}{n}\sum_{i=1}^{n} \tilde{\bar\psi}_{h,\theta}(W_i; \theta), \qquad (19)$$
$$R_{n,h}(\theta) = \binom{n}{2}^{-1}\sum_{i<j} \tilde\psi_{h,\theta}(W_i, W_j; \theta), \qquad (20)$$

and

$$\tilde{\bar\psi}_{h,\theta}(W_i; \theta) = \bar\psi_{h,\theta}(W_i; \theta) - \Lambda_h(\theta), \qquad \tilde\psi_{h,\theta}(W_i, W_j; \theta) = \psi_{h,\theta}(W_i, W_j; \theta) - \bar\psi_{h,\theta}(W_i; \theta) - \bar\psi_{h,\theta}(W_j; \theta) + \Lambda_h(\theta).$$

The sum of the first two terms in the H-decomposition is known as the Hájek projection. For easy reference, we denote it as

$$\tilde\Lambda_{n,h}(\theta) = \Lambda_h(\theta) + H_{n,h}(\theta). \qquad (21)$$

By construction, $H_{n,h}(\theta)$ and $R_{n,h}(\theta)$ are uncorrelated zero mean random variables. We show that the projection remainder $R_{n,h}(\theta)$ is asymptotically negligible, and as a result $\hat\Lambda_{n,h}(\theta)$ and its Hájek projection $\tilde\Lambda_{n,h}(\theta)$ have the same limiting distribution.

For each given $\theta$ and $h$, $R_{n,h}(\theta)$ is a degenerate second order U-statistic with kernel $\tilde\psi_{h,\theta}(\cdot, \cdot; \theta)$. According to the theory of U-statistics (e.g., Lee, 1990), we have

$$\mathrm{var}[R_{n,h}(\theta)] = \frac{2}{n(n-1)}\, \mathrm{var}[\tilde\psi_{h,\theta}(W_i, W_j; \theta)].$$

This can also be proved directly by observing that $\tilde\psi_{h,\theta}(W_i, W_j; \theta)$ is uncorrelated with $\tilde\psi_{h,\theta}(W_\ell, W_m; \theta)$ if $(i, j) \neq (\ell, m)$. If $h$ were fixed, then it would follow from basic U-statistic theory that $R_{n,h}(\theta) = o_p(1/\sqrt{n})$ for each $\theta$. However, in the present setting, $h \to 0$ as $n \to \infty$, so the basic U-statistic theory does not directly apply. Nevertheless, we can show that $R_{n,h}(\theta)$ is still $o_p(n^{-1/2})$ under Assumption 5(a). In fact, we can prove a stronger result, as Lemma 3 shows.

Lemma 3 Under Assumptions 1, 2, and 5(a), if $h \to 0$ as $n \to \infty$, then $\sup_{\theta \in \Theta}\big|\sqrt{n}\, R_{n,h}(\theta)\big| = o_p(1)$.

We proceed to establish a stochastic approximation of the Hájek projection $\tilde\Lambda_{n,h}(\theta)$. Note that both $\Lambda_h(\theta)$ and $H_{n,h}(\theta)$ depend on $h$. Using a Taylor expansion, we can separate terms independent of $h$ from those associated with $h$ in $\Lambda_h(\theta)$ and $H_{n,h}(\theta)$. By using a higher order kernel $K$ and controlling the rate of $h$ so that it shrinks fast enough, we can ensure that the terms associated with $h$ vanish asymptotically, as in Powell, Stock, and Stoker (1989). More specifically, we first show that $\Lambda_h(\theta) = \Lambda(\theta) + O(h^q)$, where $q$ is the order of the kernel $k$. Then we show that $H_{n,h}(\theta) = \frac{2}{n}\sum_{i=1}^{n}\{\psi(W_i; \theta) - E[\psi(W_i; \theta)]\} + O_p(h^q)$, where

$$\psi(W_i; \theta) = \frac{1}{2}\Big[\varphi(\theta_0 + X_i'\theta_1 + Y_i'\theta_2 + Z_i'\theta_3) f_Z(Z_i) - \int \varphi(\theta_0 + X_i'\theta_1 + y'\theta_2 + Z_i'\theta_3)\, f_{YZ}(y, Z_i)\, dy + \int\!\!\int \varphi(\theta_0 + x'\theta_1 + y'\theta_2 + Z_i'\theta_3)\, f_{XYZ}(x, y, Z_i)\, dx\, dy - \int \varphi(\theta_0 + x'\theta_1 + Y_i'\theta_2 + Z_i'\theta_3)\, f_{XZ}(x, Z_i)\, dx\Big].$$

Under Assumption 5(b), $\sqrt{n}\, h^q \to 0$, which makes both the second term of $\Lambda_h(\theta)$ and the second term of $H_{n,h}(\theta)$ vanish asymptotically. The following lemma presents these results formally.

Lemma 4 Let Assumptions 1-4 and 5(b) hold. Then

(a) $\sqrt{n}\,[\Lambda_h(\theta) - \Lambda(\theta)] = o(1)$ uniformly over $\theta \in \Theta$;

(b) $\sqrt{n}\, H_{n,h}(\theta) = \frac{2}{\sqrt{n}}\sum_{i=1}^{n}\{\psi(W_i; \theta) - E[\psi(W_i; \theta)]\} + o_p(1)$ uniformly over $\theta \in \Theta$.

It follows from Lemmas 3 and 4 that

$$\sqrt{n}\big[\hat\Lambda_{n,h}(\theta) - \Lambda(\theta)\big] = \sqrt{n}\, H_{n,h}(\theta) + \sqrt{n}\, R_{n,h}(\theta) + \sqrt{n}\,[\Lambda_h(\theta) - \Lambda(\theta)] = \sqrt{n}\, H_{n,h}(\theta) + o_p(1) = \frac{2}{\sqrt{n}}\sum_{i=1}^{n}\{\psi(W_i; \theta) - E[\psi(W_i; \theta)]\} + o_p(1)$$

uniformly over $\theta \in \Theta$. So $\sqrt{n}[\hat\Lambda_{n,h}(\theta) - \Lambda(\theta)]$ and $\frac{2}{\sqrt{n}}\sum_{i=1}^{n}\{\psi(W_i; \theta) - E[\psi(W_i; \theta)]\}$ have the same limiting distribution for each $\theta$.

3.3 Finite Dimensional Convergence

In this subsection, we view $\hat\Lambda_{n,h}(\theta)$ as a U-process indexed by $\theta$ and consider its finite-dimensional convergence. Let $\vartheta_s = \{\theta_1, \theta_2, \ldots, \theta_s\}$ for some $s < \infty$ and $\theta_\ell \in \Theta$, and define $\hat\Lambda_{n,h}(\vartheta_s) := [\hat\Lambda_{n,h}(\theta_1), \hat\Lambda_{n,h}(\theta_2), \ldots, \hat\Lambda_{n,h}(\theta_s)]'$. Similarly, we define $\Lambda(\vartheta_s) := [\Lambda(\theta_1), \Lambda(\theta_2), \ldots, \Lambda(\theta_s)]'$. Theorem 5 below establishes the asymptotic normality of $\sqrt{n}[\hat\Lambda_{n,h}(\vartheta_s) - \Lambda(\vartheta_s)]$.

Theorem 5 Let Assumptions 1-5 hold. Then

$$\sqrt{n}\big[\hat\Lambda_{n,h}(\vartheta_s) - \Lambda(\vartheta_s)\big] \stackrel{d}{\to} N(0, \Sigma),$$

where the $(\ell, m)$ element of $\Sigma$ is

$$\Sigma(\ell, m) := \Sigma(\theta_\ell, \theta_m) = 4\,\mathrm{cov}\big[\psi(W_i; \theta_\ell),\ \psi(W_i; \theta_m)\big]. \qquad (22)$$

If, in addition, $H_0$ holds, then $\Lambda(\vartheta_s) = 0$ and $\Sigma(\theta_\ell, \theta_m) = 4\, E\big[\psi(W_i; \theta_\ell)\, \psi(W_i; \theta_m)\big]$, where

$$\psi(W_i; \theta) = \frac{1}{2}\Big\{E\big[\varphi(\tilde W_i'\theta) f_Z(Z_i) \mid X_i, Y_i, Z_i\big] - E\big[\varphi(\tilde W_i'\theta) f_Z(Z_i) \mid X_i, Z_i\big] - E\big[\varphi(\tilde W_i'\theta) f_Z(Z_i) \mid Y_i, Z_i\big] + E\big[\varphi(\tilde W_i'\theta) f_Z(Z_i) \mid Z_i\big]\Big\}. \qquad (23)$$

Theorem 5 is of interest in its own right. For example, we can use it to construct a Wald test. There may be some power loss if $s$ is small. When $s$ is large enough such that $\vartheta_s$ approximates $\Theta$ very well, then the power loss will be small. The idea can be motivated from the method of sieves. We do not pursue this here but refer to Huang (2009) for more discussion. Instead, we consider the ICM tests in the next section. Theorem 5 is an important first step in obtaining the asymptotic distributions of the ICM statistics.

Observe that $\hat\Lambda_{n,h}(\theta)$ is not symmetric in $X$ and $Y$, whereas the hypothesis $Y \perp X \mid Z$ is. However, $\sqrt{n}[\hat\Lambda_{n,h}(\theta) - \Lambda(\theta)]$ is asymptotically equivalent to $\frac{2}{\sqrt{n}}\sum_{i=1}^{n}[\psi(W_i; \theta) - E\,\psi(W_i; \theta)]$. It can be readily checked that $\psi(W; \theta)$ is symmetric in $Y$ and $X$. Alternatively, we can follow the definition of $g_X$ in (14) and define $g_Y(y, z; \theta)$, $g_Z(z; \theta)$, and $g_{XY}(x, y, z; \theta)$ as

$$g_Y(y, z; \theta) = E[\varphi(X, y, z; \theta) \mid Z = z], \quad g_Z(z; \theta) = E[\varphi(X, Y, z; \theta) \mid Z = z], \quad g_{XY}(x, y, z; \theta) = E[\varphi(x, y, z; \theta) \mid Z = z] = \varphi(x, y, z; \theta),$$

where the last equality is tautological. Then

$$\psi(W; \theta) = \frac{1}{2}\big[g_{XY}(X, Y, Z; \theta) - g_X(X, Z; \theta) - g_Y(Y, Z; \theta) + g_Z(Z; \theta)\big]\, f_Z(Z),$$

which is clearly symmetric in $Y$ and $X$. If we construct another estimator, say $\hat\Lambda^{\dagger}_{n,h}(\theta)$, by switching the roles of $X$ and $Y$, we can show that $\hat\Lambda_{n,h}(\theta)$ and $\hat\Lambda^{\dagger}_{n,h}(\theta)$ are asymptotically equivalent in the sense that $\sqrt{n}[\hat\Lambda_{n,h}(\theta) - \hat\Lambda^{\dagger}_{n,h}(\theta)] = o_p(1)$ uniformly over $\theta \in \Theta$. So there is no asymptotic gain in taking an average of $\hat\Lambda_{n,h}(\theta)$ and $\hat\Lambda^{\dagger}_{n,h}(\theta)$. This point is further supported by the symmetry of $\psi(W; \theta)$ in $X$ and $Y$.

3.4 Bandwidth Selection

Although any choice of bandwidth satisfying Assumption 5 will deliver the asymptotic distribution in Theorem 5, in practice we need some guidance on how to select $h$. Ideally we should select an $h$ that would give us the greatest power for a given size of test, but deriving that procedure would be complicated enough to justify another study. Moreover, it would only make a difference for higher order results. Thus, for the present purposes, we just provide a simple plug-in estimator of the MSE-minimizing bandwidth proposed by Powell and Stoker (1996). Since the test statistic is based on $\hat\Lambda_{n,h}(\theta)$, which estimates $\Lambda(\theta)$, it is appealing to choose an $h$ that minimizes the mean squared error (MSE) of $\hat\Lambda_{n,h}(\theta)$. After some tedious but straightforward calculations, we get

$$\mathrm{MSE}\big[\hat\Lambda_{n,h}(\theta)\big] = \big(\Lambda_h(\theta) - \Lambda(\theta)\big)^2 + \mathrm{var}\big[\hat\Lambda_{n,h}(\theta)\big] = \{E[B_5(W, \theta)]\}^2 h^{2q} + o(h^{2q}) + \mathrm{var}\big[\hat\Lambda_{n,h}(\theta)\big]$$
$$= \{E[B_5(W, \theta)]\}^2 h^{2q} + o(h^{2q}) + 4n^{-1}\mathrm{var}[\psi(W; \theta)] + 4n^{-1} C_1(\theta) h^{q} + o(n^{-1} h^{q}) - 4n^{-2}\mathrm{var}[\psi(W; \theta)] + n^{-2} E[\zeta(W, \theta)]\, h^{-d_Z} + o(n^{-2} h^{-d_Z}) + o(n^{-1}),$$

where $B_5$ is defined in (47) in the appendix, and $\zeta(W, \theta)$ is defined by

$$E\big[\|\psi_{h,\theta}(W_i, W_j; \theta)\|^2 \mid W_i\big] = \zeta(W_i, \theta)\, h^{-d_Z} + \xi(W_i, \theta, h), \quad\text{where}\quad E\big[\|\xi(W_i, \theta, h)\|\big] = o\big(h^{-d_Z}\big).$$

The term $4n^{-1}\mathrm{var}[\psi(W; \theta)] - 4n^{-2}\mathrm{var}[\psi(W; \theta)]$ does not depend on $h$. The remaining $o(n^{-1})$ term must be of smaller order than $4n^{-1} C_1(\theta) h^{q}$, and $4n^{-1} C_1(\theta) h^{q}$ must be of smaller order than $\{E[B_5(W, \theta)]\}^2 h^{2q}$; otherwise there would be a contradiction to Assumption 5(b). So the leading term of $\mathrm{MSE}[\hat\Lambda_{n,h}(\theta)]$ that involves $h$ is

$$\mathrm{MSE}_h\big[\hat\Lambda_{n,h}(\theta)\big] = \{E[B_5(W, \theta)]\}^2 h^{2q} + n^{-2} E[\zeta(W, \theta)]\, h^{-d_Z}. \qquad (24)$$

By minimizing $\mathrm{MSE}_h[\hat\Lambda_{n,h}(\theta)]$, we obtain the optimal bandwidth

$$h^* = \left[\frac{d_Z\, E[\zeta(W, \theta)]}{2q\, \{E[B_5(W, \theta)]\}^2}\right]^{1/(2q + d_Z)} n^{-2/(2q + d_Z)}. \qquad (25)$$

Now Assumption 5(a) is satisfied:

$$n\,(h^*)^{d_Z} \propto n\, n^{-2 d_Z/(2q + d_Z)} = n^{(2q - d_Z)/(2q + d_Z)} \to \infty, \quad\text{given } 2q > d_Z.$$

And so is Assumption 5(b):

$$\sqrt{n}\,(h^*)^{q} \propto n^{1/2}\, n^{-2q/(2q + d_Z)} = n^{-(2q - d_Z)/(2(2q + d_Z))} = o(1), \quad\text{given } 2q > d_Z.$$

The optimal bandwidth depends on the unknown quantities $E[\zeta(W, \theta)]$ and $E[B_5(W, \theta)]$. Here we follow the standard practice (e.g., Powell and Stoker (1996)) and use a simple plug-in estimator of $h^*$. Let $h_0$ be an initial bandwidth. Suppose $E\|\psi_{h_0,\theta}(W_i, W_j; \theta)\|^4 = O(h_0^{-\kappa d_Z})$ for some $\kappa > 0$, and let $\varrho = \max\{\kappa + d_Z, 2q + d_Z\}$. If $h_0 \to 0$ and $n^2 h_0^{\varrho} \to \infty$, then by Proposition 4.1 of Powell and Stoker (1996),

$$\hat\zeta \equiv \hat\zeta(h_0) = \binom{n}{2}^{-1} h_0^{d_Z} \sum_{i<j}\big[\psi_{h_0,\theta}(W_i, W_j; \theta)\big]^2 \stackrel{p}{\to} E[\zeta(W_i, \theta)], \qquad (26)$$

and

$$\hat B_5 \equiv \frac{\hat\Lambda_{n,h_0}(\theta) - \hat\Lambda_{n,\gamma h_0}(\theta)}{h_0^{q} - (\gamma h_0)^{q}} \stackrel{p}{\to} E[B_5(W, \theta)] \quad\text{for some } 0 < \gamma \neq 1.$$

The estimator $\hat B_5$ given above is a slope between the two points $(h_0^{q}, \hat\Lambda_{n,h_0}(\theta))$ and $((\gamma h_0)^{q}, \hat\Lambda_{n,\gamma h_0}(\theta))$. To get a more stable estimator, we could use a regression of $\hat\Lambda_{n,h}(\theta)$ on $h^{q}$ for various values of $h$. Given $\hat\zeta$ and $\hat B_5$, the plug-in estimator of $h^*$ is

$$\hat h = \left[\frac{d_Z\, \hat\zeta}{2q\, \hat B_5^2}\right]^{1/(2q + d_Z)} n^{-2/(2q + d_Z)}. \qquad (27)$$

In practice we can choose $q$ large enough so that $\varrho = \max\{\kappa + d_Z, 2q + d_Z\} = 2q + d_Z$; then we can choose the initial bandwidth to be $h_0 = o(n^{-1/(2q + d_Z)})$. The data-driven $\hat h$ depends on $\theta$. We may choose different bandwidths for different $\theta$'s. This is what we follow in our Monte Carlo experiments. Powell and Stoker (1996) mention one technical proviso: $\hat\Lambda_{n,\hat h}(\theta)$ is not guaranteed to be asymptotically equivalent to $\hat\Lambda_{n,h^*}(\theta)$, since the MSE calculations are based on the assumption that $h$ is deterministic. The suggested solution is to discretize the set of possible scaling constants, replacing $\hat h$ with the closest value, $\hat h^{\dagger}$, in some finite set. The estimation uncertainty in $\hat h^{\dagger}$ is small enough that it will not affect the asymptotic MSE.
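Below is a minimal sketch of the plug-in bandwidth (27), under the same simplifying assumptions as the earlier sketch (standard normal $\varphi$, second-order Gaussian product kernel, bounded data); the helper name `_pairwise_terms`, the default $\gamma = 0.5$, and the initial bandwidth formula are illustrative assumptions, not the paper's code.

```python
import numpy as np
from scipy.stats import norm

def _pairwise_terms(X, Y, Z, theta0, t_x, t_y, t_z, h):
    """Matrix of a_ij = [phi(W~_i'theta) - phi(W~_{i,j}'theta)] * K_h(Z_i - Z_j)."""
    n, d_z = Z.shape
    base = theta0 + X @ t_x + Z @ t_z
    s_own = base + Y @ t_y
    s_cross = base[:, None] + (Y @ t_y)[None, :]
    K = norm.pdf((Z[:, None, :] - Z[None, :, :]) / h).prod(axis=2) / h ** d_z
    np.fill_diagonal(K, 0.0)
    return (norm.pdf(s_own)[:, None] - norm.pdf(s_cross)) * K

def plugin_bandwidth(X, Y, Z, theta0, t_x, t_y, t_z, q=6, gamma=0.5):
    """Plug-in estimate of the MSE-optimal bandwidth in equation (27)."""
    n, d_z = Z.shape
    h0 = n ** (-1.0 / (3 * (q + d_z)))                      # illustrative initial bandwidth
    A0 = _pairwise_terms(X, Y, Z, theta0, t_x, t_y, t_z, h0)
    A1 = _pairwise_terms(X, Y, Z, theta0, t_x, t_y, t_z, gamma * h0)
    psi = 0.5 * (A0 + A0.T)                                 # symmetrized kernel psi_{h0,theta}
    iu = np.triu_indices(n, k=1)
    zeta_hat = h0 ** d_z * (psi[iu] ** 2).mean()            # estimator (26)
    lam0 = A0.sum() / (n * (n - 1))                         # Lambda_hat at h0
    lam1 = A1.sum() / (n * (n - 1))                         # Lambda_hat at gamma * h0
    B5_hat = (lam0 - lam1) / (h0 ** q - (gamma * h0) ** q)  # slope estimator of E[B_5]
    return (d_z * zeta_hat / (2 * q * B5_hat ** 2)) ** (1.0 / (2 * q + d_z)) * n ** (-2.0 / (2 * q + d_z))
```

A regression of $\hat\Lambda_{n,h}(\theta)$ on $h^q$ over several bandwidths, as suggested above, would replace the two-point slope `B5_hat` with a least-squares slope and tends to be more stable in small samples.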

4 An Integrated Conditional Moment Test

In this section, we integrate out $\theta$ to get an integrated conditional moment (ICM) type test statistic, following Bierens (1990) and StW (1998).

4.1 The Test Statistic and its Asymptotic Null Distribution

If $\varphi$ is GCR, testing $H_0: Y \perp X \mid Z$ is equivalent to testing $H_0': \Lambda(\theta) = 0$ for essentially all $\theta \in \Theta$. In other words, if we view $\hat\Lambda_{n,h}(\theta)$ as a random function in $\theta$, we are testing whether its mean function $\Lambda(\theta)$ is zero on $\Theta$. If $\Theta$ is compact, we can show that $\sqrt{n}\,\hat\Lambda_{n,h}(\theta)$ converges to a zero mean Gaussian process under the null. Based on $\sqrt{n}\,\hat\Lambda_{n,h}(\theta)$, we construct the ICM test statistic

$$M_n = \int_{\Theta}\big[\sqrt{n}\,\hat\Lambda_{n,h}(\theta)\big]^2\, d\mu(\theta),$$

where $\mu$ is a probability measure on $\Theta$ that is absolutely continuous with respect to the Lebesgue measure on $\Theta$. Here we integrate $[\sqrt{n}\,\hat\Lambda_{n,h}(\theta)]^2$, which gives a Cramér-von Mises (CM) type test. Alternatively, we could integrate $|\sqrt{n}\,\hat\Lambda_{n,h}(\theta)|^p$, $p \geq 1$. The choice $p = \infty$ (which gives the maximum over $\Theta$) yields a Kolmogorov-Smirnov (KS) type test. We work with $p = 2$ for concreteness and because CM-type tests often outperform KS-type tests.

To establish the weak convergence of $M_n$, we first show that $\sqrt{n}[\hat\Lambda_{n,h}(\theta) - \Lambda(\theta)]$ converges to a Gaussian process. Define

$$\nu_n(\theta) = \frac{2}{\sqrt{n}}\sum_{i=1}^{n}\{\psi(W_i; \theta) - E[\psi(W_i; \theta)]\}.$$

Then Lemmas 3 and 4 imply that $\sup_{\theta \in \Theta}\big|\sqrt{n}[\hat\Lambda_{n,h}(\theta) - \Lambda(\theta)] - \nu_n(\theta)\big| = o_p(1)$. Theorem 6 below shows that $\nu_n(\theta)$ converges to a zero mean Gaussian process and so does $\sqrt{n}[\hat\Lambda_{n,h}(\theta) - \Lambda(\theta)]$.

Theorem 6 Let Assumptions 1-5 hold. Then (a) $\nu_n(\cdot) \stackrel{d}{\to} \nu(\cdot)$; (b) $\sqrt{n}[\hat\Lambda_{n,h}(\cdot) - \Lambda(\cdot)] \stackrel{d}{\to} \nu(\cdot)$, where $\nu$ is a zero mean Gaussian process on $\Theta$ with covariance function

$$\mathrm{cov}\big(\nu(\theta_1), \nu(\theta_2)\big) = 4\,\mathrm{cov}\big[\psi(W; \theta_1),\ \psi(W; \theta_2)\big] \equiv \Sigma(\theta_1, \theta_2). \qquad (28)$$

If $H_0$ also holds, then $T_n(\cdot) \equiv \sqrt{n}\,\hat\Lambda_{n,h}(\cdot) \stackrel{d}{\to} \nu(\cdot)$.

Let $M: C(\Theta) \to \mathbb{R}^+$ be $\|\cdot\|_\infty$ continuous. Then the continuous mapping theorem (Billingsley, 1999) implies that $M[T_n(\cdot)] \stackrel{d}{\to} M[\nu(\cdot)]$ under the null hypothesis.

For example, with $M[T_n(\cdot)] = \int_{\Theta}[T_n(\theta)]^2\, d\mu(\theta)$, we have under $H_0$:

$$M_n \equiv M[T_n(\cdot)] = \int_{\Theta}[T_n(\theta)]^2\, d\mu(\theta) = \int_{\Theta}\big[\sqrt{n}\,\hat\Lambda_{n,h}(\theta)\big]^2\, d\mu(\theta) \stackrel{d}{\to} \int_{\Theta}[\nu(\theta)]^2\, d\mu(\theta).$$

4.2 Global and Local Alternatives

The global alternatives for our conditional independence test can always be written as

$$H_a: f_Z(z) f_{XYZ}(x, y, z) - f_{YZ}(y, z) f_{XZ}(x, z) = \delta(x, y, z) \qquad (29)$$

for some nontrivial and nonzero function $\delta(x, y, z)$. Then under $H_a$, we have

$$\Lambda(\theta) = \int \varphi(\tilde w'\theta)\, \delta(x, y, z)\, dx\, dy\, dz.$$

This will be nonzero for essentially all $\theta$ provided that $\varphi$ is GCR. It follows from Theorem 6 that $\lim_{n \to \infty}\Pr(M_n > c_n) = 1$ for any critical value $c_n = o(n)$. That is, the test is consistent: as the sample size increases, the test will eventually detect the alternative $H_a$.

To construct a local alternative, we consider a mixture distribution of the form

$$H_{a,n}: f_{XYZ}(x, y, z) = \left[\left(1 - \frac{c}{\sqrt{n}}\right) f_{Y|Z}(y \mid z) + \frac{c}{\sqrt{n}}\,\tilde\delta(y \mid x, z)\right] f_{XZ}(x, z), \qquad (30)$$

where $c$ is a constant and $\tilde\delta(y \mid x, z)$ is a conditional density function of $\tilde Y$ given $(\tilde X, \tilde Z)$ such that $\tilde Y \not\perp \tilde X \mid \tilde Z$. By construction, $\tilde\delta(y \mid x, z)$ is a nontrivial function of $x$ and $z$. That is, the distribution of $W$ is a mixture of two distributions: one satisfies the null of conditional independence and the other does not. The mixing proportion is local to unity. Equivalently, we can rewrite the local alternative as

$$H_{a,n}: f_{XYZ}(x, y, z) = f_{Y|Z}(y \mid z)\, f_{XZ}(x, z) + \frac{\delta(x, y, z)}{\sqrt{n}} \qquad (31)$$

for $\delta(x, y, z) = c\,[\tilde\delta(y \mid x, z) - f_{Y|Z}(y \mid z)]\, f_{XZ}(x, z)$. Since $\tilde\delta(y \mid x, z)$ depends on $x$, $\tilde\delta(y \mid x, z) - f_{Y|Z}(y \mid z)$ cannot be a zero function. Hence when $\varphi$ is GCR and $c > 0$,

$$\Delta_\varphi(\theta) := \int \varphi(\tilde w'\theta)\, \delta(x, y, z)\, dx\, dy\, dz \neq 0 \qquad (32)$$

for essentially all $\theta \in \Theta$. Under Assumptions 1-5 and the local alternative $H_{a,n}$, we can use the same arguments as in the proof of Theorem 6 to show that

$$M_n = \int_{\Theta}[T_n(\theta)]^2\, d\mu(\theta) \stackrel{d}{\to} \int_{\Theta}\big[\nu(\theta) + \Delta_\varphi(\theta)\big]^2\, d\mu(\theta).$$

The essentially nonzero mean is the source of the power of the ICM test against the local alternative.

The above local alternative asymptotics can be used to guide the choice of the weighting function $\mu$. Using Mercer's theorem, we can represent the covariance kernel $\Sigma(\theta_1, \theta_2)$ as

$$\Sigma(\theta_1, \theta_2) = \sum_{j=1}^{\infty} \lambda_j\, e_j(\theta_1)\, e_j(\theta_2),$$

where $\lambda_j$ is an eigenvalue of the covariance kernel and $e_j(\cdot)$ is the corresponding eigenfunction such that $\int_{\Theta}\Sigma(\theta_1, \theta_2)\, e_j(\theta_1)\, d\mu(\theta_1) = \lambda_j\, e_j(\theta_2)$. $\{e_j(\cdot)\}$ also forms a set of complete orthonormal bases in $L^2(\Theta)$. Boning and Sowell (1999) show that choosing $\mu$ to be the uniform density delivers a test with the greatest weighted average local power against the set of local alternatives whose limiting local mean functions $\{\Delta_\varphi(\theta)\}$ assign weights $\lambda_j$ to $e_j(\cdot)$. This provides some theoretical justification of the uniform weighting, which is used in our simulation study.

4.3 Calculating the Asymptotic Critical Values

Under the null, $M_n$ has a limiting distribution given by a functional of a zero mean Gaussian process whose covariance function depends on the DGP. The asymptotic critical values thus depend on the DGP and cannot be tabulated. One could follow Bierens and Ploberger (1997) and obtain upper bounds for the asymptotic critical values. Here, we use the conditional Monte Carlo approach suggested by Hansen (1996) to simulate the asymptotic null distribution. To apply this approach, we construct a process $T_n^*(\theta)$ which follows the desired zero mean Gaussian process conditional on $\{W_i\}$. The desired conditional covariance function for $T_n^*$ is

$$\mathrm{cov}\big[T_n^*(\theta_1), T_n^*(\theta_2) \mid \{W_i\}_{i=1}^{n}\big] = \frac{4}{n}\sum_{i=1}^{n}\hat\psi_{h,\theta_1}(W_i; \theta_1)\, \hat\psi_{h,\theta_2}(W_i; \theta_2) \equiv \hat\Sigma(\theta_1, \theta_2),$$

where $\hat\psi_{h,\theta}(W_i; \theta) = (n-1)^{-1}\sum_{j=1, j\neq i}^{n}\psi_{h,\theta}(W_i, W_j; \theta)$. It is straightforward to show that under Assumptions 1-5 and the null hypothesis, $\hat\Sigma(\theta_1, \theta_2) \stackrel{p}{\to} \Sigma(\theta_1, \theta_2)$. A typical $T_n^*(\theta)$ is constructed by generating $\{V_i\}_{i=1}^{n}$ as IID standard normal random variables independent of $\{W_i\}$ and setting

$$T_n^*(\theta) = \frac{2}{\sqrt{n}}\sum_{i=1}^{n}\hat\psi_{h,\theta}(W_i; \theta)\, V_i. \qquad (33)$$

Following arguments similar to the proof of Theorem 2 in Hansen (1996), we can show that under the null hypothesis,

$$M_n^* = \int_{\Theta}[T_n^*(\theta)]^2\, d\mu(\theta) \stackrel{d}{\to} \int_{\Theta}[\nu(\theta)]^2\, d\mu(\theta),$$

provided that Assumptions 1-5 hold. Simulation results show that the empirical pdf's of $M_n$ and $M_n^*$ are fairly close. To save space, we do not report the results here, but they are available in Huang (2009).

To approximate the distribution of $M_n$, we follow the steps below: (i) generate $\{V_{ib}\}_{i=1}^{n}$ IID $N(0, 1)$ random variables; (ii) set

$$T_{n,b}^*(\theta) = \frac{2}{\sqrt{n}}\sum_{i=1}^{n}\hat\psi_{h,\theta}(W_i; \theta)\, V_{ib};$$

(iii) set $M_{n,b}^* = M[T_{n,b}^*(\cdot)] = \int_{\Theta}[T_{n,b}^*(\theta)]^2\, d\mu(\theta)$. Repeating these steps for $b = 1, \ldots, B$ gives a simulated sample $(M_{n,1}^*, \ldots, M_{n,B}^*)$, whose empirical distribution should be close to the true distribution of the actual test statistic $M_n$ under the null. Then we can compute the proportion of simulated values that exceed $M_n$ to get the simulated asymptotic p-value. We reject the null hypothesis if the simulated p-value lies below the specified level for the test. As Hansen (1996) points out, $B$ is under the control of the econometrician and can be chosen sufficiently large to obtain a good approximation.
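The following sketch puts steps (i)-(iii) together with the computation of $M_n$ itself. It assumes a uniform $\mu$ on a hypercube $\Theta$ approximated by $S$ random draws, IID normal multipliers, and the `_pairwise_terms` and `plugin_bandwidth` helpers from the bandwidth sketch above; the function name, the hypercube $[-1,1]^{1+d}$, and the default values of $S$ and $B$ are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def icm_test(X, Y, Z, S=100, B=500):
    """Compute M_n and its conditional Monte Carlo p-value (steps (i)-(iii))."""
    n, d_z = Z.shape
    d_x, d_y = X.shape[1], Y.shape[1]
    thetas = rng.uniform(-1.0, 1.0, size=(S, 1 + d_x + d_y + d_z))  # draws from mu
    T_n = np.empty(S)
    psi_hat = np.empty((S, n))
    for s, th in enumerate(thetas):
        theta0, t_x, t_y, t_z = th[0], th[1:1 + d_x], th[1 + d_x:1 + d_x + d_y], th[-d_z:]
        h = plugin_bandwidth(X, Y, Z, theta0, t_x, t_y, t_z)   # data-driven bandwidth per theta
        A = _pairwise_terms(X, Y, Z, theta0, t_x, t_y, t_z, h)
        psi = 0.5 * (A + A.T)                                  # symmetrized kernel psi_{h,theta}
        psi_hat[s] = psi.sum(axis=1) / (n - 1)                 # psi_hat_{h,theta}(W_i; theta)
        T_n[s] = np.sqrt(n) * A.sum() / (n * (n - 1))          # sqrt(n) * Lambda_hat(theta_s)
    M_n = (T_n ** 2).mean()                                    # Monte Carlo approximation of the integral
    V = rng.standard_normal((B, n))                            # multipliers V_{ib}
    T_star = (2.0 / np.sqrt(n)) * V @ psi_hat.T                # T*_{n,b}(theta_s), shape (B, S)
    M_star = (T_star ** 2).mean(axis=1)                        # M*_{n,b}
    p_value = (M_star >= M_n).mean()                           # simulated asymptotic p-value
    return M_n, p_value
```

Because the multipliers enter only through the inner product with the fixed array `psi_hat`, all $B$ replications reuse the same kernel computations, so the simulation step adds little cost beyond the evaluation of $\hat\Lambda_{n,h}(\theta_s)$ itself.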

5 Monte Carlo Experiments

In this section, we perform some Monte Carlo simulation experiments to examine the finite sample performance of our conditional independence test. For all simulations, we generate IID $\{(X_i, Y_i, Z_i)\}$. We choose $\varphi(\cdot)$ to be the standard normal PDF, and $k(u)$ to be the sixth-order Gaussian kernel ($q = 6$). A large number of replications is used for each experiment and for simulating $M_n^*$.

5.1 Level and Power Studies

We consider three data generating processes. Under DGP 1, the sample $\{(X_i, Y_i, Z_i)\}$ is generated according to

$$Y_i = \delta X_i + Z_i + \varepsilon_{Y,i}, \qquad X_i = Z_i + \varepsilon_{X,i},$$

where $(\varepsilon_{X,i}, \varepsilon_{Y,i})' \sim N(0, I_2)$ is independent of $Z_i$, and $Z_i \sim N(0, 3)$. When $\delta = 0$, the null is true; otherwise the alternative holds.

DGP 2 is a modification of DGP 1 that focuses on the consequences of fat-tailed distributions. Under DGP 2, $\varepsilon_X$ and $\varepsilon_Y$ are proportional to the Student $t$ with 3 degrees of freedom: $\varepsilon_X \sim t_3$, $\varepsilon_Y \sim t_3$, $\varepsilon_X \perp \varepsilon_Y$.

DGP 3 is another modification of DGP 1. Under this DGP, we allow skewness, choosing both $\varepsilon_X$ and $\varepsilon_Y$ to be centered chi-square random variables: $\varepsilon_X$ and $\varepsilon_Y$ are independent $\chi^2$ variates recentered to have mean zero.

We transform each variable so that its range is comparable to the support of the GCR function $\varphi(\cdot)$. For the standard normal PDF, the support is the real line, but the function is effectively zero outside the interval $[-4, 4]$. We transform each variable to be supported on this interval. This can be achieved by using the transformation below:

$$X_i \rightarrow \frac{8}{\pi}\arctan\!\left(\frac{X_i - \bar X}{\hat\sigma_X}\right), \qquad \hat\sigma_X = \left(\frac{1}{n-1}\sum_{j=1}^{n}(X_j - \bar X)^2\right)^{1/2}.$$

We transform $Y_i$ and $Z_i$ analogously. The conditional independence test is then applied to the transformed data. We ignore the boundary bias here, as our suggested bias-removing transformation does not give rise to qualitatively different results.

Although any compact $\Theta$ with a non-empty interior can be used, we take $\Theta = [-1, 1]^4$. This choice ensures that $\{\tilde W_i'\theta\}$ can take any value in the effective support of $\varphi(\cdot)$. To compute the ICM statistic $M_n$, we need to compute the integral $\int_{\Theta}[T_n(\theta)]^2\, d\mu(\theta)$, where $\mu$ is the uniform distribution. In the absence of a closed-form expression, we recommend using the Monte Carlo integration method. For each simulation replication, we choose $\theta_1, \ldots, \theta_S$ randomly from the uniform distribution on $\Theta$ and approximate the integral by the average $S^{-1}\sum_{s=1}^{S}[T_n(\theta_s)]^2$. We have also tried using more random draws, but the results are effectively the same. Note that $T_n(\theta_s)$ depends on the bandwidth parameter $h$. In our simulation experiments, we employ the data-driven bandwidth $\hat h(\theta_s)$ in (27) with $h_0 = n^{-1/[3(q + d_Z)]}$ and $\gamma = 0.5$. We use different bandwidths for different $\theta$'s. Given the bandwidth $\hat h(\theta_s)$, we compute the statistic $T_n(\theta_s)$ as $T_n(\theta_s) = \sqrt{n}\,\hat\Lambda_{n,\hat h(\theta_s)}(\theta_s)$. The average of $[T_n(\theta_s)]^2$ gives us the ICM statistic $M_n$.

We study the finite sample size and power of the test against conditional mean dependence. We use

$$\rho_{X,Y|Z} = \frac{\mathrm{cov}(X, Y \mid Z)}{\sqrt{\sigma^2_{X|Z}\,\sigma^2_{Y|Z}}}$$

to indicate the strength of the dependence between $X$ and $Y$, conditional on $Z$. Since both $X \mid Z$ and $Y \mid Z$ are normal under DGP 1, in this case $\rho_{X,Y|Z}$ fully captures the dependence between $X$ and $Y$, conditional on $Z$. We plot the power of the tests for $\rho_{X,Y|Z}$ ranging from $-0.9$ to $0.9$. For this, we choose

$$\delta = \frac{\rho_{X,Y|Z}}{\sqrt{1 - \rho^2_{X,Y|Z}}}$$

for $\rho_{X,Y|Z} = -0.9, -0.8, \ldots, 0.9$.

Figures 1-3 report the size and power properties of our GCR test. The size and power look fairly good even for the smaller sample sizes we consider, and they look very good for the larger sample sizes. When the sample size is small, the levels of the tests approach their nominal values from below, delivering conservative tests. When the sample size increases, our tests become fairly accurate in size. The shape and location of the power curves are as expected. The power curves are also close to symmetric, reflecting the equal capacity of our test to detect positive and negative conditional dependence.
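For completeness, here is a minimal sketch of DGP 1 and the range transformation as reconstructed above (the exact constants and error specification should be treated as assumptions); setting $\rho = 0$ produces data satisfying the null.

```python
import numpy as np

rng = np.random.default_rng(123)

def dgp1(n, rho):
    """Generate (X, Y, Z) under DGP 1 with conditional correlation rho."""
    delta = rho / np.sqrt(1.0 - rho ** 2)      # delta chosen so that corr(X, Y | Z) = rho
    Z = rng.normal(0.0, np.sqrt(3.0), n)       # assumes N(0, 3) refers to variance 3
    eX, eY = rng.standard_normal(n), rng.standard_normal(n)
    X = Z + eX
    Y = delta * X + Z + eY
    return X, Y, Z

def to_range(v):
    """Map a sample into (-4, 4) via the standardized arctan transformation."""
    s = (v - v.mean()) / v.std(ddof=1)
    return 8.0 / np.pi * np.arctan(s)

X, Y, Z = dgp1(200, rho=0.0)                   # rho = 0: conditional independence holds
Xt, Yt, Zt = (to_range(v) for v in (X, Y, Z))
```

After this transformation, the test statistic and its simulated p-value can be computed from `(Xt, Yt, Zt)` exactly as in the sketch of Section 4.3, with each variable reshaped to an $(n, 1)$ array.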


More information

Chapter 1. Density Estimation

Chapter 1. Density Estimation Capter 1 Density Estimation Let X 1, X,..., X n be observations from a density f X x. Te aim is to use only tis data to obtain an estimate ˆf X x of f X x. Properties of f f X x x, Parametric metods f

More information

THE IMPLICIT FUNCTION THEOREM

THE IMPLICIT FUNCTION THEOREM THE IMPLICIT FUNCTION THEOREM ALEXANDRU ALEMAN 1. Motivation and statement We want to understand a general situation wic occurs in almost any area wic uses matematics. Suppose we are given number of equations

More information

SECTION 3.2: DERIVATIVE FUNCTIONS and DIFFERENTIABILITY

SECTION 3.2: DERIVATIVE FUNCTIONS and DIFFERENTIABILITY (Section 3.2: Derivative Functions and Differentiability) 3.2.1 SECTION 3.2: DERIVATIVE FUNCTIONS and DIFFERENTIABILITY LEARNING OBJECTIVES Know, understand, and apply te Limit Definition of te Derivative

More information

Order of Accuracy. ũ h u Ch p, (1)

Order of Accuracy. ũ h u Ch p, (1) Order of Accuracy 1 Terminology We consider a numerical approximation of an exact value u. Te approximation depends on a small parameter, wic can be for instance te grid size or time step in a numerical

More information

Applications of the van Trees inequality to non-parametric estimation.

Applications of the van Trees inequality to non-parametric estimation. Brno-06, Lecture 2, 16.05.06 D/Stat/Brno-06/2.tex www.mast.queensu.ca/ blevit/ Applications of te van Trees inequality to non-parametric estimation. Regular non-parametric problems. As an example of suc

More information

A = h w (1) Error Analysis Physics 141

A = h w (1) Error Analysis Physics 141 Introduction In all brances of pysical science and engineering one deals constantly wit numbers wic results more or less directly from experimental observations. Experimental observations always ave inaccuracies.

More information

NUMERICAL DIFFERENTIATION. James T. Smith San Francisco State University. In calculus classes, you compute derivatives algebraically: for example,

NUMERICAL DIFFERENTIATION. James T. Smith San Francisco State University. In calculus classes, you compute derivatives algebraically: for example, NUMERICAL DIFFERENTIATION James T Smit San Francisco State University In calculus classes, you compute derivatives algebraically: for example, f( x) = x + x f ( x) = x x Tis tecnique requires your knowing

More information

Pre-Calculus Review Preemptive Strike

Pre-Calculus Review Preemptive Strike Pre-Calculus Review Preemptive Strike Attaced are some notes and one assignment wit tree parts. Tese are due on te day tat we start te pre-calculus review. I strongly suggest reading troug te notes torougly

More information

MA455 Manifolds Solutions 1 May 2008

MA455 Manifolds Solutions 1 May 2008 MA455 Manifolds Solutions 1 May 2008 1. (i) Given real numbers a < b, find a diffeomorpism (a, b) R. Solution: For example first map (a, b) to (0, π/2) and ten map (0, π/2) diffeomorpically to R using

More information

Math 2921, spring, 2004 Notes, Part 3. April 2 version, changes from March 31 version starting on page 27.. Maps and di erential equations

Math 2921, spring, 2004 Notes, Part 3. April 2 version, changes from March 31 version starting on page 27.. Maps and di erential equations Mat 9, spring, 4 Notes, Part 3. April version, canges from Marc 3 version starting on page 7.. Maps and di erential equations Horsesoe maps and di erential equations Tere are two main tecniques for detecting

More information

Bandwidth Selection in Nonparametric Kernel Testing

Bandwidth Selection in Nonparametric Kernel Testing Te University of Adelaide Scool of Economics Researc Paper No. 2009-0 January 2009 Bandwidt Selection in Nonparametric ernel Testing Jiti Gao and Irene Gijbels Bandwidt Selection in Nonparametric ernel

More information

Basic Nonparametric Estimation Spring 2002

Basic Nonparametric Estimation Spring 2002 Basic Nonparametric Estimation Spring 2002 Te following topics are covered today: Basic Nonparametric Regression. Tere are four books tat you can find reference: Silverman986, Wand and Jones995, Hardle990,

More information

EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS

EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS Statistica Sinica 24 2014, 395-414 doi:ttp://dx.doi.org/10.5705/ss.2012.064 EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS Jun Sao 1,2 and Seng Wang 3 1 East Cina Normal University,

More information

Bootstrap prediction intervals for Markov processes

Bootstrap prediction intervals for Markov processes arxiv: arxiv:0000.0000 Bootstrap prediction intervals for Markov processes Li Pan and Dimitris N. Politis Li Pan Department of Matematics University of California San Diego La Jolla, CA 92093-0112, USA

More information

Sin, Cos and All That

Sin, Cos and All That Sin, Cos and All Tat James K. Peterson Department of Biological Sciences and Department of Matematical Sciences Clemson University Marc 9, 2017 Outline Sin, Cos and all tat! A New Power Rule Derivatives

More information

lecture 26: Richardson extrapolation

lecture 26: Richardson extrapolation 43 lecture 26: Ricardson extrapolation 35 Ricardson extrapolation, Romberg integration Trougout numerical analysis, one encounters procedures tat apply some simple approximation (eg, linear interpolation)

More information

4. The slope of the line 2x 7y = 8 is (a) 2/7 (b) 7/2 (c) 2 (d) 2/7 (e) None of these.

4. The slope of the line 2x 7y = 8 is (a) 2/7 (b) 7/2 (c) 2 (d) 2/7 (e) None of these. Mat 11. Test Form N Fall 016 Name. Instructions. Te first eleven problems are wort points eac. Te last six problems are wort 5 points eac. For te last six problems, you must use relevant metods of algebra

More information

Solution. Solution. f (x) = (cos x)2 cos(2x) 2 sin(2x) 2 cos x ( sin x) (cos x) 4. f (π/4) = ( 2/2) ( 2/2) ( 2/2) ( 2/2) 4.

Solution. Solution. f (x) = (cos x)2 cos(2x) 2 sin(2x) 2 cos x ( sin x) (cos x) 4. f (π/4) = ( 2/2) ( 2/2) ( 2/2) ( 2/2) 4. December 09, 20 Calculus PracticeTest s Name: (4 points) Find te absolute extrema of f(x) = x 3 0 on te interval [0, 4] Te derivative of f(x) is f (x) = 3x 2, wic is zero only at x = 0 Tus we only need

More information

3.4 Worksheet: Proof of the Chain Rule NAME

3.4 Worksheet: Proof of the Chain Rule NAME Mat 1170 3.4 Workseet: Proof of te Cain Rule NAME Te Cain Rule So far we are able to differentiate all types of functions. For example: polynomials, rational, root, and trigonometric functions. We are

More information

Average Rate of Change

Average Rate of Change Te Derivative Tis can be tougt of as an attempt to draw a parallel (pysically and metaporically) between a line and a curve, applying te concept of slope to someting tat isn't actually straigt. Te slope

More information

Copyright c 2008 Kevin Long

Copyright c 2008 Kevin Long Lecture 4 Numerical solution of initial value problems Te metods you ve learned so far ave obtained closed-form solutions to initial value problems. A closedform solution is an explicit algebriac formula

More information

New Distribution Theory for the Estimation of Structural Break Point in Mean

New Distribution Theory for the Estimation of Structural Break Point in Mean New Distribution Teory for te Estimation of Structural Break Point in Mean Liang Jiang Singapore Management University Xiaou Wang Te Cinese University of Hong Kong Jun Yu Singapore Management University

More information

Uniform Convergence Rates for Nonparametric Estimation

Uniform Convergence Rates for Nonparametric Estimation Uniform Convergence Rates for Nonparametric Estimation Bruce E. Hansen University of Wisconsin www.ssc.wisc.edu/~bansen October 2004 Preliminary and Incomplete Abstract Tis paper presents a set of rate

More information

SECTION 1.10: DIFFERENCE QUOTIENTS LEARNING OBJECTIVES

SECTION 1.10: DIFFERENCE QUOTIENTS LEARNING OBJECTIVES (Section.0: Difference Quotients).0. SECTION.0: DIFFERENCE QUOTIENTS LEARNING OBJECTIVES Define average rate of cange (and average velocity) algebraically and grapically. Be able to identify, construct,

More information

Math 161 (33) - Final exam

Math 161 (33) - Final exam Name: Id #: Mat 161 (33) - Final exam Fall Quarter 2015 Wednesday December 9, 2015-10:30am to 12:30am Instructions: Prob. Points Score possible 1 25 2 25 3 25 4 25 TOTAL 75 (BEST 3) Read eac problem carefully.

More information

ALGEBRA AND TRIGONOMETRY REVIEW by Dr TEBOU, FIU. A. Fundamental identities Throughout this section, a and b denotes arbitrary real numbers.

ALGEBRA AND TRIGONOMETRY REVIEW by Dr TEBOU, FIU. A. Fundamental identities Throughout this section, a and b denotes arbitrary real numbers. ALGEBRA AND TRIGONOMETRY REVIEW by Dr TEBOU, FIU A. Fundamental identities Trougout tis section, a and b denotes arbitrary real numbers. i) Square of a sum: (a+b) =a +ab+b ii) Square of a difference: (a-b)

More information

Symmetry Labeling of Molecular Energies

Symmetry Labeling of Molecular Energies Capter 7. Symmetry Labeling of Molecular Energies Notes: Most of te material presented in tis capter is taken from Bunker and Jensen 1998, Cap. 6, and Bunker and Jensen 2005, Cap. 7. 7.1 Hamiltonian Symmetry

More information

Gradient Descent etc.

Gradient Descent etc. 1 Gradient Descent etc EE 13: Networked estimation and control Prof Kan) I DERIVATIVE Consider f : R R x fx) Te derivative is defined as d fx) = lim dx fx + ) fx) Te cain rule states tat if d d f gx) )

More information

Poisson Equation in Sobolev Spaces

Poisson Equation in Sobolev Spaces Poisson Equation in Sobolev Spaces OcMountain Dayligt Time. 6, 011 Today we discuss te Poisson equation in Sobolev spaces. It s existence, uniqueness, and regularity. Weak Solution. u = f in, u = g on

More information

REVIEW LAB ANSWER KEY

REVIEW LAB ANSWER KEY REVIEW LAB ANSWER KEY. Witout using SN, find te derivative of eac of te following (you do not need to simplify your answers): a. f x 3x 3 5x x 6 f x 3 3x 5 x 0 b. g x 4 x x x notice te trick ere! x x g

More information

Combining functions: algebraic methods

Combining functions: algebraic methods Combining functions: algebraic metods Functions can be added, subtracted, multiplied, divided, and raised to a power, just like numbers or algebra expressions. If f(x) = x 2 and g(x) = x + 2, clearly f(x)

More information

Click here to see an animation of the derivative

Click here to see an animation of the derivative Differentiation Massoud Malek Derivative Te concept of derivative is at te core of Calculus; It is a very powerful tool for understanding te beavior of matematical functions. It allows us to optimize functions,

More information

Functions of the Complex Variable z

Functions of the Complex Variable z Capter 2 Functions of te Complex Variable z Introduction We wis to examine te notion of a function of z were z is a complex variable. To be sure, a complex variable can be viewed as noting but a pair of

More information

Local Orthogonal Polynomial Expansion (LOrPE) for Density Estimation

Local Orthogonal Polynomial Expansion (LOrPE) for Density Estimation Local Ortogonal Polynomial Expansion (LOrPE) for Density Estimation Alex Trindade Dept. of Matematics & Statistics, Texas Tec University Igor Volobouev, Texas Tec University (Pysics Dept.) D.P. Amali Dassanayake,

More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics 1

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics 1 Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics 1 By Jiti Gao 2 and Maxwell King 3 Abstract We propose a simultaneous model specification procedure for te conditional

More information

arxiv: v1 [math.pr] 28 Dec 2018

arxiv: v1 [math.pr] 28 Dec 2018 Approximating Sepp s constants for te Slepian process Jack Noonan a, Anatoly Zigljavsky a, a Scool of Matematics, Cardiff University, Cardiff, CF4 4AG, UK arxiv:8.0v [mat.pr] 8 Dec 08 Abstract Slepian

More information

Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity

Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity Songnian Chen a, Xun Lu a, Xianbo Zhou b and Yahong Zhou c a Department of Economics, Hong Kong University

More information

arxiv:math/ v1 [math.ca] 1 Oct 2003

arxiv:math/ v1 [math.ca] 1 Oct 2003 arxiv:mat/0310017v1 [mat.ca] 1 Oct 2003 Cange of Variable for Multi-dimensional Integral 4 Marc 2003 Isidore Fleiscer Abstract Te cange of variable teorem is proved under te sole ypotesis of differentiability

More information

Efficient algorithms for for clone items detection

Efficient algorithms for for clone items detection Efficient algoritms for for clone items detection Raoul Medina, Caroline Noyer, and Olivier Raynaud Raoul Medina, Caroline Noyer and Olivier Raynaud LIMOS - Université Blaise Pascal, Campus universitaire

More information

Kernel Density Estimation

Kernel Density Estimation Kernel Density Estimation Univariate Density Estimation Suppose tat we ave a random sample of data X 1,..., X n from an unknown continuous distribution wit probability density function (pdf) f(x) and cumulative

More information

Preface. Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Preface. Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed. Preface Here are my online notes for my course tat I teac ere at Lamar University. Despite te fact tat tese are my class notes, tey sould be accessible to anyone wanting to learn or needing a refreser

More information

An L p di erentiable non-di erentiable function

An L p di erentiable non-di erentiable function An L di erentiable non-di erentiable function J. Marsall As Abstract. Tere is a a set E of ositive Lebesgue measure and a function nowere di erentiable on E wic is di erentible in te L sense for every

More information

A.P. CALCULUS (AB) Outline Chapter 3 (Derivatives)

A.P. CALCULUS (AB) Outline Chapter 3 (Derivatives) A.P. CALCULUS (AB) Outline Capter 3 (Derivatives) NAME Date Previously in Capter 2 we determined te slope of a tangent line to a curve at a point as te limit of te slopes of secant lines using tat point

More information

New families of estimators and test statistics in log-linear models

New families of estimators and test statistics in log-linear models Journal of Multivariate Analysis 99 008 1590 1609 www.elsevier.com/locate/jmva ew families of estimators and test statistics in log-linear models irian Martín a,, Leandro Pardo b a Department of Statistics

More information

INFINITE ORDER CROSS-VALIDATED LOCAL POLYNOMIAL REGRESSION. 1. Introduction

INFINITE ORDER CROSS-VALIDATED LOCAL POLYNOMIAL REGRESSION. 1. Introduction INFINITE ORDER CROSS-VALIDATED LOCAL POLYNOMIAL REGRESSION PETER G. HALL AND JEFFREY S. RACINE Abstract. Many practical problems require nonparametric estimates of regression functions, and local polynomial

More information

1 The concept of limits (p.217 p.229, p.242 p.249, p.255 p.256) 1.1 Limits Consider the function determined by the formula 3. x since at this point

1 The concept of limits (p.217 p.229, p.242 p.249, p.255 p.256) 1.1 Limits Consider the function determined by the formula 3. x since at this point MA00 Capter 6 Calculus and Basic Linear Algebra I Limits, Continuity and Differentiability Te concept of its (p.7 p.9, p.4 p.49, p.55 p.56). Limits Consider te function determined by te formula f Note

More information

The derivative function

The derivative function Roberto s Notes on Differential Calculus Capter : Definition of derivative Section Te derivative function Wat you need to know already: f is at a point on its grap and ow to compute it. Wat te derivative

More information

2.1 THE DEFINITION OF DERIVATIVE

2.1 THE DEFINITION OF DERIVATIVE 2.1 Te Derivative Contemporary Calculus 2.1 THE DEFINITION OF DERIVATIVE 1 Te grapical idea of a slope of a tangent line is very useful, but for some uses we need a more algebraic definition of te derivative

More information

Artificial Neural Network Model Based Estimation of Finite Population Total

Artificial Neural Network Model Based Estimation of Finite Population Total International Journal of Science and Researc (IJSR), India Online ISSN: 2319-7064 Artificial Neural Network Model Based Estimation of Finite Population Total Robert Kasisi 1, Romanus O. Odiambo 2, Antony

More information

Math 31A Discussion Notes Week 4 October 20 and October 22, 2015

Math 31A Discussion Notes Week 4 October 20 and October 22, 2015 Mat 3A Discussion Notes Week 4 October 20 and October 22, 205 To prepare for te first midterm, we ll spend tis week working eamples resembling te various problems you ve seen so far tis term. In tese notes

More information

5 Ordinary Differential Equations: Finite Difference Methods for Boundary Problems

5 Ordinary Differential Equations: Finite Difference Methods for Boundary Problems 5 Ordinary Differential Equations: Finite Difference Metods for Boundary Problems Read sections 10.1, 10.2, 10.4 Review questions 10.1 10.4, 10.8 10.9, 10.13 5.1 Introduction In te previous capters we

More information

DEPARTMENT MATHEMATIK SCHWERPUNKT MATHEMATISCHE STATISTIK UND STOCHASTISCHE PROZESSE

DEPARTMENT MATHEMATIK SCHWERPUNKT MATHEMATISCHE STATISTIK UND STOCHASTISCHE PROZESSE U N I V E R S I T Ä T H A M B U R G A note on residual-based empirical likeliood kernel density estimation Birte Musal and Natalie Neumeyer Preprint No. 2010-05 May 2010 DEPARTMENT MATHEMATIK SCHWERPUNKT

More information

Continuous Stochastic Processes

Continuous Stochastic Processes Continuous Stocastic Processes Te term stocastic is often applied to penomena tat vary in time, wile te word random is reserved for penomena tat vary in space. Apart from tis distinction, te modelling

More information

Quaternion Dynamics, Part 1 Functions, Derivatives, and Integrals. Gary D. Simpson. rev 01 Aug 08, 2016.

Quaternion Dynamics, Part 1 Functions, Derivatives, and Integrals. Gary D. Simpson. rev 01 Aug 08, 2016. Quaternion Dynamics, Part 1 Functions, Derivatives, and Integrals Gary D. Simpson gsim1887@aol.com rev 1 Aug 8, 216 Summary Definitions are presented for "quaternion functions" of a quaternion. Polynomial

More information

Boosting Kernel Density Estimates: a Bias Reduction. Technique?

Boosting Kernel Density Estimates: a Bias Reduction. Technique? Boosting Kernel Density Estimates: a Bias Reduction Tecnique? Marco Di Marzio Dipartimento di Metodi Quantitativi e Teoria Economica, Università di Cieti-Pescara, Viale Pindaro 42, 65127 Pescara, Italy

More information

ERROR BOUNDS FOR THE METHODS OF GLIMM, GODUNOV AND LEVEQUE BRADLEY J. LUCIER*

ERROR BOUNDS FOR THE METHODS OF GLIMM, GODUNOV AND LEVEQUE BRADLEY J. LUCIER* EO BOUNDS FO THE METHODS OF GLIMM, GODUNOV AND LEVEQUE BADLEY J. LUCIE* Abstract. Te expected error in L ) attimet for Glimm s sceme wen applied to a scalar conservation law is bounded by + 2 ) ) /2 T

More information

A MONTE CARLO ANALYSIS OF THE EFFECTS OF COVARIANCE ON PROPAGATED UNCERTAINTIES

A MONTE CARLO ANALYSIS OF THE EFFECTS OF COVARIANCE ON PROPAGATED UNCERTAINTIES A MONTE CARLO ANALYSIS OF THE EFFECTS OF COVARIANCE ON PROPAGATED UNCERTAINTIES Ronald Ainswort Hart Scientific, American Fork UT, USA ABSTRACT Reports of calibration typically provide total combined uncertainties

More information

. If lim. x 2 x 1. f(x+h) f(x)

. If lim. x 2 x 1. f(x+h) f(x) Review of Differential Calculus Wen te value of one variable y is uniquely determined by te value of anoter variable x, ten te relationsip between x and y is described by a function f tat assigns a value

More information

Kernel Smoothing and Tolerance Intervals for Hierarchical Data

Kernel Smoothing and Tolerance Intervals for Hierarchical Data Clemson University TigerPrints All Dissertations Dissertations 12-2016 Kernel Smooting and Tolerance Intervals for Hierarcical Data Cristoper Wilson Clemson University, cwilso6@clemson.edu Follow tis and

More information

Exam 1 Review Solutions

Exam 1 Review Solutions Exam Review Solutions Please also review te old quizzes, and be sure tat you understand te omework problems. General notes: () Always give an algebraic reason for your answer (graps are not sufficient),

More information

ch (for some fixed positive number c) reaching c

ch (for some fixed positive number c) reaching c GSTF Journal of Matematics Statistics and Operations Researc (JMSOR) Vol. No. September 05 DOI 0.60/s4086-05-000-z Nonlinear Piecewise-defined Difference Equations wit Reciprocal and Cubic Terms Ramadan

More information

Material for Difference Quotient

Material for Difference Quotient Material for Difference Quotient Prepared by Stepanie Quintal, graduate student and Marvin Stick, professor Dept. of Matematical Sciences, UMass Lowell Summer 05 Preface Te following difference quotient

More information

Topics in Generalized Differentiation

Topics in Generalized Differentiation Topics in Generalized Differentiation J. Marsall As Abstract Te course will be built around tree topics: ) Prove te almost everywere equivalence of te L p n-t symmetric quantum derivative and te L p Peano

More information

Fast optimal bandwidth selection for kernel density estimation

Fast optimal bandwidth selection for kernel density estimation Fast optimal bandwidt selection for kernel density estimation Vikas Candrakant Raykar and Ramani Duraiswami Dept of computer science and UMIACS, University of Maryland, CollegePark {vikas,ramani}@csumdedu

More information

1. Questions (a) through (e) refer to the graph of the function f given below. (A) 0 (B) 1 (C) 2 (D) 4 (E) does not exist

1. Questions (a) through (e) refer to the graph of the function f given below. (A) 0 (B) 1 (C) 2 (D) 4 (E) does not exist Mat 1120 Calculus Test 2. October 18, 2001 Your name Te multiple coice problems count 4 points eac. In te multiple coice section, circle te correct coice (or coices). You must sow your work on te oter

More information

Function Composition and Chain Rules

Function Composition and Chain Rules Function Composition and s James K. Peterson Department of Biological Sciences and Department of Matematical Sciences Clemson University Marc 8, 2017 Outline 1 Function Composition and Continuity 2 Function

More information

2.8 The Derivative as a Function

2.8 The Derivative as a Function .8 Te Derivative as a Function Typically, we can find te derivative of a function f at many points of its domain: Definition. Suppose tat f is a function wic is differentiable at every point of an open

More information

Simple Estimators for Semiparametric Multinomial Choice Models

Simple Estimators for Semiparametric Multinomial Choice Models Simple Estimators for Semiparametric Multinomial Choice Models James L. Powell and Paul A. Ruud University of California, Berkeley March 2008 Preliminary and Incomplete Comments Welcome Abstract This paper

More information

LIMITATIONS OF EULER S METHOD FOR NUMERICAL INTEGRATION

LIMITATIONS OF EULER S METHOD FOR NUMERICAL INTEGRATION LIMITATIONS OF EULER S METHOD FOR NUMERICAL INTEGRATION LAURA EVANS.. Introduction Not all differential equations can be explicitly solved for y. Tis can be problematic if we need to know te value of y

More information

Math 102 TEST CHAPTERS 3 & 4 Solutions & Comments Fall 2006

Math 102 TEST CHAPTERS 3 & 4 Solutions & Comments Fall 2006 Mat 102 TEST CHAPTERS 3 & 4 Solutions & Comments Fall 2006 f(x+) f(x) 10 1. For f(x) = x 2 + 2x 5, find ))))))))) and simplify completely. NOTE: **f(x+) is NOT f(x)+! f(x+) f(x) (x+) 2 + 2(x+) 5 ( x 2

More information

Lecture 21. Numerical differentiation. f ( x+h) f ( x) h h

Lecture 21. Numerical differentiation. f ( x+h) f ( x) h h Lecture Numerical differentiation Introduction We can analytically calculate te derivative of any elementary function, so tere migt seem to be no motivation for calculating derivatives numerically. However

More information

Journal of Computational and Applied Mathematics

Journal of Computational and Applied Mathematics Journal of Computational and Applied Matematics 94 (6) 75 96 Contents lists available at ScienceDirect Journal of Computational and Applied Matematics journal omepage: www.elsevier.com/locate/cam Smootness-Increasing

More information

EFFICIENT REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLING

EFFICIENT REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLING Statistica Sinica 13(2003), 641-653 EFFICIENT REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLING J. K. Kim and R. R. Sitter Hankuk University of Foreign Studies and Simon Fraser University Abstract:

More information

LECTURE 12 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT

LECTURE 12 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT MARCH 29, 26 LECTURE 2 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT (Davidson (2), Chapter 4; Phillips Lectures on Unit Roots, Cointegration and Nonstationarity; White (999), Chapter 7) Unit root processes

More information

AMS 147 Computational Methods and Applications Lecture 09 Copyright by Hongyun Wang, UCSC. Exact value. Effect of round-off error.

AMS 147 Computational Methods and Applications Lecture 09 Copyright by Hongyun Wang, UCSC. Exact value. Effect of round-off error. Lecture 09 Copyrigt by Hongyun Wang, UCSC Recap: Te total error in numerical differentiation fl( f ( x + fl( f ( x E T ( = f ( x Numerical result from a computer Exact value = e + f x+ Discretization error

More information

Simple Estimators for Monotone Index Models

Simple Estimators for Monotone Index Models Simple Estimators for Monotone Index Models Hyungtaik Ahn Dongguk University, Hidehiko Ichimura University College London, James L. Powell University of California, Berkeley (powell@econ.berkeley.edu)

More information

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA EXAMINATION MODULE 5

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA EXAMINATION MODULE 5 THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA EXAMINATION NEW MODULAR SCHEME introduced from te examinations in 009 MODULE 5 SOLUTIONS FOR SPECIMEN PAPER B THE QUESTIONS ARE CONTAINED IN A SEPARATE FILE

More information