Adaptive Estimation of the Regression Discontinuity Model


Yixiao Sun
Department of Economics
University of California, San Diego
La Jolla, CA

February 2005

Abstract

In order to reduce the finite sample bias and improve the rate of convergence, local polynomial estimators have been introduced into the econometric literature to estimate the regression discontinuity model. In this paper, we show that, when the degree of smoothness is known, the local polynomial estimator achieves the optimal rate of convergence within the Hölder smoothness class. However, when the degree of smoothness is not known, the local polynomial estimator may actually inflate the finite sample bias and reduce the rate of convergence. We propose an adaptive version of the local polynomial estimator that selects both the bandwidth and the polynomial order adaptively, and show that the adaptive estimator achieves the optimal rate of convergence up to a logarithmic factor without knowing the degree of smoothness. Simulation results show that the finite sample performance of the locally cross-validated adaptive estimator is robust across parameter combinations and data generating processes, reflecting the adaptive nature of the estimator. The root mean squared error of the adaptive estimator compares favorably with that of local polynomial estimators in the Monte Carlo experiments.

Keywords: adaptive estimator, local cross validation, local polynomial, minimax rate, optimal bandwidth, optimal smoothness parameter

JEL Classification Numbers: C13, C14

1 Introduction

In this paper, we consider the regression discontinuity model:
$$y = m(x) + \alpha d + \varepsilon, \qquad (1)$$
where $m(x)$ is a continuous function of $x$, $d = 1\{x \ge x_0\}$, and $E(\varepsilon \mid x, d) = 0$. Such a model has been used in the empirical literature to identify the treatment effect when there is a discontinuity in the treatment assignment. A partial list of examples includes Angrist and Lavy (1999), Black (1999), Battistin and Rettore (2002), Van der Klaauw (2002), DiNardo and Lee (2004), and Chay and Greenstone (2005). Given the iid data $\{x_i, y_i\}_{i=1}^n$, our objective is to develop a good estimator of $\alpha$, the treatment effect at a known cut-off point $x_0$. In order to maintain generality of the response pattern, we do not impose a specific functional form on $m(x)$. Instead, we take $m(x)$ to belong to a family that is characterized by regularity conditions near the cut-off point. This is a semiparametric approach to estimating the regression discontinuity model.

Semiparametric estimation of the regression discontinuity model is closely related to the estimation of a conditional expectation at a boundary point. In both settings, the widely used Nadaraya-Watson (NW) estimator has a large finite sample bias and a slow rate of convergence. To reduce the finite sample bias and improve the rate of convergence, Hahn, Todd and Van der Klaauw (2001) and Porter (2003) propose using a linear function or a polynomial to approximate $m(x)$ in a small neighborhood of the cut-off point. Porter (2003) obtains the optimal rate of convergence using Stone's (1980) criterion and shows that the local polynomial estimator achieves the optimal rate when the degree of smoothness of $m(x)$ is known. In this paper, we show that the local polynomial estimator with the asymptotic-MSE-optimal bandwidth may actually inflate the finite sample bias and reduce the rate of convergence when the degree of smoothness of $m(x)$ is not known. In particular, this will happen if the order of the local polynomial is too large relative to the degree of smoothness.
Hence, a drawback of the local polynomial estimator is that the optimal rate of convergence cannot be achieved, because the optimal bandwidth depends on the unknown degree of smoothness. This calls for an estimator that adapts to the unknown smoothness. We require the estimator to be adaptive not just at a fixed model, but also at a sequence of models near it. The adaptive rate refers not just to pointwise convergence, but rather to convergence uniformly over models that are very close to some particular model of interest.

The problem of adaptive estimation of a nonparametric function from noisy data has been studied in a number of papers, including Lepski (1990, 1991, 1992), Donoho and Johnstone (1995), Birgé and Massart (1997) and the references cited therein. Various approaches have been proposed, among which Lepski's method has been widely used in the statistical literature; see, for example, Lepski and Spokoiny (1997), Lepski, Mammen and Spokoiny (1997) and Spokoiny (2000). These papers study adaptive bandwidth choice in local constant or linear regression for estimating the drift function in a Gaussian white noise model or a nonparametric diffusion model. More specifically, Lepski and Spokoiny (1997) work with the Gaussian white noise model and consider pointwise estimation using a kernel method with the Hölder smoothness class, assuming that the order of smoothness is less than 2. Lepski, Mammen and Spokoiny (1997) extend the pointwise estimation to global estimation using a higher-order kernel method with the Besov class. In addition, Lepski's method has been used in several papers on semiparametric estimation of long memory in the time series literature, including Giraitis, Robinson and Samarov (2000), Hurvich, Moulines and Soulier (2002), Iouditsky, Moulines and Soulier (2002), Andrews and Sun (2004) and Guggenberger and Sun (2004).

In this paper, we use Lepski's method to construct a rate-adaptive estimator of the regression discontinuity model. In doing so, we extend Lepski's method in several important ways. First, we consider local polynomial estimators instead of kernel estimators. The estimation of the regression discontinuity model is similar to the estimation of a conditional expectation at the boundary, and it is well known that local polynomial estimators have some optimality properties for the boundary estimation problem. Second, a direct application of Lepski's approach to the present framework involves using a polynomial of a pre-specified order and comparing local polynomial estimators with different bandwidths. More specifically, one has to first choose the order of the polynomial to be larger than the upper bound $\bar{s}$ of the smoothness parameter. Such a strategy is not optimal.
If the underlying smoothness parameter $s$ is less than $\bar{s}$, then it is better to use a polynomial of order $\lfloor s \rfloor$, the largest integer strictly smaller than $s$. Using a polynomial of a higher order will only inflate the asymptotic variance without the benefit of bias reduction. In contrast, our adaptive method chooses both the bandwidth and the order of the polynomial adaptively. The polynomial chosen by the adaptive estimator is indeed of order $\lfloor s \rfloor$. Third, our adaptive rule does not use the lower and upper bounds for $s$, while the adaptive rule in Lepski (1990) uses them explicitly. In consequence, the rate of convergence of our adaptive estimator can be arbitrarily close to the parametric rate in the infinitely smooth case, while that of Lepski's estimator is capped by the upper bound $\bar{s}$. This advantage of our adaptive estimator is partly due to the use of the zero-one loss rather than the squared-error loss. Results for the zero-one loss are sufficient to obtain the optimal rate of convergence, which is the item of greatest interest here. Finally, one drawback of Lepski's approach is that there are constants in the adaptive procedure that are arbitrary. This is true for other adaptive procedures as well, although some procedures may fix their constants at certain ad hoc values and seemingly remove the need to choose any constant. In this paper, we propose using local cross validation to select the constants and provide a practical strategy to implement the adaptive estimator.

We compare the root mean-squared error (RMSE) performance of the adaptive estimator with that of the local constant, local linear, local quadratic and local cubic estimators. We consider three groups of models with different response functions $m(x)$. In the first group, $m(x)$ is the sum of a third-order polynomial and a term containing $(x - x_0)^{s_0}$ for some non-integer $s_0$. Response functions in this group are designed to have finite smoothness $s_0$; by choosing different values of $s_0$, we obtain response functions with different degrees of smoothness. The second group is the same as the first group except that $m(x)$ is perturbed by an additive sine function so that the response function has a finer structure. For the third group, we take $m(x)$ to be a constant, linear, quadratic or cubic function. This group is designed to give each of the local polynomial estimators its best advantage. The Monte Carlo results show that the RMSE performance of the adaptive estimator is very robust to the data generating process, reflecting its adaptive nature. Its RMSE is either the lowest or among the three lowest for all the parameter combinations and data generating processes considered. In contrast, a local polynomial estimator may perform very well in some scenarios but disastrously in others. The best estimator in an overall sense is the adaptive estimator.
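The first-group design can be sketched in a few lines of Python. This is a minimal illustration only: the cutoff, polynomial coefficients, noise level and the value of $s_0$ below are placeholders, not the parameter values used in the paper's experiments.

```python
import numpy as np

def simulate_rd(n, alpha=1.0, s0=2.5, x0=0.0, sigma=0.5, rng=None):
    """Simulate one draw from a first-group-style design: m(x) is a cubic
    polynomial plus a |x - x0|**s0 term, so m has finite smoothness s0 at
    the cutoff.  All coefficients are illustrative placeholders."""
    rng = np.random.default_rng(rng)
    x = rng.uniform(x0 - 1.0, x0 + 1.0, size=n)
    d = (x >= x0).astype(float)                      # treatment indicator 1{x >= x0}
    m = (1.0 + 0.5 * (x - x0) - 0.3 * (x - x0) ** 2
         + 0.1 * (x - x0) ** 3 + 0.8 * np.abs(x - x0) ** s0)
    y = m + alpha * d + sigma * rng.standard_normal(n)
    return x, y, d

x, y, d = simulate_rd(500, rng=0)
```

Varying `s0` over non-integer values then traces out designs with different degrees of smoothness at the cutoff, which is the dimension along which the estimators are compared.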
The rest of the paper is organized as follows. Section 2 reviews the local polynomial estimator and examines its asymptotic properties when the order of the polynomial is larger than the underlying smoothness. Section 3 establishes the optimal rate of convergence within the Hölder smoothness class and shows that the local polynomial estimator achieves the optimal rate when the degree of smoothness is known. Section 4 introduces the adaptive local polynomial estimator. It is shown that the adaptive estimator achieves, up to a logarithmic factor, the optimal rate for known smoothness even when the smoothness is not known. For a given response function $m(x)$, it is also shown that the adaptive procedure provides a consistent estimator of the smoothness index defined in that section. The subsequent section contains the simulation results that compare the finite sample performance of the adaptive estimator with those of the local polynomial estimators. Proofs and additional

technical results are given in the Appendix.

Throughout the paper, $1\{\cdot\}$ is the indicator function and $\|\cdot\|$ signifies the Euclidean norm. $C$ is a generic constant that may differ across different lines.

2 Local Polynomial Estimation

Consider the regression discontinuity model $y = m(x) + \alpha d + \varepsilon$, where $m(x)$ is an unknown function of $x$, $E(\varepsilon \mid x, d) = 0$ and $d = 1\{x \ge x_0\}$. Given the iid data $(x_i, y_i)$, $i = 1, 2, \dots, n$, our objective is to estimate $\alpha$ without assuming the functional form of $m(\cdot)$. However, it is necessary to assume that $m(x)$ belongs to some smoothness class.

Definition 1 Let $s = \ell + \gamma$ where $\ell$ is the largest integer strictly less than $s$ and $\gamma \in (0, 1]$. If a function defined on the interval $[x_0, x_0 + \Delta)$, for some fixed $\Delta > 0$, is $\ell$ times differentiable with
$$\sup_{x \in [x_0, x_0 + \Delta)} |m^{(j)}(x)| \le K \quad \text{for } j = 0, 1, 2, 3, \dots, \ell$$
and
$$|m^{(\ell)}(x_1) - m^{(\ell)}(x_2)| \le K |x_1 - x_2|^{\gamma} \quad \text{for } x_1, x_2 \in [x_0, x_0 + \Delta),$$
where $m^{(j)}(x)$ is the $j$-th order derivative and $m^{(j)}(x_0)$ is the $j$-th order right-hand derivative at $x_0$, then we say $m(x)$ is smooth of order $s$ on $[x_0, x_0 + \Delta)$. Denote this class of functions by $M^+(s, \gamma, K)$. Similarly, we define $M^-(s, \gamma, K)$ as the class of functions that satisfy the above two conditions with $[x_0, x_0 + \Delta)$ replaced by $(x_0 - \Delta, x_0]$ and $m^{(j)}(x_0)$ being the left-hand derivative at $x_0$.

Assumption 1: $m(x) \in M(s, \gamma, K)$ where
$$M(s, \gamma, K) := \{m : m \in M^+(s, \gamma, K) \cap M^-(s, \gamma, K) \cap C(x_0 - \Delta, x_0 + \Delta)\}$$
and $C(x_0 - \Delta, x_0 + \Delta)$ is the set of continuous functions on $(x_0 - \Delta, x_0 + \Delta)$.

Assumption 1 allows us to develop an $\ell$-term Taylor expansion of $m(x)$ on each side of $x_0$. Without loss of generality, we focus on $x \ge x_0$, in which case we have
$$m(x) = m(x_0) + \sum_{j=1}^{\ell} b_j^+ (x - x_0)^j + \tilde{e}^+(x), \qquad (2)$$
where $b_j^+ = \frac{1}{j!} \frac{d^j m(x)}{dx^j}\big|_{x = x_0^+}$ is the (normalized) $j$-th order right-hand derivative of $m(x)$ at $x_0$ and
$$\tilde{e}^+(x) = \frac{1}{\ell!}\left[m^{(\ell)}(\tilde{x}) - m^{(\ell)}(x_0)\right](x - x_0)^{\ell} \qquad (3)$$

for some $\tilde{x}$ between $x_0$ and $x$. Under Assumption 1, $\tilde{e}^+(x)$ satisfies
$$|\tilde{e}^+(x)| \le K (\ell!)^{-1} |x - x_0|^{s} \quad \text{for all } x \in [x_0, x_0 + \Delta). \qquad (4)$$
We break up the Taylor expansion into the part that will be captured by the local polynomial regression and the remainder:
$$m(x) = m(x_0) + \sum_{j=1}^{\min(r, \ell)} b_j^+ (x - x_0)^j + R^+(x), \quad x \ge x_0, \qquad (5)$$
where
$$R^+(x) = \sum_{j=\min(r,\ell)+1}^{\ell} b_j^+ (x - x_0)^j + \tilde{e}^+(x) := 1\{\ell \ge r+1\}\, b_{r+1}^+ (x - x_0)^{r+1} + e^+(x), \qquad (6)$$
$$e^+(x) / (x - x_0)^q = O(1) \text{ uniformly over } x \in [x_0, x_0 + \Delta), \qquad (7)$$
and $q = \min\{s, r+2\}$.

Let $b^+(r)$ denote the column $r$-vector whose $j$-th element is $b_j^+$ for $j = 1, 2, \dots, \min(r, \ell)$ and $0$ for $j = \min(r, \ell) + 1, \dots, r$. Let $z_{ir} = (1, (x_i - x_0), \dots, (x_i - x_0)^r)$ be the row $(r+1)$-vector, $\theta_r^+ = (c^+, (b^+(r))')'$ and $c^+ = \alpha + m(x_0)$. Then for $x_i \ge x_0$, we have
$$y_i = z_{ir} \theta_r^+ + R^+(x_i) + \varepsilon_i. \qquad (8)$$
To estimate $\theta_r^+$, we minimize
$$\sum_{i=1}^{n} k_h(x_i - x_0)\, d_i\, (y_i - z_{ir} \theta_r)^2 \qquad (9)$$
with respect to $\theta_r$, where $d_i = 1\{x_i \ge x_0\}$, $k_h(x_i - x_0) = h^{-1} k((x_i - x_0)/h)$ and $h$ is the bandwidth parameter. Let $Y^+$ and $Z_r^+$ be the data matrices that collect the values of $y_i$ and $z_{ir}$, respectively, for the observations with $x_i \ge x_0$. Then (8) can be written in vector form:
$$Y^+ = Z_r^+ \theta_r^+ + R^+ + \varepsilon^+ \qquad (10)$$
and the objective function in (9) becomes
$$(Y^+ - Z_r^+ \theta_r)' W^+ (Y^+ - Z_r^+ \theta_r) \qquad (11)$$
where $W^+ = \mathrm{diag}(\{h k_h(x_i - x_0)\})_{x_i \ge x_0}$. Minimizing the preceding quantity gives
$$\hat{\theta}_r^+ = (\hat{c}_r^+, (\hat{b}^+(r))')' = (Z_r^{+\prime} W^+ Z_r^+)^{-1} (Z_r^{+\prime} W^+ Y^+). \qquad (12)$$

Defining $Y^-$, $Z_r^-$ and $W^-$ analogously using the observations satisfying $x_i < x_0$, we have
$$Y^- = Z_r^- \theta_r^- + R^- + \varepsilon^- \qquad (13)$$
where $\theta_r^- = (c^-, (b^-(r))')'$, $c^- = m(x_0)$ and $b^-(r)$ is similarly defined but with the right-hand derivatives replaced by the left-hand derivatives. Minimizing $(Y^- - Z_r^- \theta_r)' W^- (Y^- - Z_r^- \theta_r)$ with respect to $\theta_r$ gives an estimate of $\theta_r^-$:
$$\hat{\theta}_r^- = (\hat{c}_r^-, (\hat{b}^-(r))')' = (Z_r^{-\prime} W^- Z_r^-)^{-1} (Z_r^{-\prime} W^- Y^-). \qquad (14)$$
The difference between $\hat{c}_r^+$ and $\hat{c}_r^-$ gives an estimate of $\alpha$:
$$\hat{\alpha}_r = \hat{c}_r^+ - \hat{c}_r^-. \qquad (15)$$

To investigate the asymptotic properties of $\hat{\alpha}_r$, we maintain the following two additional assumptions.

Assumption 2: (a) $E(\varepsilon \mid x, d) = 0$. (b) $\sigma^2(x) = E(\varepsilon^2 \mid x)$ is continuous for $x \ne x_0$, and the right- and left-hand limits exist at $x_0$. (c) For some $\delta > 0$, $E(|\varepsilon|^{2+\delta} \mid x)$ is uniformly bounded on $[x_0 - \Delta, x_0 + \Delta]$. (d) The marginal density $f(x)$ of $x$ is continuous on $[x_0 - \Delta, x_0 + \Delta]$.

Assumption 3: The kernel $k(\cdot)$ is even, bounded and has bounded support.

Theorem 1 Let Assumptions 1-3 hold. If $n \to \infty$ and $h \to 0$ such that $nh \to \infty$, then
$$\sqrt{nh}\,(\hat{\alpha}_r - \alpha) - B \Rightarrow N(0, \omega^2 \sigma_r^2)$$
where
$$B = 1\{s > r+1\}\, e_1' \Gamma_r^{-1} \lambda_r \left[b_{r+1}^+ - (-1)^{r+1} b_{r+1}^-\right] \sqrt{nh}\, h^{r+1} (1 + o(1)) + O(h^q \sqrt{nh}),$$
$$\omega^2 = \frac{\sigma_+^2(x_0) + \sigma_-^2(x_0)}{f(x_0)}, \qquad \sigma_r^2 = e_1' \Gamma_r^{-1} V_r \Gamma_r^{-1} e_1,$$
$$\Gamma_r = (\mu_{i+j-2})_{(r+1)\times(r+1)} = \begin{pmatrix} \mu_0 & \cdots & \mu_r \\ \vdots & \ddots & \vdots \\ \mu_r & \cdots & \mu_{2r} \end{pmatrix}, \qquad V_r = (\nu_{i+j-2})_{(r+1)\times(r+1)} = \begin{pmatrix} \nu_0 & \cdots & \nu_r \\ \vdots & \ddots & \vdots \\ \nu_r & \cdots & \nu_{2r} \end{pmatrix},$$
$e_1 = (1, 0, \dots, 0)'$, $\lambda_r = (\mu_{r+1}, \dots, \mu_{2r+1})'$, $\mu_j = \int k(u) u^j\, du$ and $\nu_j = \int k^2(u) u^j\, du$.
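The two one-sided fits in (12) and (14) and the difference (15) are plain kernel-weighted least squares and can be implemented directly. The sketch below is ours, not the paper's code; it fixes the Bartlett kernel as an illustrative choice, and the function names are hypothetical.

```python
import numpy as np

def bartlett(u):
    """Bartlett (triangular) kernel k(u) = (1 - |u|) 1{|u| <= 1}."""
    return np.clip(1.0 - np.abs(u), 0.0, None)

def local_poly_intercept(x, y, x0, h, r, side):
    """One-sided kernel-weighted polynomial regression of order r around x0.
    Returns the fitted intercept: c-hat-plus (side=+1) or c-hat-minus (side=-1)."""
    mask = (x >= x0) if side > 0 else (x < x0)
    xs, ys = x[mask], y[mask]
    w = bartlett((xs - x0) / h)
    keep = w > 0                                   # drop observations outside the window
    xs, ys, w = xs[keep], ys[keep], w[keep]
    Z = np.vander(xs - x0, N=r + 1, increasing=True)   # rows z_ir = (1, x-x0, ..., (x-x0)^r)
    WZ = Z * w[:, None]
    coef = np.linalg.solve(Z.T @ WZ, WZ.T @ ys)        # (Z'WZ)^{-1} Z'Wy
    return coef[0]

def rd_estimate(x, y, x0, h, r):
    """alpha-hat_r = c-hat-plus minus c-hat-minus, as in (15)."""
    return (local_poly_intercept(x, y, x0, h, r, +1)
            - local_poly_intercept(x, y, x0, h, r, -1))
```

On data with a jump of size 2 at $x_0 = 0$ and a smooth $m$, `rd_estimate(x, y, 0.0, h, 1)` recovers the jump up to the bias and noise orders given in Theorem 1.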

Remarks

1. When $s > r+1$, Theorem 1 is the same as Theorem 3(a) in Porter (2003). The proof is straightforward and uses part of Porter's result.

2. If $s > r+1$, the asymptotic bias of $\hat{\alpha}_r$, defined as $B/\sqrt{nh}$, is of order $h^{r+1}$. In contrast, the asymptotic bias of $\hat{\alpha}_0$ is of order $h$. The asymptotic bias of $\hat{\alpha}_r$ for $r \ge 1$ is therefore smaller than that of $\hat{\alpha}_0$ by an order of magnitude, provided that $m(x)$ is smooth of order $s > r+1$.

3. If $s > r+1$, then the asymptotic MSE of $\hat{\alpha}_r$ is
$$\mathrm{AMSE}(\hat{\alpha}_r) = C_1 h^{2r+2} + \frac{C_2}{nh}. \qquad (16)$$
Assuming $C_1 > 0$ and $C_2 > 0$, minimizing $\mathrm{AMSE}(\hat{\alpha}_r)$ over $h$ gives the AMSE-optimal choice of $h$:
$$h^* = \left(\frac{C_2}{(2r+2) C_1}\right)^{1/(2r+3)} n^{-1/(2r+3)}. \qquad (17)$$
For this AMSE-optimal choice of $h$, $\mathrm{AMSE}(\hat{\alpha}_r)$ is proportional to
$$\left(e_1' \Gamma_r^{-1} \lambda_r \lambda_r' \Gamma_r^{-1} e_1\right)^{1/(2r+3)} \left(e_1' \Gamma_r^{-1} V_r \Gamma_r^{-1} e_1\right)^{2(r+1)/(2r+3)} n^{-2(r+1)/(2r+3)}. \qquad (18)$$
So $\hat{\alpha}_r$ converges to $\alpha$ at the rate of $n^{-(r+1)/(2r+3)}$. In particular, $\hat{\alpha}_0$ converges to $\alpha$ at the rate of $n^{-1/3}$. As a consequence, by appropriate choice of $h$, one obtains asymptotic normality of $\hat{\alpha}_r$ with a faster rate of convergence (as a function of the sample size $n$) than is possible with $\hat{\alpha}_0$.

4. When $s > r+1$ and $h = h^*$, the asymptotic mean squared error depends on the kernel only through the quantity
$$T(k) = \left(e_1' \Gamma_r^{-1} \lambda_r \lambda_r' \Gamma_r^{-1} e_1\right) \left(e_1' \Gamma_r^{-1} V_r \Gamma_r^{-1} e_1\right)^{2(r+1)}. \qquad (19)$$
This quantity is the same as the criterion defined in equation (7) in Cheng, Fan and Marron (1997, p. 695). Using their proof without change, we can show that the kernel that minimizes $T(k)$ over the class of kernels
$$\mathcal{K} = \left\{k(x) : k(x) \ge 0,\ \int k(x)\,dx = 1,\ |k(x) - k(y)| \le C|x - y| \text{ for some } C > 0\right\}$$
is simply the Bartlett kernel $k(x) = (1 - |x|)\,1\{|x| \le 1\}$, for all $r$. This is an unusual result because the optimal kernel does not depend on the order of the local polynomial.
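For the Bartlett kernel, the moments entering Theorem 1 and (17)-(19) have closed forms, so the variance and bias constants can be checked numerically. The sketch below uses one-sided moments over $[0, 1]$, the relevant range for boundary estimation when the kernel has support $[-1, 1]$; that integration range is our assumption, since the theorem statement leaves the limits implicit.

```python
import numpy as np

def bartlett_moments(r):
    """One-sided Bartlett-kernel moments over [0, 1] (an assumption here):
    mu_j = int_0^1 (1-u) u^j du = 1/((j+1)(j+2)),
    nu_j = int_0^1 (1-u)^2 u^j du = 2/((j+1)(j+2)(j+3))."""
    mu = np.array([1.0 / ((j + 1) * (j + 2)) for j in range(2 * r + 2)])
    nu = np.array([2.0 / ((j + 1) * (j + 2) * (j + 3)) for j in range(2 * r + 1)])
    return mu, nu

def _gamma_e1(r):
    mu, nu = bartlett_moments(r)
    G = np.array([[mu[i + j] for j in range(r + 1)] for i in range(r + 1)])
    V = np.array([[nu[i + j] for j in range(r + 1)] for i in range(r + 1)])
    e1 = np.zeros(r + 1); e1[0] = 1.0
    return mu, G, V, e1

def variance_constant(r):
    """sigma^2_r = e1' Gamma_r^{-1} V_r Gamma_r^{-1} e1 from Theorem 1."""
    _, G, V, e1 = _gamma_e1(r)
    Ginv = np.linalg.inv(G)
    return e1 @ Ginv @ V @ Ginv @ e1

def bias_constant(r):
    """e1' Gamma_r^{-1} lambda_r, the kernel factor in the bias term B."""
    mu, G, _, e1 = _gamma_e1(r)
    lam = mu[r + 1: 2 * r + 2]
    return e1 @ np.linalg.inv(G) @ lam
```

With this convention, $\sigma_0^2 = 4/3$ and $\sigma_1^2 = 4.8$: the variance constant grows with the polynomial order, which is the cost of the bias reduction discussed in Remark 2.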

5. Consider the case where $s \le r+1$ and $h$ is proportional to the AMSE-optimal rate $n^{-1/(2r+3)}$. For such a configuration, the asymptotic bias dominates the asymptotic variance. The estimator $\hat{\alpha}_r$ converges to the true $\alpha$ at the rate of $n^{-s/(2r+3)}$: the larger $r$ is, the slower the rate of convergence. For example, when $2r + 3 > 3s$, the rate of convergence is slower than $n^{-1/3}$, the rate obtainable with the Nadaraya-Watson estimator. By fitting a high order polynomial, it is possible that we inflate the boundary effect instead of reducing it.

Theorem 1 shows that local polynomial estimation has the potential to reduce the boundary bias problem and deliver a faster rate of convergence when the response function is smooth enough. In the next section, we establish the optimal rate of convergence when the degree of smoothness is known. It is shown that the local polynomial estimator with an appropriately chosen bandwidth achieves this optimal rate.

3 Optimal Rate of Convergence

To obtain the optimal rate of convergence, we cast the regression discontinuity model into the following general framework. Suppose $\mathcal{P}$ is a family of probability models on some fixed measurable space $(\Omega, \mathcal{A})$. Let $T$ be a functional defined on $\mathcal{P}$, taking values in $\mathbb{R}$. An estimator of $T$ is a measurable map $\hat{\alpha} : \Omega \to \mathbb{R}$. For a given loss function $L(\hat{\alpha}, \alpha)$, the maximum expected loss over $P \in \mathcal{P}$ is defined to be
$$R(\hat{\alpha}, \mathcal{P}) = \sup_{P \in \mathcal{P}} E_P L(\hat{\alpha}, T(P)) \qquad (20)$$
where $E_P$ is the expectation operator under the probability measure $P$. Our goal is to find an achievable lower bound for the minimax risk defined by
$$\inf_{\hat{\alpha}} R(\hat{\alpha}, \mathcal{P}) = \inf_{\hat{\alpha}} \sup_{P \in \mathcal{P}} E_P L(\hat{\alpha}, T(P)). \qquad (21)$$
If we add a subscript $n$ to $\hat{\alpha}$, $\mathcal{P}$ and $P$, where $n$ is the sample size, the achievable lower bound translates into the best rate of convergence of $R(\hat{\alpha}, \mathcal{P})$ to zero. This best rate is called the minimax rate of convergence, as it is derived from the minimax criterion. It is also commonly referred to as the optimal rate of convergence.

Now let us put the regression discontinuity model in the above general framework. Let $f(\cdot)$ be a probability density function of $x$ and $\varphi_x(\cdot)$ be a conditional density of $\varepsilon$ for a given $x$ such that $E(\varepsilon \mid x) = 0$. For both densities the dominating measures are the usual

Lebesgue measures. Define
$$\mathcal{P}(s, \gamma, K) = \Big\{P_{m,\alpha} : \frac{dP_{m,\alpha}}{d\mu} = f(x)\varphi_x(y - m(x))\,1\{x < x_0\} + f(x)\varphi_x(y - m(x) - \alpha)\,1\{x \ge x_0\},\ m(x) \in M(s, \gamma, K),\ |\alpha| \le K\Big\}$$
where $\mu$ is the Lebesgue measure on $\mathbb{R}^2$. For this family of models, the marginal distribution of $x$ and the conditional distribution of $\varepsilon$ are the same across all members. The difference among members lies in the conditional mean of $y$ for a given $x$. In other words, the function $m(\cdot)$ and the constant $\alpha$ characterize the probability model in the family $\mathcal{P}(s, \gamma, K)$. To reflect this, we use the subscripts $m, \alpha$ to differentiate the probability models in $\mathcal{P}(s, \gamma, K)$.

For the regression discontinuity model, the functional of interest is $T(P_{m,\alpha}) = \alpha$. For a given loss function $L(\cdot, \cdot)$, we want to design an estimator $\hat{\alpha}$ to minimize
$$\sup_{P_{m,\alpha} \in \mathcal{P}(s,\gamma,K)} E_{m,\alpha} L(\hat{\alpha}, \alpha) \qquad (22)$$
where $E_{m,\alpha} L(\hat{\alpha}, \alpha) := E_{P_{m,\alpha}} L(\hat{\alpha}, \alpha)$ and $E_{P_{m,\alpha}}$ is the expectation operator under $P_{m,\alpha}$. One common choice of $L(\cdot, \cdot)$ is the quadratic loss function
$$L(\hat{\alpha}, \alpha) := L_2(\hat{\alpha} - \alpha) = (\hat{\alpha} - \alpha)^2, \qquad (23)$$
in which case $R(\hat{\alpha}, \mathcal{P})$ is the maximum expected mean squared error. Another common choice is the 0-1 loss function
$$L(\hat{\alpha}, \alpha) := L_0(\hat{\alpha} - \alpha) = 1\{|\hat{\alpha} - \alpha| > \delta/2\} \qquad (24)$$
for some fixed $\delta > 0$, in which case $R(\hat{\alpha}, \mathcal{P})$ is the maximum probability that $\hat{\alpha}$ is not in the $\delta/2$-neighborhood of $\alpha$. Since the expected mean squared error may not exist for the local polynomial estimator, we use the 0-1 loss for convenience in this paper. The use of the 0-1 loss is innocuous if the optimal rate of convergence is the item of greatest interest.

The derivation of a minimax rate of convergence for an estimator involves a series of minimax calculations for different sample sizes, and there is no initial advantage in making the dependence on the sample size explicit. Consider then the problem of finding a lower bound for the minimax risk $\inf_{\hat{\alpha}} \sup_{P \in \mathcal{P}} E_P L(\hat{\alpha}, \alpha)$. The simplest method for finding such a bound is to identify an estimator with a test between simple hypotheses; the whole argument can be cast in the language of Neyman-Pearson testing.
Let $P, Q$ be probability measures defined on the same measurable space $(\Omega, \mathcal{A})$. The testing affinity (Le Cam (1986) and Donoho and Liu (1991)) of the two probability measures is defined to be
$$\pi(P, Q) = \inf_{\phi}\left(E_P \phi + E_Q(1 - \phi)\right) \qquad (25)$$

where the infimum is taken over measurable functions $\phi$ with $0 \le \phi \le 1$. In other words, $\pi(P, Q)$ is the smallest sum of type I and type II errors of any test between $P$ and $Q$. It is a natural measure of the difficulty of distinguishing $P$ and $Q$. Suppose $\mu$ is a measure dominating both $P$ and $Q$ with corresponding densities $p$ and $q$. It follows from the Neyman-Pearson lemma that the infimum is achieved by setting $\phi = 1\{p \le q\}$ and
$$\pi(P, Q) = \int 1\{p \le q\}\, p\, d\mu + \int 1\{p > q\}\, q\, d\mu = 1 - \frac{1}{2}\int |p - q|\, d\mu := 1 - \frac{1}{2}\|P - Q\|_1 \qquad (26)$$
where $\|P - Q\|_1 = \int |p - q|\, d\mu$ is the $L_1$ distance between the two probability measures.

Now consider a pair of probability models $P, Q \in \mathcal{P}$ such that $|T(P) - T(Q)| \ge \delta$. Then, for any estimator $\hat{\alpha}$,
$$1\{|\hat{\alpha} - T(P)| > \delta/2\} + 1\{|\hat{\alpha} - T(Q)| > \delta/2\} \ge 1. \qquad (27)$$
Let
$$\phi = 1\{|\hat{\alpha} - T(P)| > \delta/2\}; \qquad (28)$$
then
$$\sup_{P \in \mathcal{P}} P(|\hat{\alpha} - T(P)| > \delta/2) \ge \frac{1}{2}\left\{P(|\hat{\alpha} - T(P)| > \delta/2) + Q(|\hat{\alpha} - T(Q)| > \delta/2)\right\} \ge \frac{1}{2} E_P \phi + \frac{1}{2} E_Q(1 - \phi) \ge \frac{1}{2}\pi(P, Q). \qquad (29)$$
Therefore
$$\inf_{\hat{\alpha}} \sup_{P \in \mathcal{P}} P\{|\hat{\alpha} - T(P)| > \delta/2\} \ge \frac{1}{2}\pi(P, Q) \qquad (30)$$
for any $P$ and $Q$ such that $|T(P) - T(Q)| \ge \delta$.

Inequality (30) suggests a simple way to get a good lower bound for the minimax probability of error: search for a pair $(P, Q)$ that minimizes $\|P - Q\|_1$, and hence maximizes the affinity $\pi(P, Q)$, subject to the constraint $|T(P) - T(Q)| \ge \delta$. To obtain a lower bound with a sequence of independent observations, we let $(\Omega, \mathcal{A})$ be the product space and $\mathcal{P}$ be a family of probability models on such a space. Then for any pair of finite-product measures $P = \otimes_{i=1}^n P_i$ and $Q = \otimes_{i=1}^n Q_i$ with $|T(P) - T(Q)| \ge \delta$, the minimax risk satisfies
$$\inf_{\hat{\alpha}} \sup_{P \in \mathcal{P}} P\{|\hat{\alpha} - T(P)| > \delta/2\} \ge \frac{1}{2} - \frac{1}{4}\Big\|\otimes_{i=1}^n P_i - \otimes_{i=1}^n Q_i\Big\|_1. \qquad (31)$$

We now turn to the regression discontinuity model. Our objective is to search for two probability models $P$ and $Q$ that are difficult to distinguish by the independent observations

$(x_i, y_i)$, $i = 1, 2, \dots, n$. Note that it is not restrictive to consider only particular distributions for $\varepsilon_i$ and $x_i$ for the purpose of obtaining a lower bound: the minimax risk for a larger class of probability models can be no smaller than that for a smaller class. Therefore, if the lower bound holds for a particular distributional assumption, then it also holds for a wider class of distributions. To simplify the calculation, we assume that $\varepsilon_i$ is iid $N(0, \sigma^2)$ and $x_i$ is iid uniform on $[x_0 - \Delta, x_0 + \Delta]$ under both $P$ and $Q$. More details on the construction of $P$ and $Q$ are given in the proof of the following theorem.

Theorem 2 Let Assumption 2 hold. (a) For any finite constants $s$, $\gamma$ and $K$, we have
$$\liminf_{n \to \infty} \inf_{\hat{\alpha}} \sup_{P_{m,\alpha} \in \mathcal{P}(s,\gamma,K)} P_{m,\alpha}\left(n^{s/(2s+1)} |\hat{\alpha} - \alpha| > \delta_0\right) \ge C$$
for some positive constant $C$ and a small $\delta_0 > 0$. (b) Suppose Assumption 3 also holds. Let $h = \kappa n^{-1/(2s+1)}$ for some constant $\kappa$; then
$$\lim_{M \to \infty} \limsup_{n \to \infty} \sup_{P_{m,\alpha} \in \mathcal{P}(s,\gamma,K)} P_{m,\alpha}\left(n^{s/(2s+1)} |\hat{\alpha}_{\ell} - \alpha| > M\right) = 0.$$

Remarks

1. Part (a) of the theorem shows that there exists no estimator $\hat{\alpha}$ that converges to $\alpha$ at a rate faster than $n^{-s/(2s+1)}$ uniformly over the class of probability models $\mathcal{P}(s, \gamma, K)$. Part (b) shows that the rate $n^{-s/(2s+1)}$ is achieved by the local polynomial estimator provided that $r = \ell$ and $h$ is chosen appropriately. Because of parts (a) and (b), the rate $n^{-s/(2s+1)}$ is called the minimax optimal rate of convergence.

2. This result extends Porter (2003), who considers a class of functions that are $\ell$ times continuously differentiable. Our result is more general, as we consider the Hölder smoothness class, which is larger than the class Porter (2003) considers. Our method for calculating the lower bound for the minimax risk is also simpler than that of Stone (1980), which is adopted in Porter (2003).

3. An alternative proof of the minimax rate is to use the asymptotic equivalence of nonparametric regression models and Gaussian noise models (see Brown and Low (1996)). The Gaussian noise model is defined by $dy = S(t)\,dt + \epsilon\, dW(t)$, where $W(t)$ is the standard Brownian motion. Ibragimov and Khasminskii (1981) show that the optimal minimax rate for estimating the drift function $S(t)$ is $\epsilon^{2s/(2s+1)}$. Since $\epsilon$ in

the Gaussian noise model corresponds to $1/\sqrt{n}$ in a nonparametric regression with $n$ copies of iid data, we infer that the optimal minimax rate in the nonparametric regression is $n^{-s/(2s+1)}$. Our proof is in the spirit of Donoho and Liu (1991) and involves only elementary calculations.

4 A Rate-Adaptive Estimator

The previous section establishes the optimal rate of convergence when the degree of smoothness is known. In this section, we propose a local polynomial estimator that achieves the optimal rate of convergence up to a logarithmic factor when the degree of smoothness is not known.

Let $[\underline{s}, \bar{s}]$, for some $\underline{s} > 0$ and $\bar{s} \in [\underline{s}, \infty)$, be the range of smoothness. For each $\tau \in [\underline{s}, \bar{s}]$, we define a local polynomial estimator $\hat{\alpha}_\tau = \hat{c}_\tau^+ - \hat{c}_\tau^-$ by setting
$$h_\tau = \kappa n^{-1/(2\tau+1)} \quad \text{and} \quad r_\tau = w \text{ for } \tau \in (w, w+1],\ w = 0, 1, \dots, \qquad (32)$$
where $\kappa$ is a positive constant. Equivalently, $r_\tau$ is the largest integer that is strictly less than $\tau$. Note that the subscript on $\hat{\alpha}$, $\hat{c}^+$ and $\hat{c}^-$ indicated the order of the local polynomial in the previous sections, while it now indicates the underlying smoothness parameter that generates the bandwidth and the polynomial order given in (32).

Let $g := 1/\log n$ and let $S_g$ be the $g$-net of the interval $[\underline{s}, \infty)$:
$$S_g = \{\tau : \tau = \underline{s} + jg,\ j = 0, 1, 2, \dots\}.$$
For a positive constant $\kappa_2$, define
$$\hat{s} = \sup\left\{\tau_2 \in S_g : |\hat{\alpha}_{\tau_1} - \hat{\alpha}_{\tau_2}| \le \kappa_2 (n h_{\tau_1})^{-1/2} \chi(n) \text{ for all } \tau_1 \le \tau_2,\ \tau_1 \in S_g\right\}, \qquad (33)$$
where $\chi(n) = (\log n)(\log\log n)^{1/2}$. Intuitively, $\hat{s}$ is the largest smoothness parameter such that the associated local polynomial estimator does not differ significantly from any local polynomial estimator with a smaller smoothness parameter. Graphically, one can view the bound in the definition of $\hat{s}$ as a function of $\tau_1$; then $\hat{s}$ is the largest value of $\tau_2 \in S_g$ such that $|\hat{\alpha}_{\tau_1} - \hat{\alpha}_{\tau_2}|$ lies below the bound for all $\tau_1 \le \tau_2$, $\tau_1 \in S_g$. The calculation of $\hat{s}$ is carried out by considering successively larger $\tau_2$ values $\underline{s}, \underline{s} + g, \underline{s} + 2g, \dots$ until, for some $\tau_2$, the deviation $|\hat{\alpha}_{\tau_1} - \hat{\alpha}_{\tau_2}|$ exceeds the bound for some $\tau_1 \le \tau_2$, $\tau_1 \in S_g$. Finally, we set the adaptive estimator to be
$$\hat{\alpha}_A = \hat{\alpha}_{\hat{s}}. \qquad (34)$$
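The upward scan just described translates into a short routine. This is a sketch under stated assumptions: `alpha_hat` is a user-supplied function mapping $\tau$ to the estimate $\hat{\alpha}_\tau$ computed with $h_\tau = \kappa n^{-1/(2\tau+1)}$ and order $r_\tau$, the bound uses $(n h_{\tau_1})^{-1/2}$ for the smaller parameter as in (33), and the constants are placeholders rather than the cross-validated values the paper recommends.

```python
import numpy as np

def select_s_hat(alpha_hat, taus, n, kappa=1.0, kappa2=1.0):
    """Rule (33): scan tau_2 upward through the grid and keep the largest
    value whose estimate stays within kappa2 * (n h_{tau_1})^{-1/2} * chi(n)
    of every estimate with a smaller smoothness parameter tau_1."""
    chi = np.log(n) * np.sqrt(np.log(np.log(n)))          # chi(n) = (log n)(log log n)^{1/2}
    h = lambda t: kappa * n ** (-1.0 / (2.0 * t + 1.0))    # h_tau = kappa * n^{-1/(2 tau + 1)}
    ests = {t: alpha_hat(t) for t in taus}
    s_hat = taus[0]
    for t2 in taus:
        ok = all(abs(ests[t1] - ests[t2]) <= kappa2 * chi / np.sqrt(n * h(t1))
                 for t1 in taus if t1 <= t2)
        if not ok:
            break                                          # first violation stops the scan
        s_hat = t2
    return s_hat, ests[s_hat]
```

When the estimates are stable across the grid, the rule returns the largest grid point (the smoothest admissible model); a sudden jump in the estimates, signalling bias from oversmoothing, stops the scan just before the jump.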

The proposed adaptive procedure is based on the comparison of local polynomial estimators with different smoothness parameters from the $g$-net $S_g$. The number of relevant smoothness parameters in $S_g$ is of order $\log n$, and the resolution of the $g$-net $S_g$ is $1/\log n$. As the sample size increases, the grid $S_g$ becomes finer and finer. However, given the structure of $S_g$, it is not possible to distinguish smoothness parameters whose difference is less than $1/\log n$. This is why the proposed estimator cannot achieve the best rate of convergence $n^{-s/(2s+1)}$ for known smoothness.

To further understand the adaptive procedure, consider a function $m(\cdot) \in M(s, \gamma, K)$ with $m(\cdot) \notin M(s', \gamma, K)$ for any $s' > s$; in other words, $m(\cdot)$ is smooth to at most order $s$. For any $\tau_1 \le s$, it follows from Theorem 1 that the asymptotic bias of $\sqrt{n h_{\tau_1}}(\hat{\alpha}_{\tau_1} - \alpha)$ is
$$\mathrm{asymbias}\left(\sqrt{n h_{\tau_1}}(\hat{\alpha}_{\tau_1} - \alpha)\right) = O\left(\sqrt{n h_{\tau_1}}\, h_{\tau_1}^{\min(r_{\tau_1}+1,\, s)}\right) = O\left(n^{[\tau_1 - \min(r_{\tau_1}+1,\, s)]/(2\tau_1+1)}\right) = O(1). \qquad (35)$$
Similarly, for $\tau_1 \le \tau_2 \le s$, the asymptotic bias of $\sqrt{n h_{\tau_1}}(\hat{\alpha}_{\tau_2} - \alpha)$ is
$$\mathrm{asymbias}\left(\sqrt{n h_{\tau_1}}(\hat{\alpha}_{\tau_2} - \alpha)\right) = O\left(n^{\tau_1/(2\tau_1+1)}\, n^{-\min(r_{\tau_2}+1,\, s)/(2\tau_2+1)}\right) = O\left(n^{\tau_1/(2\tau_1+1) - \tau_2/(2\tau_2+1)}\, n^{[\tau_2 - \min(r_{\tau_2}+1,\, s)]/(2\tau_2+1)}\right) = o(1). \qquad (36)$$
Therefore, the asymptotic bias of $\sqrt{n h_{\tau_1}}\,|\hat{\alpha}_{\tau_1} - \hat{\alpha}_{\tau_2}|$ is bounded. On the other hand, $\sqrt{n h_{\tau_1}}\,|\hat{\alpha}_{\tau_1} - \hat{\alpha}_{\tau_2}|$ is no larger than
$$\sqrt{n h_{\tau_1}}\,|\hat{\alpha}_{\tau_1} - \alpha| + \sqrt{n h_{\tau_1}}\,|\hat{\alpha}_{\tau_2} - \alpha|, \qquad (37)$$
whose asymptotic variance is of order $O(1)$. As a consequence, when $\tau_1 \le \tau_2 \le s$, $\sqrt{n h_{\tau_1}}\,|\hat{\alpha}_{\tau_1} - \hat{\alpha}_{\tau_2}|$ is stochastically bounded in large samples, and $\sqrt{n h_{\tau_1}}\,|\hat{\alpha}_{\tau_1} - \hat{\alpha}_{\tau_2}| \le \kappa_2 \chi(n)$ holds with probability approaching one. This heuristic argument suggests that the probability that $\hat{s}$ is less than $s$ is small in large samples.

Next, consider $\tau_1 = s$ and $\tau_2 > s$. The asymptotic bias of $\sqrt{n h_s}(\hat{\alpha}_{\tau_2} - \alpha)$ is of order $O\left(n^{s/(2s+1)}\, n^{-s/(2\tau_2+1)}\right)$, which will be larger than $\kappa_2 \chi(n)$ in general if $\tau_2 - s$ is sufficiently large. This suggests that $\hat{s}$ cannot be too far above $s$. Rigorous arguments are given in the proofs of the next two theorems in the Appendix.

Theorem 3 Let Assumptions 2 and 3 hold. Assume that $\min_{r \in [r_{\underline{s}},\, r_{\bar{s}}]}\{\lambda_{\min}(\Gamma_r)\} > 0$, where $\lambda_{\min}(\Gamma_r)$ is the smallest eigenvalue of $\Gamma_r$. Then
$$\lim_{C \to \infty} \limsup_{n \to \infty} \sup_{s \in [\underline{s}, \bar{s}]} \sup_{P_{m,\alpha} \in \mathcal{P}(s,\gamma,K)} P_{m,\alpha}\left(n^{s/(2s+1)} \chi^{-1}(n)\, |\hat{\alpha}_A - \alpha| \ge C\right) = 0.$$

Remarks

1. Theorem 2 shows that the optimal rate of convergence for the estimation of $\alpha$ is $n^{-s/(2s+1)}$ when $s$ is finite and known. Theorem 3 shows that the adaptive estimator achieves this rate up to the logarithmic factor $\chi(n)$ when $s$ is finite and not known.

2. When $s$ is not known, the optimal rate $n^{-s/(2s+1)}$ for known smoothness cannot be achieved in general. For the Gaussian noise model and quadratic loss, Lepski (1990) shows that an extra $(\log n)^{s/(2s+1)}$ factor is needed. This result has recently been challenged by Cai and Low (2003), who show that under the 0-1 loss the achievable lower bound for unknown smoothness is the same as that for known smoothness. However, their results are obtained under the assumption that there are only finitely many different values of the smoothness parameter; this assumption does not hold for the problem at hand. As a result, the extra logarithmic factor may not be removable in general for the 0-1 loss. This extra logarithmic factor is an unavoidable price for adaptation, and most (if not all) adaptive estimators of linear functionals share this property.

3. If the function $m(x)$ is not smooth to the same order on the two sides of $x_0$, say $m(x) \in M^+(s_1, \gamma, K) \cap M^-(s_2, \gamma, K)$, then we can estimate $c^+$ and $c^-$ adaptively on each side of the cut-off point $x_0$. For a constant $\kappa_2^+ > 0$, let
$$\hat{s}^+ = \sup\left\{\tau_2 \in S_g : |\hat{c}_{\tau_1}^+ - \hat{c}_{\tau_2}^+| \le \kappa_2^+ (n h_{\tau_1})^{-1/2} \chi(n) \text{ for all } \tau_1 \le \tau_2,\ \tau_1 \in S_g\right\}$$
where $\hat{c}_\tau^+$ is the local polynomial estimator of $c^+$ when $h_\tau = \kappa^+ n^{-1/(2\tau+1)}$ and $r = r_\tau$, the largest integer strictly less than $\tau$. The adaptive estimator $\hat{c}_A^+$ of $c^+$ is given by $\hat{c}_{\hat{s}^+}^+$, and the adaptive estimator $\hat{c}_A^-$ of $c^-$ is defined analogously. Finally, the adaptive estimator of $\alpha$ is set to be $\hat{\alpha}_A = \hat{c}_A^+ - \hat{c}_A^-$. In this case, the rate of convergence of $\hat{\alpha}_A$ is easily seen to be $\chi(n)\, n^{-\min(s_1, s_2)/(2\min(s_1, s_2)+1)}$. In other words, the slower rate of convergence of $\hat{c}_A^+$ and $\hat{c}_A^-$ dictates.

4. Through $\hat{s}$, the adaptive estimator depends on several user-chosen constants, namely $\kappa$, $\kappa_2$, $\underline{s}$ and $\bar{s}$. In Section 5 we use local cross validation to choose $\kappa$ and $\kappa_2$. For the bounds $\underline{s}$ and $\bar{s}$, we suggest using $1/\log(n)$ and $\infty$, respectively.

Theorems 2 and 3 suggest that $\hat{s}$ provides a consistent estimator of $s$ if $m(x) \in M(s, \gamma, K)$. However, such an $s$ is not well defined: according to our definition of smoothness,

a function that is smooth of order $s_1$ is also smooth of order $s_2$ whenever $s_1 > s_2$. The rate-optimal polynomial order and bandwidth are increasing functions of the smoothness, and we are therefore interested in defining a class of functions with a unique smoothness index. Before defining the new function class, recall that any function $m(x) \in M(s, \gamma, K)$ admits Taylor expansions of the form
$$m(x) = m(x_0) + \sum_{j=1}^{\ell} b_j^+ (x - x_0)^j + \tilde{e}^+(x) \quad \text{for } x \ge x_0, \qquad (38)$$
$$m(x) = m(x_0) + \sum_{j=1}^{\ell} b_j^- (x - x_0)^j + \tilde{e}^-(x) \quad \text{for } x < x_0, \qquad (39)$$
with the remainder terms satisfying
$$|\tilde{e}^+(x)| \le (x - x_0)^{s} (\ell!)^{-1} K \text{ for } x \ge x_0, \qquad |\tilde{e}^-(x)| \le |x - x_0|^{s} (\ell!)^{-1} K \text{ for } x < x_0. \qquad (40)$$
Let $\tilde{e}^+ = \{\tilde{e}^+(x_i)\}_{x_i \ge x_0}$ and $\tilde{e}^- = \{\tilde{e}^-(x_i)\}_{x_i < x_0}$ be the vectors that contain the remainder terms. The following definition imposes an additional condition on $\tilde{e}^+(x)$ and $\tilde{e}^-(x)$.

Definition 4 Let $s_0 = \ell_0 + \gamma_0$ where $\ell_0$ is the largest integer strictly less than $s_0$ and $\gamma_0 \in (0, 1]$. Let $M^*(s_0, \gamma, K)$ be the class of functions satisfying:

(i) $m(x) \in M(s_0, \gamma, K)$ but $m(x) \notin M(s, \gamma, K)$ for any $s > s_0$.

(ii) Let $D_{n\ell_0} = \sqrt{nh}\,\mathrm{diag}(1, h, h^2, \dots, h^{\ell_0})$. The remainder terms $\tilde{e}^+(x)$ and $\tilde{e}^-(x)$ of the $\ell_0$-th order Taylor expansion of $m(x)$ around $x_0$ satisfy
$$\left\|(nh)^{-1/2} h^{-s_0}\left[D_{n\ell_0}^{-1} Z_{\ell_0}^{+\prime} W^+ \tilde{e}^+ - (-1)^{\ell_0+1} D_{n\ell_0}^{-1} Z_{\ell_0}^{-\prime} W^- \tilde{e}^-\right]\right\| \ge C$$
for a constant $C > 0$, with probability approaching one as $n \to \infty$ and $h \to 0$ such that $nh \to \infty$.

The first requirement in the above definition determines the maximum degree of smoothness of a function. For an infinitely differentiable function, there is no $s_0$ such that the first requirement is met; in this case, we define $s_0$ to be $\infty$. In other words, $M^*(\infty, \gamma, K)$ is the set of infinitely differentiable functions. The second requirement asks for a lower bound on the asymptotic bias of the local polynomial estimator of order $\ell_0$. These two requirements make $M^*(s_0, \gamma, K)$ the subset of $M(s_0, \gamma, K)$ that is the most difficult to estimate.
Heuristically, if m(x) ∈ M*(s*, K), then there exists no estimator θ̂ with a rate of convergence faster than n^{−2s*/(2s*+1)+ε} for any ε > 0. For a function m(x) ∈ M(s*, K) ∩ M(s, K) with s > s*, it is easy to see that the estimator θ̂_s converges to θ at the rate of n^{−2s/(2s+1)},

which is faster than the rate n^{−2s*/(2s*+1)}. To rule out this case, we impose the first requirement. On the other hand, when the first requirement is met but the asymptotic bias of θ̂_{s*} diminishes as n → ∞, possibly due to the cancellation of the asymptotic biases from the two sides, we can choose a large bandwidth without inflating the asymptotic bias and thus obtain a rate of convergence that is faster than n^{−2s*/(2s*+1)}. To rule out this case, we impose the second requirement. Sufficient conditions for the second requirement are (i) K₁|x − x₀|^{s*} ≤ |ẽ⁺(x)| ≤ K₂|x − x₀|^{s*} and K₁|x − x₀|^{s*} ≤ |ẽ⁻(x)| ≤ K₂|x − x₀|^{s*} for some K₁ > 0, K₂ > 0, and (ii) ẽ⁺(x) ≠ ẽ⁻(x) when ℓ* is odd. The following theorem shows that ŝ provides a consistent estimate of the maximal degree of smoothness.

Theorem 5. Let the assumptions of Theorem 3 hold. If m(x) ∈ M*(s*, K) with s* ≥ s̲ > 0, then

   ŝ = min(s*, s̄) + O(log log n / log n)   as n → ∞.

Remarks

1. The theorem shows that ŝ consistently estimates the maximal degree of smoothness s* when it is finite and s̲ and s̄ are appropriately chosen.

2. A direct implication of Theorem 5 is that ŝ converges to s̄ when s* ≥ s̄. As a result, when the sample size is not large in practical applications, we can set an upper bound s̄ that is relatively small. This will prevent us from using high order polynomials for small sample sizes. For example, when s̄ = 3, the adaptive procedure effectively provides a method to choose among the local constant, local linear and local quadratic estimators. In the simulation study, we choose s̄ = 4, which we feel is a reasonable choice for sample size 500.

3. The adaptive estimator θ̂_A is not necessarily asymptotically normal. At the cost of a slower rate of convergence, Theorem 5 enables us to define a new adaptive estimator that is asymptotically normal with zero asymptotic bias.
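The error term in Theorem 5 decays very slowly with the sample size, which is one way to see why a modest upper bound s̄ is advisable in practice. A quick computation (illustrative only) shows its scale:

```python
import math

# Illustration of the error term in Theorem 5:
#   s_hat = min(s*, s_bar) + O(log log n / log n).
# The bound shrinks very slowly as n grows, so for moderate samples the
# estimated smoothness can still be noticeably off its limit.
for n in [500, 10_000, 1_000_000]:
    print(n, round(math.log(math.log(n)) / math.log(n), 3))
```

Even at n = 1,000,000 the ratio is still of order 0.2, far from negligible.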
More specifically, after obtaining ŝ using the above adaptive procedure, we define

   θ̃_ŝ := θ̂(r_ŝ, h*_ŝ), where h*_s = n^{−1/(2r_s+1)}.   (41)

If s* < ∞ and s* is not an integer, Theorem 5 implies that r_ŝ = r_{s*} with probability approaching one. Thus, both r_ŝ and h*_ŝ are essentially non-random for large n. In

consequence, the adaptive estimator θ̃_ŝ is asymptotically normal:

   √(nh*_ŝ) (θ̃_ŝ − θ) →d N(0, ω² e₁′ Γ_{r_ŝ}⁻¹ V_{r_ŝ} Γ_{r_ŝ}⁻¹ e₁).   (42)

Of course, one would expect a given level of accuracy of the approximation by the normal distribution to require a larger sample size when r and h are adaptively selected than otherwise.

4. The only unknown quantity in (42) is ω² = [σ²₊(x₀) + σ²₋(x₀)]/f(x₀). The density of x at the cut-off point, f(x₀), can be estimated consistently by kernel methods. Given a consistent estimate θ̃, we define the estimated residuals by

   ε̃ᵢ = yᵢ − m̃(xᵢ) − θ̃ dᵢ,   (43)

where

   m̃(x) = [ Σ_{i=1}^n k_h(x − xᵢ)(yᵢ − θ̃ dᵢ) ] / [ Σ_{i=1}^n k_h(x − xᵢ) ].   (44)

Porter (2003) shows that, under some regularity conditions,

   σ̂²₊(x₀) = 2 Σ_{i=1}^n k_h(xᵢ − x₀) dᵢ ε̃ᵢ² / Σ_{i=1}^n k_h(xᵢ − x₀)   and
   σ̂²₋(x₀) = 2 Σ_{i=1}^n k_h(xᵢ − x₀)(1 − dᵢ) ε̃ᵢ² / Σ_{i=1}^n k_h(xᵢ − x₀)   (45)

are consistent for σ²₊(x₀) and σ²₋(x₀), respectively. Plugging σ̂²₊(x₀), σ̂²₋(x₀) and f̂(x₀) = n⁻¹ Σ_{i=1}^n k_h(xᵢ − x₀) into the definition of ω² produces a consistent estimator of it. The adaptive estimator θ̂_ŝ or θ̃_ŝ can be used to compute the estimated residuals in (43).

5 Monte Carlo Experiments

In this section, we propose a practical strategy to select the constants λ₁ and λ₂ in the adaptive procedure and provide some simulation evidence on the finite sample performance of the adaptive estimator. The empirical strategy we use is based on squared-error cross validation, which has had considerable influence on nonparametric estimation. Since our objective is to estimate the discontinuity at a certain point, we use a local version of cross validation proposed by Hall and Schucany (1989) for density estimation. For each combination of (λ₁, λ₂), we first use the adaptive rule to determine ŝ, h_ŝ, and r_ŝ. We then use the local polynomial estimator with bandwidth h_ŝ and polynomial order r_ŝ to estimate the conditional mean of yᵢ at x = xᵢ, leaving the observation (xᵢ, yᵢ) out. Denote

the estimate by ŷᵢ(λ₁, λ₂), where we have made it explicit that ŷᵢ depends on (λ₁, λ₂). Let {x⁺_{i₁}, ..., x⁺_{iₘ}} and {x⁻_{i₁}, ..., x⁻_{iₘ}} be the closest m observations that are larger and smaller than x₀, respectively. We choose λ₁ and λ₂ to minimize the local cross validation function:

   CV(λ₁, λ₂) = Σ_{k=1}^m (y⁺_{iₖ} − ŷ⁺_{iₖ}(λ₁, λ₂))² + Σ_{k=1}^m (y⁻_{iₖ} − ŷ⁻_{iₖ}(λ₁, λ₂))².   (46)

Finally, we use the cross validation choice (λ̂₁, λ̂₂) of (λ₁, λ₂) to compute the adaptive estimator, which is denoted by θ̂_A(λ̂₁, λ̂₂). In this paper, we do not provide asymptotic results for θ̂_A(λ̂₁, λ̂₂), but we do give some simple results for an estimator based on a data-dependent method that is close to (λ̂₁, λ̂₂). Let Λ = {λ⁽¹⁾, ..., λ⁽ᵁ⁾} be a finite grid of positive real numbers. Take (λ̃₁, λ̃₂) to be the closest point in Λ × Λ to (λ̂₁, λ̂₂). Let θ̂_A(λ̃₁, λ̃₂) denote the adaptive estimator based on (λ̃₁, λ̃₂). One can take the grid size of Λ to be sufficiently small that the minimum of CV(λ₁, λ₂) over (λ₁, λ₂) ∈ Λ × Λ is quite close to its minimum over R⁺ × R⁺, at least if one has knowledge of suitable lower and upper bounds for λ₁ and λ₂. The asymptotic behavior of θ̂_A(λ̃₁, λ̃₂) is relatively easy to obtain. First, Theorem 3 holds for θ̂_A(λ̃₁, λ̃₂) under Assumptions 2 and 3. The reasons are that the theorem holds for θ̂_A for each combination (λ₁, λ₂) ∈ Λ × Λ and that there are a finite number of such combinations. So θ̂_A(λ̃₁, λ̃₂) is consistent and attains the rate of convergence n^{−s/(2s+1)} up to a logarithmic factor. Second, suppose that the value (λ̂₁, λ̂₂) is not equidistant to any two points in Λ × Λ (which fails only for a set of points with Lebesgue measure zero) and assume that (λ̂₁, λ̂₂) converges to (λ₁*, λ₂*) in large samples. Let (λ₁°, λ₂°) be the point in Λ × Λ closest to (λ₁*, λ₂*). Let θ̂_A(λ₁°, λ₂°) and θ̂_A(λ₁*, λ₂*) denote the adaptive estimators based on (λ₁°, λ₂°) and (λ₁*, λ₂*), respectively. Then the asymptotic distribution of θ̂_A(λ̃₁, λ̃₂) is the same as that of θ̂_A(λ₁°, λ₂°). This holds because (λ̃₁, λ̃₂) = (λ₁°, λ₂°) with probability that goes to 1 as n → ∞ by the discreteness of Λ.
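A leave-one-out grid search of the kind behind (46) can be sketched as follows. The function names and the toy Nadaraya–Watson fit below are hypothetical stand-ins: the paper's own procedure plugs the adaptive local polynomial fit into `fit_predict`, and here the grid is over a bandwidth rather than over (λ₁, λ₂):

```python
import numpy as np

def local_cv(x, y, x0, m, fit_predict, params_grid):
    """Local cross validation in the spirit of (46): for each candidate
    parameter value, sum squared leave-one-out errors over the m observations
    closest to the cutoff on each side."""
    left = np.argsort(np.where(x < x0, x0 - x, np.inf))[:m]    # m closest below x0
    right = np.argsort(np.where(x >= x0, x - x0, np.inf))[:m]  # m closest above x0
    idx = np.concatenate([left, right])

    best, best_cv = None, np.inf
    for params in params_grid:
        cv = 0.0
        for i in idx:
            mask = np.arange(len(x)) != i          # leave observation i out
            yhat = fit_predict(x[mask], y[mask], x[i], params)
            cv += (y[i] - yhat) ** 2
        if cv < best_cv:
            best, best_cv = params, cv
    return best, best_cv

# toy usage: choose a bandwidth for a local-constant (kernel average) fit
def nw(xt, yt, xi, h):
    w = np.exp(-0.5 * ((xt - xi) / h) ** 2)
    return np.sum(w * yt) / np.sum(w)

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 400)
y = np.where(x >= 0, 1.0, 0.0) + 0.1 * rng.normal(size=400)
h_best, _ = local_cv(x, y, x0=0.0, m=40, fit_predict=nw, params_grid=[0.05, 0.1, 0.5])
print(h_best)
```

With a unit jump at the cutoff, the widest bandwidth smooths across the discontinuity and is heavily penalized by the local criterion.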
After a simple modification along the lines of (41), we have

   √(nh*_ŝ) (θ̃_A(λ̃₁, λ̃₂) − θ) →d N(0, ω² e₁′ Γ_{r_ŝ}⁻¹ V_{r_ŝ} Γ_{r_ŝ}⁻¹ e₁),   (47)

where θ̃_A(λ̃₁, λ̃₂) is the same as θ̂_A(λ̃₁, λ̃₂), except that the bandwidth h_ŝ = λ̃₁ n^{−1/(2ŝ+1)} is replaced by h*_ŝ = λ̃₁ n^{−1/(2r_ŝ+1)}. The above theoretical results for θ̃_A(λ̃₁, λ̃₂) are not entirely satisfactory because they require the use of the somewhat artificial grid Λ. Nevertheless, in the absence of asymptotic results for θ̂_A(λ̂₁, λ̂₂), they should be useful. Since our cross validation algorithm is based on a grid search, we effectively use the estimator θ̂_A(λ̃₁, λ̃₂) in our simulations. In our Monte Carlo experiment, we let s̄ = 4, m = 0.1n, and Λ = {0.1, 0.5, 1, 5} to compute the adaptive estimator. To evaluate the finite sample performance of the adaptive

estimator θ̂_A(λ̃₁, λ̃₂), we compare it with the local constant, local linear, local quadratic and local cubic estimators, each of them using the locally cross-validated bandwidth. For these local polynomial estimators, we use the AMSE-optimal bandwidth h = cn^{−1/(2r+3)} and choose c over the set C = {0.1, 0.2, ..., 1} ∪ {2, 3, 4, ..., 10} via cross validation. For each estimator, the cross validation is based on the same neighborhood observations {x⁺_{i₁}, ..., x⁺_{iₘ}} and {x⁻_{i₁}, ..., x⁻_{iₘ}} and uses the grid search method. We have considered other choices of m, Λ and C, but the qualitative results are similar. We consider three groups of experiments. In the first group, the data generating process is yᵢ = m(xᵢ) + θ·1{xᵢ > x₀} + εᵢ, where θ = 1 and

   m(xᵢ) = Σ_{j=1}^3 (xᵢ − x₀)^j + λ|xᵢ − x₀|^{s*}   for xᵢ ≥ x₀,
   m(xᵢ) = Σ_{j=1}^3 (xᵢ − x₀)^j − λ|xᵢ − x₀|^{s*}   for xᵢ < x₀.   (48)

Both xᵢ and εᵢ are iid standard normal, and {xᵢ}_{i=1}^n is independent of {εᵢ}_{i=1}^n. We set x₀ = 0 without loss of generality. We consider several values for s*, namely s* = 1/2, 3/2, 5/2, 7/2, and two values for λ, namely λ = 1 and 5. Here s* characterizes the smoothness of m(x), while λ determines the importance of the not-so-smooth component in m(x). For the second group of experiments, the data generating process is the same as the one above except that a sine wave is added to m(x), leading to

   m(xᵢ) = Σ_{j=1}^3 (xᵢ − x₀)^j + 5 sin(xᵢ − x₀) + λ|xᵢ − x₀|^{s*}   for xᵢ ≥ x₀,
   m(xᵢ) = Σ_{j=1}^3 (xᵢ − x₀)^j + 5 sin(xᵢ − x₀) − λ|xᵢ − x₀|^{s*}   for xᵢ < x₀.   (49)

The response function just defined has a finer structure than that given in (48). Such a response function may not be realistic in empirical applications, but it is used to examine the finite sample performance of different estimators in worst-case situations.
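A minimal version of the first-group experiment can be sketched as below. The uniform kernel, the fixed bandwidth, and the unweighted least squares fit are simplifying assumptions made for illustration; the paper's estimators use kernel weights and cross-validated bandwidths:

```python
import numpy as np

rng = np.random.default_rng(42)

def local_poly_rd(x, y, x0, r, h):
    """Sketch of a local polynomial RD estimator: fit a polynomial of order r
    separately on each side of x0 using only observations within h of the
    cutoff (a uniform kernel, which is an assumption here), and take the
    difference of the two intercepts as the estimate of the jump theta."""
    def side_intercept(mask):
        xs, ys = x[mask] - x0, y[mask]
        Z = np.vander(xs, r + 1, increasing=True)   # columns 1, u, ..., u^r
        coef, *_ = np.linalg.lstsq(Z, ys, rcond=None)
        return coef[0]                              # fitted value at x0
    right = (x >= x0) & (x <= x0 + h)
    left = (x < x0) & (x >= x0 - h)
    return side_intercept(right) - side_intercept(left)

# data generating process (48) with s* = 3/2 and lambda = 1
n, x0, theta, s_star, lam = 500, 0.0, 1.0, 1.5, 1.0
x = rng.normal(size=n)
m = sum((x - x0) ** j for j in range(1, 4)) \
    + lam * np.sign(x - x0) * np.abs(x - x0) ** s_star
y = m + theta * (x > x0) + rng.normal(size=n)

print(local_poly_rd(x, y, x0, r=1, h=0.5))  # should be near theta = 1
```

The local linear choice r = 1 already removes the first-order slope on each side; the remaining error reflects the curvature of m and the sampling noise.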
For the last group of experiments, the data generating process is

   m(xᵢ) = Σ_{j=1}^k (xᵢ − x₀)^j,   for k = 0, 1, 2 or 3.   (50)

Since m(xᵢ) is a constant, linear, quadratic or cubic function, we expect the local constant, local linear, local quadratic and local cubic estimators to have the best finite sample performance in the respective cases k = 0, 1, 2 and 3. The motivation for considering this group is to crash test the adaptive estimator against the local polynomial estimators. For each group of Monte Carlo experiments, we compute the bias, standard deviation (SD) and root mean squared error (RMSE) of all the estimators considered. The number of replications is 1,000 and the sample size is 500. More specifically, for an estimator θ̂, the

bias, SD, and RMSE are computed according to

   bias = θ̄ − θ,   SD = [ (1/1000) Σ_{m=1}^{1000} (θ̂_m − θ̄)² ]^{1/2},   RMSE = [ bias² + SD² ]^{1/2},   (51)

where θ̄ = (1/1000) Σ_{m=1}^{1000} θ̂_m and θ̂_m is the estimate from the m-th replication. Table I presents the results for the first group of experiments. It is clear that the local constant estimator has the smallest standard deviation and the largest bias. When s* = 3/2, 5/2, 7/2, the slope of m(x) is relatively flat at x = x₀. As a result, the effect of the standard deviation outweighs that of the bias, and it is not surprising that the local constant estimator has the smallest RMSE in these cases. However, when s* = 1/2, the function m(x) becomes very steep at x = x₀. As expected, the local constant estimator then has a large upward bias and the largest RMSE. Next, for the rest of the local polynomial estimators, the absolute values of the biases are in general comparable, while the standard deviation decreases with the order of the polynomial. The latter result seems counter-intuitive at first sight. However, as the order of the polynomial increases, the cross-validated bandwidth also increases. Note that the bandwidth and the polynomial order have opposite effects on the variance of a local polynomial estimator. In finite samples, it is likely that the variance reduction from using a larger bandwidth dominates the variance inflation from using a higher order polynomial. This is the case for the first group of data generating processes we consider. Finally, the performance of the adaptive estimator is very robust to the parameter configurations. When the underlying process is not so smooth (s* = 1/2, λ = 1), the adaptive estimator has the smallest RMSE. In other cases, the RMSE of the adaptive estimator is only slightly larger than the smallest RMSE. It is important to note that the smallest RMSE is achieved by different estimators for different parameter combinations. Table II reports the results for the second group of experiments.
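The Monte Carlo summary statistics above can be sketched directly; the toy inputs below are illustrative, not the paper's simulation output:

```python
import numpy as np

def summarize(theta_hats, theta):
    """Bias, SD, and RMSE of a Monte Carlo sample of estimates, computed as
    in the definitions above: bias = mean(theta_hat) - theta, SD around the
    Monte Carlo mean, RMSE = sqrt(bias^2 + SD^2)."""
    theta_bar = np.mean(theta_hats)
    bias = theta_bar - theta
    sd = np.sqrt(np.mean((theta_hats - theta_bar) ** 2))
    rmse = np.sqrt(bias ** 2 + sd ** 2)
    return bias, sd, rmse

# toy check: estimates centered at 1.2 when theta = 1 should show bias near 0.2
rng = np.random.default_rng(0)
bias, sd, rmse = summarize(1.2 + 0.05 * rng.normal(size=100_000), theta=1.0)
print(round(bias, 2), round(sd, 2))
```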
We report only the case λ = 1, as it is representative of the case λ = 5. Due to the rapid slope changes in the response function, all estimators have much larger RMSEs than those given in Table I. While the local constant estimator has a satisfactory RMSE performance in Table I, here its RMSE performance is the poorest because of its large bias. The best estimator, according to the RMSE criterion, is the local linear estimator, whose absolute bias is the smallest among the local polynomial estimators and whose standard deviation is only slightly larger than that of the local constant estimator. Compared with the local polynomial estimators, the adaptive estimator has the smallest bias for all parameter combinations, while its variance is comparable to that of the local linear estimator. As a consequence, the RMSE performance of the adaptive estimator is quite satisfactory. Table III gives the results for the last group of experiments. As expected, when the

response function is a polynomial of order r, the local polynomial estimator of the same order has the best finite sample performance in general. An exception is the local linear estimator, whose RMSE is larger than that of the local quadratic and cubic estimators. The performance of the adaptive estimator is very encouraging. Its RMSE is either the smallest or only slightly larger than that of the estimator most suitable for the underlying data generating process. To sum up, the RMSE of the adaptive estimator is either the smallest or among the smallest. The performance of the adaptive estimator is robust to the underlying data generating process. In contrast, a local polynomial estimator may have the best performance in one scenario and a disastrous performance in others. For example, the local constant estimator performs well in the first group of experiments but poorly in the second. The local linear estimator has a satisfactory performance in the second group of experiments, but its performance is the worst in the first group. The adaptive estimator seems to be the best estimator in an overall sense.

Table I: Finite Sample Performance of Different Estimators When m(xᵢ) = Σ_{j=1}^3 (xᵢ − x₀)^j + λ|xᵢ − x₀|^{s*} sign(xᵢ − x₀)

[Bias, SD, and RMSE of the adaptive, local constant, local linear, local quadratic, and local cubic estimators for (s*, λ) = (1/2, 1), (3/2, 1), (5/2, 1), (7/2, 1), (1/2, 5), (3/2, 5), (5/2, 5), (7/2, 5); the numerical entries are not recoverable in this transcription. Superscripts 1, 2, 3 indicate the smallest, second smallest, and third smallest RMSE in each row, respectively.]

Table II: Finite Sample Performance of Different Estimators When m(xᵢ) = Σ_{j=1}^3 (xᵢ − x₀)^j + 5 sin(xᵢ − x₀) + λ|xᵢ − x₀|^{s*} sign(xᵢ − x₀)

[Bias, SD, and RMSE of the adaptive, local constant, local linear, local quadratic, and local cubic estimators for (s*, λ) = (1/2, 1), (3/2, 1), (5/2, 1), (7/2, 1); the numerical entries are not recoverable in this transcription. Superscripts 1, 2, 3 indicate the smallest, second smallest, and third smallest values in each row, respectively.]

Table III: Finite Sample Performance of Different Estimators for Different Response Functions

[Bias, SD, and RMSE of the adaptive, local constant, local linear, local quadratic, and local cubic estimators for m(x) = 0, m(x) = (x − x₀), m(x) = (x − x₀) + (x − x₀)², and m(x) = (x − x₀) + (x − x₀)² + (x − x₀)³; the numerical entries are not recoverable in this transcription.]

6 Appendix of Proofs

Proof of Theorem 1. It is easy to show that

   ĉ⁺_r − c⁺_r = (Z⁺′_r W⁺ Z⁺_r)⁻¹ Z⁺′_r W⁺ ε⁺ + (Z⁺′_r W⁺ Z⁺_r)⁻¹ Z⁺′_r W⁺ R⁺.   (A.1)

Let

   D_{nr} = nh·diag(1, h, h², ..., h^r).   (A.2)

Then

   D_{nr}^{1/2}(ĉ⁺_r − c⁺_r) = (D_{nr}^{−1/2} Z⁺′_r W⁺ Z⁺_r D_{nr}^{−1/2})⁻¹ D_{nr}^{−1/2} Z⁺′_r W⁺ ε⁺ + (D_{nr}^{−1/2} Z⁺′_r W⁺ Z⁺_r D_{nr}^{−1/2})⁻¹ D_{nr}^{−1/2} Z⁺′_r W⁺ R⁺.   (A.3)

It follows from the proof of Lemma A.1(a) below that

   lim_{n→∞} D_{nr}^{−1/2} Z⁺′_r W⁺ Z⁺_r D_{nr}^{−1/2} = f(x₀) Γ_r.   (A.4)

Porter (2003) shows that, under Assumption 2,

   D_{nr}^{−1/2} Z⁺′_r W⁺ ε⁺ ⇒ N(0, σ²₊(x₀) f(x₀) V_r).   (A.5)

Combining (A.3), (A.4) and (A.5) gives

   D_{nr}^{1/2}(ĉ⁺_r − c⁺_r) − (D_{nr}^{−1/2} Z⁺′_r W⁺ Z⁺_r D_{nr}^{−1/2})⁻¹ D_{nr}^{−1/2} Z⁺′_r W⁺ R⁺ ⇒ N(0, σ²₊(x₀) f(x₀)⁻¹ Γ_r⁻¹ V_r Γ_r⁻¹),   (A.6)

which implies

   √(nh)(ĉ⁺_r − c⁺) − B⁺ ⇒ N(0, σ²₊(x₀) f(x₀)⁻¹ e₁′ Γ_r⁻¹ V_r Γ_r⁻¹ e₁),   (A.7)

where B⁺ = e₁′ (D_{nr}^{−1/2} Z⁺′_r W⁺ Z⁺_r D_{nr}^{−1/2})⁻¹ D_{nr}^{−1/2} Z⁺′_r W⁺ R⁺. Similarly, we can show that

   √(nh)(ĉ⁻_r − c⁻) − B⁻ ⇒ N(0, σ²₋(x₀) f(x₀)⁻¹ e₁′ Γ_r⁻¹ V_r Γ_r⁻¹ e₁).   (A.8)

By the independence of √(nh)(ĉ⁺_r − c⁺) and √(nh)(ĉ⁻_r − c⁻), we get

   √(nh)(θ̂_r − θ) − (B⁺ − B⁻) ⇒ N(0, [σ²₊(x₀) + σ²₋(x₀)] f(x₀)⁻¹ e₁′ Γ_r⁻¹ V_r Γ_r⁻¹ e₁).   (A.9)

When ℓ ≥ r + 1,

   D_{nr}^{−1/2} Z⁺′_r W⁺ R⁺ = h^{r+1} √(nh) b⁺_{r+1} γ_r (1 + o_p(1)).   (A.10)

When ℓ ≤ r,

   D_{nr}^{−1/2} Z⁺′_r W⁺ R⁺ = D_{nr}^{−1/2} Z⁺′_r W⁺ ẽ⁺ = O_p(h^s √(nh)).   (A.11)

Therefore

   B⁺ = 1{ℓ ≥ r + 1} e₁′ Γ_r⁻¹ γ_r b⁺_{r+1} f(x₀)⁻¹ h^{r+1} √(nh)(1 + o_p(1)) + O_p(h^s √(nh)).   (A.12)

Similarly,

   B⁻ = 1{ℓ ≥ r + 1} (−1)^{r+1} e₁′ Γ_r⁻¹ γ_r b⁻_{r+1} f(x₀)⁻¹ h^{r+1} √(nh)(1 + o_p(1)) + O_p(h^s √(nh)).   (A.13)

Let B = B⁺ − B⁻; then

   B = 1{ℓ ≥ r + 1} e₁′ Γ_r⁻¹ γ_r [b⁺_{r+1} − (−1)^{r+1} b⁻_{r+1}] f(x₀)⁻¹ h^{r+1} √(nh)(1 + o_p(1)) + O_p(h^s √(nh)).   (A.14)

Combining (A.14) and (A.9) leads to the desired result.

Proof of Theorem 2. Part (a). The proof uses the following result from Pollard (1993): let P = ⊗_{i=1}^n Pᵢ and Q = ⊗_{i=1}^n Qᵢ be finite products of probability measures such that Qᵢ has density 1 + Δᵢ(·) with respect to Pᵢ. If δᵢ² = E_{Pᵢ} Δᵢ² is finite for each i, then

   ‖ ⊗_{i=1}^n Pᵢ − ⊗_{i=1}^n Qᵢ ‖ ≤ exp( Σ_{i=1}^n δᵢ² ) − 1.   (A.15)

Using this result and (3), we have

   inf_{θ̂} sup_{P∈P} P(|θ̂ − θ| ≥ Δ/2) ≥ (1/2)[ 2 − exp( Σ_{i=1}^n δᵢ² ) ],   (A.16)

provided that θ(P) − θ(Q) > Δ. To get a good lower bound for the minimax risk, we consider two probability models P and Q. Under the model P, the data are generated according to

   Y = m_P(X) + θ_P d + ε,   (A.17)

where Y = (y₁, y₂, ..., yₙ)′, m_P(X) = (m_P(x₁), ..., m_P(xₙ))′, ε = (ε₁, ..., εₙ)′, the xᵢ's are iid uniform(x₀ − 1, x₀ + 1), the εᵢ's are iid N(0, 1), and εᵢ is independent of xⱼ for all i and j. The

data generating process under Q is defined analogously, with m_P(X) + θ_P d replaced by m_Q(X) + θ_Q d. It is obvious that both models P and Q satisfy Assumption 2. We now specify m and θ for each model. For the probability model P, we let m_P(x) = 0 and θ_P = 0. For the probability model Q, we let

   m_Q(x) = ρ^s φ((x − x₀)/ρ)   and   θ_Q = −ρ^s,   (A.18)

where ρ = n^{−1/(2s+1)} and φ is an infinitely differentiable function satisfying (i) 0 ≤ φ(x) ≤ 1, (ii) φ(x) = 0 for x ≤ 0, and (iii) φ(x) = 1 for x ≥ 1. Obviously m_P ∈ M(s, K). We next verify that m_Q ∈ M(s, K). First, by construction, m_Q is continuous on [x₀ − 1, x₀ + 1]. Second, the i-th order derivative m_Q^{(i)}(x) equals ρ^{s−i} φ^{(i)}((x − x₀)/ρ), which is obviously bounded by K when n is large enough, for all i ≤ ℓ. Third, we verify the Hölder condition for the ℓ-th order derivative. It suffices to consider the case where x₁ ∈ [x₀, x₀ + 1] and x₂ ∈ [x₀, x₀ + 1], as the Hölder condition holds trivially when x₁ ∈ [x₀ − 1, x₀] and x₂ ∈ [x₀ − 1, x₀]. We consider three cases: (i) when x₁, x₂ ∈ [x₀, x₀ + ρ], the ℓ-th order derivatives satisfy

   |ρ^{s−ℓ} φ^{(ℓ)}((x₁ − x₀)/ρ) − ρ^{s−ℓ} φ^{(ℓ)}((x₂ − x₀)/ρ)|
   ≤ ρ^{s−ℓ−1} |φ^{(ℓ+1)}(x̃)| |x₁ − x₂|
   = ρ^{s−ℓ−1} |φ^{(ℓ+1)}(x̃)| |x₁ − x₂|^{ℓ+1−s} |x₁ − x₂|^{s−ℓ}
   ≤ C ρ^{s−ℓ−1} ρ^{ℓ+1−s} |x₁ − x₂|^{s−ℓ}
   ≤ K |x₁ − x₂|^{s−ℓ}   (A.19)

for some x̃ between (x₁ − x₀)/ρ and (x₂ − x₀)/ρ, if ρ is small enough; (ii) when x₁ ∈ [x₀, x₀ + ρ] and x₂ ≥ x₀ + ρ,

   |ρ^{s−ℓ} φ^{(ℓ)}((x₁ − x₀)/ρ) − ρ^{s−ℓ} φ^{(ℓ)}((x₂ − x₀)/ρ)|
   = |ρ^{s−ℓ} φ^{(ℓ)}((x₁ − x₀)/ρ) − ρ^{s−ℓ} φ^{(ℓ)}((x₀ + ρ − x₀)/ρ)|
   ≤ K |x₁ − x₀ − ρ|^{s−ℓ} ≤ K |x₁ − x₂|^{s−ℓ},   (A.20)

where the first inequality follows from (A.19); (iii) when x₁ ≥ x₀ + ρ and x₂ ≥ x₀ + ρ, we have φ^{(ℓ)}((x₁ − x₀)/ρ) = φ^{(ℓ)}((x₂ − x₀)/ρ) = 0, so again the Hölder condition holds trivially. It remains to compute the L₁ distance between the two measures. Let the density of Qᵢ with respect to Pᵢ be 1 + Δᵢ(xᵢ, yᵢ); then

   Δᵢ(xᵢ, yᵢ) = ϕ(yᵢ − m_Q(xᵢ) − θ_Q dᵢ)/ϕ(yᵢ) − 1   if xᵢ ∈ [x₀, x₀ + ρ), and 0 otherwise,   (A.21)

where ϕ denotes the standard normal density.
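The role of the choice ρ = n^{−1/(2s+1)} in this construction can be seen from a short calculation; the following is a sketch with constants suppressed (the probability bound uses the uniform(x₀ − 1, x₀ + 1) design, and the per-observation bound uses the Gaussian form of the likelihood ratio for a small mean shift):

```latex
% Sketch: why \rho = n^{-1/(2s+1)} keeps P and Q hard to distinguish.
% The mean functions of P and Q differ only for x_i \in [x_0, x_0 + \rho),
% where the difference is bounded by C\rho^{s}. For Gaussian errors with a
% small mean shift, the per-observation chi-square distance satisfies
\delta_i^2 \;=\; E_{P_i}\Delta_i^2
  \;\le\; C\,\rho^{2s}\,\Pr\!\big(x_i \in [x_0, x_0+\rho)\big)
  \;=\; C\,\rho^{2s}\cdot\tfrac{\rho}{2}
  \;=\; \tfrac{C}{2}\,\rho^{2s+1},
% so that, summing over the n independent observations,
\sum_{i=1}^{n}\delta_i^2 \;\le\; \tfrac{C}{2}\,n\rho^{2s+1} \;=\; \tfrac{C}{2}
\quad\text{for } \rho = n^{-1/(2s+1)}.
% The L_1 distance between the product measures therefore stays bounded,
% while the two treatment effects differ by
% |\theta_P - \theta_Q| = \rho^{s} = n^{-s/(2s+1)},
% which is what delivers the minimax lower bound at this rate.
```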


More information

Research Note REGRESSION ANALYSIS IN MARKOV CHAIN * A. Y. ALAMUTI AND M. R. MESHKANI **

Research Note REGRESSION ANALYSIS IN MARKOV CHAIN * A. Y. ALAMUTI AND M. R. MESHKANI ** Iranian Journal of Science & Technology, Transaction A, Vol 3, No A3 Printed in The Islamic Reublic of Iran, 26 Shiraz University Research Note REGRESSION ANALYSIS IN MARKOV HAIN * A Y ALAMUTI AND M R

More information

arxiv: v1 [physics.data-an] 26 Oct 2012

arxiv: v1 [physics.data-an] 26 Oct 2012 Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch

More information

CHAPTER 3: TANGENT SPACE

CHAPTER 3: TANGENT SPACE CHAPTER 3: TANGENT SPACE DAVID GLICKENSTEIN 1. Tangent sace We shall de ne the tangent sace in several ways. We rst try gluing them together. We know vectors in a Euclidean sace require a baseoint x 2

More information

A New Asymmetric Interaction Ridge (AIR) Regression Method

A New Asymmetric Interaction Ridge (AIR) Regression Method A New Asymmetric Interaction Ridge (AIR) Regression Method by Kristofer Månsson, Ghazi Shukur, and Pär Sölander The Swedish Retail Institute, HUI Research, Stockholm, Sweden. Deartment of Economics and

More information

Generalized Coiflets: A New Family of Orthonormal Wavelets

Generalized Coiflets: A New Family of Orthonormal Wavelets Generalized Coiflets A New Family of Orthonormal Wavelets Dong Wei, Alan C Bovik, and Brian L Evans Laboratory for Image and Video Engineering Deartment of Electrical and Comuter Engineering The University

More information

On Isoperimetric Functions of Probability Measures Having Log-Concave Densities with Respect to the Standard Normal Law

On Isoperimetric Functions of Probability Measures Having Log-Concave Densities with Respect to the Standard Normal Law On Isoerimetric Functions of Probability Measures Having Log-Concave Densities with Resect to the Standard Normal Law Sergey G. Bobkov Abstract Isoerimetric inequalities are discussed for one-dimensional

More information

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek Use of Transformations and the Reeated Statement in PROC GLM in SAS Ed Stanek Introduction We describe how the Reeated Statement in PROC GLM in SAS transforms the data to rovide tests of hyotheses of interest.

More information

Sums of independent random variables

Sums of independent random variables 3 Sums of indeendent random variables This lecture collects a number of estimates for sums of indeendent random variables with values in a Banach sace E. We concentrate on sums of the form N γ nx n, where

More information

Testing Weak Cross-Sectional Dependence in Large Panels

Testing Weak Cross-Sectional Dependence in Large Panels esting Weak Cross-Sectional Deendence in Large Panels M. Hashem Pesaran University of Southern California, and rinity College, Cambridge January, 3 Abstract his aer considers testing the hyothesis that

More information

Notes on Instrumental Variables Methods

Notes on Instrumental Variables Methods Notes on Instrumental Variables Methods Michele Pellizzari IGIER-Bocconi, IZA and frdb 1 The Instrumental Variable Estimator Instrumental variable estimation is the classical solution to the roblem of

More information

Exercises Econometric Models

Exercises Econometric Models Exercises Econometric Models. Let u t be a scalar random variable such that E(u t j I t ) =, t = ; ; ::::, where I t is the (stochastic) information set available at time t. Show that under the hyothesis

More information

Linear diophantine equations for discrete tomography

Linear diophantine equations for discrete tomography Journal of X-Ray Science and Technology 10 001 59 66 59 IOS Press Linear diohantine euations for discrete tomograhy Yangbo Ye a,gewang b and Jiehua Zhu a a Deartment of Mathematics, The University of Iowa,

More information

On Doob s Maximal Inequality for Brownian Motion

On Doob s Maximal Inequality for Brownian Motion Stochastic Process. Al. Vol. 69, No., 997, (-5) Research Reort No. 337, 995, Det. Theoret. Statist. Aarhus On Doob s Maximal Inequality for Brownian Motion S. E. GRAVERSEN and G. PESKIR If B = (B t ) t

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Introduction to Neural Networks : Lecture 13 John A. Bullinaria, 2004 1. The RBF Maing 2. The RBF Network Architecture 3. Comutational Power of RBF Networks 4.

More information

SIGNALING IN CONTESTS. Tomer Ifergane and Aner Sela. Discussion Paper No November 2017

SIGNALING IN CONTESTS. Tomer Ifergane and Aner Sela. Discussion Paper No November 2017 SIGNALING IN CONTESTS Tomer Ifergane and Aner Sela Discussion Paer No. 17-08 November 017 Monaster Center for Economic Research Ben-Gurion University of the Negev P.O. Box 653 Beer Sheva, Israel Fax: 97-8-647941

More information

Empirical Likelihood for Regression Discontinuity Design

Empirical Likelihood for Regression Discontinuity Design Emirical Likeliood for Regression Discontinuity Design Taisuke Otsu Yale University Ke-Li Xu y University of Alberta 8t Marc 2009 Abstract Tis aer rooses emirical likeliood con dence intervals for causal

More information

Numerical Linear Algebra

Numerical Linear Algebra Numerical Linear Algebra Numerous alications in statistics, articularly in the fitting of linear models. Notation and conventions: Elements of a matrix A are denoted by a ij, where i indexes the rows and

More information

Lower Confidence Bound for Process-Yield Index S pk with Autocorrelated Process Data

Lower Confidence Bound for Process-Yield Index S pk with Autocorrelated Process Data Quality Technology & Quantitative Management Vol. 1, No.,. 51-65, 15 QTQM IAQM 15 Lower onfidence Bound for Process-Yield Index with Autocorrelated Process Data Fu-Kwun Wang * and Yeneneh Tamirat Deartment

More information

Hotelling s Two- Sample T 2

Hotelling s Two- Sample T 2 Chater 600 Hotelling s Two- Samle T Introduction This module calculates ower for the Hotelling s two-grou, T-squared (T) test statistic. Hotelling s T is an extension of the univariate two-samle t-test

More information

Analysis of some entrance probabilities for killed birth-death processes

Analysis of some entrance probabilities for killed birth-death processes Analysis of some entrance robabilities for killed birth-death rocesses Master s Thesis O.J.G. van der Velde Suervisor: Dr. F.M. Sieksma July 5, 207 Mathematical Institute, Leiden University Contents Introduction

More information

MATH 2710: NOTES FOR ANALYSIS

MATH 2710: NOTES FOR ANALYSIS MATH 270: NOTES FOR ANALYSIS The main ideas we will learn from analysis center around the idea of a limit. Limits occurs in several settings. We will start with finite limits of sequences, then cover infinite

More information

Estimation of Separable Representations in Psychophysical Experiments

Estimation of Separable Representations in Psychophysical Experiments Estimation of Searable Reresentations in Psychohysical Exeriments Michele Bernasconi (mbernasconi@eco.uninsubria.it) Christine Choirat (cchoirat@eco.uninsubria.it) Raffaello Seri (rseri@eco.uninsubria.it)

More information

CERIAS Tech Report The period of the Bell numbers modulo a prime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education

CERIAS Tech Report The period of the Bell numbers modulo a prime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education CERIAS Tech Reort 2010-01 The eriod of the Bell numbers modulo a rime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education and Research Information Assurance and Security Purdue University,

More information

Let s Fix It: Fixed-b Asymptotics versus Small-b Asymptotics in Heteroskedasticity and Autocorrelation Robust Inference

Let s Fix It: Fixed-b Asymptotics versus Small-b Asymptotics in Heteroskedasticity and Autocorrelation Robust Inference Let s Fix It: Fixed-b Asymtotics versus Small-b Asymtotics in Heteroskedasticity and Autocorrelation Robust Inference Yixiao Sun Deartment of Economics, University of California, San Diego June 5, 3 Abstract

More information

CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP

CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP Submitted to the Annals of Statistics arxiv: arxiv:1706.07237 CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP By Johannes Tewes, Dimitris N. Politis and Daniel J. Nordman Ruhr-Universität

More information

Bootstrap Inference for Impulse Response Functions in Factor-Augmented Vector Autoregressions

Bootstrap Inference for Impulse Response Functions in Factor-Augmented Vector Autoregressions Bootstra Inference for Imulse Resonse Functions in Factor-Augmented Vector Autoregressions Yohei Yamamoto y University of Alberta, School of Business February 2010 Abstract his aer investigates standard

More information

CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules

CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules. Introduction: The is widely used in industry to monitor the number of fraction nonconforming units. A nonconforming unit is

More information

arxiv: v2 [stat.me] 3 Nov 2014

arxiv: v2 [stat.me] 3 Nov 2014 onarametric Stein-tye Shrinkage Covariance Matrix Estimators in High-Dimensional Settings Anestis Touloumis Cancer Research UK Cambridge Institute University of Cambridge Cambridge CB2 0RE, U.K. Anestis.Touloumis@cruk.cam.ac.uk

More information

Variable Selection and Model Building

Variable Selection and Model Building LINEAR REGRESSION ANALYSIS MODULE XIII Lecture - 38 Variable Selection and Model Building Dr. Shalabh Deartment of Mathematics and Statistics Indian Institute of Technology Kanur Evaluation of subset regression

More information

Paper C Exact Volume Balance Versus Exact Mass Balance in Compositional Reservoir Simulation

Paper C Exact Volume Balance Versus Exact Mass Balance in Compositional Reservoir Simulation Paer C Exact Volume Balance Versus Exact Mass Balance in Comositional Reservoir Simulation Submitted to Comutational Geosciences, December 2005. Exact Volume Balance Versus Exact Mass Balance in Comositional

More information

Stochastic integration II: the Itô integral

Stochastic integration II: the Itô integral 13 Stochastic integration II: the Itô integral We have seen in Lecture 6 how to integrate functions Φ : (, ) L (H, E) with resect to an H-cylindrical Brownian motion W H. In this lecture we address the

More information

SCHUR S LEMMA AND BEST CONSTANTS IN WEIGHTED NORM INEQUALITIES. Gord Sinnamon The University of Western Ontario. December 27, 2003

SCHUR S LEMMA AND BEST CONSTANTS IN WEIGHTED NORM INEQUALITIES. Gord Sinnamon The University of Western Ontario. December 27, 2003 SCHUR S LEMMA AND BEST CONSTANTS IN WEIGHTED NORM INEQUALITIES Gord Sinnamon The University of Western Ontario December 27, 23 Abstract. Strong forms of Schur s Lemma and its converse are roved for mas

More information

Feedback-error control

Feedback-error control Chater 4 Feedback-error control 4.1 Introduction This chater exlains the feedback-error (FBE) control scheme originally described by Kawato [, 87, 8]. FBE is a widely used neural network based controller

More information

Tests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test)

Tests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test) Chater 225 Tests for Two Proortions in a Stratified Design (Cochran/Mantel-Haenszel Test) Introduction In a stratified design, the subects are selected from two or more strata which are formed from imortant

More information

How to Estimate Expected Shortfall When Probabilities Are Known with Interval or Fuzzy Uncertainty

How to Estimate Expected Shortfall When Probabilities Are Known with Interval or Fuzzy Uncertainty How to Estimate Exected Shortfall When Probabilities Are Known with Interval or Fuzzy Uncertainty Christian Servin Information Technology Deartment El Paso Community College El Paso, TX 7995, USA cservin@gmail.com

More information

State Estimation with ARMarkov Models

State Estimation with ARMarkov Models Deartment of Mechanical and Aerosace Engineering Technical Reort No. 3046, October 1998. Princeton University, Princeton, NJ. State Estimation with ARMarkov Models Ryoung K. Lim 1 Columbia University,

More information

The Fekete Szegő theorem with splitting conditions: Part I

The Fekete Szegő theorem with splitting conditions: Part I ACTA ARITHMETICA XCIII.2 (2000) The Fekete Szegő theorem with slitting conditions: Part I by Robert Rumely (Athens, GA) A classical theorem of Fekete and Szegő [4] says that if E is a comact set in the

More information

The non-stochastic multi-armed bandit problem

The non-stochastic multi-armed bandit problem Submitted for journal ublication. The non-stochastic multi-armed bandit roblem Peter Auer Institute for Theoretical Comuter Science Graz University of Technology A-8010 Graz (Austria) auer@igi.tu-graz.ac.at

More information

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK Comuter Modelling and ew Technologies, 5, Vol.9, o., 3-39 Transort and Telecommunication Institute, Lomonosov, LV-9, Riga, Latvia MATHEMATICAL MODELLIG OF THE WIRELESS COMMUICATIO ETWORK M. KOPEETSK Deartment

More information

Supplementary Materials for Robust Estimation of the False Discovery Rate

Supplementary Materials for Robust Estimation of the False Discovery Rate Sulementary Materials for Robust Estimation of the False Discovery Rate Stan Pounds and Cheng Cheng This sulemental contains roofs regarding theoretical roerties of the roosed method (Section S1), rovides

More information

Approximating min-max k-clustering

Approximating min-max k-clustering Aroximating min-max k-clustering Asaf Levin July 24, 2007 Abstract We consider the roblems of set artitioning into k clusters with minimum total cost and minimum of the maximum cost of a cluster. The cost

More information

MATH 829: Introduction to Data Mining and Analysis Consistency of Linear Regression

MATH 829: Introduction to Data Mining and Analysis Consistency of Linear Regression 1/9 MATH 829: Introduction to Data Mining and Analysis Consistency of Linear Regression Dominique Guillot Deartments of Mathematical Sciences University of Delaware February 15, 2016 Distribution of regression

More information

Sharp gradient estimate and spectral rigidity for p-laplacian

Sharp gradient estimate and spectral rigidity for p-laplacian Shar gradient estimate and sectral rigidity for -Lalacian Chiung-Jue Anna Sung and Jiaing Wang To aear in ath. Research Letters. Abstract We derive a shar gradient estimate for ositive eigenfunctions of

More information

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests 009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract

More information

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Journal of Modern Alied Statistical Methods Volume Issue Article 7 --03 A Comarison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Ghadban Khalaf King Khalid University, Saudi

More information

John Weatherwax. Analysis of Parallel Depth First Search Algorithms

John Weatherwax. Analysis of Parallel Depth First Search Algorithms Sulementary Discussions and Solutions to Selected Problems in: Introduction to Parallel Comuting by Viin Kumar, Ananth Grama, Anshul Guta, & George Karyis John Weatherwax Chater 8 Analysis of Parallel

More information

Proof: We follow thearoach develoed in [4]. We adot a useful but non-intuitive notion of time; a bin with z balls at time t receives its next ball at

Proof: We follow thearoach develoed in [4]. We adot a useful but non-intuitive notion of time; a bin with z balls at time t receives its next ball at A Scaling Result for Exlosive Processes M. Mitzenmacher Λ J. Sencer We consider the following balls and bins model, as described in [, 4]. Balls are sequentially thrown into bins so that the robability

More information

ON THE NORM OF AN IDEMPOTENT SCHUR MULTIPLIER ON THE SCHATTEN CLASS

ON THE NORM OF AN IDEMPOTENT SCHUR MULTIPLIER ON THE SCHATTEN CLASS PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 00, Number 0, Pages 000 000 S 000-9939XX)0000-0 ON THE NORM OF AN IDEMPOTENT SCHUR MULTIPLIER ON THE SCHATTEN CLASS WILLIAM D. BANKS AND ASMA HARCHARRAS

More information

Uniformly best wavenumber approximations by spatial central difference operators: An initial investigation

Uniformly best wavenumber approximations by spatial central difference operators: An initial investigation Uniformly best wavenumber aroximations by satial central difference oerators: An initial investigation Vitor Linders and Jan Nordström Abstract A characterisation theorem for best uniform wavenumber aroximations

More information

Yixi Shi. Jose Blanchet. IEOR Department Columbia University New York, NY 10027, USA. IEOR Department Columbia University New York, NY 10027, USA

Yixi Shi. Jose Blanchet. IEOR Department Columbia University New York, NY 10027, USA. IEOR Department Columbia University New York, NY 10027, USA Proceedings of the 2011 Winter Simulation Conference S. Jain, R. R. Creasey, J. Himmelsach, K. P. White, and M. Fu, eds. EFFICIENT RARE EVENT SIMULATION FOR HEAVY-TAILED SYSTEMS VIA CROSS ENTROPY Jose

More information

LORENZO BRANDOLESE AND MARIA E. SCHONBEK

LORENZO BRANDOLESE AND MARIA E. SCHONBEK LARGE TIME DECAY AND GROWTH FOR SOLUTIONS OF A VISCOUS BOUSSINESQ SYSTEM LORENZO BRANDOLESE AND MARIA E. SCHONBEK Abstract. In this aer we analyze the decay and the growth for large time of weak and strong

More information

Nonparametric estimation of Exact consumer surplus with endogeneity in price

Nonparametric estimation of Exact consumer surplus with endogeneity in price Nonarametric estimation of Exact consumer surlus with endogeneity in rice Anne Vanhems February 7, 2009 Abstract This aer deals with nonarametric estimation of variation of exact consumer surlus with endogenous

More information

A Bound on the Error of Cross Validation Using the Approximation and Estimation Rates, with Consequences for the Training-Test Split

A Bound on the Error of Cross Validation Using the Approximation and Estimation Rates, with Consequences for the Training-Test Split A Bound on the Error of Cross Validation Using the Aroximation and Estimation Rates, with Consequences for the Training-Test Slit Michael Kearns AT&T Bell Laboratories Murray Hill, NJ 7974 mkearns@research.att.com

More information

Estimating function analysis for a class of Tweedie regression models

Estimating function analysis for a class of Tweedie regression models Title Estimating function analysis for a class of Tweedie regression models Author Wagner Hugo Bonat Deartamento de Estatística - DEST, Laboratório de Estatística e Geoinformação - LEG, Universidade Federal

More information

On a class of Rellich inequalities

On a class of Rellich inequalities On a class of Rellich inequalities G. Barbatis A. Tertikas Dedicated to Professor E.B. Davies on the occasion of his 60th birthday Abstract We rove Rellich and imroved Rellich inequalities that involve

More information

A MARKOVIAN LOCAL RESAMPLING SCHEME FOR NONPARAMETRIC ESTIMATORS IN TIME SERIES ANALYSIS

A MARKOVIAN LOCAL RESAMPLING SCHEME FOR NONPARAMETRIC ESTIMATORS IN TIME SERIES ANALYSIS Econometric heory, 17, 2001, 540 566+ Printed in the United States of America+ A MARKOVIAN LOCAL RESAMPLING SCHEME FOR NONPARAMERIC ESIMAORS IN IME SERIES ANALYSIS EFSAHIOS PAPARODIIS University of Cyrus

More information

Positivity, local smoothing and Harnack inequalities for very fast diffusion equations

Positivity, local smoothing and Harnack inequalities for very fast diffusion equations Positivity, local smoothing and Harnack inequalities for very fast diffusion equations Dedicated to Luis Caffarelli for his ucoming 60 th birthday Matteo Bonforte a, b and Juan Luis Vázquez a, c Abstract

More information

For q 0; 1; : : : ; `? 1, we have m 0; 1; : : : ; q? 1. The set fh j(x) : j 0; 1; ; : : : ; `? 1g forms a basis for the tness functions dened on the i

For q 0; 1; : : : ; `? 1, we have m 0; 1; : : : ; q? 1. The set fh j(x) : j 0; 1; ; : : : ; `? 1g forms a basis for the tness functions dened on the i Comuting with Haar Functions Sami Khuri Deartment of Mathematics and Comuter Science San Jose State University One Washington Square San Jose, CA 9519-0103, USA khuri@juiter.sjsu.edu Fax: (40)94-500 Keywords:

More information

Debt, In ation and Growth

Debt, In ation and Growth Debt, In ation and Growth Robust Estimation of Long-Run E ects in Dynamic Panel Data Models Alexander Chudik a, Kamiar Mohaddes by, M. Hashem Pesaran c, and Mehdi Raissi d a Federal Reserve Bank of Dallas,

More information

Statistical Treatment Choice Based on. Asymmetric Minimax Regret Criteria

Statistical Treatment Choice Based on. Asymmetric Minimax Regret Criteria Statistical Treatment Coice Based on Asymmetric Minimax Regret Criteria Aleksey Tetenov y Deartment of Economics ortwestern University ovember 5, 007 (JOB MARKET PAPER) Abstract Tis aer studies te roblem

More information