On convex least squares estimation when the truth is linear


Electronic Journal of Statistics
Vol. 10 (2016)
ISSN: 1935-7524
DOI: 10.1214/15-EJS1098

On convex least squares estimation when the truth is linear

Yining Chen and Jon A. Wellner
London School of Economics and Political Science and University of Washington
y.chen11@lse.ac.uk; jaw@stat.washington.edu

Abstract: We prove that the convex least squares estimator (LSE) attains a n^{-1/2} pointwise rate of convergence in any region where the truth is linear. In addition, the asymptotic distribution can be characterized by a modified invelope process. Analogous results hold when one uses the derivative of the convex LSE to perform derivative estimation. These asymptotic results facilitate a new consistent testing procedure on linearity against a convex alternative. Moreover, we show that the convex LSE adapts to the optimal rate at the boundary points of the region where the truth is linear, up to a log-log factor. These conclusions are valid in the context of both density estimation and regression function estimation.

MSC 2010 subject classifications: Primary 62E20, 62G07, 62G08, 62G10, 62G20; secondary 62G15.
Keywords and phrases: Adaptive estimation, convexity, density estimation, least squares, regression function estimation, shape constraint.

Received December 2014.

Contents
1 Introduction
2 Estimation of a density function
  2.1 A special case
    2.1.1 Rate of convergence
    2.1.2 Asymptotic distribution
    2.1.3 Testing against a general convex density
  2.2 More general settings
  2.3 Adaptation at the boundary points
3 Estimation of a regression function
  3.1 Basic properties
  3.2 Rate of convergence and asymptotic distribution
4 Appendices: proofs
  4.1 Appendix I: existence of the limit process
  4.2 Appendix II: pointwise adaptation for cases (B) and (C)
Acknowledgments
References

1. Introduction

Shape-constrained estimation has received much attention recently. The attraction is the prospect of obtaining automatic nonparametric estimators with no smoothing parameters to choose. Convexity is among the popular shape constraints that are of both mathematical and practical interest. Groeneboom, Jongbloed and Wellner (2001b) show that under the convexity constraint, the least squares estimator (LSE) can be used to estimate both a density and a regression function. For density estimation, they showed that the LSE ˆf_n of the true convex density f_0 converges pointwise at a n^{-2/5} rate under certain assumptions. The corresponding asymptotic distribution can be characterized via a so-called invelope function investigated by Groeneboom, Jongbloed and Wellner (2001a). In the regression setting, similar results hold for the LSE ˆr_n of the true regression function r_0. However, in the development of their pointwise asymptotic theory, it is required that f_0 (or r_0) have positive second derivative in a neighborhood of the point to be estimated. This assumption excludes certain convex functions that may be of practical value. Two further scenarios of interest are given below:

Scenario 1. At the point x_0, the k-th derivative f_0^{(k)}(x_0) = 0 (or r_0^{(k)}(x_0) = 0) for k = 2, 3, ..., s − 1, and f_0^{(s)}(x_0) > 0 (or r_0^{(s)}(x_0) > 0), where s is an integer greater than one;

Scenario 2. There exists some region [a, b] on which f_0 (or r_0) is linear.

Scenario 1 can be handled using techniques developed in Balabdaoui, Rufibach and Wellner (2009). The aim of this manuscript is to provide theory in the setting of Scenario 2. We prove that for estimation of a convex density when Scenario 2 holds, at any fixed point x_0 ∈ (a, b), the LSE ˆf_n(x_0) converges pointwise to f_0(x_0) at a n^{-1/2} rate. Its left (or right) derivative ˆf'_n(x_0) converges to f'_0(x_0) at the same rate. The corresponding asymptotic distributions are characterized using a modified invelope process.
More generally, we show that the processes {√n(ˆf_n(x) − f_0(x)) : x ∈ [a + δ, b − δ]} and {√n(ˆf'_n(x) − f'_0(x)) : x ∈ [a + δ, b − δ]} both converge weakly for any δ > 0. We remark that unlike the case of Groeneboom, Jongbloed and Wellner (2001b), these processes are not local: intuitively speaking, the linear behavior of the true function is non-local, since it requires linearity on an interval of positive length. In addition, there does not exist a universal distribution of ˆf_n on (a, b), i.e. the pointwise limit distributions at different points are in general different. We also show that the derived asymptotic processes can be used in a more practical setting where one would like to test linearity against a convex alternative. Moreover, we study the adaptation of the LSE ˆf_n at the boundary points of the linear region (i.e. a and b). Note that the difficulty of estimating f_0(a) and f_0(b) depends on the behavior of f_0 outside [a, b]. Nevertheless, we show that ˆf_n(a) (or ˆf_n(b)) converges to f_0(a) (or f_0(b)) at the minimax optimal rate up to a negligible factor involving log log n. Last but not least, we establish the analogous rate and asymptotic distribution results for the LSE ˆr_n in the regression setting.

Our study yields a better understanding of the adaptation of the LSE in terms of pointwise convergence under the convexity constraint. It is also one of the first attempts to quantify the behavior of the convex LSE at non-smooth points. When the truth is linear, the minimax optimal n^{-1/2} pointwise rate is indeed achieved by the LSE on (a, b). The optimal rate at the boundary points a and b is also achievable by the LSE, up to a log-log factor. These results can also be applied to the case where the true function consists of multiple linear components. Furthermore, our results can be viewed as an intermediate stage for the development of theory under misspecification. Note that linearity can be regarded as the boundary case of convexity: if a function is non-convex, then its projection onto the class of convex functions will have linear components. We conjecture that the LSE in these misspecified regions converges at an n^{-1/2} rate, with the asymptotic distribution characterized by a more restricted version of the invelope process. More broadly, we expect that this type of behavior will be seen under other shape restrictions, such as log-concavity for d = 1 (Balabdaoui, Rufibach and Wellner, 2009) and k-monotonicity (Balabdaoui and Wellner, 2007).

The LSE of a convex density function was first studied by Groeneboom, Jongbloed and Wellner (2001b), where its consistency and some asymptotic distribution theory were provided. On the other hand, the idea of using the LSE for convex regression function estimation dates back to Hildreth (1954). Its consistency was proved by Hanson and Pledger (1976), with some rate results given in Mammen (1991). In this manuscript, for the sake of mathematical convenience, we shall focus on the non-discrete version discussed by Balabdaoui and Rufibach (2008). See Groeneboom, Jongbloed and Wellner (2008) for the computational aspects of all the above-mentioned LSEs. There are studies similar to ours regarding other shape restrictions.
See the remark of Groeneboom (1985) and Carolan and Dykstra (1999) in the context of decreasing density function estimation (a.k.a. the Grenander estimator) when the truth is flat, and Balabdaoui (2014) with regard to discrete log-concave distribution estimation when the true distribution is geometric. In addition, Groeneboom and Pyke (1983) studied the Grenander estimator's global behavior in the L_2 norm under the uniform distribution and gave a connection to statistical problems involving combinations of p-values and two-sample rank statistics, while Cator (2011) studied the adaptivity of the LSE in monotone regression or density function estimation. For estimation under misspecification of various shape constraints, we point readers to Jankowski (2014), Cule and Samworth (2010), Dümbgen, Samworth and Schuhmacher (2011), Chen and Samworth (2013), and Balabdaoui et al. (2013). More recent developments on global rates of shape-constrained methods can be found in Guntuboyina and Sen (2015), Doss and Wellner (2016), and Kim and Samworth (2014). See also Meyer (2013) and Chen and Samworth (2016), where an additive structure is imposed in shape-constrained estimation in the multidimensional setting.

The rest of the paper is organized as follows: in Section 2, we study the behavior of the LSE for density estimation. In particular, for notational convenience, we first focus on a special case where the true density function f_0 is taken to be

triangular. The convergence rate and asymptotic distribution are given in Section 2.1.1 and Section 2.1.2. Section 2.1.3 demonstrates the practical use of our asymptotic results, where a new consistent testing procedure on linearity against a convex alternative is proposed. More general cases are handled in Section 2.2, based on the ideas illustrated in Section 2.1. Section 2.3 discusses the adaptation of the LSE at the boundary points. Analogous results with regard to regression function estimation are presented in Section 3. Some proofs, mainly on the existence and uniqueness of a limit process and the adaptation of the LSE, are deferred to the appendices.

2. Estimation of a density function

Suppose we are given n independent and identically distributed (IID) observations from a density function f_0 : [0, ∞) → [0, ∞). Let F_0 be the corresponding distribution function (DF). In this section, we denote by K the convex cone of all non-negative continuous convex and integrable functions on [0, ∞). The LSE of f_0 is given by

    ˆf_n = argmin_{g∈K} { (1/2) ∫_0^∞ g(t)^2 dt − ∫_0^∞ g(t) dF_n(t) },

where F_n is the empirical distribution function of the observations. Furthermore, we denote the DF of ˆf_n by ˆF_n. Throughout the manuscript, unless specified otherwise, the derivative of a convex function can be interpreted as either its left derivative or its right derivative.

2.1. A special case

To motivate the discussion, as well as for notational convenience, we take f_0(t) = 2(1 − t)1_{[0,1]}(t) in Section 2.1.1 and Section 2.1.2. Extensions to other convex density functions are presented in Section 2.2. Consistency of ˆf_n over (0, ∞) in this setting can be found in Groeneboom, Jongbloed and Wellner (2001b). Balabdaoui (2007) has shown how ˆf_n can be used to provide consistent estimators of f_0(0) and f'_0(0).
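The criterion above has no closed-form minimizer; Groeneboom, Jongbloed and Wellner (2008) treat the computational side properly (e.g. via support reduction). Purely as an illustration of the objective (1/2)∫g(t)² dt − ∫g dF_n — and not the algorithm used in the literature — one can minimize a crude discretization with a generic solver. The grid size, sample size, seed, and choice of SLSQP below are all ad hoc assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
# inverse-CDF draws from the triangular density f0(t) = 2(1 - t) on [0, 1]
x = 1.0 - np.sqrt(rng.uniform(size=200))

grid = np.linspace(0.0, 1.0, 26)

def trap(v):
    # trapezoid rule on the uniform grid
    h = grid[1] - grid[0]
    return h * (v.sum() - 0.5 * (v[0] + v[-1]))

def criterion(g):
    # discretized (1/2)∫ g(t)^2 dt − ∫ g dF_n, with g piecewise linear on `grid`
    return 0.5 * trap(g ** 2) - np.mean(np.interp(x, grid, g))

cons = [
    {"type": "ineq", "fun": lambda g: np.diff(g, 2)},      # convexity: 2nd differences >= 0
    {"type": "eq", "fun": lambda g: trap(g) - 1.0},        # integrates to one
]
res = minimize(criterion, x0=np.ones_like(grid), method="SLSQP",
               bounds=[(0.0, None)] * grid.size, constraints=cons)
fhat = res.x  # crude convex density estimate; should roughly track 2(1 - t)
```

With a finer grid and larger n the fitted vector tracks the triangular truth more closely; the point of the sketch is only the shape of the optimization problem (convex quadratic objective, linear shape constraints).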
Here we concentrate on the LSE's rate of convergence and asymptotic distribution.

2.1.1. Rate of convergence

The following theorem shows that the convergence rate at any interior point where f_0 is linear is n^{-1/2}, thus achieving the optimal rate given in Example 1 of Cai and Low (2015).

Theorem 2.1 (Pointwise rate of convergence). Suppose f_0(t) = 2(1 − t)1_{[0,1]}(t). Then for any fixed x_0 ∈ (0, 1), ˆf_n(x_0) − f_0(x_0) = O_p(n^{-1/2}).

Fig 1. The case of c > 0: (A) illustrates the situation of k ≤ −2, where ˆF_n(x_0) − F_0(x_0) ≥ cx_0 > 0; (B) illustrates the situation of k > −2, where ˆF_n(x_0) − F_0(x_0) ≤ −c(1 − x_0) < 0. In both scenarios, the dashed curve represents the supporting hyperplane of the LSE, while the thick solid line represents the true density f_0.

Proof of Theorem 2.1. A key ingredient of this proof is the version of Marshall's lemma in this setting (Dümbgen, Rufibach and Wellner, 2007, Theorem 1), which states that

    ‖ˆF_n − F_0‖_∞ ≤ 2‖F_n − F_0‖_∞,    (2.1)

where ‖·‖_∞ is the uniform norm. Let c = ˆf_n(x_0) − f_0(x_0). Two cases are considered here: (a) c > 0 and (b) c < 0.

In the first case, because ˆf_n is convex, one can find a supporting hyperplane of ˆf_n passing through (x_0, f_0(x_0) + c), so that ˆf_n(y) ≥ k(y − x_0) + f_0(x_0) + c for any y ∈ [0, ∞), where k is the slope of this hyperplane. If k ≤ −2, then ˆF_n(x_0) − F_0(x_0) ≥ cx_0 > 0. Otherwise, ˆF_n(x_0) − F_0(x_0) ≤ −c(1 − x_0) < 0. Figure 1 explains the above inequalities graphically. Consequently, ‖ˆF_n − F_0‖_∞ ≥ c min{x_0, 1 − x_0}.

In the second case, if we can find t_0 < x_0 such that ˆf_n and f_0 intersect at the point (t_0, f_0(t_0)), then

    ˆF_n(t_0) − F_0(t_0) ≥ −(c/2) · t_0²/(x_0 − t_0) ≥ 0;
    ˆF_n(x_0) − ˆF_n(t_0) − {F_0(x_0) − F_0(t_0)} ≤ (c/2)(x_0 − t_0) < 0.

Otherwise, by taking t_0 = 0, we can still verify that the above two inequalities hold true. Figure 2 illustrates these findings graphically.

Fig 2. The case of c < 0: (A) illustrates the inequalities when ˆf_n and f_0 have an intersection on [0, x_0); the situation where ˆf_n and f_0 have no intersection on [0, x_0) is represented in (B). In both scenarios, the dotted curve represents the LSE and the thick solid line represents the true density f_0.

Therefore,

    2‖ˆF_n − F_0‖_∞ ≥ max{ ˆF_n(t_0) − F_0(t_0), F_0(x_0) − F_0(t_0) − ˆF_n(x_0) + ˆF_n(t_0) }
                   ≥ (|c|/2) max{ t_0²/(x_0 − t_0), x_0 − t_0 } ≥ |c| x_0/4,

where the last inequality uses the fact that, for any t_0 ∈ [0, x_0), t_0²/(x_0 − t_0) is an increasing function of t_0 while x_0 − t_0 is a decreasing function of t_0. By (2.1), we have that

    O_p(n^{-1/2}) = 4‖F_n − F_0‖_∞ ≥ 2‖ˆF_n − F_0‖_∞ ≥ min{2x_0, 2(1 − x_0), x_0/4} |c|.

It then follows that c = O_p(n^{-1/2}), as desired.

Corollary 2.2 (Uniform rate of convergence). For any 0 < δ < 1/2,

    sup_{x∈[δ,1−δ]} |ˆf_n(x) − f_0(x)| = O_p(n^{-1/2}).

Proof of Corollary 2.2. A closer look at the proof of Theorem 2.1 reveals that we have

    4‖F_n − F_0‖_∞ ≥ min{2x, 2(1 − x), x/4} |ˆf_n(x) − f_0(x)|

simultaneously for every x ∈ [δ, 1 − δ]. Therefore,

    O_p(n^{-1/2}) = 4‖F_n − F_0‖_∞ ≥ ( inf_{x∈[δ,1−δ]} min{2x, 2(1 − x), x/4} ) · sup_{x∈[δ,1−δ]} |ˆf_n(x) − f_0(x)|.

Consequently, sup_{x∈[δ,1−δ]} |ˆf_n(x) − f_0(x)| is O_p(n^{-1/2}).
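The proof above drives the n^{-1/2} rate entirely through ‖F_n − F_0‖_∞ and Marshall's lemma. A quick Monte Carlo sketch (sample sizes, seed, and replication count are arbitrary choices, not taken from the paper) illustrates that √n ‖F_n − F_0‖_∞ is stochastically bounded for the triangular F_0:

```python
import numpy as np

def F0(t):
    # DF of the triangular density f0(t) = 2(1 - t) on [0, 1]
    t = np.clip(t, 0.0, 1.0)
    return 2.0 * t - t ** 2

def sup_dist(n, rng):
    # Kolmogorov distance sup_t |F_n(t) - F0(t)|; the sup is attained at an
    # order statistic, comparing F0 there against both i/n and (i - 1)/n.
    x = np.sort(1.0 - np.sqrt(rng.uniform(size=n)))  # inverse-CDF sampling from f0
    i = np.arange(1, n + 1)
    return np.max(np.maximum(np.abs(i / n - F0(x)), np.abs((i - 1) / n - F0(x))))

rng = np.random.default_rng(0)
for n in (100, 400, 1600):
    d = np.mean([sup_dist(n, rng) for _ in range(200)])
    print(n, d, np.sqrt(n) * d)  # the last column stabilizes as n grows
```

The rescaled column hovers around a constant (the mean of the supremum of the Brownian bridge modulus), which is exactly the O_p(n^{-1/2}) behavior used in (2.1).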

Let ˆf'_{n−} and ˆf'_{n+} denote respectively the left and right derivatives of ˆf_n. The same convergence rate also applies to these derivative estimators.

Corollary 2.3 (Uniform rate of convergence: derivatives). For any 0 < δ < 1/2,

    sup_{x∈[δ,1−δ]} max{ |ˆf'_{n−}(x) − f'_0(x)|, |ˆf'_{n+}(x) − f'_0(x)| } = O_p(n^{-1/2}).

Proof of Corollary 2.3. By the convexity of ˆf_n,

    sup_{x∈[δ,1−δ]} max{ |ˆf'_{n−}(x) − f'_0(x)|, |ˆf'_{n+}(x) − f'_0(x)| }
    ≤ max{ |ˆf'_{n−}(δ) − f'_0(δ)|, |ˆf'_{n+}(1 − δ) − f'_0(1 − δ)| }
    ≤ (2/δ) max{ |ˆf_n(δ) − ˆf_n(δ/2) − f_0(δ) + f_0(δ/2)|, |ˆf_n(1 − δ/2) − ˆf_n(1 − δ) − f_0(1 − δ/2) + f_0(1 − δ)| }
    = O_p(n^{-1/2}),

where the final equality follows from Corollary 2.2.

2.1.2. Asymptotic distribution

To study the asymptotic distribution of ˆf_n, we start by characterizing the limit distribution.

Theorem 2.4 (Characterization of the limit process). Let X(t) = U(F_0(t)) and Y(t) = ∫_0^t X(s) ds for any t ≥ 0, where U is a standard Brownian bridge process on [0, 1]. Then, almost surely (a.s.), there exists a uniquely defined random continuously differentiable function H on [0, 1] satisfying the following conditions:
(1) H(t) ≤ Y(t) for every t ∈ [0, 1];
(2) H has convex second derivative on (0, 1);
(3) H(0) = Y(0) and H'(0) = X(0);
(4) H(1) = Y(1) and H'(1) = X(1);
(5) ∫_0^1 {H(t) − Y(t)} dH'''(t) = 0.

The above claim also holds if we instead consider a standard Brownian motion X̃(t) on [0, 1] and replace (X, Y, H)^T by (X̃, Ỹ, H̃)^T, where Ỹ(t) = ∫_0^t X̃(s) ds, and where H̃ is in general different from H.

Figure 3 shows a typical realization of X(t), Y(t), H'(t) and H(t). A detailed construction of the above limit invelope process can be found in Appendix I. Note that our process is defined on a compact interval, so it is technically different from the process presented in Groeneboom, Jongbloed and Wellner (2001a), which is defined on the whole real line. As a result, extra conditions regarding

Fig 3. A typical realization of X(t), Y(t), H'(t) and H(t) in Theorem 2.4, with X(t) = U(F_0(t)) and F_0 the distribution function of the triangular density f_0(t) = 2(1 − t)1_{[0,1]}(t): in (A), X(t) and H'(t) are plotted as solid and dash-dotted curves respectively; in (B), the corresponding Y(t) is plotted as a solid curve, while the invelope process H(t) is illustrated by a dash-dotted curve.

the behavior of H and H' at the boundary points are imposed here to ensure its uniqueness. Other characterizations of the limit process are also possible. A slight variant is given below, whose proof can also be found in Appendix I.

Corollary 2.5 (A different characterization). Conditions (3) and (4) in the statement of Theorem 2.4 can be replaced respectively by
(3') lim_{t→0+} H''(t) = ∞, and
(4') lim_{t→1−} H''(t) = ∞.

Now we are in a position to state the main result of this section.

Theorem 2.6 (Asymptotic distribution). Suppose f_0(t) = 2(1 − t)1_{[0,1]}(t). Then for any δ > 0,

    √n ( ˆf_n(x) − f_0(x), ˆf'_n(x) − f'_0(x) )^T ⇒ ( H''(x), H'''(x) )^T

in C[δ, 1 − δ] × D[δ, 1 − δ], where C is the space of continuous functions equipped with the uniform norm, D is the Skorokhod space, and H is the invelope process defined in Theorem 2.4. In particular, for any x_0 ∈ (0, 1),

    √n ( ˆf_n(x_0) − f_0(x_0), ˆf'_n(x_0) − f'_0(x_0) )^T →d ( H''(x_0), H'''(x_0) )^T.    (2.2)
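The driving pair (X, Y) of Theorem 2.4 (as pictured in Figure 3) is straightforward to simulate on a grid; computing the invelope H itself requires solving the constrained problem of Appendix I and is not attempted in this sketch. Grid size and seed below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 2000
s = np.linspace(0.0, 1.0, m + 1)

# standard Brownian bridge U on [0, 1]: U(s) = W(s) - s * W(1)
W = np.concatenate([[0.0], np.cumsum(rng.normal(scale=np.sqrt(1.0 / m), size=m))])
U = W - s * W[-1]

t = np.linspace(0.0, 1.0, m + 1)
F0 = 2.0 * t - t ** 2                  # DF of the triangular density 2(1 - t)
X = np.interp(F0, s, U)                # X(t) = U(F0(t))
# Y(t) = integral of X from 0 to t, via the trapezoid rule
Y = np.concatenate([[0.0], np.cumsum(0.5 * (X[1:] + X[:-1]) * np.diff(t))])
```

Note that X vanishes at both endpoints (since U is a bridge and F_0(1) = 1), which is precisely the fact exploited in the proof of Proposition 2.8 below.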

Proof of Theorem 2.6. Before proceeding to the proof proper, we first define the following processes on [0, ∞):

    X_n(t) = n^{1/2} (F_n(t) − F_0(t));
    Y_n(t) = n^{1/2} ∫_0^t (F_n(s) − F_0(s)) ds;
    Ĥ_n(t) = n^{1/2} ∫_0^t { ∫_0^s (ˆf_n(v) − f_0(v)) dv } ds.    (2.3)

Furthermore, define the set of knots of a convex function f on (0, 1) as

    S(f) = { t ∈ (0, 1) : f(t) < (1/2){f(t − δ) + f(t + δ)} for all δ ∈ (0, min(1 − t, t)] }.    (2.4)

We remark that the above definition of knots can easily be extended to convex functions with a different domain. By Lemma 2.2 of Groeneboom, Jongbloed and Wellner (2001b), we have that Ĥ_n(t) ≤ Y_n(t) for t ∈ [0, 1], with equality if t ∈ S(ˆf_n). In addition, ∫_0^1 {Ĥ_n(t) − Y_n(t)} dĤ'''_n(t) = 0. Now define the space E_m of vector-valued functions (m ≥ 3) as

    E_m = C[0, 1] × C[0, 1] × C[1/m, 1 − 1/m] × D[1/m, 1 − 1/m] × C[0, 1] × D[0, 1],

and endow E_m with the product topology induced by the uniform norm on C (i.e. the space of continuous functions) and the Skorokhod metric on D (i.e. the Skorokhod space). Consider the E_m-valued stochastic process Z_n = (Ĥ_n, Ĥ'_n, Ĥ''_n, Ĥ'''_n, Y_n, X_n)^T. Corollary 2.3 entails the tightness of Ĥ'''_n in D[1/m, 1 − 1/m], while Corollary 2.2 entails the tightness of Ĥ''_n in C[1/m, 1 − 1/m]. In addition, both Ĥ_n and Ĥ'_n are tight in C[0, 1] via an easy application of Marshall's lemma. Since X_n converges to a Brownian bridge, it is tight in D[0, 1] as well. Finally, the same is also true for Y_n. Note that E_m is separable. For any subsequence of Z_n, by Prohorov's theorem, we can construct a further subsequence Z_{n_j} such that {Z_{n_j}}_j converges weakly in E_m to some Z_0 = (H_0, H'_0, H''_0, H'''_0, Y, X)^T. In addition, it follows from a diagonalization argument that a further extraction can make {Z_{n_j}}_j converge weakly to Z_0 in every E_m. Using Skorokhod's representation theorem, we can assume without loss of generality that for almost every ω in the sample space, Z_{n_j}(ω) → Z_0(ω). Moreover, since X has a continuous

path a.s., the convergence of X_{n_j} can be strengthened to ‖X_{n_j}(ω) − X(ω)‖_∞ → 0, where ‖·‖_∞ is the uniform norm. In the rest of the proof, ω is suppressed for notational convenience, so depending on the context, Z_{n_j} or Z_0 can mean either a random variable or a particular realization. In the following, we shall verify that H_0 satisfies all the conditions listed in the statement of Theorem 2.4.

(1) The fulfillment of this condition follows from the fact that Ĥ_n(t) − Y_n(t) ≤ 0 for every t ∈ [0, 1] and all n ∈ ℕ.

(2) Since f_0 is linear on [0, 1], Ĥ''_n is convex on [0, 1] for every n ∈ ℕ. The pointwise limit of convex functions is still convex, so H''_0 is convex on [1/m, 1 − 1/m]. The condition is then satisfied by letting m → ∞.

(3) This condition always holds in view of our construction of Ĥ_n in (2.3).

(4) We consider two cases: (a) if H''_0(1 − 1/m) → ∞ as m → ∞, then the conditions are satisfied in view of Corollary 2.5; (b) otherwise, it must be the case that H''_0 is bounded from above near 1. Note that H''_0 is also bounded from below a.s., which can be proved by using Corollary 2.3 and the fact that Ĥ''_n is convex. Denote by τ_{n_j} the knot of ˆf_{n_j} closest to 1. Then by Lemma 2.2 of Groeneboom, Jongbloed and Wellner (2001b), Ĥ_{n_j}(τ_{n_j}) = Y_{n_j}(τ_{n_j}) and Ĥ'_{n_j}(τ_{n_j}) = X_{n_j}(τ_{n_j}). Since t = 1 is a knot of f_0, consistency of ˆf_{n_j} allows us to see that lim_j τ_{n_j} = 1. Because H''_0 is finite near 1 and both Y(t) and X(t) are sample-continuous processes, taking τ_{n_j} → 1 yields H_0(1) = Y(1) and H'_0(1) = X(1). Note that this argument remains valid even if τ_{n_j} > 1, because in this scenario Ĥ''_{n_j} is linear and bounded on [2 − τ_{n_j}, τ_{n_j}].

(5) It follows from ∫_{1/m}^{1−1/m} {Ĥ_{n_j}(t) − Y_{n_j}(t)} dĤ'''_{n_j}(t) → 0 that ∫_{1/m}^{1−1/m} {H_0(t) − Y(t)} dH'''_0(t) = 0. Since this holds for any m, one necessarily has that ∫_0^1 {H_0(t) − Y(t)} dH'''_0(t) = 0.

Consequently, in view of Theorem 2.4, the limit Z_0 is the same for any subsequence of Z_n in E_m. Fix any m > 1/δ. It follows that the full sequence {Z_n}_n converges weakly in E_m and has the limit (H, H', H'', H''', Y, X)^T.
This, together with the fact that H''' is continuous at any fixed x_0 ∈ (0, 1) with probability one (which can be proved using Conditions (1) and (5) on H), yields (2.2).

It can be inferred from Corollary 2.5 and Theorem 2.6 that neither ˆf_n(0) nor ˆf_n(1) converges to the truth at a n^{-1/2} rate. In fact, Balabdaoui (2007)

proved that ˆf_n(0) is an inconsistent estimator of f_0(0). Nevertheless, the following proposition shows that ˆf_n(0) is at most O_p(1). For the case of the maximum likelihood estimator of a k-monotone density, we refer readers to Gao and Wellner (2009) for a similar result.

Proposition 2.7 (Behavior at zero). ˆf_n(0) = O_p(1).

Proof of Proposition 2.7. Let τ_n be the left-most point in S(ˆf_n). Since ˆf_n(0) is finite for every n, the linearity of ˆf_n on [0, τ_n] means that ˆf_n(t) ≥ (1 − t/τ_n) ˆf_n(0) for every t ≥ 0. By Corollary 2.1 of Groeneboom, Jongbloed and Wellner (2001b),

    F_n(τ_n) = ∫_0^{τ_n} ˆf_n(t) dt ≥ ∫_0^{τ_n} (1 − t/τ_n) ˆf_n(0) dt = τ_n ˆf_n(0)/2.

It follows from Theorem 9.1.2 of Shorack and Wellner (1986) that

    ˆf_n(0) ≤ 2 F_n(τ_n)/τ_n = 2 · { F_n(τ_n)/F_0(τ_n) } · { F_0(τ_n)/τ_n } ≤ 4 sup_{t>0} F_n(t)/F_0(t) = O_p(1),

since F_0(t)/t = 2 − t ≤ 2 for t ∈ (0, 1].

2.1.3. Testing against a general convex density

The asymptotic results established in Section 2.1.2 can be used to test the linearity of a density function against a general convex alternative. To illustrate the main idea, in this section we focus on the problem of testing

    H_0: f_0(t) = 2(1 − t)1_{[0,1]}(t)  against  H_1: f_0 ∈ K and f_0(t) ≠ 2(1 − t)1_{[0,1]}(t) for some t ∈ (0, ∞).

The test we propose is free of tuning parameters. Since the triangular distribution is frequently used in practice, and is closely related to the uniform distribution (e.g. the minimum of two independent U[0, 1] random variables is triangular), our test could be a valuable addition to practitioners' toolkits. For other tests of linearity in the regression setting, we point readers to Meyer (2003) and Sen and Meyer (2013). Our test is based on the statistic

    T_n := √n sup_{t∈[0,∞)} { 2(1 − t)1_{[0,1]}(t) − ˆf_n(t) }.

The behavior of T_n under H_0 is established below.

Proposition 2.8 (Behavior of T_n under H_0). Write T := −inf_{t∈[0,1]} H''(t), where H is the invelope process defined in Theorem 2.4. Then T_n →d T. Moreover, P(T ≥ 0) = 1.
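Given any convex estimate evaluated on a grid, T_n is elementary to compute. A minimal sketch follows; the piecewise-linear "estimate" fed in below is a hypothetical stand-in rather than an actual LSE, and the critical values t_α are the simulated quantiles of Table 1, not reproduced in code.

```python
import numpy as np

def T_stat(n, grid, fhat):
    # T_n = sqrt(n) * sup_t { f0(t) - fhat(t) }, with f0(t) = 2(1 - t) on [0, 1].
    # For a piecewise-linear fhat the supremum over [0, 1] is attained at a
    # grid point; beyond 1 the difference is -fhat(t) <= 0, so the grid suffices.
    f0 = np.where((grid >= 0.0) & (grid <= 1.0), 2.0 * (1.0 - grid), 0.0)
    return np.sqrt(n) * np.max(f0 - fhat)

# toy check with a hypothetical convex "estimate" sitting 0.05 below f0
grid = np.linspace(0.0, 1.0, 101)
Tn = T_stat(400, grid, 2.0 * (1.0 - grid) - 0.05)  # = sqrt(400) * 0.05, i.e. about 1.0
# decision rule: reject H0 at level alpha when Tn exceeds t_alpha from Table 1
```

The one-sided form of T_n mirrors Proposition 2.8: under H_0 the statistic converges to T, which is non-negative with probability one.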

Table 1
Estimated upper α-quantiles t_α of T based on 10^5 simulations

    α      1%    2.5%    5%    10%    20%
    t_α

Fig 4. The estimated density function of T based on 10^5 simulations. Here we used the kernel density estimator with a boundary correction at zero.

Proof of Proposition 2.8. The first part follows directly from Theorem 2.6 and Corollary 2.5. For the second part, note that if T < 0, then inf_{t∈[0,1]} H''(t) > 0, which would imply H'(1) − H'(0) > 0. But this would violate the characterization of H (i.e. H'(1) − H'(0) = X(1) − X(0) = 0, since X is a Brownian bridge). Consequently, P(T ≥ 0) = 1.

The quantiles and the density function of T can be approximated numerically using Monte Carlo methods. They are given in Table 1 and Figure 4. Here we denote the upper α-quantile of T by t_α. Our numerical evidence also suggests that the density function of T appears to be log-concave. For related work on the log-concavity of Chernoff's distribution (i.e. a different random variable also associated with a Brownian motion), see Balabdaoui and Wellner (2014).

For a test of size α ∈ (0, 1), we propose to reject H_0 if T_n > t_α. According to Proposition 2.8, this test is asymptotically of size α. In the following, we prove that our test is also consistent, i.e., its power goes to one as n → ∞.

Theorem 2.9 (Consistency of the test). Under H_1, for any α > 0, P(T_n > t_α) → 1 as n → ∞.

Proof of Theorem 2.9. Suppose that f_0(t_0) ≠ 2(1 − t_0)1_{[0,1]}(t_0) for some t_0 ∈ (0, ∞). First, we show that there exists some t_1 ∈ (0, 1) such that f_0(t_1) < 2(1 − t_1). Suppose the conclusion fails; then f_0(t) ≥ 2(1 − t)1_{[0,1]}(t) for all t > 0. Hence ∫_0^∞ f_0(t) dt ≥

∫_0^∞ 2(1 − t)1_{[0,1]}(t) dt = 1, with strict inequality if f_0(t) > 2(1 − t)1_{[0,1]}(t) for some t > 0. But then f_0 is not a density, and we conclude that f_0(t) = 2(1 − t)1_{[0,1]}(t) for every t > 0. This contradiction yields the conclusion. Next, it follows from Theorem 3.1 of Groeneboom, Jongbloed and Wellner (2001b) that the LSE ˆf_n is consistent at t_1, i.e., ˆf_n(t_1) → f_0(t_1) a.s. Therefore, almost surely,

    T_n/√n ≥ 2(1 − t_1) − ˆf_n(t_1) = 2(1 − t_1) − f_0(t_1) + f_0(t_1) − ˆf_n(t_1) → 2(1 − t_1) − f_0(t_1) > 0.

It follows that P(T_n > t_α) → 1.

2.2. More general settings

The aim of this subsection is to extend the conclusions presented in Section 2.1 to more general convex densities. We assume that f_0 is positive and linear on (a, b) for some 0 ≤ a < b, where the open interval (a, b) is picked as the largest interval on which f_0 remains linear. More precisely, this means that there does not exist a bigger open interval (a', b') (i.e. (a, b) ⊊ (a', b')) on which f_0 is linear. For the sake of notational convenience, we suppress the dependence of H on a, b and F_0 in the following two theorems. Their proofs are similar to those given in Section 2.1, so are omitted for brevity.

Theorem 2.10 (Characterization of the limit process). Let X(t) = U(F_0(t)) and Y(t) = ∫_0^t X(s) ds for any t > 0. Then a.s. there exists a uniquely defined random continuously differentiable function H on [a, b] satisfying the following conditions:
(1) H(t) ≤ Y(t) for every t ∈ [a, b];
(2) H has convex second derivative on (a, b);
(3) H(a) = Y(a) and H'(a) = X(a);
(4) H(b) = Y(b) and H'(b) = X(b);
(5) ∫_a^b {H(t) − Y(t)} dH'''(t) = 0.

Theorem 2.11 (Rate and asymptotic distribution). For any 0 < δ < (b − a)/2,

    sup_{x∈[a+δ,b−δ]} max{ |ˆf_n(x) − f_0(x)|, |ˆf'_n(x) − f'_0(x)| } = O_p(n^{-1/2}).

Moreover,

    √n ( ˆf_n(x) − f_0(x), ˆf'_n(x) − f'_0(x) )^T ⇒ ( H''(x), H'''(x) )^T

in C[a + δ, b − δ] × D[a + δ, b − δ], where H is the invelope process defined in Theorem 2.10.

Fig 5. Typical f_0 in the different cases are illustrated, where we set α = 2 in (B) and (C).

2.3. Adaptation at the boundary points

In this subsection, we study the pointwise convergence rate of the convex LSE ˆf_n at the boundary points of the region where f_0 is linear. Examples of such points include a and b given in Section 2.2. To begin our discussion, we assume that x_0 ∈ (0, ∞) is such a boundary point in the interior of the support (i.e. f_0(x_0) > 0). Here again f_0 is a convex and decreasing density function on [0, ∞). Three cases are under investigation, as illustrated in Figure 5:

(A) f_0(t) = f_0(x_0) + K_1(t − x_0) + K_2(t − x_0)1_{[x_0,∞)}(t) for every t in a fixed small neighborhood of x_0, with K_1 + K_2 < 0 and K_2 > 0;

(B) f_0(t) = f_0(x_0) + K_1(t − x_0) + K_2{(t − x_0)1_{[x_0,∞)}(t)}^α for every t in a fixed small neighborhood of x_0, with K_1 < 0, K_2 > 0 and α > 1;

(C) f_0(t) = f_0(x_0) + K_1(t − x_0) + K_2{(x_0 − t)1_{[0,x_0]}(t)}^α for every t in a fixed small neighborhood of x_0, with K_1 < 0, K_2 > 0 and α > 1.

Note that in all the cases above, only the behavior of f_0 in a small neighborhood of x_0 is relevant. As pointed out in Example 2 of Cai and Low (2015), the minimax optimal convergence rate at x_0 is n^{-1/3} in (A). Furthermore, in (B) and (C), Example 4 of Cai and Low (2015) suggests that the optimal rate at x_0 is n^{-α/(2α+1)}. In the following, we prove that the convex LSE automatically adapts to these optimal rates, up to a factor involving log log n. Though almost negligible, this factor indicates that there might be room for improvement. These results should be viewed as a first step in the investigation of the adaptation of the convex LSE. One could also compare our results with Cator (2011), where the adaptation of the LSE in the context of decreasing density function estimation was tackled.

Theorem 2.12 (Adaptation at the boundary points: I). In the case of (A),

    ˆf_n(x_0) − f_0(x_0) = O_p(n^{-1/3} (log log n)^{1/2}).

Moreover, min{ˆf_n(x_0) − f_0(x_0), 0} = O_p(n^{-1/2}).

Proof of Theorem 2.12. Suppose that for some fixed δ > 0, (A) holds for every t ∈ [x_0 − δ, x_0 + δ]. Our first aim is to show that

    inf_{t∈[x_0−δ,x_0+δ]} min{ ˆf_n(t) − f_0(t), 0 } = O_p(n^{-1/2}).    (2.5)

Suppose that c := ˆf_n(x_0) − f_0(x_0) < 0. If we can find x_0 − δ ≤ t_0 < x_0 such that ˆf_n and f_0 intersect at the point (t_0, f_0(t_0)), then

    ˆF_n(t_0) − ˆF_n(x_0 − δ) − { F_0(t_0) − F_0(x_0 − δ) } ≥ −(c/2) · (t_0 − x_0 + δ)²/(x_0 − t_0);
    ˆF_n(x_0) − ˆF_n(t_0) − { F_0(x_0) − F_0(t_0) } ≤ (c/2)(x_0 − t_0) < 0.

Otherwise, by taking t_0 = x_0 − δ, we can still verify the above two inequalities. Therefore,

    O_p(n^{-1/2}) = 2‖ˆF_n − F_0‖_∞
    ≥ max{ ˆF_n(t_0) − ˆF_n(x_0 − δ) − F_0(t_0) + F_0(x_0 − δ), F_0(x_0) − F_0(t_0) − ˆF_n(x_0) + ˆF_n(t_0) }
    ≥ (|c|/2) max{ (t_0 − x_0 + δ)²/(x_0 − t_0), x_0 − t_0 } ≥ |c| δ/4,

where the first equality follows from Marshall's lemma (see (2.1)). Consequently, (2.5) holds true. As a remark, the above arguments are essentially the same as those illustrated in the proof of Theorem 2.1, where one relies only on the fact that f_0 is linear on [x_0 − δ, x_0] (i.e. one-sided linearity).

In the rest of the proof, it suffices to consider only the situation of ˆf_n(x_0) − f_0(x_0) > 0. Recall that to handle this scenario, the proof of Theorem 2.1 makes use of the fact that the triangular density is linear on the whole of [0, 1] (i.e. two-sided linearity). This is no longer true here, since f_0 is not linear on [x_0 − δ, x_0 + δ]. In the following, we deploy a different strategy to establish the rate.

Let τ_n^− = max{t ∈ S(ˆf_n), t < x_0} and τ_n^+ = min{t ∈ S(ˆf_n), t ≥ x_0}, where S(·) is defined in (2.4). We consider three different cases separately.

(a) max{τ_n^+ − x_0, x_0 − τ_n^−} > δ. Then f_0 and ˆf_n are linear on either [x_0 − δ, x_0] or [x_0, x_0 + δ]. It follows from the line of reasoning in the proof of Theorem 2.1 that max{ˆf_n(x_0) − f_0(x_0), 0} = O_p(n^{-1/2}).

(b) max{τ_n^+ − x_0, x_0 − τ_n^−} ≤ δ and τ_n^+ − τ_n^− > n^{-1/3}. Note that membership of the set S(ˆf_n) implies that F_n(τ_n^−) = ˆF_n(τ_n^−) and F_n(τ_n^+) = ˆF_n(τ_n^+). Since ˆf_n is linear on [τ_n^−, τ_n^+],

    F_n(τ_n^+) − F_0(τ_n^+) − { F_n(τ_n^−) − F_0(τ_n^−) }    (2.6)

    = ∫_{τ_n^−}^{τ_n^+} (ˆf_n(t) − f_0(t)) dt
    ≥ { ˆf_n(x_0) − f_0(x_0) } (τ_n^+ − τ_n^−)/2 + min{ ˆf_n(τ_n^−) − f_0(τ_n^−), ˆf_n(τ_n^+) − f_0(τ_n^+), 0 } (τ_n^+ − τ_n^−).

By the law of the iterated logarithm for local empirical processes (cf. Csörgő and Horváth, 1993), the term displayed in (2.6) is at most (τ_n^+ − τ_n^−)^{1/2} O_p(n^{-1/2} (log log n)^{1/2}). In view of (2.5), rearranging the terms in the above inequality yields

    ˆf_n(x_0) − f_0(x_0) ≤ (τ_n^+ − τ_n^−)^{-1/2} O_p(n^{-1/2} (log log n)^{1/2}) + O_p(n^{-1/2}) ≤ O_p(n^{-1/3} (log log n)^{1/2}).

(c) τ_n^+ − τ_n^− ≤ n^{-1/3}. Now define τ_n^{++} = max{t ∈ S(ˆf_n), t < x_0 + 2n^{-1/3}} and τ_n^{+++} = min{t ∈ S(ˆf_n), t ≥ x_0 + 2n^{-1/3}}. Here the existence of τ_n^{++} is guaranteed by the condition that τ_n^+ − τ_n^− ≤ n^{-1/3}. Furthermore, we note that in our definitions τ_n^+ and τ_n^{++} might not be distinct. Within this setting, three further scenarios are to be dealt with.

(c1) τ_n^{+++} − x_0 > δ. Since both f_0 and ˆf_n are linear on [τ_n^{++}, τ_n^{+++}], one can apply a strategy similar to that used in the proof of Theorem 2.1 to show that ˆf_n(τ_n^{++}) − f_0(τ_n^{++}) = O_p(n^{-1/2}). Consequently,

    ˆf_n(x_0) − f_0(x_0) = ˆf_n(τ_n^{++}) − f_0(τ_n^{++}) − ∫_{x_0}^{τ_n^{++}} (ˆf'_n(t) − f'_0(t)) dt = O_p(n^{-1/3}),

where we have used the facts that τ_n^{++} − x_0 < 2n^{-1/3} and

    sup_{t∈[x_0,x_0+δ]} |ˆf'_n(t) − f'_0(t)| = O_p(1).

Here the second fact can be derived by invoking

    K_1 + o_p(1) ≤ inf_{t∈[x_0−δ,x_0+δ]} ˆf'_n(t) ≤ sup_{t∈[x_0−δ,x_0+δ]} ˆf'_n(t) ≤ K_1 + K_2 + o_p(1),

which follows easily from the consistency of ˆf_n in estimating f_0 at the points x_0 ± δ.

(c2) 3n^{-1/3} < τ_n^{+++} − x_0 ≤ δ. Then τ_n^{+++} − τ_n^{++} > n^{-1/3}. Using essentially the same argument as in (b), we see that

    max{ ˆf_n(τ_n^{++}) − f_0(τ_n^{++}), 0 } = O_p(n^{-1/3} (log log n)^{1/2}),

and hence

    |ˆf_n(τ_n^{++}) − f_0(τ_n^{++})| = O_p(n^{-1/3} (log log n)^{1/2}).

We then apply the argument presented in (c1) to derive ˆf_n(x_0) − f_0(x_0) = O_p(n^{-1/3} (log log n)^{1/2}).

(c3) τ_n^{+++} − x_0 ≤ 3n^{-1/3}. Then n^{-1/3} ≤ τ_n^{+++} − τ_n^+ ≤ 3n^{-1/3}. By proceeding as in the proof of Lemma 4.3 of Groeneboom, Jongbloed and Wellner (2001b), we are able to verify that

    inf_{t∈[τ_n^+,τ_n^{+++}]} (ˆf_n(t) − f_0(t)) = O_p(n^{-1/3}).

Here we invoked Lemma A.1 of Balabdaoui and Wellner (2007) with k = 1 and d = 2 to verify the above claim. See also Kim and Pollard (1990). Finally, we can argue as in (c1) to show that ˆf_n(x_0) − f_0(x_0) = O_p(n^{-1/3}).

The proof is complete by taking into account all the above cases. We remark that in our proof the worst-case scenarios that drive the convergence rate are (b) and (c).

Theorem 2.13 (Adaptation at the boundary points: II). In the case of (B) or (C),

    ˆf_n(x_0) − f_0(x_0) = O_p(n^{-α/(2α+1)} (log log n)^{1/2}).

Moreover, min{ˆf_n(x_0) − f_0(x_0), 0} = O_p(n^{-1/2}). The proof of Theorem 2.13 is of a more technical nature and is deferred to Appendix II.

3. Estimation of a regression function

Changing notation slightly from the previous section, we now assume that we are given pairs {(X_{n,i}, Y_{n,i}) : i = 1, ..., n} for n = 1, 2, ..., with

    Y_{n,i} = r_0(X_{n,i}) + ε_{n,i}, for i = 1, ..., n,

where r_0 : [0, 1] → R is a convex function, and where {ε_{n,i} : i = 1, ..., n} is a triangular array of IID random variables satisfying

(i) E(ε_{1,1}) = 0 and σ² = E(ε²_{1,1}) < ∞.

To simplify our analysis, the following fixed design is considered:

(ii) X_{n,i} = i/(n + 1) for i = 1, ..., n.

The LSE of r_0 proposed by Balabdaoui and Rufibach (2008) is

    ˆr_n = argmin_{g∈K} { (1/2) ∫_0^1 g(t)² dt − (1/n) Σ_{i=1}^n g(X_{n,i}) Y_{n,i} },    (3.1)

where, in this section, K denotes the set of all continuous convex functions on [0, 1]. We note that the above estimator is slightly different from the more classical LSE that minimizes (1/(2n)) Σ_{i=1}^n g(X_{n,i})² − (1/n) Σ_{i=1}^n g(X_{n,i}) Y_{n,i} over K. Since its criterion function has a completely discrete nature, different techniques are needed to prove analogous results. We will not pursue this direction in the manuscript.

3.1. Basic properties

In this subsection, we list some basic properties of ˆr_n given by (3.1). For any t ∈ [0, 1], define

    R_0(t) = ∫_0^t r_0(s) ds;
    ˆR_n(t) = ∫_0^t ˆr_n(s) ds;
    R_n(t) = (1/n) Σ_{i=1}^n Y_{n,i} 1_{[X_{n,i},∞)}(t).

Proposition 3.1. The LSE ˆr_n exists and is unique.

Proposition 3.2. Let S(ˆr_n) denote the set of knots of ˆr_n. The following properties hold:
(a) ∫_0^t {ˆR_n(s) − R_n(s)} ds = 0 for all t ∈ S(ˆr_n) ∪ {0, 1}, and ≤ 0 for all t ∈ [0, 1];
(b) ˆR_n(t) = R_n(t) for any t ∈ S(ˆr_n) ∪ {0, 1};
(c) ∫_0^1 [ ∫_0^t {ˆR_n(s) − R_n(s)} ds ] dˆr'_n(t) = 0.

Proposition 3.3 (Marshall's lemma).

    ‖ˆR_n − R_0‖_∞ ≤ 2‖R_n − R_0‖_∞.    (3.2)

Proof of Propositions 3.1, 3.2 and 3.3. Note that we can rewrite ˆr_n as

    ˆr_n = argmin_{g∈K} { (1/2) ∫_0^1 g(t)² dt − ∫_0^1 g(t) dR_n(t) }.

From this perspective, it is easy to check that Propositions 3.1, 3.2 and 3.3 follow from, respectively, slight modifications of Lemma 2.1 and Lemma 2.2 of Groeneboom, Jongbloed and Wellner (2001b) and Theorem 1 of Dümbgen, Rufibach and Wellner (2007).

We remark that this version of Marshall's lemma in the regression setting serves as an important tool for establishing consistency and the rate. In particular, in

the following, we show that (3.2) easily yields consistency of $\hat R_n$, and consistency of $\hat r_n$ follows from this together with convexity of $\hat r_n$.

Proposition 3.4 (Consistency). Under Assumptions (i)–(ii), for any $0<\delta\le1/2$,
\[
\sup_{x\in[\delta,1-\delta]} |\hat r_n(x) - r_0(x)| \to 0 \quad\text{a.s., as } n\to\infty.
\]

Proof of Proposition 3.4. The proof relies on the following intermediate result: suppose that $G_0, G_1, \dots$ are differentiable functions from $[0,1]$ to $\mathbb R$ with convex first derivatives; then for any fixed $0<\delta\le1/2$, $\|G_k - G_0\|_\infty \to 0$ implies $\sup_{t\in[\delta,1-\delta]} |G_k'(t) - G_0'(t)| \to 0$.

To see this, we first show that
\[
\limsup_k\; \sup_{t\in[\delta,1-\delta]} \{G_k'(t) - G_0'(t)\} \le 0. \tag{3.3}
\]
Note that for any $t\in[\delta,1-\delta]$ and any $0<\epsilon\le\delta$, the convexity of $G_k'$ entails
\[
G_k'(t) - G_0'(t) \le \max\Bigl(\frac{G_k(t+\epsilon)-G_k(t)}{\epsilon} - G_0'(t),\; \frac{G_k(t)-G_k(t-\epsilon)}{\epsilon} - G_0'(t)\Bigr).
\]
It then follows (using $\|G_k-G_0\|_\infty\to0$) that
\[
\limsup_k\; \sup_{t\in[\delta,1-\delta]} \{G_k'(t) - G_0'(t)\} \le \sup_{t\in[\delta,1-\delta]} \max\Bigl(\frac{G_0(t+\epsilon)-G_0(t)}{\epsilon} - G_0'(t),\; \frac{G_0(t)-G_0(t-\epsilon)}{\epsilon} - G_0'(t)\Bigr).
\]
Our claim (3.3) can be verified by letting $\epsilon\to0$.

Secondly, we show that
\[
\liminf_k\; \inf_{t\in[0,1]} \{G_k'(t) - G_0'(t)\} \ge 0.
\]
We prove this by contradiction. Suppose that $\liminf_k \inf_{t\in[0,1]} \{G_k'(t) - G_0'(t)\} = -2M$ for some $M>0$. By extracting subsequences if necessary, we can assume that $\inf_{t\in[0,1]} \{G_k'(t) - G_0'(t)\} \to -2M$ as $k\to\infty$. In view of (3.3), it follows from the convexity of $G_k'$ and $G_0'$ that one can find an interval $I_k$ of positive length $\delta_0$ (which can depend on $M$) such that $\inf_{t\in I_k} \{G_0'(t) - G_k'(t)\} \ge M/2$ for every $k>k_0$, where $k_0$ is a sufficiently large integer. This implies that
\[
\|G_k - G_0\|_\infty \ge \sup_{t\in I_k} |G_k(t) - G_0(t)| \ge M\delta_0/4 > 0,
\]

for every $k>k_0$, which contradicts the fact that $\|G_k - G_0\|_\infty \to 0$ as $k\to\infty$. Combining these two parts together completes the proof of the intermediate result.

Since $\|R_n - R_0\|_\infty \to 0$ a.s. by empirical process theory, Proposition 3.3 entails that $\|\hat R_n - R_0\|_\infty \to 0$ a.s. Consistency of $\hat r_n$ then follows straightforwardly from the above intermediate result.

3.2. Rate of convergence and asymptotic distribution

In the following, we assume that $r_0$ is linear on $(a,b)\subseteq(0,1)$. Moreover, $(a,b)$ is largest in the sense that one cannot find a bigger open interval $(a',b')$ on which $r_0$ remains linear.

Theorem 3.5 (Rate and asymptotic distribution). Under Assumptions (i)–(ii), for any $0<\delta<(b-a)/2$,
\[
\sup_{x\in[a+\delta,\,b-\delta]} \max\bigl( |\hat r_n(x) - r_0(x)|,\; |\hat r_n'(x) - r_0'(x)| \bigr) = O_p(n^{-1/2}).
\]
Moreover,
\[
\sqrt n \begin{pmatrix} \hat r_n(x) - r_0(x) \\[2pt] \hat r_n'(x) - r_0'(x) \end{pmatrix}
\;\Rightarrow\;
\sigma \begin{pmatrix} (b-a)^{-1/2}\, H''\bigl(\tfrac{x-a}{b-a}\bigr) \\[2pt] (b-a)^{-3/2}\, H'''\bigl(\tfrac{x-a}{b-a}\bigr) \end{pmatrix}
\]
in $C[a+\delta, b-\delta] \times D[a+\delta, b-\delta]$, where $H$ is the invelope process defined in the second part of Theorem 2.4.

The proof of Theorem 3.5 is very similar to what has already been shown in Section 2.1, so is omitted for the sake of brevity. In the presence of the linearity of $r_0$ on $(a,b)$, the limit distribution of the process $\sqrt n(\hat r_n - r_0)$ on $(a,b)$ does not depend on $r_0'(a)$ or $r_0'(b)$. In addition, the above theorem continues to hold if we weaken Assumption (ii) to:

(ii′) $\displaystyle\sup_{t\in[0,1]} \Bigl|\frac1n\sum_{i=1}^n 1_{[X_{n,i},1]}(t) - t\Bigr| = o(n^{-1/2})$.

Theoretical results in the random design are also possible, where for instance, we can assume that $\{X_{n,i}, i=1,\dots,n\}$ are IID uniform random variables on $[0,1]$. In this case, Theorem 3.5 is still valid, while a process different from $H$ is required to characterize the limit distribution. This follows from the fact that in the random design $\sqrt n(R_n - R_0)$ can converge to a Gaussian process that is not a Brownian motion.

4. Appendices: proofs

4.1. Appendix I: existence of the limit process

Recall that $f_0(t) = 2(1-t)1_{[0,1]}(t)$, $X(t) = U(F_0(t))$, and $Y(t) = \int_0^t X(s)\,ds$ for $t\ge0$. In this section, we show the existence and uniqueness of the invelope

process $H$. The remaining case can be handled using essentially the same arguments, so is omitted here. Several lemmas are needed to prove Theorem 2.4.

Lemma 4.1. Let the functional $\phi(g)$ be defined as
\[
\phi(g) = \frac12\int_0^1 g(t)^2\,dt - \int_0^1 g(t)\,dX(t)
\]
for functions in the set $\mathcal G_k = \{g\colon[0,1]\to\mathbb R : g \text{ is convex},\ g(0)=g(1)=k\}$. Then with probability one, the problem of minimizing $\phi(g)$ over $\mathcal G_k$ has a unique solution.

Proof of Lemma 4.1. We consider this optimization problem in the metric space $L_2$. First, we show that if it exists, the minimizer must be in the subset
\[
\mathcal G_{k,M} = \Bigl\{g\colon[0,1]\to\mathbb R : g \text{ is convex},\ g(0)=g(1)=k,\ \inf_{[0,1]} g \ge -M\Bigr\}
\]
for some $0<M<\infty$. To verify this, we need the following result:
\[
\sup_{g\in\mathcal G_{1,1}} \Bigl|\int_0^1 g(t)\,dX(t)\Bigr| < \infty, \quad\text{a.s.} \tag{4.1}
\]
Let $W(t)$ be a standard Brownian motion. We note that $\int_0^1 g(t)\,dX(t)$ has the same distribution as $\int_0^1 g(F_0^{-1}(t))\,dW(t) - W(1)\int_0^1 g(t) f_0(t)\,dt$. Using the entropy bound of $\mathcal G_{1,1}$ in $L_2$ (Guntuboyina and Sen, 2013) and Dudley's theorem (cf. Dudley, 1999), we can establish that
\[
\mathcal H = \bigl\{h\colon[0,1]\to\mathbb R : h(t) = g(F_0^{-1}(t)),\ g\in\mathcal G_{1,1}\bigr\}
\]
is a GC-set. As $\int_0^1 g(F_0^{-1}(t))\,dW(t)$ is an isonormal Gaussian process indexed by $\mathcal H$, we have that a.s. $\sup_{g\in\mathcal G_{1,1}} |\int_0^1 g(F_0^{-1}(t))\,dW(t)| < \infty$. Furthermore, it is easy to check that $|W(1)| < \infty$ a.s. and $\sup_{g\in\mathcal G_{1,1}} |\int_0^1 g(t) f_0(t)\,dt| < \infty$. So our claim (4.1) holds.

Now for sufficiently large $M$ with $M>k$,
\[
\sup_{g\in\mathcal G_{k,M}} \Bigl|\int_0^1 g(t)\,dX(t)\Bigr| \le 2M \sup_{g\in\mathcal G_{1,1}} \Bigl|\int_0^1 g(t)\,dX(t)\Bigr|.
\]
Thus, for any $g\in\mathcal G_k$ with $\inf_{[0,1]} g = -M$, $\int_0^1 g(t)\,dX(t)$ is at most $O(M)$. Furthermore, $\frac12\int_0^1 g(t)^2\,dt$ is of order $M^2$. Since $\phi$ at the minimizer could at most

be as large as the value of $\phi$ at any fixed element of $\mathcal G_k$, we conclude that it suffices to only consider functions in $\mathcal G_{k,M}$ for some sufficiently large $M$.

Note that the functional $\phi$ is continuous (cf. Dudley's theorem) and strictly convex. Moreover, for $g_1, g_2\in\mathcal G_{k,M}$, if $\int_0^1 (g_1(t)-g_2(t))^2\,dt = 0$, then $g_1 = g_2$ on $[0,1]$. Since $\mathcal G_{k,M}$ is compact in $L_2$, the existence and uniqueness follow from a standard convex analysis argument in a Hilbert space.

As a remark, it can be seen from the proof of Lemma 4.1 that for a given $\omega\in\Omega$ from the sample space which determines the value of $X(t)$, if the function $\phi$ has a unique minimizer over $\mathcal G_1$ (which happens a.s.), it also admits a unique minimizer over $\mathcal G_k$ for any $k>1$.

Lemma 4.2. Almost surely, $Y(t)$ does not have parabolic tangents at either $t=0$ or $t=1$.

Proof of Lemma 4.2. First, consider the case of $t=0$. Theorem 1 of Lachal (1997) says that
\[
\limsup_{t\to0+} \frac{\int_0^t W(s)\,ds}{\sqrt{(2/3)\,t^3\log\log(1/t)}} = 1, \quad\text{a.s.},
\]
where $W$ is a standard Brownian motion. From this, it follows that
\[
\limsup_{t\to0+} \frac{|Y(t)|}{t^2}
= \limsup_{t\to0+} \frac{\bigl|\int_0^t \{W(F_0(s)) - F_0(s)W(1)\}\,ds\bigr|}{\sqrt{(2/3)\,t^3\log\log(1/t)}} \cdot \frac{\sqrt{(2/3)\,t^3\log\log(1/t)}}{t^2} = \infty, \quad\text{a.s.},
\]
thanks to the scaling properties of Brownian motion, $W(at) \overset{d}{=} \sqrt a\, W(t)$. This implies that $Y(t)$ does not have a parabolic tangent at $t=0$.

Second, consider the case of $t=1$. Note that $\lim_{t\to1} \frac{Y(1)-Y(t)}{1-t} = X(1) = 0$ and
\[
\limsup_{t\to1} \frac{|Y(1)-Y(t)|}{(1-t)^2}
= \limsup_{t\to1} \frac{\bigl| \int_t^1 \{W(1) - W(F_0(s))\}\,ds - W(1)(1-t)^3/3 \bigr|}{(1-t)^2}
= \limsup_{t\to0+} \frac{\bigl|\int_0^t W(s^2)\,ds\bigr|}{t^2}, \quad\text{a.s.},
\]
where $F_0(t) = t(2-t)$. Therefore, to prove that $Y(t)$ does not have a parabolic tangent at $t=1$, it suffices to show that
\[
\limsup_{t\to0+} \frac{\bigl|\int_0^t W(s^2)\,ds\bigr|}{t^2} = \infty, \quad\text{a.s.}
\]
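As an illustrative aside (not part of the paper), the normalizer $(2/3)t^3\log\log(1/t)$ in Lachal's law reflects the exact variance $\operatorname{Var}\bigl(\int_0^t W(s)\,ds\bigr) = t^3/3$, which a short Monte Carlo check recovers; all numerical choices below (grid size, number of paths, seed) are our own.

```python
import numpy as np

# Monte Carlo check that Var( integral_0^t W(s) ds ) = t^3 / 3,
# the variance behind Lachal's normalization for integrated Brownian motion.
rng = np.random.default_rng(42)
t, n_steps, n_paths = 1.0, 400, 20000
dt = t / n_steps
dW = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)  # Brownian increments
W = np.cumsum(dW, axis=1)                                   # paths of W on the grid
Z = W.sum(axis=1) * dt          # Riemann sum for the integral of W over [0, t]
print(Z.var(), t**3 / 3)        # the two values should be close
```

The Riemann-sum bias is $O(1/\text{n\_steps})$ and the Monte Carlo error is about one percent here, so the sample variance lands close to $1/3$.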

Denote $Z(t) = \int_0^t W(s^2)\,ds$. For any $0<t_1<t_2<1$, we argue that the random variable $Z(t_2) - Z(t_1) - (t_2-t_1)W(t_1^2)$ follows the distribution $N\bigl(0,\ t_2^4/6 - t_1^2 t_2^2 + 4t_1^3 t_2/3 - t_1^4/2\bigr)$. This is because
\[
Z(t_2) - Z(t_1) - (t_2-t_1)W(t_1^2)
= \int_{t_1}^{t_2} \{W(s^2) - W(t_1^2)\}\,ds
= \int_{t_1}^{t_2} \int_{t_1^2}^{s^2} dW(u)\,ds
= \int_{t_1^2}^{t_2^2} (t_2 - \sqrt u)\,dW(u),
\]
where we invoked the stochastic Fubini theorem in the last line, and thus,
\[
\operatorname{Var}\bigl(Z(t_2) - Z(t_1) - (t_2-t_1)W(t_1^2)\bigr)
= \int_{t_1^2}^{t_2^2} (t_2 - \sqrt u)^2\,du
= t_2^4/6 - t_1^2 t_2^2 + 4t_1^3 t_2/3 - t_1^4/2.
\]
Now set $t_1 = 1/2$ and $t_{i+1} = t_i^2$ for every $i\in\mathbb N$. It is easy to check that the collection of random variables
\[
\bigl\{ Z(t_{i-1}) - Z(t_i) - (t_{i-1}-t_i)W(t_i^2) \bigr\}_{i=2}^\infty
\]
is mutually independent, so
\[
\limsup_{i\to\infty} \frac{Z(t_{i-1}) - Z(t_i) - (t_{i-1}-t_i)W(t_i^2)}{t_{i-1}^2} = \infty, \quad\text{a.s.}, \tag{4.2}
\]
where we made use of the fact that
\[
\lim_{i\to\infty} \frac{\sqrt{t_{i-1}^4/6 - t_i^2 t_{i-1}^2 + 4t_i^3 t_{i-1}/3 - t_i^4/2}}{t_{i-1}^2} = 1/\sqrt 6.
\]
Assume that there exists some $K>0$ such that $|Z(t)| \le Kt^2$ for all sufficiently small $t>0$. But it follows from (4.2) that a.s. one can find a subsequence of $\mathbb N$, denoted by $\{i_j\}_{j=1}^\infty$, satisfying
\[
\frac{Z(t_{i_j-1}) - Z(t_{i_j}) - (t_{i_j-1}-t_{i_j})W(t_{i_j}^2)}{t_{i_j-1}^2} \ge K+1, \qquad j=1,2,\dots
\]
Consequently,
\[
\frac{Z(t_{i_j-1})}{t_{i_j-1}^2} \ \ge\ K+1 + \frac{Z(t_{i_j})}{t_{i_j-1}^2} + \frac{(t_{i_j-1}-t_{i_j})W(t_{i_j}^2)}{t_{i_j-1}^2}
\]

\[
\ \ge\ K+1 - \frac{K\,t_{i_j}^2}{t_{i_j-1}^2} - \frac{|W(t_{i_j}^2)|}{t_{i_j-1}} \ \to\ K+1,
\]
as $j\to\infty$. The last step is due to $t_{i_j}\to0+$ and $\limsup_{s\to0+} |W(s)|/s^{1/4} = 0$ a.s., which is a direct application of the law of the iterated logarithm. The proof is completed by contradiction.

Now denote by $f_k$ the unique function which minimizes $\phi(g)$ over $\mathcal G_k$. Let $H_k$ be the second-order integral satisfying $H_k(0) = Y(0)$, $H_k(1) = Y(1)$ and $H_k'' = f_k$.

Lemma 4.3. Almost surely, for every $k\in\mathbb N$, $f_k$ and $H_k$ have the following properties:

(i) $H_k(t) \ge Y(t)$ for every $t\in[0,1]$;

(ii) $\int_0^1 \bigl(H_k(t) - Y(t)\bigr)\,df_k'(t) = 0$, where the derivative can be interpreted as either the left or the right derivative;

(iii) $H_k(t) = Y(t)$ and $H_k'(t) = X(t)$ for any $t\in S(f_k)$, where $S(\cdot)$ is the set of knots;

(iv) $\int_0^t f_k(s)\,ds \le X(t) - X(0)$ and $\int_t^1 f_k(s)\,ds \le X(1) - X(t)$ for any $t\in S(f_k)$;

(v) $f_k$ is a continuous function on $[0,1]$.

Proof of Lemma 4.3. To show (i), (ii), (iii) and (iv), one may refer to Lemma 2.2 and Corollary 2.1 of Groeneboom, Jongbloed and Wellner (2001a) and use a similar functional derivative argument. For (v), we note that since $f_k$ is convex, discontinuity can only happen at $t=0$ or $t=1$. In the following, we show that it is impossible at $t=0$. Suppose that $f_k$ is discontinuous at zero. Consider the class of functions $g_\delta(t) = \max(1-t/\delta,\,0)$. Then $f_k(t) + \epsilon\,g_\delta(t)\,1_{(0,1)}(t) + k\,1_{\{0,1\}}(t) \in \mathcal G_k$ for every $\epsilon\in\bigl(0,\ k - \liminf_{t\to0+} f_k(t)\bigr]$. By considering the functional derivative of $\phi(g)$ and using integration by parts, we obtain that for any $\delta>0$,
\[
k\delta/2 \ \ge\ \int_0^1 g_\delta(t) f_k(t)\,dt \ \ge\ \int_0^1 g_\delta(t)\,dX(t) \ =\ Y(\delta)/\delta,
\]
which implies that $k\delta^2/2 \ge Y(\delta)$ for every $\delta>0$. However, this contradicts Lemma 4.2, which says that a.s. $Y$ does not have a parabolic tangent at $t=0$. Consequently, $f_k$ is continuous at $t=0$. The same argument can also be applied to show the continuity of $f_k$ at $t=1$.

Lemma 4.4. Fix any $t_0\in(0,1)$. Denote by $\tau_k^-$ the right-most knot of $f_k$ on $(0,t_0]$. If such a knot does not exist, then set $\tau_k^- = 0$. Similarly, denote by $\tau_k^+$ the left-most knot of $f_k$ on $[t_0,1)$, and set $\tau_k^+ = 1$ if such a knot does not exist. Then, for almost every $\omega\in\Omega$, there exists $K>0$ such that for every $k\ge K$, $\tau_k^- > 0$ and $\tau_k^+ < 1$.

Here we suppressed the dependence of $K$ and $f_k$, as well as $\tau_k^-$ and $\tau_k^+$, on $\omega$ (via $X$) in the notation.
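As a numeric sanity check (ours, not part of the paper) on the variance computed in the proof of Lemma 4.2 — with $Z(t) = \int_0^t W(s^2)\,ds$, the variance of $Z(t_2)-Z(t_1)-(t_2-t_1)W(t_1^2)$ equals $\int_{t_1^2}^{t_2^2}(t_2-\sqrt u)^2\,du$ — the closed form $t_2^4/6 - t_1^2t_2^2 + 4t_1^3t_2/3 - t_1^4/2$ can be confirmed by simple quadrature:

```python
import numpy as np

def closed_form(t1, t2):
    """Closed-form variance from the proof of Lemma 4.2."""
    return t2**4 / 6 - t1**2 * t2**2 + 4 * t1**3 * t2 / 3 - t1**4 / 2

def by_quadrature(t1, t2, n=200001):
    """Trapezoidal approximation of the integral of (t2 - sqrt(u))^2
    over [t1^2, t2^2]."""
    u = np.linspace(t1**2, t2**2, n)
    vals = (t2 - np.sqrt(u)) ** 2
    du = u[1] - u[0]
    return (vals[:-1] + vals[1:]).sum() * du / 2

for t1, t2 in [(0.1, 0.3), (0.25, 0.5), (0.5, 0.9)]:
    print(closed_form(t1, t2), by_quadrature(t1, t2))  # pairs agree
```

At $t_1=0$, $t_2=1$ both sides reduce to $1/6$, matching the limit $1/\sqrt6$ for the normalized standard deviation used below (4.2).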

Proof of Lemma 4.4. Recall that $X(t)$ is said to be sample-bounded on $[0,1]$ if $\sup_{[0,1]}|X(t)|$ is finite for almost all $\omega\in\Omega$. Both $X(t)$ and $Y(t)$ are sample-bounded. Without loss of generality, in the following, we can fix $\omega\in\Omega$ and assume that $\sup_{[0,1]}|X(t)| < \infty$ and $\sup_{[0,1]}|Y(t)| < \infty$.

First, we show the existence of at least one knot on $(0,1)$. Note that the polynomial $P_k$ with $P_k(0) = Y(0) = 0$, $P_k(1) = Y(1)$, $P_k''(0) = P_k''(1) = k$ can be expressed as
\[
P_k(s) = \frac k2\, s(s-1) + Y(1)\,s.
\]
Therefore, take for instance $s = 0.5$ and consider the event $P_k(0.5) \ge Y(0.5)$. This event can be reexpressed as $4Y(1) - 8Y(0.5) \ge k$, which will eventually become false as $k\to\infty$. This is due to the fact that $Y(t)$ is sample-bounded. In view of (i) and (v) of Lemma 4.3, we conclude that $f_k$ has at least one knot in the open interval $(0,1)$ for sufficiently large $k$.

Next, take $k$ large enough so that $f_k$ has one knot in $(0,1)$, which we denote by $\tau_k$. By (iii) of Lemma 4.3, $H_k(\tau_k) = Y(\tau_k)$ and $H_k'(\tau_k) = X(\tau_k)$. Without loss of generality, we may assume that $\tau_k > t_0$. Now the cubic polynomial $P_k$ with $P_k(0) = Y(0) = 0$, $P_k(\tau_k) = Y(\tau_k)$, $P_k''(0) = k$ and $P_k'(\tau_k) = X(\tau_k)$ can be expressed as $P_k(s) = a_{k1}s^3 + a_{k2}s^2 + a_{k3}s$, where
\[
a_{k1} = \frac{X(\tau_k)}{2\tau_k^2} - \frac{Y(\tau_k)}{2\tau_k^3} - \frac{k}{4\tau_k}; \qquad
a_{k2} = \frac k2; \qquad
a_{k3} = \frac{3Y(\tau_k)}{2\tau_k} - \frac{X(\tau_k)}{2} - \frac{k\tau_k}{4}.
\]
By taking, say for example, $s = t_0/2$, it can then be verified that the event $P_k(t_0/2) \ge Y(t_0/2)$, which is equivalent to
\[
\frac{X(\tau_k)\,t_0^3}{16\tau_k^2} - \frac{Y(\tau_k)\,t_0^3}{16\tau_k^3} + \frac{3Y(\tau_k)\,t_0}{4\tau_k} - \frac{X(\tau_k)\,t_0}{4} - \frac{k\,t_0(2\tau_k - t_0)^2}{32\tau_k} \ \ge\ Y(t_0/2),
\]
will eventually stop happening as $k\to\infty$. This is due to the sample-boundedness of both $X(t)$ and $Y(t)$. Consequently, $\tau_k^- > 0$ for sufficiently large $k$. Furthermore, using essentially the same argument, one can also show that $\tau_k^+ < 1$ for large $k$, which completes the proof of this lemma.

Lemma 4.5. For almost every $\omega\in\Omega$ (which determines $X(t)$ and $f_k$), we can find an $M>0$ such that
\[
\inf_{k\in\mathbb N}\ \inf_{[0,1]} f_k > -M.
\]

Proof of Lemma 4.5. Fix any $\delta\in(0,1/2]$. In view of Lemma 4.4, we may assume that there exist knots $\tau_k^-$ and $\tau_k^+$ in $(0,\delta]$ and $[1-\delta,1)$ respectively for sufficiently large $k$. If $\inf_{[0,1]} f_k \ge 0$ for every $k>k_0$, then we are done. Otherwise, we focus on those $k$ with $\inf_{[0,1]} f_k < 0$ and find $0<t_{k,1}<t_{k,2}<1$ with $f_k(t_{k,1}) = f_k(t_{k,2}) = 0$. Note that the existence of $t_{k,1}$ and $t_{k,2}$ is guaranteed by Lemma 4.3(v). In the following, we take $\delta = 1/10$ and consider two scenarios.

(a) $t_{k,2} - t_{k,1} < 2\delta$. Let $a_{k,1}\in\partial f_k(t_{k,1})$ and $a_{k,2}\in\partial f_k(t_{k,2})$, where $\partial$ is the subgradient operator. Then Lemma 4.3(v) implies that $a_{k,1}<0$ and $a_{k,2}>0$. Since both $a_{k,1}(s-t_{k,1})$ and $a_{k,2}(s-t_{k,2})$ can be regarded as supporting hyperplanes of $f_k$ due to its convexity, it follows that $f_k(s) \ge \max\bigl(a_{k,1}(s-t_{k,1}),\ a_{k,2}(s-t_{k,2})\bigr)$. Set $C_k = a_{k,1}a_{k,2}(t_{k,2}-t_{k,1})/(a_{k,2}-a_{k,1})$, i.e. $C_k$ is the value of the above hyperplanes at their intersection, which is negative, so $\inf_{[0,1]} f_k \ge C_k$. Now
\[
2\sup_{t\in[0,1]}|X(t)| \ \ge\ X(\tau_k^+) - X(\tau_k^-) \ =\ H_k'(\tau_k^+) - H_k'(\tau_k^-) \ =\ \int_{\tau_k^-}^{\tau_k^+} f_k(s)\,ds
\ \ge\ \int_\delta^{1-\delta} \max\bigl(a_{k,1}(s-t_{k,1}),\ a_{k,2}(s-t_{k,2})\bigr)\,ds + 2\delta\,C_k,
\]
and a direct computation (using $t_{k,2}-t_{k,1}<2\delta$ with $\delta = 1/10$, so the supporting lines have slopes of magnitude at least $|C_k|/(2\delta)$) shows that the right-hand side is at least a positive constant multiple of $|C_k|$. Consequently, a.s. $\sup_{k\in\mathbb N} |C_k| < \infty$.

(b) $t_{k,2} - t_{k,1} \ge 2\delta$. Now consider $f_{k+} = \max(f_k, 0)$, which also belongs to $\mathcal G_k$. It follows that
\[
0 \ \ge\ \phi(f_k) - \phi(f_{k+}) \ =\ \frac12\int_0^1 \min\bigl(0, f_k(t)\bigr)^2\,dt - \int_0^1 \min\bigl(0, f_k(t)\bigr)\,dX(t).
\]

Let $M_k = -\inf_{[0,1]} f_k$. By the convexity of $f_k$, we see that the first term is no smaller than $\delta M_k^2/3$. For the second term, we can again use Dudley's theorem and the fact that the following class
\[
\mathcal G_{\mathrm{unit}} = \Bigl\{ g\colon[0,1]\to[-1,0] \ \Bigm|\ \exists\, 0\le x_1 < x_2 \le 1 \text{ s.t. } g \text{ is convex on } [x_1,x_2];\ g(s)=0,\ s\in[0,x_1]\cup[x_2,1] \Bigr\}
\]
has entropy of order $\eta^{-1/2}$ in $L_2$ to argue that
\[
\sup_{g\in\mathcal G_{\mathrm{unit}}} \Bigl|\int_{[0,1]} g\,dX(t)\Bigr| < \infty, \quad\text{a.s.}
\]
Therefore, the second term is at most $O(M_k)$. Then we can use the argument in the proof of Lemma 4.1 to establish that a.s. $\limsup_k M_k < \infty$.

Lemma 4.6. For any fixed $t\in(0,1)$ and almost every $\omega\in\Omega$, $\sup_k |f_k(t)|$, $\sup_k |f_k'(t-)|$ and $\sup_k |f_k'(t+)|$ are bounded.

Proof of Lemma 4.6. Let $\Delta = \min(t, 1-t)/2$. In view of Lemma 4.4, for sufficiently large $k$, we can assume that $f_k$ has at least one knot in $(0, t-\Delta]$, and one knot in $[t+\Delta, 1)$. Denote these two points by $\tau_k^-$ and $\tau_k^+$ respectively. By the convexity of $f_k$, there exists some $c\in\mathbb R$ such that $f_k(s) \ge c(s-t) + f_k(t)$ for every $s\in[0,1]$. It follows that
\[
2\sup_{t\in[0,1]}|X(t)| \ \ge\ X(\tau_k^+) - X(\tau_k^-) \ =\ \int_{\tau_k^-}^{\tau_k^+} f_k(s)\,ds
\ =\ \int_{[\tau_k^-,\,t-\Delta]\cup[t+\Delta,\,\tau_k^+]} f_k(s)\,ds + \int_{t-\Delta}^{t+\Delta} \{c(s-t) + f_k(t)\}\,ds
\ \ge\ -M_k + 2\Delta\,f_k(t),
\]
where $M_k = -\inf_{[0,1]} f_k$. By Lemma 4.5, we see that for almost every $\omega\in\Omega$, $\limsup_k f_k(t)$ is bounded. Combining this with the lower bound we established previously entails the boundedness of $\{f_k(t)\}_k$. Next, note that both $\{f_k(t-\Delta)\}_k$ and $\{f_k(t+\Delta)\}_k$ are bounded. The boundedness of $\{f_k'(t-)\}_k$ and $\{f_k'(t+)\}_k$ immediately follows from the convexity of $f_k$ by utilizing
\[
\frac{f_k(t) - f_k(t-\Delta)}{\Delta} \ \le\ f_k'(t-) \ \le\ f_k'(t+) \ \le\ \frac{f_k(t+\Delta) - f_k(t)}{\Delta}.
\]

Lemma 4.7. For almost every $\omega\in\Omega$, both $\bigl\{\sup_{t\in[0,1]}|H_k'(t)|\bigr\}_k$ and $\bigl\{\sup_{t\in[0,1]}|H_k(t)|\bigr\}_k$ are bounded.

Proof of Lemma 4.7. By Lemma 4.3(iv), for any $t\in[0,1]$,
\[
\min\Bigl(0,\ \inf_{k\in\mathbb N}\inf_{[0,1]} f_k\Bigr) \ \le\ \int_0^t f_k(s)\,ds \ \le\ X(1) - X(0) - \min\Bigl(0,\ \inf_{k\in\mathbb N}\inf_{[0,1]} f_k\Bigr).
\]
Consequently, Lemma 4.5 entails the boundedness of $\bigl\{\sup_{t\in[0,1]} |\int_0^t f_k(s)\,ds|\bigr\}_k$. Furthermore, Lemma 4.4 says that one can always find a knot $\tau_k\in(0,1)$ with $X(\tau_k) = H_k'(\tau_k)$ for all sufficiently large $k$. Thus, the boundedness of $\{\sup_{t\in[0,1]} |H_k'(t)|\}_k$ follows from the fact that
\[
\sup_{t\in[0,1]} |H_k'(t)| \ =\ \sup_{t\in[0,1]} \Bigl| X(\tau_k) - \int_t^{\tau_k} f_k(s)\,ds \Bigr|
\ \le\ \sup_{t\in[0,1]} |X(t)| + 2\sup_{t\in[0,1]} \Bigl| \int_0^t f_k(s)\,ds \Bigr|.
\]
Finally, one can derive the sample boundedness of $\{\sup_{t\in[0,1]}|H_k(t)|\}_k$ by using the equality $H_k(t) = Y(0) + \int_0^t H_k'(s)\,ds$.

Lemma 4.8. For almost every $\omega\in\Omega$, both $\{H_k\}_k$ and $\{H_k'\}_k$ are uniformly equicontinuous on $[0,1]$. In fact, they are uniformly Hölder continuous with exponent less than $1/4$.

Proof of Lemma 4.8. Here we only show that the family $\{H_k'\}_k$ is uniformly equicontinuous. Fix any $0<\delta<1$. For any $0\le t_1\le t_2\le 1$ with $t_2 - t_1 < \delta$, by the convexity of $f_k$,
\[
|H_k'(t_2) - H_k'(t_1)| \ =\ \Bigl|\int_{t_1}^{t_2} f_k(s)\,ds\Bigr| \ \le\ \max\Bigl( \int_0^\delta |f_k(s)|\,ds,\ \int_{1-\delta}^1 |f_k(s)|\,ds \Bigr).
\]
In the following, we shall focus on $\int_0^\delta |f_k(s)|\,ds$. The term $\int_{1-\delta}^1 |f_k(s)|\,ds$ can be handled in exactly the same fashion. By Lemma 4.4, $S(f_k)$ is non-empty for every $k>k_0$, where $k_0$ is a sufficiently large positive integer. Furthermore, in view of Lemma 4.5, we can assume that $\inf_{k\in\mathbb N}\inf_{[0,1]} f_k \ge -M$ for some $M>0$. Three scenarios are discussed in the following. Here we fix $\alpha\in(0,1/2)$.

(a) There exists at least one knot $\tau_k\in S(f_k)$ with $\tau_k\in[\delta,\delta^{2\alpha}]$. Then
\[
\int_0^\delta |f_k(s)|\,ds \ \le\ \Bigl|\int_0^{\tau_k} f_k(s)\,ds\Bigr| + M\delta^{2\alpha} \ \le\ |X(\tau_k) - X(0)| + M\delta^{2\alpha} \ \le\ M'\delta^{2\alpha^2} + M\delta^{2\alpha},
\]

for some $M'>0$ which is the $\alpha$-Hölder constant of this particular realization of $X(t)$. The last line follows from Lemma 4.3(iv) and the fact that $X(t)$ is $\alpha$-Hölder continuous, so $|X(\tau_k) - X(0)| \le M'\tau_k^\alpha \le M'(\delta^{2\alpha})^\alpha = M'\delta^{2\alpha^2}$.

(b) $S(f_k)\cap(0,\delta^{2\alpha}] = \emptyset$. Let $\tau_k\in S(f_k)$ be the left-most knot in $[\delta^{2\alpha},1)$. Then $f_k$ is linear on $[0,\tau_k]$ and $f_k(\tau_k) \ge -M$. Note that $|X(\tau_k) - X(0)| \le M'\tau_k^\alpha$, so, since
\[
\int_0^{\tau_k} f_k(s)\,ds = \frac{f_k(0) + f_k(\tau_k)}{2}\,\tau_k \ \ge\ \frac{f_k(0)\,\tau_k}{2} - \frac{M\tau_k}{2},
\]
Lemma 4.3(iv) yields $f_k(0) \le 2M'\tau_k^{\alpha-1} + M$. It now follows that
\[
\int_0^\delta |f_k(s)|\,ds \ \le\ \delta\bigl(2M'\tau_k^{\alpha-1} + M\bigr) + M\delta \ \le\ 2M'\delta^{1-2\alpha+2\alpha^2} + 2M\delta.
\]
We remark that since $f_k(0) = k$, the above conclusion also implies that $S(f_k)\cap(0,\delta^{2\alpha}] \ne \emptyset$ for all sufficiently large $k$.

(c) $S(f_k)\cap[\delta,\delta^{2\alpha}] = \emptyset$, but $S(f_k)\cap(0,\delta) \ne \emptyset$. Let $\tau_k^-$ be the right-most knot in $(0,\delta)$ and $\tau_k^+$ be the left-most knot in $[\delta^{2\alpha},1)$. As a convention, we set $\tau_k^+ = 1$ if such a knot does not exist. Note that $f_k$ is linear on $[\tau_k^-,\tau_k^+]$, so we can use essentially the same argument as in (b) to see that $\int_{\tau_k^-}^\delta |f_k(s)|\,ds \le 2M'\delta^{2\alpha^2} + M\delta^{2\alpha}$. Finally,
\[
\int_0^\delta |f_k(s)|\,ds \ \le\ |X(\tau_k^-) - X(0)| + \int_{\tau_k^-}^\delta |f_k(s)|\,ds \ \le\ M'\delta^{2\alpha^2} + 2M'\delta^{2\alpha^2} + M\delta^{2\alpha} \ =\ 3M'\delta^{2\alpha^2} + M\delta^{2\alpha}.
\]
To finish the proof, we shall apply Lemma 4.5 to verify that $|H_k(t_2) - H_k(t_1)| \lesssim \delta$. In addition, combining results from all the scenarios listed above, we see that $\{H_k'\}_k$ is uniformly Hölder continuous with exponent $2\alpha^2 < 1/4$? (exponent strictly below $1/4$).

Proof of Theorem 2.4. For every $m\in\mathbb N$ with $m\ge3$, define the following norms:
\[
\|H\|_m = \sup_{t\in[0,1]} |H(t)| + \sup_{t\in[0,1]} |H'(t)| + \sup_{t\in[1/m,\,1-1/m]} |H''(t)|.
\]
First, we show the existence of such a function for almost all $\omega\in\Omega$ by construction. Fix $\omega$ (thus we focus on a particular realization of $X(t)$, but suppress its dependence on $\omega$ in the notation). Let $H_k$ be the function satisfying $H_k'' = f_k$, $H_k(0) = Y(0)$ and $H_k(1) = Y(1)$. We claim that the sequence $H_k$ admits a convergent subsequence in the topology induced by the norms $\|\cdot\|_m$.


More information

Nonparametric estimation under shape constraints, part 2

Nonparametric estimation under shape constraints, part 2 Nonparametric estimation under shape constraints, part 2 Piet Groeneboom, Delft University August 7, 2013 What to expect? Theory and open problems for interval censoring, case 2. Same for the bivariate

More information

Math 321 Final Examination April 1995 Notation used in this exam: N. (1) S N (f,x) = f(t)e int dt e inx.

Math 321 Final Examination April 1995 Notation used in this exam: N. (1) S N (f,x) = f(t)e int dt e inx. Math 321 Final Examination April 1995 Notation used in this exam: N 1 π (1) S N (f,x) = f(t)e int dt e inx. 2π n= N π (2) C(X, R) is the space of bounded real-valued functions on the metric space X, equipped

More information

arxiv: v1 [math.ca] 28 Jul 2011

arxiv: v1 [math.ca] 28 Jul 2011 arxiv:1107.5614v1 [math.ca] 28 Jul 2011 Illumination by Tangent Lines Alan Horwitz 25 Yearsley Mill Rd. Penn State Brandywine Media, PA 19063 USA alh4@psu.edu 7/27/11 Abstract Let f be a differentiable

More information

CHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS. W. Erwin Diewert January 31, 2008.

CHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS. W. Erwin Diewert January 31, 2008. 1 ECONOMICS 594: LECTURE NOTES CHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS W. Erwin Diewert January 31, 2008. 1. Introduction Many economic problems have the following structure: (i) a linear function

More information

with Current Status Data

with Current Status Data Estimation and Testing with Current Status Data Jon A. Wellner University of Washington Estimation and Testing p. 1/4 joint work with Moulinath Banerjee, University of Michigan Talk at Université Paul

More information

ASYMPTOTICALLY NONEXPANSIVE MAPPINGS IN MODULAR FUNCTION SPACES ABSTRACT

ASYMPTOTICALLY NONEXPANSIVE MAPPINGS IN MODULAR FUNCTION SPACES ABSTRACT ASYMPTOTICALLY NONEXPANSIVE MAPPINGS IN MODULAR FUNCTION SPACES T. DOMINGUEZ-BENAVIDES, M.A. KHAMSI AND S. SAMADI ABSTRACT In this paper, we prove that if ρ is a convex, σ-finite modular function satisfying

More information

OPTIMISATION CHALLENGES IN MODERN STATISTICS. Co-authors: Y. Chen, M. Cule, R. Gramacy, M. Yuan

OPTIMISATION CHALLENGES IN MODERN STATISTICS. Co-authors: Y. Chen, M. Cule, R. Gramacy, M. Yuan OPTIMISATION CHALLENGES IN MODERN STATISTICS Co-authors: Y. Chen, M. Cule, R. Gramacy, M. Yuan How do optimisation problems arise in Statistics? Let X 1,...,X n be independent and identically distributed

More information

Generalized metric properties of spheres and renorming of Banach spaces

Generalized metric properties of spheres and renorming of Banach spaces arxiv:1605.08175v2 [math.fa] 5 Nov 2018 Generalized metric properties of spheres and renorming of Banach spaces 1 Introduction S. Ferrari, J. Orihuela, M. Raja November 6, 2018 Throughout this paper X

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics

Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Moulinath Banerjee 1 and Jon A. Wellner 2 1 Department of Statistics, Department of Statistics, 439, West

More information

2 Sequences, Continuity, and Limits

2 Sequences, Continuity, and Limits 2 Sequences, Continuity, and Limits In this chapter, we introduce the fundamental notions of continuity and limit of a real-valued function of two variables. As in ACICARA, the definitions as well as proofs

More information

Functional Analysis I

Functional Analysis I Functional Analysis I Course Notes by Stefan Richter Transcribed and Annotated by Gregory Zitelli Polar Decomposition Definition. An operator W B(H) is called a partial isometry if W x = X for all x (ker

More information

7 Complete metric spaces and function spaces

7 Complete metric spaces and function spaces 7 Complete metric spaces and function spaces 7.1 Completeness Let (X, d) be a metric space. Definition 7.1. A sequence (x n ) n N in X is a Cauchy sequence if for any ɛ > 0, there is N N such that n, m

More information

STAT Sample Problem: General Asymptotic Results

STAT Sample Problem: General Asymptotic Results STAT331 1-Sample Problem: General Asymptotic Results In this unit we will consider the 1-sample problem and prove the consistency and asymptotic normality of the Nelson-Aalen estimator of the cumulative

More information

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3 Brownian Motion Contents 1 Definition 2 1.1 Brownian Motion................................. 2 1.2 Wiener measure.................................. 3 2 Construction 4 2.1 Gaussian process.................................

More information

1 Weak Convergence in R k

1 Weak Convergence in R k 1 Weak Convergence in R k Byeong U. Park 1 Let X and X n, n 1, be random vectors taking values in R k. These random vectors are allowed to be defined on different probability spaces. Below, for the simplicity

More information

A Simpler and Tighter Redundant Klee-Minty Construction

A Simpler and Tighter Redundant Klee-Minty Construction A Simpler and Tighter Redundant Klee-Minty Construction Eissa Nematollahi Tamás Terlaky October 19, 2006 Abstract By introducing redundant Klee-Minty examples, we have previously shown that the central

More information

September Math Course: First Order Derivative

September Math Course: First Order Derivative September Math Course: First Order Derivative Arina Nikandrova Functions Function y = f (x), where x is either be a scalar or a vector of several variables (x,..., x n ), can be thought of as a rule which

More information

Integration on Measure Spaces

Integration on Measure Spaces Chapter 3 Integration on Measure Spaces In this chapter we introduce the general notion of a measure on a space X, define the class of measurable functions, and define the integral, first on a class of

More information

THE INVERSE FUNCTION THEOREM

THE INVERSE FUNCTION THEOREM THE INVERSE FUNCTION THEOREM W. PATRICK HOOPER The implicit function theorem is the following result: Theorem 1. Let f be a C 1 function from a neighborhood of a point a R n into R n. Suppose A = Df(a)

More information

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539 Brownian motion Samy Tindel Purdue University Probability Theory 2 - MA 539 Mostly taken from Brownian Motion and Stochastic Calculus by I. Karatzas and S. Shreve Samy T. Brownian motion Probability Theory

More information

Some Background Math Notes on Limsups, Sets, and Convexity

Some Background Math Notes on Limsups, Sets, and Convexity EE599 STOCHASTIC NETWORK OPTIMIZATION, MICHAEL J. NEELY, FALL 2008 1 Some Background Math Notes on Limsups, Sets, and Convexity I. LIMITS Let f(t) be a real valued function of time. Suppose f(t) converges

More information

Topological properties

Topological properties CHAPTER 4 Topological properties 1. Connectedness Definitions and examples Basic properties Connected components Connected versus path connected, again 2. Compactness Definition and first examples Topological

More information

arxiv:math/ v2 [math.st] 17 Jun 2008

arxiv:math/ v2 [math.st] 17 Jun 2008 The Annals of Statistics 2008, Vol. 36, No. 3, 1064 1089 DOI: 10.1214/009053607000000983 c Institute of Mathematical Statistics, 2008 arxiv:math/0609021v2 [math.st] 17 Jun 2008 CURRENT STATUS DATA WITH

More information

Asymptotics for posterior hazards

Asymptotics for posterior hazards Asymptotics for posterior hazards Pierpaolo De Blasi University of Turin 10th August 2007, BNR Workshop, Isaac Newton Intitute, Cambridge, UK Joint work with Giovanni Peccati (Université Paris VI) and

More information

Notes on uniform convergence

Notes on uniform convergence Notes on uniform convergence Erik Wahlén erik.wahlen@math.lu.se January 17, 2012 1 Numerical sequences We begin by recalling some properties of numerical sequences. By a numerical sequence we simply mean

More information

Problem 3. Give an example of a sequence of continuous functions on a compact domain converging pointwise but not uniformly to a continuous function

Problem 3. Give an example of a sequence of continuous functions on a compact domain converging pointwise but not uniformly to a continuous function Problem 3. Give an example of a sequence of continuous functions on a compact domain converging pointwise but not uniformly to a continuous function Solution. If we does not need the pointwise limit of

More information

RATE-OPTIMAL GRAPHON ESTIMATION. By Chao Gao, Yu Lu and Harrison H. Zhou Yale University

RATE-OPTIMAL GRAPHON ESTIMATION. By Chao Gao, Yu Lu and Harrison H. Zhou Yale University Submitted to the Annals of Statistics arxiv: arxiv:0000.0000 RATE-OPTIMAL GRAPHON ESTIMATION By Chao Gao, Yu Lu and Harrison H. Zhou Yale University Network analysis is becoming one of the most active

More information

NONPARAMETRIC CONFIDENCE INTERVALS FOR MONOTONE FUNCTIONS. By Piet Groeneboom and Geurt Jongbloed Delft University of Technology

NONPARAMETRIC CONFIDENCE INTERVALS FOR MONOTONE FUNCTIONS. By Piet Groeneboom and Geurt Jongbloed Delft University of Technology NONPARAMETRIC CONFIDENCE INTERVALS FOR MONOTONE FUNCTIONS By Piet Groeneboom and Geurt Jongbloed Delft University of Technology We study nonparametric isotonic confidence intervals for monotone functions.

More information

Lecture 19 L 2 -Stochastic integration

Lecture 19 L 2 -Stochastic integration Lecture 19: L 2 -Stochastic integration 1 of 12 Course: Theory of Probability II Term: Spring 215 Instructor: Gordan Zitkovic Lecture 19 L 2 -Stochastic integration The stochastic integral for processes

More information

C.7. Numerical series. Pag. 147 Proof of the converging criteria for series. Theorem 5.29 (Comparison test) Let a k and b k be positive-term series

C.7. Numerical series. Pag. 147 Proof of the converging criteria for series. Theorem 5.29 (Comparison test) Let a k and b k be positive-term series C.7 Numerical series Pag. 147 Proof of the converging criteria for series Theorem 5.29 (Comparison test) Let and be positive-term series such that 0, for any k 0. i) If the series converges, then also

More information

Supplementary Notes for W. Rudin: Principles of Mathematical Analysis

Supplementary Notes for W. Rudin: Principles of Mathematical Analysis Supplementary Notes for W. Rudin: Principles of Mathematical Analysis SIGURDUR HELGASON In 8.00B it is customary to cover Chapters 7 in Rudin s book. Experience shows that this requires careful planning

More information

Regularity of the density for the stochastic heat equation

Regularity of the density for the stochastic heat equation Regularity of the density for the stochastic heat equation Carl Mueller 1 Department of Mathematics University of Rochester Rochester, NY 15627 USA email: cmlr@math.rochester.edu David Nualart 2 Department

More information

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces. Math 350 Fall 2011 Notes about inner product spaces In this notes we state and prove some important properties of inner product spaces. First, recall the dot product on R n : if x, y R n, say x = (x 1,...,

More information

Measure Theory on Topological Spaces. Course: Prof. Tony Dorlas 2010 Typset: Cathal Ormond

Measure Theory on Topological Spaces. Course: Prof. Tony Dorlas 2010 Typset: Cathal Ormond Measure Theory on Topological Spaces Course: Prof. Tony Dorlas 2010 Typset: Cathal Ormond May 22, 2011 Contents 1 Introduction 2 1.1 The Riemann Integral........................................ 2 1.2 Measurable..............................................

More information

1.2 Fundamental Theorems of Functional Analysis

1.2 Fundamental Theorems of Functional Analysis 1.2 Fundamental Theorems of Functional Analysis 15 Indeed, h = ω ψ ωdx is continuous compactly supported with R hdx = 0 R and thus it has a unique compactly supported primitive. Hence fφ dx = f(ω ψ ωdy)dx

More information

An elementary proof of the weak convergence of empirical processes

An elementary proof of the weak convergence of empirical processes An elementary proof of the weak convergence of empirical processes Dragan Radulović Department of Mathematics, Florida Atlantic University Marten Wegkamp Department of Mathematics & Department of Statistical

More information

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure? MA 645-4A (Real Analysis), Dr. Chernov Homework assignment 1 (Due ). Show that the open disk x 2 + y 2 < 1 is a countable union of planar elementary sets. Show that the closed disk x 2 + y 2 1 is a countable

More information

On Kusuoka Representation of Law Invariant Risk Measures

On Kusuoka Representation of Law Invariant Risk Measures MATHEMATICS OF OPERATIONS RESEARCH Vol. 38, No. 1, February 213, pp. 142 152 ISSN 364-765X (print) ISSN 1526-5471 (online) http://dx.doi.org/1.1287/moor.112.563 213 INFORMS On Kusuoka Representation of

More information

Introduction to Topology

Introduction to Topology Introduction to Topology Randall R. Holmes Auburn University Typeset by AMS-TEX Chapter 1. Metric Spaces 1. Definition and Examples. As the course progresses we will need to review some basic notions about

More information

We are going to discuss what it means for a sequence to converge in three stages: First, we define what it means for a sequence to converge to zero

We are going to discuss what it means for a sequence to converge in three stages: First, we define what it means for a sequence to converge to zero Chapter Limits of Sequences Calculus Student: lim s n = 0 means the s n are getting closer and closer to zero but never gets there. Instructor: ARGHHHHH! Exercise. Think of a better response for the instructor.

More information

Analogy Principle. Asymptotic Theory Part II. James J. Heckman University of Chicago. Econ 312 This draft, April 5, 2006

Analogy Principle. Asymptotic Theory Part II. James J. Heckman University of Chicago. Econ 312 This draft, April 5, 2006 Analogy Principle Asymptotic Theory Part II James J. Heckman University of Chicago Econ 312 This draft, April 5, 2006 Consider four methods: 1. Maximum Likelihood Estimation (MLE) 2. (Nonlinear) Least

More information

Maximum likelihood: counterexamples, examples, and open problems. Jon A. Wellner. University of Washington. Maximum likelihood: p.

Maximum likelihood: counterexamples, examples, and open problems. Jon A. Wellner. University of Washington. Maximum likelihood: p. Maximum likelihood: counterexamples, examples, and open problems Jon A. Wellner University of Washington Maximum likelihood: p. 1/7 Talk at University of Idaho, Department of Mathematics, September 15,

More information

A note on the asymptotic distribution of Berk-Jones type statistics under the null hypothesis

A note on the asymptotic distribution of Berk-Jones type statistics under the null hypothesis A note on the asymptotic distribution of Berk-Jones type statistics under the null hypothesis Jon A. Wellner and Vladimir Koltchinskii Abstract. Proofs are given of the limiting null distributions of the

More information

Real Analysis Problems

Real Analysis Problems Real Analysis Problems Cristian E. Gutiérrez September 14, 29 1 1 CONTINUITY 1 Continuity Problem 1.1 Let r n be the sequence of rational numbers and Prove that f(x) = 1. f is continuous on the irrationals.

More information

Analysis Finite and Infinite Sets The Real Numbers The Cantor Set

Analysis Finite and Infinite Sets The Real Numbers The Cantor Set Analysis Finite and Infinite Sets Definition. An initial segment is {n N n n 0 }. Definition. A finite set can be put into one-to-one correspondence with an initial segment. The empty set is also considered

More information

GENERALIZED CONVEXITY AND OPTIMALITY CONDITIONS IN SCALAR AND VECTOR OPTIMIZATION

GENERALIZED CONVEXITY AND OPTIMALITY CONDITIONS IN SCALAR AND VECTOR OPTIMIZATION Chapter 4 GENERALIZED CONVEXITY AND OPTIMALITY CONDITIONS IN SCALAR AND VECTOR OPTIMIZATION Alberto Cambini Department of Statistics and Applied Mathematics University of Pisa, Via Cosmo Ridolfi 10 56124

More information

The Dirichlet s P rinciple. In this lecture we discuss an alternative formulation of the Dirichlet problem for the Laplace equation:

The Dirichlet s P rinciple. In this lecture we discuss an alternative formulation of the Dirichlet problem for the Laplace equation: Oct. 1 The Dirichlet s P rinciple In this lecture we discuss an alternative formulation of the Dirichlet problem for the Laplace equation: 1. Dirichlet s Principle. u = in, u = g on. ( 1 ) If we multiply

More information

Pogorelov Klingenberg theorem for manifolds homeomorphic to R n (translated from [4])

Pogorelov Klingenberg theorem for manifolds homeomorphic to R n (translated from [4]) Pogorelov Klingenberg theorem for manifolds homeomorphic to R n (translated from [4]) Vladimir Sharafutdinov January 2006, Seattle 1 Introduction The paper is devoted to the proof of the following Theorem

More information

Convergence to Common Fixed Point for Two Asymptotically Quasi-nonexpansive Mappings in the Intermediate Sense in Banach Spaces

Convergence to Common Fixed Point for Two Asymptotically Quasi-nonexpansive Mappings in the Intermediate Sense in Banach Spaces Mathematica Moravica Vol. 19-1 2015, 33 48 Convergence to Common Fixed Point for Two Asymptotically Quasi-nonexpansive Mappings in the Intermediate Sense in Banach Spaces Gurucharan Singh Saluja Abstract.

More information

Harmonic Analysis on the Cube and Parseval s Identity

Harmonic Analysis on the Cube and Parseval s Identity Lecture 3 Harmonic Analysis on the Cube and Parseval s Identity Jan 28, 2005 Lecturer: Nati Linial Notes: Pete Couperus and Neva Cherniavsky 3. Where we can use this During the past weeks, we developed

More information

Applications of Ito s Formula

Applications of Ito s Formula CHAPTER 4 Applications of Ito s Formula In this chapter, we discuss several basic theorems in stochastic analysis. Their proofs are good examples of applications of Itô s formula. 1. Lévy s martingale

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 13: Entropy Calculations

Introduction to Empirical Processes and Semiparametric Inference Lecture 13: Entropy Calculations Introduction to Empirical Processes and Semiparametric Inference Lecture 13: Entropy Calculations Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research

More information

AW -Convergence and Well-Posedness of Non Convex Functions

AW -Convergence and Well-Posedness of Non Convex Functions Journal of Convex Analysis Volume 10 (2003), No. 2, 351 364 AW -Convergence Well-Posedness of Non Convex Functions Silvia Villa DIMA, Università di Genova, Via Dodecaneso 35, 16146 Genova, Italy villa@dima.unige.it

More information