Statistica Sinica: Supplement

NONPARAMETRIC MODEL CHECKS OF SINGLE-INDEX ASSUMPTIONS

Samuel Maistre 1,2,3 and Valentin Patilea 1

1 CREST (Ensai), 2 Université de Lyon, 3 Université de Strasbourg

Supplementary Material

This supplementary material contains additional proofs, details and technical lemmas.

S1 Additional proofs and details

S1.1 Proof of Lemma A.1

Lemma A.1. Let $(U_1, Z_1, W_1)$ and $(U_2, Z_2, W_2)$ be two independent draws of $(U, Z, W)$. Let $K(\cdot)$ be a bounded, even, integrable function with positive, integrable Fourier transform. Assume $E(\|U w(Z)\|_H^2) < \infty$. Then, for any $h > 0$,

$$E[U \mid Z, W] = 0 \ \text{a.s.} \iff I(h) = 0.$$
Moreover, if $P(E[U \mid Z, W] = 0) < 1$, then $\inf_{h \in (0,1]} I(h) > 0$.

Proof of Lemma A.1. Using the Inverse Fourier Transform and the independence of $(U_1, Z_1, W_1)$ and $(U_2, Z_2, W_2)$,

$$\begin{aligned}
I(h) &= E\left[\langle U_1, U_2\rangle_H\, w(Z_1) w(Z_2)\, h^{-q} K((Z_1 - Z_2)/h)\, \varphi(W_1 - W_2)\right] \\
&= E\left[\langle U_1, U_2\rangle_H\, w(Z_1) w(Z_2) \int_{\mathbb{R}^q} e^{2\pi i v^\top (Z_1 - Z_2)}\, \mathcal{F}[K](vh)\, dv \int_{\mathbb{R}^r} e^{2\pi i s^\top (W_1 - W_2)}\, \mathcal{F}[\varphi](s)\, ds\right] \\
&= \int_{\mathbb{R}^q}\int_{\mathbb{R}^r} \left\| E\left[ E[U \mid Z, W]\, w(Z)\, e^{2\pi i \{v^\top Z + s^\top W\}} \right] \right\|_H^2 \mathcal{F}[K](vh)\, \mathcal{F}[\varphi](s)\, dv\, ds.
\end{aligned}$$

Clearly, for any $h > 0$, $I(h) = 0$ whenever $E[U \mid Z, W] = 0$ a.s. For the reverse implication, since $\mathcal{F}[\varphi], \mathcal{F}[K] > 0$ and $w(\cdot) > 0$, for any $h > 0$, $I(h) = 0$ implies

$$E\left[E[U \mid Z, W]\, w(Z)\, e^{2\pi i \{v^\top Z + s^\top W\}}\right] = 0, \qquad \forall v \in \mathbb{R}^q,\ s \in \mathbb{R}^r.$$

Then necessarily $E[U \mid Z, W]\, w(Z) = 0$ a.s., and thus $E[U \mid Z, W] = 0$ a.s.

In the case $P(E[U \mid Z, W] = 0) < 1$, by the Lebesgue Dominated Convergence Theorem, the map $h \mapsto I(h)$ is continuous on $(0, 1]$ and

$$\lim_{h \to 0} I(h) = \mathcal{F}[K](0) \int_{\mathbb{R}^q}\int_{\mathbb{R}^r} \left\| E\left[E[U \mid Z, W]\, w(Z)\, e^{2\pi i \{v^\top Z + s^\top W\}}\right]\right\|_H^2 \mathcal{F}[\varphi](s)\, dv\, ds.$$

Since the nonnegative-valued map $(v, s) \mapsto \| E[E[U \mid Z, W]\, w(Z)\, e^{2\pi i \{v^\top Z + s^\top W\}}]\|_H^2$ is continuous, and not identically equal to $0$ whenever $E[U \mid Z, W] \neq 0$, and $\mathcal{F}[K](0), \mathcal{F}[\varphi](\cdot) > 0$, $\lim_{h \to 0} I(h)$ is necessarily positive and $I(h)$ is bounded away from zero on the interval $(0, 1]$.

S1.2 Some details on equation (4.4)

Consider a sequence of alternatives

$$Y = m(Z(\beta_0)) + r_n\, \delta(Z(\beta_0), W(\beta_0)) + \varepsilon, \qquad n \geq 1,$$

with $E(\varepsilon \mid X) = 0$ a.s. and $\delta(\cdot)$ satisfying the conditions (4.3). In the following, we show that, for each $n$, $\beta_0$ is a solution of the equation

$$\frac{\partial}{\partial\beta}\, E\left[\{Y - r_\beta(Z(\beta))\}^2\right] = 0.$$
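For intuition, the characterization in Lemma A.1 can be illustrated through the sample analogue of $I(h)$. The sketch below is purely illustrative and relies on assumptions not made in the lemma itself: a Gaussian kernel $K$, the weight $\varphi(t) = e^{-t^2}$ (both with positive Fourier transforms), $w \equiv 1$, and scalar $U$, $Z$, $W$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, h = 1000, 0.5

def I_n(U, Z, W, h):
    # Sample analogue of I(h) = E[U1 U2 h^{-1} K((Z1-Z2)/h) phi(W1-W2)],
    # with Gaussian K and phi(t) = exp(-t^2).
    dZ = (Z[:, None] - Z[None, :]) / h
    dW = W[:, None] - W[None, :]
    Kh = np.exp(-dZ**2 / 2) / (h * np.sqrt(2 * np.pi))
    phi = np.exp(-dW**2)
    G = np.outer(U, U) * Kh * phi
    np.fill_diagonal(G, 0.0)      # U-statistic form: drop the i = j terms
    return G.sum() / (n * (n - 1))

Z, W = rng.normal(size=n), rng.normal(size=n)
eps = rng.normal(size=n)
I_null = I_n(eps, Z, W, h)        # E[U | Z, W] = 0 a.s.
I_alt = I_n(eps + W, Z, W, h)     # E[U | Z, W] = W, a deviation from the null

print(I_null, I_alt)
```

Under the null the statistic fluctuates around zero, while under the deviation it stabilizes at a strictly positive value, in line with the second part of the lemma.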
Under suitable conditions on the second-order derivative with respect to $\beta$, this justifies equation (4.4). For this purpose, we want to differentiate $r_\beta(Z(\beta))$. Let $Y^0 = m(Z(\beta_0)) + \varepsilon$ and define

$$r^0_\beta(Z(\beta)) = E[Y^0 \mid Z(\beta)] = E[m(Z(\beta_0)) \mid Z(\beta)].$$

Moreover, let $\delta_\beta(Z(\beta)) = E[\delta(Z(\beta_0), W(\beta_0)) \mid Z(\beta)]$, and notice that

$$r_\beta(Z(\beta)) = r^0_\beta(Z(\beta)) + r_n\, \delta_\beta(Z(\beta)),$$

and that, by the first condition in equation (4.3), $\delta_{\beta_0}(Z(\beta_0)) = 0$. Next, by standard results from single-index regression models applied to the response $Y^0$ (see, for instance, Horowitz (2009), chapter 2), one has

$$\left.\frac{\partial}{\partial\beta}\, r^0_\beta(Z(\beta))\right|_{\beta=\beta_0} = m'(Z(\beta_0))\{X - E[X \mid Z(\beta_0)]\}.$$

On the other hand, by the standard variance decomposition formula,

$$E\left[\{\delta(Z(\beta_0), W(\beta_0)) - \delta_\beta(Z(\beta))\}^2\right] \leq E\left[\delta^2(Z(\beta_0), W(\beta_0))\right] = E\left[\{\delta(Z(\beta_0), W(\beta_0)) - \delta_{\beta_0}(Z(\beta_0))\}^2\right],$$

and thus one can deduce that

$$E\left[\delta(Z(\beta_0), W(\beta_0))\left.\frac{\partial}{\partial\beta}\,\delta_\beta(Z(\beta))\right|_{\beta=\beta_0}\right] = 0. \tag{S1.1}$$

Using this identity and the second condition in equation (4.3), under suitable technical conditions, for any $n$, one can deduce

$$\begin{aligned}
\left.\frac{\partial}{\partial\beta}\, E\left[\{Y - r_\beta(Z(\beta))\}^2\right]\right|_{\beta=\beta_0}
&= -2\, E\left[\{Y - r_{\beta_0}(Z(\beta_0))\}\left.\frac{\partial}{\partial\beta}\, r_\beta(Z(\beta))\right|_{\beta=\beta_0}\right] \\
&= -2\, E\left[\{r_n\,\delta(Z(\beta_0), W(\beta_0)) + \varepsilon\}\left\{\left.\frac{\partial}{\partial\beta}\, r^0_\beta(Z(\beta))\right|_{\beta=\beta_0} + r_n\left.\frac{\partial}{\partial\beta}\,\delta_\beta(Z(\beta))\right|_{\beta=\beta_0}\right\}\right] \\
&= -2\, r_n\, E\left[\delta(Z(\beta_0), W(\beta_0))\, m'(Z(\beta_0))\{X - E[X \mid Z(\beta_0)]\}\right] \\
&\quad - 2\, r_n^2\, E\left[\delta(Z(\beta_0), W(\beta_0))\left.\frac{\partial}{\partial\beta}\,\delta_\beta(Z(\beta))\right|_{\beta=\beta_0}\right] = 0.
\end{aligned}$$

Let us end this part with a comment on the condition (4.3). In general, in the case of a sequence of alternatives, by standard tools for deriving asymptotic results, one can prove $\hat\beta - \bar\beta = O_P(n^{-1/2})$ for some $\bar\beta \in \mathcal{B}$ that could depend on $n$. The value $\bar\beta$ tends to $\beta_0$ at a rate depending on the rate at which the alternative hypotheses approach the null hypothesis. Following a common choice in the literature, herein we want to simplify the presentation and focus on the case where $\bar\beta$ does not depend on $n$. For this purpose, we impose
some orthogonality conditions on the function $\delta(\cdot)$ such that $\bar\beta = \beta_0$ when the index is estimated by semiparametric least squares. See, for instance, the second condition of equation (3.11) in Guerre and Lavergne (2005) for a similar condition.

S1.3 Some details on the proof of Proposition 1

Herein, we provide detailed justifications of the claim that the norm of any column of $A(\beta) - A(\tilde\beta)$ is bounded by $c\|\beta - \tilde\beta\|$. Firstly, recall that $\mathcal{B} \subset \{1\} \times \mathbb{R}^{p-1}$ or $\mathcal{B} \subset \{\gamma_1^{-1}\gamma : \gamma = (\gamma_1, \ldots, \gamma_p)^\top \in \mathbb{R}^p, \gamma_1 > 0\}$. Then the norm of every $\beta$ from the parameter space $\mathcal{B}$ is larger than or equal to 1. Moreover, if $\tilde\beta \in \mathcal{B}$ and $B(\tilde\beta, r) \subset \mathbb{R}^p$ is the ball centered at $\tilde\beta$ of radius $0 < r < 1/2$, then

$$\inf_{\beta \in B(\tilde\beta, r)} \frac{\langle \beta, \tilde\beta\rangle}{\|\beta\|\,\|\tilde\beta\|} > 0. \tag{S1.2}$$

Indeed, for any $\beta \in B(\tilde\beta, r)$, $\|\beta\| \geq \|\tilde\beta\| - r$ and thus $\|\beta\|/\|\tilde\beta\| \geq 1 - r$. Then, since for any $\beta \in B(\tilde\beta, r)$, $\|\beta - \tilde\beta\|^2 \leq r^2$, we have

$$\langle\beta, \tilde\beta\rangle = \frac12\left\{\|\beta\|^2 + \|\tilde\beta\|^2 - \|\beta - \tilde\beta\|^2\right\} \geq \frac{\|\tilde\beta\|^2}{2}.$$

Finally, divide by $\|\beta\|\,\|\tilde\beta\|$ to derive the inequality (S1.2).

Now, consider $\{v_1, \ldots, v_{p-1}\}$, a basis of the orthogonal subspace $\{\tilde\beta\}^\perp$. Then, for any $\beta \in B(\tilde\beta, r)$, the set of vectors $\{v_1, \ldots, v_{p-1}\} \cup \{\beta\}$ is a basis of $\mathbb{R}^p$. Indeed, equation (S1.2) implies that no $\beta \in B(\tilde\beta, r)$ could be spanned by $\{v_1, \ldots, v_{p-1}\}$. Finally, if $A(\beta)$ is built by the Gram–Schmidt process applied to $\{v_1, \ldots, v_{p-1}\} \cup \{\beta\}$, as described at the beginning of the proof of Proposition 1, and the vectors $\{v_1, \ldots, v_{p-1}\}$ are orthogonal, then the norm of any column of $A(\beta) - A(\tilde\beta)$ is bounded by $c\|\beta - \tilde\beta\|$ for some $c$ depending only on the initial $p-1$ independent vectors. The fact that $\{v_1, \ldots, v_{p-1}\}$ are orthogonal is not a real constraint since, starting from an arbitrary basis of $\{\tilde\beta\}^\perp$, one could first apply an orthogonalization process to that basis.

We consider the case $p = 3$; the case of larger $p$ can be derived similarly. Let $\beta \in B(\tilde\beta, 1/2)$ and consider $v, w$ two linearly independent vectors from the space $\{\tilde\beta\}^\perp$. The Gram–Schmidt process transforms the
basis $\{\beta, v, w\}$ as follows:

$$u_1 = \beta, \quad e_1 = e_1(\beta) = \frac{u_1}{\|u_1\|}; \qquad u_2 = v - \langle v, e_1\rangle e_1, \quad e_2 = e_2(\beta) = \frac{u_2}{\|u_2\|};$$

$$u_3 = w - \langle w, e_1\rangle e_1 - \langle w, e_2\rangle e_2, \quad e_3 = e_3(\beta) = \frac{u_3}{\|u_3\|}.$$

Since the matrix $A(\beta)$ is built with the columns $e_2(\beta)$ and $e_3(\beta)$, it remains to check the Lipschitz condition for $e_2(\beta)$ and $e_3(\beta)$ as functions of $\beta$. First, note that

$$\|e_1(\beta) - e_1(\tilde\beta)\| = \left\|\frac{\beta}{\|\beta\|} - \frac{\tilde\beta}{\|\tilde\beta\|}\right\| \leq \frac{1}{\|\beta\|}\|\beta - \tilde\beta\| + \left|\frac{1}{\|\beta\|} - \frac{1}{\|\tilde\beta\|}\right| \|\tilde\beta\| \leq c_1 \|\beta - \tilde\beta\|,$$

with $c_1 = 2/\|\tilde\beta\|$. Next, since $v \in \{\tilde\beta\}^\perp = \{e_1(\tilde\beta)\}^\perp$,

$$\langle v, e_1(\beta)\rangle = \langle v, e_1(\tilde\beta)\rangle + \langle v, e_1(\beta) - e_1(\tilde\beta)\rangle = \langle v, e_1(\beta) - e_1(\tilde\beta)\rangle,$$

and thus

$$|\langle v, e_1(\beta)\rangle| \leq \|v\|\,\|e_1(\beta) - e_1(\tilde\beta)\| \leq c_1 \|v\|\,\|\beta - \tilde\beta\| \leq c_1 r \|v\|.$$

Moreover, for any $\beta \in B(\tilde\beta, r)$,

$$\|u_2(\beta)\|^2 = \|v\|^2 - \langle v, e_1(\beta)\rangle^2 \geq (1 - c_1^2 r^2)\|v\|^2$$

and

$$\begin{aligned}
\left|\|u_2(\beta)\| - \|u_2(\tilde\beta)\|\right| &\leq \frac{\left|\|u_2(\beta)\|^2 - \|u_2(\tilde\beta)\|^2\right|}{\|u_2(\beta)\| + \|u_2(\tilde\beta)\|}
= \frac{\left|\langle v, e_1(\beta)\rangle - \langle v, e_1(\tilde\beta)\rangle\right|\,\left|\langle v, e_1(\beta)\rangle + \langle v, e_1(\tilde\beta)\rangle\right|}{\|u_2(\beta)\| + \|u_2(\tilde\beta)\|} \\
&\leq \frac{2\|v\|}{2(1 - c_1^2 r^2)^{1/2}\|v\|}\,\|v\|\,\|e_1(\beta) - e_1(\tilde\beta)\|
\leq \frac{c_1 \|v\|\,\|\beta - \tilde\beta\|}{(1 - c_1^2 r^2)^{1/2}}.
\end{aligned}$$

Then

$$\begin{aligned}
\|e_2(\beta) - e_2(\tilde\beta)\| &= \left\|\frac{v - \langle v, e_1(\beta)\rangle e_1(\beta)}{\|u_2(\beta)\|} - \frac{v - \langle v, e_1(\tilde\beta)\rangle e_1(\tilde\beta)}{\|u_2(\tilde\beta)\|}\right\| \\
&\leq \frac{\left|\|u_2(\tilde\beta)\| - \|u_2(\beta)\|\right|}{\|u_2(\beta)\|\,\|u_2(\tilde\beta)\|}\,\|v\|
+ \frac{\left\|\langle v, e_1(\beta)\rangle e_1(\beta) - \langle v, e_1(\tilde\beta)\rangle e_1(\tilde\beta)\right\|}{\|u_2(\beta)\|}
\leq c_2 \|\beta - \tilde\beta\|,
\end{aligned}$$

where $c_2$ is a positive constant that depends only on $r$, $\tilde\beta$ and $v$. The smaller $r$ is fixed, the larger the constant $c_2$ may have to be taken. Finally, to get the Lipschitz condition for the map $\beta \mapsto e_3(\beta)$, we first need to bound $\|u_3(\beta)\|$ from below. Using the orthogonality between $e_1(\beta)$ and $e_2(\beta)$, we get

$$\|u_3(\beta)\|^2 = \|w\|^2 - \langle w, e_1(\beta)\rangle^2 - \langle w, e_2(\beta)\rangle^2.$$

Since $w \in \{\tilde\beta\}^\perp = \{e_1(\tilde\beta)\}^\perp$ and $v$ and $w$ are orthogonal,

$$|\langle w, e_1(\beta)\rangle| \leq \|w\|\,\|e_1(\beta) - e_1(\tilde\beta)\| \leq c_1 \|w\|\,\|\beta - \tilde\beta\|$$
and

$$|\langle w, e_2(\beta)\rangle| = |\langle w, e_2(\beta) - e_2(\tilde\beta)\rangle| \leq \|w\|\,\|e_2(\beta) - e_2(\tilde\beta)\| \leq c_2 \|w\|\,\|\beta - \tilde\beta\|,$$

because $e_2(\tilde\beta) = v/\|v\|$ is orthogonal to $w$. Deduce that, for any $\beta \in B(\tilde\beta, r)$,

$$\|u_3(\beta)\|^2 \geq (1 - c_1^2 r^2 - c_2^2 r^2)\|w\|^2.$$

The Lipschitz condition for the map $\beta \mapsto e_3(\beta)$ follows after repeatedly applying the triangle inequality.

S1.4 Proof of Proposition 2

Proposition 2. Suppose the conditions in Assumption 1 in the Appendix are met and the null hypothesis (2.2) holds true. Consider $\beta_n$ such that $\beta_n - \beta_0 = O_P(n^{-1/2})$. Then

$$n h^{1/2}\, I_n^{\{l\}}(\beta_n) / \hat\omega_n^{\{l\}}(\beta_n) \to N(0, 1) \quad \text{in law under } H_0,$$

and

$$[\hat\omega_n^{\{l\}}(\beta_0)]^2 \to [\omega^{\{l\}}(\beta_0)]^2 = 2\int K^2(u)\, du \int\!\!\int \Gamma^2(s, t)\, ds\, dt\; E\left[\int f_{\beta_0}^4(z)\, \varphi^2(W_1(\beta_0) - W_2(\beta_0))\, \pi_{\beta_0}(z \mid W_1(\beta_0))\, \pi_{\beta_0}(z \mid W_2(\beta_0))\, dz\right],$$

in probability, where $\pi_{\beta_0}(\cdot \mid w)$ is the conditional density of $Z(\beta_0)$ knowing that $W(\beta_0) = w$, and, for $t, s \in [0, 1]$,

$$\Gamma(s, t) = E[\epsilon(s)\,\epsilon(t)], \qquad \epsilon(t) = \mathbb{1}\{\Phi(Y) \leq t\} - P[\Phi(Y) \leq t \mid X^\top\beta_0].$$

Proof of Proposition 2. Let us consider the simplified notation from equation (7.3) and simplify further in the case $\beta = \beta_0$: write $L_{ij} = L_{ij}(\beta_0, g)$, $K_{ij} = K_{ij}(\beta_0, h)$, and $\varphi_{ij} = \varphi(W_i(\beta_0) - W_j(\beta_0))$. Notice that

$$\begin{aligned}
I_n^{\{l\}}(\beta_0) = \frac{1}{n(n-1)h} \sum_{1\leq i\neq j\leq n} \Big\{
&\langle (r_i - \bar r_i)(\cdot\,; \beta_0), (r_j - \bar r_j)(\cdot\,; \beta_0)\rangle_{L^2}
+ \langle \epsilon_i(\cdot), \epsilon_j(\cdot)\rangle_{L^2}
+ \langle \bar\epsilon_i(\cdot), \bar\epsilon_j(\cdot)\rangle_{L^2} \\
&+ 2\langle \epsilon_i(\cdot), (r_j - \bar r_j)(\cdot\,; \beta_0)\rangle_{L^2}
- 2\langle \bar\epsilon_i(\cdot), (r_j - \bar r_j)(\cdot\,; \beta_0)\rangle_{L^2}
- 2\langle \epsilon_i(\cdot), \bar\epsilon_j(\cdot)\rangle_{L^2}
\Big\}\, \hat f_{\beta_0,i}\, \hat f_{\beta_0,j}\, K_{ij}\, \varphi_{ij} \\
= I_1(\beta_0) &+ I_2(\beta_0) + I_3(\beta_0) + 2 I_4(\beta_0) - 2 I_5(\beta_0) - 2 I_6(\beta_0)
\end{aligned} \tag{S1.3}$$
with

$$\hat f_{\beta,i} = \frac{1}{(n-1)g}\sum_{k\neq i} L_{ik}(\beta), \qquad r_i(t; \beta) = P\left[Y_i \leq \Phi^{-1}(t) \mid X_i^\top\beta\right],$$

$$\bar r_i(t; \beta) = \frac{1}{(n-1)g\,\hat f_{\beta,i}}\sum_{k\neq i} r_k(t; \beta)\, L_{ik}(\beta),$$

and $\bar\epsilon_i(\cdot)$ is defined as $\bar r_i(t; \beta)$ with $r_k(t; \beta)$ replaced by $\epsilon_k(\cdot)$. This decomposition of $I_n^{\{l\}}(\beta_0)$ is given by the identity

$$U_i\,\omega(Z_i)(\cdot\,; \beta_0) = \left[r_i(\cdot\,; \beta_0) - \bar r_i(\cdot\,; \beta_0) + \epsilon_i(\cdot) - \bar\epsilon_i(\cdot)\right]\hat f_{\beta_0,i}.$$

The terms $I_1(\beta_0)$ and $I_3(\beta_0)$ are treated in Lemmas 6 and 7 in Section S2. For $I_2(\beta_0)$, let us introduce

$$\omega_n^2(\beta) = \frac{2}{n(n-1)h}\sum_{i=1}^n\sum_{j\neq i}\int\!\!\int\Gamma^2(s, t)\, ds\, dt\;\hat f^2_{\beta,i}\,\hat f^2_{\beta,j}\, K^2_{ij}(\beta)\,\varphi^2_{ij}(\beta).$$

Proposition 3 below ensures that $n h^{1/2}\,\omega_n^{-1}(\beta_0)\, I_2(\beta_0) \to N(0, 1)$ in law. The terms $I_4(\beta_0)$, $I_5(\beta_0)$ and $I_6(\beta_0)$ can be shown to be negligible in a similar way as $I_1(\beta_0)$ and $I_3(\beta_0)$. Lemma 9 shows that $\omega_n^2(\beta_0) \to [\omega^{\{l\}}(\beta_0)]^2$ in probability, with $\omega^{\{l\}}(\beta_0) > 0$, and thus $I_j(\beta_0)/\omega_n(\beta_0)$ is of the same order as $I_j(\beta_0)$ for $j \in \{1, 3, 4, 5, 6\}$. Finally, it is easy to check that $\omega_n(\beta_0) - \hat\omega_n^{\{l\}}(\beta_0) = o_P(1)$. By Proposition 1, one can replace $\beta_0$ by $\beta_n$, an estimator of $\beta_0$ that converges in probability at the rate $O_P(n^{-1/2})$, and have

$$n h^{1/2}\, I_n^{\{l\}}(\beta_n)/\hat\omega_n^{\{l\}}(\beta_n) - n h^{1/2}\, I_n^{\{l\}}(\beta_0)/\hat\omega_n^{\{l\}}(\beta_0) = o_P(1).$$

Then the result of Proposition 2 follows.

Proposition 3. Under the conditions of Proposition 2, $n h^{1/2}\,\omega_n^{-1}(\beta_0)\, I_2(\beta_0) \to N(0, 1)$ in law.

Proof. $\{S_{n,m}, \mathcal{F}_{n,m}, 1 \leq m \leq n, n \geq 1\}$ is a martingale array with $S_{n,1} = 0$,

$$S_{n,m}(\beta_0) = \sum_{i=1}^m G_{n,i}(\beta_0), \qquad
G_{n,i}(\beta_0) = \frac{2 h^{1/2}}{\omega_n(n-1)h}\left\langle \epsilon_i(\cdot)\,\hat f_{\beta_0,i},\; \sum_{j=1}^{i-1}\epsilon_j(\cdot)\,\hat f_{\beta_0,j}\, K_{ij}\,\varphi_{ij}\right\rangle_{L^2},$$

and $\mathcal{F}_{n,m}$ is the $\sigma$-field generated by $\{X_1, \ldots, X_n, Y_1, \ldots, Y_m\}$. Thus

$$n h^{1/2}\,\omega_n^{-1}(\beta_0)\, I_2(\beta_0) = S_{n,n}(\beta_0).$$
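The equality $n h^{1/2}\omega_n^{-1}(\beta_0) I_2(\beta_0) = S_{n,n}(\beta_0)$ is purely algebraic and can be verified on simulated data. The sketch below replaces the $L^2(0,1)$-valued processes $\epsilon_i(\cdot)$ by scalars and uses arbitrary stand-ins for $\hat f_{\beta_0,i}$, $K_{ij}$, $\varphi_{ij}$ and $\omega_n$; these simplifications are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, h = 50, 0.3
eps = rng.normal(size=n)                 # scalar stand-ins for eps_i(.)
f_hat = rng.uniform(0.5, 1.5, size=n)    # stand-ins for \hat f_{beta0, i}
X = rng.normal(size=n)
K = np.exp(-((X[:, None] - X[None, :]) / h) ** 2 / 2)   # symmetric K_ij
phi = np.exp(-(X[:, None] - X[None, :]) ** 2)           # symmetric phi_ij
omega_n = 1.7                            # any positive normalizer

# Degenerate second-order term I_2(beta0): sum over i != j
M = np.outer(eps * f_hat, eps * f_hat) * K * phi
np.fill_diagonal(M, 0.0)
I2 = M.sum() / (n * (n - 1) * h)

# Martingale increments G_{n,i}: each Y_i interacts with the past j < i only
G = np.array([2.0 * eps[i] * f_hat[i] *
              np.sum(eps[:i] * f_hat[:i] * K[i, :i] * phi[i, :i])
              for i in range(n)]) / (omega_n * (n - 1) * h ** 0.5)

S_nn = G.sum()
print(np.isclose(n * h ** 0.5 * I2 / omega_n, S_nn))   # True
```

Only the symmetry of $K_{ij}\varphi_{ij}$ in $(i, j)$ is used, which is why the sum over $i \neq j$ reduces to twice the sum over $j < i$.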
From Lemma 8 and the nesting of the $\sigma$-fields, $\mathcal{F}_{n,i} \subset \mathcal{F}_{n+1,i}$ for $1 \leq i \leq n$, $n \geq 1$, we have that the martingale array satisfies Corollary 3.1 of Hall and Heyde (1980) and the result follows.

S2 Technical lemmas

In the following results the kernels $L$ and $K$ are supposed to satisfy the conditions of Assumption 1-(f).

Lemma 1. Assume that $E[\exp(a\|X\|)] < \infty$ for some $a > 0$. Consider that $g \to 0$ and $n g^{4/3}/\log n \to \infty$. For any $t \in [0, 1]$, let $Y_k(t)$, $1 \leq k \leq n$, be i.i.d. random variables as in the proof of Proposition 1, such that $E[\sup_t |Y_k(t)|^a] < \infty$ for some $a > 8$. Moreover, assume that the maps $v \mapsto E[Y_k(t) \mid X^\top\tilde\beta = v]\, f_{\tilde\beta}(v)$, $v \in \mathbb{R}$, $t \in [0, 1]$, are uniformly Lipschitz (the Lipschitz constant does not depend on $t$). Then

$$\max_{1\leq i\leq n}\,\sup_{t\in[0,1]}\,\sup_{\beta\in\mathcal{B}_n}\left|\frac{1}{(n-1)g}\sum_{k\neq i} Y_k(t)\left[L_{ik}(\beta) - L_{ik}(\tilde\beta)\right]\right| = O_P(n^{-1/2} g^{-1/2} \log^{1/2} n + b_n).$$

Moreover,

$$\max_{1\leq i\leq n}\,\sup_{t\in[0,1]}\left|\frac{1}{(n-1)g}\sum_{k\neq i}\left\{Y_k(t) - E[Y_k(t) \mid X_k^\top\tilde\beta]\right\} L_{ik}(\tilde\beta)\right| = O_P(n^{-1/2} g^{-1/2} \log^{1/2} n).$$

Proof of Lemma 1. Recall that $Y_i(t) \equiv Y_i$ (in the case of SIM for mean regression) or $Y_i(t) = \mathbb{1}\{Y_i \leq \Phi^{-1}(t)\}$ (for the case of a single-index assumption on the conditional law), and $r_i(t; \tilde\beta) = E[Y_i(t) \mid Z_i(\tilde\beta)]$, $t \in [0, 1]$. For any $t \in [0, 1]$ we decompose

$$\begin{aligned}
\frac{1}{ng}\sum_{k\neq i} Y_k(t)\, L_{ik}(\beta)
&= \frac{1}{ng}\sum_{k=1}^n\left\{Y_k(t)\, L((X_i - X_k)^\top\beta/g) - E\left[Y(t)\, L((X_i - X)^\top\beta/g) \mid X_i\right]\right\} \\
&\quad + E\left[Y(t)\, g^{-1} L((X_i - X)^\top\beta/g) \mid X_i\right] - n^{-1} g^{-1} L(0)\, Y_i(t) \\
&= \Sigma_{1ni}(\beta, t) + \Sigma_{2ni}(\beta, t) - n^{-1} g^{-1} L(0)\, Y_i(t).
\end{aligned}$$

The moment condition on $Y$ guarantees that $\max_{1\leq i\leq n}\sup_t |Y_i(t)| = o_P(n^b)$ for some $0 < b < 1/8$. This and the fact that $n g^{4/3}/\log n \to \infty$ imply that $\max_{1\leq i\leq n}\sup_t n^{-1} g^{-1} |Y_i(t)| = o_P(n^{-1/2} g^{-1/2} \log^{1/2} n)$. On the other hand, by Lemma 5,

$$\max_{1\leq i\leq n}\,\sup_{t\in[0,1]}\,\sup_{\beta\in\mathcal{B}_n}\left|\Sigma_{2ni}(\beta, t) - \Sigma_{2ni}(\tilde\beta, t)\right| = O_P(b_n).$$

It remains to uniformly bound $\Sigma_{1ni}(\beta, t)$, and for this purpose we use empirical process tools. Let us introduce some notation. Let $\mathcal{G}$ be a class of functions of the observations with envelope function $G$ and let

$$J(\delta, \mathcal{G}, L_2) = \sup_Q \int_0^\delta \sqrt{1 + \log N(\varepsilon\|G\|_2, \mathcal{G}, L_2(Q))}\, d\varepsilon, \qquad 0 < \delta \leq 1,$$

denote the uniform entropy integral, where the supremum is taken over all
finitely discrete probability distributions $Q$ on the space of the observations, and $\|G\|_2$ denotes the norm of $G$ in $L_2(Q)$. Let $Z_1, \ldots, Z_n$ be a sample of independent observations and let

$$\mathbb{G}_n\gamma = \frac{1}{\sqrt n}\sum_{i=1}^n \gamma(Z_i), \qquad \gamma \in \mathcal{G},$$

be the empirical process indexed by $\mathcal{G}$. If the covering number $N(\varepsilon, \mathcal{G}, L_2(Q))$ is of polynomial order in $1/\varepsilon$, there exists a constant $c > 0$ such that $J(\delta, \mathcal{G}, L_2) \leq c\,\delta\sqrt{\log(1/\delta)}$ for $0 < \delta < 1/2$. Now, if $E\gamma^2 < \delta^2 E G^2$ for every $\gamma$ and some $0 < \delta < 1$, and $E G^{(4v-2)/(v-1)} < \infty$ for some $v > 1$, under mild additional measurability conditions that are satisfied in our context, Theorem 3.1 of van der Vaart and Wellner (2011) implies

$$\sup_{\gamma\in\mathcal{G}}|\mathbb{G}_n\gamma| = J(\delta, \mathcal{G}, L_2)\,\|G\|_2\left(1 + \frac{J^{1/v}(\delta, \mathcal{G}, L_2)}{\delta^2\sqrt n}\,\frac{\|G\|_{(4v-2)/(v-1)}^{v/(2v-1)}}{\|G\|_2^{1/v}}\right) O_P(1), \tag{S2.4}$$

where $\|G\|_2^2 = E G^2$ and the $O_P(1)$ term is independent of $n$. Note that the family $\mathcal{G}$ could change with $n$, as long as the envelope is the same for all $n$. We apply this result to the family of functions

$$\mathcal{G} = \{\gamma(\cdot\,; \beta, w, t) - \gamma(\cdot\,; \tilde\beta, w, t) : t \in [0, 1], \beta \in \mathcal{B}, w \in \mathbb{R}\},$$

where $\gamma(Y, X; \beta, w, t) = Y(t)\, L((X^\top\beta - w) g^{-1})$, for a sequence $g$ that converges to zero, and the envelope

$$G(Y, X) = \sup_{t\in[0,1]}|Y(t)|\,\sup_{w\in\mathbb{R}}|L(w)|.$$

Its entropy number is of polynomial order in $1/\varepsilon$, independently of $n$, since $L(\cdot)$ is of bounded variation and the families of indicator functions have polynomial complexity; see, for instance, van der Vaart (1998). Now, for any $\gamma \in \mathcal{G}$, $E\gamma^2 \leq C g\, E G^2$ for some constant $C$. Let $\delta = g^{1/2}$, so that $E\gamma^2 \leq C'\delta^2 E G^2$ for some constant $C'$, and take $v = 3/2$, which corresponds to $E G^8 < \infty$, guaranteed by our assumptions. Thus the bound in (S2.4) yields

$$\sup_{\gamma\in\mathcal{G}}\frac{1}{\sqrt n\, g}|\mathbb{G}_n\gamma| = n^{-1/2} g^{-1/2} \log^{1/2}(n)\left[1 + n^{-1/2} g^{-2/3} \log^{1/2}(n)\right] O_P(1),$$

where the $O_P(1)$ term is independent of $n$. Since $n g^{4/3}/\log n \to \infty$,

$$\max_{1\leq i\leq n}\,\sup_{t\in[0,1]}\,\sup_{\beta\in\mathcal{B}_n}\left|\Sigma_{1ni}(\beta, t) - \Sigma_{1ni}(\tilde\beta, t)\right| = O_P(n^{-1/2} g^{-1/2} \log^{1/2} n).$$

The second part of the statement is now obvious.
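The decomposition of the leave-one-out sum into $\Sigma_{1ni}$, $\Sigma_{2ni}$ and the diagonal correction is an exact identity, which holds whatever value is used for the conditional expectation term, since it enters $\Sigma_{1ni}$ and $\Sigma_{2ni}$ with opposite signs. A minimal numerical check, where the Gaussian $L$ and the arbitrary placeholder `a_i` for $E[Y\, L((X_i - X)^\top\beta/g) \mid X_i]$ are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n, g, i = 30, 0.4, 0
X, Y = rng.normal(size=n), rng.normal(size=n)
L = lambda v: np.exp(-v**2 / 2) / np.sqrt(2 * np.pi)

Lik = L((X[i] - X) / g)        # L((X_i - X_k)' beta / g), with beta = 1 here
a_i = 0.37                     # any placeholder for E[Y L((X_i - X)'beta/g) | X_i]

lhs = np.sum(np.delete(Y * Lik, i)) / (n * g)     # (1/(ng)) sum_{k != i}
Sigma1 = np.sum(Y * Lik - a_i) / (n * g)          # centered full sum
Sigma2 = a_i / g                                  # E[Y g^{-1} L(.) | X_i]
rhs = Sigma1 + Sigma2 - L(0.0) * Y[i] / (n * g)   # minus the k = i term
print(np.isclose(lhs, rhs))   # True
```

The placeholder cancels between the two terms, so the identity holds exactly for any `a_i`.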
Lemma 2. Assume that the density $f_{\tilde\beta}(\cdot)$ is Lipschitz. Then

$$\max_{1\leq i\leq n}\left|\frac{1}{(n-1)g}\sum_{k\neq i} L_{ik}(\tilde\beta) - f_{\tilde\beta}(X_i^\top\tilde\beta)\right| = O_P(n^{-1/2} g^{-1/2} \log^{1/2} n + g).$$

Proof of Lemma 2. We can write

$$\frac{1}{(n-1)g}\sum_{k\neq i} L_{ik}(\tilde\beta) - f_{\tilde\beta}(X_i^\top\tilde\beta)
= \frac{1}{n}\sum_{k=1}^n\left\{g^{-1} L_{ik}(\tilde\beta) - E\left[g^{-1} L_{ik}(\tilde\beta) \mid X_i\right]\right\}
+ E\left[g^{-1} L_{ik}(\tilde\beta) \mid X_i\right] - f_{\tilde\beta}(X_i^\top\tilde\beta) + O(n^{-1} g^{-1}).$$

By the empirical process arguments used in Lemma 1, the sum on the right-hand side of the display is of rate $O_P(n^{-1/2} g^{-1/2} \log^{1/2} n)$ uniformly with respect to $i$. The Lipschitz property of $f_{\tilde\beta}$ and the fact that $\int |v| L(v)\, dv < \infty$ guarantee that

$$\max_{1\leq i\leq n}\left|E\left[g^{-1} L_{ik}(\tilde\beta) \mid X_i\right] - f_{\tilde\beta}(X_i^\top\tilde\beta)\right| \leq C g$$

for some constant $C$.

Lemma 3. For any $t \in [0, 1]$, let $Y_k(t)$, $1 \leq k \leq n$, be an independent sample from a random variable $Y(t)$ defined as in the proof of Proposition 1. Let $r(v; t, \tilde\beta) = E[Y(t) \mid X^\top\tilde\beta = v]$, $v \in \mathbb{R}$, and assume that $r(\cdot\,; t, \tilde\beta)$
is twice differentiable and the second derivative is bounded by a constant independent of $t$. If $r'(v; t, \tilde\beta)$ is the first derivative of $r(\cdot\,; t, \tilde\beta)$, then, for any $t \in [0, 1]$,

$$\frac{1}{n-1}\sum_{k\neq i}\left\{r(X_i^\top\tilde\beta; t, \tilde\beta) - r(X_k^\top\tilde\beta; t, \tilde\beta)\right\}\frac{1}{g}L_{ik}(\tilde\beta)
= r'(X_i^\top\tilde\beta; t, \tilde\beta)\, g\, D_{1,ni} + g^2\, \widetilde D_{1,ni}(t),$$

where $\max_{1\leq i\leq n}|D_{1,ni}| = O_P(n^{-1/2} g^{-1/2} \log^{1/2} n)$ and $\max_{1\leq i\leq n}\sup_{t\in[0,1]}|\widetilde D_{1,ni}(t)| = O_P(1)$.

Proof of Lemma 3. By a Taylor expansion,

$$\begin{aligned}
\frac{1}{n-1}\sum_{k\neq i}\left\{r(X_i^\top\tilde\beta; t, \tilde\beta) - r(X_k^\top\tilde\beta; t, \tilde\beta)\right\}\frac{1}{g}L_{ik}(\tilde\beta)
&= r'(X_i^\top\tilde\beta; t, \tilde\beta)\,\frac{1}{n}\sum_{k=1}^n (X_i - X_k)^\top\tilde\beta\;\frac{1}{g}L_{ik}(\tilde\beta) \\
&\quad + \frac{1}{n}\sum_{k=1}^n r''(x_{ik}(t); t, \tilde\beta)\left[(X_i - X_k)^\top\tilde\beta\right]^2\frac{1}{g}L_{ik}(\tilde\beta),
\end{aligned}$$

where $r''$ stands for the second derivative with respect to $v$ and $x_{ik}(t)$ is a point between $X_i^\top\tilde\beta$ and $X_k^\top\tilde\beta$. Since $L(\cdot)$ is symmetric, by the empirical process arguments as in Lemma 1,

$$\max_{1\leq i\leq n}\left|\frac{1}{n}\sum_{k=1}^n \frac{(X_i - X_k)^\top\tilde\beta}{g}\,\frac{1}{g}L_{ik}(\tilde\beta)\right| = O_P(n^{-1/2} g^{-1/2} \log^{1/2} n).$$

The result follows taking absolute values in the last sum in the last display, using the boundedness of $r''$ and the fact that

$$\max_{1\leq i\leq n}\left|\frac{1}{n}\sum_{k=1}^n \frac{\left[(X_i - X_k)^\top\tilde\beta\right]^2}{g^2}\,\frac{1}{g}L_{ik}(\tilde\beta) - f_{\tilde\beta}(X_i^\top\tilde\beta)\int_{\mathbb{R}} v^2 L(v)\, dv\right| = o_P(1).$$

Lemma 4. Assume that $E[\exp(a\|X\|)] < \infty$ for some $a > 0$. Moreover, the kernels $K$ and $L$ are of bounded variation, differentiable except on at most a finite set of points, and $\int |u| K(u)\, du < \infty$. Let $\mathcal{B}_n$ be a subset of the parameter space such that the event defined in equation (7.2), with $b_n \to 0$ and $b_n n^{1/2}/\log n \to \infty$, has probability tending to 1. Let $K_{12}(\beta) = K((X_1 - X_2)^\top\beta/h)$, $L_{12}(\beta) = L((X_1 - X_2)^\top\beta/g)$ and $\varphi_{12}(\beta) = \varphi((X_1 - X_2)^\top A(\beta))$. If the density $f_{\tilde\beta}$ is Lipschitz with constant $C_{1,\tilde\beta}$, then there exists a constant $C$ depending only on $K$, $L$, $f_{\tilde\beta}$ and $C_{1,\tilde\beta}$ such that

$$P\left\{E\left[\sup_{\beta\in\mathcal{B}_n}\left|K_{12}(\beta)\varphi_{12}(\beta) - K_{12}(\tilde\beta)\varphi_{12}(\tilde\beta)\right| \,\Big|\, X_1\right] \leq C b_n h^{1/2}\right\} \to 1, \tag{S2.5}$$
$$E\left[\sup_{\beta\in\mathcal{B}_n}\left|K_{12}(\beta)\varphi_{12}(\beta) - K_{12}(\tilde\beta)\varphi_{12}(\tilde\beta)\right|\right] \leq C b_n h^{1/2}, \tag{S2.6}$$

$$P\left\{E\left[\sup_{\beta\in\mathcal{B}_n}\left|L_{12}(\beta) - L_{12}(\tilde\beta)\right|^2 \,\Big|\, X_1\right] \leq C b_n g^{-1}\right\} \to 1, \tag{S2.7}$$

$$P\left\{E\left[\sup_{\beta\in\mathcal{B}_n}\left|L_{13}(\beta) - L_{13}(\tilde\beta)\right|^2 K_{12}^2(\tilde\beta) \,\Big|\, X_2, X_3\right] \leq C h b_n g^{-1}\right\} \to 1, \tag{S2.8}$$

and

$$E\left[\sup_{\beta\in\mathcal{B}_n}\left|L_{13}(\beta) - L_{13}(\tilde\beta)\right|^2 K_{12}^2(\tilde\beta)\,\varphi_{12}^2(\tilde\beta)\right] \leq C h b_n g^{-1}. \tag{S2.9}$$

In Lemma 4 we provide different bounds for $L(\cdot)$ and $K(\cdot)$ because the bandwidths $g$ and $h$ have to satisfy the condition $h/g^2 \to 0$. Hence we need less restrictive conditions on the range of $h$ if we want to allow a larger domain for the pair $(g, h)$.

Proof of Lemma 4. Since the univariate kernel $K$ is of bounded variation, let $K_1$ and $K_2$ be non-decreasing bounded functions such that $K = K_1 - K_2$, and denote $K_{1h} = K_1(\cdot/h)$. Clearly, it is sufficient to prove the result
with $K_1$; similar arguments apply for $K_2$, and hence we get the results for $K$. For simpler writing we assume that $K$ is differentiable and let

$$K_1(x) = \int_{-\infty}^x [K'(t)]_+\, dt \qquad\text{and}\qquad K_2(x) = \int_{-\infty}^x [K'(t)]_-\, dt, \qquad x \in \mathbb{R}.$$

Here $[K']_+$ (resp. $[K']_-$) denotes the positive (resp. negative) part of $K'$. The general case, where a finite set of non-differentiability points is allowed, can be handled with obvious modifications. Let $K_{1h}(t) = K_1(t/h)$ and recall that $Z_i(\beta) = X_i^\top\beta$. Note that $|\exp(-t^2) - \exp(-s^2)| \leq 2|t - s|$. For any $\beta \in \mathcal{B}_n$ and any elementary event in the set $\mathcal{C}_n = \{\max_{1\leq i\leq n}\|X_i\| \leq c\log n\} \cap \mathcal{E}_n$, for some large constant $c$,

$$\begin{aligned}
\left|K_{1h}(Z_1(\beta) - Z_2(\beta))\,\varphi_{12}(\beta) - K_{1h}(Z_1(\tilde\beta) - Z_2(\tilde\beta))\,\varphi_{12}(\tilde\beta)\right|
\leq\; & 2 b_n\, K_{1h}\big(Z_1(\tilde\beta) - Z_2(\tilde\beta) + 2b_n\big) \\
&+ \left[K_{1h}\big(Z_1(\tilde\beta) - Z_2(\tilde\beta) + 2b_n\big) - K_{1h}\big(Z_1(\tilde\beta) - Z_2(\tilde\beta) - 2b_n\big)\right]\varphi_{12}(\tilde\beta).
\end{aligned}$$

The upper bound on the right-hand side is uniform with respect to $\beta$. By a suitable change of variable, and since the density $f_{\tilde\beta}$ is bounded, it is easy to check that $E[K_{1h}(Z_1(\tilde\beta) - Z_2(\tilde\beta) + 2b_n) \mid Z_1(\tilde\beta)]$ is bounded by a constant times $h$. Next, note that since $nh \to \infty$, there exists a constant $C'$ independent of $n$ such that on the set $\mathcal{C}_n$ we have $|Z_1(\tilde\beta) - Z_2(\tilde\beta) \pm 2b_n|/h \leq C' h^{-1/2}$. Then, applying twice a change of variables and using the Lipschitz property of $f_{\tilde\beta}$, on the set $\mathcal{C}_n$,

$$\begin{aligned}
E\big[\big(K_{1h}(Z_1(\tilde\beta) - Z_2(\tilde\beta) &+ 2b_n) - K_{1h}(Z_1(\tilde\beta) - Z_2(\tilde\beta) - 2b_n)\big)\,\mathbb{1}\{\mathcal{C}_n\} \,\big|\, Z_1(\tilde\beta)\big] \\
&\leq h\int_{[-C'h^{-1/2},\, C'h^{-1/2}]} K_1'(u)\left|f_{\tilde\beta}(2b_n + Z_1(\tilde\beta) - uh) - f_{\tilde\beta}(-2b_n + Z_1(\tilde\beta) - uh)\right| du \\
&\leq h\,\sup_{t\in\mathbb{R}}\left|f_{\tilde\beta}(2b_n + t) - f_{\tilde\beta}(-2b_n + t)\right|\int_{[-C'h^{-1/2},\, C'h^{-1/2}]} K_1'(u)\, du \leq C h^{1/2} b_n,
\end{aligned}$$

for some constant $C > 0$. Since, by a suitable choice of $c$, the probability of $\mathcal{C}_n^c$ given $Z_1(\tilde\beta)$ could be made smaller than any fixed negative power of $n$, and the probability of the event $\{|Z_1(\tilde\beta)| \geq c\log n\}$ could also be made very small, the bound in the last display implies the statement (S2.5). For the statement (S2.6), it suffices to take the expectation.

For the bound in equation (S2.7), recall that $L(t) = L(|t|)$ for any $t \in \mathbb{R}$, so that we can consider only nonnegative $t$. Moreover, without loss of generality, we can consider $L$ nonnegative and decreasing on $[0, \infty)$; otherwise, since $L$ is of bounded variation, it could be written as the difference of two nonnegative decreasing functions on $[0, \infty)$. Moreover, let $Z_{13}(\beta) = Z_1(\beta) - Z_3(\beta)$ and $L_{g,13}(\beta) = L(Z_{13}(\beta)/g)$. We split the problem into two cases: $Z_{13}(\beta) \leq Z_{13}(\tilde\beta)$ and $Z_{13}(\beta) > Z_{13}(\tilde\beta)$. Then, for $\beta \in \mathcal{B}_n$
and on the set $\mathcal{C}_n$ we have

$$\begin{aligned}
\left|L_{g,13}(\beta) - L_{g,13}(\tilde\beta)\right|\mathbb{1}\{Z_{13}(\beta) \leq Z_{13}(\tilde\beta)\}
\leq\; & [L(0) - L_{g,13}(\tilde\beta)]\,\mathbb{1}\{Z_{13}(\beta) \leq Z_{13}(\tilde\beta),\, Z_{13}(\tilde\beta) \leq 2b_n\} \\
&+ [L_{g,13}(\beta) - L_{g,13}(\tilde\beta)]\,\mathbb{1}\{Z_{13}(\beta) \leq Z_{13}(\tilde\beta),\, Z_{13}(\tilde\beta) \geq 2b_n\} \\
\leq\; & C b_n^2 g^{-2}\,\mathbb{1}\{Z_{13}(\tilde\beta) \leq 2b_n\} \\
&+ \left[L\big((Z_{13}(\tilde\beta) - 2b_n)/g\big) - L\big(Z_{13}(\tilde\beta)/g\big)\right]\mathbb{1}\{Z_{13}(\beta) \leq Z_{13}(\tilde\beta),\, Z_{13}(\tilde\beta) \geq 2b_n\} \\
=\; & C b_n^2 g^{-2}\,\mathbb{1}\{Z_{13}(\tilde\beta) \leq 2b_n\} + A_n
\end{aligned}$$

and

$$\left|L_{g,13}(\beta) - L_{g,13}(\tilde\beta)\right|\mathbb{1}\{Z_{13}(\beta) > Z_{13}(\tilde\beta)\}
\leq \left[L\big(Z_{13}(\tilde\beta)/g\big) - L\big((Z_{13}(\tilde\beta) + 2b_n)/g\big)\right]\mathbb{1}\{Z_{13}(\beta) > Z_{13}(\tilde\beta)\} = B_n,$$

for some constant $C$. Let us notice that

$$\begin{aligned}
A_n + B_n \leq\; & \left[L\big((Z_{13}(\tilde\beta) - 2b_n)/g\big) - L\big((Z_{13}(\tilde\beta) + 2b_n)/g\big)\right]\mathbb{1}\{Z_{13}(\tilde\beta) \geq 2b_n\} \\
&+ \left[L\big(Z_{13}(\tilde\beta)/g\big) - L\big((Z_{13}(\tilde\beta) + 2b_n)/g\big)\right]\mathbb{1}\{0 \leq Z_{13}(\tilde\beta) \leq 2b_n\} \\
\leq\; & \left[L\big((Z_{13}(\tilde\beta) - 2b_n)/g\big) - L\big((Z_{13}(\tilde\beta) + 2b_n)/g\big)\right]\mathbb{1}\{Z_{13}(\tilde\beta) \geq 2b_n\} + C b_n^2 g^{-2}\,\mathbb{1}\{Z_{13}(\tilde\beta) \leq 2b_n\} \\
=\; & D_n + C b_n^2 g^{-2}\,\mathbb{1}\{Z_{13}(\tilde\beta) \leq 2b_n\}.
\end{aligned}$$

On the other hand, $0 \leq D_n \leq 4 b_n g^{-1}\,|L'(\widetilde Z)|$, where $\widetilde Z$ is some value such that $|\widetilde Z - Z_{13}(\tilde\beta)/g| \leq 2 b_n g^{-1}$. Since, for some constant $c$, $|L'(v)| \leq c|v|$ in a neighborhood of the origin,

$$D_n \leq 4 b_n g^{-1}\left|L'\big(Z_{13}(\tilde\beta)/g\big)\right| + C' b_n^2 g^{-2},$$

for some constant $C'$. Since $L$ is bounded, deduce that $\sup_{\beta\in\mathcal{B}_n}|L_{g,13}(\beta) - L_{g,13}(\tilde\beta)|^2$ is bounded by $C b_n^2 g^{-1}\cdot g^{-1}|L'(Z_{13}(\tilde\beta)/g)| + o(b_n^2 g^{-1})$ for some constant $C$. Take the conditional expectation given $X_1$, which is the same as the conditional expectation given $Z_1(\tilde\beta)$, and deduce the bound in equation (S2.7).
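The monotonicity argument above bounds the kernel increment, uniformly over the perturbed index, by a bracket evaluated at $\tilde\beta$ only. The inequality can be checked numerically for any even kernel that is decreasing on $[0, \infty)$; the Gaussian-type kernel below is an assumption made for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
g, b = 0.3, 0.02
L = lambda t: np.exp(-t**2)     # even, decreasing on [0, inf)

z_tilde = rng.uniform(2 * b, 3.0, size=1000)    # Z_13(beta_tilde), with Z_13(beta_tilde) >= 2 b_n
shift = rng.uniform(-2 * b, 2 * b, size=1000)   # |Z_13(beta) - Z_13(beta_tilde)| <= 2 b_n
z = z_tilde + shift                             # Z_13(beta)

increment = np.abs(L(z / g) - L(z_tilde / g))
# Bracketing envelope at beta_tilde only, as in the bound for D_n
envelope = L((z_tilde - 2 * b) / g) - L((z_tilde + 2 * b) / g)
print(np.all(increment <= envelope + 1e-12))    # True
```

Both $Z_{13}(\beta)/g$ and $Z_{13}(\tilde\beta)/g$ lie in the interval $[(Z_{13}(\tilde\beta) - 2b_n)/g, (Z_{13}(\tilde\beta) + 2b_n)/g]$, on which $L$ is monotone, which is all the inequality uses.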
On the set of events $\mathcal{C}_n$,

$$\sup_{\beta\in\mathcal{B}_n}\left|L_{13}(\beta) - L_{13}(\tilde\beta)\right|\left|K_{12}(\tilde\beta)\right|
\leq \left\{D_n + 3 C b_n^2 g^{-2}\,\mathbb{1}\{Z_{13}(\tilde\beta) \leq 2b_n\}\right\}\left|K_{12}(\tilde\beta)\right|.$$

Take the conditional expectation and use a standard change of variables to derive the bound in equation (S2.8). Take the expectation and remember that $\varphi_{12}(\tilde\beta)$ is bounded to derive the moment bound in equation (S2.9).

Lemma 5. Under the conditions of Lemma 1,

$$\sup_{t\in[0,1]}\,\sup_{\beta\in\mathcal{B}_n}\,\max_{1\leq i\leq n}\left|\Sigma_{2ni}(\beta, t) - \Sigma_{2ni}(\tilde\beta, t)\right| = O_P(b_n).$$

Proof of Lemma 5. We can write

$$\left|\Sigma_{2ni}(\beta, t) - \Sigma_{2ni}(\tilde\beta, t)\right|
\leq E\left[|Y(t)|\, g^{-1}\left|L((X_i - X)^\top\beta/g) - L((X_i - X)^\top\tilde\beta/g)\right| \,\Big|\, X_i\right]
= E\left[E\{|Y(t)| \mid X\}\, g^{-1}\left|L((X_i - X)^\top\beta/g) - L((X_i - X)^\top\tilde\beta/g)\right| \,\Big|\, X_i\right].$$

Now we can apply the monotonicity argument used in Lemma 4 and deduce the bound.

Lemma 6. Under the conditions of Proposition 2, $I_1(\beta_0) = o_P(n^{-1} h^{-1/2})$.
Proof of Lemma 6. With the notation defined in equation (S1.3), we have

$$I_1(\beta_0) = \frac{1}{n(n-1)^3 g^2 h}\sum_{i=1}^n\sum_{j\neq i}\sum_{k\neq i}\sum_{l\neq j}\left\langle (r_i - r_k)(\cdot\,; \beta_0), (r_j - r_l)(\cdot\,; \beta_0)\right\rangle_{L^2} L_{ik}\, L_{jl}\, K_{ij}\,\varphi_{ij},$$

and if we denote by $I_{1,1}(\beta_0)$ the term where $i$, $j$, $k$ and $l$ are all different, then

$$E[I_{1,1}(\beta_0)] = \frac{(n-2)(n-3)}{(n-1)^2 g^2 h}\, E\left[\left\langle E[(r_i - r_k)(\cdot\,; \beta_0)\, L_{ik} \mid Z_i(\beta_0)],\, E[(r_j - r_l)(\cdot\,; \beta_0)\, L_{jl} \mid Z_j(\beta_0)]\right\rangle_{L^2} K_{ij}\,\varphi_{ij}\right] = O(g^4),$$

as soon as $g^{-1} E[(r_i - r_k)(t; \beta_0)\, L_{ik}(\beta_0) \mid Z_i(\beta_0)] = O(g^2)\, D(t; Z_i(\beta_0))$ with $D(\cdot)$ bounded, which is guaranteed by Assumption 1-(c). When $i$, $j$, $k$ and $l$ take no more than 3 different values, the number of terms is reduced by a factor $n$, and thus we have $E[I_{1,2}(\beta_0)] = O(n^{-1} g^{-1}) = o(n^{-1} h^{-1/2})$. Similar reasoning can be applied to prove that $E[I_1^2(\beta_0)] = o(n^{-2} h^{-1})$. See also Proposition A.1 in Fan and Li (1996).

Lemma 7. Under the conditions of Proposition 2, $I_3(\beta_0) = o_P(n^{-1} h^{-1/2})$.
Proof of Lemma 7. Write

$$\begin{aligned}
I_3(\beta_0) &= \frac{1}{n(n-1)^3 g^2 h}\sum_{i=1}^n\sum_{j\neq i}\sum_{k\neq i}\sum_{l\neq j}\left\langle\epsilon_k(\cdot), \epsilon_l(\cdot)\right\rangle_{L^2} L_{ik}\, L_{jl}\, K_{ij}\,\varphi_{ij} \\
&= \frac{1}{n(n-1)^3 g^2 h}\sum_{i=1}^n\sum_{j\neq i}\sum_{k\neq i}\sum_{l\neq j,\, l\neq k}\left\langle\epsilon_k(\cdot), \epsilon_l(\cdot)\right\rangle_{L^2} L_{ik}\, L_{jl}\, K_{ij}\,\varphi_{ij} \\
&\quad + \frac{1}{n(n-1)^3 g^2 h}\sum_{i=1}^n\sum_{j\neq i}\sum_{k\neq i,\, k\neq j}\left\|\epsilon_k(\cdot)\right\|^2_{L^2}\, L_{ik}\, L_{jk}\, K_{ij}\,\varphi_{ij} \\
&\quad + \frac{1}{n(n-1)^3 g^2 h}\sum_{i=1}^n\sum_{j\neq i}\left\|\epsilon_j(\cdot)\right\|^2_{L^2}\, L_{ij}\, L_{ji}\, K_{ij}\,\varphi_{ij} \\
&= I_{3,1}(\beta_0) + I_{3,2}(\beta_0) + I_{3,3}(\beta_0).
\end{aligned}$$

Then

$$E[I_{3,1}(\beta_0)] = \frac{1}{(n-1)^2 g^2 h}\, E\left[\left\langle\epsilon_1(\cdot), \epsilon_2(\cdot)\right\rangle_{L^2} L^2_{12}\, K_{12}\,\varphi_{12}\right]
= O(n^{-2} g^{-2})\, E\left[\left\langle\epsilon_1(\cdot), \epsilon_2(\cdot)\right\rangle_{L^2} h^{-1} K_{12}\right] = O(n^{-2} g^{-2}),$$

$E[I_{3,2}(\beta_0)] = O(n^{-1} g^{-1})$ and $E[I_{3,3}(\beta_0)] = O(n^{-2} g^{-2})$; thus $E[I_3(\beta_0)] = o(n^{-1} h^{-1/2})$. By quite straightforward but tedious calculations, it can be proved that $E[I_3^2(\beta_0)] = o(n^{-2} h^{-1})$, and the rate of $I_3(\beta_0)$ follows.
Lemma 8. Under the conditions of Proposition 2,

$$V_n^2(\beta_0) = \sum_{i=2}^n E\left[G_{n,i}^2(\beta_0) \mid \mathcal{F}_{n,i-1}\right] \to 1,$$

in probability, and the martingale difference array $\{G_{n,i}, \mathcal{F}_{n,i}, 1 \leq i \leq n\}$ satisfies the conditional Lindeberg condition: for all $\varepsilon > 0$,

$$\sum_{i=2}^n E\left[G_{n,i}^2\,\mathbb{1}(|G_{n,i}| > \varepsilon) \mid \mathcal{F}_{n,i-1}\right] \to 0,$$

in probability.

Proof of Lemma 8. First, decompose

$$\begin{aligned}
V_n^2(\beta_0) &= \frac{4}{\omega_n^2(n-1)^2 h}\sum_{i=2}^n\int\!\!\int\Gamma(s, t)\,\hat f^2_{\beta_0,i}\left(\sum_{j=1}^{i-1}\epsilon_j(s)\,\hat f_{\beta_0,j}\, K_{ij}\,\varphi_{ij}\right)\left(\sum_{j=1}^{i-1}\epsilon_j(t)\,\hat f_{\beta_0,j}\, K_{ij}\,\varphi_{ij}\right) ds\, dt \\
&= \frac{4}{\omega_n^2(n-1)^2 h}\sum_{i=2}^n\sum_{j=1}^{i-1}\int\!\!\int\Gamma(s, t)\,\hat f^2_{\beta_0,i}\,\epsilon_j(s)\,\epsilon_j(t)\,\hat f^2_{\beta_0,j}\, K^2_{ij}\,\varphi^2_{ij}\, ds\, dt \\
&\quad + \frac{8}{\omega_n^2(n-1)^2 h}\sum_{i=3}^n\sum_{j=2}^{i-1}\sum_{k=1}^{j-1}\int\!\!\int\Gamma(s, t)\,\hat f^2_{\beta_0,i}\,\epsilon_j(s)\,\epsilon_k(t)\,\hat f_{\beta_0,j}\,\hat f_{\beta_0,k}\, K_{ij}\, K_{ik}\,\varphi_{ij}\,\varphi_{ik}\, ds\, dt \\
&= A_n(\beta_0) + B_n(\beta_0).
\end{aligned} \tag{S2.10}$$
We have

$$E[A_n(\beta_0)] = E\left[E[A_n(\beta_0) \mid X_1, \ldots, X_n]\right]
= E\left[\frac{2n}{\omega_n^2(\beta_0)(n-1)h}\int\!\!\int\Gamma(s, t)\,\hat f^2_{\beta_0,i}\, E[\epsilon_j(s)\,\epsilon_j(t)]\,\hat f^2_{\beta_0,j}\, K^2_{ij}\,\varphi^2_{ij}\, ds\, dt\right] = \frac{n}{n-1} \to 1.$$

Moreover,

$$\begin{aligned}
\mathrm{Var}(A_n(\beta_0)) \leq\; & \frac{64\,\|\varphi\|_\infty^4}{(n-1)^4 h^2}\left\{\int\Gamma^2(s, t)\,\Gamma^2(u, v)\, ds\, dt\, du\, dv \sum_{i=3}^n\sum_{j=2}^{i-1}\sum_{j'=1}^{j-1} E\left[\omega_n^{-4}(\beta_0)\,\hat f^4_{\beta_0,i}\,\hat f^2_{\beta_0,j}\,\hat f^2_{\beta_0,j'}\, K^2_{ij}\, K^2_{ij'}\right]\right. \\
&+ \int\left|\Gamma(s, t)\,\Gamma(u, v)\, G(s, t, u, v)\right| ds\, dt\, du\, dv \sum_{i=3}^n\sum_{i'=2}^{i-1}\sum_{j=1}^{i'-1} E\left[\omega_n^{-4}(\beta_0)\,\hat f^2_{\beta_0,i}\,\hat f^2_{\beta_0,i'}\,\hat f^4_{\beta_0,j}\, K^2_{ij}\, K^2_{i'j}\right] \\
&\left.+ \int\left|\Gamma(s, t)\,\Gamma(u, v)\, G(s, t, u, v)\right| ds\, dt\, du\, dv \sum_{i=2}^n\sum_{j=1}^{i-1} E\left[\omega_n^{-4}(\beta_0)\,\hat f^4_{\beta_0,i}\,\hat f^4_{\beta_0,j}\, K^4_{ij}\right]\right\} = o\!\left(n^{-1} h^{-1/2}\right),
\end{aligned}$$

where $G(s, t, u, v) = E[\epsilon(s)\,\epsilon(t)\,\epsilon(u)\,\epsilon(v)]$. The decomposition of $E[B_n^2]$ involves the same type of terms and is therefore also of rate $o(n^{-1} h^{-1/2})$, so that the convergence of $V_n^2(\beta_0)$ is established.

For the conditional Lindeberg condition, we have, for all $\varepsilon > 0$, $n \geq 1$ and $1 < i \leq n$,

$$E\left[G_{n,i}^2\,\mathbb{1}(|G_{n,i}| > \varepsilon) \mid \mathcal{F}_{n,i-1}\right] \leq \frac{E\left[G_{n,i}^4 \mid \mathcal{F}_{n,i-1}\right]}{\varepsilon^2}.$$

Then

$$\sum_{i=2}^n E\left[G^2_{n,i}\,\mathbb{1}(|G_{n,i}| > \varepsilon) \mid \mathcal{F}_{n,i-1}\right]
\leq \frac{1}{\varepsilon^2}\sum_{i=2}^n E\left[G^4_{n,i} \mid \mathcal{F}_{n,i-1}\right]
\leq \frac{1}{\varepsilon^2}\,\frac{16}{\omega_n^4(\beta_0)(n-1)^4 h^2}\sum_{i=2}^n\int G(s_1, s_2, s_3, s_4)\,\hat f^4_{\beta_0,i}\prod_{k=1}^4\left(\sum_{j_k=1}^{i-1}\epsilon_{j_k}(s_k)\,\hat f_{\beta_0,j_k}\, K_{ij_k}\,\varphi_{ij_k}\right) ds_k.$$

The expectation of the last majorant is of rate

$$O(n^{-1})\int G(s_1, s_2, s_3, s_4)\,\Gamma(s_1, s_2)\,\Gamma(s_3, s_4)\, ds_1\, ds_2\, ds_3\, ds_4\; E\left[\hat f^4_{\beta_0,i}\,\hat f^2_{\beta_0,j}\,\hat f^2_{\beta_0,j'}\, h^{-1} K^2_{ij}\, h^{-1} K^2_{ij'}\,\varphi^2_{ij}\,\varphi^2_{ij'}\right]$$
$$+\; O(n^{-2} h^{-1})\int G^2(s_1, s_2, s_3, s_4)\, ds_1\, ds_2\, ds_3\, ds_4\; E\left[\hat f^4_{\beta_0,i}\,\hat f^4_{\beta_0,j}\, h^{-1} K^4_{ij}\,\varphi^4_{ij}\right] = o\!\left(n^{-1} h^{-1/2}\right).$$
Lemma 9. Under the conditions of Proposition 2, $\omega_n^2(\beta_0) \to \omega^2(\beta_0) > 0$, in probability.

Proof of Lemma 9. We have

$$E\left[\omega_n^2(\beta_0)\right] = 2\, E\left[\hat f^2_{\beta_0,i}\,\hat f^2_{\beta_0,j}\, h^{-1} K^2_{ij}(\beta_0)\,\varphi^2_{ij}(\beta_0)\right]\int\!\!\int\Gamma^2(s, t)\, ds\, dt.$$

On the other hand,

$$E\left[\hat f^2_{\beta_0,i}\,\hat f^2_{\beta_0,j}\, h^{-1} K^2_{ij}\,\varphi^2_{ij}\right]
= \frac{1}{(n-1)^4}\, E\left[\sum_{k\neq i}\sum_{l\neq j}\sum_{k'\neq i}\sum_{l'\neq j} \frac{L_{ik}}{g}\,\frac{L_{jl}}{g}\,\frac{L_{ik'}}{g}\,\frac{L_{jl'}}{g}\, h^{-1} K^2_{ij}\,\varphi^2_{ij}\right]
= \frac{(n-2)(n-3)(n-4)}{(n-1)^3}\,\bar\omega_n^2(\beta_0) + o\!\left(n^{-1} h^{-1/2}\right),$$

where

$$\begin{aligned}
\bar\omega_n^2(\beta_0) &= E\left[\int \frac{1}{g}L\!\left(\frac{z_i - z_k}{g}\right)\frac{1}{g}L\!\left(\frac{z_j - z_l}{g}\right)\frac{1}{g}L\!\left(\frac{z_i - z_{k'}}{g}\right)\frac{1}{g}L\!\left(\frac{z_j - z_{l'}}{g}\right)\frac{1}{h}K^2\!\left(\frac{z_i - z_j}{h}\right)\varphi_{ij}\right. \\
&\qquad\quad\left. \times\, f_{\beta_0}(z_k)\, f_{\beta_0}(z_l)\, f_{\beta_0}(z_{k'})\, f_{\beta_0}(z_{l'})\,\pi_{\beta_0}(z_i \mid W_i(\beta_0))\,\pi_{\beta_0}(z_j \mid W_j(\beta_0))\, dz_i\, dz_j\, dz_k\, dz_l\, dz_{k'}\, dz_{l'}\right] \\
&= E\left[\int f_{\beta_0}(z_i - g s_1)\, f_{\beta_0}(z_i - g s_2)\, f_{\beta_0}(z_j - g t_1)\, f_{\beta_0}(z_j - g t_2)\,\pi_{\beta_0}(z_i \mid W_i(\beta_0))\,\pi_{\beta_0}(z_j \mid W_j(\beta_0))\,\varphi_{ij}\right. \\
&\qquad\quad\left. \times\, L(s_1)\, L(t_1)\, L(s_2)\, L(t_2)\,\frac{1}{h}K^2\!\left(\frac{z_i - z_j}{h}\right) dz_i\, dz_j\, ds_1\, dt_1\, ds_2\, dt_2\right] \\
&= E\left[\int f_{\beta_0}(z_i - g s_1)\, f_{\beta_0}(z_i - g s_2)\, f_{\beta_0}(z_i - hu - g t_1)\, f_{\beta_0}(z_i - hu - g t_2)\,\pi_{\beta_0}(z_i \mid W_i(\beta_0))\,\pi_{\beta_0}(z_i - hu \mid W_j(\beta_0))\,\varphi_{ij}\right. \\
&\qquad\quad\left. \times\, L(s_1)\, L(t_1)\, L(s_2)\, L(t_2)\, K^2(u)\, dz_i\, ds_1\, dt_1\, ds_2\, dt_2\, du\right] \\
&\to E\left[\int f^4_{\beta_0}(z)\,\pi_{\beta_0}(z \mid W_i(\beta_0))\,\pi_{\beta_0}(z \mid W_j(\beta_0))\,\varphi_{ij}\, dz\right]\int K^2(u)\, du,
\end{aligned}$$

where the limit is obtained by standard arguments, using the uniform continuity of $f_{\beta_0}(\cdot)$ and $\pi_{\beta_0}(\cdot \mid w)$.
References

Fan, Y. and Li, Q. (1996). Consistent model specification tests: omitted variables and semiparametric functional forms. Econometrica 64, 865-890.

Guerre, E. and Lavergne, P. (2005). Data-driven rate-optimal specification testing in regression models. Ann. Statist. 33, 840-870.

Hall, P. and Heyde, C. C. (1980). Martingale Limit Theory and Its Application. Probability and Mathematical Statistics. Academic Press [Harcourt Brace Jovanovich Publishers], New York.

Horowitz, J. L. (2009). Semiparametric and Nonparametric Methods in Econometrics. Springer, New York.

van der Vaart, A. W. (1998). Asymptotic Statistics. Vol. 3 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge.

van der Vaart, A. W. and Wellner, J. A. (2011). A local maximal inequality under uniform entropy. Electron. J. Stat. 5, 192-203.
More informationDS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.
DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1
More informationExercise Solutions to Functional Analysis
Exercise Solutions to Functional Analysis Note: References refer to M. Schechter, Principles of Functional Analysis Exersize that. Let φ,..., φ n be an orthonormal set in a Hilbert space H. Show n f n
More informationYour first day at work MATH 806 (Fall 2015)
Your first day at work MATH 806 (Fall 2015) 1. Let X be a set (with no particular algebraic structure). A function d : X X R is called a metric on X (and then X is called a metric space) when d satisfies
More informationMathematics Department Stanford University Math 61CM/DM Inner products
Mathematics Department Stanford University Math 61CM/DM Inner products Recall the definition of an inner product space; see Appendix A.8 of the textbook. Definition 1 An inner product space V is a vector
More informationg(x) = P (y) Proof. This is true for n = 0. Assume by the inductive hypothesis that g (n) (0) = 0 for some n. Compute g (n) (h) g (n) (0)
Mollifiers and Smooth Functions We say a function f from C is C (or simply smooth) if all its derivatives to every order exist at every point of. For f : C, we say f is C if all partial derivatives to
More informationElements of Positive Definite Kernel and Reproducing Kernel Hilbert Space
Elements of Positive Definite Kernel and Reproducing Kernel Hilbert Space Statistical Inference with Reproducing Kernel Hilbert Space Kenji Fukumizu Institute of Statistical Mathematics, ROIS Department
More informationAsymptotic Behavior of the Magnetization Near Critical and Tricritical Points via Ginzburg-Landau Polynomials
Asymptotic Behavior of the Magnetization Near Critical and Tricritical Points via Ginzburg-Landau Polynomials Richard S. Ellis rsellis@math.umass.edu Jonathan Machta 2 machta@physics.umass.edu Peter Tak-Hun
More informationAn introduction to some aspects of functional analysis
An introduction to some aspects of functional analysis Stephen Semmes Rice University Abstract These informal notes deal with some very basic objects in functional analysis, including norms and seminorms
More informationRandom Bernstein-Markov factors
Random Bernstein-Markov factors Igor Pritsker and Koushik Ramachandran October 20, 208 Abstract For a polynomial P n of degree n, Bernstein s inequality states that P n n P n for all L p norms on the unit
More informationMAT 257, Handout 13: December 5-7, 2011.
MAT 257, Handout 13: December 5-7, 2011. The Change of Variables Theorem. In these notes, I try to make more explicit some parts of Spivak s proof of the Change of Variable Theorem, and to supply most
More informationBrownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539
Brownian motion Samy Tindel Purdue University Probability Theory 2 - MA 539 Mostly taken from Brownian Motion and Stochastic Calculus by I. Karatzas and S. Shreve Samy T. Brownian motion Probability Theory
More informationFunctional Analysis HW #3
Functional Analysis HW #3 Sangchul Lee October 26, 2015 1 Solutions Exercise 2.1. Let D = { f C([0, 1]) : f C([0, 1])} and define f d = f + f. Show that D is a Banach algebra and that the Gelfand transform
More informationfor all subintervals I J. If the same is true for the dyadic subintervals I D J only, we will write ϕ BMO d (J). In fact, the following is true
3 ohn Nirenberg inequality, Part I A function ϕ L () belongs to the space BMO() if sup ϕ(s) ϕ I I I < for all subintervals I If the same is true for the dyadic subintervals I D only, we will write ϕ BMO
More informationBootstrap of residual processes in regression: to smooth or not to smooth?
Bootstrap of residual processes in regression: to smooth or not to smooth? arxiv:1712.02685v1 [math.st] 7 Dec 2017 Natalie Neumeyer Ingrid Van Keilegom December 8, 2017 Abstract In this paper we consider
More informationSUPPLEMENTARY MATERIAL FOR PUBLICATION ONLINE 1
SUPPLEMENTARY MATERIAL FOR PUBLICATION ONLINE 1 B Technical details B.1 Variance of ˆf dec in the ersmooth case We derive the variance in the case where ϕ K (t) = ( 1 t 2) κ I( 1 t 1), i.e. we take K as
More informationEconomics 204 Fall 2011 Problem Set 2 Suggested Solutions
Economics 24 Fall 211 Problem Set 2 Suggested Solutions 1. Determine whether the following sets are open, closed, both or neither under the topology induced by the usual metric. (Hint: think about limit
More informationEstimation of a quadratic regression functional using the sinc kernel
Estimation of a quadratic regression functional using the sinc kernel Nicolai Bissantz Hajo Holzmann Institute for Mathematical Stochastics, Georg-August-University Göttingen, Maschmühlenweg 8 10, D-37073
More informationReal Analysis Notes. Thomas Goller
Real Analysis Notes Thomas Goller September 4, 2011 Contents 1 Abstract Measure Spaces 2 1.1 Basic Definitions........................... 2 1.2 Measurable Functions........................ 2 1.3 Integration..............................
More informationNOTES ON FRAMES. Damir Bakić University of Zagreb. June 6, 2017
NOTES ON FRAMES Damir Bakić University of Zagreb June 6, 017 Contents 1 Unconditional convergence, Riesz bases, and Bessel sequences 1 1.1 Unconditional convergence of series in Banach spaces...............
More informationANALYSIS QUALIFYING EXAM FALL 2016: SOLUTIONS. = lim. F n
ANALYSIS QUALIFYING EXAM FALL 206: SOLUTIONS Problem. Let m be Lebesgue measure on R. For a subset E R and r (0, ), define E r = { x R: dist(x, E) < r}. Let E R be compact. Prove that m(e) = lim m(e /n).
More information1 Directional Derivatives and Differentiability
Wednesday, January 18, 2012 1 Directional Derivatives and Differentiability Let E R N, let f : E R and let x 0 E. Given a direction v R N, let L be the line through x 0 in the direction v, that is, L :=
More informationConvergence rates in weighted L 1 spaces of kernel density estimators for linear processes
Alea 4, 117 129 (2008) Convergence rates in weighted L 1 spaces of kernel density estimators for linear processes Anton Schick and Wolfgang Wefelmeyer Anton Schick, Department of Mathematical Sciences,
More informationHILBERT SPACES AND THE RADON-NIKODYM THEOREM. where the bar in the first equation denotes complex conjugation. In either case, for any x V define
HILBERT SPACES AND THE RADON-NIKODYM THEOREM STEVEN P. LALLEY 1. DEFINITIONS Definition 1. A real inner product space is a real vector space V together with a symmetric, bilinear, positive-definite mapping,
More informationESTIMATION IN A SEMI-PARAMETRIC TWO-STAGE RENEWAL REGRESSION MODEL
Statistica Sinica 19(29): Supplement S1-S1 ESTIMATION IN A SEMI-PARAMETRIC TWO-STAGE RENEWAL REGRESSION MODEL Dorota M. Dabrowska University of California, Los Angeles Supplementary Material This supplement
More informationNOTES ON EXISTENCE AND UNIQUENESS THEOREMS FOR ODES
NOTES ON EXISTENCE AND UNIQUENESS THEOREMS FOR ODES JONATHAN LUK These notes discuss theorems on the existence, uniqueness and extension of solutions for ODEs. None of these results are original. The proofs
More informationLARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011
LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS S. G. Bobkov and F. L. Nazarov September 25, 20 Abstract We study large deviations of linear functionals on an isotropic
More informationPOINTWISE BOUNDS ON QUASIMODES OF SEMICLASSICAL SCHRÖDINGER OPERATORS IN DIMENSION TWO
POINTWISE BOUNDS ON QUASIMODES OF SEMICLASSICAL SCHRÖDINGER OPERATORS IN DIMENSION TWO HART F. SMITH AND MACIEJ ZWORSKI Abstract. We prove optimal pointwise bounds on quasimodes of semiclassical Schrödinger
More informationLaplace s Equation. Chapter Mean Value Formulas
Chapter 1 Laplace s Equation Let be an open set in R n. A function u C 2 () is called harmonic in if it satisfies Laplace s equation n (1.1) u := D ii u = 0 in. i=1 A function u C 2 () is called subharmonic
More information08a. Operators on Hilbert spaces. 1. Boundedness, continuity, operator norms
(February 24, 2017) 08a. Operators on Hilbert spaces Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/real/notes 2016-17/08a-ops
More informationTesting convex hypotheses on the mean of a Gaussian vector. Application to testing qualitative hypotheses on a regression function
Testing convex hypotheses on the mean of a Gaussian vector. Application to testing qualitative hypotheses on a regression function By Y. Baraud, S. Huet, B. Laurent Ecole Normale Supérieure, DMA INRA,
More informationLinear Models Review
Linear Models Review Vectors in IR n will be written as ordered n-tuples which are understood to be column vectors, or n 1 matrices. A vector variable will be indicted with bold face, and the prime sign
More informationANALYSIS QUALIFYING EXAM FALL 2017: SOLUTIONS. 1 cos(nx) lim. n 2 x 2. g n (x) = 1 cos(nx) n 2 x 2. x 2.
ANALYSIS QUALIFYING EXAM FALL 27: SOLUTIONS Problem. Determine, with justification, the it cos(nx) n 2 x 2 dx. Solution. For an integer n >, define g n : (, ) R by Also define g : (, ) R by g(x) = g n
More informationLecture 21 Representations of Martingales
Lecture 21: Representations of Martingales 1 of 11 Course: Theory of Probability II Term: Spring 215 Instructor: Gordan Zitkovic Lecture 21 Representations of Martingales Right-continuous inverses Let
More informationClosest Moment Estimation under General Conditions
Closest Moment Estimation under General Conditions Chirok Han and Robert de Jong January 28, 2002 Abstract This paper considers Closest Moment (CM) estimation with a general distance function, and avoids
More informationTHE INVERSE FUNCTION THEOREM
THE INVERSE FUNCTION THEOREM W. PATRICK HOOPER The implicit function theorem is the following result: Theorem 1. Let f be a C 1 function from a neighborhood of a point a R n into R n. Suppose A = Df(a)
More informationON THE EQUIVALENCE OF CONGLOMERABILITY AND DISINTEGRABILITY FOR UNBOUNDED RANDOM VARIABLES
Submitted to the Annals of Probability ON THE EQUIVALENCE OF CONGLOMERABILITY AND DISINTEGRABILITY FOR UNBOUNDED RANDOM VARIABLES By Mark J. Schervish, Teddy Seidenfeld, and Joseph B. Kadane, Carnegie
More informationEstimates for probabilities of independent events and infinite series
Estimates for probabilities of independent events and infinite series Jürgen Grahl and Shahar evo September 9, 06 arxiv:609.0894v [math.pr] 8 Sep 06 Abstract This paper deals with finite or infinite sequences
More informationRefining the Central Limit Theorem Approximation via Extreme Value Theory
Refining the Central Limit Theorem Approximation via Extreme Value Theory Ulrich K. Müller Economics Department Princeton University February 2018 Abstract We suggest approximating the distribution of
More information2 Lebesgue integration
2 Lebesgue integration 1. Let (, A, µ) be a measure space. We will always assume that µ is complete, otherwise we first take its completion. The example to have in mind is the Lebesgue measure on R n,
More information3 Integration and Expectation
3 Integration and Expectation 3.1 Construction of the Lebesgue Integral Let (, F, µ) be a measure space (not necessarily a probability space). Our objective will be to define the Lebesgue integral R fdµ
More informationParcours OJD, Ecole Polytechnique et Université Pierre et Marie Curie 05 Mai 2015
Examen du cours Optimisation Stochastique Version 06/05/2014 Mastère de Mathématiques de la Modélisation F. Bonnans Parcours OJD, Ecole Polytechnique et Université Pierre et Marie Curie 05 Mai 2015 Authorized
More informationDEPARTMENT MATHEMATIK ARBEITSBEREICH MATHEMATISCHE STATISTIK UND STOCHASTISCHE PROZESSE
Estimating the error distribution in nonparametric multiple regression with applications to model testing Natalie Neumeyer & Ingrid Van Keilegom Preprint No. 2008-01 July 2008 DEPARTMENT MATHEMATIK ARBEITSBEREICH
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results
Introduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics
More informationProblem Set. Problem Set #1. Math 5322, Fall March 4, 2002 ANSWERS
Problem Set Problem Set #1 Math 5322, Fall 2001 March 4, 2002 ANSWRS i All of the problems are from Chapter 3 of the text. Problem 1. [Problem 2, page 88] If ν is a signed measure, is ν-null iff ν () 0.
More informationIf g is also continuous and strictly increasing on J, we may apply the strictly increasing inverse function g 1 to this inequality to get
18:2 1/24/2 TOPIC. Inequalities; measures of spread. This lecture explores the implications of Jensen s inequality for g-means in general, and for harmonic, geometric, arithmetic, and related means in
More information(Part 1) High-dimensional statistics May / 41
Theory for the Lasso Recall the linear model Y i = p j=1 β j X (j) i + ɛ i, i = 1,..., n, or, in matrix notation, Y = Xβ + ɛ, To simplify, we assume that the design X is fixed, and that ɛ is N (0, σ 2
More informationAnalysis Finite and Infinite Sets The Real Numbers The Cantor Set
Analysis Finite and Infinite Sets Definition. An initial segment is {n N n n 0 }. Definition. A finite set can be put into one-to-one correspondence with an initial segment. The empty set is also considered
More informationChapter One. The Calderón-Zygmund Theory I: Ellipticity
Chapter One The Calderón-Zygmund Theory I: Ellipticity Our story begins with a classical situation: convolution with homogeneous, Calderón- Zygmund ( kernels on R n. Let S n 1 R n denote the unit sphere
More information2. Function spaces and approximation
2.1 2. Function spaces and approximation 2.1. The space of test functions. Notation and prerequisites are collected in Appendix A. Let Ω be an open subset of R n. The space C0 (Ω), consisting of the C
More informationTools from Lebesgue integration
Tools from Lebesgue integration E.P. van den Ban Fall 2005 Introduction In these notes we describe some of the basic tools from the theory of Lebesgue integration. Definitions and results will be given
More informationCourse 212: Academic Year Section 1: Metric Spaces
Course 212: Academic Year 1991-2 Section 1: Metric Spaces D. R. Wilkins Contents 1 Metric Spaces 3 1.1 Distance Functions and Metric Spaces............. 3 1.2 Convergence and Continuity in Metric Spaces.........
More informationREAL AND COMPLEX ANALYSIS
REAL AND COMPLE ANALYSIS Third Edition Walter Rudin Professor of Mathematics University of Wisconsin, Madison Version 1.1 No rights reserved. Any part of this work can be reproduced or transmitted in any
More information17. Convergence of Random Variables
7. Convergence of Random Variables In elementary mathematics courses (such as Calculus) one speaks of the convergence of functions: f n : R R, then lim f n = f if lim f n (x) = f(x) for all x in R. This
More informationScalar multiplication and addition of sequences 9
8 Sequences 1.2.7. Proposition. Every subsequence of a convergent sequence (a n ) n N converges to lim n a n. Proof. If (a nk ) k N is a subsequence of (a n ) n N, then n k k for every k. Hence if ε >
More informationBreaking the curse of dimensionality in nonparametric testing
Breaking the curse of dimensionality in nonparametric testing Pascal Lavergne Simon Fraser University Valentin Patilea CREST-ENSAI November 2005 Abstract For tests based on nonparametric methods, power
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued
Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research
More information[y i α βx i ] 2 (2) Q = i=1
Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation
More information1. Introduction Boundary estimates for the second derivatives of the solution to the Dirichlet problem for the Monge-Ampere equation
POINTWISE C 2,α ESTIMATES AT THE BOUNDARY FOR THE MONGE-AMPERE EQUATION O. SAVIN Abstract. We prove a localization property of boundary sections for solutions to the Monge-Ampere equation. As a consequence
More informationRANDOM MATRICES: LAW OF THE DETERMINANT
RANDOM MATRICES: LAW OF THE DETERMINANT HOI H. NGUYEN AND VAN H. VU Abstract. Let A n be an n by n random matrix whose entries are independent real random variables with mean zero and variance one. We
More informationDerivatives of Harmonic Bergman and Bloch Functions on the Ball
Journal of Mathematical Analysis and Applications 26, 1 123 (21) doi:1.16/jmaa.2.7438, available online at http://www.idealibrary.com on Derivatives of Harmonic ergman and loch Functions on the all oo
More informationA Martingale Central Limit Theorem
A Martingale Central Limit Theorem Sunder Sethuraman We present a proof of a martingale central limit theorem (Theorem 2) due to McLeish (974). Then, an application to Markov chains is given. Lemma. For
More informationn E(X t T n = lim X s Tn = X s
Stochastic Calculus Example sheet - Lent 15 Michael Tehranchi Problem 1. Let X be a local martingale. Prove that X is a uniformly integrable martingale if and only X is of class D. Solution 1. If If direction:
More informationReal Analysis Math 131AH Rudin, Chapter #1. Dominique Abdi
Real Analysis Math 3AH Rudin, Chapter # Dominique Abdi.. If r is rational (r 0) and x is irrational, prove that r + x and rx are irrational. Solution. Assume the contrary, that r+x and rx are rational.
More informationLinear Algebra 2 Spectral Notes
Linear Algebra 2 Spectral Notes In what follows, V is an inner product vector space over F, where F = R or C. We will use results seen so far; in particular that every linear operator T L(V ) has a complex
More informationδ xj β n = 1 n Theorem 1.1. The sequence {P n } satisfies a large deviation principle on M(X) with the rate function I(β) given by
. Sanov s Theorem Here we consider a sequence of i.i.d. random variables with values in some complete separable metric space X with a common distribution α. Then the sample distribution β n = n maps X
More informationIndex Models for Sparsely Sampled Functional Data. Supplementary Material.
Index Models for Sparsely Sampled Functional Data. Supplementary Material. May 21, 2014 1 Proof of Theorem 2 We will follow the general argument in the proof of Theorem 3 in Li et al. 2010). Write d =
More informationTrace Class Operators and Lidskii s Theorem
Trace Class Operators and Lidskii s Theorem Tom Phelan Semester 2 2009 1 Introduction The purpose of this paper is to provide the reader with a self-contained derivation of the celebrated Lidskii Trace
More informationMath 61CM - Solutions to homework 6
Math 61CM - Solutions to homework 6 Cédric De Groote November 5 th, 2018 Problem 1: (i) Give an example of a metric space X such that not all Cauchy sequences in X are convergent. (ii) Let X be a metric
More informationOctober 25, 2013 INNER PRODUCT SPACES
October 25, 2013 INNER PRODUCT SPACES RODICA D. COSTIN Contents 1. Inner product 2 1.1. Inner product 2 1.2. Inner product spaces 4 2. Orthogonal bases 5 2.1. Existence of an orthogonal basis 7 2.2. Orthogonal
More informationAdditive functionals of infinite-variance moving averages. Wei Biao Wu The University of Chicago TECHNICAL REPORT NO. 535
Additive functionals of infinite-variance moving averages Wei Biao Wu The University of Chicago TECHNICAL REPORT NO. 535 Departments of Statistics The University of Chicago Chicago, Illinois 60637 June
More informationThe Canonical Gaussian Measure on R
The Canonical Gaussian Measure on R 1. Introduction The main goal of this course is to study Gaussian measures. The simplest example of a Gaussian measure is the canonical Gaussian measure P on R where
More informationOn Riesz-Fischer sequences and lower frame bounds
On Riesz-Fischer sequences and lower frame bounds P. Casazza, O. Christensen, S. Li, A. Lindner Abstract We investigate the consequences of the lower frame condition and the lower Riesz basis condition
More informationLecture 9 Metric spaces. The contraction fixed point theorem. The implicit function theorem. The existence of solutions to differenti. equations.
Lecture 9 Metric spaces. The contraction fixed point theorem. The implicit function theorem. The existence of solutions to differential equations. 1 Metric spaces 2 Completeness and completion. 3 The contraction
More informationSPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS
SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS G. RAMESH Contents Introduction 1 1. Bounded Operators 1 1.3. Examples 3 2. Compact Operators 5 2.1. Properties 6 3. The Spectral Theorem 9 3.3. Self-adjoint
More informationNotes 18 : Optional Sampling Theorem
Notes 18 : Optional Sampling Theorem Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Wil91, Chapter 14], [Dur10, Section 5.7]. Recall: DEF 18.1 (Uniform Integrability) A collection
More informationC 1 convex extensions of 1-jets from arbitrary subsets of R n, and applications
C 1 convex extensions of 1-jets from arbitrary subsets of R n, and applications Daniel Azagra and Carlos Mudarra 11th Whitney Extension Problems Workshop Trinity College, Dublin, August 2018 Two related
More informationStanford Mathematics Department Math 205A Lecture Supplement #4 Borel Regular & Radon Measures
2 1 Borel Regular Measures We now state and prove an important regularity property of Borel regular outer measures: Stanford Mathematics Department Math 205A Lecture Supplement #4 Borel Regular & Radon
More informationConvex Geometry. Carsten Schütt
Convex Geometry Carsten Schütt November 25, 2006 2 Contents 0.1 Convex sets... 4 0.2 Separation.... 9 0.3 Extreme points..... 15 0.4 Blaschke selection principle... 18 0.5 Polytopes and polyhedra.... 23
More informationReal Analysis Problems
Real Analysis Problems Cristian E. Gutiérrez September 14, 29 1 1 CONTINUITY 1 Continuity Problem 1.1 Let r n be the sequence of rational numbers and Prove that f(x) = 1. f is continuous on the irrationals.
More informationNonconcave Penalized Likelihood with A Diverging Number of Parameters
Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized
More informationTHE DVORETZKY KIEFER WOLFOWITZ INEQUALITY WITH SHARP CONSTANT: MASSART S 1990 PROOF SEMINAR, SEPT. 28, R. M. Dudley
THE DVORETZKY KIEFER WOLFOWITZ INEQUALITY WITH SHARP CONSTANT: MASSART S 1990 PROOF SEMINAR, SEPT. 28, 2011 R. M. Dudley 1 A. Dvoretzky, J. Kiefer, and J. Wolfowitz 1956 proved the Dvoretzky Kiefer Wolfowitz
More informationTESTING REGRESSION MONOTONICITY IN ECONOMETRIC MODELS
TESTING REGRESSION MONOTONICITY IN ECONOMETRIC MODELS DENIS CHETVERIKOV Abstract. Monotonicity is a key qualitative prediction of a wide array of economic models derived via robust comparative statics.
More informationYour first day at work MATH 806 (Fall 2015)
Your first day at work MATH 806 (Fall 2015) 1. Let X be a set (with no particular algebraic structure). A function d : X X R is called a metric on X (and then X is called a metric space) when d satisfies
More information