Existence and Uniqueness of Penalized Least Square Estimation for Smoothing Spline Nonlinear Nonparametric Regression Models
Chunlei Ke and Yuedong Wang

March 1, 2004

1  The Model

A smoothing spline nonlinear nonparametric regression model (SSNNRM) assumes that

    y_i = N_i(g_1, ..., g_r) + \epsilon_i,   i = 1, ..., n,   (1)

where the N_i are known nonlinear functionals, g = (g_1, ..., g_r) are unknown functions, and the \epsilon_i are iid N(0, \sigma^2) random errors. Without loss of generality, we assume that r = 2. As in O'Sullivan (1990), we express the design points x explicitly in the functional N_i: N_i(g_1, g_2) = \eta(g_1, g_2; x_i), where \eta is a known nonlinear functional. In the following sections, \eta(g_1, g_2; x) is sometimes also written as \eta(g_1, g_2) or \eta when the meaning is clear. g_1 and g_2 are estimated as minimizers of the penalized least squares (PLS) criterion

    l_{n\lambda}(g_1, g_2) = (1/n) \sum_{i=1}^n (y_i - \eta(g_1, g_2; x_i))^2 + \lambda_1 J_1(g_1) + \lambda_2 J_2(g_2),   (2)

where g_j \in H_j, j = 1, 2, the H_j are Hilbert spaces, \lambda_1 and \lambda_2 are smoothing parameters, and J_1 and J_2 are penalty functionals. In this supplement, adapting the frameworks in Cox (1988), Cox and O'Sullivan (1990) and O'Sullivan (1990), we establish the existence and uniqueness of the solution to (2). To save space, some details are omitted in the following sections.

Chunlei Ke (cke@sjm.com) is Statistician, St. Jude Medical, Cardiac Rhythm Management Division, Sylmar, CA. Yuedong Wang (yuedong@pstat.ucsb.edu) is Professor, Department of Statistics and Applied Probability, University of California, Santa Barbara, California. Yuedong Wang's research was supported by NIH Grant R01 GM. Address for correspondence: Yuedong Wang, Department of Statistics and Applied Probability, University of California, Santa Barbara, California.
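To make the criterion (2) concrete, the following is a small numerical sketch (not part of the original supplement) that minimizes a PLS objective after representing g_1 and g_2 in a finite basis. The polynomial basis, the toy functional \eta(g_1, g_2; x) = g_1(x) exp(g_2(x)), and the crude quadratic penalties are all illustrative assumptions, not constructions from this paper.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 100
x = np.sort(rng.uniform(0.0, 1.0, n))
X = np.vander(x, 4, increasing=True)  # cubic polynomial basis [1, x, x^2, x^3]

# true coefficient vectors for g1 and g2 (hypothetical)
c1_true = np.array([1.0, 0.5, 0.0, 0.0])
c2_true = np.array([0.0, -1.0, 0.5, 0.0])
y = (X @ c1_true) * np.exp(X @ c2_true) + rng.normal(0.0, 0.05, n)

lam1, lam2 = 1e-4, 1e-4  # smoothing parameters lambda_1, lambda_2

def pls(theta):
    c1, c2 = theta[:4], theta[4:]
    eta = (X @ c1) * np.exp(X @ c2)  # toy nonlinear functional eta(g1, g2; x)
    # quadratic penalties on the higher-order coefficients stand in for J1, J2
    penalty = lam1 * c1[2:] @ c1[2:] + lam2 * c2[2:] @ c2[2:]
    return np.mean((y - eta) ** 2) + penalty

fit = minimize(pls, np.zeros(8), method="BFGS")
print(fit.fun)  # minimized PLS objective, near the noise level
```

With finite-dimensional representations, (2) is an ordinary penalized nonlinear least squares problem; the existence and uniqueness questions below concern the infinite-dimensional version.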
2  Some Assumptions

We make the following assumptions. See O'Sullivan (1990) for a discussion of these assumptions.

Assumption A.1 (Observational model)
(i) x_i \in \Omega_0 \subset R^d, where \Omega_0 is a bounded, open and simply connected set with C^\infty boundary.
(ii) {x_i, i = 1, ..., n} is a random sample from a probability density function f, which is strictly bounded away from zero and infinity on \Omega_0. Denote by F and F_n the CDF and empirical CDF of the sample x_i's.
(iii) The x_i's and \epsilon_i's are independent.

Denote the true functions in model (1) as g_{10} and g_{20}. It is obvious that g_{10} and g_{20} are minimizers of

    l_0(g_1, g_2) = \int_{\Omega_0} (\eta(g_1, g_2; x) - \eta(g_{10}, g_{20}; x))^2 dF(x).   (3)

As in Cox and O'Sullivan (1990) and O'Sullivan (1990), we define l_\lambda(g_1, g_2) as the limiting regularization functional of l_{n\lambda}:

    l_\lambda(g_1, g_2) = \int_{\Omega_0} (\eta(g_1, g_2; x) - \eta(g_{10}, g_{20}; x))^2 dF(x) + \lambda_1 J_1(g_1) + \lambda_2 J_2(g_2).   (4)

Assumption A.2 (Parameter space)
(i) H_1 and H_2 are Hilbert spaces with norms ||.||_1, ||.||_2 and inner products <.,.>_1 and <.,.>_2. For simplicity of notation, the subscripts will be dropped when there is no confusion. Let H = H_1 x H_2 with the norm ||(g_1, g_2)|| = ||g_1|| + ||g_2||.
(ii) The penalty functionals are of quadratic form: J_1(g_1) = <g_1, W_1 g_1> and J_2(g_2) = <g_2, W_2 g_2>, where W_1 and W_2 are nonnegative definite linear operators on H_1 and H_2 respectively.
(iii) There are bounded linear operators L_1 : H_1 -> L_2(\Omega_0) and L_2 : H_2 -> L_2(\Omega_0) with zero null-space, and strictly positive constants M_1 and M_2, such that for all g_i \in H_i,

    M_1 ||g_i||^2 <= ||L_i g_i||^2_{L_2} + <g_i, W_i g_i> <= M_2 ||g_i||^2,   i = 1, 2.

We then define bounded linear operators U_1 and U_2 by (L_i g_i, L_i h_i)_{L_2} = <g_i, U_i h_i> for all g_i, h_i \in H_i, i = 1, 2.
U_1 and U_2 are compact, and satisfy

    M_1 ||g_i||^2 <= <g_i, U_i g_i> + <g_i, W_i g_i> <= M_2 ||g_i||^2,   g_i \in H_i, i = 1, 2.

We now define normed spaces based on U_i and W_i as in O'Sullivan (1990). For i = 1, 2, there is a sequence {\phi_{vi} : v = 1, 2, ...} of eigenfunctions and a sequence {\gamma_{vi} : v = 1, 2, ...} of eigenvalues that satisfy

    <\phi_{\mu i}, U_i \phi_{vi}> = \delta_{v\mu},   <\phi_{\mu i}, W_i \phi_{vi}> = \gamma_{vi} \delta_{v\mu},

where v and \mu are any positive integers and \delta_{v\mu} is Kronecker's delta. For b >= 0, define

    ||g_i||_{bi} = { \sum_{v=1}^\infty (1 + \gamma_{vi})^b <g_i, U_i \phi_{vi}>^2 }^{1/2},

and let H_{bi} be the normed linear space obtained by completing {g \in H_i : ||g||_{bi} < \infty}. H_{bi} is a Hilbert space with inner product

    <g, h>_{bi} = \sum_{v=1}^\infty (1 + \gamma_{vi})^b <g, U_i \phi_{vi}> <h, U_i \phi_{vi}>.

It is easy to see that H_i = H_{1i}.

Assumption A.3 (Properties of \eta, g_{10} and g_{20}) For some \alpha_1, \alpha_2 \in (0, 1], there are g_{10} \in H_{\alpha_1 1}, g_{20} \in H_{\alpha_2 2} and neighbourhoods N_1 \subset H_{\alpha_1 1} of g_{10} and N_2 \subset H_{\alpha_2 2} of g_{20}, such that
(i) \eta(g_1, g_2; x) is three times continuously Frechet differentiable with respect to g_1 and g_2 in N_1 x N_2, and (g_{10}, g_{20}) is the unique root of D_1 l_0(g_1, g_2) = 0 and D_2 l_0(g_1, g_2) = 0.
(ii) For some s such that m\alpha > s > d/2, there exists M > 0 such that for g_1 \in N_1 and g_2 \in N_2, ||\eta(g_1, g_2; x)||_{W_2^s} < M, where W_2^s is a Sobolev space with (possibly noninteger) order s.

For simplicity of notation, we assume that \alpha_1 = \alpha_2 = \alpha. The construction and proofs are similar for the case \alpha_1 \ne \alpha_2.

3  Linearizations

In this section, we approximate the systematic and stochastic errors in the PLS estimates using Taylor series expansions. We first derive a Taylor series expansion for a bivariate nonlinear operator.

Theorem 1  Let f : D(f) \subset X x Y -> Z, where X, Y and Z are Banach spaces. If f'' exists at (x, y), then the partial Frechet derivatives f_xx, f_xy, f_yx and f_yy exist at (x, y) and for any h, a \in X, k, b \in Y,

    f''(x, y)(h, k)(a, b) = f_xx(x, y)ha + f_xy(x, y)ka + f_yx(x, y)hb + f_yy(x, y)kb.   (5)
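Identity (5) can be checked numerically in the simplest case X = Y = Z = R, where the partial Frechet derivatives reduce to ordinary partial derivatives; the particular function f below is a hypothetical example chosen for the check, not one from the paper.

```python
import numpy as np

# a smooth toy map f: R x R -> R (illustrative assumption)
f = lambda x, y: np.sin(x) * np.exp(y) + x * y**2

def second_directional(f, x, y, u, v, eps=1e-4):
    # f''(x, y)(u)(v) via a central finite difference in the two directions
    g = lambda s, t: f(x + s * u[0] + t * v[0], y + s * u[1] + t * v[1])
    return (g(eps, eps) - g(eps, -eps) - g(-eps, eps) + g(-eps, -eps)) / (4 * eps**2)

x0, y0 = 0.3, -0.2
h, k, a, b = 0.7, -0.4, 0.2, 0.9
lhs = second_directional(f, x0, y0, (h, k), (a, b))

# partial derivatives of this f, computed by hand
fxx = -np.sin(x0) * np.exp(y0)
fxy = np.cos(x0) * np.exp(y0) + 2 * y0
fyy = np.sin(x0) * np.exp(y0) + 2 * x0
# right-hand side of (5); f_xy = f_yx for this smooth f
rhs = fxx * h * a + fxy * (k * a + h * b) + fyy * k * b
print(abs(lhs - rhs))  # agreement up to finite-difference error
```

The two sides agree up to the O(eps^2) truncation error of the finite difference.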
[Proof] By definition, for any (h, k), (a, b) \in X x Y,

    f'(x + h, y + k)(a, b) = f'(x, y)(a, b) + f''(x, y)(h, k)(a, b) + o(||(h, k)||),   as (h, k) -> 0.

For the norm on the product space X x Y, choose ||(h, k)|| = ||h|| + ||k||. Setting b = k = 0, we have

    f'(x + h, y)(a, 0) = f'(x, y)(a, 0) + f''(x, y)(h, 0)(a, 0) + o(||h||).

Applying equation (15) in Chapter 4 of Zeidler (1985), we have

    f_x(x + h, y)a = f_x(x, y)a + f''(x, y)(h, 0)(a, 0) + o(||h||).

Therefore f_xx(x, y)ha = f''(x, y)(h, 0)(a, 0). Similarly, by setting a = h = 0, (h, b) = (0, 0) and (a, k) = (0, 0) respectively, we get

    f_yy(x, y)kb = f''(x, y)(0, k)(0, b),
    f_xy(x, y)ka = f''(x, y)(0, k)(a, 0),
    f_yx(x, y)hb = f''(x, y)(h, 0)(0, b).

Addition gives (5).

Based on the generalized Taylor's theorem (Theorem 4.A(b), Zeidler, 1985), equation (15) in Chapter 4 of Zeidler (1985) and Theorem 1, we have the first order Taylor series expansion

    f(x + h, y + k) = f(x, y) + \int_0^1 [ f_x(x + \tau h, y + \tau k)h + f_y(x + \tau h, y + \tau k)k ] d\tau,   (6)

and the second order Taylor series expansion

    f(x + h, y + k) = f(x, y) + f_x(x, y)h + f_y(x, y)k + R,   (7)

where the remainder is

    R = \int_0^1 (1 - \tau) f''(x + \tau h, y + \tau k)(h, k)(h, k) d\tau
      = \int_0^1 (1 - \tau) [ f_xx(x + \tau h, y + \tau k)hh + f_xy(x + \tau h, y + \tau k)kh
          + f_yx(x + \tau h, y + \tau k)hk + f_yy(x + \tau h, y + \tau k)kk ] d\tau.

We now use Taylor expansions to approximate the systematic and stochastic errors of the estimates. Denote by D_1 and D_2 the partial Frechet derivatives of \eta with respect to g_1 and g_2 respectively, by D_{11} and D_{22} the second partial Frechet derivatives of \eta with respect to g_1 and g_2 respectively, and by D_{12} the mixed second partial Frechet derivative of \eta with respect to g_1 and g_2 (Zeidler, 1985). Higher partial Frechet derivatives are denoted similarly. Let

    Z_i(g_1, g_2) = (1/2) D_i l_\lambda(g_1, g_2),   i = 1, 2.

Note that Z_i also depends on \lambda, which is not expressed explicitly. For g_1 + h_1 \in N_1, g_2 + h_2 \in N_2 and i = 1, 2,

    Z_i(g_1 + h_1, g_2 + h_2) = Z_i(g_1, g_2) + D_1 Z_i(g_1, g_2)h_1 + D_2 Z_i(g_1, g_2)h_2
        + \int_0^1 [ D_{11} Z_i(g_1 + \tau h_1, g_2 + \tau h_2)h_1 h_1 + D_{12} Z_i(g_1 + \tau h_1, g_2 + \tau h_2)h_1 h_2
        + D_{21} Z_i(g_1 + \tau h_1, g_2 + \tau h_2)h_2 h_1 + D_{22} Z_i(g_1 + \tau h_1, g_2 + \tau h_2)h_2 h_2 ](1 - \tau) d\tau.

Define U_1(g_1, g_2), U_2(g_1, g_2) and U_{12}(g_1, g_2) by

    <u_1, U_1(g_1, g_2) v_1>_1 = (D_1\eta(g_1, g_2)u_1, D_1\eta(g_1, g_2)v_1)_{L_2},
    <u_2, U_2(g_1, g_2) v_2>_2 = (D_2\eta(g_1, g_2)u_2, D_2\eta(g_1, g_2)v_2)_{L_2},
    <u_1, U_{12}(g_1, g_2) v_2>_1 = (D_1\eta(g_1, g_2)u_1, D_2\eta(g_1, g_2)v_2)_{L_2},

for any u_i, v_i \in H_i, i = 1, 2, and let U_{21} be the adjoint operator of U_{12}, i.e. U_{21} = U_{12}^*. Let

    G_i(g_1, g_2) = U_i(g_1, g_2) + \lambda_i W_i,   i = 1, 2.

Then we have

    Z_1(g_1 + h_1, g_2 + h_2) = Z_1(g_1, g_2) + G_1(g_1, g_2)h_1 + U_{12}(g_1, g_2)h_2
        + \int_0^1 [ e_{11}(g_1 + \tau h_1, g_2 + \tau h_2)h_1 h_1 + e_{12}(g_1 + \tau h_1, g_2 + \tau h_2)h_1 h_2
        + e_{13}(g_1 + \tau h_1, g_2 + \tau h_2)h_2 h_1 + e_{14}(g_1 + \tau h_1, g_2 + \tau h_2)h_2 h_2 ](1 - \tau) d\tau,

where e_{1i}(g_1, g_2)uvw = \int e_{1i}(g_1, g_2; x)uvw dF(x), i = 1, 2, 3, 4, and

    e_{11}(g_1, g_2; x)uvw = [\eta(g_1, g_2; x) - \eta(g_{10}, g_{20}; x)] D_{111}\eta uvw
        + D_1\eta u D_{11}\eta vw + D_1\eta v D_{11}\eta uw + D_1\eta w D_{11}\eta uv,
    e_{12}(g_1, g_2; x)uvw = [\eta(g_1, g_2; x) - \eta(g_{10}, g_{20}; x)] D_{112}\eta uvw
        + D_1\eta u D_{12}\eta vw + D_2\eta w D_{11}\eta uv + D_1\eta v D_{12}\eta uw,
    e_{13}(g_1, g_2; x)uvw = [\eta(g_1, g_2; x) - \eta(g_{10}, g_{20}; x)] D_{121}\eta uvw
        + D_1\eta u D_{21}\eta vw + D_1\eta w D_{12}\eta uv + D_2\eta v D_{11}\eta uw,
    e_{14}(g_1, g_2; x)uvw = [\eta(g_1, g_2; x) - \eta(g_{10}, g_{20}; x)] D_{122}\eta uvw
        + D_1\eta u D_{22}\eta vw + D_2\eta v D_{12}\eta uw + D_2\eta w D_{12}\eta uv.
Similarly, for Z_2 we have

    Z_2(g_1 + h_1, g_2 + h_2) = Z_2(g_1, g_2) + G_2(g_1, g_2)h_2 + U_{21}(g_1, g_2)h_1
        + \int_0^1 [ e_{21}(g_1 + \tau h_1, g_2 + \tau h_2)h_1 h_1 + e_{22}(g_1 + \tau h_1, g_2 + \tau h_2)h_1 h_2
        + e_{23}(g_1 + \tau h_1, g_2 + \tau h_2)h_2 h_1 + e_{24}(g_1 + \tau h_1, g_2 + \tau h_2)h_2 h_2 ](1 - \tau) d\tau,

where e_{2i}(g_1, g_2)uvw = \int e_{2i}(g_1, g_2; x)uvw dF(x), i = 1, 2, 3, 4, and

    e_{21}(g_1, g_2; x)uvw = [\eta(g_1, g_2; x) - \eta(g_{10}, g_{20}; x)] D_{211}\eta uvw
        + D_2\eta u D_{11}\eta vw + D_1\eta v D_{21}\eta uw + D_1\eta w D_{21}\eta uv,
    e_{22}(g_1, g_2; x)uvw = [\eta(g_1, g_2; x) - \eta(g_{10}, g_{20}; x)] D_{212}\eta uvw
        + D_2\eta u D_{12}\eta vw + D_2\eta w D_{21}\eta uv + D_1\eta v D_{22}\eta uw,
    e_{23}(g_1, g_2; x)uvw = [\eta(g_1, g_2; x) - \eta(g_{10}, g_{20}; x)] D_{221}\eta uvw
        + D_2\eta u D_{21}\eta vw + D_1\eta w D_{22}\eta uv + D_2\eta v D_{21}\eta uw,
    e_{24}(g_1, g_2; x)uvw = [\eta(g_1, g_2; x) - \eta(g_{10}, g_{20}; x)] D_{222}\eta uvw
        + D_2\eta u D_{22}\eta vw + D_2\eta v D_{22}\eta uw + D_2\eta w D_{22}\eta uv.

Let

    e_1 = \int_0^1 [ e_{11}(g_1 + \tau h_1, g_2 + \tau h_2)h_1 h_1 + e_{12}(g_1 + \tau h_1, g_2 + \tau h_2)h_1 h_2
        + e_{13}(g_1 + \tau h_1, g_2 + \tau h_2)h_2 h_1 + e_{14}(g_1 + \tau h_1, g_2 + \tau h_2)h_2 h_2 ](1 - \tau) d\tau,

and

    e_2 = \int_0^1 [ e_{21}(g_1 + \tau h_1, g_2 + \tau h_2)h_1 h_1 + e_{22}(g_1 + \tau h_1, g_2 + \tau h_2)h_1 h_2
        + e_{23}(g_1 + \tau h_1, g_2 + \tau h_2)h_2 h_1 + e_{24}(g_1 + \tau h_1, g_2 + \tau h_2)h_2 h_2 ](1 - \tau) d\tau.

Then we have the following system of equations:

    Z_1(g_1 + h_1, g_2 + h_2) = Z_1(g_1, g_2) + G_1(g_1, g_2)h_1 + U_{12}(g_1, g_2)h_2 + e_1,
    Z_2(g_1 + h_1, g_2 + h_2) = Z_2(g_1, g_2) + G_2(g_1, g_2)h_2 + U_{21}(g_1, g_2)h_1 + e_2.   (8)

In the following, the dependence of Z_i, U_{ij} and G_i on g_1 and g_2 is sometimes dropped when there is no confusion. Let

    G_{11} = G_1 - U_{12} G_2^{-1} U_{21},   G_{22} = G_2 - U_{21} G_1^{-1} U_{12}.

Define the systematic errors as g_{\lambda 1} - g_{10} and g_{\lambda 2} - g_{20}, where (g_{\lambda 1}, g_{\lambda 2}) is the locally unique root of Z_i(g_1, g_2) = 0, i = 1, 2, in N_1 x N_2.
Ignoring e_1 and e_2 in the system of equations (8) and assuming the existence of G_{11}^{-1}(g_{10}, g_{20}) and G_{22}^{-1}(g_{10}, g_{20}), we have linear approximations to the systematic errors:

    \bar g_{\lambda 1} - g_{10} = -G_{11}^{-1}(g_{10}, g_{20}) ( Z_1(g_{10}, g_{20}) - U_{12}(g_{10}, g_{20}) G_2^{-1}(g_{10}, g_{20}) Z_2(g_{10}, g_{20}) ),
    \bar g_{\lambda 2} - g_{20} = -G_{22}^{-1}(g_{10}, g_{20}) ( Z_2(g_{10}, g_{20}) - U_{21}(g_{10}, g_{20}) G_1^{-1}(g_{10}, g_{20}) Z_1(g_{10}, g_{20}) ).   (9)
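The structure of (9) can be illustrated in a finite-dimensional analogue (a sketch, not the paper's infinite-dimensional setting): for a single parameter block, the root of the linearized score is theta0 - G^{-1} Z(theta0). The model, the ridge penalty, and the Gauss-Newton operator G below are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

x = np.linspace(0.0, 1.0, 200)
theta0 = np.array([0.5, 1.0])  # plays the role of the true functions
lam = 1e-3                     # a single smoothing parameter, for simplicity

def eta(th):
    # hypothetical nonlinear model
    return np.exp(th[0] * x) + th[1] * np.cos(x)

def l_lam(th):
    # finite-dimensional analogue of the limiting criterion l_lambda
    return np.mean((eta(th) - eta(theta0)) ** 2) + lam * th @ th

theta_lam = minimize(l_lam, theta0, method="BFGS").x  # exact penalized root

# One-block analogue of (9): Z(theta0) = lam * theta0, since the data term
# of the gradient vanishes at the truth; G = J'J/n + lam*I with J the Jacobian.
J = np.column_stack([x * np.exp(theta0[0] * x), np.cos(x)])
G = J.T @ J / len(x) + lam * np.eye(2)
theta_bar = theta0 - np.linalg.solve(G, lam * theta0)

print(theta_lam - theta0)  # systematic error
print(theta_bar - theta0)  # its linear approximation
```

The exact systematic error and its linearization agree to higher order in lam, mirroring the role the remainders e_1, e_2 play above.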
The stochastic errors are defined as g_{n\lambda 1} - g_{\lambda 1} and g_{n\lambda 2} - g_{\lambda 2}. Let Z_{ni}(g_1, g_2) = (1/2) D_i l_{n\lambda}(g_1, g_2), i = 1, 2. Approximations of the stochastic errors will be obtained from expansions of Z_{n1} and Z_{n2}. For g_{\lambda 1} + h_1 \in N_1 and g_{\lambda 2} + h_2 \in N_2, we have

    Z_{n1}(g_{\lambda 1} + h_1, g_{\lambda 2} + h_2) = Z_{n1}(g_{\lambda 1}, g_{\lambda 2}) + D_1 Z_{n1}(g_{\lambda 1}, g_{\lambda 2})h_1 + D_2 Z_{n1}(g_{\lambda 1}, g_{\lambda 2})h_2 + e_{n1},
    Z_{n2}(g_{\lambda 1} + h_1, g_{\lambda 2} + h_2) = Z_{n2}(g_{\lambda 1}, g_{\lambda 2}) + D_1 Z_{n2}(g_{\lambda 1}, g_{\lambda 2})h_1 + D_2 Z_{n2}(g_{\lambda 1}, g_{\lambda 2})h_2 + e_{n2},

where e_{n1} and e_{n2} are defined similarly to e_1 and e_2. Again, ignoring the remainder terms and assuming the existence of G_{11}^{-1}(g_{\lambda 1}, g_{\lambda 2}) and G_{22}^{-1}(g_{\lambda 1}, g_{\lambda 2}), we approximate g_{n\lambda 1} and g_{n\lambda 2} by

    \bar g_{n\lambda 1} - g_{\lambda 1} = -G_{11}^{-1}(g_{\lambda 1}, g_{\lambda 2}) ( Z_{n1}(g_{\lambda 1}, g_{\lambda 2}) - U_{12}(g_{\lambda 1}, g_{\lambda 2}) G_2^{-1}(g_{\lambda 1}, g_{\lambda 2}) Z_{n2}(g_{\lambda 1}, g_{\lambda 2}) ),
    \bar g_{n\lambda 2} - g_{\lambda 2} = -G_{22}^{-1}(g_{\lambda 1}, g_{\lambda 2}) ( Z_{n2}(g_{\lambda 1}, g_{\lambda 2}) - U_{21}(g_{\lambda 1}, g_{\lambda 2}) G_1^{-1}(g_{\lambda 1}, g_{\lambda 2}) Z_{n1}(g_{\lambda 1}, g_{\lambda 2}) ).

For 0 <= b <= \alpha, \lambda_1 > 0, \lambda_2 > 0, g_1, h_1 \in N_1, g_2, h_2 \in N_2, unit elements u_1, u_2 \in H_{\alpha 1} (||u_i||_\alpha = 1) and unit elements v_1, v_2 \in H_{\alpha 2}, let

    K_{11} = sup_{u_1, u_2} || G_{11}(h_1, h_2)^{-1} [ e_{11}(g_1, g_2)u_1 u_2 - U_{12}(h_1, h_2) G_2^{-1}(h_1, h_2) e_{21}(g_1, g_2)u_1 u_2 ] ||_b,
    K_{12} = sup_{u_1, v_1} || G_{11}(h_1, h_2)^{-1} [ e_{12}(g_1, g_2)u_1 v_1 - U_{12}(h_1, h_2) G_2^{-1}(h_1, h_2) e_{22}(g_1, g_2)u_1 v_1 ] ||_b,
    K_{13} = sup_{u_1, v_1} || G_{11}(h_1, h_2)^{-1} [ e_{13}(g_1, g_2)v_1 u_1 - U_{12}(h_1, h_2) G_2^{-1}(h_1, h_2) e_{23}(g_1, g_2)v_1 u_1 ] ||_b,
    K_{14} = sup_{v_1, v_2} || G_{11}(h_1, h_2)^{-1} [ e_{14}(g_1, g_2)v_1 v_2 - U_{12}(h_1, h_2) G_2^{-1}(h_1, h_2) e_{24}(g_1, g_2)v_1 v_2 ] ||_b,
    K_{21} = sup_{u_1, u_2} || G_{22}(h_1, h_2)^{-1} [ e_{21}(g_1, g_2)u_1 u_2 - U_{21}(h_1, h_2) G_1^{-1}(h_1, h_2) e_{11}(g_1, g_2)u_1 u_2 ] ||_b,
    K_{22} = sup_{u_1, v_1} || G_{22}(h_1, h_2)^{-1} [ e_{22}(g_1, g_2)u_1 v_1 - U_{21}(h_1, h_2) G_1^{-1}(h_1, h_2) e_{12}(g_1, g_2)u_1 v_1 ] ||_b,
    K_{23} = sup_{u_1, v_1} || G_{22}(h_1, h_2)^{-1} [ e_{23}(g_1, g_2)v_1 u_1 - U_{21}(h_1, h_2) G_1^{-1}(h_1, h_2) e_{13}(g_1, g_2)v_1 u_1 ] ||_b,
    K_{24} = sup_{v_1, v_2} || G_{22}(h_1, h_2)^{-1} [ e_{24}(g_1, g_2)v_1 v_2 - U_{21}(h_1, h_2) G_1^{-1}(h_1, h_2) e_{14}(g_1, g_2)v_1 v_2 ] ||_b.

Similarly we can define K_{ij,n}, i = 1, 2, j = 1, 2, 3, 4. Denote

    A_{ij}(g_1, g_2) = D_j Z_{ni}(g_1, g_2) - D_j Z_i(g_1, g_2),   i, j = 1, 2.

Let

    E^1_{21,n} = sup_{g_1, g_2} || G_{11}^{-1}(g_1, g_2) U_{12}(g_1, g_2) G_2^{-1}(g_1, g_2) A_{21}(g_1, g_2) u ||_b,
    E^2_{21,n} = sup_{g_1, g_2} || G_{11}^{-1}(g_1, g_2) A_{11}(g_1, g_2) u ||_b,
    E^1_{31,n} = sup_{g_1, g_2} || G_{11}^{-1}(g_1, g_2) U_{12}(g_1, g_2) G_2^{-1}(g_1, g_2) A_{22}(g_1, g_2) u ||_b,
    E^2_{31,n} = sup_{g_1, g_2} || G_{11}^{-1}(g_1, g_2) A_{12}(g_1, g_2) u ||_b,
    E^1_{22,n} = sup_{g_1, g_2} || G_{22}^{-1}(g_1, g_2) U_{21}(g_1, g_2) G_1^{-1}(g_1, g_2) A_{11}(g_1, g_2) u ||_b,
    E^2_{22,n} = sup_{g_1, g_2} || G_{22}^{-1}(g_1, g_2) A_{21}(g_1, g_2) u ||_b,
    E^1_{32,n} = sup_{g_1, g_2} || G_{22}^{-1}(g_1, g_2) U_{21}(g_1, g_2) G_1^{-1}(g_1, g_2) A_{12}(g_1, g_2) u ||_b,
    E^2_{32,n} = sup_{g_1, g_2} || G_{22}^{-1}(g_1, g_2) A_{22}(g_1, g_2) u ||_b,   (10)

where the suprema are over g_1 \in N_1, g_2 \in N_2 and u ranges over unit elements of the appropriate space.
Let K_1 = K_{11} + K_{12} + K_{13} + K_{14}, K_2 = K_{21} + K_{22} + K_{23} + K_{24}, E_{21,n} = E^1_{21,n} + E^2_{21,n}, E_{31,n} = E^1_{31,n} + E^2_{31,n}, E_{22,n} = E^1_{22,n} + E^2_{22,n} and E_{32,n} = E^1_{32,n} + E^2_{32,n}. Standard analysis based on Taylor series expansions leads to the following estimates of the error terms:

    || \bar g_{\lambda 1} - g_{10} ||_b <= K_1 ||h_1|| ||h_2||,
    || \bar g_{\lambda 2} - g_{20} ||_b <= K_2 ||h_1|| ||h_2||,
    || \bar g_{n\lambda 1} - g_{\lambda 1} ||_b <= E_{21,n} ||h_1|| + E_{31,n} ||h_2|| + K_{11,n} ||h_1|| ||h_2||,
    || \bar g_{n\lambda 2} - g_{\lambda 2} ||_b <= E_{22,n} ||h_1|| + E_{32,n} ||h_2|| + K_{12,n} ||h_1|| ||h_2||.

4  Existence and Uniqueness

In this section, we show that l_\lambda and l_{n\lambda} have unique minimizers. Let

    S_{1,(g_1, g_2)}(r, b) = { h \in H_{b1} : ||h - g_1||_b <= r },
    S_{2,(g_1, g_2)}(r, b) = { h \in H_{b2} : ||h - g_2||_b <= r },
    S_1(r, b) = S_{1,(0,0)}(r, b),   S_2(r, b) = S_{2,(0,0)}(r, b),
    d_1(\lambda, b) = || \bar g_{\lambda 1} - g_{10} ||_b,   d_2(\lambda, b) = || \bar g_{\lambda 2} - g_{20} ||_b,
    r_1(\lambda, b) = (K_{11} + K_{21}) d_1(\lambda, \alpha) + (K_{12} + K_{22}) d_2(\lambda, \alpha),
    r_2(\lambda, b) = (K_{13} + K_{23}) d_1(\lambda, \alpha) + (K_{14} + K_{24}) d_2(\lambda, \alpha).

The following theorem guarantees the existence and uniqueness of g_{\lambda 1} and g_{\lambda 2}.

Theorem 2  If d_i(\lambda, \alpha) -> 0 and r_i(\lambda, \alpha) -> 0 as \lambda_1 -> 0 and \lambda_2 -> 0, then for \lambda_1 and \lambda_2 sufficiently small:
(a) There are unique g_{\lambda 1} \in S_{1,(g_{10}, g_{20})}(2 d_1(\lambda, \alpha), \alpha) and g_{\lambda 2} \in S_{2,(g_{10}, g_{20})}(2 d_2(\lambda, \alpha), \alpha) satisfying Z_1(g_{\lambda 1}, g_{\lambda 2}) = 0 and Z_2(g_{\lambda 1}, g_{\lambda 2}) = 0.
(b) || \bar g_{\lambda 1} - g_{\lambda 1} ||_b + || \bar g_{\lambda 2} - g_{\lambda 2} ||_b <= 4 r_1(\lambda, b) d_1(\lambda, \alpha) + 4 r_2(\lambda, b) d_2(\lambda, \alpha) for 0 <= b <= \alpha.

[Proof] For simplicity of notation, write S_i = S_i(r, b), d_i = d_i(\lambda, \alpha) and r_i = r_i(\lambda, b), i = 1, 2. Let

    F_1(h, k) = h - G_{11}^{-1}(g_{10}, g_{20}) ( Z_1(g_{10} + h, g_{20} + k) - U_{12}(g_{10}, g_{20}) G_2^{-1}(g_{10}, g_{20}) Z_2(g_{10} + h, g_{20} + k) ),
    F_2(h, k) = k - G_{22}^{-1}(g_{10}, g_{20}) ( Z_2(g_{10} + h, g_{20} + k) - U_{21}(g_{10}, g_{20}) G_1^{-1}(g_{10}, g_{20}) Z_1(g_{10} + h, g_{20} + k) ).
F(h, k) = (F_1(h, k), F_2(h, k)) defines an operator on the product Hilbert space H. It is not difficult to show that F(S_1 x S_2) \subset S_1 x S_2. We then need to show that F is a contraction on S_1 x S_2. For h_1, h_2 \in S_1 and k_1, k_2 \in S_2,

    F_1(h_1, k_1) - F_1(h_2, k_2) = h_1 - h_2 - G_{11}^{-1}(g_{10}, g_{20}) [ Z_1(g_{10} + h_1, g_{20} + k_1) - Z_1(g_{10} + h_2, g_{20} + k_2)
        - U_{12}(g_{10}, g_{20}) G_2^{-1}(g_{10}, g_{20}) ( Z_2(g_{10} + h_1, g_{20} + k_1) - Z_2(g_{10} + h_2, g_{20} + k_2) ) ].

Applying the Taylor series expansion (6), we have

    Z_1(g_{10} + h_2, g_{20} + k_2) = Z_1(g_{10} + h_1, g_{20} + k_1)
        + \int_0^1 [ D_1 Z_1(g_{10} + h_1 + \tau(h_2 - h_1), g_{20} + k_1 + \tau(k_2 - k_1))(h_2 - h_1)
        + D_2 Z_1(g_{10} + h_1 + \tau(h_2 - h_1), g_{20} + k_1 + \tau(k_2 - k_1))(k_2 - k_1) ] d\tau.

Applying the Taylor expansion again to the terms inside the integral, we have

    Z_1(g_{10} + h_2, g_{20} + k_2) - Z_1(g_{10} + h_1, g_{20} + k_1)
        = D_1 Z_1(g_{10}, g_{20})(h_2 - h_1) + D_2 Z_1(g_{10}, g_{20})(k_2 - k_1)
        + \int_0^1 \int_0^1 [ D_{11} Z_1(g_{10} + \tau' h^*, g_{20} + \tau' k^*) h^* + D_{12} Z_1(g_{10} + \tau' h^*, g_{20} + \tau' k^*) k^* ](h_2 - h_1) d\tau' d\tau
        + \int_0^1 \int_0^1 [ D_{21} Z_1(g_{10} + \tau' h^*, g_{20} + \tau' k^*) h^* + D_{22} Z_1(g_{10} + \tau' h^*, g_{20} + \tau' k^*) k^* ](k_2 - k_1) d\tau' d\tau,

where h^* = h_1 + \tau(h_2 - h_1) and k^* = k_1 + \tau(k_2 - k_1). A similar expansion applies to Z_2(g_{10} + h_2, g_{20} + k_2) - Z_2(g_{10} + h_1, g_{20} + k_1). Plugging in, after some algebra we have

    || F_1(h_1, k_1) - F_1(h_2, k_2) ||_b <= 2(K_{11} d_1 + K_{12} d_2) ||h_2 - h_1||_b + 2(K_{13} d_1 + K_{14} d_2) ||k_2 - k_1||_b,
    || F_2(h_1, k_1) - F_2(h_2, k_2) ||_b <= 2(K_{21} d_1 + K_{22} d_2) ||h_2 - h_1||_b + 2(K_{23} d_1 + K_{24} d_2) ||k_2 - k_1||_b.

Therefore,

    || F(h_1, k_1) - F(h_2, k_2) ||_b <= 2((K_{11} + K_{21}) d_1 + (K_{12} + K_{22}) d_2) ||h_2 - h_1||_b
        + 2((K_{13} + K_{23}) d_1 + (K_{14} + K_{24}) d_2) ||k_2 - k_1||_b
        <= 2 r_1 ||h_2 - h_1||_b + 2 r_2 ||k_2 - k_1||_b.

Choose \bar\lambda_1 and \bar\lambda_2 such that for \lambda_1 \in (0, \bar\lambda_1] and \lambda_2 \in (0, \bar\lambda_2] we have r_1, r_2 <= C for some constant C < 1/2. For this choice of \lambda_1 and \lambda_2, F is a contraction. The contraction mapping theorem yields a unique fixed point (h_{\lambda 1}, k_{\lambda 2}) with F(h_{\lambda 1}, k_{\lambda 2}) = (h_{\lambda 1}, k_{\lambda 2}). Let g_{\lambda 1} = g_{10} + h_{\lambda 1} and g_{\lambda 2} = g_{20} + k_{\lambda 2}; then g_{\lambda 1} \in S_{1,(g_{10}, g_{20})}(2 d_1, \alpha) and g_{\lambda 2} \in S_{2,(g_{10}, g_{20})}(2 d_2, \alpha) are the unique solutions to Z_1(g_{\lambda 1}, g_{\lambda 2}) = 0 and Z_2(g_{\lambda 1}, g_{\lambda 2}) = 0. To obtain the bound, note that

    || \bar g_{\lambda 1} - g_{\lambda 1} ||_b + || \bar g_{\lambda 2} - g_{\lambda 2} ||_b = || F(h_{\lambda 1}, k_{\lambda 2}) - F(0, 0) ||_b <= 2 r_1 ||h_{\lambda 1}||_b + 2 r_2 ||k_{\lambda 2}||_b <= 4(r_1 d_1 + r_2 d_2).

This completes the proof.

We next consider the solution to l_{n\lambda}. For g_{n\lambda 1} \in N_1 and g_{n\lambda 2} \in N_2, define

    d_{n1}(\lambda, b) = || \bar g_{n\lambda 1} - g_{\lambda 1} ||_b,   d_{n2}(\lambda, b) = || \bar g_{n\lambda 2} - g_{\lambda 2} ||_b,
    r_{n1}(\lambda, b) = (K_{11,n} + K_{21,n}) d_{n1}(\lambda, \alpha) + (K_{12,n} + K_{22,n}) d_{n2}(\lambda, \alpha) + E_{21,n} + E_{22,n},
    r_{n2}(\lambda, b) = (K_{13,n} + K_{23,n}) d_{n1}(\lambda, \alpha) + (K_{14,n} + K_{24,n}) d_{n2}(\lambda, \alpha) + E_{31,n} + E_{32,n}.

Theorem 3  If \lambda_{n1} and \lambda_{n2} tend to zero such that g_{\lambda_{n1}} \in N_1 and g_{\lambda_{n2}} \in N_2 for all n, and d_{n1}(\lambda, b) ->_p 0, d_{n2}(\lambda, b) ->_p 0, r_{n1}(\lambda, b) ->_p 0 and r_{n2}(\lambda, b) ->_p 0, then with probability tending to one as n -> \infty:
(a) There is a unique root (g_{n\lambda_n 1}, g_{n\lambda_n 2}) satisfying Z_{n1}(g_{n\lambda_n 1}, g_{n\lambda_n 2}) = 0 and Z_{n2}(g_{n\lambda_n 1}, g_{n\lambda_n 2}) = 0.
(b) For b \in [0, \alpha],

    || \bar g_{n\lambda 1} - g_{n\lambda 1} ||_b + || \bar g_{n\lambda 2} - g_{n\lambda 2} ||_b <= 4 r_{n1}(\lambda, b) d_{n1}(\lambda, \alpha) + 4 r_{n2}(\lambda, b) d_{n2}(\lambda, \alpha).

[Proof] For simplicity of notation, write d_{ni} = d_{ni}(\lambda, \alpha) and r_{ni} = r_{ni}(\lambda, b), i = 1, 2. Let

    F_{n1}(h, k) = h - G_{11}^{-1}(g_{\lambda 1}, g_{\lambda 2}) ( Z_{n1}(g_{\lambda 1} + h, g_{\lambda 2} + k) - U_{12}(g_{\lambda 1}, g_{\lambda 2}) G_2^{-1}(g_{\lambda 1}, g_{\lambda 2}) Z_{n2}(g_{\lambda 1} + h, g_{\lambda 2} + k) ),
    F_{n2}(h, k) = k - G_{22}^{-1}(g_{\lambda 1}, g_{\lambda 2}) ( Z_{n2}(g_{\lambda 1} + h, g_{\lambda 2} + k) - U_{21}(g_{\lambda 1}, g_{\lambda 2}) G_1^{-1}(g_{\lambda 1}, g_{\lambda 2}) Z_{n1}(g_{\lambda 1} + h, g_{\lambda 2} + k) ).

The proof proceeds similarly to that of Theorem 2, with additional terms to approximate Z_{n1} and Z_{n2} by Z_1 and Z_2. Take n large enough so that, with probability arbitrarily close to one, S_{1,(g_{\lambda 1}, g_{\lambda 2})}(2 d_{n1}, \alpha) \subset N_1, S_{2,(g_{\lambda 1}, g_{\lambda 2})}(2 d_{n2}, \alpha) \subset N_2, r_{n1} < 1/2 and r_{n2} < 1/2. For the rest of the proof, we restrict to this event. It is not difficult to show that F_n(S_1(2 d_{n1}, \alpha) x S_2(2 d_{n2}, \alpha)) \subset S_1(2 d_{n1}, \alpha) x S_2(2 d_{n2}, \alpha). We then need to show that F_n is a contraction on S_1(2 d_{n1}, \alpha) x S_2(2 d_{n2}, \alpha). Expanding Z_{n1} and Z_{n2} as in Theorem 2 and Section 3, after some algebra we obtain, for h_1, h_2 \in S_1(2 d_{n1}, \alpha) and k_1, k_2 \in S_2(2 d_{n2}, \alpha),

    || F_{n1}(h_1, k_1) - F_{n1}(h_2, k_2) ||_b <= 2(K_{11,n} d_{n1} + K_{12,n} d_{n2} + E_{21,n}) ||h_2 - h_1||_b + 2(K_{13,n} d_{n1} + K_{14,n} d_{n2} + E_{31,n}) ||k_2 - k_1||_b,
    || F_{n2}(h_1, k_1) - F_{n2}(h_2, k_2) ||_b <= 2(K_{21,n} d_{n1} + K_{22,n} d_{n2} + E_{22,n}) ||h_2 - h_1||_b + 2(K_{23,n} d_{n1} + K_{24,n} d_{n2} + E_{32,n}) ||k_2 - k_1||_b.

Therefore

    || F_n(h_1, k_1) - F_n(h_2, k_2) ||_b <= 2 r_{n1} ||h_1 - h_2||_b + 2 r_{n2} ||k_1 - k_2||_b,

which shows that F_n is a contraction. The contraction mapping theorem yields a unique fixed point (h_{n\lambda 1}, k_{n\lambda 2}); set g_{n\lambda 1} = g_{\lambda 1} + h_{n\lambda 1} and g_{n\lambda 2} = g_{\lambda 2} + k_{n\lambda 2}. To obtain the upper bound, notice that

    || \bar g_{n\lambda 1} - g_{n\lambda 1} ||_b + || \bar g_{n\lambda 2} - g_{n\lambda 2} ||_b = || F_n(h_{n\lambda 1}, k_{n\lambda 2}) - F_n(0, 0) ||_b <= 4 r_{n1} d_{n1} + 4 r_{n2} d_{n2}.

This completes the proof of Theorem 3.

References

[1] Cox, D. D. (1988). Approximation of method of regularization estimators. Ann. Statist. 16.
[2] Cox, D. D. and O'Sullivan, F. (1990). Asymptotic analysis of penalized likelihood and related estimators. Ann. Statist. 18.
[3] O'Sullivan, F. (1990). Convergence characteristics of methods of regularization estimators for nonlinear operator equations. SIAM Journal on Numerical Analysis 27.
[4] Zeidler, E. (1985). Nonlinear Functional Analysis and its Applications. Springer: New York.
Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley
More informationLinear-quadratic control problem with a linear term on semiinfinite interval: theory and applications
Linear-quadratic control problem with a linear term on semiinfinite interval: theory and applications L. Faybusovich T. Mouktonglang Department of Mathematics, University of Notre Dame, Notre Dame, IN
More informationMath 225B: Differential Geometry, Final
Math 225B: Differential Geometry, Final Ian Coley March 5, 204 Problem Spring 20,. Show that if X is a smooth vector field on a (smooth) manifold of dimension n and if X p is nonzero for some point of
More informationDiffeomorphic Warping. Ben Recht August 17, 2006 Joint work with Ali Rahimi (Intel)
Diffeomorphic Warping Ben Recht August 17, 2006 Joint work with Ali Rahimi (Intel) What Manifold Learning Isn t Common features of Manifold Learning Algorithms: 1-1 charting Dense sampling Geometric Assumptions
More informationThe Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment
he Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment William Glunt 1, homas L. Hayden 2 and Robert Reams 2 1 Department of Mathematics and Computer Science, Austin Peay State
More informationBayesian Aggregation for Extraordinarily Large Dataset
Bayesian Aggregation for Extraordinarily Large Dataset Guang Cheng 1 Department of Statistics Purdue University www.science.purdue.edu/bigdata Department Seminar Statistics@LSE May 19, 2017 1 A Joint Work
More information3.4. ZEROS OF POLYNOMIAL FUNCTIONS
3.4. ZEROS OF POLYNOMIAL FUNCTIONS What You Should Learn Use the Fundamental Theorem of Algebra to determine the number of zeros of polynomial functions. Find rational zeros of polynomial functions. Find
More informationTHE DVORETZKY KIEFER WOLFOWITZ INEQUALITY WITH SHARP CONSTANT: MASSART S 1990 PROOF SEMINAR, SEPT. 28, R. M. Dudley
THE DVORETZKY KIEFER WOLFOWITZ INEQUALITY WITH SHARP CONSTANT: MASSART S 1990 PROOF SEMINAR, SEPT. 28, 2011 R. M. Dudley 1 A. Dvoretzky, J. Kiefer, and J. Wolfowitz 1956 proved the Dvoretzky Kiefer Wolfowitz
More informationSmooth simultaneous confidence bands for cumulative distribution functions
Journal of Nonparametric Statistics, 2013 Vol. 25, No. 2, 395 407, http://dx.doi.org/10.1080/10485252.2012.759219 Smooth simultaneous confidence bands for cumulative distribution functions Jiangyan Wang
More informationSample Solutions of Assignment 10 for MAT3270B
Sample Solutions of Assignment 1 for MAT327B 1. For the following ODEs, (a) determine all critical points; (b) find the corresponding linear system near each critical point; (c) find the eigenvalues of
More informationEffective Dimension and Generalization of Kernel Learning
Effective Dimension and Generalization of Kernel Learning Tong Zhang IBM T.J. Watson Research Center Yorktown Heights, Y 10598 tzhang@watson.ibm.com Abstract We investigate the generalization performance
More information( ) y 2! 4. ( )( y! 2)
1. Dividing: 4x3! 8x 2 + 6x 2x 5.7 Division of Polynomials = 4x3 2x! 8x2 2x + 6x 2x = 2x2! 4 3. Dividing: 1x4 + 15x 3! 2x 2!5x 2 = 1x4!5x 2 + 15x3!5x 2! 2x2!5x 2 =!2x2! 3x + 4 5. Dividing: 8y5 + 1y 3!
More informationON THE BOUNDEDNESS BEHAVIOR OF THE SPECTRAL FACTORIZATION IN THE WIENER ALGEBRA FOR FIR DATA
ON THE BOUNDEDNESS BEHAVIOR OF THE SPECTRAL FACTORIZATION IN THE WIENER ALGEBRA FOR FIR DATA Holger Boche and Volker Pohl Technische Universität Berlin, Heinrich Hertz Chair for Mobile Communications Werner-von-Siemens
More informationTwo special equations: Bessel s and Legendre s equations. p Fourier-Bessel and Fourier-Legendre series. p
LECTURE 1 Table of Contents Two special equations: Bessel s and Legendre s equations. p. 259-268. Fourier-Bessel and Fourier-Legendre series. p. 453-460. Boundary value problems in other coordinate system.
More informationMATH 205C: STATIONARY PHASE LEMMA
MATH 205C: STATIONARY PHASE LEMMA For ω, consider an integral of the form I(ω) = e iωf(x) u(x) dx, where u Cc (R n ) complex valued, with support in a compact set K, and f C (R n ) real valued. Thus, I(ω)
More informationSolving a class of nonlinear two-dimensional Volterra integral equations by using two-dimensional triangular orthogonal functions.
Journal of Mathematical Modeling Vol 1, No 1, 213, pp 28-4 JMM Solving a class of nonlinear two-dimensional Volterra integral equations by using two-dimensional triangular orthogonal functions Farshid
More informationStatistics 3657 : Moment Approximations
Statistics 3657 : Moment Approximations Preliminaries Suppose that we have a r.v. and that we wish to calculate the expectation of g) for some function g. Of course we could calculate it as Eg)) by the
More informationSpatial Process Estimates as Smoothers: A Review
Spatial Process Estimates as Smoothers: A Review Soutir Bandyopadhyay 1 Basic Model The observational model considered here has the form Y i = f(x i ) + ɛ i, for 1 i n. (1.1) where Y i is the observed
More informationTEST CODE: MMA (Objective type) 2015 SYLLABUS
TEST CODE: MMA (Objective type) 2015 SYLLABUS Analytical Reasoning Algebra Arithmetic, geometric and harmonic progression. Continued fractions. Elementary combinatorics: Permutations and combinations,
More information1 Lyapunov theory of stability
M.Kawski, APM 581 Diff Equns Intro to Lyapunov theory. November 15, 29 1 1 Lyapunov theory of stability Introduction. Lyapunov s second (or direct) method provides tools for studying (asymptotic) stability
More informationThe p-adic Numbers. Akhil Mathew
The p-adic Numbers Akhil Mathew ABSTRACT These are notes for the presentation I am giving today, which itself is intended to conclude the independent study on algebraic number theory I took with Professor
More informationSPECTRAL PROPERTIES AND NODAL SOLUTIONS FOR SECOND-ORDER, m-point, BOUNDARY VALUE PROBLEMS
SPECTRAL PROPERTIES AND NODAL SOLUTIONS FOR SECOND-ORDER, m-point, BOUNDARY VALUE PROBLEMS BRYAN P. RYNNE Abstract. We consider the m-point boundary value problem consisting of the equation u = f(u), on
More informationEconomics 204 Fall 2013 Problem Set 5 Suggested Solutions
Economics 204 Fall 2013 Problem Set 5 Suggested Solutions 1. Let A and B be n n matrices such that A 2 = A and B 2 = B. Suppose that A and B have the same rank. Prove that A and B are similar. Solution.
More informationNotes on Linear Algebra and Matrix Theory
Massimo Franceschet featuring Enrico Bozzo Scalar product The scalar product (a.k.a. dot product or inner product) of two real vectors x = (x 1,..., x n ) and y = (y 1,..., y n ) is not a vector but a
More informationNonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix
Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix Yingying Dong and Arthur Lewbel California State University Fullerton and Boston College July 2010 Abstract
More informationsimple if it completely specifies the density of x
3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely
More informationSensitivity of GLS estimators in random effects models
of GLS estimators in random effects models Andrey L. Vasnev (University of Sydney) Tokyo, August 4, 2009 1 / 19 Plan Plan Simulation studies and estimators 2 / 19 Simulation studies Plan Simulation studies
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models
Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations
More informationCommunication constraints and latency in Networked Control Systems
Communication constraints and latency in Networked Control Systems João P. Hespanha Center for Control Engineering and Computation University of California Santa Barbara In collaboration with Antonio Ortega
More informationLinear Algebra in Hilbert Space
Physics 342 Lecture 16 Linear Algebra in Hilbert Space Lecture 16 Physics 342 Quantum Mechanics I Monday, March 1st, 2010 We have seen the importance of the plane wave solutions to the potentialfree Schrödinger
More informationPenalty Methods for Bivariate Smoothing and Chicago Land Values
Penalty Methods for Bivariate Smoothing and Chicago Land Values Roger Koenker University of Illinois, Urbana-Champaign Ivan Mizera University of Alberta, Edmonton Northwestern University: October 2001
More informationLECTURE NOTES ELEMENTARY NUMERICAL METHODS. Eusebius Doedel
LECTURE NOTES on ELEMENTARY NUMERICAL METHODS Eusebius Doedel TABLE OF CONTENTS Vector and Matrix Norms 1 Banach Lemma 20 The Numerical Solution of Linear Systems 25 Gauss Elimination 25 Operation Count
More informationFunction Spaces. 1 Hilbert Spaces
Function Spaces A function space is a set of functions F that has some structure. Often a nonparametric regression function or classifier is chosen to lie in some function space, where the assume structure
More informationOn intermediate value theorem in ordered Banach spaces for noncompact and discontinuous mappings
Int. J. Nonlinear Anal. Appl. 7 (2016) No. 1, 295-300 ISSN: 2008-6822 (electronic) http://dx.doi.org/10.22075/ijnaa.2015.341 On intermediate value theorem in ordered Banach spaces for noncompact and discontinuous
More informationA Note on Two Different Types of Matrices and Their Applications
A Note on Two Different Types of Matrices and Their Applications Arjun Krishnan I really enjoyed Prof. Del Vecchio s Linear Systems Theory course and thought I d give something back. So I ve written a
More informationReproducing Kernel Hilbert Spaces
Reproducing Kernel Hilbert Spaces Lorenzo Rosasco 9.520 Class 03 February 12, 2007 About this class Goal To introduce a particularly useful family of hypothesis spaces called Reproducing Kernel Hilbert
More informationGeneralized Additive Models
Generalized Additive Models The Model The GLM is: g( µ) = ß 0 + ß 1 x 1 + ß 2 x 2 +... + ß k x k The generalization to the GAM is: g(µ) = ß 0 + f 1 (x 1 ) + f 2 (x 2 ) +... + f k (x k ) where the functions
More informationApplication of density estimation methods to quantal analysis
Application of density estimation methods to quantal analysis Koichi Yoshioka Tokyo Medical and Dental University Summary There has been controversy for the quantal nature of neurotransmission of mammalian
More informationNew Results for Second Order Discrete Hamiltonian Systems. Huiwen Chen*, Zhimin He, Jianli Li and Zigen Ouyang
TAIWANESE JOURNAL OF MATHEMATICS Vol. xx, No. x, pp. 1 26, xx 20xx DOI: 10.11650/tjm/7762 This paper is available online at http://journal.tms.org.tw New Results for Second Order Discrete Hamiltonian Systems
More informationTesting Statistical Hypotheses
E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions
More informationIntroduction to Functional Analysis With Applications
Introduction to Functional Analysis With Applications A.H. Siddiqi Khalil Ahmad P. Manchanda Tunbridge Wells, UK Anamaya Publishers New Delhi Contents Preface vii List of Symbols.: ' - ix 1. Normed and
More informationB553 Lecture 1: Calculus Review
B553 Lecture 1: Calculus Review Kris Hauser January 10, 2012 This course requires a familiarity with basic calculus, some multivariate calculus, linear algebra, and some basic notions of metric topology.
More informationLasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices
Article Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Fei Jin 1,2 and Lung-fei Lee 3, * 1 School of Economics, Shanghai University of Finance and Economics,
More informationSemi-parametric estimation of non-stationary Pickands functions
Semi-parametric estimation of non-stationary Pickands functions Linda Mhalla 1 Joint work with: Valérie Chavez-Demoulin 2 and Philippe Naveau 3 1 Geneva School of Economics and Management, University of
More informationA Theoretical Framework for the Regularization of Poisson Likelihood Estimation Problems
c de Gruyter 2007 J. Inv. Ill-Posed Problems 15 (2007), 12 8 DOI 10.1515 / JIP.2007.002 A Theoretical Framework for the Regularization of Poisson Likelihood Estimation Problems Johnathan M. Bardsley Communicated
More informationVector Spaces. Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms.
Vector Spaces Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms. For each two vectors a, b ν there exists a summation procedure: a +
More informationSpectrum (functional analysis) - Wikipedia, the free encyclopedia
1 of 6 18/03/2013 19:45 Spectrum (functional analysis) From Wikipedia, the free encyclopedia In functional analysis, the concept of the spectrum of a bounded operator is a generalisation of the concept
More informationMultiscale Frame-based Kernels for Image Registration
Multiscale Frame-based Kernels for Image Registration Ming Zhen, Tan National University of Singapore 22 July, 16 Ming Zhen, Tan (National University of Singapore) Multiscale Frame-based Kernels for Image
More informationEstimation of the Bivariate and Marginal Distributions with Censored Data
Estimation of the Bivariate and Marginal Distributions with Censored Data Michael Akritas and Ingrid Van Keilegom Penn State University and Eindhoven University of Technology May 22, 2 Abstract Two new
More informationTEST CODE: MIII (Objective type) 2010 SYLLABUS
TEST CODE: MIII (Objective type) 200 SYLLABUS Algebra Permutations and combinations. Binomial theorem. Theory of equations. Inequalities. Complex numbers and De Moivre s theorem. Elementary set theory.
More information1. Nonlinear Equations. This lecture note excerpted parts from Michael Heath and Max Gunzburger. f(x) = 0
Numerical Analysis 1 1. Nonlinear Equations This lecture note excerpted parts from Michael Heath and Max Gunzburger. Given function f, we seek value x for which where f : D R n R n is nonlinear. f(x) =
More information11 Survival Analysis and Empirical Likelihood
11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with
More informationAn Alternative Proof of Primitivity of Indecomposable Nonnegative Matrices with a Positive Trace
An Alternative Proof of Primitivity of Indecomposable Nonnegative Matrices with a Positive Trace Takao Fujimoto Abstract. This research memorandum is aimed at presenting an alternative proof to a well
More information