Parallelizing Spectrally Regularized Kernel Algorithms

Size: px
Start display at page:

Download "Parallelizing Spectrally Regularized Kernel Algorithms"

Transcription

1 Journal of Machine Learning Research 19 (2018) 1-29 Subitted 11/16; Revised 8/18; Published 8/18 Parallelizing Sectrally Regularized Kernel Algoriths Nicole Mücke Institute of Stochastics and Alications, University of Stuttgart Pfaffenwaldring Stuttgart, Gerany Gilles Blanchard Institute of Matheatics, University of Potsda Karl-Liebknecht-Strae Potsda, Gerany Editor: Ingo Steinwart Abstract We consider a distributed learning aroach in suervised learning for a large class of sectral regularization ethods in an reroducing kernel Hilbert sace (RKHS) fraework. The data set of size n is artitioned into = O(n α ), α < 1 2, disjoint subsales. On each subsale, soe sectral regularization ethod (belonging to a large class, including in articular Kernel Ridge Regression, L 2 -boosting and sectral cut-off) is alied. The regression function f is then estiated via sile averaging, leading to a substantial reduction in coutation tie. We show that iniax otial rates of convergence are reserved if grows sufficiently slowly (corresonding to an uer bound for α) as n, deending on the soothness assutions on f and the intrinsic diensionality. In sirit, the analysis relies on a classical bias/stochastic error analysis. Keywords: Distributed Learning, Sectral Regularization, Miniax Otiality 1. Introduction Distributed learning (DL) algoriths are a standard tool for reducing coutational burden in achine learning robles where assive datasets are involved. Assuing a colexity cost (for tie and/or eory) of O(n β ) (β > 1, β 2, 3] being coon) of the base learning algorith without arallelization, dividing randoly data of cardinality n into disjoint, equally-sized subsales and rocessing the in arallel using the sae base learning algorith has therefore colexity cost of O(.(n/) β ) = O(n β / β 1 ), roughly. Financial suort by the DFG via Research Unit 1735 Structural Inference in Statistics as well as SFB 1294 Data Assiilation is gratefully acknowledged. c 2018 Nicole Mücke and Gilles Blanchard. License: CC-BY 4.0, see htts://creativecoons.org/licenses/by/4.0/. Attribution requireents are rovided at htt://jlr.org/aers/v19/ htl.

2 Mücke and Blanchard gaining a factor β 1 (for tie and eory) coared to the single achine aroach. The final outut is obtained fro averaging the individual oututs. Recently, DL was studied in several achine learning contexts. In oint estiation (Li et al., 2013), atrix factorization (Mackey et al., 2011), soothing sline odels and testing (Cheng and Shang, 2016), local average regression (Chang et al., 2017), in classification (Hsieh et al., 2014; Guo et al., 2015), and also in kernel ridge regression (Zhang et al., 2013; Lin et al., 2017; Xu et al., 2016). In this aer, we study the DL aroach for the statistical learning roble Y i := f(x j ) + ɛ i, j = 1,..., n, (1) at rando i.i.d. data oints X 1,..., X n drawn according to a robability distribution ν on X, where ɛ j are indeendent centered noise variables. The unknown regression function f is real-valued and belongs to soe reroducing kernel Hilbert sace with bounded kernel K. We artition the given data set D = {(X 1, Y 1 ),..., (X n, Y n )} X R into disjoint equal-size subsales D 1,..., D. On each subsale D j, we coute a local estiator ˆf λ D j, using a sectral regularization ethod. The final estiator for the target function f is obtained by sile averaging: f λ D := 1 ˆf D λ j. The non-distributed setting ( = 1) has been studied in the recent aer of Blanchard and Mücke (2017), building the root osition of our results in the distributed setting, where weak and strong iniax otial rates of convergence are established. Our ai is to extend these results to distributed learning and to derive iniax otial rates. We again aly a fairly large class of sectral regularization ethods, including the oular kernel ridge regression (KRR), L 2 -boosting and sectral cut-off. Using the sae notation as Blanchard and Mücke (2017), we let T : f f(x)k(x, )dν(x) denote the kernel integral oerator associated to K and the saling easure ν. We denote T = κ 2 T, with κ 2 the uer bound of K. Our rates of convergence are governed by a source condition assution on f of the for Ω(r, R) := {f : f = T r h, h HK R} for soe constants r, R > 0 as well as by the ill-osedness of the roble, as easured by an assued ower decay of the eigenvalues of T with exonent b > 1. We show that for s 0, 1 2 ] in the sense of -th oent ( 1) exectation T s (f ( ) λn f D ) HK σ 2 (r+s) 2r+1+1/b R, (2) R 2 n for an aroriate choice of the regularization araeter λ n, deending on the global sale size n as well as on R and the noise variance σ 2 (but not on the nuber of subsale sets). Note that s = 0 corresonds to the reconstruction error (i.e. -nor), and s = 1 2 to the 2

3 Parallelizing Sectral Algoriths rediction error (i.e., L 2 (ν)-nor). The sybol eans that the inequality holds u to a ultilicative constant that can deend on various araeters entering in the assutions of the result, but not on n,, σ, nor R. An iortant assution is that the inequality q r + s should hold, where q is the qualification of the regularization ethod, a quantity defined in the classical theory of inverse robles (see Section 2.3 for a recise definition). Basic robles are the choice of the regularization araeter on the subsales and, ost iortantly, the roer choice of, since it is well known that choosing too large gives a subotial convergence rate in the liit n (see, e.g., Xu et al., 2016). Our aroach to this roble is based on a relatively classical bias-variance decoosition rincile. Choosing the global regularization araeter as the otial choice for a single sale of size n results in a bias estiate which is identical for all subsales, is unchanged by averaging, and is straightforward fro the single-sale analysis. On the other hand, the reduced sale size of each of the individual subsales causes an inflation of variance. However, since the subsales are indeendent, so are the oututs of the learning algorith alied to each one of the; as a consequence averaging reduces the inflated variance sufficiently to get iniax otiality. We can write the variance as a su of indeendent rando variables, allowing to successfully aly a Rosenthal s inequality in the Hilbert sace setting due to Pinelis (1994). The technical liiting factors in this arguent give rise to the liitation on the nuber of subsales ; for larger than the allowed range, soe reainder ters are no longer negligible using our roof technique, and rate otiality is not guaranteed any longer. The outline of the aer is as follows. Section 2 contains notation and the setting. Section 3 states our ain result on distributed learning. Section 4 resents nuerical studies. A concluding discussion in Section 5 contains a ore detailed coarison of our results with related results available in the literature. Section 6 contains the roofs of the theores. 2. Notation, statistical odel and distributed learning algorith In this section, we secify the atheatical background and the statistical odel for (distributed) regularized learning. We have included this section for self sufficiency and reader convenience. It essentially reeats the setting in Blanchard and Mücke (2017) in suarized for. 2.1 Kernel-induced oerators We assue that the inut sace X is a standard Borel sace endowed with a robability easure ν, the outut sace is equal to R. We let K be a real-valued ositive seidefinite kernel on X X which is bounded by κ 2. The associated reroducing kernel Hilbert sace will be denoted by. It is assued that all functions f are easurable and bounded in sureu nor, i.e. f κ f HK for all f. Therefore, is a subset of L 2 (X, ν), with S : L 2 (X, ν) being the inclusion oerator, satisfying 3

4 Mücke and Blanchard S κ. The adjoint oerator S : L 2 (X, ν) is identified as S g = g(x)k x ν(dx), X where K x denotes the eleent of equal to the function t K(x, t). The covariance oerator T : is given by T =, K x HK K x ν(dx), X which can be shown to be ositive self-adjoint trace class (and hence is coact). The eirical versions of these oerators, corresonding forally to taking the eirical distribution ν n = 1 n n i=1 δ x i in lace of ν in the above forulas, are given by S x : R n, (S x f) j = f, K xj HK, S x : R n, T x := S xs x :, Sxy = 1 n T x = 1 n n y j K xj, n, K xj HK K xj. We introduce the shortcut notation T = κ 2 T and T x := κ 2 T x, ensuring T 1 and T x 1, for any x X. Siilarly, S = κ 1 S and S xj := κ 1 S xj, ensuring S 1 and S x 1, for any x X. The nubers µ j are the ositive eigenvalues of T satisfying 0 < µ j+1 µ j for all j > 0 and µ j Noise assution and rior classes In our setting of kernel learning, the saling is assued to be rando i.i.d., where each observation oint (X i, Y i ) follows the odel Y = f ρ (X) + ε. For (X, Y ) having distribution ρ, we assue that the conditional exectation wrt. ρ of Y given X exists and belongs to, that is, it holds for ν-alost all x X : E ρ Y X = x] = S x f ρ, for soe f ρ. (3) Furtherore, we will ake the following assution on the observation noise distribution: There exists σ > 0 and M > 0 such that for any l 2 E Y S X f ρ l X ] 1 2 l!σ2 M l 2, ν a.s. (4) To derive nontrivial rates of convergence, we concentrate our attention on secific subsets (also called odels) of the class of robability easures. If P denotes the set of all robability distributions on X, we define classes of saling distributions by introducing a decay 4

5 Parallelizing Sectral Algoriths condition on the effective diension N (λ), being a easure for the colexity of with resect to the arginal distribution ν: For λ (0, 1] we set Note that N (λ) 1. For any b > 1 we introduce N (λ) = Trace ( T + λ) 1 T ]. (5) P < (b) := {ν P : N (λ) C b (κ 2 λ) 1 b }. (6) In De Vito and Caonnetto, 2006, Proosition 3, it is shown that such a condition is ilied by olynoially decreasing eigenvalues of T. More recisely, if the eigenvalues µ i satisfy µ j β/j b j 1 or b > 1 and β > 0, then N (λ) β 1 b b b 1 (κ2 λ) 1 b. For a subset Ω, we let K(Ω) be the set of regular conditional robability distributions ρ( ) on B(R) X such that (3) and (4) hold for soe f ρ Ω. We will focus on a Hölder-tye source condition, i.e. given r > 0, R > 0 and ν P, we define Ω(r, R) := {f : f = T r h, h HK R}. (7) Then the class of odels which we will consider will be defined as M(r, R, P ) := { ρ(dx, dy) = ρ(dy x)ν(dx) : ρ( ) K(Ω(r, R)), ν P }, (8) with P = P < (b). As a consequence, the class of odels deends not only on the soothness roerties of the solution (reflected in the araeters R > 0, r > 0), but also essentially on sectral roerties of T, reflected in N (λ). 2.3 Sectral regularization In this subsection, we introduce the class of linear regularization ethods based on sectral theory for self-adjoint linear oerators. These are standard ethods for finding stable solutions for ill-osed inverse robles. Originally, these ethods were develoed in the deterinistic context (see Engl et al., 2000). Later on, they have been alied to robabilistic robles in achine learning (see, e.g., Bauer et al., 2007; De Vito and Caonnetto, 2006; Dicker et al., 2017 or Blanchard and Mücke, 2017). Definition 1 (Regularization function) Let g : (0, 1] 0, 1] R be a function and write g λ = g(λ, ). The faily {g λ } λ is called regularization function, if the following conditions hold: (i) There exists a constant D < such that for any 0 < λ 1 su tg λ (t) D. (9) 0 t 1 5

6 Mücke and Blanchard (ii) There exists a constant E < such that for any 0 < λ 1 su g λ (t) E 0 t 1 λ. (10) (iii) Defining the residual r λ (t) := 1 g λ (t)t, there exists a constant γ 0 < such that for any 0 < λ 1 su r λ (t) γ 0. 0 t 1 It has been shown in e.g. Gerfo et al. (2008), Dicker et al. (2017), Blanchard and Mücke (2017) that attainable learning rates are essentially linked with the qualification of the regularization {g λ } λ, being the axial q such that for any 0 < λ 1 su r λ (t) t q γ q λ q. (11) 0 t 1 for soe constant γ q > 0. Note that by (iii), using interolation, we have validity of (11) also for any q 0, q] with constant γ q = γ 1 q q The ost oular exales include: q q 0 γq. Exale 1 (Tikhonov Regularization, Kernel Ridge Regression) The choice g λ (t) = 1 λ+t corresonds to Tikhonov regularization. In this case we have D = E = γ 0 = 1. The qualification of this ethod is q = 1 with γ q = 1. Exale 2 (Landweber Iteration, gradient descent) The Landweber Iteration (gradient descent algorith with constant stesize) is defined by k 1 g k (t) = (1 t) j with k = 1/λ N. j=0 We have D = E = γ 0 = 1. The qualification q of this algorith can be arbitrary with γ q = 1 if 0 < q 1 and γ q = q q if q > 1. Exale 3 (ν- ethod) The ν ethod belongs to the class of so called sei-iterative regularization ethods. This ethod has finite qualification q = ν with γ q a ositive constant. Moreover, D = 1 and E = 2. The filter is given by g k (t) = k (t), a olynoial of degree k 1, with regularization araeter λ k 2, which akes this ethod uch faster as e.g. gradient descent. 6

7 Parallelizing Sectral Algoriths 2.4 Distributed learning algorith We let D = {(x j, y j )} n X Y be the dataset, which we artition into disjoint subsales 1 D 1,..., D, each having size n. Denote the jth data subsale by (x j, y j ) (X R) n. On each subsale we coute a local estiator for a suitable a-riori araeter choice λ = λ n according to f λn D j := g λn ( T xj ) S x j y j. (12) By fd λ we will denote the estiator using the whole sale = 1. The final estiator is given by sile averaging of the local ones: f D λ := 1 fd λ j. (13) 3. Main results This section resents our ain results. Theore 3 and Theore 4 contain searate estiates on the aroxiation error and the sale error and lead to Corollary 5 which gives an uer bound for the error T s (f ρ f D λ ) HK and resents an uer rate of convergence for the sequence of distributed learning algoriths. For the sake of the reader we recall Theore 6, which was already shown in Blanchard and Mücke (2017), resenting the iniax otial rate for the single achine roble. This yields an estiate on the difference between the single achine and the distributed learning algorith in Corollary 7. We want to track the recise behavior of these rates not only for what concerns the exonent in the nuber of exales n, but also in ters of their scaling (ultilicative constant) as a function of soe iortant araeters (naely the noise variance σ 2 and the colexity radius R in the source condition, see Reark 9 below). For this reason, we introduce a notion of a faily of rates over a faily of odels. More recisely, we consider an indexed faily (M θ ) θ Θ, where for all θ Θ, M θ is a class of Borel robability distributions on X R satisfying the basic general assutions (3) and (4). We consider rates of convergence in the sense of the -th oents of the estiation error, where 1 < is a fixed real nuber. 1. For the sake of silicity, throughout this aer we assue that n is divisible by. This could always be achieved by disregarding soe data; alternatively, it is straightforward to show that aditting one saller block in the artition does not affect the asytotic results of this aer. We shall not try to discuss this oint in greater detail. In articular, we shall not analyze in which general fraework our sile averages could be relaced by weighted averages. 7

8 Mücke and Blanchard As already entioned in the introduction, our roofs are based on a classical bias-variance decoosition as follows: Introducing we write f λ D = 1 g λ ( T xj ) T xj f ρ, (14) T s (f ρ f D) λ = T s (f ρ f D) λ + T s ( f D λ f D) λ = 1 T s r λ ( T xj )f ρ + 1 T s g λ ( T xj )( T xj f ρ S x j y j ). (15) In all the forthcoing results in this section, we assue: Assution 2 Let s 0, 1 2 ], 1 and consider the odel M σ,m,r := M(r, R, P < (b)) where r > 0 and b > 1 are fixed, and θ = (R, M, σ) varies in Θ = R 3 +. Given a sale λn D (X R) of size n, define f D, f λn λn D as in Section 2.4 and f D as in (14), using a regularization function of qualification q r + s, with araeter sequence λ n := λ n,(σ,r) := in indeendent on M. Define the sequence ( ( σ 2 R 2 n ) b 2br+b+1, 1 ), (16) ( ) σ 2 b(r+s) 2br+b+1 a n := a n,(σ,r) := R. (17) R 2 n We recall that we shall always assue that n is a ultile of. With these rearations, our ain results are: Theore 3 (Aroxiation error) Under Assution 2, we have: If the nuber n of subsale sets satisfies n n α, α < 2b in{r, 1} 2br + b + 1, (18) then su (σ,m,r) R 3 + li su n su ρ M σ,m,r ] T s λn (f ρ f D ) 1 <. a n 8

9 Parallelizing Sectral Algoriths Theore 4 (Sale Error) Under Assution 2, we have: If the nuber n of subsale sets satisfies n n α 2br, α < 2br + b + 1, (19) Then ] T s λn λn ( f D f D ) 1 li su su <. n a n su (σ,m,r) R 3 + ρ M σ,m,r And, as consequence (by (15) and alying the triangle inequality for the L -nor): Corollary 5 Under Assution 2, we have: If the nuber n of subsale sets satisfies n n α, α < 2b in{r, 1} 2br + b + 1, (20) then the sequence (17) is an uer rate of convergence in L for all > 0, for the interolation nor of araeter s, for the sequence of estiated solutions ( f λ n,(σ,r) D ) over the faily of odels (M σ,m,r ) (σ,m,r) R 3 +, i.e. su (σ,m,r) R 3 + li su n su ρ M σ,m,r ] T s λn (f ρ f D ) 1 <. a n Theore 6 (Blanchard and Mücke, 2017) The sequence (17) is an uer rate of convergence in L for all > 0, for the interolation nor of araeter s, for the sequence of estiated solutions (f λ n,(σ,r) D ) over the faily of odels (M σ,m,r ) (σ,m,r) R 3 +, i.e. su (σ,m,r) R 3 + li su n su ρ M σ,m,r ] T s (f ρ f λn D ) 1 a n <. Cobining Corollary 5 with Theore 6 by alying the triangle inequality iediately yields: Corollary 7 If the nuber n of subsale sets satisfies then su (σ,m,r) R 3 + li su n n n α, α < su ρ M σ,m,r 2b in{r, 1} 2br + b + 1, (21) ] T s (f λn λn D f D ) 1 <. a n 9

10 Mücke and Blanchard Reark 8 Our results in the distributed setting slightly differ fro those obtained in Theore 6 fro Blanchard and Mücke (2017) in two several resects: While in the single achine aroach, rates of convergence are obtained for any > 0, the roofs in Section 6 only hold for 1 due to loss of subadditivity of -th oents for 0 < < 1. While the uer uer rates of convergence in Blanchard and Mücke (2017) are derived over classes of arginals ν induced by assuing a decay condition for the eigenvalues of T, we soewhat enlarge this class by assuing a decay condition for N (λ) in (6). Theore 6 also holds under this weaker condition. Note that it is an oen roble if lower rates of convergence can also be obtained by weakening the condition for eigenvalue decay. Reark 9 (Signal-to-noise-ratio) Our results show that the choice of the regularization araeter λ n in (16) and thus the rate of convergence a n in (17) highly deend on the signal-to-noise-ratio σ2, a quantity which naturally aears in the theory of regularization R 2 of ill-osed inverse robles. As a general rule, the degree of regularization should increase with the level of noise in the data, i.e., the iortance of the riors should increase as the odel fit decreases. Our theoretical results recisely show this behavior. 4. Nuerical studies In this section we nuerically study the error in -nor, corresonding to s = 0 in Corollary 5 (in exectation with = 2) both in the single achine and distributed learning setting. Our ain interest is to study the uer bound for our theoretical exonent α, araetrizing the size of subsales in ters of the total sale size, = n α, in different soothness regies. In addition we shall deonstrate in which way arallelization serves as a for of regularization. More secifically, we let = H0 1 0, 1] be the Sobolev sace consisting of absolutely continuous functions f on 0, 1] with weak derivative of order 1 in L 2 0, 1], with boundary condition f(0) = f(1) = 0. The reroducing kernel is given by K(x, t) = x t xt. For all exerients in this section, we siulate data fro the regression odel Y i = f ρ (X i ) + ɛ i, i = 1,..., n, where the inut variables X i Unif0, 1] are uniforly distributed and the noise variables ε i N(0, σ 2 ) are norally distributed with standard deviation σ = We choose the target function f ρ according to two different cases, naely r < 1 (low soothness regie) and r = (high soothness regie). To accurately deterine the degree of soothness r > 0, we aly Proosition 10 below by exlicitly calculating the Fourier coefficients 2 πj cos(πjx), for j N, fors an ONB of. Recall that ( f ρ, e j HK ) j N, where e j (x) = the rate of eigenvalue decay is exlicitly given by b = 2, eaning that we have full control over all araeters in (21). We need the following characterization: 10

11 Parallelizing Sectral Algoriths Proosition 10 (Engl et al., 2000, Pro. 3.13) Let, H 2 be searable Hilbert saces and S : H 2 be a coact linear oerator with singular syste 2 {σ j, ϕ j, ψ j }. Denoting by S the generalized inverse 3 of S, one has for any r > 0 and g H 2 : g is in the doain of S and S g I((S S) r ) if and only if g, ψ j H2 2 j=0 σ 2+4r j <. In our case, is as above, H 2 is L 2 (0, 1]) with Lebesgue easure and S : H0 1 0, 1] L 2 (0, 1]) is the inclusion. Since H0 10, 1] is dense in L2 (0, 1]), we know that (I(S)) is trivial, giving SS = id on I(S). Furtherore, ϕ j = e j is a noralized eigenbasis of T = S S with eigenvalues σj 2 = (πj) 2. With ψ j = Sϕ j Sϕ j we obtain for f H1 L 2 0 0, 1] Sf, ψ j L 2 = Sf, Thus, alying Proosition 10 gives Se L2 j Se j = f, S Se j Se j H 1 0 = σ j f, e j H 1 0. Corollary 11 For S and T = S S defined in Section 2, we have for any r > 0: f I(T r ) if and only if j 4r f, e j L 2 2 <. Thus, as exected, abstract soothness easured by the araeter r in the source condition corresonds in this secial case to decay of the classical Fourier coefficients which by the classical theory of Fourier series easures soothness of the eriodic continuation of f L 2 (0, 1]) to the real line. 4.1 Low soothness regie We choose f ρ (x) = 1 2 x(1 x) which clearly belongs to. A straightforward calculation gives the Fourier coefficient f ρ, e j = 2(πj) 2 for j odd (vanishing for j even). Thus, by the above criterion, f ρ satisfies the source condition f ρ Ran( T r ) recisely for 0 < r < (Observe that although f ρ is sooth on 0, 1], its eriodic continuation on the real line is not, hence the low soothness regie.) According to Theore 6, the worst case rate in the single achine roble is given by n γ, with γ = Regularization is done using the ν ethod (see Exale 3), with qualification q = ν = 1. Recall that the stoing index 2. i.e., the ϕ j are the noralized eigenfunctions of S S with eigenvalues σ 2 j and ψ j = Sϕ j/ Sϕ j ; thus S = σ j ϕ j, ψ j. 3. the unique unbounded linear oerator with doain I(S) (I(S)) in H 2 vanishing on (I(S)) and satisfying SS = 1 on I(S), with range orthogonal to the null sace N(S). 11

12 Mücke and Blanchard k sto serves as the regularization araeter λ, where k sto λ 2. We consider sale sizes fro 500,..., In the odel selection ste, we estiate the erforance of different odels and choose the oracle stoing tie ˆk oracle by iniizing the reconstruction error: over M = 30 runs. ˆk oracle = arg in k 1 M M f ρ ˆf j k In the odel assessent ste, we artition the dataset into n α subsales, for any α {0, 0.05, 0.1,..., 0.85}. On each subsale we regularize using the oracle stoing tie ˆk oracle (deterined by using the whole sale). Corresonding to Corollary 5, the accuracy should be coarable to the one using the whole sale as long as α < 0.5. In Figure 1 (left anel) we lot the reconstruction error f ˆk f ρ HK versus the ratio α = log()/ log(n) for different sale sizes. We execute each siulation M = 30 ties. The lot suorts our theoretical finding. The right anel shows the reconstruction error versus the total nuber of sales using different artitions of the data. The black curve (α = 0) corresonds to the baseline error ( = 0, no artition of data). Error curves below a threshold α < 0.6 are roughly coarable, whereas curves above this threshold show a ga in erforances. In another exerient we study the erforances in case of (very) different regularization: Only artitioning the data (no regularization), underregularization (higher stoing index) and overregularization (lower stoing index). The outcoe of this exerient alifies the regularization effect of arallelizing. Figure 2 shows the ain oint: Overregularization is always hoeless, underregularization is better. In the extree case of (alost) no regularization, there is a shar iniu in the reconstruction error which is only slightly larger than the iniax otial value for the oracle regularization araeter and which is achieved at an attractively large degree of arallelization. Qualitatively, this agrees very well with the intuitive notion that arallelizing serves as regularization. We ehasize that nuerical results see to indicate that arallelization is ossible to a slightly larger degree than indicated by our theoretical estiate. A siilar result was reorted in the aer Zhang et al. (2013), which also treats the low soothness case High soothness regie We choose f ρ (x) = 1 2π sin(2πx), which corresonds to just one non-vanishing Fourier coefficient and by our criterion Corollary 11 has r =. In view of our ain Corollary 5 this requires a regularization ethod with higher qualification; we take the Gradient Descent ethod (see Exale 2). The aearance of the ter 2b in{1, r} in our theoretical result 5 gives a redicted value α = 0 (and would ily that arallelization is strictly forbidden for infinite soothness). More secifically, the left anel in Figure 3 shows the absence of any lateau for the reconstruction error as a function of α. This corresonds to the right anel showing that 12

13 Parallelizing Sectral Algoriths reconstruction error n=500 n=1500 n=3000 n=6000 n=9000 log(recon.error) α=0 α=0.2 α = 0.4 α = 0.6 α = 0.7 α = log()/log(n) log(n) Figure 1: The reconstruction error f k oracle D f ρ HK in the low soothness case. Left lot: Reconstruction error curves for various (but fixed) total sale sizes, as a function of the nuber of subsales. Right lot: Reconstruction error curves for various subsale nuber scalings = n α, as a function of the sale size (on log-scale). n=500 n=5000 reconstruction error T=5 T=15 T=31 (oracle) T=100 T=500 no regularization reconstruction error T=5 T=15 T=45 (oracle) T=100 T=500 no regularization log()/log(n) log()/log(n) Figure 2: The reconstruction error f D λ f ρ HK in the low soothness case. Left lot: Error curves for different stoing ties for n = 500, as a function of the nuber of subsales. Right lot: Error curves for different stoing ties for n = 5000, as a function of the nuber of subsales. 13

14 Mücke and Blanchard reconstruction error n=500 n=1500 n=3000 n=6000 n=8000 log(recon.error) α=0 α=0.1 α = 0.2 α = 0.4 α = 0.6 α = log()/log(n) log(n) Figure 3: The reconstruction error f λ oracle D f ρ HK in the high soothness case. Left lot: Reconstruction error curves for various (but fixed) sale sizes as a function of the nuber of subsales. Right lot: Reconstruction error curves for various subsale nuber scalings = n α, as a function of the sale size (on log-scale). no grou of values of α erfors roughly equivalently, eaning that we do not have any otiality guarantees. Plotting different values of regularization in Figure 4 we again identify overregularization as hoeless, while severe underregularization exhibits a shar iniu in the reconstruction error. But its value at roughly 0.25 is uch less attractive coared to the case of low soothness where the error is an order of agnitude less. 5. Discussion Miniax Otiality: We have shown that for a large class of sectral regularization ethods the error of the distributed algorith T s λn ( f D f ρ) HK satisfies the sae uer bound as the error T s (f λn D f ρ) for the single achine roble, if the regularization HK araeter λ n is chosen according to (16), rovided the nuber of subsales grows sufficiently slowly with the sale size n. Since, the rates for the latter are iniax otial (Blanchard and Mücke, 2017), our rates in Corollary 5 are iniax otial also. Coarison with other results: Zhang et al. (2013) derive iniax-otial rates in three settings: finite rank kernels, sub-gaussian decay of eigenvalues of the kernel and olynoial decay, rovided satisfies a certain uer bound, deending on the rate of decay of the eigenvalues under two crucial assutions on the eigenfunctions of the integral oerator associated to the kernel: For any j N Eφ j (X) 2k ] ρ 2k, (22) 14

15 Parallelizing Sectral Algoriths n=500 n=3000 reconstruction error T=10 T=100 T=223 (oracle) T=500 T=900 no regularization reconstruction error T=10 T=100 T=219 (oracle) T=500 T=900 no regularization log()/log(n) log()/log(n) Figure 4: The reconstruction error f λ D f ρ in the high soothness case. Left lot: Error curves for different stoing ties for n = 500 sales, as a function of the nuber of subsales. Right lot: Error curves for different stoing ties for n = 5000 sales, as a function of the nuber of subsales. for soe k 2 and ρ < or even stronger, it is assued that the eigenfunctions are uniforly bounded, i.e. su φ j (x) ρ, (23) x X or any j N and soe ρ <. We shall describe in ore detail the case of olynoially decaying eigenvalues, which corresonds to our setting. Assuing eigenvalue decay µ j j b with b > 1, the authors choose a regularization araeter λ n = n b b+1 and n b(k 4) k b+1 ρ 4k log k (n) 1 k 2. leading to an error in L 2 -nor λn E f D f ρ 2 L ] n b 2 b+1, being iniax otial. Note that this choice of λ n and the resulting rate corresond to our case r = 0, i.e., no soothness of f ρ is assued (just that f ρ belongs to the RKHS). For k < 4 the bound becoes less eaningful (coared to the case where k 4) since 0 as n in this case (for any sort of eigenvalue decay). On the other hand, if k and b ight be taken arbitrarily large, corresonding to alost bounded eigenfunctions and arbitrarily large olynoial decay of eigenvalues, ight be chosen roortional to n 1 ɛ, for any ɛ > 0. As ight be exected, relacing the L 2k bound on the eigenfunctions by a bound in L, gives an uer bound on which sily is the liit for k in the 15

16 Mücke and Blanchard bound given above, naely n b 1 b+1 ρ 4 log n, which for large b behaves as above. Granted bounds on the eigenfunctions in L 2k for (very) large k, this is a strong result. While the decay rate of the eigenvalues can be deterined by the soothness of K (see, e.g., Ferreira and Menegatto, 2009 and references therein), it is a widely oen question which general roerties of the kernel ily estiates as in (22) and (23) on the eigenfunctions. Zhou (2002) even gives a counterexale and resents a C Mercer kernel on 0, 1] where the eigenfunctions of the corresonding integral oerator are not uniforly bounded. Thus, soothness of the kernel is not a sufficient condition for (23) to hold. Moreover, we oint out that the uer bound (22) on the eigenfunctions (and thus the uer bound for in Zhang et al., 2013) deends on the unknown arginal distribution ν. Only the strongest assution, a bound in su-nor (23), does not deend on ν. Concerning this oint, our aroach is agnostic. As already entioned in the Introduction, these bounds on the eigenfunctions have been eliinated by Lin et al. (2017), for KRR, iosing olynoial decay of eigenvalues as above. This is very siilar to our aroach. As a general rule, our bounds on and the bounds obtained by Lin et al. (2017) are worse than the bounds of Zhang et al. (2013) for eigenfunctions in (or close to ) L, but in the coleentary case where nothing is known on the eigenfunctions still can be chosen as an increasing function of n, naely = n α. More recisely, choosing λ n as in (16), Lin et al. (2017) derive as an uer bound n α, α = 2br 2br + b + 1, with r being the soothness araeter arising in the source condition. We recall here that due to our assution q r + s, the soothness araeter r is restricted to the interval (0, 1 2 ] for KRR (q = 1) and L2 -risk (s = 1 2 ). Our results (which hold for a general class of sectral regularization ethods) are in soe ways coarable to those of Lin et al. (2017). Secialized to KRR, our estiates for the exonent α in = O(n α ) coincide with the result of Lin et al. (2017). Furtherore, we ehasize that Zhang et al. (2013) and Lin et al. (2017) estiate the DL-error only for s = 1/2 in our notation (corresonding to L 2 (ν) nor), while our result holds for all values of s 0, 1/2] which soothly interolates between L 2 (ν)-nor and RKHS-nor and, in addition, for all values of 1, ). Thus, our results also aly to the case of non-araetric inverse regression, where one is articularly interested in the reconstruction error, i.e. -nor (see, e.g., Blanchard and Mücke, 2017). Additionally, we recisely analyze the deendence of the noise variance σ 2 and the colexity radius R in the source condition. Concerning general strategy, while Lin et al. (2017) use a novel second order decoosition in an essential way, our aroach is ore classical. We clearly distinguish between estiat- 16

17 Parallelizing Sectral Algoriths ing the aroxiation error and the sale error. The bias using a subsale should be of the sae order as when using the whole sale, whereas the estiation error is higher on each subsale, but gets reduced by averaging by writing the variance as a su of i.i.d. rando variables (which allows to use Rosenthal s inequality). Finally, we want to ention the recent works of Lin and Zhou (2018) and Guo et al. (2017), which were worked out indeently fro our work. Guo et al. (2017) also treat general sectral regularization ethods (going beyond kernel ridge) and obtain essentially the sae results, but with error bounds only in L 2 -nor, excluding inverse learning robles. Lin and Zhou (2018) investigate distributed learning on the exale of gradient descent algoriths, which have infinite qualification and allow larger soothness of the regression function. They are able to irove the uer bound for the nuber of local achines to n α log 5 (n) + 1, α < br 2br + b + 1, which is larger in the case r > 2. In the interediate case 1 < r < 2, our bound in (20) is still better. An interesting feature is the fact that it is ossible to allow ore local achines by using additional unlabeled data. This indicates that finding the uer bound for the nuber of achines in the high soothness regie is still an oen roble. Adativity: It is clear fro the theoretical results that both the regularization araeter λ and the allowed cardinality of subsales deend on the araeters r and b, which in general are unknown. Thus, an adative aroach to both araeters b and r for choosing λ and is of interest. To the best of our knowledge, there are yet no rigorous results on adativity in this ore general sense. Progress in this field ay well be crucial in finally assessing the relative erits of the distributed learning aroach as coared with alternative strategies to effectively deal with large data sets. We sketch an alternative naive aroach to adativity, based on hold-out in the direct case, where we consider each f also as a function in L 2 (X, ν). We slit the data z (X Y) n into a training and validation art z = (z t, z v ) of cardinality t, v. We further subdivide z t into k subsales, roughly of size t / k, where k t, k = 1, 2,... is soe strictly decreasing sequence. For each k and each subsale z j, 1 j k, we define the estiators ˆf z λ j as in (12) and their average f λ k,z t := 1 k k ˆf λ z j. (24) Here, λ varies in soe sufficiently fine lattice Λ. Then evaluation on z v gives the associated eirical L 2 error Err λ k (zv ) := leading us to define 1 v (yi v f k,z λ t(xv i )) 2, z v = (y v, x v ), y v = (y1, v..., y v v ), (25) v i=1 ˆλ k := argin λ Λ Err λ k (zv ), Err(k) := Errˆλ k k (zv ). (26) 17

18 Mücke and Blanchard Then, an aroriate stoing criterion for k ight be to sto at k := in{k 3 : (k) δ inf 2 j<k (j)}, (j) := Err(j) Err(j 1), (27) for soe δ < 1 (which ight require tuning). The corresonding regularization araeter is ˆλ = ˆλ k, given by (26). At least intuitively, it is then reasonable to define a urely data driven estiator as f n := f ˆλ k,z t. (28) Note that the training data z t enter the definition of f n via the exlicit forula (24) encoding our kernel based aroach, while z v serves to deterine (k, ˆλ ) via iniization of the eirical L 2 -error and a criterion, which tells one to sto where Err(j) does not areciably irove anyore. It is oen if such a rocedure achieves otial rates, and we leave this for future research. 6. Proofs For ease of reading we ake use of the following conventions: for a (bounded) linear oerator A, A denotes the oerator nor; we are interested in a recise deendence of ultilicative constants on the araeters σ, M, R,, n and η. (To be clear about the role of the latter quantity: the roofs rely on high-robability stateents on deviations, tyically holding with high robability 1 η.) the deendence of ultilicative constants on various other araeters, including the kernel araeter κ, the interolating araeter s 0, 1 2 ], the araeters arising fro the regularization ethod, b > 1, β > 0, r > 0, etc. will (generally) be oitted and sily indicated by the sybol. the deendence of the nor araeter will also be indicated, but will not be given exlicitly. the values of C and C, ight change fro line to line. the exression for n sufficiently large eans that the stateent holds for n n 0, with n 0 otentially deending on all odel araeters (including σ, M and R), but not on η. 6.1 Preliinaries Proosition 12 (Guo et al., 2017, Proosition 1) Define ( ) 2 2 N (λ) B n (λ) := 1 + nλ +. (29) nλ 18

19 Parallelizing Sectral Algoriths For any λ > 0, η (0, 1], with robability at least 1 η one has ( T x + λ) 1 ( T + λ) 8 log 2 (2η 1 )B n (λ). (30) Corollary 13 Let η (0, 1). For n N let λ n be ilicitly defined as the unique solution of N ( λ n ) = n λ n. Then for any λ ax( λ n, n 1 ), 1], one has B n (λ) 10. In articular, ( Tx + λ) 1 ( T + λ) 80 log 2 (2η 1 ) holds with robability at least 1 η. We reark that the trace of T is bounded by 1. This ensures that the interval λ n, 1] is non-ety. Proof of Corollary 13] Let λ n be defined via N ( λ n ) = n λ n. Since N (λ)/λ is decreasing, we have for any λ λ n N (λ) nλ N ( λ n ) = 1. n λ n Inserting this bound as well as nλ 1 into (29) and (30) leads to the conclusion. Corollary 14 Assue the arginal distribution ν of X belongs to P < (b, β) with b > 1 and β > 0. If λ n is defined by (16) and if n n α, α < 2br 2br + b + 1, one has rovided n is sufficiently large. B n (λ n ) 2, n Proof of Corollary 14] We can for starters assue that n is sufficiently large so that λ n < 1, i.e. λ n = ( σ 2 R 2 n ) b 2br+b+1 fro (16). Recall that ν P < (b, β) ilies N (λ n ) C λ 1 b n. Looking at the ters entering in B n (λ n), see (29), we have first, using the definition of λ n in (16): N (λ n ) n λ n C λ n n b+1 b = C n 19 ( nr 2 σ 2 ) b+1 2br+b+1,

20 Mücke and Blanchard which (for fixed R, σ and other araeters entering in C ) is O( n n 2br+r+1 ), and hence o(1) rovided n n α 2br, α < 2br + b + 1. For the second ter entering in B n (λ n), we have 2br 1 n λ n = n ( nr 2 σ 2 ) b 2br+b+1, which is O( n n 2br+1 2br+b+1 ) = o(1), rovided n n α, α < 2br + 1 2br + b + 1, which is ilied by the revious stronger condition. We shortly illustrate how Corollary 13 and Proosition 12 will be used. Let u 0, 1], λ n λ as above and f. We have for any bounded oerator A T u A = T u ( T + λ) u ( T + λ) u ( T x + λ) u ( T x + λ) u A T u ( T + λ) u ( T + λ) u ( T x + λ) u ( T x + λ) u A 8 log 2u (2η 1 )B n (λ) u ( T x + λ) u A, (31) with robability at least 1 η, for any η (0, 1); for the last inequality we have used that the first factor is less than 1, and for the second factor Proosition 12 in cobination with the Cordes inequality (see Proosition 22 in the Aendix). In articular, for any ax( λ n, n 1 ) λ (with λ n as in Corollary 13) T u A 80 u log 2u (2η 1 ) ( T x + λ) u A, (32) with robability at least 1 η. In the following, we constantly use (31). Furtherore, to bound ters involving residuals we will frequently use the following estiate: for v 0, u 0, 2], and rovided u + v q (q being the qualification): ( su r λ (t)t v (t + λ) u 2 t 0,1] using twice (11) since q u + v. su t 0,1] rλ (t)t v+u + λ u su t 0,1] ) r λ (t)t v C λ v+u, (33) 20

21 Parallelizing Sectral Algoriths 6.2 Aroxiation error bound Recall that ν denotes the X-arginal of the saling distribution ρ and P the set of all robability distributions on the inut sace X. Lea 15 Let ν P, v R and let x X n be an i.i.d. sale of size n/, drawn according to ν. Assue the regularization (g λ ) λ has qualification q v s. Then with robability at least 1 η: T s r λ ( T x ) T x( v T T x ) ( ) C log 4 (4η 1 )λ s+v+1 B s+1 N (λ) n (λ) nλ +. nλ Proof of Lea 15] Fro (30),(31) and fro Proosition 20 recalled in the Aendix, one has T s r λ ( T x ) T x( v T T x ) C log 2(s+1) (4η 1 )B s+1 n (λ) ( T x + λ) s r λ ( T x ) T x( v T x + λ) ( T + λ) 1 ( T T x ) C log 4 (4η 1 )λ s+v+1 B s+1 n (λ) ( nλ + ) N (λ), nλ for any λ (0, 1], η (0, 1], with robability at least 1 η. We also used that s 1 2, and the estiate (33). Lea 16 Let ν P, v R and let x X n be an i.i.d. sale of size n/ drawn according to ν. Assue the regularization (g λ ) λ has qualification q v + s. Then for any λ (0, 1], η (0, 1], with robability at least 1 η T s r λ ( T x ) T x v C log 2s (2η 1 )B s n (λ)λ s+v, for soe C <. Proof of Lea 16] Using (31), (33), since q v + s, it holds T s r λ ( T x ) T x v C log 2s (2η 1 )B s n (λ) ( Tx + λ) s r λ ( T x ) T v x with robability at least 1 η. C log 2s (2η 1 )B s n (λ)λ s+v, Proosition 17 (Exectation of aroxiation error) Let f ρ Ω(r, R), λ (0, 1] and let B n (λ) be defined in (29). Assue the regularization has qualification q r + s. For any 1 one has: 21

22 Mücke and Blanchard 1. If r 1, then 2. If r > 1, then T s (f ρ f D) λ ] 1 T s (f ρ f D) λ ] 1 C, R λ s+r B s+r n (λ). ( ( )) C, Rλ s B s+1 n (λ) λ r N (λ) + λ nλ + nλ. In 1. and 2. the constant C, does not deend on (σ, M, R) R 3 +. Proof of Proosition 17] Since f ρ Ω(r, R), T s (f ρ f D) λ ] 1 = 1 R 1 T s r λ ( T ] 1 xj )f ρ T s r λ ( T ] 1 xj )f ρ T s r λ ( T xj ) T r ] 1. (34) The first inequality is just the triangle inequality for the -nor f = E f ] 1. We bound the exectation for each searate subsale of size n by first deriving a robabilistic estiate and then we integrate. Consider first the case where r 1. Using (31), the Cordes inequality (Proosition 22 in the Aendix), and (33) one has for any j = 1,...,, T s r λ ( T xj ) T r C log 2(s+r) (4η 1 )B s+r n C log 3 (4η 1 )λ s+r B s+r n (λ) ( T xj + λ) s r λ ( T xj )( T xj + λ) r with robability at least 1 η and where B n (λ) is defined in (29). Recall that the regularization has qualification q r + s. By integration one has T s r λ ( T xj ) T r ] 1 C, λ s+r B s+r n (λ), (λ), for soe C, <, not deending on σ, M, R. Finally, fro (34) T s (f ρ f D) λ ] 1 C, R λ s+r B s+r n (λ). In the case where r 1, we write r = k + u, with k = r and u = r k < 1. We shall use the decoosition T k = k 1 l=0 T l x( T T x ) T k (l+1) + T k x. (35) 22

23 Parallelizing Sectral Algoriths We roceed by bounding (34) according to decoosition (35). For any j = 1,...,, one has T s r λ ( T xj ) T k+u ] 1 k 1 l=0 + k 1 l=0 + T s r λ ( T xj ) T x l j ( T T xj ) T k (l+1)+u ] 1 T s r λ ( T xj ) T x k u j T ] 1 T s r λ ( T xj ) T x l j ( T T xj ) ] 1 T s r λ ( T xj ) T x k u j T ] 1. (36) Here we use that T k (l+1)+u is bounded by 1. By Lea 16 and by (31), (33), with robability at least 1 η T s r λ ( T xj ) T x k u j T C log 2(s+u) (2η 1 )B s+u n (λ) ( T xj + λ) s r λ ( T xj ) T x k j ( T xj + λ) u C log 2(s+u) (2η 1 )B s+u n (λ)λ s+r, and thus integration yields T s r λ ( T xj ) T x r u j T ] 1 C, B s+u n (λ)λ s+r. (37) For estiating the first ter in (36) we ay use Lea 15. For any l = 0,..., k 1, we have l + s + 1 k + s r + s q, hence for any j = 1,..., with robability at least 1 η ( T s r λ ( T xj ) T x l j ( T T ) xj ) C log 4 (8η 1 )λ s+l+1 B s+1 N (λ) n (λ) nλ +. nλ Again by integration, since λ l 1 for any l = 0,..., k 1, one has k 1 l=0 T s r λ ( T xj ) T x l j ( T T xj ) ( ) ] 1 C, r λ s+1 B s+1 N (λ) n (λ) nλ + nλ. (38) Finally, cobining (37) and (38) into (36), then (34), gives in the case where r > 1 T s (f ρ f D) λ ( ( )) ] 1 C H, λ s B s+1 n (λ) λ r N (λ) + λ K nλ +. nλ The rest of the roof follows fro (36). 23

24 Mücke and Blanchard Proof of Theore 3] Let λ n be as defined by (16). According to Corollary 14, we have B n (λ n ) 2 rovided α < 2br n 2br+b+1, for n sufficiently large. We can also assue n sufficiently large so that λ n < 1, i.e., Rλ r+s n = a n (fro (16), (17)). Under these conditions, we iediately obtain fro the first art of Proosition 17 in the case where r 1 T s (f ρ ] λn f D ) 1 C, R λ s+r n = C, a n. We turn to the case where r > 1. We aly the second art of Proosition 17. By Corollary 14 we have T s (f ρ ] λn f D ) 1 C, Rλ s nb s+1 n n C, Rλ s n where we used that N (λ n ) C λ 1/b n λ n, and λ n < 1. Furtherore, rovided and σ (λ n ) λ r n + λ n n nλ n + ( λ r n + λ n ( n nλ n + n R σ λr n λ 1 b n nλ n n = o ( nλ n λ r ) n, n n n α, α < n N (λ n ) nλ n )), = Rλ r n coing fro the definition of 2(br + 1) 2br + b + 1. Finally, for n sufficiently large, R σ n λ n 1, rovided that As a result, for any 1: li su n su ρ M σ,m,r α < for soe C, <, not deending on σ, M, R. 2b 2br + b + 1. ] T s λn (f ρ f D ) 1 C,, a n 6.3 Sale error bound The ain idea for deriving an uer bound for the sale error is to identify it as a su of unbiased Hilbert sace-valued i.i.d. variables and then to aly a suitable version of Rosenthal s inequality. 24

25 Parallelizing Sectral Algoriths Given λ (0, 1], we define the rando variable ξ λ : (X R) n ξ λ (x, y) := T s g λ ( T x )( T x f ρ S xy). by Recall that according to Assution (3), the conditional exectation w.r.t. ρ of Y given X satisfies E ρ Y X = x] = S x f ρ, ilying that ξ λ has zero exectation (since T x = S x S x ). Thus, T s ( f D λ f D) λ = 1 ξ λ (x j, y j ) (39) is a su of centered i.i.d. rando variables. Furtherore, we need the following result (Pinelis, 1994, Theore 5.2), which generalizes Rosenthal s (1970) inequalities (originally only forulated for real valued rando variables) to rando variables with values in a Banach sace. For Hilbert saces this looks articularly nice. Proosition 18 Let H be a Hilbert sace and ξ 1,..., ξ be a finite sequence of indeendent, ean zero H- valued rando variables. If 2 <, then there exists a constant C > 0, only deending on, such that ( ) 1 {( E 1 ξ j C ) 1 ( H ax E ξ j ) 1 } 2 H, E ξ j 2 H. (40) We reark in assing that Dirksen (2011), Corollary 1.22, establishes the interesting result that in addition to the uer bound in (40) there is also a corresonding lower bound where the constant C is relaced by another constant C > 0, only deending on. Proosition 19 (Exectation of sale error) Let ρ be a source distribution belonging to M σ,m,r, s 0, 1 2 ] and let λ (0, 1]. Define B n (λ) as in (29). Assue the regularization has qualification q r + s. For any 1 one has: T s ( f D λ f D) λ ( ) ] 1 C H, 1 2 B n (λ) 1 2 +s λ s M N (λ) K nλ + σ, nλ where C does not deend on (σ, M, R) R 3 +. Proof of Proosition 19] Let λ (0, 1] and 2. Fro Proosition 18 T s λ f D f ] D λ 1 = 1 ξ λ (x j, y j ) { ( C ax ] E ) 1 ( ρ n ξ λ (x j, y j ) ] HK, E ) } 1 ρ n ξ λ (x j, y j ) 2 2 HK. (41) ] 1 25

26 Mücke and Blanchard Again, the estiates in exectation will follow fro integrating a bound holding with high robability. By (31), one has for any j = 1,...,, ξ λ (x j, y j ) HK = T s g λ ( T xj )( T xj f ρ S x j y j ) HK 8 log 2s (4η 1 )B n (λ)s ( T xj + λ) s g λ ( T xj )( T xj f ρ S x j y j ) HK, (42) holding with robability at least 1 η 2, where B n (λ) is defined in (29). We roceed by slitting: ( T xj + λ) s g λ ( T xj )( T xj f ρ S x j y j ) = H x (1) j H x (2) j h λ z j, with H (1) x j := ( T xj + λ) s g λ ( T xj )( T xj + λ) 1 2, H (2) x j := ( T xj + λ) 1 2 ( T + λ) 1 2, h λ z j := ( T + λ) 1 2 ( Txj f ρ S x j y j ) HK. The first ter is estiated using (9),(10) and gives for s 0, 1 2 ] ( ) su g λ (t)(t + λ) s+ 1 2 H (1) x j t 0,1] ( 2 su g λ (t)t s λ s+ 1 2 t 0,1] ( 2( su g λ (t) t 0,1] ) 1 2 s ( su t 0,1] ) su g λ (t) t 0,1] g λ (t)t ) s+ 1 2 ) + λ s+ 1 2 su g λ (t) t 0,1] C λ s 1 2. (43) The second ter is now bounded using (31) once ore. One has with robability at least 1 η 4 H (2) x j Finally, h λ z j is estiated using Proosition 21: 8 log(8η 1 )B n (λ) 1 2. (44) ( ) h λ z j 2 log(8η 1 M N (λ) ) n λ + σ, (45) n holding with robability at least 1 η 4. Thus, cobining (43), (44) and (45) with (42) gives for any j = 1,...,, ξ λ (x j, y j ) HK C log 2(s+1) (8η 1 )B n (λ) 1 2 +s λ s with robability at least 1 η. Integration gives for any 2: ( M nλ + σ ξ λ (x j, y j ) ] C H, A, K 26 ) N (λ), nλ

27 Parallelizing Sectral Algoriths with A := A n (λ) := B n (λ) 1 2 +s λ s ( M N (λ) nλ + σ nλ Cobining this with (41) ilies, since 2: T s ( f D λ f D) λ ] 1 C (, ax (A ) 1, ( A 2) ) 1 2 = C ( ), A ax 1, 1 2 = C, A, where C, does not deend on (σ, M, R) R 3 +. The result for the case 1 2 iediately follows fro Hölder s inequality. ). Proof of Theore 4] Let λ n be as defined by (16); as earlier we assue n is big enough so that λ n < 1. According to Corollary 14, we have B n (λ n) 2 rovided α < 2br 2br+b+1 and n is sufficiently large. Under this condition we iediately obtain fro Proosition 19: ] T s λn λn ( f D f D ) 1 C, λ s M N (λ n ) n + σ nλ n nλ n C, λ s M λ 1 b n n + σ, nλ n nλ n where we used again that N (λ n ) C λn 1/b ; now n M λn = o σ 1/b, nλ n nλ n rovided Recalling that σ λ 1/b n nλ n As a result, for any 1: li su n n n α, α < = Rλ r n = λ s n a n, we arrive at su T s ( ρ M σ,m,r f λn D ] λn f D ) 1 T s ( f λn D 2(br + 1) 2br + b + 1. C, a n. ] λn f D ) 1 C,, a n for soe C,, not deending on the odel araeters (σ, M, R) R 3 +, thus leading to the conclusion. 27

arxiv: v4 [math.st] 9 Aug 2017

arxiv: v4 [math.st] 9 Aug 2017 PARALLELIZING SPECTRAL ALGORITHMS FOR KERNEL LEARNING GILLES BLANCHARD AND NICOLE MÜCKE arxiv:161007487v4 [athst] 9 Aug 2017 Abstract We consider a distributed learning aroach in suervised learning for

More information

Approximation by Piecewise Constants on Convex Partitions

Approximation by Piecewise Constants on Convex Partitions Aroxiation by Piecewise Constants on Convex Partitions Oleg Davydov Noveber 4, 2011 Abstract We show that the saturation order of iecewise constant aroxiation in L nor on convex artitions with N cells

More information

[95/95] APPROACH FOR DESIGN LIMITS ANALYSIS IN VVER. Shishkov L., Tsyganov S. Russian Research Centre Kurchatov Institute Russian Federation, Moscow

[95/95] APPROACH FOR DESIGN LIMITS ANALYSIS IN VVER. Shishkov L., Tsyganov S. Russian Research Centre Kurchatov Institute Russian Federation, Moscow [95/95] APPROACH FOR DESIGN LIMITS ANALYSIS IN VVER Shishkov L., Tsyganov S. Russian Research Centre Kurchatov Institute Russian Federation, Moscow ABSTRACT The aer discusses a well-known condition [95%/95%],

More information

NONNEGATIVE matrix factorization finds its application

NONNEGATIVE matrix factorization finds its application Multilicative Udates for Convolutional NMF Under -Divergence Pedro J. Villasana T., Stanislaw Gorlow, Meber, IEEE and Arvind T. Hariraan arxiv:803.0559v2 [cs.lg 5 May 208 Abstract In this letter, we generalize

More information

Edinburgh Research Explorer

Edinburgh Research Explorer Edinburgh Research Exlorer ALMOST-ORTHOGONALITY IN THE SCHATTEN-VON NEUMANN CLASSES Citation for ublished version: Carbery, A 2009, 'ALMOST-ORTHOGONALITY IN THE SCHATTEN-VON NEUMANN CLASSES' Journal of

More information

AN EXPLICIT METHOD FOR NUMERICAL SIMULATION OF WAVE EQUATIONS

AN EXPLICIT METHOD FOR NUMERICAL SIMULATION OF WAVE EQUATIONS The 4 th World Conference on Earthquake Engineering October -7, 8, Beiing, China AN EXPLICIT ETHOD FOR NUERICAL SIULATION OF WAVE EQUATIONS Liu Heng and Liao Zheneng Doctoral Candidate, Det. of Structural

More information

5. Dimensional Analysis. 5.1 Dimensions and units

5. Dimensional Analysis. 5.1 Dimensions and units 5. Diensional Analysis In engineering the alication of fluid echanics in designs ake uch of the use of eirical results fro a lot of exerients. This data is often difficult to resent in a readable for.

More information

J.B. LASSERRE AND E.S. ZERON

J.B. LASSERRE AND E.S. ZERON L -NORMS, LOG-BARRIERS AND CRAMER TRANSFORM IN OPTIMIZATION J.B. LASSERRE AND E.S. ZERON Abstract. We show that the Lalace aroxiation of a sureu by L -nors has interesting consequences in otiization. For

More information

Frequency Domain Analysis of Rattle in Gear Pairs and Clutches. Abstract. 1. Introduction

Frequency Domain Analysis of Rattle in Gear Pairs and Clutches. Abstract. 1. Introduction The 00 International Congress and Exosition on Noise Control Engineering Dearborn, MI, USA. August 9-, 00 Frequency Doain Analysis of Rattle in Gear Pairs and Clutches T. C. Ki and R. Singh Acoustics and

More information

EXACT BOUNDS FOR JUDICIOUS PARTITIONS OF GRAPHS

EXACT BOUNDS FOR JUDICIOUS PARTITIONS OF GRAPHS EXACT BOUNDS FOR JUDICIOUS PARTITIONS OF GRAPHS B. BOLLOBÁS1,3 AND A.D. SCOTT,3 Abstract. Edwards showed that every grah of size 1 has a biartite subgrah of size at least / + /8 + 1/64 1/8. We show that

More information

arxiv: v1 [math.ds] 19 Jun 2012

arxiv: v1 [math.ds] 19 Jun 2012 Rates in the strong invariance rincile for ergodic autoorhiss of the torus Jérôe Dedecker a, Florence Merlevède b and Françoise Pène c 1 a Université Paris Descartes, Sorbonne Paris Cité, Laboratoire MAP5

More information

INTERIOR BALLISTIC PRINCIPLE OF HIGH/LOW PRESSURE CHAMBERS IN AUTOMATIC GRENADE LAUNCHERS

INTERIOR BALLISTIC PRINCIPLE OF HIGH/LOW PRESSURE CHAMBERS IN AUTOMATIC GRENADE LAUNCHERS XXXX IB08 19th International Syosiu of Ballistics, 7 11 May 001, Interlaken, Switzerland INTERIOR BALLISTIC PRINCIPLE OF HIGH/LOW PRESSURE CHAMBERS IN AUTOMATIC GRENADE LAUNCHERS S. Jaraaz1, D. Micković1,

More information

DISCRETE DUALITY FINITE VOLUME SCHEMES FOR LERAY-LIONS TYPE ELLIPTIC PROBLEMS ON GENERAL 2D MESHES

DISCRETE DUALITY FINITE VOLUME SCHEMES FOR LERAY-LIONS TYPE ELLIPTIC PROBLEMS ON GENERAL 2D MESHES ISCRETE UALITY FINITE VOLUME SCHEMES FOR LERAY-LIONS TYPE ELLIPTIC PROBLEMS ON GENERAL 2 MESHES BORIS ANREIANOV, FRANCK BOYER AN FLORENCE HUBERT Abstract. iscrete duality finite volue schees on general

More information

Lecture 3: October 2, 2017

Lecture 3: October 2, 2017 Inforation and Coding Theory Autun 2017 Lecturer: Madhur Tulsiani Lecture 3: October 2, 2017 1 Shearer s lea and alications In the revious lecture, we saw the following stateent of Shearer s lea. Lea 1.1

More information

Numerical Method for Obtaining a Predictive Estimator for the Geometric Distribution

Numerical Method for Obtaining a Predictive Estimator for the Geometric Distribution British Journal of Matheatics & Couter Science 19(5): 1-13, 2016; Article no.bjmcs.29941 ISSN: 2231-0851 SCIENCEDOMAIN international www.sciencedoain.org Nuerical Method for Obtaining a Predictive Estiator

More information

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical and Matheatical Sciences 13,,. 8 14 M a t h e a t i c s ON BOUNDEDNESS OF A CLASS OF FIRST ORDER LINEAR DIFFERENTIAL OPERATORS IN THE SPACE OF n 1)-DIMENSIONALLY

More information

Exploiting Matrix Symmetries and Physical Symmetries in Matrix Product States and Tensor Trains

Exploiting Matrix Symmetries and Physical Symmetries in Matrix Product States and Tensor Trains Exloiting Matrix Syetries and Physical Syetries in Matrix Product States and Tensor Trains Thoas K Huckle a and Konrad Waldherr a and Thoas Schulte-Herbrüggen b a Technische Universität München, Boltzannstr

More information

SUPPORTING INFORMATION FOR. Mass Spectrometrically-Detected Statistical Aspects of Ligand Populations in Mixed Monolayer Au 25 L 18 Nanoparticles

SUPPORTING INFORMATION FOR. Mass Spectrometrically-Detected Statistical Aspects of Ligand Populations in Mixed Monolayer Au 25 L 18 Nanoparticles SUPPORTIG IFORMATIO FOR Mass Sectroetrically-Detected Statistical Asects of Lig Poulations in Mixed Monolayer Au 25 L 8 anoarticles Aala Dass,,a Kennedy Holt, Joseh F. Parer, Stehen W. Feldberg, Royce

More information

arxiv: v2 [math.st] 13 Feb 2018

arxiv: v2 [math.st] 13 Feb 2018 A data-deendent weighted LASSO under Poisson noise arxiv:1509.08892v2 [ath.st] 13 Feb 2018 Xin J. Hunt, SAS Institute Inc., Cary, NC USA Patricia Reynaud-Bouret University of Côte d Azur, CNRS, LJAD, Nice,

More information

Modi ed Local Whittle Estimator for Long Memory Processes in the Presence of Low Frequency (and Other) Contaminations

Modi ed Local Whittle Estimator for Long Memory Processes in the Presence of Low Frequency (and Other) Contaminations Modi ed Local Whittle Estiator for Long Meory Processes in the Presence of Low Frequency (and Other Containations Jie Hou y Boston University Pierre Perron z Boston University March 5, 203; Revised: January

More information

Handout 6 Solutions to Problems from Homework 2

Handout 6 Solutions to Problems from Homework 2 CS 85/185 Fall 2003 Lower Bounds Handout 6 Solutions to Probles fro Hoewor 2 Ait Charabarti Couter Science Dartouth College Solution to Proble 1 1.2: Let f n stand for A 111 n. To decide the roerty f 3

More information

Minimizing Machinery Vibration Transmission in a Lightweight Building using Topology Optimization

Minimizing Machinery Vibration Transmission in a Lightweight Building using Topology Optimization 1 th World Congress on Structural and Multidiscilinary Otiization May 19-4, 13, Orlando, Florida, USA Miniizing Machinery Vibration ransission in a Lightweight Building using oology Otiization Niels Olhoff,

More information

Some simple continued fraction expansions for an in nite product Part 1. Peter Bala, January ax 4n+3 1 ax 4n+1. (a; x) =

Some simple continued fraction expansions for an in nite product Part 1. Peter Bala, January ax 4n+3 1 ax 4n+1. (a; x) = Soe sile continued fraction exansions for an in nite roduct Part. Introduction The in nite roduct Peter Bala, January 3 (a; x) = Y ax 4n+3 ax 4n+ converges for arbitrary colex a rovided jxj

More information

Optimal Adaptive Computations in the Jaffard Algebra and Localized Frames

Optimal Adaptive Computations in the Jaffard Algebra and Localized Frames www.oeaw.ac.at Otial Adative Coutations in the Jaffard Algebra and Localized Fraes M. Fornasier, K. Gröchenig RICAM-Reort 2006-28 www.rica.oeaw.ac.at Otial Adative Coutations in the Jaffard Algebra and

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Introduction to Neural Networks : Lecture 13 John A. Bullinaria, 2004 1. The RBF Maing 2. The RBF Network Architecture 3. Comutational Power of RBF Networks 4.

More information

Control and Stability of the Time-delay Linear Systems

Control and Stability of the Time-delay Linear Systems ISSN 746-7659, England, UK Journal of Inforation and Couting Science Vol., No. 4, 206,.29-297 Control and Stability of the Tie-delay Linear Systes Negras Tahasbi *, Hojjat Ahsani Tehrani Deartent of Matheatics,

More information

Uniform Deviation Bounds for k-means Clustering

Uniform Deviation Bounds for k-means Clustering Unifor Deviation Bounds for k-means Clustering Olivier Bache Mario Lucic S Haed Hassani Andreas Krause Abstract Unifor deviation bounds liit the difference between a odel s exected loss and its loss on

More information

#A62 INTEGERS 16 (2016) REPRESENTATION OF INTEGERS BY TERNARY QUADRATIC FORMS: A GEOMETRIC APPROACH

#A62 INTEGERS 16 (2016) REPRESENTATION OF INTEGERS BY TERNARY QUADRATIC FORMS: A GEOMETRIC APPROACH #A6 INTEGERS 16 (016) REPRESENTATION OF INTEGERS BY TERNARY QUADRATIC FORMS: A GEOMETRIC APPROACH Gabriel Durha Deartent of Matheatics, University of Georgia, Athens, Georgia gjdurha@ugaedu Received: 9/11/15,

More information

3.8 Three Types of Convergence

3.8 Three Types of Convergence 3.8 Three Types of Convergence 3.8 Three Types of Convergence 93 Suppose that we are given a sequence functions {f k } k N on a set X and another function f on X. What does it ean for f k to converge to

More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Coputational and Statistical Learning Theory Proble sets 5 and 6 Due: Noveber th Please send your solutions to learning-subissions@ttic.edu Notations/Definitions Recall the definition of saple based Radeacher

More information

The CIA (consistency in aggregation) approach A new economic approach to elementary indices

The CIA (consistency in aggregation) approach A new economic approach to elementary indices The CIA (consistency in aggregation) aroach A new econoic aroach to eleentary indices Dr Jens ehrhoff*, Head of Section Business Cycle and Structural Econoic Statistics * Jens This ehrhoff, resentation

More information

Scaled Enflo type is equivalent to Rademacher type

Scaled Enflo type is equivalent to Rademacher type Scaled Enflo tye is equivalent to Radeacher tye Manor Mendel California Institute of Technology Assaf Naor Microsoft Research Abstract We introduce the notion of scaled Enflo tye of a etric sace, and show

More information

MULTIPLIER IDEALS OF SUMS VIA CELLULAR RESOLUTIONS

MULTIPLIER IDEALS OF SUMS VIA CELLULAR RESOLUTIONS MULTIPLIER IDEALS OF SUMS VIA CELLULAR RESOLUTIONS SHIN-YAO JOW AND EZRA MILLER Abstract. Fix nonzero ideal sheaves a 1,..., a r and b on a noral Q-Gorenstein colex variety X. For any ositive real nubers

More information

Analysis of low rank matrix recovery via Mendelson s small ball method

Analysis of low rank matrix recovery via Mendelson s small ball method Analysis of low rank atrix recovery via Mendelson s sall ball ethod Maryia Kabanava Chair for Matheatics C (Analysis) ontdriesch 0 kabanava@athc.rwth-aachen.de Holger Rauhut Chair for Matheatics C (Analysis)

More information

CHAPTER 2 THERMODYNAMICS

CHAPTER 2 THERMODYNAMICS CHAPER 2 HERMODYNAMICS 2.1 INRODUCION herodynaics is the study of the behavior of systes of atter under the action of external fields such as teerature and ressure. It is used in articular to describe

More information

ACCURACY OF THE DISCRETE FOURIER TRANSFORM AND THE FAST FOURIER TRANSFORM

ACCURACY OF THE DISCRETE FOURIER TRANSFORM AND THE FAST FOURIER TRANSFORM SIAM J. SCI. COMPUT. c 1996 Society for Industrial and Alied Matheatics Vol. 17, o. 5,. 1150 1166, Seteber 1996 008 ACCURACY OF THE DISCRETE FOURIER TRASFORM AD THE FAST FOURIER TRASFORM JAMES C. SCHATZMA

More information

FRESNEL FORMULAE FOR SCATTERING OPERATORS

FRESNEL FORMULAE FOR SCATTERING OPERATORS elecounications and Radio Engineering, 70(9):749-758 (011) MAHEMAICAL MEHODS IN ELECROMAGNEIC HEORY FRESNEL FORMULAE FOR SCAERING OPERAORS I.V. Petrusenko & Yu.K. Sirenko A. Usikov Institute of Radio Physics

More information

Uniform Deviation Bounds for k-means Clustering

Uniform Deviation Bounds for k-means Clustering Unifor Deviation Bounds for k-means Clustering Olivier Bache Mario Lucic S. Haed Hassani Andreas Krause Abstract Unifor deviation bounds liit the difference between a odel s exected loss and its loss on

More information

CALCULATION of CORONA INCEPTION VOLTAGES in N 2 +SF 6 MIXTURES via GENETIC ALGORITHM

CALCULATION of CORONA INCEPTION VOLTAGES in N 2 +SF 6 MIXTURES via GENETIC ALGORITHM CALCULATION of COONA INCPTION VOLTAGS in N +SF 6 MIXTUS via GNTIC ALGOITHM. Onal G. Kourgoz e-ail: onal@elk.itu.edu.tr e-ail: guven@itu.edu..edu.tr Istanbul Technical University, Faculty of lectric and

More information

A Constraint View of IBD Graphs

A Constraint View of IBD Graphs A Constraint View of IBD Grahs Rina Dechter, Dan Geiger and Elizabeth Thoson Donald Bren School of Inforation and Couter Science University of California, Irvine, CA 92697 1 Introduction The reort rovides

More information

IOSIF PINELIS. 1. INTRODUCTION The Euler Maclaurin (EM) summation formula can be written as follows: 2m 1. B j j! [ f( j 1) (n) f ( j 1) (0)], (EM)

IOSIF PINELIS. 1. INTRODUCTION The Euler Maclaurin (EM) summation formula can be written as follows: 2m 1. B j j! [ f( j 1) (n) f ( j 1) (0)], (EM) APPROXIMATING SUMS BY INTEGRALS ONLY: MULTIPLE SUMS AND SUMS OVER LATTICE POLYTOPES arxiv:1705.09159v7 [ath.ca] 29 Oct 2017 IOSIF PINELIS ABSTRACT. The Euler Maclaurin (EM) suation forula is used in any

More information

Design of Linear-Phase Two-Channel FIR Filter Banks with Rational Sampling Factors

Design of Linear-Phase Two-Channel FIR Filter Banks with Rational Sampling Factors R. Bregović and. Saraäi, Design of linear hase two-channel FIR filter bans with rational saling factors, Proc. 3 rd Int. Sy. on Iage and Signal Processing and Analysis, Roe, Italy, Set. 3,. 749 754. Design

More information

Input-Output (I/O) Stability. -Stability of a System

Input-Output (I/O) Stability. -Stability of a System Inut-Outut (I/O) Stability -Stability of a Syste Outline: Introduction White Boxes and Black Boxes Inut-Outut Descrition Foralization of the Inut-Outut View Signals and Signal Saces he Notions of Gain

More information

On spinors and their transformation

On spinors and their transformation AMERICAN JOURNAL OF SCIENTIFIC AND INDUSTRIAL RESEARCH, Science Huβ, htt:www.scihub.orgajsir ISSN: 5-69X On sinors and their transforation Anaitra Palit AuthorTeacher, P5 Motijheel Avenue, Flat C,Kolkata

More information

1. (2.5.1) So, the number of moles, n, contained in a sample of any substance is equal N n, (2.5.2)

1. (2.5.1) So, the number of moles, n, contained in a sample of any substance is equal N n, (2.5.2) Lecture.5. Ideal gas law We have already discussed general rinciles of classical therodynaics. Classical therodynaics is a acroscoic science which describes hysical systes by eans of acroscoic variables,

More information

Shannon Sampling II. Connections to Learning Theory

Shannon Sampling II. Connections to Learning Theory Shannon Sapling II Connections to Learning heory Steve Sale oyota echnological Institute at Chicago 147 East 60th Street, Chicago, IL 60637, USA E-ail: sale@athberkeleyedu Ding-Xuan Zhou Departent of Matheatics,

More information

Computationally Efficient Control System Based on Digital Dynamic Pulse Frequency Modulation for Microprocessor Implementation

Computationally Efficient Control System Based on Digital Dynamic Pulse Frequency Modulation for Microprocessor Implementation IJCSI International Journal of Couter Science Issues, Vol. 0, Issue 3, No 2, May 203 ISSN (Print): 694-084 ISSN (Online): 694-0784 www.ijcsi.org 20 Coutationally Efficient Control Syste Based on Digital

More information

Security Transaction Differential Equation

Security Transaction Differential Equation Security Transaction Differential Equation A Transaction Volue/Price Probability Wave Model Shi, Leilei This draft: June 1, 4 Abstract Financial arket is a tyical colex syste because it is an oen trading

More information

EGN 3353C Fluid Mechanics

EGN 3353C Fluid Mechanics Lecture 4 When nondiensionalizing an equation, nondiensional araeters often aear. Exale Consider an object falling due to gravity in a vacuu d z ays: (1) the conventional diensional aroach, and () diensionless

More information

One- and multidimensional Fibonacci search very easy!

One- and multidimensional Fibonacci search very easy! One and ultidiensional ibonacci search One and ultidiensional ibonacci search very easy!. Content. Introduction / Preliinary rearks...page. Short descrition of the ibonacci nubers...page 3. Descrition

More information

Binomial and Poisson Probability Distributions

Binomial and Poisson Probability Distributions Binoial and Poisson Probability Distributions There are a few discrete robability distributions that cro u any ties in hysics alications, e.g. QM, SM. Here we consider TWO iortant and related cases, the

More information

A STUDY OF UNSUPERVISED CHANGE DETECTION BASED ON TEST STATISTIC AND GAUSSIAN MIXTURE MODEL USING POLSAR SAR DATA

A STUDY OF UNSUPERVISED CHANGE DETECTION BASED ON TEST STATISTIC AND GAUSSIAN MIXTURE MODEL USING POLSAR SAR DATA A STUDY OF UNSUPERVISED CHANGE DETECTION BASED ON TEST STATISTIC AND GAUSSIAN MIXTURE MODEL USING POLSAR SAR DATA Yang Yuxin a, Liu Wensong b* a Middle School Affiliated to Central China Noral University,

More information

Applications to stochastic PDE

Applications to stochastic PDE 15 Alications to stochastic PE In this final lecture we resent some alications of the theory develoed in this course to stochastic artial differential equations. We concentrate on two secific examles:

More information

Ayşe Alaca, Şaban Alaca and Kenneth S. Williams School of Mathematics and Statistics, Carleton University, Ottawa, Ontario, Canada. Abstract.

Ayşe Alaca, Şaban Alaca and Kenneth S. Williams School of Mathematics and Statistics, Carleton University, Ottawa, Ontario, Canada. Abstract. Journal of Cobinatorics and Nuber Theory Volue 6, Nuber,. 17 15 ISSN: 194-5600 c Nova Science Publishers, Inc. DOUBLE GAUSS SUMS Ayşe Alaca, Şaban Alaca and Kenneth S. Willias School of Matheatics and

More information

Mistiming Performance Analysis of the Energy Detection Based ToA Estimator for MB-OFDM

Mistiming Performance Analysis of the Energy Detection Based ToA Estimator for MB-OFDM IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS Mistiing Perforance Analysis of the Energy Detection Based ToA Estiator for MB-OFDM Huilin Xu, Liuqing Yang contact author, Y T Jade Morton and Mikel M Miller

More information

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices CS71 Randoness & Coputation Spring 018 Instructor: Alistair Sinclair Lecture 13: February 7 Disclaier: These notes have not been subjected to the usual scrutiny accorded to foral publications. They ay

More information

1 Generalization bounds based on Rademacher complexity

1 Generalization bounds based on Rademacher complexity COS 5: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #0 Scribe: Suqi Liu March 07, 08 Last tie we started proving this very general result about how quickly the epirical average converges

More information

PROJECTIONS IN VECTOR SPACES OVER FINITE FIELDS

PROJECTIONS IN VECTOR SPACES OVER FINITE FIELDS Annales Acadeiæ Scientiaru Fennicæ Matheatica Voluen 43, 2018, 171 185 PROJECTIONS IN VECTOR SPACES OVER FINITE FIELDS Changhao Chen The University of New South Wales, School of Matheatics and Statistics

More information

The Construction of Orthonormal Wavelets Using Symbolic Methods and a Matrix Analytical Approach for Wavelets on the Interval

The Construction of Orthonormal Wavelets Using Symbolic Methods and a Matrix Analytical Approach for Wavelets on the Interval The Construction of Orthonoral Wavelets Using Sybolic Methods and a Matrix Analytical Aroach for Wavelets on the Interval Frédéric Chyzak, Peter Paule, Otar Scherzer, Arin Schoisswohl, and Burkhard Zierann

More information

A Simple Regression Problem

A Simple Regression Problem A Siple Regression Proble R. M. Castro March 23, 2 In this brief note a siple regression proble will be introduced, illustrating clearly the bias-variance tradeoff. Let Y i f(x i ) + W i, i,..., n, where

More information

On Maximizing the Convergence Rate for Linear Systems With Input Saturation

On Maximizing the Convergence Rate for Linear Systems With Input Saturation IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 48, NO. 7, JULY 2003 1249 On Maxiizing the Convergence Rate for Linear Systes With Inut Saturation Tingshu Hu, Zongli Lin, Yacov Shaash Abstract In this note,

More information

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis E0 370 tatistical Learning Theory Lecture 6 (Aug 30, 20) Margin Analysis Lecturer: hivani Agarwal cribe: Narasihan R Introduction In the last few lectures we have seen how to obtain high confidence bounds

More information

A Subspace Iteration for Calculating a Cluster of Exterior Eigenvalues

A Subspace Iteration for Calculating a Cluster of Exterior Eigenvalues Advances in Linear Algebra & Matrix heory 05 5 76-89 Published Online Seteber 05 in SciRes htt://wwwscirorg/ournal/alat htt://dxdoiorg/0436/alat0553008 A Subsace Iteration for Calculating a Cluster of

More information

A GENERAL THEORY OF PARTICLE FILTERS IN HIDDEN MARKOV MODELS AND SOME APPLICATIONS. By Hock Peng Chan National University of Singapore and

A GENERAL THEORY OF PARTICLE FILTERS IN HIDDEN MARKOV MODELS AND SOME APPLICATIONS. By Hock Peng Chan National University of Singapore and Subitted to the Annals of Statistics A GENERAL THEORY OF PARTICLE FILTERS IN HIDDEN MARKOV MODELS AND SOME APPLICATIONS By Hock Peng Chan National University of Singaore and By Tze Leung Lai Stanford University

More information

STRONG TYPE INEQUALITIES AND AN ALMOST-ORTHOGONALITY PRINCIPLE FOR FAMILIES OF MAXIMAL OPERATORS ALONG DIRECTIONS IN R 2

STRONG TYPE INEQUALITIES AND AN ALMOST-ORTHOGONALITY PRINCIPLE FOR FAMILIES OF MAXIMAL OPERATORS ALONG DIRECTIONS IN R 2 STRONG TYPE INEQUALITIES AND AN ALMOST-ORTHOGONALITY PRINCIPLE FOR FAMILIES OF MAXIMAL OPERATORS ALONG DIRECTIONS IN R 2 ANGELES ALFONSECA Abstract In this aer we rove an almost-orthogonality rincile for

More information

RESOLVENT ESTIMATES FOR ELLIPTIC SYSTEMS IN FUNCTION SPACES OF HIGHER REGULARITY

RESOLVENT ESTIMATES FOR ELLIPTIC SYSTEMS IN FUNCTION SPACES OF HIGHER REGULARITY Electronic Journal of Differential Equations, Vol. 2011 2011, No. 109,. 1 12. ISSN: 1072-6691. URL: htt://ejde.ath.txstate.edu or htt://ejde.ath.unt.edu ft ejde.ath.txstate.edu RESOLVENT ESTIMATES FOR

More information

POWER RESIDUES OF FOURIER COEFFICIENTS OF MODULAR FORMS

POWER RESIDUES OF FOURIER COEFFICIENTS OF MODULAR FORMS POWER RESIDUES OF FOURIER COEFFICIENTS OF MODULAR FORMS TOM WESTON Abstract Let ρ : G Q > GL nq l be a otivic l-adic Galois reresentation For fixed > 1 we initiate an investigation of the density of the

More information

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lesson 1 4 October 2017 Outline Learning and Evaluation for Pattern Recognition Notation...2 1. The Pattern Recognition

More information

Convergence rates of spectral methods for statistical inverse learning problems

Convergence rates of spectral methods for statistical inverse learning problems Convergence rates of spectral methods for statistical inverse learning problems G. Blanchard Universtität Potsdam UCL/Gatsby unit, 04/11/2015 Joint work with N. Mücke (U. Potsdam); N. Krämer (U. München)

More information

The Semantics of Data Flow Diagrams. P.D. Bruza. Th.P. van der Weide. Dept. of Information Systems, University of Nijmegen

The Semantics of Data Flow Diagrams. P.D. Bruza. Th.P. van der Weide. Dept. of Information Systems, University of Nijmegen The Seantics of Data Flow Diagras P.D. Bruza Th.P. van der Weide Det. of Inforation Systes, University of Nijegen Toernooiveld, NL-6525 ED Nijegen, The Netherlands July 26, 1993 Abstract In this article

More information

Metric Cotype. 1 Introduction. Manor Mendel California Institute of Technology. Assaf Naor Microsoft Research

Metric Cotype. 1 Introduction. Manor Mendel California Institute of Technology. Assaf Naor Microsoft Research Metric Cotye Manor Mendel California Institute of Technology Assaf Naor Microsoft Research Abstract We introduce the notion of cotye of a etric sace, and rove that for Banach saces it coincides with the

More information

An Investigation into the Effects of Roll Gyradius on Experimental Testing and Numerical Simulation: Troubleshooting Emergent Issues

An Investigation into the Effects of Roll Gyradius on Experimental Testing and Numerical Simulation: Troubleshooting Emergent Issues An Investigation into the Effects of Roll Gyradius on Exeriental esting and Nuerical Siulation: roubleshooting Eergent Issues Edward Dawson Maritie Division Defence Science and echnology Organisation DSO-N-140

More information

Quadratic Reciprocity. As in the previous notes, we consider the Legendre Symbol, defined by

Quadratic Reciprocity. As in the previous notes, we consider the Legendre Symbol, defined by Math 0 Sring 01 Quadratic Recirocity As in the revious notes we consider the Legendre Sybol defined by $ ˆa & 0 if a 1 if a is a quadratic residue odulo. % 1 if a is a quadratic non residue We also had

More information

Suppress Parameter Cross-talk for Elastic Full-waveform Inversion: Parameterization and Acquisition Geometry

Suppress Parameter Cross-talk for Elastic Full-waveform Inversion: Parameterization and Acquisition Geometry Suress Paraeter Cross-talk for Elastic Full-wavefor Inversion: Paraeterization and Acquisition Geoetry Wenyong Pan and Kris Innanen CREWES Project, Deartent of Geoscience, University of Calgary Suary Full-wavefor

More information

The Number of Information Bits Related to the Minimum Quantum and Gravitational Masses in a Vacuum Dominated Universe

The Number of Information Bits Related to the Minimum Quantum and Gravitational Masses in a Vacuum Dominated Universe Wilfrid Laurier University Scholars Coons @ Laurier Physics and Couter Science Faculty Publications Physics and Couter Science 01 The uber of Inforation Bits Related to the Miniu Quantu and Gravitational

More information

3 Thermodynamics and Statistical mechanics

3 Thermodynamics and Statistical mechanics Therodynaics and Statistical echanics. Syste and environent The syste is soe ortion of atter that we searate using real walls or only in our ine, fro the other art of the universe. Everything outside the

More information

VACUUM chambers have wide applications for a variety of

VACUUM chambers have wide applications for a variety of JOURNAL OF THERMOPHYSICS AND HEAT TRANSFER Vol. 2, No., January March 27 Free Molecular Flows Between Two Plates Equied with Pus Chunei Cai ZONA Technology, Inc., Scottsdale, Arizona 85258 Iain D. Boyd

More information

Kernel Methods and Support Vector Machines

Kernel Methods and Support Vector Machines Intelligent Systes: Reasoning and Recognition Jaes L. Crowley ENSIAG 2 / osig 1 Second Seester 2012/2013 Lesson 20 2 ay 2013 Kernel ethods and Support Vector achines Contents Kernel Functions...2 Quadratic

More information

Support Vector Machine Classification of Uncertain and Imbalanced data using Robust Optimization

Support Vector Machine Classification of Uncertain and Imbalanced data using Robust Optimization Recent Researches in Coputer Science Support Vector Machine Classification of Uncertain and Ibalanced data using Robust Optiization RAGHAV PAT, THEODORE B. TRAFALIS, KASH BARKER School of Industrial Engineering

More information

Wind Loading for the Design of the Solar Tower

Wind Loading for the Design of the Solar Tower Wind Loading for the Design of the Solar Tower H.-J. Nieann, R. Höffer Faculty of Civil Engineering, Ruhr-University Bochu, Gerany Keywords: solar chiney, wind seed, wind turbulence and wind load u to

More information

Anomalous heat capacity for nematic MBBA near clearing point

Anomalous heat capacity for nematic MBBA near clearing point Journal of Physics: Conference Series Anoalous heat caacity for neatic MA near clearing oint To cite this article: D A Lukashenko and M Khasanov J. Phys.: Conf. Ser. 394 View the article online for udates

More information

3.3 Variational Characterization of Singular Values

3.3 Variational Characterization of Singular Values 3.3. Variational Characterization of Singular Values 61 3.3 Variational Characterization of Singular Values Since the singular values are square roots of the eigenvalues of the Heritian atrices A A and

More information

Review from last time Time Series Analysis, Fall 2007 Professor Anna Mikusheva Paul Schrimpf, scribe October 23, 2007.

Review from last time Time Series Analysis, Fall 2007 Professor Anna Mikusheva Paul Schrimpf, scribe October 23, 2007. Review fro last tie 4384 ie Series Analsis, Fall 007 Professor Anna Mikusheva Paul Schrif, scribe October 3, 007 Lecture 3 Unit Roots Review fro last tie Let t be a rando walk t = ρ t + ɛ t, ρ = where

More information

New Set of Rotationally Legendre Moment Invariants

New Set of Rotationally Legendre Moment Invariants New Set of Rotationally Legendre Moent Invariants Khalid M. Hosny Abstract Orthogonal Legendre oents are used in several attern recognition and iage rocessing alications. Translation and scale Legendre

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volue 19, 2013 htt://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Physical Acoustics Session 1PAb: Acoustics in Microfluidics and for Particle

More information

Algorithm Design and Implementation for a Mathematical Model of Factoring Integers

Algorithm Design and Implementation for a Mathematical Model of Factoring Integers IOSR Journal of Matheatics (IOSR-JM e-iss: 78-578, -ISS: 39-765X. Volue 3, Issue I Ver. VI (Jan. - Feb. 07, PP 37-4 www.iosrjournals.org Algorith Design leentation for a Matheatical Model of Factoring

More information

Learnability of Gaussians with flexible variances

Learnability of Gaussians with flexible variances Learnability of Gaussians with flexible variances Ding-Xuan Zhou City University of Hong Kong E-ail: azhou@cityu.edu.hk Supported in part by Research Grants Council of Hong Kong Start October 20, 2007

More information

Polynomials and Cryptography. Michele Elia Politecnico di Torino

Polynomials and Cryptography. Michele Elia Politecnico di Torino Polynoials and Crytograhy Michele Elia Politecnico di Torino Trento 0 arzo 0 Preable Polynoials have always occuied a roinent osition in atheatics. In recent tie their use has becoe unavoidable in crytograhy.

More information

Feature Extraction Techniques

Feature Extraction Techniques Feature Extraction Techniques Unsupervised Learning II Feature Extraction Unsupervised ethods can also be used to find features which can be useful for categorization. There are unsupervised ethods that

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee227c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee227c@berkeley.edu October

More information

DEPARTMENT OF ECONOMETRICS AND BUSINESS STATISTICS

DEPARTMENT OF ECONOMETRICS AND BUSINESS STATISTICS ISSN 1440-771X AUSTRALIA DEPARTMENT OF ECONOMETRICS AND BUSINESS STATISTICS An Iproved Method for Bandwidth Selection When Estiating ROC Curves Peter G Hall and Rob J Hyndan Working Paper 11/00 An iproved

More information

THE 2D CASE OF THE BOURGAIN-DEMETER-GUTH ARGUMENT

THE 2D CASE OF THE BOURGAIN-DEMETER-GUTH ARGUMENT THE 2D CASE OF THE BOURGAIN-DEMETER-GUTH ARGUMENT ZANE LI Let e(z) := e 2πiz and for g : [0, ] C and J [0, ], define the extension oerator E J g(x) := g(t)e(tx + t 2 x 2 ) dt. J For a ositive weight ν

More information

CAUCHY PROBLEM FOR TECHNOLOGICAL CUMULATIVE CHARGE DESIGN. Christo Christov, Svetozar Botev

CAUCHY PROBLEM FOR TECHNOLOGICAL CUMULATIVE CHARGE DESIGN. Christo Christov, Svetozar Botev SENS'6 Second Scientific Conference with International Particiation SPACE, ECOLOGY, NANOTECHNOLOGY, SAFETY 4 6 June 6, Varna, Bulgaria ----------------------------------------------------------------------------------------------------------------------------------

More information

CDS 101: Lecture 5-1 Reachability and State Space Feedback. Review from Last Week

CDS 101: Lecture 5-1 Reachability and State Space Feedback. Review from Last Week CDS 11: Lecture 5-1 Reachability an State Sace Feeback Richar M. Murray 3 October 6 Goals:! Define reachability of a control syste! Give tests for reachability of linear systes an aly to exales! Describe

More information

1 Bounding the Margin

1 Bounding the Margin COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #12 Scribe: Jian Min Si March 14, 2013 1 Bounding the Margin We are continuing the proof of a bound on the generalization error of AdaBoost

More information

Sharp Time Data Tradeoffs for Linear Inverse Problems

Sharp Time Data Tradeoffs for Linear Inverse Problems Sharp Tie Data Tradeoffs for Linear Inverse Probles Saet Oyak Benjain Recht Mahdi Soltanolkotabi January 016 Abstract In this paper we characterize sharp tie-data tradeoffs for optiization probles used

More information

Understanding Machine Learning Solution Manual

Understanding Machine Learning Solution Manual Understanding Machine Learning Solution Manual Written by Alon Gonen Edited by Dana Rubinstein Noveber 17, 2014 2 Gentle Start 1. Given S = ((x i, y i )), define the ultivariate polynoial p S (x) = i []:y

More information

Phase field modelling of microstructural evolution using the Cahn-Hilliard equation: A report to accompany CH-muSE

Phase field modelling of microstructural evolution using the Cahn-Hilliard equation: A report to accompany CH-muSE Phase field odelling of icrostructural evolution using the Cahn-Hilliard equation: A reort to accoany CH-uSE 1 The Cahn-Hilliard equation Let us consider a binary alloy of average coosition c 0 occuying

More information

Chordal Sparsity, Decomposing SDPs and the Lyapunov Equation

Chordal Sparsity, Decomposing SDPs and the Lyapunov Equation Chordal Sarsity, Decoosing SDPs and the Lyaunov Equation Richard P. Mason and Antonis Paachristodoulou Abstract Analysis questions in control theory are often forulated as Linear Matrix Inequalities and

More information

Polygonal Designs: Existence and Construction

Polygonal Designs: Existence and Construction Polygonal Designs: Existence and Construction John Hegean Departent of Matheatics, Stanford University, Stanford, CA 9405 Jeff Langford Departent of Matheatics, Drake University, Des Moines, IA 5011 G

More information