On information plus noise kernel random matrices


On information plus noise kernel random matrices

Noureddine El Karoui
Department of Statistics, UC Berkeley
First version: July 2009. This version: February 5, 2010.

Abstract

Kernel random matrices have attracted a lot of interest in recent years, from both practical and theoretical standpoints. Most of the theoretical work so far has focused on the case where the data is sampled from a low-dimensional structure. Very recently, the first results concerning kernel random matrices with high-dimensional input data were obtained, in a setting where the data was sampled from a genuinely high-dimensional structure, similar to standard assumptions in random matrix theory.

In this paper, we consider the case where the data is of the type "information + noise". In other words, each observation is the sum of two independent elements: one sampled from a low-dimensional structure, the signal part of the data, the other being high-dimensional noise, normalized to not overwhelm but still affect the signal. We consider two types of noise, spherical and elliptical.

In the spherical setting, we show that the spectral properties of kernel random matrices can be understood from a new kernel matrix, computed only from the signal part of the data, but using in general a slightly different kernel. The Gaussian kernel has some special properties in this setting. The elliptical setting, which is important from a robustness standpoint, is less prone to easy interpretation.

1 Introduction

Kernel techniques are now a standard tool of statistical practice, and kernel versions of many methods of classical multivariate statistics have now been created. A few important examples can be found in Schölkopf and Smola (2002) (see the description of kernel PCA) and in Bach and Jordan (2003) for kernel ICA, for instance. There are several ways to describe kernel methods, but one of them is to think of them as classical multivariate techniques using generalized notions of inner-product. A basic input in these techniques is a kernel matrix, i.e., an inner-product or Gram matrix, for generalized inner products. If our vectors of observations are X_1, ..., X_n, the kernel matrices studied in this paper have (i,j) entry f(‖X_i − X_j‖²) or f(X_i'X_j), for a certain f. Popular examples include the Gaussian kernel (entries exp(−‖X_i − X_j‖²/s)), the sigmoid kernel (entries tanh(κ X_i'X_j + θ)), and polynomial kernels (entries (X_i'X_j)^d). We refer to Rasmussen and Williams (2006) for more examples.

As explained in, for instance, Schölkopf and Smola (2002), kernel techniques allow practitioners to essentially do multivariate analysis in infinite-dimensional spaces, by embedding the data in an infinite-dimensional space through the use of the kernel. A nice numerical feature is that the embedding need not be specified, and all computations can be made using the finite-dimensional kernel matrix. Kernel techniques also allow users to do certain forms of non-linear data analysis and dimensionality reduction, which is naturally very desirable. Zwald et al. (2004) and von Luxburg et al. (2008) are two interesting, relatively recent papers concerned, broadly speaking, with the same types of inferential questions we have in mind and investigate in this paper, though the settings of those papers are quite different from the one we will work under.

I would like to thank Peter Bickel for suggesting that I consider the problem studied here, and in general for many enlightening discussions about topics in high-dimensional statistics. I would also like to thank an anonymous referee and associate editors for raising interesting questions which contributed to a strengthening of the results presented in the paper.
Support from NSF grants DMS and DMS (CAREER) and an Alfred P. Sloan research fellowship is gratefully acknowledged. AMS 2000 SC: Primary: 62H10. Secondary: 60F99. Key words and phrases: covariance matrices, kernel matrices, multivariate statistical analysis, high-dimensional inference, random matrix theory, machine learning, concentration of measure. Contact: nkaroui@stat.berkeley.edu
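To fix ideas, here is a short numerical sketch, not part of the original paper, of the three kernel matrices described in the Introduction; the scale parameters s, kappa, theta and the degree d are arbitrary placeholder choices.

```python
import numpy as np

def kernel_matrices(X, s=1.0, kappa=0.5, theta=0.0, d=2):
    """Gaussian, sigmoid and polynomial kernel matrices for data X of shape
    (n, p), whose rows are the observations X_1, ..., X_n."""
    G = X @ X.T                               # inner products X_i' X_j
    sq = np.diag(G)
    D2 = sq[:, None] + sq[None, :] - 2.0 * G  # squared distances ||X_i - X_j||^2
    gaussian = np.exp(-D2 / s)                # entries exp(-||X_i - X_j||^2 / s)
    sigmoid = np.tanh(kappa * G + theta)      # entries tanh(kappa X_i'X_j + theta)
    polynomial = G ** d                       # entries (X_i'X_j)^d
    return gaussian, sigmoid, polynomial

# Example: n = 200 observations in dimension p = 50
rng = np.random.default_rng(0)
K_gauss, K_sig, K_poly = kernel_matrices(rng.standard_normal((200, 50)))
```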

Kernel matrices, and the closely related Laplacian matrices, also play a central role in manifold learning (see e.g. Belkin and Niyogi (2003), and Izenman (2008) for an overview of various techniques). In classical statistics, they have been a mainstay of spatial statistics, and geostatistics in particular (see Cressie (1993)). In geostatistical applications, it is clear that the dimension of the data is at most 3. Also, in applications of kernel techniques and manifold learning, it is often assumed that the data live on a low-dimensional manifold or structure, the kernel approach allowing us to somehow recover, at least partially, this information. Consequently, most theoretical analyses of kernel matrices and kernel or manifold learning techniques have focused on situations where the data is assumed to live on such a low-dimensional structure. In particular, it is often the case that asymptotics are studied under the assumption that the data is i.i.d. from a fixed distribution, independent of the number of points. Some remarkable results have been obtained in this setting (see Koltchinskii and Giné (2000) and also Belkin and Niyogi (2008)).

Let us give a brief overview of such results. In Koltchinskii and Giné (2000), the authors prove that if the X_i's are i.i.d. with distribution P, under regularity conditions on the kernel k(x,y), the k-th largest eigenvalue of the kernel matrix M, with entries M(i,j) = (1/n) k(X_i, X_j), converges to the k-th largest eigenvalue of the operator K defined as

K f(x) = ∫ k(x,y) f(y) dP(y).

In this important paper, the authors were also able to obtain fluctuation behavior for these eigenvalues, under certain technical conditions (see Theorem 5 in Koltchinskii and Giné (2000)). Similar first-order convergence results were obtained, at a heuristic level but through interesting arguments, in Williams and Seeger (2000). These results gave theoretical confirmation to practitioners' intuition and heuristics that the kernel matrix could be used as a good proxy for the operator K on L²(dP), and hence that kernel techniques could be explained and justified through the spectral properties of this operator.

To statisticians well-versed in the theory of random matrices, this set of results appears to be similar to results for low-dimensional covariance matrices stating that, when the dimension of the data is fixed and the number of observations goes to infinity, the sample covariance matrix is a spectrally consistent estimator of the population covariance matrix (see e.g. Anderson (2003)). However, it is well-known (see e.g. Marčenko and Pastur (1967), Bai (1999), Johnstone (2007)) that this is not the case when the dimension of the data, p, changes with n, the number of observations, and in particular when asymptotics are studied under the assumption that p/n has a finite limit. We refer to the asymptotic setting where p and n both tend to infinity as the high-dimensional setting. We note that, given that more and more datasets have observations that are high-dimensional, and kernel techniques are used on some of them (see Williams and Seeger (2000)), it is natural to study kernel random matrices in the high-dimensional setting. Another important reason to study this type of asymptotics is that, by keeping track of the effect of the dimension of the data, p, and of other parameters of the problem on the results, they might help us give more accurate predictions about the finite-dimensional behavior of certain statistics than the classical "small p, large n" asymptotics. An example of this phenomenon can be found in the paper Johnstone (2001), where it turned out in simulation that some of the doubly asymptotic results concerning fluctuation behavior of the largest eigenvalue of a Wishart matrix with identity covariance are quite accurate for p and n as small as 5 or 10, at least in the right tail of the distribution.
We refer the interested reader to Johnstone (2001) for more details on the specific example we just described. Hence, it is also potentially practically important to carry out these theoretical studies, for they can be informative even for finite-dimensional considerations.

The properties of kernel random matrices under classical random matrix assumptions have been studied by the author in the recent El Karoui (2007). It was shown there that when the data is high-dimensional, for instance X_i ~ N(0, Σ), and the operator norm of Σ is e.g. bounded, kernel random matrices essentially act like standard Gram/covariance matrices, up to re-centering and re-scaling, which depend only on f. Naturally, a certain scaling is needed to make the problem non-degenerate, and the results we just stated hold, for instance, when M(i,j) = f(‖X_i − X_j‖²/p), for otherwise the kernel matrix is in general degenerate. We refer to El Karoui (2007) for more details and discussions of the relevance of these results in practice. In limited simulations, we found that the theory agreed with the numerics even when p was of the order of several 10's and p/n was not too small.
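The following small simulation, ours and not the paper's, illustrates the first-order convergence of Koltchinskii and Giné (2000) described above, in a deliberately low-dimensional setting: data i.i.d. uniform on [0,1] and a Gaussian kernel, both illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def top_eigs(n, k=5, s=1.0):
    """Top-k eigenvalues of M(i,j) = k(X_i, X_j)/n for X_i i.i.d. uniform
    on [0, 1] and the Gaussian kernel k(x, y) = exp(-(x - y)^2 / s)."""
    x = rng.uniform(size=n)
    M = np.exp(-(x[:, None] - x[None, :]) ** 2 / s) / n
    return np.linalg.eigvalsh(M)[::-1][:k]

# The top eigenvalues stabilize as n grows: they approach the top eigenvalues
# of the integral operator K f(x) = int k(x, y) f(y) dP(y).
for n in (100, 400, 1600):
    print(n, np.round(top_eigs(n), 4))
```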

These results came as somewhat of a surprise and seemed to contradict the intuition, and the numerous positive practical results that have been obtained, since they suggested that the kernel matrices we considered were just a centered and scaled version of the matrix XX'. However, it should be noted that the assumptions implied that the data was truly high-dimensional. So an interesting middle ground, from modeling, theoretical and practical points of view, is the following: what happens if the data does not live exactly on a fixed-dimensional manifold, but lives nearby? In other words, the data is now sampled from a noisy version of the manifold. This is the question we study in this paper.

We assume now that the data points X_i ∈ R^p we observe are of the form X_i = Y_i + Z_i, where Y_i is the signal part of the observations and lives, for instance, on a low-dimensional manifold (e.g., a 3-dimensional sphere), and Z_i is the noise part of the observations and is e.g. multivariate Gaussian in dimension p, where p might be 100. We think this is interesting from a practical standpoint because the assumption that the data is exactly on a manifold is perhaps a bit optimistic, and the noisy manifold version is perhaps more in line with what statisticians expect to encounter in practice (there is a clear analogy with linear regression here). From a theoretical standpoint, such a model allows us to bridge the two extremes between truly low-dimensional data and fully high-dimensional data. From a modeling standpoint, we propose to scale the noise so that its norm stays bounded, or does not grow too fast in the asymptotics. That way, the signal part of the data is likely to be affected but not totally drowned by the noise. It is important to note, however, that the noise is not small in any sense of the word: it is of a size comparable with that of the signal.

In the case of spherical noise (see below for details, but note that the Gaussian distribution falls into this category), our results say that, to first order, the kernel matrix computed from information+noise data behaves like a kernel matrix computed from the signal part of the data, but we might have to use a different kernel than the one we started with. This other kernel is quite explicit. In the case of dot-product kernel matrices (i.e., M(i,j) = f(X_i'X_j)/n), the original kernel can be used under certain assumptions; so, to first order, the noise part has no effect on the spectral properties of the kernel matrix. The results are different when looking at Euclidean distance kernels (i.e., M(i,j) = f(‖X_i − X_j‖²)/n), where the effect of the noise is basically to change the kernel that is used. This is in any case a quite positive result, in that it says that the whole body of work concerning the behavior of kernel random matrices with low-dimensional input data can be used to also study the information+noise case, the only change being a change of kernels. The case of elliptical noise is more complicated. The dot-product kernel results still have the same interpretation, but the Euclidean distance kernel results are not as easy to interpret.

2 Results

Before we start, we set some notations. We use ‖M‖_F to denote the Frobenius norm of the matrix M (so ‖M‖_F² = Σ_{i,j} M(i,j)²), and ‖M‖₂ to denote its operator norm, i.e., its largest singular value. We also use ‖v‖ to denote the Euclidean norm of the vector v. a ∨ b is shorthand for max(a, b). Unless otherwise noted, functions that are said to be Lipschitz are Lipschitz with respect to Euclidean norm.

We split our results into two parts, according to distributional assumptions on the noise. One deals with the Gaussian-like case, which allows us to give a simple proof of the results.
The second part is about the case where the noise has a distribution that satisfies certain concentration and ellipticity properties. This is more general and brings the geometry of the problem forward. It also allows us to study the robustness, and lack thereof, of the results to the sphericity of the noise, an assumption that is implicit in the high-dimensional Gaussian and Gaussian-like case. We draw some practical conclusions from our results for the case of spherical noise in Subsection 2.3.

2.1 The case of Gaussian-like noise

We first study a setting where the noise is drawn according to a distribution that is similar to a Gaussian, but slightly more general.
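Before stating the theorem, the following sketch (ours; the circle signal and Σ = Id are illustrative choices) shows the geometric fact that drives it: for spherical noise scaled as below, all pairwise squared distances are shifted by nearly the same constant 2ν.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 40, 50000

# Signal: points on a circle (a 1-dimensional manifold) embedded in R^p
t = rng.uniform(0, 2 * np.pi, n)
Y = np.zeros((n, p)); Y[:, 0], Y[:, 1] = np.cos(t), np.sin(t)

# Spherical noise: Z_i = Sigma^{1/2} U_i / sqrt(p) with Sigma = Id, so nu = 1
X = Y + rng.standard_normal((n, p)) / np.sqrt(p)

def sq_dists(A):
    g = A @ A.T; d = np.diag(g)
    return d[:, None] + d[None, :] - 2.0 * g

i, j = np.triu_indices(n, k=1)
err = sq_dists(X)[i, j] - (sq_dists(Y)[i, j] + 2.0)  # shift by 2*nu = 2
print(np.max(np.abs(err)))  # small, shrinking roughly like p**-0.5
```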

Theorem 1. Suppose we observe data X_1, ..., X_n in R^p, with

X_i = Y_i + Z_i,

where Z_i = Σ^{1/2} U_i / p^{1/2}, the p-dimensional vector U_i has i.i.d. entries with mean 0, variance 1 and fourth moment µ₄, and {Y_i}_{i=1}^n ~ P_n. We assume that there exists a deterministic vector a and a real C₁ > 0, possibly dependent on n, such that ∀i, E‖Y_i − a‖² < C₁. Also, µ₄ might change with n but is assumed to remain bounded. {Z_i}_{i=1}^n are i.i.d., and we also assume that {Y_i}_{i=1}^n and {Z_i}_{i=1}^n are independent. We consider the random matrices M_f with (i,j) entry

M_f(i,j) = (1/n) f(‖X_i − X_j‖²),

for functions f in F_{C₀(n)} = {f such that |f(x) − f(y)| ≤ C₀(n)|x − y| for all x, y}.

Let us call ν = trace(Σ)/p. Let M̃_f be the matrix with (i,j) entry

M̃_f(i,j) = (1/n) f(‖Y_i − Y_j‖² + 2ν) if i ≠ j, and (1/n) f(0) if i = j.

Assuming only that µ₄ is bounded uniformly in n, we have, for a constant C independent of n, p and Σ,

E*[ sup_{f ∈ F_{C₀(n)}} ‖M_f − M̃_f‖_F² ] ≤ C C₀²(n) [ trace(Σ²)/p² + C₁ ‖Σ‖₂/p ].   (1)

We place ourselves in the high-dimensional setting where n and p tend to infinity. We assume that trace(Σ²)/p² → 0 as p tends to infinity. Under these assumptions, for any fixed C₀ > 0 and C₁ > 0,

lim_{n,p→∞} sup_{f ∈ F_{C₀}} ‖M_f − M̃_f‖_F = 0 in probability.

If we further assume that ν remains, for instance, bounded, the same result holds if we replace the diagonal of M̃ by f(2ν)/n, because |f(2ν) − f(0)| ≤ 2νC₀, and therefore sup_{f∈F_{C₀}} |f(2ν) − f(0)| ≤ 2νC₀. The approximating matrix we then get is the matrix with (i,j) entry (1/n) f_{2ν}(‖Y_i − Y_j‖²), where f_{2ν}(x) = f(x + 2ν), i.e., a pure signal matrix involving a different kernel from the one we started with.

We note that there is a potential measurability issue, which we address in the proof. Our theorem really means that we can find a random variable that dominates the random element sup_{f∈F_{C₀(n)}} ‖M_f − M̃_f‖_F and goes to 0 in probability. A subcase of our result is the case of Gaussian noise: then U_i is N(0, Id) and our result naturally applies. We also note that P_n can change with n. The class of functions we consider is fixed in the last statement of the theorem, but if we were to look at a sequence of kernels we could pick a different function in the class F_{C₀} for each n. It should also be noted that the proof technique allows us to deal with classes of functions that vary with n: we could have a varying C₀(n). As Equation (1) makes clear, the approximation result will hold as soon as the right-hand side of Equation (1) goes to 0 asymptotically, i.e., C₀²(n) trace(Σ²)/p² → 0 and C₀²(n)‖Σ‖₂/p → 0. Finally, we work here with uniformly Lipschitz functions. The proof technique carries over to other classes, such as certain classes of Hölder functions, but the bounds would be different.

Proof. The strategy is to use the same entry-wise expansion approach that was used in El Karoui (2007). To do so, we remark that ‖Z_i − Z_j‖² remains essentially constant across pairs (i,j) in the setting we are considering; this is a consequence of the spherical nature of high-dimensional Gaussian distributions. We can therefore try to approximate M(i,j) by f(‖Y_i − Y_j‖² + 2ν)/n, and all we need to do is to show that the remainder is small. We also note that if, as we assume, trace(Σ²)/p² → 0, then ‖Σ‖₂/p → 0, since ‖Σ‖₂² ≤ trace(Σ²).

Work conditional on Y_n = {Y_i}_{i=1}^n, for i ≠ j. We clearly have

‖X_i − X_j‖² = ‖Y_i − Y_j‖² + 2(Z_i − Z_j)'(Y_i − Y_j) + ‖Z_i − Z_j‖².

Let us study the various parts of this expansion. Conditional on Y_n, if we call y = Y_i − Y_j, we see easily that Z_i − Z_j = Σ^{1/2}(U_i − U_j)/p^{1/2}, and (Z_i − Z_j)'(Y_i − Y_j) = (U_i − U_j)'Σ^{1/2}y/p^{1/2}.

Note that U_i − U_j, which we denote Γ, has i.i.d. entries, with mean 0, variance 2 and fourth moment 2(µ₄ + 3). We call

α = (Z_i − Z_j)'(Y_i − Y_j) and β = ‖Z_i − Z_j‖² − 2 trace(Σ)/p.

With these notations, we have

‖X_i − X_j‖² − (‖Y_i − Y_j‖² + 2ν) = 2α + β.

Therefore, for any function f in F_{C₀(n)},

|f(‖X_i − X_j‖²) − f(‖Y_i − Y_j‖² + 2ν)| ≤ C₀(n)(|β| + 2|α|),

and hence

[f(‖X_i − X_j‖²) − f(‖Y_i − Y_j‖² + 2ν)]² ≤ 2C₀²(n)[β² + 4α²].

We naturally also have

sup_{f∈F_{C₀(n)}} [f(‖X_i − X_j‖²) − f(‖Y_i − Y_j‖² + 2ν)]² ≤ 2C₀²(n)[β² + 4α²].

So we have found a random variable τ_n = 2C₀²(n)[β² + 4α²] that dominates the random element ζ_n = sup_{f∈F_{C₀}}[f(‖X_i − X_j‖²) − f(‖Y_i − Y_j‖² + 2ν)]². One might be concerned about the measurability of ζ_n, but by using outer expectations (see van der Vaart (1998)) we can completely bypass this potential problem. In what follows, we denote by E* an outer expectation. Though this technical point does not shed further light on the problem, it naturally needs to be addressed. Hence,

E*[ sup_{f∈F_{C₀(n)}} |f(‖X_i − X_j‖²) − f(‖Y_i − Y_j‖² + 2ν)|² | Y_n ] ≤ 2C₀²(n)( E(β²) + 4E(α² | Y_n) ).

Let us focus on E(β²) for a moment. We first note that ‖Z_i − Z_j‖² = Γ'ΣΓ/p. In particular, E‖Z_i − Z_j‖² = 2 trace(Σ)/p, so E(β) = 0. Therefore, E(β²) = var(‖Z_i − Z_j‖²). Now recall the results found, for instance, in Lemma A-1 in El Karoui (2007): if the vector γ has i.i.d. entries with mean 0, variance σ² and fourth moment κ₄, and if M is a symmetric matrix,

E[(γ'Mγ)²] = σ⁴(trace²(M) + 2 trace(M²)) + (κ₄ − 3σ⁴) trace(M ∘ M),

where M ∘ M is the Hadamard product of M with itself, i.e., the entrywise product of the matrix with itself. Applying this result in our setting (i.e., using the moments given above of Γ, which has i.i.d. entries) gives

var(‖Z_i − Z_j‖²) = var(Γ'ΣΓ)/p² = [8 trace(Σ²) + 2(µ₄ − 3) trace(Σ ∘ Σ)]/p².

It is easy to see that trace(Σ ∘ Σ) ≤ trace(Σ²), since trace(Σ²) = Σ_{i,j} σ²(i,j) ≥ Σ_i σ²(i,i) = trace(Σ ∘ Σ). Therefore,

E(β²) = var(‖Z_i − Z_j‖²) ≤ [8 + 2|µ₄ − 3|] trace(Σ²)/p² = O(trace(Σ²)/p²).

We note that under our assumptions on trace(Σ²)/p², and the fact that µ₄ remains bounded in n (and therefore in p), this term will go to 0. On the other hand, because α | Y_n = Γ'Σ^{1/2}y/p^{1/2}, and because E(Γ) = 0 and cov(Γ) = 2Id, we have

E(α² | Y_n) = 2y'Σy/p ≤ 2‖Σ‖₂‖y‖²/p ≤ 4(‖Σ‖₂/p)(‖Y_i − a‖² + ‖Y_j − a‖²).

Hence, we have, for C a constant independent of Σ, p and n,

E*[ sup_{f∈F_{C₀(n)}} |f(‖X_i − X_j‖²) − f(‖Y_i − Y_j‖² + 2ν)|² | Y_n ] ≤ C C₀²(n)[ trace(Σ²)/p² + (‖Σ‖₂/p)(‖Y_i − a‖² + ‖Y_j − a‖²) ].

This inequality allows us to conclude that, for another constant C,

E*[ sup_{f∈F_{C₀(n)}} ‖M_f − M̃_f‖_F² | Y_n ] ≤ C C₀²(n)[ trace(Σ²)/p² + (‖Σ‖₂/p)(1/n) Σ_{i=1}^n ‖Y_i − a‖² ],

since clearly

sup_{f∈F_{C₀(n)}} ‖M_f − M̃_f‖_F² ≤ (1/n²) Σ_{i≠j} sup_{f∈F_{C₀(n)}} |f(‖X_i − X_j‖²) − f(‖Y_i − Y_j‖² + 2ν)|².

Under the assumption that E‖Y_i − a‖² exists and is less than C₁, we finally conclude that

E*[ sup_{f∈F_{C₀}} ‖M_f − M̃_f‖_F² ] ≤ C C₀²(n)[ trace(Σ²)/p² + C₁‖Σ‖₂/p ],

and Equation (1) is shown. Therefore, under our assumptions,

E*[ sup_{f∈F_{C₀}} ‖M_f − M̃_f‖_F² ] = o(1).

Hence, when n and p tend to infinity, as announced in the theorem,

sup_{f∈F_{C₀}} ‖M_f − M̃_f‖_F → 0 in probability.
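A quick numerical check of the theorem (ours; signal on a circle, Σ = Id, and f(x) = exp(−x), which is 1-Lipschitz, are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, nu = 60, 50000, 1.0
t = rng.uniform(0, 2 * np.pi, n)
Y = np.zeros((n, p)); Y[:, 0], Y[:, 1] = np.cos(t), np.sin(t)
X = Y + rng.standard_normal((n, p)) / np.sqrt(p)   # Sigma = Id, nu = 1

def sq_dists(A):
    g = A @ A.T; d = np.diag(g)
    return d[:, None] + d[None, :] - 2.0 * g

f = lambda x: np.exp(-x)                   # 1-Lipschitz on [0, infinity)
M = f(sq_dists(X)) / n                     # kernel matrix, information+noise data
M_tilde = f(sq_dists(Y) + 2 * nu) / n      # pure signal, shifted kernel f(x + 2 nu)
np.fill_diagonal(M_tilde, f(0) / n)        # the theorem keeps f(0)/n on the diagonal
print(np.linalg.norm(M - M_tilde, 'fro'))  # small, as Theorem 1 predicts
```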

2.2 Case of noise drawn from a distribution satisfying concentration inequalities

The proof of Theorem 1 makes clear that the heart of our argument is geometric: we exploit the fact that ‖Z_i − Z_j‖² is essentially constant across pairs (i,j). It is therefore natural to try to extend the theorem to more general assumptions about the noise distribution than the Gaussian-like one we worked under previously. It is also important to understand the impact of the implicit geometric assumptions (i.e., sphericity of the noise) that are made, and in particular the robustness of our results against these geometric assumptions.

We extend the results in two directions. First, we investigate the generalization of our Gaussian-like results to the setting of Euclidean-distance kernel random matrices, when the noise is distributed according to a distribution satisfying a concentration inequality, multiplied by a random variable, i.e., a generalization of elliptical distributions. This allows us to show that the Gaussian-like results of Theorem 1 essentially hold under much weaker assumptions on the noise distribution, as long as the Gaussian geometry (i.e., a spherical geometry) is preserved (see Corollary 3). The results of Theorem 2 show that breaking the Gaussian geometry results in quite different approximation results. We also discuss in Theorem 4 the situation of inner-product kernel random matrices under the same generalized elliptical assumptions on the noise.

2.2.1 The case of Euclidean distance kernel random matrices

We have the following theorem.

Theorem 2 (Euclidean distance kernels). Suppose we observe data X_1, ..., X_n in R^p, with

X_i = Y_i + R_i Z_i/p^{1/2}.

We place ourselves in the high-dimensional setting where n and p tend to infinity. We assume that:
- {Y_i}_{i=1}^n ~ P_n;
- {Z_i}_{i=1}^n are i.i.d. with E(Z_i) = 0, and Y_n = {Y_i}_{i=1}^n and {Z_i}_{i=1}^n are independent;
- the R_i's are random variables independent of {Z_i}_{i=1}^n;
- the distribution of Z_i is such that, for any 1-Lipschitz function F, if µ_F = E(F(Z_i)),

P(|F(Z_i) − µ_F| > r) ≤ C exp(−c₀ r^b) = h(r),

where for simplicity we assume that c₀, C and b are independent of p. We call ν = E‖Z_i‖²/p and assume that ν stays bounded as p → ∞;
- ∀i, R_i ∈ [r_p, R_p], where r_p and R_p are deterministic sequences depending on p. We assume without loss of generality that R_p ≥ 1.

Calling M(Y_n) = max_{i≠j} ‖Y_i − Y_j‖², we assume that there exists M₀ such that P(M(Y_n) ≤ M₀) → 1, and ǫ > 0 such that

(M₀^{1/2} ∨ 1 ∨ R_p) R_p (log n + (log n)^{1+ǫ})^{1/b} / p^{1/2} → 0.

Then we have

max_{i≠j} | ‖X_i − X_j‖² − [‖Y_i − Y_j‖² + ν(R_i² + R_j²)] | → 0 in probability.   (2)

We call W(Y_n) = min_{i≠j} ‖Y_i − Y_j‖², and suppose we pick W₀ such that P(W(Y_n) ≥ W₀) → 1. (Note that W₀ = 0 is always a possibility.) We call, for η > 0 given,

I_η = [W₀ + 2νr_p² − η, M₀ + 2νR_p² + η],

and F_{C,I_η} = {f such that |f(x) − f(y)| ≤ C|x − y| for all x, y ∈ I_η}. We consider the random matrices M_f with (i,j) entry

M_f(i,j) = (1/n) f(‖X_i − X_j‖²), for f ∈ F_{C,I_η}.

Let us call M̃_f the matrix with (i,j) entry

M̃_f(i,j) = (1/n) f(‖Y_i − Y_j‖² + ν(R_i² + R_j²)) if i ≠ j, and (1/n) f(0) if i = j.

We have, for any given C > 0 and η > 0,

lim_{n,p→∞} sup_{f∈F_{C,I_η}} ‖M_f − M̃_f‖_F = 0 in probability.   (3)

We have the following corollary in the case of spherical noise, which is a generalization of the Gaussian-like case considered in Theorem 1.

Corollary 3 (Euclidean distance kernels with spherical noise). Suppose we observe data X_1, ..., X_n in R^p, with X_i = Y_i + Z_i/p^{1/2}, where Y_i and Z_i satisfy the same assumptions as in Theorem 2, with r_p = R_p = 1. Then the results of Theorem 2 apply with I_η = [W₀ + 2ν − η, M₀ + 2ν + η] and

M̃_f(i,j) = (1/n) f(‖Y_i − Y_j‖² + 2ν) if i ≠ j, and (1/n) f(0) if i = j.

As in Theorem 1, we deal with potential measurability issues concerning the sup in the proof. Our theorem really is that we can find a random variable that goes to 0 in probability and dominates the random element sup_{f∈F_{C,I_η}} ‖M_f − M̃_f‖_F (an outer-probability statement).

This theorem generalizes Theorem 1 in two ways. The spherical case, detailed in Corollary 3, is a more general version of Theorem 1 limited to Gaussian noise. This is because the Gaussian setting corresponds to b = 2 and c₀ = 1/(2‖Σ‖₂). However, assuming only concentration inequalities allows us to handle much more complicated structures for the noise distribution; some examples are given below.

We also note that if the Y_i's (i.e., the signal part of the X_i's) are sampled, for instance, from a fixed manifold of finite Euclidean diameter, the conditions on M₀ are automatically satisfied, with M₀ being the squared Euclidean diameter of the corresponding manifold.

Another generalization is geometric: by allowing R_i to vary with i, we move away from the spherical geometry of high-dimensional Gaussian vectors and their generalizations, to a more elliptical setting. Hence our results show clearly the potential limitations, and the structural assumptions that are made, when one assumes Gaussianity of the noise. Theorem 2 and Corollary 3 show that the Gaussian-like results of Theorem 1 are not robust against a change in the geometry of the noise. We note, however, that if R_i is independent of Z_i and E(R_i²) = 1, cov(R_iZ_i) = cov(Z_i), so all the noise models have the same covariance, but they may yield different approximating matrices and hence different spectral behavior for our information+noise models. However, the spherical results have the advantage of having simple interpretations. In the setting of Corollary 3, if we assume that f(0) and f(2ν) are uniformly bounded in n over the class of functions we consider, we can replace the diagonal of M̃ by f(2ν)/n and have the same approximation results. Then the new M̃ is a kernel matrix computed from the signal part of the data with the new kernel f_{2ν}(x) = f(x + 2ν).

To make our result more concrete, we give a few examples of distributions for which the concentration assumptions on Z_i are satisfied:
- Gaussian random variables Z_i ~ N(0, Σ), for which we have c₀ = 1/(2‖Σ‖₂). We refer to Ledoux (2001), Theorem 2.7, for a justification of this claim.
- Vectors of the type √p v, where v is uniformly distributed on the unit ℓ₂-sphere in dimension p. Theorem 2.3 in Ledoux (2001) shows that our assumptions are satisfied, with c₀ = 1/4, after noticing that a 1-Lipschitz function with respect to Euclidean norm is also 1-Lipschitz with respect to the geodesic distance on the sphere.
- Vectors √p Γv, with v uniformly distributed on the unit ℓ₂-sphere in R^p and with ΓΓ' = Σ having bounded operator norm.
- Vectors of the type p^{1/b} v, b ∈ [1, 2], where v is uniformly distributed in the unit ℓ_b ball or sphere in R^p. See Ledoux (2001), Theorem 4, which refers to Schechtman and Zinn (2000) as the source of the theorem. In this case, c₀ depends only on b.
- Vectors with log-concave density of the type e^{−U(x)}, with the Hessian of U satisfying, for all x, Hess(U) ≥ c₀ Id, where c₀ > 0 is the real that appears in our assumptions. See Ledoux (2001), Theorem 2.7, for a justification.
- Vectors v distributed according to a centered Gaussian copula, with corresponding correlation matrix Σ having ‖Σ‖₂ bounded. We refer to El Karoui (2009) for a justification of the fact that our assumptions are satisfied. If ṽ has a Gaussian copula distribution, then its i-th entry satisfies ṽ_i = Φ(N_i), where N is multivariate normal with covariance matrix Σ, Σ being a correlation matrix, i.e., its diagonal is 1. Here Φ is the cumulative distribution function of a standard normal distribution. Taking v = ṽ − 1/2 gives a centered Gaussian copula.

This last example is intended to show that the result can handle quite complicated and non-linear noise structures. We note that to justify that the assumptions of the theorem are satisfied, it is enough to be able to show concentration around the mean or the median, as Proposition 1.8 in Ledoux (2001) makes clear.
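Two of the examples above are easy to generate numerically; the sketch below is ours, with an AR(1)-type correlation matrix (which has bounded operator norm) and ρ = 0.3 as illustrative choices for the copula case.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
p = 500

def sphere_noise():
    """sqrt(p) times a uniform point on the unit l2-sphere of R^p; nu = 1."""
    v = rng.standard_normal(p)
    return np.sqrt(p) * v / np.linalg.norm(v)

def copula_noise(rho=0.3):
    """Centered Gaussian copula: v_i = Phi(N_i) - 1/2, where N is an AR(1)
    Gaussian vector with correlation rho^{|i-j|}; here nu = 1/12."""
    g = rng.standard_normal(p)
    N = np.empty(p)
    N[0] = g[0]
    for i in range(1, p):
        N[i] = rho * N[i - 1] + np.sqrt(1 - rho**2) * g[i]
    return norm.cdf(N) - 0.5

Z_sphere, Z_copula = sphere_noise(), copula_noise()
print(np.sum(Z_sphere**2) / p, np.sum(Z_copula**2) / p)  # nu: 1 and ~1/12
```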
The reader might feel that the assumptions concerning the boundedness of the R_i's will be limiting in practice. We note that the same proof essentially goes through if we just require that the R_i's belong to the interval [r_p, R_p] with probability going to 1, but this requires a little bit more conditioning, and we leave the details, which are not difficult, to the interested reader. So, for instance, if we had a tail condition on the R_i's, we could bound the R_i's with high probability to get a choice of R_p. This boundedness condition is here just to make the exposition simpler and is not particularly limiting in our opinion. On the other hand, we note that our conditions allow dependence in the R_i's and are therefore rather weak requirements. Finally, the theorem as stated is for a fixed C, though the class of functions we are considering might vary with n and p through the influence of I_η. The proof makes clear that C could also vary with n and p. We discuss the necessary adjustments in more detail after the proof.

Proof of Theorem 2. We use the notations Y_n = {Y_i}_{i=1}^n, and P_{Y_n} to denote probability conditional on Y_n. We call L = {Y_n : M(Y_n) ≤ M₀}. Let us also call YR_n = {{Y_i}_{i=1}^n, {R_i}_{i=1}^n}; similarly, P_{YR_n} denotes probability conditional on YR_n. We call LR = {YR_n : Y_n ∈ L}. We will start by working conditionally on YR_n and eventually decondition our results.

We assume from now on that the YR_n we work with is such that Y_n ∈ L. Note that P(Y_n ∈ L) → 1 by assumption, and also P(YR_n ∈ LR) → 1. The main idea now is that, in a strong sense,

‖X_i − X_j‖² ≈ ‖Y_i − Y_j‖² + (R_i² + R_j²)ν,

where ν = E‖Z_i‖²/p. To show this formally, we write, for i ≠ j,

‖X_i − X_j‖² − [‖Y_i − Y_j‖² + (R_i² + R_j²)ν] = α + β,

where

α = 2(R_iZ_i − R_jZ_j)'(Y_i − Y_j)/p^{1/2} and β = ‖R_iZ_i − R_jZ_j‖²/p − (R_i² E‖Z_i‖² + R_j² E‖Z_j‖²)/p.

Our aim is to show that, as n and p tend to infinity,

max_{i≠j} (|α| + |β|) → 0 in probability.

On max_{i≠j} |α|. Note that if i = j, α = 0. Clearly,

P_{YR_n}(|α| > r) ≤ P_{YR_n}(2|R_iZ_i'(Y_i − Y_j)|/p^{1/2} > r/2) + P_{YR_n}(2|R_jZ_j'(Y_i − Y_j)|/p^{1/2} > r/2).

Since we assumed that R_i ≤ R_p, we see that the function F(Z) = 2R_iZ'(Y_i − Y_j)/p^{1/2} is Lipschitz with respect to Euclidean norm, with Lipschitz constant smaller than 2M₀^{1/2}R_p/p^{1/2}, when Y_n is in L. Also, since E(Z_i) = 0, E(F(Z_i) | YR_n) = 0, where the expectation is conditional on YR_n. Hence, our concentration assumptions on Z_i imply that

P_{YR_n}(2|R_iZ_i'(Y_i − Y_j)|/p^{1/2} > r) ≤ C exp(−c₀(p^{1/2} r/[2M₀^{1/2}R_p])^b).

Therefore, if we use a simple union bound, we get

P_{YR_n}(max_{i≠j} |α| > r) ≤ 2Cn² exp(−c₀(p^{1/2} r/[4M₀^{1/2}R_p])^b).

In particular, if we pick, for ǫ > 0,

r₀ = 4M₀^{1/2}R_p p^{−1/2} ((2 log n + (log n)^{1+ǫ})/c₀)^{1/b},

we see that

P_{YR_n}(max_{i≠j} |α| > r₀) ≤ 2C exp(−(log n)^{1+ǫ}) → 0.

Since P(max|α| > t) ≤ P(max|α| > t and YR_n ∈ LR) + P(YR_n ∉ LR), and since the latter goes to 0, we have, unconditionally, P(max_{i≠j}|α| > r₀) → 0.

On max_{i≠j} |β|. We see that if A and B are vectors in R^p, the map N_{R_i,R_j}: (A, B) ↦ ‖R_iA − R_jB‖ is (R_i ∨ R_j)-Lipschitz on R^{2p} equipped with the norm (‖A‖² + ‖B‖²)^{1/2}, by the triangle inequality. Therefore, using Propositions 1.2 and 1.7 in Ledoux (2001), and using the fact that h(r) → 0 as r → ∞ and that h is continuous when using the latter, we conclude that

P_{YR_n}( | ‖R_iZ_i − R_jZ_j‖ − E(‖R_iZ_i − R_jZ_j‖) | > r ) ≤ 4h(r/(2R_p)).   (4)

If now γ = E(‖R_iZ_i − R_jZ_j‖ | YR_n), and if r₁ = 2R_p((2 log n + (log n)^{1+ǫ})/c₀)^{1/b}, a union bound gives

P_{YR_n}( max_{i≠j} | ‖R_iZ_i − R_jZ_j‖ − γ | > r₁ ) ≤ K exp(−(log n)^{1+ǫ}) → 0,

where K is a constant which does not depend on YR_n. So we conclude that, unconditionally, if Δ₀ = | ‖R_iZ_i − R_jZ_j‖ − γ |,

P(max_{i≠j} Δ₀ > r₁) → 0.

Note also that under our assumptions, r₁/p^{1/2} → 0. Recall that we aim to show that

|β| = | ‖R_iZ_i − R_jZ_j‖²/p − ν(R_i² + R_j²) | → 0 in probability, uniformly in i ≠ j.

Let us first work on Δ₁ = | ‖R_iZ_i − R_jZ_j‖² − γ² |/p. Using the fact that a² − b² = (a − b)(a + b), and therefore

|a² − b²| ≤ |a − b|(|a − b| + 2|b|),

we see that, if we choose a = ‖R_iZ_i − R_jZ_j‖/p^{1/2} and b = γ/p^{1/2}, the previous equation becomes

Δ₁ ≤ Δ₀(Δ₀ + 2γ)/p.

Therefore, if we can show that Δ₀(Δ₀ + 2γ)/p goes to 0 in probability, we will have Δ₁ → 0 in probability. Using the concentration result given in (4), in connection with Proposition 1.9 in Ledoux (2001) and a slight modification explained in El Karoui (2007), we have

| (R_i² + R_j²)pν − γ² | = var(‖R_iZ_i − R_jZ_j‖ | YR_n) ≤ κ_b R_p²,   (5)

where κ_b is an explicit constant depending only on b, C and c₀. Using our assumption that ν remains bounded, we see that γ/(p^{1/2}R_p) remains bounded. Therefore, for some K independent of p,

Δ₀(Δ₀ + 2γ)/p ≤ K R_p r₁/p^{1/2}, with probability going to 1.

Our assumptions also guarantee that R_p r₁/p^{1/2} → 0, so we conclude that, for a constant K independent of p,

| ‖R_iZ_i − R_jZ_j‖² − γ² |/p ≤ K r₁ R_p/p^{1/2} → 0, with probability going to 1.

Using Equation (5), we have the deterministic inequality

| ν(R_i² + R_j²) − γ²/p | ≤ κ_b R_p²/p ≤ K r₁ R_p/p^{1/2} for n large.

So we can finally conclude that, with high probability,

|β| = | ‖R_iZ_i − R_jZ_j‖²/p − ν(R_i² + R_j²) | ≤ K r₁ R_p/p^{1/2} → 0.

Putting all these elements together, we see that if

u_p = (M₀^{1/2} ∨ 1 ∨ R_p) R_p (log n + (log n)^{1+ǫ})^{1/b}/p^{1/2},

we can find a constant K such that

P( max_{i≠j} (|α| + |β|) > K u_p ) → 0.

In other words,

P( max_{i≠j} | ‖X_i − X_j‖² − [‖Y_i − Y_j‖² + ν(R_i² + R_j²)] | > K u_p ) → 0.   (6)

This establishes a strong form of the first part of the theorem, i.e., Equation (2).

Second part of the theorem: Equation (3). To get to the second part, we recall that, assuming that f is C-Lipschitz on an interval containing ‖X_i − X_j‖² and ‖Y_i − Y_j‖² + ν(R_i² + R_j²), we have

| f(‖X_i − X_j‖²) − f(‖Y_i − Y_j‖² + ν(R_i² + R_j²)) | ≤ C | ‖X_i − X_j‖² − [‖Y_i − Y_j‖² + ν(R_i² + R_j²)] |.

Let us define, for η > 0 given, the event

E = { ∀i ≠ j: ‖X_i − X_j‖² ∈ I_η and ‖Y_i − Y_j‖² ∈ [W₀, M₀] },

and the random element

ζ_n = sup_{f∈F_{C,I_η}} max_{i≠j} | f(‖X_i − X_j‖²) − f(‖Y_i − Y_j‖² + ν(R_i² + R_j²)) |.

When E is true, all the pairs (‖X_i − X_j‖², ‖Y_i − Y_j‖² + ν(R_i² + R_j²)) are in I_η: the part concerning ‖Y_i − Y_j‖² + ν(R_i² + R_j²) is obvious, and the one concerning ‖X_i − X_j‖² comes from the definition of E. So when E is true, we also have

∀i ≠ j, | f(‖X_i − X_j‖²) − f(‖Y_i − Y_j‖² + ν(R_i² + R_j²)) | ≤ C(|α| + |β|).

Let us now consider the random variable τ_n such that τ_n = C on E and ∞ otherwise, so τ_n = C 1_E + ∞ 1_{E^c}. Our remark above shows that

ζ_n ≤ τ_n max_{i≠j}(|α| + |β|).

Now, we see from our assumptions about {Y_i}_{i=1}^n, Equation (6), and the fact that u_p → 0, that for any η > 0, P(E) → 1. So we have P(τ_n = C) → 1. Also, max_{i≠j}(|α| + |β|) ≤ K u_p with probability tending to 1, so we can conclude that

P( τ_n max_{i≠j}(|α| + |β|) ≤ C K u_p ) → 1.

Hence, we also have

P*( ζ_n ≤ C K u_p ) → 1,

where this statement might have to be understood in terms of outer probabilities, hence the P* instead of P (see van der Vaart (1998)). In plain English, we have found a random variable, τ_n max_{i≠j}(|α| + |β|), bounded by C K u_p with probability going to 1, which is larger than the random element ζ_n. In other respects, we have, for all f ∈ F_{C,I_η},

‖M_f − M̃_f‖_F ≤ ζ_n,

since

|M_f(i,j) − M̃_f(i,j)| ≤ (1/n) | f(‖X_i − X_j‖²) − f(‖Y_i − Y_j‖² + ν(R_i² + R_j²)) | ≤ ζ_n/n.

Therefore,

sup_{f∈F_{C,I_η}} ‖M_f − M̃_f‖_F ≤ ζ_n → 0 in probability,   (7)

where once again this statement may have to be understood in terms of outer probabilities. The result stated in Equation (3) is proved.

We mentioned before the proof the possibility that we might let C vary with n and p and still get a good approximation result. This can be done by looking at Equation (7) above: ζ_n is less than K C u_p with high probability, so when u_p C(n) → 0, the main approximation result of Theorem 2 holds for a C, and therefore a class of functions, that vary with n and p.
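To illustrate Theorem 2 beyond the spherical case, the following sketch (ours; R_i drawn from {1, 2} and Σ = Id are illustrative choices) checks the approximation with the elliptical shift ν(R_i² + R_j²):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, nu = 60, 50000, 1.0
t = rng.uniform(0, 2 * np.pi, n)
Y = np.zeros((n, p)); Y[:, 0], Y[:, 1] = np.cos(t), np.sin(t)
R = rng.choice([1.0, 2.0], size=n)          # elliptical: R_i is not constant
X = Y + R[:, None] * rng.standard_normal((n, p)) / np.sqrt(p)

def sq_dists(A):
    g = A @ A.T; d = np.diag(g)
    return d[:, None] + d[None, :] - 2.0 * g

f = lambda x: np.exp(-x)
shift = nu * (R[:, None] ** 2 + R[None, :] ** 2)  # nu (R_i^2 + R_j^2)
M = f(sq_dists(X)) / n
M_tilde = f(sq_dists(Y) + shift) / n
np.fill_diagonal(M_tilde, f(0) / n)
print(np.linalg.norm(M - M_tilde, 'fro'))   # small: random, non-constant shift
```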

2.2.2 The case of inner-product kernel random matrices

We now turn our attention to kernel matrices of the form M(i,j) = f(X_i'X_j)/n, which are also of interest in practice. In that setting, we are able to obtain results similar in flavor to Theorem 2, with slight modifications of the assumptions we make about f.

Theorem 4 (Scalar product kernels). Suppose we observe data X_1, ..., X_n in R^p, with

X_i = Y_i + R_i Z_i/p^{1/2}.

We place ourselves in the high-dimensional setting where n and p tend to infinity. We assume that:
- {Y_i}_{i=1}^n ~ P_n;
- {Z_i}_{i=1}^n are i.i.d. with E(Z_i) = 0, and we also assume that {Y_i}_{i=1}^n and {Z_i}_{i=1}^n are independent;
- {R_i}_{i=1}^n are assumed to be independent of {Z_i}_{i=1}^n. We also assume that we can find a deterministic sequence R_p such that ∀i, |R_i| ≤ R_p and, without loss of generality, R_p ≥ 1;
- the distribution of Z_i is such that, for any 1-Lipschitz (with respect to Euclidean norm) function F, if µ_F = E(F(Z_i)),

P(|F(Z_i) − µ_F| > r) ≤ C exp(−c₀ r^b) = h(r),

where for simplicity we assume that c₀, C and b are independent of p. We call ν = E‖Z_i‖²/p and assume that ν stays bounded as p → ∞.

We call M₁(Y_n) = max_{i,j} |Y_i'Y_j|, and M₁ a real such that P(M₁(Y_n) ≤ M₁) → 1. We assume that there exists ǫ > 0 such that

(M₁^{1/2} ∨ 1 ∨ R_p) R_p (log n + (log n)^{1+ǫ})^{1/b} / p^{1/2} → 0.

We then have

max_{i,j} | X_i'X_j − (Y_i'Y_j + δ_{ij} νR_i²) | → 0 in probability,   (8)

where δ_{ij} is Kronecker's delta. We call J_η = [−M₁ − η − R_p²ν, M₁ + η + R_p²ν] and F_{C,J_η} = {f such that |f(x) − f(y)| ≤ C|x − y| for all x, y ∈ J_η}. We then consider the random matrices M_f with (i,j) entry

M_f(i,j) = (1/n) f(X_i'X_j), for f ∈ F_{C,J_η}.

Let us call M̃_f the matrix with (i,j) entry

M̃_f(i,j) = (1/n) f(Y_i'Y_j) if i ≠ j, and (1/n) f(‖Y_i‖² + νR_i²) if i = j.

We have, for any C > 0 and η > 0,

lim_{n,p→∞} sup_{f∈F_{C,J_η}} ‖M_f − M̃_f‖_F = 0 in probability.

We note that under our assumptions, we also have |f(‖Y_i‖² + νR_i²) − f(‖Y_i‖²)| ≤ νC R_p², with high probability, and uniformly in f ∈ F_{C,J_η}. Therefore, when R_p⁴/n → 0, the result is also valid if we replace the diagonal of M̃_f by {f(‖Y_i‖²)}_{i=1}^n/n, in which case the new approximating matrix is the kernel matrix computed from the signal part of the data. Furthermore, the same argument shows that we get a valid operator norm approximation of M_f by this pure signal matrix as soon as R_p²/n tends to 0.

The same measurability issues as in the previous theorems might arise here, and the statement should be understood as before: we can find a random variable going to 0 in probability that is larger than the random element sup_{f∈F_{C,J_η}} ‖M_f − M̃_f‖_F. Finally, let us note that once again the theorem is stated for a fixed C, and hence for an essentially fixed (with n) class of functions, though some changes in this class might come from varying J_η; but the proof allows us to deal with a varying C(n). The adjustments are very similar to the ones we discussed after the proof of Theorem 2, and we leave them to the interested reader.
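A numerical illustration of Theorem 4 (ours; f = tanh, which is 1-Lipschitz, a circle signal and R_i = 1 are illustrative choices): off the diagonal, the approximating matrix uses the original kernel on the pure signal.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, nu = 60, 50000, 1.0
t = rng.uniform(0, 2 * np.pi, n)
Y = np.zeros((n, p)); Y[:, 0], Y[:, 1] = np.cos(t), np.sin(t)
X = Y + rng.standard_normal((n, p)) / np.sqrt(p)       # R_i = 1

f = np.tanh                                            # 1-Lipschitz
M = f(X @ X.T) / n                                     # dot-product kernel, noisy data
M_tilde = f(Y @ Y.T) / n                               # same kernel, pure signal
idx = np.arange(n)
M_tilde[idx, idx] = f(np.sum(Y**2, axis=1) + nu) / n   # diagonal: f(||Y_i||^2 + nu R_i^2)/n
print(np.linalg.norm(M - M_tilde, 'fro'))              # small: the noise drops out
```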

Proof. The proof is quite similar to that of Theorem 2, so we mostly outline the differences, and we use the same notations as before. We now have to focus on

X_i'X_j = Y_i'Y_j + R_iZ_i'Y_j/p^{1/2} + R_jZ_j'Y_i/p^{1/2} + R_iR_jZ_i'Z_j/p.

The analysis of R_iZ_i'Y_j/p^{1/2} is entirely similar to our analysis of α in the proof of Theorem 2. The key remark now is that, as a function of Z_i, when YR_n ∈ LR, it is, with the new definition of M₁, (R_pM₁^{1/2}/p^{1/2})-Lipschitz with respect to Euclidean norm. So we immediately have, with the new definition of M₁: if

r₀ = R_pM₁^{1/2} p^{−1/2} ((2 log n + (log n)^{1+ǫ})/c₀)^{1/b},

and YR_n ∈ LR, for some K > 0 which does not depend on YR_n,

P_{YR_n}( max_{i,j} |R_iZ_i'Y_j|/p^{1/2} > r₀ ) ≤ K exp(−(log n)^{1+ǫ}).

Now, since P(YR_n ∉ LR) → 0, we conclude as before that

P( max_{i,j} |R_iZ_i'Y_j|/p^{1/2} > r₀ ) → 0.

On the other hand, using the fact that 4R_iR_jZ_i'Z_j = ‖R_iZ_i + R_jZ_j‖² − ‖R_iZ_i − R_jZ_j‖², and analyzing the concentration properties of ‖R_iZ_i + R_jZ_j‖ in the same way as we did those of ‖R_iZ_i − R_jZ_j‖, we conclude that if

u_p = R_p² ((2 log n + (log n)^{1+ǫ})/c₀)^{1/b} / p^{1/2},

we can find a constant K such that

P( max_{i≠j} | ‖R_iZ_i − R_jZ_j‖²/p − ν(R_i² + R_j²) | > K u_p ) → 0, and

P( max_{i≠j} | ‖R_iZ_i + R_jZ_j‖²/p − ν(R_i² + R_j²) | > K u_p ) → 0.

Similar arguments, relying on the fact that Z ↦ ‖Z‖ is obviously 1-Lipschitz with respect to Euclidean norm, also lead to the fact that

P( max_i | R_i²‖Z_i‖²/p − νR_i² | > K u_p ) → 0.

Therefore, we can find K, greater than 1 without loss of generality, such that

P( max_{i,j} | R_iR_jZ_i'Z_j/p − δ_{ij}νR_i² | > K u_p ) → 0.

We can therefore conclude that

P( max_{i,j} | X_i'X_j − (Y_i'Y_j + δ_{ij}νR_i²) | > K u_p + 2r₀ ) → 0.

If (M₁^{1/2} ∨ 1 ∨ R_p) R_p (log n + (log n)^{1+ǫ})^{1/b}/p^{1/2} → 0, then both r₀ and u_p tend to 0. Therefore, under our assumptions,

max_{i,j} | X_i'X_j − (Y_i'Y_j + δ_{ij}νR_i²) | → 0 in probability.

So we have shown the first assertion of the theorem. The final step of the proof is now clear: we have, for all i, j,

| f(X_i'X_j) − f(Y_i'Y_j + δ_{ij}νR_i²) | ≤ C | X_i'X_j − (Y_i'Y_j + δ_{ij}νR_i²) |,

when, for all i, j, X_i'X_j and Y_i'Y_j + δ_{ij}νR_i² are in J_η. This event happens with probability going to 1 under our assumptions. So, following the same approach as before, and dealing with measurability in the same way, we have, with probability going to 1,

sup_{f∈F_{C,J_η}} max_{i,j} | f(X_i'X_j) − f(Y_i'Y_j + δ_{ij}νR_i²) | ≤ C max_{i,j} | X_i'X_j − (Y_i'Y_j + δ_{ij}νR_i²) |.

So we conclude that

sup_{f∈F_{C,J_η}} max_{i,j} | f(X_i'X_j) − f(Y_i'Y_j + δ_{ij}νR_i²) | → 0 in probability.

From this statement, we get, in the same manner as before,

sup_{f∈F_{C,J_η}} ‖M_f − M̃_f‖_F → 0 in probability.

As before, the equations above show that if C(n)(u_p + r₀) → 0, the same approximation result holds, now with a varying C(n).

2.3 Practical consequences of the results: case of spherical noise

Our aim in giving approximation results is naturally to use existing knowledge concerning the approximating matrix to reach conclusions concerning the information+noise kernel matrices that are of interest here. In particular, we have in mind situations where the signal part of the data (i.e., what we called {Y_i}_{i=1}^n in the theorems) and f, or f(· + 2ν) with ν as defined in Theorems 1 or 2, are such that the assumptions of Theorems 3 or 5 in Koltchinskii and Giné (2000) are satisfied, in which case we can approximate the eigenvalues of M̃ by those of the corresponding operator in L²(dP). In this setting, the matrix M̃, which is normalized so its entries are of order 1/n, has a non-degenerate limit, which is why we considered for our kernel matrices the normalization f(‖X_i − X_j‖²)/n. This normalization by 1/n makes our proofs considerably simpler than the ones given in El Karoui (2007). Another potentially interesting application is the case where the signal part of the data is sampled i.i.d. from a manifold with bounded Euclidean diameter, in which case our results are clearly applicable.

2.3.1 Spectral properties of information+noise kernel random matrices from pure signal kernel random matrices

The practical interest of the theorems we obtained above lies in the fact that the Frobenius norm is larger than the operator norm, and therefore all of our results also hold in operator norm. Now we recall the discussion in El Karoui (2008), Section 3.3, where we explained that consistency in operator norm implies consistency of eigenvalues, and consistency of eigenspaces corresponding to separated eigenvalues, as consequences of Weyl's inequality and the Davis-Kahan sin(θ) theorem (see Bhatia (1997) and Stewart and Sun (1990)). Theorems 1, 2 and 4 therefore imply that, under the assumptions stated there, the spectral properties of the matrix M can be deduced from those of the matrix M̃. In particular, for techniques such as kernel PCA, we expect, when it is a reasonable idea to use that technique, that M will have some separated eigenvalues, i.e., a few will be large and there will be a gap in the spectrum. In that setting, it is enough to understand M̃, which corresponds, if ∀i, R_i = 1, to a pure signal matrix with a possibly slightly different kernel, to have a theoretical understanding of the properties of the technique. For instance, if ∀i, R_i = 1, and if the assumptions underlying the first-order results of Koltchinskii and Giné (2000) are satisfied for M̃, the first-order spectral properties of M are the same as those of M̃, and hence of the corresponding operator in L²(dP).
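The two perturbation results just invoked are easy to visualize numerically. In the sketch below (ours; the diagonal test matrix and the perturbation size are arbitrary), a symmetric matrix with a separated top eigenvalue is perturbed by a matrix of small operator norm, and both the top eigenvalue (Weyl) and the top eigenvector (Davis-Kahan) barely move.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 200
A = np.diag(np.r_[5.0, np.full(n - 1, 0.5)])   # separated top eigenvalue
E = rng.standard_normal((n, n)); E = E + E.T
E *= 1e-2 / np.linalg.norm(E, 2)               # operator norm ||E||_2 = 0.01

wA, vA = np.linalg.eigh(A)
wB, vB = np.linalg.eigh(A + E)
print(abs(wA[-1] - wB[-1]))        # <= ||E||_2 (Weyl's inequality)
print(abs(vA[:, -1] @ vB[:, -1]))  # ~1 (Davis-Kahan: separated eigenvalue)
```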
2.3.2 On the Gaussian kernel

Our analysis reveals a very interesting feature of the Gaussian kernel, i.e., the case where M(i,j) = exp(−s‖X_i − X_j‖²)/n, for some s > 0: when Theorem 1 or Corollary 3 (i.e., Theorem 2 with ∀i, R_i = 1) applies, the eigenspaces corresponding to separated eigenvalues of the signal+noise kernel matrix converge to those of the pure signal matrix. This is simply due to the fact that, in that setting, if S is the matrix such that

S(i,j) = exp(−2νs) (1/n) exp(−s‖Y_i − Y_j‖²),

a rescaled version of the pure signal matrix M_Y with (i,j) entry (1/n) exp(−s‖Y_i − Y_j‖²), we have ‖S − M̃‖₂ → 0. This latter statement is a simple consequence of the fact that S − M̃ is a diagonal matrix with entries (exp(−2νs) − 1)/n on the diagonal, and therefore its operator norm goes to 0. On the other hand, S clearly has the same eigenvectors as the pure signal matrix M_Y. Hence, because the eigenspaces of M are consistent for the eigenspaces of S corresponding to separated eigenvalues, they are also consistent for those of M_Y.
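Numerically (our sketch; circle signal, Σ = Id, s = 1, so ν = 1, all illustrative), both the eigenvector alignment and the exp(−2νs) shrinkage of the top eigenvalue are visible:

```python
import numpy as np

rng = np.random.default_rng(7)
n, p, s, nu = 60, 50000, 1.0, 1.0
t = rng.uniform(0, 2 * np.pi, n)
Y = np.zeros((n, p)); Y[:, 0], Y[:, 1] = np.cos(t), np.sin(t)
X = Y + rng.standard_normal((n, p)) / np.sqrt(p)

def sq_dists(A):
    g = A @ A.T; d = np.diag(g)
    return d[:, None] + d[None, :] - 2.0 * g

M = np.exp(-s * sq_dists(X)) / n      # Gaussian kernel, information+noise data
M_Y = np.exp(-s * sq_dists(Y)) / n    # Gaussian kernel, pure signal

wM, vM = np.linalg.eigh(M)
wY, vY = np.linalg.eigh(M_Y)
print(abs(vM[:, -1] @ vY[:, -1]))     # ~1: top eigenvectors aligned
print(wM[-1] / wY[-1])                # close to exp(-2 nu s), up to a small
                                      # diagonal correction
```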

We note that our results are actually stronger and allow us to deal with a collection of matrices with varying s, and not a single s as we just discussed. This is because we can deal with approximations over a collection of functions in all our theorems. Because of the practical importance of eigenspaces in techniques such as kernel PCA, these remarks can be seen as giving a theoretical justification for the use of the Gaussian kernel over other kernels in situations where we think we might be in an information + noise setting and the noise is spherical.

On the other hand, S underestimates the large eigenvalues of M_Y, because S = exp(−2νs)M_Y, and obviously exp(−2νs) < 1. Using Weyl's inequality (see Bhatia (1997)), we have, if we denote by λ_i(M) the i-th eigenvalue of the symmetric matrix M,

∀i, 1 ≤ i ≤ n, |λ_i(M) − λ_i(S)| ≤ ‖M − S‖₂.

Since the right-hand side goes to 0 asymptotically, the eigenvalues of M_Y (the pure signal matrix) that stay asymptotically bounded away from 0 are underestimated by the corresponding eigenvalues of M.

When the noise is elliptical, i.e., the R_i's are not all equal to 1, the new matrix S we have to deal with has entries

S(i,j) = exp(−sνR_i²) exp(−sνR_j²) (1/n) exp(−s‖Y_i − Y_j‖²),

so it can be written in matrix form S = DM_YD, where D is a diagonal matrix with D(i,i) = exp(−sνR_i²). By the same arguments as above, ‖S − M‖₂ → 0 in probability, but now S does not have the same eigenvectors as the pure signal matrix M_Y. So, in this elliptical setting, if we were to do kernel analysis on M, we would not be recovering the eigenspaces of the pure signal matrix M_Y.

2.3.3 Variants of kernel matrices: Laplacian matrices and the issue of centering

In various parts of statistics and machine learning, it has been argued that Laplacian matrices should be used instead of kernel matrices. See for instance the very interesting Belkin and Niyogi (2008), where various spectral properties of Laplacian matrices have been studied, under a pure signal assumption in our terminology: for instance, it is assumed that the data is sampled from a fixed-dimensional manifold. In light of the theoretical and practical success of these methods, it is natural to ask what happens in the information+noise case.

There are several definitions of Laplacian matrices. A popular one (see e.g. the work of Belkin and Niyogi, among other publications Belkin and Niyogi (2008)) is derived from kernel matrices: given a kernel matrix M, the Laplacian matrix is defined as

L(i,j) = −M(i,j) if i ≠ j, and L(i,i) = Σ_{j≠i} M(i,j) otherwise.

When our Theorems 2 or 4 apply, we have seen that, for the relevant classes of functions F,

sup_{f∈F} max_{i,j} n |M_f(i,j) − M̃_f(i,j)| → 0 in probability.

Let us now focus on the case of a single function f. If we call L̃ the Laplacian matrix corresponding to M̃, we have

max_{i≠j} n |L(i,j) − L̃(i,j)| → 0 in probability, and max_i |L(i,i) − L̃(i,i)| → 0 in probability.

We conclude that ‖L − L̃‖₂ → 0 in probability; we can therefore deduce the spectral properties of the Laplacian matrix L from those of L̃, which, when ∀i, R_i = 1, is a pure signal matrix, where we have slightly adjusted the kernel. Here again, the Gaussian kernel plays a special role, since when we use a Gaussian kernel, L̃ is a scaled version of the Laplacian matrix computed from the signal part of the data.

Finally, other versions of the Laplacian are also used in practice. In particular, a normalized version is sometimes advocated, and computed as

N_L = D_L^{−1/2} L D_L^{−1/2}, where D_L is the diagonal of the matrix L defined above.

We have just seen that ‖D_L − D_L̃‖₂ → 0 in probability and ‖L − L̃‖₂ → 0 in probability. Therefore, if the entries of D_L̃ are bounded away from 0 with probability going to 1, we conclude that ‖D_L^{−1/2}‖₂ stays bounded with high probability, and ‖N_L − Ñ_L‖₂ → 0 in probability.
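For concreteness, a small implementation (ours) of the two Laplacian constructions just defined:

```python
import numpy as np

def laplacian(M):
    """L(i,j) = -M(i,j) for i != j, L(i,i) = sum_{j != i} M(i,j),
    following the definition above."""
    L = -M.copy()
    np.fill_diagonal(L, 0.0)
    np.fill_diagonal(L, -L.sum(axis=1))  # row sums of M off the diagonal
    return L

def normalized_laplacian(L):
    """N_L = D_L^{-1/2} L D_L^{-1/2}, with D_L the diagonal of L."""
    d = np.sqrt(np.diag(L))
    return L / np.outer(d, d)
```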

So, once again, understanding the spectral properties of N_L essentially boils down to understanding those of Ñ_L, which is, in the spherical setting where ∀i, R_i = 1, a pure signal matrix. In the case of the Gaussian kernel, Ñ_L is equal to the normalized Laplacian matrix computed from the pure signal data {Y_i}_{i=1}^n.

The question of centering. In practice, it is often the case that one works with centered versions of kernel matrices: either the row sums, the column sums, or both are made to be equal to zero. These centering operations amount to multiplying our original kernel matrix, respectively on the right, left, or both sides, by the matrix H = Id − 𝟙𝟙'/n, where 𝟙 is the n-dimensional vector whose entries are all equal to 1. This matrix has operator norm 1, so when M is such that ‖M − M̃‖₂ → 0, the same is true for H^aMH^b and H^aM̃H^b, where a and b are either 0 or 1. This shows that our approximations are therefore also informative when working with centered kernel matrices.
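A minimal sketch (ours) of the centering operation; having operator norm 1 is the only property of H the argument uses.

```python
import numpy as np

def center_kernel(M, rows=True, cols=True):
    """Multiply by H = I - (1/n) 11' on the chosen sides. Since ||H||_2 = 1,
    ||H M H - H M~ H||_2 <= ||M - M~||_2 for any approximating matrix M~."""
    n = M.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    if rows:
        M = H @ M
    if cols:
        M = M @ H
    return M
```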
3 Conclusions

Our results aim to bridge the gap in the existing literature between the study of kernel random matrices in the presence of pure low-dimensional signal data (see e.g. Koltchinskii and Giné (2000)) and the case of truly high-dimensional data (see El Karoui (2007)). Our study of information+noise kernel random matrices shows that, to first order, kernel random matrices are somewhat spectrally robust to the corruption of signal by additive, high-dimensional and spherical noise whose norm is controlled. In particular, they tend to behave much more like a kernel matrix computed from a low-dimensional signal than one computed from high-dimensional data.

Some noteworthy results include the fact that dot-product kernel random matrices are, under reasonable assumptions on the kernel and the signal distribution, spectrally robust for both eigenvalues and eigenvectors. The Gaussian kernel also yields spectrally robust matrices at the level of eigenvectors, when the noise is spherical. However, it will underestimate separated eigenvalues of the Gaussian kernel matrix corresponding to the signal part of the data. On the other hand, Euclidean distance kernel random matrices are not, in general, robust to the presence of additive noise. As our results show, under reasonably minimal assumptions on the noise, the kernel and the signal distribution, a Euclidean distance kernel random matrix computed from additively corrupted data behaves like another Euclidean kernel matrix computed with another kernel: in the case of spherical noise, it is a shifted version of f, the shift being twice the squared norm of the noise. For spherical noise, this is bound to create, except for the Gaussian kernel, potentially serious inconsistencies in both estimators of eigenvalues and eigenvectors, because the eigenproperties of the kernel matrix corresponding to the function f_{2ν} = f(· + 2ν) are in general different from those of the kernel matrix corresponding to the function f. The same remarks also apply to the case of elliptical noise, where the change of kernel is not deterministic and is even more complicated to describe.

Our study also highlights the importance of the implicit geometric assumptions that are made about the noise. In particular, the results are qualitatively different if the noise is spherical (e.g., multivariate Gaussian) or elliptical (e.g., multivariate t). Interpretation is more complicated in the elliptical case, and a number of nice properties (e.g., robustness or consistency) which hold for spherical noise do not hold for elliptical noise. Our results can therefore be seen as highlighting, from a theoretical point of view, the strengths and limitations of techniques which rely on kernel random matrices as a primary element in a data analysis. We hope they shed light on an interesting issue and will help refine our understanding of the behavior of kernel techniques and related methodologies for high-dimensional input data.

References

Anderson, T. W. (2003). An introduction to multivariate statistical analysis. Wiley Series in Probability and Statistics. Wiley-Interscience, Hoboken, NJ, third edition.

Bach, F. R. and Jordan, M. I. (2003). Kernel independent component analysis. J. Mach. Learn. Res. 3, 1-48.

Bai, Z. D. (1999). Methodologies in spectral analysis of large-dimensional random matrices, a review. Statist. Sinica 9, 611-677. With comments by G. J. Rodgers and Jack W. Silverstein, and a rejoinder by the author.

Belkin, M. and Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation 15, 1373-1396.

Belkin, M. and Niyogi, P. (2008). Convergence of Laplacian eigenmaps. Preprint.

Bhatia, R. (1997). Matrix analysis, volume 169 of Graduate Texts in Mathematics. Springer-Verlag, New York.

Cressie, N. A. C. (1993). Statistics for spatial data. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons Inc., New York. Revised reprint of the 1991 edition, a Wiley-Interscience Publication.

El Karoui, N. (2007). The spectrum of kernel random matrices. Technical Report 748, Department of Statistics, UC Berkeley. To appear in The Annals of Statistics.

El Karoui, N. (2008). Operator norm consistent estimation of large dimensional sparse covariance matrices. The Annals of Statistics 36, 2717-2756.

El Karoui, N. (2009). Concentration of measure and spectra of random matrices: with applications to correlation matrices, elliptical distributions and beyond. The Annals of Applied Probability. To appear.

Izenman, A. J. (2008). Modern multivariate statistical techniques. Springer Texts in Statistics. Springer, New York. Regression, classification, and manifold learning.

Johnstone, I. (2001). On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist. 29, 295-327.

Johnstone, I. M. (2007). High dimensional statistical inference and random matrices. In International Congress of Mathematicians, Vol. I. Eur. Math. Soc., Zürich.

Koltchinskii, V. and Giné, E. (2000). Random matrix approximation of spectra of integral operators. Bernoulli 6, 113-167.

Ledoux, M. (2001). The concentration of measure phenomenon, volume 89 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI.

Marčenko, V. A. and Pastur, L. A. (1967). Distribution of eigenvalues in certain sets of random matrices. Mat. Sb. (N.S.) 72 (114), 507-536.

Rasmussen, C. E. and Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. The MIT Press.

Schechtman, G. and Zinn, J. (2000). Concentration on the ℓ_p^n ball. In Geometric aspects of functional analysis, volume 1745 of Lecture Notes in Math. Springer, Berlin.

Schölkopf, B. and Smola, A. J. (2002). Learning with kernels. The MIT Press, Cambridge, MA.

Stewart, G. W. and Sun, J. G. (1990). Matrix perturbation theory. Computer Science and Scientific Computing. Academic Press Inc., Boston, MA.

van der Vaart, A. W. (1998). Asymptotic statistics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge.

von Luxburg, U., Belkin, M., and Bousquet, O. (2008). Consistency of spectral clustering. Ann. Statist. 36, 555-586.

Williams, C. and Seeger, M. (2000). The effect of the input density distribution on kernel-based classifiers. In Proceedings of the 17th International Conference on Machine Learning.

Zwald, L., Bousquet, O., and Blanchard, G. (2004). Statistical properties of kernel principal component analysis. In Learning theory, volume 3120 of Lecture Notes in Comput. Sci. Springer, Berlin.


Uniformly best wavenumber approximations by spatial central difference operators: An initial investigation Uniformly best wavenumber aroximations by satial central difference oerators: An initial investigation Vitor Linders and Jan Nordström Abstract A characterisation theorem for best uniform wavenumber aroximations

More information

Khinchine inequality for slightly dependent random variables

Khinchine inequality for slightly dependent random variables arxiv:170808095v1 [mathpr] 7 Aug 017 Khinchine inequality for slightly deendent random variables Susanna Sektor Abstract We rove a Khintchine tye inequality under the assumtion that the sum of Rademacher

More information

Commutators on l. D. Dosev and W. B. Johnson

Commutators on l. D. Dosev and W. B. Johnson Submitted exclusively to the London Mathematical Society doi:10.1112/0000/000000 Commutators on l D. Dosev and W. B. Johnson Abstract The oerators on l which are commutators are those not of the form λi

More information

GOOD MODELS FOR CUBIC SURFACES. 1. Introduction

GOOD MODELS FOR CUBIC SURFACES. 1. Introduction GOOD MODELS FOR CUBIC SURFACES ANDREAS-STEPHAN ELSENHANS Abstract. This article describes an algorithm for finding a model of a hyersurface with small coefficients. It is shown that the aroach works in

More information

Sums of independent random variables

Sums of independent random variables 3 Sums of indeendent random variables This lecture collects a number of estimates for sums of indeendent random variables with values in a Banach sace E. We concentrate on sums of the form N γ nx n, where

More information

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables Imroved Bounds on Bell Numbers and on Moments of Sums of Random Variables Daniel Berend Tamir Tassa Abstract We rovide bounds for moments of sums of sequences of indeendent random variables. Concentrating

More information

1 Riesz Potential and Enbeddings Theorems

1 Riesz Potential and Enbeddings Theorems Riesz Potential and Enbeddings Theorems Given 0 < < and a function u L loc R, the Riesz otential of u is defined by u y I u x := R x y dy, x R We begin by finding an exonent such that I u L R c u L R for

More information

Lecture 6. 2 Recurrence/transience, harmonic functions and martingales

Lecture 6. 2 Recurrence/transience, harmonic functions and martingales Lecture 6 Classification of states We have shown that all states of an irreducible countable state Markov chain must of the same tye. This gives rise to the following classification. Definition. [Classification

More information

A CONCRETE EXAMPLE OF PRIME BEHAVIOR IN QUADRATIC FIELDS. 1. Abstract

A CONCRETE EXAMPLE OF PRIME BEHAVIOR IN QUADRATIC FIELDS. 1. Abstract A CONCRETE EXAMPLE OF PRIME BEHAVIOR IN QUADRATIC FIELDS CASEY BRUCK 1. Abstract The goal of this aer is to rovide a concise way for undergraduate mathematics students to learn about how rime numbers behave

More information

On Z p -norms of random vectors

On Z p -norms of random vectors On Z -norms of random vectors Rafa l Lata la Abstract To any n-dimensional random vector X we may associate its L -centroid body Z X and the corresonding norm. We formulate a conjecture concerning the

More information

Applications to stochastic PDE

Applications to stochastic PDE 15 Alications to stochastic PE In this final lecture we resent some alications of the theory develoed in this course to stochastic artial differential equations. We concentrate on two secific examles:

More information

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)]

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)] LECTURE 7 NOTES 1. Convergence of random variables. Before delving into the large samle roerties of the MLE, we review some concets from large samle theory. 1. Convergence in robability: x n x if, for

More information

Understanding Big Data Spectral Clustering

Understanding Big Data Spectral Clustering Understanding Big Data Sectral Clustering Romain Couillet, Florent Benaych-Georges To cite this version: Romain Couillet, Florent Benaych-Georges. Understanding Big Data Sectral Clustering. IEEE 6th International

More information

arxiv: v1 [physics.data-an] 26 Oct 2012

arxiv: v1 [physics.data-an] 26 Oct 2012 Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch

More information

p-adic Measures and Bernoulli Numbers

p-adic Measures and Bernoulli Numbers -Adic Measures and Bernoulli Numbers Adam Bowers Introduction The constants B k in the Taylor series exansion t e t = t k B k k! k=0 are known as the Bernoulli numbers. The first few are,, 6, 0, 30, 0,

More information

L p -CONVERGENCE OF THE LAPLACE BELTRAMI EIGENFUNCTION EXPANSIONS

L p -CONVERGENCE OF THE LAPLACE BELTRAMI EIGENFUNCTION EXPANSIONS L -CONVERGENCE OF THE LAPLACE BELTRAI EIGENFUNCTION EXPANSIONS ATSUSHI KANAZAWA Abstract. We rovide a simle sufficient condition for the L - convergence of the Lalace Beltrami eigenfunction exansions of

More information

Various Proofs for the Decrease Monotonicity of the Schatten s Power Norm, Various Families of R n Norms and Some Open Problems

Various Proofs for the Decrease Monotonicity of the Schatten s Power Norm, Various Families of R n Norms and Some Open Problems Int. J. Oen Problems Comt. Math., Vol. 3, No. 2, June 2010 ISSN 1998-6262; Coyright c ICSRS Publication, 2010 www.i-csrs.org Various Proofs for the Decrease Monotonicity of the Schatten s Power Norm, Various

More information

The inverse Goldbach problem

The inverse Goldbach problem 1 The inverse Goldbach roblem by Christian Elsholtz Submission Setember 7, 2000 (this version includes galley corrections). Aeared in Mathematika 2001. Abstract We imrove the uer and lower bounds of the

More information

Journal of Mathematical Analysis and Applications

Journal of Mathematical Analysis and Applications J. Math. Anal. Al. 44 (3) 3 38 Contents lists available at SciVerse ScienceDirect Journal of Mathematical Analysis and Alications journal homeage: www.elsevier.com/locate/jmaa Maximal surface area of a

More information

Elementary theory of L p spaces

Elementary theory of L p spaces CHAPTER 3 Elementary theory of L saces 3.1 Convexity. Jensen, Hölder, Minkowski inequality. We begin with two definitions. A set A R d is said to be convex if, for any x 0, x 1 2 A x = x 0 + (x 1 x 0 )

More information

#A64 INTEGERS 18 (2018) APPLYING MODULAR ARITHMETIC TO DIOPHANTINE EQUATIONS

#A64 INTEGERS 18 (2018) APPLYING MODULAR ARITHMETIC TO DIOPHANTINE EQUATIONS #A64 INTEGERS 18 (2018) APPLYING MODULAR ARITHMETIC TO DIOPHANTINE EQUATIONS Ramy F. Taki ElDin Physics and Engineering Mathematics Deartment, Faculty of Engineering, Ain Shams University, Cairo, Egyt

More information

GENERALIZED NORMS INEQUALITIES FOR ABSOLUTE VALUE OPERATORS

GENERALIZED NORMS INEQUALITIES FOR ABSOLUTE VALUE OPERATORS International Journal of Analysis Alications ISSN 9-8639 Volume 5, Number (04), -9 htt://www.etamaths.com GENERALIZED NORMS INEQUALITIES FOR ABSOLUTE VALUE OPERATORS ILYAS ALI, HU YANG, ABDUL SHAKOOR Abstract.

More information

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK Comuter Modelling and ew Technologies, 5, Vol.9, o., 3-39 Transort and Telecommunication Institute, Lomonosov, LV-9, Riga, Latvia MATHEMATICAL MODELLIG OF THE WIRELESS COMMUICATIO ETWORK M. KOPEETSK Deartment

More information

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO)

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO) Combining Logistic Regression with Kriging for Maing the Risk of Occurrence of Unexloded Ordnance (UXO) H. Saito (), P. Goovaerts (), S. A. McKenna (2) Environmental and Water Resources Engineering, Deartment

More information

HASSE INVARIANTS FOR THE CLAUSEN ELLIPTIC CURVES

HASSE INVARIANTS FOR THE CLAUSEN ELLIPTIC CURVES HASSE INVARIANTS FOR THE CLAUSEN ELLIPTIC CURVES AHMAD EL-GUINDY AND KEN ONO Astract. Gauss s F x hyergeometric function gives eriods of ellitic curves in Legendre normal form. Certain truncations of this

More information

Cryptanalysis of Pseudorandom Generators

Cryptanalysis of Pseudorandom Generators CSE 206A: Lattice Algorithms and Alications Fall 2017 Crytanalysis of Pseudorandom Generators Instructor: Daniele Micciancio UCSD CSE As a motivating alication for the study of lattice in crytograhy we

More information

On Isoperimetric Functions of Probability Measures Having Log-Concave Densities with Respect to the Standard Normal Law

On Isoperimetric Functions of Probability Measures Having Log-Concave Densities with Respect to the Standard Normal Law On Isoerimetric Functions of Probability Measures Having Log-Concave Densities with Resect to the Standard Normal Law Sergey G. Bobkov Abstract Isoerimetric inequalities are discussed for one-dimensional

More information

E-companion to A risk- and ambiguity-averse extension of the max-min newsvendor order formula

E-companion to A risk- and ambiguity-averse extension of the max-min newsvendor order formula e-comanion to Han Du and Zuluaga: Etension of Scarf s ma-min order formula ec E-comanion to A risk- and ambiguity-averse etension of the ma-min newsvendor order formula Qiaoming Han School of Mathematics

More information

Sobolev Spaces with Weights in Domains and Boundary Value Problems for Degenerate Elliptic Equations

Sobolev Spaces with Weights in Domains and Boundary Value Problems for Degenerate Elliptic Equations Sobolev Saces with Weights in Domains and Boundary Value Problems for Degenerate Ellitic Equations S. V. Lototsky Deartment of Mathematics, M.I.T., Room 2-267, 77 Massachusetts Avenue, Cambridge, MA 02139-4307,

More information

On a class of Rellich inequalities

On a class of Rellich inequalities On a class of Rellich inequalities G. Barbatis A. Tertikas Dedicated to Professor E.B. Davies on the occasion of his 60th birthday Abstract We rove Rellich and imroved Rellich inequalities that involve

More information

Sets of Real Numbers

Sets of Real Numbers Chater 4 Sets of Real Numbers 4. The Integers Z and their Proerties In our revious discussions about sets and functions the set of integers Z served as a key examle. Its ubiquitousness comes from the fact

More information

Stochastic integration II: the Itô integral

Stochastic integration II: the Itô integral 13 Stochastic integration II: the Itô integral We have seen in Lecture 6 how to integrate functions Φ : (, ) L (H, E) with resect to an H-cylindrical Brownian motion W H. In this lecture we address the

More information

Lecture: Condorcet s Theorem

Lecture: Condorcet s Theorem Social Networs and Social Choice Lecture Date: August 3, 00 Lecture: Condorcet s Theorem Lecturer: Elchanan Mossel Scribes: J. Neeman, N. Truong, and S. Troxler Condorcet s theorem, the most basic jury

More information

ON MINKOWSKI MEASURABILITY

ON MINKOWSKI MEASURABILITY ON MINKOWSKI MEASURABILITY F. MENDIVIL AND J. C. SAUNDERS DEPARTMENT OF MATHEMATICS AND STATISTICS ACADIA UNIVERSITY WOLFVILLE, NS CANADA B4P 2R6 Abstract. Two athological roerties of Minkowski content

More information

Feedback-error control

Feedback-error control Chater 4 Feedback-error control 4.1 Introduction This chater exlains the feedback-error (FBE) control scheme originally described by Kawato [, 87, 8]. FBE is a widely used neural network based controller

More information

GENERICITY OF INFINITE-ORDER ELEMENTS IN HYPERBOLIC GROUPS

GENERICITY OF INFINITE-ORDER ELEMENTS IN HYPERBOLIC GROUPS GENERICITY OF INFINITE-ORDER ELEMENTS IN HYPERBOLIC GROUPS PALLAVI DANI 1. Introduction Let Γ be a finitely generated grou and let S be a finite set of generators for Γ. This determines a word metric on

More information

Positive decomposition of transfer functions with multiple poles

Positive decomposition of transfer functions with multiple poles Positive decomosition of transfer functions with multile oles Béla Nagy 1, Máté Matolcsi 2, and Márta Szilvási 1 Deartment of Analysis, Technical University of Budaest (BME), H-1111, Budaest, Egry J. u.

More information

Best approximation by linear combinations of characteristic functions of half-spaces

Best approximation by linear combinations of characteristic functions of half-spaces Best aroximation by linear combinations of characteristic functions of half-saces Paul C. Kainen Deartment of Mathematics Georgetown University Washington, D.C. 20057-1233, USA Věra Kůrková Institute of

More information

Collaborative Place Models Supplement 1

Collaborative Place Models Supplement 1 Collaborative Place Models Sulement Ber Kaicioglu Foursquare Labs ber.aicioglu@gmail.com Robert E. Schaire Princeton University schaire@cs.rinceton.edu David S. Rosenberg P Mobile Labs david.davidr@gmail.com

More information

Notes on Instrumental Variables Methods

Notes on Instrumental Variables Methods Notes on Instrumental Variables Methods Michele Pellizzari IGIER-Bocconi, IZA and frdb 1 The Instrumental Variable Estimator Instrumental variable estimation is the classical solution to the roblem of

More information

216 S. Chandrasearan and I.C.F. Isen Our results dier from those of Sun [14] in two asects: we assume that comuted eigenvalues or singular values are

216 S. Chandrasearan and I.C.F. Isen Our results dier from those of Sun [14] in two asects: we assume that comuted eigenvalues or singular values are Numer. Math. 68: 215{223 (1994) Numerische Mathemati c Sringer-Verlag 1994 Electronic Edition Bacward errors for eigenvalue and singular value decomositions? S. Chandrasearan??, I.C.F. Isen??? Deartment

More information

Inference for Empirical Wasserstein Distances on Finite Spaces: Supplementary Material

Inference for Empirical Wasserstein Distances on Finite Spaces: Supplementary Material Inference for Emirical Wasserstein Distances on Finite Saces: Sulementary Material Max Sommerfeld Axel Munk Keywords: otimal transort, Wasserstein distance, central limit theorem, directional Hadamard

More information

ANALYTIC NUMBER THEORY AND DIRICHLET S THEOREM

ANALYTIC NUMBER THEORY AND DIRICHLET S THEOREM ANALYTIC NUMBER THEORY AND DIRICHLET S THEOREM JOHN BINDER Abstract. In this aer, we rove Dirichlet s theorem that, given any air h, k with h, k) =, there are infinitely many rime numbers congruent to

More information

Real Analysis 1 Fall Homework 3. a n.

Real Analysis 1 Fall Homework 3. a n. eal Analysis Fall 06 Homework 3. Let and consider the measure sace N, P, µ, where µ is counting measure. That is, if N, then µ equals the number of elements in if is finite; µ = otherwise. One usually

More information

Proof: We follow thearoach develoed in [4]. We adot a useful but non-intuitive notion of time; a bin with z balls at time t receives its next ball at

Proof: We follow thearoach develoed in [4]. We adot a useful but non-intuitive notion of time; a bin with z balls at time t receives its next ball at A Scaling Result for Exlosive Processes M. Mitzenmacher Λ J. Sencer We consider the following balls and bins model, as described in [, 4]. Balls are sequentially thrown into bins so that the robability

More information

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek Use of Transformations and the Reeated Statement in PROC GLM in SAS Ed Stanek Introduction We describe how the Reeated Statement in PROC GLM in SAS transforms the data to rovide tests of hyotheses of interest.

More information

2 Asymptotic density and Dirichlet density

2 Asymptotic density and Dirichlet density 8.785: Analytic Number Theory, MIT, sring 2007 (K.S. Kedlaya) Primes in arithmetic rogressions In this unit, we first rove Dirichlet s theorem on rimes in arithmetic rogressions. We then rove the rime

More information

arxiv:cond-mat/ v2 25 Sep 2002

arxiv:cond-mat/ v2 25 Sep 2002 Energy fluctuations at the multicritical oint in two-dimensional sin glasses arxiv:cond-mat/0207694 v2 25 Se 2002 1. Introduction Hidetoshi Nishimori, Cyril Falvo and Yukiyasu Ozeki Deartment of Physics,

More information

Extension of Minimax to Infinite Matrices

Extension of Minimax to Infinite Matrices Extension of Minimax to Infinite Matrices Chris Calabro June 21, 2004 Abstract Von Neumann s minimax theorem is tyically alied to a finite ayoff matrix A R m n. Here we show that (i) if m, n are both inite,

More information

RECIPROCITY LAWS JEREMY BOOHER

RECIPROCITY LAWS JEREMY BOOHER RECIPROCITY LAWS JEREMY BOOHER 1 Introduction The law of uadratic recirocity gives a beautiful descrition of which rimes are suares modulo Secial cases of this law going back to Fermat, and Euler and Legendre

More information

State Estimation with ARMarkov Models

State Estimation with ARMarkov Models Deartment of Mechanical and Aerosace Engineering Technical Reort No. 3046, October 1998. Princeton University, Princeton, NJ. State Estimation with ARMarkov Models Ryoung K. Lim 1 Columbia University,

More information

ASYMPTOTIC RESULTS OF A HIGH DIMENSIONAL MANOVA TEST AND POWER COMPARISON WHEN THE DIMENSION IS LARGE COMPARED TO THE SAMPLE SIZE

ASYMPTOTIC RESULTS OF A HIGH DIMENSIONAL MANOVA TEST AND POWER COMPARISON WHEN THE DIMENSION IS LARGE COMPARED TO THE SAMPLE SIZE J Jaan Statist Soc Vol 34 No 2004 9 26 ASYMPTOTIC RESULTS OF A HIGH DIMENSIONAL MANOVA TEST AND POWER COMPARISON WHEN THE DIMENSION IS LARGE COMPARED TO THE SAMPLE SIZE Yasunori Fujikoshi*, Tetsuto Himeno

More information

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests 009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract

More information

TRACES OF SCHUR AND KRONECKER PRODUCTS FOR BLOCK MATRICES

TRACES OF SCHUR AND KRONECKER PRODUCTS FOR BLOCK MATRICES Khayyam J. Math. DOI:10.22034/kjm.2019.84207 TRACES OF SCHUR AND KRONECKER PRODUCTS FOR BLOCK MATRICES ISMAEL GARCÍA-BAYONA Communicated by A.M. Peralta Abstract. In this aer, we define two new Schur and

More information

Convex Optimization methods for Computing Channel Capacity

Convex Optimization methods for Computing Channel Capacity Convex Otimization methods for Comuting Channel Caacity Abhishek Sinha Laboratory for Information and Decision Systems (LIDS), MIT sinhaa@mit.edu May 15, 2014 We consider a classical comutational roblem

More information

Positivity, local smoothing and Harnack inequalities for very fast diffusion equations

Positivity, local smoothing and Harnack inequalities for very fast diffusion equations Positivity, local smoothing and Harnack inequalities for very fast diffusion equations Dedicated to Luis Caffarelli for his ucoming 60 th birthday Matteo Bonforte a, b and Juan Luis Vázquez a, c Abstract

More information

The Longest Run of Heads

The Longest Run of Heads The Longest Run of Heads Review by Amarioarei Alexandru This aer is a review of older and recent results concerning the distribution of the longest head run in a coin tossing sequence, roblem that arise

More information

The Fekete Szegő theorem with splitting conditions: Part I

The Fekete Szegő theorem with splitting conditions: Part I ACTA ARITHMETICA XCIII.2 (2000) The Fekete Szegő theorem with slitting conditions: Part I by Robert Rumely (Athens, GA) A classical theorem of Fekete and Szegő [4] says that if E is a comact set in the

More information

On split sample and randomized confidence intervals for binomial proportions

On split sample and randomized confidence intervals for binomial proportions On slit samle and randomized confidence intervals for binomial roortions Måns Thulin Deartment of Mathematics, Usala University arxiv:1402.6536v1 [stat.me] 26 Feb 2014 Abstract Slit samle methods have

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Introduction to Neural Networks : Lecture 13 John A. Bullinaria, 2004 1. The RBF Maing 2. The RBF Network Architecture 3. Comutational Power of RBF Networks 4.

More information

Combinatorics of topmost discs of multi-peg Tower of Hanoi problem

Combinatorics of topmost discs of multi-peg Tower of Hanoi problem Combinatorics of tomost discs of multi-eg Tower of Hanoi roblem Sandi Klavžar Deartment of Mathematics, PEF, Unversity of Maribor Koroška cesta 160, 000 Maribor, Slovenia Uroš Milutinović Deartment of

More information

REFINED DIMENSIONS OF CUSP FORMS, AND EQUIDISTRIBUTION AND BIAS OF SIGNS

REFINED DIMENSIONS OF CUSP FORMS, AND EQUIDISTRIBUTION AND BIAS OF SIGNS REFINED DIMENSIONS OF CUSP FORMS, AND EQUIDISTRIBUTION AND BIAS OF SIGNS KIMBALL MARTIN Abstract. We refine nown dimension formulas for saces of cus forms of squarefree level, determining the dimension

More information

BEST CONSTANT IN POINCARÉ INEQUALITIES WITH TRACES: A FREE DISCONTINUITY APPROACH

BEST CONSTANT IN POINCARÉ INEQUALITIES WITH TRACES: A FREE DISCONTINUITY APPROACH BEST CONSTANT IN POINCARÉ INEQUALITIES WITH TRACES: A FREE DISCONTINUITY APPROACH DORIN BUCUR, ALESSANDRO GIACOMINI, AND PAOLA TREBESCHI Abstract For Ω R N oen bounded and with a Lischitz boundary, and

More information

CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP

CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP Submitted to the Annals of Statistics arxiv: arxiv:1706.07237 CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP By Johannes Tewes, Dimitris N. Politis and Daniel J. Nordman Ruhr-Universität

More information

Asymptotically Optimal Simulation Allocation under Dependent Sampling

Asymptotically Optimal Simulation Allocation under Dependent Sampling Asymtotically Otimal Simulation Allocation under Deendent Samling Xiaoing Xiong The Robert H. Smith School of Business, University of Maryland, College Park, MD 20742-1815, USA, xiaoingx@yahoo.com Sandee

More information

The non-stochastic multi-armed bandit problem

The non-stochastic multi-armed bandit problem Submitted for journal ublication. The non-stochastic multi-armed bandit roblem Peter Auer Institute for Theoretical Comuter Science Graz University of Technology A-8010 Graz (Austria) auer@igi.tu-graz.ac.at

More information

LORENZO BRANDOLESE AND MARIA E. SCHONBEK

LORENZO BRANDOLESE AND MARIA E. SCHONBEK LARGE TIME DECAY AND GROWTH FOR SOLUTIONS OF A VISCOUS BOUSSINESQ SYSTEM LORENZO BRANDOLESE AND MARIA E. SCHONBEK Abstract. In this aer we analyze the decay and the growth for large time of weak and strong

More information

Location of solutions for quasi-linear elliptic equations with general gradient dependence

Location of solutions for quasi-linear elliptic equations with general gradient dependence Electronic Journal of Qualitative Theory of Differential Equations 217, No. 87, 1 1; htts://doi.org/1.14232/ejqtde.217.1.87 www.math.u-szeged.hu/ejqtde/ Location of solutions for quasi-linear ellitic equations

More information

#A37 INTEGERS 15 (2015) NOTE ON A RESULT OF CHUNG ON WEIL TYPE SUMS

#A37 INTEGERS 15 (2015) NOTE ON A RESULT OF CHUNG ON WEIL TYPE SUMS #A37 INTEGERS 15 (2015) NOTE ON A RESULT OF CHUNG ON WEIL TYPE SUMS Norbert Hegyvári ELTE TTK, Eötvös University, Institute of Mathematics, Budaest, Hungary hegyvari@elte.hu François Hennecart Université

More information

CSE 599d - Quantum Computing When Quantum Computers Fall Apart

CSE 599d - Quantum Computing When Quantum Computers Fall Apart CSE 599d - Quantum Comuting When Quantum Comuters Fall Aart Dave Bacon Deartment of Comuter Science & Engineering, University of Washington In this lecture we are going to begin discussing what haens to

More information

THE SET CHROMATIC NUMBER OF RANDOM GRAPHS

THE SET CHROMATIC NUMBER OF RANDOM GRAPHS THE SET CHROMATIC NUMBER OF RANDOM GRAPHS ANDRZEJ DUDEK, DIETER MITSCHE, AND PAWE L PRA LAT Abstract. In this aer we study the set chromatic number of a random grah G(n, ) for a wide range of = (n). We

More information

Arithmetic and Metric Properties of p-adic Alternating Engel Series Expansions

Arithmetic and Metric Properties of p-adic Alternating Engel Series Expansions International Journal of Algebra, Vol 2, 2008, no 8, 383-393 Arithmetic and Metric Proerties of -Adic Alternating Engel Series Exansions Yue-Hua Liu and Lu-Ming Shen Science College of Hunan Agriculture

More information

Probability Estimates for Multi-class Classification by Pairwise Coupling

Probability Estimates for Multi-class Classification by Pairwise Coupling Probability Estimates for Multi-class Classification by Pairwise Couling Ting-Fan Wu Chih-Jen Lin Deartment of Comuter Science National Taiwan University Taiei 06, Taiwan Ruby C. Weng Deartment of Statistics

More information

CMSC 425: Lecture 4 Geometry and Geometric Programming

CMSC 425: Lecture 4 Geometry and Geometric Programming CMSC 425: Lecture 4 Geometry and Geometric Programming Geometry for Game Programming and Grahics: For the next few lectures, we will discuss some of the basic elements of geometry. There are many areas

More information

On the Interplay of Regularity and Decay in Case of Radial Functions I. Inhomogeneous spaces

On the Interplay of Regularity and Decay in Case of Radial Functions I. Inhomogeneous spaces On the Interlay of Regularity and Decay in Case of Radial Functions I. Inhomogeneous saces Winfried Sickel, Leszek Skrzyczak and Jan Vybiral July 29, 2010 Abstract We deal with decay and boundedness roerties

More information

Lecture 4: Law of Large Number and Central Limit Theorem

Lecture 4: Law of Large Number and Central Limit Theorem ECE 645: Estimation Theory Sring 2015 Instructor: Prof. Stanley H. Chan Lecture 4: Law of Large Number and Central Limit Theorem (LaTeX reared by Jing Li) March 31, 2015 This lecture note is based on ECE

More information

Intrinsic Approximation on Cantor-like Sets, a Problem of Mahler

Intrinsic Approximation on Cantor-like Sets, a Problem of Mahler Intrinsic Aroximation on Cantor-like Sets, a Problem of Mahler Ryan Broderick, Lior Fishman, Asaf Reich and Barak Weiss July 200 Abstract In 984, Kurt Mahler osed the following fundamental question: How

More information