Limitations of One-Hidden-Layer Perceptron Networks
|
|
- Dominic Adams
- 5 years ago
- Views:
Transcription
1 J. Yaghob (E.): ITAT 2015 pp Charles University in Prague, Prague, 2015 Limitations of One-Hien-Layer Perceptron Networks Věra Kůrková Institute of Computer Science, Czech Acaemy of Sciences, WWW home page: vera Abstract: Limitations of one-hien-layer perceptron networks to represent efficiently finite mappings is investigate. It is shown that almost any uniformly ranomly chosen mapping on a sufficiently large finite omain cannot be tractably represente by a one-hien-layer perceptron network. This existential probabilistic result is complemente by a concrete example of a class of functions constructe using quasi-ranom sequences. Analogies with central paraox of coing theory an no free lunch theorem are iscusse. 1 Introuction A wiely-use type of a neural-network architecture is a network with one-hien-layer of computational units (such as perceptrons, raial or kernel units) an one linear output unit. Recently, new hybri learning algorithms for feeforwar networks with two or more hien layers, calle eep networks [9, 3], were successfully applie to various pattern recognition tasks. Thus a theoretical analysis ientifying tasks for which shallow networks require consierably larger moel complexities than eep ones is neee. In [4, 5], Bengio et al. suggeste that a cause of large moel complexities of shallow networks with one hien layer might be in the amount of variations of functions to be compute an they illustrate their suggestion by an example of representation of -imensional parities by Gaussian SVM. In practical applications, feeforwar networks compute functions on finite omains in R representing, e.g., scattere empirical ata or pixels of images. It is wellknown that shallow networks with many types of computational units have the universal representation property, i.e., they can exactly represent any real-value function on a finite omain. This property hols, e.g., for networks with perceptrons with any sigmoial activation function [10] an for networks with Gaussian raial units [15]. However, proofs of universal representation capabilities assume that networks have numbers of hien units equal to sizes of omains of functions to be compute. For large omains, this can be a factor limiting practical implementations. Upper bouns on rates of approximation of multivariable functions by shallow networks with increasing numbers of units were stuie in terms of variational norms tailore to types of network units (see, e.g., [11] an references therein). In this paper, we employ these norms to erive lower bouns on moel complexities of shallow networks representing finite mappings. Using geometrical properties of high-imensional spaces we show that a representation of almost any uniformly ranomly chosen function on a large finite omain by a shallow perceptron networks requires large number of units or large sizes of output weights. We illustrate this existential probabilistic result by a concrete construction of a class of functions base on Haamar an quasi-noise matrices. We iscuss analogies with central paraox of coing theory an no free lunch theorem. The paper is organize as follows. Section 2 contains basic concepts an notations on shallow networks an ictionaries of computational units. Section 3 reviews variational norms as tools for investigation of network complexity. In Section 3, estimates of probabilistic istributions of sizes of variational norms are proven. In section 4, concrete examples of functions which cannot be tractably represente by perceptron networks are constructe using Haamar an pseuo-noise matrices. Section 5 is a brief iscussion. 2 Preliminaries One-hien-layer networks with single linear outputs (shallow networks) compute input-output functions from sets of the form span n G := { n w i g i w i R, g i G where G, calle a ictionary, is a set of functions computable by a given type of units, the coefficients w i are calle output weights, an n is the number of hien units. This number is sometimes use as a measure of moel complexity. In this paper, we focus on representations of functions on finite omains X R. We enote by F (X) := { f f : X R} the set of all real-value functions on X. On F (X) we have the Eucliean inner prouct efine as an the Eucliean norm f,g := f (u)g(u) u X f := f, f. },
2 168 V. Kůrková To istinguish the inner prouct.,. on F (X) from the inner prouct on X R, we enote it, i.e., for u,v X, u v := u i v i. We investigate networks with units from the ictionary of signum perceptrons P (X) := {sgn(v. + b) : X { 1,1} v R,b R} where sgn(t) := 1 for t < 0 an sign(t) := 1 for t 0. Note that from the point of view of moel complexity, there is only a minor ifference between networks with signum perceptrons an those with Heavisie perceptrons as sgn(t) = 2ϑ(t) 1 an ϑ(t) := sgn(t) + 1, 2 where ϑ(t) := 0 for t < 0 an ϑ(t) = 1 for t 0. 3 Moel Complexity an Variational Norms A useful tool for erivation of estimates of numbers of units an sizes of output weights in shallow networks is the concept of a variational norm tailore to network units introuce in [12] as an extension of a concept of variation with respect to half-spaces from [2]. For a subset G of a norme linear space (X,. X ), G-variation (variation with respect to the set G), enote by. G, is efine as f G := inf{c R + f /c cl X conv(g G)}, where cl X enotes the closure with respect to the norm X on X, G := { g g G}, an { } k k convg := a i g i a i [0,1], a i = 1,g i G,k N is the convex hull of G. The following straightforwar consequence of the efinition of G-variation shows that in all representations of a function with large G-variation by shallow networks with units from the ictionary G, the number of units must be large or absolute values of some output weights must be large. Proposition 1. Let G be a finite subset of a norme linear space (X,. X ), then for every f X, } k k f G = min{ w i f = w i g i, w i R, g i G. Note that classes of functions efine by constraints on their variational norms represent a similar type of a concept as classes of functions efine by constraints on both numbers of gates an sizes of output weights stuie in theory of circuit complexity [16]. To erive lower bouns on variational norms, we use the following theorem from [13] showing that functions which are not correlate to any element of the ictionary G have large variations. Theorem 2. Let (X,. X ) be a Hilbert space with inner prouct.,. X an G its boune subset. Then for every f X G, f G f 2 sup g G f,g X. The following theorem shows that when a ictionary G(X) is not too large, then for a large omain X, almost any ranomly chosen function has large G(X)-variation. We enote by S r (X) := { f F (X) f = r} the sphere of raius r in F (X) an for f F (X), f o := f f. The proof of the theorem is base on geometry of spheres in high-imensional Eucliean spaces. In large imensions, most of areas of spheres lie very close to their equators [1]. Theorem 3. Let be a positive integer, X R with carx = m, G(X) a subset of F (X) with car G(X) = n such that for all g G(X), f r, µ be a uniform probability measure on S r (X), an b > 0. Then µ({ f S r (X) f G(X) b}) 1 2ne m 2b 2. Proof. Denote for g S r (X) an ε (0,1), C(g,ε) := {h Sr m 1 h o,g o ε}. As C(g,ε) is equivalent to a polar cap in R carx, whose measure is exponentially ecreasing with the imension m, we have µ(c(g,ε)) e mε2 2 (see, e.g., [1]). By Theorem 2, { f S r (X) f G(X) b} = S r (X) Hence the statement follows. g G C(g,1/b). Theorem 3 can be applie to ictionaries G(X) on omains X R with carx = m, which are relatively small. In particular, ictionaries of signum an Heavisie perceptrons are relatively small. Estimates of their sizes can be obtaine from bouns on numbers of linearly separable ichotomies to which finite subsets of R can be partitione. Various estimates of numbers of ichotomies have been erive by several authors starting from results by Schläfli [17]. The next boun is obtaine by combining a theorem from [7, p.330] with an upper boun on partial sum of binomials.
3 Limitations of One-Hien-Layer Perceptron Networks 169 Theorem 4. For every an every X R such that carx = m, carp (X) 2 i=0 ( m 1 i ) 2 m!. Combining Theorems 3 an 4, we obtain a lower boun on measures of sets of functions having variations with respect to signum perceptrons boune from below by a given boun b. Corollary 1. Let be a positive integer, X R with carx = m, µ a uniform probability measure on S m(x), an b > 0. Then µ({ f S m(x) f P (X) b}) 1 4 m! e m 2b 2. For example, for the omain X = {0,1} an b = 2 4, we obtain from Corollary 1 a lower boun e (2 2 1)! on the probability that a function on {0,1} with the norm 2 /2 has variation with respect to signum perceptrons greater or equal to 2 4. Thus for large almost any uniformly ranomly chosen function on the -imensional Boolean cube {0,1} of the same norm 2 /2 as signum perceptrons, has variation with respect to signum perceptrons epening on exponentially. 4 Construction of Functions with Large Variations The results erive in the previous section are existential. In this section, we construct a class of functions, which cannot be represente by shallow perceptron networks of low moel complexities. We construct such functions using Haamar matrices. We show that the class of Haamar matrices contains circulant matrices with rows being segments of pseuo-noise sequences which mimic some properties of ranom sequences. Recall that a Haamar matrix of orer m is an m m square matrix M with entries in { 1,1} such that any two istinct rows (or equivalently columns) of M are orthogonal. Note that this property is invariant uner permutating rows or columns an uner sign flipping all entries in a column or a row. Two istinct rows of a Haamar matrix iffer in exactly m/2 positions. The next theorem gives a lower boun on variation with respect to signum perceptrons of a { 1, 1}-value function constructe using a Haamar matrix. Theorem 5. Let M be an m m Haamar matrix, {x i i = 1,...,m} R, {y j j = 1,...,m} R, X = {x i i = 1,...,m} {y j j = 1,...,m} R 2, an f M : X { 1,1} be efine as f M (x i,y j ) =: M. Then Proof. By Theorem 2, f M P (X) f M P (X) m log 2 m. f M 2 sup g P (X) f M,g = m 2 sup g P (X) f M,g. For each g P (X), let M(g) be an m m matrix efine as M(g) = g(x i,y j ). It is easy to see that f M,g = M M(g). Using suitable permutations, we reorer rows an columns of both matrices M(g) an M in such a way that each row an each column of the reorere matrix M(g) starts with a (possibly empty) initial segment of 1 s followe by a (possibly empty) segment of 1 s. Denoting M the reorere matrix M we have f M,g = M M(g) = M M(g). As the property of being a Haamar matrix is invariant uner permutations of rows an columns, we can apply Linsay lemma [8, p.88] to submatrices of the Haamar matrix M on which all entries of the matrix M(g) are either 1 or 1. Thus we obtain an upper boun m m on the ifferences of +1s an 1s in suitable submatrices of M. Iterating the proceure at most log 2 m-times, we obtain an upper boun m mlog 2 m on M M(g) = f M,g. Thus f M P (X) m 2 m mlog 2 m = m log 2 m. Theorem 5 shows that functions whose representations by shallow perceptron networks require numbers of units or sizes of output weights boune from below by m log 2 m can be constructe using Haamar matrices. In particular, when the omain is -imensional Boolean cube {0,1}, where is even, the lower boun is 2/4 /2. So the lower bouns grows with exponentially. Recall that if a Haamar matrix of orer m exists, then m = 1 or m = 2 or m is ivisible by 4 [14, p.44]. It is conjecture that there exists a Haamar matrix of every orer ivisible by 4. Listings of Haamar matrices of various orers can be foun at Neil Sloane s library of Haamar matrices. We show that suitable Haamar matrices can be obtaine from pseuo-noise sequences. An infinite sequence
4 170 V. Kůrková a 0,a 1,...,a i,... of elements of {0,1} is calle k-th orer linear recurring sequence if for some h 0,...,h k {0,1} a i = k j=1 a i j h k j mo 2 for all i k. It is calle k-th orer pseuo-noise (PN) sequence (or pseuo-ranom sequence) if it is k-th orer linear recurring sequence with minimal perio 2 k 1. A 2 k 2 k matrix L is calle pseuo-noise if for all i = 1,...,2 k, L 1,i = 0 an L i,1 = 0 an for all i = 2,...,2 k an j = 2,...,2 k L = L i 1, j 1 where the (2 k 1) (2 k 1) matrix L is a circulant matrix with rows forme by shifte segments of length 2 k 1 of a k-th orer pseuo-noise sequence. PN sequences have many useful applications because some of their properties mimic those of ranom sequences. A run is a string of consecutive 1 s or a string of consecutive 0 s. In any segment of length 2 k 1 of k-th orer PN-sequence, one-half of the runs have length 1, one quarter have length 2, one-eighth have length 3, an so on. In particular, there is one run of length k of 1 s, one run of length k 1 of 0 s. Thus every segment of length 2 k 1 contains 2 k/2 ones an 2 k/2 1 zeros [14, p.410]. Let τ : {0,1} { 1,1} be efine as τ(x) = 1 x (i.e., τ(0) = 1 an τ(1) = 1). The following theorem states that a matrix obtaine by applying τ to entries of a pseuonoise matrix is a Haamar matrix. Theorem 6. Let L be a 2 k 2 k pseuo-noise matrix an L τ be the 2 k 2 k matrix with entries in { 1,1} obtaine from L by applying τ to all its entries. Then L τ is a Haamar matrix. Proof. We show that inner prouct of any two rows of L τ is equal to zero. The autocorrelation of a sequence a 0,a 1,...,a i,... of elements of {0,1} with perio 2 k 1 is efine as ρ(t) = 1 2 k 1 2 k 1 j=0 For every pseuo-noise sequence, ρ(t) = 1 2 k 1 1 a j+a j+t. for every t = 1,...,2 k 2 [14, p. 411]. Thus the inner prouct of every two rows of the matrix L τ is equal to 1. As all elements of the first column of L τ are equal to 1, inner prouct of every pair of its rows is equal to zero. Theorem 5 implies that for every pseuo-noise matrix L of orer 2 k an X R such that carx = 2 k 2 k, there exists a function f Lτ : X { 1,1} inuce by the matrix L τ obtaine from L by replacing 0 s with 1 s an 1 s with 1 s such that f Lτ P (X) 2k/2 k. So the variation of f Lτ with respect to signum perceptrons epens on k exponentially. In particular, setting X = {0,1}, where = 2k is even, we obtain a function of variables with variation with respect to signum perceptrons growing with exponentially as f Lτ P (X) 2/4 /2. Representation of this function by a shallow perceptron network requires number of units or sizes of some output weights epening on exponentially. It is easy to show that for each even integer, the function inuce by Sylvester-Haamar matrix M u,v = 1 u v, where u,v {0,1} /2, can be represente by a twohien-layer network with /2 units in each hien layer. 5 Discussion We prove that almost any uniformly ranomly chosen function on a sufficiently large finite set in R has large variation with respect to signum perceptrons an thus it cannot be tractably represente by a shallow perceptron network. It seems to be a paraox that although representations of almost all functions by shallow perceptron networks are untractable, it is ifficult to construct such functions. The situation can be rephrase in analogy with the title of an article from coing theory Any coe of which we cannot think is goo [6] as representation of almost any function of which we cannot think by shallow perceptron networks is untractable. A central paraox of coing theory concerns the existence an construction of the best coes. Virtually every linear coe is goo (in the sense that it meets the Gilbert-Varshamov boun on istance versus reunancy), however espite the sophisticate constructions for coes erive over the years, no one has succeee in emonstrating a constructive proceure that yiels such goo coes. The only class of functions having large variations which we succeee to construct is the class escribe in section 4 base on Haamar matrices. Among these matrices belong quasi-noise (quasi-ranom) matrices with rows obtaine as shifts of segments of quasi-noise sequences. These sequences have been use in construction of coes, interplanetary satellite picture transmission, precision measurements, acoustics, raar camouflage, an light iffusers. Pseuo-noise sequences permit esign of surfaces that scatter incoming signals very broaly making reflecte energy invisible or inauible. It shoul be emphasize that similarly as no free lunch theorem [18], our results assume uniform istributions of functions to be represente. However, probability istributions of functions moeling some practical tasks of interest (such as colors of pixels in a photograph) might be highly non uniform.
5 Limitations of One-Hien-Layer Perceptron Networks 171 Acknowlegments. This work was partially supporte by the grant COST LD13002 of the Ministry of Eucation of the Czech Republic an institutional support of the Institute of Computer Science RVO References [1] Ball, K.: An elementary introuction to moern convex geometry. In: S. Levy, (e.), Falvors of Geometry, 1 58, Cambrige University Press, 1997 [2] Barron, A. R.: Neural net approximation. In: K. S. Narenra, (e.), Proc. 7th Yale Workshop on Aaptive an Learning Systems, 69 72, Yale University Press, 1992 [3] Bengio, Y.: Learning eep architectures for AI. Founations an Trens in Machine Learning 2 (2009), [4] Bengio, Y., Delalleau, O., Le Roux, N.: The curse of imensionality for local kernel machines. Technical Report 1258, Département Informatique et Recherche Opérationnelle, Université e Montréal, 2005 [5] Bengio, Y., Delalleau, O., Le Roux, N.: The curse of highly variable functions for local kernel machines. In: Avances in Neural Information Processing Systems 18, , MIT Press, 2006 [6] Coffey, J. T., Gooman, R. M.: Any coe of which we cannot think is goo. IEEE Transactions on Information Theory 36 (1990), [7] Cover, T.: Geometrical an statistical properties of systems of linear inequalities with applictions in pattern recognition. IEEE Trans. on Electronic Computers 14 (1965), [8] Erös, P., Spencer, J. H.: Probabilistic Methos in Combinatorics. Acaemic Press, 1974 [9] Hinton, G. E., Osinero, S., Teh, Y. W.: A fast learning algorithm for eep belief nets. Neural Computation 18 (2006), [10] Ito, Y.: Finite mapping by neural networks an truth functions. Mathematical Scientist 17 (1992), [11] Kainen, P. C., Kůrková, V., Sanguineti, M.: Depenence of computational moels on input imension: Tractability of approximation an optimization tasks. IEEE Trans. on Information Theory 58 (2012), [12] Kůrková, V.: Dimension-inepenent rates of approximation by neural networks. In: K. Warwick an M. Kárný, (es.), Computer-Intensive Methos in Control an Signal Processing. The Curse of Dimensionality, , Birkhäuser, Boston, MA, 1997 [13] Kůrková, V., Savický, P., Hlaváčková, K.: Representations an rates of approximation of real-value Boolean functions by neural networks. Neural Networks 11 (1998), [14] MacWilliams, F. J., Sloane, N. J. A.: The theory of errorcorrecting coes. North Hollan, New York, 1977 [15] Micchelli, C. A.: Interpolation of scattere ata: Distance matrices an conitionally positive efinite functions. Constructive Approximation 2 (1986), [16] Roychowhury, V., Siu, K. -Y., Orlitsky, A.: Neural moels an spectral methos. In: V. Roychowhury, K. Siu, an A. Orlitsky, (es.), Theoretical Avances in Neural Computation an Learning, 3 36, Springer, New York, 1994 [17] Schläfli, L.: Theorie er vielfachen Kontinuität. Zürcher & Furrer, Zürich, 1901 [18] Wolpert, D. H., Macreay, W. G.: No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation 1(1) (1997), 67 82
Bounds on Sparsity of One-Hidden-Layer Perceptron Networks
J. Hlaváčová (Ed.): ITAT 2017 Proceedings, pp. 100 105 CEUR Workshop Proceedings Vol. 1885, ISSN 1613-0073, c 2017 V. Kůrková Bounds on Sparsity of One-Hidden-Layer Perceptron Networks Věra Kůrková Institute
More informationLecture Introduction. 2 Examples of Measure Concentration. 3 The Johnson-Lindenstrauss Lemma. CS-621 Theory Gems November 28, 2012
CS-6 Theory Gems November 8, 0 Lecture Lecturer: Alesaner Mąry Scribes: Alhussein Fawzi, Dorina Thanou Introuction Toay, we will briefly iscuss an important technique in probability theory measure concentration
More informationInfluence of weight initialization on multilayer perceptron performance
Influence of weight initialization on multilayer perceptron performance M. Karouia (1,2) T. Denœux (1) R. Lengellé (1) (1) Université e Compiègne U.R.A. CNRS 817 Heuiasyc BP 649 - F-66 Compiègne ceex -
More informationLeast-Squares Regression on Sparse Spaces
Least-Squares Regression on Sparse Spaces Yuri Grinberg, Mahi Milani Far, Joelle Pineau School of Computer Science McGill University Montreal, Canaa {ygrinb,mmilan1,jpineau}@cs.mcgill.ca 1 Introuction
More informationConstructive lower bounds on model complexity of shallow perceptron networks
Neural Comput & Applic (2018) 29:305 315 https://doi.org/10.1007/s00521-017-2965-0 EANN 2016 Constructive lower bounds on model complexity of shallow perceptron networks Věra Kůrková 1 Received: 10 November
More informationFunction Spaces. 1 Hilbert Spaces
Function Spaces A function space is a set of functions F that has some structure. Often a nonparametric regression function or classifier is chosen to lie in some function space, where the assume structure
More informationChromatic number for a generalization of Cartesian product graphs
Chromatic number for a generalization of Cartesian prouct graphs Daniel Král Douglas B. West Abstract Let G be a class of graphs. The -fol gri over G, enote G, is the family of graphs obtaine from -imensional
More informationLower bounds on Locality Sensitive Hashing
Lower bouns on Locality Sensitive Hashing Rajeev Motwani Assaf Naor Rina Panigrahy Abstract Given a metric space (X, X ), c 1, r > 0, an p, q [0, 1], a istribution over mappings H : X N is calle a (r,
More informationCapacity Analysis of MIMO Systems with Unknown Channel State Information
Capacity Analysis of MIMO Systems with Unknown Channel State Information Jun Zheng an Bhaskar D. Rao Dept. of Electrical an Computer Engineering University of California at San Diego e-mail: juzheng@ucs.eu,
More informationTractability results for weighted Banach spaces of smooth functions
Tractability results for weighte Banach spaces of smooth functions Markus Weimar Mathematisches Institut, Universität Jena Ernst-Abbe-Platz 2, 07740 Jena, Germany email: markus.weimar@uni-jena.e March
More informationAcute sets in Euclidean spaces
Acute sets in Eucliean spaces Viktor Harangi April, 011 Abstract A finite set H in R is calle an acute set if any angle etermine by three points of H is acute. We examine the maximal carinality α() of
More informationA PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks
A PAC-Bayesian Approach to Spectrally-Normalize Margin Bouns for Neural Networks Behnam Neyshabur, Srinah Bhojanapalli, Davi McAllester, Nathan Srebro Toyota Technological Institute at Chicago {bneyshabur,
More informationThe chromatic number of graph powers
Combinatorics, Probability an Computing (19XX) 00, 000 000. c 19XX Cambrige University Press Printe in the Unite Kingom The chromatic number of graph powers N O G A A L O N 1 an B O J A N M O H A R 1 Department
More informationLeaving Randomness to Nature: d-dimensional Product Codes through the lens of Generalized-LDPC codes
Leaving Ranomness to Nature: -Dimensional Prouct Coes through the lens of Generalize-LDPC coes Tavor Baharav, Kannan Ramchanran Dept. of Electrical Engineering an Computer Sciences, U.C. Berkeley {tavorb,
More informationLECTURE NOTES ON DVORETZKY S THEOREM
LECTURE NOTES ON DVORETZKY S THEOREM STEVEN HEILMAN Abstract. We present the first half of the paper [S]. In particular, the results below, unless otherwise state, shoul be attribute to G. Schechtman.
More informationWUCHEN LI AND STANLEY OSHER
CONSTRAINED DYNAMICAL OPTIMAL TRANSPORT AND ITS LAGRANGIAN FORMULATION WUCHEN LI AND STANLEY OSHER Abstract. We propose ynamical optimal transport (OT) problems constraine in a parameterize probability
More informationMonte Carlo Methods with Reduced Error
Monte Carlo Methos with Reuce Error As has been shown, the probable error in Monte Carlo algorithms when no information about the smoothness of the function is use is Dξ r N = c N. It is important for
More informationAgmon Kolmogorov Inequalities on l 2 (Z d )
Journal of Mathematics Research; Vol. 6, No. ; 04 ISSN 96-9795 E-ISSN 96-9809 Publishe by Canaian Center of Science an Eucation Agmon Kolmogorov Inequalities on l (Z ) Arman Sahovic Mathematics Department,
More informationu!i = a T u = 0. Then S satisfies
Deterministic Conitions for Subspace Ientifiability from Incomplete Sampling Daniel L Pimentel-Alarcón, Nigel Boston, Robert D Nowak University of Wisconsin-Maison Abstract Consier an r-imensional subspace
More informationarxiv: v4 [cs.ds] 7 Mar 2014
Analysis of Agglomerative Clustering Marcel R. Ackermann Johannes Blömer Daniel Kuntze Christian Sohler arxiv:101.697v [cs.ds] 7 Mar 01 Abstract The iameter k-clustering problem is the problem of partitioning
More informationSurvey Sampling. 1 Design-based Inference. Kosuke Imai Department of Politics, Princeton University. February 19, 2013
Survey Sampling Kosuke Imai Department of Politics, Princeton University February 19, 2013 Survey sampling is one of the most commonly use ata collection methos for social scientists. We begin by escribing
More informationSYNCHRONOUS SEQUENTIAL CIRCUITS
CHAPTER SYNCHRONOUS SEUENTIAL CIRCUITS Registers an counters, two very common synchronous sequential circuits, are introuce in this chapter. Register is a igital circuit for storing information. Contents
More informationStable and compact finite difference schemes
Center for Turbulence Research Annual Research Briefs 2006 2 Stable an compact finite ifference schemes By K. Mattsson, M. Svär AND M. Shoeybi. Motivation an objectives Compact secon erivatives have long
More informationRamsey numbers of some bipartite graphs versus complete graphs
Ramsey numbers of some bipartite graphs versus complete graphs Tao Jiang, Michael Salerno Miami University, Oxfor, OH 45056, USA Abstract. The Ramsey number r(h, K n ) is the smallest positive integer
More informationA simple tranformation of copulas
A simple tranformation of copulas V. Durrleman, A. Nikeghbali & T. Roncalli Groupe e Recherche Opérationnelle Créit Lyonnais France July 31, 2000 Abstract We stuy how copulas properties are moifie after
More informationarxiv: v1 [math.mg] 10 Apr 2018
ON THE VOLUME BOUND IN THE DVORETZKY ROGERS LEMMA FERENC FODOR, MÁRTON NASZÓDI, AND TAMÁS ZARNÓCZ arxiv:1804.03444v1 [math.mg] 10 Apr 2018 Abstract. The classical Dvoretzky Rogers lemma provies a eterministic
More informationRobust Forward Algorithms via PAC-Bayes and Laplace Distributions. ω Q. Pr (y(ω x) < 0) = Pr A k
A Proof of Lemma 2 B Proof of Lemma 3 Proof: Since the support of LL istributions is R, two such istributions are equivalent absolutely continuous with respect to each other an the ivergence is well-efine
More informationAn Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback
Journal of Machine Learning Research 8 07) - Submitte /6; Publishe 5/7 An Optimal Algorithm for Banit an Zero-Orer Convex Optimization with wo-point Feeback Oha Shamir Department of Computer Science an
More informationTime-of-Arrival Estimation in Non-Line-Of-Sight Environments
2 Conference on Information Sciences an Systems, The Johns Hopkins University, March 2, 2 Time-of-Arrival Estimation in Non-Line-Of-Sight Environments Sinan Gezici, Hisashi Kobayashi an H. Vincent Poor
More informationLATTICE-BASED D-OPTIMUM DESIGN FOR FOURIER REGRESSION
The Annals of Statistics 1997, Vol. 25, No. 6, 2313 2327 LATTICE-BASED D-OPTIMUM DESIGN FOR FOURIER REGRESSION By Eva Riccomagno, 1 Rainer Schwabe 2 an Henry P. Wynn 1 University of Warwick, Technische
More information7.1 Support Vector Machine
67577 Intro. to Machine Learning Fall semester, 006/7 Lecture 7: Support Vector Machines an Kernel Functions II Lecturer: Amnon Shashua Scribe: Amnon Shashua 7. Support Vector Machine We return now to
More informationDiscrete Operators in Canonical Domains
Discrete Operators in Canonical Domains VLADIMIR VASILYEV Belgoro National Research University Chair of Differential Equations Stuencheskaya 14/1, 308007 Belgoro RUSSIA vlaimir.b.vasilyev@gmail.com Abstract:
More informationAnalyzing Tensor Power Method Dynamics in Overcomplete Regime
Journal of Machine Learning Research 18 (2017) 1-40 Submitte 9/15; Revise 11/16; Publishe 4/17 Analyzing Tensor Power Metho Dynamics in Overcomplete Regime Animashree Ananumar Department of Electrical
More informationA note on asymptotic formulae for one-dimensional network flow problems Carlos F. Daganzo and Karen R. Smilowitz
A note on asymptotic formulae for one-imensional network flow problems Carlos F. Daganzo an Karen R. Smilowitz (to appear in Annals of Operations Research) Abstract This note evelops asymptotic formulae
More informationConvergence of Random Walks
Chapter 16 Convergence of Ranom Walks This lecture examines the convergence of ranom walks to the Wiener process. This is very important both physically an statistically, an illustrates the utility of
More informationRobust Low Rank Kernel Embeddings of Multivariate Distributions
Robust Low Rank Kernel Embeings of Multivariate Distributions Le Song, Bo Dai College of Computing, Georgia Institute of Technology lsong@cc.gatech.eu, boai@gatech.eu Abstract Kernel embeing of istributions
More informationSINGULAR PERTURBATION AND STATIONARY SOLUTIONS OF PARABOLIC EQUATIONS IN GAUSS-SOBOLEV SPACES
Communications on Stochastic Analysis Vol. 2, No. 2 (28) 289-36 Serials Publications www.serialspublications.com SINGULAR PERTURBATION AND STATIONARY SOLUTIONS OF PARABOLIC EQUATIONS IN GAUSS-SOBOLEV SPACES
More informationarxiv: v4 [math.pr] 27 Jul 2016
The Asymptotic Distribution of the Determinant of a Ranom Correlation Matrix arxiv:309768v4 mathpr] 7 Jul 06 AM Hanea a, & GF Nane b a Centre of xcellence for Biosecurity Risk Analysis, University of Melbourne,
More informationOn the Surprising Behavior of Distance Metrics in High Dimensional Space
On the Surprising Behavior of Distance Metrics in High Dimensional Space Charu C. Aggarwal, Alexaner Hinneburg 2, an Daniel A. Keim 2 IBM T. J. Watson Research Center Yortown Heights, NY 0598, USA. charu@watson.ibm.com
More informationOn colour-blind distinguishing colour pallets in regular graphs
J Comb Optim (2014 28:348 357 DOI 10.1007/s10878-012-9556-x On colour-blin istinguishing colour pallets in regular graphs Jakub Przybyło Publishe online: 25 October 2012 The Author(s 2012. This article
More informationINDEPENDENT COMPONENT ANALYSIS VIA
INDEPENDENT COMPONENT ANALYSIS VIA NONPARAMETRIC MAXIMUM LIKELIHOOD ESTIMATION Truth Rotate S 2 2 1 0 1 2 3 4 X 2 2 1 0 1 2 3 4 4 2 0 2 4 6 4 2 0 2 4 6 S 1 X 1 Reconstructe S^ 2 2 1 0 1 2 3 4 Marginal
More information19 Eigenvalues, Eigenvectors, Ordinary Differential Equations, and Control
19 Eigenvalues, Eigenvectors, Orinary Differential Equations, an Control This section introuces eigenvalues an eigenvectors of a matrix, an iscusses the role of the eigenvalues in etermining the behavior
More informationImproving Estimation Accuracy in Nonrandomized Response Questioning Methods by Multiple Answers
International Journal of Statistics an Probability; Vol 6, No 5; September 207 ISSN 927-7032 E-ISSN 927-7040 Publishe by Canaian Center of Science an Eucation Improving Estimation Accuracy in Nonranomize
More informationOptimized Schwarz Methods with the Yin-Yang Grid for Shallow Water Equations
Optimize Schwarz Methos with the Yin-Yang Gri for Shallow Water Equations Abessama Qaouri Recherche en prévision numérique, Atmospheric Science an Technology Directorate, Environment Canaa, Dorval, Québec,
More informationEstimation of the Maximum Domination Value in Multi-Dimensional Data Sets
Proceeings of the 4th East-European Conference on Avances in Databases an Information Systems ADBIS) 200 Estimation of the Maximum Domination Value in Multi-Dimensional Data Sets Eleftherios Tiakas, Apostolos.
More informationDense Error-Correcting Codes in the Lee Metric
Dense Error-Correcting Coes in the Lee Metric Tuvi Etzion Department of Computer Science Technion-Israel Institute of Technology Haifa 3000, Israel Email: etzion@cstechnionacil Alexaner Vary Department
More informationComputing Exact Confidence Coefficients of Simultaneous Confidence Intervals for Multinomial Proportions and their Functions
Working Paper 2013:5 Department of Statistics Computing Exact Confience Coefficients of Simultaneous Confience Intervals for Multinomial Proportions an their Functions Shaobo Jin Working Paper 2013:5
More informationFLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS. 1. Introduction
FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS ALINA BUCUR, CHANTAL DAVID, BROOKE FEIGON, MATILDE LALÍN 1 Introuction In this note, we stuy the fluctuations in the number
More informationDiscrete Mathematics
Discrete Mathematics 309 (009) 86 869 Contents lists available at ScienceDirect Discrete Mathematics journal homepage: wwwelseviercom/locate/isc Profile vectors in the lattice of subspaces Dániel Gerbner
More information15.1 Upper bound via Sudakov minorization
ECE598: Information-theoretic methos in high-imensional statistics Spring 206 Lecture 5: Suakov, Maurey, an uality of metric entropy Lecturer: Yihong Wu Scribe: Aolin Xu, Mar 7, 206 [E. Mar 24] In this
More informationA Sketch of Menshikov s Theorem
A Sketch of Menshikov s Theorem Thomas Bao March 14, 2010 Abstract Let Λ be an infinite, locally finite oriente multi-graph with C Λ finite an strongly connecte, an let p
More informationThe derivative of a function f(x) is another function, defined in terms of a limiting expression: f(x + δx) f(x)
Y. D. Chong (2016) MH2801: Complex Methos for the Sciences 1. Derivatives The erivative of a function f(x) is another function, efine in terms of a limiting expression: f (x) f (x) lim x δx 0 f(x + δx)
More informationA. Exclusive KL View of the MLE
A. Exclusive KL View of the MLE Lets assume a change-of-variable moel p Z z on the ranom variable Z R m, such as the one use in Dinh et al. 2017: z 0 p 0 z 0 an z = ψz 0, where ψ is an invertible function
More informationNecessary and Sufficient Conditions for Sketched Subspace Clustering
Necessary an Sufficient Conitions for Sketche Subspace Clustering Daniel Pimentel-Alarcón, Laura Balzano 2, Robert Nowak University of Wisconsin-Maison, 2 University of Michigan-Ann Arbor Abstract This
More informationOn Decentralized Optimal Control and Information Structures
2008 American Control Conference Westin Seattle Hotel, Seattle, Washington, USA June 11-13, 2008 FrC053 On Decentralize Optimal Control an Information Structures Naer Motee 1, Ali Jababaie 1 an Bassam
More informationMulti-View Clustering via Canonical Correlation Analysis
Technical Report TTI-TR-2008-5 Multi-View Clustering via Canonical Correlation Analysis Kamalika Chauhuri UC San Diego Sham M. Kakae Toyota Technological Institute at Chicago ABSTRACT Clustering ata in
More informationMcMaster University. Advanced Optimization Laboratory. Title: The Central Path Visits all the Vertices of the Klee-Minty Cube.
McMaster University Avance Optimization Laboratory Title: The Central Path Visits all the Vertices of the Klee-Minty Cube Authors: Antoine Deza, Eissa Nematollahi, Reza Peyghami an Tamás Terlaky AvOl-Report
More informationA global Implicit Function Theorem without initial point and its applications to control of non-affine systems of high dimensions
J. Math. Anal. Appl. 313 (2006) 251 261 www.elsevier.com/locate/jmaa A global Implicit Function Theorem without initial point an its applications to control of non-affine systems of high imensions Weinian
More informationThe group of isometries of the French rail ways metric
Stu. Univ. Babeş-Bolyai Math. 58(2013), No. 4, 445 450 The group of isometries of the French rail ways metric Vasile Bulgărean To the memory of Professor Mircea-Eugen Craioveanu (1942-2012) Abstract. In
More informationLarge Triangles in the d-dimensional Unit-Cube (Extended Abstract)
Large Triangles in the -Dimensional Unit-Cube Extene Abstract) Hanno Lefmann Fakultät für Informatik, TU Chemnitz, D-0907 Chemnitz, Germany lefmann@informatik.tu-chemnitz.e Abstract. We consier a variant
More informationarxiv: v2 [cond-mat.stat-mech] 11 Nov 2016
Noname manuscript No. (will be inserte by the eitor) Scaling properties of the number of ranom sequential asorption iterations neee to generate saturate ranom packing arxiv:607.06668v2 [con-mat.stat-mech]
More informationOptimization of Geometries by Energy Minimization
Optimization of Geometries by Energy Minimization by Tracy P. Hamilton Department of Chemistry University of Alabama at Birmingham Birmingham, AL 3594-140 hamilton@uab.eu Copyright Tracy P. Hamilton, 1997.
More informationLectures - Week 10 Introduction to Ordinary Differential Equations (ODES) First Order Linear ODEs
Lectures - Week 10 Introuction to Orinary Differential Equations (ODES) First Orer Linear ODEs When stuying ODEs we are consiering functions of one inepenent variable, e.g., f(x), where x is the inepenent
More informationA LIMIT THEOREM FOR RANDOM FIELDS WITH A SINGULARITY IN THE SPECTRUM
Teor Imov r. ta Matem. Statist. Theor. Probability an Math. Statist. Vip. 81, 1 No. 81, 1, Pages 147 158 S 94-911)816- Article electronically publishe on January, 11 UDC 519.1 A LIMIT THEOREM FOR RANDOM
More informationNotes on Lie Groups, Lie algebras, and the Exponentiation Map Mitchell Faulk
Notes on Lie Groups, Lie algebras, an the Exponentiation Map Mitchell Faulk 1. Preliminaries. In these notes, we concern ourselves with special objects calle matrix Lie groups an their corresponing Lie
More informationMath Notes on differentials, the Chain Rule, gradients, directional derivative, and normal vectors
Math 18.02 Notes on ifferentials, the Chain Rule, graients, irectional erivative, an normal vectors Tangent plane an linear approximation We efine the partial erivatives of f( xy, ) as follows: f f( x+
More informationSpeaker Adaptation Based on Sparse and Low-rank Eigenphone Matrix Estimation
INTERSPEECH 2014 Speaker Aaptation Base on Sparse an Low-rank Eigenphone Matrix Estimation Wen-Lin Zhang 1, Dan Qu 1, Wei-Qiang Zhang 2, Bi-Cheng Li 1 1 Zhengzhou Information Science an Technology Institute,
More informationOn the Connectivity Analysis over Large-Scale Hybrid Wireless Networks
This full text paper was peer reviewe at the irection of IEEE Communications Society subject matter experts for publication in the IEEE INFOCOM proceeings This paper was presente as part of the main Technical
More informationModelling and simulation of dependence structures in nonlife insurance with Bernstein copulas
Moelling an simulation of epenence structures in nonlife insurance with Bernstein copulas Prof. Dr. Dietmar Pfeifer Dept. of Mathematics, University of Olenburg an AON Benfiel, Hamburg Dr. Doreen Straßburger
More informationQubit channels that achieve capacity with two states
Qubit channels that achieve capacity with two states Dominic W. Berry Department of Physics, The University of Queenslan, Brisbane, Queenslan 4072, Australia Receive 22 December 2004; publishe 22 March
More informationarxiv: v2 [cs.ds] 11 May 2016
Optimizing Star-Convex Functions Jasper C.H. Lee Paul Valiant arxiv:5.04466v2 [cs.ds] May 206 Department of Computer Science Brown University {jasperchlee,paul_valiant}@brown.eu May 3, 206 Abstract We
More informationensembles When working with density operators, we can use this connection to define a generalized Bloch vector: v x Tr x, v y Tr y
Ph195a lecture notes, 1/3/01 Density operators for spin- 1 ensembles So far in our iscussion of spin- 1 systems, we have restricte our attention to the case of pure states an Hamiltonian evolution. Toay
More informationUC Berkeley Department of Electrical Engineering and Computer Science Department of Statistics
UC Berkeley Department of Electrical Engineering an Computer Science Department of Statistics EECS 8B / STAT 4B Avance Topics in Statistical Learning Theory Solutions 3 Spring 9 Solution 3. For parti,
More informationOn the Value of Partial Information for Learning from Examples
JOURNAL OF COMPLEXITY 13, 509 544 (1998) ARTICLE NO. CM970459 On the Value of Partial Information for Learning from Examples Joel Ratsaby* Department of Electrical Engineering, Technion, Haifa, 32000 Israel
More informationMultidimensional Fast Gauss Transforms by Chebyshev Expansions
Multiimensional Fast Gauss Transforms by Chebyshev Expansions Johannes Tausch an Alexaner Weckiewicz May 9, 009 Abstract A new version of the fast Gauss transform FGT) is introuce which is base on a truncate
More informationA new proof of the sharpness of the phase transition for Bernoulli percolation on Z d
A new proof of the sharpness of the phase transition for Bernoulli percolation on Z Hugo Duminil-Copin an Vincent Tassion October 8, 205 Abstract We provie a new proof of the sharpness of the phase transition
More informationUpper and Lower Bounds on ε-approximate Degree of AND n and OR n Using Chebyshev Polynomials
Upper an Lower Bouns on ε-approximate Degree of AND n an OR n Using Chebyshev Polynomials Mrinalkanti Ghosh, Rachit Nimavat December 11, 016 1 Introuction The notion of approximate egree was first introuce
More informationSome Examples. Uniform motion. Poisson processes on the real line
Some Examples Our immeiate goal is to see some examples of Lévy processes, an/or infinitely-ivisible laws on. Uniform motion Choose an fix a nonranom an efine X := for all (1) Then, {X } is a [nonranom]
More informationALGEBRAIC AND ANALYTIC PROPERTIES OF ARITHMETIC FUNCTIONS
ALGEBRAIC AND ANALYTIC PROPERTIES OF ARITHMETIC FUNCTIONS MARK SCHACHNER Abstract. When consiere as an algebraic space, the set of arithmetic functions equippe with the operations of pointwise aition an
More informationLower Bounds for Local Monotonicity Reconstruction from Transitive-Closure Spanners
Lower Bouns for Local Monotonicity Reconstruction from Transitive-Closure Spanners Arnab Bhattacharyya Elena Grigorescu Mahav Jha Kyomin Jung Sofya Raskhonikova Davi P. Wooruff Abstract Given a irecte
More informationLeast Distortion of Fixed-Rate Vector Quantizers. High-Resolution Analysis of. Best Inertial Profile. Zador's Formula Z-1 Z-2
High-Resolution Analysis of Least Distortion of Fixe-Rate Vector Quantizers Begin with Bennett's Integral D 1 M 2/k Fin best inertial profile Zaor's Formula m(x) λ 2/k (x) f X(x) x Fin best point ensity
More informationMARKO NEDELJKOV, DANIJELA RAJTER-ĆIRIĆ
GENERALIZED UNIFORMLY CONTINUOUS SEMIGROUPS AND SEMILINEAR HYPERBOLIC SYSTEMS WITH REGULARIZED DERIVATIVES MARKO NEDELJKOV, DANIJELA RAJTER-ĆIRIĆ Abstract. We aopt the theory of uniformly continuous operator
More informationAsymptotic determination of edge-bandwidth of multidimensional grids and Hamming graphs
Asymptotic etermination of ege-banwith of multiimensional gris an Hamming graphs Reza Akhtar Tao Jiang Zevi Miller. Revise on May 7, 007 Abstract The ege-banwith B (G) of a graph G is the banwith of the
More informationMinimum-time constrained velocity planning
7th Meiterranean Conference on Control & Automation Makeonia Palace, Thessaloniki, Greece June 4-6, 9 Minimum-time constraine velocity planning Gabriele Lini, Luca Consolini, Aurelio Piazzi Università
More informationProblems Governed by PDE. Shlomo Ta'asan. Carnegie Mellon University. and. Abstract
Pseuo-Time Methos for Constraine Optimization Problems Governe by PDE Shlomo Ta'asan Carnegie Mellon University an Institute for Computer Applications in Science an Engineering Abstract In this paper we
More informationResistant Polynomials and Stronger Lower Bounds for Depth-Three Arithmetical Formulas
Resistant Polynomials an Stronger Lower Bouns for Depth-Three Arithmetical Formulas Maurice J. Jansen an Kenneth W.Regan University at Buffalo (SUNY) Abstract. We erive quaratic lower bouns on the -complexity
More informationSTATISTICAL LIKELIHOOD REPRESENTATIONS OF PRIOR KNOWLEDGE IN MACHINE LEARNING
STATISTICAL LIKELIHOOD REPRESENTATIONS OF PRIOR KNOWLEDGE IN MACHINE LEARNING Mark A. Kon Department of Mathematics an Statistics Boston University Boston, MA 02215 email: mkon@bu.eu Anrzej Przybyszewski
More informationOn the number of isolated eigenvalues of a pair of particles in a quantum wire
On the number of isolate eigenvalues of a pair of particles in a quantum wire arxiv:1812.11804v1 [math-ph] 31 Dec 2018 Joachim Kerner 1 Department of Mathematics an Computer Science FernUniversität in
More informationEfficiently Decodable Non-Adaptive Threshold Group Testing
Efficiently Decoable Non-Aaptive Threshol Group Testing arxiv:72759v3 [csit] 3 Jan 28 Thach V Bui, Minoru Kuribayashi, Mahi Cheraghchi, an Isao Echizen SOKENDAI The Grauate University for Avance Stuies),
More informationSituation awareness of power system based on static voltage security region
The 6th International Conference on Renewable Power Generation (RPG) 19 20 October 2017 Situation awareness of power system base on static voltage security region Fei Xiao, Zi-Qing Jiang, Qian Ai, Ran
More informationON THE OPTIMAL CONVERGENCE RATE OF UNIVERSAL AND NON-UNIVERSAL ALGORITHMS FOR MULTIVARIATE INTEGRATION AND APPROXIMATION
ON THE OPTIMAL CONVERGENCE RATE OF UNIVERSAL AN NON-UNIVERSAL ALGORITHMS FOR MULTIVARIATE INTEGRATION AN APPROXIMATION MICHAEL GRIEBEL AN HENRYK WOŹNIAKOWSKI Abstract. We stuy the optimal rate of convergence
More informationThermal conductivity of graded composites: Numerical simulations and an effective medium approximation
JOURNAL OF MATERIALS SCIENCE 34 (999)5497 5503 Thermal conuctivity of grae composites: Numerical simulations an an effective meium approximation P. M. HUI Department of Physics, The Chinese University
More informationarxiv: v1 [cs.lg] 22 Mar 2014
CUR lgorithm with Incomplete Matrix Observation Rong Jin an Shenghuo Zhu Dept. of Computer Science an Engineering, Michigan State University, rongjin@msu.eu NEC Laboratories merica, Inc., zsh@nec-labs.com
More informationCOUNTING VALUE SETS: ALGORITHM AND COMPLEXITY
COUNTING VALUE SETS: ALGORITHM AND COMPLEXITY QI CHENG, JOSHUA E. HILL, AND DAQING WAN Abstract. Let p be a prime. Given a polynomial in F p m[x] of egree over the finite fiel F p m, one can view it as
More informationTopic 7: Convergence of Random Variables
Topic 7: Convergence of Ranom Variables Course 003, 2016 Page 0 The Inference Problem So far, our starting point has been a given probability space (S, F, P). We now look at how to generate information
More informationarxiv: v1 [math.co] 15 Sep 2015
Circular coloring of signe graphs Yingli Kang, Eckhar Steffen arxiv:1509.04488v1 [math.co] 15 Sep 015 Abstract Let k, ( k) be two positive integers. We generalize the well stuie notions of (k, )-colorings
More information6 General properties of an autonomous system of two first order ODE
6 General properties of an autonomous system of two first orer ODE Here we embark on stuying the autonomous system of two first orer ifferential equations of the form ẋ 1 = f 1 (, x 2 ), ẋ 2 = f 2 (, x
More informationLecture 2: Correlated Topic Model
Probabilistic Moels for Unsupervise Learning Spring 203 Lecture 2: Correlate Topic Moel Inference for Correlate Topic Moel Yuan Yuan First of all, let us make some claims about the parameters an variables
More informationCharacterizing Real-Valued Multivariate Complex Polynomials and Their Symmetric Tensor Representations
Characterizing Real-Value Multivariate Complex Polynomials an Their Symmetric Tensor Representations Bo JIANG Zhening LI Shuzhong ZHANG December 31, 2014 Abstract In this paper we stuy multivariate polynomial
More informationThe Subtree Size Profile of Plane-oriented Recursive Trees
The Subtree Size Profile of Plane-oriente Recursive Trees Michael FUCHS Department of Applie Mathematics National Chiao Tung University Hsinchu, 3, Taiwan Email: mfuchs@math.nctu.eu.tw Abstract In this
More information