FEATURE extraction based on deep convolutional neural. Energy Propagation in Deep Convolutional Neural Networks

Size: px
Start display at page:

Download "FEATURE extraction based on deep convolutional neural. Energy Propagation in Deep Convolutional Neural Networks"

Transcription

1 Energy Propagation in Deep Convolutional Neural Networks Thomas Wiatowski, Philipp Grohs, an Helmut Bölcskei, Fellow, IEEE Abstract Many practical machine learning tasks employ very eep convolutional neural networks. Such large epths pose formiable computational challenges in training an operating the network. It is therefore important to unerstan how fast the energy containe in the propagate signals a.k.a. feature maps) ecays across layers. In aition, it is esirable that the feature extractor generate by the network be informative in the sense of the only signal mapping to the all-zeros feature vector being the zero input signal. This trivial null-set property can be accomplishe by asking for energy conservation in the sense of the energy in the feature vector being proportional to that of the corresponing input signal. This paper establishes conitions for energy conservation an thus for a trivial null-set) for a wie class of eep convolutional neural network-base feature extractors an characterizes corresponing feature map energy ecay rates. Specifically, we consier general scattering networks employing the moulus non-linearity an we fin that uner mil analyticity an high-pass conitions on the filters which encompass, inter alia, various constructions of Weyl-Heisenberg filters, wavelets, rigelets, α)-curvelets, an shearlets) the feature map energy ecays at least polynomially fast. For broa families of wavelets an Weyl-Heisenberg filters, the guarantee ecay rate is shown to be exponential. Moreover, we provie hany estimates of the number of layers neee to have at least ε) 00)% of the input signal energy be containe in the feature vector. Inex Terms Machine learning, eep convolutional neural networks, scattering networks, energy ecay an conservation, frame theory. I. INTODUCTION FEATUE extraction base on eep convolutional neural networks DCNNs) has been applie with significant success in a wie range of practical machine learning tasks [] [6]. Many of these applications, such as, e.g., the classification of images in the ImageNet ata set, employ very eep networks with potentially hunres of layers [7]. Such network epths entail formiable computational challenges in the training phase ue to the large number of parameters to be learne, an in operating the network ue to the large number of convolutions that nee to be carrie out. It is therefore paramount to unerstan how fast the energy containe in the signals generate in the iniviual network T. Wiatowski an H. Bölcskei are with the Department of Information Technology an Electrical Engineering, ETH Zurich, Switzerlan. {withomas, boelcskei}@nari.ee.ethz.ch P. Grohs is with the Faculty of Mathematics, University of Vienna, Austria. philipp.grohs@univie.ac.at The material in this paper was presente in part at the 07 IEEE International Symposium on Information Theory ISIT), Aachen, Germany. Copyright c) 07 IEEE. Personal use of this material is permitte. However, permission to use this material for any other purposes must be obtaine from the IEEE by sening a request to pubs-permissions@ieee.org. layers, a.k.a. feature maps, ecays across layers. In aition, it is important that the feature vector obtaine by aggregating filtere versions of the feature maps be informative in the sense of the only signal mapping to the all-zeros feature vector being the zero input signal. 
This trivial null-set property for the feature extractor can be obtaine by asking for the energy in the feature vector being proportional to that of the corresponing input signal, a property we shall refer to as energy conservation. Scattering networks as introuce in [8] an extene in [9] constitute an important class of feature extractors base on noes that implement convolutional transforms with prespecifie or learne filters in each network layer e.g., wavelets [8], [0], uniform covering filters [], or general filters [9]), followe by a non-linearity e.g., the moulus [8], [0], [], or a general Lipschitz non-linearity [9]), an a pooling operation e.g., sub-sampling or average-pooling [9]). Scattering network-base feature extractors were shown to yiel classification performance competitive with the state-of-the-art on various ata sets [] [7]. Moreover, a mathematical theory exists, which allows to establish formally that such feature extractors are uner certain technical conitions horizontally [8] or vertically [9] translation-invariant an eformationstable in the sense of [8], or exhibit limite sensitivity to eformations in the sense of [9] on input signal classes such as ban-limite functions [9], [8], cartoon functions [9], an Lipschitz functions [9]. It was shown recently that the energy in the feature maps generate by scattering networks employing, in every network layer, the same set of certain) Parseval wavelets [0, Section 5] or uniform covering [] filters both satisfying analyticity an vanishing moments conitions), the moulus non-linearity, an no pooling, ecays at least exponentially fast an strict energy conservation which, in turn, implies a trivial null-set) for the infinite-epth feature vector hols. Specifically, the feature map energy ecay was shown to be at least of orer Oa N ), for some unspecifie a >, where N enotes the network epth. We note that -imensional uniform covering filters as introuce in [] are functions whose Fourier transforms support sets can be covere by a union of finitely many balls. This covering conition is satisfie by, e.g., Weyl-Heisenberg filters [] with a banlimite prototype function, but fails to hol for multi-scale filters such as wavelets [], [3], α)-curvelets [4] [6], shearlets [7], [8], or rigelets [9] [3], see [, emark. b)]. Contributions. The first main contribution of this paper is a characterization of the feature map energy ecay rate in

2 DCNNs employing the moulus non-linearity, no pooling, an general filters that constitute a frame [], [3] [34], but not necessarily a Parseval frame, an are allowe to be ifferent in ifferent network layers. We fin that, uner mil analyticity an high-pass conitions on the filters, the energy ecay rate is at least polynomial in the network epth, i.e., the ecay is at least of orer ON α ), an we explicitly specify the ecay exponent α > 0. This result encompasses, inter alia, various constructions of Weyl-Heisenberg filters, wavelets, rigelets, α)-curvelets, shearlets, an learne filters of course as long as the learning algorithm imposes the analyticity an high-pass conitions we require). For broa families of wavelets an Weyl-Heisenberg filters, the guarantee energy ecay rate is shown to be exponential in the network epth, i.e., the ecay is at least of orer Oa N ) with the ecay factor given as a = 5 3 in the wavelet case an a = 3 in the Weyl-Heisenberg case. We hasten to a that our results constitute guarantee ecay rates an o not preclue the energy from ecaying faster in practice. Our secon main contribution shows that the energy ecay results above are compatible with a trivial null-set for finitean infinite-epth networks. Specifically, this is accomplishe by establishing energy proportionality between the feature vector an the unerlying input signal with the proportionality constant lower- an upper-boune by the frame bouns of the filters employe in the ifferent layers. We show that this energy conservation result is a consequence of a emoulation effect inuce by the moulus non-linearity in combination with the analyticity an high-pass properties of the filters. Specifically, in every network layer, the moulus non-linearity moves the spectral content of each iniviual feature map to base-ban i.e., to low frequencies), where it is subsequently extracte i.e., fe into the feature vector) by a low-pass output-generating filter. Finally, for input signals that belong to the class of Sobolev functions, our energy ecay an conservation results are shown to yiel hany estimates of the number of layers neee to have at least ε) 00)% of the input signal energy be containe in the feature vector. For example, in the case of exponential energy ecay with a = 5 3 an for ban-limite input signals, only 8 layers are neee to absorb 95% of the input signal s energy. We emphasize that throughout energy ecay results pertain to the feature maps, whereas energy conservation statements apply to the feature vector, obtaine by aggregating filtere versions of the feature maps. Notation. The complex conjugate of z C is enote by z. We write ez) for the real, an Imz) for the imaginary part of z C. The Eucliean inner prouct of x, y C is x, y := i= x iy i, with associate norm x := x, x. For x, x) + := max{0, x} an x := + x ) /. We enote the open ball of raius r > 0 centere at x by B r x). The first canonical orthant is H := {x A wie range of practically relevant signal classes are Sobolev functions, for example, ban-limite functions an as establishe in the present paper cartoon functions [35]. We note that cartoon functions are wiely use in the mathematical signal processing literature [5], [9], [6], [36], [37] as a moel for natural images such as, e.g., images of hanwritten igits [38]. x k 0, k =,..., }, an we efine the rotate orthant H A := {Ax x H} for A O), where O) stans for the orthogonal group of imension N. The Minkowski sum of sets A, B is A + B) := {a + b a A, b B}, an A B := A\B) B\A) enotes their symmetric ifference. 
A multi-inex α = α,..., α ) N 0 is an orere -tuple of non-negative integers α i N 0. For functions W : N an G : N, we say that W N) = OGN)) if there exist C > 0 an N 0 N such that W N) CGN), for all N N 0. The support suppf) of a function f : C is the closure of the set {x fx) 0} in the topology inuce by the Eucliean norm. For a Lebesgue-measurable function f : C, we write fx)x for its integral w.r.t. Lebesgue measure. The inicator function of a set B is efine as B x) =, for x B, an B x) = 0, for x \B. For a measurable set B, we let vol B) := B x)x = x, an B we write B for its bounary. L p ), with p [, ), stans for the space of Lebesgue-measurable functions f : C satisfying f p := fx) p x) /p <. L ) enotes the space of Lebesgue-measurable functions f : C such that f := inf{α > 0 fx) α for a.e. x } <. For a countable set Q, L )) Q stans for the space of sets S := {f q } q Q, with f q L ) for all q Q, satisfying S := q Q f q ) / <. We enote the Fourier transform of f L ) by f) := fx)e πi x, x an exten it in the usual way to L ) [40, Theorem 7.9]. I : L p ) L p ) stans for the ientity operator on L p ). The convolution of f L ) an g L ) is f g)y) := fx)gy x)x. We write T t f)x) := fx t), t, for the translation operator, an M f)x) := e πi x, fx),, for the moulation operator. We set f, g := fx)gx)x, for f, g L ). H s ), with s > 0, stans for the Sobolev space of functions f L ) satisfying f H s := f) + ) s ) / <, see [4, Section 6..]. Here, the inex s reflects the egree of smoothness of f H s ), i.e., larger s entails smoother f. For a multi-inex α N 0, D α enotes the ifferential operator D α := / x ) α... / x ) α, with orer α := i= α i. The space of functions f : C whose erivatives D α f of orer at most k N 0 are continuous is esignate by C k, C). Moreover, we enote the graient of a function f : C as f. II. DCNN-BASED FEATUE EXTACTOS Throughout the paper, we use the terminology of [9], consier unless explicitly state otherwise) input signals f L ), an employ the moule-sequence Ω := Ψ n,, I) ) n N, ) i.e., each network layer is associate with i) a collection of filters Ψ n := {χ n } {g λn } λn Λ n L ) L ), where χ n, referre to as output-generating filter, an the g λn, inexe Throughout the paper a.e. is w.r.t. Lebesgue measure.

3 3 f g λ j) g λ l) g m) λ 3 f g λ p) g λ r) g s) λ 3 f g λ j) g l) λ f g λ p) g r) λ f g λ j) g l) λ χ 3 f g λ p) g r) λ χ 3 f g j) λ f g p) λ f g j) λ χ f f g p) λ χ f χ Fig. : Network architecture unerlying the feature extractor 5). The inex λ k) n correspons to the k-th filter g k) λ of the collection Ψ n associate with n the n-th network layer. The function χ n+ is the output-generating filter of the n-th network layer. The root of the network correspons to n = 0. by a countable set Λ n, satisfy the frame conition [], [3], [34] A n f f χ n + f g λn B n f, ) λ n Λ n for all f L ), for some A n, B n > 0, ii) the moulus non-linearity : L ) L ), f x) := fx), an iii) no pooling, which, in the terminology of [9], correspons to pooling through the ientity operator with pooling factor equal to one. Associate with the moule Ψ n,, I), the operator U n [λ n ] efine in [9, Eq. ] particularizes to We exten 3) to paths on inex sets U n [λ n ]f = f gλn. 3) q = λ, λ,..., λ n ) Λ Λ Λ n =: Λ n, n N, accoring to U[q]f = U[λ, λ,..., λ n )]f := U n [λ n ] U [λ ]U [λ ]f, 4) where, for the empty path e :=, we set Λ 0 := {e} an U[e]f := f. The signals U[q]f, q Λ n, associate with the n-th network layer, are often referre to as feature maps in the eep learning literature. The feature vector Φ Ω f) is obtaine by aggregating filtere versions of the feature maps. More formally, Φ Ω f) is efine as [9, Definition 3] Φ Ω f) := Φ n Ωf), 5) where Φ n Ωf) := {U[q]f) χ n+ } q Λ n are the features generate in the n-th network layer, see Figure. Here, n = 0 correspons to the root of the network. The function χ n+ is the output-generating filter of the n-th network layer. The feature extractor Φ Ω : L ) L ) ) Λn was shown in [9, Theorem ] to be vertically translationinvariant, provie although that pooling is employe, with pooling factors S n, n N, see [9, Eq. 6] for the efinition of the general pooling operator) such that N lim N n= S n =. Moreover, Φ Ω exhibits limite sensitivity to certain non-linear eformations on input) signal classes such as ban-limite functions [9, Theorem ], cartoon functions [9, Theorem ], an Lipschitz functions [9, Corollary ]. III. ENEGY DECAY AND TIVIAL NULL-SET The first central goal of this paper is to unerstan how fast the energy containe in the feature maps ecays across layers. Specifically, we shall stuy the ecay of W N f) := U[q]f, f L ), 6) q Λ N as a function of network epth N. Moreover, it is esirable that the infinite-epth feature vector Φ Ω f) be informative in the sense of the only signal mapping to the all-zeros feature vector being the zero input signal, i.e., Φ Ω has a trivial null-set N Φ Ω ) := {f L ) Φ Ω f) = 0}! = {0}. 7) Figure illustrates the practical ramifications of a non-trivial null-set in a binary classification task. N Φ Ω ) = {0} can be guarantee by asking for energy conservation in the sense of A Ω f Φ Ω f) B Ω f, f L ), 8) for some constants A Ω, B Ω > 0 possibly epening on the moule-sequence Ω) an with the feature space norm Φ Ω f) := Φn Ω f) ) /, where Φ n Ω f) :=

4 4 w for an unspecifie a >, was establishe in [, Proposition 3.3]. Moreover, [0, Section 5] an [, Theorem 3.6 a)] state for the respective moule-sequences that 8) hols with A Ω = B Ω = an hence Φ Ω f) = f. ) Φ Ω f ) Fig. : Impact of a non-trivial null-set N Φ Ω ) in a binary classification task. The feature vector Φ Ω f) is fe into a linear classifier [0], which etermines set membership base on the sign of the inner prouct w, Φ Ω f). The learne) weight vector w is perpenicular to the separating hyperplane ashe line). If the null-set of the feature extractor Φ Ω is non-trivial, there exist input signals f 0 that are mappe to the origin in feature space, i.e., Φ Ω f ) = 0 gray circle), an therefore lie inepenently of the weight vector w on the separating hyperplane. These input signals f 0 are therefore unclassifiable. /. q Λ U[q]f) χ n+ ) Inee, 7) follows from 8) n as the upper boun in 8) yiels {0} N Φ Ω ), an the lower boun implies {0} N Φ Ω ). We emphasize that, as Φ Ω is a non-linear operator owing to the moulus non-linearities), characterizing its null-set is non-trivial in general. The upper boun in 8) was establishe in [9, Appenix E]. While the existence of this upper boun is implie by the filters Ψ n, n N, satisfying the frame property ) [9, Appenix E], perhaps surprisingly, this is not enough to guarantee A Ω > 0 see Appenix A for an example). We refer the reaer to Section V for results on the null-set of the finite-epth feature extractor N Φn Ω. Previous work on the ecay rate of W N f) in [0, Section 5] shows that for wavelet-base networks i.e., in every network layer the filters Ψ = {χ} {g λ } λ Λ in ) are taken to be specific) -D wavelets that constitute a Parseval frame, with χ a low-pass filter) there exist ε > 0 an a > both constants unspecifie) such that W N f) f) ) r g εa N, 9) for real-value -D signals f L ) an N, where ) r g := e. To see that this result inicates energy ecay, Figure 3 illustrates the influence of network epth N on the upper boun in 9). Specifically, we can see that increasing the network epth results in cutting out increasing amounts of energy of f an thereby making the upper boun in 9) ecay as a function of N. Moreover, it is interesting to note that the upper boun on W N f) = q Λ N U[q]f is inepenent of the wavelets generating the feature maps U[q]f, q Λ N. For scattering networks that employ, in every network layer, uniform covering filters Ψ = {χ} {g λ } λ Λ L ) L ) forming a Parseval frame where χ, again, is a lowpass filter), exponential energy ecay accoring to W N f) = Oa N ), f L ), 0) The first main goal of the present paper is to establish i) for - imensional complex-value input signals that W N f) ecays polynomially accoring to W N f) BΩ N f) ) r l N α, ) for f L ) an N, where α =, for =, an α = log / /)), for, BΩ N = N k= max{, B k}, an r l :, r l ) = ) l +, with l > / +, for networks base on general filters {χ n } {g λn } λn Λ n that satisfy mil analyticity an high-pass conitions an are allowe to be ifferent in ifferent network layers with the proviso that χ n, n N, is of low-pass nature in a sense to be mae precise), an ii) for -D complex-value input signals that 6) ecays exponentially accoring to W N f) f) ) r l a N, 3) for f L ) an N, for networks that are base, in every network layer, on a broa family of wavelets, with the ecay factor given explicitly as a = 5 3, or on a broa family of Weyl-Heisenberg filters [9, Appenix B], with ecay factor a = 3. 
Thanks to the right-han sie HS) of ) an 3) not epening on the specific filters {χ n } {g λn } λn Λ n, we will be able to establish uner smoothness assumptions on the input signal f universal energy ecay results. Specifically, particularizing the HS expressions in ) an 3) to Sobolev-class input signals f H s ), s > 0, where { } H s ) = f L ) f H s <, with f H s := + ) s f) ) /, we show that ) yiels polynomial energy ecay accoring to an 3) exponential energy ecay W N f) = O N γα), 4) W N f) = O a γn ), 5) where γ := min{, s} in both cases. Sobolev spaces H s ) contain a wie range of practically relevant signal classes such as, e.g., the space L L ) := {f L ) supp f ) B L 0)}, L 0, of L-ban-limite functions accoring to L L ) H s ), for L 0 an s > 0. This follows from + ) s f) = + ) s f) B L 0) + L ) s f <, for f L L ), L 0, an s > 0, where we use Parseval s formula an the fact that + ) s,

5 5 f) h N ) a N a N a N a N f) h N ) a N a N a N a N f) h N+) a N a N a N a N f) h N+) a N a N a N a N Fig. 3: Illustration of the impact of network epth N on the upper boun on W N f) in 9), for ε = an a >. The function h N ) := )) r g εa, N where r g) = e, is of increasing high-pass nature as N increases, which makes the upper boun in 9) ecay in N., is monotonically increasing in, for s > 0, the space CCAT K of cartoon functions of size K, introuce in [35], an wiely use in the mathematical signal processing literature [5], [9], [6], [36], [37] as a moel for natural images such as, e.g., images of hanwritten igits [38] see Figure 4). For a formal efinition of CCAT K, we refer the reaer to Appenix B, where we also show that CCAT K Hs ), for K > 0 an s 0, /). Moreover, Sobolev functions are containe in the space of k-times continuously ifferentiable functions C k, C) accoring to H s ) C k, C), for s > k+ [4, Section 4]. Our secon central goal is to prove energy conservation accoring to 8) which, as explaine above, implies N Φ Ω ) = {0}) for the network configurations corresponing to the energy ecay results ) an 3). Finally, we provie hany estimates of the number of layers neee to have at least ε) 00)% of the input signal energy be containe in the feature vector. IV. MAIN ESULTS Throughout the paper, we make the following assumptions on the filters {g λn } λn Λ n. Assumption. The {g λn } λn Λ n, n N, are analytic in the following sense: For every layer inex n N, for every λ n Λ n, there exists an orthant H Aλn, with A λn O), such that suppĝ λn ) H Aλn. 6) Moreover, there exists > 0 so that λ n Λ n ĝ λn ) = 0, a.e. B 0). 7) In the -D case, i.e., for =, Assumption simply amounts to every filter g λn satisfying either suppĝ λn ), ] or suppĝ λn ) [, ), which constitutes an analyticity an high-pass conition. For imensions, Assumption requires that every filter g λn be of high-pass nature an have a Fourier transform supporte in a not necessarily canonical) orthant. Since the frame conition ) is equivalent to the Littlewoo-Paley conition [43] A n χ n ) + ĝ λn ) B n, a.e., 8) λ n Λ n 7) implies low-pass characteristics for χ n to fill the spectral gap B 0) left by the filters {g λn } λn Λ n.

6 6 Fig. 4: An image of a hanwritten igit is moele by a -D cartoon function. The conitions 6) an 7) we impose on the Ψ n, n N, are not overly restrictive as they encompass, inter alia, various constructions of Weyl-Heisenberg filters e.g., a -D B-spline as prototype function [45, Section ]), wavelets e.g., analytic Meyer wavelets [, Section 3.3.5] in -D, an Cauchy wavelets [46] in -D), an specific constructions of rigelets [3, Section.], curvelets [5, Section 4.], α-curvelets [6, Section 3], an shearlets e.g., cone-aapte shearlets [37, Section 4.3]). We refer the reaer to [9, Appenices B an C] for a brief review of some of these filter structures. We are now reay to state our main result on energy ecay an energy conservation. Theorem. Let Ω be the moule-sequence ) with filters {g λn } λn Λ n satisfying the conitions in Assumption, an let > 0 be the raius of the spectral gap B 0) left by the filters {g λn } λn Λ n accoring to 7). Furthermore, let s 0, A N Ω := N k= min{, A k}, BΩ N := N k= max{, B k}, an {, =, α := log 9) / /)),. i) We have W N f) BΩ N f) ) r l N α, 0) for f L ) an N, where r l :, r l ) := ) l +, with l > / +. ii) For every Sobolev function f H s ), s > 0, we have where γ := min{, s}. iii) If, in aition to Assumption, W N f) = O B N Ω N γα), ) 0 < A Ω := lim N AN Ω B Ω := lim N BN Ω <, ) then we have energy conservation accoring to for all f L ). A Ω f Φ Ω f) B Ω f, 3) Proof. For the proofs of i) an ii), we refer to Appenices C an D, respectively. The proof of statement iii) is base on two key ingreients. First, we establish in Proposition in Appenix E that the feature extractor Φ Ω satisfies the energy ecomposition ientity A N Ω f N Φ n Ωf) + W N f) B N Ω f, 4) for all f L ) an all N. Secon, we show in Proposition in Appenix F that the integral on the HS of 0) goes to zero as N which, thanks to lim N BN Ω = B Ω <, implies that W N f) 0 as N. We note that while the ecomposition 4) hols for general filters Ψ n satisfying the frame property ), it is the upper boun 0) that makes use of the analyticity an high-pass conitions in Assumption. The final energy conservation result 3) is obtaine by letting N in 4). The strength of the results in Theorem erives itself from the fact that the only conition we nee to impose on the filters Ψ n is Assumption, which, as alreay mentione, is met by a wie array of filters. Moreover, conition ) is easily satisfie by normalizing the filters Ψ n, n N, appropriately see, e.g., [9, Proposition 3]). We note that this normalization, when applie to filters that satisfy Assumption, yiels filters that still meet Assumption. The ientity ) establishes, upon normalization [9, Proposition 3] of the Ψ n to get B n, n N, that the energy ecay rate, i.e., the ecay rate of W N f), is at least polynomial in N. We hasten to a that 0) oes not preclue the energy from ecaying faster in practice. Unerlying the energy conservation result 3) is the following emoulation effect inuce by the moulus nonlinearity in combination with the analyticity an high-pass properties of the filters {g λn } λn Λ n. In every network layer, the spectral content of each iniviual feature map is move to base-ban i.e., to low frequencies), where it is extracte by the low-pass output-generating atom χ n+, see Figure 5. The components not collecte by χ n+ see Figure 5, bottom row) are capture by the analytic high-pass filters {g λn+ } λn+ Λ n+ in the next layer an, thanks to the moulus non-linearity, again move to low frequencies an extracte by χ n+. 
Iterating this process ensures that the null-set of the feature vector be it for the infinite-epth network or, as establishe in Section V, for finite network epths) is trivial. It is interesting to observe that the sigmoi, the rectifie linear unit, an the hyperbolic tangent non-linearities all wiely use in the eep learning literature exhibit very ifferent behavior in this regar, namely, they o not emoulate in the way the moulus non-linearity oes [44, Figure 6]. It is therefore unclear whether the proof machinery for energy conservation evelope in this paper extens to these nonlinearities or, for that matter, whether one gets energy ecay an conservation at all. The feature map energy ecay result ) relates to the feature vector energy conservation result 3) via the energy ecomposition ientity 4). Specifically, particularizing 4) for Parseval frames, i.e., A n = B n =, for all n N, we get N Φ n Ωf) + W N f) = f. 5) This shows that the input signal energy containe in the network layers n N is precisely given by W N f). Thanks to W N f) 0 as N establishe in Proposition in Appenix F) this resiual energy will eventually be collecte

7 7 ĝ λn ) χ n+) f) f) ĝ λn ) χ n+) f g λn ) Fig. 5: Illustration of the emoulation effect of the moulus non-linearity. The {g λn } λn Λ n are taken as perfect ban-pass filters e.g., ban-limite analytic Weyl-Heisenberg filters) an hence trivially satisfy the conitions in Assumption. The moulus operation in combination with the analyticity an the highpass nature of the filters {g λn } λn Λ n ensures that in every network layer the spectral content of each iniviual feature map is move to base-ban i.e., to low frequencies), where it is extracte by the low-pass) output-generating filter χ n+. in the infinite-epth feature vector Φ Ω f) so that no input signal energy is lost in the network. In Section V, we shall answer the question of how many layers are neee to absorb ε) 00)% of the input signal energy. The next result shows that, uner aitional structural assumptions on the filters {g λn } λn Λ, the guarantee energy ecay rate can be improve from polynomial to exponential. Specifically, we can get exponential energy ecay for broa families of wavelets an Weyl-Heisenberg filters. For conceptual reasons, we consier the -D case an, for simplicity of exposition, we employ filters that constitute Parseval frames an are ientical across network layers. Theorem. Let r l :, r l ) := ) l +, with l >. i) Wavelets: Let the mother an father wavelets ψ, φ L ) L ) satisfy supp ψ) [/, ] an φ) + ψ j ) =, a.e. 0. 6) j= Moreover, let g j x) := j ψ j x), for x, j, an g j x) := j ψ j x), for x, j, an set χx) := φx), for x. Let Ω be the moule-sequence ) with filters Ψ = {χ} {g j } j Z\{0} in every network layer. Then, W N f) f) ) r l 5/3) N, 7) for f L ) an N. Moreover, for every Sobolev function f H s ), s > 0, we have W N f) = O 5/3) γn), 8) where γ := min{, s}. ii) Weyl-Heisenberg filters: For, let the functions g, φ L ) L ) satisfy suppĝ) [, ], ĝ ) = ĝ), for, an φ) + ĝ k + )) =, 9) k= a.e. 0. Moreover, let g k x) := e πik+)x gx), for x, k, an g k x) := e πi k +)x gx), for x, k, an set χx) := φx), for x. Let Ω be the moule-sequence ) with filters Ψ = {χ} {g k } k Z\{0} in every network layer. Then, W N f) f) ) r l 3/) N, 30) for f L ) an N. Moreover, for every Sobolev function f H s ), s > 0, we have where γ := min{, s}. Proof. See Appenix G. W N f) = O 3/) γn ), 3) The conitions we impose on the mother an father wavelet ψ, φ in i) are satisfie, e.g., by analytic Meyer wavelets [, Section 3.3.5], an those on the prototype function g an low-pass filter φ in ii) by B-splines [45, Section ]. Moreover, as shown in [44, Theorem 3.], the exponential energy ecay results in 8) an 3) can be generalize to Oa N ) with arbitrary ecay factor a > realize through

8 8 suitable choice of the mother wavelet or the Weyl-Heisenberg prototype function. We note that in the presence of pooling by sub-sampling as efine in [9, Eq. 9]), say with pooling factors S n := S [, a), for all n N, where a = 5 3 in the wavelet case an a = 3 in the Weyl-Heisenberg case) the effective ecay factor in 8) an 3) becomes 5 3S an 3 S, respectively. Exponential energy ecay is hence compatible with vertical translation invariance accoring to [9, Theorem ], albeit at the cost of a slower exponential) ecay rate. The proof of this statement is structurally very similar to that of Theorem an will therefore not be given here. Finally, we note that the energy ecay an conservation results in Theorems an are compatible with the feature extractor Φ Ω being eformationinsensitive accoring to [9, Theorem ], simply by noting that [9, Theorem ] applies to general semi-iscrete frames an general Lipschitz-continuous non-linearities. We next put the results in Theorems an into perspective with respect to the literature. elation to [0, Section 5]: The basic philosophy of our proof technique for 0), 3), 7), an 30) is inspire by the proof in [0, Section 5], which establishes 9) an ) for scattering networks base on certain wavelet filters an with -D real-value input signals f L ). Specifically, in [0, Section 5], in every network layer, the filters Ψ W = {χ} {g j } j Z where g j ) := j ψ j ), j Z, for some mother wavelet ψ L ) L )) are -D functions satisfying the frame property ) with A n = B n =, n N, a mil analyticity conition 3 [0, Eq. 5.5] in the sense of ĝ j ), j Z, being larger for positive frequencies than for the corresponing negative ones, an a vanishing moments conition [0, Eq. 5.6] which controls the behavior of ψ) aroun the origin accoring to ψ) C +ε, for, for some C, ε > 0. Similarly to the proof of ) as given in [0, Section 5], we base our proof of 3) on the energy ecomposition ientity 4) an on an upper boun on W N f) see 9) for the corresponing upper boun establishe in [0, Section 5]) shown to go to zero as N. The exponential energy ecay results ), 8), an 3) for Sobolev functions f H s ) are entirely new. The major ifferences between [0, Section 5] an our results are i) that 9) reporte in [0, Section 5]) epens on an unspecifie a >, whereas our results in 0), ), 7), 8), 30), an 3) make the ecay factor a an the ecay exponent α explicit, ii) the technical elements employe to arrive at the upper bouns on W N f); specifically, while the proof in [0, Section 5] makes explicit use of the algebraic structure of the filters, namely, the multiscale structure of wavelets, our proof of 0) is oblivious to the algebraic structure of the filters, which is why it applies to general possibly unstructure) filters that, in aition, can be ifferent in ifferent network layers, iii) the assumptions impose on the filters, namely the analyticity an vanishing moments conitions in [0, Eq ], in contrast to our Assumption, an iv) the class of input signals f the results 3 At the time of completion of the present paper, I. Walspurger kinly sent us a preprint [47] which shows that the analyticity conition [0, Eq. 5.5] on the mother wavelet is not neee for 9) to hol. apply to, namely -D real-value signals in [0, Section 5], an -imensional complex-value signals in our Theorem. elation to []: For scattering networks that are base on so-calle uniform covering filters [], 0) an ) are establishe in [] for -imensional complex-value signals f L ). 
Specifically, in [], in every network layer, the -imensional filters {χ} {g λ } λ Λ are taken to satisfy i) the frame property ) with A = B = an hence A n = B n =, n N, see [, Definition. c)], ii) a vanishing moments conition [, Definition. a)] accoring to ĝ λ 0) = 0, for λ Λ, an iii) a uniform covering conition [, Definition. b)] which says that the filters Fourier transform support sets can be covere by a union of finitely many balls. The major ifferences between [] an our results are as follows: i) the results in [] apply exclusively to filters satisfying the uniform covering conition such as, e.g., Weyl-Heisenberg filters with a ban-limite prototype function [, Proposition.3], but o not apply to multi-scale filters such as wavelets, α)-curvelets, shearlets, an rigelets see [, emark. b)]), ii) 0) as establishe in [] leaves the ecay factor a > unspecifie, whereas our results in 8) an 3) make the ecay factor a explicit namely, a = 5/3 in the wavelet case an a = 3/ in the Weyl-Heisenberg case), iii) the exponential energy ecay result in 0) as establishe in [] applies to all f L ) an thus, in particular, to Sobolev input signals owing to H s ) L ), for all s > 0), whereas our ecay results in ), 8), an 3) pertain to Sobolev input signals f H s ), s > 0, only, iv) the technical elements employe to arrive at the upper bouns on W N f), specifically, while the proof in [] makes explicit use of the uniform covering property of the filters, our proof of 0) is completely oblivious to the algebraic) structure of the filters, v) the assumptions impose on the filters, i.e., the vanishing moments an uniform covering conition in [, Definition. a)-b)], in contrast to our Assumption, which is less restrictive, an thereby makes our results in Theorem apply to general possibly unstructure) filters that, in aition, can be ifferent in ifferent network layers. V. NUMBE OF LAYES NEEDED DCNNs use in practice employ potentially hunres of layers [7]. Such network epths entail formiable computational challenges both in training an in operating the network. It is therefore important to unerstan how many layers are neee to have most of the input signal energy be containe in the feature vector. This will be one by consiering Parseval frames in all layers, i.e., frames with frame bouns A n = B n =, n N. The energy conservation result 3) then implies that the infinite-epth feature vector Φ Ω f) = Φ n Ωf) contains the entire input signal energy accoring to Φ Ω f) = Φn Ω f) = f. Now, the ecomposition 5) reveals that thanks to lim W Nf) 0, N increasing the network epth N implies that the feature vector f) progressively contains a larger fraction of the N Φn Ω

9 9 ε) wavelets Weyl-Heisenberg filters general filters Table I: Number N of layers neee to ensure that ε) 00)% of the input signal energy are containe in the features generate in the first N network layers. input signal energy. We formalize the question on the number of layers neee by asking for bouns of the form N ε) Φn Ω f) f, 3) i.e., by etermining the network epth N guaranteeing that at least ε) 00)% of the input signal energy are capture by the corresponing epth-n feature vector N Φn Ω f). Moreover, 3) ensures that the epth-n feature extractor N Φn Ω exhibits a trivial null-set. The following results establish hany estimates of the number N of layers neee to guarantee 3). For peagogical reasons, we start with the case of ban-limite input signals an then procee to a more general statement pertaining to Sobolev functions. Corollary. i) Let Ω be the moule-sequence ) with filters {g λn } λn Λ n satisfying the conitions in Assumption, an let the corresponing frame bouns be A n = B n =, n N. Let > 0 be the raius of the spectral gap B 0) left by the filters {g λn } λn Λ n accoring to 7). Furthermore, let l > / +, ε 0, ), α as efine in 9), an f L ) L-ban-limite. If ) /α L N, 33) ε) l ) then 3) hols. ii) Assume that the conitions in Theorem i) an ii) hol. For the wavelet case, let a = 5 3 an = where correspons to the raius ) of the spectral gap left by the wavelets {g j } j Z\{0}. For the Weyl-Heisenberg case, let a = 3 an = here, correspons to the raius of the spectral ) gap left by the Weyl-Heisenberg filters {g k } k Z\{0}. Moreover, let l >, ε 0, ), an f L ) L-ban-limite. If ) L N log a, 34) ε) l ) then 3) hols in both cases. Proof. See Appenix H. Corollary nicely shows how the escription complexity of the signal class uner consieration, namely the banwith L an the imension through the ecay exponent α efine in 9) etermine the number N of layers neee. Specifically, 33) an 34) show that larger banwiths L an larger imension rener the input signal f more complex, which requires eeper networks to capture most of the energy of f. The epenence of the lower bouns in 33) an 34) on the network properties, through the moule-sequence Ω, is through the ecay factor a > an the raius of the spectral gap left by the filters {g λn } λn Λ n. The following numerical example provies quantitative insights on the influence of the parameter ε on 33) an 34). Specifically, we set L =, =, = which implies α =, see 9)), l =.000, an show in Table I the number N of layers neee accoring to 33) an 34) for ifferent values of ε. The results show that 95% of the input signal energy are containe in the first 8 layers in the wavelet case an in the first 0 layers in the Weyl-Heisenberg case. We can therefore conclue that in practice a relatively small number of layers is neee to have most of the input signal energy be containe in the feature vector. In contrast, for general filters, where we can guarantee polynomial energy ecay only, N = 39 layers are neee to absorb 95% of the input signal energy. We hasten to a, however, that 0) simply guarantees polynomial energy ecay an oes not preclue the energy from ecaying faster in practice. We procee with the estimates for Sobolev-class input signals. Corollary. i) Let Ω be the moule-sequence ) with filters {g λn } λn Λ n satisfying the conitions in Assumption, an let the corresponing frame bouns be A n = B n =, n N. Let > 0 be the raius of the spectral gap B 0) left by the filters {g λn } λn Λ n accoring to 7). 
Furthermore, let l > / +, ε 0, ), α as efine in 9), an f H s )\{0}, for s > 0. If N l f /γ H s ε /γ f /γ ) /α, 35) where γ := min{, s}, then 3) hols. ii) Assume that the conitions in Theorem i) an ii) hol. For the wavelet case, let a = 5 3 an = where correspons to the raius ) of the spectral gap left by the wavelets {g j } j Z\{0}. For the Weyl-Heisenberg case, let a = 3 an = here, correspons to the raius of the spectral ) gap left by the Weyl-Heisenberg filters {g k } k Z\{0}. Furthermore, let l >, ε 0, ), an f H s )\{0}, for s > 0. If ) l f /γ H N log s a, 36) ε /γ f /γ where γ := min{, s}, then 3) hols. Proof. See Appenix I. As alreay mentione in Section III, Sobolev spaces H s ) contain a wie range of practically relevant signal classes. The results in Corollary therefore provie for a wie variety of input signals a picture of how many layers are neee to have most of the input signal energy be containe in the feature vector.

10 0 The with of the networks consiere throughout the paper is, in principle, infinite as the sets Λ n nee to be countably infinite in orer to guarantee that the frame property ) is satisfie. For input signals that exhibit mil spectral ecay, the number of operationally significant noes will, however, be finite in practice. For a treatment of this aspect as well as results on epth-with traeoffs, the intereste reaer is referre to [44]. APPENDIX A A FEATUE EXTACTO WITH A NON-TIVIAL NULL-SET We show, by way of example, that employing filters Ψ n which satisfy the frame property ) alone oes not guarantee a trivial null-set for the feature extractor Φ Ω. Specifically, we construct a feature extractor Φ Ω base on filters satisfying ) an a corresponing function f 0 with f N Φ Ω ). Our example employs, in every network layer, filters Ψ = {χ} {g k } k Z that satisfy the Littlewoo-Paley conition 8) with A = B =, an where g 0 is such that ĝ 0 ) =, for B 0), an arbitrary else of course, as long as the Littlewoo-Paley conition 8) with A = B = is satisfie). We emphasize that no further restrictions are impose on the filters {χ} {g k } k Z, specifically χ nee not be of low-pass nature an the filters {g k } k Z may be structure such as wavelets [9, Appenix B]) or unstructure such as ranom filters [48], [49]), as long as they satisfy the Littlewoo-Paley conition 8) with A = B =. Now, consier the input signal f L ) accoring to f) := ) l +,, with l > / +. Then f g 0 = f, owing to supp f ) = B 0) an ĝ 0 ) =, for B 0). Moreover, f is a positive efinite raial basis function [50, Theorem 6.0] an hence by [50, Theorem 6.8] fx) 0, x, which, in turn, implies f = f. This yiels U[q N 0 ]f = f g0 g 0 g0 = f, for q0 N := 0, 0,..., 0) Z N an N N. Owing to the energy ecomposition ientity 4), together with A N Ω = BN Ω =, N N, which, in turn, is by A n = B n =, n N, we have f = N = N Φ n Ωf) + W N f) Φ n Ωf) + U[q0 N ]f + U[q]f }{{}, = f q Z N \{q0 N } for N N. This implies N Φ n Ωf) + q Z N \{q N 0 } U[q]f = 0. 37) As both terms in 37) are positive, we can conclue that N Φn Ω f) = 0, N N, an thus Φ Ω f) = Φ n Ωf) = 0. Since Φ Ω f) = 0 implies Φ Ω f) = 0, we have constructe a non-zero f, namely fx) = ) l +e πi x,, that maps to the all-zeros feature vector, i.e., f N Φ Ω ). The point of this example is the following. Owing to the nature of ĝ 0 ) namely, ĝ 0 ) =, for B 0)) an the Littlewoo-Paley conition χ) + k Z ĝ k ) =, a.e., it follows that neither the output-generating filter χ nor any of the other filters g k, k Z\{0}, can have spectral support in B 0). Consequently, the only non-zero contribution to the feature vector can come from U[q N 0 ]f = f, which, however, thanks to supp f ) = B 0), is spectrally isjoint from the output-generating filter χ. Therefore, Φ Ω f) will be ientically equal to 0. Assumption isallows this situation as it forces the filters g k, k Z, to be of highpass nature which, in turn, implies that χ must have lowpass characteristics. The punch-line of our general results on energy conservation, be it for finite N or for N, is that Assumption in combination with the frame property an the moulus non-linearity prohibit a non-trivial null-set in general. APPENDIX B SOBOLEV SMOOTHNESS OF CATOON FUNCTIONS Cartoon functions, introuce in [35], satisfy mil ecay properties an are piecewise continuously ifferentiable apart from curve iscontinuities along smooth hypersurfaces. 
This function class has been wiely aopte in the mathematical signal processing literature [5], [9], [6], [36], [37] as a stanar moel for natural images such as, e.g., images of hanwritten igits [38] see Figure 4). We will work with the following relative to the efinition in [35] slightly moifie version of cartoon functions. Definition. The function f : C is referre to as a cartoon function if it can be written as f = f + D f, where D is a compact omain whose bounary D is a compact topologically embee C -hypersurface of without bounary 4, an f i H / ) C, C), i =,, satisfy the ecay conition f i x) C x, i =,, for some C > 0 not epening on f,f ). Furthermore, we enote by C K CAT := {f + D f f i H / ) C, C), i =,, f i x) K x, vol D) K, f K} the class of cartoon functions of size K > 0. 4 We refer the reaer to [5, Chapter 0] for a review on ifferentiable manifols.

11 Even though cartoon functions are in general iscontinuous, they still amit Sobolev smoothness. The following result formalizes this statement. Lemma. Let K > 0. Then, CCAT K Hs ), for all s 0, /). Proof. Let f + D f ) CCAT K. We first establish D H s ), for all s 0, /). To this en, we efine the Sobolev-Sloboeckij semi-norm [5, Section..] f H s := fx) fy) x y s+ x y ) /s, an note that, thanks to [5, Section..], D H s ) if D H s <. We have D s H = D x) D y) s x y s+ x y = t s+ D x) D x t) x t, where we employe the change of variables t = x y. Next, we note that, for fixe t, the function h t x) := D x) D x t) satisfies h t x) =, for x S t, where S t := {x x D an x t / D} {x x / D an x t D} =D D + t), 38) an h t x) = 0, for x \S t. It follows from 38) that vol S t ) vol D), t. 39) Moreover, owing to S t D + B t 0) ), where D + B t 0)) is a tube of raius t aroun the bounary D of D see Figure 6), an Lemma, state below, there exists a constant C D > 0 such that vol S t ) vol D + B t 0)) C D t, 40) for all t with t. Next, fix such that 0 < <. Then, D s H = s t s+ D x) D x t) x t = t s+ h t x)x t vol S t ) = t s+ x t = t S t t s+ vol D) C D t + t 4) t s+ t s+ \B 0) = vol D) vol B 0)) B 0) r s+) r } {{ } =:I + C D vol B 0)) r s r, 4) 0 } {{ } =:I where in 4) we employe 39) an 40), an in the last D t D D + t) Fig. 6: Illustration in imension =. The set D + t) grey) is obtaine by translating the set D white) by t. The symmetric ifference D D+t) is containe in D + B t 0)), the tube of raius t aroun the bounary D of D. step we introuce polar coorinates. The integral I is finite for all s > 0, while I is finite for all s < /. Moreover, vol D) = x is finite owing to D being compact an D thus boune). We can therefore conclue that 4) is finite for s 0, /), an hence D H s ), for s 0, /). To see that f + D f ) H s ), for s 0, /), we first note that f + D f H s f H s + D f H s, 43) which is thanks to the sub-aitivity of the semi-norm H s. Now, the first term on the HS of 43) is finite owing to f H / ) H s ), for all s 0, /). For the secon term on the HS, we start by noting that an D f s H = s D f )x) D f )y) ) x y s+ x y D f )x) D f )y) = D x) D y))f x) + f x) f y)) D y) 44) D x) D y)) f x) 45) + f x) f y)) D y), 46) where 45) an 46) are thanks to a + b a + b, for a, b C. Substituting 45) an 46) into 44) an noting that f x) f K, x, which is by assumption, an D y), y, implies D f s H s K D s H + f s s Hs <, 47) where in the last step we use D H s ), establishe above, an f H / ) H s ), both for all s 0, /). This completes the proof. It remains to establish the secon inequality in 40). Lemma. Let M be a compact topologically embee C - hypersurface of without bounary an let T M, r) := { x inf x y r}, r > 0, y M be the tube of raius r aroun M. Then, there exists a constant

12 C M > 0 that oes not epen on r) such that for all r it hols that vol T M, r)) C M r. 48) Proof. The proof is base on Weyl s tube formula [54]. Let κ := max κ i, i {,..., } where κ i is the i-th principal curvature of the hypersurface M see [53, Section 3.] for a formal efinition). It follows from [53, Theorem 8.4 i)] that vol T M, r)) = i=0 r i+ k i M) i j=0 + j), for all r κ, where k i M) = M H ix)x, i {0,..., }, with H i enoting the so-calle i)-th curvature of M, see [53, Section 4.] for a formal efinition. Now, thanks to M being a C -hypersurface, we have that H i, i {0,..., }, is boune see [53, Section 4.]), which together with M compact an thus boune) implies k i M) <, for all i {0,..., }. Moreover, by efinition, k i M), i {0,..., }, is inepenent of the tube raius r. Therefore, setting ) k i M) C M := + max i i j=0 + j) establishes 48) for 0 < r min{, κ }. It remains to prove 48) for min{, κ } < r. Let := inf{ > 0 M B 0)} an D := vol B +0)). Since it follows that vol T M, r)) D, 0 < r, vol T M, r)) < D max{, κ} r, for all min{, κ } < r, which establishes 48) for min{, κ } < r an thereby conclues the proof. APPENDIX C POOF OF STATEMENT I) IN THEOEM We start by establishing 0) with α = log / /)), for. Then, we sharpen our result in the -D case by proving that 0) hols for = with α =. This leas to a significant improvement, in the -D case, of the ecay exponent from log / /)) = to. The iea for the proof of 0) for α = log / /)),, is to establish that 5 U[q]f q Λ n Λ n+ Λ n+n Cn n+n f) ) r l N α, 49) 5 We prove the more general result 49) for technical reasons, concretely in orer to be able to argue by inuction over path lengths with flexible starting inex n. for N N, where n+n Cn n+n := max{, B k }. k=n Setting n = in 49) an noting that C N = BΩ N yiels the esire result 0). We procee by inuction over the path length lq) := N, for q = λ n, λ n+,..., λ n+n ) Λ n Λ n+ Λ n+n. Starting with the base case N =, we have U[q]f = f g λn q Λ n λ n Λ n = ĝ λn ) f) 50) λ n Λ n B n f) 5) \B 0) max{, B n } f) r l }{{} =C n n ), 5) for N N, where 50) is by Parseval s formula, 5) is thanks to 7) an 8), an 5) is ue to supp r l ) B 0) an 0 r l ), for. The inuctive step is establishe as follows. Let N > an suppose that 49) hols for all paths q of length lq) = N, i.e., U[q]f q Λ n Λ n+ Λ n+n Cn n+n f) ) r l N ) α, 53) for n N. We start by noting that every path q Λ n Λ n+... Λ n+n of length l q) = N, with arbitrary starting inex n, can be ecompose into a path q Λ n+... Λ n+n of length lq) = N an an inex λ n Λ n accoring to q = λ n, q). Thanks to 4) we have which yiels U[ q] = U[λ n, q)] = U[q]U n [λ n ], U[q]f q Λ n Λ n+ Λ n+n = U[q] λ n Λ n q Λ n+ Λ n+n Un [λ n ]f ), 54) for n N. We procee by examining the inner sum on the HS of 54). Invoking the inuction hypothesis 53) with n replace by n + ) an employing Parseval s formula, we get Un [λ n ]f ) q Λ n+ Λ n+n U[q] Cn+ n+n Un [λ n ]f) ) r l N ) α = Cn+ n+n Un [λ n ]f U n [λ n ]f) r l,n,α, ) = Cn+ n+n f gλn f g λn r l,n,α, ), 55) for n N, where ) r l,n,α, is the inverse Fourier ) transform of r l N ) α. Next, we note that rl N ) α is a positive efinite raial basis function [50, Theorem 6.0] an hence

13 3 by [50, Theorem 6.8] r l,n,α, x) 0, for x. Furthermore, it follows from Lemma 3, state below, that for {ν λn } λn Λ n, we have f g λn r l,n,α, f g λn M νλn r l,n,α, ). 56) Here, we note that choosing the moulation factors {ν λn } λn Λ n appropriately see 60) below) will be key in establishing the inuctive step. Lemma 3. [8, Lemma.7]: Let f, g L ) with gx) 0, for x. Then, f g f M g), for. Inserting 55) an 56) into the inner sum on the HS of 54) yiels U[q]f q Λ n Λ n+ Λ n+n Cn+ n+n f g λn λ n Λ n f g λn M νλn r l,n,α, ) = Cn+ n+n f) h n,n,α, ), N N, 57) where we applie Parseval s formula together with M f = T f, for f L ), an, an set h n,n,α, ) := ĝ λn ) νλn ) r l N ) α. 58) λ n Λ n The key step is now to establish by juiciously choosing {ν λn } λn Λ n the upper boun ) h n,n,α, ) max{, B n } r l N α, 59) for, which upon noting that Cn n+n = max{, B n } Cn+ n+n yiels 49) an thereby completes the proof. We start by efining H Aλn, for λ n Λ n, to be the orthant supporting ĝ λn, i.e., suppĝ λn ) H Aλn, where A λn O), for λ n Λ n see Assumption ). Furthermore, for λ n Λ n, we choose the moulation factors accoring to ) ν λn := A λn ν, 60) where the components of ν are given by ν k := + / ), for k {,..., }. Invoking 6) an 7), we get h n,n,α, ) = ĝ λn ) νλn ) r l N ) α λ n Λ n = ĝ λn ) νλn ) Sλn, ) r l N ) α, 6) λ n Λ n for, where S λn, := H Aλn \B 0). For the first canonical orthant H = {x x k 0, k =,..., }, we show in Lemma 4 below that ν ), r l rl 6) N ) α N α for H\B 0) an N. This will allow us to euce νλn ), r l rl 63) N ) α N α for S λn,, λ n Λ n, an N, where S λn, = H Aλn \B 0), simply by noting that νλn ) r l = N ) α A λn ) ν) l N ) α + ν ) l ν = N ) α = r l 64) + N ) α ) ) r l = N α l N α 65) + = A λn ) l, N α = r l 66) N α + for = A λn H Aλn \B 0), where H\B 0). Here, 64) an 66) are thanks to = A λn, which is by A λn O), an the inequality in 65) is ue to 6). Insertion of 63) into 6) then yiels h n,n,α, ) ĝ λn ) ) Sλn, ) r l N α λ n Λ n = ĝ λn ) ) r l N α 67) λ n Λ n ) max{, B n } r l N α, 68) for, where in 67) we employe Assumption, an 68) is thanks to 8). This establishes 59) an completes the proof of 0) for α = log / /)),. It remains to show 6), which is accomplishe through the following lemma. ) Lemma 4. Let α := log / /), rl :, r l ) := ) l +, with l > / +, an efine ν to have components ν k = + / ), for k {,..., }. Then, ν ), rl rl 69) N ) α N α for H\B 0) an N. Proof. The key iea of the proof is to employ a monotonicity argument. Specifically, thanks to r l monotonically ecreasing in, i.e., r l ) r l ), for, with, 69) can be establishe simply by showing that κ N ) := N N α ν 0, 70) for H\B 0) an N. We first note that for H\B 0) with > N α, 69) is trivially satisfie as the HS of 69) equals zero owing to N α > together with supp r l ) B 0)). It hence suffices to prove 70) for H with N α. To this en, fix τ [, N α ], an efine the spherical segment Ξ τ := { H = τ}.

Harmonic Analysis of Deep Convolutional Neural Networks

Harmonic Analysis of Deep Convolutional Neural Networks Harmonic Analysis of Deep Convolutional Neural Networks Helmut Bőlcskei Department of Information Technology and Electrical Engineering October 2017 joint work with Thomas Wiatowski and Philipp Grohs ImageNet

More information

Lecture Introduction. 2 Examples of Measure Concentration. 3 The Johnson-Lindenstrauss Lemma. CS-621 Theory Gems November 28, 2012

Lecture Introduction. 2 Examples of Measure Concentration. 3 The Johnson-Lindenstrauss Lemma. CS-621 Theory Gems November 28, 2012 CS-6 Theory Gems November 8, 0 Lecture Lecturer: Alesaner Mąry Scribes: Alhussein Fawzi, Dorina Thanou Introuction Toay, we will briefly iscuss an important technique in probability theory measure concentration

More information

REAL ANALYSIS I HOMEWORK 5
